Different results between construct query in GraphDB workbench vs SPARQLWrapper
Question
I am trying to download a number of named graphs from my GraphDB repository using the API. Simply put, I want to retrieve the entire content of a named graph, in a serialization of choice (Turtle and JSON-LD).
My first approach is using a CONSTRUCT query:
PREFIX ex: <http://example.com/ns#>
CONSTRUCT {?s ?p ?o}
WHERE {graph <http://example.com/ns#id/>
{?s ?p ?o .}}
I'm using the Python SPARQLWrapper library here. It does return results, but they contain triples from ALL my graphs, not just <http://example.com/ns#id/>. Furthermore, if I add a filter such as filter (?o = "weafsdghadifhilaerjgak"), I still get the same result (all triples from all graphs), which makes me think the query is not actually being run.
I tried opening a fresh notebook with no possible contamination of the query variable, but I still get the same result.
Running the query in GraphDB Workbench gives the expected result. I'm looking for pointers on why the GraphDB Workbench result and the SPARQLWrapper result could differ.
Here is the code I'm using, minus some specifics for security's sake:
from SPARQLWrapper import SPARQLWrapper, BASIC, QueryResult
from rdflib import Graph
db = SPARQLWrapper("myendpoint:7300/repositories/myrepo/statements")
query = '''
PREFIX ex: <http://example.com/ns#>
CONSTRUCT {?s ?p ?o}
WHERE {graph <http://example.com/ns#id/>
{?s ?p ?o .
#filter (?o = "somestringthatdefinitelydoesnotexist")
}}
'''
db.setHTTPAuth(BASIC)
db.setCredentials('my', 'credentials')
db.setQuery(query)
db.method = "GET"
db.setReturnFormat('json-ld')
db.queryType = "CONSTRUCT"
result = db.query()
jsonresult = result._convertJSONLD()
v = jsonresult.serialize(format='turtle')
print(query)
print(v)
Answer 1
Score: 0
When using SPARQLWrapper, the endpoint URL for GET queries (reads such as SELECT and CONSTRUCT) has to be structured as follows:
myendpoint:port/repositories/{repositoryID}
For POST queries (updating, inserting), the endpoint is different:
myendpoint:port/repositories/{repositoryID}/statements
Running a GET query against the POST "endpoint URL" retrieves ALL triples, regardless of the query you attach. That is also why my filter had no impact on the query results.
This is, in my opinion, a silly complexity: after all, GraphDB Workbench itself parses your query, figures out whether it's a GET or POST situation, and handles the API just fine.
TL;DR: Use /statements ONLY when doing a POST query (SPARQL insert).
This is somewhat described in the GraphDB REST API documentation, but it was not clear to me that running the wrong "type" of query on an endpoint does not give an error, but rather just returns all statements in the repo.
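As a minimal sketch of the fix, the code from the question can be pointed at the repository URL without the /statements suffix. The helper names repository_endpoints and fetch_named_graph are hypothetical, the host, repository and credentials are placeholders, and the sketch assumes the server answers the CONSTRUCT with JSON-LD so that convert() yields an rdflib Graph:

```python
def repository_endpoints(base_url, repository_id):
    """Return (query_endpoint, update_endpoint) for a repository.

    Hypothetical helper: it only encodes the URL layout described above.
    """
    query_endpoint = f"{base_url.rstrip('/')}/repositories/{repository_id}"
    return query_endpoint, f"{query_endpoint}/statements"


def fetch_named_graph(base_url, repository_id, graph_iri, user, password):
    """Run a CONSTRUCT over one named graph and return it serialized as Turtle."""
    # Imported lazily so the URL helper above works without SPARQLWrapper installed.
    from SPARQLWrapper import SPARQLWrapper, BASIC, GET, JSONLD

    query_endpoint, _ = repository_endpoints(base_url, repository_id)
    db = SPARQLWrapper(query_endpoint)  # read queries: no /statements suffix
    db.setHTTPAuth(BASIC)
    db.setCredentials(user, password)
    db.setMethod(GET)
    db.setReturnFormat(JSONLD)
    db.setQuery(
        f"CONSTRUCT {{ ?s ?p ?o }} WHERE {{ GRAPH <{graph_iri}> {{ ?s ?p ?o }} }}"
    )
    # Assumption: the response is JSON-LD, which convert() parses into an rdflib Graph.
    graph = db.query().convert()
    return graph.serialize(format="turtle")
```

Calling fetch_named_graph("http://myendpoint:7300", "myrepo", "http://example.com/ns#id/", "my", "credentials") should then return only the triples of that one graph. Using the public convert() method also avoids depending on the private _convertJSONLD() call.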