No results for queries containing booleans in a triple pattern #549

Open
arenas-guerrero-julian opened this issue Sep 15, 2022 · 7 comments

@arenas-guerrero-julian

Description

Hi!

I am running Ontop with GTFS-Madrid-Bench and get no results for queries whose triple patterns contain a boolean. For instance, query 16:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gtfs: <http://vocab.gtfs.org/terms#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT * WHERE {
	?trip a gtfs:Trip .
	?trip gtfs:service ?service .
	?trip gtfs:route ?route . 

	?service a gtfs:Service .
	?service gtfs:serviceRule ?serviceRule .

	?serviceRule a gtfs:CalendarDateRule .
	?serviceRule  dct:date ?servDate .
	?serviceRule  gtfs:dateAddition "true"^^xsd:boolean .
	FILTER (?servDate >= %DATE1% ) .
	FILTER (?servDate <= %DATE2% ) .
}

Ontop produces an EMPTY SQL query and no SPARQL result set. If the triple pattern ?serviceRule gtfs:dateAddition "true"^^xsd:boolean . is removed, Ontop works fine.

Steps to Reproduce

Properties file:

jdbc.name = gtfs1
jdbc.user = root
jdbc.password = root
jdbc.fetchSize = 500000
jdbc.url = jdbc:mysql://localhost:3306/gtfs1?allowPublicKeyRetrieval=true&useSSL=false
jdbc.driver = com.mysql.jdbc.Driver

ontop.queryLogging=true
ontop.applicationName=Ontop
ontop.queryLogging.includeReformulatedQuery=true

Expected behavior: SPARQL query result set and unfolded SQL query

Actual behavior: EMPTY SQL query

Reproduces how often: it happens every time

Attached material

All the materials are available in the GTFS-Madrid-Bench repo

Versions

  • Ontop versions: 4.1.0, 4.2.0, 4.2.1
  • RDBMSs: MySQL, PostgreSQL
@kontchakov
Contributor

Hi Julian

Is it the case that the database column for gtfs:dateAddition (exception_type) is of type int? So, the expected behaviour would be to convert the integers 0 or 1, which are non-canonical lexical forms of the boolean values, into the canonical forms, false and true, right?

Best
Roman

@arenas-guerrero-julian
Author

arenas-guerrero-julian commented Sep 15, 2022

Hi @kontchakov

Exactly, it is of type int(11) DEFAULT NULL. The column is filled with 0's and 1's, and they should be converted to true/false.

Thanks,
Julián
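
The conversion discussed above can be sketched in a few lines of Python. This is only an illustration of the XML Schema rule that xsd:boolean accepts the lexical forms true, false, 1 and 0, with true and false being the canonical ones; it is not how Ontop implements anything, and the function name canonical_boolean is a hypothetical helper introduced for this example.

```python
# Lexical space of xsd:boolean as defined by XML Schema: "true", "false", "1", "0".
# Only "true" and "false" are canonical lexical forms.
_XSD_BOOLEAN_VALUES = {"true": True, "false": False, "1": True, "0": False}


def canonical_boolean(lexical: str) -> str:
    """Map any valid xsd:boolean lexical form to its canonical form.

    Hypothetical helper for illustration; raises ValueError for strings
    outside the xsd:boolean lexical space.
    """
    try:
        value = _XSD_BOOLEAN_VALUES[lexical]
    except KeyError:
        raise ValueError(f"not a valid xsd:boolean lexical form: {lexical!r}")
    return "true" if value else "false"


# Values as they might be read from the integer exception_type column.
for raw in (0, 1):
    print(raw, "->", canonical_boolean(str(raw)))
```

With such a mapping applied, a 1 stored in the column would surface as "true"^^xsd:boolean and could be matched by the literal in the original triple pattern.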

@bcogrel
Member

bcogrel commented Sep 15, 2022

Hi, I agree that using the canonical form for booleans would be better, but, strictly speaking, this is not a requirement from R2RML.

I suggest making the query robust to non-canonical boolean values. It can be done by testing the boolean value in a FILTER instead of relying on the lexical value equality:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gtfs: <http://vocab.gtfs.org/terms#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT * WHERE {
	?trip a gtfs:Trip .
	?trip gtfs:service ?service .
	?trip gtfs:route ?route . 

	?service a gtfs:Service .
	?service gtfs:serviceRule ?serviceRule .

	?serviceRule a gtfs:CalendarDateRule .
	?serviceRule  dct:date ?servDate .
	?serviceRule  gtfs:dateAddition _:dateAddition .
	FILTER (_:dateAddition) .
	FILTER (?servDate >= %DATE1% ) .
	FILTER (?servDate <= %DATE2% ) .
}

Best,
Benjamin
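
Why the FILTER helps can be shown with a small toy model in Python. It assumes, as the discussion above suggests, that the 0/1 column values surface as xsd:boolean literals with the non-canonical lexical forms "0" and "1". A triple pattern with a literal object matches RDF terms, so the lexical forms must be identical, whereas FILTER(?v) uses the effective boolean value, which for a valid xsd:boolean literal is its value. The rule identifiers and function names below are hypothetical, and none of this reflects Ontop's internals.

```python
# Toy model, not Ontop's implementation.
XSD_BOOLEAN_VALUES = {"true": True, "false": False, "1": True, "0": False}


def matches_literal(lexical: str, pattern_lexical: str) -> bool:
    """Triple-pattern matching compares RDF terms: same datatype, same lexical form."""
    return lexical == pattern_lexical


def effective_boolean_value(lexical: str) -> bool:
    """EBV of an xsd:boolean literal is its value; an invalid lexical form gives false."""
    return XSD_BOOLEAN_VALUES.get(lexical, False)


# Hypothetical sample data: service rule -> lexical form of gtfs:dateAddition.
rules = {"rule1": "1", "rule2": "0", "rule3": "1"}

# Equivalent of: ?serviceRule gtfs:dateAddition "true"^^xsd:boolean .
by_pattern = [r for r, lex in rules.items() if matches_literal(lex, "true")]

# Equivalent of: ?serviceRule gtfs:dateAddition ?v . FILTER (?v)
by_filter = [r for r, lex in rules.items() if effective_boolean_value(lex)]

print(by_pattern)  # no rule has the lexical form "true"
print(by_filter)   # rules whose boolean value is true
```

In this model the literal pattern selects nothing while the filter selects rule1 and rule3, which mirrors the empty result reported in the issue.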

@arenas-guerrero-julian
Author

Thanks @bcogrel

That is a very good solution :) The only thing is that (I think) a variable should be used, e.g.:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gtfs: <http://vocab.gtfs.org/terms#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT * WHERE {
	?trip a gtfs:Trip .
	?trip gtfs:service ?service .
	?trip gtfs:route ?route . 

	?service a gtfs:Service .
	?service gtfs:serviceRule ?serviceRule .

	?serviceRule a gtfs:CalendarDateRule .
	?serviceRule  dct:date ?servDate .
	?serviceRule  gtfs:dateAddition ?some_v .
	FILTER (?some_v) .
	FILTER (?servDate >= %DATE1% ) .
	FILTER (?servDate <= %DATE2% ) .
}

@bcogrel
Member

bcogrel commented Sep 15, 2022

Good!

I actually used a b-node to avoid adding another variable to the results. Here the b-node plays the role of a local, non-projected variable.

But if having one more variable in the results is not a problem, I agree that a regular variable is more readable.

@arenas-guerrero-julian
Author

I tested with the blank node and Ontop produces an error; with the variable it works nicely :)

@bcogrel
Member

bcogrel commented Sep 15, 2022

Right, the b-node trick works for BGPs but apparently not with FILTERs.
At least, the RDF4J SPARQL parser does not accept it.

Good to know, thanks!
