PSPARQL and CPSPARQL

PSPARQL (for Path SPARQL) is a query language for RDF (see Path RDF as a query language for RDF) that we designed by extending SPARQL with path expressions. It allows regular expression patterns, i.e., regular expressions with variables, in the predicate position of SPARQL triple patterns.

The regular expression patterns allowed by the PSPARQL grammar are those constructed over the set of URI references, blank nodes, and variables.

CPSPARQL (for Constrained Path SPARQL) extends PSPARQL with constraints on path steps.

PSPARQL Examples

Assume that we have a data graph representing cities and the means of transportation between them, where a triple <c1,t,c2> means that there exists a means of transportation t from c1 to c2. Suppose that the graph also contains other information, such as capital cities, cities that have international airports, etc. Now, consider the following queries.

The following query returns the capital of France:

SELECT ?City
WHERE { ?City ex:capital ex:France . }

The following query returns all cities connected to the capital of France by a plane or train:

SELECT ?City2
WHERE { ?City1 ex:capital ex:France .
        ?City1 (ex:plane | ex:train) ?City2 . }

The following query returns all direct or indirect means of transportation from Paris to Amman:

SELECT ?T
WHERE { ex:Paris +?T ex:Amman . }
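
To make the last query more concrete, here is a minimal Python sketch of how `+?T` could be evaluated over a toy graph. It is not the engine's code: the triples and the function names `reachable` and `plus_variable` are hypothetical, and the sketch assumes that ?T is bound to a single predicate for the whole path.

```python
from collections import deque

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = {
    ("ex:Paris", "ex:train", "ex:Lyon"),
    ("ex:Lyon", "ex:train", "ex:Roma"),
    ("ex:Paris", "ex:plane", "ex:Roma"),
    ("ex:Roma", "ex:plane", "ex:Amman"),
    ("ex:Paris", "ex:capital", "ex:France"),
}

def reachable(triples, start, predicate):
    """Nodes reachable from start by one or more edges labelled predicate."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen

def plus_variable(triples, start, end):
    """Bindings of ?T such that `start +?T end` holds (assumed semantics)."""
    predicates = {p for _, p, _ in triples}
    return sorted(p for p in predicates if end in reachable(triples, start, p))

print(plus_variable(TRIPLES, "ex:Paris", "ex:Amman"))  # prints ['ex:plane']
```

In this toy graph only ex:plane links Paris to Amman, in two hops through Roma; the train edges stop at Roma.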

CPSPARQL Examples

Consider the RDF graph G of Figure 1, which represents the means of transportation between cities, the type of each means of transportation, and the price of tickets.

Assume that someone wants to go from Roma to a city in one of the Canary Islands. The following SPARQL query finds such a city using only direct trips (no paths):

SELECT ?City
WHERE { ?Trip ex:from ex:Roma.
        ?Trip ex:to ?City.
        ?City ex:cityIn ex:CanaryIslands.
}

Nonetheless, SPARQL cannot express indirect trips, which require variable-length paths. We can express them using regular expressions with the following (C)PSPARQL query:

SELECT ?City
WHERE { ex:Roma (ex:from-.ex:to)+ ?City.
        ?City ex:cityIn ex:CanaryIslands.
}

Note that - is the inverse operator. For example, given the RDF triple (_:Flight, ex:from, ex:Paris), we can deduce (ex:Paris, ex:from-, _:Flight).
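
The path expression (ex:from-.ex:to)+ can be read as "repeat, one or more times: follow an ex:from edge backwards to a trip, then follow ex:to forwards to a city". The following Python sketch illustrates that reading on a hypothetical toy graph in which trip nodes are the subjects of ex:from and ex:to, as in the queries above. The data and the helpers `step`, `sequence`, and `plus` are assumptions made for illustration, not part of the engine.

```python
from collections import deque

# Hypothetical toy graph: each trip node has an ex:from and an ex:to edge.
TRIPLES = {
    ("_:t1", "ex:from", "ex:Roma"),
    ("_:t1", "ex:to", "ex:Madrid"),
    ("_:t2", "ex:from", "ex:Madrid"),
    ("_:t2", "ex:to", "ex:LasPalmas"),
    ("ex:LasPalmas", "ex:cityIn", "ex:CanaryIslands"),
    ("ex:Madrid", "ex:cityIn", "ex:Spain"),
}

def step(triples, node, predicate):
    """Follow one edge; a trailing '-' traverses the edge backwards."""
    if predicate.endswith("-"):
        base = predicate[:-1]
        return {s for s, p, o in triples if p == base and o == node}
    return {o for s, p, o in triples if p == predicate and s == node}

def sequence(triples, node, predicates):
    """Nodes reached by following the predicates one after the other."""
    frontier = {node}
    for predicate in predicates:
        frontier = {n for f in frontier for n in step(triples, f, predicate)}
    return frontier

def plus(triples, start, predicates):
    """Nodes reached by repeating the sequence one or more times."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sequence(triples, node, predicates):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

cities = plus(TRIPLES, "ex:Roma", ["ex:from-", "ex:to"])
answers = sorted(c for c in cities
                 if (c, "ex:cityIn", "ex:CanaryIslands") in TRIPLES)
print(answers)  # prints ['ex:LasPalmas']
```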

Suppose that the traveller wants to use only planes. To express this, we first define a constraint, which consists of a name, interval delimiters that include or exclude the extremity nodes of the path, a quantifier, a variable to be substituted by nodes, and a graph pattern to be matched. For example, the constraint in the following query is named const1; it is open on the left and universal, which ensures that all trips are of type plane.

SELECT ?City
WHERE { CONSTRAINT const1 ]ALL ?Trip]: { ?Trip rdf:type ex:Plane. }
        ex:Roma (ex:from-%const1%.ex:to)+ ?City.
        ?City ex:cityIn ex:CanaryIslands.
}

Moreover, the traveller cannot leave the European Union, e.g., because of visa restrictions; that is, all intermediate stops must be cities in Europe.

SELECT ?City
WHERE { CONSTRAINT const1 ]ALL ?Trip]: { ?Trip rdf:type ex:Plane. }
        CONSTRAINT const2 ]ALL ?Stop]: { ?Stop ex:cityIn ?Country.
                                         ?Country ex:partOf ex:Europe. }
        ex:Roma (ex:from-%const1%.ex:to%const2%)+ ?City.
        ?City ex:cityIn ex:CanaryIslands.
}
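
The effect of the two constraints can be illustrated with a small Python sketch. It assumes one particular reading of the interval delimiters: a constraint written `]ALL ?X]` on a path step is checked on the node that the step reaches, and not on the node it starts from. Under that reading, const1 filters trips and const2 filters stops, including the final city. The toy data and the names `is_plane`, `in_europe`, and `constrained_destinations` are hypothetical, and this is not the engine's implementation.

```python
from collections import deque

# Hypothetical toy graph with typed trips and located cities.
TRIPLES = {
    ("_:t1", "ex:from", "ex:Roma"), ("_:t1", "ex:to", "ex:Madrid"),
    ("_:t1", "rdf:type", "ex:Plane"),
    ("_:t2", "ex:from", "ex:Madrid"), ("_:t2", "ex:to", "ex:LasPalmas"),
    ("_:t2", "rdf:type", "ex:Plane"),
    ("_:t3", "ex:from", "ex:Roma"), ("_:t3", "ex:to", "ex:Tunis"),
    ("_:t3", "rdf:type", "ex:Plane"),
    ("_:t4", "ex:from", "ex:Tunis"), ("_:t4", "ex:to", "ex:Tenerife"),
    ("_:t4", "rdf:type", "ex:Plane"),
    ("_:t5", "ex:from", "ex:Roma"), ("_:t5", "ex:to", "ex:Arrecife"),
    ("_:t5", "rdf:type", "ex:Ship"),
    ("ex:Madrid", "ex:cityIn", "ex:Spain"),
    ("ex:Tunis", "ex:cityIn", "ex:Tunisia"),
    ("ex:LasPalmas", "ex:cityIn", "ex:CanaryIslands"),
    ("ex:Tenerife", "ex:cityIn", "ex:CanaryIslands"),
    ("ex:Arrecife", "ex:cityIn", "ex:CanaryIslands"),
    ("ex:Spain", "ex:partOf", "ex:Europe"),
    ("ex:CanaryIslands", "ex:partOf", "ex:Europe"),
    ("ex:Tunisia", "ex:partOf", "ex:Africa"),
}

def is_plane(triples, trip):
    """const1: the trip is of type ex:Plane."""
    return (trip, "rdf:type", "ex:Plane") in triples

def in_europe(triples, stop):
    """const2: the stop is a city in something that is part of ex:Europe."""
    countries = {o for s, p, o in triples if s == stop and p == "ex:cityIn"}
    return any((c, "ex:partOf", "ex:Europe") in triples for c in countries)

def constrained_destinations(triples, start):
    """Cities reached by (ex:from-%const1%.ex:to%const2%)+ from start."""
    seen, queue = set(), deque([start])
    while queue:
        city = queue.popleft()
        trips = {s for s, p, o in triples if p == "ex:from" and o == city}
        for trip in trips:
            if not is_plane(triples, trip):
                continue  # const1 rejects this trip
            for s, p, stop in triples:
                if (s == trip and p == "ex:to"
                        and in_europe(triples, stop) and stop not in seen):
                    seen.add(stop)
                    queue.append(stop)
    return seen

reached = constrained_destinations(TRIPLES, "ex:Roma")
answers = sorted(c for c in reached
                 if (c, "ex:cityIn", "ex:CanaryIslands") in TRIPLES)
print(answers)  # prints ['ex:LasPalmas']
```

In this toy graph ex:Tenerife is excluded because it is only reachable through ex:Tunis, which fails const2, and ex:Arrecife is excluded because its only trip is a ship, which fails const1.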

Query engine

We have implemented a PSPARQL query evaluator in Java (1.5). This evaluator can parse SPARQL, PSPARQL and CPSPARQL queries, parse RDF documents written in the Turtle language, evaluate the query, and return the answer set.

Algorithm

The algorithm follows the backtracking technique developed in our work. Evaluating a regular expression pattern computes the satisfiability set of the regular expression under the current mappings, so that a variable appearing in several places of the query is bound consistently.
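
As a rough illustration of backtracking over mappings, the following Python sketch matches a list of plain triple patterns against a list of triples, extending the current mapping one pattern at a time and abandoning a branch as soon as a variable would receive two different values. It is a generic textbook-style sketch under simplifying assumptions (no path expressions, no constraints), and the names `is_var`, `match`, and `solve` are hypothetical; it is not the evaluator's actual algorithm.

```python
def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")

def match(pattern, triple, mapping):
    """Extend mapping so that pattern matches triple, or return None."""
    extended = dict(mapping)
    for p_term, value in zip(pattern, triple):
        if is_var(p_term):
            # A variable already bound must keep its current value.
            if extended.setdefault(p_term, value) != value:
                return None
        elif p_term != value:
            return None
    return extended

def solve(patterns, triples, mapping=None):
    """Yield every mapping that satisfies all patterns (backtracking)."""
    mapping = {} if mapping is None else mapping
    if not patterns:
        yield mapping
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match(first, triple, mapping)
        if extended is not None:
            yield from solve(rest, triples, extended)

# Hypothetical toy data and query.
TRIPLES = [
    ("ex:Paris", "ex:capital", "ex:France"),
    ("ex:Paris", "ex:train", "ex:Lyon"),
    ("ex:Berlin", "ex:capital", "ex:Germany"),
    ("ex:Berlin", "ex:train", "ex:Prague"),
]
PATTERNS = [
    ("?c1", "ex:capital", "ex:France"),
    ("?c1", "ex:train", "?c2"),
]
solutions = list(solve(PATTERNS, TRIPLES))
print(solutions)  # prints [{'?c1': 'ex:Paris', '?c2': 'ex:Lyon'}]
```

Because ?c1 is already mapped to ex:Paris when the second pattern is tried, the ex:Berlin train triple is rejected without further search.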

SPARQL coverage and compliance

Some aspects of SPARQL are not yet implemented in the current version.

This evaluator passed 435 of the 440 test cases in the W3C Data Access Working Group SPARQL test suite. The 5 failed tests are those that use the DESCRIBE result form, which is not implemented.

Servlet demo

The form below gives access to a servlet running on a virtual server at INRIA.

Other graphs to query from are available in this directory.

Availability

The (C)PSPARQL query engine is a research prototype. It is freely available for download under the CeCILL-B licence WITHOUT WARRANTY OF ANY KIND, either expressed or implied. See the License for the specific language governing rights and limitations under the License.

Sources and compiled versions are normally distributed through the GitLab repository, but that repository is not currently available. You can download the latest version (CPSPARQLEngine-V3.3.zip) from https://exmo.inria.fr/files/software/cpsparql/.

Resources

PSPARQL development site.

https://exmo.inria.fr/software/psparql/

Feel free to comment to Jerome:Euzenat#inria:fr, $Id: index.html,v 1.20 2023/01/05 11:14:37 euzenat Exp $