Ontology traversal with Jena and SPARQL

In this tutorial we demonstrate how to traverse an ontology using Apache Jena. We show two approaches: one using the Jena API methods and one using SPARQL queries against the model.

1. Requirements:

2. Some information

An ontology describes the types, properties and relationships of the entities in a particular domain. The pizza.owl ontology describes different kinds of pizza, such as vegetarian pizza or meaty pizza. Additionally, toppings and spiciness are categorized and linked via axioms to describe the pizza domain. For example, a vegetarian pizza cannot have a meaty topping.

When traversing an ontology, you have to remember that it is represented as a graph. The most effective way (in terms of writing less code) to handle graph-like and tree-like structures is recursion. If you are not familiar with recursion, we suggest reading up on it first in order to understand the following code snippets.
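To make the idea concrete before Jena comes into play, here is a minimal sketch of a recursive traversal over a small class graph, using only the Java standard library (Java 11 or later). The class name RecursionSketch, the adjacency map and the sample class names are purely illustrative assumptions. The visited set is what keeps the recursion from looping when the graph contains a cycle.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RecursionSketch {

    // Returns one indented line per reachable node, starting at 'start'.
    static List<String> traverse(Map<String, List<String>> children, String start) {
        List<String> lines = new ArrayList<>();
        visit(children, start, 0, new HashSet<>(), lines);
        return lines;
    }

    private static void visit(Map<String, List<String>> children, String node, int depth,
                              Set<String> visited, List<String> lines) {
        // Termination condition: stop if this node was already seen.
        if (!visited.add(node)) {
            return;
        }
        lines.add("  ".repeat(depth) + node);
        for (String child : children.getOrDefault(node, List.of())) {
            visit(children, child, depth + 1, visited, lines);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the back edge to "Pizza" is a deliberate cycle.
        Map<String, List<String>> children = Map.of(
                "Pizza", List.of("VegetarianPizza", "MeatyPizza"),
                "VegetarianPizza", List.of("Pizza"));
        traverse(children, "Pizza").forEach(System.out::println);
    }
}
```

Without the visited check, the back edge from VegetarianPizza to Pizza would make the method call itself until the stack overflows.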

3. Traversal using Jena API methods

The following Java code reads an ontology and traverses every class in it.

The “readOntology” method reads an ontology into a Jena OntModel. This model can be queried using Jena API methods such as “listSubClasses”.

“traverseStart” has an optional OntClass parameter that specifies a starting class for the traversal. If this parameter is null, all known roots are used as starting points.

The recursion happens in the private “traverse” method, which is called over and over again for each class. Remember to include a termination condition, otherwise you can easily run into loops and cause a stack overflow.
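As a minimal sketch, the three methods described above might look roughly like this. It assumes Jena 3.x or later (packages under org.apache.jena) and Java 11, and it is not necessarily identical to the original listing. The class name JenaApiTraversal, the visited set used as the termination condition, and returning the lines instead of printing them directly are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.util.iterator.ExtendedIterator;

public class JenaApiTraversal {

    // Reads an ontology (file path or URL) into an in-memory OntModel without a reasoner.
    static OntModel readOntology(String location) {
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
        model.read(location);
        return model;
    }

    // Starts at 'start', or at every hierarchy root if 'start' is null.
    static List<String> traverseStart(OntModel model, OntClass start) {
        List<String> lines = new ArrayList<>();
        Set<OntClass> visited = new HashSet<>();
        if (start != null) {
            traverse(start, visited, 0, lines);
        } else {
            ExtendedIterator<OntClass> roots = model.listHierarchyRootClasses();
            while (roots.hasNext()) {
                traverse(roots.next(), visited, 0, lines);
            }
        }
        return lines;
    }

    private static void traverse(OntClass cls, Set<OntClass> visited, int depth, List<String> lines) {
        // Skip anonymous classes (e.g. restrictions) and stop on classes already seen.
        // Note: a class with several parents is therefore listed only once.
        if (cls.isAnon() || !visited.add(cls)) {
            return;
        }
        lines.add("  ".repeat(depth) + cls.getLocalName());
        ExtendedIterator<OntClass> subClasses = cls.listSubClasses(true); // direct subclasses only
        while (subClasses.hasNext()) {
            traverse(subClasses.next(), visited, depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        // args[0] is the location of the ontology, for example a local .owl file.
        OntModel model = readOntology(args[0]);
        traverseStart(model, null).forEach(System.out::println);
    }
}
```

Passing a specific OntClass as the second argument of traverseStart restricts the traversal to the subtree below that class.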

The output for camera.owl looks like this:


4. Traversal using SPARQL queries

The following Java code replicates the functionality from above, but instead of using Jena methods, we query the OntModel ourselves using SPARQL. The code reads an ontology and traverses down the class hierarchy using SPARQL queries:

The “readOntology” method stays the same. Since we do not want to use the Jena API to query the model, we have to extract the roots ourselves. That is what the “getRoots” method does.

The “traverseStart” and “traverse” methods are the same as the ones above. Executing the code on camera.owl returns the following:

The output looks slightly different. That is because the SPARQL query works strictly on the triples provided in the OWL file. There are constructs like:

The first declaration of Camera makes it a subclass of PurchaseableItem, so it would not count as a root. However, the owl:intersectionOf and the “redefinition” of Camera it contains result in Camera being reported as a root class by the “getRoots” method.

These axioms and restrictions have to be processed and filtered, which we did not do in the presented code.
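One possible direction for such filtering is sketched below, assuming Jena 3.x or later with ARQ on the classpath. It is not the query used above and has not been verified against camera.owl. The class name SparqlRoots and the exact filter conditions are assumptions: the query only considers named classes and treats a class as a root when it has no named superclass other than owl:Thing or itself, so anonymous restriction and intersection classes are ignored on both sides.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.Resource;

public class SparqlRoots {

    // Named classes without a named superclass (other than owl:Thing or themselves).
    static final String ROOTS_QUERY = String.join("\n",
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "PREFIX owl:  <http://www.w3.org/2002/07/owl#>",
            "SELECT DISTINCT ?cls WHERE {",
            "  ?cls a owl:Class .",
            "  FILTER(isIRI(?cls) && ?cls != owl:Thing)",
            "  FILTER NOT EXISTS {",
            "    ?cls rdfs:subClassOf ?super .",
            "    FILTER(isIRI(?super) && ?super != owl:Thing && ?super != ?cls)",
            "  }",
            "}",
            "ORDER BY ?cls");

    // Runs the query against the model and collects the root classes as resources.
    static List<Resource> getRoots(Model model) {
        List<Resource> roots = new ArrayList<>();
        try (QueryExecution qe = QueryExecutionFactory.create(ROOTS_QUERY, model)) {
            ResultSet results = qe.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.nextSolution();
                roots.add(row.getResource("cls"));
            }
        }
        return roots;
    }
}
```

Because an OntModel is also a Model, it can be passed to getRoots directly. The query still sees only asserted triples, so superclass relations that would require a reasoner to infer are not taken into account.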

5. Conclusion

You can extract the topology of an ontology using both the Jena API and SPARQL queries. The Jena API offers many methods to retrieve data conveniently; as long as you do not need any functionality that Jena does not cover, I would stick to the Jena API.

If you have to query more complex data, you cannot avoid writing your own SPARQL queries. This is more cumbersome but also more powerful. We saw that, in terms of the topology, the Jena API did better (in terms of the expected results) than our hand-written SPARQL queries.

However, it is possible to reach the same results by improving the SPARQL queries.

If you run into errors, exceptions or other problems, or have suggestions for improvement, feel free to comment and ask.



