In this tutorial we demonstrate how to traverse an ontology using Apache Jena. We show two approaches: the Jena API and SPARQL queries against the model.
1. Requirements:
- Apache Jena 3.0.1
- pizza.owl or camera.owl as example ontologies
2. Some information
An ontology describes the types, properties and relationships of the entities in a particular domain. The pizza.owl ontology describes different kinds of pizza, such as vegetarian pizza or meaty pizza. Additionally, toppings and spiciness are categorized and linked via axioms to describe the pizza domain. For example, a vegetarian pizza cannot have a meaty topping.
When traversing an ontology, remember that it is represented as a graph. The most compact way (in terms of code) to handle graph- and tree-like structures is recursion. If you are not familiar with recursion, we suggest reading up on it before working through the following code snippets.
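As a warm-up for the recursion used later, here is a minimal sketch that does not need Jena. The class name `RecursionSketch` and the toy hierarchy are hypothetical and only loosely inspired by pizza.owl; they are not taken from the ontology file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RecursionSketch {
    // Hypothetical toy hierarchy (assumption): parent name -> direct children.
    static final Map<String, List<String>> CHILDREN = Map.of(
        "Pizza", List.of("VegetarianPizza", "MeatyPizza"),
        "VegetarianPizza", List.of("Margherita"));

    // Adds one line per node to 'out', indented by one tab per level.
    static void render(String node, int depth, List<String> out) {
        out.add("\t".repeat(depth) + node);
        // Recurse into every child; leaves have no entry and end the recursion.
        for (String child : CHILDREN.getOrDefault(node, List.of())) {
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        render("Pizza", 0, lines);
        lines.forEach(System.out::println);
    }
}
```

The method calls itself once per child and passes `depth + 1`, which is the same pattern the Jena-based traversal in this tutorial follows. This toy hierarchy has no cycles, so no loop protection is needed here.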
3. Traversal using Jena API methods
The following Java code reads an ontology and traverses every class in it.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.rdf.model.ModelFactory;

public class OntologyTraverserAPI {

    public static void readOntology( String file, OntModel model ) {
        // try-with-resources closes the stream even if reading fails
        try ( InputStream in = new FileInputStream( file ) ) {
            // arguments: stream, base URI (none), serialization language
            model.read( in, null, "RDF/XML" );
        } catch ( IOException e ) {
            e.printStackTrace();
        }
    }

    /**
     * Traverse the ontology, either from a given class or from all root classes.
     */
    public static void traverseStart( OntModel model, OntClass ontClass ) {
        // if ontClass is specified we only traverse down that branch
        if ( ontClass != null ) {
            traverse( ontClass, new ArrayList<OntClass>(), 0 );
            return;
        }

        // create an iterator over the root classes
        Iterator<OntClass> i = model.listHierarchyRootClasses();

        // traverse through all roots
        while ( i.hasNext() ) {
            OntClass tmp = i.next();
            traverse( tmp, new ArrayList<OntClass>(), 0 );
        }
    }

    /**
     * Start from a class, then recurse down to the sub-classes.
     * Use an occurs check to prevent getting stuck in a loop.
     *
     * @param oc     OntClass to traverse from
     * @param occurs classes on the current path (used to avoid loops)
     * @param depth  indicates the graph "depth"
     */
    private static void traverse( OntClass oc, List<OntClass> occurs, int depth ) {
        if ( oc == null ) return;

        // stop at classes without a local name (e.g. anonymous classes) and at owl:Nothing
        if ( oc.getLocalName() == null || oc.getLocalName().equals( "Nothing" ) ) return;

        // print depth times "\t" to get an explorer-tree-like output
        for ( int i = 0; i < depth; i++ ) {
            System.out.print( "\t" );
        }

        // print out the OntClass
        System.out.println( oc.toString() );

        // only descend if this OntClass is not already on the current path (avoid loops in graphs)
        if ( oc.canAs( OntClass.class ) && !occurs.contains( oc ) ) {
            // push this class on the occurs list before we recurse to avoid loops
            occurs.add( oc );

            // for every direct subclass, traverse down
            for ( Iterator<OntClass> i = oc.listSubClasses( true ); i.hasNext(); ) {
                OntClass subClass = i.next();
                // traverse down and increase depth (used for logging tabs)
                traverse( subClass, occurs, depth + 1 );
            }

            // after traversing the path, remove from occurs list
            occurs.remove( oc );
        }
    }

    public static void main( String[] args ) {
        // create OntModel
        OntModel model = ModelFactory.createOntologyModel();
        // read camera ontology
        readOntology( "./ontology/camera.owl", model );
        // start traverse
        traverseStart( model, null );
    }
}
The “readOntology” method reads an ontology into a Jena OntModel. This model can then be queried using Jena API methods such as “listSubClasses”.
“traverseStart” takes an optional OntClass parameter that specifies the starting class for the traversal. If this parameter is null, all known root classes are used as starting points.
The recursion happens in the private “traverse” method, which calls itself for each subclass. Remember to include a termination criterion; otherwise you can easily run into a loop and overflow the call stack.
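To see the termination criterion in isolation, the following sketch runs an occurs check on a deliberately cyclic graph, again without Jena. The class `OccursCheckSketch`, the method `walk` and the node names are hypothetical. Unlike the “traverse” method, this variant tests the path before it records the node, so a repeated node is skipped entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class OccursCheckSketch {
    // Visits 'node' and everything reachable from it.
    // 'path' holds the nodes on the current branch, 'visited' records the visiting order.
    static void walk(Map<String, List<String>> edges, String node,
                     List<String> path, List<String> visited) {
        if (path.contains(node)) {
            return; // node is already on this branch: following it again would loop forever
        }
        visited.add(node);
        path.add(node);
        for (String next : edges.getOrDefault(node, List.of())) {
            walk(edges, next, path, visited);
        }
        path.remove(path.size() - 1); // backtrack: this node leaves the current branch
    }

    public static void main(String[] args) {
        // Hypothetical cyclic graph (assumption): A -> B -> C -> A
        Map<String, List<String>> edges = Map.of(
            "A", List.of("B"),
            "B", List.of("C"),
            "C", List.of("A"));
        List<String> visited = new ArrayList<>();
        walk(edges, "A", new ArrayList<>(), visited);
        System.out.println(visited); // prints [A, B, C]
    }
}
```

Without the `path.contains` check, the call chain A, B, C, A, B, ... would never end and the program would fail with a `StackOverflowError`.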
The output for the camera.owl looks like this:
http://www.xfront.com/owl/ontologies/camera/#Money
http://www.xfront.com/owl/ontologies/camera/#SLR
http://www.xfront.com/owl/ontologies/camera/#Window
http://www.xfront.com/owl/ontologies/camera/#Range
http://www.xfront.com/owl/ontologies/camera/#BodyWithNonAdjustableShutterSpeed
http://www.xfront.com/owl/ontologies/camera/#PurchaseableItem
http://www.xfront.com/owl/ontologies/camera/#Camera
http://www.xfront.com/owl/ontologies/camera/#Digital
http://www.xfront.com/owl/ontologies/camera/#Large-Format
http://www.xfront.com/owl/ontologies/camera/#Lens
http://www.xfront.com/owl/ontologies/camera/#Body
4. Traversal using SPARQL queries
The following Java code replicates the functionality from above, but instead of using Jena API methods, we query the OntModel ourselves using SPARQL:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.jena.ontology.OntModel;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.RDFNode;

public class OntologyTraverserSPARQL {

    public static void readOntology( String file, OntModel model ) {
        // try-with-resources closes the stream even if reading fails
        try ( InputStream in = new FileInputStream( file ) ) {
            // arguments: stream, base URI (none), serialization language
            model.read( in, null, "RDF/XML" );
        } catch ( IOException e ) {
            e.printStackTrace();
        }
    }

    private static List<String> getRoots( OntModel model ) {
        List<String> roots = new ArrayList<String>();

        // select every class that is a subclass of owl:Thing;
        // the OPTIONAL block binds a further superclass if there is one,
        // but it does not remove classes that have such a superclass
        String getRootsQuery =
              "SELECT DISTINCT ?s WHERE "
            + "{"
            + " ?s <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.w3.org/2002/07/owl#Thing> . "
            + " FILTER ( ?s != <http://www.w3.org/2002/07/owl#Thing> && ?s != <http://www.w3.org/2002/07/owl#Nothing> ) . "
            + " OPTIONAL { ?s <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?super . "
            + " FILTER ( ?super != <http://www.w3.org/2002/07/owl#Thing> && ?super != ?s ) } . "
            + "}";

        Query query = QueryFactory.create( getRootsQuery );

        try ( QueryExecution qexec = QueryExecutionFactory.create( query, model ) ) {
            ResultSet results = qexec.execSelect();
            while ( results.hasNext() ) {
                QuerySolution soln = results.nextSolution();
                RDFNode sub = soln.get( "s" );

                // skip anonymous classes
                if ( !sub.isURIResource() ) continue;

                roots.add( sub.toString() );
            }
        }

        return roots;
    }

    public static void traverseStart( OntModel model, String entity ) {
        // if a starting class is given, traverse only that branch
        if ( entity != null ) {
            traverse( model, entity, new ArrayList<String>(), 0 );
        }
        // otherwise get the roots and traverse each of them
        else {
            for ( String root : getRoots( model ) ) {
                traverse( model, root, new ArrayList<String>(), 0 );
            }
        }
    }

    public static void traverse( OntModel model, String entity, List<String> occurs, int depth ) {
        if ( entity == null ) return;

        String queryString = "SELECT ?s WHERE { "
            + "?s <http://www.w3.org/2000/01/rdf-schema#subClassOf> <" + entity + "> . }";

        Query query = QueryFactory.create( queryString );

        // only handle entities that are not already on the current path (avoid loops)
        if ( !occurs.contains( entity ) ) {
            // print depth times "\t" to get an explorer-tree-like output
            for ( int i = 0; i < depth; i++ ) {
                System.out.print( "\t" );
            }
            // print out the URI
            System.out.println( entity );

            try ( QueryExecution qexec = QueryExecutionFactory.create( query, model ) ) {
                ResultSet results = qexec.execSelect();

                // push this entity on the occurs list before we recurse to avoid loops
                occurs.add( entity );

                while ( results.hasNext() ) {
                    QuerySolution soln = results.nextSolution();
                    RDFNode sub = soln.get( "s" );

                    // skip anonymous classes
                    if ( !sub.isURIResource() ) continue;

                    // traverse down and increase depth (used for logging tabs)
                    traverse( model, sub.toString(), occurs, depth + 1 );
                }

                // after traversing the path, remove from occurs list
                occurs.remove( entity );
            }
        }
    }

    public static void main( String[] args ) {
        // create OntModel
        OntModel model = ModelFactory.createOntologyModel();
        // read camera ontology
        readOntology( "./ontology/camera.owl", model );
        // start traverse
        traverseStart( model, null );
    }
}
The “readOntology” method stays the same. Since we do not want to use the Jena API to query the model, we have to extract the roots ourselves. That is what the “getRoots” method does.
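The idea behind finding roots can be sketched without SPARQL: a root is a class that has no superclass other than owl:Thing or itself. The following illustration works on plain subclass/superclass pairs. The class `RootFinderSketch`, the method `roots` and the shortened class names are hypothetical; this is not what “getRoots” executes.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RootFinderSketch {
    static final String THING = "http://www.w3.org/2002/07/owl#Thing";

    // Each pair is {subclass, superclass}, mirroring one rdfs:subClassOf triple.
    static Set<String> roots(List<String[]> subClassOf) {
        Set<String> classes = new LinkedHashSet<>();
        Set<String> withParent = new LinkedHashSet<>();
        for (String[] triple : subClassOf) {
            String sub = triple[0];
            String sup = triple[1];
            classes.add(sub);
            if (!sup.equals(THING)) {
                classes.add(sup);
                // A parent only counts if it is not the class itself.
                if (!sup.equals(sub)) {
                    withParent.add(sub);
                }
            }
        }
        // Whatever is left has no real parent and is therefore a root.
        classes.removeAll(withParent);
        return classes;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical names instead of full URIs (assumption).
        List<String[]> triples = List.of(
            new String[] {"Camera", "PurchaseableItem"},
            new String[] {"Lens", "PurchaseableItem"},
            new String[] {"Money", THING});
        System.out.println(roots(triples)); // prints [PurchaseableItem, Money]
    }
}
```

A `LinkedHashSet` keeps the classes in the order in which they were first seen, which makes the result easy to compare by eye.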
The “traverseStart” and “traverse” methods mirror the ones above. Running the code on camera.owl prints the following:
http://www.xfront.com/owl/ontologies/camera/#PurchaseableItem
http://www.xfront.com/owl/ontologies/camera/#Camera
http://www.xfront.com/owl/ontologies/camera/#Digital
http://www.xfront.com/owl/ontologies/camera/#Large-Format
http://www.xfront.com/owl/ontologies/camera/#Lens
http://www.xfront.com/owl/ontologies/camera/#Body
http://www.xfront.com/owl/ontologies/camera/#Large-Format
http://www.xfront.com/owl/ontologies/camera/#Digital
http://www.xfront.com/owl/ontologies/camera/#Window
http://www.xfront.com/owl/ontologies/camera/#Range
http://www.xfront.com/owl/ontologies/camera/#Money
http://www.xfront.com/owl/ontologies/camera/#Camera
http://www.xfront.com/owl/ontologies/camera/#Digital
http://www.xfront.com/owl/ontologies/camera/#Large-Format
http://www.xfront.com/owl/ontologies/camera/#Large-Format
http://www.xfront.com/owl/ontologies/camera/#Digital
http://www.xfront.com/owl/ontologies/camera/#Body
http://www.xfront.com/owl/ontologies/camera/#Lens
The output looks slightly different. That is because the SPARQL query works strictly on the triples provided in the OWL file. There are constructs like:
<owl:Class rdf:ID="Camera">
    <rdfs:subClassOf rdf:resource="#PurchaseableItem"/>
</owl:Class>

<owl:Class rdf:ID="SLR">
    <owl:intersectionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Camera"/>
        <owl:Restriction>
            <owl:onProperty rdf:resource="#viewFinder"/>
            <owl:hasValue rdf:resource="#ThroughTheLens"/>
        </owl:Restriction>
    </owl:intersectionOf>
</owl:Class>
The first declaration makes Camera a subclass of PurchaseableItem, so it should not count as a root. However, the owl:intersectionOf and the “redefinition” of Camera it contains result in Camera being a root class as far as the “getRoots” method is concerned.
Such axioms and restrictions have to be processed and filtered, which the presented code does not do.
5. Conclusion
You can extract the topology of an ontology using both the Jena API and SPARQL queries. The Jena API offers many methods to retrieve data conveniently; as long as you do not require any functionality that Jena does not cover, I would stick to the Jena API.
If you have to query more complex data, you cannot avoid writing your own SPARQL queries. This is more cumbersome but also more powerful. We saw that for the topology, the Jena API produced results closer to what we expected than our handwritten SPARQL queries.
However, it is possible to reach the same results if you improve the SPARQL queries.
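One possible direction for such an improvement is to exclude every class that has a named superclass, using the SPARQL 1.1 construct FILTER NOT EXISTS. The following is only a sketch: the class `RootQuerySketch` and the method `rootQuery` are hypothetical, and the result depends on the reasoner attached to the OntModel. Excluding rdfs:Resource is an assumption about superclasses that an RDFS reasoner may add.

```java
final class RootQuerySketch {
    static final String RDFS = "http://www.w3.org/2000/01/rdf-schema#";
    static final String OWL = "http://www.w3.org/2002/07/owl#";

    // Builds a root query that drops every class with a named superclass
    // other than itself, owl:Thing or rdfs:Resource (the latter is an assumption).
    static String rootQuery() {
        return String.join("\n",
            "PREFIX rdfs: <" + RDFS + ">",
            "PREFIX owl: <" + OWL + ">",
            "SELECT DISTINCT ?s WHERE {",
            "  ?s a owl:Class .",
            "  FILTER ( isIRI(?s) && ?s != owl:Thing && ?s != owl:Nothing )",
            "  FILTER NOT EXISTS {",
            "    ?s rdfs:subClassOf ?super .",
            "    FILTER ( isIRI(?super) && ?super != ?s",
            "             && ?super != owl:Thing && ?super != rdfs:Resource )",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(rootQuery());
    }
}
```

The returned string could be passed to `QueryFactory.create` in place of the query used in “getRoots”. Classes that are only defined through owl:intersectionOf would still need separate handling.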
If you run into errors, exceptions or problems, or have suggestions for improvement, feel free to comment and ask.





