Michael DeBellis

Using SPARQL to Refactor User Names to UUIDs

This post describes a set of SPARQL transformations for refactoring your ontology. One of the main goals of refactoring is to let you start with a simple design and add complexity only if and when it is needed. As I've discussed in a previous post, using UUIDs for the IRIs of your ontology makes writing SPARQL queries more complex (I will have more on that in another post soon). For that reason, I created SPARQL transformations that take an ontology with user-supplied names and replace those names with UUIDs. This way a developer can start off with user-supplied names, and if UUIDs become necessary (e.g., for multiple languages, security, or very large ontologies) the change can be made automatically. The complete file (called UUID Transformations) can be found on my GitHub site here: SPARQL Tools.


To use it, first run the SPARQL transformations in the file SPARQL Label Utilities to make sure that every entity in your ontology has a value for its rdfs:label property. For more on the label utilities see this previous post. Then use the following to create an annotation property that will hold the new UUID IRI for each entity:

INSERT DATA {dmn:entityCopy a owl:AnnotationProperty.}	

Note that the prefix "dmn:" is bound to the IRI for your ontology. Then use the next SPARQL transformations to create copies for each class, property, and individual. For example, the transformation for classes looks like this:

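INSERT {?c dmn:entityCopy ?niri.}
WHERE {?c a owl:Class.
       # User-supplied name: the part of the IRI after the '#'
       BIND(STRAFTER(STR(?c), '#') AS ?name)
       # New IRI: ontology namespace + "OwlClass" + a fresh UUID
       BIND(IRI(CONCAT(STR(dmn:), "OwlClass", STRUUID())) AS ?niri)
       # Skip the built-in classes and anonymous (blank node) classes
       FILTER(?c != owl:Thing && ?c != owl:Nothing && !isBlank(?c)).}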

The first BIND retrieves the part of the IRI after the # sign (i.e., the user-supplied name). The second BIND creates a new IRI by concatenating the IRI for the domain (the dmn: prefix) with the string "OwlClass" and a UUID generated by SPARQL. The isBlank test excludes anonymous classes, which are created when you define restrictions. Those are blank nodes with no IRIs, and trying to assign a copy to them will cause errors. There are similar transformations for individuals, datatype properties, and object properties in the file. The next transformation does most of the work:

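DELETE {?e ?p ?o.}
INSERT {?newe ?newp ?newo.}
WHERE {?e ?p ?o.
       # Look up the UUID copy, if any, for each position in the triple
       OPTIONAL{?e dmn:entityCopy ?ne.}
       OPTIONAL{?p dmn:entityCopy ?np.}
       OPTIONAL{?o dmn:entityCopy ?no.}
       # Use the copy when one exists; otherwise keep the original term
       BIND(IF(BOUND(?ne), ?ne, ?e) AS ?newe)
       BIND(IF(BOUND(?np), ?np, ?p) AS ?newp)
       BIND(IF(BOUND(?no), ?no, ?o) AS ?newo)
       # Leave the entityCopy bookkeeping triples themselves untouched
       FILTER(?p != dmn:entityCopy)}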

This matches every triple in your ontology and then checks whether the Subject, Predicate, and/or Object has a UUID copy. For example, the Subject could be an anonymous class, in which case it stays the same (although its Object may be a domain Class that needs to be changed). The Predicate could be a domain Predicate (which will be converted to a UUID and hence has a copy) or a Predicate like rdfs:label, which of course shouldn't be changed. The Object could be a literal such as a string or an integer, which doesn't have a copy and is carried over as is. The Object could also be an entity in the domain ontology, in which case it needs to be changed. The BIND statements bind ?newe, ?newp, and ?newo to the appropriate resource. Note: we could use COALESCE here instead of the IF test to see whether the UUID variable (e.g., ?ne) is bound. However, that would result in many SPARQL error messages, so I chose to use the IF test. (Thanks to Lorenz Buehmann, who pointed out some issues with my original code on the Protégé mailing list.)
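Since the update rewrites every triple, it can help to preview what would change before running it. The following is a minimal sketch and not part of the UUID Transformations file: it reuses the same OPTIONAL and BIND logic in a SELECT query and keeps only the triples where at least one position has a copy. The namespace bound to dmn: is a placeholder assumption, so substitute the IRI of your own ontology, and the LIMIT is arbitrary.

```sparql
# Placeholder namespace (assumption): replace with your ontology's IRI
PREFIX dmn: <http://example.org/my-ontology#>

SELECT ?e ?p ?o ?newe ?newp ?newo
WHERE {?e ?p ?o.
       OPTIONAL{?e dmn:entityCopy ?ne.}
       OPTIONAL{?p dmn:entityCopy ?np.}
       OPTIONAL{?o dmn:entityCopy ?no.}
       BIND(IF(BOUND(?ne), ?ne, ?e) AS ?newe)
       BIND(IF(BOUND(?np), ?np, ?p) AS ?newp)
       BIND(IF(BOUND(?no), ?no, ?o) AS ?newo)
       FILTER(?p != dmn:entityCopy)
       # Show only triples that the update would actually change
       FILTER(BOUND(?ne) || BOUND(?np) || BOUND(?no))}
LIMIT 25
```

Each result row pairs an existing triple (?e ?p ?o) with the triple that would replace it (?newe ?newp ?newo), so you can spot-check a few rows before committing to the change.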


The DELETE statement deletes the old triple and the INSERT statement inserts the new triple with UUIDs where appropriate. There are also a couple of cleanup transformations after this one, which I think are self-explanatory from the comments in the GitHub file. I used this on the PizzaWData tutorial ontology and here is the result. The ontology now has UUIDs rather than user-supplied names (make sure to update your rendering options to view it): PizzaWDataWUUIDs
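Because the IRIs no longer carry readable names, a quick query can confirm that the converted classes still have their labels. This is a sketch rather than one of the transformations in the file: it lists each named class next to its rdfs:label and relies only on the standard owl: and rdfs: namespaces.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# List every named class that has a label; after the conversion
# the ?c column should show UUID-based IRIs
SELECT ?c ?label
WHERE {?c a owl:Class;
          rdfs:label ?label.
       # Ignore anonymous classes created by restrictions
       FILTER(!isBlank(?c))}
ORDER BY ?label
```

A named class that exists in the ontology but does not appear in these results has no rdfs:label value.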


The figure below shows what the classes look like with UUIDs when the rendering hasn't been adjusted (you can see the name of the selected class in the label annotation).




