This post shows how to use SPARQL to take user-defined names in IRIs and use them to fill in rdfs:label with an appropriately formatted string. It is a follow-up to my earlier post about names in OWL. As discussed in that post, there is a distinction between the last part of the IRI and the value of the rdfs:label property. In Protégé, if you choose the "User Supplied Names" option for your IRIs, Protégé leaves the rdfs:label property empty. However, some tools work better when that property has a value. The following SPARQL queries can be run in the Snap SPARQL plugin to take the IRI names of classes and individuals in "CamelBack" notation and put names like "Camel Back" into rdfs:label. The last two queries handle property names, which typically look like propertyName, and turn them into labels like "property Name".
The following works for the Pizza tutorial ontology, although nothing in the queries is specific to that ontology, so you should be able to use them on any ontology developed with desktop Protégé using the "User Supplied Names" feature and following these naming conventions. Each query checks whether a value already exists for rdfs:label and only creates a label when none exists (that is what the !BOUND(?elbl) test in the FILTER does). The queries are easy to adapt to other naming conventions such as underscores: match "_" as the pattern in the REPLACE and use " " as the replacement.
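As a minimal sketch of the underscore variant, assuming class names such as Camel_Back (a hypothetical convention used here only for illustration), the class query might look like the following. No leading-space cleanup is needed because nothing is inserted at the start of the name.

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Sketch: create labels for classes whose names use underscores (assumed convention)
CONSTRUCT {?c rdfs:label ?lblname.}
WHERE {?c rdfs:subClassOf owl:Thing.
# Take the part of the IRI after '#'
BIND(STRAFTER(STR(?c), '#') AS ?name)
# Replace every underscore with a single space
BIND(REPLACE(?name, "_", " ") AS ?lblname)
# Only create a label when none exists yet
OPTIONAL{?c rdfs:label ?elbl.}
FILTER(?c != owl:Thing && ?c != owl:Nothing && !BOUND(?elbl))}
```

The same one-line change to the REPLACE call should carry over to the individual and property queries.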
Also, there is a bug in the Protégé Snap SPARQL plugin: it uses 0-based indexing for the SUBSTR function, whereas the SPARQL standard starts indexing at 1. So if you use these queries in other SPARQL implementations, change the "1" in the SUBSTR calls to "2".
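For reference, here is a sketch of the class query adjusted for an engine with standard 1-based SUBSTR. It assumes the engine can match the ?c rdfs:subClassOf owl:Thing pattern, that is, those triples are asserted or inferred. I have not run it against any particular engine.

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Sketch: class labels for engines with standard 1-based SUBSTR
CONSTRUCT {?c rdfs:label ?lblname.}
WHERE {?c rdfs:subClassOf owl:Thing.
BIND(STRAFTER(STR(?c), '#') AS ?name)
# Put a space in front of every upper-case letter
BIND(REPLACE(?name, "([A-Z])", " $1") AS ?namewbs)
# SUBSTR(?namewbs, 2) starts at the second character, dropping the leading space
BIND(IF(STRSTARTS(?namewbs, " "), SUBSTR(?namewbs, 2), ?namewbs) AS ?lblname)
OPTIONAL{?c rdfs:label ?elbl.}
FILTER(?c != owl:Thing && ?c != owl:Nothing && !BOUND(?elbl))}
```

The only change from the Snap SPARQL version is the second argument of SUBSTR.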
As always, if you have questions, feel free to comment below or email me.
Addendum: I realized I had forgotten to exclude owl:topDataProperty and owl:topObjectProperty. You never want to mess with those. I added a clause in the FILTER for those queries below so that the top properties are excluded.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
#Create labels for all Classes
CONSTRUCT {?c rdfs:label ?lblname.}
WHERE {?c rdfs:subClassOf owl:Thing.
BIND(STRAFTER(STR(?c), '#') as ?name)
BIND(REPLACE(?name,"([A-Z])", " $1" ) as ?namewbs)
BIND (IF (STRSTARTS(?namewbs," "),SUBSTR(?namewbs,1),?namewbs) AS ?lblname)
OPTIONAL{?c rdfs:label ?elbl.}
FILTER(?c != owl:Thing && ?c != owl:Nothing && !BOUND(?elbl))}
#Create labels for all Individuals
CONSTRUCT {?i rdfs:label ?lblname.}
WHERE {?i a owl:Thing.
BIND(STRAFTER(STR(?i), '#') as ?name)
BIND(REPLACE(?name,"([A-Z])", " $1" ) as ?namewbs)
BIND (IF (STRSTARTS(?namewbs," "),SUBSTR(?namewbs,1),?namewbs) AS ?lblname)
OPTIONAL{?i rdfs:label ?elbl.}
FILTER(!BOUND(?elbl))}
#Create labels for all Object Properties
CONSTRUCT {?p rdfs:label ?lblname.}
WHERE {?p a owl:ObjectProperty.
BIND(STRAFTER(STR(?p), '#') as ?name)
BIND(REPLACE(?name,"([A-Z])", " $1" ) as ?lblname)
OPTIONAL{?p rdfs:label ?elbl.}
FILTER(?p != owl:topObjectProperty && !BOUND(?elbl))}
#Create labels for all Data Properties
CONSTRUCT {?p rdfs:label ?lblname.}
WHERE {?p a owl:DatatypeProperty.
BIND(STRAFTER(STR(?p), '#') as ?name)
BIND(REPLACE(?name,"([A-Z])", " $1" ) as ?lblname)
OPTIONAL{?p rdfs:label ?elbl.}
FILTER(?p != owl:topDataProperty && !BOUND(?elbl))}
Hi Michael,
thanks for sharing your workflows with the people out there.
Minor comments:
You could avoid the white space in the beginning with the following regex: (?!^)([A-Z])
Moreover, a slightly more efficient regex would be ([a-z0-9])([A-Z]), I guess. It should at least (i) make it unnecessary to remove the leading white space and (ii) avoid splitting runs of upper-case characters, e.g. in acronyms like USA.
Admittedly, this is still error-prone in some, if not many, cases.
Cheers,
Lorenz
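A sketch of how the second suggestion could be applied to the class query follows. The "$1 $2" replacement string is my assumption about how the two captured groups would be rejoined, and I have not tested it in Snap SPARQL. The two-group pattern uses only plain groups and character classes. The lookahead form (?!^) is not part of the XPath regular expression syntax that SPARQL's REPLACE is defined against, so whether it works depends on the engine.

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Sketch: class labels using the two-group regex from the comment above
CONSTRUCT {?c rdfs:label ?lblname.}
WHERE {?c rdfs:subClassOf owl:Thing.
BIND(STRAFTER(STR(?c), '#') AS ?name)
# Insert a space only between a lower-case letter or digit and an upper-case letter
BIND(REPLACE(?name, "([a-z0-9])([A-Z])", "$1 $2") AS ?lblname)
OPTIONAL{?c rdfs:label ?elbl.}
FILTER(?c != owl:Thing && ?c != owl:Nothing && !BOUND(?elbl))}
```

With this pattern a name like CamelBack becomes "Camel Back" with no leading space to strip, so the SUBSTR step drops out. A run of capitals such as USA is left unsplit, which also means a name like USAPizza stays unchanged.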