VocBench Frequently Asked Questions

Building

Q. I see I need to use Maven to build VocBench. I don't know how to use it and, frankly, I don't want to learn yet another technology, I just want to use <replace_here_with_your_favourite_IDE>. May I skip this step?

Yes, in theory you can. Check all the dependencies listed in the pom.xml file of each project, search for them on the web, download them (and pray that you got the exact version of everything, all the required dependencies, etc.), import them into <replace_here_with_your_favourite_IDE>, inside the project you just created to hold the sources, then import the sources, and so on.

...though, let us be a bit evangelist here. Maven can be a big beast: once you start to love it, you get addicted to it and start thinking things like "I want to brew a beer with it" (we are sure there is a Maven plugin even for that...). But if you are just an end user of Maven and only need it to build a project, the time needed to download and install it is so short that it immediately pays back the time you would spend building this single project the traditional way. See the build instructions to see how quick it is to build VocBench (and most other projects you will find out there) with Maven.

Last but not least, Maven is today a de facto standard for releasing source code.

...and, oh yes, there is almost certainly a Maven plugin for <replace_here_with_your_favourite_IDE>!

General

Q. Can VocBench connect to triple stores other than RDF4J or GraphDB?

A question that comes to many curious users willing to use VB with other technologies is: why not other triple stores? And if so, which ones? Why have more of them not been explored? Why not a more agnostic access to different technologies?

The answer comes from more than a decade of experience with the evolution of Semantic Turkey before version 1.0 (the version that was released with VocBench 3).

First, let's single out the main challenges to real interoperability with triple stores:

standards limitations: the SPARQL standard is limited. It offers a Web API but, to mention just the main (though not the only) limitation, it has no support for transactions. This gap gave rise to a series of vendor-specific technologies, with some vendors adhering to one or more de facto standards offered by middleware.
backend-side extensions: some of the functionalities of VocBench require us to develop extensions that are deployed in the backend. Again, this drives us to be technology-specific, committing to a particular architecture for extensions.
query performance: yes, SPARQL is a standard, and the same query should return the same results wherever it runs. We can't say the same about which query runs best on which store.

Let's see them one by one:

As for the lack of a standard for transactions and the consequent proliferation of specific technologies, we tried to follow a shallow-integration approach. Semantic Turkey, the RDF management platform behind VocBench, initially depended on another project of the ART group, the OWLART API, a very thin middleware over different RDF frameworks. This layer did not implement any storage facility and delegated most of the business logic to the underlying implementations (i.e. wrappers for notable frameworks such as Jena and Sesame, the precursor to RDF4J). The first versions of Semantic Turkey, developed before SPARQL was a recommendation, were based on a graph API, which made it easier to switch among different technologies, provided that the semantics of basic operations such as addTriple(), deleteTriple(), listStatements(), etc. were consistent among them. When differences arose, we developed the wrappers so that they would conform to the semantics we provided in our API. The experience was demanding, and in the end it resulted in:

drifting away too often from our objectives, which are ST and its applications (e.g. VocBench), not an RDF API
losing much of the interesting new functionality coming from each of the wrapped technologies, as we had to conform to the common denominator of all of them
progressively managing to keep up with only a few of the wrapped technologies, until we ended up with Sesame alone, some always-slightly-outdated Jena support, and none of the others

It was then with much relief that, from version 1.0 of ST and VB3.0, we decided to abandon the OWLART project and move to RDF4J as the native, stable solution for RDF management within ST. We can finally focus on ST's features alone, standing on top of a mature library such as RDF4J and directly benefiting from all of its improvements by simply upgrading to its new versions.

A set of facilities has been developed on top of RDF4J's repository connection and query classes in order to simplify the development of services specifically for ST. These facilities rely on SPARQL to be exploited at their best.

And now, the next question: OK, so now you use RDF4J; then why not connect to the many available triple stores that are compliant with RDF4J? Here we come to the second point described above: requirements on the backend. The need for extensions deployed directly on the triple store rather than in the application imposes the choice of a reference architecture. In our case, this reference is RDF4J's Sail stack framework.

So VocBench allows, in theory, connections to any triple store accessible through an RDF4J remote connection, though triple stores compliant with the Sail stack framework (such as GraphDB) are to be preferred. Individual triple stores may have further requirements.

Finally, the last obstacle is represented by non-trivial differences in the way queries are processed. Several RDF systems, which limit their activity to trivial queries for retrieving resources and use DESCRIBE queries to provide information about them, might easily switch triple stores. However, VocBench features several services with non-trivial queries, produced dynamically by taking into account tens of preferences and settings specified by users and system administrators. Unfortunately, the same query may perform very differently across triple stores, due to the different choices made by the query optimizers and execution engines internal to each of them. There are indeed techniques and solutions that can help keep performance consistent across different engines (e.g. nested queries enforce the order of resolution of the query clauses, thus maintaining consistent processing). Nonetheless, keeping the hundreds of services that characterize VocBench aligned even just across GraphDB and the RDF4J internal storage solutions requires continuous dedication and maintenance.

Information for connecting VocBench to separate triple stores is provided here and a complete example is available on the VocBench test drives.

A complete description of the requirements for connectable triplestores is provided here.

Q. Why are all those URIs in the data also used for labels? (in SKOS-XL)

VocBench allows for the use of different standards for representing lexicalizations: plain rdfs:labels, SKOS labeling properties, and SKOS-XL labels. Labels in SKOS-XL are “reified”. This means that the labels are themselves resources identified by URIs, and can have properties in turn. This has two advantages:

You can model labels in turn (adding, for instance, lexical relationships between labels)
You can provide them with editorial notes and metadata in general

VB already provides some additional metadata automatically, such as creation and modification dates (as it does for concepts), while it is up to the user to add additional notes or, if they want, to adopt vocabularies of lexical relations.

VB also allows for reified skos:definitions, a pattern foreseen by the SKOS standard (see second case of the "Advanced Documentation Features" section of the SKOS Primer document).

However, VB provides an export for SKOS that also allows flattening the reified skos:definitions to simple literals. There are then options for retaining or removing the triples providing the reified descriptions and labels.

If you prefer to handle these transformations offline from VB, these operations are also available as command line utilities through the full distribution of the OWLART library.
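
To make the idea of flattening concrete, here is a minimal sketch in Python. It is not VocBench's export code: triples are modeled as plain (subject, predicate, object) tuples, literals as (text, language) pairs, and the function name flatten_xl_labels, the keep_reified flag, the example.org namespace and the sample data are all made up for illustration. It only covers labels; reified definitions would follow the same pattern.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples and
# literals are (text, language) pairs. This is not VocBench's export code.
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"

# SKOS-XL labeling properties and their plain SKOS counterparts.
XL_TO_SKOS = {
    SKOSXL + "prefLabel": SKOS + "prefLabel",
    SKOSXL + "altLabel": SKOS + "altLabel",
    SKOSXL + "hiddenLabel": SKOS + "hiddenLabel",
}


def flatten_xl_labels(triples, keep_reified=False):
    """Add a plain SKOS label for each reified one; optionally keep the reified triples."""
    # Map each label resource to its literal form.
    literal_of = {s: o for s, p, o in triples if p == SKOSXL + "literalForm"}
    flattened = []
    for s, p, o in triples:
        links_to_label = p in XL_TO_SKOS and o in literal_of
        describes_label = s in literal_of
        if links_to_label:
            # concept --skosxl:prefLabel--> label --literalForm--> literal
            # becomes concept --skos:prefLabel--> literal
            flattened.append((s, XL_TO_SKOS[p], literal_of[o]))
        if keep_reified or not (links_to_label or describes_label):
            flattened.append((s, p, o))
    return flattened


EX = "http://example.org/"  # hypothetical namespace
data = [
    (EX + "c1", SKOSXL + "prefLabel", EX + "xl1"),
    (EX + "xl1", SKOSXL + "literalForm", ("cat", "en")),
    (EX + "xl1", "http://purl.org/dc/terms/created", ("2020-01-01", None)),
    (EX + "c1", SKOS + "broader", EX + "c0"),
]

flat = flatten_xl_labels(data)                     # reified triples removed
full = flatten_xl_labels(data, keep_reified=True)  # reified triples retained
```

With keep_reified left at its default, the label resource and its metadata disappear and only the plain skos:prefLabel remains; with keep_reified=True the plain label is added next to the original reified description.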

Q. I'm not interested in the collaborative aspects of VB, or at least I don't need all that user authentication/permissions part. Can I disable these features?
Also, I would like more in-depth control over the data, at triple level, though not necessarily by having to use SPARQL.

Just install VocBench and keep using the default administrator login that is suggested by the platform. In this case, you will be using VB3 in much the same way as a desktop tool.

Q. Can VocBench be used to publish data, in addition to editing it?

Well, you can always use VocBench as a browsing tool, providing users with capabilities for reading only. However, there is a dedicated tool that does a much better job: ShowVoc! ShowVoc is a companion to VocBench for data publication, browsing, and consumption (both machine and human consumption).

It provides:

a streamlined interface, designed for easy content consumption
a global search that spans across all hosted datasets, homogenizing results coming from different lexical and semantic representations
a translation service exploiting mappings between the various hosted datasets
a linkset browser, showing linksets between the available datasets
no need to authenticate for ordinary users: they just land on the page and browse
all other features that you already know from VB are present in ShowVoc as well

SKOS

Q. Why can't I see the concepts in the concept tab when a scheme is selected?

The concept tab works this way: if you select a scheme (or a combination of schemes), the concepts of that scheme (or those schemes) are shown. If you select no scheme, all concepts are shown. In the first case (the one filtered by schemes), the root concepts are taken directly from the explicit declarations of top concepts, as in skos:topConceptOf. In the latter case, they are computed automatically by taking all concepts that have no parent.

So, one typical thing that happens is that only root concepts are shown, because some tools only use the skos:topConceptOf property and do not use the skos:inScheme property, wrongly assuming that it is inherited by child concepts (see https://www.w3.org/TR/skos-reference/#L2577).

Another possibility (especially if you see no concepts at all) is that the triples with skos:topConceptOf are missing (and only skos:inScheme has been used). This is wrong modeling as well.
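
The behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not VocBench's actual code: triples are plain (subject, predicate, object) tuples, the function names and the example.org data are made up, and for brevity only skos:broader and skos:topConceptOf are considered (their inverses are ignored). The filtering of children by an explicit skos:inScheme is an assumption inferred from the symptoms described in this answer.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples.
# This mirrors the behavior described above, not VocBench's actual code.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def root_concepts(triples, schemes=()):
    """Roots of the concept tree for the selected schemes (or for no scheme)."""
    if schemes:
        # Scheme selected: only explicitly declared top concepts count.
        return {s for s, p, o in triples
                if p == SKOS + "topConceptOf" and o in schemes}
    # No scheme selected: every concept without a parent is a root.
    concepts = {s for s, p, o in triples
                if p == RDF_TYPE and o == SKOS + "Concept"}
    with_parent = {s for s, p, o in triples if p == SKOS + "broader"}
    return concepts - with_parent


def child_concepts(triples, parent, schemes=()):
    """Children of a concept, restricted to the selected schemes if any."""
    children = {s for s, p, o in triples
                if p == SKOS + "broader" and o == parent}
    if schemes:
        # Assumption: membership must be stated explicitly with skos:inScheme.
        members = {s for s, p, o in triples
                   if p == SKOS + "inScheme" and o in schemes}
        children &= members
    return children


EX = "http://example.org/"  # hypothetical namespace
A, B, SCHEME = EX + "a", EX + "b", EX + "scheme"
typing = [(A, RDF_TYPE, SKOS + "Concept"), (B, RDF_TYPE, SKOS + "Concept")]

# Case 1: skos:topConceptOf is used, but the child has no skos:inScheme.
top_only = typing + [(A, SKOS + "topConceptOf", SCHEME),
                     (B, SKOS + "broader", A)]

# Case 2: skos:inScheme is used, but no skos:topConceptOf is declared.
in_scheme_only = typing + [(A, SKOS + "inScheme", SCHEME),
                           (B, SKOS + "inScheme", SCHEME),
                           (B, SKOS + "broader", A)]
```

In the first dataset, selecting the scheme shows the root but none of its children; in the second, selecting the scheme shows nothing at all. In both cases, deselecting every scheme makes the full tree appear, because roots are then computed from the hierarchy alone.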

Sheet2RDF

Q. URI Creation always uses the default namespace of the current project. How can I create new URIs with a different namespace?

There are two solutions:

use the formatting converter, where you can completely build the structure of the URI (see coda:formatter in Available Converters)

a cleaner and more general solution, adaptable to all converters, is to use the annotation for changing the default namespace (see @DefaultNamespace in Available Annotations)