VocBench Frequently Asked Questions


Q. I see I need to use Maven to build VocBench. I don't know how to use it and, frankly, I don't want to learn yet another technology, I just want to use <replace_here_with_your_favourite_IDE>. May I skip this step?

Yes, in theory you can. You would check all the dependencies listed in the pom.xml file of each project, search for them on the web, download them (and say a prayer that you got the exact version of everything, all the required dependencies, etc.), then import them into <replace_here_with_your_favourite_IDE>, inside the project that you just created to hold the sources, then import the sources, and so on.

...though, let us be a bit evangelical. Maven can be a big beast: if you start to love it, you get addicted to it and start thinking things like "I want to brew a beer with it" (we are sure there is a Maven plugin even for that...). But if you are just an end user of Maven who needs it to build a project, the time it takes to download and install it is so short that it pays for itself immediately, compared with the time you would need to build even this single project in the traditional way. See the build instructions to see how quick it is to build VocBench (and most other projects you will find out there) with Maven.

Last but not least, Maven is today a de facto standard for releasing source code.

...and, oh yes, there is almost certainly a Maven plugin for <replace_here_with_your_favourite_IDE>!


Q. Can VocBench connect to triplestores other than RDF4J or GraphDB?

A question that comes up among curious users who want to use VB with other technologies is: why not other triple stores? And if so, which ones? Why haven't more of them been explored? Why not a more agnostic access to different technologies?

The answer "passes" through the experience gained over more than a decade of beta versions of Semantic Turkey (before 1.0).

Semantic Turkey, the RDF management platform behind VocBench, was initially based on another project of the ART group, the OWLART API, a thin middleware over different RDF frameworks. This layer did not implement any storage facility and delegated most of the business logic to the underlying implementations (i.e. wrappers for notable frameworks such as Jena and Sesame, the precursor to RDF4J). The first versions of Semantic Turkey, developed before SPARQL was a recommendation, were based on the use of a graph API, which made it easier to switch among different technologies, provided that the semantics of basic operations such as addTriple(), deleteTriple(), listStatements(), etc. were consistent among them. When differences arose, we developed the wrappers so that they conformed to the semantics we provided in our API. The experience was effort-demanding, and in the end it resulted in:

It was then with much relief that, from version 1.0 of ST and VB3.0, we decided to abandon the OWLART project and move to RDF4J as the native, stable solution for RDF management within ST. We can finally focus on ST's features alone, standing on top of a mature library such as RDF4J and directly benefiting from all of its improvements by simply upgrading to its new versions.
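To picture the kind of thin middleware described above, here is a minimal Python sketch. It is not the OWLART API: the GraphStore interface, the StrictBackend class and its behaviour are all invented for illustration. It only shows the general idea of a wrapper that bends a backend's behaviour to one shared semantics (here, an assumed rule that deleting a missing triple is a silent no-op).

```python
from typing import Iterator, Optional, Protocol, Set, Tuple

Triple = Tuple[str, str, str]


class GraphStore(Protocol):
    """Hypothetical graph API that every wrapped backend must honour."""

    def add_triple(self, triple: Triple) -> None: ...

    def delete_triple(self, triple: Triple) -> None: ...

    def list_statements(self, subject: Optional[str] = None) -> Iterator[Triple]: ...


class StrictBackend:
    """Invented backend whose remove() raises if the triple is absent."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        self._triples.remove(triple)  # raises KeyError when missing

    def all(self) -> Set[Triple]:
        return set(self._triples)


class StrictBackendWrapper:
    """Adapts StrictBackend to GraphStore and makes deletes idempotent."""

    def __init__(self, backend: StrictBackend) -> None:
        self._backend = backend

    def add_triple(self, triple: Triple) -> None:
        self._backend.add(triple)

    def delete_triple(self, triple: Triple) -> None:
        try:
            self._backend.remove(triple)
        except KeyError:
            pass  # shared semantics (assumed): deleting a missing triple is a no-op

    def list_statements(self, subject: Optional[str] = None) -> Iterator[Triple]:
        # Sorted only to give a deterministic order in this sketch.
        for triple in sorted(self._backend.all()):
            if subject is None or triple[0] == subject:
                yield triple
```

Every difference of this kind between backends needs its own piece of wrapper code, which gives a feel for why maintaining such a layer is costly.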

A set of facilities has been developed on top of RDF4J's repository connection and query classes in order to simplify the development of services specifically for ST; these facilities require SPARQL to be exploited at their best.

So, in theory, VocBench can connect to any triple store accessible through an RDF4J remote connection, though triple stores compliant with the Sail stack framework (such as GraphDB) are to be preferred. Individual triplestores may have further requirements.
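As a rough illustration of what being reachable over an RDF4J-style remote connection means at the HTTP level, the following Python sketch builds (without sending) a SPARQL query request against a repository endpoint of the form <server>/repositories/<id>. VocBench itself connects through the RDF4J Java libraries, not through code like this; the server URL and repository name below are placeholders, and build_sparql_request is a helper made up for this example.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_sparql_request(server_url: str, repository_id: str, query: str) -> Request:
    """Build a SPARQL-protocol GET request for one repository (illustrative helper)."""
    endpoint = f"{server_url.rstrip('/')}/repositories/{quote(repository_id, safe='')}"
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for SPARQL results in JSON; with no request body, urllib issues a GET.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Placeholder values: adjust to your own server and repository.
request = build_sparql_request(
    "http://localhost:8080/rdf4j-server",
    "my-repo",
    "ASK { ?s ?p ?o }",
)
print(request.full_url)
```

Sending the request (for instance with urllib.request.urlopen(request)) requires a running server, so it is left out here.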

Information on connecting VocBench to separate triple stores is provided here, and a complete example is available on the VocBench test drives.

A complete description of the requirements for connectable triplestores is provided here.

Q. Why are all those URIs in the data also used for labels? (in SKOS-XL)

VocBench allows for the use of different standards for representing lexicalizations: plain rdfs:labels, SKOS labeling properties, and SKOS-XL labels. Labels in SKOS-XL are “reified”. This means that the labels are themselves resources identified by URIs, and can have properties of their own. This has two advantages:

  1. You can model the labels themselves (adding, for instance, lexical relationships between labels)
  2. You can provide them with editorial notes and metadata in general

VB already provides some additional metadata automatically, such as creation and modification dates (as it does for concepts), while it is up to the user to add additional notes or, if they want, to adopt vocabularies of lexical relations.
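To make the difference concrete, here is a small Python sketch that writes the same preferred label once as a plain SKOS literal and once as a reified SKOS-XL label. Triples are modelled as simple tuples and literals as (text, language) pairs; the ex: URIs and the use of dct:created for the creation date are assumptions made for the example, and reified_pref_label is a hypothetical helper, not part of VB.

```python
def plain_pref_label(concept, text, lang):
    """Plain SKOS: the label is just a literal attached to the concept."""
    return [(concept, "skos:prefLabel", (text, lang))]


def reified_pref_label(concept, label_uri, text, lang, created):
    """SKOS-XL: the label is a resource with its own URI and properties."""
    return [
        (concept, "skosxl:prefLabel", label_uri),
        (label_uri, "rdf:type", "skosxl:Label"),
        (label_uri, "skosxl:literalForm", (text, lang)),
        # Metadata about the label itself (property choice is an assumption).
        (label_uri, "dct:created", created),
    ]


plain = plain_pref_label("ex:c_tomato", "tomato", "en")
reified = reified_pref_label(
    "ex:c_tomato", "ex:xl_tomato_en", "tomato", "en", "2024-01-15T10:00:00Z"
)

for triple in reified:
    print(triple)
```

The extra URI (ex:xl_tomato_en here) is what gives notes, dates and lexical relations something to attach to.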

VB also allows for reified skos:definitions, a pattern foreseen by the SKOS standard (see second case of the "Advanced Documentation Features" section of the SKOS Primer document).

However, VB provides an export for SKOS which also allows flattening the reified skos:definitions to simple literals. There are then options for retaining or removing the triples that provide the reified descriptions and labels.
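The idea of flattening can be sketched in a few lines of Python. This is not VB's export code: it is a toy transformation over tuples, assuming that reified labels carry their text in skosxl:literalForm and that reified definitions carry theirs in rdf:value (the pattern used in the SKOS Primer). The flatten function and its keep_reified option are hypothetical names chosen to mirror the retain/remove choice mentioned above.

```python
XL_TO_SKOS = {
    "skosxl:prefLabel": "skos:prefLabel",
    "skosxl:altLabel": "skos:altLabel",
    "skosxl:hiddenLabel": "skos:hiddenLabel",
}


def flatten(triples, keep_reified=False):
    """Return triples with reified labels/definitions flattened to literals."""
    literal_form = {s: o for s, p, o in triples if p == "skosxl:literalForm"}
    rdf_value = {s: o for s, p, o in triples if p == "rdf:value"}

    flat, reified_nodes = [], set()
    for s, p, o in triples:
        if p in XL_TO_SKOS and o in literal_form:
            flat.append((s, XL_TO_SKOS[p], literal_form[o]))
            reified_nodes.add(o)
        elif p == "skos:definition" and o in rdf_value:
            flat.append((s, p, rdf_value[o]))
            reified_nodes.add(o)

    if keep_reified:
        return list(triples) + flat
    # Drop every triple that mentions a reified node as subject or object.
    kept = [t for t in triples
            if t[0] not in reified_nodes and t[2] not in reified_nodes]
    return kept + flat


data = [
    ("ex:c1", "rdf:type", "skos:Concept"),
    ("ex:c1", "skosxl:prefLabel", "ex:l1"),
    ("ex:l1", "rdf:type", "skosxl:Label"),
    ("ex:l1", "skosxl:literalForm", ("tomato", "en")),
    ("ex:c1", "skos:definition", "ex:d1"),
    ("ex:d1", "rdf:value", ("A red fruit.", "en")),
]

for triple in flatten(data):
    print(triple)
```

With keep_reified=True the flattened literals are added alongside the original reified triples instead of replacing them.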

If you prefer to handle these transformations offline from VB, these operations are also available as command line utilities through the full distribution of the OWLART library.

Q. I'm not interested in the collaborative aspects of VB, or at least I don't need all that user authentication/permissions part. Can I disable these features?
Also, I would like more in-depth control over the data, at triple level, though not necessarily by having to use SPARQL.

Just install VocBench and keep using the default administrator login suggested by the platform. In this case, you will be using VB3 in much the same way as a desktop tool.

Q. Can VocBench be used to publish data, in addition to editing it?

Well, you can always use VocBench as a browsing tool, giving users read-only capabilities. However, there is a dedicated tool that does a much better job: ShowVoc! ShowVoc is a companion to VocBench for data publication, browsing, and consumption (both machine and human consumption).

It provides:

Q. Why are all concepts being exported, and not just the ones with status "published"?

The reason is that concepts in a thesaurus are useful references for document bases and anything for which a thesaurus represents a semantic index. For instance, documents tagged with a concept should always retain a valid link to a description of the concept on the Web.