VocBench Frequently Asked Questions


Q. I see I need to use Maven to build VocBench. I don't know how to use it and, frankly, I don't want to learn a new technology, I just want to use <replace_here_with_your_favourite_IDE>. May I skip this step?

Yes, in theory you can. You can check all the dependencies listed in the pom.xml file of each project, search for them on the web, download them (and say a prayer that you downloaded the exact version of everything, all the required dependencies, etc.), then import them into <replace_here_with_your_favourite_IDE>, inside the project that you just created to hold the sources, then import the sources, and so on.

...though, let us be evangelists for a moment. Maven may be a big beast: if you start to love it, you get addicted to it and start thinking things like "I want to brew a beer with it" (we are sure there is a Maven plugin even for that...). But if you are an end user of Maven and just need it to build a project, the time needed to download and install it is so short that it is immediately paid back by the time you would otherwise spend building this single project in the traditional way. See the build instructions to see how quick it is to build VocBench (and most other projects you will find out there) with Maven.

Last but not least, Maven is today a de facto standard for releasing source code.

...and, oh yes, there is almost certainly a Maven plugin for <replace_here_with_your_favourite_IDE>!

Issues at Project Creation

Q. Why do I get an error message such as the following after trying to create a project?


This happens when the user is trying to create a project connecting to a remote triple store, using the history and/or validation features, and the remote triple store does not contain the change-tracking sail component.

Please check that the change-tracking sail component has been deployed on the remote triple store, as per the instructions here.


Q. Can VocBench connect to triple stores other than RDF4J or GraphDB?

A question asked by many curious users who want to use VB with other technologies is: why not other triple stores? And if so, which ones? Why have more of them not been explored? Why not a more agnostic access to different technologies?

The answer lies in the years of experience with the beta versions of Semantic Turkey (before 1.0), which spanned more than a decade.

Semantic Turkey, the RDF management platform behind VocBench, was initially based on another project of the ART group, the OWLART API, a thin middleware over different RDF frameworks. This layer did not implement any storage facility and delegated most of the business logic to the underlying implementations (i.e. wrappers for notable frameworks such as Jena and Sesame, the precursor to RDF4J). The first versions of Semantic Turkey, developed before SPARQL was a recommendation, were based on the use of graph APIs, which made it easier to switch among different technologies, provided that the semantics of basic operations such as addTriple(), deleteTriple(), listStatements(), etc. were consistent among them. When differences arose, we developed the wrappers so that they conformed to the semantics we provided in our API. The experience has been effort-demanding, and in the end it resulted in:

It was then with much relief that, from version 1.0 of ST and VB3.0, we decided to abandon the OWLART project and move to RDF4J as the native, stable solution for RDF management within ST. We can finally focus on ST's features alone, standing on top of a mature library such as RDF4J and directly benefiting from all of its improvements by simply upgrading to its new versions.

A set of facilities has been developed over the RDF4J repository connection and query classes in order to simplify the development of services specifically for ST; these facilities rely on SPARQL to be exploited at their best.

So VocBench allows, in theory, connections to any triple store accessible through an RDF4J remote connection; triple stores compliant with the Sail stack framework (such as GraphDB) are to be preferred. Various triple stores may have further requirements.
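As a rough illustration of what "accessible through an RDF4J remote connection" means in practice, the sketch below builds (without sending) a SPARQL query request against a repository exposed through the RDF4J REST protocol, where repositories live under `<server>/repositories/<id>`. The server URL and repository name are hypothetical, and this is not how VocBench itself connects; it is only a quick way to picture the kind of endpoint involved.

```python
from urllib.parse import urlencode
from urllib.request import Request


def repository_url(server_url: str, repository_id: str) -> str:
    """Return the repository endpoint under an RDF4J-style server URL."""
    return f"{server_url.rstrip('/')}/repositories/{repository_id}"


def build_sparql_request(server_url: str, repository_id: str, query: str) -> Request:
    """Build (but do not send) a SPARQL query request for the repository."""
    url = repository_url(server_url, repository_id) + "?" + urlencode({"query": query})
    # Ask for SPARQL results in JSON; no request body, so this is a GET.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical server URL and repository id, for illustration only.
request = build_sparql_request(
    "http://localhost:8080/rdf4j-server/",
    "myvocabulary",
    "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually send it; getting a JSON result back is a reasonable first sign that the repository is reachable from the machine running VocBench.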

Information for connecting VocBench to separate triple stores is provided here, and a complete example is available on the VocBench test drives.

A complete description of the requirements for connectable triplestores is provided here.

Q. Why are URIs used in the data even for labels? (in SKOS-XL)

VocBench allows for the use of different standards for representing lexicalizations: plain rdfs:labels, SKOS labeling properties, and SKOS-XL labels. Labels in SKOS-XL are “reified”. This means that the labels are identified by URIs and can have properties of their own. This has two advantages:

  1. You can model the labels themselves (adding, for instance, lexical relationships between labels)
  2. You can provide them with editorial notes and metadata in general

VB already provides some additional metadata automatically, such as creation and modification dates (as it does for concepts), while it is up to the user to add additional notes or, if they want, to adopt vocabularies of lexical relations.

VB also allows for reified skos:definitions, a pattern foreseen by the SKOS standard (see the second case of the "Advanced Documentation Features" section of the SKOS Primer document).

However, VB provides an export for SKOS which also allows flattening the reified skos:definitions to simple literals. There are then options for retaining or removing the triples providing the reified descriptions and labels.
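To make the idea of flattening concrete, here is a minimal sketch for labels, using plain tuples as triples. It follows the correspondence between SKOS-XL and SKOS labeling properties (a concept's skosxl:prefLabel pointing to a label whose skosxl:literalForm is a literal yields a skos:prefLabel with that literal). The `flatten_xl_labels` function, the `keep_reified` option and the example URIs are hypothetical; this is not VocBench's export code.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
EX = "http://example.org/"  # hypothetical namespace for the example data

# Each SKOS-XL labeling property and its plain SKOS counterpart.
XL_TO_SKOS = {
    SKOSXL + "prefLabel": SKOS + "prefLabel",
    SKOSXL + "altLabel": SKOS + "altLabel",
    SKOSXL + "hiddenLabel": SKOS + "hiddenLabel",
}


def flatten_xl_labels(triples, keep_reified=False):
    """Derive plain SKOS label triples from reified SKOS-XL labels."""
    # Map each reified label URI to its literal form.
    literal_forms = {s: o for s, p, o in triples if p == SKOSXL + "literalForm"}
    flat = [
        (s, XL_TO_SKOS[p], literal_forms[o])
        for s, p, o in triples
        if p in XL_TO_SKOS and o in literal_forms
    ]
    if keep_reified:
        return list(triples) + flat
    # Drop the links to reified labels and every triple describing them.
    kept = [
        (s, p, o)
        for s, p, o in triples
        if s not in literal_forms and p not in XL_TO_SKOS
    ]
    return kept + flat


triples = [
    (EX + "c_1", RDF_TYPE, SKOS + "Concept"),
    (EX + "c_1", SKOSXL + "prefLabel", EX + "xl_en_1"),
    (EX + "xl_en_1", RDF_TYPE, SKOSXL + "Label"),
    (EX + "xl_en_1", SKOSXL + "literalForm", ("cat", "en")),
    (EX + "xl_en_1", "http://purl.org/dc/terms/created", "2020-01-01"),
]

for triple in flatten_xl_labels(triples):
    print(triple)
```

With `keep_reified=True` the original triples are retained and the flat label is simply added alongside them, mirroring the choice between retaining and removing the reified descriptions.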

If you prefer to handle these transformations offline from VB, these operations are also available as command line utilities through the full distribution of the OWLART library.

Q. I'm not interested in the collaborative aspects of VB, or at least I don't need all that user authentication/permissions part. Can I disable these features?
Also, I would like more in-depth control over the data, at triple level, though not necessarily by having to use SPARQL.

Just install VocBench and keep using the default administrator login suggested by the platform. In this case, you will be using VB3 much as you would a desktop tool.

Q. Can VocBench be used to publish data, in addition to editing it?

We have been asked this many times, and when a question or request keeps coming up, it is a good sign that a new feature should be added. The fact is that, with all the best intentions of supporting it, editing and publishing are really two completely different tasks. Some systems may exhibit some publishing features, but this usually amounts to being able to dump data into a given SPARQL endpoint (used for publication).

Simply put, the advantage of using web standards is that the gap between data management and data publishing is (technically) nonexistent. From VocBench, it suffices to export the data in any of the available RDF serialization formats and load it (that same data, as is) into some content publishing framework.

A few caveats and solutions:

As for which publishing systems to choose, it is not our role to recommend them, so consider the ones below (grouped by publishing modality) as our personal advice, with no claim to be exhaustive or optimal:

  1. Publish data as a SPARQL endpoint.
    Just export all the data and load it into an RDF data server with a SPARQL endpoint. Sesame and Jena are just two free, open source solutions, and there are many more.

  2. Provide HTTP dereferencing for your published data.
    One very common solution is Pubby; another is Loddy, a tool also developed by us. Besides being a plain Linked Data server like Pubby, Loddy provides a lot of customization options for reorganizing the information in the HTML presentation. In particular, the downloadable demo includes a template made specifically for SKOS-XL thesauri. The Agrovoc vocabulary exposes its linked data content as HTML pages through Loddy.

  3. Have a browser for your SKOS thesaurus published at a given address.
    A very good SKOS browser is SKOSMOS. The Agrovoc vocabulary also provides a SKOSMOS instance for browsing its content.
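For the first option above, loading an export into a data server can be as simple as one HTTP request. The sketch below builds (without sending) a request that would add Turtle data to a repository on a server speaking the RDF4J REST protocol, which accepts RDF documents posted to `<server>/repositories/<id>/statements`. The server URL, repository name and sample data are hypothetical, and other servers (Jena Fuseki, for instance) use different URLs.

```python
from urllib.request import Request


def build_upload_request(server_url: str, repository_id: str, turtle_data: str) -> Request:
    """Build (but do not send) a request adding Turtle data to a repository."""
    url = f"{server_url.rstrip('/')}/repositories/{repository_id}/statements"
    return Request(
        url,
        data=turtle_data.encode("utf-8"),
        headers={"Content-Type": "text/turtle;charset=UTF-8"},
        method="POST",  # POST adds to the repository content
    )


# Hypothetical export content; in practice, read the file exported from VocBench.
exported = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "<http://example.org/c_1> a skos:Concept .\n"
)

upload = build_upload_request("http://localhost:8080/rdf4j-server", "published", exported)
print(upload.get_method(), upload.full_url)
```

Passing `upload` to `urllib.request.urlopen` would perform the actual load, assuming the server is running and the repository already exists.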

Q. Why are all concepts exported, and not just the ones with status "published"?

The reason is that concepts in a thesaurus are useful references for document bases and anything for which a thesaurus represents a semantic index. For instance, documents tagged with a concept should always retain a valid link to a description of the concept on the Web.

Bugs and Issues

Q. When reconnecting to VB3, I cannot see the effects of the last changes I made to my dataset

If your project uses a local repository backed by NativeStore, then this is most probably due to a bug in RDF4J whereby NativeStore repositories might not be properly persisted if the server is shut down abruptly, without first closing the project holding the repository.

As we suggest in the installation instructions, it is better for your stable projects to adopt a separate triple store, as described in the system administrator manual.

In any case, projects with local repositories using in-memory store are not affected by this issue.