VocBench System Administration Manual
This manual supports System Administrators in all tasks related to the installation and maintenance of the tool, usually including tasks which cannot be performed by the VocBench Administrator through the UI.
Installation Options
Semantic Turkey Installation Options
The Installation Options section of Semantic Turkey's site provides many details on how to customize the configuration of the system for various environments or according to specific requirements. Please refer to that page for ST-specific configuration options.
Separate HTTP Server
The standalone distribution contains everything needed to start playing with the system. Depending on the system administrator's preferences, organization policies, load balancing needs and so on, it might be desirable to separate the RDF services managed by Semantic Turkey from the VocBench Web Application. In particular, since the Web Application contains only static assets, any standard HTTP server can be adopted.
The VocBench web application is available as a web archive (.war file) in the /deploy directory of the distribution, with name: vocbench3-<versionnumber>.war
- It is possible to remove this war from there and redeploy it in any application server, such as Apache Tomcat.
- Furthermore, it is possible to extract the content of the war file (you might want to remove the META-INF and WEB-INF directories) and deploy it to any HTTP server, such as Apache HTTP Server.
In both cases, it might be necessary to have the VocBench web application explicitly pointing to the Semantic Turkey server, as explained in the following section.
Further Configuration for reaching Semantic Turkey on a different host/port
Note that the custom settings below are not possible when running VocBench from inside the Karaf container as in the standalone distribution. You might want to have VocBench installed as a separate web application in a dedicated server, as explained in the previous section.
In vbconfig.js (under src/ of the source package, or under the root folder of the built distribution) it is possible to configure the SemanticTurkey host resolution. By default VocBench3 dynamically resolves the IP address of the SemanticTurkey server (by using the same IP address as the VocBench host machine) and the port number (by using the same port as the VocBench container).
If VocBench3 runs on a different container or on a dedicated server (that is, its port/host are different from those of SemanticTurkey), the port and host cannot be automatically inferred, as they belong to a separate server that is unknown to VB. It is thus mandatory to change the following configuration by specifying the values for the st_host (only if ST is on a different machine) and st_port variables.
/**
 * IP address/logical host name of the machine hosting Semantic Turkey.
 * By default (variable left unspecified) the host is resolved dynamically
 * by using the same address of the machine hosting VocBench.
 * Thus if VocBench3 and Semantic Turkey are running on the same machine
 * this variable can be left commented, otherwise uncomment the line and
 * edit the value.
 */
// var st_host = "127.0.0.1";

/**
 * Port of the container hosting Semantic Turkey.
 * By default (variable left unspecified) the port is resolved dynamically
 * by using the same port of the container hosting VocBench.
 * Thus if VocBench3 and Semantic Turkey are running on the same container
 * this variable can be left commented, otherwise uncomment the line and
 * edit the value.
 */
// var st_port = "1979";
It is also possible to change the path where the SemanticTurkey server is listening and the protocol to use. The latter, if unspecified, is resolved dynamically, just like st_host and st_port described above.
/**
 * Path where SemanticTurkey server is listening. If omitted, the sole host
 * is considered.
 * Please note that the path of Semantic Turkey services is defined as in:
 * http://semanticturkey.uniroma2.it/doc/user/web_api.jsf#services_address_structure
 * This additional path information is considered to be the starting part
 * of the path described above, and is usually necessary in case Semantic
 * Turkey is installed behind a proxy redirecting the ST URL.
 */
var st_path;

/**
 * Protocol - either http or https
 * By default (variable left unspecified) the protocol is resolved
 * dynamically by using the same one of the container hosting VocBench.
 */
// var st_protocol = "http";
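The fallback rules described in the comments above can be illustrated with a small sketch. The function resolveStBaseUrl is hypothetical and not part of VocBench; how VocBench actually composes the address internally is an assumption here, and the host names are example values.

```javascript
// Hypothetical helper (not part of VocBench): mirrors the fallback rules
// described in the vbconfig.js comments.
// "config" holds the vbconfig.js variables that were set explicitly;
// "pageLocation" mimics the browser's window.location of the VocBench page.
function resolveStBaseUrl(config, pageLocation) {
  // Each unspecified value falls back to the one used by the VocBench page.
  var protocol = config.st_protocol || pageLocation.protocol.replace(":", "");
  var host = config.st_host || pageLocation.hostname;
  var port = config.st_port || pageLocation.port;
  var base = protocol + "://" + host + (port ? ":" + port : "");
  // st_path, when set, is the leading part of the ST services path.
  if (config.st_path) {
    base += "/" + config.st_path.replace(/^\/+|\/+$/g, "");
  }
  return base + "/";
}

// VocBench served by a separate HTTP server, ST on another machine.
var remote = resolveStBaseUrl(
  { st_host: "st.example.org", st_port: "1979", st_protocol: "http" },
  { protocol: "http:", hostname: "vb.example.org", port: "8080" }
);

// Nothing configured: everything falls back to the VocBench page location.
var sameContainer = resolveStBaseUrl(
  {},
  { protocol: "http:", hostname: "localhost", port: "1979" }
);
```

In the first call all three values are explicit, so the address of the VocBench page plays no role; in the second call the result simply reuses the page's protocol, host and port.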
Running VocBench on HTTPS
The last part of the vbconfig.js file concerns the protocol for connecting to the Semantic Turkey server, either http or https.
/**
 * Protocol - either http or https
 */
var st_protocol = "http";
Please note that the Karaf container for the ST server must be configured for HTTPS, as explained in the ST documentation.
Running VocBench as a System Service
It is possible to run Semantic Turkey as a system service. Instructions for doing so are reported in a dedicated section of the ST system administration manual.
Serving the VocBench web application as a service mostly depends on where it is installed. In the typical ST installation package, the web application is provided as a set of static assets embedded in the same Karaf container hosting Semantic Turkey, so there is no other action to perform: ST and VB will both be started through the Karaf service. If the client is hosted on any other HTTP server, the relevant documentation for that server should be consulted.
Maintenance and Settings
VocBench allows Administrators and other authenticated users to perform most of the fine tuning of the system through the UI of the platform. This section deals with the settings and tunings that have to be applied under the hood.
Data Management
Separate Triple Store
VocBench (or better, its RDF backend Semantic Turkey) comes with an embedded distribution of RDF4J, which includes a couple of storage solutions: in-memory store and native store. Creating local repositories with this embedded solution is very convenient for quickly playing with the platform without any additional installation.
However, in order to have full control over your data, we recommend adopting a separate triple store and connecting to it remotely.
VocBench requires a triple store compliant with the RDF4J client API and, in order to support the more advanced features of the system such as history and validation, compliant with RDF4J's Sail Stack mechanism.
Currently available options are:
- A dedicated RDF4J server from the RDF4J site
- Ontotext GraphDB, an RDF4J-compliant triple store natively implementing RDF4J's Sail Stack mechanism, offering high performance and supporting different levels of reasoning.
Compatibility notes:
- For users of VB3 10.x, we recommend:
  - either GDB version 9.8.1 (permanent link: https://download.ontotext.com/owlim/40f76740-e3f4-11eb-bcaf-42843b1b6b38/graphdb-free-9.8.1-dist.zip)
  - or 9.9 / 9.10.x (later versions have not been tested at the time of writing), provided that the lucene-fts plugin is installed. Instructions: download the zip and unzip it into the /lib/plugins directory of your GraphDB installation.
- For users of VB3 5.x or 6.0.x we recommend GDB version 8.8 or higher, yet still in the 8.x range; 9.x is not fully compatible with these versions of VB.
- For those using VB3 version 4.0.x: the previous version of VB3 (4.0.2) is compatible with GraphDB version 8.5. Later GraphDB versions introduced some incompatibilities (which we have solved together with the Ontotext team) that do not allow VB3 4.0.x to exploit features such as History and Validation. Here's a direct download link to version 8.5:
http://download.ontotext.com/owlim/ca7e186e-2927-11e8-bbf9-42843b1b6b38/graphdb-free-8.5.0-dist.zip
Using history & validation: deploying the change-tracking sail component in the connected triple store
If the history or validation functionalities are required for projects connected to the separate triple store, then the component of Semantic Turkey for tracking changes in the repositories has to be deployed inside the connected store.
The component, which is implemented as a sail layer for triple stores compliant with RDF4J's Sail Stack mechanism, is available as a jar file called st-changetracking-sail-<version>.jar, deployed inside the system/it/uniroma2/art/semanticturkey/st-changetracking-sail/<version> directory of Semantic Turkey.
This jar file has to be copied inside the lib directory of the connected triple store in order to enable the history and validation functionalities.
Warning for those using GraphDB: we have noticed that the loading of deployed sails on the triple store only works for the standalone version, not for the OS-specific installation package. Please take this into account when choosing which version of GraphDB to use.
Using the Trivial Inference Engine: deploying the trivial-inference sail component in the connected triple store
If the trivial inference engine functionalities are required for projects connected to the separate triple store, then the component of Semantic Turkey managing inferences drawn by trivial reasoning in the repositories has to be deployed inside the connected store.
The component, which is implemented as a sail layer for triple stores compliant with RDF4J's Sail Stack mechanism, is available as a jar file called st-trivial-inference-sail-<version>.jar, deployed inside the system/it/uniroma2/art/semanticturkey/st-trivial-inference-sail/<version> directory of Semantic Turkey.
This jar file has to be copied inside the lib directory of the connected triple store in order to enable the trivial inference engine functionalities.
Warning for those using GraphDB: we have noticed that the loading of deployed sails on the triple store only works for the standalone version, not for the OS-specific installation package. Please take this into account when choosing which version of GraphDB to use.
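The copy step for either sail component can be sketched by computing the source and destination paths from the directory layout given above. The helper functions, the installation directories (/opt/semanticturkey, /opt/graphdb) and the version number are hypothetical example values, not taken from an actual distribution.

```javascript
// Hypothetical helper: builds the location of a sail jar inside the
// Semantic Turkey installation, following the directory layout given above.
function sailJarSource(stHome, artifact, version) {
  return [
    stHome,
    "system/it/uniroma2/art/semanticturkey",
    artifact,
    version,
    artifact + "-" + version + ".jar"
  ].join("/");
}

// Hypothetical helper: destination inside the triple store's lib directory.
function sailJarTarget(tripleStoreHome, artifact, version) {
  return [tripleStoreHome, "lib", artifact + "-" + version + ".jar"].join("/");
}

// Example values only: adjust the installation paths and the version.
var version = "10.0";
var source = sailJarSource("/opt/semanticturkey", "st-changetracking-sail", version);
var target = sailJarTarget("/opt/graphdb", "st-changetracking-sail", version);
```

The same two functions apply to st-trivial-inference-sail by changing the artifact name. The actual copy can then be done with any file manager or, in a Node.js script, with fs.copyFileSync(source, target).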
Configuring VocBench and GraphDB for large quantities of data
When large quantities of data (hundreds of megabytes if not gigabytes) are being loaded, Semantic Turkey (the RDF management platform behind VB) might require higher memory settings in order to work properly.
The following environment variables can be set for improving performance with large quantities of data:
- JAVA_MAX_MEM and JAVA_MIN_MEM: general Java settings, which will be used by Semantic Turkey
- GDB_MAX_MEM and GDB_MIN_MEM: specific settings read by GraphDB
The following is a recommended configuration for a PC with 16 GB of RAM. The values can be increased as needed.
GDB_MAX_MEM=6144M
GDB_MIN_MEM=2048M
JAVA_MAX_MEM=2048M
JAVA_MIN_MEM=256M
Instructions for changing environment variables for most OSs can be found here. Alternatively, these variables can be assigned values locally in the karaf (or karaf.bat) script used to run VocBench.
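As a quick sanity check on the recommended values, the sketch below adds up the two maximum settings and compares them with the available RAM. It assumes that the *_MAX_MEM values act as upper bounds on the memory of each process and that GraphDB and Semantic Turkey run on the same machine; the function names are hypothetical.

```javascript
// Converts a JVM-style size such as "6144M" or "2G" to megabytes.
function toMegabytes(size) {
  var match = /^(\d+)([MG])$/i.exec(size);
  if (!match) {
    throw new Error("Unsupported size format: " + size);
  }
  var value = parseInt(match[1], 10);
  return match[2].toUpperCase() === "G" ? value * 1024 : value;
}

// Sums the two maximum settings and compares them with the physical RAM.
function memoryBudget(settings, totalRamMb) {
  var maxTotal = toMegabytes(settings.GDB_MAX_MEM) + toMegabytes(settings.JAVA_MAX_MEM);
  return { maxTotalMb: maxTotal, remainingMb: totalRamMb - maxTotal };
}

// Recommended configuration on a machine with 16 GB of RAM.
var budget = memoryBudget(
  { GDB_MAX_MEM: "6144M", JAVA_MAX_MEM: "2048M" },
  16 * 1024
);
// budget.maxTotalMb is 8192 and budget.remainingMb is 8192
```

With the recommended values, the two processes can take up to 8 GB together, which leaves roughly half of the RAM to the operating system and other applications.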
Internationalization: providing localizations for new languages
The internationalization page of this system admin manual describes how to prepare a new localization file and register it in VocBench.
SAML Authentication
VocBench, from version 10.1, supports authentication through any SAML Identity Provider (IDP). This means that the system administrator can configure VocBench/SemanticTurkey so that users log into VocBench through an external IDP that uses the SAML standard. Note that, even though VocBench only recognizes users authenticated through the SAML IDP, they still have to be registered in VocBench as well. It is SemanticTurkey's task to map the SAML-authenticated user to a local one; if that user is not present in the internal ST store, the user is prompted to register. The Identity Provider just needs to be configured properly, as described at the end of this section.
VocBench configuration
Enabling SAML authentication
In order to enable SAML authentication in the VocBench UI, log into the application as the administrator user and go to Administration > Settings Mgr, then select the SemanticTurkeyCoreSettingsManager tab and then the only Settings Manager available. Choose the scope SYSTEM, then change the Authentication Service setting to SAML and submit the changes.

Log out from VocBench and refresh the page: you will now see a SAML login button replacing the default email/password login form.

Customization of the SAML Login button
The label of the login button can be customized. This can be done by editing the vbconfig.js file (under src/ of the source package, or under the root folder of the built distribution): just uncomment the variable saml_login_label and provide the desired label.
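For instance, the uncommented variable in vbconfig.js might look as follows. The label text is an arbitrary example value.

```javascript
// vbconfig.js: label shown on the SAML login button.
// The text below is an example value only.
var saml_login_label = "Login with institutional account";
```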

Semantic Turkey configuration
Changing authentication service
An alternative way to enable SAML authentication is to do it directly from SemanticTurkey. Edit the setting authService in SemanticTurkeyData\system\plugins\it.uniroma2.art.semanticturkey.settings.core.SemanticTurkeyCoreSettingsManager\settings.props. The allowed values are: Default and SAML. This procedure can be useful if the authService has been accidentally changed to SAML from the VocBench UI and the administrator is not able to log in again to restore the Default one.
SAML configuration
In SemanticTurkey there are two files that need to be configured in order to instruct the system to communicate with the SAML IDP:
- <ST_INST>\etc\saml-login.properties
- <ST_INST>\etc\saml-login.xml
The first one contains properties that need to be provided manually. An important property is saml.defaultIDP, which depends on the SAML Identity Provider.
# Either https or http. It is the protocol used by the server (service provider) on which VocBench will be deployed.
saml.scheme=https
# The name of the service provider (without "http://" and without "/" at the end).
saml.serverName=example.org
# The entityID of the Identity Provider. Take this value from the metadata file received from the team responsible for registering your instance.
saml.defaultIDP=entityID
# The base url of the server (service provider) --> scheme://serverName/contextPath
saml.entityBaseURL=https://example.org/semanticturkey
# The url of VocBench
saml.redirect=https://example.org/vocbench3
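Since the properties above depend on each other (saml.entityBaseURL is built from saml.scheme and saml.serverName), a small consistency check can catch typos before restarting the server. The sketch below is hypothetical and not part of Semantic Turkey; its parser is deliberately simplified and only handles plain key=value lines and # comments.

```javascript
// Parses the "key=value" lines of a .properties text, skipping comments.
function parseProperties(text) {
  var props = {};
  text.split(/\r?\n/).forEach(function (line) {
    var trimmed = line.trim();
    if (trimmed === "" || trimmed.charAt(0) === "#") {
      return;
    }
    var idx = trimmed.indexOf("=");
    if (idx > 0) {
      props[trimmed.slice(0, idx).trim()] = trimmed.slice(idx + 1).trim();
    }
  });
  return props;
}

// Returns a list of problems; an empty list means the values look consistent.
function checkSamlProperties(props) {
  var problems = [];
  var scheme = props["saml.scheme"];
  var serverName = props["saml.serverName"] || "";
  if (scheme !== "http" && scheme !== "https") {
    problems.push("saml.scheme must be http or https");
  }
  if (/^https?:\/\//.test(serverName) || /\/$/.test(serverName)) {
    problems.push("saml.serverName must not contain the protocol or a trailing slash");
  }
  // entityBaseURL is expected to follow scheme://serverName/contextPath
  var expectedPrefix = scheme + "://" + serverName + "/";
  if ((props["saml.entityBaseURL"] || "").indexOf(expectedPrefix) !== 0) {
    problems.push("saml.entityBaseURL should start with " + expectedPrefix);
  }
  return problems;
}
```

Calling checkSamlProperties(parseProperties(fileContent)) on the example values shown above returns an empty list.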
The second one contains placeholder XML code that must be replaced with the metadata generated by the Identity Provider.
<?xml version="1.0" encoding="UTF-8"?>
<!-- Replace this EntityDescriptor with your metadata -->
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata">
</EntityDescriptor>
Generation of metadata
In order to be recognized by an Identity Provider, SemanticTurkey needs to be registered with it as a Service Provider. To do that, the Identity Provider needs the SemanticTurkey metadata. These metadata are produced automatically by accessing the following URL (replacing localhost:1979 with the proper ST host address, if necessary) when SemanticTurkey is up and running.
http://localhost:1979/semanticturkey/it.uniroma2.art.semanticturkey/st-core-services/saml/metadata
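For a non-local installation, the URL can be derived by replacing the protocol, host and port while keeping the path unchanged. The sketch below uses a hypothetical helper and example addresses; if Semantic Turkey sits behind a proxy that rewrites its path, the resulting URL may need further adjustment.

```javascript
// Path of the metadata endpoint, taken from the URL shown above.
var METADATA_PATH =
  "/semanticturkey/it.uniroma2.art.semanticturkey/st-core-services/saml/metadata";

// Builds the metadata URL for a given ST address (protocol, host and port).
function samlMetadataUrl(stOrigin) {
  // Drop any trailing slash so that the path is appended cleanly.
  return stOrigin.replace(/\/+$/, "") + METADATA_PATH;
}

var localUrl = samlMetadataUrl("http://localhost:1979");
var remoteUrl = samlMetadataUrl("https://example.org/");
```

The resulting URL can be opened in a browser and the returned XML saved to a file to be handed over to the team managing the Identity Provider.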
Attributes mapping
SemanticTurkey supports any SAML Identity Provider, but it requires a minimal configuration of the IDP. When a user tries to log into VocBench through a SAML Identity Provider, the IDP sends back to SemanticTurkey a response (assertion) which contains attributes of the authenticated user. In order to recognize the SAML user and map it to an internal registered one, SemanticTurkey needs the user's email address to be mapped to an attribute named emailAddress, so the Identity Provider must be configured accordingly. In addition, two optional attributes are the user's given name and family name, which need to be returned in the assertion in attributes named firstName and lastName respectively. These two attributes are useful when the SAML-authenticated user is not yet registered in VocBench: in that case VocBench redirects the user to the registration page, pre-filling the email, given name and family name form fields.
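The mapping rule can be summarized with the following sketch. It is a hypothetical illustration, not the actual SemanticTurkey implementation: the attributes object stands for the name/value pairs carried by the assertion, and the names of the returned fields are made up for the example.

```javascript
// Hypothetical sketch of the mapping rule: "attributes" stands for the
// name/value pairs carried by the SAML assertion.
function registrationPrefill(attributes) {
  // emailAddress is the only mandatory attribute.
  if (!attributes.emailAddress) {
    throw new Error("The assertion lacks the mandatory emailAddress attribute");
  }
  return {
    email: attributes.emailAddress,
    // firstName and lastName are optional: missing values stay empty.
    givenName: attributes.firstName || "",
    familyName: attributes.lastName || ""
  };
}

var prefill = registrationPrefill({
  emailAddress: "jane.doe@example.org",
  firstName: "Jane"
});
```

Here the assertion carries no lastName attribute, so the family name field of the registration form would simply be left empty for the user to fill in.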
