VocBench System Administration Manual

This manual supports System Administrators in all tasks related to the installation and maintenance of the tool, which usually include tasks that cannot be performed by the VocBench Administrator through the UI.

Installation Options

Semantic Turkey Installation Options

The Installation Options section of Semantic Turkey's site provides many details on how to customize the configuration of the system for various environments or according to specific requirements. Please refer to that page for ST-specific configuration options.

Separate HTTP Server

The standalone distribution contains everything needed to start playing with the system. Depending on the system administrator's preferences, organization policies, load-balancing needs and so on, it might be desirable to separate the RDF services managed by Semantic Turkey from the VocBench Web Application. In particular, since the Web Application contains only static assets, any standard HTTP server can be adopted.

The VocBench web application is available as a .jar file in the /lib directory of the distribution, named vocbench3-<versionnumber>.jar. It is possible to extract the directory public/vocbench3/ from the jar file and deploy it to any HTTP server, such as Apache HTTP Server.

In this case, it might be necessary to have the VocBench web application explicitly pointing to the Semantic Turkey server, as explained in the following section.

Further Configuration for reaching Semantic Turkey on a different host/port

Note that the custom settings below are also possible when running VocBench from inside the standalone distribution, but since you need to edit the content of a file placed inside the VocBench3 jar, you might need to extract and repackage the archive.

In vbconfig.js (under the public/vocbench3/ dir of vocbench3-ver.jar placed in the lib/ folder of the built distribution) it is possible to configure the SemanticTurkey host resolution. By default VocBench3 dynamically resolves the IP address of the SemanticTurkey server (by using the same IP address as the VocBench host machine) and the port number (by using the same port as the VocBench container).

In case VocBench3 runs on a different container or on a dedicated server (that is, its port/host are different from those of SemanticTurkey) the port and host cannot be automatically inferred as they belong to a separate (and unknown to VB) server. It is thus mandatory to change the following configuration by specifying the values for the st_host (only if ST is on a different machine) and st_port variables.

    /**
    * IP address/logical host name of the machine hosting Semantic Turkey.
    * By default (variable left unspecified) the host is resolved dynamically
    * by using the same address of the machine hosting VocBench.
    * Thus if VocBench3 and Semantic Turkey are running on the same machine
    * this variable can be left commented, otherwise uncomment the line and
    * edit the value.
    */
    // var st_host = "127.0.0.1";

    /**
    * Port of the container hosting Semantic Turkey.
    * By default (variable left unspecified) the port is resolved dynamically
    * by using the same port of the container hosting VocBench.
    * Thus if VocBench3 and Semantic Turkey are running on the same container
    * this variable can be left commented, otherwise uncomment the line and
    * edit the value.
    */
    // var st_port = "1979";
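
As an illustration, a vbconfig.js fragment for a setup in which Semantic Turkey runs on a separate machine might look like the following. The host name st.example.org and the port 8080 are placeholder assumptions, not defaults of the distribution.

```javascript
// Hypothetical values: replace them with the address and port of your own ST server.
var st_host = "st.example.org";
var st_port = "8080";
```

If Semantic Turkey runs on the same machine but in a different container, only st_port would need to be set.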
  

It is also possible to change the path where the SemanticTurkey server is listening and the protocol to use. The latter, if unspecified, is dynamically resolved, like the st_host and st_port variables described above.

    /**
    * Path where SemanticTurkey server is listening. If omitted, the sole host
    * is considered.
    * Please note that the path of Semantic Turkey services is defined as in:
    * http://semanticturkey.uniroma2.it/doc/user/web_api.jsf#services_address_structure
    * This additional path information is considered to be the starting part
    * of the path described above, and is usually necessary in case Semantic
    * Turkey is installed behind a proxy redirecting the ST URL.
    */ 
    var st_path;

    /**
    * Protocol - either http or https
    * By default (variable left unspecified) the protocol is resolved 
    * dynamically by using the same one of the container hosting VocBench.
    */
    // var st_protocol = "http"; 
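
The fallback behaviour described in the comments above can be pictured with the following sketch. It is not the actual VocBench code: the function name resolveStBaseUrl and its two parameters are hypothetical, with config standing for the optional vbconfig.js variables and pageLocation for the browser's window.location.

```javascript
// Sketch of the fallback logic described above; NOT the actual VocBench code.
// `config` holds the optional vbconfig.js variables, `pageLocation` mimics
// the browser's window.location (both names are illustrative).
function resolveStBaseUrl(config, pageLocation) {
  // window.location.protocol ends with ":" (e.g. "https:"), so strip it.
  const protocol = config.st_protocol || pageLocation.protocol.replace(/:$/, "");
  const host = config.st_host || pageLocation.hostname;
  const port = config.st_port || pageLocation.port;
  // An empty port means the default port of the protocol is in use.
  const authority = port ? host + ":" + port : host;
  // st_path, when set, precedes the Semantic Turkey services path.
  const path = config.st_path
    ? "/" + config.st_path.replace(/^\/+|\/+$/g, "")
    : "";
  return protocol + "://" + authority + path;
}

// Same container: nothing is configured, everything comes from the page URL.
console.log(resolveStBaseUrl(
  {},
  { protocol: "http:", hostname: "localhost", port: "1979" }
));

// ST on another machine, reached over https behind a proxy path.
console.log(resolveStBaseUrl(
  { st_host: "st.example.org", st_port: "8443", st_protocol: "https", st_path: "st" },
  { protocol: "http:", hostname: "vb.example.org", port: "80" }
));
```

The first call relies entirely on the page location, which corresponds to the default case where VocBench and Semantic Turkey share the same container. The second call shows every value being overridden.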
  

Running VocBench on HTTPS

The last part of the vbconfig.js file concerns the protocol for connecting to the Semantic Turkey server, either http or https.

    /**
    * Protocol - either http or https
    */
    var st_protocol = "http";

Please note that the Spring Boot container for the ST server must be configured for HTTPS, as explained in the ST documentation.

Running VocBench as a System Service

It is possible to run Semantic Turkey as a system service. Instructions for doing so are reported in a dedicated section of the ST system administration manual.

Serving the VocBench web application as a service mostly depends on where it is installed. In the typical ST installation package, the web application is provided as a set of static assets embedded within the installation of Semantic Turkey, so there is no other action to perform: ST and VB are both started through Spring Boot. If the client is hosted on any other HTTP server, refer to the documentation of that server.

Maintenance and Settings

VocBench allows Administrators and other authenticated users to perform most of the fine-tuning of the system through the UI of the platform. This section deals with the tricks and tunings that have to be carried out under the hood.

 

Data Management

Separate Triple Store

VocBench (or better, its RDF backend Semantic Turkey) comes with an embedded distribution of RDF4J, which includes a couple of storage solutions: in-memory store and native store. Creating local repositories with this embedded solution is very convenient for quickly playing with the platform without any additional installation.

However, in order to have full control over your data, we recommend adopting a separate triple store and connecting to it remotely.

VocBench requires a triple store compliant with the RDF4J client API and, in order to support the more advanced features of the system such as history and validation, compliant with RDF4J's Sail Stack mechanism.

Current available options are:

Using history & validation: deploying the change-tracking sail component in the connected triple store

If the history or validation functionalities are required for projects connected to the separate triple store, then the component of Semantic Turkey for tracking changes in the repositories has to be deployed inside the connected store.

The component, which is implemented as a sail layer for triple stores compliant with RDF4J's Sail Stack mechanism, is available as a jar file called st-changetracking-sail-<version>.jar, deployed into the lib directory of Semantic Turkey (for VB3 versions < 12, the directory is instead system/it/uniroma2/art/semanticturkey/st-changetracking-sail/<version>).

This jar file has to be copied inside the lib directory of the connected triple store in order to enable the history and validation functionalities.

Warning for those using GraphDB: we have noticed that the loading of deployed sails on the triple store only works for the standalone version, not the OS-specific installation package. Please take this into account when choosing which version of GraphDB to use.

Using the Trivial Inference Engine: deploying the trivial-inference sail component in the connected triple store

If the trivial inference engine functionalities are required for projects connected to the separate triple store, then the component of Semantic Turkey managing inferences drawn by trivial reasoning in the repositories has to be deployed inside the connected store.

The component, which is implemented as a sail layer for triple stores compliant with RDF4J's Sail Stack mechanism, is available as a jar file called st-trivial-inference-sail-<version>.jar, deployed inside the lib directory of Semantic Turkey (for VB3 versions < 12, the directory is instead system/it/uniroma2/art/semanticturkey/st-trivial-inference-sail/<version>).

This jar file has to be copied inside the lib directory of the connected triple store in order to enable the trivial inference functionalities.

Warning for those using GraphDB: we have noticed that the loading of deployed sails on the triple store only works for the standalone version, not the OS-specific installation package. Please take this into account when choosing which version of GraphDB to use.

Configuring VocBench and GraphDB for large quantities of data

When large quantities of data (hundreds of megabytes if not gigabytes) are being loaded, Semantic Turkey (the RDF management platform behind VB) might require higher memory settings in order to work properly.

The following environment variables can be set to improve performance with large quantities of data:

JAVA_MAX_MEM and JAVA_MIN_MEM: general Java settings, that will be used by Semantic Turkey

GDB_MAX_MEM and GDB_MIN_MEM: specific settings read by GraphDB

The following is a recommended configuration for a PC with 16 GB of RAM. The values can be increased as needed.

GDB_MAX_MEM=6144M
GDB_MIN_MEM=2048M
JAVA_MAX_MEM=2048M
JAVA_MIN_MEM=256M

Environment variables can be changed through the usual mechanism of each operating system. It is also possible to assign values to these variables locally, directly in the setvars.in.(bat/sh) file used to run VocBench.

Internationalization: providing localizations for new languages

The internationalization page of this system admin manual describes how to prepare a new localization file and register it to VocBench.

SAML Authentication

Before you read the following instructions, please consider that SAML authentication has changed since version 12.0 of VocBench (and Semantic Turkey). If you are using a prior version, please read the instructions at the end of this section.

VocBench supports authentication with any SAML Identity Provider. This means that a system administrator can configure VocBench/SemanticTurkey so that users log into VocBench through an external IDP that uses the SAML standard. However, even though VocBench then recognizes only users authenticated through the SAML IDP, these users still have to be registered in VocBench as well. It is SemanticTurkey's task to map the SAML-authenticated user to a local one; if no such user is present in the internal ST store, the user is prompted to register. The Identity Provider just needs to be configured properly, as explained at the end of this section.

VocBench configuration

Enabling SAML authentication

In order to enable SAML authentication in the VocBench UI, log in to the application as the administrator user and go to Administration > Settings Mgr, then select the SemanticTurkeyCoreSettingsManager tab and then the only Settings Manager available. Choose the scope SYSTEM, then change the Authentication Service setting to SAML and submit the changes.

Log out of VocBench and refresh the page: a SAML login button now replaces the default email/password login form.

Customization of the SAML Login button

The label of the login button can be customized. This can be done by editing the vbconfig.js file (under the public/vocbench3/ dir of vocbench3-ver.jar placed in the lib/ folder of the built distribution): just uncomment the variable saml_login_label and provide the desired label.
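
For instance, the uncommented line in vbconfig.js might read as follows. The variable name saml_login_label comes from the file itself, while the label text is only an example.

```javascript
// Label shown on the SAML login button; the text below is an arbitrary example.
var saml_login_label = "Login with your institutional account";
```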

Semantic Turkey configuration

Changing authentication service

An alternative way to enable SAML authentication is to do it directly from SemanticTurkey. Edit the setting authService in SemanticTurkeyData\system\plugins\it.uniroma2.art.semanticturkey.settings.core.SemanticTurkeyCoreSettingsManager\settings.props. The allowed values are: Default and SAML. This procedure can be useful if authService has been accidentally changed to SAML from the VocBench UI and the administrator is no longer able to log in to restore the Default value.

SAML configuration

In order to instruct SemanticTurkey to communicate with the SAML IDP, you need to edit the configuration file at <ST_INST>\config\saml\application.yml.

By default the content of this file is the following:

    #spring.security.saml2.relyingparty.registration.st_saml:
    #  base-url: "https://example.org"
    #  redirect-url: "https://example.org/vocbench3"
    #  identityprovider.metadata-location: "file:./config/saml/idp-metadata.xml"
    #  signing:
    #    credentials:
    #      - private-key-location: "file:./config/saml/private.key"
    #        certificate-location: "file:./config/saml/public.cer"
    #    keystore:
    #      - location: "file:./config/saml/keystore.jsk"
    #        alias: ""
    #        password: ""
  

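For a basic configuration it's enough to edit the content by uncommenting (removing the leading #) and providing the properties that are not related to signing, as they appear in the template above:

base-url

redirect-url

identityprovider.metadata-location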

Signature customization (Optional)

Semantic Turkey needs to sign the authentication requests sent to the Identity Provider. For this purpose, it is provided with a private key and public certificate pair already included in the classpath (respectively classpath:security/private.key and classpath:security/public.cert).

Optionally, for security reasons, you may need to configure SemanticTurkey by providing your own key and certificate pair, or the references to a JKS keystore. If you want to override the default key and certificate, you'll need to provide the private-key-location and certificate-location properties listed under credentials in the template above.

Alternatively, you can use a JKS keystore by providing the location, alias and password properties listed under keystore in the same template.

Generation of metadata

In order to be recognized by an Identity Provider, SemanticTurkey needs to be registered in it as a Service Provider. To do that, the Identity Provider needs the SemanticTurkey metadata. These metadata can be produced automatically by accessing the following URL when SemanticTurkey is up and running (replacing, if necessary, localhost:1979 with the proper ST base url, the same set in <ST_INST>\config\saml\application.yml as seen previously):

http://localhost:1979/semanticturkey/saml2/service-provider-metadata/st_saml
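
The last segment of this URL, st_saml, matches the registration name that appears in the first line of application.yml. A small helper for composing the URL for a different host could be sketched as below. The function name spMetadataUrl is hypothetical, and the sketch assumes that Semantic Turkey is deployed under the /semanticturkey context path, as in the URL above.

```javascript
// Illustrative helper, not part of VocBench or Semantic Turkey.
// `origin` is scheme://host[:port] of the ST server, without a trailing path.
function spMetadataUrl(origin, registrationId) {
  // Default to the registration name used in the application.yml template.
  const id = registrationId || "st_saml";
  // Drop trailing slashes so the parts join with exactly one "/".
  const base = origin.replace(/\/+$/, "");
  return base + "/semanticturkey/saml2/service-provider-metadata/" + id;
}

console.log(spMetadataUrl("http://localhost:1979"));
```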

Attributes mapping

SemanticTurkey supports any SAML Identity Provider, but it requires a minimal configuration of the IDP. When a user logs into VocBench through a SAML Identity Provider, the latter sends back to SemanticTurkey a response (assertion) that contains attributes of the authenticated user. In order to recognize the SAML user and map it to an internal registered one, SemanticTurkey requires the user's email address to be provided in an attribute named emailAddress, so the Identity Provider must be configured accordingly. In addition, two optional attributes are the user's given name and family name, which need to be returned in the assertion in attributes named firstName and lastName respectively. These two attributes are useful when the SAML-authenticated user is not yet registered in VocBench: in that case VocBench redirects the user to the registration page, pre-filling the email, given name and family name form fields.
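
The expected attribute names can be summarized with the following sketch. It only mimics the mapping described above and is not Semantic Turkey's implementation: the function name mapSamlAttributes and the shape of its input and output objects are assumptions made for illustration.

```javascript
// Illustrative only: mimics the mapping described above, not ST's actual code.
// `attributes` is a plain object with the attributes found in the SAML assertion.
function mapSamlAttributes(attributes) {
  const email = attributes.emailAddress;
  if (!email) {
    // Without emailAddress the user cannot be matched to a registered one.
    throw new Error("SAML assertion lacks the mandatory emailAddress attribute");
  }
  return {
    email: email,
    // Optional attributes, only used to pre-fill the registration form.
    givenName: attributes.firstName || null,
    familyName: attributes.lastName || null,
  };
}

console.log(mapSamlAttributes({
  emailAddress: "jane.doe@example.org",
  firstName: "Jane",
  lastName: "Doe",
}));
```

An assertion carrying only emailAddress would still be accepted, with the two name fields left empty.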

SAML configuration with VB < 12.x

SAML support in Semantic Turkey was originally implemented in version 10.1. The SAML configuration explained previously is valid for version 12.x or above. In previous versions the configuration of SAML was slightly different: two files needed to be configured in order to instruct the system to communicate with the SAML IDP:

The first one contains properties that need to be provided manually.

    # Either https or http. It is the protocol used by the server (service provider) on which VocBench will be deployed.
    saml.scheme=https
    # The name of the service provider (without "http://" and without "/" at the end).
    saml.serverName=example.org
    # The entityID of the Identity Provider. Take this value from metadata file received from the team responsible for registering your instance.
    saml.defaultIDP=entityID
    # The base url of the server (service provider) --> scheme://serverName/contextPath
    saml.entityBaseURL=https://example.org/semanticturkey
    # The url of VocBench
    saml.redirect=https://example.org/vocbench3
  

The second one contains a "placeholder" xml code that must be replaced with the metadata generated from the Identity Provider.

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Replace this EntityDescriptor with your metadata -->
    <EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata">
    </EntityDescriptor>
  

Another important difference is the endpoint URL for generating the Semantic Turkey metadata. In the old versions it was the following URL (replacing, if necessary, localhost:1979 with the proper ST host url):

http://localhost:1979/semanticturkey/it.uniroma2.art.semanticturkey/st-core-services/saml/metadata