VocBench System Administration Manual
This manual supports System Administrators in all tasks related to the installation and maintenance of the tool, usually including tasks that cannot be performed by the VocBench Administrator through the UI.
Installation Options
Semantic Turkey Installation Options
The Installation Options section of Semantic Turkey's site provides many details on how to customize the configuration of the system for various environments or according to specific requirements. Please refer to that page for ST-specific configuration options.
Separate HTTP Server
The standalone distribution contains everything needed to start playing with the system. Depending on the system administrator's preferences, organization policies, load-balancing needs and so on, it may be desirable to separate the RDF services managed by Semantic Turkey from the VocBench Web Application. In particular, since the Web Application contains only static assets, any standard HTTP server can be adopted.
The VocBench web application is available as a .jar file in the /lib directory of the distribution, with the name vocbench3-<versionnumber>.jar. It is possible to extract the directory public/vocbench3/ from the jar file and deploy it to any HTTP server, such as Apache HTTP Server.
In this case, it might be necessary to have the VocBench web application explicitly pointing to the Semantic Turkey server, as explained in the following section.
Further Configuration for reaching Semantic Turkey on a different host/port
Note that the custom settings below are also possible when running VocBench from inside the standalone distribution, but since you need to edit a file placed inside the VocBench3 jar, you might need to extract and repackage the archive.
In vbconfig.js (under the public/vocbench3/ dir of vocbench3-ver.jar placed in the lib/ folder of the built distribution) it is possible to configure the SemanticTurkey host resolution. By default VocBench3 resolves dynamically the IP address of the SemanticTurkey server (by using the same IP address of the VocBench host machine) and the port number (by using the same port of the VocBench container).
In case VocBench3 runs on a different container or on a dedicated server (that is, its port/host are different from those of SemanticTurkey) the port and host cannot be automatically inferred as they belong to a separate (and unknown to VB) server. It is thus mandatory to change the following configuration by specifying the values for the st_host (only if ST is on a different machine) and st_port variables.
/**
 * IP address/logical host name of the machine hosting Semantic Turkey.
 * By default (variable left unspecified) the host is resolved dynamically
 * by using the same address of the machine hosting VocBench.
 * Thus if VocBench3 and Semantic Turkey are running on the same machine
 * this variable can be left commented, otherwise uncomment the line and
 * edit the value.
 */
// var st_host = "127.0.0.1";

/**
 * Port of the container hosting Semantic Turkey.
 * By default (variable left unspecified) the port is resolved dynamically
 * by using the same port of the container hosting VocBench.
 * Thus if VocBench3 and Semantic Turkey are running on the same container
 * this variable can be left commented, otherwise uncomment the line and
 * edit the value.
 */
// var st_port = "1979";
It is also possible to change the path where the SemanticTurkey server is listening and the protocol to use. The latter, if left unspecified, is resolved dynamically, just like st_host and st_port described above.
/**
 * Path where SemanticTurkey server is listening. If omitted, the sole host
 * is considered.
 * Please note that the path of Semantic Turkey services is defined as in:
 * http://semanticturkey.uniroma2.it/doc/user/web_api.jsf#services_address_structure
 * This additional path information is considered to be the starting part
 * of the path described above, and is usually necessary in case Semantic
 * Turkey is installed behind a proxy redirecting the ST URL.
 */
var st_path;

/**
 * Protocol - either http or https
 * By default (variable left unspecified) the protocol is resolved
 * dynamically by using the same one of the container hosting VocBench.
 */
// var st_protocol = "http";
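The fallback rules described in the vbconfig.js comments can be summarized with a small sketch. The function resolveStBaseUrl and the pageLocation object are hypothetical names introduced only for illustration; they are not part of VocBench, and the way the path is normalized is an assumption. The sketch only shows how unspecified variables could fall back to the address of the page serving VocBench.

```javascript
// Hypothetical helper, NOT part of VocBench: it mirrors the fallback rules
// documented in vbconfig.js (unspecified values are taken from the page address).
function resolveStBaseUrl(config, pageLocation) {
  // pageLocation mimics the browser's window.location for the VocBench page
  const protocol = config.st_protocol || pageLocation.protocol.replace(/:$/, "");
  const host = config.st_host || pageLocation.hostname;
  const port = config.st_port || pageLocation.port;
  // Assumption: leading/trailing slashes in st_path are irrelevant
  const path = config.st_path
    ? "/" + config.st_path.replace(/^\/+|\/+$/g, "")
    : "";
  const authority = port ? host + ":" + port : host;
  return protocol + "://" + authority + path;
}

// VocBench page served from http://vb.example.org:8080 (example values)
const vbPage = { protocol: "http:", hostname: "vb.example.org", port: "8080" };

// Nothing configured: ST is assumed to share host and port with VocBench
console.log(resolveStBaseUrl({}, vbPage));

// ST on a dedicated server behind a proxy path
console.log(
  resolveStBaseUrl(
    { st_protocol: "https", st_host: "st.example.org", st_port: "1979", st_path: "/proxy/st/" },
    vbPage
  )
);
```

With the example values, the first call yields http://vb.example.org:8080 and the second https://st.example.org:1979/proxy/st.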
Running VocBench on HTTPS
The last part of the vbconfig.js file concerns the protocol used for connecting to the Semantic Turkey server, either http or https.
/**
 * Protocol - either http or https
 */
var st_protocol = "http";
Please note that the Spring Boot container for the ST server must be configured for HTTPS, as explained in the ST documentation.
Running VocBench as a System Service
It is possible to run Semantic Turkey as a system service. Instructions for doing so are provided in a dedicated section of the ST system administration manual.
Serving the VocBench web application as a service mostly depends on where it is installed. In the typical ST installation package, the web application is provided as a set of static assets embedded within the installation of Semantic Turkey, so no further action is needed: ST and VB are both started through Spring Boot. If the client is hosted on any other HTTP server, consult the documentation of that server.
Maintenance and Settings
VocBench allows Administrators and other authenticated users to perform most of the fine-tuning of the system through the UI of the platform. This section deals with the settings and tuning operations that have to be performed under the hood.
Data Management
Separate Triple Store
VocBench (or better, its RDF backend Semantic Turkey) comes with an embedded distribution of RDF4J, which includes a couple of storage solutions: in-memory store and native store. Creating local repositories with this embedded solution is very convenient for quickly playing with the platform without any additional installation.
However, to have full control over your data, we recommend adopting a separate triple store and connecting to it remotely.
VocBench requires a triple store compliant with the RDF4J client API and, in order to support the more advanced features of the system such as history and validation, compliant with RDF4J's Sail Stack mechanism.
Currently available options are:
- A dedicated RDF4J server from the RDF4J site.
The version of RDF4J to be adopted is 4.3.12 (this information is current as of VocBench 13.0). To check compliance for past versions of VocBench, open the pom.xml file of the project at the tag corresponding to your version and look at the rdf4j.version property.
- Ontotext GraphDB, an RDF4J-compliant triple store natively implementing RDF4J's Sail Stack mechanism, offering high performance and supporting different levels of reasoning.
Compatibility notes:
- For users of VB3 12.x, GDB version 10.6.2 or later MUST be used. Later releases can be used and, especially in the case of patch or minor version releases, there should be no problem at all. However, 10.6.2 is the latest version known to have been tested extensively.
  - This version requires the lucene-fts plugin (built for Java 17) to be installed. Instructions: download the zip and unzip it into the /lib/plugins directory of your GraphDB installation.
- For users of VB3 10.x/11.x, we recommend:
  - either GDB version 9.8.1. Here's a permanent link for GDB 9.8.1: https://download.ontotext.com/owlim/40f76740-e3f4-11eb-bcaf-42843b1b6b38/graphdb-free-9.8.1-dist.zip
  - or 9.9 / 9.10.x (later versions have not been tested at the time of writing), provided that the lucene-fts plugin is installed. Instructions: download the zip and unzip it into the /lib/plugins directory of your GraphDB installation.
- For users of VB3 5.x or 6.0.x we recommend GDB version 8.8 or higher, yet still in the 8.x range; 9.x is not fully compatible with these versions of VB.
- For those using VB3 version 4.0.x: the previous version of VB3 (4.0.2) is compatible with GraphDB version 8.5. Later versions of GraphDB have introduced some incompatibilities (which we have solved together with the Ontotext team) that do not allow VB3 to exploit features such as History and Validation. Here's a direct download link to version 8.5:
http://download.ontotext.com/owlim/ca7e186e-2927-11e8-bbf9-42843b1b6b38/graphdb-free-8.5.0-dist.zip
Using history & validation: deploying the change-tracking sail component in the connected triple store
If the history or validation functionalities are required for projects connected to the separate triple store, then the component of Semantic Turkey for tracking changes in the repositories has to be deployed inside the connected store.
The component, which is implemented as a sail layer for triple stores compliant with RDF4J's Sail Stack mechanism, is available as a jar file called st-changetracking-sail-<version>.jar, located in the lib directory of Semantic Turkey (for VB3 versions < 12 the directory is instead system/it/uniroma2/art/semanticturkey/st-changetracking-sail/<version>).
This jar file has to be copied into the lib directory of the connected triple store in order to enable the history and validation functionalities.
Warning for those using GraphDB: we have noticed that the loading of deployed sails on the triple store only works for the standalone version, not the OS-specific installation package. Please take this into account when choosing which version of GraphDB to use.
Using the Trivial Inference Engine: deploying the trivial-inference sail component in the connected triple store
If the trivial inference engine functionalities are required for projects connected to the separate triple store, then the component of Semantic Turkey managing inferences drawn by trivial reasoning in the repositories has to be deployed inside the connected store.
The component, which is implemented as a sail layer for triple stores compliant with RDF4J's Sail Stack mechanism, is available as a jar file called st-trivial-inference-sail-<version>.jar, located in the lib directory of Semantic Turkey (for VB3 versions < 12 the directory is instead system/it/uniroma2/art/semanticturkey/st-trivial-inference-sail/<version>).
This jar file has to be copied into the lib directory of the connected triple store in order to enable the trivial inference engine functionalities.
Warning for those using GraphDB: we have noticed that the loading of deployed sails on the triple store only works for the standalone version, not the OS-specific installation package. Please take this into account when choosing which version of GraphDB to use.
Configuring VocBench and GraphDB for large quantities of data
When large quantities of data (hundreds of megabytes if not gigabytes) are being loaded, Semantic Turkey (the RDF management platform behind VB) might require higher memory settings in order to work properly.
The following environment variables can be set to improve performance with large quantities of data:
- JAVA_MAX_MEM and JAVA_MIN_MEM: general Java settings, which will be used by Semantic Turkey
- GDB_MAX_MEM and GDB_MIN_MEM: specific settings read by GraphDB
The following is a recommended configuration for a PC with 16 GB of RAM. The values can be increased as needed.
GDB_MAX_MEM=6144M
GDB_MIN_MEM=2048M
JAVA_MAX_MEM=2048M
JAVA_MIN_MEM=256M
The way environment variables are changed depends on the operating system; refer to the documentation of your OS. Alternatively, the values of these variables can be assigned locally in the setvars.in.(bat/sh) file used to run VocBench.
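As a quick sanity check on the recommended values, the following sketch adds up the two maximum heap sizes and compares them with the RAM of the machine. The helper parseMegabytes is a hypothetical name introduced for illustration, and 16 GB is assumed to mean 16384M; the sketch only handles values with an M or G suffix.

```javascript
// Hypothetical helper: converts values such as "6144M" or "6G" to megabytes.
function parseMegabytes(value) {
  const match = /^(\d+)([MG])$/i.exec(value);
  if (!match) {
    throw new Error("Unsupported memory value: " + value);
  }
  const amount = Number(match[1]);
  return match[2].toUpperCase() === "G" ? amount * 1024 : amount;
}

// Recommended configuration from this section
const memorySettings = {
  GDB_MAX_MEM: "6144M",
  GDB_MIN_MEM: "2048M",
  JAVA_MAX_MEM: "2048M",
  JAVA_MIN_MEM: "256M",
};

// Assumption: a 16 GB machine has 16384M available in total
const totalRamMb = 16 * 1024;
const maxHeapTotalMb =
  parseMegabytes(memorySettings.GDB_MAX_MEM) +
  parseMegabytes(memorySettings.JAVA_MAX_MEM);

console.log(
  maxHeapTotalMb + "M of " + totalRamMb + "M (" +
  Math.round((100 * maxHeapTotalMb) / totalRamMb) + "%)"
);
```

With the recommended values the two maximum heaps add up to 8192M, that is half of the machine's RAM, so the same proportion could serve as a starting point when scaling the values up on a larger machine.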
Internationalization: providing localizations for new languages
The internationalization page of this system admin manual describes how to prepare a new localization file and register it to VocBench.
SAML Authentication
Before you read the following instructions, please consider that SAML authentication has changed since version 12.0 of VocBench (and Semantic Turkey). If you are using a prior version, please read the instructions at the end of this section.
VocBench supports authentication with any SAML Identity Provider (IDP). This means that system administrators can configure VocBench/SemanticTurkey to let users log into VocBench through an external IDP that uses the SAML standard. Note that, even when VocBench only accepts users authenticated through the SAML IDP, these users still have to be registered in VocBench as well. It is SemanticTurkey's task to map the SAML-authenticated user to a local one; if no such user is present in the internal ST store, the user is prompted to register. The Identity Provider just needs to be configured properly, as described at the end of this section.
VocBench configuration
Enabling SAML authentication
In order to enable SAML authentication in the VocBench UI, log into the application as the administrator user and go to Administration > Settings Mgr, then select the SemanticTurkeyCoreSettingsManager tab and then the only Settings Manager available. Choose the scope SYSTEM, then change the Authentication Service setting to SAML and submit the changes.
Log out from VocBench and refresh the page: a SAML login button now replaces the default email/password login form.
Customization of the SAML Login button
The label of the login button can be customized. This can be done by editing the vbconfig.js file (under the public/vocbench3/ dir of vocbench3-ver.jar placed in the lib/ folder of the built distribution): just uncomment the variable saml_login_label and provide the desired label.
Semantic Turkey configuration
Changing authentication service
An alternative way to enable SAML authentication is to do it directly from SemanticTurkey. Edit the setting authService in SemanticTurkeyData\system\plugins\it.uniroma2.art.semanticturkey.settings.core.SemanticTurkeyCoreSettingsManager\settings.props. The allowed values are Default and SAML. This procedure is useful if authService has been accidentally changed to SAML from the VocBench UI and the administrator is no longer able to log in to restore the Default one.
SAML configuration
To instruct SemanticTurkey to communicate with the SAML IDP, edit the configuration file at <ST_INST>\config\saml\application.yml.
By default the content of this file is the following:
#spring.security.saml2.relyingparty.registration.st_saml:
#  base-url: "https://example.org"
#  redirect-url: "https://example.org/vocbench3"
#  identityprovider.metadata-location: "file:./config/saml/idp-metadata.xml"
#  signing:
#    credentials:
#      - private-key-location: "file:./config/saml/private.key"
#        certificate-location: "file:./config/saml/public.cer"
#    keystore:
#      - location: "file:./config/saml/keystore.jsk"
#        alias: ""
#        password: ""
For a basic configuration it's enough to edit the content by uncommenting (removing the leading #) and providing these properties:
- spring.security.saml2.relyingparty.registration.st_saml.base-url: the base URL where SemanticTurkey responds. This value will be used for determining:
  - the SP entityID, as {baseUrl}/semanticturkey/saml2/service-provider-metadata/st_saml
  - the Location of the Assertion Consumer Service, namely where the SP will receive the IDP assertion, as {baseUrl}/semanticturkey/saml2/login/sso/st_saml
- spring.security.saml2.relyingparty.registration.st_saml.redirect-url: the URL where VocBench responds. This value will be used for redirecting the user once successfully authenticated.
- spring.security.saml2.relyingparty.registration.st_saml.identityprovider.metadata-location: the path where the Identity Provider metadata XML is located. By default, the value is set to file:./config/saml/idp-metadata.xml, so you can simply leave the value untouched and replace the content of <ST_INST>/config/saml/idp-metadata.xml with the metadata of your Identity Provider (for example, EU Login). Alternatively, you can place the metadata file in a different location and update the value accordingly.
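The two URLs derived from base-url can be sketched as follows. The function samlEndpoints is a hypothetical helper written only for illustration (it is not part of Semantic Turkey); the URL patterns are the ones given in the description of base-url, and st_saml is the registration name used in application.yml.

```javascript
// Hypothetical helper, for illustration only: derives the Service Provider
// URLs from the configured base-url, following the patterns listed above.
function samlEndpoints(baseUrl, registrationId = "st_saml") {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes, if any
  return {
    // SP entityID
    entityId:
      base + "/semanticturkey/saml2/service-provider-metadata/" + registrationId,
    // Location of the Assertion Consumer Service
    assertionConsumerService:
      base + "/semanticturkey/saml2/login/sso/" + registrationId,
  };
}

// Example with the placeholder base-url of the default application.yml
const endpoints = samlEndpoints("https://example.org");
console.log(endpoints.entityId);
console.log(endpoints.assertionConsumerService);
```

These are the values that typically have to be communicated to whoever registers the Service Provider on the IDP side, so it can be handy to compute them before contacting the IDP team.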
Signature customization (Optional)
Semantic Turkey needs to sign the authentication requests sent to the Identity Provider. For this purpose, it ships with a private key and public certificate pair already included in the classpath (respectively classpath:security/private.key and classpath:security/public.cert).
Optionally, for security reasons, you may want to configure SemanticTurkey with your own key and certificate pair, or with the references to a JKS keystore. To override the default key and certificate, provide the following properties:
- spring.security.saml2.relyingparty.registration.st_saml.signing.credentials.private-key-location: path to the private key file
- spring.security.saml2.relyingparty.registration.st_saml.signing.credentials.certificate-location: path to the public certificate file
Alternatively, you can use a JKS keystore by providing these properties:
- spring.security.saml2.relyingparty.registration.st_saml.signing.keystore.location: path to the keystore .jsk file
- spring.security.saml2.relyingparty.registration.st_saml.signing.keystore.alias: alias name
- spring.security.saml2.relyingparty.registration.st_saml.signing.keystore.password: keystore password
Generation of metadata
In order to be recognized by an Identity Provider, SemanticTurkey needs to be registered with it as a Service Provider. To do that, the Identity Provider needs the SemanticTurkey metadata. These metadata are produced automatically by accessing the following URL when SemanticTurkey is up and running (replacing, if necessary, localhost:1979 with the proper ST base URL, the same one set in <ST_INST>\config\saml\application.yml as seen previously):
http://localhost:1979/semanticturkey/saml2/service-provider-metadata/st_saml
Attributes mapping
SemanticTurkey supports any SAML Identity Provider, but it requires a minimal configuration of the IDP. When a user logs into VocBench through a SAML Identity Provider, the IDP sends back to SemanticTurkey a response (assertion) that contains attributes of the authenticated user. In order to recognize the SAML user and map it to an internally registered one, SemanticTurkey needs the user's email address to be provided in an attribute that must be named emailAddress, so the Identity Provider must be configured accordingly. In addition, two optional attributes are the user's given name and family name, which need to be returned in the assertion in attributes named firstName and lastName respectively. These two attributes are useful when the SAML-authenticated user is not yet registered in VocBench: in that case VocBench redirects the user to the registration page, pre-filling the email, given name and family name form fields.
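The role of the three attributes can be illustrated with a short sketch. The function toRegistrationPrefill and the shape of its input and output are hypothetical and do not reflect SemanticTurkey's actual code; only the attribute names emailAddress, firstName and lastName come from this manual.

```javascript
// Hypothetical sketch, NOT SemanticTurkey's implementation: shows which
// assertion attributes are mandatory and which are optional.
function toRegistrationPrefill(assertionAttributes) {
  const email = assertionAttributes.emailAddress;
  if (!email) {
    // Without emailAddress the SAML user cannot be mapped to a local one
    throw new Error("SAML assertion lacks the mandatory emailAddress attribute");
  }
  return {
    email: email,
    // Optional attributes: only used to pre-fill the registration form
    givenName: assertionAttributes.firstName || "",
    familyName: assertionAttributes.lastName || "",
  };
}

// Example attributes as an IDP configured as described above might return them
const prefill = toRegistrationPrefill({
  emailAddress: "jane.doe@example.org",
  firstName: "Jane",
  lastName: "Doe",
});
console.log(prefill);
```

An assertion carrying only emailAddress is still acceptable in this sketch; the name fields are simply left empty for the user to fill in.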
SAML configuration with VB < 12.x
SAML support in Semantic Turkey was originally implemented in version 10.1. The SAML configuration explained previously is valid for version 12.x or above. In previous versions the configuration of SAML was slightly different: two files needed to be configured in order to instruct the system to communicate with the SAML IDP:
- <ST_INST>\etc\saml-login.properties
- <ST_INST>\etc\saml-login.xml
The first one contains properties that need to be provided manually.
# A value between https or http. It is related to the protocol used by the server (service provider) on which VocBench will be deployed.
saml.scheme=https
# The name of the service provider (without "http://" and without "/" at the end).
saml.serverName=example.org
# The entityID of the Identity Provider. Take this value from metadata file received from the team responsible for registering your instance.
saml.defaultIDP=entityID
# The base url of the server (service provider) --> scheme://serverName/contextPath
saml.entityBaseURL=https://example.org/semanticturkey
# The url of VocBench
saml.redirect=https://example.org/vocbench3
The second one contains a "placeholder" xml code that must be replaced with the metadata generated from the Identity Provider.
<?xml version="1.0" encoding="UTF-8"?>
<!-- Replace this EntityDescriptor with your metadata -->
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata">
</EntityDescriptor>
Another important difference is the endpoint URL for generating the Semantic Turkey metadata. In the old versions it was the following URL (replacing, if necessary, localhost:1979 with the proper ST host URL):
http://localhost:1979/semanticturkey/it.uniroma2.art.semanticturkey/st-core-services/saml/metadata
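The difference between the two metadata endpoints can be summarized in a small sketch. The function metadataUrl and its legacy flag are hypothetical names used only for illustration; the two paths are the ones reported in this manual for versions 12.x and above and for earlier versions.

```javascript
// Hypothetical helper, for illustration only: returns the URL from which the
// Semantic Turkey (Service Provider) metadata can be downloaded.
function metadataUrl(stBase, legacy) {
  const base = stBase.replace(/\/+$/, ""); // drop trailing slashes, if any
  return legacy
    ? // VB/ST versions before 12.x
      base + "/semanticturkey/it.uniroma2.art.semanticturkey/st-core-services/saml/metadata"
    : // VB/ST versions 12.x and above
      base + "/semanticturkey/saml2/service-provider-metadata/st_saml";
}

console.log(metadataUrl("http://localhost:1979", false));
console.log(metadataUrl("http://localhost:1979", true));
```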