Test Drive: creating projects and loading data

This is a series of simple test drives that let you try VocBench by creating projects and loading data to be managed within them.

These tests assume that VocBench has already been started for the first time and that the administrator is logged into the system.

Creating a SKOS Project for Managing a Small Thesaurus

In this test drive we create a project for managing a simple SKOS thesaurus, the "Land and Water" FAO vocabulary (which is available for download here), by using the embedded RDF4J store.

Once you are logged in, the list of projects available in VocBench is shown (obviously empty if VocBench has just been installed):

Projects Landing Page

Click on the "Create" button in order to access the following project creation page:

New Project

Fill in the fields as shown in the text and image below:

Creating LandAndWater project

a. Loading the thesaurus data

Download the Land and Water thesaurus RDF file

Go to the Global Data Management menu and select "Load Data", as described here.

Just click on the "Browse" button next to the RDF file label and choose the Land And Water data file you have previously downloaded. Leave all the other fields unchanged, as in the following figure.

Loading LandAndWater data

Then click on the "Submit" button. A confirmation message will inform you that the data has been loaded successfully.

By clicking on the "Data" menu entry, it is possible to view the loaded thesaurus.

Loaded LandAndWater data

The "warning" yellow symbol informs us that no concept scheme has been selected, thus all concepts in the thesaurus are being shown. By clicking on the "Scheme" tab it is possible to select one (or more) of the available schemes so that the "Concept" tab will now show only those concepts belonging to it (or them).

Selecting a scheme for LandAndWater
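The effect of selecting schemes can be pictured with a minimal sketch. It assumes that scheme membership is expressed through the standard skos:inScheme property and models the thesaurus as plain (subject, predicate, object) tuples. The function name and the example.org URIs are hypothetical, and this is a simplified model rather than VocBench's actual implementation.

```python
# Simplified model: triples are plain (subject, predicate, object) string tuples.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def visible_concepts(triples, selected_schemes=()):
    """Return the concepts to display for the given scheme selection.

    With no scheme selected, every concept is returned (the situation
    signalled by the yellow warning symbol); otherwise only the concepts
    that are skos:inScheme one of the selected schemes are returned.
    """
    concepts = {s for s, p, o in triples
                if p == RDF_TYPE and o == SKOS + "Concept"}
    if not selected_schemes:
        return concepts
    selected = set(selected_schemes)
    in_selected = {s for s, p, o in triples
                   if p == SKOS + "inScheme" and o in selected}
    return concepts & in_selected


# Hypothetical sample data, not taken from the Land and Water thesaurus.
EX = "http://example.org/"
sample = [
    (EX + "water", RDF_TYPE, SKOS + "Concept"),
    (EX + "soil", RDF_TYPE, SKOS + "Concept"),
    (EX + "water", SKOS + "inScheme", EX + "schemeA"),
    (EX + "soil", SKOS + "inScheme", EX + "schemeB"),
]

if __name__ == "__main__":
    print(sorted(visible_concepts(sample)))                    # no scheme selected
    print(sorted(visible_concepts(sample, [EX + "schemeA"])))  # one scheme selected
```

Selecting more than one scheme simply widens the filter to the union of their concepts.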

Creating an OWL Project for Managing an Ontology

In this test drive we create a project for managing an OWL ontology, the "Friend of a Friend" vocabulary (which is available for download here), by using the embedded RDF4J store.

Once you are logged in, the list of projects available in VocBench is shown (obviously empty if VocBench has just been installed):

Projects Landing Page

Click on the "Create" button in order to access the following project creation page:

New Project

Fill in the fields as shown in the text and image below:

Creating FOAF project

a. Loading the ontology data

Download the FOAF ontology RDF file

Go to the Global Data Management menu and select "Load Data", as described here.

Just click on the "Browse" button next to the RDF file label and choose the FOAF data file you have previously downloaded. Leave all the other fields unchanged, as in the following figure.

Loading FOAF data

Then click on the "Submit" button. A confirmation message will inform you that the data has been loaded successfully.

By clicking on the "Data" menu entry, it is possible to view the loaded ontology.

Loaded FOAF data

Creating a SKOS Project for Managing a Large Thesaurus by Connecting to an External Triple Store, Exploiting History, Validation and Inference

In this test drive we create a project for managing a large SKOS thesaurus, the "Eurovoc" thesaurus published by the Publications Office of the European Union. We provide here a dump of Eurovoc which has already been cleaned of duplicate data. The project will enable the History and Validation features and will rely on an external triple store: GraphDB.

a. Setting up and Running GraphDB

First, set up the GraphDB server.

If GraphDB has never been used with VocBench, the change-tracking sail component must be deployed into the triple store, since this test drive activates the History and Validation features. Follow the instructions in the related section of the system administration manual.

Once the change-tracking sail component has been deployed into GraphDB, the triple store can be started.

b. Creating the Project

Log into VocBench; the list of available projects will be shown (obviously empty if VocBench has just been installed):

Projects Landing Page

Click on the "Create" button in order to access the following project creation page:

New Project

Fill in the fields as shown in the text and image below:

Creating Eurovoc project

Then click on the "Remote Access Config" button, which leads to the following window:

Setting up a new triple store connection

First, create a new configuration through the "Manage Config" button, which leads to the following window, where you have to enter the address at which GraphDB is listening (we assume here that everything has been installed on the same machine, so the "localhost" address 127.0.0.1 would work). Unless authorization credentials have been configured in GraphDB, no username or password is required.

Setting up a new triple store connection
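Before entering the address in VocBench, it can help to confirm that the triple store is actually reachable. The following is a minimal sketch, assuming GraphDB listens on its default port 7200 and exposes the RDF4J REST endpoint `/repositories`, and assuming that no credentials are required. The function names are hypothetical helpers, not part of VocBench or GraphDB.

```python
import json
import urllib.request


def repositories_url(host="127.0.0.1", port=7200):
    """Build the URL of the RDF4J-style repository list (port 7200 is an assumption)."""
    return f"http://{host}:{port}/repositories"


def parse_repository_ids(sparql_json):
    """Extract repository IDs from a SPARQL JSON result listing repositories."""
    data = json.loads(sparql_json)
    return [binding["id"]["value"] for binding in data["results"]["bindings"]]


def list_repositories(host="127.0.0.1", port=7200, timeout=5):
    """Query the server; raises urllib.error.URLError if it is not reachable."""
    request = urllib.request.Request(
        repositories_url(host, port),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_repository_ids(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Needs a running triple store on the local machine.
    print(list_repositories())
```

If the call succeeds, the address passed to `repositories_url` (without the `/repositories` suffix) is the one to enter as the server URL. After the project has been created, the same call should also list the newly created repositories.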

By clicking on the + button, the configuration will be saved, as in the following figure:

Setting up a new triple store connection

After clicking on the "OK" button in the previous window, it is possible to select the previously created configuration through the "Server URL" combobox, so that the window will look like this:

Setting up a new triple store connection

After clicking on "OK" in the previous window, we come back to the Project Creation page:

Creating Eurovoc project

We now set up the configuration for the core repository that will store the Eurovoc data: click on the first of the two "Configure" buttons, the one related to the "Data Repository ID", and change only the "ruleset" field, setting it to "owl-horst-optimized" and leaving everything else as is.

Loading LandAndWater data

We leave the other configuration unchanged (no need to click at all on the other "Configure" button), and finally click on the "Create" button at the bottom of the page, thus creating the project and the two repositories (core and support for history/validation) on GraphDB.

c. Loading the thesaurus data

Download the Eurovoc thesaurus RDF file and unzip it.

Select the newly created project, then go to the Global Data Management menu (top-right) and select "Load Data", as described here.

Click on the "Browse" button next to the RDF file label and choose the Eurovoc data file you have previously downloaded.

It is important that the "Implicitly validate loaded data" checkbox is ticked; otherwise all the loaded Eurovoc data will be subject to validation, consuming a lot of resources unnecessarily.

Leave all the other fields unchanged, as in the following figure:

Loading Eurovoc data

Then click on the "Submit" button.

The data loading process can take several minutes (depending on the underlying hardware), as the hundreds of thousands of triples of Eurovoc are loaded both into the core repository and into the support repository for the purposes of history (validation has been skipped by ticking the option for implicit validation). On a modern Microsoft Surface with an i7 processor, 16 GB of memory and an SSD, it takes less than 6 minutes.

A confirmation message will inform you that the data has been loaded successfully.

By clicking on the "Data" menu entry, it is possible to view the loaded thesaurus.

Loaded Eurovoc data

However, there are two inconveniences that limit the usability:

By clicking on the user icon on the top-right corner of the application, it is possible to select the language that the user is proficient with:

Loaded Eurovoc data

Going back to the data section, by clicking on the "Scheme" tab it is possible to select one (or more) of the available schemes so that the "Concept" tab will now show only those concepts belonging to it (or them).

Selecting a scheme for Eurovoc

In this case, we have selected the global "EuroVoc" scheme, which contains all concepts in the thesaurus, finally shown in the following figure (a refresh of the concept tree, through the dedicated circular icon, might be necessary):

Selecting a scheme for Eurovoc

Creating an OntoLex Project for Managing a Large Lexicon by Connecting to an External Triple Store

In this test drive we create a project for managing a large lexicon, the "Wordnet" lexicon. The project will rely on an external triple store: GraphDB.

a. Setting up and Running GraphDB

First, set up the GraphDB server.

If GraphDB has never been used with VocBench, follow the setup instructions in the related section of the system administration manual.

b. Creating the Project

Log into VocBench; the list of available projects will be shown (obviously empty if VocBench has just been installed):

Projects Landing Page

Click on the "Create" button in order to access the following project creation page:

New Project

Fill in the fields as shown in the text and image below:

Creating Wordnet project

Then click on the "Remote Access Config" button, which leads to the following window:

Setting up a new triple store connection

First, create a new configuration through the "Manage Config" button, which leads to the following window, where you have to enter the address at which GraphDB is listening (we assume here that everything has been installed on the same machine, so the "localhost" address 127.0.0.1 would work). Unless authorization credentials have been configured in GraphDB, no username or password is required.

Setting up a new triple store connection

By clicking on the + button, the configuration will be saved, as in the following figure:

Setting up a new triple store connection

After clicking on the "OK" button in the previous window, it is possible to select the previously created configuration through the "Server URL" combobox, so that the window will look like this:

Setting up a new triple store connection

After clicking on "OK" in the previous window, we come back to the Project Creation page:

Creating Wordnet project

We now set up the configuration for the core repository that will store the Wordnet data: click on the first of the two "Configure" buttons, the one related to the "Data Repository ID", check that the "ruleset" field has the value "empty", and leave everything else as is.

Loading LandAndWater data

We leave the other configuration unchanged (no need to click at all on the other "Configure" button), and finally click on the "Create" button at the bottom of the page, thus creating the project and the repository (core) on GraphDB.

c. Loading the Lexicon data

Before loading the data into VocBench, go to the "Data" page and select the "Lex. Entry" tab. Click on the gear button called "Settings" and, in the new window, select "Search Based".

Configure Search based

Do the same (click on the gear button and then select "Search Based") on the "Concept" tab.

Download the Wordnet lexicon RDF file and unzip it.

Select the newly created project, then go to the Global Data Management menu (top-right) and select "Load Data", as described here.

Click on the "Browse" button next to the RDF file label and choose the Wordnet data file you have previously downloaded (wn_08_eng.rdf).

Leave all the other fields unchanged, as in the following figure:

Loading Wordnet data

Then click on the "Submit" button.

The data loading process can take several minutes (depending on the underlying hardware), as the hundreds of thousands of triples of Wordnet are loaded into the core repository.

A confirmation message will inform you that the data has been loaded successfully.

Load pwn-concepts.rdf in the same way, so that all the concepts are available as well.

By clicking on the "Data" menu entry, it is possible to view the loaded lexicon.

The default tab in "Data" is "Lexicon". Select "Princeton Wordnet".

Wordnet Lexicon

Open the "Lex.Entry" tab.

Wordnet Lex.Entry

Since the "Search Based" mode is active, no lexical entry is shown. To see some lexical entries, search for "dog".

In the lower part of the window, type "dog" in the "Search" field and press Enter. After a couple of seconds, all lexical entries containing the word "dog" are returned:

Wordnet Lex.Entries dog

Click on the first entry, "water dog", to see all the information associated with it:

Wordnet Lex.Entries dog

Open the "Concept" tab.

Wordnet Concept

Since the "Search Based" mode is active, no concept is shown. To see some concepts, search for "dog".

In the lower part of the window, type "dog" in the "Search" field and press Enter. After a couple of seconds, all concepts containing the word "dog" are returned:

Wordnet Concepts dog

Click on the first entry, "Greater Swiss Mountain dog", to see all the information associated with it:

Wordnet Concepts dog
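The behaviour of the "Search Based" mode used in the two tabs above can be pictured with a small sketch: nothing is listed until a query is entered, and then only the matching labels are returned. The function name is hypothetical and the matching rule (case-insensitive substring) is an assumption for illustration. VocBench runs its search against the triple store and may match differently.

```python
def search_labels(labels, query):
    """Return the labels containing the query, ignoring case.

    An empty query returns nothing, mirroring a search-based panel that
    stays empty until the user searches for something.
    """
    needle = query.strip().lower()
    if not needle:
        return []
    return sorted(label for label in labels if needle in label.lower())


# Two labels come from the walkthrough above; the others are made-up samples.
labels = ["water dog", "Greater Swiss Mountain dog", "cat", "hot dog"]

if __name__ == "__main__":
    print(search_labels(labels, ""))     # nothing shown before searching
    print(search_labels(labels, "dog"))  # only labels containing "dog"
```

Showing results only on demand avoids rendering hundreds of thousands of entries at once, which is why this mode suits a lexicon of this size.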