
Global Data Management

The Global Data Management menu provides functionalities for overall management of the data in the project.

[Figure: export_repository]

Load Data

Please note that this functionality is meant for loading data that has to be maintained within the project (e.g. loading the latest distribution of the Eurovoc dataset in order to edit Eurovoc within VocBench). If the intent is instead to owl:import an ontology, in order to create a knowledge base based on it or to create another ontology extending its model, then the Import functionality in the "Metadata Management" section should be used.

The Load Data option allows data to be loaded from an external file into the current working graph (i.e. the graph where the data being edited is stored). Through the window opened after selecting this option, it is possible to browse the file system and select the file to be loaded. The baseuri field is usually not mandatory, as the baseuri is generally provided by the file being loaded. The value of this field is used only when the loaded file contains local references (e.g. #Person) and no baseuri has been specified within it. Formats such as N-Triples, which always contain fully specified URIs, never need this optional parameter, and in cases where local references are possible (such as in RDF/XML), the baseuri is usually provided inside the file.
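
As an illustration of how the baseuri is used during parsing, here is a minimal sketch using Python's rdflib (the library, file names and base URI are assumptions made for the example, not part of VocBench): the publicID argument plays the role of the baseuri field, resolving local references when the file does not declare a base itself.

    from rdflib import Graph

    g = Graph()

    # RDF/XML may contain local references such as "#Person"; publicID supplies
    # the base URI used to resolve them when the file declares none.
    g.parse("people.rdf", format="xml", publicID="http://example.org/people")

    # N-Triples always carries fully specified URIs, so no base URI is needed.
    g.parse("people.nt", format="nt")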

[Figure: load_repository]

The "Resolve transitive imports from:" combo box offers several possible values, instructing VocBench on how to resolve transitive dependencies, that is, vocabularies that are owl:imported by the loaded data, or by other vocabularies in turn imported by it.
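
To make transitive import resolution concrete, the sketch below (plain Python with rdflib, not VocBench's actual resolution logic; the fallback to a local mirror is only hinted at in a comment) follows owl:imports statements recursively and merges each imported vocabulary into the same graph.

    from rdflib import Graph
    from rdflib.namespace import OWL

    def resolve_imports(g, seen=None):
        """Recursively fetch owl:imports targets and merge them into g."""
        seen = set() if seen is None else seen
        pending = [t for t in set(g.objects(None, OWL.imports)) if t not in seen]
        if not pending:
            return
        for target in pending:
            seen.add(target)
            try:
                g.parse(str(target))   # resolve the import "from web"
            except Exception:
                pass                   # a local mirror could be tried here instead
        resolve_imports(g, seen)       # newly loaded data may declare further imports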

Note that the file content will be loaded inside the project's working graph, so it will be possible to modify it. This is the main difference with respect to the options on the Import Panel, which allow users to import existing ontologies as read-only data. Conversely, the Load Data option is typically used to reimport data that was previously backed up from an old repository, or for data exchange among different users.
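
A minimal sketch of what loading into the working graph amounts to, using rdflib named graphs (the project base URI and file name are assumptions): the loaded triples end up in an ordinary, fully editable graph, unlike the read-only imports of the Import Panel.

    from rdflib import Dataset, URIRef

    # Hypothetical project base URI; the working graph is named after it.
    WORKING_GRAPH = URIRef("http://example.org/myproject/")

    ds = Dataset()
    working = ds.graph(WORKING_GRAPH)                    # editable working graph
    working.parse("eurovoc_backup.ttl", format="turtle")

    # Data loaded this way can be freely modified afterwards.
    print(f"{len(working)} triples loaded into the working graph")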

Loading Large Amounts of Data on Projects Requiring Validation

When dealing with big datasets in projects with validation enabled, loading the initial data might be a very slow process. This is because each single triple is first copied into the support repository (in its reified form) and into the validation graph of the core repository, and then it needs to be validated, causing another heavy operation before being finalized.

As a solution, authorized users can tick the "Implicitly validate loaded data" option (which appears only in projects requiring validation, and only for authorized users). As the option name says, it skips the validation process and copies the data directly into the repository (if history is enabled, though, a copy in the history will be made in any case; this still requires less time than a copy in the validation graph, and does not require validators to perform the heavy validation operation later).

[Figure: loading_under_validation]

Export Data

After selecting the Export Data option, a window like the one in the figure below will be shown. Different file formats are available from the combo box associated with the "Export Format" label, covering the most common RDF serialization formats. The following section, "Graphs to export", lists all the named graphs available in the dataset of the project, so that the user can decide whether to export the sole working data or other information as well. Note that the provenance information of the graphs will obviously be maintained only for quad-oriented formats (e.g. N-Quads, TriX, etc.), while triple-oriented formats (e.g. N-Triples, Turtle) will have all the data merged into the same triple set.
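
The difference between quad-oriented and triple-oriented exports can be reproduced with a small rdflib sketch (graph names and triples below are invented for the example): N-Quads keeps the graph each statement comes from, while a triple serialization flattens everything into a single set.

    from rdflib import Dataset, Graph, Literal, URIRef
    from rdflib.namespace import SKOS

    ds = Dataset()
    g1 = ds.graph(URIRef("http://example.org/myproject/"))      # working data
    g2 = ds.graph(URIRef("http://example.org/myproject/meta"))  # other information
    g1.add((URIRef("http://example.org/c1"), SKOS.prefLabel, Literal("bank", lang="en")))
    g2.add((URIRef("http://example.org/c1"), SKOS.notation, Literal("001")))

    # Quad format: the graph names (provenance) are preserved.
    print(ds.serialize(format="nquads"))

    # Triple format: everything is merged into a single, unnamed triple set.
    merged = Graph()
    for s, p, o, _ctx in ds.quads((None, None, None, None)):
        merged.add((s, p, o))
    print(merged.serialize(format="nt"))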

[Figure: load_repository]

The last part of the export page concerns the possibility of applying export filters that alter the content to be exported according to user preferences. Export Filters range from very specific transformations (e.g. transforming all SKOS-XL reified labels into plain SKOS core ones), through user-customizable ones (e.g. DeletePropertyValue, which allows the user to specify a property and a value to be removed from all resources in the dataset; usually adopted to remove editorial data that should not appear in the published dataset, but repurposable for any need), to completely specifiable filters, such as the SPARQL Update Export Filter, which allows the user to completely change the content according to user-defined SPARQL updates.

Note that when filters are adopted, all the content to be exported is copied to a temporary in-memory repository, which can thus be altered destructively by the filters without corrupting the original data. The export process is optimized when no filter has been selected: in this case, no temporary repository is generated and the data is dumped directly from the original dataset.
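
The sketch below mimics with rdflib, rather than VocBench's actual filter machinery, what a destructive filter does: the data is copied into a scratch in-memory graph, a SPARQL Update in the spirit of DeletePropertyValue (or of the SPARQL Update Export Filter) removes an editorial property, and only then is the result serialized. File names and the filtered property are assumptions.

    from rdflib import Graph

    # Copy the data into a scratch in-memory graph, so the filter can alter it
    # destructively without touching the original dataset.
    original = Graph().parse("dataset_dump.ttl", format="turtle")
    scratch = Graph()
    for triple in original:
        scratch.add(triple)

    # Example filter: drop internal editorial notes before publication.
    scratch.update("""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        DELETE WHERE { ?s skos:editorialNote ?note }
    """)

    scratch.serialize(destination="published.ttl", format="turtle")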

We have provided a dedicated page for describing interesting configurations of the Export Filters.

Clear Data

Through this action, the project repository will be completely cleared.

Needless to say, pay attention to this action, because it erases all information in the project (we recommend saving the existing data before clearing it).
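
A minimal sketch of the recommended "save before clearing" routine, again with rdflib and assumed file names:

    from rdflib import Graph

    g = Graph().parse("project_data.ttl", format="turtle")

    # Keep a backup before wiping everything.
    g.serialize(destination="backup_before_clear.ttl", format="turtle")

    # Clearing: a wildcard pattern removes every triple.
    g.remove((None, None, None))
    assert len(g) == 0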

Versioning

VocBench allows authorized users to create time-stamped data dumps of the dataset, which can later be inspected through the same project. Each versioned dump is stored in a separate repository, which the application keeps write-protected.
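
Conceptually, a versioned dump is a time-stamped, read-only copy of the dataset. The sketch below illustrates the idea with rdflib and the file system (VocBench actually stores each version in a separate repository, not in a file; names and paths here are assumptions).

    import os
    import stat
    import time

    from rdflib import Graph

    g = Graph().parse("project_data.ttl", format="turtle")

    # Time-stamped dump, then write-protected so it cannot be modified.
    version_id = time.strftime("%Y%m%dT%H%M%S")
    dump_file = f"dump_{version_id}.nt"
    g.serialize(destination=dump_file, format="nt")
    os.chmod(dump_file, stat.S_IREAD)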

[Figure: version dump menu]

The Dump menu allows the user to create a new version of the dataset, either by following conventional coordinates for the creation of the repository of the new version, or by specifying them manually. The following figure shows the simple dialog for dumping a new version of the dataset, which requests only an ID (a tag) for the version.

[Figure: load_repository]

The following figure shows instead the repository configuration in the case of a dump to a custom location specified by the user. The configuration panel is very similar to the one used for creating the main dataset repository when initializing the project. Note that all the information related to the custom dump will be retained by the project, so it will always be possible to access this custom location without having to note down its coordinates/configuration.

[Figure: load_repository]

After a new version has been dumped, it will be listed among the available versions. The "Switch to" button located in the top-right corner allows the user to temporarily switch to this version and inspect its content. Everything in VB will then refer to this dumped version, except that it will not be possible to write to it. The Delete button allows the authorized user to delete the selected version.

[Figure: load_repository]

Data Refactor

The Data Refactor page allows the administrator or project manager (or an equivalently authorized user) to perform massive refactoring of the loaded data. This refactoring is usually performed at the beginning of the life of a project, typically after some data has been loaded from an external file, because the loaded data may not conform to the specifications of the project (e.g. the dataset contains SKOS core labels while the project is meant to manage SKOS-XL lexicalizations) and might need to be refactored in order to be properly managed with the intended project settings and configuration.

[Figure: load_repository]

Current refactoring options include converting back and forth between SKOS and SKOS-XL and migrating data from the default graph (i.e. the single unnamed graph in the repository) to the working graph (i.e. the graph named after the baseuri of the dataset, which is supposed to hold the working data).
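
As a rough idea of what the SKOS to SKOS-XL direction involves (VocBench performs the refactoring internally and mints proper URIs for the labels; the blank-node variant below, with assumed file names, is only a sketch), each plain skos:prefLabel is reified as a skosxl:Label carrying the original literal:

    from rdflib import Graph

    g = Graph().parse("skos_data.ttl", format="turtle")

    # Reify each skos:prefLabel literal as a skosxl:Label (blank nodes for brevity).
    g.update("""
        PREFIX skos:   <http://www.w3.org/2004/02/skos/core#>
        PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>
        INSERT {
            ?concept skosxl:prefLabel [ a skosxl:Label ; skosxl:literalForm ?label ] .
        }
        WHERE {
            ?concept skos:prefLabel ?label .
        }
    """)

    g.serialize(destination="skosxl_data.ttl", format="turtle")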

Metadata Management

The Metadata Management view can be accessed through the top-rightmost menu.

[Figure: metadata management]

The Metadata Management view (see figure below) is divided into two main sections, which can be selected through the buttons at the top of the page:

  1. The namespaces and imports section, allowing users to set prefix-namespace mappings, to owl:import ontology vocabularies and to edit the ontology mirror, a local mirror of ontologies stored within VB (see the small sketch after the figure below).
  2. The metadata vocabularies section, allowing for the specification of metadata according to different existing metadata vocabularies, such as VoID, LIME, DCAT, ADMS, etc.
[Figure: metadata management]
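
To ground the first section's concepts, here is a small rdflib sketch (ontology URI and imported vocabulary are assumptions) showing what a prefix-namespace mapping and an owl:import boil down to at the RDF level:

    from rdflib import Graph, Namespace, URIRef
    from rdflib.namespace import OWL, RDF

    g = Graph()

    # Prefix-namespace mapping, as managed in the namespaces and imports section.
    g.bind("foaf", Namespace("http://xmlns.com/foaf/0.1/"))

    # Declaring the ontology and importing another vocabulary via owl:imports.
    ont = URIRef("http://example.org/myproject")
    g.add((ont, RDF.type, OWL.Ontology))
    g.add((ont, OWL.imports, URIRef("http://xmlns.com/foaf/0.1/")))

    print(g.serialize(format="turtle"))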