The Data View: Editing OntoLex-Lemon lexicons

Intro

This section describes the OntoLex-Lemon editing functionalities available in VocBench. We will go through the set of functionalities as they appear in the various elements of the user interface.

Please note: as of versions 4.0 and 5.0 of VocBench, the settings for combining other models (RDFS, OWL or SKOS) with OntoLex-Lemon as the lexicalization model of a project still need improvement. Consequently, if you need to use OntoLex-Lemon to any extent in your project (from simply lexicalizing an existing ontology with lexical entries from an existing lexicon to building all of these resources from scratch), we strongly suggest choosing OntoLex-Lemon as both the (semantic) model and the lexicalization model of the project.

The Various Sections in OntoLex Projects

OntoLex projects offer 8 sections for managing different kinds of resources. These include sections that we have already described in the manual pages on OWL Editing and SKOS Editing, so here we will discuss only the Lexicon and Lexical Entry Sections, which are peculiar to OntoLex-Lemon.

When editing an OntoLex-Lemon lexicon, the landing page for the Data View is the Lexicon Section.

Empty Lexicon Page

The Lexicon Section

The Lexicon Section (like all data sections) is composed of two main areas: the structure on the left and the resource view on the right. However, unlike the sections for other resources (e.g. classes, properties or concepts), the structure offers a simple list rather than a tree, as lime:Lexicons have no taxonomic relations among them. Quoting the definition of lime:Lexicon: "A lexicon is a collection of lexical entries for a particular language or domain".

The Lexicon Section allows the user to create and delete lexicons. When the usual + button is clicked, a form allows the user to enter the lexicon information. The form is composed of the following elements:

Lexicon Creation Dialog

Once one or more lexicons are created, they appear as a list in the structure. Note that the radio buttons to the left of the lexicons allow the user to select one of them. This selection affects the Lexical Entry Section.

Lexicon Section

The Lexical Entry Section

Like all other sections, the Lexical Entry Section is composed of two main areas: the structure on the left and the resource view on the right. The structure offers a list view very similar to the one in the Lexicon Section, though here the lexical entries are subdivided alphabetically. A dropdown menu between the toolbar and the list allows switching to a different letter of the alphabet.

As anticipated in the previous section, the list view is affected by the lexicon selected in the Lexicon Section: only the lexical entries belonging to the selected lexicon are shown in the list. Most of the following examples are based on data from the lemon lexica for DBpedia.

Lexical Entry View

The subdivision of the lexical entries by first letter is similar to the organization of alphabetically indexed printed dictionaries. It should help the user navigate a potentially long list of lexical entries. Additionally, the user can find a lexical entry through the search field at the bottom of the structure: selecting a result automatically switches the list to the right letter of the alphabet.

The alphabetic indexing of lexical entries is also meant to reduce the number of lexical entries that must be retrieved and then displayed in the UI: this optimization can be very important when working with large lexicons. In some circumstances, a single-letter index may not be sufficient, because the number of matching entries is too high. The gear icon above the list can be clicked to open a dialog for customizing the view. In particular, it is possible to switch between the "Index based" and the "Search based" visualization modes. The former is enabled by default, and it can be configured to use either one or two letters as the index. A longer index has been shown to be adequate for browsing very large lexicons such as the Open Multilingual Wordnet. The "Search based" visualization mode offers an alternative in which the list is only populated with the results of a search. This mode has been used with the English portion of the IATE terminology.

Lexical Entry View
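The difference between the two visualization modes can be illustrated with a minimal Python sketch. It is not VocBench code: the function names `build_index` and `search_entries`, the case-insensitive matching and the sample lemmas are all assumptions made for the illustration.

```python
from collections import defaultdict


def build_index(lemmas, index_length=1):
    """Index-based mode: group lemmas by their first `index_length` characters.

    Hypothetical helper; matching is case-insensitive by assumption.
    """
    buckets = defaultdict(list)
    for lemma in lemmas:
        key = lemma[:index_length].lower()
        buckets[key].append(lemma)
    # Sort both the index keys and the entries in each bucket.
    return {
        key: sorted(entries, key=str.lower)
        for key, entries in sorted(buckets.items())
    }


def search_entries(lemmas, query):
    """Search-based mode: the list holds only the lemmas matching the query."""
    needle = query.lower()
    return sorted(
        (lemma for lemma in lemmas if needle in lemma.lower()),
        key=str.lower,
    )


# Sample lemmas, invented for the example.
lemmas = ["capital", "city", "country", "currency", "mayor", "marry"]

one_letter = build_index(lemmas, index_length=1)
two_letters = build_index(lemmas, index_length=2)

print(list(one_letter))    # index keys with a one-letter index
print(list(two_letters))   # index keys with a two-letter index
print(search_entries(lemmas, "ry"))
```

With a two-letter index each bucket holds fewer entries, which is why a longer index suits very large lexicons; the search-based mode avoids loading any bucket until the user issues a query.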

The Lexical Entry Section also allows the user to create and delete lexical entries. When the usual + button is clicked, a form allows the user to supply the necessary information about the lexical entry. The form is composed of the following elements:

Lexical Entry Creation

The Lexical-Entry-view

The resource-view for lexical entry (or, simply, lexical-entry-view) is divided into a few sections listing the following information:

The addition of a form is informed by the contextual information associated with the lexical entry, so that it is only possible to select a natural language compatible with the one declared in the lexicon that the lexical entry belongs to. In fact, the system first looks at the language declared in the lexical entry, which should match the one declared in the lexicon.

Create Other Form

It is worth mentioning that the content of the RDFS members section is used by the system to display the constituents in the intended order.

Lexical Entry Constituents

Constituents should not be added or removed individually, nor should RDFS members be edited directly. Instead, the decomposition of a lexical entry should be edited as a whole: the new decomposition automatically replaces the existing one. If the user indicates a compound lexical entry as a constituent of the lexical entry, the system does the following: the compound is used as a subterm, while its constituents are in turn used in the decomposition of the lexical entry.

Create Constituents List
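The handling of a compound constituent described above can be sketched as follows. This is an illustrative Python sketch only, not the VocBench implementation: the function `plan_decomposition`, the dictionary of known decompositions and the sample entries are hypothetical. The sketch expands a single level of compounds and does not address whether simple entries are also recorded as subterms. The `rdf:_n` membership properties are the standard RDF container membership properties; that the RDFS members section holds exactly these triples is an assumption.

```python
def plan_decomposition(selected, known_decompositions):
    """Plan a whole new decomposition for a lexical entry.

    selected: entries chosen by the user, in the intended order.
    known_decompositions: maps a compound entry to its own constituents
    (hypothetical structure used only for this sketch).
    Returns (subterms, ordered_members).
    """
    subterms = []
    constituents = []
    for entry in selected:
        parts = known_decompositions.get(entry)
        if parts:
            # A compound is kept as a subterm, while its own constituents
            # take its place in the decomposition.
            subterms.append(entry)
            constituents.extend(parts)
        else:
            constituents.append(entry)
    # Ordered membership (rdf:_1, rdf:_2, ...) preserves the display order.
    ordered_members = [
        (f"rdf:_{position}", constituent)
        for position, constituent in enumerate(constituents, start=1)
    ]
    return subterms, ordered_members


# Invented example: "African swine fever" built from "African" and the
# compound "swine fever".
known = {"swine fever": ["swine", "fever"]}
subterms, members = plan_decomposition(["African", "swine fever"], known)

print(subterms)
print(members)
```

Because the function always returns a complete plan, it mirrors the recommendation to replace the decomposition as a whole rather than patching single constituents.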

Another interesting observation is that the Lexical Senses and Denotations Sections are related from the point of view of the OntoLex-Lemon model, as they conceptually represent the same information, namely the binding of lexical entries to resources in the lexicalized dataset. The Lexical Senses Section lists the reified version of the binding, represented by an ontolex:LexicalSense object relating the two entities. By contrast, the Denotations Section lists the bindings realized as a mere triple with predicate ontolex:denotes. The system tries to keep the two sections in sync by doing the following:

In fact, the system manages the creation and the deletion, respectively, of slightly more triples, since it takes into consideration the inverse properties defined by the OntoLex-Lemon model. As an example, when adding a denotation to a lexical entry, the system adds to the denoted resource the inverse properties relating it to the denoting lexical entry (only if the denoted resource is locally defined in the current project).

Addition of a denotation to a lexical entry
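The two ways of binding a lexical entry to a resource, and the role of the inverse properties, can be sketched in Python as plain tuples. The property names are those of the OntoLex-Lemon core module; the function names, the example identifiers and the exact set of inverse triples generated for a sense are assumptions of this sketch, not a description of what VocBench writes.

```python
# Property names from the OntoLex-Lemon core module.
DENOTES = "ontolex:denotes"
IS_DENOTED_BY = "ontolex:isDenotedBy"
SENSE = "ontolex:sense"
IS_SENSE_OF = "ontolex:isSenseOf"
REFERENCE = "ontolex:reference"
IS_REFERENCE_OF = "ontolex:isReferenceOf"


def denotation_triples(entry, resource, locally_defined):
    """Plain binding: a single ontolex:denotes triple, plus its inverse.

    The inverse is added to the denoted resource only when that resource
    is defined in the current project.
    """
    triples = [(entry, DENOTES, resource)]
    if locally_defined:
        triples.append((resource, IS_DENOTED_BY, entry))
    return triples


def sense_triples(entry, sense, resource, locally_defined):
    """Reified binding: an ontolex:LexicalSense relating entry and resource.

    Which inverse triples are written here is an assumption of the sketch.
    """
    triples = [
        (sense, "rdf:type", "ontolex:LexicalSense"),
        (entry, SENSE, sense),
        (sense, IS_SENSE_OF, entry),
        (sense, REFERENCE, resource),
    ]
    if locally_defined:
        triples.append((resource, IS_REFERENCE_OF, sense))
    return triples


# Hypothetical identifiers, used only for the example.
local = denotation_triples(":capital_n", ":Capital", locally_defined=True)
remote = denotation_triples(":capital_n", "ext:Capital", locally_defined=False)
reified = sense_triples(":capital_n", ":capital_sense", ":Capital", True)

for triple in local + reified:
    print(triple)
```

The sketch makes the point of the paragraph above concrete: one user action can translate into several triples, and the number depends on whether the target resource belongs to the current project.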

The Form-view

The forms of lexical entries are modeled in OntoLex-Lemon as resources of type ontolex:Form. Consequently, it is possible to double-click on them (e.g. within the Lexical Forms Section of a lexical-entry-view) and open their resource view. The resource-view for a form (or, simply, form-view) is divided into a few sections listing the following information:

Form View

When adding a representation to a form (e.g. a written representation), the system constrains the choice of the natural language based on the one defined in the lexicon. In fact, the system first looks at the lexical entry that is related to the form being edited; however, in the absence of errors, the natural language defined in the lexical entry matches the one defined in the lexicon. In the following example about an English lexicon, we add the British spelling "colour" to a form as an additional written representation (possibly accompanying the written representation "color" for the American spelling).

Addition of a written representation to a lexical entry
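The language constraint can be sketched in Python as follows. The notion of compatibility used here (the same tag, or a more specific tag such as a regional variant) and the fallback from the lexical entry to the lexicon are assumptions of this sketch; the manual does not spell out the exact rule, and the function names are hypothetical.

```python
def reference_language(entry_language, lexicon_language):
    """Pick the language used to constrain the choice.

    The lexical entry is consulted first; falling back to the lexicon when
    the entry declares no language is an assumption of this sketch.
    """
    return entry_language if entry_language else lexicon_language


def is_compatible(candidate_tag, language):
    """Assumed rule: same tag, or a more specific one (e.g. en-GB for en)."""
    candidate = candidate_tag.lower()
    base = language.lower()
    return candidate == base or candidate.startswith(base + "-")


# Example mirroring the "colour" / "color" case in an English lexicon.
language = reference_language(entry_language="en", lexicon_language="en")

print(is_compatible("en-GB", language))  # British spelling "colour"
print(is_compatible("en-US", language))  # American spelling "color"
print(is_compatible("it", language))     # a language outside the lexicon
```

Under this assumed rule both spellings are acceptable in an English lexicon, while a representation tagged with an unrelated language would not be offered.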

Changes to the Concept and Scheme Sections

In an OntoLex project, the behavior of the Concept and Scheme Sections is changed in the following ways:

These changes make it easier to create concept sets as defined by the OntoLex-Lemon model. Another feature introduced with the support for OntoLex-Lemon, though not restricted to OntoLex projects, is the possibility of switching the concept tree from a "hierarchy based" to a "search based" visualization mode. The latter, as we have already discussed, is particularly useful for browsing collections of concepts that are both large and rather flat. Again, the switch can be made by clicking on the gear button.

Lemon VocBench Custom Forms

The OntoLex-Lemon model is a collection of related OWL ontologies that define different modules of the overall specification. It is therefore possible, in principle, to edit and visualize OntoLex-Lemon lexicons using only the triple-level features of a generic ontology/RDF editor. Nonetheless, the reliance of the model on indirection (e.g. the written representation of a form of a lexical entry) and reification (e.g. a form is a resource) makes it hard to create and, consequently, understand the sometimes complex patterns required to represent seemingly simple information. In the previous sections, we have described the capabilities that have been added to VocBench to help users work with such a model. However, these capabilities do not cover all the possibilities offered by the OntoLex-Lemon model; in particular, the system is not equally convenient for editing the ontology-lexicon interface, as specified by the synsem module.

This limitation is mitigated by the use of the Lemon VocBench Custom Forms. They implement a subset of the lemon patterns for the ontology-lexicon interface as custom forms that should be used as custom constructors for the class ontolex:LexicalEntry.

Once the custom forms have been bound to the ontolex:LexicalEntry class, upon the creation of a new lexical entry, the user is prompted with the list of available forms, each corresponding to a different design pattern.

List of custom forms for the creation of a lexical entry

When the user chooses a form (e.g. Intersective Object Property Adjective, that is to say an adjective denoting a class defined as a value restriction on an object property), the form for the creation of the lexical entry is extended with additional fields that are specific to the chosen form (e.g. an object property and its value).

Custom form for an Intersective Object Property Adjective

The corresponding lexical entry is a very complex one:

It should be clear that representing the information above without the use of custom forms would have been quite difficult. Moreover, it is similarly difficult to understand that complex graph pattern and recognize it as an instance of a lemon design pattern. The custom forms also help visualize and understand complex lexical entries, since they are applied in "reversed mode": they are matched against the data, and the custom form that best fits the data is used to decode the triples at hand and present a clear form-based preview (including the name of the design pattern and its salient information).

Custom form for an Intersective Object Property Adjective

The Lexicographer view

For editing lexicons, VocBench also provides the LexicographerView. It is described at the following link: "LexicographerView".