Managing Multiple Schemes and Multiple Hierarchies

This page describes how to manage multiple schemes and, possibly, multiple, different hierarchies when different groups or organizations are working on the same SKOS dataset.

Before going ahead, a recommended reading is this page on SKOS development.

In large efforts to develop reference datasets for particular domains, different organizations may want to collaborate on common ground. The reason is clear: instead of duplicating efforts, different actors put their strengths together to maximize the result and to provide a single reference resource.

There are quite a few cases. Just to mention a few in the domain of agriculture:

While the first case is one of a kind, started as a coordinated, aligned backbone and meant to progress as a single unified resource, in the other two cases the approach is identical: delegate the management of parts of a thesaurus to other departments or other organizations.

In this case, a few features from group management come in handy.

Group Management and Multiple Schemes

In order to have "happy flatmates" living together within the same SKOS dataset, it is useful to adopt good, agreed policies on the one side, and to enforce restrictions through the system on the other. Luckily, VocBench helps in such cases.

Simply put, it is possible to define "groups" in VocBench and to associate specific behaviors with them in each project. To clarify, groups are persistent in VocBench at the system level (once defined, they do not need to be redefined for each project), but their specific behaviors must be set project by project. The main objectives of this feature are:

The group management page (and other related pages) provides all the information from the technical perspective.

Multiple, Scheme-specific, Hierarchies

One known limitation of SKOS (mainly due to limitations of RDF, at least before the advent of RDF*) is that, while certain information can be scoped to schemes (e.g. the very membership of a concept in a scheme), other information cannot. For instance, relationships such as those based on skos:broader/narrower cannot be scoped to any scheme and can only be asserted in general. The reason is that, as the relation involves two concepts (thus filling a triple with the concepts as subject and object and skos:broader/narrower as predicate), there is no way to state anything about the triple itself, such as its scoping to a scheme, unless the triple is reified or other tricks, such as micrographs, are used.

How can this situation be solved? Well, in standard SKOS (and RDF) this is not possible. However, one solution, supported by VocBench with dedicated features, is to define scheme-specific properties representing the broader/narrower relations.

Suppose there are two schemes, :schemeA and :schemeB. This solution foresees the creation of two properties, :broaderA and :broaderB. Each of them represents the broader relation in its associated scheme and can be used to bind two concepts in a broader relation only in the context of that particular scheme. Without loss of generality, we consider the case of two schemes (and will refer to two schemes in the following text), even though identical considerations apply to any number of schemes.

Please notice that this workaround is necessary only when both concepts involved in the broader relation belong to both schemes A and B. If one of the two concepts does not belong to a scheme, it will in any case be filtered out from the hierarchy of that scheme.
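The idea of scheme-specific broader properties can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how VocBench implements it: the concepts c1 to c4, the triples, and the choice of rendering each scheme with skos:broader plus its own property are all hypothetical.

```python
# Hypothetical data: scheme membership of each concept.
IN_SCHEME = {
    "c1": {":schemeA", ":schemeB"},
    "c2": {":schemeA", ":schemeB"},
    "c3": {":schemeA", ":schemeB"},
    "c4": {":schemeA"},  # c4 is not a member of :schemeB
}

# Triples as (subject, predicate, object).
TRIPLES = [
    ("c1", ":broaderA", "c2"),     # meant to hold only in :schemeA
    ("c1", ":broaderB", "c3"),     # meant to hold only in :schemeB
    ("c1", "skos:broader", "c4"),  # asserted in general, not scoped
]

# Assumption: each scheme is browsed with skos:broader plus its own property.
BROADER_PROPS = {
    ":schemeA": {"skos:broader", ":broaderA"},
    ":schemeB": {"skos:broader", ":broaderB"},
}


def parents(concept, scheme):
    """Return the broader concepts of `concept` as seen in `scheme`."""
    props = BROADER_PROPS[scheme]
    return sorted(
        o
        for s, p, o in TRIPLES
        # keep only triples using a property of this scheme and whose
        # target concept is a member of the scheme
        if s == concept and p in props and scheme in IN_SCHEME.get(o, set())
    )


print(parents("c1", ":schemeA"))
print(parents("c1", ":schemeB"))
```

In this sketch c1 appears under c2 and c4 in :schemeA, and only under c3 in :schemeB. The general skos:broader link to c4 needs no scheme-specific property, because c4 is not a member of :schemeB and is filtered out there anyway.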

So, what is the support provided by VocBench?

A further setting (Projects-->Project-Groups management in the administrator's UI) allows administrators and project managers to set default values for the above choices for each user and for groups (and thus for the users belonging to these groups).

Is this really necessary?

As we all know from Spidey's uncle (and recently even an auntie...), "with great power comes great responsibility". This feature is a sort of hack with respect to the common way of handling SKOS and introduces some complexity (see the next paragraph on how to implement the procedure). Yes, it can be used, but users should be really aware of how and when to use it.

There is indeed a common misconception that wanting multiple, different hierarchical relations is a frequent need. In fact, situations in which different hierarchies are desired in different schemes should amount to a small number, specifically these three cases:

  1. simple disagreement on whether c2 is broader than c1, with broader intended in a common sense in both schemes
  2. two concepts belonging to two different schemes, where the two schemes have very specific and different semantics of broader/narrower
  3. merely structural organization of the tree

Now, let's examine them case by case:

Case 1: disagreement.

We really encourage resolving disagreements. "Agreeing to disagree" on the same relation between two concepts is, very possibly, like suggesting that the semantics of the concepts themselves are not clear, which is like saying that a semantic resource is failing at being semantic. Not really a good message! This case is usually not a sign of healthy management, and giving too much freedom only results in more mess.

Case 2: different semantics of broader/narrower.

Well, SKOS is a very shallow model, and broader/narrower have been explicitly said to allow for different interpretations in different schemes.
Here is an example of how different semantics can be imbued in the same properties: if you sell cameras, you may decide that a lens is something you want to show under a camera, and use a scheme to represent this recommendation tree. A lens is not narrower than a camera (it is part of it). So is narrower intended here as "part-of"? Not even that: narrower here means "if you bought a camera, you might be interested in buying a lens". Now, if the camera and lens concepts were to be shown in another scheme, with broader intended in the more common sense of "is-a", then "lens" would not be narrower than "camera", because lenses are not cameras!
However, in two very general schemes, is there really a different intension of the broader/narrower relation? Usually not.

Case 3: Structural Organization.

Following what has already been described in case 2, we remark that the hierarchy in SKOS is something that is explicitly designed (differently from, say, OWL, where the objective is not to represent a hierarchy; on the contrary, the hierarchy is only a convenient view to observe the classifications).

These cases emerge merely from the "artificial nature" of the hierarchy in SKOS and from the fact that skos:broader/narrower are not transitive. For instance, suppose you have:

c1 skos:broader c2 .
c2 skos:broader c3 .

in schemeA, with c1, c2, c3 members of A

then you want to have, in schemeB, only:

c1 skos:broader c3 .

because c2 doesn't belong to schemeB (and thus you need to "wire up" c1 and c3)

And here is the limitation: since the skos:broader relation is not scheme-local, you would end up with c1 being directly under both c2 and c3 in schemeA. That is where a specific broader property, local to a scheme, is desirable when modeling the two schemes together.

So, the structural issue arising when some concept in the middle is missing from one of the schemes is a real case that can always happen, even though it might be uncommon.

Please notice that if you use skos:broader in the first place, things such as the unwanted triple "c1 skos:broader c3" in schemeA in the previous example would be detected by the ICV for "redundancy in the hierarchy". So a maintainer of scheme A could easily detect these cases and say: "well, thanks, my scheme-B friends, but we decide to split for that relationship, as c2 is missing from your tree but not from ours, and so we do not need the relationship wiring up c1 with c3", then switch "c1 skos:broader c3" to "c1 :broaderB c3".
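To make the notion of redundancy concrete, here is a minimal Python sketch of one possible check: a direct broader link is flagged when the same parent is also reachable through another direct parent. It is not the ICV shipped with VocBench, whose exact logic is not described here; the function name and the data are hypothetical.

```python
def redundant_broader(pairs):
    """Return the direct (child, parent) links that are also implied
    by a longer path through another direct parent of the child."""
    parents = {}
    for child, parent in pairs:
        parents.setdefault(child, set()).add(parent)

    def ancestors(node, seen):
        # Collect every concept reachable by following broader links.
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                ancestors(p, seen)
        return seen

    redundant = []
    for child, parent in pairs:
        for other in parents[child] - {parent}:
            if parent in ancestors(other, set()):
                redundant.append((child, parent))
                break
    return sorted(redundant)


# Hierarchy of scheme A when everything is modeled with skos:broader.
scheme_a_before = [("c1", "c2"), ("c2", "c3"), ("c1", "c3")]
# Same hierarchy once "c1 skos:broader c3" is switched to "c1 :broaderB c3",
# so that the link no longer takes part in the hierarchy of scheme A.
scheme_a_after = [("c1", "c2"), ("c2", "c3")]

print(redundant_broader(scheme_a_before))
print(redundant_broader(scheme_a_after))
```

With the first list the sketch reports the link from c1 to c3; with the second it reports nothing, which mirrors the switch described above.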

Choosing and Implementing the Policies

So, given what VB3 provides as support, and these modeling tricks, how should users implement the policies for managing multiple thesauri within the same dataset?

There are multiple possibilities, and all of these need to be supported by policies, while not being bound to further characteristics of VB3.

We can see, basically, two main scenarios:

Publication

Known Limitations Upon Publication

This solution has long been sought by institutions that, while relying on linked data technologies and models for representing their data, have no intention of publishing their resources as Linked Open Datasets. The reason is that, if the same IRI scheme is kept, there will be a single IRI representing the same concept; if this concept is connected to others through different, contrasting relationships, there will be only one locus for representing the hierarchical information, and a single representation must be chosen. So, approaches in this case include:

How to Publish

Here VocBench again helps to support the work of the users.

In Global Data Management-->Export Data, there is a dedicated transformer for normalizing some properties into others. This can simply be invoked to transform scheme-specific properties into the standard skos:broader when producing the output datasets.
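The effect of such a normalization can be sketched in Python as follows. This only illustrates the kind of rewriting involved and is not the VocBench transformer itself; the function name, the mapping, and the sample triples are hypothetical, and selecting which triples go into which output dataset is assumed to happen in a separate step.

```python
def normalize_properties(triples, replacements):
    """Rewrite predicates according to `replacements`, dropping any
    duplicate triples that the rewriting produces."""
    seen = set()
    result = []
    for s, p, o in triples:
        triple = (s, replacements.get(p, p), o)
        if triple not in seen:
            seen.add(triple)
            result.append(triple)
    return result


# Hypothetical content selected for the export of scheme B.
scheme_b_triples = [
    ("c1", ":broaderB", "c3"),
    ("c1", "skos:inScheme", ":schemeB"),
    ("c3", "skos:inScheme", ":schemeB"),
]

exported = normalize_properties(scheme_b_triples, {":broaderB": "skos:broader"})
for triple in exported:
    print(triple)
```

In the output, "c1 :broaderB c3" becomes "c1 skos:broader c3", while the other triples are left untouched.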