Linguistic independent classes

Being able to collect an immense amount of linguistic polymorphism is surely nice, yet what makes a cataloguing system strong is its capability to classify things: to record that a cat is a feline, which is a mammal, which is an animal.

In the real world any concept can be used to classify any other. In the OWm2 storage engine, before one profile can be used to classify another, a user must declare that he or she is willing to use the first as a class object. This is meant to limit the number of classes, so that GUI widgets do not turn into a nightmare.

When you declare that a given profile is also a class, you may choose to mark it as abstract. An abstract class object cannot be used to classify simple profile objects directly; it can only classify other class objects in a taxonomic tree.

This may sound rather obscure, so a concrete example of the implications is in order. When classifying animals you may want to say, among many other things, that a "tiger" is a "feline", which is a "mammal", which is an "animal". In the same way, you may be interested in stating that German is an Indo-European language.

Yet you might not want to spend ages tagging every single German expression as Indo-European, or every mammal as an animal, and you might not be satisfied if someone said that a "tiger" is simply an "animal". You may want to state once and for all that a "feline" is a "mammal", which is an "animal", and be done with it. And you may want your GUI widgets to propose only the analytical class objects, while keeping the more generic conceptual layers in an invisible, implied background.

The answer lies in defining these generic class objects as abstract. Abstract class objects can build up tremendously powerful semantic frames, while keeping the number of choices in widgets to a bare minimum. Since class objects are themselves classified by tree structures, like anything else, you can always change and move your classification structure later on.
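The rule for abstract class objects can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Profile, is_class, is_abstract and classify are hypothetical and are not taken from the OWm2 storage engine.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical stand-in for a profile object."""
    name: str
    is_class: bool = False     # the user agreed to use this profile as a class object
    is_abstract: bool = False  # abstract classes may only classify other class objects
    parents: set = field(default_factory=set)  # names of the classes that classify this profile


def classify(target: Profile, cls: Profile) -> None:
    """Record that `target` belongs to `cls`, enforcing the two rules from the text."""
    if not cls.is_class:
        raise ValueError(f"{cls.name!r} has not been declared a class object")
    if cls.is_abstract and not target.is_class:
        raise ValueError(
            f"abstract class {cls.name!r} cannot classify the simple profile {target.name!r}"
        )
    target.parents.add(cls.name)


animal = Profile("animal", is_class=True, is_abstract=True)
mammal = Profile("mammal", is_class=True, is_abstract=True)
feline = Profile("feline", is_class=True)
tiger = Profile("tiger")

classify(mammal, animal)  # class under abstract class: allowed
classify(feline, mammal)  # class under abstract class: allowed
classify(tiger, feline)   # simple profile under a non-abstract class: allowed
```

In this sketch a call such as classify(tiger, animal) raises a ValueError, which mirrors the idea that a widget would never offer "animal" as a direct choice for "tiger".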

Using such implied background information can make your data extremely powerful: while nobody will be offered the chance to use an abstract class object to classify a given profile, anyone will be able to use these class objects as search terms. A comparative linguist, for example, will be able to extract in one go all Indo-European expressions for "cat".
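How a search on an abstract class might work can be sketched as a walk up the class tree. The data layout below (a CLASS_PARENTS mapping and a flat EXPRESSIONS list) and the function names are assumptions made for illustration; the actual engine may store and query this quite differently.

```python
# Hypothetical classification data: child class -> set of parent classes.
CLASS_PARENTS = {
    "German": {"Germanic"},
    "Germanic": {"Indo-European"},
    "Italian": {"Romance"},
    "Romance": {"Indo-European"},
    "Finnish": {"Uralic"},
}

# Hypothetical expressions: (expression, class it is directly tagged with, meaning).
EXPRESSIONS = [
    ("Katze", "German", "cat"),
    ("gatto", "Italian", "cat"),
    ("kissa", "Finnish", "cat"),
    ("Hund", "German", "dog"),
]


def ancestors(cls):
    """Return every class reachable by following parent links upward from `cls`."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in CLASS_PARENTS.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def search(meaning, cls):
    """List expressions with `meaning` tagged with `cls` directly or through an ancestor."""
    return [
        expression
        for expression, direct_class, expression_meaning in EXPRESSIONS
        if expression_meaning == meaning
        and (direct_class == cls or cls in ancestors(direct_class))
    ]


indo_european_cats = search("cat", "Indo-European")
```

Here no expression is ever tagged as Indo-European directly, yet search("cat", "Indo-European") still finds "Katze" and "gatto" and leaves out the Finnish "kissa".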

Once again, every class object includes a profile object that is responsible for its linguistic manifestation. So you can translate and define class objects like any other concept, because concepts are what they are.

What we build is more than a traditional taxonomy. Many relations can be expressed as a directed graph, as shown in the attached picture. Such a structure allows a class to be included in any number of traditional taxonomy trees.

In the example we see how the class object medicinal herbs can be included in the wider class object medicine by a number of different paths.

In fact both the following taxonomies are expressed:

  • [medicine [therapy [medicinal herbs]]]
  • [medicine [alternative medicine [medicinal herbs]]]
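The two taxonomies listed above can be recovered from a single directed graph by enumerating every path between the two class objects. The sketch below assumes that both paths end at the same class object, medicinal herbs, and that the graph has no cycles; the CHILDREN mapping and the paths function are illustrative names only.

```python
# Hypothetical directed graph: parent class -> child classes.
CHILDREN = {
    "medicine": ["therapy", "alternative medicine"],
    "therapy": ["medicinal herbs"],
    "alternative medicine": ["medicinal herbs"],
}


def paths(start, goal):
    """Return every downward path from `start` to `goal` (assumes an acyclic graph)."""
    if start == goal:
        return [[start]]
    found = []
    for child in CHILDREN.get(start, []):
        for tail in paths(child, goal):
            found.append([start] + tail)
    return found


herb_paths = paths("medicine", "medicinal herbs")
for path in herb_paths:
    print(" > ".join(path))
```

Each path returned corresponds to one traditional taxonomy tree in which the class appears, so a class with several parents simply yields several paths.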

This gives our classificatory structure a greater degree of flexibility and allows complex systems to be represented efficiently. But it is not yet enough, as we shall see. Let's consider one more example, shown in the third attached picture. Here we deal with agriculture and soil classification.

A particular kind of soil, the Alfisol, is so called because of the presence in it of aluminium and iron. The profile objects "aluminium" and "iron" are surely part of the conceptual "base" needed to fully grasp what an Alfisol profile really is. But is this inclusion hierarchical? Can we say that "everything about iron" should be labelled as "belonging to an Alfisol class object"? Most people will say that no, we cannot.

A hierarchical categorization is not the best tool to express this relation. Here we need something more like a WWW link: something that suggests you may also want to look at the information about iron, but that does not automatically place the iron profile in the list of components of the "Alfisol set". Good candidates for this set are, instead, landmarks where Alfisol is common, or specialized agricultural techniques for this peculiar kind of soil.

So what we have is a non-directed link, which we drew as a dotted line in our last example. In mathematical terms, we shall say that we use both a directed and a non-directed graph to map semantic values. The resulting table of combinations (to remain in mathematical terms we would say "the resulting adjacency matrix") is not simply composed of Yes/No values; each cell holds one of empty/directed/non-directed. And what we have, in practical terms, is the capability of expressing a relation between two profiles that either:

  1. puts one value INTO another (and thus has a "direction")
  2. simply links the two elements to each other (and so has no predefined direction: it can be walked both ways with the same result)

To optimize space consumption we obviously do not store empty values, so all we keep is the cells of the matrix that contain a relational value. At design time we decided that this is "enough". This was simply the designer's decision, and certainly not a law of physics, but for all practical purposes this structure really seems to do all a dictionary needs.
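A sparse store of this kind can be sketched with a dictionary that holds only the non-empty cells. The class and method names (RelationStore, include, link, relation) and the sample profile "Alfisol farming techniques" are assumptions made for this example, not the engine's actual API or data.

```python
from enum import Enum


class Relation(Enum):
    DIRECTED = "directed"          # one profile is put INTO another
    NON_DIRECTED = "non-directed"  # the two profiles are simply linked


class RelationStore:
    """Keeps only the non-empty cells of the relation matrix."""

    def __init__(self):
        self._cells = {}

    def include(self, member, container):
        # Directed cell: keyed by the ordered pair (member, container).
        self._cells[(member, container)] = Relation.DIRECTED

    def link(self, a, b):
        # Non-directed cell: keyed by an unordered pair, so it reads the same both ways.
        self._cells[frozenset((a, b))] = Relation.NON_DIRECTED

    def relation(self, source, target):
        """Return the relation from `source` to `target`, or None for an empty cell."""
        if (source, target) in self._cells:
            return Relation.DIRECTED
        if frozenset((source, target)) in self._cells:
            return Relation.NON_DIRECTED
        return None

    def stored_cells(self):
        return len(self._cells)


store = RelationStore()
store.include("Alfisol farming techniques", "Alfisol")  # belongs to the Alfisol set
store.link("Alfisol", "iron")                           # "see also", no direction
store.link("Alfisol", "aluminium")
```

With this layout the link between "Alfisol" and "iron" reads the same in both directions, the inclusion only reads one way, and a pair such as "iron" and "aluminium" costs no storage at all because its cell is empty.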

Next we shall see that some classes are more classes than others.