Hierarchical data and translation trees
Our understanding of reality is often built from frames nested inside one another. A cat is a feline, which is a mammal, which is an animal, which is a living entity, etc. When we say "Fritz the cat" we imply a whole lot of information with that one animal's name.
When dealing with content cataloguing we need a lot of such frames, and we need to make sure we correctly map what is included in what. But we need frames even to express how a multimedia_text came to be appended to a given profile.
It is important for us to know that someone created a piece of content as an original, or as a translation. If it was a translation we must know from what it was translated in the first place. This serves two purposes:
So our content gets assigned to a profile by a so-called translation tree. This tree marks the way in which multimedia_text objects were produced and lets us immediately see that translation 1-4 is pretty likely to contain a huge semantic drift.
This tree is not an object, but simply a structure that can be contained by proper objects. So although you will see nothing at all about it in the profile table when looking at the database definition, the OWm2 engine can assemble the whole relational structure that is built on top of it.
Yet, wait! Aren't we making dictionary entries? So what is this content we are talking about? The lemma itself or its definition? The answer lies in the kind of tree we use. There are actually many kinds: OWm2 uses them to map all possible taxonomic relations with just one dedicated set of routines that manages them all. Some trees order class taxonomy, others order network topology, and others mark the way an "entry" is built. But before we proceed to this there are a few things to explain.
A tree structure knows many things:
You will probably have noticed that these three elements do not tell us anything about the hierarchical position of a given element within the tree. They do not say that translation 1-4 came from translation 1-3. Let's see why.
One of the most difficult challenges for the coder is to find a way to efficiently represent hierarchical data in a relational database. It may seem weird, but there is no immediate way to retrieve a taxonomic tree from relational tables by a single efficient query. So we all resort to tricks.
All tree elements have a left and a right value. They work as frames, so we immediately see that element 0-7 includes all the others, and that element 1-6 is included in 0-7 and includes both 2-3 and 4-5. This makes it trivial to arrange queries that retrieve a full taxonomic mapping in a single shot and compute an element's depth on the fly. It also makes it trivial to move parts of a tree around. Here we have the basic structure that allows merging and splitting things without much fuss and without any risk of losing bits and pieces in the process.
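The left/right frames described above can be sketched with a small in-memory SQLite table. This is only an illustration and not the OWm2 schema: the table name `tree_node`, the columns `lft` and `rgt`, and the node names are assumptions made up for this example. The four rows are the elements 0-7, 1-6, 2-3 and 4-5 mentioned in the text.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions, not OWm2's.
ROWS = [
    ("root", 0, 7),    # element 0-7 frames everything else
    ("branch", 1, 6),  # element 1-6 sits inside 0-7
    ("leaf_a", 2, 3),  # elements 2-3 and 4-5 sit inside 1-6
    ("leaf_b", 4, 5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree_node (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)"
)
conn.executemany("INSERT INTO tree_node VALUES (?, ?, ?)", ROWS)


def tree_with_depth(conn):
    """Return (name, depth) for every node in tree order, with one query.

    A node's depth is the number of frames that contain its left value,
    minus one for the node's own frame.
    """
    sql = """
        SELECT node.name, COUNT(parent.name) - 1 AS depth
        FROM tree_node AS node
        JOIN tree_node AS parent
          ON node.lft BETWEEN parent.lft AND parent.rgt
        GROUP BY node.name, node.lft
        ORDER BY node.lft
    """
    return conn.execute(sql).fetchall()


def descendants(conn, name):
    """Return every node whose frame lies strictly inside the given node's."""
    sql = """
        SELECT node.name
        FROM tree_node AS node
        JOIN tree_node AS parent
          ON node.lft > parent.lft AND node.rgt < parent.rgt
        WHERE parent.name = ?
        ORDER BY node.lft
    """
    return [row[0] for row in conn.execute(sql, (name,))]


def parent_of(conn, name):
    """Return the closest enclosing node, or None for the top element."""
    sql = """
        SELECT parent.name
        FROM tree_node AS node
        JOIN tree_node AS parent
          ON parent.lft < node.lft AND parent.rgt > node.rgt
        WHERE node.name = ?
        ORDER BY parent.lft DESC
        LIMIT 1
    """
    row = conn.execute(sql, (name,)).fetchone()
    return row[0] if row else None
```

With this data, `tree_with_depth(conn)` lists the nodes from the outermost frame inwards, and `parent_of(conn, "leaf_b")` finds the enclosing element 1-6 even though no row stores a parent reference. The hierarchy is recovered purely by comparing left and right values.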
Such an ingenious solution is obviously no invention of the OWm2 team. All credit goes to Mike Hillyer for a very clear explanation of this method, along with basic code snippets that made it trivial to build what OWm2 needs.
So, now that we have explained the basic technology we use to map and move relational data, let's move on to the way in which "dictionary entries" are built.