Saturday, June 23, 2007

Metadata

So when is data metadata? It is all very relative. I even tend to see the word as something that does not really help a lot of people and actually blurs more than it clears up.

Firstly, it is an entirely relative concept. When you are dealing with data from customers, their products, customers and contracts are represented by data in applications. Your (and their) metadata is the model of that application, which, if we and they are lucky, represents and corresponds to something in the real world.

If you are part of a data modelling department that supports data modellers and application builders working for different customers, your data are their models. Your metadata is their metamodel, and if you and they are lucky, your metadata corresponds to their models out there in the real world.

If you are thinking about the metadata of that modelling support department, the language you do that in belongs to the meta-metamodel. There are different meta-metamodels, although not a lot, because it becomes quite abstract. This is the place where the shortcuts are taken, where we say the metaclass of a class is class. Evidently wrong but very usable.
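That shortcut can be seen in a running language. The following is a minimal sketch in Python, which happens to take exactly this shortcut; the Customer class and the name "Acme" are made up for illustration and come from no real application.

```python
# Hypothetical application class: the "model" for customer data.
class Customer:
    def __init__(self, name):
        self.name = name

# Data level: one customer record.
acme = Customer("Acme")

# One level up: the model that describes the record.
model_of_data = type(acme)            # Customer

# Two levels up: the metamodel that describes the model.
model_of_model = type(Customer)       # type

# Three levels up: the shortcut. The description of the metamodel
# is the metamodel itself, so the tower of levels stops here.
model_of_metamodel = type(type)       # type again
```

Asking for yet another level only returns the same answer, which is what makes the shortcut so usable.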

So what does it mean: a level above the level we are talking about? But wait, it becomes less clear when we consider self-reference. Let's look at an example.

When we hear a song on the radio, we know what is the song and what is the song's metadata. By the latter we mean who performs it, who wrote it, who produced it and which label it is on, for example. Also the genre and, for some genres, the beats per minute are song metadata.

Somewhere in the nineties it became customary to name the composer or the performer in the song's lyrics. We had some 'Darkchild' in a lot of hits, for example Toni Braxton's song. (I always misheard that as 'dogchild'.) Beautiful Liar starts with the words "Beyonce, Beyonce, Shakira, Shakira". Now is that song metadata or not? Obviously the DJ slacked and did not mention the song metadata enough, which should be his core business.

To be strict, and there is no reason not to be, I would say that as a part of the song, the names of these lovely persons (I suppose, I do not really know them) are data. Only if you know that these are the names of singers, and actually the names of the performers of the current song, do they also become part of the song metadata. (As opposed to the song in which the singer professes his eagerness to 'know' Kylie 'in the backseat of the car', where the descriptors are just part of the song lyric data.)
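The distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name and the idea of passing in a list of known performers are invented here, and stand for the outside knowledge the listener brings along.

```python
# The opening words of the song: plain lyric data.
lyrics = "Beyonce, Beyonce, Shakira, Shakira"

def names_readable_as_metadata(lyrics, known_performers):
    """Return the words in the lyrics that the listener can also read
    as metadata, given what the listener knows about the performers
    of the current song (hypothetical helper)."""
    words = {word.strip(",.") for word in lyrics.split()}
    return sorted(words & set(known_performers))

# Without context, the names are just words in the lyrics.
without_context = names_readable_as_metadata(lyrics, [])

# With context, the very same words double as song metadata.
with_context = names_readable_as_metadata(lyrics, ["Beyonce", "Shakira"])
```

The lyrics are identical in both calls; only the knowledge of the listener changes what counts as metadata.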

So metadata is a classification of concepts based on context and viewpoint. Everything can be metadata. It is everywhere. We have difficulty talking about the higher levels, and two levels up is already problematic for most of us. As such it is not a very useful concept, except when defining the services of certain companies, like mine.

Tuesday, February 6, 2007

Emulation and experiencing time

Now some experiments purport to show that time is not linear, that it does not exist, or worse. It might be that it is not our world that is emulated (think The Matrix), but that our own OS (HS, for Homo Sapiens) is running on older foundations foisted upon us by evolution.

Think of what people do: they build emulators. Actually the whole of computing is mostly building simulators or emulators. You might think your Core 2 Duo is fast, but actually it is mostly a simulator for some old electronic calculator instruction set. What it actually can do is shown by other processors, but the fact that they are not popular underscores my point in a way.

Now what can we learn if we reflect upon emulators, simulators and other layered implementations? Some things run badly because they are constrained by the underlying layer, like I/O and access to other serialized resources. Emulators benefit greatly from some hardware assist. Timing issues especially give away that something is running in an emulation layer. Aha.

Now it would be interesting to work out how our consciousness runs on a lower layer inherited from the primates and earlier, how we employ our own abstractions and how we are limited by the platform. We should also specify what can run on our platform and how to get rid of the constraints it puts upon future systems.