Writing about metadata is risky business, since every post and every tweet potentially starts the same discussion: what exactly is metadata, anyway? So here’s my ambitious attempt to cut to the chase, and open the can of worms again.
Why would you care, anyway? Isn’t this just some highly technical or theoretical debate? Well, to some extent it is, but the fact remains that for any content technology, metadata is essential. Metadata is what allows us to use a system to manage content in the first place. Even if you take the brute force approach of using enterprise search, rather than meticulously tagging all your content with metadata, you’ll find results will be disappointing, at best. (In fact, if there’s no useful metadata available, search engines will have to create it themselves.) Metadata is so important that we now even get court rulings to define it.
Of course, the essence is easily defined. Metadata is data about data. The problem is that, in the end, you can’t really define the distinction between data and metadata.
Examples abound: a document’s author, the date content was created or published, the name of a database column. Even the filename is metadata. You can see it in any system dealing with content, and often, helpfully, it will actually be marked as “metadata.” There are standards for which metadata to record (like Dublin Core, or EXIF) and for how to store it in the document itself (like XMP). If that’s all you want to know, now might be a good time to stop reading. Because from there, it starts getting tricky.
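To make the easy cases concrete, here is a minimal Python sketch of the kinds of metadata listed above. It is an illustration, not any particular system's data model: the function names `describe_file` and `dublin_core_record` are made up for this post, and the creator and title values are placeholders. The `dc:` keys borrow element names from Dublin Core (title, creator, date, identifier).

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Collect metadata the filesystem already keeps about a file."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(),
    }


def dublin_core_record(path, creator, title):
    """Combine filesystem metadata with hand-supplied descriptive metadata.

    The keys use Dublin Core element names; the mapping from filesystem
    fields to those elements is an assumption made for this example.
    """
    fs = describe_file(path)
    return {
        "dc:title": title,
        "dc:creator": creator,
        "dc:date": fs["modified"],
        "dc:identifier": fs["filename"],
    }


if __name__ == "__main__":
    # Create a throwaway file so the example runs anywhere.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "post.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("Metadata is data about data.")
        print(describe_file(path))
        print(dublin_core_record(path, creator="A. Writer", title="A post"))
```

Note that the file's content never appears in either record: everything here describes the file rather than being the file, which is the uncontroversial part of the definition.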