I recently worked on a project that highlighted findability issues with unstructured content and the need to tag that content appropriately using values from a controlled vocabulary.
At the heart of this project was Digital Asset Management (DAM), a rapidly growing area as more multimedia content is distributed online, particularly for marketing purposes. The inherent problem with digital assets is that there is potentially a large amount of information about what a piece of content is, but little information describing what that content is about. Unlike other content, which may contain text or sit within surrounding textual context, digital assets typically do not contain text, least of all text structured for discovery by search engines. Any textual, searchable elements must be associated with digital assets through metadata. Metadata describing what the content is, including attributes like video length, number of pixels, and file size, can be associated with the content and is often assigned automatically through business rules.
What the asset is about, however, is not inherent. It must be associated with the content either manually or automatically when the content is loaded, once business rules have been thought out and established.
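To make the distinction between "is" and "about" metadata concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the vocabulary terms, the folder-to-term rule, the `DigitalAsset` class, and the function names are hypothetical and do not come from the project or from any particular DAM product.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical controlled vocabulary; a real one would be managed in a taxonomy tool.
CONTROLLED_VOCABULARY = {"product launch", "customer testimonial", "tutorial"}

# Hypothetical business rule: an upload folder name implies a subject term.
FOLDER_RULES = {
    "launches": "product launch",
    "testimonials": "customer testimonial",
}


@dataclass
class DigitalAsset:
    path: str
    # "Is" metadata: properties a system can usually read from the file itself.
    technical: dict = field(default_factory=dict)
    # "About" metadata: subject terms that must be assigned to the asset.
    subjects: set = field(default_factory=set)


def tag_asset(asset, terms):
    """Manually tag an asset, rejecting terms outside the controlled vocabulary."""
    normalized = {term.strip().lower() for term in terms}
    unknown = normalized - CONTROLLED_VOCABULARY
    if unknown:
        raise ValueError(f"Not in controlled vocabulary: {sorted(unknown)}")
    asset.subjects |= normalized


def apply_folder_rules(asset):
    """Automatically tag an asset on load, based on the folders in its path."""
    for folder in PurePosixPath(asset.path).parent.parts:
        term = FOLDER_RULES.get(folder)
        if term:
            tag_asset(asset, [term])


video = DigitalAsset(
    path="marketing/launches/spring_demo.mp4",
    technical={"duration_seconds": 94, "width_px": 1920, "height_px": 1080},
)
apply_folder_rules(video)        # automatic tagging via a business rule
tag_asset(video, ["Tutorial"])   # manual tagging, normalized to the vocabulary
print(sorted(video.subjects))
```

In this sketch the technical attributes are simply recorded, while every subject term, whether applied by a rule or by a person, has to pass through the same vocabulary check, so free-form tags such as "cool stuff" are rejected rather than stored.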