Republished with Permission of Robert S. Seiner, Publisher of The Data Administration Newsletter, LLC (TDAN.com).

Introduction

Meta-data is not just about structured data anymore.  Nine years ago I published a definition for meta-data that makes as much good sense today as it did when I wrote it.  I defined meta-data as …

“Information documented in IT tools that improves both business & technical understanding of data and data-related processes.”

I have seen my definition repeated in DM Review magazine, Intelligent Enterprise magazine and several other publications by some astute :) authors.  I can see sticking with that definition because it says so much more than "data about data".  However, when looking closely at my definition ... or the industry definition, the question that could pop into people's minds is - what exactly is data?  My definition mentions data and data-related processes but does not say whether that data is structured, unstructured, or both.

This article is not intended to define or debate the differences between structured and unstructured data.  This author considers structured data to be tabular or delimited by nature and recorded in a file or database table.  For the purpose of this article, unstructured data will be referred to as "artifacts".  Artifacts include data, documents and content recorded in electronic format that can be managed and leveraged for the benefit of your company, your customers, your suppliers, etc.  Artifacts include word processing files, HTML files (web pages), project plans, presentation files, spreadsheets, graphics, audio files, video files, emails ... any data that is not in tabular or delimited format.  Some people call this recorded knowledge.  Some people call this web content.  Some people call these documents, as in document management.  Everybody calls it valuable.  For this article, that is the definition of unstructured data.

Just as with structured data, to manage artifacts of unstructured data a company needs to record meta-data about those artifacts, organize that meta-data, and make that meta-data available to the knowledge workers of the organization so they can locate artifacts when they need them.  The conceptual model (Figure 1) described in this article represents many of the types of meta-data that can be recorded about artifacts.  The model may not include absolutely everything that you need to know about the artifacts, but it should provide a good start toward understanding the relationship between meta-data and unstructured data.
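
To make the record / organize / make-available idea concrete, here is a minimal sketch in Python.  It is not the conceptual model from Figure 1.  The class names, the field names (location, artifact_type, title, keywords) and the sample file paths are all assumptions chosen for illustration, and a real catalog would carry many more types of meta-data.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactMetadata:
    """Meta-data recorded about one artifact (hypothetical, illustrative fields)."""
    location: str        # where the artifact is stored
    artifact_type: str   # e.g. "presentation", "email", "video"
    title: str
    keywords: list = field(default_factory=list)


class MetadataCatalog:
    """Organizes meta-data records so knowledge workers can locate artifacts."""

    def __init__(self):
        self._records = []      # every recorded artifact, in insertion order
        self._by_keyword = {}   # lower-cased keyword -> list of records

    def record(self, meta):
        """Record meta-data about an artifact and index it by keyword."""
        self._records.append(meta)
        for keyword in meta.keywords:
            self._by_keyword.setdefault(keyword.lower(), []).append(meta)

    def find(self, keyword):
        """Return the records tagged with the keyword (case-insensitive)."""
        return list(self._by_keyword.get(keyword.lower(), []))


# Hypothetical artifacts; the paths and titles are made up for the example.
catalog = MetadataCatalog()
catalog.record(ArtifactMetadata(
    "shared/plans/q3_plan.ppt", "presentation", "Q3 Plan",
    ["budget", "planning"]))
catalog.record(ArtifactMetadata(
    "shared/video/town_hall.mpg", "video", "Town Hall Recording",
    ["communications"]))

matches = catalog.find("Budget")
print([m.title for m in matches])
```

The point of the sketch is only that the artifact itself (the presentation, the video) is never opened.  The knowledge worker finds it through the meta-data that was recorded about it.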