While metadata problems have been reported in the mainstream media for years, a February 2010 warning from the U.S. District Court for the Western District of Pennsylvania shows how difficult it remains to educate the public about the dangers of metadata.
For those who are unfamiliar with the term, metadata is information stored within an electronic document that describes the document itself or certain parts of it, and that is generally hidden from view in the normal display of the document. It is often described as ‘data about your data.’
Metadata can be automatically generated by the application used to create the document, or it can be manually entered by a reviewer, author or commentator. Examples of metadata include:
- A word processing file or spreadsheet contains information about the author of the document, when it was created, when it was last edited, notes about the document, the number of characters and words it contains, etc. In a Microsoft Word document, for example, some of the metadata about that document can be accessed and modified under the Properties feature.
- Documents can also include comments and notes from reviewers, auditors and others. Generally, the display of this content can be turned on or off, depending on why it was added.
In short, metadata provides critical information about documents or files, but is rarely intended for display with the primary content included in the document.
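To make the point about hidden properties concrete, here is a minimal sketch of how the author, editor and date fields can be read out of a Word file without opening Word at all. It assumes the document is saved in the newer .docx format, which is a ZIP package, and that its properties sit in the conventional `docProps/core.xml` part; older .doc files store this information differently and are not covered. The function name `read_core_properties` and the field names it returns are made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative labels (our own naming) mapped to the XML elements that hold them.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "title": "dc:title",
    "comments": "dc:description",
}


def read_core_properties(docx):
    """Return a dict of the properties found in a .docx file.

    `docx` may be a file path or a binary file-like object.
    Assumes the properties live at docProps/core.xml (the usual location).
    """
    with zipfile.ZipFile(docx) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            # No core-properties part in this package.
            return {}

    root = ET.fromstring(xml_bytes)
    props = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            props[name] = element.text
    return props
```

Calling `read_core_properties("contract.docx")` on a file of your own (the file name here is only a placeholder) returns whichever of these fields the document carries, which is often more than the author meant to share.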
In my next post on metadata, I will review some of the common issues that users run into when working with documents and PDFs. In the meantime, I encourage you to take a look at this whitepaper on the Dangers of Metadata.