When presenting around the country during the GetMETAsmart tour, one question that nearly always came up was how to lock your photo metadata so that others could not change it. The idea of permanent metadata is not a new one. According to John Nack of Adobe, it has been a perennial request from photographers for many years.
What is a File Header?
The “file header” is the non-image portion of a digital image file, preceding or following the actual pixel data. It contains marker segments and other information about the file, such as the technical, descriptive and administrative metadata typically written using the EXIF, IPTC or XMP standards.
Despite the specifications in the latest IPTC (International Press Telecommunications Council) schema requesting that the Creator and Source fields be considered as write-once fields, few developers (with the exception of HindSight’s METAmachine) have implemented this request.
Pixels and Metadata are Part of the Same BLOB
At present there is no definitive answer to the locking question, and there probably will not be one for some time to come. Part of the reason is that, at a digital level, an image is just a large blob (binary large object) of bits, and one that can be quite large compared to most other file formats, digital video being the exception. The metadata for most file formats is stored within a portion of the file header, and it can be difficult to prevent access to that data without also restricting what can be done with the pixel data.
Whenever someone performs a screen grab (or a series of them for a larger image), extracts a TIFF file from an Adobe Acrobat (PDF) file, or even copies and pastes the pixel data from one image frame to a new one, the original metadata is left behind, because current systems don’t tie the metadata to the pixel stream for these operations.
I Get It
I fully understand why photographers and image owners want their metadata to be permanent: they want it to be unalterable and perpetually associated with the image, regardless of the workflow through which it may travel. Unfortunately, today's unprotected data containers, such as EXIF, IPTC/IIM, or XMP, are inherently modifiable. It will take a significantly more complex system to change that, and that can’t happen overnight. Take a look at how long it has taken to get XMP support up to the level it is today (and that system still has a number of limitations). Then add an additional component, like permanence, and you raise a number of conflicting requirements, making it exponentially harder to carry out.
Major Architectural Changes Will be Needed
The reality is that protecting metadata this way will require major architectural changes throughout ALL image editing, metadata annotation, and image database tools. At minimum, guaranteeing permanence in metadata would require new extensions to existing file formats (like .ptif for a protected TIFF), which would also break backwards compatibility. The applications, file formats, and workflows that we use today are simply not designed for the more rigorous practices that will be needed to assure metadata and image permanence. As a result, insisting on permanent metadata would likely eliminate many legacy and open source (or small developer) tools from any image workflow where metadata permanence is involved. Of course, so long as any application exists that can open a given file format and then resave it as another where metadata and image permanence are no longer guaranteed, it's a moot point.
Signed Metadata is One Alternative
As one alternative, some have suggested the idea of integrating signed metadata into images. It wouldn't be impossible to remove this metadata, but it would be difficult, if not impossible, to forge such a signature. In such a system, any modification to the image without re-signing the metadata would be a sign of tampering (provided the signed metadata is still readable within legacy tools). Given the low adoption rate of digital signatures for digital documents (legal for use in the USA for the last 12-15 years), it’s difficult to tell whether this option will emerge. However, it could solve a number of current problems in identifying the owner of an image, or in determining when an image has been tampered with.
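As a rough illustration of the signing idea, here is a minimal sketch using only the Python standard library. It uses an HMAC, which relies on a shared secret key, as a stand-in for a true public-key signature (a real system would more likely use a scheme where anyone can verify without being able to sign). The function names `sign_record` and `verify_record` and the sample field values are hypothetical.

```python
import hashlib
import hmac
import json


def sign_record(metadata: dict, pixel_bytes: bytes, key: bytes) -> str:
    """Return a hex tag that binds the metadata to the pixel data."""
    # Serialize the metadata deterministically so the same fields
    # always produce the same bytes.
    payload = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # Hash the pixels so the tag also covers the image content.
    pixel_digest = hashlib.sha256(pixel_bytes).digest()
    return hmac.new(key, payload + pixel_digest, hashlib.sha256).hexdigest()


def verify_record(metadata: dict, pixel_bytes: bytes, key: bytes, signature: str) -> bool:
    """Return True only if neither metadata nor pixels changed since signing."""
    expected = sign_record(metadata, pixel_bytes, key)
    return hmac.compare_digest(expected, signature)


# Illustrative values only.
key = b"demo-secret-key"
metadata = {"creator": "Jane Photographer", "source": "Example Agency"}
pixels = bytes(range(16))

signature = sign_record(metadata, pixels, key)
untouched_ok = verify_record(metadata, pixels, key, signature)
tampered_ok = verify_record({**metadata, "creator": "Someone Else"}, pixels, key, signature)
```

Note what this does and does not achieve: changing the Creator field makes verification fail, but nothing prevents someone from deleting both the metadata and the signature.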
If a new file format were released that incorporated digital signatures or other encryption, it might be possible to make the metadata permanent. Yet so long as that format allows someone to view the image and copy the pixel data (such as with a screen capture), or resave the file as a TIFF, JPEG, or other format where you are allowed to change the metadata, the system is compromised. In addition, it’s very possible that a solution which requires encrypting image data and metadata will in turn complicate or prevent the recovery of image data (not to mention metadata) if the unencrypted original file is lost.
Watermarks May be an Option
Given the issues with social media networks, and even basic Content Management Systems, inadvertently or intentionally “stripping” metadata from uploaded images, it may make more sense to consider ways to embed information into the actual picture elements (pixels) using various watermarking techniques. You can create a visible watermark by stamping a chunk of information, like your name and copyright notice, as visible text into the face of an image. However, even watermarks are not necessarily permanent, as a determined thief can clone over the marks, given enough time.
In addition to visible watermarks, there are a few techniques out there that can transparently embed or weave information into the image file itself. In all cases it will affect the image quality to some extent; you are, after all, changing the pixels. In one approach, generically referred to as steganography, a piece of information like a unique ID number or your copyright info is coded into the pixels in a specific pattern, so that if the image is cropped or slightly modified, pieces of the pattern can be reassembled to reveal the original info. Making this more robust depends on increasing the intensity of the pattern, which of course will degrade your image as the pattern becomes more apparent. If the pattern is weakly applied, simple techniques like rotating the image by a fraction of a degree may prevent it from being readable.
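The trade-off between invisibility and robustness can be seen in a small sketch. The example below uses least-significant-bit (LSB) embedding, which is the simplest and weakest form of invisible marking, not the robust repeated-pattern approach described above. It assumes the image is already available as a flat list of 8-bit grayscale values, and the function names `embed_lsb` and `extract_lsb` are illustrative.

```python
from typing import List


def embed_lsb(pixels: List[int], message: bytes) -> List[int]:
    """Hide message bits in the lowest bit of the first pixels."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        marked[i] = (marked[i] & 0xFE) | bit
    return marked


def extract_lsb(pixels: List[int], length: int) -> bytes:
    """Read back `length` bytes from the lowest bit of each pixel."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for value in pixels[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


# Illustrative data: 200 synthetic grayscale values and a short notice.
pixels = [(i * 37) % 256 for i in range(200)]
notice = b"(c) J. Doe"

marked = embed_lsb(pixels, notice)
recovered = extract_lsb(marked, len(notice))

# Brightening every pixel by one level flips the lowest bits and
# destroys the mark, even though the image looks the same.
brightened = [min(255, value + 1) for value in marked]
after_edit = extract_lsb(brightened, len(notice))
```

Each pixel changes by at most one brightness level, so the mark is effectively invisible, but the same property means almost any edit or recompression erases it. More robust schemes survive such edits only by altering the pixels more strongly.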
Of course, the bigger problem with this approach is that you may be the only one who knows the information or link is stored this way. Thus, anyone wanting to track you down might not even think of opening your image in an application that can read these invisible watermarks.
Could Image Registries Hold the Key?
Another way to make sure that your photo metadata is preserved is to have it stored externally to the file. While it's possible to include a link or registry ID within the file, that doesn't help if the embedded photo metadata is stripped and others are trying to figure out where a file came from, or what's going on in the image. However, a registry that incorporates a visual recognition system could take an image with no metadata, compare it to the images within its registry, and provide a match (or matches) to the metadata that should have been there. I happen to know from other hats that I wear that this is something that the PLUS Registry should have in the near future, so stay tuned for more on this option.
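One way such a visual lookup could work is sketched below. It uses a simple "average hash", a common perceptual hashing technique, and is not a description of how the PLUS Registry or any other real registry works. It assumes each image has already been reduced to an 8x8 grid of grayscale values, and the names `average_hash`, `hamming_distance` and `lookup`, as well as the sample record, are hypothetical.

```python
from typing import Dict, List, Optional


def average_hash(gray: List[List[int]]) -> int:
    """Build a fingerprint with one bit per cell: 1 if brighter than the mean."""
    flat = [value for row in gray for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def lookup(registry: Dict[int, dict], query: int, max_distance: int = 5) -> Optional[dict]:
    """Return the metadata of the closest registered image, if close enough."""
    best = min(registry, key=lambda h: hamming_distance(h, query), default=None)
    if best is None or hamming_distance(best, query) > max_distance:
        return None
    return registry[best]


# Illustrative registry with a single synthetic 8x8 image.
registered = [[(row * 8 + col) * 4 for col in range(8)] for row in range(8)]
registry = {average_hash(registered): {"creator": "Jane Photographer"}}

# A slightly brightened copy with no metadata still matches.
brightened = [[value + 3 for value in row] for row in registered]
match = lookup(registry, average_hash(brightened))

# An unrelated image (here, the grid reversed) does not.
unrelated = [row[::-1] for row in registered[::-1]]
no_match = lookup(registry, average_hash(unrelated))
```

Because the fingerprint is computed from the pixels themselves, it survives metadata stripping, and the distance threshold lets small edits still resolve to the original record.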
I hope that gives you an understanding of why you can’t currently “lock” up your metadata. If you have questions or comments, please feel free to add those in the comments below, or in a note to the Dam Coalition, or my Twitter account.