Notes on the future of Open RAW formats, and a look at DNG (by Stuart Nixon)
[Legal preamble - these are my own opinions only. I welcome and will happily publish valid corrections.]
There seems to be some confusion about DNG. Speaking both as a software developer and a photographer (and having had a lot to do with patent issues and image standards like JPEG 2000), I'd like to add my $0.02 to the debate.
DNG IS NOT THE ANSWER
Let me first make one thing clear: DNG is not an open standard for defining and storing all needed RAW camera information.
DNG makes the RAW format problem worse, not better.
DNG is not an open standard in that it does not document all the essential information contained in current RAW format files like NEF and CR2 (which also don't document this information).
In many ways, DNG can be viewed as simply yet another RAW format with undocumented information - except that DNG has the added risk that information can be lost during conversion to/from DNG and other RAW formats.
From a software developer's point of view, DNG is a step backwards. From a camera manufacturer's perspective, DNG does not address the missing elements in EXIF.
From a photographer's perspective, DNG is dangerous because people believe they are storing for the future with the format, when nothing could be further from the truth.
THE SITUATION TODAY
A quick recap of the current state of the RAW format market is in order:
- Aldus defined the TIFF 6.0 specification in 1992. Aldus was taken over by Adobe, who now control the format specification.
- TIFF uses an extendable structure called IFD, which stands for Image File Directory. This makes it possible to add new content to TIFF files, and allows other formats like JPEG to be embedded into TIFF. The TIFF IFD structure can also be embedded into other formats like JPEG.
- There are many, many sub-variations of TIFF, to the point that no one product (including Adobe's own products) reads and writes all the TIFF variations and permutations. And this is just for official TIFF format files - no product comes even close to reading all the different extended versions.
- TIFF has many technical problems, including being hard to implement, having no standard UNICODE support, and being limited to a 2GB file size. Various hacks exist to address some of these issues, making technical consistency even worse.
- Over the years, sub-variations, standards and extensions to TIFF were created by different groups. Some of the more important ones are:
IMPORTANT FORMATS AND STANDARDS FOR PHOTOGRAPHERS
- TIFF IFD
Deserves a special mention because it is the way TIFF can be extended.
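To make the IFD idea concrete, here is a minimal Python sketch of walking the IFD chain in a TIFF-structured file. It assumes the baseline TIFF 6.0 layout only (8-byte header, 12-byte directory entries, 4-byte offsets) and does not decode values or follow sub-IFDs. The function names and the tiny demo file are my own illustration, not part of any library.

```python
import struct


def read_ifd(data, offset, endian):
    """Read one IFD: a 2-byte entry count, 12-byte entries, then a 4-byte next-IFD offset."""
    (count,) = struct.unpack_from(endian + "H", data, offset)
    entries = []
    for i in range(count):
        # Each entry: tag (2 bytes), field type (2), value count (4), value or offset (4).
        tag, field_type, n, value = struct.unpack_from(
            endian + "HHI4s", data, offset + 2 + 12 * i
        )
        entries.append((tag, field_type, n, value))
    (next_ifd,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return entries, next_ifd


def walk_tiff(data):
    """Return a list of IFDs, each a list of (tag, type, count, raw_value) tuples."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF header")
    magic, offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    ifds, seen = [], set()
    while offset and offset not in seen:  # offset 0 ends the chain; 'seen' guards against loops
        seen.add(offset)
        entries, offset = read_ifd(data, offset, endian)
        ifds.append(entries)
    return ifds


def build_demo_tiff():
    """Build a tiny little-endian TIFF skeleton with one IFD and no pixel data (demo only)."""
    header = struct.pack("<2sHI", b"II", 42, 8)
    width = struct.pack("<HHIHH", 256, 3, 1, 640, 0)   # tag 256 ImageWidth, type 3 SHORT
    length = struct.pack("<HHIHH", 257, 3, 1, 480, 0)  # tag 257 ImageLength, type 3 SHORT
    return header + struct.pack("<H", 2) + width + length + struct.pack("<I", 0)


for entries in walk_tiff(build_demo_tiff()):
    for tag, field_type, n, value in entries:
        print(tag, field_type, n, value.hex())
```

The point to notice is that a reader can skip any tag it does not recognise and still find the rest, which is what makes the structure extendable. It is also why undocumented tags can sit in a file unnoticed.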
- Adobe IRB
Various Image Resource Block (IRB) extensions to TIFF and other formats used by Adobe. Of particular interest are the IRBs used to store thumbnails and IPTC data.
- IPTC-NAA
A standard - predating TIFF - for storing meta data, used by the media industries. Adobe defined an IRB to store meta data in the IPTC-NAA format. This is the most common way meta data like copyright is stored in TIFF files, despite a slow move to XML.
- EXIF
A vitally important standard by Japanese companies, which started life as a standard way to store camera information, using the TIFF IFD structure, in JPEG. Quickly morphed into a general way to store camera information in TIFF files as well, by simply dropping the EXIF structure into TIFF files. A great deal of useful camera information is encoded into EXIF, with one glaring exception: because the EXIF standard was designed to encode meta data for JPEG files, it did NOT describe a way to store/encode RAW data (this is the main issue that DNG attempts to address).
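As a rough illustration of "the TIFF IFD structure, in JPEG": EXIF data sits in a JPEG APP1 marker segment whose payload starts with the bytes "Exif" plus two zero bytes, followed by an ordinary TIFF header and IFDs. The Python sketch below locates that block. It is a simplification (it stops at the start-of-scan marker and assumes every marker before it carries a length field), and the function name and demo bytes are my own.

```python
import struct


def find_exif_tiff_block(jpeg):
    """Return the TIFF-structured bytes from the first Exif APP1 segment, or None."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg):
        marker, length = struct.unpack_from(">HH", jpeg, pos)
        if marker >> 8 != 0xFF or marker == 0xFFDA:
            break  # lost sync, or reached start of scan (compressed image data)
        # The 2-byte length counts itself but not the 2-byte marker.
        payload = jpeg[pos + 4 : pos + 2 + length]
        if marker == 0xFFE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]  # begins with "II" or "MM", i.e. a TIFF header
        pos += 2 + length
    return None


# Demo: a skeleton JPEG holding an empty little-endian TIFF structure (no image data).
tiff = struct.pack("<2sHI", b"II", 42, 8) + struct.pack("<HI", 0, 0)
payload = b"Exif\x00\x00" + tiff
demo_jpeg = b"\xff\xd8" + struct.pack(">HH", 0xFFE1, len(payload) + 2) + payload + b"\xff\xd9"
print(find_exif_tiff_block(demo_jpeg)[:2])
```

Everything after the six-byte "Exif" prefix is parsed exactly as a TIFF file would be, which is why the same camera tags can be dropped into TIFF-based RAW files so easily.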
- EXIF MakerNotes
The curse of the photo industry. Because EXIF did not contain all camera information, it included a way for manufacturers to extend EXIF with manufacturer specific data - the "MakerNotes" section. This quickly grew in undocumented and incompatible ways, as manufacturers used MakerNotes to store information like the RAW data. 99% OF RAW FORMATS ARE SIMPLY TIFF + EXIF + MAKERNOTES. THE PROBLEM IS THAT THE MAKERNOTES ARE NON-STANDARD AND UNDOCUMENTED.
- TIFF/EP
An ISO standard for "Photography - Electronic still picture imaging - removable memory - Image data format" to quote the standard. A credible ISO attempt to standardise the blizzard of TIFF information. However, it is incompatible with EXIF in several crucial areas, and uses a totally different way to store thumbnail IFDs, making TIFF/EP problematic. Most products read TIFF/EP but use EXIF in preference.
- .CRW, .CR2 and .TIF Canon RAW photo formats
There are three main formats used by Canon for their RAW photo format. The oldest format is the .CRW format, which is known as the Canon CIFF format. This is used by the D30, D60 and so on. Canon then swapped to a new RAW format, with a .TIF extension, used by the 1Ds. Probably because of confusion between RAW TIFF files and normal TIF files, Canon then swapped to the .CR2 extension with the 1D MkII. Canon .CR2 and RAW .TIF files are essentially the same format. Canon’s RAW format is internally well structured and consistent between different cameras. The latest format draws heavily from the TIFF IFD format. Probably the purest and best of the various RAW formats in existence (albeit still undocumented by Canon). Uses a clever, lossless, JPEG encoding technique for RAW data.
- .NEF Nikon RAW photo format
Undocumented Nikon RAW format (actually two formats; one for scanners and one for DSLRs). Essentially EXIF format but inside TIFF rather than JPEG headers, and again with undocumented extensions. Unlike other manufacturers, Nikon keeps tweaking NEF to the point that each camera and each Nikon editor product generates different, sometimes incompatible, NEF files. With Nikon NEF the best approach is to write-protect the original NEF file and never modify it, to ensure maximum compatibility. Uses two fairly clumsy techniques for compressing RAW data, one lossy and one lossless. Whereas most manufacturers don't document their RAW format, Nikon went one step further and started encrypting crucial data in NEF files. In a curious coincidence, Nikon started encryption at the time newer Nikon cameras were released with significant high ISO noise problems, which must be removed in post processing.
- DNG
Yet another TIFF variation, this one pushed by Adobe.
Has a number of advantages and disadvantages:
- Defines a standard way to encode raw sensor data
- Defines a standard way to transform colour
- Takes the proprietary RAW format problem and makes it much worse, with MakerNotes et al now being moved about.
- Offers the option to include the entire original raw file - but unless everyone actually uses this (thus doubling file size) just gives a false sense of security.
- Makes no attempt to define a standard way to store all the data currently stored in undocumented ways in MakerNotes.
- [Section removed on clarification of Adobe RGB profile status]
- Takes MakerNotes and moves them into another format, both perpetuating the problem and making it worse (as decoders often rely on absolute location information in MakerNotes)
- Makes no attempt to extend EXIF or TIFF/EP in a coherent fashion.
- Controlled by a single manufacturer - Adobe - who don't have a good history of managing the TIFF standard, or of allowing software developers to use Adobe controlled profiles or standards.
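The point about absolute location information can be shown with a toy example. The Python sketch below uses a purely hypothetical "note" layout (a 4-byte offset measured from the start of the file, followed by the data it points to); it is not any real manufacturer's MakerNotes format. Copying the note unchanged into a container with a different header leaves the stored offset pointing at the wrong bytes.

```python
import struct


def build_file(prefix, payload):
    """Hypothetical layout: prefix bytes, then a note = 4-byte absolute offset + payload."""
    note_start = len(prefix)
    absolute_offset = note_start + 4  # measured from the start of the FILE, not the note
    return prefix + struct.pack("<I", absolute_offset) + payload


def read_payload(file_bytes, note_start, size):
    """Decode the note the way a decoder trusting absolute offsets would."""
    (offset,) = struct.unpack_from("<I", file_bytes, note_start)
    return file_bytes[offset : offset + size]


def move_note(file_bytes, note_start, new_prefix):
    """Copy the note byte-for-byte behind a different header, as a naive converter might."""
    return new_prefix + file_bytes[note_start:]


original = build_file(b"H" * 16, b"WBDATA")
converted = move_note(original, 16, b"X" * 40)

print(read_payload(original, 16, 6))   # the note decodes correctly in its original file
print(read_payload(converted, 40, 6))  # same note bytes, but the offset now lands in the new header
```

A converter can only repair this if it knows which bytes inside the note are offsets, which is exactly the information that is undocumented.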
In addition to the above, other format issues deserve special mention:
- Thumbnails
There is a special place reserved in programmer hell for the authors of the various ways that thumbnails can be stored in TIFF related formats. Thumbnails are vital (obviously). Yet there are many ways to store them. Sometimes a photo will contain 4 or 5 thumbnails - all of them different and some of them not even matching the image any more. Because thumbnails are so important, software vendors have ended up creating their own catalogs for thumbnails. So each product has its own way of storing things. Thumbnails are currently stored in at least the following ways:
- TIFF thumbnails. An Adobe sub-IFD for thumbnails.
- TIFF/EP thumbnails. A different (and reverse to the above) standard.
- EXIF thumbnails. Yet another way to store them.
- NEF/CRW thumbnails. More ways to store thumbnails.
- MakerNotes thumbnails. Often also stored in MakerNotes.
- Photoshop IRB thumbnails. Another common thumbnail storage method.
- JPEG 2000
Despite a name similar to JPEG, the new JPEG 2000 is a totally new image format, using wavelet image encoding/compression instead of DCT encoding as used by JPEG.
Very unlikely to be used for on-camera encoding for a long time, for a number of reasons:
- Wavelet space encoding style not ideal for on-camera sensor encoding
- Very complex to implement in fast / low power hardware
- Patent risks associated with any new format.
The core JPEG 2000 standard unfortunately does not define a standard Meta format, although it does define a way to add new information to a JPEG 2000 file. One extension to JPEG 2000 (so-called because extensions are not part of the core format and thus don’t have to be supported by JPEG 2000 applications) is the “JPX” file format, which is a superset of the "JP2" JPEG 2000 file format. JPX files enable definition of an ICM within a JPX file, and also have a detailed definition of storing Meta data using XML.
In addition, it is possible that other formats (EXIF, IPTC et al) will simply be wrapped and stored in JPEG 2000 files. Given the many Meta data standards, and the lack of a core standard Meta definition within JPEG 2000, unified and generic support for meta data within JPEG 2000 is likely to be some time away. However (and despite its complexity), the JPEG 2000 Meta data stored in XML hold promise for the future.
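For the curious, the "way to add new information" in a JP2/JPX file is a sequence of boxes, each with a 4-byte big-endian length and a 4-byte type, and XML meta data travels in a box of type "xml ". The Python sketch below walks such a sequence. The demo data is only a signature box plus an XML box, not a complete or valid JPEG 2000 file, and the XML content and function names are made up for illustration.

```python
import struct


def iter_boxes(data):
    """Yield (box_type, payload) for each top-level box in a JP2-style byte string."""
    pos = 0
    while pos + 8 <= len(data):
        length, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if length == 1:
            # A length of 1 means the real length follows as an 8-byte value.
            (length,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif length == 0:
            length = len(data) - pos  # a length of 0 means the box runs to end of file
        if length < header:
            raise ValueError("corrupt box length")
        yield box_type, data[pos + header : pos + length]
        pos += length


def make_box(box_type, payload):
    """Build a box with a plain 4-byte length (demo helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


xml = b"<photo><copyright>Example</copyright></photo>"  # made-up meta data
demo = make_box(b"jP  ", b"\r\n\x87\n") + make_box(b"xml ", xml)

for box_type, payload in iter_boxes(demo):
    print(box_type, len(payload))
```

As with TIFF IFDs, a reader simply skips box types it does not understand. That makes wrapping EXIF or IPTC blobs easy, but it does nothing to unify what is inside them.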
- XML
Another way that Meta data can be stored is using XML. This can be embedded in the photo file itself for some formats (for example JPEG 2000), or stored as a "sidecar" file associated with the photo. The Adobe XMP format is one of several specific meta data definitions in XML. In theory XML offers considerable promise. In practice XML is currently implemented with many different "styles" (different ways of recording information), and a full XML implementation for meta data can be both quite slow and large. Notwithstanding these limitations, XML is probably going to be the main standard in the long term for storing meta data.
WHAT DOES EVERYONE WANT?
We all want standards for photos, but for different reasons:
- Photographers:
- Want a way to store photos that will be around in 50 years.
- Want to be able to use any product to edit any photo.
- Large software vendors:
- Want to have the standard under their control
- Want a legal hold over potential competitors
- Want to limit use of photos on competitive OSs/products
- Want hidden "submarine" patent/legal/copyright control over standards
- Small software vendors:
- Want a common and uniform standard
- Want protection from legal actions from large software vendors.
- Enlightened camera companies:
- Want and will use open standards like EXIF
- Must be able to extend the standard as technology evolves
- Don't want to have class actions against them in future years by people losing access to their photos.
- Other camera companies:
- See formats as a competitive edge
- Want to make money out of their own software and compete with software vendors
REAL WORLD ISSUES
There are some realities to consider:
- On camera hardware:
- Lots of effort goes into the on-camera hardware.
- Hardware level support for IFD structure and JPEG encoding
- Impractical to force a standard that does not do these on to camera companies.
- Cameras are evolving at a tremendous rate - cannot be held up by standards
- Likely to have major changes that will break standards
- On standards:
- XML is nice, but not going to make it onto the camera for a while.
- JPEG 2000 is years away from common camera level use.
- The TIFF IFD structure is horrible, but given it is widely used at the hardware level it is really the only way to go for now.
- Extend EXIF to standardise storing of RAW sensor data
- Extend EXIF to store common information currently stored in MakerNotes.
- Allow/encourage small software vendors (who don't have a particular axe to grind) to be involved
- Include a standard color profile/matrix for raw-to-calibrated conversion.
- Require all involved parties to waive legal/copyright control over widely used profiles and formats.
- Take a sympathetic approach to the difficulties that camera manufacturers have with standards, given rapid technology evolution and hardware design constraints.
An open consortium to define a standard (EXIF 3.0 perhaps?) to store common camera information, by extending the EXIF standard.
It should include:
- Compatibility with EXIF 2.x
- Ways to encode RAW data using JPEG lossy/lossless encoding that is easy for camera manufacturers to migrate to.
- Additional tags to store all the stuff hidden in MakerNotes like:
- Focus points
- Sensitivity curves
- White balance values
- Standard transformation matrix conversions
- Spectral sensitivity curves
- Recommendation of a non-manufacturer specific RGB profile wide colorspace to move away from Adobe RGB. Ditto for CMYK profiles et al.
I trust this (rather long) post is of interest.