More on JPEG

JFIF and EXIF share a heritage: both use JPEG application segments to store the data that wraps a JPEG-compressed image in a file format.

As a side note (and I’ll find a place to put this somewhere), iPhoto, when importing photos from a camera, is changing the data. It is at the least rearranging chunks of data, and it is probably adding or altering metadata. This shouldn’t have surprised me, I suppose. I noticed this when I was looking at photos that had been imported multiple times – the JPEG data itself was the same, but the files were different because chunks were in different orders. Perhaps successive versions of iPhoto have changed how import works. In any case, it’s just ever so slightly distressing; I’d prefer for the original to be truly original. I’ll investigate at some point by comparing files on the camera (extracted by copying directly from flash) with the same photos imported by various programs.

Catalog of application segment types

As far as I know, one of these needs to be the second chunk in the file (the first chunk is always SOI = xFF xD8). It’s undefined behavior if several of these exist in the same file.

I show the identifier strings as upper-case, but the strings are case-insensitive – Apple, for example, stores the EXIF tag as ‘Exif\x00\x00’.
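The layout described above (SOI first, then a run of marker segments) can be walked with a few lines of Python. This is only a sketch: `iter_segments` and `app_identifier` are names I made up, it assumes every marker before the scan carries a two-byte big-endian length that counts itself but not the marker, and it stops at the SOS marker (FF DA) rather than decoding image data.

```python
import struct

def iter_segments(data: bytes):
    """Yield (marker, payload) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # Length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            break
        pos += 2 + length

def app_identifier(payload: bytes) -> str:
    """Return the identifier string up to the first NUL byte."""
    return payload.split(b"\x00", 1)[0].decode("latin-1")
```

Since the identifiers seem to vary in case between writers, comparing `app_identifier(payload).upper()` against the upper-case names is probably the safer check.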

I need to write these as an actual grammar soon, but for now, I’ll illustrate them with hex dumps.

JFIF

The file format document can be found at http://www.w3.org/Graphics/JPEG/jfif3.pdf. It’s pretty sparse, all things considered. This document introduced the idea that the APP0 marker had to be right after the SOI marker. All values in JFIF are big-endian. JFIF image orientation is always top-down (JPEG allows bottom-up).

FF E0                            ; APP0
nn nn                            ; length
4A 46 49 46 00                   ; JFIF\x00
01 02                            ; version 1.02
xx                               ; 0=no units 1=px/in  2=px/cm
xx xx                            ; horizontal pixel density
xx xx                            ; vertical pixel density
xx                               ; thumbnail pixel width
xx                               ; thumbnail pixel height
xx xx xx yy yy yy ...            ; 3n bytes 24-bit RGB thumbnail
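To make the dump concrete, here is a minimal Python sketch that unpacks the payload of a JFIF APP0 segment (everything after the length field). The names `JfifHeader` and `parse_jfif` are mine, and I’m assuming the thumbnail is 3 × width × height bytes, which is how I read the “3n bytes” above.

```python
import struct
from typing import NamedTuple

class JfifHeader(NamedTuple):
    version: tuple        # (major, minor), e.g. (1, 2) for 1.02
    units: int            # 0 = no units, 1 = px/in, 2 = px/cm
    x_density: int
    y_density: int
    thumb_width: int
    thumb_height: int
    thumbnail_rgb: bytes  # 24-bit RGB, may be empty

def parse_jfif(payload: bytes) -> JfifHeader:
    """Parse the APP0 payload that starts with 'JFIF\\0'."""
    if payload[:5] != b"JFIF\x00":
        raise ValueError("not a JFIF APP0 payload")
    # All JFIF values are big-endian: 3 bytes, 2 shorts, 2 bytes.
    major, minor, units, xd, yd, tw, th = struct.unpack(">BBBHHBB", payload[5:14])
    thumb = payload[14:14 + 3 * tw * th]
    return JfifHeader((major, minor), units, xd, yd, tw, th, thumb)
```

A typical header with no thumbnail and 72 px/in in both directions parses to `units == 1`, `x_density == 72` and an empty `thumbnail_rgb`.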

JFXX

This is actually an extension segment to the APP0 JFIF marker. It can only appear in files with JFIF version 1.02 and above. The first byte after the identifier is an extension code, but while theoretically there can be different kinds of extensions, the only ones defined to date are for different kinds of thumbnails. Presumably if a thumbnail is stored in a JFXX extension segment, it would not also be stored in the JFIF main segment. And I’m betting that this extension is mostly used for JPEG thumbnails. EXIF also does JPEG thumbnails.

FF E0                           ; APP0
nn nn                           ; length
4A 46 58 58 00                  ; JFXX\x00
xx                              ; 10=thumbnail, JPEG
                                ; 11=thumbnail, 8-bit
                                ; 13=thumbnail, 24-bit
xx xx ...                       ; extension data

EXIF

As alluded to above, EXIF and JFIF are competing file formats, so in principle you can’t have both an EXIF and a JFIF chunk in the same file. In practice, I found a file that had an Exif chunk followed by a JFIF chunk (this was a photo sent in a text message, perhaps that’s why), and another file that had a JFIF chunk followed by an Exif chunk. How confusing. Also, EXIF is loosely based on and somewhat subsumes TIFF (you can store TIFF data in EXIF files, as well as JPEG, and many RAW formats are EXIF or EXIF-like files). See http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf for TIFF format info.

FF E1                           ; APP1
nn nn                           ; length
45 58 49 46 00 00               ; EXIF\x00\x00 (or \xFF at end)
49 49                           ; TIFF byte order: II = little-endian (4D 4D = big)
2A 00                           ; TIFF magic number 42
08 00 00 00                     ; offset to IFD0 (main image)
nn nn                           ; IFD0: count of directory entries
nn nn ...                       ; entry 0: 12 bytes
...
nn nn ...                       ; entry N-1: 12 bytes
nn nn nn nn                     ; offset to IFD1 (thumbnail image), 0 if none
...
00 00 00 00                     ; end of IFD list
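As a rough Python sketch of reading this structure: the input `tiff` is assumed to be the segment payload with the six-byte EXIF identifier already stripped, so it starts at the byte-order mark. `parse_tiff_header` and `read_ifd` are hypothetical helper names, and the entries are returned as raw 12-byte slices without interpreting them.

```python
import struct

def parse_tiff_header(tiff: bytes):
    """Return (struct byte-order prefix, offset of IFD0)."""
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if order is None:
        raise ValueError("bad TIFF byte-order mark")
    magic, ifd0 = struct.unpack(order + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return order, ifd0

def read_ifd(tiff: bytes, order: str, offset: int):
    """Return (list of raw 12-byte entries, offset of the next IFD)."""
    (count,) = struct.unpack_from(order + "H", tiff, offset)
    start = offset + 2
    entries = [tiff[start + 12 * i:start + 12 * (i + 1)] for i in range(count)]
    # The 4 bytes after the last entry point at the next IFD (0 = end of list).
    (next_ifd,) = struct.unpack_from(order + "I", tiff, start + 12 * count)
    return entries, next_ifd
```

Following `next_ifd` until it is zero visits IFD0, then the thumbnail IFD if there is one.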

Each 12-byte entry is formatted as follows

nn nn                          ; exif tag
nn nn                          ; data format
nn nn nn nn                    ; number of components
nn nn nn nn                    ; data or offset to data

The data format field is a value from 1 to 12 that determines the data type of the components. Total data length is the size of the array, so multiply the component size by the number of components. If the total length is 4 bytes or less, then the data is stored in the last field; otherwise the last field holds an offset to the data, measured from the start of the TIFF header (the byte-order mark), not from the start of the segment.

1 = unsigned byte (1 byte/component)
2 = ASCII char (1 byte/component)
3 = unsigned short (2 byte/component)
4 = unsigned long (4 byte/component)
5 = unsigned rational (8 byte/component)
6 = signed byte (1 byte/component)
7 = undefined (1 byte/component)
8 = signed short (2 byte/component)
9 = signed long (4 byte/component)
10 = signed rational (8 byte/component)
11 = single-precision float (4 byte/component)
12 = double-precision float (8 byte/component)
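The size rule is easy to get wrong, so here is a small Python sketch of it. `TYPE_SIZES` and `entry_data` are names of my own; the one-byte size for format 7 comes from the TIFF spec rather than the list above, and I’m assuming offsets are counted from the first byte of the TIFF header (the 49 49 or 4D 4D).

```python
import struct

# Bytes per component for data formats 1..12.
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1,
              7: 1, 8: 2, 9: 4, 10: 8, 11: 4, 12: 8}

def entry_data(tiff: bytes, entry: bytes, order: str):
    """Return (tag, fmt, count, raw) for one 12-byte IFD entry.

    `order` is "<" for little-endian (II) or ">" for big-endian (MM).
    """
    tag, fmt, count = struct.unpack(order + "HHI", entry[:8])
    total = TYPE_SIZES[fmt] * count
    if total <= 4:
        # Small values live directly in the last four bytes of the entry.
        raw = entry[8:8 + total]
    else:
        # Otherwise those four bytes are an offset into the TIFF data.
        (offset,) = struct.unpack(order + "I", entry[8:12])
        raw = tiff[offset:offset + total]
    return tag, fmt, count, raw
```

For example, a one-component unsigned short is 2 bytes and stays inline, while a six-character ASCII string is 6 bytes and is fetched through the offset.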

The types are all as they would be in the C language, with the exception of rational: a rational number is two 4-byte longs stored in sequence, the first for the numerator and the second for the denominator. Both are unsigned for format 5 and signed for format 10.
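A rational can be unpacked like this in Python. `decode_rationals` is a hypothetical helper; it returns (numerator, denominator) pairs rather than dividing, on the assumption that a file may contain a zero denominator.

```python
import struct

def decode_rationals(raw: bytes, order: str, signed: bool = False):
    """Split raw bytes into (numerator, denominator) pairs.

    `order` is "<" or ">"; `signed` selects format 10 instead of format 5.
    """
    code = "i" if signed else "I"
    n = len(raw) // 8  # each rational is two 4-byte longs
    values = struct.unpack(f"{order}{2 * n}{code}", raw[:8 * n])
    return [(values[i], values[i + 1]) for i in range(0, 2 * n, 2)]
```

A resolution of 72/1 stored little-endian is the bytes 48 00 00 00 01 00 00 00, which decodes to the pair (72, 1).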

XMP

This segment is used to embed XMP data into JPEG files. See http://www.adobe.com/devnet/xmp.html for details.

FF E1                           ; APP1
nn nn                           ; length
48 54 54 50 3A 2F 2F 4E         ; http://ns.adobe.com/xap/1.0/\x00
53 2E 41 44 4F 42 45 2E
43 4F 4D 2F 58 41 50 2F
31 2E 30 2F 00

ICC

http://www.color.org/specification/ICC1v43_2010-12.pdf

FF E2                           ; APP2
nn nn                           ; length
49 43 43 5F 50 52 4F 46 49 4C 45 00  ; ICC_PROFILE\x00

META

FF E3                           ; APP3
nn nn                           ; length
4D 45 54 41 00 00               ; META\x00\x00

PictureInfo

The JPEG APP12 “Picture Info” segment was used by some older cameras, and contains ASCII-based meta information.

FF EC                           ; APP12
nn nn                           ; length
50 69 63 74 75 72 65 49 6E 66 6F 00  ; PictureInfo\x00
xx xx xx xx                     ; quality
xx xx xx ...                    ; comment string
xx xx xx ...                    ; copyright string

Ducky

Photoshop uses the JPEG APP12 “Ducky” segment to store some information in “Save for Web” images.

FF EC                           ; APP12
nn nn                           ; length
44 75 63 6B 79 00               ; Ducky\x00
xx xx xx xx                     ; quality
xx xx xx ...                    ; comment string
xx xx xx ...                    ; copyright string

Photoshop IRB

Adobe IRB data. The spec I could find is “Adobe Photoshop 6.0, File Formats Specification, Version 6.0, Release 2, November 2000”: http://oldschoolprg.x10.mx/downloads/ps6ffspecsv2.pdf. This describes the old Mac format, which stored lots of metadata in ‘8BIM’ resources. There is an updated version on Adobe’s site titled “Adobe Photoshop, File Formats, Specification, June 2012”: http://www.adobe.com/devnet-apps/photoshop/fileformatashtml/. It looks like IRB stands for “Image Resource Block”. So the IRB segment is used for tunneling Photoshop data inside JFIF/EXIF files.

FF ED                             ; APP13
nn nn                             ; length
50 68 6F 74 6F 73 68 6F 70 20 33 2E 30 00 ; Photoshop 3.0\x00

Adobe

DCT Filters. See http://www.aiim.org/documents/standards/PDF-Ref/References/Adobe/5116.DCT_Filter.pdf

FF EE                           ; APP14
nn nn                           ; length
41 64 6F 62 65 00               ; Adobe\x00
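Pulling the catalog together, a segment can be identified from its marker byte plus the start of its payload. The table below is just the identifiers from the dumps above, compared in upper-case because the case seems to vary between writers; `SIGNATURES` and `classify` are names I made up, and this is a sketch rather than a complete list of what shows up in the wild.

```python
# (marker byte, upper-cased identifier prefix, name used in this catalog)
SIGNATURES = [
    (0xE0, b"JFIF\x00", "JFIF"),
    (0xE0, b"JFXX\x00", "JFXX"),
    (0xE1, b"EXIF\x00", "EXIF"),
    (0xE1, b"HTTP://NS.ADOBE.COM/XAP/1.0/\x00", "XMP"),
    (0xE2, b"ICC_PROFILE\x00", "ICC"),
    (0xE3, b"META\x00", "META"),
    (0xEC, b"PICTUREINFO\x00", "PictureInfo"),
    (0xEC, b"DUCKY", "Ducky"),
    (0xED, b"PHOTOSHOP 3.0\x00", "Photoshop IRB"),
    (0xEE, b"ADOBE", "Adobe"),
]

def classify(marker: int, payload: bytes):
    """Return the catalog name for a segment, or None if unrecognized."""
    head = payload[:32].upper()  # the longest identifier is 29 bytes
    for sig_marker, prefix, name in SIGNATURES:
        if marker == sig_marker and head.startswith(prefix):
            return name
    return None
```

Matching on the marker as well as the string matters because APP1 carries both EXIF and XMP, and APP12 carries both PictureInfo and Ducky.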

Sources

http://www.ozhiker.com/electronics/pjmt/jpeg_info/app_segments.html

http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/JPEG.html

http://en.wikibooks.org/wiki/JPEG_-_Idea_and_Practice/The_header_part

http://www.videotechnology.com/jpeg/j1.html

http://www.media.mit.edu/pia/Research/deepview/exif.html

http://www.colorwiki.com/wiki/Color_on_iPhone

Jeffrey Friedl’s Exif Viewer: http://regex.info/exif.cgi

TIFF file format: http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf

http://gvsoft.homedns.org/exif/exif-explanation.html

 
