Purpose
Allow information to be tied to a specific location or region inside the data, in this case specifically inside an image. If applied to video, time markers would also be useful: for example, start and end times for captions, or region key frames with time markers to point out something in the film, such as an actor moving across the screen.
Given how underdeveloped and proprietary implementations of this feature have been elsewhere, existing standards should be built on, and existing implementations should be studied and learned from.
Scope
Currently the scope of ImageRegionTags should only include:
- how to describe 2D regions of an image
- how to include references to other data
- how to contain its own data
- how to maintain import/export across existing systems
- how to be encoded into images in a way that is accessible and intuitive
Current Implementations
Microsoft People Tags
Summary:
- Information is encoded into XMP (embedded or sidecar file)
- Region is restricted to a rectangular shape
- The upper left corner of the region is recorded as a fraction of the image width and height, measured from the upper left corner of the image
- Region size is recorded as a fraction of the total image width and height
- Example: (0.10, 0.25, 0.23, 0.50) means the upper left corner of the region is 10% right and 25% down from the upper left corner of the image, and the rectangular region is 23% of the total width and 50% of the total height of the image
- The specification defines the types of data that can be included with the tag, including a unique ID and a label
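The region description above can be sketched in code. This is a minimal illustration only, assuming the four values arrive as a comma-separated string in the order (x, y, width, height); the names `parse_region`, `to_pixels` and `PixelRect` are hypothetical and are not taken from the Microsoft specification.

```python
from typing import NamedTuple, Tuple


class PixelRect(NamedTuple):
    left: int
    top: int
    width: int
    height: int


def parse_region(text: str) -> Tuple[float, float, float, float]:
    """Parse 'x, y, w, h' fractions, with or without surrounding parentheses."""
    parts = [float(p) for p in text.strip("() ").split(",")]
    if len(parts) != 4:
        raise ValueError("expected four values: x, y, width, height")
    if not all(0.0 <= p <= 1.0 for p in parts):
        raise ValueError("all values must be fractions between 0 and 1")
    x, y, w, h = parts
    return (x, y, w, h)


def to_pixels(region: Tuple[float, float, float, float],
              image_width: int, image_height: int) -> PixelRect:
    """Scale a fractional rectangle to pixel coordinates for one image size."""
    x, y, w, h = region
    return PixelRect(
        left=round(x * image_width),
        top=round(y * image_height),
        width=round(w * image_width),
        height=round(h * image_height),
    )


# The example region from the list above, applied to a 1000 x 800 image.
rect = to_pixels(parse_region("(0.10,0.25,0.23,0.50)"), 1000, 800)
```

Because the values are fractions rather than pixels, the same tag stays valid when the image is resized, which is probably the main appeal of this representation.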
Sources:
Flickr Notes
Summary:
- Information is kept in a database, but can be accessed through an API, or graphically on the website
- The note dimensions, as well as most image data, can be obtained through flickr.photos.getInfo
- Notes have a unique ID, author ID, author name, and region. The region is defined in pixel coordinates relative to a fixed image size (the 500 pixel "normal view"; exactly what that size refers to still needs to be confirmed)
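Since these regions are pixel based and tied to one display size, importing them into a resolution-independent format needs a conversion step. The sketch below assumes the "normal view" is the original image scaled so that its longest side is 500 pixels, and that a note carries x, y, width and height in that scaled coordinate space; both are assumptions to verify against the Flickr documentation, and the function names are hypothetical.

```python
from typing import Tuple


def display_size(orig_width: int, orig_height: int,
                 longest_side: int = 500) -> Tuple[int, int]:
    """Size of the image when scaled so its longest side is `longest_side`.

    Assumption: this is how the 500 pixel view is derived.
    """
    scale = longest_side / max(orig_width, orig_height)
    return round(orig_width * scale), round(orig_height * scale)


def note_to_fractions(x: int, y: int, w: int, h: int,
                      disp_width: int, disp_height: int
                      ) -> Tuple[float, float, float, float]:
    """Convert a pixel rectangle on the display-size image to fractions."""
    return (x / disp_width, y / disp_height, w / disp_width, h / disp_height)


# A 2000 x 1500 original is shown at 500 x 375 under the assumption above.
disp_w, disp_h = display_size(2000, 1500)
region = note_to_fractions(50, 75, 100, 150, disp_w, disp_h)
```

The resulting fractions can then be scaled to any other size of the same image.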
Fotonotes
Summary:
- Earliest implementation that has any documentation
- Does not clearly separate encoding, data format, and display (at least not to me)
- It is not entirely clear which part of the image header its data is encoded into (again, not to me, probably because I haven't followed through the code closely enough yet)
Sources:
Still researching
Metadata Working Group
Summary:
- Group formed to solve this problem and other similar problems
- Supports more sophisticated regions than the other available options
- Region data is stored in XMP
Sources:
Current Conclusions
- XMP is the preferred place and method to encode such data
- Rectangular regions must be supported
- The types of information that regions will be linked to must be anticipated, rather than hacked in later
- It should be possible and reasonable to set up two-way sync with other existing methods
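Two-way sync implies that converting a region to another system's representation and back should return the original region. As a hedged sketch: some formats describe a rectangle by its center point rather than its upper left corner (the Metadata Working Group region schema appears to be one such case, but that should be checked against its specification). The helper names below are hypothetical.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # all values are fractions of the image


def top_left_to_center(x: float, y: float, w: float, h: float) -> Rect:
    """(upper-left x, upper-left y, w, h) -> (center x, center y, w, h)."""
    return (x + w / 2, y + h / 2, w, h)


def center_to_top_left(cx: float, cy: float, w: float, h: float) -> Rect:
    """(center x, center y, w, h) -> (upper-left x, upper-left y, w, h)."""
    return (cx - w / 2, cy - h / 2, w, h)


def nearly_equal(a: Rect, b: Rect, tol: float = 1e-9) -> bool:
    """Compare two rectangles while allowing for floating point rounding."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


original = (0.10, 0.25, 0.23, 0.50)
centered = top_left_to_center(*original)
round_trip = center_to_top_left(*centered)
```

A round-trip check like `nearly_equal(original, round_trip)` is the kind of test a sync implementation would want for every pair of supported formats.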
