spatially mapping derivative content to original documents

Last Update:

Quadrant Addressing or Quaddressing provides an intuitive and consistent way to make relative spatial references to an original, two dimensional artifact (primarily for documents such as a printed text, photographs, etc.) regardless of its size or shape, or whether the artifact is physical or digital. Quaddressing relies on imprecision to aid human judgement instead of relying on highly specific measurements of varying units across different types of documents, thereby greatly increasing the ease and efficiency of both encoding and interpretation across all documents for both human actors and code.

Often while doing historical research and publishing my findings, I capture and display a smaller piece of information from an authoritative source. Actual examples include:

  • quoting a single sentence in a newspaper article, when the newspaper is in a larger format with many other articles on the same page,
  • taking not only full photographs of handwritten documents, but also photographing each page in sections for greater detail, and
  • cropping a photo to “zoom in” on one aspect of it, such as showing a single person’s face among an otherwise unordered group of people in a photo, or showing several city blocks from a much larger map that might itself not have a visible coordinate system or other spatial reference points.

In each of these cases, much of the context of the original and the relationship of the part to the whole is missing, at least to some degree. It is one of my fundamental goals to assist people in discovering, accessing, and understanding the original documents, whether physical or digital. To that end, I have been experimenting with an approach that would allow me to simply and easily add metadata to individual images that would provide an intuitive way to guide people back to their positions in the original documents.

My approach is to use a simple addressing system that can be applied uniformly regardless of the size or shape of a document, or whether it is print or digital. The approach scales depending on individual needs and can be easily and efficiently encoded and decoded, by both humans and code. I call this addressing system quaddressing.

Beginning with the negative use case: if a derivative represents the entire area of the original, its quaddress is “0”. To specify an area within the document, simply divide the current division into quadrants, beginning with the entire area:

<sodipodi:namedview id=“namedview32” pagecolor="#505050" bordercolor="#eeeeee" borderopacity=“1” inkscape:pageshadow=“0” inkscape:pageopacity=“0” inkscape:pagecheckerboard=“0” showgrid=“false” inkscape:zoom=“0.86125573” inkscape:cx=“502.17373” inkscape:cy=“425.54144” inkscape:window-width=“1920” inkscape:window-height=“1080” inkscape:window-x=“0” inkscape:window-y=“0” inkscape:window-maximized=“1” inkscape:current-layer=“svg30” units=“px” fit-margin-top=“0” fit-margin-left=“0” fit-margin-right=“0” fit-margin-bottom=“0” width=“500px” viewbox-width=“500” viewbox-height=“500” />1234

As greater specificity is needed, divide each quadrant in the current/highest division into additional quadrants, thus creating a new current/highest division:

12341234123412341234

The original quadrants are shown by the major grid, the added division by the minor grid.

This approach provides a clear and consistent method for achieving greater specificity by adding an arbitrary number of divisions that can then be referenced in the metadata with an arbitrarily long string of integers:


image.quaddress = "43";
12341234123412341234

Quaddresses in citations may be abbreviated as in the following real-world example:

— 11 Nov 1937, “In Tribute to Emily Mills”, The Ithaca Journal, Ithaca NY, p2, q212, newspapers.com.

This approach is simple and leverages the natural human ability to accurately estimate halves (and thus quarters), even of asymmetrical shapes. It should be noted that many documents and artifacts are themselves based on implicit grids, such as columns in a newspaper, streets on a map, and even the composition of photographs.

As with my approach to mitime, I intentionally sacrifice precision for simplicity to increase efficiency. After all, what I am trying to accomplish is to intuitively guide the person back to the approximate starting position of the content in the original document, at which point human judgement is aided adequately enough to quickly locate the content in context.

High precision is not needed and for human actors, makes both the metadata encoding and location resolution counter-productively complex. This approach is more efficient because it simply seeks to enhance human judgement rather than get in its way.

So, as a general rule, use the lowest level of specificity necessary to achieve the desired goal. In many cases, one or two divisions are sufficient, as again, we are only looking to aid human judgement by locating the content approximately.

There may be times when greater specificity may be genuinely needed, such as very large, or very detailed originals from which multiple derivatives are created. The approach for implementing and interpreting quaddresses is the same regardless of how many divisions are used.

Note too that this approach is compatible with mixed levels of specificity on a per-item basis as needed. For example, given a collection of otherwise related images, one could decide to use one or two divisions for most derivatives while specifying higher divisions on other images as needed. There is no reason to do an up-front review of data to decide the highest level of specificity likely to be needed across a collection, or in a publication, instead use mixed levels of specificity, only using higher divisions when absolutely necessary.


collection1.image1.quaddress = "43";
collection1.image2.quaddress = "4321";

And so it should be clear that quaddresses of different levels of specificity can be combined in multiple references to the same original as the relative proportions of the areas being referenced warrant it:


collection1.image1.quaddress1 = "43";
collection1.image1.quaddress2 = "4321";

This also illustrates that the length of the quaddress itself implies the number of divisions applied, and therefore, the relative size of the individual quaddress.

And similarly, different grids can be applied to derivatives as well, such as specifying a location within an photo within a page. The photo would have its quaddress within the page, probably at a low specificity, and then a derivative image of the photo would have its own quaddress, possibly having a greater specificity.

Obviously, many originals will not be perfectly square, and it is important to remember that this approach is purposely an abstraction of a map. To interpret a quaddress against say, an 11x17 sheet a paper, one can easily make the adjustment because division by quadrants only requires accurately judging the approximate midpoints along an X and Y axis for an area, something that is very natural for everyone.

12341234123412341234

The same can be said of non-rectilinear originals. As long as the grid is imagined to be the same maximum width and maximum height of the original–that is to say, the grid is the object’s minimum bounding box–the approach works, as again, using human judgement to apply the mapping is easily understood by anyone.

12341234123412341234

The shape of the original artifact is one factor to consider when deciding to include higher divisions.

In my use cases, such as referencing text within an article on a newspaper page, I only need to specify the approximate starting point of the content. Nothing is gained by, for example, trying to encode the overall area the derivative image occupies. However, this use case and others can be very straightforwardly accommodated, as shown in these two divisional grid examples:


# a single quaddress, already discussed
image.quaddress = "43";

# a range of quaddresses
image.quadrange = "14-34";

# multiple, possibly non-contiguous, quaddresses
# as a comma delimited string of integers
image.quaddresses = "12,22,43";

# multiple, possibly non-contiguous, quaddress ranges
image.quadranges = "14-34,22-44";

# etc.

The following is a live demo illustrating quaddresses. For practical reasons, I have limited the demo to a maximum of four divisions:

I believe that in most cases, two divisions are ideal to approximate locations and resolve quaddresses quickly. Three divisions is still manageable, but beyond that, resolving quaddresses starts becoming challenging and inefficient for human actors.

Notice that by their nature, quaddresses are sortable numerically, and if the number of reference divisions is constant, or alternately, the number is zero-padded, quaddresses are sortable lexicographically.