[Historic] Temporal Tagging

Tue Jan 22 01:32:44 GMT 2013

The issue of how to handle changes in names, locations, and shapes (for
territories, building footprints, etc.) has been a favorite topic of mine in
a few discussions to date. The problems can quickly become complex, but I
think a simple schema could encompass all possible historical scenarios.
I'm happy to share the observations / conclusions, which have generally
been the following:

1) Unlimited Changes to All Attributes: The schema should allow the
recording of an *unlimited* number of changes to *any* attribute of any
entity. No attribute can be used as a unique identifier, like 'name' or
'address' or even location-based attributes, because throughout history,
all of these things can change. Therefore, a truly unique identifier must
be assigned to each entity. Additionally, buildings are moved, streets are
renamed, rerouted, and renumbered. Tim, as you've pointed out, this has
happened many times to some entities. Therefore, a schema that simply
allows for a single 'old-name' isn't flexible enough. All changes to
attributes as described above must each have a time associated with them.
2) Confidence Factors: Because historical data inherently entail
uncertainty, there should be a method of assigning a confidence factor to
any attribute. (This feature has no purpose in realtime mapping, because
all data can be verified against actual conditions.) This confidence factor
would be applicable both to attributes like names as well as times. For
names, for example, "we think the name of this hill was Telegraph Hill, but
there are conflicting reports that claim it was called Signal Hill, so we
assign a 60% confidence factor to Telegraph Hill and a 40% confidence
factor to Signal Hill". The renderer could then decide how to display the
name(s). For times, for example, "we know this hill changed name from Loma
Alta to Telegraph Hill sometime between 1848 and 1852, but we don't know
for certain when, so we assign the date of the change as January 1, 1850
and give it a confidence factor of 4 years (creating a buffer with a
temporal diameter of 4 years around that date)". This idea is critical,
because it allows conflicting reports and developing research to be
displayed alongside well-established facts.
3) Spatial and Non-Spatial Entities: Because shapes (nodes, etc.) cannot be
used as unique identifiers the way they can for realtime mapping, there
exists a need to create a distinction between spatial entities and
non-spatial entities. This way, each spatial permutation (or version) of an
entity, like a building or a road or a territorial boundary, can have a
distinct shape that is still linked to the non-spatial entity that
represents the concept of its agreed-upon identity. For example, 'United
States of America' would be a non-spatial entity with a start date of 1776
and no end date. But linked to that entity would be dozens of spatial
entities, because the boundaries of the United States have changed dozens
of times, therefore changing the shape, through small border edits or
territorial acquisitions. Each of those shapes would have its own start and
end time, and the map would display the correct shape as determined by the
time being viewed.
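
The three points above can be sketched roughly in Python as follows. This
is only an illustration under stated assumptions, not an existing schema:
the names Claim, SpatialVersion, Entity, values_at, shape_at and
start_window_years are all hypothetical, the geometry strings are
placeholders, and the 1800 split date between the two shapes is arbitrary.

```python
import uuid
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple


def covers(start: Optional[date], end: Optional[date], when: date) -> bool:
    """True if `when` falls in [start, end); None means open-ended."""
    return (start is None or start <= when) and (end is None or when < end)


@dataclass
class Claim:
    """Points 1 and 2: one dated, possibly uncertain attribute value."""
    value: str
    start: Optional[date] = None
    end: Optional[date] = None
    confidence: float = 1.0       # belief in the value itself, 0.0 to 1.0
    start_window_years: int = 0   # temporal "diameter" around `start`

    def start_range(self) -> Tuple[date, date]:
        """Approximate earliest/latest start dates (needs a known start)."""
        half = timedelta(days=round(365.25 * self.start_window_years / 2))
        return self.start - half, self.start + half


@dataclass
class SpatialVersion:
    """Point 3: one shape of an entity, valid for a span of time."""
    geometry: str                 # placeholder; could be WKT or GeoJSON
    start: Optional[date] = None
    end: Optional[date] = None


@dataclass
class Entity:
    """Non-spatial identity, keyed by an ID that never changes."""
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    attributes: Dict[str, List[Claim]] = field(default_factory=dict)
    shapes: List[SpatialVersion] = field(default_factory=list)

    def add_claim(self, attr: str, claim: Claim) -> None:
        # Any attribute can accumulate an unlimited number of claims.
        self.attributes.setdefault(attr, []).append(claim)

    def values_at(self, attr: str, when: date) -> List[Claim]:
        """Competing claims valid at `when`, most confident first."""
        hits = [c for c in self.attributes.get(attr, [])
                if covers(c.start, c.end, when)]
        return sorted(hits, key=lambda c: c.confidence, reverse=True)

    def shape_at(self, when: date) -> Optional[SpatialVersion]:
        """First spatial version whose time span contains `when`."""
        for s in self.shapes:
            if covers(s.start, s.end, when):
                return s
        return None


# The hill example: a dated rename plus two competing later names.
change = date(1850, 1, 1)
hill = Entity()
hill.add_claim("name", Claim("Loma Alta", end=change))
hill.add_claim("name", Claim("Telegraph Hill", start=change,
                             confidence=0.6, start_window_years=4))
hill.add_claim("name", Claim("Signal Hill", start=change, confidence=0.4))

# The USA example: one identity, several shapes over time.
usa = Entity()
usa.add_claim("name", Claim("United States of America",
                            start=date(1776, 1, 1)))
split = date(1800, 1, 1)  # arbitrary placeholder, not a real border change
usa.shapes.append(SpatialVersion("shape-A", start=date(1776, 1, 1),
                                 end=split))
usa.shapes.append(SpatialVersion("shape-B", start=split))

print([c.value for c in hill.values_at("name", date(1851, 6, 1))])
print(usa.shape_at(date(1790, 1, 1)).geometry)
```

A renderer could show only the first claim returned by values_at, or all
of them with their confidence factors, and could use start_range to fade
a change in over the uncertain period instead of switching on one date.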

Obviously, we're talking about a dramatically different way of recording
place data, but in my view, these levels of detail are critical to making a
viable historical mapping platform where multiple types of data can be
shared and displayed. Looking forward to hearing everyone else's thoughts
on this.

Brad Thompson
Pastmapper