[OSM-talk] Overpass API v0.7.50 almost done
Roland Olbricht
roland.olbricht at gmx.de
Sat Jun 7 08:03:54 UTC 2014
Dear all,
first of all a big thank you to all who have contributed to the SSD
funding or to the general FOSSGIS funding for Overpass API.
Before SotM-EU 2014 I will publish a new stable Overpass API release,
the first in about a year. To avoid confusion, I would like to sketch
what it does, what it doesn't do, and how this relates to the running
service.
Proper documentation will be written along with the workshop slides
for the SotM-EU.
In particular, I ask you to test the attic feature now, so that any bugs
are detected and fixed before the database re-run during July.
== Summary for the impatient ==
For current data, the Overpass API is and will be completely reliable.
Most new features for current data will be postponed to the next
version, because they don't affect the urgent database rebuild.
For Augmented Diffs, I will change the mode of operation, mostly
because the current approach has intrinsic stability problems. The last
bigger incident happened on 18th April this year. The new Augmented Diff
mode can be tested now, and the old one should be continued until the
end of July.
Attic data is a completely new feature. The underlying data has basically
been available since the license change in September 2012. But due to two
detected and possibly other undetected bugs, attic data before 02nd June
2014 might be damaged.
== Current data ==
Queries on current data are so well adopted and so widely used that
I will not break backward compatibility. Essentially only two features
have been added:
Sparse queries are now faster. An example is
http://overpass-turbo.eu/s/3FS
That query would have taken hours with such a large bounding box in the
past. Now it should run in less than 15 seconds. There is still room for
improvement: it is not yet that fast for regular expressions, but
this is postponed to the next version.
And features like ways and relations can expose their geometry directly
on the feature. This is triggered by adding the word "geom" to the "out"
statement. This currently works only for XML output. For JSON output,
the final format is discussed on
https://github.com/drolbr/Overpass-API/issues/93
The JSON format should mimic GeoJSON as closely as makes sense. Once
again, the feature is very useful, but doesn't affect the database
debugging. Thus it is postponed.
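
As a small illustration of the "geom" keyword, the following Python sketch
only assembles the text of such a query; it does not contact any server. The
function name, the tag key and the bounding box are placeholders chosen for
this example and are not taken from the release.

```python
def ways_with_geometry(key: str, south: float, west: float,
                       north: float, east: float) -> str:
    """Build an Overpass QL query for ways carrying tag `key` inside a
    bounding box, asking for geometry directly on each way."""
    # Overpass QL bounding boxes are ordered south, west, north, east.
    bbox = f"{south},{west},{north},{east}"
    # "out geom;" instead of a plain "out;" requests the inline geometry.
    return f'way["{key}"]({bbox});\nout geom;'


if __name__ == "__main__":
    # Placeholder area and tag, purely for illustration.
    print(ways_with_geometry("railway", 50.7, 7.1, 50.8, 7.2))
```

With XML output, each way in the result should then carry its coordinates
itself, so no separate recursion to the member nodes is needed.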
== Augmented diffs ==
The Augmented Diffs are redesigned to be always generated on the fly.
This is because the Augmented Diffs have been piling up on the server to
almost a terabyte of data and we are running out of space. A second
advantage is that generating them no longer blocks applying the
diffs to the current database.
You can access them via e.g.
http://overpass-api.de/api_0750/augmented_diff?id=905039
(by changing the id to the number you actually need).
There are also advantages to the users:
Augmented diffs now carry a number that can be computed directly
from the desired date. Number 1 starts at 2012-09-12T06:56:00Z, the
effective license change date. Expressed in seconds since the epoch,
Augmented Diff n always covers the interval from
1347432900 + 60 * n
until one minute later. Conversely, a timestamp t in seconds since the
epoch belongs to Augmented Diff number (t - 1347432900) / 60, rounded
down. You can get the current Augmented Diff with the call
http://overpass-api.de/api_0750/augmented_diff_status
which is deduced from
http://overpass-api.de/api_0750/timestamp
giving the date of the current database state.
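
The numbering scheme can be sketched in a few lines of Python. This is only
an illustration under the reading that Augmented Diff n starts at epoch second
1347432900 + 60 * n, which matches the statement that number 1 starts at
2012-09-12T06:56:00Z. The function names are invented for this sketch; only
the URL and the offset come from this mail.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Offset quoted in the mail: epoch second of 2012-09-12T06:55:00Z.
OFFSET = 1347432900
# Endpoint quoted in the mail for the test installation.
BASE_URL = "http://overpass-api.de/api_0750/augmented_diff"


def diff_id_for(moment: datetime) -> int:
    """Number of the Augmented Diff whose minute contains `moment`.

    `moment` must be timezone-aware; a naive datetime would be
    interpreted in local time by timestamp().
    """
    return (int(moment.timestamp()) - OFFSET) // 60


def interval_of(diff_id: int) -> Tuple[datetime, datetime]:
    """Start and end (UTC) of the minute covered by Augmented Diff `diff_id`."""
    start = datetime.fromtimestamp(OFFSET + 60 * diff_id, tz=timezone.utc)
    return start, start + timedelta(minutes=1)


def diff_url(diff_id: int) -> str:
    """URL of one Augmented Diff, in the form shown in the mail."""
    return f"{BASE_URL}?id={diff_id}"


if __name__ == "__main__":
    start, end = interval_of(905039)
    print(start.isoformat(), end.isoformat())
    print(diff_url(diff_id_for(start)))
```

Under this reading, the example id 905039 used above covers the minute
starting at 2014-06-02T18:54:00Z.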
A second advantage is that filtering now makes sense. The full Overpass
QL language can be used for filtering on Augmented Diffs, and it almost
always makes the request faster. For that purpose take the request from
http://overpass-api.de/api_0750/augmented_diff?id=905039&debug=yes
and adapt it to your needs.
A third advantage is that you can request arbitrary timeframes, not
only single minutes.
However, the price to pay is that these changes are not completely
backwards compatible; I rely on the caveat that the format was
experimental. The "info" elements are no longer available, in favour of
the geometry features of ways and relations. The latter is the more
widely adopted format, and offering the "info" variant would have taken
too much time to implement again.
== Attic data ==
This is a completely new kind of feature, and complements the "Historic
dumps" of OSM data. I prefer the term "attic data", as in version
control conventions, to avoid confusion with mapping of historical features.
You can run a query against the database state as it was at an
arbitrary date in the past. Just put
[date:"2014-06-02T20:00:00Z"];
at the front of your query.
In a similar way, you can get what has changed in the results by adding
[diff:"2014-06-02T20:00:00Z"];
or
[diff:"2014-06-02T20:00:00Z","2014-06-02T20:00:00Z"];
in front. The first form implicitly uses the current time as the second date.
If you also need deletion dates, you can use
[adiff:"2014-06-02T20:00:00Z"];
or
[adiff:"2014-06-02T20:00:00Z","2014-06-02T20:00:00Z"];
In this case, you will get meta information on objects that would
otherwise be missing from the result: either they have been deleted,
bearing "visible=false", or they still exist but have dropped out of the
scope of the request ("visible=true").
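
To show how the three settings fit together, here is a minimal Python sketch
that prepends one of them to a query. It only builds the query text. The
helper name, the sample query and the second date are made up for the
example, and the helper assumes that the query does not already start with
its own settings statement.

```python
from typing import Optional


def with_time_setting(query: str, keyword: str, first: str,
                      second: Optional[str] = None) -> str:
    """Prepend a [date:...], [diff:...] or [adiff:...] setting to `query`.

    Dates are passed through unchanged and are expected in the form
    used in the mail, e.g. "2014-06-02T20:00:00Z".
    """
    if keyword not in ("date", "diff", "adiff"):
        raise ValueError(f"unknown setting: {keyword}")
    if keyword == "date" and second is not None:
        raise ValueError("[date:...] takes exactly one timestamp")
    # With one date, diff/adiff compare against the current time.
    dates = f'"{first}"' if second is None else f'"{first}","{second}"'
    return f"[{keyword}:{dates}];\n{query}"


if __name__ == "__main__":
    sample = "node(1);\nout meta;"  # placeholder query
    print(with_time_setting(sample, "date", "2014-06-02T20:00:00Z"))
    print(with_time_setting(sample, "adiff",
                            "2014-06-02T20:00:00Z", "2014-06-03T20:00:00Z"))
```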
The data is reliable for 02nd June 2014 or more recent. For older
dates, some objects might be damaged. There were bugs in the update
code, and as the database stores deltas to recent data, these bugs have
messed up some data that became attic before 02nd June 2014.
I will resolve these issues with a whole database replay in July. As
such a database replay, processing two years of OSM editing, will take
about a month, I would like to spot any further data consistency bugs
before then. So please feel free to report any bugs you find.
Best regards,
Roland