[OSM-talk] Overpass API v0.7.50 almost done

Roland Olbricht roland.olbricht at gmx.de
Sat Jun 7 08:03:54 UTC 2014


Dear all,

first of all a big thank you to all who have contributed to the SSD 
funding or to the general FOSSGIS funding for Overpass API.

Before SotM-EU 2014 I will publish a new stable Overpass API release, 
the first in about a year. To avoid confusion, I would like to sketch 
what it does and what it doesn't, and how this relates to the running 
service.

Proper documentation will be written along with the workshop slides 
for SotM-EU.

In particular, I ask you to test the attic feature now, so that bugs 
are detected in time to be fixed in the database re-run during July.


== Summary for the impatient ==

For current data, the Overpass API is and will be completely reliable. 
Most new features for current data will be postponed to the next 
version, because they don't affect the urgent database rebuild.

For Augmented Diffs, I will change the mode of operation, mostly 
because the current approach has intrinsic stability problems. The last 
major incident happened on 18th April this year. The new Augmented Diff 
mode can be tested now, and the old one should continue to run until 
the end of July.

Attic data is a completely new feature. The underlying data has 
basically been available since the license change in September 2012. 
But due to two detected and possibly other undetected bugs, attic data 
from before 02nd June 2014 might be damaged.


== Current data ==

Queries on current data are so well adopted and in such widespread use 
that I will not break backward compatibility. Essentially, only two 
features have been added:

Sparse queries are now faster. An example is
http://overpass-turbo.eu/s/3FS
In the past, that query would have taken hours with such a large 
bounding box. Now it should run in less than 15 seconds. There is still 
room for improvement: regular expressions are not yet that fast, but 
this is postponed to the next version.

Second, features like ways and relations can expose their geometry 
directly on the feature. This is triggered by adding the word "geom" to 
the "out" statement. This currently works only for XML output. For JSON 
output, the final format is being discussed at
https://github.com/drolbr/Overpass-API/issues/93
It should mimic GeoJSON as closely as makes sense. Once again, the 
feature is very useful, but doesn't affect the database debugging. Thus 
the JSON variant is postponed.
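
To illustrate what "geometry directly on the feature" can mean for a 
client, here is a minimal Python sketch. It assumes that with "out geom" 
each "nd" child of a way carries "lat" and "lon" attributes; that layout 
and the sample document are assumptions for illustration, not taken from 
the release, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the lat/lon attributes on <nd> are an ASSUMED layout.
SAMPLE = """<osm>
  <way id="1">
    <nd ref="10" lat="50.0" lon="7.0"/>
    <nd ref="11" lat="50.1" lon="7.1"/>
  </way>
</osm>"""


def way_geometries(xml_text):
    """Map each way id to its list of (lat, lon) pairs, read from the way itself."""
    root = ET.fromstring(xml_text)
    result = {}
    for way in root.iter("way"):
        result[int(way.get("id"))] = [
            (float(nd.get("lat")), float(nd.get("lon")))
            for nd in way.findall("nd")
            # Skip node references that carry no coordinates.
            if nd.get("lat") is not None and nd.get("lon") is not None
        ]
    return result
```

The point of the sketch is that no separate lookup of node elements is 
needed once the coordinates sit on the way.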


== Augmented diffs ==

The Augmented Diffs have been redesigned to always be generated on the 
fly. This is because the stored Augmented Diffs have piled up on the 
server to almost a terabyte of data, and we are running out of space. A 
second advantage is that generating them no longer blocks applying the 
diffs to the current database.

You can access them via e.g.
http://overpass-api.de/api_0750/augmented_diff?id=905039
(by changing the id to the number you actually need).

There are also advantages to the users:

Augmented Diffs now carry a number that can be computed 
straightforwardly from the desired date. Number 1 starts at 
2012-09-12T06:56:00Z, the effective license change date. Expressed in 
seconds since the epoch, the date interval of Augmented Diff n always 
runs from
n * 60 + 1347432900
until one minute later. Conversely, the number for a timestamp t is
(t - 1347432900) / 60
You can get the current Augmented Diff with the call
http://overpass-api.de/api_0750/augmented_diff_status
which is deduced from
http://overpass-api.de/api_0750/timestamp
giving the date of the current database state.
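
The numbering rule can be sketched in Python as follows. This is only 
an illustration, not code from the release: the function names are made 
up, and the offset 1347432900 is read as 2012-09-12T06:55:00Z, so that 
number 1 starts one minute later as stated above.

```python
from datetime import datetime, timezone

# Seconds since the epoch for 2012-09-12T06:55:00Z; diff number 1
# therefore starts at 2012-09-12T06:56:00Z.
AUGMENTED_DIFF_OFFSET = 1347432900


def diff_id_for(moment):
    """Number of the Augmented Diff whose minute contains `moment`.

    `moment` must be a timezone-aware datetime; a naive one would be
    interpreted in local time.
    """
    return (int(moment.timestamp()) - AUGMENTED_DIFF_OFFSET) // 60


def diff_interval(diff_id):
    """Return (start, end) of Augmented Diff `diff_id` as UTC datetimes."""
    start = AUGMENTED_DIFF_OFFSET + 60 * diff_id
    return (
        datetime.fromtimestamp(start, tz=timezone.utc),
        datetime.fromtimestamp(start + 60, tz=timezone.utc),
    )
```

With this reading, diff_interval(905039) starts at 2014-06-02T18:54:00Z.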

A second advantage is that filtering now makes sense. The full Overpass 
QL language can be used for filtering on Augmented Diffs, and it almost 
always makes the request faster. For that purpose take the request from
http://overpass-api.de/api_0750/augmented_diff?id=905039&debug=yes
and adapt it to your needs.
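
For scripted access, the two URL forms shown above can be assembled as 
in this small Python sketch. The helper name is hypothetical; only the 
base URL and the "id" and "debug=yes" parameters come from this mail.

```python
from urllib.parse import urlencode

# Test instance named in this mail.
BASE_URL = "http://overpass-api.de/api_0750/augmented_diff"


def augmented_diff_url(diff_id, debug=False):
    """Build the URL for one Augmented Diff.

    With debug=True, the "debug=yes" parameter is appended, which is the
    form to take the underlying request from.
    """
    params = {"id": diff_id}
    if debug:
        params["debug"] = "yes"
    return BASE_URL + "?" + urlencode(params)
```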

A third advantage is that you can request arbitrary timeframes, not 
only minutes.

However, the price to pay is that these changes are not completely 
backwards compatible; I rely on the caveat that the format was 
experimental. The "info" elements are no longer available, in favour of 
the geometry features of ways and relations. The geometry format is the 
more widely adopted one, and offering the "info" variant would have 
taken too much time to implement again.


== Attic data ==

This is a completely new kind of feature, and it complements the 
"Historic dumps" of OSM data. I prefer the term "attic data", as in 
version control conventions, to avoid confusion with the mapping of 
historical features.

You can run a query against the database state as it was at an 
arbitrary date in the past. Just put
[date:"2014-06-02T20:00:00Z"];
in front of your query.
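
If the date comes from a program rather than being typed by hand, the 
setting can be produced as in this Python sketch. The function name is 
made up, and the sample query "node(1);out;" is just a placeholder.

```python
from datetime import datetime, timezone


def with_date(query, moment):
    """Prefix an Overpass QL query with a [date:"..."] setting.

    `moment` must be a timezone-aware datetime; it is converted to UTC
    and written in the same form as the example above.
    """
    stamp = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'[date:"{stamp}"];\n{query}'


# Example: the placeholder query against the state of 2014-06-02T20:00:00Z.
example = with_date(
    "node(1);out;",
    datetime(2014, 6, 2, 20, 0, 0, tzinfo=timezone.utc),
)
```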

In a similar way, you can get what has changed in the results by adding
[diff:"2014-06-02T20:00:00Z"];
or
[diff:"2014-06-02T20:00:00Z","2014-06-02T20:00:00Z"];
in front. The first form implicitly takes the current time as the 
second timestamp.

If you additionally need deletion dates, you can use
[adiff:"2014-06-02T20:00:00Z"];
or
[adiff:"2014-06-02T20:00:00Z","2014-06-02T20:00:00Z"];
In this case, you will get meta information on otherwise gone objects: 
either they have been deleted, bearing "visible=false", or they still 
exist but have gone out of scope of the request ("visible=true").
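
The one- and two-timestamp forms of "diff" and "adiff" differ only in 
the keyword and the optional second date, which this Python sketch 
makes explicit. The function and parameter names are hypothetical; only 
the bracket syntax is taken from the examples above.

```python
from typing import Optional


def diff_setting(start: str, end: Optional[str] = None, augmented: bool = False) -> str:
    """Build a [diff:...] or [adiff:...] setting.

    If `end` is omitted, the one-timestamp form is produced, which
    compares against the current time.
    """
    keyword = "adiff" if augmented else "diff"
    stamps = f'"{start}"' if end is None else f'"{start}","{end}"'
    return f"[{keyword}:{stamps}];"
```

The returned string goes in front of the query, just like the "date" 
setting.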

The data is reliable from 02nd June 2014 onwards. For older versions, 
some objects might be damaged. There were bugs in the update code, and 
as the database stores deltas to recent data, these bugs have corrupted 
some data that became attic before 02nd June 2014.

I will resolve these issues with a whole database replay in July. As 
such a database replay, processing two years of OSM editing, will take 
about a month, I would like to spot any further data consistency bugs 
beforehand. So please feel free to report any bugs you find.


Best regards,

Roland


