[OSM-dev] OSM History Retriever

Martijn van Exel mvexel at gmail.com
Thu Jul 29 10:02:41 BST 2010


Hi Brett,

What kind of trouble do you envisage when doing a bbox operation on a full history dump? I guess the movement of features over time makes an accurate determination of what is and what isn't in the bounding box less trivial, but for my purposes having all the historical data for which the current version is within the given bbox would be adequate.
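
To make that criterion concrete, here is a minimal Python sketch of "keep the whole history of every node whose current version is inside the bbox". It is not how osmosis works. The function name, the sample data and the approximate Netherlands bounding box are all made up for illustration, and it reads everything into memory, which would not scale to the real 13GB dump.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature history file: two nodes, two versions each.
SAMPLE = """<osm version="0.6">
  <node id="1" version="1" visible="true" lat="52.00" lon="4.00"/>
  <node id="1" version="2" visible="true" lat="52.10" lon="5.00"/>
  <node id="2" version="1" visible="true" lat="52.00" lon="5.00"/>
  <node id="2" version="2" visible="true" lat="48.00" lon="2.00"/>
</osm>"""


def history_for_current_bbox(osm_xml, left, bottom, right, top):
    """Return {node id: [version numbers]} for nodes whose latest
    version lies inside the bounding box."""
    history = defaultdict(list)
    for node in ET.fromstring(osm_xml).iter("node"):
        history[node.get("id")].append(node)

    kept = {}
    for node_id, versions in history.items():
        versions.sort(key=lambda n: int(n.get("version")))
        current = versions[-1]
        # Assumption: a deleted current version (visible="false") or one
        # without coordinates is treated as outside the box.
        if current.get("visible") == "false" or current.get("lat") is None:
            continue
        lat = float(current.get("lat"))
        lon = float(current.get("lon"))
        if left <= lon <= right and bottom <= lat <= top:
            kept[node_id] = [int(v.get("version")) for v in versions]
    return kept


# Rough, assumed bounding box around the Netherlands.
result = history_for_current_bbox(SAMPLE, left=3.3, bottom=50.7, right=7.3, top=53.6)
print(result)
```

Node 2 started out inside the box but its current version is outside, so its whole history is dropped. That is the kind of inaccuracy I could live with.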

I would not hesitate to get some of my team members busy with adapting osmosis to deal with full history planet extracts better if necessary, but I'd love to hear from you what the likely caveats are.

Best, 
Martijn 

Sent from my iPad

On Jul 23, 2010, at 2:20 PM, Brett Henderson <brett at bretth.com> wrote:

> On Thu, Jul 22, 2010 at 6:14 PM, Martijn van Exel <mvexel at gmail.com> wrote:
> 
> On 22 jul 2010, at 06:57, Brett Henderson wrote:
> 
>> On Wed, Jul 21, 2010 at 11:52 PM, Lars Francke <lars.francke at gmail.com> wrote:
>> [...]
>> 
>> That format is fine and exactly what I would have expected.  I suspect Osmosis would parse it okay, but without support for the visible attribute it won't be particularly useful.
>>  
> Not for visualization purposes maybe, but for analysis purposes the visible attribute is not really an issue. My goal is to extract full history dumps for certain spatial extents and import them into a PostGIS database, in order to calculate historical metrics exposing the crowd dynamics of OSM: for example number of contributors over time, version growth over time, and movement of nodes over time, all calculated per grid cell.
> 
> The full history dump is 13GB bz2 compressed. Anyone got a rough idea how long it would take for osmosis to extract, say, a bbox of the Netherlands out of that on a 4GB AMD Opteron quad core machine? More RAM would probably help?
> 
> Osmosis is unlikely to work well on a full history dump.  The --bounding-box task is really only designed to work with data from a single point in time.  Accurate bounding box filtering is much more difficult for data spanning a time range, although the result might be good enough.  A bigger issue is that Osmosis will ignore visible attributes and strip them from the output.
> 
> A relatively small amount of RAM is used for a single --bounding-box task if you specify the idTrackerType=BitSet option.
> 
> Brett
> 
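
For reference, the idTrackerType=BitSet option Brett mentions is passed as an argument to the --bounding-box task. The Python sketch below only assembles such a command line and prints it. The file names and the approximate Netherlands coordinates are assumptions for illustration, and as noted above the task is designed for data from a single point in time rather than a full history dump.

```python
import shlex


def build_bbox_command(source, target, left, bottom, right, top):
    """Assemble an osmosis argument list for a bounding-box extract
    that uses the BitSet id tracker."""
    return [
        "osmosis",
        "--read-xml", f"file={source}",
        "--bounding-box",
        f"left={left}", f"bottom={bottom}",
        f"right={right}", f"top={top}",
        "idTrackerType=BitSet",
        "--write-xml", f"file={target}",
    ]


# Hypothetical file names and a rough bounding box around the Netherlands.
cmd = build_bbox_command("input.osm", "netherlands.osm", 3.3, 50.7, 7.3, 53.6)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list could be handed to subprocess.run on a machine where osmosis is installed.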