[OSM-dev] OSM PBF and spatial characteristics of blocks

Wed Jan 6 12:48:07 UTC 2016

Hi Andrew,

The underlying storage looks useful indeed. My current idea is to change the indexing in Vanilla Extract as follows (and later add parts from TileMaker as well):

- Implement an adaptation of a MODIS grid, in order to have rectangular grid cells that serve as simple indexing and storage containers for unprojected data in the WGS84 datum. The size of the cells should be configurable, to find a sweet spot between data size for reads and update performance for writes (since we must rewrite whole cells later on).
- Create an in-memory rtree index for all grid cells
- For all geometries, check during creation of the storage whether they overlap other cells, using the rtree index. If they do overlap, write a relationship record (I'll probably use an SQLite db, also for the cell storage). So when you later extract a cell at row 18, col 20, this cell may have a relationship to some cell at row 25, col 23, for example. You'd load that cell as well when extracting the data.
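
A rough sketch of the steps above, just to make the idea concrete. Everything in it is an assumption on my side and not existing Vanilla Extract or TileMaker code: the names (cell_of, cells_for_bbox, build_store, cells_to_load), the table layout, and the 1 degree cell size are made up for illustration. Geometries are reduced to bounding boxes, which overestimates overlaps. Because the grid is regular, the sketch computes the overlapped cells arithmetically instead of querying an rtree, and it ignores boxes that cross the antimeridian. Each geometry is stored once in a "home" cell, and every other overlapped cell gets a relationship row pointing back to that home cell.

```python
import math
import sqlite3

CELL_DEG = 1.0  # assumed cell size in degrees; meant to be configurable


def cell_of(lon, lat, cell_deg=CELL_DEG):
    """Return (row, col) of the grid cell containing an unprojected WGS84 point."""
    n_cols = math.ceil(360.0 / cell_deg)
    n_rows = math.ceil(180.0 / cell_deg)
    col = min(int(math.floor((lon + 180.0) / cell_deg)), n_cols - 1)
    row = min(int(math.floor((lat + 90.0) / cell_deg)), n_rows - 1)
    return row, col


def cells_for_bbox(min_lon, min_lat, max_lon, max_lat, cell_deg=CELL_DEG):
    """List every cell touched by a bounding box, lowest (row, col) first."""
    r0, c0 = cell_of(min_lon, min_lat, cell_deg)
    r1, c1 = cell_of(max_lon, max_lat, cell_deg)
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def build_store(geometries, cell_deg=CELL_DEG):
    """geometries: iterable of (geom_id, (min_lon, min_lat, max_lon, max_lat))."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cell_geom (cell_row INT, cell_col INT, geom_id INT)")
    db.execute(
        "CREATE TABLE cell_rel (cell_row INT, cell_col INT, rel_row INT, rel_col INT,"
        " PRIMARY KEY (cell_row, cell_col, rel_row, rel_col))"
    )
    for geom_id, bbox in geometries:
        cells = cells_for_bbox(*bbox, cell_deg)
        home = cells[0]  # the geometry itself is stored only in this cell
        db.execute("INSERT INTO cell_geom VALUES (?, ?, ?)", (*home, geom_id))
        for other in cells[1:]:
            # every other overlapped cell points back to the home cell
            db.execute(
                "INSERT OR IGNORE INTO cell_rel VALUES (?, ?, ?, ?)", (*other, *home)
            )
    return db


def cells_to_load(db, row, col):
    """The requested cell plus all cells it has a relationship to."""
    related = db.execute(
        "SELECT rel_row, rel_col FROM cell_rel WHERE cell_row = ? AND cell_col = ?",
        (row, col),
    ).fetchall()
    return [(row, col)] + sorted(related)
```

With this layout, extracting a cell means calling cells_to_load() first and then reading cell_geom for each returned cell. Storing the geometry once keeps cell rewrites small, at the cost of an extra lookup on reads.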

Something like this. What do you think?

Ben 

Sent from my iPad

> On 06.01.2016 at 13:13, Andrew Byrd <andrew at fastmail.net> wrote:
> 
> 
>> On 06 Jan 2016, at 03:40, Stadin, Benjamin <Benjamin.Stadin at heidelberg-mobil.com> wrote:
>> Does your Vanilla Extract consider overlapping polygons? Like if you export a small area within a country, does it add the country's polygon that overlaps the area? 
>> It looks pretty interesting though. I'm not sure where to start yet, but I think it will be good to combine features from TileMaker and Vanilla Extract. 
> 
> Our spatial indexing is rather crude and tile-based. This is intentional to keep it small and simple. We have a grid of cells which correspond to the web mercator tiles at a single zoom level, and every OSM object is assigned to one tile only. This is problematic for objects that span multiple tiles. Also note that free-floating nodes which are not included in any way are not reachable using the current index. For our applications we just haven’t needed to index free-floating POI nodes yet, and don’t need large administrative borders or huge area polygons.
> 
> Obviously in the long term we’ll want to improve the index to handle these cases. Both of these limitations should be straightforward to overcome. To index large polygons as areas, you’d either need some kind of multi-level index (rectangle tree or “pyramids”) or just accept rasterizing area polygons into all the index cells they overlap (a polygon’s ID appearing repeatedly, in every tile it overlaps).
> 
> So the indexing system would need some work for your application. But I thought the two underlying storage systems for OSM data could be useful to you.
> 
> -Andrew
>