[Talk-ca] Dealing with huge features

Frank Steggink steggink at steggink.org
Tue Nov 10 03:47:47 GMT 2009


Frank Steggink wrote:
> Hi,
>
> While testing out the Python version of canvec-to-osm, I came across a
> couple of huge features. Wooded areas in particular tend to grow
> large.
>
> The second file I looked at (NTS tile 021L03) was already 3.1 MB,
> even though I set maxnodes to 2000 in shp-to-osm. Most of it is taken
> up by a giant multipolygon, which contains 321 members and more
> than 30k nodes. When I opened this file in JOSM, it really
> struggled with it. Uploading this would be a real nightmare.
>
> Since the feature occupies less than half of the NTS tile, there would
> even be room for several such features. In that scenario it is easy to
> imagine the node limit for getting data from the OSM server being
> exceeded. I don't think this is a desirable situation, but I don't know
> of a clear solution for it.
>
> Although splitting up the features is not a good idea, it would at least
> provide a means to upload the data in smaller chunks and to
> retrieve part of the data, provided that the tile doesn't exceed the
> server limit. Hopefully JOSM would also perform better.
>
> For those curious, I have uploaded this OSM file here: [1]. It is part 
> of this area: [2].
> Anyway, check the file out for yourself, and please share any ideas on how
> we should deal with a situation like this.
>
> In the meantime I noticed that NTS tile 021L10 even contains a 4.1 MB
> file. At roughly 10k nodes per MB, this would mean that
> the multipolygon members together contain at least 40k nodes...
>
> Regarding the Python script: the first version is nearly complete. I 
> want to make a couple of small changes to it, and also check if 
> everything looks OK in JOSM, and perhaps generate a couple of tiles from 
> it (locally). Once that is done, I'll make it available to whoever is 
> interested. It takes roughly 35 minutes to convert all features (except
> highways and hydro) in NTS tile 021L. Running shp-to-osm takes most of
> that time. The script also downloads any missing Canvec SHP files, but
> for this run they had already been downloaded.
>
> Cheers,
>
> Frank
>
> [1] http://www.steggink.org/osm/Canvec_test/021l03_VE_1240009_2_Wooded_area10.osm.zip
> [2] http://www.openstreetmap.org/?minlon=-71.5&minlat=46&maxlon=-71&maxlat=46.25&box=yes
>
By the way, the rules file I used still defines tags for the inner
polygons. I don't think that matters much for performance.
The other file I checked has 43200 nodes, so the huge
multipolygon contains at least 41201 nodes. It has 745 multipolygon members.
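
For anyone who wants to check numbers like these on their own files, here
is a minimal sketch in Python. It assumes a plain OSM XML file (version
0.6 layout) and only uses the standard library. The function names and the
tiny sample document are made up for illustration and are not part of
canvec-to-osm or shp-to-osm. The lower bound on the nodes in the largest
feature also rests on an assumption: that everything else in the file
together stays below maxnodes nodes.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, tiny OSM document used only to demonstrate the functions.
SAMPLE_OSM = """<osm version="0.6">
  <node id="-1" lat="46.00" lon="-71.50"/>
  <node id="-2" lat="46.10" lon="-71.40"/>
  <node id="-3" lat="46.20" lon="-71.30"/>
  <way id="-4"><nd ref="-1"/><nd ref="-2"/><nd ref="-1"/></way>
  <way id="-5"><nd ref="-2"/><nd ref="-3"/><nd ref="-2"/></way>
  <relation id="-6">
    <member type="way" ref="-4" role="outer"/>
    <member type="way" ref="-5" role="inner"/>
    <tag k="type" v="multipolygon"/>
    <tag k="natural" v="wood"/>
  </relation>
</osm>"""


def summarize_osm(source):
    """Return (node count, {relation id: member count}) for an OSM XML file.

    `source` may be a file name or an open file-like object.
    Only relations tagged type=multipolygon are reported.
    """
    root = ET.parse(source).getroot()
    node_count = len(root.findall("node"))
    member_counts = {}
    for relation in root.findall("relation"):
        tags = {tag.get("k"): tag.get("v") for tag in relation.findall("tag")}
        if tags.get("type") == "multipolygon":
            member_counts[relation.get("id")] = len(relation.findall("member"))
    return node_count, member_counts


def min_nodes_in_largest_feature(total_nodes, maxnodes):
    """Lower bound on the node count of the one oversized feature.

    Assumption: all other features in the file together hold at most
    maxnodes - 1 nodes.
    """
    return total_nodes - (maxnodes - 1)


if __name__ == "__main__":
    print(summarize_osm(io.StringIO(SAMPLE_OSM)))
    print(min_nodes_in_largest_feature(43200, 2000))
```

With 43200 nodes in the file and maxnodes set to 2000, the second function
gives the 41201 mentioned above. Note that ET.parse loads the whole file
into memory, which should be fine for files of a few MB like these.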

Frank



