[Talk-ca] Dealing with huge features
Frank Steggink
steggink at steggink.org
Tue Nov 10 03:47:47 GMT 2009
Frank Steggink wrote:
> Hi,
>
> While testing out the Python version of canvec-to-osm, I came across a
> couple of huge features. Especially wooded areas have the tendency to
> grow large.
>
> The second file I looked at (NTS tile 021L03) was already 3.1 MB,
> even though I set maxnodes to 2000 in shp-to-osm. It was mostly
> occupied by a giant multipolygon, which contains 321 members and more
> than 30k nodes. When I opened this file in JOSM, it was really
> struggling with it. Uploading this would be a real nightmare.
>
> Since the feature occupied less than half of the NTS tile, there would
> even be room for several such features. In this scenario it is easy to
> imagine that the nodes limit for getting data from the OSM server is
> exceeded. I don't think this is a desirable situation, but I don't know
> of a clear way to deal with this.
>
> Although splitting up the features is not a good idea, it would at least
> provide a means to upload the data in smaller chunks, and be able to
> retrieve a part of the data, provided that the tile doesn't exceed the
> server limit. Hopefully JOSM would also be more performant.
>
> For those curious, I have uploaded this OSM file here: [1]. It is part
> of this area: [2].
> Anyway, check the file out for yourself, and please share any ideas on how
> we should deal with a situation like this.
>
> In the meantime I noticed that NTS tile 021L10 even contains a 4.1 MB
> file. If there are roughly 10k nodes per MB, this would mean that
> all multipolygon members together would contain at least 40k nodes...
>
> Regarding the Python script: the first version is nearly complete. I
> want to make a couple of small changes to it, and also check if
> everything looks OK in JOSM, and perhaps generate a couple of tiles from
> it (locally). Once that is done, I'll make it available to whoever is
> interested. It takes roughly 35 mins to convert all features (except
> highways and hydro) in NTS tile 021L. Running shp-to-osm takes most of
> that time. The script also downloads any missing Canvec SHP files, but
> in this run they had already been downloaded.
>
> Cheers,
>
> Frank
>
> [1]
> http://www.steggink.org/osm/Canvec_test/021l03_VE_1240009_2_Wooded_area10.osm.zip
> [2]
> http://www.openstreetmap.org/?minlon=-71.5&minlat=46&maxlon=-71&maxlat=46.25&box=yes
>
By the way, the rules file I used still defines tags for the inner
polygons. I don't think that matters much for performance.
The other file I checked has 43200 nodes, so with maxnodes set to 2000
the huge multipolygon contains at least 41201 nodes. It has 745
multipolygon members.
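
For anyone who wants to check such numbers on their own files, here is a
rough sketch in Python (not part of canvec-to-osm). It counts the nodes in
an OSM XML file and the members of each multipolygon relation. The function
names are made up for illustration. It assumes that the relations carry a
type=multipolygon tag, and that a file only exceeds maxnodes because of one
feature that cannot be split, so all other features together stay below
maxnodes. That second assumption is my reading of the numbers above, not
documented shp-to-osm behaviour.

```python
import io
import xml.etree.ElementTree as ET


def summarize_osm(source):
    """Return (node_count, {relation_id: member_count}) for an OSM XML file.

    `source` is a file name or a binary file object. Only relations tagged
    type=multipolygon are reported (assumption about the converter output).
    """
    node_count = 0
    members = {}
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            node_count += 1
            elem.clear()  # free memory; these files can hold 40k+ nodes
        elif elem.tag == "relation":
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if tags.get("type") == "multipolygon":
                members[elem.get("id")] = len(elem.findall("member"))
            elem.clear()
    return node_count, members


def min_nodes_in_largest_feature(total_nodes, maxnodes):
    """Lower bound for the one oversized feature in a file.

    Assumption: every other feature in the file fits in maxnodes - 1 nodes.
    """
    return total_nodes - (maxnodes - 1)


# Tiny hand-made example instead of a real Canvec file.
sample = b"""<osm version="0.6">
  <node id="-1" lat="46.0" lon="-71.5"/>
  <node id="-2" lat="46.1" lon="-71.4"/>
  <node id="-3" lat="46.2" lon="-71.3"/>
  <relation id="-10">
    <member type="way" ref="-20" role="outer"/>
    <member type="way" ref="-21" role="inner"/>
    <tag k="type" v="multipolygon"/>
  </relation>
</osm>"""

nodes, multipolygons = summarize_osm(io.BytesIO(sample))
print(nodes, multipolygons)
print(min_nodes_in_largest_feature(43200, 2000))
```

To run it on a real file, pass the path of the unzipped .osm file to
summarize_osm() instead of the sample bytes.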
Frank