[OSM-dev] Update of TIGER ruby import code

Tue Jul 10 00:25:53 BST 2007

Just as an FYI, osmosis has a sorting task that works on huge files 
(i.e. planets).  It uses a custom-written, file-based merge sort and 
will sort an entire planet in just over an hour.

Command line is:

osmosis --read-xml file="data.osm" --sort type="TypeThenId" --write-xml file="data-sorted.osm"
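
For anyone curious how a file-based merge sort copes with files that
don't fit in memory, here is a rough Python sketch of the general
technique.  It is not the osmosis code: the (kind, id) tuples, the
TYPE_RANK ordering and the chunk_size parameter are all assumptions
made up for illustration.

```python
import heapq
import os
import tempfile

# Assumed type ordering for illustration only: nodes, segments, ways.
TYPE_RANK = {"node": 0, "segment": 1, "way": 2}


def sort_key(entity):
    """Order by entity type first, then by numeric id."""
    kind, ident = entity
    return (TYPE_RANK[kind], ident)


def _write_run(entities):
    """Sort one in-memory chunk and spill it to a temporary file."""
    entities.sort(key=sort_key)
    f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".run")
    with f:
        for kind, ident in entities:
            f.write(f"{kind} {ident}\n")
    return f.name


def _read_run(path):
    """Stream a sorted run back from disk, one entity at a time."""
    with open(path) as f:
        for line in f:
            kind, ident = line.split()
            yield (kind, int(ident))


def external_sort(entities, chunk_size=100_000):
    """Yield entities in type-then-id order.

    Only about chunk_size entities are held in memory while building
    runs; the merge step keeps one entity per run.
    """
    runs, chunk = [], []
    for entity in entities:
        chunk.append(entity)
        if len(chunk) >= chunk_size:
            runs.append(_write_run(chunk))
            chunk = []
    if chunk:
        runs.append(_write_run(chunk))
    try:
        # k-way merge of the sorted runs
        yield from heapq.merge(*(_read_run(p) for p in runs), key=sort_key)
    finally:
        for p in runs:
            os.remove(p)
```

Calling list(external_sort(data, chunk_size=2)) on a handful of
entities forces several runs to disk, which is an easy way to see the
merge step at work.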

Apologies for the noise, but this may be useful.  From memory, JOSM is 
one of those generators that will write entities out of order ...

Al Wold wrote:
> So, does it seem like it would be a good idea to clamp down the spec 
> and say that any entities that are referenced by id must be defined in 
> the file before they are referenced?  That would help to avoid 
> problems in the future, and I think it would be a lot easier for 
> generators to order the data than it would be to try to implement 
> something like my 5-pass algorithm.
>
> I'd like to avoid writing another version of this script that makes 
> shortcuts and doesn't work when you throw it a variation, but it does 
> seem like it will take quite a bit to make it work on unordered files 
> :).  I think for now, I will just have it generate a warning when it 
> encounters an undefined id, so it will work on ordered files, but at 
> least let the user know it is not working if it encounters a problem.
>
> On the subject of ways that go outside of the box, I think it makes 
> the most sense to have a command line option.  If you divide a large 
> file into sections, it might be nice to not have to merge ways from 
> adjacent boxes, so that can be an option to have it do another pass 
> and collect missing segments from any ways that are in the box.
>
> -Al
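
The single-pass check Al describes (warn when an id is referenced
before it has been defined) could look roughly like the Python sketch
below.  The (kind, id, refs) tuple layout and the check_references name
are hypothetical, not taken from the import script.

```python
import sys


def check_references(entities, warn=None):
    """Single pass: warn about any id referenced before it is defined.

    `entities` yields (kind, ident, refs) tuples, where refs is a list
    of (kind, ident) pairs the entity depends on.  This layout is an
    assumption for illustration.  Returns the number of undefined
    references seen.
    """
    if warn is None:
        warn = lambda msg: print(msg, file=sys.stderr)
    defined = set()
    missing = 0
    for kind, ident, refs in entities:
        for ref in refs:
            if ref not in defined:
                missing += 1
                warn(f"warning: {kind} {ident} references "
                     f"undefined {ref[0]} {ref[1]}")
        # An entity only counts as defined once it has been read.
        defined.add((kind, ident))
    return missing
```

On a file that is already sorted by type and then id this reports
nothing.  The set of seen ids still has to fit in memory, so for a full
planet it would need a more compact structure.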