[OSM-dev] Final kinks in osmosis planet dumping

Brett Henderson brett at bretth.com
Mon Sep 10 12:00:53 BST 2007


spaetz wrote:
>> planet.rb writes node attributes in the order id, lat, lon, timestamp.
>> osmosis writes node attributes in the order id, timestamp, user, lat, lon
>>     
>
> I wonder how much larger the resulting file becomes with user id. Given that restricting the number of decimal places in lat/lon saved us 20MB in the resulting bz2, this could be a noticeable amount. Do we fulfil our attribution requirements with this, or would we have to deliver a full list of users who modified that element in order to do that? Do people think the user id is important in planet.osm?
>   
I don't have a strong opinion either way.  I'd like to see it in there 
because the info is useful for reporting purposes.  For example I can 
produce stats for Victoria Australia to determine the list of active 
users ...

But if it causes problems I can add an argument to the --write-xml task 
to suppress the user attribute.
>   
>> **** 4. Additional indenting whitespace.
>> osmosis is currently using 4 space indenting, planet.rb is using 2 space 
>> indenting.
>> I can change osmosis to use 2 space indenting if it helps reduce file 
>> sizes.  Should I drop it to 1 space indenting to further reduce file size?
>>     
>
> Let's compare the bzipped file sizes to see whether it makes a noticeable difference (I would think not, but haven't tested it).
>   
I think we'll have to do this using prod data.  I don't have user info 
in my local database and setting them all to the same user will affect 
the compression ratio.
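
For what it's worth, here is a rough sketch of how the comparison could be
run once a representative dump is available.  The class and method names
are made up for illustration and are not part of osmosis.  The JDK has no
bzip2 support, so gzip stands in for it here; the real comparison should
use bzip2 on prod data, and the absolute numbers will differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not osmosis code.
class IndentSizeSketch {

    // Rewrites leading-space indentation from `from` spaces per level
    // to `to` spaces per level, leaving the rest of each line untouched.
    static String reindent(String xml, int from, int to) {
        String[] lines = xml.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int spaces = 0;
            while (spaces < line.length() && line.charAt(spaces) == ' ') {
                spaces++;
            }
            int levels = spaces / from;
            for (int j = 0; j < levels * to; j++) {
                out.append(' ');
            }
            out.append(line, spaces, line.length());
            if (i < lines.length - 1) {
                out.append('\n');
            }
        }
        return out.toString();
    }

    // Returns the gzip-compressed size in bytes (gzip is a stand-in for bzip2).
    static int gzippedSize(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.size();
    }
}
```

Feeding the same extract through reindent(xml, 4, 2) and reindent(xml, 4, 1)
and comparing gzippedSize for each would give a first indication, but only
an indication, of whether the indent width matters after compression.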
>> **** 5. Inclusion of database password on command line.
>> Currently the only way to provide a database password to osmosis is on 
>> the command line.  Presumably this will allow other users on the same 
>> system to see the password (through the use of ps, top, etc).  If this 
>> is a problem I'll have to update the database tasks to be able to read a 
>> properties file containing connection information.
>>     
>
> Mmh, every user on dev would now be able to see the password with which to connect to the db server. I would *prefer* if we could find a non-commandline variant of handling that. It is not critical by any means (I hope dev users are a responsible lot), but then...
>   
I'll definitely fix this.  The quickest fix is for me to allow the 
connection details to be loaded from a properties file, but kleptog's 
suggestion of a .osmosisrc file is probably a better long-term solution.
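
A minimal sketch of the properties file approach, assuming a file with the
keys host, database, user and password.  Those key names and the class name
are placeholders I've picked for illustration; nothing here reflects what
osmosis will actually use.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical helper, not osmosis code.
class DbAuthSketch {

    // Key names are assumptions made for this example.
    private static final String[] REQUIRED_KEYS = {
        "host", "database", "user", "password"
    };

    // Loads connection details from a properties source and fails early
    // if any of the expected keys is missing.
    static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        for (String key : REQUIRED_KEYS) {
            if (props.getProperty(key) == null) {
                throw new IllegalArgumentException("missing property: " + key);
            }
        }
        return props;
    }
}
```

In use this would be called with a java.io.FileReader on a file that only
the owning user can read, so the password no longer shows up in ps or top.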
>  
>   
>> **** 6. Minimum of Java 1.6
>> Dev currently has jdk1.5 installed preventing osmosis from running.  The 
>> only code requiring 1.6 is the --bounding-polygon task.
>> I have two options:
>> a. Rework osmosis to only require 1.5 (either by temporarily deleting 
>> the polygon task, creating a branch without the polygon task, reworking 
>> the polygon task to use older features, etc).
>> b. Upgrade dev to jdk 1.6.
>>     
>
> I upgraded dev's java to 1.6 and it works fine. So this is no issue for me any more.
>   
Excellent.  Thanks for that.
>   
>> **** 7. Different rounding of lat/lon coordinates.
>> The java DecimalFormat appears to be rounding numbers in a slightly 
>> different way to whatever planet.rb is using (sprintf?).  This only 
>> occurs when a 5 has to be rounded one way or the other.  I don't think 
>> this is an issue and short of writing my own decimal formatter I don't 
>> think there's much I can do about this one.
>>     
>
> I am doing a test dump on dev and will look at UTF-8 characters to see that it works out well on this box. Will let you know about the timing as well.
>   
Cool, I look forward to the results.
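
On the rounding difference in item 7: DecimalFormat rounds ties to the
nearest even digit (RoundingMode.HALF_EVEN) by default, and since Java 1.6
it has a setRoundingMode method.  I haven't confirmed what planet.rb
actually does, so the sketch below only shows how the two modes diverge on
a tie.  The two-decimal pattern and the class name are chosen for the
example and are not what osmosis uses.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demonstration, not osmosis code.
class RoundingSketch {

    // Formats a value to two decimal places with the given rounding mode.
    // Locale.ROOT keeps the decimal separator as '.' on every machine.
    static String format(double value, RoundingMode mode) {
        DecimalFormat df = new DecimalFormat(
            "0.00", DecimalFormatSymbols.getInstance(Locale.ROOT));
        df.setRoundingMode(mode);
        return df.format(value);
    }

    public static void main(String[] args) {
        // 0.125 is exactly representable as a double, so it is a true tie.
        System.out.println(format(0.125, RoundingMode.HALF_EVEN)); // 0.12
        System.out.println(format(0.125, RoundingMode.HALF_UP));   // 0.13
    }
}
```

Most coordinates are not exactly representable in binary, so whether a
given value counts as a tie depends on the stored double.  That may be why
the difference only shows up occasionally.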

Cheers,
Brett



