[OSM-dev] Final kinks in osmosis planet dumping
Brett Henderson
brett at bretth.com
Mon Sep 10 12:00:53 BST 2007
spaetz wrote:
>> planet.rb writes node attributes in the order id, lat, lon, timestamp.
>> osmosis writes node attributes in the order id, timestamp, user, lat, lon
>>
>
> I wonder how much larger the resulting file becomes with user id. Given that restricting the number of decimal places in lat/lon saved us 20MB in the resulting bz2, this could be a noticeable amount. Do we fulfil our attribution requirements with this, or would we have to deliver a full list of users who modified that element in order to do that? Do people think the user id is important in planet.osm?
>
I don't have a strong opinion either way. I'd like to see it in there
because the info is useful for reporting purposes. For example I can
produce stats for Victoria Australia to determine the list of active
users ...
But if it causes problems I can add an argument to the --write-xml task
to suppress the user attribute.
>
>> **** 4. Additional indenting whitespace.
>> osmosis is currently using 4 space indenting, planet.rb is using 2 space
>> indenting.
>> I can change osmosis to use 2 space indenting if it helps reduce file
>> sizes. Should I drop it to 1 space indenting to further reduce file size?
>>
>
> Let's compare the bzipped file size to see whether it makes a noticeable difference (I would think not, but haven't tested it).
>
I think we'll have to do this using prod data. I don't have user info
in my local database and setting them all to the same user will affect
the compression ratio.
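
In the meantime, here is a rough sketch of how the indenting comparison could be scripted. It is only an illustration: the sample() data is synthetic (invented ids and coordinates, no user info), and it uses GZIP from the Java standard library as a stand-in because bzip2 is not in the JDK. The class and method names are made up for this example, and the numbers it prints say nothing about what a real planet dump would do.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

class IndentSizeCheck {
    // Builds a synthetic run of <node> elements, each indented by `indent` spaces.
    static String sample(int nodes, int indent) {
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < indent; i++) {
            pad.append(' ');
        }
        StringBuilder sb = new StringBuilder("<osm>\n");
        for (int i = 0; i < nodes; i++) {
            sb.append(pad)
              .append("<node id=\"").append(i)
              .append("\" lat=\"-37.").append(1000000 + i)
              .append("\" lon=\"144.").append(9000000 - i)
              .append("\"/>\n");
        }
        return sb.append("</osm>\n").toString();
    }

    // Returns the size in bytes of the text after GZIP compression.
    // GZIP is an assumption here, standing in for the bz2 used for planet.osm.
    static int gzippedSize(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(out);
        gz.write(text.getBytes("UTF-8"));
        gz.close();
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        for (int indent : new int[] {1, 2, 4}) {
            String xml = sample(100000, indent);
            System.out.println(indent + " space: raw=" + xml.length()
                    + " compressed=" + gzippedSize(xml));
        }
    }
}
```

The raw size grows by exactly one byte per element for each extra space of indent; whether any of that survives compression is the thing that has to be measured on prod data.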
>> **** 5. Inclusion of database password on command line.
>> Currently the only way to provide a database password to osmosis is on
>> the command line. Presumably this will allow other users on the same
>> system to see the password (through the use of ps, top, etc). If this
>> is a problem I'll have to update the database tasks to be able to read a
>> properties file containing connection information.
>>
>
> Mmh, every user on dev would now be able to see the password with which to connect to the db server. I would *prefer* if we could find a non-commandline variant of handling that. It is not critical by any means (I hope dev users are a responsible lot), but then...
>
I'll definitely fix this. The quickest fix is for me to allow the
connection details to be loaded from a properties file, but kleptog's
suggestion of a .osmosisrc file is probably a better long-term solution.
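
As a rough idea of what the properties file approach might look like, here is a minimal sketch using java.util.Properties. Nothing here is existing osmosis code: the DbCredentials class, the key names (host, database, user, password) and the localhost default are all assumptions made for illustration. The point is simply that the password lives in a file that only its owner can read, rather than in the argument list visible through ps.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical holder for connection details; not part of osmosis.
class DbCredentials {
    final String host;
    final String database;
    final String user;
    final String password;

    DbCredentials(String host, String database, String user, String password) {
        this.host = host;
        this.database = database;
        this.user = user;
        this.password = password;
    }

    // Loads connection details from a properties-format stream.
    // The key names are assumptions chosen for this example.
    static DbCredentials load(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        return new DbCredentials(
                props.getProperty("host", "localhost"),
                require(props, "database"),
                require(props, "user"),
                require(props, "password"));
    }

    // Fails early with a clear message if a mandatory key is missing.
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the properties file, e.g. one with mode 600.
        Reader in = new FileReader(args[0]);
        try {
            DbCredentials c = load(in);
            System.out.println("Connecting to " + c.database + " on " + c.host
                    + " as " + c.user);
        } finally {
            in.close();
        }
    }
}
```

With something like this, only the path to the file would appear on the command line, and the file permissions decide who can read the password.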
>
>
>> **** 6. Minimum of Java 1.6
>> Dev currently has jdk1.5 installed preventing osmosis from running. The
>> only code requiring 1.6 is the --bounding-polygon task.
>> I have two options:
>> a. Rework osmosis to only require 1.5 (either by temporarily deleting
>> the polygon task, creating a branch without the polygon task, reworking
>> the polygon task to use older features, etc).
>> b. Upgrade dev to jdk 1.6.
>>
>
> I upgraded dev's java to 1.6 and it works fine. So this is no issue for me any more.
>
Excellent. Thanks for that.
>
>> **** 7. Different rounding of lat/lon coordinates.
>> The java DecimalFormat appears to be rounding numbers in a slightly
>> different way to whatever planet.rb is using (sprintf?). This only
>> occurs when a 5 has to be rounded one way or the other. I don't think
>> this is an issue and short of writing my own decimal formatter I don't
>> think there's much I can do about this one.
>>
>
> I am doing a test dump on dev and will look at UTF-8 characters to see that it works out well on this box. Will let you know about the timing as well.
>
Cool, I look forward to the results.
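
On the rounding difference in item 7, a small sketch of where it can come from. DecimalFormat rounds ties with RoundingMode.HALF_EVEN by default, and since Java 1.6 the mode can be changed with setRoundingMode. I haven't verified which rule planet.rb ends up using, so HALF_UP below is only there for contrast, and the two-decimal pattern and the RoundingDemo class are made up for the example.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class RoundingDemo {
    // Formats a value to two decimal places using the given rounding mode.
    // Locale.US pins the decimal separator to '.' regardless of the system locale.
    static String format(double value, RoundingMode mode) {
        DecimalFormat df = new DecimalFormat("0.00",
                DecimalFormatSymbols.getInstance(Locale.US));
        df.setRoundingMode(mode);
        return df.format(value);
    }

    public static void main(String[] args) {
        // 0.125 is exactly representable in binary, so it is a true tie
        // between 0.12 and 0.13.
        double tie = 0.125;
        System.out.println(format(tie, RoundingMode.HALF_EVEN)); // prints 0.12
        System.out.println(format(tie, RoundingMode.HALF_UP));   // prints 0.13
    }
}
```

So if exact agreement with planet.rb ever matters, setting the rounding mode explicitly may be enough without writing a custom formatter, although that would need checking against real output.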
Cheers,
Brett