[OSM-dev] UTF-8 problems in informationfreeway?
Stefan Baebler
stefan.baebler at gmail.com
Tue Dec 18 06:59:44 GMT 2007
Actually, we don't have to wait. There is another node with broken
UTF-8 in the latest osc:
http://planet.openstreetmap.org/daily/hourly/2007121805-2007121806.osc.gz
<node id="26036509" timestamp="2007-12-18T05:09:10Z" user="Vlado"
lat="48.2163297" lon="19.9527257">
<tag k="is_in" v="Rimavská Sobota,Banskobystrický kraj,Stredné
Slovensko,Slovensko"/>
<tag k="region_id" v="609"/>
<tag k="source:name"
v="http://earth-info.nga.mil/gns/html/cntry_files.html,www.statistics.sk"/>
<tag k="city_id" v="514811"/>
<tag k="place" v="village"/>
<tag k="population" v="1200"/>
<tag k="ascii_name" v="Hajnacka"/>
<tag k="name" v="HajnáÄka"/>
<tag k="created_by" v="JOSM"/>
<tag k="import_ref" v="city_import_sk_1"/>
<tag k="ele" v="322"/>
</node>
But an API call to http://www.openstreetmap.org/api/0.5/node/26036509
shows it correctly:
<node id="26036509" lat="48.2163297" lon="19.9527257" user="Vlado"
visible="true" timestamp="2007-12-18T05:09:10+00:00">
<tag k="is_in" v="Rimavská Sobota,Banskobystrický kraj,Stredné
Slovensko,Slovensko"/>
<tag k="region_id" v="609"/>
<tag k="source:name"
v="http://earth-info.nga.mil/gns/html/cntry_files.html,www.statistics.sk"/>
<tag k="city_id" v="514811"/>
<tag k="place" v="village"/>
<tag k="population" v="1200"/>
<tag k="ascii_name" v="Hajnacka"/>
<tag k="name" v="Hajnáčka"/>
<tag k="created_by" v="JOSM"/>
<tag k="import_ref" v="city_import_sk_1"/>
<tag k="ele" v="322"/>
</node>
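For anyone comparing the two name values byte by byte, here is a minimal Python sketch (not part of osmosis or the API; the helper name misread_as_latin1 is made up for illustration). It assumes the "Ä" in the diff comes from the two UTF-8 bytes of "č" being read as Latin-1, which nothing in this thread confirms. Note that "á" is intact in the diff, so misreading the whole string as Latin-1 would not reproduce that line exactly.

```python
# Sketch only: shows what the UTF-8 bytes of "č" look like when they are
# misread as Latin-1. Whether this is what happens on the server is an
# assumption, not something confirmed in this thread.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")

good_name = "Hajnáčka"
broken_c = misread_as_latin1("č")

# The two UTF-8 bytes of "č" and the two characters they turn into.
print([hex(b) for b in "č".encode("utf-8")])
print([f"U+{ord(c):04X}" for c in broken_c])

# Mangle only the "č", leaving "á" alone, as seen in the diff.
mangled_name = good_name.replace("č", broken_c)
print(repr(mangled_name))
```

The second character produced is an invisible control character, which would explain why only "Ä" is visible in the diff.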
Osmosis WorksOnMyComputer(tm), so I'm not sure what is different on
the server where the osc files get generated.
Someone with access should probably run osmosis on that same server:
osmosis --rmc host="..." database="..." user="..." password="..."
intervalBegin="2007-12-18_05:09:09"
intervalEnd="2007-12-18_05:09:11" --sc --wxc file="utftestdump.osc"
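Whoever runs that could then scan the resulting file for suspect values. A rough Python sketch, assuming the damage shows up as C1 control characters (U+0080 to U+009F) inside tag values, as it would if UTF-8 bytes were read as Latin-1. The function name suspicious_tags is hypothetical and this is only a heuristic, not a real validity check:

```python
import xml.etree.ElementTree as ET

def suspicious_tags(osc_path):
    """Yield (node_id, key, value) for node tags whose value contains
    a C1 control character (U+0080..U+009F).

    Assumption: such characters do not occur in legitimate tag values,
    so their presence hints at misdecoded bytes.
    """
    for _, elem in ET.iterparse(osc_path, events=("end",)):
        if elem.tag != "node":
            continue
        for tag in elem.findall("tag"):
            value = tag.get("v", "")
            if any(0x80 <= ord(ch) <= 0x9F for ch in value):
                yield elem.get("id"), tag.get("k"), value
        # Free the node's children once it has been checked.
        elem.clear()
```

Calling list(suspicious_tags("utftestdump.osc")) on the file from the command above would list any nodes worth a closer look.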
greets,
Stefan
Stefan Baebler wrote:
> 80n wrote:
>> No, the problem shows up clearly enough if you compare a node in
>> planet.osm to the corresponding node in a diff file.
>
> Osmosis makes a nice .osc file, with nice utf8 characters. I ran it
> like:
> osmosis --rmc host="localhost" database="..." user="..."
> password="..." intervalBegin="2007-12-02_08:50:00" --sc --wxc
> file=".\data\dump.osc"
>
> and got:
> <node id="29161753" timestamp="2007-12-02T08:52:13Z" user="Osmosis
> System User" lat="46.1356895" lon="14.7445634">
> <tag k="created_by" v="JOSM"/>
> <tag k="name" v="Moravče"/>
> <tag k="place" v="town"/>
> </node>
>
> I changed the node 29161753 a bit (added is_in) at 07:01 CET, so it
> should come up in one of the next hourly diffs:
> http://planet.openstreetmap.org/daily/hourly/2007121805-2007121806.osc.gz
> http://planet.openstreetmap.org/daily/hourly/2007121806-2007121807.osc.gz
> http://planet.openstreetmap.org/daily/hourly/2007121807-2007121808.osc.gz
> (not sure which timezone applies - GMT, CET or CET+DST)
>
> and a bit later also to osmxapi
> http://www.informationfreeway.org/api/0.5/node%5bplace=town%5d%5bbbox=14.5,46.1,14.8,46.2%5d
>
>
> Now let's wait and see.
>
> Stefan
>
>>
>> On Dec 17, 2007 10:44 AM, J.D. Schmidt <jdsmobile at gmail.com> wrote:
>>
>> Did the UTF-8 encoding problems start to show up after OSMXAPI was
>> moved to the HyperCube server? AFAIU, even though its URL is an
>> informationfreeway.org address, OSMXAPI processing was moved to the
>> HyperCube server, so something might have been configured differently
>> on a US-locale server versus the UK-locale informationfreeway.org
>> server.
>>
>> Might be a good idea to check that first.
>>
>> Dutch
>>
>> 80n skrev:
>> > Brett
>> > Yes, it's probably something like that. All I can say for sure is
>> > that the node in planet.osm looks different to the same one in an
>> > Osmosis diff file.
>> >
>> > If we could identify how it is encoded differently then maybe I could
>> > compensate for it on import into Osmxapi, but it would be better to
>> > fix the problem at source - wherever that is.
>> >
>> > Anyway, there's no rush to deal with it at the moment.
>> > 80n
>> >
>> > On Dec 17, 2007 6:05 AM, Brett Henderson <brett at bretth.com> wrote:
>> >
>> >> It warms my heart to return from leave to discover new osmosis utf8
>> >> problems, I missed those little guys ;-)
>> >>
>> >> I'll check it out. Might take me a few days though because I'm a bit
>> >> overwhelmed with email and Christmas at the moment ... Strictly
>> >> speaking I suspect this is not truly a bug but yet another artefact
>> >> of the database encoding issues, it may not be easy to nail.
>> >>
>> >> Stefan Baebler wrote:
>> >>> Another artifact of similar utf problem can be seen at yesterday's
>> >>> lowzoom(!) tile:
>> >>> http://tah.openstreetmap.org/Tiles/info.php?x=1107&y=727&z=11&layer=tile
>> >>> Mengeš ("š" is ok)
>> >>> Domžale ("ž" is ok)
>> >>> Moravče ("č" turned into "Ä ")
>> >>>
>> >>> On zoom 12 and higher "č" in Moravče is ok:
>> >>> http://tah.openstreetmap.org/Tiles/info.php?x=4431&y=2909&z=13&layer=tile
>> >>> There definitely is a problem _somewhere_.
>> >>>
>> >>> In today's and last week's dump the node is ok (extract made with
>> >>> osmosis!):
>> >>> <node id="29161753" timestamp="2007-12-02T08:52:13Z"
>> >>> lat="46.1356895" lon="14.7445634">
>> >>> <tag k="created_by" v="JOSM"/>
>> >>> <tag k="name" v="Moravče"/>
>> >>> <tag k="place" v="town"/>
>> >>> </node>
>> >>> (this xml snippet is an extract of a planet file, done with osmosis
>> >>> for local archive: http://osm.baebler.net/data/ )
>> >>>
>> >>> Osmosis seems to handle that in files(!) just fine, but osmxapi
>> >>> gives it wrong:
>> >>> http://www.informationfreeway.org/api/0.5/node%5bplace=town%5d%5bbbox=14.5,46.1,14.8,46.2%5d
>> >>> either there is a bug in osmxapi or during the import into its db.
>> >>>
>> >>> hope it helps tracking it down.
>> >>>
>> >>> greets,
>> >>> Štefan
>> >>>
>> >>> On Dec 8, 2007 12:54 AM, Frederik Ramm <frederik at remote.org> wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>>
>> >>>>> This appears to be an osmosis problem.
>> >>>>> A recent planet contains the following:
>> >>>>>
>> >>>> [...]
>> >>>>
>> >>>> I concur; the latest daily diff before the Dec06 planet file had
>> >>>> UTF-8 problems as well but the affected objects were represented
>> >>>> ok in the planet file.
>> >>>>
>> >>>> Bye
>> >>>> Frederik
>> >>>>
>> >>>> _______________________________________________
>> >>>>
>> >>>> dev mailing list
>> >>>> dev at openstreetmap.org <mailto:dev at openstreetmap.org>
>> >>>> http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev