[OSM-dev] UTF-8 problems in informationfreeway?

80n 80n80n at gmail.com
Fri Jan 4 10:22:21 GMT 2008


The 100 character truncation was a bug in Osmxapi, which is fixed now.

About 600 tags (out of 180 million) are affected in Osmxapi's database;
these will be corrected as they percolate through from the osmosis feed.
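
For readers following along, here is a minimal Python sketch of how a fixed-length cut can produce the "�" seen in the quoted output below. It assumes the limit was applied to UTF-8 bytes rather than to characters, which this thread does not confirm. The helper name truncate_utf8 and the sample value are hypothetical and are not taken from Osmxapi.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete trailing sequence left by the byte slice.
    return raw[:max_bytes].decode("utf-8", errors="ignore")


# Hypothetical sample: 11 ASCII bytes followed by 2-byte characters,
# so byte 100 falls in the middle of a character.
note = "characters:" + "Čč" * 50

# Naive cut at 100 bytes: the last character is split and decodes to U+FFFD.
naive = note.encode("utf-8")[:100].decode("utf-8", errors="replace")

# Boundary-aware cut: the partial character is dropped instead.
safe = truncate_utf8(note, 100)

print(repr(naive[-3:]))
print(repr(safe[-3:]))
```

If the limit is instead meant to count characters, slicing the decoded string (note[:100]) avoids the problem entirely, at the cost of a variable byte length.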

80n

On Jan 4, 2008 10:17 AM, Stefan Baebler <stefan.baebler at gmail.com> wrote:

> Khm, but where to set the new limit?
>
> According to http://www.facstaff.bucknell.edu/rbeard/name.html
> it should be 310 bytes = (
>
> length("Krungthepmahanakonbowornratanakosinmahintarayudyayamahadiloponoparatanarajthaniburiromudomrajniwesmahasatarnamornpimarnavatarsatitsakattiyavisanukamphrasit")
> -1) * 2 to accommodate slightly shorter names, with all of the
> characters being exotic, needing 2 bytes to encode :)
>
> Or the limits can be simply taken from the official OSM schema (for
> ways at least, tags of nodes are a mess with semicolons).
>
> Stefan
>
>
>
> On Jan 4, 2008 9:45 AM, 80n <80n80n at gmail.com> wrote:
> > Hmmm... yes it's truncating at 100 characters.  Working on a fix...
> >
> >
> >
> > On Jan 4, 2008 7:26 AM, Stefan Baebler <stefan.baebler at gmail.com> wrote:
> > > Hi again!
> > >
> > > Osmxapi behaves much better now, but there is a problem with my test
> > > node. In planet.osm it is:
> > > <node id="29161753" timestamp="2007-12-22T05:59:49Z" lat="46.1356895"
> > >
> > > lon="14.7445634">
> > > <tag k="created_by" v="JOSM"/>
> > > <tag k="name" v="Moravče"/>
> > >
> > > <tag k="is_in" v="Slovenia, Europe"/>
> > >
> > > <tag k="place" v="town"/>
> > >
> > > <tag k="note" v="Testing 34 random UTF-8
> > > characters:Č莞ŠšĐđĆć€ÄäËëÖöÜüŁłßÇç÷פ§ÉéÁáÂâ"/>
> > > </node>
> > > while
> > >
> > > http://osmxapi.hypercube.telascience.org/api/0.5/node%5bplace=town%5d%5bbbox=14.5,46.1,14.8,46.2%5d
> > > gives
> > > <node id="29161753" lat="46.1356895" lon="14.7445634"
> > >
> > > timestamp="2007-12-22T05:59:49Z">
> > > <tag k="is_in" v="Slovenia, Europe"/>
> > >
> > > <tag k="name" v="Moravče"/>
> > >
> > > <tag k="note" v="Testing 34 random UTF-8
> > > characters:Č莞ŠšĐđĆć€ÄäËëÖöÜüŁłßÇç÷פ§ÉéÁ�.."/>
> > >
> > > <tag k="place" v="town"/>
> > > </node>
> > >
> > > Note that the last 2 characters in note tag should be "Ââ".
> > > Planet.osm is ok, but osmxapi seems to misinterpret some characters.
> > > any ideas?
> > >
> > > UTF characters in hourly diffs and their import into osmxapi still
> > > need to be checked.
> > >
> > > Stefan
> > >
> > >
> > >
> >
> >
>