[OSM-dev] UTF-8 Failure...
Andrew McCarthy
me at andrewmccarthy.ie
Mon Aug 18 09:34:33 BST 2008
Hi,
On Mon, Aug 18, 2008 at 10:09:01AM +0200, spaetz wrote:
> As my UTF-8 knowledge is next to nonexistent, I would appreciate it if people could take a look at whether this is really a case of a UTF-8 error, or whether our UTF-8 checker is wrong.
>
> The UTF-8 checker in the t@h client aborts when rendering
> ./tilesGen.pl --Layers=caption xy 35 21 6
> with a UTF-8 error in line 262. The node in question is:
>
> <tag k='loc_name' v='Banska Stiavnica,Banska Stiavnica a Banska Bela,Banskà Ãtiavnica,Banskà Ãtiavnica a Banskà Bel##.'/>
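The problem is in the last word of this line. "Bel" is okay, but the next
two bytes, C3 2E, aren't valid UTF-8: C3 announces a two-byte sequence,
and 2E (an ASCII full stop) is not a continuation byte. If the first byte
is in the range C0–DF, the second byte must be in the range 80–BF.
(Strictly, C0 and C1 never occur in valid UTF-8 either, since they could
only start overlong encodings of ASCII characters.)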
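As a rough sketch of that rule, here is a small Python check. The function names are made up for illustration and this is not the checker the t@h client uses. The first function applies the two-byte rule directly; the second leans on Python's built-in decoder to report where a byte string first goes wrong.

```python
from typing import Optional, Tuple


def is_valid_two_byte(first: int, second: int) -> bool:
    """Check a two-byte UTF-8 sequence (hypothetical helper).

    The lead byte has the form 110xxxxx and the second byte must be a
    continuation byte 10xxxxxx (80-BF). Leads C0 and C1 are rejected
    here because they could only produce overlong encodings.
    """
    return 0xC2 <= first <= 0xDF and 0x80 <= second <= 0xBF


def first_utf8_error(data: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (offset, offending bytes) for the first decode error, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start:err.end]
    return None


if __name__ == "__main__":
    # The pair from the tag above: C3 followed by 2E.
    print(is_valid_two_byte(0xC3, 0x2E))
    # A well-formed pair: C3 A1 encodes U+00E1.
    print(is_valid_two_byte(0xC3, 0xA1))
    # Offset of the first bad byte in a string ending like the tag value.
    print(first_utf8_error(b"Bel\xc3."))
```

Running something like first_utf8_error() over the raw bytes of the tag value should point at the C3 right after "Bel", assuming the file really contains C3 2E there.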
If you have it on your system, the man page for UTF-8(7) is a really
good reference for this. Very short, and surprisingly readable.
Cheers,
Andrew