[OSM-dev] UTF8 problem with last night's daily .osc

Joachim Zobel jzobel at heute-morgen.de
Sat Aug 30 22:04:42 BST 2008


On Saturday, 30.08.2008, at 08:39 -0700, Karl Newman wrote:
> If I recall correctly, the database column is not actually set for
> UTF-8 (but is double-encoded to return actual UTF-8 to the client...).
> Wouldn't it be a better long-term fix to change the database to UTF-8
> (or whatever), then presumably MySQL wouldn't allow invalid sequences
> to be stored? Still would be a good idea to raise an error if the
> length was too long, though.

This means you are storing raw UTF-8 bytes in latin1 database tables,
right? That is the root of the problem and what should be fixed. If the
database treated each UTF-8 character as a single character, it could
not truncate one in the middle of its byte sequence.
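
To illustrate, here is a minimal Python sketch of the effect. It is not
the actual OSM or MySQL code: the helper names and the length limit of 5
are made up for the example. It only shows that cutting by bytes (which
is what a latin1 column holding raw UTF-8 effectively does) can split a
multi-byte character, while cutting by characters cannot.

```python
def truncate_bytes(text: str, limit: int) -> bytes:
    """Cut after `limit` bytes, as a column that counts bytes would."""
    return text.encode("utf-8")[:limit]


def truncate_chars(text: str, limit: int) -> bytes:
    """Cut after `limit` characters, then encode the result as UTF-8."""
    return text[:limit].encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical tag value: 6 characters, but 7 bytes in UTF-8,
# because the sharp s is encoded as the two bytes 0xC3 0x9F.
name = "Straße"

cut_bytes = truncate_bytes(name, 5)  # ends in a lone 0xC3 lead byte
cut_chars = truncate_chars(name, 5)  # ends after the complete character

print(is_valid_utf8(cut_bytes))
print(is_valid_utf8(cut_chars))
```

The byte-wise cut leaves a dangling lead byte, which is the kind of
invalid sequence that then ends up in the .osc file.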

Sincerely,
Joachim





