[OSM-dev] API 0.6: Tags, Uniqueness, and Case Insensitivity

Andy Allan gravitystorm at gmail.com
Mon Feb 9 17:02:23 GMT 2009


On Mon, Feb 9, 2009 at 2:12 PM, Frederik Ramm <frederik at remote.org> wrote:
> Hi,
>
>    we currently have case-insensitive tags because the default MySQL
> collation is case-insensitive. But nobody cares because we do not have
> a unique index on tag keys.
>
> With 0.6 we will have such an index. Will we continue using the default
> collation so that it becomes invalid to have "NAME=x" and "name=x" on
> the same object, or will the general UTF-8 overhaul lead to a
> different collation that makes "NAME" and "name" different?

So the general consensus that we came to, where 'we' is some form of
secret cabal, was that case-sensitivity in UTF-8 raises the question
of case in every script, not just Latin-1, and then things like
whether é and e+combining_acute are the same. And in any case, when
two UTF-8 byte sequences are "the same", should the second be
converted into the first, or would Name be returned as Name and then
an error raised if you tried committing name, and then blahblahblah.

So getting back to the point, we want it case-sensitive, and no UTF-8
normalisation (NFC, NFD, etc.) will be attempted. The server will
treat two different UTF-8 byte sequences as two different tags, and we
take the principle of "no tag inspection" to its logical extreme.
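
To make the byte-sequence rule concrete, here is a rough Python sketch
of the comparison it implies. It is not the server code; the helper
names tag_key_bytes and same_tag_key are made up for illustration, and
the only assumption is that keys are compared as raw UTF-8 bytes with
no case folding and no normalisation.

```python
import unicodedata


def tag_key_bytes(key: str) -> bytes:
    """Return the raw UTF-8 bytes of a key, with no folding or normalisation."""
    return key.encode("utf-8")


def same_tag_key(a: str, b: str) -> bool:
    """Two keys count as the same tag only if their UTF-8 bytes are identical."""
    return tag_key_bytes(a) == tag_key_bytes(b)


# Hypothetical example keys: one precomposed, one using a combining accent.
precomposed = "caf\u00e9"   # é as the single code point U+00E9
decomposed = "cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

print(same_tag_key("name", "NAME"))            # False: case differs
print(same_tag_key(precomposed, decomposed))   # False: different byte sequences

# What the server would NOT do under this rule: normalise before comparing.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

Under that rule both "NAME"/"name" and the two spellings of "café" can
sit on the same object without tripping a unique index on the key.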

That, and anything else would involve work.

Cheers,
Andy

PS / NB I don't think there are actually any unit tests for this :-)



