[openstreetmap/openstreetmap-website] Require user names to be unique after unicode normalisation (PR #4405)

Paul Norman notifications at github.com
Thu Dec 21 10:02:47 UTC 2023


I am strongly in favour of requiring PostgreSQL 12. I doubt parts of the toolset work on earlier versions, with osmdbt requiring logical replication. Recent PostgreSQL versions are easily available with pgdg on Ubuntu, Debian, and RHEL-based systems.

I was researching another way to do it which right now is equivalent in functionality, but could be much better under PostgreSQL 16.

PostgreSQL 12 added [non-deterministic collations](https://www.postgresql.org/docs/current/collation.html#COLLATION-NONDETERMINISTIC). With an index created on such a collation, `SELECT 'n' = 'ñ' COLLATE usernames;` returns true, and the comparison can use an index.

Something like this would create a suitable collation:

```sql
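CREATE COLLATION usernames (
  provider = icu,
  -- non-deterministic: strings that are not byte-identical can compare equal
  deterministic = false,
  -- ka-shifted: ignore punctuation and whitespace; kk: normalise;
  -- ks-level1: primary strength, so accents and case are ignored
  locale = 'und-u-ka-shifted-kk-ks-level1'
);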
```

This would compare only the base characters and be case-insensitive, e.g. `'N' = 'ñ'`.
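As a rough sketch of how uniqueness could then be enforced, a unique index can be declared with that collation. The `example_users` table and the index name below are hypothetical, made up for illustration, and are not the actual openstreetmap-website schema; the sketch also assumes the `usernames` collation above has already been created on an ICU-enabled PostgreSQL 12 or later.

```sql
-- Hypothetical table for illustration only; not the real schema.
CREATE TABLE example_users (
  id bigserial PRIMARY KEY,
  display_name text NOT NULL
);

-- Unique index whose notion of equality comes from the
-- non-deterministic "usernames" collation.
CREATE UNIQUE INDEX example_users_display_name_usernames_idx
  ON example_users (display_name COLLATE usernames);

INSERT INTO example_users (display_name) VALUES ('ñandú');

-- Should be rejected with a unique violation, because the collation
-- is expected to treat this name as equal to 'ñandú'.
INSERT INTO example_users (display_name) VALUES ('Nandu');

-- A lookup using the same collation, so the planner can consider the index.
SELECT id
FROM example_users
WHERE display_name = 'NANDU' COLLATE usernames;
```

The point of putting the collation on the index rather than on the column is that existing comparisons on `display_name` keep their current behaviour; only queries that explicitly ask for `COLLATE usernames` use the looser equality.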

Where this approach shines is under PostgreSQL 16, where you can add tailoring rules which set equality differently. 

```sql
CREATE COLLATION coll1 (
provider = icu,
deterministic = false,
locale = 'und',
rules = '& a = b');
SELECT 'a' = 'b' COLLATE coll1;
 ?column?
──────────
 t
(1 row)
```

I'm still trying to figure out how to start with a locale other than a base locale when adding rules, as well as how to handle all the quoting needed when every character you care about is a homograph to another.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/openstreetmap-website/pull/4405#issuecomment-1865979238