[OSM-dev] broken utf8 in minute changeset 200907140650

Eddy Petrișor eddy.petrisor at gmail.com
Wed Sep 9 17:43:23 BST 2009


2009/7/14 Richard Fairhurst <richard at systemed.net>:

Hello,

I still see some issues with the UTF-8 workarounds made in Potlatch.
If I correct a previously typed letter, then after I type the new
letter the cursor always jumps forward over at least(?) the next
character.


For instance, if I want to correct "Soseaua" to become "Șoseaua"
(first letter is capital letter S with comma below), after removing
"S" and typing "Ș", the cursor ends up after "Șos".

If I want to change the first "a" in "Calaretilor" into "ă" (lowercase
a with breve), the cursor ends up after "Căla", so again two positions
farther than it should be.


If I want to change the character '"' into '„' in '"Abcdefgh"', the
cursor ends up after '„Abc'.


It looks like the cursor is moved by as many extra positions as there
are bytes in the UTF-8 encoding of the character.
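
As a quick check of that guess, here is a minimal Python sketch. It is
not Potlatch code: the overshoot numbers are simply read off the three
examples above (how far past the expected position the cursor landed),
and the names observed_overshoot and utf8_length are made up for
illustration.

```python
# Cursor overshoot (positions past where the cursor should have been),
# read off the three examples in this message. These are observations,
# not values taken from Potlatch itself.
observed_overshoot = {
    "\u0218": 2,  # Ș: cursor after "Șos" instead of after "Ș"
    "\u0103": 2,  # ă: cursor after "Căla" instead of after "Că"
    "\u201e": 3,  # „: cursor after '„Abc' instead of after '„'
}


def utf8_length(ch: str) -> int:
    """Number of bytes used to encode a single character in UTF-8."""
    return len(ch.encode("utf-8"))


for ch, extra in observed_overshoot.items():
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} UTF-8 bytes, "
          f"observed overshoot {extra}")
```

For these three characters the byte count and the overshoot agree,
which fits the guess, but three samples are of course not a proof.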


> Further to the long-running Linux Flash Player brokenness, it would be very
> helpful for any Linux FP users to do this:
>
> 1. Go to http://www.systemeD.net/stuff/keycode.html
> 2. Type some non-ASCII characters into the top box (letters with accents,
> the sort of thing you might want to enter as an OSM tag value)
> 3. For each one, tell me what it returns in the next two boxes

For completeness, I am adding the letters and characters used in Romanian:

For "ă â ș ț î Ă Â Ș Ț Î „ ” « »", each entry below lists, in order,
the original character, how Flash stores it, and how the server gets it:

ă
3 c4 192
03 C3 84 C6 92


â
e2
C3 A2

ș
3 c8 2122
03 C3 88 E2 84 A2

ț
3 c8 203a
03 C3 88 E2 80 BA

î
ee
C3 AE

Ă
3 c4 201a

Â
3 c3 201a
03 C3 83 E2 80 9A

Ș
3 c8 2dc
03 C3 88 CB 9C

Ț
3 c8 161
03 C3 88 C5 A1

Î
3 c3 17d
03 C3 83 C5 BD

„
3 e2 20ac 17e
03 C3 A2 E2 82 AC C5 BE

”
3 e2 20ac 9d
03 C3 A2 E2 82 AC C2 9D

«
ab
C2 AB

»
bb
C2 BB



Also, please note that pressing AltGr generates the extra initial
noise "3" in the Flash store line. There is another standard variant
of the Romanian layout which doesn't involve pressing "AltGr" to get
to the special characters; with that variant the initial "3" is not
present.
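
The entries above that start with "3" look consistent with each UTF-8
byte of the character being read as a separate Windows-1252 character
and then re-encoded as UTF-8 on the way to the server. That is only my
assumption about what happens, not something confirmed from the Flash
Player or Potlatch code. A minimal Python sketch of that assumption
follows; flash_values and server_bytes are hypothetical names, and the
leading "3" / "03" noise from AltGr is deliberately not modelled.

```python
def flash_values(ch: str) -> list[int]:
    """Assumed Flash behaviour: read each UTF-8 byte as a cp1252 character."""
    values = []
    for byte in ch.encode("utf-8"):
        try:
            values.append(ord(bytes([byte]).decode("cp1252")))
        except UnicodeDecodeError:
            # Bytes such as 0x9D have no cp1252 mapping; keep the raw value.
            values.append(byte)
    return values


def server_bytes(values: list[int]) -> bytes:
    """Re-encode the stored values as UTF-8, as the server would see them."""
    return "".join(chr(v) for v in values).encode("utf-8")


# Print character, assumed Flash values, and assumed server bytes,
# without the leading "3" / "03".
for ch in "\u0103\u0219\u021b\u201e\u201d":
    stored = flash_values(ch)
    print(ch,
          " ".join(f"{v:x}" for v in stored),
          " ".join(f"{b:02X}" for b in server_bytes(stored)))
```

The entries without the leading "3" (â, î, «, ») are stored correctly
and do not follow this pattern, so the sketch only applies to the
characters typed with AltGr.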


> Thanks!
>
> cheers
> Richard
> --
> View this message in context: http://www.nabble.com/broken-utf8-in-minute-changeset-200907140650-tp24475713p24487675.html
> Sent from the OpenStreetMap - Dev mailing list archive at Nabble.com.
>
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/dev
>



-- 
Regards,
EddyP
=============================================
"Imagination is more important than knowledge" A.Einstein



