[OpenStreetMap] #4127: non-ascii UTF-8 symbols in GPX traces names are converted to '_' on upload

OpenStreetMap trac at openstreetmap.org
Sat Dec 10 16:22:04 GMT 2011


#4127: non-ascii UTF-8 symbols in GPX traces names are converted to '_' on upload
---------------------------------------------+------------------------------
 Reporter:  one_half_3544                    |       Owner:  rails-dev@…                
     Type:  defect                           |      Status:  new                        
 Priority:  minor                            |   Milestone:                             
Component:  website                          |     Version:                             
 Keywords:  trace, gpx, utf-8, localization  |  
---------------------------------------------+------------------------------

Comment(by one_half_3544):

 Hm. Don't you think assuming utf-8 by default would be a good idea?

 Running tcpdump while uploading this trace
 ( http://www.openstreetmap.org/user/one_half_3544/traces/1149633 ) shows
 that the browser (Firefox) at least lists utf-8 in Accept-Charset:
 {{{
 19:42:31.827995 IP (tos 0x0, ttl 64, id 46125, offset 0, flags [DF], proto TCP (6), length 1500)
     192.168.1.35.50499 > soup.osm.ichosted.org.uk.www: Flags [.], cksum 0x6b4f (correct), seq 1:1449, ack 1, win 115, options [nop,nop,TS val 6163510 ecr 795011323], length 1448
         0x0000:  4500 05dc b42d 4000 4006 b280 c0a8 0123  E....-@.@......#
         0x0010:  c13f 4b63 c543 0050 c958 b802 fc6e 645c  .?Kc.C.P.X...nd\
         0x0020:  8010 0073 6b4f 0000 0101 080a 005e 0c36  ...skO.......^.6
         0x0030:  2f62 e8fb 504f 5354 202f 7472 6163 652f  /b..POST./trace/
         0x0040:  6372 6561 7465 2048 5454 502f 312e 310d  create.HTTP/1.1.
         0x0050:  0a48 6f73 743a 2077 7777 2e6f 7065 6e73  .Host:.www.opens
         0x0060:  7472 6565 746d 6170 2e6f 7267 0d0a 5573  treetmap.org..Us
         0x0070:  6572 2d41 6765 6e74 3a20 4d6f 7a69 6c6c  er-Agent:.Mozill
         0x0080:  612f 352e 3020 2858 3131 3b20 4c69 6e75  a/5.0.(X11;.Linu
         0x0090:  7820 7838 365f 3634 3b20 7276 3a37 2e30  x.x86_64;.rv:7.0
         0x00a0:  2e31 2920 4765 636b 6f2f 3230 3130 3031  .1).Gecko/201001
         0x00b0:  3031 2046 6972 6566 6f78 2f37 2e30 2e31  01.Firefox/7.0.1
         0x00c0:  2049 6365 7765 6173 656c 2f37 2e30 2e31  .Iceweasel/7.0.1
         0x00d0:  0d0a 4163 6365 7074 3a20 7465 7874 2f68  ..Accept:.text/h
         0x00e0:  746d 6c2c 6170 706c 6963 6174 696f 6e2f  tml,application/
         0x00f0:  7868 746d 6c2b 786d 6c2c 6170 706c 6963  xhtml+xml,applic
         0x0100:  6174 696f 6e2f 786d 6c3b 713d 302e 392c  ation/xml;q=0.9,
         0x0110:  2a2f 2a3b 713d 302e 380d 0a41 6363 6570  */*;q=0.8..Accep
         0x0120:  742d 4c61 6e67 7561 6765 3a20 656e 2d75  t-Language:.en-u
         0x0130:  732c 656e 3b71 3d30 2e35 0d0a 4163 6365  s,en;q=0.5..Acce
         0x0140:  7074 2d45 6e63 6f64 696e 673a 2067 7a69  pt-Encoding:.gzi
         0x0150:  702c 2064 6566 6c61 7465 0d0a 4163 6365  p,.deflate..Acce
         0x0160:  7074 2d43 6861 7273 6574 3a20 4953 4f2d  pt-Charset:.ISO-
         0x0170:  3838 3539 2d31 2c75 7466 2d38 3b71 3d30  8859-1,utf-8;q=0
         0x0180:  2e37 2c2a 3b71 3d30 2e37 0d0a 436f 6e6e  .7,*;q=0.7..Conn
         0x0190:  6563 7469 6f6e 3a20 6b65 6570 2d61 6c69  ection:.keep-ali
         0x01a0:  7665 0d0a 5265 6665 7265 723a 2068 7474  ve..Referer:.htt
         0x01b0:  703a 2f2f 7777 772e 6f70 656e 7374 7265  p://www.openstre
         0x01c0:  6574 6d61 702e 6f72 672f 7472 6163 652f  etmap.org/trace/
         0x01d0:  6372 6561 7465 0d0a 436f 6f6b 6965 3a20  create..Cookie:.
 }}}
 It also transmits the UTF-8 filename as is:
 {{{
         0x0540:  223b 2066 696c 656e 616d 653d 2230 382d  ";.filename="08-
         0x0550:  3031 2d30 3320 d09b d0b5 d0b1 d18f d0b6  01-03...........
         0x0560:  d18c d0b5 202d 20d0 a1d0 bed1 81d0 bdd0  .....-..........
         0x0570:  bed0 b2d1 8bd0 b920 d0b1 d0be d180 5f72  .............._r
         0x0580:  6f61 6473 2e67 7078 220d 0a43 6f6e 7465  oads.gpx"..Conte
         0x0590:  6e74 2d54 7970 653a 2061 7070 6c69 6361  nt-Type:.applica
         0x05a0:  7469 6f6e 2f6f 6374 6574 2d73 7472 6561  tion/octet-strea
         0x05b0:  6d0d 0a0d 0a3c 3f78 6d6c 2076 6572 7369  m....<?xml.versi
         0x05c0:  6f6e 3d27 312e 3027 2065 6e63 6f64 696e  on='1.0'.encodin
         0x05d0:  673d 2755 5446 2d38 273f 3e0a            g='UTF-8'?>.
 }}}
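
 As a quick check that the filename bytes in this dump really are valid
 UTF-8, they can be decoded offline. This is only a sketch: the hex string
 is copied by hand from the dump above (everything between `filename="` and
 the closing quote), and the variable names are made up for illustration.

```python
# Bytes copied from the hexdump above, starting right after `filename="`
# and stopping before the closing quote.
FILENAME_HEX = (
    "30 38 2d 30 31 2d 30 33 20"                        # "08-01-03 "
    " d0 9b d0 b5 d0 b1 d1 8f d0 b6 d1 8c d0 b5"        # first Cyrillic word
    " 20 2d 20"                                         # " - "
    " d0 a1 d0 be d1 81 d0 bd d0 be d0 b2 d1 8b d0 b9"  # second Cyrillic word
    " 20"                                               # space
    " d0 b1 d0 be d1 80"                                # third Cyrillic word
    " 5f 72 6f 61 64 73 2e 67 70 78"                    # "_roads.gpx"
)

raw = bytes.fromhex(FILENAME_HEX)

# Strict decoding: raises UnicodeDecodeError if the bytes are not valid UTF-8.
filename = raw.decode("utf-8")
print(filename)

# Collect the characters that an ASCII-only filter would not recognise.
non_ascii = [ch for ch in filename if ord(ch) > 127]
print(len(non_ascii), "non-ASCII characters")
```

 The bytes decode without error to
 `08-01-03 Лебяжье - Сосновый бор_roads.gpx`, so the browser is sending a
 well-formed UTF-8 name.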
 A few TCP packets later, in the same POST request, comes the trace
 description field (it duplicates the filename):
 {{{
         0x04c0:  0d0a 436f 6e74 656e 742d 4469 7370 6f73  ..Content-Dispos
         0x04d0:  6974 696f 6e3a 2066 6f72 6d2d 6461 7461  ition:.form-data
         0x04e0:  3b20 6e61 6d65 3d22 7472 6163 655b 6465  ;.name="trace[de
         0x04f0:  7363 7269 7074 696f 6e5d 220d 0a0d 0a30  scription]"....0
         0x0500:  382d 3031 2d30 3320 d09b d0b5 d0b1 d18f  8-01-03.........
         0x0510:  d0b6 d18c d0b5 202d 20d0 a1d0 bed1 81d0  .......-........
         0x0520:  bdd0 bed0 b2d1 8bd0 b920 d0b1 d0be d180  ................
         0x0530:  3b20 d0a2 d180 d0b5 d0ba 20d0 bfd0 bed0  ;...............
         0x0540:  bbd1 8cd0 b7d0 bed0 b2d0 b0d1 82d0 b5d0  ................
         0x0550:  bbd1 8f20 564f 524f 4e20 d181 2076 656c  ....VORON....vel
         0x0560:  6f70 6974 6572 2e73 7062 2e72 7520 0d0a  opiter.spb.ru...
         0x0570:  2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
         0x0580:  2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d32 3130  -------------210
         0x0590:  3839 3933 3230 3531 3332 3438 3038 3130  8993205132480810
 }}}
 It also arrives as UTF-8, but here the characters are not converted to '_'.
 So this looks like a server-side bug to me after all.
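
 To illustrate the kind of server-side handling that would produce this
 symptom, here is a purely hypothetical sketch. I have not seen the actual
 upload code, so both function names and both patterns are assumptions: one
 filter whitelists only ASCII letters, digits and '.', the other accepts any
 Unicode word character.

```python
import re


def sanitize_ascii_only(name: str) -> str:
    """Hypothetical filter: keep ASCII letters, digits and '.', replace the rest."""
    return re.sub(r"[^A-Za-z0-9.]", "_", name)


def sanitize_unicode_aware(name: str) -> str:
    """Hypothetical alternative: keep Unicode word characters, '.', '-' and space."""
    # In Python 3, \w on a str pattern matches letters from any script.
    return re.sub(r"[^\w.\- ]", "_", name)


name = "бор_roads.gpx"  # tail of the filename seen in the dump
print(sanitize_ascii_only(name))
print(sanitize_unicode_aware(name))
```

 With an ASCII-only whitelist every Cyrillic letter is replaced by '_',
 which matches what happens to the trace name, while the Unicode-aware
 variant still replaces characters such as '/' but leaves the letters
 alone.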

 Do you know where in the source the trace upload is handled? (Or at
 least, where is the source hosted? =)) I have more traces to upload, so I
 want to get this problem resolved.

-- 
Ticket URL: <https://trac.openstreetmap.org/ticket/4127#comment:2>
OpenStreetMap <http://www.openstreetmap.org/>
OpenStreetMap is a free editable map of the whole world


