[OSM-dev] broken utf8 in minute changeset 200907140650

Ævar Arnfjörð Bjarmason avarab at gmail.com
Tue Jul 14 17:42:01 BST 2009


On Tue, Jul 14, 2009 at 4:19 PM, Tom Hughes<tom at compton.nu> wrote:
> On 14/07/09 17:09, Ævar Arnfjörð Bjarmason wrote:
>
>> Yes from a client point of view. But the server portion of Potlatch
>> shouldn't trust the client side to do data validation. Doing
>> server-side content validation equivalent to the main API would have
>> prevented both the issue described in ticket:1936 and presumably this
>> issue too.
>
> No it wouldn't have prevented this issue, because the main API checks for valid
> UTF-8, which this was. The problem in this case was that it was a UTF-8
> control character which is not valid in XML and both APIs allow those
> through at the moment.

Yes, the main API checks for valid UTF-8 once it gets hold of the data,
but it also *incidentally* does further validation when it parses the
XML via libxml, which is where it rejects things that make XML parsers
puke.
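
To illustrate the distinction, here is a minimal sketch (not what the
API or libxml actually runs) showing that CAN (0x18) and SYN (0x16)
decode fine as UTF-8 yet fall outside the Char production of XML 1.0.
The helper name find_invalid_xml_chars is made up for this example.

```python
import re

# Characters allowed by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The pattern matches anything outside that set.
_INVALID_XML_CHAR = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _INVALID_XML_CHAR.finditer(text)]


raw = b'v="\x18\x16Meycauayan City Northbound Entry Point"'
text = raw.decode("utf-8")  # succeeds: the bytes are valid UTF-8
print(find_invalid_xml_chars(text))  # → [(3, 24), (4, 22)]
```

A check like this on tag values would catch the problem regardless of
which API the data comes in through.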

Here's what your test server says when I try to upload XML with CAN
and SYN control characters:

avar at aoeu:~/Desktop$ cat create_changeset.xml
<?xml version='1.0' encoding='UTF-8'?>
<osm version='0.6' generator='me'>
  <changeset visible='true'>
    <tag k='created_by' v='Ban Potlatch 1.0' />
    <tag k="name" v="Meycauayan City Northbound Entry Point"/>
  </changeset>
</osm>
avar at aoeu:~/Desktop$ hexdump -C create_changeset.xml | grep Me
000000a0  6b 3d 22 6e 61 6d 65 22  20 76 3d 22 18 16 4d 65  |k="name" v="..Me|

avar at aoeu:~/Desktop$ curl -u "avarab at gmail.com":avarfoobar -i -o - -T create_changeset.xml 'http://osm.compton.nu/api/0.6/changeset/create'
HTTP/1.1 100 Continue

HTTP/1.1 100 Continue

HTTP/1.1 400 Bad Request
Date: Tue, 14 Jul 2009 16:32:28 GMT
Server: Apache/2.2.11 (Fedora) DAV/2 Phusion_Passenger/2.2.2 PHP/5.2.9
mod_python/3.3.1 Python/2.6
X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.2.2
X-Runtime: 214
Cache-Control: no-cache
Error: Cannot parse valid Changeset from xml string <?xml
version='1.0' encoding='UTF-8'?>, <osm version='0.6' generator='me'>,
<changeset visible='true'>, <tag k='created_by' v='Ban Potlatch 1.0'
/>, <tag k="name" v="Meycauayan City Northbound Entry Point"/>,
</changeset>, </osm>, .
Content-Length: 285
Status: 400
Content-Type: text/html; charset=utf-8
Connection: close

Cannot parse valid Changeset from xml string <?xml version='1.0'
encoding='UTF-8'?>
<osm version='0.6' generator='me'>
  <changeset visible='true'>
    <tag k='created_by' v='Ban Potlatch 1.0' />
    <tag k="name" v="Meycauayan City Northbound Entry Point"/>
  </changeset>
</osm>

It throws the same error even if I escape the control characters:

avar at aoeu:~/Desktop$ curl -u "avarab at gmail.com":avarfoobar -i -o - -T create_changeset-2.xml 'http://osm.compton.nu/api/0.6/changeset/create'
HTTP/1.1 100 Continue

HTTP/1.1 100 Continue

HTTP/1.1 400 Bad Request
Date: Tue, 14 Jul 2009 16:41:38 GMT
Server: Apache/2.2.11 (Fedora) DAV/2 Phusion_Passenger/2.2.2 PHP/5.2.9
mod_python/3.3.1 Python/2.6
X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.2.2
X-Runtime: 222
Cache-Control: no-cache
Error: Cannot parse valid Changeset from xml string <?xml
version='1.0' encoding='UTF-8'?>, <osm version='0.6' generator='me'>,
<changeset visible='true'>, <tag k='created_by' v='Ban Potlatch 1.0'
/>, <tag k="name" v="Meycauayan City Northbound Entry Point"
/>, </changeset>, </osm>, .
Content-Length: 294
Status: 400
Content-Type: text/html; charset=utf-8
Connection: close

Cannot parse valid Changeset from xml string <?xml version='1.0'
encoding='UTF-8'?>
<osm version='0.6' generator='me'>
  <changeset visible='true'>
    <tag k='created_by' v='Ban Potlatch 1.0' />
    <tag k="name" v="Meycauayan City Northbound Entry Point" />
  </changeset>
</osm>
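
The escaped case fails for a related reason: in XML 1.0 a character
reference must itself refer to a character matching the Char
production, so a reference such as &#24; (used here purely as an
example of an escaped CAN) is still not well-formed. A small sketch
using Python's expat-based parser rather than libxml, so the error
text will differ from the server's; the helper name parses is
hypothetical:

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses('<tag k="name" v="Me"/>'))            # plain attribute value
print(parses('<tag k="name" v="\x18\x16Me"/>'))    # raw CAN and SYN
print(parses('<tag k="name" v="&#24;&#22;Me"/>'))  # escaped CAN and SYN
```

Only the first document should be accepted; both the raw and the
escaped control characters are rejected by the parser.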



