[OSM-dev] Semicolon
Brett Henderson
brett at bretth.com
Tue Nov 20 03:56:49 GMT 2007
Tom Hughes wrote:
> In message <53cf5b6f0711190633g4d99a1bbk7b9fbae3be2b0b72 at mail.gmail.com>
> Stefan Baebler <stefan.baebler at gmail.com> wrote:
>
>
>> Hi!
>>
>> In a discussion with BrettH about Osmosis handling semicolons in
>> nodes' tags it struck me that tags of nodes
>> are kept in a text field in the nodes table (separated with semicolons),
>> while tags of ways are kept normalized - in a separate table.
>>
>> Current escaping is sort of random, see node 100325036 for example.
>>
>
> The current escaping is not random at all - there is no escaping!
>
> Well that's not quite true - Potlatch uses a (broken) form of escaping
> but as it is the only thing that undoes that escaping on read it is
> not hugely helpful.
>
Can I get confirmation of how the escaping is supposed to work? Is a
';' within a tag key or value represented as ";;;"?
I was intending to fix the escaping in osmosis, although if no other
tools support it either there may be no point.
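
To make the question concrete, here is a minimal sketch of the scheme I am
asking about. It is only an assumption, not confirmed behaviour of the API,
Potlatch or osmosis: tags are joined as key=value pairs with ';', and a
literal ';' is written as ";;;". The function names are hypothetical, and '='
inside keys is not handled.

```python
SEP = ";"
ESC = ";;;"  # assumed escape for a literal ';' (unconfirmed)


def encode_tags(tags):
    """Join key=value pairs with ';', writing each literal ';' as ';;;'."""
    return SEP.join(
        f"{k.replace(SEP, ESC)}={v.replace(SEP, ESC)}" for k, v in tags.items()
    )


def decode_tags(field):
    """Split on a single ';' while turning ';;;' back into a literal ';'."""
    parts, current, i = [], [], 0
    while i < len(field):
        if field.startswith(ESC, i):
            # Escaped semicolon: keep it as part of the current key/value.
            current.append(SEP)
            i += len(ESC)
        elif field[i] == SEP:
            # Unescaped semicolon: end of the current key=value pair.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(field[i])
            i += 1
    parts.append("".join(current))
    return dict(p.partition("=")[::2] for p in parts if p)


def naive_decode(field):
    """What a tool that knows nothing about the escaping would do."""
    return dict(p.partition("=")[::2] for p in field.split(SEP) if p)


tags = {"name": "Kasten", "is_in": "Wundschuh;Austria"}
encoded = encode_tags(tags)
print(encoded)
print(decode_tags(encoded))
print(naive_decode(encoded))
```

The naive split breaks the escaped value apart and produces a stray
"Austria" key. Under this assumed scheme a key that itself starts with ';'
also fails to round-trip when it follows another tag, since ';' followed by
";;;" reads the same as ";;;" followed by ';'.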
>
>> in a planet's extract the node is written as
>> <node id="100325036" timestamp="2007-11-06T20:38:08Z"
>> lat="46.9372873" lon="15.4481424">
>> <tag k="name" v="Kasten"/>
>> <tag k="place" v="hamlet"/>
>> <tag k="created_by" v="Potlatch 0.4c"/>
>> <tag k="is_in" v="Wundschuh"/>
>> <tag k=";;Austria" v="bulk_upload.pl-f0deb1fc-2237-4d40-ae4d-3dd108453350"/>
>> </node>
>>
>> However, the API serves it differently, missing the last tag (with
>> semicolons) above completely.
>> http://www.openstreetmap.org/api/0.5/node/100325036 only gives:
>> <node id="100325036" lat="46.9372873" lon="15.4481424"
>> user="atrejuvienna" visible="true"
>> timestamp="2007-11-06T20:38:08+00:00">
>> <tag k="name" v="Kasten"/>
>> <tag k="place" v="hamlet"/>
>> <tag k="created_by" v="Potlatch 0.4c"/>
>> <tag k="is_in" v="Wundschuh"/>
>> </node>
>>
>> In a dump made by osmosis the strange tag was written a bit nicer, as:
>> <tag k="Austria" v="bulk_upload.pl-f0deb1fc-2237-4d40-ae4d-3dd108453350"/>
>> but it might be that the author actually wanted something completely different, e.g.:
>> <tag k="is_in" v="Wundschuh;;;Austria"/>
>>
>
> What you're looking at is what happens to Potlatch's escaping when
> it is processed by tools that don't understand it (or any other form
> of escaping).
>
>
>> While escaping is doable, I believe that we should look into moving
>> nodes' tags into a separate table, to avoid escaping, to handle the tags
>> of ways and nodes uniformly, and perhaps even to get better indexing.
>>
>
> Personally, I agree that it should be moved out - any volunteers to
> do the job?
>
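For illustration, a rough sketch of what a normalized node tag table could
look like. The table and column names (node_tags, id, k, v) are hypothetical,
and SQLite is used here only so the example runs on its own; the real
database and schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per tag, so keys and values are stored verbatim.
conn.execute(
    """
    CREATE TABLE node_tags (
        id INTEGER NOT NULL,
        k  TEXT    NOT NULL,
        v  TEXT    NOT NULL,
        PRIMARY KEY (id, k)
    )
    """
)


def set_tags(conn, node_id, tags):
    """Replace all tags of a node with the given dict."""
    conn.execute("DELETE FROM node_tags WHERE id = ?", (node_id,))
    conn.executemany(
        "INSERT INTO node_tags (id, k, v) VALUES (?, ?, ?)",
        [(node_id, k, v) for k, v in tags.items()],
    )


def get_tags(conn, node_id):
    """Return the tags of a node as a dict."""
    rows = conn.execute(
        "SELECT k, v FROM node_tags WHERE id = ?", (node_id,)
    ).fetchall()
    return dict(rows)


set_tags(conn, 100325036, {"name": "Kasten", "is_in": "Wundschuh;Austria"})
print(get_tags(conn, 100325036))
```

With one row per tag, a ';' in a key or value needs no escaping at all.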
> Indexing is a double edged sword - we can get better indexing, but
> the downside is that the tags table has to be a MyISAM table.
>
Why does it have to be a MyISAM table? Is it because of full text indexing,
some strange form of id generation, or something else I'm unaware of?
If it is full text indexing, which queries currently use it?
Brett