[Tagging] Reviving the conditions debate

martinq osm-martinq at fantasymail.de
Sun Jun 17 20:16:49 BST 2012


Colin,

> Martin, if it walks like a duck and quacks like a duck, then it had
> better be a duck... What I mean with this, is if the grammar is so
> English-like such that people are tempted to use constructions which are
> not (or not quite) supported by the grammar, or if the way it works is
> contrary to how the English language would interpret it, then "errors
> will occur".

Now, the primary intention was to allow conditions to be transferred from 
sign-posted information as "naturally" as possible. Because space is 
limited and people must be able to read signs while driving, signs use 
an extremely simplified language - mostly nouns plus words like 
except, if, when, on, then...
You don't get a Nobel Prize for Literature for text like "except 
vehicles less than 5m long" (also in several variants, e.g. "except 
vehicles with length<5m").

But nevertheless your point is valid. If we give the impression that the 
computer understands sentences, sooner or later we will find things like
"tracktype:cond = grade2 if weather was good for more than 3 days, 
grade3 after a day of rain, grade4 after a thunderstorm". I haven't 
taken this into account.

> Plus, of course, that the majority of users will not have
> English as their first language, and we have to make this accessible to
> the man-in-the-street and not allow it to become so obfuscated that only
> PhD's need apply.

As I said, for information (exceptions, conditions) that is typically 
sign-posted, you don't need a PhD. The parser can understand quite an 
amazing number of signs with just the nouns we have already defined for 
vehicles, the properties (length, weight - plus variants like long), and 
a few date and time terms (Su or Sunday, etc.) that are also described 
for other tags.
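
To illustrate the idea (this is only a minimal sketch, not the ANTLR 
grammar I played with), here is how a small fixed vocabulary can reduce 
both variants of the length sign to the same structure. The word lists, 
the function name parse_sign and the output fields are assumptions made 
up for this example:

```python
import re

# Hypothetical vocabulary for this sketch; not the proposal's real word lists.
VEHICLES = {"vehicles", "taxis", "bicycles", "hgv"}
PROPERTIES = {"long": "length", "length": "length", "weight": "weight"}
COMPARATORS = {"less than": "<", "more than": ">", "<": "<", ">": ">"}

NUMBER = r"(\d+(?:\.\d+)?)"

# Variant A: "except vehicles less than 5m long"
WORDY = re.compile(
    r"^(except\s+)?(\w+)\s+(less than|more than)\s+" + NUMBER + r"\s*(\w+)\s+(\w+)$"
)
# Variant B: "except vehicles with length<5m"
SYMBOLIC = re.compile(
    r"^(except\s+)?(\w+)\s+with\s+(\w+)\s*(<|>)\s*" + NUMBER + r"\s*(\w+)$"
)


def parse_sign(text):
    """Return a dict describing the condition, or None if it is not understood."""
    text = " ".join(text.lower().split())  # normalize case and whitespace
    match = WORDY.match(text)
    if match:
        negated, vehicle, comparator, value, unit, prop = match.groups()
    else:
        match = SYMBOLIC.match(text)
        if not match:
            return None
        negated, vehicle, prop, comparator, value, unit = match.groups()
    # Reject anything outside the fixed vocabulary instead of guessing.
    if vehicle not in VEHICLES or prop not in PROPERTIES:
        return None
    return {
        "except": negated is not None,
        "vehicle": vehicle,
        "property": PROPERTIES[prop],
        "op": COMPARATORS[comparator],
        "value": float(value),
        "unit": unit,
    }
```

With these assumptions, parse_sign("except vehicles less than 5m long") 
and parse_sign("except vehicles with length<5m") give the same result, 
while free text such as "grade2 if weather was good" returns None.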

For mappers that are not familiar with the English language, the benefit 
of the proposal is clearly massively reduced.

> [...] If you start with
> the premise that the answer must be expressed in ANTLR and shouldn't
> include brackets, that's putting the cart before the horse.

The objective was to be able to understand sign-posted sentences.
I picked ANTLR to play around with - but it is no "must".
I also said I want to avoid using brackets just to solve precedence 
problems - thus mappers shouldn't have to add brackets if they are not 
used on the sign.

But one clear issue is the lack of a larger set of examples.

> > Human language is sadly not very precise: "except taxis AND bicycles"
> > does not mean, you must be in a vehicle that is both (it means if not
> > taxis AND if not bicycles),
>
> The human language here is extremely precise to any fluent
> English-speaker. It means what it says.

Yes, no ambiguity here. Picked a bad example.

> "If I were king" I would be looking for a system that:

Huh, a new condition, added it to the parser.
"maxspeed:cond = 200 if you are king" ;-)

> * makes common cases easy
> * makes complex cases possible
> * makes each rule as standalone as possible (one sign -> one rule)
> * does not rely on the user's fluency in English grammar (knowledge of a
> set of specific words, e.g. tags and functions, is fine)

Signs use extremely simplified language. Thus almost every word on the 
sign has an essential meaning and has to be translated into a system 
also using English nouns.

But I haven't forgotten your valid point - if people think that it is 
free-text English, then sooner or later we will see fancy conditions.

> * uses grammatical constructions which are familiar to most people, or
> easily learnt

The normal form is not easy to learn - and learning it comes on top of 
the translation from the mapper's own language into English.

But translating simplified text from one language into simplified 
English may also be a problem. Thus the alternative proposal has no 
major advantage here.

However, looking at most signs, you will see combinations of one or two 
attributes plus time-related information only. In this case the "normal 
form" is no challenge (but the same is true for the simplified 
"sign-language" English).

Outlook:
The parser cannot yet parse complex conditions, so the extensibility 
requirement is not fulfilled. Thus there is no alternative 
"sign-language" proposal yet. And it looks like it is not feasible with 
reasonable effort or without unreasonable restrictions/rules (in which 
case we had better stick to the normal form).

IMO, the work done may be helpful for an editor: People can enter the 
sign-text and the editor converts it to normal form with standardized 
nouns and conditions. This tool may support several languages instead of 
just English.
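
A rough sketch of what such an editor step could look like, assuming a 
per-language word table that maps sign words to standardized tokens. The 
table contents, the name standardize and the token spellings are 
hypothetical and only meant to show the idea:

```python
# Hypothetical per-language word tables; a real editor would need far more entries.
LEXICON = {
    "en": {
        "except": "except", "taxis": "taxi", "bicycles": "bicycle",
        "and": "and", "sunday": "Su", "su": "Su",
    },
    "de": {
        "ausgenommen": "except", "taxis": "taxi", "fahrräder": "bicycle",
        "und": "and", "sonntag": "Su", "so": "Su",
    },
}


def standardize(sign_text, lang):
    """Map each word of the sign text to its standardized token.

    Unknown words raise ValueError, so free-text sentences are rejected
    instead of being stored as if the computer understood them.
    """
    words = LEXICON[lang]
    tokens = []
    for word in sign_text.lower().split():
        if word not in words:
            raise ValueError(f"unknown word on sign: {word!r}")
        tokens.append(words[word])
    return tokens
```

Here standardize("except taxis and bicycles", "en") and 
standardize("Ausgenommen Taxis und Fahrräder", "de") both give 
["except", "taxi", "and", "bicycle"]; turning that token list into the 
normal form would be a separate step, which is not shown.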

martinq


