[OSM-talk] [OSM-talk-fr] Continued aggression against French contributors (cadastre integration)

Frederik Ramm frederik at remote.org
Thu Oct 18 22:05:23 BST 2012


Christian,

    I think you are mixing things up here.

There is a general requirement for a dedicated import account, and you 
write yourself that you think that it is good to use a dedicated import 
account in some cases.

While this requirement is in theory a general requirement, DWG has never 
enforced that and was not planning to enforce it for minor imports of 
the type you mention ("reworked manually, objects integrated one by 
one...").

Such imports will generally result in small changesets, small mapper 
"productivity" (nobody can "rework manually" and "integrate" thousands 
of objects per day), and look much more like traditional mapping than 
like an import.

Your argument goes like this:

"The rule does not make sense for small, hand-made imports, therefore it 
does not apply to people uploading 10000 cadastre buildings at once either."

But there is a difference between small, hand-made imports and the kind 
of mass-import that many people commit when they import cadastre data.

I believe that it would be possible to import cadastre data in the 
"hand-made" style, and maybe that was indeed originally the intention; 
perhaps some people are actually importing their neighbourhood from 
cadastre data and we don't even notice because they *really* verify 
everything manually and therefore their edits look much more like manual 
edits.

I'm happy to accept that such edits are more "using a variety of legal 
third-party material in the mapping process" than they are "importing 
third-party data", and I will certainly not request that someone creates 
a separate account for taking 100 buildings from their government GIS, 
fixing them up in JOSM, and uploading them.

The same goes if someone uploads a couple of hand-verified schools in 
their area, or a couple of SNCF crossings, or something similar.

However, as soon as someone sets out to say not

"I will map my parent's village, let me check out the available sources 
and use them for help"

but

"I will take this SNCF dataset and import all level crossings in Alsace"

then they are doing an import - even if they are perhaps occasionally 
aligning something by hand.

Until now, DWG has only ever enforced the separate-account rule when 
people were clearly contributing much more data than could possibly be 
"manually reviewed".

In a recent message to talk-it 
(http://lists.openstreetmap.org/pipermail/talk-it/2012-September/030778.html), 
Paul writes

"We recognize that the line between an import and assisted mapping is 
not currently clearly defined; however all the cases I have seen 
recently clearly were on the import side of that line."

So he calls it "assisted mapping", I called it "using a variety of legal 
third-party material in the mapping process", we could also call it "a 
manually verified, small-scale import".

These things are ok, and while this is not currently written down, DWG 
does not enforce separate accounts for them. If that is any help, we can try to 
sit down together and try to clarify the line between "assisted mapping" 
and "import".

There are many reasons why we want mass imports clearly separated from 
normal, human-contributed data. We got burnt by this in the license 
change in Poland, where we had to spend massive amounts of time sorting 
between "good" and "bad" changesets contributed by the same account. We 
have situations in which it is unclear whether data in OSM is from an 
import or from, say, manual imagery tracing by the user or so; if there 
are doubts about that data quality, we will not hesitate to wholesale 
delete something that was imported (because we know that the script can 
be fixed and it can be imported again), but if there's manual work 
behind it then we'd rather not do that. Believe it or not, we have even 
had complaints from users who felt that their "stats" (i.e. number of 
objects contributed) were ruined unfairly by other users doing imports. 
Sometimes it turns out a whole import stands on shaky legal ground and 
has to be reverted; in some cases we had to revert a large block of work 
of a particular user because neither the user nor we could exactly say 
which bits were from the incompatible source. That hurts.

Imports dwarf anything else done with an account. Any statistics you run 
on an account will be dominated by the import characteristics. Any 
analyses - even "social" things like Richard Weait has playfully done - 
won't work with an account that is used by a human mapper and for 
imports at the same time. Importing data is a whole different class of 
activity.

Now of course you (and Alex Barth) have a point when you say: This could 
all be solved by proper source attribution in the changeset! Editors 
could automatically highlight imports/bot edits in the change history so 
that everyone knows that this is data of a different kind. Statistics 
engines could create different league tables, taking into account those 
changesets flagged as imports/bots. Disciplined mappers would always tag 
their changesets properly, and DWG would slap them on the wrist if we 
find a 10000 object changeset that was not tagged as an import or bot edit.

This is *theoretically* possible, and it is quite conceivable that we 
will get there some day. It would require a number of changes, for example, if you make 
any sort of history call on an object you would have to see the "was 
this an import/bot" property of the changeset involved, and many 
analyses would have to process that data, and we would have to have 
editors that prod users to actually toggle the "this was an import" bit 
on their changesets, and so on.
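
To make the idea a bit more concrete, here is a minimal sketch in Python 
of how a statistics tool might treat flagged changesets. Everything in it 
is an assumption for illustration only: the Changeset class, the 
import=yes / bot=yes tag convention, and the 10000-object threshold 
(borrowed from the example above) are hypothetical, not an existing API 
or an agreed rule.

```python
from dataclasses import dataclass, field

# Assumed threshold, borrowed from the "10000 object changeset" example.
LARGE_CHANGESET_OBJECTS = 10000


@dataclass
class Changeset:
    """Hypothetical, simplified stand-in for changeset metadata."""
    id: int
    user: str
    num_changes: int
    tags: dict = field(default_factory=dict)


def is_flagged(cs):
    # Hypothetical convention: the mapper marks the changeset
    # with import=yes or bot=yes.
    return cs.tags.get("import") == "yes" or cs.tags.get("bot") == "yes"


def split_for_stats(changesets):
    """Separate manual changesets from flagged import/bot changesets."""
    manual, automated = [], []
    for cs in changesets:
        (automated if is_flagged(cs) else manual).append(cs)
    return manual, automated


def needs_review(changesets, threshold=LARGE_CHANGESET_OBJECTS):
    """Large changesets that were not flagged as import or bot edits."""
    return [
        cs for cs in changesets
        if cs.num_changes >= threshold and not is_flagged(cs)
    ]
```

A league table would then be computed from the first list returned by 
split_for_stats only, and needs_review would give the changesets that 
someone might want to look at by hand. Note that this only works if 
every tool that reads history applies the same check.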

We are not there yet; we are currently in a situation where many DWG 
tasks are actually made much *easier* through the separate account 
requirement. Normally I am the first to say that "mappers must not be 
inconvenienced" - we should not invent mechanisms that make life harder 
for mappers. But I have less patience for mass importers; mass importing 
requires a lot of diligence and asking those few mass importers we have 
to create an extra account is really not a big deal. If you want to be a 
mass importer then you have to be 150% correct - for a normal mapper, 
50% is ok ;)

So, in the spirit of pragmatism, we currently ask those people doing 
mass imports to create a separate account. This might change with time, 
with better tools, with a different API version, but currently we are 
asking for it.

Members of the French community have, in this discussion, often claimed 
that it makes no sense to use a separate account for their small-scale, 
hand-crafted imports. And I agree! 100 houses from cadastre - no issue 
at all, and even if you add 100 cadastre houses every week, small-scale, 
in your local area, hand-crafted... I'll be the last one to complain.

The absurd thing is that the same "small-scale, hand-crafted" argument 
has also been used by individuals who have uploaded more than 100,000 
buildings. I'm sure that you would agree that imports on that scale have 
very little "manual edit" about them.

But apart from all this, and I really hope I don't have to repeat this 
over and over, I would really expect the French community to show a 
little cooperation here. As I said to someone else today already: Do we 
really have to discuss this to death - can you not, for once, even if 
you don't fully understand why the separate account rule exists, even if 
you believe that it doesn't fully apply to Cadastre imports, can you not 
simply shrug and say "ok, those 50 of us who do large-scale imports will 
create a separate account and that's it"?

I have now spent an hour writing this and I fear it will not help much. 
Endless man-days have been spent by all involved parties arguing their 
point. It is wearing me out but I can't let the French community ignore 
the separate account rule and enforce it for others. The French 
community is part of OSM, uses the same API, same database, same 
editors; the import guidelines are there to safeguard the quality and 
integrity of all our data; we can't make exceptions to them.

What we can do is what I wrote above - try to clarify the line between 
"assisted mapping" and imports. However I don't think that *any* of the 
French contributors who have until now been asked by DWG to create a 
separate account would ever fall within any definition of "assisted 
mapping".

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"


