[OSM-talk] [OSM-talk-fr] Continued aggression against French contributors (cadastre integration)
frederik at remote.org
Thu Oct 18 22:05:23 BST 2012
I think you are mixing up things here.
There is a general requirement for a dedicated import account, and you
write yourself that you think that it is good to use a dedicated import
account in some cases.
While this requirement is in theory a general requirement, DWG has never
enforced that and was not planning to enforce it for minor imports of
the type you mention ("reworked manually, objects integrated one by one").
Such imports will generally result in small changesets, small mapper
"productivity" (nobody can "rework manually" and "integrate" thousands
of objects per day), and look much more like traditional mapping than
like an import.
Your argument goes like this:
"The rule does not make sense for small, hand-made imports, therefore it
does not apply to people uploading 10000 cadastre buildings at once either."
But there is a difference between small, hand-made imports and the kind
of mass-import that many people commit when they import cadastre data.
I believe that it would be possible to import cadastre data in the
"hand-made" style, and maybe that was indeed originally the intention;
perhaps some people are actually importing their neighbourhood from
cadastre data and we don't even notice because they *really* verify
everything manually and therefore their edits look much more like manual
mapping.
I'm happy to accept that such edits are more "using a variety of legal
third-party material in the mapping process" than they are "importing
third-party data", and I will certainly not request that someone creates
a separate account for taking 100 buildings from their government GIS,
fixing them up in JOSM, and uploading them.
The same if someone uploads a couple of hand-verified schools in their
area, or a couple SNCF crossings, or something.
However, as soon as someone sets out to say not
"I will map my parents' village, let me check out the available sources
and use them for help"
but
"I will take this SNCF dataset and import all level crossings in Alsace"
then they are doing an import - even if they are perhaps occasionally
aligning something by hand.
Until now, DWG has only ever enforced the separate-account rule when
people were clearly contributing much more data than could possibly be
In a recent message, to talk-it
"We recognize that the line between an import and assisted mapping is
not currently clearly defined; however all the cases I have seen
recently clearly were on the import side of that line."
So he calls it "assisted mapping", I called it "using a variety of legal
third-party material in the mapping process", we could also call it "a
manually verified, small-scale import".
These things are OK and, while this is not currently written down, DWG
does not enforce separate accounts for them. If that is any help, we can
sit down together and try to clarify the line between "assisted mapping"
and imports.
There are many reasons why we want mass imports clearly separated from
normal, human-contributed data. We got burnt by this in the license
change in Poland, where we had to spend massive amounts of time sorting
between "good" and "bad" changesets contributed by the same account. We
have situations in which it is unclear whether data in OSM is from an
import or from, say, manual imagery tracing by the user or so; if there
are doubts about that data quality, we will not hesitate to wholesale
delete something that was imported (because we know that the script can
be fixed and it can be imported again), but if there's manual work
behind it then we'd rather not do that. Believe it or not, we have even
had complaints from users who felt that their "stats" (i.e. number of
objects contributed) were ruined unfairly by other users doing imports.
Sometimes it turns out a whole import stands on shaky legal ground and
has to be reverted; in some cases we had to revert a large block of work
of a particular user because neither the user nor we could exactly say
which bits were from the incompatible source. That hurts.
Imports dwarf anything else done with an account. Any statistics you run
on an account will be dominated by the import characteristics. Any
analyses - even "social" things like Richard Weait has playfully done -
won't work with an account that is used by a human mapper and for
imports at the same time. Importing data is a whole different class of
Now of course you (and Alex Barth) have a point when you say: This could
all be solved by proper source attribution in the changeset! Editors
could automatically highlight imports/bot edits in the change history so
that everyone knows that this is data of a different kind. Statistics
engines could create different league tables, taking into account those
changesets flagged as imports/bots. Disciplined mappers would always tag
their changesets properly, and DWG would slap them on the wrist if we
find a 10000 object changeset that was not tagged as an import or bot edit.
This is *theoretically* possible, and it is quite conceivable that we
will get there some day. It would require a number of changes, for example, if you make
any sort of history call on an object you would have to see the "was
this an import/bot" property of the changeset involved, and many
analyses would have to process that data, and we would have to have
editors that prod users to actually toggle the "this was an import" bit
on their changesets, and so on.
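
To illustrate what such flagging would make possible, here is a minimal
sketch, not an existing tool. It assumes each changeset is available as a
plain record and that importers set a hypothetical changeset tag
"import=yes"; neither the tag nor the record layout is an established
convention, and the 10000-object threshold is only the figure used as an
example above.

```python
from collections import defaultdict

# Hypothetical tag key/value marking a changeset as an import or bot edit.
IMPORT_TAG = ("import", "yes")


def is_flagged(changeset):
    """Return True if the changeset carries the assumed import flag."""
    key, value = IMPORT_TAG
    return changeset["tags"].get(key) == value


def split_stats(changesets):
    """Count objects per user, kept separate for manual and import work."""
    totals = defaultdict(lambda: {"manual": 0, "import": 0})
    for cs in changesets:
        kind = "import" if is_flagged(cs) else "manual"
        totals[cs["user"]][kind] += cs["num_changes"]
    return {user: dict(counts) for user, counts in totals.items()}


def untagged_large(changesets, threshold=10000):
    """List ids of large changesets that were not flagged as imports."""
    return [
        cs["id"]
        for cs in changesets
        if cs["num_changes"] >= threshold and not is_flagged(cs)
    ]


# Made-up sample data, for illustration only.
sample = [
    {"id": 1, "user": "alice", "num_changes": 120, "tags": {}},
    {"id": 2, "user": "alice", "num_changes": 10000, "tags": {"import": "yes"}},
    {"id": 3, "user": "bob", "num_changes": 15000, "tags": {}},
]

stats = split_stats(sample)
suspects = untagged_large(sample)
```

With the sample above, alice's manual and import contributions are counted
separately, and only changeset 3 is listed as a large upload without the
flag. All of this only works if every changeset is tagged reliably, which
is exactly the part we do not have today.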
We are not there yet; we are currently in a situation where many DWG
tasks are actually made much *easier* through the separate account
requirement. Normally I am the first to say that "mappers must not be
inconvenienced" - we should not invent mechanisms that make life harder
for mappers. But I have less patience for mass importers; mass importing
requires a lot of diligence and asking those few mass importers we have
to create an extra account is really not a big deal. If you want to be a
mass importer then you have to be 150% correct - for a normal mapper,
50% is ok ;)
So, in the spirit of pragmatism, we currently ask those people doing
mass imports to create a separate account. This might change with time,
with better tools, with a different API version, but currently we are
asking for it.
Members of the French community have, in this discussion, often claimed
that it makes no sense to use a separate account for their small-scale,
hand-crafted imports. And I agree! 100 houses from cadastre - no issue
at all, and even if you add 100 cadastre houses every week, small-scale,
in your local area, hand-crafted... I'll be the last one to complain.
The absurd thing is that the same "small-scale, hand-crafted" argument
has also been used by individuals who have uploaded more than 100,000
buildings. I'm sure that you would agree that imports on that scale have
very little "manual edit" about them.
But apart from all this, and I really hope I don't have to repeat this
over and over, I would really expect the French community to show a
little cooperation here. As I said to someone else today already: Do we
really have to discuss this to death - can you not, for once, even if
you don't fully understand why the separate account rule exists, even if
you believe that it doesn't fully apply to Cadastre rules, can you not
simply shrug and say "ok, those 50 of us who do large-scale imports will
create a separate account and that's it"?
I have now spent an hour writing this and I fear it will not help much.
Endless man-days have been spent by all involved parties arguing their
point. It is wearing me out but I can't let the French community ignore
the separate account rule and enforce it for others. The French
community is part of OSM, uses the same API, same database, same
editors; the import guidelines are there to safeguard the quality and
integrity of all our data; we can't make exceptions from them.
What we can do is what I wrote above - try to clarify the line between
"assisted mapping" and imports. However I don't think that *any* of the
French contributors who have until now been asked by DWG to create a
separate account would ever fall within any definition of "assisted
mapping".
Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"