[OSM-talk] [OSM-talk-fr] Continued aggression against French contributors (cadastre integration)

Jérome Armau jerarmau at gmail.com
Thu Oct 18 23:25:49 BST 2012


I think your approach based solely on the dataset size has limits. A
typical French village with a few hundred inhabitants will include
somewhere around 15,000 nodes and 500 building-tagged ways (that's a
village I know with 200 inhabitants). Now, integrating such amounts of data
doesn't mean that the work was a massive, automated import. Typically,
users will at least:
- check that the new layer is properly aligned with the survey points
around (doesn't show up in the changeset)
- manually integrate highways, place names, landmarks from the Cadastre
vector (adds a few dozen nodes and ways)
- check for typical geometry errors using the JOSM validator (doesn't show
up in the changeset)
- remove from the import all buildings that are already in the database
(removes a dozen ways at most)

Not every point and way is manually checked, simply because in the vast
majority of cases, the quality of the cadastre data is high (which is also
the reason why the number of nodes is high).

Are you saying that because this manual integration work has little impact
on the changeset size, this is a massive automated import? Remember that
for the vast majority of communes out there, we're talking about a few
hundred buildings or less.
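
To make that concrete, here is a minimal sketch, assuming a purely
size-based rule. The threshold of 10,000 objects per changeset, the
Changeset class and the function name are all illustrative assumptions,
not anything DWG actually uses. It only shows that such a rule cannot see
the manual steps listed above:

```python
from dataclasses import dataclass

# Hypothetical threshold (assumption): objects per changeset above which
# a purely size-based rule would call the upload a "mass import".
SIZE_THRESHOLD = 10_000


@dataclass
class Changeset:
    nodes: int
    ways: int
    manually_reviewed: bool  # alignment/validator checks leave no trace in the upload

    @property
    def objects(self) -> int:
        return self.nodes + self.ways


def looks_like_mass_import(cs: Changeset, threshold: int = SIZE_THRESHOLD) -> bool:
    """Size-only rule: deliberately ignores cs.manually_reviewed."""
    return cs.objects > threshold


# Village of about 200 inhabitants, using the figures quoted above.
village = Changeset(nodes=15_000, ways=500, manually_reviewed=True)
print(village.objects, looks_like_mass_import(village))
```

With these figures, a single reviewed village is already over the assumed
threshold, which is why I think size alone is a poor criterion.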

On Thu, Oct 18, 2012 at 2:05 PM, Frederik Ramm <frederik at remote.org> wrote:

> Christian,
>
>    I think you are mixing up things here.
>
> There is a general requirement for a dedicated import account, and you
> write yourself that you think that it is good to use a dedicated import in
> some cases.
>
> While this requirement is in theory a general requirement, DWG has never
> enforced that and was not planning to enforce it for minor imports of the
> type you mention ("reworked manually, objects integrated one by one...").
>
> Such imports will generally result in small changesets, small mapper
> "productivity" (nobody can "rework manually" and "integrate" thousands of
> objects per day), and look much more like traditional mapping than like an
> import.
>
> Your argument goes like this:
>
> "The rule does not make sense for small, hand-made imports, therefore it
> does not apply to people uploading 10000 cadastre buildings at once either."
>
> But there is a difference between small, hand-made imports and the kind of
> mass-import that many people commit when they import cadastre data.
>
> I believe that it would be possible to import cadastre data in the
> "hand-made" style, and maybe that was indeed originally the intention;
> perhaps some people are actually importing their neighbourhood from
> cadastre data and we don't even notice because they *really* verify
> everything manually and therefore their edits look much more like manual
> edits.
>
> I'm happy to accept that such edits are more "using a variety of legal
> third-party material in the mapping process" than they are "importing
> third-party data", and I will certainly not request that someone creates a
> separate account for taking 100 buildings from their government GIS, fixing
> them up in JOSM, and uploading them.
>
> The same if someone uploads a couple of hand-verified schools in their
> area, or a couple SNCF crossings, or something.
>
> However, as soon as someone sets out to say not
>
> "I will map my parent's village, let me check out the available sources
> and use them for help"
>
> but
>
> "I will take this SNCF dataset and import all level crossings in Alsace"
>
> then they are doing an import - even if they are perhaps occasionally
> aligning something by hand.
>
> Until now, DWG has only ever enforced the separate-account rule when
> people were clearly contributing much more data than could possibly be
> "manually reviewed".
>
> In a recent message to talk-it
> (http://lists.openstreetmap.org/pipermail/talk-it/2012-September/030778.html),
> Paul writes
>
> "We recognize that the line between an import and assisted mapping is not
> currently clearly defined; however all the cases I have seen recently
> clearly were on the import side of that line."
>
> So he calls it "assisted mapping", I called it "using a variety of legal
> third-party material in the mapping process", we could also call it "a
> manually verified, small-scale import".
>
> These things are ok and while it is not currently written, DWG does not
> enforce separate accounts for them. If that is any help, we can try to sit
> down together and try to clarify the line between "assisted mapping" and
> "import".
>
> There are many reasons why we want mass imports clearly separated from
> normal, human-contributed data. We got burnt by this in the license change
> in Poland, where we had to spend massive amounts of time sorting between
> "good" and "bad" changesets contributed by the same account. We have
> situations in which it is unclear whether data in OSM is from an import or
> from, say, manual imagery tracing by the user or so; if there are doubts
> about that data quality, we will not hesitate to wholesale delete something
> that was imported (because we know that the script can be fixed and it can
> be imported again), but if there's manual work behind it then we'd rather
> not do that. Believe it or not, we have even had complaints from users who
> felt that their "stats" (i.e. number of objects contributed) were ruined
> unfairly by other users doing imports. Sometimes it turns out a whole
> import stands on shaky legal ground and has to be reverted; in some cases
> we had to revert a large block of work of a particular user because neither
> the user nor we could exactly say which bits were from the incompatible
> source. That hurts.
>
> Imports dwarf anything else done with an account. Any statistics you run
> on an account will be dominated by the import characteristics. Any analyses
> - even "social" things like Richard Weait has playfully done - won't work
> with an account that is used by a human mapper and for imports at the same
> time. Importing data is a whole different class of activity.
>
> Now of course you (and Alex Barth) have a point when you say: This could
> all be solved by proper source attribution in the changeset! Editors could
> automatically highlight imports/bot edits in the change history so that
> everyone knows that this is data of a different kind. Statistics engines
> could create different league tables, taking into account those changesets
> flagged as imports/bots. Disciplined mappers would always tag their
> changesets properly, and DWG would slap them on the wrist if we find a
> 10000 object changeset that was not tagged as an import or bot edit.
>
> This is *theoretically* possible, and it is quite conceivable that we will
> get there some day. It would require a number of changes; for example, if you make
> any sort of history call on an object you would have to see the "was this
> an import/bot" property of the changeset involved, and many analyses would
> have to process that data, and we would have to have editors that prod
> users to actually toggle the "this was an import" bit on their changesets,
> and so on.
>
> We are not there yet; we are currently in a situation where many DWG tasks
> are actually made much *easier* through the separate account requirement.
> Normally I am the first to say that "mappers must not be inconvenienced" -
> we should not invent mechanisms that make life harder for mappers. But I
> have less patience for mass importers; mass importing requires a lot of
> diligence and asking those few mass importers we have to create an extra
> account is really not a big deal. If you want to be a mass importer then
> you have to be 150% correct - for a normal mapper, 50% is ok ;)
>
> So, in the spirit of pragmatism, we currently ask those people doing mass
> imports to create a separate account. This might change with time, with
> better tools, with a different API version, but currently we are asking for
> it.
>
> Members of the French community have, in this discussion, often claimed
> that it makes no sense to use a separate account for their small-scale,
> hand-crafted imports. And I agree! 100 houses from cadastre - no issue at
> all, and even if you add 100 cadastre houses every week, small-scale, in
> your local area, hand-crafted... I'll be the last one to complain.
>
> The absurd thing is that the same "small-scale, hand-crafted" argument has
> also been used by individuals who have uploaded more than 100,000
> buildings. I'm sure that you would agree that imports on that scale have
> very little "manual edit" about them.
>
> But apart from all this, and I really hope I don't have to repeat this
> over and over, I would really expect the French community to show a little
> cooperation here. As I said to someone else today already: Do we really
> have to discuss this to death - can you not, for once, even if you don't
> fully understand why the separate account rule exists, even if you believe
> that it doesn't fully apply to Cadastre rules, can you not simply shrug and
> say "ok, those 50 of us who do large-scale imports will create a separate
> account and that's it"?
>
> I have now spent an hour writing this and I fear it will not help much.
> Endless man-days have been spent by all involved parties arguing their
> point. It is wearing me out but I can't let the French community ignore the
> separate account rule and enforce it for others. The French community is
> part of OSM, uses the same API, same database, same editors; the import
> guidelines are there to safeguard the quality and integrity of all our
> data; we can't make exceptions from them.
>
> What we can do is what I wrote above - try to clarify the line between
> "assisted mapping" and imports. However I don't think that *any* of the
> French contributors who have until now been asked by DWG to create a
> separate account would ever fall within any definition of "assisted
> mapping".
>
>
> Bye
> Frederik
>
> --
> Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"
>
> _______________________________________________
> talk mailing list
> talk at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/talk
>