[Talk-us] A Friendly Guide to 'Bots and Imports

Katie Filbert filbertk at gmail.com
Fri Aug 6 16:59:02 BST 2010


On Fri, Aug 6, 2010 at 9:11 AM, Serge Wroclawski <emacsen at gmail.com> wrote:

> Moving away from discussions of specific imports, I'd like to explore
> what people think about a few areas of this discussion:
>
> 1) When someone says "I want to import X", what should our first response
> be?
>

The nature of OSM with few rules (compared to say the many rules on
Wikipedia) is appealing in some aspects and I don't want to see OSM become
burdened with so many rules.

At the same time, we might learn some lessons from how Wikipedia handles
bots...

1) Anyone that wants to run a bot or new tasks for an existing bot
(automated or semi-automated tasks) must submit a request to the bot
approval group (BAG). Others are free to comment on the request, in addition
to BAG.

2) You explain what the bot will be doing.  The BAG assesses whether it's a
good idea and gives constructive feedback.

3) Bot operators are encouraged to share the code, at least with BAG, but
ideally make it open source so others can review it.

4) The bot then goes through a trial (e.g. doing 50 edits).

5) The bot runs on a separate account from the user's normal account.  The
bot account is flagged, so it's hidden by default from Special:RecentChanges
and gets higher API rate limits.

The bot's user page has information on who's running the bot, what it's
doing, and how to contact the bot operator, plus a bot shutoff button that
anyone can use if the bot goes AWOL.  The bot operator needs to be
responsive.
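
As a rough sketch of how the trial limit in (4) and the shutoff button could
fit together in a bot runner (every name here is hypothetical; this is not
how pywikipedia or any OSM tool actually does it):

```python
# Hypothetical sketch: a bot runner that honors a trial edit cap and a
# shutoff switch. None of these names come from pywikipedia or the OSM API.
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class TrialBot:
    operator: str                      # who to contact about the bot
    task: str                          # what the bot has been approved to do
    is_shut_off: Callable[[], bool]    # e.g. checks a flag on the bot's user page
    trial_limit: int = 50              # number of edits allowed during the trial
    applied: List[str] = field(default_factory=list)

    def run(self, edits: Iterable[str], apply_edit: Callable[[str], None]) -> str:
        """Apply edits until done, the trial cap is hit, or the bot is shut off."""
        for edit in edits:
            # Check the shutoff switch before every edit, not just at startup.
            if self.is_shut_off():
                return "shut off"
            if len(self.applied) >= self.trial_limit:
                return "trial limit reached"
            apply_edit(edit)
            self.applied.append(edit)
        return "finished"
```

The point of checking the switch before each edit is that anyone can stop a
misbehaving bot mid-run without waiting for the operator to respond.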

http://en.wikipedia.org/wiki/Wikipedia:Bots

Certainly not all bots and imports are bad, but I would be happy to have
such careful attention and review for OSM bots and imports to help ensure
the task is suitable, the bot works properly, and is not disruptive or
harmful to the community.


> 2) When someone points out a widespread problem (such as the Salt Lake
> City addresses), how do we want to proceed?
>

I'm not totally convinced it's effective, but Wikipedia handles disputes and
issues with "requests for comments" and tries to reach consensus.  For
something like the addresses, there may not be 100% consensus, but say,
3/4 agreement would be good, making compromises as necessary to get there.

http://en.wikipedia.org/wiki/WP:RFC

Things can escalate from there, if necessary.  For OSM, we tend to discuss
things on the mailing list, and we may want to do things differently.  Not
sure what's best.


> 3) Is it better to discourage bots and imports (as we do currently) or
> better to heavily document bots and set up standardized methods? (and
> do people think those methods will be used?)
>

See above (1).

Furthermore, Wikipedia users have gone as far as to create bot frameworks
(pywikipedia) that are well-tested and there are tools (e.g.
autowikibrowser) for semi-automated edits.

For OSM, something else we ought to do better with is using the dev API
server (http://api06.dev.openstreetmap.org/).  Last I knew, it's not
populated with data except what individuals put in it.  It would be great
if the dev server were instead a full, up-to-date mirror of OSM that people
could use to test imports and semi/fully automated edits.  I think this is
especially important since, unlike setting up MediaWiki, it's not so simple
for individuals to set up their own OSM stack.
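
One small way tools could encourage this, sketched below, is to make the dev
server the default target so that writing to the live database takes a
deliberate extra step. The function and its parameters are hypothetical, and
the live host name is my assumption:

```python
# Hypothetical sketch: default to the dev API server, and require an explicit
# approval flag before a script is pointed at the live database.
DEV_API = "http://api06.dev.openstreetmap.org"
LIVE_API = "http://api.openstreetmap.org"  # assumed production host


def api_base(live: bool = False, approved: bool = False) -> str:
    """Return the API base URL a bot or import script should talk to."""
    if not live:
        return DEV_API
    if not approved:
        # Refuse to hand out the live URL until the task has been reviewed.
        raise PermissionError("test on the dev server and get review first")
    return LIVE_API
```

A script that calls api_base() with no arguments can only ever touch the dev
server, which is the behavior you want while an import is still being tested.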

The more testing and the more eyes on bots and imports, the better: bad
bots and imports get weeded out, and the good, useful ones can proceed.


> 4) In the US, what (if any) role should OSM US play in imports?
>
>
Not sure it needs to be OSM US specifically, but having a staging area (e.g.
to store copies of imported data, in both original and OSM format?) and a
good development server for testing are important.


>
> And now my .02:
>
> 1. I think the first reactions to a request to import should be
> something that outlines the danger to OSM of importing. That's the
> guide this thread talks about. We want to instill on the user the
> potential pitfalls and encourage them to work with the community-
> maybe even discovering that the data set was known previously and not
> imported for a reason.
>

Community feedback is indeed important.


> 2. I think widespread "bot fixes" should be encouraged to wait 10
> days. It's just too easy to make a large change and too hard to fix
> it. I'd also suggest that we (as a community) develop tools to make it
> easier to demonstrate what an import or bot would do on a test server.
>
> Imagine I want to fix all the streets in Cleveland. I could spin up an
> instance of Cleveland as of a certain time, apply my changes to that
> test site, and show it off to the large community, soliciting
> feedback.
>

Agree.


>
> This isn't really feasible right now using existing OSM methods.
>
> 3. I think imports and bots are inevitable, so the more documented we
> make the process, the less we encourage people to go wild and write
> their own. At the same time, we want to discourage bots and imports in
> general.
>

I agree to some extent about discouraging bots and imports, but I also
realize that in some cases bots and imports can be done well and be
beneficial.

At the same time, I see cases of good imports like the DC GIS data, where we
can get them involved in OSM.  Bring them into the community to the
extent possible, get DC GIS folks out to mapping parties, etc. Also, work
out a way to get updates from them and a procedure for getting the updates
into OSM.

Or what about getting USGS involved in OSM?  Do we want to do that and, if
so, how?

Oftentimes, OSM users come from a geography or GIS background, and they are
a good pool of people that we should encourage to get involved with OSM (if
they have data to contribute, great).


> 4. I think OSM US can play a significant role in two ways. I think the
> organization can help by working with governments to make data sets
> available. And I think it could possibly help with some equipment and
> infrastructure. Those are why I'm involved in OSM US now, and (blatant
> plug) why I'm running for office on the next board.
>

I think it's best for local OSM volunteers (constituents) to seek out the
data sets and partnerships with governments and other organizations and to
attend meetings, since governments may take requests from their own
constituents more seriously.  If a local volunteer wants some "official"
chapter support (e.g. a letter of support), fine, but it's not necessary.


>
> At the same time, I think the process needs to be bottom-up community
> driven.
>

Absolutely.

The US chapter should be only in a support role and only when the community
requests support.


>
> - Serge

-Katie

-- 
Katie Filbert
filbertk at gmail.com
@filbertkm