[Talk-GB] [Imports] Importing fuel stations in UK and future similar imports
ilya at zverev.info
Fri May 12 16:08:15 UTC 2017
First, I was amazed at the response. Thanks for the constructive feedback, which I answer below, and no thanks for the toxic responses, including asking for money (what money? We, as in maps.me, get none out of this) and imposing impossible restrictions (manually investigate the context for each of thousands of points?). No import is perfect, and I cannot make this one good enough for everyone. But it looks pretty okay to me.
I have refined the tag processing script: removed name tags, changed postal_code to addr:postcode, and formatted phone numbers according to the Wikipedia table. "Navads" is appended to the source tag if present. I am not sure if I should add the brand:wikidata=Q154950 tag, and for now decided against that.
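To illustrate the tag processing described above, here is a minimal sketch. It is not the actual script: the function names are hypothetical, the phone formatting is a simplification that only normalises the country prefix and does not reproduce the grouping from the table, and joining source values with a semicolon is an assumption.

```python
import re


def format_uk_phone(raw):
    """Simplified normalisation to a +44 prefix (assumption: no digit grouping)."""
    digits = re.sub(r"\D", "", raw)
    if digits.startswith("00"):
        # International call prefix, e.g. 0044...
        digits = digits[2:]
    if digits.startswith("44"):
        # Country code already present
        digits = digits[2:]
    # Drop the national trunk prefix "0"
    digits = digits.lstrip("0")
    return "+44 " + digits


def process_tags(tags):
    """Return a cleaned copy of one record's tags (hypothetical helper)."""
    out = dict(tags)
    # Name tags are removed
    out.pop("name", None)
    # postal_code becomes addr:postcode
    if "postal_code" in out:
        out["addr:postcode"] = out.pop("postal_code")
    if "phone" in out:
        out["phone"] = format_uk_phone(out["phone"])
    # "Navads" is appended only when a source tag is already present
    if out.get("source"):
        parts = [p.strip() for p in out["source"].split(";")]
        if "Navads" not in parts:
            parts.append("Navads")
        out["source"] = ";".join(parts)
    return out
```

For example, process_tags({"name": "Shell", "postal_code": "AB1 2CD"}) would return only the addr:postcode tag, since the name is dropped and there is no source tag to extend.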
You can see the updated result here: http://bl.ocks.org/Zverik/raw/ddcfaf2da25a3dfda00a3d93a62f218d/ (with OpenStreetMap and Satellite layers).
Also I have started the wiki page: https://wiki.openstreetmap.org/wiki/Navads_Imports
* Geocoding and accuracy: as I see on the map, all points in the dataset are placed properly on top of the fuel stations. Compared with OSM data, the error is mostly within 10 meters. I will ask NavAds for the coordinate source for further datasets, but since most points are already in OSM, I think that would fall into the "fair use" clause. In this import, only 125 points are added as unmatched.
* Other fuel stations inside 50 meters: I have found only one instance where the brand was changed. It is here: https://goo.gl/maps/9GLTVg1EWR82 . The Street View from 2015 shows the BP station, but the map lists both BP and Shell. I assume the fuel station was taken over in the past two years.
Then I filtered fuel stations with ref_distance > 30 meters (there are eight) and placed them on satellite imagery. It looks like all of these are correct, and the big distances come from placement errors in OSM.
* Official information vs on the ground: five objects have their opening hours changed. I assume Shell knows how their fuel stations work. Regarding other tags, only phone and addr:postcode replace OSM values (11 and 9 changed, respectively); other tags, including operator, are preserved. In Frederik's hypothetical example, the number of rooms would be added only if there is no such tag on the already existing hotel.
* Freshness: Navads will update the data when Shell provides an update. It is as fresh as can be, but your changes to OSM won't be overwritten: if you see that opening hours have changed, do update them. By the way, Robert's example about a mismatch between opening hours on the Shell website and in the data is incorrect: I checked it and they match.
* Five Ways Roundabout issue: I have forwarded that to NavAds. Also I asked them about links to branches (I cannot find any on the Shell website though) and names.
* "The general view seems to be against IDs like this": what happened to the principle "any tags you like"? Did we saturate the key space, and are we no longer accepting new keys? Can I read that "general view" documented anywhere? The "ref:navads_shell" key is the only one that is not verifiable on the ground, and it is clearly added so that further updates do not have to rely on matching.
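The matching and merge rules from the points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the import: all function names are hypothetical, the haversine formula stands in for whatever distance calculation is really used, and the set of overriding keys is my reading of the rules (opening_hours, phone and addr:postcode replace existing values; everything else is only added when missing).

```python
import math

# Keys whose imported value replaces the existing OSM value (assumed set)
OVERRIDE_KEYS = {"opening_hours", "phone", "addr:postcode"}
# Matches farther apart than this get a manual look on imagery
REVIEW_DISTANCE_M = 30


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a sphere of radius 6371 km."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def needs_review(osm_point, import_point):
    """True when a matched pair of (lat, lon) points is more than 30 m apart."""
    return haversine_m(*osm_point, *import_point) > REVIEW_DISTANCE_M


def merge_tags(osm_tags, import_tags):
    """Add missing tags; replace existing values only for OVERRIDE_KEYS."""
    merged = dict(osm_tags)
    for key, value in import_tags.items():
        if key not in merged or key in OVERRIDE_KEYS:
            merged[key] = value
    return merged


def index_by_ref(osm_objects, ref_key="ref:navads_shell"):
    """Map ref value -> tags, so later updates can skip spatial matching."""
    return {tags[ref_key]: tags for tags in osm_objects if ref_key in tags}
```

With these rules, merge_tags({"rooms": "60"}, {"rooms": "49"}) keeps rooms=60, which is the behaviour described for the hotel example, and an existing operator value is never replaced.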
> On 12 May 2017, at 1:22, Frederik Ramm <frederik at remote.org> wrote:
> On 05/11/2017 05:39 PM, Ilya Zverev wrote:
>> Together with the NavAds company, we plan to import a thousand Shell
>> fuel stations to the United Kingdom. The source is official, which
>> means, Shell company specifically shared the dataset to put them on
>> maps. Do you have any objections or questions?
> There are a couple other "we make your business visible on the map"
> SEO-type businesses active in OSM, some better, some worse.
> Typical problems include:
> * Geocoding. We will want to know how the lat,lon pairs they use for
> import have been generated. Sometimes the "official" source will
> actually be based on measured GPS coordinates (good). Sometimes the
> "official" source has simply geocoded their address with Google or HERE
> (not admissible, license violation). Sometimes they have geocoded their
> address with OpenStreetMap which is also bad because it can reinforce
> errors or imprecisions - for example, if OSM has an address
> interpolation range along a street, and a POI is placed with a specific
> address at the computed interpolation point, then it looks like a
> precise address but isn't.
> * Ignoring the area around the imported information. We want imports to
> match the existing data; automatic conflation is often not enough. A POI
> can end up in a house, a lake, or in the middle of a road, and if that
> is not just a one-off but a systematic problem (of the "let's dump our
> stuff into OSM and the community can then fix it" kind) then it is
> reason enough to revert the whole import and ask the importer to go back
> to the drawing board.
> * Mismatch between "official" data and reality. Especially for larger
> chains it can easily happen that the company database doesn't reflect
> reality on the ground, either through an error or because the reality on
> the ground is somehow undesirable. For example, a hotel might be in the
> chain's official database with 49 rooms because local regulations tighten
> for hotels of 50 rooms and up, but everyone knows in practice that the
> hotel has 60 rooms and this is mapped in OSM. We wouldn't want
> "official" data to overwrite what we have in OSM.
> * Advertising. Some SEO companies go as far as putting advertising
> messages in note tags, or invent new tags to describe the business in
> the most colourful terms. While such advertising may occasionally be
> factually correct ("family-owned since 1948"), we're usually not
> interested in that.
> These are *general* comments.
> Looking at your proposed import specifically,
> 1. I'd be interested in the geocoding source as per the first bullet
> point above. Of course this is only relevant to newly added POIs.
> 2. It seems to me that you're setting brand, operator, and occasionally
> even name to "Shell". In Germany, the operator of a fuel station is
> usually a small local business that has a franchise relationship with
> the fuel company. Are you absolutely sure that Shell is actually the
> operator of all these fuel stations? (It should be easily visible on a
> receipt you get there.)
> 3. Also, in those cases where no name was set before and you put
> "name=Shell", are you absolutely sure that it's not "name=Joe's Garage"
> or something? Would such a situation be correctly recorded in your data?
> I notice that e.g. the station in Dursley with the ID NVDS298-10019092
> is a proposed import with "name=Shell", whereas Shell's own station
> locator lists this as "Millwood Motor Company Limited".
> 4. I would also recommend not plastering the www.shell.co.uk URL all
> over the place - if the *individual* fuel station doesn't have a web
> site then it's not worth pointing to the Shell corporate site IMHO.
> 5. New stations look generally well placed compared to aerial imagery (I
> only looked at a random sample) but the second bullet point above is an
> issue; for example you have placed a new fuel station at the correct
> location but in the middle of a school campus
> - a high quality import would have a human verify the situation, detect
> the issue, and reduce the school grounds accordingly (or maybe call the
> chain and ask if this is some special kind of training fuel station...)
> Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"