[Talk-GB] Post-processing shop values (was mechanical edit)

SK53 sk53.osm at gmail.com
Mon Nov 3 12:19:05 UTC 2014


Broadly I'm not in favour of the various mechanical edits suggested.
Bringing together bookmaker tags is harmless, but does little to gather in
the long tail. More extensive edits both deny the undoubted value of
revisiting an area (either in life or from an armchair), and are
unnecessary.

The following code example shows just how much can be done to post-process
OSM data (in this case in an osm2pgsql schema). The normalisation of shop
tags is fairly comprehensive; the normalisation of name includes only one
partial example. This has been done in code, but could easily be implemented
using one or more look-up tables:

-----------------------------------------------------------------------------------------------------

select 'N'::char(1) osm_ele_typ, osm_id
, shop raw_shop
, name raw_name
, tags->'brand' raw_brand
, "addr:housenumber" addrno
, "addr:housename" addrnm
, tags->'addr:street' addrst
, tags->'addr:postcode' addrpc
, case when shop in ('betting','bookies','bet') then 'bookmaker'
       when shop in ('wine','off_licence') then 'alcohol'
       when shop in ('tanning','salon','tanning_salon','beautician') then 'beauty'
       when shop in ('closed','empty','disused') then 'vacant'
       when shop in ('food') then 'grocery'
       when shop in ('kiosk','newsagents') then 'newsagent'
       when shop in ('building_supplies') then 'builders_merchant'
       when shop in ('angling') then 'fishing'
       when shop in ('funeral_director') then 'funeral_directors'
       when shop in ('cobbler') then 'shoe_repair'
       when shop in ('glaziery','windows','glazing') then 'glazier'
       when shop in ('golf') then 'sport'
       when shop in ('bridal') then 'wedding'
       when shop in ('fish','seafood','fishmonger') then 'fish_and_seafood'
       when shop in ('baby') then 'baby_goods'
       when shop in ('chocolate','sweet') then 'confectionery'
       when shop in ('chandlery') then 'chandler'
       when shop in ('travel agent','travel_agent','travel_agents','travel') then 'travel_agency'
       when shop in ('Charity_Shop','Charity') then 'charity'
       when shop in ('records') then 'music'
       when shop in ('laundrette') then 'laundry'
       when shop in ('boutique') then 'clothes'
       when shop in ('deli') then 'delicatessen'  -- in deference to Serge & NYC delis
       when shop in ('dry_cleaner','dry_cleaners','dry cleaners','drycleaner') then 'dry_cleaning'
       else shop
       end::varchar(255) shop
  , case when name in ('Co-op','Co-operative','coop','Coop') then 'The Co-operative Food'
       else name end::varchar(255) as "name"
, st_setsrid(st_transform(way,4326),4326) as wgs_geom
from gb1408_point
where shop not in ('mall','market','shopping_centre')

-----------------------------------------------------------------------------------------------------

Using a look-up table would enable new values to be added easily (and
possibly allow crowd-sourcing of correct values).
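
As a rough sketch of the look-up table idea, something like the following
might replace the shop CASE expression. The table name shop_lookup and its
columns raw_value and norm_value are invented here for illustration, and
only a handful of the mappings from the query above are loaded:

```sql
-- Hypothetical look-up table: one row per raw shop value that needs mapping.
create table shop_lookup (
  raw_value  text primary key,
  norm_value text not null
);

insert into shop_lookup (raw_value, norm_value) values
  ('betting',     'bookmaker'),
  ('bookies',     'bookmaker'),
  ('bet',         'bookmaker'),
  ('wine',        'alcohol'),
  ('off_licence', 'alcohol'),
  ('deli',        'delicatessen');

-- Values with no row in the table fall through unchanged,
-- which mirrors the "else shop" branch of the CASE expression.
select p.osm_id
     , p.shop as raw_shop
     , coalesce(l.norm_value, p.shop)::varchar(255) as shop
from gb1408_point p
left join shop_lookup l on l.raw_value = p.shop
where p.shop not in ('mall','market','shopping_centre');
```

Adding a new mapping then becomes a single INSERT rather than a change to
the query itself.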

I haven't added other things to this SQL, such as grouping shops into higher
level categories (see examples on Will Philips' osm-nottingham site), or
adding brand and format (e.g., Tesco Extra, Metro, Express). Each can
follow the same approach, and can in turn be managed as data not code.
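
A minimal sketch of how categories and brand/format could be held as data,
under the same assumptions as the main query. The tables shop_category and
shop_brand, their columns, and the category names are all made up for
illustration; they are not taken from the osm-nottingham site:

```sql
-- Hypothetical table: maps a shop value to a broader category.
-- The category names are illustrative only.
create table shop_category (
  shop     text primary key,
  category text not null
);

insert into shop_category (shop, category) values
  ('grocery',          'food'),
  ('fish_and_seafood', 'food'),
  ('clothes',          'fashion'),
  ('wedding',          'fashion');

-- Hypothetical table: splits a raw name into brand and store format.
create table shop_brand (
  raw_name     text primary key,
  brand        text not null,
  store_format text
);

insert into shop_brand (raw_name, brand, store_format) values
  ('Tesco Extra',   'Tesco', 'Extra'),
  ('Tesco Metro',   'Tesco', 'Metro'),
  ('Tesco Express', 'Tesco', 'Express');

-- Left joins keep every shop, with NULLs where no category or brand is known.
select p.osm_id
     , p.shop
     , p.name
     , c.category
     , b.brand
     , b.store_format
from gb1408_point p
left join shop_category c on c.shop = p.shop
left join shop_brand b on b.raw_name = p.name
where p.shop is not null;
```

In practice the category join would probably be made on the normalised shop
value rather than the raw one.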

This SQL effectively does everything that mechanical edits do, and there is
no danger that information provided by the original mapper is lost (as
might be the case with over-zealous harmonising).

The key to better shop data is actually to map many more shops.
There will be no serious interest in the shop data in aggregate (rather
than individual POIs) until it is much more complete. Improving the
guidance on the wiki is a good way to go: we've already seen many comments
over the past few weeks which have refined what we already have. I hope we
can keep up the discussion here and on the wiki.

Regards,

Jerry

PS. If anyone has ideas on how the approach shown in SQL above might be
implemented in Lua, do pitch in.