[Talk-GB] OSM should not be a database dump.
James Derrick
lists at jamesderrick.org
Sat Feb 11 10:20:17 UTC 2023
Hi,
On 10/02/2023 16:33, SK53 wrote:
> Adding lots of potentially out-of-date data to OSM tends to move the
> project from being one of mapping things to one about maintaining a
> somewhat out-of-date database.
+1
We need to think of the lifecycle "cost" of the data collected and
stored and MAINTAINED by OSM.
It's not just about storing "other entities data" in OSM (database
dump), it's about what happens for the next ten years, and the tooling
required.
Using widely consumed data in a consistent manner is absolutely OSM's
goal - so using external databases to validate and enhance our coverage
is great.
Adding limited external "foreign key" reference IDs has some value as it
can assist future maintenance checks - e.g. if a school changes name, or
a takeaway becomes a house (the sort of checks Rob Whittaker's Survey
Me! tool does well - to name but one).
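As a rough illustration of that kind of check, here is a minimal Python
sketch. Everything in it is made up for the example: the element list, the
external records, the ID values and the `find_stale` helper are all
hypothetical, and it assumes the external data can be looked up by the same
ID that is stored in the OSM tag.

```python
# Hypothetical data: neither the record layout nor the IDs come from a real dataset.
osm_elements = [
    {"id": 1, "tags": {"amenity": "fast_food", "name": "Golden Fry", "fhrs:id": "1001"}},
    {"id": 2, "tags": {"amenity": "school", "name": "Hill Top School", "fhrs:id": "1002"}},
    {"id": 3, "tags": {"shop": "supermarket", "name": "Iceland"}},  # no foreign key
]

# Hypothetical external dataset, keyed by the same reference ID.
external_records = {
    "1001": {"name": "Golden Fry"},
    "1002": {"name": "Hilltop Academy"},  # renamed since it was mapped
}


def find_stale(elements, records, key="fhrs:id"):
    """Return (element id, reason) pairs that deserve a survey or armchair check."""
    stale = []
    for element in elements:
        ref = element["tags"].get(key)
        if ref is None:
            continue  # nothing to cross-check without a foreign key
        record = records.get(ref)
        if record is None:
            stale.append((element["id"], "reference missing from external data"))
        elif record["name"].casefold() != element["tags"].get("name", "").casefold():
            stale.append((element["id"], "name differs from external data"))
    return stale


stale = find_stale(osm_elements, external_records)
print(stale)  # [(2, 'name differs from external data')]
```

The element without a reference ID is simply skipped, which is the point:
the foreign key is what makes the automated check possible at all.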
The issue is when the effort required to maintain the OSM data exceeds
the value to OSM consumers.
You can add a shop with two keys, `shop=supermarket` and `name=Iceland` -
works fine.
These days, best practice is to add several other keys referencing external
databases, and my opinion is that it is getting out of hand:
`brand=Iceland`
`brand:wikidata=Q721810`
`brand:wikipedia=en:Iceland (supermarket)`
And that's without `ref=`, `fhrs:id=`, `fhrs:local_authority_id=`,
`branch=` and `contact:website=`.
Suddenly, you are looking at maintaining nine duplicated keys where two
work - a higher barrier to entry, and ongoing maintenance cost.
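To put a rough number on that, here is a small Python sketch using only the
tag values quoted above. The `BRAND_KEYS` tuple and the
`keys_to_edit_on_rebrand` helper are my own assumptions about which keys
carry brand identity; they are not taken from any OSM tool.

```python
# The two-key version that "works fine".
minimal = {"shop": "supermarket", "name": "Iceland"}

# The same shop with the brand cross-references added.
full = {
    **minimal,
    "brand": "Iceland",
    "brand:wikidata": "Q721810",
    "brand:wikipedia": "en:Iceland (supermarket)",
}

# Assumption: these keys all encode the brand and must change together.
BRAND_KEYS = ("name", "brand", "brand:wikidata", "brand:wikipedia")


def keys_to_edit_on_rebrand(tags):
    """List the keys a mapper (or bot) must touch if the brand changes."""
    return [key for key in BRAND_KEYS if key in tags]


print(keys_to_edit_on_rebrand(minimal))    # ['name']
print(len(keys_to_edit_on_rebrand(full)))  # 4
```

One edit per shop becomes four, before counting any of the reference keys.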
Okay, the extras are typically added later by armchair mappers (yes, I
do both survey and armchair by season) but...
Iceland decides to change its brand strategy, and we're into an
automated update to Food Warehouse across all the keys. But not all
branches are moving, and some are closing, so we're looking at a
quarterly project, then a ghost hunt for old Maplin stores...
Don't get me wrong - there is value in performing these extra tasks (and
I use the Chain Reaction tool for the extra references, and really like
look-up tables of brands). We just need to consider the data lifecycle -
is there an API? can we produce a tool? is the data good?
The architect in me wants a unique single ID for each entity, but
postal_code was never it, and UPRN/ UPRI is too encumbered (ironically,
to pay for the cost of maintaining it!)... so we might end up with OSM
being the cross-reference database for the world's separated data - IF
we can MAINTAIN all the foreign keys.
My plea is simply - think about the mapper standing in front of a thing
in the rain, and adding tags. Think about the mapper correcting a
spelling error in an armchair.
Do the tools exist to make the process easy and joyful?
Is it a slog through external data providers getting reference keys
manually?
Does OSM get enough value from multiple external references (say aiding
consistency, maintenance)?
No API, no tools, no prospect of tools, no maintenance, so consider no
import?
We need to think of the lifecycle "cost" of the data collected and
stored and MAINTAINED by OSM.
James
--
James Derrick
lists at jamesderrick.org, Cramlington, England
I wouldn't be a volunteer if you paid me...
https://www.openstreetmap.org/user/James%20Derrick