[HOT] Improving the quality of OpenStreetMap Data

Samson Ngumenawe samson.ngumenawe at hotosm.org
Tue Mar 7 19:39:20 UTC 2023


I am very thankful for the pool of ideas coming in via this thread on the
mailing list, and I will be incorporating them into my data quality
efforts at HOT. However, I want to clarify some of the issues and ideas:
1. HOT is not working directly on developing a buildings tool in iD
Editor. However, as a way forward, I will be engaging with the iD Editor
development team to discuss the possibility of adding one, since iD is
the editor most used by beginner mappers to trace buildings.
2. Ideas that fall under HOT's direct responsibility, such as a Tasking
Manager notification before a task's mapping period expires, will be
forwarded to the Tasking Manager development team at HOT.
3. As Rob mentioned, the data team is collaborating with the technology
and innovation team at HOT to integrate the Underpass tool into the
Tasking Manager to automate live data quality checks. For more details,
check the recent blog post
(https://www.hotosm.org/updates/hot-data-quality-updates/) to read about
how the tool will look and about other data quality efforts.
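
To make the exact-duplicate case from this thread concrete (the same way
with the same tags at the same position), here is a minimal Python sketch.
It is only an illustration under stated assumptions, not how Underpass or
the Tasking Manager performs the check: the input format (a list of dicts
with `id`, `coords`, and `tags`) and the function names `canonical_ring`
and `find_exact_duplicates` are hypothetical, and coordinates are compared
for exact equality with no tolerance.

```python
from collections import defaultdict


def canonical_ring(coords):
    """Return a form of a closed ring that ignores start node and direction."""
    ring = list(coords)
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring = ring[:-1]  # drop the repeated closing node
    if not ring:
        return ()
    candidates = []
    for seq in (ring, ring[::-1]):
        start = seq.index(min(seq))  # rotate so the smallest node comes first
        candidates.append(tuple(seq[start:] + seq[:start]))
    return min(candidates)  # pick one of the two directions deterministically


def find_exact_duplicates(ways):
    """Group way ids that share identical geometry and identical tags."""
    groups = defaultdict(list)
    for way in ways:
        key = (canonical_ring(way["coords"]), frozenset(way["tags"].items()))
        groups[key].append(way["id"])
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data: ways 1 and 2 are the same building, way 3 has
# the same outline but a different tag, way 4 is a different outline.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)]
rotated = [(1.0, 1.0), (1.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(5.0, 5.0), (5.0, 6.0), (6.0, 6.0), (6.0, 5.0), (5.0, 5.0)]
sample_ways = [
    {"id": 1, "coords": square, "tags": {"building": "yes"}},
    {"id": 2, "coords": rotated, "tags": {"building": "yes"}},
    {"id": 3, "coords": square, "tags": {"building": "house"}},
    {"id": 4, "coords": shifted, "tags": {"building": "yes"}},
]

print(find_exact_duplicates(sample_ways))  # [[1, 2]]
```

Because the key ignores which node the ring starts at and its direction,
a check like this would catch a re-uploaded copy of the same building but
not a building traced twice by hand with slightly different nodes. That
second case needs a geometric overlap test instead.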

Regards.

On Tue, Mar 7, 2023 at 3:12 PM <hot-request at openstreetmap.org> wrote:

> Today's Topics:
>
>    1. Re: Improving the quality of OpenStreetMap Data (Mike Thompson)
>    2. Re: Improving the quality of OpenStreetMap Data (john whelan)
>    3. Re: HOT Digest, Vol 157, Issue 3 (Rob Savoye)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 6 Mar 2023 15:10:31 -0700
> From: Mike Thompson <miketho16 at gmail.com>
> To: John Whelan <jwhelan0112 at gmail.com>
> Cc: Frans Schutz <frans.schutz at gmail.com>, Samson Ngumenawe
>         <samson.ngumenawe at hotosm.org>, hot at openstreetmap.org
> Subject: Re: [HOT] Improving the quality of OpenStreetMap Data
> Message-ID:
>         <
> CALJoUkvvZDZAWeUEPoVM4ar5GG+WHr3k9NWpsg9VT3VsggU8qg at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Mon, Mar 6, 2023 at 1:44 PM John Whelan <jwhelan0112 at gmail.com> wrote:
>
> > There are two issues the first is an exact duplicate I think these could
> > be handled by a bot, if the way and tags are the same at the same
> > position.
> >
> Agree, if the community consensus supports a bot, it should be created and
> run, but it needs to be discussed.  There might be some pitfalls.
>
> >
> > The second is pure bad mapping where a building can be mapped two or
> > three times.
> >
> We should determine the root cause.  I suspect the task in the Tasking
> Manager has expired and has been assigned to a second mapper, coupled
> with people not saving their work on a frequent basis.
>
>
> > It's rare that these buildings are squarely mapped.  Some mappers are
> > given the advice ignore what is there and just map.
> >
> This has been my experience.  Not sure how to solve it.
>
> Mike
>
> >
>
> ------------------------------
>
> Message: 2
> Date: Mon, 6 Mar 2023 19:03:37 -0500
> From: john whelan <jwhelan0112 at gmail.com>
> To: Benjamin Herfort <herfort at uni-heidelberg.de>
> Cc: hot at openstreetmap.org
> Subject: Re: [HOT] Improving the quality of OpenStreetMap Data
> Message-ID:
>         <
> CAJ-Ex1Ef5QEJBrZnMiFEnCbzpk9G6y8edoTyc-Yf3scQAzb5gg at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> To gain credibility I think it is important to look professional.
>
> https://www.jatws.org/johnw/hot/dupbuilding3.jpg
> Here we see a couple of iD-mapped buildings in a HOT project, actually
> mapped within 3 mins of each other.
>
> Duplicate buildings, the mapped outline sort of follows the building. You
> want me to trust this map?
>
> It's important to keep the message simple.  Two pages of technical
> information is information overload for many mappers, especially new ones.
>
> I suspect saying you must use a mouse rather than relying on a
> touchpad would increase accuracy even in iD and make the map look more
> credible.
>
> An instruction saying do not delete anything to armchair mappers would save
> buildings etc being deleted by mistake because they aren't on a three year
> old image.
>
> I'd like to say use a tool to map buildings but that might be too much for
> a mapathon organiser to arrange.
>
> I think Samson makes too much of Spatial Offsets.  We are mapping for the
> most part from satellite imagery.  The big birds have a basic raw imagery
> of 60 meters and they are the most accurate ones.  All the imagery we use
> has been adjusted.  Realistically Bing adjustment is the usual standard OSM
> uses and it's not that far out.  You can get high precision GPS to get cm
> accuracy but it's not cheap.
>
> If we can say there is a building and we're correct to within 10 meters I
> think that's as good as we can expect.  The relative positions of buildings
> will be better than this.
>
> On the armchair mapping side it's basically settlements (buildings) and
> highways to get there that are important.  Don't expect an armchair mapper
> to add road names although some can be quite creative.
>
> We've probably mapped less than a quarter of the buildings we can see on
> imagery.
>
> Mike made a comment that some mappers' sessions may have timed out and the
> tile been released to another mapper.  Yes, but often I see weeks between
> the duplicates in history, and in the example above there was only 3 mins
> difference.  Some sort of pop-up reminder saying "your time is up, save
> now" twenty minutes before the tile is released in the Tasking Manager is
> within HOT's control.
>
> https://www.jatws.org/johnw/hot/dupbuilding.jpg is an example of a
> building mapped three times.
>
> Then you get to who is the target audience of his message?
>
> If you post here it's basically armchair mappers.
>
> To enrich the data with highway names, clinics etc you need boots on the
> ground or access to data that can be imported.  I could be wrong but I
> suspect these boots on the ground mappers don't read this mailing list.
>
> We need to identify what we can do to support boots on the ground.
> StreetComplete on a phone perhaps?  They need processes and procedures.
> These may have to be offline.  JOSM should be able to work offline; drop
> the imagery on a Raspberry Pi.  It just needs the processes working out.
>
> So yes the documents are fine but they don't identify the simple things we
> can do to build the foundations.
>
> I think the thread has at least identified a number of problems; if they
> can be solved, perhaps by training, that would be good.
>
> Cheerio John
>
>
>
> On Mon, 6 Mar 2023 at 17:00, Benjamin Herfort <herfort at uni-heidelberg.de>
> wrote:
>
> > Hey,
> >
> > while I agree with you that for building mapping there might be some
> > potential improvements on the editors' side, I also think that this
> > blurs a bit the many other valid points that Samson highlighted in his
> > mail.
> >
> > In fact, the aspect I support most strongly in his summary is the notion
> > that data quality concerns much more than "just" the buildings and spans
> > many different dimensions (such as accuracy of the geometrical footprint,
> > but also completeness and the level of detail for the attributes, ...).
> >
> > Coming from that perspective I would find it interesting to further
> > discuss which other ideas you have for how we can improve OSM in general,
> > for instance in regard to the mapping of place names, amenities, road
> > attributes, health care data etc.
> >
> > Kind regards,
> >
> > Benni
> >
> >
> > On 06.03.23 21:44, John Whelan wrote:
> >
> > There are two issues the first is an exact duplicate I think these could
> > be handled by a bot, if the way and tags are the same at the same
> > position.
> >
> > The second is pure bad mapping where a building can be mapped two or
> > three times.  It's rare that these buildings are squarely mapped.  Some
> > mappers are given the advice ignore what is there and just map.
> >
> > Not an easy one to solve but at least they can be picked out with the
> > right tools.  They aren't worth running on one tile of a project though.
> >
> > The bigger problem is validator fatigue, they know it takes 3 clicks to
> > map a building using a tool, then you ask them to sort out someone else's
> > mess and it takes more than 3 clicks to sort out one building.
> >
> > Cheerio John
> >
> > Mike Thompson wrote on 3/6/2023 3:36 PM:
> >
> > Hi John,
> >
> > While I agree with you that we should get a building tool into the hands
> > of all mappers one way or the other, I wonder if the problem of exact
> > duplicate buildings is due to poor network connectivity (client thinks
> > upload failed, so upon the next upload the same data is uploaded a second
> > time).
> >
> > Mike
> >
> > On Mon, Mar 6, 2023 at 10:26 AM John Whelan <jwhelan0112 at gmail.com>
> wrote:
> >
> >> I'm thinking of putting myself up as the unmapper of the year.  Over the
> >> last few days I've deleted more than 500 duplicate buildings. They're
> >> actually quite easy to spot with the right tools.
> >>
> >> My favourite of the week is the mapper who retagged highway=residential
> >> as road=yes.  Very difficult to spot.  Now if he had been taught only to
> >> use the buildings_tool and nowt else the problem wouldn't be there.
> >>
> >> HOT has been after a building tool for iD for many many years.  I don't
> >> think it is ever going to happen.  It could be a limitation in what you
> >> can do with a script in a browser or some other reason.
> >>
> >> I think we need to rely on the business case that for buildings I can
> >> get more buildings mapped with the buildings_tool for the same number of
> >> mouse clicks.  I think it's six including the tag for iD compared to two
> >> or three for the buildings_tool, so you get two to three times as many
> >> buildings out of the same mappers in the same time period; besides that,
> >> the validators and other OSM users will love you.  As a general tool for
> >> an introduction to OSM, making small changes to many different objects,
> >> iD is fine, but for new mappers you need something simpler with fewer
> >> choices.  The more choices they have, the more likely they will choose
> >> the wrong one.
> >>
> >> Cheerio John
> >>
> >> Frans Schutz wrote on 3/6/2023 9:49 AM:
> >>
> >> Hi John, good to see you are still active with mapping.
> >> I agree with your point about the building tool; most mappers I review
> >> use this tool when they map in JOSM.
> >> However, most mappers, mostly new mappers, use iD Editor and think they
> >> can map when they have just read the instructions. They often make a
> >> mess of it, which gives a lot of work to validators. A tool in iD
> >> Editor which produces fairly square buildings would be of great help.
> >>
> >> Kind regards
> >> Frans
> >>
> >> On 6 Mar 2023 at 3:37 PM, John Whelan <jwhelan0112 at gmail.com>
> >> wrote:
> >>
> >> I think the single most important thing HOT can do to improve data
> >> quality is to use the JOSM buildings_tools plugin to map buildings.  It
> >> can be run from a USB stick and works fine with Microsoft OpenJDK, so
> >> you don't need to install JOSM on the machine.  Yesterday I added more
> >> than 200 building=yes tags to untagged ways.  It takes a few more
> >> moments to train but requires fewer mouse clicks per building, and
> >> normally you get at least 50% more buildings mapped out of brand new
> >> untrained mappers in a 45 minute period.  You don't need to train them
> >> on every bit of JOSM, just enough to use the tool and upload.
> >>
> >> Cheerio John
> >>
> >> Samson Ngumenawe via HOT wrote on 3/6/2023 12:36 AM:
> >>
> >> Dear OpenStreetMap Contributors,
> >>
> >> As we are working to ensure that the OpenStreetMap data is of good
> >> quality and fit for purpose, I would like, on behalf of the Humanitarian
> >> OpenStreetMap Team, to reach out to the entire OpenStreetMap community
> >> to help us achieve this goal.
> >>
> >> As you are aware, good-quality data is essential for effective
> >> humanitarian response and disaster management. Accurate and up-to-date
> >> geospatial information is critical in ensuring that aid and assistance
> >> can be delivered to those who need it most. However, maintaining data
> >> quality in a rapidly changing environment is challenging. We recognize
> >> that this is an ongoing process and need support from the OSM
> >> contributing community to help improve data quality continuously.
> >>
> >> At the Humanitarian OpenStreetMap Team, I am working towards
> >> implementing various measures to ensure the data created from remote
> >> mapping, field data, and imports is of good quality. These measures
> >> include data validation, data quality checks and metrics, community
> >> engagement, and partnerships with other organizations like HeiGIT.
> >>
> >> To build a pool of experienced data quality enthusiasts, HOT also
> >> conducts mapping events, internship training programs, and outreach
> >> initiatives to help engage the communities to create awareness and
> >> improve data quality.
> >>
> >> Some of the data quality improvement efforts include:
> >>
> >>    - Top 10 data quality aspects. I have defined our top 10 data quality
> >>      aspects that we are focusing our efforts on
> >>      (https://wiki.openstreetmap.org/wiki/Humanitarian_OSM_Team/top_10_data_quality_aspects)
> >>      to let the community know about the sources of the errors and
> >>      possible ways of how such errors can be addressed. This is the basis
> >>      for a set of data quality metrics
> >>      (https://wiki.openstreetmap.org/wiki/Humanitarian_OSM_Team/Core_Impact_Area_Datasets_,_Use_cases_%26_Data_Quality_Metrics)
> >>      that I am working on and will be implementing to help us track the
> >>      quality of data in the context of the most important data uses in
> >>      humanitarian response and along community priorities. HOT has now
> >>      dedicated data quality staff in all of our Hub teams - Dinar Adiatma
> >>      for Asia Pacific, Shamillah Nassozi in East/Southern Africa, and
> >>      Omowonuola Akintola in West & Northern Africa. Together with the
> >>      regional Hub teams, we are creating regional-specific approaches on
> >>      how to address data quality issues in their local context by
> >>      defining data quality, regional needs, tools that track data quality
> >>      issues, and solving them. Please read and provide feedback about the
> >>      data quality approach
> >>      (https://wiki.openstreetmap.org/wiki/Humanitarian_OSM_Team/Open_Mapping_Hub_-_Asia_Pacific/Data_Quality_Approach)
> >>      for the Open Mapping Hub Asia-Pacific. These will all be based on
> >>      the global Data quality strategy for which I am currently defining
> >>      the strategic objectives for each team that will be collaborating on
> >>      implementing the strategy and soon I will be sharing the draft data
> >>      quality strategy for public review here as well.
> >>
> >>
> >>    - I am working with the Quality Control Working Group to build an
> >>      active team of global data validators
> >>      (https://tasks.hotosm.org/teams/7/membership/) whose efforts are
> >>      incredible in ensuring the quality of remotely mapped data is good.
> >>      From the recent Turkey/Syria mapping activations, the validators
> >>      have played a big role in checking and fixing the errors and
> >>      improved the quality of the data that is being used to provide
> >>      response to the disaster-impacted communities in Turkey and Syria.
> >>      In the current response as well, we are seeing a lot of new and
> >>      inexperienced mappers join. Yes, there are areas where quality is
> >>      not good enough currently and I'm really grateful to everyone that
> >>      is helping us improve and validate map data.
> >>
> >>
> >> I am calling on all OpenStreetMap contributors to help us in this effort
> >> to improve OpenStreetMap data quality continuously and I invite you to
> >> share your expertise, insights, and feedback on how we can work together
> >> to improve & maintain good quality data.
> >>
> >> I am always openly available for a chat/call and in case you have any
> >> feedback that you would like to share with me, do not hesitate to reach
> >> out to me by emailing samson.ngumenawe at hotosm.org or data at hotosm.org
> >>
> >> Your support and contributions are vital in making OpenStreetMap a
> >> reliable and comprehensive resource for humanitarian aid and disaster
> >> response.
> >>
> >> Thank you
> >>
> >> --
> >>
> >> https://unsummit.hotosm.org/
> >>
> >> *Samson Ngumenawe*
> >> Data Quality Coordinator
> >> samson.ngumenawe at hotosm.org
> >> Timezone: UTC+03:00 (Kampala, Uganda)
> >>
> >> *Humanitarian OpenStreetMap Team*
> >> *Using OpenStreetMap for Humanitarian Response & Economic Development*
> >> web <http://hotosm.org/> | twitter <https://twitter.com/hotosm> |
> >> facebook <https://www.facebook.com/hotosm> | donate
> >> <https://donate.hotosm.org/>
> >>
> >>
> >> _______________________________________________
> >> HOT mailing list
> >> HOT at openstreetmap.org
> >> https://lists.openstreetmap.org/listinfo/hot
> >>
> >>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 6 Mar 2023 16:18:19 -0700
> From: Rob Savoye <rob.savoye at hotosm.org>
> To: hot at openstreetmap.org
> Subject: Re: [HOT] HOT Digest, Vol 157, Issue 3
> Message-ID: <4c238a9c-e54c-b517-9e17-b134c8801dfb at hotosm.org>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> On 3/6/23 14:56, hot-request at openstreetmap.org wrote:
> > There are two issues the first is an exact duplicate I think these could
> > be handled by a bot, if the way and tags are the same at the same
> > position.
>
>    HOT is actually working on a tool (Underpass) to find duplicate
> buildings in near real-time, so it will be able to monitor duplicates
> from mapathons pretty easily. It's being built into the Tasking Manager,
> but also runs standalone.
>
>    The tech team is doing a lot of work these days on automating data
> quality, because it is a problem, and it's better to burn CPU cycles
> than brain cells. I know some areas in Africa with quadruple duplicate
> buildings all added by mapathons. :-( Many thanks to the unmappers!
>
>    A lot of duplicates came in because iD didn't display buildings based
> on your zoom level, so people didn't know there was already something
> there. I also prefer JOSM with the building plugin personally.
>
>         - rob -
> --
> Senior Tech Lead
> Humanitarian OpenStreetMap Team
> https://www.hotosm.org
>
>
>
>
> ------------------------------
>
> End of HOT Digest, Vol 157, Issue 4
> ***********************************
>


-- 

https://unsummit.hotosm.org/

*Samson Ngumenawe*
Data Quality Coordinator
samson.ngumenawe at hotosm.org
Timezone: UTC+03:00 (Kampala, Uganda)

*Humanitarian OpenStreetMap Team*
*Using OpenStreetMap for Humanitarian Response & Economic Development*
web <http://hotosm.org/> | twitter <https://twitter.com/hotosm> | facebook
<https://www.facebook.com/hotosm> | donate <https://donate.hotosm.org/>