[OSM-talk] Pictures of opening hours signs for machine learning purposes

Bryce Cogswell bryceco at yahoo.com
Mon Apr 5 01:55:08 UTC 2021


I looked into doing something like this a few years ago using iOS with its ML support. I collected only about 30 photos to play with, as I intended to do a conventional OCR conversion rather than train a model from scratch.

I used Apple's VNDetectTextRectanglesRequest to extract the locations of text in the image and then piped those regions through the open source Tesseract OCR library to extract the actual text. What I found was that Tesseract recognized the text far too poorly to be useful. I believe the issue is that Tesseract is trained on fonts that generally appear in pages of prose, so basically Helvetica/Times Roman. The fonts used on opening hours boards of businesses are typically more stylized or block letters that didn’t fit the default Tesseract model. I was too lazy to train my own model for the range of fonts I was seeing in the wild.

However, based on your post I just investigated and see that Apple has a new API starting in iOS 13 that does the OCR component as well. Testing that on a few of my photos (see below) shows that it does a perfect job of extracting the text from images that Tesseract failed on. Based on that I’ll probably start working on a higher-level processing stage to extract the days/hours from the raw text, which seems like a pretty simple proposition. It would be great to use the photos you have as a starting point to see what level of coverage is feasible with a larger sample size.
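
As an illustration of what such a higher-level stage might look like, here is a minimal Python sketch (not the Go Map!! code, which does not exist yet). It assumes the OCR returns one sign row per text line, that day names are German or English, and that times look like "8.00 - 12.00", "08:00-18:30" or "9-16". The names line_to_rule and ocr_text_to_opening_hours are made up for this example, and rows where the days and the times end up on separate OCR lines are simply skipped.

```python
import re

# Assumed day vocabulary (German and English), mapped to OSM abbreviations.
_DAY_WORDS = {
    "Mo": ["montag", "monday", "mon", "mo"],
    "Tu": ["dienstag", "tuesday", "tue", "di", "tu"],
    "We": ["mittwoch", "wednesday", "wed", "mi", "we"],
    "Th": ["donnerstag", "thursday", "thu", "do", "th"],
    "Fr": ["freitag", "friday", "fri", "fr"],
    "Sa": ["samstag", "saturday", "sat", "sa"],
    "Su": ["sonntag", "sunday", "sun", "so", "su"],
}
_LOOKUP = {word: osm for osm, words in _DAY_WORDS.items() for word in words}
_DAY = "|".join(sorted(_LOOKUP, key=len, reverse=True))

# A single day ("Samstag") or a day range ("Mo-Fr", "Mo bis Fr").
_DAY_SPEC = re.compile(
    rf"\b({_DAY})\b\.?(?:\s*(?:-|–|bis|to)\s*\b({_DAY})\b\.?)?", re.IGNORECASE
)
# A time range such as "8.00 - 12.00", "08:00-18:30" or "9-16".
_TIME_RANGE = re.compile(
    r"(\d{1,2})[:.h]?(\d{2})?\s*(?:-|–|bis|to)\s*(\d{1,2})[:.h]?(\d{2})?"
)
_CLOSED = re.compile(r"geschlossen|closed|ruhetag", re.IGNORECASE)


def line_to_rule(line):
    """Turn one OCR text line into an opening_hours rule, or None."""
    days = []
    for m in _DAY_SPEC.finditer(line):
        first = _LOOKUP[m.group(1).lower()]
        if m.group(2):
            days.append(f"{first}-{_LOOKUP[m.group(2).lower()]}")
        else:
            days.append(first)
    if not days:
        return None
    times = [
        f"{int(h1):02d}:{m1 or '00'}-{int(h2):02d}:{m2 or '00'}"
        for h1, m1, h2, m2 in _TIME_RANGE.findall(line)
    ]
    if times:
        return f"{','.join(days)} {','.join(times)}"
    if _CLOSED.search(line):
        return f"{','.join(days)} off"
    return None


def ocr_text_to_opening_hours(text):
    """Join the rules found on all lines; lines without a rule are ignored."""
    rules = [rule for rule in map(line_to_rule, text.splitlines()) if rule]
    return "; ".join(rules)


sample = "Öffnungszeiten\nMo-Fr 8.00 - 12.00, 13.30-18.30\nSamstag 9-16 Uhr\nSo geschlossen"
print(ocr_text_to_opening_hours(sample))
```

Short words like "do", "we" and "so" are also ordinary English words, so a real implementation would need more context than this before trusting a match.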

Ultimately this work would be integrated into Go Map!!

https://share.icloud.com/photos/0L4lpDAcNQNb-KLk_h3m09f0w

Bryce



> On Apr 4, 2021, at 1:02 PM, Toggenburger Lukas <Lukas.Toggenburger at fhgr.ch> wrote:
> 
> Hi all!
> 
> tl;dr
> 
> I have taken about 800 to 1000 pictures of opening hours signs of shops, which I am willing to share, e.g. for machine learning purposes.
> 
> 
> Long story
> 
> Once upon a time I envisioned a smartphone app (either standalone or part of e.g. Vespucci or StreetComplete) that simplifies the process of mapping opening hours of shops and the like while on the go. The idea was that a mapper takes a picture of such a sign, which is then automatically converted (using machine learning techniques) to an OSM opening hours string ( https://wiki.openstreetmap.org/wiki/Key:opening_hours ) and/or displayed in a suitable opening hours display/editor (e.g. the one from Vespucci: https://github.com/simonpoole/OpeningHoursFragment ). Motivated by a Master's student willing to work on this as one of his semester theses, I took a lot of pictures of opening hours signs, post box collection times, event announcements and the like. Unfortunately neither this student nor I found the time necessary to work on this further. In order not to let these hours of work go to waste, I am looking for one or more parties interested in either continuing this work or using these pictures for something else that is useful.
> 
> So what do we have?
> 
> - 2763 JPEG files (I often took multiple shots of the same sign)
> - 13.5 GB
> - Approx. 800 to 1000 files when deduplicated (I only did this partially, so I don't know the exact number)
> - Most pictures were shot around Lake Zürich in (German-speaking) Switzerland, a small number of them in other cantons and countries
> - Almost all pictures were shot using a Fairphone 2 smartphone (4096x3072 px) with above-average JPEG quality, most of them should be geotagged
> - Almost all stem from 2018 (so probably too old to be directly useful for mapping)
> - Most signs were shot at an angle to avoid being visible in the pictures myself
> 
> Here are my proposed next steps:
> 
> - Remove duplicates
> - Maybe remove some of the pictures according to some criteria, e.g. language of the sign (most of them are German)
> - Invent some kind of normalized opening hours scheme (so that one "meaning" of opening hours has exactly one textual representation). This is necessary since almost all opening hours instances can be represented using several strings while having the same meaning.
> - Decide on an annotation scheme (in-file, out-of-file, etc.)
> - Annotate the pictures with this normalized opening hours scheme
> - Maybe add annotated pictures from other sources
> - Train an OCR/machine learning system using the annotated pictures (maybe use fancy techniques like data augmentation)
> - Implement your model as part of Vespucci, StreetComplete or a standalone app and simplify the process of mapping opening hours. Note that achieving especially high accuracy is NOT necessary since mappers still can (and probably should) check and adjust the recognized opening hours before uploading them.
> - Publish the annotated pictures under a liberal licence to boost interest in improving the state-of-the-art for this kind of machine learning problem
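
To make the "one meaning, one textual representation" step concrete, here is a rough Python sketch of one possible normalization. It is only an assumption about how such a scheme could work, not an agreed format: it handles a small subset of the opening_hours syntax (weekday lists and ranges, time lists, "off"), ignores wrap-around ranges like "Sa-Mo", holidays and months, and the function names expand and normalize are invented for this example. The idea is to expand a string into per-day hours and then re-emit it in a fixed order, merging consecutive days with identical hours.

```python
DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]


def expand(value):
    """Map each weekday to a sorted tuple of time ranges (simple subset only)."""
    week = {}
    for rule in value.split(";"):
        rule = rule.strip()
        if not rule:
            continue
        day_part, _, time_part = rule.partition(" ")
        days = []
        for spec in day_part.split(","):
            first, _, last = spec.partition("-")
            i = DAYS.index(first)
            j = DAYS.index(last) if last else i
            days.extend(DAYS[i:j + 1])
        if time_part.strip() in ("off", "closed"):
            times = ()
        else:
            times = tuple(sorted(t.strip() for t in time_part.split(",")))
        for day in days:
            week[day] = times  # a later rule replaces an earlier one for that day
    return week


def normalize(value):
    """Re-emit the hours with days in week order and equal neighbours merged."""
    week = expand(value)
    groups = []  # each entry: [first_day, last_day, times]
    for day in DAYS:
        times = week.get(day, ())
        if not times:
            continue  # closed days are left out of the canonical form
        previous = groups[-1] if groups else None
        if (previous and previous[2] == times
                and DAYS.index(previous[1]) + 1 == DAYS.index(day)):
            previous[1] = day
        else:
            groups.append([day, day, times])
    return "; ".join(
        (a if a == b else f"{a}-{b}") + " " + ",".join(t) for a, b, t in groups
    )


print(normalize("Mo,Tu,We 08:00-12:00; Th-Fr 08:00-12:00; Sa off"))
print(normalize("Mo-Fr 08:00-12:00"))
```

Both calls print the same string, so annotations written in different styles would compare as equal after normalization.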
> 
> I'm also open to usage ideas outside of OpenStreetMap.
> 
> Make sure to either write me directly or CC me when replying to the mailing list, so I won't miss your mail.
> 
> Best regards
> 
> Lukas
> 
> 
> 