[OSM-talk] Pictures of opening hours signs for machine learning purposes
iboates at gmail.com
Sat Apr 10 14:04:44 UTC 2021
@Lukas: I was having a bit of trouble getting the guest account permissions
set up on my AWS but then Bryce went ahead and posted a direct link, thanks.
On Sat, Apr 10, 2021 at 5:52 AM Bryce Cogswell via talk <
talk at openstreetmap.org> wrote:
>> @Bryce: Did you already make significant efforts regarding deduplicating /
>> sorting or otherwise processing the images? If yes, maybe you could share
>> this altered dataset with Isaac and other interested parties?
> I didn’t do any additional work on deduplicating the images. I’m not sure
> why you think this is important if you’re going to use it for ML training.
>> @Bryce: Congratulations! I already saw some correctly recognized
>> specimens! That is certainly encouraging, isn't it? Do you already know
>> if/how you would proceed further? If you would be okay with publishing
>> what you already have, maybe others could build upon that.
>> I remember one idea we had: If users of such a recognition feature would
>> be willing to (automatically, with little/no effort) share the pictures to
>> increase the pool of pictures you could create a virtuous cycle, especially
>> if you can motivate them to either mark detections as correct or let them
>> fix it as needed.
> Keep in mind I’m not doing any ML training, so having a larger sample size
> doesn’t benefit me. I wanted a large number of test images in order to
> measure the expected accuracy of the OCR and algorithm in real-world
> settings. My plan now is to build a stand-alone app for testing during
> surveying, improve the recognition by building better spatial models of how
> the text is laid out, and then finally integrate it into Go Map!!
> I’m working on this at https://github.com/bryceco/OpeningHoursPhoto but
> the code is super rough at this point.
> The image set is at
> <http://gomaposm.com/opening_hours/opening_hours.zip> (12.5GB download)