[Talk-se] OpenStreetMap data to georeference Swedish texts
Thomas Fischer
thomas.fischer at his.se
Thu Sep 15 07:05:28 UTC 2016
Hello,
my name is Thomas Fischer and I am a researcher and teacher at the
University of Skövde (Högskolan i Skövde).
Over the past months, I have worked on a project that uses
OpenStreetMap data to geolocate/georeference text phrases as found in
newspaper articles or police reports. For example, if a news report
is about a traffic accident on E20 near Vårgårda, the software's
task is to identify the place where E20 passes closest to the city of
Vårgårda. Similarly, if the report talks about Skolgatan, the
software should identify the right instance of a road of this name if
the report names the municipality in the same input text. The
software is designed specifically for Sweden (hierarchy of road
names and administrative structures) and the Swedish language (stop
words for blacklisting certain terms, conversion of plural forms into
singular, ...).
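To illustrate the "E20 near Vårgårda" case, here is a minimal sketch in
C++ of how the road node closest to a named place could be found. It is
not taken from pbflookup; the names Coord, distanceMeters and
closestRoadNode are hypothetical, and it assumes the road is already
available as a list of node coordinates and the place as a single
coordinate.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper type: a WGS84 coordinate in decimal degrees.
struct Coord {
    double lat;
    double lon;
};

// Great-circle distance in meters between two coordinates (haversine formula).
double distanceMeters(const Coord &a, const Coord &b) {
    const double earthRadius = 6371000.0; // mean Earth radius in meters
    const double rad = 3.14159265358979323846 / 180.0;
    const double dLat = (b.lat - a.lat) * rad;
    const double dLon = (b.lon - a.lon) * rad;
    const double h = std::sin(dLat / 2.0) * std::sin(dLat / 2.0)
                   + std::cos(a.lat * rad) * std::cos(b.lat * rad)
                   * std::sin(dLon / 2.0) * std::sin(dLon / 2.0);
    return 2.0 * earthRadius * std::asin(std::sqrt(h));
}

// Index of the road node closest to 'place'.
// Returns road.size() if the road has no nodes.
std::size_t closestRoadNode(const std::vector<Coord> &road, const Coord &place) {
    std::size_t best = road.size();
    double bestDistance = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < road.size(); ++i) {
        const double d = distanceMeters(road[i], place);
        if (d < bestDistance) {
            bestDistance = d;
            best = i;
        }
    }
    return best;
}
```

A linear scan like this only compares existing nodes; a finer result
would require projecting the place onto each road segment, and a real
data set of Sweden's size would most likely need a spatial index.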
A demonstrator of the resulting software is available at
http://fish.research.nsa.his.se:9876/
The demonstrator's source code is available on GitHub:
https://github.com/thomasfischer-his/pbflookup
In my experience, the software already works quite well. There
are two main limitations:
- Missing, inconsistent, or erroneous information in the underlying
map data from OpenStreetMap (unfortunately)
- Lack of a deeper textual analysis: Words are extracted as is from
the input text, so the word 'vara' (to be) and the city of Vara
are indistinguishable.
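The second limitation can be shown with a small C++ sketch. It is not
pbflookup's code; normalize and candidateWords are hypothetical names,
and the sketch assumes that words are lowercased before being compared
against a stop word list.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Lowercase ASCII letters only; non-ASCII bytes (for example the UTF-8
// encodings of å, ä, ö) are passed through unchanged.
std::string normalize(const std::string &word) {
    std::string out;
    for (unsigned char c : word)
        out.push_back(c < 128 ? static_cast<char>(std::tolower(c))
                              : static_cast<char>(c));
    return out;
}

// Keep only the words whose normalized form is not in the stop word list.
std::vector<std::string> candidateWords(const std::vector<std::string> &words,
                                        const std::set<std::string> &stopWords) {
    std::vector<std::string> result;
    for (const std::string &word : words)
        if (stopWords.count(normalize(word)) == 0)
            result.push_back(word);
    return result;
}
```

With 'vara' on the stop word list, the city name in "Olycka i Vara" is
dropped together with the verb; without it, every occurrence of the verb
becomes a place name candidate. Telling the two apart would require the
deeper textual analysis mentioned above.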
The software is written in C++, and care has been taken to ensure code
quality by using static code checkers and tools such as Coverity Scan
and Valgrind.
The project to develop this software was co-financed by
Internetstiftelsen.
I am contacting this mailing list for two reasons. First, I would
like to get some feedback such as error reports or ideas for
extensions.
Second, I would like to pursue this project further, so I am looking
for cooperation partners that can (financially) support the project.
Examples of cooperation partners include app developers that want to
make use of the JSON/XML interface, news agencies or newspapers that
want to add the service of matching their articles to a map
location, or governmental, regional, or municipal agencies that want
to geolocate their own or citizens' reports (think of 112 calls or
citizens reporting broken street lights).
So, if you have an existing contact at an organization that may be
interested, please forward this mail. If you are personally interested
in cooperating, you can of course contact me directly.
Greetings from Skövde, Sweden,
Thomas Fischer
https://www.his.se/fish