[Imports] Norwegian place name registry (SSR)

Jason Remillard remillard.jason at gmail.com
Mon Jul 15 14:41:03 UTC 2013


Hi Thomas,

Some ideas for you:

Another way of doing this import is to break it up by feature type.
For example, if the location data is very accurate (say < 15 meter
uncertainty), then you can run a fully automated import of the lake
names. For each lake name in the source database, check whether OSM
already has a lake under that point. If it does, add your tags to it;
otherwise, add a point feature with your tags. I am planning a similar
import of just lake names in Massachusetts this fall. The code to do
both imports would be 90% the same.
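
A rough sketch of that lake-name loop, in Python, just to make the idea
concrete. Everything here is an assumption for illustration: the
NamePoint and Lake record shapes, the conflate_lake_names function, and
the choice to tag new point features as natural=water. It uses a plain
ray-casting point-in-polygon test on simple closed rings and ignores
multipolygons, spatial indexing, and the actual OSM API upload.

```python
from dataclasses import dataclass, field


@dataclass
class NamePoint:
    """One record from the source name registry (hypothetical shape)."""
    name: str
    lon: float
    lat: float


@dataclass
class Lake:
    """An existing OSM lake, reduced to one closed ring of (lon, lat) pairs."""
    osm_id: int
    ring: list
    tags: dict = field(default_factory=dict)


def point_in_ring(lon, lat, ring):
    """Ray-casting test: count ring edges crossed by a ray going east."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed,
        # which also guarantees y1 != y2 in the division below.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def conflate_lake_names(points, lakes):
    """Return (lakes that received tags, new point features to create)."""
    updated, new_nodes = [], []
    for p in points:
        hit = next(
            (lake for lake in lakes if point_in_ring(p.lon, p.lat, lake.ring)),
            None,
        )
        if hit is not None:
            # Assumption: never overwrite a name a mapper already entered.
            hit.tags.setdefault("name", p.name)
            updated.append(hit)
        else:
            # Assumption: a bare node tagged natural=water stands in for the lake.
            new_nodes.append(
                {"lon": p.lon, "lat": p.lat,
                 "tags": {"natural": "water", "name": p.name}}
            )
    return updated, new_nodes
```

Calling conflate_lake_names(points, lakes) with the source records and
the lakes already in OSM gives you the two work lists. Using setdefault
means an existing name always wins, so conflicting names would still
need a manual look.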

Then for the rivers, you could set up a MapRoulette challenge
(http://maproulette.org/).

Perhaps, by picking off specific features, you can get the manual
conflation count down to something that you can grind through in 6
months. You don't want to get halfway through it, get sick of it,
and stop.

Thanks
Jason.


On Mon, Jul 15, 2013 at 2:16 AM, Thomas Hirsch
<Thomas.Hirsch at kartverket.no> wrote:
> Hi Jason –
>
>
>
> I hope it’s ok if I CC to the mailing list, it sounds like you intended to
> post there.
>
>
>
>> - Use the semicolon separator on alt_name to get all of your alternate
>> names in.
>
>
>
> Ok, that would be a better option than what Karl Ove suggested. I was just
> uncertain because a lot of renderers do not interpret semicolons in name and
> basic type tags. But I guess it’s the renderers which should learn.
>
>
>
>> - You should definitely use the official_name tag in addition to the name tag.
>> […] and we can correct them via the name tag, leaving the official_name
>> alone.
>
>
>
> Sounds like a plan.
>
>
>
>> - 900.000 points is a lot of points, I think you need to get some
>> automation going. How long do you think this import is going to take given
>> the current team size?
>
>
>
> I was able to conflate 4k in four days, tracing complex geometries (lakes,
> rivers) along the way. That was in a kommune with good aerial coverage.
>
> Yes, 900 person-days are a lot of work, but at the same time, this will seed
> 900k of the most important places in the country (most of them are not on
> the map at all), so I think it’s best to establish the procedure and allow
> people to map the features at the same time. It’s a big difference
> motivationally whether you are tracing millions of lakes in the Norwegian
> hinterland, or whether you suddenly are tracing named bodies of water in the
> mountains. ;)
>
>
>
> We haven’t had a proper head count for the team size yet. If we reach ten
> people, it may be ninety days. If we stay at two it’s an open-ended process.
> Anyway, in areas of missing or low resolution aerial coverage, not all names
> can be imported, except as a node (or note), so the process will
> continue/repeat every time the coverage improves.
>
>
>
>> - Have you tried using the conflation plugin with some test data, I am not
>> sure how well it will work for this application.
>
>
>
> We are in the process of testing the conflation plugin, and I have a few API
> scripts adapted to the task which identify candidate features and allow
> conflating them (https://github.com/relet/ssr). We will evaluate what works
> best for us.
>
>
>
>> - Are you sure the factsheet URL is going to be stable?
>
>
>
> I’m a bit worried about it. It should be, and the people behind it are aware of
> the issue, but you never know about the next generation. Linked open data is
> not a well-established concept in people’s heads. As long as nothing breaks
> when URLs change, changing them remains acceptable.
>
>
>
>> - How accurate are the source positions? If they are very accurate, say <
>> 4 meters, then the automation will be easy. If the uncertainty is 150
>> meters, it is going to be a lot of work regardless of how you do it.
>
>
>
> The positions are very accurate.
>
>
>
>> - Just an FYI, we did a similar import in the US.
>> http://wiki.openstreetmap.org/wiki/GNIS
>
>
>
> Thank you so much! I had been looking for how imports (and more importantly,
> continuous synchronization) are handled in similar cases. If I understand
> correctly, GNIS is not updated, or does the survey group just have no means
> to identify outdated names?
>
>
>
> If anyone has good examples for how continuous imports/geosynchronization
> are handled as opposed to one-time imports, I’d be very curious about that.


