[Imports] import of data from LINZ

Robin Paulson robin.paulson at gmail.com
Tue Aug 21 00:44:56 BST 2012

See answers inline below.

A number of people have been involved in writing this, hence the time
it has taken and its slightly disjointed nature.

On 19 August 2012 20:03, Paul Norman <penorman at mac.com> wrote:
>> From: Robin Paulson [mailto:robin.paulson at gmail.com]
>> Subject: Re: [Imports] import of data from LINZ
>> hi paul,
>> answers inline
>> On 17 August 2012 15:51, Paul Norman <penorman at mac.com> wrote:
>> > A few comments
>> >
>> > Has the tagging been finalized?
>> No - we are reviewing all of the existing mappings and quite a few layers
>> have yet to be tagged. We have gone through an _extensive_ peer reviewed
>> tagging process on the NZOpenGPS Google Group.
>> > Could you post a sample .osm of a typical area?
>> We've already done a "trial" import. About two years ago, we imported
>> all the layers for the Chatham Islands, several hundred km east of the
>> mainland. We chose this location as there was no previous mapping
>> there, although there are plenty of map-worthy items.
>> Check here for more:
>> http://www.openstreetmap.org/?lat=-43.89&lon=-176.525&zoom=10&layers=M
>> i.e. we are confident the import application generates decent .osm files
> The point is not to generate a valid .osm file, but to make it easier to
> review the tagging. The area in the DB gives me enough to provide some
> comments, which I'll do below.

We're working on it. It has already been double-reviewed by (very)
long-time OSM users & developers.

Note that the data in the area cited above is from the old dataset; the
new data is much cleaner.

>> > How were the building_use translations developed? Were only the
>> > descriptions from LINZ used or were the translations from LINZ names to OSM
>> > verified for each mapping by looking at imagery or another suitable
>> > means?
>> A painstaking manual process of reviewing the LINZ tags and finding the
>> best OSM tags to use. Our conversion scripts allow complex translation
>> of building uses with conditional logic, etc.
> But did you look at just the LINZ documentation or did you verify that the
> mappings are correct with imagery or local knowledge?

1. Decent aerial imagery does not cover 100% of NZ. We use it when
available to check the data we import. For example, some of the
outlying islands don't even show up on the satellite imagery because
of its low quality, cloud cover, etc.

2. New Zealand is a nation of 4 million people in a country the size
of Great Britain. We cannot rely on local knowledge for all of the
data, especially for items in the middle of wilderness areas with
little to no human access.

LINZ, however, generated a lot of their data from other aerial
surveys/site visits, and they review content on a rotating basis.
Although not 100% accurate, their data gives the NZ community a decent
base map to move forward with, one that will be enhanced and cleaned
over time.

We know there is a general leaning towards "hand edits are best", and
this works well where there is a dense population base. It will be far
easier for us to engage the wider mapping community once we have a
base layer of decent data that requires smaller amounts of hand
editing, rather than an empty map. We believe the trade-off in this
instance is worth the small amount of incorrect data.

On a related note, our road data will be coming from the NZ Open GPS
project, which has taken the LINZ road data, added a lot of metadata,
removed paper roads and generally made it good enough for GPS.

>> Before we let others loose on the tool we will be documenting the
>> correct processes. We are also going to limit the checkouts to a max
>> 100 features at a time. No one can just import the whole of NZ for an
>> entire layer. Offshore islands are somewhat unique in that there is
>> no/little existing OSM data so we can import more wholesale. These also
>> provide a great testing ground for our imports. Please check the Chatham
>> Islands for our test bed import that was done 18 months ago.
> It's good to hear that it will be limited and it will be documented
>> People are already trying to import the LINZ data off their own bat.
>> Using our tools we do this in a managed way. The process has been well
>> thought out and we are proceeding with caution. It has taken us well
>> over two years to get to this point.
>> > I suggest having a requirement that the changesets are tagged with
>> > source=* and attribution=*. It's also a good idea for the source to be
>> > linked in the user page of the import accounts being used.
>> Data from LINZ will be updated on a regular basis and by November we
>> will have IDs for all points. We plan to provide tools for two-way review
>> as LINZ and others will be able to use OSM data as pointers to where
>> they should update their data.
>> > I also suggest instead of attribution=*, source_ref=*,
>> > LINZ:source_version=* you just use source=LINZ v16
>> We'd like to keep the source separate from the attribution; collapsing
>> the two would require parsing the value if it's ever processed at a
>> later date, which seems unnecessary.
> The current thinking on import best practices is to minimize the number of
> metadata tags.

We will look into how we tag attribution and source. A source tag is
important for us for the update process but we can move the
attribution somewhere else if we can automate its insertion so it
doesn't accidentally get left off.

It is important to emphasize that the LINZ:layer and LINZ:dataset tags
are what will make it possible to incorporate later data releases from
the gov't (until per-feature ID tags are available) and to make bulk
corrections to already uploaded data.
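
To illustrate why we would rather keep these as separate tags, here is
a rough Python sketch. It is not our import tool: the function names,
the feature dictionaries and the tag values (for example "river_cl"
and "mainland") are made up for the example. Only the tag keys come
from this thread.

```python
import re

# Illustrative tag sets only; the values are hypothetical.
separate = {
    "source": "LINZ",
    "LINZ:source_version": "v16",
    "LINZ:layer": "river_cl",
    "LINZ:dataset": "mainland",
}
collapsed = {"source": "LINZ v16"}


def version_from_separate(tags):
    """With separate tags the version is a plain lookup."""
    return tags.get("LINZ:source_version")


def version_from_collapsed(tags):
    """With a collapsed source tag the version has to be parsed out."""
    match = re.fullmatch(r"LINZ (v\d+)", tags.get("source", ""))
    return match.group(1) if match else None


def select_for_update(features, layer, dataset):
    """Pick the features a later release or bulk correction would touch."""
    return [
        tags for tags in features
        if tags.get("LINZ:layer") == layer and tags.get("LINZ:dataset") == dataset
    ]


features = [separate, {"LINZ:layer": "beach_poly", "LINZ:dataset": "mainland"}]
to_update = select_for_update(features, "river_cl", "mainland")
```

The collapsed form still works, but every later consumer has to agree
on the exact format of the source string, whereas the separate tags
can be matched directly.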

>> > You cannot count on the attribution tag always being there as a user
>> > could remove it - LINZ needs to be satisfied with attribution through
>> > the wiki page and the history showing who imported it.
>> LINZ are satisfied with our tagging and have actively encouraged it.
>> We have a very good relationship with the government on open data. A
>> couple of the local OSM team (that's Glen Barnes and Robert Coup) also
>> have a lot of engagement with the working group on open data policy in
>> central government. We are fully supported in this regard.
> The problem is that attribution through a tag is no guarantee that
> attribution will be provided as that tag can be removed. OSM offers
> attribution through the wiki page, the history and changeset tags.
> Now, for some feedback on the area you linked:
> Some objects have no OSM tags, for example
> http://www.openstreetmap.org/browse/node/767087010 or
> http://www.openstreetmap.org/browse/node/767087005 which have
> LINZ:point_size = 6.0
> LINZ:size = 4.1
> LINZ:text_placement_justification = 1.0
> LINZ:x_offset = 1.0
> name = Loch Long memorial
> It looks like some display information from their database made it in.

Yes, that was done on purpose, in case someone wanted to use it when
cleaning up/merging the tags: the offsets show where a label sits
relative to the nearby map features, and the size shows how important
the thing was. It was also there in case anyone wanted to use it for
cartography.

Those fields are gone now in the new release.

> http://www.openstreetmap.org/browse/node/767114113 has tags which belong on
> the nearby waterway=river

This is part of a post-import cleanup process that we will undertake.
There is a description layer which labels some things like rivers
(some rivers also have tags directly). Post-import, we plan to run a
few check scripts that look for hotspots to review and hand edit. We
expect the whole import process to take months once we get it up and
running, but we will get there.
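
As a rough idea of what such a check script could look like (a sketch
only, not our actual tooling; the function names, the grid cell size
and the sample nodes are assumptions made for the example), this flags
nodes that carry nothing but a name and LINZ:* metadata, then counts
them per grid cell so the densest cells can be reviewed first:

```python
import math
from collections import Counter


def needs_review(tags):
    """True when a node has no tags besides name and LINZ:* metadata."""
    return all(key == "name" or key.startswith("LINZ:") for key in tags)


def hotspots(nodes, cell=0.1, min_count=2):
    """Count flagged nodes per lat/lon grid cell and keep the busy cells.

    nodes is an iterable of (lat, lon, tags) tuples; cell is the grid
    size in degrees (an arbitrary choice for this sketch).
    """
    counts = Counter()
    for lat, lon, tags in nodes:
        if needs_review(tags):
            counts[(math.floor(lat / cell), math.floor(lon / cell))] += 1
    return {key: n for key, n in counts.items() if n >= min_count}


# Hypothetical sample data near the Chatham Islands.
sample = [
    (-43.89, -176.52, {"name": "Loch Long memorial", "LINZ:size": "4.1"}),
    (-43.88, -176.53, {"name": "Some label", "LINZ:layer": "descriptive_text"}),
    (-43.87, -176.51, {"name": "A river", "waterway": "river"}),
]
flagged_cells = hotspots(sample)
```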

> http://maps.paulnorman.ca/imports/review/streamconverge.png is a point where
> three waterways converge on one spot. It appears that some waterways are
> reversed.

Yes - we can't guarantee the direction of rivers right now. LINZ are
working on fixing this in the base data. One of the notes on importing
further rivers is to check the direction and fix it if we can work out
which way is downhill...
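
A minimal sketch of that check, assuming we have some way of looking
up an elevation for a node (the elevation_of callable and the node ids
here are hypothetical; nothing in the LINZ data or our tools is implied
to work this way):

```python
def orient_downhill(way_nodes, elevation_of):
    """Return the nodes ordered from high to low ground.

    elevation_of is a caller-supplied lookup returning metres or None.
    Returns None when the direction cannot be decided, so the way can
    be flagged for hand review instead of being guessed at.
    """
    if len(way_nodes) < 2:
        return None
    start = elevation_of(way_nodes[0])
    end = elevation_of(way_nodes[-1])
    if start is None or end is None or start == end:
        return None
    # Waterways should run in the direction of flow, i.e. downhill.
    return list(way_nodes) if start > end else list(reversed(way_nodes))


# Hypothetical elevations keyed by node id.
elevations = {"n1": 15.0, "n2": 60.0, "n3": 120.0, "n4": 120.0}
fixed = orient_downhill(["n1", "n2", "n3"], elevations.get)
```

Comparing only the two end points is crude; it will not catch ways that
are drawn across a ridge, which is one reason undecided cases are
returned as None rather than reversed automatically.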

> http://www.openstreetmap.org/browse/way/58380136 is a way which has tags
> which belong on the multipolygon (e.g. natural=sand)

Right: We will review that tagging.

> Should most of the natural=sand be natural=beach surface=sand? That's what
> it looks like from the imagery, but I'd defer to someone with local
> knowledge.

We have a post-import task to review these. Interestingly, the point
description layer has a tag called beach, which will allow us to zoom
in on quite a few of these areas.

Also, the sat imagery is misleading: we believe these are tall (10 m+)
dunes and not a classic beachfront.

> The display information on some nodes is the most serious issue, but it
> should be easy to fix.

We will look at the tagging of the multipolygons and review the
attribution issues.

> Overall it looks not bad, but I look forward to seeing an updated .osm file
> with these issues fixed.

We will be doing some more targeted imports to check tags on the
layers over the next week. We will keep the group posted.

A lot of the todo cleanup stuff may be best left to mappers on the
ground, and there's very little we can do about that. There is also
this script, which helps with clean up:


and a general description of our process:

robin and other NZ mappers
