[Talk-GB] Ordnance Survey data matching

Gregory nomoregrapes at googlemail.com
Mon Apr 5 01:00:59 BST 2010


In the time it takes to read and do all that, you could:
Colour all the OSM roads red at 50% transparency and all the OS roads blue
at 50% transparency. Ignore what is purple, because that is where the
datasets are the same, and also ignore anything where the two lines run
close together.
Although you wouldn't think so, the human eye is a lot better than a
computer at solving some equations.

What is left in blue you then need to go out and survey, because it might
not be right in OS. What is left in red might be stuff the OS wouldn't have
(like a service road that is a changed car park layout perhaps); you can go
out and survey it anyway to check the OSM user's work.
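
If you would rather script the overlay than eyeball it, the same idea can be
roughed out as below. This is only a sketch: it assumes both datasets have
already been projected to metres (British National Grid, say) and sampled
into points along each road. The function names and the 10 m cell size are
made up for illustration and do not come from either dataset.

```python
def cells(points, size=10.0):
    """Snap (x, y) points in metres to the set of grid cells they fall in."""
    return {(int(x // size), int(y // size)) for x, y in points}


def grow(cell_set):
    """Add the 8 neighbours of every cell, so 'close together' still counts."""
    return {(cx + dx, cy + dy)
            for cx, cy in cell_set
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)}


def overlay(osm_points, os_points, size=10.0):
    """Return (purple, red_only, blue_only) sets of grid cells.

    purple    : cells both datasets touch (they agree, ignore these)
    red_only  : OSM cells with no OS cell in or next to them
    blue_only : OS cells with no OSM cell in or next to them
    """
    red = cells(osm_points, size)
    blue = cells(os_points, size)
    purple = red & blue
    red_only = red - grow(blue)
    blue_only = blue - grow(red)
    return purple, red_only, blue_only
```

The blue_only cells are the ones to go out and survey; the red_only cells are
the ones the OS probably doesn't have.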

On 4 April 2010 06:28, John Robert Peterson <jrp.crs at gmail.com> wrote:

> Please slap me if I'm either jumping the gun, or duplicating here, but
> I don't think anyone has covered this publicly already.
>
> I have had a quick poke around, and the meridian2 data seems to use a
> UID called OSODR (Ordnance Survey Oscar Database Reference). After
> some further poking around, it seems that this reference will be
> consistent across all of their data releases, though this is based in
> part on assumptions. (anyone have any more detail on this?)
>
> Now it seems like a very worthwhile exercise to attempt to do some
> detailed matching up of the ways in the OS data and the ways in
> the OSM data. This is a completely non-intrusive process, and can even
> be done offline, so it's not a problem to be doing now.
>
> I'm not well positioned to do this myself due to a lack of SQL
> experience, but here are my suggestions:
>
> Pick a county that's a manageable size, and has some well mapped
> areas, some poorly mapped areas, and some non mapped areas.
>
> Ignore everything that isn't a road.
>
> Then run a bunch of searches on the 2 datasets to find ways that match
> between them.
>
> If the start and end coords match (within ~5 meters or so), they are
> likely the same;
> if the start and end coords match, but backwards, they are likely the
> same with a reversal.
> The above ways can then be removed from further searches.
>
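
The endpoint test described above could look something like this. It is a
minimal sketch, assuming each way is a list of (x, y) points in metres and
each dataset is a dict of id -> way; match_ends and match_all are
hypothetical names, and the 5 m tolerance is simply the figure suggested
above.

```python
import math


def close(p, q, tol=5.0):
    """True if points p and q are within tol metres of each other."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= tol


def match_ends(way_a, way_b, tol=5.0):
    """Return 'same', 'reversed' or None by comparing endpoints only."""
    a0, a1 = way_a[0], way_a[-1]
    b0, b1 = way_b[0], way_b[-1]
    if close(a0, b0, tol) and close(a1, b1, tol):
        return "same"
    if close(a0, b1, tol) and close(a1, b0, tol):
        return "reversed"
    return None


def match_all(osm_ways, os_ways, tol=5.0):
    """Greedy pass: pair each OSM way with the first OS way whose ends match.

    Matched OS ways are removed from further searches. Returns the list of
    (osm_id, os_id, kind) matches and the dict of OS ways left unmatched.
    """
    matches = []
    remaining = dict(os_ways)
    for osm_id, osm_way in osm_ways.items():
        for os_id in list(remaining):
            kind = match_ends(osm_way, remaining[os_id], tol)
            if kind:
                matches.append((osm_id, os_id, kind))
                del remaining[os_id]
                break
    return matches, remaining
```

This compares every pair, so for a whole county a spatial index would be
needed; for a first look at a small area it should do.
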
> Take a look at the matches, and remove any that in fact don't follow
> the same (or close to) course (for each node in each dataset, check
> its proximity to the closest way segment in the other; not perfect,
> but good enough I reckon).
>
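
The node-to-segment check could be sketched as follows, again assuming ways
are lists of (x, y) points in metres with at least two points each. The
function names and the 10 m tolerance are assumptions for illustration only.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def max_deviation(way, other):
    """Largest distance from any node of `way` to its nearest segment of `other`."""
    segments = list(zip(other, other[1:]))
    return max(
        min(point_segment_distance(p, a, b) for a, b in segments)
        for p in way
    )


def same_course(way_a, way_b, tol=10.0):
    """True if every node of each way lies within tol metres of the other way."""
    return (max_deviation(way_a, way_b) <= tol
            and max_deviation(way_b, way_a) <= tol)
```

Checking in both directions matters: a long way can have every node close
to a short one in one direction only.
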
> Take a look at the data that's left, and work out where to go next. I
> suspect there will be ways that exist as 2 end to end ways (where a
> road name changes) in one set, but as a single way in the other. Or
> areas where a road name changes, but the position of the change is
> different between the datasets.
>
> There will be areas that just straight up don't match. These will be
> numerous, and would be best filtered for carefully, and flagged for
> human checking (openstreetbugs?)
>
> Subtleties that need further investigating would include: split
> carriageways; roads that only partially exist in our data (country lanes
> that have poorly defined ends or have not been fully surveyed); anything
> in our data marked position=approximate
>
> The results of this process could lead to some really useful data. Our
> geometry (in general) seems to be better than the meridian2 data, but
> there are areas where we are missing data such as names, or any data
> at all in some rural areas.
>
> The general idea would be to do an import that takes the best from
> both data sets, and preserves all of our data except where identified
> as being inferior.
>
> If we can generate a list of ways that exist in meridian2, but are
> totally absent from our data, I say it would be worth importing them
> (carefully). Their geometry is fairly poor, but it's well within usable
> parameters. And it's complete.
>
> If the import is done sensibly, it would be a fairly simple process to
> reimport any ways that have had no further work on them if better data
> becomes available from OS (someone said something about that
> happening) using a filter on last update user and OSODR reference.
> (this is based on the same assumption as above)
>
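
The reimport filter might look something like this. It is a sketch under
heavy assumptions: ways are plain dicts with "id", "user" (last editor) and
"tags", the import was done from a single account, and the OSODR was stored
under a tag. The key name "osodr" and the account name are hypothetical;
nothing has been agreed on either.

```python
def reimport_candidates(osm_ways, import_user, osodr_key="osodr"):
    """Map OSODR -> OSM way id for ways nobody has touched since the import.

    A way qualifies only if its last editor is still the import account
    and it carries an OSODR tag to match against the newer OS release.
    """
    return {
        way["tags"][osodr_key]: way["id"]
        for way in osm_ways
        if way.get("user") == import_user
        and osodr_key in way.get("tags", {})
    }
```

Anything a mapper has edited since drops out because the last editor no
longer matches, so only untouched imports get replaced.
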
> Other moderately related points: their coastline data is way ahead of
> ours (even if offset by a fixed distance from what I've seen; not sure
> even which side the error is on, email me for a reference if
> interested, I'll try to find the data I was looking at again). At
> least in areas where no one has updated it. Unfortunately coastline
> ways are quite long (though from what I've seen, not unmanageably so)
> and may have been updated in part or only very slightly; checking the
> version of nodes may be worthwhile in this case.
>
> So, am I onto something, or has this already been discussed to death on
> some other list?
>
> Thanks,
> JR



-- 
Gregory
osm at livingwithdragons.com
http://www.livingwithdragons.com