[OSM-talk] Re: [OSM-dev] MassGIS dataset

Christopher Schmidt crschmidt at crschmidt.net
Sat Jun 3 12:21:58 BST 2006


I'm moving this back to the talk list, again, because it's thoroughly
un-dev.

On Sat, Jun 03, 2006 at 06:02:16AM +0200, Lars Aronsson wrote:
> Christopher Schmidt wrote:
> 
> > I can't help with gathering line segments: I have a full dataset 
> > which would be of about 20x higher quality level than TIGER, and 
> > is released under an Attribution license, but with the request 
> > not to load the database more than is necessary, I'm not going 
> > to load it. (The data, for the record, is the MassGIS department 
> > of transportation roads layer -- but like TIGER, it's an 
> > accurate, high-density data source which
> 
> That's good for Massachusetts, but it won't help us in Europe. 
> Free GIS data is unheard of here.  In my theory this is because 
> European governments are more shrewd and skilled in how to make 
> money off of their subjects than the amateur U.S. government(s).

Or perhaps because the US Government, at one point in time, realized
this was not a good idea. I think most people agree that charging the
taxpayer to get data collected using their tax money is irresponsible --
so you're claiming that the European governments are irresponsible?

> Anyway, suppose that you could load this data into OSM.  That 
> would mean breaking up structured data into line segments, right?  
> And then we use OSM (the map wiki) to edit it, improve it, make it 
> more up to date. Could you ever assemble the structured data out 
> of that again?  The purpose would be to allow MassGIS users to 
> benefit from the improvements we've made.  It's not really like 
> OSM allows you to set a (CVS) tag after the import, that can be 
> used to generate a global diff of everything that's changed since 
> that tag.  Isn't that kind of functionality what we'd need?

You don't need a diff -- you just create a compatible file (see
http://wiki.openstreetmap.org/index.php/Converting_OSM_to_GML , from
which you can get back to the shapefile format the data originally came
in) and ensure that you store the unique identifier that MassGIS uses in
their files in the OSM data. (tag: fid=F1, F2, etc.) Once you've edited
the dataset, you select the data for Massachusetts, spit it back out to
a file, and you can compare the files to see what's changed -- it's not
trivial, but it's certainly not impossible, and it's equally possible to
look for segments that have changed in the MassGIS data and apply those
changes to OSM.
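
To make the comparison step concrete, here is a rough sketch in Python.
It assumes both the original MassGIS file and the OSM export have
already been read into dictionaries keyed by the stored fid; the record
layout, the street names, and the "new-1" placeholder key for a road
that has no MassGIS fid yet are all made up for illustration, not
anything MassGIS or OSM actually defines.

```python
def diff_by_fid(original, edited):
    """Compare two {fid: record} snapshots and list what differs."""
    original_ids = set(original)
    edited_ids = set(edited)
    return {
        # Present only in the edited export (e.g. newly surveyed roads).
        "added": sorted(edited_ids - original_ids),
        # Present only in the original data (deleted during editing).
        "removed": sorted(original_ids - edited_ids),
        # Present in both, but name or geometry no longer matches.
        "changed": sorted(
            fid for fid in original_ids & edited_ids
            if original[fid] != edited[fid]
        ),
    }


# Hypothetical records: fid -> (street name, list of (lon, lat) points).
massgis = {
    "F1": ("Main St", [(-71.10, 42.37), (-71.09, 42.37)]),
    "F2": ("Elm St", [(-71.09, 42.37), (-71.09, 42.38)]),
}
osm_export = {
    "F1": ("Main St", [(-71.10, 42.37), (-71.09, 42.37)]),
    "F2": ("Elm Street", [(-71.09, 42.37), (-71.09, 42.38)]),
    "new-1": ("Oak Ln", [(-71.09, 42.38), (-71.08, 42.38)]),
}

result = diff_by_fid(massgis, osm_export)
print(result)
```

Running the same function with the arguments swapped round (a newer
MassGIS release as "edited", the previous release as "original") would
give the list of segments that changed upstream. Real data would also
need some tolerance when comparing coordinates, which this sketch
ignores.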

For example, a new road has been added at the end of my street. For the
next 5 years, the MassGIS data will not contain this road, as that's how
often they release updates. It would be nice to be able to drive down
that street in navigation software, because otherwise I have to head in
the wrong direction for another half mile to get down the road.

OSM allows me to store the current data, update it to include that
street, export the data to navigation software, and use it.

Are there other solutions? Sure. I could run my own version of OSM just
for Massachusetts. But the stated purpose of OSM (which the project is
largely succeeding at) is:

"The OpenStreetMap Foundation is an international non-profit
organisation dedicated to encouraging the growth, development and
distribution of free geospatial data and to providing geospatial data
for anybody to use and share."

*growth*,  *development*, *distribution*. None of those are the same
aims as MassGIS, and any further projects doing the same thing would
just be mimicking OpenStreetMap. Although it's certainly possible that
performance might be better with a Mass-based 'shard' of OSM, I'm not
convinced at the moment that OSM can't become *the* repository for this
data... provided technical issues are solved. And I think that would be
a good thing. (Of course, this is up for debate.)

> I have this doubt about TIGER.  Ben is importing TIGER into OSM, 
> which I'm sure is a fun exercise. But what is it good for, really? 
> There are applications that use the TIGER data, i.e. that are able 
> to import data in the format used by TIGER.  Will these apps ever 
> be able to use anything that comes out of OSM, after people have 
> edited the TIGER import?  What's the next step after the import?

Assuming the OSM data is exported on a regular basis (even monthly is
good enough in this case), you export the data, and you use it as a
basis for apps serving TIGER data... I'm not sure why you feel that this
is any different than any other place. Are you saying that once all of
London is mapped in OSM, there will no longer be any improvements to it
other than new roads? Things change. Improvements can be made. I can
almost assure you that Inner Circle in Regents Park is not 100%
accurate, the same way that much TIGER data is not 100% accurate, the
same way that MassGIS data is not 100% up to date and accurate.

> In my mind we should use OSM as a factory for producing 
> TIGER-formatted free GIS data for Europe (and Africa and Asia), so 
> that existing TIGER-based applications can be used here too.  But 
> I'm not sure what the purpose of OSM can be for the U.S., since 
> you already have TIGER and other free GIS data sources.

TIGER is not accurate enough to create nice looking maps on its own:
saying that TIGER is an acceptable data source would be like saying that
NavTeq et al. don't have any purpose in the US, since anyone can go
out and make money by selling the TIGER data. Until you've spent a while
looking at how far off TIGER gets on rural roads -- hundreds of meters
-- it's hard to see this, but I can assure you it's true.

Improvements can be made to this data: There may be names which are
wrong, roads which are new and missing, information which isn't stored
in the database which could be (for example: MassGIS data lacks address
ranges on segments, which makes it non-usable for geocoding.). 

I agree that the primary target of OSM should be places where no free
geodata exists. But acting as a way to store, improve, export, and
provide access to accurate geodata is extremely important too, and
something that should not be overlooked. Data that exists doesn't always
do what you want it to. Once OSM is performing more capably as a
technical platform, it can be used to update and export that data so it
does do what you want it to. And that's a benefit for all users of the
data.

-- 
Christopher Schmidt
Web Developer



