[Openstreetmap] Re: Naming segments using applet

Mon Dec 19 17:12:46 GMT 2005

Whew!  I've just finished reading the responses to my longish post.
Lots of interesting comments.  I'll try to cover everything in this post
and one other in response to SteveC's message, since those two contain
the majority of the salient points.

> I probably don't need to point this out but I should say that I really
> don't speak for the whole of OpenStreetMap - I try to represent 
> consensus where I see it, but the following opinions are all my own.  
> Deep breath...

Understood.  Discussion is good. :-)

> > Already, there are instances with streetmaps where I can see problems.
> > For instance, in the US and Canada (I am not sure about across the pond)
> > there are many roadways that "change" monikers or function as part of 2
> > or more routes.  These may diverge into multiple separate paths,
> > converge from others, or even "jump" to another road.  There is no
> > facility for this in the current db.  
> 
> I personally have issues with how the current 'tags' implementation is 
> actually just a badly-validated arbitrary key/value system, but I think 
> it can express the situation you describe above.  There's nothing to 
> stop a particular street segment being designated as part of several 
> different routes using tags.  If I'm mistaken, this needs picking up 
> right now, so please correct me if I'm wrong.

I cannot correct you on this point as you know more about the system
than I do :-).  I currently can only use the web applet to work with the
data as I cannot get the offline editors to work here.  There are few
facilities available via the java applet.
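
As a rough illustration of the tag approach quoted above, here is a
minimal Python sketch of one segment carrying several route
designations as plain key/value pairs.  The segment ids, the "route"
key, and the route names are made up for the example; they are not the
actual OSM tag names or data.

```python
from collections import defaultdict

# Hypothetical segment ids and tag keys; the real OSM tags may differ.
segment_tags = {
    101: [("name", "Main Street"), ("route", "US 1"), ("route", "US 9")],
    102: [("name", "Main Street"), ("route", "US 1")],
    103: [("name", "Elm Avenue"), ("route", "US 9")],
}


def segments_by_route(tags_by_segment):
    """Group segment ids under every route value they are tagged with."""
    routes = defaultdict(list)
    for segment_id, tags in sorted(tags_by_segment.items()):
        for key, value in tags:
            if key == "route":
                routes[value].append(segment_id)
    return dict(routes)


routes = segments_by_route(segment_tags)
print(routes)
```

Segment 101 shows up under both routes, which is the "one road, two or
more routes" case.  Whether the applet exposes enough to enter such
tags is a separate question.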

> 
> > Also, interstate highway systems
> > don't seem to be a strong point either as their associated
> > entryways/exits, viaways, overpasses, and the like aren't symbolized.
> >   
> 
> Just a matter of time, and again something I imagine will be expressed 
> with tags on nodes and line segments.  We're imagining that client 
> implementations will settle on a way to represent certain types.

[... deleted for brevity]

Going by the current schema (thanks for the link to it earlier), things
are going to get real slow real soon as the data grow.  The means of
attribute access aren't optimized for situations where one may need to
search through terabytes of tags in order to visualize the data.
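
To show the kind of thing I mean, here is a small sketch using Python's
built-in sqlite3 module.  The table layout and the index name are
assumptions made up for the example, not the actual OSM schema, and
SQLite stands in for whatever database is really used.  It asks the
database how it would run a tag lookup before and after an index on the
key/value columns exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout, not the actual OSM schema.
conn.execute("CREATE TABLE segment_tags (segment_id INTEGER, k TEXT, v TEXT)")
conn.executemany(
    "INSERT INTO segment_tags VALUES (?, ?, ?)",
    [(i, "class", "street" if i % 10 else "motorway") for i in range(1000)],
)

query = "SELECT segment_id FROM segment_tags WHERE k = ? AND v = ?"
params = ("class", "motorway")


def plan(connection):
    """Return SQLite's own description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
    return " ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no index yet: every row has to be examined
conn.execute("CREATE INDEX idx_tags_kv ON segment_tags (k, v)")
plan_after = plan(conn)   # the lookup can now go through the index

matches = conn.execute(query, params).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```

The exact wording of the plan text varies between SQLite versions, but
the second plan should name the index.  The same idea applies to a
spatial index on coordinates.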

> > I realize that I'm not the first.  I've been lurking around the OSM for
> > quite a while.  I've read all of the entries in the Mailing Lists.  I've
> > even sent a few probing email messages that never got through earlier
> > this year.  
> 
> To the list, or individuals?  If there's a problem with the list then 
> that needs fixing.

Via the list.  There were some issues with needing 3 different
logins/passwords in order to access the various resources (wiki,
editing, mailing lists).

> 
> > I didn't participate until recently because the OSM seemed
> > to be Eurocentric in its concerns and I was watching another project
> > that was based off of the US Census Bureau's TIGER data and "fixing" the
> > accuracy using GPS data.
> >   
> 
> Is that project open, and do you think we should be working together?

That project seems to have died.  It ran into issues similar to those
I'm trying to address here.  Indeed, there were several similarities.

> > I would recommend moving to the OGC (Open GIS Consortium) schemas as
> > these are usable by many GIS packages already.  Not to insult the great
> > work that's been done, but can one really say that it _works_ now?  It
> > is usable for sure.  
> 
> The bits I care about work.  There is an end-to-end solution for 
> uploading GPS data, tracing it over aerial photography, annotating the 
> street names and viewing it on the web.  There are lots of things which 
> are barely adequate, but it all works.  It goes! For some value of 
> works. Yes.

THERE'S the rub.  It's a common problem, some would say the worst, of
OSS projects.  We all like to "scratch our own itches" without
considering that that "itch" may just be poison-ivy and that one is
spreading the infection by scratching.  The real solution is to treat
the itch with the correct medication.

[my comments on large datasets deleted]

> It is planned that node creation will snap to the nearest GPS track 
> point if one is available, and there are plans to allow importing of 
> waypoints etc directly, with areas and points of interest to follow.  
> What format is your data in, and what kind of data is it?  Perhaps we 
> could write a script to import it into the database directly, and 
> maintain the accuracy?
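
For reference, the snapping described in the quoted paragraph could be
sketched roughly as below.  This is only an illustration in Python: the
function names, the 20 metre threshold, and the sample coordinates are
all assumptions, not anything taken from the OSM code.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def snap_to_track(lat, lon, track, max_distance_m=20.0):
    """Return the nearest track point within max_distance_m, else the input."""
    if not track:
        return (lat, lon)
    nearest = min(track, key=lambda p: haversine_m(lat, lon, p[0], p[1]))
    if haversine_m(lat, lon, nearest[0], nearest[1]) <= max_distance_m:
        return nearest
    return (lat, lon)


# Made-up track points and clicks, purely for illustration.
track = [(51.5000, -0.1000), (51.5001, -0.1002), (51.5003, -0.1005)]
snapped = snap_to_track(51.50011, -0.10021, track)
unsnapped = snap_to_track(51.5100, -0.1000, track)
print(snapped, unsnapped)
```

A linear scan like this is fine for one track; for a whole database of
track points a spatial index would be needed.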

All in the future, but available now.  The format can be whatever.
Shapefile, CSV, MS-Access Personal GeoDatabase (ESRI), Arc/Info E00.
I've converted a couple of thousand points to GPX already to put online
(and have partially done so) in order to test the java applet.
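
The CSV-to-GPX step is simple enough to sketch.  The following Python
is a minimal illustration, not the tool I actually used: the column
names (name, lat, lon), the creator string, and the sample points are
assumptions, and it writes only GPX 1.1 waypoints with a name.

```python
import csv
import io
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def csv_to_gpx(csv_text):
    """Convert CSV rows with name, lat, lon columns into a GPX string."""
    gpx = ET.Element(
        "gpx",
        {"version": "1.1", "creator": "csv_to_gpx sketch", "xmlns": GPX_NS},
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Each CSV row becomes one <wpt> element with a <name> child.
        wpt = ET.SubElement(gpx, "wpt", {"lat": row["lat"], "lon": row["lon"]})
        ET.SubElement(wpt, "name").text = row["name"]
    return ET.tostring(gpx, encoding="unicode")


# Made-up sample points, purely for illustration.
sample = "name,lat,lon\nTown Hall,40.7128,-74.0060\nLibrary,40.7532,-73.9822\n"
gpx_text = csv_to_gpx(sample)
print(gpx_text)
```

Shapefiles and the other formats need a proper reader first, but once
the points are out as rows the rest is the same.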

> 
> With the assistance of GPS and aerial photography, the bar is currently 
> really low for making vector street maps and distributing them under 
> Free licenses.  That's all that OpenStreetMap is aiming for, at least 
> for the moment.  One reason OpenStreetMap exists is that we don't *need* 
> sub-meter accuracy to be useful.

True for some cases.  In high-density areas accuracy is critical.  I'd
hate to map Manhattan with 20 m accuracy and try to provide useful
directions from that. :-)

> >   If I could use my current tool set then your collection
> > would be flush with lots of data from North America containing much more
> > information than just the location of the pavement.
> >   
> 
> Cool.  What do you need?

For starters, using the current schema, I need the order of insertion
for street records so that I can write a shapefile importer.  That would
get the street vector data in place, but would leave the geocoded
information and the elevation data unused.

> > Since you say time is the issue, consider this:  using either MySQL's
> > GIS set or PostGIS (both OGC compatible) one can import data directly
> > from shapefiles, CSV, and many other formats.  Let's also not forget
> > using JDBC from GIS packages.  From there the data can be served from
> > any of about a dozen WMS/WFS servers available as open-source.  These
> > are tested and (for the most part) production quality _now_.  How's that
> > for time savings :-)
> >   
> 
> But they don't support the versioning and wiki-style rollback features 
> we cherish, do they?

SQL provides for versioning and rollback.
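
One common way to get wiki-style history out of a plain SQL database is
an append-only version table, where an edit never overwrites a row and
a rollback is just another edit.  The sketch below uses Python's
built-in sqlite3 module; the table, the column names, and the sample
edits are assumptions made up for illustration, not a description of
any existing system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical append-only history table: every edit is a new row.
conn.execute(
    "CREATE TABLE street_versions ("
    " street_id INTEGER, version INTEGER, name TEXT, editor TEXT,"
    " PRIMARY KEY (street_id, version))"
)


def save(street_id, name, editor):
    """Store an edit as the next version instead of overwriting."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM street_versions"
        " WHERE street_id = ?",
        (street_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO street_versions VALUES (?, ?, ?, ?)",
        (street_id, latest + 1, name, editor),
    )
    return latest + 1


def current_name(street_id):
    """Return the name from the highest-numbered version, if any."""
    row = conn.execute(
        "SELECT name FROM street_versions WHERE street_id = ?"
        " ORDER BY version DESC LIMIT 1",
        (street_id,),
    ).fetchone()
    return row[0] if row else None


def revert(street_id, to_version, editor):
    """Roll back by re-saving an existing old version as the newest one."""
    (name,) = conn.execute(
        "SELECT name FROM street_versions WHERE street_id = ? AND version = ?",
        (street_id, to_version),
    ).fetchone()
    return save(street_id, name, editor)


save(1, "High Street", "alice")
save(1, "xxxx", "vandal")
revert(1, 1, "bob")
print(current_name(1))
```

The vandalised version stays in the table, so the full history and who
made each change remain available.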

[my comments removed]

> I really hope you can find somewhere to help.  With respect, it sounds 
> like your "dog in this fight" is accuracy, and a database that supports 
> the formats of your existing data (preferably using OGC schema).  Those 
> aren't unreasonable goals, but they're not ones I care for - you 
> certainly shouldn't take my opinion as representative of the OSM 
> community as a whole though.  I'm just quite active on this list... I 
> don't do lurking :)

What I meant about the "dog" is that I already have data that I can use
for the vast majority of the things I want to play with.  Some would say
that I have too much data (can you ever have too much?).  If I want to
look up roads in London, Leeds, Amsterdam or Tokyo, I can already do it
locally, along with many other sorts of information.  What I want to do
is to provide some of this to others.  Whether it goes to OSM or another
project doesn't really matter to me.

As far as accuracy goes, I'm a believer that one should provide the best
that one can.  Since I have accurate data that's what I provide.  Why
should I provide 20 m data when I have 3 m?

> Understood, really.  My own focus is streets and will be for the 
> foreseeable future.  Others have their itches to scratch, and if someone 
> writes code that uses the OpenStreetMap database to map waterways or 
> footpaths or whatever, then so long as they tag the data in a way that 
> means I can ignore it when drawing streets, then I don't care.

How about when all this other data causes the system to bog down when
providing you the data you need?  Will you care when it takes minutes to
recall the image instead of seconds?

> 
> > Take a look at the
> > "bus-stop" discussion here recently.
> >   
> 
> People are somewhat blinkered in their eagerness to use the tools we 
> already have - bus stops are, I think, trivial to implement as points of 
> interest.  We just haven't done it yet.

However, this is exactly what I mean.  GIS is addictive.  The data are
easy to acquire and provide.  They are numerous in their uses, and
everyone has a good idea for making it more useful.  Stop and think
about how many different useful data sets are connected to something as
simple as a street...  way too many to enumerate.

> Wise words of warning, but I don't really agree.  We're seeing examples 
> all over the web where allowing thousands of people the ability to tag 
> and manipulate things in free-form ways creates a data set which is 
> 'good enough'.  I believe that every extra thing which is added to 
> OpenStreetMap will increase the complexity, and that as the complexity 
> increases we will lose contributors.  OpenStreetMap doesn't need to 
> cater to GIS professionals (though of course it needs to heed their 
> advice!), it needs to cater to ordinary folks, like the millions of 
> people getting satnav devices for Christmas this year who might want to 
> help make simple maps of their local area, for its own sake.  (That's a 
> total pipe dream, I'm sure, but it's worth pointing out).

You just drove over the cliff. :-)  GIS data by their very nature are
different from blog entries, documents, Excel files, or even numerical
data.  Spatial data encompass attributes that aren't immediately
apparent under casual analysis, or for that matter under in-depth
analysis.  GIS data are not solely the purview of GIS professionals; on
the contrary, anyone can use spatial data.  However, it takes knowledge
of GIS to make those data recallable in a useful manner.

[comments about kittens and OGC bureaucracy deleted]

I'm not advocating becoming a member organization of OGC, just using the
well thought out SCHEMAs that they have created.  One doesn't
necessarily need to understand every nuance of the spec to use it
advantageously.  However, as one starts to understand those nuances one
can immediately take advantage of the knowledge if the facilities are
already there :-)  Sheesh, OSM is already using MySQL.  Use the GIS
facilities that are there.

> > Once one can remove the software
> > development/maintenance headaches one can focus on the truly important
> > part:  the data.
> >
> Amen to that. 
> 
> Except that software development and maintenance is fun!  And I wouldn't 
> like to replace it with the installation and maintenance of software 
> packages which I perceive to be bloated and over-engineered and where 
> 90% of what they do is superfluous to OSM's needs.

As a coder since the 1970s, I won't give you any argument about
development being fun :-)  However, it's more fun to extend than to
redo.  That extra 90% may not be useful to you now, but wouldn't it be
nice for it to already be there when you need it, rather than having to
wait for a new implementation?

> > There's nothing special about 'wiki'-ing with GIS data.  The maps are
> > visualized at the client end.  The database controls the changes.  Been
> > doing that for a very long time.
> >   
> 
> With hundreds/thousands of essentially anonymous contributors?  I'd be 
> really interested in your experience with that.  In particular how did 
> the systems you're familiar with deal with the problems we anticipate 
> OSM will face imminently (vandalism, editing conflicts, 
> authority/reliability of data, etc)?

Hundreds of contributors: yes.  Anonymous: somewhat.  Well, let me say
that the systems were/are proprietary.  The data generated are
immediately put to enterprise-critical uses.  The data change
constantly -- in that what was correct before is not correct now and
needs to be corrected yesterday by noon, so I guess that you could
consider that vandalism :-)  There are folks entering, modifying,
deleting, and analysing the data all day long.  Reliability is crucial.
Recall time needs to be immediate (as in <1 minute from query until
display).  And _lots_ of money as well as people's livelihoods are at
risk if something goes wrong.

Granted that these systems are run with payware and/or custom code, but
some of the newer OSS systems contain many of the features that make
these systems work.

> > As far as misunderstanding the scope:  I've watched these type of
> > projects evolve since the early 1980's.  The scope _always_ starts
> > small.  So far, from reading ALL of the archives, this project has
> > followed the same old script EXACTLY to the letter.  This is what
> > prompted me to comment.  
> 
> Heavens, if we're so predictable can you tell us how it all ends and 
> save us the bother?

I'd rather not have to. :-)  One only has so much time: you can either
waste it by making all the mistakes yourself, or you can look at the
mistakes of others and learn from them.

> (And, really, how did people run "these type of projects" in the 80's?  
> Excuse me if I'm an irritating combination of arrogant, ignorant and 
> naive today but I find myself unable to see through your "I'm sorry, but 
> I must intervene, this is madness, madness I tell you" attitude and find 
> any real insight).

We ran them with a lot of blood, sweat, tears, profanity, and of course
custom code :-)  Man, the days of VAX/VMS, PDP 11/44, and SysVr3 were
great :-).  Back then it wasn't called GIS, or for that matter anything
at all.  We were visualizing using Tektronix Graphics Terminals over
9600 bps serial connections.  10base2 and 10base5 coax ethernet
connected the back room machines to each other.

Data were hand-entered off of paper forms or uploaded from 9-track
reel-to-reel tapes.  The data were then run through conversion programs
and stored into databases (NOT SQL) using record-locking techniques.
Data were analysed, corrected, or modified by any of a number of people
tasked with making sense of the information.

Kendall
-- 
Kendall Sears <krsears at starband.net>