[OSM-talk] OSM the mediocre alternative

Sun Apr 22 01:43:09 BST 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christopher Schmidt wrote:
> On Sat, Apr 21, 2007 at 07:06:41PM +0200, Lars Aronsson wrote:
>> But very few of us know how the PostGIS extension to PostgreSQL 
>> works internally or how you should program geographic datatypes 
>> and indexes on your own.  The result from this is Steve's current 
>> data model and the fact that the rest of us accept this as a 
>> viable solution.  Those who don't, because they know more of GIS, 
>> like Christopher Schmidt, are repelled by everything they find 
>> under the hood of OSM.
> 
> I'm actually not repelled by everything. It's simply a different choice
> than I would make. Specifically: 
>  * OSM uses topology as its base storage. Topology is good for making 
>    graphs, which is important when you need to do routing. For this
>    reason, (it seems to me) that OSM was built towards the goal of
>    creating driving directions. Great.
> 
>  * Most GIS uses Simple Features -- not topological -- for handling
>    data. The result is very different -- Simple Features are designed
>    for making maps. If you'd like evidence, look at how the mapnik maps
>    are built: the topology is turned into simple features, and stored in
>    PostGIS. My MapServer demos just under a year ago worked the same
>    way. 
> 
> The difference to me is simple:
>  
>   If I want to draw an OSM feature on a map, I have to fetch a large
>   number of pieces of data from the API individually, and combine them
>   to create a geographic feature.

I agree this is the state of the current OSM API, but I can't see why a
topologically stored dataset can't be fetched as features in one go.
Converting features to topology, however, seems error-prone.

> Example: 
> 
>   Way ID 4213747:
>     1 way.
>     21 segments.
>     22 nodes.
> 
> So, to visualize this one way, I have to make 44 fetches to the API. 

Not if you use the new extension to the API that Richard Fairhurst has
written for Potlatch.

In terms of underlying relational databases, this is a completely
trivial query (albeit quite a long one because there are several joins
going on):

select * from way_segments
 inner join segments on way_segments.segment_id=segments.id
 inner join nodes as nodeStart on segments.node_a = nodeStart.id
 inner join nodes as nodeEnd on segments.node_b = nodeEnd.id
order by way_segments.id,way_segments.sequence_id

A relational database, with the proper indexes in place, can do a query
like this in no time at all.
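
To make that concrete, here is a rough sketch of the same join run
against an in-memory SQLite database from Python. The schema is a
guess: I have kept the table and column names from the query above,
but the lat/lon columns on nodes, the index, the sample rows and the
fetch_way_line helper are all made up for illustration and are not
the real OSM schema. It also assumes the segments of a way join up
head to tail.

```python
import sqlite3

# Hypothetical, simplified schema reusing the table names from the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL);
    CREATE TABLE segments (id INTEGER PRIMARY KEY, node_a INTEGER, node_b INTEGER);
    CREATE TABLE way_segments (id INTEGER, segment_id INTEGER, sequence_id INTEGER);
    CREATE INDEX way_segments_id_idx ON way_segments (id, sequence_id);
""")

# Made-up sample data: one way (id 100) built from two segments and three nodes.
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)",
                 [(1, 51.0, -0.1), (2, 51.1, -0.2), (3, 51.2, -0.3)])
conn.executemany("INSERT INTO segments VALUES (?, ?, ?)",
                 [(10, 1, 2), (11, 2, 3)])
conn.executemany("INSERT INTO way_segments VALUES (?, ?, ?)",
                 [(100, 10, 1), (100, 11, 2)])


def fetch_way_line(conn, way_id):
    """Return one way as an ordered list of (lat, lon) pairs, using one query."""
    rows = conn.execute("""
        SELECT node_start.lat, node_start.lon, node_end.lat, node_end.lon
        FROM way_segments
        INNER JOIN segments ON way_segments.segment_id = segments.id
        INNER JOIN nodes AS node_start ON segments.node_a = node_start.id
        INNER JOIN nodes AS node_end ON segments.node_b = node_end.id
        WHERE way_segments.id = ?
        ORDER BY way_segments.sequence_id
    """, (way_id,)).fetchall()

    coords = []
    for a_lat, a_lon, b_lat, b_lon in rows:
        # Skip a segment's start node when it repeats the previous end node.
        if not coords or coords[-1] != (a_lat, a_lon):
            coords.append((a_lat, a_lon))
        coords.append((b_lat, b_lon))
    return coords


line = fetch_way_line(conn, 100)
```

One round trip to the database gives the whole coordinate list for the
way, which is all a renderer needs in order to draw it as a line.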

> Now, if I switch to a simple features model:
> 
>   http://hypercube.telascience.org/~crschmidt/featureserver/featureserver.cgi/osm-line/4213747
> 
> I'm given a geometry ("Line"), list of coordinates, and list of
> properties. (This is JSON output: you can also see it as html by adding '.html'
> to the end, or as atom by adding '.atom' to the end.) 
> 
> "Line" can also be "Polygon", or "Point". (Or "MULTIPOLYGON", etc.,
> though FeatureServer doesn't support those.)
> 
> This is one fetch. I can now draw the feature. I can also query for
> other features which have the same name, and get the information for
> those, too:
> 
> http://hypercube.telascience.org/~crschmidt/featureserver/featureserver.cgi/osm-line/all.html?queryable=name&name=Al%20Jami'ah%20Street
> 
> This shows me that there is also a feature, ID 4213746, which has the
> same name. I can draw all these features on a map with the output of one
> query.

> 
> In OSM, that would be 88. 88 queries to the API, just so I can display
> two features.

The fact that OSM is missing this kind of query is not a function of
its data structure, it is a function of the fact that no one has added
this kind of querying to the API.

> However, all these are technical problems. Mapping to topology isn't
> hard

Yes it is. It might not be hard in practice because there are libraries
available, but it is much harder in theory than going the other way.

> -- mapping back the other way isn't impossible.

No, it's not impossible, it's completely trivial. See the query above.

What OSM does really badly is fetch data by location, because it
lacks spatial (i.e. R-tree) indexes. This needs to be fixed.

(See Wikipedia on R-trees: http://en.wikipedia.org/wiki/R-tree)
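
To show why a spatial index helps with fetching by location, here is
a toy sketch in Python. Note that it is NOT an R-tree: it is a much
simpler fixed-size grid of buckets, swapped in only because it fits
in a few lines. The GridIndex class, the cell size and the sample
nodes are all invented for illustration. The idea carries over,
though: a bounding-box query only looks at the buckets that overlap
the box instead of scanning every node.

```python
import math
from collections import defaultdict


class GridIndex:
    """Toy spatial index (a grid, not an R-tree): buckets points into cells."""

    def __init__(self, cell_size=0.01):
        self.cell_size = cell_size      # cell width and height in degrees
        self.cells = defaultdict(list)  # (row, col) -> [(node_id, lat, lon)]

    def _cell(self, lat, lon):
        # Map a coordinate to the integer grid cell that contains it.
        return (math.floor(lat / self.cell_size),
                math.floor(lon / self.cell_size))

    def insert(self, node_id, lat, lon):
        self.cells[self._cell(lat, lon)].append((node_id, lat, lon))

    def query(self, min_lat, min_lon, max_lat, max_lon):
        """Return sorted ids of nodes in the box, visiting only overlapping cells."""
        row_lo, col_lo = self._cell(min_lat, min_lon)
        row_hi, col_hi = self._cell(max_lat, max_lon)
        found = []
        for row in range(row_lo, row_hi + 1):
            for col in range(col_lo, col_hi + 1):
                for node_id, lat, lon in self.cells.get((row, col), ()):
                    # Cells on the edge can hold points just outside the box.
                    if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
                        found.append(node_id)
        return sorted(found)


# Made-up sample nodes: two close together, one far away.
index = GridIndex(cell_size=0.01)
index.insert(1, 51.501, -0.141)
index.insert(2, 51.502, -0.142)
index.insert(3, 52.900, 1.300)

nearby = index.query(51.50, -0.15, 51.51, -0.14)
```

A real R-tree groups nearby objects into nested bounding rectangles
rather than fixed cells, so it copes far better with unevenly spread
data, but the saving comes from the same place: most of the dataset
is never touched by the query.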

Robert (Jamie) Munro

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGKq+Zz+aYVHdncI0RAmvFAKDo4BZMbvvpgYTTTVJHCt17Thq9bwCeLHzF
YPrzhFZkdfuuDCi5BqKciD4=
=RJkg
-----END PGP SIGNATURE-----