[osmosis-dev] Invalid ways, now with code

Toby Murray toby.murray at gmail.com
Fri Dec 14 02:40:10 GMT 2012


On Thu, Dec 13, 2012 at 5:13 AM, Paweł Paprota <ppawel at fastmail.fm> wrote:
> On 12/13/2012 03:48 AM, Toby Murray wrote:
>>>
>>>
>>> SELECT ST_IsValid('LINESTRING(1 1, 1 1, 1 1, 1 1, 1 1)'::geometry)
>>
>>
>> This was touched on in the previous thread a little. The problem is
>> that there is no direct interaction with the database at this level.
>> When doing an import, no connection to the database even exists.
>> Everything goes to a dump file and is then loaded in via a COPY
>> command. So I can't just call ST_IsValid.
>
>
> Sure, I didn't mean that you should call ST_IsValid - it was just to
> illustrate the point that a way with 5 nodes that are in the same position
> is invalid.
>
>
>> Furthermore, I don't see a
>> way of checking for geometry validity in java.
>
>
> GEOS (which PostGIS is based on) is a port of JTS from Java to C so there
> most certainly is a way to check validity in Java :-)
>
> http://www.vividsolutions.com/jts/javadoc/com/vividsolutions/jts/geom/Geometry.html#isValid()
>
> The implementation does not seem too complicated, it could be used in
> Osmosis...
>
> https://github.com/sergei/jts/blob/master/src/com/vividsolutions/jts/operation/valid/IsValidOp.java#L185

Ah, OK. This is my first step into the world of GIS programming and I
wasn't aware of JTS. I was just looking at the PostGIS API since that
is what is already being used to generate the linestrings in the
first place.


>> Also, out of the 3,455 ways with invalid geometries currently in my
>> database, 3,157 of them are single node ways. So just this simple node
>> count check eliminates over 90% of invalid geometries. Given the
>> challenges and costs of doing more exhaustive checking, this seems
>> like a decent compromise.
>>
>
> Sure but in this case I would suggest naming this option differently because
> setting keepInvalidWays to false implies that there will be no invalid ways
> in the database which may not be true with the current implementation of
> this option.

Well, technically there is a difference between way validity and
linestring validity :)

Zero-node ways actually result in a valid linestring even though they
have no real meaning in OSM. IMO zero- and single-node ways should be
rejected outright by the OSM API because they have no meaning. But
they aren't, so we get to deal with the results. Multiple nodes at the
same location could in theory be valid in some crazy OSM 3D scheme.
I'm sure some proposal for mapping elevators like this exists in the
German wiki or something...
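
To make the distinction concrete, here is a rough sketch of a stricter
check that counts distinct positions instead of nodes. It would catch
the 5-identical-node linestring quoted above, but it would also throw
out the stacked-node case. The class and method names are hypothetical
and not part of Osmosis, and the exact coordinate comparison is a
simplifying assumption; a real validity test such as ST_IsValid or
JTS isValid() may apply other rules.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Osmosis or JTS.
final class DistinctPositionCheck {
    // Returns true once two different (x, y) positions have been seen.
    // Coordinates are compared exactly, with no tolerance (assumption).
    static boolean hasTwoDistinctPositions(List<double[]> coords) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] c : coords) {
            seen.add(List.of(c[0], c[1]));
            if (seen.size() >= 2) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same shape as LINESTRING(1 1, 1 1, 1 1, 1 1, 1 1) from the thread.
        List<double[]> stacked = List.of(
            new double[] {1, 1}, new double[] {1, 1}, new double[] {1, 1},
            new double[] {1, 1}, new double[] {1, 1});
        System.out.println(hasTwoDistinctPositions(stacked));
    }
}
```

Unlike a plain node count, this needs the node coordinates, not just
the list of node ids.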

Brett had mentioned in the previous thread that maybe another option
could be added to specify whether all invalid geometries should be
kept out or not, regardless of node count. This could still be added
although it might be a little redundant.

>> I don't think it is currently possible to keep *all*
>> invalid ways out of the database during an import. If this is really a
>> requirement, it will need to be done with a query after the database
>> has been populated.
>>
>
> I'm not sure about this - WayGeometryBuilder#createWayLinestring seems like
> a perfect place to plug in the code from JTS to skip invalid ways.

Yes, it would in theory be possible with the JTS library. Do you have
any idea how efficient this check is? If it takes 1 ms to check for
linestring validity, then it will add 28 hours to a planet import, at
which point it would be faster to do a
"DELETE ... WHERE NOT ST_IsValid(...)" query after the data is loaded.
But I guess I might play around with it and see.
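
The 28 hour figure is plain arithmetic: 28 hours is 100,800 seconds,
so at 1 ms per check it corresponds to roughly 100 million ways. The
sketch below redoes the estimate for other per-check costs. The way
count is back-derived from the numbers in this message, not measured,
and the class and method names are made up for the example.

```java
// Hypothetical back-of-the-envelope calculator, not part of Osmosis.
final class ValidityCheckCost {
    // Extra import time in hours for a given number of ways and a given
    // cost per validity check in milliseconds.
    static double extraHours(long wayCount, double millisPerCheck) {
        return wayCount * millisPerCheck / 1000.0 / 3600.0;
    }

    public static void main(String[] args) {
        // Assumption: way count implied by "1 ms per check adds 28 hours".
        long ways = 100_800_000L;
        System.out.println(extraHours(ways, 1.0));
        System.out.println(extraHours(ways, 0.01));
    }
}
```

If the real check turns out to cost closer to 10 microseconds, the
same way count works out to well under an hour of extra import time.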

The other problem is that this method can't be used in replication
mode since the linestring is built in SQL. But checking node count
still works.

I still think the code I have right now is useful. The biggest plus is
that it creates consistency between what happens during import and
what happens during diff application. And it takes care of the vast
majority of the problem cases at the cost of (more or less) one if
statement.
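
For illustration, a node count check of this kind could look roughly
like the sketch below. This is not the actual Osmosis patch: the class
and method names are hypothetical, and the minimum of two nodes is an
assumption based on the zero- and single-node cases discussed above.

```java
import java.util.List;

// Hypothetical helper, not part of Osmosis.
final class WayNodeCountCheck {
    // Assumption: a way needs at least two node references to be kept.
    static final int MIN_NODES = 2;

    // The whole check is the single comparison below.
    static boolean hasEnoughNodes(List<Long> nodeIds) {
        return nodeIds.size() >= MIN_NODES;
    }

    public static void main(String[] args) {
        System.out.println(hasEnoughNodes(List.of()));        // zero-node way
        System.out.println(hasEnoughNodes(List.of(1L)));      // single-node way
        System.out.println(hasEnoughNodes(List.of(1L, 2L)));  // ordinary way
    }
}
```

Because it only looks at the list of node ids, the same test can run
during import and during diff application without needing geometry.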

Toby


