[Imports-us] NJ Landuse import (NJ2002LULC)

Serge Wroclawski emacsen at gmail.com
Wed Aug 21 10:29:08 UTC 2013


On Wed, Aug 21, 2013 at 12:19 AM, Darrell Fuhriman <darrell at garnix.org> wrote:
>
> I won't argue with your first two points, however...
>
>
>> 3. The import is wrong in a number of places
>>
>> I don't know if it's because the import is old, but the data is simply
>> wrong in many areas.
>
> Based on what? It's great to declare "it's simply wrong" but there should be a little bit more justification than that. Many areas? How many? How much of the total area do these "simply wrong" areas entail? Why are they "simply wrong"? And further, why is it being "wrong in a number of places" justification for removing everything in its entirety?

I'll start with your last question first.

The justification for removing the NJ2002LULC data is quite
comprehensive: I made several points, and all of them support
removal.

Now, "Based on what"- based on what's actually present on the ground,
vs what the data says is present. Many areas- not a huge number, but
I've seen more than a few. How many- I don't know, since I don't
survey the entire state of New Jersey by hand, but just yesterday I
found an area that it said was a graveyard that was actually a school.
Now, unless New Jersey elementary school students have a haunted
playground, that information is wrong.

Similarly, I've found areas classified as industrial that were residential, etc.

The data is simply wrong: either it is out of date, or it has always
been wrong.

Nonetheless, this is not the only argument for its removal; I made
five arguments in total.

>> 4. I disagree with many areas' subjective data
>>
>> One of my bigger frustrations with this import is that I simply
>> disagree with some of the classifications, especially "scrub", and
>> this is very typical of landuse classifications- they're highly
>> subjective.
>
> Totally disagree. "Subjective" is not the same as "something I disagree with".

I think that's the very definition of "Subjective"- two people looking
at something and having answers which differ based on the same data.

> Scrub/Shrub is fairly well defined classification (albeit, one of the less clear ones), with a body of scholarly literature behind it.

That may be the case, but the data in OSM must follow OSM
classifications, which differ.

Areas with trees that are 20+ feet high are not "scrub" by my
working definition of scrub, and they don't match the OSM wiki's
picture of what scrub means:

http://wiki.openstreetmap.org/wiki/Scrub

In this case we have a few possibilities:

1. The classifications NJ uses differ from the classifications OSM uses

2. There's a problem with the import (rather than the original
dataset) in what was classified as what

3. The data was wrong in the NJ2002LULC dataset

4. My working definition of scrub is wrong, and the wiki should be
updated for poor souls like me.

> They have been defined, and photo-identified by trained experts, and often rely on data that may not be
> available in visible photos (soil types, historical use records, infrared bands), or obvious without training
> (vegetation types).

That may be so, whereas my only experience with these places is "I used
to play there when I was a kid"- and I look at the Bing imagery and it
looks the way I remember.

> While I won't argue that these are 100% "objective" (simply because I doubt there is any such thing),
>"subjective" is not the same as wrong, and there are often good reasons to trust one person's "subjective"
> opinion over another's.

That is not a view that jibes with the OSM philosophy. If two OSM
observers look at an area, they should be able to come to roughly the
same conclusions. Additionally, they do not need to be experts, as we
do not have such requirements in the project.

The arguments you put forward are exactly why some of us do not
believe that landuse is generally a good thing to store in OSM:
it's so difficult for non-experts to measure.


>> I think that unlike the TIGER work that's being proposed, this import
>> is not really fixable. In order to fix it, the relations and their
>> component ways would basically need to be reconstructed. The work
>> would be huge and so complex I don't think that it would be doable
>> without some serious software engineering.
>
> I'm skeptical of this statement, though I will admit to not having looked closely at the data.
>
>> So my proposal is to remove this import entirely.
>
> My preference is to fix, rather than to remove.

Darrell, what is your OSM username? I ask because I've done quite a
lot of fixing of New Jersey, and especially trying to fix the import,
and these conclusions come from about a year of trying to fix it (off
and on).

If someone with 4+ years of active contributions, deep involvement
with the OSM technical community etc. can't figure out how to unwind
the complex data structures easily in order to fix small problems,
then I have a lot less hope for someone who is new to the project, or
even your average casual mapper.

From the tone of your email, I'm guessing that you feel passionately
about this topic. Perhaps you can find solutions to the various
problems. If that's the case, then I look forward to reading your
proposal on what to do with the import.

- Serge


