[Talk-us] Fresno cadastral imports
steveaOSM at softworkers.com
Thu Apr 26 23:26:36 BST 2012
>User nmixter has been the user who did the import. I would recommend
>to revert the changeset(s) and delete the useless stuff. In the
>small area I checked there were many errors (overlapping lines,
>double nodes...). I agree, that there is no way to fix stuff. User
>BiIbo modified many objects (about 33 %), but it is not obvious what
>he really changed.
>...I think we should simply delete all objects without any osm-tags.
User nmixter, in addition to being a friend of mine and frequent
hiking buddy so we can upload our GPS tracks from the hike into OSM
(what I consider "real OSM mapping") is likely one of the top
contributors of OSM data on Earth. Really, by number of uploads, he
may be the project's #1 contributor (or was at one point in time).
That said, I offer the following (not brief) history as deeper
insight, not as mean-spirited or holier-than-thou.
Nathan (Mixter) is a very earnest fellow when it comes to OSM. I
believe him to have the utmost respect for our project, and he really
wants the map to "bloom" as he puts it. Reading
http://wiki.openstreetmap.org/wiki/California/Import, he talks about
"ongoing" imports, usually county-at-a-time, using data which he
scrupulously finds and decides to use only when he believes it comes
from official sources and is of reasonable quality (or, as I keep
saying, can be MADE INTO reasonable-quality data). He talks about
"turning the state brown" (California Farm Bureau data) and "turning
the state green" (California State Department of Conservation
Farmland Mapping and Monitoring Program, FMMP data).
He did help user:Apo42 (another hiking buddy of ours) well-integrate
the Mid-Peninsula Regional Open Space District data so that much of
that parkland (except closed-to-public areas) appears on the map, and
offered good consensus (let's agree with MROSD that we shouldn't
enter into OSM the closed-to-public areas, as did I) that Apo use a
"landuse=common" tag on these not-quite-leisure=park areas; Nathan is
no stranger when it comes to good discussion and offering and
listening to a greater sense of consensus BEFORE he does an
He also has worked with me extensively on the import of the Santa
Cruz County GIS Department's official landuse data into OSM, the
process of which we have documented extensively at
The way Nathan did this was an initial upload (which was fraught with
technical problems), a revert of it (but not a complete one), a second
upload (which was better, but still filled with "noisy" data), and
then close work with me on fixing up the data. Nathan might have
done 10% to 20% of the fix-up, but I did the other 80% to 90% (having
lived in Santa Cruz County for many decades) and it has taken me the
better part of two years of rather frequent OSM editing to do so.
During this time, Santa Cruz won a "Gold Star" award on BestOfOSM.org
(one of just a handful in North America) for "nearly perfect landuse"
but I myself will say that was not without huge effort on my part to
correct thousands of serious mistakes in Nathan's import.
Nonetheless, he-and-I-together made a large part of this possible.
(Of course, we also stand on many shoulders of other OSM
My point is that OSM + TIGER + TIGER-cleanup + early contributors + a
noisy but OK import + some cleanup by the importer + years of local
love in editing by yours truly (or anybody) = a Gold Star award! So,
imports, done well (with consensus, good tags, assuring quality
data...) really are worth doing. Just not haphazardly.
About six months ago, Nathan "just did" a similar countywide import
of FMMP data into neighboring (to Santa Cruz County) San Mateo
County. This, too, had its rough edges, but it did cause the map to
"bloom" from largely colorless white/blank areas (except urban
"roaded" areas) into a "fairly good" mapped area showing landuse in
colors which more-or-less respect both the original data, OSM's tags
and mapnik rendering (without straying too close to or over the
boundary of "coding for the renderer"). San Mateo County farm data
were still hundreds of polygons, and took me the better part of a
month to review and conclude "well, it's in the map and it looks OK,
but I'm still not sure if I consider it good enough..." before
more-or-less resigning myself to what Nathan did and my lack of
knowledge of specific areas (which I can't see in Bing, for example).
Other OSM editors (in the future) who know rural San Mateo County
better than I do are just going to have to improve what nmixter did
there. A reversion or deletion of the changesets would be overkill.
Meanwhile, Nathan (on every hike we went on) seemed very eager to
upload Monterey County, a rather huge area in California (it is over
100 miles from north to south and probably larger than Connecticut).
I looked at an early and quite raw version of the dataset (and I
still have it) and it was so overwhelming (not for my JOSM editor,
Java environment or serious computing desktop, but rather as a
single, comprehensible "thing" to mentally and visually parse at
once) that I asked him to please hold off on uploading these data. I
said I would help him massage the tags into "more appropriate for
OSM" tags (something he apparently did not do for Fresno County) and
to his credit, he HAS withheld the Monterey County upload.
And, to Nathan's excellent credit, he DID enter a page on the
"pending" Monterey County upload into OSM's wiki, which
is largely a cut-and-paste of my "initial distillation of criticisms"
of the dataset that I sent to him in a private message.
However, as we now see, nmixter has sprayed Fresno with a large data
upload, which looks to me (sorry, Nathan) like something which needs
to be reverted in whole. As already noted here, many tags are
superfluous, it is quite likely that checks which Validator would
have violently complained about were either not run or simply ignored
and the data uploaded anyway, and Nathan appears not to have tried to
achieve consensus in a way that would have prevented this. That is the crux
of the problem, not his raw data (which with some improvement and
more consensus, can truly turn into something appropriate for OSM, in
Nathan goes by at least three OSM user accounts that I know of:
nmixter, srmixter and eureka gold. It is usually the latter by which
he does bulk uploads, but not always. His home/work areas around
Hollister and Gilroy are truly something beautiful to behold
(hand-drawn buildings, rich sets of POIs...).
Now come the most important issues in this missive: Nathan is
earnest. He really wants to communicate among fellow OSMers (my
personal evidence and experiences clearly establish that, and I so
attest), and he wants to upload quality data (and he even listens
when I say to him "what you have now is poor quality data for OSM;
don't upload those until we or others improve it so it is GOOD ENOUGH
for OSM"). There is a kind of "hike and talk among your local fellow
OSMers to achieve consensus..." and there is a "read (and contribute,
when you have something important to say) the talk-us pages to
achieve consensus..." and there is even a "put up a page in OSM's
wiki to achieve consensus...". What seems to be sorely lacking is a
sort of "mid-level" (countywide? statewide? it doesn't have to be by
political boundaries) way of achieving consensus which we miss (and
miss badly) here in the USA. User:nmixter is its latest example with
his recent upload in Fresno.
I'm personal-messaging him to read talk-us as we discuss this, and
maybe he himself will chime in here. Let's give it a few days, as
this channel can be a bit slow. He really wants to upload quality
data; he just has a tendency towards an itchy trigger finger on his
"upload" button when he feels a lack of consensus on whether his data
is "ready for import." Yes, that is a problem, but I believe we can
address it by achieving better networked communication and
consensus (how to better do so?) rather than "tearing him a new one"
for one more messy upload. And of course, I can't speak FOR Nathan,
but rather as somebody who has worked with him on OSM fairly closely
regarding countywide bulk data imports/uploads for years now.
Let's communicate our intentions amongst each other, improve the
map, and not isolate each other with boos and tear-downs. We really
can work well (and better) together. These talk pages are part of