[Talk-us] Fresno cadastral imports

stevea steveaOSM at softworkers.com
Thu Apr 26 23:26:36 BST 2012


WernerP wrote:
>User nmixter has been the user who did the import. I would recommend 
>to revert the changeset(s) and delete the useless stuff. In the 
>small area I checked there were many errors (overlapping lines, 
>double nodes...). I agree, that there is no way to fix stuff. User 
>BiIbo modified many objects (about 33 %), but it is not obvious what 
>he really changed.
>...I think we should simply delete all objects without any osm-tags.


User nmixter, in addition to being a friend of mine and a frequent 
hiking buddy (we upload our GPS tracks from our hikes into OSM, which 
I consider "real OSM mapping"), is likely one of the top contributors 
of OSM data on Earth.  Really, by number of uploads, he may be the 
project's #1 contributor (or was at one point in time).

That said, I offer the following (not brief) history as deeper 
insight, not as mean-spirited or holier-than-thou.

Nathan (Mixter) is a very earnest fellow when it comes to OSM.  I 
believe him to have the utmost respect for our project, and he really 
wants the map to "bloom" as he puts it.  Reading 
http://wiki.openstreetmap.org/wiki/California/Import, he talks about 
"ongoing" imports, usually county-at-a-time, using data which he 
scrupulously finds and decides to use only if he believes it to be 
from official sources and of reasonable quality (or which, as I keep 
saying, can be MADE INTO reasonable quality data).  He talks about 
"turning the state brown" (California Farm Bureau data) and "turning 
the state green" (California State Department of Conservation 
Farmland Mapping and Monitoring Program, FMMP data).

He did help user:Apo42 (another hiking buddy of ours) well-integrate 
the Mid-Peninsula Regional Open Space District data so that much of 
that parkland (except closed-to-public areas) appears on the map.  He 
also offered good consensus, as did I: agree with MROSD that we 
shouldn't enter the closed-to-public areas into OSM, and have Apo use 
a "landuse=common" tag on these not-quite-leisure=park areas.  Nathan 
is no stranger when it comes to good discussion, and to offering and 
listening to a greater sense of consensus BEFORE he does an 
import/bulk upload.

He also has worked with me extensively on the import of the Santa 
Cruz County GIS Department's official landuse data into OSM, the 
process of which we have documented extensively at 
http://wiki.openstreetmap.org/wiki/Santa_Cruz_County,_California. 
Nathan did an initial upload (which was fraught with technical 
problems), reverted it (but not completely), did a second upload 
(which was better, but still filled with "noisy" data), and then 
worked closely with me on fixing up the data.  Nathan might have 
done 10% to 20% of the fix-up, but I did the other 80% to 90% (having 
lived in Santa Cruz County for many decades) and it has taken me the 
better part of two years of rather frequent OSM editing to do so. 
During this time, Santa Cruz won a "Gold Star" award on BestOfOSM.org 
(one of just a handful in North America) for "nearly perfect landuse" 
but I myself will say that was not without huge effort on my part to 
correct thousands of serious mistakes in Nathan's import. 
Nonetheless, he-and-I-together made a large part of this possible. 
(Of course, we also stand on many shoulders of other OSM 
contributors!)

My point is that OSM + TIGER + TIGER-cleanup + early contributors + a 
noisy but OK import + some cleanup by the importer + years of local 
love in editing by yours truly (or anybody) = a Gold Star award!  So, 
imports, done well (with consensus, good tags, assuring quality 
data...) really are worth doing.  Just not haphazardly.

About six months ago, Nathan "just did" a similar countywide import 
of FMMP data into neighboring (to Santa Cruz County) San Mateo 
County.  This, too, had its rough edges, but it did cause the map to 
"bloom" from largely colorless white/blank areas (except urban 
"roaded" areas) into a "fairly good" mapped area showing landuse in 
colors which more-or-less respect both the original data, OSM's tags 
and mapnik rendering (without straying too close to or over the 
boundary of "coding for the renderer").  San Mateo County farm data 
were still hundreds of polygons, and took me the better part of a 
month to review and conclude "well, it's in the map and it looks OK, 
but I'm still not sure if I consider it good enough..." before 
more-or-less resigning myself to what Nathan did and my lack of 
knowledge of specific areas (which I can't see in Bing, for example). 
Other OSM editors (in the future) who know rural San Mateo County 
better than I do are just going to have to improve what nmixter did 
there.  A reversion or deletion of the changesets would be overkill.

Meanwhile, Nathan (on every hike we went on) seemed very eager to 
upload Monterey County, a rather huge area in California (it is over 
100 miles from north to south and probably larger than Connecticut). 
I looked at an early and quite raw version of the dataset (and I 
still have it) and it was so overwhelming (not for my JOSM editor, 
Java environment or serious computing desktop, but rather as a 
single, comprehensible "thing" to mentally and visually parse at 
once) that I asked him to please hold off on uploading these data.  I 
said I would help him massage the tags into "more appropriate for 
OSM" tags (something he apparently did not do for Fresno County) and 
to his credit, he HAS withheld the Monterey County upload.

And, to Nathan's excellent credit, he DID enter a page on the 
"pending" Monterey County upload into OSM's wiki 
(http://wiki.openstreetmap.org/wiki/Monterey_County_Checklist) which 
is largely a cut-and-paste of my "initial distillation of criticisms" 
of the dataset that I sent to him in a private message.

However, as we now see, nmixter has sprayed Fresno with a large data 
upload, which looks to me (sorry, Nathan) like something which needs 
to be reverted in whole.  As already noted here, many tags are 
superfluous, it is quite likely that good tests were left undone (or 
that what the Validator would have violently complained about was 
simply ignored and uploaded anyway), and Nathan appears to not have 
tried to achieve consensus in a way that would have prevented this.  
That is the crux of the problem, not his raw data (which, with some 
improvement and more consensus, can truly turn into something 
appropriate for OSM, in my opinion).

Nathan goes by at least three OSM user accounts that I know of: 
nmixter, srmixter and eureka gold.  It is usually the latter by which 
he does bulk uploads, but not always.  His home/work areas around 
Hollister and Gilroy are truly something beautiful to behold 
(hand-drawn buildings, rich sets of POIs...).

Now come the most important issues in this missive:  Nathan is 
earnest.  He really wants to communicate among fellow OSMers (my 
personal evidence and experiences clearly establish that, and I so 
attest), and he wants to upload quality data (and he even listens 
when I say to him "what you have now is poor quality data for OSM; 
don't upload those until we or others improve it so it is GOOD ENOUGH 
for OSM").  There is a kind of "hike and talk among your local fellow 
OSMers to achieve consensus..." and there is a "read (and contribute, 
when you have something important to say) the talk-us pages to 
achieve consensus..." and there is even a "put up a page in OSM's 
wiki to achieve consensus...".  What seems to be sorely lacking is a 
sort of "mid-level" (countywide? statewide? it doesn't have to be by 
political boundaries) way of achieving consensus which we miss (and 
miss badly) here in the USA.  User:nmixter is its latest example with 
his recent upload in Fresno.

I'm personal-messaging him to read talk-us as we discuss this, and 
maybe he himself will chime in here.  Let's give it a few days, as 
this channel can be a bit slow.  He really wants to upload quality 
data; he just has a tendency towards an itchy trigger finger on his 
"upload" button when he feels a lack of consensus on whether his data 
is "ready for import."  Yes, that is a problem, but I believe we can 
address it by achieving better networked communication and 
consensus (how to better do so?) rather than "tearing him a new one" 
for one more messy upload.  And of course, I can't speak FOR Nathan, 
but rather as somebody who has worked with him on OSM fairly closely 
regarding countywide bulk data imports/uploads for years now.

Let's communicate our intentions amongst each other, improve the 
map, and not isolate each other with boos and tear-downs.  We really 
can work well (and better) together.  These talk pages are part of 
that.

SteveA
California


