[Talk-us] Fresno cadastral imports

Nathan Mixter srmixter at hotmail.com
Fri Apr 27 07:35:29 BST 2012

Thanks for everyone's comments. I do take pride in making sure my imports are good. That said, I realize there are issues with the Fresno import, one of my first imports. But I am not sure we need to throw out the baby with the bath water just yet. If it comes to that, fine. I think the problems with the import can be fixed. I am surprised it took two years for anyone to mention them, and then it came from someone in Canada who I guess is doing a review for the license.

So to address some of the issues for Fresno County:

"The consensus is against dumping cadastral data into OSM."
Consensus by whom? There have been many cases where it has been used effectively. Sometimes it is the only way to go because some areas can't be mapped by hand or are so vast that mapping them by hand is impossible. Sure, it shouldn't just be dumped in, but if it is selectively added it can be beneficial. Landuse is beneficial because it tells you where you can or cannot go and makes the map more usable. With parcel data, address data can be added later, making the map routable.

I don't buy the argument that having too much data is a bad thing. If that is the case, we might as well all stop editing the map altogether. Eventually, as more and more people join, people will add items to the map anyway. You can't have all white areas forever. The data shouldn't be over-digitized, but being under-digitized can be just as bad. There has to be a mix. I like the idea of having a filter for people who need to download larger areas.

No documentation or community consultation.
When I did the import, there weren't really any active mappers in the county. I try to consult local users when doing imports because they often have good suggestions about ways to handle certain areas or know what has been done locally. Unlike in Europe, there are many counties in the U.S. that don't have any active mappers, making it tough to establish a consensus. That is something we need to figure out a way to solve. One trick I've used is to move my location marker each time and then see all the users around it and what they are doing. Maybe in the future there could be some type of host site where people can comment on imports in an organized way, search for imports near them, and comment on them or offer to help with importing.

Extra tags.
I'm not sure what the problem is with having extra tag information in the database. A couple of extra bytes? We are allowed to create custom tags and use them however we want to describe the data when tags don't already exist. What is wrong with incorporating a few of the original tags from the producers of the data into OSM? Was this overkill? Maybe. Definitely with the description and location tags. But we can send the woodpecker bot in or run some other script and take out all the tags that aren't needed. No big deal. There are areas that do have overlapping tags like natural=water and landuse=residential due to an error in my conversion script, but these can be deleted manually or by script.
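
To give an idea of what such a cleanup script could look like, here is a minimal Python sketch. It assumes the data has been saved locally as an OSM XML file (for example from JOSM). The key names in UNWANTED_KEYS and the function names are only placeholders for illustration; the actual keys used in the import would have to be checked first.

```python
import xml.etree.ElementTree as ET

# Hypothetical keys to drop; check the real import's key names before use.
UNWANTED_KEYS = {"description", "location"}

# Tag pairs that should not sit on the same object (the pair named in the post).
CONFLICTS = [(("natural", "water"), ("landuse", "residential"))]


def strip_tags(root, unwanted=UNWANTED_KEYS):
    """Remove unwanted <tag> children from nodes, ways and relations.

    Returns the number of tags removed.
    """
    removed = 0
    for elem in root:
        if elem.tag not in ("node", "way", "relation"):
            continue
        for tag in elem.findall("tag"):
            if tag.get("k") in unwanted:
                elem.remove(tag)
                removed += 1
    return removed


def find_conflicts(root, conflicts=CONFLICTS):
    """Return the ids of objects that carry both tags of a conflicting pair."""
    hits = []
    for elem in root:
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        for (k1, v1), (k2, v2) in conflicts:
            if tags.get(k1) == v1 and tags.get(k2) == v2:
                hits.append(elem.get("id"))
    return hits
```

With a file on disk, the root would come from ET.parse(path).getroot(). The conflict check only reports ids and does not guess which of the two tags is correct, so those objects would still need a manual look.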

Empty areas
These can easily be filled in or deleted on a case by case basis. Sometimes this is federal land that can be tagged with a landuse tag upon further review. I tried to account for all areas when running the script on the initial data but missed some of the variables.

Split lots
Split lots are a pain, but these can be joined manually, as I have done in several locations.

Some duplicates were added since there were several different layers used in the Fresno imports, i.e. parks, schools and the county and city zones. That was my fault for not checking, but they can be merged or deleted in JOSM. The county lot includes all the crop types, which makes it possible to break down farms into vineyards or orchards. The city parcel set includes all the extra parcel information for each address.
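
As a rough illustration of how duplicates across layers could be found before merging or deleting them by hand, here is a small Python sketch. It assumes the way geometries have already been read into a dict of way id to a list of (lat, lon) pairs; the function name and that input format are hypothetical. It treats two ways as duplicates when they share exactly the same set of vertices, which is a simplification and will miss copies whose nodes are slightly offset.

```python
from collections import defaultdict


def find_duplicate_ways(ways, precision=6):
    """Group way ids whose vertices are identical after rounding.

    ways: dict mapping way id -> list of (lat, lon) tuples.
    Returns a list of sorted id lists, one per group of likely duplicates.
    """
    groups = defaultdict(list)
    for way_id, coords in ways.items():
        # A frozenset ignores vertex order and the repeated closing node,
        # so the same ring started at a different corner still matches.
        key = frozenset(
            (round(lat, precision), round(lon, precision)) for lat, lon in coords
        )
        groups[key].append(way_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

The groups it returns are only candidates. Which copy to keep (for example the county parcel with the crop type or the city parcel with the address details) is still a judgment call to make in JOSM.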

I'm not sure what user:BiIbo did to update the data afterward. But I have done numerous updates myself to delete duplicates, merge data and align the layers. The TIGER data in the county was really bad, which adds to the problem of things not lining up. While I acknowledge there are several errors in the data, I think it is still possible to clean them up manually or merge parcels into one area for each block, and I am willing to continue working on cleaning up the data if that is the route chosen.

When I originally converted the shapefile, it created more than 100 OSM files, which I loaded one by one. It was tedious. That was back in the day when files took forever to load. I wish I had known then what I know now about importing. I do agree it is a lot easier to do a little pre-import work first and save a lot of time later on.

So, bottom line, I am OK if the data has to be reverted. I still think it belongs in OSM, but in a more condensed and error-free form. I am willing to help however I can, either to clean up the data or to reimport it in a more streamlined fashion. Thanks for hearing me out. Hopefully I didn't put too many people to sleep. OK, it's now on the wiki too at http://wiki.openstreetmap.org/wiki/Talk:Fresno_County,_California. Feel free to add comments or thoughts.