[Talk-us] California is too big ;)

OSM Volunteer stevea steveaOSM at softworkers.com
Tue Nov 6 17:15:21 UTC 2018

On Nov 6, 2018, at 12:38:05 AM PST, Frederik Ramm <frederik at remote.org> wrote:
> ...on the Geofabrik download server, we usually split up countries into
> sub-regions once their single .osm.pbf has gone over a certain size. The
> aim is to make it easy for people to work with data just for their
> region, even on lower-spec hardware where it might be difficult to
> handle huge files.
> ...but after that,
> for the first time ever, a second-level entity (California) will be
> larger than all not-yet-split countries.
> So I wonder:
> 1. is there already a site where someone interested in only a subset of
> California can download current data and potentially also daily diffs?

Whether you know this or not, your algorithm of "splitting" makes too much sense to ignore:  there really are people with older hardware, and making geographic entities "bite-sized" is a technical reality, hence a necessity.  The data are otherwise simply too large.

> 2. is there a demand for this?

Not by me, but that doesn't mean it doesn't exist; it VERY likely does exist.  Let's keep OSM "human sized" by keeping the data "bite sized" enough for reasonable people and reasonable hardware/software toolchains to handle, lest we and our machines choke on too much data.

> 3. what would be a sensible way to split California - in 58 counties, or
> maybe just go with SoCal and NorCal for now?

I wasn't personally aware that this "splitting" goes on in OSM (planet.osm becoming a smaller .osm or .osm.pbf), but it makes perfect technical sense.

And while I read and understand Vivek Bansal's suggestion about "six Californias" and Tod Fitch's "I detest this" (incidentally, I "detest this," too), I have a suggestion which is likely easier, more "politically simple" and, I believe, rather geographically elegant.

There is an (almost) "straight across" (west to east, "latitudinal") split of California which nicely keeps the major population centers of Southern and Northern California apart, neatly follows county lines (political boundaries of admin_level=6), and is almost a "straight line" (geographically, a parallel of latitude, since it runs along nearly constant latitude).

It works like this:  there are 58 counties in California.  Put these 10 counties into "Southern California":

San Diego, Imperial, Orange, Riverside, Los Angeles, Ventura, San Bernardino, Santa Barbara, San Luis Obispo and Kern.

And put "the rest" (48 of them) into "Northern California."
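
As a minimal sketch of that rule (in Python; the names SOCAL_COUNTIES, TOTAL_CA_COUNTIES, NORCAL_COUNT and region_for are hypothetical, chosen only for illustration, and county names are assumed to be spelled as in the list above):

```python
# Hypothetical illustration of the ten-county split described above.
SOCAL_COUNTIES = frozenset({
    "San Diego", "Imperial", "Orange", "Riverside", "Los Angeles",
    "Ventura", "San Bernardino", "Santa Barbara", "San Luis Obispo", "Kern",
})

TOTAL_CA_COUNTIES = 58  # number of counties in California


def region_for(county_name: str) -> str:
    """Return which of the two proposed regions a California county falls in.

    Assumes the input really is a California county: any name that is not
    one of the ten southern counties is treated as "Northern California".
    """
    name = county_name.strip()
    # Accept names written with or without a trailing " County".
    if name.endswith(" County"):
        name = name[: -len(" County")]
    if name in SOCAL_COUNTIES:
        return "Southern California"
    return "Northern California"


# "The rest": 58 counties minus the ten southern ones.
NORCAL_COUNT = TOTAL_CA_COUNTIES - len(SOCAL_COUNTIES)
```

For example, region_for("Kern County") gives "Southern California" and region_for("Fresno") gives "Northern California".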

Geographically, this is very close to a "straight line" (east-west) at about latitude 35.7889805, although it wanders very slightly in Sequoia National Park (because of a mild survey error 150 years ago near the Kern River, I think).  It also takes a few minor "jogs" in far eastern California:  one near Lamont Peak (between two national Wilderness boundaries), another "north, then easterly again" jog of about a kilometer near Boulder Peak close to United States Highway 395, and finally a similar "north, then easterly again" jog of about a mile (~1.6 km) in the Pahrump Valley Wilderness Area very close to the Nevada boundary, after which the line runs easterly a few kilometers to the Nevada State Line.  That's it.
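
For a quick first approximation only (not a substitute for the county boundaries, given the jogs just described), a point already known to be in California could be classified by that latitude alone.  A rough Python sketch, where SPLIT_LATITUDE and rough_region are hypothetical names used for illustration:

```python
SPLIT_LATITUDE = 35.7889805  # approximate latitude of the county line quoted above


def rough_region(lat: float) -> str:
    """Classify a point in California by latitude alone (an approximation).

    Points close to the line may be misclassified, because the real county
    boundary wanders and jogs slightly; use the county boundaries themselves
    for anything precise.
    """
    if lat > SPLIT_LATITUDE:
        return "Northern California"
    return "Southern California"
```

For example, rough_region(34.05) (roughly the latitude of Los Angeles) gives "Southern California", and rough_region(37.77) (roughly San Francisco) gives "Northern California".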

Honestly, it sounds more complicated than it is:  most people who look at a wider-scale map of California's counties "see" that this east-west line rather neatly divides California into two parts, a northern and a southern, with the designation of "those ten counties" as the simple method to do so.  It isn't "perfectly straight" but it is "perfectly suited" to do this division of California, in my opinion.

I hope this helps.  It is one of the few times that living in California has intersected with OSM and the talk-us pages where I can say "I think I know what I'm talking about here."  Although, I certainly welcome other suggestions:  these are the "talk" pages, after all!


