[Talk-us] Hamlets!

stevea steveaOSM at softworkers.com
Fri Jun 21 17:38:11 UTC 2013


>On Fri, Jun 21, 2013 at 9:32 AM, Richard Welty <rwelty at averillpark.net> wrote:
>> hamlets are not incorporated areas and have no government functions.

Virtually always true, in my experience.  However, a hamlet might 
find itself inside of an incorporated city limit (say, for historical 
reasons).  See below.

>> in urban areas, hamlets are generally once distinct communities
>> that have been absorbed into larger entities. they have no legal standing,
>> but frequently the postal service will still deliver based on the name.

Virtually always true, in my experience.  But note how this smears 
together history, what hamlets "are," and what the post office does 
with them.  It is useful to call attention to these three (and other) 
semantic aspects of entities in OSM, hamlets being a good example. 
There are data, and there are the algorithms that consume them.

>> in rural areas in NY, hamlets generally have white on green road signs
>> erected by the state highway department and may have a CDP boundary.
>> local post offices and/or school districts may use the same name as the
>> hamlet.

Ditto:  what the DOT does and the federal Census Bureau (and post 
offices, school districts) do are important considerations of how 
data such as "here is a hamlet" ARE USED, and might INFLUENCE what we 
do with hamlets, but we must keep in mind how both a data structure 
(point, polygon...) AND an algorithm (geocoder, addressing index...) 
are important to whatever final result is being sought.  IT IS BOTH.

>> the CDP boundaries are at best vaguely related to the post office delivery
>> routes sharing the name.

See?  You have conflated the semantics of mail and census, and you 
get a mess.  Don't do that.  (Or if you do, be smart about it, rather 
than just expecting it to work from lazy assumptions).

>Serge Wroclawski wrote:
>I'm disinclined to touch a CDP based on my experience of living in
>one. In some places, they have the same function as a town.

Agreed:  CDPs are useful.  In many places, they are the only entity 
in the map that resembles organized human settlement of thousands 
upon thousands of people.  Let's not blithely throw that away, but 
keep CDPs as the oddities that they are.  (Just in a proper OSM 
framework where data consumption tools properly recognize and respect 
them as such).

>In NYC and DC, the hamlets were not places I'd ever heard of (even if
>they were close by). If they're just apartments, then it seems silly
>to keep them around, even if the post office delivers to them.

We should keep around the actual entities they purport to represent, 
but perhaps more accurately for what they actually are (an apartment 
complex, a mobile home park, a subdivision...).  This is done with 
better tags on data (and perhaps polygons instead of nodes, where 
appropriate), and smarter algorithms that consume those data.  High 
quality data (with smart tags and the proper structure) + high 
quality algorithms (that smartly recognize the proper broad spectrum 
of data entities within their scope) = high quality results!  Well, 
usually.

>So if I read you correctly, it seems like in urban areas that we know
>it's generally safe to reclassify them (either as a building, or
>building complex (as a multipolygon), or perhaps a neighborhood).
>
>Is that a fair statement?

Yes, provided we properly enter data into OSM that is accurate for 
what these entities are, tagged correctly.

>On 6/21/13 11:07 AM, Sean Bartell wrote:
>>>I realized only after last week's discussion about neighborhoods that
>>>the hamlets (which are distinct from neighborhoods) are the things
>>>messing up the geocoder. A neighborhood is understood to be a place
>>>that's not often in an address, but a hamlet is a village, and so a
>>>hamlet in the middle of an urban place doesn't make sense.
>>So a hamlet within municipal boundaries is almost certainly wrong. Could
>>we try to detect which imported hamlets are within cities, and delete
>>them or change them to place=neighbourhood?

A village might be a larger version of a hamlet:  VERY roughly 
speaking, both are "unincorporated communities" (too small to 
incorporate).  Say a hamlet is "a settlement with less than 100-200 
inhabitants" (as our wiki does).  Our wiki also notes 
place=isolated_dwelling, a settlement of "not more than two 
households."  Yet this page also says something (crucial) which I 
believe true of hamlets as well:  "They are outside other settlements 
(this does not mean that they are outside administrative boundaries) 
and form by themselves a settlement."  Let's be clear:  hamlets ALSO 
have this quirk of not necessarily being outside of administrative 
boundaries.  Yet they may also be outside of administrative 
boundaries.  Algorithms need to accommodate this actuality.
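
As a rough illustration of the detection Sean suggests, here is a minimal 
sketch in Python.  Everything in it is an assumption made for illustration: 
the function names, the (name, lon, lat) tuples, and the idea that each city 
boundary is a single simple ring of (lon, lat) vertices.  It treats lon/lat 
as planar coordinates and ignores multipolygons with holes.  Since a hamlet 
inside an administrative boundary is not necessarily wrong, it only flags 
candidates for human review; it does not delete or retag anything.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test. ring is a list of (lon, lat) vertices, closed or not."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def hamlets_inside_cities(hamlets, city_rings):
    """Return (hamlet, city) pairs for hamlet nodes inside a city ring.

    hamlets: iterable of (name, lon, lat) tuples (hypothetical input shape).
    city_rings: dict mapping city name to a list of (lon, lat) vertices.
    The result is a review list, not a list of things to delete.
    """
    flagged = []
    for name, lon, lat in hamlets:
        for city, ring in city_rings.items():
            if point_in_polygon(lon, lat, ring):
                flagged.append((name, city))
                break  # one containing city is enough to flag the hamlet
    return flagged


if __name__ == "__main__":
    cities = {"Big City": [(0, 0), (10, 0), (10, 10), (0, 10)]}
    nodes = [("Inner Hamlet", 5, 5), ("Outer Hamlet", 15, 5)]
    print(hamlets_inside_cities(nodes, cities))
```

A real consumer would want proper multipolygon handling and a spatial index, 
but the shape of the check is the same.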

Richard Welty wrote:
>i think we need to pull things like CDPs and hamlets out of the
>admin_boundary framework and confine it strictly to real government
>administration (and i think things like fire districts should be excluded
>from the admin_boundary framework as well).  i have heard the argument
>that all of these things can be considered administrative, but this
>becomes so broad and general that you end up with a useless mess.

Agreed:  I have converted my local CDPs from boundary=administrative 
to boundary=census and believe fire districts and universities and 
such don't belong in the strict hierarchy of boundary=administrative.
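
For the tag change itself, a minimal sketch in Python, operating on a plain 
dict of tags.  The function name is hypothetical, and deciding which 
boundaries are CDPs is left to the caller.  Dropping admin_level along the 
way is my assumption (it only has meaning inside boundary=administrative); 
it is not something anyone in this thread has specified.

```python
def retag_cdp_boundary(tags):
    """Return a copy of tags with boundary=administrative changed to census.

    The caller is responsible for passing only tags of boundaries known to
    be CDPs; this function does not try to detect them.
    """
    new_tags = dict(tags)  # leave the caller's dict untouched
    if new_tags.get("boundary") == "administrative":
        new_tags["boundary"] = "census"
        # Assumption: admin_level is dropped because it belongs to the
        # boundary=administrative hierarchy.
        new_tags.pop("admin_level", None)
    return new_tags


if __name__ == "__main__":
    before = {"boundary": "administrative", "admin_level": "8", "name": "Example CDP"}
    print(retag_cdp_boundary(before))
```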

>i also think the US is a little peculiar in that our official addressing
>derives solely from postal routes, which can differ significantly from
>the admin boundary framework. this is one of the issues with virtually
>all of the data consumers that try to handle this; european assumptions
>are the norm and the US isn't europe. i see this in the address handling
>for things like OsmAnd and mkgmap as well. i suspect we need some
>algorithmic changes in these entities to reflect US reality; fiddling the
>data is only a bandaid.

Well, "fiddling the data is only a bandaid" is partly true.  Fiddling 
the data AND fiddling the algorithms that consume them are the more 
complete solution.  Richard has hit the bulls-eye:  it's both.  As a 
corollary, for best results, both must differ in different parts of 
the world; that's just the way the world is.

SteveA
California


