[Talk-GB] Request for UK address lists for postcode extraction

David Earl david at frankieandshadow.com
Mon Dec 1 15:10:29 GMT 2008


On 01/12/2008 14:11, Brian Quinion wrote:
> Hi,
> 
> I'm currently doing some work trying to generate postcode location
> data for the UK using address lists and address lookup using OSM data
> to supplement NPE.  So far it seems to work quite well with the
> address lists that I have available to me (and coping quite well with
> ambiguous road names) but I'm limited in my data sources and most of
> the address data is fairly consistent in both format and quality.
> 
> So, before I open the interface to the public, I'd like to test the
> code with some lists provided by other people.
> 
> Does anyone have, or know of, any address lists that I would be able
> to use for this purpose?  Obviously it needs to be license compatible
> with OSM (so please no lists generated from royal mail postcode data!)
> and ideally I'm after data sets containing at least:
> 
> street address (house name / number optional)
> town / city
> postcode
> 
> formatted as CSV or TSV.  I'm specifically not after data containing
> the names of individuals.
> 
> Has anyone got any suggestions, or is willing to offer any data?  Even
> personal address books would be useful for testing...

Why not do it the other way round?

You know all the 2,500 or so prefixes, and there are only 26 x 26 x 100
combinations for the second part of each: about 200 million in all. If
you feed these potential postcodes, in quotes, into Google UK over a long
period, with appropriate pauses so as not to get locked out, and look at
the results for recognizable addresses (that's the tricky bit), as I'm
doing in the Namefinder, you'd probably cover 75% of UK postcodes.
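
As a rough illustration of the enumeration described above, here is a
minimal Python sketch. It takes the figures as stated in this message
(about 2,500 prefixes, 26 x 26 x 100 second parts). The function name
`candidates`, the example prefix "CB1" and the exact layout of the
generated string are assumptions made for illustration only; they are
not taken from the Namefinder code.

```python
from itertools import product
from string import ascii_uppercase

PREFIX_COUNT = 2_500                  # "2,500 or so prefixes" (from the message)
SUFFIXES_PER_PREFIX = 26 * 26 * 100   # second-part combinations (from the message)


def candidates(prefix, numbers=range(100)):
    """Yield quoted search strings for one prefix.

    The layout "<prefix> <number><letter><letter>" is an assumption;
    adjust it to whatever format the search should actually use.
    """
    for n, a, b in product(numbers, ascii_uppercase, ascii_uppercase):
        yield f'"{prefix} {n}{a}{b}"'


# Total search space using the round figures above.
total = PREFIX_COUNT * SUFFIXES_PER_PREFIX

if __name__ == "__main__":
    print(total)                       # size of the whole search space
    print(next(candidates("CB1")))     # first candidate for a sample prefix
```

With these round figures the product comes to 169 million, which is
the "about 200 million" quoted above.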

Yes, it's slow, but it's probably the biggest source there is. At one a
second it would take about 6 years, but by enlisting 100 friends you'd 
do it in a month, or less if it's possible to be more intelligent about
it. For example, for the number part, if there's no 14XX or 15XX I doubt
there would be any 16s or above either, except for a few special cases.
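
The timing estimate and the "stop early" idea above can be sketched as
follows. This is only an illustration under stated assumptions: the
names `years_needed` and `scan_numbers`, the `has_results` callback
(standing in for "did the search turn up any recognizable address for
this number?") and the threshold of two consecutive empty numbers are
all hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_needed(total_queries, per_second=1.0, workers=1):
    """Years needed to issue total_queries at per_second per worker."""
    return total_queries / (per_second * workers) / SECONDS_PER_YEAR


def scan_numbers(has_results, numbers=range(100), max_empty_run=2):
    """Return the numbers that produced hits, stopping early.

    Scanning stops once max_empty_run consecutive numbers give no
    hits, on the assumption that higher numbers are then unused too.
    has_results is a hypothetical callback supplied by the caller.
    """
    found = []
    empty_run = 0
    for n in numbers:
        if has_results(n):
            found.append(n)
            empty_run = 0
        else:
            empty_run += 1
            if empty_run >= max_empty_run:
                break
    return found


if __name__ == "__main__":
    print(years_needed(200_000_000))                         # one machine
    print(years_needed(200_000_000, workers=100) * 365.25)   # days, 100 helpers
```

Note that a cut-off like this would miss the "few special cases"
mentioned above, so those would need to be handled separately.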

David




