[Talk-GB] Request for UK address lists for postcode extraction
Andy Robinson (blackadder-lists)
ajrlists at googlemail.com
Mon Dec 1 16:13:45 GMT 2008
Brian Quinion wrote:
>Sent: 01 December 2008 4:01 PM
>To: Andy Robinson (blackadder-lists)
>Cc: David Earl; talk-gb at openstreetmap.org
>Subject: Re: [Talk-GB] Request for UK address lists for postcode extraction
>Andy Robinson wrote:
>> David Earl wrote:
>>>On 01/12/2008 14:11, Brian Quinion wrote:
>>>> Has anyone got any suggestions, or is willing to offer any data? Even
>>>> personal address books would be useful for testing...
>>>You know all the 2,500 or so prefixes, and there are only 26 x 26 * 100
>>>combinations for the second part for each - about 200 million in all. If
>>>you feed these potential postcodes in quotes into Google UK over a long
>>>period with appropriate pauses so as not to get locked out, and look at
>>>the result for recognizable addresses (that's the tricky bit) as I'm
>>>doing in the Namefinder, you'd probably cover 75% of UK postcodes.
>> I'm curious about this. Data scraped via Google is still subject to the
>> terms of the original page it references?
>I looked into this and came to the conclusion that you could probably
>claim 'fair use' as long as you pulled each address from a different
>website. The trouble is that for most searches you end up on one of a
>small number of directory sites so doing any significant number is
>likely to end up as a database extraction. The results are also
>mostly limited to business addresses.
>Probably it would be possible to filter it so not too many requests
>went to any one site, but that still leaves the possibility that they
>used Royal Mail's postcode finder (or similar) to find their original
>data. Across a large number of sites you could end up doing a
>database extraction from Royal Mail regardless.
>Address books and company mailing lists seemed like a preferable
>source and as long as individuals' names are not included privacy
>shouldn't be an issue.
I'd noted that too. Business directory listings (Yell, Thomson etc) or house
price finders which are using copyright Land Registry data in the
One source I am exploring is planning application listings produced by the
local authority, which I think is where you were headed?
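
As an aside on the numbers David quotes above, here is a rough sketch of the arithmetic and of how candidate postcodes might be enumerated. The function names are made up for illustration. The 2,500 prefix count and the 26 x 26 x 100 figure are taken from the thread as stated. The "one digit plus two letters" layout for the second part is my own assumption, and it gives a much smaller count per prefix. No search or throttling is shown, only the enumeration.

```python
from itertools import product
from string import ascii_uppercase, digits

# "2,500 or so prefixes" is the figure quoted in the thread.
PREFIX_COUNT = 2500


def quoted_estimate(prefixes=PREFIX_COUNT):
    """Total candidates using 26 x 26 x 100 second parts per prefix, as quoted."""
    return prefixes * 26 * 26 * 100


def inward_codes():
    """Yield candidate second parts.

    Assumption (not from the thread): the second part is one digit
    followed by two letters, e.g. "0AA".
    """
    for digit, first, second in product(digits, ascii_uppercase, ascii_uppercase):
        yield f"{digit}{first}{second}"


def candidates(prefix):
    """Yield every candidate postcode for one prefix (hypothetical helper)."""
    for inward in inward_codes():
        yield f"{prefix} {inward}"


if __name__ == "__main__":
    print(quoted_estimate())                 # total under the quoted figures
    print(sum(1 for _ in inward_codes()))    # second parts under the assumed layout
    print(next(candidates("CB1")))           # first candidate for an example prefix
```

With the quoted figures the total comes out at 169 million, which is in line with the "about 200 million" above. Under the digit-plus-two-letters assumption it is 6,760 second parts per prefix.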