[OSM-dev] IP as personal information (Was: Re: Any tile statistics (unique visitors))

Stefan de Konink stefan at konink.de
Sun Dec 7 14:05:43 GMT 2008


Jochen Topf wrote:
> On Sun, Dec 07, 2008 at 02:22:54PM +0100, Stefan de Konink wrote:
>> Jochen Topf wrote:
>>> On Sun, Dec 07, 2008 at 01:36:05AM +0100, Stefan de Konink wrote:
>>>> Tom Hughes wrote:
>>>>> If what you're asking is why we don't just give anybody who asks 
>>>>> root access to the web server then I hope that would answer itself.
>>>> Read-rights on my logs are available to anyone :)
>>> I hope you have properly anonymized them beforehand! Even anonymized,
>>> the logs of a tile server are problematic in a privacy sense. You can
>>> probably identify some individuals from the session data.
>> In The Netherlands 'personal data' is extremely well defined. And I can  
>> tell you, an IP address is definitely *not* one of them.
> 
> A German court ordered a German federal ministry this year to stop
> logging IP addresses because they contain personal information. But it
> doesn't matter how you or "The Netherlands" define "personal data", some
> IP addresses can be traced back to a person.

Some postal codes can too. Still, the CBP (the Dutch government-backed 
organisation for the protection of personal information) ruled, on a 
question I put to it, that postal codes are *not* personal 
information, because the amount of work needed to identify one person 
is tremendous and potentially equal to actually requesting the data from 
that person.

In this case it was about population cancer statistics.

> And even if you anonymize the IP address you'll still get data that can
> be traced back to a person. If you know all the places somebody has
> looked at, chances are that you can figure out in some cases who that
> person is. If you don't believe this, I encourage you to read up about
> the AOL search log disaster two years ago. For instance at:
> http://www.wired.com/politics/security/news/2006/08/71579

You are overrating Apache log files. I don't know what you like to store 
on a high-volume website, but the original inquiry was only about a 
measure of unique visitors; not even browser statistics. So could you 
identify TomH by his IP address? Most likely, if TomH has no wife or 
children and is not sharing his IP address using NAT technologies.

...the rest of the consumers are sharing theirs, and are most likely also 
on a DHCP pool due to the incompetence of the telco.
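
To make that concrete, here is a minimal sketch (not anything that OSM or 
I actually run) of the kind of unique-visitor count the original inquiry 
was about. It assumes Apache's common/combined log format, where the 
client address is the first whitespace-separated field. The function name 
count_unique_visitors and the sample lines are made up for illustration. 
Addresses are hashed with a random salt that is thrown away afterwards, so 
only the count survives:

```python
import hashlib
import secrets


def count_unique_visitors(log_lines, salt=None):
    """Count distinct client addresses without keeping the addresses.

    Assumption: each line is in Apache common/combined log format, so the
    client address is the first whitespace-separated field.
    """
    if salt is None:
        # Throwaway salt: once it is gone, the digests cannot be recomputed
        # from a list of candidate addresses.
        salt = secrets.token_bytes(16)
    seen = set()
    for line in log_lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # skip blank lines
        address = fields[0]
        seen.add(hashlib.sha256(salt + address.encode("utf-8")).digest())
    return len(seen)


# Hypothetical sample lines using documentation-only address ranges.
sample = [
    '192.0.2.1 - - [07/Dec/2008:14:05:43 +0000] "GET /12/2103/1346.png HTTP/1.1" 200 1234',
    '192.0.2.1 - - [07/Dec/2008:14:05:44 +0000] "GET /12/2104/1346.png HTTP/1.1" 200 1301',
    '198.51.100.7 - - [07/Dec/2008:14:06:02 +0000] "GET /10/525/336.png HTTP/1.1" 200 987',
]

print(count_unique_visitors(sample))  # prints 2
```

Note that this counts addresses, not people: NAT hides several people 
behind one address, and a DHCP pool spreads one person over several.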

>> Nevertheless, I expect that any users who do aggregation tasks
>> care about aggregation results, not about raw data :)
> 
> Well, we were not talking about "users that do aggregation", but
> "anybody". You said you give anybody access to log files. If you give
> anybody access, that's more than just "users that do aggregation".

Anybody that can log in to my server, yes. TomH was noting that root 
rights were required for this; anyone who is able to log in to my server 
has read rights on those files.


Stefan



