[Routing] Crowdsourced costing - offer of writing a client+metric for it

Marcus Wolschon Marcus at Wolschon.biz
Wed Dec 3 11:11:46 GMT 2008


On Wed, 3 Dec 2008 11:44:15 +0100, Sascha Silbe
<sascha-ml-gis-osm-routing at silbe.org> wrote:
> On Wed, Dec 03, 2008 at 07:50:26AM +0100, Marcus Wolschon wrote:
> 
>> * day = (weekday,weekend,holiday)
>> * time = (night,morning,afternoon)
>> ** sub-time = (early,mid,late)
> Unfortunately, that's too coarse. You'll miss things like "Shops are 
> closed on Wednesday, so much less traffic then". Or the yearly recurring 
> traffic increase due to christmas insanity...


I tried to keep my first draft from being too
fine-grained because of privacy and not-enough-data concerns.
We can have
* day = (weekday,weekend,holiday,beforeHoliday,afterHoliday,schoolVacation,beforeSchoolVacation,afterSchoolVacation)
** weekday = (mo,...,sa,su)

Is that more to your liking?
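
To make the day classification concrete, here is a minimal sketch of how
a client might map a date onto these buckets. The function name, the
English two-letter weekday codes and the caller-supplied holiday set are
assumptions for illustration only; school vacations are left out, and
nothing here is a fixed part of the proposal.

```python
from datetime import date, timedelta

# Two-letter weekday codes, Monday first (illustrative choice).
WEEKDAYS = ("mo", "tu", "we", "th", "fr", "sa", "su")


def day_bucket(d, holidays=frozenset()):
    """Return (day_type, weekday) for a date.

    `holidays` is a caller-supplied set of dates. Where the holiday
    calendar comes from is an open question, so it is only a parameter
    here. School vacations are not handled in this sketch.
    """
    weekday = WEEKDAYS[d.weekday()]
    if d in holidays:
        day_type = "holiday"
    elif d + timedelta(days=1) in holidays:
        day_type = "beforeHoliday"
    elif d - timedelta(days=1) in holidays:
        day_type = "afterHoliday"
    elif d.weekday() >= 5:  # Saturday or Sunday
        day_type = "weekend"
    else:
        day_type = "weekday"
    return day_type, weekday
```

A client would call something like day_bucket(date(2008, 12, 24),
holidays={date(2008, 12, 25)}) and upload only the resulting bucket, not
the exact date.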

>> * we do not receive user-identifications at all
>> * the IP and upload-time are stored to revert manually detected 
>> vandalism
> Those two are mutually exclusive. Though german ISPs tend to assign 
> dynamic IP addresses so they can sell static ones for more money, it's 
> not the case everywhere. Even then, given usual IP address lifetimes 
> (24h), the tuple (IP address,time) is enough to identify a user if 
> correlated with other sources (e.g. emails).

Yes, that can be a problem. Does anyone on this list have suggestions
on how to prevent or revert vandalism?
For a prototype, we do have the option of ignoring that part for
the time being and taking care of it later.


> Summary:
> - there are some cases where anonymity is not possible at all
>     - => need to warn users about that
> - in the general case, the server must anonymize the data (either by not 
> storing user-identifiable data or by not handing it out)
>     - => the users have to trust the server

The server need never hand out the data of a single upload.
To be meaningful, the data even needs to be averaged over
many users.
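
As a rough sketch of that idea, the server could keep only running sums
per road segment and time bucket and refuse to answer until enough
uploads have been seen. The class name, the key layout and the threshold
of 5 are hypothetical. Since no user identification is received, the
sketch counts uploads, which is only a stand-in for "many users".

```python
from collections import defaultdict

MIN_SAMPLES = 5  # hypothetical threshold; no number has been proposed


class CostAggregator:
    """Keeps per-(segment, bucket) sums; never exposes single uploads."""

    def __init__(self, min_samples=MIN_SAMPLES):
        self.min_samples = min_samples
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def add_upload(self, segment_id, bucket, seconds):
        # Only the running sum and count are kept, not the raw upload.
        key = (segment_id, bucket)
        self._sum[key] += seconds
        self._count[key] += 1

    def average(self, segment_id, bucket):
        # Return None until enough uploads exist to hide any single one.
        key = (segment_id, bucket)
        n = self._count.get(key, 0)
        if n < self.min_samples:
            return None
        return self._sum[key] / n
```

Storing only sums and counts also means there is no per-upload record
to leak, although it makes reverting a single vandalized upload harder.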


> PS: The rest of your proposal looked OK to me (apart from using bloat 
> like SOAP, but that's a matter of taste anyway). I'd say go on and set 
> up such a server. Let's gather some experience from using it and 
> analyzing the data it collected. Before we do that, it's hard to know 
> what data is useful; kind of like crystal gazing. I dare say we'll need 
> a second version anyway, even if we try to imagine every possible usage 
> scenario now.

Okay.
I suggested SOAP because it is supported by many
systems, is type-safe and trivial to validate, and
has good tooling.
For a test it should be okay. Something binary, or
at least more compact, can still be specified later
for a final version.

Marcus




