[OSM-talk] Proposal to run an automated bot edit that will remove tracking parameters

Mateusz Konieczny matkoniecz at tutanota.com
Wed Feb 3 22:18:51 UTC 2021


It is definitely not removing all parameters just because one is tracking.

Most cases are left for manual review, so they will not be handled by the automatic edit, but for example see https://www.openstreetmap.org/node/2887372516 that has

website = https://www.hammer-zuhause.de/maerkte/storeDetail?utm_campaign=googlemaps&utm_medium=organic&storeCode=0214&utm_source=uberall&utm_content=06366_K%C3%B6then_(Anhalt)

that will be turned into:

website = https://www.hammer-zuhause.de/maerkte/storeDetail?storeCode=0214
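
To illustrate the transformation above, here is a minimal Python sketch using only the standard library. It is not the bot's actual code: `strip_tracking` is a hypothetical helper, `TRACKING` holds only a small subset of the parameter names from the proposal, and exact (case-sensitive) name matching is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Small subset of the tracking parameter names from the proposal
# (assumption: names are matched exactly, including case).
TRACKING = {"utm_source", "utm_medium", "utm_campaign", "utm_content", "fbclid"}

def strip_tracking(url):
    """Drop tracking parameters from the query, keep everything else as-is."""
    parts = urlsplit(url)
    # Work on the raw query string so kept parameters are not re-encoded.
    kept = [
        pair for pair in parts.query.split("&")
        if pair and pair.split("=", 1)[0] not in TRACKING
    ]
    return urlunsplit(parts._replace(query="&".join(kept)))

print(strip_tracking(
    "https://www.hammer-zuhause.de/maerkte/storeDetail"
    "?utm_campaign=googlemaps&utm_medium=organic&storeCode=0214"
    "&utm_source=uberall&utm_content=06366_K%C3%B6then_(Anhalt)"
))
# → https://www.hammer-zuhause.de/maerkte/storeDetail?storeCode=0214
```

The non-tracking `storeCode` parameter survives, and a URL whose query becomes empty loses the trailing `?` as well.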


Feb 3, 2021, 23:07 by oliversimmo at gmail.com:

> (sorry if you received this twice, forgot to do "reply to all" :/ )
>
> Just asking for clarification that this is only removing URL query sections recognised as tracking, and not the entire URL query.
> The query is often used for, well, a query for a dynamic page.
>
> e.g.
> https://www.example.com/search?tracking=yes&q=my%20search&tracking_id=12345
> Should become
> https://www.example.com/search?q=my%20search
> NOT
> https://www.example.com/search
>
>
> On Tue, 2 Feb 2021, 21:21 Mateusz Konieczny via talk, <talk at openstreetmap.org> wrote:
>
>> tl;dr:
>>
>> I propose to run an automated bot edit that will remove tracking parameters, turning tags
>> such as
>> website=http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=de58e340d6aa79a584552a2055042d004b9b19454bc0d7a6046fc81fc90f51
>> into
>> website=http://paris.intersquat.org/les-lieux/le-satellite/
>>
>>
>> I have done this before, see: https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters
>> This edit will clean links added or edited since then and purge more tracking parameters.
>>
>> If anything goes wrong I will fix it.
>> I have experience with automated edits, see
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account
>>
>> changes listing: https://gist.github.com/matkoniecz/1d20caa198ec2d4001d95adf09123a8a
>> (based on current OSM database, if OSM data changes then actual edit will be different,
>> feel free to make backup of this linked file - it may be deleted some time after edit)
>>
>> source code and other documentation (except source code, duplicate of that posting):
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters2/
>>
>> --------------------------------------
>> full details:
>>
>> I propose to run a scripted edit - it was already run before but this will
>> remove more tracking parameters.
>>
>> URLs often have unnecessary parts, some added for tracking purposes
>> by FB, Google and others.
>> These tracking parameters should never appear in any OSM tags.
>>
>> It means that it is beneficial to turn tag
>> website=http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=de58e340d6aa79a584552a2055042d004b9b19454bc0d7a6046fc81fc90f51
>> into
>> website=http://paris.intersquat.org/les-lieux/le-satellite/
>> and it is worth doing it as an edit.
>>
>> These URLs can often be fixed using an automated script, freeing
>> human time for something more productive.
>>
>> A human-made edit would also change "last edited by"
>> (and, unlike a marked bot edit, could not be filtered out).
>> There are better ways to spot areas requiring fixes, and we are not lacking
>> places where QA indicators show that manual review is needed.
>>
>> Usually tracking links are added by clueless people who just searched for
>> a website and copied it from FB/Google.
>>
>> There are rare cases of links created specifically to track OSM users,
>> see for example
>> * https://www.openstreetmap.org/way/754704241/history
>> ** https://www.cronauerlaw.com/?utm_source=openstreetmap
>> * https://www.openstreetmap.org/node/1063808111/history
>> ** http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link
>> * https://www.openstreetmap.org/node/6817678019/history
>> ** https://www.resotainer.fr/agence-bonneuil-sur-marne?utm_source=open-street-map&utm_medium=recherche-locale&utm_content=openstreetmap&utm_campaign=open-street-map-garde-meubles-bonneuil-sur-marne
>> * https://www.openstreetmap.org/node/1684317522
>> ** http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link
>>
>> In general I have not noticed correlation between presence of tracking links
>> and additional issues that would not be detected automatically.
>>
>> Therefore automatic removal of tracking parameters is not causing loss of
>> useful indicators of areas that should be reviewed.
>> Osmose and JOSM validators and StreetComplete are offering many better indicators,
>> and we are not in danger of running out of places where human intervention is clearly needed.
>>
>> Automatic removal would allow me and others to spend time on something more useful
>> than reviewing all cases where tracking is clearly present and confirming them one by one.
>>
>> The proposed bot edit would clean links where all parameters present are tracking
>> parameters and may be removed.
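
One way to read this rule is as a check that gates the automatic edit: a URL qualifies only if it has a query and every parameter name in it is a known tracker. The Python sketch below illustrates that reading. `safe_for_bot` is a hypothetical name, `TRACKING` is a small subset of the full list, and case-sensitive matching is an assumption. It is not the actual bot code.

```python
from urllib.parse import urlsplit

# Small subset of the tracking parameter names listed in the proposal.
TRACKING = {"fbclid", "gclid", "utm_source", "utm_medium", "utm_campaign"}

def safe_for_bot(url):
    """True only if the URL has a query and every parameter name is a known tracker."""
    query = urlsplit(url).query
    names = [pair.split("=", 1)[0] for pair in query.split("&") if pair]
    return bool(names) and all(name in TRACKING for name in names)

# Every parameter is a tracker: candidate for the automatic edit.
print(safe_for_bot("https://www.cronauerlaw.com/?utm_source=openstreetmap"))  # → True
# Mixed query (storeCode is not a tracker): left for manual review.
print(safe_for_bot("https://www.hammer-zuhause.de/maerkte/storeDetail?storeCode=0214&utm_source=uberall"))  # → False
```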
>>
>> I am manually reviewing more complicated cases, also to catch
>> currently unknown tracking parameters.
>>
>> Anchors (#section) will be preserved.
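
As a small illustration of anchor preservation, the Python sketch below removes a query that is assumed to consist only of tracking parameters while leaving the `#fragment` untouched. `drop_query_keep_anchor` is a hypothetical helper and the example.com URL is made up. It is not the bot's actual code.

```python
from urllib.parse import urlsplit, urlunsplit

def drop_query_keep_anchor(url):
    """Remove the whole query string but keep the #fragment untouched."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=""))

print(drop_query_keep_anchor("https://www.example.com/page?fbclid=abc#section"))
# → https://www.example.com/page#section
```

`urlsplit` keeps the query and the fragment as separate fields, so clearing one does not affect the other.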
>>
>> The code is tested: I used it in a manual review mode and for a fully automated edit run
>> that removed tracking parameters from over 1000 objects - see
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters
>>
>> I have experience with automated edits, see
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account
>>
>> Yes, editing an element will change its "last edited" date.
>> The effect will be exactly the same for a bot edit and a manual edit
>> (which would be necessary if this automated edit proposal were rejected).
>> Note that bot edits marked as automatic can be filtered out.
>>
>> The following are considered tracking parameters and would be removed:
>>
>> fbclid, gclid, campaign_ref, mc_id, utm_source, utm_medium, utm_term,
>> utm_content, utm_campaign, utm_id, gclsrc, dclid, wt.tsrc, WT.tsrc,
>> zanpid, yclid, utm_campain, trkCampaign, mkt_tok, sc_campaign, sc_channel,
>> sc_content, sc_medium, sc_outcome, sc_geo, sc_country, mbid, cmpid,
>> campaign_id, Campaign, fb_action_ids, fb_action_types, fb_ref, fb_source,
>> gs_l, _hsenc, igshid, CampIDMin, CampIDMaj, campaign, Campaign,
>> campaignid, campaignId, adid, adgroupid, refr, referrer, cm_mmc, lw_cmp,
>> CLID, ReferralSource, SourceID, trkid, adjust_creative, partner_slug, y_source,
>> oppartnerid, padid, otppartnerid, ref_device_id, utm_kxconfid, SEO_id,
>> originalReferrer, spMailingID, hsCtaTracking
