[OSM-talk] Proposal to run an automated bot edit that will remove tracking parameters

Oliver Simmons oliversimmo at gmail.com
Wed Feb 3 22:07:58 UTC 2021


(sorry if you received this twice, forgot to do "reply to all" :/ )

Just asking for clarification that this only removes the URL query
parameters recognised as tracking, and not the entire URL query.
The query is often used for, well, a query for a dynamic page.

e.g.
https://www.example.com/search?tracking=yes&q=my%20search&tracking_id=12345
Should become
https://www.example.com/search?q=my%20search
NOT
https://www.example.com/search
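
To make the distinction concrete, here is a minimal Python sketch of the
behaviour I would expect. It is not the bot's actual code: the function name
strip_tracking and the short parameter set are assumptions for illustration,
and "tracking" / "tracking_id" are only the made-up keys from the example
above. It filters the raw key=value pairs, so the remaining parameters keep
their original encoding.

```python
from urllib.parse import urlsplit, urlunsplit, unquote

# Illustrative subset only; the full list is given in the proposal.
TRACKING_PARAMS = {"fbclid", "gclid", "utm_source", "utm_medium", "utm_campaign"}
# Made-up keys from the example URL, not real tracking parameters.
EXAMPLE_ONLY = {"tracking", "tracking_id"}


def strip_tracking(url, tracking=frozenset(TRACKING_PARAMS | EXAMPLE_ONLY)):
    """Drop only the query parameters whose key is in `tracking`."""
    parts = urlsplit(url)
    pairs = parts.query.split("&") if parts.query else []
    # Compare on the decoded key, but keep each surviving pair untouched.
    kept = [p for p in pairs if unquote(p.split("=", 1)[0]) not in tracking]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, "&".join(kept), parts.fragment)
    )


print(strip_tracking(
    "https://www.example.com/search?tracking=yes&q=my%20search&tracking_id=12345"
))
# https://www.example.com/search?q=my%20search
```

When every parameter is a tracking one, the query ends up empty and the "?"
disappears as well.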


On Tue, 2 Feb 2021, 21:21 Mateusz Konieczny via talk, <
talk at openstreetmap.org> wrote:

> tl;dr:
>
> I propose to run an automated bot edit that will remove tracking
> parameters, turning tags
> such as
> website=
> http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=de58e340d6aa79a584552a2055042d004b9b19454bc0d7a6046fc81fc90f51
> into
> website=http://paris.intersquat.org/les-lieux/le-satellite/
>
>
> I did it already before, see:
> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters
> This edit will clean up links added since then and purge more tracking
> parameters.
>
> If anything goes wrong, I will fix it.
> I have experience with automated edits, see
>
> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account
>
> changes listing:
> https://gist.github.com/matkoniecz/1d20caa198ec2d4001d95adf09123a8a
> (based on current OSM database, if OSM data changes then actual edit will
> be different,
> feel free to make backup of this linked file - it may be deleted some time
> after edit)
>
> source code and other documentation (apart from the source code, it
> duplicates this posting):
>
> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters2/
>
> --------------------------------------
> full details:
>
> I propose to run a scripted edit - it was already run before but this will
> remove more tracking parameters.
>
> URLs often have unnecessary parts, some added for tracking purposes
> by FB, Google and others.
> These tracking parameters should never appear in any OSM tags.
>
> It is therefore beneficial to turn a tag such as
> website=
> http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=de58e340d6aa79a584552a2055042d004b9b19454bc0d7a6046fc81fc90f51
> into
> website=http://paris.intersquat.org/les-lieux/le-satellite/
> and it is worth doing so as an edit.
>
> These URLs can often be fixed by an automated script, freeing human
> time for something more productive.
>
> A human-made edit would also change "last edited by" (and, unlike a
> marked bot edit, could not be filtered out). There are better ways to
> spot areas requiring fixes, and we are not lacking places with QA
> indicators that manual review is needed.
>
> Usually tracking links are added by clueless people who just searched for
> a website and copied it from FB/Google.
>
> There are rare cases of links created specifically to track OSM users,
> see for example:
> * https://www.openstreetmap.org/way/754704241/history
> ** https://www.cronauerlaw.com/?utm_source=openstreetmap
> * https://www.openstreetmap.org/node/1063808111/history
> **
> http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link
> * https://www.openstreetmap.org/node/6817678019/history
> **
> https://www.resotainer.fr/agence-bonneuil-sur-marne?utm_source=open-street-map&utm_medium=recherche-locale&utm_content=openstreetmap&utm_campaign=open-street-map-garde-meubles-bonneuil-sur-marne
> * https://www.openstreetmap.org/node/1684317522
> **
> http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link
>
> In general, I have not noticed a correlation between the presence of
> tracking links and additional issues that would not be detected
> automatically.
>
> Therefore, automatic removal of tracking parameters does not cause a loss
> of useful indicators of areas that should be reviewed.
> Osmose and JOSM validators and StreetComplete offer many better
> indicators, and we are not in danger of running out of places where
> human intervention is clearly needed.
>
> Automatic removal would allow me and others to spend time on something
> more useful than reviewing all cases where tracking is clearly present
> and confirming them one by one.
>
> The proposed bot edit would change only links in which every query
> parameter is a known tracking parameter that may be removed.
>
> I am manually reviewing more complicated cases, also to catch
> currently unknown tracking parameters.
>
> Anchors (#section) will be preserved.
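
If I read the rule above correctly, the bot touches a URL only when every
query parameter is a known tracking one, and it keeps the anchor. A rough
Python sketch of that reading follows. It is not taken from the bot's source:
the function name clean_if_only_tracking and the shortened parameter set are
assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, unquote

# Illustrative subset only; the proposal lists many more parameters.
TRACKING_PARAMS = {
    "fbclid", "gclid", "utm_source", "utm_medium", "utm_campaign", "utm_content",
}


def clean_if_only_tracking(url):
    """Return the cleaned URL if all query parameters are tracking ones.

    Returns None when there is no query or when any other parameter is
    present, meaning the URL is left alone (e.g. for manual review).
    """
    parts = urlsplit(url)
    if not parts.query:
        return None
    keys = [unquote(p.split("=", 1)[0]) for p in parts.query.split("&")]
    if not all(k in TRACKING_PARAMS for k in keys):
        return None
    # Drop the whole query but keep the fragment (the #section anchor).
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


print(clean_if_only_tracking("https://www.cronauerlaw.com/?utm_source=openstreetmap"))
# https://www.cronauerlaw.com/
print(clean_if_only_tracking("https://www.example.com/search?q=x&fbclid=1"))
# None
```

Under this reading, a mixed URL such as the search example is simply skipped
by the bot rather than partially rewritten.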
>
> The code is tested: I used it in a manual review mode and for a fully
> automated edit run
> that removed tracking parameters from over 1000 objects - see
>
> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters
>
> I have experience with automated edits, see
>
> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account
>
> Yes, editing an element will change its "last edited" date.
> The effect will be exactly the same for a bot edit and a manual edit
> (which would be necessary if this automated edit proposal were
> rejected).
> Note that bot edits marked as automatic can be filtered out.
>
> The following are considered tracking parameters and would be removed:
>
> fbclid, gclid, campaign_ref, mc_id, utm_source, utm_medium, utm_term,
> utm_content, utm_campaign, utm_id, gclsrc, dclid, wt.tsrc, WT.tsrc,
> zanpid, yclid, utm_campain, trkCampaign, mkt_tok, sc_campaign, sc_channel,
> sc_content, sc_medium, sc_outcome, sc_geo, sc_country, mbid, cmpid,
> campaign_id, Campaign, fb_action_ids, fb_action_types, fb_ref, fb_source,
> gs_l, _hsenc, igshid, CampIDMin, CampIDMaj, campaign, Campaign,
> campaignid, campaignId, adid, adgroupid, refr, referrer, cm_mmc, lw_cmp,
> CLID, ReferralSource, SourceID, trkid, adjust_creative, partner_slug,
> y_source,
> oppartnerid, padid, otppartnerid, ref_device_id, utm_kxconfid, SEO_id,
> originalReferrer, spMailingID, hsCtaTracking