[OSM-talk] Proposal to run an automated bot edit that will remove tracking parameters

Edoardo Yossef Marascalchi e.marascalchi at gmail.com
Thu Feb 4 06:37:17 UTC 2021


Will this also take care of link shorteners (bit.ly, goo.gl, ...), expanding
the links and replacing them with the correct ones?

On Thu, 4 Feb 2021, 1:03 AM Oliver Simmons <oliversimmo at gmail.com>
wrote:

> Brilliant then 👍
> I have zero negatives towards this in that case.
>
> On Wed, 3 Feb 2021, 22:24 Mateusz Konieczny via talk, <
> talk at openstreetmap.org> wrote:
>
>> It is definitely not removing all parameters just because one is tracking.
>>
>> Most cases are left for manual review, so will not be handled by
>> automatic edit,
>> but for example see https://www.openstreetmap.org/node/2887372516 that
>> has
>>
>> website =
>> https://www.hammer-zuhause.de/maerkte/storeDetail?utm_campaign=googlemaps&utm_medium=organic&storeCode=0214&utm_source=uberall&utm_content=06366_K%C3%B6then_(Anhalt)
>>
>> that will be turned into:
>>
>> website =
>> https://www.hammer-zuhause.de/maerkte/storeDetail?storeCode=0214
>>
>>
>> Feb 3, 2021, 23:07 by oliversimmo at gmail.com:
>>
>> (sorry if you received this twice, forgot to do "reply to all" :/ )
>>
>> Just asking for clarification that this is only removing URL query
>> sections recognised as tracking, and not the entire URL query.
>> The query is often used for, well, a query for a dynamic page.
>>
>> e.g.
>>
>> https://www.example.com/search?tracking=yes&q=my%20search&tracking_id=12345
>> Should become
>> https://www.example.com/search?q=my%20search
>> NOT
>> https://www.example.com/search
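The distinction in this example can be sketched in Python. This is only a minimal sketch, not the bot's actual code: `strip_params` is a hypothetical helper, and `tracking` / `tracking_id` are just the placeholder names from the example above. It filters the raw query string without decoding it, so `q=my%20search` keeps its original escaping.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_params(url, tracking):
    """Drop only the query parameters whose name is in `tracking`."""
    parts = urlsplit(url)
    kept = [
        pair
        for pair in parts.query.split("&")
        # Compare the raw name (text before the first "=") with the set.
        if pair and pair.split("=", 1)[0] not in tracking
    ]
    # urlunsplit omits the "?" when no parameters are left.
    return urlunsplit(parts._replace(query="&".join(kept)))


url = "https://www.example.com/search?tracking=yes&q=my%20search&tracking_id=12345"
print(strip_params(url, {"tracking", "tracking_id"}))
# prints https://www.example.com/search?q=my%20search
```

Working on the raw string rather than on decoded key/value pairs is a deliberate choice here: re-encoding a decoded query could silently rewrite parts of the URL that have nothing to do with tracking.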
>>
>>
>> On Tue, 2 Feb 2021, 21:21 Mateusz Konieczny via talk, <
>> talk at openstreetmap.org> wrote:
>>
>> tl;dr:
>>
>> I propose to run an automated bot edit that will remove tracking
>> parameters, turning tags
>> such as
>> website=
>> http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=de58e340d6aa79a584552a2055042d004b9b19454bc0d7a6046fc81fc90f51
>> into
>> website=http://paris.intersquat.org/les-lieux/le-satellite/
>>
>>
>> I have already done this before, see:
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters
>> This edit will clean links added or edited since then and purge more
>> tracking parameters.
>>
>> If anything goes wrong I will fix it.
>> I have experience with automated edits, see
>>
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account
>>
>> changes listing:
>> https://gist.github.com/matkoniecz/1d20caa198ec2d4001d95adf09123a8a
>> (based on the current OSM database; if OSM data changes, the actual edit
>> will differ.
>> Feel free to make a backup of this linked file, as it may be deleted some
>> time after the edit)
>>
>> source code and other documentation (apart from the source code, it
>> duplicates this posting):
>>
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters2/
>>
>> --------------------------------------
>> full details:
>>
>> I propose to run a scripted edit - it was already run before but this will
>> remove more tracking parameters.
>>
>> URLs often have unnecessary parts, some added for tracking purposes
>> by FB, Google and others.
>> These tracking parameters should never appear in any OSM tags.
>>
>> It means that it is beneficial to turn tag
>> website=
>> http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=de58e340d6aa79a584552a2055042d004b9b19454bc0d7a6046fc81fc90f51
>> into
>> website=http://paris.intersquat.org/les-lieux/le-satellite/
>> and it is worth doing it as an edit.
>>
>> These URLs can often be fixed using an automated script, freeing
>> human time for something more productive.
>>
>> A human-made edit will also change "last edited by"
>> (and, unlike a marked bot edit, cannot be filtered out).
>> There are better ways to spot areas requiring fixes, and we are not lacking
>> places where QA indicators show that manual review is needed.
>>
>> Usually tracking links are added by clueless people who just searched for
>> a website and copied it from FB/Google.
>>
>> There are rare cases of links created specifically to track OSM users;
>> see for example:
>> * https://www.openstreetmap.org/way/754704241/history
>> ** https://www.cronauerlaw.com/?utm_source=openstreetmap
>> * https://www.openstreetmap.org/node/1063808111/history
>> **
>> http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link
>> * https://www.openstreetmap.org/node/6817678019/history
>> **
>> https://www.resotainer.fr/agence-bonneuil-sur-marne?utm_source=open-street-map&utm_medium=recherche-locale&utm_content=openstreetmap&utm_campaign=open-street-map-garde-meubles-bonneuil-sur-marne
>> * https://www.openstreetmap.org/node/1684317522
>> **
>> http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link
>>
>> In general I have not noticed correlation between presence of tracking
>> links
>> and additional issues that would not be detected automatically.
>>
>> Therefore automatic removal of tracking parameters is not causing loss of
>> useful indicators of areas that should be reviewed.
>> Osmose, the JOSM validator and StreetComplete offer many better
>> indicators,
>> and we are not in danger of running out of places where human
>> intervention is clearly needed.
>>
>> Automatic removal would allow me and others to spend time on something
>> more useful,
>> than reviewing all cases where tracking is clearly present and confirming
>> them one by one.
>>
>> The proposed bot edit would handle links in which every parameter used
>> is a tracking parameter that may be removed.
>>
>> I am manually reviewing more complicated cases, also to catch
>> currently unknown tracking parameters.
>>
>> Anchors (#section) will be preserved.
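One possible reading of this policy, sketched in Python. The function name `plan_edit`, the three outcome labels, the shortened `TRACKING` set and the `#plan` fragment are all assumptions made for illustration. The real script is linked in the post and may decide differently, in particular for links that mix tracking and ordinary parameters (the hammer-zuhause.de example earlier in the thread suggests at least some of those are handled automatically).

```python
from urllib.parse import urlsplit, urlunsplit

# Shortened, illustrative subset of the posted parameter list.
TRACKING = {"fbclid", "gclid", "utm_source", "utm_medium", "utm_campaign"}


def plan_edit(url):
    """Return (decision, url); decision is 'auto', 'review' or 'skip'."""
    parts = urlsplit(url)
    names = [p.split("=", 1)[0] for p in parts.query.split("&") if p]
    if not any(n in TRACKING for n in names):
        return "skip", url  # nothing recognised as tracking
    if all(n in TRACKING for n in names):
        # Every parameter is tracking: drop the query, keep the #fragment.
        return "auto", urlunsplit(parts._replace(query=""))
    return "review", url  # mixed parameters: left for a human in this sketch


print(plan_edit("http://paris.intersquat.org/les-lieux/le-satellite/?fbclid=abc#plan"))
```

Because `urlsplit` keeps the fragment separate from the query, clearing the query leaves the anchor untouched without any extra handling.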
>>
>> The code is tested: I used it in manual review mode and for a fully
>> automated edit run that removed tracking parameters from over 1000
>> objects - see
>>
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_tracking_parameters
>>
>> I have experience with automated edits, see
>>
>> https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account
>>
>> Yes, editing an element will change its "last edited" date.
>> The effect is exactly the same for a bot edit and a manual edit
>> (which would be necessary if this automated edit proposal were
>> rejected).
>> Note that bot edits marked as automatic can be filtered out.
>>
>> The following are considered tracking parameters and would be removed:
>>
>> fbclid, gclid, campaign_ref, mc_id, utm_source, utm_medium, utm_term,
>> utm_content, utm_campaign, utm_id, gclsrc, dclid, wt.tsrc, WT.tsrc,
>> zanpid, yclid, utm_campain, trkCampaign, mkt_tok, sc_campaign, sc_channel,
>> sc_content, sc_medium, sc_outcome, sc_geo, sc_country, mbid, cmpid,
>> campaign_id, Campaign, fb_action_ids, fb_action_types, fb_ref, fb_source,
>> gs_l, _hsenc, igshid, CampIDMin, CampIDMaj, campaign, Campaign,
>> campaignid, campaignId, adid, adgroupid, refr, referrer, cm_mmc, lw_cmp,
>> CLID, ReferralSource, SourceID, trkid, adjust_creative, partner_slug,
>> y_source,
>> oppartnerid, padid, otppartnerid, ref_device_id, utm_kxconfid, SEO_id,
>> originalReferrer, spMailingID, hsCtaTracking
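For reuse in a script, the list can be held as a set. A sketch in Python, under the assumption (not stated in the post) that names are matched with exact case, which the separate `wt.tsrc` / `WT.tsrc` and `campaignid` / `campaignId` entries seem to suggest. `TRACKING_PARAMS` and `tracking_names` are hypothetical names.

```python
from urllib.parse import urlsplit

# Names copied from the list above; a set stores each name once.
TRACKING_PARAMS = frozenset("""
fbclid gclid campaign_ref mc_id utm_source utm_medium utm_term
utm_content utm_campaign utm_id gclsrc dclid wt.tsrc WT.tsrc
zanpid yclid utm_campain trkCampaign mkt_tok sc_campaign sc_channel
sc_content sc_medium sc_outcome sc_geo sc_country mbid cmpid
campaign_id Campaign fb_action_ids fb_action_types fb_ref fb_source
gs_l _hsenc igshid CampIDMin CampIDMaj campaign
campaignid campaignId adid adgroupid refr referrer cm_mmc lw_cmp
CLID ReferralSource SourceID trkid adjust_creative partner_slug y_source
oppartnerid padid otppartnerid ref_device_id utm_kxconfid SEO_id
originalReferrer spMailingID hsCtaTracking
""".split())


def tracking_names(url):
    """List the parameter names in `url` that appear in the set."""
    query = urlsplit(url).query
    names = [p.split("=", 1)[0] for p in query.split("&") if p]
    return [n for n in names if n in TRACKING_PARAMS]


print(tracking_names(
    "http://www.travelerscoffee.ru?utm_campaign=geo&utm_source=openstreetmap&utm_medium=link"
))
# prints ['utm_campaign', 'utm_source', 'utm_medium']
```

With exact-case matching, a parameter such as `UTM_SOURCE` would not be recognised and would need its own entry.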
>> _______________________________________________
>> talk mailing list
>> talk at openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk