<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 27/09/2020 16:28, Rodrigo Díez
Villamuera wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAHsonMXyhAJUTVs8jX_99ShvHWHYLdX-1_RdJ=bXOv2fLDKYQQ@mail.gmail.com">
<div dir="ltr"><br>
<div>I am importing a subset of nodes from UK (those tagged with
amenity:pub) for a pet project.</div>
</div>
</blockquote>
<p>Firstly - welcome!</p>
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAHsonMXyhAJUTVs8jX_99ShvHWHYLdX-1_RdJ=bXOv2fLDKYQQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>When analysing the data I realised that some of these nodes
contain a website: tag that does not contain an appropriate
URL schema (http/https).</div>
<div><br>
</div>
<div>Ie: <a href="http://www.mypub.com" moz-do-not-send="true">www.mypub.com</a>
rather than <a href="http://www.mypub.com"
moz-do-not-send="true">http://www.mypub.com</a> or <a
href="https://www.mypub.com" moz-do-not-send="true">https://www.mypub.com</a></div>
</div>
</blockquote>
<p>I'm not actually convinced that's a problem - as others have
said, web browsers are perfectly capable of converting
"<a class="moz-txt-link-abbreviated" href="http://www.mypub.com">www.mypub.com</a>" into either <a class="moz-txt-link-rfc2396E" href="https://www.mypub.com">"https://www.mypub.com"</a> or <a class="moz-txt-link-rfc2396E" href="http://www.mypub.com">"http://www.mypub.com"</a> as
appropriate, so adding the scheme doesn't really add any value. "Letting the
browser sort it out" is a great approach, as it can also deal with
current and near-future changes such as the removal of TLS 1.0 and 1.1
support.<br>
</p>
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAHsonMXyhAJUTVs8jX_99ShvHWHYLdX-1_RdJ=bXOv2fLDKYQQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>This goes in contradiction with the <a
href="https://wiki.openstreetmap.org/wiki/Key:website"
moz-do-not-send="true">Wiki documentation for website.</a></div>
</div>
</blockquote>
<p>Unfortunately, OSM's wiki doesn't always reflect actual usage, and
this is one example. Changing "<a class="moz-txt-link-abbreviated" href="http://www.mypub.com">www.mypub.com</a>" to
<a class="moz-txt-link-rfc2396E" href="https://www.mypub.com">"https://www.mypub.com"</a> doesn't really add any value unless you're
also updating something else about the pub. In fact, using "<a class="moz-txt-link-abbreviated" href="http://www.mypub.com">www.mypub.com</a>"
has some advantages here, as it allows the user's web browser to
negotiate https if available (the default nowadays) but fall back
to http if not. <br>
</p>
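<p>As a rough illustration of the "https first, http as a fallback"
idea, here is a minimal Python sketch. The names
<code>candidate_urls</code>, <code>resolve</code> and <code>probe</code>
are hypothetical and chosen only for this example; real browsers
apply their own, more involved logic, and the reachability check is
deliberately left to the caller.</p>
<pre>
```python
from typing import Callable, List, Optional


def candidate_urls(tag_value: str) -> List[str]:
    """Return the URLs to try, in order, for a raw website tag value."""
    value = tag_value.strip()
    # A value that already carries an http/https scheme is left alone.
    if value.lower().startswith(("http://", "https://")):
        return [value]
    # No scheme: prefer https and keep http as a fallback.
    # Other schemes are out of scope for this sketch.
    return ["https://" + value, "http://" + value]


def resolve(tag_value: str, probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that probe() accepts, or None.

    probe is a hypothetical caller-supplied check, for example a
    function that attempts a request and reports whether it succeeded.
    """
    for url in candidate_urls(tag_value):
        if probe(url):
            return url
    return None
```
</pre>
<p>With a <code>probe</code> that only accepts plain http,
<code>resolve("www.mypub.com", probe)</code> would try the https
candidate first and then settle on the http one.</p>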
<blockquote type="cite"
cite="mid:CAHsonMXyhAJUTVs8jX_99ShvHWHYLdX-1_RdJ=bXOv2fLDKYQQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>I created a proposal for a one-off, scoped, automated edit
for these nodes to find the appropriate scheme for the existing
URL and retag the nodes.</div>
<div><br>
</div>
<div>I added the proposal to the Automated edits log. You can
read it <a
href="https://wiki.openstreetmap.org/wiki/Automated_edits/rodrigodiez/Add_missing_URL_scheme_to_pub_websites_in_UK"
moz-do-not-send="true">here</a>.</div>
</div>
</blockquote>
<p><br>
</p>
<p>It would be rather more interesting to detect websites
that "don't or no longer represent the pub" in some way: <br>
</p>
<ul>
<li>Perhaps the pub had a website, but now has new tenants, and
they now communicate with customers on the Facebook page?</li>
<li>Perhaps the website is (like one of your examples) just for
the brewery?</li>
<li>Perhaps the website now points at domain parking?</li>
<li>Perhaps the HTTPS certificate has expired, which at the very
least indicates that the website is unlikely to be kept up to
date?</li>
</ul>
<p>Any problems found would likely need to be resolved manually, but
at least some of the above should be detectable automatically.</p>
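<p>A minimal Python sketch of how some of those checks might be
automated, assuming the final URL after redirects, the page text and
the certificate expiry date have already been fetched by other code.
The function name <code>website_warnings</code>, the host list and
the parking phrases are all hypothetical, illustrative heuristics
rather than a tested rule set, and the "brewery rather than pub" case
is not covered.</p>
<pre>
```python
from datetime import datetime, timezone
from typing import List, Optional
from urllib.parse import urlsplit

# Hypothetical heuristics, neither exhaustive nor authoritative.
SOCIAL_HOSTS = ("facebook.com", "instagram.com")
PARKING_PHRASES = ("domain is for sale", "buy this domain",
                   "this domain is parked")


def website_warnings(final_url: str, page_text: str,
                     cert_not_after: Optional[datetime],
                     now: datetime) -> List[str]:
    """Return labels for things a human mapper should look at."""
    warnings = []

    # The site is (or redirects to) a social media page.
    host = (urlsplit(final_url).hostname or "").lower()
    if any(host == h or host.endswith("." + h) for h in SOCIAL_HOSTS):
        warnings.append("social-media")

    # The page text looks like a domain parking page.
    text = page_text.lower()
    if any(phrase in text for phrase in PARKING_PHRASES):
        warnings.append("possible-parking")

    # The certificate expiry date, if known, is in the past.
    if cert_not_after is not None and now > cert_not_after:
        warnings.append("certificate-expired")

    return warnings
```
</pre>
<p>Anything flagged this way would only be a candidate for manual
review, not something to retag automatically.</p>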
<p>Best Regards,</p>
<p>Andy</p>
</body>
</html>