Add templated version of wiki href to link element in presets


SimonPoole
The current default preset has 12'214 lines of which roughly 5'564 are
used for href attributes in the "link" element. I actually suspect that
the number of lines used for the href entries is increasing faster than
the actual preset content and it is only a matter of time before they
will outnumber the rest. This makes the already far too large file
larger than necessary and requires continuous maintenance of the entries
for no good reason.

If we used a template this could be reduced to 1'500 plus a couple of
potential special cases. My suggestion would be to add a "template"
attribute to the link element, with a placeholder {country} (or similar)
for the two-letter ISO code plus the colon (including the colon makes
adding a special entry for EN unnecessary).

Example:

<link
template="http://wiki.openstreetmap.org/wiki/{country}Tag:route=railway" />

Special cases could still be handled with xx.href attributes.

The downside of doing it this way is that if the page doesn't
exist, a retry without the country code and colon is needed. However, I would
consider that bearable. Alternatively, we could add an attribute holding the
countries for all existing versions, but I don't really think that is
worth the trouble.
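
The expansion and retry order described above could be sketched roughly as follows. This is only an illustration in Python, not how any editor implements it: the function name `candidate_urls`, the `overrides` mapping (standing in for the xx.href attributes) and the upper-casing of the code are assumptions made for the example.

```python
def candidate_urls(template, lang, overrides=None):
    """Return the URLs to try for a link, most specific first.

    template  -- URL containing the {country} placeholder
    lang      -- two-letter code of the user's language, e.g. "de"
    overrides -- hypothetical mapping of code -> explicit URL (the xx.href case)
    """
    overrides = overrides or {}
    candidates = []

    # 1. An explicit special-case entry wins if one exists.
    if lang in overrides:
        candidates.append(overrides[lang])

    # 2. The localized page: the placeholder becomes "XX:" (code plus colon).
    #    English needs no prefix, so it falls through to step 3.
    if lang and lang.lower() != "en":
        candidates.append(template.replace("{country}", lang.upper() + ":"))

    # 3. Fallback for a missing localized page: drop code and colon entirely.
    fallback = template.replace("{country}", "")
    if fallback not in candidates:
        candidates.append(fallback)

    return candidates


if __name__ == "__main__":
    tpl = "http://wiki.openstreetmap.org/wiki/{country}Tag:route=railway"
    for url in candidate_urls(tpl, "de"):
        print(url)
```

The caller would request the candidates in order and stop at the first page that exists, which is the retry cost mentioned above.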

Comments? Better suggestions?

Simon




Re: Add templated version of wiki href to link element in presets

Dirk Stöcker
On Sat, 3 Nov 2018, Simon Poole wrote:

> The current default preset has 12'214 lines of which roughly 5'564 are
> used for href attributes in the "link" element. I actually suspect that
> the number of lines used for the href entries is increasing faster than
> the actual preset content and it is only a matter of time before they
> will outnumber the rest. This makes the already far too large file
> larger than necessary and requires continuous maintenance of the entries
> for no good reason.

These entries are maintained by an automatic script, and most of the data is
compressed away to almost nothing (remember, JAR is a zipped file format!)

> If we used a template this could be reduced to 1'500 plus a couple of
> potential special cases. My suggestion would be to add a "template"
> attribute to the link element, with a placeholder {country} (or similar)
> for the two-letter ISO code plus the colon (including the colon makes
> adding a special entry for EN unnecessary).

What you suggest is a more complex approach, which essentially strips away
the already perfectly compressible parts. The additional code required
will probably outweigh the saved space, so that approach will result in
the opposite of the intended goal.
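
The compression argument can be checked with a quick experiment. The Python sketch below uses zlib, which implements the same DEFLATE algorithm that zip-based formats such as JAR normally use. Everything in it is synthetic: the tag names, the language list and the attribute layout are made up for illustration and are not taken from the real preset file, so the printed numbers say nothing about the actual sizes.

```python
import zlib

# Hypothetical sample data, not the real preset content.
LANGS = ["de", "fr", "es", "it", "ja", "ru", "pl", "nl"]
TAGS = [f"key{i}=value{i}" for i in range(200)]
WIKI = "https://wiki.openstreetmap.org/wiki/"


def explicit_links(tags):
    """One href attribute per language, as in the current layout."""
    lines = []
    for tag in tags:
        lines.append(f'<link href="{WIKI}Tag:{tag}"')
        for lang in LANGS:
            lines.append(f'      {lang}.href="{WIKI}{lang.upper()}:Tag:{tag}"')
        lines.append("/>")
    return "\n".join(lines).encode("utf-8")


def templated_links(tags):
    """A single template attribute per link, as proposed."""
    lines = [f'<link template="{WIKI}{{country}}Tag:{tag}" />' for tag in tags]
    return "\n".join(lines).encode("utf-8")


def sizes(data):
    """Return (raw size, DEFLATE-compressed size) in bytes."""
    return len(data), len(zlib.compress(data, 9))


if __name__ == "__main__":
    for name, data in (("explicit", explicit_links(TAGS)),
                       ("templated", templated_links(TAGS))):
        raw, packed = sizes(data)
        print(f"{name:10s} raw={raw:7d} bytes  compressed={packed:6d} bytes")
```

Running it against the real file in both layouts would show how much of the difference survives compression, which is the only size that matters for the shipped JAR.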

Ciao
--
http://www.dstoecker.eu/ (PGP key available)


Re: Add templated version of wiki href to link element in presets

SimonPoole
I normally don't edit compressed files.

Your argument would be perfectly valid (well, if we forget that the
entries have to be parsed) if the URLs were stored in a separate file.
As it is, bloating a manually curated file, it is bonkers.

Simon





Re: Add templated version of wiki href to link element in presets

Vincent Privat
By the way the script doesn't work anymore, I was unable to update the
strings for last release.


Re: Add templated version of wiki href to link element in presets

Dirk Stöcker
In reply to this post by SimonPoole
On Sat, 3 Nov 2018, Simon Poole wrote:

> I normally don't edit compressed files.

Yes, and?

> Your argument would be perfectly valid
> (well if we forget that the entries have to be parsed)

Parsing throws away any non-matching language, so runtime memory
requirements won't change, and I doubt that the decrease in parser time from
removing a few tags is relevant compared to an unneeded, user-visible
change in runtime behaviour.
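
To illustrate the point about discarding non-matching languages at parse time, here is a minimal Python sketch. It is not the actual JOSM parser: `pick_href` and the sample element are hypothetical, and it only assumes the xx.href attribute naming mentioned earlier in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample element with a default href and two localized ones.
SAMPLE = (
    '<link href="http://wiki.openstreetmap.org/wiki/Tag:route=railway" '
    'de.href="http://wiki.openstreetmap.org/wiki/DE:Tag:route=railway" '
    'fr.href="http://wiki.openstreetmap.org/wiki/FR:Tag:route=railway" />'
)


def pick_href(link_xml, lang):
    """Keep only the URL for the user's language, else the default href.

    All other xx.href attributes are dropped together with the parsed
    element, so only one string per link stays in memory.
    """
    link = ET.fromstring(link_xml)
    return link.get(f"{lang}.href") or link.get("href")


if __name__ == "__main__":
    print(pick_href(SAMPLE, "de"))
    print(pick_href(SAMPLE, "ja"))
```

With this approach the number of localized attributes affects parse time but not what is retained afterwards.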

> if the URLs were stored in a separate file. As it is, bloating a manually
> curated file, it is bonkers.

That's a source-code file! The software is not and should not be optimized
for source code files. I opened the file with my editor a few seconds ago,
and loading time, scrolling time and any working time are not really
measurable, so there is also no issue to fix here (unlike, e.g., the
maps wiki, which was split into many subpages as it had become unmanageable
at the time).

I don't think anything is broken here, so nothing needs fixing either.

Ciao
--
http://www.dstoecker.eu/ (PGP key available)


Re: Add templated version of wiki href to link element in presets

Dirk Stöcker
In reply to this post by Vincent Privat
On Sat, 3 Nov 2018, Vincent Privat wrote:

> By the way the script doesn't work anymore, I was unable to update the
> strings for last release.

Someone replaced a bold small dot with a large normal dot in the wiki :-)

Fixed and Updated.

Whoa, it takes a long time nowadays...

(We could optimize and parallelize the access, but I doubt wiki admins
will be happy about that :-)

Ciao
--
http://www.dstoecker.eu/ (PGP key available)


Re: Add templated version of wiki href to link element in presets

Vincent Privat
Yes, it's getting slower and slower over time.
I guess it can be improved a lot by using the MediaWiki API, see
https://josm.openstreetmap.de/ticket/16702


Re: Add templated version of wiki href to link element in presets

Dirk Stöcker
On Sat, 3 Nov 2018, Vincent Privat wrote:

> Yes, it's getting slower and slower over time.
> I guess it can be improved a lot by using the MediaWiki API, see
> https://josm.openstreetmap.de/ticket/16702

That would maybe help the wiki side, but probably not us. At the moment we
make one call per link. I would only see a speedup if we could somehow get
all the i18n links at once. I checked the docs but didn't find a solution.

There is an easy way to speed things up: use WWW::Mechanize instead of wget
calls. But that increases the frequency of page requests and is not such a
good idea. That's exactly what I use as a rule to block IPs, and the wiki
admins won't be so different :-)
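
One possible direction for fetching several links at once, offered as an assumption and not as something confirmed in this thread: the MediaWiki `action=query` module accepts multiple page titles separated by `|` in its `titles` parameter, normally up to 50 per request for ordinary clients. The Python sketch below only builds the request URLs and sends nothing. The endpoint URL, the function name `batched_query_urls` and the sample titles are assumptions.

```python
from urllib.parse import urlencode

# Assumed API endpoint of the OSM wiki; verify before use.
API = "https://wiki.openstreetmap.org/w/api.php"


def batched_query_urls(titles, batch_size=50):
    """Build one query URL per batch of page titles instead of one per title."""
    urls = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "query",
            "format": "json",
            # Multiple titles are joined with "|" in a single request.
            "titles": "|".join(batch),
        }
        urls.append(API + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # Hypothetical titles: one localized page per key.
    titles = [f"DE:Key:example{i}" for i in range(120)]
    urls = batched_query_urls(titles)
    print(len(urls), "requests instead of", len(titles))
```

Batching lowers the number of requests rather than raising their frequency, so it avoids the concern about hammering the wiki. Whether the response carries everything the update script needs would still have to be checked.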

Ciao
--
http://www.dstoecker.eu/ (PGP key available)