Re: Extremely long Amtrak route relations / coastline v. water

Richard Fairhurst
[cross-posted to talk-us@ and tagging@, please choose your follow-ups wisely]

Brian M. Sperlongano wrote:
> It seems that we are increasingly doing things to simplify the
> model because certain tooling can't handle the real level of 
> complexity that exists in the real world.  I'm in favor of fixing 
> the tooling rather than neutering the data.

I sincerely hope "I'm in favor of fixing" translates as "I'm planning to fix", though I fear I may be disappointed.

More broadly, we need to nip this "oh just fix the tools" stuff in the bud. 

OSM optimises for the mapper, because mappers are our most valuable resource. That's how it's always been and that's how it should be.

But that does not mean that volunteer tool authors should rewrite their tools to cope with the 0.1% case; nor that it is reasonable for mappers to make stuff ever more complex and expect developers to automatically fall in line; nor that any given map has an obligation to render this 0.1%, or indeed, anything that the map's creator doesn't want to render.

The Tongass National Forest is not "in the real world", it is an artificial administrative construct drawn up on some bureaucrat's desk. It's not an actual forest where the boundaries represent a single contiguous mass of trees. Nothing is lost or "neutered" by mapping it as several relations (with a super-relation for completeness if you insist), just as nothing is lost by tagging Chesapeake Bay with the series of letters "c","o","a","s","t","l","i","n" and "e".

Richard

_______________________________________________
Tagging mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/tagging

Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list
On 22/11/2020 11:24, Richard Fairhurst wrote:
> [cross-posted to talk-us@ and tagging@, please choose your follow-ups wisely]

If you go against the accepted principle of not X-posting on a newsgroup, you've no entitlement to lecture how others respond.


Brian M. Sperlongano wrote:
> It seems that we are increasingly doing things to simplify the
> model because certain tooling can't handle the real level of 
> complexity that exists in the real world.  I'm in favor of fixing 
> the tooling rather than neutering the data.

Actually, splitting ways because software can't handle them makes the database more complex.


> I sincerely hope "I'm in favor of fixing" translates as "I'm planning to fix", though I fear I may be disappointed.
>
> More broadly, we need to nip this "oh just fix the tools" stuff in the bud. (etc)

Likewise we need to stop software developers from expecting contributors to add data purely because they can't be bothered/not competent enough to write a few lines of code. (OSM-carto demanding boundaries on ways & numerous routers expecting multiple footways to criss-cross pedestrian areas, are just two examples)

Contributors to the database (also *volunteers*) are expected to map to a certain standard. There shouldn't be a reason to expect developers not to do the same.

Desiring relations to list in their entirety is *not* a "0.1% case". Splitting them into 'super relations' should not be the desired, final solution.

If developers are offended at receiving suggestions on how to improve their software, or even have it criticized, then they should rescind it.

DaveF


Re: Extremely long Amtrak route relations / coastline v. water

Seth Deegan
In reply to this post by Richard Fairhurst
I recently found out about the Extremely long Amtrak route relations from clay_c. 

Your message was a bit confusing at first, but I think you are proposing that relations and super-relations should be used more often to reduce the complexity of processing data for data consumers?

In that case, I would support an API limit on the number of members in a relation.
I agree that developers shouldn't have to handle this burden.

In response to DaveF's comment:
> Actually, splitting ways because software can't handle them makes the database more complex.

Yes, but the benefits outweigh the costs.
If the editors did this automatically and still made the data interpretable, this wouldn't be an issue. 

Sorry if I misinterpreted the conversation.


--
Thanks,
Seth


Re: Extremely long Amtrak route relations / coastline v. water

Christoph Hormann-2
In reply to this post by Tagging mailing list


> Dave F via Tagging <[hidden email]> wrote on 22.11.2020 17:08:
>
> [...] OSM-carto demanding boundaries on ways

???

I am smelling fake news here.

--
Christoph Hormann
http://www.imagico.de/


Re: Extremely long Amtrak route relations / coastline v. water

Brian M. Sperlongano
In reply to this post by Seth Deegan
Super relations could also solve problems like the Tongass National Forest.  By crafting a relation of relations, you still preserve the ability to have one tagged super-object represent one complex thing in real life, but with natural cut points so that any consumer can choose to deal with it in manageable stages.  No different from the 2000 node limit on ways.  There should still be a top level object that represents the whole thing.

I like the idea of an API limit, though we would need a strategy to deal with existing large objects.


Re: Extremely long Amtrak route relations / coastline v. water

Brian M. Sperlongano
In reply to this post by Christoph Hormann-2
I agree.  I removed such duplicate tagging from my area some time ago, and it has not affected anything.

I even went so far as to draft a proposal to change that recommendation.



Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list
In reply to this post by Tagging mailing list



Nov 22, 2020, 17:08 by [hidden email]:
> Likewise we need to stop software developers from expecting contributors to add data purely because they can't be bothered/not competent enough to write a few lines of code. (OSM-carto demanding boundaries on ways)

[citation needed] for OSM-carto demanding boundaries on ways.

Also [citation needed] for OSM-Carto support for boundary relations being extremely easy to implement.

> & numerous routers expecting multiple footways to criss-cross pedestrian areas, are just two examples)

Also [citation needed] for the claim that the reason is
"can't be bothered/not competent enough to write a few lines of code".

> If developers are offended at receiving suggestions on how to improve their software, or even have it criticized, then they should rescind it.

If you insult others, claim that something is trivial to implement (it is not)
when something you demand is already implemented, and suggest that
anyone offended by your comments should stop releasing software,
I would say that it is quite a poor way to encourage volunteer
contributors to implement what you want.


Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list
In reply to this post by Christoph Hormann-2
I'm surprised you think that as you were a contributor to the discussions:

https://github.com/gravitystorm/openstreetmap-carto/pull/3102

https://lists.openstreetmap.org/pipermail/tagging/2018-March/035347.html

DaveF




Re: Extremely long Amtrak route relations / coastline v. water

Clay Smalley
In reply to this post by Tagging mailing list
On Sun, Nov 22, 2020 at 11:12 AM Dave F via Tagging <[hidden email]> wrote:
> On 22/11/2020 11:24, Richard Fairhurst wrote:
>
>> I sincerely hope "I'm in favor of fixing" translates as "I'm planning to fix", though I fear I may be disappointed.
>>
>> More broadly, we need to nip this "oh just fix the tools" stuff in the bud. (etc)
>
> Likewise we need to stop software developers from expecting contributors to add data purely because they can't be bothered/not competent enough to write a few lines of code. (OSM-carto demanding boundaries on ways & numerous routers expecting multiple footways to criss-cross pedestrian areas, are just two examples)
>
> Contributors to the database (also *volunteers*) are expected to map to a certain standard. There shouldn't be a reason to expect developers not to do the same.

If it's so easy, why don't you write the "few lines of code" necessary to fix this issue?
 
> Desiring relations to list in their entirety is *not* a "0.1% case". Splitting them into 'super relations' should not be the desired, final solution.

Amtrak routes, like many other public transit routes, are already split into super-relations (see [1], [2]). This is a non-issue. I've already decided to split up long-distance Amtrak routes into more manageable chunks, especially since I'm the one who takes on most of the work of managing them. My original question was *how* to split them up, not *whether* to split them. I'm not convinced that attempts to persuade me not to do so are helpful in any way, so I'm going to consider them off-topic and ignore them.

-Clay




Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list


On 22/11/2020 18:12, Clay Smalley wrote:
> On Sun, Nov 22, 2020 at 11:12 AM Dave F via Tagging <[hidden email]> wrote:
>> Contributors to the database (also *volunteers*) are expected to map to a certain standard. There shouldn't be a reason to expect developers not to do the same.
>
> If it's so easy, why don't you write the "few lines of code" necessary to fix this issue?

I did. Note the response.
https://github.com/gravitystorm/openstreetmap-carto/pull/3102#issuecomment-372455636

>> Desiring relations to list in their entirety is *not* a "0.1% case". Splitting them into 'super relations' should not be the desired, final solution.
>
> Amtrak routes, like many other public transit routes, are already split into super-relations (see [1], [2]).

Yes. I've done it myself on UK bicycle routes, but only out of necessity, due to the software limitation, not, as I stated, from any desire. A single relation would be much less error-prone: with split relations, tags & ways get added to the wrong ones.

DaveF


Re: Extremely long Amtrak route relations / coastline v. water

SimonPoole
In reply to this post by Brian M. Sperlongano

On 22.11.2020 at 17:35, Brian M. Sperlongano wrote:
> ..
>
> I like the idea of an api limit, though we would need a strategy to
> deal with existing large objects.
> ..

This is, "surprise", not a new topic. There are certain issues with the
semantics of relations which make this slightly more involved than the
maximum node limit on ways.

See

- https://github.com/openstreetmap/openstreetmap-website/issues/1711

- https://github.com/zerebubuth/openstreetmap-cgimap/pull/174

With the latter giving some insight into why simply declaring a limit
is not a good idea. But putting a bound in place and expecting all tools
to handle relations up to that size (just as we currently do with
ways) would be a good thing to improve robustness of the whole system.

Simon



Re: Extremely long Amtrak route relations / coastline v. water

Phake Nick
In reply to this post by Richard Fairhurst
Excuse me, what is the limitation here against tagging "Extremely long Amtrak relations"? Some of those Amtrak services, while long, are to my knowledge still far from the longest in the OSM database; they're shorter than the train route between Moscow and Pyongyang, which has been tagged as a regular relation with no problem observable to me.
In my opinion, since each of these long Amtrak services is still just a single service, with no break, change of train, or change of train number in between, it seems outright bogus to tag them separately, and doing so would confuse anyone who wishes to use OSM data to provide navigation involving such train routes.


Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list
In reply to this post by Tagging mailing list



Nov 22, 2020, 19:34 by [hidden email]:


> On 22/11/2020 18:12, Clay Smalley wrote:
>> On Sun, Nov 22, 2020 at 11:12 AM Dave F via Tagging <[hidden email]> wrote:
>>> Contributors to the database (also *volunteers*) are expected to map to a certain standard. There shouldn't be a reason to expect developers not to do the same.
>>
>> If it's so easy, why don't you write the "few lines of code" necessary to fix this issue?
>
> I did. Note the response.

The request was for the "few lines of code" necessary to fix this issue,
not for lines of code doing something similar on a different software stack,
which does not fix the issue at all.

And the very next comment is

> Exactly, however there is no way to express that in CartoCSS.



Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list
In reply to this post by Tagging mailing list



Nov 22, 2020, 19:00 by [hidden email]:
> I'm surprised you think that as you were a contributor to the discussions:
>
> https://github.com/gravitystorm/openstreetmap-carto/pull/3102

This is a closed PR that was not implemented. So it is not a case of
"OSM-carto demanding boundaries on ways".

> https://lists.openstreetmap.org/pipermail/tagging/2018-March/035347.html

Yes, a long time ago there was a problematic idea that was abandoned.

Describing something like that over two years later as
"OSM-carto demanding boundaries on ways", in the present tense and
with the claim that it is a technical issue caused by incompetent programmers,
is misleading at best.


Re: Extremely long Amtrak route relations / coastline v. water

Christoph Hormann-2


> Mateusz Konieczny via Tagging <[hidden email]> wrote on 22.11.2020 20:49:
>
> > https://lists.openstreetmap.org/pipermail/tagging/2018-March/035347.html
> Yes, long time ago there was a problematic idea that was abandoned.

Exactly.  It also shows how we in OSM traditionally make decisions about tagging.  An idea to change tagging practice was suggested - on an open channel for everyone to read and comment on without hurdles and with an archive that allows us now to read up on things years later.  It was discussed and arguments and reasoning were provided both for and against the idea and based on that we reached consensus that it was a bad idea and it was abandoned.

--
Christoph Hormann
http://www.imagico.de/


Re: Extremely long Amtrak route relations / coastline v. water

Brian M. Sperlongano
In reply to this post by SimonPoole
As time goes on, we will encounter increasingly accurate and high-resolution mechanisms for surveying things like coastlines and land cover.  For example, there are discussions about whether to use things like AI and machine learning to produce such data.  The demand for ways to deal with larger objects will only grow in the future.

Therefore, a holistic solution is needed for large objects.  Setting an api limit is good because it gives consumers a guarantee about the worst-case object they might have to handle.  However, it must also be combined with a replacement mechanism for representing large objects.  The 2,000 node limit for a way is fine because longer ways can be combined via relations.  If the relation member limit were capped, you create a class of objects that cannot be represented in the data set.

What I think is missing is a way to store huge multipolygons in such a way that they can be worked with in a piecemeal way.  The answer that immediately comes to mind is a scheme where large objects are represented as relations of relations, where portions of a huge multipolygon are chopped up into fragments and stored in subordinate multipolygon relations.  This hierarchy could perhaps nest several levels if needed.  Now a 40,000 member relation could be composed of 200 relations of 200 members each, with each subordinate relation member being a valid multipolygon with disjoint or adjacent portions of the overall geometry.
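
As a rough illustration of the arithmetic in that proposal, here is a minimal Python sketch that groups a flat member list into child relations and wraps them in one parent. The `Relation` class, the `split_into_subrelations` helper, and the negative placeholder ids are all hypothetical and are not part of any OSM tool or API. The sketch chunks members in list order only; making each child a valid multipolygon, as the proposal requires, would need a spatial split that is not attempted here.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """Toy stand-in for an OSM relation: an id plus an ordered member list."""
    rel_id: int
    members: list = field(default_factory=list)  # way ids, or child Relation objects


def split_into_subrelations(member_ids, chunk_size, first_id=-1):
    """Group member_ids into child relations of at most chunk_size members
    and return a parent relation whose members are those children."""
    children = []
    next_id = first_id  # hypothetical placeholder ids, counting downwards
    for start in range(0, len(member_ids), chunk_size):
        chunk = list(member_ids[start:start + chunk_size])
        children.append(Relation(next_id, chunk))
        next_id -= 1
    return Relation(next_id, children)


# The example from the message: 40,000 members in chunks of 200.
parent = split_into_subrelations(list(range(40_000)), 200)
print(len(parent.members), max(len(c.members) for c in parent.members))
```

With these inputs the parent ends up with 200 children of 200 members each, which matches the figures quoted above.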

Then, an editor could say "here is a large relation, I've drawn bounding boxes for the 200 sub-relations, if you select one, I'll load its data and you can edit just that sub-relation".

This could almost work under the current relation scheme (provided new relation types are invented to cover these types of data structures, and consumers roger up to supporting such hierarchical relations).  The thing that makes this fail for interactive data consumers (such as an editor or a display) is that there's no way to know where relation members are, spatially, within the relation.  The api does not have a way to say "what is the bounding box of this object?"  A consumer would need to traverse down through the hierarchy to compute the inner bounding boxes, which defeats the purpose of subdividing it in the first place.
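
To make the cost of that traversal concrete, here is a small Python sketch of what a consumer has to do when no bounding box is stored: walk every level of the hierarchy down to the nodes. The dictionary layout (`relations`, `ways`, `nodes`) and the function names are assumptions made for illustration, not the OSM API, and the sketch treats longitude and latitude as plain planar values (no antimeridian handling).

```python
def merge_bbox(a, b):
    """Union of two (min_lon, min_lat, max_lon, max_lat) boxes; None means empty."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def relation_bbox(rel_id, relations, ways, nodes, seen=None):
    """Compute a relation's bounding box by visiting every member recursively."""
    seen = set() if seen is None else seen
    if rel_id in seen:  # guard against relations that (indirectly) contain themselves
        return None
    seen.add(rel_id)
    box = None
    for kind, ref in relations[rel_id]:
        if kind == "relation":
            box = merge_bbox(box, relation_bbox(ref, relations, ways, nodes, seen))
        else:
            node_ids = ways[ref] if kind == "way" else [ref]
            for node_id in node_ids:
                lon, lat = nodes[node_id]
                box = merge_bbox(box, (lon, lat, lon, lat))
    return box


# Tiny hypothetical dataset: a super-relation (200) with two child relations.
nodes = {1: (0.0, 0.0), 2: (1.0, 2.0), 3: (5.0, -1.0), 4: (6.0, 3.0)}
ways = {10: [1, 2], 11: [3, 4]}
relations = {
    100: [("way", 10)],
    101: [("way", 11)],
    200: [("relation", 100), ("relation", 101)],
}
print(relation_bbox(200, relations, ways, nodes))  # (0.0, -1.0, 6.0, 3.0)
```

Every node under the super-relation is touched to answer a single bounding-box question, which is the overhead the paragraph above describes.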



Re: Extremely long Amtrak route relations / coastline v. water

stevea
Brian, as someone who worked on these national rail relations (and still does, to some extent, though only around the edges), I agree with you that "very large" relations (in Amtrak we say that one route is >2500 relations and meets that standard of "very large") do exist.  And, they are unwieldy, especially for those who edit with "only" moderate (or less!) compute resources.  Such demands are partly why open mapping didn't happen until the 21st century:  our computers are just keeping up (as they do:  compute power fills niches of computing that the tech / hardware / software development only catches up to, it then begins to be used for those applications).  Desktop computers and Java / early JOSM / Potlatch 1 and 2 around 2005 were an OK match for each other.  OSM has grown these from there.  (Nicely, in my opinion, though there are always newer, faster technologies and software platforms to harness them).

AI on "bigger iron" is real today with what Facebook is doing in OSM with its machine learning toolchain.  Data structures must keep up.  Physics had to look to the ancients and see that 10 to the 100th power wasn't so big, there are Sanskrit words from long ago for what we consider fairly large numbers today.  And so OSM must keep up with inventing its future (of data structures) capable of "keeping up with Earth as data."

Super-relations do seem the way to go:  so far, so good.  I don't know about "200 deep" but as things go much wider (but not much deeper than perhaps several "relation-levels" for now, let's say "great-great-grandparents" I can hold in my mind for now), they seem like they will suffice.  If we need that many "dimensions" (relations "deep," not necessarily wide, as data simply ARE wide, not necessarily deep, though we should be prepared to go deep if we need to) we can, as you say "go hundreds deep."

But yes, doing so preserves the legacy of relations of relations (and even "of relations...ad infinitum"), though we don't need to do that very often.  However, in train routes, there are now super-routes that are "grandparents," so three deep.  This seems to be happening with bicycle routes too, and perhaps road routes; I'm not quite enough of a road geek to say yes or no, but I think so.

Luckily, relational databases (like OSM) give us ways to link them all together, "walking up and down the chain of hierarchy."  Some software, use cases, routers, whatever...will be sophisticated enough to do this walking and "be smarter for doing so," presenting a much fuller, richer, more complete universe of data; some (software) will not and will present a more "flat" view of the world (OSM's data, really, similar to looking at ways or nodes only but ignoring relations).

We both have and use methods to do this, so, "good."  But you are right to be talking about it, as data consumers, use cases, "those downstream" will need to have their antennae tuned to be paying attention to these "more sophisticated" ways of embedding hierarchy in our data.  We have been doing this since relations were developed in OSM; some data consumer software pays attention, some doesn't.  It's a real thing.  I like, for example, the way that Lonvia (in the www.waymarkedtrails.org series of overlay layers) allows and displays a view of relations and super-relations in the table of routes presented.  That's called "paying attention" and it's great when developers pay attention to these richnesses in the structure of our (sometimes hierarchical) data (so, thank you again, Sarah).  Being aware there IS a hierarchy is the first step to "walking it" and presenting its complexity to data consumers in ways that make sense for that sort of structure.

We'll solve coastline / water edges, it'll be mostly legacy (we've done it like this for quite some time) with a bit of "new methods of thinking about things" going forward.  This is how OSM works.  Talking about it is fine.  We're generating light, not heat.

A lot of people (Simon, Phake, Dave F, Clay, Mateusz, Christoph, Brian, Seth, Richard, more...) are quite right here.  Let's listen to each other.  We're all much MORE in agreement than disagreement.

SteveA

Re: Extremely long Amtrak route relations / coastline v. water

Kevin Kenny-3
In reply to this post by Brian M. Sperlongano
On Sun, Nov 22, 2020 at 8:04 PM Brian M. Sperlongano <[hidden email]> wrote:
> Therefore, a holistic solution is needed for large objects.  Setting an api limit is good because it gives consumers a guarantee about the worst-case object they might have to handle.  However, it must also be combined with a replacement mechanism for representing large objects.  The 2,000 node limit for a way is fine because longer ways can be combined via relations.  If the relation member limit were capped, you create a class of objects that cannot be represented in the data set.

We've already substantially solved that problem for routes. Super-relations seem to work well, and only rarely do we even need a three-level hierarchy. As Steve points out, we could go deeper, but there's no need.

What I think is missing is a way to store huge multipolygons in such a way that they can be worked with in a piecemeal way.  The answer that immediately comes to mind is a scheme where large objects are represented as relations of relations, where portions of a huge multipolygon are chopped up into fragments and stored in subordinate multipolygon relations.  This hierarchy could perhaps nest several levels if needed.  Now a 40,000 member relation could be composed of 200 relations of 200 members each, with each subordinate relation member being a valid multipolygon with disjoint or adjacent portions of the overall geometry.

Then, an editor could say "here is a large relation, I've drawn bounding boxes for the 200 sub-relations, if you select one, I'll load its data and you can edit just that sub-relation". 

This could almost work under the current relation scheme (provided new relation types are invented to cover these types of data structures, and consumers roger up to supporting such hierarchical relations).  The thing that makes this fail for interactive data consumers (such as an editor or a display) is that there's no way to know where relation members are, spatially, within the relation.  The api does not have a way to say "what is the bounding box of this object?"  A consumer would need to traverse down through the hierarchy to compute the inner bounding boxes, which defeats the purpose of subdividing it in the first place.
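To make the traversal cost concrete, here is a rough sketch of computing a bounding box for a nested relation. The `Way` and `Relation` classes are hypothetical in-memory stand-ins, not an OSM API, and the sketch assumes every relation is non-empty and ignores the antimeridian.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass
class Way:
    coords: List[Tuple[float, float]]  # (lon, lat) pairs


@dataclass
class Relation:
    members: List[Union["Way", "Relation"]] = field(default_factory=list)


def bbox(obj) -> BBox:
    """Compute a bounding box by visiting every way below `obj`."""
    if isinstance(obj, Way):
        lons = [lon for lon, _ in obj.coords]
        lats = [lat for _, lat in obj.coords]
        return (min(lons), min(lats), max(lons), max(lats))
    # A relation's box is the union of its members' boxes, so the whole
    # subtree has to be loaded before the answer is known.
    boxes = [bbox(m) for m in obj.members]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))
```

Every call walks down to the leaf ways, so a consumer that only wants the boxes of 200 sub-relations still ends up fetching all of their members unless the boxes are stored or served separately.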

You're right that it's a problem, but you misdiagnose the details. Rather than identifying bounding boxes, which is easy, the problem comes down to identifying topology - is a given point in space on the inside or outside of the multipolygon? Answering that question needs, at a minimum, one of two things. The first is the 'winding number' (strictly speaking, the crossing number of the even-odd rule) - essentially, if you draw a mathematical ray from the point to infinity in a given direction, how many times do you cross the boundary of the region? (Odd = inside, even = outside.)  The second is a requirement in the data model that the boundaries of regions follow a particular winding direction; most GIS systems use the "right hand rule", specifying that as you proceed along a boundary way, the interior of the region should be on your right.
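The ray test described above (the even-odd rule) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rings are plain lists of (x, y) tuples in planar coordinates, which is not how OSM stores them, and the function name `point_in_rings` is made up for this example. Points lying exactly on the boundary are not treated specially.

```python
def point_in_rings(point, rings):
    """Even-odd test: cast a ray from `point` toward +x and count crossings.

    `rings` is a list of rings (outer and inner alike); each ring is a list
    of (x, y) vertices. Ring order and ring direction do not matter.
    """
    px, py = point
    crossings = 0
    for ring in rings:
        n = len(ring)
        for i in range(n):
            x1, y1 = ring[i]
            x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
            # Does this edge straddle the horizontal line through the point?
            if (y1 > py) != (y2 > py):
                # x coordinate where the edge crosses that line
                x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                if x_cross > px:
                    crossings += 1
    return crossings % 2 == 1
```

Note that the test needs every ring of the multipolygon that the ray could cross, but it does not care which way any ring is drawn, which is why it fits data where way direction carries no meaning.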

The second rule is by far the easier to implement. Unfortunately, it's also inconsistent with OSM's base data model. The problem is that we do not require the members of a multipolygon to be sorted in any particular order (depending on client software to order them if necessary), nor do we require the boundary ways to proceed in any particular direction with respect to the multipolygon.  In fact, we cannot require the boundary ways to proceed in a particular direction, since shared ways between adjacent multipolygons are a fairly common practice: a shared way necessarily has one region on its left and the other on its right. The practice is somewhat controversial; nevertheless, it seems like a good idea when the adjoining regions by their nature are both known to touch and known to be mutually exclusive. Examples are the lines that separate landuse from landuse, landcover from landcover, administrative region from administrative region, land from water, or cadastral parcel from cadastral parcel (where cadastre is accepted, as it is with objects like public recreational land).
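Because member order and way direction are both unconstrained, client software has to stitch the ways into rings itself. Below is a minimal greedy sketch of that step, assuming each way is just a list of node ids; `assemble_rings` is a hypothetical name, and the greedy matching is not guaranteed to pick the right continuation where several rings touch at one node.

```python
def assemble_rings(ways):
    """Join unordered, arbitrarily directed ways into closed rings.

    `ways` is a list of node-id lists. Returns a list of rings, each a
    node-id list whose first and last ids are equal. Raises ValueError if
    some way cannot be joined into a closed ring.
    """
    remaining = [list(w) for w in ways]
    rings = []
    while remaining:
        ring = remaining.pop()
        while ring[0] != ring[-1]:
            for i, way in enumerate(remaining):
                if way[0] == ring[-1]:
                    ring.extend(way[1:])             # way already points onward
                elif way[-1] == ring[-1]:
                    ring.extend(reversed(way[:-1]))  # way must be walked backwards
                else:
                    continue
                del remaining[i]
                break
            else:
                raise ValueError("boundary is not closed at node %r" % ring[-1])
        rings.append(ring)
    return rings
```

The `ValueError` branch is the "disconnected boundary" case: it can only be detected once every member way is in hand, which is one reason piecemeal editing of a large multipolygon is awkward.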

Except for monsters such as the World Ocean (the coastline is a perpetual headache), seas, and objects with extremely complex topology, the problem is somewhat manageable. A single 'ring' (the cycle of contiguous ways, inner or outer, that form one region of a multipolygon) or a single 'complex polygon' (an outer ring and any inner rings subordinate to it) is generally quite manageable in terms of data volume.  I can edit shorelines of the Great Lakes, for instance, with some confidence, by loading into JOSM all the data near the single stretch of shoreline that I'm working on, plus the entire outer perimeter of the lake (using the 'download incomplete members' function); having the shoreline outside the immediate region of interest doesn't stress the memory even of a somewhat obsolete laptop computer. Not all editors are as competent with managing large relations - I've never, for instance, grown comfortable with attempting similar tasks in any of the browser-based ones I've tried. I used Merkaartor briefly during a time when the large relations were causing random JOSM crashes (something to do with interactions with accessibility extensions when painting the data in the UI), and it was also fairly workable, so this isn't a JOSM advertisement, necessarily.

The objects that typically give me the worst headaches aren't necessarily the largest ones - as I said, I deal with long routes such as the Appalachian Trail, or large areas such as the Great Lakes - but rather the diffuse ones. (Many National Forests are both!)  Editing a messy multipolygon like https://www.openstreetmap.org/relation/6360587 - particularly one where the ways are shared with other objects (as where a recreation area shares boundaries with an adjacent wilderness area, or is defined by a shoreline or a stream centerline) - is, as an elderly relative of mine used to put it, "a pain where a pill don't fix it!"

I do not agree at all with the contention that nothing is lost by breaking the association among the individual fragments of such a diffuse area.  They share a name, an administrative authority, a management plan, a web site, a set of regulations, and so on.  They are the parts of a whole that happens to be fragmented into a lot of spatially disjoint, although loosely grouped, pieces.  I do understand that "relations are not categories" but I'm not trying to create a relation for "all Wild Forest areas" or "all New York State lands", but rather for the particular facility known as the "Wilcox Lake Wild Forest." The neighbours and visitors of that forest do conceptualize it as a single thing, so we do lose a lot if you tell me "just don't map that way."

Extracting a geographic region from a large multipolygon for rendering is somewhat a solved problem, although implementations in particular tools vary. There are a number of named algorithms related to the issue. Wikipedia offers some good jumping-off points:


They work quite well in practice for rendering and geocoding in limited geographic areas. The spatial indexing of the relational databases we use also performs well in practice except for the case where the region is both large and topologically complex.

The key issue for editing is that edits must ensure topological consistency. Most proposals that I've seen for representing large multipolygons by subdivision fail at this - they require loading the entire multipolygon to verify that the portion being edited does not introduce crossing ways or disconnect the boundary. This is the perennial problem with the coastline - it's never complete and consistent, so the generalization of the coastline never seems to happen.

Apologies to the 'tagging' mailing list in that I'm wandering off into data storage, data retrieval, editing and rendering technology, none of which really bears on how the objects are mapped and tagged.  There's almost certainly a better forum in which to hash out design details of a data model that addresses Brian's issue satisfactorily, and I'll happily follow to wherever the discussion of the technological problems moves.

--
73 de ke9tv/2, Kevin


Re: Extremely long Amtrak route relations / coastline v. water

Tagging mailing list
In reply to this post by Christoph Hormann-2


On 22/11/2020 22:27, Christoph Hormann wrote:
> Exactly. It also shows how we in OSM traditionally make decisions
> about tagging. An idea to change tagging practice was suggested - on
> an open channel for everyone to read and comment on without hurdles
> and with an archive that allows us now to read up on things years
> later. It was discussed and arguments and reasoning were provided both
> for and against the idea and based on that we reached consensus that
> it was a bad idea and it was abandoned.

Yes, but the demand was still made & the solution of writing competent
code to enable the proposal was never implemented, so your point is?

DaveF



Re: Extremely long Amtrak route relations / coastline v. water

Christoph Hormann-2


> Dave F via Tagging <[hidden email]> wrote on 24.11.2020 01:24:
>
> Yes, but the demand was still made &

So what?  Someone (an individual, not 'OSM-Carto' as a whole) made a suggestion (and not a demand) that turned out to not be such a good idea and therefore did not achieve consensus.

> the solution of writing competent
> code to enable the proposal was never implemented,
> so your point is?

I am not sure what you mean here.  One of the problems with tagging boundaries on ways, and one of the main reasons why the idea did not reach consensus, is that it does not solve any of the rendering problems w.r.t. boundaries in substance.

Code for processing OSM boundary data for cartographic applications exists.  Not all of it is open source, and much of it consists of rough implementations not robust enough for routine use.  And there are of course very different cartographic problems to solve w.r.t. boundary rendering.  Why is nothing in that direction in OSM-Carto right now?  Because no one so far has invested the volunteer time to do so, and no one has invested the money to pay someone qualified to do so either.  And a large number of people consider the status quo good enough.  "The good enough is an enemy of the great" is a very common pattern in map style development.

--
Christoph Hormann
http://www.imagico.de/
