[osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

[osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan


Hello All!


We were discussing today on the #dev Slack channel the idea of extending Osmosis to handle enhanced PBF files created by Osmium that have location information for each WayNode. This has the potential to eliminate the need for giant node-to-location maps consuming tens of GB of RAM when processing planetary PBF files. 


I'd like to see this happen as OSM data keeps growing exponentially and I'd be more than happy to help out with the changes to Osmosis if that would be useful. Once we have an OsmosisReader than can handle these enhanced PBF files (and, obviously, older PBF files as well), it would be nice to take the next step and have them published by the OSM project. Although we'd prefer the enhanced data be centralized so everyone has easy access to it, we could also do this part if resources to do it are limited.


Best,


     Jon


https://blog.jochentopf.com/2016-04-20-node-locations-on-ways.html

http://docs.osmcode.org/osmium/latest/osmium-add-locations-to-ways.html




_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Brett Henderson
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently is perhaps the greatest challenge of working with OSM data, and the one I've spent more time on than most.  Including that information in the raw data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.  I'm not able to spend much time on it, and it doesn't see many other contributions.  Unfortunately this means you'll probably be on your own.  I'll do my best to answer any questions, but am unlikely to be able to help directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use cases that other tools like Osmium don't cover?  If there are that's great, I'm a bit out of touch.

Cheers,
Brett

On Thu, 1 Mar 2018 at 07:35 Locke, Jonathan <[hidden email]> wrote:


Hello All!


We were discussing today on the #dev Slack channel the idea of extending Osmosis to handle enhanced PBF files created by Osmium that have location information for each WayNode. This has the potential to eliminate the need for giant node-to-location maps consuming tens of GB of RAM when processing planetary PBF files. 


I'd like to see this happen as OSM data keeps growing exponentially and I'd be more than happy to help out with the changes to Osmosis if that would be useful. Once we have an OsmosisReader than can handle these enhanced PBF files (and, obviously, older PBF files as well), it would be nice to take the next step and have them published by the OSM project. Although we'd prefer the enhanced data be centralized so everyone has easy access to it, we could also do this part if resources to do it are limited.


Best,


     Jon


https://blog.jochentopf.com/2016-04-20-node-locations-on-ways.html

http://docs.osmcode.org/osmium/latest/osmium-add-locations-to-ways.html



_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan
In reply to this post by Locke, Jonathan
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Brett Henderson
It's always nice to hear that your software is useful :-)  Thanks!  Yell out if you run into any problems and I'll do my best to point you in the right direction.

Cheers,
Brett

On Fri, 2 Mar 2018 at 11:32 Locke, Jonathan <[hidden email]> wrote:
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan

Hi Brett,


Attached is a patch that adds WayNode location (latitude/longitude) support to OsmosisReader. It seems to work fine. You can check it out by uncommenting the @Test annotation on the test I added and supplying the path to your own PBF file (I would have added a full unit test there, but I didn't know how you wanted to handle data for test cases in this project, so I just left it like this for now). You'll want to create your PBF file with a command similar to this:


osmium add-locations-to-ways --keep-untagged-nodes -o new-mexico-latest-with-way-nodes.osm.pbf new-mexico-latest.osm.pbf


Only one technical question for you: is there any way to detect from the header of the PBF file whether the file contains way node locations? It would be nice not to have to read nodes for 45 minutes before discovering that the PBF file doesn't have way node locations. Once there's an osmosis release with the way node location feature in it (and ideally a data drop with way node locations), we hope to switch to consuming only PBF files with way node locations and we'd remove support from our apps for PBF files without this information (thus the need to detect if the file has way nodes).


Best,


    Jon


From: Brett Henderson <[hidden email]>
Sent: Sunday, March 4, 2018 3:40:02 PM
To: Locke, Jonathan
Cc: [hidden email]
Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
 
It's always nice to hear that your software is useful :-)  Thanks!  Yell out if you run into any problems and I'll do my best to point you in the right direction.

Cheers,
Brett

On Fri, 2 Mar 2018 at 11:32 Locke, Jonathan <[hidden email]> wrote:
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

osmosis-way-node-location.patch (64K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Jochen123
On Mon, Mar 05, 2018 at 05:18:00PM +0000, Locke, Jonathan wrote:
> Only one technical question for you: is there any way to detect from
> the header of the PBF file whether the file contains way node
> locations? It would be nice not to have to read nodes for 45 minutes
> before discovering that the PBF file doesn't have way node locations.
> Once there's an osmosis release with the way node location feature in
> it (and ideally a data drop with way node locations), we hope to
> switch to consuming only PBF files with way node locations and we'd
> remove support from our apps for PBF files without this information
> (thus the need to detect if the file has way nodes).

Osmium sets "LocationsOnWays" as an "optional feature" string in the
header to signify that there are locations on the ways.

Jochen
--
Jochen Topf  [hidden email]  https://www.jochentopf.com/  +49-351-31778688

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Brett Henderson
In reply to this post by Locke, Jonathan
Hi Jon,

Thanks for sending through the patch.  I've just taken a look at it.  Some comments:

WayNode
The WayNode class now has the new optional latitude and longitude fields which makes sense.  Can you update the store method and (StoreReader, StoreClassRegister) constructor to persist and load those parameters again?  They're needed in case the pipeline does any functionality that requires creating temp files such as sorting.

The class is now mutable which may cause problems in a multi-threaded pipeline if task implementations are tempted to modify state on the fly.  Some of the Osmosis entity classes are mutable (an historical decision which I regrettably allowed through), but they're protected from concurrent modification through a somewhat elaborate read/write protection mechanism (see CommonEntityData for details ...).  In this case, can we keep the class immutable by adding those additional fields to an overloaded constructor?

osmformat.proto
Are these changes mastered somewhere else?  In other words, are these new fields the same ones used by Osmium in its implementation?  I'm wondering if we need to re-sync from the main OSM-binary project.  The osmosis version is the same as that in https://github.com/scrosby/OSM-binary.  The repo https://github.com/brettch/OSM-binary is a fork of that, and the osmosis branch there *is* the same code as the osmosis-osm-binary directory in the osmosis repo, just with some re-arranging to fit inside the Osmosis structure and fit inside the Osmosis java package namespace.

Jochen (if you're reading), where does Osmium source its osmformat.proto file from?

Long story short, rather than make changes directly to the file in Osmosis and create a fork, should we apply them to upstream first and then re-sync Osmosis with that?

Otherwise, the changes look relatively straightforward.  I don't have many strong opinions on how to test it.  Osmosis doesn't have an amazing test suite, it started out as a hacked together tool and grew into something bigger than I planned.  I mostly rely on some basic end to end testing for each task that creates files and reads then back again.  That's not possible if you only have read support for the new file format.  We may need to check in a small test file with a couple of ways created by Osmium.

Cheers,
Brett


On Tue, 6 Mar 2018 at 04:18 Locke, Jonathan <[hidden email]> wrote:

Hi Brett,


Attached is a patch that adds WayNode location (latitude/longitude) support to OsmosisReader. It seems to work fine. You can check it out by uncommenting the @Test annotation on the test I added and supplying the path to your own PBF file (I would have added a full unit test there, but I didn't know how you wanted to handle data for test cases in this project, so I just left it like this for now). You'll want to create your PBF file with a command similar to this:


osmium add-locations-to-ways --keep-untagged-nodes -o new-mexico-latest-with-way-nodes.osm.pbf new-mexico-latest.osm.pbf


Only one technical question for you: is there any way to detect from the header of the PBF file whether the file contains way node locations? It would be nice not to have to read nodes for 45 minutes before discovering that the PBF file doesn't have way node locations. Once there's an osmosis release with the way node location feature in it (and ideally a data drop with way node locations), we hope to switch to consuming only PBF files with way node locations and we'd remove support from our apps for PBF files without this information (thus the need to detect if the file has way nodes).


Best,


    Jon


From: Brett Henderson <[hidden email]>
Sent: Sunday, March 4, 2018 3:40:02 PM
To: Locke, Jonathan
Cc: [hidden email]

Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
It's always nice to hear that your software is useful :-)  Thanks!  Yell out if you run into any problems and I'll do my best to point you in the right direction.

Cheers,
Brett

On Fri, 2 Mar 2018 at 11:32 Locke, Jonathan <[hidden email]> wrote:
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Brett Henderson
Hmm, my previous email was probably hard to read.  Let me know if it doesn't make sense.

One other thing I should mention.  GIven that you are just adding read support, have you looked at the osmosis-pbf2 sub-project?  It contains an alternative PBF reading implementation that I wrote to support multi-threaded reading.  The task is registered as --read-pbf-fast.  The class org.openstreetmap.osmosis.pbf2.v0_6.impl.PbfBlobDecoder in method processWays contains the WayNode parsing logic.


On Sun, 11 Mar 2018 at 21:19 Brett Henderson <[hidden email]> wrote:
Hi Jon,

Thanks for sending through the patch.  I've just taken a look at it.  Some comments:

WayNode
The WayNode class now has the new optional latitude and longitude fields which makes sense.  Can you update the store method and (StoreReader, StoreClassRegister) constructor to persist and load those parameters again?  They're needed in case the pipeline does any functionality that requires creating temp files such as sorting.

The class is now mutable which may cause problems in a multi-threaded pipeline if task implementations are tempted to modify state on the fly.  Some of the Osmosis entity classes are mutable (an historical decision which I regrettably allowed through), but they're protected from concurrent modification through a somewhat elaborate read/write protection mechanism (see CommonEntityData for details ...).  In this case, can we keep the class immutable by adding those additional fields to an overloaded constructor?

osmformat.proto
Are these changes mastered somewhere else?  In other words, are these new fields the same ones used by Osmium in its implementation?  I'm wondering if we need to re-sync from the main OSM-binary project.  The osmosis version is the same as that in https://github.com/scrosby/OSM-binary.  The repo https://github.com/brettch/OSM-binary is a fork of that, and the osmosis branch there *is* the same code as the osmosis-osm-binary directory in the osmosis repo, just with some re-arranging to fit inside the Osmosis structure and fit inside the Osmosis java package namespace.

Jochen (if you're reading), where does Osmium source its osmformat.proto file from?

Long story short, rather than make changes directly to the file in Osmosis and create a fork, should we apply them to upstream first and then re-sync Osmosis with that?

Otherwise, the changes look relatively straightforward.  I don't have many strong opinions on how to test it.  Osmosis doesn't have an amazing test suite, it started out as a hacked together tool and grew into something bigger than I planned.  I mostly rely on some basic end to end testing for each task that creates files and reads then back again.  That's not possible if you only have read support for the new file format.  We may need to check in a small test file with a couple of ways created by Osmium.

Cheers,
Brett


On Tue, 6 Mar 2018 at 04:18 Locke, Jonathan <[hidden email]> wrote:

Hi Brett,


Attached is a patch that adds WayNode location (latitude/longitude) support to OsmosisReader. It seems to work fine. You can check it out by uncommenting the @Test annotation on the test I added and supplying the path to your own PBF file (I would have added a full unit test there, but I didn't know how you wanted to handle data for test cases in this project, so I just left it like this for now). You'll want to create your PBF file with a command similar to this:


osmium add-locations-to-ways --keep-untagged-nodes -o new-mexico-latest-with-way-nodes.osm.pbf new-mexico-latest.osm.pbf


Only one technical question for you: is there any way to detect from the header of the PBF file whether the file contains way node locations? It would be nice not to have to read nodes for 45 minutes before discovering that the PBF file doesn't have way node locations. Once there's an osmosis release with the way node location feature in it (and ideally a data drop with way node locations), we hope to switch to consuming only PBF files with way node locations and we'd remove support from our apps for PBF files without this information (thus the need to detect if the file has way nodes).


Best,


    Jon


From: Brett Henderson <[hidden email]>
Sent: Sunday, March 4, 2018 3:40:02 PM
To: Locke, Jonathan
Cc: [hidden email]

Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
It's always nice to hear that your software is useful :-)  Thanks!  Yell out if you run into any problems and I'll do my best to point you in the right direction.

Cheers,
Brett

On Fri, 2 Mar 2018 at 11:32 Locke, Jonathan <[hidden email]> wrote:
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Jochen123
In reply to this post by Brett Henderson
On Sun, Mar 11, 2018 at 10:19:21AM +0000, Brett Henderson wrote:

> *osmformat.proto*
> Are these changes mastered somewhere else?  In other words, are these new
> fields the same ones used by Osmium in its implementation?  I'm wondering
> if we need to re-sync from the main OSM-binary project.  The osmosis
> version is the same as that in https://github.com/scrosby/OSM-binary.  The
> repo https://github.com/brettch/OSM-binary is a fork of that, and the
> osmosis branch there *is* the same code as the osmosis-osm-binary directory
> in the osmosis repo, just with some re-arranging to fit inside the Osmosis
> structure and fit inside the Osmosis java package namespace.
>
> *Jochen* (if you're reading), where does Osmium source its osmformat.proto
> file from?

The PBF implementation in Osmium doesn't need the .proto file, so this
didn't come up there.

> Long story short, rather than make changes directly to the file in Osmosis
> and create a fork, should we apply them to upstream first and then re-sync
> Osmosis with that?

Yes, we should do this upstream in https://github.com/scrosby/OSM-binary.
If somebody creates a pull request, I'll merge it there...

Jochen
--
Jochen Topf  [hidden email]  https://www.jochentopf.com/  +49-351-31778688

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan

Thanks for the feedback, Brett. I'm on vacation tomorrow, but I'll try to address your concerns on Tuesday.


   Jon


From: Jochen Topf <[hidden email]>
Sent: Sunday, March 11, 2018 5:56:54 AM
To: Brett Henderson
Cc: Locke, Jonathan; Van Exel, Martijn; [hidden email]
Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
 
On Sun, Mar 11, 2018 at 10:19:21AM +0000, Brett Henderson wrote:
> *osmformat.proto*
> Are these changes mastered somewhere else?  In other words, are these new
> fields the same ones used by Osmium in its implementation?  I'm wondering
> if we need to re-sync from the main OSM-binary project.  The osmosis
> version is the same as that in https://github.com/scrosby/OSM-binary.  The
> repo https://github.com/brettch/OSM-binary is a fork of that, and the
> osmosis branch there *is* the same code as the osmosis-osm-binary directory
> in the osmosis repo, just with some re-arranging to fit inside the Osmosis
> structure and fit inside the Osmosis java package namespace.
>
> *Jochen* (if you're reading), where does Osmium source its osmformat.proto
> file from?

The PBF implementation in Osmium doesn't need the .proto file, so this
didn't come up there.

> Long story short, rather than make changes directly to the file in Osmosis
> and create a fork, should we apply them to upstream first and then re-sync
> Osmosis with that?

Yes, we should do this upstream in https://github.com/scrosby/OSM-binary.
If somebody creates a pull request, I'll merge it there...

Jochen
--
Jochen Topf  [hidden email]  https://linkprotect.cudasvc.com/url?a=https://www.jochentopf.com/&c=E,1,HZFs9cweUVxpTLJQ4cO5l0bBsPZCxUql3lJmd3REbm5WoZilH4Cd7sTEj0hlrb-qeHzuiAi4u5gnj2ZpgPntBsyW7BF_R6_foHVx7qkNssf1vtzLlOQQ&typo=1  +49-351-31778688

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan
In reply to this post by Brett Henderson

Hi Brett,


I addressed the issues you brought up and there's a new patch to the base code attached to this email. The only change I made to the Osmosis .proto file was to add these two lines precisely as they were added to Osmium:


+   repeated sint64 lat = 9 [packed = true];

+   repeated sint64 lon = 10 [packed = true];


I believe the other differences in the generated Java code probably have to do with our using different versions of protoc during the build process. You may want to regenerate the code using the protoc version you used to generate the Java code before and then these files should be mostly the same. 


You probably will also want to run the tests as I cannot run the tests as Gradle quits when it runs into failures in the database code (I'm most likely not set up for Osmosis' database tests). Finally, I don't really follow what you're saying about syncing code with some other codebase. I don't have commit acess to anything so perhaps someone else can sync the proto file appropriately. I believe all that needs to happen to make this work is to add the two lines above (as in the patch file).


Best,


     Jon



From: Brett Henderson <[hidden email]>
Sent: Sunday, March 11, 2018 5:43:08 AM
To: Locke, Jonathan
Cc: [hidden email]; Van Exel, Martijn
Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
 
Hmm, my previous email was probably hard to read.  Let me know if it doesn't make sense.

One other thing I should mention.  GIven that you are just adding read support, have you looked at the osmosis-pbf2 sub-project?  It contains an alternative PBF reading implementation that I wrote to support multi-threaded reading.  The task is registered as --read-pbf-fast.  The class org.openstreetmap.osmosis.pbf2.v0_6.impl.PbfBlobDecoder in method processWays contains the WayNode parsing logic.


On Sun, 11 Mar 2018 at 21:19 Brett Henderson <[hidden email]> wrote:
Hi Jon,

Thanks for sending through the patch.  I've just taken a look at it.  Some comments:

WayNode
The WayNode class now has the new optional latitude and longitude fields which makes sense.  Can you update the store method and (StoreReader, StoreClassRegister) constructor to persist and load those parameters again?  They're needed in case the pipeline does any functionality that requires creating temp files such as sorting.

The class is now mutable which may cause problems in a multi-threaded pipeline if task implementations are tempted to modify state on the fly.  Some of the Osmosis entity classes are mutable (an historical decision which I regrettably allowed through), but they're protected from concurrent modification through a somewhat elaborate read/write protection mechanism (see CommonEntityData for details ...).  In this case, can we keep the class immutable by adding those additional fields to an overloaded constructor?

osmformat.proto
Are these changes mastered somewhere else?  In other words, are these new fields the same ones used by Osmium in its implementation?  I'm wondering if we need to re-sync from the main OSM-binary project.  The osmosis version is the same as that in https://github.com/scrosby/OSM-binary.  The repo https://github.com/brettch/OSM-binary is a fork of that, and the osmosis branch there *is* the same code as the osmosis-osm-binary directory in the osmosis repo, just with some re-arranging to fit inside the Osmosis structure and fit inside the Osmosis java package namespace.

Jochen (if you're reading), where does Osmium source its osmformat.proto file from?

Long story short, rather than make changes directly to the file in Osmosis and create a fork, should we apply them to upstream first and then re-sync Osmosis with that?

Otherwise, the changes look relatively straightforward.  I don't have many strong opinions on how to test it.  Osmosis doesn't have an amazing test suite, it started out as a hacked together tool and grew into something bigger than I planned.  I mostly rely on some basic end to end testing for each task that creates files and reads then back again.  That's not possible if you only have read support for the new file format.  We may need to check in a small test file with a couple of ways created by Osmium.

Cheers,
Brett


On Tue, 6 Mar 2018 at 04:18 Locke, Jonathan <[hidden email]> wrote:

Hi Brett,


Attached is a patch that adds WayNode location (latitude/longitude) support to OsmosisReader. It seems to work fine. You can check it out by uncommenting the @Test annotation on the test I added and supplying the path to your own PBF file (I would have added a full unit test there, but I didn't know how you wanted to handle data for test cases in this project, so I just left it like this for now). You'll want to create your PBF file with a command similar to this:


osmium add-locations-to-ways --keep-untagged-nodes -o new-mexico-latest-with-way-nodes.osm.pbf new-mexico-latest.osm.pbf


Only one technical question for you: is there any way to detect from the header of the PBF file whether the file contains way node locations? It would be nice not to have to read nodes for 45 minutes before discovering that the PBF file doesn't have way node locations. Once there's an osmosis release with the way node location feature in it (and ideally a data drop with way node locations), we hope to switch to consuming only PBF files with way node locations and we'd remove support from our apps for PBF files without this information (thus the need to detect if the file has way nodes).


Best,


    Jon


From: Brett Henderson <[hidden email]>
Sent: Sunday, March 4, 2018 3:40:02 PM
To: Locke, Jonathan
Cc: [hidden email]

Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
It's always nice to hear that your software is useful :-)  Thanks!  Yell out if you run into any problems and I'll do my best to point you in the right direction.

Cheers,
Brett

On Fri, 2 Mar 2018 at 11:32 Locke, Jonathan <[hidden email]> wrote:
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

osmosis.patch (72K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Brett Henderson
Thanks Jon, I'll take a look as soon as I can.

As for testing, have you tried running it in the Docker environment? This command should do it.

./docker.sh ./gradlew build

That will automatically spin up a database server in a Docker container and run the build/tests in another container that has access to the database server. The only prerequisite is Docker itself.

On Wed., 14 Mar. 2018, 7:35 am Locke, Jonathan, <[hidden email]> wrote:

Hi Brett,


I addressed the issues you brought up and there's a new patch to the base code attached to this email. The only change I made to the Osmosis .proto file was to add these two lines precisely as they were added to Osmium:


+   repeated sint64 lat = 9 [packed = true];

+   repeated sint64 lon = 10 [packed = true];


I believe the other differences in the generated Java code probably have to do with our using different versions of protoc during the build process. You may want to regenerate the code using the protoc version you used to generate the Java code before and then these files should be mostly the same. 


You probably will also want to run the tests as I cannot run the tests as Gradle quits when it runs into failures in the database code (I'm most likely not set up for Osmosis' database tests). Finally, I don't really follow what you're saying about syncing code with some other codebase. I don't have commit acess to anything so perhaps someone else can sync the proto file appropriately. I believe all that needs to happen to make this work is to add the two lines above (as in the patch file).


Best,


     Jon



From: Brett Henderson <[hidden email]>
Sent: Sunday, March 11, 2018 5:43:08 AM
To: Locke, Jonathan
Cc: [hidden email]; Van Exel, Martijn

Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
Hmm, my previous email was probably hard to read.  Let me know if it doesn't make sense.

One other thing I should mention.  GIven that you are just adding read support, have you looked at the osmosis-pbf2 sub-project?  It contains an alternative PBF reading implementation that I wrote to support multi-threaded reading.  The task is registered as --read-pbf-fast.  The class org.openstreetmap.osmosis.pbf2.v0_6.impl.PbfBlobDecoder in method processWays contains the WayNode parsing logic.


On Sun, 11 Mar 2018 at 21:19 Brett Henderson <[hidden email]> wrote:
Hi Jon,

Thanks for sending through the patch.  I've just taken a look at it.  Some comments:

WayNode
The WayNode class now has the new optional latitude and longitude fields which makes sense.  Can you update the store method and (StoreReader, StoreClassRegister) constructor to persist and load those parameters again?  They're needed in case the pipeline does any functionality that requires creating temp files such as sorting.

The class is now mutable which may cause problems in a multi-threaded pipeline if task implementations are tempted to modify state on the fly.  Some of the Osmosis entity classes are mutable (an historical decision which I regrettably allowed through), but they're protected from concurrent modification through a somewhat elaborate read/write protection mechanism (see CommonEntityData for details ...).  In this case, can we keep the class immutable by adding those additional fields to an overloaded constructor?

osmformat.proto
Are these changes mastered somewhere else?  In other words, are these new fields the same ones used by Osmium in its implementation?  I'm wondering if we need to re-sync from the main OSM-binary project.  The osmosis version is the same as that in https://github.com/scrosby/OSM-binary.  The repo https://github.com/brettch/OSM-binary is a fork of that, and the osmosis branch there *is* the same code as the osmosis-osm-binary directory in the osmosis repo, just with some re-arranging to fit inside the Osmosis structure and fit inside the Osmosis java package namespace.

Jochen (if you're reading), where does Osmium source its osmformat.proto file from?

Long story short, rather than make changes directly to the file in Osmosis and create a fork, should we apply them to upstream first and then re-sync Osmosis with that?

Otherwise, the changes look relatively straightforward.  I don't have many strong opinions on how to test it.  Osmosis doesn't have an amazing test suite, it started out as a hacked together tool and grew into something bigger than I planned.  I mostly rely on some basic end to end testing for each task that creates files and reads then back again.  That's not possible if you only have read support for the new file format.  We may need to check in a small test file with a couple of ways created by Osmium.

Cheers,
Brett


On Tue, 6 Mar 2018 at 04:18 Locke, Jonathan <[hidden email]> wrote:

Hi Brett,


Attached is a patch that adds WayNode location (latitude/longitude) support to OsmosisReader. It seems to work fine. You can check it out by uncommenting the @Test annotation on the test I added and supplying the path to your own PBF file (I would have added a full unit test there, but I didn't know how you wanted to handle data for test cases in this project, so I just left it like this for now). You'll want to create your PBF file with a command similar to this:


osmium add-locations-to-ways --keep-untagged-nodes -o new-mexico-latest-with-way-nodes.osm.pbf new-mexico-latest.osm.pbf


Only one technical question for you: is there any way to detect from the header of the PBF file whether the file contains way node locations? It would be nice not to have to read nodes for 45 minutes before discovering that the PBF file doesn't have way node locations. Once there's an osmosis release with the way node location feature in it (and ideally a data drop with way node locations), we hope to switch to consuming only PBF files with way node locations and we'd remove support from our apps for PBF files without this information (thus the need to detect if the file has way nodes).


Best,


    Jon


From: Brett Henderson <[hidden email]>
Sent: Sunday, March 4, 2018 3:40:02 PM
To: Locke, Jonathan
Cc: [hidden email]

Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
It's always nice to hear that your software is useful :-)  Thanks!  Yell out if you run into any problems and I'll do my best to point you in the right direction.

Cheers,
Brett

On Fri, 2 Mar 2018 at 11:32 Locke, Jonathan <[hidden email]> wrote:
Hi Brett,
From our perspective, it's definitely worth adding this feature because we use OsmosisReader in a host of custom Java applications (dozens of them). I think at this point, Osmosis code is running on our servers 24/7/365 doing various kinds of back-end processing for different groups around the world.
I totally understand the part about not having time. I am the author of Apache Wicket and I've stepped away from that project for what are probably similar reasons (OSS really does soak up time like mad!). So, I will spend some time developing a patch for OsmosisReader that supports this new location-enhanced format and I'll get in touch when my patch is ready for your review. With luck, I shouldn't have too many questions and the patch will be close to what you'd like. I figure I will just need to look at the proto files and maybe the osmium code and make the appropriate changes. Anyway, thanks for writing a great little library. I've had few if any problems with it and like I said, it powers a lot of what we do with OSM.
Best,
  Jon
------
Hi Jon,

It sounds like a great initiative.  Linking ways to locations efficiently
is perhaps the greatest challenge of working with OSM data, and the one
I've spent more time on than most.  Including that information in the raw
data sets would be a huge boon for downstream consumers.

As you may have noticed Osmosis development is fairly quiet these days.
I'm not able to spend much time on it, and it doesn't see many other
contributions.  Unfortunately this means you'll probably be on your own.
I'll do my best to answer any questions, but am unlikely to be able to help
directly.

I'm curious about whether it's worth adding to Osmosis.  Are there many use
cases that other tools like Osmium don't cover?  If there are that's great,
I'm a bit out of touch.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan
In reply to this post by Jochen123

Is there a way to get this optional feature string in Osmosis? I expected it to be passed as metadata to the initialize method of Sink, but it wasn't.


    Jon




From: Jochen Topf <[hidden email]>
Sent: Wednesday, March 7, 2018 5:32:52 AM
To: Locke, Jonathan
Cc: Brett Henderson; Van Exel, Martijn; [hidden email]
Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
 
On Mon, Mar 05, 2018 at 05:18:00PM +0000, Locke, Jonathan wrote:
> Only one technical question for you: is there any way to detect from
> the header of the PBF file whether the file contains way node
> locations? It would be nice not to have to read nodes for 45 minutes
> before discovering that the PBF file doesn't have way node locations.
> Once there's an osmosis release with the way node location feature in
> it (and ideally a data drop with way node locations), we hope to
> switch to consuming only PBF files with way node locations and we'd
> remove support from our apps for PBF files without this information
> (thus the need to detect if the file has way nodes).

Osmium sets "LocationsOnWays" as an "optional feature" string in the
header to signify that there are locations on the ways.

Jochen
--
Jochen Topf  [hidden email]  https://linkprotect.cudasvc.com/url?a=https://www.jochentopf.com/&c=E,1,FuOx7DlpJLPRK_QydJ8IUnCrezjCVNUEtMN8rMuZ8QHTDpbSE3iS6YyeL_timpP1CQjKtbrxBzehr6eh7HJWweI8uY8JNwyUnl71UjgP3cU4PRPoLGFE&typo=1  +49-351-31778688

_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Brett Henderson
Hi Jon,

On Fri, 16 Mar 2018 at 06:50 Locke, Jonathan <[hidden email]> wrote:

Is there a way to get this optional feature string in Osmosis? I expected it to be passed as metadata to the initialize method of Sink, but it wasn't.


The initialize method was added relatively recently in Osmosis history so the PBF tasks are unlikely to use it.  I originally added it to pass replication metadata through the pipeline.

It shouldn't be too hard to implement, but I'll need to take a look.  Again, it may be easier to add this to the --read-pbf-fast implementation.

Cheers,
Brett


_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev
Reply | Threaded
Open this post in threaded view
|

Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations

Locke, Jonathan

Okay, thanks, Brett. If you do find time to look at this someday, we'd prefer to have access to the metadata in OsmosisReader as we don't use the multithreaded reader (because we do our own multithreaded PBF processing already).


     jon


From: Brett Henderson <[hidden email]>
Sent: Thursday, March 15, 2018 3:35:17 PM
To: Locke, Jonathan
Cc: Jochen Topf; Van Exel, Martijn; [hidden email]
Subject: Re: [osmosis-dev] Osmosis and Osmium-enhanced PBF files with way node locations
 
Hi Jon,

On Fri, 16 Mar 2018 at 06:50 Locke, Jonathan <[hidden email]> wrote:

Is there a way to get this optional feature string in Osmosis? I expected it to be passed as metadata to the initialize method of Sink, but it wasn't.


The initialize method was added relatively recently in Osmosis history so the PBF tasks are unlikely to use it.  I originally added it to pass replication metadata through the pipeline.

It shouldn't be too hard to implement, but I'll need to take a look.  Again, it may be easier to add this to the --read-pbf-fast implementation.

Cheers,
Brett


_______________________________________________
osmosis-dev mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/osmosis-dev