problems with differential update?

wambacher

Hi,

The differential update from planet.osm.org is causing problems.

Osmosis can't process 633.state.txt:

#Thu Apr 11 09:50:03 CEST 2019
sequenceNumber=3446633
timestamp=2019-04-11T07\:49\:02Z
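
For readers unfamiliar with the layout: the state file is in Java properties format (hence the escaped colons in the timestamp), and "633" is only the last three digits of the sequence number. A minimal sketch of the mapping, assuming the usual three-level directory layout of the minutely replication feed; `parse_state` and `replication_path` are hypothetical helper names, not part of osmosis:

```python
def parse_state(text):
    """Parse a replication state file (Java properties style) into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the leading date comment
        key, _, value = line.partition("=")
        state[key] = value.replace("\\:", ":")  # undo the properties escaping of ':'
    return state


def replication_path(sequence_number, suffix):
    """Map a sequence number to its AAA/BBB/CCC path (assumed layout)."""
    padded = f"{sequence_number:09d}"
    return f"{padded[0:3]}/{padded[3:6]}/{padded[6:9]}{suffix}"


STATE = """#Thu Apr 11 09:50:03 CEST 2019
sequenceNumber=3446633
timestamp=2019-04-11T07\\:49\\:02Z
"""

state = parse_state(STATE)
seq = int(state["sequenceNumber"])
print(replication_path(seq, ".state.txt"))  # 003/446/633.state.txt
print(replication_path(seq, ".osc.gz"))     # 003/446/633.osc.gz
```

The change file that belongs to a state file shares its path and differs only in the suffix.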


Apr 11, 2019 10:34:22 AM org.openstreetmap.osmosis.core.pipeline.common.ActiveTaskManager waitForCompletion
SCHWERWIEGEND: Thread for task 1-read-replication-interval failed
org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to parse xml file /tmp/change7292477158762139154.tmp.  publicId=(null), systemId=(null), lineNumber=3343, columnNumber=32.
    at org.openstreetmap.osmosis.xml.v0_6.XmlChangeReader.run(XmlChangeReader.java:95)
    at org.openstreetmap.osmosis.replication.v0_6.ReplicationDownloader.processChangeset(ReplicationDownloader.java:107)
    at org.openstreetmap.osmosis.replication.v0_6.BaseReplicationDownloader.processReplicationFile(BaseReplicationDownloader.java:166)
    at org.openstreetmap.osmosis.replication.v0_6.BaseReplicationDownloader.download(BaseReplicationDownloader.java:268)
    at org.openstreetmap.osmosis.replication.v0_6.BaseReplicationDownloader.runImpl(BaseReplicationDownloader.java:304)
    at org.openstreetmap.osmosis.replication.v0_6.BaseReplicationDownloader.run(BaseReplicationDownloader.java:383)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.xml.sax.SAXParseException; lineNumber: 3343; columnNumber: 32; Invalid byte 2 of 4-byte UTF-8 sequence.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)

Are there any known problems?

regards

walter


_______________________________________________
talk mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/talk

Re: problems with differential update?

Hartmut Holzgraefe-3
On 11.04.19 10:35, [hidden email] wrote:
> Hi,
>
> the differential update from planet.osm.org is making problems.
>
[...]
> org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to parse xml file /tmp/change7292477158762139154.tmp.  publicId=(null), systemId=(null), lineNumber=3343, columnNumber=32.
[...]
> Caused by: org.xml.sax.SAXParseException; lineNumber: 3343;
> columnNumber: 32; Invalid byte 2 of 4-byte UTF-8 sequence.
>     at
[...]
> are there any known problems?


Same here, but unfortunately the mentioned /tmp file gets removed, so it
is not easy to tell what UTF-8 sequence it actually doesn't like.

I did a quick check by downloading the last few hours of minutely
diffs, and then running all of them through xmllint, but that
didn't complain about any of them ...

--
hartmut


Re: problems with differential update?

SimonPoole
It's diff 634 but the line number seems to be off, investigating.

Re: problems with differential update?

SimonPoole
Decompressing and recompressing the file helps, so it is not actually an
error in the OSM data. AFAIK both sysadmins are currently unavailable, so
it will take a while to fix.

Simon

On 11.04.2019 at 16:17, Simon Poole wrote:

> Sorry 643
>
> On 11.04.2019 at 16:15, Simon Poole wrote:
>> It's diff 634 but the line number seems to be off, investigating.

Re: problems with differential update?

SimonPoole

Should be fixed now, thanks to Grant.


Re: problems with differential update?

wambacher
In reply to this post by SimonPoole

It's looking good. Everything is running fine now.

Don't know who did what.

walter


Re: problems with differential update?

Roland Olbricht
In reply to this post by SimonPoole
> Unzipping and re-zipping helps so it is not actually an error in the OSM
> data. AFAIK both sys admins are currently unavailable so it will take a
> while to fix.

There hasn't been any problem at all here on the Overpass API instances:

I got from the server within seconds
de7bdcc5c9f8ea5ba4043d569a51ee6a  634.osc.gz
and uncompressed with gunzip
9027caf2a2f23763fb6037915997f0a1  -

The file now has
4589b5f4c11dbbe0bb504c5642c54b72  634.osc.gz
and uncompressed with gunzip
9027caf2a2f23763fb6037915997f0a1  -

Could it have been that the file triggered an arcane bug in a gz library
from the Java universe?

If you want to cross-check, I have put the files here:
https://dev.overpass-api.de/misc/de7bdcc5_634.osc.gz
https://dev.overpass-api.de/misc/4589b5f4_634.osc.gz
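
The comparison above (different checksums for the .gz files, identical checksums for the decompressed data) can be reproduced with a few lines. This is a minimal sketch that assumes the checksums shown are MD5, as the `md5sum`-style output suggests; it uses stand-in data rather than the real diff, and `digests` is a hypothetical helper name:

```python
import gzip
import hashlib


def digests(gz_bytes):
    """Return (MD5 of the compressed bytes, MD5 of the decompressed payload)."""
    compressed = hashlib.md5(gz_bytes).hexdigest()
    payload = hashlib.md5(gzip.decompress(gz_bytes)).hexdigest()
    return compressed, payload


# Stand-in data: the same payload compressed in two different ways.
xml = b"<osmChange version='0.6'></osmChange>\n" * 100
first = gzip.compress(xml, compresslevel=1, mtime=0)
second = gzip.compress(xml, compresslevel=9, mtime=1554986267)

a_gz, a_raw = digests(first)
b_gz, b_raw = digests(second)
print("compressed digests differ:", a_gz != b_gz)
print("payload digests match:", a_raw == b_raw)
```

To check the real files, pass the bytes of each downloaded .gz file to `digests` instead of the stand-in data.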

Roland


Re: problems with differential update?

Frederik Ramm
Hi,

On 4/11/19 21:26, Roland Olbricht wrote:
> Could it have been that the file triggered an arcane bug in a gz library
> from the Java universe?

Yes, that's exactly the problem - these files decompress fine with gzip,
just not with Java.

Bye
Frederik

--
Frederik Ramm  ##  eMail [hidden email]  ##  N49°00'09" E008°23'33"


Re: problems with differential update?

Nelson A. de Oliveira
In reply to this post by Roland Olbricht
On Thu, Apr 11, 2019 at 4:29 PM Roland Olbricht <[hidden email]> wrote:
> If you want to cross, check, I have put the files here:
> https://dev.overpass-api.de/misc/de7bdcc5_634.osc.gz
> https://dev.overpass-api.de/misc/4589b5f4_634.osc.gz

It's nice how both files are identified by file:

de7bdcc5_634.osc.gz: gzip compressed data, from FAT filesystem
(MS-DOS, OS/2, NT), original size 559156

4589b5f4_634.osc.gz: gzip compressed data, was "634.osc", last
modified: Thu Apr 11 14:37:47 2019, max compression, from Unix,
original size 559156
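
The details that file reports come from the gzip header and trailer defined in RFC 1952: an OS byte, an optional original file name, a modification time, a compression hint, and the uncompressed size in the last four bytes. A minimal sketch of reading those fields follows; `describe_gzip` is a hypothetical helper, only the FEXTRA and FNAME flags are handled, and the sample stream is generated locally rather than taken from the files above:

```python
import gzip
import io
import struct

FEXTRA, FNAME = 0x04, 0x08  # FLG bits from RFC 1952
OS_NAMES = {0: "FAT filesystem (MS-DOS, OS/2, NT)", 3: "Unix", 255: "unknown"}
XFL_NAMES = {2: "max compression", 4: "fastest compression"}


def describe_gzip(data):
    """Decode the fixed gzip header, the optional name and the size trailer."""
    magic, method, flags, mtime, xfl, os_id = struct.unpack("<HBBIBB", data[:10])
    if magic != 0x8B1F or method != 8:
        raise ValueError("not a deflate-compressed gzip stream")
    pos = 10
    if flags & FEXTRA:  # skip the optional extra field
        (xlen,) = struct.unpack("<H", data[pos:pos + 2])
        pos += 2 + xlen
    name = None
    if flags & FNAME:  # zero-terminated original file name
        end = data.index(b"\x00", pos)
        name = data[pos:end].decode("latin-1")
    (isize,) = struct.unpack("<I", data[-4:])  # uncompressed size mod 2**32
    return {
        "name": name,
        "mtime": mtime,
        "compression": XFL_NAMES.get(xfl, "unspecified"),
        "os": OS_NAMES.get(os_id, f"other ({os_id})"),
        "original_size": isize,
    }


# Locally generated sample; the mtime is an arbitrary example timestamp.
payload = b"<osmChange/>\n" * 50
buf = io.BytesIO()
with gzip.GzipFile(filename="634.osc", mode="wb", fileobj=buf,
                   compresslevel=9, mtime=1554986267) as f:
    f.write(payload)

info = describe_gzip(buf.getvalue())
print(info)
```

A header with no name, a zero modification time and OS byte 0 would give the first kind of output quoted above; a header with name and timestamp filled in gives the second.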


Re: problems with differential update?

mmd
In reply to this post by Frederik Ramm
On 11.04.19 at 21:33, Frederik Ramm wrote:
> Hi,
>
> On 4/11/19 21:26, Roland Olbricht wrote:
>> Could it have been that the file triggered an arcane bug in a gz library
>> from the Java universe?
>
> Yes, that's exactly the problem - these files decompress fine with gzip,
> just not with Java.
>

Nothing really new here. That's exactly the same thing which happened
two months ago:
https://lists.openstreetmap.org/pipermail/talk/2019-February/082057.html

Too bad osmosis is still unmaintained...


Re: problems with differential update?

SimonPoole

On 11.04.2019 at 22:40, mmd wrote:
> ....
> Too bad osmosis is still unmaintained...

"Still"? AFAIK nobody has even lifted a finger to find a new maintainer,
so I wouldn't expect a solution to the issue, barring magic, unicorns
and so on.




Re: problems with differential update?

SimonPoole
In reply to this post by Roland Olbricht

On 11.04.2019 at 21:26, Roland Olbricht wrote:
> ....
> Could it have been that the file triggered an arcane bug in a gz library
> from the Java universe?
>
> ...
I decompressed the failing file using the same method as osmosis without
any problem, so whatever the issue is, it must be fairly subtle.



Re: problems with differential update?

Maarten Deen
In reply to this post by mmd
On 2019-04-11 22:40, mmd wrote:

> On 11.04.19 at 21:33, Frederik Ramm wrote:
>> Hi,
>>
>> On 4/11/19 21:26, Roland Olbricht wrote:
>>> Could it have been that the file triggered an arcane bug in a gz
>>> library
>>> from the Java universe?
>>
>> Yes, that's exactly the problem - these files decompress fine with
>> gzip,
>> just not with Java.
>>
>
> Nothing really new here. That's exactly the same thing which happened
> two months ago:
> https://lists.openstreetmap.org/pipermail/talk/2019-February/082057.html

Does the file type detection in Java follow the same rules (or even use
the same library) as file(1)? Its man page says:
> file tests each argument in an attempt to classify it. There are three
> sets of tests, performed in this order: filesystem tests, magic tests,
> and language tests. The first test that succeeds causes the file type
> to be printed.

Is there any similarity in the first bytes of the zipped file?

Regards,
Maarten


Re: problems with differential update?

SimonPoole

On 12.04.2019 at 08:26, Maarten Deen wrote:

>
> Does the determination in java follow the same rules (or even the same
> library) as file(1)? In its manpage it says
>> file tests each argument in an attempt to classify it. There are
>> three sets of tests, performed in this order: filesystem tests, magic
>> tests, and language tests. The first test that succeeds causes the
>> file type to be printed.
>
> Is there any similarity in the first bytes of the zipped file?
>
That would be a valid question, except that yesterday I hardwired the
decompression type to GZip in a test version of osmosis and still got
the error. And, as I said, a test program using the same methods
decompresses the file just fine.



Re: problems with differential update?

Maarten Deen
On 2019-04-12 09:52, Simon Poole wrote:

> On 12.04.2019 at 08:26, Maarten Deen wrote:
>>
>> Does the determination in java follow the same rules (or even the same
>> library) as file(1)? In its manpage it says
>>> file tests each argument in an attempt to classify it. There are
>>> three sets of tests, performed in this order: filesystem tests, magic
>>> tests, and language tests. The first test that succeeds causes the
>>> file type to be printed.
>>
>> Is there any similarity in the first bytes of the zipped file?
>>
> This would be a valid question except: I hardwired the decompression
> type to GZip yesterday in a test version of osmosis and still got the
> error and as said a test program using the same methods decompresses
> the
> file just fine.

That is interesting. I don't know how the unzipping is done in osmosis,
but if it is still a Java call (now hardwired to use gzip; I assume the
executable, not a built-in library), could it be that Java still tries
to figure out what kind of file it is and fails because of that?

Regards,
Maarten


Re: problems with differential update?

SimonPoole
https://github.com/openstreetmap/osmosis/pull/50


Re: problems with differential update?

SimonPoole

The actual fix is https://github.com/openstreetmap/osmosis/pull/51

As can be seen there, this is a very old bug (reported in 2007, with a fix available since 2007) in Apache Xerces, the XML parser that osmosis has an explicit dependency on. However, the version of Xerces included in the JDK since 2015 has the patch applied, so the fix is simply to remove the explicit dependency, which wasn't needed with modern JDKs in any case.

Simon
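
As background on how valid data can still produce an "invalid byte of a multi-byte UTF-8 sequence" error: a reader that decodes input chunk by chunk has to carry an incomplete multi-byte sequence over to the next chunk. The sketch below illustrates that general class of problem only; it is not the Xerces code, and the thread does not spell out the exact mechanism of the Xerces bug. The sample text and chunk size are made up.

```python
import codecs

# Sample text with a 4-byte UTF-8 sequence (U+1F600) and a 2-byte one (U+00E9).
text = "name=\U0001F600 caf\u00e9"
data = text.encode("utf-8")


def chunks(buf, size):
    """Split a byte string into fixed-size pieces, as a buffered reader might."""
    return [buf[i:i + size] for i in range(0, len(buf), size)]


def decode_naively(parts):
    """Decode every chunk on its own; breaks if a sequence straddles chunks."""
    return "".join(p.decode("utf-8") for p in parts)


def decode_incrementally(parts):
    """Decode with state kept between chunks, so split sequences are joined."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = [decoder.decode(p) for p in parts]
    out.append(decoder.decode(b"", final=True))
    return "".join(out)


# A chunk size of 7 cuts the 4-byte sequence (bytes 5 to 8) in half.
parts = chunks(data, 7)

try:
    decode_naively(parts)
    naive_ok = True
except UnicodeDecodeError as exc:
    naive_ok = False
    print("naive decoding failed:", exc.reason)

print("incremental decoding ok:", decode_incrementally(parts) == text)
```

Where the chunk boundaries fall depends on how many bytes each read returns, so the same document can parse or fail depending on how it reaches the parser.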
