NYC Name Vandalism

William Morris
Does anyone know if this is traceable to OSM, or was it limited to Mapbox's mirror? I can't seem to find a related changeset, in any case . . .


-Bill


_______________________________________________
Talk-us mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/talk-us

Re: NYC Name Vandalism

William Morris
Never mind; I found it: https://www.openstreetmap.org/changeset/61555047

Looks like it was quickly reverted three weeks ago, but damn, it got through to Mapbox anyway 😬.

Thanks to whoever put the necessary safeguards in place to catch that stuff, and to whoever reverted it.

-Bill






Re: NYC Name Vandalism

Ian Dees
In reply to this post by William Morris
Yes, the original harmful edit was made by user "MedwedianPresident" in changeset https://www.openstreetmap.org/changeset/61555047 20 days ago. It was then reverted by naoliv a day later: https://www.openstreetmap.org/changeset/61556585.

naoliv also blocked the user: https://www.openstreetmap.org/user_blocks/2141


Re: NYC Name Vandalism

Nelson A. de Oliveira
On Thu, Aug 30, 2018 at 11:36 AM, Ian Dees <[hidden email]> wrote:
> Yes, the original harmful edit was made by user "MedwedianPresident" in
> changeset https://www.openstreetmap.org/changeset/61555047 20 days ago. It
> was then reverted by naoliv a day later:
> https://www.openstreetmap.org/changeset/61556585.

Actually the credit for spotting this and warning the DWG goes to
"code elusive" :-)
The NY vandalism was reverted in
https://www.openstreetmap.org/changeset/61556105


Re: NYC Name Vandalism

Kevin Kenny-3
In reply to this post by Ian Dees
On Thu, Aug 30, 2018 at 10:39 AM Ian Dees <[hidden email]> wrote:
>
> Yes, the original harmful edit was made by user "MedwedianPresident" in changeset https://www.openstreetmap.org/changeset/61555047 20 days ago. It was then reverted by naoliv a day later: https://www.openstreetmap.org/changeset/61556585.
>
> naoliv also blocked the user: https://www.openstreetmap.org/user_blocks/2141

Many thanks to the hopelessly overloaded DWG for handling this.

A problem here is that it gives us a tremendous black eye in the
press.  I wonder how, moving forward, we can lessen the chances of
this sort of hate speech propagating off the project. Other projects
have found that having a mandatory review and moderation process for
new users is helpful, because the sort of person who leaves this sort
of mess is usually creating a one-time account to do it, rather than
having made earlier sound contributions.

If my experience with other open-source and crowdsourced projects is
any guide, it only takes an incident or two like this for The Powers
That Be in many organizations to start forbidding the use of
open-source material "because there's no quality control and too much
legal risk."


Re: NYC Name Vandalism

Frederik Ramm
Hi,

On 08/30/2018 10:20 PM, Kevin Kenny wrote:
> A problem here is that it gives us a tremendous black eye in the
> press.  I wonder how, moving forward, we can lessen the chances of
> this sort of hate speech propagating off the project.

We can only speculate about the motives here - frankly my money is on
"attention seeking teenager" who could just as well have labelled a city
the "Weed Capital". Which would not have been hate speech and maybe not
reported as widely, but not really any better.

> Other projects
> have found that having a mandatory review and moderation process for
> new users is helpful,

Some have, some not; the English Wikipedia, for example, has not, while
the German Wikipedia has, to a degree.

A move like this would have to be very carefully considered as it binds
resources and reduces our ability to attract new mappers (a certain
percentage of whom would not make that first hurdle).

It is also a technical challenge: If the new signup creates a new
object, and before this is reviewed someone else creates the same new
object, what happens? If the new signup modifies an object and before
the modification is reviewed someone else modifies a different object in
a way that would make both edits clash (e.g. buildings overlap), what
happens? If we don't attract enough reviewers and new edits remain
unpublished for days or even weeks...?

Reducing the possible participation envelope of new mappers is certainly
something that can be discussed, but it's not something we should do on
a whim, and certainly not to please unspecified and scared "Powers That
Be". Perhaps educating our users about the strengths and weaknesses of
crowd-sourcing is another option.

Bye
Frederik

--
Frederik Ramm  ##  eMail [hidden email]  ##  N49°00'09" E008°23'33"


Re: NYC Name Vandalism

Frederik Ramm
Hi,

On 30.08.2018 22:43, Frederik Ramm wrote:
> We can only speculate about the motives here

Ah, just a security researcher, I guess this makes it ok then?

https://reddit.com/r/openstreetmap/comments/9brqx4/this_is_medwedianpresident1_talking_what_i_did/

> frankly my money is on
> "attention seeking teenager"

Or maybe it is the same guy who's been asked to be more mature here?

https://www.reddit.com/r/civclassics/comments/6rxu7p/before_you_leave_medwedianpresident_a_couple/

And what is this:

https://archive.is/4NzTp

Bye
Frederik

--
Frederik Ramm  ##  eMail [hidden email]  ##  N49°00'09" E008°23'33"


Re: NYC Name Vandalism

Rihards
In reply to this post by Kevin Kenny-3
On 2018.08.30. 23:20, Kevin Kenny wrote:

> On Thu, Aug 30, 2018 at 10:39 AM Ian Dees <[hidden email]> wrote:
>>
>> Yes, the original harmful edit was made by user "MedwedianPresident" in changeset https://www.openstreetmap.org/changeset/61555047 20 days ago. It was then reverted by naoliv a day later: https://www.openstreetmap.org/changeset/61556585.
>>
>> naoliv also blocked the user: https://www.openstreetmap.org/user_blocks/2141
>
> Many thanks to the hopelessly overloaded DWG for handling this.
>
> A problem here is that it gives us a tremendous black eye in the
> press.  I wonder how, moving forward, we can lessen the chances of
> this sort of hate speech propagating off the project. Other projects
> have found that having a mandatory review and moderation process for
> new users is helpful, because the sort of person who leaves this sort
> of mess is usually creating a one-time account to do it, rather than
> having made earlier sound contributions.

It gives us the same press as some vandals messing with Wikipedia -
let's not see it as a worse thing than it is.

As a side note, this was detected and reverted in OSM within a day. If data
consumers updated their data more frequently, the impact would be
much, much smaller (in this specific case, probably nobody would have
noticed).

> If my experience with other open-source and crowdsourced projects is
> any guide, it only takes a incident or two like this for The Powers
> That Be in many organizations to start forbidding the use of
> open-source material "because there's no quality control and too much
> legal risk."

--
 Rihards


Re: NYC Name Vandalism

Richard Welty-2
On 8/31/18 5:58 AM, Rihards wrote:
>
> It gives us the same press as some vandals messing with wikipedia -
> let's not see it as a worse thing than it is.
>
> As a sidenote, this was detected and revert in OSM in a day. If data
> consumers would update the data more frequently, the impact would be
> much, much smaller (in this specific case, probably nobody would have
> noticed).

Update frequency is a real issue. I recently encountered issues with
outdated maps in an OSM-based GPS application, and when I went
to edit, I found that someone was keeping all the new construction
at WDW (Walt Disney World) in Orlando up to date - it was just that the GPS app wasn't
pulling new maps frequently enough to keep up with reality.

richard

--
[hidden email]
 Averill Park Networking - GIS & IT Consulting
 OpenStreetMap - PostgreSQL - Linux
 Java - Web Applications - Search



Re: NYC Name Vandalism

Alan Brown-6
In reply to this post by Rihards
Hi -

I haven't commented on this forum for several years, but this event did catch my attention.

There are some uses of OSM map data which would not allow for frequent updates - offline uses - and therefore, a way of catching such vandalism immediately - less than a day, even - would be very helpful.

The thought that occurred to me is that changes to certain attributes of certain high-profile objects should be caught - or even stopped - very early.  The name tag of New York City should be an obvious example - what would cause it to change (short of us selling it back to the Dutch, or a similar event)?  A new user, offensive language (one of the new street names in the changelist had the word "fuck", and "Adolf Hilter") - these should be immediate red flags.  In principle, changelists could be checked against some sort of criteria that could trigger moderation, instead of being checked in automatically.

Granted, it would be nearly impossible to make these criteria perfect: there's nothing offensive about the word "Jew", but it was applied in an offensive way in this situation; I'd have no idea what would be offensive in Hungarian, much less Thai; someone could draw something offensive (like a peeing Android) that would be very hard to catch; there are places like "Dildo, Newfoundland" that are legitimate.  But I don't think it would be all that hard to flag a changelist like this last vandalism, without interrupting legitimate edits by very much.  At the very least, you can force your vandals to be clever to succeed.

In our usage, we will scan the names of significant objects for potentially offensive changes.  But it would be good to have some sort of gateway in the OSM database itself.  I don't understand the details of the OSM check-in process, or whether there is any monitoring for potential vandalism.
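
As a rough illustration of the kind of flagging rule described above, here is a minimal sketch. Everything in it is an assumption made for the example: the `Edit` record, the `PROTECTED_IDS` watch list, the `BLOCKLIST` word list and the 50-changeset threshold for a "new user" are hypothetical and not part of any OSM API. It only collects reasons for a human to look at an edit; it does not block anything.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical watch list of high-profile objects (placeholder IDs).
PROTECTED_IDS = {"node/1", "node/2"}
# Hypothetical, deliberately tiny word list; a real one would be per-language.
BLOCKLIST = {"badword"}
# Assumed threshold below which an account counts as "new".
NEW_USER_CHANGESETS = 50


@dataclass
class Edit:
    object_id: str
    old_name: str
    new_name: str
    user_changesets: int  # number of changesets the editing user has made


def review_reasons(edit: Edit) -> List[str]:
    """Return the reasons (possibly none) why an edit should be reviewed."""
    reasons = []
    renamed = edit.old_name != edit.new_name
    if renamed and edit.object_id in PROTECTED_IDS:
        reasons.append("name change on a high-profile object")
    if renamed and set(edit.new_name.lower().split()) & BLOCKLIST:
        reasons.append("new name contains a blocklisted word")
    # Being a new user is only treated as a red flag alongside another one.
    if reasons and edit.user_changesets < NEW_USER_CHANGESETS:
        reasons.append("made by a new user")
    return reasons
```

A word list this simple would wrongly flag legitimate names and miss anything slightly creative, which is exactly the limitation acknowledged above; the sketch only shows how several weak signals could be combined.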

-Alan


Re: NYC Name Vandalism

Brian May
Alan,

Your phrase "The name tag of New York City should be an obvious example - what would cause it to change" makes a lot of sense. To expand on this thought: identify and prioritize features in OSM that theoretically should not change much at all over long periods of time. Others have probably already thought of this, but it does seem like a really good idea to prioritize high-profile / large features and have the QA tools out there score these very highly for review ASAP. Out of thousands of small "potential issues" to look at in a day, a name change to New York City is a priority #1, four-alarm fire to respond to right away, because it's scored very highly as a prominent feature that should not change. Recent Great Lakes name changes also come to mind.

Brian


Re: NYC Name Vandalism

Nick Hocking
In reply to this post by William Morris

Other criteria for ranking an object for "change protection" could be:

1) How long has it been since the last change to it?

2) How big is it? (A long road would rank higher than a short one.)

3) How many things are "attached" to it?

4) How important is it? (Motorways are more important than tracks;
   tourism objects are more important than objects of local interest only.)
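
The four criteria above could be combined into a single score along these lines. This is only a sketch under stated assumptions: the `Feature` fields, the `CLASS_WEIGHT` values and the use of logarithms to dampen very large values are all hypothetical choices, not an existing OSM tool or agreed ranking.

```python
import math
from dataclasses import dataclass

# Assumed importance weights per feature class (illustrative values only).
CLASS_WEIGHT = {"motorway": 1.0, "tourism": 0.8, "residential": 0.4, "track": 0.1}
DEFAULT_WEIGHT = 0.2  # assumed weight for classes not listed above


@dataclass
class Feature:
    days_since_last_change: float  # criterion 1
    length_km: float               # criterion 2
    attached_objects: int          # criterion 3
    feature_class: str             # criterion 4


def protection_score(f: Feature) -> float:
    """Higher score = a change to this feature deserves earlier review."""
    stability = math.log1p(f.days_since_last_change)
    size = math.log1p(f.length_km)
    connectivity = math.log1p(f.attached_objects)
    importance = CLASS_WEIGHT.get(f.feature_class, DEFAULT_WEIGHT)
    return importance * (1.0 + stability + size + connectivity)
```

With these assumed weights, a long, well-connected motorway that has been stable for years scores far higher than a short track edited last month, so a change to the motorway would be queued for review first.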




Re: NYC Name Vandalism

Frederik Ramm
In reply to this post by Alan Brown-6
Hi,

On 05.09.2018 03:36, Alan Brown wrote:
> Granted, it would be nearly impossible to make this criteria perfect:

I think it would already be nearly impossible to make these criteria
even *good*. It is easy to come up with a knee-jerk "nobody should be
allowed to change the name tag of New York", and many will nod in
approval. It seems obvious, doesn't it. Who, then, makes the catalogue
of such places? Is only their name tag "protected"? Or also their
location? Can the node be moved by a mile, 10 miles, 100 miles? Can the
population be changed, and if so, by what amount?

> I'd have no idea what would be
> offensive in Hungarian, much less Thai; someone could draw something
> offensive (like a peeing Android) that would be very hard to catch;
> there are places like "Dildo, Newfoundland" that are legitimate.

All this is true, and simple regular expression matching will never fix
things (the village of Fucking in Austria is a well-known example but
the number of names that are legit in one language and offensive in
another is high).

> But I
> don't think it would be all that hard to flag a changelist like this
> last vandalism,

If you prohibit me from changing the name of New York to "Jewtropolis",
I'll just create a city node one block away from it with a slightly
higher population, causing it to be rendered with priority.

If you start down this road, you will end up not using OSM place names
at all but instead relying on a curated data set like Natural Earth,
which is a valid decision to make for a cartographer but means taking
control away from mappers and giving it to a hand-picked circle of data
curators.

> At very least, you can force your vandals to be clever to succeed.

But is this really what we want - ever more clever vandalism that is
ever less likely to be detected? Is it not even *better* to have
"obvious" vandalism that we can fix easily? Today, getting "Jewtropolis"
written large across OSM for an hour or two is no big deal, nothing to
brag about before your cool hacker friends. "So what" is the answer. Do
we want to make this into a trophy? Today, the headline is "some asshole
put 'Jewtropolis' on OSM" - tomorrow, "clever hacker penetrates OSM
defences"?

> In our usage, we will scan the names of significant objects for
> potentially offensive changes.  But it would be good to have some sort
> of gateway in the OSM database itself.

It is ok for a data consumer to do that. Nobody is hurt if your filters
wrongly reject a valid contribution in Africa. It would also be ok to
build something that prioritizes things for review. But trying to build
some kind of "protection" into the data ingestion at OSM would

* impact performance negatively
* disenfranchise mappers
* bind resources for the constant maintenance of block lists
* encourage clever(er) vandalism

and hence not be worth it.

Bye
Frederik

--
Frederik Ramm  ##  eMail [hidden email]  ##  N49°00'09" E008°23'33"


Re: NYC Name Vandalism

Greg Troxel-2
In reply to this post by Nick Hocking
I tend to agree that automated systems are going to be not that useful.

I tend to notice some things in my area, but it's hard to keep track.

This makes me wonder about a tool that

  - lets people sign up to watch edits, in some area, or in general,
    sort of like MapRoulette.   Use some scoring system where new
    mappers' edits are more likely to be looked at by somebody, and
    people who claim an area as theirs are more likely to get shown
    edits there, or maybe let people get all edits in some bbox

  - lets people give a rating to a changeset, something like:
       i) high priority for inspection by others
       ii) worthy of being checked by a local
       iii) probably ok
       iv) definitely ok

  - presents things to multiple people

  - somehow uses a rater's own edit history to validate this (perhaps be
    cautious about people with < 500 changesets, and very cautious < 50)
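
The rating idea in the list above could be sketched roughly as follows. The label names, the numeric priorities and the weights 0.25 / 0.5 / 1.0 are assumptions chosen for illustration; only the four rating levels and the 50 / 500 changeset thresholds come from the list itself.

```python
from typing import List, Tuple

# The four rating levels from the list, mapped to assumed numeric priorities.
RATING_PRIORITY = {
    "inspect": 3,         # i) high priority for inspection by others
    "check_locally": 2,   # ii) worthy of being checked by a local
    "probably_ok": 1,     # iii) probably ok
    "definitely_ok": 0,   # iv) definitely ok
}


def rater_weight(changesets: int) -> float:
    """Trust a rater less when their own edit history is short (assumed weights)."""
    if changesets < 50:
        return 0.25  # very cautious
    if changesets < 500:
        return 0.5   # cautious
    return 1.0


def review_priority(ratings: List[Tuple[str, int]]) -> float:
    """Weighted mean priority from (rating label, rater's changeset count) pairs."""
    total = sum(rater_weight(n) for _, n in ratings)
    if total == 0:
        return 0.0  # no ratings yet
    weighted = sum(RATING_PRIORITY[label] * rater_weight(n) for label, n in ratings)
    return weighted / total
```

Because the same changeset is shown to several people, one "definitely ok" from a brand-new account barely moves a "high priority" rating from an experienced mapper.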


This is a slippery slope to a reputation system, but I think in terms of
culture, the fact that anybody can review is there already, and the
bright line is needing permission to change things, vs a more efficient
way of others looking over changes.


Unfortunately my editor crashed and I lost the source code :-)



Re: NYC Name Vandalism

SimonPoole
osmcha (osmcha.mapbox.com) already does most of this. While detecting
vandalism in general is difficult, edits like those in question are easy
to detect and small in number.

IMHO it really isn't an issue with OpenStreetMap in this case, as even
with the delay (somebody reported the user in question instead of
reverting and then reporting), in this specific case the vandalism was
swiftly removed. The reason that this is being discussed at all is
because of the edit resurfacing with a third party and having to be a)
detected, b) reported, and c) fixed again. Yes, from what we know this was a
glitch in the third party's workflow, but such glitches are bound to happen, and
we shouldn't pretend, given the large number of edits, that any procedures
put in place are going to be 100% effective, be it directly with OSM or
by third parties.

Simon




Re: NYC Name Vandalism

Alan Brown-6
Hi,

Perhaps I didn't express it clearly, but my interest was in the idea that certain, rather limited changelists could be flagged for moderation before they are put into the main dataset.  There will always be things that seem like they should be blocked, but are actually appropriate.   In the interest of having the most accurate data, I'm not convinced this form of moderation can't have a role.  As I understand it, the virtue of OSM is to allow anyone to contribute accurate, detailed local knowledge about the places they know about; however, there's no value in having junk in the database for even a moment, if it can be avoided.  Place names are usually verifiable facts, even disputed place names.  So you don't want the open nature of OSM to compromise accuracy, or a quest for accuracy to discourage people from contributing accurate information.

I've said my piece; I suspect the OSM community is not culturally disposed to that form of moderation. So I will ask about a different approach.

In my case, I've seen editing errors that affected motorway connectivity (not vandalism), that were made and corrected within a couple hours.  Pretty good - except our planet file was in that two hour window.  I want to avoid these errors, without getting caught in the errors of the next two hour window.

I'm not sure if Mapbox or others use a process like this, but this is what I can imagine:

PLANETcur is the current planet file
PLANETprev is the last used planet file
CHANGEcur-prev is a comprehensive list of changelists between the two datasets

A particular consumer of OSM data can automatically scan CHANGEcur-prev and/or PLANETcur for potentially troubling content, according to their own criteria.  In their local copy, if they detect something they do not want to accept - offensive place names, incomplete topology - they can attempt to revert - in their local copy only! - recent changes that violate their criteria.  They accept whatever mistakes their "reversion" algorithm makes.  The identified "questionable changelists"  can be submitted back to the OSM community to review and revert, but always by a human.
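
A minimal sketch of the consumer-side process described above, under loud assumptions: objects are reduced to an ID plus a tag dictionary, CHANGEcur-prev is modelled as a flat list of `(changeset id, object id, new tags)` tuples, and `is_suspicious` stands in for whatever criteria a consumer chooses. None of these names correspond to an existing tool, and real diffs would also need geometry, deletions and dependencies between changesets handled.

```python
from typing import Callable, Dict, List, Set, Tuple

Tags = Dict[str, str]
Change = Tuple[int, str, Tags]  # (changeset id, object id, new tags)


def apply_with_local_revert(
    prev: Dict[str, Tags],
    changes: List[Change],
    is_suspicious: Callable[[Tags, Tags], bool],
) -> Tuple[Dict[str, Tags], Set[int]]:
    """Build a local copy from PLANETprev plus only the changesets we accept.

    Returns the local copy and the IDs of rejected changesets, which can be
    handed to a human for review. Upstream data is never modified.
    """
    # A changeset is rejected as a whole if any one of its changes is flagged.
    # Simplification: each change is compared against PLANETprev, not against
    # the state left by earlier changes in the list.
    rejected = {
        cs for cs, obj, tags in changes if is_suspicious(prev.get(obj, {}), tags)
    }
    local = {obj: dict(tags) for obj, tags in prev.items()}
    for cs, obj, tags in changes:
        if cs not in rejected:
            local[obj] = dict(tags)
    return local, rejected
```

For instance, with a predicate that flags any rename of an object currently named "New York", the local copy keeps the old name while unrelated changesets are applied as usual, and the rejected changeset ID is reported for a person to look at.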

My hope is that I am being completely unoriginal, and I can cobble together existing tools quickly. How unoriginal am I?

I am looking over the osmcha.mapbox.com page, and saw reference to a utility called "osm-compare":   https://github.com/mapbox/osm-compare/blob/master/comparators/README.md - which has an obscenity filter.  If I understand this correctly, osm-compare flags changelists for review, osmcha.mapbox.com allows people to review the flagged datasets and reverse bad edits.  Could someone define osm-compare filters that produce results that can be automatically pulled into a local copy?

(If a changeset has been reviewed by a second person, can that information be provided?)

All I want is something that allows me to be a little bit more conservative in accepting edits, without requiring complex processes or large resources.  A little insight would be appreciated.

Thanks,
Alan








Re: NYC Name Vandalism

SimonPoole


OSM changesets, or even the diffs, only contain the changes to objects. In the simple case (somebody vandalising the name tag on NYC) that may be enough to determine that something is "bad", but detecting, for example, a messed-up motorway exit will require you to actually apply (in one way or another) the changes to the existing data, or even the changes from multiple changesets. Then naturally you will have edits that undo previous bad edits, which need to be handled in some way too (this creates a potential for conflicts, as the fix may come a substantial amount of time later).

I suspect some of the above is why Mapbox uses a different granularity grouping of changes for their review process (see the SotM talk by Lukas).

In any case, all of the above potentially leads to your private fork of the planet getting out of sync real fast with the original, implying that applying diffs will become more problematic over time.  So you wouldn't be able to take your fixed and known-good planet fork, apply only good diffs, and expect to be able to continue to do that for a year or so.

IMHO the only thing that could really work in the OSM model is reverting real fast in the -original- dataset.

Naturally there is the other aspect that we want our contributors to gain experience and become better mappers over time. You are only going to get that if you leave the opportunity to make mistakes open and don't robo-fix everything that goes wrong.

Simon


Am 06.09.2018 um 02:56 schrieb Alan Brown:
Hi,

Perhaps I didn't express it clearly, but my interest was in the idea that certain. rather limited changelists could be flagged for moderation before they are put into main dataset.  There will always be things that seem like they should be blocked, but are actually appropriate.   In the interest of having the most accurate data, I'm not convinced this form of moderation can't have a role.  As I understand it, the virtue of OSM is to allow anyone to contribute accurate, detailed local knowledge about the places they know about; however, there's no value in having junk in the database for even a moment, if it can be avoided.  Place names are usually verifiable facts, even disputed place names.  So you don't want the open nature of OSM to compromise accuracy, or a quest for accuracy to discourage people from contributing accurate information.

I said my peace; I suspect the OSM community is not culturally disposed to that form of moderation. So I will ask about a different approach.

In my case, I've seen editing errors that affected motorway connectivity (not vandalism), that were made and corrected within a couple hours.  Pretty good - except our planet file was in that two hour window.  I want to avoid these errors, without getting caught in the errors of the next two hour window.

I'm not sure if Mapbox or others use a process like this, but this is what I can imagine:

PLANETcur is the current planet file
PLANETprev is the last used planet file
CHANGEcur-prev is a comprehensive list of changelists between the two datasets

A particular consumer of OSM data can automatically scan CHANGEcur-prev and/or PLANETcur for potentially troubling content, according to their own criteria.  In their local copy, if they detect something they do not want to accept - offensive place names, incomplete topology - they can attempt to revert - in their local copy only! - recent changes that violate their criteria.  They accept whatever mistakes their "reversion" algorithm makes.  The identified "questionable changelists"  can be submitted back to the OSM community to review and revert, but always by a human.
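
As a rough illustration of the scanning step described above, here is a minimal Python sketch. It assumes CHANGEcur-prev is available as an osmChange (.osc) XML document. The function name flag_changesets and the BLOCKED_WORDS list are hypothetical, made up for this example, and a plain substring match like this is far cruder than any real filter would need to be.

```python
import xml.etree.ElementTree as ET

# Hypothetical blocklist; each data consumer would supply its own criteria.
BLOCKED_WORDS = {"badword"}


def flag_changesets(osc_xml, blocked=BLOCKED_WORDS):
    """Return the ids of changesets that set a name tag containing a blocked word."""
    root = ET.fromstring(osc_xml)
    flagged = set()
    for action in root:              # <create>, <modify> or <delete>
        if action.tag == "delete":
            continue                 # a deletion cannot introduce a bad name
        for element in action:       # <node>, <way> or <relation>
            changeset = element.get("changeset")
            if changeset is None:
                continue
            for tag in element.findall("tag"):
                key = tag.get("k", "")
                value = tag.get("v", "").lower()
                is_name = key == "name" or key.startswith("name:")
                if is_name and any(word in value for word in blocked):
                    flagged.add(int(changeset))
    return flagged
```

The returned changeset ids are only candidates for review: they could drive a local-only revert, and the same list could be passed back to the community for a human to look at.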

My hope is that I am being completely unoriginal, and I can cobble together existing tools quickly. How unoriginal am I?

I am looking over the osmcha.mapbox.com page, and saw a reference to a utility called "osm-compare":   https://github.com/mapbox/osm-compare/blob/master/comparators/README.md - which has an obscenity filter.  If I understand this correctly, osm-compare flags changelists for review, and osmcha.mapbox.com allows people to review the flagged datasets and reverse bad edits.  Could someone define osm-compare filters that produce results that can be automatically pulled into a local copy?

(If a changeset has been reviewed by a second person - can that information be provided?)

All I want is something that allows me to be a little bit more conservative in accepting edits, without requiring complex processes or large resources.  A little insight would be appreciated.

Thanks,
Alan







On Wednesday, September 5, 2018, 7:52:46 AM PDT, Simon Poole [hidden email] wrote:


osmcha (osmcha.mapbox.com) already does most of this. While detecting
vandalism in general is difficult, edits like those in question are easy
to detect and small in number.

IMHO it really isn't an issue with openstreetmap in this case, as even
with the delay (somebody reported the user in question instead of
reverting and then reporting) in the specific case the vandalism was
swiftly removed. The reason that this is being discussed at all is
because of the edit resurfacing with a third party and having to be a)
detected, b) reported, and c) fixed again. Yes, we know this was a
glitch in the third party's workflow, but such glitches are bound to
happen, and we shouldn't pretend, given the large number of edits, that
any procedures put in place are going to be 100% effective, be it
directly with OSM or by third parties.

Simon


On 05.09.2018 at 16:23, Greg Troxel wrote:
> I tend to agree that automated systems are going to be not that useful.
>
> I tend to notice some things in my area, but it's hard to keep track.
>
> This makes me wonder about a tool that
>
>  - lets people sign up to watch edits, in some area, or in general,
>    sort of like maproulette.  Use some scoring system where new
>    mappers' edits are more likely to be looked at by somebody, and
>    people who claim an area as theirs are more likely to get shown
>    edits there, or maybe let people get all edits in some bbox
>
>  - lets people give a rating to a changeset, something like:
>        i) high priority for inspection by others
>        ii) worthy of being checked by a local
>        iii) probably ok
>        iv) definitely ok
>
>  - presents things to multiple people
>
>  - somehow uses a rater's own edit history to validate this (perhaps be
>    cautious about people with < 500 changesets, and very cautious < 50)
>
>
> This is a slippery slope to a reputation system, but I think in terms of
> culture, the fact that anybody can review is there already, and the
> bright line is needing permission to change things, vs a more efficient
> way of others looking over changes.
>
>
> Unfortunately my editor crashed and I lost the source code :-)

Re: NYC Name Vandalism

Andy Townsend
In reply to this post by Alan Brown-6
On 06/09/2018 01:56, Alan Brown wrote:
... I suspect the OSM community is not culturally disposed to that form of moderation. So I will ask about a different approach.

One thing the OSM community _is_ culturally disposed to is people trying things to see if they work, and in order to do that there's no need to do it on a planet-wide basis.

A place to start might be consuming diffs. If you're looking for "something that processes diffs that is relatively easy to understand", then https://github.com/zverik/regional is worth a look; an example of how it can be used is at https://github.com/SomeoneElseOSM/mod_tile/blob/zoom/openstreetmap-tiles-update-expire#L159 .  In that example "trim_osc.py" is called to remove data from a diff file based on geographical location prior to inclusion in a rendering database; you could create something similar based on some other criteria*.  https://wiki.openstreetmap.org/wiki/User:SomeoneElse/Ubuntu_1804_tileserver_load#Updating_your_database_as_people_edit_OpenStreetMap is the relevant bit of some "set up a tile server" instructions that call that; you could use those to create a tile server incorporating your filtering and see how it compared to "real OSM" after a few days.

Best regards,

Andy


* perhaps initially just a simple obscenity filter for place names, notwithstanding that that won't catch e.g. things drawn with water and roads to form letters - I've seen a couple of those recently.
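
To make the "something similar based on some other criteria" idea a bit more concrete, here is a minimal Python sketch of a diff filter that drops created or modified elements whose tags fail a caller-supplied check before the diff is applied locally. It is not how trim_osc.py works internally; filter_osc and is_acceptable are hypothetical names invented for this example, and it assumes the diff is a small osmChange XML document that fits in memory.

```python
import xml.etree.ElementTree as ET


def filter_osc(osc_xml, is_acceptable):
    """Return osmChange XML with unacceptable created/modified elements removed.

    is_acceptable is a caller-supplied function taking a dict of tags and
    returning True if the element may be applied to the local database.
    """
    root = ET.fromstring(osc_xml)
    for action in root:                  # <create>, <modify> or <delete>
        if action.tag == "delete":
            continue                     # deletions are passed through untouched
        for element in list(action):     # copy the list so removal is safe
            tags = {t.get("k"): t.get("v") for t in element.findall("tag")}
            if not is_acceptable(tags):
                action.remove(element)
    return ET.tostring(root, encoding="unicode")
```

Note that simply dropping elements means the local database no longer matches upstream for those objects (for example, a way in the diff may refer to a node that was dropped), so a sketch like this is only suitable for experiments of the "run it for a few days and compare" kind.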




Re: NYC Name Vandalism

Alan Brown-6
In reply to this post by SimonPoole

> In any case all of the above potentially lead to your private fork of the planet getting out of sync real fast with the original, implying that applying diffs will become
> more problematic over time.  So you wouldn't be able to take your fixed and known good planet fork, apply only good diffs, and expect to be able to continue to do that
> for a year or so.

No, that's not the idea at all.  It's to download planet files regularly, identify problematic recent changes, and revert only those.  You still need a history to revert.  Contribute back to the OSM community what the problematic attributes/objects are, and a human can revert them.
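
For illustration only, the local revert step might look something like the following Python sketch. It assumes both planet versions have already been reduced to dictionaries mapping an object id to its tags, which is a big simplification (real data also has geometry and references between objects). The name revert_flagged and its parameters are hypothetical.

```python
def revert_flagged(prev, cur, flagged_ids):
    """Return a copy of cur with flagged objects restored to their state in prev.

    prev and cur map object id -> tags for the previous and current planet.
    Objects that did not exist in prev (newly created ones) are dropped.
    """
    result = dict(cur)
    for obj_id in flagged_ids:
        if obj_id in prev:
            result[obj_id] = prev[obj_id]   # fall back to the last accepted version
        else:
            result.pop(obj_id, None)        # no history for it: leave it out
    return result
```

This is why the previous planet file is needed: without it there is nothing to fall back to, and the only option for a flagged object is to drop it.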

> IMHO the only thing that could really work in the OSM model is reverting real fast in the -original- dataset.

Fixing the original dataset *is* paramount.  However, it's important to understand that there's this transitional problem: any given version of the dataset will have defects, and catching the most egregious of these - particularly from vandalism - is really the only goal here.  Not every user of OSM data pulls from the database with high frequency; there are offline applications where it's "pull once, use for a few years".  You may be able to rely on the community to repair persistent high-profile issues, but these transitory issues are another matter.

> Naturally there is the other aspect that we want our contributors to gain experience and become better mappers over time. You are only going to get that if leave the
> opportunity to make mistakes open and don't robo-fix everything that goes wrong

It's not robo-fixing, it's "robo-flagging for moderation".  Fundamentally different thing.  Something that has been vetted could certainly violate any rules this flagging uses.

-Alan
