Request for review of plan for scripted edit

Alex Hennings
Community,

I'm planning a scripted change and would like feedback. Plans are outlined here:
https://wiki.openstreetmap.org/wiki/Automated_edits/blackboxlogic

I'd appreciate feedback or questions in the 'Discussion' portion of that wiki page, or within this email list.

Thanks,
-Alex

_______________________________________________
Talk-us mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/talk-us

Re: Request for review of plan for scripted edit

Jmapb
On 8/8/2019 1:28 PM, Alex Hennings wrote:
> Community,
>
> I'm planning a scripted change and would like feedback. Plans are
> outlined here:
> https://wiki.openstreetmap.org/wiki/Automated_edits/blackboxlogic
>
> I'd appreciate feedback or questions in the 'Discussion' portion of
> that wiki page, or within this email list.

Hi Alex!

First, a possible typo: I think "Nodes, Ways and References" should be
"Nodes, Ways and Relations"?

I'm a fan of the +1-xxx-xxx-xxxx format, since it's the only standard
format that's visually intuitive to North American users. I often switch
numbers to this format when I make updates to an existing POI.

Personally, though, I've always felt a little uneasy about automated
updates like this because they give a false impression of the freshness
of the data. If it's been five years since any "real" updates to a POI,
I'd rather that the date of last update reflected that. It's hard to
gauge the community consensus on this issue, but IMO running this on
POIs that have been manually updated (i.e. not by a mass edit) in the last
6 months would be fine.

Regarding the single area code question... now that cell phones, VoIP,
and nationwide calling plans are ubiquitous, the idea that a certain
area code refers to a certain area is steadily eroding. I have started
to see a few businesses with out-of-state phone numbers on their
signs... but at this point it's still more likely that an out-of-state
area code is an error or SEO spam. I'd suggest that these would go into
your "Manually review or flag" category.

Regardless, the idea that an area can have a single "traditional" area
code is still true. Personally I have no problem with prepending the
traditional area code onto 7-digit phone numbers. (I do it all the time
in manual mapping.)
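
A minimal sketch of the normalization discussed above, written in Python rather than the C# of Alex's tools and not taken from them. The function name `normalize_us_phone` and the `default_area_code` parameter are hypothetical. The assumption is that a value either reduces to a 10-digit North American number (optionally with a leading 1), or is a 7-digit number in a region with one traditional area code; anything else returns None and would fall into the "Manually review or flag" category.

```python
import re
from typing import Optional


def normalize_us_phone(raw: str, default_area_code: Optional[str] = None) -> Optional[str]:
    """Return raw as +1-xxx-xxx-xxxx, or None if it needs manual review."""
    # Keep only the digits; punctuation, spaces and letters are dropped.
    digits = re.sub(r"[^0-9]", "", raw)

    if len(digits) == 11 and digits.startswith("1"):
        # Drop the leading country code so only the 10-digit number remains.
        digits = digits[1:]
    elif len(digits) == 7 and default_area_code is not None:
        # Assumption: the region has a single traditional area code (e.g. 207).
        digits = default_area_code + digits

    if len(digits) != 10:
        # Too short, too long, or a 7-digit number with no default: flag it.
        return None

    return f"+1-{digits[0:3]}-{digits[3:6]}-{digits[6:]}"
```

For example, `normalize_us_phone("555-0123", "207")` gives `+1-207-555-0123`, while the same call without a default area code gives None.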

Finally, thanks for posting your tools... I see these are written in
CSharp, which I'm only tangentially familiar with. What sort of
environment would one need to build these?

Thanks, Jason

Re: Request for review of plan for scripted edit

Kevin Broderick
I'm of mixed feelings on the apparent freshness, but as long as the guidelines are followed so changesets are of reasonable size and easily identified as scripted, I don't see much of an issue.

While having an automated script make assumptions causes me to twitch a little, the reality is that a human is going to make the same assumption. If I see a seven-digit number on a sign and want to dial it, I'm going to assume that the area code is 207; if I'm across the state line in New Hampshire, I'm going to assume 603. If that assumption isn't correct, the source data is bad anyhow, and adding the implicit area code isn't making it substantially worse. Have you been able to discern how many seven-digit numbers are in the system?

--
Kevin Broderick


Re: Request for review of plan for scripted edit

Alex Hennings
Fixed: references -> relations.

Noted: "False impression of data freshness". I hadn't considered this and I would like more opinions.

Regarding the "single area code question": I think you're talking about 7-digit numbers, and my plan to optimistically prepend an area code in cases like Maine, where there is only one area code. I acknowledge that this is the weakest of the assumptions, but I thought of it as a safe guess and a net positive change. For context on why I feel confident: living in Maine, where we have only one area code, locals omit the area code because we don't need it when dialing locally.

Regarding "how many seven-digit numbers": good question! There are 6 in Maine and 163 in the USA, counted with this Overpass query:
// Count objects whose phone tag contains exactly seven digits.
(
  node
    ['phone'~'^([^0-9]*[0-9]){7}[^0-9]*$']
    (area:3600063512); // Maine; or 3609331155 for usa
  way
    ['phone'~'^([^0-9]*[0-9]){7}[^0-9]*$']
    (area:3600063512);
  relation
    ['phone'~'^([^0-9]*[0-9]){7}[^0-9]*$']
    (area:3600063512);
);
out count;
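
To see what the pattern in the query selects, it can be tried on a few made-up values. This is only an illustrative sketch in Python: Overpass evaluates POSIX extended regular expressions, but this particular pattern should behave the same under Python's `re` module. The helper name `has_exactly_seven_digits` is hypothetical.

```python
import re

# Same pattern as in the query: seven digits in total, each optionally
# preceded by non-digits, followed by any non-digit tail.
SEVEN_DIGITS = re.compile(r"^([^0-9]*[0-9]){7}[^0-9]*$")


def has_exactly_seven_digits(value: str) -> bool:
    """True if the tag value contains exactly seven digits overall."""
    return SEVEN_DIGITS.match(value) is not None
```

Under that assumption, `555-0123` and `555 0123 ext` are selected, while `207-555-0123` and `+1-207-555-0123` are not.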

The C# tools I'm building should work with any C# IDE. I use Visual Studio, which is free and available on macOS and Windows. I'm happy to help if you want to get them up and running.

-Alex


Re: Request for review of plan for scripted edit

USA mailing list
Given the low numbers of 7-digit numbers I recommend correcting them manually rather than writing code to do it.


Re: Request for review of plan for scripted edit

Bryce Jasmer
In reply to this post by Kevin Broderick
I’m really opposed to this idea of scaring people away from editing objects with the “data freshness” boogie man argument. If someone really cares about freshness, the entire history of an object is available to you. 


Re: Request for review of plan for scripted edit

Mike N.
In reply to this post by USA mailing list
On 8/8/2019 5:25 PM, Paul Norman via Talk-us wrote:
> Given the low numbers of 7-digit numbers I recommend correcting them
> manually rather than writing code to do it.

   On this one I'm not sure how introducing an error-prone keyboarding
exercise into the mix is an improvement over a programmatic solution.
At least with the programmatic solution, a typo applies to all and is
more easily spotted.


data freshness boogie man (was: request for review of plan for scripted edit)

Jmapb
In reply to this post by Bryce Jasmer
On 8/8/2019 5:52 PM, Bryce Jasmer wrote:

> I’m really opposed to this idea of scaring people away from editing
> objects with the “data freshness” boogie man argument. If someone
> really cares about freshness, the entire history of an object is
> available to you.

That's true for any single object. But what if you want to query for
stale data in a given area, in preparation for a survey? Accessing each
object's history makes the process exponentially more complicated. If
you can manage to do that, then you could try to skip edits by known bot
accounts... but there are always more. You can try to filter out
changesets with bot=yes, which well-behaved bots will set. But lots of
people make these wide-ranging edits semi-manually using JOSM or Level0.
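
The filtering described above can be sketched roughly as follows, in Python. Everything here is an assumption for illustration: the function name `last_human_edit` is hypothetical, the version records are plain dicts with "changeset" and "timestamp" keys (loosely modelled on what the OSM API returns for an object's history), and the changeset tags are assumed to have been fetched separately. It shares the limitation noted above: a semi-manual mass edit without bot=yes still counts as a human edit.

```python
from datetime import datetime
from typing import Dict, List, Optional


def last_human_edit(
    versions: List[dict],
    changeset_tags: Dict[int, Dict[str, str]],
) -> Optional[datetime]:
    """Return the timestamp of the newest version not made in a bot=yes changeset."""
    human_times = [
        datetime.strptime(v["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        for v in versions
        # Changesets with unknown tags are treated as human edits.
        if changeset_tags.get(v["changeset"], {}).get("bot") != "yes"
    ]
    # None means every version came from a self-declared bot.
    return max(human_times, default=None)
```

An object whose only recent version comes from a bot=yes changeset would then still report the date of its last surveyed edit.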

I understand that this particular objection to armchair data cleanup is
far from universal. But the compulsive reformatting of incorrect data
makes me roll my eyes a bit. I find it useless bordering on ridiculous
when I see a mapped restaurant that's been gone for years, but someone's
added branding tags, someone's prefixed the website with http://,
someone's reformatted the phone number, someone's fixed the opening
hours, someone's corrected the cuisine... but nobody has bothered to see
if the place actually still exists.

I think my prejudice stems from reading this cautionary tale on the wiki
in my OSM infancy:
https://wiki.openstreetmap.org/wiki/What%27s_the_problem_with_mechanical_edits%3F
(I see now that this was originally written by Frederik Ramm, though I
had no idea who that was at the time.)

The bottom line, though, is that a well-planned, well-discussed, and
well-behaved bot is by far the *best* way to make these sorts of edits,
if someone feels they must be made. My preference would be to only touch
recently edited objects but that's by no means a dealbreaker.

Jason

