Investigating a NSW Schools Import

Investigating a NSW Schools Import

Andrew Harvey-3
I'm investigating the possibility of importing NSW Public Schools data into OpenStreetMap in line with the Import Guidelines.

At this stage I'm seeking buy-in from the local community, as well as feedback on my plan, before taking it to the imports list. Please refrain from jumping the gun by importing this data before this review has been completed.

The data is CC BY 4.0 licensed (although the link above says only CC BY, I've confirmed via email that it is CC BY 4.0) and the OSMF CC BY waiver has been completed.

1. Attribute Mapping

Here is a sample of the attribute mapping I've applied; I'd appreciate any feedback on it.

amenity=school
access=private
    // Although generally on public land, access to public school grounds is similar to any other private property: you can be there by invitation and you can walk up to the front door, but the owner/school can ask you to leave at any time.
addr:city=Crows Nest
    // suburb
addr:postcode=2065
capacity=927
    // Based on enrolment numbers. Although the two aren't necessarily equal, I think in lieu of better information it serves as a good placeholder.
contact:email=northsydbo-h.sch[hidden email]
website=http://www.northsydbo-h.schools.nsw.edu.au
    // derived from the email, but looks fine
contact:fax=+61 2 9957 6310
contact:phone=+61 2 9955 1565
fee=no
    // Public schools don't have compulsory fees to attend, unlike most private schools.
isced:level=2-3
    // Upstream uses Infants, Primary, Secondary, which are mapped to the https://wiki.openstreetmap.org/wiki/Key:isced:level levels 0, 1, 2-3 respectively.
grades=7-12
name=North Sydney Boys High School
operator=NSW Department of Education
    // being part of the NSW Public Schools dataset implies they are operated by the NSW Department of Education, which in turn implies they are public schools
ref:au.gov.nsw.cese=8132
ref:au.gov=7614
    // these references, although not necessary, might make it easier to keep data in sync with future upstream updates
school:gender=male
  // mixed, male, female
school:selective=yes
  // yes, no, partial
school:specialty=comprehensive
  // agricultural, languages, arts, comprehensive (most)
start_date=1915-01-01
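
As a rough illustration of the mapping above, here is a minimal Python sketch. It is not the code from the import repository. The upstream field names (`name`, `suburb`, `postcode`, `level`) are hypothetical placeholders, and only a handful of the tags listed above are covered.

```python
# Sketch only: the upstream field names used here are hypothetical
# placeholders, not the real column names of the NSW dataset.
ISCED_BY_LEVEL = {"Infants": "0", "Primary": "1", "Secondary": "2-3"}


def to_osm_tags(record: dict) -> dict:
    """Translate one upstream school record into a dict of OSM tags."""
    tags = {
        "amenity": "school",
        "name": record["name"],
        "addr:city": record["suburb"],  # the suburb goes into addr:city
        "addr:postcode": record["postcode"],
        "operator": "NSW Department of Education",
        "fee": "no",
    }
    # Only set isced:level when the upstream level is one we know how to map.
    level = ISCED_BY_LEVEL.get(record.get("level", ""))
    if level is not None:
        tags["isced:level"] = level
    return tags


sample = {
    "name": "North Sydney Boys High School",
    "suburb": "Crows Nest",
    "postcode": "2065",
    "level": "Secondary",
}
for key, value in sorted(to_osm_tags(sample).items()):
    print(f"{key}={value}")
```

Keeping the level lookup in a plain dict makes it easy to review the Infants/Primary/Secondary mapping separately from the rest of the logic.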

One question is whether we should apply the source tag to the changeset or the object. What should we do with the existing source tag? If it's obvious that it relates to the geometry, I suggest it be moved to source:geometry; otherwise I'd suggest it be deleted (it's still there in the history). A specific source:name tag should be retained, though.

I'm proposing to use this tag on the changeset:

source=NSW CESE Public Schools Master Dataset

Along with a comment pointing to this thread.

2. Import Plan

To make importing the data easier, I've put together a basic web application at https://andrewharvey.github.io/au-nsw-public-schools-to-osm/diff.html. Matches are identified by distance to the nearest school within 200m, and for each match you can see which tags will change. It uses the JOSM Remote Control to load the change into JOSM, where the final upload(s) will take place.
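
The nearest-school matching described above could be sketched as follows. This is an illustrative Python version, not the web application's actual code: it assumes both datasets have been reduced to points with `lat` and `lon` keys (hypothetical names), uses the haversine formula for distance, and does a simple linear scan rather than a spatial index.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_within(upstream, osm_schools, max_m=200.0):
    """Return the closest OSM school within max_m metres, or None."""
    best, best_d = None, max_m
    for candidate in osm_schools:
        d = haversine_m(upstream["lat"], upstream["lon"],
                        candidate["lat"], candidate["lon"])
        if d <= best_d:
            best, best_d = candidate, d
    return best
```

A school with no OSM candidate inside the 200m radius comes back as `None`, which corresponds to the "unmatched" upstream features in the stats below.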

My import plan is to go through this and apply the changes or edit them manually as necessary. In cases where tags conflict, I plan to open changeset comments to ask the author to determine what to do.
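
To make "tags conflict" concrete, here is a small Python sketch of a tag diff between an existing OSM object and the proposed tags. The function name and the example values are illustrative assumptions, not taken from the import tooling.

```python
def diff_tags(existing: dict, proposed: dict):
    """Split proposed tags into plain additions and conflicts.

    added:     keys missing from the existing object, safe to add.
    conflicts: keys present on both sides with different values,
               mapped to (existing_value, proposed_value) for review.
    """
    added = {k: v for k, v in proposed.items() if k not in existing}
    conflicts = {
        k: (existing[k], v)
        for k, v in proposed.items()
        if k in existing and existing[k] != v
    }
    return added, conflicts


# Made-up example values for illustration only.
existing = {"amenity": "school", "name": "Foo Primary School"}
proposed = {"amenity": "school", "name": "Foo Public School", "fee": "no"}
added, conflicts = diff_tags(existing, proposed)
print("add:", added)
print("review:", conflicts)
```

Tags that are identical on both sides appear in neither result, so only the additions and the cases needing a human decision are surfaced.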

I plan to use a dedicated imports account.

I plan to add the attribution to the Contributors page.

Most public primary schools have interchangeable names, "Foo Primary School" and "Foo Public School". If a different name is already tagged, I propose we move it to alt_name. That makes the names consistent, but also means that the name tag might not match the name used on the ground. What do people think about this?

All the code and cached versions of the data files are available at https://github.com/andrewharvey/au-nsw-public-schools-to-osm.

Some stats...

Total features from OSM: 2636 (1649 matched, 987 unmatched)
Total features from Upstream: 2209 (1713 matched, 496 unmatched)

Of most interest are those 496 features from the upstream dataset not found in OSM, but the other 1713 are still of interest as they add a lot of missing tags to the existing objects in OSM. I haven't yet looked through the "schools" that we have in OSM but that the dataset doesn't have, because the vast majority will be private schools, and not things we need to investigate further.
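
As a quick sanity check on the stats above, this short Python snippet confirms that matched plus unmatched equals the stated totals and prints the matched share of each dataset. It uses only the numbers quoted in this message.

```python
osm = {"matched": 1649, "unmatched": 987, "total": 2636}
upstream = {"matched": 1713, "unmatched": 496, "total": 2209}


def matched_share(counts: dict) -> float:
    """Return matched / total after checking the counts add up."""
    if counts["matched"] + counts["unmatched"] != counts["total"]:
        raise ValueError("matched + unmatched does not equal total")
    return counts["matched"] / counts["total"]


for label, counts in (("OSM", osm), ("Upstream", upstream)):
    print(f"{label}: {matched_share(counts):.1%} matched")
```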

I'm aware there are a number of "schools" in the upstream data we might not consider schools for OSM, for example "Field of Mars Environmental Education Centre" and "Royal National Park Environmental Education Centre", as the students attending these are likely on excursion. However, given the wiki says "place where pupils, normally between the ages of about 6 and 18 are taught under the supervision of teachers.", I think we should include them.

There are a number of schools for the disadvantaged, disabled, etc. and in-hospital schools not currently mapped in OSM; importing this data means we can include more of these kinds of schools. Unfortunately they aren't tagged as such in the upstream data, so they appear the same as any other school, but I still think it's better to have them than not.

_______________________________________________
Talk-au mailing list
[hidden email]
https://lists.openstreetmap.org/listinfo/talk-au

Re: Investigating a NSW Schools Import

Andrew Harvey-3
I've received some feedback from the imports list and made some changes to the plan.

1. The ref tags won't be imported as they can in some instances discourage local edits. They make the object feel less like a native OSM object and editors may feel it's a special imported piece of data that they can't touch. I certainly don't want this imported data to feel like that, and given these IDs aren't that important (feel free to identify any use cases where they are helpful) I'll skip them initially.

2. The access=private tag won't be imported; this is best surveyed on an individual basis.

3. capacity changed to school:enrolment to better reflect what this number actually is. Enrolment could be lower or greater than the capacity.

4. I'll put the source tags on the changeset, moving existing source tags to source:geometry where the source is imagery, or leaving them in place if the source is survey.

5. I'll leave in place existing "Foo Primary School" names as what's on the ground should prevail. "Foo Public School" where "Foo Primary School" exists will be placed in the alt_name. These can always be changed on an individual basis with local knowledge of the school.
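
The five changes above can be sketched in Python roughly as follows. This is an illustration under stated assumptions rather than the code actually used: the `imagery_hints` keywords for recognising an imagery source are hypothetical, sources that are neither imagery nor survey are simply left untouched, and an existing alt_name is never overwritten.

```python
# Changes 1 and 2: keys that are no longer imported.
DROPPED_KEYS = {"access", "ref:au.gov.nsw.cese", "ref:au.gov"}


def revise_proposed(proposed: dict) -> dict:
    """Apply changes 1-3 to the freshly mapped upstream tags."""
    revised = {k: v for k, v in proposed.items() if k not in DROPPED_KEYS}
    if "capacity" in revised:
        # Change 3: the number is an enrolment figure, not a capacity.
        revised["school:enrolment"] = revised.pop("capacity")
    return revised


def move_imagery_source(existing: dict, imagery_hints=("bing", "imagery")) -> dict:
    """Change 4: move an imagery source tag to source:geometry.

    imagery_hints is a hypothetical keyword list; anything else,
    including source=survey, is left as it is.
    """
    merged = dict(existing)
    source = merged.get("source", "")
    is_imagery = any(hint in source.lower() for hint in imagery_hints)
    if is_imagery and "source:geometry" not in merged:
        merged["source:geometry"] = merged.pop("source")
    return merged


def merge_name(existing: dict, upstream_name: str) -> dict:
    """Change 5: keep the existing name, record upstream as alt_name."""
    merged = dict(existing)
    current = merged.get("name")
    if current is None:
        merged["name"] = upstream_name
    elif current != upstream_name:
        merged.setdefault("alt_name", upstream_name)
    return merged
```

Each function returns a new dict rather than modifying its input, so the before and after states can still be compared when reviewing a change.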

I'll give it a few more days for feedback; if I don't get anything back I'll start the import as planned.

On 23 April 2018 at 21:19, Andrew Harvey <[hidden email]> wrote:
[...]


Re: Investigating a NSW Schools Import

Andrew Harvey-3
I've done this import. Where existing data conflicts I've mostly left it in place; sometimes, when it's obviously wrong, I've fixed it.

There were a lot of instances where some data existed but in a similar tag, e.g. website, contact:website and url. As much as possible I've tried to leave the existing key in place.
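
One way to express the "leave the existing key in place" rule for near-equivalent keys is sketched below in Python. The function name is hypothetical and the key list covers only the three keys mentioned above; the import itself was reviewed by hand in JOSM.

```python
# The near-equivalent keys named in this message.
WEBSITE_KEYS = ("website", "contact:website", "url")


def set_website(existing: dict, url: str) -> dict:
    """Add a website tag only if no equivalent key is already present."""
    merged = dict(existing)
    if any(key in merged for key in WEBSITE_KEYS):
        # An equivalent key already exists: keep its key and value as they are.
        return merged
    merged["website"] = url
    return merged


url = "http://www.northsydbo-h.schools.nsw.edu.au"
print(set_website({"amenity": "school"}, url))
print(set_website({"amenity": "school", "contact:website": url}, url))
```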

A few were omitted as I lack the local knowledge I feel necessary to add them to OSM: Aurora College, Saturday School of Community Languages, Central Sydney Intensive English High School, plus some others where I've left notes.



On 28 April 2018 at 11:20, Andrew Harvey <[hidden email]> wrote:
[...]
