>>Anyways, there is no point of talking about who first, last, only
>>etc. All approaches using closed commercial software are pointless for
>>OSM - it cannot be reused. Everything can be done with open source so
>>that all code/algorithms are open and clear and there is no need to
>>pay piles of money for nothing.
The statement was not very wise, and in some respects it is wrong. Blindly trusting open source solutions is not the best approach for a newcomer, and especially not for a developer. From experience I know that sometimes a hint or a simple warning can help a developer change his way of thinking. Besides, reading someone else’s complex source code (such as generalisation-related source) is no simple exercise. If you have ever written complex basic software and tried to read it again after a break of several months, then you understand my point.
The vector data generalisation issue has come up for discussion on this forum many times over the years. Because vector map-making is not part of OSM’s strategy (an official answer given some time ago), the issue may be of interest only to private persons and institutions. However, some generalisation issues may still be of interest for OSM. After all, the majority of the source data has a vector interpretation.
If we agree that vector data generalisation is a procedure applied to downscaled source vector data, performing vector smoothing and object (or part-of-object) collapse, then we can add some comments to the referenced mail.
Generalising about questions like first, best and so on might be incorrect. For instance, vector smoothing (sometimes called simplification) algorithms have been known since the late 1980s and early 1990s. At that time, many countries started raster-to-vector transformation of the scanned data layer foils of their mapping authorities. These vector smoothing algorithms evolved radically over the years through experimental adjustment of many parameters. The best ones use a dynamic smoothing criterion for deciding when to replace a series of consecutive vectors with a single resultant vector. Obviously, if a smoothing algorithm works well on cadastre/land-office data (usually long vectors without fine curvature detail), it does not necessarily mean that it will work well on data with a lot of fine curvature detail, such as hydrographic data.
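To make the idea of replacing a series of consecutive vectors with one resultant vector concrete, here is a minimal sketch in Python. It uses the classic Douglas-Peucker scheme with a single fixed tolerance, chosen only because it is widely known; it is not the dynamic-criterion approach mentioned above, and the names `simplify` and `tolerance` are illustrative assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) tuples."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst_dist:
            worst_index, worst_dist = i, d
    if worst_dist <= tolerance:
        # The whole run is replaced by one resultant vector.
        return [first, last]
    # Keep the farthest vertex and simplify both halves separately.
    left = simplify(points[:worst_index + 1], tolerance)
    right = simplify(points[worst_index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (4, 3), (5, 0)]
fine = simplify(line, 0.1)    # small wiggles go, the sharp bend stays
coarse = simplify(line, 5.0)  # everything collapses to a single vector
```

With a fixed tolerance the same setting that suits long cadastre vectors can wipe out the fine detail of a river bank, which is one way to see why a data-dependent criterion is attractive.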
However, my major point here is to underline the big difference between doing data generalisation on OSM data and on government institution data. Namely, the natural object fragmentation in the OSM source data inevitably causes data generalisation problems, no matter what kind of model is applied or how advanced it is. Simply put, without defragmentation (reconstruction of the whole object) during data preparation, a correct and efficient collapse strategy is impossible. To avoid repeating related discussions from the past, I will just present some illustrative arguments. Most of the examples are screen dump images from today (which is an important reason why they are images and not links).
Object and part-of-object collapse is an essential part of any data generalisation (even in pure raster imaging). There are three basic strategies: size based, (object) class based and dynamic collapse. Because the size of fragments varies arbitrarily (it depends on the mapper), the size-based strategy causes unacceptable breaks. For instance, look at the Lena river here https://osm.org/go/9pITX-- and zoom out one step on the same view. The same is present in many other OSM based maps, like here https://goo.gl/2n2ycU and here https://goo.gl/FLMZbZ . In the case of the class-based collapse strategy, the usual asynchronous collapse may create confusion (large objects in one class disappear while small objects in another class are still present), like here https://goo.gl/Qkvybr and here https://goo.gl/XiSvc7. Finally, just to mention it again, fragmentation is a vector smoothing killer. As a rule, you cannot avoid the famous stripe effect. This was discussed and illustrated many times in the past, and it is difficult to avoid even in pure raster rendering. For instance, look at some stripes here https://goo.gl/T2rUXf (note that the stripes are not border/solid lines) or the same stripes in other maps, like here https://goo.gl/4DkBPb or here https://goo.gl/FcBmhb . Even if you do not see it, the stripe is there, just zoom in, like here https://goo.gl/wbmCy5 or here https://goo.gl/VaDwaQ.
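The effect of fragmentation on a size-based collapse can be shown with a toy example in Python. This is only a sketch under strong assumptions: each fragment is reduced to an area value, and the `object` key that says which whole object a fragment belongs to is hypothetical (recovering that link from real OSM data is exactly the hard defragmentation step). The function names and numbers are made up for illustration.

```python
from collections import defaultdict


def collapse_by_size(features, min_area):
    """Size-based collapse: keep only features whose area reaches min_area."""
    return [f for f in features if f["area"] >= min_area]


def defragment(fragments):
    """Rebuild whole objects by summing fragment areas per object key.

    Assumes every fragment already carries a reliable 'object' key,
    which real fragmented data usually does not provide.
    """
    totals = defaultdict(float)
    for f in fragments:
        totals[f["object"]] += f["area"]
    return [{"object": name, "area": area} for name, area in totals.items()]


# One river mapped as four arbitrary pieces, plus one small lake.
fragments = [
    {"object": "river", "area": 40.0},
    {"object": "river", "area": 3.0},
    {"object": "river", "area": 55.0},
    {"object": "river", "area": 2.0},
    {"object": "lake", "area": 4.0},
]

# Collapse applied directly to fragments: the two small river pieces
# vanish, leaving gaps in the middle of the river.
broken = collapse_by_size(fragments, min_area=5.0)

# Collapse applied after defragmentation: the river survives as a whole
# and only the genuinely small lake is dropped.
whole = collapse_by_size(defragment(fragments), min_area=5.0)
```

Here `broken` keeps only the 40.0 and 55.0 pieces of the river, while `whole` keeps the river as one object of area 100.0.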
In conclusion, applying data generalization on fragmented data is full of traps and usually ends up with errors.
2018-05-16 10:01 GMT+03:00 SandorS wrote:
> The statement was not quite wise and in some aspects it is wrong. Blindly
> trusting open source solutions is not the best thing for a newcomer
> especially not for a developer. By experience I know that sometimes a hint,
> a simple warning may help a developer to change his way of thinking.
> Besides, reading someone’s complex and complicated source (like the
> generalisation related source) is not just a simple exercise. If you ever
> wrote a complex basic sw and tried to read it after several months of brake
> then you understand what is my point.
I think there was some misunderstanding here. I only stated that
using commercial (or, to be more precise, closed) software is, in my
opinion, not the way to go for an open project like OSM. Results of
generalisation using commercial software can be used to refine the
approach (by looking, comparing), but the final software will probably
be open (and open GIS software is gaining ground fast anyway). For
example, the Netherlands example of using commercial software shows
that generalisation is successfully(?) done on PARTS of the data, so
now we know that we do not have to think of ways to generalise the
whole world at once.
Open data entered by volunteers does have a higher probability of
errors, but generalisation could be just one of many ways of detecting
such errors (and so fixing them).
Regarding other points. There are a lot of different operations done
on a lot of different types of objects in different sequences. But in
my opinion it is not "all or nothing". You can start with some
generalisation and progress with time.
For example, simple polygon aggregation/amalgamation is doable
now with good results, but should of course be improved with proper
polygon conflict resolution methods (already described in numerous
scientific papers, e.g. "Detecting and resolving size and proximity
conflicts in the generalization of polygonal maps", Bader and Weibel
Transport network collapsing (multi-lines to one line) can already
be done with standard PostGIS functions (buffer,
Building simplification is already in testing stage.
GRASS provides functionality for transport network displacement and
And these already give good and noticeable results. Of course a lot
of other things must be done, but we have the greatest luxury, time:
there are no deadlines. This is why some operations could be
implemented with even better results than closed implementations.