Last month, problems with the weekly planet dump impacted the website
and API.
Ironbelly, the primary site planet server, was upgraded to Ubuntu 20.04.
This revealed a bug in planet-dump-ng, the software that generates the
weekly planet dumps, which caused it to consume excessive memory,
exhausting resources. Ironbelly also serves as the NFS server for GPX
traces. The resource exhaustion caused NFS to stop responding, leading
to stalled website processes which caused an outage.
Several actions were taken as a result:
- The bug in planet-dump-ng was fixed
- Resource limits were placed on the planet dump generation process,
preventing bugs from consuming all the RAM on the machine
- Planned maintenance was done on ironbelly on August 15th to upgrade
the kernel and firmware
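
The notes above do not say which mechanism was used to limit the planet
dump process (it could equally be a systemd unit setting, a cgroup, or a
shell ulimit). As an illustration only, here is a minimal Python sketch of
one common approach on Linux: capping a child process's address space with
setrlimit before it starts. The function name run_with_memory_limit and the
example command are hypothetical and are not taken from the actual
deployment.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(cmd, max_bytes):
    """Run cmd as a child process with its address space capped at max_bytes.

    Hypothetical helper for illustration; not the mechanism used on ironbelly.
    """

    def apply_limit():
        # Runs in the child between fork() and exec(), so only the child
        # (and anything it spawns) is limited, not the calling process.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process cannot exceed the existing hard limit.
            cap = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    return subprocess.run(cmd, preexec_fn=apply_limit)


if __name__ == "__main__":
    # Example: a child that tries to allocate 2 GiB under a 1 GiB cap
    # fails with MemoryError instead of exhausting the machine.
    result = run_with_memory_limit(
        [sys.executable, "-c", "x = bytearray(2 * 1024**3)"],
        1024**3,
    )
    print("child exit code:", result.returncode)
```

With a limit like this, a runaway process fails on its own allocation
rather than starving other services on the same machine, such as NFS. Note
that preexec_fn is Unix-only and is not safe to use from a multithreaded
parent.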
In the medium term, the existing plan is to move GPX traces to an object store,
removing the dependency of the website on ironbelly. This is waiting on
Rails 6.1 being released.