I recently migrated my munin instance onto a Linode virtual machine. With fewer resources available, problems were likely.
One problem I ran into early on was that a munin-cron run would sometimes still be executing when the next munin-cron was scheduled to start. The overlapping runs contended for resources, drove the load to ridiculous levels (35+), and once caused a reboot of the VM.
One solution I tried was to increase the time between runs from 5 minutes to 10 minutes. This still caused problems, as munin-cron runs sometimes weren’t done even after 10 minutes, and any longer interval between polls meant subpar graph granularity.
My final solution was to modify munin’s cron entry to the following:
*/7 * * * * munin test -z "$(pidof /usr/bin/munin-cron)" && test -x /usr/bin/munin-cron && nice -n 19 /usr/bin/munin-cron
This first tests whether a munin-cron process is still running. If none is, it checks that the munin-cron executable exists and, if it does, runs munin-cron with the nicest setting possible (niceness 19).
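The same "skip this run if the previous one is still going" idea can also be sketched with a lock directory instead of a PID lookup. This is a different technique from the pidof check above, shown only as an illustration: the function name run_exclusive, the exit code 75, and the lock path in the usage comment are all assumptions of mine, not part of munin. It relies on mkdir being atomic, so only one caller can create the directory at a time.

```sh
#!/bin/sh
# Hypothetical helper (not part of munin): run a command only if no
# other run currently holds the lock directory.
run_exclusive() {
    lockdir=$1
    shift
    # mkdir fails if the directory already exists, so it doubles as an
    # atomic "is another run active?" check.
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        rc=$?
        rmdir "$lockdir"
        return "$rc"
    fi
    echo "previous run still active, skipping" >&2
    return 75   # arbitrary "try again later" code chosen for this sketch
}

# Example use from a wrapper script (the lock path is an assumption):
#   run_exclusive /tmp/munin-cron.lock nice -n 19 /usr/bin/munin-cron
```

One caveat with this approach: if the job is killed before it reaches rmdir, the stale lock directory has to be removed by hand, whereas the pidof check needs no cleanup.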
This has solved the problem while still keeping granular graphs, and has significantly reduced the load on the virtual machine.
load average: 0.09, 0.42, 0.38