Build a Lightweight Unified Monitoring and Alerting System

I have a lot of programs running on different platforms; some are small scripts, some are large frameworks. I have been monitoring them with a pile of separate scripts, which turned out to be hard to track and maintain. So I want to build a unified system for monitoring as well as alerting.

Zabbix and Nagios are not my choice; they are too heavy. What I need should meet these conditions:

  1. No agent. Some programs run in Docker, where it is not convenient to install an agent.
  2. Lightweight enough.
  3. Concise concepts and simple principles.
  4. Send metrics over UDP. The monitoring part should not affect the main program.
  5. Measure Anything, Measure Everything

After surveying the options, I chose Graphite. Sending a metric to Graphite is very simple, for example:

echo "secfree.test 4 `date +%s`" | nc server port

As shown above, a message has three components (key, value, timestamp) and can be sent over TCP or UDP, so I can easily send metrics from shell scripts, Python scripts, and Java programs. The key is hierarchical, with levels separated by dots (secfree.test above). After sending metrics to Graphite, we can view them in graphite-web.

[Image: graphite-composer]
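The same three-component message can be sent from a Python script over UDP. The following is a minimal sketch rather than a definitive client: the function names `format_metric` and `send_metric_udp` are my own, and the host and port (127.0.0.1, 2003) are assumptions that must match your carbon configuration. Errors are swallowed on purpose, so that a monitoring failure never affects the main program.

```python
import socket
import time


def format_metric(key, value, timestamp=None):
    """Build one plaintext line: 'key value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{key} {value} {int(timestamp)}\n"


def send_metric_udp(key, value, host="127.0.0.1", port=2003, timestamp=None):
    """Fire-and-forget send of one metric over UDP.

    host and port are assumptions; use the values of your own carbon server.
    Returns the line that was (or would have been) sent.
    """
    line = format_metric(key, value, timestamp)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(line.encode("ascii"), (host, port))
    except OSError:
        # Monitoring must not break the main program, so ignore send errors.
        pass
    return line


if __name__ == "__main__":
    send_metric_udp("secfree.test", 4)
```

Note that carbon only accepts UDP when its UDP listener is enabled in carbon.conf (to my knowledge via `ENABLE_UDP_LISTENER`), so check that setting if nothing shows up in graphite-web.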

Graphite events can be used to track things that are not numeric.

[Image: graphite-events]
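An event can be recorded by posting a small JSON document to the `/events/` endpoint of graphite-web. The sketch below only uses the standard library. The helper names `build_event_request` and `post_event` and the URL `http://graphite.example.com` are hypothetical, and the payload fields (`what`, `tags`, `when`, `data`) follow the Graphite events documentation as I understand it; some older versions expect `tags` as a space-separated string instead of a list.

```python
import json
import time
import urllib.request


def build_event_request(base_url, what, tags, data="", when=None):
    """Build (but do not send) a POST request for Graphite's /events/ endpoint."""
    payload = {
        "what": what,    # short description of the event
        "tags": tags,    # list of tags, e.g. ["release"]
        "when": int(when if when is not None else time.time()),
        "data": data,    # free-form details
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/events/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_event(base_url, what, tags, data="", when=None, timeout=2.0):
    """Send the event and return the HTTP status code."""
    request = build_event_request(base_url, what, tags, data, when)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical server URL; replace it with your own graphite-web address.
    post_event("http://graphite.example.com", "deploy", ["release"], "version 1.2")
```

Unlike the UDP metric path, this is an HTTP call, so it is better made from a deploy script than from a hot code path.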

With the collected metrics, we can check them and alert when necessary. One option is graphite-beacon, a simple alerting system for Graphite metrics.
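To illustrate the check-and-alert idea, here is a hand-rolled sketch and not how graphite-beacon itself works. It assumes the graphite-web render API returns JSON of the form `[{"target": ..., "datapoints": [[value, timestamp], ...]}]` when called with `format=json`. The function names, the server URL, and the threshold are all hypothetical.

```python
import json
import urllib.parse
import urllib.request


def fetch_series(base_url, target, window="-5min", timeout=5.0):
    """Fetch recent datapoints for a target from the render API as JSON."""
    query = urllib.parse.urlencode(
        {"target": target, "from": window, "format": "json"}
    )
    url = f"{base_url.rstrip('/')}/render?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def latest_value(datapoints):
    """Return the most recent non-null value, or None if there is none."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


def check(series_list, threshold):
    """Return one alert message per series whose latest value exceeds threshold."""
    alerts = []
    for series in series_list:
        value = latest_value(series["datapoints"])
        if value is not None and value > threshold:
            alerts.append(f"{series['target']} = {value} exceeds {threshold}")
    return alerts


if __name__ == "__main__":
    # Hypothetical server URL and threshold.
    for message in check(fetch_series("http://graphite.example.com", "secfree.test"), 3):
        print(message)
```

Null datapoints are skipped because the newest time slots are often still empty when the query runs.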

References:

  1. Measure Anything, Measure Everything
  2. Tracking Every Release
  3. Counting & Timing
  4. Monitoring 101: Collecting the right data
  5. Monitoring 101: Alerting on what matters