I run many programs on different platforms; some are small scripts, some are large frameworks. I have been monitoring them with a collection of ad hoc scripts, which turned out to be hard to track and maintain. So I want to build a unified system for monitoring and alerting.
Zabbix and Nagios are not my choice; they are too heavy. What I need should meet these conditions:
- No agent. Some programs run in Docker, where installing an agent is inconvenient.
- Lightweight enough.
- Concise concepts and simple principles.
- Send metrics over UDP. The monitoring part should not affect the main program.
- Measure Anything, Measure Everything
After surveying the options, I chose Graphite. Sending a metric to Graphite is very simple, for example:
echo "secfree.test 4 `date +%s`" | nc server port
As shown above, a message has three components (key, value, timestamp) and can be sent over TCP or UDP (nc uses TCP by default; add -u for UDP). So I can easily send metrics from shell scripts, Python scripts, and Java programs. Keys are hierarchical, with levels separated by dots. After metrics are sent to Graphite, we can view them in graphite-web.
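The same three-component message can be sent from a Python script. This is a minimal sketch, not a definitive client: the function names `graphite_line` and `send_udp` are made up for illustration, and the default host and port (2003 is carbon's usual plaintext port) are assumptions. It also assumes carbon's UDP listener is enabled on the server.

```python
import socket
import time


def graphite_line(key, value, timestamp=None):
    """Format one plaintext message: "<key> <value> <timestamp>\n"."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{key} {value} {int(timestamp)}\n"


def send_udp(key, value, host="127.0.0.1", port=2003, timestamp=None):
    """Send one metric as a single UDP datagram (fire and forget)."""
    payload = graphite_line(key, value, timestamp).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical key; host and port are the assumed defaults above.
    send_udp("secfree.test", 4)
```

Because UDP does not wait for an acknowledgement, a slow or unreachable Graphite server does not block the main program; the trade-off is that a lost datagram is silently dropped.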
Graphite events can be used to track things that are not numeric, such as a deployment.
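An event is recorded by POSTing a small JSON document to graphite-web's `/events/` endpoint. The sketch below assumes that endpoint is reachable at a base URL you supply; `build_event` and `post_event` are illustrative names, and whether `tags` should be a list or a space-separated string depends on the graphite-web version, so treat the list form here as an assumption.

```python
import json
import time
import urllib.request


def build_event(what, tags, data="", when=None):
    """Return the JSON-serializable body for one Graphite event."""
    return {
        "what": what,   # short description of what happened
        "tags": tags,   # assumed list form; older versions take a string
        "data": data,   # free-form details
        "when": int(when if when is not None else time.time()),
    }


def post_event(base_url, event, timeout=5):
    """POST the event to <base_url>/events/ and return the HTTP status."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/events/",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

For example, `post_event("http://graphite.example.com", build_event("deploy", ["release"]))` would record a deployment, where the host name is a placeholder.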
With the collected metrics, we can run checks and send alerts when necessary. One optional tool is graphite-beacon, a simple alerting system for Graphite metrics.
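To illustrate the idea behind such a check (this is not graphite-beacon's implementation, only a hand-rolled sketch), the code below takes one series in the JSON shape returned by graphite-web's render API with `format=json`, where `datapoints` is a list of `[value, timestamp]` pairs, and compares the latest value against two thresholds. The function names and the threshold values are assumptions.

```python
def latest_value(series):
    """Return the most recent non-null value of one render-API series."""
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def check(value, warning, critical):
    """Classify a value against upper-bound thresholds."""
    if value is None:
        return "unknown"
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "normal"


if __name__ == "__main__":
    # Sample data in the render API's JSON shape; values are made up.
    sample = {
        "target": "secfree.test",
        "datapoints": [[1, 1700000000], [4, 1700000060], [None, 1700000120]],
    }
    print(check(latest_value(sample), warning=3, critical=10))
```

Skipping trailing nulls matters because the newest time slot is often still empty when the query runs.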