In my last post, we discovered how to customize some of Datadog’s pre-packaged integrations to quickly build actionable insights for SQL-backed applications. In this post, we are going to dive a bit deeper and look at how we might integrate Datadog with pre-built or custom metrics tooling, such as a shell script.
As I wrapped up the last post, I alluded to a metric “gathered via a different route;” this metric keeps a count of the number of individual client processes running on the host. It is generated using a lightweight shell script called from a cron job; for a general point of reference, the shell script looks like this:
num_procs=$(ps awx | grep textract.py | grep -v grep | wc -l)
echo $num_procs
I called this script textract-count and placed it in my PATH so that I can call it without path prefixing.
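As an aside, the grep -v grep step can be avoided with a common bracket trick. The following is only a sketch of that alternative, not the script I actually run; guard_pattern and count_procs are hypothetical names, and it assumes the pattern starts with a literal character and that ps accepts the same awx flags as above.

```shell
#!/usr/bin/env bash
# guard_pattern (hypothetical helper): wrap the first character of a pattern in
# brackets, e.g. textract.py -> [t]extract.py. The regex still matches
# "textract.py", but not the grep command line that contains the brackets.
guard_pattern() {
  local pattern=$1
  printf '[%s]%s' "${pattern:0:1}" "${pattern:1}"
}

# count_procs (hypothetical helper): count processes whose command line
# matches the pattern, without counting the grep process itself.
count_procs() {
  # grep -c still prints 0 when nothing matches, but exits non-zero,
  # hence the "|| true".
  ps awx | grep -c -- "$(guard_pattern "$1")" || true
}
```

Calling count_procs textract.py would then stand in for the two-line script above.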
Before discovering how easy it is to include and alert on these types of metrics in Datadog, I had my cron job push a quick message to an SNS topic, which pinged my mobile whenever the number fell below a tolerable threshold. In the course of building out the dashboard from the previous post, I discovered it is quite trivial to push the value to Datadog and wrap it in enough context to expose it for alerting and dashboarding.
Datadog has a blog post covering the mechanics of constructing the (text-based) message and how to send it to the DogStatsD daemon running locally (which is a prerequisite to using this method), so I am not going to rehash that content here. The only issue I encountered with their documentation is where the writer employs this syntax to echo out his string to localhost:8125:
$ echo -n "datadogstring" >/dev/udp/localhost/8125
The /dev/udp path is not a real device file; it is a special filename that bash itself interprets in redirections. In fact, bash specifically has to be compiled with this feature enabled. My Ubuntu distro did not have this feature in bash (nor did I care to recompile bash). I was able to use netcat to work around the issue like this:
$ echo -n "datadogstring" | nc -4u -w1 localhost 8125
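If you are not sure whether a given host's bash was built with /dev/udp support, one option is to try the redirection first and fall back to netcat. The following is a minimal sketch of that idea; send_udp is a hypothetical helper name (not something from Datadog's documentation), and the fallback path assumes nc is installed.

```shell
#!/usr/bin/env bash
# send_udp (hypothetical helper): send a payload over UDP, preferring bash's
# /dev/udp redirection and falling back to netcat when that is unavailable.
send_udp() {
  local payload=$1 host=${2:-localhost} port=${3:-8125}

  # On a bash built without network redirections this open fails, so the
  # error message is silenced and we fall through to nc.
  if { printf '%s' "$payload" >"/dev/udp/$host/$port"; } 2>/dev/null; then
    return 0
  fi

  # Same flags as the netcat command above: IPv4, UDP, 1-second timeout.
  printf '%s' "$payload" | nc -4u -w1 "$host" "$port"
}
```

With this in place, send_udp "datadogstring" should behave the same on either kind of bash build.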
Netcat is available for most distros, is a quick yum install or apt-get install away, and is probably much easier than recompiling bash. With this solution able to effectively push a metric, I swapped out my SNS-based cron task for one that simply runs an echo command like the above and looks like this:
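#!/usr/bin/env bash
# cron runs with a minimal PATH, so add /usr/local/bin
PATH=$PATH:/usr/local/bin
# Send the process count to the local DogStatsD daemon as a gauge (|g), tagged document_parser
echo -n "extractions.processes.by_host.$HOSTNAME:`textract-count`|g|#document_parser" | nc -4u -w1 localhost 8125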
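The quoted string in that script follows the DogStatsD plain-text format of metric.name:value|type|#tags, where g marks a gauge. As a rough sketch of how the cron script could avoid sending a malformed value (for example, if textract-count fails and prints nothing), the helpers below validate the count and assemble the string. The function names is_count and gauge_message are hypothetical and not part of any Datadog tooling.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the names are hypothetical.

# is_count: succeed only when $1 is a non-negative integer.
is_count() {
  [[ $1 =~ ^[0-9]+$ ]]
}

# gauge_message: print "name:value|g" plus an optional "|#tag1,tag2" suffix.
# Returns non-zero (and prints nothing) when the value is not a valid count.
gauge_message() {
  local name=$1 value=$2
  shift 2
  is_count "$value" || return 1
  local msg="${name}:${value}|g"
  if (( $# > 0 )); then
    local IFS=,            # "$*" joins the remaining arguments with commas
    msg+="|#$*"
  fi
  printf '%s' "$msg"
}

# Usage sketch (textract-count is the script from earlier in this post):
#   msg=$(gauge_message "extractions.processes.by_host.$HOSTNAME" "$(textract-count)" document_parser) &&
#     printf '%s' "$msg" | nc -4u -w1 localhost 8125
```

The design choice here is simply to skip the push when the count is unusable, rather than send a string DogStatsD cannot parse.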
Now that the metrics are available in Datadog, they are part of a wider ecosystem of available metrics, such as EC2 and host-level metrics, making it much easier to analyse issues when they arise and perform RCAs when things go awry. Also, now I can add it to my dashboard!