Adding a custom metric
After you enable monitoring on a machine, ClearGLASS provides some default metrics for the server's health: CPU, RAM, disk, network, etc. You can also add extra plugins from the long list of existing collectd plugins. Sometimes, however, you need a custom metric to graph (and set alerts on) something more specific, e.g. the result of a query in your database. In this example we will create a custom metric that checks whether a site contains a text string, but you could choose to monitor virtually anything.
We are going to create a custom metric to check that a site - redmine.engagemedia.org - loads and that the page body contains a specific text string. The monitoring graph on ClearGLASS will show the time it takes for the page to load. If the site is inaccessible or the response does not contain the text, the graph will show a negative value (-1). We can also set a rule to get alerted in such a case.
Custom metrics are deployed through the python plugin of collectd. The ClearGLASS enable-monitoring script makes sure this plugin is installed for you, since it is an extra package on some Linux distributions.
Case 1: ClearGLASS has shell access to the server
If ClearGLASS already has SSH access to the server, the process is much simpler. On the machine page, select 'Add graph', then 'custom', give it a name and paste the following code into the python script box:
import time

try:
    from urllib2 import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen   # Python 3

URL = 'https://mysite.org/'
TEXT = 'Welcome to the company intranet'

def read():
    start = time.time()
    try:
        nf = urlopen(URL)
        page = nf.read()
        nf.close()
    except Exception:
        # Site unreachable or the request failed
        return -1
    end = time.time()
    # The response body is bytes on Python 3, so compare against encoded text
    if TEXT.encode('utf-8') in page:
        return end - start
    # TEXT not found
    return -1
Then press deploy. The script checks whether https://mysite.org/ is accessible and whether the specific text string appears in the page.
Key things to remember about custom scripts: they must be written in python, so make sure the indentation is correct, otherwise they will fail to deploy and run. ClearGLASS treats whatever read() returns as the output of the script, so make sure this function exists and returns the value you want to monitor.
After a few seconds you will see a graph showing the time taken for the site to respond (or negative values if the site is inaccessible or the string cannot be found).
To get notified, click 'Add rule', select the newly added metric and set a condition, e.g. the value drops below 0.
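As a further illustration of the read() contract, here is a minimal sketch of a different custom metric: the number of entries in a directory. The path WATCHED_DIR and the helper count_entries() are hypothetical names chosen for this example, not part of ClearGLASS; the only thing taken from the text above is that read() must exist and return the number to be graphed.

```python
import os

# Hypothetical directory to watch; replace it with a path on your server.
WATCHED_DIR = '/var/spool/myapp/queue'


def count_entries(path):
    """Return the number of entries in path, or -1 if it cannot be listed."""
    try:
        return len(os.listdir(path))
    except OSError:
        # Missing directory or insufficient permissions
        return -1


def read():
    # Whatever read() returns is the value that gets graphed.
    return count_entries(WATCHED_DIR)
```

Returning -1 on failure mirrors the site check above, so the same "value drops below 0" rule could be reused for alerting.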
Case 2: ClearGLASS does not have shell access to the server
If you haven't associated a key with this server, first make sure that the ClearGLASS collectd conf file on your server - /opt/mistio-collectd/collectd.conf - has the string
Create the folder /opt/mistio-collectd/plugins/mist-python/, then edit the file /opt/mistio-collectd/plugins/mist-python/include.conf and insert the name of the metric, in this case check_mysite_org. Make sure the name contains only letters and underscores, with no spaces or other special characters.
<LoadPlugin python>
    Globals true
</LoadPlugin>
<Plugin python>
    ModulePath "/opt/mistio-collectd/plugins/mist-python/"
    LogTraces true
    Interactive false
    Import check_mysite_org
</Plugin>
Now create a file at /opt/mistio-collectd/plugins/mist-python/check_mysite_org.py
with the following content:
#!/usr/bin/python
import time

import collectd

try:
    from urllib2 import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen   # Python 3

URL = 'https://redmine.engagemedia.org/'
TEXT = 'Welcome to the EngageMedia intranet'

def read():
    start = time.time()
    try:
        nf = urlopen(URL)
        page = nf.read()
        nf.close()
    except Exception:
        # Site unreachable or the request failed
        return -1
    end = time.time()
    # The response body is bytes on Python 3, so compare against encoded text
    if TEXT.encode('utf-8') in page:
        return end - start
    # TEXT not found
    return -1

def read_callback():
    val = read()
    if val is None:
        return
    # Send the value to collectd as a gauge named after this script
    vl = collectd.Values(type="gauge")
    vl.plugin = "mist.python"
    vl.plugin_instance = "check_mysite_org"
    vl.dispatch(values=[val])

collectd.register_read(read_callback)
Restart the collectd agent:
root@server1:/opt/mistio-collectd# /opt/mistio-collectd/collectd.sh restart
In a few seconds you should see the graph appear on the machine's page on ClearGLASS.
The script has to be a python script, and it has to import collectd at the start. It also has to contain the functions read() and read_callback(). The output of read() is what ClearGLASS monitoring treats as the metric data, while read_callback() has to at least set the name of the script as the plugin instance, in this instance:
vl.plugin_instance = "check_mysite_org"
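Because the collectd module is normally only importable inside the collectd daemon, it can be handy to dry-run the read()/read_callback() pair on your own machine first. The sketch below does this with a stand-in module; FakeValues and fake_collectd are hypothetical helpers written for this example and are not part of collectd or ClearGLASS. They imitate only the two calls the plugin uses (Values(...).dispatch() and register_read()), and read() returns a placeholder value instead of checking a site.

```python
import sys
import types


class FakeValues(object):
    """Records dispatched values instead of sending them to collectd."""
    dispatched = []

    def __init__(self, type=None):
        self.type = type
        self.plugin = None
        self.plugin_instance = None

    def dispatch(self, values=None):
        FakeValues.dispatched.append(
            (self.plugin, self.plugin_instance, self.type, list(values)))


# Hypothetical stand-in for the collectd module; for local dry runs only.
fake_collectd = types.ModuleType('collectd')
fake_collectd.Values = FakeValues
fake_collectd.callbacks = []
fake_collectd.register_read = fake_collectd.callbacks.append
sys.modules['collectd'] = fake_collectd

# ---- from here on, the layout of a plugin file ----
import collectd


def read():
    # Placeholder measurement; a real plugin would compute its metric here.
    return 0.42


def read_callback():
    val = read()
    if val is None:
        return
    vl = collectd.Values(type="gauge")
    vl.plugin = "mist.python"
    vl.plugin_instance = "check_mysite_org"
    vl.dispatch(values=[val])


collectd.register_read(read_callback)

# ---- dry run: call every registered callback once, as collectd would ----
for callback in fake_collectd.callbacks:
    callback()
print(FakeValues.dispatched)
```

The stub lines at the top must not be deployed to the server; only the part after the "layout of a plugin file" comment corresponds to what goes into the plugin folder.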
When a custom metric is run and how to change the time
ClearGLASS's agents send monitoring data every 5 seconds by default. This means that our custom metric will run every 5 seconds and send its results to ClearGLASS. This might be too often for some checks, so we can change the interval to something that makes sense for us. For example, we want our website check to run every 2 minutes. To do this, we edit the python file we created in the second case (ClearGLASS has no SSH key) or the python file that was created for us in the first case (ClearGLASS has SSH key access), which in our example is /opt/mistio-collectd/plugins/mist-python/check_mysite_org.py
At the bottom of the file, change
collectd.register_read(read_callback)
to
collectd.register_read(read_callback, 120)
where 120 is the interval, in seconds, at which we want the metric to run. Then restart collectd:
root@test2:/home/ubuntu# /opt/mistio-collectd/collectd.sh restart
How to troubleshoot
If you deploy a custom metric and you are unable to see it graphed on ClearGLASS, check what the collectd logs say. In one terminal, type:
root@server1:/opt/mistio-collectd# tail -f /var/log/messages /var/log/syslog /var/log/daemon.log
and in another terminal restart the collectd agent:
root@server1:/opt/mistio-collectd# /opt/mistio-collectd/collectd.sh restart
Chances are there is some python error that prevents the plugin from loading, so the graphs never appear. For example, consider this output:
python plugin: Error importing module "redis_memusage".
collectd: Unhandled python exception in importing module: ImportError: No module named redis
collectd: Traceback (most recent call last):
collectd:   File "/opt/mistio-collectd/plugins/mist-python/redis_memusage.py", line 5, in <module>
    import redis
collectd: ImportError: No module named redis
collectd: Initialization complete, entering read-loop.
Here the log tells us that the plugin /opt/mistio-collectd/plugins/mist-python/redis_memusage.py fails because of a python issue (a missing library), so we can fix it, in this case by making sure the redis library is installed.
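Missing libraries like this can also be caught before restarting collectd. The following is a minimal sketch, not a ClearGLASS tool: missing_modules() is a hypothetical helper that tries to import each module a plugin needs and lists the ones that fail. It is only meaningful when run with the same python interpreter that collectd's python plugin uses, which is an assumption you would need to verify on your server.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == '__main__':
    # 'redis' is the module named in the log excerpt above.
    print(missing_modules(['time', 'redis']))
```

An empty list means every listed module imported successfully; any name printed would need to be installed before the plugin can load.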