Tuesday, September 10, 2013

Joining Wonga

After 5 years with ING, I decided to change the course of my professional journey and join Wonga, a UK startup behind cool technology in the micro-lending business. My family and I will move to Dublin, Ireland, where I will work in the Wonga Dublin Technology Center.

Friday, August 16, 2013

Some useful tips for Graphite

I have worked with Graphite for a while, and I can say that it is one of my favorite tools in the devops cultural movement, where spreading operational awareness among operations teams, development teams and other stakeholders is a key factor.
Graphite stores time series data and provides a powerful set of functions to manipulate that data, which is very useful for learning about trends in resource consumption as well as finding the problematic resource among a large number of physical/logical ones.

Installation 
In general, installation is not difficult. To set it up, just follow the installation document.

How it works
Graphite is written entirely in Python. From a runtime point of view, Graphite consists of 1) the Carbon Cache (developed using Twisted) and 2) the WebApp (developed using Django). Metric data are stored in the file system, one metric per file, in a format designed specially for time series data.
We get a graph by sending an HTTP request to a URL of the WebApp, using URL parameters to tell Graphite what kind of graph we want and over which time period. We can not only request graphs of ordinary metrics but also, using Graphite functions, combine, filter and transform these metrics to generate rich, powerful graphs useful for planning and troubleshooting.
The WebApp reads the data of the metrics involved from the corresponding per-metric data files, and also from the Carbon Cache for those values that have not yet been written to the data files.
We send metrics to the Carbon Cache process, which is responsible for writing data into the data files and serving queries from the WebApp. To scale up, many Carbon Caches may be configured behind a Carbon Relay.
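For example, assuming the WebApp runs on a host called graphite.example.com (a placeholder name), a request like the one below returns a PNG of one metric over the last 24 hours; target, from, width and height are standard render parameters:
http://graphite.example.com/render?target=cpu_load_per_min.host0&from=-24h&width=800&height=400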


Getting Metrics into Graphite
It is pretty simple: just send a metric in the form metric_name metric_value seconds_from_epoch to the default Carbon port 2003, e.g.
$echo "cpu_load_per_min.`hostname`  `uptime | awk '{print $10}'` `date +%s`" | nc localhost 2003;
There is no need to create a metric definition in advance; just send data to Graphite and it will create the metric for you if it does not yet exist. Graphite uses patterns matched against metric names, defined in storage-schemas.conf, to determine how often (in terms of seconds) one data point of a certain metric is stored. For example, if the rule says that a metric is stored once per minute and we send two values of the same metric within the same minute, then the second one will overwrite the first. It also follows that Graphite does not support finer resolution than one second.
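For example, a schema entry like the one below (the section name and retention values are just illustrative) stores one data point per minute for 30 days, then one per 15 minutes for 2 years:
$cat storage-schemas.conf
[cpu_load]
pattern = ^cpu_load_per_min\.
retentions = 60s:30d,15m:2y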
A metric name can be composed of nodes separated by dots, which allows us to organize metrics into a hierarchy and combine them using Graphite functions. Suppose we have N httpd servers and we want to track their consumption of network bandwidth in terms of megabytes incoming and outgoing each minute; we can use the following naming convention (a small sender script follows the list):
front.mb_in_per_min.host0,...,front.mb_in_per_min.hostN
front.mb_out_per_min.host0,..., front.mb_out_per_min.hostN
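To illustrate the protocol from code, here is a minimal Ruby sketch of a sender; the host, port and the value 42.5 are placeholders:
require 'socket'

CARBON_HOST = 'localhost'  # assumed Carbon Cache host
CARBON_PORT = 2003         # default plaintext listener port

# Send one data point using the plaintext protocol:
# "metric_name metric_value seconds_from_epoch\n"
def send_metric(name, value, timestamp = Time.now.to_i)
  TCPSocket.open(CARBON_HOST, CARBON_PORT) do |sock|
    sock.puts "#{name} #{value} #{timestamp}"
  end
end

send_metric("front.mb_in_per_min.#{Socket.gethostname}", 42.5)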

Asking questions

There are many questions that can be asked about the system. Given the example above, we can draw a graph of the total incoming network bandwidth per minute in megabytes by using the function sumSeries:
target=sumSeries(front.mb_in_per_min.host*)
To give the graph a nice legend, we can decorate the original expression with the function alias:
target=alias(sumSeries(front.mb_in_per_min.host*),"incoming traffic in mb")
Every time series is a trend line; however, I feel that sometimes it is useful to draw two lines, one current and the other from the past, so we can see the difference over a period of time, let's say one day. The function for doing that is timeShift. Let's compare today's traffic with that of the same day last week:
target=alias(sumSeries(front.mb_in_per_min.host*),"now") &
target=alias(timeShift(sumSeries(front.mb_in_per_min.host*),"-1w"),"now - 1 week")
Because the volume of traffic depends on the day of the week (weekend traffic is much lower), it is a good idea to compare one graph with the other on the same weekday.
Comparing different metrics may be useful when we want to know why the value of a high-level metric (e.g. the response time of a URL) is higher now than in the past. Trying to relate the high-level metric to several low-level metrics can help us pinpoint where the issue is. Because different metrics have different scales, the function secondYAXIS plots one of them against a separate y-axis on the right-hand side:
target=alias(averageSeries(front.avg_response_time_per_min.host*),"avg response time") &
target=alias(secondYAXIS(sumSeries(front.request_per_min.host*)),"total requests")
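In a full render URL the two expressions are simply repeated as separate target parameters (the hostname is a placeholder; the quotes would normally be URL-encoded):
http://graphite.example.com/render?target=alias(averageSeries(front.avg_response_time_per_min.host*),"avg response time")&target=alias(secondYAXIS(sumSeries(front.request_per_min.host*)),"total requests")&from=-1d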

Performance hacks
Having dozens of dashboards, each with hundreds of graphs that require hundreds or thousands of metrics to render, can bring the Graphite server to its knees. The following hacks may be helpful in such a situation.
When rendering a graph, the WebApp identifies the metrics involved; for each metric it finds the corresponding file name, reads the content of the file, then queries the metric data from the Carbon Cache and finally merges the two results.
As the cost of a remote query to the Carbon Cache is non-negligible, a typical graph requiring hundreds of metrics will result in the same number of remote queries being sent to the Carbon Cache, which consumes a substantial amount of CPU. To mitigate this, I have patched both the WebApp and the Carbon Cache of Graphite version 0.9.x so that the WebApp sends a bulk request for metrics in a single call to the Carbon Cache; see my commits on GitHub:
  1. https://github.com/huy/graphite-web/commit/86d1d2628643843aae9649d5187454cef0c1322b
  2. https://github.com/huy/carbon/commit/2758c68d6698d20316af8da27f11d6383f0f59a0
Both the Carbon Cache and the WebApp generate a lot of IO. If the IO system is not fast enough, the server may sometimes hang waiting for IO completion. We can improve the situation a bit by tuning the file system where Graphite stores its metric files. My favorite options when mounting the file system are:
$cat /etc/fstab
...
/dev/sdb  /graphite-storage  ext3  defaults,noatime,data=writeback,barrier=0  1 2
The noatime option instructs the file system not to change the access time of a file when someone reads it (changing the access time modifies the disk block containing the inode). The other options trade off file system safety for performance; see the ext3 documentation for details.

Wednesday, February 20, 2013

Making chef-client safer

One of the main issues we are facing when using chef in our environment is that a configuration file generated by a chef template, cookbook_file or file resource may exist in an incomplete state during the short period when chef-client is modifying it. Processes accessing the configuration file during this period may fail strangely, leaving no clear trace.

Let's look at a very simple hypothetical example. We create a chef template that generates /etc/hosts using data specified in a role. At some point we add more machines to this /etc/hosts. What happens behind the scenes is that chef-client creates a new temporary file with the required content, compares its checksum with that of the actual /etc/hosts, and overwrites /etc/hosts if it is not equal to the freshly generated temporary file. During the overwrite, processes may see an incomplete /etc/hosts and thus may fail to resolve a hostname even though it already exists in the original file.

Luckily, there is a fix for this problem. We can monkey patch the chef template, cookbook_file and file providers to follow the well-known pattern: create and write to a temporary file, then use File.rename to rename the temporary file to the final file.

Ruby's File.rename uses the rename syscall, which guarantees that any process accessing the configuration file at any point in time always sees either the complete old or the complete new version of the file.

File.rename requires both the old name and the new name to be on the same mounted filesystem. So one way to make sure that File.rename will succeed is to create the temporary file in the same directory as the file being overwritten, which can be achieved easily by passing the directory of the overwritten file as tmpdir to Tempfile.new(basename, [tmpdir = Dir.tmpdir], options).
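As an illustration (this is a sketch of the pattern, not the actual patch), here is the idea in plain Ruby; error handling and restoring the original file's owner and permissions are left out:

require 'tempfile'

# Write content to path so that readers only ever see either the
# complete old or the complete new version of the file.
def atomic_write(path, content)
  # Create the temporary file next to the target so that File.rename
  # stays within one mounted filesystem.
  temp = Tempfile.new(File.basename(path), File.dirname(path))
  temp.write(content)
  temp.flush
  temp.fsync        # push data to disk before publishing the file
  temp.close
  File.rename(temp.path, path)
end

atomic_write('/etc/hosts', "127.0.0.1 localhost\n10.0.0.5 web01\n")

Note that Tempfile creates files with a restrictive mode (0600), so a real patch would also have to copy the ownership and permissions of the original file onto the temporary one before the rename.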