Getting Started with Monitoring using Graphite

Posted by Franklin Angulo on Jan 23, 2015
In this article, I'll provide a guide to help you through all of the steps involved in setting up a monitoring system using a Graphite stack.

What We Will Cover

We will cover the following topics to set up our Graphite monitoring system:

1. Introduction to Carbon & Whisper
2. Whisper Storage Schemas & Aggregations
3. Graphite Webapp

Prerequisites

First and foremost we need hardware on which to run the Graphite stack. For simplicity, I will be using Amazon Web Services EC2 hosts. However, feel free to use any type of computer that you might have lying around in your office or at home.

Specifications:

- Operating System: Red Hat Enterprise Linux (RHEL) 6.5
- Instance Type: m3.xlarge
- Elastic Block Store (EBS) Volume: 250 GB
- Python Version: 2.6.6
Introduction to Carbon & Whisper

Graphite is composed of multiple back-end and front-end components. The back-end components are used to store numeric time-series data. The front-end components are used to retrieve the metric data and optionally render graphs. In this article, I'll focus first on the back-end components: Carbon and Whisper.

Metrics can be published to a load balancer or directly to a Carbon process. The Carbon process interacts with the Whisper database library to store the time-series data to the filesystem.
Install Carbon

Carbon refers to a series of daemons that make up the storage backend of a Graphite installation. The daemons listen for time-series data using an event-driven networking engine called Twisted. The Twisted framework permits Carbon daemons to handle a large number of clients and a large amount of traffic with a low amount of overhead.

To install Carbon, run the following commands (assuming the RHEL operating system):

# sudo yum groupinstall "Development Tools"
# sudo yum install python-devel
# sudo yum install git
# sudo easy_install pip
# sudo pip install twisted
# cd /tmp
# git clone https://github.com/graphite-project/carbon.git
# cd /tmp/carbon
# sudo python setup.py install
The /opt/graphite directory should now have the carbon libraries and configuration files:

# ls -l /opt/graphite
drwxr-xr-x. 2 root root 4096 May 18 23:56 bin
drwxr-xr-x. 2 root root 4096 May 18 23:56 conf
drwxr-xr-x. 4 root root 4096 May 18 23:56 lib
drwxr-xr-x. 6 root root 4096 May 18 23:56 storage

Inside the bin folder, you'll find the three different types of Carbon daemons:

- Cache: accepts metrics over various protocols and writes them to disk as efficiently as possible; caches metric values in RAM as they are received, and flushes them to disk on a specified interval using the underlying Whisper library.
- Relay: serves two distinct purposes: replication and sharding of incoming metrics.
- Aggregator: runs in front of a cache to buffer metrics over time before reporting them into Whisper.
Install Whisper

Whisper is a database library for storing time-series data that is then retrieved and manipulated by applications using the create, update, and fetch operations.

To install Whisper, run the following commands:

# cd /tmp
# git clone https://github.com/graphite-project/whisper.git
# cd /tmp/whisper
# sudo python setup.py install

The Whisper scripts should now be in place:

# ls -l /usr/bin/whisper*
-rwxr-xr-x. 1 root root 1711 May 19 00:00 /usr/bin/whisper-create.py
-rwxr-xr-x. 1 root root 2902 May 19 00:00 /usr/bin/whisper-dump.py
-rwxr-xr-x. 1 root root 1779 May 19 00:00 /usr/bin/whisper-fetch.py
-rwxr-xr-x. 1 root root 1121 May 19 00:00 /usr/bin/whisper-info.py
-rwxr-xr-x. 1 root root 674 May 19 00:00 /usr/bin/whisper-merge.py
-rwxr-xr-x. 1 root root 5982 May 19 00:00 /usr/bin/whisper-resize.py
-rwxr-xr-x. 1 root root 1060 May 19 00:00 /usr/bin/whisper-set-aggregation-method.py
-rwxr-xr-x. 1 root root 969 May 19 00:00 /usr/bin/whisper-update.py

Start a Carbon Cache Process

The Carbon installation comes with sensible defaults for port numbers and many other configuration parameters. Copy the existing example configuration files:

# cd /opt/graphite/conf
# cp aggregation-rules.conf.example aggregation-rules.conf
# cp blacklist.conf.example blacklist.conf
# cp carbon.conf.example carbon.conf
# cp carbon.amqp.conf.example carbon.amqp.conf
# cp relay-rules.conf.example relay-rules.conf
# cp rewrite-rules.conf.example rewrite-rules.conf
# cp storage-schemas.conf.example storage-schemas.conf
# cp storage-aggregation.conf.example storage-aggregation.conf
# cp whitelist.conf.example whitelist.conf
# vi carbon.conf

Under the [cache] section, the line receiver port has a default value; it is used to accept incoming metrics through the plaintext protocol (see below):

[cache]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003

Start a carbon-cache process by running the following command:

# cd /opt/graphite/bin
# ./carbon-cache.py start
Starting carbon-cache (instance a)
The process should now be listening on port 2003:

# ps -efla | grep carbon-cache
1 S root 2674 1 0 80 0 - 75916 ep_pol 00:18 ? 00:00:03 /usr/bin/python ./carbon-cache.py start
# netstat -nap | grep 2003
tcp 0 0 0.0.0.0:2003 0.0.0.0:* LISTEN 2674/python

Publish Metrics

A metric is any measurable quantity that can vary over time, for example:

- number of requests per second
- request processing time
- CPU usage

A datapoint is a tuple containing:

- a metric name
- a measured value
- a specific point in time (usually a timestamp)
Client applications publish metrics by sending data points to a Carbon process. The application establishes a TCP connection on the Carbon process's port and sends data points in a simple plaintext format. In our example, the port is 2003. The TCP connection may remain open and be reused as many times as necessary. The Carbon process listens for incoming data but does not send any response back to the client.

The datapoint format is defined as:

- a single line of text per data point
- a dotted metric name at position 0
- a value at position 1
- a Unix Epoch timestamp at position 2
- spaces for the position separators

For example, here are some valid datapoints:

The number of metrics received by the carbon-cache process every minute:
carbon.agents.graphite-tutorial.metricsReceived 28198 1400509108

The number of metrics created by the carbon-cache process every minute:
carbon.agents.graphite-tutorial.creates 8 1400509110

The p95 response time for a sample server endpoint over a minute:
PRODUCTION.host.graphite-tutorial.responseTime.p95 0.10 1400509112
Client applications have multiple ways to publish metrics:

- using the plaintext protocol with tools such as the netcat (nc) command
- using the pickle protocol
- using the Advanced Message Queueing Protocol (AMQP)
- using libraries such as the Dropwizard Metrics library

For simplicity, in this tutorial I'll be using the plaintext protocol through the netcat command. To publish the example datapoints listed above, run the following commands:

sudo yum install nc
echo "carbon.agents.graphite-tutorial.metricsReceived 28198 `date +%s`" | nc localhost 2003
echo "carbon.agents.graphite-tutorial.creates 8 `date +%s`" | nc localhost 2003
echo "PRODUCTION.host.graphite-tutorial.responseTime.p95 0.10 `date +%s`" | nc localhost 2003
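If you'd rather publish from application code than from a shell, the plaintext protocol is simple to speak directly over a TCP socket. Here is a minimal Python sketch; the host, port, and metric name match this tutorial's setup and are otherwise placeholders:

import socket
import time

CARBON_HOST = 'localhost'
CARBON_PORT = 2003  # the plaintext line receiver port configured earlier

def send_datapoint(metric, value, timestamp=None):
    # One line of text per datapoint: "<metric> <value> <timestamp>\n"
    if timestamp is None:
        timestamp = int(time.time())
    line = '%s %s %d\n' % (metric, value, timestamp)
    sock = socket.create_connection((CARBON_HOST, CARBON_PORT))
    try:
        sock.sendall(line.encode('ascii'))
    finally:
        sock.close()

send_datapoint('PRODUCTION.host.graphite-tutorial.responseTime.p95', 0.10)

As with netcat, Carbon sends nothing back; a long-lived client would keep the connection open and reuse it for many datapoints instead of reconnecting each time.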

The carbon-cache log files will contain information about the new metrics received and where the information was stored:

# tail -f /opt/graphite/storage/log/carbon-cache/carbon-cache-a/creates.log
19/05/2014 10:42:44 :: creating database file /opt/graphite/storage/whisper/carbon/agents/graphite-tutorial/metricsReceived.wsp
19/05/2014 10:42:53 :: creating database file /opt/graphite/storage/whisper/carbon/agents/graphite-tutorial/creates.wsp
19/05/2014 10:42:57 :: creating database file /opt/graphite/storage/whisper/PRODUCTION/host/graphite-tutorial/responseTime/p95.

Carbon interacts with Whisper to store the time-series data to the filesystem. Navigate the filesystem to make sure the data files have been created:

# ls -l /opt/graphite/storage/whisper/carbon/agents/graphite-tutorial/
total 3040
-rw-r--r--. 1 root root 1555228 May 19 10:42 creates.wsp
-rw-r--r--. 1 root root 1555228 May 19 10:42 metricsReceived.wsp
# ls -l /opt/graphite/storage/whisper/PRODUCTION/host/graphite-tutorial/responseTime/
total 20
-rw-r--r--. 1 root root 17308 May 19 10:42 p95.wsp

Finally, you can retrieve metadata information about the Whisper file that was created for the
metric using the whisper-info script:

# whisper-info.py /opt/graphite/storage/whisper/PRODUCTION/host/graphite-tutorial/responseTime/p95.wsp
maxRetention: 86400
xFilesFactor: 0.5
aggregationMethod: average
fileSize: 17308

Archive 0
retention: 86400
secondsPerPoint: 60
points: 1440
size: 17280
offset: 28
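As a sanity check on these numbers: Whisper stores each data point as 12 bytes (a 4-byte timestamp plus an 8-byte double), so this archive occupies 1440 points x 12 bytes = 17280 bytes. The 28-byte offset is the file header: 16 bytes of file metadata plus 12 bytes of archive information for the single archive. Together they account for the fileSize of 17308 bytes.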

The whisper-dump script is a more complete script that outputs the original data for all storage
retention periods along with the metadata information about the Whisper file:

# whisper-dump.py /opt/graphite/storage/whisper/PRODUCTION/host/graphite-tutorial/responseTime/p95.wsp
Meta data:
aggregation method: average
max retention: 86400
xFilesFactor: 0.5

Archive 0 info:
offset: 28
seconds per point: 60
points: 1440
retention: 86400
size: 17280

Archive 0 data:
0: 1400609220, 0.1000000000000000055511151231257827
1: 0, 0
2: 0, 0
3: 0, 0
4: 0, 0
5: 0, 0
...
1437: 0, 0
1438: 0, 0
1439: 0, 0

Aggregation method, max retention, xFilesFactor, and all of the other attributes of the Whisper file are important to understand. Don't worry if you're lost at this point; I'll be covering these in more detail in the next section.

Whisper Storage Schemas & Aggregations


There might be some confusion when you or your fellow developers and system administrators start publishing data points and get unexpected results:

- Why are my data points getting averaged?
- I've been publishing data points intermittently; why are there no data points?
- I've been publishing data points for many days; why am I only getting data for one day?

How does Whisper store data?


We first need to understand how data is stored in the Whisper files. When a Whisper file is created, it has a fixed size that will never change. Within the Whisper file there are potentially multiple "buckets" that you need to define in the configuration files, for data points at different resolutions. For example:

- Bucket A: data points with 10-second resolution
- Bucket B: data points with 60-second resolution
- Bucket C: data points with 10-minute resolution

Each bucket also has a retention attribute indicating the length of time data points in the bucket should be retained for. For example:

- Bucket A: data points with 10-second resolution retained for 6 hours
- Bucket B: data points with 60-second resolution retained for 1 day
- Bucket C: data points with 10-minute resolution retained for 7 days

Given these two pieces of information, Whisper performs some simple math to figure out how many points it will actually need to keep in each bucket:

- Bucket A: 6 hours x 60 mins/hour x 6 data points/min = 2160 points
- Bucket B: 1 day x 24 hours/day x 60 mins/hour x 1 data point/min = 1440 points
- Bucket C: 7 days x 24 hours/day x 6 data points/hour = 1008 points
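Expressed in the storage-schemas.conf retention syntax covered later in this article, a three-bucket schema like this would be declared along these lines (the entry name and pattern here are placeholders):

[three_buckets_example]
pattern = ^some\.prefix\.
retentions = 10s:6h,1min:1d,10min:7d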
If a Whisper file is created with this storage schema configuration, it will have a size of 56 KB. If
you run it through the whisper-dump.py script, the following will be the output. Note that an
archive corresponds to a bucket and the seconds per point and points attributes match our
computations above.
Meta data:
aggregation method: average
max retention: 604800
xFilesFactor: 0.5

Archive 0 info:
offset: 52
seconds per point: 10
points: 2160
retention: 21600
size: 25920

Archive 1 info:
offset: 25972
seconds per point: 60
points: 1440
retention: 86400
size: 17280

Archive 2 info:
offset: 43252
seconds per point: 600
points: 1008
retention: 604800
size: 12096
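The same point arithmetic from the previous section applies here: the header is 16 bytes of metadata plus 3 x 12 bytes of archive information, or 52 bytes in total (hence Archive 0's offset of 52), and 52 + 25920 + 17280 + 12096 = 55348 bytes, the roughly 56 KB quoted above.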

What about aggregations?


Aggregations come into play when data from a high precision bucket is moved to a lower precision bucket. Let's use Bucket A and Bucket B from our previous example:

- Bucket A: 10-second resolution retained for 6 hours (higher precision)
- Bucket B: 60-second resolution retained for 1 day (lower precision)

We might have an application publishing data points every 10 seconds. Any data points published less than 6 hours ago will be found in Bucket A. However, if I start to query for data points published more than 6 hours ago, they will be found in Bucket B.

How are data points moved to Bucket B?


The lower precision value is divided by the higher precision value to determine the number of data points that will need to be aggregated:

60 seconds (Bucket B) / 10 seconds (Bucket A) = 6 data points to aggregate

NOTE: Whisper needs the lower precision value to be cleanly divisible by the higher precision value (i.e. the division must result in a whole number). Otherwise the aggregation might not be accurate.
To aggregate the data, Whisper reads 6 10-second data points from Bucket A and applies a
function to them to come up with the single 60-second data point that will be stored in Bucket B.
There are five options for the aggregation function: average, sum, max, min and last. The choice
of aggregation function depends on the data points you're dealing with. 95th percentile values,
for example, should probably be aggregated with the max function. For counters, on the other
hand, the sum function would be more appropriate.
Whisper also handles the concept of an xFilesFactor when aggregating data points. It represents
the ratio of data points a bucket must contain to be aggregated accurately. In our previous
example, Whisper determined that it needed to aggregate 6 10-second data points. It could be
possible that only 4 data points have data and the other 2 are null - due to networking issues,
application restarts, etc.
If our Whisper file has an xFilesFactor of 0.5, it means that it will aggregate the data only if at
least 50% of the data points are present. If more than 50% of the data points are null, Whisper
will create a null aggregation. In our case, we have 4 out of 6 data points - 66%. The aggregation
function will be applied on the non-null data points to create the aggregated value.
You may set the xFilesFactor to any value between 0 and 1. A value of 0 indicates that the
aggregation should be computed even if there is only one data point available. A value of 1
indicates that the aggregation should be computed only if all data points are present.
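The rule is easy to express in code. The following Python snippet is only a sketch of the decision as described above, not Whisper's actual implementation:

def aggregate(points, xfiles_factor, func):
    # points is one lower-precision slot's worth of higher-precision
    # values, with None for slots that never received data
    known = [p for p in points if p is not None]
    if not known or len(known) < xfiles_factor * len(points):
        return None  # too many nulls: store a null aggregate
    return func(known)  # apply the function to the non-null values only

# 4 of 6 points present (66%) clears an xFilesFactor of 0.5, so the
# average of the four known values (0.25) is stored
print(aggregate([0.1, 0.2, None, 0.3, None, 0.4], 0.5, lambda v: sum(v) / len(v)))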
In the previous section, we made copies of all the example configuration files in the
/opt/graphite/conf directory. The configuration files that control how Whisper files are created
are:

/opt/graphite/conf/storage-schemas.conf
/opt/graphite/conf/storage-aggregation.conf

Default Storage Schemas


The storage-schemas configuration file is composed of multiple entries, each containing a pattern against which to match metric names and a retention definition. By default there are two entries: carbon and everything else.

The carbon entry matches metric names that start with the "carbon" string. Carbon daemons emit their own internal metrics every 60 seconds by default (this can be changed). For example, a carbon-cache process will emit a metric for the number of metric files it creates every minute. The retention definition indicates that data points reported every 60 seconds will be retained for 90 days.

[carbon]
pattern = ^carbon\.
retentions = 60s:90d
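Incidentally, this entry explains the 1555228-byte .wsp files we saw earlier under the whisper/carbon/agents directory: 90 days x 1440 minutes/day = 129600 points, and 129600 points x 12 bytes plus the 28-byte single-archive header is exactly 1555228 bytes.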

The everything else entry captures any other metric that is not carbon-related by specifying a
pattern with an asterisk. The retention definition indicates that data points reported every 60
seconds will be retained for 1 day.

[default_1min_for_1day]
pattern = .*
retentions = 60s:1d

Default Storage Aggregation


The storage-aggregation configuration file is also composed of multiple entries containing:

- a pattern against which to match metric names
- an xFilesFactor value
- an aggregation function

By default there are four entries:

- Metrics ending in .min
  - Use the min aggregation function
  - At least 10% of data points should be present to aggregate
- Metrics ending in .max
  - Use the max aggregation function
  - At least 10% of data points should be present to aggregate
- Metrics ending in .count
  - Use the sum aggregation function
  - Aggregate if there is at least one data point
- Any other metrics
  - Use the average aggregation function
  - At least 50% of data points should be present to aggregate

[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min

[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max

[sum]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum

[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average

The default storage schemas and storage aggregations work well for testing, but for real production metrics you might want to modify the configuration files.

Modify Storage Schemas


First off, I'll modify the carbon entry. I'd like to keep the metrics reported by Carbon every 60 seconds for 180 days (6 months). After 180 days, I'd like to roll up the metrics to a precision of 10 minutes and keep those for another 180 days.

[carbon]
pattern = ^carbon\.
retentions = 1min:180d,10min:180d

At Squarespace we use the Dropwizard framework to build RESTful web services. We have many
of these services running in staging and production environments and they all use the
Dropwizard Metrics library to publish application and business metrics every 10 seconds. I'd like
to keep 10-second data for 3 days. After 3 days, the data should be aggregated to 1-minute
data and kept for 180 days (6 months). Finally, after 6 months, the data should be aggregated to
10-minute data and kept for 180 days.
NOTE: If my metrics library published data points at a different rate, my retention definition would
need to change to match it.

[production_staging]
pattern = ^(PRODUCTION|STAGING).*
retentions = 10s:3d,1min:180d,10min:180d

Metrics that are not carbon, production, or staging metrics are probably just test metrics. I'll keep
those around only for one day and assume that they will be published every minute.

[default_1min_for_1day]
pattern = .*
retentions = 60s:1d

Modify Storage Aggregations


I'm going to keep the default storage aggregation entries, but will add a couple more for metrics ending in .ratio, .m1_rate, and .p95.
NOTE: Any new entries should be added before the default entry.

[ratio]
pattern = \.ratio$
xFilesFactor = 0.1
aggregationMethod = average

[m1_rate]
pattern = \.m1_rate$
xFilesFactor = 0.1
aggregationMethod = sum

[p95]
pattern = \.p95$
xFilesFactor = 0.1
aggregationMethod = max
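Keep in mind that changes to storage-schemas.conf and storage-aggregation.conf only affect Whisper files created after the change. For metrics that already have files on disk, the whisper scripts installed earlier can apply the new settings in place. A sketch using the paths from this tutorial (check each script's --help output, as flags vary between versions):

# whisper-set-aggregation-method.py /opt/graphite/storage/whisper/PRODUCTION/host/graphite-tutorial/responseTime/p95.wsp max
# whisper-resize.py /opt/graphite/storage/whisper/PRODUCTION/host/graphite-tutorial/responseTime/p95.wsp 10s:3d 1min:180d 10min:180d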

At this point you have configured your Graphite backend to match the data point publishing rates
of your application and fully understand how the data points are stored in the filesystem. In the
next section, we'll attempt to visualize the data using graphite-webapp.

Graphite Webapp
Now that we have the back-end components up and running and storing numeric time-series
data in the formats that we have specified, it's time to take a look at the front-end components of
Graphite. Specifically, we need a way to query and visualize the information that is stored.
The Graphite web application is a Django application that runs under Apache/mod_wsgi, according to the GitHub README file. In general, it provides the following:

- a URL-based API endpoint to retrieve raw data and generate graphs
- a user interface to navigate metrics and build and save dashboards

The Installation Maze


The installation of graphite-web is really a maze. I have installed it multiple times - in RHEL, CentOS, Ubuntu and Mac OS X - and every time the steps have been different. Treat it as a game, enjoy it, and you'll know that you've completed the maze when all the required dependencies have been installed.
Instructions for RHEL 6.5:

# cd /tmp
# git clone https://github.com/graphite-project/graphite-web.git
# cd /tmp/graphite-web
# python check-dependencies.py
[REQUIRED] Unable to import the 'django' module, do you have Django installed
[REQUIRED] Unable to import the 'pyparsing' module, do you have pyparsing module installed
[REQUIRED] Unable to import the 'tagging' module, do you have django-tagging installed
[OPTIONAL] Unable to import the 'memcache' module, do you have python-memcached installed
[OPTIONAL] Unable to import the 'txamqp' module, this is required if you want to use AMQP as an input to Carbon. Note that txam
[OPTIONAL] Unable to import the 'python-rrdtool' module, this is required for
3 optional dependencies not met. Please consider the optional items before proceeding.
3 necessary dependencies not met. Graphite will not function until these dependencies are fulfilled.

The goal is to install at least all of the required dependencies. Install the optional dependencies if you're planning on using the AMQP functionality or the caching functionality using Memcached.

# sudo yum install cairo-devel
# sudo yum install pycairo-devel
# sudo pip install django
# sudo pip install pyparsing
# sudo pip install django-tagging
# sudo pip install python-memcached
# sudo pip install txamqp
# sudo pip install pytz
# cd /tmp/graphite-web
# python check-dependencies.py
[OPTIONAL] Unable to import the 'python-rrdtool' module, this is required for
1 optional dependencies not met. Please consider the optional items before proceeding.
All necessary dependencies are met.

I've installed enough packages to meet the required dependencies. I can now install graphite-web:

# cd /tmp/graphite-web
# sudo python setup.py install
# ls -l /opt/graphite/webapp/
total 12
drwxr-xr-x. 6 root root 4096 May 23 14:33 content
drwxr-xr-x. 15 root root 4096 May 23 14:33 graphite
-rw-r--r--. 1 root root 280 May 23 14:33 graphite_web-0.10.0_alpha-py2.6.egg-info

The setup script moves the web application files to the proper location under
/opt/graphite/webapp.

Initialize the Database


The web application maintains an internal database to store user information and dashboards.
Initialize the database by running the following:

# cd /opt/graphite
# export PYTHONPATH=$PYTHONPATH:`pwd`/webapp
# django-admin.py syncdb --settings=graphite.settings
You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (leave blank to use 'root'): feangulo
Email address: feangulo@yaipan.com
Password:
Password (again):
Error: Blank passwords aren't allowed.
Password:
Password (again):
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)

This will create a new database and store it in the /opt/graphite/storage directory:

# ls -l /opt/graphite/storage/graphite.db
-rw-r--r--. 1 root root 74752 May 23 14:46 /opt/graphite/storage/graphite.db

Graphite Webapp Settings


The configuration file containing the graphite-webapp settings is located in the
/opt/graphite/webapp/graphite folder. Copy the sample configuration file:

# vi /opt/graphite/webapp/graphite/local_settings.py
#########################
# General Configuration #
#########################
TIME_ZONE = 'UTC'
##########################
# Database Configuration #
##########################
DATABASES = {
    'default': {
        'NAME': '/opt/graphite/storage/graphite.db',
        'ENGINE': 'django.db.backends.sqlite3',
        'USER': '',
        'PASSWORD': '',
        'HOST': '',
        'PORT': ''
    }
}

At this point, if you followed the instructions in the previous sections, you should only have one
carbon-cache process running on port 2003 with a query port on 7002. These are the defaults
expected by the graphite-webapp. Therefore, there are no other changes required to the
configuration file.

# ps -efla | grep carbon-cache
1 S root 14101 1 0 80 0 - 75955 ep_pol May20 ? 00:00:26 /usr/bin/python ./carbon-cache.py start
# netstat -nap | grep 2003
tcp 0 0 0.0.0.0:2003 0.0.0.0:* LISTEN 14101/python
# netstat -nap | grep 7002
tcp 0 0 0.0.0.0:7002 0.0.0.0:* LISTEN 14101/python

However, you could specify the carbon-cache process to read from explicitly in the settings file:

# vi /opt/graphite/webapp/graphite/local_settings.py
#########################
# Cluster Configuration #
#########################
CARBONLINK_HOSTS = ["127.0.0.1:7002:a"]

This means that I have a carbon-cache process running locally, with the query port set to 7002 and the name set to 'a'. If you look at the Carbon configuration file, you should see something like this:

# vi /opt/graphite/conf/carbon.conf
[cache]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7002

NOTE: Where did the ‘a’ come from? That’s the default name assigned. To define more caches,
you’d need to create additional named sections in the configuration file.

[cache:b]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2004
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7003
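You would then start the second cache by its instance name. A sketch (the --instance flag is the standard way to address a named cache, but verify against your Carbon version):

# cd /opt/graphite/bin
# ./carbon-cache.py --instance=b start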

Dashboard and Graph Template Configuration


The Graphite webapp comes with dashboard and graph template defaults. Copy the sample
configuration files:

# cd /opt/graphite/conf
# cp dashboard.conf.example dashboard.conf
# cp graphTemplates.conf.example graphTemplates.conf

I modify the dashboard configuration file to have larger graph tiles.

# vi /opt/graphite/conf/dashboard.conf
[ui]
default_graph_width = 500
default_graph_height = 400
automatic_variants = true
refresh_interval = 60
autocomplete_delay = 375
merge_hover_delay = 750

I modify the default graph template to have a black background and a white foreground. I also
like the font to be smaller.

# vi /opt/graphite/conf/graphTemplates.conf
[default]
background = black
foreground = white
minorLine = grey
majorLine = rose
lineColors = blue,green,red,purple,brown,yellow,aqua,grey,magenta,pink,gold,rose
fontName = Sans
fontSize = 9
fontBold = False
fontItalic = False

Run the Web Application


We are finally ready to run the web application. I'm going to run it on port 8085 but you may set
the port to any value you'd like. Run the following commands:

# cd /opt/graphite
# PYTHONPATH=`pwd`/storage/whisper ./bin/run-graphite-devel-server.py --port=8085 --libs=`pwd`/webapp /opt/graphite 1>/opt/grap
# tail -f /opt/graphite/storage/log/webapp/process.log

Open a web browser and point it to http://your-ip:8085. Make sure that the Graphite web
application loads. If you're tailing the process.log file, you should be able to see any resources
that are loaded and any queries that are made from the web application.
Navigate the Metrics
In a previous section, we had published a couple of metrics to the carbon-cache using the netcat
command. Specifically, we had published the following:

carbon.agents.graphite-tutorial.metricsReceived
carbon.agents.graphite-tutorial.creates
PRODUCTION.host.graphite-tutorial.responseTime.p95

The web application displays metrics as a tree. If you navigate the metric tree in the left panel,
you should be able to see all of these metrics.

You may click on any metric and it will be graphed (past 24 hours by default) in the panel on the
right. To change the date range to query, use the buttons in the panel above the graph.

Create a Dashboard
The default view is great to quickly browse metrics and visualize them. But if you want to build a
dashboard, point your browser to http://your-ip:8085/dashboard. The top portion of the page is another way to navigate your metrics. You can either click on the options to navigate, or start
typing to get suggestions. You can click on a metric and a graph tile will appear in the bottom
section. As you keep clicking on new metrics, additional tiles appear in the panel below thereby
creating a dashboard. At times you might want to display multiple metrics in a single graph. To do
this, drag and drop a tile on top of another one and the metrics will be graphed together. You
may also change the position of the tiles in the layout by dragging them around.

The user interface looks very simple, but don't be discouraged. You can do very powerful
operations on your metric data. If you click on one of the graph tiles, you will get a dialog. It
displays the list of metrics being graphed and you may edit them directly. There are also multiple
menus in the dialog to apply functions on the data, change aspects of the visualization, and
many other operations.

You may also configure and save your dashboard, load other dashboards, change the date
range of the current dashboard, share a dashboard, among other things, using the top-most
menu. By far my favorite thing is the Dashboard -> Edit Dashboard feature. It saves me a lot of
time when I need to create or modify dashboards.

To illustrate, I am going to build a dashboard to monitor the carbon-cache process. As mentioned in a previous section, Carbon processes report internal metrics. I don't like to build dashboards manually; instead I will use the Edit Dashboard feature.

To build a dashboard to monitor the carbon-cache process, specify the following in the Edit Dashboard window.
NOTE: This dashboard will monitor all carbon-cache processes that you have running. Notice the
use of the asterisk (*) in the metric name to match all values following the carbon.agents prefix.

[
  {
    "target": [
      "aliasByNode(carbon.agents.*.metricsReceived,2)"
    ],
    "title": "Carbon Caches - Metrics Received"
  },
  {
    "target": [
      "aliasByNode(carbon.agents.*.creates,2)"
    ],
    "title": "Carbon Caches - Create Operations"
  },
  {
    "target": [
      "aliasByNode(carbon.agents.*.cpuUsage,2)"
    ],
    "title": "Carbon Caches - CPU Usage"
  },
  {
    "target": [
      "aliasByNode(carbon.agents.*.memUsage,2)"
    ],
    "title": "Carbon Caches - Memory Usage"
  }
]

Update the dashboard definition and you should now see something like this:

Changing content in the Edit Dashboard dialog updates the dashboard on the browser. However,
it does not save it to Graphite's internal database of dashboards. Go ahead and save the
dashboard so that you can share it and open it up later.

To look up the dashboard, open the Finder:

On a production Graphite installation, the Graphite Caches dashboard would look more like this:

It’s All About the API

Graphite has some drawbacks like any other tool: it doesn't scale well and the storage mechanism isn't the most optimal. But the fact is that Graphite's API is a beauty. Having a user interface is nice, but what is most important is that whatever you can do through the UI, you can also do via graphite-web API requests. Users are able to request custom graphs by building a simple URL. The parameters are specified in the query string of the HTTP GET request. By default a PNG image is returned as the response, but the user may also indicate the required format of the response - for example, JSON data.
Sample request #1:

- Metric: CPU usage of all carbon-cache processes
- Graph dimensions: 500x300
- Time range: 12 hours ago until 5 minutes ago
- Response format: PNG image (default)

http://your-ip:8085/render?target=carbon.agents.*.cpuUsage&width=500&height=300&from=-12h&until=-5min

Sample request #2:

- Metric: CPU usage of all carbon-cache processes
- Graph dimensions: 500x300
- Time range: 12 hours ago until 5 minutes ago
- Response format: JSON data

http://your-ip:8085/render?target=carbon.agents.*.cpuUsage&width=500&height=300&from=-12h&until=-5min&format=json
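The JSON response is easy to consume programmatically. A minimal Python sketch against the same hypothetical host; each series in the response carries a target name and a list of [value, timestamp] pairs:

import json
import urllib2  # Python 2, matching this tutorial's environment

url = ('http://your-ip:8085/render'
       '?target=carbon.agents.*.cpuUsage'
       '&from=-12h&until=-5min&format=json')
series = json.loads(urllib2.urlopen(url).read())
for s in series:
    # datapoints holds [value, timestamp] pairs; value is None (null)
    # wherever no data was recorded
    print(s['target'], len(s['datapoints']))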

Graphite's API supports a wide variety of display options as well as data manipulation functions
that follow a simple functional syntax. Functions can be nested, allowing for complex expressions
and calculations. View the online documentation to peruse all of the available functions:
Graphite Functions: http://graphite.readthedocs.org/en/latest/functions.html
Let’s say I have an application that runs on hundreds of servers, each of which publishes its individual p95 response times every 10 seconds. Using functions provided by the API, I can massage the metrics and build an informative graph:

- averageSeries: computes the average of all the values in the set
  - We want to see the average among all the p95 latencies
- scale: multiplies a value by a constant
  - The latencies are reported in milliseconds, but we want to display them in seconds
- alias: changes the name of the metric when displayed
  - Instead of the full metric name, we want to display only avg p95 in the graph legend

The argument passed as part of the metric query to the API would be:

alias(scale(averageSeries(PRODUCTION.host.*.requests.p95),0.001),'avg p95')
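Embedded in a render request, that expression simply becomes the target parameter (with the space in the alias URL-encoded):

http://your-ip:8085/render?target=alias(scale(averageSeries(PRODUCTION.host.*.requests.p95),0.001),'avg%20p95')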
The API would return the following graph:

Congratulations! We have installed and configured carbon, whisper and the graphite-webapp,
published metrics, navigated metrics and built a dashboard. You can now build your own
awesome dashboards for your business and application metrics.
This was an introductory article on Graphite. For advanced topics see:

- Stress Testing Carbon Caches
- Carbon Aggregators
- Graphite Querying Statistics on an ELK Stack

About the Author

Franklin Angulo oversees the teams which build and maintain the large-scale backend engine at the core of Squarespace, a website building platform based in New York City. Franklin is a seasoned professional with experience leading complex, large-scale, multi-disciplinary engineering projects. Before joining Squarespace, he was a senior software engineer at Amazon working on route planning optimizations, shipping rate shopping and capacity planning algorithms for global inbound logistics and the Amazon Locker program.

Community comments

Great one! Very thorough.
Jul 04, 2015 10:54 by Sapien Technologies

Great one! Very thorough. Thanks!!

A few things I had to figure out at a couple of places in my install.

1) Init Graphite-web db
Using the command "django-admin.py syncdb --settings=graphite.settings" complains that the settings module is not found.
Instead, I used "sudo PYTHONPATH=/opt/graphite/webapp DJANGO_SETTINGS_MODULE=graphite.settings django-admin.py syncdb"

2) Launching host:8085/ didn't render the page correctly. Blue Ext components were missing.
The issue was that the Django server doesn't serve static files (js, css etc.) unless run in "--insecure" mode.
See - github.com/graphite-project/graphite-web/issues...
I added the "--insecure" option by hacking ./bin/run-graphite-devel-server.py to get going.

How to monitor multiple Hosts?
Oct 10, 2015 10:07 by suresh reddy

Hi,

I am new to this tool, trying to install it on CentOS 6.6, but I do not know how to configure the dashboard, collect the data from different servers, or how to add hosts to Carbon. Please help me if you have any documentation regarding the installation and configuration. I am glad if you respond back to me ASAP.

Thanks.

Great job!
Aug 30, 2016 02:40 by Vladimir Stepanov

Very useful! A nice starting point for working with Graphite.



Re: Great one! Very thorough.
Aug 30, 2016 02:44 by Vladimir Stepanov

Sapien Technologies, thank you for your comment! You saved my mind :) I could not understand for a very long time why I saw blank pages in the graphite webapp instead of the proper UI.
I think this comment about the "--insecure" option is very important and at the least it should be documented in Graphite...
