In this blog post I will show you how to integrate Kafka with Ganglia. This is an interesting and important topic for anyone who wants to benchmark Kafka or measure its performance by monitoring specific Kafka metrics in Ganglia.
Before going ahead, let me briefly explain what Kafka and Ganglia are.
Kafka – Kafka is an open source distributed message broker project from Apache. It provides a unified, high-throughput, low-latency platform for handling real-time data feeds.
Ganglia – Ganglia is a distributed monitoring system for high-performance computing systems such as grids and clusters.
Now let's get started. In this example we have a Hadoop cluster with 3 Kafka brokers. First we will see how to install and configure Ganglia on these machines.
Step 1: Setup and Configure Ganglia gmetad and gmond
The first thing you need to do is install the EPEL repo on all the nodes
yum install epel-release
On the master node (ganglia-server), install the packages below
yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd php apr apr-util
On the slave nodes (ganglia-client), install the package below
yum install ganglia-gmond
On the master node, run the following
chown apache:apache -R /var/www/html/ganglia
Edit the config file below to allow access to the Ganglia web page from any IP
vi /etc/httpd/conf.d/ganglia.conf
It should look like below:
#
# Ganglia monitoring system php web frontend
#
Alias /ganglia /usr/share/ganglia
<Location /ganglia>
Order deny,allow
# This is very important, or else you won't be able to see the Ganglia web UI
Allow from all
Allow from 127.0.0.1
Allow from ::1
# Allow from .example.com
</Location>
On the master node, edit the gmetad config file so that it looks like below (change the IP address 172.30.0.81 to your ganglia-server's private IP address)
#cat /etc/ganglia/gmetad.conf |grep -v ^#
data_source "hadoopkafka" 172.30.0.81:8649
gridname "Hadoop-Kafka"
setuid_username ganglia
case_sensitive_hostnames 0
On the master node, edit gmond.conf. Keep the other parameters at their defaults and change only the sections shown below, then copy gmond.conf to all the other nodes in the cluster
cluster {
name = "hadoopkafka"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
#bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname. Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
#mcast_join = 239.2.11.71
host = 172.30.0.81
port = 8649
#ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
#mcast_join = 239.2.11.71
port = 8649
#bind = 239.2.11.71
#retry_bind = true
# Size of the UDP buffer. If you are handling lots of metrics you really
# should bump it up to e.g. 10MB or even higher.
# buffer = 10485760
}
Start the Apache service on the master node
service httpd start
Start the gmetad service on the master node
service gmetad start
Start the gmond service on every node in the cluster
service gmond start
That's it! Now you can see the basic Ganglia metrics by visiting the web UI at http://IP-address-of-ganglia-server/ganglia
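If you would rather confirm from a script that every node is reporting, here is a minimal Python sketch. It assumes the default tcp_accept_channel on port 8649 is still present in gmond.conf (we kept the other parameters at their defaults), in which case gmond sends its current state as XML to anyone who connects over TCP. The function names fetch_gmond_xml and reporting_hosts are my own and not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump gmond sends to a client that connects over TCP.

    Assumes the default tcp_accept_channel (port 8649) is enabled.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:  # gmond closes the connection once the dump is sent
                break
            chunks.append(data)
    return b"".join(chunks)


def reporting_hosts(xml_bytes):
    """Return {cluster name: sorted list of host names} from a gmond XML dump."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


# Example (172.30.0.81 is the ganglia-server address used in this post):
# print(reporting_hosts(fetch_gmond_xml("172.30.0.81")))
```

With the setup above, the result should contain a "hadoopkafka" entry listing every node on which gmond is running. A missing host usually means gmond is not started there or its udp_send_channel points at the wrong address.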
Step 2: Ganglia Integration with Kafka
Enable JMX Monitoring for Kafka Brokers
To collect custom Kafka metrics, we need to enable JMX monitoring for the Kafka broker daemon.
To enable JMX monitoring for a Kafka broker, follow the instructions below:
Edit kafka-run-class.sh and modify the KAFKA_JMX_OPTS variable as shown below (replace kafka.broker.hostname with your Kafka broker's hostname)
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
Add the line below to kafka-server-start.sh (on Hortonworks Hadoop, the path is /usr/hdp/current/kafka-broker/bin/kafka-server-start.sh)
export JMX_PORT=${JMX_PORT:-9999}
That's it! Repeat the above steps on all Kafka brokers and restart them (manually or via the management UI, whichever applies).
Verify that the JMX port has been enabled.
You can use jconsole to do so, by connecting to the broker's hostname and JMX port (9999 unless you set JMX_PORT to something else).
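On a headless server where jconsole is not handy, a quick first check is to see whether anything is listening on the JMX port at all. Below is a minimal Python sketch. The helper name port_is_open is my own, and a successful TCP connection only tells you that some process accepts connections on that port, not that it is a working JMX endpoint, so treat it as a smoke test rather than a replacement for jconsole.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unresolvable host, ...
        return False


# Example, with 9999 being the default JMX_PORT set above:
# for broker in ["kafka.broker.hostname"]:  # replace with your broker hostnames
#     print(broker, port_is_open(broker, 9999))
```

If this returns False for a broker, check that the export JMX_PORT line made it into kafka-server-start.sh and that the broker was actually restarted.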
Download, install and configure jmxtrans
Download the jmxtrans RPM from the link below and install it with the rpm command
http://code.google.com/p/jmxtrans/downloads/detail?name=jmxtrans-250-0.noarch.rpm&can=2&q=
Once you have installed jmxtrans, make sure that java and jps are available on your $PATH
Write a JSON file for fetching MBeans on each Kafka broker.
I have written a JSON file for monitoring custom Kafka metrics, please download it from here.
Please note that you need to replace “IP_address_of_kafka_broker” with your Kafka broker's IP address in the downloaded JSON. The same goes for the Ganglia server's IP address.
Once you are done with the JSON, verify its syntax using any online JSON validator (http://jsonlint.com/).
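To give an idea of what such a file looks like, here is a minimal Python sketch that builds a one-query jmxtrans config and prints it as JSON. This is not the JSON file mentioned above, only an illustration of its general shape. The function name build_jmxtrans_config is my own, and the MBean name and attributes are assumptions: Kafka's MBean names differ between versions, so check the real names in jconsole before using them. The group name "custom.metrics" and ports 9999 and 8649 are the ones used in this post.

```python
import json


def build_jmxtrans_config(broker_host, ganglia_host, jmx_port=9999,
                          ganglia_port=8649, group="custom.metrics"):
    """Build a jmxtrans config with one query that writes to Ganglia."""
    writer = {
        "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
        "settings": {
            "groupName": group,      # metric group shown in the Ganglia UI
            "host": ganglia_host,
            "port": ganglia_port,
        },
    }
    query = {
        # Assumed MBean name; it varies with the Kafka version.
        "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
        "attr": ["Count", "OneMinuteRate"],
        "outputWriters": [writer],
    }
    return {
        "servers": [
            {"host": broker_host, "port": str(jmx_port), "queries": [query]}
        ]
    }


config = build_jmxtrans_config("IP_address_of_kafka_broker", "172.30.0.81")
text = json.dumps(config, indent=2)
json.loads(text)  # raises ValueError if the text is not valid JSON
print(text)
```

Generating the file this way also makes it easy to produce one JSON per broker, since only the broker address changes. The json.loads call is a local alternative to pasting the file into an online validator.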
Start jmxtrans using the commands below (replace $name-of-the-json-file with the name of your JSON file)
cd /usr/share/jmxtrans/
sh jmxtrans.sh start $name-of-the-json-file
Verify that jmxtrans has started successfully using a simple “ps” command.
Repeat the above procedure on all Kafka brokers.
Verify custom metrics
Log in to the Ganglia server, go to the RRD directory (by default /var/lib/ganglia/rrds/) and check whether there are new RRD files for the Kafka metrics.
The directory listing should now include RRD files for the Kafka metrics.
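If there are too many files to scan by eye, the small Python sketch below walks the RRD directory and lists the .rrd files, optionally filtered by a substring of the file name. The function name find_rrd_files is my own, and the "kafka" filter in the example is an assumption: the actual file names depend on how your JSON names the metrics.

```python
import os


def find_rrd_files(rrd_root="/var/lib/ganglia/rrds", name_contains=""):
    """Return sorted paths (relative to rrd_root) of matching .rrd files.

    name_contains is a case-insensitive substring of the file name;
    an empty string matches every .rrd file.
    """
    needle = name_contains.lower()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(rrd_root):
        for name in filenames:
            if name.endswith(".rrd") and needle in name.lower():
                full_path = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full_path, rrd_root))
    return sorted(matches)


# Example ("kafka" is a guess at what the metric file names contain):
# for path in find_rrd_files(name_contains="kafka"):
#     print(path)
```

The paths come back grouped by cluster and host directory, so you can quickly see which brokers are already delivering Kafka metrics and which are not.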
Go to the Ganglia web UI and select hadoopkafka from the dropdown highlighted below
Select “custom.metrics” from the dropdown highlighted below
That’s all!