Monday, September 21, 2015

Protocol Buffers

In this post I will talk about a very nice project from Google: Protocol Buffers.

What are Protocol Buffers?


It's a method of serializing structured data in an efficient, fast, and simple way, where you control what the object looks like using a predefined language.

Basically, you define the object structure in an interface description language, usually in files with the .proto extension. These files are then compiled with the protoc utility to generate Java, C++, or Python classes. To create instances in Java, you use the builder class that is generated along with them.


Interface description language


It is a simple way to define fields and their types, and to mark each field as required or optional. You also have to assign each field a unique number, which the tool uses to identify the field when serializing and deserializing. You can see the example below, and read more about the language to develop more advanced object types.
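
To make the role of the field number more concrete, here is a small plain-Java sketch of how a single integer field ends up on the wire: a key built from the field number and the wire type, followed by the value as a varint. The class and method names (WireSketch, encodeIntField, ...) are made up for illustration, and this is only a sketch for non-negative integers; in real code you would rely on the classes generated by protoc instead.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch only: hand-encodes one varint field the way the
// protobuf wire format does. Real code should use protoc-generated classes.
final class WireSketch {
    static final int WIRETYPE_VARINT = 0;

    // Writes an unsigned value as a base-128 varint, lowest 7 bits first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one non-negative integer field: the key first, then the value.
    static byte[] encodeIntField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | WIRETYPE_VARINT);
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Hex dump helper so the bytes are easy to inspect.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

For example, WireSketch.toHex(WireSketch.encodeIntField(1, 150)) gives "089601", while the same value stored in field 2 gives "109601". Only the number travels with the data, not the field name, which is why each field needs its own unique number.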


I would say this is a good way to serialize and deserialize objects independently of the programming language, especially when heterogeneous systems built with different technologies and/or languages need to communicate.

 

Complete Example


First, you need to install the Protocol Buffers compiler. On Debian or Ubuntu, use the following command:

sudo apt-get install protobuf-compiler

Maven Build Script



Protocol Buffer File



Read and Write Proto Buff Objects (it's just an example :D)




Friday, September 18, 2015

Monitoring tools and libraries

This quarter I was assigned a task to design and implement a solution to monitor the accuracy of our results and the health of our system (e.g., request durations, error rate, caching ratio, etc.).

At the beginning I thought it might be a silly task, but it turned out to be an exciting one that I learnt a lot from, as I used many open source technologies during this project.

In this post I will mention some of these systems and libraries, which might be useful for many of you to track the health of your system and get a good indication of how well it is doing. ;)

Metrics Library


Simply put, if you want to monitor your system, you have to expose data that can be measured. This library is used to collect metrics and expose them via JMX, and it also supports other ways to report your metrics (e.g., Console Reporter, HTTP, CSV, ...).

These metrics can be one of the following types:

Counters


You can use this metric to report something that increases and decreases over time (e.g., bookings, errors, completed requests, etc.).
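
As a rough idea of what a counter boils down to, here is a minimal plain-Java sketch. SimpleCounter is a made-up name for illustration and not the library's own class; it only shows the increment/decrement idea in a thread-safe way.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative stand-in for a counter metric; not the Metrics library class.
final class SimpleCounter {
    private final AtomicLong count = new AtomicLong();

    // Increase the counter by one (e.g., a new booking arrived).
    void inc() {
        count.incrementAndGet();
    }

    // Increase the counter by an arbitrary amount.
    void inc(long n) {
        count.addAndGet(n);
    }

    // Decrease the counter by one (e.g., a pending request finished).
    void dec() {
        count.decrementAndGet();
    }

    // Current value, which is what a reporter would publish.
    long getCount() {
        return count.get();
    }
}
```

A typical use would be to call inc() when a request starts and dec() when it finishes, so the value shows how many requests are in flight.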

 

Histogram


I have found this metric type very useful for measuring the statistical distribution of a series of data, such as request durations. For example, you can update this metric with the duration of every completed request and, at any point in time, read the mean, standard deviation, 75th percentile, and so on, to know how well your system is performing.
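
To show what these statistics mean, here is a small plain-Java sketch that keeps every recorded duration and computes the mean, the (population) standard deviation, and a nearest-rank percentile. DurationStats is a hypothetical name and this is not how the library is implemented; as far as I know the real histogram keeps a sampled reservoir of values instead of all of them, so treat this only as an illustration of the numbers you get back.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch: stores all samples, which a real histogram avoids.
final class DurationStats {
    private final List<Long> values = new ArrayList<>();

    // Record one observation, e.g. a request duration in milliseconds.
    void update(long durationMillis) {
        values.add(durationMillis);
    }

    double mean() {
        if (values.isEmpty()) return 0;
        double sum = 0;
        for (long v : values) sum += v;
        return sum / values.size();
    }

    // Population standard deviation (divides by n, not n - 1).
    double stdDev() {
        if (values.isEmpty()) return 0;
        double m = mean();
        double squares = 0;
        for (long v : values) squares += (v - m) * (v - m);
        return Math.sqrt(squares / values.size());
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    long percentile(double p) {
        if (values.isEmpty()) throw new IllegalStateException("no samples");
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }
}
```

With the durations 10, 20, 30 and 40 recorded, mean() is 25.0 and percentile(75) is 30.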

 

Timer


I didn't use this type of metric, but it can mainly be used, for example, to measure the duration of a request and get the rate of requests per second.
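
The idea of a timer can be sketched in plain Java as a duration measurement combined with a call counter. SimpleTimer and its methods are made-up names, not the library API, and the sketch is not thread-safe; the clock is passed in (for example System::nanoTime) so that the class stays easy to test.

```java
import java.util.function.LongSupplier;

// Illustrative sketch of a timer metric; single-threaded use only.
final class SimpleTimer {
    private final LongSupplier nanoClock;
    private final long startNanos;
    private long count;
    private long totalNanos;

    SimpleTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    // Runs the task and records how long it took, even if it throws.
    void time(Runnable task) {
        long begin = nanoClock.getAsLong();
        try {
            task.run();
        } finally {
            totalNanos += nanoClock.getAsLong() - begin;
            count++;
        }
    }

    long getCount() {
        return count;
    }

    // Average duration of the recorded calls, in milliseconds.
    double meanDurationMillis() {
        return count == 0 ? 0 : (totalNanos / 1e6) / count;
    }

    // Average number of recorded calls per second since the timer was created.
    double meanRatePerSecond() {
        long elapsed = nanoClock.getAsLong() - startNanos;
        return elapsed <= 0 ? 0 : count / (elapsed / 1e9);
    }
}
```

In an application you would create it with new SimpleTimer(System::nanoTime) and wrap the request handling in time(...).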

 

Health Checks


This one is also very useful if you have multiple subsystems whose health you need to check to see if anything goes wrong, like the health of a database connection.

I used this library and exposed all metrics via the JMX reporter, and the results were amazing, especially when combined with other nice monitoring tools like the ones I describe below.
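
A health check registry can be pictured as a set of named checks that are run together. The plain-Java sketch below uses hypothetical names (HealthChecks, register, runAll) and is not the library API; it just shows one way to collect the results per subsystem, treating a check that throws as unhealthy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative sketch of a health check registry.
final class HealthChecks {
    private final Map<String, Supplier<Boolean>> checks = new LinkedHashMap<>();

    // Registers a named check, e.g. "database" -> can we open a connection?
    void register(String name, Supplier<Boolean> check) {
        checks.put(name, check);
    }

    // Runs every check in registration order; a check that throws
    // a runtime exception is reported as unhealthy.
    Map<String, Boolean> runAll() {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Boolean>> entry : checks.entrySet()) {
            boolean healthy;
            try {
                healthy = Boolean.TRUE.equals(entry.getValue().get());
            } catch (RuntimeException ex) {
                healthy = false;
            }
            results.put(entry.getKey(), healthy);
        }
        return results;
    }
}
```

Each subsystem registers its own check once at startup, and a monitoring endpoint or scheduled job calls runAll() to see which ones are failing.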

Grafana


You can use Grafana to visualize your metrics and measure your system health over time.

I used Grafana to create multiple dashboards to measure the Error Rate, Cache Ratio, No Results Rate, and to measure the accuracy of our results.

I would recommend it to anybody who wants to visualize and measure their system's health and accuracy.

Here are some useful points about Grafana:
  • Datasource - You can define multiple data sources, and each one has its own query editor supported by Grafana (e.g., InfluxDB, Graphite, ...)
  • User - It supports user authentication and authorization via LDAP, database, and Google authentication.
  • Dashboard - You can group graphs in one dashboard and create multiple dashboards to track your system.
  • Row - In one dashboard you can have multiple rows to organise how the graphs are laid out.
  • Panel - Panels come in multiple types, but for me the most important was the graph, which you define with the not very powerful query editor :D.

You can also define the period you need to see on each graph and the refresh rate of the whole dashboard.

Sample Images from Grafana website: