THE CHALLENGE

A client wanted to introduce a metrics collection system to monitor the service level agreements (SLAs) established with the provider of its Platform software, a complex, high-performance system.

THE SOLUTION

The solution was divided into:

  • Analysis of the required metrics and identification of the points in the business flow where they should be collected.
  • Dumping of the metrics from the identified points to a database.
  • Optimisation and statistical modelling of the metrics (aggregations, calculations, transformations, etc.).
  • Visualisation of the metrics through an appropriate interface.

TECHNOLOGICAL SOLUTION

The Platform is a complex system that executes a high volume of operations, and the metrics to be recorded are derived from those operations. For this reason, InfluxDB was chosen.

InfluxDB is a time series database that handles such series very efficiently. It can ingest thousands of data points per second and perform calculations and aggregations (averages, maximums, time-based searches, etc.) in real time, which can be a competitive advantage for companies that use it in this kind of monitoring context.

The Platform (implemented in Java) dumps all the required metrics through the API that InfluxDB provides.
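As an illustration of what dumping a metric can look like, the following minimal sketch formats a single measurement as an InfluxDB line protocol record, the text format accepted by the InfluxDB write API. It is not the client's actual code: the class `MetricLine`, the measurement name `sla_operation`, the tags `operation` and `status`, and the field `duration_ms` are all hypothetical names chosen for the example.

```java
import java.time.Instant;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that formats one metric as an InfluxDB line protocol record. */
final class MetricLine {

    private MetricLine() {}

    /** Measurement names must escape commas and spaces. */
    static String escapeMeasurement(String s) {
        return s.replace(",", "\\,").replace(" ", "\\ ");
    }

    /** Tag keys, tag values and field keys must escape commas, equals signs and spaces. */
    static String escapeKey(String s) {
        return s.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ");
    }

    /**
     * Builds "measurement,tag=value,... field=valuei timestamp".
     * The single field is written as an integer (suffix "i") and the
     * timestamp in nanoseconds, the default precision of line protocol.
     */
    static String format(String measurement, Map<String, String> tags,
                         String fieldKey, long fieldValue, Instant time) {
        StringBuilder sb = new StringBuilder(escapeMeasurement(measurement));
        // Tags are emitted sorted by key, as recommended for line protocol.
        for (Map.Entry<String, String> tag : new TreeMap<>(tags).entrySet()) {
            sb.append(',').append(escapeKey(tag.getKey()))
              .append('=').append(escapeKey(tag.getValue()));
        }
        long nanos = time.getEpochSecond() * 1_000_000_000L + time.getNano();
        sb.append(' ').append(escapeKey(fieldKey)).append('=').append(fieldValue).append('i');
        sb.append(' ').append(nanos);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical metric: one "checkout" operation that took 125 ms.
        String line = format("sla_operation",
                Map.of("operation", "checkout", "status", "ok"),
                "duration_ms", 125L, Instant.now());
        System.out.println(line);
    }
}
```

In a real deployment, records like this would typically be batched and sent through an InfluxDB client library or the HTTP write endpoint rather than printed; the sketch covers only the formatting step.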

Using the Flux query language, we designed how the information would be exploited: which fields would be generated and which calculations and aggregations were needed.
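To give an idea of the kind of query involved, the sketch below builds a Flux query from Java that averages a metric over fixed time windows. It is only an illustrative example, not one of the queries used in the project: the class `SlaQueries`, the bucket `platform_metrics`, the measurement `sla_operation` and the field `duration_ms` are hypothetical, and the arguments are assumed to be trusted values because they are concatenated into the query text without escaping.

```java
/** Hypothetical builder for a windowed-average Flux query. */
final class SlaQueries {

    private SlaQueries() {}

    /**
     * Returns a Flux query that reads one field of one measurement over the
     * given lookback period (e.g. "1h") and averages it per window (e.g. "1m").
     * Arguments are assumed to be trusted: they are not escaped.
     */
    static String meanPerWindow(String bucket, String measurement, String field,
                                String lookback, String window) {
        return String.join("\n",
            "from(bucket: \"" + bucket + "\")",
            "  |> range(start: -" + lookback + ")",
            "  |> filter(fn: (r) => r._measurement == \"" + measurement
                + "\" and r._field == \"" + field + "\")",
            "  |> aggregateWindow(every: " + window + ", fn: mean, createEmpty: false)");
    }

    public static void main(String[] args) {
        // Average operation duration per minute over the last hour.
        System.out.println(meanPerWindow(
            "platform_metrics", "sla_operation", "duration_ms", "1h", "1m"));
    }
}
```

Replacing `mean` with another Flux aggregate such as `max` would give the per-window maximum instead, which is the kind of variation a dashboard panel might need.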

This query language has been chosen because it offers the following characteristics compared to other languages:

  • Usability: Easy to learn and use, focused on productivity.
  • Readability: Easy to read, follow and understand and therefore to maintain.
  • Modularity: Possibility of creating your own functions and libraries.
  • Testability: Queries are code, so it should be possible to test them and manage them through version control. It should even be possible to test individual parts of a complex composite query.
  • Contribution: Open to contributions, making it easy to share functions and libraries with other developers and so increasing development efficiency.
  • Sharing: Common queries and use cases can be shared to avoid “reinventing” certain queries in InfluxDB.

Finally, for data visualisation we chose Grafana for its suitability for representing statistical data and for its integration with InfluxDB and Flux, which made it possible to create dashboards adapted to the client’s needs.