Rahul Amaram, Principal Software Engineer (RTB), Vizury Interactive Solutions Pvt. Ltd.
Working for an ad-tech company, monitoring the various ad-exchange metrics has always been a challenge. There are multiple metrics to be tracked per ad-exchange, such as QPS, bid rate, avg. bid value, hit rate, etc., and these need to be tracked across tens of publishers. What makes this challenging is that these metrics are highly dynamic, varying based on the time of the day and the day of the week. Bischeck turned out to be a boon in helping us address this. We currently track close to a hundred business metrics using Bischeck, with the metric values being generated every 2 minutes and the threshold at any given moment being calculated from six historical data points spread over a month. Barring a few metrics which have no fixed pattern, the other metrics are being monitored pretty reliably.
What is even more awesome about Bischeck is the support provided by the developers (I have mainly interacted with Anders Haal). There have been at least two instances where we ran into issues, and both were resolved immediately. What impressed me even more is that this was taken as feedback and the fixes and features were incorporated in later releases.
Dynamic thresholds have been the major missing component in Nagios, and Bischeck addresses this in an excellent way. There is a slight learning curve involved, but it is definitely well worth the effort.
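As an illustration of the idea described above (a threshold derived from six historical data points), here is a minimal Python sketch. It is not Bischeck's configuration syntax or API. The function names, the use of a plain average, and the 80% factor are assumptions made for this example only.

```python
from statistics import mean

def dynamic_threshold(history, fraction=0.8):
    """Return a lower alert threshold as a fraction of the historical mean.

    `history` holds earlier readings of the same metric, for example taken
    at the same time of day on earlier days (an assumption for this sketch).
    """
    if not history:
        raise ValueError("need at least one historical data point")
    return fraction * mean(history)

def is_below_threshold(current, history, fraction=0.8):
    """True when the current reading falls below the dynamic threshold."""
    return current < dynamic_threshold(history, fraction)

# Six hypothetical QPS readings spread over the past month.
past_qps = [1000, 1100, 900, 1050, 950, 1000]
print(dynamic_threshold(past_qps))
print(is_below_threshold(700, past_qps))
```

Because the threshold is recomputed from readings that match the current time slot, a metric that is normally low at night does not trigger the alarm that a fixed daytime threshold would.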
Erik Larkin, System administrator at Gap
Erik Larkin, who works at retail giant The Gap’s ecommerce arm, uses Bischeck for advanced monitoring of online order rates that compares current rates to data from the same time of day and the same day of the week in previous weeks. He’s also building load checks with dynamic thresholds that alert only when one or more servers in a cluster have a significantly higher load than the other servers in the cluster, instead of best-guess static thresholds that are prone to false alarms.
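The cluster load check described above can be sketched roughly as follows. This is a plain Python illustration, not Bischeck configuration. The function name, the choice of the median of the other servers as the baseline, and the factor of 2 are assumptions for this example.

```python
from statistics import median

def overloaded_servers(loads, ratio=2.0):
    """Return the servers whose load exceeds `ratio` times the median
    load of the other servers in the same cluster."""
    flagged = []
    for name, load in loads.items():
        others = [value for other, value in loads.items() if other != name]
        if others and load > ratio * median(others):
            flagged.append(name)
    return flagged

# Hypothetical load averages for a four-node cluster.
cluster = {"web1": 1.2, "web2": 1.0, "web3": 4.5, "web4": 1.1}
print(overloaded_servers(cluster))  # ['web3']
```

When every server is equally busy, nothing is flagged, which is the behavior a static per-server threshold cannot provide.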
“Bischeck’s ability to set dynamic thresholds based on both current and historic data opens the door for some seriously advanced monitoring and analysis,” he said. “One particularly nice standout is its ability to apply a minimum or maximum value for those dynamic thresholds to avoid potential false alarms, such as when a previous day’s order rate was significantly higher than normal.”
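A minimal sketch of the minimum/maximum bound mentioned in the quote above, written in plain Python rather than Bischeck's own syntax. The function name and all numbers are hypothetical.

```python
def clamped_threshold(raw_threshold, minimum=None, maximum=None):
    """Limit a dynamically computed threshold to an optional [minimum, maximum] range."""
    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    threshold = raw_threshold
    if maximum is not None:
        threshold = min(threshold, maximum)
    if minimum is not None:
        threshold = max(threshold, minimum)
    return threshold

# Hypothetical case: an unusually high order rate in the reference period
# pushes the raw threshold to 4000 orders per hour.
raw = 4000
current_orders = 1600

print(current_orders < raw)                                   # True: false alarm
print(current_orders < clamped_threshold(raw, maximum=1500))  # False: no alarm
```

The cap keeps one abnormal reference period from turning an ordinary day into an alert.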
Peter Johansson, Head of Programs and IT product management at DHL Freight
DHL is one of the largest logistics companies in the world. DHL Freight is one of DHL's four business areas, with a focus on domestic and international logistics solutions for companies. With a turnover of €700 million, DHL Freight manages approximately 65,000 shipment orders per day, 120,000 items ranging in size from a single package to a complete haulier truck, and over a million daily scanning events generated by the inbound shipment order flow. A network of 24 terminals around Sweden manages the logistics distribution.
At a high level the business process is simple: getting a package from A to B. But since the process includes a number of applications and a large number of internal and external integrations that all must work under a tight time schedule, it is key to monitor the process to detect deviations from the expected business levels. DHL has enterprise monitoring solutions in place for IT monitoring, but from a business perspective DHL Freight relied on manual checks.
“Our situation was difficult since we could see that small incidents that were not detected early in the process caused major impact later in the process, with high costs and decreased customer quality.”
A solution for business activity monitoring, BAM, was needed to understand what goes on inside the business applications, but the traditional monitoring systems did not provide a solution.
“Defining and managing thresholds is key when doing monitoring, and our thresholds are not static. To monitor our shipment orders you have to understand that our threshold expectations are different depending on the time of the day and even the day of the week or month. We needed a monitoring system that had some dynamic threshold capability. We also needed adaptive thresholds since our business is process driven. That means that we wanted a solution where the monitored data, like shipment orders, can be used as a threshold in the next step of the process, for example when we monitor how many of these shipment orders are geographically coded for delivery and truck loading.”
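The adaptive threshold described in the quote above, where the measured value of one process step becomes the threshold for the next, could look roughly like this in Python. This is an illustrative sketch, not how Bischeck expresses it. The function name, the 95% requirement, and the downstream counts are assumptions.

```python
def adaptive_check(upstream_count, downstream_count, min_percent=95):
    """Check a downstream step against a threshold derived from the
    measured value of the upstream step."""
    threshold = upstream_count * min_percent / 100
    return {"threshold": threshold, "ok": downstream_count >= threshold}

# Hypothetical day: 65,000 shipment orders received, 60,000 geocoded so far.
result = adaptive_check(upstream_count=65000, downstream_count=60000)
print(result["threshold"], result["ok"])
```

The threshold here is never configured as a fixed number. It follows whatever volume the earlier step actually recorded.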
With Bischeck, DHL was able to get an advanced solution for business activity monitoring that supports dynamic and adaptive threshold management and integrates directly with any standard Nagios solution.
“Today people in the business get notifications over email and SMS if there is anything faulty in our processes. We can quickly and proactively react to problems that would have caused business disturbance. We no longer have to wait until our customers or users inform us about the problems, when it is often too late to fix them. Currently we monitor around 80 key application indicators, and the number is constantly growing.”