Bischeck 1.1.0 is released

Release 1.1.0 is a minor upgrade of Bischeck with some new features. The documentation has been updated to reflect the changes in 1.1.0. You should also check the release notes before you start upgrading from 1.0.X, which is the only previous version supported for upgrade. If you are still on 0.4.3, upgrade first to 1.0.2 and then to 1.1.0.

In the “Configuration” and “Installation and administration” guides we have marked all 1.1.0 changes with the label [1.1.0], so they should be easy to search for.

Thanks to everybody who has tested this version. Special thanks go to Pasquale Settanni at Eutelsat Broadband for his testing effort and valuable feedback, as always.

As usual we look forward to your feedback.

New features

  • Command line utility to explore cache content. It supports the full syntax of Bischeck mathematical expressions, which enables simple testing of threshold expressions and virtual services. For more information see the “Bischeck – Installation and administration guide”.
  • Server integration with Librato, https://metrics.librato.com/, is now supported. It enables Bischeck metrics to be sent to Librato’s cloud monitoring service. For more information see the “Bischeck – Configuration guide”.
  • NRDP server integration is supported over SSL. Set the property ssl in the NRDP section of server.xml to enable SSL. The default is false.
  • Support for disabling SSL (X.509) certificate validation for connections over HTTPS, such as NRDP. Set the property disableCertificateValidation in properties.xml. Disabling validation has its risks; you have been warned. The default is false. A more secure way to manage certificates is to create a local keystore for Bischeck, see http://docs.oracle.com/javase/6/docs/technotes/tools/solaris/keytool.html. This also requires additional Java system properties, which have to be added in the $BISHOME/bin/bischeck script. Plenty of documentation exists on the web.
  • Support for Jolokia, http://www.jolokia.org/, for JMX remoting. Jolokia is a JMX agent that supports HTTP/JSON access and avoids the problems of the standard JMX agent, which uses RMI. RMI is especially problematic in network environments with firewalls. With Jolokia it's simple to tunnel the JMX connection over ssh. Jolokia also provides fine-grained security and access capabilities. The RMI-based JMX agent is still the default, but that will change in future releases of Bischeck. If you would like to use Jolokia with Bischeck 1.1.0, just uncomment row 53 and comment out row 52 in the $BISHOME/bin/bischeck script. Two additional configuration files have been added to the $BISHOME/resources directory to control the behavior of Jolokia:
    • jolokia.conf – basic settings, such as the port. Read more in the “JVM agent” chapter at http://www.jolokia.org/reference/html/agents.html.
    • jolokia-access.xml – defines policy-based security. More info at http://www.jolokia.org/reference/html/security.html.
  • Added a function to calculate the standard deviation of a series of data.
  • Added a function to calculate the median value of a series of data.
  • [FR-252] “Adding the hour level to the period definition”. This feature request enables fine-grained control of the warning and critical levels for a specific hour interval.

.... 
<period>
  <months> 
    <dayofmonth>25</dayofmonth> 
  </months> 
  <calcmethod>></calcmethod> 
  <warning>10</warning> 
  <critical>20</critical> 
  <hoursIDREF>101</hoursIDREF> 
</period> 
.... 
<hours hoursID="101"> 
  <hourinterval> 
    <from>00:00</from> 
    <to>11:00</to> 
    <threshold>1000</threshold> 
  </hourinterval> 
  <hourinterval> 
    <from>12:00</from> 
    <to>24:00</to> 
    <threshold>2000</threshold> 
    <!-- Override the values from the period section -->
    <warning>20</warning> 
    <critical>30</critical> 
  </hourinterval> 
</hours>
.... 

Between 00:00 and 11:59 the warning and critical values from the period section are used, and between 12:00 and 23:59 the warning and critical “override” values are used. The threshold between 11:00 and 12:00 is calculated with a linear equation, starting at 1000 at 11:00 and reaching 2000 at 12:00, while the warning and critical values in that interval are the ones from the period section. For more information see the “Bischeck – Configuration guide”.
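
To make the linear calculation between the two hour intervals concrete, here is a minimal Python sketch. It is not taken from the Bischeck source; the function names are made up for illustration, and the values come from the example configuration above.

```python
from datetime import time


def minutes_since_midnight(t: time) -> int:
    """Convert a time of day to minutes since 00:00."""
    return t.hour * 60 + t.minute


def interpolate_threshold(now, gap_start, start_value, gap_end, end_value):
    """Linearly interpolate a threshold inside the gap between two hour intervals.

    Hypothetical helper: gap_start/gap_end are the end of one interval and
    the start of the next, start_value/end_value are their thresholds.
    """
    span = minutes_since_midnight(gap_end) - minutes_since_midnight(gap_start)
    elapsed = minutes_since_midnight(now) - minutes_since_midnight(gap_start)
    return start_value + (elapsed / span) * (end_value - start_value)


# Threshold at 11:30, halfway between 1000 (11:00) and 2000 (12:00).
print(interpolate_threshold(time(11, 30), time(11, 0), 1000, time(12, 0), 2000))
# prints 1500.0
```

The warning and critical levels are not interpolated in this sketch; according to the description above they stay at the period values during the gap.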

  • Testing of threshold rules has been enhanced. The bin/bischeck threshold.Twenty4HourThreshold command lists the resolved threshold configuration for a given service definition and date, and in addition calculates the state and threshold for a specific measured value at a given time of day. For thresholds that are based on cache expressions, the threshold is calculated if the data is available in the cache. For more information see the “Bischeck – Installation and administration guide”.
  • [FR-254] “Enable to test service in op5 web interface”. This request is not limited to Nagios/OP5; it adds the capability to execute a service and its serviceitems on demand in Bischeck. This functionality has been implemented using JMX. The MBean is called com.ingby.socbox.bischeck.service:type=ExecuteServiceOnDemand and has a method with the following signature:
    boolean execute(java.lang.String host, java.lang.String service)
    If you are using Jolokia as the JMX agent, a valid call to execute the service sshport for host moon would be:
    $ curl http://localhost:7777/jolokia/exec/com.ingby.socbox.bischeck.service:type=ExecuteServiceOnDemand/execute/moon/sshport
    
    {"timestamp":1400018354,"status":200,"request":{"operation":"execute","mbean":"com.ingby.socbox.bischeck.service:type=ExecuteServiceOnDemand","arguments":["moon","sshport"],"type":"exec"},"value":true}

    To use the function from Nagios as a check command, there are a number of things to consider (we have not implemented one, that's your task):

    • Use the $HOSTNAME$ macro as the host parameter.
    • Use the $SERVICEDESC$ macro as the service parameter.
    • Make sure that the check command returns the same status as the current status, since the “real” status will come from Bischeck through the normal passive check and not from the check command. That means the check command must also use the macro $SERVICESTATEID$ to return the same value so the state is not changed.
    • It's also important to understand that the on-demand function will not return any performance data. The MBean only returns true if the job could be scheduled, and false if the host and/or service name does not exist or the scheduling fails.
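
The considerations above could be combined into a check command along the following lines. This is only a sketch under assumptions, not something shipped with Bischeck: it assumes Jolokia listens on localhost:7777 as in the curl example, the function names are hypothetical, and host or service names containing "/" would need Jolokia's own path escaping, which is not handled here.

```python
import json
import urllib.parse
import urllib.request

# Assumed Jolokia base URL, taken from the curl example above.
JOLOKIA_BASE = "http://localhost:7777/jolokia"
MBEAN = "com.ingby.socbox.bischeck.service:type=ExecuteServiceOnDemand"


def build_exec_url(host, service, base=JOLOKIA_BASE):
    """Build the Jolokia exec URL for the execute(host, service) operation."""
    args = "/".join(urllib.parse.quote(a, safe="") for a in (host, service))
    return f"{base}/exec/{MBEAN}/execute/{args}"


def was_scheduled(body):
    """Return True only if Jolokia answered with status 200 and value true."""
    reply = json.loads(body)
    return reply.get("status") == 200 and reply.get("value") is True


def fetch(url, timeout=10):
    """Fetch the response body of a GET request as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def check(host, service, state_id, fetcher=fetch):
    """Trigger the on-demand execution and echo back the current state.

    The exit code is always the state passed in ($SERVICESTATEID$), so the
    state in Nagios is left unchanged; the real result arrives later as a
    passive check from Bischeck.
    """
    try:
        scheduled = was_scheduled(fetcher(build_exec_url(host, service)))
    except (OSError, ValueError):
        scheduled = False
    message = ("on-demand execution scheduled" if scheduled
               else "on-demand execution could not be scheduled")
    return int(state_id), message


# A wrapper script would call check() with the values Nagios passes for
# $HOSTNAME$, $SERVICEDESC$ and $SERVICESTATEID$, print the message and
# exit with the returned code.
```

Returning the incoming state id even when scheduling fails is a design choice of this sketch; it follows the advice above that the check command should never change the state itself.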

Bugs fixed and important issues

  • [TR-257] “No Nagios state on null”. This bug caused no state information to be sent to Nagios for a service if the metric of any of its serviceitems was null.
  • Bischeck no longer checks whether the pid file exists on startup. The check was removed because there is no standard way to determine the pid of the running process from Java. Instead, this is now up to the bischeckd script.
  • If many service definitions are configured with aggregation, the peak load every hour can be very high. In previous releases the schedule was kicked off with a cron definition where the second was set to 0. With 1.1.0 the second field is set to a random value between 0 and 59. This distributes the scheduling of aggregations over an interval of one minute.
  • The install script with the upgrade option -X now copies the existing logback.xml configuration in addition to all configuration files in the etc directory.
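
The randomized second field used for aggregation scheduling, described in the list above, can be sketched as follows. This is an illustration only and not the Bischeck code: the hourly expression "0 0 * * * ?" is an assumed cron definition with the seconds field first, and the function name is made up.

```python
import random


def spread_seconds(cron_expr, rng=random):
    """Replace the seconds field (first field) with a random value 0-59."""
    fields = cron_expr.split()
    fields[0] = str(rng.randint(0, 59))
    return " ".join(fields)


# Assumed hourly schedule; each aggregation job gets its own second offset,
# so the jobs no longer all fire at second 0 of the hour.
for _ in range(3):
    print(spread_seconds("0 0 * * * ?"))
```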
