The Linux collector version 106 is now available. Notable improvements include:
- We now intercept Go processes on 64-bit CentOS and Debian
- Fixed bug with socket client/server determination
- Other minor bug fixes
The Linux collector version 105 is now available. Notable improvements include:
The Linux collector version 104 is now available. Notable improvements include:
Support for Zones & Containers
Jan 19, 2015
Back in April we announced the beta release of our Solaris Collector (as a reminder, the Solaris collector works with Solaris 10 and 11 on both SPARC and Intel platforms). The collector has been receiving rave reviews in beta, and the next step in improving Solaris collection was to support collecting within Solaris Zones. The AppFirst Collector now fully supports both Global and Non-Global Solaris Zones.
You can see all the details about how to install and view your zones in our docs: http://support.appfirst.com/docs/collectors/#zones
And why stop there?
We also decided you needed more detail on your Linux containers. The AppFirst Collector can now capture process and application details in specific Docker containers. See the support documentation for details on how to install on your host OS or directly into a container: http://support.appfirst.com/docs/collectors/#containers
Server Topology Mapping
Today we are happy to announce the release of Technology Preview: a gateway for AppFirst customers to try new tools and features that are currently in development. The goal of Technology Preview is to build in an early feedback loop for new features, allowing customers and partners to shape the tools that are most important to them. For more details, see our documentation.
The first tool being released into Technology Preview is the Server/Device Topology.
Server Topology allows users to see, at a quick glance, the health and communication paths of their infrastructure. It was created to show the topology of your Servers and Devices as well as the metrics, processes, and logs associated with a given server or device. You can set metric thresholds to quickly see the overall health of a Server and its respective elements.
This was initially designed with two audiences in mind: operators and application owners.
For those responsible for the data center, Server Topology can act as a first alert of pinch-points in your infrastructure. By viewing a quick and simple visual, you can quickly identify systems running hot.
For those responsible for the applications running on top of the infrastructure, Server Topology provides an easy-to-digest view of any application crossover between environments.
In this situation, you can apply server tags by application. For example, let’s say you tag all the servers hosting Zookeeper, used as the management layer of our big data store. In this scenario the environment is designed to be self-contained, so it is critical that each backend is separate and that the two do not communicate with each other. Looking at the example below, Server Topology allows users to quickly identify any crossover, which would indicate a misconfiguration on a server or in a line of code.
Viewing a topology is easy. Simply select a Server Tag to see its topology. If you want to see a specific layout, first configure a Server Tag comprising those servers under Admin – Organize Servers.
The graph shows a topology of your servers and devices. Each server/device is colored according to its health, which is determined based on the thresholds on the right of the screen.
Red indicates that a process is exceeding a set threshold
Yellow indicates that a process is above eighty percent (80%) of the threshold
Green indicates that a process is healthy according to the set thresholds
Grey indicates that no process information can be found for that server or device
These colors percolate upwards; that is, if any process exceeds a threshold and becomes red, the server/device will also be red. You will only see green if all of the processes associated with that server/device are below all the thresholds. The health of each process is recalculated when thresholds are set or added. Similarly, when a metric is deleted, the health colors are recalculated using only the remaining thresholds. You can edit the min, max and step values of the thresholds.
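The color rules can be sketched in a few lines of Python. This is only an illustrative reading of the rules described here, not AppFirst’s implementation: the function names, the metric names and the handling of a value that sits exactly on a threshold are all assumptions.

```python
from typing import Dict, List

# Severity order used when colors percolate upwards (worst color wins).
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def metric_color(value: float, threshold: float) -> str:
    """Red above the threshold, yellow above 80% of it, otherwise green."""
    if value > threshold:
        return "red"
    if value > 0.8 * threshold:
        return "yellow"
    return "green"


def process_color(metrics: Dict[str, float], thresholds: Dict[str, float]) -> str:
    """Worst color across the metrics that currently have a threshold set."""
    colors = [
        metric_color(metrics[name], limit)
        for name, limit in thresholds.items()
        if name in metrics
    ]
    return max(colors, key=SEVERITY.get, default="green")


def server_color(processes: List[Dict[str, float]], thresholds: Dict[str, float]) -> str:
    """Grey when no process information exists, otherwise the worst process color."""
    if not processes:
        return "grey"
    return max((process_color(p, thresholds) for p in processes), key=SEVERITY.get)


# Hypothetical thresholds and two processes on one server.
thresholds = {"cpu": 50.0, "memory": 100.0}
processes = [{"cpu": 10.0, "memory": 20.0}, {"cpu": 45.0, "memory": 20.0}]
print(server_color(processes, thresholds))
```

Because each call recomputes the colors from whatever thresholds are passed in, removing a threshold from the dictionary mirrors the recalculation that happens when a metric is deleted.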
Selecting a server or device populates a table below the topology graph, showing which processes and metrics are causing problems by highlighting the cells that exceed the set thresholds.
Click to see Logs associated with server
Click to see Correlate, allowing you to drill down to very specific granular metrics on a given server
Click to return to the Processes list
The Server Topology tool is currently in Technology Preview. We’d love to hear your feedback on how this is helpful, how it could be improved, and what you like or dislike. Email email@example.com or give feedback directly from the tool by clicking the feedback icon in the top right:
Key points for new API v5:
A customer of ours recently told us about their use-case for extending the AppFirst collector to support the data center team. Ops was already using AppFirst’s platform to collect, aggregate and correlate massive amounts of application & operations data for monitoring & troubleshooting. However, there was a glaring concern about how long it would take the data center guys (and gals) to log into their own system and check the current status and history of their hardware data.
In turn, they extended the AppFirst collector to capture IPMI sensor data and monitor the environmental trends of their physical hardware. With both sets of data in one place, they were able to see key metrics from the physical hardware all the way up to business performance.
Specifically, the data center team needed to track a few key metrics and analyze them over time to see whether any were trending towards failure.
Fans: Are the system’s fans about to fail, working too hard or experiencing degradation?
Temperature: Is the system temperature anything but stable?
Voltage: Has a power supply gone bad? Of particular interest is the Power Good signal, which prevents a computer from attempting to operate on improper voltages and damaging itself by alerting it to an improper power supply (1).
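One simple way to look for such a trend is a least-squares slope over recent readings. The sketch below is illustrative only: trend_slope is a hypothetical helper, the readings are made-up fan speeds rather than measurements, and it assumes the samples were taken at evenly spaced intervals.

```python
from typing import Sequence


def trend_slope(readings: Sequence[float]) -> float:
    """Least-squares slope of readings against their sample index."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Made-up fan speeds (RPM) from four consecutive polls.
fan_rpm = [2200.0, 2190.0, 2180.0, 2170.0]
slope = trend_slope(fan_rpm)
print("RPM change per poll:", slope)
```

A persistently negative slope on a fan, or a positive one on a temperature sensor, is the kind of drift worth alerting on before a hard threshold is crossed.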
First, a little background. AppFirst’s collector supports the ingestion of multiple types of additional data: logs, StatsD and polled data. Polled data can come from scripts that use the Nagios output format to collect data from APIs, management interfaces, SNMP, IPMI or any other available source.
AppFirst supports server hardware equipped with a BMC controller, which exposes a variety of hardware sensor data that can be monitored using the IPMI standard. IPMI sensor data can be gathered in-band (through the host O/S) or out-of-band (through a dedicated LAN connection independent of the system processor and host O/S). A free tool, FreeIPMI, is available to examine IPMI data, and AppFirst has written a series of scripts that allow you to take advantage of AppFirst’s big-data store and visualization tools to easily monitor this data.
Here is an example that uses in-band communications (out-of-band also works). You can do this in two ways: with polled data or with log data.
To run either method you will need to install FreeIPMI and OpenIPMI on your system.
You can use the source from here: FreeIPMI, or you may be able to find an RPM package compatible with your OS.
Before moving any further along, make sure that you can execute the ipmi-sensors command. The output looks like this (but much longer):
[appfirst@servername: ~]$ sudo /usr/sbin/ipmi-sensors
ID | Name | Type | Reading | Units | Event
1 | Pwr Unit Status | Power Unit | N/A | N/A | 'OK'
2 | IPMI Watchdog | Watchdog 2 | N/A | N/A | 'OK'
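If you want to inspect this output programmatically before wiring it into the collector, the pipe-delimited table is easy to parse. The following is a minimal sketch, assuming the column layout shown in the sample; parse_ipmi_sensors is a hypothetical helper and not part of FreeIPMI or the AppFirst collector.

```python
from typing import Dict, List


def parse_ipmi_sensors(text: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited ipmi-sensors output into one dict per sensor row."""
    rows = [line for line in text.splitlines() if "|" in line]
    if not rows:
        return []
    header = [cell.strip() for cell in rows[0].split("|")]
    records = []
    for line in rows[1:]:
        # Strip padding and the single quotes around the Event column.
        cells = [cell.strip().strip("'") for cell in line.split("|")]
        if len(cells) == len(header):
            records.append(dict(zip(header, cells)))
    return records


# The sample rows from the output shown in this post.
SAMPLE = """\
ID | Name | Type | Reading | Units | Event
1 | Pwr Unit Status | Power Unit | N/A | N/A | 'OK'
2 | IPMI Watchdog | Watchdog 2 | N/A | N/A | 'OK'
"""

for sensor in parse_ipmi_sensors(SAMPLE):
    print(sensor["Name"], sensor["Event"])
```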
Once you have ipmi-sensors working, you will need a Perl script, check_ipmi_sensors, to format the output to be Nagios-compatible. We currently install the required script when you install your collector, and it is located here:
If not, it’s available here: https://github.com/appfirst/nagios-plugins/blob/master/check_ipmi_sensors
You can test that it’s working by executing this command:
sudo /usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t fan
and you should get output that looks like this:
fan OK | System_Fan_1=2156.00;System_Fan_2=2156.00;System_Fan_3=2156.00; System_Fan_4=2156.00;Processor_1_Fan=2450.00
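To sanity-check the readings outside the collector, the line can be split into a status and a set of name=value pairs. This is a rough sketch based only on the sample output shown here (pairs separated by semicolons, possibly followed by whitespace); parse_check_output is a hypothetical helper.

```python
import re
from typing import Dict, Tuple


def parse_check_output(line: str) -> Tuple[str, Dict[str, float]]:
    """Split 'status | name=value;name=value' into the status and its readings."""
    status, _, perfdata = line.partition("|")
    readings: Dict[str, float] = {}
    # Pairs are separated by semicolons, optionally followed by whitespace.
    for pair in re.split(r"[;\s]+", perfdata.strip()):
        name, sep, value = pair.partition("=")
        if sep:
            readings[name] = float(value)
    return status.strip(), readings


# The sample line from this post.
LINE = (
    "fan OK | System_Fan_1=2156.00;System_Fan_2=2156.00;System_Fan_3=2156.00; "
    "System_Fan_4=2156.00;Processor_1_Fan=2450.00"
)

status, readings = parse_check_output(LINE)
print(status, sorted(readings))
```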
From your AppFirst UI, select Admin → Setup → Polled Data from the top menu, and locate your collector. Click on the Server Hostname and add lines similar to these to the config file:
command[sensor_fan]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t fan
command[sensor_temp]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t temperature
command[sensor_voltage]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t voltage
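If you maintain this config for many sensor types or servers, the lines can be generated rather than typed. The sketch below reproduces the commands shown here; polled_data_lines and SENSORS are illustrative names and not part of the collector.

```python
from typing import Dict, List

# Plugin path as used in the commands in this post.
PLUGIN = "/usr/share/appfirst/plugins/libexec/check_ipmi_sensors"


def polled_data_lines(sensors: Dict[str, str]) -> List[str]:
    """Build one command[...] line per (short name, sensor type) pair."""
    return [
        f"command[sensor_{name}]={PLUGIN} -t {sensor_type}"
        for name, sensor_type in sensors.items()
    ]


# Short names mapped to the sensor types passed to -t.
SENSORS = {"fan": "fan", "temp": "temperature", "voltage": "voltage"}

print("\n".join(polled_data_lines(SENSORS)))
```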
The -t parameter is the sensor type. To get a list of sensor types, execute:
sudo /usr/sbin/ipmi-sensors -L
Note that your hardware may not report all of these.
Save the file, then give the collector up to five minutes to pick up the new config and start polling the device. In the Correlate tool, you will soon find the new polled data commands available for display.
You can monitor the output of any command as log data by simply capturing its output and appending it to a file, and then configuring the collector to monitor that file as a log file. One way to do that would be to execute the command as a cron job. You would also want to configure logrotate to prevent excess consumption of disk space.
An alternative is to use our simple bash script, poll2log, and leave the storage to us. It simulates log rotation, and only stores the output from one execution. This script should have been installed when you installed your collector, and should be located here:
But if not, you can find it here: https://github.com/appfirst/nagios-plugins/blob/master/poll2log
Add a line like this to your crontab (probably as root):
*/5 * * * * /usr/share/appfirst/plugins/libexec/poll2log "/usr/local/sbin/ipmi-sensors --output-sensor-state" /var/log/ipmi-sensors.log
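To make the idea concrete, here is a rough Python stand-in for the behavior described for poll2log: run a command and keep only the output of the latest execution in the log file. It is not the poll2log source; poll_to_log is a hypothetical name, and the sketch assumes the command can be run without a shell.

```python
import shlex
import subprocess
from pathlib import Path


def poll_to_log(command: str, log_path: str) -> int:
    """Run command and overwrite log_path with its output; return the exit code."""
    result = subprocess.run(shlex.split(command), capture_output=True, text=True)
    # Overwriting (not appending) keeps only the most recent execution on disk.
    Path(log_path).write_text(result.stdout)
    return result.returncode
```

Called from cron every five minutes, a helper like this would leave the file holding a single snapshot of the sensor state, so no separate logrotate rule would be needed for it.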
From your AppFirst dashboard, select Administration | Logs, and click the Add Log button (upper left). Find your server in the pull-down list, set the Type to “File”, and the File Path to /var/log/ipmi-sensors.log. Save the configuration, and give the collector a few minutes to receive the configuration. In the Correlate tool, you will soon find the new log files that can be displayed.
You can monitor the IPMI sensor data from another machine that has a collector installed. This allows you to see the sensor data when the target machine does not have a collector installed. You simply need to add the host information to the command line:
command[sensor_fan]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -h 10.7.7.7 -s "-u username -p password" -t fan
Alternatively, the username can come from an environment variable: set it with "export IPMI_USER=user" and replace "-u username" with "-u $IPMI_USER".
Intel BMC Web Console Users Guide: http://download.intel.com/support/motherboards/server/sb/intel_rmm4_ibwc_userguide_r2_72.pdf
(1) Wikipedia: http://en.wikipedia.org/wiki/Power_good_signal
The Linux collector version 101 is now available. Notable improvements include: