Monitoring IPMI Sensor Data

A customer of ours recently told us about their use case for extending the AppFirst collector to support their data center team. Ops was already using AppFirst’s platform to collect, aggregate and correlate massive amounts of application and operations data for monitoring and troubleshooting. However, there was a glaring concern about how long it took the data center guys (and gals) to log into their own system and check the current status and history of their hardware data.

In turn, they extended the AppFirst collector to capture IPMI sensor data and monitor the environmental trends of their physical hardware. With both sets of data in one place, they were able to see key metrics from the physical hardware all the way up to business performance.

Specifically, the data center team needed to track a few key metrics and analyze them over time to see whether any were trending toward failure:

  1. Fans: Are the system’s fans about to fail, working too hard or experiencing degradation?

  2. Temperature: Is the system temperature anything but stable?

  3. Voltage: Has a power supply gone bad? Specifically, they watch the Power Good signal, which alerts the computer to an improper power supply and so prevents it from operating on improper voltages and damaging itself (1).
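One simple way to look for such a trend is to fit a least-squares line to recent readings and flag a sustained negative slope. The Python sketch below is only an illustration under our own assumptions: the function names, the sample fan readings and the threshold are hypothetical and are not part of the AppFirst product.

```python
def slope_per_sample(readings):
    """Return the least-squares slope of readings against their sample index."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def is_degrading(readings, max_drop_per_sample):
    """True when readings fall faster than max_drop_per_sample on average."""
    return slope_per_sample(readings) < -max_drop_per_sample


# Hypothetical fan speeds (RPM), one reading per polling interval.
fan_rpm = [2156, 2150, 2141, 2120, 2098, 2075]
print(is_degrading(fan_rpm, max_drop_per_sample=10))
```

A slope is less sensitive to a single noisy reading than comparing the first and last samples, which is why it is used here; the right threshold depends on the hardware.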

How to get started

First, a little background. AppFirst’s collector supports the ingestion of multiple types of additional data: logs, StatsD and polled data. Polled data can come from scripts that use the Nagios output format to collect data from APIs, management interfaces, SNMP, IPMI or any other available source.

Collecting IPMI Data

AppFirst supports server hardware equipped with a BMC (baseboard management controller), which exposes a variety of hardware sensor data that can be monitored using the IPMI standard. IPMI sensor data can be gathered in-band (through the host OS) or out-of-band (through a dedicated LAN connection independent of the system processor and host OS). A free tool, FreeIPMI, is available to examine IPMI data, and AppFirst has written a series of scripts that let you take advantage of AppFirst’s big-data store and visualization tools to easily monitor this data.

Here is an example that uses in-band communication (out-of-band also works). You can do this in two ways: with polled data or with log data.

To run either method you will need to install FreeIPMI and OpenIPMI on your system.

You can build it from source, available from FreeIPMI, or you may be able to find an rpm package compatible with your OS.

Before moving any further along, make sure that you can execute the ipmi-sensors command. The output looks like this (but much longer):

[appfirst@servername: ~]$ sudo /usr/sbin/ipmi-sensors
ID | Name            | Type       | Reading | Units | Event
1  | Pwr Unit Status | Power Unit | N/A     | N/A   | 'OK'
2  | IPMI Watchdog   | Watchdog 2 | N/A     | N/A   | 'OK'
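If you want to inspect this output programmatically before wiring it into the collector, the pipe-delimited table is easy to split. The following Python sketch is our own illustration (the function name is hypothetical, and it assumes the six-column layout shown above, which may differ between FreeIPMI versions):

```python
def parse_ipmi_sensors(text):
    """Turn pipe-delimited ipmi-sensors output into a list of dicts."""
    lines = [line for line in text.splitlines() if "|" in line]
    if not lines:
        return []
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        # Event values are wrapped in single quotes in the sample output.
        cells = [cell.strip().strip("'") for cell in line.split("|")]
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


SAMPLE = """\
ID | Name            | Type       | Reading | Units | Event
1  | Pwr Unit Status | Power Unit | N/A     | N/A   | 'OK'
2  | IPMI Watchdog   | Watchdog 2 | N/A     | N/A   | 'OK'
"""

for sensor in parse_ipmi_sensors(SAMPLE):
    if sensor["Event"] != "OK":
        print("needs attention:", sensor["Name"], sensor["Event"])
```

With the sample above nothing is printed, because both sensors report OK.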

Polled Data Configuration:

Once you have ipmi-sensors working, you will need a Perl script, check_ipmi_sensors, to format the output to be Nagios-compatible. We currently install this script along with the collector, so it should already be located here:

/usr/share/appfirst/plugins/libexec/check_ipmi_sensors

If not, it’s available here: https://github.com/appfirst/nagios-plugins/blob/master/check_ipmi_sensors

You can test that it’s working by executing this command:

sudo /usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t fan

and you should get output that looks like this:

fan OK | System_Fan_1=2156.00;System_Fan_2=2156.00;System_Fan_3=2156.00;
System_Fan_4=2156.00;Processor_1_Fan=2450.00
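To make the shape of that output concrete, here is a small Python sketch that splits it into a status string and a dictionary of readings. It is only an illustration: the function name is ours, and it treats both semicolons and whitespace as separators to match the sample above. Standard Nagios performance data uses semicolons for threshold fields, so this sketch is not a general Nagios parser.

```python
def parse_check_output(output):
    """Split 'status | label=value;label=value' into (status, readings)."""
    status, _, perfdata = output.partition("|")
    readings = {}
    # The sample separates metrics with ';' and may wrap across lines.
    for item in perfdata.replace(";", " ").split():
        label, sep, value = item.partition("=")
        if not sep:
            continue
        try:
            readings[label] = float(value)
        except ValueError:
            continue  # skip anything that is not a plain number
    return status.strip(), readings


SAMPLE = (
    "fan OK | System_Fan_1=2156.00;System_Fan_2=2156.00;System_Fan_3=2156.00;\n"
    "System_Fan_4=2156.00;Processor_1_Fan=2450.00"
)

status, readings = parse_check_output(SAMPLE)
print(status, readings["Processor_1_Fan"])
```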

From your AppFirst UI, select Admin → Setup → Polled Data from the top menu, and locate your collector. Click on the Server Hostname and add lines similar to these to the config file:

command[sensor_fan]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t fan
command[sensor_temp]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t temperature
command[sensor_voltage]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t voltage

The -t parameter is the sensor type. To get a list of sensor types, execute:

sudo /usr/sbin/ipmi-sensors -L

Note that your hardware may not report all of these.

Save the file, then give the collector up to five minutes to pick up the new config and start polling the device. In the Correlate tool, you will soon find the new polled data commands that can be displayed.

Log Data Configuration:

You can monitor the output of any command as log data by simply capturing its output and appending it to a file, and then configuring the collector to monitor that file as a log file. One way to do that is to execute the command as a cron job. You would also want to configure logrotate to prevent excess consumption of disk space.

An alternative is to use our simple bash script, poll2log, and leave the storage to us. It simulates log rotation and stores the output from only one execution. This script should have been installed when you installed your collector, and should be located here:

/usr/share/appfirst/plugins/libexec/poll2log

But if not, you can find it here: https://github.com/appfirst/nagios-plugins/blob/master/poll2log

Add a line like this to your crontab (probably as root):

*/5 * * * * /usr/share/appfirst/plugins/libexec/poll2log "/usr/local/sbin/ipmi-sensors --output-sensor-state" /var/log/ipmi-sensors.log
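For readers curious about what such a wrapper does, here is a rough Python sketch of the behavior described above. It is our own reading of "stores the output from one execution" (overwrite the file on every run), not the actual poll2log script, and the function name is hypothetical.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def poll_to_log(command, log_path):
    """Run command once and overwrite log_path with its standard output."""
    result = subprocess.run(
        shlex.split(command), capture_output=True, text=True, check=False
    )
    # Mode "w" truncates the file, so only the latest run is kept on disk.
    with open(log_path, "w") as log_file:
        log_file.write(result.stdout)
    return result.returncode


if __name__ == "__main__":
    # Stand-in command so the demo runs without IPMI hardware.
    demo_command = shlex.join([sys.executable, "-c", "print('fan OK')"])
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_log = os.path.join(tmp_dir, "ipmi-sensors.log")
        poll_to_log(demo_command, demo_log)
        with open(demo_log) as log_file:
            print(log_file.read(), end="")
```

Because the file never holds more than one run, disk usage stays bounded without a separate logrotate rule.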

From your AppFirst dashboard, select Administration | Logs, and click the Add Log button (upper left). Find your server in the pull-down list, set the Type to “File”, and the File Path to /var/log/ipmi-sensors.log. Save the configuration, and give the collector a few minutes to receive the configuration. In the Correlate tool, you will soon find the new log files that can be displayed.

Out-of-Band Configuration:

You can monitor the IPMI sensor data from another machine that has a collector installed. This lets you see the sensor data when the target machine does not have a collector installed. You simply need to add the host information to the command line:

command[sensor_fan]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -h 10.7.7.7 -s "-u username -p password" -t fan

Alternatively, the username can come from an environment variable:

export IPMI_USER=user

and then be passed as "-u $IPMI_USER" in the command.
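As an illustration of keeping credentials out of the config text, the Python sketch below assembles the out-of-band command from environment variables. The -h, -s and -t flags are the ones shown above; the function name and the IPMI_PASSWORD variable are our own assumptions, and whether the collector's environment exposes these variables to polled commands is something to verify on your system.

```python
import os
import shlex

PLUGIN = "/usr/share/appfirst/plugins/libexec/check_ipmi_sensors"


def out_of_band_command(host, sensor_type, env=os.environ):
    """Build the check_ipmi_sensors command line for a remote BMC."""
    user = env["IPMI_USER"]
    password = env["IPMI_PASSWORD"]  # hypothetical variable name
    ipmi_options = f"-u {user} -p {password}"
    argv = [PLUGIN, "-h", host, "-s", ipmi_options, "-t", sensor_type]
    # shlex.join quotes the options so they stay a single -s argument.
    return shlex.join(argv)


demo_env = {"IPMI_USER": "username", "IPMI_PASSWORD": "password"}
print(out_of_band_command("10.7.7.7", "fan", env=demo_env))
```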

Links that may be of interest:

Intel BMC Web Console Users Guide: http://download.intel.com/support/motherboards/server/sb/intel_rmm4_ibwc_userguide_r2_72.pdf

(1) Wikipedia: http://en.wikipedia.org/wiki/Power_good_signal

View your Server Data faster

Many users loved the level of detail at which they could view server data in the old version of our product – they could quickly choose a server from a list, and immediately see a wide variety of information about processes, logs, metrics, and alerts for that particular server.

We’ve brought back many of the features of the old servers page into our new user interface, making it even faster to view important server metrics and data.

Server metrics, log messages, triggered alerts, and polled data are organized into tabs. We’ve also added a new section – Collector Info – which tells you which collector version the server is running and when the collector last uploaded data.

You can use your left and right arrow keys to quickly switch between tabs.

The dropdown above the list of servers lets you filter servers by running/not running, as well as by any server tags or server groups that are configured. You can use your up and down arrow keys to quickly cycle through the list of servers.

The new servers page is available today!

New & Improved Dashboard Management

Our latest update to the AppFirst Product is a new and improved interface for creating, editing, and navigating between your Dashboards.

In an effort to save space and to give users a faster way to manage and switch between their Dashboards, we’ve moved them to a dropdown menu in the upper left-hand part of the product.

Creating a new Dashboard is easy – just click the + button to the right of the dropdown, and give it a name. Once it’s created, the Dashboard will appear in the dropdown.

If your list of Dashboards gets long, using the search box can help you find the Dashboard you’re looking for much faster.

Hovering over a Dashboard in the dropdown reveals options to rename, duplicate, or delete it.

Grabbing and dragging the leftmost icon lets you manually sort Dashboards in the list.

Our product now remembers which Dashboard you last visited; when navigating back to the product after leaving the site, you will automatically be directed to the Dashboard you previously had active.

The new Dashboard management interface is live and ready-to-use for everyone. If you have any feedback, make sure to let us know!

Windows Collector Version 67 Now Available

The Windows collector version 67 is now available. Updates include:

  • Increased stability in collector and Windows library
  • Decreased bandwidth requirements
  • Increased support for programs that intercept functions (such as Citrix XenApp)
  • Troubleshooting improvements
  • General bug fixes

Get this new version by going to Administration – Collectors – Update Collectors. 

It’s Here: The New AppFirst UI

We’ve been working very hard in recent weeks to prepare for the release of our updated user interface here at AppFirst. We are excited about all the new features and over the next few months we will add even more functionality and flexibility to our product.

I would like to share some of the new features available with the new user interface. There are many changes and improvements that have been made throughout the UI and we’ll touch on some of them in this blog post.

For starters, we have incorporated responsive design into many of our views and are moving toward a full HTML5 implementation. With the plethora of devices on the market today, we wanted a better experience for our customers across all devices. Additionally, we have improved several of the workflows to make them more intuitive and easier to work with.

So what are some of the new features? Here are some of the highlights:

Autodetect

Autodetect is a great new feature for our users. AppFirst has always supported some level of autodetection; however, the new Autodetect is much more integrated into the user interface, with “opt in” and “opt out” options at every step. For every software component we find, Autodetect can find and configure alerts, log sources, polled data items, process groups, and dashboards.

First-time users will use Autodetect when they first install a collector. However, Autodetect is also available through Admin so users can run it at any time on any number of servers they choose. We are really excited about this new feature and we think our customers will be as well.

Dashboards

Customers have always loved our dashboards. Now we’ve taken them to the next level with new features customers have been asking for: the ability to duplicate dashboards, copy widgets from one dashboard to another, copy widgets within a dashboard, and a new feature called TV Mode. TV Mode allows customers to display their favorite dashboards on a large centralized TV within their office or network operations center. Because some computer monitors are on the small side, TV Mode allows you to display more widgets than will fit in the normal view. Simply choose how many widgets to view and how often to rotate views, and TV Mode will rotate through all your widgets in an “ad rotator” fashion.

Dashboard Widget Market

Our new Widget Market is where users will find all of the widgets that can be added to dashboards. Widgets provide small templated views of specific metrics, whether it’s system metrics, log metrics, process metrics, or any other data source in AppFirst.

Many of our widgets now support an extra level of functionality as well.

  • The ability to drill down with Correlate in context of the data from that widget.
  • Using the Polled Data Metrics widget, you can create a more detailed view of specific application components, such as Apache, IIS, Oracle, and so on. We call this the “Widget Detail” view.

The market is still in its early stages and we plan to add many more widgets over the coming weeks. Stay tuned for updates!

Servers

The Servers view has also been redesigned to quickly and easily provide more information about all your servers, VMs, or cloud instances. Like many of the views in the new user interface, there are more options for searching, filtering, and paging large numbers of servers. Multiple metrics are shown for multiple servers and users can toggle between a list-view and a grid-view. Clicking on a single server displays a wide range of details in a tabbed view format.

Consolidated Alerts

Alert History and Alert Status have been combined into a single new Alerts view, establishing a logical workflow to investigate alerts. Use the visual bar chart to easily identify the volume and criticality of alerts over time. You can then dive into the time interval you want to investigate and view the alerts that triggered at that point in time. We included a Resolve button so you can clear alerts of their critical or warning status, indicating to your team that things are now OK.

Administration

Virtually everything in Admin has been logically changed or redesigned. This new design offers a streamlined way for users to find and configure Admin options, including adding new users and collectors, organizing servers and process groups, and accessing the new Autodetect feature. Users will also like the new Setup area for configuring alerts, logs, devices, and polled data items.

First Time User Experience

New users will now be guided through a much more complete experience when installing their first collector. Collector options have been expanded from Linux and Windows to include other operating systems like Solaris, FreeBSD, and AIX. Additionally, we’ve included links to easily access our integrations with configuration management tools like Chef and Puppet.

After installing the first collector, users will head to Autodetect where alerts, logs, polled data items, process groups, and even dashboards can be configured automatically. Now you can spend less time considering how and what data to collect in your environment.

So that’s a quick tour of some of the new changes and some insights into the usability improvements. We are excited about all the new changes and are always interested in hearing from you about what you like and what you think we could improve upon. Let us know your thoughts!

*Note – current customers will have the ability to return to the classic interface for the next several weeks to help with the transition.

AppFirst Beta UI: A Deep Dive into the Design

In our previous post on our new beta UI, we talked about some of the great improvements we’ve made to help bring visibility of your applications and infrastructure to the forefront. In this post, we’re going to dive deep into more of the design aspects we’ve added or improved upon.

Responsive Design

With the continued rise of mobile, many of our users mentioned they wanted to use AppFirst on the go on their tablets and smartphones. We took this feedback and converted our whole product into an optimized, user-friendly experience. So whether you’re at work on your computer, off-site with your tablet, or away from your desk on your smartphone, AppFirst’s product conforms to any screen size you’re using.

Here’s an example of how our Dashboards work with the new responsive design.

Here’s what a Dashboard looks like on an iPhone.

Dashboards aren’t the only thing responsive – the whole product is. Take a look at our new Alerts page on the iPhone.

A Consolidated Alerts View

We’ve redesigned our Alerts page to combine Alert History and Alert Status into a single, consolidated view. The top half of the page displays your alerts historically, allowing you to view your alert activity over any date range (or time range if you’re looking at a single day). Many of our users requested a better way to visualize their alerts over time so they can better identify trends in their data.

And now, without the need to go to a different page, you can simply look below to view what alerts were triggered in the range you’re viewing.

In the screenshot above, you can see we’ve helped distinguish your alerts with a color-coded key, helping you easily visualize the status of recent alerts.

Dashboard TV Mode

We touched on TV Mode a bit in our last post, but wanted to mention it again because it’s that cool of a feature. Easily toggle to TV Mode and keep a ‘hands free’ continuous watch on the most important parts of your IT environment. This is perfect for organizations with large monitors for the whole team to keep an eye on.

Search/Filter Bars

As time goes on, applications and the infrastructure that supports them usually get more and more complex. That’s why we built Search/Filter bars on nearly every single page in AppFirst. Forget the days of infinite scrolling to find that needle in the haystack. Whether you’re searching for a specific Dashboard, one of your hundreds of servers, an individual alert, or one of your polled data scripts, we’ve got you covered.

One of our best examples is with our Servers page. There’s a Filter as well as a Search bar, allowing you to get very granular in how you view your data.

Use the Filter to view your servers that are part of a specific Server Group or Server Tag, or view your servers that are running or stopped.

Use the Search to view specific servers based on hostnames or nicknames.

Now in Beta: AppFirst’s New UI

Today AppFirst is announcing the release of our new user interface into public beta. In this beta, the goal was to bring visibility of your applications and infrastructure to the forefront, allowing you to:

  • Redefine how you approach troubleshooting
  • Integrate and visualize other data sets for complete visibility
  • Group all application components to increase transparency and accountability for application-specific service levels

Therefore, we’ve made some big improvements in the UI functionality in addition to new feature releases.

Dashboard Widget Market

We’ve implemented a Widget Market to streamline the creation of Dashboards. In addition to the enhanced UI, we’ve developed three new Dashboard widgets, allowing you to better visualize your key metrics.

Dashboard TV Mode

We’ve developed a Dashboard TV Mode designed for display on large monitors for your whole team. See it in action below.

Admin Market

All of our partner integrations can now be configured through our new Admin Market. Easily connect your other tools into AppFirst for a complete visualization into every component in your environment. Supported tools right now include CloudWatch, AppDynamics, New Relic, and PagerDuty. Many more to come.

Process Groups

Applications have been renamed to Process Groups in AppFirst. With an additional 5 Community Templates added, you can group more of your application components with preconfigured templates. For example, use our Apache Community Template to group all Apache processes across any number of servers (could be one server, could be one million). Once set up, that Apache Process Group will automatically detect and collect those processes that match the Template rules (httpd and apache). You no longer have to worry about telling your agents what to look for. AppFirst does it automatically.

Updated Navigation through AppFirst

You’ll notice that we changed the horizontal navigation to a vertical navigation, giving you more real estate to view your data. You’ll also notice we’ve implemented breadcrumb navigation in deeper parts of AppFirst, making it easier to find your way around.

Try it today by clicking the Beta link at the top of our product. The future user interface of AppFirst is in your hands (literally), so let us know what you think. We review all feedback and consider it for future iterations. And stay tuned for more in-depth posts on our new UI in the coming weeks.