Support for Zones & Containers

Jan 19, 2015

Back in April we announced the beta release of our Solaris Collector. (As a reminder, the Solaris Collector works with Solaris 10 and 11 on both SPARC and Intel platforms.) The collector has been receiving rave reviews in beta, and the next step in improving it was to support collection within Solaris Zones. The AppFirst Collector now fully supports both Global and Non-Global Solaris Zones.
You can see all the details about how to install and view your zones in our docs: http://support.appfirst.com/docs/collectors/#zones

And why stop there?

We also decided you needed more detail about your Linux containers. The AppFirst Collector can now capture process and application details in specific Docker containers. See the support documentation for how to install on your host OS or directly into a container: http://support.appfirst.com/docs/collectors/#containers

Server Topology – Tech Preview

Server Topology Mapping

Today we are happy to announce the release of Technology Preview: a gateway for AppFirst customers to try new tools and features that are currently in development. The goal of Technology Preview is to build in an early feedback loop for new features, allowing customers and partners to shape the tools that are most important to them. For more details, see our documentation.

Server/Device Topology

The first tool being released into Technology Preview is the Server/Device Topology.

Server Topology allows users to see, at a quick glance, the health and communication paths of their infrastructure. It was created to show the topology of your Servers and Devices as well as the metrics, processes, and logs associated with a given server or device. You can set metric thresholds to quickly see the overall health of a Server and its respective elements.

It was initially designed with two audiences in mind: operators and application owners.

Server Topology for Operations

For those responsible for the data center, Server Topology can act as a first alert of pinch-points in your infrastructure. With a simple visual, you can quickly identify systems running hot.

Server Topology for Application Owners

For those responsible for the applications running on top of the infrastructure, Server Topology provides an easy-to-digest view of any application crossover between environments.

In this situation, you can apply server tags by application. For example, let’s say you tag all the servers hosting Zookeeper, which is used as the management layer of our big data store. In this scenario each environment is designed to be self-contained, so it is critical that the backends stay separate and do not communicate with each other. Looking at the example below, Server Topology allows users to quickly identify any crossover, which would indicate a misconfiguration on a server or in a line of code.

 

Getting Started: View A Topology

Viewing a topology is easy. Simply select a Server Tag to see its topology. If you want to see a specific layout, first configure a Server Tag comprising those servers under Admin – Organize Servers.

The graph shows a topology of your servers and devices. Each server/device is colored according to its health, which is determined based on the thresholds on the right of the screen.

  • Red indicates that a process is exceeding a set threshold

  • Yellow indicates that a process is above eighty percent (80%) of the threshold

  • Green indicates that a process is healthy according to set thresholds

  • Grey indicates that no process information can be found for that server or device

These colors percolate upwards; that is, if any process exceeds a threshold and becomes red, the server/device will also be red. You will only see green if all of the processes associated with that server/device are below all the thresholds. The health of each process is recalculated when thresholds are set or added. Similarly, when a metric is deleted, the health colors are recalculated using only the remaining thresholds. You can edit the min, max and step values of the thresholds.
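The roll-up described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this post, not AppFirst's implementation: the function names, the metric names, and the choice to treat "exceeding" as strictly greater than the threshold are all assumptions.

```python
from typing import Dict, List

# Severity order used for the roll-up: the higher number wins.
SEVERITY = {"grey": 0, "green": 1, "yellow": 2, "red": 3}


def process_color(metrics: Dict[str, float], thresholds: Dict[str, float]) -> str:
    """Color one process by comparing each metric with its threshold."""
    worst = "green"
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is None:
            continue  # no reading for this metric, so it cannot raise the color
        if value > limit:
            color = "red"      # over the threshold
        elif value > 0.8 * limit:
            color = "yellow"   # above 80% of the threshold
        else:
            color = "green"
        if SEVERITY[color] > SEVERITY[worst]:
            worst = color
    return worst


def server_color(processes: List[Dict[str, float]], thresholds: Dict[str, float]) -> str:
    """A server takes the worst color of its processes, or grey with no process data."""
    if not processes:
        return "grey"
    colors = [process_color(p, thresholds) for p in processes]
    return max(colors, key=SEVERITY.__getitem__)


# Hypothetical thresholds and readings, for illustration only.
thresholds = {"cpu": 50.0, "memory": 1000.0}
processes = [{"cpu": 10.0, "memory": 100.0}, {"cpu": 45.0, "memory": 100.0}]
print(server_color(processes, thresholds))
```

Because the server color is recomputed from the current thresholds each time, adding or removing a threshold changes the result without any extra bookkeeping.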

Selecting a server or device populates a table below the topology graph. The table shows which processes and metrics are causing problems by highlighting the cells that exceed the set thresholds.

  • Click to see Logs associated with server

  • Click to see Correlate, allowing you to drill down to very specific granular metrics on a given server

  • Click to return to the Processes list

Feedback

The Server Topology tool is currently in Technology Preview. We’d love to hear your feedback on how this is helpful, how it could be improved, and what you like or dislike. Email product@appfirst.com or give feedback directly from the tool by clicking the feedback icon in the top right.

 

API Updates: V5

Key points for new API v5:

  • The new v5 API uses a different method for versioning. The old API took the version in the URL, like http://wwws.appfirst.com/api/v4/servers/. The new API takes the version in the HTTP Accept header, like “application/json; version=5”, and the URL for servers is just http://wwws.appfirst.com/api/servers/ without a version in the URL.
  • The new API offers several return data formats that can be specified in the Accept header:
    • application/json; version=5
    • application/xml; version=5
    • application/yaml; version=5
  • The new API also has a web-browsable interface for convenience when developing. It is accessible at http://wwws.appfirst.com/api/dev/
    • Note that the /api/dev/ URL should only be used for development/testing and not in actual code, as it will move to /api/ when v5 becomes the default version. Versions should be specified in the Accept header.
  • The return fields and input fields have changed slightly for certain endpoints in order to standardize field names and provide a more consistent interface for users. Applications currently interacting with v4 of the API will probably need to be updated, but the changes required should not be too complicated. The new return formats can be seen on the web-browsable interface. Support documentation will be updated shortly.
  • There is a new Python client wrapper available at https://github.com/appfirst/afapi which makes integrating API calls into Python code much more convenient.
  • The new API v5 will become the default on February 16, 2015, at which point v3/v4 will be deprecated. Versions 3 & 4 will be shut off on March 16, 2015.
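
As a rough sketch of the header-based versioning described above, here is how a v5 request could be put together with Python's standard library. The helper name build_v5_request is made up for this example, and authentication is left out because it is not covered in this post; only the URL and the Accept header values come from the notes above.

```python
import urllib.request

API_ROOT = "http://wwws.appfirst.com/api"


def build_v5_request(endpoint: str, fmt: str = "json") -> urllib.request.Request:
    """Build (but do not send) a request that selects API v5 via the Accept header."""
    url = f"{API_ROOT}/{endpoint.strip('/')}/"
    accept = f"application/{fmt}; version=5"  # fmt can be json, xml or yaml
    return urllib.request.Request(url, headers={"Accept": accept})


request = build_v5_request("servers")
print(request.full_url)
print(request.get_header("Accept"))
# Sending it would be urllib.request.urlopen(request), once the credentials
# your account requires have been added (not shown here).
```

Note that the URL carries no version segment; switching between JSON, XML and YAML only changes the header.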

Monitoring IPMI Sensor Data

A customer of ours recently told us about their use-case for extending the AppFirst collector to support the data center team. Ops was already using AppFirst’s platform to collect, aggregate and correlate massive amounts of application & operations data for monitoring & troubleshooting. However, there was a glaring concern about how long it would take the data center guys (and gals) to log into their own system and check the current status and history of their hardware data.

In turn, they extended the AppFirst collector to capture IPMI sensor data and monitor the environmental trends of their physical hardware. Taken together, the data let them see key metrics from the physical hardware all the way up to business performance.

Specifically, the data center team needed to track a few key metrics and analyze performance over time to see whether there was a downward trend toward failure.

  1. Fans: Are the system’s fans about to fail, working too hard or experiencing degradation?

  2. Temperature: Is the system temperature anything but stable?

  3. Voltage: Has a power supply gone bad? Of particular interest is the Power Good signal, which prevents a computer from attempting to operate on improper voltages and damaging itself by alerting it to an improper power supply (1).

How to get started

First, a little background. AppFirst’s collector supports the ingestion of multiple types of additional data: logs, statsd and polled data. Polled data can come from scripts that use the Nagios output format to collect data from APIs, management interfaces, SNMP, IPMI or any other available source.

Collecting IPMI Data

AppFirst supports server hardware equipped with a BMC (baseboard management controller), which exposes a variety of hardware sensor data that can be monitored using the IPMI standard. IPMI sensor data can be gathered in-band (through the host O/S) or out-of-band (through a dedicated LAN connection independent of the system processor and host O/S). A free tool, FreeIPMI, is available to examine IPMI data, and AppFirst has written a series of scripts that allow you to take advantage of AppFirst’s big-data store and visualization tools to easily monitor this data.

Here is an example that uses in-band communications (out-of-band also works). You can do this two ways: with polled data or with log data.

To run either method you will need to install FreeIPMI and OpenIPMI on your system.

You can use the source from the FreeIPMI project, or you may be able to find an rpm package compatible with your OS.

Before moving any further along, make sure that you can execute the ipmi-sensors command. The output looks like this (but much longer):

[appfirst@servername: ~]$ sudo /usr/sbin/ipmi-sensors
ID | Name            | Type       | Reading | Units | Event
1  | Pwr Unit Status | Power Unit | N/A     | N/A   | 'OK'
2  | IPMI Watchdog   | Watchdog 2 | N/A     | N/A   | 'OK'
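
If you want to sanity-check this output from a script before wiring it into the collector, a small parser is enough. The sketch below assumes the pipe-delimited layout shown above with a header row; the function name parse_sensor_table is ours, and real output from your hardware may contain more columns or rows.

```python
from typing import Dict, List

# Sample copied from the ipmi-sensors output shown above.
SAMPLE = """\
ID | Name            | Type       | Reading | Units | Event
1  | Pwr Unit Status | Power Unit | N/A     | N/A   | 'OK'
2  | IPMI Watchdog   | Watchdog 2 | N/A     | N/A   | 'OK'
"""


def parse_sensor_table(text: str) -> List[Dict[str, str]]:
    """Turn pipe-delimited sensor output into one dict per sensor row."""
    lines = [line for line in text.splitlines() if "|" in line]
    if not lines:
        return []
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


for row in parse_sensor_table(SAMPLE):
    print(row["Name"], row["Event"])
```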

Polled Data Configuration:

Once you have ipmi-sensors working, you will need a Perl script, check_ipmi_sensors, to format the output to be Nagios-compatible. We currently install the required script along with the collector, and it is located here:

/usr/share/appfirst/plugins/libexec/check_ipmi_sensors

If not, it’s available here: https://github.com/appfirst/nagios-plugins/blob/master/check_ipmi_sensors

You can test that it’s working by executing this command:

sudo /usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t fan

and you should get output that looks like this:

fan OK | System_Fan_1=2156.00;System_Fan_2=2156.00;System_Fan_3=2156.00;
System_Fan_4=2156.00;Processor_1_Fan=2450.00
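
To check the plugin output programmatically, the line can be split into a status and a set of readings. This is a minimal sketch that assumes the exact layout of the sample above (status text, a pipe, then name=value pairs separated by semicolons) and that the output is a single line; parse_check_output is a name chosen for this example.

```python
from typing import Dict, Tuple


def parse_check_output(line: str) -> Tuple[str, Dict[str, float]]:
    """Split plugin output into its status text and a name -> reading mapping."""
    status, _, perf = line.partition("|")
    readings: Dict[str, float] = {}
    for item in perf.split(";"):
        item = item.strip()
        if "=" not in item:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        name, _, value = item.partition("=")
        readings[name] = float(value)
    return status.strip(), readings


sample = (
    "fan OK | System_Fan_1=2156.00;System_Fan_2=2156.00;System_Fan_3=2156.00;"
    "System_Fan_4=2156.00;Processor_1_Fan=2450.00"
)
status, readings = parse_check_output(sample)
print(status, sorted(readings))
```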

From your AppFirst UI, select Admin → Setup → Polled Data from the top menu, and locate your collector. Click on the Server Hostname and add lines similar to these to the config file:

command[sensor_fan]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t fan
command[sensor_temp]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t temperature
command[sensor_voltage]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -t voltage

The -t parameter is the sensor type. To get a list of sensor types, execute:

sudo /usr/sbin/ipmi-sensors -L

Note that your hardware may not report all of these.

Save the file and give the collector up to five minutes to pick up the new config and start polling the device. In the Correlate tool, you will soon find the new polled data commands that can be displayed.

Log Data Configuration:

You can monitor the output of any command as log data by capturing its output, appending it to a file, and then configuring the collector to monitor that file as a log file. One way to do that is to execute the command as a cron job. You would also want to configure logrotate to prevent excess consumption of disk space.

An alternative is to use our simple bash script, poll2log, and leave the storage to us. It simulates log rotation and only stores the output from one execution. This script should have been installed when you installed your collector, and should be located here:

/usr/share/appfirst/plugins/libexec/poll2log

But if not, you can find it here: https://github.com/appfirst/nagios-plugins/blob/master/poll2log

Add a line like this to your crontab (probably as root):

*/5 * * * * /usr/share/appfirst/plugins/libexec/poll2log "/usr/local/sbin/ipmi-sensors --output-sensor-state" /var/log/ipmi-sensors.log
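
For readers curious about the "only stores the output from one execution" idea, here is a rough Python sketch of that behavior. It is our reading of the description above, not the actual poll2log script, and the function name snapshot_to_log is made up for this example.

```python
import shlex
import subprocess
from pathlib import Path


def snapshot_to_log(command: str, log_path: str) -> int:
    """Run the command once and overwrite log_path with its combined output."""
    result = subprocess.run(
        shlex.split(command),      # split the quoted command string into arguments
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in the same log
        text=True,
        check=False,
    )
    # write_text truncates the file first, so only the latest run is kept
    # and the file never grows the way an appended log would.
    Path(log_path).write_text(result.stdout)
    return result.returncode


# Example call, mirroring the crontab line above:
# snapshot_to_log("/usr/local/sbin/ipmi-sensors --output-sensor-state",
#                 "/var/log/ipmi-sensors.log")
```

Overwriting instead of appending is what removes the need for logrotate in this setup.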

From your AppFirst dashboard, select Administration | Logs, and click the Add Log button (upper left). Find your server in the pull-down list, set the Type to “File”, and the File Path to /var/log/ipmi-sensors.log. Save the configuration, and give the collector a few minutes to receive the configuration. In the Correlate tool, you will soon find the new log files that can be displayed.

Out-of-Band Configuration:

You can monitor the IPMI sensor data from another machine that has a collector installed. This allows you to see the sensor data in the case where the target machine does not have a collector installed. You simply need to add the host information to the command line:

command[sensor_fan]=/usr/share/appfirst/plugins/libexec/check_ipmi_sensors -h 10.7.7.7 -s "-u username -p password" -t fan

Alternatively, the username can come from an environment variable. Set it with:

export IPMI_USER=user

and then use -u $IPMI_USER in the command.
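
If you drive the plugin from your own script, the same idea keeps credentials out of the code. The sketch below builds the argument list for an out-of-band check using the -h, -s and -t flags shown above; the IPMI_PASSWORD variable and the build_argv helper are assumptions made for this example, not something the plugin defines.

```python
import os
from typing import List, Mapping

PLUGIN = "/usr/share/appfirst/plugins/libexec/check_ipmi_sensors"


def build_argv(host: str, sensor_type: str,
               env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the out-of-band plugin command with credentials taken from env."""
    user = env["IPMI_USER"]
    password = env["IPMI_PASSWORD"]  # hypothetical variable name
    session = f"-u {user} -p {password}"  # passed as one argument to -s
    return [PLUGIN, "-h", host, "-s", session, "-t", sensor_type]


# Illustrative values only; in practice env defaults to the real environment.
argv = build_argv("10.7.7.7", "fan", {"IPMI_USER": "user", "IPMI_PASSWORD": "secret"})
print(argv)
```

The resulting list can be handed to subprocess.run on a machine where the plugin is installed.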

Links that may be of interest:

Intel BMC Web Console Users Guide: http://download.intel.com/support/motherboards/server/sb/intel_rmm4_ibwc_userguide_r2_72.pdf

(1) Wikipedia: http://en.wikipedia.org/wiki/Power_good_signal

The Road to Comprehensive Monitoring & Immediate Insight

Sasha Jeltuhin, CTO of Bridgevine, presented at the Gartner Data Center, Infrastructure and Operations Summit today in Las Vegas. During his case study presentation, Sasha explained:
  1. Their makeover of infrastructure, software-upgrades and real-time application monitoring
  2. The key audiences and requirements for data collection and monitoring
  3. Why they selected AppFirst’s application operations platform
  4. Results to date

Below are the highlights from the presentation and reactions from those in the room…


Learn more about the requirements for web-scale monitoring in this brief, featuring Gartner research.

 

Bridgevine Selects AppFirst for Critical Application Data and Insights

AppFirst, the leader in unified application operations visibility, today announced that Bridgevine has selected its technology to deliver critical system insights across the entire stack. The implementation allows Bridgevine to make critical forward-thinking decisions with significantly improved visibility.

Bridgevine selects AppFirst for critical application and systems monitoring

Bridgevine is a technology company that is committed to ensuring its clients never encounter a negative customer experience or disjointed interaction. Selecting the right technology partner to provide granular visibility across applications and infrastructure was a strategic decision for enabling the highest level of business uptime. After identifying detailed requirements and testing multiple solutions, they chose AppFirst.

“We drive 50+ million consumer touch points and enable millions of digital enrollments every year from our national catalogue of service providers,” said Sasha Jeltuhin, CTO of Bridgevine. “Due to the scale of our transactions and their importance to our business, we needed a miss nothing approach toward proactive monitoring of the platform, complex integrations with service providers and metric collection. This ensures we have a deterministic view of our environment and can see the details of every event across the stack, regardless of when they occur. As a company focused on delivering high quality services, legacy polled metric collection missed our requirements for data velocity, quality, operational efficiency and immediate insight.”

“With AppFirst, Bridgevine can quickly identify and solve for the root cause of any application or operational issue—whether it is in-house or from a third party service,” said David Roth, CEO of AppFirst. “The end result is a significantly improved customer experience and an improved bottom line.”

With AppFirst, organizations can see every interaction between the operating system and application components, regardless of when they occur, without impacting application performance. Installed in seconds, AppFirst’s patented collection technology eliminates the need for multiple agents or byte-code injection. All information is aggregated and time-synchronized in a big data platform to provide collaborative visibility across all groups. This delivers multiple benefits such as:

  • Dynamic topology mapping of applications and their associated server infrastructure.
  • Visualization of network data and transaction flows, along with their low level event details.
  • Deterministic information for capacity management, improved asset utilization and more.

About Bridgevine

Bridgevine, Inc. is the leading provider of customer acquisition and retention solutions to enterprise customers. The company’s SaaS-based platform is used by the leading companies in communications, entertainment, home security, and energy resulting in 50+ million consumer interactions annually. For more information, visit www.bridgevine.com.

Announcing The Enterprise-Grade Platform & Predictive Analytics Partnership

Today we are unveiling the enterprise version of the AppFirst platform with the option to securely run on-premise, in the cloud or in hybrid scenarios. This next generation of our platform comes with many other exciting announcements, including a partnership with Accretive to deliver a predictive analytics offering to help enterprises expose, anticipate and intervene in IT risks across organizational silos. This partnership, coupled with AppFirst’s enterprise-grade platform, provides unparalleled insight and predictive capabilities across all layers of an organization’s application stack.

Starting with Accretive, over the coming weeks and months we will announce a series of enterprise-focused applications leveraging this platform, consisting of those developed by AppFirst, various partners and the enterprise community.

The Platform: A Universal Timeline of Events

AppFirst’s enterprise platform represents a paradigm shift from traditional monitoring and logging tools. It establishes a complete and universal timeline of events across an enterprise application environment – at a sub-millisecond level. This includes every application call, system event, log file entry, configuration change, third party application or custom code event, and data from thousands of plug-ins. This is accomplished through a patented data collection and aggregation methodology that provides an unmatched level of visibility – all correlated into a universal timeline that can live in the cloud, on-premise or in a hybrid environment. Armed with this platform, enterprises are now able to achieve granular and continuous visibility between IT and the business.

“AppFirst is redefining what’s possible for executives looking to optimize the technology that runs their business. And it starts with setting a single sub-millisecond timeline across the entire enterprise stack. Unless you start with perfect data, you cannot achieve true visibility,” said David Roth, CEO and Co-Founder of AppFirst. “We are applying deep predictive analytics and applications on top of the most detailed data in the world.  Taken together, this finally offers CIOs the ability to truly manage cost, risk and capacity without fingerpointing or guessing.”

New capabilities of the AppFirst platform include:

  • Real-time service topology mapping
  • Predictive analytics
  • On-premise deployments

Real-time service topology mapping

Customers now have the ability to auto-discover and map their service and application topologies in real-time. AppFirst auto-installs under new custom or third-party application components as they come into your environment. Therefore, you’ll always have an up-to-date view of what is running and how it is running across your enterprise.

Predictive analytics

By partnering with Accretive to update predictive analytic models in real-time, IT and business leaders can identify system limits, predict issues across the enterprise, and perform what-if analysis on how changing complexity will impact performance quality and cost.

“During the past 12-18 months, the use cases for our platform have extended far beyond incident response and troubleshooting,” continued Roth. “Our partnership with Accretive is the direct result of enterprise organizations looking to holistically optimize the business from the top – all the way down to the database or web server. We are now able to reset the standard for how organizations bridge business to IT.”

“Only the combination of AppFirst and Accretive provides business leaders with the real-time and forward-thinking visibility required for true IT continuity and optimization,” said Dr. Nabil Abuelata, CEO and Founder of Accretive. “Unlike traditional data warehousing and big data analytics technologies, it’s now possible to feed and update predictive models in real-time with a unified dataset to understand what is happening now and in the future. This capability delivers tremendous value to both the core business and IT operations. We are excited to work with AppFirst to change the way enterprises think about optimizing the entire technology chain that supports their business.”

On-premise deployments

In addition to the core SaaS & Private SaaS offerings, customers can easily deploy AppFirst behind their firewall. This is critical for enterprise customers in highly regulated environments and in countries where data needs to securely reside within their borders.

The AppFirst enterprise platform is now available for customers worldwide. To learn more:

Chat with a Field Engineer or sign up for a 30-day free trial