Managing The Cloud with Devices

With the ongoing transition of IT organizations from physical to virtual environments, complete visibility across the stack becomes ever more important. AppFirst has worked hard to bring complete, real-time visibility to applications and resources, and we've taken another step forward by offering deeper visibility into the hypervisor.

Our initial release of “Devices” and support for VMware's ESXi provides a single point of access for understanding your full-stack application and the virtualized environment in which it lives.

This is important for a couple of key reasons:

  1. Visibility into application resources breaks down in a virtual environment when stolen time increases – even at 10% stolen time, CPU metrics become almost meaningless, making things much more difficult to manage.

  2. Telling a consistent story across the virtualization layers is critical to maintaining and improving your environment. Having visibility into the ESXi hypervisor lets you compare information at the VM level with the hypervisor level and see how much is going on between the two.

Devices support goes beyond hypervisors and ESXi, though. The goal of Devices is to provide full insight across agent-based and agentless monitoring, with a consistent UI. In practical terms, this means you no longer have to query edge network devices by browsing through a server, for instance. You don't need to locate and manage plugins on hosts and pass that knowledge through your organization as "tribal" knowledge – something known to administrators only through their own interactions. You can now see your routers and other agentless devices as their own services, with their own service measurements.

So if you’re using VMware ESXi, check out our help documentation for how to get started with Devices. We’d love to know what you think and how we can make it easier to get visibility into your hypervisor.

Solaris Collector Now Available

“…we all shine on, like the moon, and the stars, and the sun.”

It gives me great pleasure to announce the general availability of the Solaris collector for AppFirst. Joining our collectors for Linux and Windows, the Solaris collector expands AppFirst’s ability to provide complete visibility into a greater set of enterprise applications.

The Solaris collector will work with Solaris 10 and 11, on both the SPARC and Intel platforms.

In case you are not familiar with AppFirst’s offerings, I’ll do my best to provide a brief overview. AppFirst enables our customers to achieve unprecedented visibility into the services that they are charged with delivering. Our customers are using the data from AppFirst’s collectors, and the analytics of the AppFirst platform, to realize the full potential of delivering IT as a service.

Have Solaris in your environment? Want to add the powerful insight of AppFirst? Send an email to our Support team at support@appfirst.com and we’ll send you the steps to acquire and install our Solaris collector.

AppFirst Hires First VP of Sales, Ron Ranaldi, to Join New York Team

NEW YORK – March 24, 2014 – AppFirst, the patented data platform enabling IT to revolutionize how they collect and analyze data, today announced the appointment of its first Vice President of Sales, Ronald Ranaldi. Based in the company's New York City headquarters, he will be responsible for worldwide sales operations.

“Over the past five years since our launch, AppFirst continues to invest in its technology to exceed customer expectations in mission-critical markets,” said David Roth, CEO and founder of AppFirst. “With Ron at the helm as our first VP of Sales, we can confidently take the next step in achieving our potential by building a world-class field team to scale adoption and grow organically.”

Ranaldi brings more than two decades of experience selling enterprise technology and leading international sales teams to create new revenue opportunities, as well as sustain revenue growth. Most recently, he served as Director of Sales for application security leader Mocana Corporation, where he helped establish and implement go-to-market strategies for a new mobile-security model.

“Leveraging my 20-year career spent working with Fortune 500 technology leaders, I am excited to apply my diverse go-to-market expertise to help AppFirst achieve success with enterprise customers, ramp revenue and build a high-quality sales team that will help it achieve revenue objectives,” said Ranaldi.

For the prior decade, Ronald served in a variety of senior-level sales roles for numerous enterprises, including Validus DC Systems, LLC (sold to ABB); Rogue Wave Software (division of Quovadx, Inc., sold to Battery Ventures); Azul Systems, Inc.; Collaxa, Inc. (sold to Oracle); Kenamea; and BEA Systems, Inc. (WebLogic, sold to BEA). He started his career at Open Environment Corporation (IPO, sold to Borland International). Ronald holds a BS in business administration from Northeastern University in Boston.

About AppFirst:

AppFirst’s patented technology provides enterprises with continuous and complete visibility into all applications and supporting resources in their IT ecosystem, regardless of the infrastructure. With AppFirst, organizations can achieve IT-as-a-Service, continuously footprint applications to ensure performance and cost optimization, and proactively resolve issues. Founded in 2009, AppFirst is based in New York City, with venture backing from FirstMark Capital, First Round Capital, Javelin Partners and Safeguard Scientifics. For more information, visit http://www.appfirst.com. Follow us on Twitter or subscribe to the AppFirst blog to stay up-to-date on the latest AppFirst news.

# # #

Rpmbuild and CentOS 6

I wanted to document this particular problem because it took me a few days to figure out and was very frustrating.

So first, some background information for those new to AppFirst. In order to collect the amount of data that we do, we deploy a collector program on our customers' servers, and that collector is ultimately delivered as an RPM. Whenever a customer first requests a collector, we build that custom package for them in real time and send it back to them. Simple enough, right? Well, here's where things get out of whack.

We had recently made the switch from CentOS 5 to CentOS 6. Everything was more or less working as expected except (wouldn't you guess it) rpmbuild. We had a shell script called "blah.sh" that ran rpmbuild and made sure it was pointing to all the correct files and directories, and that script was invoked using Python's subprocess module. Before I continue, let me first list the system we were on.

  • CentOS 6.4 (previously CentOS 5.8)
  • python 2.6.6
  • rpmbuild 4.8 (previously rpmbuild 4.4.2)

Every time blah.sh was run with subprocess it kept exiting with status 1, so my first step was to just print out its stdout. Refer to the code snippet below.

proc = subprocess.Popen(["./blah.sh"], stdout=subprocess.PIPE)
logger.error(proc.communicate())

Calling "communicate" returns everything written to stdout, along with any captured error output, as a tuple of the form (stdout, stderr). Unfortunately, what was being displayed was utterly useless because it only showed the following two info messages and no errors:

('Building target platforms: x86_64\nBuilding for target x86_64\n', None)
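In hindsight, a fuller capture would have surfaced more detail up front. Here is a minimal sketch (using the same placeholder "./blah.sh" script) that also pipes stderr and logs the exit status, which is where rpmbuild's complaints typically end up:

import logging
import subprocess

logger = logging.getLogger(__name__)

# Capture stderr as well as stdout so error output is not lost.
proc = subprocess.Popen(["./blah.sh"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
out, err = proc.communicate()

# Log the exit status along with both streams.
logger.error("exit status: %s", proc.returncode)
logger.error("stdout: %s", out)
logger.error("stderr: %s", err)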

Now, what's really interesting is that this script ran fine from the command line, so I went through many steps to see what the actual difference was: I validated that the user was the same, compared environment variables, tested whether subprocess itself was the issue, and so on.

What it ultimately boiled down to was a permissions error. rpmbuild 4.8 no longer uses /usr/src/redhat and instead defaults to /root/rpmbuild. Since this was being run from an Apache server (assuming you are not running Apache as the root user, which is a huge security risk), the Apache user has no permission to write under /root, which caused the entire script to fail with exit status 1. To fix this, modify the rpmbuild line to define _topdir as a folder you have permission to write to. Refer to this link to set up your system accordingly.

The link above tells you to update your .rpmmacros file, but those settings unfortunately do not get loaded in this specific scenario, so you'll have to either modify the .spec file or define it as a command-line argument. I chose the latter, so here is my solution. Note that this is the contents of blah.sh:

cwd=$(pwd)
rpmbuild --define "_topdir /home/sample_home/rpmbuild/" --bb "$cwd/sample.spec" --target x86_64
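As a quick sanity check (assuming the _topdir above), the finished package should land under the RPMS/<arch> subdirectory of that _topdir, so you can confirm the build from the shell:

./blah.sh; echo "exit status: $?"
# rpmbuild writes binary packages to <_topdir>/RPMS/<arch>/
ls -l /home/sample_home/rpmbuild/RPMS/x86_64/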

And that should solve your problem. Hopefully this made your transition from CentOS 5 to 6 much smoother. Until next time.

Big Data Growing Pains with HBase

Over the last week there has been a lot going on at AppFirst, and with the data delays on the console, many of our users have been feeling it as well.

We've built a robust pipeline to handle data streaming from thousands of servers into our backend, and we've learned a tremendous amount about the limitations of various software, including Nginx, RabbitMQ, Redis and now HBase.

HBase is the core of the large-volume data store for our public SaaS offering, and over the last month we've been reaching disk usage limits that have required frequent maintenance and capacity management. This is nothing new in the operations world, but what we've learned is that HBase doesn't take to node removals as gracefully as we believed.

During the last week we've been upgrading capacity and performing data maintenance with more regularity. Adding storage should be transparent to HBase. However, the process of taking a node down, adding disks, and rejoining it to the cluster causes enough of an event to back up our queues.

We are actively working to find new ways to improve this flow and minimize the impact to our customers. It is important to note that while there was a delay in data processing to our backend, there was no data loss – our queueing systems managed this condition as expected. Once HBase was back online, it handled data storage very well.

Additionally, over the weekend, we found that our ZooKeeper deployment was performing a significant amount of disk read/write activity. For an in-memory distributed coordination and synchronization tool, that much disk I/O is a serious operational concern.

When examining the source of this disk I/O, we found that our ZooKeeper ensemble was configured with 9 nodes. Apache recommends that ZooKeeper run in a 3-, 5-, or 7-node configuration. Once we updated the configuration to a 5-node setup, disk I/O fell dramatically.
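The ensemble size is simply the list of servers in each node's zoo.cfg. For reference, a minimal sketch of a five-node configuration looks like the following; the hostnames, ports, and paths are placeholders rather than our production values:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
# Five participants: server.<id>=<host>:<quorum port>:<leader election port>
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888
server.5=zk5.example.com:2888:3888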

We are aware of the impact that this configuration and maintenance work has on our users and infrastructure. We have been developing a new storage model that improves our data maintenance capability, and we continue to analyze the impact of system configurations. These changes will result in a much smoother experience.

Achieving Web-Scale in the Enterprise

On Tuesday, AppFirst CEO David Roth and Claus Moldt, CEO of m>Path and former Global CIO of Salesforce.com, held a Q&A, “The Modern Data Center: Metrics for Achieving Web-Scale and IT as a Service.”

Claus shared his best practices for achieving business agility through web-scale IT, drawn from his tenure at Salesforce.com and eBay.

Click here to view the webinar on-demand, in which Claus Moldt walks through how he built out a web-scale framework at Salesforce and eBay.

API Updates: v4 and New Limits

At AppFirst, one of our biggest features is our extensive public API. Through our many UI components, such as the Dashboard or Correlate, we try to deliver an intuitive experience for our users, but we understand that the UX we have established may not be as flexible as you need. This is why we decided to expose all of the data, so our users have the option to manipulate it themselves.

Unfortunately, with this amount of freedom comes the risk of users abusing it (usually unintentionally). Prior to the release of API v4, our API did not set limits on incoming parameters, so a customer could ask for a year's worth of summary data and get back hundreds of megabytes (and yes, we have seen it happen). This led to ridiculously long HTTP requests and responses, confusion for the user, and in some extreme cases system failure.

To resolve this issue, we introduced two forms of pagination: offset-based and cursor-based. The type of pagination used depends on the type of API call the user makes; further details can be found here. Pagination benefits both our customers and us at AppFirst in several ways.

  • Protects users from accidentally querying data across large time spans. Take the earlier example: if that same user requests a year's worth of data again, they instead get a small subset of the total data requested, along with the URL for the next subset. This is far preferable to one giant query that would leave the user waiting several minutes while half a million data points were organized into a single JSON object.
  • Gives users more control over how much data they request. This is especially true for offset-paginated API calls (e.g. https://wwws.appfirst.com/api/servers/). The new data format of v4 also makes it easier to request the previous or next page, since we provide the URLs directly in the response (see the sketch after this list).
  • Protects the AppFirst system from failures caused by abusive API use. More uptime means happier users.
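To make that concrete, here is a minimal sketch of walking a paginated collection from Python with the requests library. The servers endpoint comes from the example above, but the authentication scheme and the "data"/"next" field names are illustrative assumptions; check the v4 documentation for the exact request and response format.

import requests

API_KEY = "your-api-key"  # hypothetical credential

url = "https://wwws.appfirst.com/api/servers/"
servers = []

while url:
    # Fetch one page at a time rather than the entire result set.
    resp = requests.get(url, auth=(API_KEY, ""))
    resp.raise_for_status()
    page = resp.json()

    servers.extend(page.get("data", []))  # assumed field holding this page's items
    url = page.get("next")                # assumed field with the next-page URL; None when done

print("Fetched %d servers" % len(servers))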

We’re excited to see how our customers make use of the new v4 API calls and, as always, feel free to comment below or contact us directly if you have any questions.

Making Account Permissions Easier with Roles [New Feature]

With this new release, we are excited to introduce the concept of Roles. Prior to this feature, customer accounts had two types of users: Owners and Subusers. Owners had full permission to change anything tied to their account in the Administration section, while Subusers had minimal Administration access. This prevented Subusers from modifying any critical settings that had previously been set up.

Another issue before Roles was that we could only allow a single Owner per account, to prevent billing issues. This meant only one user could have full control over the account. We recognized this limitation and expanded the concept by introducing two new roles: Admins and Managers.

The hierarchy from most permissions to least permissions is:

  1. Owner
  2. Admin
  3. Manager
  4. Regular

The Owner and Admins have essentially the same level of permissions, except that billing emails go to the Owner (there can still be only one Owner). Both can modify all settings in the Administration tab.

The Manager role has reduced permissions: Managers cannot change Collector Profiles, Server Sets, Logs, Users, Providers, Partners, or Account settings. The Regular user role has no access to the Administration tab at all. Refer to the table below for a full breakdown.

Feature                               Owner (single)   Admin   Manager   Regular
Receives Billing Email                X                -       -         -
Admin | Settings                      X                X       X         -
Admin | Applications                  X                X       X         -
Admin | Collector Profiles            X                X       -         -
Admin | Collectors                    X                X       X         -
Admin | Server Tags                   X                X       X         -
Admin | Server Sets                   X                X       -         -
Admin | Alert Settings | Alerts       X                X       X         -
Admin | Alert Settings | Polled Data  X                X       X         -
Admin | Alert Settings | Maintenance  X                X       X         -
Admin | Polled Data                   X                X       X         -
Admin | Logs                          X                X       -         -
Admin | Providers                     X                X       -         -
Admin | Partners                      X                X       -         -
Admin | Users                         X                X       -         -
Admin | Profile                       X                X       X         -
Admin | Account                       X                X       -         -
Dashboard | Modify Widgets            X                X       X         -
Dashboard | Move Widgets              X                X       X         -

The introduction of Roles also plays a "role" (see what I did there?) in improving the overall experience of customizing Dashboards. Only users with Manager permissions or higher will be able to modify widgets or move their positions. Regular users will only be able to view the data, which means you can finally give your non-technical CEO access to their own dashboard without worrying about accidental tweaks or edits.

It is important to note that with this push, everyone who is not an Owner will default to the Manager role. We hope you enjoy this new feature and, as always, let us know if there are any issues.