In the world of monitoring, we frequently hear about the miraculous solutions provided by a single agent: how every problem can be solved with transaction tracing, user experience monitoring, and an understanding of how the software pipeline's performance applies to the business.
All of these are great starting points, but today I’ll go through the case where the software pipeline is affected by the environment in which it lives, and application monitoring only shows the symptoms.
For example, my dashboard this morning:
The specific item in question, “Frontend Response Time”, shows a high value for a period this morning.
As I look into this value, it becomes evident that the cause is something external to the software stack.
As quick background, the eCommerce solution I’m running (Magento) uses files to keep track of each user as they access the site. This is normal behavior and isn’t excessively risky. However, it means thousands of session files accumulate, and if no action is taken they quickly eat up all available inodes.
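To make the inode pressure concrete, here is a minimal sketch of how one might check it from the shell. The `sess_*` file name pattern and any directory you pass in are assumptions for illustration (the real session path depends on the install), and `-maxdepth` assumes GNU or BSD find.

```shell
#!/bin/sh
# Sketch only: helper names and the 'sess_*' pattern are assumptions.

# Show inode totals and usage for the filesystem that holds the given path.
show_inode_usage() {
    df -i "$1"
}

# Count session files directly inside the given directory.
# find streams the names, so this works even when a shell glob would not.
count_session_files() {
    find "$1" -maxdepth 1 -type f -name 'sess_*' | wc -l
}
```

For example, `count_session_files /path/to/session/dir` prints the number of session files, and `show_inode_usage` on the same path shows how close the filesystem is to running out of inodes.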
Removing these excess files as part of normal maintenance is easy enough in principle. However, a simple ‘rm -f’ with a wildcard is insufficient: with this many files, the expanded argument list exceeds the system limit for a single command. This is easily remedied with xargs:
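The exact command used is not reproduced here, so the following is only a sketch of the general find-plus-xargs approach, not necessarily what was run. The default directory, the `sess_*` pattern, and the seven-day retention window are all hypothetical, and `-print0`, `-0`, and `-r` assume GNU findutils.

```shell
#!/bin/sh
# Sketch only: path, file pattern, and retention window are assumptions.
SESSION_DIR="${SESSION_DIR:-/var/www/magento/var/session}"  # hypothetical path
MAX_AGE_DAYS="${MAX_AGE_DAYS:-7}"                           # hypothetical window

# Delete session files older than the given number of days.
purge_sessions() {
    dir="$1"
    age_days="$2"
    # find streams NUL-separated paths; xargs batches them into as many
    # rm invocations as needed, so no single argument list gets too long.
    # -r skips running rm entirely when nothing matches.
    find "$dir" -maxdepth 1 -type f -name 'sess_*' -mtime +"$age_days" -print0 \
        | xargs -0 -r rm -f
}

# Only act when the directory actually exists.
if [ -d "$SESSION_DIR" ]; then
    purge_sessions "$SESSION_DIR" "$MAX_AGE_DAYS"
fi
```

Batching through xargs avoids the argument-length limit, but each batch still generates real disk activity, which is why the cleanup shows up in the system metrics.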
However, a cleanup like this takes time, and we can see that time in the data we’ve collected from the system, graphed in-line with the web stack performance data.
The blue line is the eCommerce response time, the purple line is the system disk utilization, and the orange line is the number of files being accessed by xargs. You can see the clear correlation between the file removal and the elevated response time, as well as the reduction in storage used. As soon as the files have been removed, the response time recovers.
If you’d like to stop treating symptoms and see how the environmental context affects your business, sign up for a free 30-day trial at http://www.appfirst.com/signup/.