Advances in Web Performance Monitoring: Don't Accept Outages as the New Norm

by Mehdi Daoudi 14 Nov, 2013

Adoption of the cloud and cloud services has been a growing trend in the IT industry. Cloud technology allows companies to spend less time and money on internal operations and devote more resources to growing revenue and improving products. However, a multi-vendor operations approach creates complexity, because websites now depend on multiple moving parts working together. Furthermore, companies using traditional monitoring tools lose visibility into the performance, reliability and consistency of cloud-based applications, making sites more vulnerable to complexity-based performance risks. To ensure speed and availability in the new cloud era, a cloud-ready performance monitoring strategy needs to be implemented as part of regular site operations.

With Cloud Comes Complexity

The move toward the cloud has exploded recently and is predicted to expand even further. In fact, a survey by Gartner shows that by 2020, 55 percent of CIOs plan to move all of their critical apps to the cloud.  

Cloud services encompass everything from hosted infrastructure and platforms to ad serving technology, widgets and analytics tools. There are three kinds of cloud - private, public and hybrid. A public cloud is the most cost effective and scalable; however, it carries security risks and the danger of resource-hogging neighbors. A private cloud avoids these multi-tenant risks, but financing and maintaining the environment involves much more overhead. A hybrid cloud connects private and public clouds together. A company may use a variety of cloud services existing in different cloud environments.

Complexity challenges arise when these services interact with each other to produce an overall experience. For example, if an application moves from a single infrastructure with 99.99 percent availability to an open one relying on five different providers with 99.99 percent availability each, and all five must be up for the application to work, the result is an infrastructure with roughly 99.95 percent availability. In an era of immediate fulfillment, customers won't accept any interruption or slowness in a company's Web services, so 99.95 percent availability becomes a big issue.
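The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: it assumes the five providers fail independently and that the application needs all of them, and the function names are made up for this example. Under those assumptions, 99.99 percent works out to roughly 53 minutes of downtime a year, while the five-provider chain comes to roughly 263 minutes.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def compound_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes failures are independent, which real providers may not be.
    """
    return prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


single = 0.9999
chain = compound_availability([0.9999] * 5)

print(f"single provider: {single:.4%}, "
      f"{downtime_minutes_per_year(single):.0f} min/year down")
print(f"five providers:  {chain:.4%}, "
      f"{downtime_minutes_per_year(chain):.0f} min/year down")
```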

Another challenge is the Single Point of Failure (SPOF) - a failure in one service or one part of a webpage may make the entire page unavailable. Traditional monitoring tools that watch only one aspect or service often fail to detect a SPOF.

An example that comes to mind is a company that monitors the performance of its front-end servers in house to check for availability. One day, its analytics service experiences an issue with the tags on the site. To an end user browsing the site, the site is completely unrenderable - it just hangs on a blank screen as it is loading. As this error has nothing to do with the company's front-end servers, no alarms will ring and operations will completely miss that SPOF. According to the traditional monitoring tools, everything is OK when in reality the website is not accessible to end users. And when calls start coming into the customer service desk, the staff there will have no idea there really is an issue.
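The blind spot in that scenario can be shown with a toy model. Everything here is hypothetical: the component names, the assumption that the analytics tag blocks rendering, and the two check functions are invented for illustration and do not describe any particular monitoring product.

```python
# Hypothetical component states during the incident described above.
component_up = {
    "frontend": True,
    "cdn": True,
    "analytics_tag": False,  # the third-party tag hangs
}

# Assumption: a failure in any of these stops the page from rendering.
render_blocking = ["frontend", "cdn", "analytics_tag"]


def inhouse_server_check(status):
    """Traditional check: looks only at the company's own front-end servers."""
    return status["frontend"]


def outside_in_page_check(status, blocking):
    """Page-level check: passes only if every blocking component responds."""
    failed = [name for name in blocking if not status[name]]
    return len(failed) == 0, failed


page_ok, failed_components = outside_in_page_check(component_up, render_blocking)

print("in-house check passes:", inhouse_server_check(component_up))
print("page-level check passes:", page_ok, "failed:", failed_components)
```

The in-house check reports success while the page-level check flags the analytics tag, which is the gap between what operations sees and what the end user sees.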

Moving Beyond Traditional Server Monitoring: Cloud-Ready Monitoring 

Cloud computing brings various types of IT components to the enterprise infrastructure. Therefore, performance monitoring that focuses on only one component of the delivery chain does not work, because it does not provide a holistic view of the environment.

Traditional performance monitoring tools only monitor from a single point, so you have no way of knowing if all of the services on your site are accessible to customers from different locations around the world. They also do not provide any view into how the different services act in conjunction. Does your tracking tag block your ad tag from loading in sufficient time (i.e. while the user is still on that page)? Any missed ad impression is a missed revenue opportunity. Cases like this show why it is important for the IT and business teams to know how third-party cloud services interact with one another.

In order to have an effective monitoring strategy, companies utilizing cloud services need to abandon traditional monitoring tools and leverage new technologies - specifically performance monitoring that can test the entire website or application as a whole, rather than just the individual components.  

Cloud-ready performance monitoring tools test your website from the outside-in and measure how all assets are delivered to the end user as a whole. An effective cloud-ready monitoring solution should not only be capable of measuring the holistic experience, but also able to isolate specific assets or functions (CDN speed, third party availability, product search performance) of the website. That way, IT can drill down to varying levels to optimize performance.  

Consider Company X, which was relying on a free DNS service. Company X started noticing that its site slowed down during business hours - the time of day when Internet usage peaks. Through its cloud-ready performance monitoring tool, the company identified the source of the slowness as its DNS resolution time. Clearly its free DNS service could not handle the increase in load during business hours. After that, investing in a managed DNS service that was more resilient to traffic peaks was an easy choice.
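The kind of drill-down Company X performed can be approximated as follows. This is a minimal sketch with made-up numbers: the phase names (dns, connect, first_byte, content) and the sample timings are assumptions, not data from Company X or output from any specific tool. It compares the median time of each phase off-peak and at peak, and reports the phase that grew the most.

```python
from statistics import median


def median_by_phase(samples):
    """Median milliseconds per phase across a list of timing breakdowns."""
    phases = samples[0].keys()
    return {phase: median(s[phase] for s in samples) for phase in phases}


def phase_with_largest_increase(off_peak, peak):
    """Return the phase whose median grew most at peak, plus all deltas."""
    base = median_by_phase(off_peak)
    busy = median_by_phase(peak)
    deltas = {phase: busy[phase] - base[phase] for phase in base}
    worst = max(deltas, key=deltas.get)
    return worst, deltas


# Hypothetical page-load breakdowns, in milliseconds.
off_peak_samples = [
    {"dns": 20, "connect": 30, "first_byte": 120, "content": 200},
    {"dns": 25, "connect": 28, "first_byte": 110, "content": 210},
    {"dns": 22, "connect": 31, "first_byte": 125, "content": 205},
]
peak_samples = [
    {"dns": 480, "connect": 33, "first_byte": 130, "content": 215},
    {"dns": 510, "connect": 30, "first_byte": 128, "content": 220},
    {"dns": 450, "connect": 35, "first_byte": 135, "content": 210},
]

worst_phase, phase_deltas = phase_with_largest_increase(off_peak_samples, peak_samples)
print("largest slowdown at peak:", worst_phase, phase_deltas)
```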

Managing SLAs with Monitoring

Service Level Agreements (SLAs) are critical in the cloud environment for both the vendor and the customer. Most companies manage SLAs with performance monitoring tools that take the outside-in approach. Customers pay for the services/infrastructure used and need to be assured a level of service at any time. On the other hand, cloud service vendors can take a proactive approach to managing their SLAs by monitoring the performance of the service on their clients' websites. This allows the vendor to resolve problems before they impact users and violate SLAs.
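As a rough illustration of how outside-in measurements feed an SLA review, the sketch below summarizes a batch of probe results against two targets. The Probe structure, the 99.9 percent availability target and the 2,000 ms response threshold are all hypothetical; real SLAs define their own terms and measurement windows.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    ok: bool            # did the check succeed?
    response_ms: float  # measured response time in milliseconds


def sla_report(probes, availability_target=0.999, max_response_ms=2000.0):
    """Summarize probe results against hypothetical SLA targets."""
    if not probes:
        raise ValueError("at least one probe result is required")
    successes = sum(1 for p in probes if p.ok)
    availability = successes / len(probes)
    slow = sum(1 for p in probes if p.ok and p.response_ms > max_response_ms)
    return {
        "availability": availability,
        "meets_availability": availability >= availability_target,
        "slow_responses": slow,
    }


# Hypothetical month of checks: 2 failures and 1 slow response out of 1,000.
results = [Probe(True, 300.0)] * 997 + [Probe(True, 2500.0)] + [Probe(False, 0.0)] * 2
report = sla_report(results)
print(report)
```

With these numbers the measured availability is 99.8 percent, so the report shows the hypothetical 99.9 percent target being missed.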

Technology Has Changed, but Needs Stay the Same

Modern enterprise applications are engineered for agility and deployed frequently over elastic IT infrastructures. The benefits of virtualization, public and private cloud, and hybrid deployments can include flexibility, efficiency and business enablement. However, these benefits come with challenges. Multiple factors, such as varying workloads, introduce risks to quality of service. Performance and availability can be compromised, particularly when IT organizations lack Application Performance Management solutions specifically designed to support dynamic infrastructures.

At the end of the day, the complexity of the IT infrastructure may have increased, but the needs of companies remain the same. It's all about gaining a better understanding of the performance of online services so that companies can ensure a fast, glitch-free online environment, improve user satisfaction, reduce quality management costs, and protect revenue.

A monitoring model for the cloud needs to provide a view of the entire cloud involved in delivering a service. It should help companies assess whether customers' demands can be met with the current resources and performance, while also offering a view at the individual application level.

About the Author

Mehdi Daoudi is the chief executive officer and founder of Catchpoint, where he combines in-depth expertise in developing and operating large on-demand software platforms with hands-on experience in application monitoring, performance management and IT operations practices. As an industry pioneer, Mehdi created the Quality of Services group at DoubleClick Inc. in 2000 and has no fewer than nine years of extensive experience in the monitoring, SLA, and operations excellence fields.