A growing collection of tools is available on the Internet to evaluate the speed and performance of broadband Internet service providers (ISPs). Commonly referred to as speed testers, these tools are intended to provide an objective measure of the performance of an Internet connection. Originally, results produced by these tools were often used by subscribers to estimate the speed of their service or to diagnose connectivity problems.
However, as broadband service performance becomes more significant to consumers and policy-makers alike, so does the role of the speed tester. Today, it’s not uncommon to find speed test results being used as a basis of competitive differentiation when comparing one service provider to another. The Electronic Frontier Foundation, an Internet interest group, points to speed tester software tools on its Web site with the goal of helping subscribers test their broadband connections.
As the broadband world incorporates these speed test results into a larger dialog, fundamental questions remain unaddressed. How do speed testers work? Are the results accurate? Should the results of today’s Internet speed testers be taken as truly representative of the subscriber experience? Is it meaningful to use them to compare one ISP to another? Are speed test results valid metrics for evaluating network and service policy?
In reality, there is significant diversity in how speed testers work. Accuracy and repeatability vary widely because of a plethora of factors, including software design, system implementation and measurement methodologies.
This article surveys the current state of speed test tools in the wilds of the Internet. It examines the underpinning technology common to a sample of these applications and investigates those factors that will influence the results of speed testers in operation today.
Active vs. passive
The first thing to examine is where these speed testers map into the wide world of Internet performance measurement. Generally speaking, two methodologies for Internet performance measurement systems are defined: active measurement and passive measurement.
Passive measurement systems observe traffic flowing through the network without modifying it. (See Figure 1a.) Some classic examples of passive network measurement techniques include the use of protocols like simple network management protocol (SNMP), Internet protocol detail record (IPDR), and NetFlow.
Active measurement systems take a more intrusive approach by injecting synthetic test traffic directly into the network. (See Figure 1b.) The behavior of the network between endpoints is then evaluated based on an analysis of the traffic transmitted by one endpoint and the traffic received by the other. Typically, an active measurement system will produce results for metrics such as one-way throughput, packet loss, latency and jitter. Generally speaking, the more popular speed testers found in the wilds of the Internet today fall into the active measurement category.
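To make the active approach concrete, here is a minimal Python sketch of how the metrics named above could be derived by comparing what one endpoint sent with what the other received. The `Probe` record, the `summarize` function and the jitter definition (mean absolute change between consecutive delays) are my own illustrative assumptions, not the method of any particular speed tester. The sketch also assumes the two endpoint clocks are synchronized, which real tools cannot take for granted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Probe:
    """One synthetic test packet as seen by both endpoints (hypothetical record)."""
    sent_at: float                # seconds, sender clock
    received_at: Optional[float]  # seconds, receiver clock; None if the packet was lost
    size_bytes: int


def summarize(probes: Sequence[Probe]) -> dict:
    """Derive one-way metrics from sent vs. received probe records."""
    delivered = [p for p in probes if p.received_at is not None]
    if not delivered:
        raise ValueError("need at least one delivered probe")

    # One-way delay per delivered packet (assumes synchronized clocks).
    delays = [p.received_at - p.sent_at for p in delivered]
    # Jitter here: mean absolute change between consecutive one-way delays.
    steps = [abs(b - a) for a, b in zip(delays, delays[1:])]
    # Throughput: delivered bits over the span from first send to last receive.
    span = max(p.received_at for p in delivered) - min(p.sent_at for p in probes)
    bits = sum(p.size_bytes for p in delivered) * 8

    return {
        "loss_pct": 100.0 * (len(probes) - len(delivered)) / len(probes),
        "mean_latency_s": sum(delays) / len(delays),
        "jitter_s": sum(steps) / len(steps) if steps else 0.0,
        "throughput_bps": bits / span if span > 0 else float("nan"),
    }


# Hypothetical records: four 1,000-byte probes, one of them lost in transit.
example = [
    Probe(0.0, 0.010, 1000),
    Probe(0.1, 0.112, 1000),
    Probe(0.2, None, 1000),
    Probe(0.3, 0.311, 1000),
]
print(summarize(example))
```

Even in this toy form, every number depends on choices the tool makes: packet size, spacing, which packets count and over what interval throughput is averaged.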
As the packet flies
My own service provider advertises a rate of 6.0 Mbps downstream/1.0 Mbps upstream for the "starter" tier to which I subscribe. Inspired to understand more about both speed testers and my service, I’ve decided to kick the tires on my broadband connection. It is worth noting that my service provider offers a bandwidth perk, known as "PowerBoost," which proposes to increase my download speed by up to double the advertised rate under certain network conditions.
I began my humble speed test experiment by connecting my laptop directly to the Ethernet port of my cable modem. In bypassing my local network, I hope to remove any complex configurations that may bias my results.
Next step, find a speed tester. A quick Google search for "Internet speed test" takes me to my first stop: speedtest.net. I’m greeted by a dazzling graphical map of the globe that displays all available speed test servers and encourages me to select the closest one to my location (which it appears to have discovered automatically).
A press of a button and the test begins. An aesthetically pleasing animation and dancing speedometer keep me entertained for roughly half a minute until the test is complete. The results are returned, and I’m astounded to see that my upload and download speeds are greater than what my service provider advertised. Based on this test, I’m roughly 16 percent over my advertised rate in the downstream and more than 25 percent above in the upstream (on average). Wow, am I getting more than I am paying for, or is this the "PowerBoost" talking?
Immediately repeating the same speed test gets me very different numbers. This time, my measurement is approximately 100 percent over the advertised rate downstream and 30 percent over the advertised rate upstream. Intrigued by the variability, I repeat the test for a total of 10 samples. (See Run 1 in Table 1.) After a five-minute break, I return to repeat the process (Run 2, Table 1).
Arguably not the most statistically significant sample, but over a total of 20 speed test measurements taken, reported downstream measurements varied by up to 198 percent, upstream speed by approximately 12 percent.
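For readers who want to run the same arithmetic on their own samples, a short Python sketch follows. The advertised rates come from my service tier; the sample values are hypothetical and are not the figures in Table 1. The spread is computed relative to the slowest sample, which is one reasonable convention among several.

```python
ADVERTISED_DOWN_MBPS = 6.0  # "starter" tier, downstream
ADVERTISED_UP_MBPS = 1.0    # "starter" tier, upstream


def percent_over(measured_mbps: float, advertised_mbps: float) -> float:
    """How far a measurement sits above (+) or below (-) the advertised rate."""
    return 100.0 * (measured_mbps - advertised_mbps) / advertised_mbps


def spread_percent(samples) -> float:
    """Gap between fastest and slowest sample, relative to the slowest."""
    lo, hi = min(samples), max(samples)
    return 100.0 * (hi - lo) / lo


# Hypothetical downstream samples in Mbps (NOT the values from Table 1).
down_samples = [6.96, 12.0, 5.1, 9.4]

for mbps in down_samples:
    print(f"{mbps:5.2f} Mbps is {percent_over(mbps, ADVERTISED_DOWN_MBPS):+.1f}% vs. advertised")
print(f"spread across samples: {spread_percent(down_samples):.1f}%")
```

A downstream reading of 6.96 Mbps, for instance, works out to 16 percent over the advertised 6.0 Mbps.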
Will my results vary as widely if I attempt a second speed test tool? After a bit of searching, I find another local speed tester offered by the digital subscriber line (DSL) provider Qwest. Again, I choose Denver as my closest speed test server.
Once more I go for 10 successive speed tests. (See Table 2.) With this tool, the results are somewhat more consistent, but contrast greatly with those generated earlier by speedtest.net. Interestingly, though my overall results are lower than those generated on speedtest.net, they are closer to my advertised service speeds.
Next stop on the speed test tour is broadbandreports.com, a popular site for scuttlebutt and host to a collection of tools and related discussion forums where the "quality" of various service providers is hotly debated. Guided to their tools section, I find two speed testers offered that seem suitable, one based on Sun’s Java and the other on Adobe’s Flash. Unfortunately, there doesn’t seem to be a server located close to me in Denver, so I select one in Los Angeles and repeat the 10-sample test using both speed testers offered.
In contrast to my earlier results, these measurements would suggest that I’m well under my service provider’s advertised rate limits. (See Tables 3 and 4.) Of note, again, both of these tools are offered by Broadband Reports, and the servers for both tests appear to be hosted by the same service provider in Los Angeles.
My less-than-scientific 30-minute experiment employed four different speed test tools, located in two different cities, separated by roughly 830 miles. In summary, measured speed across all samples taken varied by as much as 900 percent. What gives?
What are we measuring?
So why are the results for the same service provider so wildly different? In reality, the variability in speed test results stems from a large (quite possibly indeterminate) number of factors, each of which contributes to the inaccuracy of the measurement. We’ll examine a few of the more obvious culprits here.
The first is network topology. These speed tests measure much more than just my service provider’s network. With any system under test (SUT), we must clearly define what it is that we’re testing. Figure 2 illustrates a simplified view of a network path between a cable modem subscriber running a test client and the test server. Starting on the subscriber side where the speed test client executes, the speed test traffic traverses the following network path:
1. Subscriber’s home network
2. DOCSIS access network
3. Cable operator’s regional IP network
5. Service provider’s regional network
6. Tester server’s local network
In this example, the speed test traffic travels the network topology from 1 to 6 and then back again. The operator’s management domain only includes parts 2 and 3. If the provider’s network is assumed to be the SUT, then the test traffic clearly travels outside of that system. It follows that any speed test server located external to my provider’s managed network will produce measurements that include other network(s). Put another way, the longer the network distance between the speed test server and my provider’s network management domain, the less meaningful the result.
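A toy Python model shows why this matters. End-to-end throughput can be no better than the slowest segment along the path, and that segment need not belong to my operator. The segment names follow the path listed above; the capacity figures and the `Segment` and `bottleneck` names are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple, Sequence


class Segment(NamedTuple):
    name: str
    in_operator_domain: bool
    available_mbps: float  # hypothetical capacity available to the test flow


def bottleneck(path: Sequence[Segment]) -> Segment:
    """Return the segment that caps end-to-end throughput."""
    return min(path, key=lambda seg: seg.available_mbps)


# Capacities below are invented for illustration, not measured values.
PATH = [
    Segment("Subscriber's home network", False, 54.0),
    Segment("DOCSIS access network", True, 6.0),
    Segment("Cable operator's regional IP network", True, 1000.0),
    Segment("Service provider's regional network", False, 4.0),
    Segment("Tester server's local network", False, 100.0),
]

limit = bottleneck(PATH)
print(f"Test flow is capped at {limit.available_mbps} Mbps by: {limit.name}")
print(f"Inside the operator's domain? {limit.in_operator_domain}")
```

With these made-up numbers, the speed test would report roughly 4 Mbps and the operator would take the blame for a constraint outside its management domain.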
Another item worth noting is the subscriber’s home network (path point 1). Imagine sharing an 802.11 wireless connection into a $35 router in a college dorm room. Chances are you’re going to see different results from these speed testers than if you plugged directly into your operator’s terminating equipment.
Another key factor influencing these results is the anatomy of the speed test application itself. In general, these are software applications that vary widely in their design and implementation, with the internal details often hidden in the pre-compiled executable. Different protocols, test traffic blends and implementation languages are just a few of the many things that can determine the application’s behavior, all biasing speed test results.
It seems that Java, Flash and native executables are the three popular forms of client-side application implementation. Both Java and Flash lend themselves well to a browser-based client test, but each requires a runtime execution environment that introduces yet another component into the SUT and therefore a possible change in the results.
Next up are the endpoints. Speed test results are affected by the configuration and performance of the host computers running both the client and server software. It’s well understood that the resources of a subscriber’s PC can impact subscriber experience. How many other applications are running? What type of machine is this in terms of processor, memory, and input/output (I/O) resources? What is the operating system (OS)? How efficient is the transmission control protocol (TCP) stack implementation?
Finally, there’s the behavior of the access network itself. In the case of the cable modem service, DOCSIS technology applies a shared access layer which is, by design, "bursty" in nature. That is, bandwidth (actually the effective data throughput) is granted on a per-modem basis using fine-grained scheduling under the control of the cable modem termination system (CMTS). In practice, the behavior of a DOCSIS network is difficult to predict over short intervals and can change dynamically from a single user’s perspective. Speed tests that are run over the same network topology only a few seconds apart can render very different results.
Not just about speed
It’s commonly understood that for a number of Internet applications, it’s not how fast you can go; it’s the quality of the ride that counts. Conspicuously absent from most speed testers is the concept of quality of service (QoS). Given the rise of policy-based networking and service provider bandwidth management practices, it’s not guaranteed that all applications are, or will be, treated equally as their packets travel through the Internet.
For example, the popular voice over Internet protocol (VoIP) service provider Vonage hosts a third-party TCP-based speed tester in order to gauge the suitability of the subscriber’s network for using Vonage service. The underpinning assumption is that if the speed test generates data rate results equal to, or greater than, what the VoIP application requires, then the VoIP service will work, too. Not so.
Drawing the conclusion that VoIP will work is misleading because the speed test traffic and the VoIP service application traffic are based on completely different transport and application layer protocols. As a result, they may be subject to very different network policies and QoS mechanisms in the network. A measurement taken by a speed tester using protocol A can’t necessarily be held to indicate the performance of an Internet application using protocol B. In short, using speed tests to generalize the quality of a subscriber’s experience across all possible Internet applications is misleading.
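A back-of-the-envelope Python sketch illustrates the gap. It assumes a constant-rate voice stream carried as RTP over UDP over IPv4, with 160-byte payloads sent 50 times per second (typical of an uncompressed 64 kbps codec at 20 ms packetization). The quality thresholds are purely illustrative assumptions of mine, not figures published by Vonage or any other provider.

```python
RTP_HEADER_BYTES = 12   # fixed RTP header, no extensions
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20  # no IP options


def stream_bps(payload_bytes: int, packets_per_second: int) -> int:
    """IP-layer bit rate of a constant-rate RTP/UDP/IPv4 stream."""
    packet_bytes = payload_bytes + RTP_HEADER_BYTES + UDP_HEADER_BYTES + IPV4_HEADER_BYTES
    return packet_bytes * 8 * packets_per_second


def rate_only_check(measured_bps: float, required_bps: float) -> bool:
    """The assumption under test: enough measured speed means the call will work."""
    return measured_bps >= required_bps


def quality_check(latency_s: float, jitter_s: float, loss_pct: float,
                  max_latency_s: float = 0.150,   # illustrative threshold
                  max_jitter_s: float = 0.030,    # illustrative threshold
                  max_loss_pct: float = 1.0) -> bool:  # illustrative threshold
    """What a voice call is actually sensitive to, measured on its own traffic."""
    return (latency_s <= max_latency_s
            and jitter_s <= max_jitter_s
            and loss_pct <= max_loss_pct)


required = stream_bps(payload_bytes=160, packets_per_second=50)
print(f"Voice stream needs about {required / 1000:.0f} kbps each way")

# A hypothetical 6 Mbps TCP speed test result clears the rate bar easily...
print("rate-only verdict:", rate_only_check(6_000_000, required))
# ...while hypothetical delay, jitter and loss on the UDP path still fail the call.
print("quality verdict:  ", quality_check(latency_s=0.200, jitter_s=0.050, loss_pct=3.0))
```

The required rate is a small fraction of even my starter tier, so a rate-only test will nearly always pass. The properties that make or break the call are the ones a TCP throughput test never measures.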
Deserving of further attention in the speed tester matter is the measurement of application-layer performance. Many of the current speed testers follow the same basic protocol path. That is, they generate TCP transport segments populated with various streams of binary data. Others leverage hypertext transfer protocol (HTTP) at the application layer to encapsulate test traffic. Services such as VoIP and streaming video often rely on application-layer protocols such as real-time transport protocol (RTP) over user datagram protocol (UDP) on the transport layer. It’s worth investigating how the speed test concept can be extended to account for these important differences. Beyond rudimentary application-layer testing using HTTP, which crudely approximates Web applications, a need exists to generate test traffic that emulates QoS-driven applications.
A structured approach
The limitations of current speed testers and the challenges of active measurement-based performance testing over the Internet are many. A number of initiatives are underway to help address this field in anticipation of a growing need to better understand service guarantees and overall subscriber experience in broadband networks.
Within the Internet Engineering Task Force, the IP Performance Metrics Working Group (IPPM WG) has put forth a set of recommendations for standard measures of quality, performance and reliability of Internet data delivery. Since its inception, the IPPM WG has attempted to establish standard definitions for the following metrics:
• One-way delay and loss
• Round-trip delay and loss
• Delay variation
• Loss patterns
• Packet reordering
• Bulk transport capacity
• Link bandwidth capacity
Tools implementing IPPM metrics include an implementation of the one-way active measurement protocol (OWAMP). OWAMP attempts to characterize one-way performance using UDP. In addition, OWAMP includes support for DiffServ, which may prove useful for the evaluation of QoS-based application traffic in DOCSIS networks.
The network diagnostic tool (NDT), another tool based on a structured attempt to understand performance, is under development through the Internet2 consortium. NDT is an advanced software system capable of testing a variety of traffic configurations.
Even with these and other initiatives underway that attempt to provide more sophisticated and accurate methodologies, active measurement still faces one fundamental topology problem. That is, how does the test evaluate only the service provider’s network and isolate other systems from the measurement?
Your mileage may vary
My little speed test experiment is crude science at best, and that is the very point I’m attempting to make. The results captured were based on a handful of measurements using un-calibrated tools in a highly uncontrolled environment. Deriving conclusions regarding the performance or quality of my service provider based on these results would be deeply flawed. In summary, the efficacy of results produced by these Internet speed testers requires scrutiny.
It’s not my intent to throw stones at the current speed tests out there. I saw no tool making any hard claims regarding the accuracy of its results. That said, it would be beneficial for these software systems to display a caveat or two as a way to better inform their audience.
Perhaps of greater significance, speed tests have become an anecdotal metric for evaluating ISP bandwidth management practices as fodder for the great network neutrality debate. It seems these simple speed test tools may play an overly significant role of influence in the future of broadband Internet access.
As I see it, there’s one consistent result for speed tests: Mileage may vary.
Jason Schnitzer is the founder and principal of Applied Broadband. Reach him at email@example.com.