Jay Harris is Cpt. LoadTest

A .NET developer's blog on improving the user experience of humans and coders
 
Filed under: Performance | Testing

Outside of the QA world (and unfortunately, sometimes inside it), I’ve heard people toss around ‘Performance Testing’, ‘Load Testing’, ‘Scalability Testing’, and ‘Stress Testing’ as if they all mean the same thing. My clients do this. My project managers do this. My fellow developers do this. It doesn’t bother me; I’m not some QA psycho who harasses anyone who doesn’t use exactly the right term. But I do smirk on the inside whenever one of these offenses occurs.

Performance testing is not load testing is not scalability testing is not stress testing. They are closely related, but they are not the same thing.

  • Load testing is testing that involves applying a load to the system.
  • Performance testing evaluates how well the system performs.
  • Stress testing looks at how the system behaves under a heavy load.
  • Scalability testing investigates how well the system scales as the load and/or resources are increased.

Alexander Podelko, Load Testing in a Diverse Environment, Software Test & Performance, October 2005.

Performance Testing

Any type of testing–and I mean any type–that measures the performance (essentially, speed) of the system in question. Measuring the speed at which your database cluster switches from the primary to secondary database server when the primary is unplugged is a performance test and has nothing to do with the load on the system.

Load Testing

Any type of test that is dependent upon load or a specific load being placed on the system. Load testing is not always a performance test. When 25 transactions per second (tps) are placed on a web site, and the load balancer is monitored to ensure that traffic is being properly distributed to the farm, you are load testing without a care for performance.

Stress Testing

Here is where I disagree with Alexander: stress testing places some sort of unexpected stress on the system, but that stress does not have to be a heavy load. Stress testing could include testing a web server where one of its two processors has failed, a load-balanced farm with some of its servers dropped from the cluster, a wireless system with a weak signal or increased signal noise, or a laptop outside in below-freezing temperatures.

Scalability Testing

Testing how well a system scales is not tied to any single load or resource level; instead, it examines what happens as the load or the resources change. Does a system produce timeout errors when you increase the load from 20 tps to 40 tps? At 40 tps, does the system produce fewer timeout errors as the number of web servers in the farm is increased from two to four? Or when the Dell PowerEdge 2300s are replaced with PE2500s?


Every type of testing in QA is a vague category. This includes the countless types of functional testing, reliability testing, performance testing, and so on. Often a single test fits into a handful of testing categories. Testing how fast the login page loads after three days of 20 tps traffic can be a load test, a performance test, and a reliability test. How it should be categorized depends on what you are trying to achieve. In this example it is a performance test, since the goal is to measure ‘how fast’. If you change the question to ‘is it slower after three days’, then it is a reliability test. The point is that no matter where the test fits in your “Venn Diagram of QA,” the true identity of a test is based on what you are trying to get out of it. The rest is just a means to an end.

Sunday, October 16, 2005 1:41:01 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [1] - Trackback

I know. I haven’t posted in a while, but I’ve been crazy busy; twelve-hour days are my norm right now. Enough complaining, though; let’s get to the good stuff.

By now you know my love for PsExec. I discovered it while trying to find a way to add assemblies to a remote GAC [post]. Since then I’ve found even more to love: now I can remotely execute my performance tests!

Execute a LoadRunner test using NAnt and PsExec:

<exec basedir="${P1}"
  program="psexec"
  failonerror="false"
  commandline='\\${P2} /u ${P3} /p ${P4} /i /w "${P5}" cmd /c wlrun -Run
    -InvokeAnalysis -TestPath "${P6}" -ResultLocation "${P7}"
    -ResultCleanName "${P8}"' />

(I’ve created generic parameter names so that you can read it a little better; a filled-in example follows the parameter list.)
P1: Local directory for PsExec
P2: LoadRunner Controller Server name
P3: LoadRunner Controller Server username. I use an Admin-level ID here, since this ID also needs rights to capture Windows PerfMon metrics on my app servers.
P4: LoadRunner Controller Server user password
P5: Working directory on P2 for 'wlrun.exe', such as C:\Program Files\Mercury\Mercury LoadRunner\bin
P6: Path on P2 to the LoadRunner scenario file
P7: Directory on P2 that contains all results from every test
P8: Result Set name for this test run
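
For example, with entirely made-up server names and paths substituted for the placeholders, the task might look like this:

<exec basedir="C:\Tools\PsTools"
  program="psexec"
  failonerror="false"
  commandline='\\LRCTRL01 /u MYDOMAIN\lradmin /p secret /i /w "C:\Program Files\Mercury\Mercury LoadRunner\bin" cmd /c wlrun -Run
    -InvokeAnalysis -TestPath "C:\LoadTests\Scenarios\MyApp.lrs" -ResultLocation "C:\LoadTests\Results"
    -ResultCleanName "MyApp_Nightly"' />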

'-InvokeAnalysis' will automatically execute LoadRunner Analysis at test completion. If you properly configure your Analysis default template, Analysis will automatically generate the result set you want, save the Analysis session information, and create an HTML report of the results. Now, put IIS on your Controller machine, create a virtual directory that points to the main results directory in P7, and you will have access to the HTML report within minutes after your test completes.

Other ideas:

  • You can also hook it up to CruiseControl and have your CC.Net report include a link to the LR report.
  • Create a nightly build in CC.Net that compiles your code, deploys it to your performance testing environment, and executes the performance test (a rough NAnt sketch follows this list). When you get to work in the morning, a link to your full performance test report is waiting in your inbox.
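
For what it’s worth, here is a rough NAnt sketch of how those nightly steps might chain together. The target names, solution file, and deployment share are all hypothetical, and the CC.Net schedule trigger itself lives in its own configuration:

<project name="nightly-perf" default="nightly">

  <!-- Hypothetical solution file; substitute your own build steps -->
  <target name="compile">
    <solution solutionfile="src\MyApp.sln" configuration="Release" />
  </target>

  <!-- Hypothetical deployment share on the performance web server -->
  <target name="deploy" depends="compile">
    <copy todir="\\PERFWEB01\wwwroot\MyApp">
      <fileset basedir="src\MyApp\bin\Release">
        <include name="**/*" />
      </fileset>
    </copy>
  </target>

  <!-- Kick off the LoadRunner scenario remotely -->
  <target name="perftest" depends="deploy">
    <!-- The same psexec/wlrun <exec> call shown above goes here -->
  </target>

  <target name="nightly" depends="perftest" />
</project>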

The catch for all of this: you need a session logged in to the LoadRunner controller box at all times. The '/i' in the PsExec command means that it interacts with the desktop.

Sidenote

PsExec is my favorite tool right now. I can do so many cool things with it. I admit, as a domain administrator, I also get a little malicious sometimes. The other day I used PsExec to start up Solitaire on a co-worker’s box, then razzed him for playing games on the clock.

Friday, October 14, 2005 11:35:40 AM (Eastern Daylight Time, UTC-04:00)  #    Comments [0] - Trackback

Filed under: Performance | Testing

In the QA forums I frequent, there are often questions about how to properly load test when you don’t have access to production or an identically built environment. Most companies won’t spring the cash to build an environment that is identical to production; generally, testing environments are made up of hand-me-down servers that used to be in production. Of course, there is also the cost of test suite licensing to produce a productional load, and the near impossibility of mimicking real production traffic.

Though a production clone would be ideal, a watered-down environment can be sufficient, and in some ways better. Bottlenecks are reached faster, without having to push through 50 Mbps of data. Additionally, a “lesser” environment is more sensitive to changes: a transaction may take 0.5 seconds on production-grade servers, and a defect that doubles it to 1.0 seconds is hardly noticeable; but on a lesser environment where that transaction takes 6.0 seconds, doubling it to 12.0 throws up red flags.

For a watered-down environment, try to lessen the horsepower of your system while matching the architecture. If your production environment is eight web servers, all quad 3.2 GHz Xeons running Windows Server 2003 Web Edition and all load balanced through a hardware load balancer, you can bring it down to two web servers with less horsepower (perhaps dual 700 MHz P3s), but the servers should still run Windows Server 2003 Web Edition and be balanced with a hardware balancer. Do not drop below two web servers, because you still want a load-balanced environment, and do not switch to Windows 2000 or to Microsoft’s NLB (Network Load Balancing). If your production web environment uses Windows 2000 and NLB, obviously use that technology in your testing environment; do not switch to Windows 2003 or a hardware load balancer.

Additionally, try to reduce equally throughout your environment. Don’t drop your web servers from Pentium 4s to Pentium 3s while dropping your database servers from Pentium 4s to an old 486 desktop. Equal reductions maintain continuity and, in the end, your sanity. Unequal reductions introduce new problems that don’t exist in production but will still happily waste your time and money. A major bottleneck might exist on your web servers, but the defect could be masked because that old 486 made you database-bound.

The idea behind this is that many bugs can be introduced by a specific revision of your OS (think of the problems from Windows XP SP2), by your style of network infrastructure, by the version of your graphics driver, and so on. You want as many common points as possible between your testing and production environments to eliminate any surprises when you launch your application. Ideally, your testing environment is an exact replica of your production environment, but unless you are making desktop applications, that is only a fantasy, so just get as close as you can. Use the same OS version, including the same service pack and the same installed hot fixes. Use the same driver versions, and configure the same settings on your web server software. You are trying to create a miniature version of your production environment, like a model car or a ship in a bottle. Pay attention to the details and you will be okay. To your application, the environments should be exactly the same; one is just a little snug.

Wednesday, May 25, 2005 2:32:15 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [0] - Trackback

Filed under: Performance | Testing

For love of all things QA, before you launch a new application, test production!

“What? That’s stupid! Why would I want to run a load test against production and risk an outage? That impacts my SLAs. I can’t impact my SLAs!”

Remember the number one rule of quality control: if you don’t find it, your customers will.

When you are about to launch a brand new application into your production environment, test that application against production. However, this only applies to new applications. A new application introduces new, additional load on the environment, while an existing, revised application has already added its load to the system. Essentially, with an existing application, you already know how well the production environment can handle the demand generated by the application’s audience. A new application has not yet generated that load, and production has yet to prove itself.

There is no hard evidence that production can take the additional demand. Maybe your production load balancer can only handle another 5 MB/s, and your new application will demand another 7. Perhaps it is one of the switches instead. Or, as my recent life has taught me, maybe it is your ISP. You will not know until you test it, until you measure it, and “if you didn’t measure it, you didn’t do it.”

With a past project, my company created an intranet application for our client, and our application happened to be hosted off-site. The off-site environment was green and wasn’t hosting anything else yet, so our client had no issue with us fully testing it: it was going to be production, but wasn’t yet. The hosting company and their ISP rated the environment at 45 Mbps (that’s megabits, lower-case ‘b’), and based on the client’s traffic expectations, we needed about 30. It is a good thing we tested the site, because we found an issue with the load balancer at about 15 Mbps, a problem with server memory when it was processing enough transactions to produce 20 Mbps, a problem with the database switches when we were generating 22 Mbps, and (this one is the kicker) a bandwidth ceiling at 28. Though all of the routers, switches, balancers, and servers were performing well, we couldn’t get more than 28 Mbps to the web servers. It turns out that the ISP never expected anyone to use that 45 Mbps rating, and never tested to make sure they could handle it.

“If you didn’t measure it, you didn’t do it.”

Through two months of midnight-to-0600 testing, we upgraded the load balancer, added more memory, put in gigabit switches, had the ISP tweak their infrastructure, pushed through all of the data we needed, and successfully proved that the off-site environment and our new application could handle the load. But the environment still wasn’t fully tested. Our client used everyone’s favorite single sign-on, SiteMinder, and they wouldn’t let us test the application against their production SiteMinder policy servers. We could only use staging, and when the staging servers couldn’t handle the load, “that’s okay, because it’s staging.” No matter how much we advocated, we couldn’t test production; we might impact the environment and the SLAs. So we launched without testing it, and guess what happened? The policy servers failed, and they severely impacted their SLAs.

And to think, we could have tested that at 1:00 AM on a Saturday, and even if we fried the policy servers, they would have had all weekend to fix it. Most importantly, we would have identified the problem before the end users did. But what really cooked their goose was the difference between production load and performance-testing load: performance tests can be stopped. It is a lot harder to fix a jet engine at 30,000 feet.

The moral of the story: when launching a new application, always test production. Always.

Monday, May 23, 2005 2:35:26 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [0] - Trackback

Filed under: LoadRunner | Performance | Testing

For my needs, the biggest hole in Mercury LoadRunner is its lack of page-size monitoring. LoadRunner can monitor nearly anything else imaginable, including transaction counts, transaction times, errors, and all Windows Performance Monitor metrics. However, monitoring page sizes, download times, and HTTP return codes is only available through programming.

The following function monitors the page size of every response, logging an error if it exceeds your specified limit and plotting each value on the user-defined graphs. A sample call follows the listing.

int si_page_size_limit(int PageLimit, char* PageName, char* PageURL, long TransactionID) {
//
// Page Size Limit Monitor
// Author: Jay Harris, http://www.cptloadtest.com, (c) 2004 Jason Harris
// License: This work is licensed under a
//    Creative Commons Attribution 3.0 United States License.
//    http://creativecommons.org/licenses/by/3.0/us/
//
// Created: 10-Aug-2004
// Last Modified: 10-May-2005, Jay Harris
//
// Description:
// Logs an error to the log, pass or fail, including the applicable status, if logging is enabled.
// Plots page size datapoint to User Defined graph.
//
// Inputs:
// int PageLimit Maximum page size allowed, in bytes
// char* PageName Name of the page, such as the Title. For identification in logs.
// char* PageURL URL of the page. For identification in logs.
// long TransactionID Transaction ID for the current request.
// Note: Transaction must be explicitly opened via lr_start_transaction_instance.
// Note: TransactionID is returned by lr_start_transaction_instance.
//
 
    int iPageSize = web_get_int_property(HTTP_INFO_DOWNLOAD_SIZE);
    char DataPointName[1024] = "Response Size [";
    strcat(DataPointName, PageName);
    strcat(DataPointName, "]");

    if (PageLimit < iPageSize) {
        lr_continue_on_error(1);
        lr_debug_message(LR_MSG_CLASS_BRIEF_LOG | LR_MSG_CLASS_EXTENDED_LOG,
            "Page Size Check FAILED - %s [%s] exceeds specified page size limit of %d (Total: %d)",
            PageName, PageURL, PageLimit, iPageSize);
        lr_continue_on_error(0);
    } else {
        lr_debug_message(LR_MSG_CLASS_BRIEF_LOG | LR_MSG_CLASS_EXTENDED_LOG,
            "Page Size Check PASSED - %s [%s] meets specified page size limit of %d (Total: %d)",
            PageName, PageURL, PageLimit, iPageSize);
    }
    if (lr_get_trans_instance_status(TransactionID) == LR_PASS) {
        lr_user_data_point_instance_ex(DataPointName,iPageSize,TransactionID,DP_FLAGS_EXTENDED_LOG);
    }
    return 0;
}
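
For reference, here is a minimal sketch of how the function might be called from a script’s Action section; the transaction name, URL, and 150 KB limit are made up for illustration:

Action()
{
    long trans_id;

    // Start an instanced transaction so we have a handle to pass to the monitor
    trans_id = lr_start_transaction_instance("Login_Page", 0);

    web_url("Login",
        "URL=http://www.example.com/login.aspx",
        "Resource=0",
        "Snapshot=t1.inf",
        LAST);

    // Flag the page if the response is larger than 150 KB (153,600 bytes)
    si_page_size_limit(153600, "Login Page", "http://www.example.com/login.aspx", trans_id);

    lr_end_transaction_instance(trans_id, LR_AUTO);

    return 0;
}
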
Tuesday, May 10, 2005 2:51:58 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [6] - Trackback