Monday, May 07, 2012

Full stack functional testing for hackers

Why do I care about full stack functional testing?

Because you really care about the quality of your product but you also care about the cost of QA. The most cost efficient option is probably full stack functional testing, which isn't exactly easy, but hey, you are a hacker, aren't you?

For the sake of conciseness, "full stack functional testing" is mostly referred to as "functional testing" in the remainder of this article.

Why can't unit tests, no matter how good they are, replace real QA?

There are many reasons. Here are several examples:
  • When writing unit tests, developers commonly use mocks to isolate the module under test, which leaves the integration points between modules untested. It is perfectly possible to have 100% unit test coverage and an application that is still completely broken.
  • Unit tests, by their nature, do not cover integration points with external modules/APIs, and thus do not protect your application from behavior changes when you upgrade them. Traditionally, developers are cautious about upgrading external libraries because doing so often triggers the need for a full regression test.
  • The logic that unit tests exercise is sometimes too far removed from the behavior the end user actually interacts with.
  • Unit tests do not provide enough confidence during major refactoring, especially when a significant number of the unit tests themselves need to change because of the refactoring.
  • For web applications, unit tests do not test cross-browser compatibility.
In one sentence, you can't be serious if you plan to live without QA because "my application is pretty well covered by unit tests."
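
To make the point about mocks concrete, here is a minimal sketch in plain Ruby. All class names are hypothetical and exist only for illustration. The "unit test" runs against a hand-rolled mock that mirrors an outdated interface, so it keeps passing even though the same code crashes once it is wired to the real collaborator.

```ruby
# Hypothetical collaborator whose interface has changed: it used to accept
# price(sku) and now also requires a currency keyword.
class PriceService
  def price(sku, currency:)
    { "book" => 10 }.fetch(sku)
  end
end

# Hypothetical module under test; it still calls the old interface.
class Cart
  def initialize(price_service)
    @price_service = price_service
  end

  def total(skus)
    skus.sum { |sku| @price_service.price(sku) }
  end
end

# Hand-rolled mock that mirrors the OLD interface, as a unit test might.
class FakePriceService
  def price(_sku)
    10
  end
end

# The "unit test" passes against the mock...
unit_test_total = Cart.new(FakePriceService.new).total(%w[book book])

# ...while the same code wired to the real collaborator raises.
integration_error =
  begin
    Cart.new(PriceService.new).total(%w[book])
    nil
  rescue ArgumentError => e
    e
  end
```

Only a test that exercises `Cart` together with the real `PriceService` would notice the mismatch.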

Why full-stack functional tests? Does it count to have functional tests against a lower layer, e.g. the web service layer?

Modern web applications have lots of UI logic happening in the browser (written in JS), and the integration between the UI and the backend web service is too complex not to test. The point of testing from the UI layer all the way down to the persistence layer is that the tests perform the same interactions against the application as real users do (just faster). When they pass, you have 100% confidence your application works. Leaving out any layer will cost you that ultimate purpose.

Why are functional tests cheaper than manual QA?

You are a hacker, you know the motto: automation, automation, automation. Automated regression testing is several orders of magnitude faster than manual regression testing. Moreover, since the scope of regression testing keeps growing along with the application, the time manual regression testing takes will soon start to hurt the application's time to market. So, if you care about both quality and time to market, automated functional testing is the only scalable way to do regression testing.

Why are full stack functional tests often claimed to be too fragile to be worth it?

Functional testing is significantly different from unit testing, which developers are more used to. The main factor that makes functional testing harder is the amount of uncertainty involved:
  • The simple fact that the tests run end to end means that many more internal and external factors can affect them.
  • It's harder to always test against a clean set of data - the amount of data each functional test needs makes it impractical to set up fresh data for every test.
  • UI interaction is asynchronous and the response time isn't deterministic. Tests can fail randomly when an assertion is made too early or, put another way, when the expected result shows up too late.
Thus debugging functional tests is more challenging than debugging the more deterministic unit tests. It is very common for a functional suite to start out fragile and improve gradually until its robustness and stability reach a satisfactory level. Yes, there is a learning curve for developers who are used to writing unit tests, but it's not a "forget it, it's not worth it" one. All you need is the right developers.

How to improve the stability of functional tests over time?

When a test fails due to the nondeterministic nature of functional tests, don't just rerun it until it passes; take the time to improve its stability. A very important tool here is the ability to capture a screenshot automatically when a functional test fails, because you might not be able to reproduce the failure easily.
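
As a rough sketch of automatic screen capture, the helper below wraps a test body and saves a screenshot before re-raising any failure. The helper name and the `dir` parameter are made up for this example. It only assumes that `driver` responds to `save_screenshot(path)`, which the selenium-webdriver Ruby driver does.

```ruby
require "fileutils"

# Runs the block and, if it raises, saves a screenshot before re-raising.
# Exception (not just StandardError) is rescued on purpose: assertion errors
# in RSpec and Minitest do not inherit from StandardError.
def with_failure_screenshot(driver, name, dir: "screenshots")
  yield
rescue Exception
  FileUtils.mkdir_p(dir)
  timestamp = Time.now.strftime("%Y%m%d-%H%M%S")
  driver.save_screenshot(File.join(dir, "#{name}-#{timestamp}.png"))
  raise
end
```

In practice you would call something like this from your test framework's failure hook rather than wrapping each test by hand.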

Sometimes it takes some guessing to figure out what happened, but most such "random" failures are caused by the fact that the responsiveness of the UI isn't deterministic. To give a more specific example, suppose your test clicks a button on the UI and asserts some expected result. It can fail randomly when the assertion is made "faster" than the response arrives. A common technique is to poll the UI at a certain frequency for a period of time, during which the test keeps reading the UI for some indication that the application has fully responded. Only then does the test assert the expected result. This polling ability is a must for your functional test framework.
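
The polling technique can be sketched in a few lines of plain Ruby. The names `wait_until` and `WaitTimeout` are invented for this example and the default timeout and interval are arbitrary.

```ruby
class WaitTimeout < StandardError; end

# Polls the block until it returns a truthy value or the timeout expires.
# Returns the block's value on success and raises WaitTimeout otherwise.
def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout}s"
    end

    sleep interval
  end
end
```

A test would then click the button, call `wait_until` with a block that checks for the indication that the page has responded, and only afterwards make its assertion. The selenium-webdriver gem ships a similar utility, `Selenium::WebDriver::Wait`, so you may not need to write your own.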

Why is it the developers' responsibility to write functional tests?

There are at least 3 reasons:
  • Only developers can make sure the UI is automatically testable and easily update the functional tests when the UI changes.
  • Functional tests are code, lots of it. Writing readable functional tests requires at least basic OO design skills.
  • Improving the stability of functional tests is a tough job that requires virtuosity in debugging, patience and the confidence to overcome technical challenges. Most people with such capabilities tend to be developers, usually pretty good ones.

What does it take to successfully write and maintain functional tests?

As mentioned above, the biggest challenge in functional testing is debugging failures that appear to be random. Virtuosity in debugging, patience and confidence in tackling technical challenges are what you should look for when finding developers who can successfully execute the use-functional-tests-for-QA strategy. These traits usually go together. A developer who easily gets frustrated or even annoyed by the nondeterministic nature of functional tests won't be able to debug them effectively or improve their robustness, and will in turn come to believe that functional testing is always going to be fragile and thus won't work.

If functional tests cover everything, do I still write unit tests?

The purposes of functional tests and unit tests in TDD are very different. Functional tests are mainly for code coverage to ensure functionality. In TDD, the main purpose of unit tests is to drive development, and the code coverage they provide is more of a side effect. Thus functional tests cannot replace unit tests for that purpose. That being said, the code coverage provided by functional tests can help TDD because developers can now focus more on the drive-development purpose when writing unit tests without having to also keep code coverage in mind. Arguably, with functional tests you can choose other development methodologies without introducing much immediate risk to the quality of the software. So it opens up more options for the developers.

If the developers are writing functional tests that cover everything, do I still need dedicated QAs?

Without dedicated QAs or a dedicated QA process, your end users will run into more edge case issues. Developers are neither trained for nor interested in testing the system against every special case, so edge case issues will be discovered and fixed in a more on-demand fashion. A good strategy is to monitor your application closely and automatically alert your developers (and/or automatically roll back changes) whenever anything abnormal happens, such as exceptions or sudden changes in user activity. Then your developers can fix the issues as soon as possible and add functional tests accordingly. This approach is arguably more agile because you don't pay up front to fix edge case issues that real end users may never encounter.
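
To illustrate what "alert on abnormal things" can look like, here is a deliberately simplified sketch of an error-rate check. The class name, thresholds and the `on_alert` callback are all invented for this example. A real setup would more likely rely on an existing monitoring service and a sliding time window rather than a running total.

```ruby
# Calls on_alert once when the share of failed requests exceeds a threshold.
class ErrorRateMonitor
  def initialize(on_alert:, threshold: 0.05, min_requests: 20)
    @on_alert = on_alert
    @threshold = threshold
    @min_requests = min_requests # avoids alerting on tiny samples
    @requests = 0
    @errors = 0
    @alerted = false
  end

  def record(error: false)
    @requests += 1
    @errors += 1 if error
    check
  end

  private

  def check
    return if @alerted || @requests < @min_requests

    rate = @errors.to_f / @requests
    return unless rate > @threshold

    @alerted = true
    @on_alert.call(rate)
  end
end

alerts = []
monitor = ErrorRateMonitor.new(
  on_alert: ->(rate) { alerts << rate }, # e.g. page the developers
  threshold: 0.1,
  min_requests: 10
)
9.times { monitor.record }
3.times { monitor.record(error: true) }
```

The callback is where you would notify the team or trigger an automatic rollback.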

What tools are needed for writing functional tests for a web application?

Just a couple:
  • Selenium - Selenium 2.0 (WebDriver) is much better and quite stable, although the documentation isn't that great (here are some tips).
  • ChromeDriver - faster than Firefox, but has some limitations compared to Firefox, whose Selenium driver is the most mature one.
  • Your favorite BDD framework.
Given that the problem functional testing is trying to solve is already complex enough, it would be wise to keep the technology stack as simple as possible. The need for any extra layers on top of Selenium or BDD is very limited. You might need to write some simple helper methods to hide some of the boilerplate code needed for Selenium, but not much more.

Why a BDD framework over a unit testing framework when it comes to functional tests?

Although using unit test structures to organize functional tests should work, it is more natural to organize functional tests into contexts and scenarios, which BDD frameworks usually support better.

Is there any good pattern in writing functional tests?

The basic idea is to first model the application UI you are testing against and then write tests against that model. A popular pattern based on this idea is called the page object pattern. In this pattern, you write classes to represent UI pages so that
  • Detailed UI interaction implementation is hidden in these classes, e.g. the logic of how to locate a button.
  • Only business-meaningful methods are exposed, e.g. submit_order or set_shipping_address.
For a simplified Ruby example, here is a page object for a checkout page:
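  class CheckoutPage
    def set_shipping_address(address)
      # ... (fill in the address fields)
    end

    def use_shipping_as_billing=(same)
      checkbox = browser.find_element("#same-shipping-billing")
      checkbox.click if checkbox.selected != same # toggle only when needed
    end

    def submit_order
      # ... (submit the form and return the next page object)
    end
  end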
Then you can write tests like
  checkout_page = CheckoutPage.new(browser) # assumes the page object is built from the browser
  checkout_page.set_shipping_address(street: "20 Jane St", city: "London")
  checkout_page.set_credit_card_info(cc_num: "12345678", exp: "6/12")
  checkout_page.use_shipping_as_billing = true
  thank_you_page = checkout_page.submit_order
  thank_you_page.thank_you_message.should be_displayed
As you can see, by separating the modeling of the application from the tests, you get much more readable tests as well as reusable UI manipulation code.

Of course, you don't have to stop at the page level. You can also create partial (or control) objects to model a smaller part of the page. In one sentence, use your OO skills to model the pages/UI components in the most sensible way.
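
As a sketch of such a partial object, the example below models an address form that appears twice on a checkout page. Everything in it is hypothetical: the selectors, the `type` helper on the browser wrapper, and the `FakeBrowser` stand-in that lets the example run without a real browser.

```ruby
# Partial object for one address form, scoped to a root selector.
class AddressForm
  def initialize(browser, root_selector)
    @browser = browser
    @root = root_selector
  end

  def fill(street:, city:)
    @browser.type("#{@root} .street", street)
    @browser.type("#{@root} .city", city)
  end
end

# The page object reuses the same partial for both address sections.
class CheckoutPage
  def initialize(browser)
    @browser = browser
  end

  def shipping_address
    AddressForm.new(@browser, "#shipping")
  end

  def billing_address
    AddressForm.new(@browser, "#billing")
  end
end

# Stand-in for a real browser wrapper; it only records what was typed.
class FakeBrowser
  attr_reader :typed

  def initialize
    @typed = {}
  end

  def type(selector, text)
    @typed[selector] = text
  end
end

browser = FakeBrowser.new
page = CheckoutPage.new(browser)
page.shipping_address.fill(street: "20 Jane St", city: "London")
```

The locating logic for an address form now lives in one place, and tests still read in business terms.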

Is it possible to let the business people read or even write functional tests?

No. That's not going to work, so don't waste your time or money on it. The goal of functional testing is to QA the product; it is wise not to get distracted by adding more responsibilities to functional testing.

Any comments are welcome. Updated with one more Q&A about why functional tests have to be full stack and include every layer.


  1. Brilliant post, well thought out. I am a QA and have a slightly different view on the functional test part. Though QAs need to spend more time doing exploratory testing, the role of QA in shaping functional tests is important. I understand that the output of functional tests is largely feedback to the development team, but instead of leaving it completely to the developer, it would make more sense to have a QA pairing with the developer on that.

    This would help QA understand what is covered in functional test and where else the effort is needed.

    Nishant Verma

  2. Nishant, thanks very much for your feedback. In general I agree with you that QA should be included in the process of creating functional tests. In practice I still need to find out how to arrange such pairing, because ideally developers should be able to write functional tests whenever they need to. Having to arrange a QA pair beforehand could reduce the chance that they write functional tests. Another approach could be that during the QA desk check, when a story is dev complete and transferred to QA, they also go over the functional tests created for it.

  3. Kenny Lin, 12:57 PM

    I disagree that teams should have full stack functional Selenium tests. Around the end of our project, our dev team's grumbles about maintaining the thing turned into shouts, despite handing over the tests to QAs after story acceptance.

    The cons to this approach are
    1. Higher upkeep and maintainability- After a certain point, devs were wondering if they spent more time coding tests and fixing them than coding the app itself
    2. Functional Webdriver/Selenium tests take a long time to run and fully finish- In hindsight, there were certain stories in our project that were better off being tested on the unit or webservice layer rather than creating every single case.

    Personally, caveat 2 is a dealbreaker for me as a QA. For a dev, it could mean a longer CI cycle depending on how their acceptance/regression build is set up.

    I'd rather perform the "full stack test" on the webservices layer where the responses are quicker to generate and tease (with a tool like Jmeter) and delegate selenium tests to break the UI. This is also more in alignment to the Test Pyramid.

    With Jasmine being promoted as a JS tester, I think that's grounds to explore more "undocumented features" in selenium such as breaking the UI in a fashion that a human either rarely or can't perform.

    I can't vouch for the usefulness of this practice, but there'd be a good deal of info uncovered that would inspire other like-minded hackers.

  4. Kenny, Thanks for your comment.
    Here are my arguments:
    1. "Higher upkeep and maintainability"
    Functional tests, written in the right way, are actually less expensive to maintain than unit tests. Developers don't complain about spending time writing and changing unit tests; they do so with functional tests because a) they think it's QA's job and b) they become frustrated with it. This article argues against both: a) it is their job, since they are the only chance of having full stack functional tests, and b) functional tests can be improved so that they are no longer frustrating.

    2. Functional Webdriver/Selenium tests take a long time to run.
    This actually isn't a hard problem to solve - all you need is more and faster machines, which are much cheaper than either developers or QAs nowadays. So I don't see it as a legitimate reason not to do functional tests, not to mention that it's not hard to set up CI in a way that the time functional tests take doesn't bother devs.

    Modern web applications have lots of UI logic happening in the browser (written in JS) and the integration between the UI and backend webservice could be really complicated which brings back the point why you want to test end to end using selenium. The ultimate benefit of full-stack functional tests is that when they pass, you have 100% confidence your application works. Leaving out any layers will cost you that ultimate benefit.

  5. Kenny Lin, 4:48 AM

    1) I think the improvement process itself is upkeep and refactoring. Given a fluid state of a UI, even with best practices like maintaining page object/step models, that still takes a good amount of non-point related dev/QA work. And should it be a project of large and nebulous scope, the organization and categorization of tests (i.e. login/page 1,2,3,4/page 1 interaction a/b/c/d/e, etc...) becomes a task in itself. I mean that's something that's endemic to all testing routines in large projects, so I guess it's shifting that burden from QAs to devs...

    2) Yeah, testing on a grid with solid state drives will reduce the run times of acceptance tests, but I'd personally like "instant gratification" tests where I can fire off a large sequence of webservice/unit tests that gives me instant feedback on whether the check-ins are smelling good.

    That said, I don't disagree with your points. I just think it's a high luxury that less test-minded members of the team won't totally jump onto (since it's more tech debt for them). My point is more about finding that balance between quick feedback with that higher priced safety net of end to end full stack functional tests.

  6. I particularly like this point: "That being said, the code coverage provided by functional tests can help TDD because developers can now focus more on the drive-development purpose when writing unit tests without having to also keep code coverage in mind."

    One of the most difficult points when introducing TDD to someone is explaining the nuances around the things you don't need to or choose not to test. It's difficult because these nuances are intractably subjective (IMHO). When you have the option of ensuring simple things work with higher level functional tests then it reduces the value in testing those things even further, bringing people to agreement more quickly.

  7. Kenny Lin: To be sure, a project with a UI that's changing frequently will be a challenge, to say the least, to full stack functional testing. I can't imagine a project whose UI doesn't stabilize after some time though.

  8. I was having a chat with my friend who is a beginner in testing. He was sharing his problem with me on how to initiate testing. Although there are many books and articles on theories and concepts of testing but there is no help or information available on how to start testing and how a tester should proceed with the same. This is one excellent article you put together.

    also if you do not mind here is another article which talks about the basic steps of functional testing:

    Basic steps of Functional Testing

    Hope this will also help.

