Thursday, 20 August 2015

Pinteresting Test Automation - JavaScript Edition

It's been a roller-coaster of a month since my last blog post. In the last four weeks I have successfully managed to change jobs and learn JavaScript! I started on JavaScript the same way I started on Python, by completing the free Codecademy course. If you test things and you want to learn basic programming you should definitely give it a try.

Some initial observations made while learning JavaScript:

1) The learning process was much faster than last time. Knowing a first language definitely helps with learning a second. My first Fizzbuzz in Python took 30 minutes, but my first Fizzbuzz in JavaScript took 3 minutes.

2) White space is not an enemy in JavaScript land. Viva the curly bracket!

3) Forgetting semicolons isn't nearly as bad as I thought it would be.

I've also learned absolutely loads of things about test automation with JavaScript in the last couple of weeks, which is the main reason for this blog post (hooray!).

One of the first things I did was install Node.js, which comes with a truly awesome package manager called npm. The package manager made it really easy to try out all of these testing frameworks. Beware if you're on Windows 10, however: some tweaking was required to get it working correctly (Stack Overflow is your friend).

I discovered that there are many different testing frameworks available for writing tests with JavaScript. Actually it's not just testing frameworks, there are many, many JavaScript frameworks in general. Far too many of them. There is a joke among developers that a new JavaScript framework is born every sixteen minutes!

Testing frameworks I encountered and explored were:

* Jasmine

* Mocha

* Chai

* Cucumber.js

* Selenium WebDriver JS

* Nightwatch.js

* Protractor.js

Some of these frameworks are specifically for unit testing, some are for end to end testing. Some depend on each other, some are agnostic and framework free.

I drew a little ASCII diagram to try to visualise them. Each framework is listed left to right in a box with either (u) for unit testing or (e2e) for end to end testing. Each framework box has everything it uses listed underneath it.

These test frameworks increase in complexity from left to right. Standalone Jasmine is a simple unit test framework that just requires JavaScript. Protractor is a more complex end to end test framework that requires one of Jasmine, Mocha with Chai, or Cucumber, and uses both WebDriver and Node.js.

I had a play around with standalone Jasmine, but as this is a unit test framework, I found I had to actually write some JavaScript code before I had anything to run my tests against. Unit tests are usually written by the developers that are developing the application. As a Test Engineer, the tests I need to write are a mixture of acceptance tests, integration tests and end to end tests.

* Acceptance test - Determines if a specification (also known as a user story) has been met.

* Integration test - Determines if a number of smaller units or modules work together.

* End to end test - Follows the flow through the application from the start to the end through all the integrated components and modules.

I looked at Protractor next. Protractor is a testing framework which has been around for a couple of years. I saw that the tests were formatted in a BDD (Behaviour Driven Development, not Beer Driven Development) style.

The syntax Protractor uses is based on expect, along the lines of 'expect something to equal something else', rather than the more familiar verify/assert statements I encountered when I was writing Selenium WebDriver tests in Python. Protractor's main strength is that it was created specifically to test AngularJS applications. It supports element location strategies for Angular specific elements. If you need to test anything created in AngularJS, Protractor is the King of the Hill.

I then moved on to looking at Nightwatch, which felt closer in syntax to the Selenium WebDriver tests I had previously written. Nightwatch is newer than Protractor, making its first appearance on GitHub in February 2014. I found a good tutorial for getting started with Nightwatch which also has a demo on GitHub.

After a bit of playing around with it, I decided I was going to re-write my Python Pinterest test in JavaScript with Nightwatch.

I went through all the Nightwatch asserts and commands and tried to include as many of them as possible in the sample test I wrote.

It was very reassuring to see first-hand that JavaScript and Nightwatch are capable of carrying out all of the tasks possible with Python and Selenium WebDriver.

Anyway, here is the test example I wrote with JavaScript and Nightwatch. One of the main advantages I found of writing within a testing framework was that creating the tests was actually much faster. The amount of text I had to physically type in was less than if I hadn't been using a test framework. Also, instead of faffing around with variable assignments, a lot of the nitty gritty of what was going on in the background was hidden away from me, allowing me to just focus on writing the test.

Saturday, 18 July 2015

A Pinteresting Python Selenium Example

Eight months ago I started my Selenium adventure by learning how to automate finding pictures of cute cats. I chose Python as my weapon due to it being very easy to install, not requiring a server to run and not needing a heavy IDE for development. I have been writing automated UI tests both at home and at work. I found my automated tests not only saved me time carrying out tedious repetitive regression tasks, but also found genuine bugs ranging from obscure to showstopper!

Once I started writing tests with Selenium I found the more I wrote, the more snippets of code I had available to re-use. Writing tests started becoming much faster when the challenges I was encountering were challenges I had previously solved. I wanted to take all my good code snippets and combine them into a useful test that could be used as an example or reference material. None of the tests I had written at work were suitable as they were not written for software which was available to the general public. So I chose a well-known website and decided I was going to write a really good test with lots of comments so I could keep all my code snippets in one place.

The site I chose to automate was Pinterest and my 'boilerplate' Python Selenium example can be found here on GitHub.

While writing the test I discovered something that appeared to be a minor bug. A logged-in user was able to enter the email address they used to register for Pinterest into the 'Invite friends' box and invite themselves to join Pinterest again, receiving an invitation email for a site they had already registered for in the process. I would have expected the user's email address to be checked to see if it already belonged to an active account before sending an invite. I guess this just proves there is always value in sitting down and taking the time to write these kinds of tests.

Thursday, 25 June 2015

Applying a soft dip heuristic to software testing

Just as different people can possess different political beliefs and not everyone believes the same thing, I think the same can be said of software testing. In the world of testing a one size fits all 'right answer' doesn't exist. Lots of people have lots of different ideas and some of these ideas can conflict with each other. The whole manual vs. automation argument is a good example of this. Some people think that automated testing is a silver bullet that will eliminate all bugs. Some people believe that test automation is so expensive in terms of time and effort in relation to the value it returns that it should be used sparingly.

When I think about where my testing beliefs fit into the testing community around me, the principles of context driven testing resonate with my personal experience, like music to my ears. Testing is hard, I know this, I have experienced this first hand and as such I know that there are no absolute best practices. What might be a good approach for one problem could be totally impractical or impossible for a different problem. But I also know there are ways we can make life easier for ourselves.

James Bach seems to really like making up nonsensical acronyms to try to remember good ways to test things. Some examples of nonsensical testing acronyms created by James Bach would be:

HICCUPPSF = History, Image, Comparable Product, Claims, User Expectations, Product, Purpose, Standards and Statutes, Familiar Problems

CRUSSPIC = Capability, Reliability, Usability, Security, Scalability, Performance, Installability, Compatibility

CIDTESTD = Customers, Information, Developer Relations, Team, Equipment & Tools, Schedule, Test Items, Deliverables

DUFFSSCRA = Domain, User, Function, Flow, Stress, Scenario, Claims, Risk, Automatic

If you don't believe me, paste any of those seemingly random strings of capital letters into Google and see what comes back :)

My favourite of all these 'Bachist' acronyms is the SFDIPOT one, because even though at first glance it sounds like utter bollocks, it's the one that has proven the most useful to me and, as a believer in context driven testing, I only care about practices that are actually good in context. I still thought the way he had arranged the letters made it sound like bollocks, so I rearranged it in my head so I could remember it more easily. After all this is about what works for me, not what works for Mr Bach.

You say potato, I say potato. You say SFDIPOT, I say SOFTDIP. Soft dip is nice, especially at parties with an assortment of breadsticks and savoury nibbles. What is SOFTDIP?

SOFTDIP = Structure, Operation, Function, Time, Data, Interface, Platform

Each of the words asks questions about the thing being tested. I find that asking these questions helps me to imagine how I am going to test, identify important scenarios before I begin testing and make sure I don't overlook anything really important.

The Softdip questions I ask myself are:

Structure - What is it made of? How was it built? Is it modular? Can I test it module by module? Does it use memcache? Does it use AJAX?

Operation - How will it be used? What will it be used for? Has anyone actually given any consideration as to why we need this? Are there things that some users are more likely to do than others?

Functionality - What are its functions? What kind of error handling does it have? What are the individual things that it does? Does it do anything that is invisible to the user?

Time - Is it sensitive to timing or sequencing? Does clicking multiple times trigger multiple events? How is it affected by the passage of time? Does it interact with things that have start dates / end dates or expiry dates?

Data - What kind of inputs does it process? What does its output look like? What kinds of modes or states can it be in? What happens if it interacts with good data? What happens if it interacts with bad data? Is it creating, reading, updating and deleting data correctly?

Interface - How does the user interact with it? If it receives input from users, is it possible to inject HTML/SQL? What happens if the user uses the interface in an unexpected way?

Platform - How does it interact with its environment? Does it need to be configured in a special way? Does it depend on third party software or third party APIs? Does it use things like Amazon S3 buckets? What is it dependent on? What is dependent on it? Does it have APIs? How does it interact with other things?

Yeah, I know what you're thinking: just another mnemonic. But let me give you an example and show you how this works for me, because if it works for me, who knows, it might work for you.

Once upon a time I was given some work to test, a new feature which when used correctly will delete a customer from a piece of software. Sounds simple on the surface, doesn't it? The specification for the feature was just that: we need a way to delete a customer from the software. How would you test that? What would you consider? I know from experience that deleting things is a high risk operation so I wanted to be certain beyond doubt that this new feature would be safe and have no unexpected consequences or side effects. So I worked through SOFTDIP and this is what I came up with.

Structure - I have been provided with a diagram that shows where all client data is stored in the software's database. It seems that information about a customer can be present in up to 27 different database tables. When I test this I will need to make sure that no traces of unwanted data are left behind. Maybe I can write some SQL to help me efficiently check these 27 tables.

Operation - The software is legacy and has so much data in it currently that it costs a lot of money to run. As customers migrate to our newer product, their old data is being abandoned in this legacy software and we are still paying for it to be there. This is why the feature is needed. Only admin users will use this feature to remove data, customers and end users must not be able to access this feature. I am going to need to test that non-admin users are definitely unable to access this feature.

Functionality - It deletes existing user data. It must only delete the data it is told to delete and leave all other data intact. It must delete all traces of the data it is told to delete.

Time - What happens if deletion is triggered multiple times? What happens if removing large amounts of data takes too long and the server times out? What happens if the act of removing data is interrupted? What happens if data is partially deleted and an admin user attempts to delete it again?

Data - The software can be in the state of not deleting anything or deleting stuff. Live data is HUGE compared to the amount of data in the staging environment. I will need to prove that this feature functions correctly running on live data. This will most likely involve paying for an Amazon RDS instance to hold a copy of live data. I need to make sure I know exactly what I'm going to test and how before I request this RDS instance to minimise costs. It's also possibly going to be a good idea to take a database snapshot of this RDS instance before testing starts so I can easily restore back to the snapshot if/when tests fail or need to be re-run.

Interface - I have been told there will be 'something' at a URL ending /admin which will allow the admin user to delete account data. I need to make sure that customers are not able to access this URL directly and that only admin users are able to initiate the deletion process. I'm also going to have to make sure that even though this interface won't be seen by customers, it is fit for purpose. Consideration should still be given to things like asking the user for confirmation before any kind of deletion starts.

Platform - This software does make some requests to third party software to retrieve data. However, if customers are deleted then those requests won't happen as the software won't know what to ask about. I need to prove that this statement is true. There is a second piece of software that asks the piece of software under test for data. What happens if it asks about a client that has been deleted? I'm going to have to make sure that I also test this scenario.
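The "maybe I can write some SQL" idea from the Structure point could look something like the sketch below. It is only an illustration under some loud assumptions: it uses Python's built-in sqlite3 module with an in-memory database, and the table names, the customer_id column and the helper functions are all made up for the example. The real system had 27 tables and its schema isn't shown here. delete_customer is just a stand-in for the feature under test.

```python
import sqlite3

# Hypothetical table names; the real system had 27 tables holding customer data.
CUSTOMER_TABLES = ["customers", "orders", "invoices"]


def leftover_rows(conn, customer_id, tables=CUSTOMER_TABLES):
    """Return {table: row_count} for every table still holding rows for the customer."""
    leftovers = {}
    for table in tables:
        # Table names come from our own fixed list, never from user input.
        query = f"SELECT COUNT(*) FROM {table} WHERE customer_id = ?"
        (count,) = conn.execute(query, (customer_id,)).fetchone()
        if count:
            leftovers[table] = count
    return leftovers


def delete_customer(conn, customer_id, tables=CUSTOMER_TABLES):
    """Stand-in for the feature under test: remove the customer from every table."""
    with conn:  # commit on success, roll back on error
        for table in tables:
            conn.execute(f"DELETE FROM {table} WHERE customer_id = ?", (customer_id,))


def build_demo_db():
    """Create a tiny in-memory database with two customers in every table."""
    conn = sqlite3.connect(":memory:")
    for table in CUSTOMER_TABLES:
        conn.execute(f"CREATE TABLE {table} (customer_id INTEGER, payload TEXT)")
        conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)",
            [(1, "delete me"), (2, "keep me")],
        )
    conn.commit()
    return conn
```

With something like this in place, a check for "no traces left behind" is just leftover_rows(conn, 1) == {} after the delete, and the same helper run against customer 2 shows whether data that should have been left alone is still intact. Running the delete a second time and repeating both checks covers the "triggered multiple times" question from the Time point.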

Asking these questions starts to form the basis for my test plan:

* attempt to access feature as non-admin user, verify non-admin unable to access feature
* make sure user is asked to confirm before delete operation starts
* attempt to start the delete operation multiple times for the same customer
* attempt to start the delete operation multiple times for multiple customers
* ensure feature works correctly when using live data
* ensure after running delete operation that all data has been successfully removed
* carry out a regression test to ensure existing functionality is maintained
* test the delete operation for a customer which has a really large amount of data
* verify software no longer makes requests to third party software for deleted customers
* verify that other software which makes requests to the software in test still functions

So why bother going to all this trouble, all of this preparation before starting testing? Well I'm always happier when I run code changes against a decent test plan. It makes me feel reassured when the time comes to release them into the wilds of the production environment. Every day, a lot of people depend on the software I test and I feel a strong responsibility to them to do the best job that I possibly can. Good testers care deeply about creating good software.

Monday, 8 June 2015

Tips for Staying happy and sane while testing software - Tip #3

Assume any information given to you could be made of lies until you have proven it to be true (and seen it to be true with your own eyes).

If information given comes from a non-technical or customer facing source be especially wary. If that source is members of the general public then loud warning klaxons should be immediately sounding in your head!

What happens when someone else sets your baseline or expected behaviour and the thing you are testing does not meet that baseline? Well, firstly you don't have enough information to establish where the error is. Is it a problem with the thing being tested or a problem with the expectation set for the thing being tested? Don't always assume blindly that your oracle is correct!

Without any prior knowledge of a system, if Bob one day tells you that software A and software B share the same database, you might think 'Hey Bob thanks for sharing that information with me, that's going to make my life much easier now I know that.' If he had worked on the system for many years would his statement be any more valid than if he had been at the company a week? Possibly. But Bob is still a human and he could make a mistake with the words he used to describe what he was actually trying to communicate to you.

What if what Bob actually meant to say is Software A has a single database table which Software B also uses and there is also a second database where the majority of information used by software B lives. Would that change the way you test the interaction between software A and software B? Of course it would! This revelation would certainly lead to more questions. The first new question possibly being can both pieces of software write to this shared table? But what would happen if you were a black box tester and only found out Bob's initial statement to you was false while you were in the middle of running some tests based on what he had said?

It could possibly change the way you interact with Bob. Repeated swearing and name calling may even happen, depending on the situation and levels of rage.

Happy testers realise that questioning everything that isn't clear also includes questioning people. These days, if someone said to me software A and software B share a database, my initial response would be 'Orly?' then I would actually take a few measurements to check that this statement was true before embarking on major time consuming testing adventures. I guess it's just like the way you would check the oil and water in a car before going on a long journey.

To summarise, humans are human, even if they tell you about cake, the cake could always be a lie.

Monday, 1 June 2015

bLogjam - A Cryptojelly Commentary

So in the last couple of weeks a new computer security exploit was discovered - hooray! Something we thought was safe, trusted, tried and tested over a very long period of time has turned out to be flawed. It's in the media, the sky is falling and people that use the internet are scared about things they do not understand! Customers are frantically emailing companies to ask if they are safe, and how safe safe actually is. An exploit, how dreadful! Hearts were bleeding last time, what is this nonsense?

Well, it's been named Logjam and it's pretty interesting as it exploits a cryptographic method for which a patent was filed back in 1977.

The cryptographic method exploited by Logjam is called a Diffie-Hellman key exchange. So why does anyone actually care and why does cryptography stuff matter?

Let’s imagine two people that want to share a secret, but they are only able to talk to each other in public where anyone can hear everything they say to each other. That would be a pretty annoying situation, especially as sometimes you really want to share secrets with your best friends without anyone else listening in (I know I do).

So cryptography solves the problem of sharing secrets in public. The simplest explanation of how a Diffie-Hellman key exchange works is to say it is like mixing various colours of paint together. The trick is that it’s easy to mix two kinds of paint together to make a third colour, and it’s very hard to unmix a paint mixture to establish which two colours made that particular shade.

If these two secretive people want to share a secret colour they can do this using a selection of colours of paint. They can both agree in public to start with the same colour like yellow (the public starting value) then secretly pick a second colour that no one else knows (a private key) which will help them secretly share a new secret colour with each other.

So let’s say one person's secret colour is red and the other is blue. Both secret colours get mixed with the public yellow colour (to make orange and green respectively). One person then gives the orange paint to the other and receives green paint. The clever bit comes now, when they each add their own secret colour again, mixing red into green and blue into orange. The end result is they are both left with the same horrible shade of dirty brown. Possibly not the most aesthetically pleasing colour, but no matter how yucky it looks they now both have the same colour and more importantly, nobody else knows what that final colour is.

Colourful diagram shown above because colourful diagrams always help.

Now imagine that instead of mixing colours, a Diffie-Hellman key exchange mixes numbers together instead using hard sums.
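For anyone who wants to see the hard sums, here is a minimal sketch of the number version of the paint mixing. The "mixing" is modular exponentiation, which Python's built-in pow(base, exponent, modulus) does for us. The names Alice and Bob, the helper functions and the specific values of P and G are assumptions picked to keep the example readable. They are not secure choices and not what any real browser or server uses.

```python
import secrets


def make_keypair(p, g):
    """Pick a private number and derive the public value g^private mod p."""
    private = secrets.randbelow(p - 3) + 2  # somewhere in 2 .. p-2
    public = pow(g, private, p)  # the "mixed paint" that is safe to hand over
    return private, public


def shared_secret(their_public, my_private, p):
    """Mix the other person's public value with my own private number."""
    return pow(their_public, my_private, p)


# Toy public parameters (the "yellow paint"), assumed for this sketch only.
# Real exchanges use carefully chosen primes that are far larger than this.
P = 2**127 - 1  # a Mersenne prime, chosen for readability only
G = 3

alice_private, alice_public = make_keypair(P, G)
bob_private, bob_public = make_keypair(P, G)

# Each side mixes the other's public value with their own private number.
alice_secret = shared_secret(bob_public, alice_private, P)
bob_secret = shared_secret(alice_public, bob_private, P)

print(alice_secret == bob_secret)
```

Both sides end up with the same number (the dirty brown), so the last line prints True, even though neither private number was ever sent to the other person.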

So what happened recently was that some people discovered a way to swap what would be the equivalent of the vibrant paint palettes used by this method for crappier paint palettes. People thought they were picking their secret paint from large palettes containing lots and lots of colours, when unfortunately an attacker had switched their palettes with smaller palettes containing a smaller number of colours. And we all know if you only have a choice of red, yellow and blue it’s much easier to work out that the secret mixed colour will be a nasty shade of brown.
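To show why a small palette is such a problem, here is a toy sketch where the numbers are so small that an eavesdropper can simply try every possible private number. This is a deliberate simplification: the real Logjam attack forces a connection down to weak 512-bit "export-grade" parameters and then uses much cleverer maths than guessing, but the principle of a smaller search space being easier to crack is the same. The function name and the values 23, 5, 6 and 15 are assumptions made up for the example.

```python
def crack_private_key(public, g, p):
    """Try every possible private number until one reproduces the public value."""
    value = 1
    for candidate in range(1, p):
        value = (value * g) % p  # value is now g^candidate mod p
        if value == public:
            return candidate
    return None


# Tiny "palette": p = 23, g = 5 (toy numbers, not real-world parameters).
P_SMALL, G_SMALL = 23, 5

alice_private, bob_private = 6, 15  # secrets the eavesdropper never sees
alice_public = pow(G_SMALL, alice_private, P_SMALL)
bob_public = pow(G_SMALL, bob_private, P_SMALL)

# The eavesdropper only sees p, g and the two public values...
recovered = crack_private_key(alice_public, G_SMALL, P_SMALL)
# ...but with so few possibilities that is enough to rebuild the shared secret.
stolen_secret = pow(bob_public, recovered, P_SMALL)
```

With only 22 possible private numbers the loop finishes instantly. The same loop against a properly large prime would not finish in any useful amount of time, which is exactly why swapping the palette matters.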

The logic of mixing the colours was and still appears to be sound, just no-one imagined until recently there would be a way to switch the palettes around and limit the number of private colour choices. As testers we must always strive to imagine the unimaginable. This is one of the reasons why testing is much harder than it appears to be at face value. There may be right answers and wrong answers, but there are also unknown questions which have yet to be answered. Don't worry though, unlike a mythical bug (Rowhammer), Logjam is really easy to patch and most people won't have to do anything more than upgrade their web browser to the latest version to make it go away forever.

Monday, 16 March 2015

Mythical Bugs That Can't Be Patched

When you work in software testing, every now and then, you get to hear other people's stories about bugs. Most of these stories will be fairly mundane. Something along the lines of "Yeah, I clicked the button and nothing happened". But there will be other times, once you have been working in software testing for a while, when you may get to hear a story about a legendary bug. Legendary bug stories tend to go something like "Yeah, and then if you paste that in to the text box and hold down the shift key, it sends an email to 193,248,2489 customers thanking them for ordering Nickelback's latest album".

I love legendary bug stories. They can serve as mild amusement, shining examples of things that should or should not be done, cautionary tales of woe or even be so far outside of the box that they change the way a tester will think about certain problems or situations. I think all testers must love a good bug story, the ones I go drinking with certainly do :)

The legendary bug story from my last games testing team went something like this. Once upon a time there was a racing game that was about to be released. While testing this racing game, one of the AI controlled cars fell through the track. Replays of the race had to be watched back and scrutinised in microscopic detail to try to ascertain which car had fallen through the track and more importantly where it was on the track when it fell through. All the cars had the names of the drivers written just above the wind-shields. While watching the replay footage back, one of the testers noticed a spelling mistake. A driver called 'Bayne' had 'Banye' written on his car. This spelling mistake had been in the game for a long time. The spelling mistake had been missed by everyone, at every level of development and was also in a whole load of promotional screen-shots and marketing material for the game! This legendary bug would possibly fall into the cautionary tales of woe category. The fact that the test team caught it very late in the day and saved the company from significant embarrassment pushed the story of 'Bayne' into legendary bug status. Seriously, I'm surprised no one on the team submitted it to The Trenches.

Outside the world of games however, legendary bugs can sometimes be utterly mythical. At Google they have an all-star team of security testers dubbed 'Project Zero'. These people actively hunt out vulnerabilities with the aim of finding the flaws before the bad guys find them so they can be fixed. Well, Project Zero found a new bug last week. Not just any bug, a mythical hardware bug!

The story goes like this. Computers use memory to remember things. There is a type of memory called dynamic random-access memory (abbreviated to DRAM). DRAM works by storing every bit of data in a separate capacitor. The capacitor can be either charged or discharged and these two states represent 0 or 1. Google's testers found a way to change 0s to 1s or 1s to 0s without accessing them. They found that if you pick two memory locations either side of a third memory location, and bombard these two 'aggressor' locations with requests, the third 'victim' location can flip its value on its own.

They are calling the exploit Rowhammer and you can read the Project Zero blog post here. The worst thing about this bug is that it is physical in nature. It can't be patched.

There is currently a test for Rowhammer on GitHub, although in the warnings it does say "Be careful not to run this test on machines that contain important data." So you possibly won't want to try this on your home PC. At least knowledge of this issue is in the public domain now. Knowing that the Rowhammer exploit exists possibly makes it slightly less terrifying. It certainly will be interesting to see if and how anyone takes advantage of it.

Wednesday, 4 February 2015

Please don't feed me spaghetti code.

Everything is urgent, everything is critical, rush rush, develop develop, test test, now now now! This is a common theme in both games development and software development. Management pressure is always there, trying to get the product shipped or the next chunk of code released. Everything may appear shiny and happy on the surface but underneath, code becomes a twisted, tangled, distorted mess which starts testing the sanity of anyone that has to interact with it.

I recently found out about the term 'Technical Debt'. I hadn't heard of it before but after reading a description, I realised all software development will have encountered this debt in some form or another.

So what is Technical Debt and how does it affect software? Well, cutting corners leads to bad decisions, which in turn leads to problems. Fixing problems takes time so when corners are cut a technical debt is created. The debt can be paid at some point in the future when the consequences of the bad decisions are fixed but frequently these debts are ignored. When technical debt is left in place without repaying it, it grows, and accumulates interest as those bad decisions start requiring even more bad decisions to work around them.

Technical debt leads to architectural nightmares made out of spaghetti code. New features gradually start requiring an ever growing number of hacks and workarounds to implement. Before long, the code base starts to look like a really high Jenga tower held together by wishes and tears, in danger of collapsing at any moment. Even making simple changes to the software becomes increasingly challenging as the technical debt grows.

Taking on small amounts of technical debt does seem to be completely unavoidable. But some companies don't know how to manage their technical debt. Even fewer companies know when they should avoid taking on new technical debt or even how to start making repayments. Technical debt is actually really dangerous because it is one of a few things that can kill companies dead.

Continuous regression testing is possibly the easiest way to find problems and identify potential code changes that create technical debt. When these kinds of problems are found at the testing stage a choice can be made between either fixing the problems (paying back some of the debt) or backing out of making the change (completely avoiding any new technical debt).

Reducing technical debt ideally should be part of a company's culture because once it starts building up, it won't remove itself. There are various testing activities that can help identify technical debt. However, as this debt is created by bad code and poor architectural decisions, testing won't make this debt go away. Only refactoring code, redesigning and recreating can pay back technical debt.

There isn't very much a test team can do on its own to reduce technical debt other than shouting loudly and hoping developers and architects pay attention and listen.