
Thursday, January 28, 2016

symbiosis and software systems integration

There's a running conversation on Twitter about the definition of "integration testing" and the notion of "integration" when it comes to software, or systems as a whole. I think that understanding the concept behind integration in terms of software systems will give us clues on how to define the whats and hows of testing from an integration perspective.

Good old Wikipedia defines "System Integration" as:
In information technology, systems integration is the process of linking together different computing systems and software applications physically or functionally, to act as a coordinated whole.
"linking together"

Looking at the idea of linking things together brings the notion of biological relationships to mind. First, uni-directional linking. This is very similar to the idea of Symbiotic Commensalism, which is defined as
a class of relationships between two organisms where one organism benefits from the other without affecting it.
Consider a library that you use as part of building a specific piece of functionality on your site. The methods or classes from that library give you the benefit of not needing to write your own. The library remains unchanged and is not affected by how you use it. This definition of a commensal relationship is somewhat shallow, though, because at some point a given library can become problematic: bugs in that library will eventually affect the system you are integrating it into.

The second type of linking is bi-directional. This reminds me of Symbiotic Mutualism. The "hope" in this type of integration is that it brings mutual benefits to both systems being integrated. The closest thing that comes to mind is a user profile system that feeds into a marketing system, which in turn provides suggestions for a given product based on a distinct set of user choices.

Just like any of the relationships defined above, there are attributes in a given relationship that can cause its type to switch. In software, this usually comes in the form of a bug: either an intentional bug that leads to your data being stolen for nefarious purposes (Symbiotic Parasitism), or a bug that eventually leads to the demise of another system, such as a backdoor through which a payload can delete everything (Symbiotic Amensalism).

"perceived risks"

So how do we test for integration risks? Testing for the functionality, behavior and purpose of the resulting coordinated whole gives you one part of the story you are trying to write. Understanding the key dependency points and data hand-offs between the two systems is key. But that's not all: you also need to be able to tell the story of how you are going to test.

In 2010, Michael Bolton wrote a blog post on Test Framing, which became a set of lenses that I "wear" whenever I need to design tests around a given feature or product. Simply said, in order for you to test (anything) you need to be able to tell two parallel stories: the story of the product and the story of your testing.

Just like testing any product, integration testing is just a variation in the mission of your testing. Your ideas and awareness of the moving parts will hopefully inform you of the techniques that you can use so you get to the most important problems that will lead to risks in the least amount of time.

Thursday, January 14, 2016

The 2016 State of Testing Survey is here

The State of Testing Survey for 2016 is finally here.

I've participated in the past two surveys and the results enabled me to ask management specific questions around skill, education, processes and overall challenges. I'm not in any way affiliated with the group running the survey but I do support this endeavor.

If you want to look at the results from 2015, you can get them here: 2015 State of Testing Results

Saturday, August 8, 2015

Value and Waste as byproducts of Software Development

The act of software development essentially produces two things: value and waste. Value can be rather vague when you don't understand the context. According to Gerald Weinberg, "Quality is value to some person." James Bach then extended that definition to, "Quality is value to some person who matters." That is the value that this talk was referring to.
For any software development group to be successful, value should be maximized and waste should be minimized.

According to Lean manufacturing concepts, there are seven types of waste:
types of waste

Overproduction is essentially creating features that have no users. This is waste because no one is willing to exchange something valuable for something that you've built.

As an example of overproduction as waste: in the past few months, I've assumed the role of product owner for an in-house automation tool that my team uses. When we started, we had limited information on how the tool would be used, so we built in "generic" features. As time progressed, we noticed that out of the 200 or so functionalities we've baked into this tool, only a handful are actually being used.

The next type of waste, Waiting, is essentially the amount of time that tickets/issues/requests spend waiting for someone to work on them. A related type of waste is unnecessary context switching: time is also lost when a developer is forced to switch contexts to fix a feature that they worked on in the past. Testing and programming need to be tightly coupled in order to minimize this particular waste.
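To make Waiting concrete, here's a minimal sketch (with made-up ticket timestamps) of how you might measure how long tickets sit idle before someone picks them up:

```python
from datetime import datetime

def waiting_time_hours(created, started):
    """Hours a ticket sat idle between creation and first activity."""
    return (started - created).total_seconds() / 3600

# Hypothetical ticket timestamps: (created, work started)
tickets = [
    (datetime(2016, 1, 4, 9, 0), datetime(2016, 1, 6, 9, 0)),    # waited 48h
    (datetime(2016, 1, 5, 14, 0), datetime(2016, 1, 5, 16, 0)),  # waited 2h
]

waits = [waiting_time_hours(c, s) for c, s in tickets]
average_wait = sum(waits) / len(waits)
print(average_wait)  # 25.0
```

Tracking even a crude average like this over a few sprints makes the Waiting waste visible enough to act on.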

Transport is waste when it comes to movement; this is usually your delivery flow. How long does it take you to turn an idea into a stable, usable feature in the hands of your customers? How long is your build-measure-learn loop?

Inappropriate Processing is waste resulting from bad design that forces extra activity. An example would be investing in a record-playback tool as a long-term solution; in the long run, maintenance can be a nightmare. Just because something is cool and new doesn't mean we should adopt it. Another example is a metrics program where nobody ever uses the information from those metrics to make a decision.

Excessive Inventory is waste when we produce more than the demand. In software, this can show up as oversubscribing to third-party services without effectively measuring what you need. Most of this waste appears as a knee-jerk reaction to scaling: bloated servers that run at 10% capacity even at maximum peak, or oversubscribing to maintenance support. The one example that Thomas pointed out was unreleased features. You have to understand that your context is changing and your consumers' preferences are changing. Every day a feature is not in production devalues it, to the point of uselessness.

Unnecessary Motion is when you are doing more than what you need to. In software development, this waste happens when you misuse your resources and people. Is your company asking people to do things they are not good at, e.g. asking testers to focus on documentation rather than test design? What are your testers and developers doing? Is there unnecessary friction? Are there repetitive manual steps to get a build to an environment?

Personally, I think Defects are the mother of all wastes. If you don't immediately fix an issue that multiple modules depend on, and you release it to your customers, fixing all the downstream dependencies tends to take longer because you are not only fixing one thing, but everything that depends on it. Imagine that waste in an iOS app submission workflow.

When software developers have appropriate detection tools, some defects can be fixed instantly, provided you don't ignore the warning. For example, your IDE can give you instantaneous feedback if you miss a semicolon when writing JavaScript code. Of course, static analysis and sophisticated IDEs can't catch everything. You need redundant ways to check for problems brought about by new changes, and these should all include properly thought-out test design.
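As an illustration of instant feedback from a detection tool, here's a small sketch (in Python rather than JavaScript, and with a function name of my own invention, not from any particular IDE) that catches syntax problems before code ever runs:

```python
import ast

def quick_syntax_check(source):
    """Return None if the source parses cleanly, or a short error description."""
    try:
        ast.parse(source)
        return None
    except SyntaxError as e:
        return f"line {e.lineno}: {e.msg}"

print(quick_syntax_check("x = 1"))            # None
print(quick_syntax_check("def f(:\n  pass"))  # reports a syntax error
```

Cheap checks like this catch a class of defects at the moment they are made, which is exactly when they are cheapest to fix.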

As part of the leadership team in my software development group, I've been looking at how to minimize waste in our current processes. The good thing is that my company doesn't think of testing as waste. I'll readily admit there are things our testing team does that seem wasteful: writing acceptance criteria and automating testing scenarios that somehow don't make it to production, or atomically testing feature details that don't necessarily bring value back to the customer.

Not everything in the development process returns value, and there are wasteful activities done by programmers too. Code comments don't inherently provide value to the customer, and neither does some documentation, since not all documentation is created equal. Creating a tag that never gets released can also be considered waste.

The fact of the matter is that waste is an inherent byproduct of creation. Our goal is not to eliminate it, but to minimize it.

Do you look at testing as part of your waste reduction efforts, or is your approach to testing producing more waste?

Note: This post is based on one of the sessions that I attended during CAST by Thomas Vaniotis about The Context-Driven Tester in a Lean Startup.

Thursday, August 6, 2015

Remembering CAST2015 -- Part 1

This is the second CAST I've attended in person and technically the fourth I've participated in, counting the ones I watched via live webcast.

I flew into GRR on Sunday and had the chance to catch up with Dhanasekar Subramaniam over dinner. The details of that conversation require a blog post in themselves, and I will (probably) write that separately.

On the first day of CAST, I attended Robert Sabourin's tutorial on Testing Fundamentals for Expert Testers. What really stood out for me was the way he organized his presentation. He started by talking about the history and principles surrounding quality, economics and management that can be applied to testing software effectively.

The participants went through a series of exercises that further explained concepts such as semantics, interpretation, pivoting based on gathered relevant data, and ambiguity. I also went through some practical applications of logic and decision tables, as well as logic reduction techniques that can reduce the number of rules you need to test by employing equivalence class partitioning.
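As a rough sketch of equivalence class partitioning in action, consider a hypothetical rule where an age field accepts 18 through 65; the partitions and representative values below are made up for illustration:

```python
# Hypothetical rule: an "age" field accepts integers 18..65 inclusive.
# Equivalence class partitioning collapses the input space into a few
# representative classes instead of testing every value.
partitions = {
    "below_range": [-1, 0, 17],   # invalid: too low
    "in_range":    [18, 40, 65],  # valid
    "above_range": [66, 120],     # invalid: too high
}

def is_valid_age(age):
    return 18 <= age <= 65

# One representative per class is enough to exercise each rule branch.
representatives = {name: values[0] for name, values in partitions.items()}
results = {name: is_valid_age(v) for name, v in representatives.items()}
print(results)  # {'below_range': False, 'in_range': True, 'above_range': False}
```

Three well-chosen values stand in for an effectively unbounded input space, which is the whole point of the reduction techniques the tutorial covered.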

Robert Sabourin also focused on applying a heuristic model to solving problems. To show your understanding of the problem at hand, ask yourself:

  • What are you looking for?
  • Can you restate the problem?
  • Can you make a visual model?
  • Do you have enough information to find a solution?
    • Can you identify the variables and the relationships between these variables?
  • Do you understand the concepts used in stating the problem?
  • Do you need to ask questions?
From there, the tutorial went into more detail on equivalence partitioning, storyboarding, control flow testing and state models.

In all honesty, that session had too many concepts to be understood and absorbed in a single day. As an RST and BBST alumnus, I was able to pick out which concepts I'd encountered before and which were completely new to me.

There's more to be said about the first day but my brain is completely mush right now. Until the next post.

Thursday, April 10, 2014

lessons learned from a two dollar load test

Recently, I had the chance to work with one of our engineering groups and lead load testing efforts for one of our web redesign projects. The objective was simple, introduce stress into the system and observe how the application and the infrastructure around that application behaves and reacts under various degrees of stress.

The first logical step in planning a load test isn't picking a tool; it's understanding what information the engineering team deems important. The objective usually gives you a good overview of what they need, but you shouldn't stop there. Part of what you need to know is the data story: what path does the data take as it moves from curation until it gets served to the user? You need to understand what happens in between.

The next step is to understand the architecture behind the application. These are some of the questions I usually ask: Are you using CDNs? What are your caching solutions? What are those caches' respective TTLs? Do we have multiple servers behind a load balancer? What is the load balancer algorithm? Is there a way for me to hit an origin server directly? What are the declared settings in httpd.conf? These questions will most likely give you insight into how complicated the data flow scenarios will be.

As for the data flow scenarios, these usually follow a rather standard set: How does the system behave during a greenfield request? (A greenfield request is when none of the caches are set; this usually takes the longest.) How does the system behave when the request is fully cached? And finally, every boolean combination in between. A bonus case depends on the caching mechanism used. memcached, for example, has a behavior where, if the cache is completely full, it starts writing to disk instead of memory. Response times in this case are almost as slow as a greenfield request, and most of the time this culprit is a bit tricky to troubleshoot if you don't have proper server instrumentation.
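Here's a toy sketch of the greenfield-versus-cached contrast; the in-memory dictionary and the 50 ms sleep are stand-ins for a real cache and origin round trip, not any particular caching product:

```python
import time

cache = {}

def fetch(key):
    """Return (value, source): 'cache' on a hit, 'origin' on a greenfield miss."""
    if key in cache:
        return cache[key], "cache"
    time.sleep(0.05)           # stand-in for the slow origin round trip
    cache[key] = f"content-for-{key}"
    return cache[key], "origin"

start = time.perf_counter()
_, first = fetch("/home")      # greenfield: nothing cached yet
cold = time.perf_counter() - start

start = time.perf_counter()
_, second = fetch("/home")     # fully cached
warm = time.perf_counter() - start

print(first, second, warm < cold)  # origin cache True
```

A real load test walks this same matrix, but with every cache layer (CDN, application cache, etc.) independently warm or cold.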

I usually compare most data flow diagrams to a series of rivers, waterfalls and dams. The end user is at the end of the line and content creation is at the source. The network is the path where the water flows, the size of the path defines the bandwidth, the dams represent the caches, and of course the water is the data that a person eventually consumes.

Finally, we deal with the testing approaches, which is really where the good stuff happens. In the years I've performed load testing, no one has summed up the vantage points from which you can perform load testing better than Scott Barber. The vantage points Scott pointed out in his Web Load Testing for Dummies book (I can't seem to find a link to buy this anywhere, help?) fall under three categories: behind the firewall, within the cloud, and from the user's perspective.

Each of the above vantage points provides different types of information. I won't list all of them, but here are my major takeaways. Behind-the-firewall tests, or Load Testing 1.0, enable you to test component performance. Within the cloud, aka outside the firewall or Load Testing 1.5, provides information about performance when there are multiple components and how these components relate to each other. Lastly, from the user's perspective, aka Load Testing 2.0: this approach is rather new and was not an option until 2009 or 2010. It gives you the ability to load real browsers from different areas of the world and swarm a particular URL. Think DDoS, but used for good (not evil). The information you get out of this will be very close to your users' actual response times, and will also give you a sense of how your third-party content really affects your load times from that perspective.
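A bare-bones sketch of what any load generator does under the hood might look like this; the simulated request function is a placeholder you'd swap for a real HTTP client, and the user counts are arbitrary:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_request(i):
    """Stand-in for an HTTP GET; replace with a real client call in practice."""
    start = time.perf_counter()
    time.sleep(0.01)                     # pretend network + server time
    return time.perf_counter() - start   # response time in seconds

# Swarm the target with 20 concurrent "users" issuing 100 requests total.
with ThreadPoolExecutor(max_workers=20) as pool:
    times = list(pool.map(simulated_request, range(100)))

times.sort()
p95 = times[int(len(times) * 0.95)]  # 95th-percentile response time
print(len(times), p95 > 0)
```

Hosted tools like the one mentioned below do essentially this from real browsers in many geographies, which is what makes the user's-perspective numbers meaningful.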


Early this week, we had our first dry run using the user's-perspective approach with Neustar's Web Performance Tool (aka BrowserMob). Within 10 minutes of the test, we found an issue with the load balancer algorithm brought about by what seems to be a VIP misconfiguration. Personally, this is proof that it's never too early to perform load testing, since it would have been disastrous if this had been opened up to the world. The other cool thing was that we didn't have to pay a hefty licensing sum, as we would have with LoadRunner, or go through a complicated setup process for our own JMeter-based load generator infrastructure (which we are considering).

Based on the number of concurrent users and the length of the test, it cost us (literally) $2.25 to identify problems that would have cost us money, lost personnel time and, above all, a tarnished reputation (yeah, it's possible).

How do you load test?

Wednesday, March 26, 2014

testing lessons learned from a dum biryani

One of the better analogies I've heard about testing starts with the notion of preparing a dish that you will serve to someone. As a chef, you would probably ask what the customer wants, then perform the due diligence of getting the ingredients you need, prepping them, and making sure you have all the necessary utensils, cookware and equipment. Then you can cook and serve the food to the customer.

There are subtle nuances within that whole cooking process that will most likely spell the difference in the quality of the food being served. A chef normally tastes the food as he cooks it, carefully adding the ingredients and tweaking the taste with salt, pepper and other spices, tasting after each new change is introduced. This is heuristically true of testing: whenever there is a change, a risk might have been introduced, and the effects of that change need to be evaluated. That change is usually evaluated through testing, whether you are confirming the known effects of the change (bug fixes or new features), performing "regression" testing of features that you know and care about, or performing exploratory testing to find out more about the product or system under test.

Test as you go, taste as you go.

For a while, this analogy was the closest fit to how my understanding of testing has evolved over the years. Until March 25, 2014 (last night, as of this writing), when I was introduced to the Dum Biryani.


The Dum Biryani is an Indian dish that contains rice, marinated meat and spices, among other ingredients. What's really fascinating is how it's cooked. The chef layers (basmati?) rice, vegetables and meat, in combination with spices, in a clay, ceramic or copper pot, then seals the top of the container with dough and cooks it over a low flame for a long, long time. The dough cover mimics the effect of a pressure cooker. The result is a really good rice dish with amazingly tender meat that goes really well with the rice.

Did you notice the gap, and how this breaks the tasting analogy with regards to testing? This is what really blew my mind. After the dough is sealed, there is no way for the chef to correct the dish through tasting. I was also told that a really good chef will know how the Dum Biryani will taste based on the aroma the food releases while the ingredients are pressure cooked. The smell doesn't rescue the analogy either, because the cook still can't effect change while the dough is sealed.

What I learned was that these chefs don't need to taste the dish despite the myriad variables that may affect the taste: the type of rice, the water used to cook the rice, the elevation where the dish is cooked, the material of the vessel, the type of saffron being used, and so on.

After further discussion, I realized that these chefs have learned how to cook this dish to the point where tasting can be done by feeling the ingredients through their fingers. Tasting was indeed done, just not through "regular" means. The chef has attained a new level of expertise by leaning toward learning through experience, and has most likely continued to hone his craft through experimentation. As testers, we should strive to do just that.

To date, no one has heard of anyone who became a respected Dum Biryani cook by taking a test.

I digress.

Wednesday, March 19, 2014

the testing fog of war

Ever since I started playing strategy and role-playing video games, I've been exposed to the concept of the fog of war. The fog of war, in video game speak, essentially simulates the unknown parts of the map that you are in. This feature forces you to adjust your strategy in order to finish the objectives you have on that map.

Fog of War from Sid Meiers' Civilization
Typically, as you go through the map, information about the area you are in is progressively revealed. It is also good to note that this information is temporary: in a lot of newer games, when you leave an area, the fog of war comes back and your knowledge about that area becomes history. Just like in testing, in order to achieve the objective in any of these games, information is key. You might have a set of documented requirements that may or may not be up to date, and it's still up to you to confirm whether the information you have at hand is still relevant.

What is information and why is it important? In information theory, information refers to the reduction of uncertainty. The point is, when it comes to testing, the more we know, the better off we are. Come to think of it, a lot of the existing testing techniques out there are about finding information. Yet much of the time, a tester's role ends up being solely focused on confirming known information. I'm not saying that confirming things is wrong, but if that's all you do, then it becomes wrong. Think of the fog of war. Most software development projects nowadays are service oriented. These projects rely on so many external dependencies that there is essentially an infinite number of possible sources of failure. These dependencies make your fog of war shift progressively and a whole lot faster. Service dependencies are one thing; people also contribute to that risk. As W. Edwards Deming aptly said in his book Out of the Crisis (page 11), "Defects are not free. Somebody makes them, and gets paid for making them."
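The "reduction of uncertainty" idea can be made concrete with Shannon entropy; the four-state example below is hypothetical, but the arithmetic shows how a test that rules out possibilities literally yields information:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: the average uncertainty of a distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Before testing: four equally likely states of a feature -> 2 bits of uncertainty.
before = entropy([0.25, 0.25, 0.25, 0.25])
# After a test rules out two states: uncertainty drops to 1 bit.
after = entropy([0.5, 0.5])
print(before, after, before - after)  # 2.0 1.0 1.0
```

Each well-designed test is, in this sense, an experiment that buys down bits of uncertainty about the product.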

How did this understanding of the fog of war change my mindset about testing? Simply said, things are not always what they seem. What we know about something today might not hold two days from now. In order to test continuously, we need to learn continuously as well. We are now at a juncture where expert testing is no longer defined by how long your tenure as a tester has been, how many certifications you have, or how good your test plan is.

Most readers of this blog have driven a car, right? How do you know that a car is not moving? Through the instrumentation panel? Through a frame of reference, e.g. you look at the trees or another stationary object around you and know that if they are not moving, you must not be moving either? I wish it were that easy with software. Because of the many associated risks, we no longer have an absolute frame of reference. Going back to my car example: your instrumentation is unreliable, and the trees and other "stationary" objects are moving independently.

This frame of reference is usually referred to as an oracle. If you have taken a BBST class or have been hanging around context-driven testers long enough, you'll know that there is no such thing as a true oracle. This is why we rely on heuristics to create our own frames of reference, which apply to your given context and are bounded by time.


The other effect of the fog of war I've noticed is that plans are only as good as your next update. Test plans are not useless, as long as they don't go stale. I have enough personal evidence that, as you go through any testing cycle, simultaneous test design and test execution is more effective than any pre-planning: tests are more relevant and up to date. One of the challenges with this approach is results documentation, primarily because testers, who are human, tend to be lazy. But this risk can be mitigated by session-based test management.

I would like to encourage you to share your experiences with the testing fog of war. What have you done to mitigate this risk? Aside from exploratory testing, what have you done to alleviate this situation?

"Empirical explorations ultimately change our understanding of which questions are important and fruitful and which are not." - Lawrence Krauss

Saturday, March 16, 2013

Test Management is more about testing than management.


This started as a Test Manager survey from Mike Lyles (https://t.co/LSvlWobcdW) and as I was answering question 45, I just kept on writing.

Any test manager needs to understand the three basic things any tester is, or should be, doing on any given day. For a couple of years now, I've been using what the Bach brothers refer to as the TBS metrics, of Session Based Test Management fame. T stands for "Test Design and Execution", B stands for "Bug Investigation and Reporting" and S stands for "Session Setup". On any given day, the time spent on all three should add up to 100% for a particular session or day, depending on how you want to implement this.
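As a sketch, here's how a day's TBS breakdown might be computed from logged minutes; the numbers are invented for illustration:

```python
# Hypothetical minutes a tester logged in one day against the TBS categories:
# T = test design and execution, B = bug investigation and reporting, S = setup.
session = {"T": 240, "B": 90, "S": 30}

total = sum(session.values())
breakdown = {k: round(100 * v / total, 1) for k, v in session.items()}
print(breakdown)  # {'T': 66.7, 'B': 25.0, 'S': 8.3}
```

The three percentages always sum to (roughly) 100, which is what makes the breakdown useful as a quick work-distribution picture.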

Test design and execution means evaluating the product and looking for problems. Bug investigation and reporting is what happens once the tester stumbles into behavior that looks like it might be a problem. Session setup is anything else testers do that makes the first two tasks possible, including tasks such as configuring equipment, locating materials, reading manuals, or writing a session report.[1]

Based on personal experience, TBS is a really good way to look at the work breakdown of any given tester or test group; you can see where the focus is from a management perspective. If a higher percentage of time is spent on test design and execution, then you know the testers are doing what they were hired for, which is ... ummm ... testing. If the majority of time is spent on bug investigation and reporting, it could be that the product is not quite ready for release, or that the testers' focus is on regression duties due to the number of bugs and fixes that need validating; in this case, testing can't continue. If the majority of time is spent on session setup, your tester or team might be in too many meetings (which might be good or not) or trying to solve a technology problem that your operations or support team could help with, for example setting up a LAMP stack for your web app so the tester can test in a local environment. Either way, testing has been limited.

There is no ideal percentage split between the three, since this is just an indication of what your context has to deal with. It is also a way for you to evaluate and come up with questions before a one-on-one with a team member. You can even look at trends for a given tester or project, but trust me when I tell you that all this will give you is a graph and nothing more. As a manager, I really like what Jon Bach said in his STP Crew talk about delivering value with test metrics (paywall alert):

"Less BS, more T".

This is just a bird's-eye view, a guide for understanding what your team is up to. One-on-one debriefings or retrospectives will help, and are more effective at finding out what is really going on in your organization. If you really want to be more effective as a manager, get in the trenches. Try to be on par with your testers' knowledge of the product. As a manager, your mission is twofold: serve your organization by making sure your testers are finding the necessary information about the quality of your software, and serve your testers by making sure you enable them to do what they were hired for, which is to test. Bugs are just the bonus that comes from your team testing properly and intelligently.

The one main lesson I can take from this is that test management is indeed more about testing than management.

Sources:

  1. SBTM by Jon Bach - http://people.mozilla.com/~nhirata/SBT/sbtm.pdf

Monday, April 2, 2012

Of Oracles and Heuristics

You might have heard testers use the term Oracles and/or Heuristics when discussing how we test our projects. I'd like to provide an explanation of the context in which we as testers use these terms. 
By definition, an oracle is a principle or mechanism that can tell us if the software under test is working according to someone's criteria. "Someone", in most cases, is the person who knows how the application is supposed to work or look. This person can be the product owner, the project manager, the developer or even the tester; in most companies, it's usually the product owner, working hand in hand with a project manager. Simply put, oracles help us decide whether a product's behavior is inappropriate. Ultimately, oracles are ideas or behaviors that can confirm or tell us whether a test passes or fails.
Aside from requirements and known expected user behavior, subject matter experts can be used as oracles too. So don't be surprised if members of the testing team bombard you with questions about a project if you haven't extensively clarified the requirements, and the scenarios within those requirements, to your tester.
Heuristics, on the other hand, are fallible ideas or methods that can help you investigate or solve a problem. In our context, heuristics are application behavior patterns we can use to design our tests around a given project. As a test team, we also use heuristics to connect missing requirement dots by way of discovery through exploration of the application. Heuristics are fallible because what can be assumed correct at this point in time can be false after some time has passed. Case in point: requirements change all the time. In their entirety, heuristics help us design tests, investigate bugs and report information.
A classic example of a heuristic is what we call the consistency heuristics, or the HICCUPPS heuristic. It makes us ask the following questions as a launch point for discovering how to test any given project:
  • Is the software under test consistent with its History, i.e. previous versions of this software?
  • Is the software under test consistent with the Image that your company wants to project to its consumers?
  • Is the software under test consistent with Comparable products or competitor websites?
  • Is the software under test consistent with the stakeholders' Claims?
  • Is the software under test consistent with User Expectations?
  • Is the software under test consistent with how other Products in your company behave?
  • Is the software under test consistent with the Purpose of the application?
  • Is the software under test consistent with relevant Standards and Statutes, such as legal or accessibility requirements?
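One way to keep these checks handy is as a simple data structure a tester can iterate over during test design; the sketch below is my own paraphrase of the list above, not an official artifact of the heuristic:

```python
# The HICCUPPS consistency checks, paraphrased as a checklist.
HICCUPPS = {
    "History":    "consistent with previous versions of the product",
    "Image":      "consistent with the image the company wants to project",
    "Comparable": "consistent with comparable or competing products",
    "Claims":     "consistent with stakeholder claims",
    "User":       "consistent with user expectations",
    "Product":    "consistent with other products in the company",
    "Purpose":    "consistent with the purpose of the application",
    "Statutes":   "consistent with applicable standards and statutes",
}

def review(observations):
    """Flag checks marked inconsistent; each flag is a question, not a verdict."""
    return [name for name, ok in observations.items() if not ok]

flags = review({"History": False, "Claims": True, "User": True})
print(flags)  # ['History'] -> investigate: could be a bug, or a deliberate fix
```

Note that a flagged inconsistency is only a prompt to investigate, which is exactly the fallibility discussed next.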
Due to the nature of heuristics, inconsistencies within the application may or may not be bugs. That is just the nature of a fallible heuristic. For example, if a particular functionality in the most recent version doesn't quite behave the same way as it did one or two versions ago, it could be that the old behavior was a bug that has since been fixed.

Nevertheless, the usefulness of oracles and heuristics ultimately rests on the tester's judgement, experience, and understanding of the software under test.

[Update: See Michael Bolton's comment below regarding some details that need to be ironed out; I'll be working on his clarifications in a separate post]

Sources:
  1. Testing without a Map by Michael Bolton
  2. Oracles by Michael Bolton
Note: This was an internal blog post that's initially shared to the rest of our technology group and has been sanitized to remove particular details.

Wednesday, February 1, 2012

stigma of the test manager -- part one

Last year, there was this mania about the death of testing. I don't know when it started, or who started it, but that doesn't matter now. Looking from the outside, I observed some of the staunchest testers I know almost lose faith because of the seemingly futile efforts of self-introspection that didn't produce answers that could defend the value a tester can bring back to the company. Scott Barber (@sbarber) sums up the problem that is plaguing the software testing industry pretty well in this post.
"The under-informed leading the under-trained to do the irrelevant."
Sadly, this statement is very true and defines the individual characteristics that exemplify what I think is the stigma of a test manager: Ignorance, Incompetence and Irrelevance. For the sake of this post, a test manager is defined as someone whom testers report to.

Ignorance is manifested when you either focus too much on your context and don't look outside for better ways to improve your process, or the complete opposite, where you look too much at the shiny toys outside and end up detesting your context. Incompetence is demonstrated when the test manager doesn't provide feedback that can improve a skill or correct a bad habit of a direct report. Irrelevance is a logical result of the first two, but this trait is usually personified by a test manager's refusal to champion the test team itself. That person doesn't usually know what everyone is doing beyond what they see on the status reports.

The craft of testing is not dead or dying. Testing needs to be understood and re-evaluated. I do propose that the ignorance, incompetence and irrelevance of test managers need to die. I am not considering or suggesting that we murder them. I am simply saying that these traits need to die. Gory details in the next post.

Tuesday, November 8, 2011

A Conversation with Noah Sussman -- QA Manager, etsy.com

I have been having questions about when testing happens in a continuous integration context, and there have been a couple of suggestions that I talk to etsy.com's QA Manager, Noah Sussman, and pick his brain about it.

I met Noah once at a Selenium tech talk featuring Jason Huggins (more name dropping) a couple of months back. One of his team members, Michelle D'Netto, was presenting how Etsy builds their regression tests. I remember Noah as someone who was very stoic during that tech talk, always taking down notes. When I met with him last week, I guess second impressions are better: he was very personable, very easy to talk to, and very open to sharing his thoughts about testing in general.

Here are some key takeaways from that conversation.

when your development culture shifts, adapt

Etsy has a very interesting development culture because they don't have a merging strategy and they make very exhaustive use of continuous integration. For a company that has 240+ employees, of which only four have an official "tester" designation, it really raises the question: when and where does testing happen? The short answer is that it happens. Because everyone commits to trunk, everyone "tests", not just the testers. They also test in production, which is made possible through switches via conditionals in the code. The developers spend up to 20% of their time testing their work, and the four "testers" are focused on finding out the unknown unknowns. 
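Those "switches via conditionals" can be as simple as a feature flag checked in the request path. Here's a minimal sketch of the idea; the flag store, flag name and user names are all hypothetical, not Etsy's actual implementation:

```python
# A minimal sketch of "testing in production" via a conditional switch.
# FLAGS, the flag name, and the user names are hypothetical placeholders.

FLAGS = {
    "new_checkout_flow": {
        "enabled": True,
        "allowed_users": {"tester1", "tester2"},  # only these see the new path
    }
}

def is_enabled(flag_name, user):
    """A flag is on for a user only if it's enabled AND the user is allowed."""
    flag = FLAGS.get(flag_name, {})
    return flag.get("enabled", False) and user in flag.get("allowed_users", set())

def checkout(user):
    if is_enabled("new_checkout_flow", user):
        return "new checkout flow"   # exercised in production by testers
    return "old checkout flow"       # everyone else stays on the stable path

print(checkout("tester1"))   # new checkout flow
print(checkout("shopper9"))  # old checkout flow
```

The design point is that the new code is live in production but only reachable by people doing the testing, so a bad change affects a handful of users instead of everyone.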

resourcefulness

His context also calls for using a set of tools in more ways than most people use them: the lowly curl command, Splunk, Selenium, a gazillion unit tests, and Nagios, to name a few. These tools give them valuable information, such as graphs of real-time metrics that give everyone on the team a sense of what is going on with the site, so they can act accordingly and in time when an undesirable event happens, or is about to happen.

testing advocacy

Most test managers in Noah's shoes would probably lobby for more testers, or feel completely dejected at the sheer weight of the responsibility of testing for a company that has 60 developers per tester (don't hate, I know my math is not quite exact). Instead, he lobbied for more testing. His testing advocacy resulted in hiring more testing-conscious developers and gave the development team much-needed testing feedback, which in turn gave his team a much-envied status in the testing community: exploratory testers. 

Thank you, Noah, for taking the time to meet with me. I look forward to that Etsy show and tell.

Wednesday, November 2, 2011

Testing, Checking and Continuous Integration

My team has been looking into how we can employ continuous integration in conjunction with our software development lifecycle. We are currently in the early stages of adopting agile and understand that this requires effort and is not just a switch that we can flip.

I must admit that I had reservations about the whole idea of continuous deployment because it seemed to have no place for exploratory testers like me. It wasn't until I had a conversation with Adam Goucher early this year that I learned there are two schools of thought when it comes to continuous integration: Continuous Deployment and Continuous Delivery.

Continuous Deployment is essentially a process where, after a developer commits code into a repository (repo), automated tests are executed, and once every test passes, the code gets deployed to production without any human intervention.

In Continuous Delivery, on the other hand, after the developer commits code into a repo, all existing automated checks get executed, and once all the checks pass, testers can then start testing, since all the known parts of the site that should work are still working.
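The difference between the two pipelines boils down to what happens once the checks go green. A minimal sketch; every function name here is a hypothetical placeholder, not the API of any real CI tool:

```python
# A sketch contrasting the two pipelines. run_automated_checks is a
# stand-in for a real check suite; the return strings are illustrative.

def run_automated_checks():
    # Hypothetical: pretend three automated checks all passed.
    return all([True, True, True])

def continuous_deployment():
    # Green checks go straight to production -- no human judgement involved.
    if run_automated_checks():
        return "deployed to production"
    return "build rejected"

def continuous_delivery():
    # Green checks mean the build is ready for testers, not shipped.
    if run_automated_checks():
        return "handed to testers for exploratory testing"
    return "build rejected"

print(continuous_deployment())
print(continuous_delivery())
```

Same checks, same green light; the only structural difference is whether a human gets to exercise judgement between "checks passed" and "in production".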

To a tester, Continuous Deployment is heresy, and arguably unethical, because it assumes that all your automated tests are good enough that human judgement is no longer required. Continuous Delivery, on the other hand, recognizes the fallibility of automated scripts and relies on human judgement: once the unacceptable risks have been mitigated, then we can say that the code is ready for production.

Testing has a place in Continuous Delivery, but not Continuous Deployment. Checking on the other hand is KING when it comes to Continuous Deployment but works hand in hand with testing in Continuous Delivery.

Tuesday, October 25, 2011

Transitioning into Agile -- part 1

Seven weeks ago, the company decided to finally get its foot in the agile door. There have been conversations about this in the past, but this is the first time we actually did it.

I worked with our backend dev lead to figure out a process he has experience with and is familiar with. Our first conversation was kicked off by the statement, "In any well-defined project, there are no such things as bugs". Every part of my being screamed, heresy! Where's the inquisition when you need it?!? He must have sensed a disturbance in the force, because he started explaining why.

Every project should capture all possible user stories. Instead of filing new bugs, when someone finds an issue, that person should just reopen the corresponding fixed user story. Sounds great? Smells fishy to me. I had to ask: what do you mean by capturing all user stories? Even if we could, where would we find the time to capture them all? Ash Ketchum has been trying to "capture them all" for 12 seasons now, and he hasn't.

Me: You can't possibly capture all user stories, so what do we really do with unaccounted-for user stories?
Him: We create new ones and decide if it's worth working on in the current iteration and possibly push certain features out of the way so we can include more important ones.
Me: OK, that sounds better.

It seems to me that the inherent beauty of agile is that interaction between people is valued more than someone creating very impersonal documents and handing them over for implementation. Be it the daily stand-ups or story estimation parties, a tap on the shoulder or a quick face-to-face conversation, communication is encouraged.

What does this mean for me as a tester? Maybe I shouldn't have to write formal test plans at the beginning of a project, since I can include my test ideas within the story itself. And I should still have the freedom to explore beyond the constraints of the existing user stories, since I know that not every user story will be captured. Whatever I find outside those bounds should be reported right away to the stakeholders so they can evaluate whether it's worth any attention in the current iteration.

Agile is good. As long as we keep effective lines of communication open and act on things accordingly, we will do well.

Sunday, October 23, 2011

Let's Get Rid of the QA Team

one end of the spectrum

What if you walk into work one day and your boss tells your group: we need to make some cutbacks, so we are getting rid of the testing team. No conversations, no explanation. What do you think will happen? Do you think your company's velocity will grind down to zero? Will the development team beg your boss to bring the testing team back? Or will it be business as usual?

the other end

Let's say you achieved everything your testing team has ever dreamed of. You have a crack team of exploratory testers with a very good understanding of what can and should be automated. Everyone in the company recognizes the value your team gives. What now? Can we just sit back and rest on our laurels?

some thoughts

As a test manager, one of the biggest questions lingering at the back of my head is the value my team brings to the company. Sure, I try to mentor each and every member of my team and hold weekly 1-on-1s, even with my offshore members. I even went as far as getting everyone AST memberships so everyone can take the BBST course.

Is all of that enough? Is there still room for improvement? The answers I always give myself for the above questions are NO, no amount of improvement or training is ever enough because one can never test everything, and YES, there is always room for improvement.

Testing vigilance is not a talent that only some people have; in fact, it is not a talent at all. Testing vigilance is something that you have to do. As a test manager, one of my primary roles is to promote to the rest of the group the value my team can bring to the rest of the company. Another is to make sure my team understands what is valuable to the company they are working for at any given point in time. I expect myself and every member of my test team to have healthy discussions with the projects we are involved in, and not just wait for something to fall into our laps.

I have been on that first end of the spectrum, and there was a lot of blame that went around. It is a place I would never want my team to be in, but that is something I cannot control. What I can control is setting up an environment where my team can thrive, be the best that they can be, and be able to serve the team and thus provide value.

In closing, when my team does get to the other end of the spectrum, I just need to remember that a great tester named James Bach once said, "Your team may be called 'Quality Assurance'. Don't let that go to your head. Your test results and bug reports provide information that facilitates the assurance of quality on the project, but that assurance results from the effort of the entire team."

Monday, August 15, 2011

how to return an item: an exercise in practical impromptu testing

My wife asked me to return some items to one of our local pharmacy/convenience stores. This is usually a no-brainer because the United States has very friendly customer return policies. But what happened in that store suddenly became interesting, and I had to put my thinking cap on in order to perform a supposedly brainless activity.

I walked into the store with the items in one hand and the receipt in the other. After standing in line for five or so minutes, I handed both the receipt and the items over to the cashier, who eyeballed them. She then asked for the credit card that I used for the purchase. For those familiar with an item return process, the printed receipt usually shows the last 4 digits of the credit card used for that purchase. If the card you present doesn't match those last 4 digits, they either deny your return or offer store credit, depending on the mood of the cashier.

When the cashier inspected my card, she told me that the numbers didn't match. I figured I must have given her the wrong card, but my other card didn't match either. At the back of my mind, I was already thinking about some reasons why the match didn't happen;
  1. I gave them the wrong card.
  2. Their system fiddled with my numbers.
Based on my assumptions and the known "Item Returns Oracle" that the cashier vehemently believed in, these are the facts I had so far.
  • The last 4 digits of both cards didn't match what's on the printed receipt.
  • I used an American Express card to buy the said items the day before.
  • I have two different American Express cards.
Three minutes into the conversation with the cashier, I proposed to run a series of tests, since I knew I had used THE card that I gave them the first time. They happily obliged and opened another register for the other waiting customers. Now that the stage was set: there are three ways to pay for an item at this store. You can swipe your card at a customer-facing terminal, have the cashier swipe it at her terminal, or scan an RFID-enabled card.

For the first test, the cashier scanned CARD A on her terminal. The last four digits on the printed receipt and my card matched. For the second test, I scanned CARD B on the customer terminal, which yielded the same result as the first test. For the third test, I used CARD A, which happens to have an RFID tag. Eureka! The numbers on the receipt didn't match what was on my card, but they matched what was on the returned item's receipt. I spent $0.33 per test, plus 7 cents tax, for a total of $1.06.

The manager finally walked in and confirmed that if you use an AmEx card, the RFID scanning system only takes 16 digits, and since American Express numbers have only 15 digits, the system generates a 16-digit number and prints its last 4 digits on your receipt. This confirms my second assumption: yes, the system indeed fiddled with the numbers on my card.
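The mismatch is easy to illustrate in a few lines. The exact scheme the store's system uses to generate the 16th digit is unknown to me; this sketch simply appends one digit to a well-known AmEx test number to show why the printed last four no longer match the physical card:

```python
# A sketch of the 15-vs-16 digit mismatch. The padding scheme shown
# (appending one generated digit) is an assumption for illustration only;
# 371449635398431 is a widely published AmEx test number, not a real card.

PRINTED_DIGITS = 4

def last_four(card_number):
    return card_number[-PRINTED_DIGITS:]

amex = "371449635398431"   # 15 digits, as embossed on the physical card
padded = amex + "0"        # 16 digits, as a padding system might store it

print(last_four(amex))    # 8431 -> what the customer expects to see
print(last_four(padded))  # 4310 -> what the receipt would actually print
```

Any scheme that inserts or appends a digit shifts which four digits land at the end of the number, which is all it takes to break the cashier's "last 4 digits always match" rule.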

This was all done within 7 minutes and I was out of the store with my refund in hand and some sweets for the kids.

Lessons learned:
  • The "Item Returns Oracle" was proven to be not an oracle but a heuristic. The third test proved this.
  • If the cashier had asked for the manager in the first place, I could have gotten the return done earlier. I think this might be a simple variation of Adam Goucher's "Know Where The Sun Is" heuristic.
  • Black box testing of a financial transaction system doesn't have to be expensive.
I felt very proud of myself for having been able to test and learn something practical. As I got back in the car, I told my wife what happened, and she replied, "I would have just called for the manager the moment the cashier didn't want to honor the return in the first place."

She is THE shopping expert in our household :-).

EDIT: I was aptly corrected that oracles are heuristic-based. I agree. My reason for calling it an oracle was the cashier's insistence that the receipt will "always" print the last 4 digits of the user's credit card.

Monday, April 25, 2011

So you think you know automation? Part One

I stepped into a briar patch today.

The Michael Bolton (@michaelbolton) kind of patch. No, you sillies, not the singer but a philosopher disguised as a Canadian tester. Early this afternoon, I unwittingly butted into his conversation with Adam Yuret (@AdamYuret) regarding automated checks, a.k.a. automated tests to the uninitiated. I'll leave it up to you to find out why things of this nature should be called checks and not tests.
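As a hint for the uninitiated: a check, in this sense, is an observation plus a decision rule that a machine can apply on its own. Here's a minimal sketch; add_discount is a hypothetical function under check, not anyone's real code:

```python
# A minimal sketch of an automated check: one fixed input, one expected
# output, and a machine-decidable pass/fail. It can confirm that THIS
# example appears to work; it cannot apply the judgement a human tester
# brings. add_discount is a hypothetical function under check.

def add_discount(price, percent):
    return round(price * (1 - percent / 100), 2)

def check_ten_percent_discount():
    # the decision rule: does the observed output equal the expected one?
    return add_discount(100.00, 10) == 90.00

print(check_ten_percent_discount())  # True
```

Everything a check can tell you is already baked into that equality; deciding whether 90.00 was ever the *right* expectation in the first place is testing.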

Here are some of Michael's points regarding the value of automated checks;
  1. at least THESE examples appear to work to some degree.
  2. We found a bunch of interesting problems when we developed these automated checks.
  3. These automated checks, on first principles, will probably make it easier to perform other automated checks.
  4. These automated checks, on first principles, will probably make it easier to explore the application with automation's help.
  5. The automated checks helped us to identify problems with load, performance, and stress that would have been hard otherwise.
  6. These automated checks will help us to identify at least some unexpected and unwelcome changes.
  7. These automated checks (since they come with logging) may assist with debugging and with retesting, should they expose bugs.
  8. These checks provide coverage of stuff that with my Big Brain and Clumsy Fingers I would consider tedious, trivial, and awful.
  9. These automated checks, like all forms of testing, provide partial answers that might be useful.
CHALLENGE: The skillful tester NOW presents a counterargument for every single one of those heuristics. Over to you. :)

The "skillful tester" in me took up his 8th point and argued that;

PA: The moment a person starts considering something as tedious, trivial, and awful, that person has at least slowed down learning.
MB: What if that person uses boredom as a heuristic trigger to do something more valuable?
PA: But the "I'm stuck" heuristic is different from boredom. Boredom comes when you just can't think of anything better to do.
MB: I would argue that boredom is an important variety of "I'm stuck."
PA: Besides, to me, boredom is an effect and not a cause.

And much to my surprise, Michael issued me a secondary challenge.
I'd recommend that you practice thinking more expansively (and critically) on this subject. You're right; what might *also* be right?
So, looking at that earlier statement: "These checks provide coverage of stuff that with my Big Brain and Clumsy Fingers I would consider tedious, trivial, and awful."

So what is this stuff that Michael is talking about? And what of coverage? Why does my brain have to be big? Do I understand why fingers can be clumsy? And what do the brain and fingers have to do with the tasks I deem tedious, trivial, or awful?

Now, for more serious questions: as a tester, how do automated checks bring value to my work? Will they make testing my products more efficient, even better? Or are they just throwaway work? How do these checks bring value to my team? How about to my organization? What is it that I have that requires automation? What is it that I need that calls for automation? Is it a far stretch to employ it? Will it require a culture change within the entire organization I'm working for? Do I even understand its purpose?

I dare say that if you are automating checks out of laziness or a misunderstanding of your product, then don't. Automation is not just a programmatic conversion of your manual tests. The manual tests you deem tedious at this point in time will probably give new insight into the product under test the next time around. However you approach your product now will definitely be different as time progresses, since you will have a better understanding of the product (I hope) and can find other ways or paths to take which could lead you further in your exploration.

That is enough for now. Time to get some feedback.