Exhaustive testing is impossible, and that's okay
You can't test every possible combination. This testing principle teaches you how to decide what to test and what not to, without losing coverage where it matters.

Imagine a login form with two fields: email and password. Just two fields. It sounds easy to test. But once you consider valid emails, invalid ones, ones with special characters, empty ones, ones with leading and trailing spaces, ones with nonexistent domains, uppercase and lowercase variations, combined with short passwords, long ones, with and without special characters, empty ones, ones with spaces, and SQL injections, you're already dealing with thousands of combinations. And that's before you even touch the database, session state, network, or browser. Testing everything is a promise nobody can keep.
What the principle says
The second testing principle according to ISTQB states that exhaustive testing is impossible. It doesn't say it's hard or expensive. It says it's impossible. Except in trivial cases, you can't test every combination of inputs, preconditions, states, and execution paths in a system.
The reason is mathematical. A program with just 10 boolean variables already has 1,024 possible combinations. If each variable has 5 possible values instead of 2, the number jumps to almost 10 million. In a real system with forms, databases, external APIs, session states, and environment configurations, the combination space is astronomical. You could spend your whole life writing tests and still not cover a meaningful fraction.
Glenford Myers already laid this out in The Art of Software Testing (1979): even a simple program has so many possible paths that complete testing would take more time than any project has available. The answer isn't to give up, but to choose intelligently what to test.
The calculation nobody does
Let's put numbers on a real case. A user signup form with five fields:
- Name with free text up to 100 characters.
- Email with format validation.
- Password with complexity requirements.
- Country with a dropdown of 195 options.
- Date of birth in DD/MM/YYYY format.
If you define just 10 representative values for each field (valid, invalid, boundary), you get 10⁵ combinations, that is, 100,000 test cases. If each test takes 5 seconds to run, you need almost 6 days of continuous execution for a single pass. And those 10 values per field are already a huge simplification. Now multiply that by the different prior system states, different browsers, and server configurations. The number of possible scenarios goes beyond any reasonable execution capacity.
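If you want to sanity-check those numbers yourself, a few lines of Python are enough. This is just the arithmetic from the example above (10 values per field, 5 fields, 5 seconds per test):

```python
# Back-of-the-envelope math for the signup form example.
values_per_field = 10
fields = 5
seconds_per_test = 5

total_cases = values_per_field ** fields                 # 10^5 = 100,000
runtime_days = total_cases * seconds_per_test / 86_400   # 86,400 seconds in a day

print(f"{total_cases:,} test cases, ~{runtime_days:.1f} days of continuous execution")
# 100,000 test cases, ~5.8 days of continuous execution
```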
Why it matters in practice
Understanding that exhaustive testing is impossible fundamentally changes how you approach your testing strategy. If you can't test everything, the question stops being “how many tests do we have” and becomes “are we testing what matters most”.
The trap of “testing everything”
In my experience, when someone in a meeting says “we need to test everything”, what they really mean is “I don't want anything to slip through”. That's understandable, but it's an impossible goal dressed up as a quality requirement. The result is usually one of two things: either the team tries to test everything and ends up with a huge, slow, hard-to-maintain suite that still leaves gaps, or they freeze because the task is so big and end up testing too little, and badly.
More tests isn't always better
I've seen projects with 5,000 tests that took 45 minutes to run, where half of them were testing tiny variations of the same scenario. Meanwhile, critical flows like password recovery or token renewal didn't have a single test. Quantity without judgment isn't coverage, it's noise.
The opportunity cost
Every hour you spend writing one test is an hour you don't spend on another. If you spend three hours testing 50 combinations of a text field that rarely fails, those three hours didn't go into testing the payment flow that handles real money and has three external integrations. Prioritization isn't optional, it's the core of testing.
Common mistakes when you ignore this principle
Denying the impossibility of exhaustive testing leads to strategy mistakes that are hard to fix once they're in place.
- Trying to cover every combination by brute force. This creates gigantic suites that take hours, burn CI resources, and are so hard to maintain that the team ends up ignoring flaky failures.
- Not prioritizing and testing everything with the same depth. An “optional nickname” field gets the same testing effort as the credit card field. The risk isn't the same, and the effort shouldn't be either.
- Promising total coverage to stakeholders. When someone asks “have you tested everything?” and the answer is “yes”, that creates a false expectation. If a bug shows up in production later, trust disappears instantly. It's much better to answer “we've tested the highest-risk areas with these techniques, and this is the residual risk”.
- Confusing exhaustiveness with rigor. You can be rigorous without being exhaustive. Rigor means choosing well what to test, designing tests that catch real defects, and measuring how effective your strategy is. Exhaustiveness means trying to test everything, which, as we already know, is impossible.
How to apply it in your team
If you can't test everything, you need a system to decide what to test first, how deeply, and with which techniques. Here are the strategies that work best in real teams.
1. Risk analysis to prioritize
Not every feature carries the same risk. A bug in the payment flow has a very different impact from a bug on the “About” page. Before you write a single test, classify features along two axes: likelihood of failure (code complexity, frequency of changes, external dependencies) and impact if it fails (financial loss, data loss, reputational impact).
Features with high likelihood and high impact get the biggest testing investment. Low-likelihood, low-impact ones can be covered with minimal tests or even left to production monitoring.
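A minimal sketch of how that classification can look once you write it down. The features and their scores here are invented for illustration; in a real team they come from your own risk analysis:

```python
# Score each feature on likelihood and impact (1 = low, 3 = high) and
# prioritize by the product. Invented examples, not a real inventory.
features = [
    # (name, likelihood of failure, impact if it fails)
    ("payment flow", 3, 3),
    ("password recovery", 2, 3),
    ("optional nickname field", 1, 1),
    ("about page", 1, 1),
]

for name, likelihood, impact in sorted(features, key=lambda f: f[1] * f[2], reverse=True):
    print(f"{name}: risk score {likelihood * impact}")
```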
2. Equivalence partitioning
Instead of testing every possible value for a field, you divide the range into classes where all values should behave the same way. For an age field that accepts values between 18 and 65, you don't need to test 18, 19, 20, 21... all the way to 65. One value inside the range is enough (for example, 30), plus one below (17) and one above (66). If the code handles one value in the class correctly, it should handle all of them correctly.
This technique drastically reduces the number of tests you need without losing detection power. In the previous example, you go from 48 possible values to 3 tests that cover the same logical scenarios.
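In code, the technique maps naturally to a parametrized test. A minimal pytest sketch, assuming a hypothetical is_valid_age function for the age field in the example:

```python
import pytest

# Hypothetical validator for the age field (18 to 65 inclusive).
def is_valid_age(age: int) -> bool:
    return 18 <= age <= 65

# One representative value per equivalence class, instead of all 48 valid ages.
@pytest.mark.parametrize("age, expected", [
    (17, False),  # class: below the valid range
    (30, True),   # class: inside the valid range
    (66, False),  # class: above the valid range
])
def test_age_equivalence_classes(age, expected):
    assert is_valid_age(age) == expected
```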
3. Boundary value analysis
Bugs pile up at the edges. In that same age field, the values 17, 18, 65, and 66 are the ones most likely to reveal defects, because that's where the code's conditions switch from accept to reject. A typical off-by-one error won't show up when you test with 30; it only appears when you test the exact boundary value and its immediate neighbors.
Combine boundary values with equivalence partitioning and you'll have a compact test set that covers the most sensitive points in the range.
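Extending the previous sketch, the combined set stays tiny: both boundaries, their immediate neighbors, and one interior value.

```python
import pytest

# Same hypothetical validator as in the previous sketch.
def is_valid_age(age: int) -> bool:
    return 18 <= age <= 65

# Boundary values plus one representative per equivalence class.
@pytest.mark.parametrize("age, expected", [
    (17, False),  # just below the lower boundary
    (18, True),   # lower boundary
    (30, True),   # interior of the valid class
    (65, True),   # upper boundary
    (66, False),  # just above the upper boundary
])
def test_age_boundaries(age, expected):
    assert is_valid_age(age) == expected
```

Five tests instead of dozens, and they hit exactly the spots where conditions tend to break.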
4. Pairwise testing for combinations
When you have multiple fields interacting with each other, pairwise testing (or all-pairs) cuts down the combinations dramatically. The idea is based on an empirical observation: most defects come from the interaction of at most two factors, not from combinations of three, four, or five variables at the same time. Instead of testing every possible combination, you generate a minimal set that guarantees every pair of values across two fields appears at least once. For a form with 4 fields and 3 values each, exhaustive testing needs 81 combinations, while pairwise cuts that down to around 9 or 12. You can use online generators or specific libraries to create those combinations.
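You don't even need a library to see the reduction. Below is a minimal greedy all-pairs generator; the four fields and their values are invented for the example, and dedicated tools will produce similar or slightly smaller sets:

```python
from itertools import combinations, product

# Four hypothetical fields with 3 representative values each.
fields = {
    "browser": ["chrome", "firefox", "safari"],
    "role": ["admin", "editor", "viewer"],
    "plan": ["free", "pro", "enterprise"],
    "locale": ["es", "en", "fr"],
}

# Every pair of values across every pair of fields must appear at least once.
uncovered = set()
for (i, a), (j, b) in combinations(enumerate(fields), 2):
    for va, vb in product(fields[a], fields[b]):
        uncovered.add((i, va, j, vb))

def pairs_of(combo):
    # All indexed value pairs contained in one full combination.
    return {(i, combo[i], j, combo[j]) for i, j in combinations(range(len(combo)), 2)}

candidates = list(product(*fields.values()))
suite = []
while uncovered:
    # Greedy: pick the combination that covers the most still-uncovered pairs.
    best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
    suite.append(best)
    uncovered -= pairs_of(best)

print(f"{len(candidates)} exhaustive vs {len(suite)} pairwise")  # 81 vs ~9-12
```

Tools like Microsoft's PICT do the same job with better heuristics, but the principle is exactly this: cover every pair once, skip the rest.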
5. Production data as a guide
Your real users are already telling you what to test. Analyze production logs, error reports, and usage metrics to identify the busiest flows, the most common inputs, and the conditions that generate the most errors.
If 80% of your users are on Chrome on mobile, it makes sense for that combination to have more coverage than Safari on Linux. If 90% of signups use Gmail, your email tests should cover that case well, even if they also include other domains. Real data helps you invest testing effort where it has the most impact.
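A minimal sketch of how that analysis can start, assuming you've already parsed your access logs into (browser, device) pairs; the sample data here is invented:

```python
from collections import Counter

# Invented sample of parsed sessions; in practice this comes from your logs.
sessions = [
    ("chrome", "mobile"), ("chrome", "mobile"), ("chrome", "desktop"),
    ("safari", "mobile"), ("firefox", "desktop"), ("chrome", "mobile"),
]

usage = Counter(sessions)
total = sum(usage.values())

# Weight test coverage by real traffic, heaviest combinations first.
for combo, count in usage.most_common():
    print(f"{combo}: {count / total:.0%} of sessions")
```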
Test intelligently, not by brute force
Accepting that you can't test everything isn't a weakness. It's the starting point for any mature testing strategy. The teams that protect their software best aren't the ones with the most tests, but the ones that choose best what to test.
The next time someone asks you “have you tested everything?”, have an answer ready: “we've identified the highest-risk areas, applied selection techniques to cover as much as possible with the least effort, and we have monitoring in place to catch whatever slips through”. That inspires more confidence than a “yes, everything” that's a lie.
A good exercise to get started: pick your most complex form, count the possible combinations if you tested everything, then apply equivalence partitioning and pairwise to see how many tests you actually need. The gap between those two numbers will convince you that intelligent testing isn't a shortcut, it's the only viable option.
Second principle of the seven ISTQB testing principles. You came from "Tests show the presence of defects" and next up is "Testing early saves time and money".
