ACXPA Glossary Term

QA Sample Size: How Many Calls You Need to Review

Your QA sample size is how many calls (or chats, or emails) you pull and score in quality assurance, out of everything you could have reviewed. Get it wrong in one direction and your QA scores are noise; get it wrong in the other and you burn review hours measuring something you already knew. The good news: the right number is almost always smaller than people expect — and it has very little to do with how many calls you handle.

Why it matters

A QA score you can't trust is worse than no score — it drives decisions about coaching, agents and process on numbers that are mostly luck of the draw.

The surprising part

Accuracy comes from how many calls you review, not how many you handle. A centre taking a million calls needs barely more review than one taking ten thousand.

What this guide covers

What sample size means, the four things that actually drive it, how to set yours, the traps to avoid — and a calculator that does the maths for you.

What is QA sample size?

In contact centre quality assurance, your QA sample size is the number of interactions you review and score in a given period — say, the 80 calls you QA out of the 3,000 your team handled this month. You review a sample because scoring every call is impossible; the sample stands in for the whole. The question sample size answers is: how many do I need to score before the result is a fair read on the real thing?

In plain English

Sample size is “how many calls do I have to listen to before I can trust the number?” The answer is driven by how sure you want to be and how precise you need the figure — not by what share of your calls that adds up to.

What sample size IS

  • A number set by the confidence and precision you need from the result
  • A random draw from all the interactions you could have reviewed
  • Big enough that the score is a trustworthy read, small enough to be practical

What it is NOT

  • A fixed percentage of your call volume (“we QA 2% of calls”)
  • Whatever calls were easiest to grab or happened to be flagged
  • A case of “bigger is always better” — past a point, more reviews barely help

Why it matters in CX

Sample size is where a QA programme is quietly made or broken. The score itself looks the same whether it rests on 15 calls or 350 — but only one of them is safe to act on.

For CX & QA leaders

A score you can defend in front of the executive team — not a figure someone can wave away with “that's only a handful of calls.”

For contact centre leaders

The right amount of review effort. Too little and the score is luck; too much and you're paying senior people to measure something they already know.

For operations & finance

QA time is real money. Right-sizing the sample — rather than chasing a percentage — puts those hours where they actually change the number.

95% confidence is the industry-standard setting, right for almost everyone
±5% is a common, defensible precision target for a team-level score
~370 to 385 calls to review a month at 95% and ±5%, whether you handle 10,000 or a million

What drives sample size

Four things set the number. Two of them, confidence and precision, are decisions about how good you need the answer to be. The other two describe your operation, and one of those, your total call volume, turns out to matter least of all.

1. Confidence

How often you want the score to land inside your margin. 90%, 95% or 99%. Higher confidence is more cautious and needs more calls — and, counter-intuitively, gives a wider range for the same sample, not a tighter one.

2. Precision (margin of error)

How far off the true score you can live with, written as ±x%. A score of 80% at ±5% really sits somewhere between 75% and 85%. Tighter precision needs more calls.

3. Your total call volume

The one that barely matters. Once your volume is comfortably bigger than the sample, growing it further hardly changes the answer. It's the same reason a national poll can read a whole country from a couple of thousand people.

4. Expected fail rate

How lopsided your pass/fail split is. A 50/50 split carries the most uncertainty and needs the most calls, so 50% is the safe assumption when you don't know your rate.
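
The four drivers above combine in the standard sample size formula for a proportion. What follows is a minimal Python sketch, assuming Cochran's formula with a finite population correction. The function name `required_sample_size` and its parameter names are illustrative, and the ACXPA calculator may work differently under the hood.

```python
import math
from statistics import NormalDist


def required_sample_size(volume, confidence=0.95, margin=0.05, fail_rate=0.5):
    """Calls to review for a pass/fail score (illustrative sketch).

    volume     -- total interactions in the period
    confidence -- e.g. 0.95 for 95%
    margin     -- precision as a fraction, e.g. 0.05 for +/-5%
    fail_rate  -- expected fail rate; 0.5 is the cautious default
    """
    # z-score for a two-sided interval at the chosen confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample needed if the volume were unlimited
    n0 = z ** 2 * fail_rate * (1 - fail_rate) / margin ** 2
    # Finite population correction: shrinks the sample slightly for small volumes
    n = n0 / (1 + (n0 - 1) / volume)
    return math.ceil(n)


for volume in (3_000, 10_000, 1_000_000):
    print(volume, required_sample_size(volume))
```

Under these assumptions, 10,000 calls a month needs 370 reviews, and a hundredfold jump in volume adds only a dozen or so more.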

How to set your sample size

1. Decide what the QA is for

A read on how the team is doing overall needs a far smaller sample than fairly scoring individual agents. Be honest about which one you're doing — ranking people takes many more calls per person than most centres can manage.

2. Pick your confidence

95% is the standard and the right choice for almost everyone. 99% is rarely worth the extra calls for QA.

3. Pick your precision

±5% is a sensible default for a team-level score. Tighter than that is a lot more review for a difference you probably can't act on.

4. Read the number

Drop your figures into the QA Sample Size Calculator and it tells you how many calls to review — and how good your current sample already is.

5. Pull the calls at random

This is the step everyone skips. The number is only honest if every call had an equal chance of being picked — not the easy ones, not one team, not one part of the day.
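
A random draw is easy to script. The sketch below uses Python's standard `random` module. The call IDs and the `draw_random_sample` helper are hypothetical stand-ins for whatever your telephony or QA platform exports.

```python
import random

# Hypothetical call IDs for the period; in practice these would come from
# your telephony or QA platform export.
call_ids = [f"CALL-{i:05d}" for i in range(1, 3_001)]


def draw_random_sample(ids, sample_size, seed=None):
    """Pick sample_size IDs so that every call has an equal chance."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable for audit
    return rng.sample(ids, sample_size)  # sampling without replacement


sample = draw_random_sample(call_ids, 80, seed=2026)
print(sample[:5])
```

Drawing from the full list of calls, rather than from one team's queue or one shift, is what gives every call the same chance of being picked.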

Do the maths in seconds

The QA Sample Size Calculator works both ways: tell it how accurate you need to be and it gives you the number of calls, or tell it how many you already review and it tells you how much you can trust the result.

What good sampling gives you

Defensible scores

A QA number you can stand behind, with the maths to back it.

Right-sized effort

Review enough to be sure — and not a single call more.

Fairer team reads

Know when a sample is solid enough to compare, and when it isn't.

📈

Spot real change

Tell a genuine shift in quality from ordinary month-to-month wobble.

💰

Less wasted time

Stop QA-ing a percentage that's far larger than you ever needed.

🤝

Credibility

Numbers leaders trust, because they hold up to a challenge.

Common pitfalls

QA-ing a percentage of calls

“We review 2% of calls” is the most common rule of thumb and the least defensible. A percentage of a big operation is wildly more than you need; a percentage of a small one can be far too few. Sample size isn't a share — it's a number set by confidence and precision.
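
To see how far a percentage rule drifts from what the maths asks for, here is a short Python sketch. It assumes the standard proportion formula with a finite population correction at 95% confidence and ±5%. The helper names are illustrative, not part of any ACXPA tool.

```python
import math
from statistics import NormalDist


def required_sample_size(volume, confidence=0.95, margin=0.05, fail_rate=0.5):
    """Calls to review, using the standard formula with finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * fail_rate * (1 - fail_rate) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / volume))


def compare_with_percentage_rule(volume, percentage=0.02):
    """Return (calls under the percentage rule, calls the formula asks for)."""
    return round(volume * percentage), required_sample_size(volume)


for volume in (3_000, 1_000_000):
    by_rule, by_formula = compare_with_percentage_rule(volume)
    print(f"{volume:>9,} calls: 2% rule = {by_rule:,}, formula = {by_formula:,}")
```

On these assumptions the 2% rule gives a 3,000-call team 60 reviews where the formula asks for 341, and gives a million-call operation 20,000 reviews where fewer than 400 would do.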

Cherry-picking the sample

Reviewing the easy calls, one team, or one slot of the day skews the result no matter how many you check. A big but lopsided sample doesn't average out — it just makes a wrong read look more convincing.

Ranking agents on a handful of calls

Splitting a team-sized sample across individuals leaves only a few calls each — nowhere near enough to fairly separate one agent from another. Use small samples to coach and to read the team, not to build a league table.
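
A rough Python sketch of what happens to the margin when a team sample is split across agents. The team size of 40 is a hypothetical figure, and the normal approximation used here is crude at very small samples, so read the per-agent result as an order of magnitude rather than an exact value.

```python
import math
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.95, fail_rate=0.5):
    """Margin (as a fraction) on a pass/fail score from sample_size calls."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(fail_rate * (1 - fail_rate) / sample_size)


team_sample = 370   # a team-level sample at 95% and +/-5%
agents = 40         # hypothetical team size
calls_per_agent = team_sample // agents

print(f"Team score:  +/-{margin_of_error(team_sample):.1%}")
print(f"Agent score: +/-{margin_of_error(calls_per_agent):.1%} on {calls_per_agent} calls")
```

With nine calls each, the margin on an individual score comes out at more than ±30%, which is wider than the gap between most agents.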

Reaching for 99% confidence

It feels like “more accurate,” but higher confidence widens the range for a given sample — or demands a lot more calls to hold the range steady. For QA, 95% is almost always the right call.
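
The trade-off can be shown in a few lines of Python. This sketch assumes a large call volume (no finite population correction) and a 50% fail rate. The function names are illustrative.

```python
import math
from statistics import NormalDist


def z_score(confidence):
    """z-score for a two-sided interval at the given confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def calls_needed(confidence, margin=0.05, fail_rate=0.5):
    """Calls needed for a large volume (no finite population correction)."""
    return math.ceil(z_score(confidence) ** 2 * fail_rate * (1 - fail_rate) / margin ** 2)


def margin_for(confidence, sample_size, fail_rate=0.5):
    """Margin (as a fraction) that a fixed sample gives at the chosen confidence."""
    return z_score(confidence) * math.sqrt(fail_rate * (1 - fail_rate) / sample_size)


for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: {calls_needed(confidence)} calls for +/-5%, "
          f"or +/-{margin_for(confidence, 385):.1%} on a fixed 385 calls")
```

Holding ±5% steady, the sample goes from 271 calls at 90% to 385 at 95% and 664 at 99%. Holding the sample steady, the margin widens as confidence rises.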

The one to remember: size is only half the story. A large sample chosen badly is a confident wrong answer — how you pick the calls matters as much as how many.

How to know you got it right

1. Check the margin you actually achieved

Run the sample you reviewed back through the calculator. If the margin on the score is tight enough to act on, you reviewed enough; if it's wide, treat the score as a rough signal.
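
The reverse calculation, from a sample already reviewed to the margin it achieved, can be sketched in Python as follows. It assumes the normal approximation for a proportion with a finite population correction, and defaults to a 50% pass rate as the cautious case. The name `achieved_margin` is illustrative and the ACXPA calculator may compute this differently.

```python
import math
from statistics import NormalDist


def achieved_margin(sample_size, volume, confidence=0.95, pass_rate=0.5):
    """Margin (as a fraction) on a score from sample_size calls out of volume."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(pass_rate * (1 - pass_rate) / sample_size)
    # Finite population correction: matters only when the sample is a
    # sizeable share of the volume
    fpc = math.sqrt((volume - sample_size) / (volume - 1))
    return z * standard_error * fpc


# The example from earlier in this guide: 80 calls reviewed out of 3,000 handled
print(f"+/-{achieved_margin(80, 3_000):.1%}")
```

For the 80-out-of-3,000 example this gives a margin of roughly ±11%, so that sample is a rough signal rather than a score to act on.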

2. Re-draw at random every period

A good sample this month doesn't make next month's automatic. Pull a fresh random selection each cycle rather than always landing on the same calls or agents.

3. Don't split hairs inside the margin

If your score moves from 82% to 84% but your margin is ±5%, you can't tell that move apart from ordinary sampling wobble. Only call it a change when it clears the margin.
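
The rule of thumb above is simple enough to write down as a check. This Python sketch mirrors it directly, and the function name `clears_margin` is illustrative. A formal comparison of two sampled scores would also account for the margin on both months, so treat this as a first filter only.

```python
def clears_margin(previous_score, current_score, margin):
    """True only if the move between two scores is larger than the margin.

    All values are fractions, e.g. 0.82 for 82% and 0.05 for +/-5%.
    """
    return abs(current_score - previous_score) > margin


print(clears_margin(0.82, 0.84, 0.05))  # False: a 2-point move inside +/-5%
print(clears_margin(0.74, 0.84, 0.05))  # True: a 10-point move clears +/-5%
```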

The honest test

A sample is “big enough” when two things are true: the margin on the result is tight enough to make the decision in front of you, and the calls were pulled at random. Miss the first and you need more calls; miss the second and no number of extra calls will save you.

Frequently asked questions

How many calls should I review per agent each month?

There's no universal number — it depends on what you're using the score for. For a read on the team, you need far fewer per agent than you'd think. For fairly scoring individuals, you need many more calls each than most centres can realistically review, which is why agent rankings built on a handful of calls are so often unfair.

Should I just QA a fixed percentage of calls?

No. A percentage rule is the most common approach and the weakest. The number of calls you need is set by how confident and precise you want to be — not by your volume. A fixed percentage almost always means a large centre reviews far more than necessary while a small one reviews too few.

Does a bigger contact centre need a bigger sample?

Barely. Once your call volume is comfortably larger than your sample, growing it further hardly changes the required number. Reviewing a few hundred calls reads quality about the same whether you handle ten thousand a month or a million.

What confidence level should I use?

95% is the industry standard and the right setting for almost every QA programme. 90% needs fewer calls but is less certain; 99% is usually overkill and quietly demands a lot more review for little practical gain.

What's a good margin of error?

±5% is a sensible, defensible target for a team-level QA score. Treat it as a guide, not a rule: tighter precision is a lot more review effort for a difference you often can't act on anyway.

Can I use these samples to rank individual agents?

Usually not. Spreading a team-sized sample across agents leaves only a few calls each — far too few to tell a genuinely better agent from a worse one. Small samples are for coaching and for reading the team, not for ranking people against each other.

What if I don't know my fail rate?

Leave it at 50%. That's the most conservative assumption — it carries the most uncertainty and so produces a sample big enough to be safe whatever your true rate turns out to be.
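
A quick Python sketch shows why 50% is the safe default. It assumes 95% confidence, ±5% and a large call volume. The function name `calls_needed` is illustrative.

```python
import math
from statistics import NormalDist


def calls_needed(fail_rate, confidence=0.95, margin=0.05):
    """Calls needed for a large volume at a given expected fail rate."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * fail_rate * (1 - fail_rate) / margin ** 2)


for fail_rate in (0.1, 0.2, 0.3, 0.5):
    print(f"fail rate {fail_rate:.0%}: {calls_needed(fail_rate)} calls")
```

The required sample peaks at a 50% fail rate (385 calls here) and falls away on either side, down to 139 calls at 10%, so assuming 50% never leaves you short.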

Is a bigger sample always better?

No. Past the point where your margin is tight enough to act on, extra reviews add very little. And a big sample that was chosen badly — only easy calls, one team, one shift — is less trustworthy than a smaller one drawn at random.

Where to next

📊

QA Sample Size Calculator

Work out how many calls to review — or how good your current sample is.

Open the Calculator
📞

Call Centre Hub

Practical resources for running a contact centre well.

Go to the Call Centre Hub
🎓

Quality Framework Course

Build a QA framework that holds up — from CX Skills.

View the Course
🔍

ACXPA Supplier Directory

Find standards and certification specialists.

Browse the Supplier Directory

Get the full QA toolkit

ACXPA membership unlocks the Advanced QA Calculator, the Call Centre Roundtables and the full member resource library.

Final thoughts

Sample size is the quiet foundation under every QA score, and most contact centres set it on instinct — a round percentage, a habit, a number that “feels about right.” That's how programmes end up either drowning in review work or acting on scores that are mostly noise.

The honest version is simpler than the folklore. Decide how sure and how precise you need to be, read off the number, and pull the calls at random. Your total volume, the thing people fixate on, barely enters into it — and a giant sample chosen badly will mislead you more confidently than a small one chosen well.

Get those two things right — enough calls, picked fairly — and your QA score stops being a talking point people argue about and starts being a number you can actually run the operation on.

Copyright © 2026 | Australian Customer Experience Professionals Association | Website Terms of Use | Privacy Policy
