TOEIC Scores Decoded: What 10-990 and 0-200 Actually Mean

If you have ever looked at a TOEIC score report and wondered why Listening and Reading are reported on a 10-990 scale while Speaking and Writing each use a 0-200 scale, you are not alone. The two scales are not arbitrary — they reflect different testing formats, different statistical models, and different purposes — but they land on the same report and often confuse candidates who expect a single unified number.

This guide walks through both scales, explains how raw answers become scaled scores, and decodes every other element on your report: the standard error of measurement (SEM), the Abilities Measured percentages, the certificate color tiers, and the Pronunciation and Intonation descriptors on the Speaking certificate.

Two Tests, Two Scales

TOEIC is not one test but two separate assessments:

TOEIC Listening & Reading (L&R): a two-hour multiple-choice test with 100 Listening questions and 100 Reading questions, scored 10-990.
TOEIC Speaking & Writing (S&W): an approximately 80-minute performance test with 11 Speaking tasks and 8 Writing tasks, each half scored 0-200.

Candidates can take either half independently. Many corporate candidates only take L&R; academic or professional candidates who need to demonstrate production skills sit S&W as a separate session. The two tests were designed years apart, for different use cases, and ETS kept the historical scales rather than forcing a unified score.

Why 10-990 for L&R?

The 10-990 scale dates from the original TOEIC test in 1979 and was chosen to avoid the appearance of a percentage. Each section (Listening, Reading) is independently scaled 5-495 in 5-point increments, and the two are summed for a total. No one scores 0: the lowest reportable section score is 5, so the lowest possible total is 10.
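
The arithmetic behind the total is simple enough to sketch. The snippet below is a minimal illustration, not anything ETS publishes: it checks that each section score sits on the 5-495 scale in 5-point steps and then sums the two. The function name `lr_total` is hypothetical.

```python
def lr_total(listening: int, reading: int) -> int:
    """Sum two TOEIC L&R section scores after checking they are on-scale."""
    for name, score in (("listening", listening), ("reading", reading)):
        # Each section is reported from 5 to 495 in 5-point increments.
        if not 5 <= score <= 495 or score % 5 != 0:
            raise ValueError(f"{name} score {score} is not on the 5-495 scale")
    return listening + reading


print(lr_total(5, 5))      # lowest possible total: 10
print(lr_total(495, 495))  # highest possible total: 990
```

Because both section floors are 5, no combination of valid inputs can produce a total below 10.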

Why 0-200 for S&W?

The S&W test launched in 2006-2007 with a different scoring philosophy. Responses are human-rated and task counts are smaller, so the 0-200 scale in 10-point increments mirrors rater-judgment granularity rather than multiple-choice equating. The two scales are kept separate because averaging them would hide very different underlying evidence.

How Scaled Scores Are Derived: The Equating Story

Your raw score — the count of questions you got right — is not what appears on your score report. ETS applies a statistical process called equating that adjusts for small differences in difficulty between test forms.

Suppose Form A has a slightly easier Reading section than Form B. If both forms were scored by raw count alone, a candidate who took Form A would have an unfair advantage. Equating solves this by mapping raw scores on each form to a common scale such that a scaled score of, say, 400 in Reading represents the same ability regardless of which form you sat.

This is why:

Raw scores are never reported. You will not see "87 out of 100" on your report.
The same raw count can produce different scaled scores across administrations.
The scale is stable over time. A 750 in 2020 and a 750 in 2026 represent the same level of English proficiency, even though the specific questions and candidate pool differ.
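
To make the idea concrete, here is a sketch of linear equating. This is a simpler technique than the IRT-based equating described in this guide, and it is shown only to illustrate why the same raw count can map to different scaled scores. Every number below (form means, standard deviations, and the reference scale) is invented for illustration, and the function name is hypothetical.

```python
def linear_equate(raw, form_mean, form_sd, ref_mean=300.0, ref_sd=90.0):
    """Map a raw score to a common scale by matching z-scores (linear equating)."""
    z = (raw - form_mean) / form_sd       # standing relative to this form
    scaled = ref_mean + ref_sd * z        # same standing on the common scale
    scaled = 5 * round(scaled / 5)        # report in 5-point steps
    return int(min(495, max(5, scaled)))  # clip to the 5-495 section range


# Hypothetical form statistics: Form A is slightly easier (higher mean raw score).
form_a = {"form_mean": 62.0, "form_sd": 15.0}
form_b = {"form_mean": 58.0, "form_sd": 15.0}

raw = 70
print(linear_equate(raw, **form_a))  # 350: same raw count, easier form
print(linear_equate(raw, **form_b))  # 370: same raw count, harder form
```

The same 70 correct answers earn fewer scaled points on the easier form, which is exactly the adjustment equating exists to make.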

L&R uses Item Response Theory (IRT) equating, with KR-20 reliability coefficients of roughly 0.90 or above for both sections. KR-20 measures internal consistency, so values this high indicate that the items within each section rank candidates consistently and that scores contain relatively little random error.

The Standard Error of Measurement: Why Your "True Score" Bounces

No test — not TOEIC, not TOEFL, not IELTS — reports a perfectly exact ability level. Every scaled score carries a standard error of measurement (SEM), which quantifies how much noise surrounds the reported number.

For TOEIC L&R, SEM is approximately ±25 scaled points per section. This means if your reported Listening score is 400, your "true" score (what you would average across infinite administrations) lies within 375-425 about 68% of the time, and within roughly 350-450 about 95% of the time.
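
The bands quoted above are just the reported score plus or minus one or two SEMs. A minimal sketch, assuming the ±25 per-section figure and clipping to the 5-495 section scale (the function name `score_band` is hypothetical):

```python
def score_band(score, sem=25, n_sem=1):
    """Return the (low, high) band of score +/- n_sem * SEM, clipped to 5-495."""
    low = max(5, score - n_sem * sem)
    high = min(495, score + n_sem * sem)
    return low, high


print(score_band(400))           # (375, 425), the roughly 68% band
print(score_band(400, n_sem=2))  # (350, 450), the roughly 95% band
```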

What SEM Means in Practice

If you scored 700 today and score 720 when you retake the test next month, that 20-point gain is almost certainly measurement noise, not real improvement. When two administrations are compared, both scores carry error, so the relevant yardstick is the standard error of difference (SE_diff): approximately ±35 points, or the ±25 SEM multiplied by √2. A rough rule of thumb:

Observed change	Interpretation
0-20 points	Likely noise; no meaningful change in ability
25-35 points	Ambiguous; could be noise or modest improvement
40-65 points	Likely a real change in ability
70+ points	Substantial, almost certainly real improvement
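
As a sketch, the rule of thumb can be written as a small function. Treating a change of exactly 20 as noise and a change of exactly 40 as likely real is an assumption made here to settle the boundaries, and both function names are hypothetical. The helper `se_diff` uses the standard formula SEM × √2, which holds when the two scores carry equal and independent error.

```python
import math

SEM = 25  # per-section SEM quoted in this guide


def se_diff(sem=SEM):
    """Standard error of the difference between two scores with equal SEM."""
    return math.sqrt(2) * sem


def interpret_change(before, after):
    """Label an observed score change using the rule-of-thumb table."""
    change = abs(after - before)
    if change <= 20:
        return "likely noise"
    if change < 40:
        return "ambiguous"
    if change < 70:
        return "likely real"
    return "substantial"


print(round(se_diff()))            # 35
print(interpret_change(700, 720))  # likely noise
print(interpret_change(700, 770))  # substantial
```

The function uses the absolute change, so a 20-point drop is read the same way as a 20-point gain.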

This is why corporate hiring thresholds often require candidates to meet a target score by a comfortable margin. A company requiring "minimum 700" knows that a 695 and a 705 are statistically indistinguishable, so many HR departments set internal cut-offs 30-50 points above the stated minimum.

SEM for S&W

S&W scores are reported in 10-point increments because the underlying measurement precision does not support finer distinctions. Speaking scores of 140 and 150 are distinct reportable results, although both fall within the same 130-150 proficiency band; a Speaking score of 143 would not be statistically meaningful, so ETS does not report to that level.

TOEIC L&R Score Ranges and What They Mean

Here is the commonly cited interpretation for total L&R scores, drawn from ETS proficiency descriptors and corporate usage guides:

Total Score	CEFR (approx.)	Practical English Ability
905-990	C1-C2	Near-native working proficiency; can handle complex negotiations, nuanced written communication, technical discussions
785-900	B2-C1	Strong working proficiency; can participate confidently in meetings, write professional emails, understand most business content
605-780	B1-B2	Functional working proficiency; can handle routine workplace interactions and standard correspondence with occasional gaps
405-600	A2-B1	Limited working proficiency; can communicate basic needs and follow simple instructions, but struggles with abstract or technical topics
255-400	A2	Elementary proficiency; can handle highly predictable exchanges only
10-250	A1	Basic formulaic English; phrase-level comprehension and production

These ranges are guidelines, not contractual thresholds. Many employers publish their own cut-offs based on job function (for example, 600 for customer service, 750 for international sales, 850 for executive roles).

The L&R Certificate Color Tiers

Candidates who take TOEIC L&R receive a certificate with a color code reflecting the score band. The common tier structure is:

Color	Score Range	Proficiency Summary
Gold	860-990	Can handle most work situations confidently
Blue	730-855	Can meet needs for social and workplace communication
Green	470-725	Can make clear, basic conversation
Brown	220-465	Can handle limited, routine exchanges
Orange	10-215	Basic formulaic English only

These thresholds are widely cited but may vary by region. Each ETS Preferred Network (EPN) — the national administrator in a given market — has some discretion over certificate presentation, and minor variations in band edges occur in some countries. If a specific cut-off matters to you (for example, a hiring manager requested "Gold level"), confirm the exact threshold with your local EPN.
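
For readers who want to check a score against the commonly cited bands, the table can be expressed as a short lookup. This is a sketch under the assumption that the thresholds listed above apply in your market; the names `TIERS` and `certificate_color` are hypothetical.

```python
# Lower bound of each tier, taken from the table above; local EPNs may differ.
TIERS = [(860, "Gold"), (730, "Blue"), (470, "Green"), (220, "Brown"), (10, "Orange")]


def certificate_color(total):
    """Return the certificate color for a total L&R score (10-990)."""
    if not 10 <= total <= 990 or total % 5 != 0:
        raise ValueError(f"{total} is not a valid total L&R score")
    for lower_bound, color in TIERS:
        if total >= lower_bound:
            return color
    raise AssertionError("unreachable: every valid total is at least 10")


print(certificate_color(855))  # Blue
print(certificate_color(860))  # Gold
```

A single 5-point step separates the two outputs, which is worth remembering before reading too much into a tier label.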

The Abilities Measured Breakdown

Every L&R score report includes an Abilities Measured section that reports your percentage correct across five skill areas per section. This is genuinely useful diagnostic information — much more actionable than the single scaled score.

Listening Abilities Measured

Can infer gist, purpose, and basic context based on information explicitly stated in short spoken texts (Parts 1-2 territory).
Can infer gist, purpose, and basic context based on information explicitly stated in extended spoken texts (Parts 3-4 gist questions).
Can understand details in short spoken texts (Part 2 detail questions, Part 1 photograph details).
Can understand details in extended spoken texts (Parts 3-4 detail questions).
Can understand a speaker's purpose or implied meaning (pragmatic understanding; intention, tone, indirect speech).

Reading Abilities Measured

Can locate and understand specific information in tables and passages (Parts 5-7 scanning tasks).
Can connect information across multiple sentences in a single text and across texts (Parts 6-7 inferencing across multi-text sets).
Can make inferences based on information in written texts (Part 7 implied meaning).
Can understand vocabulary in workplace texts (Part 5 lexical items).
Can understand grammar in workplace texts (Part 5 grammatical forms).

Using the Abilities Measured to Study

If your overall Listening score is 350 but your breakdown shows 85% on detail questions and 40% on pragmatic/implied meaning, you know exactly where to target practice. Most candidates improve fastest by drilling their weakest ability area rather than doing generic full-length practice tests.
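
Picking out the weakest areas is a simple sort. The sketch below uses an invented Listening breakdown with shortened labels; the percentages and the function name `weakest_areas` are hypothetical.

```python
def weakest_areas(percent_correct, n=2):
    """Return the n ability areas with the lowest percent-correct values."""
    ranked = sorted(percent_correct, key=percent_correct.get)
    return ranked[:n]


# Hypothetical Listening breakdown; labels are shortened for readability.
listening = {
    "gist, short texts": 70,
    "gist, extended texts": 65,
    "details, short texts": 85,
    "details, extended texts": 85,
    "purpose / implied meaning": 40,
}

print(weakest_areas(listening))
# ['purpose / implied meaning', 'gist, extended texts']
```

Merging the Listening and Reading dictionaries before calling the function would rank all ten areas at once.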

TOEIC S&W Score Ranges and Descriptors

The Speaking and Writing scales each run 0-200 in 10-point increments, and ETS publishes a separate set of proficiency descriptor bands for each half.

Speaking Proficiency Bands

The Speaking section has 11 tasks. Tasks 1-10 are rated 0-3 each, and Task 11 is rated 0-5, producing a maximum raw total of 35 (10 × 3 + 5), which is then converted to the 0-200 scale. ETS publishes 8 proficiency descriptor bands:

Scaled Score	Proficiency Level
190-200	Highly proficient; nuanced opinions, complex syntax, near-native delivery
160-180	Proficient; clear opinions and reasoning, minor pronunciation or grammar issues
130-150	Effective; generally understandable with some hesitation and limited range
110-120	Functional; intelligible in predictable contexts, frequent pauses
80-100	Limited; short phrases, heavy reliance on formulaic language
60-70	Basic; difficult to follow, severely limited vocabulary
40-50	Minimal; single words and memorized phrases only
0-30	Cannot function meaningfully in spoken English

Writing Proficiency Bands

The Writing section has 8 tasks. Q1-5 are rated 0-3, Q6-7 are rated 0-4, and Q8 is rated 0-5, again converted to the 0-200 scale. ETS publishes 9 proficiency descriptor bands:

Scaled Score	Proficiency Level
200	Mastery; sophisticated, well-organized, minimal errors
170-190	Highly proficient; extended opinions with strong support
140-160	Proficient; coherent opinions with occasional errors
110-130	Effective; clear basic communication; limited range
90-100	Functional; simple sentences, frequent errors
70-80	Limited; fragmented ideas, heavy grammatical problems
50-60	Minimal; phrase-level writing only
40	Pre-functional; barely intelligible
0-30	Cannot produce meaningful written English
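
The two S&W tables can be checked with a few lines of code. Summing the per-task rating ceilings stated above gives 35 for Speaking and 28 for Writing, and a scaled score can be matched to its band by its lower bound. This is a sketch only: numbering the bands from 1 at the bottom is an assumption made here for convenience, all names are hypothetical, and the actual raw-to-scaled conversion table is not reproduced.

```python
from bisect import bisect_right

# (number of tasks, top rating) per task group, as described in this guide.
SPEAKING_TASKS = [(10, 3), (1, 5)]
WRITING_TASKS = [(5, 3), (2, 4), (1, 5)]

# Lowest scaled score of each band, in ascending order, from the tables above.
SPEAKING_FLOORS = [0, 40, 60, 80, 110, 130, 160, 190]
WRITING_FLOORS = [0, 40, 50, 70, 90, 110, 140, 170, 200]


def max_raw(task_groups):
    """Highest possible sum of rater scores for one test half."""
    return sum(count * top for count, top in task_groups)


def band_number(scaled, floors):
    """1-based band index (1 = lowest band) for a 0-200 scaled score."""
    if not 0 <= scaled <= 200 or scaled % 10 != 0:
        raise ValueError(f"{scaled} is not a valid 0-200 score in 10-point steps")
    return bisect_right(floors, scaled)


print(max_raw(SPEAKING_TASKS), max_raw(WRITING_TASKS))  # 35 28
print(band_number(150, SPEAKING_FLOORS))                # 6 (the 130-150 band)
print(band_number(150, WRITING_FLOORS))                 # 7 (the 140-160 band)
```

The same scaled score of 150 lands in different bands on the two halves, one more reason the two numbers should not be averaged.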

Pronunciation and Intonation on the Speaking Certificate

A distinctive feature of the TOEIC Speaking certificate is the inclusion of two sub-descriptors that are reported not as numbers but as three-level bands:

Pronunciation: Low / Medium / High
Intonation and Stress: Low / Medium / High

These labels reflect rater judgments on the clarity of your sounds (consonants, vowels, word stress) and the naturalness of your sentence-level prosody (rhythm, pitch contour, emphasis placement).

A candidate scoring 150 in Speaking might receive:

Pronunciation: Medium
Intonation and Stress: Medium

A candidate scoring 180+ almost always receives High on both, while candidates below 110 typically receive Low on at least one.

Some employers — especially in customer-facing or international-communication roles — look at these descriptors specifically. A candidate with 160 and "High / High" on the sub-descriptors may be preferred over a 170 candidate with "Medium / Low," because intelligibility often matters more to the job than vocabulary range.

How S&W Scores Are Produced

Unlike L&R, S&W responses are evaluated by certified ETS raters through the Online Network for Evaluation (ONE). Each response is typically scored by multiple raters with discrepancies resolved by adjudication. Speaking rubrics cover pronunciation, intonation and stress, grammar, vocabulary, cohesion, and content relevance; Writing rubrics cover grammar, vocabulary, organization, relevance, and task completion — each applied differently by task type. Rater scores are summed and then mapped to the 0-200 scale using a conversion table updated periodically to maintain stability.

Percentile Ranks: Where You Stand Globally

Your score report also shows percentile ranks — the percentage of test-takers worldwide who scored at or below your score. ETS updates these tables each May based on a rolling three-year candidate pool. As rough reference points: 990 is the 99th+ percentile, 900 is around the 90th, 800 around the 75th, 700 around the 55th, and 500 around the 20th. Percentiles matter for competitive selection (scholarships, international hiring pools) but do not change absolute employer thresholds.

Reading Your Score Report: A Practical Checklist

When you receive your TOEIC score report, work through it in this order:

Total score — compare to your target and to the SEM. Is your margin above the required minimum larger than ±25?
Section scores — is one section dragging the other down? If so, next round of prep should focus there.
Abilities Measured percentages — identify the two weakest ability areas out of the ten (five Listening, five Reading). These are your highest-leverage study targets.
(S&W only) Proficiency descriptors — read the full paragraph-level descriptor for your band, not just the score. The descriptor tells you what specific behaviors would move you up.
(S&W only) Pronunciation / Intonation labels — if either is "Low," targeted phonics and prosody practice will produce visible gains faster than general speaking practice.
Percentile rank — only relevant if you are in a competitive selection context. Otherwise, focus on the absolute score.

Common Misinterpretations

"I got 87% on the Listening Abilities breakdown, so my Listening score should be 870." No. Abilities Measured percentages are diagnostic category percentages, not the basis of your scaled score. Your scaled score reflects the full equated IRT model across all items, weighted by item difficulty.

"I scored 720, my friend scored 740 — she is better at English." Within ±35 of each other, scores are statistically indistinguishable. A 720 and a 740 are operationally the same score. Only gaps of roughly 70+ points reliably reflect real ability differences.

"I will focus on reaching the next color tier." A motivational goal, but tier boundaries are discrete while ability is continuous. An 855 (top of Blue) is functionally identical to an 860 (bottom of Gold). Do not over-value the color at the boundary.

The Bottom Line

TOEIC uses two different scales because it is two different tests, built at different times for different purposes. Both scales are rigorously equated, reasonably reliable, and accompanied by rich diagnostic information — if you know how to read it. The single most important habit a TOEIC candidate can build is to ignore the total score as a first-pass reading and instead go straight to the Abilities Measured breakdown (for L&R) or the proficiency descriptor paragraph (for S&W). That is where the actionable information lives.

Understand SEM, factor ±25 into your target-setting, and do not chase 10-point swings between administrations — they are noise. Aim for meaningful gains of 40-70 points per prep cycle, and study the specific ability areas holding you back rather than doing undifferentiated full-length practice.
