How to read a score report: raw score, scaled score, percentile and sub-scores

Raw score, scaled or standardised score, percentile and sub-scores do not answer the same question. This guide helps families read a score report without overinterpreting one number or missing the signal that actually matters.

You open the report: a total score, perhaps a raw score, a percentile, sub-scores by domain, sometimes a score range or confidence band. Within seconds the family conversation can head in the wrong direction: is this result good or bad?

Start somewhere else. A raw score tells you what the student got right on this particular version of the test. A scaled score (on some UK reports, a standardised score or Standard Age Score) is a converted score designed to make comparison fairer than raw marks alone. A percentile places the student relative to a reference group. Sub-scores can suggest strengths and weak spots, but they rarely justify a strong conclusion on their own.

In other words, a score report does not give one answer. It gives several. One figure helps you understand what was achieved on that sitting. Another helps you compare more fairly across papers or against a relevant sample. Another shows relative position. Others may guide revision. The real skill for families is simple to state and easy to miss: know which figure answers which question before you book another sitting, change the method, or rethink the target.

What each line on the report is really telling you

The same result can show several numbers without any contradiction. That is normal. They do not measure exactly the same thing. Vocabulary also varies by provider: scaled score, standardised score, Standard Age Score, percentile rank, reporting category, content domain. The safest habit is still the same: ask what question each figure is answering.

| Measure | What it tells you | What it is for | Common trap |
| --- | --- | --- | --- |
| Raw score | How many questions or marks the student got on this paper | Seeing what was achieved on that specific sitting | Comparing it directly with another sitting or a different test |
| Scaled or standardised score | A performance converted onto a reporting scale, sometimes adjusted for age or test form | Making comparison fairer than raw marks alone | Reading it as a percentage or a school grade |
| Percentile | Relative position compared with a reference group | Seeing where the student sits among other test takers or pupils | Confusing it with percentage correct |
| Sub-scores | Results by topic, skill, strand or reporting category | Spotting patterns that may matter for revision | Giving them more precision than they really deserve |

That table already does a lot of the work. A student can have an ordinary-looking raw score on a hard paper, a respectable converted score, and a high percentile in the reference group. Those figures are not in conflict. They answer different questions.

On many UK school reports, the converted score may be called a standardised score or Standard Age Score (SAS) rather than a scaled score. The practical rule is the same: once the number has been converted, stop reading it as if it were simply “marks out of 50”.
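
If it helps to see the mechanics, here is a minimal sketch of a raw-to-scaled conversion. The tables below are entirely invented; real providers publish their own conversion tables for each version of a test.

```python
# Hypothetical raw-to-scaled conversion tables for two versions
# ("forms") of the same test. Real tables come from the provider;
# these numbers are invented purely to illustrate the idea.
conversion = {
    "easier form": {38: 108, 39: 110, 40: 112},
    "harder form": {38: 116, 39: 118, 40: 120},
}

raw_score = 39
for form, table in conversion.items():
    print(f"{form}: raw {raw_score} -> scaled {table[raw_score]}")

# easier form: raw 39 -> scaled 110
# harder form: raw 39 -> scaled 118
# The same raw mark converts to different scaled scores because the
# conversion compensates for the harder paper; this is why raw marks
# alone are not a fair basis for comparison.
```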

The most common mistake: a percentile is not a percentage correct

This is the classic misunderstanding. An 80th percentile does not mean “80% correct”. It means the student performed as well as or better than about 80% of the comparison group used for that report.

That difference matters. Percentage correct says something about the paper. Percentile says something about relative position. So a student can have a high percentile with a percentage correct that does not look spectacular, if the paper was demanding or the comparison group scored lower overall. The reverse can happen too.
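
A small numerical sketch makes the gap between the two ideas visible. The scores below are invented, and real reference groups are far larger, but the arithmetic is the same.

```python
# Invented raw scores (out of 50) for a small reference group.
# Real norm groups are far larger; this only shows the arithmetic.
reference_group = [18, 21, 22, 24, 25, 27, 28, 30, 31, 40]

student_raw = 30          # 30 correct out of 50
percentage_correct = 100 * student_raw / 50

# Percentile rank: share of the group the student performed
# as well as or better than.
at_or_below = sum(1 for s in reference_group if s <= student_raw)
percentile = 100 * at_or_below / len(reference_group)

print(f"Percentage correct: {percentage_correct:.0f}%")   # 60%
print(f"Percentile rank:    {percentile:.0f}th")          # 80th
# 60% correct on a demanding paper, yet the 80th percentile:
# the two numbers answer different questions.
```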

In practice, first check what the school, course or admissions process actually cares about: a total score, a section score, a minimum threshold, a relative rank, or the wider application. That should guide how you read the report, not whichever number looks most impressive at first glance.

Sub-scores are clues, not a verdict

Sub-scores draw the eye because they look precise. Reading, algebra, vocabulary, problem solving, reasoning, scientific analysis, or any other breakdown used by the test provider can create the impression of a fine-grained diagnosis. Families should be more careful than that.

First, a sub-score often rests on fewer questions than the total score. The narrower the domain, the more weight a small number of items can carry. Second, some sub-scores are just percentages correct within a category, while others are themselves converted scores. Third, the categories on a report are not always clean compartments: one question can draw on several skills at once.
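
The granularity problem is easy to see with invented numbers: the fewer the questions behind a sub-score, the further one slip moves it.

```python
# Invented sizes, purely to illustrate granularity: how far does
# one question move a percentage-based score?
total_items = 50
subdomain_items = 5

print(f"One question on the total score: {100/total_items:.0f} percentage points")
print(f"One question on the sub-score:   {100/subdomain_items:.0f} percentage points")
# 2 points vs 20 points: a single careless slip can make a narrow
# sub-score look like a dramatic weakness.
```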

Sub-scores become genuinely useful in three situations:

  • when the same weakness appears consistently across several reports or several mocks;
  • when it matches the actual errors you can see in answers, scripts or corrections;
  • when it leads to a precise change in revision, rather than one more vague worry.

They become misleading when families ask too much of them:

  • treating a small gap as a deep truth about the student’s profile;
  • deciding a child is “weak” in a whole domain on the basis of one report;
  • comparing sub-scores from different tests, or even different sittings, without checking that they are really comparable;
  • building the whole work plan around a category based on very few questions.

The sensible use of a sub-score is modest. It helps you form a working hypothesis: should the student revisit a specific topic, practise a question format, or focus more on timing and sustained attention? It does not prove that they “do not have the level” in an entire subject area.

One more rule protects families well: do not rebuild the whole strategy around one isolated sub-score. Look for a repeated pattern. If the same weakness shows up across several score reports, in home revision, and in the student’s own experience, it becomes a credible signal. If not, it stays what it is: a lead, not evidence.

Why comparing two scores without context can mislead

Many bad decisions come from a comparison made too quickly. Two numbers are placed side by side, one is judged higher, and a story follows: improvement, stagnation, decline, limited potential. Serious comparison needs at least three checks: the scale, the reference group, and the uncertainty around the result.

The first trap is simple: the same raw score does not always mean the same thing. If two papers were not equally difficult, the same raw mark reflects different performances, and that is exactly why a scaled score exists. A converted score is there to reduce the unfairness of a naïve raw-mark comparison.

The second trap is just as important: the same score does not always produce the same percentile. Everything depends on the comparison group behind that rank. It may be a recent cohort of test takers, a national sample, pupils of the same age, or another group chosen by the provider. A percentile only means something with its reference group attached.
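
To see how much the reference group matters, here is a sketch ranking the same invented score against two invented groups.

```python
# The same scaled score, ranked against two invented reference
# groups. Numbers are illustrative only; real reports name their
# norm group, and that name is essential context.
def percentile_rank(score, group):
    """Share of the group scored as well as or better than (in %)."""
    return 100 * sum(1 for s in group if s <= score) / len(group)

national_sample  = [88, 92, 95, 100, 103, 105, 108, 112, 115, 124]
selective_cohort = [105, 108, 110, 112, 115, 118, 120, 122, 125, 130]

score = 112
print(f"vs national sample:  {percentile_rank(score, national_sample):.0f}th")   # 80th
print(f"vs selective cohort: {percentile_rank(score, selective_cohort):.0f}th")  # 40th
# Same score, very different rank: a percentile only means
# something with its reference group attached.
```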

The third trap is subtler: a small gap is not automatically a real change. Some reports show a score range, confidence band or similar indicator. When they do, they are reminding you of something important: a test score is an estimate, not a measurement to the millimetre. A slight rise or fall does not automatically justify a big story about progress or failure.
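
A sketch of two sittings with a hypothetical symmetric band (real bands vary by test and by score level) shows why a small gap is weak evidence on its own.

```python
# Two sittings reported with a hypothetical +/-5-point band.
# Real bands differ by test and by score level; this only shows
# why a small gap inside overlapping bands proves little.
def band(score, half_width=5):
    return (score - half_width, score + half_width)

mock_1 = band(112)   # (107, 117)
mock_2 = band(109)   # (104, 114)

overlap = mock_1[0] <= mock_2[1] and mock_2[0] <= mock_1[1]
print(f"Mock 1 band: {mock_1}, Mock 2 band: {mock_2}, overlap: {overlap}")
# The bands overlap widely, so the 3-point drop does not, by
# itself, support a story of real decline.
```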

Some tests are also adaptive, and families often forget this. When the test adjusts difficulty according to earlier answers, the path through the paper changes. That makes any reading based only on “number right” even less secure.

Take a simple example. A pupil scores slightly lower than on a previous mock, but the percentile stays similar, the weak area moves, and the score bands overlap widely. The sensible conclusion is not “they are going backwards”. It is: the signal is too weak to support that claim. Look at the pattern across several attempts and at the concrete errors, not one small difference.

The reverse is true too. A reassuring percentile can still mislead in a very selective context if the school or course mainly cares about a section score, a threshold, or a more competitive applicant pool than the report’s reference group. The number may look good in itself, but still answer the wrong question.

The useful questions to ask before deciding what to do next

Before you decide that your child needs another sitting, a different revision method or a bigger investment in preparation, ask a few plain questions. They prevent a surprising number of stress-driven decisions.

  1. Which measure actually matters for the goal?
    Some contexts care mostly about the total score. Others care more about one section, a minimum threshold, a benchmark score or the wider application. Until that rule is clear, it is easy to overinterpret the wrong figure.

  2. Which group is the percentile comparing the student with?
    A national sample, recent test takers, pupils of the same age, or a specific cohort can all produce a “percentile”, but not the same comparison.

  3. Is the gap larger than the test’s normal variation?
    If the report gives a score range, confidence band or nearby estimate, use it. If it does not, be cautious anyway about treating a very small shift as a solid truth.

  4. Is the issue general or concentrated?
    A low overall score with fairly even sub-scores does not call for the same response as a decent overall score with one repeated weak area.

  5. Does the pattern repeat across more than one attempt?
    A stable pattern deserves action. An isolated point deserves checking before the whole household reorganises around it.

  6. What concrete action follows from the report?
    Revisit one skill, practise timing, change the format of revision, sit the test later, or stop piling up mocks that are creating more noise than signal. A good score report should lead to a practical next step. If no clear action follows, the result probably has not been interpreted well enough yet.

This set of questions does something useful for the family discussion: it shifts the conversation from judgement to adjustment. You stop asking, “Is my child worth this score?” and start asking, “What does this report reasonably allow us to conclude, and what should we do next?”

The right reading order in one minute

When a score report arrives, the order you read it in matters almost as much as the numbers themselves. This is a better routine than reacting to the first headline figure.

  1. Find the decision-making measure first: total score, section score, threshold, benchmark or whatever the relevant process actually uses.
  2. Then read the percentile with its reference group, never on its own.
  3. Treat sub-scores as working hypotheses, especially if they are based on a small number of questions or appear only once.
  4. Only compare what is genuinely comparable: same test, same type of scale, same kind of report, and ideally a trend across several results rather than one isolated score.
  5. End with one practical decision: consolidate, target a domain, retake later, or refuse a comparison that is too flimsy to guide action.

A well-read score report is not mainly about deciding whether a result sounds good. It is about separating what you can really compare, what you can only suspect, and what it is still too early to conclude. That discipline protects families from false reassurance, unnecessary panic and badly targeted revision.
