Review and/or Usage Links

Many educationalists use the Waddington Diagnostic Reading, Spelling and Math Tests.
Here are some examples of questions, feedback and research by others:

October 2016 Response to testing best practice query - "We test our students in semester 1 and at the end of semester 2 to look for student growth. Should we be using the same diagnostic test (eg. test 1) in both semesters or is the system designed to use test one first and test 2 in the second semester? Some staff have asked if using test 2 in the second semester produces a skewed result when they have used test 1 initially." EP, Primary School South Australia

When we use the words ‘best practice’, we need to be sure our focus is on similar objectives and goals. To start with, we must be sure about why we are testing. Testing only for the sake of obtaining summative data is not a good reason on its own. It might provide some insight into growth, but that can be vexed by many factors, such as the test's error of measurement, which I will touch on in this reply. The focus must be on skills and on viewing each student as an individual. We need to diagnose each student's current skills with a view to consolidating the skills they can independently reapply, then work out the best way to add to them through effective teaching and learning opportunities. Luckily, even though my reading, spelling and math tests provide summative data such as an age-equivalent score, they also provide much more diagnostic information about the individual student.


Semester 1 is a six-month time frame, so I'm not sure when you actually do your testing in semester 1. Best practice, in my opinion, is to test the students for reading, spelling and maths in week 2 of term 1. Testing this early in the year lets the teacher plan each student's learning program; week 1 is left as a settling-in period so student anxiety doesn't affect test results. The administrative instructions on page 5, Preparations for Testing Checklist, state this. If testing is not done early, you've most likely wasted a lot of tailored teaching and learning time when students need it the most.


You ask about the possibility of skewed results if using test 2 after test 1. The most important factor here is the time frame between testings, not the tests themselves. I'll explain this further. The shorter the time frame between tests, the more significant the Standard Error of Measurement (SEm) of the derived score. The reading and spelling tests provide SEms on pages 38 (reading) and 64 (spelling). If you retest within a year, you are really talking about no more than 9 or 10 months. Can you really expect to see growth from a norm-referenced point of view over such a short period? Perhaps, but the SEm is critical. For example, the SEm of the Waddington Reading Tests 1 and 2 is ±2.8 months. In the world of norm-referenced standardized tests, this is a very low error of measurement, which is part of the reason the tests are highly respected and widely used. If the test-retest period is 10 months, you can expect to see data supporting reliable growth.

But who needs this data? If it's not for the benefit of the child/student, then it's a waste of time. If it's needed to support observations about questionable teaching and learning practices, then maybe it has a place. What's the point of testing students at the end of the year? It is better done early in the following year by the teacher who is actually going to do something worthwhile with the results for each individual student under their care, for the proper reasons explained in paragraph 2 above. Use those once-a-year results to compare performance if you must. My School Data Express program can do that sort of thing (e.g. compare average results between classes, years, gender etc.), but that is more useful to the system than to the student as an individual or their parents. Early intervention is the key.
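The SEm arithmetic above can be sketched with the standard formula for the standard error of a difference between two scores. This is a general psychometric rule of thumb, not something taken from the test manual itself: a gain between two testings is only reliably above measurement noise once it exceeds roughly 1.96 times that combined error. The ±2.8-month SEm figure is from the text; the 1.96 multiplier (the usual ~95% z-value) is my assumption.

```python
import math

def reliable_growth_threshold(sem_test1, sem_test2, z=1.96):
    """Smallest score gain (in months) that exceeds combined measurement
    error at roughly 95% confidence. Uses the standard error of a
    difference: sqrt(SEm1^2 + SEm2^2), scaled by the z multiplier."""
    se_diff = math.sqrt(sem_test1 ** 2 + sem_test2 ** 2)
    return z * se_diff

# Both Waddington Reading Tests quote an SEm of +/-2.8 months (page 38).
threshold = reliable_growth_threshold(2.8, 2.8)
print(round(threshold, 1))  # about 7.8 months
```

On these assumptions, a gain needs to be close to 8 months before it clearly outruns measurement error, which is consistent with the point that a 10-month test-retest window can show reliable growth while a one-semester window generally cannot.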
There is absolutely no benefit in knowing whether a whole school is underperforming against the school up the road, because results such as NAPLAN can be influenced by as few as a handful of students. I speak with principals every week or so who lament how their school could do better. Little do they know I also speak with other principals who do things to improve their results which are totally divorced from curriculum or teaching and learning strategies.


There are other factors, besides the time period in between, which can skew test-retest results. These might include a change in teaching style, programming, a crowded curriculum, cheating, test fatigue, environmental conditions, or something as benign as a student having an ‘off day’ when the test is re-done. To reduce test fatigue, I provide two tests for both reading and spelling in parallel form. The tests are very closely matched, backed by estimates of their validity and reliability on page 102, so you can use tests 1 and 2 interchangeably with a high degree of confidence. Of course, as previously explained, you will increase their effectiveness, and decrease the test error, by testing once a year, no less than 11 or 12 months apart, at the time of year when teacher and student need the information the most.


Math testing is a bit different because it needs to be criterion-referenced (matched against a definitive curriculum broken down into small packets of skill sets and information in a careful order) rather than norm-referenced. Once again, by using criterion-referenced tests, such as mine, the focus is primarily on diagnosing each individual student's skills against a carefully arranged scope and sequence of essential learning. The National Curriculum talks about strands (vertical, cross-ways skill introduction as well as a laterally potted continuum of skills); my tests have used this terminology and approach for almost three decades. Unfortunately, the National Curriculum tends to skew its focus. It expects teachers to cover certain knowledge sets at each year level, but not all students are the same, and this can be dangerous for students who do not fit where someone else thinks they should. This is why I am a big believer in each student having their own math workbook based on their specific teaching and learning needs. The learning environment (e.g. the classroom) can enrich this if the tools and learning experiences are innovative and made available when students are ready to make use of them. So, all up, math testing cannot be done only twice a year; it is an ongoing process.


Other forms of testing should be ongoing, such as weekly spelling tests arising from individual or group-based spelling/language programs, fitness scores, computer activity scores, project/assignment marks and so on. Once again, this testing and scoring is done for the benefit of the individual student, and it can show skill growth against a particular program over shorter periods of time than norm-referenced tests, which tend to show more holistic growth. Some ‘old ways’ of doing things, such as weekly tests and marks, are still important today. School Data Express can be used to record these types of testing events, and the individual tests can give a collective percentage which can be compared against previous period(s). Reading Recovery, PM Benchmarking, Lexiles, Jolly Phonics etc. depend on frequent testing, such as via running records and other program- and classroom-based test regimes. They tend to be more time-consuming and irregular, and sometimes I question their real benefit if not done for a specific purpose. In my opinion, these can show growth only if the data can be standardized (e.g. common approaches from teacher to teacher and year level to year level), stored, summarized and re-presented for making comparisons.


So this brings me back to the first question I posed at the start, and one that everyone should always ask themselves: “Why am I testing?” If it's not for the good of the individual student, then you could be totally wasting your time retesting at the end of a year. In this regard, the question of whether to use test 1 or 2 for the end-of-year retest is redundant. Personally, I'd use test 1 for the early years. There'll be some familiarity, but that can also be a good thing. Then use test 2 for middle primary, switching to the alternative test wherever you think there might be over-familiarity with the other. Use test 1 again for upper primary and secondary. As per the page 5 instructions, only ever introduce an Advanced Reading Test for individual students who score 5 or fewer errors on a Standard Reading Test. The Waddington Standard and Advanced Spelling Tests 1 & 2 are cleverly presented and arranged so all students, regardless of age, can attempt and complete as much as they can (see page 54).


July 2016 Response to a discrepancy noted by a state-based educational tutoring co-ordinator between a child's score on the Waddington Reading Test and the PM Benchmarking done by the school for that child

Yes, there can be differences. It boils down to whether the testing has been done properly, what each type of assessment tool is actually doing, how it relates to the reading and learning processes, and whether the derived result is a true reflection of the aim of the program, or even what it purports to achieve.


Here is some info about the PM Benchmark approach:

Young teacher first time using it

Australian website selling the resource and their key points

UK website selling the resource and their key points

Website - We must rethink our use of PM Readers and Benchmarking

Critical parent comments


I quote from the beginning of Chapter 3 of my Waddington Diagnostic Standard and Advanced Reading and Spelling Tests 1 & 2 Third Edition: “Reading has been defined as, ‘a message-getting, problem solving activity which increases in power and flexibility the more it is practised.’[1] This problem solving activity involves the interrelated language modes of listening, speaking, reading, viewing and writing, as ‘one often supports and extends learning of the others.’[2]”


My tests reflect a purposeful and successful approach to reading skill acquisition along with effective key teaching approaches. They are grounded in the very core elements of how reading embodies not only what is presented on a page/screen but how it interacts with the reader and their current skills, chiefly via 3 cueing systems:

1. Grapho-phonic knowledge (understanding of letter/sound relationships) - phonemic awareness and phonics,

2. Knowledge of the sentence patterns and structures of the English language (syntactic/grammatical knowledge) – fluency,

3. Knowledge of the world (semantic knowledge) – vocabulary knowledge and text comprehension.


This is why my reading tests come with spelling tests (grapho-phonic knowledge), and why I also provide math tests (semantic knowledge). In the early stages of learning to read, it is not just about how a child can respond to text on a page/screen. Reading, in its most elementary stages, is basically about decoding: making sense of symbols (the letters of the alphabet) and the sounds they make in their most stripped-down form. That is why my reading tests start with letters, and the diagnostic properties contained in the first 9 examples cover many key elements without anyone having to think of them first.


Does PM Benchmarking attend to all I write above? Does it give enough credit to a person who is actually able to read (make sense of written symbols in their most elementary forms) with skills which may precede or transcend text on the pages of a pre-defined book at a pre-determined stage that is supposed to represent accomplishment at the earliest stages of reading? Do the PM Benchmarking levels truly embody and reflect what we should be teaching during the learning-to-read phases? How much do the judgement of the person administering the ‘PM test’ and the amount of coaching influence the outcome, especially during the taking of running records? As a professional, you need to answer these questions. I would say most professionals would call the levels a guide that does not reflect everything they do as a teacher, nor everything a student can do when they read. Some may say PM Benchmarking reflects long-term reading goals but not the nitty-gritty of exactly what a child needs within a carefully constructed scope and sequence of important skills. As a teacher, I need to use assessment devices that match what I am actually doing so I can be sure my teaching, and each student's learning, is progressing well. I never use assessment devices I do not fully understand, or which are somewhat alien to my teaching style and goals, as well as my students' learning styles and goals.


I would ask the child’s teacher how the PM Benchmarking result reflects the child’s general ability to read, where the child is under-performing and how the PM Benchmarking result provides a guide to what needs to be done to improve the student’s reading skills.


You ask, “how we can explain this discrepancy to the parents?” To start with, the Reading Age attained by the student you mention is 3 months outside the ±2 SEm band, the range expected to contain the true score about 95% of the time. If this score seems unnaturally high for the student, other factors may have been at play: test coaching, the test reflecting too closely what the teacher is teaching or what the student has recently been exposed to, over-acquaintance with the test by the student, and so on. Testing the student with Test 2 might provide a more accurate reading age as long as the administrative instructions are followed closely. The same goes for the PM Benchmarking: check its score's error of measurement and the detail behind how it arrives at a reading age. Re-test, preferably with a different test administrator, and see if you get a similar result. These are the main courses of action to see whether a discrepancy still exists. You can tell all this to the parents, but first establish how concerned they are and how much they actually want to know about the technical details. I'd be fairly confident that all they really want to know is whether their child is at an age-appropriate level compared with the average, whether they are progressing well and, if not, how they and their child can be helped. Remember, learning progress is not just about reading books; it includes the ability to spell, write and understand mathematics, and exposure to a rich learning environment with different forms of resources and technologies.
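The ±2 SEm band mentioned above can be sketched as a simple range check. The reading ages and SEm below are hypothetical illustration figures of my own choosing, not the actual scores from this case: a score of 112 months with an SEm of 2.8 months, compared against a second result of 121 months.

```python
def score_band(score_months, sem_months, k=2):
    """Range expected to contain the true score about 95% of the time:
    the observed score plus or minus k standard errors of measurement."""
    return (score_months - k * sem_months, score_months + k * sem_months)

def within_band(other_score, score_months, sem_months, k=2):
    """Is a second result consistent with the first, given its SEm band?"""
    lo, hi = score_band(score_months, sem_months, k)
    return lo <= other_score <= hi

# Hypothetical: reading age 112 months, SEm 2.8 months.
lo, hi = score_band(112, 2.8)
print(lo, hi)                        # 106.4 117.6
print(within_band(121, 112, 2.8))    # False: outside the +/-2 SEm band
```

A second score falling outside the band, as in this sketch, is the situation described above: the gap is larger than measurement error alone would comfortably explain, so the other factors listed (coaching, over-familiarity, administration differences) are worth checking before re-testing.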


I think you answered your own questions well when you said, “I have no question about the result he has achieved on his Waddington assessment. Whilst this student has been slow to pick up fluency skills, his understanding and articulation of concepts taught is most accurate.” It's a bit like how we have to be aware of all forms of communication these days (texting on an iPhone compared with text in a book, for example) and how they can have a beneficial effect on our learning; sometimes this will not show up easily in a test with a narrower focus.


[1] Clay, M. 1991, Becoming Literate: The Construction of Inner Control

[2] ACARA (Australian Curriculum Assessment and Reporting Authority), The Australian Curriculum English: Foundation to Year 10, 2011, p. 3

International School in China
"Hello Neil,
We haven't met but I have been a long time user of your material. I thought I would offer an observation just out of interest.
I have been a teacher in SA for 35 years but recently my wife and I made a big move. I am at present working in China, at an international school. It runs on the British national curriculum with a bit of tweaking at the edges. I have a class of students who are ESL in the main (three native English speakers). The reading level tests used by the school are rather opaque for my tastes (much of the UK testing is), so I decided to pull out the Diagnostic Reading Test No 2 to see if I could make sense of their abilities. I know that your baselines were established using Australian students, so I wasn't sure what I was going to see in the results. The scores and their implications were in complete accord with my instinctive judgements of the students' abilities and lined up strongly with their chronological ages. This showed that many of the ESL students had been given a good grounding in phonics and essential reading skills in JP and that their rate of learning was consistent with your Australian sample. While I can't jump to conclusions, it seems to me that the tests work consistently in an international school environment. I thought you might be interested.
" P Carter, International School CHINA, 25 October, 2012


Some links to user research and/or implementation or here

Assessment Tools for Literacy Learning Matrix, SA Ed Dept, July 2012

Evidence that the Waddington Tests produce reliable results over short test - re-test time frames, Charles Darwin University

StarSkills Blog  Review

Studies / Use By Others:

SPELD SA and Education Department SA long-term 13-year growth study of 850 children from their first day of school to their last, using the Waddington Diagnostic Reading and Spelling Tests 1 & 2 to measure outcomes, 2009 - 2022. - Ms A Weeks, Clinical Director SPELD SA. Interim Report: or here

2010 OLPC NT Study using the Waddington diagnostic tests in Reading, Spelling and Numeracy in seven primary schools with two classes in each school ~ 350 students. One of the classes will receive laptops, the other will not. The schools are Braitling, Bradshaw, Ross Park, Gillen, Sadadeen, Larapinta and Ntaria (Hermansburg). The plan is to conduct the tests prior to distributing the laptops and at the end of the trial, then compare results with the similar classes which do not have laptops. Ian Paul Cunningham Project Officer Technology, Information and Planning NT Department of Education and Training Ph: 08 89516816 Link to OLPC Wiki

"The purchase will be used to strengthen literacy monitoring and evaluation, particularly within the TVET National Bislama Literacy Project."
DME Program Coordinator 2006/2007, World Vision Vanuatu

Two Studies on the Effectiveness of Contiguous, Graphemic and Phonological Interventions on Measures of Reading and Spelling, R J Bourne, Master of Philosophy in Education University of Sydney, University of Sydney 2002.

To test or not to test? The selection and analysis of an instrument to assess literacy skills of Indigenous children : a pilot study, John R Godfrey, Gary Partington and Anna Sinclair, Edith Cowan University and the Education Department of Western Australia, Perth, 2001 Also available here.

Evaluation of a Standardised Test, Suzanne Speers, University of New England NSW, 2000

PA-EFL: A Phonological Awareness Program For Indigenous EFL Students With Hearing Disabilities

Academic Review / Assessment of the Waddington Diagnostic Reading Tests by Marian Haselton, 2004

When we receive requests from researchers for information and/or permissions, we welcome copies of their finished work. Unfortunately we do not always receive copies or notification when our resources have been reviewed. Please email us if you have research or links we can post here.