End graded observations: this year’s brain gym, and the gorilla in the classroom


Hidden in plain sight

Research is powerful. It can chime with your intuition, or shatter preconceptions. Like when half of all observers in an experiment, asked to count passes of a ball, failed to spot a gorilla entering the game.

On Monday 13th January, Professor Rob Coe gave a speech on lesson observations in English schools at an event co-hosted by the Teacher Development Trust.

It was utterly shattering in its implications for school leaders. It turns out we are all complicit in this year’s brain gym.

Ben Goldacre in Bad Science demolished brain gym as a widely but uncritically adopted fad, an unscientific and useless intervention. Tom Bennett in Teacher Proof and Dan Willingham have demolished others such as VAK learning styles as pervasive but unevidenced. At ResearchEd 2013, Tom asked, what is this year’s brain gym? What are we falling for right now?

Professor Coe’s collation of the research suggests it is graded observations. I agree. Grading lessons is not reliable – two different observers who see the same lesson are unlikely to agree. Nor is it valid – even if they agree that what they see is good practice, it often isn’t.

Here are Professor Coe’s killer stats:

  • if a lesson is judged outstanding, the probability that a second observer would give a different judgement is up to 78%
  • if a lesson is judged inadequate, the probability that a second observer would give a different rating is 90%

But that’s in the robust, $50 million MET project; most schools’ observations are not as robust (Strong et al, 2011):

  • Fewer than 1% of those judged inadequate are genuinely inadequate
  • Only 4% of those judged outstanding actually produce outstanding learning gains
  • Overall, 63% of judgements will be wrong
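
To make the first pair of figures concrete, here is a minimal sketch of how a "second observer gives a different grade" rate can be worked out from paired grades. The grades below are invented purely for illustration (they are not MET data), and `disagreement_rate` is a hypothetical helper name, not anything from Professor Coe's talk.

```python
def disagreement_rate(pairs, grade):
    """Share of lessons given `grade` by the first observer
    that the second observer graded differently."""
    second_grades = [b for a, b in pairs if a == grade]
    if not second_grades:
        raise ValueError(f"no lessons were graded {grade!r} by the first observer")
    differing = sum(1 for b in second_grades if b != grade)
    return differing / len(second_grades)


# Hypothetical paired grades: (first observer, second observer).
# Assumed scale: 1 = outstanding ... 4 = inadequate.
pairs = [
    (1, 1), (1, 2), (1, 2), (1, 3),
    (2, 2), (2, 1), (2, 3),
    (3, 3),
    (4, 3), (4, 2),
]

print(disagreement_rate(pairs, 1))  # 3 of the 4 'outstanding' lessons were regraded
print(disagreement_rate(pairs, 4))  # both 'inadequate' lessons were regraded
```

With real paired observations in place of the made-up list, the same calculation would show how often a grade survives a second opinion.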

Prof Coe is rightly scathing: ‘tossing a coin would have been better’; ‘you might as well decide you don’t like someone’ as give them unsatisfactory.

The effect sizes of observation as an intervention are also very low: 0.22 and 0.11. As John Hattie says, setting the bar at zero is absurd; most interventions have some effect, so his threshold for effectiveness is 0.4, which graded observations do not meet.


Graded observations: the gorilla in the classroom

The evidence shows that grading lessons is not reliable, valid or useful. But intuition and experience tell me that it is also counterproductive and damaging.

Damaging, as some fifty teachers tell here of the pressure and pain they felt after being downgraded. What if they had known the 90% probability that a second opinion would have changed their rating?

Counterproductive, as David Didau shows here: ‘the cult of the outstanding lesson is retarding learning.’ The focus on busy engagement in protocols over memorable instruction is problematic: it is precisely this distractor that Professor Coe says compromises validity.

So what do we do about it?


First, do no harm: end numerical judgements

Doctors take the Hippocratic oath: first, do no harm. So should school leaders. But we are harming teachers’ professionalism by grading them out of 4, often in 20 minutes. There’s no way a surgeon would be graded out of 4 for a 20-minute observation of an operation.

We must stop grading lessons. Professor Coe says we should ‘stop doing what we’re doing’; ‘if you don’t want to use observations for grading, it may not matter that they’re not reliable.’

If we just use them formatively, teachers can focus on improving rather than being judged, and school leaders can combine quantitative assessment data, qualitative feedback from colleagues and their own intuition to form nuanced judgements of teaching quality.



Then, follow the bright spots: use formative-only observations

‘Sow the seed of the end of the judgemental approach to school leadership,’ said Alison Peacock at the same event. She is a primary head who eschews grading lessons and instead uses lesson study to build a culture of trust.

School leaders like Chris Moyse and Paul Bambrick-Santoyo are trailblazing formative-only models. It takes courage and willpower, but it can be – and is being – done.

In years to come, we may well look back on grading, like brain gym, as a travesty and a historical curiosity. Now, though, this business of grading observations must end. Let’s get the gorilla off our backs.


About Joe Kirby

School leader, education writer, Director of Education and co-founder, Athena Learning Trust, Deputy head and co-founder, Michaela Community School, English teacher

29 Responses to End graded observations: this year’s brain gym, and the gorilla in the classroom

  2. Reblogged this on Boom School.

  3. Hiya, great post! I’ve reblogged this on http://www.edubabble.info.

  4. Couldn’t agree more. There is a researcher called Matt O’Leary (Wolverhampton University) who writes quite well on observations as well.

    If you haven’t heard of him then a Google Scholar search should find the papers.

  7. marvinsuggs says:

    The resistance we will meet to these kinds of changes is significant. Can SLT put their faith in the evidence? Their first question would probably be “How would Ofsted judge this kind of system?”

  8. Sara says:

    I totally agree with everything you’ve said but I think there is one motivation for observation grades that you haven’t addressed. If you were SLT and wanted to “audit” the staff to get an overall picture of the quality of teaching in your school, how would you do that? As you’ve shown, the results they would get from compiling data from graded lesson obvs would probably be wrong, so how can they get more accurate data?

  9. Pedro says:

    Reblogged this on From experience to meaning… and commented:
    Made me think about the practices we do in teacher training. Luckily we invest a lot of time in different people judging a future teacher, still…

  10. rachelcabbit says:

    Reblogged this on Why Aye Miss! and commented:
    Couldn’t agree more with this. Every teacher has different ideas of what “outstanding” means, and observations are so flimsy. You can have a lesson that on paper and in theory is outstanding, but then the pupils may not actually have learned enough to have it categorised as outstanding.
    The old staffroom chat about inspections comes to mind – you put on all the bells and whistles and bring out the jargon and lesson plans for inspection week. Is it really how everyone teaches though? It seems observations do not do what they are supposed to do. They rarely reflect real teaching experiences and are far too pressured!

  12. dodiscimus says:

    From Wilshaw’s Westminster Education Forum speech on 7th November 2013: “Which ivory towered academic, for example, recently suggested that lesson observation was a waste of time – Goodness me!”. I wonder if Wilshaw needs to pay more attention to the ivory towered ones? Having said that, if you read the MET policy and practice brief (http://www.metproject.org/downloads/MET_Ensuring_Fair_and_Reliable_Measures_Practitioner_Brief.pdf) my interpretation is that Coe’s figures match, but not his conclusions. The MET project is talking about improving the reliability of observations, not abandoning them. Maybe this is what Wilshaw should actually be reading.

  13. There is a danger of a false polarity here: Grade=Bad or not Grade=Good. We have to be honest here and I speak as someone who has dedicated my 44 years in this profession to the approach so ably championed in this BLOG. But Ofsted and accountability will not go away. We have to marry the coaching, professional development model with the requirement to match criteria. We have to find a way of conducting our emotionally intelligent, collaborative inquiry, action research (call it what you will) and ALSO being prepared to challenge bad decisions/judgements by people who look and judge – we have to speak fluent Ofsted too! That is why I designed The iAbacus which STARTS with the teacher’s judgment and moves on to analyse and then plan…. Please don’t reply this is a product placement – that is another false polarity Public Service = Good or Business = Bad.. see http://www.iabacus.co.uk for the MODEL and the TOOL …

  14. MrHarrold says:

    Hi Joe,

    I was at the event and struck by the comprehensive case against the reliability of matching judgements – but I feel the debate at large has missed the key points by becoming about whether or not grades should be assigned to lessons.

    Firstly – even if the grade is wrong, feedback could have been useful, correct and helped a teacher improve. From your piece above, removing the grade would result in a positive scenario – feedback still useful – grade that was incorrect now removed. But what if the feedback was wrong or unhelpful or not understood? Removing the grade has no impact – and can lead towards false confidence about the utility of feedback if we assume that by not grading we are improving observations.

    Secondly – if we concede that observation feedback has a use in developing teacher effectiveness when feedback is accurate, then the focus should be on improving the types of things we look for in lessons – not whether or not we grade them. If we can improve the ability of observation to highlight the proxies of learning, then we can improve lesson observations more than we would by removing grades. In essence, what we grade becomes more important than why, or whether, we should grade.

