At the height of the debate over value-added modeling and the use of test scores to evaluate teacher performance, I was a champion for the tried-and-true teacher observation. Objective measures of teacher performance are elusive because so many variables are outside of our control. This is the same reason physicians, for instance, would refuse evaluation based on patient variables, like BMI or cholesterol: they can’t control what a patient does outside of their offices.

Thus, rather than look at an Excel spreadsheet of my students’ performance to evaluate my effectiveness, which seems cold and impersonal, come in and observe my practice. See me interact with students. Before you pass judgment, observe the conditions in which we work, the context in which we are teaching and learning. What is so often missing from quantitative information, like test scores, is a meaningful qualitative context in which to situate those numbers.

But there are tremendous obstacles to doing teacher observation correctly. Relationships are key, and so is repetition. Pre- and post-conferences are essential to understanding subjective interpretations of practice. And we all should know that hinging a score on one isolated chunk of time, without context, is just as bad as hinging a score on test data.

This, however, seems to be how most teacher observation is implemented, and we are finding that it is just as unhelpful for teachers as test score data. For the moment, the use of test data in evaluating teachers is not really up for argument, so the debate is typically over how heavily test data will be weighted versus other factors, like observation. The percentage varies, and the metrics on which evaluation is based depend on state, district, and grade level. No hard and fast rules exist.

Many districts, like DC, use detailed and complicated rubrics to approach something like objectivity, or inter-rater reliability, even though that objectivity is completely dubious. Relationships with principals can affect observation outcomes, and administrative turnover can keep a teacher guessing as to what kind of observer their new principal will be. Local principal bias is kept in check, at least in DC, by observations from external evaluators, or “Master Educators.” But these observations are completely fly-by, and totally devoid of any relationship or context whatsoever. It is laughably presumptuous to think that a rubric, regardless of detail, can control for observer subjectivities. The more hands you have in the process, the less likely observations are to be accurate or reliable.

Given my opposition to test scores as a reliable and fair way to evaluate teachers, the observation used to be my go-to alternative. But now I’m not so sure.

UPDATE: Not even 30 minutes after I finish this, I see a link about a principal falsifying teacher observations to get them in under a deadline.


Here are some specifics. My final appraisal for the 2011-2012 school year:

Evaluation summary scores:

Lesson Study: 100/100 points x .20 (20%) = 20 points
Principal Appraisal: 88/100 points x .40 (40%) = 35.2 points
VAM Data: 10/100 points x .40 (40%) = 4 points
Total points = 59.2 (Unsatisfactory)

The VAM data comes from Alachua Elementary […]
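The arithmetic in the appraisal above can be checked with a short sketch. This is only an illustration in Python, not any district's actual scoring tool: the function name `composite_score` and the dictionary layout are assumptions made for the example, and only the scores and weights come from the appraisal quoted above.

```python
# Hypothetical helper: the names and structure here are illustrative
# assumptions, not part of any district's evaluation system.

def composite_score(components):
    """Return the weighted total for {name: (score out of 100, weight in percent)}."""
    total_weight = sum(weight for _, weight in components.values())
    if total_weight != 100:
        raise ValueError(f"weights must sum to 100, got {total_weight}")
    # Integer percent weights keep the arithmetic exact until the final division.
    return sum(score * weight for score, weight in components.values()) / 100


# Scores and weights as quoted in the 2011-2012 appraisal.
appraisal_2011_2012 = {
    "Lesson Study": (100, 20),
    "Principal Appraisal": (88, 40),
    "VAM Data": (10, 40),
}

total = composite_score(appraisal_2011_2012)
print(f"Total points = {total}")
```

With these inputs the VAM component contributes only 4 of its possible 40 points, which is what pulls the total down to 59.2 despite a perfect Lesson Study score and an 88 from the principal.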