Friday, August 28, 2009

Brief Analysis of Merit Pay

One aspect of correcting what some perceive as a broken education system in the United States is the mass application of a merit pay system for teacher compensation. Proponents believe that a merit pay system will better identify both low-quality teachers who should no longer be in the profession and high-quality teachers who should be better rewarded for their efforts. There is also a belief that changing the compensation structure for teachers from seniority to output will create a more ‘result-driven’ environment, forcing teachers to focus on continually improving results instead of relying on their reputations. These economic factors would then make education a more attractive environment for higher achieving individuals, thus producing higher quality teachers and, in turn, an even greater number of higher achieving individuals. Merit-pay based systems are also viewed as more flexible because administrators are better able to respond to performance changes and labor market alterations.

Merit pay is not a new issue: in the late 19th century a majority of teachers were compensated based on the results of their performance rather than their seniority or skills. However, as the 20th century wore on, the share of teachers in public schools compensated through merit pay dropped to 48% in 1918, 20% in 1939 and 4% in 1953.1 Due to a recent surge in interest in merit pay, merit pay compensation has increased to 5-10%.2 Most believe that the rise of teachers’ unions is the primary reason for the decrease in merit pay compensation in the 20th century.2,3

Overall merit pay has once again become a hot issue in public schools (technically the issue has never really gone away, but has been on the back burner until recently) because of floundering test scores and poor performance of students, especially in the higher grade levels, when compared against their international peers. Would people really care about the state of the education system, even if it were exactly the same, if U.S. students ranked first or second against international students instead of in the low teens? For some, merit pay is viewed as a ‘silver bullet’ for education reform, solving most of the current problems they believe plague school and teacher performance. Most proponents cite the successes of merit pay in private industry and private schools in an attempt to verify the superiority claim of merit pay, without realizing that public school is quite a different ‘beast’ not effectively emulated in either private industry or private schools. In a large number of industries, merit pay rarely influences salary for employees below a certain tier in the corporate hierarchy, the cutoff usually being a supervisory role. So if the same type of internal structure were applied to a school, most teachers would not qualify for merit pay unless they were a department head (most senior teacher) or an administrator.

As previously alluded to, it is difficult to transfer examples from private industry or private/charter schools as a measure to justify the positive changes that will result from the application of merit pay in a public school environment. In most environments it is easy to evaluate the reason why a certain project succeeds or fails because there are predictable and controllable inputs. However, in the classroom there are a multitude of complexities that influence test results. Natural student intelligence, student work ethic, instructor style/method, instructor motivation, institutional environment, parental guidance, available resources within a given school, etc. all have a level of influence on the ability of students to perform and all have significant layers of complexities. Most opponents of merit pay attack the validity of merit pay on these grounds, that it is too difficult to separate the influence of the teacher from these other influencing elements;1 however, the more appropriate question is why focus so much attention on the influence of the teacher? The role of the teacher is important, but it seems that proponents of merit pay wish to limit administrative and parental responsibility in the role of education.

Another problem with comparing the application of a merit pay system in a private school to one in a public school is that private schools have the ability, if so desired, to remove a student from the population pool because the student is not living up to the standards of the school, whether intellectually or behaviorally. In contrast, it takes extraordinary circumstances to remove someone from a public school. Even short of removing a student from the school outright, the code of conduct governing a private school can be more severe, creating an underlying motivation to learn because actions disruptive to learning carry both greater certainty and greater severity of consequence.

The ability to pick and choose the characteristics of the student body also creates a more homogeneous population for teachers in private schools than for teachers in public schools. These homogeneous populations limit the unique influencing factors that affect the instructional process, whereas the more heterogeneous populations in public schools have no such advantage. Another aspect that influences performance, and that is more likely to be present in private schools than in public schools, is direct parental involvement aiding motivation in the teaching process. Also, regardless of whether the cause is simply inadequate funding (some like to cite average dollars-per-student figures for public schools ranging from $8,000 to $15,000, ignoring the standard deviation) or misappropriation/incompetence by administrators, public schools do not appear to have the actionable depth of financial resources that private schools have, which can be used to ensure that the necessary and appropriate tools for learning exist in the classroom. Finally, the smaller populations of private schools not only reduce discipline problems, but also offer teachers more opportunities for one-on-one time with students, which facilitates improved learning. So to simply say that Smithville Private School has a merit pay system and look at how successful it is, therefore Smithville Public School should have one too, is to ignore a plethora of relevant information about why Smithville Private School would outpace Smithville Public School beyond just having a merit pay program.

Unfortunately another potential problem with merit pay at the current time is collusion and corruption. As seen in the No Child Left Behind (NCLB) program, when the judgment of meritorious service depends solely on peer/superior/student evaluations or standardized test scores, there is motivation to falsify results or adopt an ‘I’ll scratch your back if you scratch mine’ mentality in order to maintain funding or acquire more salary/funding by exceeding benchmark goals. That type of corruption is small, but it still exists, for some/most (depending on one’s personal optimism regarding the human condition) people tend to weaken in the morality department when money is involved. In a merit pay system it would be reasonable to expect some teachers/administrators, especially those who would be negatively affected by a merit pay system, to be tempted to ‘cheat’ the system in order to acquire a larger salary. Many studies have been conducted highlighting corruption in evaluation systems.4,5,6,7,8

This is not to say that there is more corruption in teaching than in other industries; there is corruption in almost all industries and occupations. A few bad apples should not create a misconstrued portrait of reality in the teaching profession. However, one of the arguments for a merit pay system is that it creates a fair playing field where individuals are allegedly judged on quality of performance rather than seniority and politics, but if the evaluation system is not designed properly then corruption negates this ‘fairness’. One could argue that under such circumstances the system becomes detrimental for the more scrupulous participants. In addition, focusing too much on test scores creates an environment where obtaining a certain average score on a certain standardized test becomes the primary motivation of the class rather than proper education and knowledge to create a quality and rational citizen, but more on that later.

There is a question about how genuine the motivation factor is in a merit pay system. For example, most merit pay systems within public school systems do result in higher overall average salaries, but the difference between the average salary in the average merit pay school and the average salary in the average non-merit pay school is less than 1,000 dollars, or about 2.7% of the annual salary.3 How much of a motivating factor is such a paltry increase in pay? The answer would largely depend on the evaluation criteria used to determine the positive or negative outcomes from any merit pay program. If the evaluation method is transparent and fair, there would be a much higher probability of improved motivation regardless of the size of the reward (more money is more money) than if the evaluation method is singular in nature and disingenuous to the educational process.
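As a rough check on the figures above, the two numbers can be combined to back out the average salary they imply. This is only a back-of-the-envelope sketch: it treats the ‘less than 1,000 dollars’ gap as approximately 1,000 dollars, which is an assumption, so the result is an approximate figure rather than a reported statistic.

```latex
\[
  \text{implied average salary} \approx \frac{\$1{,}000}{0.027} \approx \$37{,}000
\]
```

Seen this way, the merit pay premium amounts to well under one hundred dollars per month before taxes.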

How, then, do the currently popular evaluation methods fare against the attributes described above? Relying on simplistic student evaluations or grades is irrational, as these surveys and simple measurable factors are difficult to regard as genuinely impartial in a school environment. Student evaluations of instructors at a pre-college level are largely based on the overall workload, grading criteria and instructor likeability. The use of such evaluations to judge the overall performance of the instructor should be questioned because students will rarely judge that performance solely on the instruction.

For example, what is the probability that a student earning (remember students earn their grades, grades are not given) an F in a class is going to give the instructor a positive evaluation citing that the lessons were top-flight, that homework and tests were designed to optimize both learning and evaluation of knowledge, and that the teacher was fair when conducting the class? A favorable evaluation under such circumstances is not very likely. Depending on prior knowledge, any potential competition issues or personal feelings, peer or administrator-based evaluations could also be tainted with bias and offer too little objectivity to justify rewarding or penalizing a teacher in a merit pay system, although these structured evaluations should prove to be more valid than student-based evaluations.

Evaluation of teaching performance based on student grades is almost as irrational as student evaluations because of how easily grades could be manipulated to achieve a favorable performance review, either indirectly by appraising course work less harshly than one should or directly by simply changing curves and grades. The variance in student skill and intelligence would also need to be considered in such an analysis. Doing so properly would mean suspending the merit pay system for a number of years to generate a ‘grading background curve’ that neutralizes such variances, if the system really is meant to evaluate teaching performance. In addition, this ‘background curve’ would have to be created before any mention or application of a merit pay system to ensure its statistical validity.

One of the many flaws in using grades as an evaluation of teaching performance can be illustrated in the real-life example where a high school instructor handed out a syllabus at the beginning of the term outlining the specific requirements to attain a given grade in the class. At the end of the term none of the students had met the given criteria to pass the class so every student in the class received an F. Upon hearing that the entire class failed, the administrators called in the instructor and said that he could not fail the entire class and he would need to change some of the grades. In response to this ultimatum the instructor instead gave each student an A (note the word ‘gave’ as none of the students earned the A) and the administrators that previously chastised the instructor elected not to comment on these new grades. Clearly in this case and many others, both the school administrators and parents refuse to be honest about student performance and as long as this continues merit pay based on grade distribution in a class is irrational, foolish and disingenuous. The door needs to swing both ways for honest appraisal of teaching performance. In short parents need to realize that their child/children may not have the intellectual capacity to earn As in every subject.

Finally, utilizing scores on standardized tests does not appear to be an effective means for evaluating teaching performance. The first and most obvious reason is that it is irrational to judge the quality of 180 days of instructional performance on a single annual test taken by a rotating set of participants whose psychological state is neither constant nor uniform. A significant problem with standardized tests is that they focus more on fact memorization than critical problem solving. Memorization by rote is becoming less and less meaningful with the continuous progression of easily stored and sorted information (Wikipedia, etc.) and new technological tools. There is little reason to memorize that the Battle of Hastings took place in 1066 when one can simply find the information in a reference source. A more important question would be ‘how did William the Conqueror defeat Harold Godwinson at the Battle of Hastings’ because it requires critical reasoning skills, the ability to formulate and test hypotheses, and the ability to apply those skills from theoretical situations to real-world situations. Tying merit pay based evaluations to the results of these exams sends a value signal to teachers that the results of these exams are an important element in the curriculum, which will influence the teacher to devote more time to teaching the material on the test than to teaching the ability to deduce answers from available information. In a lot of respects such a shift has already occurred on some level.9,10,11

Another problem with using standardized tests as the measure of teaching proficiency is that individual schools do not have direct, or really even indirect, control over the content of the test. Therefore, these tests have a tendency to distort the actual education that the students are receiving and the genuine performance of the teachers. The somewhat cruel irony is that the United States is the only industrialized nation that places significant emphasis on standardized tests of such nature and yet, despite this specific focus, almost all other competing industrial nations outperform the United States on these very tests. Clearly there must be a better evaluation criterion than standardized tests. Any school that utilizes standardized tests as the sole criterion for evaluation of anything is a failure.

One unfortunate issue is that most proponents of merit pay either do not appear to be aware of these flaws in evaluation methodology or do not seem to care about them when pressing for the application of merit pay. Instead of addressing flaws in the more popular evaluation methodologies, proponents focus on criticizing teacher unions as an impediment to the administration of merit pay in a wider number of schools. Ironically such a criticism could be valid if a more reasonable and less flawed evaluation system were proposed for a merit pay system. Such a scenario would then allow merit pay proponents to differentiate between unions rationally protesting merit pay because of low-quality, inappropriate and non-transparent evaluation methods and unions that are simply resisting to protect the jobs of low-quality teachers. Not surprisingly most merit pay proponents seem to assume only this latter reason for union opposition. Therefore, suffice it to say that the key element in the debate regarding the application of a merit pay system is the development of a valid and appropriate evaluation system.

In light of the above criticism it would be prudent to identify the most notable public school applied merit pay systems in operation in the United States: the Professional Compensation System for Teachers (ProComp) in Denver, Governor’s Educator Excellence Award Programs (GEEAP) in Texas, Special Teachers Are Rewarded (STAR) in Florida and Quality Compensation (Q-Comp) in Minnesota.4 Overall it is difficult to effectively analyze either GEEAP or STAR because their recent approval, 2006 and 2007 respectively, only generates a small sample size from which to judge the positive and negative aspects of the program. Therefore, there will not be any discussion of GEEAP and STAR. An initial analysis of Q-Comp determined that it appears to have had a positive effect on schools and supporting teachers, but there was no statistical link for these trends.12

The longest running ‘merit pay’ based program in public schools that is still active is ProComp, which began as a pilot program in 1999 and was approved by Denver voters for full application in the Denver school system in 2005.4 ProComp focuses on improving teacher performance and pay opportunities through four separate components: enhancement of knowledge and skills, quality professional evaluations, market incentives and improved student growth.4 Unfortunately for proponents of merit pay, ProComp is not a very strong piece of evidentiary support for the application of pure merit pay systems nationally because, when the program is broken down, the highest portion of pay incentive is derived from the knowledge and skill component (43.2%) (basically what certification and degrees the teacher has), not the improved student growth component (23.1%). This structure is interesting because most people still view merit pay as a system in which a majority of the pay incentive is tied to improvement in student performance, whereas such a notion is a part of, but not the direct focus of, the most successful ‘merit-pay’ system in a public school district.

The fact that ProComp has been labeled a success, and there is no real reason to suggest otherwise, sets a powerful precedent for what perhaps should be the model for new merit pay programs. Such a strategy makes the acquisition of certified skill sets that are thought to improve teaching performance the driving force for incentive pay, rather than direct measurement of change in student performance. This system would limit the influence of evaluating student performance while still increasing the probability of improving student performance, for the newly acquired skills should give teachers better strategies to improve the learning environment. However, if such a system is established it would handicap the ability to apply sufficient penalties for poor teaching performance to force out low-quality teachers, something that merit pay proponents believe is necessary, because the pay incentives for certification would be higher than any reasonable pay penalties for poor performance. The argument that poor performance could induce termination does not appear to change the status quo, where under a fixed salary system a teacher can be fired for poor performance; thus the most successful empirically tested system does not appear to have a means for rooting out low-quality teachers that is superior to the current system.

Unfortunately there are other potential complications with how ProComp distributes pay incentives. The most notable concern comes from the belief that the acquisition of certification, degrees and higher level skill sets does little to actually influence the teaching dynamic put forth by a given teacher or to increase student performance.13,14 If these studies are to be believed, then it appears that over 40% of the pay incentives put forth by ProComp do little to nothing to increase student achievement, a statistic that could very easily change the view of ProComp from a success to a failure. In addition, if valid, such a reality would reduce the versatility of a merit pay system, placing more stress back on the student performance evaluation methodology, its execution and honesty.

Regardless of whether or not higher credentials affect teacher performance, any new system for evaluating teacher performance must be transparent about why a particular teacher attained a certain level of standing and another did not. The more subjective the system, the greater the potential for internal conflict and grievances, which do nothing but hurt the educational environment. Of course such transparency is only necessary in a competitive merit pay based environment where there is only so much money allocated for pay incentives. If money is not the limiting factor then any internal conflict would be rare because most teachers, like most employees in general, would only be concerned about their own evaluations.

In addition any evaluation system must adequately test critical reasoning and problem solving skills, innovation, information communication and ability to work within a team, skills that actually prepare students to be productive and intelligent citizens. Finally the execution of any evaluation system for use in a merit pay system must be able to blend naturally into the construct of the learning environment; wasting class time by conducting evaluation after evaluation or unnecessary test after test will more than likely end up hurting the students more than helping them.

One of the trickiest issues when considering a merit pay program is defining the evaluation criteria across grades and subject matter and what type of scale differences, if any, would exist in such a program. For example, should a merit pay program have the exact same set of generic criteria for each grade and each subject, with teachers obtaining the same bonuses or penalties based on their attainment of these criteria? Is it fair to say teaching English is as hard as teaching Physics? Or do the two simply require different sets of skills, such that any overall difference in difficulty is offset by those skill sets?

Say it is reasonable to suggest that certain subjects are more difficult to teach than others; even if they are, do they deserve more money? Suppose they do: will the additional salary be in the base salary or will there be a higher ceiling in merit rewards for these individuals? What do you say to those with a lower merit reward ceiling with regard to the overall importance of the subject they teach? These are important questions that need to be addressed both in general and in any type of merit program. ProComp deals with this question under the ‘market incentives’ component, as an additional $989 is available to teachers who teach hard to staff or hard to serve subject matter (the $989 appears to be independently awarded, thus two awards can go to the same teacher for a hard to staff and a hard to serve class).

Some argue that merit pay is meaningless as a driver for teacher improvement until a system is established that forces the school itself to improve. For example, merit pay may create a situation that motivates a teacher to teach better, but if the educational capacity in the school environment itself has a low ceiling, no matter how good and/or motivated a teacher is, that low ceiling will tend to produce lower expectations and results. For example, if the school does not do anything to recognize or value academic achievement there would be little motivation, regardless of the teacher, for students to be interested in learning. Therefore, a truly effective merit pay system cannot exist without some level of motivation from the schools to improve the overall academic environment. Note that it is reasonable to expect the benefit of raising this ceiling to follow a two-phase pattern: if the increase starts from an initially low capacity there will be a roughly linear change in the ability of a teacher to teach a student, but as the capacity increases the change will shift from linear to diminishing returns that level off; think of a Michaelis-Menten curve. In essence, the higher the capacity, the less positive change occurs in teaching potential when increasing capacity.
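The Michaelis-Menten analogy can be written out as a rough sketch. The symbols here are illustrative assumptions and not measured quantities: C stands for the school’s educational capacity, T(C) for the teaching effectiveness a teacher can reach at that capacity, T_max for the ceiling that effectiveness approaches, and K for the capacity at which half of that ceiling is reached.

```latex
\[
  T(C) = \frac{T_{\max}\, C}{K + C}
\]
% Low capacity (C much smaller than K): growth is approximately linear
\[
  T(C) \approx \frac{T_{\max}}{K}\, C
\]
% High capacity (C much larger than K): effectiveness levels off
\[
  T(C) \to T_{\max}, \qquad
  \frac{dT}{dC} = \frac{T_{\max}\, K}{(K + C)^{2}}
\]
```

Because the derivative shrinks as C grows, each additional unit of capacity buys less improvement in teaching potential than the one before it, which is the point made in the paragraph above.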

Even if the evaluation portion of a merit pay system were designed properly, the problem of continuous and sustained funding still remains. Taxpayers are notorious for failing to pass school funding levies and bonds. Add to that the fact that anyone who is not a complete cynic regarding the nation’s public school systems would expect to see a majority of teachers, after a couple of years of adjustment, meeting the evaluation benchmarks and thus obtaining at least most of the prescribed pay incentives. Outside of applying a quota-curve system where only a certain number of teachers could be in a given merit classification region, which would breed competitiveness and possibly undermine the honesty and fairness of the evaluation system, such a program would face significant budgetary demands each year. Therefore, funding for the program would need to be available each and every year. Otherwise teachers who met the benchmarks would go unrewarded, which would be similar to telling a student to be proud of the earned 96% in a given class, but unfortunately because twenty other students earned higher percentages the student in question will receive a B. Such a situation would basically be a lack of reward despite fulfilling the required objectives for the given reward.

So with millions of dollars needed to fund merit pay programs and public schools already strapped for cash, or at least claiming that they are, where will the funding come from? Currently there appear to be only two options. Either taxes would have to be raised for the citizens in the school’s district, and it is difficult to believe that most communities will accept this increase, or significant corruption/incompetence reform would have to be undertaken on the assumption that such reform will produce the necessary funds, which is unlikely. Add to that other funding problems for public schools that are just over the horizon (busing children to school is one of the big ones) and the question of funding becomes more pertinent.

Overall there are four explanations for poor student performance: first, the teachers are not properly trained or lack the skill to teach effectively; second, students come to school unprepared to learn or do not have significant levels of natural intelligence; third, the school does not have and/or provide adequate resources to facilitate high-quality learning and instruction; fourth, students and/or teachers are not sufficiently motivated. Application of a merit pay system directly affects none of these elements and, depending on the system, indirectly affects only the first and fourth. Therefore, if a given U.S. school is going to compete in both domestic and international environments on a relatively uniform level, all of these elements will need to be addressed, not just one or maybe two.

With all that has been said, if individuals are satisfied with the structure and results provided through ProComp, despite the fact that it is not even close to what most people seem to envision as a merit pay system, then a significant amount of the work involving merit pay has already been accomplished and progressive tweaking is all that will be required. However, if ProComp is not viewed as a viable long-term system or does not accomplish the goals of a merit pay system, then there are four critical questions that must be asked regarding merit pay. First, what elements will make up the components of the pay incentives within the merit pay system? (Will student performance be the only factor; will certain subjects be handicapped with greater/lesser bonus potential; will credentials matter; etc.) Second, if student performance is utilized as an evaluation criterion for pay incentives, what elements of performance will comprise the total of evaluated performance, and how will school academic incentives aid/detract from teacher evaluation? Third, how will teachers who are struggling under the merit pay system be evaluated, and how much time and progression will be allotted before termination? Fourth, where will the money to fund the merit pay program come from? Until these questions are addressed in an objective, open and honest fashion, further discussion regarding merit pay does not seem to be useful; instead it would be a sub-optimized waste of time.

--
1. Murnane, R.J. & Cohen, D. “Merit pay and the evaluation problem: Why most merit pay plans fail and few survive.” Harvard Educational Review. 1986. 56(1): 1-17.

2. Figlio, David, and Kenny, Lawrence. “Individual teacher incentives and student performance.” Journal of Public Economics. 2007. 91: 901–914.

3. Goldhaber, Dan, et al. “Why Do So Few Public School Districts Use Merit Pay?” Journal of Education Finance. 2008. 33(3): 262-289.

4. Podgursky, Michael, and Springer, Matthew. “Teacher Performance Pay: A Review.” National Center on Performance Incentives. United States Department of Education’s Institute of Education Sciences. (R305A06034).

5. Figlio, D., and Getzler, L. “Accountability, ability and disability: Gaming the system?” NBER Working Paper #9307. 2002. Cambridge, MA: National Bureau of Economic Research.

6. Cullen, J.B., and Reback, R. “Tinkering toward accolades: School gaming under a performance accountability system.” NBER Working Paper #12286. 2006. Cambridge, MA: National Bureau of Economic Research.

7. Jacob, B. “Testing, accountability, and incentives: The impact of high-stakes testing in Chicago Public Schools.” Journal of Public Economics. 2005. 89(5-6).

8. Peabody, Z, and Markley, M. “State May Lower HISD Rating; Almost 3,000 Dropouts Miscounted, Report Says.” Houston Chronicle. 2003. June 14, A1.

9. Goodnough, A. “Answers allegedly supplied in effort to raise test scores.” New York Times. 1999. December 8.

10. Koretz, D., et al. “Perceived Effects of the Kentucky Instructional Results Information System (KIRIS).” 1999. Santa Monica, CA: RAND Corporation.

11. Jacob, B., and Levitt, S. “Rotten apples: An investigation of the prevalence and predictors of teacher cheating.” Quarterly Journal of Economics. 2003. 118(3).

12. “Quality Compensation for Teachers Summative Evaluation.” Hezel Associates, LLC. January 2009.

13. Kane, T.J., Rockoff, J.E., and Staiger, D.O. “Identifying effective teachers in New York City.” Paper presented at NBER Summer Institute. 2005.

14. Rivkin, S., Hanushek, E.A., and Kain, J.F. “Teachers, schools, and academic achievement.” Econometrica. 2005. 73(2): 417-458.
