Data Mining the Ed Biz
Universities deciding whether to hire, keep, or promote faculty use a mix of criteria, one of which is teaching. Teaching quality, in my experience, is judged mainly by student evaluations, supplemented to some extent by the views of faculty members who have sat in on a class to observe it.
Could we do better? In particular, could law schools--I currently teach at one--do better? Law schools have, from this standpoint, one significant advantage: The state bar exam, which most of their graduates will take, provides an external measurement of how successful their teaching has been. A second advantage is that, in the first year, all law students take pretty much the same courses and, where different sections of a large course, such as Contracts or Property, are taught by different professors, allocation of students is pretty nearly random.
This suggests a possible solution to the problem. Analyze bar passage rates to see if students who took Property from Professor X did, on average, better or worse than those who took it from Professor Y. If there is a significant difference, take that as evidence that one of the professors was a better teacher than the other.
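As a sketch of what such a comparison might look like, the following code runs a standard two-proportion z-test on invented passage counts for two professors' sections. All numbers are hypothetical; the point is only to show how one would ask whether an observed difference in pass rates is larger than chance alone would explain.

```python
# Sketch of the proposed comparison. All passage counts are invented.
import math

def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """Return (z statistic, two-sided p-value) for H0: equal pass rates."""
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)  # pooled pass rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical sections: 82 of 100 of X's students passed vs 70 of 100 of Y's.
z, p = two_proportion_z(82, 100, 70, 100)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With realistic class sizes the test will detect only fairly large differences, which is consistent with the point made below that the signal is likely to be weak.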
There are some important limitations to this approach. Who taught a particular course in the first year is probably only a small factor in whether, three years later, the student did or didn’t pass the bar. Hence the evidence produced, even if real, is going to be very weak. It could be improved if it were possible to get bar results in a more detailed form--not just overall scores but scores on each question. One could then look for the effect of the property professor on questions that depended mostly on understanding property law, of the contracts professor on questions that depended mostly on understanding contract law.
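The finer-grained version might look like the sketch below, assuming (hypothetically) that per-question bar scores were available and each question could be tagged by subject. It averages each student's scores on the property questions and compares those averages across property professors; every name and number here is invented.

```python
# Hypothetical per-question bar data, tagged by subject.
from statistics import mean

# question id -> subject tag (hypothetical tagging)
subject_of = {"q1": "property", "q2": "contracts", "q3": "property"}

# student -> (property professor, {question id: score out of 10})
records = {
    "s1": ("X", {"q1": 8, "q2": 6, "q3": 7}),
    "s2": ("X", {"q1": 7, "q2": 9, "q3": 8}),
    "s3": ("Y", {"q1": 5, "q2": 8, "q3": 6}),
    "s4": ("Y", {"q1": 6, "q2": 7, "q3": 5}),
}

def prop_score(scores):
    """Average a student's scores on the property questions only."""
    return mean(s for q, s in scores.items() if subject_of[q] == "property")

by_prof = {}
for prof, scores in records.values():
    by_prof.setdefault(prof, []).append(prop_score(scores))

for prof, vals in sorted(by_prof.items()):
    print(f"Professor {prof}: mean property score {mean(vals):.2f}")
```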
A further limitation is that learning to pass the bar is not the only objective of law school. Professor Y, whose students do a little worse on the bar, might argue that he is spending less time than Professor X on material relevant to that exam, more time on material that will be important in the student’s future law practice. “Teaching to the test” is not, after all, an unambiguously good thing—although it becomes more defensible when the particular test is one the student has to pass if he is ever going to use what he has learned to practice law in the state he lives in.
How can this approach be generalized beyond the special case of the law school and the bar exam? Consider students who have taken the first course in a subject from a variety of different teachers but have taken a more advanced course together. Their final grades in the latter course will provide some evidence of how good their preparation was, which in turn provides some evidence of how good the first course was.
One problem with this approach is that students may not have been assigned to the first course at random. Perhaps there was some reason why, on average, Professor X started with better students than Professor Y. A second and more subtle problem is that how Professor X's students do in the second course depends in part on which of them take it. Perhaps Professor X presents the material as very difficult, scaring out of the field all but the best students--with the result that, by the time we get to the second class, we are comparing X's three best students with Y's thirty best. To try to control for such problems, it would be worth both including in our analysis other information on the students, such as their SAT scores (LSAT in the law school context), and looking at how many students from each of the initial courses went on to take more advanced courses in the subject.
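The two controls just suggested can be sketched as follows, on invented data: compare advanced-course grades within LSAT bands, so that a professor who happened to start with stronger students gets no credit for it, and report what fraction of each professor's students went on to the advanced course, to expose selection effects. The field names, band cutoffs, and all numbers are made up for illustration.

```python
# Hypothetical first-year records: who taught the student, the student's
# LSAT score, whether the student took the advanced course, and the grade.
from collections import defaultdict

students = [
    # (professor, LSAT score, took advanced course?, advanced grade or None)
    ("X", 168, True, 3.7), ("X", 162, True, 3.3), ("X", 155, False, None),
    ("X", 171, True, 3.9), ("X", 150, False, None), ("X", 158, False, None),
    ("Y", 167, True, 3.4), ("Y", 161, True, 3.2), ("Y", 156, True, 2.9),
    ("Y", 170, True, 3.6), ("Y", 151, True, 2.5), ("Y", 159, True, 3.0),
]

def lsat_band(score):
    """Crude ability strata; the cutoffs are arbitrary."""
    return "high" if score >= 165 else "mid" if score >= 155 else "low"

# Continuation rate: how many of each professor's students went on.
continued = defaultdict(lambda: [0, 0])
for prof, _, took, _ in students:
    continued[prof][0] += took
    continued[prof][1] += 1

# Mean advanced grade within each (professor, LSAT band) cell.
cells = defaultdict(list)
for prof, lsat, took, grade in students:
    if took:
        cells[(prof, lsat_band(lsat))].append(grade)

for prof in ("X", "Y"):
    went, total = continued[prof]
    print(f"Professor {prof}: {went}/{total} continued")
for key in sorted(cells):
    grades = cells[key]
    print(f"{key}: mean grade {sum(grades) / len(grades):.2f}")
```

In this invented example only half of X's students continue while all of Y's do, so a raw comparison of advanced-course grades would flatter X; the within-band means and the continuation rates together make the selection effect visible.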
One problem with all of these approaches is that, if they are known to be in place and to have a substantial effect on hiring and promotion decisions, faculty members can be expected to try to game the system. If bar passage rate is used to measure success--not because it is all that matters but because it is the only relevant external data we have--professors have an incentive to teach to the bar exam, which may or may not be a good thing. If grades in more advanced courses are used, professors have an incentive to focus their teaching on only the better students and to try to encourage their best students into the field and their worst students out of it. Readers interested in an entertaining and intelligent discussion of the problem will find it in the first chapter of George Stigler's The Intellectual and the Marketplace, which describes the efforts of a (fictional) South American university reformer.