Measuring education
by Matthias Felleisen
27 Mar 2010
People often ask me whether I have data that proves the superiority of my introductory curriculum (variously known as "How to Design Programs" ("2e") or "TeachScheme!").
My experience suggests two distinct problems with this question.
The first problem is that obtaining meaningful data is extremely difficult and expensive. A true evaluation should involve a few thousand students; it should track them over a couple of years to see whether their programming course helped them improve their problem-solving skills in related fields (say algebra); and it should determine whether the course attracts kids into a CS-related major. Nobody has the money to conduct this kind of study.
And then the real questions come up: How do you compare teaching Java vs. Python? How do you evaluate a course that focuses on general design ideas vs. a course that teaches programming by example? How do you differentiate the effects of a curriculum that demands a radical change vs. something that can be introduced as small incremental changes to the dominant fashion? After 10 years of looking I still don’t know the answer. Some of our teachers have run parallel courses, with one and the same teacher running two different curricula. I/we have run "competitions", letting students choose which curriculum they want. TeachScheme! "won" in all those "evaluations", but they involved small numbers and are nearly impossible to evaluate statistically or to repeat at other places. The curriculum does positively affect attitudes toward mathematics, and I am sure that no other programming curriculum has achieved this. But how do you measure attitude change in one semester?
The second problem is that people who teach programming don’t make use of
educational data. More precisely, it is extremely difficult to find
teachers or faculty who teach some programming curriculum P because
of measurements that demonstrate the superiority of P over some
other curriculum. The choice of curriculum is typically considered nothing
but a choice of language. And the choice of language in high school is
dictated by the choice of language for the AP test (currently Java) and
some industrial standards (some form of BASIC). At college, the choice is
a compromise between first-year and downstream faculty, a bow to perceived
industrial needs, a nod to the people who are willing to teach the
course—
And because of all these other factors, people reject data that contradicts
their view. Everyone who teaches something "inferior" will come up with some
reason as to why some study does not apply to his/her school population or
class constraints or city or teaching style and so on. After meeting lots
of these people—
Let’s just go do research.
p.s. If you are skeptical of these thoughts—