LISTSERV - TB-L Archives - LISTSERV.ONEONTA.EDU

Although a bit lengthy, I found this posting about measuring students'
higher-level thinking and moving from a teaching-centered approach to a
learner-centered approach well done. I post this to the TB list for those
interested in this topic....

The CLA has an excellent website discussing the issues surrounding learning
assessment for colleges.  Much of what is posted there, and tools available
at this site, may be of use to departments as they work through current
assessment issues.

Jim G.

            A POSSIBLE MODEL FOR HIGHER EDUCATION: THE PHYSICS REFORM EFFORT

What to Measure And How to Measure

Investigation of the extent to which a paradigm
shift from teaching to learning is taking place
requires measurement of students' learning in
college classrooms. But Wilbert McKeachie 1987
has pointed out that the time-honored gauge of
student learning (course exams and final grades)
typically measures lower-level educational
objectives such as memory of facts and
definitions rather than higher-level outcomes
such as critical thinking and problem solving.
The same criticism (Hake 2002a) as to assessing only
lower-level learning applies to Student
Evaluations of Teaching (SET's), since their
primary justification as measures of student
learning appears to lie in the modest correlation
of overall ratings of course (+0.47) and
instructor (+0.43) with "achievement" as
measured by course exams or final grades (Cohen
1981). For general characterizations of
higher-order learning see Anderson & Krathwohl
2001 and Shavelson & Huang 2003. In their "Chart
1," the latter display higher-level learning such
as "procedural" (see, e.g., Anderson 2004),
"schematic," and "strategic" knowledge within
knowledge domains, as might be measured and
enhanced by disciplinary experts.

How then can we measure students' higher-level
learning in college courses? Several indirect
(and therefore in my view problematic) gauges
have been developed; e.g., Reformed Teaching
Observation Protocol (RTOP), National Survey Of
Student Engagement (NSSE), Student Assessment of
Learning Gains (SALG), and Knowledge Surveys
(KS's) (Nuhfer & Knipp 2003). (For a discussion
and references for all but the last see Hake
2005.)

On the other hand, Richard Hersh 2005 has
discussed two types of direct measures developed
by the Learning Assessment Project
<http://www.cae.org/content/pro_collegiate.htm>
(of which he is co-director) that "evaluate
students' ability to articulate complex ideas,
examine claims and evidence, support ideas with
relevant reasons and examples, sustain a coherent
discussion, and use standard written English."
But Shavelson & Huang 2003 warn that "learning
and knowledge are highly domain-specific - as,
indeed, is most reasoning. Consequently, the
direct impact of college is most likely to be
seen at the lower levels of Chart 1 -
domain-specific knowledge and reasoning. Yet, in
the formulation of most college goal statements
for learning - and consequently in choices about
the kinds of tests to be used on a large scale to
hold higher education accountable - the focus is
usually in large part on the upper regions of
Chart 1" (those emphasized by the Learning
Assessment Project; my italics).

Pre/Post Testing

In sharp contrast to the invalid or indirect
measures discussed in the above two paragraphs is
the direct measure of students' higher-level
domain-specific learning through pre/post testing
using (a) valid and consistently reliable tests
devised by disciplinary experts, and (b)
traditional courses as controls. Such pre/post
testing, pioneered by economists (Paden & Moyer
1969) and physicists (Halloun & Hestenes
1985a,b), is rarely employed in higher education,
in part because of the tired old canonical
objections recently lodged by Suskie 2004 and
countered by Hake 2004a and Scriven 2004. Despite
the nay-sayers, pre/post testing is gradually
gaining a foothold in introductory astronomy,
biology, chemistry, computer science,
economics, engineering, and physics courses (see
Hake 2004b for references). It should be
emphasized that such low-stakes formative
pre/post testing is the polar opposite of the
high-stakes summative testing mandated by the
U.S. Department of Education's No Child Left
Behind Act for K-12 (USDE 2005a) that is now
contemplated for higher education (USDE 2005b).
As the NCLB experience shows, such testing often
falls victim to "Campbell's Law" (Campbell 1975,
Nichols & Berliner 2005): "The more any
quantitative social indicator is used for social
decision making, the more subject it will be to
corruption pressures and the more apt it will be
to distort and corrupt the social processes it is
intended to monitor."

What Physics Has Learned

Physics education researchers (PER's) have
employed formative pre/post testing to show that
traditional (T) introductory physics courses
promote very little change in students'
understanding of basic physics concepts,
regardless of the experience, enthusiasm,
talents, and motivation of their professors. This
has driven some physicists to develop novel
"interactive engagement" (IE) methods, among
them: Microcomputer-based Labs, Concept Tests,
Modeling, Active Learning Problem Sets, Overview
Case Studies, and Socratic Dialogue Inducing Labs
(for references see Hake 2002b). That such
Interactive Engagement methods are relatively
effective in promoting student higher-level
learning has been demonstrated by the nearly
two-standard-deviation (cf. Bloom's 1984 "two
sigma problem") superiority in normalized average
learning gains <g> of IE courses over T
(traditional) courses (Hake 1998a,b, 2002b,c and
corroborative references therein). Notable
examples are large enrollment courses at Harvard
(Crouch & Mazur 2001), North Carolina State
University (Beichner & Saul 2004), MIT (Dori &
Belcher 2004), the University of Colorado at
Boulder (Pollock 2005), and California
Polytechnic State University at San Luis Obispo
(Hoellwarth, et al. 2005).

Some definitions are in order. In the above
paragraph (a) the average normalized gain <g> is
the actual gain [<%post> - <%pre>] divided by the
maximum possible gain [100% - <%pre>], where the
angle brackets indicate the class averages; (b) T
courses are operationally defined as
those reported by instructors to make little or
no use of IE methods, relying primarily on
passive-student lectures, recipe labs, and
algorithmic problem exams; (c) IE courses are
operationally defined as those designed at least
in part to promote conceptual understanding
through interactive engagement of students in
heads-on (always) and hands-on (usually)
activities which yield immediate feedback through
discussion with peers and/or instructors.
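
The definition of <g> in (a) above can be
sketched in a few lines of Python. This is only
an illustrative sketch: the function name
normalized_gain and the sample scores are
hypothetical (not data from any study cited
here), and it assumes that pretest and posttest
scores are supplied as percentages for one class.

```python
from statistics import mean


def normalized_gain(pre_percent, post_percent):
    """Average normalized gain <g> for one class.

    <g> = (<%post> - <%pre>) / (100 - <%pre>), where the angle
    brackets denote class averages of percentage scores.
    """
    pre_avg = mean(pre_percent)
    post_avg = mean(post_percent)
    if pre_avg >= 100:
        # No room left to gain, so <g> is undefined.
        raise ValueError("class pretest average is 100%; <g> is undefined")
    return (post_avg - pre_avg) / (100.0 - pre_avg)


if __name__ == "__main__":
    # Made-up scores for illustration only.
    pre = [30.0, 40.0, 50.0]    # class average 40%
    post = [60.0, 70.0, 80.0]   # class average 70%
    print(normalized_gain(pre, post))  # prints 0.5
```

Note that the class averages are taken first and
the ratio second, as in the definition above;
averaging each student's individual gain would be
a different (though related) quantity.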

For links to over 50 U.S. PER groups, over 200
PER papers published in the American Journal of
Physics since 1972, and tests of cognitive and
affective conditions see, respectively, Meltzer
2005a, Meltzer 2005b, and NCSU 2005. The very
active PER discussion list PhysLrnR
<http://listserv.boisestate.edu/archives/physlrnr.html>
logged over 750 posts in 2005. As far as I know,
no other discipline is so actively researching
undergraduate student learning. For reviews see
McDermott & Redish 1999, Redish 1999, Thacker
2003, Heron & Meltzer 2005, and Wieman & Perkins
2005.

The March of Synapses

The fact that IE methods are far more effective
in promoting conceptual understanding than
traditional passive-student methods is probably
related to the "enhanced synapse addition and
modification" induced by those methods.
Bransford, et al. 2000 wrote: ". . . synapse
addition and modification are lifelong processes,
driven by experience. In essence, the quality of
information to which one is exposed and the
amount of information one acquires is reflected
throughout life in the structure of the brain.
This process is probably not the only way that
information is stored in the brain, but it is a
very important way that provides insight into how
people learn." Leamnson 1999, 2000 has also
stressed the relationship of biological brain
change to student learning. In his Chapter 5
"Teaching and Pedagogy," Leamnson 1999 wrote,
"Teaching must involve telling, but learning will
only start when something persuades students to
engage their minds and do what it takes to
learn." Another reminder that the affective and
the cognitive are inextricably linked, as
recently emphasized by Ed Nuhfer 2005 in this
Forum.

The Challenge

I see no reason that student learning gains far
larger than those in traditional courses could
not eventually be achieved and documented in
other disciplines from arts through philosophy to
zoology if their practitioners would (a) reach a
consensus on the crucial concepts that all
beginning students should be brought to
understand, (b) undertake the lengthy qualitative
and quantitative research required to develop
multiple-choice tests (MCT's) of higher-level
learning of those concepts, so as to (c) gauge
the need for and effects of non-traditional
pedagogy, and (d) develop Interactive Engagement
methods suitable to their disciplines.

Why MCT's? So that the tests can be given to
thousands of students in hundreds of courses
under varying conditions in such a manner that
meta-analyses can be performed, thus establishing
general causal relationships in a convincing
manner.
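
As a rough illustration of the kind of
cross-course comparison that widely given tests
make possible, the sketch below computes one
common effect-size measure (Cohen's d with a
pooled standard deviation) between two sets of
per-course normalized gains. It is an assumption
for illustration, not necessarily the calculation
used in the studies cited above; the function
name cohens_d and the gain values are
hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d between two groups, using the pooled sample
    standard deviation. Each group needs at least two values."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = variance(group_a), variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Made-up per-course <g> values, for illustration only.
    ie_course_gains = [0.4, 0.5, 0.6]   # hypothetical IE courses
    t_course_gains = [0.1, 0.2, 0.3]    # hypothetical T courses
    print(cohens_d(ie_course_gains, t_course_gains))
```

A real meta-analysis would also need to weight
courses by enrollment and account for differences
in tests and testing conditions, which this
sketch ignores.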

But can multiple-choice tests measure
higher-order learning? Wilson & Bertenthal 2005
think so, writing: "Performance assessment is an
approach that offers great potential for
assessing complex thinking and learning
abilities, but multiple choice items also have
their strengths. For example, although many
people recognize that multiple-choice items are
an efficient and effective way of determining how
well students have acquired basic content
knowledge, many do not recognize that they can
also be used to measure complex cognitive
processes. For example, the Force Concept
Inventory [Hestenes et al. 1992] . . . is an
assessment that uses multiple-choice items to tap
into higher-level cognitive processes."

Lessons Learned

Can nearly all university disciplines develop
synapse-stimulating interactive engagement
methods, and also valid and reliable
multiple-choice tests of affective and cognitive
conditions to measure their effectiveness? I
would bet "Yes," provided they care enough about
student learning to mount the necessary research
and development effort.

Aside from the advantages of pre/post testing,
perhaps physics education researchers' most
important lessons (Hake 2002b) for higher
education are Lessons #1, 3, and 4:

L1: The use of Interactive Engagement strategies
can increase the effectiveness of conceptually
difficult courses well beyond that obtained with
traditional methods.

L3: High-quality standardized tests of the
cognitive and affective impact of courses are
essential for gauging the relative effectiveness
of non-traditional and traditional educational
methods. For examples of such physics tests see
the listing at
<http://www.ncsu.edu/per/TestInfo.html> (NCSU 2005).

L4: Education Research and Development by
disciplinary experts (DEs), and of the same
quality and nature as traditional
science/engineering R&D, is needed to develop
potentially effective educational methods within
each discipline. But the DEs should take
advantage of the insights of DEs engaged in
education R&D in other disciplines, cognitive
scientists, faculty and graduates of education
schools, and classroom teachers.

Calls for the accountability of higher education
in promoting student learning are becoming more
forceful, both from inside the university, e.g.,
Duderstadt 2000, Hersh 2005, Hersh & Merrow 2005,
Bok 2005a,b; and outside the university, e.g., by
the U.S. Dept. of Education's new "Commission on
the Future of Higher Education" (USDE 2005b). For
reports on the Commission's first two meetings
and commissioners' comments on the possibility of
NCLB-like testing in higher education, and on the
declining literacy of college graduates (NAAL
2005), see Lederman 2005a,b.

As Hersh 2005 observes: ". . . in an era when the
importance of a college diploma is increasing
while public support for universities is
diminishing, [assessment of student learning] is
desperately needed. The real question is who will
control it. Legislators are prepared to force the
issue: Congress raised the question of quality
during its recent hearings on the reauthorization
of the Higher Education Act; all regional
accrediting agencies and more than forty states
now require evidence of student learning from
their colleges and universities; and pressure is
rising to extend a No Child Left Behind-style
testing regime to higher education" (see USDE
2005a,b).

Thus it would appear to be high time for faculty
members to turn more of their attention to
shifting the higher education paradigm from
teaching to learning, both because it's the right
thing to do, and because not doing so may invite
stifling oversight by state and national
bureaucrats.

(Professor Hake's extensive list of references
may be found posted in the ancillary materials
section for this issue of the FORUM on
www.ntlf.com.)

Contact:
Richard Hake
Emeritus Professor of Physics
Indiana University
24245 Hatteras Street
Woodland Hills, CA 91367
Web: http://www.physics.indiana.edu/~hake
http://www.physics.indiana.edu/~sdi