Liberal Education

Leveraging Innovation in Science Education: Using Writing and Assessment to Decode the Class Size Conundrum

Introductory biology courses are supposed to serve as gateways for many majors, but too often they serve instead as gatekeepers. Reliance on lectures, large classes, and multiple-choice tests results in high drop and failure rates. Critiques of undergraduate science education are clear about the problems with conventional introductory science courses, and yet the problems persist. As David Hanauer and Cynthia Bauerle explain, “Given the potential for science to address important problems, undergraduate programs ought to be functioning as busy portals for engaging students’ innate fascination and developing their understanding of the nature and practice of science. Instead, recent studies suggest, the opposite is true: over half of the students who enter college with an interest in science do not persist in their training beyond the first year or two of introductory coursework.”1

Researchers and expert practitioners have long proposed using student-centered, active learning strategies to improve engagement, learning, and achievement.2 Others have documented the ways class size is important for student-centered pedagogy.3 Following Hanauer and Bauerle, who recommend using assessment reform to facilitate such curricular innovations, we contend here that the better the assessment and the more focused the guidance provided to the instructor, the greater the leverage.

Our findings from the pilot study described below suggest that authentic assessment embedded in best teaching practices can show what kind of change is needed. Our study allowed us to observe the relative impacts of both class size and the use of writing as an assessment strategy, and thus to identify the sequence our reform efforts must take. The purposes of this article are, first, to report on our experience responding to Hanauer and Bauerle’s call, and second, to identify the key components that gave that “reform lever” additional power: the careful selection and preliminary testing of essay questions requiring critical thinking, the reduced size of one section of an entry-level biology course, the support of a networked improvement community, and guidance for the instructor during the testing of new methods.


At University of the Pacific, introductory biology courses have long relied on large lecture sections, an instructional model in which “learner success is based primarily on the instructor’s ability to organize and present information in ways that enable students to learn it.”4 Some at University of the Pacific, like Eileen Camfield, director of writing programs, and Eileen McFall, director of learning and academic assessment, champion adding writing and other active learning techniques to courses. But actually beginning a cultural shift from the prevailing teacher-centered, knowledge-transmission model to a learning-facilitation model requires a foot-in-the-door approach to change.

The opportunity arose when Associate Professor Kirk Land noticed deep engagement and improved learning in a small summer section of his introductory biology course. Enrollments in summer biology classes at University of the Pacific tend to be lower than during the regular academic year, when class sizes can reach eighty students. In the summer of 2014, Land’s introductory biology section had just twenty-two students. He observed differences between that section and other, larger sections that went beyond having fewer bodies in the room, though the smaller enrollment did allow him to include short in-class writing assignments—something he had long felt was missing from the biology curriculum and an issue he had begun to discuss with Camfield the previous spring. Believing that writing can provide a window into student thinking, Land had previously been thwarted by crushing class sizes and unmanageable grading loads. The small size of his summer class enabled him to act on his belief. Moreover, having the opportunity to interact with a small group made it possible for him to pay more attention to individual students and to gauge their abilities. In that smaller class, he tried more rigorous tests because he “thought that group could handle it,” and they still outperformed other sections.

In the fall, he approached McFall to ask whether she thought his small class’s improved results could be used to make the case for smaller sections of introductory biology every semester. They discussed what might happen if he used data from one summer session as evidence and concluded it was likely that cost-conscious administrators would attribute the improved performance of students in the summer session to qualitative demographic differences between students taking an accelerated summer course and those who take the course during the spring semester. The discussion could have ended there, but with the strong support of his department chair and guidance from McFall and faculty in the university’s school of education, Land sought to test his hypothesis by gathering more assessment data through experimental and control groups. At the same time, Camfield invited him to participate in a “networked improvement community,”5 a structured and guided faculty group seeking to add more writing to the curriculum. Land believes that his connection to these two faculty-centered administrators was essential to transforming his curiosity about student performance into an active analysis of student learning that would inform subsequent pedagogy.

Writing as an assessment method and learning activity

In keeping with Hanauer and Bauerle’s call to facilitate innovations in science education through assessment reform, Land used writing first as an assessment method on mid-term exams. However, his choice did more than simply reveal student learning; it improved it. Using writing as an assessment strategy provides the science instructor with a mechanism for evaluating students’ “intuitive grasp on course concepts,”6 which might be much more important than their recall of textbook definitions. The particular benefit of selecting this assessment method is that writing simultaneously serves multiple purposes; it can provide a window into student understanding as well as trigger deeper learning in its own right. In other words, beyond assessment, writing offers several layers of cognitive benefits to the science curriculum.

In their acclaimed book Make It Stick: The Science of Successful Learning, Peter Brown, Henry Roediger, and Mark McDaniel discuss the importance of priming students for learning, which might involve asking students to struggle with a problem before learning how to solve it, and of calibrating understanding, which can help students avoid being “carried off by the illusions of mastery that catch many learners at test taking time.”7 They discuss how chunking, or breaking material down into interconnected subcategories, can trigger a process of continued retrieval that allows material to be consolidated into a mental model, “making one’s ability to recall and apply as automatic as a habit.”8 The authors observe that when students reflect on new knowledge and engage in metacognition, they can reframe course concepts in their own words and connect them to prior knowledge, which also allows learners to more effectively retrieve that information at a later time. A form of reflection, elaboration also combats the cognitive fatigue that rote memorization can engender.

Writing exercises are among the easiest ways to activate priming, calibration, reflection, metacognition, and elaboration. Further, the act of writing forces students to expend energy and commit to their ideas in ways that reinforce and extend learning and are in themselves independent forms of learning. Evidence abounds. For example, Karla Gingerich and her colleagues published a study of eight hundred college psychology students that compared the exam performance of two groups of learners—those who copied down lecture material verbatim from slides, and those who generated their own written summaries of key ideas. Students in the latter group significantly out-performed those in the former (by about half a letter grade). Moreover, follow-up tests of retention two months later showed robust benefits of writing-to-learn.9

The benefits of writing in the science curriculum are not limited to better student mastery and retention of material. The writing-in-the-disciplines community has been vocal for several decades about the value of introducing students to authentic disciplinary writing that “brings students into a community of scholars by helping the students learn to speak that community’s language.”10 In learning how to write like a scientist, a student also learns to think like a scientist and to recognize the different kinds of thinking a scientist must engage in to describe, explain, predict, apply, and clarify phenomena to various audiences. A final, but not insignificant, reason to include writing in the college science class is that it might more truly reflect the puzzle-solving, game-playing aspects of science itself. Quite simply, writing can make science learning fun.

Further, writing can be used to dismantle the hierarchy endemic to so many gateway college science courses. As previously discussed, when Hanauer and Bauerle called for reframing science education in ways that acknowledge learning as a “creative and constructive process” that “evolves beyond what is explicitly taught,”11 they suggested using new assessment strategies to leverage reform. In our project, writing is potentially the core of such a reframing. First as an assessment strategy and then as a learning activity, writing may have “transforming power” because it gives students a space to digest course material, raise questions, and formulate opinions in ways that honor their voice and agency.12 Evidence of such benefits includes studies comparing “low-structure” college introductory biology classes (featuring traditional lecturing and high-stakes exams) with “high-structure” classes (featuring frequent low-stakes practice in the analytical skills necessary to do well on exams), in which high-structure courses “significantly reduced failure rates, narrowing the gap between poorly prepared students and their better prepared peers while at the same time showing exam results at higher levels on Bloom’s taxonomy.”13 Assigning short writing exercises is an effective way of providing such structure.

Class size as a factor in student engagement and learning

While the benefits of writing in the curriculum are well documented, the research on class size is inconclusive. Some studies and meta-analyses report that smaller class size has a positive impact on student achievement,14 while others show mixed results.15 The construct and its measurement are poorly defined in the literature, but at the elementary and secondary levels, studies typically compare small classes of eighteen to twenty-three students with large classes of up to thirty-five. Studies of differences in student achievement in college-level courses suggest that the impact of class size varies, in part, with the level of difficulty of course activities.16

One assumption that appears to undergird the writing-in-the-sciences literature is that writing is equally impactful on student learning and equally effective as an assessment tool, regardless of class size. Our pilot study at Pacific allowed us to examine several dimensions of that assumption. By examining differences in student responses to essay questions on exams in a large versus a small section of introductory biology, we could identify the most effective sequence for subsequent curricular reforms. We could answer the following question: What should come first, smaller class sizes or more writing in the curriculum? Put another way, our assessment experiment helped us identify the core issue as one of class size, which positioned us to seek further evidence tied to this central problem.


In the spring of 2015, with the guidance of faculty in the school of education and the help of a graduate assistant, Land tested differences in engagement and achievement between small and large classes, including responses to essay questions on exams. Students enrolled in one of Land’s two sections of introductory biology: a small class with a restricted enrollment of twenty-four students, and a large class with eighty students. Both sections, the treatment and control groups, experienced the same teaching techniques and the same learning and assessment activities. Land incorporated some problem-based learning and discussion into both sections, and he added writing as an assessment strategy for both. The graduate assistant recorded observations of students’ time on task as a way of assessing engagement. Land also compared students’ perceptions of and attitudes toward writing in science classes, as well as their writing proficiency on examinations (defined as the ability to communicate scientific thinking effectively). He assessed student responses to one major essay question on each of three midterm exams and compared scores across both classes.


The differences on essay score averages between small and large classes were stark. On the first midterm, the small class performed 24 percent better than the large class; on the second midterm, the difference rose to 38 percent; and on the third midterm, the small class out-performed the large class by 35 percent. Moreover, the large class showed no improvement over time, whereas the small class’s writing scores improved 11 percent by the end of the course.

For the small class, writing as an assessment strategy actually seemed to lead to improved learning—or, at least, to an improvement in the students’ ability to communicate what they had learned. An informal survey also revealed differences between the large and small classes in terms of student attitudes about writing. Students in the smaller class were more accepting of an assessment strategy rarely seen in their other science classes, whereas those enrolled in the larger class expressed resistance and resentment. Some students simply voted with their feet: exams from the larger section had a higher frequency of “no attempt/no score” on the written component of the exams, as compared to exams from the smaller class. A number of factors associated with the climate of the smaller class could have contributed to the higher scores. Land hypothesized that students in his smaller class experienced more personally connected active learning and, therefore, felt less test anxiety as they wrote essays for an instructor they knew and trusted. Indeed, although both the small and large classes were presented with the same learning activities, proportionately more students in the large section were observed as passive and off-task.

Writing as an assessment strategy showed that writing as a learning activity had a bigger impact in the small class than in the large class, suggesting that reform in terms of class size should precede the curricular shift to high-impact teaching and assessment practices. Framed another way, smaller class sizes appear to potentiate the gains achieved by adding writing. Armed with this information, we are positioned to gather additional assessment data that will strengthen our push for class-size reform. Our next steps will involve collecting and disaggregating data to see whether class size has a more significant impact on some students than on others. We also hope to explore the intersections between writing as an assessment tool and writing as a learning activity and to determine the degree to which either is sensitive to class size.

Leveraging change

Our results suggest that class size affects the culture of learning and influences writing performance—even when the same instructor is teaching the same material and giving identical exams. Conversations with McFall and support from his department chair allowed Land to conduct his initial class-size experiment, but these factors alone might not have facilitated other changes in instruction and assessment. His participation in the networked improvement community focused on writing in the disciplines provided the support, encouragement, and expertise he needed first to add writing as an assessment strategy and then to build on that experience to more intentionally add writing as a learning activity. Because the community included other science and math faculty, as well as both Camfield and McFall, conversation could range from particular disciplinary concerns to larger questions of student engagement and mastery. Being with others who shared a similar optimism about implementing writing in science classes allowed him to experience a culture in which writing-to-learn was possible. In short, the network galvanized his interest and transformed it into a reform commitment devoted to lowering class size in order to make writing a viable part of the curriculum. Most significantly, the benefits of being connected to a reform network did not end there.

At the end of the semester, teams from the networked improvement community formally presented their findings at a college-wide summit on writing. Because this was the first such event, the presenters didn’t know what to expect. Despite some anxieties, attendance exceeded all expectations (to the extent that the room was packed, and we ran out of food). Attendees included the dean of the College of the Pacific, the director of the Center for Teaching and Learning, and the vice provost for strategy and educational effectiveness, all of whom communicated their appreciation of the innovative work that faculty had undertaken. The dean of the college has since followed up with a commitment to continue with networked improvement communities and to figure out how to expand and solidify Land’s teaching reforms in biology. The foot-in-the-door approach to cultural change has resulted in the door opening wide, ushering in the next phase in “engaging students’ innate fascination and developing their understanding of the nature and practice of science”17 and improving retention, completion, and success.

Initial assessment data provided both credibility and the kind of evidence that Land’s scientist colleagues and administrators respect—thus endorsing Hanauer and Bauerle’s call for using assessment to leverage pedagogical reform. The key components that additionally empowered that “leveraging” were the selection of writing for authentic assessment, the small class size of one of the sections in this pilot project, and the supportive formal connections forged with colleagues and backed by trusted experts. Our discoveries suggest that for tactically savvy reformers, Hanauer and Bauerle’s strategy of using assessment reform to leverage curricular change is sound.


1. David Hanauer and Cynthia Bauerle, “Facilitating Innovation in Science Education through Assessment Reform,” Liberal Education 98, no. 3 (2012): 34.

2. See, for example, George Kuh, High-Impact Educational Practices: What They Are, Who Has Access to Them, and Why They Matter (Washington, DC: Association of American Colleges and Universities, 2008).

3. See, for example, Richard J. Light, Making the Most of College: Students Speak Their Minds (Cambridge, MA: Harvard University Press, 2001).

4. Hanauer and Bauerle, “Facilitating Innovation,” 36.

5. See Anthony S. Bryk, Louis M. Gomez, and Alicia Grunow, Getting Ideas into Action: Building Networked Improvement Communities in Education (Stanford, CA: Carnegie Foundation for the Advancement of Teaching, 2010).

6. Patrick Bahls, Student Writing in the Quantitative Disciplines: A Guide for College Faculty (San Francisco: Jossey-Bass, 2012), 10.

7. Peter C. Brown, Henry L. Roediger, and Mark A. McDaniel, Make It Stick: The Science of Successful Learning (Cambridge, MA: Harvard University Press, 2014), 210.

8. Ibid., 198.

9. Karla J. Gingerich, Julie M. Bugg, Sue R. Doe, Christopher A. Rowland, Tracy L. Richards, Sara Anne Tompkins, and Mark McDaniel, “Active Processing Via Write-to-Learn Assignments: Learning and Retention Benefits in Introductory Psychology,” Teaching of Psychology 41, no. 4 (2014): 303–8.

10. Bahls, Student Writing, 8.

11. Hanauer and Bauerle, “Facilitating Innovation,” 37.

12. Katherine Gottschalk and Keith Hjortshoj, The Elements of Teaching Writing: A Resource for Instructors in All Disciplines (New York: Bedford/St. Martin’s, 2004), 161.

13. Brown, Roediger, and McDaniel, Make It Stick, 233; see also Scott Freeman, David Haak, and Mary Pat Wenderoth, “Increased Course Structure Improves Performance in Introductory Biology,” CBE Life Sciences Education 10, no. 2 (2011): 175–86.

14. See, for example, Gene V. Glass and Mary Lee Smith, “Meta-Analysis of Research on Class Size and Achievement,” Educational Evaluation & Policy Analysis 1, no. 2 (1979): 2–16.

15. See, for example, Matthew M. Chingos, “Class Size and Student Outcomes: Research and Policy Implications,” Journal of Policy Analysis & Management 32, no. 2 (2013): 411–38.

16. See Henry J. Raimondo, Louis Esposito, and Irving Gershenberg, “Introductory Class Size and Student Performance in Intermediate Theory Courses,” Journal of Economic Education 21, no. 4 (1990): 369–81; J. J. Arias and Douglas M. Walker, “Additional Evidence on the Relationship between Class Size and Student Performance,” Journal of Economic Education 35, no. 4 (2004): 311–29; Maria De Paola, Michela Ponzo, and Vincenzo Scoppa, “Class Size Effects on Student Achievement: Heterogeneity across Abilities and Fields,” Education Economics 21, no. 2 (2013): 135–53.

17. Hanauer and Bauerle, “Facilitating Innovation,” 34.

Eileen Kogl Camfield is director of university writing programs, Eileen Eckert McFall is director of learning and academic assessment, and Kirkwood M. Land is associate professor of biological sciences—all at University of the Pacific.

