Considerations for AP English Exam Scores

The following thoughts were originally posted on the College Board AP Lit listserv and are reposted here with permission.

Dear Colleagues,

We have been having a very good discussion about AP English scores this year, and our director at the College Board, Brandon Abdon, has followed the comments very closely and responded with 8 carefully considered points. In my role as a consultant for APSI sessions and as moderator for this community, I think that these contributions from Brandon are very helpful in advancing our dialogue, and he has discussed them in some detail with me and with our advisors for Language (Jodi Rice) and Literature (Brian Sztabnik). We now offer them for your consideration.

Eight things to consider about AP English examination scores:

1. The standards for scoring are neither the College Board’s nor Educational Testing Service’s alone; they are set by higher education professors who take part in standard-settings. A standard-setting does not happen each year, but it does happen when there is a significant change in the scoring expectations or in the task models for the exams. At these standard-settings, the standard for what is acceptable performance for a student is determined by higher education, not by CB or ETS. We maintain that standard by using fair and consistent essay scoring guides from year to year, with essay prompts that are field-tested at colleges and universities. The multiple-choice items are also field-tested and then reused periodically, so that results can be compared to previous performance in order to evaluate and account for performance change over time (known as “equating” performance across cohorts).

2. The exam is not norm-referenced. On a norm-referenced exam, one goes in knowing that there will be, as an example, 10% of 5s, 25% of 4s, 30% of 3s, 25% of 2s, and 10% of 1s/0s (that would describe a nearly perfect bell curve). Norm-referenced exams test all students against a moving standard and then divide those students into score ranges based on the performance of others, not on predetermined criteria. In fact, all AP exams are criterion-referenced exams, where all students are held to the same standard, the criteria for which have been determined by the standard-setting process mentioned above. This table does a nice job of breaking down the differences.

3. The AP English exams are criterion-referenced exams that discriminate between the different performance levels very well, meaning that they are criterion-referenced exams whose scores fall into a proper bell curve naturally. Not every student will, nor should, earn a 5 (or an “A” in the college course, according to the American Council on Education, cited in this publicly available College Board report). Comparing the AP English examinations to some other courses is really comparing apples to oranges for a few reasons, the most important being that the AP English courses are 2 of the 3 largest (Lang = #1, Biology = #2, Lit = #3). In courses this size, students are rarely as self-selecting as they might be for a course such as Calculus BC (selected usually by students who are already very high-performing math students) or Chinese (often taken by students who are either native speakers or for whom Chinese is a home-spoken language). The larger courses (e.g., English Language and Literature, the large histories, or the large natural sciences) are often taken by students who do not call themselves students of the discipline, so we often see this natural bell curve.

4. Both CB research and external research show that dual enrollment aligned with a 2-year institution is less useful or productive for a student than either AP, IB, or even DE aligned with a 4-year institution. We know that DE is challenging some AP courses and programs around the country, but the research shows that it is not a better choice where student performance and learning are concerned. Rarely are students who pass a DE course with any particular letter-grade able to perform at the same standards established at that letter-grade level in the same college course (or on the AP exam). When it comes to “getting the credit” this isn’t very helpful, but if schools and districts consider what provides better long-term results for students, then the choice should be clear.

5. The Literature course/exam is not built around any possible text from any possible time period. The course/exam is built around literature in English written since the early modern period. It focuses on texts of “merit” that display literary complexity. All exam items are vetted through field-testing for their performance, ability to fit into the time constraints of the exam, and the viability of the prompts (in the essays) relative to what is written by college students taking the college course. The best preparation teachers can provide is to teach their students to read and write with an appreciation for the nuances and complexities of a text regardless of its time period or form.

6. I am not aware of any research showing that doing more practice exams or drilling past exam questions regularly will necessarily improve student performance. Yes, students must be aware of, and prepared for, the formatting and constraints of the exams, but focusing on released passages and prompts does not necessarily make students better readers or writers, nor does it necessarily prepare them for the wide-open possibilities of the exam.

7. All of this said, we at the College Board are concerned with the decline in the number of students meeting the standards for 3, 4, and 5 on the English Literature exam. Though neither the standards nor the exams have been altered to increase difficulty, we still see students not performing as well as they have in the past. We are continuing to do some focus groups and surveys to determine what sort of support we may be able to offer teachers to help improve these performances. We hope to have some things to release within the next year or two.

8. We hope to begin giving teachers more specific learning-objective and/or skill-level feedback on the exams. Currently we are developing and testing some things that will allow for this. The aim is that teachers can make more focused, local-level course revisions based on the performance of their previous class.

Feel free to leave comments, which will be passed on to Brandon Abdon at the College Board as well as to Brian Sztabnik and Jodi Rice, who serve as College Board advisors.

4 thoughts on “Considerations for AP English Exam Scores”

  1. Here’s something to consider. Schools get rewarded for the number of students enrolled in AP courses. (That number is considered when schools get ranked by state organizations–at least that’s the case here in Wisconsin–or by “best HS” lists put out by media such as US News and World Report.) Not taken into consideration are factors such as how well students do in the class, whether the students take the AP test, or how they do on the test. Over the last few years, where I teach in suburban Milwaukee, our section numbers in AP Lit (and in Lang) have increased dramatically. Not only has this resulted in bigger classes, but also in more students who may not be prepared for the challenge of AP English classes. Now, there is definitely value in students taking AP English whether they can score well on the test or not. However, if schools don’t do enough to ensure students’ success once they’ve registered (and in classes preceding AP English), well, it seems to me that getting those high school rankings (and the good PR that accompanies them) is valued more than how well students are performing in the class itself (and, consequently, on the AP exams).

  2. Thanks for addressing the points here, Brandon and Brian. I have seen a decline in my passing rates over the past several years, but in post-course surveys students often attributed it to lack of motivation or “stickitivity” during the actual exam because they already knew if their college wasn’t going to accept AP English Lit scores for credit. I doubt that is the decisive factor, but it is one we should bear in mind as it coincides with various colleges’ movement in this direction. That said, what is within our control as teachers is to design rigorous, relevant and high-quality instruction, particularly around close reading. I also feel that there’s a dearth of effective formative assessment materials aside from released passages and exams, which are really just summative assessments sliced up. The last major shift I see is that students come to me struggling to transition from the argumentative writing demanded in AP Lang to the analytical writing demanded by the SAT and AP Lit exams. Nearly a semester is spent training them out of paraphrasing. That’s all I can think of right now, but Skip Nicholson designed a fantastic post-exam reflection piece that we might consider using when diagnosing the course shortcomings and student performance after AP Lit; let’s face it, the online score reports are a post-mortem. I’m more focused on developing an effective pre-course assessment so I have a sense of where to go at the beginning of the year, and I’d love the College Board’s help on that.

    • Mr. Morgan,
      I just finished thirty-seven years teaching high school English and Latin. At the beginning of my career, nearly half of the students in my AP and honors courses were interested in and destined for Humanities-related degrees and positions. This year, only two of nearly sixty students were interested in professions and degrees other than STEM degrees. This trend began for my students approximately ten years ago.
      Have you seen this trend? My scores were better when there was more interest in the Humanities. Anyone else notice this, too?

  3. The College Board needs to give far better feedback. When my Latin Language students take the National Latin Examination, I receive a complete, name by name, question by question report. Why can’t the College Board do the same for AP Lit? Why can’t we also receive the same for the essays?
    Also, we should receive the final tally score for each student. Did a student who earned a 2 miss earning a 3 by just a few points, for example? Were all the students in my cohort earning a 3 just barely making it?
    Without detailed feedback, it is difficult to adjust instruction.

    Richard Grieves
