Debunking the "First 7 Questions" Myth

by Stacey Koprince Mar 14, 2012

I don’t even need to say what the myth is! Everyone already knows; that’s how pervasive it is. Ever since the GMAT and GRE CATs launched in the 1990s, people have believed that the earlier questions are worth more: that if we could get the first 7 (or 5, or 10) questions in a row right, we’d be guaranteed a really high score.

And you’ve likely also heard that this is a myth: from me, from other teachers, from Dr. Lawrence (Larry) Rudner, Chief Psychometrician of GMAC (the organization that makes the GMAT). And yet so many people still talk about it and believe it—so who should we believe?

Let’s talk about this and, hopefully, lay the myth to rest once and for all.

What is the Myth?

Different people talk about different details: if we get the first 5, or 7, or 10 questions right, then we’ll get a high score no matter what else happens, or a higher score than we would otherwise get. (And, conversely, if we do poorly on the first 5, or 7, or 10 questions, then our score will be terrible no matter what, or lower than we would otherwise get.)

How Did the Myth Get Started?

Item Response Theory (IRT), the concept on which the GMAT is based, has been around for more than 50 years. In the 1990s, the GMAT decided to switch from the old-fashioned paper-based format to a new computer-adaptive testing (CAT) format based upon IRT.

During that timeframe, the Educational Testing Service was responsible for producing test items for both the GRE and the GMAT (now, the GMAT test items are produced by another organization, ACT). As ETS developed its new GRE CATs, it did initially have a format that emphasized the earlier questions. The test prep world soon figured that out and ETS redesigned the format so that this would no longer happen. That was the start of the myth: it wasn’t actually a myth at first!

But why did it persist? Everyone knew that the early form had been cracked and that ETS responded accordingly by changing things.

In 1999, ETS researchers presented a white paper at the annual meeting of the National Council on Measurement in Education. The researchers, Manfred Steffen and Walter D. Way, presented the results of a number of scenarios they ran to test the IRT-based algorithm and show how it worked. I’m going to give you some of that data below, but I want to point something out first: this paper is why the rumor persisted.

The paper itself is completely correct and accurate; in fact, it’s an example of quite good research. But a lot of that research was misinterpreted by people in the test prep industry—misinterpreted to mean that the earlier questions still were worth more and that students should spend a lot more time on those earlier questions. The paper, however, shows just the opposite.

One thing the paper tells us is that, for a true scoring level of 650, answering the first two questions correctly vs. incorrectly results in a 31-point score difference (658 vs. 627). Note, though, that this assumes all the other items are answered identically…in other words, the examinee who answered the first two questions correctly didn’t have to guess on any questions at the end. How does that happen? The examinee doesn’t run out of time because the examinee didn’t take extra time (or not much, anyway) in the first place. (Also: no, you can’t really score 658 or 627 on the test; these are the average results of many simulations.)

One scenario they ran looked at what happened to the score if someone got the first question, the first two, or the first three right (or wrong), compared with the final question, the final two, or the final three. The results are very interesting.

For someone with a true scoring level of 750, getting the first 3 questions right (and leaving everything else the same) results in a 10-point lift to 760. Great! We should spend more time on the first three questions, right?

Not so fast. Getting the last three questions in a row wrong results in a 20-point drop to 730. What do these two data points really mean? If you can answer those first three questions correctly without sacrificing later questions, then great. Do it. But chances are pretty good that you’d have to spend extra time…and then your score will drop at the end. (And this is how the myth was perpetuated. People forgot that there are consequences for trying to get the first X number of questions right. You have to take extra time!)

At the 650 true-scoring level, getting the first 3 right results in a 16-point jump, to 666. Getting the last three wrong, however, results in a 15-point drop, to 635. Six of one, half a dozen of the other: it doesn’t really matter.
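To make the trade-off concrete, here is a minimal Python sketch that nets the early gain against the late drop, using only the four figures quoted above from the paper. The function name net_change is just a label for this illustration, and the sketch assumes the two effects simply add together, which is my simplification and not something the paper claims.

```python
# Figures quoted above from Steffen and Way (1999); no new data is introduced.
# true level -> (score with first 3 right, score with last 3 wrong)
CITED = {
    750: (760, 730),
    650: (666, 635),
}


def net_change(true_level):
    """Early gain plus late drop for one true scoring level.

    Assumption (for illustration only): the two effects are additive.
    """
    early_score, late_score = CITED[true_level]
    gain = early_score - true_level   # lift from getting the first 3 right
    loss = late_score - true_level    # drop from getting the last 3 wrong
    return gain + loss


for level in CITED:
    print(level, net_change(level))
```

Under that additive assumption, the 750-level tester ends up 10 points down and the 650-level tester ends up 1 point up: a loss in one case and a wash in the other.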

I will concede that, at lower scoring levels (sub-550), a strategy that involves getting the first 3, or 5, or 7 in a row right works in theory because there isn’t much of a drop at the end for getting a similar number of questions wrong. There’s only one problem with this strategy. What are the chances that someone at a true scoring level of 500 is going to get the first 5 (or even 3) questions in a row right? Think about what happens each time you get a question right.

Next, the study talks about scenarios in which someone gets the first ten questions right or the last ten questions right. Here’s an interesting statistic: if a test-taker with a true scoring level of 670 gets the first 10 questions right, the study showed that the resulting score would be 728. Clearly, we should spend that extra time on the first 10 questions!

Except for one little detail. That part of the study assumed that the test-taker didn’t have to rush on any other questions. In other words, the study assumed that the test-taker didn’t need any significant extra time in order to get those first 10 questions right. (And, again, we missed this in the 1990s when interpreting this data.)

By the way, what can you do if you simply don’t know how to do a problem? Will spending an extra minute or two help? The vast majority of the time, no. If you can’t do it in the normal time (or perhaps about 30 seconds above normal time), then this just means that you don’t actually know what you’re doing, since there is a “normal-time” solution. Spending even more time, then, is not going to do much (except blow time).

And finally we get to the portion of the report that mimics real-life conditions the best: what the report calls the early care / late guessing condition. In this scenario, the test-taker takes additional time on some early number of questions and then has to guess on questions towards the end in order to finish the test on time. There’s one more not-so-minor detail. This scenario assumed that, for the early care (extra time) situation, the tester would answer every single question correctly. That is: you spend more time, you automatically get it right. I don’t need to point out that it doesn’t usually work that way, right? : )

If a test-taker with a true scoring level of 500 answers the first 5 questions correctly, then that test-taker is likely to end up with a higher-than-500 score as long as he doesn’t guess on more than 6 questions at the end. That sounds pretty good—until you remember that this requires a 500-level tester to answer the first 5 questions in a row correctly. Again, remember what happens when you answer questions correctly.

What about a tester at the 780 level? This tester has a pretty good shot at answering the first 5 in a row correctly. In this scenario, however, the tester cannot guess on more than a single question at the end without the score dropping below 780. Only 1 question!! If this tester answers the first 10 questions in a row correctly, he can guess on no more than 3 questions at the end before the score level drops below 780.
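The three thresholds just mentioned can be collected into a small lookup. This is only a sketch: the dictionary holds the figures quoted above and nothing else, and the name still_ahead is a hypothetical helper of mine, not something from the paper. It also inherits the scenario’s generous assumption that every early-care question is answered correctly.

```python
# (true scoring level, early questions answered correctly) -> the most
# late guesses the tester can make without the score falling below the
# true level. Values are the ones quoted in the article.
MAX_LATE_GUESSES = {
    (500, 5): 6,
    (780, 5): 1,
    (780, 10): 3,
}


def still_ahead(true_level, early_correct, late_guesses):
    """True if the number of late guesses stays within the quoted limit."""
    limit = MAX_LATE_GUESSES[(true_level, early_correct)]
    return late_guesses <= limit


print(still_ahead(780, 5, 2))   # two late guesses at the 780 level
print(still_ahead(500, 5, 6))   # six late guesses at the 500 level
```

The lookup deliberately raises a KeyError for any combination the article does not quote, rather than guessing at numbers the study did not report here.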

What does all of this mean?

If you’re going for a 750+ score, then the strategy actually boils down to this: get the first 5, or 7, or 10 in a row right while spending barely any extra time so that you have to guess on zero or very few questions at the end—otherwise, your score will actually go down. (By the way, if you can actually get the first 5+ in a row right without spending extra time, then you don’t need to worry about any of this. You just do all the questions normally for you.)

What about a 500-ish score? We’re allowed to guess more at the end but we’d still have to get 3+ questions in a row right at the beginning of the test, and we all know how the test works. I’m going to start the test with a medium-level problem. If my level really is around 500, there’s no way I’m going to get 3 in a row right, because that third question is going to be way too hard for me no matter how much time I spend (and possibly the second one as well).
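Here is a rough way to see why that streak is so unlikely. The sketch below uses the Rasch (one-parameter logistic) model, one of the simplest IRT models, and it is emphatically not the GMAT’s actual algorithm. Everything numeric in it is hypothetical: ability and difficulty are on an arbitrary logit scale, the starting difficulty equals the tester’s ability (a “medium” question), and each correct answer is assumed to raise the next question’s difficulty by a fixed step. It also ignores lucky guesses on multiple-choice questions.

```python
import math


def p_correct(ability, difficulty):
    """Rasch model: chance of a correct answer given ability and difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def p_streak(ability, start_difficulty, step, n):
    """Chance of getting the first n questions right in a row.

    Assumption (hypothetical): each correct answer makes the next
    question harder by `step` logits.
    """
    prob = 1.0
    difficulty = start_difficulty
    for _ in range(n):
        prob *= p_correct(ability, difficulty)
        difficulty += step  # the next question is harder
    return prob


# Illustrative values only: a tester whose ability matches the first question.
for n in (1, 2, 3):
    print(n, round(p_streak(0.0, 0.0, 1.0, n), 3))
```

With these made-up settings the chance of three in a row comes out at under 2%. The exact number depends entirely on the assumed step size; the point is only that the probabilities multiply while the questions keep getting harder.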

What about in the 600 range or right at 700? The final scenario in the study (early care / late guessing) didn’t provide data for these specific scores. But look at all of the data given collectively in the paper. We haven’t found one case in which spending more time on the early ones actually works. Basically, we’ve got a tug of war between how many questions we really could get right in a row and how many times we’d have to guess at the end as a result. The data shows that the test makers have figured out how to balance this in such a way that gaming the test is just going to backfire in the end.

Takeaways

1. It really is a myth. Don’t spend lots of extra time on any one problem anywhere in the test. It’s not worth it.

2. The real strategy that derives from the research is: get everything right that you can without spending a bunch of extra time. (This does not mean that you can’t go 30 to 45 seconds over the average time when you think some extra time is warranted! Beyond that, though, the extra time is likely just indicating that you don’t know how to do the problem anyway.)

3. Read these two articles: In It To Win It and Time Management.

All data points cited from Test-Taking Strategies in Computerized Adaptive Testing, Steffen, M. and Way, W. D., Educational Testing Service, April 1999. Presented at the annual meeting of the National Council on Measurement in Education.
