Manhattan Prep Earns Top Spot in Recent GMAT Score Improvement Survey! Except…We Need To Talk…
I was naturally very interested to hear that my company, Manhattan Prep, had earned the top spot for score improvement in the recent Poets & Quants survey of GMAT test prep companies. The excited part of me wants to dance around shouting, “Yay! Manhattan Prep is the best!”
But the rational part of me is saying…hmm. Of course, I think we’re the best too, but I don’t think that average score improvement is a great metric to use (no disrespect to Poets & Quants, which I think is an excellent resource for aspiring business school students). I’d rather that you talk to friends, look at verified reviews from a source like Trustpilot, and attend any free sessions available to judge for yourself. (I’ve got a free session coming up soon—come say hi!)
The study reports that the Manhattan Prep GMAT students who responded to the survey improved by an average of 91.3 points overall, higher than any other company’s average. We also had the highest average improvement for GMAT classes (104.1 points) and for classes plus tutoring (118.8 points).
That sounds fantastic! Why am I questioning this? A few reasons, actually. (The teacher in me has to add: Bonus points if you brainstorm yourself before you keep reading.)
What was the starting point?
Let’s start at the starting point. Is our class students’ 104.1-point jump from, for example, 400 to 504.1? Or is it 104.1 points from, for example, 640 to 744.1? The impressiveness of “104.1” depends pretty heavily on your starting point—it’s far harder to improve by 100 points at the higher end of the scale.
Imagine that, for some reason, a particular company tends to attract people who have higher average starting scores. For instance, my company offers an Advanced GMAT Course for which we require a 650 minimum starting score to enroll. Our students in this class can literally only improve by 150 points—or fewer, if they started higher than 650—so how would you factor that into this study? (Despite that, we still ended up with the highest average improvement in this study, but I could easily imagine a different scenario in which one company’s average score improvement was lower than another’s and yet more impressive—if their students were ending with higher actual scores.)
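A quick back-of-the-envelope sketch of that ceiling effect (the 800 scale maximum and the 650 Advanced-course minimum are real; the other numbers are just illustrations):

```python
# Toy arithmetic, not survey data: on the 200-800 GMAT scale, your
# starting score caps how much improvement is even possible.
GMAT_MAX = 800

def max_possible_gain(starting_score: int) -> int:
    """Largest improvement a student could record from this starting score."""
    return GMAT_MAX - starting_score

# A student entering at the Advanced course's 650 minimum can gain at most:
print(max_possible_gain(650))  # 150

# The same raw 104.1-point average jump from two very different starting points:
print(400 + 104.1, 640 + 104.1)
```

Any company whose students start high is fighting that cap, which is why raw average improvement can't be compared across companies without knowing the starting scores.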
Is this sample representative?
When you get to b-school you’re going to learn about sampling in statistics. A good data set, or sample, is representative of the population you’re trying to study. When that data set doesn’t properly represent the desired population for some reason, your data is skewed in some way and you risk drawing faulty conclusions.
I have two big questions around the representativeness of the data sample in this study. I’ll start with my greater concern.
(1) Are the sub-groups comparable?
Someone who slacks off obviously isn’t going to have the same results as someone who works more diligently. Maybe—on average—the people willing to spend money on a class are also more serious about studying in the first place. By the way, this is called self-selection bias; you’ll need to know this concept for b-school, too.
Let’s extend that idea. The article mentions puzzlement as to why tutoring results weren’t the best of all—you’re paying a premium, so you should get better results, right?¹
It’s very common for people to do what they can on their own before spending money on tutoring (a significant percentage of our tutoring students do this). This is a smart use of your money—only use the tutor for what you really need—but this leaves less “lift” available to attribute to the tutoring because you’re only using it for the last, hardest 30-50 points.
If certain types of students tend to self-select into classes or tutoring—or into choosing one particular company for some reason—then that introduces bias into the data and into any comparison you make across companies. These factors could be controlled for in a randomized study, but that would be a far harder study to conduct.
(2) Who participated in the study?
My second biggest concern revolves around participation bias (another key concept you’ll need for b-school²). People are more likely to report their experience with something if they’re either really happy or really upset, and that can skew the data. (That’s why it’s so frustrating to read Yelp reviews—you can always find 5 people who LOVE this restaurant and 5 others who absolutely hate it. And there might be 500 who felt pretty good about it but didn’t bother to write a review.)
The smaller the number of data points, the more risk that participation bias could significantly skew the results. While 1,000 data points overall is a good number, some of the cuts of the data in this survey are well under 50 data points. I’m really uncomfortable drawing any conclusions based on 15 or 30 data points.
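To see why 15 or 30 responses is shaky ground, here’s a toy simulation (all numbers invented: a “true” average improvement of 90 points with a student-to-student standard deviation of 60) comparing how far a survey average can drift from the truth at different sample sizes:

```python
import random

random.seed(1)

# Toy model: the "true" average improvement is 90 points, with lots of
# student-to-student spread (std dev 60). Each "survey" averages n responses.
def survey_means(n, trials=2000):
    return [sum(random.gauss(90, 60) for _ in range(n)) / n
            for _ in range(trials)]

# Range of survey averages seen across 2000 repeated surveys of each size:
spreads = {}
for n in (15, 1000):
    ms = survey_means(n)
    spreads[n] = max(ms) - min(ms)
    print(f"n={n}: survey averages ranged over {spreads[n]:.0f} points")
```

In runs like this, the 15-response survey averages typically swing by many tens of points, while the 1,000-response averages stay within a few points of 90—and that’s before any participation bias pushes the small sample in one direction.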
All of the above, by the way, is why you’ve never seen any Manhattan Prep claim about score improvement in our 20 years in this business. There isn’t a great but cost-efficient way to get truly representative data for this calculation and, even then, your starting point makes a big difference in evaluating how good that score improvement is. A 40-point improvement is seriously impressive if your starting point is 720. (And, hey, then you and your 99th percentile score can work for us!)
Is there anything you do like about this study?
Sure! The main data point applying to Manhattan Prep did come from more than 100 respondents, so that makes me feel a little better about our stat.
It’s still the case that, as happy as I am to see any article in which Manhattan Prep is ranked #1 for something good, I just don’t feel comfortable accepting the accolade based on this particular data point. As I said, I still think we’re the best, but I’d rather try to prove that via other means—and, mostly, I’d like you to decide for yourself³ what you think best fits your needs and learning style. (Wait, did I just refuse to accept our Oscar?)
1 Yes, I’m a geek and have footnotes in a blog post. 🙂 My colleague Reed Arnold pointed out to me that the tutoring students in the study also reported a much lower number of tutoring hours compared to class hours for those in classes—so you could potentially argue that the tutoring was more efficient in terms of hours spent per point earned.
2 Shout-out to my colleague Daniel Fogel for telling me the official name of this type of bias.
3 I’m hosting a free webinar on February 6—click that link for details. Come say hi and ask questions about the GMAT!
You can attend the first session of any of our online or in-person GMAT courses absolutely free! We’re not kidding. Check out our upcoming courses here.
Stacey Koprince is a Manhattan Prep instructor based in Montreal, Canada and Los Angeles, California. Stacey has been teaching the GMAT, GRE, and LSAT for more than 15 years and is one of the most well-known instructors in the industry. Stacey loves to teach and is absolutely fascinated by standardized tests. Check out Stacey’s upcoming GMAT courses here.