You hear a lot nowadays about the magic of big data. Getting hold of the right numbers can increase revenue, improve decision-making, or help you find a mate—or so the thinking goes. In 2009, U.S. Education Secretary Arne Duncan told a crowd of education researchers: "I am a deep believer in the power of data to drive our decisions. Data gives us the roadmap to reform. It tells us where we are, where we need to go, and who is most at risk."
This is a story about what happened when I tried to use big data to help repair my local public schools. I failed. And the reasons why I failed have everything to do with why the American system of standardized testing will never succeed.
A few years ago, I started having trouble helping my son with his first-grade homework. I'm a data-journalism professor at Temple University, and when my son asked me for help on a worksheet one day, I ran into an epistemological dilemma. My own general knowledge (and the Internet) told me there were many possible "correct" answers. However, only one of these answers would get him full credit on the assignment.
"I need to write down natural resources," he told me.
"Air, water, oil, gas, coal," I replied.
"I already put down air and water," he said. "Oil and gas and coal aren't natural resources."
"Of course they are," I said. "They're non-renewable natural resources, but they're still natural resources."
"But they weren't on the list the teacher gave in class."
I knew my son would start taking standardized tests in third grade. If the first-grade homework was this confusing, I was really worried about how he—or any kid—was supposed to figure out the tests. I had been spending time with civic hackers, the kind of people who build software and crunch government data for fun, and I decided to see if I could come up with a beat-the-test strategy derived from a popular SAT prep course I used to teach.
In essence, I tried to game the third-grade Pennsylvania System of School Assessment (PSSA), the standardized test for my state. Along with a team of professional developers, I designed artificial-intelligence software to crunch the available data. I talked to teachers. I talked to students. I visited schools and sat through School Reform Commission meetings.
After six months of this, I discovered that the test can be gamed. Not by using a beat-the-test strategy, but by a shockingly low-tech strategy: reading the textbook that contains the answers.
Philadelphia is the eighth-largest school district in the country, and its public-school students are overwhelmingly poor: 79 percent of them are eligible for free or reduced-price lunch. The high-school graduation rate is only 64 percent, and fewer than half of students scored proficient or above on the 2013 PSSA.
When a problem exists in Philadelphia schools, it generally exists in other large urban schools across the nation. One of those problems—shared by districts in New York, D.C., Chicago, Los Angeles, and other major cities—is that many schools don't have enough money to buy books. The School District of Philadelphia recently tweeted a photo of Mayor Michael Nutter handing out 200,000 donated books to K-3 students. Unfortunately, introducing children to classic works of literature won't raise their abysmal test scores.
This is because standardized tests are not based on general knowledge. As I learned in the course of my investigation, they are based on specific knowledge contained in specific sets of books: the textbooks created by the test makers.
All of this has to do with the economics of testing. Across the nation, standardized tests come from one of three companies: CTB/McGraw-Hill, Houghton Mifflin Harcourt, or Pearson. These corporations write the tests, grade the tests, and publish the books that students use to prepare for the tests. Houghton Mifflin has a 38 percent market share, according to its press materials. In 2013, the company brought in $1.38 billion in revenue.
Pennsylvania currently has a multi-million-dollar contract with a company called Data Recognition Corporation (DRC) to grade the PSSAs. DRC works with McGraw-Hill as part of a consortium that has a $186 million federal contract to write and grade standardized tests for the rest of the country. McGraw-Hill, meanwhile, also writes the books and curricula schools buy to prepare students for the tests. Everyday Math, the branded curriculum used by most Philadelphia public schools in grades K–5, is published by McGraw-Hill.
Put simply, any teacher who wants his or her students to pass the tests has to give out books from the Big Three publishers. If you look at a textbook from one of these companies and look at the standardized tests written by the same company, even a third grader can see that many of the questions on the test are similar to the questions in the book. In fact, Pearson came under fire last year for using a passage on a standardized test that was taken verbatim from a Pearson textbook.
The issue often has as much to do with wording as it does with facts or figures. Consider this question from the 2009 PSSA, which asked third-grade students to write down an even number with three digits and then explain how they arrived at their answers. Here's an example of a correct answer, taken from a testing supplement put out by the Pennsylvania Department of Education:
Here's an example of a partially correct answer that earned the student just one point instead of two:
This second answer is correct, but the third-grade student lacked the specific conceptual underpinnings to explain why it was correct. The Everyday Math curriculum happens to cover this rationale in detail, and the third-grade study guide instructs teachers to drill students on it: "What is one of the rules for odd and even factors and their products? How do you know that this rule is true?" A third-grader without a textbook can learn the difference between even and odd numbers, but she will find it hard to guess how the test-maker wants to see that difference explained.
Unlike college professors, who simply assign books and leave it to the students to buy them, K–12 teachers have to provide students with books. But it's not a simple matter of ordering one book per student per subject. Based on the schools I visited and the teachers I interviewed, each student needs at least one textbook and one workbook per class, plus a bunch of worksheets and projects the teacher pulls from assorted websites (not to mention binder clips and construction paper and scissors and other project-based materials). Books can be reused year to year, but only if the state standards haven't changed—which they have every year for at least the past decade.
Once I realized the direct connection between textbooks and standardized-test success, I tried to find out exactly how many Philadelphia schools were missing books from the Big Three publishers. I was also curious how much money it would take to make up for the shortfall.
The first challenge came when I asked the School District of Philadelphia for a list of which curricula were being used at which schools. If you want to know which books should be in a school, you need to know the name of the curriculum the school uses. (Using a branded curriculum like Everyday Math allows a school to place its orders more efficiently and negotiate a bulk discount.)
"We don't have that list," an administrator at the Philadelphia Office of Curriculum and Development told me. "It doesn't exist."
"How do you know what curriculum each school is using?" I asked.
There was silence on the phone for a moment.
"How do you know if the schools have all the books they need?"
According to district policy, every school is supposed to record its book inventory in a centralized database called the Textbook Storage System. "If you give me that list of books in the Textbook Storage System, I can reverse-engineer it and make you a list of which curriculum each school uses," I told the curriculum officer.
"Really?" she said. "That would be great. I didn't know you could do that!"
So I did what computer programmers do in this kind of situation: I created a workaround. I built a program to look at each Philadelphia public school and see whether the number of books at the school was equal to the number of students. The results of the analysis did not look good. The average school had only 27 percent of the books in the district's recommended curriculum. At least 10 schools had no books at all, according to their own records. Others had books that were hopelessly out of date.
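A minimal sketch of that kind of check, in Python, might look like the following. The record layout, field names, and numbers here are hypothetical and do not come from the district's Textbook Storage System; the sketch only illustrates the idea of comparing books on record with students enrolled.

```python
from dataclasses import dataclass


@dataclass
class InventoryRecord:
    school: str           # hypothetical school label
    students: int         # students who need the book
    books_on_record: int  # copies listed in the inventory system


def coverage(record: InventoryRecord) -> float:
    """Fraction of students who could be handed a book, capped at 1.0."""
    if record.students == 0:
        return 1.0  # nobody needs the book, so nothing is missing
    return min(record.books_on_record / record.students, 1.0)


def summarize(records):
    """Return (average coverage, names of schools with zero books on record)."""
    if not records:
        raise ValueError("no inventory records to summarize")
    ratios = [coverage(r) for r in records]
    average = sum(ratios) / len(ratios)
    empty = [r.school for r in records if r.books_on_record == 0]
    return average, empty


# Made-up numbers for illustration only; not district data.
records = [
    InventoryRecord("School A", students=100, books_on_record=50),
    InventoryRecord("School B", students=80, books_on_record=0),
    InventoryRecord("School C", students=40, books_on_record=40),
]

average, empty = summarize(records)
print(f"average coverage: {average:.0%}")
print("no books on record:", empty)
```

A check like this is only as good as the counts typed into the system, which turned out to be the real problem.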
I visited some of these schools and asked students how much access they had to textbooks. "We had books at my high school, but they were from, like, the 1980s," said David, a recent graduate of Philadelphia public schools. A junior at a public high school complained to me that her history textbook had pictures of testicles drawn on each page.
When I visited an algebra class at the Academy at Palumbo, a magnet school in South Philadelphia, a math teacher, Brian Cohen, seemed surprised by the information I presented to him. Palumbo's records showed that the school used Fast Track to a 5: Preparing for the AB and BC Calculus Exams, a book published by Houghton Mifflin. However, the quantity of books in the system read "0."
"That's strange," said Cohen after I sat in on his Algebra I class. "I'm not sure why it says we have zero copies." Had that branded curriculum had been selected but never ordered? Or had the books had been ordered but intercepted somewhere along the way?
I asked if we could go look in the book closet and Cohen took me down the hall. On the way, we stopped to chat with a colleague of his who taught calculus. "Do you have enough books?" Cohen asked.
"I do now," she said. "Some school in West Philadelphia closed, and I managed to get all the textbooks from there. I had a friend who hooked me up." But she wasn't using Fast Track to 5; she had a different calculus book that wasn't on my record sheet.
Urban teachers have a kind of underground economy, Cohen explained. Some teachers hustle and negotiate to get books and paper and desks for their students. They spend their spare time running campaigns on fundraising sites like DonorsChoose.org, and they keep an eye out for any materials they can nab from other schools. Philadelphia teachers spend an average of $300 to $1,000 of their own money each year to supplement their $100 annual budget for classroom supplies, according to a Philadelphia Federation of Teachers survey.
Cohen and I arrived at the math department "book closet," which was actually just a corner inside the locked and empty office of the math department chairperson. "Here's where we keep the extra books," he said, gesturing to two short wooden bookshelves. A medium-sized box with open flaps sat on the floor. Cohen looked inside. "Well, we found the AP Calculus books," he said. The box was filled with brand-new copies of Fast Track to a 5.
It would have been easy to blame this glitch on the lack of a centralized computer system. The only problem was, such a computer system did exist, and I was looking at a printout from it. The printout said Palumbo had zero copies of the book, but 24 books were sitting in front of me in a box on the floor of a locked office.
The Philadelphia schools don't just have a textbook problem. They have a data problem—which is actually a people problem. We tend to think of data as immutable truth. But we forget that data and data-collection systems are created by people. Flesh-and-blood humans need to count the books in a school and enter the numbers into a database. Usually, these humans are administrative assistants or teacher's aides. But severe state funding cuts over the past several years have meant cutbacks in the school district's administrative staff. Even the best data-collection system is useless if there are no people available to manage it.
Michael Masch, the vice president of finance and chief financial officer at Manhattan College and the former chief financial officer of the School District of Philadelphia, told me that he used to routinely send his staff into schools to do bookkeeping and other tasks that overworked principals couldn't handle. "Principals weren't good at managing cash accounts or student accounts. They needed support in performing administrative functions because they were understaffed," said Masch. "If the principal doesn't meet with every parent, deal with every crisis, they get criticized. If they don't do the invisible stuff, like the paperwork, they're not going to read about it in the newspaper. So they triage."
When it comes to the book scarcity, Philadelphia principals react in predictable ways. "They are very possessive of their textbooks," Rebecca Dhondt, the parent of a second grader and a fourth grader at Jenks Elementary School, told me. "My daughter is not allowed to bring her textbook home because they don't want it to get lost." For the past two years, she has surveyed teachers to find out what's on their wish lists (mostly trade books and basic school supplies) and then collected donations from the community. "When I first did it last year, the principal said, 'Oh, we have some of that stuff,'" said Dhondt. As with the AP calculus books at Palumbo, the missing items were sitting somewhere in the school but hadn't made it into the right hands. "There's not enough support to connect the supplies in the supply closets or the libraries with the teachers in the classroom," Dhondt said. "They need to have enough money to connect the dots."
Keeping track of supplies is one problem; keeping track of the students who will use them is a whole other challenge. In Philadelphia schools, many students are in foster care or navigating other precarious living situations, which means they frequently switch schools. A recent report by the Children's Hospital of Philadelphia showed that one in five Philadelphia public high school students has been involved with the child welfare or juvenile-justice system. One teacher told me that when she taught in a West Philly high school, she gained or lost a student at least every two weeks.
"There is a set of logistical issues in a district this big that most districts in the U.S. don't face," explained Donna Cooper, the executive director of an organization called the Public Citizens for Children and Youth. "Everything isn't what it appears."
After I finished the first round of data analysis in 2013, I went to the school district and asked to present my findings to Philadelphia Superintendent William Hite. The district spokesperson told me Hite wasn't available and instead offered me a meeting with Stephen Spence, the deputy for the office of school-support services.
Spence, a former gym teacher in his early 60s, was in charge of school openings and closings. His job used to be handled by a whole staff, but ever since the cutbacks, Spence had been singlehandedly taking care of everything from desks to carpets.
I asked him how he verified that schools had enough books at the beginning of each year. He explained that every principal was supposed to submit a school-opening checklist and a school-closing checklist. On each checklist (a Microsoft Word document that he emailed to all the principals), there was a box the principal could tick to indicate that the school had all of the books it needed to operate.
"Inventory is not micromanaged at a central- office level," said Spence. "A principal that has very good skills with technology might develop an inventory system that they keep online. Another principal who is not so good with technology might have just a person who counts the books, carries them from one location to another, puts them in the closet, and visually checks that they're there."
I wondered about this, since a district-wide electronic system had been created several years back. In 2009, a student stood up at a meeting of Philadelphia's School Reform Commission and proclaimed, "I don't have a book." After that, Superintendent Arlene Ackerman had resolved to computerize the district's inventory. Chief Information Officer Melanie Harris had told me that the system had been developed using internal resources.
"You're saying that the online system is no longer in use?" I asked Spence.
The principals preferred to use their own systems, he said, and report their inventory to him. "I rely on the principals and, I'm going to say, real-time data. It gets tracked through the documents we talked about previously: the school-opening and the school-closing checklists."
As Spence receives the principals' checklists, he enters the information into an Excel spreadsheet on his computer.
"Does this Excel document get shared with anyone?" I asked.
"It gets shared with assistant superintendents," said Spence. "We have meetings. We put the Excel spreadsheet on a projector on a large screen during our school-opening meetings."
As a data-science professional, I could see that Spence was in over his head. Millions of books, hundreds of thousands of desks: it is impossible to keep track of all of these objects without technology and enough people to operate it. It's just as difficult to figure out how to use the data correctly.
The end result is that Philadelphia's numbers simply don't add up. Consider the eighth grade at Tilden Middle School in Southwest Philadelphia. According to district records, Tilden uses a reading curriculum called Elements of Literature, published by Houghton Mifflin. In 2012–2013, Tilden had 117 students in its eighth grade, but it only had 42 of these eighth-grade reading textbooks, according to the (admittedly flawed) district inventory system. Tilden's eighth grade students largely failed the state standardized test: Their average reading score was 29.4 percent, compared with 57.9 percent districtwide.
One problem is that no one is keeping track of what these students need and what they actually have. Another problem is that there's simply too little money in the education budget. The Elements of Literature textbook costs $114.75. However, in 2012–2013, Tilden (like every other middle school in Philadelphia) was allocated only $30.30 per student to buy books, and that amount, barely a quarter of the price of one textbook, was supposed to cover every subject, not just one. My own calculations show that the average Philadelphia school had only 27 percent of the books required to teach its curriculum in 2012–2013, and it would have cost $68 million to pay for all the books schools need. Because the school district doesn't collect comprehensive data on its textbook use, this calculation could be an overestimate, but more likely it's a significant underestimate.
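The arithmetic behind the Tilden example can be laid out in a few lines of Python. The figures are the ones quoted above (117 eighth graders, 42 reading textbooks on record, $114.75 per book, $30.30 per student); the variable names are illustrative, and applying the per-student allocation to the eighth grade alone is a simplifying assumption, not the district's accounting method.

```python
from decimal import Decimal

# Figures quoted in the article for Tilden's eighth grade, 2012-2013.
students = 117
books_on_record = 42
book_price = Decimal("114.75")             # one Elements of Literature textbook
allocation_per_student = Decimal("30.30")  # book budget for all subjects

# How many reading textbooks are missing, and what replacing them would cost.
missing_books = max(students - books_on_record, 0)
shortfall_cost = missing_books * book_price

# The whole eighth-grade book allocation, which had to cover every subject.
grade_allocation = students * allocation_per_student

# Share of a single textbook's price that one student's allocation covers.
share_of_one_book = allocation_per_student / book_price

print(f"missing books: {missing_books}")
print(f"cost to fill the gap in one subject: ${shortfall_cost}")
print(f"allocation for all subjects: ${grade_allocation}")
print(f"allocation as share of one book: {share_of_one_book:.1%}")
```

Even under these generous assumptions, closing the gap in a single subject costs more than twice the grade's entire book allocation.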
At the end of the 2012–2013 school year, the book budget was eliminated altogether. Last June, the state-run School Reform Commission—which replaced Philadelphia's school board in 2001—passed a "doomsday budget" that fell $300 million short of the district's operating costs for the 2014 fiscal year. (The governor of Pennsylvania had already cut almost a billion dollars from public education funding in 2011.) Philadelphia schools were allotted $0 per student for textbooks. The 2015 budget likewise features no funding for books.
It may be many years until Philadelphia's education budget matches its curriculum requirements. In the meantime, there are a few things the district—and other flailing school districts in America—can do. Stop giving standardized tests that are inextricably tied to specific sets of books. At the very least, stop using test scores to evaluate teacher performance without providing the items each teacher needs to do his or her job. Most of all, avoid basing an entire education system on materials so costly that big, urban districts can't afford to buy them. Until these things change, it will be impossible to raise standardized test scores—despite the best efforts of the teachers and students who will return to school this fall and find no new books waiting for them.