Open-Source Learning: By the Numbers

Open-Source Learners demo & teach skills they acquired in pursuing their Big Questions

“Figures don’t lie, but liars figure.” –Mark Twain

“I shall try not to use statistics as a drunken man uses lampposts, for support rather than illumination.” –Andrew Lang


Do you remember Shane Battier?  Unless you’re a passionate basketball historian and/or a statistics geek, you’re forgiven.  Battier was, by any measure, a successful but mostly unremarkable professional basketball player. He played basketball at Duke University and went on to play for several NBA teams. But Battier was never a leader in scoring, or rebounding, or assists– in fact, none of his statistics were all that impressive.  Still, Duke won a national championship with Battier.  The Miami Heat won a championship with Battier.  Twice.  In fact, Battier is the only player in NBA history to be part of two 20-game winning streaks, on two different teams (the Heat and the Houston Rockets). All he did was win.

So how come Shane Battier was not considered a superstar?  Battier himself said, “They (other players) think of me as some chump.”  As Moneyball author Michael Lewis put it in a 2009 New York Times article, “Here we have a mystery: a player is widely regarded inside the NBA as, at best, a replaceable cog in a machine driven by superstars.  And yet every team he has ever played on has acquired some magical ability to win.”

What’s the deal with this guy?

It turns out that Battier did a lot of things that are essential in basketball.  He instinctively ran to the empty spaces to balance the court and get high-percentage shots.  He almost always got a hand up in a shooter’s face. He could make the open shots and a left jump-hook in traffic.  He stripped the ball from a shooter’s hands on the way up for a shot.  When shots went up, he boxed out the other team’s best rebounder — even when he wasn’t guarding that player. But these actions weren’t associated with traditional basketball statistics.

So, the question we really should be asking is: What’s the deal with basketball statistics? Why didn’t more people immediately recognize and reward Battier’s actions?  What are statistics really worth if they don’t tell us what’s important to win the game?  Everything Battier did was visible (if you knew what to look for), describable, and measurable — so why didn’t the important stuff show up in the box score?

This is a question we should also ask about learning. What do successful learners do that’s worth tracking, recording, or counting? Students do things every day that don’t show up in their academic records.  You help a classmate with homework.  You ask the question in class that everyone else secretly wanted to ask, because the teacher’s explanation made no sense.  These actions are important– they help you learn and they help the people around you learn.  People see your efforts and benefit from the results.  But it doesn’t help your G.P.A.


We tend to count those things that are easiest to describe and measure.  It’s easy to count a player’s point total or how many free throws she attempted.  But nowhere in a traditional box score will you see a statistic for diving out of bounds, or separating your team captain from the referee just before she gets a technical foul, or a thousand other discrete actions that contribute to a team’s success. In today’s digitized world, all of these behaviors could be tied to metrics; we could then weight and analyze them in different ways, depending on what we are trying to understand, or what we think is important.

Statistics indicate what we think is important. In sports, the record book represents the historical tradition and the aspirational future of every athlete who plays the game. It also determines what’s valued in the present; the marketplace doesn’t reward what doesn’t exist, and statistics create shared understandings about what success looks like. This incentivizes some behaviors more than others. If you want to know how many points a player scored last year, the number is easy to find, and so people tend to use this marker as a reason to cheer, pay a higher salary, or shoot the ball at every opportunity. But if you want to know whether a player has a good sense of humor, or is likely to choke when he’s heckled, traditional statistics won’t tell you what you need to know.

Some of the thinking about statistics has changed in recent years.  With more computing power and sophisticated modeling, many professional and amateur sports teams have developed analytics to better understand their games and make decisions that support success.  Baseball made such a science and an art out of analyzing data that there’s even a movie about it starring Jonah Hill and Brad Pitt.  Baseball executive Theo Epstein was credited with using data and evidence-based analytics, including lots and lots of computer-crunched statistics, to guide the Boston Red Sox to a World Series Championship in 2004 (their first since 1918) and the Chicago Cubs to a World Series Championship in 2016 (their first since 1908).  Baseball arguments now include new categories of statistics to analyze player performance.  Stats such as Wins Above Replacement, On-Base Plus Slugging, and a heap of sabermetrics help managers and team executives decide everything from which players they need to where to place their fielders for a specific pitch.

Analyzing quantitative data can certainly give us insight, but it’s important to remember that numbers — no matter how cleverly they are arranged and described — don’t themselves do a very good job of helping us understand people. What about intangible human qualities like curiosity, or creativity, or passion, or resilience?  After a graduation speech he gave at Yale, Theo Epstein himself put it this way:

“One of the great ironies of the digital information age is there is so much information out there, so much data, so many statistics, that it’s easy to attempt to precisely quantify a player’s contribution. But you can never really quantify a human being, can’t really quantify character, and that stuff does matter, especially in a group situation where players really do have an impact on one another. … I still think data is important; it can give you some empirical facts about a player. Objectivity is important, but you have to combine it with an understanding of the player as a human being. Chemistry is really hard to pinpoint. It’s really hard to discern the magic formula.”

So what do Shane Battier and Theo Epstein have to do with how we evaluate students?

In school, we tend to count the things that are easy to count.  Usually this boils down to completed work and fractions consisting of right answers over total answers that we can convert into percentages for the sake of grading and comparing people.  This approach is efficient for large numbers of students, but it has its limitations in supporting learning. It’s easy to count the number of paragraphs a student creates and measure that against a five-paragraph essay assignment. However, apart from demonstrating a correlation with following instructions, this indicator is so superficial that it’s functionally meaningless.

This is a symptom of larger problems.  School has been widely criticized as a relic of the industrial factory model in an information economy. Statistics are poorly understood in a culture where they are frequently used to manipulate people. Student data has been generated and used to profit software firms instead of helping learners. Politicians and corporations routinely cite statistics to justify their actions and to persuade unsuspecting customers and voters. 


Recently I congratulated a student on her A on a math test. “Wow,” I said, “96% – looks like you really knew your stuff.” The student looked back at me and smiled. “Nah,” she said, “I didn’t really get it. Some of the ones I guessed, and the rest I copied off Noemi.”

The things that are easiest to count, such as a baseball player’s batting average or the number of points a basketball player scores, or even a student’s answers on a test, are often poor indicators of that person’s talent, effort, character, future performance, or fit with an organization’s culture or “team chemistry.”

The number of items a student answers correctly on a test tells us nothing about how she thinks, or how well she’s learning, or how well she can apply the knowledge from the test to something meaningful in her life, or how well she can solve problems, or see opportunities, or… you get the idea.

It’s worth taking a moment to ask: What exactly can we tell for sure from a correct multiple choice answer on a test?

Only that the student colored in a particular bubble or made a circle around a particular letter.

We have no idea WHY the student answered this way.  Maybe the student knew the correct answer and selected it with confidence.  Maybe the student had a pretty good idea and guessed correctly.  Maybe the student had no clue and got lucky.  Maybe the student read everything wrong and selected the right answer for the wrong reasons.  Maybe the student was bored out of her mind and was taking a mental vacation on a beach somewhere while she doodled the same answer for every question down the whole column on the answer sheet. 

Most tests also don’t offer much in the way of insight or progress over time because they are summative, i.e., students can’t take them again with the benefit of additional instruction or practice.  Putting these sorts of test scores in a roll book or online grading system doesn’t create a meaningful narrative about a student’s understanding or performance; these numbers are just a string of moments without any thread to connect them (unless the teacher uses the same format for each test, in which case maybe the student is learning how to respond to the instrument / exercise).  This practice makes it easy for educators to create a semester grade online and create a defensible sense of accountability, but it’s woefully inadequate in authentically, meaningfully describing learners or their work progress.


The problem with statistics and learning has an obvious solution. Given the computing power we all have in our pockets, we can document our learning in all sorts of ways. In a previous post I shared the story of the student who learned to fly. Want to know if Matt Reynolds can land a plane? Watch this:

Now imagine what Matt could’ve done with a couple mounted cameras and a little editing, instead of his English teacher backseat driving with an old phone. Today’s students have all the tools they need to curate digital portfolios of themselves as developing learners.

Open-Source Learners use a variety of media to tell our stories. We teach courses, we demonstrate skills, we curate online, and in the process we create a mountain of data and metadata. An average class of 30+ high school students creates millions of discrete digital artifacts in one school year.  This includes both quantitative data (things we can count, like how much we post/write/comment) and qualitative data (things that are not numerically describable, like how well Jeronimo recited “Richard Cory” from memory in class, or why you’re more likely to procrastinate).  We can begin thinking about the elements of reading, writing, literature, Mental Fitness, Physical Fitness, Civic Fitness, Spiritual Fitness, Technical Fitness, and our Big Questions that are important enough for us to evaluate (both quantitatively and qualitatively) as we move forward.

Open-Source Learning offers a way to re-establish trust and authenticity in a world of mistrust and fake news and blah blah blah. And it’s liminal; we know that the old ways no longer serve us, but we are creating the new ways as we go along, and we haven’t yet completely determined what’s important to document or count. My high school and university course blogs have garnered more than 2.5 million page views. That’s a big number, and I think it speaks to learning community engagement, but the real point is that now it exists. The mere fact that it exists — along with so many other data points and artifacts we never considered because they didn’t exist just a few short years ago — suggests that we should explore new types of data collection and analysis. We are creating a learning treasure trove, and we should examine it a little more closely as we think about what we want from our education.


I’d like to start an Open-Source Learning-style conversation and ask you to consider what society needs to know about a learner so that everyone can contribute their best.

To start, let’s look at some easy things to count in an Open-Source Learning Network: below are some numbers from the first week of a few English classes as they began the 2019-2020 school year at Santa Maria High School in Santa Maria, CA.  These figures represent the first five days of the school year; two courses taught over five class periods: four periods of English 3 (American Literature, 11th grade) and one period of AP English Language & Composition (11th grade); 177 students total; 36 AP students.

Have a close look and please let me know what (if anything) you think these numbers mean.  Do they play a meaningful role in evaluating performance?  What data points should we consider? What statistical processes should we use to analyze correlation and explore causation? Can we set up different procedures for momentary snapshots versus longitudinal panoramas? Are we talking Markov Chains here?  Standard deviations of covariates? Why not use our imaginations and think about how UX/UI can provide insight? Or the metadata associated with the social production of Open-Source Learning Network members? Maybe Elizabeth really does write better late at night than early in the morning. How many different ways can you think of to analyze online content? Visualization tools? What qualitative data should we consider? This is just the tip of the iceberg.

Course blogs: 2

Members with blogs listed on the Member Blog pages: 167

Members without blogs listed on the Member Blog pages: 10

Members without blogs listed on the Member Blog pages who haven’t been absent at least 2 days: 2

(Can’t wait for Monday to ask Maira Gonzalez & Jesse Ruiz how I can help them.)

Total (attending & non-attending) student participation: 94%

Total (attending) student participation: 99%

Course blog page views during first week: 3605

Course blog posts: 33

Course blog comments: 78

Course blog followers: 40
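
For anyone who wants to check the two participation percentages, here is a minimal sketch of the arithmetic. It uses only the counts listed above. One assumption on my part: the eight students with no blog listed who were absent at least 2 days are treated as "non-attending," which is how the second percentage appears to have been derived. The variable and function names are made up for illustration.

```python
# Counts taken from the first-week figures listed above.
total_students = 177
with_blogs = 167
without_blogs = 10
without_blogs_attending = 2  # no blog listed, and not absent at least 2 days

# Assumption: a student with no blog who missed 2+ days counts as "non-attending".
non_attending = without_blogs - without_blogs_attending
attending = total_students - non_attending


def participation(listed: int, enrolled: int) -> int:
    """Share of enrolled students with a listed blog, as a whole-number percent."""
    return round(100 * listed / enrolled)


overall_rate = participation(with_blogs, total_students)
attending_rate = participation(with_blogs, attending)

print(f"Total (attending & non-attending) participation: {overall_rate}%")
print(f"Total (attending) participation: {attending_rate}%")
```

Changing the definition of "attending" (say, absent at least 3 days instead of 2) changes the second number, which is a small reminder that even the easy counts depend on choices about what to count.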

Thanks for reading and sharing your thoughts. In my next post I’ll share some ideas about who should help evaluate Open-Source Learners and our networks.

Celebrity chef comes to class and interviews/hires Open-Source Learner