Data stories from A Teaspoon And An Open Mind
by Anna Livingston (nonelvis)
Welcome to the online version of "Five Weird Data Stories From A Teaspoon And An Open Mind (and One That Isn't)," a talk I gave on February 15, 2019, at the Gallifrey One convention in Los Angeles. (Slides from the talk are available as a PDF.)
About A Teaspoon And An Open Mind
A Teaspoon And An Open Mind (whofic.com) is currently the oldest and largest fanfic archive devoted entirely to works about Doctor Who, its spinoffs, and its related media properties. Unlike other fanfic sites, such as Fanfiction.net and Archive of Our Own, Teaspoon is moderated, meaning that a member of our moderation team reviews each story submission before it is posted to ensure it meets site guidelines for spelling, grammar, punctuation, and formatting. Stories don't have to get everything exactly right; we're just after a minimum level of readability.
When you look over every story before it's posted, you start to notice fannish writing patterns and preferences that would be easy to miss if you weren't reading so many stories all the time. It's one thing to understand this stuff anecdotally; it's another thing entirely to dive into the data. Whether any of the stories here are truly "weird" is up for debate, but five of the six, weird or not, are based on something notable about the submissions we see, and two describe something unique to Teaspoon:
- The most common reasons stories are rejected from the archive
- Frequency of the phrase "pink and yellow" used to describe Rose Tyler
- Trends in David Tennant character types featured in "Teninch" crossover fic (stories in which Rose Tyler is paired with a Tennant character other than the Tenth Doctor)
- Name origins and connections for "Handy," the Metacrisis Tenth Doctor
- Statistics related to Unslinky, Teaspoon's most prolific author
- An interactive visualization of archive eras, genres, and ratings
A note about fic authors
This talk mentions only two authors by name: myself (nonelvis) and Unslinky. Everyone has different taste in fic, and every author is someone's favorite. I'm not here to sling insults at authors whose work I don't care for, and for that matter, I feel quite certain there are plenty of readers who don't care for my own work. This talk isn't about judging stories by some objective notion of "writing quality," which is impossible to identify anyway; it's purely about interesting patterns I found in the data, and leaving authors anonymous helps preserve that focus.
Methodology
I am not a data scientist. I am, however, someone very comfortable with technology and analysis. I've done the best I could with data that was occasionally messy – my programmer spouse may never forgive me for some of the time he lost doing data cleanup – and there were times, such as the "pink and yellow" analysis, where I had to review every single instance of a phrase by hand and make a subjective call about whether it met the criteria for inclusion. In short: this is not what I would consider a perfectly clean, professional scientific analysis, but it's an adequate stab at it, and more than enough to demonstrate the trends I wanted to explore.
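For anyone curious what reviewing every instance of a phrase might look like in practice, here is a minimal sketch in Python. It is not the tooling used for this talk, just one plausible way to pull each occurrence of a phrase out of a story along with enough surrounding text to make a judgment call. The function name `find_phrase_contexts`, the `width` parameter, and the sample text are all hypothetical.

```python
import re


def find_phrase_contexts(text, phrase, width=40):
    """Return each occurrence of `phrase` with up to `width` characters
    of surrounding context on either side, for manual review."""
    # Escape the phrase so it is matched literally, ignoring case.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Flatten line breaks so each snippet fits on one line.
        snippets.append(text[start:end].replace("\n", " "))
    return snippets


# Hypothetical sample text, not taken from the archive.
sample = (
    "She was all pink and yellow in the morning light. "
    "The walls were painted Pink and Yellow, which had nothing to do with Rose."
)

for snippet in find_phrase_contexts(sample, "pink and yellow"):
    print(snippet)
```

A script like this only finds candidates. Deciding whether a given "pink and yellow" actually describes Rose Tyler, rather than, say, a wall, still takes a human reader.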