Sentiment Analysis and The “Black Lives Matter” Movement

This post continues a dialogue between Mark Ravina and TJ Greer on text mining and the recent student protest movement. In this entry we examine the potential for sentiment analysis. Mark: I’m somewhat cynical about sentiment analysis. How much of the emotional valence of a document can really be captured by counting adjectives? But I […]

Mining the Movement: Some DH perspectives on student activism

This is the first in a series of blog posts applying text mining and other DH techniques to the evolving student protest movement demanding racial equality. What can DH techniques tell us about student demands and administration responses? Are faculty and students talking to each other or at each other? This project is a collaborative […]

Smooth and Rough on the Highways of France

In a previous post I suggested that historians should use quantitative methods less to answer existing questions than to pose new ones. Such a digital humanities (DH) approach would be the reverse of the older social science history approach, in which social science tools were used to definitively “answer” longstanding questions. This post offers another example […]

Baseball, Football, Moneyball

In fall 2014 I taught a freshman seminar on data visualization entitled “Charts, Maps, and Graphs.” Over the course of the semester I worked with the students to create vizs that passed Tukey’s “intra-ocular trauma” test: the results should hit you between the eyes. Over the coming months I’ll be blogging based on their final […]

Gender bias . . . across the galaxy

In TV and movies men talk more than women, and women talk mostly about men. Hence the Bechdel test. But I thought I’d do a dataviz for this phenomenon using Ben Schmidt’s implementation of Bookworm. His data scraper uses the Open Subtitles database of closed captioned subtitles for hundreds of TV shows. While it can’t […]

Fearbola, Ebola and the Web

My nasty “cold” has been diagnosed as Influenza A, so it’s bed rest for 48 hours. And, of course, blogging about why Ebola gets all the news but not good ol’ killers like influenza. I got CDC figures for deaths and then ran Google searches for the related terms, totaling the number of hits. I was […]

Foodies, sushi, and Google Ngrams

Playing around with the new ngramr package for R, I came up with a simple viz for both the sushi boom and the rise of US foodie culture. Sometimes a picture is worth a thousand words, but at least sushi is low in calories.

Data illustration vs. data visualization?

Just discovered a great blog post on “data illustration” versus “data visualization” at Information for Humans. AIS argues that data illustration is “for advancing theories” and “for journalism or story-telling.” By contrast, data visualization “generate[s] discovery and greater perspective.” I love this distinction, although I’m not sure I like the specific language. Tukey famously argued […]

In praise of “Shock and Awe”

Why graph? And why, in particular, use innovative and unfamiliar graphing techniques? I started this blog without addressing these questions, but a recent blog post by Adam Crymble, critical of “shock and awe” graphs, made me realize the need to explain EDA (Exploratory Data Analysis) and data visualization. Crymble wisely challenged data visualization practitioners to […]