{"id":210,"date":"2012-05-29T19:31:26","date_gmt":"2012-05-29T19:31:26","guid":{"rendered":"http:\/\/clioviz.wordpress.com\/?p=210"},"modified":"2015-12-16T23:01:06","modified_gmt":"2015-12-16T23:01:06","slug":"in-praise-of-shock-and-awe","status":"publish","type":"post","link":"https:\/\/scholarblogs.emory.edu\/clioviz\/2012\/05\/29\/in-praise-of-shock-and-awe\/","title":{"rendered":"In praise of &#8220;Shock and Awe&#8221;"},"content":{"rendered":"<p>Why graph? And why, in particular, use innovative and unfamiliar graphing techniques? I started this blog without addressing these questions, but <a href=\"http:\/\/adamcrymble.blogspot.co.uk\/2012\/05\/shock-and-awe-graphs-in-digital.html\">a recent blog post by Adam Crymble<\/a>, critical of \u201cshock and awe\u201d graphs, made me realize the need to explain EDA (Exploratory Data Analysis) and data visualization. Crymble wisely challenged data visualization practitioners to ask themselves the following questions: \u201cIs this Good for Scholarship? Or am I just trying to overwhelm my reviewers and my audience?\u201d This is sound advice, and Crymble\u2019s concerns strike me as genuine. But, upon reflection, his post led me to think that \u201cshock and awe\u201d are inevitable parts of any bold scholarly intervention. Feminist scholarship provoked genuine anger when it asserted that academic conventions were rife with sexist assumptions. The linguistic turn alarmed traditional scholars with its new understandings of literary production. Certainly these interventions produced (and continue to produce) needlessly complex, derivative prattle. But can anyone seriously argue that the humanities are not richer for these intellectual challenges?<\/p>\n<p>What follows, therefore, is a defense of \u201cshock and awe\u201d: a justification for data visualizations that are unfamiliar, challenging, and demand new ways of thinking.<\/p>\n<p><em>Why graph instead of just showing the numbers? 
<\/em><\/p>\n<p>When they say \u201cjust show the numbers,\u201d humanities researchers usually mean tables. The problem with this preference for tables is that it assumes that tables are somehow more transparent and accessible than graphs. In fact, the opposite is true. A list of data values is like a phone directory: a wonderful way to look up individual data points, but a terrible means of discerning or discovering patterns (<a title=\"Kastellec, 2007 #3484\" href=\"#_ENREF_2\">Kastellec and Leoni 2007<\/a>; <a title=\"Gelman, 2002 #3483\" href=\"#_ENREF_1\">Gelman, Pasarica, and Dodhia 2002<\/a>). Alternatively, a table of individual data points is analogous to a collection of primary text sources: it\u2019s the raw material of research, not research. Further, most published tables are not transparent, \u201craw\u201d data. On the contrary, tables in most research consolidate observations into groups, listing, for example, average wages for \u201cskilled craftsman in Flanders 1830-35,\u201d or \u201cOsaka dyers 1740-80.\u201d But why those year ranges and those occupational categories? Why 1830-35 instead of 1830-1840? Why Osaka dyers and not the broader category of Osaka textile workers? Those groupings may be conceptually valid, but they are interpretative and preclude other interpretations. Certainly we can lie with graphs, but we can also lie with tables. And since a good graph is better than the best table, DH researchers need to use good graphs.<\/p>\n<p><em>Why these novel, unfamiliar graphs?<\/em><\/p>\n<p>The data visualization movement has certainly produced some bad graphs, obfuscating rather than illuminating. But it is impossible to argue that newer graph forms are more misleading than the status quo. The pie chart, for example, is easy to misuse and the many variants supported by Excel are simply awful. With a 3D exploding pie chart, even a novice can make 5% look larger than 10% or even 15%. 
Can you correctly guess the absolute and relative sizes of the slices in this graph?\u00a0<a href=\"http:\/\/scholarblogs.emory.edu\/clioviz\/files\/2012\/05\/screen-shot-2012-05-29-at-8-38-06-am.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-218\" title=\"Screen Shot 2012-05-29 at 8.38.06 AM\" src=\"http:\/\/scholarblogs.emory.edu\/clioviz\/files\/2012\/05\/screen-shot-2012-05-29-at-8-38-06-am.png\" alt=\"\" width=\"460\" height=\"273\" srcset=\"https:\/\/scholarblogs.emory.edu\/clioviz\/files\/2012\/05\/screen-shot-2012-05-29-at-8-38-06-am.png 802w, https:\/\/scholarblogs.emory.edu\/clioviz\/files\/2012\/05\/screen-shot-2012-05-29-at-8-38-06-am-300x178.png 300w\" sizes=\"auto, (max-width: 460px) 100vw, 460px\" \/><\/a><\/p>\n<p>(See answers below.) Since pie charts are familiar, they are accessible, but that simply makes them easier to misuse. Are conventional bad graphs such as pie charts \u201cbetter\u201d than newer chart forms because they provide easier access to faulty conclusions? Is \u201cschlock\u201d worse than \u201cshock\u201d?<\/p>\n<p>My survey of graphing techniques in history journals turned up an alarming result. Historians rely primarily on graphing techniques developed over 200 years ago: the pie chart, bar chart, and line chart. It is hard not to shock the academy with strange graphs when \u201cstrange\u201d means anything developed in the past two centuries. Many new graphing techniques, such as parallel coordinate plots, are still controversial, difficult to use, and difficult to interpret. But many others are readily accessible and widely used, except in the humanities. The boxplot, developed in 1977 by John Tukey, is <a href=\"http:\/\/www.nctm.org\/profdev\/content.aspx?id=11688\">now recommended for middle school instruction<\/a> by The National Council of Teachers of Mathematics. 
The intellectual pedigree of the boxplot is beyond question: Tukey, a professor of statistics at Princeton and researcher at Bell Labs, is widely considered a giant in 20<sup>th<\/sup>-century statistics. So, what to do when humanities researchers are flummoxed by a boxplot? I now append a description of how to read a boxplot, but isn\u2019t it an obligation of quantitative DH to push the boundaries of professional knowledge? And shouldn\u2019t humanities Ph.D.s have the quantitative literacy of clever eighth graders? In short, since our baseline of graphing skills in the humanities is so outdated and rudimentary, there is no avoiding some \u201cshock and awe.\u201d<\/p>\n<p><em>A graph in seven dimensions? What are you talking about? You must be trying to trick me!<\/em><\/p>\n<p>Certainly \u201cseven dimensions\u201d sounds like a conceit designed to confuse the audience, or intimidate them into acquiescence. But a \u201cdimension\u201d in data visualization is simply a variable, a measurement. Decades ago Tufte showed how an elegant visualization, <a href=\"http:\/\/www.edwardtufte.com\/tufte\/posters\">Minard\u2019s graph<\/a> of Napoleon\u2019s invasion of Russia, could show six dimensions on a 2D page: the position of the army (latitude and longitude), size of the army, structure of the Russian army, direction of movement, date, and temperature. Hans Rosling\u2019s <a href=\"http:\/\/www.gapminder.org\/world\/#$majorMode=chart$is;shi=t;ly=2003;lb=f;il=t;fs=11;al=30;stl=t;st=t;nsl=t;se=t$wst;tts=C$ts;sp=5.59290322580644;ti=2010$zpv;v=0$inc_x;mmid=XCOORDS;iid=phAwcNAVuyj1jiMAkmq1iMg;by=ind$inc_y;mmid=YCOORDS;iid=phAwcNAVuyj2tPLxKvvnNPA;by=ind$inc_s;uniValue=8.21;iid=phAwcNAVuyj0XOoBL_n5tAQ;by=ind$inc_c;uniValue=255;gid=CATID0;by=grp$map_x;scale=log;dataMin=295;dataMax=79210$map_y;scale=lin;dataMin=19;dataMax=86$map_s;sma=49;smi=2.65$cd;bd=0$inds=\">gapminder graphs<\/a> use motion to represent time, thereby freeing up the x-axis. 
By adding size, color and text, Rosling famously fit six dimensions on a flat screen: country name, region, date, per capita GDP, life expectancy, and total population. These are celebrated and influential data visualizations, the graphic equivalents of famously compelling, yet succinct prose. While Crymble assumes that needlessly complex graphics stem from bad faith (a desire to intimidate and deceive), I am more inclined to assume that the researcher was reaching for Minard or Rosling but failed.<\/p>\n<p><em>\u201cHow do you know there hasn\u2019t been a dramatic mistake in the way the information was put on the graph? How do you know the data are even real? You can&#8217;t. You don\u2019t.\u201d<\/em><\/p>\n<p>This concern strikes me as overwrought and dangerous. Liars will lie. They will quote non-existent archival documents, forge lab results, and delete inconvenient data points. When do we discover this type of deceit? When someone tries to replicate the research: combing through the archives, running a similar experiment, or trying to replicate a graph. How are complex graphics more suspect, or more prone to misuse, than any other form of scholarly communication? Is there any reason to be more suspicious of complex graphs than any other research form?<\/p>\n<p>I can optimistically read Crymble\u2019s challenge as a sort of graphic counterpart of Orwell\u2019s rules for writers. But Crymble seems to view data viz as uniquely suspect. To me this resembles the petulant grousing that greeted Foucault, Derrida, Lyotard, Lacan, etc. some three decades ago \u2013 \u201cwhat is this impenetrable French crap!\u201d \u201cYou\u2019re just talking nonsense!\u201d Certainly many of those texts are needlessly opaque. But much of that work was difficult because the ideas were new and challenging. The academy benefited from being shocked and awed. Data visualization can and should have the same impact. 
The academy needs to be shocked: that is how change works.<\/p>\n<p>Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. &#8220;Let&#8217;s Practice What We Preach: Turning Tables into Graphs.&#8221; <em>The American Statistician<\/em> 56 (2): 121-30.<\/p>\n<p>Kastellec, Jonathan P., and Eduardo L. Leoni. 2007. &#8220;Using Graphs Instead of Tables in Political Science.&#8221; <em>Perspectives on Politics<\/em> 5 (4): 755-71.<\/p>\n<p>Answers for the pie chart:<\/p>\n<table border=\"0\" width=\"130\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"65\" height=\"15\">Apple<\/td>\n<td align=\"right\" width=\"65\">10<\/td>\n<\/tr>\n<tr>\n<td height=\"15\">Borscht<\/td>\n<td align=\"right\">17<\/td>\n<\/tr>\n<tr>\n<td height=\"15\">Cement<\/td>\n<td align=\"right\">13<\/td>\n<\/tr>\n<tr>\n<td height=\"15\">Donut<\/td>\n<td align=\"right\">20<\/td>\n<\/tr>\n<tr>\n<td height=\"15\">Elephant<\/td>\n<td align=\"right\">25<\/td>\n<\/tr>\n<tr>\n<td height=\"15\">Filth<\/td>\n<td align=\"right\">15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>Why graph? And why, in particular, use innovative and unfamiliar graphing techniques? I started this blog without addressing these questions, but a recent blog post by Adam Crymble, critical of \u201cshock and awe\u201d graphs, made me realize the need to explain EDA (Exploratory Data Analysis) and data visualization. 
Crymble wisely challenged data visualization practitioners to <a class=\"read-more\" href=\"https:\/\/scholarblogs.emory.edu\/clioviz\/2012\/05\/29\/in-praise-of-shock-and-awe\/\">[&hellip;]<\/a><\/p>\n","protected":false},"author":3530,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[34,2],"tags":[33,7,9,11],"class_list":["post-210","post","type-post","status-publish","format-standard","hentry","category-data-visualization","category-digital-humanities-2","tag-data-visua","tag-dh","tag-digital-humanities","tag-eda"],"_links":{"self":[{"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/posts\/210","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/users\/3530"}],"replies":[{"embeddable":true,"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/comments?post=210"}],"version-history":[{"count":3,"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/posts\/210\/revisions"}],"predecessor-version":[{"id":447,"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/posts\/210\/revisions\/447"}],"wp:attachment":[{"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/media?parent=210"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/categories?post=210"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/scholarblogs.emory.edu\/clioviz\/wp-json\/wp\/v2\/tags?post=210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}