See and Believe

The saying goes that “seeing is believing.”  In the age of big data, this becomes ever truer.  Raw numbers and other data have a tendency to overwhelm comprehension, but when organized visually their meaning is much more quickly apparent. 

Part of the reason for this is that our brains are simply more capable of interpreting visual information, even in abstraction.  To make a digital comparison, it’s estimated that we process about 1250 MB/s (megabytes per second) visually but are only conscious of about 0.7% of this information.[1]  By tapping into this faster processing, we can comprehend information that is normally too vast to grasp.  This vastness may lie not only in the volume of the data but also in its scope.  For instance, W. E. B. Du Bois presented a stark graphic of the population distribution of African Americans in Georgia at a 1900 exposition in Paris.

The straight lines adjoining diagonally represent African Americans living in various sizes of city, whereas the line spiraling in to the center of the page represents rural dwellers.  The significance of the spiral is slow to dawn on the viewer since the compact form of the graphic initially tricks the eye into confusing smaller visual area with a shorter proportional line.  To me, this makes the image all the more powerful.  Source: W. E. B. Du Bois, The Georgia Negro: City and rural population, 1900, photograph of ink and watercolor, Library of Congress, Washington, DC, https://www.loc.gov/item/2013650430.

Similarly difficult to grasp is the mind-boggling attrition that resulted from Napoleon’s march on Moscow.  While presenting simple numbers may be an accurate representation of the loss of life, it has less impact than a visual.

Charles Minard’s famous representation of the Napoleonic Army’s march to Moscow. The width of the line is proportional to the number of troops remaining; tan is the march into Russia, black is the retreat. Notice also the river crossings and the temperature graph corresponding to the retreat. By deftly layering multiple data sets on top of one another, Minard communicates a complex event easily. Source: Charles Minard, “Carte figurative des pertes successives en hommes de l’Armée Française dans la campagne de Russie 1812-1813,” infographic, 1869, https://en.wikipedia.org/wiki/Charles_Joseph_Minard#/media/File:Minard.png.

Visual representation helps us make connections that are otherwise hard to see because of the overwhelming quantity of data.  Figures of study from Voltaire to Franklin to Kissinger have left us literal archives full of written works.  Stanford University’s Mapping the Republic of Letters project is creating a database of the correspondence between 18th-century notables in order to examine its patterns.  In particular, researchers have noticed “coldspots,” such as Voltaire’s relative lack of correspondence with anyone in England.[2]  I am reminded of our HIST 454/544 class last week, where we sat in silence trying to think of a voice that was “notably absent” from the oral histories we had been studying; a visual web of connections showing information about the speakers might have helped us find an answer sooner.  Similarly, the work of Micki Kaufman to create an archive of Henry Kissinger’s memoranda provides a glimpse of his shifting priorities and emotional state.  As an example of how she has used this visualization, Kaufman writes about the topic (i.e., word frequency) of “laughter”:

The “Laughter” topic is based upon those documents in which the transcriber literally placed the phrase “[laughter],” representing jovial, lightheaded moments of Kissinger’s correspondence in which the participants had a chuckle. A historian would expect these sorts of emotional expressions to occur in inverse proportion to the gravity of their respective topics (for example, the least ‘laughter’ during those negotiations in which relations were at their most sensitive, tense and/or adversarial), and the placement of the “Laughter” topic at the furthest possible point from topics relating to the Soviet Union, China and Vietnam negotiations validates this interpretation. [3]

This “exhausting preparatory work” to create “appropriately formatted data”[4] is a service to all future historians using the data sets.  From the database, “web services like Overview make visual topic modeling easy; tools like NodeXL and Gephi greatly facilitate complex network analysis; and mapping tools like QGIS, while not exactly intuitive, help users sidestep prohibitively expensive software that was required to make even rudimentary maps only a short time ago.”[5]

GIS (Geographic Information Systems) is indeed a new and powerful tool available to historians.  With time mapping, a newer feature also called temporal GIS, spatial data can be connected with chronological data.  Analog versions of this kind of knowledge publication exist (Edward Quin’s “A General View of Universal History…,” for instance, is a series of maps showing what was thought of as the “known world” at various points in history), but temporal GIS technology allows for the production of new knowledge, not simply the publication of the old.  A “time slider” controls the temporal moment as the requested data displays; Joshua MacFadyen and Nolan Kressin of the University of Prince Edward Island used this technology to explore firewood transportation along Canadian railways from 1876 to 1921.  This allowed them to identify several railways for further study based on what visually “stood out” when their data displayed.[6]

As with all other tools in the digital humanities, novel ways to visualize data are worth little if not subjected to the same kinds of scrutiny applied to “traditional” work in the field.  Frederick Gibbs convincingly argues that “[o]ur visualizations and data interfaces must provide or at least suggest historical insight… or shed new light on old questions, rather than simply present a novel view of the historical record for novelty’s sake.”[7]  He encourages all historians, even those not “enthused about gathering, producing, or working with data,” to engage with graphics as we would any other source, interrogate them, and review their methodology.[8]  Indeed, this seems necessary now that such visualizations have made the leap from simply representing data to helping us draw conclusions from it.  History as a discipline prides itself on maintaining an unbroken epistemological chain of conclusions, and new sources of knowledge must naturally be included in that chain.


[1] David McCandless, “The beauty of data visualization,” produced by TED-Ed, November 23, 2012, video, 9:25, https://www.youtube.com/watch?v=5Zg-C8AAIGg.

[2] Dan Edelstein et al., “Voltaire and the Enlightenment,” Mapping the Republic of Letters, Stanford University, accessed October 11, 2020, http://republicofletters.stanford.edu/casestudies/voltaire.html#.

[3] Micki Kaufman, “Force-Directed Diagram: MEMCONS and TELCONS ‘Textplot,’” “Everything on Paper Will Be Used Against Me:” Quantifying Kissinger, January 6, 2015, accessed October 11, 2020, https://blog.quantifyingkissinger.com/category/interactivity/interactive.

[4] Frederick W. Gibbs, “New Forms of History: Critiquing Data and Its Representations,” The American Historian, February 2016, https://www.oah.org/tah/issues/2016/february/new-forms-of-history-critiquing-data-and-its-representations.

[5] Ibid.

[6] Joshua MacFadyen and Nolan Kressin, “The Fir Trade in Canada: Mapping Commodity Flows on Railways,” NiCHE, October 8, 2020, accessed October 11, 2020, https://niche-canada.org/2020/10/08/the-fir-trade-in-canada-mapping-commodity-flows-on-railways.

[7] Gibbs, “New Forms of History,” The American Historian.

[8] Ibid.