Day Three

Thursday June 1, 2017

Breakfast (8:45 – 9:30 AM): Emory Center for Digital Scholarship

Session Nine (9:30 – 11:00 AM): Woodruff Library 312

Intermediate topics in text manipulation in R and RStudio (Ravina) — guide

    • KWIC and regular expressions
    • Chunking, aggregating, and sorting
    • Tokenizing


Session Ten (11:15 AM – 12:30 PM): Woodruff Library 312

Hands-on work in RStudio (Ravina, Long, Des Jardin)

Lunch Break (12:30 – 1:30 PM): Emory Center for Digital Scholarship

Session Eleven (1:30 – 2:45 PM): Woodruff Library 312

Overview of statistical techniques (Long) — guide

This session will introduce basic statistical methods in a non-mathematical fashion. Many key concepts in DH and computational linguistics, including measures of text similarity, can be explained through simple visual analogies. In the simple case of comparing the frequencies of two words, different measures can be explained as different methods for judging the closeness of two points on a piece of paper. Our goal in this session is to enable JF specialists to understand the principles behind these measures so that they can better communicate and collaborate with DH specialists.

  • Correlation
  • Clustering
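
The "two points on a piece of paper" analogy can be sketched in a few lines of code. The example below is only an illustration in Python, not material from the session guide, and the three texts and their word counts are invented. Each text is a point whose coordinates are the counts of two words, and two common measures (straight-line distance and cosine similarity) are used to judge how close the points are.

```python
import math

# Hypothetical counts of two words in three texts. Each text becomes
# a point (x, y) on the page. All numbers are invented for illustration.
texts = {
    "A": (10.0, 2.0),
    "B": (20.0, 4.0),  # same proportions as A, but a longer text
    "C": (8.0, 6.0),
}


def euclidean_distance(p, q):
    """Straight-line distance between two points, as measured with a ruler."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def cosine_similarity(p, q):
    """Cosine of the angle between the lines drawn from the origin to each point."""
    dot = p[0] * q[0] + p[1] * q[1]
    return dot / (math.hypot(*p) * math.hypot(*q))


# Compare text A with the other two texts under both measures.
for name in ("B", "C"):
    d = euclidean_distance(texts["A"], texts[name])
    s = cosine_similarity(texts["A"], texts[name])
    print(f"A vs {name}: distance = {d:.2f}, cosine = {s:.3f}")
```

With these invented counts the two measures disagree: by the ruler, C is nearer to A than B is, while by the angle, B points in exactly the same direction as A because it uses the two words in the same proportions. This is one way to see why the choice of measure matters when comparing texts of different lengths.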

Session Twelve (3:00 – 4:30 PM): Woodruff Library 312

Hands-on work in RStudio (Long, Ravina), implementing:

  • Correlation
  • Clustering