On November 3, 2017, Emory Libraries hosted the HathiTrust Research Center’s “Digging Deeper, Reaching Further” workshop. Thirty-four librarians from across the Southeast attended this train-the-trainer workshop on text mining, which covered text analysis, distant reading, and non-consumptive research in five interactive modules taught over six hours.
The HTRC workshop focused on ways to use computers to discover patterns in digitized texts. In addition to general text analysis, HTRC supports non-consumptive research, in which researchers run computational analyses without access to a readable version of the text; this makes it easier to conduct text analysis on in-copyright works. We covered finding and retrieving digital texts, manipulating and analyzing textual data, and finally visualizing the results. During the workshop, we worked in the HathiTrust Research Center platform as well as in Python using PythonAnywhere. Attendees also had the opportunity to work with datasets from the HathiTrust Digital Library that the instructors had prepared.
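To give a flavor of the kind of hands-on Python exercise described above, here is a minimal sketch of one basic text-analysis step: reducing a passage to word counts. It is not taken from the workshop materials. The `token_counts` function and the sample sentence are hypothetical, and the tokenizing rule is a deliberately simple assumption.

```python
import re
from collections import Counter


def token_counts(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumption: a "word" is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample passage, not drawn from the HathiTrust collection.
page = "The whale surfaced. The crew watched the whale dive again."

counts = token_counts(page)

# Show the two most frequent words and how often each occurs.
print(counts.most_common(2))
```

Counts like these capture patterns in a text without reproducing it in readable form, which is the general idea behind non-consumptive research.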
This workshop was an excellent opportunity to bring together librarians from across the Atlanta area and nearby states. We had eight out-of-state participants, two from public libraries, and one from the Federal Reserve. Our three instructors came from the University of North Carolina at Chapel Hill and Indiana University. Participants had fun exploring the materials. Our subject librarians shared their views on the workshop:
Kim Collins, Art History/Classics Librarian and Research Engagement Services Leader – “Learning how to use the tools and data from HathiTrust Research Center to text mine, i.e., find associations and patterns, was both daunting and rewarding. Step-by-step hands-on exercises gave me an appreciation of the power of Python and the context to make suggestions to other Emory researchers interested in this type of digital scholarship.”
Ellen Ambrosone, the South Asian Studies and Religion Librarian – “The HathiTrust workshop was helpful and inspiring! The hands-on portion of the day gave me a much better understanding of the mechanics of text mining and made me think about the skills that I could acquire to better assist users who are exploring computational methods of research.”
Erica Bruchko, the African American Studies and U.S. History Librarian – “Over the past several years, Emory graduate students and faculty have expressed interest in text mining, especially how to acquire the data that they need to complete their text-mining projects. The HathiTrust workshop provided a great primer on how to identify open data sources. I also appreciated the opportunity to get hands-on experience cleaning and prepping data.”
As newly trained trainers, those of us who participated plan to bring these methods to our faculty and students in three ways. First, in either the spring or summer, we will host a workshop on HathiTrust and the HathiTrust Research Center for faculty and fellows at the Fox Center for Humanistic Inquiry. Second, we will launch Word Lab, a group for people interested in computational text analysis; eighteen participants have expressed interest in joining. Third, we will train other subject librarians, who will in turn share these methods with the faculty and students in their subject areas. We may also host an open workshop on some of these techniques in the future. In the meantime, if you are interested in learning more, you can contact Katie Rawson or Chella Vaidyanathan.
More information about the HathiTrust Research Center’s “Digging Deeper, Reaching Further” project is available at https://teach.htrc.illinois.edu/about-the-project/