Digital Humanities

Guides

Many researchers, however they engage with digital tools, have questions about citation management, mapping, and digitization. Resources to assist with many of these common issues can be found here, although this compilation is not exhaustive. University faculty and students can find further assistance, or help with any other research questions, from Jeffrey Tharsen, Carmen Caswell, and other campus units.

How can I digitize printed text?

Optical Character Recognition (OCR) is a technology that identifies text in digital images, such as a scan of a printed page. From these images, OCR software generates “machine-readable” text, that is, text that a computer can process. The text can then be copied into other programs and read by analytical tools.

Some image and document viewers, such as Adobe Reader or Adobe Acrobat, have basic OCR capabilities built in; however, many scholars may find that these are not accurate enough for their texts. ABBYY FineReader is a popular, more powerful OCR tool with support for a wider array of languages. Although the University does not offer discounted licenses for ABBYY, any faculty, student, or staff member can access it on computer terminals at the Research Computing Center and the Visual Resources Center. A free, open-source alternative is Tesseract, which offers a high degree of accuracy for over 100 languages. However, using it requires familiarity with your computer’s command line interface.
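
For readers who would rather script Tesseract than type commands by hand, the following is a minimal sketch in Python. It assumes Tesseract is already installed and available on the system PATH, and that the relevant language data (here "eng") is installed. The function names and the file name in the usage note are hypothetical and only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path: str, lang: str = "eng") -> list[str]:
    """Build the argument list for a Tesseract run that prints text to stdout."""
    # Passing "stdout" as the output base asks Tesseract to print the
    # recognized text instead of writing a file; -l selects the language.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on a single image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("Tesseract is not installed or not on the PATH")
    if not Path(image_path).is_file():
        raise FileNotFoundError(image_path)
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect the text rather than printing it
        text=True,            # decode output as text
        check=True,           # raise an error if Tesseract fails
    )
    return result.stdout
```

Calling `ocr_image("scanned_page.png")` on a real scan would return the recognized text as a string, which could then be saved or passed to other tools. Accuracy will still depend on the quality of the scan and the language data chosen.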

Some datasets (which may include text after OCR processing) contain errors and need cleaning before work can proceed. OpenRefine helps researchers assess and correct tabular data by sorting values and transforming many cells at once.
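
To give a sense of what this kind of cleaning involves, here is a rough Python sketch of one common approach: grouping values that differ only in case, punctuation, accents, or word order. It is a simplified approximation of the idea behind key-based clustering, not OpenRefine's own implementation, and the function names and sample values are invented for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a value to a normalized key so near-duplicates collide."""
    # Trim, lowercase, and decompose accented characters.
    normalized = unicodedata.normalize("NFKD", value.strip().lower())
    # Drop anything that is not plain ASCII (for example, combining accents).
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation, then sort and de-duplicate the remaining words.
    no_punct = ascii_only.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(no_punct.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups of 2 or more."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Invented sample data with inconsistent spellings of the same place.
places = ["Chichen Itzá", "chichen itza", "Itza, Chichen", "Yucatan"]
print(cluster(places))
```

Each group that comes back is a set of candidates for merging into a single spelling. A researcher would still need to review the groups by hand, since values can share a key without referring to the same thing.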

How can I start a text analysis project?

Some scholars may need to use the OCR tools discussed above to create their own corpus of texts for research. However, a number of large existing corpora are available to scholars. Major corpora in the English language include the Corpus of Contemporary American English, Early English Books Online, and Eighteenth Century Collections Online (access with your UChicago CNetID). The University of Chicago Library maintains a list of linguistic analysis collections for several other languages; browse the Library’s other guides for discipline-specific texts. Researchers with some familiarity with using APIs can obtain texts from the New York Times, Twitter, and the Digital Public Library of America.

ARTFL and the Textual Optics Lab offer a selection of analytics tools for use on several different large corpora of text; collections include texts in English, French, Chinese, German, Japanese, and Russian. The HathiTrust Research Center also provides analytics tools for its own collection of over 16 million fiction, nonfiction, and government volumes. JSTOR is currently piloting a similar service, Constellate, for its own collections.

Scholars can obtain a general overview of any text with Voyant Tools, an in-browser suite of tools for basic analysis; scholars can view word frequencies across the length of the text, co-occurrences with other words, and other metrics, as well as create simple visualizations. Lexos allows for further exploration of texts. Researchers who are ready to experiment with more advanced methods, such as natural language processing (NLP) or topic modeling, can find guides and tutorials at The Programming Historian. Among the helpful NLP toolsets are the Natural Language Toolkit, spaCy, and the Stanford Natural Language Processing Group’s software resources. Many scholars have used the MALLET package to experiment with topic modeling.
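
As a small illustration of the kind of measure described above, the following Python sketch counts word frequencies and shows how often a word appears in successive portions of a text. It uses only the standard library and a deliberately simple tokenizer; it is not how Voyant Tools or any other tool named here works internally, and the function names and sample sentence are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split text into lowercase words made of letters only."""
    # This simple pattern drops digits and punctuation, so a contraction
    # such as "don't" becomes two tokens ("don" and "t").
    return re.findall(r"[^\W\d_]+", text.lower())


def word_frequencies(text: str) -> Counter:
    """Count how many times each word occurs in the text."""
    return Counter(tokenize(text))


def frequency_by_segment(text: str, word: str, segments: int = 4) -> list[int]:
    """Count occurrences of a word in successive, roughly equal segments."""
    tokens = tokenize(text)
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    # Very short texts may yield fewer chunks than requested.
    chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    return [chunk.count(word.lower()) for chunk in chunks]


sample = "The temple stood above the village. The village lay below the temple walls."
print(word_frequencies(sample).most_common(3))
print(frequency_by_segment(sample, "village", segments=2))
```

Plotting the per-segment counts for a longer text gives a rough picture of where a term clusters, which is the idea behind viewing frequencies across the length of a text.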

What tools can create visual representations of my texts and other data?

As noted in the section above, Voyant Tools can give scholars high-level, graphical views of their texts. Palladio offers several different graphical interfaces for many different types of humanities data, including timelines, maps, and item browser views. For high-resolution graphs for presentations or publication, Tableau offers a robust suite of visualization styles. Scholars with experience in JavaScript may benefit from exploring the D3 library, which allows for the creation of many types of graphical visualizations of data, including some interactive graphs (for more resources on D3 see our Tutorials). For formatting research papers, including tables, University-affiliated researchers can use the LaTeX editor Overleaf.

For suggestions on creating maps and handling geographic data, see the following section.

How can I create maps for research, websites, or classroom use?

A wide variety of tools can assist scholars in digital map creation. Basic web-based maps can be created with Google Maps, but StoryMapJS and Carto may offer more options for interactive maps. ArcGIS Online is also available to all University faculty, staff, and students. In addition, ArcGIS StoryMaps combines interactive maps, text, images, and other content in a streamlined narrative platform (log in via the University’s ArcGIS Online portal, linked above, to access StoryMaps).

Creating, manipulating, and analyzing geographic data, however, often must be done with more robust tools. QGIS is an open-source tool, free of charge, for creating and working with spatial data. The Research Computing Center also offers licenses for ArcGIS for an annual fee. Data created with GIS tools can be imported into Google Earth for use with satellite imagery.
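
As a small example of preparing spatial data, the following Python sketch converts a table of named points with latitude and longitude columns into GeoJSON, an open format that GIS tools such as QGIS can open. The column names, function name, and sample coordinates are assumptions made for illustration; real data will need its own field names and checked coordinates.

```python
import csv
import io
import json


def rows_to_geojson(rows, lat_field="lat", lon_field="lon"):
    """Convert rows with latitude/longitude columns to a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        # Every column other than the coordinates is kept as an attribute.
        properties = {
            key: value for key, value in row.items()
            if key not in (lat_field, lon_field)
        }
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON lists longitude first, then latitude.
                "coordinates": [float(row[lon_field]), float(row[lat_field])],
            },
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Illustrative sample data; the coordinates are approximate placeholders.
sample_csv = "name,lat,lon\nSite A,41.79,-87.6\nSite B,20.68,-88.57\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))
collection = rows_to_geojson(rows)
print(json.dumps(collection, indent=2))
```

Saving the printed output to a file with a `.geojson` extension produces a layer that can be loaded into a GIS tool for styling and analysis.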

The University of Chicago Library maintains a thorough guide to geographic technologies and data sources.

How can I create an Omeka site? What other digital publishing platforms can I use?

The University of Chicago Library now has a subscription to the Omeka open-source digital publishing platform. Faculty, staff, and student researchers can showcase digital collections or build web exhibits, and instructors can request a site for classroom use. For more information, including how to request an Omeka site, see the University of Chicago Library guide.

Scalar is an alternative open-source tool for media-rich exhibitions or other digital scholarship. For faculty, students, or other University groups interested in creating an open-access journal, the Library also supports Open Journal Systems (OJS). OJS allows users to manage manuscript submission and the editing and review process, and it offers a platform for final publication.

What resources can help me incorporate digital humanities ideas and methods in my classes?

Students can gain experience in the process of creating digital archives and exhibits through the use of Omeka, as described above. Some projects, classes, or groups may also benefit from a Voices site, a WordPress-based platform offered by IT Services. The resources discussed in the mapping section above have also been implemented in coursework. In addition, students can collaborate on digital timelines with TimelineJS.

What tools can I use to keep my citations, sources, and research materials organized?

The University of Chicago Library maintains a subject guide with resources for citation management. Zotero is a popular, open-source tool for organizing and saving citations in a wide variety of forms, from PDFs to web pages. Users can sync their citations across devices and save sources automatically through their web browser. Students and faculty can also obtain a similar citation management tool, EndNote, at a discount through OnTheHub (students) or BuySite (faculty).

The Visual Resources Center also offers a guide to managing personal research archives, with a focus on images. This includes image management tools such as Tropy or ARIES. For assistance with storing and sharing data with the University’s Institutional Repository, and further help with data services and management, see the Center for Digital Scholarship.

Some researchers may need to manage large amounts of data or long-term research projects, and could benefit from the Online Cultural and Historical Research Environment (OCHRE).

Top: Digital reconstruction of “Raided Village” mural, Temple of the Warriors, Chichen Itza, by Magdalena Glotzer, AB ’19 (2019), based on a 1931 reconstruction by Ann Axtell Morris and diagrams from Morris, Earl Halstead. The Temple of the Warriors at Chichen Itzá, Yucatan. Washington: Carnegie Institution of Washington, 1931.