Gender and Big Data: Finding or Making Stereotypes?
In his book Macroanalysis, Matthew Jockers argues that we have reached a “tipping point”: now that so much data has been digitized, the techniques and methodologies used to explore big data (text mining, topic modeling, machine learning, named entity recognition, etc.) can be applied to literary history. Two problems confront digital literary historians of women writers: first, not enough women writers of the past have been digitized to participate visibly in the field of big data; second, those whose works have been digitized participate in propagating stereotypes about women and women writers except when analyzed at the micro level. Data scientists in the commercial world have worked on the problem of representing minorities “fairly” even when they are represented by only a small sample. In this paper, I discuss how we might use their insights to ensure that history isn’t just repeated, that we don’t begin literary history anew by excluding women writers from view yet again.
Laura Mandell is Director of the Initiative for Digital Humanities, Media, and Culture and Professor of English at Texas A&M University. She is the author of Breaking the Book: Print Humanities in the Digital Age (2015), Misogynous Economies: The Business of Literature in Eighteenth-Century Britain (1999), a Longman Cultural Edition of The Castle of Otranto and Man of Feeling, and numerous articles primarily about eighteenth-century women writers. Her article in New Literary History, “What Is the Matter? What Literary History Neither Hears Nor Sees,” describes how digital work can be used to conduct research into conceptions informing the writing and printing of eighteenth-century poetry. She is Project Director of the Poetess Archive, an online scholarly edition and database of women poets, 1750-1900, Director of 18thConnect, and Director of ARC, the Advanced Research Consortium overseeing NINES, 18thConnect, and MESA.