[by: Will Cunningham]
Matthew Jockers, Matthew Sag, and Jason Schultz’s article “Don’t let copyright block data mining,” which appeared in Nature magazine in 2012, brings to light perhaps one of the biggest roadblocks in the field of Digital Humanities: copyright law. To put the issue simply, unless a novel was published before 1923, the copyright has expired, or the copyright owner agrees (on an individual basis) to make the content available, then some of the most important tools that DH has to offer are off the table.
Jockers, Sag, and Schultz offer a brief summary of the class action lawsuit brought by the Authors Guild against Google over its massive collection of scanned novels, using it as a case study in the complex issues surrounding copyright and DH. This lawsuit, while intended to protect both the integrity and economic viability of literary production, has also closed the door on “non-expressive” uses of literature in the field of DH: data and text mining, geo-surveys, deep word counts, etc. As Jockers et al. point out, the goal of DH is not to republish works or even to quote from texts; rather, DH scholars “simply want to extract information from and about them to sift out trends and patterns.”
This obstacle is especially pressing for African American literary studies, and for the study of the African American novel in particular. While HBW boasts a nearly exhaustive collection of black novels published before the 1923 cut-off, one might argue that an extensive DH project that omits the vast majority of novels produced by black authors after 1923 is hardly extensive, or even useful. In fact, this is a roadblock we are dealing with now: how do we build a usable database for non-expressive purposes and exclude much of the Harlem Renaissance? The Black Power Movement? The rise of Black Feminist Literature? How does one build a database without Toni Morrison?
These questions are central to the development of both the field of DH and African American Literature. And as Jockers et al. note, an author’s rights do deserve protection…but digitizing books for non-expressive uses is a separate issue. The slow, ponderous machinery of copyright law must at some point catch up to the fast-accelerating pace of academic studies in general, and the field of Digital Humanities in particular.