Monday, March 24, 2014

On Keyword Searches and Indexed Data


[By: Will Cunningham]

Janine Solberg’s article “Googling the Archive: Digital Tools and the Practice of History” (2012) reflects on the epistemological implications of using digital archives for research. While the thrust of Solberg’s inquiry concerns the positionality of the researcher who uses these tools, and the ways in which researchers can actively shape how the tools develop, I found her brief distinction between keyword searches and content searches valuable to our larger digital project at HBW.

Solberg distinguishes between searches of “fixed descriptions,” which rely on metadata produced by a scholar, and “content” searches, which are more open-ended searches of the text of the original source itself. This distinction sheds light on a set of difficult questions that come with building a useable database: What are the goals? Who are the intended users? How much access can be granted? How does one determine the limit (or extent) of indexed, searchable features? These questions are, as Solberg notes, crucial for “understanding something about how search technologies work…for our research, for planning, and building [the] discipline.”
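
To make the distinction concrete, here is a minimal Python sketch. It assumes a tiny, invented collection: the record structure, the sample data, and both function names are hypothetical and do not describe HBW's actual database. The first function searches only the scholar-assigned metadata (a “fixed description” search), while the second searches the words of the text itself (a “content” search).

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One hypothetical archive item: scholar-assigned keywords plus full text."""
    title: str
    keywords: set = field(default_factory=set)
    text: str = ""


def search_fixed_descriptions(records, term):
    """Match only against the scholar-assigned keywords (metadata)."""
    term = term.lower()
    return [r.title for r in records if term in {k.lower() for k in r.keywords}]


def search_content(records, term):
    """Match against the individual words of the source text itself."""
    term = term.lower()
    return [r.title for r in records if term in r.text.lower().split()]


# Invented sample data, for illustration only.
RECORDS = [
    Record("Novel A", {"migration", "family"}, "the river ran past the old church"),
    Record("Novel B", {"religion"}, "they left home and the family followed north"),
]
```

In this toy collection, a search for “family” finds Novel A through its metadata and Novel B through its content. Neither mode alone surfaces both, which is the practical case for supporting the two together.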

So what is our goal? At this point, my partner in this project (Kenton Rambsy) and I have spent a great deal of time focusing on the “fixed descriptions” of the African American literary canon. A detailed taxonomy of useable indexes will allow users to find items in our database, but that is an esoteric practice. That is to say, this is fine if our goal is only to make this tool available to researchers already “in the know” about the canon; however, a database that supports both kinds of search, open content searches and indexed searches, opens the doors to a much wider audience.

This is problematic, of course. As I noted in my last post, copyright law has blocked access to much of the literature published after 1923. Our question remains, though, of how to bring these two modes of research together. With HBW’s rich source of pre-1923 documents, this is an exciting opportunity that could potentially change the way we think about the canon as a whole.

Wednesday, March 19, 2014

On the Importance of Open-Source Literature


[By: Will Cunningham]

Matthew Jockers, Matthew Sag, and Jason Schultz’s article “Don’t let copyright block data mining,” which appeared in Nature in 2012, brings to light perhaps one of the biggest roadblocks in the field of Digital Humanities: copyright law. To put the issue simply, unless a novel was published before 1923, the copyright has expired, or the copyright owner agrees (on an individual basis) to make the content available, some of the most important tools that DH has to offer are off the table.

Jockers, Sag, and Schultz offer a brief summary of the class action lawsuit brought by the Authors Guild against Google’s massive collection of scanned novels as a case study in the complex issues surrounding copyright and DH. This lawsuit, while intended to protect both the integrity and economic viability of literary production, has also closed the door on “non-expressive” uses of literature in the field of DH: data and text mining, geo-surveys, deep word counts, etc. As Jockers et al. point out, the goal of DH is not to republish works or even to quote from texts; rather, DH scholars “simply want to extract information from and about them to sift out trends and patterns.”
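
A “non-expressive” use can be as simple as counting words. The short Python sketch below is only an illustration under stated assumptions: the function name is hypothetical and the sample sentence is invented rather than drawn from any novel. It extracts word frequencies, which say something about a text without reproducing it in readable form.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=3):
    """Return the top_n most frequent words, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


# Invented sample sentence, not taken from any published work.
SAMPLE = "The road was long, and the road was dark; the night was longer."
```

Calling `word_frequencies(SAMPLE)` yields counts such as three occurrences each of “the” and “was”: information about the text, not the text itself.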

This obstacle is especially pressing for African American literary studies (and for the study of the African American novel in particular). While HBW boasts a nearly exhaustive collection of the black novels published before the 1923 cut-off, an extensive DH project that omits the vast majority of novels produced by black authors after 1923 is hardly extensive or even useful, one might argue. In fact, this is a roadblock we are dealing with now: how do we build a useable database for non-expressive purposes and exclude much of the Harlem Renaissance? The Black Power Movement? The rise of Black Feminist Literature? How does one build a database without Toni Morrison?

These questions are central to the development of both the field of DH and African American Literature. And as Jockers et al. note, an author’s rights do deserve protection, but digitizing books for non-expressive uses is a separate issue. Slow-moving copyright law must at some point catch up with the accelerating pace of academic study in general, and of the field of Digital Humanities in particular.

Wednesday, March 12, 2014

On the State of Digital Humanities and Black Literature


[By: Will Cunningham]

As part of an ongoing project here at HBW, in the coming months I will be reviewing a large swath of publications related to the field of Digital Humanities. While I hope to delve into many of the more technical aspects of this field, I think a broader survey of the state of the field is an appropriate place to start. Alan Liu’s article, “The State of the Digital Humanities: A Report and a Critique,” serves just that purpose. Liu’s article surveys the historical, technological, and social rise of the field of DH, focusing on the ongoing development of the varying analytic tools used in the field.

Liu argues that DH is at a tipping point: it is a field poised to “not just facilitate the work of the humanities but to represent the state of the humanities at large in its changing relation to higher education,” and one that “serves as an allegory of the social, economic, political and cultural self-image of institutions.” It cannot be stated often enough: DH is the future of the humanities. It represents a broad space of expansion with exciting opportunities. But I cannot help but view Liu’s last statement with skepticism. If the field of DH does stand proxy for the “social, economic, political and cultural self-image of institutions,” then where is the representation of African American Literature? Surely it is somewhere to be found?

Simply put, it is not; or rather, if it does exist, it exists on the periphery of the field. Liu repeatedly references the “high points” of the field of DH: The William Blake Archive, Romantic Circles, Rossetti Archive, The Valley of the Shadow, Walt Whitman Archive, and Women Writers Project. While these archives represent both the germinal roots and the embodiment of the potential of the field, African American Literature is notably absent from them.

This is where HBW steps in. As a part of our ongoing commitment to the recovery, preservation, and study of African American Literature, we are gearing up to fill this void. Again, if DH does truly stand as a proxy for the institution, then African American Lit can and should be represented in this growing field. It should be an integral part of the image of the institution at large.