How We Find What We Find

My attempt to use these tools illustrates methodological and theoretical issues specific to this project as well as larger questions about the role of the digital humanities. In order to obtain truly significant findings, to either reject or fail to reject the null hypothesis, I would have to make significant modifications to the framework of this study. First, while the meanings of “custom” and “law” seem self-evident, it does not follow that these terms themselves would be used in fiction to refer to custom and the law, respectively. That is, the use of the word “custom” is not a necessary marker of the custom sketch. Nor is the use of the term “law” clear evidence of the rise of the law, though its meaning is more restricted, and an analysis in AntConc shows that we can more readily trust the usage of “law” as we mean it than the usage of “custom.”

Figure 1: An AntConc concordance output based on the search term “law.”

As seen above, “law” frequently co-occurs with the words “against,” as in prohibition, and “allows,” as in permission. The meaning here is quite clear: law as a regulatory force. The snapshot above is by no means exhaustive. How does “custom” compare?

Figure 2: An AntConc concordance output based on the search term “custom.”

The meanings of “custom” in the image seem to center on personal habit, though “custom” here is a more intense form of personal habit. At other times, “custom” is used metaphorically (“poetic custom”). Sometimes “custom” suggests regulation (“custom at sea”), but not quite in the more restrictive and universal sense of law.

“Manner,” as a proxy for “custom,” is also difficult to pin down, as the picture below shows.

Figure 3: An AntConc concordance output based on the search term “manner.”

“Manner” seems to be used often as “way” (“unexpected manner,” “artistic manner”), though at times it is used to mean “bearing.” To be sure, bearing suggests a certain social performance that is connected to social norms, but overall, “manner” is used in a variety of ways, literal and metaphoric, which cannot easily be interpreted to signal custom with a high degree of certainty.
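
For readers unfamiliar with concordance output, the following is a minimal sketch of a keyword-in-context (KWIC) search of the kind shown in the figures. It is not AntConc's implementation. The function name `kwic`, the `width` parameter, and the sample sentence are all hypothetical and invented for illustration; none of them come from the corpus.

```python
import re

def kwic(text, term, width=4):
    """Return (left context, keyword, right context) for each hit of `term`.

    Hypothetical helper: splits the text into lowercase word tokens and keeps
    up to `width` tokens on either side of every exact match.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Invented sample text, not drawn from the corpus under study.
sample = (
    "There is a law against it, and the law allows no exception. "
    "It was his custom to walk at dusk, in his usual manner."
)

for left, keyword, right in kwic(sample, "law"):
    print(f"{left:>25} | {keyword} | {right}")
```

Reading down the right-hand column of such output is what makes a pattern like “law” followed by “against” or “allows” visible; the interpretation of each hit still falls to the reader.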

My point is that the terms I used do not seem particularly useful in answering the research question I’ve posed. “Law” seems to work marginally better than “custom” or “manner,” and these latter two don’t seem to work very well at all. This is surely because analyzing custom sketches in relation to the law does not entail a like-to-like comparison. That is, law is much more finite in its meaning, while the custom sketch is a species of description that cannot be reduced to a word, a color, a metaphor, or some other trope. What is needed, therefore, is a term that can stand in for custom or the custom sketch in a statistically meaningful way. To arrive at such a term, it would be useful to have two teams of researchers, one of close readers and another of digital humanists, that would compose a very specific definition of the custom sketch as it manifests in the United States (recall that most of my experience is with the Latin American genre). With this definition, the team of close readers could look carefully at texts that fit the definition since, to date, it seems that only a human reader can identify when a text enters the custom sketch mode. They would be looking for terms or tropes that repeat across texts. At the same time, the team of digital humanists could attempt to find appropriate proxy terms for custom through machine learning by training a program to find words or phrases that correspond with the agreed-upon definition of custom. The two teams would then compare their findings and select appropriate proxy terms (ideally, each team would independently come up with some of the same terms).
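
One way the computational side of such a collaboration might begin is sketched below. This is not a trained machine learning model; it swaps in a much simpler frequency comparison, assuming the close readers have already sorted passages into those that fit the custom sketch definition and those that do not. The function names, the smoothing value, and the toy passages are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a deliberately simple, hypothetical tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def log_ratio(sketch_passages, other_passages, smoothing=1.0):
    """Score each word by the log of its smoothed relative frequency in
    passages labeled as custom sketches versus all other passages.

    Positive scores lean toward the custom-sketch passages, negative scores
    toward the others. Add-one style smoothing avoids division by zero.
    """
    a = Counter(t for p in sketch_passages for t in tokenize(p))
    b = Counter(t for p in other_passages for t in tokenize(p))
    vocab = set(a) | set(b)
    total_a = sum(a.values()) + smoothing * len(vocab)
    total_b = sum(b.values()) + smoothing * len(vocab)
    return {
        w: math.log(((a[w] + smoothing) / total_a) / ((b[w] + smoothing) / total_b))
        for w in vocab
    }

# Invented toy passages standing in for human-labeled excerpts.
sketch = [
    "the villagers gather at the fiesta every harvest",
    "at the fiesta the elders dance",
]
other = [
    "the court ruled against the merchant",
    "the merchant appealed to the court",
]

scores = log_ratio(sketch, other)
for word in sorted(scores, key=scores.get, reverse=True)[:5]:
    print(f"{word:>10} {scores[word]:+.2f}")
```

Words with high positive scores would be candidates for proxy terms, to be checked against what the close readers found independently. With a real corpus the labels, the tokenizer, and the scoring method would all need far more care than this sketch gives them.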

Drilling down on more precise terminology would greatly improve the validity of the findings. At the very least, this would avoid the “shot in the dark” style I used to analyze the texts under study. More precise terminology would likely lead to better results and to appropriate revisions of the research question. In literary studies, we adopt epistemological paths that assume that reality is experienced through language and other sign systems, that there are enduring narratives and discourses that shape our world. For us, the world is subjective (which does not mean that we disavow the facticity of reality), and we explore topics from a subjectivist, particularist mode. But statistical and computational methods rely on epistemologies that equate reality with the purely observable; empirical fact is real, with little to no room for interpretation. It should not be surprising, then, that an appropriate research question for computational literary studies would be quite different from a research question derived through close reading. In a very real sense, I need to retrain in more objective methodologies, not necessarily to properly understand the results (though that can’t hurt) but rather, and more importantly, to learn how to ask questions that would more easily coincide with the methods in use. I can hear a small voice in my mind whispering that this would flatten out the nuance of literature, cultural production, communication. That is an important consideration. But my sense is that I don’t yet have the appropriate training to imagine a valid question that does not collapse the complexity of culture into a simple binary.

My original project was to be a sort of genealogy of Caribbean musical forms through network analysis. For someone like me, the elucidation of relationships through networks, some obvious and some until now invisible, seems like a happy medium. Relationships within a network exert varying degrees of pressure and can be described and analyzed in terms of shape, which would seem like an ideal way to be grounded in the observable while making space for myriad configurations of relationships.

In other words, I am trying to resolve the incongruities between an objectivist mode and a subjectivist one or, more familiarly, a quantitative method and a qualitative one. This is an old tension with a simple, yet messy and complicated, answer: why not both? Indeed, viewed as a dialectic, it would seem quite productive to oscillate between close reading and distant reading, between massive corpora and esoteric quotes, between a global view and a granular one. The problem is not the idea, or better yet the impulse, to bring these methods together but how to do so. Some of the issues are theoretical (can we really combine these methods? how successful has that been in other fields, e.g. the social sciences?), while others are practical (what is the role and scope of one researcher? how can communities of practice between the two “camps” come together to approach a research topic?). I started in a rather naïve mode, and I am ending in a, well, somewhat worse one: idealism. My sense is that putting together teams of researchers that would agree, if not on an idea, then on a principled approach would yield knowledge and tools that could produce a quantum leap in what we know and in how we know.

For a bibliography and a complete list of texts in my corpus, please visit the about page for this project.

