Refashioning. That’s the thread that ties together the six blog posts that I have listed below. Four of these posts are about my own intellectual process, while the other two seem to be about reviews of critical digital humanities articles. However, these review entries are also about my own intellectual refashioning, a process which has been long and dialectical (using this term sounds too on the nose…but I believe it’s accurate).
The first and third entries are squarely about me: in the first post, I perform a kind of archeology of my intellectual formation and my current preoccupations, while the third selection is a rather awkward attempt to push myself in the direction I want to go. The fifth entry shows how I land on a research topic that is exciting when I am not pushing myself in any particular way; it shows how if I just let things happen, the right kinds of questions and ideas will present themselves. The last entry is about the ups and downs of the research process, specifically how I had to abandon an idea that seemed extremely exciting for an idea that is intellectually interesting but somewhat inert (inert because it was a “continuation” of a dissertation that I was interested in but had no love for). The two review entries are about me because they exemplify one moment in which I quite embraced digital humanities (Indexes) and another in which I questioned not the relevance of digital humanities but its relevance for me (Computational).
The refashioning thread is clearly evident in Beyond Now and Genealogies. In the former, I finally grasped something that I hadn’t until then: even when I thought I had been pursuing topics that were unique or new, I was opting for topics that were safe or respectable. This is a cliché…but clichés are clichés because they’re true… While in Beyond Now I was grasping at straws (scraping Twitter data), there’s an energy here, a declaration of independence that was…just fun. In Genealogies, that independence sparked into a project that was truly unique: intellectually interesting but that also filled me with wonder; the project also leaned into what I think, for me, is the best digital humanities approach: networks. Looking at these two entries, then, as the “center” of my intellectual “task” over the last several months makes plain to me how my sense of adventure and risk shifts to safety and comfort and back again depending on the moment. There is no way to “resolve” this “dialectic.” I think the trick is to find a way to use it, to paraphrase Kant, purposely unpurposefully. Put another way, I can’t synthesize the two; I have to learn to iteratively go back and forth and trust that I will develop new knowledge that can contribute something meaningful to the world.
So, basically, I’ve ended up…in another cliché really, this time from RuPaul’s Drag Race: learn to be me and bring that into all of the challenges. (Or, be yourself and that will be enough.) So, kind of trite…but also, kind of true.
What happened? I discovered two important obstacles, both having to do with technical expertise (or lack thereof). It turns out that it is very, very difficult to use 21st century methods…when you don’t have 21st century technical skills. Shocker, right?
To work on music, I wanted to scrape data from Spotify, which is really easy–if you know Python and know how to use Spotify’s API (application programming interface). I had neither skill. But I was intent on trying (I’m, like, really smart). I found one video that seemed particularly helpful; the video goes through the process of getting an artist’s album art from Spotify by scraping the web page source code (not the API). This video was helpful in introducing me to Visual Studio Code, which is allegedly used by everyone (I say allegedly because I truly don’t know–I’m not throwing shade). I’ve played around with Visual Studio Code, and it does feel like an easy-to-use interface, even for a non-coder like me. I also learned some Python commands.
But I couldn’t really figure out how to get what I needed, and honestly not only scraping web source code but parsing it was quite beyond me, even with borrowed code. So, I did more searches and found that there were great tools that could be used through a combination of Python and Spotify’s API. But I also found that it’s very difficult to follow many articles that pretend to show you how to use these tools (it’s really easy, you see). My sense is that the writers of these articles really believed they were being very clear about all of the steps (it also seemed that way when I read the articles as opposed to trying to replicate what they showed), but in my opinion, those articles are very writerly (or writer-focused) texts, not readerly (or reader-focused) texts. And hey, it makes sense: it is well known that once you cross an educational threshold, it can be very difficult to explain the concept or skill to someone who is learning it.
I resorted to taking a LinkedIn Learning course on Python, which is useful for understanding the basics of the language. But the class is several hours long, and I don’t really have the time for that. The parts that I did do, however, were enough to help me decipher some of the implicit instructions in some of these articles (e.g., to install a library or module for Python 3, you may have to use the command pip3, not pip, depending on how your machine is set up). But it wasn’t enough to get me anywhere. I started to think about pivoting to another project (wait for it…), but also kept tooling around with Python. As I started to madly look about for a new project (could I look at Latin American science fiction? what about Marvel superhero comics? what about the posthuman in Battlestar Galactica?), I found a website that promised to scrape the data for me! Yay! It was very expensive. Boo! But there was a free trial. Yay! The free trial was very much a trap, like most free trials. The website did scrape the Spotify data through the API, but it only returned 10 rows at a time. And the queries I ran came back with pretty much the same 10 tracks/artists, which really wouldn’t work for the project.
But research, Reader, is a team sport. A good friend who is working more and more with DH tools called me (he got tired, I think, of my frantic, desperate texts). That call coincided with one last-ditch effort to follow the instructions in an article, which I used to assemble a program that would scrape the data.
As I answered my friend’s phone call, I literally hit the “run” button on VS Code…and got my dataset.
While the dataset was legit, it didn’t have the information I needed, not for what I had set out to do. I meant to do, again, a kind of genealogy by mapping out network relationships between these artists and others outside the genre. That means that I would need to add sampling information to this dataset, which I don’t quite know how to do. My friend suggested TaskRabbit, but that seemed like adding pieces to this project that I just didn’t have the time for.
My friend and I kept going back and forth on discussing the feasibility of this music project and a completely new one. He mentioned Project Gutenberg, which has lots and lots of texts in the public domain. He specifically referenced nineteenth century novels (he knows that this is a strength of mine) and threw out “Melville.” Which made me think of Hawthorne. Which made me think of the last few pages of the last chapter of my dissertation. Which sort of speculated on a parallel between the dynamics of the custom sketch and the form of the novel (the bulk of my dissertation) and the dynamics between social custom or norms and the law. And, before I knew it, I couldn’t reach the escape velocity to resist the pull of my dissertation. In fact, I sort of dove right in. And, I’m either gonna go through the 1,000 black holes or slingshot my way around them never to visit that galaxy again.
As of now, I have about 30 texts I downloaded from Project Gutenberg that I will be analyzing through a few DH tools (you can see my proposal here). Even though I have a long road ahead of me to see what I can come up with, it’s good to know that I have a corpus–and that there’s a there, though what it is, I can’t tell yet.
As you can see from the image above, there’s lots to be done.
This collocation analysis is very suggestive. The term I searched for was “custom,” and here are several words that point to the law (e.g., clerkships, sanctions, authorises, archive, etc.). There’s also “anathematise,” which I don’t think means anything except: God bless nineteenth century writers.
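For readers who, like me, want to see what a collocation query is doing under the hood, here is a minimal sketch in Python. It only counts which words fall within a few tokens of a search term. The function name `collocates`, the window size, and the sample sentence are all my own inventions for illustration (the sentence is not from the corpus), and tools like AntConc typically rank collocates with statistical measures that this sketch does not attempt.

```python
import re
from collections import Counter


def collocates(text, node, window=5):
    """Count the words that appear within `window` tokens of `node`."""
    # Lowercase and keep only runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the search term's own position
                counts[tokens[j]] += 1
    return counts


# Invented sample sentence, for illustration only.
sample = ("It was the custom of the town that the law should sanction "
          "what custom had long allowed.")
print(collocates(sample, "custom", window=3).most_common(5))
```

Raw counts like these will be dominated by common words such as “the,” which is one reason real tools apply a stop list or an association measure before showing results.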
This concordance, again run on “custom,” also seems promising (and intelligible in ways that statistical output, frankly, isn’t for me).
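A concordance is, at bottom, a keyword-in-context listing, and it can be sketched in a few lines of Python. This is only an approximation of what a tool like AntConc displays: the function name `concordance`, the context width, and the sample sentence are assumptions of mine, not anything taken from the tool or from my corpus.

```python
import re


def concordance(text, node, width=30):
    """Return one keyword-in-context line per occurrence of `node`."""
    # Whole-word, case-insensitive match, so "custom" does not hit "customs".
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sample text, for illustration only.
sample = "It was the custom of the town. Custom, not law, decided the matter."
for line in concordance(sample, "custom", width=20):
    print(line)
```

Because each hit stays inside its sentence, the output remains readable as prose, which is exactly why I find it more intelligible than a table of statistics.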
I understand that the length of this post is not ideal. But I wanted to show the real vagaries of research, which operate at a resource level (what can I study?), an intellectual/conceptual level (what questions can/should I ask?), and a deeply personal level (I will study the present. I am studying the past). In other words: Research. It’s a journey.
In 2012, at the end of a dissertation regarding the relationship between the literary custom sketch and the novel in the nineteenth century, I speculated that “the struggle between local laws and national laws only mask the struggle between customs and the law” (134). My larger point was that, in the context of maturing countries and societies in the Americas, appeals to the “tradition” articulated in custom sketches, to stories that reinforced social norms, signaled a new national (literally postcolonial) world in which the state would legitimate itself through written laws (e.g., national constitutions), often at the expense of local customs and, arguably, local laws. That hunch stemmed from my analysis of Cecilia Valdés and Sab, two antislavery Cuban novels; Quincas Borba, a novel about a rather mediocre Brazilian gentleman who convinces himself he is Napoleon III; and The Scarlet Letter. My proposal for the present project is to more carefully explore that hypothesis through computational literary studies. The main research question driving this project is this: to what extent does mid-nineteenth century fiction deploy custom, tradition, or social norms vis-a-vis the law, particularly national law? Can we point to a particular dynamic between norms and the law in this fiction? Are these concepts pitted against each other (never, sometimes, all of the time)? Or are they mutually reinforcing (never, sometimes, all of the time)? Are they figured as an evolution or, perhaps, a mere transition?
These overall questions are complicated, but this project is a necessarily modest one, a proof of concept. For this reason, I am focusing on readily available texts in the public domain, specifically texts that can be found on Project Gutenberg. I have created a corpus composed of texts written by Harriet Beecher Stowe, Charles Chesnutt, Nathaniel Hawthorne, Herman Melville, and Edgar Allan Poe. These authors were selected not only because of their prominence in nineteenth century U.S. literature and their availability but also because they are each well known writers of custom sketches and/or novels that described social life in the United States of their time.
In order to investigate the relationship between norms and the law, I will use the digital tools Voyant and AntConc to perform frequency, collocation, word list, and keyword queries on the corpus using terms closely related to norms and the law (e.g., custom, law, tradition, norm, taboo, outlaw, church, ancestor, strange, queer, pariah, etc.). My hope is that these tools will point me to a statistically significant relationship between these terms and/or lead me to other terms that may elucidate this relationship.
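To make the frequency step concrete, here is a rough Python sketch of counting a handful of seed terms and scaling the counts per 10,000 words so that texts of different lengths can be compared. The function name `relative_frequencies`, the shortened term list, and the per-10,000 scaling are my own assumptions, not features of Voyant or AntConc. The sketch matches exact word forms only (so “customs” and “laws” are not counted) and says nothing about statistical significance.

```python
import re
from collections import Counter

# A shortened, illustrative subset of the seed terms named above.
SEED_TERMS = ["custom", "law", "tradition", "norm", "taboo", "outlaw"]


def relative_frequencies(text, terms=SEED_TERMS, per=10_000):
    """Return each term's frequency per `per` tokens of `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on an empty text
    return {term: counts[term] * per / total for term in terms}


# Invented sample text, for illustration only.
sample = "The law of the land gave way to custom, and custom to law."
print(relative_frequencies(sample))
```

Running the same function over each text in the corpus would give a simple table of term rates by author or title, a starting point for deciding which terms deserve a closer collocation or concordance look.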
To the extent that I can use the word thesis, what I seek to show is that mid-nineteenth-century fiction alerts us to the waning of custom as an organizing principle for society and the rise of the law, particularly federal law (I’m explicitly thinking here of the Fugitive Slave Act of 1850 as one example). The stakes of this hypothesis are manifold. Such a finding would show that the fiction of the time was registering a paradigmatic shift not only in how society was organizing itself but also in how national law grappled with local custom and law for the supremacy promised by the Constitution. But it would also reveal, and restore to view, the continuing importance of custom, which is sometimes at odds with the law and at other times necessary for the law’s functioning (e.g., the Constitution is merely a blueprint for a government system; it requires norms, here explicit as well as tacit agreements between people, to be implemented and maintained).
Through Voyant and AntConc (as well as other tools that may be useful), I will create data visualizations that will show my findings, which I will assemble into an interpretive essay. This essay will be posted to this website, which is built on the WordPress platform and readily supports images (visualizations) and videos (should I find useful public domain videos). While I would like the text to be accessible to “casual readers,” my audience is very much experts on nineteenth century fiction and/or the law as well as the community of practice of digital humanists.
Jimenez, Javier. Regarding American Customs. University of California, Berkeley, PhD dissertation.
Bakhtin, Mikhail M. “From the Prehistory of Novelistic Discourse.” Trans. Caryl Emerson and Michael Holquist. The Dialogic Imagination. Ed. Michael Holquist. Austin: University of Texas Press, 2002. 41-83.
Hale, Dorothy J. “Aesthetics and the New Ethics: Theorizing the Novel in the Twenty-First Century.” PMLA 124.3 (2009): 896-905. Web. 30 Sept. 2009.
Harpham, Geoffrey G. Shadows of Ethics. Durham, N.C.: Duke University Press, 1999.
Korobkin, Laura. “The Scarlet Letter of the Law: Hawthorne and Criminal Justice.” Novel. 30.2 (Winter 1997): 193-217.
Nussbaum, Martha C. Poetic Justice: The Literary Imagination and Public Life. Boston, Mass.: Beacon Press, 1995.
There’s nothing quite like a graduate seminar for encouraging you to like one approach one week, see it as intellectually bankrupt the next one, only to be redeemed weeks later. I have academic whiplash…or, better yet, I am currently the intellectual kombucha girl.
As a recent post suggests, I remain skeptical of computational literary studies for three somewhat related reasons: 1) the ontological and epistemological foundations (realist and objectivist, respectively) on which computational methods stand do not easily jibe, if at all, with those of literary criticism and most humanistic disciplines, since these fields tend toward nominalist ontologies and subjectivist epistemologies (we have an oil-and-water situation); 2) as Nan Z. Da argues, computational literary criticism seems to entail a great spectacle around…counting words; and 3) following Nan Z. Da, these methods seem, at worst, to confirm the obvious and, at best, to offer what Franco Moretti describes in “Network Theory, Plot Analysis”: “Corroboration, improvement, and discovery. Eventually, the day for theory-building will also come” (9).
But I have to admit that Moretti’s use of networks to analyze Hamlet has me quite intrigued. To be sure, I think Moretti’s network analysis is an example of corroboration; the network did not reveal new relationships. Indeed, seeing the network and reading the discussion of the characters’ relationships triggered a kind of memory of the relationships for me. It’s true that I had never thought of the relationships in quite the terms Moretti described, but it is also true that his description of the relationships confirmed an “intuition” about them. Note here that I am referring to the relationships established by the network, not the argument about the use of language by those inside the court and those outside (though I have to say that it is well known that courtly characters use more refined language and speak in poetic verses as opposed to “commoners”) nor his interpretation that the “outside characters” seem to constitute the rise of a state bureaucracy (a fascinating interpretation), a phenomenon well on its way in Jacobean England. Moretti admits that these insights have nothing to do with the network analysis. However, using network analysis to visualize the relationships all at once is useful. While the analysis does not discover new relationships, it does more explicitly reveal something that has always been in plain sight but occluded by the temporality of the plot. In some ways, this approach suggests a return to structure in literary criticism, just of a structure we could intuit but not describe.
Moretti’s network analysis, taken together with Scott Selisker’s argument in “The Bechdel Test and the Social Form of Character Networks” that social network analysis can help us move away from characters and their agency (or lack thereof) to the network (or networks) in which they are embedded, may help us better understand “that the agency we should look for in texts and in the world…is fundamentally social, and thus fundamentally networked” (519). This represents a new way of thinking about a “social text,” providing a way for us to consider the social connections represented in texts in a much more holistic and comprehensive way (i.e., from social milieu to social network), which I have to admit is pretty exciting.
In my own upcoming project on new manifestations of Spanish Caribbean musical forms, I plan to use network analysis, and both Moretti’s and Selisker’s approaches suggest that even corroborating what is known about the creation of these new forms could be a useful tool for understanding not only the relationships between musical genres but also the relationship between the United States and the Spanish Caribbean, their cultures and their peoples.
Yo sé que a ti te gusta el pop-rock latino Pero este reguetón se te mete por los intestinos Por debajo de la falda como un submarino Y te saca lo de indio taíno … No importa si eres rapera o eres hippie Si eres de Bayamón o de Guaynabo City Conmigo no te pongas picky Esto es hasta abajo, cójele el triqui Esto es fácil, esto es un mamey ¿Qué importa si te gusta Green Day? ¿Qué importa si te gusta Coldplay? Esto es directo sin parar, one way “Atrevete-te-te,” Calle 13
Back in 2005, when Calle 13’s “Atrévete-te-te” came out, the song was an instant sensation, reverberating through both the Spanish language and English language worlds. After all, the song is, as the kids say, a bop. Musically, the track seems to be composed of various Latin American musical genres, such as salsa and cumbia (notably music to dance to), and is sung, or rather rapped/spoken, in Spanish. While both the music and style of rapping blend well together, even lyrically, there is no question that the rhythm in which the lyrics are sung/rapped recalls hip hop. Moreover, the playfulness of the lyrics is reminiscent of many 80s and 90s hip hop tracks. Though Calle 13’s musical evolution after 2005 resists the reggaetón label, the song can be properly classified as reggaetón, though it does not use the reggae rhythms and word play that were most popular in reggaetón of that era.
No one who listens to or dances to reggaetón, never mind those who make it, is ignorant of the genre’s constitutive mixture of both American hip hop and Latin American or, more specifically, Caribbean musical forms. The mixture is in evidence both in the music and the language, by which I literally mean the use of Spanish, English, Spanglish, or a combination of all three. But “Atrévete-te-te” seems to be keenly aware if not of its own remixed origins then certainly of the plurality of musical forms that its audience, ostensibly a Puerto Rican audience either on the island or the United States, listens to. The song doesn’t care about your personal style or identity (“No importa si eres rapera o eres hippie”), about where you’re from (“Si eres de Bayamón o de Guaynabo City”)–though importantly these are Puerto Rican places, nor does the song care whether you like Green Day or Coldplay. What matters is that “este reguetón se te mete por los intestinos…y te saca lo de indio taíno/this reggaeton gets into your guts…and brings forth the Indian, the Taíno.” Notwithstanding, as in the Dominican Republic, the national narrative of “Indianness” (which has a fraught and problematic history and present), the lyrics are telling you (specifically, a Puerto Rican woman) that this reggaetón will bring out your true, authentic self.
Authenticity is often regarded as being true to oneself or one’s community. That is, authenticity seems to be apparent when there is a great deal of (or complete) coincidence between your true, inner self and your outer, social self. And yes, let’s point out that the entire “definition” as well as its terms are if not dubious, then rather unstable or mobile. After all, what is your true inner self? And even if you know, how can someone else know what that is, even if you try to tell them? Likewise, what is your outer, social self or persona? Doesn’t that change given the context? Not to mention that the idea of the self as stable or unchanging, well, seems to be an unfortunate myth.
Why add authenticity into this mix? Because mixture or combining are often seen as adulterating a substance–or a culture, even when identity is articulated as mixture as in, for example, the national mestizajes of most Latin American countries, where a “mestizo” nation is lauded but hispanophile or European culture is regarded as superior or normal and indigenous, as well as Afro-Latin and mixed, cultures are ignored or denigrated–along with the communities and peoples who fit this category.
In many ways, the offhand remark in “Atrévete-te-te” about helping your true self come out seems to rely on a peculiar kind of mestizaje found in Puerto Rico (as well as the Dominican Republic). Tapping into a discourse that, historically, in celebrating mixture seems to also erase real social differences seems suspect. But I wonder how this 21st century claim to musical mixture, a mixture that has as much to do with musical creation and innovation as it does with flows of capital in the musical industry as well as Latin America’s status under the ever-watchful eye of U.S. imperialism in all its manifestations (i.e., military, political, economic, and cultural), is revelatory of how these artists, and their audiences, position themselves in relation to the hegemonic musical genres of the record industry.
What I am setting out to do–in a very, very modest way–is a first step in documenting the interrelationship between genres of music and how they have come to be mixed in the musical laboratories of Latin America, specifically the Spanish-speaking Caribbean (for my purposes here, I am thinking of the Spanish-speaking Caribbean as the traditional three island nations, Dominican Republic, Puerto Rico, and Cuba, as well as south Florida and northern Colombia). Bringing to bear computational methods to examine the metadata of tracks as well as the tracks themselves (if possible), I am looking to document the musical genealogies that have led to the creation of new musical forms that have brought together U.S. musical genres such as hip-hop, R&B, and Rock with Spanish Caribbean musical genres such as salsa, bachata, son, rumba, etc.
My starting point will be to create a dataset of tracks that will serve as proxies for musicians, bands, or singers that are connected to the Spanish Caribbean either by birth or heritage and that revel in generating a new expressive medium in order to vocalize their experience in the 21st century. My aim is to focus on tracks from musicians (e.g., Calle 13, Orishas, Bad Bunny, Karol G, Natti Natasha, Camilo, etc.) with critical and economic success (because they not only bring together different forms but also collaborate with each other and other artists and serve as conduits for innovation) and work backwards by identifying their musical ancestors. My hypothesis is that the metadata in the songs or albums will contain credited music samples that will reach backward in time (if I can figure out a way to identify rhythms in tracks that can be found in others and work backward, that would provide a more comprehensive view). This would allow me to visualize this information as a network, which right now I am imagining as a genealogical web of sorts that would connect artists diachronically and synchronically (the image came to me, of all things, through the book burning scene in Don Quijote). While this analysis is unlikely to be able to tell change in meaning over time or account for how a form or sample has been recontextualized for new meanings, the visualizations may help discover relationships (edges) that would merit further study.
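The genealogical web I have in mind can be sketched as a directed network in which an edge runs from a track to each track it credits as a sample. The Python below is only a sketch under that assumption: the track names are made-up placeholders (not real songs or real sampling credits), and the `samples` dictionary, `edges`, and `ancestors` are hypothetical names of my own, not part of any dataset or library.

```python
from collections import deque

# Hypothetical placeholder data: each key is a track, and each value
# lists the tracks it credits as samples. None of these are real credits.
samples = {
    "track_a": ["track_b", "track_c"],
    "track_b": ["track_d"],
    "track_c": ["track_d"],
    "track_d": [],
}


def edges(samples):
    """Flatten the sample credits into (track, sampled_track) pairs."""
    return [(src, dst) for src, dsts in samples.items() for dst in dsts]


def ancestors(samples, start):
    """Collect every track reachable from `start` by following samples."""
    seen = set()
    queue = deque(samples.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:  # guards against loops in the credits
            seen.add(node)
            queue.extend(samples.get(node, []))
    return seen


print(edges(samples))
print(sorted(ancestors(samples, "track_a")))
```

The edge list is the form that most network visualization tools expect as input, while `ancestors` answers the diachronic question of how far back a given track’s credited lineage reaches.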
In “The Computational Case against Computational Literary Studies,” Nan Z. Da levels a serious critique of, if not the whole discipline of digital humanities, then its golden child, computational literary studies (CLS). By CLS, Da means the use of computational and algorithmic tools for literary study, specifically “distant reading.” Da’s critique of CLS has the benefit of being clear and of voicing the suspicion many traditional literary scholars have of CLS: Da argues that in CLS “what is robust is obvious…and what is not obvious is not robust” (601). Put bluntly, for Da, CLS basically entails counting words/word frequency to make arguments, and this word counting either confirms what we already know (e.g., the dissimilarity between historical apocrypha and fiction or narratives) or suggests phenomena that are dubious (e.g., the devotional structure of Augustine’s Confessions) (615; 614). Part of the problem, as Da sees it, is in the preparation of corpora for analysis: judgment calls abound in this process, and she argues that these decisions stack the deck for the results obtained (e.g., in defining what a haiku is in order to compare how prevalent this form is in East Asian poems) (619). In other words, Da is arguing that confirmation bias is a real problem for CLS.
While Da, as the title implies, is intending to use computational methods to argue against CLS, there are some important components of her argument that are not computational but are nevertheless important. For example, Da ends the first paragraph of her article by affirming that, “There is a fundamental mismatch between the statistical tools that are used [in CLS] and the objects to which they are applied” (601). Narrowly read, this statement would mean that the tools are not appropriate for the corpora (objects). But, her point, I think, is broader than that. Toward the end of the essay, Da claims that CLS work reduces literary studies to counting and states that, “In literary studies, there is no rationale for such reductionism; in fact, the discipline is about reducing reductionism” (638). On its face, this last statement seems true to me, but what I am trying to highlight is that she is making an implicit ontological argument: statistical tools come from and respond to a world (the natural sciences as well as a good portion of the social sciences) that is realist in its ontological mode and, thus, employs objectivist epistemology (such as statistics). Thus, as Da states early on, there is a “fundamental mismatch” between statistics and literature, between tools for a world that is stable and an object of study that while coherent is contingent, ineffable, irreducible. That is why, to my mind, she ends the article by stating that the utility of computational textual analysis is rendered more or less ineffectual by literature and “in particular, reading literature well” (639).
One response to Da’s provocative claims comes from Fotis Jannidis, who goes to some lengths to defend CLS. Jannidis argues well that Da seems to want CLS to be able to explain a literary phenomenon in its totality and points out that a method cannot be ruled out just because it does not account for the complexity of a phenomenon in total (6). It’s not quite clear to me that Da demands this of CLS. Instead, she seems to be responding to the rhetoric around CLS. That said, I think his point is well taken. Jannidis also argues, from a quantitative perspective, that Da has selected very few articles to analyze and indict a whole methodology; according to Jannidis, Da’s study is rife with selection bias (9-11). This point is also well taken. While Da does claim to analyze representative cases, Jannidis shows that she has focused on a small number of articles from the American academy, leaving the work of European scholars out (9-10).
Where Jannidis’ critique fails, I believe, is in how he deals with “complexity.” Jannidis challenges the notion that “literature is singularly complex” (3-4), which on its own seems…unfortunate. Literary art, by definition, resists unitary, stable readings. For example, metaphor attempts to describe an experience by invoking a completely different one. How metaphors work is not at all transparent, even if we are familiar with them and have little difficulty in interpreting them. Likewise, irony, while common, is literally counterintuitive: in irony, we see a non-coincidence between words and their meaning, so much so that a speaker often means the opposite of what they say. That is not straightforward. In addition, Jannidis collapses the complexity of literature with the complexity of other disciplines, namely sociology and psychology (3). While it is true that sociology attempts to “describe whole societies” and psychology “tries to understand the psyche of individuals as well as groups” (3), it is clear that those disciplines tend to use quantitative methods to reach conclusions. That is, they do not typically subject qualitative data (the closest thing to literature) to quantitative methods, not unless that qualitative data has been coded in ways that mean specific things and only those specific things. However, the more important point, I think, is that Jannidis does not address the ontological and epistemological differences between statistical methods and literature. That is the problem I personally can’t let go, and the problem that some CLS enthusiasts seem to neatly put to the side.
Ultimately, this debate brings me back to the question I’ve had for quite a while and continue to have: what can CLS, and digital humanities, offer literary studies that literary studies can’t do on its own already? If Da is right and CLS often reveals the obvious, what is the point? To use a shiny new object, a shiny new method? Or, better yet, to use a shiny new method that can both garner more funding while at the same time bolstering the validity of literary studies vis-à-vis the quantitative sciences? Is this a mad dash for funding as well as scientific and cultural relevancy? Or are we in the embryonic stage of methods that will one day more clearly aid in the work of interpretation and analysis?
Da, Nan Z. “The Computational Case against Computational Literary Studies,” Critical Inquiry, vol. 45, no. 3, 2019, pp. 601-639. https://doi.org/10.1086/702594. Accessed 15 Oct. 2021.
Jannidis, Fotis. “On the Perceived Complexity of Literature: A Response to Nan Z. Da,” Journal of Cultural Analytics, vol. 5, no. 1, 2020. https://doi.org/10.22148/001c.11830. Accessed 15 Oct. 2021.
As I detailed in a previous post, I see in my academic and intellectual trajectory a push and pull between the past and the present (and, for that matter, between the humanities and the social sciences). But now that I’ve committed myself to the present, the question is: where exactly to begin?
At first, this question seemed really daunting. I was socialized into an academic world with “proper” intellectual fields (e.g., Early Modern literature, the novel as genre, Romantic lyric poetry, modernism, etc.), even if some of those fields (e.g., African-American literature, Queer Theory, etc.) were once considered unworthy. That means that even when I believed myself to be “free,” I nevertheless stayed away from fields and topics that seemed unfamiliar or that I felt I didn’t know anything about (ignoring that whether familiar to me or not, I’d still have to develop expertise).
But I promised myself that my current re-engagement with formal training would entail exploring what I truly loved (as opposed to what was interesting but with little feeling) as well as that for which I had real curiosity. For a while now, what has felt most fascinating to me is the cacophonous “conversation” on Twitter, specifically how BIPOC and LGBTQ+ voices manifest in/through that digital public sphere. My approach with working with these voices (Twitter conversations) will necessarily mean bringing to bear decolonial theory, queer theory, and media theory.
Inspired by Project Twitter Literature, what I intend to do is to use the tools that project has created to scrape Twitter data to look for relevant Twitter conversations and threads (using keyword searches as well as using hashtags).
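To make the keyword-and-hashtag idea concrete for myself, here is a minimal sketch in Python. It does not use the Project Twitter Literature tools (I don't yet know their interface); it assumes the tweets have already been collected into a list of records with a "text" field. The function names and sample tweets are hypothetical, invented purely for illustration.

```python
import re

def matches(text, keywords=(), hashtags=()):
    """Return True if the text contains any keyword or any of the hashtags."""
    lowered = text.lower()
    # Collect the hashtags that actually appear in the text, without the '#'.
    found_tags = {tag.lower() for tag in re.findall(r"#(\w+)", text)}
    has_keyword = any(k.lower() in lowered for k in keywords)
    has_hashtag = any(h.lstrip("#").lower() in found_tags for h in hashtags)
    return has_keyword or has_hashtag

def filter_tweets(tweets, keywords=(), hashtags=()):
    """Keep only the tweet records whose text matches the search terms."""
    return [t for t in tweets if matches(t["text"], keywords, hashtags)]

# Hypothetical sample data standing in for already-collected tweets.
sample = [
    {"id": 1, "text": "Is this accountability or #CancelCulture?"},
    {"id": 2, "text": "Being canceled cost me nothing, honestly."},
    {"id": 3, "text": "Lovely weather in Queens today."},
]

selected = filter_tweets(sample, keywords=["canceled"], hashtags=["#cancelculture"])
print([t["id"] for t in selected])  # prints [1, 2]
```

Even this toy version surfaces an interpretive choice: matching on the word "canceled" treats every use of the word as relevant, which is precisely the kind of assumption I will have to scrutinize.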
Of course, even if I limit myself to BIPOC and LGBTQ+ voices (and their intersections), that is entirely too much material to contend with. What I need, then, is an entry point. A topical one, though admittedly somewhat tired now, is the debate around “cancel culture.” And here, important distinctions are necessary: what do these voices consider to be “canceling”? Is canceling rhetorical or material for these populations? How do BIPOC and LGBTQ+ people use it? What is the difference between its actual usage by these populations and the perception of how individuals are “canceled”?
In “Against Cleaning,” Katie Rawson and Trevor Muñoz make an important contribution to the question of methodology in the digital humanities, especially in relation to preparing and working with large datasets of humanities information. Drawing on their experience with the New York Public Library’s What’s on the Menu? public data, Rawson and Muñoz problematize the notions of “data cleaning” and “messy” data, noting that the traditional view of these data tasks as technical problems in need of technical solutions elides the fact that data standardization may often result in “computing away” difference. As humanists, Rawson and Muñoz are attentive to the fact that “modern humanities have invested mental and moral energy into, and reaped insights from, studying difference.” They are, therefore, interested in working deftly and technically with large datasets while at the same time not deleting the granularity, specificity, and at times oddity of humanities materials and data. To remove this information would either remove the complexities that humanities research focuses on and attempts to produce, create a false version or impression of the data (and, thus, the world and context from which this data emerges), or both. In highlighting this methodological, theoretical, and ontological problem, Rawson and Muñoz restore the nuance and particularity that most humanities scholars associate with their work.
In what seems to be an adaptation and transformation of close reading, Rawson and Muñoz begin their essay with an astute point about “data cleaning,” pointing out that the term itself is a kind of empty signifier that: presents as one task a range of activities that can vary greatly from researcher to researcher; obscures detailed descriptions of preparing data for analysis; suggests that the process is straightforward enough to not have an impact on the analysis and findings; and assumes that there’s not much to learn from inquiring about the inner workings of the process(es) of data cleaning. From these insights about data cleaning, Rawson and Muñoz develop an argument not only about why talking about the data cleaning process matters (the cleaning process is a kind of curation that is in itself a methodological and intellectual choice that necessarily affects the subsequent analysis), but also about the theoretical and ontological implications of the cleaning (an interpretation of the data that reconstructs a world that is either far removed from its actual context and/or that is counter to the researcher’s theoretical and ontological approach to their research and the world). They develop this argument by tracing their work “cleaning” the NYPL menu dataset, noting that this cleaning aided the work of a crowdsourced transcription process (of scanned menus) but was “insufficient for scholarly inquiry. To ask research questions, [they] needed to create [their] own dataset, which would work in context with the NYPL dataset.” In other words, they learned to differentiate alternate spellings and/or syntax of menu items from menu items that provided new information. Dealing with the new details, however, posed a new problem of scalability; if digital humanities methods allow researchers to deal with large datasets, leaving in details that would make the data “messy” would represent a powerful limitation on the analytic and explanatory power of these new methods.
To work through this potential impasse, Rawson and Muñoz draw on the work of anthropologist Anna Tsing on nonscalability, making the connection that scalable methods, like working with “clean,” large datasets, are inextricably linked with “totalizing systems” while “nonscalable phenomena are enmeshed in multiple relationships, outside or in tension with the nesting frame.” In other words, nonscalable elements (read: “messy,” unique data points) represent working with “historical contingencies and encounters across difference.” Data points that add diversity or heterogeneity to large datasets, then, are the site of the local, of difference, of phenomena that open up an interpretive field that makes the development of new knowledge possible.
How, then, do Rawson and Muñoz deal with the tensions between the scalable and nonscalable, the global and the local? They elegantly reach into a tried-and-true technique and tool known to humanities researchers: indexes. For Rawson and Muñoz, “an index is an information structure designed to serve as a system of pointers between bodies of information, one of which is organized to provide access to concepts in the other…an array of other terms that people use alongside ‘cleaning’ (wrangling, munging, normalizing, casting) name other important parts of working with data, but indexing best captures the crucial interplay of scalability and diversity that we are trying to trace.”
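A small sketch helps me see the difference between "cleaning" and indexing. The Python below is not Rawson and Muñoz's actual implementation; the menu entries, the normalization rule, and the function names are all hypothetical, chosen only to illustrate the idea of an index as a system of pointers that leaves the original transcriptions untouched.

```python
from collections import defaultdict

# Hypothetical transcribed menu entries; the original strings are never altered.
dishes = [
    {"id": 1, "name": "Potatoes, au gratin"},
    {"id": 2, "name": "Potatoes au Gratin"},
    {"id": 3, "name": "Au gratin potatoes"},
    {"id": 4, "name": "Potatoes au gratin, new"},
]

def index_key(name):
    """Illustrative normalization: lowercase, drop punctuation, sort the words."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))

def build_index(records):
    """Map each normalized key to the ids of the records that share it."""
    index = defaultdict(list)
    for record in records:
        index[index_key(record["name"])].append(record["id"])
    return dict(index)

index = build_index(dishes)
for key, ids in index.items():
    print(key, "->", ids)
# prints:
# au gratin potatoes -> [1, 2, 3]
# au gratin new potatoes -> [4]
```

The three variant spellings point to one concept, while the entry that carries new information ("new") stays distinct, and every original string survives in the dataset. The normalization rule itself remains an interpretive decision, which is exactly the authors' point.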
In exploring the virtual black box of data cleaning, Rawson and Muñoz make a significant contribution that has important implications at the practical, methodological, theoretical, and ontological levels for digital humanists. The diversity of digital humanities research, in and of itself a good thing, means that digital humanists may not always be using the same frame of reference or standards when they refer to data cleaning. Moreover, the task of preparing data, which Rawson and Muñoz suggest is about 80% of the labor of working with large datasets, can be so onerous that digital humanists, understandably, may be focusing on how to do something without necessarily scrutinizing what it means to adopt a method from another discipline (e.g., the natural sciences) or, at the granular level, what standardizing a spelling or term across a dataset could mean at the interpretive and ontological levels. In “Against Cleaning,” Rawson and Muñoz very much try to foreground the “humanities” in digital humanities by reaffirming the humanities’ concern for the particular, for nuance, and for complexity. In some ways, Rawson and Muñoz go “back to basics” by “rediscovering” how the humanities have developed and used the information structures of indexes and indexing, which, as it turns out, is quite distinct from indexing optimized to produce better search tools.
By calling back to an “oldie but goodie” like indexes, Rawson and Muñoz also make a powerful political intervention within humanities disciplines. In their article, Rawson and Muñoz often refer to the suspicion some (more traditional) humanities researchers have toward digital methods. In exploring data cleaning and the scalable/nonscalable (global/local, total/particular) problems inherent in these tasks and noting the utility and power of indexes to reveal obvious and potentially invisible relationships between and among data, Rawson and Muñoz show that digital humanities research involves the same level of methodological scrutiny, theoretical sophistication, and interpretive techniques that “analog” humanities research does as long as borrowed methods from the natural and social sciences are scrutinized for their ontological and epistemological underpinnings. Thus, the work is intellectually similar even if in practice it looks vastly different. Most importantly, what Rawson and Muñoz show in their article, which is particularly important for non-digital humanists to hear, is that digital humanities methods, if performed thoughtfully and rigorously, do not entail a collapse of interpretation and analysis in humanities work into counting and collation. Rawson and Muñoz show, instead, that digital humanities methods can parse through large datasets and be used productively to complement and/or actuate the analytical and interpretive work of traditional humanities to produce new insights and knowledge.
“I want to understand how [literature] presents different modes of thought, different conceptions of reality, how it both sustains and undermines language as a unifying principle of communication…
I am interested in connections, not only in the intertextual links within one language or tradition but also the interrelationships and influences of languages, literatures, and cultures on each other…
I seek to understand what is indigenous and what is alien, what is other and what is not other, and how these relationships are mediated by languages and cultures.”
My computer tells me that I wrote the sentences above on or about December 9, 2003, as part of a statement of purpose for graduate school applications. The first thing that strikes me about these statements is their idealism and hope. Looking back, they seem naive: not as goals or orientations, mind you, but in the sheer scope of issues I intended to study and explore, which feels staggering.
My second reaction to these statements is: what happened?
Well, not a lot but also quite a bit.
I have always been animated by questions of the human, by manifestations of the human. Following this drive to understand the human, my first scholarly incarnation was in social science. My undergraduate major was in (combined) history-sociology, which meant I had to gather “contents” from history and place them within the edifice of social theory. This major, now defunct, was perfect for me: it was heavy on learning about history, culture, and theory but quite light on sociological methods (I know there’s something called a Chi square; I also know that, whatever it is, I don’t care). Two important developments here: 1) study abroad at the Facultad de Ciencias Sociales in Buenos Aires, a great deal of social theory, and learning lots of history turned me into a historical materialist (true story); and 2) I did terribly in my sociology senior seminar but aced my Latin American civilization final paper–writing an almost metaphysical paper on Jorge Luis Borges’ short fiction. This phase of my development should be called: “When Ontologies Attack Each Other.” Thus, a productive but irreconcilable tension between two distinct ways of understanding and approaching the world living inside my body, stalking each other in my brain.
Then, years later, yes, because it had to happen to most people in the late 90s/early aughts, came the short-lived but rather acute not-quite-Derridean-but-nevertheless-poststructuralist moment. Like veganism, it was illuminating but insufferable. This was when I was doing a master’s in English literature, and it inspired me to bring Judith Butler’s Gender Trouble to a thesis on Geoffrey Chaucer’s Pardoner’s Tale from The Canterbury Tales. It was very earnest. It had to be, seeing as how that field was heavily New Historicist at that point–and I wasn’t. One might wonder why I did not approach Chaucer through New Historicism, with its Marxist legacy. It’s not that I thought that literature and history couldn’t work cooperatively to illuminate Chaucer’s England. It’s that I didn’t want to–I ultimately was not that interested in England (how would one become interested in England, I wonder). What preoccupied me was poetry itself and interpretation. So, two key developments from this period: 1) the primacy of the cultural product or artifact, not because it wasn’t embedded in a world and history, but rather because it generates pleasure and wonder, both as received images and ideas as well as an opportunity for the (cognitive) pleasures of interpretation; and 2) learning to read otherwise or against the grain, noticing silences and oddities that for too long were ignored or in some cases edited out of texts.
My engagement with Chaucer led to what I thought at the time was a detour in time. My doctoral work would be about a cultural formation that was in the process of becoming, not a text written long ago. It would matter to current debates about what literature was, what it contributed to philosophy and society, and what it was for (I mean, here, its role in the fashioning of a democratic spirit). It would also not be about novels or prose but rather shorter works (at one point, dramatic literature seemed perfect). Very, very, very long story short: my doctoral dissertation was on the 19th century (but it was almost on the 16th and 17th centuries…); it was on antislavery texts, mostly (though, as it happens, perhaps antislavery/legacies of slavery is current?); and it was on novels, great, big, voluminous 19th-century Latin American novels replete with incest plots (I mean, very serious incest plots or instantiations of doubles). Some of the successes of that era will always be with me: I became an excellent reader, an excellent teacher, and sharpened my sixth sense for identifying the prodigious in culture, from “low” to “high” and back again. For my trouble, I revised two dissertation chapters into two separate articles that I published in major journals (one on the novel Sab, which you can find here, and another on the novel Cecilia Valdés, which you can find here).
But now the detour through history, at long last, is over. My intellectual pursuits have joined me in the present, which has brought me both renewal and restlessness. I am filled with scholarly trepidations, which, right now, I am articulating in these ways:
How are contemporary digital cultural products shaped by culture, and how do they, in turn, reconfigure it in terms of narrative, imagery, production, and cultural dialogue?
As social media platforms like Twitter, Instagram, and TikTok evolve, merge, and are destroyed by technological innovation as well as by shifting values and competing social narratives, how does conversation take place in the contemporary U.S. and abroad?
What does “conversation” even mean in the contemporary public square (digital or otherwise)?
Amidst the proliferation of voices in print, television, film, and social media, is a conversation still possible? Or, will the growing hubbub of the demos congeal into pure cacophony?
In the realm of the digital humanities, how do search and the configuration of datasets modify our interactions with technology and texts/images/video/etc.? If we are constantly having to find just the right keywords to get the search results we need or want, are not data and computer systems, following Lacan, speaking through us, delimiting our research in ways of which we may not be aware? Are we not the tools (as opposed to search)?
Finally, who or what is doing the interpretation in digital humanities? What would a computational hermeneutics entail? Where would the humans be? In the code? Code and data seem to be ill representations of the human.
The scope. The scope. That’s the one clear constant: the vertiginous scope. Too much, too complicated.
My name is Javier Jiménez Westerman, and I created this site primarily to communicate with colleagues near and far about my academic research and teaching interests. That was before, when I considered myself fully invested in university life, even though I had voluntarily and purposefully left the tenure-track faculty world. While I still work in “the academy,” being a (tenured) professor is certainly no longer a destination. What I am supposed to do next lies just beyond my reach–I just have to imagine it.
Like most immigrants to the United States, my story is both exceptional and very commonplace. Commonplace because my parents brought my sisters and me to this country so that we could take advantage of the great educational opportunities here. Exceptional because my educational attainment is unusual for first-generation immigrants. Underneath it all, I like to think I’m regular folk…regular folk that just happens to enjoy some of the finer things in life.
I was born and raised in the Dominican Republic until the age of 10, at which point my family moved to New York City (Elmhurst, Queens to be exact). After spending two years in Queens, learning English and trying to figure out a new and bewildering world, my family moved to Springfield, Massachusetts. Though I grew to like Springfield, my early experience with New York City made me want to go there for college. I spent most of my high school years working hard so that I could get the grades I needed to get into college in New York. Happily, I did and got into my dream school, New York University.
Through a series of puzzling and unwelcome circumstances, I did not get to go to NYU, and I was very disappointed. I did, however, attend Columbia University. Not having gone to NYU, I can’t definitively say that my experience at Columbia was “better” than what it would have been at NYU. But I do know that I made some amazing friends there, and that unbeknownst to me the university afforded me a top-notch education (surprise, surprise!). I would happily realize just how good my education was when I entered a Master’s program in English Literature at San Francisco State University and later the PhD program in the Department of Comparative Literature at the University of California, Berkeley.
After earning my PhD in 2012, I moved to southeastern Ohio to take a post as a professor of Latin American literature and culture at a small college there. For three years, I taught courses in Spanish language (all levels) and upper division courses in Latin American literature and culture. You can take a look at my CV at the time of leaving the tenure track here.
While in Ohio, I noticed many things. Some were a surprise and some were not. For example, it became clearer than ever that I enjoyed teaching, guiding students, and the workings of a college. Research was fine, but it wasn’t clicking–nor did I have time to really engage in it (I had negotiated a 3-3 teaching load, which was less than most other professors at the college, but even though I had to teach fewer courses, each course was its own unique prep). It was also clear that it would be very feasible for me to get tenure, and that if I stayed long enough to get it, I (and my now husband) would be trapped there forever.
It might seem like I am being shady toward the town and college (the college did treat me reasonably well), but that’s not what I mean. I’m a big city, coastal kind of person. That’s the kind of life I most prefer. It didn’t make sense to commit to a life in which I was going to not only voluntarily choose against what I wanted but also give up the ability to choose–while at the same time having to be thankful for the tenure “prize.” So, we chose keeping the ability to make choices and left Ohio for Washington, DC.
I have now lived for six years in the DMV, and it seems like the universe keeps rewarding me for that choice, personally and professionally. Even during a global pandemic, I have a lot of privileges, and I am grateful and humbled by how lucky and stable my life is.
I have lived and continue to live nothing short of a charmed life.