Digital literary studies and text mining: does this make text data?

A new product, Tesserae (described in HASTAC), “aims to provide a flexible and robust web interface for exploring intertextual parallels.” Texts (such as books, plays, or poem cycles) become grist for analysis: the object of study is their meanings and implications, not their character sequences or word counts.

It starts with classical authors (Plautus, Ovid, Catullus, Vergil, Horace), but it is expanding to English prose.

This is another good example of what the digital humanities (DH) are: understanding texts for their language and their expression of thought (as opposed to the phenomena described in words) is a core concern of humanistic scholarship.

What is noteworthy, especially for anyone looking for crossover points or ways in which the humanities can open new windows onto the analysis of science, is the concern over copyright. As soon as text becomes data, it brings copyright issues along with it. Of course experimental data can be copyrighted as well, but there the copyright is liable to be waived, or the level of concern over originality and remuneration is likely to be lower than in artistic or literary communication. As DH grows, scientists, clinicians, and social scientists can learn to address issues they consider “supporting” or outside their disciplines, and can give such problems the attention they deserve, since their colleagues in the humanities are supplying the expertise.
