Archive for the 'Data: The new information resource' Category

Research Data Alliance promoted by NSF

The U.S. National Science Foundation showed its support for research data per se in a press release about the Research Data Alliance (RDA).  The RDA is a new organization with an open structure, not yet heavily populated, in which researchers and data professionals come together on comment boards and at occasional face-to-face meetings. Once it gets going, it should be a nice complement to IASSIST.

Digital literary studies: text mining: does this make text data?

A new product, Tesserae (described in HASTAC), “aims to provide a flexible and robust web interface for exploring intertextual parallels.”  Texts (such as books, plays, or poem cycles) become grist for analysis of their meanings and implications, not of their character sequences or word counts.

It starts with classical authors (Plautus, Ovid, Catullus, Vergil, Horace), but it is expanding to English prose.

This is another good example of what the digital humanities (DH) are about: understanding texts for their language and expression of thought (as opposed to the phenomena they describe in words) is a core concern of humanistic scholarship.

What is noteworthy, especially for anyone looking for crossover points or ways in which the humanities can open new windows onto the analysis of science, is the concern over copyright.  As soon as text becomes data, it brings copyright issues along with it.  Of course experimental data can be copyrighted as well, but there the copyright is liable to be waived, or the concern over originality and remuneration is likely to be lower than in artistic or literary communication.  As DH grows, scientists, clinicians, and social scientists can learn to address issues they consider “supporting” or outside their disciplines, and can give such problems the attention they deserve, since their colleagues in the humanities are supplying the expertise.

“Make data more human”

Jer Thorp, Data Artist in Residence at the New York Times, develops data visualizations that answer humanistic questions: modeling sharing, questions during conversations, looking for narrative structures, and laying out names in a 9/11 memorial according to relationships among the people.  By creating “human contexts” for primitive data points (the latitude and longitude of where one first landed in New York, where one met one’s girlfriend, etc.), he attempts to bring more participants into “dialogs” about the data points (or chains of events or consequences).  Widening the range of viewpoints in this way can enhance creativity, or at least address needs or mitigate hazards, by giving data stories or creating empathy.  In a TED Talk, he calls on artists and poets to work at the convergence of science, art, and design, to add meaning, and to promote a deeper relationship between humans and data.  OpenPaths, a site for uploading and sharing (and thus “owning”) one’s own location data, is an example.

Medical data meets Big Data in Meaningful Use of Complex Medical Data Symposium

Medical informatics has been around for decades, and with the rise in data availability and interest, it is natural that there would be a “medical flavor” to data files and stewardship. There doesn’t seem to be a society dedicated to it yet, but there is an annual conference, now in its second year: the Meaningful Use of Complex Medical Data Symposium. Its programs include not only clinical models for decision-making (a long-standing instance of medical informatics, including performance measurement by comparison to protocols, now informed by data as well as expert opinion), but also mechanisms for collaboration and crowdsourcing.

Data in the humanities

The sciences have led the data revolution because of their very nature, and the humanities’ “data” started out mostly as collections of digital creative works; but there are legitimate “pure data” endeavors that lie (mostly) outside science.  The Council on Library and Information Resources has just released a report, “One Culture. Computationally Intensive Research in the Humanities and Social Sciences: A Report on the Experiences of First Respondents to the Digging Into Data Challenge,” based on interviews with recipients of grants from the Digging into Data program, led by the NEH, which partnered with JISC in the UK, SSHRC in Canada, and NSF.  The table of contents, listing the cases studied, is informative:

  • Introduction
  • Using Zotero and TAPoR on the Old Bailey Proceedings: Data Mining with Criminal Intent (DMCI)
  • Digging into the Enlightenment: Mapping the Republic of Letters
  • Towards Dynamic Variorum Editions (DVE)
  • Mining a Year of Speech
  • Harvesting Speech Datasets for Linguistic Research on the Web
  • Structural Analysis of Large Amounts of Music Information (SALAMI)
  • Digging into Image Data to Answer Authorship Related Questions (DID-ARQ)
  • Railroads and the Making of Modern America