Ethical Visualization in the Age of Big Data

A Planning Workshop Summary

A workshop to seek interdisciplinary expert perspectives on ethically and visually representing the historical place of misrepresented peoples and locales.

Contents

Session 7: Adapting and integrating existing open source projects

graphic recording session 7

Scope and purpose

Documentation

Discussion summary

During this session, we reviewed several open-source projects, paying particular attention to the research activities and experience of our grant participants. In discussing the available tools, we attended to four main project needs: 1) browsing and sharing the document collection; 2) annotating the corpus; 3) processing the corpus using machine learning, geoparsing, and text mining techniques; and 4) visualizing and exploring the corpus.

Decisions

We resolved to adapt tools for the following purposes in building our project:

  1. Browsing the corpus
     OpenONI, the Online Newspaper Initiative: provides a function set for loading, modeling, and indexing data.
  2. Annotating the corpus
     BRAT: a server-based tool used to annotate the training and verification data for natural language processing.
     Pelagios and Recogito: a semantic annotation tool for texts and images that can identify and map places.
     PERDIDO Geoparser: a flexible geoparser that could be adapted to use a manually annotated gazetteer of historic place names.
  3. Natural language processing
     spaCy: a flexible Python library for natural language processing, capable of performing most of the needed project tasks.
  4. Visualization
     D3 (Data-Driven Documents): a JavaScript library that will enable us to make custom, web-based visualizations that are interoperable across browsers.
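To make the idea of geoparsing against a manually annotated gazetteer more concrete, here is a minimal sketch in plain Python. It is not the PERDIDO Geoparser or spaCy API; the `GAZETTEER` entries, their approximate coordinates, and the `find_places` helper are hypothetical and stand in for whatever historic place-name list the project eventually curates.

```python
import re

# Hypothetical gazetteer: historic place name -> approximate (latitude, longitude).
# These entries are illustrative placeholders, not project data.
GAZETTEER = {
    "new amsterdam": (40.7128, -74.0060),
    "constantinople": (41.0082, 28.9784),
}


def find_places(text, gazetteer=GAZETTEER):
    """Return (matched text, start offset, coordinates) for each gazetteer hit."""
    hits = []
    for name, coords in gazetteer.items():
        # Match the name as a whole phrase, ignoring case.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.group(0), match.start(), coords))
    # Order hits by their position in the text.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    sample = "The ship left Constantinople for New Amsterdam."
    for surface, offset, coords in find_places(sample):
        print(surface, offset, coords)
```

A simple lookup like this only finds exact name matches; handling spelling variants and ambiguous names is where tools such as the ones listed above would come in.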

 

