-
List of stop words used for analyzing data as part of the "Using Metadata as Data for Reparative Archival Description" project.
-
XLS file containing all metadata from manuscript collections housed at and digitized by Marshall University Archives and Special Collections.
-
CSV containing selected metadata for concentrated analysis from manuscript collections housed at and digitized by Marshall University Archives and Special Collections.
-
CSV containing selected metadata harvested from the Jim Peppler Southern Courier Collection of photographs digitized by the Alabama Department of Archives and History.
-
The Looking at Appalachia project is composed of more than 800 photos contributed by individuals throughout the Appalachian region to demonstrate the diversity and complexity of the Appalachian experience. On the project website, the photos are housed in an image carousel, organized by state of origin, and accompanied only by plain-text, unstandardized metadata about each photo. I wrote a Python program to visit the page for each state, crawl the page for the URL of every photo and the string of metadata associated with each image, and use regular expressions to parse and standardize the data. This map displays the counties represented in the "Looking at Appalachia" project, with darker blue counties having more photos included. Clicking a county takes you to that state's page on the "Looking at Appalachia" site.
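The regex-based standardization step might be sketched as follows. This is a minimal illustration only: the caption format and field names below are assumptions for demonstration, not the site's actual markup.

```python
import re

# Hypothetical example: suppose a photo caption arrives as free text like
# "Jane Doe, Boone County, WV, 2014". A named-group regex can split it
# into standardized fields.
CAPTION_RE = re.compile(
    r"^(?P<photographer>[^,]+),\s*"
    r"(?P<county>[^,]+ County),\s*"
    r"(?P<state>[A-Z]{2}),\s*"
    r"(?P<year>\d{4})$"
)

def parse_caption(caption):
    """Parse one free-text caption into a dict of standardized fields,
    or return None if the caption does not fit the expected pattern."""
    m = CAPTION_RE.match(caption.strip())
    return m.groupdict() if m else None

print(parse_caption("Jane Doe, Boone County, WV, 2014"))
```

Captions that do not match the pattern come back as `None`, so they can be set aside for manual cleanup rather than silently mis-parsed.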
-
Graph depicting the number of issues of the "Southern Courier" that were linked to photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Network graph depicting individuals tagged (where identified) in photographs with Martin Luther King Jr. in the Jim Peppler "Southern Courier" Collection, linking each individual to others depicted in the same photographs. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Network graph depicting individuals tagged (where identified) in photographs in the Jim Peppler "Southern Courier" Collection, linking each individual to others depicted in the same photographs. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Graph depicting the individuals tagged (where identified) in photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Graph depicting the 25 most frequently tagged individuals (where identified) in photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Word cloud depicting the frequency of words used in titles describing photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and visualized using Tableau.
-
Graph plotting the 15 subjects most frequently used to describe photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Graph plotting the subjects used to describe photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and graphed using Tableau.
-
Treemap graphing the locations (where known) depicted in photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and visualized using Tableau.
-
Map plotting the locations (where known) depicted in photographs in the Jim Peppler "Southern Courier" Collection. Data for this dataset was scraped from the Alabama Department of Archives and History CONTENTdm implementation using Python and the BeautifulSoup library, slightly tidied in OpenRefine, and mapped using Tableau.
-
This interactive network displays the relationships between individuals who sent telegrams to Governor Pierpont during the Civil War, as well as those individuals' locations. Data to create the base dataset was crawled from the West Virginia & Regional History Center content management system using Python, lightly tidied in Excel, and graphed using Pyvis and NetworkX.
-
This interactive visualization displays the names of individuals who sent telegrams to Governor Pierpont during the Civil War as well as the number of telegrams sent by each individual. Data to create the base dataset was crawled from the West Virginia & Regional History Center content management system, lightly tidied in Excel, and graphed using Tableau.
-
This interactive timeline displays the telegrams and transcriptions of telegrams sent to Governor Pierpont during the Civil War. Data to create the base dataset was crawled from the West Virginia & Regional History Center content management system, lightly tidied in Excel, and visualized using TimeMapper.
-
This interactive map displays the location of individuals who sent telegrams to Governor Pierpont during the Civil War. Data to create the base dataset was crawled from the West Virginia & Regional History Center content management system, lightly tidied in Excel, and mapped using TimeMapper.
-
While working with large humanistic datasets can create more broadly applicable scholarship, working with small datasets in one-shot instruction sessions can help students begin to understand the value of working computationally with materials that are traditionally not viewed through a computational lens. For instance, when viewing the interrelated subject data associated with twenty-nine oral histories by Vietnam veterans, students can readily see the significant overlap in content discussed by veterans and identify broader themes. Using Pyvis and NetworkX to visualize that overlap makes it possible for students to discover and compare oral histories that focus on these themes, ultimately creating higher-quality scholarship. This visualization was used in a session of a First Year Seminar course for undergraduate students that focused on analyzing propaganda in the Vietnam War as a way of fostering critical thinking and information literacy skills.
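The heart of such a subject-overlap network is a co-occurrence edge list: two interviews are linked when they share at least one subject heading. A minimal sketch, using made-up subject lists rather than the collection's real metadata, might look like this; the resulting edge list is what would then be loaded into a `networkx.Graph` and handed to Pyvis (via `Network.from_nx`) for the interactive view:

```python
from itertools import combinations

# Hypothetical subject terms for three oral histories; in the project these
# came from the collection's descriptive metadata, not from this sample.
histories = {
    "Interview 01": {"draft", "basic training", "Tet Offensive"},
    "Interview 02": {"draft", "protest movement"},
    "Interview 03": {"Tet Offensive", "basic training"},
}

# Link two interviews whenever they share a subject; weight the edge by
# the number of shared subjects so heavier lines mean more overlap.
edges = []
for a, b in combinations(histories, 2):
    shared = histories[a] & histories[b]
    if shared:
        edges.append((a, b, len(shared)))

print(edges)
```

Weighting edges by the count of shared subjects lets students spot at a glance which pairs of interviews cover the most common ground.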
-
Because the Alabama Department of Archives and History did not offer a dedicated API endpoint or JSON output, I had to use BeautifulSoup and the CSS selectors mapped to each field to scrape the metadata for this project. This code is now deprecated as of fall 2020, when the ADAH upgraded to the latest version of CONTENTdm.
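The selector-per-field approach can be sketched like this. The HTML fragment and class names below are illustrative stand-ins, not the actual markup of the old CONTENTdm pages, which are no longer available in that form:

```python
from bs4 import BeautifulSoup

# Mock fragment imitating an item page; the class names here are
# hypothetical, chosen only to demonstrate the selector-to-field mapping.
html = """
<div class="item-metadata">
  <span class="field-title">Crowd gathered outside a Montgomery church</span>
  <span class="field-date">1966</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Each metadata field is reached through the CSS selector mapped to it.
record = {
    "title": soup.select_one(".field-title").get_text(strip=True),
    "date": soup.select_one(".field-date").get_text(strip=True),
}
print(record)
```

In the real scraper this extraction would run once per item page fetched, with the resulting dicts written out as rows of the CSV before tidying in OpenRefine.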