Class Notes for October 10
Announcements
- Blog commenting: “The Magnificent Seven ds106 Comment Challenge”
Reading
James Grossman, “‘Big Data’: An Opportunity for Historians?” March 2012.
- Big Data
- “And because we [historians] look for stories—for ways of synthesizing diverse strands into narrative themes—we usually look for interactions among variables that to other eyes might not seem related.”
- Importance of collaboration: e.g., joining “the historian’s facility with sifting and contextualizing information to the computer scientist’s (or marketing professional’s) ability to generate and process data.”
Ted Underwood, “Where to start with text mining,” The Stone and the Shell, August 14, 2012.
- “Quantitative analysis starts to make things easier only when we start working on a scale where it’s impossible for a human reader to hold everything in memory.”
- Quantitative v. qualitative?
- Close reading v. distant reading
- OCR challenges with primary sources
- Wordle
- Tools? Some programming needed.
- “you can build complex arguments on a very simple foundation”
- What can we do?
- Categorize documents
- Contrast the vocabulary of different corpora
- Trace the history of particular features (words or phrases) over time (e.g., Google Ngram Viewer, Bookworm)
- Cluster features that tend to be associated in a given corpus of documents (aka topic modeling)
- Entity extraction
- Visualization (e.g. geographically, network graph)
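To make one item on this list concrete, here is a minimal sketch of contrasting the vocabulary of two corpora. It is not Underwood's method, only one simple way to approach the idea: compute each word's share of all the words in a corpus, then rank words by how much larger that share is in one corpus than in the other. The function names (`tokenize`, `relative_frequencies`, `contrast_vocabulary`) and the two tiny sample corpora are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(documents):
    """Return each word's share of all tokens across a list of documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def contrast_vocabulary(corpus_a, corpus_b, top=3):
    """List the words most over-represented in corpus_a relative to corpus_b."""
    freq_a = relative_frequencies(corpus_a)
    freq_b = relative_frequencies(corpus_b)
    # Difference in relative frequency; words absent from corpus_b count as 0.
    diffs = {word: f - freq_b.get(word, 0.0) for word, f in freq_a.items()}
    # Sort by largest difference first, breaking ties alphabetically.
    return sorted(diffs, key=lambda w: (-diffs[w], w))[:top]


# Hypothetical sample data, invented for this sketch.
letters = [
    "The harvest was poor and the river flooded the farm.",
    "We sold the farm after the harvest failed.",
]
speeches = [
    "The nation must defend the union.",
    "The union and the nation will endure.",
]

print(contrast_vocabulary(letters, speeches))
```

Common words such as "the" drop out of the ranking because they make up a similar share of both corpora. With real collections the same few lines would run over thousands of documents, which is the scale at which this kind of counting starts to tell us something a reader could not hold in memory.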
Group Projects
Group 1
Caroline, Anton, Eli, Cameron, Leanardo
Group 2
Estevan, Tatsiana, Phillip, Jordan Burgos
Group 3 – Instigator
Felipe, Jordan Smith, Robert, Pablo
Group 4 – Contra
Guang, Cary, William, Stephen, Shaif