Text Mining and Social Media

As Ted Underwood mentioned in the reading, some of the biggest obstacles in text mining are not only finding the data needed, but also developing the skills to collect the correct data.

One reason this matters for us is that our topic revolves around social media, which can be traced back not just to MySpace but to earlier social networking services such as email, chat services, and other early internet social structures. Because this is modern history, though, text mining may be easier for us: more resources become available as sites, blogs, and social applications grow more accessible and popular.

Once our group uncovers more secondary documents, we can feed them into a Wordle-like application to see common themes, such as “undecided,” “voting,” and the different kinds of feelings that stem from being a first-time voter. These recurring themes can help us decide which aspects of the sources deserve our attention, and can help us refine our final historical question.
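
A Wordle-style word cloud is essentially a picture of word counts, so the same idea can be roughed out in a few lines of Python. This is only a sketch under assumptions: the function name word_frequencies, the short stop-word list, and the two sample sentences are invented for illustration and are not our group's actual sources, nor a description of how Wordle itself works.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list; a real project
# would want a much fuller one.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "but", "to", "of", "in", "for",
    "i", "am", "is", "my", "this", "about", "still",
}


def word_frequencies(documents, top_n=10):
    """Count how often each non-stop word appears across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and pull out runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    # Return the top_n (word, count) pairs, most frequent first.
    return counts.most_common(top_n)


# Invented sample sentences standing in for real documents.
sample_documents = [
    "I am still undecided about voting in this election.",
    "Voting for the first time is exciting, but I am undecided.",
]

for word, count in word_frequencies(sample_documents, top_n=5):
    print(word, count)
```

With real sources in place of the sample sentences, the words that rise to the top of the list would be the same ones a word cloud draws largest.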

In the case of secondary sources, my group may find itself in the same predicament that Underwood found himself in during his own research.

“A lot of excitement about digital humanities is premised on the notion that we already have large collections of digitized sources waiting to be used. But it’s not true, because page images are not the same thing as clean, machine-readable text.” – Underwood


However, many of our social media sources can serve as primary sources: interviews, blogs to mine, and various social networks to comb through by means of Twitter hashtags, trending topics, and blog categories.