How the heck could we use Text Mining?
How could your group use text mining to answer the historical question(s) you’ve proposed thus far?
Our group could use text mining to answer the historical questions we have proposed by narrowing down key words, sorting different types of documents, seeking out specific information, and determining correlations between different documents.
First off, our group could start by text mining websites or blogs for key words through wordle.com. Then, we can create a list of the words that appear most often. Following that, we can theorize about the relationships between those words and work out why they appear. The findings might surprise us and guide us in a different direction.
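The word list described above can also be built by hand. Here is a minimal sketch in Python, assuming the text of a web page or blog post has already been copied into a string. The function name `top_words`, the short stop-word list, and the sample sentence are all made up for illustration; a real project would want a much fuller stop-word list.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list (common words we do not want to count).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "was", "for", "as"}

def top_words(text, n=10):
    """Return the n most frequent words in text, ignoring stop words."""
    # Lowercase the text and pull out runs of letters as words.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Made-up sample text standing in for a blog post or web page.
sample = "The war on drugs changed drug policy. Drug arrests rose as policy hardened."
print(top_words(sample, 2))
```

Note that "drug" and "drugs" are counted as different words here. Whether to merge them is a judgment call we would have to make before reading too much into the list.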
Second, we can sort out documents that are irrelevant to our research. When we enter “the war on drugs” into a regular search engine, we get a ton of results, and many of those hits are unrelated to our topic. By using text mining, we can filter out the irrelevant information and focus on the specifics of our search.
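One simple way to do this kind of sorting is to score each document by how many of our topic terms it contains and keep only the ones above a cutoff. The sketch below assumes that approach. The term list, the cutoff of two hits, the function names, and the three sample documents are all invented for illustration, and a real filter would need a term list we have thought about carefully.

```python
def relevance(text, terms):
    """Count how many of the topic terms occur in the text (case-insensitive)."""
    lowered = text.lower()
    return sum(1 for term in terms if term in lowered)

def filter_relevant(documents, terms, min_hits=2):
    """Return the titles of documents that mention at least min_hits terms."""
    return [title for title, text in documents.items()
            if relevance(text, terms) >= min_hits]

# Hypothetical topic terms and made-up sample documents.
terms = ["war on drugs", "narcotics", "sentencing", "drug abuse"]
documents = {
    "speech_1971": "The president called drug abuse public enemy number one "
                   "and launched a war on drugs.",
    "pharmacy_ad": "Our pharmacy sells discount drugs and vitamins.",
    "sentencing_report": "Mandatory minimum sentencing for narcotics offenses "
                         "expanded during the war on drugs.",
}
print(filter_relevant(documents, terms))
```

The pharmacy ad mentions drugs but none of the topic terms, so it gets dropped. The weakness is that a relevant document using different vocabulary would be dropped too, so the term list matters a lot.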
Finally, with text mining we can figure out relationships between documents that might not be stated in the documents themselves. For example, an article or report written in 2012 might have something to do with a document from the 1960s.
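One common way to look for such a relationship is to compare the vocabulary of two documents. The sketch below assumes a simple word-count version of cosine similarity, which gives 1.0 for identical word usage and 0.0 when no words are shared. The function names and the three sample sentences are made up, and this is only one of several measures we could pick.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Count each lowercase word in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Compare two texts by the overlap of their word counts (0.0 to 1.0)."""
    va, vb = word_counts(a), word_counts(b)
    # Dot product over the words the two texts have in common.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Made-up sample texts standing in for real sources.
doc_1960s = "Federal narcotics enforcement and drug treatment programs expanded"
doc_2012 = "Report reviews decades of narcotics enforcement and drug treatment spending"
unrelated = "The city opened a new subway line"

print(cosine_similarity(doc_1960s, doc_2012))
print(cosine_similarity(doc_1960s, unrelated))
```

A high score only tells us that two documents use similar words. It does not tell us why, so we would still have to read both documents to explain the connection.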
In conclusion, text mining is an extremely useful tool for research, and for getting an A in the digital history class at Baruch.
Thanks, Guang, for this post.
Some questions that come to my mind when reading it: What sources do you think would be most important to mine to achieve the goals you outline? Besides the appearance of the phrase “War on Drugs,” how can you ensure that you capture all vital information and weed out the rest? Will data mining enable you to explain the reason for the existence of particular patterns, or just detect them?
Great link at the end of your post.
Thanks, Professor Thomas, for the response.
The most important sources to mine are any documents that are relevant and informative to our topic. We can ensure that we capture all the vital information by applying “historical thinking” to our sources, asking questions like: Who wrote this? Is he or she a credible writer on this topic? Why did he or she write this document? Can we prove everything said in the document? I think data mining will enable us to do both.