Text Miners
To mine relevant data effectively on an issue such as the War on Drugs during the 2012 Presidential Election, Contra as a team must first understand what we are looking for. Our current theme is too vast and overwhelming, since it taps into all aspects of government. Although the debates will probably include little mention of the War on Drugs, we can focus on the policies the candidates have supported as well as their prior speeches. It will be vital to focus on keywords and build from them a picture of each candidate’s position on current drug problems. What is not said will be just as important to our research as we attempt to fully understand where the candidates stand.
With this information, we can relate the candidates’ positions to the context of their respective parties and, through qualitative historical research, project how the War on Drugs might unfold in the near or distant future. It is absolutely necessary to find and choose authentic sources of information in order to paint even a partial picture of the task at hand. After sifting through all of these sources, we can attempt to answer the question posed to us.
Great point, Stephen, about the importance of what you *do not* find. Read those silences! And the question you raise at the beginning of your post is an essential one for text mining: what are you looking for? If you are counting the number of uses of the phrase “War on Drugs” across a large number of documents, you need to ask what exactly this will yield. And then there are many questions around the execution of data mining and the direction in which it will lead you. For example: Which documents do you group together, and how many do you search at a time? Is there information beyond frequency that should be detected, such as the location of the words in the document?
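To make those questions concrete, here is a minimal Python sketch of one way to count a phrase across grouped documents while also recording where in each document it appears and which documents are silent. It is only an illustration under stated assumptions: the function names `phrase_hits` and `summarize`, the `group` and `text` keys, and the sample texts are all invented for this example and do not come from any real speech or from a particular text-mining tool.

```python
import re
from collections import defaultdict


def phrase_hits(text, phrase):
    """Return (count, relative_positions) for a phrase in one document.

    Positions are fractions of the document length (0.0 = start),
    so documents of different sizes can be compared.
    Assumes the phrase is made of plain words separated by spaces.
    """
    words = [re.escape(word) for word in phrase.split()]
    # Allow any whitespace (including line breaks) between the words.
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    length = max(len(text), 1)
    positions = [round(m.start() / length, 2) for m in pattern.finditer(text)]
    return len(positions), positions


def summarize(documents, phrase):
    """Group documents by their 'group' key and tally hits and silences."""
    summary = defaultdict(
        lambda: {"docs": 0, "docs_with_phrase": 0, "total_hits": 0, "positions": []}
    )
    for doc in documents:
        count, positions = phrase_hits(doc["text"], phrase)
        entry = summary[doc["group"]]
        entry["docs"] += 1
        entry["total_hits"] += count
        entry["positions"].extend(positions)
        if count > 0:
            entry["docs_with_phrase"] += 1
    return dict(summary)


# Invented placeholder texts, not quotations from any candidate.
sample_docs = [
    {"group": "Candidate A", "text": "The war on drugs has failed. We must rethink the War on Drugs."},
    {"group": "Candidate A", "text": "Tonight I want to talk about jobs and the economy."},
    {"group": "Candidate B", "text": "Our budget priorities include border security."},
]

for group, stats in summarize(sample_docs, "War on Drugs").items():
    silent = stats["docs"] - stats["docs_with_phrase"]
    print(group, stats["total_hits"], "hits;", silent, "silent document(s)")
```

Even a sketch this small forces the choices raised above: how the documents are grouped changes the totals, the count of silent documents is reported alongside the hits, and the relative positions hint at whether the phrase opens a speech or is buried near the end.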