• Data Is Plural. A newsletter with an archive of posts about publicly available data sets. This is often where I go first. To navigate the archive, click on any of the markdown files with names like “2022-02-09.md” to get that day’s newsletter edition, which has 3-5 publicly available data sets and a blurb about each one.
• Kaggle. An online platform for data scientists that also has a section on publicly available data sets. You can access this data with a free account (you can sign up with a Google account rather than creating a new one). Use the search bar in the data sets section to see what you can find.
• Awesome Data. An archive of interesting and useful data sets, organized by topic. Scroll down to see the topics, then click on any that interest you to see if anything is worth checking out.
• Data.gov. The U.S. government’s repository of publicly available data sets. It is a little hard to navigate, but there is some useful stuff here if you can figure that part out. It is also worth checking the website of any government agency that you think might collect data on a topic of interest (e.g., CDC, Health and Human Services, FBI).
• KD Nuggets. This has links to several data repositories that could be of interest to you.
• Numlock News. A newsletter like Data Is Plural, but with fewer data sets.

 

If you are stuck, here are some interesting Data Is Plural newsletter editions that have some usable data sets:

Here are some other ones from 2020, too:

A good thing to do is to just click around, though! Don’t rely only on what I offer here. Kaggle is particularly easy to navigate with its search function.

 

Primary and Secondary Research

See below for an explanation of the difference between primary and secondary research, as well as tips for finding quality secondary research: Primary and Secondary Research – Data and Writing Toward Social Change, Spring 2022 (cuny.edu)

Research tips in the March 17 lesson plan: March 17 Lesson Plan – Data and Writing Toward Social Change, Spring 2022 (cuny.edu)

 

Distribution and Variability

Information on understanding the shape of your data and how that should affect your choices in analysis, from the March 15 lesson plan: March 15, 2022 Lesson Plan – Data and Writing Toward Social Change, Spring 2022 (cuny.edu)

 

Excel Tips

A good video on how to use common Excel formulas: Top 10 Most Important Excel Formulas – Made Easy! – YouTube

Filtering and sorting: March 3, 2022 Lesson Plan – Data and Writing Toward Social Change, Spring 2022 (cuny.edu)

How to visualize your data and do other things in Excel to understand its shape for analysis: March 15, 2022 Lesson Plan – Data and Writing Toward Social Change, Spring 2022 (cuny.edu)

Counting and taking the mean: March 8, 2022 Lesson Plan – Data and Writing Toward Social Change, Spring 2022 (cuny.edu)

Formula for taking the median (between the parentheses, put a cell range, just as with the formula for taking the mean): =MEDIAN( )
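
As a quick sketch, assuming your numbers sit in cells B2 through B20 (a made-up range; swap in wherever your data actually is), the median and the mean of the same column would look like this:

```
=MEDIAN(B2:B20)
=AVERAGE(B2:B20)
```

AVERAGE is Excel’s function for the mean. Comparing the two results can hint at the shape of your data: if the mean is much higher or lower than the median, a few extreme values are probably pulling it.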

 

Ongoing Class Glossary

We are going to keep a running list of terms to define from our readings and work in data science, critical theory, and elsewhere. Here is the link to our class glossary.