Learning Modules (5-10 min)

Key things for going forward:

  • In my ambition, I thought I’d have time to grade Learning Modules on Wednesday, so I made them due at the end of what would have been our approximate scheduled class time. This won’t happen lol. So, there is no reason to keep the 10:30am due date. Going forward, you can turn them in by 11:59pm. That is now reflected on the schedule.
  • The key with the Learning Modules is to read the content. I saw in your Learning Needs Survey that a few folks seemed to appreciate more visual content to break up text, so I am going to try to incorporate more of that going forward. However, labor conditions are important here. It might not be realistic for me to make videos, for instance, at this stage. So, it will probably be videos I find and static images, as well as more attention to design to break things up.
  • As I just said: read each page carefully! There will always be some kind of task (and sometimes more than one task) to do for each page. Please do them to get full credit for the assignment. The Learning Modules are replicating what a synchronous session might look like: lots of activities, writing, talking. So, you are asked to do some writing as comments or in Discord to do that work. Reading the pages would essentially be the lecture portion. To fully participate, you need to read and do the tasks in the Learning Module.
  • What do you all think is a realistic deadline to get Learning Modules done? Based on my schedule, it looks like I might not be able to have them ready for you until first thing on Monday morning some weeks. Is two full days enough? Do you need more time than that? Please also know, as with the Learning Module due for this week, I ALWAYS take into account whether you have other things due that day. So, you should notice that the Learning Module for this week is a little lighter because I knew you had the Literacy and Numeracy Narrative due this Wednesday, as well.

 

Examining Power and Examining Data (10-15 min)

Other Key Terms

Before focusing on power, I wanted to turn your attention to other key terms for the book. I thought you all did a good job working through some key terms and aims of Data Feminism in the Learning Module due on Wednesday. I wanted to give a quick gloss on those on this page, which I am going to keep under our “Data Resources” page and add to throughout the semester: Key Terms and Aims from Data Feminism. Let’s quickly review.

Knowing and using these terms will be important for the upcoming Data Set Biography and Influence Project.

 

Power

D’Ignazio and Klein explain that to examine power is to “nam[e] and explai[n] the forces of oppression that are so baked into our daily lives–and into our datasets, our databases, and our algorithms–that we often don’t even see them. Seeing oppression is especially hard for those of us who occupy positions of privilege.”

Understanding these forces puts us in a much better position to ask critical questions when collecting data, managing data sets, or using data sets created by others.

But how do we do this if it is difficult to intuitively see forces of oppression, especially if we are in positions of privilege (e.g., cisgender, male, white, straight, able-bodied)?

There’s not an easy answer here, but in an activity we are going to try out two things that have a good track record:

  • Collaborate with others. Drawing from a variety of experiences will help a group collectively identify, with more confidence, how power and oppression are at work.
  • Be ready with questions. Using a set of critical guiding questions can help figure out an approximate answer to a problem you are facing.

picture of question mark on chalk board

Questions from Data Feminism chapter:

  1. Who is doing the work (and who is not)?
    • Example: D’Ignazio and Klein cite the Amazon algorithm that was used to flag resumes for interviews. The model was trained on data from previous applicants, who skewed heavily male.
  2. Who benefits (and who is overlooked or actively harmed)?
    • Example: Missing data, like the data on femicide in Mexico cited by D’Ignazio and Klein. The data collected on murders did not prioritize enough information about the victims, and this opened up opportunities for sexist propaganda to fill the gap, benefiting the status quo of Mexico’s law enforcement.
  3. Whose goals are prioritized (and whose are not)?
    • Example: the Allegheny County Office of Children, Youth, and Families prioritized the goal of bureaucratic efficiency, relying on data that oversampled poor families because they were more likely to use public services. Because this agency did not have enough resources to best help families, it chose the goal of efficiency instead. The goal was reached, but in doing so, poor families were unfairly targeted and harmed.
  4. How does the matrix of domination help with these questions?

Textbox that reads "Collins' Matrix of Domination"

Using Patricia Hill Collins’ model for the four domains of the matrix of domination can also be helpful. In the section “Power and the Matrix of Domination,” the full model is provided in Table 1.1 and described in the text of the section.

It can help to think about how data collection and analysis can be influenced by or influence:

  • the theory/intent behind laws and policies
  • how laws and policies are enforced
  • how oppressive ideas are inhibited or furthered
  • individual experiences with oppression.

(In the section “Data Science for Whom?” there is an example analysis of data with the different domains in mind in relation to the femicide data in Mexico starting with the sentence, “The most grave and urgent manifestation…”)

 

Group Activity (30-45 min)

In groups of 3 or 4, you are going to receive a data set that I will share in your group’s text channel. I want you to, as a group, take some time to ask the above questions of your data set and be ready to discuss it with your group.

First, take 5 minutes on your own to just get acquainted with what the data set is telling you. After 5 minutes, I will prompt you to join your group.

Once you join your group, discuss the following and be prepared to present it to the large group:

  1. An answer to each of the first three questions above.
  2. Which, if any, of the four domains of Patricia Hill Collins’ matrix of domination are related to any of your answers. You can name any number of them you feel are applicable. Just be ready to explain why!

Feel free not only to look at the data set itself and how it is structured, but also to do some searching around the internet to find more information on the organization collecting the data, how it might have been funded, what uses it has been put to, any commentary on the data from others, etc.

You’ll have about 15-20 minutes to work with your group.

Groups 1 and 4: Look at the Long-Term Productivity Database. Learn more about this data set from the about page, and also click this spreadsheet to look into the data itself. (Some of the acronyms used there are explained on the linked about page.)

Groups 2 and 5: Look at this cleaned data set about maternal mortality ratios across the world produced by a user of Kaggle, which is an online community for data scientists. Go here to learn a little more about the data set and what the user did to it and where they got it from. The “Indicator” column tells you what the measure is (i.e., maternal mortalities per 100,000) and the “First Tooltip” column tells you the ratio (e.g., 638 per 100,000) with a confidence interval (e.g., 638 [427-1010] means that the likely true number for the year was between 427 and 1010). To learn more about confidence intervals, this source from Simply Psychology has a good explanation.
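If you want to work with the “First Tooltip” values as numbers instead of text, here is a minimal Python sketch of how one such value could be split into its three parts. It assumes every value looks exactly like the example above (an estimate followed by a low and high bound in square brackets); the actual file may format some rows differently, and the function name parse_tooltip is just a hypothetical name for illustration.

```python
import re

# Assumed format, based only on the example "638 [427-1010]":
# an estimate, then a bracketed low-high range.
TOOLTIP_PATTERN = re.compile(
    r"\s*(\d+(?:\.\d+)?)\s*\[\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*\]\s*"
)

def parse_tooltip(value):
    """Split a string like '638 [427-1010]' into (estimate, low, high)."""
    match = TOOLTIP_PATTERN.fullmatch(value)
    if match is None:
        # Rows that do not follow the assumed format are reported, not guessed at.
        raise ValueError(f"Unexpected format: {value!r}")
    estimate, low, high = (float(part) for part in match.groups())
    return estimate, low, high

estimate, low, high = parse_tooltip("638 [427-1010]")
print(f"Estimate: {estimate:.0f} per 100,000")
print(f"Likely between {low:.0f} and {high:.0f} per 100,000")
```

Separating the estimate from its interval makes it easier to notice how wide the uncertainty is for a given country and year, which is worth keeping in mind as you ask the questions above.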

Groups 3 and 6: Look at the 2016 data set of Offenders’ Race by Offense Category published by the FBI. You have the option to download Excel documents of each table – here is the first table. It is a little easier to look at a spreadsheet than the web page. To learn more about how this data is collected, go to the FBI’s page on how different categories of crime were defined as of 2016.

One-on-One Meetings (5 min)

On February 9, 10, 16, and 17 (i.e., tomorrow, Wednesday, next Tuesday, next Wednesday), I have some times scheduled to meet with you one-on-one. I’d like to talk with you at the beginning of the semester to gauge your interests and to think with you about the kind of projects you might work on in class.

Link to Calendly for one on one meeting for 10 min (Feb 9, Feb 10, Feb 16, Feb 17): https://calendly.com/daniel-libertz/student-conference

Topic: what you might work on, the kind of data sets available, what help you need, how group work could help

We will meet in “Office Hours” in our Discord Server (under “Resources” on left side)

 

Literacy and Numeracy Narrative (5 min)

I just want to get a sense of where you stand in relation to making knowledge with words and numbers, two important contextual pieces of information as we think about the work you’ll do in other projects this semester. I showed you the prompt in Learning Module 1, but I wanted to bring up the prompt quickly and to take questions if you had any.

 

Next Time (2-5 min)

-Complete Learning Module 2

-Complete the Literacy and Numeracy Narrative

-Make sure you have a 1-on-1 meeting set up with me for tomorrow, 2/10, 2/16, or 2/17.