Chapter 4 of Data Feminism: Balancing Tensions in Counting and Classifying

D’Ignazio and Klein write that

questions about counting must be accompanied by questions about consent, as well as of personal safety, cultural dignity, and historical context.

Consent here roughly means that data are collected in a way that lets participants choose whether to participate at all and, if they do, how they offer up their information.

Personal safety involves weighing the risks to a participant’s safety that come from whether they participate and how they participate.

Cultural dignity roughly means that data collection and analysis pay respect to the cultural backgrounds of those who participate, and maybe even invite participants to help shape collection and analysis to ensure they are done respectfully.

Historical context roughly means that data are collected, stored, and analyzed in ways that keep in mind larger histories (and the root causes of social problems).

Beginning in the paragraph that starts, “Gender is certainly complicated…” and through the end of that section (the heading for the next section is “Rethinking Binaries in Data Visualization”), D’Ignazio and Klein use a really illuminating example that covers each of those elements to varying degrees.

But there are examples throughout the chapter that reflect concern for (or ignorance of) consent, personal safety, cultural dignity, and historical context.



Go back now and review the section I highlighted, then skim the annotations you made and the chapter as a whole.

What do you notice about the tensions involved in data sets representing people and social problems while also attempting (or ignoring) matters of consent, personal safety, cultural dignity, and historical context?

Keep that answer in mind for the task below.



In a comment below, write about 100-200 words on how the data set or database you are writing about for your Data Set Biography and Influence Project both meets and fails to meet the challenges of designing a method of data collection that respects consent, personal safety, cultural dignity, and/or historical context.

Look over your data set or database as well as any information you can find that shows how the data was collected (e.g., a survey, an explanation of methodology, the about page).

After commenting below, click the button to continue.


18 thoughts on “Chapter 4 of Data Feminism: Balancing Tensions in Counting and Classifying”

  1. Arti says:

    In my data set, COVID-19 relief funding, the data does well to meet certain criteria. The data set specifically respects gender equality in the distribution of resources and funding for the most part. There was an interactive chart rating from zero to four whether distributions were equal, and most categories fell in the three to four range. It was still discomforting to see that distributions were not all four and that some were in the three, two, and even one range. The data seems to do well in respecting the legitimacy of discrimination and oppression even during a pandemic. The data is focused more on high-risk nations, so it helps focus and isolate help for the poorest nations. The data did not describe the personal safety of individuals or how consent was handled when the data was collected. The data set on COVID-19 funding has analyzed the reaction of the financial system and the funding in place for these high-risk and poor nations; it has been monitoring and analyzing whether these nations have reacted positively or whether changes are needed in the data collected since the beginning of the pandemic.

  2. Elaine says:

    In my COVID Tracking Project (Race Tracker) dataset, the columns and rows meet generally helpful criteria, as they show the number of tests, cases, and deaths categorized by race and hospital, which allows people to understand whether certain races have better access to health systems than others. However, the dataset ignores major aspects that could make for a more just dataset. For instance, gender, age, and economic situation are not considered in the tracking project, which makes the project biased toward a certain group, or I would rather say incomplete. Part of this lack of data could be that all the information is collected from different sources across the country, such as long-term care facility residents and staff from state and territory websites, recent press conferences and releases, and directly from state health department officials, which maintain the privacy and confidentiality of their patients. I believe the way the data is collected seems trustworthy, but it definitely does not represent the population overall. In addition, there is no disclosure of whether the data was collected with consent, even though it seems not to disclose any personal information.


    In my data set, compiled by the federal government, there is an immediate example of the collectors failing to give participants the power of consent. In the FAFSA application there are three options for the gender field: M, F, or blank. Rather than allowing applicants to enter whatever gender they identify as, if the answer is simply not male or female, then there is no answer. This allows important information about different demographics to go uncollected on an application as important as student aid, and is clearly a flaw in the design of the data collection.

    With respect to cultural dignity, the data presented doesn’t present any data about the ethnic background of applicants at all. It is debatable whether this is good or bad for applicants, but the unfortunate outcome is that again we do not collect important data about what communities require further aid.

    In some ways, I find that not asking about the racial background of individuals seems to pay some respect to historical context. Historically people who checked black, latino, or anything non-white seem to have poor outcomes when it comes to requesting financing or aid. Maybe the government does not collect this data to prevent this sort of bias from reoccurring? Even if so, the unfortunate consequence is the loss of important data about which communities require further aid when it comes to education.

  4. PRATAP THAPA says:

    I am working on a mental health dataset where data are categorized by gender, age, and year. This helps enormously in understanding the situation from a different perspective. At the same time, other information was needed to give the data clarity, such as location, ethnicity, and financial situation. When this essential information is missing, the groups of people suffering from these problems cannot win the attention of the government and other organizations. Moreover, my dataset was collected from a few hundred thousand individuals, which raises doubts about its overall quality when the data are used to represent the entire population of the county. Therefore, a dataset must attend to all the minor details and to equality.

  5. Lynden L Frank says:

    My data set is a portfolio summary of federal student loans. The data has been collected from 2007 to now and is broken up into categories by the type of loan that was offered. Consent within this portfolio summary may not be needed because it does not name names or paint a picture of loan recipients. This data would be a better tool if it included an ethnic or racial breakdown that could be toggled against economic backgrounds. This dataset starts with the year 2007, so the historical context feels wanting. It could include how the price of admission and cost of living have increased while teachers’ salaries have not and the minimum wage has stayed stagnant.

  6. Queen says:

    My dataset of COVID-19 in ICE facilities shows the ages of immigrants in order to examine who is suffering the most from the disease, that is, the age range that is most easily infected. The article doesn’t really mention gender, instead giving the overall number of people affected by the infectious disease. Consent is not required, as the article does not reveal patients’ names or relate to any classifications. Scientists, doctors, and nurses can be listed under personal safety, since they are participants who take risks by directly contacting and treating infected patients.

  7. MINGYI YOU says:

    The dataset I chose was collected to compare income levels across a country. But there are still some details that affect the goals and expectations of the data set’s collector. First, the data set that I chose to work on does not involve any consent issues, personal safety issues, or cultural dignity issues. The reasons are that the data set does not include any personal information; the sources of data are reliable because the data is officially collected; and the data set does not involve any cultural issues because it deals only with income inequality. However, I think it is not perfect, as it misses specific categories, such as sexual identity, employment rates, and unemployment rates, that would help uncover the real reasons behind the income gap. For example, if gender discrimination exists in certain areas of China, then the data set might become unreliable for someone who tries to investigate the facts.

  8. Andrea Flores says:

    The dataset that I chose does not involve age, gender, sex, or race. This dataset does not require personal information or consent from others because it was collected from Google Trends rates. It is mostly about the number of times that people were indirectly seeking psychological help, such as by searching for depression, anxiety, insomnia, etc. It is classified just by date instead of by gender, ethnicity, age, race, or sex, and the data collection covered Canada, Iran, Japan, the United Kingdom, the United States, South Korea, and Italy. However, the whole data set is organized only by date and not by other information; in other words, it is a random data collection from Google Trends.

  9. Liz Fadel says:

    According to the NYCDOE calculation method in my data set of high school graduates, they have complied with the Family Educational Rights and Privacy Act (FERPA) regulations on public reporting of education outcomes. The data set covers the five boroughs of NYC, with a cohort over a 4-year period that consists of all students who entered 9th grade. Based on gender analysis, there were only male and female graduates. Non-binary students have become invisible, so that part of the data is unknown. For cultural dignity, the data shows ethnicity in the dropout, still-enrolled, and advanced rates within the five boroughs of NYC. Black and Hispanic students had the highest dropout rates, and Multi-racial, Asian, and white students had the lowest. This could be because of poverty, English language learner status, disabilities, or geographic locations where there are not enough resources. “By challenging the binary thinking that erases the experiences of certain groups while elevating others, we can work toward more just and equitable data practices and consequently toward a more just and equitable future”. Instead, the data focus on ELL, SWD, poverty, gender, and ethnicity.

  10. Gina DiGiacomo says:

    In the dataset that I am using about the Chicago police department’s misconduct, consent was not used in the sense that the officers and the people who accused the officers of misconduct did not and were not able to give consent to the organization that put the dataset together. For some of the data, the Invisible Institute (the organization that created the dataset) had to go to court to get access to the reports used in the dataset. They met the challenge of consent because they did get the courts to give them permission to use the data, so in that way they received consent, though not directly from officers or civilians. As for personal safety, they did provide the name, age, birthday, salary, unit number, race, and badge numbers of the officers, which could put their personal safety at risk. However, some of the PDFs of the formal complaints had the names of officers and civilians blurred out so you can’t read them. In that way they tried to look out for the personal safety of those involved, so that nobody could identify who was making the accusations. As for cultural dignity, the data is broken down by race. The dataset accounts for officers and civilians that are black, white, Hispanic, and “other”. The dataset could do a better job here by defining who “other” is referring to exactly. The term “other” comes off as lazy in presenting the data and raises questions about why they are not counting those groups individually and are choosing to group them as “other” instead. As for historical context, the organization works with civil rights lawyers and has acknowledged that certain neighborhoods have been experiencing more policing than others, as well as acknowledging that certain races have worse experiences with police than others have. To improve on this, they could talk about what in the past has caused certain areas and races to experience this.

  11. MAHIMA KHANEJA says:

    The data set that I chose examines global income distribution, which is crucial in determining how the inequality gap can be reduced. The project picks its inequality data set from the World Inequality Database because it is interactive, allowing comparisons between the income levels of the top 10% and bottom 40% of earners. Additionally, the database highlights both the average income and wealth levels for the last 40 years, making it easy to understand income level changes over the period. Lastly, the database incorporates data from almost 150 countries, making it easy to perform comparisons between countries all over the world. The selected database will be thoroughly analyzed to determine the causes of income inequality. The data set covers total income for individuals, that is, income from different sources, be it from rental properties, employment, bank interest, or dividends.

  12. Joseph Habert says:

    I think my data set does everything very well that it sets out to do. A lot of the things you mention here for this task are not relevant to what my data set is about, though. It represents the racial breakdown of students within public schools as well as how disabled students suffer restraints and seclusion within those schools. The data is formatted well in an easy-to-understand manner. The data is mostly presented in bar graphs in all different colors so that you can easily identify which data is which.

  13. Leonida H. says:

    The data set I chose is on mental health/illness and poverty as they relate to income inequality. This topic and data set highlight and address the major issues, and the people who are being affected and harmed most by these disadvantages and by the lack of help they are receiving, categorized by age range, gender, race, household, job, etc., as well as the representation and limits these people are facing. The positive side of the data sets and research providing this information is that they bring awareness of what can be improved for these social issues, starting with health care and assistance that can be provided and/or expanded to improve income inequality, the types of jobs available, and assistance to bring about a change. This information can also help build a further understanding of the statistics.

  14. SAMEER DHIMAN says:

    In one of the data sets that I have chosen, I think it meets the generally helpful criteria. It shows how glacier mass is decreasing by year from 1945 to 2014. This helps my claim about how climate change/global warming is changing the world with a negative impact. It comes from the EPA, which is a government agency. The source also comes from the World Glacier Monitoring Service. The data serves to inform us about the history of glaciers shrinking since 1945, and it shows a downward slope from a mean cumulative mass balance of 0 in 1945, since that’s when they started collecting the data, to almost -30 in 2014. The data also includes the number of observations included in each separate year. Even though the data doesn’t talk about this, glaciers shrinking leads to animals such as penguins and polar bears losing their homes, and it also leads to sea levels rising, which is obviously bad since it will cover more of our land with water and some islands may just be underneath the ocean in the future.


    My dataset is a product of crime and incarceration data published by the Federal Bureau of Investigation under the Uniform Crime Reporting Statistics Program. The program was established in 1930 as a congressional mandate directing “the Attorney General to “acquire, collect, classify, and preserve identification, criminal identification, crime, and other records.” The FBI collects this information voluntarily submitted by local, state, and federal law enforcement agencies.” Some U.S. jurisdictions are exempt from fully participating in this program. Municipal governments within the U.S. are required to provide this data on an annual basis. In a manner that respects both cultural dignity and historical context, the author of the dataset only provides state prison and crime population metrics, excluding ethnic and religious prison demographic data. There is also nothing in a written footnote or in the arrangement of the dataset that lends itself to any bias. The author does not provide his or her personal views, opinions, or concluding remarks within the dataset itself.

  16. DALANDA BAH says:

    My data set both meets and fails to meet the challenges of designing a method of data collection. It fails to respect consent. It doesn’t say whether any participants were offered the choice to participate or whether they gave permission to release anything. My data set doesn’t involve personal safety or cultural dignity, because it’s all about income inequality in the world. Regarding historical context, the data set is kind of done in a way that keeps in mind larger histories, because it does mention income inequalities in the world and how they plan on expanding. But it still lacks a lot of information on income inequality that would help readers know more about the income gap in the world.

  17. Nikki says:

    The data set that I’m using focuses less on classifications of people and more on time. The data was collected through Google Analytics for the purpose of highlighting the trends over a specific time period. Therefore, there are no issues of consent, personal safety, or the others mentioned above.

Comments are closed.