Counting and Measures of Central Tendency

As we get into using Excel formulas, you might want to go to the “View” tab and select “Freeze Panes.” Choose “Freeze top row.” This will make things easier to see as you scroll down.

 

Counting

Click above to read more about options in counting things up in Excel.

 

Measures of Central Tendency

A measure of central tendency tells you what the typical value is for a data set. It is a way of explaining which value is most common or most representative. However, this is a rhetorical construction: there are many ways to argue what is typical and what is atypical.

Most statistical descriptions of what is common use one of three options: mean, median, and mode.

For categorical data, like the neighborhood_group that an Airbnb listing is located in, you would only use mode as a measure of central tendency. The mode is the most common value. So, you would count all the neighborhood_group values and see which one was highest. That would be the mode for this data set.
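For readers who want to see the counting step outside of Excel, here is a minimal Python sketch of finding the mode of a categorical column. The list of neighborhood_group values is made up for illustration and is not taken from the actual Airbnb data set.

```python
from collections import Counter

# Hypothetical neighborhood_group values for a handful of listings (not real data).
neighborhood_groups = [
    "Brooklyn", "Manhattan", "Queens", "Manhattan",
    "Brooklyn", "Manhattan", "Bronx",
]

# Count how many times each category appears.
counts = Counter(neighborhood_groups)

# The mode is the category with the highest count.
mode_value, mode_count = counts.most_common(1)[0]

for group, n in counts.most_common():
    print(f"{group}: {n}")
print(f"Mode: {mode_value} ({mode_count} listings)")
```

In Excel, the same idea can be carried out by running COUNTIF once per category and comparing the results.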

You can use all three common measures of central tendency, though, for continuous data: data that is numeric and can technically go on forever (think decimal points). Discrete data (numeric data that comes in separate, countable values, like the number of cloudy days in a year) can also use all three, but the mean might not be as useful as the median and mode.
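As a rough illustration of how the three measures can differ on numeric data, here is a small Python sketch using the standard library's statistics module. The prices are invented for the example and do not come from any data set used in class.

```python
import statistics

# Hypothetical nightly prices in dollars (made up for illustration).
prices = [60, 75, 90, 100, 100, 150, 300]

mean_price = statistics.mean(prices)      # sum of values divided by how many there are
median_price = statistics.median(prices)  # middle value once the list is sorted
mode_price = statistics.mode(prices)      # most frequently occurring value

print(f"Mean: {mean_price}")
print(f"Median: {median_price}")
print(f"Mode: {mode_price}")
```

With these made-up numbers, the single $300 listing pulls the mean (125) above the median and mode (both 100), which is one reason to look at all three. The rough Excel equivalents are AVERAGE, MEDIAN, and MODE.SNGL.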

Click here for more on Mode, Median, and Mean and using them in Excel.

 

Task

  1. If your data set is a csv file, open it in Microsoft Excel. Look over each column, and find a categorical variable for which you can count up one or more values. Write in a comment below what the variable is, the numbers you calculated, and anything interesting that you might have noted while looking at the results of the calculations you made.
  2. If your data set is a csv file, open it in Microsoft Excel. Look over each column, and find a continuous or discrete variable where you can find the mean, median, and mode. Write in a comment below what the variable is, what the mode was, what the median was, what the mean was, and anything interesting that you might have noted while thinking about all three values.
  3. If you are working from a database on a website, look at all of the options you have to filter things through checking checkboxes, selecting items in a drop-down menu, clicking various buttons, etc. Once you make the selections you want to, find the count, mode, median, or mean and write in a comment below anything interesting that you might have noted while glancing through the calculations you found.
  4. If you have a database or if you can’t quite make calculations in the ways the above ask for, but you want to practice with Excel, you can use the Airbnb data set we used in class and complete #1 or #2 above (just do #1 or #2 but with the Airbnb data set).


14 thoughts on “Counting and Measures of Central Tendency”

  1. Elaine says:

    I used the sum function to count the number of deaths for White, Black, Latino and Asian. Then, I used the average function to get the percentage of the deaths over the total number of deaths nationwide. The numbers I got were as follows:

    Total # of Deaths: 39701814
    Black: 3494370 -> 9%
    White: 10198552 -> 26%
    Asian: 666869 -> 2%
    Latino: 3226590 -> 8%

    I realized that there are still more White people dying from COVID-19 than Black people or people of any other race. Before the calculations, I was expecting higher numbers for Black people. However, per my calculations, the percentages of Black and Latino people dying from the virus are pretty close. I have to remember, though, that I still need to take into account the overall population of each category in the U.S. to really understand what the data means and whether my conclusions are accurate.

  2. LIAM SCHNEIDER says:

    I calculated that the average number of quarterly FAFSA applicants between Q1 of 2016 and Q2 of 2018 was 2.709 million. I noticed after calculating this that the average is heavily skewed because of the cyclicality of the college year. For example, the spring and fall semesters have far heavier volumes of applications than the summer and winter months. If I were doing data analysis I would probably just use the original data where the count is already accurate to create an assumption. (Maybe create a graph with a trend line.)

  3. Arti says:

    The amount of funding given to over 877 high risk poor nations across the world to help with COVID-19 relief was over 110 trillion dollars in 2020. I think that’s quite a baffling amount of money to provide and it shows the impact that COVID-19 has had across the globe, and be mindful that these are just the highest risk nations. I couldn’t imagine the amount of funds provided across the globe.

  4. DALANDA BAH says:

    I calculated the mean and median of the percentage of world GDP. For the mean I got 9.92 and for the median 47.5. When I saw the results, I found this interesting because, since the mean is less than the median, the distribution is negatively skewed.

  5. Andrea Flores says:

    I calculated the mode, median, and mean of depression and mental health (just a random selection), and the results show that the median (50th percentile) for depression was 86 out of 100 and the average was 84.29 out of 100, whereas for mental health the median was 66 out of 100 and the average for searches of the term was 66.55 out of 100.

  6. Liz Fadel says:

    I used descriptive statistics to calculate the mean and median. The estimated overall graduation percentage was derived using the economically disadvantaged graduation rate as the independent variable. It resulted in a strong correlation with 77% accuracy. The median graduation rate for economically disadvantaged students is 72%, compared to 74% for non-economically disadvantaged students.

  7. Gina DiGiacomo says:

    I went through and chose “offensive language: physical disability” as the allegation and sorted through the ages of the individuals who accused the officers of using offensive language. It showed that out of 15 complaints, the mean age of the accusers was 25, the median was 25, and the modes were 16 and 39. Although this was a small number of complaints, this is interesting because it shows officers are more likely to use offensive language toward adults who were in their prime.

  8. Queen says:

    I used the “=COUNTIF” function with additional criteria to count detainees in different age ranges, such as less than age 20 ("<20"), less than age 30 ("<30"), then age 40, and so on.

  9. MINGYI YOU says:

    I tried to calculate the mode, median, and mean of the Airbnb data set. Applying them to the price of the private room category, I got $100 as the mode, $90 as the median, and $123.19 as the mean for a private room. From the calculation, a person could see if the price of a private room on Airbnb matches his/her budget.

  10. MAHIMA KHANEJA says:

    The count of the data indicates that the total number of datasets is 12,753. The mode is $16,832.5, indicating that this is the gross domestic product for most nations. The mean and median gross domestic product is $17,222.79 and $10,397.63, respectively.

  11. Kimberly Barrios says:

    The mode of the age variable in my data is 58, which shows me that the information I read from other sources was correct about the age group. The mean, or average, cholesterol level was 246.26.

  12. Joseph Habert says:

    I calculated the mean, median, and mode for the price column of the Airbnb data.
    Mean: 123.18
    Median: 90
    Mode: 100
    I noticed that all three answers are relatively close in value, but not close enough to each other for one to be a definitive outlier.

  13. SAMEER DHIMAN says:

    One of the variables I have is already the mean temperature of each year used. However, I used mode for the entire mean column and got -0.21. The median I got for the mean column was 0.0534.

Comments are closed.