Project Ideas

Team 3 | CIS 4400 | 5:50 pm Tues/Thur

      1. MD Abir A. Choudhury –
      2. Steven Amadou –
      3. Luan Da Silva –
      4. Pete Destil –

First Idea Dump: Wellbeing of Restaurants in NYC

The goal of this project is to create a health data warehouse for analyzing the health of cities based on restaurant data. There are multiple ways to approach this, but we theorize that analyzing the restaurant rating dataset will give us a health outlook for the city. We also want to bring in a dataset on the city's rodent population and relate it to the health of the city's restaurants. It would be interesting to see whether the population density of rodents in a neighborhood correlates with the health of the restaurants in that neighborhood.
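As a rough illustration of the neighborhood-level comparison described above, the following Python sketch averages inspection scores per ZIP code, counts rodent sightings per ZIP code, and computes a Pearson correlation between the two. The rows are invented toy data, and the field names (`zipcode`, `score`) are assumptions rather than the actual column names of the NYC Open Data datasets. It also assumes a higher score means a worse inspection result.

```python
from collections import defaultdict
from math import sqrt

# Invented toy rows for illustration only; field names are assumptions,
# not the real NYC Open Data column names.
inspections = [
    {"zipcode": "10001", "score": 12},
    {"zipcode": "10001", "score": 18},
    {"zipcode": "10002", "score": 9},
    {"zipcode": "10003", "score": 30},
    {"zipcode": "10003", "score": 26},
    {"zipcode": "10004", "score": 21},
]
rodent_sightings = [
    {"zipcode": "10001"}, {"zipcode": "10001"},
    {"zipcode": "10002"},
    {"zipcode": "10003"}, {"zipcode": "10003"},
    {"zipcode": "10003"}, {"zipcode": "10003"},
    {"zipcode": "10005"},
]


def mean_score_by_zip(rows):
    """Average inspection score for each ZIP code."""
    totals = defaultdict(lambda: [0, 0])  # zipcode -> [sum of scores, row count]
    for row in rows:
        totals[row["zipcode"]][0] += row["score"]
        totals[row["zipcode"]][1] += 1
    return {z: total / n for z, (total, n) in totals.items()}


def count_by_zip(rows):
    """Number of rows (sightings) for each ZIP code."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["zipcode"]] += 1
    return dict(counts)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def rodent_score_correlation(inspection_rows, sighting_rows):
    """Correlate sighting counts with mean scores over ZIP codes in both sets."""
    scores = mean_score_by_zip(inspection_rows)
    counts = count_by_zip(sighting_rows)
    shared = sorted(set(scores) & set(counts))
    return pearson([counts[z] for z in shared], [scores[z] for z in shared])


print(rodent_score_correlation(inspections, rodent_sightings))
```

ZIP codes that appear in only one dataset are dropped here. In the real warehouse, ZIP codes with inspections but no sightings might instead be counted as zero sightings, and counts would need to be normalized by area or population before being called a density.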

If time permits, we hope to extend this project with a mobile application built on the warehouse. The application will give users a list of healthy restaurants based on their location. Users would also be able to review the health of a restaurant to confirm whether its NYC rating is accurate. If the city's rating doesn't match the users' reviews, the city would be able to return to the restaurant for another health inspection.

Data Sources:

  1. NYC Open Data Restaurant Dataset:

    26 Dimensions

  2. NYC Open Data Rodent Population Dataset:

    20 Dimensions


Second Idea Dump: Healthy eating, fried food consumption, and mortality

Goal: What is the relationship between weight or BMI and meal preparation patterns, consumption of fresh/fast food, or snacking patterns? Why do grocery shopping patterns differ by income?

Limiting the amount of fried food we consume has long been common advice in North American culture. Most of the fried food that we consume comes from fast-food restaurants. Frying strips food of key nutrients and increases the formation of advanced glycation end products and acrylamide, which contribute to oxidative stress and inflammation. Across the different datasets, we want to look for relationships between healthy eating, income, and lifestyle to see who is at greater risk of developing diseases.

Data Sources



Third Idea Dump:

The Centers for Medicare & Medicaid Services has a dataset on quality of care in over 4,000 Medicare-certified hospitals. An interesting question to look at would be: how well can life expectancy be predicted from the quality of these Medicare-certified hospitals? Beyond good-quality versus poor-quality hospitals, what other factors can influence a country's life expectancy?

Data Sources

  1. Life-expectancy dataset –

    13 Dimensions

  2. CMS dataset –

    An undetermined number of Dimensions


Fourth Idea Dump:

The goal is to compare all the arrests made by the NYPD in NYC during this year. This data is extracted manually every quarter and reviewed by the Office of Management Analysis and Planning. The purpose is to compare the number of arrests made in each borough by suspect demographics, so that we can understand the nature of police enforcement activity.
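The per-borough comparison could be sketched along the following lines in Python. The rows are invented toy data, and the field names (`borough`, `age_group`) are assumptions, not the actual column names of the NYPD arrest dataset. Any other demographic field could be passed in the same way.

```python
from collections import Counter

# Invented toy rows for illustration only; field names are assumptions.
arrests = [
    {"borough": "Brooklyn", "age_group": "25-44"},
    {"borough": "Brooklyn", "age_group": "25-44"},
    {"borough": "Brooklyn", "age_group": "18-24"},
    {"borough": "Bronx", "age_group": "45-64"},
    {"borough": "Queens", "age_group": "25-44"},
]


def arrests_by_borough(rows, group_field):
    """Count arrests for each (borough, demographic group) pair."""
    return Counter((row["borough"], row[group_field]) for row in rows)


def shares_within_borough(counts):
    """Convert pair counts into each group's share of its borough's arrests."""
    borough_totals = Counter()
    for (borough, _group), n in counts.items():
        borough_totals[borough] += n
    return {
        (borough, group): n / borough_totals[borough]
        for (borough, group), n in counts.items()
    }


counts = arrests_by_borough(arrests, "age_group")
shares = shares_within_borough(counts)
for (borough, group), share in sorted(shares.items()):
    print(borough, group, counts[(borough, group)], round(share, 2))
```

Reporting shares alongside raw counts makes boroughs with very different arrest totals easier to compare.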

Number of dimensions: 18

NYC Open Data