Overview: Once you have organized into a team, the next step is to select a project to work on. You are free to identify a project that focuses on the Higher Education or NYC Government domain or sector. Be creative: pick a project you are interested in exploring, or select one of the projects listed below that have been identified by the organizations listed. Looking for an idea related to NYC Government? Below are links to websites that review the operation of NYC Government and the services it provides to New Yorkers.
Below are projects provided by government and nonprofit organizations looking for creative ways to address them. These projects describe real problems identified by HELPUSA, a nonprofit that provides housing for those in need, the NYC Department of Health and Mental Hygiene, the NYC Department of Information Technology and Telecommunications, and the NYC Economic Development Corporation. In total there are 7 projects listed below. A maximum of two teams will be able to work on each of the projects listed below; project selection will be handled on a first-come, first-served basis. For each organization a link is provided to its website.
Make your choice as soon as possible. Your project selection becomes effective as soon as you file (email to [email protected]) your TEAM INFORMATION FORM listing your team members and project description.
LINKS TO ORGANIZATIONS THAT REPORT ON NYC GOVERNMENT
- The Independent Budget Office (IBO) – click on Publications
- The Citizens Budget Commission (CBC) – click on How Much is Enough, a report on the homeless crisis in NYC
- Read the New York City Comptroller’s 2015 Audit Report
- Link to NYC Government Website – general source of information
ONE PROJECT PROVIDED BY MAYOR’S OFFICE OF TECHNOLOGY AND INNOVATION: NYC Open Data Portal
Challenge statement: Develop natural language processing for data queries in the NYC Open Data portal so that everyday New Yorkers can use it without data science skills. For example, a New Yorker searches the NYC Open Data Portal with the question “Does restaurant X have an A rating?” and the response is “No”, rather than the user having to find the appropriate dataset and analyze the data to get an answer.
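As a rough illustration of the kind of question-to-answer flow the challenge statement describes, here is a minimal sketch. It assumes a single fixed question pattern and a tiny in-memory table; the `GRADES` data, the regular expression, and the `answer` function are all hypothetical and do not reflect the actual Open Data Portal or any Watson API.

```python
import re

# Hypothetical stand-in for rows pulled from a restaurant inspection dataset.
GRADES = {"joe's pizza": "A", "corner deli": "B"}

# Matches questions shaped like "Does restaurant X have an A rating?"
QUESTION = re.compile(
    r"does (?:restaurant )?(?P<name>.+?) have an? (?P<grade>[a-c]) rating\??$",
    re.IGNORECASE,
)


def answer(question):
    """Return a plain Yes/No answer for one narrow question pattern."""
    match = QUESTION.match(question.strip())
    if not match:
        return "Sorry, I did not understand the question."
    actual = GRADES.get(match.group("name").lower())
    if actual is None:
        return "Restaurant not found."
    return "Yes" if actual == match.group("grade").upper() else "No"


print(answer("Does Corner Deli have an A rating?"))
```

A real solution would need far more flexible language understanding and a live dataset lookup; the sketch only shows the shape of the input and output.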
It should be understood that any product produced as a result of student work during the CUNY-IBM Watson Competition would be provided to the City under a non-exclusive, fully paid-up, royalty-free, worldwide license in perpetuity. (These are the terms we required of City hackathon participants.)
FOUR PROJECTS PROVIDED BY NYC DEPARTMENT OF HEALTH AND MENTAL HYGIENE
- Foodborne Disease Surveillance & Natural Language Processing: There is a significant burden of foodborne illness in New York City (NYC), with approximately 3,200 cases and 150 clusters and outbreaks reported each year. In 2009, the NYC Department of Health and Mental Hygiene (DOHMH) became one of what are now ten FoodCORE sites in the United States. Sites receive funding to enhance foodborne disease surveillance and outbreak investigations, which is used to improve epidemiologic response, laboratory capacity, and environmental health assessments. With this funding, DOHMH is able to hire a team of graduate student interns to interview all reported cases of salmonellosis (approximately 1,000 per year). Each interview consists of completing an extensive hypothesis-generating questionnaire that collects a detailed exposure history, focusing on travel, restaurants, grocery stores, and specific food items. While some questions are structured, many answers are stored in free-text fields, which can be problematic for analytical purposes. We believe there is valuable information stored in free-text fields that is not currently being analyzed, and we wish to develop analytical methodologies to standardize and analyze these fields in order to improve the detection of common exposures, outbreaks, and clusters of salmonellosis in NYC. Analyses could also be developed to compare exposures of cluster cases (cases who are infected with Salmonella isolates with matching DNA fingerprints) to cases not in the cluster, which we believe would aid in determining the sources of Salmonella infections. In addition, the developed methodology could be applied to improve the surveillance of other diseases in NYC. We believe that text mining and machine learning algorithms could be developed to assist us with this endeavor, and we seek a team of students to help us with this project.
- Disease Surveillance Predictive Analytics: The New York City (NYC) Department of Health and Mental Hygiene (DOHMH) needs timely and accurate information about diseases, conditions, and events that affect New Yorkers to protect and promote the community’s health. Electronic disease surveillance enables DOHMH to monitor over 90 infectious diseases through a variety of means, including electronic reports from laboratories and healthcare providers. While data continuously converge around standards, reports will always differ in quality and format given the large number of diseases, tests, providers, and laboratories. To address this, DOHMH’s electronic disease reporting infrastructure employs a series of sophisticated business rules that help interpret these reports to inform public health action such as routine case follow-up and outbreak investigation. However, these rules are very specific and require a high degree of human review. Epidemiologists and research scientists commit time to routine manual checking of these reports when they could otherwise attend to higher-value work, including developing statistical models, improving current systems, or conducting other research activities. DOHMH is interested in exploring machine learning methods to increase automation of disease report interpretation. In addition to disease outcomes, DOHMH is also interested in techniques to accurately predict report details that are often incomplete, such as healthcare provider address. These developments could benefit other health departments that wrestle with similar challenges.
- Exploring SPARCS Outpatient Data: The NYC Department of Health and Mental Hygiene (NYC DOHMH) would like to understand the estimated 6.5 million outpatient discharges per year, more than half of which are to NYC residents, in order to quantify access to outpatient preventive and screening services, laboratory tests, and mental health treatment in NYC, to identify gaps in services, and to develop a better understanding of care-seeking behavior for specific diseases including pneumonia and various chronic conditions. The CUNY-IBM team will be responsible for creating frequencies on the diagnoses and procedures received, broken down by patient demographic characteristics, and for demonstrating whether and how they have changed over time. Such information will help formulate needed ongoing surveillance activities, research questions, and/or policy work. The Statewide Planning and Research Cooperative System (SPARCS) currently collects patient-level detail on demographics, diagnoses, procedures, and charges from health care facilities certified under Article 28 of the New York State Public Health Law. SPARCS includes inpatient hospital discharge data since 1980, emergency department admission data since 2003, ambulatory surgery data since 1983, and outpatient data since 2011. Due to the sheer volume and complexity of outpatient data, the department has uploaded and made available to health analysts only inpatient and emergency department data since getting access to these data through a data use agreement with the New York State Health Department in 2013. The ambulatory surgery data and the remainder of the outpatient data are currently being uploaded to the department’s SPARCS warehouse. We hope to partner with CUNY-IBM to catalyze our exploration of these outpatient events for surveillance, research, and potentially policy purposes. The data are received annually as a flat file, and are structured and static. The data are supposed to include a unique individual identifier. However, our experience with the inpatient and emergency department data has revealed that the unique identifier is not always unique.
- Provider Communication Improvements: The New York City Department of Health and Mental Hygiene (DOHMH) is in regular communication with area health care providers, giving information in emergency situations and disseminating news of public health import. Communications are complicated because there are numerous distribution lists with fluctuating information, updated at different times, and with varying levels of accuracy. The data from the disparate sources need to be matched and cleaned appropriately to produce a universal provider contact database. Watson technology may assist in selecting the correct contact information from the various sources to ensure that the providers are reached at the right time using the right channel. DOHMH would also like to better understand the gaps in the current lists to guide outreach and improve coverage.
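To make the matching-and-cleaning step in the Provider Communication project more concrete, the following is a minimal sketch, not DOHMH's method. It assumes each distribution list yields records with a provider name, a phone number, and a last-updated date; the `ContactRecord` fields, the list names, and the "most recently updated wins" rule are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ContactRecord:
    source: str          # hypothetical distribution list name
    provider_name: str
    phone: str
    updated: date


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def consolidate(records):
    """Keep the most recently updated record for each normalized provider name."""
    best = {}
    for rec in records:
        key = normalize_name(rec.provider_name)
        if key not in best or rec.updated > best[key].updated:
            best[key] = rec
    return best


records = [
    ContactRecord("list_a", "Dr. Jane Smith", "212-555-0100", date(2015, 3, 1)),
    ContactRecord("list_b", "dr jane  smith", "212-555-0199", date(2016, 9, 15)),
    ContactRecord("list_a", "John Doe, MD", "718-555-0142", date(2016, 1, 10)),
]
merged = consolidate(records)
for key, rec in merged.items():
    print(key, rec.phone, rec.source)
```

Exact matching on a normalized name is deliberately simplistic; real lists would likely need fuzzy matching and additional fields (license number, address) to avoid merging different providers who share a name.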
Two Projects Provided by NYC Department of Information Technology and Telecommunications – DOITT
The two projects listed below were submitted by the NYC Mayor’s Office for People with Disabilities (MOPD). MOPD is working to improve access to public spaces and city services for the disability community. One project focuses on the wheelchair accessibility of residential buildings and the other on the accessibility of NYC Government websites. These projects provide opportunities for team members to learn how to design projects with the disability community in mind so that their end product is accessible.
MOPD’s mission is to make NYC the most accessible city in the world because accessibility benefits everyone. Curb cuts, ramps, and elevators benefit people who use wheelchairs, but they also benefit parents with strollers and those who are transporting heavy objects; captioning videos benefits people who are deaf but is also useful for everyone in noisy environments; and technologies like Siri are a result of research to create hands-free technologies for people with physical disabilities.
It is suggested that your team become familiar with the concept of Universal Design. Wikipedia describes the evolution of Universal Design as follows: “The term ‘universal design’ was coined by the architect Ronald L. Mace to describe the concept of designing all products and the built environment to be aesthetic and usable to the greatest extent possible by everyone, regardless of their age, ability, or status in life. However, it was the work of Selwyn Goldsmith, author of Designing for the Disabled (1963), who really pioneered the concept of free access for disabled people. His most significant achievement was the creation of the dropped curb – now a standard feature of the built environment.” (https://en.wikipedia.org/wiki/Universal_design)
- The Mayor’s Office for People with Disabilities (MOPD) has data on approximately 35,000 buildings in New York City that were built from 1990 onward. These buildings are supposed to be wheelchair accessible and compliant with the Americans with Disabilities Act (ADA). Currently all we have is a spreadsheet with the data (the year each building was constructed and which ones have elevators), and we would like to provide the disability community with tools to easily access the information on these buildings. Basically, the city has assumed that because these buildings were constructed after 1990, they are ADA compliant. However, the city needs to confirm whether each of these buildings is or is not ADA compliant. The approach suggested is for the public to provide feedback to the city on the ADA compliance status of these buildings. To accomplish this, MOPD would like the public to be able to access a list of these buildings that can be searched by borough, zip code, or address, with the ability to comment on whether a building is ADA compliant or not. The interface could be a webpage or a mobile app.
- The number of websites managed by the city is somewhere around 300, and in total they contain thousands of pages. We are working on surveying the compliance of all websites managed by the City of New York. By compliance we mean that the websites are accessible, e.g., they include alt-text for images, proper labels for buttons and links, properly formatted headings, etc. As we collect this data, we will be submitting it to City Council as well as sharing it with the community. DoITT would like a convenient and streamlined way to add and organize this data as we collect it. At present each website is being tested using a combination of manual review and automated tools to determine whether the website is accessible. The desired outcome of this project is a more automated and efficient method of collecting website accessibility data. After entering data, DoITT would like to be able to automatically generate a report on the accessibility of NYC agency websites. This project would be used primarily by DoITT and would most likely not be developed into a widely distributed mobile application.
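One of the accessibility checks named in the website project, alt-text for images, can be automated with very little code. The sketch below is only an assumption about how such a check might look; it uses Python's standard `html.parser` module, and the `AltTextChecker` class, `check_page` function, and sample markup are hypothetical. It is not one of the automated tools DoITT currently uses.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Counts <img> tags and records those with a missing or empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.total_images = 0
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total_images += 1
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # An empty alt can be legitimate for decorative images, so entries
        # recorded here are candidates for manual review, not confirmed failures.
        if alt is None or not alt.strip():
            self.missing_alt.append(attr_map.get("src", "(no src)"))


def check_page(html):
    """Return a small report for one page of HTML."""
    checker = AltTextChecker()
    checker.feed(html)
    return {"images": checker.total_images, "missing_alt": checker.missing_alt}


sample = (
    '<h1>Agency</h1><img src="seal.png" alt="City seal">'
    '<img src="chart.png"><img src="spacer.gif" alt="">'
)
report = check_page(sample)
print(report)
```

Reports like this, collected per page and per site, could feed the kind of automatically generated summary the project describes; checks for button labels and heading structure would follow the same pattern.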
One Project Provided by NYC Economic Development Corporation – NYCEDC
NYCEDC is the City’s primary engine for economic development, charged with leveraging the City’s assets to drive growth, create jobs, and improve quality of life.
EDC as an organization is dedicated to New York City and its people. We use our expertise to develop, advise, manage, and invest to strengthen businesses and help neighborhoods thrive. We make the City stronger. NYCEDC also helps create affordable housing, new parks, shopping areas, community centers, cultural centers, and much more.
Here at the New York City Economic Development Corporation, we think a lot about the opportunities and challenges of urban development. In fact, last week Mayor Bill de Blasio announced his plan to bring 100,000 good new jobs to the City over the next ten years. But what exactly is a “good job”, and what industries are those jobs in? What types of strategies should practitioners of urban development take to secure good jobs for New Yorkers?
These are some of the questions that we ask here on the Economic Research and Analysis team at EDC. As “economic nerds”, our first step is to look to the data – and there is A LOT of data – to analyze and hopefully inform the decisions we make as a company. We look at everything from structured census demographic data to large unstructured data sets generated from scraping Twitter. We believe in the power of natural language processing and building customized data warehouses to confront issues of economic development.
Data: We have census data for all of the companies that operate in New York City. The data reported for each company include its geographic location, financial and employment information, and the industry classification that has been assigned to it. This current classification system is used to calculate aggregate statistics by industry.
However, the current classification system does not reflect new and emerging industries. As the nature of work changes with automation and tech growth, defining and understanding traditional industries becomes even more difficult. One of our greatest challenges here is figuring out exactly how to define and categorize industries. How do we figure out exactly how many jobs are in new industries such as Fin-Tech or Health-Tech when these industry classifications do not currently exist? Another example is Fitbit: is it a healthcare company, a technology company, or even a wearable accessory company?
To deal with the shortfalls of the current classification system, NYC EDC has developed a new classification system that includes the new and emerging industries referred to in the above paragraph. NYC EDC has also developed definitions for each of the new classification codes. The project involves crawling the web, company by company, and analyzing the unstructured data found on each company’s website to determine its new classification code.
If you select this project, your team and EDC staff will work together to understand how big these emerging industries are in New York City. Once we have that information, we can create and write policy based on the sub-sectors that have the strongest growth or show the most promise. We believe that a team of students, paired with Watson, can help gather and analyze live industry data in a way that’s never been done before. Join us!
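As a very rough illustration of assigning a classification code from website text, here is a keyword-counting sketch. It is a simple baseline under stated assumptions, not EDC's classification system: the `KEYWORDS` lists are invented for the example (only the Fin-Tech and Health-Tech labels come from the description above), and the text is passed in directly rather than crawled.

```python
import re
from collections import Counter

# Hypothetical keyword lists; EDC's real definitions for each code are not shown here.
KEYWORDS = {
    "Fin-Tech": {"payments", "lending", "banking", "blockchain"},
    "Health-Tech": {"patients", "clinical", "telemedicine", "wellness"},
}


def classify(text):
    """Pick the label whose keywords occur most often in the text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {label: sum(words[w] for w in kws) for label, kws in KEYWORDS.items()}
    # On a tie, max() returns the first label in dictionary order.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unclassified"


print(classify("Our mobile payments platform makes small-business lending simple."))
```

A company such as the Fitbit example would likely score on several lists at once, which is exactly where a trained text classifier or a Watson service would be expected to do better than raw keyword counts.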
One Project Proposed by NYC Department of Records and Information Services
Overview:
Established in 1977, the Department of Records and Information Services preserves and provides public access to historical and contemporary records and information about New York City government through the Municipal Library and Municipal Archives.
The NYC OpenRecords Portal (http://nyc.gov/openrecords) is an open-source web application that streamlines the process for the public to request information from any City agency through the Freedom of Information Law (https://www.dos.ny.gov/coog/foil2.html). It also provides an intuitive interface to view existing FOIL requests and statistics for the City as a whole.
Project:
The OpenRecords project team wants to make it easier for New Yorkers to find the information they are seeking, whether through the FOIL process, Government Publications (nyc.gov/publications), 311 (http://www1.nyc.gov/311/index.page), or OpenData (http://opendata.cityofnewyork.us/).
Right now, the interface doesn’t provide the user with an easy way to find information beyond basic keyword searching. If we could map the different City agencies’ business functions, records, and information portals into a simple search interface, it would be easier for everyone to find the information they want without wasting City resources by submitting FOIL requests.
Watson’s ability to communicate in natural language and to process large amounts of unstructured data suggests that a Watson mobile app built with Bluemix services might give the public easy access to the City’s records and archival information.
ONE PROJECT PROVIDED BY HELPUSA
Project Description: HELP Hollis Garden Apartments, HELP USA’s permanent supportive housing residence in Hollis, Queens, provides a holistic aftercare approach to sustaining subsidized housing for recently homeless veterans and senior citizens at risk of homelessness. The programs and services are provided at the residence in partnership with the City University of New York (CUNY). CUNY offers education and workforce development services while the HELP USA staff provides property management and clinical case management services. The team has developed a service set intended to address the needs of the tenant base that includes acupuncture, clinical and case management sessions, legal services, digital photography, job club, and support groups.
Keeping clients employed and housed is the ultimate goal, but understanding how best to accomplish that goal with the target population can be difficult. As part of the 2017 CUNY-IBM Watson Competition, HELP USA hopes to apply this technology to better track program performance, evaluate the efficiency of present service provision, and improve service delivery and client outcomes. Of particular interest are those cases where clients find themselves dissatisfied with the services available (not useful, effective, etc.); we would like to get more aligned with their wants and needs while also remaining within the scope of our business model, mission, and budget.
Additionally, the IBM Watson initiative provides a tremendous opportunity to gather and synthesize information from various federal, state, and local sources on benefits available to military veterans. A comprehensive, easy to understand resource would help veterans navigate what is often a complex and confusing system not only at Hollis, but also at our other residences in New York, New Jersey, and Philadelphia.