How does one of the largest cities in the world use data for social good?

New York agency analyzes interagency data to visualize needs around the city and improve assistance programs

By Alison Bolen, SAS Insights Editor

Each year close to 1,000 adolescents leave the foster care system in New York City without a strong support system to help them navigate the adult world of housing, education, employment and health care. The transition into adulthood can be especially difficult without a strong family network, leading to higher rates of homelessness, incarceration and school dropouts for foster youth.

Recently, a longitudinal study conducted by the Center for Innovation through Data Intelligence (CIDI) in conjunction with Good Shepherd Services showed that supportive housing programs that combine housing with targeted services can produce significant results. CIDI found that participants in one such program were 36 percent less likely to stay in the single-adult shelter system and 55 percent less likely to go to jail over a two-year period than members of the comparison group.

We reached out to Maryanne Schretzman, Executive Director at CIDI, to learn more about the organization’s use of data to influence policy and effect change in the nation’s largest city. Founded in 2011, CIDI is the analytics research arm of New York City’s Deputy Mayor for Health and Human Services. CIDI collaborates with all Health and Human Service agencies and other city partners to research and promote policy improvements with a goal of benefiting all New Yorkers.

As more organizations and individuals are working together to help solve society’s problems using data, how do you see technology driving this trend? What particular factors or technologies are key?

Maryanne Schretzman: It’s been particularly exciting to see the emergence of more centers like the Center for Innovation through Data Intelligence, both within government and in other sectors. People are increasingly realizing the importance of interagency and intersector work, which allows us to get a more holistic view of the outcomes of the individuals that we serve and the impact of human service programs.

The most critical social policy issues cut across agency, regulatory and budgetary boundaries. It has always been a problem that agencies and organizations are siloed, and therefore, so is the data. Integrating data from various social service agencies facilitates multidisciplinary teams coming together to think through problems. That is essential to understanding the bigger picture and figuring out the places where intervention matters most.

I think one of the most exciting factors in analytics is the growing emphasis on merging various administrative databases with geomapping analysis. The advancements in mapping technology and the enthusiasm with which mapping has been embraced are hugely important to planning processes. Mapping allows us to look at issues from a community-based perspective, and in a city as diverse and as large as New York, it’s really helpful to see the differences between communities. Maps can also visualize a lot of data in a very straightforward and beautiful way, which aids in understanding the issue at hand and keeps stakeholders engaged.

In what areas do you see data being able to help society the most? What major problems are within our reach to solve?

Schretzman: We work primarily with health and human services agencies and there is a lot of work being done to better match programming with the specific needs of individuals. It is no longer a one-size-fits-all approach where, by default, some people get more and others get less than they need. In many human service systems, approximately 25 to 30 percent of people end up being high-end users of a particular system. It is important to identify the characteristics of this group so you can intervene and help reduce the needs of this high-risk group.

The focus on data integration allows for a process that embraces both a community approach, which identifies a neighborhood’s structural deficits and its existing assets, and a targeted approach, which identifies those individuals within a neighborhood who are at greatest risk for poor outcomes. Linking a community-based strategy with a targeted approach provides for a collective and coordinated impact in specific neighborhoods. Being able to identify risk factors for major problems such as homelessness and child abuse allows city agencies to work together better and allocate resources that match people’s needs.
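
As a rough illustration of the targeted approach described above, the sketch below ranks clients by how often they use a service and flags the top share as high-end users. The function name `high_end_users`, the 25 percent cutoff and the usage counts are all hypothetical, chosen only for illustration; they are not CIDI's method or data.

```python
def high_end_users(usage, share=0.25):
    """Return the client IDs in the top `share` of service use.

    `usage` maps a client ID to a count of service contacts.
    The 0.25 default is an assumed cutoff, not a published threshold.
    """
    # Rank clients from heaviest to lightest service use.
    ranked = sorted(usage, key=usage.get, reverse=True)
    # Keep at least one client, even for very small groups.
    cutoff = max(1, round(len(ranked) * share))
    return ranked[:cutoff]


# Made-up counts for eight clients, for illustration only.
usage = {"a": 12, "b": 1, "c": 7, "d": 2, "e": 0, "f": 3, "g": 1, "h": 30}
top = high_end_users(usage)
print(top)
```

In practice the flagged group would then be profiled for shared characteristics, which is the step the interview describes as the basis for intervention.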

Can you give a few specific examples from your organization of data being used for social good?

Schretzman: We recently completed an outcomes study of youth leaving foster care, youth leaving the justice system, and youth involved in both systems. We were able to combine data from multiple city and state agencies to glean information about their adult outcomes over six years. Knowing their outcomes also let us explore early risk factors for high-cost service use so we could think about ways to target services and support system-involved youth.

We are also working with several task forces throughout the city to help visualize community-based data for planning purposes through GIS analysis. We combine data from national surveys such as the American Community Survey, agency service-use data, and data about the locations of programs and resources to better coordinate existing resources and understand where more may be needed.

How does collaboration between different groups and agencies play a role? Why is this important?

Schretzman: Collaboration is one of the fundamental pieces of what makes CIDI what it is. Because we are primarily using data from other agencies, we work closely with research and policy teams of those agencies to contextualize and understand the data and develop recommendations that are relevant to the agency’s work and build on their current infrastructure. This connection allows our analyses to be utilized in a concrete way and helps to guide our analytics work. There are so many interesting questions that can be asked with the available data, but by keeping our work focused on analyses that will help the agencies and their clients, we ensure that we have an outlet to put our findings into practice.

Beyond city agencies, we also collaborate with nonprofit providers and advocates to make sure our work is informed by the work that is happening on the ground. We often visit sites to speak with program staff when we are first getting involved with projects so we can appreciate the stories of the clients that our data may not be able to capture.

How does your organization address concerns over privacy?

Schretzman: CIDI is extremely careful in our data storage and analyses. On the technical side, we have legal and technological structures in place to protect data from being used improperly. We also spend a lot of time thinking through the implications of our work to make sure that our analyses do not have unintended consequences for people we are trying to help. We have to think about privacy concerns in the context of communities and people; as analyses get focused on smaller and smaller geographies and more focused populations, the concerns about privacy become greater, so we try to take this into account when we are planning analyses. We never forget that our data are people and they need to be treated with respect and dignity. We strive to provide a context for data analysis and use an institutional review board and other processes to ensure that we are not just abiding by the governance of law, but are also abiding by the guidance and standards set forth by the National Association of Social Workers.

How are you dealing with the challenges of big data?

Schretzman: Currently we access data from the agencies directly, which allows us an opportunity to think through the data and analyses with our partners. Most of our work then consists of taking the separate data sets and combining them into a usable format. Although this process can be time-consuming, it does compel us to be more thoughtful in designing our research projects. We have to be very deliberate in deciding which questions we want to answer and for which population in order to pull and structure the appropriate data. This process keeps us from pursuing analyses that have not been fully thought out. By spending a lot of time on the questions, we have been able to set up a disciplined process for deciding upon the best methods for analysis. We are aware that our data sets can become quite large and that over- and under-fitting of the data can become an issue. We have checks and balances throughout the process to mitigate these issues.
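
To make the combining step concrete, here is a minimal sketch of linking separate agency data sets into one record per person. It assumes every source shares a clean `person_id` field, which real administrative data often lacks. The agency names, field names and rows are hypothetical and do not come from CIDI.

```python
from collections import defaultdict


def link_records(datasets, key="person_id"):
    """Combine per-agency rows into one dict per person.

    `datasets` maps a (hypothetical) agency name to a list of dict rows.
    Assumes each row carries the same identifier under `key`.
    """
    linked = defaultdict(dict)
    for agency, rows in datasets.items():
        for row in rows:
            person = row[key]
            # Prefix each field with its source so that columns from
            # different agencies cannot overwrite one another.
            for field, value in row.items():
                if field != key:
                    linked[person][f"{agency}.{field}"] = value
    return dict(linked)


# Made-up rows, for illustration only.
foster_care = [
    {"person_id": 1, "exit_year": 2010},
    {"person_id": 2, "exit_year": 2011},
]
shelter = [{"person_id": 2, "shelter_stays": 3}]

combined = link_records({"foster_care": foster_care, "shelter": shelter})
print(combined[2])
```

Keeping the source name on every field makes it easy to trace a value back to the agency that supplied it when the combined file is reviewed with partners.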

What other challenges remain with the “data for social good” movement?

Schretzman: Technological advances are occurring so rapidly, it is difficult for government agencies and nonprofits to keep up with them sometimes, particularly in relation to the private sector. As advancements are made, it is important for groups like CIDI to figure out how they can be applied in health and human services while still maintaining integrity and confidentiality. In some cases more data is not necessary to solve problems and we need to implement what we already know!

What can other organizations or cities learn from the work you are doing? How can some of your projects be applied at a smaller scale for smaller towns, or at a larger scale for entire regions?

Schretzman: I believe the best thing to do is to start with the data you have and begin to use it. In my experience when you begin to use it for accountability, data entry becomes more accurate. New York City probably collects more data than most places, and much of our administrative data is very rich with details. However, whenever we conduct a new analysis or ask a new question, we learn about other things we wish agencies were asking or storing in a more analyzable way. For places that are just starting to think about how data can be helpful, I recommend beginning with asking a question and then seeing what data is available to answer it. If elements are missing, it can inform a larger process of data collection. We have undertaken several multi-city projects and have had to think through how data from different systems can be leveraged to answer the same questions across sites.

Depending on the data available, I think lots of our work can be applied at any level – it is more a matter of coordinating data across different agencies if needed and choosing questions that are relevant and helpful to the people and communities being served.

