Sara Kingsley, Carnegie Mellon University with Ricardo Sandoval, Vanderbilt University
Understanding Feature Selection Practices of Social Work Researchers
Implementing effective poverty-alleviation programs in the US requires understanding the dynamics of poverty and its interactions with domains including health and nutrition. Despite significant computational advances, computing researchers remain limited in their ability to predict poverty outcomes. In fact, many studies show that regression-based analyses using a handful of features selected by domain experts can outperform sophisticated analyses using hundreds or thousands of features.
In this project, we pursue two lines of inquiry. First, we ask whether feature selection techniques can give us insights into the impact of health and nutrition on poverty outcomes, despite these limitations in prediction accuracy. We use a large-scale administrative dataset, the Survey of Income and Program Participation, to examine this set of computational questions. Second, we investigate the feature selection practices of social work researchers. We report on an ongoing user study based on in-depth interviews about the data practices and feature selection approaches of social work researchers, including when and where they differ from those of data scientists. We close by discussing future work on building an interactive data and poverty analysis platform to enable knowledge transfer between data science researchers and social work researchers and to foster the nascent but growing field of computational social work.
Rediet Abebe & Elena Glassman