Gary King: "Computer-Assisted Clustering and Conceptualization from Unstructured Text"

Date: Monday, March 7, 2011, 11:30am to 1:00pm

Location: Maxwell Dworkin 119

CRCS Lunch Seminar

Date: Monday, March 7, 2011
Time: 11:30am – 1:00pm
Place: Maxwell Dworkin 119

Speaker: Gary King, Dept. of Government, Harvard

Title: Computer-Assisted Clustering and Conceptualization from Unstructured Text

Abstract: We develop two computer-assisted methods for the discovery of insightful conceptualizations, in the form of clusterings (i.e., partitions) of input objects. Each of the numerous fully automated methods of cluster analysis proposed in statistics, computer science, and biology optimizes a different objective function. Almost all are well defined, but how to determine before the fact which one, if any, will partition a given set of objects in an “insightful” or “useful” way for a given user is unknown and difficult, if not logically impossible. We develop a metric space of the clusterings of a given data set from all cluster analysis methods presently known, as well as the partitions from all methods not yet invented (i.e., all possible clusterings), and enable a user to explore and interact with it, and quickly reveal or prompt useful or insightful conceptualizations. In addition, although uncommon in unsupervised learning problems, we offer and implement evaluation designs that make our computer-assisted approach vulnerable to being proven suboptimal in specific data types. We demonstrate that our approach facilitates more efficient and insightful discovery of useful information than either expert human readers or existing fully automated methods. We (will) make available an easy-to-use software package that implements all our suggestions.
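To make the idea of a "metric space of clusterings" concrete, here is a minimal Python sketch of one well-known distance between two partitions of the same objects, the variation of information. The abstract does not say which metric the speakers use, so this choice, and the function name `variation_of_information`, are assumptions made purely for illustration and not a description of the authors' software.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Distance between two partitions of the same objects, in nats.

    Illustrative assumption: the abstract does not name its metric.
    VI(A, B) = H(A | B) + H(B | A); it is zero exactly when the two
    partitions group the objects identically, whatever the label names.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n == 0:
        return 0.0

    size_a = Counter(labels_a)                 # cluster sizes in partition A
    size_b = Counter(labels_b)                 # cluster sizes in partition B
    overlap = Counter(zip(labels_a, labels_b)) # sizes of pairwise intersections

    vi = 0.0
    for (a, b), n_ab in overlap.items():
        p_ab = n_ab / n
        # -p(a,b) * [log p(a|b) + log p(b|a)]
        vi -= p_ab * (log(n_ab / size_b[b]) + log(n_ab / size_a[a]))
    return vi


# Three hypothetical clusterings of the same four documents.
by_topic = [0, 0, 1, 1]
by_topic_relabeled = ["x", "x", "y", "y"]  # same grouping, different names
by_author = [0, 1, 0, 1]

d_same = variation_of_information(by_topic, by_topic_relabeled)
d_diff = variation_of_information(by_topic, by_author)
```

Computing such a distance for every pair of clusterings produced by different methods gives a distance matrix, which could then be projected to two dimensions for the kind of interactive exploration the abstract describes.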

Bio: Gary King is the Albert J. Weatherhead III University Professor at Harvard University (one of 23 with the title of University Professor, Harvard’s most distinguished faculty position). He is based in the Department of Government (in the Faculty of Arts and Sciences) and serves as Director of the Institute for Quantitative Social Science. King develops and applies empirical methods in many areas of social science research, focusing on innovations that span the range from statistical theory to practical application.

King has been elected Fellow in 6 honorary societies (National Academy of Sciences 2010, American Statistical Association 2009, American Association for the Advancement of Science 2004, American Academy of Arts and Sciences 1998, Society for Political Methodology 2008, and American Academy of Political and Social Science 2004), President of the Society for Political Methodology (1997-1999), and Vice President of the American Political Science Association (2003-2004). He was appointed a Fellow of the Guggenheim Foundation (1994-1995), Visiting Fellow at Oxford (1994), and Senior Science Advisor to the World Health Organization (1998-2003). King has won more than 30 “best of” awards for his work — including the Career Achievement Award (2010), Warren Miller Prize (2008), McGraw-Hill Award (2006), Durr Award (2005), Gosnell Prize (1999 and 1997), Outstanding Statistical Application Award (2000), Donald Campbell Award (1997), Eulau Award (1995), Mills Award (1993), Pi Sigma Alpha Award (2005, 1998, and 1993), APSA Research Software Award (2005, 1997, 1994, and 1992), Okidata Best Research Software Award (1999), and Okidata Best Research Web Site Award (1999), among others. His more than 125 journal articles, 15 open source software packages, and 8 books span most aspects of political methodology, many fields of political science, and several other scholarly disciplines.

King’s work is widely read across scholarly fields and beyond academia. He was listed as the most cited political scientist of his cohort; among the group of “political scientists who have made the most important theoretical contributions” to the discipline “from its beginnings in the late-19th century to the present”; and on ISI’s list of the most highly cited researchers across the social sciences. His work on legislative redistricting has been used in most American states by legislators, judges, lawyers, political parties, minority groups, and private citizens, as well as the U.S. Supreme Court. His work on inferring individual behavior from aggregate data has been used in as many states by these groups, and in many other practical contexts. His contributions to methods for achieving cross-cultural comparability in survey research have been used in surveys in over eighty countries by researchers, governments, and private concerns. King led an evaluation of the Mexican universal health insurance program, which includes the largest randomized health policy experiment to date. The statistical methods and software he developed are used extensively in academia, government, consulting, and private industry.

King has had many students and postdocs, many of whom now hold faculty positions at leading universities. He has collaborated with more than seventy scholars, including many of his students, on research for publication. He has served on 25 editorial boards; on the governing councils of the American Political Science Association, Inter-university Consortium for Political and Social Research, the Society for Political Methodology, and the Midwest Political Science Association; and on several National Research Council and National Science Foundation panels.

King received a B.A. from SUNY New Paltz (1980) and a Ph.D. from the University of Wisconsin-Madison (1984). His research has been supported by the National Science Foundation, the Centers for Disease Control and Prevention, the World Health Organization, the National Institute on Aging, the Global Forum for Health Research, and centers, corporations, foundations, and other federal agencies.