Alice Oh: "Insights and challenges of applying machine learning approaches to problems in social science"

Date: Wednesday, September 18, 2013, 12:00pm to 1:30pm

Location: Maxwell Dworkin 119

CRCS Lunch Seminar

Speaker: Alice Oh, CRCS and KAIST

Title: Insights and challenges of applying machine learning approaches to problems in social science

Abstract: In this talk, I will share the experiences and challenges of my research group in applying probabilistic topic models (e.g., LDA, HDP) and other machine learning algorithms to problems in social science. Specifically, I will discuss three papers on the topics of emotion cycles, self-disclosure behavior, and agenda-setting theory.

These topics span a wide range of social science research, so I do not claim to go deeply into the problem domains but rather offer rudimentary explorations into the realm of social science.

In the first paper, we developed a computational framework based on LDA for understanding the social aspects of emotions in Twitter conversations. We looked for meaningful patterns of emotional exchanges in a conversation, where those patterns may depend on the topics and words of the conversation. We looked at how conversational partners can influence each other's emotions and topics, and we discovered interesting patterns in the overall emotions of the conversations. We also found that tweets containing sympathy, apology, and complaint are significant emotion influencers. Finally, we discovered lexical patterns, such as the usage of profanity, that influence the overall emotion of a conversation.

In the second paper, we looked at the relationship between tie strength and self-disclosure. In social psychology, it is generally accepted that people disclose more of their personal information to someone in a more intimate and trusting relationship, often called a “strong tie” in the social network literature. We study how tie strength affects self-disclosure in the context of Twitter conversations. Our results show a significant general trend that validates the findings in social psychology; however, there are interesting exceptions to this trend. We analyze and discuss these results in detail and propose directions for further studies of tie strength and self-disclosure among users of Twitter and other SNSs.

In the third paper, we looked at agenda-setting theory, which explains how the media affect their audience. Limitations of traditional media studies of agenda setting include the small set of issues covered, the costly survey data on public interest, and the expertise needed to categorize article frames. In this paper, we took a computational approach to studying agenda setting with a large dataset of online news including articles, user comments, and social sharing counts. With that data, we extracted the major issues with hierarchical Dirichlet processes, a nonparametric probabilistic topic model. Then, we quantified the effects of agenda setting by analyzing the correlations of user comments and social sharing with the amount of news coverage. By using machine learning tools, we showed the potential for detailed and principled analysis of agenda setting from a large set of publicly available data.

Bio: Alice Oh is an Assistant Professor of Computer Science at Korea Advanced Institute of Science and Technology. She leads the Users and Information Lab with the vision of modeling various types of information from multiple perspectives and understanding users in terms of their individual and group behaviors. To that end, she studies and employs methods from machine learning, human-computer interaction, and statistical natural language processing. Alice completed her M.S. in Language and Information Technologies at CMU and her Ph.D. in Computer Science at MIT.