Michael Cafarella: "Analyzing Possible Human Trafficking Activity Using Large-Scale Information Extraction"

Date: Monday, March 21, 2016, 11:30am

Location: Maxwell Dworkin 119, 33 Oxford Street, Cambridge

Speaker: Michael Cafarella, University of Michigan


Title: Analyzing Possible Human Trafficking Activity Using Large-Scale Information Extraction

Abstract: Online text advertisements for sex work constitute a large trove of data about potential human trafficking activity. If the information embedded in these advertisements (price, location, service details, and so on) could be analyzed using traditional data tools, it might be possible to build tools and models to identify and attack trafficking activity. We extracted structured data from the natural language text of more than 80 million online ad posts over a four-year period. We used this data to build a novel search tool that law enforcement officers can use to identify potential trafficking victims.

We also used the database to obtain novel insight into illicit sex markets. These findings include quantifying the price premium for services performed at a location of the buyer’s choosing, and determining what portion of that premium is due to travel costs or risk of violent crime. We also show that there is a negative correlation between wage increases for women and advertised prices once locality-specific factors are accounted for. That correlation is driven by mid-priced providers entering the market at lower rates when wages go up.

This talk will include results relevant to both the computational and social science communities.

Bio: Michael Cafarella is the Morris Wellman Faculty Development Assistant Professor of Computer Science and Engineering at the University of Michigan. His research interests include databases, information extraction, data integration, and data mining. He has published extensively in venues such as SIGMOD and VLDB. Mike received his PhD from the University of Washington, Seattle, in 2009 with advisors Oren Etzioni and Dan Suciu. He received the NSF CAREER award in 2011 and is a 2016 Sloan Research Fellow. In addition to his academic work, he co-started (with Doug Cutting) the Hadoop open-source project, which is widely used at Facebook, Yahoo!, and elsewhere.