Text and Social Context: Analysis and Prediction
Sala d'actes de la FIB (FIB Auditorium) - Campus Nord
26 July 2012
11:00 - Presentation
The rise of the social web presents new opportunities and challenges for computational and statistical analysis of text data. We can now explore corpora of messages written by huge segments of the population and observe many dimensions of the social context of these messages. In this talk, I'll present some of the analyses we've done to explore how message content varies with geographic location in the US and to shed light on message deletion in China. I'll then turn to the use of text for forecasting social outcomes in scientific and political domains. I'll show how simple models can be used to predict how scientific communities will respond to a newly published article and which congressional bills will survive committee.
This is joint work with David Bamman, Chris Dyer, Jacob Eisenstein, Michael Heilman, Brendan O'Connor, Bryan Routledge, John Wilkerson, Eric Xing, Tae Yano, and Dani Yogatama.
Noah Smith is the Finmeccanica Associate Professor of Language Technologies and Machine Learning in the School of Computer Science at Carnegie Mellon University. He received his Ph.D. in Computer Science, as a Hertz Foundation Fellow, from Johns Hopkins University in 2006 and his B.S. in Computer Science and B.A. in Linguistics from the University of Maryland in 2001. His research interests include statistical natural language processing, especially unsupervised methods, machine learning for structured data, and applications of natural language processing. His book, Linguistic Structure Prediction, covers many of these topics. He serves on the editorial boards of the journals Computational Linguistics and the Journal of Artificial Intelligence Research, and he received a best paper award at the ACL 2009 conference. His research group, Noah's ARK, is supported by the NSF (including an NSF CAREER award), DARPA, Qatar NRF, IARPA, ARO, Portugal FCT, and gifts from Google, HP Labs, IBM Research, and Yahoo Research.