I recently joined
UMass Amherst's School of Computer Science
as an assistant professor,
where I am also a Core Faculty affiliate of the
Computational Social Science Initiative.
Within CS, I am also affiliated with
CIIR and MLDS.
What can statistical text analysis tell us about society?
I develop text analysis methods
that can help answer social science questions.
I'm interested in
statistical machine learning and
natural language processing,
especially when informed by or applied to areas like
political science or sociolinguistics.
My work often uses text data from news and social media.
Some areas of interest include:
- Interactive text data visualization (prev work)
- Regional dialects and language change in social media (prev work)
- Social media sentiment compared to polls
- NLP for informal (sometimes called "noisy") language (prev work)
- Syntactic analysis, e.g. part of speech tagging, parsing
- Semantic/discourse analysis, e.g. coreference, events, entity modeling (e.g. movie character archetypes)
- Political event extraction from news (prev work)
- Joint statistical models of social factors and language
- Measuring sentiment, opinions, ideologies, worldviews from text
- Human-in-the-loop learning, e.g. simple syntax annotations, or Mechanical Turk crowdsourcing
- Probabilistic graphical models, latent variable models, Bayesian inference, sampling and optimization methods
See also my (oldish) research statement
or publications below.
If you are interested in getting involved in research, shoot me an email.
Keep in mind there is a rich set of faculty at UMass interested in
similar areas—from computational social science
to natural language processing—including but not limited to
James Kitts (Soc),
Bruce Desmarais (Polisci),
Krista Gile (Stats),
David Jensen (CS),
Hanna Wallach (CS),
Bruce Croft (CS),
James Allan (CS),
Andrew McCallum (CS),
Brian Dillon (Ling),
Kristine Yu (Ling),
Rajesh Bhatt (Ling),
and many more! See also the CSSI website.
My PhD was in the Machine Learning Department
at Carnegie Mellon University's School of Computer Science, where I was a member of the
Noah's ARK research group.
I have also been a
Visiting Fellow at Harvard IQSS,
and an intern on the Facebook Data Science team.
Before grad school,
I worked on crowdsourced annotations at CrowdFlower / Dolores Labs,
as well as
"semantic" search at Powerset.
I was an undergrad and masters student in the Stanford
Symbolic Systems Program (cognitive science, more or less).
Videos from past presentations
Selected recent publications
(See also Google Scholar.)
Other papers on my CV or
In SemEval-2014 (Proceedings of the International (COLING) Workshop on Semantic Evaluations, Dublin, Ireland, August 2014).
arXiv:1310.1975, Oct 2013.
Data Analysis Project report, Machine Learning Department, CMU.
In Linguistic Annotation Workshop, 2013.
In First Monday 17.3, March 2012.
In NIPS Workshop on Computational Social Science and the Wisdom of Crowds, Sierra Nevada, Spain, December 2011.
In ACL-2011 (short paper).
In NIPS-2010 Workshop on Machine Learning and Social Computing.
In EMNLP-2010 (presentation).
- Press coverage:
New York Times,
All Things Considered,
Wall Street Journal,
San Francisco Chronicle,
In ICWSM-2010 (presentation).
In ICWSM-2010 (demo track).
In Beautiful Data, ed. Toby Segaran and Jeff Hammerbacher. O'Reilly Media. 2009.
In EMNLP-2008 (presentation).
- MiTextExplorer: interactive exploration of text data and document covariates.
- TweetNLP: tokenization and part-of-speech tagging for Twitter.
- ARKref, a coreference resolution system.
- ParseViz - quick and dirty parse tree/dependency visualization via graphviz.
- tsvutils for tab-separated data processing
- Other misc utilities (commandline, R, Python...)
Recent and not-so-recent news
- Mar 28: invited speaker, University of Michigan School of Information.
- Mar 26: invited speaker, NYU
Courant Institute and
Center for Data Science.
- Feb 26: invited speaker, Allen Institute for Artificial Intelligence.
- Feb 24-25: invited speaker, Information School, University of Washington. slides, abstract.
- Feb 14: invited speaker, Microsoft Research, New York City.
- Feb 10-11: invited speaker, Computer Science, UMass Amherst.
- Feb 7: invited speaker, Information Science, Cornell University.
- Jan 27: invited speaker, Toyota Technological Institute at Chicago, University of Chicago.
- Jan 15, 2014: invited speaker, Wharton Statistics, University of Pennsylvania. slides, abstract.
- Oct 9, 2013: invited talk at Univ. of Maryland at College Park,
CLIP Colloquium (host: Philip Resnik). [slides]
- Summer 2013: attended SOCS, NAACL, and ACL. See publications list for presentations/posters.
- Apr 9: Thesis proposal has been proposed: "Statistical Text Analysis for Social Science."
- Mar 22: invited speaker, Northeastern (Lazer Lab; host David Lazer)
- Feb 25: invited speaker, Columbia NLP group (abstract)
- November 16, 2012: invited speaker, UChicago Computational Social Science Workshop seminar series (host: Forest Gregg)
- October 4, 2012: invited speaker, UMass Amherst Machine Learning and Friends Lunch and Computational Social Science seminar (host: Hanna Wallach)
- May 2012 - invited panelist at the American Association for Public Opinion Research conference, for the panel "Survey Responses vs. Tweets: New Choices for Social Measurement." Talk: "From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series."
- April 2012 - invited speaker, New Faces in Political Methodology V workshop, Political Science Department, Penn State. Talk: "Corpus Analysis and Unsupervised Frame Learning from Text."
Elsewhere on the Internet
My PGP key
There are many Brendan O'Connors in the world.
If this is the wrong webpage, you may be interested in another Brendan O'Connor;
My awesome sister, Maureen O'Connor, is a writer in New York.