About
This is a blog on artificial intelligence and "Social Science++", with an emphasis on computation and statistics. My website is brenocon.com.
Author Archives: brendano
Java and IDEs for the R/Python world
(Some tips on how to use Java if you’re from R or Python; some thoughts on software platforms and programming for data-science-or-whatever-we-call-it-now.) Most of my research these days uses Python, R, or Java. It’s terrific that so many people are … Continue reading
Replot: departure delays vs flight time speed-up
Here’s a re-plotting of a graph in this 538 post. It looks at whether pilots speed up the flight when there’s a delay, and finds that this appears to be the case. This is averaged data for flights on several … Continue reading
What the ACL-2014 review scores mean
I’ve had several people ask me what the numbers in ACL reviews mean — and I can’t find anywhere online where they’re described. (Can anyone point this out if it is somewhere?) So here’s the review form, below. They all … Continue reading
Scatterplot of KN/PYP language model results
I should make a blog where all I do is scatterplot results tables from papers. I do this once in a while to make them easier to understand… I think the following results are from Yee Whye Teh’s paper … Continue reading
tanh is a rescaled logistic sigmoid function
This confused me for a while when I first learned it, so in case it helps anyone else: The logistic sigmoid function, a.k.a. the inverse logit function, is \[ g(x) = \frac{ e^x }{1 + e^x} \] Its outputs range … Continue reading
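As a quick numerical check of the claim in the title: with g defined as above, the standard identity is tanh(x) = 2 g(2x) - 1. The sketch below is only an illustration; the function names `logistic` and `tanh_via_logistic` are made up for this example and are not from the post.

```python
import math


def logistic(x):
    """Logistic sigmoid g(x) = e^x / (1 + e^x), computed without overflow."""
    if x >= 0:
        # Equivalent form 1 / (1 + e^-x), safe for large positive x.
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # Safe for large negative x.
    return z / (1.0 + z)


def tanh_via_logistic(x):
    """tanh written as a shifted and rescaled logistic: 2 * g(2x) - 1."""
    return 2.0 * logistic(2.0 * x) - 1.0


# Compare against the library tanh at a few points.
for x in [-3.0, -0.5, 0.0, 0.5, 3.0]:
    assert math.isclose(math.tanh(x), tanh_via_logistic(x), abs_tol=1e-12)
```

The factor of 2 inside the argument and the "times 2, minus 1" outside are what stretch the logistic's (0, 1) output range to tanh's (-1, 1).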
Response on our movie personas paper
Update (2013-09-17): See David Bamman’s great guest post on Language Log on our latent personas paper, and the big picture of interdisciplinary collaboration. I’ve been informed that an interesting critique of my, David Bamman’s and Noah Smith’s ACL paper on … Continue reading
Probabilistic interpretation of the B3 coreference resolution metric
Here is an intuitive justification for the B3 evaluation metric often used in coreference resolution, based on whether mention pairs are coreferent. If a mention from the document is chosen at random, B3-Recall is the (expected) proportion of its actual … Continue reading
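For readers who want something concrete, here is a minimal sketch of B3 recall under its usual textbook definition: average, over mentions, of the fraction of each mention's gold cluster that the system also puts in that mention's predicted cluster. This is an assumption about the standard metric, not a reconstruction of the post's own derivation; the function name `b3_recall` and the toy clusters are hypothetical.

```python
def b3_recall(gold_clusters, predicted_clusters):
    """B3 recall, standard definition (assumed; not taken from the post).

    Both arguments are lists of sets of mention ids. A mention missing
    from the predicted clustering is treated as a singleton.
    """
    gold_of = {m: frozenset(c) for c in gold_clusters for m in c}
    pred_of = {m: frozenset(c) for c in predicted_clusters for m in c}

    total = 0.0
    for mention, gold in gold_of.items():
        pred = pred_of.get(mention, frozenset([mention]))
        # Share of this mention's true cluster recovered by the system.
        total += len(gold & pred) / len(gold)
    # Uniform average over mentions, i.e. a mention chosen at random.
    return total / len(gold_of)


gold = [{"a", "b", "c"}, {"d"}]
pred = [{"a", "b"}, {"c", "d"}]
score = b3_recall(gold, pred)  # per-mention terms: 2/3, 2/3, 1/3, 1
```

The uniform average over mentions is what makes the "pick a mention at random" reading possible: each term in the sum is the per-mention proportion, and the mean is its expectation.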
Some analysis of tweet shares and “predicting” election outcomes
Everyone recently seems to be talking about this newish paper by Digrazia, McKelvey, Bollen, and Rojas (pdf here) that examines the correlation of Congressional candidate name mentions on Twitter against whether the candidate won the race. One of the coauthors also … Continue reading
Confusion matrix diagrams
I wrote a little note and diagrams on confusion matrix metrics: Precision, Recall, F, Sensitivity, Specificity, ROC, AUC, PR Curves, etc.: brenocon.com/confusion_matrix_diagrams.pdf (also, graffle source).
Movie summary corpus and learning character personas
Here is one of our exciting just-finished ACL papers. David and I designed an algorithm that learns different types of character personas — “Protagonist”, “Love Interest”, etc — that are used in movies. To do this we collected a brand new dataset: 42,306 … Continue reading