About
This is a blog on artificial intelligence and "Social Science++", with an emphasis on computation and statistics. My website is brenocon.com.
Author Archives: brendano
1 billion web page dataset from CMU
This is fun: Jamie Callan’s group at CMU LTI just finished a crawl of 1 billion web pages. It’s 5 terabytes compressed, big enough that they have to send it to you by mailing hard drives. Link: ClueWeb09 … Continue reading
Pirates killed by President
A lesson in x-axis scaling, and in choosing which data to compare. Two current graphs making the rounds on the internet: (about this.)
Binary classification evaluation in R via ROCR
A binary classifier makes decisions with confidence levels. Usually it’s imperfect: wherever you put the decision threshold, some items will fall on the wrong side. Those are the errors. I made this diagram a while ago for Turker voting; same principle … Continue reading
Comparison of data analysis packages: R, Matlab, SciPy, Excel, SAS, SPSS, Stata
Lukas and I were trying to write a succinct comparison of the most popular packages that are typically used for data analysis. I think most people choose one based on what people around them use or what they learn in … Continue reading
“Logic Bomb”
Article: Fannie Mae Logic Bomb Would Have Caused Weeklong Shutdown | Threat Level from Wired.com. I love the term “logic bomb”. Can you pair it with a statistics bomb? Data-driven bomb? Or maybe the point is a connectionist bomb.
SF conference for data mining mercenaries
I got an email from a promoter for Predictive Analytics World, a very expensive conference next month in San Francisco for business applications of data mining / machine learning / predictive analytics. I’m not going because I don’t want to … Continue reading
Love it and hate it, R has come of age
Seeing a long, lavish article about R in the NEW YORK TIMES (!) really freaks me out. replicate(100, c("OMG OMG, R is now famous?!", "People used to make fun of me for learning R since Splus is SO OLD!", … Continue reading
Facebook sentiment mining predicts presidential polls
I’m a bit late blogging this, but here’s a messy, exciting — and statistically validated! — new online data source. My friend Roddy at Facebook wrote a post describing their sentiment analysis system, which can evaluate positive or negative sentiment … Continue reading
Information cost and genocide
In 1994, the Rwandan genocide claimed 800,000 lives. This genocide was remarkable for being very low-tech — lots of non-military, average people with machetes killing their neighbors. Romeo Dallaire, the leader of the small UN peacekeeping mission there, saw it … Continue reading