<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Rough binomial confidence intervals</title>
	<atom:link href="http://brenocon.com/blog/2011/04/rough-binomial-confidence-intervals/feed/" rel="self" type="application/rss+xml" />
	<link>http://brenocon.com/blog/2011/04/rough-binomial-confidence-intervals/</link>
	<description>cognition, language, social systems; statistics, visualization, computation</description>
	<lastBuildDate>Tue, 25 Nov 2025 13:11:20 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
	<item>
		<title>By: Memorizing small tables &#124; AI and Social Science &#8211; Brendan O&#039;Connor</title>
		<link>http://brenocon.com/blog/2011/04/rough-binomial-confidence-intervals/#comment-96025</link>
		<dc:creator>Memorizing small tables &#124; AI and Social Science &#8211; Brendan O&#039;Connor</dc:creator>
		<pubDate>Fri, 11 Nov 2011 18:13:55 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=910#comment-96025</guid>
		<description><![CDATA[[...] is the Clopper-Pearson binomial confidence interval. Actually, the more useful ones to memorize are Wald binomial intervals, which are easy because they&#8217;re close to +/- 1/sqrt(n). Good party trick. This sticky is [...]]]></description>
		<content:encoded><![CDATA[<p>[...] is the Clopper-Pearson binomial confidence interval. Actually, the more useful ones to memorize are Wald binomial intervals, which are easy because they&#8217;re close to +/- 1/sqrt(n). Good party trick. This sticky is [...]</p>
]]></content:encoded>
	</item>
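	<!--
	A minimal editorial sketch, in Python, of the rule of thumb the pingback above
	describes: the 95% Wald half-width z*sqrt(p*(1-p)/n) is close to 1/sqrt(n) when
	p is near 0.5. The function name and sample values are illustrative assumptions,
	not anything from the post itself.

	import math

	def wald_half_width(p, n, z=1.96):
	    """Half-width of the 95% Wald interval: z * sqrt(p*(1-p)/n)."""
	    return z * math.sqrt(p * (1 - p) / n)

	n = 100
	for p in (0.5, 0.7, 0.9):
	    print(p, round(wald_half_width(p, n), 3), round(1 / math.sqrt(n), 3))
	# p=0.5 gives 0.098 vs 0.100, but p=0.9 gives 0.059 vs 0.100, so the
	# 1/sqrt(n) shorthand overstates the width as p moves away from 0.5.
	-->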
	<item>
		<title>By: brendano</title>
		<link>http://brenocon.com/blog/2011/04/rough-binomial-confidence-intervals/#comment-61987</link>
		<dc:creator>brendano</dc:creator>
		<pubDate>Sat, 23 Apr 2011 18:15:42 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=910#comment-61987</guid>
		<description><![CDATA[Yeah it&#039;s tricky. I still think it&#039;s useful to build some intuition about how much confidence an effect size deserves relative to the sample size.]]></description>
		<content:encoded><![CDATA[<p>Yeah it&#8217;s tricky. I still think it&#8217;s useful to build some intuition about how much confidence an effect size deserves relative to the sample size.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Carpenter</title>
		<link>http://brenocon.com/blog/2011/04/rough-binomial-confidence-intervals/#comment-60987</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Thu, 14 Apr 2011 23:02:47 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=910#comment-60987</guid>
		<description><![CDATA[The 1/sqrt(n) approximation of two standard deviations deteriorates pretty quickly as p moves away from 0.5. It&#039;s off by nearly a factor of 2 for p = 0.9, as indicated in your table. [Also, you need a +/- on the RHS.]

These binomial confidence intervals are defined for i.i.d. samples. This is a particularly lousy assumption for any natural language evaluation involving whole documents, because there&#039;s a huge amount of correlation between words in a document. Given correlation among the items, the confidence intervals need to be fatter, because the effective independent sample size is lower. This reduces significance, which may explain why no one ever talks about it.

There&#039;s also sampling variance, because the p here is usually estimated only from the corpus, not the entire (super-)population. This also reduces significance.

As we say in CS, &quot;garbage in, garbage out&quot;.]]></description>
		<content:encoded><![CDATA[<p>The 1/sqrt(n) approximation of two standard deviations deteriorates pretty quickly as p moves away from 0.5. It&#8217;s off by nearly a factor of 2 for p = 0.9, as indicated in your table. [Also, you need a +/- on the RHS.]</p>
<p>These binomial confidence intervals are defined for i.i.d. samples. This is a particularly lousy assumption for any natural language evaluation involving whole documents, because there&#8217;s a huge amount of correlation between words in a document. Given correlation among the items, the confidence intervals need to be fatter, because the effective independent sample size is lower. This reduces significance, which may explain why no one ever talks about it.</p>
<p>There&#8217;s also sampling variance, because the p here is usually estimated only from the corpus, not the entire (super-)population. This also reduces significance.</p>
<p>As we say in CS, &#8220;garbage in, garbage out&#8221;.</p>
]]></content:encoded>
	</item>
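	<!--
	A minimal editorial sketch, in Python, of one way to act on Bob Carpenter's point
	above about correlated items: shrink n to an effective sample size n/deff, where
	deff is a design effect reflecting within-document correlation, and widen the Wald
	interval accordingly. The deff value below is a made-up assumption for illustration;
	this is not the commenter's own method.

	import math

	def wald_ci(p_hat, n, z=1.96, deff=1.0):
	    """Wald CI computed on the effective sample size n/deff."""
	    n_eff = n / deff
	    half = z * math.sqrt(p_hat * (1 - p_hat) / n_eff)
	    return (p_hat - half, p_hat + half)

	print(wald_ci(0.8, 900))            # i.i.d. assumption: (0.774, 0.826)
	print(wald_ci(0.8, 900, deff=5.0))  # adjusted: (0.742, 0.858), sqrt(5) times wider
	-->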
</channel>
</rss>
