<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How much text versus metadata is in a tweet?</title>
	<atom:link href="https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/feed/" rel="self" type="application/rss+xml" />
	<link>https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/</link>
	<description>cognition, language, social systems; statistics, visualization, computation</description>
	<lastBuildDate>Tue, 25 Nov 2025 13:11:20 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
	<item>
		<title>By: brendano</title>
		<link>https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/#comment-68711</link>
		<dc:creator>brendano</dc:creator>
		<pubDate>Wed, 22 Jun 2011 04:44:11 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=980#comment-68711</guid>
		<description><![CDATA[Thanks for the comment, Bob, I really appreciate it.  Somehow I missed the literature that compares PPM to what are now standard smoothed LMs.  That paper looks great.]]></description>
		<content:encoded><![CDATA[<p>Thanks for the comment, Bob, I really appreciate it.  Somehow I missed the literature that compares PPM to what are now standard smoothed LMs.  That paper looks great.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Carpenter</title>
		<link>https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/#comment-68235</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Thu, 16 Jun 2011 21:01:48 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=980#comment-68235</guid>
		<description><![CDATA[Nice!  Compressibility&#039;s totally the way to measure this.  The only question is what scale you do the compression on.

With implementations like zip, we wind up providing upper bounds on the amount of information.  PPM with longer contexts is even better, but it still doesn&#039;t get close to simple smoothed language models, due to online constraints.  There are some really cool nonparametric hierarchical Bayesian language models that are even better, developed by Frank Wood, Nick Bartlett, David Pfau, Yee Whye Teh, and a few others:

http://www.stat.columbia.edu/~fwood/Papers/Bartlett-DCC-2011.pdf]]></description>
		<content:encoded><![CDATA[<p>Nice!  Compressibility&#8217;s totally the way to measure this.  The only question is what scale you do the compression on.</p>
<p>With implementations like zip, we wind up providing upper bounds on the amount of information.  PPM with longer contexts is even better, but it still doesn&#8217;t get close to simple smoothed language models, due to online constraints.  There are some really cool nonparametric hierarchical Bayesian language models that are even better, developed by Frank Wood, Nick Bartlett, David Pfau, Yee Whye Teh, and a few others:</p>
<p><a href="http://www.stat.columbia.edu/~fwood/Papers/Bartlett-DCC-2011.pdf" rel="nofollow">http://www.stat.columbia.edu/~fwood/Papers/Bartlett-DCC-2011.pdf</a></p>
]]></content:encoded>
	</item>
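	<!--
	A minimal sketch, in Python, of the measurement Bob describes above: an
	off-the-shelf compressor gives an upper bound, in bits, on the information
	content of a piece of text. The function name and sample strings here are
	illustrative placeholders, not anything from the post. Subtracting the
	compressed size of a priming corpus from the compressed size of the corpus
	plus the text lets deflate use the corpus as context, so a short tweet is
	not swamped by compressor header overhead.

	import zlib

	def marginal_compressed_bits(corpus: bytes, text: bytes, level: int = 9) -> float:
	    # Extra bits deflate needs for `text` once `corpus` has been seen.
	    # The zlib header and checksum cancel in the subtraction, so the
	    # result upper-bounds the information content of `text` under
	    # deflate's model of the data.
	    base = len(zlib.compress(corpus, level))
	    primed = len(zlib.compress(corpus + text, level))
	    return 8.0 * (primed - base)

	# Hypothetical data: in practice the corpus would be a large tweet sample.
	corpus = b"some sample tweet text, repeated here as filler " * 500
	tweet = b"Compressibility is totally the way to measure this."

	bits = marginal_compressed_bits(corpus, tweet)
	print(f"{bits:.0f} bits, {bits / len(tweet):.2f} bits/char")

	As the comment notes, this is only an upper bound: a smoothed character
	language model, or the kind of nonparametric Bayesian model in the linked
	paper, would assign the same text fewer bits per character.
	-->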
</channel>
</rss>

