<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How much text versus metadata is in a tweet?</title>
	<atom:link href="https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/feed/" rel="self" type="application/rss+xml" />
	<link>https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/</link>
	<description>cognition, language, social systems; statistics, visualization, computation</description>
	<lastBuildDate>Tue, 25 Nov 2025 13:11:20 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
	<item>
		<title>By: brendano</title>
		<link>https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/#comment-68711</link>
		<dc:creator>brendano</dc:creator>
		<pubDate>Wed, 22 Jun 2011 04:44:11 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=980#comment-68711</guid>
		<description><![CDATA[Thanks for the comment, Bob, I really appreciate it.  Somehow I missed the literature that compares PPM to what are now standard smoothed LMs.  That paper looks great.]]></description>
		<content:encoded><![CDATA[<p>Thanks for the comment, Bob, I really appreciate it.  Somehow I missed the literature that compares PPM to what are now standard smoothed LMs.  That paper looks great.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Carpenter</title>
		<link>https://brenocon.com/blog/2011/06/how-much-text-versus-metadata-is-in-a-tweet/#comment-68235</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Thu, 16 Jun 2011 21:01:48 +0000</pubDate>
		<guid isPermaLink="false">http://brenocon.com/blog/?p=980#comment-68235</guid>
		<description><![CDATA[Nice!  Compressibility&#039;s totally the way to measure this.  The only question is what scale you do the compression on.

With implementations like zip, we wind up providing upper bounds on the amount of information.  PPM with longer contexts is even better, but it still doesn&#039;t get close to simple smoothed language models, due to online constraints.  There are some really cool nonparametric hierarchical Bayesian language models that are even better, developed by Frank Wood, Nick Bartlett, David Pfau, Yee Whye Teh, and a few others:

http://www.stat.columbia.edu/~fwood/Papers/Bartlett-DCC-2011.pdf]]></description>
		<content:encoded><![CDATA[<p>Nice!  Compressibility&#8217;s totally the way to measure this.  The only question is what scale you do the compression on.</p>
<p>With implementations like zip, we wind up providing upper bounds on the amount of information.  PPM with longer contexts is even better, but it still doesn&#8217;t get close to simple smoothed language models, due to online constraints.  There are some really cool nonparametric hierarchical Bayesian language models that are even better, developed by Frank Wood, Nick Bartlett, David Pfau, Yee Whye Teh, and a few others:</p>
<p><a href="http://www.stat.columbia.edu/~fwood/Papers/Bartlett-DCC-2011.pdf" rel="nofollow">http://www.stat.columbia.edu/~fwood/Papers/Bartlett-DCC-2011.pdf</a></p>
]]></content:encoded>
	</item>
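	<!--
	A minimal sketch, in Python, of the measurement Bob describes above: an
	off-the-shelf compressor gives an upper bound, in bits, on the information
	content of a piece of text. The function name and sample strings here are
	illustrative placeholders, not anything from the post. Subtracting the
	compressed size of a priming corpus from the compressed size of the corpus
	plus the text lets deflate use the corpus as context, so a short tweet is
	not swamped by compressor header overhead.

	import zlib

	def marginal_compressed_bits(corpus: bytes, text: bytes, level: int = 9) -> float:
	    # Extra bits deflate needs for `text` once `corpus` has been seen.
	    # The zlib header and checksum cancel in the subtraction, so the
	    # result upper-bounds the information content of `text` under
	    # deflate's model of the data.
	    base = len(zlib.compress(corpus, level))
	    primed = len(zlib.compress(corpus + text, level))
	    return 8.0 * (primed - base)

	# Hypothetical data: in practice the corpus would be a large tweet sample.
	corpus = b"some sample tweet text, repeated here as filler " * 500
	tweet = b"Compressibility is totally the way to measure this."

	bits = marginal_compressed_bits(corpus, tweet)
	print(f"{bits:.0f} bits, {bits / len(tweet):.2f} bits/char")

	As the comment notes, this is only an upper bound: a smoothed character
	language model, or the kind of nonparametric Bayesian model in the linked
	paper, would assign the same text fewer bits per character.
	-->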
</channel>
</rss>

