CMU ARK Twitter Part-of-Speech Tagger – v0.3 released

Posted on September 21, 2012

We’re pleased to announce a new release of the CMU ARK Twitter Part-of-Speech
Tagger, version 0.3.

The new version is much faster (40x) and more accurate (tagging accuracy
improved from 89.2% to 92.8%) than before.
We have also released new POS-annotated data, including a dataset of one
tweet for each of 547 days.
We have made available large-scale word clusters from unlabeled Twitter data
(217k words, 56m tweets, 847m tokens).

Tools, data, and a new technical report describing the release are available at:
www.ark.cs.cmu.edu/TweetNLP.

0100100 a 1111100101110 111100000011, Brendan
