CMU ARK Twitter Part-of-Speech Tagger – v0.3 released

We’re pleased to announce a new release of the CMU ARK Twitter Part-of-Speech
Tagger, version 0.3.

  • The new version is much faster (40x) and more accurate (89.2 -> 92.8) than
    the previous version.

  • We have also released new POS-annotated data, including a dataset of one
    tweet for each of 547 days.

  • We have made available large-scale word clusters from unlabeled Twitter data
    (217k words, 56m tweets, 847m tokens).

Tools, data, and a new technical report describing the release are available at:

0100100 a 1111100101110 111100000011, Brendan
