New features
Added flatten and levels arguments to as.list.dictionary2() to enable more flexible conversion of dictionary objects. (#1661)
In corpus_sample(), the size argument now works with the by argument, to control how many units are sampled from each group.
Improvements to textstat_dist() and textstat_simil(), see below.
Long tokens are not discarded automatically in the call to tokens(). (#1713)
Behaviour changes
textstat_dist() and textstat_simil() now return sparse symmetric matrix objects using classes from the Matrix package. This replaces the former structure based on the dist class. The computation now also uses the fast implementation in the proxyC package.
When computing similarities, the new min_simil argument ...
Bug fixes and minor feature additions. Changes since v0.9.9-3 Bug fixes Fixed a bug causing dfm and...
Changes Moved data_corpus_irishbudget2010 and data_corpus_dailnoconf1991 to the quanteda.textmodels...
Bug fixes and stability enhancements Changed the default value of the size argument in dfm_sample()...
Changes Added block_size to quanteda_options() to control the number of documents in blocked tokeni...
Changes since v0.9.9-50 New features Corpus construction using corpus() now works for a tm::SimpleC...
quanteda 2.0 introduces some major changes, detailed here. What's new in v2.0 New corpus object str...
Bug fixes and stability enhancements Fixed bug in dfm_compress() and dfm_group() that changed or de...
New Features tokens_segment() has a new window argument, permitting selection within an asymmetric ...
New features Improvements and consolidation of methods for detecting multi-word expressions, now ac...
Bug fixes and stability enhancements Fixed a bug causing incorrect counting in fcm(x, ordered = TRU...
Last 1.x.x release before major changes in v2. New features Added Yule's I to textstat_lexdiv(). Ad...
New Features Added as.dfm() methods for tm DocumentTermMatrix and TermDocumentMatrix objects. (#122...
New Features Added an nsentence() method for spacyr parsed objects. (#1289) Bug fixes and stabili...
New Features Added vertex_labelfont to textplot_network(). Added textmodel_lsa() for Latent Semanti...
Bug fixes and stability enhancements fcm() computes the marginal frequency of upper-case tokens cor...