Added documentation for n-grams, skip n-grams, and regex. Added codecov and appveyor. Added tidiers for LDA objects from topicmodels and a vignette on topic modeling. Added a function to calculate tf-idf of a tidy text dataset and a tf-idf vignette. Fixed a bug when tidying by line/sentence/paragraph/regex and there are multiple non-text columns. Fixed a bug when unnesting using n-grams and skip n-grams (entire text was not being collapsed). Added the ability to pass a (custom tokenizing) function to token. Also added a collapse argument that makes the choice of whether to combine lines before tokenizing explicit. Changed tidy.dictionary to return a tbl_df rather than a data.frame. Updated cast_sparse to work with dplyr 0.5.0. Deprecated the pair_count fun...
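A minimal sketch of the n-gram tokenization, custom token function, and tf-idf pieces described above; the tiny data frame and its column names (doc, text) are illustrative assumptions, not taken from the package documentation:

```r
library(dplyr)
library(tidytext)

df <- tibble(doc  = c("a", "b"),
             text = c("tidy text mining is fun",
                      "text mining with tidy tools"))

# Built-in n-gram tokenizer: extra arguments (here n = 2) are passed through
df %>% unnest_tokens(bigram, text, token = "ngrams", n = 2)

# A custom tokenizing function can be passed to `token`
df %>% unnest_tokens(word, text, token = stringr::str_split, pattern = " ")

# tf-idf on a tidy text dataset: count terms per document, then bind_tf_idf()
df %>%
  unnest_tokens(word, text) %>%
  count(doc, word) %>%
  bind_tf_idf(word, doc, n)
```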
Improvements to documentation (#117). Fix for NSE thanks to @lepennec (#122). Tidier for estimated re...
Bug fixes and stability enhancements. Fixed bug in dfm_compress() and dfm_group() that changed or de...
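For context, a small sketch of the two quanteda verbs named in that entry, assuming the quanteda 3.x interface where a dfm is built from tokens(); the toy texts and group labels are made up:

```r
library(quanteda)

dfmat <- dfm(tokens(c(d1 = "a b b c", d2 = "a a c", d3 = "b c c")))

# dfm_group(): sum counts over documents that share a group label
dfm_group(dfmat, groups = c("g1", "g1", "g2"))

# dfm_compress(): collapse identical document/feature names into single rows/columns
dfm_compress(dfmat)
```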
Much of the data available today is unstructured and text-heavy, making it challenging for analysts ...
unnest_tokens can now unnest a data frame with a list column (which formerly threw the error unnest_...
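A sketch of what the list-column fix allows, assuming the list column is a non-text metadata column sitting alongside an ordinary character text column (the names here are invented):

```r
library(dplyr)
library(tidytext)

df <- tibble(doc  = 1:2,
             meta = list(list(author = "x"), list(author = "y")),  # list column
             text = c("first line of text", "second line of text"))

# Formerly errored because of the list column; now each list element is
# carried along (repeated) for every token in the unnested result
df %>% unnest_tokens(word, text)
```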
Fix bug in augment() function for stm topic model. Warn when tf-idf is negative, thanks to @EmilHvit...
get_sentiments now works regardless of whether tidytext has been loaded or not (#50). unnest_tokens ...
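A short example of the namespaced call working without attaching the package first; the "bing" lexicon ships with tidytext, and the tiny word list below is only an illustration:

```r
library(dplyr)

# No library(tidytext) call is needed for this to work
bing <- tidytext::get_sentiments("bing")

tibble(word = c("good", "terrible", "table")) %>%
  inner_join(bing, by = "word")
```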
Fix tidier for quanteda dictionary for correct class (#71). Add a pkgdown site. Convert NSE from und...
Wrapper tokenization functions for n-grams, characters, sentences, tweets, and more, thanks to @Coli...
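A sketch of the wrapper functions, which (as the entry describes them) are thin shims around the corresponding unnest_tokens(token = ...) calls; the data frame is invented, and some wrappers (for example the tweet one) may not be available in every release:

```r
library(dplyr)
library(tidytext)

df <- tibble(text = "tidy text mining is fun")

# Roughly equivalent to unnest_tokens(df, bigram, text, token = "ngrams", n = 2)
df %>% unnest_ngrams(bigram, text, n = 2)

# Roughly equivalent to unnest_tokens(df, char, text, token = "characters")
df %>% unnest_characters(char, text)
```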
Updates to documentation (#102), README, and vignettes. Add tokenizing by character shingles thanks ...
The tidyr package is part of the tidyverse. As its name indicates, it is meant to help you create ti...
reorder_within() now handles multiple variables, thanks to @tmastny (#170). Move stopwords to Suggest...
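A minimal sketch of the reorder_within()/scale_y_reordered() pattern this entry refers to (see also the scale_x/y_reordered() note below); the toy counts are invented, and in newer releases the within argument can also take several grouping variables at once:

```r
library(dplyr)
library(ggplot2)
library(tidytext)

dat <- tibble(item  = c("a", "b", "c", "a", "b", "c"),
              group = rep(c("g1", "g2"), each = 3),
              n     = c(3, 1, 2, 2, 5, 1))

dat %>%
  mutate(item = reorder_within(item, n, group)) %>%  # order items within each group
  ggplot(aes(n, item)) +
  geom_col() +
  scale_y_reordered() +                              # strip the internal "___group" suffix
  facet_wrap(~ group, scales = "free_y")
```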
v.0.2.8: bug fix in tidy_genomic_data while using data.table::melt.data.table instead of tidyr::ga...
Use vdiffr conditionally. Bug fix/breaking change for collapse argument to unnest_functions(). This a...
scale_x/y_reordered() now uses a function, labels, as its main input (#200). Fixed how to_lower is pass...
Updates to documentation (#109) thanks to Emil Hvitfeldt. Add new tokenizers for tweets, Penn Treeba...