Create a data.table of word frequencies from a tm text corpus.

wordFreqFromCorpus(
  corp,
  rm_sparse = 0.99,
  word_len = c(3, 26),
  word_freq = c(1, Inf)
)

Arguments

corp

A tm text corpus object.

rm_sparse

Numeric. Proportion used as the sparsity threshold when removing sparse terms. Default is 0.99.

word_len

Numeric vector of length 2. Minimum and maximum length, in characters, of words to include. Default is c(3, 26).

word_freq

Numeric vector of length 2. Minimum and maximum frequency of words to include. Default is c(1, Inf).

Value

A data.table of word frequencies.
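
Examples

The following is a minimal sketch, not the function's actual implementation. It builds a small tm corpus and reproduces by hand the kind of filtering the arguments describe, using only tm and data.table calls. The sample documents, the intermediate names (docs, dtm, freq, wf) and the column names word and freq are assumptions made for illustration. The commented call at the end shows how the documented function would be invoked once the package that provides it is attached.

```r
library(tm)
library(data.table)

# Toy documents (illustrative data only)
docs <- c("the quick brown fox",
          "the lazy dog sleeps",
          "a quick dog barks")
corp <- VCorpus(VectorSource(docs))

# Keep words of 3 to 26 characters, mirroring word_len = c(3, 26)
dtm <- DocumentTermMatrix(corp, control = list(wordLengths = c(3, 26)))

# Drop terms above the sparsity threshold, mirroring rm_sparse = 0.99
dtm <- removeSparseTerms(dtm, 0.99)

# Total frequency per term, then keep the range given by word_freq = c(1, Inf)
freq <- colSums(as.matrix(dtm))
freq <- freq[freq >= 1 & freq <= Inf]

# Assemble the result as a data.table, most frequent words first
wf <- data.table(word = names(freq), freq = as.numeric(freq))
setorder(wf, -freq)
print(wf)

# With the package providing wordFreqFromCorpus attached, the call would be:
# wf <- wordFreqFromCorpus(corp, rm_sparse = 0.99,
#                          word_len = c(3, 26), word_freq = c(1, Inf))
```

Whether wordFreqFromCorpus uses removeSparseTerms internally, and what its output columns are called, is not stated on this page. Check the returned object with str() before relying on specific column names.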