This function creates a text corpus from vertex (node) or edge text attribute data in an igraph graph object.

corpusFromGraph(
  g = NULL,
  txt_attr = NULL,
  type = "vertex",
  iconv = FALSE,
  html_decode = TRUE,
  rm_url = TRUE,
  rm_num = TRUE,
  rm_punct = TRUE,
  rm_twit_hashtags = FALSE,
  rm_twit_users = FALSE,
  sw_kind = "SMART",
  rm_words = NULL,
  stem = FALSE
)

Arguments

g

An igraph graph object. Default is NULL.

txt_attr

Character string. Name of the graph attribute that holds the text. Default is NULL.

type

Character string. Graph attribute type, either "vertex" or "edge". Default is "vertex".

iconv

Logical. Use the iconv function to attempt UTF-8 conversion of the text. Default is FALSE.

html_decode

Logical. Decode HTML entities in the text. Default is TRUE.

rm_url

Logical. Remove URLs. Default is TRUE.

rm_num

Logical. Remove numbers. Default is TRUE.

rm_punct

Logical. Remove punctuation. Default is TRUE.

rm_twit_hashtags

Logical. Remove Twitter hashtags. Default is FALSE.

rm_twit_users

Logical. Remove Twitter user names. Default is FALSE.

sw_kind

Character string. Stopword dictionary. Refer to the kind parameter of the stopwords function. Default is "SMART".

rm_words

Character vector. User-defined stopwords. Default is NULL.

stem

Logical. Apply word stemming. Default is FALSE.

Value

A tm text corpus object.
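
Examples

The call below is a minimal sketch of how the arguments above might be combined, not an example taken from the package itself. It assumes the igraph and VOSONDash packages are installed, builds a toy three-vertex graph, and stores some text in a vertex attribute. The attribute name "post_text" and the sample strings are hypothetical and chosen only for illustration.

```r
library(igraph)
library(VOSONDash)

# Small toy graph with three vertices.
g <- make_ring(3)

# "post_text" is a hypothetical attribute name used only for this sketch.
V(g)$post_text <- c(
  "Check out https://example.com for 10 great tips!",
  "Graphs &amp; networks are fun #rstats",
  "@someone the analysis is running again"
)

# Build the corpus from the vertex attribute, also removing
# Twitter hashtags and user names; other arguments keep their defaults.
corp <- corpusFromGraph(
  g,
  txt_attr = "post_text",
  type = "vertex",
  rm_twit_hashtags = TRUE,
  rm_twit_users = TRUE
)

# Print a summary of the resulting corpus.
print(corp)
```

For text stored on edges, the same call would presumably use type = "edge" with the name of an edge attribute. The returned object can then be passed to tm functions such as inspect or DocumentTermMatrix.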