
Collects comments made by users on one or more specified subreddit conversation threads and structures the data into a dataframe with the class names "datasource" and "reddit".

Usage

# S3 method for class 'thread.reddit'
Collect(
  credential,
  endpoint,
  threadUrls,
  sort = NA,
  waitTime = c(6, 8),
  ua = getOption("HTTPUserAgent"),
  ...,
  writeToFile = FALSE,
  verbose = TRUE
)

collect_reddit_threads(
  threadUrls,
  sort = "best",
  waitTime = c(6, 8),
  ua = vsml_ua(),
  writeToFile = FALSE,
  verbose = TRUE,
  ...
)

Arguments

credential

A credential object generated from Authenticate with class name "reddit".

endpoint

API endpoint.

threadUrls

Character vector. Reddit thread urls to collect data from.

sort

Character vector. Reddit comment sort order. Options are "best", "top", "new", "controversial", "old", and "qa". Default is NA for the Collect method and "best" for collect_reddit_threads.

waitTime

Numeric vector. Range in seconds from which a random wait time is chosen between thread url requests. Minimum is 3 seconds. Default is c(6, 8), for a wait of between 6 and 8 seconds.

ua

Character string. Overrides the User-Agent string used in Reddit thread requests. Default for the Collect method is the getOption("HTTPUserAgent") value as set by vosonSML; collect_reddit_threads defaults to vsml_ua().

...

Additional parameters passed to function. Not used in this method.

writeToFile

Logical. Write collected data to file. Default is FALSE.

verbose

Logical. Output additional information about the data collection. Default is TRUE.
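
The sort and waitTime rules described above can be illustrated with a minimal base R sketch. The helpers check_sort and pick_wait are hypothetical names introduced here for illustration and are not part of vosonSML. Raising values below 3 seconds up to the minimum is an assumption about how that limit might be enforced, not the package's documented behaviour.

```r
# Hypothetical helpers for illustration only; not part of vosonSML.
valid_sorts <- c("best", "top", "new", "controversial", "old", "qa")

check_sort <- function(sort) {
  # NA is allowed through, mirroring the documented NA default
  ok <- is.na(sort) | sort %in% valid_sorts
  if (!all(ok)) {
    stop("invalid sort value: ", paste(sort[!ok], collapse = ", "))
  }
  sort
}

pick_wait <- function(waitTime = c(6, 8), min_wait = 3) {
  # Assumption: values under the minimum are raised to it,
  # then one wait is drawn uniformly from the resulting range
  rng <- range(pmax(waitTime, min_wait))
  runif(1, min = rng[1], max = rng[2])
}

set.seed(1)
wait <- pick_wait(c(6, 8))  # a single value between 6 and 8 seconds
```

A caller would typically pass the result to Sys.sleep(wait) between requests.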

Value

A tibble object with class names "datasource" and "reddit".

Note

The reddit web endpoint used for collection has a maximum limit of 500 comments per thread url.
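
Because of this limit, it may be useful to flag threads whose collected comment count reaches 500, since those results could be incomplete. The sketch below uses mock data. The column name thread_id is an assumption made for illustration and should be checked against the columns actually returned.

```r
# Mock data standing in for collected comments.
# Assumption: one row per comment, with a `thread_id` column.
comments <- data.frame(
  thread_id = c(rep("aaa111", 500), rep("bbb222", 42)),
  stringsAsFactors = FALSE
)

# Count comments per thread and flag any that reached the 500 limit
counts <- table(comments$thread_id)
possibly_truncated <- names(counts)[counts >= 500]
```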

Examples

if (FALSE) { # \dontrun{
# reddit thread urls to collect comments from
# (redditAuth is assumed to be a credential object created earlier with Authenticate)
threadUrls <- c("https://www.reddit.com/r/xxxxxx/comments/xxxxxx/x_xxxx_xxxxxxxxx/")

redditData <- redditAuth |>
  Collect(threadUrls = threadUrls, writeToFile = TRUE)
} # }
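
Before collecting, it can help to confirm that each entry in threadUrls looks like a thread url rather than a subreddit url. The following is a rough sketch only: is_thread_url is a hypothetical helper, not a vosonSML function, and the pattern is an assumption based on the url form shown in the example above.

```r
# Hypothetical helper for illustration; not part of vosonSML.
# Assumed thread url form: https://www.reddit.com/r/<subreddit>/comments/<id>/<title>/
is_thread_url <- function(urls) {
  grepl("^https?://(www\\.)?reddit\\.com/r/[^/]+/comments/[^/]+(/.*)?$", urls)
}

urls <- c(
  "https://www.reddit.com/r/xxxxxx/comments/xxxxxx/x_xxxx_xxxxxxxxx/",
  "https://www.reddit.com/r/xxxxxx/"
)

# TRUE for entries that match the assumed thread url pattern
valid <- is_thread_url(urls)
threadUrls <- urls[valid]
```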