A quick introduction to VOSONDash network and text analysis features
This post introduces VOSONDash network analysis tools, which include network visualisation, network metrics, and text analysis. Users can analyse different networks, including those collected with VOSONDash (Twitter, YouTube and Reddit), or import graphml files collected elsewhere.
Analysing online networks with VOSONDash is the first in a series of posts covering VOSONDash features. Data collection with VOSONDash is covered in separate posts.
VOSONDash is an output of computational social methods research, designed to be a “Swiss Army knife” for studying online networks. The R/Shiny dashboard tool enables online data collection, and network and text analysis (including visualisation) within the same environment. VOSONDash builds on a number of R packages, in particular vosonSML for data collection and network generation, and igraph for network analysis. The package provides a graphical user interface which does not require users to have R programming skills and it is available on CRAN and GitHub. Bryan Gertzel is the lead developer and maintainer of VOSONDash.
The GitHub page provides instructions to install VOSONDash via R or RStudio. Once the package is installed, run VOSONDash from the RStudio console; it will open in a web browser.
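A minimal sketch of the install-and-launch step, assuming the CRAN release is used and that the dashboard is started with the package's `runVOSONDash()` function:

```r
# Install the CRAN release once (uncomment on first use)
# install.packages("VOSONDash")

# Load the package and launch the Shiny dashboard in the default web browser
library(VOSONDash)
runVOSONDash()
```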
To ease replication, in this example we will use the EnviroActivistsWebsite_2006 demo dataset, which is provided in the package. The dataset is a hyperlink network collected with VOSON in 2006 as part of a research piece (Ackland and O’Neil 2011). The network has 161 nodes (websites representing environmental organisations) and 1,444 edges representing hyperlinks between these organisations. In this dataset, text data is stored as a node attribute, and each node is assigned a categorical value according to its type of environmental organisation (Bios, Globals, or Toxics).
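Outside the dashboard, the same kind of file can be inspected directly with igraph. The sketch below does not use the demo dataset; it builds a toy directed network with hypothetical site names and a `type` node attribute, writes it to a temporary graphml file, and reads it back to check the node and edge counts.

```r
library(igraph)

# Toy directed hyperlink network; site names and types are hypothetical
edges <- data.frame(
  from = c("siteA", "siteA", "siteB", "siteC"),
  to   = c("siteB", "siteC", "siteC", "siteA"),
  stringsAsFactors = FALSE
)
nodes <- data.frame(
  name = c("siteA", "siteB", "siteC", "siteD"),
  type = c("Bios", "Bios", "Globals", "Toxics"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE, vertices = nodes)

# Round-trip through a graphml file, the format VOSONDash imports
path <- tempfile(fileext = ".graphml")
write_graph(g, path, format = "graphml")
g2 <- read_graph(path, format = "graphml")

# Basic checks: number of nodes, number of edges, nodes per type
n_nodes <- vcount(g2)
n_edges <- ecount(g2)
node_types <- table(vertex_attr(g2, "type"))
print(c(nodes = n_nodes, edges = n_edges))
print(node_types)
```

Pointing `read_graph()` at a real graphml file instead of the temporary one gives the equivalent summary for that network.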
There are three main approaches to analysing online networks with VOSONDash: Network graph, Network metrics (SNA), and Text analysis. More information on features can be accessed in the VOSONDash Userguide (Borquez et al. 2020).
The Network graph pane provides two options to explore networks: network visualisation via igraph and visNetwork, and tabulations for nodes and edges. The pane also provides options for manipulating the network, such as the isolate filter applied in this example.
Via the Network metrics pane, we can observe basic SNA metrics, including network-level and node-level metrics (e.g. centralisation). Network metrics reflect the filters applied to the visualisation; in this example we removed isolates (3 nodes), so the network size is 158 and the Component distribution shows one connected component. Degree distribution is only available for undirected networks; Indegree distribution and Outdegree distribution charts are available for directed networks, like this example. In this network, 15 nodes receive one hyperlink and three nodes receive 35 hyperlinks. While 19 nodes link out to only one other site, two organisations link out to 50 sites.
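The metrics described above can be approximated directly in igraph. This is a sketch on a toy directed network with hypothetical node names, not the dashboard's own code: it removes isolates, counts weakly connected components, and tabulates the indegree and outdegree distributions.

```r
library(igraph)

# Toy directed network with one isolate ("e"); node names are hypothetical
edges <- data.frame(
  from = c("a", "a", "b", "d"),
  to   = c("b", "c", "c", "c"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE,
                           vertices = data.frame(name = c("a", "b", "c", "d", "e")))

# Remove isolates (nodes with no incoming or outgoing ties)
g_f <- delete_vertices(g, which(degree(g, mode = "all") == 0))

# Network size and number of weakly connected components after filtering
n_nodes <- vcount(g_f)
n_comp <- components(g_f, mode = "weak")$no

# Indegree and outdegree distributions: how many nodes have each degree value
in_dist <- table(degree(g_f, mode = "in"))
out_dist <- table(degree(g_f, mode = "out"))

# A network-level metric: indegree centralisation
in_centralisation <- centr_degree(g_f, mode = "in")$centralization

print(in_dist)
print(out_dist)
```

Each table entry reads the same way as the charts: the name is a degree value and the count is the number of nodes with that degree.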
Assortativity metrics (Homogeneity and Homophily indexes, including mixing matrix and population share) are presented for networks with categorical node attributes. In this example, we have selected the categorical attribute Type. The mixing matrix table presents links across the three types of organisations: Globals, Bios and Toxics. The Bios and Globals sub-movements show a strong tendency towards linking to their own type. Population shares, Homogeneity indexes and Homophily indexes are presented by type. Controlling for group size, Globals are the group most biased towards its own type: 53% of their ties to other Global organisations can be explained by homophily.
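To make these quantities concrete, here is a base R sketch on hypothetical nodes and ties (not the demo data). It assumes the homogeneity index is the proportion of a group's ties that go to its own type, and that the homophily index is the Coleman-style adjustment for group size, (H - w) / (1 - w); the exact formulas used by VOSONDash are not given in this post.

```r
# Hypothetical node types and directed ties (not the demo data)
node_type <- c(n1 = "Bios", n2 = "Bios", n3 = "Globals", n4 = "Globals", n5 = "Toxics")
ties <- data.frame(
  from = c("n1", "n1", "n2", "n3", "n3", "n4", "n5"),
  to   = c("n2", "n3", "n1", "n4", "n5", "n3", "n1"),
  stringsAsFactors = FALSE
)

types <- sort(unique(node_type))

# Mixing matrix: rows are the sender's type, columns the receiver's type
mixing <- table(
  from = factor(node_type[ties$from], levels = types),
  to   = factor(node_type[ties$to], levels = types)
)

# Population share w: proportion of nodes of each type
pop_share <- as.vector(table(factor(node_type, levels = types))) / length(node_type)

# Homogeneity index H: proportion of a group's ties that go to its own type
homogeneity <- diag(mixing) / rowSums(mixing)

# Assumed Coleman-style homophily index: (H - w) / (1 - w)
homophily <- (homogeneity - pop_share) / (1 - pop_share)

print(mixing)
print(round(homophily, 2))
```

A positive homophily value means a group links to its own type more often than its population share alone would suggest; a negative value means less often.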
For a network with text data stored as either a node or edge attribute, it is possible to conduct basic text analysis with VOSONDash. The text corpus can be pre-processed using Filters:
Remove Standard Stopwords, User-Defined Stopwords, Apply word stemming, and Word length, if the number of characters needs to be specified.
Advanced options provide HTML Decode and iconv UTF8, especially useful for social media, as text often contains encoded characters.
Remove Twitter hashtags and Remove Twitter Usernames.
There are three methods available to visualise text:
Word frequency bar charts, where further parameters can be applied to set the number of results displayed and the Minimum frequency a word needs to reach in order to appear.
Word clouds, where users can adjust Minimum frequency (how many times a word needs to have been used in order for it to feature in the visualisation) and Maximum words (to control the number of words appearing in the graph); the percentage of vertical words can be set for legibility, and random colours can be assigned to the visualisation. Comparison clouds are only available for datasets with categorical data, like this example, where colour represents the node attribute type (Bios, Globals or Toxics).
Sentiment analysis, which uses the Syuzhet package and classifies words based on the NRC Emotion Lexicon, a list of English words and their associations with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive).
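The filtering and word frequency steps can be illustrated in base R. This is a simplified sketch, not the dashboard's implementation: the example texts, the stopword list and the threshold names (`min_word_length`, `min_frequency`) are all hypothetical, and stemming is left out.

```r
# Hypothetical text attribute values (not taken from the demo dataset)
texts <- c(
  "Protecting the forests and the rivers",
  "Forests matter: protect rivers, protect forests",
  "Toxic waste in the rivers"
)
stopwords_user <- c("the", "and", "in")  # small hand-written list for illustration
min_word_length <- 4
min_frequency <- 2

# Lower-case, split on anything that is not a letter, and drop empty tokens
tokens <- unlist(strsplit(tolower(texts), "[^a-z]+"))
tokens <- tokens[nchar(tokens) > 0]

# Apply the stopword and word-length filters
tokens <- tokens[!tokens %in% stopwords_user & nchar(tokens) >= min_word_length]

# Count words, keep those meeting the minimum frequency, most frequent first
freq <- sort(table(tokens), decreasing = TRUE)
freq <- freq[freq >= min_frequency]
print(freq)
```

Passing `freq` to `barplot()` gives a rough equivalent of the word frequency chart.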
We hope this guide is useful and easy to follow.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Borquez (2021, Sept. 9). VOSON Lab Code Blog: Analysing online networks with VOSONDash. Retrieved from https://vosonlab.github.io/posts/2021-08-06-analysing-online-networks-with-vosondash/
BibTeX citation
@misc{borquez2021analysing,
  author = {Borquez, Francisca},
  title = {VOSON Lab Code Blog: Analysing online networks with VOSONDash},
  url = {https://vosonlab.github.io/posts/2021-08-06-analysing-online-networks-with-vosondash/},
  year = {2021}
}