Title: | Valence Aware Dictionary and sEntiment Reasoner (VADER) |
---|---|
Description: | A lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains. Hutto & Gilbert (2014) <https://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8109/8122>. |
Authors: | Katherine Roehrick [aut, cre] |
Maintainer: | Katherine Roehrick <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.1 |
Built: | 2024-11-13 05:14:27 UTC |
Source: | https://github.com/cran/vader |
Use get_vader() to calculate the valence of a single text document.
get_vader(text, incl_nt = T, neu_set = T, rm_qm = T)
text |
to be analyzed; for get_vader(), the text should be a character string |
incl_nt |
defaults to T, indicates whether you wish to include UNUSUAL n't contractions (e.g., yesn't) in negation analysis |
neu_set |
defaults to T, indicates whether you wish to count neutral words in calculations |
rm_qm |
defaults to T, indicates whether you wish to clean quotation marks from text (setting to F may result in errors) |
A named vector containing the valence score for each word; an overall, compound valence score for the text; the weighted percentage of positive, negative, and neutral words in the text; and the frequency of the word "but".
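As a minimal sketch of reading the result, the snippet below pulls the overall score out of the returned named vector. The element name `compound` and the `as.numeric()` conversion are assumptions about the output layout, not something stated above, so check `names(res)` on your own installation. The call is guarded so the snippet still runs when the package is not installed.

```r
# Example text; any single character string works.
txt <- "I yesn't like it"

if (requireNamespace("vader", quietly = TRUE)) {
  library(vader)
  res <- get_vader(txt)

  # Inspect which elements the named vector actually contains.
  print(names(res))

  # ASSUMPTION: the overall score is stored under the name "compound".
  if ("compound" %in% names(res)) {
    compound <- as.numeric(res[["compound"]])
    print(compound)
  }
} else {
  message("Package 'vader' is not installed; skipping the example.")
}
```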
For the original Python Code, please see:
https://github.com/cjhutto/vaderSentiment
https://github.com/cjhutto/vaderSentiment/blob/master/vaderSentiment/vaderSentiment.py
For the original R Code, please see:
https://github.com/nrguimaraes/sentimentSetsR/blob/master/R/ruleBasedSentimentFunctions.R
Modifications to the above scripts include, but are not limited to:
ALL CAPS fx: updated to account for non-alpha words; e.g., "I'M 100 PERCENT SURE" would previously have been counted as mixed case because it contains a number
IDIOMS fx: added capacity to check for idioms that do not contain any words found in the Vader Lexicon
WORDS+EMOT: strip punctuation while preserving ALL emoticons found in dictionary
Option to turn on/off neutral count
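The ALL CAPS modification can be illustrated with a short base R sketch. The function below is a hypothetical helper written for this illustration, not the package's internal code; it assumes the check means "some, but not all, of the words containing letters are upper case", and it simply ignores tokens without letters.

```r
# Hypothetical helper (not part of the vader package).
# Returns TRUE when the text is "mixed case": at least one word is in
# ALL CAPS, but not every word is. Tokens without any letters (e.g. "100")
# are ignored so they cannot make an all-caps sentence look mixed.
allcap_differential <- function(words) {
  # Keep only tokens that contain at least one alphabetic character.
  alpha_words <- words[grepl("[[:alpha:]]", words)]

  # A token counts as ALL CAPS if upper-casing leaves it unchanged.
  n_upper <- sum(alpha_words == toupper(alpha_words))

  n_upper > 0 && n_upper < length(alpha_words)
}

all_caps_text <- c("I'M", "100", "PERCENT", "SURE")
mixed_text    <- c("I'M", "100", "percent", "SURE")

allcap_differential(all_caps_text)  # FALSE: every alphabetic word is upper case
allcap_differential(mixed_text)     # TRUE: "percent" is lower case
```

Filtering out the non-alphabetic tokens first is what keeps a number from being read as a lower-case word.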
In the examples below, "yesn't" is an internet neologism meaning "no", "maybe yes, maybe no", "didn't", etc.
vader_df
to get vader results for multiple text documents
get_vader("I yesn't like it")
get_vader("I yesn't like it", incl_nt = FALSE)
get_vader("I yesn't like it", neu_set = FALSE)
get_vader("I said \"I'm not happy\"", rm_qm = FALSE)
get_vader("I said \" I'm not happy \" ", rm_qm = FALSE)
Use vader_df() to calculate the valence of multiple texts contained within a vector or column in a dataframe.
vader_df(text, incl_nt = T, neu_set = T, rm_qm = F)
text |
to be analyzed; for vader_df(), the text should be a single vector (e.g. 1 column) |
incl_nt |
defaults to T, indicates whether you wish to include UNUSUAL n't contractions (e.g., yesn't) in negation analysis |
neu_set |
defaults to T, indicates whether you wish to count neutral words in calculations |
rm_qm |
defaults to F in vader_df() (per the usage signature above), indicates whether you wish to clean quotation marks from text (leaving it as F may result in errors) |
A dataframe containing the valence score for each word; an overall, compound valence score for the text; the weighted percentage of positive, negative, and neutral words in the text; and the frequency of the word "but".
In the examples below, "yesn't" is an internet neologism meaning "no", "maybe yes, maybe no", "didn't", etc.
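A minimal sketch of scoring a data frame column with vader_df() follows. The data frame `reviews` and its contents are made up for illustration, and setting `rm_qm = TRUE` is a choice made here because the argument description warns that F may result in errors. The call is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical example data: one text column in a data frame.
reviews <- data.frame(
  id = 1:3,
  text = c(
    "I yesn't like it",
    "This is GREAT!",
    "It was fine, but the ending dragged."
  ),
  stringsAsFactors = FALSE
)

if (requireNamespace("vader", quietly = TRUE)) {
  library(vader)

  # Pass the single text column (a character vector), not the whole data frame.
  scores <- vader_df(reviews$text, rm_qm = TRUE)

  # One row of results per input text.
  print(scores)
} else {
  message("Package 'vader' is not installed; skipping the example.")
}
```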
get_vader
to get vader results for a single text document