p.4
Visualization of Social Media Data
What is data visualization?
A graphical representation of information and data that helps to see and understand exceptions, trends, and patterns.
p.6
Anonymity in Data Collection
How can anonymity in data collection be achieved?
By using a VPN to prevent easy identification of collecting sources.
p.16
Sentiment Analysis Techniques
What is shown in Figure 10?
Overall ranking of tweets and retweets by the Pattern Analyzer algorithm.
p.5
Data Collection and Processing Architecture
What is the main focus of the OctopusViz environment?
To collect, process, research, analyze, and visualize real-time Twitter topics, contexts, and trends.
p.11
Data Collection and Processing Architecture
What Boolean logic was used for data collection on Twitter?
The Boolean logic used by the Twitter search function.
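The case study's two key phrases ("seleção brasileira" or "seleção do brasil") are combined with a logical OR. A minimal local sketch of that matching follows, assuming case-insensitive substring matching. The name `matches_query` is hypothetical, and the real Twitter search applies its own matching rules on the server side.

```python
# Key phrases reported for the case study; a post matches if it contains either one.
KEYWORDS = ("seleção brasileira", "seleção do brasil")


def matches_query(text: str, keywords=KEYWORDS) -> bool:
    """Return True if the text contains at least one key phrase (logical OR)."""
    lowered = text.casefold()  # case-insensitive comparison, including accented letters
    return any(keyword in lowered for keyword in keywords)
```

For example, `matches_query("A Seleção Brasileira joga hoje")` is true, while a post that mentions neither phrase is rejected.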
p.8
Data Collection and Processing Architecture
What is the first step in processing tweets according to the document?
Detecting the language, translating, and correcting the text to English.
p.6
Data Collection and Processing Architecture
What is the role of the Demilitarized Zone (DMZ) in the proposed architecture?
It connects a host to a protected network of the University of Brasília Research Laboratory.
p.2
Bot Detection and Analysis
What vulnerability can be exploited by cybercriminals according to Hwang et al.?
It can erode trust in social media.
p.5
Visualization of Social Media Data
What are some highlighted features of OctopusViz compared to related works?
Anonymization, sentiment analysis, real-time operation, distributed storage, and visualization.
p.13
Bot Detection and Analysis
What unusual behavior was noted about the user InfosFuteboI?
Posted only 34 tweets but had over 35,000 retweets.
p.13
Data Collection and Processing Architecture
What is the purpose of the data processing sublayer?
To classify tweets and retweets.
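One way such a classification could look is sketched below. It assumes each post is a dict decoded from Twitter API v1.1 JSON, where retweets embed the original post under `retweeted_status`. The `RT @` text fallback and the name `classify_post` are illustrative assumptions, not the implementation described in the source.

```python
def classify_post(post: dict) -> str:
    """Return "retweet" or "tweet" for a decoded post object."""
    # Assumption: v1.1 payloads carry the original post under "retweeted_status".
    if "retweeted_status" in post:
        return "retweet"
    # Fallback heuristic for payloads that only carry the text.
    if post.get("text", "").startswith("RT @"):
        return "retweet"
    return "tweet"


# Made-up sample posts, not data from the study.
posts = [
    {"text": "Vai Brasil!"},
    {"text": "RT @fan: Vai Brasil!", "retweeted_status": {"text": "Vai Brasil!"}},
]
labels = [classify_post(p) for p in posts]
```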
p.13
Visualization of Social Media Data
What can be created and separated using filters in the tool?
Hashtag clouds, separated into tweets and retweets.
p.10
Sentiment Analysis Techniques
What are the four entries included in the lexical dataset?
Polarity, subjectivity, intensity, and confidence.
p.14
Sentiment Analysis Techniques
Which algorithm is used for classification in sentiment analysis?
Pattern Analyzer algorithm.
p.5
Sentiment Analysis Techniques
What types of data does the proposed solution aim to analyze?
Tweets, retweets, hashtags, mentions, likes, and user mapping.
p.6
Data Collection and Processing Architecture
What is the purpose of XenCenter in the proposed architecture?
To configure and manage guest systems in XenServer.
p.14
Sentiment Analysis Techniques
What is the purpose of sentiment analysis in the context of tweets and retweets?
To rank users’ public opinion and identify the type of speech, which supports specific decisions.
p.15
Sentiment Analysis Techniques
What event likely caused the highest peak of negative polarity tweets on June 22nd?
Brazil's match against Costa Rica, which they won 2-0.
p.13
Challenges in Sentiment Analysis
What is a common practice on Twitter regarding accounts?
It is common to find fake accounts and bots spreading misinformation.
p.11
Impact of Social Media on Public Opinion
What was the trend observed after July 10th regarding the Brazilian National Soccer Team on Twitter?
A drop in posts indicating reduced presence on Twitter.
p.5
Real-Time Analytics in Social Media
What types of topics can be monitored and analyzed using OctopusViz?
Strikes, elections, companies, marketing, protests, cyber-attacks, military operations, and market research.
p.11
Sentiment Analysis Techniques
What was the classification of posts collected during the study?
168,509 tweets and 562,341 retweets.
p.16
Sentiment Analysis Techniques
What does Figure 9 illustrate?
Ranking of tweets and retweets per day with neutral, positive, and negative polarities.
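A per-day polarity ranking of this kind can be built by counting posts per (day, polarity) pair. The sketch below uses only the standard library. The function names and the sample records are hypothetical and are not taken from the study.

```python
from collections import Counter


def count_by_day_and_polarity(records):
    """Count posts per (day, polarity) pair.

    `records` is an iterable of (iso_date, polarity_label) tuples.
    """
    return Counter(records)


def peak_day(counts, polarity):
    """Return the (day, count) pair with the most posts of the given polarity."""
    per_day = {day: n for (day, label), n in counts.items() if label == polarity}
    return max(per_day.items(), key=lambda item: item[1])


# Made-up sample records with placeholder dates.
records = [
    ("2000-01-01", "negative"),
    ("2000-01-01", "negative"),
    ("2000-01-01", "positive"),
    ("2000-01-02", "positive"),
    ("2000-01-02", "positive"),
    ("2000-01-02", "neutral"),
]
counts = count_by_day_and_polarity(records)
```

Calling `peak_day(counts, "positive")` then returns the day with the highest positive count, which is the kind of peak the per-day figures report.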
p.11
Case Study: Brazilian National Soccer Team
What was the main focus of the study regarding the Brazilian National Soccer Team?
To observe metrics, statistics, and sentiment on Twitter during the 2018 FIFA World Cup.
p.5
Anonymity in Data Collection
How does OctopusViz prioritize analyst anonymity?
By using a VPN for data collection.
p.4
Data Collection and Processing Architecture
What is the purpose of the Vox Civitas tool?
To support journalists in extracting news from Twitter by aggregating data.
p.13
Bot Detection and Analysis
Why is user analysis important for analysts?
To identify fake accounts and bots on Twitter.
p.4
Challenges in Sentiment Analysis
How does HoneySELK contribute to cybersecurity?
It performs real-time monitoring of cyber attacks using the ELK stack for distributed storage and georeferencing.
p.3
Challenges in Sentiment Analysis
What challenges are associated with determining sentiment in tweets?
It can be laborious and prone to errors and ambiguity.
p.2
Anonymity in Data Collection
What are the two classifications of anonymity systems?
High-latency and low-latency.
p.6
Anonymity in Data Collection
What is a fundamental requirement for the proposed environment architecture?
To ensure the source of data collection is anonymous for security and privacy reasons.
p.11
Visualization of Social Media Data
What was the highest number of tweets and retweets recorded on a single day?
42,733 tweets and retweets on July 7th.
p.10
Text Mining and Natural Language Processing
What does the Penn Treebank tag set help determine?
The grammatical class (part-of-speech tag) of each word.
p.7
Data Collection and Processing Architecture
What programming language and libraries are used in the data collection layer?
Python with libraries such as Tweepy, JSON, and TextBlob.
p.12
Data Collection and Processing Architecture
What time period was the Twitter data collected for the analysis?
Between 15 June and 31 July 2018.
p.6
Data Collection and Processing Architecture
What are the specifications of the server used in the proposed architecture?
Intel Xeon processor E5-2690 v3 @ 2.6 GHz, 48 cores, 128 GB RAM, and 6 network cards.
p.10
Case Study: Brazilian National Soccer Team
What was the focus of the case study involving the Brazilian National Soccer Team?
To analyze Twitter data while maintaining a neutral view on sensitive topics.
p.4
Sentiment Analysis Techniques
What classification does the sentiment analysis algorithm use?
It classifies opinions as positive, negative, and neutral.
p.9
Text Mining and Natural Language Processing
What is tokenization in text processing?
The identification of tokens: dividing a text into words, phrases, or symbols.
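A minimal regex-based sketch of tokenization follows. The token pattern is an assumption made for illustration: it keeps hashtags and mentions whole and treats each remaining symbol as its own token. The tokenizer actually used in the source may behave differently.

```python
import re

# Assumed pattern: hashtags or mentions, then words (optionally with an
# apostrophe part), then any single non-space symbol.
TOKEN_PATTERN = re.compile(r"[#@]\w+|\w+(?:'\w+)?|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split text into word, hashtag, mention, and symbol tokens."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("Brazil won 2-0! #Copa2018")
# tokens == ['Brazil', 'won', '2', '-', '0', '!', '#Copa2018']
```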
p.15
Sentiment Analysis Techniques
What was the highest peak of positive-rated tweets about the Brazilian National Soccer Team, and when did it occur?
21,146 tweets on July 7th, the day after Belgium beat Brazil 2-1.
p.4
Sentiment Analysis Techniques
What does Tweetviz do?
Helps companies extract actionable information from noisy messages on Twitter, identifying sentiment and demographics.
p.3
Sentiment Analysis Techniques
What model did Gomes et al. propose for sentiment analysis?
A model that polarizes news into positive, negative, or neutral and provides procedures for organizations to extract knowledge from textual data.
p.2
Bot Detection and Analysis
What effect do bots have on the perception of social media influence?
They can artificially increase some people’s audiences.
p.13
Sentiment Analysis Techniques
What does the analysis of hashtags depend on?
Analyst interest in tweets or retweets.
p.4
Sentiment Analysis Techniques
What is the significance of using Twitter for sentiment analysis?
Twitter has social networking features that are interesting for mining user opinions and sentiments.
p.10
Sentiment Analysis Techniques
What is the purpose of the lexical dataset en-sentiment.xml?
To assign scores (polarity, subjectivity, intensity, and confidence) and determine grammatical class for each word in sentences.
p.11
Data Collection and Processing Architecture
What keywords were used to collect data about the Brazilian National Soccer Team?
"seleção brasileira" or "seleção do brasil".
p.5
Challenges in Sentiment Analysis
What challenge does the exponential growth of social media present?
The need to manipulate large amounts of data and investigate how organizations can benefit from it.
p.8
Data Collection and Processing Architecture
What is the purpose of removing stop words during data preprocessing?
To eliminate words that have no value for analysis.
p.1
Impact of Social Media on Public Opinion
Why is sentiment analysis important for government agencies and companies?
It provides insights for business strategies and helps in understanding public opinion.
p.5
Impact of Social Media on Public Opinion
Why is it essential for analysts to monitor social media?
To observe the evolution of themes and generate data that aids in decision-making.
p.2
Anonymity in Data Collection
What is an example of a low-latency anonymity system?
Virtual Private Networks (VPNs).
p.8
Data Collection and Processing Architecture
What types of words are typically considered stop words?
Articles, prepositions, punctuation, conjunctions, and pronouns.
p.12
Visualization of Social Media Data
What does Figure 4 represent in the analysis?
A cloud of words identifying the most referenced hashtags in tweets and retweets.
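A word cloud of hashtags is driven by frequency counts. The sketch below shows one way to compute them with the standard library, assuming hashtags are matched by the pattern `#\w+` and compared case-insensitively. The name `hashtag_counts` and the sample texts are hypothetical.

```python
import re
from collections import Counter

HASHTAG_PATTERN = re.compile(r"#\w+")


def hashtag_counts(texts):
    """Count hashtag occurrences across texts, ignoring case."""
    counts = Counter()
    for text in texts:
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(text))
    return counts


# Made-up sample texts, not data from the study.
sample = ["Vai! #Copa2018 #BRA", "Jogo hoje #copa2018", "Sem hashtag"]
top = hashtag_counts(sample).most_common(2)
# top == [('#copa2018', 2), ('#bra', 1)]
```

Running the function separately over tweets and over retweets would give the two frequency tables needed to compare the clouds.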
p.4
Visualization of Social Media Data
Why are Big Data tools important?
They are essential for analyzing huge amounts of information in real time and making data-driven decisions.
p.16
Sentiment Analysis Techniques
What is the significance of Table 12?
It shows the peak polarities of tweets and retweets per day.
p.10
Sentiment Analysis Techniques
How is the score of each word in the lexical dataset defined?
According to the meaning of the sentence.
p.9
Sentiment Analysis Techniques
What is the main objective of the Classification Sublayer?
To carry out sentiment analysis of tweets to identify behaviors that may measure public opinion.
p.1
Data Collection and Processing Architecture
What is the significance of monitoring social networks according to the article?
It allows for the generation of knowledge about events and changes in the current world.
p.7
Data Collection and Processing Architecture
What is the purpose of the collection layer in the architecture?
To capture tweets in real time according to keywords entered in the application.
p.3
Sentiment Analysis Techniques
What was the focus of Rodrigues Barbosa et al.'s study?
To explore tweets about the Brazilian presidential elections in 2010 and trace online sentiment.
p.15
Sentiment Analysis Techniques
What was the highest peak of positive polarity tweets per hour, and when did it occur?
1,128 tweets before the first match on June 16th at 6 p.m.
p.14
Sentiment Analysis Techniques
What is the significance of the classification sublayer in sentiment analysis?
It is where the classification of public opinion is made.
p.17
Data Collection and Processing Architecture
What algorithm was used to analyze user tweets and retweets?
Pattern Analyzer algorithm.
p.1
Impact of Social Media on Public Opinion
What role do social networks play in the dissemination of information?
Publishing and spreading ideas and opinions on social networks has become common practice.
p.2
Sentiment Analysis Techniques
What model did Mladenovic et al. present for sentiment analysis?
A model using morphological dictionaries, sentiment lexicon, and irony classification.
p.2
Challenges in Sentiment Analysis
What is the significance of cleaning or modifying datasets in sentiment analysis?
To reduce noise that undermines classification.
p.1
Sentiment Analysis Techniques
What is the main focus of the proposed architecture in the article?
To monitor and perform anonymous real-time searches in tweets for sentiment analysis.
p.4
Data Collection and Processing Architecture
What is the main goal of the research involving Tweetviz?
To leverage geographic information to provide actionable location-specific information.
p.10
Visualization of Social Media Data
What is the purpose of the Visualization Layer?
To help analysts interpret the collected tweets and retweets.
p.1
Real-Time Analytics in Social Media
What capabilities does the proposed solution offer?
High capacity to collect, process, search, analyze, and visualize tweets in real-time.
p.14
Data Collection and Processing Architecture
What additional metrics does the environment identify for each tweet and retweet?
Mentions, likes, and hashtags.
p.7
Data Collection and Processing Architecture
What is the function of the distributed storage layer?
To index and fetch tweets received from the capture layer.
p.11
Sentiment Analysis Techniques
Why is sorting messages into tweets and retweets important?
It helps analysts understand the influence on a particular subject.
p.3
Sentiment Analysis Techniques
Which sentiment analysis tool showed better results with the SVM classifier according to Hasan et al.?
TextBlob and Word Sense Disambiguation (WSD) showed better results than SentiWordNet.
p.1
Data Collection and Processing Architecture
What is the purpose of creating an environment for data collection in the proposed architecture?
To transform data into intelligence information for analysis and decision-making.
p.14
Impact of Social Media on Public Opinion
What does the column 'favorite_count' represent in the context of retweets?
The number of likes each retweet received.
p.6
Data Collection and Processing Architecture
What is the configuration of the host used in the proposed architecture?
Dell PowerEdge R730 with Intel Xeon processor, 128 GB RAM, and 6 disks configured with RAID 5.
p.5
Real-Time Analytics in Social Media
What is the significance of real-time analysis in social media?
It allows anticipating possible scenarios for advising on the decision-making process.
p.7
Anonymity in Data Collection
How does the architecture ensure anonymity during data collection?
By authenticating to a contracted VPN server.
p.10
Data Collection and Processing Architecture
When did the data collection for the case study take place?
Between 15 June and 31 July 2018.
p.9
Sentiment Analysis Techniques
What does a subjectivity score of 0.0 represent?
Very objective sentiment.
p.2
Impact of Social Media on Public Opinion
How can bots affect public policy?
By creating the impression of an opposing grassroots movement.
p.9
Sentiment Analysis Techniques
What are the two implementations of sentiment analysis algorithms in TextBlob?
PatternAnalyzer and NaiveBayesAnalyzer.
p.18
Impact of Social Media on Public Opinion
How many tweets and retweets were made with the hashtag #Copa2018?
4,098 tweets and retweets.
p.3
Sentiment Analysis Techniques
How did Hasan et al. enhance sentiment analysis?
By adopting a hybrid approach with three sentiment analyzers and two machine learning classifiers.
p.18
Sentiment Analysis Techniques
What were the three classifications of polarity for the hashtag #Copa2018?
Positive, negative, and neutral.
p.2
Challenges in Sentiment Analysis
What challenges are associated with sentiment analysis on social media?
Mixing objective and subjective information generates noise.
p.7
Data Collection and Processing Architecture
What is the role of the processing sublayer?
To transform raw data into information of interest according to filters.
p.7
Sentiment Analysis Techniques
What does the classification sublayer perform?
It performs sentiment analysis and identifies users’ public opinion.
p.1
Impact of Social Media on Public Opinion
What types of scenarios can be predicted using data from social networks?
Strikes, protests, marketing, cyber-attacks, elections, military operations, and market research.
p.2
Anonymity in Data Collection
What do VPNs provide in terms of communication?
Secure communication and data traffic confidentiality.
p.1
Challenges in Sentiment Analysis
What are the benefits of the proposed monitoring solution?
Low cost of implementation and operation while providing real-time sentiment analysis.
p.17
Data Collection and Processing Architecture
What does link analysis aim to achieve?
Integrate information from various Twitter entities to detect patterns and connections.
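One simple form of link analysis is to count who mentions whom. The sketch below builds directed (author, mentioned user) links from post text. It assumes mentions follow the `@name` form, and the function name, input shape, and sample posts are hypothetical rather than taken from the source.

```python
import re
from collections import Counter

MENTION_PATTERN = re.compile(r"@(\w+)")


def mention_edges(posts):
    """Count directed (author, mentioned_user) links across posts.

    `posts` is an iterable of (author, text) tuples.
    """
    edges = Counter()
    for author, text in posts:
        for mentioned in MENTION_PATTERN.findall(text):
            edges[(author, mentioned)] += 1
    return edges


# Made-up sample posts, not data from the study.
posts = [
    ("alice", "Great match @bob @carol"),
    ("alice", "RT @bob: what a goal"),
    ("dave", "no mentions here"),
]
edges = mention_edges(posts)
```

The heaviest edges point to the accounts that are most often referenced, which is the kind of connection link analysis tries to surface.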
p.14
Impact of Social Media on Public Opinion
What is the total number of tweets and retweets posted by user cleytu?
13,858 (2 tweets and 13,856 retweets).
p.15
Sentiment Analysis Techniques
How did the number of retweets compare to tweets across all polarities?
The number of published retweets was higher across all polarities.
p.3
Sentiment Analysis Techniques
What did Cerón-Guzmán and León-Guzmán's work focus on?
Distinguishing spammer from non-spammer accounts and investigating voting intent inference from Twitter data.
p.8
Data Collection and Processing Architecture
What is the output of the cleaning function for the input 'Brazil is an excellent soccer team :) !!!'?
['Brazil', 'excellent', 'soccer', 'team']
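A minimal sketch of a cleaning step that reproduces this output follows. It assumes a small hand-written stop-word set and drops punctuation, digits, and emoticons. The name `clean` and the stop-word list are illustrative assumptions, not the source's cleaning function.

```python
import re

# Assumed, deliberately small stop-word set for illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "or", "to"}


def clean(text: str) -> list[str]:
    """Keep alphabetic tokens that are not stop words."""
    # [^\W\d_]+ matches runs of letters only, so ":)" and "!!!" are dropped.
    tokens = re.findall(r"[^\W\d_]+", text)
    return [token for token in tokens if token.lower() not in STOP_WORDS]


cleaned = clean("Brazil is an excellent soccer team :) !!!")
# cleaned == ['Brazil', 'excellent', 'soccer', 'team']
```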
p.9
Sentiment Analysis Techniques
What is the output of the sentiment analysis for the tweet 'Brazil is an excellent soccer team :) !!!'?
Sentiment(polarity=0.98828125, subjectivity=1.0), classified as Positive.
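The step from a polarity score to a label can be sketched as below. The cut-offs are an assumption: scores above zero are treated as positive, scores below zero as negative, and exactly zero as neutral. The source does not state its thresholds, and `polarity_label` is a hypothetical name.

```python
def polarity_label(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a label (assumed cut-offs at zero)."""
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"


label = polarity_label(0.98828125)
# label == 'Positive'
```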
p.4
Challenges in Sentiment Analysis
How does the proposed text mining process differ from previous works?
It operates in real time and considers words in tweets that may express sentiment, even if not marked with a hashtag.
p.17
Sentiment Analysis Techniques
Which polarity had more users related to it according to the analysis?
Neutral and positive polarities.
p.3
Sentiment Analysis Techniques
What did Praciano et al. propose for analyzing Brazilian presidential elections?
A framework for space-time trend analysis based on Twitter data.
p.17
Data Collection and Processing Architecture
What was the time frame for the data used in the hashtag and user link analysis?
From 15 June to 31 July 2018.
p.12
Sentiment Analysis Techniques
Which hashtags were more included in retweets than in tweets?
#Copa2018, #BRA, #VaiBrasil, #BrasilGanha, #BRAMEX.
p.7
Sentiment Analysis Techniques
What is the main function of the TextBlob API?
To work with Natural Language Processing (NLP), sentiment analysis, and classification.
p.7
Data Collection and Processing Architecture
What are the five phases of the proposed architecture development?
Data collection layer, data processing sublayer, classification sublayer, distributed storage layer, and real-time tweets’ visualization.