p.13
Data Collection and Processing Architecture

What special character is removed from hashtags before indexing in the distributed storage layer?

# (hash symbol).
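
A minimal sketch of this normalization step, assuming a regular expression is an acceptable way to drop the leading `#` before indexing. The function name `strip_hash_symbols` is hypothetical and is not taken from the source.

```python
import re


def strip_hash_symbols(text: str) -> str:
    """Remove the leading '#' from every hashtag, keeping the tag text."""
    # "#Copa2018" becomes "Copa2018"; a '#' that is not followed by a
    # word character is left untouched.
    return re.sub(r"#(\w+)", r"\1", text)


print(strip_hash_symbols("#Copa2018 and #BRA"))
```

Keeping the tag text while dropping the symbol lets a hashtag and the plain word be indexed under the same term.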

p.4
Visualization of Social Media Data

What is data visualization?

A graphical representation of information and data that helps to see and understand exceptions, trends, and patterns.

p.6
Anonymity in Data Collection

How can anonymity in data collection be achieved?

By using a VPN to prevent easy identification of collecting sources.

p.13
Impact of Social Media on Public Opinion

Which hashtags were the most commented on for both tweets and retweets?

#Copa2018 and #BRA.

p.16
Sentiment Analysis Techniques

What is shown in Figure 10?

Overall ranking of tweets and retweets by the algorithm Pattern Analyzer.

p.5
Data Collection and Processing Architecture

What is the main focus of the OctopusViz environment?

To collect, process, research, analyze, and visualize real-time Twitter topics, contexts, and trends.

p.17
Sentiment Analysis Techniques

What is the total number of positive tweets and retweets?

263,485.

p.12
Impact of Social Media on Public Opinion

What was the outcome of the semifinal match between Belgium and France?

France won 1-0.

p.16
Sentiment Analysis Techniques

On which day were the most positive tweets recorded?

July 7th.

p.11
Data Collection and Processing Architecture

What Boolean logic was used for data collection on Twitter?

The Boolean logic used by the Twitter search function.

p.8
Data Collection and Processing Architecture

What is the first step in processing tweets according to the document?

Detecting the language, translating, and correcting the text to English.

p.6
Data Collection and Processing Architecture

What is the role of the Demilitarized Zone (DMZ) in the proposed architecture?

It connects a host to a protected network of the University of Brasília Research Laboratory.

p.9
Text Mining and Natural Language Processing

Which library is used for tokenization in this work?

The TextBlob library.

p.2
Bot Detection and Analysis

What vulnerability can be exploited by cybercriminals according to Hwang et al.?

It can erode trust in social media.

p.16
Sentiment Analysis Techniques

How many negative tweets were recorded on June 22nd?

1,196.

p.5
Visualization of Social Media Data

What are some highlighted features of OctopusViz compared to related works?

Anonymization, sentiment analysis, real-time operation, distributed storage, and visualization.

p.18
Impact of Social Media on Public Opinion

What was the most referenced hashtag during Brazil's fifth match in the quarterfinals?

#Copa2018.

p.13
Bot Detection and Analysis

What unusual behavior was noted about the user InfosFuteboI?

Posted only 34 tweets but had over 35,000 retweets.

p.13
Data Collection and Processing Architecture

What is the purpose of the data processing sublayer?

To classify tweets and retweets.

p.8
Data Collection and Processing Architecture

Which API is used for dynamic translation of tweets?

Google Translate API.

p.12
Impact of Social Media on Public Opinion

Which team eliminated Brazil in the quarterfinals of the 2018 World Cup?

Belgium.

p.13
Visualization of Social Media Data

What can be created and separated using filters in the tool?

Hashtag clouds, separated by tweets and retweets.

p.10
Sentiment Analysis Techniques

What are the four entries included in the lexical dataset?

Polarity, subjectivity, intensity, and confidence.

p.14
Sentiment Analysis Techniques

Which algorithm is used for classification in sentiment analysis?

Pattern Analyzer algorithm.

p.8
Data Collection and Processing Architecture

What library is used for correcting the translated text?

TextBlob library.

p.5
Sentiment Analysis Techniques

What types of data does the proposed solution aim to analyze?

Tweets, retweets, hashtags, mentions, likes, and user mapping.

p.11
Visualization of Social Media Data

What was the total number of posts collected during the study?

730,850 posts.

p.6
Data Collection and Processing Architecture

What is the purpose of XenCenter in the proposed architecture?

To configure and manage guest systems in XenServer.

p.14
Sentiment Analysis Techniques

What is the purpose of sentiment analysis in the context of tweets and retweets?

To rank users’ public opinion and identify the type of speech that allows specific decisions.

p.17
Sentiment Analysis Techniques

What is the total number of negative tweets and retweets?

146,445.

p.15
Sentiment Analysis Techniques

What event likely caused the highest peak of negative polarity tweets on June 22nd?

Brazil's match against Costa Rica, which they won 2-0.

p.13
Challenges in Sentiment Analysis

What is a common practice on Twitter regarding accounts?

Finding fake accounts and bots spreading misinformation.

p.11
Impact of Social Media on Public Opinion

What was the trend observed after July 10th regarding the Brazilian National Soccer Team on Twitter?

A drop in posts indicating reduced presence on Twitter.

p.14
Impact of Social Media on Public Opinion

How many retweets did user InfosFuteboI have?

35,504 retweets.

p.5
Real-Time Analytics in Social Media

What types of topics can be monitored and analyzed using OctopusViz?

Strikes, elections, companies, marketing, protests, cyber-attacks, military operations, and market research.

p.11
Sentiment Analysis Techniques

What was the classification of posts collected during the study?

168,509 tweets and 562,341 retweets.

p.16
Sentiment Analysis Techniques

What does Figure 9 illustrate?

Ranking of tweets and retweets per day with neutral, positive, and negative polarities.

p.11
Case Study: Brazilian National Soccer Team

What was the main focus of the study regarding the Brazilian National Soccer Team?

To observe metrics, statistics, and sentiment on Twitter during the 2018 FIFA World Cup.

p.5
Anonymity in Data Collection

How does OctopusViz prioritize analyst anonymity?

By using a VPN for data collection.

p.4
Data Collection and Processing Architecture

What is the purpose of the Vox Civitas tool?

To support journalists in extracting news from Twitter by aggregating data.

p.13
Bot Detection and Analysis

Why is user analysis important for analysts?

To identify fake accounts and bots on Twitter.

p.4
Challenges in Sentiment Analysis

How does HoneySELK contribute to cybersecurity?

It performs real-time monitoring of cyber attacks using the ELK stack for distributed storage and georeferencing.

p.3
Challenges in Sentiment Analysis

What challenges are associated with determining sentiment in tweets?

It can be laborious and prone to errors and ambiguity.

p.2
Anonymity in Data Collection

What are the two classifications of anonymity systems?

High-latency and low-latency.

p.16
Sentiment Analysis Techniques

What was the total number of tweets and retweets on July 9th?

20,770.

p.6
Anonymity in Data Collection

What is a fundamental requirement for the proposed environment architecture?

To ensure the source of data collection is anonymous for security and privacy reasons.

p.16
Sentiment Analysis Techniques

What was the total number of retweets on July 9th?

17,946.

p.11
Visualization of Social Media Data

What was the highest number of tweets and retweets recorded on a single day?

42,733 tweets and retweets on July 7th.

p.6
Data Collection and Processing Architecture

What virtualization technology is used in the proposed environment?

The XenServer hypervisor.

p.10
Text Mining and Natural Language Processing

What does the Penn Treebank tag set help determine?

The grammatical class (part-of-speech tag) of each word.

p.7
Data Collection and Processing Architecture

What programming language and libraries are used in the data collection layer?

Python with libraries such as Tweepy, JSON, and TextBlob.

p.10
Data Collection and Processing Architecture

What tool is used for indexing and searching large volumes of data?

Elasticsearch.

p.12
Data Collection and Processing Architecture

What time period was the Twitter data collected for the analysis?

Between 15 June and 31 July 2018.

p.6
Data Collection and Processing Architecture

What are the specifications of the server used in the proposed architecture?

Intel Xeon processor E5-2690 v3 @ 2.6 GHz, 48 cores, 128 GB RAM, and 6 network cards.

p.17
Data Collection and Processing Architecture

How many indexed users were randomly selected for the link analysis?

3000.

p.10
Case Study: Brazilian National Soccer Team

What was the focus of the case study involving the Brazilian National Soccer Team?

To analyze Twitter data while maintaining a neutral view on sensitive topics.

p.4
Sentiment Analysis Techniques

What classification does the sentiment analysis algorithm use?

It classifies opinions as positive, negative, and neutral.

p.9
Text Mining and Natural Language Processing

What is tokenization in text processing?

The process of dividing text into tokens such as words, phrases, or symbols.

p.15
Sentiment Analysis Techniques

What was the highest peak of positive-rated tweets about the Brazilian National Soccer Team, and when did it occur?

21,146 tweets on July 7th, the day after Belgium beat Brazil 2-1.

p.4
Sentiment Analysis Techniques

What does Tweetviz do?

Helps companies extract actionable information from noisy messages on Twitter, identifying sentiment and demographics.

p.3
Sentiment Analysis Techniques

What model did Gomes et al. propose for sentiment analysis?

A model that polarizes news into positive, negative, or neutral and provides procedures for organizations to extract knowledge from textual data.

p.2
Bot Detection and Analysis

What effect do bots have on the perception of social media influence?

They can artificially increase some people’s audiences.

p.13
Sentiment Analysis Techniques

What does the analysis of hashtags depend on?

Analyst interest in tweets or retweets.

p.4
Sentiment Analysis Techniques

What is the significance of using Twitter for sentiment analysis?

Twitter has social networking features that are interesting for mining user opinions and sentiments.

p.12
Sentiment Analysis Techniques

How many total mentions did the hashtag #BRA receive?

3276 total mentions.

p.10
Sentiment Analysis Techniques

What is the purpose of the lexical dataset en-sentiment.xml?

To assign scores (polarity, subjectivity, intensity, and confidence) and determine grammatical class for each word in sentences.
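
The sketch below shows how such a lexicon could be read into a lookup table. The two sample entries, their attribute names (`form`, `pos`, and the four scores), and all numeric values are assumptions made for illustration; they are not copied from en-sentiment.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry sample; attribute names and values are
# illustrative assumptions, not real lexicon data.
SAMPLE = """
<sentiment language="en">
  <word form="excellent" pos="JJ" polarity="1.0" subjectivity="1.0" intensity="1.0" confidence="0.9" />
  <word form="awful" pos="JJ" polarity="-1.0" subjectivity="1.0" intensity="1.0" confidence="0.8" />
</sentiment>
"""

SCORES = ("polarity", "subjectivity", "intensity", "confidence")


def load_lexicon(xml_text: str) -> dict:
    """Map each word form to its grammatical class and four numeric scores."""
    lexicon = {}
    for word in ET.fromstring(xml_text).iter("word"):
        entry = {name: float(word.get(name)) for name in SCORES}
        entry["pos"] = word.get("pos")
        lexicon[word.get("form")] = entry
    return lexicon


lexicon = load_lexicon(SAMPLE)
print(lexicon["excellent"])
```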

p.11
Data Collection and Processing Architecture

What keywords were used to collect data about the Brazilian National Soccer Team?

"seleção brasileira" or "seleção do brasil".

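The two keywords above can be combined into one search string. This is a sketch only: it assumes the quoted-phrase and uppercase `OR` operators of Twitter's standard search syntax, and the helper name `build_or_query` is hypothetical. The exact query string the authors submitted is not given in the source.

```python
def build_or_query(phrases):
    """Quote each phrase for exact matching and join the phrases with OR."""
    return " OR ".join(f'"{phrase}"' for phrase in phrases)


query = build_or_query(["seleção brasileira", "seleção do brasil"])
print(query)
```
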
p.12
Sentiment Analysis Techniques

What was the total number of retweets for the hashtag #Copa2018?

3193 retweets.

p.5
Challenges in Sentiment Analysis

What challenge does the exponential growth of social media present?

The need to manipulate large amounts of data and investigate how organizations can benefit from it.

p.18
Impact of Social Media on Public Opinion

How many users posted tweets and retweets with the hashtag #Copa2018?

1022 users.

p.8
Data Collection and Processing Architecture

What is the purpose of removing stop words during data preprocessing?

To eliminate words that have no value for analysis.

p.1
Impact of Social Media on Public Opinion

Why is sentiment analysis important for government agencies and companies?

It provides insights for business strategies and helps in understanding public opinion.

p.5
Impact of Social Media on Public Opinion

Why is it essential for analysts to monitor social media?

To observe the evolution of themes and generate data that aids in decision-making.

p.18
Impact of Social Media on Public Opinion

Which user sent the most messages with the hashtag #Copa2018?

User torcidasfotos.

p.2
Anonymity in Data Collection

What is an example of a low-latency anonymity system?

Virtual Private Networks (VPNs).

p.8
Data Collection and Processing Architecture

What types of words are typically considered stop words?

Articles, prepositions, punctuation, conjunctions, and pronouns.

p.3
Sentiment Analysis Techniques

What was the accuracy of the naive Bayes algorithm in Kunal et al.'s study?

92.58%.

p.12
Visualization of Social Media Data

What does Figure 4 represent in the analysis?

A cloud of words identifying the most referenced hashtags in tweets and retweets.

p.4
Visualization of Social Media Data

Why are Big Data tools important?

They are essential for analyzing huge amounts of information in real time and making data-driven decisions.

p.16
Sentiment Analysis Techniques

What is the significance of Table 12?

It shows the peak polarities of tweets and retweets per day.

p.10
Sentiment Analysis Techniques

How is the score of each word in the lexical dataset defined?

According to the meaning of the sentence.

p.9
Sentiment Analysis Techniques

What is the main objective of the Classification Sublayer?

To carry out sentiment analysis of tweets to identify behaviors that may measure public opinion.

p.1
Data Collection and Processing Architecture

What is the significance of monitoring social networks according to the article?

It allows for the generation of knowledge about events and changes in the current world.

p.7
Visualization of Social Media Data

What service is used for data presentation in the visualization layer?

Kibana.

p.15
Sentiment Analysis Techniques

What tool was used for sentiment analysis in this study?

Pattern Analyzer.

p.7
Data Collection and Processing Architecture

What is the purpose of the collection layer in the architecture?

To capture tweets in real time according to keywords entered in the application.

p.11
Visualization of Social Media Data

How many users posted about the Brazilian National Soccer Team during the study?

122,975 users.

p.14
Impact of Social Media on Public Opinion

Who posted the most tweets and retweets according to the data?

User InfosFuteboI.

p.3
Sentiment Analysis Techniques

What was the focus of Rodrigues Barbosa et al.'s study?

To explore tweets about the Brazilian presidential elections in 2010 and trace online sentiment.

p.12
Sentiment Analysis Techniques

Which hashtag had the highest total mentions in tweets and retweets?

#Copa2018.

p.15
Sentiment Analysis Techniques

What was the highest peak of positive polarity tweets per hour, and when did it occur?

1128 tweets before the first match on June 16th at 6 p.m.

p.14
Sentiment Analysis Techniques

What is the significance of the classification sublayer in sentiment analysis?

It is where the classification of public opinion is made.

p.17
Data Collection and Processing Architecture

What algorithm was used to analyze user tweets and retweets?

Pattern Analyzer algorithm.

p.1
Impact of Social Media on Public Opinion

What role do social networks play in the dissemination of information?

They serve as a common practice for publishing and spreading ideas and opinions.

p.2
Sentiment Analysis Techniques

What model did Mladenovic et al. present for sentiment analysis?

A model using morphological dictionaries, sentiment lexicon, and irony classification.

p.18
Sentiment Analysis Techniques

What was the percentage of users against the hashtag #Copa2018?

13.88%.

p.2
Challenges in Sentiment Analysis

What is the significance of cleaning or modifying datasets in sentiment analysis?

To reduce noise that undermines classification.

p.1
Sentiment Analysis Techniques

What is the main focus of the proposed architecture in the article?

To monitor and perform anonymous real-time searches in tweets for sentiment analysis.

p.4
Data Collection and Processing Architecture

What is the main goal of the research involving Tweetviz?

To leverage geographic information to provide actionable location-specific information.

p.15
Sentiment Analysis Techniques

What was the total number of tweets and retweets analyzed during the sentiment analysis?

730,850.

p.10
Visualization of Social Media Data

What is the purpose of the Visualization Layer?

To help analysts interpret tweets and retweets for better data interpretation.

p.1
Real-Time Analytics in Social Media

What capabilities does the proposed solution offer?

High capacity to collect, process, search, analyze, and visualize tweets in real-time.

p.10
Visualization of Social Media Data

Which tool provides a rich interface for advanced analytical queries and visualization?

Kibana.

p.14
Data Collection and Processing Architecture

What additional metrics does the environment identify for each tweet and retweet?

Mentions, likes, and hashtags.

p.7
Data Collection and Processing Architecture

What is the function of the distributed storage layer?

To index and fetch tweets received from the capture layer.

p.11
Sentiment Analysis Techniques

Why is sorting messages into tweets and retweets important?

It helps analysts understand the influence on a particular subject.

p.3
Sentiment Analysis Techniques

Which sentiment analysis tool showed better results with the SVM classifier according to Hasan et al.?

TextBlob and Word Sense Disambiguation (WSD) showed better results than SentiWordNet.

p.1
Data Collection and Processing Architecture

What is the purpose of creating an environment for data collection in the proposed architecture?

To transform data into intelligence information for analysis and decision-making.

p.3
Sentiment Analysis Techniques

What accuracy did Praciano et al.'s framework achieve when using SVM for sentiment classification?

Close to 90%.

p.14
Impact of Social Media on Public Opinion

What does the column 'favorite_count' represent in the context of retweets?

The number of likes each retweet received.

p.6
Data Collection and Processing Architecture

What is the configuration of the host used in the proposed architecture?

Dell PowerEdge R730 with Intel Xeon processor, 128 GB RAM, and 6 disks configured with RAID 5.

p.9
Sentiment Analysis Techniques

What does the polarity score range from in sentiment analysis?

From -1.0 to 1.0.

p.5
Real-Time Analytics in Social Media

What is the significance of real-time analysis in social media?

It allows anticipating possible scenarios for advising on the decision-making process.

p.7
Anonymity in Data Collection

How does the architecture ensure anonymity during data collection?

By authenticating to a contracted VPN server.

p.10
Data Collection and Processing Architecture

When did the data collection for the case study take place?

Between 15 June and 31 July 2018.

p.17
Impact of Social Media on Public Opinion

Which hashtag was noted as the most referenced in tweets and retweets?

#Copa2018.

p.9
Sentiment Analysis Techniques

What does a subjectivity score of 0.0 represent?

Very objective sentiment.

p.3
Sentiment Analysis Techniques

What was the accuracy achieved by the SVM algorithm in Tumitan and Becker's study?

81.37%.

p.2
Impact of Social Media on Public Opinion

How can bots affect public policy?

By creating the impression of an opposing grassroots movement.

p.9
Sentiment Analysis Techniques

What are the two implementations of sentiment analysis algorithms in TextBlob?

PatternAnalyzer and NaiveBayesAnalyzer.

p.18
Impact of Social Media on Public Opinion

How many tweets and retweets were made with the hashtag #Copa2018?

4098 tweets and retweets.

p.18
Sentiment Analysis Techniques

What percentage of users were neutral regarding the hashtag #Copa2018?

54.2%.

p.3
Sentiment Analysis Techniques

How did Hasan et al. enhance sentiment analysis?

By adopting a hybrid approach with three sentiment analyzers and two machine learning classifiers.

p.18
Sentiment Analysis Techniques

What were the three classifications of polarity for the hashtag #Copa2018?

Positive, negative, and neutral.

p.2
Challenges in Sentiment Analysis

What challenges are associated with sentiment analysis on social media?

Mixing objective and subjective information generates noise.

p.7
Data Collection and Processing Architecture

What is the role of the processing sublayer?

To transform raw data into information of interest according to filters.

p.8
Data Collection and Processing Architecture

Which library provides the methods for stop words removal?

NLTK library.

p.7
Sentiment Analysis Techniques

What does the classification sublayer perform?

It performs sentiment analysis and identifies users’ public opinion.

p.1
Impact of Social Media on Public Opinion

What types of scenarios can be predicted using data from social networks?

Strikes, protests, marketing, cyber-attacks, elections, military operations, and market research.

p.18
Impact of Social Media on Public Opinion

On what date did the highest peak of tweets and retweets with #Copa2018 occur?

June 27th.

p.2
Anonymity in Data Collection

What do VPNs provide in terms of communication?

Secure communication and data traffic confidentiality.

p.1
Challenges in Sentiment Analysis

What are the benefits of the proposed monitoring solution?

Low cost of implementation and operation while providing real-time sentiment analysis.

p.18
Impact of Social Media on Public Opinion

How many hashtags were related to #Copa2018?

368 hashtags.

p.15
Sentiment Analysis Techniques

What percentage of users were in favor of the Brazilian selection according to the sentiment analysis?

36.05% (263,485 users).

p.17
Data Collection and Processing Architecture

What does link analysis aim to achieve?

Integrate information from various Twitter entities to detect patterns and connections.

p.14
Impact of Social Media on Public Opinion

What is the total number of tweets and retweets posted by user cleytu?

13,858 (2 tweets and 13,856 retweets).
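
Counts like these can be turned into a simple screening signal for user analysis. The sketch below computes the share of a user's posts that are retweets, using the figures reported for InfosFuteboI and cleytu. The class name, the `SUSPICIOUS_SHARE` cut-off, and the idea of flagging on this ratio are assumptions for illustration; the source does not describe such a rule, and a high share alone does not show that an account is a bot.

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    name: str
    tweets: int
    retweets: int

    @property
    def retweet_share(self) -> float:
        """Fraction of the user's posts that are retweets."""
        total = self.tweets + self.retweets
        return self.retweets / total if total else 0.0


# Counts reported for the two users named in the study.
users = [
    UserActivity("InfosFuteboI", tweets=34, retweets=35_504),
    UserActivity("cleytu", tweets=2, retweets=13_856),
]

# Hypothetical cut-off, chosen only for illustration.
SUSPICIOUS_SHARE = 0.99

flagged = [u.name for u in users if u.retweet_share >= SUSPICIOUS_SHARE]
print(flagged)
```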

p.15
Sentiment Analysis Techniques

How did the number of retweets compare to tweets across all polarities?

The number of published retweets was higher across all polarities.

p.3
Sentiment Analysis Techniques

What did Cerón-Guzmán and León-Guzmán's work focus on?

Distinguishing spammer from non-spammer accounts and investigating voting intent inference from Twitter data.

p.8
Data Collection and Processing Architecture

What is the output of the cleaning function for the input 'Brazil is an excellent soccer team :) !!!'?

['Brazil', 'excellent', 'soccer', 'team']
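
A standard-library sketch that reproduces this output. It is not the authors' function: the source uses NLTK for stop-word removal, whereas the small `STOP_WORDS` set and the name `clean_tweet` here are stand-ins chosen for illustration. The letters-only token pattern assumes the text has already been translated to English.

```python
import re

# Small hand-picked stop-word set for illustration; NLTK's English
# list is much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "or", "to"}


def clean_tweet(text: str) -> list:
    """Drop URLs, keep alphabetic tokens, and filter out stop words."""
    text = re.sub(r"https?://\S+", "", text)   # remove URLs
    tokens = re.findall(r"[A-Za-z]+", text)    # drops punctuation and emoticons
    return [t for t in tokens if t.lower() not in STOP_WORDS]


print(clean_tweet("Brazil is an excellent soccer team :) !!!"))
# → ['Brazil', 'excellent', 'soccer', 'team']
```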

p.9
Sentiment Analysis Techniques

What is the output of the sentiment analysis for the tweet 'Brazil is an excellent soccer team :) !!!'?

Sentiment(polarity = 0.98828125, subjectivity = 1.0) with Polarity: Positive.
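
The step from a numeric polarity to a label can be sketched as follows. The cut-off at zero is an assumption that is consistent with the cards stating that polarity ranges from -1.0 to 1.0 and that 0.0 is neutral; the source does not state the exact thresholds, and the name `polarity_label` is hypothetical.

```python
def polarity_label(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a class label."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")
    if polarity > 0.0:
        return "Positive"
    if polarity < 0.0:
        return "Negative"
    return "Neutral"


print(polarity_label(0.98828125))
# → Positive
```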

p.4
Challenges in Sentiment Analysis

How does the proposed text mining process differ from previous works?

It operates in real time and considers words in tweets that may express sentiment, even if not marked with a hashtag.

p.17
Sentiment Analysis Techniques

Which polarity had more users related to it according to the analysis?

Neutral and positive polarities.

p.15
Sentiment Analysis Techniques

What percentage of users appeared to be neutral in their sentiment towards the Brazilian National Soccer Team?

43.91% (320,920 users).
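
As a consistency check, the percentages can be recomputed from the three polarity totals reported in this set (263,485 positive, 146,445 negative, 320,920 neutral). The negative share is derived here from those totals.

```python
# Totals of tweets and retweets per polarity, as reported in the study.
counts = {"positive": 263_485, "negative": 146_445, "neutral": 320_920}

total = sum(counts.values())
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}

print(total)
# → 730850
print(shares)
# → {'positive': 36.05, 'negative': 20.04, 'neutral': 43.91}
```

The three totals add up to the 730,850 posts collected, so the polarity classes cover every post in the dataset.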

p.7
Text Mining and Natural Language Processing

What libraries are used for data processing in the processing sublayer?

TextBlob and NLTK.

p.3
Sentiment Analysis Techniques

What did Praciano et al. propose for analyzing Brazilian presidential elections?

A framework for space-time trend analysis based on Twitter data.

p.9
Sentiment Analysis Techniques

What is the subjectivity score range in sentiment analysis?

From 0.0 to 1.0.

p.17
Data Collection and Processing Architecture

What was the time frame for the data used in the hashtag and user link analysis?

From 15 June to 31 July 2018.

p.12
Sentiment Analysis Techniques

Which hashtags were more included in retweets than in tweets?

#Copa2018, #BRA, #VaiBrasil, #BrasilGanha, #BRAMEX.

p.9
Sentiment Analysis Techniques

What does a polarity score of 0.0 indicate?

Neutral sentiment.

p.7
Sentiment Analysis Techniques

What is the main function of the TextBlob API?

To work with Natural Language Processing (NLP), sentiment analysis, and classification.

p.12
Sentiment Analysis Techniques

What was the total number of tweets for the hashtag #VaiBrasil?

100 tweets.

p.8
Data Collection and Processing Architecture

What additional data is removed from tweets during preprocessing?

URLs.

p.2
Sentiment Analysis Techniques

What was the best precision achieved by Mladenovic et al.'s classifier?

68.6%.

p.7
Data Collection and Processing Architecture

What are the five phases of the proposed architecture development?

Data collection layer, data processing sublayer, classification sublayer, distributed storage layer, and real-time tweet visualization.
