Twitter Sentiment Analysis in Python
Summary
TLDR: In this tutorial, viewers learn to create a Twitter sentiment analysis tool using Python. The video guides through setting up a Twitter app, obtaining API keys, and using libraries like Tweepy and TextBlob. It demonstrates connecting to the Twitter API, gathering tweets on a chosen topic, cleaning the data, and performing sentiment analysis. The script also discusses potential biases and inaccuracies in sentiment analysis and suggests ways to improve the analysis by adjusting polarity thresholds.
Takeaways
- 🔑 **Twitter API Access**: To build a Twitter sentiment analysis tool, you need access to the Twitter API through a Twitter Developer App.
- 📱 **App Verification**: A verified Twitter account with a confirmed phone number and email is required to create a Twitter Developer App.
- 🔗 **API Keys**: Four keys are necessary for the Twitter API: API key, API secret key, Access token, and Access token secret.
- 📚 **Libraries Needed**: The tutorial uses the libraries TextBlob for sentiment analysis and Tweepy for interacting with the Twitter API.
- 💾 **Key Management**: It's advisable to store API keys in a text file for security rather than hardcoding them into the script.
- 🔍 **Search Tweets**: Use the Twitter API's search method to retrieve tweets based on a specified topic and language (English in this case).
- 🗑️ **Data Cleaning**: Clean the tweets by removing unnecessary parts like 'RT', mentions, and possibly hyperlinks to improve sentiment analysis accuracy.
- 📊 **Sentiment Analysis**: TextBlob is used to calculate the polarity of each tweet, which indicates the sentiment (positive, negative, or neutral).
- 📈 **Threshold Adjustment**: Set a higher polarity threshold to account for a positive bias often observed in sentiment analysis results.
- 📝 **Result Interpretation**: Analyze the overall sentiment by adding up the polarity scores and interpreting the results as positive, negative, or neutral based on the polarity values.
- 🔍 **Topic Selection**: Choose appropriate topics for sentiment analysis and consider the impact of context on the accuracy of sentiment analysis results.
Q & A
What is the main focus of the video tutorial?
-The main focus of the video tutorial is to build a Twitter sentiment analysis tool using the Twitter API and natural language processing tools.
What is required to set up a Twitter app for the project?
-To set up a Twitter app, one needs to navigate to developer.twitter.com or apps.twitter.com, create a new project, and have a verified Twitter account with a specified phone number and email.
What are the keys and tokens needed to use the Twitter API?
-The keys and tokens needed to use the Twitter API are an API key, an API secret key, an access token, and an access token secret.
Which additional libraries are required for the Twitter sentiment analysis project?
-The additional libraries required for the project are TextBlob for sentiment analysis and Tweepy for interacting with the Twitter API.
How does one install the required libraries mentioned in the tutorial?
-The required libraries can be installed using pip commands in the command line by activating the conda environment (if one is used) and then typing 'pip install textblob' and 'pip install tweepy'.
How are the API keys and tokens kept secure in the script?
-In the script, the API keys and tokens are read from a text file to keep them secure, rather than being hard-coded directly into the Python code.
What is the purpose of using a cursor object in the script?
-A cursor object is used in the script to search for tweets based on a specific term and to limit the number of tweets returned for analysis.
Why is the language parameter specified when searching for tweets?
-The language parameter is specified to filter the results and only get tweets in English, as the sentiment analysis tool used, TextBlob, only works on English text.
How is the sentiment of each tweet determined using TextBlob?
-The sentiment of each tweet is determined using TextBlob by creating a TextBlob object with the cleaned tweet text and then finding the polarity of the text.
What is the issue with considering tweets with a polarity of zero as neutral?
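-The speaker doubts that as many tweets as reported (97 of 200 in the 'death' example) really have a polarity of exactly zero, and switching to an explicit floating-point comparison does not make them go away. More broadly, TextBlob does not capture context well, so tweets a reader perceives as negative may still be scored as neutral or slightly positive.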
How can one increase the accuracy of the sentiment analysis results?
-One can increase the accuracy of the sentiment analysis results by setting a higher threshold for what is considered positive, cleaning the tweets more thoroughly to remove noise like mentions and RTs, and using a more sophisticated natural language processing tool.
Outlines
🔑 Setting Up Twitter Developer App
The paragraph introduces the process of setting up a Twitter Developer App to access the Twitter API for a sentiment analysis project. It explains the necessity of having a verified Twitter account with a phone number and email to create a developer app. The video creator walks through the steps of navigating to developer.twitter.com, creating a new project, and specifying the purpose and details of the app. It also discusses the importance of obtaining the API key, secret API key, access token, and secret access token, which are essential for the tutorial's script.
📚 Installing Required Libraries
This section covers the installation of necessary Python libraries for the Twitter sentiment analysis tool. The video instructs viewers to install 'textblob' for sentiment analysis and 'tweepy' for accessing the Twitter API. It provides instructions on how to install these libraries using pip in a command line interface, assuming the user has a conda environment set up. The paragraph also mentions the potential need for additional libraries if visualization is included in the project.
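After installing, it can help to confirm that both packages are visible from the active environment. The following is a small sketch using only the standard library; the helper name `missing_packages` is made up for illustration and does not appear in the video.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "textblob" and "tweepy" are the import names of the two libraries.
    missing = missing_packages(["textblob", "tweepy"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("textblob and tweepy are available")
```

If either name is reported as missing, the usual cause is that pip installed into a different environment than the one running the script.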
🔗 Authenticating and Connecting to Twitter API
The paragraph explains how to authenticate and connect to the Twitter API using the keys obtained from the developer portal. It details creating an authentication handler with the consumer key and secret, then setting the access token and access token secret to connect to the API. The process involves defining an API object that utilizes the authentication handler. The video also discusses how to define a search term and the number of tweets to analyze for sentiment, emphasizing the importance of specifying the language to filter out irrelevant results.
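The key-loading and authentication steps described above can be sketched roughly as follows. This is an illustration under assumptions, not the video's exact script: the file name `twitter_keys.txt` and the helper names `load_keys` and `build_api` are hypothetical, and the file is assumed to hold the four values one per line in the order API key, API key secret, access token, access token secret. `tweepy.OAuthHandler`, `set_access_token`, and `tweepy.API` are the Tweepy calls named in the transcript.

```python
from pathlib import Path


def load_keys(path):
    """Read the four credentials from a text file, one per line, in a fixed order."""
    lines = [line.strip() for line in Path(path).read_text().splitlines() if line.strip()]
    if len(lines) < 4:
        raise ValueError(
            "expected four lines: API key, API key secret, "
            "access token, access token secret"
        )
    return tuple(lines[:4])


def build_api(path="twitter_keys.txt"):
    """Create an authenticated tweepy.API object from the key file."""
    # Imported here so that load_keys can be used without Tweepy installed.
    import tweepy

    api_key, api_key_secret, access_token, access_token_secret = load_keys(path)
    auth = tweepy.OAuthHandler(api_key, api_key_secret)  # consumer key and secret
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)
```

Keeping the keys in a separate file mainly keeps them out of the source code, for example when recording a screen or sharing the script; the file itself still needs to be kept private.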
🔍 Searching and Cleaning Tweets
Here, the video script describes the process of searching for tweets using the Twitter API and cleaning the data to improve the accuracy of sentiment analysis. It involves using a cursor object to search for tweets based on a specified term and language. The paragraph also covers removing unnecessary elements like retweet markers ('RT') and leading @-mentions from the tweets to focus on the content that is relevant for analysis. The cleaned tweets are then printed to the screen for further processing.
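A minimal sketch of the search and clean-up steps might look like this. The function names are hypothetical. The video calls `api.search`, which is the method name in Tweepy 3.x; Tweepy 4.x renamed it to `search_tweets`, so the sketch looks for either. The cleaning function is a slightly more defensive variant of what is shown in the video: it strips a leading `RT` marker and any leading `@username` tokens by splitting on spaces rather than searching for a colon, which avoids the "substring not found" error the speaker runs into.

```python
def fetch_tweet_texts(api, search_term, limit=200, lang="en"):
    """Return the text of up to `limit` tweets matching `search_term`."""
    # Imported here so that clean_tweet can be used without Tweepy installed.
    import tweepy

    # Tweepy 4.x names the method search_tweets; Tweepy 3.x names it search.
    search = getattr(api, "search_tweets", None) or api.search
    cursor = tweepy.Cursor(search, q=search_term, lang=lang)
    return [status.text for status in cursor.items(limit)]


def clean_tweet(text):
    """Strip a leading 'RT' marker and leading @mentions from a tweet."""
    cleaned = text.strip()
    if cleaned.startswith("RT @"):
        cleaned = cleaned[3:]  # drop "RT " and keep the mention for the loop below
    while cleaned.startswith("@"):
        # Drop the first token ("@user" or "@user:") and any following spaces.
        _, _, cleaned = cleaned.partition(" ")
        cleaned = cleaned.lstrip()
    return cleaned
```

For example, `clean_tweet("RT @alice: stocks are up")` leaves only the tweet body. Mentions and links in the middle of a tweet are left untouched, as in the video.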
📊 Analyzing Sentiment and Counting Tweets
The final paragraph discusses performing sentiment analysis on the cleaned tweets using the TextBlob library. It explains creating a polarity score for each tweet and accumulating these scores to determine the overall sentiment. The video creator also suggests setting a higher threshold for considering a tweet as positive due to a perceived positive bias in the analysis tool. Additionally, it introduces the idea of counting the number of positive, negative, and neutral tweets and adjusting the criteria for classification based on the desired accuracy.
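The counting and thresholding logic can be sketched as below. The helper names and the default threshold are illustrative assumptions. One detail worth knowing: TextBlob's polarity for a single text lies between -1.0 and 1.0, so values such as 10 or 81 mentioned in the video are sums over many tweets. Labelling a sum between zero and the threshold as "neutral" is a choice made for this sketch; the video only says that such a sum should not count as positive.

```python
def tweet_polarity(text):
    """Polarity of one text according to TextBlob, from -1.0 to 1.0."""
    # Imported here so the counting helpers work without TextBlob installed.
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def count_tweets(polarities):
    """Count tweets with positive, zero, and negative polarity."""
    counts = {"positive": 0, "neutral": 0, "negative": 0}
    for polarity in polarities:
        if polarity > 0:
            counts["positive"] += 1
        elif polarity < 0:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return counts


def overall_sentiment(polarities, positive_threshold=15.0):
    """Label the summed polarity, demanding a minimum sum for 'positive'."""
    total = sum(polarities)
    if total >= positive_threshold:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"  # assumption: sums between 0 and the threshold
```

A typical use would be `polarities = [tweet_polarity(clean_text) for clean_text in texts]`, followed by `count_tweets(polarities)` and `overall_sentiment(polarities)`. Because the sum grows with the number of tweets, a threshold chosen for 200 tweets would need rescaling for a different sample size.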
Keywords
💡Twitter API
💡Sentiment Analysis
💡Natural Language Processing (NLP)
💡TextBlob
💡Tweepy
💡API Key
💡Authentication
💡Cursor
💡Polarity
💡Threshold
💡Context
Highlights
Introduction to building a Twitter sentiment analysis tool using Python and the Twitter API.
Overview of setting up a Twitter developer app, including the need for a verified Twitter account.
Importance of obtaining API keys and tokens from the Twitter developer portal to interact with the Twitter API.
Installation of two critical Python libraries for this project: TextBlob (for sentiment analysis) and Tweepy (for interacting with the Twitter API).
Guide on how to read API keys from a file for security, or hard-code them directly for simpler use.
Establishing a connection to the Twitter API using OAuth and API keys, followed by setting access tokens.
Defining a search term (such as 'stocks') and a tweet limit to specify how many tweets will be analyzed.
Explanation of using Tweepy’s cursor object to fetch tweets based on search terms and filter them by language.
Cleaning up tweet text by removing retweets (RT) and mentions (@usernames) to enhance analysis accuracy.
Using TextBlob for sentiment analysis by computing the polarity of each tweet's text.
Summing up the polarity of all tweets to determine overall sentiment on a specific topic.
Handling biases in sentiment analysis, such as positive bias, and introducing a threshold to determine positive polarity.
The ability to categorize tweets into positive, neutral, and negative based on their polarity score.
Demonstration of modifying the polarity threshold to get more accurate sentiment results.
Final sentiment analysis results for different topics (e.g., stocks, war, death, happiness) and understanding biases in the results.
Transcripts
[Music]
what is going on guys welcome back to
another ai project tutorial in python in
today's video we're going to build a
twitter sentiment analysis
tool so we're going to use the twitter
api in order to get a bunch of tweets
based on a topic that we provide
and then we're going to use natural
language processing tools in order to
analyze the overall sentiment for that
topic so let us get right into it
now before we can get into the actual
coding we need to make sure we have a
twitter app set up
so um if you've never worked with
twitter developer apps you probably
don't have that so you need to navigate
to developer.twitter.com or
apps.twitter.com
and it's going to get you here to the
developer portal and here you can create
new projects now this only works if you
have a um
an account where you have a phone number
i think and a confirmed email or
something you cannot just create a
a new twitter account without specifying
or verifying your phone number and then
do this
uh you need to have an actual twitter
account that
is verified citizen like not verified in
terms of celebrity verified but
verified in terms of you have specified
a phone number a valid phone number and
an email
then you can use these developer apps
here and the only thing that you need to
do is you need to create
an app and um you you will need to have
to specify
what you need it for who you are like
are you a student or a company what
are you doing with this for example i
specified i'm going to do youtube
tutorials with this app so
depending on what you plan on doing with
that you need to specify what you're
going to use it for
and then you can just create it and then
what you have is you have a bunch of
keys here so you
have an api key and a secret api key
and you will have access tokens and a
secret access token now i'm not going to
go through the whole process here
i think it's very intuitive and also you
can go ahead and just google
how to set up a twitter app developer
app whatever
it's not too complicated um this
tutorial is more focused on the learning
but what you need for this tutorial
is you need uh to definitely know your
api key your secret api key
your access token and your secret access
token now you can regenerate them revoke
them at all times i'm not going to show
my keys in this tutorial but you need to
use your keys your four keys
you need to know them and to use them in
the script
so for this video we're going to need
two additional libraries that are not
part of the core python stack and those
are
textblob and of course
tweepy or tweetpie i'm not sure how it's
pronounced
now you can just go ahead and install
them using a command line so you just go
ahead
activate your conda environment if you
have one in my case it's called
main and then you just go ahead and type
pip install
textblob in my case already satisfied
and then you also go ahead and say pip
install tweepy or tweetpie
which is the twitter api uh library that
we're going to use
so when you have that you need to import
um
three libraries uh maybe four if we're
going to do a visualization in the end
but for now we're going to import three
libraries
the first one is from textblob we're
going to import
TextBlob this is the library that we're
going to use for the actual sentiment
analysis
uh then we also need of course tweetpie
tweepy whatever
i'm going to call it pie here uh because
i don't want to say tweet pie tweepy
whatever all the time
so tweet pie um and then we're going to
import
sys as well which is part of the core
python stack
so what we do then is we need to
somehow connect to our app so we need to
use those four keys in order to
make a connection to the app that we
just created in the twitter
developer portal so what we're first
going to do is we're going to somehow
get the keys now in my case since i'm
not going to show them to you
i save the keys in this text file here
on the left and i'm going to read them
into my script
if you don't have anyone watching and if
you if you don't want to hide it
from anyone you can just go ahead and
write them clear text into your python
code so the only thing you need to do is
you need to say
api key equals whatever
api key secret equals whatever and you
just
paste the strings from the twitter
developer portal in my case i'm just going
to read them from a file so i'm going to
say
my keys equals open
twitter keys.txt you can also do it like
that if you want just
keep in mind that you need to have the
right order and then we're going to say
dot
read and then we're going to say splitlines
and that's essentially it and now i'm
just going to say api
key and here you paste your api key
in my case i'm going to say my keys
0 so the first one then we're going to
say api key
secret or api secret key
equals my keys 1
then we're going to say access token
equals my keys 2
and access token secret equals
my keys my keys
three as i said you don't need to do it
like that you just write
all the keys you can just copy paste
them directly out of the
developer portal if you want to so once
we have done that the next thing that we
need to do is we need to use those keys
because those are just strings right now
uh we need to use those keys in order to
connect to the app so what we're going
to do is we're going to
define an authentication handler so
we're going to say auth
handler equals tweepy.AuthHandler
actually OAuthHandler and what we
specify here is the consumer key
which is the api key and then we specify
the uh i think it's called
what is it called it's called consumer
secret
and here we're going to specify the api
key secret
then we're now connected to the uh api
and now we need to just set the access
token so that we know which app
we're working on and we're going to say
auth handler dot
set access token and here we set the
access token
and the access token secret
and then last but not least we create
the actual api so we say api
equals tweepy dot API
using the auth handler that we just
created
and that is how you build the connection
to the app
so now we can actually go ahead and
define the search term that we're
interested in so the topic so to say
uh that we want to analyze the sentiment
for and in this case i'm just going to
pick
stocks for that and then the second
thing that we need to do is we need to
define
or we can define uh the amount of tweets
that we're interested in so we can base
our analysis on 10 tweets on 100 tweets
on a thousand tweets
and of course the more tweets we use for
the analysis
the more quote-unquote accurate it will
get now of course
accurate is to be taken with a grain of
salt here because we're using textblob
and essentially what we're doing is
we're looking at each word
and then determining if this word is a
positive one or negative one and then
just adding up all these sentiments for
the individual words
which is not always uh in the right
context of course
so if i have for example not happy happy
is still a positive word and it's not
the most accurate thing to do
however the more the more tweets we use
the more quote-unquote accurate we get
so we're going to say
twitter or actually sorry tweet
amount equals and we're just going to
go for 200 uh in the beginning
and then what we need to do is we need
to use a cursor object
in order to search for the term so we're
going to say
our results or actually that's let's
call it tweets
equals and then we're going to use tweepy
dot Cursor and
here we need to specify the method the
argument for the method
and the language is an optional argument
that we're definitely going to specify
because otherwise we got all kinds of
results here so the first thing is we're
going to use api.search so the api that
we already created here with
our access token with our secret key
with our api key
we're going to use its search method
then we're going to specify
the parameter q here which is the actual
term that we're going to search for so
we're just going to pass a search term
and then we're going to specify a
language in this case
english so lowercase e n
is how you specify the english language
because if you don't do that
you'll get all kinds of hyperlinks and
all kinds of
uh random auto-generated messages
uh maybe you get spanish french posts uh
and the sentiment analysis and textblob
as far as i know
only works on english text so you'll get
not the most accurate results
and then we're going to use this object
here sorry
in order to call the items function and
the items function
essentially just specifies how many
items we're interested in so
we're going to get all the items uh and
we can pass
the tweet amount that we just created in
order to limit it to 200
items to 200 tweets so
i think we should be able to see the
tweets now so for tweet in tweets
we should be able to print them out onto
the screen
let's see if this is the case so
actually i need to run this thing run
main
and uh actually it works but i think we
don't get the text so let me just see
what i did here
okay we need to say tweet dot text
and then we should be able to see the
tweets
there you go as you can see there are a
lot of things here
a lot of noises so to say that we are
not
necessarily interested in so you can see
we have a lot of rt
rt we have a lot of at we have a lot of
mentions and all kinds of stuff so uh
maybe we'll get rid of those as well to
make the results more accurate
so let's go ahead and clean up the
tweets a little bit we're going to say
for tweet in tweets and now we're going
to
um delete all these rt tags and also
we're going to try to
get rid of all these mentions here of
all these add
some account occurrences so we're going
to first say
final text you can also choose a more
reasonable name here i'm just going to
call it final text is essentially the
tweet
text that is remaining in the end and
we're going to say it's just tweet
dot text that we have but we're going to
replace
the rt with nothing so we're going to
just get rid of the rt
occurrences here so all of those here
because they're essentially not
important for the sentiment
um also we're going to to remove all
these
at whatever at least at the start so
we're not going to make too complicated
expressions here but every time that at
something appears at the start
we're going to delete it so since we
removed the rt we have
a blank space here a white space and
then the at
username so what we're going to do is
we're going to say if
final text that we already removed the
rt from
if this text starts with
with if it starts with blank space
at what we're going to do is we're going
to say find
the colon colon because what we have
here is we have an
at then a username and it ends with a
colon here
so we're going to find the index of the
colon so we're going to say
position equals final text.index
of the colon and then after that we're
going to say final text
equals final text but it starts
from the position that we have um
plus from then on
so we're going to uh cut off the first
part
where the username is so this is the
actual final text we're not going to
we can actually go ahead and just print
the final text here
we're not going to clean up all the mess
here we could also remove the hyperlinks
and all that
uh actually it seems to not work
every time i think because there are
also user names that are not starting
with rt so we ignored those
we can actually also try the same thing
for those i think
if final text starts with at directly
we could do the same thing i think
then we should be able to get rid of
those
uh what do we have here substring not
found
oh i think that's the problem because um
when they start
with at username we don't get a colon
there
yeah as you can see let's just wait a
little bit
till it's done as you can see wherever
we start with something like that we
don't have a colon so we could actually
go ahead
uh and repeat the same thing but we
would have to look for the white space
like that and like that
so this should work in order to remove
those usernames as well
most of the time at least yeah seems to
work
so we clean up uh we we have cleaned up
the messages here and or the tweets here
and now we can go ahead and do the
sentiment analysis for this we're just
going to create a TextBlob object we're
going to say analysis
equals TextBlob of the final text so
we're going to not pass the tweet text
but the final text the cleaned up text
and then we're just going to find the
polarity of it
and overall we're going to create a
polarity object starting at zero so
polarity equals zero
and we're just going to add the polarity
of the individual tweets
to that polarity uh variable so we're
going to say analysis
dot polarity because if a text has a
polarity of 10
um it's probably a very positive text if
it has a polarity of -10 it's a very
negative text
so if you combine them you end up with
zero because you have one very negative
one very positive text so it's neutral
if you have a very very positive text
like a
polarity of let's say 100 then you have
-10 you have a very positive text
and a somewhat negative text and when
you combine them you have still
positive polarity overall so this is how
it works it's enough to just
add them up um and then what we can do
is we can actually just go ahead and print
the polarity itself if that is enough
uh so we're not going to see all the
tweets right now since we're not
printing them we're just going to get an
overall polarity at the end
hopefully and you can see it's positive
because everything above
0 is positive the more it's above zero
the more positive it is
um i'm not sure i i figured that
whenever i use the twitter api or
whenever i use textblob in general
i have a positive bias even if i look
for topics like war or disease or
something i still get
very uh not very but at least slightly
positive
sentiment but we can try let's see what
happens when we look for war
um i'm not sure why this is but i think
because
most words are actually positive and
there are very few words that are
negative and
people don't use them because people
often times use as you can see we have a
polarity of 10 which is positive
i think people oftentimes use stuff like
not good not happy
or something like that and it makes the
analysis
tend a little bit to be more positive
however
what we can do as well is we can
actually count
the amount of positive tweets the amount
of negative tweets and the amount of
neutral tweets
i don't think that we will have any
completely neutral tweets
but we can actually just go ahead and
say positive equals zero
neutral equals zero negative equals zero
and then whenever we have um we can say
uh tweet polarity
equals analysis dot polarity and we can
say
if tweet polarity
is larger than zero
positive plus equals one
um elif tweet polarity
less than zero
positive no sorry negative plus equals
one and else if it's exactly zero we're
just going to say neutral equals
plus equals one and then here we add
the tweet polarity as well so we can
print the polarity we can
um print f string
amount of positive
tweets is positive
and then we can just go ahead
copy that for negative and
neutral
and then just exchange or swap the
values here so
neutral there you go
um and one thing that you could do since
we have a positive bias is you can
actually say if you really want
something to be considered positive
you have to demand that it crosses at
least like 15 or 20 or something
because if even topics like war or
actually maybe war
is is really a positive topic because we
have um
so little war so people maybe talk about
how little war we have
uh this is also a possibility so maybe
that's really uh
the case here but actually you could
also just set a higher standard so you
can say we don't consider something to
be positive
unless it has at least a polarity of uh
15 or something so because i very very
rarely get something below zero and if
it's below zero it's like
minus five or something you rarely get
something like -10
because as i said most words are
probably positive but we can try for a
different topic here
let's uh see something like um
what is a negative topic that will not
get demonetized here i know what you're
all thinking but i cannot do that
because
if i use the word that you're all
thinking about i probably
uh will get demonetized here so i'm
going to go with
something like death because death is
never a good topic
and let's also go ahead and print the
tweets
so that we can see what's happening here
not tweet final text and then we can see
the results
prison death benefit death
immediately death death death there you
go
um still positive for some reason we
have 63 positive
uh tweets then we have 40 negative
and 97 neutral tweets i'm not sure if
this is actually
maybe we need to work with with floating
point numbers
so maybe we should say larger than 0.00
less than 0.00 and then neutral
is only we're not going to do else we're
going to say elif
tweet polarity only if it is exactly
0.00 so maybe that is a problem here
because i don't think that we have
97 tweets that have exactly zero as
polarity
i mean could be but it seems unrealistic
to me
so let's see oh
we still have a lot more so yeah i think
the only way to
to really do that i mean we could also
go ahead and read through all the tweets
here maybe they
really are positive tweets or neutral
tweets uh but i think if they are
actually
negative in your perception and the tool
still outputs that they're positive you
can just set the standards the threshold
higher
you can say i only consider something to
be positive if it's above
20 for example so we can actually try to
to go with a topic like
happy or happiness because i think there
we will get a much higher number than
eight
maybe i'm also wrong i don't know i'm
always surprised by this uh
this sentiment analysis here but
actually i think if something is really
really positive yeah as you can see we
get 81
so we can actually consider something to
be positive uh only if it is above
20 or something or 15 at least or
anything like that
uh you can do that however you want but
that's how you analyze the sentiment for
a specific topic
on twitter so that's it for this i hope
you enjoyed it i hope you learned
something if so let me know by hitting
the like button and leaving a comment in
the comment section down below
uh also feel free to make any
suggestions in the comment section down
below
for projects that you would like to see
in the future um
i know that i don't always do the
projects that you suggest or
ask for but sometimes it is just because
i'm not capable of
yet because a lot of you guys have
requested django tutorials for example
but i'm simply not good at django i've
not used django enough i've not worked
with django i'm not good at python web
development yet
because i haven't educated myself in
that area
so in order to make django tutorials i
need to first learn
django myself for example and this is
also true for a lot of other projects
but feel free to suggest any ai projects
networking projects in the comment
section down below
if i think they're a good idea and if
i'm capable of implementing them i'm
definitely going to make a video
on them other than that make sure you
subscribe to this channel in order to
see more future
videos for free and thank you very much
for watching see you next video
bye
[Music]