Aspect Based Sentiment Analysis: A Python Demo

Decision Analytics
23 May 2022 · 09:26

Summary

TL;DR: This video introduces aspect-based sentiment analysis (ABSA), a method to extract structured data on aspects, opinions, and sentiments from reviews. It explains ABSA's advantages over traditional sentiment analysis and walks through a paper whose model extracts aspect sentiment triplets. The video ends with a Python demo that runs a pre-trained model on new sentences and points to a notebook showing how to train a model on your own labeled data and evaluate it with the F1 score.

Takeaways

  • 🍽 Aspect-Based Sentiment Analysis (ABSA) focuses on identifying specific aspects of a product or service mentioned in reviews and the sentiment towards those aspects.
  • 📊 ABSA provides structured data that includes aspects, opinions, and sentiments, offering more detailed insights compared to traditional sentiment analysis.
  • 🔍 The example given in the script illustrates how ABSA can distinguish between positive sentiment towards 'waiters' and negative sentiment towards 'pasta' in a restaurant review.
  • 📚 Terminology in ABSA includes 'aspects' or 'targets' (like 'waiters' and 'pasta'), and 'subjective opinions' (like 'friendly' and 'average').
  • 📊 ABSA can quantify sentiments towards aspects by counting mentions and categorizing them as positive, negative, or neutral.
  • 🔍 The script mentions clustering and dimension reduction as methods to group similar aspects when there are too many to analyze individually.
  • 📖 The paper 'Learning Span-level Interactions for Aspect Sentiment Triplet Extraction' is highlighted, which introduces a model that outputs 'aspect sentiment triplets'.
  • 🛠️ The model architecture involves tokenization, span enumeration, pruning, and classification to identify aspects, opinions, and their sentiment relations.
  • 💻 The paper's authors have made their code publicly available on GitHub, allowing others to replicate their work and use their model.
  • 🐍 A Python demo using the pre-trained model is provided, showing how to predict aspect sentiment triplets for new sentences.
  • 📈 The model's performance varies depending on the dataset and the sentence encoder used, with BERT showing better performance than BiLSTM.
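
The counting mentioned in the takeaways above can be sketched in a few lines of Python. This is a minimal illustration, assuming triplets have already been extracted as (aspect, opinion, sentiment) tuples. The helper name `summarize_triplets` and the sample triplets are hypothetical and are not output from the model shown in the video.

```python
from collections import Counter, defaultdict


def summarize_triplets(triplets):
    """Count opinions and sentiments per aspect.

    `triplets` is an iterable of (aspect, opinion, sentiment) tuples.
    """
    opinions = defaultdict(Counter)
    sentiments = defaultdict(Counter)
    for aspect, opinion, sentiment in triplets:
        opinions[aspect][opinion] += 1
        sentiments[aspect][sentiment] += 1
    return opinions, sentiments


# Hypothetical triplets, made up for illustration only.
sample = [
    ("food", "great", "positive"),
    ("food", "great", "positive"),
    ("food", "dry", "negative"),
    ("service", "slow", "negative"),
    ("service", "friendly", "positive"),
]

opinions, sentiments = summarize_triplets(sample)
for aspect in opinions:
    print(aspect, dict(opinions[aspect]), dict(sentiments[aspect]))
```

Keeping opinions and sentiments in separate counters makes it easy to report both "how often was this word used" and "how positive is this aspect overall" from the same pass over the data.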

Q & A

  • What is Aspect-Based Sentiment Analysis (ABSA)?

    -Aspect-Based Sentiment Analysis is a method used to analyze specific aspects or targets of a product or service mentioned in a review or text, identifying the sentiment expressed towards these aspects.

  • How does ABSA differ from traditional sentiment analysis?

    -ABSA provides more detailed insights by extracting information about specific aspects people are discussing, whether they like or dislike each aspect, and the reasons behind their sentiments, whereas traditional sentiment analysis typically focuses on overall positive, negative, or neutral sentiments.

  • What is an example of structured data output for ABSA?

    -Structured data output for ABSA might include aspects like 'waiters' and 'pasta' from a restaurant review, with opinions such as 'friendly' and 'average', and sentiments like 'positive' and 'negative' respectively.

  • What terminology is used in ABSA for specific parts of a review?

    -In ABSA, 'aspects' or 'targets' refer to specific parts of a product or service being reviewed, while 'subjective opinions' are the adjectives or phrases used to describe those aspects.

  • What does the output of ABSA look like when analyzing a large number of reviews?

    -The output of ABSA on a large dataset might show the frequency of mentions for each aspect, along with the associated opinions and sentiments, such as 'food' being mentioned 523 times with various opinions like 'weird', 'authentic', 'great', and 'dry'.

  • How can clustering and dimension reduction be applied in ABSA?

    -When there are too many aspects identified in the reviews, clustering and dimension reduction techniques can be used to group similar aspects together, simplifying the analysis and making it more manageable.

  • What is an 'aspect sentiment triplet' as mentioned in the paper walkthrough?

    -An 'aspect sentiment triplet' refers to a combination of an aspect, the opinion expressed about that aspect, and the sentiment associated with that opinion, such as 'Windows 8' being the aspect, 'did not enjoy' the opinion, and 'negative' the sentiment.

  • What is the process for breaking down a sentence into tokens in ABSA?

    -In the model described, sentences are broken down into tokens, which are then combined to form spans. These spans represent phrases that could be aspects or opinions, like 'Windows 8' or 'did not enjoy'.

  • How does the model handle the computational complexity of enumerating spans?

    -To manage computational complexity, the model performs a pruning operation that removes invalid spans and keeps only candidate aspects (targets) and candidate opinions.

  • What is the role of the sentiment relation classifier in the ABSA model?

    -The sentiment relation classifier determines whether each aspect-opinion pair conveys a positive, negative, or neutral sentiment, or if it is invalid due to being an objective fact without a subjective opinion.

  • How can one train a model from scratch using the provided ABSA framework?

    -The provided notebook demonstrates how to train a model from scratch using one's own labeled training data and evaluate its performance using metrics like the F1 score.
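
As a rough illustration of the F1 evaluation mentioned in the last answer, the sketch below scores predicted triplets against gold triplets by exact match. Exact matching of the whole triplet is an assumption made here for simplicity; the video does not show how the notebook computes its score. The function name `triplet_f1` and the predicted set are hypothetical.

```python
def triplet_f1(predicted, gold):
    """Exact-match precision, recall and F1 over sets of triplets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold triplets follow the laptop example from the video.
gold = {
    ("windows 8", "not enjoy", "negative"),
    ("touch screen functions", "not enjoy", "negative"),
}
# Hypothetical prediction that gets one aspect boundary wrong.
predicted = {
    ("windows 8", "not enjoy", "negative"),
    ("touch screen", "not enjoy", "negative"),
}

precision, recall, f1 = triplet_f1(predicted, gold)
print(precision, recall, f1)
```

Under exact matching, a triplet with a slightly wrong span boundary counts as both a false positive and a missed gold triplet, which is why span extraction quality has a large effect on the final score.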

Outlines

00:00

📊 Aspect-Based Sentiment Analysis Explained

This paragraph introduces Aspect-Based Sentiment Analysis (ABSA), emphasizing its ability to provide structured data that includes aspects, opinions, and sentiments. The script uses a restaurant review as an example to illustrate how ABSA can distinguish between positive and negative sentiments towards different aspects of a product or service. It also explains the terminology involved, such as 'aspects' or 'targets' for the subjects being reviewed and 'subjective opinions' for the descriptors. The paragraph concludes by highlighting the advantages of ABSA over traditional sentiment analysis, which is its capacity to extract detailed feedback on specific aspects and reasons behind the sentiments.

05:00

🔍 Deep Dive into ABSA Model Architecture

The second paragraph delves into the model architecture for ABSA, focusing on a paper titled 'Learning Span-level Interactions for Aspect Sentiment Triplet Extraction.' It outlines the process of breaking sentences into tokens and then combining them into spans to identify aspects and opinions. The model performs pruning to filter out invalid spans, leaving only candidate aspects and opinions. These are then paired and classified to determine the sentiment relationship. The paragraph also discusses the model's performance, which the video describes as significantly improved over previous papers. It notes the use of BiLSTM or BERT for sentence encoding and the four datasets used for training and testing, including restaurant and laptop reviews from 2014. The paragraph concludes with a mention of a Python demo on Google Colab that demonstrates how to use the model for prediction and evaluation.

Keywords

💡Aspect-Based Sentiment Analysis (ABSA)

Aspect-Based Sentiment Analysis is a subfield of sentiment analysis that focuses on identifying and categorizing opinions expressed about specific aspects or features of a product, service, or topic. Unlike general sentiment analysis, which determines the overall sentiment towards an entity, ABSA drills down to the granular level to understand sentiments towards particular attributes. In the video, ABSA is used to analyze an online restaurant review, distinguishing between sentiments towards 'waiters' and 'pasta'.

💡Structured Data

Structured data refers to information that is organized into a formatted repository, typically a database, where it can be easily accessed, managed, and analyzed. In the context of the video, structured data is the output format of ABSA, which shows the aspect, opinion, and sentiment in an organized manner. For instance, the review 'waiters are very friendly but the pasta is simply average' is translated into structured data indicating 'waiters' as an aspect with a positive sentiment, and 'pasta' with a negative sentiment.
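
A minimal sketch of what such structured output could look like in Python follows. The `Triplet` class and its field names are assumptions chosen for illustration, and the rows are hand-written to match the restaurant example rather than produced by a model.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Triplet:
    aspect: str
    opinion: str
    sentiment: str  # "positive", "negative" or "neutral"


review = "waiters are very friendly but the pasta is simply average"

# Hand-written rows matching the example in the video, not model output.
rows = [
    Triplet("waiters", "friendly", "positive"),
    Triplet("pasta", "average", "negative"),
]

for row in rows:
    print(asdict(row))
```

Each row is a plain record, so a list of them can be loaded into a table or database for the kind of counting and grouping the video describes.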

💡Aspect

In ABSA, an 'aspect' is a specific feature or component of a product or service that is being evaluated. It is also known as a 'target'. The video script uses 'waiters' and 'pasta' from a restaurant review as examples of aspects where the sentiment analysis focuses on these particular elements to determine the sentiment expressed towards them.

💡Opinion

An 'opinion' in the context of ABSA refers to the subjective evaluation or view expressed about an aspect. It is a descriptive term that captures the sentiment-bearing phrase that is associated with an aspect. For example, in the transcript, 'friendly' and 'average' are opinions about the aspects 'waiters' and 'pasta', respectively.

💡Sentiment

Sentiment, in the video, refers to the emotional tone or attitude expressed in an opinion, which can be positive, negative, or neutral. It is a key output of ABSA, indicating the feeling or opinion towards an aspect. The script gives examples such as 'friendly' indicating a positive sentiment and 'average' indicating a negative sentiment.

💡Target-Based Sentiment Analysis

Target-Based Sentiment Analysis is synonymous with ABSA and focuses on analyzing sentiments directed at specific targets or aspects. The video emphasizes that ABSA, or target-based sentiment analysis, provides more detailed insights about individual aspects of a product or service compared to traditional sentiment analysis.

💡Aspect Sentiment Triplet

An 'Aspect Sentiment Triplet' is a model output that consists of an aspect, an opinion, and a sentiment. It is a structured way to represent the relationship between an aspect and the sentiment expressed towards it through an opinion. The video describes a model that outputs these triplets, such as 'Windows 8' as an aspect, 'did not enjoy' as the opinion, and 'negative' as the sentiment.

💡Tokenization

Tokenization is the process of breaking down text into individual elements or 'tokens', which are typically words or phrases. In the video, tokenization is the first step in the model's architecture, where a sentence is broken into tokens to later form spans that represent aspects or opinions.

💡Span

A 'span' in the context of the video refers to a sequence of tokens that together represent an aspect or an opinion. It is a multi-word phrase that is considered as a single unit for sentiment analysis. For example, 'Windows 8' is a span that represents an aspect in a review about a laptop.
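
Span enumeration can be sketched as two nested loops over token positions. This is an illustrative sketch that assumes simple whitespace tokenization and an optional width limit; the function name `enumerate_spans` and the `max_width` parameter are hypothetical and are not taken from the paper's code. Without a width limit, a sentence of n tokens has n(n+1)/2 contiguous spans.

```python
def enumerate_spans(tokens, max_width=None):
    """Return every contiguous span as (start, end, text), end inclusive."""
    n = len(tokens)
    max_width = n if max_width is None else max_width
    spans = []
    for start in range(n):
        for end in range(start, min(start + max_width, n)):
            spans.append((start, end, " ".join(tokens[start:end + 1])))
    return spans


tokens = "did not enjoy the new windows 8".split()
all_spans = enumerate_spans(tokens)
short_spans = enumerate_spans(tokens, max_width=3)

# 7 tokens give 7 * 8 / 2 = 28 spans; limiting the width to 3 leaves 18.
print(len(tokens), len(all_spans), len(short_spans))
```

Even for this short sentence most spans are not meaningful phrases, which is why the model follows enumeration with a pruning step.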

💡Pruning

Pruning, in the context of the video, is a technique used to reduce computational complexity by eliminating irrelevant or unnecessary data. The model prunes spans that are classified as invalid, keeping only candidate aspects (targets) and candidate opinions for further analysis.
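
One way to picture pruning is to give every span a score for each role and keep only the best few. The sketch below assumes such scores exist and uses top-k selection purely for illustration; the scores are invented, and the helper name `prune_spans` and the `keep` parameter are hypothetical rather than part of the paper's code.

```python
def prune_spans(scored_spans, keep):
    """Keep the `keep` highest-scoring spans and drop the rest."""
    ranked = sorted(scored_spans, key=lambda item: item[1], reverse=True)
    return [span for span, _ in ranked[:keep]]


# Hypothetical scores; a trained model would produce these.
aspect_scores = [
    ("did", 0.01),
    ("not enjoy", 0.05),
    ("the new", 0.10),
    ("windows 8", 0.92),
    ("touch screen functions", 0.88),
]
opinion_scores = [
    ("did", 0.03),
    ("not enjoy", 0.90),
    ("the new", 0.08),
    ("windows 8", 0.04),
    ("touch screen functions", 0.02),
]

aspect_candidates = prune_spans(aspect_scores, keep=2)
opinion_candidates = prune_spans(opinion_scores, keep=1)
print(aspect_candidates, opinion_candidates)
```

Pruning before pairing matters because every surviving aspect is later paired with every surviving opinion, so the number of pairs is the product of the two candidate counts.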

💡Sentiment Relation Classifier

The 'Sentiment Relation Classifier' is a component of the model that determines the sentiment relationship between an aspect and an opinion. It classifies each aspect-opinion pair as positive, negative, neutral, or invalid, since not every pair expresses a sentiment. The video uses the example of 'new Windows 8', where 'new' states an objective fact rather than a subjective opinion, so the pair is classified as invalid.
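
The pairing step can be sketched as a Cartesian product of candidate aspects and candidate opinions, with each pair passed to a classifier. In this sketch the classifier is a hand-written lookup table standing in for a trained model; `pair_candidates` and `toy_relation_classifier` are hypothetical names, and the labels follow the laptop example from the video.

```python
from itertools import product


def pair_candidates(aspects, opinions):
    """Pair every candidate aspect with every candidate opinion."""
    return list(product(aspects, opinions))


def toy_relation_classifier(aspect, opinion):
    """Stand-in for a trained classifier: a hand-written lookup table."""
    table = {
        ("windows 8", "not enjoy"): "negative",
        ("touch screen functions", "not enjoy"): "negative",
    }
    return table.get((aspect, opinion), "invalid")


aspects = ["windows 8", "touch screen functions"]
opinions = ["not enjoy", "new"]

triplets = []
for aspect, opinion in pair_candidates(aspects, opinions):
    label = toy_relation_classifier(aspect, opinion)
    if label != "invalid":  # drop pairs with no sentiment relation
        triplets.append((aspect, opinion, label))

print(triplets)
```

Here four pairs are formed, and the two involving 'new' are discarded as invalid, leaving one triplet per aspect.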

Highlights

Introduction to Aspect-Based Sentiment Analysis (ABSA)

ABSA outputs structured data showing aspect, opinion, and sentiment

Terminology: aspects (targets), subjective opinions, and sentiment

ABSA provides more detailed sentiment analysis compared to traditional methods

Example of ABSA output for a restaurant review

ABSA can cluster and reduce dimensions for numerous aspects

Paper walkthrough on 'Learning Span-level Interactions for Aspect Sentiment Triplet Extraction'

Model outputs aspect sentiment triplets

Model architecture involves tokenization, span enumeration, and pruning

Aspect term extraction and opinion term extraction as part of the model

Sentiment relation classifier determines the sentiment between aspect and opinion

Model performance compared to previous papers

Use of BiLSTM or BERT for sentence encoding

Training and testing sets from the Semantic Evaluation (SemEval) workshop

Code availability on GitHub and Python demo on Google Colab

Process of downloading and installing packages for the demo

Example of training data for the 2014 laptop dataset

Pre-trained model prediction example

Model sensitivity to the dataset it was trained on

Instructions on training a model from scratch with custom data

Evaluation of model using F1 score

Call to action for likes and subscriptions

Transcripts

play00:01

hello everyone today i will introduce

play00:03

what is aspect-based sentiment analysis

play00:06

we will take a look at what absa outputs

play00:08

look like followed by a paper

play00:10

walkthrough and we will end with a

play00:12

python demo

play00:15

let's start with an example

play00:16

let's say we have an online review for a

play00:18

restaurant waiters are very friendly but

play00:21

the pasta is simply average

play00:24

absa should be able to output structured

play00:26

data

play00:27

showing the aspect the opinion and the

play00:30

sentiment

play00:31

in this case waiters are being described

play00:33

as friendly and it's a positive

play00:35

sentiment in contrast pasta is being

play00:38

described as average it's a negative

play00:40

sentiment

play00:42

as we saw on the previous page here we

play00:45

have some terminology and jargon

play00:48

waiters and pasta are called aspects

play00:50

also called targets

play00:53

friendly and average are subjective

play00:55

opinions

play00:58

to summarize

play00:59

absa is also called target-based

play01:02

sentiment analysis and opinion mining

play01:06

compared to traditional sentiment

play01:07

analysis absa provides more details

play01:11

for a given product or service absa can

play01:14

extract what aspects are people talking

play01:16

about

play01:17

do people like or dislike each aspect

play01:20

and for what reason

play01:23

next let's look at what absa outputs

play01:25

should look like

play01:27

let's say we scraped all the online

play01:29

reviews for a restaurant this is a

play01:31

google review page for a place i really

play01:33

like in florida

play01:36

the outputs should look something like

play01:37

this for instance people may have

play01:39

mentioned the word food 523 times

play01:43

seven people may have said the food

play01:45

tasted weird

play01:46

six people may have said the food tasted

play01:49

authentic

play01:50

43 said the food was great 17 said the

play01:54

food was dry and weird corresponds to a

play01:57

negative sentiment

play01:58

authentic positive great positive

play02:02

and so on and on the other hand people

play02:04

may have mentioned the word service 326

play02:07

times 34 people may have said the

play02:09

service was slow

play02:11

12 may have said the service was fast

play02:14

six may have said the servers were rude

play02:17

and 15

play02:18

friendly they also correspond to

play02:21

different sentiments

play02:23

if you ended up having too many aspects

play02:26

you can do clustering and dimension

play02:28

reduction to group similar aspects

play02:31

together

play02:34

and now that we have a general

play02:35

expectation of what outputs we're

play02:37

looking for i'm going to do a paper walk

play02:40

through so that we can understand the

play02:41

model design and architecture

play02:45

this paper is called learning span level

play02:48

interactions for aspect sentiment

play02:50

triplet extraction it has two first

play02:52

authors who contributed to the work

play02:54

equally

play02:56

and their model will output what they

play02:57

call

play02:58

aspect sentiment triplets

play03:00

for example we have a consumer review

play03:02

here for a laptop did not enjoy the new

play03:05

windows 8 and touch screen functions

play03:08

here windows 8 is the aspect and so is

play03:11

touch screen functions

play03:13

they were both described by the opinion

play03:16

not enjoy and the sentiment is negative

play03:22

here we have the model architecture

play03:24

first they break a sentence into tokens

play03:27

so did not enjoy the new windows 8 they

play03:30

were broken down into many tokens

play03:33

and these tokens are then combined to

play03:36

form a span

play03:38

for example did

play03:40

did not enjoy not enjoy enjoy the new

play03:44

and we need to combine these words to

play03:46

form a span because an aspect or an

play03:48

opinion can be a phrase and a phrase can

play03:51

have multiple words windows 8 should be

play03:54

read together as one phrase it wouldn't

play03:56

make sense to break it apart

play03:58

and the enumeration of span is pretty

play04:01

computationally intensive because the

play04:03

number of possible spans grows

play04:05

exponentially as the sentence gets

play04:08

longer

play04:09

to control the computational complexity

play04:12

next they did some pruning

play04:14

here is a task called aspect term

play04:17

extraction

play04:18

and opinion term extraction

play04:21

each span is then classified as either

play04:24

invalid

play04:25

an opinion

play04:27

or a target also called an aspect

play04:31

and a pruning operation here removes

play04:34

everything that's invalid and only keeps

play04:36

candidate aspects or targets as well as

play04:40

candidate opinions

play04:42

and finally each aspect is paired with

play04:44

each opinion these target candidate

play04:46

aspects and opinions are coupled

play04:48

together so that we can determine the

play04:50

sentiment relationship between them

play04:54

here the sentiment relation classifier

play04:57

determines whether each aspect opinion

play05:00

pair

play05:01

conveys a positive

play05:04

or a negative or a neutral sentiment it

play05:07

can also be invalid because the new

play05:10

windows 8 is a fact it's not a

play05:13

subjective opinion but rather an

play05:15

objective fact so there is no sentiment

play05:18

relationship there

play05:21

and this model has a decent performance

play05:24

which is significantly improved compared

play05:27

to previous papers

play05:30

the performance varies depending on the

play05:33

model in the beginning they can either

play05:35

do sentence encoding using bilstm or

play05:39

bert

play05:40

bert is a more complex model so it leads

play05:42

to better performance

play05:45

they also had four different training

play05:47

and testing sets

play05:49

here rest 14 means that it's a data set

play05:52

about restaurant from 2014

play05:55

and lap 14 means that it's a data set

play05:57

about laptop from 2014.

play06:01

these data sets were originally released

play06:04

by the semantic evaluation workshop

play06:07

the authors of this paper have made

play06:09

their code publicly available on github

play06:12

and we can see a python demo on google

play06:15

colab

play06:18

in the beginning we need to download and

play06:20

install a number of packages as well as

play06:22

loading the pre-trained model

play06:25

there are a lot of packages that needed

play06:28

to be downloaded it took me about four

play06:30

minutes

play06:31

the next cell shows what training data

play06:34

looks like and here specifically they're

play06:37

showing an example

play06:38

for the 2014 laptop data

play06:42

and this is what the data looks like for

play06:44

example i charge it at night and skip

play06:46

taking the cord with me because of the

play06:48

good battery life

play06:50

the target word is battery life and its

play06:53

index goes from 16 to 17

play06:57

and the opinion word is good its index

play07:00

goes from 15 to 15

play07:02

and also this is a

play07:05

positive sentiment
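
The index numbers read out above can be checked with a short Python sketch. It assumes the sentence is split on whitespace and that indices are zero-based and inclusive at both ends; the helper name `span_text` is hypothetical and is not part of the notebook.

```python
sentence = ("I charge it at night and skip taking the cord with me "
            "because of the good battery life")
tokens = sentence.split()


def span_text(tokens, start, end):
    """Join tokens from start to end, both zero-based and inclusive."""
    return " ".join(tokens[start:end + 1])


# Indices as read out in the video.
target = span_text(tokens, 16, 17)
opinion = span_text(tokens, 15, 15)
print((target, opinion, "positive"))
```

Under those assumptions, tokens 16 to 17 give the target and token 15 gives the opinion, so a labeled example needs only the sentence, two index ranges and a sentiment label.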

play07:07

and below we can use a pre-trained model

play07:09

for prediction this is the training

play07:12

example we saw from the paper did not

play07:14

enjoy the new windows 8 and touch screen

play07:17

functions

play07:19

at the very end the output is going to

play07:21

show you a triplet

play07:23

so here the target word is windows 8 and

play07:27

the opinion is not enjoy the sentiment

play07:30

is negative this sentence has two

play07:33

aspects so we also get an output for

play07:35

touch screen

play07:37

so the touch screen functions

play07:40

the opinions also not enjoy and the

play07:42

sentiment is negative

play07:46

now let's try providing a new example

play07:51

easy to use and set up

play07:54

no problems after six months

play08:00

and let's run the cell and

play08:02

see what happens

play08:04

and after about 48 seconds we have the

play08:08

output

play08:09

here the target is used the opinion is

play08:12

that it's easy to use and the sentiment

play08:15

is positive

play08:16

so it looks pretty good

play08:19

and let's try something else

play08:22

so so this model is trained on laptop

play08:25

data let's try

play08:27

something not related to laptop

play08:30

let's say the treatment is effective

play08:35

and i already run this cell it ended up

play08:37

not giving us anything

play08:39

meaning that it wasn't able to detect

play08:42

any sentiment aspect triplets

play08:45

so the model is pretty sensitive to the

play08:48

data set it was trained on because it

play08:50

was trained on data about laptop it

play08:53

wasn't able to detect

play08:55

it wasn't able to detect anything

play08:58

in the medical domain

play09:00

but the nice thing is that the notebook

play09:03

also shows you how to train a model from

play09:06

scratch using your own labeled training

play09:08

data

play09:10

they also show you how to evaluate

play09:13

your model using f1 score

play09:17

this is the end of the video if you

play09:18

learned anything useful today feel free

play09:20

to like and subscribe

play09:22

thank you for your time and take care

Related Tags
Sentiment Analysis, Aspect-Based, Python Demo, ABSA, Opinion Mining, Data Extraction, Machine Learning, Review Analysis, Model Training, Text Processing