I have not found a lot of information on this, but I am wondering if there is a standard way to take the outputs of a BERT model used for sentiment analysis and connect them back to the original tokenized string, in order to understand which words most influenced the predicted sentiment.
For example, the string "this coffee tastes bad" produces a negative sentiment. Is it possible to analyze the hidden-layer outputs and tie those results back to each token, to understand which words in the sentence had the most influence on the negative classification?
The chart below is the result of my attempt to explore this, but I am not sure it makes sense and I do not think I am interpreting it correctly. I am taking the outputs of the last hidden layer, which in this case has shape (1, 7, 768) ([CLS] + 5 word-piece tokens + [SEP]), and looping through each token, summing its 768 hidden-state values and computing the average. The resulting per-token averages are plotted in the graph below.
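For reference, here is a minimal sketch of that averaging step. It assumes the Hugging Face transformers library and the plain bert-base-uncased checkpoint as placeholders; my actual setup is a fine-tuned sentiment classifier, but the per-token computation is the same:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: bert-base-uncased via Hugging Face transformers is used here
# purely as a stand-in; a fine-tuned sentiment model works the same way as
# long as it exposes the last hidden state.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("this coffee tastes bad", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (1, seq_len, 768):
# [CLS] + word-piece tokens + [SEP]
last_hidden = outputs.last_hidden_state  # (1, 7, 768) for this sentence

# Average each token's 768-dimensional vector, as described above.
per_token_mean = last_hidden.mean(dim=-1).squeeze(0)  # shape (seq_len,)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, score in zip(tokens, per_token_mean.tolist()):
    print(f"{tok:>10s}  {score:+.4f}")
```

These per-token averages are what I am plotting in the graph.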
Any thoughts on whether there is meaning in this, or whether I am way off in my approach, would be appreciated. It might just be a misunderstanding on my part about what the output values themselves represent.
Hopefully this is enough to give someone an idea of what I am trying to do and how each word could be connected to the positive or negative associations that contributed to the final classification.