AI Sentiment Analysis Using TensorFlow to Automate Social Media Marketing
In light of the recent global pandemic, businesses, especially small ones, are feeling the pinch. To survive during trying times, small businesses such as retail shops need to cut costs, particularly headcount. Failure to do so can lead to revenue losses and even bankruptcy.
AI sentiment analysis can be used to automate digital marketing campaigns, handle customer inquiries, or automatically route them to the relevant departments. In this article, we will learn how to use AI sentiment analysis to identify negative comments or reviews.
What is AI Sentiment Analysis?
Sentiment analysis is an area of natural language processing (NLP) that involves building systems capable of automatically detecting a person's attitude towards a particular piece of content.
The way the person expresses themselves is categorized into emotions related to that content. Sentiment analysis is therefore also known as opinion mining or emotion AI, because it captures the subjective nature of the expressions that describe a person's sentiment.
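To make the idea concrete, here is a toy illustration of what such a system does: it maps free text to an attitude label. The word list below is made up for illustration only; the classifier we build in this article learns this mapping from data instead.
# Toy illustration only: a hand-written word list standing in for a learned model.
def toy_sentiment(text):
    negative_words = {"terrible", "awful", "never", "bad", "worst"}
    words = set(text.lower().split())
    return "Negative" if words & negative_words else "Positive"

print(toy_sentiment("The staff were friendly and the delivery was fast"))  # Positive
print(toy_sentiment("Terrible service I will never order again"))          # Negative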
Build A Sentiment Classification Model
Let's build a sentiment classification model that classifies the emotion expressed in texts. The texts will be divided into two categories: positive and negative.
The dataset can be found and downloaded from open sources such as Kaggle. Let's use the dataset of tweets about the first Republican Party (GOP) debate of the 2016 United States presidential election.
The first step involves importing the relevant libraries to build the model, such as NumPy, Pandas, Keras and modules from scikit-learn.
import numpy as np  # linear algebra computation
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import re  # regular expressions for text cleaning

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
Next, we use Pandas to read the CSV file that contains the dataset and keep only the columns relevant to our analysis. The following shows what the first five rows of the dataset contain:
data = pd.read_csv('Sentiment.csv')
#keeping only the necessary columns for our analysis
data = data[['text','sentiment']]
data.head()
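Before filtering, a quick optional check of how the labels are distributed (assuming the dataset's standard sentiment column) can be helpful:
# Optional sanity check: number of tweets per sentiment class
print(data['sentiment'].value_counts())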
For this exercise, let's remove the rows labelled "Neutral" so that we are left with only two categories (Positive and Negative).
data = data[data.sentiment != "Neutral"]
data['text'] = data['text'].apply(lambda x: x.lower())
data['text'] = data['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))

print(data[data['sentiment'] == 'Positive'].size)
print(data[data['sentiment'] == 'Negative'].size)
4472
16986
As shown above, the data is imbalanced: the positive class (4472) is much smaller than the negative one (16986). We have also transformed the entire text into lowercase and used a regular expression to remove non-alphanumeric characters. Next, we remove all instances of "rt" (the retweet marker), which does not offer any additional information about the emotions expressed.
# remove the retweet marker "rt"; write back with .at since iterrows yields copies
for idx, row in data.iterrows():
    data.at[idx, 'text'] = row['text'].replace('rt', '')
We can now vectorize the text using the Tokenizer from Keras. We also select the maximum number of features (vocabulary size) and pad all observations (sequences) to the same length.
max_features = 2000
tokenizer = Tokenizer(num_words=max_features, split=' ')
tokenizer.fit_on_texts(data['text'].values)

X = tokenizer.texts_to_sequences(data['text'].values)
X = pad_sequences(X)
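A quick, optional check confirms that every tweet is now an integer sequence of the same length:
# Optional check: X is a 2-D array of shape (number of tweets, padded sequence length)
print(X.shape)
print(X[0])  # the first tweet encoded as a padded sequence of word indices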
The next step involves using the Keras API to describe the architecture of the network. We set the LSTM layer to 196 units, use dropout to regularize the network, and use a dense layer with 2 units as the output of the network: one unit represents positive sentiment and the other negative. We use a softmax activation to produce probabilities for the two classes.
embed_dim = 128
lstm_out = 196
model = Sequential()
model.add(Embedding(max_features, embed_dim, input_length = X.shape[1]))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2,activation='softmax'))
print(model.summary())
We then transform our labels into one-hot encoded variables and split the dataset into training and test sets.
Y = pd.get_dummies(data['sentiment']).values
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)
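A quick look at the resulting shapes confirms the 67/33 split:
# Optional check: sizes of the training and test splits
print(X_train.shape, Y_train.shape)
print(X_test.shape, Y_test.shape)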
Finally, we compile the model and fit it to the dataset. Since we used a softmax activation in the final layer, our objective function is categorical cross-entropy.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(X_train, Y_train, epochs=10, batch_size=32, validation_split=0.2)
After training the model for 10 epochs, we achieve an accuracy of 94.5% on the training set and 83% on the validation set. The large gap between the two suggests that the model suffers from high variance (overfitting). We can plot the accuracies on the training and validation sets to investigate further.
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs,val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.show()
As suspected, we can see from the graph that the validation accuracy stagnates after about 3 epochs while the training accuracy continues to rise. Training could be improved with more data.
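Since we set aside a test set earlier, we can also evaluate the trained model on it and use it to score a new comment. This is a minimal sketch: the sample text is made up, the exact scores will vary from run to run, and the class order follows pd.get_dummies, which sorts the labels alphabetically (Negative first, then Positive).
# Evaluate the model on the held-out test set
loss, acc = model.evaluate(X_test, Y_test, batch_size=32)
print("Test accuracy:", acc)

# Score a new comment using the same tokenizer and padding length as in training
new_text = ["the service at this shop was terrible and slow"]
seq = pad_sequences(tokenizer.texts_to_sequences(new_text), maxlen=X.shape[1])
probs = model.predict(seq)[0]
# pd.get_dummies orders columns alphabetically: index 0 = Negative, index 1 = Positive
print("Negative" if probs[0] > probs[1] else "Positive", probs)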
Summary
Small businesses need to prepare for the worst-case scenario. The ones that can weather the storm are those that can maintain profitability or break even when global customer demand is low.
Cutting headcount costs by automating your revenue-driving strategies can be a feasible path to business survival.
For those who are interested in learning more about sentiment analysis, feel free to visit our website at growthbotics.