
Title: Abusive Language Detection with Graph Convolutional Networks

Authors: Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova

The paper deals with the problem of abuse detection on social media websites. Previous research in this area captured only shallow properties of online communities by modelling follower-following relationships. This approach has not yielded effective solutions, as such relationships do not capture the linguistic behaviour of the authors, and community membership alone does not convey whether an author is abusive.

The authors discuss homophily, the phenomenon whereby people with a similar mindset cluster together. Homophily can be modelled by identifying such clusters in the social graph. This, combined with an available dataset of 16,907 tweets labelled as racist, sexist, or clean, makes the problem amenable to a machine learning approach.

Their model builds on an earlier paper on author profiling. In that paper, the authors represented the community as a homogeneous graph in which nodes denote authors and edges denote follower-following relationships. That model learnt author embeddings with node2vec, which embeds each node based on its position and its neighbourhood in the graph.
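For concreteness, here is a minimal sketch of how such author embeddings could be learnt with the node2vec package; the toy graph and the hyperparameter values are illustrative assumptions, not the settings used in the profiling paper.

```python
# Minimal sketch: learning author embeddings with node2vec on a
# follower-following graph. The graph and hyperparameters below are
# illustrative assumptions, not the paper's exact settings.
import networkx as nx
from node2vec import Node2Vec

# Toy homogeneous graph: nodes are authors, edges are follower relations.
graph = nx.Graph()
graph.add_edges_from([("author_a", "author_b"),
                      ("author_b", "author_c"),
                      ("author_a", "author_c"),
                      ("author_d", "author_e")])

# Biased random walks over the graph, then skip-gram training on the walks.
node2vec = Node2Vec(graph, dimensions=64, walk_length=30, num_walks=200)
model = node2vec.fit(window=10, min_count=1)

# Each author now has a dense embedding reflecting its graph neighbourhood.
embedding_a = model.wv["author_a"]
```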

In this paper, the authors take this idea a step further by applying a Graph Convolutional Network (GCN) to a heterogeneous graph. The main difference from the previous approach is precisely this heterogeneity: the graph contains two kinds of nodes, authors and their tweets. This helps capture both the structural and the linguistic aspects of the tweets an author sends out. Each GCN layer computes its output by multiplying the normalised adjacency matrix by the output of the previous layer and that layer's weight matrix, i.e. H^(l+1) = σ(Â H^(l) W^(l)).
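A minimal numpy sketch of one such layer is shown below, assuming the standard Kipf-and-Welling propagation rule with a symmetrically normalised adjacency matrix; the graph, shapes, and initialisation are stand-in assumptions.

```python
# Minimal sketch of a single GCN layer: H_next = ReLU(A_hat @ H @ W),
# where A_hat is the normalised adjacency matrix with self-loops.
# Shapes and initialisation here are illustrative assumptions.
import numpy as np

def normalise_adjacency(adj):
    """Symmetrically normalise adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    adj_self = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(adj_self.sum(axis=1))
    return adj_self * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def gcn_layer(adj_norm, features, weights):
    """One graph convolution: mix each node with its neighbours, then project."""
    return np.maximum(0.0, adj_norm @ features @ weights)  # ReLU activation

# Toy heterogeneous graph with 4 nodes (e.g. 2 authors and 2 tweets).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]], dtype=float)
features = np.random.randn(4, 8)   # one feature vector per node
weights = np.random.randn(8, 16)   # layer weight matrix

hidden = gcn_layer(normalise_adjacency(adj), features, weights)
print(hidden.shape)  # (4, 16)
```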

The authors experimented with five different classification methods to label the tweets. Each method is run 10 times with random initialisations. In each run, 90% of the data is used for training and the remaining 10% for testing. For the methods involving a GCN, part of the test data is set aside as validation data to avoid overfitting. For every run they report precision, recall, and F1 scores for the racism and sexism classes as well as overall, and they found that Logistic Regression combined with the GCN outperforms the other models.
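The evaluation loop they describe could look roughly like the sketch below, using scikit-learn utilities; the features and classifier here are placeholder assumptions, not the paper's actual models.

```python
# Minimal sketch of the evaluation protocol: 10 runs with random splits,
# 90% train / 10% test, per-class precision, recall and F1. The features
# and classifier are placeholder assumptions, not the paper's models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support

X = np.random.randn(200, 32)                    # stand-in tweet embeddings
y = np.random.choice(["racism", "sexism", "none"], size=200)

for run in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.1, random_state=run)  # fresh split each run
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    p, r, f1, _ = precision_recall_fscore_support(
        y_te, clf.predict(X_te), labels=["racism", "sexism"], zero_division=0)
    print(f"run {run}: racism F1={f1[0]:.2f}, sexism F1={f1[1]:.2f}")
```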

After testing all five models, they conclude that the GCN on its own is more accurate than the previously suggested method. However, its performance takes a hit when classifying racist tweets, where it shows high recall but low precision. The final comparison between the previous model and the newer one is made with t-SNE plots. The t-SNE plot for node2vec shows that it is unable to gain anything from the extended graph, whereas with LR+GCN a clear cluster of authors with abusive behaviour emerges. Despite these improvements, however, the model is unable to identify tweets that link to abusive content, or tweets in which the authors have used symbols instead of writing the full word.
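A projection of that kind can be produced as in the sketch below; the embeddings and abuse labels here are random placeholders, since the paper's learnt embeddings are not reproduced.

```python
# Minimal sketch of projecting author embeddings to 2D with t-SNE and
# colouring by abusiveness, in the spirit of the paper's comparison plots.
# The embeddings and labels below are random placeholders.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

embeddings = np.random.randn(300, 64)          # stand-in author embeddings
is_abusive = np.random.rand(300) < 0.2         # stand-in abuse labels

points = TSNE(n_components=2, perplexity=30,
              random_state=0).fit_transform(embeddings)
plt.scatter(points[:, 0], points[:, 1], c=is_abusive, cmap="coolwarm", s=10)
plt.title("t-SNE of author embeddings (abusive authors highlighted)")
plt.show()
```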
