Identifying social actors on Twitter through profile augmentation and machine learning classification
Date Issued
June 2021
Author(s)
Advisor
Abstract
The Web and Online Social Network platforms (OSNs) have brought about a new era in politics.
Since Barack Obama’s 2008 election campaign introduced the practice, a rapidly
increasing number of election campaigns have taken place on OSNs. Citizens, academics,
journalists, influencers, and many other actors, in turn, discuss politics in the online
media public sphere. This interaction generates an enormous amount of extremely valuable
data that can be collected and analyzed. Recent advances in Natural Language Processing have
enabled a deep contextual understanding of these data, shedding light on the question of
“what” is discussed on OSNs. However, there is a huge gap in the literature, and consequently
in the available techniques, regarding approaches that would allow someone to answer the
question of “who”, i.e., which groups of users, is talking on OSNs.
The aim of this research is to break new ground in public opinion interpretation using Big
Data, by answering the question of “who” is taking part in discussions on OSNs, and on
Twitter in particular. This will be achieved by proposing and developing an automated process
that classifies Twitter users into different social actors. The methodology adopted draws on the
fields of Machine Learning and Natural Language Processing (for tweet classification),
combined with political communication theories (for identifying categories of social
actors). Through the combination of these disciplines, this study proposes a theoretically
sound approach, unlike most existing related work in the literature, where classification
categories were defined arbitrarily. Moreover, the proposed methodology, in
contrast to other approaches in the literature, does not rely on the tweet’s content for the
classification process; relying on content can introduce bias in cases where the tweets will
undergo other types of analysis, e.g., sentiment analysis. Given the above, this work proposes
a novel and sound approach for classifying tweets into different social actor classes.
File(s)
Name
Abstract ChristosChristodoulou_thesis_6663.pdf
Size
207.16 KB
Format
Adobe PDF
Checksum (MD5)
16b9072b76d0ffafc7828c2a68bc3911

