A Transformer-based Infrastructure for YouTube Misinformation Detection
Date Issued
2023
Author(s)
Advisor
Abstract
This thesis addresses the growing concern over misinformation on social media platforms,
particularly regarding COVID-19 vaccines. Focusing on YouTube, the study proposes
a framework for identifying and filtering COVID-19 misinformation on the platform by
collecting data and modeling misinformation detection. The methodology includes creating
four YouTube accounts from which data was collected, building a custom Chrome extension
for data collection, and applying supervised learning techniques to label and classify
video-related data. Incorporating Large Language Models (LLMs) and transformers
enhances the accuracy of the misinformation detector. The study investigates the effectiveness
of YouTube’s tools for identifying and filtering COVID-19 misinformation. The
findings can facilitate the development of more effective strategies for promoting public
health and safety. The approach also draws on word embeddings, transformers,
natural language processing, and machine learning. Combining these techniques
provides a novel and more accurate approach to detecting and filtering COVID-19
misinformation on YouTube, helping to limit its spread.

