Tri-FND: Multimodal Fake News Detection Using Triplet Transformer Models

Engy Ehab; Nahla Belal; Yasser Omar

doi:10.37934/araset.63.1.255270

Authors

Engy Ehab, Department of Computer Science, College of Computing and Information Technology, Arab Academy for Science, Technology and Maritime Transport, Smart Village, Cairo, Egypt
Nahla Belal, Department of Computer Science, College of Computing and Information Technology, Arab Academy for Science, Technology and Maritime Transport, Smart Village, Cairo, Egypt
Yasser Omar, School of Library and Information Studies, University of Oklahoma

DOI:

https://doi.org/10.37934/araset.63.1.255270

Keywords:

Fake News, Multi-Model Learning, Transformer Models, Image-Text Matching

Abstract

The prevalence of fake news accompanied by multimedia content on the internet presents a significant challenge for users attempting to discern its authenticity. Automatically identifying and classifying fake news is crucial for combating misinformation and maintaining the integrity of information dissemination. This paper proposes a fake news detection approach that integrates textual and visual data to improve fake news classification. The approach, termed Tri-FND, is a novel multimodal learning method that uses triplet transformers, combining state-of-the-art language and vision transformers with Contrastive Language-Image Pretraining (CLIP) to improve feature representation and the semantic alignment between text and images. Analyzing both modalities enhances the ability to identify fake news. Experiments were conducted on two datasets in different languages: an English dataset sourced from Twitter and a Chinese dataset sourced from Weibo. The proposed approach achieves an overall accuracy of 0.90 on the Twitter dataset and 0.93 on the Weibo dataset.
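
The abstract does not state how the outputs of the three transformers are combined, so the following is only an illustrative sketch, assuming a simple late-fusion design: text and image feature vectors are concatenated with a text-image alignment score (cosine similarity of CLIP-style embeddings), and a logistic layer turns the result into a fake-news probability. The function names, vector sizes, weights, and the fusion scheme itself are hypothetical and are not taken from the paper; in practice the feature vectors would come from pretrained encoders rather than hand-written lists.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse_features(
    text_feat: Sequence[float],
    image_feat: Sequence[float],
    clip_text_emb: Sequence[float],
    clip_image_emb: Sequence[float],
) -> List[float]:
    """Hypothetical late fusion: [text features | image features | alignment score]."""
    alignment = cosine_similarity(clip_text_emb, clip_image_emb)
    return list(text_feat) + list(image_feat) + [alignment]


def fake_probability(fused: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic classifier over the fused vector; returns an estimate of P(fake)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Toy inputs standing in for encoder outputs (assumed, not real model features).
    text_feat = [0.2, -0.1]
    image_feat = [0.4, 0.3]
    clip_text_emb = [1.0, 0.0]
    clip_image_emb = [0.0, 1.0]  # orthogonal: text and image do not match

    fused = fuse_features(text_feat, image_feat, clip_text_emb, clip_image_emb)
    # A negative weight on the alignment score means poor text-image agreement
    # pushes the prediction toward "fake" (an assumed, untrained choice).
    weights = [0.5, 0.5, 0.5, 0.5, -2.0]
    print(fused, fake_probability(fused, weights, bias=0.0))
```

The explicit alignment feature is one way to expose a text-image mismatch to the classifier, which is the kind of cross-modal signal the abstract attributes to CLIP; a trained model would learn the weights from labeled posts rather than fix them by hand.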