A Perfect Synthesis of Multimodal Techniques for Sarcasm Detection
Ansh Mishra
Abstract
Sarcasm is often signaled by a combination of verbal and non-verbal cues, such as a change of tone, overemphasis on a word, a drawn-out syllable, or a deadpan facial expression. While most prior research on sarcasm detection has focused on textual data, this work argues that incorporating multimodal cues can substantially improve sarcasm classification accuracy. To aid the development of such systems, we present the Multimodal Sarcasm Detection Dataset (MUStARD), compiled from well-known TV series. The dataset consists of audiovisual utterances annotated with sarcasm labels, each paired with contextual dialogue that provides additional information about the situation in which the sarcasm occurs. Our initial findings show that multimodal information can reduce the relative error rate of sarcasm detection by up to 12.9% in F-score compared to models that rely on individual modalities. The dataset is openly accessible to the public.
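As a rough illustration of how the individual modalities described above can be combined, the sketch below shows a simple early-fusion step: per-modality feature vectors (text, audio, video) are concatenated into a single representation that a downstream classifier can consume. The feature dimensionalities here are hypothetical placeholders, not the ones used in the paper.

```python
import numpy as np

def fuse_modalities(text_feat, audio_feat, video_feat):
    """Early fusion: concatenate per-modality feature vectors
    into one joint representation for a downstream classifier."""
    return np.concatenate([text_feat, audio_feat, video_feat])

# Hypothetical dimensionalities, for illustration only.
text_feat = np.random.rand(300)   # e.g., a sentence embedding
audio_feat = np.random.rand(128)  # e.g., prosodic/acoustic features
video_feat = np.random.rand(512)  # e.g., facial-expression features

fused = fuse_modalities(text_feat, audio_feat, video_feat)
print(fused.shape)  # (940,)
```

Concatenation is only the simplest fusion strategy; the point is that the joint vector exposes cross-modal cues (e.g., a neutral face paired with exaggerated prosody) that no single modality carries on its own.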
Copyright
Copyright © 2025 Ansh Mishra. This is an open access article distributed under the Creative Commons Attribution License.