- data augmentation
- generative pre-trained transformer
GAP
However, existing methods fail to learn the deep semantic concepts of rumor texts needed for detection. In addition, imbalanced datasets in the rumor domain reduce the effectiveness of these algorithms.
Idea
Leveraging the Generative Pre-trained Transformer 2 (GPT-2) model to generate rumor-like texts, thus creating a balanced dataset (use GPT-2 for data augmentation; see the sketch below).
- GPT-2 captures rich semantic information and can produce diverse, high-quality synthetic text samples.
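A minimal sketch of the augmentation step, assuming GPT-2 has already been fine-tuned on the minority (rumor) class; the model checkpoint, prompt, and decoding parameters here are illustrative assumptions, not the paper's settings:

```python
# Hypothetical sketch: sample rumor-like texts from GPT-2 for data augmentation.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # swap in a rumor-fine-tuned checkpoint
model.eval()

def generate_rumor_like(prompt: str, n_samples: int = 3, max_new_tokens: int = 40):
    """Sample diverse continuations of a seed rumor text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,          # stochastic decoding for diversity
        top_p=0.92,              # nucleus sampling
        temperature=0.9,
        max_new_tokens=max_new_tokens,
        num_return_sequences=n_samples,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Each generated text would be labeled as a rumor and added to the training
# set until the rumor / non-rumor classes are balanced.
for text in generate_rumor_like("BREAKING: reports claim that"):
    print(text)
```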
Datasets
PHEME, Twitter15, and Twitter16 datasets.