Library and Digital Knowledge Center

Propaganda Identification Using Topic Modelling

Document type: Digital document - article

Statement of responsibility: Kirill, Yakunin; Mihail, MihailIonescu George; Sanzhar, Murzakhmetov; Rustam, Mussabayev; Olga, Filatova; Ravil, Mukhamediev

Publisher: Elsevier B.V.

Publication year: 2020


Abstract

This paper presents a method based on topic modelling for identifying texts with propagandistic content. The method attempts to incorporate the transfer-learning idea of obtaining an effective vector representation from a large unlabelled or (semi-)automatically labelled dataset, while also minimizing the amount of manual expert labelling required by introducing high-level labelling (either manual or automatic) on some explicit document property. The proposed method includes four key stages: forming a partitioning of the corpus; computing a topic model of the united corpus; calculating corpora imbalance estimates for each topic; and extrapolating the results of the imbalance estimation to all documents. The method was cross-validated on a labelled subsample of 1000 news items and achieves high predictive power (ROC AUC 0.73).
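The last two stages named in the abstract (per-topic imbalance estimation and extrapolation to documents) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption for illustration: the document-topic weights are taken as already computed by some topic model, the imbalance of a topic is assumed to be the log-ratio of its mean weight in the two partitions, and a document score is assumed to be the topic-weighted average of those imbalances. The function names `topic_imbalance`, `document_scores`, and `roc_auc` are hypothetical, and the numbers are toy data, not results from the paper.

```python
import math


def topic_imbalance(doc_topic, partition, eps=1e-9):
    """Per-topic imbalance between two corpus partitions (labels 0 and 1).

    ASSUMPTION: imbalance is the log-ratio of the mean topic weight in
    partition 1 to that in partition 0. Both partitions must be non-empty.
    """
    n_topics = len(doc_topic[0])
    sums = {0: [0.0] * n_topics, 1: [0.0] * n_topics}
    counts = {0: 0, 1: 0}
    for weights, label in zip(doc_topic, partition):
        counts[label] += 1
        for k, w in enumerate(weights):
            sums[label][k] += w
    return [
        math.log((sums[1][k] / counts[1] + eps) / (sums[0][k] / counts[0] + eps))
        for k in range(n_topics)
    ]


def document_scores(doc_topic, imbalance):
    """ASSUMPTION: a document's score is its topic-weighted mean imbalance."""
    return [sum(w * b for w, b in zip(weights, imbalance)) for weights in doc_topic]


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Needs at least one label of each class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy document-topic weights (rows: documents, columns: topics).
doc_topic = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
# High-level partition label per document (e.g. by source); hypothetical.
partition = [1, 1, 0, 0]

imbalance = topic_imbalance(doc_topic, partition)
scores = document_scores(doc_topic, imbalance)
# Evaluate against expert labels; here they coincide with the partition.
auc = roc_auc(scores, [1, 1, 0, 0])
```

In this sketch a positive imbalance marks a topic that is over-represented in partition 1, so documents dominated by such topics receive higher scores. The pairwise AUC is quadratic in the number of documents, which is acceptable for a labelled subsample of about 1000 items.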

Language:	en
Copyright:	© 2020 The Authors. Published by Elsevier B.V.
Physical description:	8 p.