FBERT: A Neural Transformer for Identifying Offensive Content

Sarkar, Diptanu; Zampieri, Marcos; Ranasinghe, Tharindu; Ororbia, Alexander

arXiv:2109.05074 (cs)

[Submitted on 10 Sep 2021]

Abstract: Transformer-based models such as BERT, XLNET, and XLM-R have achieved state-of-the-art performance across various NLP tasks including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available with over 1.4 million offensive instances. We evaluate fBERT's performance on identifying offensive content on multiple English datasets and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.

Comments:	Accepted to EMNLP Findings
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Cite as:	arXiv:2109.05074 [cs.CL]
	(or arXiv:2109.05074v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.05074

Submission history

From: Tharindu Ranasinghe Mr
[v1] Fri, 10 Sep 2021 19:19:26 UTC (167 KB)
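
The abstract mentions testing several thresholds for selecting instances from SOLID. As a rough illustration of what threshold-based selection could look like, here is a minimal sketch. It assumes each instance carries a numeric offensiveness score. The field name `score`, the helper `select_instances`, the toy data, and the threshold values are all hypothetical and are not taken from the paper.

```python
def select_instances(instances, threshold):
    """Keep instances whose (hypothetical) score is at least `threshold`."""
    return [inst for inst in instances if inst["score"] >= threshold]


# Toy stand-in for a scored corpus; texts and scores are made up for illustration.
toy_corpus = [
    {"text": "example a", "score": 0.92},
    {"text": "example b", "score": 0.61},
    {"text": "example c", "score": 0.35},
    {"text": "example d", "score": 0.78},
]

# Compare how many instances survive under a few illustrative thresholds.
for threshold in (0.5, 0.7, 0.9):
    subset = select_instances(toy_corpus, threshold)
    print(f"threshold={threshold}: {len(subset)} instances selected")
```

A higher threshold gives a smaller subset with higher scores, which is the kind of trade-off a threshold sweep would explore before retraining.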
