Data Science Kitchen at GermEval 2021: A Fine Selection of Hand-Picked Features, Delivered Fresh from the Oven

Hildebrandt, Niclas; Boenninghoff, Benedikt; Orth, Dennis; Schymura, Christopher

arXiv:2109.02383 (cs)

[Submitted on 6 Sep 2021 (v1), last revised 18 Aug 2024 (this version, v2)]

Abstract: This paper presents the contribution of the Data Science Kitchen to the GermEval 2021 shared task on the identification of toxic, engaging, and fact-claiming comments. The task extends the identification of offensive language with additional subtasks that identify comments which should be prioritized for fact-checking by moderators and community managers. Our contribution focuses on a feature-engineering approach with a conventional classification backend. We combine semantic and writing style embeddings derived from pre-trained deep neural networks with additional numerical features specifically designed for this task. Classifier ensembles are used to derive predictions for each subtask via a majority voting scheme. Our best submission achieved macro-averaged F1-scores of 66.8%, 69.9%, and 72.5% for the identification of toxic, engaging, and fact-claiming comments, respectively.
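
The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the binary label encoding (1 = positive class, 0 = negative class), and the rule that ties go to the smallest label are all assumptions made for illustration. The abstract does not specify the ensemble size or how ties are resolved.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier labels into one label per comment.

    predictions[i][j] is the label that classifier i assigns to comment j.
    Ties are broken toward the smallest label (an assumption, not taken
    from the paper).
    """
    if not predictions:
        raise ValueError("need at least one classifier")
    n_comments = len(predictions[0])
    if any(len(p) != n_comments for p in predictions):
        raise ValueError("all classifiers must label the same comments")

    voted = []
    # zip(*predictions) yields, for each comment, the labels from all classifiers.
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Among the labels with the highest vote count, pick the smallest.
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Hypothetical outputs of three classifiers for four comments (toxic subtask).
ensemble_outputs = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
print(majority_vote(ensemble_outputs))  # [1, 0, 1, 0]
```

With an odd number of binary classifiers no ties can occur, so the tie-breaking rule only matters for even-sized ensembles. The same function would be applied separately to each of the three subtasks.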

Comments:	Accepted at 17th Conference on Natural Language Processing (KONVENS 2021)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2109.02383 [cs.CL]
	(or arXiv:2109.02383v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.02383

Submission history

From: Christopher Schymura
[v1] Mon, 6 Sep 2021 12:00:29 UTC (444 KB)
[v2] Sun, 18 Aug 2024 20:32:42 UTC (346 KB)
