SupCL-Seq: Supervised Contrastive Learning for Downstream Optimized Sequence Representations

Sedghamiz, Hooman; Raval, Shivam; Santus, Enrico; Alhanai, Tuka; Ghassemi, Mohammad

arXiv:2109.07424 (cs)

[Submitted on 15 Sep 2021]

Abstract: While contrastive learning has proven to be an effective training strategy in computer vision, Natural Language Processing (NLP) has only recently adopted it as a self-supervised alternative to Masked Language Modeling (MLM) for improving sequence representations. This paper introduces SupCL-Seq, which extends supervised contrastive learning from computer vision to the optimization of sequence representations in NLP. By altering the dropout mask probability in standard Transformer architectures, we generate augmented, altered views for every representation (anchor). A supervised contrastive loss is then used to maximize the system's capability of pulling together similar samples (e.g., anchors and their altered views) and pushing apart samples belonging to other classes. Despite its simplicity, SupCL-Seq leads to large gains in many sequence classification tasks on the GLUE benchmark compared to a standard BERT-base, including a 6% absolute improvement on CoLA, 5.4% on MRPC, 4.7% on RTE and 2.6% on STSB. We also show consistent gains over self-supervised contrastively learned representations, especially in non-semantic tasks. Finally, we show that these gains are not solely due to augmentation, but rather to a downstream optimized sequence representation. Code: this https URL
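
The abstract describes the loss only in words, so the following is a minimal sketch of a supervised contrastive loss in the form commonly used in the computer-vision literature, not the authors' implementation. The function names (`l2_normalize`, `supervised_contrastive_loss`), the temperature of 0.1, and the toy 2-D vectors are all assumptions made for illustration. Generating views by changing the dropout probability of a Transformer is not reproduced here; the hand-written vectors simply stand in for an anchor and its altered view.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average supervised contrastive loss over anchors that have a positive.

    For each anchor i, positives are all other samples with the same label,
    and the softmax denominator runs over all samples except i itself.
    The temperature default is an assumed value, not taken from the paper.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarity of the anchor to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy stand-ins: rows 0/1 are an anchor and its view, rows 2/3 another pair.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
print(aligned < mixed)
```

When the labels agree with the geometric clusters, the loss is close to zero. When the labels cut across the clusters, the loss is much larger, which is the signal that pulls same-class samples together and pushes other classes apart.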

Comments:	short paper, EMNLP 2021, Findings
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2109.07424 [cs.CL]
	(or arXiv:2109.07424v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.07424

Submission history

From: Enrico Santus
[v1] Wed, 15 Sep 2021 16:51:18 UTC (210 KB)
