MURAL: Multimodal, Multitask Retrieval Across Languages

Jain, Aashi; Guo, Mandy; Srinivasan, Krishna; Chen, Ting; Kudugunta, Sneha; Jia, Chao; Yang, Yinfei; Baldridge, Jason

arXiv:2109.05125 (cs)

[Submitted on 10 Sep 2021]

Abstract: Both image-caption pairs and translation pairs provide the means to learn deep representations of and connections between languages. We use both types of pairs in MURAL (MUltimodal, MUltitask Representations Across Languages), a dual encoder that solves two tasks: 1) image-text matching and 2) translation pair matching. By incorporating billions of translation pairs, MURAL extends ALIGN (Jia et al. PMLR'21)--a state-of-the-art dual encoder learned from 1.8 billion noisy image-text pairs. When using the same encoders, MURAL's performance matches or exceeds ALIGN's cross-modal retrieval performance on well-resourced languages across several datasets. More importantly, it considerably improves performance on under-resourced languages, showing that text-text learning can overcome a paucity of image-caption examples for these languages. On the Wikipedia Image-Text dataset, for example, MURAL-base improves zero-shot mean recall by 8.1% on average for eight under-resourced languages and by 6.8% on average when fine-tuning. We additionally show that MURAL's text representations cluster not only with respect to genealogical connections but also based on areal linguistics, such as the Balkan Sprachbund.
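The two-task dual-encoder setup described in the abstract can be illustrated with a minimal sketch. It assumes an in-batch softmax contrastive loss applied in both directions for each task and a weighted sum of the two task losses; the abstract does not specify the loss, so this form, the `temperature` value, and the weights `w_i2t` and `w_t2t` are assumptions, and all function names are hypothetical. Embeddings are plain Python lists standing in for encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def in_batch_nll(sim):
    """Mean negative log-probability of the matching (diagonal) entry
    under a row-wise softmax over the similarity matrix."""
    total = 0.0
    for i, row in enumerate(sim):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_sum_exp - row[i]
    return total / len(sim)


def contrastive_loss(left, right, temperature=0.1):
    """Bidirectional in-batch contrastive loss between two lists of embeddings.
    Row i of `left` is assumed to pair with row i of `right`; every other
    row in the batch acts as a negative. `temperature` is an assumed value."""
    left = [l2_normalize(v) for v in left]
    right = [l2_normalize(v) for v in right]
    sim = [[dot(a, b) / temperature for b in right] for a in left]
    sim_t = [list(col) for col in zip(*sim)]  # right-to-left direction
    return in_batch_nll(sim) + in_batch_nll(sim_t)


def multitask_loss(image_emb, caption_emb, src_emb, tgt_emb,
                   w_i2t=1.0, w_t2t=1.0):
    """Weighted sum of the image-text and text-text (translation) losses.
    The weights are hypothetical knobs, not values from the paper."""
    i2t = contrastive_loss(image_emb, caption_emb)
    t2t = contrastive_loss(src_emb, tgt_emb)
    return w_i2t * i2t + w_t2t * t2t


# Toy batch: two image/caption pairs and two translation pairs.
images = [[1.0, 0.0], [0.0, 1.0]]
captions = [[1.0, 0.0], [0.0, 1.0]]
source_sentences = [[2.0, 0.0], [0.0, 3.0]]
target_sentences = [[1.0, 0.0], [0.0, 1.0]]

aligned = multitask_loss(images, captions, source_sentences, target_sentences)
swapped = multitask_loss(images, captions[::-1], source_sentences, target_sentences)
```

In this toy batch, `aligned` comes out much smaller than `swapped`, because reversing the captions pairs each image with the wrong text. In a real system the text encoder would be shared between the two tasks, which is how translation pairs could help languages that have few image-caption examples.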

Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2109.05125 [cs.IR]
	(or arXiv:2109.05125v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2109.05125

Submission history

From: Aashi Jain
[v1] Fri, 10 Sep 2021 22:26:05 UTC (15,290 KB)
