Blog
A retrospective of multilingual sentence encoders
This is a post with notes on the paper Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond by Mikel Artetxe and Holger Schwenk.
March 2021
- What is the problem? Multilingual sentence encoders work well only on the languages and tasks they were trained on.
- Proposed solution A single sentence encoder trained on 93 languages from 30 language families; it works well across languages, scripts, and NLP tasks.
- What's next? We now have an encoder that works well both on languages it saw during training and on ones it did not.
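The mechanism behind this kind of zero-shot transfer is a shared embedding space: sentences with the same meaning land close together regardless of language, so similarity can be computed directly between languages. A minimal sketch with hand-picked toy vectors standing in for real encoder outputs (a real system would obtain them from the LASER encoder):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: in practice these would come from the
# multilingual encoder; here they are made-up 3-d toy vectors.
en = {"the cat sleeps": np.array([0.9, 0.1, 0.0]),
      "stocks fell today": np.array([0.1, 0.9, 0.2])}
fr = {"le chat dort": np.array([0.88, 0.12, 0.05]),
      "les actions ont chuté": np.array([0.15, 0.85, 0.25])}

# Align each English sentence with its nearest French neighbour.
for src, vec in en.items():
    best = max(fr, key=lambda s: cosine_sim(vec, fr[s]))
    print(src, "->", best)
```

This nearest-neighbour matching is essentially how such encoders are used for bitext mining: no labels in the target language are needed, only the shared space.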
A retrospective of multilingual sentence encoders
This is a post with notes on the paper Multilingual Universal Sentence Encoder for Semantic Retrieval by Yinfei Yang, Daniel Cer et al.
March 2021
- What is the problem? Existing models (e.g., BERT) did not produce good cross-lingual sentence representations that could be used across tasks and domains.
- Proposed solution A multilingual sentence encoder trained on large corpora in 16 languages, covering multiple tasks such as Question Answering and Natural Language Inference.
- What's next? It is tested on transfer to unseen sentence retrieval tasks (with success!), but not on unseen languages.
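Semantic retrieval with such an encoder reduces to ranking candidate sentences by similarity to the query embedding. A sketch under the same assumption of toy vectors standing in for real encoder outputs:

```python
import numpy as np

def top_k(query, corpus, k=2):
    """Return indices of the k corpus rows closest to the query (cosine)."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity per corpus row
    return np.argsort(-scores)[:k]      # indices, best first

# Toy stand-ins for encoder outputs: one query, four candidate answers.
query = np.array([1.0, 0.0, 0.5])
corpus = np.array([[0.9, 0.1, 0.6],    # semantically close to the query
                   [0.0, 1.0, 0.0],
                   [0.2, 0.8, 0.1],
                   [1.0, 0.1, 0.4]])   # also close

print(top_k(query, corpus))
```

In a deployed system the corpus embeddings would be precomputed once and the ranking done with an approximate nearest-neighbour index rather than a full matrix product.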