Blog
A retrospective of multilingual sentence encoders
This is a post with notes on the paper Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond by Mikel Artetxe and Holger Schwenk.
March 2021
- What is the problem? Multilingual sentence encoders work well only on the languages and tasks they were trained on.
- Proposed solution A single sentence encoder trained on 93 languages from 30 language families; it works well across languages, scripts, and NLP tasks.
- What's next? We now have an encoder that works well both on languages it saw during training and on ones it did not.
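The mechanism behind this kind of zero-shot transfer is a shared embedding space: sentences with the same meaning land close together regardless of language, so similarity can be computed directly between languages. A minimal sketch with hand-picked toy vectors standing in for real encoder outputs (a real system would obtain them from the LASER encoder):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: in practice these would come from the
# multilingual encoder; here they are made-up 3-d toy vectors.
en = {"the cat sleeps": np.array([0.9, 0.1, 0.0]),
      "stocks fell today": np.array([0.1, 0.9, 0.2])}
fr = {"le chat dort": np.array([0.88, 0.12, 0.05]),
      "les actions ont chuté": np.array([0.15, 0.85, 0.25])}

# Align each English sentence with its nearest French neighbour.
for src, vec in en.items():
    best = max(fr, key=lambda s: cosine_sim(vec, fr[s]))
    print(src, "->", best)
```

This nearest-neighbour matching is essentially how such encoders are used for bitext mining: no labels in the target language are needed, only the shared space.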
A retrospective of multilingual sentence encoders
This is a post with notes on the paper Multilingual Universal Sentence Encoder for Semantic Retrieval by Yinfei Yang, Daniel Cer et al.
March 2021
- What is the problem? Existing models (e.g., BERT) did not produce good cross-lingual sentence representations that could be used across tasks and domains.
- Proposed solution A multilingual sentence encoder trained on large corpora in 16 languages, covering multiple tasks such as Question Answering and Natural Language Inference.
- What's next? It is tested on transfer to unseen sentence retrieval tasks (with success!), but not on unseen languages.
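Semantic retrieval with such an encoder reduces to ranking candidate sentences by similarity to the query embedding. A sketch under the same assumption of toy vectors standing in for real encoder outputs:

```python
import numpy as np

def top_k(query, corpus, k=2):
    """Return indices of the k corpus rows closest to the query (cosine)."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity per corpus row
    return np.argsort(-scores)[:k]      # indices, best first

# Toy stand-ins for encoder outputs: one query, four candidate answers.
query = np.array([1.0, 0.0, 0.5])
corpus = np.array([[0.9, 0.1, 0.6],    # semantically close to the query
                   [0.0, 1.0, 0.0],
                   [0.2, 0.8, 0.1],
                   [1.0, 0.1, 0.4]])   # also close

print(top_k(query, corpus))
```

In a deployed system the corpus embeddings would be precomputed once and the ranking done with an approximate nearest-neighbour index rather than a full matrix product.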