Language Models are Few-Shot Learners
Resource history | v1 (current) | created by janarez
Details
- Title
- Language Models are Few-Shot Learners
- Type
- Paper
- Created
- 2020-01-01
- Description
- Humans generally perform a new language task from only a few examples - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
- Link
- http://arxiv.org/abs/2005.14165
- Identifier
- https://github.com/openai/gpt-3
topics
official for GPT-3