Chapter 16 sections from Deep Learning with PyTorch.
8 items
Natural language models cannot operate directly on words as strings.
A language model cannot process raw text directly. Text must first be converted into a sequence of token IDs. The procedure that performs this conversion is called tokenization.
Text classification assigns one or more labels to a piece of text.
Named entity recognition, usually abbreviated NER, identifies spans of text that refer to named or typed entities.
Machine translation converts text from one language into another. Given a source sentence in one language, the model generates a semantically equivalent sentence in a target language.
Question answering, often abbreviated QA, is the task of producing an answer to a question.
A conversational system processes dialogue between users and machines.
Language modeling is the task of predicting text sequences. A language model assigns probabilities to sequences of tokens and learns the statistical structure of language.