Caltech

Mathematics & Machine Learning Seminar

Tuesday, November 14, 2023
2:00pm to 3:00pm
East Bridge 114
Introduction to Transformer Models
Aike Liu, Physics Department, Caltech

As the dominant machine learning architecture today, Transformer models have driven remarkable advances and continue to reshape natural language processing. This talk is a brief introduction to the fundamentals of Transformers. We will begin with the basics of sequence-to-sequence (Seq2Seq) models and then turn to the attention mechanism, on which Transformers rely heavily. Finally, we will review one of the most impactful papers in the field, "Language Models are Few-Shot Learners" by Tom B. Brown et al., with emphasis on the design and performance of GPT-3.

For more information, please contact the Math Department by phone at 626-395-4335 or by email at [email protected].