AI/ML STIG Lecture 9 Feb 2026 — Building a GPT-Style Transformer

Source: NASA Science

Helen Qu (Flatiron Institute) will build a decoder-only transformer (GPT-style) from scratch in PyTorch, train it on Tiny Shakespeare for character-level language modeling, and generate text.

Topics covered include self-attention as a learned, data-dependent mixing operator; causal (masked) self-attention for autoregressive modeling; building a GPT-style Transformer block from scratch; token and positional embeddings; training a small autoregressive language model; and text generation with temperature and top-k sampling.
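Two of the topics above, causal (masked) self-attention and temperature/top-k sampling, can be illustrated in a few lines. The sketch below uses NumPy rather than PyTorch (the lecture's framework) so it is self-contained; the function names, shapes, and weight matrices are illustrative assumptions, not the lecture's actual code.

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention: each position attends only to
    itself and earlier positions, which is what makes the model
    autoregressive. x has shape (T, d)."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)           # data-dependent mixing weights
    # Causal mask: forbid attention to future positions (strict upper triangle).
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf
    # Softmax over the allowed (past) positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def sample_top_k(logits, k=3, temperature=1.0, rng=None):
    """Sample a token id: divide logits by temperature, keep only the k
    largest, renormalize, and draw from the resulting distribution."""
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(logits, dtype=float) / temperature
    top = np.argsort(logits)[-k:]           # indices of the k largest logits
    probs = np.zeros_like(logits)
    p = np.exp(logits[top] - logits[top].max())
    probs[top] = p / p.sum()
    return rng.choice(len(logits), p=probs)
```

Note how the causal mask makes the attention-weight matrix lower-triangular, so token t is predicted only from tokens 1..t; lowering the temperature sharpens the sampling distribution toward the most likely next character.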

Lecture tutorial materials and Jupyter notebooks can be found here: https://tingyuansen.github.io/NASA_AI_ML_STIG/#schedule. We'll be using this Slido link for Q&A during the talk: https://app.sli.do/event/kx2WYpZtkHYtbWMntJoFFj. The link to join the meeting is here: https://science.nasa.gov/astrophysics/programs/cosmic-origins/community/ai-ml-stig-lecture-series-9-feb-2026/
