In Part 1 of this series, I built a simple neural network for classification to get comfortable with the basics of deep learning. In Part 2, I created a MiniTokenizer to understand how raw text is transformed into tokens. Now, in Part 3, I am moving one step closer to building a GPT-style model by…
In Part 1 of this series, I built a simple neural network for binary and multiclass classification to get comfortable with the fundamentals of deep learning. For Part 2, I shifted focus to something equally important in the world of transformers: tokenization. Transformers do not work directly with raw text. They need text to be…
Building My Own Transformer Model

I’ve decided to take on a challenge that’s equal parts exciting and intimidating: building my own open-source transformer model from scratch, something in the spirit of GPT-OSS. Right now, I have basic machine learning skills and a working knowledge of Python. Over the coming weeks and months, I’ll be diving…