AI & Cloud by Syd

  • 𓁟 Thoth: A Private, Local‑First Knowledge Agent for the AI‑Native Era

    February 13, 2026
    Artificial Intelligence

    Every once in a while, a project comes along that feels like it should already exist, something obvious in hindsight, yet strangely absent from the landscape. Thoth is one of those projects. In a world where AI tools increasingly rely on cloud‑hosted models, opaque data pipelines, and centralized storage, Thoth takes a very different stance:…

  • Building a Privacy-First Local AI Assistant – Part 2: Local RAG Techniques

    February 4, 2026
    Artificial Intelligence

    Building a Local, Privacy‑First RAG Pipeline with LangChain: From Embeddings to Hybrid Retrieval. As part of my broader project to build a completely local, privacy‑first AI assistant, I’ve been exploring how to design a robust Retrieval‑Augmented Generation (RAG) pipeline using LangChain, LangGraph, and local LLMs. My goal is to create a model‑agnostic system that runs…
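    As a taste of what hybrid retrieval involves, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword (BM25) ranking with a dense-embedding ranking; the document IDs and rankings below are hypothetical, not from the post.

    ```python
    def reciprocal_rank_fusion(rankings, k=60):
        """Merge several ranked lists of doc IDs into one fused ranking.

        Each document scores sum(1 / (k + rank)) over the lists it appears
        in; the constant k dampens the influence of any single list's top
        positions.
        """
        scores = {}
        for ranking in rankings:
            for rank, doc_id in enumerate(ranking, start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    bm25_hits = ["doc3", "doc1", "doc7"]   # hypothetical keyword ranking
    dense_hits = ["doc1", "doc5", "doc3"]  # hypothetical embedding ranking
    fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
    print(fused[0])  # doc1: ranked highly by both retrievers
    ```

    Documents that appear near the top of both lists win, which is exactly the behavior a hybrid retriever wants.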

  • Building a Privacy-First Local AI Assistant – Part 1: Basic LangChain Workflow

    January 22, 2026
    Artificial Intelligence

    I have started building a completely local, privacy‑first AI assistant: a multimodal system that combines retrieval‑augmented generation (RAG) and tool calling powered by local LLMs. I chose a model‑agnostic framework—LangChain—to keep the architecture flexible and to make it easy to swap or compare models. My first step was to learn LangChain and LangGraph deeply so…

  • Part 8: Building a GPT-like Transformer from Scratch – SFT on Alpaca

    December 12, 2025
    Artificial Intelligence

    Supervised Instruction Fine-Tuning on Alpaca, Deployment, and Why 164M Isn’t Enough. In Part 7, I pretrained a 164M parameter GPT-style model (SydsGPTv2) on ~12B tokens using a carefully engineered pipeline on a single NVIDIA 3080 Ti. In this final part of the series, I shift from pure pretraining to instruction fine-tuning (SFT). The goals for…
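    For flavor: Alpaca records follow a simple instruction/input/output schema, and SFT typically flattens each record into one training string. A sketch (the prompt template below paraphrases the public Alpaca format and may differ from the exact one used in the post):

    ```python
    def format_alpaca(example):
        """Flatten one Alpaca-style record into a single SFT training string.

        Records with an empty "input" field omit the Input section, matching
        the two-template convention of the Alpaca dataset.
        """
        prompt = f"### Instruction:\n{example['instruction']}\n\n"
        if example.get("input"):
            prompt += f"### Input:\n{example['input']}\n\n"
        prompt += f"### Response:\n{example['output']}"
        return prompt

    record = {
        "instruction": "Translate the text to French.",
        "input": "Hello",
        "output": "Bonjour",
    }
    text = format_alpaca(record)
    ```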

  • Part 7: Pretraining on 12B Tokens on a Single Consumer GPU: What It Takes

    December 2, 2025
    Artificial Intelligence

    In this part, I scaled a full pretraining pipeline: a ~10B-token corpus, pre-tokenization and chunking for streaming, a Flash Attention replacement inside the GPT blocks, training-loop features (warmup, cosine decay, gradient accumulation), torch.compile for runtime speedups, and GaloreAdamW as the optimizer. I then ran a long single‑GPU pretraining run (~12B tokens over ~11 days on…
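    The warmup-plus-cosine-decay schedule mentioned above can be sketched in a few lines; the step counts and learning rates here are illustrative defaults, not the values from the actual run.

    ```python
    import math

    def lr_at_step(step, max_lr=3e-4, min_lr=3e-5,
                   warmup_steps=1_000, total_steps=100_000):
        """Linear warmup to max_lr, then cosine decay down to min_lr."""
        if step < warmup_steps:
            # Warmup: ramp linearly from ~0 up to max_lr.
            return max_lr * (step + 1) / warmup_steps
        # Cosine decay: progress runs from 0 to 1 after warmup ends.
        progress = (step - warmup_steps) / (total_steps - warmup_steps)
        return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
    ```

    Plugged into a training loop, this is evaluated every optimizer step and written into each parameter group's learning rate.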

  • Part 6: Pretraining SydsGPT on Project Gutenberg

    November 10, 2025
    Artificial Intelligence

    In Part 5, I assembled the complete GPT medium model and validated its architecture with forward passes and text generation. In Part 6, I moved into the crucial stage of pretraining. I set out to understand the basics of pretraining by building a complete, reproducible pipeline around a GPT‑2 style model I call SydsGPT. In…

  • Part 5: Building a complete GPT medium model and first text generation

    October 28, 2025
    Artificial Intelligence

    In Part 4, I focused on attention and built reusable modules that mirror transformer internals. In Part 5, I assembled the complete GPT architecture at medium scale, validated shapes and memory, and ran first text generation. The outputs are gibberish because the model is untrained. That is expected. The goal here is to make sure…

  • Part 4: Attention Is All You Need (Pretty Much): From Basic Attention to Multi-Head Attention in PyTorch

    October 16, 2025
    Artificial Intelligence

    In Part 3 of this series, I focused on preparing a dataset for training a language model, combining multiple books into a corpus, tokenizing with tiktoken, and creating PyTorch datasets. With the data pipeline in place, the next step in building a GPT-style model is to understand attention mechanisms. Attention is the core innovation behind…
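    As a taste of what the post builds up to, here is scaled dot-product attention reduced to plain Python lists: a single head, no masking, and no learned projections, which the full multi-head version adds on top.

    ```python
    import math

    def softmax(xs):
        """Numerically stable softmax over a list of scores."""
        m = max(xs)
        exps = [math.exp(x - m) for x in xs]
        s = sum(exps)
        return [e / s for e in exps]

    def attention(Q, K, V):
        """Scaled dot-product attention on nested lists.

        Q: (n_q, d) queries, K: (n_k, d) keys, V: (n_k, d_v) values.
        Each output row is a softmax-weighted mix of the value rows.
        """
        d = len(K[0])
        out = []
        for q in Q:
            scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                      for k in K]
            weights = softmax(scores)
            out.append([sum(w * v[j] for w, v in zip(weights, V))
                        for j in range(len(V[0]))])
        return out

    Q = [[1.0, 0.0]]
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 0.0], [0.0, 1.0]]
    out = attention(Q, K, V)
    ```

    The query aligns with the first key, so the output leans toward the first value row: attention is just a similarity-weighted average.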

  • Part 3: Preparing Data for Training a Language Model

    October 6, 2025
    Artificial Intelligence

    In Part 1 of this series, I built a simple neural network for classification to get comfortable with the basics of deep learning. In Part 2, I created a MiniTokenizer to understand how raw text is transformed into tokens. Now, in Part 3, I am moving one step closer to building a GPT-style model by…
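    The input–target pairs such a data pipeline produces can be sketched as a sliding window over the token stream, with targets shifted one position right for next-token prediction (the context length and stride below are illustrative):

    ```python
    def sliding_windows(token_ids, context_len, stride):
        """Chunk a token stream into (input, target) pairs.

        Targets are the inputs shifted one position to the right, so the
        model learns to predict each next token in the window.
        """
        pairs = []
        for i in range(0, len(token_ids) - context_len, stride):
            x = token_ids[i : i + context_len]
            y = token_ids[i + 1 : i + context_len + 1]
            pairs.append((x, y))
        return pairs

    pairs = sliding_windows(list(range(10)), context_len=4, stride=4)
    print(pairs[0])  # ([0, 1, 2, 3], [1, 2, 3, 4])
    ```

    Wrapping this logic in a PyTorch `Dataset` is then a matter of returning one `(x, y)` pair per index.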

  • Part 2: Building a Mini Tokenizer from Scratch

    October 1, 2025
    Artificial Intelligence

    In Part 1 of this series, I built a simple neural network for binary and multiclass classification to get comfortable with the fundamentals of deep learning. For Part 2, I shifted focus to something equally important in the world of transformers: tokenization. Transformers do not work directly with raw text. They need text to be…
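    As a flavor of that post, a toy word-level tokenizer fits in a dozen lines; real GPT-style models use subword schemes such as the BPE implemented in tiktoken, and the class below is illustrative, not the post's implementation.

    ```python
    class MiniTokenizer:
        """A minimal word-level tokenizer: builds a vocabulary from a
        corpus and maps between text and integer IDs. Unknown words map
        to a reserved <unk> token."""

        def __init__(self, corpus, unk="<unk>"):
            words = sorted(set(corpus.split()))
            self.unk = unk
            self.stoi = {w: i for i, w in enumerate([unk] + words)}
            self.itos = {i: w for w, i in self.stoi.items()}

        def encode(self, text):
            return [self.stoi.get(w, self.stoi[self.unk]) for w in text.split()]

        def decode(self, ids):
            return " ".join(self.itos[i] for i in ids)

    tok = MiniTokenizer("the cat sat on the mat")
    ids = tok.encode("the cat")
    print(tok.decode(ids))  # "the cat"
    ```

    Encoding then decoding round-trips any in-vocabulary text, which is the basic contract a tokenizer has to satisfy.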
