rasbt / LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook

n_heads × d_head -> d_head × d_head in DeltaNet (#903)

Clarified the explanation of the memory size calculation for `KV_cache_DeltaNet` and updated the quadratic term from `n_heads × d_head` to `d_head × d_head`.
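
To make the `d_head × d_head` term concrete, here is a minimal sketch of the kind of calculation the description refers to. It is not the repository's code: the function names, the `n_layers` factor, and the default of 2 bytes per element are assumptions for illustration. It assumes each head keeps one `d_head × d_head` state matrix whose size does not depend on the number of tokens, and compares that with a standard key/value cache that grows with sequence length.

```python
def deltanet_state_bytes(batch_size, n_layers, n_heads, d_head, bytes_per_elem=2):
    """Approximate recurrent-state memory for a DeltaNet-style layer stack.

    Assumption: one (d_head x d_head) state matrix per head, per layer,
    independent of the number of tokens processed.
    """
    return batch_size * n_layers * n_heads * d_head * d_head * bytes_per_elem


def mha_kv_cache_bytes(batch_size, n_tokens, n_layers, n_heads, d_head, bytes_per_elem=2):
    """Approximate KV-cache memory for standard multi-head attention.

    The factor 2 accounts for storing both keys and values per token.
    """
    return batch_size * n_tokens * n_layers * n_heads * d_head * 2 * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical configuration, chosen only to show the scaling behavior.
    cfg = dict(batch_size=1, n_layers=24, n_heads=16, d_head=128)
    for n_tokens in (1_024, 32_768):
        kv = mha_kv_cache_bytes(n_tokens=n_tokens, **cfg)
        print(f"n_tokens={n_tokens}: KV cache = {kv / 1e6:.1f} MB")
    print(f"DeltaNet state = {deltanet_state_bytes(**cfg) / 1e6:.1f} MB (constant in n_tokens)")
```

Under these assumptions the state size is quadratic in `d_head` and linear in `n_heads`, which is why the quadratic term is `d_head × d_head` rather than `n_heads × d_head`.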
Sebastian Raschka committed
bcc73f731d09cec9c091b4ed563eed68fbdeecf0
Parent: 488bef7
Committed by GitHub on 11/6/2025, 12:28:37 AM