Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Add Deepseek Sparse Attention (DSA) implementation (ch04/09_dsa) (#1014)
* fix: handle token_ids length<2 in find_freq_pair
* Added new folder and related changes
* Get PR ready
* update

---------

Co-authored-by: Tarun Kumar <tarun123@gmail.com>
Co-authored-by: Sebastian Raschka <rasbt@users.noreply.github.com>
Co-authored-by: rasbt <mail@sebastianraschka.com>
Tarun kumar committed
768fc57d4e125627ad45aa3ceae85ec2b6f0b8ae
Parent: 0ab8b81
Committed by GitHub <noreply@github.com>
on 5/23/2026, 3:02:00 PM