tiny-llm

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python · Building · course · large-language-model · llm · python

Stars: 4.3K
Forks: 332
Contributors: 8
Last push: 3d ago

Recent commits

Latest commits.

docs: update hf login command to hf auth login for huggingface_hub v1.x (#132)
efb0c89 · Zhang Yifan · 3d ago

Update MoE roadmap status (#131)
e072144 · Connor · 7d ago

Add Qwen3 MoE lesson (#129)
342cccd · Connor · 8d ago

Keep Week 2 embeddings quantized (#130)
3705aa0 · Connor · 10d ago

Support float16 quantized matmul (#128)
e8da33f · Connor · 20d ago

fix: correct axpby jvp argnums handling (#127)
7fac5ff · ihanzh · 21d ago

Optimize quantized matmul for small decode batches (#122)
8f71501 · Li0k · 27d ago

Clarify Qwen3 model size test names (#121)
77eb9f3 · Connor · 1mo ago

Top contributors

Builders behind this project.