tiny-llm

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python · Building · course · large-language-model · llm · python
Stars: 4.3K
Forks: 332
Contributors: 8
Last push: 3d ago

Recent commits

  • docs: update hf login command to hf auth login for huggingface_hub v1.x (#132)
    efb0c89 · Zhang Yifan · 3d ago
  • Update MoE roadmap status (#131)
    e072144 · Connor · 7d ago
  • Add Qwen3 MoE lesson (#129)
    342cccd · Connor · 8d ago
  • Keep Week 2 embeddings quantized (#130)
    3705aa0 · Connor · 10d ago
  • Support float16 quantized matmul (#128)
    e8da33f · Connor · 20d ago
  • fix: correct axpby jvp argnums handling (#127)
    7fac5ff · ihanzh · 21d ago
  • Optimize quantized matmul for small decode batches (#122)
    8f71501 · Li0k · 27d ago
  • Clarify Qwen3 model size test names (#121)
    77eb9f3 · Connor · 1mo ago

Top contributors

Builders behind this project.

  • skyzh: 131 commits
  • Connor1996: 48 commits
  • ekzhang: 5 commits
  • jiengup: 5 commits
  • 58191554: 4 commits
  • KKKZOZ: 4 commits
  • jhsong233: 2 commits
  • linuxholic: 2 commits