High-speed Large Language Model Serving on PCs with Consumer-grade GPUs