GPTQModel

Production-ready LLM compression and quantization toolkit with accelerated inference on both CPU and GPU via HF, vLLM, and SGLang.

Python · Emerging

Stars: n/a
Forks: n/a
Contributors: 8
Last push: 16mo ago

Recent commits


gptqmodel: Fixup supported models list
927d00c · Forenche · 16mo ago

fix 3 bit packing regression, fixed #1278 (#1280)
52d2c42 · CSY-ModelCloud · 16mo ago

start v2.0.0-dev release cycle (#1275)
3ead8c1 · Qubitium-ModelCloud · 16mo ago

Update README.md (#1274)
599e5c7 · Qubitium-ModelCloud · 16mo ago

prepare for 1.9.0 release (#1273)
b1283b7 · Qubitium-ModelCloud · 16mo ago

[CI] remove peft & fix torch was upgraded by optimum (#1272)
bae0e6c · CSY-ModelCloud · 16mo ago

[CI] fix transformers version was overrided (#1271)
24c05e2 · CSY-ModelCloud · 16mo ago

fix outputs are not enough (#1268)
c6dc35d · CSY-ModelCloud · 16mo ago

Top contributors

Builders behind this project.