Builder

wejoncy

@wejoncy

GitHub
  • Followers: 25
  • Public repos: 29
  • Stars (recent): 209
  • Ecosystems: 1

Projects

Repositories this builder owns.

flash-attention
Fast and memory-efficient exact attention
—
QLLM
A general 2-8 bit quantization toolbox supporting GPTQ/AWQ/HQQ/VPTQ, with easy export to ONNX/ONNX Runtime.
190
sfllm
Super fast serving stack for LLMs on Windows/Linux/macOS
17
sglang
SGLang is a fast serving framework for large language models and vision language models.
—
ArtRelease

Recent activity

Most recently pushed work.

  • wejoncy/flash-attention
    pushed 15d ago
  • wejoncy/QLLM
    pushed 3mo ago
  • wejoncy/sfllm
    pushed 6mo ago
  • wejoncy/sglang
    pushed 8mo ago

Connected narratives

  • AI Agents
  • Consumer Crypto
  • SocialFi
  • Stablecoins
  • Agent Commerce
  • Onchain Apps

Related builders

Others building in the same ecosystem.

Jesse Pollak
1.3K followers
0xDeployer
— followers
Xen
— followers
Ahaan Raizada
— followers
Youssef
— followers
Igor Yuzo
— followers
No description.
—
accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
—
aphrodite-engine
Large-scale LLM inference engine
—
VPTQ
VPTQ, A Flexible and Extreme low-bit quantization algorithm
—