LoreonLabsPlatform

Builder

wejoncy

@wejoncy

Followers: 25
Public repos: 29
Stars (recent): 209
Ecosystems: 1

Projects

Repositories this builder owns.

flash-attention

Fast and memory-efficient exact attention

QLLM

A general 2-8 bit quantization toolbox supporting GPTQ/AWQ/HQQ/VPTQ, with easy export to ONNX and ONNX Runtime.

sfllm

Super fast serving stack for LLMs on Windows/Linux/macOS

sglang

SGLang is a fast serving framework for large language models and vision language models.

Recent activity

Most recently pushed work.

wejoncy/flash-attention
pushed 15d ago
wejoncy/QLLM
pushed 3mo ago
wejoncy/sfllm
pushed 6mo ago
wejoncy/sglang
pushed 8mo ago

Connected narratives

AI Agents, Consumer Crypto, SocialFi, Stablecoins, Agent Commerce, Onchain Apps

Related builders

Others building in the same ecosystem.

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

aphrodite-engine

Large-scale LLM inference engine

VPTQ, a flexible and extreme low-bit quantization algorithm