Red Hat

Matthew Bonanni

vLLM Maintainer | MLE at Red Hat | Stanford PhD '25 | HPC, C++, CUDA, LLM inference

GitHub | Website
Followers: 59
Public repos: 25
Stars (recent): 17
Ecosystems: 1

Projects

Repositories this builder owns.

vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
—
flash-attention
Fast and memory-efficient exact attention
—
matthewbonanni.github.io
No description.
—
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
—

Connected narratives

AI Agents, Consumer Crypto, SocialFi, Stablecoins, Agent Commerce, Onchain Apps

Related builders

Others building in the same ecosystem.

  • Jesse Pollak (1.3K followers)
  • 0xDeployer (follower count not shown)
  • Xen (follower count not shown)
  • Ahaan Raizada (follower count not shown)
  • Youssef (follower count not shown)
  • Igor Yuzo (follower count not shown)
attn-viz
Interactive visualizer for transformer self-attention variants (MHA, GQA, MQA, MLA) with tensor shapes, FLOPs/memory cost analysis, and per-GPU roofline estimates
6
canhazgpu
A simple GPU reservation tool for single host shared development systems
—
recipes
Common recipes to run vLLM
—
ci-infra
This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
—

Recent activity

Most recently pushed work.

  • MatthewBonanni/vllm
    pushed 8h ago
  • MatthewBonanni/flash-attention
    pushed 8h ago
  • MatthewBonanni/matthewbonanni.github.io
    pushed 10h ago
  • MatthewBonanni/pytorch
    pushed 5d ago