tokenizers

C++ implementations of various tokenizers (SentencePiece, tiktoken, etc.).

Other, Emerging
GitHub

  • Stars: not shown
  • Forks: not shown
  • Contributors: 8
  • Last push: 5mo ago

Recent commits

Latest commits.

  • Add python 3.13 support in build files (#161)
    7b26e65 · Mengwei Liu · 6mo ago
  • Fix BPE merge parsing to handle HuggingFace tokenizer.json format
    92cf202 · Mergen Nachin · 6mo ago
  • Fix broken tests
    2055674 · Jon Janzen · 7mo ago
  • Rename build files from TARGETS to BUCK (group ID: 3033693060807749529) (#157) (#157)
    1b1d68a · Jon Janzen · 7mo ago
  • Migrate re2 usages for pytorch
    6f8e168 · Simon Krueger · 7mo ago
  • Daily `arc lint --take BLACK`
    3aada3f · Facebook Community Bot · 8mo ago
  • Turn off file tracker on Windows (#151)
    a28e941 · Mengwei Liu · 8mo ago
  • More msvc fixes (#147)
    0bcd9f5 · Jacob Szwejbka · 8mo ago

Top contributors

Builders behind this project.

  • larryliu0820: 81 commits
  • jackzhxng: 22 commits
  • gabe-l-hart: 16 commits
  • JacobSzwejbka: 6 commits
  • facebook-github-bot: 5 commits
  • mergennachin: 5 commits
  • GregoryComer: 4 commits
  • shoumikhin: 3 commits