sci-ml / tokenizers

Implementation of today's most used tokenizers

Official package sites: https://github.com/huggingface/tokenizers

v0.23.1 :: 0 :: gentoo

License
Apache-2.0 Apache-2.0-with-LLVM-exceptions BSD-2 BSD ISC MIT MPL-2.0 Unicode-DFS-2016
Keywords
~amd64
USE flags
debug test

v0.22.2 :: 0 :: gentoo

License
Apache-2.0 Apache-2.0-with-LLVM-exceptions BSD-2 BSD ISC MIT MPL-2.0 Unicode-DFS-2016
Keywords
~amd64
USE flags
debug test

v0.22.1 :: 0 :: gentoo

License
Apache-2.0 Apache-2.0-with-LLVM-exceptions BSD-2 BSD ISC MIT MPL-2.0 Unicode-DFS-2016
Keywords
~amd64
USE flags
debug test

General

debug
Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see https://wiki.gentoo.org/wiki/Project:Quality_Assurance/Backtraces
test
Enable dependencies and/or preparations necessary to run tests (usually controlled by FEATURES=test but can be toggled independently)

python_single_target

python3_12
Build for Python 3.12 only
python3_13
Build for Python 3.13 only
python3_14
Build for Python 3.14 only
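
As an illustration of what the python_single_target values above mean in practice, here is a minimal Python sketch that checks whether the running interpreter is one of the listed targets and whether a module named `tokenizers` can be found. The helper names `matches_target` and `tokenizers_available` are hypothetical and are not part of the package or of Portage; the set of targets is copied from the list above.

```python
import importlib.util
import sys

# Targets listed under python_single_target on this page.
SUPPORTED_TARGETS = {(3, 12), (3, 13), (3, 14)}


def matches_target(version=None):
    """Return True if (major, minor) is one of the listed targets.

    Hypothetical helper: defaults to the interpreter running this script.
    """
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) in SUPPORTED_TARGETS


def tokenizers_available():
    """Return True if a top-level module named 'tokenizers' can be located.

    Uses find_spec so the module is only looked up, not imported.
    """
    return importlib.util.find_spec("tokenizers") is not None


if __name__ == "__main__":
    print("interpreter:", ".".join(map(str, sys.version_info[:2])))
    print("matches a listed target:", matches_target())
    print("tokenizers importable:", tokenizers_available())
```

Because only one target is built at a time, the module would normally be importable from just the interpreter that was selected at install time.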

dev-lang / python : An interpreted, interactive, object-oriented programming language


dev-libs / oniguruma : Regular expression library for different character encodings

sci-ml / transformers : State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow

Repository mirror & CI · gentoo
Merge updates from master
Alfredo Tupone · gentoo
sci-ml/tokenizers: update RUST_MIN_VER
Closes: https://bugs.gentoo.org/976068
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: drop a test needing network
Closes: https://bugs.gentoo.org/976166
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: add 0.23.1
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: add 0.22.2
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: enable py3.14
Closes: https://bugs.gentoo.org/974127
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Michał Górny · gentoo
sci-ml/tokenizers: Remove py3.11 (per scipy)
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: drop 0.21.4
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: add 0.22.1, drop 0.22.0
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: add 0.22.0
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: drop 0.21.0
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: add 0.21.4, rm 0.21.1-r1
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: drop test failing in sandbox
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: fix for gcc-15
Closes: https://bugs.gentoo.org/944852
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: add 0.21.1
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: enable py3.13
Closes: https://bugs.gentoo.org/952696
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/tokenizers: drop 0.20.1-r1
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/*: mv sci-libs/datasets to sci-ml/
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Alfredo Tupone · gentoo
sci-ml/*: mv sci-libs/tokenizer to sci-ml/
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>