dev-util / Tensile

Stretching GPU performance for GEMMs and tensor contractions

Official package sites : https://github.com/ROCm/Tensile

v6.3.3 :: 0/6.3 :: gentoo

License
MIT
Keywords
~amd64
USE flags
client test

v6.3.2 :: 0/6.3 :: gentoo

License
MIT
Keywords
~amd64
USE flags
client test

v6.1.1-r1 :: 0/6.1 :: gentoo

License
MIT
Keywords
~amd64
USE flags
client test

v6.1.1 :: 0/6.1 :: gentoo

License
MIT
Keywords
~amd64
USE flags
client test

v5.7.1-r2 :: 0/5.7 :: gentoo

License
MIT
Keywords
~amd64
USE flags
client test

General

client
Build and install tensile_client executable to run benchmarks and tune GPU GEMM
test
Enable dependencies and/or preparations necessary to run tests (usually controlled by FEATURES=test but can be toggled independently)

amdgpu_targets

gfx1010
RDNA GPU, codename navi10, including Radeon RX 5700XT/5700/5700M/5700B/5700XTB/5600XT/5600/5600M, Radeon Pro 5700XT/5700, Radeon Pro W5700X/W5700
gfx1011
RDNA GPU, codename navi12, including Radeon Pro 5600M/V520
gfx1012
RDNA GPU, codename navi14, including Radeon RX 5500XT/5500/5500M/5500XTB/5300/5300M, Radeon Pro 5500XT/5500M/5300/5300M, Radeon Pro W5500X/W5500/W5500M/W5300M
gfx1030
RDNA2 GPU, codename navi21/sienna cichlid, including Radeon RX 6950XT/6900XT/6800XT/6800, Radeon Pro W6800
gfx1031
RDNA2 GPU, codename navi22/navy flounder, including Radeon RX 6750XT/6700XT/6800M/6700M
gfx1100
RDNA3 GPU, codename navi31/plum bonito, including Radeon RX 7900XTX/7900XT, AMD Radeon Pro W7900/W7800
gfx1101
RDNA3 GPU, codename navi32, including Radeon RX 7700XT/7800XT
gfx1102
RDNA3 GPU, codename navi33, including Radeon RX 7600/7600M/7600M XT/7700S/7600S, AMD Radeon PRO W7600/W7500
gfx803
Fiji GPU, codename fiji, including Radeon R9 Nano/Fury/FuryX, Radeon Pro Duo, FirePro S9300x2, Radeon Instinct MI8
gfx900
Vega GPU, codename vega10, including Radeon Vega Frontier Edition, Radeon RX Vega 56/64, Radeon RX Vega 64 Liquid, Radeon Pro Vega 48/56/64/64X, Radeon Pro WX 8200/9100, Radeon Pro V320/V340/SSG, Radeon Instinct MI25
gfx906
Vega GPU, codename vega20, including Radeon (Pro) VII, Radeon Instinct MI50/MI60
gfx908
CDNA Accelerator, codename arcturus, including AMD Instinct MI100 Accelerator
gfx90a
CDNA2 Accelerator, codename aldebaran, including AMD Instinct MI200 series Accelerators
gfx940
CDNA3 Accelerator, codename aqua_vangaram, MI300A rev 0
gfx941
CDNA3 Accelerator, codename aqua_vangaram, MI300X rev 0
gfx942
CDNA3 Accelerator, codename aqua_vangaram, MI300A and MI300X rev >=1
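The flags listed above are normally selected per package in a file under /etc/portage/package.use. As a minimal sketch, the hypothetical helper below assembles such a line for dev-util/Tensile. The function name, its parameters, and the validation step are illustrative assumptions and not part of Portage; the only Portage convention relied on is that USE_EXPAND values are written as `amdgpu_targets_<value>`.

```python
# Hypothetical helper (not part of Portage): builds one package.use line
# for dev-util/Tensile from the flags and GPU targets listed on this page.

KNOWN_TARGETS = {
    "gfx803", "gfx900", "gfx906", "gfx908", "gfx90a",
    "gfx940", "gfx941", "gfx942",
    "gfx1010", "gfx1011", "gfx1012", "gfx1030", "gfx1031",
    "gfx1100", "gfx1101", "gfx1102",
}


def package_use_line(targets, use_flags=()):
    """Return a package.use line enabling the given flags and GPU targets."""
    unknown = sorted(set(targets) - KNOWN_TARGETS)
    if unknown:
        raise ValueError(f"unknown amdgpu_targets: {', '.join(unknown)}")
    # USE_EXPAND values are spelled <variable>_<value> in package.use.
    expanded = [f"amdgpu_targets_{t}" for t in targets]
    return " ".join(["dev-util/Tensile", *use_flags, *expanded])


# Example: an RX 6800-class card (gfx1030) with the benchmark client enabled.
print(package_use_line(["gfx1030"], use_flags=["client"]))
```

The printed line can be pasted into a package.use file as-is. Rejecting unknown target names early is a design choice of this sketch, intended to catch typos such as `gfx1103` before Portage sees them.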

llvm_slot

18
Use LLVM 18.
19
Use LLVM 19.

python_targets

python3_10
Build with Python 3.10
python3_11
Build with Python 3.11
python3_12
Build with Python 3.12
python3_13
Build with Python 3.13

dev-cpp / msgpack-cxx : MessagePack for C++

dev-lang / python : Freethreading (no-GIL) version of Python programming language

dev-libs / boost : Boost Libraries for C++

dev-python / joblib : Tools to provide lightweight pipelining in Python

dev-python / msgpack : MessagePack (de)serializer for Python

dev-python / pyyaml : YAML parser and emitter for Python

dev-util / hip : C++ Heterogeneous-Compute Interface for Portability

dev-util / rocm-smi : ROCm System Management Interface Library

llvm-core / clang : C language family frontend for LLVM

903602
dev-util/Tensile-5.4.2-r1 - ninja: build stopped: subcommand failed.
934970
dev-util/Tensile-6.1.1 fails tests: FAILED test_.py::test_2sum_gsu_src - FileNotFoundError: [Errno 2] No such file or directory: 2sum_gsu_src.yaml
949526
dev-util/Tensile-6.3.2 - [glibc] [gcc-15] [ffmpeg] Reference.cpp: fatal error: omp.h file not found
Repository mirror & CI · gentoo
Merge updates from master
Patrick Lauer · gentoo
dev-util/Tensile: add 6.3.3, drop 6.3.0
Signed-off-by: Patrick Lauer <patrick@gentoo.org>
Patrick Lauer · gentoo
dev-util/Tensile: add 6.3.2
Signed-off-by: Patrick Lauer <patrick@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: drop 5.1.3-r3, 5.4.2-r2
Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Michał Górny · gentoo
Move {sys-devel → llvm-core}/clang
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: add 6.3.0
Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Patrick Lauer · gentoo
dev-util/Tensile: Fix runtime behaviour
Similar to https://github.com/ROCm/Tensile/pull/1898. Fixes rocBLAS failing. No idea why the above code was merged but fell out during refactoring, and why is everything so horrible. Signed-off-by: Patrick Lauer <patrick@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: strip unsupported flags for potentially switched compiler
Bug: https://bugs.gentoo.org/936099 Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Paul Zander · gentoo
sci-libs/rocBLAS: 6.1.1 make Tensile dependency optional
Building with Tensile requires actual hardware to be present; to avoid breaking CI, make this optional. Signed-off-by: Paul Zander <negril.nx+gentoo@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: add 6.1.1
Changes:
* fix USE=test dependency for dev-python/joblib (in 5.7.1 and 6.1.1)
* ReplacementKernels-cov3 directory does not exist anymore
* update expand-isa-compatibility patch to not use removed gcnArch field
* update llvm eclass to r1
* add LLVM_COMPAT 18
Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: fix compilation of rocBLAS by propagating MSGPACK_NO_BOOST definition
Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: lock dev-util/hip version, as with hip-6.0 build fails with "no member named 'gcnArch'"
Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Sam James <sam@gentoo.org>
Sv. Lockal · gentoo
dev-util/Tensile: add 5.7.1
increase LLVM_MAX_SLOT to 17 Signed-off-by: Sv. Lockal <lockalsash@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Marek Szuba · gentoo
dev-util/Tensile: adapt for msgpack-cxx-6.0.0
The name of the cmake module has changed again. Signed-off-by: Marek Szuba <marecki@gentoo.org>
Marek Szuba · gentoo
dev-util/Tensile: revbump to account for recent RDEPEND changes
Signed-off-by: Marek Szuba <marecki@gentoo.org>
Yiyang Wu · gentoo
dev-util/Tensile: add client USE to compile tensile_client
tensile_client is for running benchmarks. By default, Tensile contains a whole set of scripts to configure and compile tensile_client via cmake for every benchmark. This commit adds a USE flag to compile and install this executable, so benchmarks do not need to compile it again at runtime. Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Marek Szuba <marecki@gentoo.org>
Yiyang Wu · gentoo
dev-util/Tensile: support >=dev-cpp/msgpack-cxx-5.0.0
Closes: https://bugs.gentoo.org/893544 Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Marek Szuba <marecki@gentoo.org>
Yiyang Wu · gentoo
dev-util/Tensile: add missing patches
Closes: https://bugs.gentoo.org/892736 Closes: https://github.com/gentoo/gentoo/pull/29356 Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Andreas Sturmlechner · gentoo
dev-util/Tensile: drop 5.0.2-r2
Signed-off-by: Andreas Sturmlechner <asturm@gentoo.org>
Yiyang Wu · gentoo
dev-util/Tensile: add 5.4.2
Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
David Seifert · gentoo
*/*: remove py3.8 from PYTHON_COMPAT
Signed-off-by: David Seifert <soap@gentoo.org>
Andreas Sturmlechner · gentoo
dev-util/Tensile: drop 4.3.0-r1
Signed-off-by: Andreas Sturmlechner <asturm@gentoo.org>
Jack de Kleuver · gentoo
dev-util/Tensile: Bump LLVM version to 15
Signed-off-by: Jack de Kleuver <jackdekleuver@gmail.com> Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Benda Xu · gentoo
dev-util/Tensile: relax SLOT dependency.
This unlocks the exact SLOT dependency of the lower and higher level ROCm tools to make version bumps easier. Package-Manager: Portage-3.0.30, Repoman-3.0.3 Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Yiyang Wu · gentoo
dev-util/Tensile: backport patch to control multiprocess
Reference: https://github.com/ROCmSoftwarePlatform/Tensile/commit/25b1621549f9b120462988913e657684645be79d Bugs: https://bugs.gentoo.org/852236 Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Yiyang Wu · gentoo
dev-util/Tensile: add 5.1.3, switch to vanilla clang-14
Enable py3.11 as well. No need to rebuild when hip is upgraded, since hip is only used at runtime. Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Benda Xu · gentoo
dev-util/Tensile: die out when pushd fails.
Bug: https://github.com/gentoo/gentoo/pull/24537 Package-Manager: Portage-3.0.30, Repoman-3.0.3 Signed-off-by: Benda Xu <heroxbd@gentoo.org>
YiyangWu · gentoo
dev-util/Tensile: fix hardcoded EPREFIX in gentoopath.patch
Package-Manager: Portage-3.0.30, Repoman-3.0.3 Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
YiyangWu · gentoo
dev-util/Tensile: bump version to 5.0.2
Package-Manager: Portage-3.0.30, Repoman-3.0.3 Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
YiyangWu · gentoo
dev-util/Tensile: FHS and benchmark feature
The previous Tensile package installed non-Python files in the Python site directory. This change moves Config files to /usr/share. Various patches are applied to correct paths. Also, enable the gfx1031 target so people can run benchmarks on navi22 GPUs. Closes: https://github.com/gentoo/gentoo/pull/24537 Package-Manager: Portage-3.0.30, Repoman-3.0.3 Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com> Signed-off-by: Benda Xu <heroxbd@gentoo.org>
Andrew Ammerlaan · gentoo
dev-util/Tensile: add new dependency of rocBLAS
Package-Manager: Portage-3.0.30, Repoman-3.0.3 Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>