We recommend installing the dependencies from requirements.txt. This has been tested with Ubuntu 22.04, CUDA 12.4, and Python 3.10. A GPU with 24 GB or more of memory should work for most datasets ...
This repo aims to provide a collection of efficient Triton-based implementations of state-of-the-art linear attention models. All implementations are written purely in PyTorch and Triton, making ...