NUS
- Singapore
- https://radarfudan.github.io
- https://orcid.org/0009-0001-1457-2419
- @SanderWangSD
Highlights
Curse-of-memory Public
Curse-of-memory phenomenon of RNNs in sequence modelling
flash-linear-attention Public
Forked from sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
Python, MIT License, Updated Nov 6, 2024
Awesome-state-space-models Public
Collection of papers on state-space models
pythia Public
Forked from EleutherAI/pythia
The hub for EleutherAI's work on interpretability and learning dynamics
llm.c Public
Forked from karpathy/llm.c
LLM training in simple, raw C/CUDA
Cuda, MIT License, Updated Jul 12, 2024
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
mamba2-minimal Public
Forked from tommyip/mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
Python, Apache License 2.0, Updated Jun 17, 2024
s4 Public
Forked from state-spaces/s4
Structured state space sequence models
Jupyter Notebook, Apache License 2.0, Updated Mar 24, 2024
google-research Public
Forked from google-research/google-research
Google Research
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
attention_with_linear_biases Public
Forked from ofirpress/attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
in-context-operator-networks Public
Forked from LiuYangMage/in-context-operator-networks
ICON for in-context operator learning
Python, MIT License, Updated Mar 11, 2024
gateloop-transformer Public
Forked from lucidrains/gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
Python, MIT License, Updated Mar 11, 2024
causal-conv1d Public
Forked from Dao-AILab/causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
RWKV-CUDA Public
Forked from BlinkDL/RWKV-CUDA
The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM)
profiling-cuda-in-torch Public
Forked from gpu-mode/profiling-cuda-in-torch
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
annotated-mamba Public
Forked from srush/annotated-mamba
Annotated version of the Mamba paper
lightning-hydra-template Public template
Forked from ashleve/lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
flash-fft-conv Public
Forked from HazyResearch/flash-fft-conv
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
TinyLlama Public
Forked from jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
LongMamba Public
Forked from jzhang38/LongMamba
Some preliminary explorations of Mamba's context scaling.
Python, Updated Feb 8, 2024