Tensorflow (5 posts)
- Faster LLMs: Improving RWKV with Parallel Cumulative Sums (Oct 1, 2023)
- Simple Fast Attention: Causal Implementation Experiments (Nov 30, 2022)
- Micro-benchmarking in TF2 (Jan 23, 2021)
- Deterministic Tensorflow Part 2: Data Augmentation (Dec 6, 2020)
- Deterministic Tensorflow Part 1: Model Training (Dec 6, 2020)
