Title: DeepMind makes big jump toward interpreting LLMs with sparse autoencoders Summary: New research from Google DeepMind shows how sparse autoencoders (SAEs) with a special JumpReLU activation can help interpret LLMs. Link: