Tag: rnn | Yingfa Chen

rnn

2025

Multi-Head DeltaNet

DeltaProduct (Siems et al., 2025) proposes to improve DeltaNet (Yang et al., 2025) by updating the online memory with $n_h$ key-value pairs (KVs) per token, which can be viewed as performing multiple steps of gradient descent per token. I explain how this method is almost the same as a multi-KV DeltaNet and point out a potential flaw in the design of DeltaProduct.

2025-03-22

2025-03-23

665 words, 4 min

Research

research rnn neural-networks language-models test-time-learning

2024

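2024: After National Day

The 🇨🇳 National Day holiday has just ended, and I am back in Beijing from Dongguan to continue my PhD. The past few months have felt especially busy: fulfilling, but also tiring, so this seven-day holiday (really only five days) came at the right time to let me catch my breath. I have not written a blog post in a long time; the last one about myself was, I think, from right after last year's National Day. Exactly one year has passed, so this post can also serve as a yearly summary.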

2024-10-10

3.2k words, 10 min

Life
