The Rise of Test-Time Training

Abstract: The main idea in test-time training (TTT) is that a model with fixed parameters produces the supervision for another network that is updated at test time (or inference time). This article first reviews the TTT paper. Then, we discuss the problems with TTT and how LaCT addresses them, resulting in ...
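The idea in the summary above can be sketched as a toy self-supervised loop. This is a minimal sketch, not the paper's architecture: the dimensions, learning rate, and the tanh "value" target are hypothetical stand-ins for the learned projections that the frozen model would provide.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                             # hypothetical feature dimension
W_fast = np.zeros((d, d))         # fast weights, updated at test time
lr = 0.1                          # hypothetical step size

def frozen_model(x):
    # Stand-in for the fixed-parameter model: it emits a (key, value)
    # pair that supervises the fast network. In TTT these come from
    # learned projections; tanh here is purely illustrative.
    k = x
    v = np.tanh(x)
    return k, v

def ttt_step(W, x):
    # One gradient step on the self-supervised loss 0.5 * ||W k - v||^2,
    # using supervision produced by the frozen model.
    k, v = frozen_model(x)
    err = W @ k - v
    grad = np.outer(err, k)       # gradient of the loss w.r.t. W
    return W - lr * grad

# Process a stream of test-time tokens, updating the fast weights.
for _ in range(16):
    x = rng.standard_normal(d)
    W_fast = ttt_step(W_fast, x)

print(W_fast.shape)  # (8, 8)
```

The point of the sketch is only the division of labor: the frozen model defines the loss targets, while the fast weights are the state that gradient descent updates during inference.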

Jul 7, 2025 · 7 min
Generalizing DeltaProduct

In DeltaProduct, the authors propose to improve DeltaNet by updating the online memory with multiple KV pairs for each token, which can be seen as performing multiple steps of gradient descent per token. I will explain how this method is almost the same as multi-KV DeltaNet and reveal a potential flaw in the design of DeltaProduct.
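The "multiple gradient-descent steps per token" view can be sketched as follows. This is an illustrative toy, not the paper's kernel: the dimensions, the step size, and the random KV pairs are hypothetical; in DeltaProduct the KV pairs come from learned per-token projections.

```python
import numpy as np

d = 4
S = np.zeros((d, d))              # linear-attention state / online memory

def delta_step(S, k, v, beta):
    # Delta-rule (DeltaNet-style) update: one gradient step on
    # 0.5 * ||S k - v||^2 with step size beta.
    return S - beta * np.outer(S @ k - v, k)

def deltaproduct_token(S, kvs, beta):
    # DeltaProduct-style update: several (k, v) micro-steps for one
    # token, i.e. multiple gradient-descent steps per token.
    for k, v in kvs:
        S = delta_step(S, k, v, beta)
    return S

rng = np.random.default_rng(0)
# Three hypothetical KV pairs for a single token.
kvs = [(rng.standard_normal(d), rng.standard_normal(d)) for _ in range(3)]
S = deltaproduct_token(S, kvs, beta=0.5)
print(S.shape)  # (4, 4)
```

With one KV pair per token this reduces to the ordinary delta rule; the generalization is simply looping the same update several times within a token.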

Mar 22, 2025 · 4 min

Implementing Test-Time Training - Part 1

This blog post is part 1 of a series describing my attempt at implementing the Test-Time Training (TTT) model and Titans. At the time of writing, these are two strong recurrent language models, but their authors have not yet open-sourced their implementations (TTT has only open sourced th...

Mar 19, 2025 · 8 min
After the 2024 National Day Holiday

I just finished the 🇨🇳 National Day holiday and came back from Dongguan to Beijing to continue my PhD. The past few months have felt especially busy; fulfilling, but also tiring, so this seven-day break (really only five days) let me catch my breath. I haven't blogged in a long time; my last personal post seems to have been after last year's National Day. Exactly a year has passed, so this can also serve as an annual review.

Oct 10, 2024 · 1 min
(EREN) Robust and Scalable Model Editing for Large Language Models

TL;DR: A reader is augmented with a growing notebook that caches all edits in natural text; the reader retrieves relevant edits and makes inferences based on them. This achieves SOTA in model editing on QA and fact-checking tasks.

Mar 14, 2024 · 3 min
InfiniteBench: Extending Long Context Evaluation Beyond 100K Tokens

The first benchmark for evaluating the effectiveness of LLMs in handling more than 100k tokens! In the paper we name it ∞Bench, but I will sometimes use "InfiniteBench" in this blog post for better readability. Finally got some time to write this blog post; I have been so busy lately! I have been in a fairly long duration...

Jan 10, 2024 · 6 min

Safety and Ethical Concerns of Large Language Models

I will be holding a seminar at ModelBest (面壁智能) on Sep 20, 2023 in 科技园, Haidian, Beijing. The seminar will be in Chinese, and it is called "大模型安全与伦理问题" (translation: Safety and Ethical Concerns of Large Language Models). Below is a list of references.

Sep 19, 2023 · 3 min
Updating My Personal Homepage

I had a personal homepage before, but I never got it set up properly, let alone updated it. Recently I changed my GitHub username, which broke my old GitHub Pages site, so I took the opportunity to rebuild my homepage. After going back and forth, I still settled on Hexo. I used Jekyll before and it was okay, but I really did not want to use Ruby, and Hugo felt too cumbersome. Choosing a theme took a long time: Hexo advertises that it has many themes, but the official site lists fewer than 400, and most do not match my taste or requirements. The style I wanted is minimal and modern, with support for both dark and light modes, and code highlighting with a monospaced font. The closest match to my requirements was the 1 theme. It still could not fully satisfy them, so I modified some of the formatting (the original even had some color bugs) and added some of my own content; the result is a theme called 1. ...

Sep 16, 2023 · 1 min
CFDBench: A Large-Scale Benchmark for Machine Learning Methods in Fluid Dynamics

I did this work with my girlfriend, whose research direction is computational fluid dynamics (CFD). We observed that there are numerous research works applying deep learning (DL) to solve CFD problems. E.g., it has been shown that DL methods can not only be more accurate than the best numerical methods, ...

Sep 16, 2023 · 2 min
First Post: Just Writing Something

It is May 17, 2023, and my first year as a master's student is about to end. I have been at Tsinghua (清华园) for almost five years now, and it has truly had a huge impact on my life. This year I met the very lovely 00, and I hope we can keep going together. The "children" of 00 and me:

May 17, 2023 · 1 min