(EREN) Robust and Scalable Model Editing for Large Language Models

TL;DR: A reader is augmented with a growing notebook that caches all edits as natural-language text; the reader retrieves the relevant edits and makes inferences based on them. This achieves SOTA in model editing on QA and fact checking.

Mar 14, 2024 · 3 min

Interpreting a Maze-Solving Network

I can't believe I hadn't read this until now. It is thought-provoking, and the result is an important step towards understanding neural networks.

Oct 7, 2023 · 1 min

Activation Addition (ActAdd)

TL;DR: Proposes ActAdd, a method for controlling model behavior during inference by modifying activations with a bias term that is computed from a pair of prompts. Summary: Proposes ActAdd, a method for controlling model behavior by modifying activations at inference time. Steering vectors are computed by taking the ...

Oct 7, 2023 · 4 min

Safety and Ethical Concerns of Large Language Models

I will be holding a seminar at ModelBest (面壁智能) on Sep 20, 2023, at the Science Park (科技园) in Haidian, Beijing. The seminar will be in Chinese, and it's called "大模型安全与伦理问题" (translation: Safety and Ethical Concerns of Large Language Models). Below is a list of references.

Sep 19, 2023 · 3 min

CFDBench: A Large-Scale Benchmark for Machine Learning Methods in Fluid Dynamics

I did this work with my girlfriend, whose research direction is computational fluid dynamics (CFD). We observed that numerous research works apply deep learning (DL) to CFD problems. E.g., [1] have shown that DL methods can not only be more accurate than the best numerical methods, ...

Sep 16, 2023 · 2 min

Some Binary Search

A binary search with C++: The same thing with Rust: And with Python:
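The post's own snippets are not reproduced in this excerpt. As a stand-in for the Python variant, here is a minimal sketch, not the author's code; the function name `binary_search` and the choice of a half-open search range are assumptions made for illustration.

```python
from typing import Optional, Sequence


def binary_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return an index of `target` in the sorted sequence `items`, or None if absent."""
    lo, hi = 0, len(items)  # search the half-open range [lo, hi)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if items[mid] < target:
            lo = mid + 1  # target can only be to the right of mid
        elif items[mid] > target:
            hi = mid  # target can only be to the left of mid
        else:
            return mid
    return None  # range is empty: target is not present
```

For example, `binary_search([1, 3, 5, 7, 9], 7)` returns `3`, and searching for a missing value such as `4` returns `None`. The half-open range keeps the loop condition simple (`lo < hi`) and handles the empty sequence without a special case.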

Sep 14, 2023 · 1 min