Yingfa Chen

2024

InfiniteBench: Extending Long Context Evaluation Beyond 100K Tokens

Code | Paper

The first benchmark for evaluating the effectiveness of LLMs in handling more than 100k tokens!

In the paper, we name it $\infty$-Bench, but I will sometimes use "InfiniteBench" in this blog post for better readability.

Finally got some time to write this blog post; I have been so busy lately! I have been on a fairly long research hiatus, during which the field of NLP was revolutionized by an overwhelming number of new LLMs. I was finally able to produce some meaningful work in this new era of research, as a second author. In this blog post, I will introduce that work.

1.1k words, 7 min

Research

2023

Interpreting a Maze-Solving Network

The blog post

I can't believe I haven't read this until now. It is thought-provoking, and the result is an important step toward understanding neural networks.

56 words, 1 min

Thoughts

Activation Addition (ActAdd)

Paper

TLDR: Proposes ActAdd, a method for controlling model behavior during inference by adding a bias term to activations, computed from a pair of contrasting prompts.

Summary:

  • Propose ActAdd, a method for controlling model behavior by modifying activations at inference time.
  • Steering vectors are computed by taking the activation differences that result from pairs of prompts. The vectors are added as bias during inference.
  • ActAdd provides control over high-level properties of the output, preserves off-target model performance, and incurs little computational or implementation cost.
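
The recipe above can be sketched with a toy stand-in for a transformer layer. This is a minimal illustration of the steering idea, not the paper's actual code; the layer, shapes, and coefficient are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one transformer layer: a fixed nonlinear map.
# In the real method, activations come from a chosen layer of an LLM.
W = rng.normal(size=(8, 8))

def layer(h, steering=None):
    """Apply the toy layer; optionally add a steering vector as a bias."""
    out = np.tanh(h @ W)
    if steering is not None:
        out = out + steering
    return out

# Hypothetical activations for a contrasting prompt pair
# (e.g. a "positive" vs. a "negative" prompt).
h_pos = rng.normal(size=8)  # activations for the first prompt
h_neg = rng.normal(size=8)  # activations for the second prompt

# Steering vector: the activation difference at the chosen layer,
# scaled by a user-chosen coefficient (4.0 here is arbitrary).
coeff = 4.0
steer = coeff * (layer(h_pos) - layer(h_neg))

# At inference time, the vector is added as a bias at that same layer.
h_in = rng.normal(size=8)
steered = layer(h_in, steering=steer)
unsteered = layer(h_in)
```

By construction, the steered and unsteered outputs differ exactly by the steering vector; no gradients or fine-tuning are involved, which is why the method is cheap at inference time.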

709 words, 4 min

Paper Note

Safety and Ethical Concerns of Large Language Models

I will be giving a seminar at ModelBest (面壁智能) on Sep 20, 2023, in 科技园, Haidian, Beijing. The seminar will be in Chinese, titled "大模型安全与伦理问题" (translation: Safety and Ethical Concerns of Large Language Models). Below is a list of references.

635 words, 3 min

Thoughts