<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Paper Note on Yingfa Chen 陈英发</title><link>https://chen-yingfa.github.io/categories/paper-note/</link><description>Recent content in Paper Note on Yingfa Chen 陈英发</description><generator>Hugo -- 0.146.6</generator><language>en-us</language><lastBuildDate>Sat, 07 Oct 2023 17:55:33 +0000</lastBuildDate><atom:link href="https://chen-yingfa.github.io/categories/paper-note/index.xml" rel="self" type="application/rss+xml"/><item><title>Activation Addition (ActAdd)</title><link>https://chen-yingfa.github.io/research_posts/2023-actadd/</link><pubDate>Sat, 07 Oct 2023 17:55:33 +0000</pubDate><guid>https://chen-yingfa.github.io/research_posts/2023-actadd/</guid><description>&lt;p>&lt;a href="https://arxiv.org/abs/2308.10248">Paper&lt;/a>&lt;/p>
&lt;p>TLDR: Propose &lt;strong>ActAdd&lt;/strong>, a method for controlling model behavior at inference time by adding a bias term to the activations, where the bias is computed from a pair of prompts.&lt;/p>
&lt;p>Summary:&lt;/p>
&lt;ul>
&lt;li>Propose &lt;strong>ActAdd&lt;/strong>, a method for controlling model behavior by modifying activations at inference time.&lt;/li>
&lt;li>Steering vectors are computed as the difference between the activations produced by a pair of prompts. The vectors are then added to the activations as a bias during inference.&lt;/li>
&lt;li>ActAdd provides control over high-level properties of the output, preserves off-target model performance, and has low computational and implementation costs.&lt;/li>
&lt;/ul>
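&lt;p>The steering-vector idea in the summary can be sketched in a few lines. This is only a toy illustration on plain Python lists, not the paper's implementation: the function names &lt;code>steering_vector&lt;/code> and &lt;code>add_steering&lt;/code>, the scaling factor &lt;code>coeff&lt;/code>, and the example numbers are all assumptions made here, and a real setup would read and modify activations inside a chosen layer of the model.&lt;/p>

```python
from typing import List

Vector = List[float]


def steering_vector(act_pos: Vector, act_neg: Vector, coeff: float = 1.0) -> Vector:
    """Scaled difference between the activations of two contrasting prompts.

    `coeff` is a hypothetical strength knob introduced for this sketch.
    """
    if len(act_pos) != len(act_neg):
        raise ValueError("activations must have the same dimension")
    return [coeff * (p - n) for p, n in zip(act_pos, act_neg)]


def add_steering(activation: Vector, steer: Vector) -> Vector:
    """Add the steering vector to an activation as a fixed bias."""
    if len(activation) != len(steer):
        raise ValueError("activation and steering vector must match in size")
    return [a + s for a, s in zip(activation, steer)]


# Made-up activations standing in for the two prompts of a pair.
act_pos = [1.0, 0.5, -2.0]
act_neg = [0.0, 0.5, 1.0]

steer = steering_vector(act_pos, act_neg, coeff=2.0)

# Made-up activation from an unrelated prompt, steered at inference time.
steered = add_steering([0.25, 0.25, 0.25], steer)
print(steer, steered)
```

&lt;p>Nothing is trained in this sketch: the vector comes from forward-pass activations alone, which is consistent with the low cost noted in the summary.&lt;/p>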
&lt;!-- more -->
&lt;blockquote>
&lt;p>The recently popular &lt;a href="https://arxiv.org/abs/2310.01405">representation engineering paper&lt;/a> (RepE) seems to be largely inspired by this work.&lt;/p></description></item></channel></rss>