Blog

염규환
19 March 2026

OrthAlign: Orthogonal Subspace Decomposition for Non-Interfering Multi-Objective Alignment

ICLR'26 Poster

💡 When optimizing for multiple preferences, decompose the parameter update space into orthogonal subspaces so that interference between objectives is eliminated at its source.

19 March 2026

Multiplayer Nash Preference Optimization

ICLR'26 Poster

💡 The goal of alignment should not be to maximize reward, but to reach a stable equilibrium that loses to no one within a population of many values and policies!

19 March 2026

How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence

COLM'25

💡 A mechanistic analysis of how a model's internal knowledge, truthfulness, safety, and confidence change after post-training!