Blog

염규환
26 March 2026

TROLL: Trust Regions Improve Reinforcement Learning for Large Language Models

ICLR'26 Oral

💡 When training an LLM with RL, the model breaks down if it changes too much in a single update, so let's update it only within an allowed range and train it safely.

26 March 2026

SEAL: Steerable Reasoning Calibration of Large Language Models for Free

COLM'25

💡 Let's curb the tendency toward overly long and complex reasoning! ⇒ Classify the reasoning process into three stages and analyze which of them should be reduced.

26 March 2026

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models

COLM'25

💡 Refusal tokens make the model's refusals more fine-grained (performance ↑) and more flexible (adjustable at inference time)!