Blog

최민영
26 March 2026

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

COLM'25

💡 Training a model to critique noisy answers improves reasoning performance more than SFT, which simply imitates the correct answer! Let's apply the way humans learn (critical thinking, analysis, understanding…) to model training.

26 March 2026

Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

COLM'25

💡 If current reasoning optimization does not separately align models for cooperation, it can produce individualistic models that favor rational self-interest over cooperation! In other words, reasoning ability and the ability to collaborate (in the sense of willingly bearing costs) are separate things!