SVD

이승환
26 March 2026

Language Model Personalization via Reward Factorization

COLM'25

💡 Decompose the preferences of many users into shared preference axes (e.g., friendliness, conciseness, formality) and learn those axes; then, when a new user arrives, assign a different weight to each axis to quickly estimate that user's personalized preference!
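
The idea above can be illustrated with a small NumPy sketch. This is not the paper's training procedure. It assumes synthetic data, direct scalar reward scores instead of pairwise comparisons, and a plain truncated SVD followed by least squares. All sizes and variable names (`n_users`, `basis`, `w_hat`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 50 known users, 200 candidate responses, 3 latent axes.
n_users, n_items, k = 50, 200, 3

# Synthetic ground truth (assumption): each response has a k-dim feature vector,
# each user has a k-dim weight vector, and the reward is their inner product.
true_features = rng.normal(size=(n_items, k))
true_weights = rng.normal(size=(n_users, k))
rewards = true_weights @ true_features.T  # shape (n_users, n_items)

# Step 1: factorize the user-by-response reward matrix with a truncated SVD.
# The top-k right singular vectors act as the shared preference axes.
U, S, Vt = np.linalg.svd(rewards, full_matrices=False)
basis = Vt[:k].T  # shape (n_items, k)

# Step 2: a new user scores only a handful of responses.
new_user_weights = rng.normal(size=k)
new_user_rewards = true_features @ new_user_weights
observed = rng.choice(n_items, size=10, replace=False)

# Only k weights need to be fitted, so a few observations are enough.
w_hat, *_ = np.linalg.lstsq(
    basis[observed], new_user_rewards[observed], rcond=None
)

# Step 3: predict the new user's reward for every response, seen or unseen.
predicted = basis @ w_hat
max_abs_error = float(np.max(np.abs(predicted - new_user_rewards)))
print(f"max abs prediction error: {max_abs_error:.2e}")
```

In this noise-free toy setting, 10 scored responses are enough to recover the new user's rewards on all 200 responses, because only 3 weights are unknown. The recovered axes are identified only up to an invertible linear transform, so `w_hat` need not match `new_user_weights` even when the predictions agree.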