27 March 2026

FRESH IN MEMORY: TRAINING-ORDER RECENCY IS LINEARLY ENCODED IN LANGUAGE MODEL ACTIVATIONS

💡 Language models know both "what" they learned and "when" they learned it. ⇒ Let's verify this through a range of controlled experiments!


Review

Nickname / Strength & Weakness & Suggestions / Rating (0/5)
눈물 • Strength: A study that goes beyond the usual question of "what" an LLM knows and verifies "when" it came to know it. Even after controlling for every "simple" factor that might plausibly relate to training order, the results suggest that training order is not a trivially arising structure but a complex structure inside the model.
• Weakness: Many controlled experiments on training order were run, but in the end the actual cause of the effect was not identified. Also, because the work is limited to sequence data, applying it would likely depend on fine-tuning.
• Suggestion: Used well, this could be efficient for handling data that varies over time. Running the controlled experiments on more complex and more diverse models would improve generality.
Rating: 3.1
피땀 • Strength: Proposes a new axis along which LLM interpretability can be analyzed.
• Weakness & suggestion: The motivation for why we need to know when something was learned is thin. It would have been better to say more about why "when it was learned" matters and to back that up with actual experiments, e.g., information updates, unlearning, and so on,
as in Section 5 of the paper "What's In My Human Feedback? Learning Interpretable Descriptions of Preference Data".
Rating: 3.8
thumps-up • Pros: It is genuinely fascinating that training order is clearly encoded inside the model. Then again, that is probably why incremental learning works?
The thorough experiments across model families and model sizes are also good.
• Cons: But why do we need to know when something was learned? Is it for model editing that pinpoints only what was learned at a given point? The lack of rationale is a letdown.
Rating: 3.5
웃으면서 보자 • Pros: A fresh perspective. Beyond just knowing the order, there does seem to be a direction for actually exploiting it.
• Cons: Why do this at all? The paper leaves the reader to work that out. Personally, I think it will be useful later. Models will keep learning newer knowledge, so the use case is answering a question like "what is the last thing you learned?". It will be needed because once LLMs become smarter than humans they will probably self-train, and then only the LLM would know the latest techniques. What then?
• Suggestion: Create more fake data and experiment with it, also considering whether conflicts arise.
Rating: 3.7
독수리오형제 • Strength: Shows well that even when something was learned (recency) is encoded in the activations. That training order is encoded along a linear direction is also a new finding.
• Weakness: Analysis of how this phenomenon affects the model's actual predictions is lacking.
• Suggestion: In future model training, datasets could be organized this way, and when a conflict between related entities later arises, this recency signal could be used to resolve it easily.
Rating: 4.2
삐질 • Strength: Information about "when the model learned something" is expected to matter when information from different points in time conflicts, and for knowledge editing.
• Weakness: The dataset feels too artificial. The data groups are completely independent, whereas real pretraining data would have noise and mixed ordering, so it is doubtful whether this holds on a real-world corpus.
• Suggestion: Experiments with more realistic data (reflecting noise or duplication).
Rating: 3.5
팝콘 • Pros: A new finding that LLMs are aware of the order in which they were trained.
• Cons: Perhaps because the grounding for the hypothesis is weak, the experimental results are not very convincing even after reading them. More varied experimental settings would have been more persuasive.
• Suggestion: Why does the model encode training order? How could that information be used? → Provide a related interpretation.
Rating: 3.5
초콜릿 • Pros: The question of when a model learned something was fresh in itself. The idea that training order is linearly encoded in activation space was intuitive and easy to grasp.
• Weakness: The experiments were run only on an artificial dataset with entities replaced by aliases, so it is unknown whether the same phenomenon appears in a setting like real pretraining data, which is noisy and has shuffled ordering.
• Suggestion: It would be good to also experiment with data containing noise or duplicates, closer to a real pretraining setting.
Rating: 3.5
파이어 • Pros: The finding that the LLM can recognize information about when it learned something.
• Cons: Aliases and timestamps were used artificially, and it is doubtful whether this would carry over and be learned well in a general setting.
• Suggestion: Experiments on encoding training order with real datasets seem necessary.
Rating: 3.7
멍쿠림보 • It is fascinating that the training trajectory is encoded internally! In hindsight it is obvious that the model changes depending on the order, but it is hard to get a feel for how it holds that information internally. There is already a lot of work optimizing the trajectory, such as curriculum learning, so it is unclear where this research could be put to use. Perhaps in unlearning, since one could see whether things learned early or late in training are forgotten more easily.
Rating: 3.7

TL; DR

💡

Language models know both "what" they learned and "when" they learned it.

⇒ Let's verify this through a range of controlled experiments!

Summary

  • Authors
  • github : x
  • Citations : 2

Background & Motivation

  • Work on LLMs usually focuses on "does the model know this knowledge?", but "when did it learn this knowledge?" has not been explored.

    ⇒ So, on top of "what it knows", what would happen if the model could implicitly timestamp every piece of data it trains on?

💡

Let's explore the possibility that an LLM goes beyond simply knowing "what"

and also internally distinguishes information about "when" it learned it.

Contributions (What they've revealed)

💡
  1. Shows for the first time that training order is linearly encoded in LLM activations (hidden vectors).
  2. This encoded information represents "recency".
  3. The recency information is not a mere artifact but genuine representational information.

    ⇒ It does not arise trivially!

  4. The model itself can use the recency information directly!
Experimental Setup

To check whether the model remembers when it learned something, the training order has to be controlled.

  1. Dataset: a QA dataset built on famous-person entities

    ⇒ There are 16,000 entities in total, each with 6 fixed QA questions.

    (When and where the person was born and died, what they did, and so on.)

  2. Replace every entity in the QA dataset with an alias ⇒ synthetic

    e.g., Einstein → sjdfef (random token)

    ⇒ This removes the influence of the pretrained model's knowledge.

    (The pretrained model may already know about Einstein, so the name is swapped for a new token.)

    Besides the fixed question templates, natural-language versions of the questions were also added.
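
    The alias step above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `make_aliases` and `apply_alias`, the 6-letter lowercase alias format, and the sample question are all assumptions made for the example.

```python
import random
import string

def make_aliases(entities, length=6, seed=0):
    """Map each entity name to a unique random lowercase string (hypothetical alias format)."""
    rng = random.Random(seed)
    aliases, used = {}, set()
    for name in entities:
        alias = "".join(rng.choices(string.ascii_lowercase, k=length))
        while alias in used:  # re-draw on the rare collision
            alias = "".join(rng.choices(string.ascii_lowercase, k=length))
        used.add(alias)
        aliases[name] = alias
    return aliases

def apply_alias(text, aliases):
    """Replace every known entity name in a QA string with its alias."""
    for name, alias in aliases.items():
        text = text.replace(name, alias)
    return text

aliases = make_aliases(["Einstein", "Curie"])
question = apply_alias("When was Einstein born?", aliases)
```

    Because the alias is a string the pretrained model has never seen, whatever the model later encodes about it has to come from fine-tuning.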


  3. Build the test samples

    A test sample keeps the alias fixed and changes only the template relative to the QA templates used in training.

    (That is, a sample that takes an entity (= alias) used in fine-tuning and asks about it again with a "differently phrased question".)

    ⭐ Across samples, token length, position, and question template must be kept the same.

    (So that conditions are matched!)


  4. Split all entities into groups. <E>

    Partition all entities into groups (sets) E_1, ..., E_m. The sets are disjoint!

    (m = 2 or m = 6 is reportedly used.)


  5. Build a QA dataset for each entity group <D>

    For each E_i, pull from the QA dataset every question that contains an entity (alias) appearing in E_i,

    which yields the QA data subsets D_1, ..., D_m.


  6. Fine-tune the model on the QA datasets D_1, ..., D_m.

    A Llama model is trained on D_1, ..., D_m in order, 5 epochs per dataset.

    (D1 = 5 epochs / D2 = 5 epochs / ... / Dm = 5 epochs), strictly in order!
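
    Steps 4 to 6 can be sketched as below: partition the aliases into disjoint groups, collect each group's QA pairs, and lay out the sequential schedule. This is a toy sketch under stated assumptions: the function names, the `{"alias": ..., "question": ...}` record shape, and the 12 placeholder aliases are hypothetical, and no actual fine-tuning is performed.

```python
import random

def partition_entities(aliases, m, seed=0):
    """Shuffle aliases and split them into m disjoint groups E_1..E_m."""
    rng = random.Random(seed)
    shuffled = list(aliases)
    rng.shuffle(shuffled)
    return [shuffled[i::m] for i in range(m)]

def build_stage_datasets(groups, qa_pairs):
    """D_i = all QA pairs whose alias belongs to group E_i."""
    datasets = []
    for group in groups:
        members = set(group)
        datasets.append([qa for qa in qa_pairs if qa["alias"] in members])
    return datasets

def training_schedule(num_stages, epochs_per_stage=5):
    """Ordered (stage, epoch) steps: all epochs of D_1, then all of D_2, and so on."""
    return [(stage, epoch)
            for stage in range(1, num_stages + 1)
            for epoch in range(1, epochs_per_stage + 1)]

aliases = [f"alias{i}" for i in range(12)]  # placeholder aliases
qa_pairs = [{"alias": a, "question": f"Where was {a} born?"} for a in aliases]
groups = partition_entities(aliases, m=6)
datasets = build_stage_datasets(groups, qa_pairs)
schedule = training_schedule(len(datasets))
```

    The point of the schedule is that stage i is fully finished before stage i+1 starts, so each alias has exactly one well-defined training stage.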


  7. Feed the test samples to the trained model

    The QA test samples keep the aliases from D_1, ..., D_m unchanged but phrase the question differently.


  8. Extract the activations for each input.
    • For every input, extract the hidden vector of each token at each layer.

      (N_layers × N_tokens) vectors


  9. Average the activations from each dataset to get a centroid, and plot the centroids along a line.
  • Marker meaning: the different question templates used for the test samples
  • x-axis: every centroid orthogonally projected onto the direction vector from c1 to c6
  • y-axis: PCA applied to whatever the x-axis does not explain (not very meaningful)

    ⭐ If all centroids fall in order along the c1→c6 direction, then the activation space contains an axis for temporal order.
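
    The x-axis construction can be illustrated with a small sketch: compute one centroid per stage, then take each centroid's scalar coordinate along the unit vector from the first centroid to the last. The 2-D toy activations and the function names are assumptions for illustration; real hidden vectors are far higher-dimensional.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def project_on_axis(centroids):
    """Coordinate of each centroid along the unit vector from the first to the last centroid."""
    first, last = centroids[0], centroids[-1]
    direction = [b - a for a, b in zip(first, last)]
    norm = sum(d * d for d in direction) ** 0.5
    unit = [d / norm for d in direction]
    return [sum((x - a) * u for x, a, u in zip(c, first, unit)) for c in centroids]

# Toy activations: three stages drifting along the first coordinate.
stage_activations = [
    [[0.0, 1.0], [0.2, 0.8]],
    [[1.0, 1.1], [1.2, 0.9]],
    [[2.0, 1.0], [2.2, 1.2]],
]
centroids = [centroid(vs) for vs in stage_activations]
coords = project_on_axis(centroids)
is_ordered = all(a < b for a, b in zip(coords, coords[1:]))
```

    Here `is_ordered` plays the role of the visual check in the plot: the intermediate centroids are not used to define the axis, so their landing in stage order is the informative part.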

  10. Take all activations (hidden vectors), split them 8:2 into train/test, train a probe (a linear classifier), and evaluate it on the test split. (Here only two datasets are compared at a time, as binary classification.)

    e.g., Probe model = 😺

    Train (5 epochs)

    train activation1 → 😺 → entity from D1

    train activation2 → 😺 → entity from D2

    Test

    test activation1 ⇒ 😺 ⇒ an entity that was in D1!

    test activation2 ⇒ 😺 ⇒ an entity that was in D2!
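
    A linear probe of this kind can be sketched with plain logistic regression and an 80/20 split. This is an assumption-laden toy: the 2-D Gaussian "activations", the batch gradient-descent trainer, and its hyperparameters are stand-ins, not the probe configuration used in the paper.

```python
import math
import random

def train_probe(xs, ys, epochs=200, lr=0.5):
    """Fit logistic-regression weights (last entry is the bias) by batch gradient descent."""
    dim = len(xs[0])
    w = [0.0] * (dim + 1)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                grad[i] += (p - y) * xi
            grad[-1] += p - y
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad)]
    return w

def accuracy(w, xs, ys):
    """Fraction of samples whose predicted stage (sign of the logit) matches the label."""
    correct = 0
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
        correct += int((z > 0) == (y == 1))
    return correct / len(xs)

# Toy data: label 0 = entity from D1, label 1 = entity from D2.
rng = random.Random(0)
data = [([rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)], 0) for _ in range(100)]
data += [([rng.gauss(2.0, 0.5), rng.gauss(0.0, 0.5)], 1) for _ in range(100)]
rng.shuffle(data)
split = len(data) * 8 // 10  # 8:2 train/test split
train, test = data[:split], data[split:]
w = train_probe([x for x, _ in train], [y for _, y in train])
test_acc = accuracy(w, [x for x, _ in test], [y for _, y in test])
```

    A probe this simple can only find a separating hyperplane, which is why high test accuracy is read as evidence of a linear encoding.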


Result

Let's interpret the results of the experiments described in Experimental Setup.

  • [1] Training order is linearly encoded in activation space.

    Training order is represented as a single direction in activation space. (In order!)

    • (a) The trained linear probe separates stages well on test entities (it correctly identifies which stage an entity came from), and the larger the gap between stages, the better the separation.
    • (b) Looking at the activations over a sample (QA sentence), the vector values grow from the point where the alias entity is recognized. They also grow as the layers get deeper.

  • [2] What does the training-order direction actually mean?

    Let's interpret the linear order/direction that showed up in [1].

    💡

    If the model knows the training order, the most recently trained data should sit furthest to the right on the line.

    ⇒ Take the model fine-tuned on D1 to D6, fine-tune it once more on one randomly chosen dataset from D1 to D6, and plot the result on the line.

    (a) The activations of the additionally trained dataset end up furthest to the right.

    (b) An additional experiment confirming that, as the number of epochs grows, they move further away from the previous stage's centroid.

    ❓ Isn't this simply recency, in the sense that whatever was trained last looks recent?

    ⇒ What if we mix D1 to D6 (mixing data) and continue training "without any order information"?

    Wouldn't the original training order disappear?

    After training on the mixed data, the activations show that the training order is still preserved.

    ⇒ So training order is not just "most recent" information; the model stores training order more deeply inside.

    💡

    Training order is not plain recency information;

    it is recency information plus the model's complex internal history.

  • [3] The training-order signal does not arise by chance.

    The original experiments used Llama-3.2-1B, but even when

    • swapping in the Qwen2.5-0.5B / 1.5B / 3B models,
    • using LoRA instead of full fine-tuning,
    • reducing the number of epochs and increasing the amount of data,

      ⇒ the training-order signal still appeared.

      (What matters seems to be the order of the training data, not the parameters.)

    Now remove the ordering of the training data and train again.

    • Training on D1 to D6 in random order with no sequence,
    • running without any fine-tuning,
    • shuffling the labels at random when training the probe and then evaluating:

      ⇒ the training-order signal did not appear.

      💡

      Training order only emerges when training happens sequentially.

  • [4] Training order is strongly encoded at the entity level.
    • Original training setup: across D1 to D6, only the entities differ; the question templates are identical.

      ⇒ Training order shows up clearly at the entity level!

    • New setup: across D1 to D6, the entities are the same and the question templates differ.

      ⇒ Does training order also show up clearly at the sample (question) level?

      Probe accuracy / Level    Entity-level    Sample-level
      Accuracy                  90%             60%
      Accuracy (Mixing)         63%             50%

      💡

      At the sample level, training order shows up only weakly.

  • [5] The model can use training order directly, internally.

    Starting from the trained model, which already carries the training-order signal, build a task of the form

    Q : <Which training stage is this <alias> from?>

    A : D1

    and train the model on it.

    The model was then evaluated directly and reached 80% accuracy.

    💡

    The recency information can be used by the model itself, without setting up a separate external probe.


Simple Explanations cannot fully account for The Effect

This training-order effect cannot be explained away simply!

⇒ Couldn't training order just come from simple statistical features? Isn't this over-interpretation?

💡

Let's verify whether training order arises from genuine internal structural information,

or is merely an incidental, artificially produced result (= artifact).

  • Training order does not come from simple information.
    • Distribution of activation magnitudes

      Even though D1 and D2 have the same magnitude distribution, the probe results above were good.

      ⇒ The magnitude distribution cannot explain training order.


    • PCA (principal component analysis)

      Looking at the principal components of the D1 and D2 activations, the two are not separable.

      If this were a simple feature, PCA should have separated them, but it does not,

      so the signal is unrelated to simple statistical features!


    • Cosine similarity

      s_{11} : how similar the activations within D1 are to each other

      s_{22} : how similar the activations within D2 are to each other

      s_{12} : how similar D1 and D2 are to each other

      For cosine similarity to explain training order, s_{12} would have to be small.

      But it comes out large, so training order cannot be expressed by cosine similarity.
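
      The three quantities can be computed as mean pairwise cosine similarities, as in the sketch below. The averaging scheme (all unordered pairs within a group, all cross pairs between groups), the function names, and the toy 2-D vectors are assumptions; the notes do not say exactly how the paper aggregates them.

```python
import math
from itertools import combinations, product

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mean_within(group):
    """Average cosine similarity over all unordered pairs inside one group (s_11 or s_22)."""
    pairs = list(combinations(group, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

def mean_between(group_a, group_b):
    """Average cosine similarity over all cross-group pairs (s_12)."""
    pairs = list(product(group_a, group_b))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy activations for two stages that point in nearly the same direction.
d1 = [[1.0, 0.1], [0.9, 0.2], [1.1, 0.0]]
d2 = [[1.0, 0.15], [0.95, 0.1], [1.05, 0.2]]
s11, s22, s12 = mean_within(d1), mean_within(d2), mean_between(d1, d2)
```

      When `s12` is about as large as `s11` and `s22`, the two stages are not distinguishable by angle alone, which is the situation the notes describe.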


  • If the probe still works after every statistical property of the D1 and D2 activations has been matched, then training order is made up of complex internal structural information.

    Could some condition be driving the training-order signal? This step checks that through several experiments.

    • Activation (6 statistics)
      • max value
      • L2 norm
      • mean
      • std
      • skewness
      • kurtosis

    • Logit (7 statistics)
      • entropy
      • max logit
      • logsumexp
      • mean
      • std
      • skewness
      • kurtosis

    Train and evaluate the probe with all of the statistics above matched between D1 and D2.

    ⇒ If the probe still performs well, the training-order signal is there!

    In other words, if training order survives even when every condition down to these statistics is matched, then it is not explained by these statistical properties either.
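
    For concreteness, the six activation statistics can be computed per hidden vector roughly as follows. The choice of population (biased) moments and non-excess kurtosis is an assumption, since the notes do not say which convention the paper uses, and `activation_stats` is a hypothetical helper name.

```python
import math

def activation_stats(vec):
    """Six summary statistics of one activation vector (population moments, non-excess kurtosis)."""
    n = len(vec)
    mean = sum(vec) / n
    var = sum((x - mean) ** 2 for x in vec) / n
    std = math.sqrt(var)  # assumes the vector is not constant (std > 0)
    return {
        "max": max(vec),
        "l2norm": math.sqrt(sum(x * x for x in vec)),
        "mean": mean,
        "std": std,
        "skewness": sum((x - mean) ** 3 for x in vec) / (n * std ** 3),
        "kurtosis": sum((x - mean) ** 4 for x in vec) / (n * std ** 4),
    }

stats = activation_stats([1.0, 2.0, 3.0, 4.0, 5.0])
```

    Matching D1 and D2 on these values means a probe cannot succeed just by reading off, say, a larger norm or a heavier tail for the later stage.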

    • Matching the statistics makes the data distributions similar. (Example for the activation statistics.)

    Data whose statistical properties have been matched this way is called Balancing Data.

    • How are the statistical properties of the data matched?

      The D1 and D2 datasets here are the hidden vectors produced when the QA dataset is fed to the model.

      The analysis is run after matching conditions this way. (Does training order still appear when both follow the same statistical distribution?)

      • Run the probe on the Balancing Data built above with a 4:1 train/test split.
      • As an additional experiment: a lot of data was discarded while sampling the Balancing Data, so

        also test whether "throwing away that much data has an effect of its own".

        (Random Downsampling)

      • Compare these two results with the original result.

        Probe performance was maintained both for the balancing data, where the statistics are matched, and for random downsampling, where only the amount of data is matched.

        💡

        Conclusion: training order is not driven by statistical differences in the activations or logits!



Discussion

Even after controlled experiments over many candidate variables, the training-order signal remained,

so training order appears to be a structural property inside the model.

  • Limitation
    • Experiments were run only on small models (8B scale) and simple datasets.
    • Only language models were tested, so generality is limited.
    • Training order was observed after replacing entities with aliases and fine-tuning,

      but whether it also exists during pre-training was not checked.

    • Only the phenomenon of training order was discovered; the exact mechanism behind it was not identified.

  • Positive Effect
    • When the training data carries temporal information, training order could help training and lead to better predictions.

  • Future Works
    • Test whether training order also exists during pretraining
    • Check how training order changes for contradictory or conflicting information
    • Could the model actually know its training order and regulate its own updates?
    • Could the model explain the state of its own answers?

  • Conclusion

    In addition to "what" they know, language models also store "when" they learned it.

    Despite many controlled experiments, the training-order signal persists in language models.

    💡

    An LLM's representation does not simply store knowledge;

    it is a space that also carries temporal information about the training process (= training order).

Categories

Memorization Recency research