07 January 2026

AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text


Review

Nickname | One-line review | Rating (out of 5)
동까스 | As far as I know, LLMs learn from an enormous amount of human text during training, so they tend to produce outputs close to the center of the human language distribution in semantic space. Given this architectural limitation, I wonder how an LLM could ever give creative answers. Also, it made me think that knowing a range of fields would help when writing a paper. | 3.8
마스킹테이프 | I'm not sure about the way creativity is evaluated. Couldn't the classic authors score well simply because text in the style of classic literature is sparse in the training data? For example, would fantasy count as more creative? The idea is interesting, but the research still feels incomplete. (Perhaps the paper is so well written that it covers for this.) | 3.5
귤 | I had always thought that "measuring creativity" was itself really ambiguous, but I see it can be framed as how much of a text did not exist before. Still, the evaluation could change depending on which reference corpus is taken as the baseline. On the other hand, a model generates sentences from a fixed token set and its combinations, so couldn't this be seen as a question of how novelly it combines those tokens? | 4
수면장애 | Is judging by the difference from a reference corpus really the best approach? Couldn't factuality be damaged along the way? Producing unrealistic generations is indeed one measure of creativity, but can creativity go up just by flipping facts? (e.g. "Professor Lee Kyung-ho (이경호) completed a bachelor's, master's, and PhD at Yonsei University" ↔ "Professor Lee Kyung-ho completed a bachelor's at Kyung Hee University, a master's at Korea University, and a PhD at MIT") | 3
이어폰 | Implementing creativity evaluation against a web-text reference is convincing and also quite a fresh idea. LLMs often parrot the web corpus out of nowhere, so this could be tied to fixing that kind of error as well. There seems to be plenty of room to refine the creativity metric, but it is a novel proposal. | 4
7일 | The creativity metric felt conceptually similar to a hallucination score. The closer a text is to the reference, the better it is from an accuracy standpoint, so it is a pity the two were not considered together. | 3.5
사과 | I have some doubts about using a reference corpus as the basis for evaluating creativity. I wonder whether there is a way to raise creativity while also addressing accuracy and hallucination. | 4

TL;DR

💡

Can LLMs catch up with humans in creativity? ⇒ No, not yet
Can creativity be used to tell LLMs and humans apart? ⇒ Yes, it can

Summary

  • Authors: University of Washington, AllenAI
    • + Professor Yejin Choi…!!!
  • GitHub: none
  • Citations: 14

Main Idea

How can “LLM creativity” be quantified?

Human evaluation alone is ambiguous and does not scale, so let's build a metric and put it to good use

  • Who is Salieri?

Background & Motivation

  • LLMs are starting to replace many jobs that require creativity
    • e.g. webtoons, video, writers, and so on
    • Major Hollywood studios have in fact adopted LLMs in production work such as screenwriting

    ⇒ LLMs have been exposed to (trained on) more works than any human, so might they reach a new level of creativity?

    ⇒ Can LLMs really surpass human creativity?

  • Creativity is hard to quantify and compare
    • There have been various attempts (e.g. Torrance Test of Creative Thinking), but all of them rely on human annotators

      ⇒ Subjective, and far too costly

Contributions (What they’ve revealed)

  • Proposes CREATIVITY INDEX, a metric for measuring creativity, to offer insight into human and machine creativity
    • What is CREATIVITY INDEX?
      • “How easily can a text be reconstructed from human-written text that already exists on the web?”
        • If text generated by an LLM still appears almost as-is on the web even over long contexts (i.e. as n in the n-gram grows), that LLM is less creative
          • If the text exists as-is in the reference corpus, CREATIVITY INDEX = 0
        • Conversely, if the LLM generates text in which many spans cannot be found on the web as the context gets longer, that LLM is more creative

        ** RedPajama is used as the reference corpus for the measurement

        • Why? It is a large-scale web text corpus that covers diverse domains

      • It is not a single score at one n; it is measured over a range of n-gram lengths
        • This makes it less sensitive to text length and style
    • Shows that humans, such as professional writers, are on average 66.2% more creative than LLMs
      • Across diverse domains, including novel excerpts, modern poetry, and speech transcripts
      • At both the verbatim and the semantic level
        • verbatim: whether an exactly identical n-gram is found
        • semantic: whether a semantically very close n-gram is found
        (Figure: CREATIVITY INDEX in novel writing considering both verbatim and semantic matches)
        • The human advantage over LLMs was larger when semantic matches were also considered (+102.5%) than when only verbatim matches were counted (+52.2%)
    • RLHF lowers the CREATIVITY INDEX of LLMs by 30.1% on average
      • Results: “-Base” denotes the model before RLHF

        ChatGPT, Llama2-Chat, and OLMO are the RLHF-tuned versions of their respective base models

        (Figure: based on verbatim matches)
        (Figure: based on verbatim & semantic matches)
      • This makes sense, since RLHF itself aligns the LLM to human preferences
    • Compares CREATIVITY INDEX across different groups of human writers
      • Classic literature authors such as Hemingway and Dickens show the highest level of creativity
        • Details
          • Classic Literature: classic works by famous authors
          • Popular Teen Fiction: popular teen novels taken from Goodreads’ book list
        • Insights
          • The index may be affected not only by creativity but also by factors such as the author's writing style and the period in which the work was written
          • The creativity index also varies widely within each category
            • e.g. within Popular Teen Fiction, the creativity index of 'The Hunger Games' is 35.4% higher than that of 'Twilight'
    • For GPT-4, which was trained on undisclosed data, CREATIVITY INDEX is measured indirectly
      • Unlike the other LLMs, GPT-4 was trained on data that is more recent than RedPajama and not public
      • Evaluation uses a ‘model-generated reference corpus’ produced by open models whose knowledge update time is similar to GPT-4's (Gemma-7B, Llama3-8B, Mixtral-7B)

        (1) Randomly sample 150,000 sentences from the RedPajama corpus

        (2) Have the open models generate a document-level continuation for each sentence

        Please generate a continuation for the following sentence: 
        [PROMPT SENTENCE]

        (3) Measure CREATIVITY INDEX with this generated text as the reference corpus

      • The average human creativity index is 30.3% higher than GPT-4's
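
As a rough illustration of the verbatim side of CREATIVITY INDEX described above, the sketch below scores a text by the fraction of its words that are not covered by any n-gram found in a reference corpus, then sums that fraction over a range of n. This is a minimal sketch under assumptions, not the authors' implementation: the function names, whitespace tokenization, the in-memory n-gram set, and the range n = 5..7 are all illustrative choices, and semantic matching is left out.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def build_ngram_index(reference_docs: Iterable[str], n_values: Iterable[int]) -> Set[NGram]:
    """Collect every n-gram (for the given n values) that occurs in the reference docs."""
    n_list = list(n_values)
    index: Set[NGram] = set()
    for doc in reference_docs:
        tokens = doc.lower().split()  # assumption: naive whitespace tokenization
        for n in n_list:
            for i in range(len(tokens) - n + 1):
                index.add(tuple(tokens[i:i + n]))
    return index


def uniqueness(tokens: List[str], index: Set[NGram], n: int) -> float:
    """Fraction of tokens NOT covered by any length-n n-gram found in the index."""
    if not tokens:
        return 0.0
    covered = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in index:
            for j in range(i, i + n):
                covered[j] = True
    return 1.0 - sum(covered) / len(tokens)


def creativity_index(text: str, reference_docs: List[str],
                     n_min: int = 5, n_max: int = 7) -> float:
    """Sum of per-n uniqueness over n_min..n_max (hypothetical range)."""
    tokens = text.lower().split()
    n_values = range(n_min, n_max + 1)
    index = build_ngram_index(reference_docs, n_values)
    return sum(uniqueness(tokens, index, n) for n in n_values)


if __name__ == "__main__":
    reference = ["the quick brown fox jumps over the lazy dog near the river bank"]
    copied = reference[0]
    mixed = "the quick brown fox jumps over purple elephants quietly negotiate sleepy treaties"
    print(creativity_index(copied, reference))  # text copied from the reference
    print(creativity_index(mixed, reference))   # half copied, half new
```

With this toy definition, a text that exists as-is in the reference corpus scores 0, which matches the boundary case in the notes; a real reference corpus such as RedPajama would need a proper search index instead of an in-memory set.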
  • Proposes DJ SEARCH, an algorithm that traces how existing text snippets on the web are used in a text, which can also be put to use when updating LLM versions
    (Figure: rather than looking at every pair, examine only the part beyond the diagonal)
    • Searching the reference corpus for CREATIVITY INDEX by brute force is too expensive, so the goal is to search efficiently
      • The name reportedly comes from the idea that searching the reference corpus resembles a DJ crediting the original composer when making a remix
    • Rather than comparing every (start position i, end position j) pair, find the longest matching span for each i
      • Reuse information that has already been computed along the way

      ⇒ Move two pointers so that only n-grams that start or end later than the previous one are checked

      • A verbatim match is checked first; only if there is none is a semantic match checked
      • For semantic matching, BM25 first selects the texts most similar to x, and WMD (Word Mover's Distance) is computed only on that subset to cut the cost
      • minimum n-gram length (k) = 5
      • WMD threshold = 0.95

    • Example DJ SEARCH result
      The abstract generated by ChatGPT contains far more verbatim and near-verbatim matches with existing web text than the original abstract written by Professor Elam
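
The two-pointer idea behind DJ SEARCH can be sketched as follows. This is a simplified illustration under assumptions, not the paper's algorithm as published: `dj_search_sketch` and `make_verbatim_matcher` are hypothetical names, the matcher does a naive substring scan over in-memory documents, and only verbatim matching is shown. A semantic matcher (for example a BM25 prefilter followed by a WMD check) could be passed in through the same `matches` callback.

```python
from typing import Callable, List, Sequence, Tuple

Matcher = Callable[[Sequence[str]], bool]


def make_verbatim_matcher(reference_docs: List[str]) -> Matcher:
    """Return a function that tells whether a token span occurs verbatim in any doc."""
    # Pad with spaces so that matches respect token boundaries.
    padded = [" " + " ".join(doc.lower().split()) + " " for doc in reference_docs]

    def matches(span: Sequence[str]) -> bool:
        needle = " " + " ".join(span) + " "
        return any(needle in doc for doc in padded)

    return matches


def dj_search_sketch(tokens: List[str], matches: Matcher,
                     min_len: int = 5) -> Tuple[List[bool], List[Tuple[int, int]]]:
    """Mark tokens covered by matched spans of at least min_len tokens.

    Both pointers only move forward, so the matcher is called a number of
    times proportional to the text length, not once per (i, j) pair.
    """
    covered = [False] * len(tokens)
    spans: List[Tuple[int, int]] = []  # (start, end) with end exclusive
    start, end = 0, min_len
    while end <= len(tokens):
        if matches(tokens[start:end]):
            for k in range(start, end):
                covered[k] = True
            if spans and spans[-1][0] == start:
                spans[-1] = (start, end)  # same start: keep the longer span
            else:
                spans.append((start, end))
            end += 1                       # try to extend the current span
        else:
            start += 1                     # give up on this start position
            end = max(end, start + min_len)
    return covered, spans


if __name__ == "__main__":
    reference = ["the quick brown fox jumps over the lazy dog near the river bank"]
    text = "the quick brown fox jumps over purple elephants quietly negotiate sleepy treaties"
    covered, spans = dj_search_sketch(text.lower().split(),
                                      make_verbatim_matcher(reference))
    print(spans)
    print(covered)
```

The design choice worth noting is that `end` never moves backward: once a span ending at `end` fails for one start position, shorter spans inside the region already covered are not re-checked, which is the "only n-grams that start or end later than before" restriction from the notes.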
  • Applies CREATIVITY INDEX to zero-shot black-box machine text detection
    • setting
    • result
      • Performs 30.2% better than DetectGPT
      • Outperforms GhostBuster in 5 of 6 domains
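
One way to read the detection use case above is: compute CREATIVITY INDEX for a text and flag it as machine-written when the score is low. The sketch below shows that decision rule plus a threshold-free summary (AUROC). It rests on assumptions: the notes do not give the detection setting or say which metric the 30.2% figure refers to, the threshold is hypothetical, and the scores in the usage lines are made-up toy values, not results from the paper.

```python
from typing import Sequence


def is_machine_text(creativity_index: float, threshold: float) -> bool:
    """Flag a text as machine-written when its index falls below the threshold."""
    return creativity_index < threshold


def auroc(human_scores: Sequence[float], machine_scores: Sequence[float]) -> float:
    """Probability that a random human text scores higher than a random machine text.

    Ties count as half a win. This equals the area under the ROC curve when
    a higher index is treated as evidence of human authorship.
    """
    wins = 0.0
    for h in human_scores:
        for m in machine_scores:
            if h > m:
                wins += 1.0
            elif h == m:
                wins += 0.5
    return wins / (len(human_scores) * len(machine_scores))


if __name__ == "__main__":
    # Toy values for illustration only.
    human = [2.4, 1.9, 2.8]
    machine = [1.2, 2.0, 0.9]
    print(auroc(human, machine))
    print(is_machine_text(1.0, threshold=1.5))
```

Because the rule needs only the text and a reference corpus, it requires no access to the generating model's probabilities, which is what makes it zero-shot and black-box.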
