Blog

30 December 2025

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

ICML'25

πŸ’‘When an error occurs in an LLM multi-agent system, let's automatically identify which agent caused it and when! Proposes a benchmark and evaluates the performance of current LLMs on it.

30 December 2025

To Mask or to Mirror: Human-AI Alignment in Collective Reasoning

EMNLP'25

πŸ’‘Do LLMs mirror humans, or do they shed universally held human biases and make better decisions than humans? Analysis via a leader-election experiment shows the answer differs by LLM (GPT and Gemini model humans as-is; Claude makes better choices).

μ—Όκ·œν™˜
17 December 2025

Quantifying Elicitation of Latent Capabilities in Language Models

NeurIPS'25

πŸ’‘LLMs already possess latent capabilities, and the paper quantifies, both experimentally and theoretically, that fine-tuning only a very small number of random parameters is enough to efficiently elicit those capabilities.