๐Ÿ“• CS330 Lecture 1 Introduction & Overview

์–ผ๋งˆ์ „ ํŽ˜์ด์Šค๋ถ์—์„œ Multi-task and Meta Learning ์ด๋ผ๋Š” ์ œ๋ชฉ์„ ๋‹ฌ๊ณ ์žˆ๋Š” Stanford CS330์„ ๋‹ฌ๊ณ  ์žˆ๋Š” ๊ฐ•์˜๋ฅผ ๋ณด์•„์„œ ๋“ค์–ด๋ณด๊ธฐ๋กœ ํ–ˆ๋‹ค. 14๊ฐœ ์ •๋„์˜ ๊ฐ•์˜๋ผ ๋ฐฐ์†์œผ๋กœ ์ ๋‹นํžˆ ๋นจ๋ฆฌ ๋“ค์–ด๋ด์•ผ๊ฒ ๋‹ค.

์‹œ๊ฐ„์ด ๋งŽ์ด ํ˜๋Ÿฌ์„œ(๊ฐ•์˜ ๋น„๋””์˜ค๋Š” 2019๋…„ ๊ฐ€์„) ๋‚ด์šฉ์ด ๋งŽ์ด ๋ฐ”๋€Œ๊ฒ ์ง€๋งŒ, ํ•ด๋‹น ๋‚ด์šฉ์€ ๋ฐœํ‘œ ์Šฌ๋ผ์ด๋“œ๋กœ ์–ด๋–ป๊ฒŒ ์ฑ„์›Œ๋ด์•ผ๊ฒ ๋‹ค.


  • ํ•˜๋‚˜์˜ environment์—์„œ ํ•˜๋‚˜์˜ task๋ฅผ ๋ฐฐ์šฐ๋Š”๋ฐ ์—ฌ๊ธฐ์—๋Š” ๋งŽ์€ supervision๊ณผ guidance๊ฐ€ ํ•„์š”ํ•˜๋‹ค. ์ด๊ฑด ๊ฐ•ํ™”ํ•™์Šต์ด๋‚˜ ๋กœ๋ณดํ‹ฑ์Šค, speech recognition๋“ฑ๋“ฑ ๋งŽ์€ ๋ถ„์•ผ์— ์ ์šฉ๋˜๋Š” ์ด์•ผ๊ธฐ
  • deep multi-task, meta-learning์„ ์‹ ๊ฒฝ์จ์•ผ ํ•˜๋Š” ์ด์œ 
    • ํฌ๊ณ  ๋‹ค์–‘ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ์ด์šฉํ•˜๊ณ  ํฐ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•œ๋‹ค๋ฉด ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์ด ์ž˜ generalizeํ•˜๋Š” ๊ฒƒ์€ ๊ธฐ์กด์— ์ž˜ ์•Œ๋ ค์ ธ ์žˆ๋‹ค.
      • ํ•˜์ง€๋งŒ large dataset์„ ์ด์šฉํ•  ์ˆ˜ ์—†๋‹ค๋ฉด ์ด์•ผ๊ธฐ๋Š” ๋‹ฌ๋ผ์ง„๋‹ค. (medical imaging์ด๋‚˜ robotics, medicine, recommendations ๋“ฑ๋“ฑ์„ ์ƒ๊ฐํ•ด๋ณด์ž) ๊ฐ๊ฐ์˜ ํƒœ์Šคํฌ๋ฅผ ํ•™์Šตํ•˜๊ธฐ ํž˜๋“ค์–ด์ง„๋‹ค.
      • ๋˜๋Š” long tail dataset์— ๋Œ€ํ•ด์„œ ํ•™์Šตํ•œ๋‹ค๊ณ  ์ƒ๊ฐํ•ด๋ณด์ž. ์ผ๋ฐ˜์ ์ธ supervised learning๋งŒ์œผ๋กœ๋Š” ํ•™์Šตํ•˜๊ธฐ ํž˜๋“ค๋‹ค.
      • ์•„๋‹ˆ๋ฉด ๋น ๋ฅด๊ฒŒ ์ƒˆ๋กœ์šด ํƒœ์Šคํฌ์— ๋Œ€์‘ํ•ด์•ผ ํ•  ๋•Œ๋Š”? -> ์‚ฌ๋žŒ์ด๋ผ๋ฉด ๊ธฐ์กด์˜ ์ง€์‹์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋น ๋ฅด๊ฒŒ ํ•™์Šต์ด ๊ฐ€๋Šฅํ•˜๋‹ค.
    • ์œ„์™€ ๊ฐ™์€ ์ƒํ™ฉ์—์„œ multi-task learning์ด๋‚˜ meta learning์ด ํ•„์š”ํ•˜๋‹ค.
  • ์—ฌ๊ธฐ์„œ multi-task/meta learning์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ์—ฌ๋Ÿฌ ํƒœ์Šคํฌ๊ฐ€ ๊ฐ™์€ structure๋ฅผ ๊ณต์œ ํ•ด์•ผ ํ•œ๋‹ค.
    • ๋งŒ์•ฝ ๊ด€๊ณ„๊ฐ€ ์—†์–ด๋ณด์ด๋”๋ผ๋„ ํ•œ๊ตญ์–ด ๋ฐ์ดํ„ฐ๋ผ๋ฉด ์ตœ์†Œํ•œ ํ•œ๊ตญ์–ด์˜ ๋ฃฐ์— ๋Œ€ํ•œ ๋ถ€๋ถ„์€ ๊ณต์œ ํ•œ๋‹ค๋Š” ์ ์„ ์ƒ๊ฐํ•ด๋ณด๊ณ , ์–ธ์–ด๋Š” ๋น„์Šทํ•œ ๋ชฉ์ ์„ ์œ„ํ•ด ๋งŒ๋“ค์–ด์กŒ๋‹ค๋Š” ์ ์„ ์ƒ๊ฐํ•ด๋ณด๋ฉด ์ž„์˜์˜ ํƒœ์Šคํฌ๋ณด๋‹ค๋Š” ํ›จ์”ฌ ๊ด€๊ณ„์žˆ์–ด ๋ณด์ธ๋‹ค๊ณ  ํ•œ๋‹ค.
  • informalํ•˜๊ฒŒ ๊ฐ•์˜ ์ฃผ์ œ๋ฅผ ์ •์˜ํ•ด๋ณด๋ฉด
    • multitask learning problem: learn all of tasks more quickly or more proficiently than learning them independently.
    • meta learning problem: given data/experience on previous tasks, learn a new task more quickly and/or more proficiently.
  • ๊ทธ๋Ÿผ domain adaptation๊ณผ ๋ฌด์—‡์ด ๋‹ค๋ฅผ๊นŒ.
    • domain adaptation์ด ๋ฐฐ์šฐ๋Š” ๊ฒƒ์€ ์ƒˆ๋กœ์šด ํ•™์Šต ๋ฐ์ดํ„ฐ๊ฐ€ ์ด์ „ ํ•™์Šต ๋ฐ์ดํ„ฐ์—์„œ์˜ out of distribution์ด๋ผ๋Š” ์  ์ •๋„
  • ๊ทผ๋ฐ multi task learning์€ single task learning์œผ๋กœ ๋ณผ ์ˆ˜ ์žˆ์ง€ ์•Š๋‚˜์š”?
    • dataset์˜ ํ•ฉ์ง‘ํ•ฉ์œผ๋กœ ๋ณด๊ณ  loss๋ฅผ ๊ฐ๊ฐ ํƒœ์Šคํฌ์˜ loss์˜ ํ•ฉ์œผ๋กœ ๋ณด๋ฉด ๊ทธ๋ ‡๋‹ค.
    • ๊ทผ๋ฐ ํ•ด๋‹น ๋ฐฉ๋ฒ•์€ multi task์˜ ํ•˜๋‚˜์˜ ๋ฐฉ๋ฒ•์ด์ง€ ์ „๋ถ€๊ฐ€ ์•„๋‹ˆ๊ณ , ์„œ๋กœ ๋‹ค๋ฅธ ํƒœ์Šคํฌ๋ผ๋Š” ์ •๋ณด๋กœ ๋” ๋‚˜์€ ์„ฑ๋Šฅ์„ ์œ„ํ•ด ์‹œ๋„ํ•ด๋ณผ ์ˆ˜ ์žˆ๋Š” ๊ฒƒ๋“ค์ด ์žˆ๋‹ค.
March 24, 2021
Tags: cs330