📃 Review: Fast and Accurate Deep Bidirectional Language Representation for Unsupervised Learning

I read this paper after one of the authors posted a brief explanation of it on TensorFlow Korea a little while ago. It was presented at ACL 2020, and the abstract says that on similarity tasks the model is about 12 times faster than BERT-based models while still performing reasonably well. I recently needed to handle a similarity task myself, so I reviewed it.

These are brief notes covering only the parts of the paper I needed.

Introduction

  • “Can we construct a deep bidirectional language model with a minimal inference time while maintaining the accuracy of BERT?” -> the question this paper sets out to address
  • To that end, the authors use a transformer for autoencoding. Since the model could then simply copy the input to the output, they propose two techniques to prevent that:
    • diagonal masking
    • input isolation

(Skipped.)

Language Model Baselines

  • speed baseline: Unidirectional language model
  • performance baseline: bidirectional language model

Proposed Method

Transformer based Text Autoencoder

Diagonal Masking

  • Plain scaled dot-product attention does not keep the "self-unknown" property well: each position can attend to its own token.
  • Each position of the transformer layer output is made to be a weighted sum of the other positions of \(V\), using the attention weights computed from \(Q\) and \(K\).
    • In practice, I think this can be understood as adding an identity matrix to the attention mask, so that the diagonal is masked out.
  • However, even with this mask, it is pointless if there is a residual connection, because the residual path carries each position's own input straight to the output.

Input isolation

  • \(K\) and \(V\) are fed in separately from \(Q\).
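
The sketch below shows one way to read input isolation in plain Python. Several details are my assumptions rather than something stated in these notes: the query stream starts from position embeddings only, \(K\) and \(V\) come from token-plus-position embeddings and are reused unchanged in every layer, and the residual connection is applied only on the query stream. All names (`masked_attention`, `isolated_layer`, `encode`) are hypothetical, and projections, layer normalization, and feed-forward blocks are omitted.

```python
import math

def masked_attention(queries, keys, values):
    """Scaled dot-product attention with the diagonal (i == j) masked out."""
    d = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        scores = [
            float("-inf") if i == j
            else sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for j, k in enumerate(keys)
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        out.append([sum(e / z * v[c] for e, v in zip(exps, values))
                    for c in range(len(values[0]))])
    return out

def isolated_layer(query_stream, memory):
    """Q comes from query_stream; K and V come from memory.

    The residual connection is added to the query stream only, so it
    never carries a position's own token embedding forward.
    """
    attended = masked_attention(query_stream, memory, memory)
    return [[h + a for h, a in zip(h_row, a_row)]
            for h_row, a_row in zip(query_stream, attended)]

def encode(token_emb, pos_emb, num_layers=3):
    """Stack isolated layers; K/V memory stays fixed across layers (assumption)."""
    memory = [[t + p for t, p in zip(t_row, p_row)]
              for t_row, p_row in zip(token_emb, pos_emb)]
    hidden = pos_emb  # queries start from positions only (assumption)
    for _ in range(num_layers):
        hidden = isolated_layer(hidden, memory)
    return hidden

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
positions = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
hidden = encode(tokens, positions)
print(len(hidden), len(hidden[0]))
```

Under these assumptions, the representation at position i depends only on the other tokens, even with residual connections and several layers: swapping out `tokens[1]` changes `hidden[0]` and `hidden[2]` but not `hidden[1]`.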

Experiments

I looked mainly at the Semantic Textual Similarity results rather than the others.

  • BERT appears to have been used without fine-tuning, and on STS-B this model scores higher than BERT.
    • However, the scores are far from those reported in the Sentence-BERT paper (Reimers and Gurevych, 2019), so this does not look like a comparison against BERT used to its full potential.
    • Even with the modified Transformer, only 3 layers were used, so that gap seems only natural.

๊ทธ๋ž˜๋„ ๋น ๋ฅด๊ฒŒ ๋ฝ‘์•„๋‚ด๊ณ  ์‹ถ์€ ๊ฒฝ์šฐ์—๋Š” ๊ดœ์ฐฎ์€๊ฐ€?? ์‹ถ๊ธฐ๋„ ํ•˜๋‹ค.

July 12, 2020
Tags: paper