📃 Review: Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model

April 27, 2020

This paper quantizes a model from FP32 to INT8 in TensorFlow. The authors report a 1.5x performance improvement at the cost of only a 0.5 drop in BLEU score. They also optimized the model for Intel CPUs. The arXiv link is https://arxiv.org/abs/1906.00532, and the paper comes from Intel.
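To make the FP32-to-INT8 idea concrete, here is a minimal sketch of symmetric per-tensor quantization. It is a generic scheme that I am assuming for illustration, not necessarily the one the paper uses; the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of FP32 values to the INT8 range.

    Assumption: a single scale derived from the largest absolute value.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to [-127, 127].
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate FP32 values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is where a small accuracy loss can come from.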

Tags: paper

📃 Review: Patient Knowledge Distillation for BERT Model Compression

April 16, 2020

This is a model compression paper from Microsoft, accepted at EMNLP 2019, that uses PKD (Patient Knowledge Distillation). The arXiv link is https://arxiv.org/abs/1908.09355, and the code is at GitHub - intersun/PKD-for-BERT-Model-Compression.
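For readers new to knowledge distillation, the sketch below shows the generic soft-target loss, where a student is trained to match a teacher's temperature-softened output distribution. This is the standard distillation term that I am assuming for illustration, not the PKD-specific loss from the paper; the function names and the temperature value are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: List[float],
                      teacher_logits: List[float],
                      temperature: float = 2.0) -> float:
    """Cross-entropy between softened teacher and student distributions."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


teacher = [1.0, 2.0, 3.0]
matched = distillation_loss([1.0, 2.0, 3.0], teacher)
mismatched = distillation_loss([3.0, 2.0, 1.0], teacher)
```

The loss is smallest when the student reproduces the teacher's distribution, so `matched` comes out lower than `mismatched`.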

Tags: paper

📃 Review: Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

April 16, 2020

This paper applies knowledge distillation to MT-DNN (Liu et al., 2019), which Microsoft released shortly before it. The arXiv link is https://arxiv.org/abs/1904.09482, and the code is available at GitHub - namisan/mt-dnn. Unusually, other...

Tags: paper

📃 Review: Q8BERT: Quantized 8Bit BERT

April 14, 2020

This is the Q8BERT paper from Intel, presented at NeurIPS 2019. The arXiv link is https://arxiv.org/pdf/1910.06188.pdf. It applies quantization-aware training during BERT's fine-tuning phase to compress the model 4x, and it accelerates inference using the 8-bit operations of Intel CPUs.
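The 4x figure follows from the bit widths alone: 32-bit weights stored in 8 bits take a quarter of the space. Below is a rough back-of-the-envelope sketch; the parameter count is an illustrative assumption (roughly BERT-base scale), and it ignores scale factors and any layers that might stay in FP32.

```python
def model_size_bytes(num_parameters: int, bits_per_parameter: int) -> int:
    """Storage needed for the weights alone, in bytes."""
    return num_parameters * bits_per_parameter // 8


num_parameters = 110_000_000  # illustrative assumption, not taken from the paper
fp32_size = model_size_bytes(num_parameters, 32)
int8_size = model_size_bytes(num_parameters, 8)
compression_ratio = fp32_size / int8_size
```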

Tags: paper

📃 Review: FastBERT: a Self-distilling BERT with Adaptive Inference Time

April 14, 2020

This paper also starts from the observation that BERT is too large to serve, and applies self-distillation at fine-tuning time. The work was funded by the 2019 Tencent Rhino-Bird Elite Training Program. The arXiv link is https://arxiv.org/abs/2004.02178.

Tags: paper

📃 Review: DynaBERT: Dynamic BERT with Adaptive Width and Depth

April 13, 2020

BERT and RoBERTa perform very well but need too much memory and computing power, so this paper proposes a method for compressing them. The paper is still a work in progress, and the link is https://arxiv.org/abs/2004.04037. It comes from Huawei.

Tags: paper

🐍 What Is a PEP (Python Enhancement Proposal)?

March 27, 2020

There are countless Python proposals, each named PEP followed by a number, but by what criteria should we read and judge them? Which proposals should we read, and which can we skip? The answer to these questions...

Tags: python

Deep Learning Model Serving A-Z, Part 1 - Compute Optimization and Model Compression

March 11, 2020
Tags: pytorch scatterlab tensorflow

A Custom Build of TensorFlow Serving for Performance

February 26, 2020

Below is a warning message you often see when using TensorFlow. It says that the CPU supports AVX2, AVX512F, and FMA, but the binary was not built to use those extensions. The same applies to tensorflow/serving. So I asked myself, "If it gets faster, how much faster?" and tested it.

...
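Before doing a custom build, it helps to confirm which of these extensions the target machine actually supports. Here is a minimal sketch that parses the `flags` line in the format of Linux's /proc/cpuinfo on x86; the helper name `supported_extensions` and the sample text are hypothetical.

```python
from typing import Dict, Iterable

# Extensions named in the warning, in their lowercase /proc/cpuinfo spelling.
EXTENSIONS = ("avx2", "avx512f", "fma")


def supported_extensions(cpuinfo_text: str,
                         wanted: Iterable[str] = EXTENSIONS) -> Dict[str, bool]:
    """Report which of the wanted CPU extensions appear in cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux the feature list is on lines starting with "flags".
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            flags.update(value.split())
    return {name: name in flags for name in wanted}


# Hypothetical excerpt; on a real Linux machine, read /proc/cpuinfo instead.
sample = "processor\t: 0\nflags\t\t: fpu sse4_2 fma avx avx2\n"
result = supported_extensions(sample)
```

On a real machine, pass the contents of /proc/cpuinfo instead of `sample`. Only the extensions reported as supported are worth enabling in the build.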
Tags: tensorflow