NLP - attention

토마토. 2021. 9. 24. 09:19

- Prerequisites: SMT (statistical machine translation) vs. NMT (neural machine translation)

SMT: probabilistic model, Bayes' rule, parallel data and alignment

Find the best English sentence y given the French sentence x: argmax_y P(y|x)

Use Bayes' rule to break this into two components that are learned separately: argmax_y P(x|y) P(y)

P(x|y) is the translation model and P(y) is the language model.
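As a rough illustration of the decomposition above, the sketch below picks the candidate y that maximizes P(x|y) P(y). The sentences, the probability values, and the names `translation_model`, `language_model`, and `best_translation` are all made up for the example; a real SMT system learns these quantities from parallel and monolingual data.

```python
import math

# Toy, made-up probabilities purely for illustration (not from any real model).
# translation_model[(x, y)] plays the role of P(x | y); language_model[y] plays P(y).
translation_model = {
    ("il pleut", "it rains"): 0.6,
    ("il pleut", "it is raining"): 0.5,
    ("il pleut", "he cries"): 0.1,
}
language_model = {
    "it rains": 0.02,
    "it is raining": 0.05,
    "he cries": 0.03,
}


def best_translation(x, candidates):
    """Return argmax_y P(x|y) * P(y), scored in log space to avoid underflow."""
    def score(y):
        return math.log(translation_model[(x, y)]) + math.log(language_model[y])
    return max(candidates, key=score)


best = best_translation("il pleut", list(language_model))
print(best)
```

The translation model slightly prefers "it rains" here, but the language model prior tips the product toward "it is raining", which is the point of learning the two components separately.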

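Spurious word: words in the source language and words in the translation have to be paired up (a spurious word is one with no counterpart on the other side). Learning how they relate requires data annotated with alignments.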

NMT Neural Machine Translation

single end-to-end model, seq2seq, conditional language model

NMT directly calculates P(y|x):

P(y|x) = P(y_1|x) P(y_2|y_1, x) ... P(y_T|y_1, ..., y_{T-1}, x)
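To make the factorization concrete, here is a small sketch that multiplies per-step conditional probabilities into a sentence probability, working in log space as is common practice. The numbers in `step_probs` are hypothetical values standing in for what a decoder might assign to each correct target word.

```python
import math


def sequence_log_prob(step_probs):
    """Sum of log P(y_t | y_1..y_{t-1}, x) over all target positions t."""
    return sum(math.log(p) for p in step_probs)


# Hypothetical per-step probabilities for a three-word target sentence.
step_probs = [0.5, 0.4, 0.25]

log_p = sequence_log_prob(step_probs)
p = math.exp(log_p)  # same value as 0.5 * 0.4 * 0.25
print(log_p, p)
```

Training an NMT model amounts to maximizing this log-probability (equivalently, minimizing its negative) over the parallel corpus.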

seq2seq

the sequence-to-sequence model

- The encoding of the source sentence provides the initial hidden state for the decoder RNN.

(Figure: the decoder is fed a start token and outputs the target sentence.)

The encoder RNN produces an encoding of the source sentence.

The decoder RNN is a language model that generates the target sentence, conditioned on the encoding.

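The source sentence is encoded as a single vector.

With the encoder-decoder, the objective is to produce the y that has high probability given the encoded source-language sentence/tokens.

seq2seq is optimized as a single system. Backpropagation operates "end-to-end".

The source language is encoded into one fixed vector.

That encoded vector is then decoded into the target language.

Cramming a whole sentence in like this without understanding it?? Not possible.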
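The encode-then-decode flow described above can be sketched with a tiny vanilla RNN in NumPy. This is only an illustration under stated assumptions: the weights are random and untrained, the sizes, the `START` id, and all function names are hypothetical, and a plain tanh RNN stands in for the LSTM that real systems typically use.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden = 6, 8   # hypothetical sizes
START = 0                   # hypothetical id of the start token

# Untrained random parameters, for illustration only.
E = rng.normal(size=(vocab_size, hidden))          # embedding table
W_enc = rng.normal(size=(hidden, hidden)) * 0.1    # encoder input weights
U_enc = rng.normal(size=(hidden, hidden)) * 0.1    # encoder recurrent weights
W_dec = rng.normal(size=(hidden, hidden)) * 0.1    # decoder input weights
U_dec = rng.normal(size=(hidden, hidden)) * 0.1    # decoder recurrent weights
W_out = rng.normal(size=(hidden, vocab_size)) * 0.1


def rnn_step(x_vec, h, W, U):
    """One vanilla RNN step: new hidden state from input vector and old state."""
    return np.tanh(x_vec @ W + h @ U)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def encode(source_ids):
    """Run the encoder; the final hidden state is the fixed-size sentence encoding."""
    h = np.zeros(hidden)
    for tok in source_ids:
        h = rnn_step(E[tok], h, W_enc, U_enc)
    return h


def greedy_decode(encoding, max_len=5):
    """Decoder starts from the encoding and picks the most probable token each step."""
    h, tok, out = encoding, START, []
    for _ in range(max_len):
        h = rnn_step(E[tok], h, W_dec, U_dec)
        tok = int(np.argmax(softmax(h @ W_out)))
        out.append(tok)
    return out


encoding = encode([1, 2, 3, 4])
output_ids = greedy_decode(encoding)
print(encoding.shape, output_ids)
```

Note that `encode` returns a vector of the same size whether the source has four tokens or forty, which is exactly the bottleneck the note above complains about.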

Sentence vector

λx.(state(x) ∧ border(x, e89))

sentence embeddings

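Modern NLP relies on models based on neural networks.

The traditional approach is analytic and parsing-heavy, with a lot of semantic parsing.

Building systems on top of semantics is quite hard.

Word vectors: "dog" and "cat" sit close to each other. Sentence vectors work the same way: sentences with similar meanings are mapped to nearby regions of the space.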
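"Close to each other" is usually measured with cosine similarity. The sketch below uses hand-made three-dimensional toy vectors (not real embeddings) just to show the comparison; `vectors` and `cosine_similarity` are names chosen for this example.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hand-made toy vectors, chosen only so that dog and cat point in a similar direction.
vectors = {
    "dog": np.array([0.9, 0.8, 0.1]),
    "cat": np.array([0.8, 0.9, 0.2]),
    "car": np.array([0.1, 0.2, 0.9]),
}

sim_dog_cat = cosine_similarity(vectors["dog"], vectors["cat"])
sim_dog_car = cosine_similarity(vectors["dog"], vectors["car"])
print(sim_dog_cat, sim_dog_car)
```

The same function applies unchanged to sentence vectors, since they are also just fixed-size arrays.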

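What are the problems with seq2seq?

It tends to output the same word over and over.

Reason: this happens because the LSTM has forgotten the input.

Problems with seq2seq

When the input sentence gets longer,

It does not handle UNK (unknown token) words well. Encoding a rare word such as Pont-de-buis properly as a vector is very hard. A better strategy is to copy it from the source at decoding time.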
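One simple way to realize the copy strategy is to replace every unknown token in the output with the source token it is most strongly aligned to. The sketch below assumes some alignment weights between output and source positions are already available; the `alignment` matrix, the `<unk>` symbol, the example sentences, and the `replace_unk` helper are all hypothetical.

```python
import numpy as np


def replace_unk(source_tokens, output_tokens, alignment, unk="<unk>"):
    """Replace each unknown output token with its most strongly aligned source token.

    alignment[t, i] is the weight between output position t and source position i.
    """
    result = []
    for t, tok in enumerate(output_tokens):
        if tok == unk:
            tok = source_tokens[int(np.argmax(alignment[t]))]
        result.append(tok)
    return result


source = ["Pont-de-buis", "est", "petit"]
output = ["<unk>", "is", "small"]
# Hypothetical alignment weights; each row sums to 1.
alignment = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

fixed = replace_unk(source, output, alignment)
print(fixed)
```

This works for names and other rare words that should pass through untranslated; it would not help with a rare word that actually needs translating.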

aligned inputs

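If each token being decoded could be matched with just the relevant part of the source, performance would improve enormously.

However, annotating alignments by hand and hard-coding them is inefficient.

Neural Machine Translation by Jointly Learning to Align and Translate

- The key point is that attention is itself a soft alignment between the source language and the target language.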

attention

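The basic process

Taken from the Stanford CS224n slides.

How the attention score is defined is a design choice.

Compute the attention scores. This step measures the similarity/relationship between the target side and the source-language tokens that make up the encoder.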
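A minimal sketch of this step, assuming dot-product scoring (one of several possible choices): score each encoder hidden state against the current decoder state, turn the scores into a distribution with softmax, and take the weighted sum of encoder states. The array values and the function names are made up for the example.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def dot_product_attention(decoder_state, encoder_states):
    """Return (attention weights over source positions, attention output vector)."""
    scores = encoder_states @ decoder_state   # one score per source position
    weights = softmax(scores)                 # attention distribution, sums to 1
    context = weights @ encoder_states        # weighted sum of encoder states
    return weights, context


# Toy values: three source positions, hidden size 2.
encoder_states = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
decoder_state = np.array([2.0, 0.0])

weights, context = dot_product_attention(decoder_state, encoder_states)
print(weights, context)
```

In a full model the `context` vector is typically concatenated with the decoder state before predicting the next word, so the decoder no longer depends on a single fixed encoding.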

There are several kinds of attention:

- Bahdanau / Luong (dot)
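The two variants differ mainly in how the score between a decoder state s and an encoder state h is computed. As a sketch: Luong's dot variant uses the plain inner product, while Bahdanau's additive form passes both states through a small feed-forward layer, v^T tanh(W_h h + W_s s). The parameter names, dimensions, and random weights below are assumptions for illustration; in a real model they are learned.

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_att = 4, 3  # hypothetical hidden size and attention layer size


def dot_score(s, h):
    """Luong-style dot score: requires s and h to have the same dimension."""
    return float(s @ h)


# Randomly initialised parameters for the additive score (learned in practice).
W_h = rng.normal(size=(d_att, d))
W_s = rng.normal(size=(d_att, d))
v = rng.normal(size=d_att)


def additive_score(s, h):
    """Bahdanau-style additive score: v^T tanh(W_h h + W_s s)."""
    return float(v @ np.tanh(W_h @ h + W_s @ s))


s = rng.normal(size=d)        # one decoder state
H = rng.normal(size=(5, d))   # five encoder states

dot_scores = np.array([dot_score(s, h) for h in H])
add_scores = np.array([additive_score(s, h) for h in H])
print(dot_scores, add_scores)
```

Either set of scores is then fed through softmax in the same way; the dot form has no extra parameters, while the additive form can compare states of different sizes.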

soft alignment

- Look at how the source language being encoded relates to the language being decoded.

Target word.

Attention has come to be used very widely: text classification, interpretability.

The attention part of the lecture ends here.
