Target-oriented Opinion Words Extraction with Target-fused Neural Sequence Labeling (2019)
Contents
- Abstract
- Introduction
- Our Methods
- Task formulation
- Framework
- Target-Fused Encoder
- Decoder & training
0. Abstract
Two main tasks of ABSA (Aspect-Based Sentiment Analysis):
- (1) Opinion target extraction
- (2) Opinion words extraction
→ Few works aim to extract BOTH AS PAIRS
Proposes a novel task, TOWE (Target-oriented Opinion Words Extraction)
- extract the opinion words for a given opinion target
- to solve it, a target-fused sequence labeling NN is built
1. Introduction
Given a review and a target appearing in that review…
Goal of TOWE : extract the corresponding opinion words for that target
( the key is learning target-specific context representations )
Proposes a powerful target-fused sequence labeling NN for TOWE
Inward-Outward LSTM
- a neural encoder that incorporates target information & generates target-fused context representations
- passes target information to the left & right contexts respectively
Contributions
- (1) Proposes TOWE, a new sequence labeling subtask of ABSA
- (2) Builds a novel sequence labeling NN to solve TOWE
- it generates target-specific context representations
2. Our Methods
2-1) Task formulation
Notation
- sentence : $s = \{w_1, w_2, \ldots, w_i, \ldots, w_n\}$
- task : sequence labeling
( to extract the target-oriented opinion words; see the toy example below )
- $y_i \in \{B, I, O\}$
( B: Beginning, I: Inside, O: Others )
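To make the labeling scheme concrete, here is a toy example (the sentence and tags are my own illustration, not taken from the paper's datasets): for the target "food", only its opinion word "great" receives a B tag.

```python
# Toy TOWE example (sentence / tags are illustrative, not from the paper).
# Given the target "food", the opinion word "great" is tagged B; everything else O.
sentence = ["the", "food", "is", "great", "but", "the", "service", "was", "slow"]
target_span = (2, 2)          # 1-based target positions: w_2 = "food"
tags = ["O", "O", "O", "B", "O", "O", "O", "O", "O"]   # y_i in {B, I, O}, one per word
assert len(tags) == len(sentence)
```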
2-2) Framework
Proposes a target-fused encoder
- to incorporate target information into the context
- and learn target-specific context representations
The representations are then passed to a decoder, which performs the sequence labeling.
Model ( a rough skeleton is sketched below )
- Encoder : Inward-Outward LSTM
- Decoder ( 2 different strategies : greedy decoding or a CRF )
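As a rough orientation, a minimal PyTorch skeleton of this encoder-decoder layout might look as follows (module names, dimensions, and the projection head are my assumptions, not the authors' released code):

```python
import torch.nn as nn

class TOWEModel(nn.Module):
    """Sketch of the overall layout: embeddings -> target-fused encoder -> decoder."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, num_tags=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # target-fused encoder (section 2-3): Inward-/Outward-LSTMs + a global BiLSTM
        self.inward_left = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.inward_right = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.outward_left = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.outward_right = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.global_bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # decoder (section 2-4): greedy softmax head; a CRF layer could replace it
        self.proj = nn.Linear(4 * hidden, num_tags)   # r_i = [h^I; h^O; h^G] has 4*hidden dims
```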
2-3) Target-Fused Encoder
step 1) generate input vectors
- using an embedding lookup table $\mathbf{L} \in \mathbb{R}^{d \times |V|}$,
- map $s = \{w_1, w_2, \ldots, w_i, \ldots, w_n\}$ into $\{e_1, e_2, \ldots, e_i, \ldots, e_n\}$
step 2) split the sentence into 3 segments
- LEFT : $\{w_1, w_2, \ldots, w_l\}$
- TARGET : $\{w_{l+1}, \ldots, w_{r-1}\}$
- RIGHT : $\{w_r, \ldots, w_n\}$
( steps 1–2 are sketched in code below )
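A minimal sketch of steps 1–2 using my own toy sentence and span indices (the vocabulary, dimension, and variable names are assumptions):

```python
import torch
import torch.nn as nn

vocab = {"the": 0, "food": 1, "is": 2, "great": 3, "but": 4,
         "service": 5, "was": 6, "slow": 7}
sentence = ["the", "food", "is", "great", "but", "the", "service", "was", "slow"]
l, r = 1, 3                                  # target = {w_{l+1}, ..., w_{r-1}} = ["food"] (1-based)

# step 1) embedding lookup table L in R^{d x |V|}, here d = 100
embedding = nn.Embedding(len(vocab), 100)
ids = torch.tensor([[vocab[w] for w in sentence]])
e = embedding(ids)                           # {e_1, ..., e_n}, shape (1, n, d)

# step 2) split into LEFT / TARGET / RIGHT segments (0-based slicing of 1-based spans)
left, target, right = sentence[:l], sentence[l : r - 1], sentence[r - 1 :]
# left = ['the'], target = ['food'], right = ['is', 'great', 'but', 'the', 'service', 'was', 'slow']
```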
step 3) model the segments with a left LSTM & a right LSTM
(a) Inward-LSTM
two LSTMs run "from both ends toward the target in the middle"
- $h_i^L = \overrightarrow{\text{LSTM}}(h_{i-1}^L, e_i), \quad \forall i \in [1, \ldots, r-1]$
- $h_i^R = \overleftarrow{\text{LSTM}}(h_{i+1}^R, e_i), \quad \forall i \in [l+1, \ldots, n]$
- average the two on the target positions :
$h_i^{LR} = \dfrac{h_i^L + h_i^R}{2}, \quad \forall i \in [l+1, \ldots, r-1]$
- final context representation : $H^I = \{h_1^L, \ldots, h_l^L, h_{l+1}^{LR}, \ldots, h_{r-1}^{LR}, h_r^R, \ldots, h_n^R\}$
( a minimal sketch follows below )
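A minimal sketch of the Inward-LSTM pass, assuming an embedded sentence `e` and span indices `l`, `r` as in the previous sketch (shapes and variable names are mine):

```python
import torch
import torch.nn as nn

n, d, hidden = 9, 100, 64
e = torch.randn(1, n, d)                     # embedded sentence (batch of 1)
l, r = 1, 3                                  # target = positions l+1 .. r-1 (1-based)

left_lstm = nn.LSTM(d, hidden, batch_first=True)    # runs from w_1 toward the target
right_lstm = nn.LSTM(d, hidden, batch_first=True)   # runs from w_n toward the target

# h^L_i for i in [1, r-1]: feed w_1 .. w_{r-1} left to right
hL, _ = left_lstm(e[:, : r - 1])
# h^R_i for i in [l+1, n]: feed w_n .. w_{l+1} (reversed), then flip back to sentence order
hR = torch.flip(right_lstm(torch.flip(e[:, l:], dims=[1]))[0], dims=[1])

# average the two passes on the target positions, keep the single pass elsewhere
hLR = (hL[:, l : r - 1] + hR[:, : r - 1 - l]) / 2
H_I = torch.cat([hL[:, :l], hLR, hR[:, r - 1 - l :]], dim=1)   # shape (1, n, hidden)
```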
(b) Outward-LSTM
two LSTMs run "from the target in the middle toward both ends"
- computed just like (a), then averaged on the target positions, giving the representations $h_i^O$
(c) IO-LSTM = (a)+(b)
$h_i^{IO} = [h_i^I; h_i^O]$
(d) IOG : IO-LSTM + Global context
understanding the global meaning of the whole sentence is also very important!
therefore, global context is introduced!
( use a BiLSTM to model the whole sentence embeddings )
$h_i^G = [\overrightarrow{h_i}; \overleftarrow{h_i}]$, where $\overrightarrow{h_i} = \overrightarrow{\text{LSTM}}(\overrightarrow{h_{i-1}}, e_i)$ and $\overleftarrow{h_i} = \overleftarrow{\text{LSTM}}(\overleftarrow{h_{i+1}}, e_i)$
final target-specific contextualized representation $r$ for each word ( sketched below ) :
- $r_i = [h_i^{IO}; h_i^G]$
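A minimal sketch of assembling $r_i$, assuming `H_I` and `H_O` were produced by Inward-/Outward-LSTM passes like the one above (tensor names and sizes are my assumptions):

```python
import torch
import torch.nn as nn

n, d, hidden = 9, 100, 64
e = torch.randn(1, n, d)                     # embedded sentence
H_I = torch.randn(1, n, hidden)              # placeholder: Inward-LSTM output
H_O = torch.randn(1, n, hidden)              # placeholder: Outward-LSTM output

h_IO = torch.cat([H_I, H_O], dim=-1)         # h_i^IO = [h_i^I ; h_i^O]

global_bilstm = nn.LSTM(d, hidden, batch_first=True, bidirectional=True)
h_G, _ = global_bilstm(e)                    # h_i^G = [->h_i ; <-h_i], shape (1, n, 2*hidden)

r = torch.cat([h_IO, h_G], dim=-1)           # r_i = [h_i^IO ; h_i^G], shape (1, n, 4*hidden)
```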
2-4) Decoder & training
using the sequential representation $r$,
compute $p(y \mid r)$ where $y = \{y_1, \ldots, y_n\}$
(a) (decoding strategy 1) Greedy decoding
- Softmax : $p(y_i \mid r_i) = \text{softmax}(W_s r_i + b_s)$
- NLL : $L(s) = -\sum_{i=1}^{n} \sum_{k=1}^{3} \mathbb{I}(y_i = k) \log p(y_i = k \mid w_i)$
( a minimal sketch follows below )
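A minimal sketch of the greedy decoder, using PyTorch's cross-entropy as the token-level NLL (the projection layer and tag indexing are my assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n, repr_dim, num_tags = 9, 256, 3            # tags: 0 = B, 1 = I, 2 = O
r = torch.randn(1, n, repr_dim)              # target-specific representations r_i
gold = torch.randint(0, num_tags, (1, n))    # gold tag sequence y

proj = nn.Linear(repr_dim, num_tags)         # W_s r_i + b_s
logits = proj(r)                             # (1, n, 3)
probs = F.softmax(logits, dim=-1)            # p(y_i | r_i)

# NLL over all positions: L(s) = -sum_i log p(y_i = gold_i | r_i)
loss = F.cross_entropy(logits.view(-1, num_tags), gold.view(-1))
pred = logits.argmax(dim=-1)                 # greedy decoding: best tag per token
```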
(b) (decoding strategy 2) CRF ( Conditional Random Field )
- considers correlations between tags
- scores the whole sequence of tags
$p(y \mid r) = \dfrac{\exp(s(r, y))}{\sum_{y' \in Y} \exp(s(r, y'))}$
$Y$ : set of all possible tag sequences
$s(r, y) = \sum_{i=1}^{n} \left( A_{y_{i-1}, y_i} + P_{i, y_i} \right)$ : score function
- $A_{y_{i-1}, y_i}$ : transition score from $y_{i-1}$ to $y_i$
- $P_i = W_s r_i + b_s$
sentence-level loss is the NLL : $L(s) = -\log p(y \mid r)$
finally : minimize $J(\theta) = \sum_{s \in D} L(s)$ over all $|D|$ sentences in the training set $D$
( a brute-force illustration of the CRF scoring is sketched below )
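To make the CRF scoring concrete, here is a brute-force illustration with toy sizes and random scores (a real implementation would use forward-algorithm dynamic programming or a CRF library rather than enumerating all of $Y$):

```python
import itertools
import numpy as np

n, num_tags = 4, 3                           # tags: 0 = B, 1 = I, 2 = O
P = np.random.randn(n, num_tags)             # emission scores P_i = W_s r_i + b_s
A = np.random.randn(num_tags + 1, num_tags)  # transition scores; extra row = "start"
START = num_tags

def score(y):
    """s(r, y) = sum_i (A_{y_{i-1}, y_i} + P_{i, y_i})"""
    total, prev = 0.0, START
    for i, tag in enumerate(y):
        total += A[prev, tag] + P[i, tag]
        prev = tag
    return total

all_seqs = list(itertools.product(range(num_tags), repeat=n))
Z = sum(np.exp(score(y)) for y in all_seqs)          # partition over all sequences in Y
y_gold = (0, 1, 2, 2)                                # an example gold sequence (B I O O)
nll = -np.log(np.exp(score(y_gold)) / Z)             # L(s) = -log p(y | r)
best = max(all_seqs, key=score)                      # exact decoding by enumeration
```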