Target-oriented Opinion Words Extraction with Target-fused Neural Sequence Labeling (2019)

Contents

  1. Abstract
  2. Introduction
  3. Our Methods
    1. Task formulation
    2. Framework
    3. Target-Fused Encoder
    4. Decoder & training


0. Abstract

Two main tasks of ABSA

  • (1) Opinion target extraction
  • (2) Opinion words extraction

Few works aim to extract BOTH AS PAIRS


Propose a novel TOWE (=Target-oriented Opinion Words Extraction)

  • extract opinion words for a given opinion target
  • to do this, build a target-fused sequence labeling NN


1. Introduction

( Figure 2 )


given a review & target in a review…

Goal of TOWE : Extract the corresponding opinion words

( the key is learning target-specific context representations )


Propose a powerful target-fused sequence labeling NN to perform TOWE


Inward-Outward LSTM

  • neural encoder to incorporate target info & generate target-fused context
  • pass target info to the left & right context respectively


Contribution

  • (1) propose TOWE ( a sequence labeling subtask for ABSA )
  • (2) build a novel sequence labeling NN to solve TOWE
    • generate target-specific context representations


2. Our Methods

( Figure 2 )

2-1) Task formulation

Notation

  • sentence : $s=\{w_1, w_2, \dots, w_i, \dots, w_n\}$

  • task : sequence labeling

    ( in order to extract the target-oriented opinion words )

  • $y_i \in \{B, I, O\}$

    ( B: Beginning, I: Inside, O: Others; a toy example is sketched below )
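
A toy illustration of the labeling scheme ( the sentence, target, and tags below are my own example, not from the paper ):

```python
# Hypothetical example of TOWE as BIO sequence labeling (sentence and tags are illustrative only).
sentence = ["the", "screen", "is", "bright", "and", "crystal", "clear", "."]

# For the target "screen", the opinion words describing it are tagged B/I; everything else is O.
target = "screen"
labels = ["O", "O", "O", "B", "O", "B", "I", "O"]   # "bright" -> B ; "crystal clear" -> B, I

# A different target in the same sentence would get different gold labels,
# which is why the model needs target-specific context representations.
```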


2-2) Framework

Propose a target-fused encoder

  • to incorporate target information into context
  • learn target-specific context representations

Then, these are passed to the decoder to perform sequence labeling


Model

  • Encoder : Inward-Outward LSTM
  • Decoder ( 2 different strategies )


2-3) Target-Fused Encoder

step 1) generate the input vectors

  • using an embedding lookup table $L \in \mathbb{R}^{d \times |V|}$,

  • map $s=\{w_1, w_2, \dots, w_i, \dots, w_n\}$

    into $\{e_1, e_2, \dots, e_i, \dots, e_n\}$


step 2) split the sentence into 3 segments ( steps 1-2 are sketched below )

  • LEFT : $\{w_1, w_2, \dots, w_l\}$
  • TARGET : $\{w_{l+1}, \dots, w_{r-1}\}$
  • RIGHT : $\{w_r, \dots, w_n\}$
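
Before step 3, a minimal PyTorch sketch of steps 1-2 ( all sizes and the example boundaries l, r are assumptions; the paper does not prescribe an implementation ):

```python
import torch
import torch.nn as nn

# Step 1: map word ids through the embedding lookup table L in R^{d x |V|} (toy sizes).
vocab_size, emb_dim = 10_000, 300
embedding = nn.Embedding(vocab_size, emb_dim)

n = 8
word_ids = torch.randint(0, vocab_size, (1, n))    # a toy sentence of n word ids
e = embedding(word_ids)                            # input vectors e_1 .. e_n, shape (1, n, d)

# Step 2: split into LEFT = w_1..w_l, TARGET = w_{l+1}..w_{r-1}, RIGHT = w_r..w_n (1-based),
# which in 0-based slicing becomes:
l, r = 3, 6                                        # hypothetical target boundaries
left, target, right = e[:, :l], e[:, l:r - 1], e[:, r - 1:]
```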


step 3) model them with a left LSTM & a right LSTM


(a) Inward-LSTM

Run the two LSTMs “from both ends toward the target in the middle”

  • $h_i^L=\operatorname{LSTM}(h_{i-1}^L, e_i), \quad i \in[1, \ldots, r-1]$

    $h_i^R=\operatorname{LSTM}(h_{i+1}^R, e_i), \quad i \in[l+1, \ldots, n]$

  • average these two…

    $h_i^{LR}=\frac{h_i^L+h_i^R}{2}, \quad i \in[l+1, \ldots, r-1]$


Final context representation : $H^I=\{h_1^L, \ldots, h_l^L, h_{l+1}^{LR}, \ldots, h_{r-1}^{LR}, h_r^R, \ldots, h_n^R\}$
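
A sketch of the Inward-LSTM following the equations above ( the class name, hidden sizes, and single-sentence batch handling are my assumptions, not the authors' code ):

```python
import torch
import torch.nn as nn

# Inward-LSTM sketch: the left LSTM runs w_1 -> w_{r-1}, the right LSTM runs w_n -> w_{l+1},
# so target information flows inward from both ends; the two directions are averaged
# over the target positions to form H^I.
class InwardLSTM(nn.Module):
    def __init__(self, emb_dim=300, hidden=200):
        super().__init__()
        self.left_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.right_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)

    def forward(self, e, l, r):
        # e: (1, n, d); l, r follow the post's 1-based notation.
        h_left, _ = self.left_lstm(e[:, :r - 1])                      # h^L_1 .. h^L_{r-1}
        h_right_rev, _ = self.right_lstm(torch.flip(e[:, l:], [1]))   # processes w_n -> w_{l+1}
        h_right = torch.flip(h_right_rev, [1])                        # h^R_{l+1} .. h^R_n, in order
        h_lr = (h_left[:, l:r - 1] + h_right[:, :r - 1 - l]) / 2      # average over the target span
        # H^I = left context, averaged target span, right context
        return torch.cat([h_left[:, :l], h_lr, h_right[:, r - 1 - l:]], dim=1)
```

With `e`, `l`, `r` from the previous sketch, `InwardLSTM()(e, l, r)` returns $H^I$ as a `(1, n, hidden)` tensor.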


(b) Outward-LSTM

Run the two LSTMs “from the target in the middle toward both ends”

  • computed in the same way as (a) above and then averaged ( mirrored in the sketch below )
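
The Outward-LSTM mirrored under the same assumptions: here the two LSTMs start inside the target span and run outward to the sentence ends before the same averaging step.

```python
import torch
import torch.nn as nn

# Outward-LSTM sketch: the left LSTM processes w_{r-1} -> w_1 and the right LSTM processes
# w_{l+1} -> w_n, i.e. target information is passed outward to both contexts, then the two
# directions are averaged over the target positions to form H^O.
class OutwardLSTM(nn.Module):
    def __init__(self, emb_dim=300, hidden=200):
        super().__init__()
        self.left_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.right_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)

    def forward(self, e, l, r):
        # e: (1, n, d); l, r follow the post's 1-based notation.
        h_left_rev, _ = self.left_lstm(torch.flip(e[:, :r - 1], [1]))  # processes w_{r-1} -> w_1
        h_left = torch.flip(h_left_rev, [1])                           # h^L_1 .. h^L_{r-1}, in order
        h_right, _ = self.right_lstm(e[:, l:])                         # h^R_{l+1} .. h^R_n
        h_lr = (h_left[:, l:r - 1] + h_right[:, :r - 1 - l]) / 2       # average over the target span
        # H^O = left context, averaged target span, right context
        return torch.cat([h_left[:, :l], h_lr, h_right[:, r - 1 - l:]], dim=1)
```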


(c) IO-LSTM = (a)+(b)

$h_i^{IO}=[h_i^I ; h_i^O]$


(d) IOG : IO-LSTM + Global context

Understanding the global meaning of the whole sentence is also very important!

Therefore, introduce the global context!

( use BiLSTM to model whole sentence embeddings )

$h_i^G=[\overrightarrow{h_i} ; \overleftarrow{h_i}]$

$\overrightarrow{h_i}=\operatorname{LSTM}(\overrightarrow{h_{i-1}}, e_i)$

$\overleftarrow{h_i}=\operatorname{LSTM}(\overleftarrow{h_{i+1}}, e_i)$


final target-specific contextualized representation r for each word:

  • $r_i=[h_i^{IO} ; h_i^G]$
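
A sketch combining (c) and (d), assuming the `InwardLSTM` / `OutwardLSTM` sketches above are in scope ( all sizes are toy values ):

```python
import torch
import torch.nn as nn

# Assumes InwardLSTM / OutwardLSTM from the sketches above; sizes and boundaries are toy values.
n, emb_dim, hidden = 8, 300, 200
l, r = 3, 6                                        # 1-based target boundaries, as before
e = torch.randn(1, n, emb_dim)                     # stand-in for the embedded sentence e_1..e_n

# (c) IO-LSTM: per-word concatenation of the two target-fused encoders
h_io = torch.cat([InwardLSTM(emb_dim, hidden)(e, l, r),
                  OutwardLSTM(emb_dim, hidden)(e, l, r)], dim=-1)    # h^IO_i = [h^I_i ; h^O_i]

# (d) global context from a standard BiLSTM over the whole sentence
bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
h_g, _ = bilstm(e)                                 # h^G_i = [forward h_i ; backward h_i]

r_repr = torch.cat([h_io, h_g], dim=-1)            # r_i = [h^IO_i ; h^G_i], passed to the decoder
```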


2-4) Decoder & training

Using the sequential representation $\mathbf{r}$,

compute $p(\mathbf{y} \mid \mathbf{r})$ where $\mathbf{y}=\{y_1, \ldots, y_n\}$


(a) Decoding strategy 1 : Greedy decoding

  • Softmax : $p(y_i \mid r_i)=\operatorname{softmax}(W_s r_i+b_s)$

  • NLL : $L(s)=-\sum_{i=1}^{n} \sum_{k=1}^{3} \mathbb{I}(y_i=k) \log p(y_i=k \mid w_i)$
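
A minimal sketch of the greedy decoder and its loss ( sizes are assumptions; `CrossEntropyLoss(reduction="sum")` is used here as the standard softmax + NLL combination ):

```python
import torch
import torch.nn as nn

# Greedy decoding: classify each position independently via softmax(W_s r_i + b_s)
# and train with the token-level negative log-likelihood (toy sizes).
n, rep_dim, num_tags = 8, 1000, 3                  # 3 tags: B / I / O
r = torch.randn(1, n, rep_dim)                     # target-specific representations r_1 .. r_n
gold = torch.randint(0, num_tags, (1, n))          # gold B/I/O tag ids

proj = nn.Linear(rep_dim, num_tags)                # W_s, b_s
logits = proj(r)                                   # unnormalized tag scores per position
loss = nn.CrossEntropyLoss(reduction="sum")(logits.view(-1, num_tags), gold.view(-1))  # L(s)
pred = logits.argmax(dim=-1)                       # greedy decoding: pick the best tag per word
```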


(b) Decoding strategy 2 : CRF ( Conditional Random Field )

  • correlations between tags 고려
  • score the whole sequence of tags

$p(\mathbf{y} \mid \mathbf{r})=\frac{\exp (s(\mathbf{r}, \mathbf{y}))}{\sum_{\mathbf{y}^{\prime} \in Y} \exp (s(\mathbf{r}, \mathbf{y}^{\prime}))}$


$Y$ : set of all possible tag sequences

$s(\mathbf{r}, \mathbf{y})=\sum_{i}^{n}\left(A_{y_{i-1}, y_i}+P_{i, y_i}\right)$ : score function

  • $A_{y_{i-1}, y_i}$ : transition score from $y_{i-1}$ to $y_i$
  • $P_i=W_s r_i+b_s$


Use NLL as the loss for a sentence : $L(s)=-\log p(\mathbf{y} \mid \mathbf{r})$


Finally : minimize $J(\theta)=\sum_{s \in D} L(s)$ over the whole dataset $D$
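
One possible way to realize the CRF decoder ( an assumption about tooling, not the authors' code ) is the third-party `pytorch-crf` package, which provides the transition matrix $A$, the sequence-level log-likelihood, and Viterbi decoding:

```python
import torch
import torch.nn as nn
from torchcrf import CRF                           # third-party package: pip install pytorch-crf

# Emission scores P_i = W_s r_i + b_s feed a linear-chain CRF; the CRF's learned transition
# matrix plays the role of A_{y_{i-1}, y_i}. Sizes are toy assumptions.
n, rep_dim, num_tags = 8, 1000, 3
r = torch.randn(1, n, rep_dim)                     # target-specific representations r_1 .. r_n
gold = torch.randint(0, num_tags, (1, n))          # gold B/I/O tag ids

proj = nn.Linear(rep_dim, num_tags)                # W_s, b_s -> emission scores P
emissions = proj(r)
crf = CRF(num_tags, batch_first=True)

loss = -crf(emissions, gold)                       # L(s) = -log p(y | r); summing over D gives J(theta)
best_tags = crf.decode(emissions)                  # Viterbi-decoded B/I/O tag sequence
```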