A Generalization of Forward-backward Algorithm Ai Azuma Yuji Matsumoto Nara Institute of Science and Technology.


1 A Generalization of Forward-backward Algorithm. Ai Azuma and Yuji Matsumoto, Nara Institute of Science and Technology.

2 Forward-backward algorithm: allows efficient calculation of sums (e.g., expectations) over all paths in a trellis. It plays an important role in sequence models such as HMMs (Hidden Markov Models) and CRFs (Conditional Random Fields) [Lafferty et al., 2001].

3 A sequential labeling example: part-of-speech tagging of “Time flies like an arrow”. The trellis runs from SOURCE to SINK and has one node per word and candidate tag: each of the words Time, flies, like, an, and arrow can be tagged [noun], [verb], [prep.], or [indef. art.]. In CRFs and HMMs, we need to compute the "sum" of the probabilities (or scores) of all paths.

4 The forward-backward algorithm efficiently computes sums over all paths in the trellis with dynamic programming. Enumerating all paths is intractable because their number grows exponentially with the length of the sequence. The forward-backward algorithm instead computes the sum recursively, from source to sink (forward) or from sink to source (backward), keeping intermediate results on each node and arc.
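The forward half of this recursion can be sketched in a few lines. This is a minimal illustration, not code from the presentation: the names `path_sum_forward`, `path_sum_brute_force`, `NODE_SCORES`, and `ARC_SCORES` and the example scores are assumptions, and a path's score is assumed to be the product of its node and arc scores.

```python
from itertools import product


def path_sum_forward(node_scores, arc_scores):
    """Sum of path scores over a trellis, computed by the forward recursion.

    node_scores[t][y]   : score of label y at position t (hypothetical layout)
    arc_scores[t][x][y] : score of the arc from label x at t to label y at t + 1
    """
    # alpha[y] = total score of all partial paths that end in label y.
    alpha = list(node_scores[0])
    for t in range(1, len(node_scores)):
        alpha = [
            node_scores[t][y]
            * sum(alpha[x] * arc_scores[t - 1][x][y] for x in range(len(alpha)))
            for y in range(len(node_scores[t]))
        ]
    return sum(alpha)


def path_sum_brute_force(node_scores, arc_scores):
    """Same sum by explicit enumeration; exponential in the sequence length."""
    total = 0.0
    for path in product(*(range(len(scores)) for scores in node_scores)):
        score = 1.0
        for t, y in enumerate(path):
            score *= node_scores[t][y]
            if t > 0:
                score *= arc_scores[t - 1][path[t - 1]][y]
        total += score
    return total


# Illustrative scores: 3 positions, 2 labels per position.
NODE_SCORES = [[1.0, 2.0], [0.5, 1.5], [2.0, 1.0]]
ARC_SCORES = [
    [[1.0, 0.5], [2.0, 1.0]],
    [[0.3, 0.7], [1.0, 2.0]],
]

if __name__ == "__main__":
    print(path_sum_forward(NODE_SCORES, ARC_SCORES))
    print(path_sum_brute_force(NODE_SCORES, ARC_SCORES))
```

Both functions return the same value up to rounding, but the forward version touches each arc once, while the enumeration visits every path.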

5 The forward-backward algorithm is applicable to: the normalization constant of CRFs; the E-step for HMMs; feature expectations on CRFs. Legend (the symbols are not preserved in this transcript): type of node/node pair; k-th feature; set of nodes and arcs (cliques) in a path; set of paths.

6 Types of sums computable with the forward-backward algorithm: the 0th-order moment (normalization constant) and the 1st-order moment. Legend (the symbols are not preserved in this transcript): set of nodes and arcs (cliques) in a path; set of paths.
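Since the slide's formulas did not survive, the two sums can be written out as follows. The notation is an assumption chosen for this sketch and may differ from the original slide: \mathcal{Y} is the set of paths, C(y) the set of cliques (nodes and arcs) in path y, \psi_c a nonnegative clique score, and f^{(k)}_c the value of the k-th feature on clique c.

```latex
% 0th-order moment: the normalization constant
\[
  Z \;=\; \sum_{y \in \mathcal{Y}} \; \prod_{c \in C(y)} \psi_c
\]

% 1st-order moment: dividing by Z gives the expectation of the k-th feature
\[
  \sum_{y \in \mathcal{Y}}
    \Bigl( \prod_{c \in C(y)} \psi_c \Bigr)
    \Bigl( \sum_{c \in C(y)} f^{(k)}_c \Bigr)
\]
```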

7 But sometimes we need higher-order multivariate moments. To name a few examples: correlation between features; objectives more complex than log-likelihood; derivatives of these with respect to the parameters.

8 Our goal: to generalize the forward-backward algorithm to higher-order multivariate moments!

9 Can we derive dynamic programming for this formula? Answer: record multiple forward/backward variables for each clique, and combine all the previously calculated values using the binomial theorem.


11 Figure labels: SOURCE, u. The set of paths from SOURCE to u. The ordinary forward-backward algorithm records only this variable.


13 Figure labels: SOURCE, u, v, where u ranges over the direct ancestors of v. These are derived from the binomial theorem.
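The formulas on this slide are not preserved, but a recurrence of the kind described can be reconstructed for the single-feature case. The following is a sketch under assumed notation, not necessarily the authors' exact formulation: scores \psi_{uv} and feature values f_{uv} are placed on arcs (node terms can be folded into incoming arcs), F(\pi) is the sum of feature values along a partial path \pi, and \alpha_n(v) is the n-th order forward variable at v.

```latex
% n-th order forward variable: sum over all paths \pi from SOURCE to v
\[
  \alpha_n(v) \;=\; \sum_{\pi:\,\mathrm{SOURCE} \to v}
     \Bigl( \prod_{(s,t) \in \pi} \psi_{st} \Bigr) \, F(\pi)^n ,
  \qquad F(\pi) = \sum_{(s,t) \in \pi} f_{st}
\]

% Extending a path ending at u by the arc (u, v) and applying the binomial theorem:
\[
  \bigl( F(\pi) + f_{uv} \bigr)^n
    \;=\; \sum_{m=0}^{n} \binom{n}{m} \, f_{uv}^{\,n-m} \, F(\pi)^m
\]

% Summing over the direct ancestors u of v gives the recurrence
\[
  \alpha_n(v) \;=\; \sum_{u \,\in\, \mathrm{anc}(v)} \psi_{uv}
     \sum_{m=0}^{n} \binom{n}{m} \, f_{uv}^{\,n-m} \, \alpha_m(u),
  \qquad
  \alpha_0(\mathrm{SOURCE}) = 1, \quad \alpha_n(\mathrm{SOURCE}) = 0 \ \ (n \ge 1)
\]
```

With n = 0 the recurrence reduces to the ordinary forward variable, which is why the ordinary algorithm only needs one variable per node.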

14 Figure labels: SOURCE, SINK, and the direct ancestors of SINK. The desired values are obtained at SINK.
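A small sketch can make the idea concrete for a single feature. It keeps, for every node, one forward variable per order 0..N and updates them along each arc with binomial coefficients, so that the values stored at SINK are the sums over all paths of (path score) × (path feature total)^n. Everything here is an assumption made for illustration: the function names, the arc-based scoring, the requirement that arcs are listed in topological order, and the example graph `ARCS`.

```python
from math import comb


def generalized_forward(arcs, source, sink, order):
    """Moments of orders 0..order of the path feature total, weighted by path score.

    arcs: list of (u, v, psi, f) tuples, sorted so that every arc entering a
    node appears before any arc leaving it (topological order, assumed).
    psi is the arc score and f the arc's feature value.
    """
    alpha = {source: [1.0] + [0.0] * order}
    for u, v, psi, f in arcs:
        a_u = alpha[u]
        a_v = alpha.setdefault(v, [0.0] * (order + 1))
        for n in range(order + 1):
            # Binomial expansion of (F + f)^n over the stored lower-order values.
            a_v[n] += psi * sum(
                comb(n, m) * f ** (n - m) * a_u[m] for m in range(n + 1)
            )
    return alpha[sink]


def brute_force_moments(arcs, source, sink, order):
    """Same quantities by enumerating every path (for checking only)."""
    outgoing = {}
    for u, v, psi, f in arcs:
        outgoing.setdefault(u, []).append((v, psi, f))
    totals = [0.0] * (order + 1)
    stack = [(source, 1.0, 0.0)]
    while stack:
        node, weight, feature_total = stack.pop()
        if node == sink:
            for n in range(order + 1):
                totals[n] += weight * feature_total ** n
            continue
        for v, psi, f in outgoing.get(node, []):
            stack.append((v, weight * psi, feature_total + f))
    return totals


# Illustrative DAG: S -> {a, b} -> {c, d} -> T, as (u, v, psi, f).
ARCS = [
    ("S", "a", 0.5, 1.0), ("S", "b", 1.5, 0.0),
    ("a", "c", 2.0, 1.0), ("a", "d", 1.0, 2.0),
    ("b", "c", 0.5, 3.0), ("b", "d", 1.0, 0.0),
    ("c", "T", 1.0, 1.0), ("d", "T", 2.0, 0.5),
]

if __name__ == "__main__":
    print(generalized_forward(ARCS, "S", "T", 3))
    print(brute_force_moments(ARCS, "S", "T", 3))
```

The dynamic program visits each arc once and does a fixed amount of work per arc for a given maximum order, whereas the enumeration visits every path.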

15 Summary of Our Ideas (figure labels: SOURCE, u, v). Multiple variables for each clique. Dependency between variables in a step, which is derived from the binomial theorem.

16 For multivariate cases, forward/backward variables have multiple indices.
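As an illustration of what "multiple indices" can look like, consider two features. The notation below is assumed for this sketch and is not taken from the slide: \psi_{uv} is the score of arc (u, v), f_{uv} and g_{uv} are the two feature values on that arc, F(\pi) and G(\pi) are their totals along a partial path \pi, and \mathrm{anc}(v) is the set of direct ancestors of v.

```latex
% Forward variable indexed by one order per feature
\[
  \alpha_{n_1, n_2}(v) \;=\; \sum_{\pi:\,\mathrm{SOURCE} \to v}
     \Bigl( \prod_{(s,t) \in \pi} \psi_{st} \Bigr) \, F(\pi)^{n_1} \, G(\pi)^{n_2}
\]

% Applying the binomial theorem once per feature yields the update
\[
  \alpha_{n_1, n_2}(v) \;=\; \sum_{u \,\in\, \mathrm{anc}(v)} \psi_{uv}
     \sum_{m_1 = 0}^{n_1} \sum_{m_2 = 0}^{n_2}
       \binom{n_1}{m_1} \binom{n_2}{m_2} \,
       f_{uv}^{\,n_1 - m_1} \, g_{uv}^{\,n_2 - m_2} \,
       \alpha_{m_1, m_2}(u)
\]
```

For example, \alpha_{1,1}(\mathrm{SINK}) divided by \alpha_{0,0}(\mathrm{SINK}) is the expected product of the two feature totals, which is the ingredient needed for a correlation between features.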

17 To calculate sums of the following form [formula not preserved], the computational cost of the generalized forward-backward algorithm is proportional to [formula not preserved]. The computational cost is only linear in the number of nodes and arcs in the trellis, i.e., linear in |V| and |E|.

18 Merits of the generalized forward-backward algorithm: 1. The generalized forward-backward subsumes many existing task-specific algorithms. 2. For some tasks, it leads to a solution more efficient than the existing ones.


20 Merit 1. The generalized forward-backward subsumes many existing task-specific algorithms. Tasks (the "Sum to compute" column of the original table is not preserved): parameter derivatives of the Hamming loss for CRFs [Kakade et al., 2002]; parameter derivatives of the entropy for CRFs [Mann et al., 2007]; Hessian-vector product for CRFs [Vishwanathan et al., 2006]. All these formulas have a form computable with our proposed method.
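To see why such tasks lead to higher-order multivariate moments, the entropy case can be worked out by hand. This is a derivation under assumed notation, not a reconstruction of the lost table column: a log-linear model p(y) = \exp(\theta \cdot F(y)) / Z over paths y, where F_k(y) is the total of the k-th feature along y and H is the entropy of p.

```latex
\[
  \frac{\partial p(y)}{\partial \theta_k}
    \;=\; p(y) \bigl( F_k(y) - \mathbb{E}[F_k] \bigr)
\]

\[
  \frac{\partial H}{\partial \theta_k}
    \;=\; - \sum_{y} \frac{\partial p(y)}{\partial \theta_k} \bigl( \log p(y) + 1 \bigr)
    \;=\; - \sum_{j} \theta_j
          \Bigl( \mathbb{E}[F_k F_j] - \mathbb{E}[F_k] \, \mathbb{E}[F_j] \Bigr)
\]
```

The constant terms drop out because \sum_y p(y) (F_k(y) - \mathbb{E}[F_k]) = 0. The remaining term \mathbb{E}[F_k F_j] is a sum over all paths of the path score times a product of two feature totals, divided by Z, so it is a second-order moment in two variables.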

21 The previously proposed algorithms for these tasks are task-specific. The generalized forward-backward is a task-independent algorithm applicable to formulae of the form [formula not preserved]. If a problem involves this form, it immediately offers an efficient solution.


23 Merit 2. Efficient optimization procedure with respect to Generalized Expectation Criteria for CRFs [Mann et al., 2008]. The slide compares the computational cost of the algorithm proposed in [Mann et al., 2008] with that obtained by a specialization of the generalization [cost formulas not preserved]. L = number of nodes labeled as answers.

24 Future tasks: Explore other tasks to which our generalized forward-backward algorithm is applicable. Extend the generalized forward-backward to trees and general graphs containing cycles.


26 Summary: We have generalized the forward-backward algorithm to allow for higher-order multivariate moments. The generalization offers an efficient way to compute complex models of sequences that involve higher-order multivariate moments. Many existing task-specific algorithms are instances of this generalization. It leads to a faster algorithm for computing Generalized Expectation Criteria for CRFs. Thank you for your attention!

