
1 Computing Arc Posteriors in a Word Lattice

A word lattice may be described by a set of nodes (each with a word id and a time stamp) and a set of arcs
(each with a score) connecting the nodes. A word lattice provides a compact representation of a set of sentences
(sequences of words). Typically, a word lattice is generated by a recogniser, and the arc scores are a
combination of acoustic and language model scores.
Now, consider a lattice with the following properties:
• N : number of nodes
• L : number of arcs
• I : set of nodes in the lattice, |I| = N
• J : set of arcs in the lattice, |J| = L
• s_ij : score associated with the arc connecting node i and node j (s_ij = 0 if i and j are not connected)
To obtain the posterior probability of an arc in the lattice, the forward-backward algorithm (similar to
that used in HMM training) is performed to compute α_i and β_i for i ∈ I. This is done efficiently using the
following recursions:

    \alpha_i = \sum_{k \in I} \alpha_k \, s_{ki}    (1)

    \beta_i = \sum_{k \in I} \beta_k \, s_{ik}    (2)

The following initial values are used:

    \alpha_0 = 1    (3)

    \beta_N = 1    (4)

where 0 and N denote the start and end nodes of the lattice respectively. The posterior of an arc is the
probability of making a transition through that arc. For an arc connecting nodes i and j, the arc posterior
is given by

    \gamma_{ij} = \frac{\alpha_i \, s_{ij} \, \beta_j}{\beta_0}    (5)
This is similar to the statistics used to estimate the transition probabilities for an HMM.
Note that when using the posteriors to compute the expected counts for N-grams with N > 2, it
is necessary to expand the lattice so that distinct nodes are used to represent the distinct contexts. In the
SRILM toolkit, the lattices are not expanded; instead, context-dependent forward probabilities are computed
recursively on the fly when needed.
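
As a concrete illustration, the following is a minimal Python sketch of the computation above for an
acyclic lattice. The representation is an assumption made for the sketch (arcs stored sparsely as a dict
keyed by node pairs, nodes supplied in topological order), not the format of any particular toolkit:

from collections import defaultdict

def arc_posteriors(arcs, start, end, order):
    """Forward-backward over a lattice.

    arcs  : dict mapping (i, j) -> s_ij for the connected node pairs
            (pairs absent from the dict correspond to s_ij = 0)
    start : start node, so alpha[start] = 1
    end   : end node, so beta[end] = 1
    order : all nodes in topological order, start first, end last
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for i, j in arcs:
        preds[j].append(i)
        succs[i].append(j)

    # Forward pass, equation (1): alpha_i = sum_k alpha_k * s_ki
    alpha = {start: 1.0}
    for i in order[1:]:
        alpha[i] = sum(alpha[k] * arcs[k, i] for k in preds[i])

    # Backward pass, equation (2): beta_i = sum_k beta_k * s_ik
    beta = {end: 1.0}
    for i in reversed(order[:-1]):
        beta[i] = sum(beta[k] * arcs[i, k] for k in succs[i])

    # Equation (5): gamma_ij = alpha_i * s_ij * beta_j / beta_start
    total = beta[start]
    gamma = {(i, j): alpha[i] * s * beta[j] / total
             for (i, j), s in arcs.items()}
    return alpha, beta, gamma

Storing only the connected pairs lets the sums in (1) and (2) run over actual predecessors and successors
rather than over all of I. For real recogniser lattices the scores would normally be combined in the log
domain to avoid underflow; this is omitted here for clarity.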

1.1 Example 1
Consider the following lattice:

[Figure: a lattice with nodes 1-6 and arc scores 1→2 (2), 2→3 (3), 2→4 (6), 3→5 (4), 4→5 (1), 5→6 (3).]

There are 2 paths encoded in the lattice:


Paths                Probability
1 → 2 → 3 → 5 → 6    2/3
1 → 2 → 4 → 5 → 6    1/3
The forward and backward probabilities are given by:
Nodes   1     2     3     4     5     6
α       1     2     6     12    36    108
β       108   54    12    3     3     1
The arc posteriors are given by:
γ_ij   1    2    3     4     5     6
1      —    1    —     —     —     —
2      —    —    2/3   1/3   —     —
3      —    —    —     —     2/3   —
4      —    —    —     —     1/3   —
5      —    —    —     —     —     1
6      —    —    —     —     —     —
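
Running the earlier sketch on this lattice reproduces the tables above; the arc scores are read off the
figure, and the argument names are the hypothetical ones from the sketch:

arcs = {(1, 2): 2, (2, 3): 3, (2, 4): 6, (3, 5): 4, (4, 5): 1, (5, 6): 3}
alpha, beta, gamma = arc_posteriors(arcs, start=1, end=6,
                                    order=[1, 2, 3, 4, 5, 6])
# alpha == {1: 1, 2: 2, 3: 6, 4: 12, 5: 36, 6: 108} (as floats)
# beta  == {6: 1, 5: 3, 4: 3, 3: 12, 2: 54, 1: 108}
print(gamma[2, 3], gamma[2, 4])   # 0.666... 0.333..., i.e. 2/3 and 1/3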

1.2 Example 2
Consider the following lattice:

[Figure: a lattice with nodes 1-9 and arc scores 1→2 (1), 1→3 (2), 2→4 (2), 2→5 (1), 3→5 (2), 3→6 (2), 4→7 (2), 5→7 (2), 5→8 (2), 6→8 (1), 7→9 (1), 8→9 (2).]

There are 6 paths encoded in the lattice:


Paths                Probability
1 → 2 → 4 → 7 → 9    2/21
1 → 2 → 5 → 7 → 9    1/21
1 → 2 → 5 → 8 → 9    2/21
1 → 3 → 5 → 7 → 9    4/21
1 → 3 → 5 → 8 → 9    8/21
1 → 3 → 6 → 8 → 9    4/21
The forward and backward probabilities are given by:
Nodes   1    2    3    4    5    6    7    8    9
α       1    1    2    2    5    4    14   14   42
β       42   10   16   2    6    2    1    2    1
The arc posteriors are given by:
γ_ij   1    2      3       4      5       6      7      8       9
1      —    5/21   16/21   —      —       —      —      —       —
2      —    —      —       2/21   3/21    —      —      —       —
3      —    —      —       —      12/21   4/21   —      —       —
4      —    —      —       —      —       —      2/21   —       —
5      —    —      —       —      —       —      5/21   10/21   —
6      —    —      —       —      —       —      —      4/21    —
7      —    —      —       —      —       —      —      —       7/21
8      —    —      —       —      —       —      —      —       14/21
9      —    —      —       —      —       —      —      —       —
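
Because the lattice is small, the posteriors can also be checked against their definition: the posterior of
an arc is the score mass of the paths passing through that arc, divided by the total mass. A brute-force
sketch, reusing the arc representation assumed earlier:

from collections import defaultdict

arcs = {(1, 2): 1, (1, 3): 2, (2, 4): 2, (2, 5): 1, (3, 5): 2, (3, 6): 2,
        (4, 7): 2, (5, 7): 2, (5, 8): 2, (6, 8): 1, (7, 9): 1, (8, 9): 2}
succs = defaultdict(list)
for i, j in arcs:
    succs[i].append(j)

def paths(node, path, score):
    # Depth-first enumeration of every (path, score) pair ending at node 9.
    if node == 9:
        yield path, score
        return
    for nxt in succs[node]:
        yield from paths(nxt, path + [nxt], score * arcs[node, nxt])

all_paths = list(paths(1, [1], 1))
total = sum(s for _, s in all_paths)            # 42, matching beta_1
through_58 = sum(s for p, s in all_paths
                 if (5, 8) in zip(p, p[1:]))    # mass of paths using arc 5->8
print(through_58 / total)                       # 0.476..., i.e. 10/21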

1.3 Lattice Expansion for Higher-Order N-grams
Consider a word lattice with the same structure as in Example 1:
[Figure: the lattice of Example 1 with words on the nodes: 1 = I, 2 = like, 3 = eating, 4 = cooking, 5 = western, 6 = food. Arc scores as in Example 1.]

The bigram counts are given by:


Bigrams           Path 1   Path 2   Expected
I like            1        1        1
like eating       1        0        2/3
like cooking      0        1        1/3
eating western    1        0        2/3
cooking western   0        1        1/3
western food      1        1        1
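
With words attached to nodes, the expected count of a bigram (w1, w2) is simply the sum of the posteriors
of the arcs i → j with word(i) = w1 and word(j) = w2. A small sketch, reusing the hypothetical
arc_posteriors function from earlier together with the arc scores of Example 1:

from collections import defaultdict

word = {1: 'I', 2: 'like', 3: 'eating', 4: 'cooking', 5: 'western', 6: 'food'}
arcs = {(1, 2): 2, (2, 3): 3, (2, 4): 6, (3, 5): 4, (4, 5): 1, (5, 6): 3}
_, _, gamma = arc_posteriors(arcs, start=1, end=6, order=[1, 2, 3, 4, 5, 6])

counts = defaultdict(float)
for (i, j), g in gamma.items():
    counts[word[i], word[j]] += g     # expected count = sum of arc posteriors
print(counts['like', 'eating'])       # 0.666..., i.e. 2/3, as in the table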
If trigram counts are needed, the above lattice has to be expanded as follows:
[Figure: the expanded lattice. Node 5 is split into 5a (western | eating, reached from node 3 "eating") and 5b (western | cooking, reached from node 4 "cooking"), each with its own arc to node 6 (food); all other nodes and arc scores are unchanged.]

Trigrams               Path 1   Path 2   Expected
I like eating          1        0        2/3
I like cooking         0        1        1/3
like eating western    1        0        2/3
like cooking western   0        1        1/3
eating western food    1        0        2/3
cooking western food   0        1        1/3
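
For completeness, here is a minimal sketch of the expansion step itself, under the same words-on-nodes
assumption: every node is duplicated once per distinct predecessor word, so that each expanded node has a
unique one-word left context. This is purely illustrative; as noted in Section 1, SRILM avoids the explicit
expansion:

from collections import defaultdict

def expand_by_predecessor(arcs, word, start):
    """Return an expanded arc dict whose nodes are (node, left_context) pairs."""
    contexts = defaultdict(set)
    contexts[start].add(None)      # the start node has no left context
    new_arcs = {}
    # Visiting arcs with i in topological order ensures contexts[i] is complete
    # before node i is expanded; plain sorting suffices here because the node
    # ids in these examples are already topologically ordered.
    for i, j in sorted(arcs):
        for ctx in contexts[i]:
            new_arcs[(i, ctx), (j, word[i])] = arcs[i, j]
        contexts[j].add(word[i])
    return new_arcs

word = {1: 'I', 2: 'like', 3: 'eating', 4: 'cooking', 5: 'western', 6: 'food'}
arcs = {(1, 2): 2, (2, 3): 3, (2, 4): 6, (3, 5): 4, (4, 5): 1, (5, 6): 3}
expanded = expand_by_predecessor(arcs, word, start=1)
# Node 5 now appears as (5, 'eating') and (5, 'cooking'), i.e. nodes 5a and 5b
# in the figure. Running arc_posteriors on the expanded lattice and summing
# posteriors per (left context, word(i), word(j)) triple gives the expected
# trigram counts in the table above.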
