ISM206 Lecture, Nov. 15, 2005

Markov Chains

Instructor: Kevin Ross
Scribes: Pritam Roy, John Frick and Joseph Rios

November 29, 2005

1 Outline of Topics

The topics are:
• Discrete Time Markov Chains
• Examples
• Chapman-Kolmogorov Equations
• Types of states
• Long Run Behavior and Expected/Average Cost
• Recurrence Times
• Absorbing States and Random Walks

2 Introduction

• Transitioning into problems under uncertainty, such as sensitivity analysis, decision analysis, and queueing dynamics
• Solving optimization problems that involve Markov chains

3 Discrete Time Markov Chains

In this model time is discrete, i.e. t ∈ {0, 1, 2, . . .}.

States: The current status of the system is one of (M+1) mutually exclusive categories called states. X_t represents the state of the system at time t, so its possible values are 0, 1, 2, . . . , M. We are interested in evolving states, e.g.
• inventory level
• interest rate
• number of waiting tasks in a queue.

3.1 Key Properties of a Markov Chain

The Markovian property says that the conditional probability of any future "event", given any past "events" and the present state X_t = i, is independent of the past events and depends only upon the present state:

P[X_{t+1} = j | X_0 = k_0, X_1 = k_1, . . . , X_t = i] = P[X_{t+1} = j | X_t = i]    (1)

A stochastic process which satisfies the Markovian property is called a Markov chain. For example, the amount of stock left in a shop depends only on what was there the previous day, not on every earlier day.

The conditional probabilities p_ij(t) = P[X_{t+1} = j | X_t = i] for a Markov chain are called (one-step) transition probabilities. Similarly, the n-step transition probabilities are p_ij^(n) = P[X_{t+n} = j | X_t = i]. If for each i and j,

P[X_{t+n} = j | X_t = i] = P[X_n = j | X_0 = i]    (2)

then the transition probabilities are said to be stationary.

Because the p_ij(t) are conditional probabilities, they must be non-negative, and since the process must make a transition into some state, they must satisfy the properties

p_ij^(n) ≥ 0, for all i and j; n = 0, 1, 2, . . . ,    (3)

and

Σ_{j=0}^{M} p_ij^(n) = 1, for all i; n = 0, 1, 2, . . .    (4)

A convenient way of showing all the n-step transition probabilities p_ij^(n) is the matrix form

              State     0            1           ...     M
                0    p_00^(n)     p_01^(n)       ...   p_0M^(n)
    P^(n) =     1    p_10^(n)     p_11^(n)       ...   p_1M^(n)
               ...     ...          ...          ...     ...
                M    p_M0^(n)     p_M1^(n)       ...   p_MM^(n)

for n = 0, 1, 2, . . . Note how equations (3) and (4) apply to this table: each row sums to 1 and all entries are non-negative.

3.2 Formulating the Weather Example as a Markov Chain

The weather in Canterville from day to day is formulated as
X_t = 0 if day t is dry,
X_t = 1 if day t is wet.

P[tomorrow will be dry | today is dry] = P[X_{t+1} = 0 | X_t = 0] = p_00 = 0.8
P[tomorrow will be dry | today is wet] = P[X_{t+1} = 0 | X_t = 1] = p_10 = 0.6

Furthermore, p_00 + p_01 = 1, so p_01 = 1 − 0.8 = 0.2, and p_10 + p_11 = 1, so p_11 = 1 − 0.6 = 0.4. The transition matrix is

        State    0      1
    P =   0     0.8    0.2
          1     0.6    0.4

3.3 Gambling Problem

Assume a person bets 1 dollar on every round of a game. The probability of winning a round is p and the probability of losing is (1 − p). He plays until he either has 3 dollars or goes broke. For example, if he has 1 dollar initially, then the probability of having 2 dollars after the first round (a win) is p and the probability of having 0 dollars (a loss) is (1 − p). So the 1-step transition matrix is

        State    0       1       2      3
          0      1       0       0      0
    P =   1    1 − p     0       p      0
          2      0     1 − p     0      p
          3      0       0       0      1

More generally, starting from an initial distribution X_0 over the states, the distribution evolves as X_1 = X_0 P, X_2 = X_1 P = X_0 P^2, and in general X_n = X_0 P^n. For a 4-state chain with X_0 = [1 0 0 0], one such iteration gives

X_1 = X_0 P = [0.2 0.3 0.4 0.1]
X_2 = X_1 P = X_0 P^2 = [0.15 0.32 0.34 0.19]
. . .
X_n = X_0 P^n.

Eventually the distribution reaches a fixed point, and from then on X_{i+1} = X_i.

Inference: the memoryless property means this evolution does not care about the beginning states, only about the current distribution. This property was illustrated in class through the use of Matlab; a small Python sketch of the same kind of iteration is given below.
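The following is a minimal Python/NumPy sketch of that iteration, not the exact matrix used in the Matlab demonstration (whose entries are not reproduced in these notes); it uses the weather matrix of Section 3.2, and the helper name distribution_after is ours. Two different starting distributions approach the same fixed point.

import numpy as np

# One-step transition matrix of the weather example (state 0 = dry, 1 = wet).
P = np.array([[0.8, 0.2],
              [0.6, 0.4]])

def distribution_after(x0, P, n):
    """Return X_n = X_0 P^n, the distribution over states after n steps."""
    return x0 @ np.linalg.matrix_power(P, n)

# Two different starting distributions converge to the same limit,
# illustrating that the long-run behavior forgets the beginning state.
for x0 in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    print(x0, "->", distribution_after(x0, P, 20))  # both approach [0.75, 0.25]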
4 Chapman-Kolmogorov Equations

The Chapman-Kolmogorov equations describe what happens after many steps:

p_ij^(n) = Σ_{k=0}^{M} p_ik^(m) p_kj^(n−m)    (5)

where i = 0, 1, . . . , M, j = 0, 1, . . . , M, and any m = 1, 2, . . . , n − 1, n = m + 1, m + 2, . . .

For n = 2 the expression becomes

p_ij^(2) = Σ_{k=0}^{M} p_ik p_kj    (6)

for all states i and j, where the p_ij^(2) are the elements of a matrix P^(2). These elements are obtained by multiplying the matrix of one-step transition probabilities by itself; i.e.,

P^(2) = P · P = P^2.    (7)

In general, the n-step transition probabilities can be obtained by computing the n-th power of the one-step transition matrix, i.e. P^(n) = P · P^(n−1) = P^n.

In the weather example,

P^2 = ( 0.8  0.2 ) ( 0.8  0.2 ) = ( 0.76  0.24 )
      ( 0.6  0.4 ) ( 0.6  0.4 )   ( 0.72  0.28 )

The unconditional state probabilities at time n follow from the n-step transition probabilities and the initial distribution:

P[X_n = j] = Σ_{i=0}^{M} P[X_0 = i] p_ij^(n),  i.e. the vector of state probabilities at time n is P_0 P^n,    (8)

where P_0 is the vector of initial state probabilities.

5 Classification of States of a Markov Chain

• Absorbing: A state is said to be absorbing if, upon reaching it, the process never leaves it again. State i is an absorbing state if and only if p_ii = 1.
• Accessible: A state j is said to be accessible from state i if p_ij^(n) > 0 for some n ≥ 0, i.e. one can get to j from i in some number of steps.
• Communicating: If state j is accessible from state i and state i is accessible from state j, then states i and j are said to communicate.
• Transient: A state is transient if, upon entering it, the process may never return to it again. State i is transient if there exists a state j (j ≠ i) that is accessible from i while i is not accessible from j.
• Recurrent: A state is recurrent if, upon entering it, the process is certain to return to it again.
• Periodic: A state is periodic if, upon entering it, the process can return to it only in a number of steps that is a multiple of some fixed integer greater than 1.
• Irreducible Markov chain: If all states communicate, the Markov chain cannot be simplified and is said to be irreducible.
• Ergodic: In a finite-state Markov chain, a recurrent state that is aperiodic is called ergodic.

6 Long Run Behavior

"Steady-state probabilities": for large enough n, all rows of P^n are the same, i.e. the probability of being in each state is independent of the original state. They are called "steady-state probabilities" because they no longer change with time.

For any irreducible ergodic Markov chain, lim_{n→∞} p_ij^(n) exists and is independent of i. Furthermore,

lim_{n→∞} p_ij^(n) = π_j > 0,    (9)

where the π_j uniquely satisfy the following steady-state equations:

π_j = Σ_{i=0}^{M} π_i p_ij,  for j = 0, 1, . . . , M,
Σ_{j=0}^{M} π_j = 1.

Note: steady-state probabilities are NOT the same as stationary transition probabilities.

In the weather example,

π_0 = π_0 p_00 + π_1 p_10
π_1 = π_0 p_01 + π_1 p_11
π_0 + π_1 = 1

We have three equations and two unknowns, but the first two equations are not independent (adding them only reproduces the identity π_0 + π_1 = π_0 + π_1), so the normalization equation is needed. Solving, we obtain π_0 = 0.75 and π_1 = 0.25; a numerical check appears at the end of this section.

Note the important results concerning steady-state probabilities:
• If i and j are recurrent states belonging to different classes, then p_ij^(n) = 0 for all n.
• If j is a transient state, then lim_{n→∞} p_ij^(n) = 0 for all i.

We can use long-run behavior to
• obtain the likelihood of states,
• calculate expected/average cost.
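Before moving on, here is a minimal NumPy sketch (not from the lecture) that solves the steady-state equations for the weather chain; the setup below, which replaces one redundant balance equation by the normalization condition, is one standard way to do this and works for any irreducible ergodic chain.

import numpy as np

# Steady-state probabilities of the weather chain: solve pi = pi P together
# with sum(pi) = 1, dropping one redundant balance equation.
P = np.array([[0.8, 0.2],
              [0.6, 0.4]])
M = P.shape[0]

A = np.vstack([(P.T - np.eye(M))[:-1],  # balance equations (P^T - I) pi = 0, one row dropped
               np.ones(M)])             # normalization: pi_0 + ... + pi_M = 1
b = np.zeros(M)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(pi)  # approximately [0.75, 0.25]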
6.1 Expected Average Cost per Unit Time

If the requirement that the states be aperiodic is relaxed, then the limit

lim_{n→∞} p_ij^(n)

may not exist. To illustrate this point, consider the two-state transition matrix

        State   0   1
    P =   0     0   1
          1     1   0

If the process starts in state 0 at time 0, it will be in state 0 at times 2, 4, 6, . . . and in state 1 at times 1, 3, 5, . . . Thus p_00^(n) = 1 if n is even and p_00^(n) = 0 if n is odd, so that

lim_{n→∞} p_00^(n)

does not exist. However, the following limit always exists for an irreducible (finite-state) Markov chain:

lim_{n→∞} (1/n) Σ_{k=1}^{n} p_ij^(k) = π_j,    (10)

where the π_j satisfy the steady-state equations given in the previous subsection.

This result is important in computing the long-run average cost per unit time associated with a Markov chain. Suppose that a cost (or other penalty function) C(X_t) is incurred when the process is in state X_t at time t, for t = 0, 1, 2, . . . Note that C(X_t) is a random variable that takes on any one of the values C(0), C(1), . . . , C(M), and that the function C(·) is independent of t. The expected average cost incurred over the first n periods is given by

E[ (1/n) Σ_{t=1}^{n} C(X_t) ].

By using the result in (10), it can be shown that the (long-run) expected average cost per unit time is given by

lim_{n→∞} E[ (1/n) Σ_{t=1}^{n} C(X_t) ] = Σ_{j=0}^{M} π_j C(j).

For further reference, please see the worked-out example in the text.

7 Expected Average Cost per Unit Time for Complex Cost Functions

Note: this section was not covered during Tuesday's lecture.

In the previous section the cost function was based solely on the state the process is in at time t. In many important problems encountered in practice, the cost may also depend on some other random variables, e.g. a stock holding cost may depend on the number of workers or on the interest rate. These other random variables must be i.i.d., and X_t is independent of these other values.

Suppose the costs to be considered are the ordering cost and the penalty cost for unsatisfied demand. Then the total cost for week t is a function of X_{t−1} and D_t, that is, C(X_{t−1}, D_t). Under the assumptions of the example, it can be shown that the (long-run) expected average cost per unit time is given by

lim_{n→∞} E[ (1/n) Σ_{t=1}^{n} C(X_{t−1}, D_t) ] = Σ_{j=0}^{M} k(j) π_j,    (11)

where k(j) = E[C(j, D_t)], the expectation being taken with respect to the probability distribution of the random variable D_t given the state j. Similarly, the long-run actual average cost per unit time is given by

lim_{n→∞} (1/n) Σ_{t=1}^{n} C(X_{t−1}, D_t) = Σ_{j=0}^{M} k(j) π_j.    (12)

8 Recurrence Time

We often want to know how long it takes, in expectation, to reach state j from state i, and how often we return to i. Let f_ij^(n) be the probability that the first passage time from i to j equals n. Then

f_ij^(1) = p_ij^(1) = p_ij
f_ij^(2) = Σ_{k≠j} p_ik f_kj^(1)
. . .
f_ij^(n) = Σ_{k≠j} p_ik f_kj^(n−1).

In general, the n-step transition probabilities are tedious to calculate, but the first passage time probabilities can be computed recursively from the above. The expected first passage time is

μ_ij = ∞                        if Σ_{n=1}^{∞} f_ij^(n) < 1,
μ_ij = Σ_{n=1}^{∞} n f_ij^(n)    otherwise.

If Σ_{n=1}^{∞} f_ij^(n) = 1, then μ_ij is the unique solution of

μ_ij = 1 + Σ_{k≠j} p_ik μ_kj.

The expected recurrence time for well-behaved systems satisfies

μ_ii = 1 / π_i.    (13)
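As an illustration (not computed in lecture), the system μ_ij = 1 + Σ_{k≠j} p_ik μ_kj can be solved numerically. The sketch below does this for the weather chain with target state j = 0; the helper name expected_first_passage is ours.

import numpy as np

# Expected first passage times into a target state j for the weather chain,
# obtained by solving mu_ij = 1 + sum_{k != j} p_ik mu_kj as a linear system.
P = np.array([[0.8, 0.2],
              [0.6, 0.4]])

def expected_first_passage(P, j):
    """Return the vector of expected numbers of steps to first reach state j."""
    M = P.shape[0]
    keep = [k for k in range(M) if k != j]   # indices of the non-target states
    Q = P[np.ix_(keep, keep)]                # transitions among non-target states
    mu_rest = np.linalg.solve(np.eye(len(keep)) - Q, np.ones(len(keep)))
    mu = np.empty(M)
    mu[keep] = mu_rest
    mu[j] = 1.0 + P[j, keep] @ mu_rest       # expected recurrence time mu_jj
    return mu

mu = expected_first_passage(P, j=0)
print(mu)  # mu_00 = 4/3 and mu_10 = 5/3

Here Q is the transition matrix restricted to the non-target states, and solving (I − Q)μ = 1 gives the expected first passage times into state j. The value μ_00 = 4/3 = 1/0.75 agrees with equation (13), since π_0 = 0.75 for the weather chain.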
9 Absorbing States

Recall that a state k is called an absorbing state if p_kk = 1, so that once the chain visits k it remains there forever. If k is an absorbing state and the process starts in state i, the probability of ever going to state k is called the probability of absorption into state k, given that the system started in state i. This probability is denoted by f_ik. The absorption probabilities satisfy

f_ik = Σ_{j=0}^{M} p_ij f_jk,    (14)

for i = 0, 1, . . . , M, subject to the conditions

f_kk = 1,
f_ik = 0 if state i is recurrent and i ≠ k.

We can solve these equations to obtain the absorption probabilities f_ik.

9.1 Random Walk

Absorption probabilities are important in random walks. A random walk is a Markov chain with the property that if the system is in state i, then in a single transition the system either remains at i or moves to one of the two states immediately adjacent to i. For example, a random walk is often used as a model for situations involving gambling.

To illustrate, consider a gambling example with two players (A and B), each having 2 dollars. They agree to keep playing, betting 1 dollar at a time, until one player is broke. The probability of A winning a single bet is 1/3, so B wins with probability 2/3. The number of dollars player A has before each bet (0, 1, 2, 3, or 4) provides the states of a Markov chain with transition matrix

        State    0      1      2      3      4
          0      1      0      0      0      0
          1     2/3     0     1/3     0      0
    P =   2      0     2/3     0     1/3     0
          3      0      0     2/3     0     1/3
          4      0      0      0      0      1

Let f_ik be the probability of absorption at state k given that the start state is i. We can check that

f_ik = Σ_{j=0}^{M} p_ij f_jk,    (15)

for i = 0, 1, . . . , M, is satisfied together with the conditions f_kk = 1 and f_ik = 0 if state i is recurrent and i ≠ k. For our gambling example,

f_00 = 1
f_10 = (2/3) f_00 + (1/3) f_20
f_20 = (2/3) f_10 + (1/3) f_30
f_30 = (2/3) f_20 + (1/3) f_40
f_40 = 0

Solving this set of equations, we obtain the probability of A losing, f_20 = 4/5, and the probability of A winning, f_24 = 1/5 (a numerical check of these values appears below).

Note: the starting state has an effect on long-term behavior. For example, if A started with 3 dollars, the probability of A losing would be different.
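As a sanity check on these values (not part of the original lecture), here is a short NumPy sketch that restates equations (14)–(15) as a linear system over the transient states 1, 2, 3.

import numpy as np

# Absorption probabilities f_i0 (player A ends up broke) for the random walk
# of Section 9.1: P(A wins a bet) = 1/3, states 0..4, with 0 and 4 absorbing.
p = 1.0 / 3.0
P = np.array([[1.0,     0.0,     0.0,     0.0, 0.0],
              [1.0 - p, 0.0,     p,       0.0, 0.0],
              [0.0,     1.0 - p, 0.0,     p,   0.0],
              [0.0,     0.0,     1.0 - p, 0.0, p  ],
              [0.0,     0.0,     0.0,     0.0, 1.0]])

transient = [1, 2, 3]
Q = P[np.ix_(transient, transient)]  # transitions among the transient states
b = P[transient, 0]                  # one-step probabilities of going broke

# Equation (14) restricted to the transient states: f = Q f + b, i.e. (I - Q) f = b.
f_to_0 = np.linalg.solve(np.eye(len(transient)) - Q, b)
print(dict(zip(transient, np.round(f_to_0, 4))))  # f_20 = 0.8, i.e. A loses w.p. 4/5

The same computation also returns f_30 = 8/15, illustrating the note above that the starting state changes the long-term outcome.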