Markov chain: a stochastic process describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event
defining property: a stochastic process (X_t)_{t\in\N_0} on a countable (finite or countably infinite) state space E is called a Markov chain if for every n \in \N and for every i,j,i_0,...,i_{n-1} \in E such that P(X_0 = i_0,...,X_n=i) > 0, it holds that P(X_{n+1}=j\;|\;X_0 = i_0,...,X_n=i) = P(X_{n+1}=j\;|\;X_n=i)
homogeneity: the transition probabilities are independent of the time t
formally: the Markov chain (X_t)_{t\in\N_0} is homogeneous if for every i,j\in E and every n,m\in\N, if P(X_{n-1}=i) > 0 and P(X_{m-1}=i) > 0, then P(X_n=j\;|\;X_{n-1}=i) = P(X_m=j\;|\;X_{m-1}=i)
stochastic process: a sequence (X_t)_{t\in\N_0} of random variables (t: time), all defined on the same probability space (\Omega, \mathcal{F}, P), all taking values in the same (finite or countably infinite) space E
in other words: a sequence \{X(t) : t \in T\}, with t meaning "(discrete) time" here
notation: X_t = i means that at time t, the process is in state i
stochastic matrix: \Pi\in[0,1]^{E\times E} with \sum_{j\in E} \Pi(i,j) = 1 for all i\in E (rows sum to 1)
doubly stochastic: \Pi stochastic and \sum_{i\in E} \Pi(i,j) = 1 for all j\in E (columns sum to 1 as well)
note¹: if a stochastic matrix is symmetric, it is also doubly stochastic
note²: for a doubly stochastic matrix on a finite state space, the stationary distribution is the uniform distribution (numerical check below)
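A quick numerical check of note², a sketch with an arbitrary 3×3 example matrix (not from the source):

```python
import numpy as np

# Example doubly stochastic matrix: rows and columns each sum to 1.
Pi = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.5, 0.3],
               [0.3, 0.2, 0.5]])

uniform = np.full(3, 1/3)                  # uniform distribution on E = {0,1,2}
print(np.allclose(uniform @ Pi, uniform))  # True: uniform is stationary
```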
transition matrix: a square matrix used to describe the transitions of a homogeneous Markov chain
formally: \Pi\in[0,1]^{E\times E} stochastic with \Pi(i,j) = P(X_{n+1}=j\;|\;X_n=i) for all i,j\in E and n \in \N_0 with P(X_n = i) > 0
Existence, Markov Property
initial distribution: the distribution of X_0
formally: \mu: E \to \R, \mu(i) := P(X_0 = i)
\forall i: \mu(i) \geq 0
\sum_{i\in E} \mu(i) = 1
existence theorem: let \mu be a distribution on E and let \Pi \in [0,1]^{E\times E} be a stochastic matrix; then there exists a homogeneous Markov chain (X_t)_{t\in\N_0} with initial distribution \mu and transition matrix \Pi
lemma: let (X_t)_{t\in\N_0} be a stochastic process on E and let \Pi \in [0,1]^{E\times E} be a stochastic matrix; then (X_t)_{t\in\N_0} is a homogeneous Markov chain with transition matrix \Pi if and only if for all n \in \N and all i,j,i_0,...,i_{n-1} \in E such that P(X_0 = i_0, ..., X_n = i) > 0, it holds that P(X_{n+1}=j\;|\;X_0 = i_0,...,X_n=i) = \Pi(i,j)
random-mapping representation: every homogeneous Markov chain can be realized as X_{n+1}=f(X_n, Z_{n+1})
formally: let Z_n, n\in \N, be iid random variables taking values in F, let E be a countable state space, let f : E \times F \to E be a measurable function and let X_0: \Omega \to E be a random variable independent of (Z_n); set X_{n+1} = f(X_n, Z_{n+1}) for all n \in \N_0; then (X_n)_{n\in\N_0} is a homogeneous Markov chain on E with transition matrix \Pi(i,j) = P(f(i,Z_1)=j) for all i,j\in E
expanded: X_1 = f(X_0, Z_1);\; X_2=f(X_1,Z_2) = f(f(X_0,Z_1),Z_2) and so on...
on [0,1]: let E be a countable state space and let \Pi\in[0,1]^{E \times E} be a stochastic matrix; let Z_n, n\in \N, be iid, uniformly distributed on [0,1]; then there is a measurable function f : E \times [0,1] \to E (e.g. the inverse-CDF map of the rows of \Pi) such that, setting X_0 = i_0 and X_{n+1} = f(X_n, Z_{n+1}) for all n \in \N_0, the process (X_n)_{n\in\N_0} is a homogeneous Markov chain on E with transition matrix \Pi and P(X_0=i_0) = 1 (simulation sketch below)
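A minimal simulation sketch of this construction (function names like `simulate` are mine, not from the source): f(i, u) is the inverse-CDF map picking the smallest j whose cumulative row probability reaches u.

```python
import numpy as np

def simulate(Pi, i0, n_steps, rng=np.random.default_rng(0)):
    """Simulate a Markov chain via X_{n+1} = f(X_n, Z_{n+1}) with Z_n ~ U[0,1]."""
    cum = np.cumsum(Pi, axis=1)   # row-wise CDFs
    path = [i0]
    for _ in range(n_steps):
        z = rng.uniform()         # Z_{n+1} ~ U[0,1], iid
        # f(i, z): smallest j with cum[i, j] >= z
        path.append(int(np.searchsorted(cum[path[-1]], z)))
    return path

Pi = np.array([[0.9, 0.1],
               [0.5, 0.5]])
print(simulate(Pi, i0=0, n_steps=10))
```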
corollary: if E is countable and \Pi\in[0,1]^{E \times E} is a stochastic matrix, then there exists a homogeneous Markov chain with transition matrix \Pi
Markov property: no matter what happened before time m, once we know X_m = k, the process restarts at k with the same law as the original chain started from k, dropping all history before m
in other words: the future only depends on the present
formally: let (X_n)_{n\in \N_0} be a Markov chain with transition matrix \Pi; fix m \in \N and k \in E such that P(X_m = k) > 0; then, under \tilde P := P(\cdot\;|\;X_m = k), the sequence (\tilde X_n := X_{n+m})_{n \in \N_0} is a Markov chain with transition matrix \Pi and starting distribution \delta_k (Dirac measure), independent of X_0,...,X_m
consequence (conditional independence of past and future): for any past event A determined by X_0,...,X_m and any future event B determined by X_m, X_{m+1},..., it holds that P(A \cap B \;|\; X_m = k) = P(A\;|\; X_m = k)\,P(B\;|\; X_m = k)
Finite Dimensional Distributions
what defines a Markov chain?: let E be a countable state space, let \Pi \in [0,1]^{E\times E} be a stochastic matrix, let \mu be a distribution on E, let (X_n)_{n \in \N_0} be a stochastic process on E; then the following are equivalent:
(X_n)_{n\in \N_0} is a Markov chain with transition matrix \Pi and initial distribution \mu
\forall n \in \N_0, i_0,...,i_n \in E: P(X_0 = i_0,...,X_n=i_n)=\mu(i_0)\Pi(i_0,i_1)\cdots\Pi(i_{n-1},i_n)
uniqueness in distribution: \mu and \Pi uniquely determine the distribution of a Markov chain
formally: let \mu be a distribution on E and \Pi \in [0,1]^{E\times E} be a stochastic matrix on E; then two Markov chains (X_n)_{n \in \N_0} and (Y_n)_{n \in \N_0}, both with initial distribution \mu and transition matrix \Pi, have the same distribution
Markov chain distribution at time n: let (X_n)_{n \in \N_0} be a Markov chain with initial distribution \mu and transition matrix \Pi; then for all n\in \N_0, the distribution of X_n is \mu^n = \mu\Pi^n (computation sketch below)
\mu^n: row vector (probability distribution of the chain at time n), \Pi^n: n-th power of \Pi (n-step transition probability matrix)
\mu^0 = \mu, \Pi^0 = I
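A sketch of computing \mu^n = \mu\Pi^n numerically (the two-state matrix is an arbitrary example):

```python
import numpy as np

Pi = np.array([[0.9, 0.1],
               [0.5, 0.5]])
mu = np.array([1.0, 0.0])                    # initial distribution, as a row vector

mu_n = mu @ np.linalg.matrix_power(Pi, 10)   # distribution of X_10
print(mu_n, mu_n.sum())                      # a probability vector: sums to 1
```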
corollary: let (X_n)_{n \in \N_0} be a Markov chain with transition matrix \Pi; then for all m,n \in \N_0 and i,j \in E such that P(X_m = i) > 0, it holds that P(X_{m+n}=j\;|\;X_m=i)=\Pi^n(i,j) (see Markov property)
reachability: a state j is reachable from i if there exists n \in \N_0 such that \Pi^n(i,j) > 0
write: i \to j
formally: i \to j \iff \exists n \in \N_0: \Pi^n(i,j) > 0 \iff \sum_{n=0}^\infty\Pi^n(i,j) > 0
tip: fix i_1,...,i_{n-1}\in E, then \Pi^n(i,j) \geq \Pi(i,i_1)\cdots\Pi(i_{n-1},j) always; if the RHS is >0, then you have proven \Pi^n(i,j)>0
communication: i \leftrightarrow j \iff i \to j and j \to i; \leftrightarrow is an equivalence relation; the equivalence classes E/\leftrightarrow are called communication classes
irreducibility: a Markov chain is called irreducible if it has only one communication class, otherwise it is reducible
in other words: a Markov chain is irreducible if every state communicates with every other state
formally: \forall i,j\in E: i \to j
equivalently: \forall i,j \in E \;\exists n \in \N,\; i = i_0,i_1,...,i_{n-1},i_n=j: \Pi(i_0,i_1)\cdots\Pi(i_{n-1},i_n)>0
equivalent: \forall i,j \in E, i \neq j \;\exists n = n(i,j) \in \N : \Pi^n(i,j)>0 (reachability check below)
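Reachability only depends on which entries of \Pi are positive, so for finite E it reduces to a breadth-first search on the directed graph with an edge i \to j whenever \Pi(i,j) > 0 (a sketch; helper names are mine):

```python
import numpy as np
from collections import deque

def reachable(Pi, i):
    """Set of states j with i -> j (i -> i always holds, via n = 0)."""
    seen, queue = {i}, deque([i])
    while queue:
        u = queue.popleft()
        for v in np.flatnonzero(Pi[u] > 0):   # one-step successors of u
            if v not in seen:
                seen.add(int(v)); queue.append(int(v))
    return seen

def is_irreducible(Pi):
    n = Pi.shape[0]
    return all(len(reachable(Pi, i)) == n for i in range(n))
```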
closed set: a non-empty set C \subseteq E is closed if the chain cannot leave it
formally: C \subseteq E closed \iff \forall i \in C: \sum_{j \in C}\Pi(i,j)=1
theorem: let (X_n)_{n\in\N_0} be an irreducible Markov chain with period d; then for all i,j \in E, there exist m = m(i,j) \in \N_0 and n_0 = n_0(i,j) \in \N_0 such that for all n \geq n_0: \Pi^{m+nd}(i,j)>0
choose m=0 if i=j
special case (corollary): let (X_n)_{n\in\N_0} be an irreducible, aperiodic Markov chain on a finite state space E; then there exists n_0 \in \N_0 such that for all i,j \in E and all n \geq n_0: \Pi^n(i,j)>0 (i.e. you can get from i to j in every sufficiently large number of steps)
lemma: let A \subseteq \N such that \gcd(A) = 1 and a+b\in A whenever a,b\in A; then there exists n_0 \in \N such that n \in A for all n \geq n_0
partitioning: let (X_n)_{n\in\N_0} be an irreducible Markov chain with period d; then there exists exactly one partition C_0,...,C_{d-1} of E such that for all k \in \{0,...,d-1\} and i \in C_k: \sum_{j \in C_{k+1}} \Pi(i,j) = 1, where C_d = C_0
in other words: choose a state i_0, group all states by "distance mod d" from i_0
in block form: \Pi^n is also a block matrix, and \Pi^{nd} is a block diagonal matrix (i.e. the diagonal blocks are square matrices)
Stationary Distributions
stationary distribution: a probability distribution that, once reached, remains unchanged over time as the chain evolves
formally: a probability measure \alpha is called stationary for a Markov chain with transition matrix \Pi if for all i\in E: \alpha(i)=\sum_{j\in E}\alpha(j)\Pi(j,i)
matrix form: \alpha\Pi = \alpha, with \alpha as a row vector (solver sketch below)
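For finite E, \alpha is a left eigenvector of \Pi for eigenvalue 1, normalized to sum to 1; one sketch replaces the eigenproblem by a least-squares linear system (function name is mine):

```python
import numpy as np

def stationary(Pi):
    """Solve alpha Pi = alpha with sum(alpha) = 1 for a finite stochastic matrix."""
    n = Pi.shape[0]
    # (Pi^T - I) alpha^T = 0, plus one extra row enforcing sum(alpha) = 1
    A = np.vstack([Pi.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return alpha

Pi = np.array([[0.9, 0.1],
               [0.5, 0.5]])
print(stationary(Pi))   # [5/6, 1/6]
```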
theorem: if the initial distribution \mu of a Markov chain (X_n)_{n \in \N_0} is stationary, then for all n \in \N_0 and A \subseteq E: P_\mu(X_n \in A) = \mu(A)
in other words: \mu\Pi^n = \mu^n = \mu
in particular, for a stationary \alpha: P_\alpha(X_n = i) = \alpha(i) for all n\in \N_0, i \in E
if a Markov chain has 2 stationary distributions, then it has infinitely many
formally: if \alpha, \beta are stationary, then so is every \mu \in \{\lambda\alpha+(1-\lambda)\beta:\lambda\in(0,1)\}
if |E| < \infty (or E countably infinite and the chain positive recurrent):
exactly one closed communication class \iff exactly one stationary distribution
two or more closed communication classes \iff infinitely many stationary distributions
reverse transitions (time-reversed chain): let (X_n)_{n\in\N_0} be a Markov chain with transition matrix \Pi and stationary distribution \alpha such that \alpha(i)>0 for all i\in E; define \Pi'(i,j):=\frac{\alpha(j)\Pi(j,i)}{\alpha(i)}; then for all i,j\in E, n \in \N_0: \Pi'(i,j)=P_\alpha(X_n=j\;|\;X_{n+1}=i) are the backwards (reverse) transition probabilities
reversibility: let (X_n)_{n\in\N_0} be a Markov chain with transition matrix \Pi; a distribution \alpha on E is called reversible if \alpha(i)\Pi(i,j) = \alpha(j)\Pi(j,i) for all i,j\in E (detailed balance; check below)
the Markov chain is called reversible if it has a reversible distribution
theorem: every reversible distribution is stationary
note: if (X_n)_{n\in\N_0} is reversible and \alpha(i)>0 for all i, then P_\alpha(X_n = j \;|\; X_{n+1}=i)=\Pi(i,j)=P_\alpha(X_{n+1}=j\;|\;X_n = i) (so if we start from \alpha, the forwards and backwards transition probabilities are the same)
Kolmogorov's criterion: an irreducible, positive recurrent, aperiodic Markov chain with transition matrix \Pi is reversible iff \Pi(i_1,i_2)\Pi(i_2,i_3)\cdots\Pi(i_n,i_1) = \Pi(i_1,i_n)\Pi(i_n,i_{n-1})\cdots\Pi(i_2,i_1) for all i_1,...,i_n\in E
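Detailed balance is a finite set of pairwise equations, so for finite E reversibility of a given \alpha can be checked directly (a sketch; the birth-death example is mine):

```python
import numpy as np

def is_reversible(Pi, alpha, tol=1e-12):
    """Check detailed balance: alpha(i) Pi(i,j) == alpha(j) Pi(j,i) for all i, j."""
    balance = alpha[:, None] * Pi       # entry (i,j) = alpha(i) Pi(i,j)
    return np.allclose(balance, balance.T, atol=tol)

# Birth-death chains are reversible; e.g. a lazy walk on {0,1,2} with reflection:
Pi = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.0, 0.5],
               [0.0, 0.5, 0.5]])
alpha = np.full(3, 1/3)            # uniform (Pi is symmetric, hence doubly stochastic)
print(is_reversible(Pi, alpha))    # True
```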
Strong Markov Property
\sigma-algebra: given a set X, a collection \mathcal A \subseteq \mathcal P(X) of subsets of X is called a \sigma-algebra if:
1. contains the universe: X \in \mathcal A (and, by 2., the empty set \emptyset \in \mathcal A)
2. closed under complementation: if A \in \mathcal A, then X \backslash A = A^c \in \mathcal A
3. closed under countable unions: if A_n \in \mathcal A for all n \in \N, then \bigcup_{n=1}^\infty A_n \in \mathcal A
filtration: a growing sequence of information where past information does not get lost over time (it accumulates)
formally: let (\Omega, \mathcal F, P) be a probability space; a sequence \mathcal F_n \subseteq \mathcal F for n \in \N_0 is called a filtration if \mathcal F_n \subseteq \mathcal F_{n+1} for all n \in \N_0
natural filtration: all the information generated by the chain up to time n (i.e. the exact states of the chain, whether the chain visited certain subsets of the state space, complements, functions of the past etc., but not the model parameters themselves)
formally: let (X_n)_{n \in \N_0} be a Markov chain on E; define \mathcal F_n := \sigma(X_0,...,X_n) as the smallest \sigma-algebra containing all events of the type X_t^{-1}(A) for A \subseteq E and t\in\{0,...,n\}; then (\mathcal F_n)_{n\in\N_0} is the natural filtration of (X_n)_{n \in \N_0}
stopping time: the event of stopping at time n only depends on what happened up to that time
formally: a random variable \tau : \Omega \to \N_0 \cup \{\infty\} is called a stopping time with respect to the natural filtration (\mathcal F_n)_{n\in\N_0} if for all n \in \N_0: \{\tau = n\} \in \mathcal F_n (i.e. can you rewrite the event in a way that depends on X_0,...,X_n only?)
stopped Markov chain: let (X_n)_{n \in \N_0} be a Markov chain and let \tau be a stopping time w.r.t. the natural filtration (\mathcal F_n)_{n\in\N_0}; define a \land b := \min(a,b) for a,b\in \R; the stopped Markov chain is (X_{n \land \tau})_{n \in \N_0} with X_{n \land \tau} = \begin{cases}X_n & \text{if } n \leq \tau\\ X_\tau & \text{if } n \geq \tau\end{cases} (simulation sketch below)
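A simulation sketch of a stopped chain where \tau is the first hitting time of a target set (a stopping time, since \{\tau = n\} depends only on X_0,...,X_n; names and the example matrix are mine):

```python
import numpy as np

def run_stopped(Pi, i0, target, n_steps, rng=np.random.default_rng(0)):
    """Return (X_{n /\ tau})_{n=0..n_steps} with tau = inf{n >= 0 : X_n in target}."""
    x, path = i0, [i0]
    stopped = i0 in target
    for _ in range(n_steps):
        if not stopped:
            x = rng.choice(Pi.shape[0], p=Pi[x])   # one transition step
            stopped = x in target
        path.append(x)                             # frozen at X_tau once stopped
    return path

Pi = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.0, 0.5],
               [0.0, 0.0, 1.0]])
print(run_stopped(Pi, i0=0, target={2}, n_steps=10))
```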
strong Markov property: generalization of the Markov property to random stopping times; the Markov property still holds even if you restart the process at a random stopping time \tau
formally: let (X_n)_{n \in \N_0} be a Markov chain with transition matrix \Pi, let \tau be a stopping time w.r.t. the natural filtration (\mathcal F_n)_{n\in\N_0}; fix k \in E such that P(\tau < \infty, X_\tau = k) > 0; then, under \tilde P := P(\cdot\;|\;\tau < \infty, X_\tau = k), the sequence (\tilde X_n := X_{n+\tau})_{n \in \N_0} is a Markov chain with transition matrix \Pi and starting distribution \delta_k (Dirac measure), independent of (X_{n \land \tau})_{n \in \N_0}
if \tau = \infty, choose (\tilde X_n) arbitrarily
Recurrence & Transience
recurrence and transience: a state i \in E is called...
...recurrent, if P_i(T_i<\infty) = 1, where T_i := \inf\{n \geq 1 : X_n = i\} is the first return time to i
in other words: starting from i, from wherever you can go there is always a path returning to i, and the chain returns in finite time with probability 1
positive recurrent: recurrent and E_i[T_i] < \infty
null recurrent: recurrent and E_i[T_i]=\infty
theorem: a state i \in E is recurrent if and only if \sum_{n=0}^\infty\Pi^n(i,i)=\infty
...transient, if P_i(T_i<\infty) < 1
in other words: starting from i, there is at least one path such that, if you take it, you will never return to i
if i is transient and E is finite, then \alpha(i) = 0 for every stationary distribution \alpha
hitting probability: h_i^A := P_i(H_A < \infty), where H_A := \inf\{n \geq 0 : X_n \in A\}; h_i^A = 0 if A is unreachable from i (e.g. an absorbing state elsewhere, or a state outside a closed set containing i)
mean hitting time vector: k^A=(k_i^A)_{i \in E} with, for each starting state i, k_i^A = E_i[H_A] = \text{expected number of steps to hit } A \text{ from } i (linear-system sketch below)
k_i^A = \infty if A is unreachable from i (i.e. P_i(H_A < \infty) = 0)
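For finite E, the mean hitting times solve the first-step linear system k_i^A = 0 for i \in A and k_i^A = 1 + \sum_j \Pi(i,j) k_j^A for i \notin A; a sketch assuming A is reachable from every state outside it, so all k_i^A are finite:

```python
import numpy as np

def mean_hitting_times(Pi, A):
    """Solve k_i = 0 for i in A, k_i = 1 + sum_j Pi(i,j) k_j otherwise."""
    n = Pi.shape[0]
    M, b = np.eye(n), np.zeros(n)
    for i in range(n):
        if i not in A:
            M[i] -= Pi[i]        # row encodes: k_i - sum_j Pi(i,j) k_j = 1
            b[i] = 1.0
    return np.linalg.solve(M, b)

Pi = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.0, 0.5],
               [0.0, 0.0, 1.0]])
print(mean_hitting_times(Pi, A={2}))   # [6., 4., 0.]
```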
invariant measure: a function \alpha : E \to \R is called an invariant measure for a Markov chain (X_n)_{n \in \N_0} with transition matrix \Pi if:
\forall i \in E: \alpha(i) \in [0,\infty)
\exists i \in E: \alpha(i) > 0 (i.e. \alpha is not the null function)
\forall i \in E: \alpha(i) = \sum_{j\in E}\alpha(j)\Pi(j,i) (i.e. \alpha\Pi = \alpha)
if, additionally, \sum_{i\in E}\alpha(i) = 1, then this is the stationary (invariant) distribution
Markov chain recurrent \implies it has an invariant measure
Markov chain recurrent + irreducible \implies it has a unique invariant measure (up to multiplication by a constant)
existence theorem + construction: let (X_n)_{n \in \N_0} be an irreducible, recurrent Markov chain with transition matrix \Pi and state space E; then (X_n)_{n \in \N_0} has an invariant measure which can be constructed as follows (Monte Carlo sketch below):
pick an element 0 \in E (any element, call it "0")
let T_0 := \inf\{n \geq 1:X_n = 0\} (first return time to 0)
write \alpha(i) = E_0[\sum_{n=1}^\infty 1_{\{X_n=i\}}1_{\{n \leq T_0\}}] = E_0[\sum_{n=1}^{T_0}1_{\{X_n=i\}}] (expected number of visits to i between two visits to 0)
equivalently: \alpha(i) = \sum_{n=1}^\infty P_0(X_n = i, n \leq T_0)
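A Monte Carlo sketch of the construction: simulate excursions from state 0 and average the visit counts to each i (helper name and example matrix are mine; note \alpha(0) = 1 by construction, since each excursion visits 0 exactly once, at time T_0):

```python
import numpy as np

def invariant_by_excursions(Pi, n_excursions, rng=np.random.default_rng(0)):
    """Estimate alpha(i) = E_0[# visits to i in {1,...,T_0}] by simulation."""
    n = Pi.shape[0]
    visits = np.zeros(n)
    for _ in range(n_excursions):
        x = 0
        while True:                      # one excursion: step until return to 0
            x = rng.choice(n, p=Pi[x])
            visits[x] += 1
            if x == 0:
                break
    return visits / n_excursions

Pi = np.array([[0.9, 0.1],
               [0.5, 0.5]])
print(invariant_by_excursions(Pi, 20_000))  # approx [1.0, 0.2], i.e. alpha/alpha(0)
```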
uniqueness up to a constant: let (X_n)_{n \in \N_0} be an irreducible, recurrent Markov chain with transition matrix \Pi, let \alpha, \beta be invariant measures for (X_n)_{n \in \N_0}; then \exists C > 0: \alpha = C\beta (unique invariant measure up to multiplication by a constant)
theorem: let (X_n)_{n \in \N_0} be an irreducible, recurrent Markov chain; then for all i,j \in E: P_i(T_j<\infty) = 1
corollary: an irreducible Markov chain is positive recurrent if and only if it has a stationary distribution
then, this stationary distribution \alpha is unique with \alpha(i) > 0 for all i\in E
theorem: let (X_n)_{n \in \N_0} be an irreducible, positive recurrent Markov chain with stationary distribution \alpha; then \alpha(i) = \frac{1}{E_i[T_i]} for all i \in E
theorem: every irreducible Markov chain with a finite state space is positive recurrent
reversed transition matrix: let (X_n)_{n \in \N_0} be a Markov chain with transition matrix \Pi, let \alpha be an invariant measure for (X_n)_{n \in \N_0} with \alpha(i) > 0 for all i \in E; define the \alpha-reversed transition matrix as \Pi^\alpha(i,j)=\frac{\alpha(j)}{\alpha(i)}\Pi(j,i)
\Pi^\alpha is a stochastic matrix
if (X_n)_{n \in \N_0} is recurrent, then so is a Markov chain with transition matrix \Pi^\alpha
let (X_n)_{n=0}^N be a Markov chain with initial distribution \delta_0, conditioned on X_N = 0; then (Y_n := X_{N-n})_{n=0}^N is a Markov chain with transition matrix \Pi^\alpha, initial distribution \delta_0, conditioned on Y_N = 0
in other words: (Y_n)_{n=0}^N is the time reversal of the Markov chain (X_n)_{n=0}^N, where we start at 0 and return to 0 at time N
corollary: assume (X_n) is recurrent and let (Z_n) have transition matrix \Pi^\alpha; then (Z_n) is also recurrent, and if you run both with initial distribution \delta_0 from time 0 to time T_0, then (Z_n)_{n=0}^{T_0} has the same distribution as (X_{T_0-n})_{n=0}^{T_0}
formally: let (X_n)_{n\in \N_0} be a Markov chain with initial distribution \delta_0 and transition matrix \Pi; let (Y_n)_{n\in \N_0} be a Markov chain with initial distribution \delta_0 and transition matrix \Pi^\alpha; if P(T_0 < \infty) = 1, then (Y_0,...,Y_{T_0}) has the same distribution as (X_{T_0},...,X_0)
Convergence
total variation metric: notion of convergence; the "distance" between \alpha and \beta in total variation
formally: let \alpha, \beta be distributions on E (countable); then the total variation distance is defined as d_{TV}(\alpha,\beta) = \frac12\sum_{i\in E}|\alpha(i) - \beta(i)| (i.e. half of the L^1 distance; computation sketch below)
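A one-liner sketch for finite E (function name is mine):

```python
import numpy as np

def tv_distance(alpha, beta):
    """Total variation distance: half the L1 distance between two distributions."""
    return 0.5 * np.abs(np.asarray(alpha) - np.asarray(beta)).sum()

print(tv_distance([0.5, 0.5], [0.9, 0.1]))   # 0.4
```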
convergence theorem: let (X_n)_{n \in \N_0} be an irreducible, aperiodic, positive recurrent Markov chain with invariant distribution \alpha; then \lim_{n \to \infty}P_i(X_n = j) = \lim_{n \to \infty}\Pi^n(i,j) = \alpha(j) for all i,j \in E (numerical illustration below)
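A numerical illustration of the convergence theorem (a sketch reusing the earlier 2-state example, whose stationary distribution is (5/6, 1/6)): every row of \Pi^n approaches \alpha.

```python
import numpy as np

Pi = np.array([[0.9, 0.1],
               [0.5, 0.5]])
for n in (1, 5, 20, 50):
    print(n, np.linalg.matrix_power(Pi, n))
# both rows converge to the stationary distribution [5/6, 1/6] ~ [0.833, 0.167]
```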
recipe: if the Markov chain is not irreducible, split it into its communication classes and look at the "restricted" transition matrices