Explicit expression for the steady state
translation rate in the infinite-dimensional
homogeneous ribosome flow model
Yoram Zarai, Michael Margaliot and Tamir Tuller
Abstract—Gene translation is a central stage in the intra-cellular process of protein synthesis. Gene translation proceeds in three
major stages: initiation, elongation, and termination. During the elongation step ribosomes (intracellular macromolecules) link amino
acids together in the order specified by messenger RNA (mRNA) molecules.
The homogeneous ribosome flow model (HRFM) is a mathematical model of translation-elongation under the assumption of constant
elongation rate along the mRNA sequence. The HRFM includes n first-order nonlinear ordinary differential equations, where n
represents the length of the mRNA sequence, and two positive parameters: ribosomal initiation rate and the (constant) elongation
rate.
Here we analyze the HRFM when n goes to infinity and derive a simple expression for the steady-state protein synthesis rate. We also
derive bounds that show that the behavior of the HRFM for finite, and relatively small, values of n is already in good agreement with
the closed-form result in the infinite-dimensional case. For example, for n = 15 the relative error is already less than 4%. Our results
can thus be used in practice for analyzing the behavior of finite-dimensional HRFMs that model translation. To demonstrate this, we
apply our approach to estimate the mean initiation rate in M. musculus, finding it to be around 0.17 codons per second. Other potential
biological applications include re-engineering genes to produce a desired translation rate.
Index Terms—Gene translation, systems biology, computational models, monotone dynamical systems, periodic continued fractions.
1 INTRODUCTION
Proteins are macromolecules involved in all intracellular activities. Proteins are built of chains of smaller
molecules called amino acids. There are 20 different
amino acids, so the alphabet of protein sequences
includes 20 letters. DNA regions, called genes, encode
proteins as an ordered list of amino acids. During the
process of gene expression these regions are transcribed
into mRNA molecules; DNA and mRNA molecules are
sequences of 4 possible nucleotides. Gene translation is
the process in which the information encoded in genes
is deciphered and translated into proteins by molecular
machines called ribosomes that move along the mRNA
sequence (see [1]). During the translation process, each
triplet of consecutive nucleotides, called a codon, is
replaced by a corresponding amino acid.
Computational models of gene translation have been
developed and used to address various biological and
biotechnological questions (see, for example, [2], [3], [4],
[5], [6], [7]). Rigorous analysis of these computational
models is important for several reasons: It can deepen
our understanding of the translation process, lead to
efficient algorithms for optimizing gene translation for
various biotechnological goals, and assist in improving
the fidelity of the mathematical models.
The research of MM is partly supported by the ISF.
• Y. Zarai is with the School of Electrical Engineering, Tel-Aviv University,
Tel-Aviv 69978, Israel.
E-mail: [email protected]
• M. Margaliot is with the School of Electrical Engineering and the Sagol
School of Neuroscience, Tel-Aviv University, Tel-Aviv 69978, Israel.
E-mail: [email protected]
• T. Tuller is with the Department of Biomedical Engineering and the Sagol
School of Neuroscience, Tel-Aviv University, Tel-Aviv 69978, Israel.
E-mail: [email protected]
A conventional model of translation-elongation is the
Asymmetric Simple Exclusion Process (ASEP) (see [5], [4],
[8], [9]). ASEP is a stochastic model describing particles
that can hop from one site to another along a 1D lattice.
The term “simple exclusion” refers to the fact that hops
may take place only to a target site that is not already
occupied by another particle. The motion is assumed to
be directionally asymmetric, i.e., there is some preferred
direction of motion. In the Totally Asymmetric Simple
Exclusion Process (TASEP) motion is unidirectional.
TASEP models for translation are based on the following assumptions: (1) Initiation time and the time a
ribosome spends translating each site are random and
codon dependent; and (2) ribosomes span over several
sites and if two ribosomes are adjacent, the trailing one
is delayed until the ribosome in front of it has proceeded
onwards. Despite its rather simple description, it seems
that rigorous analysis of the TASEP is non-trivial. The
recent monograph [10] provides a detailed exposition of
these issues.
Reuveni et al. [11] recently introduced a deterministic
model for translation-elongation called the Ribosome Flow
Model (RFM). In the RFM, mRNA molecules are coarse-grained into $n$ consecutive sites of codons. The state-variables $x_i(t) : \mathbb{R}_+ \to [0, 1]$, $i \in \{1, \dots, n\}$, describe
the “occupancy level” of site $i$ at time $t$, with $x_i(t) = 1$
[$x_i(t) = 0$] indicating that the site is completely occupied
[free].
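The RFM is given by n non-linear ordinary differential
equations:
\[
\begin{aligned}
\dot{x}_1 &= \lambda (1 - x_1) - \lambda_1 x_1 (1 - x_2), \\
\dot{x}_2 &= \lambda_1 x_1 (1 - x_2) - \lambda_2 x_2 (1 - x_3), \\
&\;\;\vdots \\
\dot{x}_{n-1} &= \lambda_{n-2} x_{n-2} (1 - x_{n-1}) - \lambda_{n-1} x_{n-1} (1 - x_n), \\
\dot{x}_n &= \lambda_{n-1} x_{n-1} (1 - x_n) - \lambda_n x_n.
\end{aligned}
\tag{1}
\]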
Here λ > 0 describes the initiation rate into the chain,
λi > 0, i ∈ {1, . . . , n − 1}, is a parameter that controls
the transition rate from site i to site i + 1, and λn >
0 controls the output rate at the end of the chain. The
equation for ẋ1 describes the change in the occupancy
level at the first site along the chain. The term λ(1 − x1 )
represents the fact that ribosomes enter at a rate λ, but
can only bind to the site if it is not already occupied.
In particular, if x1 = 1 (i.e., the site is completely full)
then the effective binding rate becomes λ(1 − x1 ) = 0.
The term −λ1 x1 (1 − x2 ) represents the flow of ribosomes
from site 1 to site 2. This becomes zero if x2 (t) = 1,
i.e., if site 2 is completely occupied. The other equations
follow similarly. The term λn xn (t) in the last equation
represents the rate in which ribosomes leave the last site
on the chain, that is, the translation rate or protein synthesis
rate.
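As a minimal numerical sketch of the dynamics in (1), the following Python code integrates the RFM with a fixed-step forward Euler scheme. The function names `rfm_rhs` and `simulate_rfm`, the step size, and the number of steps are illustrative assumptions and are not taken from [11]; any standard ODE solver could be used instead.

```python
def rfm_rhs(x, lam, rates):
    """Right-hand side of the RFM (1).

    x     : list of occupancy levels x_1..x_n, each in [0, 1]
    lam   : initiation rate (lambda)
    rates : list [lambda_1, ..., lambda_n] of transition rates
    """
    n = len(x)
    # flow[0] is the flow into site 1, flow[i] the flow from site i to i+1,
    # and flow[n] the output (translation) rate lambda_n * x_n.
    flow = [lam * (1.0 - x[0])]
    for i in range(n - 1):
        flow.append(rates[i] * x[i] * (1.0 - x[i + 1]))
    flow.append(rates[n - 1] * x[n - 1])
    return [flow[i] - flow[i + 1] for i in range(n)]


def simulate_rfm(x0, lam, rates, dt=0.01, steps=20000):
    """Forward-Euler integration (an assumed, simple scheme) from x(0) = x0."""
    x = list(x0)
    for _ in range(steps):
        dx = rfm_rhs(x, lam, rates)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


if __name__ == "__main__":
    e = simulate_rfm([0.0, 0.0, 0.0], lam=1.0, rates=[1.0, 1.0, 1.0])
    print("approximate equilibrium:", e)
    print("approximate translation rate:", 1.0 * e[-1])
```

For a single site with λ = λ1 = 1 the state should settle at x1 = 1/2, since the only equation then reads ẋ1 = (1 − x1) − x1.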
Note that the RFM encapsulates both the simple exclusion and the total asymmetry properties of the TASEP.
The numerical values of the parameters λ, λi depend
on physical features such as the number of available
free ribosomes, nucleotide context surrounding initiation
codons, etc. (see [11], [12], [13], [6], [3]). As demonstrated
in [11], simulations of the full TASEP and the simpler
RFM yield similar predictions of translation rates. For
example, the correlation between their predictions of
translation rates over the set of endogenous genes of S.
cerevisiae is 0.96.
We briefly review some previous results on the analysis of the RFM. Let $x(t, a)$ denote the solution of (1)
at time $t$ for the initial condition $x(0) = a$. We always consider initial conditions $x(0)$ in the closed unit
cube $C := [0, 1]^n$. It has been shown in [14] that the RFM
admits a unique equilibrium point $e$ in $C$, and
\[ \lim_{t \to \infty} x(t, a) = e, \quad \text{for all } a \in C. \]
Furthermore, $e \in \mathrm{Int}(C)$, i.e., every coordinate $e_i$ of $e$
satisfies $e_i \in (0, 1)$. This means that the model parameters define a unique attracting steady-state of ribosome
densities (and thus steady-state translation rate). Perturbations in the $x_i$'s, corresponding to events such as
ribosome drop-off, will not affect this steady state.
Let
\[ R := \lambda_n e_n \tag{2} \]
denote the (steady-state) translation rate. From a biological point of view, it is important to understand
how R depends on the RFM parameters. Of course, it
is enough to understand the dependence of e on the
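RFM parameters. For $x = e$ the right-hand side of all
the equations in (1) is zero, so
\[
\begin{aligned}
\lambda (1 - e_1) &= \lambda_1 e_1 (1 - e_2) \\
&= \lambda_2 e_2 (1 - e_3) \\
&\;\;\vdots \\
&= \lambda_{n-1} e_{n-1} (1 - e_n) \\
&= \lambda_n e_n.
\end{aligned}
\tag{3}
\]
This yields
\[ e_i = \frac{\lambda_n e_n}{\lambda_i (1 - e_{i+1})}, \quad i \in \{1, \dots, n-1\}, \tag{4} \]
and
\[ e_1 = 1 - \lambda_n e_n / \lambda. \tag{5} \]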
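Combining (4) and (5) provides a finite continued fraction expression for the last coordinate $e_n$ of $e$, namely,
\[
1 - \lambda_n e_n/\lambda = \cfrac{\lambda_n e_n/\lambda_1}{1 - \cfrac{\lambda_n e_n/\lambda_2}{1 - \cfrac{\lambda_n e_n/\lambda_3}{\;\ddots\; 1 - \lambda_n e_n/\lambda_n}}}.
\tag{6}
\]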
It has been shown experimentally that in some cases
the ribosomal elongation speed is close to constant along
the mRNA sequence [15]. To model this, Ref. [16] has
considered the RFM in the special case where
\[ \lambda_1 = \lambda_2 = \dots = \lambda_n := \lambda_c, \tag{7} \]
that is, the transition rates λi are all equal, and λc denotes
their common value. Since this Homogeneous Ribosome
Flow Model (HRFM) includes only two parameters, λ
and λc , the analysis is simplified. In particular, (6) becomes
\[
\frac{1}{\eta} e_n - 1 = \cfrac{-e_n}{1 + \cfrac{-e_n}{1 + \cfrac{-e_n}{\;\ddots\; 1 - e_n}}},
\tag{8}
\]
where en appears in the continued fraction a total of n
times, and
\[ \eta := \lambda/\lambda_c, \]
is the normalized initiation rate. Adding 1 to both sides
of (8), taking reciprocals, and then multiplying by $(-e_n)$
yields
\[
-\eta = \cfrac{-e_n}{1 + \cfrac{-e_n}{1 + \cfrac{-e_n}{\;\ddots\; 1 - e_n}}},
\tag{9}
\]
where $e_n$ appears in the continued fraction a total of $n+1$
times. This implies of course that $e_n = e_n(\eta)$, i.e., $e_n$ is a
function of $\eta$.
Eq. (9) yields a polynomial equation of degree $\lceil (n+1)/2 \rceil$ in $e_n$. For example, for $n = 2$, (9) becomes
\[ e_2^2 - (2\eta + 1) e_2 + \eta = 0. \]
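Rather than forming this polynomial explicitly, $e_n(\eta)$ can also be computed numerically from the steady-state relations (3)-(5). The sketch below is one possible approach, not the method used in this paper: with $\lambda_c$ normalized to 1, it takes a trial value $r$ for $e_n$, propagates $e_1 = 1 - r/\eta$ and $e_{i+1} = 1 - r/e_i$ along the chain, and bisects on $r$ until the last coordinate agrees with the trial value. The function name `hrfm_en` and the iteration count are assumptions made for illustration.

```python
def hrfm_en(n, eta, iters=200):
    """Approximate e_n(eta), the last equilibrium coordinate of the n-site HRFM.

    Assumes lambda_c = 1, so that eta = lambda / lambda_c equals the initiation rate.
    """
    def mismatch(r):
        # Propagate the steady-state relations for a trial value r of e_n.
        e = 1.0 - r / eta              # e_1, from (5) with lambda_n = lambda_c
        for _ in range(n - 1):
            if e <= 0.0:
                return -1.0            # trial value is too large
            e = 1.0 - r / e            # e_{i+1}, from (4) with lambda_i = lambda_c
        return e - r                   # vanishes when r equals the true e_n

    lo, hi = 0.0, 1.0                  # e_n lies in (0, 1)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $n = 2$ and $\eta = 1$ the result should agree with the root of $e_2^2 - 3e_2 + 1 = 0$ that lies in $(0, 1)$, namely $(3 - \sqrt{5})/2 \approx 0.382$.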
Our previous studies analyzed the HRFM when λ →
∞ and when λ → 0 [17], [16]. In this paper, we consider
for the first time the HRFM when n → ∞, i.e. the infinite-dimensional HRFM. In this case, the term on the right-hand side of (9) becomes an infinite 1-periodic continued
fraction [18, Chapter 3]. Using results from the analytic
theory of continued fractions, we derive an explicit and
simple expression for
\[ e_\infty(\eta) := \lim_{n \to \infty} e_n(\eta), \]
and hence for the steady-state translation rate
\[ R_\infty(\eta) := \lambda_c e_\infty(\eta). \]
We also prove bounds that show that our results
provide a good approximation for the steady-state translation rate in the finite-dimensional HRFM. For example,
for n = 15 the relative error between e∞ (η) and e15 (η)
is already less than 4% for all η ≥ 0. It is important to
note that the typical length of mRNA sequences is larger
than 15 sites. For example, in S. cerevisiae the mean length
is about 33 sites; in mammals the mRNA chains are
much longer; thus, the asymptotic results here provide
a good approximation for the translation rate in HRFM
models of gene translation.
The remainder of this paper is organized as follows.
The next section describes our main results. Section 3
describes the application of the analytic results to a
biological example. The proofs of the main results are
given in Section 4.
2 MAIN RESULTS
Our first result shows that given two HRFMs with
different lengths, but with the same η, the steady-state
translation rate in the longer chain is smaller than the
one in the shorter chain.
Proposition 1 Fix $\eta > 0$. Let $e(\eta) = [e_1, \dots, e_n]'$ denote
the unique equilibrium of the $n$-dimensional HRFM in $C$,
and let $\tilde{e}(\eta) = [\tilde{e}_1, \dots, \tilde{e}_{n+1}]'$ denote the unique equilibrium
of the $(n + 1)$-dimensional HRFM in $C$. Then
\[ \tilde{e}_{n+1} < e_n. \tag{10} \]
Fig. 1. $e_n(\eta)$ as a function of $\eta \in [0, 1/2]$ for n = 3
(dash-dot), and n = 10 (dotted). The solid line is the
function $f(\eta)$.
In other words, the occupancy level at the last site is a
decreasing function of the length of the chain n.
Combining (10) with the fact that the equilibrium
point is in Int(C) for all n (and thus every coordinate $e_i$
is bounded), we conclude that the limit
\[ e_\infty(\eta) := \lim_{n \to \infty} e_n(\eta) \]
exists for all $\eta > 0$.
Define $f : \mathbb{R}_+ \to \mathbb{R}_+$ by
\[
f(x) :=
\begin{cases}
x(1 - x), & 0 \le x < 1/2, \\
1/4, & 1/2 \le x.
\end{cases}
\]
Our main result provides a simple closed-form expression for e∞ (η).
Theorem 1 For every η > 0,
e∞ (η) = f (η).
(11)
Figure 1 depicts en (η), n = 3, 10, for η ∈ [0, 1/2].
The function f (η) is also shown. It may be seen that
as n increases en (η) indeed converges to f (η), and
that en (η) agrees well with f (η) already for n = 10.
Figure 2 depicts en (η), n = 3, 10, 15, for η ∈ [1/2, 4].
The function f (η) is also shown. It may be seen that
as n increases en (η) indeed converges to f (η). Figures 1
and 2 suggest that f (η) is a good approximation of en (η)
already for relatively small values of n (e.g., n = 15).
Below we derive rigorous error bounds that indeed
verify this. Thus, the closed-form expression in (11) is
useful for the finite-dimensional HRFM as well.
Let R∞ (η) := λc e∞ (η) denote the steady-state translation rate in the infinite-dimensional HRFM. The next
result follows immediately from Theorem 1.
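Corollary 1
\[
R_\infty(\lambda, \lambda_c) =
\begin{cases}
\lambda(\lambda_c - \lambda)/\lambda_c, & \lambda < \lambda_c/2, \\
\lambda_c/4, & \lambda \ge \lambda_c/2.
\end{cases}
\tag{12}
\]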
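The piecewise expression (12) is straightforward to evaluate. The short Python function below is a direct transcription, offered as a sketch; the name `r_inf` is an assumption made for illustration.

```python
def r_inf(lam, lam_c):
    """Steady-state translation rate of the infinite-dimensional HRFM, Eq. (12).

    lam   : initiation rate (lambda)
    lam_c : common transition rate (lambda_c)
    """
    if lam < lam_c / 2.0:
        return lam * (lam_c - lam) / lam_c   # branch lambda < lambda_c / 2
    return lam_c / 4.0                        # branch lambda >= lambda_c / 2
```

The two branches agree at λ = λc/2, where both give λc/4, so the rate is a continuous function of the initiation rate.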
Fig. 2. $e_n(\eta)$ as a function of $\eta \in [1/2, 4]$ for n = 3 (dash-dot), n = 10 (dotted), and n = 15 (dashed). The solid line
is the function $f(\eta)$.
Fig. 3. The function $\frac{1}{4} \tan^2(\pi/(n + 2))$ as a function of n.
An important open problem in gene translation is
related to the dominant gene translation regime: some
studies claim that initiation is the rate limiting step [19],
while others claim that the elongation is also rate limiting [20], [21]. In terms of the HRFM, the question
is whether $\lambda$ or $\lambda_c$ is the rate limiting factor. Eq. (12)
suggests that both initiation and elongation can be the
rate limiting factor. Indeed, (12) implies that the behavior
of $R_\infty(\lambda, \lambda_c)$ may be divided into two different regimes.
If the normalized initiation rate $\eta$ is much smaller
than 1/2 then
\[ R_\infty = \lambda(\lambda_c - \lambda)/\lambda_c \approx \lambda \lambda_c/\lambda_c = \lambda, \]
so in particular the rate limiting factor is the initiation
rate $\lambda$. On the other hand, if $\eta \ge 1/2$ then $R_\infty = \lambda_c/4$, so
in particular the transition rate $\lambda_c$ becomes the limiting
factor.
Once $e_\infty(\eta)$ is explicitly known, it is natural to seek a
bound for the difference
\[ d_n(\eta) := |e_n(\eta) - e_\infty(\eta)|, \tag{13} \]
and for the relative difference
\[ r_n(\eta) := 100 \, \frac{d_n(\eta)}{e_n(\eta)} \quad \text{(measured in percent)}. \]
Indeed, if these errors are small enough then the closed-form results on the infinite-dimensional HRFM provide
a good approximation for the behavior of the finite-dimensional HRFM. In the next subsections, we derive
such bounds when $\eta$ is close to zero and when $\eta \to \infty$.
2.1 The case η small
In this subsection, we consider the case $\eta \in [0, 1/2]$. Thus,
$e_\infty(\eta) = \eta(1 - \eta)$. For a real function $f : \mathbb{R}_+ \to \mathbb{R}_+$, let
$o(f)$ denote the set of functions $g : \mathbb{R}_+ \to \mathbb{R}_+$ that satisfy
\[ \lim_{x \to 0} \frac{g(x)}{f(x)} = 0. \]
Proposition 2 Fix an arbitrary $n \ge 2$. Then
\[
\begin{aligned}
e_n(\eta) &= \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2}), \\
d_n(\eta) &= \eta^{n+2} + o(\eta^{n+2}), \\
r_n(\eta) &= o(\eta^{n}),
\end{aligned}
\tag{14}
\]
for all $\eta \in [0, 1/2]$.
In other words, for small $\eta$ the error bounds decrease
quickly with the dimension n.
2.2 The case η → ∞
The next result provides a bound on $d_n(\eta)$ that, as shown
in the proof, is based on considering the case where $\eta \to \infty$.
Proposition 3 For all $\eta > 0$ and all $n \ge 2$,
\[ d_n(\eta) \le \frac{1}{4} \tan^2(\pi/(n + 2)). \tag{15} \]
Figure 3 depicts the bound $\frac{1}{4} \tan^2(\pi/(n + 2))$ as a
function of n. It may be seen that for small values of n
this bound decreases quickly with n.
The next result provides a bound on $r_n$.
Proposition 4 Fix an arbitrary integer $k \ge 1$. Let $s_k \ge 1$ be
such that
\[ e_k(\eta) \le s_k \eta(1 - \eta), \quad \text{for all } \eta \in [0, 1/2). \tag{16} \]
Then for all $n \ge k$,
\[
r_n(\eta) \le
\begin{cases}
100(1 - 1/s_k), & \eta \in [0, 1/2), \\
100 \sin^2(\pi/(n + 2)), & \eta \ge 1/2.
\end{cases}
\tag{17}
\]
Note that the bounds for both $d_n$ and $r_n$ become
tighter as n increases.
To demonstrate the bound in (17) consider the case k = 1.
Recall that $e_1(\eta)$ is the solution of $-\eta = \frac{-e_1}{1 - e_1}$, i.e.,
$e_1 = \eta/(1 + \eta)$. Eq. (16) thus becomes
\[ 1 \le s_1 (1 - \eta^2). \]
Since $\eta \in [0, 1/2]$, this clearly holds for $s_1 = 4/3$. A similar calculation shows that $s_2 = 4 - 2\sqrt{2} \approx 1.1716$. For n =
15, a numerical calculation shows that $s_{15} \approx 1.00859$,
so the terms in (17) become $100(1 - 1/s_{15}) = 0.8517\%$,
and $100 \sin^2(\pi/17) = 3.3764\%$. Thus, the relative error
between en (η) and e∞ (η) is lower than 3.3764% for
all n ≥ 15 and all η ≥ 0. This demonstrates that the
explicit results for the infinite-dimensional HRFM in (11)
and (12) are useful for analyzing the finite-dimensional
HRFM as well.
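The numbers in this example can be reproduced with a few lines of Python. The sketch below uses the closed forms of $e_1(\eta)$ and $e_2(\eta)$ derived from (9), estimates the smallest admissible $s_k$ in (16) on a grid (with $\eta = 1/2$ included as the limit point), and evaluates the two terms in (17). The helper names and the grid size are assumptions made for illustration, and the value $s_{15} \approx 1.00859$ is taken from the text rather than recomputed.

```python
import math


def e1(eta):
    """Solution of (9) for n = 1."""
    return eta / (1.0 + eta)


def e2(eta):
    """Root in (0, 1) of e^2 - (2*eta + 1)*e + eta = 0, i.e. (9) for n = 2."""
    return ((2.0 * eta + 1.0) - math.sqrt(4.0 * eta * eta + 1.0)) / 2.0


def s_estimate(ek, grid=10000):
    """Grid estimate of the smallest s_k satisfying (16) on (0, 1/2]."""
    etas = [0.5 * j / grid for j in range(1, grid + 1)]
    return max(ek(eta) / (eta * (1.0 - eta)) for eta in etas)


def relative_error_bounds(s_k, n):
    """The two terms on the right-hand side of (17), in percent."""
    small_eta = 100.0 * (1.0 - 1.0 / s_k)
    large_eta = 100.0 * math.sin(math.pi / (n + 2)) ** 2
    return small_eta, large_eta


if __name__ == "__main__":
    print("s_1 estimate:", s_estimate(e1))
    print("s_2 estimate:", s_estimate(e2))
    print("bounds for n = 15:", relative_error_bounds(1.00859, 15))
```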
In the next section, we demonstrate how the results
derived above can be used for analyzing biological data.
3 A BIOLOGICAL EXAMPLE
Currently, there exist effective experimental approaches
for estimating the translation-elongation rate and the
translation rate, but there is no effective approach for
measuring the initiation rate. Indeed, initiation is a
highly complex mechanism and its efficiency is based on
numerous biophysical properties of the coding sequence
including: the nucleotide context of the START codon (i.e.
the first codon that is translated in a gene); the folding
of the RNA near the beginning of the Open Reading
Frame (ORF); the number of ribosomes and mRNA
molecules in the cell; the length and the nucleotide context of the 5’UTR; and more. Thus, although there exist
experimental approaches for measuring positions on the
mRNA suspected to correspond to initiation sites [15],
[22], there are no large scale measurements of initiation
rate.
Ingolia et al. [15] estimated the elongation rate in
M. musculus embryonic stem cells in the following
way: elongation was halted by applying cycloheximide.
Fragments covered by ribosomes were mapped to each
transcript and a baseline ribosomal read counts profile
corresponding to the ribosomal density was created. In
three additional experiments harringtonine was used to
stop initiation while allowing ribosomes that had already
started translating the mRNA to continue their movement along the mRNA. Cycloheximide was again applied 90/120/150 seconds after applying harringtonine
to stop translation. By measuring the “movement” of the
“ribosomal density wave” it was possible to estimate the
speed of elongation. Their results show that in mouse
embryonic cells 5.6 codons are translated per second
(this corresponds to 0.3733 sites per second in the RFM,
as the size of the ribosome spans about 15 codons [23]).
According to [15], this elongation speed is typical, and
it does not vary much between different genes.
A recent study by Schwanhausser et al. [24] estimated
the translation rate in M. musculus fibroblasts by simultaneously measuring protein abundance and turnover by
parallel metabolic pulse labelling for more than 5,000
genes in mouse. They found that the median translation
rate in mouse is about 40 proteins per mRNA per
hour (i.e., 2/3 proteins per mRNA per minute or 0.0111
proteins per mRNA per second).
One possible application of our theoretical results is
to estimate the initiation rate based on the translation
and (constant) transition rates. We now demonstrate this
for the case of mouse embryonic cells. As described
above, in steady-state the elongation rate is λc = 0.3733
HRFM sites per second. In mouse, the mean mRNA
length is about 465 codons corresponding to n = 31
sites in the HRFM. Since we know that the relative error
between e∞ (η) and e31 (η) is small for all η ≥ 0, we can
estimate the mean initiation rate in mouse based on (12).
We do this as follows. First, since R ≈ 0.0111 is very
different from λc /4 ≈ 0.0933, we assume that we are in
the regime where
λ < λc /2,
(18)
so R∞ = λ(λc − λ)/λc , i.e.,
0.0111 = λ(0.3733 − λ)/0.3733.
This is a quadratic equation in the initiation rate λ with
solutions: 0.361849 and 0.0114513. The first solution is
not feasible, as it does not satisfy (18). Thus, we conclude
that the mean initiation rate in mouse is 0.0114513 sites
per second corresponding to 0.17 codons per second.
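The calculation above amounts to solving the quadratic equation $\lambda^2 - \lambda_c \lambda + R \lambda_c = 0$ and keeping the root that satisfies (18). A minimal Python sketch follows; the function name `initiation_rate_roots` is an assumption made for illustration, and the inputs are the rounded values used in the text.

```python
import math


def initiation_rate_roots(R, lam_c):
    """Both roots of R = lam * (lam_c - lam) / lam_c, returned as (small, large).

    Only the smaller root can satisfy lam < lam_c / 2, i.e. condition (18).
    """
    if R > lam_c / 4.0:
        raise ValueError("R exceeds lam_c / 4, so there is no real solution")
    disc = math.sqrt(lam_c * lam_c - 4.0 * R * lam_c)
    return (lam_c - disc) / 2.0, (lam_c + disc) / 2.0


lam_c = 0.3733   # elongation rate in HRFM sites per second (5.6 codons/s over 15 codons per site)
R = 0.0111       # translation rate in proteins per mRNA per second
lam, rejected = initiation_rate_roots(R, lam_c)

print("initiation rate (sites/s):", lam)
print("rejected root (sites/s):", rejected)
print("initiation rate (codons/s):", 15 * lam)
```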
4 PROOFS
Proof of Proposition 1.
Seeking a contradiction, assume that
\[ \tilde{e}_{n+1} \ge e_n. \tag{19} \]
Since $\lambda(1 - e_1) = \lambda_c e_n$ and $\lambda(1 - \tilde{e}_1) = \lambda_c \tilde{e}_{n+1}$, this yields
$\tilde{e}_1 \le e_1$.
Since $e_2 = 1 - \frac{\eta(1 - e_1)}{e_1}$ and $\tilde{e}_2 = 1 - \frac{\eta(1 - \tilde{e}_1)}{\tilde{e}_1}$, we get $\tilde{e}_2 \le e_2$,
and proceeding in this fashion yields
\[ \tilde{e}_i \le e_i, \quad i = 1, \dots, n. \tag{20} \]
By [16, Proposition 3], $\tilde{e}_{n+1} < \tilde{e}_n$. Combining this
with (19) yields
\[ e_n \le \tilde{e}_{n+1} < \tilde{e}_n. \]
This contradicts (20), and thus completes the proof.
Proof of Thm. 1.
We begin by stating a known result from the theory of
continued fractions that will be used in the proof. Let
\[
K_n(a/1) := \cfrac{a}{1 + \cfrac{a}{1 + \cfrac{a}{\;\ddots\; 1 + a}}},
\tag{21}
\]
where $a$ appears in the continued fraction a total of $n$
times. Note that
\[ K_n(a/1) = \frac{a}{1 + K_{n-1}(a/1)}. \]
If $K_\infty(a/1) := \lim_{n \to \infty} K_n(a/1)$ exists then this yields
the quadratic equation
\[ K_\infty(a/1) = \frac{a}{1 + K_\infty(a/1)}, \]
whose solutions are
\[ \frac{-1 \pm 2\sqrt{a + 1/4}}{2}. \]
The next result (see, e.g., [18, Theorem 3.2], [25, Corollary
4.8]) provides a necessary and sufficient condition for
the convergence of (21) as $n \to \infty$.
Theorem 2 Consider the continued fraction $K_n(a/1)$,
with $a \in \mathbb{R} \setminus \{0\}$. Let
\[ x := \frac{-1 + 2\sqrt{a + 1/4}}{2}. \]
Then $K_\infty(a/1)$ exists if and only if
\[ a \ge -1/4, \tag{22} \]
and then $K_\infty(a/1) = x$.
We can now prove Theorem 1. First note that (9) may
be written as
\[ -\eta = K_{n+1}(-e_n/1). \tag{23} \]
As $n \to \infty$, the continued fraction on the right-hand side
of (9) becomes an infinite 1-periodic continued fraction
and since it is equal to $-\eta$, this infinite continued fraction
always converges. Thus, Theorem 2 implies that
\[ e_\infty(\eta) \le 1/4, \quad \text{for all } \eta \ge 0. \tag{24} \]
Suppose that $\eta > 0$ is sufficiently small so that $e_\infty(\eta) \in
[0, 1/4]$ (such an $\eta$ exists, as for $\eta = 0$ the equilibrium
point is $e(0) = 0$). Then combining (9) with Theorem 2
yields
\[ -\eta = \left(-1 + 2\sqrt{-e_\infty + 1/4}\right)/2, \]
and solving this yields
\[ e_\infty = \eta(1 - \eta). \tag{25} \]
Since $\eta(1 - \eta) \le 1/4$ for all $\eta \le 1/2$, a continuity
argument implies that (25) actually holds for all $\eta \le 1/2$,
and in particular, $e_\infty(1/2) = 1/4$.
Now suppose that $\eta \ge 1/2$. It follows from [17, Corollary 1] that for a fixed $n$, $e_n(\eta)$ is an increasing function
of $\eta$. Combining this with the result in Case 1 above
implies that $e_\infty(\eta) \ge 1/4$, and (24) yields $e_\infty(\eta) = 1/4$.
This completes the proof of Theorem 1.
Proof of Prop. 2.
Fix arbitrary $n > 0$ and $\eta > 0$. Write $e_n = e_n(\eta)$ as a
Taylor series in $\eta$, i.e.,
\[ e_n = d_0 \eta^0 + d_1 \eta^1 + d_2 \eta^2 + \dots. \]
We claim that $d_0 = 0$ and $d_1 = 1$. Indeed, recall that
for $\eta = 0$ the equilibrium point is the origin, so in
particular $e_n = 0$. This implies that $d_0 = 0$. Also,
taking $\eta \to 0$ in (8) yields $d_1 - 1 = 0$, so $d_1 = 1$.
We now use induction to prove that
\[ e_n = \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2}). \tag{26} \]
For $n = 1$, (9) becomes $-\eta = \frac{-e_1}{1 - e_1}$, so
\[ e_1 = \frac{\eta}{1 + \eta} = \eta - \eta^2 + \eta^3 - \eta^4 + \eta^5 - \dots \]
and this implies that (26) indeed holds for $n = 1$. Assume
that (26) holds for some $n \ge 1$, i.e., the solution of $-\eta =
K_{n+1}(-p/1)$ satisfies $p = \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2})$. For the
induction step, consider the equation
\[ -\eta = K_{n+2}(-e_{n+1}/1). \tag{27} \]
Letting $\alpha := 1 - \eta^{-1} e_{n+1}$ yields
\[ -\alpha = K_{n+1}(-e_{n+1}/1), \]
so by the induction hypothesis
\[ e_{n+1} = \alpha - \alpha^2 + \alpha^{n+2} + o(\alpha^{n+2}). \tag{28} \]
Let
\[ e_{n+1} = \sum_{k=1}^{\infty} c_k \eta^k. \tag{29} \]
We already know that $c_1 = 1$, so $\alpha = 1 - \eta^{-1} e_{n+1} =
-\sum_{k=2}^{\infty} c_k \eta^{k-1}$. Substituting this in (28) yields
\[
\eta + \sum_{k=2}^{\infty} c_k \eta^k = -\sum_{k=2}^{\infty} c_k \eta^{k-1} - \Big(\sum_{k=2}^{\infty} c_k \eta^{k-1}\Big)^2
+ \Big(-\sum_{k=2}^{\infty} c_k \eta^{k-1}\Big)^{n+2} + o\Big(\Big(\sum_{k=2}^{\infty} c_k \eta^{k-1}\Big)^{n+2}\Big).
\tag{30}
\]
Equating coefficients yields $c_2 = -1$, $c_3 = 0$, $c_4 =
0, \dots, c_{n+2} = 0$, and $c_{n+3} = 1$. Substituting these
parameters in (29) yields
\[ e_{n+1} = \eta - \eta^2 + \eta^{n+3} + o(\eta^{n+3}). \]
This completes the proof of (26). Now
\[
\begin{aligned}
d_n &= e_n - e_\infty \\
&= \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2}) - \eta(1 - \eta) \\
&= \eta^{n+2} + o(\eta^{n+2}),
\end{aligned}
\]
and
\[
\begin{aligned}
r_n &= 100 \, d_n/e_n \\
&= 100 \, \frac{\eta^{n+2} + o(\eta^{n+2})}{\eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2})} \\
&= o(\eta^{n}).
\end{aligned}
\]
This completes the proof of Prop. 2.
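The convergence statement of Theorem 2, and its use in (23)-(25), can be checked numerically. The following Python sketch evaluates the finite continued fraction $K_n(a/1)$ of (21) through the recursion $K_n = a/(1 + K_{n-1})$ with $K_1 = a$; the function name `K` and the sample values are assumptions made for illustration. For $a = -\eta(1 - \eta)$ with $\eta < 1/2$, the iterates should approach $-\eta$, in agreement with (25).

```python
def K(n, a):
    """Finite continued fraction K_n(a/1) from (21), with a appearing n times."""
    value = a                          # K_1(a/1) = a
    for _ in range(n - 1):
        value = a / (1.0 + value)      # K_m(a/1) = a / (1 + K_{m-1}(a/1))
    return value


if __name__ == "__main__":
    eta = 0.2
    a = -eta * (1.0 - eta)             # a = -e_inf(eta), as in (25)
    for n in (1, 2, 5, 10, 60):
        print(n, K(n, a))
```

Note that for $a$ close to $-1/4$ (that is, $\eta$ close to 1/2) the convergence is considerably slower, so more terms are needed before the iterates settle.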
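Proof of Prop. 3.
First note that combining the definition of $d_n$ in (13)
with (10) yields
\[ d_n(\eta) = e_n(\eta) - e_\infty(\eta). \]
It was shown in [16] that for every $n \ge 2$,
\[ \lim_{\eta \to \infty} e_n(\eta) = \left(4 \cos^2(\pi/(n + 2))\right)^{-1}. \tag{31} \]
Thus,
\[
\begin{aligned}
\lim_{\eta \to \infty} d_n(\eta) &= \lim_{\eta \to \infty} \left(e_n(\eta) - e_\infty(\eta)\right) \\
&= \frac{1}{4 \cos^2\left(\frac{\pi}{n+2}\right)} - \frac{1}{4} \\
&= \frac{1}{4} \tan^2\left(\frac{\pi}{n+2}\right).
\end{aligned}
\]
For a fixed $n$, $e_n(\eta)$ is an increasing function of $\eta$ [17,
Corollary 1] and this implies that $d_n(\eta) \le \lim_{\eta \to \infty} d_n(\eta)$
for all $\eta > 0$. This proves (15).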
Proof of Prop. 4.
We consider two cases.
Case 1. Suppose that $\eta \ge 1/2$. Then $e_\infty(\eta) = 1/4$,
so $r_n(\eta) = 100\left(1 - \frac{1}{4 e_n(\eta)}\right)$, and using (31) yields
$r_n(\eta) \le 100 \sin^2(\pi/(n + 2))$.
Case 2. Suppose that $\eta \in [0, 1/2)$. Since $e_n(\eta)$ is a
decreasing function of $n$, $e_n(\eta) \le e_k(\eta) \le s_k \eta(1 - \eta)$
for all such $\eta$. This yields
\[
r_n(\eta) = 100\left(1 - \frac{\eta(1 - \eta)}{e_n(\eta)}\right) \le 100\left(1 - \frac{1}{s_k}\right).
\]
This completes the proof of Prop. 4.
5 DISCUSSION
The RFM is a new deterministic mathematical model for
ribosome flow along the mRNA. The order n of the RFM
corresponds to the number of sites along the mRNA
chain. The RFM encapsulates both the simple exclusion
and the total asymmetry properties of the stochastic
TASEP model.
We note in passing that the RFM is a monotone dynamical system [14]. Monotone dynamical systems, and
their generalization to monotone control systems [26], have
recently been shown to provide powerful models in
systems biology (see, e.g., [27], [28], [29], [30]).
The RFM has already been successfully used to
model and analyze important properties of the process of translation-elongation. For example, in eukaryotes mRNA molecules usually form circular structures
leading to ribosomal recycling [31], [32]. To analyze this,
Ref. [17] considered the RFM as a control system, with
the initiation rate as the input, and the translation rate
as the output, and closing the loop with a positive
linear feedback. As another example, Ref. [33] studied
the RFM under the assumption of periodic time-varying
initiation rate and/or transition rates, and showed that
the state-variables entrain to the periodic excitation. The
motivation for this is a recent biological study demonstrating that the gene expression pattern entrains to a
periodically varying abundance of tRNA molecules [34]
(see also [35], [36]). More generally, oscillations play an
important role in many dynamic cellular processes (see,
e.g., [37]) and understanding the mechanisms relating
oscillations in protein levels to oscillations at the genetic
level is very important.
Under the assumption of equal transition rates along
the chain the RFM becomes the HRFM. In this paper, we
considered for the first time the HRFM with n → ∞. Our
main result provides a simple closed-form expression for
the translation rate R in terms of the initiation rate λ
and transition rate λc . We also derived bounds showing
that the behavior of the finite-dimensional HRFM is in
good agreement with the asymptotic results already for
relatively small values of n. Thus, our results may be
used to (approximately) analyze the behavior of finite
HRFMs that model translation.
We demonstrated one such application, namely, using
the analytic results to obtain an estimation of the initiation rate in mouse embryonic cells based on recent
estimates of the translation and transition rates. Our
approach suggests several possible biological experiments. For example, it is possible to manipulate various
biophysical properties that affect the initiation rate, and
then measure the new translation rate. Applying (12),
one can obtain the modified initiation rate. This may lead
to a quantitative value for the effect of various parameters
on the initiation rate and, in particular, to an understanding of which factors are more crucial than others.
Gene translation is a central intra-cellular process in all
living organisms, and deriving systematic estimations
of various parameters of this process, and understanding
their relative importance, may contribute to numerous
biomedical disciplines.
As a topic for further research, we note that TASEP
and its variants have been used to model and study not
only translation, but many other biological and artificial
systems as well. Examples include ad hoc communication networks, biomolecular motors, and vehicular
traffic [38], [10], [39]. It may be of interest to use the deterministic, and in some ways simpler, RFM (or HRFM)
to model and analyze these systems.
REFERENCES
[1] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter, Molecular Biology of the Cell, New York, 2002.
[2] S. Zhang, E. Goldman, and G. Zubay, “Clustering of low usage codons and ribosome movement,” J. Theor. Biol., vol. 170, pp. 339–54, 1994.
[3] A. Dana and T. Tuller, “Efficient manipulations of synonymous mutations for controlling translation rate–an analytical approach.” J. Comput. Biol., vol. 19, pp. 200–231, 2012.
[4] R. Heinrich and T. Rapoport, “Mathematical modelling of translation of mRNA in eucaryotes; steady state, time-dependent processes and application to reticulocytes,” J. Theor. Biol., vol. 86, pp. 279–313, 1980.
[5] C. T. MacDonald, J. H. Gibbs, and A. C. Pipkin, “Kinetics of biopolymerization on nucleic acid templates,” Biopolymers, vol. 6, pp. 1–25, 1968.
[6] T. Tuller, I. Veksler, N. Gazit, M. Kupiec, E. Ruppin, and M. Ziv, “Composite effects of the coding sequences determinants on the speed and density of ribosomes,” Genome Biol., vol. 12, no. 11, p. R110, 2011.
[7] T. Tuller, M. Kupiec, and E. Ruppin, “Determinants of protein abundance and translation efficiency in s. cerevisiae.” PLOS Computational Biology, vol. 3, pp. 2510–2519, 2007.
[8] L. B. Shaw, R. K. Zia, and K. H. Lee, “Totally asymmetric exclusion process with extended objects: a model for protein synthesis,” Phys. Rev. E Stat. Nonlin. Soft. Matter Phys., vol. 68, p. 021910, 2003.
[9] R. Zia, J. Dong, and B. Schmittmann, “Modeling translation in protein synthesis with TASEP: A tutorial and recent developments,” J. Statistical Physics, vol. 144, pp. 405–428, 2011.
[10] A. Schadschneider, D. Chowdhury, and K. Nishinari, Stochastic Transport in Complex Systems: From Molecules to Vehicles. Elsevier, 2011.
[11] S. Reuveni, I. Meilijson, M. Kupiec, E. Ruppin, and T. Tuller, “Genome-scale analysis of translation elongation with a ribosome flow model,” PLOS Computational Biology, vol. 7, p. e1002127, 2011.
[12] J. B. Plotkin and G. Kudla, “Synonymous but not the same: the causes and consequences of codon bias,” Nat. Rev. Genet., vol. 12, pp. 32–42, 2010.
[13] M. Kozak, “Point mutations define a sequence flanking the aug initiator codon that modulates translation by eukaryotic ribosomes,” Cell, vol. 44, no. 2, pp. 283–92, 1986.
[14] M. Margaliot and T. Tuller, “Stability analysis of the ribosome flow model,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 9, pp. 1545–1552, 2012.
[15] N. T. Ingolia, L. Lareau, and J. Weissman, “Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes,” Cell, vol. 147, no. 4, pp. 789–802, 2011.
[16] M. Margaliot and T. Tuller, “On the steady-state distribution in the homogeneous ribosome flow model,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 9, pp. 1724–1736, 2012.
[17] M. Margaliot and T. Tuller, “Ribosome flow model with positive feedback,” J. Royal Society Interface, vol. 10, p. 20130267, 2013.
[18] W. B. Jones and W. J. Thron, Continued Fractions: Analytic Theory and Applications. Reading, MA: Addison-Wesley, 1980.
[19] G. Kudla, A. W. Murray, D. Tollervey, and J. B. Plotkin, “Coding-sequence determinants of gene expression in escherichia coli,” Science, vol. 324, pp. 255–258, 2009.
[20] F. Supek and T. Smuc, “On relevance of codon usage to expression of synthetic and natural genes in escherichia coli,” Genetics, vol. 185, pp. 1129–1134, 2010.
[21] T. Tuller, Y. Y. Waldman, M. Kupiec, and E. Ruppin, “Translation efficiency is determined by both codon bias and folding energy,” Proceedings of the National Academy of Sciences, vol. 107, no. 8, pp. 3645–50, 2010.
[22] S. Lee, B. Liu, S. Lee, S. Huang, B. Shen, and S. Qian, “Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution,” Proc Natl Acad Sci U S A., vol. 109, no. 37, pp. E2424–32, 2012.
[23] N. T. Ingolia, S. Ghaemmaghami, J. R. Newman, and J. S. Weissman, “Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling,” Science, vol. 324, no. 5924, pp. 218–23, 2009.
[24] B. Schwanhausser, D. Busse, N. Li, G. Dittmar, J. Schuchhardt, J. Wolf, W. Chen, and M. Selbach, “Global quantification of mammalian gene expression control,” Nature, vol. 473, no. 7347, pp. 337–42, 2011.
[25] L. Lorentzen and H. Waadeland, Continued Fractions: Convergence Theory, 2nd ed. Paris: Atlantis Press, 2008, vol. 1.
[26] D. Angeli and E. D. Sontag, “Monotone control systems,” IEEE Trans. Automat. Control, vol. 48, pp. 1684–1698, 2003.
[27] P. D. Leenheer, D. Angeli, and E. D. Sontag, “Monotone chemical reaction networks,” J. Mathematical Chemistry, vol. 41, pp. 295–314, 2007.
[28] G. Enciso and E. D. Sontag, “Monotone systems under positive feedback: multistability and a reduction theorem,” Systems Control Lett., vol. 54, pp. 159–168, 2005.
[29] D. Angeli and E. D. Sontag, “Oscillations in I/O monotone
systems under negative feedback,” IEEE Trans. Automat. Control,
vol. 53, pp. 166–176, 2008.
[30] L. Wang, P. de Leenheer, and E. D. Sontag, “Conditions for
global stability of monotone tridiagonal systems with negative
feedback,” Systems Control Lett., vol. 59, pp. 130–138, 2010.
[31] A. W. Craig, A. Haghighat, A. T. Yu, and N. Sonenberg, “Interaction of polyadenylate-binding protein with the eIF4G homologue
PAIP enhances translation,” Nature, vol. 392, no. 6675, pp. 520–3,
1998.
[32] S. Z. Tarun and A. B. Sachs, “Binding of eukaryotic translation
initiation factor 4E (eIF4E) to eIF4G represses translation of uncapped mRNA,” Mol. Cell. Biol., vol. 17, pp. 6876–6886, 1997.
[33] M. Margaliot, E. D. Sontag, and T. Tuller, “Entrainment to periodic
initiation and transition rates in the ribosome flow model,” 2013,
submitted.
[34] M. Frenkel-Morgenstern, T. Danon, T. Christian, T. Igarashi, L. Cohen, Y. M. Hou, and L. J. Jensen, “Genes adopt non-optimal codon
usage to generate cell cycle-dependent oscillations in protein
levels,” Mol. Syst. Biol., vol. 8, p. 572, 2012.
[35] Y. Xu, P. Ma, P. Shah, A. Rokas, Y. Liu, and C. H. Johnson, “Non-optimal codon usage is a mechanism to achieve circadian clock
conditionality,” Nature, vol. doi:10.1038/nature11942, 2013.
[36] M. Zhou, J. Guo, J. Cha, M. Chae, S. Chen, J. M. Barral, M. S.
Sachs, and Y. Liu, “Non-optimal codon usage affects expression,
structure and function of clock protein FRQ,” Nature, vol. doi:
10.1038/nature11833, 2013.
[37] K. Kruse and F. Julicher, “Oscillations in cell biology,” Curr. Opin.
Cell Biol., vol. 17, no. 1, pp. 20–6, 2005.
[38] S. Srinivasa and M. Haenggi, “A statistical mechanics-based
framework to analyze ad hoc networks with random access,”
IEEE Trans. Mobile Computing, vol. 11, pp. 618–630, 2012.
[39] D. Chowdhury, A. Schadschneider, and K. Nishinari, “Physics
of transport and traffic phenomena in biology: from molecular
motors and cells to organisms,” Physics of Life Reviews, pp. 318–
352, 2005.
