Explicit expression for the steady state translation rate in the infinite-dimensional homogeneous ribosome flow model

Yoram Zarai, Michael Margaliot and Tamir Tuller

Abstract: Gene translation is a central stage in the intracellular process of protein synthesis. Gene translation proceeds in three major stages: initiation, elongation, and termination. During the elongation step ribosomes (intracellular macromolecules) link amino acids together in the order specified by messenger RNA (mRNA) molecules. The homogeneous ribosome flow model (HRFM) is a mathematical model of translation-elongation under the assumption of constant elongation rate along the mRNA sequence. The HRFM includes $n$ first-order nonlinear ordinary differential equations, where $n$ represents the length of the mRNA sequence, and two positive parameters: the ribosomal initiation rate and the (constant) elongation rate. Here we analyze the HRFM when $n$ goes to infinity and derive a simple expression for the steady-state protein synthesis rate. We also derive bounds that show that the behavior of the HRFM for finite, and relatively small, values of $n$ is already in good agreement with the closed-form result in the infinite-dimensional case. For example, for $n = 15$ the relative error is already less than 4%. Our results can thus be used in practice for analyzing the behavior of finite-dimensional HRFMs that model translation. To demonstrate this, we apply our approach to estimate the mean initiation rate in M. musculus, finding it to be around 0.17 codons per second. Other potential biological applications include re-engineering genes to produce a desired translation rate.

Index Terms: Gene translation, systems biology, computational models, monotone dynamical systems, periodic continued fractions.

1 INTRODUCTION

Proteins are macromolecules involved in all intracellular activities. Proteins are built of chains of smaller molecules called amino acids.
There are 20 different amino acids, so the alphabet of protein sequences includes 20 letters. DNA regions, called genes, encode proteins as an ordered list of amino acids. During the process of gene expression these regions are transcribed to mRNA molecules; DNA and mRNA molecules are sequences of 4 possible nucleotides. Gene translation is the process in which the information encoded in genes is deciphered and translated into proteins by molecular machines called ribosomes that move along the mRNA sequence (see [1]). During the translation process, each triplet of consecutive nucleotides, called a codon, is replaced by a corresponding amino acid.

Computational models of gene translation have been developed and used to address various biological and biotechnological questions (see, for example, [2], [3], [4], [5], [6], [7]). Rigorous analysis of these computational models is important for several reasons: it can deepen our understanding of the translation process, lead to efficient algorithms for optimizing gene translation for various biotechnological goals, and assist in improving the fidelity of the mathematical models.

The research of MM is partly supported by the ISF.
• Y. Zarai is with the School of Electrical Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel.
• M. Margaliot is with the School of Electrical Engineering and the Sagol School of Neuroscience, Tel-Aviv University, Tel-Aviv 69978, Israel.
• T. Tuller is with the Department of Biomedical Engineering and the Sagol School of Neuroscience, Tel-Aviv University, Tel-Aviv 69978, Israel.

A conventional model of translation-elongation is the Asymmetric Simple Exclusion Process (ASEP) (see [5], [4], [8], [9]). ASEP is a stochastic model describing particles that can hop from one site to another along a 1D lattice.
The term “simple exclusion” refers to the fact that hops may take place only to a target site that is not already occupied by another particle. The motion is assumed to be directionally asymmetric, i.e., there is some preferred direction of motion. In the Totally Asymmetric Simple Exclusion Process (TASEP) motion is unidirectional. TASEP models for translation are based on the following assumptions: (1) initiation time and the time a ribosome spends translating each site are random and codon dependent; and (2) ribosomes span over several sites and if two ribosomes are adjacent, the trailing one is delayed until the ribosome in front of it has proceeded onwards. Despite its rather simple description, it seems that rigorous analysis of the TASEP is non-trivial. The recent monograph [10] provides a detailed exposition of these issues.

Reuveni et al. [11] recently introduced a deterministic model for translation-elongation called the Ribosome Flow Model (RFM). In the RFM, mRNA molecules are coarse-grained into $n$ consecutive sites of codons. The state-variables $x_i(t) : \mathbb{R}_+ \to [0, 1]$, $i \in \{1, \dots, n\}$, describe the “occupancy level” of site $i$ at time $t$, with $x_i(t) = 1$ [$x_i(t) = 0$] indicating that the site is completely occupied [free]. The RFM is given by $n$ non-linear ordinary differential equations:

\[
\begin{aligned}
\dot{x}_1 &= \lambda(1 - x_1) - \lambda_1 x_1 (1 - x_2), \\
\dot{x}_2 &= \lambda_1 x_1 (1 - x_2) - \lambda_2 x_2 (1 - x_3), \\
&\;\;\vdots \\
\dot{x}_{n-1} &= \lambda_{n-2} x_{n-2} (1 - x_{n-1}) - \lambda_{n-1} x_{n-1} (1 - x_n), \\
\dot{x}_n &= \lambda_{n-1} x_{n-1} (1 - x_n) - \lambda_n x_n.
\end{aligned}
\tag{1}
\]

Here $\lambda > 0$ describes the initiation rate into the chain, $\lambda_i > 0$, $i \in \{1, \dots, n - 1\}$, is a parameter that controls the transition rate from site $i$ to site $i + 1$, and $\lambda_n > 0$ controls the output rate at the end of the chain. The equation for $\dot{x}_1$ describes the change in the occupancy level at the first site along the chain. The term $\lambda(1 - x_1)$ represents the fact that ribosomes enter at a rate $\lambda$, but can only bind to the site if it is not already occupied.
In particular, if $x_1 = 1$ (i.e., the site is completely full) then the effective binding rate becomes $\lambda(1 - x_1) = 0$. The term $-\lambda_1 x_1 (1 - x_2)$ represents the flow of ribosomes from site 1 to site 2. This becomes zero if $x_2(t) = 1$, i.e., if site 2 is completely occupied. The other equations follow similarly. The term $\lambda_n x_n(t)$ in the last equation represents the rate at which ribosomes leave the last site on the chain, that is, the translation rate or protein synthesis rate. Note that the RFM encapsulates both the simple exclusion and the total asymmetry properties of the TASEP.

The numerical values of the parameters $\lambda$, $\lambda_i$ depend on physical features such as the number of available free ribosomes, the nucleotide context surrounding initiation codons, etc. (see [11], [12], [13], [6], [3]). As demonstrated in [11], simulations of the full TASEP and the simpler RFM yield similar predictions of translation rates. For example, the correlation between their predictions of translation rates over the set of endogenous genes of S. cerevisiae is 0.96.

We briefly review some previous results on the analysis of the RFM. Let $x(t, a)$ denote the solution of (1) at time $t$ for the initial condition $x(0) = a$. We always consider initial conditions $x(0)$ in the closed unit cube $C := [0, 1]^n$. It has been shown in [14] that the RFM admits a unique equilibrium point $e$ in $C$, and

\[
\lim_{t \to \infty} x(t, a) = e, \quad \text{for all } a \in C.
\]

Furthermore, $e \in \mathrm{Int}(C)$, i.e., every coordinate $e_i$ of $e$ satisfies $e_i \in (0, 1)$. This means that the model parameters define a unique attracting steady-state of ribosome densities (and thus a unique steady-state translation rate). Perturbations in the $x_i$'s, corresponding to events such as ribosome drop-off, will not affect this steady state. Let

\[
R := \lambda_n e_n \tag{2}
\]

denote the (steady-state) translation rate. From a biological point of view, it is important to understand how $R$ depends on the RFM parameters. Of course, it is enough to understand the dependence of $e$ on the RFM parameters.
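As a rough numerical illustration of the convergence result quoted above, the following sketch integrates the RFM equations (1) with a plain forward-Euler scheme and reads off the translation rate $R = \lambda_n e_n$ at the end of the run. The function names, the step size `dt`, and the horizon `t_end` are illustrative assumptions and are not taken from [11] or [14]; a production simulation would likely use an adaptive ODE solver.

```python
def rfm_rhs(x, lam, rates):
    """Right-hand side of the RFM equations (1).

    x     -- occupancy levels [x_1, ..., x_n]
    lam   -- initiation rate (lambda)
    rates -- transition rates [lambda_1, ..., lambda_n]
    """
    n = len(x)
    # flow[0] is the entry flow, flow[i] the flow from site i to site i+1
    # (1-based site indices), and flow[n] the exit flow lambda_n * x_n.
    flow = [lam * (1.0 - x[0])]
    for i in range(n - 1):
        flow.append(rates[i] * x[i] * (1.0 - x[i + 1]))
    flow.append(rates[n - 1] * x[n - 1])
    return [flow[i] - flow[i + 1] for i in range(n)]


def rfm_steady_state(lam, rates, x0=None, dt=0.01, t_end=400.0):
    """Integrate (1) by forward Euler; return (x(t_end), lambda_n * x_n(t_end))."""
    n = len(rates)
    x = list(x0) if x0 is not None else [0.0] * n
    for _ in range(int(round(t_end / dt))):
        dx = rfm_rhs(x, lam, rates)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x, rates[-1] * x[-1]


if __name__ == "__main__":
    # Same parameters, two different initial conditions in the unit cube.
    e_a, r_a = rfm_steady_state(1.0, [1.0, 1.0], x0=[0.0, 0.0])
    e_b, r_b = rfm_steady_state(1.0, [1.0, 1.0], x0=[1.0, 1.0])
    print(r_a, r_b)
```

Starting from an empty chain and from a fully occupied chain, the two runs should agree up to the integration tolerance, which is consistent with the existence of a unique attracting equilibrium in $C$.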
For $x = e$ the right-hand side of all the equations in (1) is zero, so

\[
\lambda(1 - e_1) = \lambda_1 e_1 (1 - e_2) = \lambda_2 e_2 (1 - e_3) = \dots = \lambda_{n-1} e_{n-1} (1 - e_n) = \lambda_n e_n. \tag{3}
\]

This yields

\[
e_i = \frac{\lambda_n e_n}{\lambda_i (1 - e_{i+1})}, \quad i \in \{1, \dots, n - 1\}, \tag{4}
\]

and

\[
e_1 = 1 - \lambda_n e_n / \lambda. \tag{5}
\]

Combining (4) and (5) provides a finite continued fraction expression for the last coordinate $e_n$ of $e$, namely,

\[
1 - \lambda_n e_n / \lambda =
\cfrac{\lambda_n e_n / \lambda_1}
      {1 - \cfrac{\lambda_n e_n / \lambda_2}
                 {1 - \cfrac{\lambda_n e_n / \lambda_3}
                            {\ddots - \cfrac{\lambda_n e_n / \lambda_{n-1}}
                                            {1 - \lambda_n e_n / \lambda_n}}}}. \tag{6}
\]

It has been shown experimentally that in some cases the ribosomal elongation speed is close to constant along the mRNA sequence [15]. To model this, Ref. [16] has considered the RFM in the special case where

\[
\lambda_1 = \lambda_2 = \dots = \lambda_n := \lambda_c, \tag{7}
\]

that is, the transition rates $\lambda_i$ are all equal, and $\lambda_c$ denotes their common value. Since this Homogeneous Ribosome Flow Model (HRFM) includes only two parameters, $\lambda$ and $\lambda_c$, the analysis is simplified. In particular, (6) becomes

\[
\frac{1}{\eta} e_n - 1 =
\cfrac{-e_n}
      {1 + \cfrac{-e_n}
                 {1 + \cfrac{-e_n}
                            {\ddots + \cfrac{-e_n}{1 - e_n}}}}, \tag{8}
\]

where $e_n$ appears in the continued fraction a total of $n$ times, and $\eta := \lambda / \lambda_c$ is the normalized initiation rate. Adding 1 to both sides of (8), taking reciprocals, and then multiplying by $(-e_n)$ yields

\[
-\eta =
\cfrac{-e_n}
      {1 + \cfrac{-e_n}
                 {1 + \cfrac{-e_n}
                            {\ddots + \cfrac{-e_n}{1 - e_n}}}}, \tag{9}
\]

where $e_n$ appears in the continued fraction a total of $n + 1$ times. This implies of course that $e_n = e_n(\eta)$, i.e., $e_n$ is a function of $\eta$. Eq. (9) yields a polynomial equation of degree $\lceil (n + 1)/2 \rceil$ in $e_n$. For example, for $n = 2$, (9) becomes $e_2^2 - (2\eta + 1) e_2 + \eta = 0$.

Our previous studies analyzed the HRFM when $\lambda \to \infty$ and when $\lambda \to 0$ [17], [16]. In this paper, we consider for the first time the HRFM when $n \to \infty$, i.e., the infinite-dimensional HRFM. In this case, the term on the right-hand side of (9) becomes an infinite 1-periodic continued fraction [18, Chapter 3]. Using results from the analytic theory of continued fractions, we derive an explicit and simple expression for

\[
e_\infty(\eta) := \lim_{n \to \infty} e_n(\eta).
\]

Of course, this yields an explicit expression for the steady-state translation rate

\[
R_\infty(\eta) := \lambda_c e_\infty(\eta).
\]

We also prove bounds that show that our results provide a good approximation for the steady-state translation rate in the finite-dimensional HRFM. For example, for $n = 15$ the relative error between $e_\infty(\eta)$ and $e_{15}(\eta)$ is already less than 4% for all $\eta \geq 0$. It is important to note that the typical length of mRNA sequences is larger than 15 sites. For example, in S. cerevisiae the mean length is about 33 sites; in mammals the mRNA chains are much longer. Thus, the asymptotic results here provide a good approximation for the translation rate in HRFM models of gene translation.

The remainder of this paper is organized as follows. The next section describes our main results. Section 3 describes the application of the analytic results to a biological example. The proofs of the main results are given in Section 4.

2 MAIN RESULTS

Our first result shows that given two HRFMs with different lengths, but with the same $\eta$, the steady-state translation rate in the longer chain is smaller than the one in the shorter chain.

Proposition 1 Fix $\eta > 0$. Let $e(\eta) = [e_1, \dots, e_n]'$ denote the unique equilibrium of the $n$-dimensional HRFM in $C$, and let $\tilde{e}(\eta) = [\tilde{e}_1, \dots, \tilde{e}_{n+1}]'$ denote the unique equilibrium of the $(n + 1)$-dimensional HRFM in $C$. Then

\[
\tilde{e}_{n+1} < e_n. \tag{10}
\]

In other words, the occupancy level at the last site is a decreasing function of the length of the chain $n$. Combining (10) with the fact that the equilibrium point is in $\mathrm{Int}(C)$ for all $n$ (and thus every coordinate $e_i$ is bounded), we conclude that the limit

\[
e_\infty(\eta) := \lim_{n \to \infty} e_n(\eta)
\]

exists for all $\eta > 0$. Define $f : \mathbb{R}_+ \to \mathbb{R}_+$ by

\[
f(x) :=
\begin{cases}
x(1 - x), & 0 \leq x < 1/2, \\
1/4, & 1/2 \leq x.
\end{cases}
\]

Fig. 1. $e_n(\eta)$ as a function of $\eta \in [0, 1/2]$ for $n = 3$ (dash-dot), and $n = 10$ (dotted). The solid line is the function $f(\eta)$.

Our main result provides a simple closed-form expression for $e_\infty(\eta)$.

Theorem 1 For every $\eta > 0$, $e_\infty(\eta) = f(\eta)$.
(11)

Figure 1 depicts $e_n(\eta)$, $n = 3, 10$, for $\eta \in [0, 1/2]$. The function $f(\eta)$ is also shown. It may be seen that as $n$ increases $e_n(\eta)$ indeed converges to $f(\eta)$, and that $e_n(\eta)$ agrees well with $f(\eta)$ already for $n = 10$. Figure 2 depicts $e_n(\eta)$, $n = 3, 10, 15$, for $\eta \in [1/2, 4]$. The function $f(\eta)$ is also shown. It may be seen that as $n$ increases $e_n(\eta)$ indeed converges to $f(\eta)$. Figures 1 and 2 suggest that $f(\eta)$ is a good approximation of $e_n(\eta)$ already for relatively small values of $n$ (e.g., $n = 15$). Below we derive rigorous error bounds that indeed verify this. Thus, the closed-form expression in (11) is useful for the finite-dimensional HRFM as well.

Fig. 2. $e_n(\eta)$ as a function of $\eta \in [1/2, 4]$ for $n = 3$ (dash-dot), $n = 10$ (dotted), and $n = 15$ (dashed). The solid line is the function $f(\eta)$.

Let $R_\infty(\eta) := \lambda_c e_\infty(\eta)$ denote the steady-state translation rate in the infinite-dimensional HRFM. The next result follows immediately from Theorem 1.

Corollary 1

\[
R_\infty(\lambda, \lambda_c) =
\begin{cases}
\lambda(\lambda_c - \lambda)/\lambda_c, & \lambda < \lambda_c/2, \\
\lambda_c/4, & \lambda \geq \lambda_c/2.
\end{cases}
\tag{12}
\]

An important open problem in gene translation is related to the dominant gene translation regime: some studies claim that initiation is the rate limiting step [19], while others claim that elongation is also rate limiting [20], [21]. In terms of the HRFM, the question is whether $\lambda$ or $\lambda_c$ is the rate limiting factor. Eq. (12) suggests that both initiation and elongation can be the rate limiting factor. Indeed, (12) implies that the behavior of $R_\infty(\lambda, \lambda_c)$ may be divided into two different regimes. If the normalized initiation rate $\eta$ is much smaller than 1/2 then

\[
R_\infty = \lambda(\lambda_c - \lambda)/\lambda_c \approx \lambda \lambda_c / \lambda_c = \lambda,
\]

so in particular the rate limiting factor is the initiation rate $\lambda$. On the other hand, if $\eta \geq 1/2$ then $R_\infty = \lambda_c/4$, so in particular the transition rate $\lambda_c$ becomes the limiting factor.

Once $e_\infty(\eta)$ is explicitly known, it is natural to seek a bound for the difference

\[
d_n(\eta) := |e_n(\eta) - e_\infty(\eta)|, \tag{13}
\]

and for the relative difference

\[
r_n(\eta) := 100 \, \frac{d_n(\eta)}{e_n(\eta)}
\]

(measured in percent). Indeed, if these errors are small enough then the closed-form results on the infinite-dimensional HRFM provide a good approximation for the behavior of the finite-dimensional HRFM. In the next subsections, we derive such bounds when $\eta$ is close to zero and when $\eta \to \infty$.

2.1 The case η small

In this subsection, we consider the case $\eta \in [0, 1/2]$. Thus, $e_\infty(\eta) = \eta(1 - \eta)$. For a real function $f : \mathbb{R}_+ \to \mathbb{R}_+$, let $o(f)$ denote the set of functions $g : \mathbb{R}_+ \to \mathbb{R}_+$ that satisfy

\[
\lim_{x \to 0} \frac{g(x)}{f(x)} = 0.
\]

Proposition 2 Fix an arbitrary $n \geq 2$. Then

\[
\begin{aligned}
e_n(\eta) &= \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2}), \\
d_n(\eta) &= \eta^{n+2} + o(\eta^{n+2}), \\
r_n(\eta) &= o(\eta^{n}),
\end{aligned}
\tag{14}
\]

for all $\eta \in [0, 1/2]$.

In other words, for small $\eta$ the error bounds decrease quickly with the dimension $n$.

2.2 The case η → ∞

The next result provides a bound on $d_n(\eta)$ that, as shown in the proof, is based on considering the case where $\eta \to \infty$.

Proposition 3 For all $\eta > 0$ and all $n \geq 2$,

\[
d_n(\eta) \leq \frac{1}{4} \tan^2\!\left(\pi/(n + 2)\right). \tag{15}
\]

Figure 3 depicts the bound $\frac{1}{4} \tan^2(\pi/(n + 2))$ as a function of $n$. It may be seen that for small values of $n$ this bound decreases quickly with $n$.

Fig. 3. The function $\frac{1}{4} \tan^2(\pi/(n + 2))$ as a function of $n$.

The next result provides a bound on $r_n$.

Proposition 4 Fix an arbitrary integer $k \geq 1$. Let $s_k \geq 1$ be such that

\[
e_k(\eta) \leq s_k \, \eta(1 - \eta), \quad \text{for all } \eta \in [0, 1/2). \tag{16}
\]

Then for all $n \geq k$,

\[
r_n(\eta) \leq
\begin{cases}
100(1 - 1/s_k), & \eta \in [0, 1/2), \\
100 \sin^2\!\left(\pi/(n + 2)\right), & \eta \geq 1/2.
\end{cases}
\tag{17}
\]

Note that the bounds for both $d_n$ and $r_n$ become tighter as $n$ increases.

To demonstrate the bound in (17) consider the case $k = 1$. Recall that $e_1(\eta)$ is the solution of $-\eta = \frac{-e_1}{1 - e_1}$, i.e., $e_1 = \eta/(1 + \eta)$. Eq. (16) thus becomes $1 \leq s_1 (1 - \eta^2)$. Since $\eta \in [0, 1/2]$, this clearly holds for $s_1 = 4/3$. A similar calculation shows that $s_2 = 4 - 2\sqrt{2} \approx 1.1716$.
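The constants $s_1$ and $s_2$ can also be checked numerically. The sketch below is one possible way to do so and is not the authors' procedure: it normalizes $\lambda_c = 1$, computes $e_k(\eta)$ by bisection on the steady-state rate using the relations $\eta(1 - e_1) = e_1(1 - e_2) = \dots = e_k$, and then takes the maximum of $e_k(\eta)/(\eta(1 - \eta))$ over a grid. The function names, the number of bisection steps, and the grid size are assumptions made for illustration; the grid includes the endpoint $\eta = 1/2$, which gives the supremum over $[0, 1/2)$ by continuity.

```python
import math


def hrfm_last_site(eta, n, iters=100):
    """Approximate e_n(eta) for the HRFM with lambda_c = 1 by bisection on r = e_n.

    For a trial value r, the steady-state relations e_i = r / (1 - e_{i+1})
    are unrolled from site n down to site 1, and r is adjusted until
    eta * (1 - e_1) = r.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        r = 0.5 * (lo + hi)
        e = r                       # e_n = r
        feasible = True
        for _site in range(n - 1):  # compute e_{n-1}, ..., e_1
            if e >= 1.0:
                feasible = False
                break
            e = r / (1.0 - e)
        if feasible and e < 1.0 and eta * (1.0 - e) > r:
            lo = r                  # trial rate too small
        else:
            hi = r                  # trial rate too large (or infeasible)
    return 0.5 * (lo + hi)


def s_bound(k, grid=1000):
    """Grid estimate of the smallest s_k in (16): max of e_k / (eta (1 - eta))."""
    etas = [0.5 * j / grid for j in range(1, grid + 1)]
    return max(hrfm_last_site(h, k) / (h * (1.0 - h)) for h in etas)


if __name__ == "__main__":
    print(s_bound(1), 4.0 / 3.0)
    print(s_bound(2), 4.0 - 2.0 * math.sqrt(2.0))
```

Calling `s_bound(k)` for larger `k` gives a grid-based estimate of $s_k$; because it only samples finitely many values of $\eta$, it should be read as a numerical check rather than a proof of (16).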
For $n = 15$, a numerical calculation shows that $s_{15} \approx 1.00859$, so the terms in (17) become $100(1 - 1/s_{15}) = 0.8517\%$, and $100 \sin^2(\pi/17) = 3.3764\%$. Thus, the relative error between $e_n(\eta)$ and $e_\infty(\eta)$ is lower than 3.3764% for all $n \geq 15$ and all $\eta \geq 0$. This demonstrates that the explicit results for the infinite-dimensional HRFM in (11) and (12) are useful for analyzing the finite-dimensional HRFM as well.

In the next section, we demonstrate how the results derived above can be used for analyzing biological data.

3 A BIOLOGICAL EXAMPLE

Currently, there exist effective experimental approaches for estimating the translation-elongation rate and the translation rate, but there is no effective approach for measuring the initiation rate. Indeed, initiation is a highly complex mechanism and its efficiency is based on numerous biophysical properties of the coding sequence including: the nucleotide context of the START codon (i.e., the first codon that is translated in a gene); the folding of the RNA near the beginning of the Open Reading Frame (ORF); the number of ribosomes and mRNA molecules in the cell; the length and the nucleotide context of the 5'UTR; and more. Thus, although there exist experimental approaches for measuring positions on the mRNA suspected to correspond to initiation sites [15], [22], there are no large scale measurements of the initiation rate.

Ingolia et al. [15] estimated the elongation rate in M. musculus embryonic stem cells in the following way: elongation was halted by applying cycloheximide. Fragments covered by ribosomes were mapped to each transcript and a baseline ribosomal read counts profile corresponding to the ribosomal density was created. In three additional experiments harringtonine was used to stop initiation while allowing ribosomes that had already started translating the mRNA to continue their movement along the mRNA. Cycloheximide was again applied 90/120/150 seconds after applying harringtonine to stop translation.
By measuring the “movement” of the “ribosomal density wave” it was possible to estimate the speed of elongation. Their results show that in mouse embryonic cells 5.6 codons are translated per second (this corresponds to 0.3733 sites per second in the RFM, as the ribosome spans about 15 codons [23]). According to [15], this elongation speed is typical, and it does not vary much between different genes.

A recent study by Schwanhausser et al. [24] estimated the translation rate in M. musculus fibroblasts by simultaneously measuring protein abundance and turnover by parallel metabolic pulse labelling for more than 5,000 genes in mouse. They found that the median translation rate in mouse is about 40 proteins per mRNA per hour (i.e., 2/3 proteins per mRNA per minute or 0.0111 proteins per mRNA per second).

One possible application of our theoretical results is to estimate the initiation rate based on the translation and (constant) transition rates. We now demonstrate this for the case of mouse embryonic cells. As described above, in steady-state the elongation rate is $\lambda_c = 0.3733$ HRFM sites per second. In mouse, the mean mRNA length is about 465 codons, corresponding to $n = 31$ sites in the HRFM. Since we know that the relative error between $e_\infty(\eta)$ and $e_{31}(\eta)$ is small for all $\eta \geq 0$, we can estimate the mean initiation rate in mouse based on (12). We do this as follows. First, since $R \approx 0.0111$ is very different from $\lambda_c/4 \approx 0.0933$, we assume that we are in the regime where

\[
\lambda < \lambda_c/2, \tag{18}
\]

so $R_\infty = \lambda(\lambda_c - \lambda)/\lambda_c$, i.e.,

\[
0.0111 = \lambda(0.3733 - \lambda)/0.3733.
\]

This is a quadratic equation in the initiation rate $\lambda$ with solutions 0.361849 and 0.0114513. The first solution is not feasible, as it does not satisfy (18). Thus, we conclude that the mean initiation rate in mouse is 0.0114513 sites per second, corresponding to 0.17 codons per second.

4 PROOFS

Proof of Proposition 1. Seeking a contradiction, assume that $\tilde{e}_{n+1} \geq e_n$.
(19)

Since $\lambda(1 - e_1) = \lambda_c e_n$ and $\lambda(1 - \tilde{e}_1) = \lambda_c \tilde{e}_{n+1}$, this yields $\tilde{e}_1 \leq e_1$. Since $e_2 = 1 - \frac{\eta(1 - e_1)}{e_1}$ and $\tilde{e}_2 = 1 - \frac{\eta(1 - \tilde{e}_1)}{\tilde{e}_1}$, we get $\tilde{e}_2 \leq e_2$, and proceeding in this fashion yields

\[
\tilde{e}_i \leq e_i, \quad i = 1, \dots, n. \tag{20}
\]

By [16, Proposition 3], $\tilde{e}_{n+1} < \tilde{e}_n$. Combining this with (19) yields $e_n \leq \tilde{e}_{n+1} < \tilde{e}_n$. This contradicts (20), and thus completes the proof.

Proof of Thm. 1. We begin by stating a known result from the theory of continued fractions that will be used in the proof. Let

\[
K_n(a/1) := \cfrac{a}{1 + \cfrac{a}{1 + \cfrac{a}{\ddots + \cfrac{a}{1 + a}}}}, \tag{21}
\]

where $a$ appears in the continued fraction a total of $n$ times. Note that

\[
K_n(a/1) = \frac{a}{1 + K_{n-1}(a/1)}.
\]

If $K_\infty(a/1) := \lim_{n \to \infty} K_n(a/1)$ exists then this yields the quadratic equation

\[
K_\infty(a/1) = \frac{a}{1 + K_\infty(a/1)},
\]

whose solutions are

\[
\frac{-1 \pm 2\sqrt{a + 1/4}}{2}.
\]

The next result (see, e.g., [18, Theorem 3.2], [25, Corollary 4.8]) provides a necessary and sufficient condition for the convergence of (21) as $n \to \infty$.

Theorem 2 Consider the continued fraction $K_n(a/1)$, with $a \in \mathbb{R} \setminus \{0\}$. Let

\[
x := \frac{-1 + 2\sqrt{a + 1/4}}{2}.
\]

Then $K_\infty(a/1)$ exists if and only if

\[
a \geq -1/4, \tag{22}
\]

and then $K_\infty(a/1) = x$.

We can now prove Theorem 1. First note that (9) may be written as

\[
-\eta = K_{n+1}(-e_n/1). \tag{23}
\]

As $n \to \infty$, the continued fraction on the right-hand side of (9) becomes an infinite 1-periodic continued fraction and since it is equal to $-\eta$, this infinite continued fraction always converges. Thus, Theorem 2 implies that

\[
e_\infty(\eta) \leq 1/4, \quad \text{for all } \eta \geq 0. \tag{24}
\]

Suppose that $\eta > 0$ is sufficiently small so that $e_\infty(\eta) \in [0, 1/4]$ (such an $\eta$ exists, as for $\eta = 0$ the equilibrium point is $e(0) = 0$). Then combining (9) with Theorem 2 yields

\[
-\eta = \frac{-1 + 2\sqrt{-e_\infty + 1/4}}{2},
\]

and solving this yields

\[
e_\infty = \eta(1 - \eta). \tag{25}
\]

Since $\eta(1 - \eta) \leq 1/4$ for all $\eta \leq 1/2$, a continuity argument implies that (25) actually holds for all $\eta \leq 1/2$, and in particular, $e_\infty(1/2) = 1/4$.

Now suppose that $\eta \geq 1/2$. It follows from [17, Corollary 1] that for a fixed $n$, $e_n(\eta)$ is an increasing function of $\eta$. Combining this with the result for $\eta \leq 1/2$ above implies that $e_\infty(\eta) \geq 1/4$, and (24) yields $e_\infty(\eta) = 1/4$. This completes the proof of Theorem 1.

Proof of Prop. 2. Fix arbitrary $n > 0$ and $\eta > 0$. Write $e_n = e_n(\eta)$ as a Taylor series in $\eta$, i.e.,

\[
e_n = d_0 \eta^0 + d_1 \eta^1 + d_2 \eta^2 + \dots.
\]

We claim that $d_0 = 0$ and $d_1 = 1$. Indeed, recall that for $\eta = 0$ the equilibrium point is the origin, so in particular $e_n = 0$. This implies that $d_0 = 0$. Also, taking $\eta \to 0$ in (8) yields $d_1 - 1 = 0$, so $d_1 = 1$. We now use induction to prove that

\[
e_n = \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2}). \tag{26}
\]

For $n = 1$, (9) becomes $-\eta = \frac{-e_1}{1 - e_1}$, so

\[
e_1 = \frac{\eta}{1 + \eta} = \eta - \eta^2 + \eta^3 - \eta^4 + \eta^5 - \dots,
\]

and this implies that (26) indeed holds for $n = 1$. Assume that (26) holds for some $n \geq 1$, i.e., the solution of $-\eta = K_{n+1}(-p/1)$ satisfies $p = \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2})$. For the induction step, consider the equation

\[
-\eta = K_{n+2}(-e_{n+1}/1). \tag{27}
\]

Letting $\alpha := 1 - \eta^{-1} e_{n+1}$ yields $-\alpha = K_{n+1}(-e_{n+1}/1)$, so by the induction hypothesis

\[
e_{n+1} = \alpha - \alpha^2 + \alpha^{n+2} + o(\alpha^{n+2}). \tag{28}
\]

Let

\[
e_{n+1} = \sum_{k=1}^{\infty} c_k \eta^k. \tag{29}
\]

We already know that $c_1 = 1$, so $\alpha = 1 - \eta^{-1} e_{n+1} = -\sum_{k=2}^{\infty} c_k \eta^{k-1}$. Substituting this in (28) yields

\[
\eta + \sum_{k=2}^{\infty} c_k \eta^k
= -\sum_{k=2}^{\infty} c_k \eta^{k-1}
- \Big( \sum_{k=2}^{\infty} c_k \eta^{k-1} \Big)^2
+ \Big( -\sum_{k=2}^{\infty} c_k \eta^{k-1} \Big)^{n+2}
+ o\Big( \Big( \sum_{k=2}^{\infty} c_k \eta^{k-1} \Big)^{n+2} \Big). \tag{30}
\]

Equating coefficients yields $c_2 = -1$, $c_3 = 0$, $c_4 = 0$, ..., $c_{n+2} = 0$, and $c_{n+3} = 1$. Substituting these parameters in (29) yields

\[
e_{n+1} = \eta - \eta^2 + \eta^{n+3} + o(\eta^{n+3}).
\]

This completes the proof of (26). Now

\[
\begin{aligned}
d_n &= e_n - e_\infty \\
    &= \eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2}) - \eta(1 - \eta) \\
    &= \eta^{n+2} + o(\eta^{n+2}),
\end{aligned}
\]

and

\[
\begin{aligned}
r_n &= 100 \, d_n / e_n \\
    &= 100 \, \frac{\eta^{n+2} + o(\eta^{n+2})}{\eta - \eta^2 + \eta^{n+2} + o(\eta^{n+2})} \\
    &= o(\eta^{n}).
\end{aligned}
\]

This completes the proof of Prop. 2.

Proof of Prop. 3. First note that combining the definition of $d_n$ in (13) with (10) yields $d_n(\eta) = e_n(\eta) - e_\infty(\eta)$. It was shown in [16] that for every $n \geq 2$,

\[
\lim_{\eta \to \infty} e_n(\eta) = \left( 4 \cos^2\!\left(\pi/(n + 2)\right) \right)^{-1}. \tag{31}
\]

Thus,

\[
\begin{aligned}
\lim_{\eta \to \infty} d_n(\eta) &= \lim_{\eta \to \infty} \left( e_n(\eta) - e_\infty(\eta) \right) \\
&= \frac{1}{4 \cos^2\!\left( \frac{\pi}{n+2} \right)} - \frac{1}{4} \\
&= \frac{1}{4} \tan^2\!\left( \frac{\pi}{n+2} \right).
\end{aligned}
\]

For a fixed $n$, $e_n(\eta)$ is an increasing function of $\eta$ [17, Corollary 1] and this implies that $d_n(\eta) \leq \lim_{\eta \to \infty} d_n(\eta)$ for all $\eta > 0$. This proves (15).

Proof of Prop. 4. We consider two cases.

Case 1. Suppose that $\eta \geq 1/2$. Then $e_\infty(\eta) = 1/4$, so $r_n(\eta) = 100\left(1 - \frac{1}{4 e_n(\eta)}\right)$, and using (31) yields $r_n(\eta) \leq 100 \sin^2(\pi/(n + 2))$.

Case 2. Suppose that $\eta \in [0, 1/2]$. Since $e_n(\eta)$ is a decreasing function of $n$, $e_n(\eta) \leq e_k(\eta) \leq s_k \, \eta(1 - \eta)$ for all $\eta$. This yields

\[
r_n(\eta) = 100\left(1 - \frac{\eta(1 - \eta)}{e_n(\eta)}\right) \leq 100\left(1 - \frac{1}{s_k}\right).
\]

This completes the proof of Prop. 4.

5 DISCUSSION

The RFM is a new deterministic mathematical model for ribosome flow along the mRNA. The order $n$ of the RFM corresponds to the number of sites along the mRNA chain. The RFM encapsulates both the simple exclusion and the total asymmetry properties of the stochastic TASEP model.

We note in passing that the RFM is a monotone dynamical system [14]. Monotone dynamical systems, and their generalization to monotone control systems [26], have recently been shown to provide powerful models in systems biology (see, e.g., [27], [28], [29], [30]).

The RFM has already been successfully used to model and analyze important properties of the process of translation-elongation. For example, in eukaryotes mRNA molecules usually form circular structures leading to ribosomal recycling [31], [32]. To analyze this, Ref. [17] considered the RFM as a control system, with the initiation rate as the input and the translation rate as the output, and closed the loop with a positive linear feedback. As another example, Ref. [33] studied the RFM under the assumption of periodic time-varying initiation rate and/or transition rates, and showed that the state-variables entrain to the periodic excitation.
The motivation for this is a recent biological study demonstrating that the gene expression pattern entrains to a periodically varying abundance of tRNA molecules [34] (see also [35], [36]). More generally, oscillations play an important role in many dynamic cellular processes (see, e.g., [37]) and understanding the mechanisms relating oscillations in protein levels to oscillations at the genetic level is very important.

Under the assumption of equal transition rates along the chain the RFM becomes the HRFM. In this paper, we considered for the first time the HRFM with $n \to \infty$. Our main result provides a simple closed-form expression for the translation rate $R$ in terms of the initiation rate $\lambda$ and transition rate $\lambda_c$. We also derived bounds showing that the behavior of the finite-dimensional HRFM is in good agreement with the asymptotic results already for relatively small values of $n$. Thus, our results may be used to (approximately) analyze the behavior of finite HRFMs that model translation. We demonstrated one such application, namely, using the analytic results to obtain an estimate of the initiation rate in mouse embryonic cells based on recent estimates of the translation and transition rates.

Our approach suggests several possible biological experiments. For example, it is possible to manipulate various biophysical properties that affect the initiation rate, and then measure the new translation rate. Applying (12), one can obtain the modified initiation rate. This may lead to a quantitative value for the effect of various parameters on the initiation rate and, in particular, to an understanding of which factors are more crucial than others. Gene translation is a central intracellular process in all living organisms, and deriving systematic estimates of various parameters of this process, and understanding their relative importance, may contribute to numerous biomedical disciplines.
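The inversion of (12) used in Section 3, and suggested again above for experiments that perturb initiation, can be sketched as follows. This is a minimal illustration under the assumption that the system is in the regime $\lambda < \lambda_c/2$; the function names are hypothetical, and the numerical inputs are the rounded values quoted in Section 3.

```python
import math


def steady_state_rate(lam, lam_c):
    """R_infinity(lam, lam_c) as given in (12)."""
    if lam < lam_c / 2.0:
        return lam * (lam_c - lam) / lam_c
    return lam_c / 4.0


def initiation_rate(rate, lam_c):
    """Invert (12) in the regime lam < lam_c / 2.

    Solves lam**2 - lam_c * lam + rate * lam_c = 0 and returns the smaller
    root, which is the only root below lam_c / 2.
    """
    if not 0.0 < rate < lam_c / 4.0:
        raise ValueError("rate must lie in (0, lam_c / 4) for a unique solution")
    disc = lam_c * lam_c - 4.0 * rate * lam_c
    return 0.5 * (lam_c - math.sqrt(disc))


if __name__ == "__main__":
    lam_c = 0.3733   # HRFM sites per second (rounded value from Section 3)
    rate = 0.0111    # proteins per mRNA per second (rounded value from Section 3)
    lam = initiation_rate(rate, lam_c)
    print(lam, 15.0 * lam)   # sites per second, codons per second
```

The guard on `rate` reflects a limitation of the approach: once the measured rate reaches $\lambda_c/4$, (12) is flat in $\lambda$, so any $\lambda \geq \lambda_c/2$ is consistent with the data and the initiation rate cannot be recovered from $R_\infty$ alone.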
As a topic for further research, we note that TASEP and its variants have been used to model and study not only translation, but many other biological and artificial systems as well. Examples include ad hoc communication networks, biomolecular motors, and vehicular traffic [38], [10], [39]. It may be of interest to use the deterministic, and in some ways simpler, RFM (or HRFM) to model and analyze these systems.

REFERENCES

[1] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter, Molecular Biology of the Cell, New York, 2002.
[2] S. Zhang, E. Goldman, and G. Zubay, “Clustering of low usage codons and ribosome movement,” J. Theor. Biol., vol. 170, pp. 339–54, 1994.
[3] A. Dana and T. Tuller, “Efficient manipulations of synonymous mutations for controlling translation rate: an analytical approach,” J. Comput. Biol., vol. 19, pp. 200–231, 2012.
[4] R. Heinrich and T. Rapoport, “Mathematical modelling of translation of mRNA in eucaryotes; steady state, time-dependent processes and application to reticulocytes,” J. Theor. Biol., vol. 86, pp. 279–313, 1980.
[5] C. T. MacDonald, J. H. Gibbs, and A. C. Pipkin, “Kinetics of biopolymerization on nucleic acid templates,” Biopolymers, vol. 6, pp. 1–25, 1968.
[6] T. Tuller, I. Veksler, N. Gazit, M. Kupiec, E. Ruppin, and M. Ziv, “Composite effects of the coding sequences determinants on the speed and density of ribosomes,” Genome Biol., vol. 12, no. 11, p. R110, 2011.
[7] T. Tuller, M. Kupiec, and E. Ruppin, “Determinants of protein abundance and translation efficiency in S. cerevisiae,” PLOS Computational Biology, vol. 3, pp. 2510–2519, 2007.
[8] L. B. Shaw, R. K. Zia, and K. H. Lee, “Totally asymmetric exclusion process with extended objects: a model for protein synthesis,” Phys. Rev. E Stat. Nonlin. Soft. Matter Phys., vol. 68, p. 021910, 2003.
[9] R. Zia, J. Dong, and B. Schmittmann, “Modeling translation in protein synthesis with TASEP: A tutorial and recent developments,” J. Statistical Physics, vol. 144, pp. 405–428, 2011.
[10] A. Schadschneider, D. Chowdhury, and K. Nishinari, Stochastic Transport in Complex Systems: From Molecules to Vehicles. Elsevier, 2011.
[11] S. Reuveni, I. Meilijson, M. Kupiec, E. Ruppin, and T. Tuller, “Genome-scale analysis of translation elongation with a ribosome flow model,” PLOS Computational Biology, vol. 7, p. e1002127, 2011.
[12] J. B. Plotkin and G. Kudla, “Synonymous but not the same: the causes and consequences of codon bias,” Nat. Rev. Genet., vol. 12, pp. 32–42, 2010.
[13] M. Kozak, “Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes,” Cell, vol. 44, no. 2, pp. 283–92, 1986.
[14] M. Margaliot and T. Tuller, “Stability analysis of the ribosome flow model,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 9, pp. 1545–1552, 2012.
[15] N. T. Ingolia, L. Lareau, and J. Weissman, “Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes,” Cell, vol. 147, no. 4, pp. 789–802, 2011.
[16] M. Margaliot and T. Tuller, “On the steady-state distribution in the homogeneous ribosome flow model,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 9, pp. 1724–1736, 2012.
[17] M. Margaliot and T. Tuller, “Ribosome flow model with positive feedback,” J. Royal Society Interface, vol. 10, p. 20130267, 2013.
[18] W. B. Jones and W. J. Thron, Continued Fractions: Analytic Theory and Applications. Reading, MA: Addison-Wesley, 1980.
[19] G. Kudla, A. W. Murray, D. Tollervey, and J. B. Plotkin, “Coding-sequence determinants of gene expression in Escherichia coli,” Science, vol. 324, pp. 255–258, 2009.
[20] F. Supek and T. Smuc, “On relevance of codon usage to expression of synthetic and natural genes in Escherichia coli,” Genetics, vol. 185, pp. 1129–1134, 2010.
[21] T. Tuller, Y. Y. Waldman, M. Kupiec, and E. Ruppin, “Translation efficiency is determined by both codon bias and folding energy,” Proceedings of the National Academy of Sciences, vol. 107, no. 8, pp. 3645–50, 2010.
[22] S. Lee, B. Liu, S. Lee, S. Huang, B. Shen, and S. Qian, “Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution,” Proc. Natl. Acad. Sci. U.S.A., vol. 109, no. 37, pp. E2424–32, 2012.
[23] N. T. Ingolia, S. Ghaemmaghami, J. R. Newman, and J. S. Weissman, “Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling,” Science, vol. 324, no. 5924, pp. 218–23, 2009.
[24] B. Schwanhausser, D. Busse, N. Li, G. Dittmar, J. Schuchhardt, J. Wolf, W. Chen, and M. Selbach, “Global quantification of mammalian gene expression control,” Nature, vol. 473, no. 7347, pp. 337–42, 2011.
[25] L. Lorentzen and H. Waadeland, Continued Fractions: Convergence Theory, 2nd ed. Paris: Atlantis Press, 2008, vol. 1.
[26] D. Angeli and E. D. Sontag, “Monotone control systems,” IEEE Trans. Automat. Control, vol. 48, pp. 1684–1698, 2003.
[27] P. D. Leenheer, D. Angeli, and E. D. Sontag, “Monotone chemical reaction networks,” J. Mathematical Chemistry, vol. 41, pp. 295–314, 2007.
[28] G. Enciso and E. D. Sontag, “Monotone systems under positive feedback: multistability and a reduction theorem,” Systems Control Lett., vol. 54, pp. 159–168, 2005.
[29] D. Angeli and E. D. Sontag, “Oscillations in I/O monotone systems under negative feedback,” IEEE Trans. Automat. Control, vol. 53, pp. 166–176, 2008.
[30] L. Wang, P. de Leenheer, and E. D. Sontag, “Conditions for global stability of monotone tridiagonal systems with negative feedback,” Systems Control Lett., vol. 59, pp. 130–138, 2010.
[31] A. W. Craig, A. Haghighat, A. T. Yu, and N. Sonenberg, “Interaction of polyadenylate-binding protein with the eIF4G homologue PAIP enhances translation,” Nature, vol. 392, no. 6675, pp. 520–3, 1998.
[32] S. Z. Tarun and A. B. Sachs, “Binding of eukaryotic translation initiation factor 4E (eIF4E) to eIF4G represses translation of uncapped mRNA,” Mol. Cell. Biol., vol. 17, pp. 6876–6886, 1997.
[33] M. Margaliot, E. D. Sontag, and T. Tuller, “Entrainment to periodic initiation and transition rates in the ribosome flow model,” 2013, submitted.
[34] M. Frenkel-Morgenstern, T. Danon, T. Christian, T. Igarashi, L. Cohen, Y. M. Hou, and L. J. Jensen, “Genes adopt non-optimal codon usage to generate cell cycle-dependent oscillations in protein levels,” Mol. Syst. Biol., vol. 8, p. 572, 2012.
[35] Y. Xu, P. Ma, P. Shah, A. Rokas, Y. Liu, and C. H. Johnson, “Non-optimal codon usage is a mechanism to achieve circadian clock conditionality,” Nature, doi:10.1038/nature11942, 2013.
[36] M. Zhou, J. Guo, J. Cha, M. Chae, S. Chen, J. M. Barral, M. S. Sachs, and Y. Liu, “Non-optimal codon usage affects expression, structure and function of clock protein FRQ,” Nature, doi:10.1038/nature11833, 2013.
[37] K. Kruse and F. Julicher, “Oscillations in cell biology,” Curr. Opin. Cell Biol., vol. 17, no. 1, pp. 20–6, 2005.
[38] S. Srinivasa and M. Haenggi, “A statistical mechanics-based framework to analyze ad hoc networks with random access,” IEEE Trans. Mobile Computing, vol. 11, pp. 618–630, 2012.
[39] D. Chowdhury, A. Schadschneider, and K. Nishinari, “Physics of transport and traffic phenomena in biology: from molecular motors and cells to organisms,” Physics of Life Reviews, pp. 318–352, 2005.