UNBIASED DIE ROLLING WITH A BIASED DIE

by P. Camion*

Department of Statistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 920
April, 1974

*On leave from the Centre National de la Recherche Scientifique.

1. Introduction

We will here use some algebraic methods taught by M. P. Schützenberger at Toulouse University in 1965; some of these were introduced in [6].

Let $X^*$ be the monoid freely generated by a finite alphabet $X$. A mapping $p : X \to \mathbb{R}_+$ for which $\sum_{x \in X} p(x) = 1$ is defined and may be interpreted as a probability distribution. A word in $X^*$ may be viewed as a finite sequence of trials corresponding to the tossing of a die that finite number of times. $p$ is extended to the set of words of length $n$ by

$p(u) = \prod_{j \in [1,n]} p(a_{i_j})$, where $u = a_{i_1} \cdots a_{i_n}$.

The first aim is to build up a set $E \subset X^*$ which is partitioned into $E = E^{(1)} + \cdots + E^{(k)}$, for $k$ the cardinality of $X$, with

(1) $\sum_{u \in E_n^{(i)}} p(u) = \sum_{u \in E_n^{(j)}} p(u)$, $\forall i, j \in [1,k]$, $\forall n \in \mathbb{N}$,

where $F_n = F \cap X^n$ for any $F \subset X^*$. We must have, for whatever $p$,

(2) $\lim_{n \to \infty} \sum_{u \in X^n \setminus EX^*} p(u) = 0$.

Such an $E$ has been built up by von Neumann for $X = \{0,1\}$ [8]. That particular construction is extended here. The generating series of the probabilities under consideration in (2) is given, for any $X$ and any $p$. (2) is proved and the generating series for $\mathrm{Card}(X^n \setminus EX^*)$ is found explicitly. A computable formula for the mean delay is given as well.

The advantage of this procedure is that only a small memory capacity and a few computations are required. Actually the number of comparisons required is less than the value of the mean delay. But the efficiency is very poor. In Section 4 we give a procedure with high efficiency and a reasonable amount of computation.

2.
Von Neumann sequences in the set $X = \{0,1\}$

A classical solution for $X = \{0,1\}$ is to take $E$ to be the set of words with even length having a right factor in $\{01, 10\}$ and without a proper left factor with the same property (i.e. the property of having even length and having a right factor in $\{01, 10\}$). Here we write

(3) $E = E^{(1)} + E^{(2)}$,

where $E^{(1)}$ is the set of words in $E$ ending with 0, and $E^{(2)}$ is the set of words in $E$ ending with 1. Clearly, the mapping $\alpha : E^{(1)} \to E^{(2)}$ defined by $\alpha(u10) = u01$ is one to one, and since $p(u10) = p(u01)$ for every $p$, (1) is verified. $E$ is the set of von Neumann sequences. Clearly (2) is satisfied since

(4) $\sum_{u \in X^{2n} \setminus EX^*} p(u) = (1 - 2p(0)p(1))^n$.

Moreover (4) may be easily computed for all $n$, whatever $p$ be. However the set $E$ here described is not the best possible, as proved by W. Hoeffding and G. Simons [5], who define several other sets with better values of (4).

3. An extension of von Neumann sequences in the case where $X$ has more than two symbols

3.1 Construction of the set $E$ of sequences producing the output symbols

Let $\mathrm{Card}\, X = k$. We define in $X^*$ a prefix code $C$, that is a subset $C$ of $X^*$ for which

(5) $C \cap CXX^* = \emptyset$.

We write the partition

(6) $C = \sum_{1 \le i \le k} C_i$,

where $C_i$ is the set of words of length $i$ in $C$. $C_1 = \emptyset$. $C_2$ is the set of words $u = aa$, $a \in X$, that is the words with two equal symbols. Now, recursively, $C_i$ is the set of all words of length $i$ having at least two equal symbols and no left factor in any $C_j$, $j < i$. (Thus, a word of $C_i$ has exactly two equal symbols; all others are distinct.) Then, by definition, the requirement (5) for a prefix code is met.

Now let $R$ be the ring $\mathbb{Z}\langle X \rangle$, that is the ring over the rational integers of the monoid $X^*$. To every subset $F$ of $X^*$ corresponds a polynomial $\sum_{u \in F} u \in \mathbb{Z}\langle X \rangle$. For notational simplicity, we just write $F$ for such a polynomial. We now consider $R[t]$ and as well the ring of formal series $R[[t]]$. $1 - \sum_{1 \le i \le k} C_i t^i$ belongs to $R[t]$. Its inverse is

$A(t) = \sum_{i \ge 0} A_i t^i$, with $A_0 = 1$,

and the monomials that we find in the $A_i$ are all possible products of words in $C$.
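The prefix code $C$ and the products of its words can be explored numerically. The following is a minimal sketch, not part of the original construction: the function names (`prefix_code_C`, `is_prefix_code`, `count_products`, `factorizable`) are hypothetical, letters are represented by the integers `0..k-1`, and the counting recurrence assumes that a product of words of $C$ factorizes in only one way.

```python
from itertools import product

def prefix_code_C(k):
    """Words over range(k), of length 2..k, whose last letter repeats an
    earlier letter while all letters before the last one are distinct."""
    code = []
    for length in range(2, k + 1):
        for w in product(range(k), repeat=length):
            head, last = w[:-1], w[-1]
            if len(set(head)) == len(head) and last in head:
                code.append(w)
    return code

def is_prefix_code(code):
    """Requirement (5): no word of the code is a proper left factor of another."""
    words = set(code)
    return not any(w[:i] in words for w in words for i in range(1, len(w)))

def count_products(code, n):
    """a[m] = number of words of length m that are products of code words,
    from a[m] = sum_i c_i * a[m - i] (assumes unique factorization)."""
    c = {}
    for w in code:
        c[len(w)] = c.get(len(w), 0) + 1
    a = [1] + [0] * n
    for m in range(1, n + 1):
        a[m] = sum(ci * a[m - i] for i, ci in c.items() if i <= m)
    return a

def factorizable(word, words):
    """True if `word` is a product of words taken from the set `words`."""
    if not word:
        return True
    return any(word[:i] in words and factorizable(word[i:], words)
               for i in range(1, len(word) + 1))

C3 = prefix_code_C(3)
print(len(C3), is_prefix_code(C3), count_products(C3, 6))
```

Comparing `count_products` with a direct count based on `factorizable` over all words of a given length is one way to check, for small $k$, that every monomial of the inverse series appears with coefficient one.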
Those monomials all have coefficient one, since every word factorizable into words of $C$ has a unique factorization; this is, as is well known, a consequence of (5).

Now, the mapping $\phi : X \to \mathbb{Z}$ defined by $\phi(x) = 1$, $\forall x \in X$, extends into a morphism of $\mathbb{Z}\langle X \rangle$ into $\mathbb{Z}$ and further into a morphism of $R[[t]]$ into $\mathbb{Z}[[t]]$. Then

$\sum_{i \ge 0} a_i t^i = \bigl(1 - \sum_{2 \le i \le k} c_i t^i\bigr)^{-1}$,

where $c_i$ denotes $\phi C_i = \mathrm{Card}\, C_i$ and $a_i$ denotes $\phi A_i$. Thus, by our previous remark, $a_i$ is the number of words of length $i$ in $C^* = \bigcup_{i \ge 0} C^i$.

We will also consider the natural extension of $p : X \to \mathbb{R}_+$ into a morphism $\phi_p$ of $\mathbb{Z}\langle X \rangle$ into $\mathbb{R}$ and of $R[[t]]$ into $\mathbb{R}[[t]]$. Denote $\phi_p A_i$ by $a_{ip}$ and $\phi_p C_i$ by $c_{ip}$. One has, for example,

(8) $a_p(t) = \sum_{i \ge 0} a_{ip} t^i = \bigl(1 - \sum_{2 \le i \le k} c_{ip} t^i\bigr)^{-1}$,

and since $a_{ip} = \phi_p A_i = \sum_{u \in A_i} p(u)$, the formal series (8) gives, for every $i$, the probability for a word of length $i$ to be in $C^*$.

Now let $P$ be the set of the $k!$ words of $X^k$ in which the $k$ letters are distinct. Let $C^*$ be the set of all sequences of words of $C$, and denote by

(9) $E = C^*P$, so that $E(t) = A(t) P t^k$.

$E = \sum_{x \in X} E^{(x)}$ is a partition of $E$ into $k$ subsets, where $E^{(x)}$ is the subset of words of $E$ ending with the letter $x$. For any probability distribution $p$, $\sum_{x \in X} p(x) = 1$, we see that if $E_n^{(x)}$ denotes the subset of words in $E^{(x)}$ with length $n$, then $\sum_{u \in E_n^{(x)}} p(u)$ does not depend on $x$.

3.2 The generating series

Property 1: One has

(10) $c_1 = 0$, $c_i = k(k-1) \cdots (k-i+2)(i-1)$, $2 \le i \le k$.

This is a straightforward consequence of the definition of $C$. We observe that $c_k = k(k-1) \cdots 2\,(k-1) = (k-1)\,k!$.

Following M. P. Schützenberger, we say that a finite prefix code $F$ is complete when, if $n$ is the largest length of a word in $F$, every word in $X^n$ has a left factor in $F$. Then complete prefix codes satisfy the polynomial equality

(11) $\sum_{1 \le i \le n} F_i X^{n-i} = X^n$.

Property 2: $C \cup P$ is a complete prefix code.

Proof: $C \cup P$ verifies $(C \cup P) \cap (C \cup P)XX^* = \emptyset$. If a word of $X^k$ has no two equal symbols, it belongs to $P$.
If a word of $X^k$ has two equal symbols, it has a left factor in $C$.

We write in place of (11)

(12) $\sum_{2 \le i \le k} C_i X^{k-i} + P = X^k$.

Property 3: For every probability distribution $p$, all of the roots of the polynomial

$f_p(t) = 1 - \sum_{2 \le i \le k} c_{ip} t^i$

have a modulus larger than one.

The reciprocal polynomial $f_p^*(t) = t^k - \sum_{2 \le i \le k} c_{ip} t^{k-i}$ of $f_p(t)$ is a Frobenius polynomial, i.e. its companion matrix is non-negative. Thus its largest positive root has the largest absolute value among all its roots. From (12), by using the morphism $\phi_p$, one obtains

(13) $\sum_{2 \le i \le k} c_{ip} = 1 - k! \prod_{x \in X} p(x)$,

which proves that $f_p^*(t)$ has a real root $\lambda$ in $]0,1[$, since it takes a negative value for $t = 0$ and, by (13), the positive value $k! \prod_{x \in X} p(x)$ for $t = 1$. On the other hand, it does not have any other positive root since its quotient by $t - \lambda$ is a polynomial with non-negative coefficients.

In (9) we have defined $E = C^*P$. Let $D$ be the set $X^* \setminus EX^*$, that is the set of words with no left factor in $E$. We denote by $d_{np}$ the probability for a word of length $n$ to be in $D_n$.

THEOREM 1: We have

(14) $d_p(t) = \sum_{n \in \mathbb{N}} d_{np} t^n = \dfrac{1 - \sum_{2 \le i \le k} c_{ip} t^i - k! \prod_{x \in X} p(x)\, t^k}{\bigl(1 - \sum_{2 \le i \le k} c_{ip} t^i\bigr)(1 - t)}$

and $\sum_{n \in \mathbb{N}} d_{np}$ converges.

Let $D(t)$ be the formal series corresponding to the set $D$. One has

(15) $(1 - E(t))^{-1} D(t) = \sum_{i \in \mathbb{N}} X^i t^i$.

Applying the morphism $\phi_p$, we get, with $a_p(t)$ defined in (8),

(16) $\bigl(1 - k! \prod_{x \in X} p(x)\, t^k a_p(t)\bigr)^{-1} d_p(t) = (1 - t)^{-1}$;

then by (8), we obtain (14). Then by (13), the numerator of (14) has 1 as a root, so that the denominator of (14) reduces to $f_p(t) = 1 - \sum_{2 \le i \le k} c_{ip} t^i$. Hence Property 3 completes the proof.

Remark: For $p(x) = p(y)$, $\forall x, y \in X$, we then have

$d_p(t) = \sum_{n \in \mathbb{N}} \mathrm{Card}\, D_n\, t^n / k^n$,

where $\mathrm{Card}\, D_n$ is the cardinal of $D_n$, and, $c_i$ being given by (10), one has

(18) $\sum_{n \in \mathbb{N}} \mathrm{Card}\, D_n\, t^n = \dfrac{1 - \sum_{i < k} k(k-1) \cdots (k-i+2)(i-1)\, t^i - k\, k!\, t^k}{\bigl(1 - \sum_{i \le k} k(k-1) \cdots (k-i+2)(i-1)\, t^i\bigr)(1 - kt)}$.

For later use we also write $\ell_{n,p} = \phi_p E_n$.
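Formula (18) can be compared with a direct enumeration for small alphabets. The sketch below is only an illustration under stated assumptions: letters are the integers `0..k-1`, the helper names (`has_left_factor_in_E`, `card_D_bruteforce`, `c`, `card_D_series`) are hypothetical, and the series is expanded by ordinary power series division with integer coefficients.

```python
from itertools import product
from math import factorial

def has_left_factor_in_E(word, k):
    """Scan `word` from the left. A block ending at a repeated letter is a word
    of C and is dropped; a block of k distinct letters is a word of P, so the
    letters read so far form a word of E = C*P."""
    block = set()
    for x in word:
        if x in block:
            block = set()            # a word of C just ended
        else:
            block.add(x)
            if len(block) == k:      # a word of P just ended
                return True
    return False

def card_D_bruteforce(k, n_max):
    """Card D_n for n = 0..n_max by listing all words of length n."""
    return [sum(not has_left_factor_in_E(w, k)
                for w in product(range(k), repeat=n))
            for n in range(n_max + 1)]

def c(k, i):
    """c_i = k(k-1)...(k-i+2)(i-1), as in (10); gives 0 for i = 1."""
    out = i - 1
    for j in range(i - 1):
        out *= k - j
    return out

def card_D_series(k, n_max):
    """Coefficients of the rational fraction (18), by power series division."""
    f = [1] + [-c(k, i) for i in range(1, k + 1)]     # 1 - sum c_i t^i
    num = f[:]
    num[k] -= factorial(k)                            # numerator of (18)
    den = [0] * (k + 2)
    for i, fi in enumerate(f):                        # f(t) * (1 - k t)
        den[i] += fi
        den[i + 1] -= k * fi
    s = []
    for n in range(n_max + 1):
        acc = num[n] if n < len(num) else 0
        acc -= sum(den[j] * s[n - j]
                   for j in range(1, min(n, len(den) - 1) + 1))
        s.append(acc)
    return s

print(card_D_series(3, 6), card_D_bruteforce(3, 6))
```

For $k = 2$ the same functions reproduce the von Neumann case, where the words without a left factor in $E$ are built from the pairs 00 and 11.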
Let $E(t) = \sum_{n \in \mathbb{N}} E_n t^n$ and $\ell_p(t) = \sum_{n \in \mathbb{N}} \ell_{n,p} t^n$, where $\ell_{n,p} = \phi_p E_n$. Since

$\sum_{0 \le n \le m} E_n X^{m-n} = X^m - D_m$,

applying $\phi_p$ gives $\sum_{0 \le n \le m} \ell_{n,p} = 1 - d_{mp}$, and by Theorem 1 we have $\sum_{n \in \mathbb{N}} \ell_{n,p} = 1$.

$\ell_{n,p}$ may be interpreted as the probability that the sequence of symbols obtained at time $n$ be a word in $E$. If $|v|$ denotes the number of symbols in the word $v$, $\ell_{|v|,p}$ is the distribution of the $|v|$ for $v \in E$. By the same argument as the one used by M. P. Schützenberger in [6], III.7, page 1209, it can be proved easily that this distribution is dominated by an exponential distribution and then has moments of every order. This argument relies on the fact that there exists a word $u \in X^*$ such that $X^*u \subset EX^*$. This actually occurs, since, if $v$ is any word in $P$, $u = v^2$ is easily seen to have this property. However, we give a proof which is valid for any prefix code $C$, $C \cup P$ being a complete prefix code. (*)

THEOREM 2: The distribution $\ell_{|v|,p}$ of the $|v|$ for $v \in E$ has moments of every order.

(*) Gordon Simons shows that such a word $u$ always exists in the situation under consideration. Here is his statement and proof.

THEOREM: Let $P$ be a complete finite prefix code for a finite alphabet $X$ and let $A$ be a nonempty subset of $P$. There exists a word $u \in X^*$ for which $X^*uX^* \subset P^*AX^*$.

Proof: Since $P$ is complete, there exists a finite set $S = \{s_1, \ldots, s_n\} \subset X^*$ for which $X^* = P^*S$ and $P^* \cap S = \{e\}$, where $e$ is the empty word. There exists a $u_1 \in X^*$ for which $P^*s_1u_1 \subset P^*AX^*$, and, in turn, a $u_2 \in X^*$ for which $P^*s_2u_1u_2 \subset P^*AX^*$. Proceeding recursively, one can obtain $u_1, u_2, \ldots, u_n \in X^*$ for which $P^*s_ju_1 \cdots u_j \subset P^*AX^*$ for each $j$, $j = 1, \ldots, n$. Let $u = u_1u_2 \cdots u_n$. Then $X^*uX^* = P^*SuX^* \subset P^*AX^*$.

The denominator of the rational fraction giving $\ell_p(t)$ verifies Property 3, as well as its powers which will appear in the derivatives. Then all series under consideration converge.

Example: $k = 3$. Then, knowing the first three $\mathrm{Card}\, D_i$
, $1 \le i \le 3$, we may compute the following ones by the recurrence relation

(20) $\mathrm{Card}\, D_i = 3\,\mathrm{Card}\, D_{i-2} + 12\,\mathrm{Card}\, D_{i-3}$.

We finally obtain the array given at the end of this example.

Since $1 - 3t^2 - 12t^3$, the first factor of the denominator of (18), has a real root $a^{-1} = 0.367392$ and two complex conjugate roots $b^{-1}$ and $c^{-1}$, we know from (18) that the first member of (18) has the form

(21) $x_1(1 - at)^{-1} + x_2(1 - bt)^{-1} + x_3(1 - ct)^{-1}$,

and since $|b^{-1}| \ge a^{-1}$, $|c^{-1}| \ge a^{-1}$, one has

$\mathrm{Card}\, D_i \le (x_1 + 2|x_2|)\, a^i$.

We have a Vandermonde system of linear equations:

$x_1 + x_2 + x_3 = 1$
$a x_1 + b x_2 + c x_3 = 3$
$a^2 x_1 + b^2 x_2 + c^2 x_3 = 9$

from which

$x_3 = \dfrac{(a-3)(b-3)}{(a-c)(b-c)}$,

from which, after computing, $a = 2.721892$, $a^{-1} = 0.367392$, $x_1 = 1.189301$, $|x_2| = |x_3| = 0.05703$, $x_1 + 2|x_2| = 1.3043071$.

We also compute $(1 - k!/k^k)^{i/k}$, $i/k \in \mathbb{N}$, which is the ratio $(\mathrm{Card}\, D_i)/k^i$ when $D = X^* \setminus EX^*$, $E = \bigcup_{n \in \mathbb{N}} (X^k \setminus P)^n P$, which gives a natural extension of von Neumann sequences but is inferior to the one described here.

Array for $k = 3$ (entries marked ? are not legible in the source):

  i         3^i    Card D_i   Card D_i/3^i   (x_1+2|x_2|)a^i/3^i   (1-3!/3^3)^(i/3)
  1           3           3      1               1.1834
  2           9           9      1               1.0737
  3          27          21      0.777           0.9741                0.777
  4          81          63      0.777           0.8838
  5         243         171      0.703           0.8020
  6         729         441      0.605           0.7276                0.605
  7        2187        1269      0.580           0.6601
  8        6561        3375      0.514           0.5990
  9       19683        9099      0.462           0.5434                0.470
 10       59049       25353      0.429           0.4930
 11      177147       67797      0.382           0.4473
 12      531441      185247      0.348           0.4058                0.366
 13     1594323      507327      0.318           0.3682
 14     4782969     1368405      0.286           0.3340
 15    14348907     3742245      0.260           0.3031                0.285
 16    43046721    10185039      0.236           0.2750
 17   129139163    27647595      0.214           0.2495
 18           ?           ?      0.194           0.2264                0.221
 19           ?           ?      0.176           0.2054
 20           ?           ?      0.160           0.1863
 21           ?           ?      0.145           0.1690                0.172
 22           ?           ?      0.132           0.1534
 23           ?           ?      0.120           0.1392
 24           ?           ?      0.108           0.1262                0.134
 25           ?           ?      0.098           0.1146
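The entries of the array can be regenerated from the recurrence (20). The sketch below is an illustration only: the function names are hypothetical, the starting values 3, 9, 21 are taken from the first three rows of the array, and the root $a$ is located by plain bisection on $t^3 - 3t - 12$, the reciprocal polynomial of $1 - 3t^2 - 12t^3$. For $i \le 12$ the recurrence reproduces the printed values of $\mathrm{Card}\, D_i$; the printed entries beyond that could not be confirmed this way and may be worth recomputing rather than copying.

```python
def card_D(i_max):
    """Card D_i for k = 3 from recurrence (20), started with 3, 9, 21."""
    d = {1: 3, 2: 9, 3: 21}
    for i in range(4, i_max + 1):
        d[i] = 3 * d[i - 2] + 12 * d[i - 3]
    return d

def dominant_root(lo=2.0, hi=3.0, steps=60):
    """Bisection for the real root a of t^3 - 3t - 12 on [lo, hi]."""
    g = lambda t: t**3 - 3 * t - 12
    for _ in range(steps):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

d = card_D(25)
a = dominant_root()
print("a =", a, " 1/a =", 1 / a)
for i in (3, 6, 9, 12):
    # ratio Card D_i / 3^i next to the ratio of the block-wise extension
    print(i, d[i], round(d[i] / 3**i, 3), round((1 - 6 / 27) ** (i / 3), 3))
```

The two printed ratios illustrate the comparison made in the text: the construction of 3.1 leaves fewer undecided words of a given length than the extension that reads the input in blocks of $k$ symbols.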
3.3 Computing the mean delay

By the construction of (9), the probability of producing an output symbol at the arrival of the $i$-th input symbol is the coefficient of degree $i$ of the series

$g_p(t) = k!\, \pi\, t^k f_p^{-1}(t)$, where $\pi = \prod_{x \in X} p(x)$, $f_p(t) = 1 - \sum_{2 \le i \le k} c_{ip} t^i$.

This coefficient is $\ell_{i,p}$ and the mean delay is

(24) $g_p'(1) = k!\, \pi f_p^{-1}(1)\bigl(k - f_p'(1) f_p^{-1}(1)\bigr) = k - \dfrac{f_p'(1)}{k!\, \pi}$,

since $f_p(1) = k!\, \pi$, by (13).

[Figure: the three quantities $(x_1 + 2|x_2|)a^i/3^i$, $\mathrm{Card}\, D_i/3^i$ and $(1 - 3!/3^3)^{i/3}$ plotted against $i$, $1 \le i \le 20$, on a vertical scale from 0.1 to 1.0. Using Th. 3, one may approximate $\mathrm{Card}\, D_n$ by $(3 - a)a^n$.]

THEOREM 3: The mean delay is $\sum_{n \in \mathbb{N}} d_{np}$.

$\sum_{n \in \mathbb{N}} d_{np}$ converges since the denominator of the second member of (14) is actually a polynomial whose roots are outside the unit circle. If we write $f_p(t) - k!\, \pi\, t^k = (1 - t)h(t)$, we have by derivation $h(1) = k\, k!\, \pi - f_p'(1)$. Thus by (14), (13) and (24)

$\sum_{n \in \mathbb{N}} d_{np} = h(1)/f_p(1) = h(1)/k!\, \pi = g_p'(1)$.

4. Toward an efficient easily computable procedure

4.1 The efficient construction

P. Elias [2] published a procedure (independently obtained, he says, by J. A. Lechner and J. Gill) which is proved to approach the best possible efficiency (= the expected number of output digits per input digit), which is

$H_k(p) = -\sum_{x \in X} p(x)\, \mathrm{Log}_k\, p(x)$,

where $k$ is the cardinality of $X$.

We describe the procedure in the binary case. The sequence of input is factorized into words of length $n$. Now a mapping $\theta : X^n \to X^*$ is defined and the images of the factors of the input sequence are concatenated. The $\binom{n}{i}$ words of $X^n$ with $i$ symbols 1 and $n - i$ symbols 0 form a set in which any two words are equiprobable. If the binary writing of $\binom{n}{i}$ is $2^{j_1} + 2^{j_2} + \cdots + 2^{j_s}$, then $\mathrm{Card}(X^{j_1} \cup X^{j_2} \cup \cdots \cup X^{j_s}) = \binom{n}{i}$, and the one-to-one image under $\theta$ of those words is $X^{j_1} \cup X^{j_2} \cup \cdots \cup X^{j_s}$.
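The binary procedure just described can be sketched as follows. This is an illustrative sketch under assumptions of our own, not the mapping as specified by Elias: the names `elias_block` and `efficiency` are hypothetical, a word is ranked inside its class by the combinatorial number system, and the largest power of two in the binary writing of $\binom{n}{i}$ is served first.

```python
from math import comb

def elias_block(word):
    """Map one binary block (a tuple of 0/1) to a tuple of output bits."""
    n = len(word)
    ones = [pos for pos, bit in enumerate(word) if bit == 1]
    size = comb(n, len(ones))
    # rank of the word among the words with the same number of ones
    rank = sum(comb(pos, j + 1) for j, pos in enumerate(ones))
    # walk through the powers of two of `size`, largest first
    for j in range(size.bit_length() - 1, -1, -1):
        if (size >> j) & 1:
            if rank < (1 << j):
                # write `rank` with exactly j binary digits
                return tuple((rank >> b) & 1 for b in range(j - 1, -1, -1))
            rank -= 1 << j
    raise AssertionError("rank out of range")

def efficiency(n, p):
    """Expected number of output bits per input bit when P(1) = p."""
    total = 0.0
    for i in range(n + 1):
        size = comb(n, i)
        # total output length over the class: sum of j * 2^j over set bits
        bits = sum(j << j for j in range(size.bit_length()) if (size >> j) & 1)
        total += bits * p**i * (1 - p) ** (n - i)
    return total / n

print(elias_block((1, 0, 0, 1)))
print([efficiency(n, 0.5) for n in (2, 4, 8)])
```

Inside one class every output word of a given length is produced by exactly one input word, which is what makes the concatenated output unbiased; `efficiency` only needs the class sizes, not the mapping itself.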
If $j_s = 0$, $X^{j_s} = \{e\}$, where $e$ is the empty word, and any word of this set may be mapped on the empty word. We compute the best possible efficiency $\rho$ (the value of the efficiency for $p(0) = p(1) = 1/2$) for $n = 2, 4, 8$. We have respectively $\rho = 25\%$; $40.62\%$; $55.18\%$. It seems that for high efficiency $n$ has to be large and the decoding by the mapping $\theta$ will need some computations.

4.2 Decoding with permutation groups

We first remind the reader that if $k$ is a prime and $n = k^t$, a power of $k$, then $\binom{n}{i} \equiv 0$ modulo $k$ for $0 < i < n$. Also, for $i_1 + \cdots + i_k = n$,

(1) $\dfrac{n!}{i_1! \cdots i_k!} \equiv 0 \pmod{k}$

if at least one of the $i_j$ is not $0$ or $n$. So the set of words of length $n$ with $i_t$ repetitions of the $t$-th symbol of $X$, $t \in [1,k]$, may be partitioned into subsets $S_1, \ldots, S_s$ with respective cardinalities $k^{j_1}, \ldots, k^{j_s}$, with $j_1 \ge \cdots \ge j_s > 0$. Actually the best possible partition for defining $\theta$ is given by the procedure given above; however, any set of integers $\{j_1, \ldots, j_s\}$ with $\mathrm{Card}\, S_i = k^{j_i}$ and $k^{j_1} + \cdots + k^{j_s} = n!/(i_1! \cdots i_k!)$ will allow the definition of a suitable $\theta$. For example, any group $G$ of permutations on $Q$, $\mathrm{Card}\, Q = n$, with $\mathrm{Card}\, G = k^r$ (any power of $k$), will define such a partition. Let us denote again by $G$, with some notational abuse, the group of permutations on $X^n$ induced by $G$. $X^n$ will be partitioned into orbits under $G$, and since

$\mathrm{Card}\, u^G = \mathrm{Card}\, G / \mathrm{Card}\, G_u$,

where $G_u$ is the subgroup of $G$ in which the word $u$ is fixed and $\mathrm{Card}\, G_u$ divides $\mathrm{Card}\, G$, we see that $\mathrm{Card}\, u^G$ is a power of $k$. This partition then allows a suitable $\theta$.

Practically, the problem reduces to the following. Given a $u \in X^n$, find the orbit $u^G$ of $u$ under $G$, then determine the location of $u$ for the lexicographic ordering of $u^G$, and define $\theta u$ as the corresponding word in $X^{j_u}$, where $j_u = \mathrm{Log}_k\, \mathrm{Card}\, u^G$.

The group $G$ that we will use is the subgroup of order $k^{2t-1}$ of the affine group of $\mathbb{Z}_n$, $n = k^t$, $k$ a prime. Denote the permutation $i \to i + 1 \bmod n$ by $\sigma$ and the permutation

(3) $i \to a\, i \bmod n$

by $\alpha$, $a$ being a unit of $\mathbb{Z}_n$ with order a power of $k$.
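The orbit partition described above can be examined by brute force for small $n$. The following sketch rests on assumptions of our own: positions are the integers `0..n-1`, a group element is stored as a pair `(a, b)` standing for $i \to a\,i + b \bmod n$, the function names are hypothetical, and the efficiency is computed for equiprobable input symbols as the average of $\mathrm{Log}_k$ of the orbit size, divided by $n$.

```python
from itertools import product
from math import gcd

def affine_group(k, t):
    """Pairs (a, b) for the maps i -> a*i + b mod n, n = k**t, where a is a
    unit of Z_n whose multiplicative order is a power of k."""
    n = k ** t
    units = []
    for a in range(1, n):
        if gcd(a, n) != 1:
            continue
        order, x = 1, a
        while x != 1:                 # multiplicative order of a mod n
            x = x * a % n
            order += 1
        while order % k == 0:         # strip the factors k
            order //= k
        if order == 1:
            units.append(a)
    return n, [(a, b) for a in units for b in range(n)]

def orbits(k, t):
    """Return n, Card G and the list of orbit sizes of G acting on X^n."""
    n, group = affine_group(k, t)
    seen, sizes = set(), []
    for u in product(range(k), repeat=n):
        if u in seen:
            continue
        orbit = {tuple(u[(a * i + b) % n] for i in range(n)) for a, b in group}
        seen |= orbit
        sizes.append(len(orbit))
    return n, len(group), sizes

def log_k(size, k):
    """Exponent j with size == k**j (fails if size is not a power of k)."""
    j = 0
    while size > 1:
        assert size % k == 0
        size //= k
        j += 1
    return j

def efficiency(k, t):
    """Output digits per input digit for equiprobable input symbols."""
    n, _, sizes = orbits(k, t)
    return sum(s * log_k(s, k) for s in sizes) / (n * k ** n)

print(orbits(2, 3)[1], efficiency(2, 3))
```

For $k = 2$, $t = 3$ the sketch builds a group of order 32 acting on the 256 binary words of length 8, and every orbit size it finds is a power of 2, so each word of an orbit can be assigned a distinct string of $j_u$ digits.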
The units $a$ being of order a power of $k$, every element in $G$ has the form $\alpha\sigma^j$, since we know that the cyclic group $(\sigma)$ is an invariant subgroup of $G$. But we may also write $\sigma^j\alpha$ for the generic element of $G$. This means that every permutation in $G$ may be obtained by first applying a permutation of type (3) and afterward a cyclic shift. Denote by $H$ the cyclic subgroup of order $n$ of $G$ and by $K$ the subgroup of permutations with type (3). Then $G = HK = KH$, and $u^K$ has at least one word in each orbit of the set $u^G$ under $H$. Then the decoding algorithm will be the following.

1. For every word $v$ of $u^K$, find the $v' \in v^H$ with the highest lexicographic order. If $u'$ is the word with the highest order in $u^H$, determine the order of $u'$ among all $v'$. This is an integer between $0$ and $k^{t-1}$. This integer, written in basis $k$, has $i$ digits, $k^i$ being the number of distinct words $v'$. It is the left factor of $\theta u$.

2. Determine the order of $u$ in $u^H$. Since $\mathrm{Card}\, u^H$ divides $k^t$, say $\mathrm{Card}\, u^H = k^j$, we find an integer which will be written with $j$ digits and concatenated with the known left factor of $\theta u$ to form $\theta u$.

Here is an example: $k = 2$, $n = 8$. We have determined all leading words $v'$ of the orbits $v^H$, for $v$ having not more than four ones. The set is partitioned into orbits under $G$ (which are found by letting operate the elements of $K$). We find a class of four leaders, with cyclic order 8, which means that a corresponding $u$ will be decoded into a word of five digits.

 Leaders of the class                        Cyclic order
 10000000                                    8
 11000000  10010000                          8
 10100000                                    8
 10001000                                    4
 11100000  10100100                          8
 11010000  11000010                          8
 11001000  11000100                          8
 10101000                                    8
 11110000  11010010                          8
 11101000  11100010  11010100  11001010      8
 11100100                                    8
 11011000                                    8
 11001100                                    4
 10101010                                    2

The figure in the second column gives the cyclic order of the elements in the corresponding class.

The best possible efficiency of the procedure is

$\rho = (5 \cdot 2^5 + 9 \cdot 4 \cdot 2^4 + 8 \cdot 3 \cdot 2^3 + 3 \cdot 2 \cdot 2^2 + 2)/(8 \cdot 2^8) = 46.58\%$,

which is 86% more than the efficiency of the procedure of von Neumann.

When $k$ is an odd prime, the group $K$ is known to be cyclic, as a subgroup of a cyclic group [7]. This makes the computation easier. However suppose for example that $K$ has two generators $\alpha$ and $\beta$. Every element in $K$ has the form $\alpha^i\beta^j$. We compute $u^{\alpha}, u^{\alpha^2}, \ldots, u^{\alpha^s}$, where $s$ is the smallest integer with $u^{\alpha^s} \in u^H$. $s$ has to be a power of 2 (maybe $2^0$). This means $(\alpha^s) \subset K_{u^H}$. We find similarly $s'$ and then $(\alpha^s, \beta^{s'}) = K_{u^H}$. (This is true because $H$ is an invariant subgroup of $HK = G$.) Elements of $u^K$ are computed simultaneously and as soon as $s'$ is determined, they are all known.

Remark: As a final remark we observe that for $m$ a prime and whatever $k$, an easily computable procedure with poor efficiency consists in the following. First factorize the given sequence into words of length $m$. If $u$ is not a repeated symbol, the $m$ cyclic shifts of each word $u$ are distinct. If $X$ is the set of input symbols, which may be linearly ordered, an integer in $[1,m]$ may be assigned to $u$ corresponding to its lexicographic rank among its cyclic shifts. This integer is the decoding symbol of $u$. This procedure is also an extension of the procedure of von Neumann.

Acknowledgement. I am grateful to Gordon Simons who introduced me to this problem and helped me find the references.

Appendix. Some algebraic justifications.

See N. Bourbaki, "Polynômes et fractions rationnelles", for more details.

For $R$ any ring with unity, we denote by $R[[t]]$ the ring of all formal series of the form $\sum_{i \in \mathbb{N}} a_i t^i$, where $a_i \in R$, $\forall i \in \mathbb{N}$. We have

$\sum_{i \in \mathbb{N}} a_i t^i + \sum_{i \in \mathbb{N}} b_i t^i = \sum_{i \in \mathbb{N}} (a_i + b_i) t^i$

and

$\bigl(\sum_{i \in \mathbb{N}} a_i t^i\bigr)\bigl(\sum_{i \in \mathbb{N}} b_i t^i\bigr) = \sum_{i \in \mathbb{N}} \bigl(\sum_{j + l = i} a_j b_l\bigr) t^i$.

The formal derivative $D$ is defined by

$D \sum_{i \ge 0} a_i t^i = \sum_{i \ge 1} i\, a_i t^{i-1}$

and the property

$D(u \cdot v) = uDv + (Du)v$, $\forall u, v \in R[[t]]$,

is easily verified. From now on, $R$ is assumed to be commutative.

Now suppose $v$, $w$ are polynomials (i.e. formal series with coefficients almost all zero). It is known that
$v$ is invertible in $R[[t]]$ iff the coefficient of $t^0$ in $v$ is a unit of $R$. For simplicity, let this coefficient be 1. Then $v^{-1} = 1 + \cdots$. Suppose $w = vu$. Then $Dw = vDu + uDv$ and, consequently, $Du = v^{-2}(vDw - wDv)$, so that $Du$, which may be directly computed from the definition of $u$, is also obtained by multiplying the series $v^{-2}$ on the right by $vDw - wDv$.

Suppose now $R$ is the ring of real numbers and that we have the situation $q = gh$, where $h$ is a polynomial and where the series of coefficients of $g$ converges. Then, if in general $v = \sum_{i \in \mathbb{N}} v_i t^i$,

$\sum_{0 \le i \le n} q_i = \sum_{0 \le k \le n} \sum_{0 \le i \le k} g_i h_{k-i} = \sum_{j} h_j \sum_{0 \le i \le n-j} g_i$.

Hence we see that $\sum q_i$ converges, with $\sum_{0 \le i \le \infty} q_i = h(1) \sum_{0 \le i \le \infty} g_i$. If $g$ is the series $h^{-1}$ and if $\sum g_i$ converges we see that $\sum_{0 \le i \le \infty} g_i = 1/h(1)$. Applying this to the case where $q = Du$, $g = v^{-2}$, $h = vDw - wDv$, we see that if the series $\sum g_i$ converges we are allowed to write, with the usual notation for derivatives,

$u'(1) = \bigl(v(1)w'(1) - w(1)v'(1)\bigr)/v(1)^2$,

if only we know that the series of coefficients of $v^{-2}$ converges. This will be the case in our paper since no roots of $v$ are in the closed unit circle.

Bibliography

[1] Dwass, Meyer, "Unbiased coin tossing with discrete random variables", Ann. Math. Stat., 1972, 860 - 864.

[2] Elias, Peter, "The efficient construction of an unbiased random sequence", Ann. Math. Stat., 1972, Vol. 43, 865 - 870.

[3] Lechner, James, "Efficient techniques for unbiasing a Bernoulli generator" (abstract), Ann. Math. Stat., 1971, page 2171.

[4] Bernard, Jacques and Letac, Gérard, "Construction d'événements équiprobables et coefficients multinomiaux modulo p^n", Illinois J. of Math., 1973, 317 - 332.

[5] Hoeffding, Wassily and Simons, Gordon, "Unbiased coin tossing with a biased coin", Ann. Math. Stat., 1970, Vol. 41, No. 2, 341 - 352.

[6] Schützenberger, M. P., "On a special class of recurrent events", Ann. Math. Stat., 1961, Vol. 32, 1201 - 1213.

[7] Albert, A., "Fundamental concepts of Algebra", Chic. Univ. Press.

[8] von Neumann, John (1951), "Various techniques used in connection with random digits. Monte Carlo Method", Applied Mathematics Series No.
12, 36 - 38, U. S. National Bureau of Standards, Washington, D. C.