[math-fun] Pascal triangle row: shift-and-add
What happens when row n of the Pascal triangle is shifted by k then added to itself? Or in continuous terms, two binomial distributions, identical apart from the shift along the x-axis, are summed? Consider k fixed and n increasing: initially n_C_i + n_C_j where i+j = n+k is bimodal; but at some point n_0 it becomes unimodal and remains so for n > n_0. Numerical experiments suggest that n_0 ~ k^2 - 5/3 ; indeed this bound estimate is remarkably good, being only <0.1 too large at k = 2 , and <0.001 too large at k = 20 . No doubt probability wonks know all about this stuff already: so can anyone point me at a reference to a proof of the bound; or alternatively explain why it's perfectly obvious? Fred Lunnon
Further investigation strongly supports n_0 = k^2 - 5/3 - (16/45)/k^2 + O(1/k^4) WFL On 6/27/14, Fred Lunnon <fred.lunnon@gmail.com> wrote:
What happens when row n of the Pascal triangle is shifted by k then added to itself? Or in continuous terms, two binomial distributions, identical apart from the shift along the x-axis, are summed?
Consider k fixed and n increasing: initially
n_C_i + n_C_j where i+j = n+k
is bimodal; but at some point n_0 it becomes unimodal and remains so for n > n_0.
Numerical experiments suggest that
n_0 ~ k^2 - 5/3 ;
indeed this bound estimate is remarkably good, being only <0.1 too large at k = 2 , and <0.001 too large at k = 20 .
No doubt probability wonks know all about this stuff already: so can anyone point me at a reference to a proof of the bound; or alternatively explain why it's perfectly obvious?
Fred Lunnon
On 27/06/2014 19:22, Fred Lunnon wrote:
Consider k fixed and n increasing: initially
n_C_i + n_C_j where i+j = n+k
is bimodal; but at some point n_0 it becomes unimodal and remains so for n > n_0.
Numerical experiments suggest that
n_0 ~ k^2 - 5/3 ;
First thought: For large n we have the familiar approximation that says binomial(n,p) is close to normal(np,npq) and, in particular, binomial(n,1/2) is close to normal(n/2,n/4). (The second parameter being variance, not stddev.) Now, the sum of two standard normals separated by 2t is (up to an irrelevant multiplicative constant) f(x) = exp(-(x+t)^2/2) + exp(-(x-t)^2/2) so f'(x) = - (x+t) exp(-(x+t)^2/2) - (x-t) exp(-(x-t)^2/2) f''(x) = (x+t)^2 exp(-(x+t)^2/2) + (x-t)^2 exp(-(x-t)^2/2) - exp(-(x+t)^2/2) - exp(-(x-t)^2/2) so f''(0) = [2t^2 - 2] exp(-t^2/2) and therefore switches from unimodal to bimodal at t=1. So the sum of two normals with stddev sqrt(n/4) switches from unimodal to bimodal when the separation is 2sqrt(n/4) = sqrt(n). Or, holding the separation fixed at k, when the stddev is sqrt(k^2/4), i.e., when n=k^2. Of course this requires more rigour to verify that the approximation is close enough; perhaps supplying that rigour might naturally lead to some of your higher-order terms, though I wouldn't count on it. Second thought: Wait, we can do this by hand. The question is whether the term at i=j=(n+k)/2 is bigger or smaller than its (equal) neighbours. In other words, we need to compare 2nCi with nC(i-1)+nC(i+1) -- to see when nC(i-1)/nCi + nC(i+1)/nCi crosses 2. Well, that quantity is simply i/(n-i+1) + (n-i)/(i+1) and if I've done my algebraic scribblings correctly the condition simplifies to k^2 = n+2. That doesn't match your more complicated formula. Are you considering i,j to be continuous variables and asking when there's any dip in the middle -- i.e., when the second derivative at (n+k)/2 changes sign? In that case I suppose we get something a bit hairier. If some more dubious algebraic scribblings are correct, the condition is (f1((n+k)/2) - f1((n-k)/2))^2 = f2((n+k)/2) + f2((n-k)/2) where f1 is the logarithmic derivative of the factorial function (i.e., digamma offset by 1) and f2 is f1'. Those more fluent than I with "special functions" may find it all obvious at this point :-). -- g
In the Gaussian approximation, normalized to mean 0 and variance 1, we have exp(-(x-s/2)^2/2)) + exp(-(x+s/2)^2/2)) = 2 exp(-(x^2+s^2)/2) cosh sx. This has extrema where the derivative of the log is 0. s tanh sx = x. When s < 1, there is a single maximum at 0. When s > 1, there is a minimum at 0 and 2 maxima. The critical point is s = 1, when there is a triple root. That is, the critical shift s is 1 standard deviation, which for the binomial(n,k) is sqrt(n)/2. -- Gene
________________________________ From: Fred Lunnon <fred.lunnon@gmail.com> To: math-fun <math-fun@mailman.xmission.com> Sent: Friday, June 27, 2014 11:22 AM Subject: [math-fun] Pascal triangle row: shift-and-add
What happens when row n of the Pascal triangle is shifted by k then added to itself? Or in continuous terms, two binomial distributions, identical apart from the shift along the x-axis, are summed?
Consider k fixed and n increasing: initially
n_C_i + n_C_j where i+j = n+k
is bimodal; but at some point n_0 it becomes unimodal and remains so for n > n_0.
Numerical experiments suggest that
n_0 ~ k^2 - 5/3 ;
indeed this bound estimate is remarkably good, being only <0.1 too large at k = 2 , and <0.001 too large at k = 20 .
No doubt probability wonks know all about this stuff already: so can anyone point me at a reference to a proof of the bound; or alternatively explain why it's perfectly obvious?
Fred Lunnon
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
My thanks to Warren, Gareth, Gene --- I will digest their replies. In the meantime, I realised that I conflated discrete (Pascal triangle) and continuous (binomial distribution) problems: the answers are not quite the same. For the continuous case instead I find experimentally n_0 = k^2 - 4/3 - (32/45)/k^2 - (4/9)/k^4 + O(1/k^6) . WFL On 6/28/14, Eugene Salamin via math-fun <math-fun@mailman.xmission.com> wrote:
In the Gaussian approximation, normalized to mean 0 and variance 1, we have
exp(-(x-s/2)^2/2)) + exp(-(x+s/2)^2/2)) = 2 exp(-(x^2+s^2)/2) cosh sx.
This has extrema where the derivative of the log is 0.
s tanh sx = x.
When s < 1, there is a single maximum at 0. When s > 1, there is a minimum at 0 and 2 maxima. The critical point is s = 1, when there is a triple root. That is, the critical shift s is 1 standard deviation, which for the binomial(n,k) is sqrt(n)/2.
-- Gene
________________________________ From: Fred Lunnon <fred.lunnon@gmail.com> To: math-fun <math-fun@mailman.xmission.com> Sent: Friday, June 27, 2014 11:22 AM Subject: [math-fun] Pascal triangle row: shift-and-add
What happens when row n of the Pascal triangle is shifted by k then added to itself? Or in continuous terms, two binomial distributions, identical apart from the shift along the x-axis, are summed?
Consider k fixed and n increasing: initially
n_C_i + n_C_j where i+j = n+k
is bimodal; but at some point n_0 it becomes unimodal and remains so for n > n_0.
Numerical experiments suggest that
n_0 ~ k^2 - 5/3 ;
indeed this bound estimate is remarkably good, being only <0.1 too large at k = 2 , and <0.001 too large at k = 20 .
No doubt probability wonks know all about this stuff already: so can anyone point me at a reference to a proof of the bound; or alternatively explain why it's perfectly obvious?
Fred Lunnon
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
Writing f(i) = n_C_i + n_C_j where i+j = n+k , I asked for the boundary n = n_0(k) between f(i) bimodal and unimodal. What I mistermed "binomial distribution" actually just meant regarding f as a function of a continuous variable: solving for the second differential vanishing appeared to give n_0 = k^2 - 4/3 - O(1/k^2) for no dip in the middle of the smooth curve at all. What I mistermed the "discrete" (sic) bound was actually chimaerical, taking the first difference of f then treating it as a function of a continuous variable, which appeared to give n_0 = k^2 - 5/3 - O(1/k^2) for maybe a very small dip. Gareth's "really discrete" argument simply divides out common factorials from the second difference of f(i) in the centre of the combined support, which vanishes at n_0 = k^2 - 6/3 permitting wobbles between several successive integers. So yup, it really is obvious after all, once I can kick my addiction to floating-point numerical computation. But 'ang abaht there --- we have cheerfully assumed that the sum of two unimodal functions is necessarily at worst bimodal. This may be false in general, even when they are symmetric, and shifts of one another. So why exactly is it true in this case? Back to the drawing board! (But I can't work up much enthusiasm for tangling with digamma ...) Fred Lunnon On 6/28/14, Fred Lunnon <fred.lunnon@gmail.com> wrote:
My thanks to Warren, Gareth, Gene --- I will digest their replies.
In the meantime, I realised that I conflated discrete (Pascal triangle) and continuous (binomial distribution) problems: the answers are not quite the same. For the continuous case instead I find experimentally
n_0 = k^2 - 4/3 - (32/45)/k^2 - (4/9)/k^4 + O(1/k^6) .
WFL
On 6/28/14, Eugene Salamin via math-fun <math-fun@mailman.xmission.com> wrote:
In the Gaussian approximation, normalized to mean 0 and variance 1, we have
exp(-(x-s/2)^2/2)) + exp(-(x+s/2)^2/2)) = 2 exp(-(x^2+s^2)/2) cosh sx.
This has extrema where the derivative of the log is 0.
s tanh sx = x.
When s < 1, there is a single maximum at 0. When s > 1, there is a minimum at 0 and 2 maxima. The critical point is s = 1, when there is a triple root. That is, the critical shift s is 1 standard deviation, which for the binomial(n,k) is sqrt(n)/2.
-- Gene
<< We have cheerfully assumed that the sum of two unimodal functions is necessarily at worst bimodal. This may be false in general, even when they are symmetric, and shifts of one another. So why exactly is it true in this case? >> Writing down the second difference again, but solving \Delta^2 f(i) = 0 for i in terms of n, k instead, yields a quartic in i with at most 4 real roots; so f has at most 3 maxima and minima, and is at worst bimodal. These chimaerical arguments --- mixing discrete differences and continuous notions like inflection --- do strike me as rather unsatisfactory --- not to mention liable to confusion if not outright error. But if the alternative is involving higher transcendental functions ... Turning to the more general question of summing a pair of unimodal functions: here's an example of a trimodal sum of a symmetric unimodal with its own shift: [ 1 3 5 7 9 11 12 15 16 19 19 16 15 12 11 9 7 5 3 1 0 0 0 0 0 ] + [ 0 0 0 0 0 1 3 5 7 9 11 12 15 16 19 19 16 15 12 11 9 7 5 3 1 ] = [ 1 3 5 7 9 12 15 20 23 28 30 28 30 28 30 28 23 20 15 12 9 7 5 3 1 ] (shame about the spacing). The summand is however not concave on its support. Is there an example which remedies this defect? Fred Lunnon On 6/28/14, Fred Lunnon <fred.lunnon@gmail.com> wrote:
Writing f(i) = n_C_i + n_C_j where i+j = n+k , I asked for the boundary n = n_0(k) between f(i) bimodal and unimodal.
What I mistermed "binomial distribution" actually just meant regarding f as a function of a continuous variable: solving for the second differential vanishing appeared to give n_0 = k^2 - 4/3 - O(1/k^2) for no dip in the middle of the smooth curve at all.
What I mistermed the "discrete" (sic) bound was actually chimaerical, taking the first difference of f then treating it as a function of a continuous variable, which appeared to give n_0 = k^2 - 5/3 - O(1/k^2) for maybe a very small dip.
Gareth's "really discrete" argument simply divides out common factorials from the second difference of f(i) in the centre of the combined support, which vanishes at n_0 = k^2 - 6/3 permitting wobbles between several successive integers.
So yup, it really is obvious after all, once I can kick my addiction to floating-point numerical computation.
But 'ang abaht there --- we have cheerfully assumed that the sum of two unimodal functions is necessarily at worst bimodal. This may be false in general, even when they are symmetric, and shifts of one another.
So why exactly is it true in this case? Back to the drawing board! (But I can't work up much enthusiasm for tangling with digamma ...)
Fred Lunnon
On 6/28/14, Fred Lunnon <fred.lunnon@gmail.com> wrote:
My thanks to Warren, Gareth, Gene --- I will digest their replies.
In the meantime, I realised that I conflated discrete (Pascal triangle) and continuous (binomial distribution) problems: the answers are not quite the same. For the continuous case instead I find experimentally
n_0 = k^2 - 4/3 - (32/45)/k^2 - (4/9)/k^4 + O(1/k^6) .
WFL
On 6/28/14, Eugene Salamin via math-fun <math-fun@mailman.xmission.com> wrote:
In the Gaussian approximation, normalized to mean 0 and variance 1, we have
exp(-(x-s/2)^2/2)) + exp(-(x+s/2)^2/2)) = 2 exp(-(x^2+s^2)/2) cosh sx.
This has extrema where the derivative of the log is 0.
s tanh sx = x.
When s < 1, there is a single maximum at 0. When s > 1, there is a minimum at 0 and 2 maxima. The critical point is s = 1, when there is a triple root. That is, the critical shift s is 1 standard deviation, which for the binomial(n,k) is sqrt(n)/2.
-- Gene
participants (3)
-
Eugene Salamin -
Fred Lunnon -
Gareth McCaughan