The Todd function method
MacTutor biography source
John Todd was a British geometer who worked at Cambridge for most of his life. Michael Atiyah took classes from him there. He was not Atiyah’s doctoral advisor—that was William Hodge—but he advised Roger Penrose, Atiyah’s longtime colleague at Oxford.
Today Ken and I want to add to the discussion of Atiyah’s proof of the Riemann Hypothesis (RH).
Primary sources are Atiyah’s short paper and longer precursor, the official video of his talk, and his slides. Discussion started here and has continued in several forums. MathOverflow removed their discussions; apparently so did StackExchange. A number of news sources reflect the universal skepticism.
We will not try to cover the same ground as these discussions, nor enumerate statements about holes in the papers. Instead we have gained some small insights into what Atiyah is doing. We are not disagreeing with the conclusion by many that "it's not all there," but we think we can identify a few more things that are there—by intent—than we've seen noted. They don't make a proof either, but we think they are important for understanding where all this is coming from, and that such an understanding is warranted. At the very least this is an exercise in how to read a challenging source.
We will first explain a previous proof that uses a related method—a famous proof that works and is correct. Then we will explain Atiyah’s idea as we see it.
The Todd Trick
Atiyah of course is well aware of the classic use of a special function to prove a deep theorem of complex analysis. Let's call this the "Todd Trick." The proof uses the existence of a complex function lambda ($\lambda$) with certain special properties. Let's recall the famous Liouville theorem, named after Joseph Liouville:
Theorem 1 Every bounded entire function must be a constant.
Then the famous stronger Picard theorem, named after Émile Picard, states:
Theorem 2 Every entire function that misses two points must be a constant.
Sketch of Proof: Let $f$ be an entire function that misses two values, which we may assume are $0$ and $1$. This follows by using a linear map to move the missed values, if needed. Then the magic is to look at the following function $g$: since $\lambda$ is a holomorphic covering map of $\mathbb{C}\setminus\{0,1\}$ by the upper half-plane, $f$ lifts to an entire function $g$ with $\lambda \circ g = f$.
It follows that the composition of $g$ with a conformal map of the upper half-plane onto the unit disk is a bounded entire function, since $g$ misses the entire lower half-plane. But then we see that it must be a constant, and so $g$ and hence $f$ is a constant.
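For concreteness, the chain of maps behind this sketch can be displayed as follows. The notation is ours (the modular function $\lambda$ and the Cayley map are the classical choices):

```latex
% Classical route to the Little Picard theorem (our notation):
% \lambda : \mathbb{H} \to \mathbb{C}\setminus\{0,1\} is a holomorphic
% covering map, so an entire f omitting 0 and 1 lifts to g with
% \lambda \circ g = f and g taking values in the upper half-plane.
\[
  \mathbb{C} \;\xrightarrow{\;g\;}\; \mathbb{H}
  \;\xrightarrow{\;\lambda\;}\; \mathbb{C}\setminus\{0,1\},
  \qquad \lambda \circ g = f.
\]
% The Cayley map sends \mathbb{H} into the unit disk, so
\[
  h \;=\; \frac{g - i}{\,g + i\,}
\]
% is a bounded entire function; Liouville makes h, hence g, hence f constant.
```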
The Todd Trick For The Riemann Hypothesis
Atiyah claims to have a proof of the famous RH. It is based on a special function that he calls the Todd function and denotes by $T$. The function $T$ has a slew of special properties that Atiyah lists. Then he uses the properties to prove the RH directly.
This proof is very much in the spirit of the above proof of the Little Picard Theorem. So there is hope. But we are puzzled over some of the properties that $T$ is supposed to have. We must be confused, but it seems that $T$ cannot have all the properties that are needed.
Here is the way that we think Atiyah's proof is going. Consider the space of complex functions $f$ with a power series centered around some fixed point $a$. Define

$$\mathsf{T}\{f\} = f(a) + f'(a)(s - a).$$

This will be our "Todd function." Note this function is well defined on the given space and satisfies the key property

$$\mathsf{T}\{fg\} = \mathsf{T}\{\mathsf{T}\{f\} \cdot \mathsf{T}\{g\}\}.$$

This explains how it is possible to get

$$\mathsf{T}(fg) = \mathsf{T}(f)\,\mathsf{T}(g)$$

with no higher terms. We use $\mathsf{T}$ for our version of his function to mark that it is a variation of what he seems to say.
$\mathsf{T}$ in this form is really an operator, as signaled by the use of curly braces in the last equation. Note that its right hand side also equals $\mathsf{T}\{\mathsf{T}\{fg\}\}$, since $\mathsf{T}$ is idempotent. This is the sense we get from two pivotal equations on page 3 of Atiyah's short paper—where, however, we use curly $\mathsf{T}\{\cdot\}$ not plain $T(\cdot)$. The former, to which we've added the label '(A)', is said to apply when "$f$ and $g$ are power series with no constant term."
The reason we use curly $\mathsf{T}$ is that the only way to make sense of the former equation is to read it as $\mathsf{T}$ applied to the functions $f$ and $g$ in the form of power series, and then the resulting function or series is applied to the complex number $s$. It does not make sense to say that for any $s$ we evaluate $f(s)$ and $g(s)$ and then say that the resulting numbers obey $T(f(s)g(s)) = T(f(s))\,T(g(s))$.
Our point is that the latter equation (3.1) hence needs to be read the same way—not as a simple function value where $T$ is applied to the number $\zeta(s)$. That is, it must be what we are writing as curly-$\mathsf{T}$ applied to $\zeta$ as a function—indeed, as a power series. Then the result is applied to $s$. This is neither hairsplitting nor special pleading but a need we feel as computer theorists who have used strongly-typed programming languages.
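To make the typed reading concrete, here is a toy Python model of what we are calling curly-$\mathsf{T}$, with the center fixed at $a = 0$. The representation and all names are ours, purely illustrative: a series is a truncated coefficient list, and $\mathsf{T}$ keeps only the constant and linear terms.

```python
# Toy model of the curly-T operator (our construction, not Atiyah's):
# a power series centered at a = 0 is a coefficient list [c0, c1, c2, ...].

def series_mul(f, g, deg=6):
    """Multiply two truncated series, keeping terms up to s^deg."""
    out = [0.0] * (deg + 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if i + j <= deg:
                out[i + j] += a * b
    return out

def T(f):
    """Linear truncation T{f} = f(0) + f'(0) s, kept at the same length."""
    out = [0.0] * len(f)
    out[0] = float(f[0])
    if len(f) > 1:
        out[1] = float(f[1])
    return out

# Two series with no constant term, as in equation (A).
f = [0.0, 2.0, -1.0, 3.0, 0.0, 0.0, 0.0]   # 2s - s^2 + 3s^3
g = [0.0, 1.0, 4.0, 0.0, 0.0, 0.0, 0.0]    # s + 4s^2

lhs = T(series_mul(f, g))              # T{fg}
rhs = T(series_mul(T(f), T(g)))        # T{ T{f} T{g} }
print(lhs == rhs)                      # True: the key property
print(T(T(f)) == T(f))                 # True: T is idempotent
print(series_mul(T(f), T(g)) == lhs)   # False: without re-truncation
```

The last line is the point of the curly braces: the naive reading $T(f)T(g)$ has a quadratic term and only agrees with $T\{fg\}$ after being fed through $\mathsf{T}$ again.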
This means we want to understand $\zeta$ as a power series. No such series appears in the papers, but given Atiyah's surrounding references to power series it must be there. So we will try our best to supply what is indicated.
Power Series For Zeta and Square Root
A power series for $\zeta(s)$ differs from its original summation formula $\sum_{n \ge 1} n^{-s}$ by having $s$ in the bases of powers rather than in the exponents. There are several ways to represent $\zeta$ as a power series. Actually, from above we want a series for $\zeta$ as a function (to be applied to $s$), which may or may not have the same effect as a series for the numerical value $\zeta(s)$ with $s$ fixed. Since we are not trying to be perfect, we will mention the Laurent series around $s = 1$ as given here:

$$\zeta(s) = \frac{1}{s-1} + \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\,\gamma_n\,(s-1)^n.$$
Here the $\gamma_n$ are constants named for the Dutch mathematician Thomas Stieltjes, except that $\gamma_0$ is the Euler–Mascheroni constant $\gamma \approx 0.5772$. The next one is $\gamma_1 \approx -0.0728$. Expanding the series around $s = 1$ rather than $s = 0$ separates out the pole at $s = 1$ via the first term.
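As a sanity check on this expansion, a few tabulated Stieltjes constants already reproduce $\zeta$ well away from the pole. The constant values below are from standard tables; the code is our illustration, not from the papers.

```python
# Numerically check the Laurent expansion of zeta around s = 1 using
# the first few Stieltjes constants (values from standard tables).
from math import factorial

gammas = [0.5772156649015329,    # gamma_0 = Euler-Mascheroni constant
          -0.0728158454836767,   # gamma_1
          -0.0096903631928723,   # gamma_2
          0.0020538344203033]    # gamma_3

def zeta_laurent(s, terms=4):
    """1/(s-1) + sum_n (-1)^n / n! * gamma_n * (s-1)^n, truncated."""
    total = 1.0 / (s - 1)
    for n in range(terms):
        total += (-1) ** n / factorial(n) * gammas[n] * (s - 1) ** n
    return total

# Known reference value: zeta(1.5) = 2.6123753486...
print(zeta_laurent(1.5))
```

Even four terms of the tail land within about $10^{-5}$ of $\zeta(1.5)$, so the expansion is a perfectly usable "power series for $\zeta$" near $s = 1$.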
If some particular property of a power series for $\zeta$ affects the application of $\mathsf{T}$, then this insulates against the charge that no special property of $\zeta$ is being used. To be sure, no special property is evident in the papers, and the burden to state one is on the claimer, but an intent along these lines is more likely than a blank slate.
Now the same logic must apply to two numbered equations, 2.6 and 2.7, that appear between the two we juxtaposed in the last section. They are extra-confusing because now "$T(s)$" is written as a simple function without curly braces. Here they are as they appear, including a cryptic "or":
Which is it, $s$ or $\sqrt{s}$? A remark just before these latter two equations hints at the answer "both":
Remark. Weakly analytic functions have a formal expansion as a power series near the origin. Formula 2.6 is just the linear approximation of this expansion (more precisely this is on the branched double cover of the complex $s$-plane given by $\sqrt{s}$).
So what is going on involves approximation of a power series. Thus "$T$" must be carrying out a linear approximation of a series. Hence "$T(s)$" needs to be read this way. It is hard to read the right-hand side of 2.6 and the left-hand side of 2.7 with $s$ inside the square root, but we can use them to substitute:

$$T(\sqrt{1+s}) = 1 + \frac{s}{2},$$

which per above really means

$$\mathsf{T}\{\sqrt{1+s}\} = 1 + \frac{s}{2},$$

with a power series for $\sqrt{1+s}$ as the argument for $\mathsf{T}$. The Maclaurin series expansion (as given here) is

$$\sqrt{1+s} = 1 + \frac{s}{2} - \frac{s^2}{8} + \frac{s^3}{16} - \frac{5s^4}{128} + \cdots$$
Taking away the super-linear terms leaves $1 + \frac{s}{2}$, which by the same intent equals $\mathsf{T}\{\sqrt{1+s}\}$ as stated. That the whole series converges provided $|s| < 1$ confers some legitimacy.
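The truncation step can be checked numerically. Here is a small Python sketch (our code, not from the papers) that generates the Maclaurin coefficients of $\sqrt{1+s}$ from the binomial series and compares the linear cutoff $1 + s/2$ against the full sum:

```python
# Maclaurin coefficients of sqrt(1+s) via the binomial series
# (1+s)^(1/2) = sum_k C(1/2, k) s^k, and the effect of the linear cutoff.

def binom_half(k):
    """Generalized binomial coefficient C(1/2, k)."""
    c = 1.0
    for i in range(k):
        c *= (0.5 - i) / (i + 1)
    return c

coeffs = [binom_half(k) for k in range(4)]
print(coeffs)        # [1.0, 0.5, -0.125, 0.0625]

s = 0.1
full = sum(binom_half(k) * s ** k for k in range(40))   # converges for |s| < 1
linear = 1 + s / 2                                      # the "T" truncation
print(full, linear)  # close for small s; they differ at order s^2
```

The first discarded term is $-s^2/8$, so for $s = 0.1$ the linear cutoff is already within about $0.00125$ of the true value, consistent with the claimed "linear approximation" reading.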
The Final Line to Read
Here is a screenshot of the climax on page 3:
The key line is the one saying, "Now take $f = g = F$ in 2.6"—where $F$ is the function defined in the screenshot and 2.6 refers not to the equation with that number but to the one we've labeled '(A)', which is in paragraph 2.6 of his paper. The important point in this substitution is not that $F$ is a numerical function on $\mathbb{C}$ but rather that $F(s)$ is to be treated as "a power series with no constant term." This means that an application of $T$ is given as argument for another application of $T$.
We can't claim to have connected all the dots. We haven't even connected the factor $\frac{1}{2}$ in $\frac{s}{2}$ from the square-root expansion (subtracting off the constant term $1$) to its claimed use to get $2T(s) = T(2s)$. Taken at face value, the latter holding on any open region entails that $T$ must be linear. But connecting more dots helps to see fault lines more clearly, both for Atiyah's papers and attempts on RH in general.
Still A Problem?
The emphasis on linearity in our exposition sharpens the kind of objection raised by Luboš Motl in his review: Take any two zeroes $\rho_1$ and $\rho_2$ close by each other on the critical line, take a small real $\delta > 0$, and define:

$$\hat{\zeta}(s) = \zeta(s)\cdot\frac{(s - \rho_1 - \delta)(s - \rho_2 + \delta)}{(s - \rho_1)(s - \rho_2)}.$$

This exchanges two genuine zeroes of $\zeta$ for two mirror-image new zeroes of $\hat{\zeta}$ that are off the line by $\pm\delta$, and likewise for their complex conjugates. We have chosen $\rho_1, \rho_2$ close together and $\delta$ small to minimize the effect on series expansions of $\hat{\zeta}$ compared to $\zeta$. Would the discrepancy affect the coefficient of the new linear term for $\hat{\zeta}$ compared to that of the original linear term for $\zeta$? Surely not enough is said about what is in the relevant series to tell, nor about any other way to distinguish $\hat{\zeta}$ from $\zeta$. But Motl's example and our attention to series have at least channeled the question.
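Our reading of Motl's swap can be sketched numerically. In the Python below, a rational factor trades on-line zeroes for off-line ones; the zero locations are the first two nontrivial zeta zeroes to a few decimal places, while the factor, the sample point, and $\delta$ are our illustrative stand-ins:

```python
# Sketch of the Motl-style swap: a rational factor that trades two
# on-line zeroes rho1, rho2 for off-line ones rho1 + delta, rho2 - delta,
# with the complex conjugates handled the same way.  The zeroes are the
# first two nontrivial zeta zeroes to a few places; delta is ours.

rho1 = complex(0.5, 14.134725)
rho2 = complex(0.5, 21.022040)

def factor(s, delta):
    """Multiplier turning zeta into zeta-hat: swaps four zeroes."""
    num = (s - (rho1 + delta)) * (s - (rho2 - delta)) \
        * (s - (rho1.conjugate() + delta)) * (s - (rho2.conjugate() - delta))
    den = (s - rho1) * (s - rho2) \
        * (s - rho1.conjugate()) * (s - rho2.conjugate())
    return num / den

s0 = complex(2.0, 3.0)   # a sample point away from the zeroes
for delta in (1e-2, 1e-4, 1e-6):
    print(delta, abs(factor(s0, delta) - 1))
```

Away from the zeroes the factor tends to $1$ roughly linearly in $\delta$, which illustrates why the series expansions of $\hat{\zeta}$ and $\zeta$ are so hard to tell apart for small $\delta$.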
There are numerous other issues with the papers. Regarding the assertions about the fine structure constant, perhaps the argument is best left to physicists, but we note a 2010 paper by Giuseppe Dattoli. It is titled, "The Fine Structure Constant and Numerical Alchemy" and gives both a historical survey and a would-be simple formula for it. Both Dattoli and Atiyah have references to Kurt Gödel at the end of their papers. Just before the latter is a sentence that is wrong at face value: "To be explicit, the proof of RH in this paper is by contradiction and this is not accepted as valid in ZF, it does require choice." On the contrary, RH has purely arithmetic formulations—indeed with only one unbounded quantifier per reference to Jeff Lagarias here—and all arithmetic statements (and more) provable in ZFC are provable in ZF. Nor is "by contradiction" an issue for ZF. Atiyah's next sentence, however, talks about "most general versions" of RH and his concern about choice might transfer to them.
Finally, we note that some key ingredients in the essay on RH by Alain Connes, which we mentioned in the previous post, involve analyzing operators that, like $\mathsf{T}$, are idempotent. These have great sophistication. More down-to-earth, a calculation by Ken at the end of this recent post gives a motive for cutting off terms above linear order when the quantities involved are small. Those terms don't vanish in the real world, but calculating in spaces where they do vanish may help clarify the real behavior of limits involving them.
Open Problems
Is the proof's idea okay or not? Does $T$ have the properties that are claimed? The general idea of a "Todd function" approach to the RH seems at least interesting. Can we make a list of properties that a function must have to shed light on the RH? Are we right that the Todd function is not defined on complex numbers, but is defined on functions represented by series? The most accessible reference we have found is chapter 5 of this 2004 thesis linked from this StackExchange discussion.