From particle physics to climatology to macroeconomics, scientists confront phenomena they would like to better understand but which are too complex to study in realistic detail. In such situations, researchers often turn to models: surrogate systems that are simpler or more tractable than the target phenomenon but similar enough to it to offer insight.
Scientists’ modeling practices raise a number of philosophical questions. Some especially challenging and interesting ones pertain to unrealistic, fictional or essentially idealized models and their role in science. Such models have been a major theme of recent work in the philosophy of modeling, which has seen debates around questions like these:
Can we gain genuine scientific understanding from unrealistic models?
Can unrealistic models explain anything about their target systems? If so, is this the source of their ability to generate understanding?
Are unrealistic models useful primarily because they give us counterfactual knowledge about their target systems?
My goal is to bring pure mathematics into these conversations. Doing so is appropriate because mathematicians are modelers too: models of all sorts, and unrealistic models in particular, are used in similar ways and for similar reasons in pure mathematics as in the empirical sciences.
This fact is worth advertising in its own right. Although idealized models are an indispensable part of the toolkit in many areas of mathematics, their existence is rarely noted either by philosophers of mathematical practice or by theorists of modeling. Indeed, as far as I am aware, no single instance of modeling in this sense has ever been examined in detail — a stark contrast to the myriad case studies from across the sciences.
To begin to correct this omission, I will first look carefully at two examples from contemporary number theory: Cramér’s random model of the primes and the function field model of the integers. Both models are important and widely used research tools about which philosophers ought to be better informed.
Perhaps more importantly, attending to modeling in mathematics can help to settle some of the contentious questions noted above. With the help of the two case studies, I plan to argue that:
Mathematicians make extensive use of unrealistic models and derive understanding from them.
It is not always the case that this understanding is mediated by explanation. An unrealistic model can help us understand a phenomenon even when it does not offer any explanation of the phenomenon.
It is not always the case that unrealistic models contribute to understanding (or are otherwise useful) by imparting counterfactual knowledge.
The last two claims, in particular, constitute challenges to popular views in philosophy of science. Taking cases from mathematics seriously, then, can help move debates about modeling forward.
In §1, I make some preliminary clarifications about what I take models to be and what makes a model unrealistic. In §2 and §3, I discuss Cramér’s model and the function field model, respectively. In §4, I answer the questions posed above about understanding, explanation and counterfactual knowledge and close with a plea for greater contact between pure mathematics and philosophy of science.
1. Unrealistic Models in Mathematics
Let me begin with some comments about the scope of my study and its rationale.
First, the subject matter of this paper — mathematics and models — might bring to mind the branch of mathematical logic known as model theory. For the model theorist, a model is a structure that satisfies a given set of sentences in a specified formal language under an interpretation. I am not primarily interested in this special sense of ‘model’ but rather in the broader scientific meaning of the term. In this broader sense, I take it, a model is any object M that is used to represent some other phenomenon, system, or body of information P. There are no a priori restrictions here on the nature of M or its relationship to P; in particular, there is no requirement that M satisfy some set of sentences associated with P.
Another clarification. It is in the nature of a model to take some liberties with its target phenomenon — “all models are wrong”, as the saying goes — but different models do so in different ways and to different degrees. Some abstract away from irrelevant details but are otherwise largely realistic; their elements represent only real features of the target phenomenon, all the most important features are represented, and these representations are more or less accurate. Moreover, such models can often be “de-idealized” even further without fundamentally changing their character (by adding in missing details or relaxing simplifying assumptions, for example).
In other cases, however, the relationship between surrogate and reality is less tidy. Many models explicitly and essentially misrepresent key aspects of their target phenomena and hence are nonveridical in a deeper sense. In these cases, no simple de-idealization procedure is available; the models are what they are, and function as they do, precisely on account of the distortions they contain. This latter sort of case is what I mean by an unrealistic model.1
To get a clearer sense of the distinction, consider a schematic street map or a simple lunar model of the tides. Both models omit some features of their targets: the map may not depict the relative widths of the streets or the locations of alleys and unpaved drives, while the model of the tides neglects the gravitational influence of the Sun and the effects of Earth’s rotation. Nevertheless, both accurately represent the most important features of their target systems without introducing major ontological or ideological distortions.
Compare Bohr’s model of the atom or Schelling’s model of housing segregation. It is essential to Bohr’s model that it portrays electrons as moving in well-defined orbits around their nuclei, when in fact they do no such thing. The electron orbitals, as Alisa Bokulich puts it, are fictions, which “[cannot] be properly thought of as an ‘idealization’ of the true quantum dynamics” (Bokulich 2011, 43). Meanwhile, the Schelling model represents an agent’s housing choices as completely determined by two factors: their preference to be surrounded by a certain percentage of neighbors from their own group and the current composition of their immediate neighborhood. Cost considerations and other factors of obvious real-world importance are absent from the model, which is therefore usually viewed as a toy model: a “strongly idealized” and “extremely simple” representation that omits most of the factors on which the target phenomenon actually depends (cf. Reutlinger et al. 2018). Unrealistic models raise some especially interesting questions, and their role in mathematics will be my focus below.
2. The Cramér Random Model
One phenomenon often studied via models is the distribution of the prime numbers. In this section, I describe one of the most important and widely used of these: Cramér’s random model of the primes. Since Cramér introduced the model in 1936, numerous refinements, spinoffs and variants have emerged. Some of these are more accurate or useful than the original model for certain purposes. I focus mostly on the original here, because it is the simplest and remains in frequent use.
Let me start with some background. Famously, and perhaps to a greater degree than any other branch of mathematics, number theory is rife with simple and natural questions that have proven very hard to answer. Among the most well-known examples are the four Landau problems2:
Goldbach’s conjecture: Is every even integer greater than 2 the sum of two primes?
Twin Primes conjecture: Are there infinitely many pairs of prime numbers of the form p, p + 2?
Legendre’s conjecture: Is there a prime between n^2 and (n + 1)^2 for every positive integer n?
Fourth Landau conjecture: Are there infinitely many primes of the form n^2 + 1?
Settling these conjectures requires understanding the arrangement of the primes among the natural numbers. This is no easy task, because “the series of prime numbers exhibits great irregularities of detail” (Ingham 1932, 1) and “do[es] not follow any apparent pattern” (Koukoulopoulos 2019, 1).3 The first three questions have thus remained open for 170 years or more.4 Although existing technology still does not seem up to the challenge of solving the Landau problems, we have learned enough to make some headway.
The very first relevant discovery was Euclid’s theorem that there are infinitely many prime numbers. This is a precondition for the conjectures’ possible truth, but not helpful for their resolution, because it does not provide any information about the distribution of the primes. For this, we need the much more recent prime number theorem (PNT; 1896, proved independently by Hadamard and de la Vallée Poussin). Where log x is the natural logarithm and π(x) is the prime-counting function (giving the number of primes less than or equal to x), the PNT says that
\[ \lim_{x \to \infty} \frac{\pi(x)}{x / \log x} = 1, \]
i.e., that the number of primes up to x is asymptotic to x / log x: the ratio of the two quantities approaches 1 as x goes to infinity. This means that the primes steadily thin out among the natural numbers but at a relatively slow rate. A consequence of the PNT is that, for sufficiently large n, the probability that n is prime is about 1 / log n, a fact we’ll return to below.5
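For readers who want to see the PNT numerically, here is a minimal Python sketch. It counts primes with a basic sieve of Eratosthenes and compares π(x) with x / log x. The function name `prime_count` and the sample values of x are illustrative choices, not anything drawn from the literature discussed here.

```python
import math

def prime_count(x: int) -> int:
    """Return pi(x), the number of primes <= x, using a sieve of Eratosthenes."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting from p * p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(is_prime)

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5, 10**6):
        approx = x / math.log(x)
        # Columns: x, pi(x), x / log x (rounded), and their ratio.
        print(x, prime_count(x), round(approx), round(prime_count(x) / approx, 4))
```

The ratio in the last column drifts toward 1 only slowly; at x = 10^6 it is still roughly 1.08, which is a useful reminder that the PNT is a statement about the limit rather than about any particular finite range.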
Unfortunately, this still does not provide enough information about the distribution of the primes to settle Landau’s problems. (A proof of the Riemann Hypothesis would help, because it can be viewed as an improvement on the PNT, bounding how far off π(x) can be from x / log x. But such a proof seems unlikely to be forthcoming any time soon.)
In view of these difficulties, the Swedish mathematician Harald Cramér proposed a new way of approaching the distribution problem. Rather than directly studying the primes themselves, he constructed a more tractable surrogate, now known as “Cramér’s model” or the “random model” of the primes. The idea, set out in Cramér (1936), is to build a subset of the natural numbers by independently choosing to include each n > 2 with probability 1 / log n. The resulting sequence might look something like this:6
3, 4, 6, 11, 12, 25, 26, 28, 32, 34, 35, 36, 43, 57, 66, 68, 80, 83, 87, 93, …
100005, 100006, 100008, 100018, 100045, 100055, 100074, 100094, 100096, 100106, …
Think of this set as a model of the real sequence of primes. By the consequence of the prime number theorem mentioned above, the “primes” in Cramér’s model will have the same asymptotic density in the natural numbers as the real primes (with probability 1) — one can observe, for instance, that the larger terms in the sample sequence are spaced out somewhat more than the smaller ones. So, it is reasonable to hope that other statistical and distributional properties of the real prime sequence will also resemble those of the model. (Note, by the way, that “the model” here refers to an arbitrary sequence generated by Cramér’s procedure, not to any definite sequence in particular. Correspondingly, claims of the form “P is true in the model” mean that P holds for an arbitrary such sequence with probability 1, perhaps with finitely many exceptions.7)
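Cramér's procedure is simple enough to simulate directly. The following Python sketch draws one random set in the manner described above and compares its size with the expected count. The names `cramer_sample` and `expected_count`, the fixed seed, and the cutoff are assumptions made for illustration; the sample printed will not match the illustrative run shown earlier.

```python
import math
import random

def cramer_sample(limit: int, seed: int = 0) -> list[int]:
    """Draw one Cramér-style random set: each n in 3..limit is included
    independently with probability 1 / log n."""
    rng = random.Random(seed)  # fixed seed only so that runs are repeatable
    return [n for n in range(3, limit + 1) if rng.random() < 1 / math.log(n)]

def expected_count(limit: int) -> float:
    """Expected number of 'Cramér primes' in 3..limit."""
    return sum(1 / math.log(n) for n in range(3, limit + 1))

if __name__ == "__main__":
    sample = cramer_sample(100_000, seed=1)
    print(sample[:20])                      # the first few surrogate primes
    print(len(sample), round(expected_count(100_000)))
```

Because the inclusions are independent, the size of the sample is a sum of independent Bernoulli variables, so its fluctuation around the expected count is small relative to the count itself. This is the sort of calculation that independence makes easy.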
Given that the “Cramér primes” and the real primes are similarly distributed in ℕ, what is the benefit of working with the former instead of the latter? As it turns out, it is much easier to study the statistics of distributions with strong joint independence properties, like those of the random model.8 Consequently, we know a lot about the behavior of the surrogate primes.
Some of these facts were independently known or strongly believed to be true of the actual primes, while other claims are considered to have gained support from the fact that they hold in the model (e.g. the Riemann Hypothesis and the Landau conjectures). Yet other hypotheses were originally motivated by observations about the model itself. Among these is the important Cramér conjecture on the sizes of gaps between primes,9 introduced in Cramér’s original paper on the model, which even in recent years “does not seem to be attackable by other methods” (Granville 1995b, 391).
I want to make one observation and two more substantive claims about this case. The observation is that the Cramér model is manifestly not a model of the theory of the primes in the model-theoretic sense. There are many sentences true of the real primes that are false of the Cramér primes (for instance, “exactly one even number is prime”).10 So, model theory is not the proper framework for thinking about this situation, as per the remarks in §1.
The first substantive claim I want to defend is that Cramér’s model (and similar random models) have significantly improved our understanding of the distribution of the primes. The model makes several kinds of epistemic contribution.
To start with, mathematicians take the model seriously because it correctly predicts many known facts about the primes. Kannan Soundararajan notes, for instance, that “the Cramér model makes accurate predictions for the distribution of primes in [very] short intervals” (Soundararajan 2007a, 64). (Note that this is not just a trivial consequence of the fact that the model gets the asymptotic density right; two sequences can have similar long-run behavior without looking alike at small scales.11) More generally, Andrew Granville writes that “the probabilistic model usually gives one a strong indication of the truth” (Granville 1995b, 391). In Tao’s words, “[we] have a number of extremely convincing and well supported models for the primes … the most accurate [of these] in practice are random models” (Tao 2015).
Since Cramér-type models have proven generally reliable in regimes where their predictions can be verified, number theorists view them as useful guides to unknown territory. Their contributions along these lines fall into at least three categories: (1) increasing or decreasing our confidence in conjectures derived from independent sources; (2) motivating entirely new conjectures; and (3) suggesting novel methods of proof.
Let us start with (1). As mentioned above, a number of fundamental statements in number theory are known to hold in the Cramér model but are not yet known to hold for the actual primes. These include the Riemann Hypothesis (RH) and the four Landau conjectures. Each of these hypotheses was suspected to be true before the advent of the Cramér model. But the results from the model served to increase mathematicians’ confidence, in some cases significantly. For instance, van der Poorten claims that “the most compelling” evidence in favor of RH is the fact that it holds in a related random model of the primes, the Hawkins model (van der Poorten 1996, 147).12 Similarly, according to Patterson’s textbook on the zeta function, the validation of RH by random models “represents one of the more reassuring reasons for expecting the Riemann Hypothesis to be true” (Patterson 1988, 75).
The credence calibration provided by Cramér-style models extends much further, as elaborated on by Tao. In the setting of these models, he writes, “many difficult conjectures on the primes reduce to relatively simple calculations … Indeed, the models are so effective at this task that analytic number theory is in the curious position of being able to confidently predict the answer to a large proportion of the open problems in the subject, whilst not possessing a clear way forward to rigorously confirm these answers!” (Tao 2015).
Let us now turn to (2), new conjectures. In addition to bolstering confidence in independently motivated hypotheses, “the probabilistic heuristic, in which independence is assumed, provides a useful means of constructing conjectures” (Montgomery & Vaughan 2007, 57).13 The most famous of these is the Cramér conjecture on prime gaps, mentioned above, which Cramér arrived at by way of the model. Random models have also led to progress in other parts of number theory. One example is the theory of “lucky numbers” (the sequence 1, 3, 7, 9, 13, 15, 21, 25, …, generated by a certain sieve process).14 On the basis of his random model of the primes, Hawkins conjectured (in Hawkins 1957) and was later able to prove (in Hawkins & Briggs 1957) a PNT-type result for lucky numbers, to the effect that their asymptotic density in the natural numbers is also 1 / log n. Random models continue to be deployed on the front lines of research, sometimes in novel ways. For example, Lozano-Robledo (2020) “propose[s] a new probabilistic model for the distribution of ranks of elliptic curves … in the spirit of Cramér’s model for the prime numbers” (2), which is used to generate predictions about the number of elliptic curves of a given rank. In general, then, Cramér-type models “[give] a clearer indication of what results one expects to be true, thus guiding one to fruitful conjectures” (Tao 2015).
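The sieve that generates the lucky numbers can be sketched in a few lines of Python. The description assumed here is the standard one: start with the odd numbers; the second survivor is 3, so delete every third remaining number; the next survivor is 7, so delete every seventh remaining number; and so on. The function name `lucky_numbers` is an illustrative choice.

```python
def lucky_numbers(limit: int) -> list[int]:
    """Return the lucky numbers <= limit using the positional sieve."""
    nums = list(range(1, limit + 1, 2))  # begin with the odd numbers
    i = 1                                # index of the survivor that sets the next step
    while i < len(nums) and nums[i] <= len(nums):
        step = nums[i]
        # Delete every step-th element of the current list (positions counted from 1).
        del nums[step - 1 :: step]
        i += 1
    return nums

if __name__ == "__main__":
    print(lucky_numbers(30))
```

Unlike the sieve of Eratosthenes, this process deletes numbers by their position in the surviving list rather than by divisibility, which is why the lucky numbers share the primes' density without sharing their multiplicative structure.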
Finally, let us consider (3), novel methods of proof. As we have seen, models of the primes are generally used for heuristic purposes rather than as tools for proving theorems. Nevertheless, in the view of the number theorist János Pintz, “probabilistic models can help or could have helped not only to conjecture but also prove results about primes” (Pintz 2007, 362). Pintz goes on to show how a particular result — Maier’s theorem about the number of primes in small intervals — could have been established much earlier using a modified Cramér model.
In addition to these three applications, Tao mentions several other uses of random models: “providing a quick way to scan for possible errors in a mathematical claim (e.g. by finding that the main term is off from what a model predicts …); gauging the relative strength of various assertions (e.g. classifying some results as ‘unsurprising’ [and] others as ‘potential breakthroughs’ …); or setting up heuristic barriers … that one has to resolve before resolving certain key problems” (Tao 2015). In view of these various uses, benefits and insights, I conclude that number theorists have gained significant understanding from Cramér-type models.
One could try to push back against this claim by noting the lack of philosophical consensus around the notion of understanding. In the absence of a widely accepted explicit theory, which criteria are being used to judge cases such as this? And why should we think those criteria are appropriate?
It is true that philosophers disagree about understanding. For instance, some equate understanding a phenomenon with having an explanation of it (Strevens 2013). Others link understanding with the possession of certain abilities (Delarivière & Van Kerkhove 2021), with suitably structured knowledge (Kelp 2015), or with the disposition to generate new knowledge from a minimal core (Wilkenfeld 2019). I do not take sides in this debate here (although I do argue against the explanation account in §4, below). My approach is different, and it has two components. First, I claim that the Cramér model ought to count as a source of understanding on any reasonable view — the same goes for the function field model discussed in the next section. Second, I offer the appraisals of mathematicians themselves, which I take to count at least as strongly as philosophical arguments in this context.15
The Cramér model, as just shown, has strengthened number theorists’ confidence in some important hypotheses and has played a key role in generating others. It has led to a clearer overall picture of the phenomena. It helps mathematicians organize, justify, and check their reasoning. It would be a tendentious and implausible theory that regarded these achievements as insufficient for improving understanding. In particular, the model evidently confers abilities associated with understanding, lends valuable structure to number theorists’ knowledge, and allows much novel information to be spun out from a compact representational core. So, theories in the spirit of the last three mentioned above will count the model as a source of understanding. This seems correct.
What is more, the same conclusion has been reached by number theorists who are intimately familiar with the model and its uses. Granville, for instance, refers to “Cramér’s probabilistic approach [to] understanding the distribution of prime numbers, which underpins most of the heuristic reasoning still used in the subject today” (Granville 1995a, 15). Absent compelling reasons to do otherwise, good methodology recommends taking such judgments at face value.16
I conclude from these considerations that the Cramér model is a source of understanding. (To be precise, it contributes to understanding the distribution of the prime numbers. I take this to be a case of understanding a phenomenon, as opposed to, say, a case of understanding-why.)
The second main claim of this section is that Cramér’s model is quite unrealistic, in the sense discussed in §1 above. That is, rather than a mild idealization that merely abstracts away from inessential details, the model involves an explicit and extensive misrepresentation of its subject matter.
One major distortion is that the Cramér primes are chosen probabilistically, but the actual primes are not in any sense random. Rather, as George Pólya says, whether or not a number is prime “can be decided by the ‘definite rules’ of arithmetic — where and how could chance enter the picture?” (Pólya 1959, 376). Although the assumption of randomness is unrealistic, it is essential to all Cramér-type models. There is no prospect of de-idealizing to remove this assumption without discarding the model framework entirely.
Cramér’s model also fails to capture the important multiplicative structure of the actual primes: for instance, the fact that if p is prime then n ∙ p cannot be prime for any n > 1. (Recall that the Cramér primes are chosen independently, so the selection of one number has no effect on the probability of choosing any other number.) Hence the model generates infinitely many even primes, pairs of consecutive primes, and other absurdities. For example, in the run of the Cramér algorithm given above, 34, 35, and 36 are all chosen. Some modifications of the simple Cramér model reintroduce basic aspects of the actual multiplicative structure of the primes, such as by forbidding even primes greater than 2. But going much further in the direction of realism would be counterproductive, because the joint independence of the surrogate primes is exactly the feature that makes the models more tractable than the real primes.
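The absurdities are easy to exhibit in the illustrative run given earlier in this section. The short Python check below takes the first line of that sample and lists its even members, its pairs of consecutive members, and the members that happen to be genuine primes. The helper `is_prime` and the variable names are illustrative.

```python
import math

# First line of the sample run of Cramér's procedure quoted earlier in this section.
sample = [3, 4, 6, 11, 12, 25, 26, 28, 32, 34, 35, 36,
          43, 57, 66, 68, 80, 83, 87, 93]

def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

evens = [n for n in sample if n % 2 == 0]
consecutive = [(a, b) for a, b in zip(sample, sample[1:]) if b - a == 1]
genuine = [n for n in sample if is_prime(n)]

if __name__ == "__main__":
    print("even members:", evens)
    print("consecutive pairs:", consecutive)
    print("genuine primes:", genuine)
```

Eleven of the twenty surrogate primes are even, five pairs are consecutive, and only four members (3, 11, 43, 83) are actually prime. The sample matches the primes in density, not in membership.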
Thus, “[d]espite its predictive power, Cramér’s model is a vast oversimplification” (Klarreich 2018a, 25) — indeed, a distortion — of its target, the prime number sequence. This is interesting for a number of reasons, most obviously because it shows that mathematicians, like empirical scientists, make serious use of unrealistic models and rely on them to gain understanding. I defer discussion of further philosophical consequences to §4 below, after discussing my second example in the next section.
3. The Function Field Model of the Integers
This section discusses so-called dyadic models of linear structures, in particular the model of the integers as polynomials over a finite field. This is a further example of an unrealistic model in widespread use as a source of mathematical understanding. This second case also bears consequentially on the questions about modeling mentioned at the start of the paper.
Dyadic models in mathematics take on a variety of forms depending on the settings in which they are deployed, which range from differential equations and harmonic analysis to combinatorics and number theory. But a common motivation for using such models is the desire to avoid spillover between scales exhibited by the integers, real numbers, cyclic groups, and other linearly structured sets.
I will focus here on the integers. One manifestation of the spillover phenomenon in this domain is the need to carry digits when adding numbers together. In the sum 28 + 75 = 103, for example, the addition of 8 and 5 in the units place spills over to affect the values in the tens and hundreds places. This kind of interaction between fine and coarse scales can be inconvenient. For instance, when adding many integers together, an accumulation of tiny (“fine-scale”) errors can significantly distort the final (“coarse-scale”) result.
The most common dyadic model of the integers is the ring of polynomials F[t] over a finite field F. This is known as the function field model of the integers.17 The elements of F[t] are polynomials with coefficients from F. The role of positive integers in the model is played by monic polynomials, i.e. polynomials of the form t^n + a_{n−1}t^{n−1} + … + a_1 t + a_0, with leading coefficient 1.
(Some relevant definitions: a field is an algebraic structure with commutative addition and multiplication operations in which every element has an additive inverse and every non-zero element has a multiplicative inverse. The real numbers are a familiar example. A finite field is a field with finitely many elements. In fact, a finite field always has p^n elements, with p prime and n ≥ 1. The simplest examples are the fields 𝔽_p, whose elements are {0, 1, 2, …, p – 1}. Addition in 𝔽_p works like addition modulo p. For example, in the field with seven elements, 𝔽_7, we have 5 + 6 = 11 ≡ 4 (mod 7).)
In the function field model, arithmetical operations on fine-scale terms do not affect the values of coarse-scale terms and vice versa. Compare adding (t^2 + 2t + 5) + (t + 6) in 𝔽_7[t], for example, with the analogous 125 + 16 in ℤ.18 The latter exhibits spillover, because the units-place sum (5 + 6 = 11) contributes 1 to the tens-place result. But not so in the model. In 𝔽_7[t], as noted above, 5 + 6 equals 4, and so the sum (t^2 + 2t + 5) + (t + 6) = t^2 + 3t + 4 is spillover-free.
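The contrast between the two kinds of addition can be made concrete with a small Python sketch. Polynomials and integers are both stored as coefficient (or digit) lists with the lowest-order entry first; the function names are illustrative assumptions.

```python
def add_poly_mod(a: list[int], b: list[int], p: int) -> list[int]:
    """Add two polynomials over the field with p elements.
    Each coefficient is computed on its own: nothing carries."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % p for x, y in zip(a, b)]

def add_digits_with_carry(a: list[int], b: list[int], base: int = 10):
    """Schoolbook addition on digit lists; also returns the carry out of each place."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    digits, carries, carry = [], [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        digits.append(total % base)
        carry = total // base
        carries.append(carry)      # a non-zero entry is spillover into the next place
    if carry:
        digits.append(carry)
    return digits, carries

if __name__ == "__main__":
    # (t^2 + 2t + 5) + (t + 6) over the field with 7 elements: t^2 + 3t + 4
    print(add_poly_mod([5, 2, 1], [6, 1], 7))
    # 125 + 16 = 141, with a carry out of the units place
    print(add_digits_with_carry([5, 2, 1], [6, 1]))
```

In the polynomial case each output coefficient depends only on the two input coefficients of the same degree. In the integer case the list of carries records exactly where fine-scale sums leak into coarser places.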
Another important feature of the function field model is that its surrogate “integers”, being polynomials, have non-trivial derivatives. (The derivative of an ordinary integer is, of course, always 0.) I will say more about why this is useful below.
F[t] might, at first glance, seem like a strange model for the integers. Why would representing numbers as polynomials be appropriate, and how might it be useful? As Michael Rosen writes (in his textbook on the subject, Number Theory in Function Fields):
Early on … it was noticed that ℤ has many properties in common [with F[t]] … Both rings are principal ideal domains … both rings have infinitely many prime elements, and both rings have finitely many units. Thus, one is led to suspect that many results which hold for ℤ have analogues [in F[t]]. This is indeed the case. (Rosen 2002, vii)
In other words, the two structures have importantly similar algebraic properties. Hence, “number theory in F[t]” should (and does) resemble ordinary number theory to a significant degree. The study of function fields as a source of number-theoretic insight goes back at least to Dedekind and Weber’s 1882 paper “Theory of Algebraic Functions of One Variable” (Dedekind & Weber 2012). As the translator John Stillwell notes, “the paper revealed the deep analogy between number fields and function fields — an analogy that continues to benefit both number theory and geometry today” (vii).
Indeed, the function field model has proven fruitful in many ways. Several of its uses resemble those of Cramér’s model of the primes: increasing our confidence in independent hypotheses, suggesting new conjectures, and offering novel methods of proof. The function field model is also unique in at least one important way: it suggests to many mathematicians that there ought to exist a novel kind of object, the “field with one element”, to complete certain aspects of the correspondence between F[t] and ℤ. If efforts to make sense of this notion prove successful, momentous developments in algebra, number theory and geometry are expected to ensue.
Let me fill in some details, starting with the first items mentioned above. A famous open problem in number theory is the abc conjecture of Oesterlé and Masser. The conjecture is roughly as follows: Let a,b,c be relatively prime integers such that a + b = c, and let D denote the product of the distinct prime factors of abc. Then c is significantly bigger than D in only finitely many cases. Many other major conjectures are known to be true conditional on the truth of abc; it would also yield a simple proof of Fermat’s Last Theorem. So abc is of great interest to number theorists.19
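To get a feel for the quantities involved, the following Python sketch computes D (usually called the radical of abc) and searches for coprime triples in which c exceeds it. Two caveats: the names `radical` and `abc_hits` are illustrative, and the test c > D is a simplification of the conjecture, which concerns triples with c > D^(1+ε) for a fixed ε > 0.

```python
from math import gcd

def radical(n: int) -> int:
    """Product of the distinct prime factors of n (for n >= 1)."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:          # whatever remains is a single prime factor
        rad *= n
    return rad

def abc_hits(limit: int) -> list[tuple[int, int, int]]:
    """Coprime triples a < b with a + b = c <= limit and c > radical(a * b * c)."""
    hits = []
    for c in range(3, limit + 1):
        for a in range(1, (c + 1) // 2):
            b = c - a
            if gcd(a, b) == 1 and c > radical(a * b * c):
                hits.append((a, b, c))
    return hits

if __name__ == "__main__":
    print(abc_hits(100))
```

For example, 1 + 8 = 9 is a hit because abc = 72 has distinct prime factors 2 and 3, so D = 6 < 9. Such triples are rare among small numbers, which gives some intuitive sense of what the conjecture claims.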
It is therefore significant that the counterpart of the abc conjecture, known as the Mason–Stothers theorem, is known to hold in the function field model. The Mason–Stothers theorem concerns relatively prime polynomials a(t), b(t), c(t), not all constant and with a + b = c. Where D is the degree of the product of the distinct irreducible factors of a, b, and c, the theorem asserts that D is strictly bigger than the maximum among the degrees of a, b, and c. The Mason–Stothers theorem has an elementary proof in F[t] based on taking derivatives of a, b, and c (Snyder 2000), a trick unavailable for proving abc in the integers. This is an example of the aforementioned usefulness of derivatives in the function field model.
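For reference, the theorem can be written out in its usual form, together with a small worked instance over the field with seven elements. The formulation below is the standard one and is supplied here as an aid; the notation rad(abc) for the product of the distinct irreducible factors is an assumption of this sketch, and the hypothesis on derivatives is the usual refinement of "not all constant" needed in positive characteristic.

```latex
% Mason--Stothers: a, b, c in F[t] pairwise coprime, a + b = c,
% and the derivatives a', b', c' not all zero.
\[
  \max\{\deg a,\ \deg b,\ \deg c\} \;\le\; \deg \operatorname{rad}(abc) - 1 .
\]
% Worked instance in \mathbb{F}_7[t]:
\[
  a = t^2, \qquad b = 2t + 1, \qquad c = (t + 1)^2, \qquad a + b = c .
\]
\[
  \operatorname{rad}(abc) = t\,(2t + 1)\,(t + 1), \qquad
  \deg \operatorname{rad}(abc) = 3, \qquad
  \max\{\deg a,\ \deg b,\ \deg c\} = 2 = 3 - 1 .
\]
```

In this instance the bound holds with equality, so the inequality cannot be improved in general.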
The abc conjecture is just one hypothesis on which the function field model sheds light. As Rudnick notes, a variety of classic problems “which are currently viewed as intractable over the integers, have recently been addressed in the function field context … and the resulting theorems can be used to check existing conjectures over the integers, and to generate new ones” (Rudnick 2014, 443). (Rudnick’s paper discusses five such problems.)
In some cases, results in the function field model can be used to directly prove the corresponding statements in ordinary number theory. One example is the Ax–Kochen theorem, an important result concerning the zeroes of certain polynomials over the p-adic numbers. In the standard proof of Ax–Kochen, the first step is to show that the analogous claim holds in the function field model. Using the transfer principle technique from model theory, it is then possible to import the function field statement back to the original p-adic context, thus proving the theorem.
A final way in which the function field model has advanced number theory is by motivating research around the notional “field with one element” F1. In standard algebra, a field with one element is impossible, because fields, by definition, have an additive identity 0 and a multiplicative identity 1 such that 0 ≠ 1. So, the quest for F1 can be seen as an exercise in conceptual engineering; the task is to build a coherent theory in which an F1-like object exists and has certain desirable properties.20
Impetus for this quest comes from several sources, a major one being the success and promise of the function field model. Work on F1 is technical and describing it in adequate detail would be unduly lengthy, so I will give only a brief sketch here.
The starting point is André Weil’s 1948 proof of the Riemann Hypothesis for function fields (Weil 1948). As Oliver Lorscheid writes, “[t]he analogies between number fields and function fields led to the hope that one can mimic these methods for ℚ and approach the [standard] Riemann hypothesis” (Lorscheid 2018, 94). (Here, ℚ denotes the set of rational numbers.)
Weil’s proof starts with a “global” function field F, that is, a finite field extension of the rational function field 𝔽_p(t). (See footnote 17 for more on 𝔽_p(t).) It turns out that F can be interpreted as the function field of a curve C over the base field 𝔽_p. One can then define a zeta function ζC for this curve and use the tools of algebraic geometry to prove the analogue of the Riemann Hypothesis for ζC. In particular, Weil’s proof counts the number of intersection points of C with a “twisted” version of itself inside the fiber product C ×_{Spec 𝔽_p} C.21
Many mathematicians have hoped, as per Lorscheid’s remarks above, to obtain a proof of the standard Riemann Hypothesis by translating Weil’s proof from the function field model to the integers. The first step in this process is to identify the curve C such that ζC is the ordinary Riemann zeta function that features in RH. This curve turns out to be the spectrum of the integers, Spec ℤ. The remaining task is to specify the base field over which Spec ℤ is to be viewed as a function field. It is at this point that the field with one element enters the scene:
The analogy between number fields and function fields finds a basic limitation with the lack of a ground field. One says that Spec ℤ is … like a (complete) curve; but over which field? In particular, one would dream of having an object like
since Weil’s proof of the Riemann hypothesis for a curve over a finite field makes use of the product of two copies of this curve. (Soulé 1999, 1–2)
The field with one element F1, then, should be an object over which it makes sense to view Spec ℤ as a curve.
Using F1 to prove the Riemann Hypothesis is just one motivation for its study. Broadly speaking, the idea of doing geometry with the integers over F1 “emerged from certain heuristics in combinatorics, number theory and homotopy theory that could not be explained in the framework of Grothendieck’s scheme theory” (Lorscheid 2018, 83). Given that scheme theory has served as the foundation for algebraic geometry for over half a century, a fully realized theory of the field with one element would necessitate a major rethinking of a large body of mathematics.
Although we still lack a definition of F1 that seems likely to yield a proof of RH, an impressive collection of mathematicians have undertaken extensive exploratory theory-building: Jacques Tits (credited with first suggesting F1 in Tits (1957)), Alain Connes, Caterina Consani, Yuri Manin, and Christophe Soulé, to name a few particularly influential contributors. Thas (2016) and Lorscheid (2018) are a recent essay collection and a survey paper, respectively. Even if RH remains elusive, work on F1 goes on, and has already produced a richer picture of the relationship between number theory and geometry.
My final claims in this section will come as no surprise: I want to argue that mathematicians have gained significant understanding from the function field model despite its unrealistic character. The considerations from the last section apply here also in support of this claim. As with the Cramér model, the experts best positioned to assess the merits of the function field model describe it as a source of understanding. Here, for instance, is Tao (recall that the dyadic models are a family that includes the function field model, as noted at the beginning of this section):
In some areas [dyadic constructions] are an oversimplified and overly easy toy model; in other areas they get at the heart of the matter by providing a model in which all irrelevant technicalities are stripped away; and in yet other areas they are a crucial component in the analysis of the non-dyadic case. In all of these cases, though, it seems that the contribution that dyadic models provide in helping us understand the non-dyadic world is immense. (Tao 2008, 68)
Finally, the function field model is an unrealistic representation of the integers. Most obviously and importantly, there is no natural notion of order on the elements of F[t], so the function field model completely lacks the linear structure of ℤ. This difference has far-reaching consequences. For instance, mathematical induction does not make sense in F[t], whereas the availability of induction is often taken to be a characteristic property of the whole numbers. (Cf. Stewart and Tall’s Foundations of Mathematics: “What is a number? … The first step [in finding the answer] was to characterise natural numbers. It turned out that their most important defining feature wasn’t counting, or arithmetic: it was the possibility of proving theorems using mathematical induction” (Stewart & Tall 2015, 159).)
In addition, the metric on F[t] is non-Archimedean, meaning that the familiar triangle inequality d(x,z) ≤ d(x,y) + d(y,z) takes the stronger “ultrametric” form d(x,z) ≤ max{d(x,y), d(y,z)}. This implies, for instance, that all triangles in F[t]^n are isosceles, that every point in the interior of a ball is a center of that ball, and that for any two intersecting balls, one is contained inside the other. These properties are, of course, very much unlike those of ℤ^n equipped with the usual metric.
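The ultrametric behavior just described can be checked numerically. The following is a minimal sketch, not a definitive implementation. It assumes the common choice of absolute value |f| = p^deg(f) on Fp[t] (with |0| = 0) and the illustrative prime p = 3; the text itself does not specify either, and all function names are hypothetical.

```python
import random

P = 3  # assumed small prime, so F = F_3 (an illustrative choice, not from the text)

def trim(coeffs):
    """Reduce coefficients mod P and drop trailing zeros (lists are low-degree first)."""
    coeffs = [c % P for c in coeffs]
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def sub(f, g):
    """Coefficient-wise difference f - g in F_P[t]."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a - b for a, b in zip(f, g)])

def norm(f):
    """Assumed absolute value: |f| = P**deg(f), with |0| = 0."""
    f = trim(f)
    return 0 if not f else P ** (len(f) - 1)

def dist(f, g):
    """Distance induced by the assumed absolute value."""
    return norm(sub(f, g))

def random_poly(max_degree, rng):
    """A random polynomial of degree at most max_degree."""
    return trim([rng.randrange(P) for _ in range(max_degree + 1)])

def check_ultrametric(trials=2000, max_degree=5, seed=0):
    """Test the strong triangle inequality and the isosceles property on random triples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (random_poly(max_degree, rng) for _ in range(3))
        dxy, dyz, dxz = dist(x, y), dist(y, z), dist(x, z)
        if dxz > max(dxy, dyz):
            return False  # strong triangle inequality violated
        sides = sorted([dxy, dyz, dxz])
        if sides[1] != sides[2]:
            return False  # the two longest sides should always be equal
    return True

if __name__ == "__main__":
    print(check_ultrametric())
```

The isosceles check relies on the fact that, in any ultrametric space, the two longest sides of a triangle must have equal length. Running the same test with the usual distance |a − b| on ℤ would fail almost immediately.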
Despite sharing some algebraic features, then, the function field model differs from the integers in fundamental ways. As with the Cramér model, these differences go beyond mere elisions of small or unimportant details. Nor is it possible to recover the key features of the integers by any straightforward process of de-idealization. F[t] is an unrealistic model.
4. Morals: Understanding, Explanation, Counterfactuals
At the beginning of the paper, I listed three pressing questions about unrealistic scientific models. The previous sections have shown that mathematicians engage in modeling, and that some widely used models in pure mathematics are unrealistic. So, these cases are relevant to the questions at issue. What can we learn from them?
I have already argued that the Cramér model of the primes and the function field model of the integers are sources of understanding. So, the first of the three questions — can we gain genuine understanding from unrealistic models? — has an affirmative answer.
The use of unrealistic models in mathematics has gone unappreciated by philosophers, and cases like those I have described are noteworthy for that reason. But this answer is not otherwise surprising. Various kinds of epistemically salutary unrealistic models have been studied extensively in recent years, and the mere fact of their existence no longer seems especially controversial. See, for example, Batterman and Rice (2014), Bokulich (2011), de Regt (2015), Hindriks (2013), Mäki (2009), Morrison (2015), Rice (2016), and the papers in Synthese’s recent collection “What to Make of Highly Unrealistic Models”: Boesch (2019), Knuuttila and Koskinen (2020), Papayannopoulos (2020), and van Eck and Wright (2020).
The remaining questions, though, are very much under debate. The mathematical cases I have described can help to resolve both.
4.1 Explanation and Understanding
First: can unrealistic models explain, and is this how they generate understanding? I take no position here on the first part of the question, but the answer to the second part is “in general, no”. Some unrealistic models help us to understand phenomena for which they offer no explanation.
Cramér’s model is a case in point. As I have argued, the model has improved number theorists’ understanding of the distribution of the primes. But the model does not explain this distribution. There are circumstantial, theoretical, and commonsensical reasons to believe this.
The circumstantial reason is the lack of evidence from mathematical practice. Mathematicians are often quite interested in the explanatory value of theorems, proofs, heuristics, and other tools, and they tend to make their positive appraisals known.22 This is especially true of widely used and frequently discussed pieces of mathematics like the Cramér model. If the model were explanatory, this fact would be of interest to the community of number theorists who have studied, worked with, and instructed their students about it for decades. After consulting what must be a large percentage of the published literature on the model (as well as many less formal online discussions), however, I have encountered no such appraisals, either explicit or oblique.23 If mathematicians consider the Cramér model to be explanatory, they have been uncharacteristically quiet about it for almost a hundred years.
The theoretical reason is that, insofar as we have anything like a general understanding of explanation (in mathematics or elsewhere), Cramér’s model does not seem to fit the bill. A standard idea is that explanations require dependence relations of some sort, either ontic or counterfactual. On the former view, an explanans must cause, ground, or otherwise metaphysically undergird its explanandum. On the latter view, what is required is counterfactual dependence — if the explanans had been different, the explanandum would have been too. (For defenses of these two views, see, for example, Ruben (1990) and Reutlinger (2016), respectively.)
The distribution of the prime numbers evidently does not depend on facts about the Cramér model in either sense of “depend”. Indeed, it would be absurd to suggest that a randomly generated subset of the natural numbers might produce or give rise to any properties of the actual primes. (We regrettably lack a well-developed theory of the metaphysics of mathematical objects, but this ought to be beyond doubt if anything is.)
It is also highly implausible that the distribution of the primes counterfactually depends on the properties of the model. Again, philosophy has yet to reach a consensus about how to deal with mathematical counterpossibles (see Baron et al. (2020) for a start). But the prevailing idea, following the Lewis–Stalnaker semantics for ordinary counterfactuals, is to somehow identify the (impossible) worlds closest to actuality where the antecedent is true and check whether the consequent also holds in those worlds.
In the case at issue, we are supposed to imagine that the Cramér model is different in some way — say, that the Goldbach conjecture is false in the model, instead of true. At this world, some even natural number greater than 2 is no longer the sum of two Cramér primes. Is it also the case here that some even n > 2 is not the sum of two ordinary primes? I see no reason to think so. The closest worlds at which Goldbach fails in the Cramér model are worlds at which it fails just barely — say, where exactly one even n > 2 is not the sum of two Cramér primes. And the fact that the model falsifies Goldbach by the slimmest of margins seems to entail nothing at all about whether Goldbach holds in ℕ. To whatever extent (if any) the properties of the natural numbers counterfactually depend on the properties of the Cramér primes, the dependence is surely not so extraordinarily sensitive.
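To make the scenario concrete, here is a rough sketch of what it means for Goldbach to hold or fail “in the model”. It assumes a simple version of the Cramér construction in which each integer n ≥ 3 is chosen independently with probability 1/log n; the starting point, the seed, and the function names are illustrative assumptions rather than details taken from the literature.

```python
import math
import random

def cramer_primes(limit, seed=0):
    """One random sample of surrogate 'primes' up to limit.

    Assumption: each n >= 3 is included independently with probability 1/log(n).
    """
    rng = random.Random(seed)
    return [n for n in range(3, limit + 1) if rng.random() < 1 / math.log(n)]

def goldbach_failures(sample, limit):
    """Even numbers in [4, limit] that are not a sum of two members of sample."""
    members = set(sample)
    return [
        n
        for n in range(4, limit + 1, 2)
        if not any((n - a) in members for a in members if a <= n // 2)
    ]

if __name__ == "__main__":
    sample = cramer_primes(2000)
    failures = goldbach_failures(sample, 2000)
    print(len(sample), "surrogate primes;", len(failures), "even numbers with no representation")
```

Different seeds give different surrogate sequences, and nothing in such a sample constrains the actual primes: a single even number lacking a representation in one sample is a fact about that sample only.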
There are, of course, other proposals regarding the nature of explanation. A final one worth mentioning is Kitcher’s unificationist theory (1989), which was explicitly intended to apply to explanations in pure mathematics. On Kitcher’s approach, a proof counts as explanatory just in case it instantiates an argument pattern from the “explanatory store”, that is, the set of argument patterns that most efficiently systematizes our knowledge in a given domain. Kitcher tends to argue that a given proof P is explanatory by comparing it to another proof of the same result — one which generalizes less readily or less widely than P or which is more mired in the details of a special case than P. (See §3.2 of Kitcher (1989) for these mathematical examples.)
In a straightforward sense, the models I have discussed are not even potentially explanatory on Kitcher’s view, because they do not usually let us directly prove things about their target systems. The kind of assurance they provide is heuristic and analogical rather than deductive.24 Even setting this issue aside, the model-based approach is often decidedly closer to the purpose-built, single-use end of the inferential spectrum — much like the forms of reasoning Kitcher dismisses as unexplanatory.
For instance, the Cramér model is good at providing insights about the distribution of the primes. But it is not derived from any grand general theory, and it suggests no unifying perspective that is expected to help with other kinds of problems. By contrast, if and when we manage to prove claims like the Riemann Hypothesis, these proofs are expected to flow from a highly fruitful new framework with consequences for many areas of mathematics. The argument patterns obtained from this theory will be far stronger candidates for inclusion in the explanatory store than any inferences associated with Cramér-style models. So, unificationism provides no reason to judge the models explanatory either.
I lack space to review other accounts of explanation. But the situation seems much the same on any plausible, mathematically applicable theory.25
There is perhaps a second theoretical reason to deny that the Cramér model is explanatory. As noted in §2, number theorists use a large family of random models to study the distribution of the primes, Cramér’s being the original. Many of these models make incompatible assumptions. Some allow infinitely many even primes, while others allow none besides 2. Some generate the surrogate prime sequence by a completely different random procedure than Cramér’s. And so on. The models in this menagerie validate many of the same basic claims (the Landau conjectures and the Riemann Hypothesis, for instance), but they have little else in common apart from their use of various random methods. Which of these models are explanatory, if any are? Singling out one in particular would be indefensible: no individual model is uniquely worthy of the title. But declaring that all are explanatory is equally problematic. Given that the models have so little in common, in virtue of what shared feature could they count as giving the same explanation?
Finally, there is a commonsensical reason to deny that Cramér’s model explains. To explain a phenomenon is to give a reason why it occurs or obtains. And Cramér’s model does not do this. In response to the question “Why is it the case that Goldbach’s conjecture holds?”, for instance, one would not accept as the reason “Because it holds in random models of the primes”. This fact about the model might be (and probably is) a good reason to believe that Goldbach’s conjecture is true, but it does not tell us why the conjecture is true. Someone who knew the relevant facts about the model would not be confused or misguided for continuing to seek an explanation elsewhere. Such, at least, is my intuition.
Many of the same points apply to the function field model, and indeed we have further evidence from mathematical practice in this case. Here is Lorscheid:
For a not yet systematically understood reason, many arithmetic laws have (conjectural) analogues for function fields and number fields. While in the function field case, these laws often have a conceptual explanation by means of a geometric interpretation, methods from algebraic geometry break down in the number field case. The mathematical area of F1-geometry can be understood as a program to develop a geometric language that allows us to transfer the geometric methods from function fields to number fields. (Lorscheid 2014, 408–409)
Here, Lorscheid is contrasting the situation in the function field model — where results like the Riemann Hypothesis are not only known but have been successfully explained via algebraic geometry — with the situation in ordinary number theory, where these results are believed to hold but where no corresponding explanation is yet available. Number theorists would like to find a geometric explanation for the Riemann Hypothesis, hence their interest in studying the field with one element. But Lorscheid is clear that the function field model, at least as it currently stands, does not do this explanatory work. The model helps us to understand features of the natural numbers without explaining those features.
This conclusion is significant because it challenges a popular view in philosophy of science according to which understanding requires (or perhaps just is) the possession of an explanation. Defenses of such a view include de Regt (2009), Hannon (2019), Khalifa (2012), Strevens (2013), and Trout (2007). Strevens provides a clear statement of exactly the claim that I deny: “An individual has scientific understanding of a phenomenon just in case they grasp a correct scientific explanation of that phenomenon” (Strevens 2013, 510).
What is the right way to think about understanding — and, in particular, the sort of understanding gained from unrealistic models — if not in terms of explanation? I will say more about this below.
4.2 Counterfactual Knowledge and Understanding
Second: are unrealistic models useful primarily because they impart counterfactual knowledge about their target systems? Is this how such models contribute to understanding? Again, the answer is “in general, no”. The epistemic benefits gained from some unrealistic models have little to do with counterfactual knowledge.
Before I argue against this view, I want to briefly explain why it has seemed plausible to many authors, as it may not be as intuitive as the purported link between explanation and understanding. The Schelling model of housing segregation mentioned in §1 provides a good illustration. The model represents a city as a square grid, with each grid cell depicting a housing unit. A housing unit can be either empty or occupied by a single agent. Agents are split into two disjoint groups, say red and blue. Each agent prefers that at least a certain fraction R of the adjacent occupied squares be occupied by members of their own group; if the fraction falls below R at any time, the agent moves to an empty unit where their preference is satisfied. Schelling showed that the red and blue populations eventually segregate themselves even for relatively small values of R (roughly R ≥ 1/3 if the two groups are equally sized).
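The dynamics just described can be sketched in a few lines. This is one possible rendering, not Schelling's own procedure. It assumes that “adjacent” means the eight surrounding cells, that an agent with no occupied neighbors counts as satisfied, and that an unhappy agent with no acceptable destination stays put; the function names and parameters are likewise hypothetical.

```python
import random

EMPTY, RED, BLUE = 0, 1, 2

def make_grid(n, empty_frac, rng):
    """Random n-by-n city with roughly equal red and blue populations."""
    total = n * n
    n_empty = int(empty_frac * total)
    n_red = (total - n_empty) // 2
    n_blue = total - n_empty - n_red
    cells = [EMPTY] * n_empty + [RED] * n_red + [BLUE] * n_blue
    rng.shuffle(cells)
    return [cells[i * n:(i + 1) * n] for i in range(n)]

def neighbors(grid, r, c):
    """Contents of the (up to eight) cells surrounding (r, c)."""
    n = len(grid)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n:
                yield grid[r + dr][c + dc]

def same_group_ratio(grid, r, c, group):
    """Fraction of occupied neighbors of (r, c) belonging to group."""
    occupied = [g for g in neighbors(grid, r, c) if g != EMPTY]
    if not occupied:
        return 1.0  # assumption: an agent with no neighbors is satisfied
    return sum(g == group for g in occupied) / len(occupied)

def step(grid, R, rng):
    """Move each unhappy agent to a random acceptable empty cell; return the number moved."""
    n = len(grid)
    agents = [(r, c) for r in range(n) for c in range(n) if grid[r][c] != EMPTY]
    rng.shuffle(agents)
    moved = 0
    for r, c in agents:
        group = grid[r][c]
        if same_group_ratio(grid, r, c, group) >= R:
            continue
        empties = [(i, j) for i in range(n) for j in range(n) if grid[i][j] == EMPTY]
        rng.shuffle(empties)
        grid[r][c] = EMPTY  # vacate first so the agent is not counted as its own neighbor
        for i, j in empties:
            if same_group_ratio(grid, i, j, group) >= R:
                grid[i][j] = group
                moved += 1
                break
        else:
            grid[r][c] = group  # no acceptable cell: the agent stays where it was
    return moved

def mean_similarity(grid):
    """Average same-group neighbor fraction over all agents (a crude segregation index)."""
    n = len(grid)
    ratios = [same_group_ratio(grid, r, c, grid[r][c])
              for r in range(n) for c in range(n) if grid[r][c] != EMPTY]
    return sum(ratios) / len(ratios)

if __name__ == "__main__":
    rng = random.Random(0)
    grid = make_grid(20, 0.1, rng)
    print("before:", round(mean_similarity(grid), 2))
    for _ in range(50):
        if step(grid, 1 / 3, rng) == 0:
            break
    print("after:", round(mean_similarity(grid), 2))
```

Comparing the similarity index before and after the run, for several values of R, gives a rough sense of how strongly a given preference threshold drives segregation in this toy setting.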
It is obvious that the Schelling model does not (and is not meant to) realistically represent the factors actually responsible for segregation. The model considers only one variable that is potentially relevant to housing choice, and everything about its treatment of that variable is highly idealized. Yet the Schelling model is often taken to have improved our understanding of segregation. How so? Perhaps by imparting counterfactual knowledge. While the model teaches us little about the causes of segregation in real cities, it plausibly does show how things would (and would not) change if the world were different. For instance, one might infer from the model that segregation is a robust phenomenon, likely to persist even in the absence of social and economic inequalities. And one could use this counterfactual knowledge to evaluate proposed interventions. A diversity course that convinced people to be comfortable living in at least 50% same-race neighborhoods, for example, would not be a promising remedy for segregation.
This looks like a reasonable diagnosis of the usefulness of the Schelling model. Is it possible that all unrealistic models serve our epistemic purposes in the same way? Many philosophers have thought so. On this view — defended, for instance, in Bokulich (2011), Grimm (2011), Hindriks (2013), Lipton (2009), Rice (2016), Levy (2020), and Saatsi (2020) — unrealistic models contribute to understanding mainly by providing us with counterfactual knowledge about their target systems. Indeed, some of these authors identify understanding in general with counterfactual knowledge. Levy, for example, writes that “understanding something is having a representation of it that allows one to draw inferences about its counterfactual behavior ….[O]ne understands [a target phenomenon] T when one can use one’s representation of T to say what would happen to the target if this or that change were made to it” (Levy 2020, 281–2).
Unfortunately, this appealing view does not fit the facts. The function field model of the integers, as we have seen, is an unrealistic model that confers understanding. But it does not do so by imparting counterfactual knowledge. When number theorists work with the function field model, they are not seeking information about a scenario in which the properties of the integers have been somehow altered. Their interest is not in what mathematics would be like if ℤ lacked its linear structure, numbers had non-trivial derivatives, and additive spillover did not occur. Rather, what they seek (and have gained) is a suite of evidence and heuristics about the expected properties of ℤ, ideas about how the relevant claims might or might not be proved, and clues about the geometric structure undergirding the ℤ/F[t] analogy. These are insights about the actual integers that contain no hint of counterfactual content.
Let me be clear about what I am not claiming here. First, it is not my view that mathematical counterpossibles are inherently defective in some way. I see no problem with admitting that some such statements are meaningful and have substantive truth conditions — for example, “If 6 were prime, then it would not be divisible by 3” seems true, whereas “If 6 were prime, then it would not be divisible by 1” seems false. Second, it is not my view that counterpossibles are never of interest to mathematicians. Claims like “if the traveling salesman problem were solvable in polynomial time, then the clique problem would be too” are common and perfectly reasonable (cf. Jenny 2018).26 Third, it is not my view that the function field model yields no counterfactual knowledge whatsoever. One can infer from the model, say, that if the integers had well-behaved derivatives, then the abc conjecture would be easy to prove. But this sort of fact is just an uninteresting instance of an obvious general principle: for any two things A and B, if A had some of B’s properties, then some B-ish things would be true of A. Truths of this form are not enlightening unless we have some reason to care about and take seriously the counterfactual scenario in question. And we do not in this case: mathematicians simply do not entertain the prospect of the integers acquiring the properties of F[t].
In summary, then, I do not reject the intelligibility or potential usefulness of mathematical counterpossibles in general.27 My claim is just that the function field model does not improve understanding by delivering knowledge of this sort. Its primary epistemic contributions take the form of information about the integers’ actual properties.
One might think that this conclusion leaves us with a puzzle. If unrealistic models in science often seem to improve understanding by conferring counterfactual knowledge — as is plausibly the case with Schelling’s model, for example — why is this not generally true of unrealistic models in mathematics? To pose the question another way, why have philosophers mistakenly identified one possible element or symptom of understanding with a general rule about gaining understanding from unrealistic models?
One reason, I think, is that philosophers of science in our Woodwardian era have focused overmuch on control, manipulation, interventions, and difference-makers, factors closely associated with counterfactuals and often analyzed within a counterfactual framework. These factors are important in empirical science and gaining knowledge about them can indeed contribute to understanding. But they are not the only game in town. Mathematicians, for instance, care a great deal about understanding, but have little use for manipulationist machinery (because they are not in a position to perform interventions and observe the results28). Instead, they favor models that offer other kinds of goods: confirmation, predictions, heuristics, analogies, proof ideas, plausibility checks, and hints at deeper structure. These sources of understanding exist in empirical science too, of course. And they are no less valuable there, even if philosophers are prone to neglect them. So, we do not need to accept a disjunctive picture, according to which unrealistic models in science and mathematics contribute to understanding in fundamentally different ways. Rather, they do so in mostly similar ways, except that counterfactual knowledge associated with control and manipulability plays a larger role in (some parts of) empirical science.
Having considered and rejected two accounts of model-based understanding, it is natural to ask what positive picture suggests itself in their stead. This is not the place to mount a defense of a novel theory (or to campaign at length on behalf of an existing one). Broadly speaking, however, accounts that link understanding to cognitive systematization look more promising than those that focus on possession of a specific type of knowledge or ability. The understanding gained from unrealistic models often has the character of a broad-spectrum improvement to a variety of epistemic states (belief, credence, expectation, attention, inquiry) and cognitive functions (reasoning, intuition, similarity-detection, problem-solving). Trying to single out any one of these contributions as necessary or sufficient for understanding strikes me as an unpromising project. But a theory that takes the whole package as primary ought to do better. I think, for example, that the account in Kelp (2015) is in the right ballpark, although its exclusive focus on knowledge and explanation/justification relations may be a weakness. I lack the space to fully engage with Kelp’s or other views here, however.
4.3 Mathematics as a Special Science
One final conclusion suggested by this discussion concerns the relationship between pure mathematics and philosophy of science. This relationship is rather tenuous at present. Even if most philosophers of science accept some sort of Quinean continuity thesis in principle, those who pay serious attention to the content and practice of contemporary mathematics are a rare breed in practice. Why is this? Probably at least in part because few philosophers of science are aware of these developments. And why are they unaware? Even if they would hesitate to say so, I suspect that a widespread sense persists that pure mathematics and empirical science are fundamentally dissimilar enterprises — concerned with different goals, about different kinds of things, and using different methods of inquiry. On this picture, there is just not much reason for the two disciplines to intersect (except, every so often, in the context of wondering about unreasonable effectiveness and indispensability).
Perhaps this is beginning to change, ever so slightly. The recent surge of interest in non-causal explanation has led more philosophers to recognize the importance of explanation in pure mathematics and to propose, partly on the basis of mathematical examples, theories of explanation that apply to mathematics and empirical science alike (see, for example, Lange (2014) and Pincock (2015a; 2015b)).
This is an encouraging development, but the two disciplines have more to say to one another. Mathematics and empirical science are much more alike than is often supposed. Researchers in both domains share the same basic epistemic desiderata (knowledge, understanding, explanation, evidence acquisition, theory construction), they pursue these goals using many of the same techniques (including modeling and other non-deductive strategies), and, in doing so, they have to weigh similar values and confront similar problems. There are, of course, real differences between mathematics and empirical science — but they are not obviously more significant than the differences between psychology and theoretical physics, say, or economics and geology. The natural sciences already admit a wide variety of inferential methods, degrees of certainty and apriority, and entities of more or less exotic metaphysical status. While mathematics occupies a distinctive place on some of these scales, there is little reason to view it as completely sui generis.
Indeed, I believe that it is both appropriate and enlightening to view pure mathematics as a special science on par with the rest. Doing so raises questions, presents problems, and suggests solutions that would not have been obvious otherwise.
5. Conclusion
This paper has argued for three main claims. First: that unrealistic models have important uses in pure mathematics and their epistemic benefits include improving our understanding of their target phenomena. Second: that the understanding gained from these models (and hence from unrealistic models in general) need not flow from explanations of the target phenomena. Third: that it need not flow from counterfactual knowledge either.
Future work on the philosophy of modeling could benefit from further examination of mathematical cases. Consider the metaphysics of models. One popular view holds that models are artifacts (Thomasson 2020; Thomson-Jones 2020); a related view holds that models in general, or perhaps unrealistic models in particular, are fictional entities of some sort (Bokulich 2011; Frigg 2010; Salis 2021). It would seem to follow from such views that the polynomial ring F[t], for example, is an artifact or a fiction. But F[t] is also a piece of ordinary mathematics, whose ontological status is presumably the same as that of other mathematical objects. Artifactualism about models therefore seems to imply artifactualism about mathematics in general. Depending on one’s metaphysical commitments, this may be either a welcome consequence or a damning reductio of artifactualist views.
Some other important questions are broadly epistemological. For example, there is a tradition of viewing (some) model-based inference as a kind of analogical reasoning (Bartha 2009; Hesse 1963), and mathematical models may offer some unique data. Mathematics should also join the conversation about thought experiments and imagination in science (Brown 2010; Murphy 2022) and the relationship between these activities and modeling practices (Arfini 2006).
These are just a few of the ways in which philosophy stands to benefit from taking mathematics seriously. The kingdom will prosper when the sequestered queen of the sciences is allowed to return to court.29
Notes
- Other names in the literature for roughly this type of model include “fictional model”, “essentially idealized model”, “pervasively distorted model”, and so on.
- Named for Edmund Landau’s 1912 address to the International Congress of Mathematicians, which characterized them as “unattackable” by the methods of contemporary number theory.
- Of course, this is not to say that the prime sequence is completely random (in the sense that there is no deterministic procedure for generating its terms) or that the sequence has no meaningful structure at all. Neither of these things is true. The point is that the sequence’s structure is, in certain respects, elusive and hard to study.
- Goldbach’s conjecture dates to a letter from Goldbach to Euler in 1742 and is one of the oldest unsolved problems in mathematics. The Twin Primes conjecture is first known to have been explicitly stated by de Polignac in 1849, but the idea was probably considered much earlier. Legendre’s conjecture is from his Essai sur la Théorie des Nombres, published in 1797–98.
- To see this, note that there are about n / log n prime numbers among the first n integers. So the probability that any given number between 1 and n is prime is (n / log n) / n = 1 / log n.
- These are excerpts from a sequence generated by Mathematica code written by Glenn Harris. Many thanks to Glenn for the code.
- A bit more precisely and in terms of a concrete example: the claim that Goldbach’s conjecture holds in Cramér’s model means that, in an arbitrary sequence of Cramér primes, with probability 1, the number of ways to express an even integer n as a sum of two primes grows large as n → ∞. (We cannot yet even prove that this limit is bigger than zero in the case of the actual primes.)
- Recall that each choice of a Cramér prime is made independently of all the other choices. Things obviously do not work this way in the real world; if p is an odd prime, for instance, then p + 1 and 2p are necessarily composite.
- Cramér’s conjecture is the statement that, for p_n the nth prime, the difference p_{n+1} − p_n is asymptotically bounded by (log p_n)^2, that is, p_{n+1} − p_n = O((log p_n)^2). Hence, the gaps between consecutive primes are consistently small in the long run.
- I am assuming that sentences like these are given their obvious interpretations in the model.
- As Colin McLarty put the point in correspondence: “A similar fallacy, fed by motivated thinking, is important today when climate change deniers say things like ‘They can’t even be sure if it will rain next Sunday! How can they make predictions about 20 years from now?’” In general, the moral is that the ability to predict large-scale trends over long intervals with a high degree of accuracy does not imply the ability to do the same with fine-grained details over short intervals.
- The Hawkins model generates a set of surrogate primes by a random sieve technique. For an accessible introduction to the Hawkins model, including a comparison with the Cramér model, see Lorch & Ökten (2007).
- “The probabilistic heuristic, in which independence is assumed” refers, of course, to the method of constructing random models by independently choosing surrogate primes.
- The lucky numbers are generated in the following way. First list all the natural numbers starting with 1. Then cross out every second number, leaving the sequence 1, 3, 5, 7, 9, 11, …. Next cross out every third remaining number, leaving 1, 3, 7, 9, 13, 15, …. At each subsequent step, cross out every nth remaining number, where n is the smallest surviving number that has not yet been used as a step size.
- This is not to suggest that philosophers should mechanically rubber-stamp any opinion a mathematician expresses in print. Experts in every field make mistakes and throwaway comments; taking mathematical practice seriously also means exercising discretion in choosing, reading, and interpreting potential sources of evidence. But it is true, nevertheless, that the relevant specialists are better positioned than most philosophers to judge what qualifies as a source of mathematical understanding. When the best-informed and most thoughtful experts make such judgments deliberately, repeatedly, and for coherent reasons, taking their word for it is the appropriate default.
- One such reason might be that the experts disagree among themselves. In such cases, philosophy can play a useful role by investigating the source and nature of the disagreement. For a mathematical case study, see D’Alessandro (2020). See also the previous footnote for further elaboration of this epistemological stance.
- The terminology here is standard but a bit confusing. The ring of polynomials F[t] is not itself a function field, as it is not a field at all. The model’s namesake is rather the rational function field F(t), consisting of polynomials with coefficients in F and their multiplicative inverses. Because it contains fractions, the function field F(t) is most naturally viewed as a model for the rational numbers ℚ. But the name “function field model” (or “function field analogy”) is generally applied to all models in this family, including F[t] as a model for ℤ.
- This example should not be taken too literally. The function field model generally deals with an arbitrary finite field F rather than a specific one like 𝔽₇, and it generally does not assign specific polynomials to serve as the representatives of specific integers. The point is just to compare spillover in ℤ with its absence in function fields.
- Notoriously, the Japanese mathematician Shinichi Mochizuki has claimed to have proven abc since 2012, but the consensus among number theorists is that the proof is unconvincing and the conjecture remains open. See Dutilh Novaes (2013) for a philosophical analysis and Klarreich (2018b) for an account of recent developments, including Peter Scholze and Jakob Stix’s engagement with Mochizuki and their claim that his proof contains an unfixable gap.
- See Tanswell (2018) for discussion of conceptual engineering in mathematics.
- Spec R, the spectrum of a commutative ring R, is the set of all prime ideals of R, often equipped with the Zariski topology.
- For overviews of the role of explanation in mathematics, see D’Alessandro (2019) or Mancosu (2018).
- A reasonably large sample of this literature is cited in §2 above.
- While Kitcher is a self-avowed “deductive chauvinist”, he has a story to tell about how seemingly statistical or probabilistic explanations can be accommodated within his framework (see Kitcher 1989, 448–459). But inferences from the properties of models to the expected properties of their target systems do not appear to be statistical or probabilistic arguments, and it seems unlikely that Kitcher would consider such inferences potentially explanatory. (My thanks to an anonymous referee for prompting me to discuss Kitcher’s view.)
- An account that it might seem strange not to mention is Marc Lange’s theory of mathematical explanation, defended in Lange (2014). Lange’s theory, like Kitcher’s, is about explanatory proofs, which the models under discussion generally do not provide. I find it even less clear whether or how Lange’s view might apply to model-based inference.
- These are counterfactuals assuming that P ≠ NP.
- For more on the uses of counterpossibles in science, see Tan (2019) and McLoone (2020), as well as Jenny (2018), cited above.
- And, more broadly, of course, because causal reasoning is generally inapplicable in pure mathematics.
- I am grateful to everyone who read or heard versions of this paper. Many thanks to Alan Baker, Nicola Bonatti, Kenny Easwaran, Joachim Frans, Silvia Jonas, Colin McLarty, Alex Paseau, Fenner Tanswell, Karim Zahidi, and Benjamin Zayton for their comments and encouragement. All of you are lovely. Likewise to audiences at the Munich Center for Mathematical Philosophy, Ghent University, the University of Turin, the 4th Scientific Understanding and Representation meeting at Fordham University, the 2022 German Society for Philosophy of Science conference, and the participants in John Burgess and Silvia De Toffoli’s seminar on mathematical rigor at Princeton University in Fall 2021. Each was a pleasure. Two anonymous referees for this journal provided much-appreciated suggestions. Finally, special thanks to Lauren Woomer for patiently listening to me give this talk on Zoom many times while we were stuck inside during the pandemic.
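The notes above on the 1 / log n density of the primes, and on the fact that each Cramér prime is chosen independently of the others, suggest a direct simulation. What follows is a minimal Python sketch rather than code from the paper: the function name `cramer_primes`, the `seed` parameter, and the decision to start at n = 3 (so that 1 / log n is a genuine probability) are assumptions made for illustration.

```python
import math
import random


def cramer_primes(limit, seed=None):
    """Sample one sequence of surrogate ("Cramér") primes up to limit.

    Assumption for this sketch: every integer n >= 3 is included
    independently with probability 1 / log n (natural logarithm).
    """
    rng = random.Random(seed)  # private generator, so runs are reproducible
    chosen = []
    for n in range(3, limit + 1):
        # Independent coin flip for each n; earlier outcomes are never consulted.
        if rng.random() < 1.0 / math.log(n):
            chosen.append(n)
    return chosen


# Compare the size of one sample with the n / log n estimate.
sample = cramer_primes(100_000, seed=0)
print(len(sample), round(100_000 / math.log(100_000)))
```

Because the choices are independent, a sample can contain runs of consecutive integers, which the actual primes beyond 2 and 3 never do. This is the kind of small-scale unrealism the notes describe, and it coexists with a realistic overall density.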
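The lucky-number sieve described in the notes above is deterministic and short enough to write out in full. This is an illustrative Python sketch under the usual reading of the procedure, in which each pass takes the next surviving number as its step size. The function name `lucky_numbers` is hypothetical.

```python
def lucky_numbers(limit):
    """Return the lucky numbers that are <= limit."""
    # The first pass crosses out every second number, leaving the odd numbers.
    survivors = list(range(1, limit + 1, 2))
    index = 1  # position of the survivor that sets the next step size (3 first)
    while index < len(survivors):
        step = survivors[index]
        if step > len(survivors):
            break  # there is no step-th survivor, so later passes remove nothing
        # Cross out every step-th survivor, counting positions from 1.
        del survivors[step - 1::step]
        index += 1
    return survivors


print(lucky_numbers(20))  # [1, 3, 7, 9, 13, 15]
```

Unlike the primes, the lucky numbers are defined by position in a list rather than by divisibility, which is why they serve as a comparison case for how much of the primes' behavior depends on sieving alone.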
References
Arfini, Selene. 2006. “Thought experiments as model-based abductions.” In Lorenzo Magnani and Claudia Casadio (eds.), Model Based Reasoning in Science and Technology: Logical, Epistemological, and Cognitive Issues. New York: Springer-Verlag, 437–452. DOI: https://doi.org/10.1007/978-3-319-38983-7_24.
Baron, Sam, Mark Colyvan, and David Ripley. 2020. “A counterfactual approach to explanation in mathematics.” Philosophia Mathematica 28, 1–34. DOI: https://doi.org/10.1093/philmat/nkz023.
Bartha, Paul. 2009. By Parallel Reasoning: The Construction and Evaluation of Analogical Arguments. New York: Oxford University Press.
Batterman, Robert, and Collin Rice. 2014. “Minimal model explanations.” Philosophy of Science 81, 349–376. DOI: https://doi.org/10.1086/676677.
Boesch, Brandon. 2019. “Scientific representation and dissimilarity.” Synthese 198, 5495–5513. DOI: https://doi.org/10.1007/s11229-019-02417-0.
Bokulich, Alisa. 2011. “How scientific models can explain.” Synthese 180, 33–45. DOI: https://doi.org/10.1007/s11229-009-9565-1.
Brown, James Robert. 2010. The Laboratory of the Mind: Thought Experiments in the Natural Sciences (2nd edition). New York: Routledge.
Button, Tim, and Sean Walsh. 2018. Philosophy and Model Theory. New York: Oxford University Press.
Cramér, Harald. 1936. “On the order of magnitude of the difference between consecutive prime numbers.” Acta Arithmetica 2, 23–46.
D’Alessandro, William. 2019. “Explanation in mathematics: Proofs and practice.” Philosophy Compass. DOI: https://doi.org/10.1111/phc3.12629.
D’Alessandro, William. 2020. “Proving quadratic reciprocity: Explanation, disagreement, transparency and depth.” Synthese, DOI: https://doi.org/10.1007/s11229-020-02591-6.
Dedekind, Richard, and Heinrich Weber. 2012. Theory of Algebraic Functions of One Variable. Trans. by John Stillwell. Providence, Rhode Island: American Mathematical Society.
Delarivière, Sven, and Bart Van Kerkhove. 2021. “The mark of understanding: In defense of an ability account.” Axiomathes 31, 619–648. DOI: https://doi.org/10.1007/s10516-020-09529-0.
de Regt, Henk. 2009. “The epistemic value of understanding.” Philosophy of Science 76, 585–597. DOI: https://doi.org/10.1086/605795.
de Regt, Henk. 2015. “Scientific understanding: Truth or dare?” Synthese 192, 3781–3797. DOI: https://doi.org/10.1007/s11229-014-0538-7.
Dutilh Novaes, Catarina. 2013, May 14. “What’s wrong with Mochizuki’s ‘proof’ of the ABC conjecture?” M-Phi. Online at <https://m-phi.blogspot.com/2013/05/whats-wrong-with-mochizukis-proof-of.html>.
Frigg, Roman. 2010. “Models and fiction.” Synthese 172, 251–268. DOI: https://doi.org/10.1007/s11229-009-9505-0.
Granville, Andrew. 1995. “Harald Cramér and the distribution of prime numbers.” Scandinavian Actuarial Journal 1, 12–28. DOI: https://doi.org/10.1080/03461238.1995.10413946.
Granville, Andrew. 1995. “Unexpected irregularities in the distribution of prime numbers.” In S.D. Chatterji (ed.), Proceedings of the International Congress of Mathematicians. Basel: Birkhäuser.
Grimm, Stephen. 2011. “Understanding.” In Duncan Pritchard and Sven Bernecker (eds.), The Routledge Companion to Epistemology, New York: Routledge, 84–94.
Hannon, Michael. 2019. What’s the Point of Knowledge? A Function-First Epistemology. New York: Oxford University Press.
Hawkins, David. 1957. “The random sieve.” Mathematics Magazine 31, 1–3. DOI: https://doi.org/10.2307/3029322.
Hawkins, David, and W. E. Briggs. 1957. “The lucky number theorem.” Mathematics Magazine 31, 81–84. DOI: https://doi.org/10.2307/3029213.
Hesse, Mary. 1963. Models and Analogies in Science. South Bend, Indiana: University of Notre Dame Press.
Hindriks, Frank. 2013. “Explanation, understanding, and unrealistic models.” Studies in History and Philosophy of Science 44, 523–531. DOI: https://doi.org/10.1016/j.shpsa.2012.12.004.
Ingham, A.E. 1932. The Distribution of Prime Numbers. Cambridge: Cambridge University Press.
Jenny, Matthias. 2018. “Counterpossibles in science: The case of relative computability.” Noûs 52, 530–560. DOI: https://doi.org/10.1111/nous.12177.
Kelp, Christoph. 2015. “Understanding phenomena.” Synthese 192, 3799–3816. DOI: https://doi.org/10.1007/s11229-014-0616-x.
Khalifa, Kareem. 2012. “Inaugurating understanding or repackaging explanation?” Philosophy of Science 79, 15–37. DOI: https://doi.org/10.1086/663235.
Kitcher, Philip. 1989. “Explanatory unification and the causal structure of the world.” In Philip Kitcher and Wesley Salmon (eds.), Scientific Explanation (Minnesota Studies in the Philosophy of Science, Volume XIII), Minneapolis: University of Minnesota Press, 410–505.
Klarreich, Erica. 2018a. “Mathematicians discover prime conspiracy.” In Thomas Lin (ed.), The Prime Number Conspiracy, Cambridge, MA: MIT Press.
Klarreich, Erica. 2018b, September 20. “Titans of mathematics clash over epic proof of ABC conjecture.” Quanta Magazine. Online at <https://www.quantamagazine.org/titans-of-mathematics-clash-over-epic-proof-of-abc-conjecture-20180920/>.
Knuuttila, Tarja, and Rami Koskinen. 2020. “Synthetic fictions: Turning imagined biological systems into concrete ones.” Synthese, DOI: https://doi.org/10.1007/s11229-020-02567-6.
Koukoulopoulos, Dimitris. 2019. The Distribution of Prime Numbers. Providence, Rhode Island: American Mathematical Society.
Lange, Marc. 2014. “Aspects of mathematical explanation: Symmetry, unity, and salience.” Philosophical Review 123, 485–531. DOI: https://doi.org/10.1215/00318108-2749730.
Levy, Arnon. 2020. “Metaphor and scientific explanation.” In Arnon Levy and Peter Godfrey-Smith (eds.), The Scientific Imagination: Philosophical and Psychological Perspectives, New York: Oxford University Press, 280–303.
Lipton, Peter. 2009. “Understanding without explanation.” In Henk W. de Regt, Sabina Leonelli, and Kai Eigner (eds.), Scientific Understanding: Philosophical Perspectives, Pittsburgh: University of Pittsburgh Press, 43–63.
Lorch, John, and Giray Ökten. 2007. “Primes and probability: The Hawkins random sieve.” Mathematics Magazine 80, 112–119. DOI: https://doi.org/10.1080/0025570X.2007.11953464.
Lorscheid, Oliver. 2014. “Blueprints — toward absolute arithmetic?” Journal of Number Theory 144, 408–421. DOI: https://doi.org/10.1016/j.jnt.2014.04.006.
Lorscheid, Oliver. 2018. “𝔽₁ for everyone.” Jahresbericht der Deutschen Mathematiker-Vereinigung 120, 83–116. DOI: https://doi.org/10.1365/s13291-018-0177-x.
Lozano-Robledo, Álvaro. 2020. “A probabilistic model for the distribution of ranks of elliptic curves over ℚ.” Journal of Number Theory, DOI: https://doi.org/10.1016/j.jnt.2020.05.022.
Mäki, Uskali. 2009. “Realistic realism about unrealistic models.” In Harold Kincaid and Don Ross (eds.), The Oxford Handbook of Philosophy of Economics, New York: Oxford University Press, 68–98.
McLoone, Brian. 2020. “Calculus and counterpossibles in science.” Synthese, DOI: https://doi.org/10.1007/s11229-020-02855-1.
Mancosu, Paolo. 2018. “Explanation in mathematics.” In Edward N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Summer 2018 edition), online at <https://plato.stanford.edu/archives/sum2018/entries/mathematics-explanation/>.
Montgomery, Hugh L., and Robert C. Vaughan. 2007. Multiplicative Number Theory I: Classical Theory. New York: Cambridge University Press.
Morrison, Margaret. 2015. Reconstructing Reality: Models, Mathematics, and Simulations. New York: Oxford University Press.
Murphy, Alice. 2022. “Imagination in science.” Philosophy Compass 17, 1–12, DOI: https://doi.org/10.1111/phc3.12836.
Papayannopoulos, Philippos. 2020. “Unrealistic models for realistic computations: how idealisations help represent mathematical structures and found scientific computing.” Synthese, DOI: https://doi.org/10.1007/s11229-020-02654-8.
Patterson, S.J. 1988. An Introduction to the Theory of the Riemann Zeta-function. Cambridge: Cambridge University Press.
Pincock, Christopher. 2015. “Abstract explanations in science.” British Journal for the Philosophy of Science 66, 857–882. DOI: https://doi.org/10.1093/bjps/axu016.
Pincock, Christopher. 2015. “The unsolvability of the quintic: A case study in abstract mathematical explanation.” Philosophers’ Imprint 15, 1–19.
Pintz, János. 2007. “Cramér vs. Cramér: On Cramér’s probabilistic model for primes.” Functiones et Approximatio 37, 361–376. DOI: https://doi.org/10.7169/facm/1229619660.
Pólya, George. 1959. “Heuristic reasoning in the theory of numbers.” American Mathematical Monthly 66, 375–384. DOI: https://doi.org/10.1080/00029890.1959.11989304.
Potochnik, Angela. 2017. Idealization and the Aims of Science. Chicago: University of Chicago Press.
Reutlinger, Alexander. 2016. “Is there a monist theory of causal and non-causal explanations? The counterfactual theory of scientific explanation.” Philosophy of Science 83, 733–745. DOI: https://doi.org/10.1086/687859.
Reutlinger, Alexander, Dominik Hangleiter, and Stephan Hartmann. 2018. “Understanding (with) toy models.” British Journal for the Philosophy of Science 69, 1069–1099. DOI: https://doi.org/10.1093/bjps/axx005.
Rice, Collin C. 2016. “Factive scientific understanding without accurate representation.” Biology and Philosophy 31, 81–102. DOI: https://doi.org/10.1007/s10539-015-9510-2.
Rosen, Michael. 2002. Number Theory in Function Fields. New York: Springer-Verlag.
Ruben, David-Hillel. 1990. Explaining Explanation. New York: Routledge.
Rudnick, Zeev. 2014. “Some problems in analytic number theory for polynomials over a finite field.” In Sun Young Jang, Young Rock Kim, Dae-Woong Lee and Ikkwon Yie (eds.), Proceedings of the International Congress of Mathematicians, Seoul 2014, Volume II: Invited Lectures, Seoul: Kyung Moon SA.
Saatsi, Juha. 2020. “Realism and explanatory perspectivism.” In Michela Massimi and C.D. McCoy (eds.), Understanding Perspectivism: Scientific Challenges and Methodological Prospects, New York: Routledge, 65–84.
Salis, Fiora. 2021. “The new fiction view of models.” British Journal for the Philosophy of Science 72, 717–742. DOI: https://doi.org/10.1093/bjps/axz015.
Snyder, Noah. 2000. “An alternate proof of Mason’s theorem.” Elemente der Mathematik 55, 93–94.
Soulé, Christoph. 1999. “On the field with one element.” Lecture notes from the Arbeitstagung 1999 of the Max Planck Institute for Mathematics, available online at <https://www.ihes.fr/~soule/f1-soule.pdf>.
Soundararajan, Kannan. 2007. “The distribution of prime numbers.” In Andrew Granville and Zeév Rudnick (eds.), Equidistribution in Number Theory: An Introduction, Dordrecht: Springer.
Soundararajan, Kannan. 2007. “Small gaps between prime numbers: The work of Goldston-Pintz-Yıldırım.” Bulletin (New Series) of the American Mathematical Society 44, 1–18.
Stewart, Ian, and David Tall. 2015. The Foundations of Mathematics, second edition. New York: Oxford University Press.
Strevens, Michael. 2013. “No understanding without explanation.” Studies in History and Philosophy of Science 44, 510–515. DOI: https://doi.org/10.1016/j.shpsa.2012.12.005.
Tan, Peter. 2019. “Counterpossible non-vacuity in scientific practice.” Journal of Philosophy 116, 32–60. DOI: https://doi.org/10.5840/jphil201911612.
Tanswell, Fenner Stanley. 2018. “Conceptual engineering for mathematical concepts.” Inquiry 61, 881–913. DOI: https://doi.org/10.1080/0020174x.2017.1385526.
Tao, Terence. 2008. Structure and Randomness. Providence, RI: American Mathematical Society.
Tao, Terence. 2015, January 4. “Probabilistic models and heuristics for the primes.” What’s New. Online at <https://terrytao.wordpress.com/2015/01/04/254a-supplement-4-probabilistic-models-and-heuristics-for-the-primes-optional/>.
Tenenbaum, Gérald, and Michel Mendès France. 2000. Prime Numbers and Their Distribution. Trans. by Philip G. Spain. Providence, RI: American Mathematical Society.
Thas, Koen (ed.). 2016. Absolute Arithmetic and F1-Geometry. Zurich: European Mathematical Society.
Thomasson, Amie. 2020. “If models were fictions, then what would they be?” In Arnon Levy and Peter Godfrey-Smith (eds.), The Scientific Imagination: Philosophical and Psychological Perspectives, New York: Oxford University Press.
Thomson-Jones, Martin. 2020. “Realism about missing systems.” In Arnon Levy and Peter Godfrey-Smith (eds.), The Scientific Imagination: Philosophical and Psychological Perspectives, New York: Oxford University Press.
Tits, Jacques. 1957. “Sur les analogues algébriques des groupes semi-simples complexes.” Colloque d’algèbre supérieure, tenu à Bruxelles du 19 au 22 décembre 1956, Centre Belge de Recherches Mathématiques Établissements Ceuterick, Louvain. Paris: Librairie Gauthier-Villars.
Trout, J.D. 2007. “The psychology of explanation.” Philosophy Compass 2, 564–596. DOI: https://doi.org/10.1111/j.1747-9991.2007.00081.x.
van der Poorten, Alf. 1996. Notes on Fermat’s Last Theorem. New York: Wiley.
van Eck, Dingmar, and Cory Wright. 2020. “Mechanist idealisation in systems biology.” Synthese, DOI: https://doi.org/10.1007/s11229-020-02816-8.
Weil, André. 1948. Sur les Courbes Algébriques et les Variétés qui s’en Déduisent. Paris: Hermann.
Wilkenfeld, Daniel. 2019. “Understanding as compression.” Philosophical Studies 176, 2807–2831. DOI: https://doi.org/10.1007/s11098-018-1152-1.