CHAPTER II MODULAR SYMBOL ALGORITHMS In this chapter we describe the modular symbol method in detail. First, in Sections 2.1 to 2.5, we describe the use of modular symbols and M-symbols to compute the homology space H1(X0(N), Q) and the action of the Hecke algebra, for an arbitrary positive integer N. At this stage it is already possible to identify rational newforms f, and obtain some information about the modular elliptic curves Ef attached to them: these are introduced in Section 2.6. To obtain equations for the curves Ef we compute their period lattices: the methods used for this stage occupy most of the remaining sections of the chapter. The final section 2.15 shows how to compute the degree of the associated map ϕ : X0(N) → Ef . To illustrate the methods, we also give some worked examples in an Appendix to the chapter. 2.1 Modular Symbols and Homology 2.1.1. The upper half-plane, the modular group and cusp forms. Let H denote the upper half-plane H = {z = x + iy ∈ C | y > 0}, and H∗ = H∪Q∪{∞} the extended upper half-plane, obtained by including the cusps Q∪{∞}. The group PSL2(R) acts on H∗ via linear fractional transformations: a b c d : z → az + b cz + d ; these are the isometries of the hyperbolic geometry on H, for which geodesics are either halflines perpendicular to the real axis R, or semicircles perpendicular to R. The modular group Γ = PSL2(Z) is a discrete subgroup of PSL2(R) (in the topology induced from SL(2, R) ⊂ M2(R) ∼= R4 ), and acts discontinuously on H, in the sense that for each z ∈ H the orbit Γ.z is discrete. Note that the cusps Q ∪ {∞} form a complete Γ-orbit. The elements S = 0 −1 1 0 : z → −1/z (of order 2) and T = 1 1 0 1 : z → z + 1 (of infinite order) generate Γ. This fact, and the related fact that a fundamental region for the action of Γ on H is given by the set F defined by (2.1.1) F = {z = x + iy ∈ H | |x| ≤ 1 2 , |z| ≥ 1}, are standard and will not be proved here. Both results depend essentially on the fact that Z is Euclidean. Let G be a subgroup of Γ of finite index e. Then G also acts discretely on H. A fundamental region for G on H is given by ∪Mi.F, where the Mi (for 1 ≤ i ≤ e) are right coset representatives for G in Γ. 7 8 II. MODULAR SYMBOL ALGORITHMS Let XG = G\H∗ denote the quotient space; this may be given the structure of a compact Riemann surface. Around most points the local parameter is just z, but more care is needed about the “parabolic points” or cusps, and the “elliptic points” which have non-trivial stabilizers in the Γ-action. These elliptic points for G (if any) are in the Γ-orbits of i (stabilized by S of order 2), and of ρ = (1 + √ −3)/2 (stabilized by TS of order 3). See the books of Lang [32], Shimura [55] or Knapp [28] for details of the Riemann surface construction. Let g denote the genus of the surface XG; as a real manifold1 , XG is a g-holed torus. We will be concerned with the explicit computation of the 1-homology H1(XG, Z), which is a free Z-module of rank 2g. (See Subsection 2.1.2 below for a brief review of homology). This homology will be expressed in terms of “modular symbols”, defined below. We must also explain the connection between homology, modular forms, and elliptic curves. First we review the definition of cusp forms. The space of cusp forms of weight 2 for G will be denoted by S2(G) . These cusp forms are holomorphic functions f(z) for z ∈ H which satisfy (1) f |M = f for all M ∈ G, where f a b c d (z) = (cz + d)−2 f az + b cz + d . Thus, since (cz + d)−2 = (d/dz)(M(z)) for M = a b c d , we have, for all M ∈ G, f(M(z))d(M(z)) = f(z)dz. (2) f(z) behaves nicely at the cusps. The significance of this condition is that, by (1), a cusp form of weight 2 for G is the pull-back of a (holomorphic) differential on the Riemann surface G\H, of which XG is the compactification after adding the (finitely many) G-inequivalent cusps, and we want this differential to be holomorphic on the whole of XG. In future we will identify cusp forms of weight 2 for G with holomorphic differentials on XG. From standard Riemann surface theory, we then know that S2(G) is a complex vector space of dimension g. We can make explicit the condition that f(z)dz is holomorphic at the cusp ∞ (for the other cusps, see one of the references on the theory of modular forms). The stabilizer of ∞ in Γ is the infinite cyclic subgroup generated by T; if h is the least positive integer such that T h ∈ G, then clearly we have Stab(∞) ∩ G = Th , and every f ∈ S2(G) has a Fourier expansion of the form (2.1.2) f(z) = ∞ n=1 ane2πinz/h with coefficients an ∈ C. The integer h is called the width of the cusp ∞; for G = Γ0(N), the case we will be most interested in, we have h = 1, since T ∈ Γ0(N). 1Strictly speaking, XG is not a manifold unless G has no elements of finite order, because of the branching over the elliptic points. However this will make no difference in practice and we may safely ignore it. 2.1 MODULAR SYMBOLS AND HOMOLOGY 9 2.1.2. The duality between cusp forms and homology. The basis for our method is the explicit computation of the homology (specifically, the rational 1-homology) of the Riemann surface XG. This is useful for various reasons. On the one hand, this gives us a very explicit vector space on which Hecke operators act, which is isomorphic (or more strictly, dual) to the space of cusp forms. Thus by computing homology and the Hecke action on it, we are indirectly also able to obtain information about the space of cusp forms. The Fourier coefficients of the cusp forms are determined by their Hecke eigenvalues (see Section 2.6), so we obtain these indirectly as eigenvalues of Hecke operators acting on homology. Secondly, in order to actually compute the elliptic curves attached to these cusp forms, we need to know their periods, which are obtained by integrating the corresponding differentials around closed paths on the surface XG; since two paths give the same period (for all forms) if and only if they are homologous (essentially by Cauchy’s Theorem on XG), it is clear that to determine the whole period lattice we will also require an explicit knowledge of the homology of XG. The integral homology H1(XG, Z) is most easily defined geometrically: it is the abelian group obtained by taking as generators all closed paths on XG, and factoring out by the relation that two closed paths are equivalent (or homologous) if one can be continuously deformed into the other. If the genus of the surface XG is g, this gives a free abelian group of rank 2g: roughly speaking, the surface is a g-holed torus, and there are two generating loops around each hole. To determine this homology group in practice, one triangulates the surface, so that every path is homologous to a path along the edges of the triangulation. Now the generators are the directed edges of the triangulation, modulo relations given by the sum of the edges around each triangle being homologous to zero. A typical element of H1(XG, Z) will then be given as a Z-linear combination of these directed edges. In Subsection 2.1.6 below, we will make this very explicit: there will be one edge of the triangulation for each coset of G in Γ, and the triangle relations will be expressed algebraically in terms of the coset action of Γ. This description will entirely algebraicize the situation, in a way which is then easy to implement on a computer. For any other ring R, the homology with coefficients in R is obtained simply by tensoring with R: H1(XG, R) = H1(XG, Z) ⊗Z R. Explicitly, one just takes R-linear combinations of the 2g generators of the Z-module H1(XG, Z) (“extension of scalars”); the result is then an R-module. In what follows we will only need to take R = Q, R = R, and R = C. Let H1(XG, R) = H1(XG, Z)⊗Z R, which is a real vector space of dimension 2g. Abstractly, this space is obtained by formal extension of scalars from H1(XG, Z); but we can be more concrete, if we introduce the notion of modular symbols. First let α, β ∈ H∗ be points equivalent under the action of G, so that β = M(α) for some M ∈ G. Any smooth path (for instance, a geodesic path) from α to β in H∗ projects to a closed path in the quotient space XG, and hence determines an integral homology class in H1(XG, Z), which depends only on α and β and not on the path chosen, because H∗ is simply connected. (In fact, the class depends only on M: see (5) in Proposition 2.1.1 below). We denote this homology class by the modular symbol {α, β}G, or simply {α, β} when the group G is clear from the context. Conversely, every integral homology class γ ∈ H1(XG, Z) can be represented by such a modular symbol {α, β}. Also, if f ∈ S2(G) then the integral γ 2πif(z)dz = 2πi β α f(z)dz 10 II. MODULAR SYMBOL ALGORITHMS is well-defined, since f(z) is holomorphic, and will be denoted either as γ, f or as If (α, β). The (complex) value of such an integral is called a period of the cusp form f, or of the associated differential 2πif(z)dz. Let f1, f2, . . . , fg be a fixed basis for S2(G), so that the differentials 2πifj(z)dz are a basis for the holomorphic differentials on XG. Also let γ1, γ2, . . . , γ2g be a fixed Z-basis for the integral homology H1(XG, Z). Then we may form the 2g × g complex period matrix Ω = (ωjk) = ( γj, fk ) . By standard Riemann surface theory, the 2g rows of Ω are linearly independent over R, and so their Z-span is a lattice (discrete subgroup) Λ of rank 2g in Cg . The quotient J(G) = Cg /Λ is the Jacobian of XG; it is an abelian variety of dimension g. The symbols {α, β} give C-linear functionals S2(G) → C via f → If (α, β). We may identify H1(XG, R) with the space of all C-linear functionals on S2(G) as follows: given an element γ ∈ H1(XG, R), we can write γ uniquely in the form γ = 2g j=1 cjγj with coefficients cj ∈ R. Define γ, f = 2g j=1 cj γj, f . Then the corresponding functional is f → γ, f . Conversely, given a functional ω: S2(G) → C, the vector (ω(f1), ω(f2), . . . , ω(fg)) ∈ Cg may be expressed uniquely as an R-linear combination of the rows of Ω, so there exist real scalars cj (1 ≤ j ≤ 2g) such that ω(f) = 2g j=1 cj γj, f for all f ∈ S2(G); then ω(f) = γ, f where γ = 2g j=1 cjγj ∈ H1(XG, R). In particular, let α, β ∈ H∗ be arbitrary (not necessarily in the same G-orbit); then the functional f → If (α, β) corresponds to a unique element γ = 2g j=1 cjγj ∈ H1(XG, R), and we define the modular symbol {α, β}G ∈ H1(XG, R) to be this element. Clearly this definition agrees with the earlier one in the special case where β = M(α) for some M ∈ G; indeed, this case holds if and only if all cj ∈ Z. By the field of definition of an element γ ∈ H1(XG, R) we mean the field generated over Q by its coefficients cj (with respect to the Z-basis for the integral homology, as above). For example, γ is rational (has field of definition Q) if and only if γ ∈ H1(XG, Q). We now have an R-bilinear pairing (2.1.3) S2(G) × H1(XG, R) −→ C given by (f, γ) → γ, f = γ 2πif(z)dz which gives an exact duality between the two spaces on the left if we view S2(G) as a real vector space of dimension 2g by restriction of scalars from C to R. To interpret this as a duality over C, we can give H1(XG, R) the structure of a vector space over C (of dimension g) as follows. Given γ ∈ H1(XG, R) and c ∈ C, we define cγ to be that element of H1(XG, R) which satisfies cγ, f = γ, cf for all f ∈ S2(G); in other words, cγ is the element corresponding to the functional f → c γ, f . Now the map (f, γ) → γ, f is C-bilinear, and the dual pairing (2.1.3) between homology and cusp forms is an exact duality over C. 2.1 MODULAR SYMBOLS AND HOMOLOGY 11 2.1.3. Real structure. For suitable groups G we can restrict the duality described above to a duality between real vector spaces of dimension g. This has important implications for explicit computations, where a halving of the dimension (from 2g to g) gives a significant saving of effort. Let J = −1 0 0 1 . We say that a subgroup G of Γ is of real type if J normalizes G. Explicitly, let M = a b c d ∈ G; then J−1 MJ = a −b −c d = M∗ , say, and G is of real type when M ∈ G ⇐⇒ M∗ ∈ G. This will be true, in particular, for the congruence subgroups Γ0(N) and Γ1(N) of most interest to us. For z ∈ H set z∗ = −z. Then a trivial calculation shows that w = M(z) ⇐⇒ w∗ = M∗ (z∗ ); it follows that, for G of real type, the map z → z∗ induces a well-defined map on the quotient XG, and hence also on homology, via {α, β} → {α∗ , β∗ }. Clearly this is an R-linear involution on H1(XG, R). Hence we obtain a decomposition into +1 and −1 eigenspaces for ∗: H1(XG, R) = H+ 1 (XG, R) ⊕ H− 1 (XG, R). Remark. The involution ∗ also acts on the integral homology H1(XG, Z), and we may set H± 1 (XG, Z) = H± 1 (XG, R)∩H1(XG, Z). However the direct sum H+ 1 (XG, Z)⊕H− 1 (XG, Z) will in general have finite index in H1(XG, Z). We now define dually an involution, also denoted ∗, on the space S2(G) where G is of real type. For a holomorphic function f on H, we set f∗ (z) = f(z∗). Then f∗ is also holomorphic on H, and the following facts are easily verified: (1) If f has Fourier expansion f(z) = anqn (where q = exp(2πiz/h) as in (2.1.2) above), then f∗ (z) = anqn . In other words, the Fourier coefficients of f∗ are the conjugates of those of f. (2) For M ∈ Γ, we have f∗ | M = (f | M∗ )∗ . (3) γ∗ , f∗ = γ, f for all f, γ. As a formal consequence of fact (2), we immediately see that, for G of real type, the map f → f∗ is an R-linear map from S2(G) to itself, which is an involution. Denote by S2(G)R the R-subspace of S2(G) fixed by this involution, which by fact (1) consists of those cusp forms with real Fourier coefficients. Then dimR(S2(G)R) = g, and S2(G)R spans S2(G) over C. For nonzero f ∈ S2(G)R we have (from fact (3)): γ, f ∈ R ⇐⇒ γ ∈ H+ 1 (XG, R), and also γ, f ∈ iR ⇐⇒ γ ∈ H− 1 (XG, R). Moreover, multiplication by i on H1(XG, R) interchanges the “real” and “pure imaginary” eigenspaces H± 1 (XG, R) since γ ∈ H+ 1 ⇐⇒ γ, f ∈ R ∀f ∈ S2(G)R ⇐⇒ iγ, f ∈ iR ∀f ∈ S2(G)R ⇐⇒ iγ ∈ H− 1 . It follows that dim H+ 1 (XG, R) = dim H− 1 (XG, R) = g. Also, since the period γ, f is real for γ ∈ H+ 1 and f ∈ S2(G)R, the duality over C we had earlier now restricts to a duality over R: (2.1.4) S2(G)R × H+ 1 (XG, R) −→ R. 12 II. MODULAR SYMBOL ALGORITHMS It follows that the real vector spaces S2(G)R and H+ 1 (XG, R) of dimension g are dual to each other. We will exploit this duality (which also respects the action of Hecke and other operators, see below), as we will compute H1(XG, R) explicitly in order to gain information about the cusp forms in S2(G). Also, this duality is crucial in the definition of modular elliptic curves. 2.1.4. Modular symbol formalism. We will need the following simple properties of the modular symbols {α, β}. Proposition 2.1.1. Let α, β, γ ∈ H∗ , and let M, M1, M2 ∈ G. Then (1) {α, α} = 0; (2) {α, β} + {β, α} = 0; (3) {α, β} + {β, γ} + {γ, α} = 0; (4) {Mα, Mβ}G = {α, β}G; (5) {α, Mα}G = {β, Mβ}G; (6) {α, M1M2α}G = {α, M1α}G + {α, M2α}G; (7) {α, Mα}G ∈ H1(XG, Z). Proof. Only (5) and (6) are not quite obvious. For (5), write {α, Mα} = {α, β}+{β, Mβ}+ {Mβ, Mα}, using (2) and (3); now the first and third terms cancel by (4). For (6), we have {α, M1M2α} = {α, M1α} + {M1α, M1M2α} = {α, M1α} + {α, M2α} using (4). Corollary 2.1.2. The map M → {α, Mα}G is a surjective group homomorphism G → H1(XG, Z), which is independent of α ∈ H∗ . The kernel of this homomorphism contains all commutators and elliptic elements (since the latter have finite order, and the image is a torsion-free abelian group), and also all parabolic elements: for if M ∈ G is parabolic, it is a conjugate of T and hence fixes some α ∈ Q ∪ {∞}, so M → {α, Mα} = 0. In fact, the kernel is generated by these elements, but we will not prove that here. 2.1.5. Rational structure and the Manin-Drinfeld Theorem. We have seen that every element γ of H1(XG, Z) has the form {α, Mα} with M ∈ G and α ∈ H∗ arbitrary; usually we take α to be a cusp, so that γ is a path between G-equivalent cusps. It is not clear in general what is the field of definition of a modular symbol {α, β}G for which α and β are both cusps. However, when G is a congruence subgroup, this is answered by the Manin-Drinfeld Theorem. A congruence subgroup of Γ is a subgroup G such that membership of G is determined by means of congruence conditions on the entries of a matrix in Γ. A moment’s thought shows that this is equivalent to the condition that for some positive integer N, G contains the principal congruence subgroup Γ(N), which is defined to be the subgroup of Γ consisting of matrices congruent to the identity modulo N. The least such N is called the level of G. The most important congruence subgroups are Γ(N) itself; Γ0(N) = a b c d ∈ Γ | c ≡ 0 (mod N) ; and Γ1(N) = a b c d ∈ Γ | c ≡ 0, a ≡ 1 (mod N) . We can now state the Manin-Drinfeld Theorem. 2.1 MODULAR SYMBOLS AND HOMOLOGY 13 Theorem 2.1.3. (Manin, Drinfeld) Let G be a congruence subgroup of the modular group Γ. Then for all pairs of cusps α, β ∈ H∗ we have {α, β}G ∈ H1(XG, Q). In particular, the modular symbol {0, ∞}G is rational; the denominator of this element is very important in many ways. Thus the rational homology H1(XG, Q) is generated by paths between cusps (since it is generated by the integral homology), and conversely every path between cusps is rational. We will see later how to use this fact to develop an algorithm for computing the rational homology. The proof of the Manin-Drinfeld Theorem involves the use of Hecke operators: see the Remark in Section 2.9 for a sketch of this argument in the case of Γ0(N). Using Hecke operators, we can also prove that the space of cusp forms S2(G) has a Q-structure, namely a basis consisting of forms with rational Fourier coefficients (when G is a congruence subgroup). This is related to the fact that the modular curve XG, which as a Riemann surface is certainly an algebraic curve over C, can in fact be given the structure of an algebraic curve over the field of algebraic numbers Q (and even over the Nth cyclotomic field, if G has level N). This rational structure is crucial to the construction of modular elliptic curves, to ensure that we obtain elliptic curves defined over Q. For further details, see the books of Lang [32] and Knapp [28]. The duality between cusp forms and homology does not descend entirely to Q, however, because even if f ∈ S2(G) has rational Fourier coefficients and γ ∈ H1(XG, Q), the period γ, f will not be rational. Rationality questions for periods of modular forms have been studied extensively, notably by Manin, but we will not go into this further here. 2.1.6. Triangulations and homology. From now on we will assume that G is a congruence subgroup, so that the rational homology of XG is precisely the homology generated by paths between cusps. We will compute the homology of XG by first triangulating the upper half-plane H∗ , using a tessellation of hyperbolic triangles, and then passing to the quotient. This will give us a very explicit triangulation of the surface XG, using which we can write down explicit generators and relations for its 1-homology. For M ∈ Γ let (M) denote {M(0), M(∞)}, viewed as a path in H∗ ; this is the image under M of the imaginary axis {0, ∞}. These geodesic paths form the oriented edges of a triangulation of H∗ whose vertices are the cusps Q ∪ {∞}. Explicitly, there is an edge from ∞ to n for all n ∈ Z, and an edge between rational numbers a/c and b/d such that ad − bc = 1. The triangles of the triangulation are images under M ∈ Γ of the basic triangle T with vertices at 0, 1 and ∞ and edges (I), (TS), ((TS)2 ). We denote by M the image of this triangle under M, which has vertices M(0), M(1) and M(∞) and edges (M), (MTS) and (M(TS)2 ). This representation of the triangles is unique except for the relation M = MTS = M(TS)2 . Also, triangles M and MS meet along the edge (M), since (MS) = −(M) (the negative sign indicating reverse orientation). We will use the symbol (M)G to denote the image of the path (M) in the quotient XG, and also its image in the rational homology. The geometric observations of the previous paragraph now give us the following 2- and 3-term relations between these symbols: (2.1.5) (M)G + (MTS)G + (M(TS)2 )G = 0 (M)G + (MS)G = 0. 14 II. MODULAR SYMBOL ALGORITHMS We also clearly have the relations (2.1.6) (M M)G = (M)G for all M ∈ G, so we may use as generators of the rational homology the finite set of symbols (Mi)G, (1 ≤ i ≤ e), where as before M1, . . . , Me are a set of right coset representatives for G in Γ. Let C(G) be the Q-vector space with basis the formal symbols (M)G for each M in Γ, identified by the relations (2.1.6), so that dim(C(G)) = e = [Γ : G]. Let B(G) be the subspace of C(G) spanned by all elements of the form (M)G + (MS)G, (M)G + (MTS)G + (M(TS)2 )G. Let C0(G) be the Q-vector space spanned by the G-cusps [α]G for α ∈ Q ∪ {∞} (so that [α]G = [β]G ⇐⇒ β = M(α) for some M ∈ G). Define the boundary map δ: C(G) → C0(G) by δ((M)G) = [M(∞)]G − [M(0)]G and set Z(G) = ker(δ). Note that B(G) ⊆ Z(G), by a trivial calculation using the facts that S transposes 0 and ∞ while TS cycles 0, 1 and ∞. Finally we define H(G) = Z(G)/B(G). The crucial result, due in this form to Manin [37, Theorem 1.9], is that this formal construction does in fact give us the rational homology of XG: Theorem 2.1.4. H(G) is isomorphic to H1(XG, Q), the isomorphism being induced by (M)G → {M(0), M(∞)}G. We may thus use the symbol (M)G either as an abstract symbol obeying certain relations, or to denote an element of H1(XG, Q), without confusion. In future, as the subgroup G will be fixed, we will omit the subscript on these symbols and blur the distinction between (M) as a path in the upper half-plane and (M)G as representing an element of the rational 1-homology of XG. Note that the form of the relations between the generating symbols (M) does not depend at all on the specific group G. In particular we do not have to consider explicitly the shape of a fundamental region for the action of G on H∗ , or how the edges of such a region are identified. This represents a major simplification compared with earlier approaches, such as that used by Tingley [67]. In order to develop this result into an explicit algorithm for computing homology, we need to have a specific set of right coset representatives for the subgroup G of Γ, and also to have a test for G-equivalence of cusps. These are purely algebraic problems which can easily be solved for arithmetically defined subgroups G such as congruence subgroups. Observe that from this point on, we do not have to do any geometry at all. One final remark before we specialize to the case G = Γ0(N): every path between cusps may be expressed as a finite sum of paths of the form (M) with M ∈ Γ. Writing {α, β} = {0, β} − {0, α}, it suffices to do this for modular symbols of the form {0, α}. Let (2.1.7) p−2 q−2 = 0 1 , p−1 q−1 = 1 0 , p0 1 = p0 q0 , p1 q1 , p2 q2 , . . . , pk qk = α denote the continued fraction convergents of the rational number α. Then, as is well-known, pjqj−1 − pj−1qj = (−1)j−1 for −1 ≤ j ≤ k. 2.2 M-SYMBOLS AND Γ0(N) 15 Hence (2.1.8) {0, α} = k j=−1 pj−1 qj−1 , pj qj = k j=−1 {Mj(0), Mj(∞)} = k j=−1 (Mj) where Mj = (−1)j−1 pj pj−1 (−1)j−1 qj qj−1 . 2.2 M-symbols and Γ0(N) We now specialize to the case G = Γ0(N): Γ0(N) = a b c d ∈ Γ | c ≡ 0 (mod N) . The index of Γ0(N) in Γ is given (see [55, Proposition 1.43]) by [Γ : Γ0(N)] = N p|N 1 + p−1 . Define H(N) = H(Γ0(N)) and X0(N) = XΓ0(N). After Theorem 2.1.4, we will identify H(N) with H1(X0(N), Q) by identifying (M) with {M(0), M(∞)}. The next lemma is used to determine right coset representatives for Γ0(N) in Γ. Proposition 2.2.1. For j = 1, 2 let Mj = aj bj cj dj ∈ Γ. The following are equivalent. (1) The right cosets Γ0(N)M1 and Γ0(N)M2 are equal; (2) c1d2 ≡ c2d1 (mod N); (3) There exists u with gcd(u, N) = 1 such that c1 ≡ uc2 and d1 ≡ ud2 (mod N). Proof. We have M1M−1 2 = a1d2 − b1c2 ∗ c1d2 − d1c2 a2d1 − b2c1 , which is in Γ0(N) if and only if c1d2 − d1c2 ≡ 0 (mod N). Thus (1) and (2) are equivalent. Also, if (1) holds, then from det(M1M−1 2 ) = 1, we deduce also that gcd(u, N) = 1, where u = a2d1 − b2c1. Now uc2 = a2d1c2 − b2c1c2 ≡ a2d2c1 − b2c2c1 since d1c2 ≡ d2c1 (mod N) = c1 since a2d2 − b2c2 = 1 and ud2 ≡ d1 similarly. Conversely, if c1 ≡ uc2 and d1 ≡ ud2 (mod N) with gcd(u, N) = 1, then the congruence in (2) follows easily. On the set of ordered pairs (c, d) ∈ Z2 such that gcd(c, d, N) = 1 we now define the relation ∼, where (2.2.1) (c1, d1) ∼ (c2, d2) ⇐⇒ c1d2 ≡ c2d1 (mod N). By Proposition 2.2.1, this is an equivalence relation. The equivalence class of (c, d) will be denoted (c : d), and such symbols will be called M-symbols (after Manin, who introduced them in [37]). The set of these M-symbols modulo N is P 1 (N) = P1 (Z/NZ), the projective line over the ring of integers modulo N. Notice that in an M-symbol (c : d), the integers c and d are only determined modulo N, and that we can always choose them such that gcd(c, d) = 1. Proposition 2.2.1 now implies the following. 16 II. MODULAR SYMBOL ALGORITHMS Proposition 2.2.2. There exist bijections P1 (N) ←→ [Γ : Γ0(N)] ←→ {(M) : M ∈ [Γ : Γ0(N)]} given by (2.2.2) (c : d) ↔ M = a b c d ↔ (M) = {b/d, a/c} where a, b ∈ Z are chosen so that ad − bc = 1. Note that a different choice of a, b in (2.2.2) has the effect of multiplying M on the left by a power of T which does not change the right coset of M, or the symbol (M), since T ∈ Γ0(N) for all N. The right coset action of Γ on [Γ : Γ0(N)] induces an action on P1 (N): (2.2.3) (c : d) p q r s = (cp + dr : cq + ds). In particular, we have (2.2.4) (c : d)S = (d : −c) and (c : d)T = (c : c + d). The boundary map δ now takes the form (2.2.5) δ: (c : d) → [a/c] − [b/d]. In order to compute ker(δ), we must be able to determine when two cusps are Γ0(N)equivalent. This is achieved by the following result. Proposition 2.2.3. For j = 1, 2 let αj = pj/qj be cusps written in lowest terms. The following are equivalent: (1) α2 = M(α1) for some M ∈ Γ0(N); (2) q2 ≡ uq1 (mod N) and up2 ≡ p1 (mod gcd(q1, N)), with gcd(u, N) = 1. (3) s1q2 ≡ s2q1 (mod gcd(q1q2, N)), where sj satisfies pjsj ≡ 1 (mod qj). Proof. (1) =⇒ (2): Let M = a b Nc d ∈ Γ0(N); Then p2/q2 = (ap1 + bq1)/(Ncp1 + dq1), with both fractions in lowest terms. Equating numerators and denominators (up to sign) gives (2), with u = ±d, since ad ≡ 1 (mod N). (2) =⇒ (1): Here we use Proposition 2.2.1. Assume (2), and write p1s1 − q1r1 = p2s2 − q2r2 = 1 with s1, r1, s2, r2 ∈ Z. Then p1s1 ≡ 1 (mod q1) and p2s2 ≡ 1 (mod q2). Also gcd(q1, N) = gcd(q2, N) = N0, say, since q2 ≡ uq1 (mod N). Now up2 ≡ p1 (mod N0) implies us1 ≡ s2 (mod N0), so we may find x ∈ Z such that uxq1 ≡ us1−s2 (mod N). Set s1 = s1−xq1 and r1 = r1 − xp1. Then p1s1 − q1r1 = 1 and now us1 ≡ s2 (mod N). By Proposition 2.2.1, there exists M ∈ Γ0(N) such that p2 r2 q2 s2 = M p1 r1 q1 s1 , and so M(p1/q1) = p2/q2 as required. (1) ⇐⇒ (3): As before, solve the equations pjsj − qjrj = 1 for j = 1, 2. Set Mj = pj rj qj sj , so that Mj(∞) = αj, and M2M−1 1 (α1) = α2. This matrix is in Γ0(N) if and only if q2s1 − q1s2 ≡ 0 (mod N). The most general such matrix is obtained by replacing s1 by 2.2 M-SYMBOLS AND Γ0(N) 17 s1 = s1 + xq1, and it follows that α1 and α2 are equivalent if and only if we can solve for x ∈ Z the congruence 0 ≡ q2s1 − q1s2 ≡ q2s1 − q1s2 + xq1q2 (mod N), which is if and only if the congruence in (3) holds. Henceforth, we can therefore assume that H(N) is given explicitly in terms of M-symbols. Certain symbols will be generators, and each M-symbol (c : d) will be expressed as a Q-linear combination of these generating symbols, by means of the 2-term relations (2.2.6) (c : d) + (−d : c) = 0 and 3-term relations (2.2.7) (c : d) + (c + d : −c) + (d : −c − d) = 0. These are just the relations (2.1.5) expressed in terms of M-symbols, using (2.2.4). Implementation. We make a list of inequivalent M-symbols as follows: first, list the symbols (c : 1) for 0 ≤ c < N; then the symbols (1 : d) for 0 ≤ d < N and gcd(d, N) > 1; and finally a pairwise inequivalent set of symbols (c : d) with c|N, c = 1, N, gcd(c, d) = 1 and gcd(d, N) > 1. (The latter symbols do not arise when N is a prime power.) To speed up the looking up of M-symbols in the list, we have found it extremely worthwhile to prepare at the start of the program a collection of lookup tables, containing for example a table of inverses modulo N. We also used a simple “hashing” system, so that given any particular symbol (c : d) we could quickly determine to which symbol in our standard list it is equivalent. While this preparation of look-up tables may seem rather trivial, in practice it has had a dramatic effect, speeding up the mass computation of Hecke eigenvalues ap (see Section 2.9) by a factor of up to 50. Using the 2-term relations (2.2.6) we may identify the M-symbols in pairs, up to sign. This immediately halves the number of generators needed. Then the 3-term relations (2.2.7) are computed, each M-symbol being replaced by plus or minus one of the current generators, and the resulting equations solved using integer Gaussian elimination. At the end of this stage we have a list of r (say) “free generators” from the list of M-symbols, and a table expressing each of the M-symbols in the list as a Q-linear combination of the generators. In practice, we store Z-linear combinations, keeping a common denominator d1 separately; however, by judicious choice of the order of elimination of symbols, in practice this denominator is very frequently 1. Next we compute the boundary map δ on each of the free generators, using (2.2.5). We have a procedure based on Proposition 2.2.3 to test cusp equivalence. Hence we do not have to compute in advance a complete list of inequivalent cusps. Instead, we keep a cumulative list: each cusp we come across is checked for equivalence with those in the list already, and is added to the list if it represents a new equivalence class. We found this simpler to implement than using a standard set of pairwise inequivalent cusps, as in [37, Corollary 2.6]. We thus compute a matrix with integer entries for the linear map δ, and by a second step of Gaussian elimination can compute a basis for its kernel, which by definition is H(N). This basis is stored as a list of 2g integer vectors in Zr over a second common denominator d2. (Here g is the genus of X0(N), so that dim H(N) = 2g.) We may arrange (by reducing the basis suitably) that whenever a linear combination of M-symbols (represented as a vector in Zr ) is in ker(δ), then its coefficients with respect to the basis are given by (a subset of) 2g components of these vectors, divided by the cumulative common denominator d1d2. From now on we will regard elements of H(N) as being given by vectors in Z2g in this way. 18 II. MODULAR SYMBOL ALGORITHMS 2.3 Conversion between modular symbols and M-symbols As noted above, each M-symbol (c : d) has a representative with gcd(c, d) = 1, and corresponds to the right coset representative M = a b c d in Γ, where ad − bc = 1. The isomorphism of Theorem 2.1.4 thus becomes (2.3.1) (c : d) → {b/d, a/c}. The modular symbol on the right of (2.3.1) is independent of the choice of a and b with ad − bc = 1, since 1 1 0 1 ∈ Γ0(N) for all N, and so {α, β}Γ0(N) = {α + k, β + l}Γ0(N) for all k, l in Z and α, β in H∗ . Conversely each modular symbol {α, β} with α and β in Q ∪ {∞} can be expressed, using continued fractions, as a sum of modular symbols of the special form (M) = {M(0), M(∞)} with M ∈ Γ, hence as a sum of M-symbols (c : d), and finally as a linear combination of the generating M-symbols. Using the notation introduced above in Subsection 2.1.6, if q0 = 1, q1, . . ., qk are the denominators of the continued fraction convergents to the rational number α as in (2.1.7), in terms of M-symbols we have (2.3.2) {0, α} = k j=1 ((−1)j−1 qj : qj−1) since the first two terms in (2.1.8) cancel out. Note that it is only the denominators of the continued fraction convergents which are used. 2.4 Action of Hecke and other operators For each prime p not dividing N, the Hecke operator Tp acts on modular symbols {α, β} via (2.4.1) Tp: {α, β} →   p 0 0 1 + r mod p 1 r 0 p   {α, β} = {pα, pβ} + r mod p α + r p , β + r p . This action induces a linear map from H(N) to itself, provided that p does not divide N, which we again denote by Tp. There are also Hecke operators, which we also denote Tp, acting on the space S2(N) = S2(Γ0(N)) of cusp forms of weight 2 for Γ0(N). First recall that 2 × 2 matrices M = a b c d with ad − bc > 0 act on functions f(z) on the right via f → f |M where (f |M )(z) = ad − bc (cz + d)2 f az + b cz + d . 2.4 ACTION OF HECKE AND OTHER OPERATORS 19 A form of weight 2 for some group G will satisfy f |M = f for all M ∈ G. This action extends by linearity to an action by formal linear combinations of matrices. The Hecke operator Tp is defined by f |Tp = f p 0 0 1 + p−1 r=0 1 r 0 p , so that (f |Tp ) (z) = p f(pz) + 1 p p−1 r=0 f z + r p . A standard result is that Tp does act on S2(N), provided that p N. (There are similar operators Up for primes p dividing N, but we will not need these). These matrix actions on S2(N) and H(N) are compatible, in the sense that they respect the duality between cusp forms and homology: {α, β}, f |M = {Mα, Mβ}, f , since d dz az + b cz + d = ad − bc (cz + d)2 , and so β α (f |M ) (z)dz = β α ad − bc (cz + d)2 f(M(z))dz = Mβ Mα f(w)dw. Thus, in particular, {α, β}, f |Tp = Tp{α, β}, f . Secondly, for each prime q dividing N there is an involution operator Wq acting on H(N) and S2(N). We recall the definition. Let qα be the exact power of q dividing N, and let x, y, z, w be any integers satisfying qα xw − (N/qα )yz = 1. Then the matrix Wq = qα x y Nz qα w has determinant qα and normalizes Γ0(N) (modular scalar matrices). Thus Wq induces an action on H(N) and S2(N), which is an involution since W 2 q ∈ Γ0(N) (modulo scalars), and is independent of the values x, y, z, w chosen. The product of all the Wq for q dividing N is the Fricke involution WN , coming from the transformation z → −1/Nz, with matrix 0 −1 N 0 . The operators Tp for primes p not dividing N and Wq for primes q dividing N together generate a commutative Q-algebra, called the Hecke algebra and denoted T. Moreover, each operator is self-adjoint with respect to the so-called Petersson inner product on S2(N), and so there exist bases for S2(N) consisting of simultaneous eigenforms for all the Tp and Wq, with real eigenvalues. (See [1, Theorem 2] or [32, Corollary 2 to Theorem 4.2].) Similarly, the action of T on H(N) ⊗ R can also be diagonalized. Finally, recall from Subsection 2.1.3 that the transformation z → z∗ = −z on H commutes with the action of Γ0(N) and hence also induces an involution on H(N) which we denote ∗. This operator commutes with all the Tp and Wq, which thus preserve the eigenspaces H+ (N) and H− (N). Moreover, H+ (N) and H− (N) are isomorphic as modules for the Hecke algebra T. It follows that in order to compute eigenvalues of Hecke operators, we can restrict our attention to H+ (N). This has some practical significance, as we elaborate in the next section. 20 II. MODULAR SYMBOL ALGORITHMS Implementation. To compute the matrices giving the action of each of these operators on H(N) we may proceed as follows. We convert each of the generating M-symbols to a modular symbol as in Section 2.3. To compute a Tp, we apply (2.4.1) to each, reconvert each term on the right of (2.4.1) to a sum of M-symbols using (2.3.2), and hence express it as a Z2g -vector giving it as a linear combination of the generating M-symbols. This gives one column of the 2g × 2g matrix. Similarly with Wq and WN . Computing the matrix of ∗ is easier, as we can work directly with the M-symbols, on which ∗ acts via (c : d) → (−c : d). These integer matrices are in fact d1d2 times the actual operator matrices (where d1 and d2 are the denominators which may have arisen earlier as a result of the Gaussian elimination steps). Obviously this must be taken into account when we look for eigenvalues later; however, for simplicity of exposition we will assume from now on that this denominator d1d2 is 1. We use the convention that the space is represented by column vectors, with operator matrices acting on the left. Heilbronn matrices. There is an alternative approach to computing the Tp, based on socalled Heilbronn matrices of level p. These were described by Mazur in [38], and their application to give an algorithm for computing the Hecke action on homology in terms of M-symbols was given by Merel in his paper [42]. We will describe our own version of this method, which is easy to implement; our approach differs slightly from, and is a little simpler than, that of Merel’s paper [42]. Since with this method one acts directly on the M-symbols, one avoids the conversion to and from modular symbols. This makes the method faster in practice, particularly as we may precompute the Heilbronn matrices for all the small primes p (say p < 30) for which we need to compute the matrix of Tp in order to split off one-dimensional eigenspaces from H(N). From the definition in (2.4.1), the action of Tp is expressed as the sum of the actions on modular symbols, on the left, of p+1 matrices of determinant p, namely p 0 0 1 and 1 r 0 p for r modulo p. The following result shows how each of these acts on M-symbols directly, via an action on the right. Proposition 2.4.1. Let p be a prime not dividing N and (c : d) an M-symbol for N. The action on (c : d) of the p + 1 matrices appearing in (2.4.1) is as follows. (1) p 0 0 1 (c : d) = (c : pd) = (c : d) 1 0 0 p . (2) For r ∈ Z, let Mi ∈ Γ for 0 ≤ i ≤ k be the matrices constructed from the continued fraction convergents to r/p as in (2.1.8), so that M0(0) = ∞, M1(0) = M0(∞), . . . , Mk(0) = Mk−1(∞), Mk(∞) = r p . Set Mi = p −r 0 1 MiS for 0 ≤ i ≤ k. Then 1 r 0 p (c : d) = k i=0 (c : d)Mi. Proof. In each case we first solve ad − bc = 1 for integers a, b and apply the appropriate matrix to the modular symbol {b/d, a/c}. 2.4 ACTION OF HECKE AND OTHER OPERATORS 21 (1) Since p N we may assume by the Chinese Remainder Theorem that c is a multiple of p, say c = pc . Now M = a pb c d = p 0 0 1 a b c d p 0 0 1 −1 ∈ Γ, and (c : d) 1 0 0 p = (c : pd) = (c : d) = (M) = pb d , a c = p 0 0 1 b d , a c . (2) By construction of the Mi, we have k i=0 (Mi) = k i=0 {Mi(0), Mi(∞)} = ∞, r p . Given an M-symbol (c : d), we will show how to choose a and b which satisfy ad − bc = 1 and also M = 1 r 0 p a b c d 1 r 0 p −1 ∈ Γ. Then k i=0 (MMiS) = M r p , ∞ = M 1 r 0 p = 1 r 0 p a b c d = 1 r 0 p b d , a c . Now we show that suitable values of a and b exist. Replacing d by d + N if necessary, we may assume that cr ≡ d (mod p). Given an arbitrary solution to ad − bc = 1, solve for t the congruence (cr − d)t ≡ (b + dr) − r(a + cr) (mod p). Replacing (a, b) by (a + ct, b + dt) we then still have ad − bc = 1 and now (b + dr) − r(a + cr) = pb with b ∈ Z, and a simple calculation shows that M = a + rc b pc d − rc has the desired properties. Since the bottom row of MMiS is (pc : d − rc)MiS = (c : d) p −r 0 1 MiS = (c : d)Mi, the result follows. Hence for each prime p there exists a finite set Rp of matrices in M2(Z) with determinant p, called the Heilbronn matrices of level p, such that the Hecke operator Tp acts on M-symbols via (c : d) → M∈Rp (c : d)M. The usual definition of the set Rp (for an odd prime p not dividing N) is as follows: Rp is the set of matrices x −y y x ∈ M2(Z) with determinant xx +yy = p, and either (i) x > |y| > 0, x > |y | > 0, and yy > 0; or (ii) y = 0, and |y | < x /2; or (iii) y = 0, and |y| < x/2. This description, while closer to the original definition by Heilbronn and used by both Mazur and Merel, is not so easy to use in practice. One can show that the matrices in this definition may be constructed using the continued fraction expansions of r/p for r modulo p, and this leads to the presentation we have given here. 22 II. MODULAR SYMBOL ALGORITHMS For example, for the first few primes we have R2 = 1 0 0 2 , 2 0 0 1 , 2 1 0 1 , 1 0 1 2 , R3 = 1 0 0 3 , 3 1 0 1 , 1 0 1 3 , 3 0 0 1 , 3 −1 0 1 , −1 0 1 −3 , and R5 = 1 0 0 5 , 5 2 0 1 , 2 1 1 3 , 1 0 3 5 , 5 1 0 1 , 1 0 1 5 , 5 0 0 1 , 5 −1 0 1 , −1 0 1 −5 , 5 −2 0 1 , −2 1 1 −3 , 1 0 −3 5 . To compute the complete set Rp for any prime p we may use the following algorithm. Effectively, we are computing the continued fraction expansions of each rational r/p with denominator p, and recording the matrices denoted Mi in the preceding Proposition. In line 3 of the algorithm, the loop is over a complete set of residues r modulo p, such as −p/2 ≤ r < p/2. Algorithm for computing Heilbronn matrices INPUT: p (a prime). OUTPUT: the Heilbronn matrices of level p. 1. BEGIN 2. OUTPUT 1 0 0 p ; 3. FOR r MODULO p DO 4. BEGIN 5. x1=p; x2=-r; y1=0; y2=1; a=-p; b=r; 6. OUTPUT x1 x2 y1 y2 ; 7. WHILE b=0 DO 8. BEGIN 9. q=nearest integer(a/b); 10. c=a-b*q; a=-b; b=c; 11. x3=q*x2-x1; x1=x2; x2=x3; 12. y3=q*y2-y1; y1=y2; y2=y3; 13. OUTPUT x1 x2 y1 y2 14. END 15. END 16. END For example, take p = 7 and r = 3. The continued fraction convergents linking ∞ to 3/7 are ∞, 0, 1 2 , 3 7 with associated unimodular matrices M0 = 0 1 −1 0 , M1 = 1 0 2 1 , and M2 = −3 1 −7 2 . 2.5 WORKING IN H+(N) 23 The matrices Mi constructed in Proposition 2.4.1 are then M0 = 7 −3 0 1 , M1 = −3 −1 1 −2 , and M2 = 1 0 2 7 . These are (up to sign) the same as the matrices output by lines 3–14 of the algorithm when p = 7 and r = 3. 2.5 Working in H+ (N) Recall that H+ (N) is the +1 eigenspace for the operator ∗: z → −z acting on H(N) = H1(X0(N), Q). We would like to work in H+ (N) to compute the action of the Hecke algebra T, since there are obvious savings in computation time and storage space achieved by working in a space with half the dimension of H(N). To do this, note that H+ (N) ∼= H(N)/H− (N) (as vector spaces). We can thus compute H+ (N) in terms of M-symbols by including extra 2-term relations (2.5.1) (c : d) = (−c : d) between the M-symbols. We must also identify the cusp equivalence classes [α] and [α∗ ] = [−α] for α ∈ Q. Effectively we are replacing Γ0(N) by the larger group Γ0(N) = a b c d | a, b, c, d ∈ Z, ad − bc = ±1, c ≡ 0 (mod N) = Γ0(N), J which still acts discretely on H∗ via a b c d : z →    az + b cz + d if ad − bc = +1, az + b cz + d if ad − bc = −1; in particular, J = −1 0 0 1 sends z to z∗ = −z, giving the action of ∗. Hence, in effect, Γ0(N) = Γ0(N), ∗ , and H+ (N) ∼= H1(Γ0(N)\H∗ , Q). (A similar procedure is possible for other subgroups G of Γ of real type.) As a further saving, use of the extra relation (2.5.1) enables us to cut out half the 3-term relations (2.2.7), as follows. Using (2.5.1) on the second and third terms of (2.2.7) yields (c : d) + (c + d : c) + (d : c + d) = 0. Also, (2.5.1) and (2.2.6) together imply (d : c) = −(c : d). Hence relation (2.2.7) for (d : c) now gives the same information as (2.2.7) for (c : d), and can be omitted. Geometrically, the triangles which determine the 3-term relations have been identified in pairs by the action of the larger group, since the effect of the transformation J is to fold the upper half-plane in two along the imaginary half-axis. 24 II. MODULAR SYMBOL ALGORITHMS Implementation. We modify the procedure of Section 2.2 in three ways: taking the 2-term relations (2.2.6) and (2.5.1) together we may identify M-symbols in sets of four (instead of two), up to sign, at the first stage of elimination. Then in the second stage we have only half the number of 3-term relations to consider, as noted above, and each can be expressed in terms of half the number of current generators: so we have half the number of equations in half the number of variables to solve, giving a four-fold saving in space and time. Finally, in computing ker(δ) we must use a wider notion of cusp equivalence, since for α, β ∈ Q, α ≡ β (mod Γ0(N)) ⇐⇒ α ≡ ±β (mod Γ0(N)). Working in H+ (N) is sufficient for the first stage of our algorithm, when we want to find certain cusp forms in S2(N), since H+ (N)⊗Q C ∼= S2(N), both as vector spaces and as modules for the Hecke algebra T. Hence eigenvectors in H+ (N) correspond to eigenforms in S2(N). Since these eigenforms (or, more accurately, newforms—see the next section) have Fourier expansions in which the Fourier coefficients are determined by their Hecke eigenvalues, we can determine these coefficients indirectly by computing explicitly the action of the Hecke algebra T on H+ (N). 2.6 Modular forms and modular elliptic curves Let S2(N) denote, as above, the space of cusp forms of weight 2 on Γ0(N). Forms f(z) ∈ S2(N) have Fourier expansions of the form f(z) = ∞ n=1 a(n, f)e2πinz , with coefficients a(n, f) ∈ C. The corresponding differentials 2πif(z)dz are (the pullbacks of) holomorphic differentials on the Riemann surface X0(N). Hence S2(N) is a complex vector space of dimension g, where g is the genus of X0(N), and 2g = dim H(N). Moreover, S2(N) = S2(N)Q ⊗Q C where S2(N)Q is the subset of S2(N) consisting of forms f(z) with rational Fourier coefficients a(n, f). This rational structure on S2(N) is a consequence of the deep fact that X0(N) may be viewed as the complex points of an algebraic curve defined over Q; it may also be proved using Hecke operators and the duality with homology. We are interested here in “rational newforms” f: that is, forms f which have rational Fourier coefficients a(n, f), are simultaneous eigenforms for all the Hecke operators, and which are also “newforms” in the sense of Atkin and Lehner (see [1]). We briefly recall the definition. For each proper divisor M of N and each g ∈ S2(M), the forms g(Dz) for divisors D of N/M are in S2(N). The subspace Sold 2 (N) of S2(N) spanned by all such forms is called the space of oldforms. There is also an inner product on S2(N), called the Petersson inner product, with respect to which the Hecke operators are self-adjoint (Hermitian). Define Snew 2 (N) to be the orthogonal complement in S2(N) of Sold 2 (N) with respect to the Petersson inner product. The restriction of the Hecke algebra T to Snew 2 (N) is semisimple; Snew 2 (N) has a basis consisting of simultaneous eigenforms, and these eigenforms are called newforms. In general, newforms come in conjugate sets of d ≥ 1 forms with eigenvalues generating an algebraic number field of degree d. The periods of such a set of conjugates {f} form a lattice Λ of rank 2d in Cd , and hence an abelian variety Af = Cd /Λ, which is defined over Q. Here we will only be interested in the case d = 1, where the Hecke eigenvalues and hence Fourier coefficients of f are rational (in fact integers, being eigenvalues of integral matrices and hence algebraic integers). We will call such a form f a rational newform. Thus a rational newform f has an associated period lattice Λf : Λf = { {α, β}, f | α, β ∈ H∗ , α ≡ β (mod Γ0(N))} 2.7 SPLITTING OFF ONE-DIMENSIONAL EIGENSPACES 25 which is a discrete rank 2 subgroup of C. Then Ef = C/Λf is an elliptic curve, the modular elliptic curve attached to f. Moreover it is known that Ef is defined over Q, has conductor N, and has L-series L(Ef , s) = a(n, f)n−s where f = a(n, f) exp(2πinz). (See [64], [55], and [7] for proofs of these statements, and [28] for a fuller discussion.) The Fourier coefficients a(n, f) of a newform f(z) = a(n, f) exp(2πinz) are obtained from the Hecke eigenvalues of f as follows (see [1]). Firstly, for a newform f we always have a(1, f) = 0, and we normalize so that a(1, f) = 1. Then: If p is a prime not dividing N, and f |Tp = apf, then a(p, f) = ap. If q is a prime dividing N, and f |Wq = εqf with εq = ±1, then (2.6.1) a(q, f) = −εq if q2 N, 0 if q2 |N. For prime powers, we have the recurrence relation (2.6.2) a(pr+1 , f) = a(p, f)a(pr , f) − δN (p)pa(pr−1 , f) (r ≥ 1) where δN (p) = 1 if p N, 0 if p|N. Finally, for composite indices we have multiplicativity: a(mn, f) = a(m, f)a(n, f) when m and n are relatively prime. With this background we may now make more precise what we mean by “computing the modular elliptic curves of conductor N”. We do the following: (1) Compute the space H+ (N) in terms of M-symbols and their relations. (2) Compute the action of sufficient Hecke operators Wq and Tp on H+ (N) to determine the one-dimensional eigenspaces with rational eigenvalues; by duality, we now know the rational newforms in S2(N). Oldforms can be recognized, since in any systematic computation we will have already found them at some lower level M dividing N. (3) Find a Z-basis for the period lattice Λf , for each rational newform f, computing the generating periods to high precision. (4) Given a Z-basis for Λf , compute the coefficients of an equation for the attached elliptic curve Ef . (5) As well as the period lattice of the curves Ef , we can also compute the rational number L(Ef , 1)/Ω(Ef) (exactly) and the real value L(Ef , 1) (approximately). Also, when L(Ef , 1) = 0 we can also determine the order of vanishing of L(Ef , s) at s = 1, giving the analytic rank r of Ef , and the value of the derivative L(r) (Ef , 1), which is important in view of the Birch–Swinnerton-Dyer conjecture; we will discuss the latter computations in a later section. This is the program which we wish to carry out, and have in fact carried out for all N ≤ 5077. In sections 2.7–2.14 we discuss steps (2)–(5) in more detail. 2.7 Splitting off one-dimensional eigenspaces Having computed a representation of H+ (N) in terms of M-symbols, we now wish to identify the one-dimensional eigenspaces with rational integer eigenvalues for all the Hecke operators. For each eigenspace we will later need a dual basis vector in order to compute the projection of an arbitrary vector onto the eigenspace. Explicitly, we identify H+ (N) with Qg via our M-symbol basis, representing each cycle as a column vector; each operator matrix acts on the left. Elements of the dual space will then be represented as row vectors. Projection onto a 26 II. MODULAR SYMBOL ALGORITHMS one-dimensional eigenspace is then achieved by multiplying on the left by the appropriate row vector, which is defined up to scalar multiple by its being a simultaneous left eigenvector of each matrix. In our implementation, we do not distinguish between row and column vectors, and our linear algebra routines are designed to give right eigenvectors, so in practice all we do is find simultaneous eigenvectors for the transposes of the operator matrices. Projection (of a column vector) is then achieved by taking the dot product with the appropriate dual (row) vector. These remarks seem fairly trivial, but we need to be completely explicit if we are to implement these ideas successfully. We wish to compute as few Tp as possible at this stage, to save time; we will have a much faster way of computing many Hecke eigenvalues later (see Section 2.9), once the eigenspaces have been found. We also need to identify “oldclasses”: these are also common eigenspaces for all the Tp (though not for all the Wq, see below) but have dimension greater than 1. In order to recognize and discard oldforms as early as possible, we can create a cumulative database of the number of newforms and the first few Hecke eigenvalues (including all Wq-eigenvalues) of each newform at each level. If we proceed systematically through the levels N in order, we will thus always know about the newforms at levels M dividing N but less than N. An alternative approach might be possible here, in which we use further operators at level N, such as the Uq of [1], to eliminate all but newforms. We have not devised such a scheme which works in full generality; the advantage would be that each level could then be treated in isolation, independently of lower levels, but this was not necessary in our systematic investigations which resulted in the tables in this volume. Before starting to split H+ (N) we have the following data: the number of rational newforms g in S2(M) for proper divisors M of N; and for each such g, the Wq-eigenvalue εq for all primes q dividing M and the Tp-eigenvalue ap for several primes p not dividing N. Each form g generates an “oldclass” in S2(N): a subspace of forms which have the same eigenvalue ap for all primes p not dividing N. A basis for this oldclass consists of the forms g(Dz) for all positive divisors D of N/M; hence its dimension is d(N/M), the number of positive divisors of N/M. The forms in the oldclass do not necessarily, however, have the same Wq-eigenvalue for primes q dividing N. We now proceed to find these eigenvalues explicitly. To simplify the following exposition, observe that the Wq operators may be defined for all primes q, not just those dividing the level N, using the matrices Wq = qα x y Nz qα w of determinant qα where qα || N; for if in fact q N, then α = 0, so that Wq ∈ Γ0(N) and f |Wq = f for all f ∈ S2(N). Thus in such a case, Wq reduces to the identity. We first consider the case where N/M is a prime power. Lemma 2.7.1. Let g be a newform in S2(M), let l be a prime with g |Wl = εg, and let N = qβ M where q is also prime. Thus g determines an oldclass of dimension β + 1, spanned by the forms gi(z) = qi g(qi z) ∈ S2(N), for 0 ≤ i ≤ β. (1) If l = q, then gi |Wl = εgi for all i; (2) If l = q, then gi |Wq = εgβ−i. In case (1), all members of the oldclass have the same Wq-eigenvalue ε as g, so ε has multiplicity β +1 (as an eigenvalue of Wl acting on this oldclass). In case (2), the ε-eigenspace for Wq has dimension [(2 + β)/2] (that is, 1 2 (β + 1) if β is odd, or 1 2 (β + 2) if β is even). Proof. Suppose lα || M. In case (1) we have lα || N also. Let W (N) l = lα x y Nz lα w , with det W (N) l = lα . Then for 0 ≤ i ≤ β we have qi 0 0 1 W (N) l = W (M) l qi 0 0 1 2.7 SPLITTING OFF ONE-DIMENSIONAL EIGENSPACES 27 where W (M) l = lα x qi y Mqβ−i z lα w also has determinant lα . Hence gi W (N) l = g qi 0 0 1 W (N) l = g W (M) l qi 0 0 1 = εg qi 0 0 1 = εgi. In case (2), when q = l, we have qα+β || N. Let W (N) q = qα+β x y Nz qα+β w with det W (N) q = qα+β . Then for 0 ≤ i ≤ β we have qi 0 0 1 W(N) q = W(M) q qβ−i 0 0 1 (modulo scalar matrices, which act trivially), where W (M) q = qα+i x y Mz qα+β−i w has determinant qα . Hence gi W (N) q = εgβ−i as required. As a basis for the ε-eigenspace for W (N) q we may take the forms gi + gβ−i for 0 ≤ i ≤ β/2, and for the (−ε)-eigenspace, gi − gβ−i for 0 ≤ i < β/2. Hence the multiplicities are as stated. Using this result we can easily extend to the general case by induction on the number of prime divisors of N/M, giving the following result. Proposition 2.7.2. Let g be a newform in S2(M) where M | N. Write N/M = k i=1 qβi i , so that the oldclass in S2(N) coming from g has dimension d(N/M) = (1 + βi). (1) For every prime q not dividing N/M, the Wq-eigenvalue of every form in the oldclass is the same as that of g. (2) Suppose g |Wqi = εig for 1 ≤ i ≤ k. Let (2.7.1) n+ i =    1 2 (βi + 1) if βi is odd, 1 2 (βi + 2) if βi is even and εi = +1, 1 2 βi if βi is even and εi = −1, and put n− i = 1 + βi − n+ i , so that (n+ i + n− i ) = (βi + 1) = d(N/M). If (δ1, δ2, . . . , δk) is any k-vector with each δi = ±, then the subspace of oldforms in the oldclass on which Wqi has eigenvalue δi for 1 ≤ i ≤ k has dimension k i=1 nδi i . Hence we are able to compute from our database a complete set of “sub-oldclasses”—that is, subspaces of oldclasses which have the same eigenvalues for all the operators Tp and Wq—with their dimensions. Having thus computed a list of sub-oldclasses with their dimensions, Wq-eigenvalues and first few Tp-eigenvalues, we now proceed to find “new” one-dimensional rational eigenspaces of H+ (N) as follows. We consider each prime in turn, starting with the q which divide N, then moving on to the p which do not divide N, computing Wq or Tp as appropriate. For each, we consider all possible integer eigenvalues (εq = ±1 for Wq, and ap with |ap| < 2 √ p for Tp) and restrict all subsequent operations to each nonzero eigenspace in turn. At any given stage we have a subspace of H+ (N) on which all the operators so far considered act as 28 II. MODULAR SYMBOL ALGORITHMS scalars. Comparing with the oldform data we can tell whether this subspace consists entirely of oldforms: if so, we discard it. If not, and the subspace is one-dimensional, we have found a rational one-dimensional eigenspace corresponding to a newform. We then record a basis vector and a list of the (prime–eigenvalue) pairs needed to isolate this subspace. Otherwise we proceed recursively to the next prime and the next operator. At the end of this stage of the computation in H+ (N), we have found the number of rational one-dimensional “new” eigenspaces in H+ (N), or equivalently, the number of rational newforms in S2(N). For each we have a dual (integer) eigenvector, which we will use to compute a large number of Hecke eigenvalues in Section 2.9. Implementation. In preparation for splitting off the one-dimensional eigenspaces of H+ (N) we compute the matrices of all the W-operators acting on H+ (N), and store their transposes. We also collect from the “oldform database” information about the newforms at all levels M dividing and less than N. For each oldclass we must compute the eigenvalue multiplicities for each Wq using the formula (2.7.1) above. The splitting itself is done recursively. At the general stage, at depth n, we have the following data: • a particular subspace S of H+ (N) (initially the whole of H+ (N)); • a list of n primes (starting with the q dividing N, and initially empty); • a list of eigenvalues, one for each of the primes in the list. Here S is precisely the subspace of H+ (N) on which the first n operators have the given eigenvalues. Given this data, the recursive procedure does the following: (1) check whether S consists entirely of oldforms, by comparing the list of eigenvalues which determine S with those of each “suboldclass”; if so, terminate this branch; (2) otherwise, if dim S = 1 then store the (single) basis vector for S in a cumulative list and terminate; (3) otherwise, take the next operator T in sequence (computing and storing its matrix if it has not been used before) and compute the matrix TS of its restriction to S; for all possible eigenvalues a of T, compute the kernel of TS − aI; if non-trivial, pass the accumulated data, together with this kernel as a new working subspace, to the procedure at the next depth. This procedure has been found to work extremely efficiently in practice. The only practical difficulty is the possibility of overflow during Gaussian elimination; it was found that the early use of W-operators was an efficient way of avoiding this for as long as possible. However, for larger values of N we were forced to abandon single-precision integer arithmetic for the linear algebra at this stage, and instead use a modular method, working in Z/PZ for some large prime P, instead of in Z. Alternatively, one could use multiprecision arithmetic, but this is likely to be slower. In all subsequent calculations in H+ (N), we will be interested only in the one-dimensional eigenspaces corresponding to rational newforms. To enhance the speed we now change the main M-symbol lookup tables: each vector in the table is replaced by the vector of its projections onto each of the subspaces, computed simply by taking the dot product with each dual eigenvector. For each one-dimensional rational eigenspace found, we also compute the eigenvalue εN of the Fricke involution WN , which is the product of all the Wq involutions. The significance of this is that w = −εN is the sign of the functional equation of the L-series L(f, s) attached to the newform f (see [64] and the next section). 2.8 L(f, s) AND THE EVALUATION OF L(f, 1)/Ω(f) 29 2.8 L(f, s) and the evaluation of L(f, 1)/Ω(f) Attached to each newform f in S2(N) there is an L-function L(f, s), defined as follows via Mellin transform: (2.8.1) L(f, s) = (2π)s Γ(s)−1 i∞ 0 (−iz)s f(z) dz z . This gives an entire function of the complex variable s. Substitute the Fourier expansion f(z) = ∞ n=1 a(n, f) exp(2πinz) and integrate term by term; provided that Re(s) > 3/2 (for convergence), we obtain a representation of L(f, s) as a Dirichlet series: (2.8.2) L(f, s) = ∞ n=1 a(n, f) ns . This L-function is one of the key links between the newform f and the modular elliptic curve Ef defined in Section 2.6 by its periods. First of all, the multiplicative relations satisfied by the coefficients a(n, f), given above in Section 2.6, are equivalent to the statement that the Dirichlet series in (2.8.2) has an Euler product expansion: (2.8.3) ∞ n=1 a(n, f) ns = p N 1 − a(p, f)p−s + p1−2s −1 p|N 1 − a(p, f)p−s −1 . This is exactly the form of the L-function of an elliptic curve of conductor N defined over Q, and in fact the fundamental result (see [7], though partial results were known considerably earlier) is that (2.8.4) L(f, s) = L(Ef , s). Thus (2.8.1) provides an analytic continuation to the entire plane of the L-function attached to the curve Ef , such as is conjectured to exist for all elliptic curves E defined over Q. Instead of the function L(f, s) defined above by (2.8.1), it is sometimes convenient to use the variant with extra ‘infinite’ Euler factors: (2.8.5) Λ(f, s) = Ns/2 (2π)−s Γ(s)L(f, s) = ∞ 0 f(iy/ √ N)ys−1 dy. Thus for Re(s) > 3/2 we have Λ(f, s) = Ns/2 (2π)−s Γ(s) ∞ n=1 a(n, f) ns . The functions L(f, s) and Λ(f, s) also satisfy functional equations relating their values at s and 2 − s. For since f is an eigenform for the Hecke algebra T, it is in particular an eigenform for the Fricke involution WN . Suppose that f |WN = εN f with εN = ±1: that is, f(−1/(Nz)) = εN Nz2 f(z). With z = iy/ √ N this gives f(i/y √ N) = −εN y2 f(iy/ √ N). Hence the substitution of 1/y for y in (2.8.5) yields the functional equation (2.8.6) Λ(f, 2 − s) = −εN Λ(f, s) (note the change of sign). In view of (2.8.4), this gives a functional equation for L(Ef , s) too, of the form conjectured for all elliptic curves over Q. 30 II. MODULAR SYMBOL ALGORITHMS From (2.8.6), we deduce that L(f, 1) = Λ(f, 1) = 0 when εN = +1; more generally, L(f, s) has a zero of odd order when εN = +1, and a zero of even order (or no zero) when εN = −1. The significance of this is that the Birch–Swinnerton-Dyer conjectures predict that the order of the zero of L(E, s) is equal to the rank of E(Q), for an elliptic curve E defined over Q. Thus we will be able to compare this order with the rank of the modular curves Ef , once we have found their equations explicitly. The Birch–Swinnerton-Dyer conjectures also predict the value of L(E, 1)/Ω(E), which in the case of our modular curve E = Ef is L(f, 1)/Ω(f), where Ω(f) is a certain period of f. We now discuss the relationship between L(f, 1) and the periods of f (by which we will always mean the periods of the differential 2πif(z)dz). Substituting s = 1 into the Mellin transform formula (2.8.1), we obtain (2.8.7) L(f, 1) = −2πi i∞ 0 f(z)dz = − {0, ∞}, f . The modular symbol {0, ∞} is in the rational homology, so that L(f, 1) is a rational multiple of some period of f. To find the rational factor, we use the trick of “closing the path” (see [38, page 286] or [37]). For each prime p not dividing N we have, by (2.4.1), Tp({0, ∞}) = {0, ∞} + p−1 k=0 {k/p, ∞} = (1 + p){0, ∞} + p−1 k=0 {k/p, 0}, and hence (2.8.8) (1 + p − Tp) · {0, ∞} = p−1 k=0 {0, k/p}. Let ap be the Tp-eigenvalue of f, so that Tpf = apf. Integrating the differential 2πif(z)dz along both sides of (2.8.8) gives (2.8.9) (1 + p − ap) · {0, ∞}, f = p−1 k=0 {0, k/p}, f . Since p does not divide N, each modular symbol {0, k/p} on the right of (2.8.9) is integral: that is, in H1(X0(N), Z). Thus the right-hand side of (2.8.9) is a period of f. It is even a real period, since {0, k/p}, f = {0, −k/p}, f = {0, (p − k)/p}, f . Let Ω0(f) denote the least positive real period of f, and set Ω(f) = 2Ω0(f) if the period lattice of f is rectangular, Ω0(f) otherwise. Thus Ω(f)/Ω0(f) is the number of components of the real locus of the elliptic curve Ef . Also note that in each case, Ω(f) is twice the least real part of a period of f. This is useful since, as we are working in H+ (N), we can only (at this stage) determine the projection of the period lattice Λf onto the real axis. 2.9 COMPUTING FOURIER COEFFICIENTS 31 In both cases, (2.8.9) becomes (2.8.10) L(f, 1) Ω(f) = n(p, f) 2(1 + p − ap) , where n(p, f) is an integer. Note that 1 + p − ap is non-zero, since, by well-known estimates, |ap| < 2 √ p. Formula (2.8.10) is significant in several ways. On the one hand, let Ef be the modular elliptic curve attached to f as above. Then L(Ef , 1) = L(f, 1), and Ω0(f) = Ω0(Ef ), the least positive real period of Ef . Thus, once we know ap and n(p, f) for a single prime p, we can evaluate the rational number L(Ef , 1)/Ω(Ef ), whose value is predicted by the Birch–Swinnerton-Dyer conjecture for Ef . In particular, we should have L(f, 1) = 0 if and only if Ef (Q) is infinite. In the tables we give the value of L(f, 1)/Ω(f) for each rational newform f computed, and observe that the value is consistent with the Birch–Swinnerton-Dyer conjecture in each case. Secondly, having computed the right-hand side of (2.8.10) for a single prime p, we may (if L(f, 1) = 0) use the fact that n(p, f)/(1+p−ap) is independent of p to compute the eigenvalue ap quickly for other p, by computing n(p, f). This is discussed in the next section. 2.9 Computing Fourier coefficients For each one-dimensional rational eigenspace of H+ (N) we will need to know many Fourier coefficients a(n, f) of the corresponding newform f(z) = a(n, f) exp(2πinz). These are obtained from the Hecke eigenvalues by the recurrence formulae given in Section 2.6. We already have the eigenvalue εq of each Wq operator, and at least one eigenvalue ap0 for the smallest prime p0 not dividing N, which we recorded as we found the one-dimensional eigenspaces earlier. It remains to compute a large number of the Hecke eigenvalues ap for primes p not dividing N. If L(f, 1) = 0 then the most efficient method is to use (2.8.10). First we compute n(p0, f) from the right-hand side of (2.8.8). (This integer is nonzero if and only if L(f, 1) = 0, by (2.8.10)). For other primes p we then have n(p, f) 2(1 + p − ap) = n(p0, f) 2(1 + p0 − ap0 ) , and hence ap = 1 + p − n(p, f)(1 + p0 − ap0 ) n(p0, f) . The integers n(p, f) may be computed by expressing the right-hand side of (2.8.8) as a linear combination of the M-symbols which generate H+ (N), and then projecting onto the one-dimensional subspace corresponding to f: here we take the dot product with the dual eigenvector computed previously, normalized so that its components are relatively prime integers. The integer this produces is then actually too big by a scaling factor d1d2, where d1 and d2 are the denominators defined in Section 2.2; this factor can be ignored at this stage, where it cancels out in the computation of ap, but must be included when we need the actual ratio L(f, 1)/Ω(f) from (2.8.10). If L(f, 1) = 0 then a variation of this method may be used. For α ∈ Q we have (2.9.1) (1 + p − Tp){α, ∞} = {α, pα} + p−1 k=0 α, α + k p = {0, pα} + p−1 k=0 0, α + k p − (p + 1){0, α}. 32 II. MODULAR SYMBOL ALGORITHMS If p does not divide N and α = n/d with gcd(d, N) = 1 then [0] = [pα] = [(α + k)/p] for all k, so that the right-hand side of (2.9.1) lies in the integral homology H1(X0(N), Z). Hence we can express it as an integral linear combination of the generating M-symbols. Projecting onto the rational one-dimensional subspace of H+ (N) corresponding to f, we find that (2.9.2) Re {α, ∞}, f Ω(f) = n(α, p, f) 2(1 + p − ap) for some integer n(α, p, f), where the left-hand side is independent of p. Thus we can compute each ap from n(α, p, f), given ap0 and n(α, p0, f), provided that the latter in nonzero. It is slightly simpler to use a modular symbol of the form {0, α} here instead of {α, ∞}, since (for suitable α) this will be integral. However the formula analogous to (2.9.1) has more terms of the form {0, β} on the right, so this is slower in practice. Remark. Equation (2.9.1) and the remarks following it show that the modular symbol {α, ∞} lies in the rational homology H1(X0(N), Q) provided that the denominator of α is coprime to N. More generally, for an arbitrary rational number α, the right-hand side of (2.9.1) will be integral provided that p ≡ 1 (mod N); this proves that {α, ∞} ∈ H1(X0(N), Q) in all cases, which is the Manin-Drinfeld Theorem (Theorem 2.1.3) for Γ0(N). Implementation. In practice we only use the first method if L(f, 1) = 0 for all the rational newforms f in S2(N). Otherwise we find a rational α such that n(α, p0, f) = 0 for all f, where p0 is the smallest prime not dividing N. We have already discussed computation of the integers n(p, f). The n(α, p, f) are computed similarly by expressing the right-hand side of (2.9.1) in terms of the generating M-symbols and projecting onto each eigenspace. Note that the term {0, α} of (2.9.1) need only be computed once. The Hecke eigenvalues which we have computed are stored in a data file for use both in subsequent steps of the calculations at level N, and also as part of the cumulative database which will be accessed when levels which are multiples of N are reached. The exact number of ap needed depends on N, and on the form f, and will not be known until the numerical calculation of periods is carried out in the next phase. Our strategy here was first to compute ap for all p up to some predetermined bound (we used all p < 1000 for N ≤ 200, p < 2000 for 200 < N ≤ 400, and p < 3000 for 401 < N ≤ 1000). We may also store extra information, so that if more eigenvalues are needed later, these can be computed without having to repeat the time-consuming steps described in Sections 2.1–2.7. Specifically, we may store the following: the M-symbols which generate H+ (N); a table giving each M-symbol as a linear combination of these generators; a basis for ker(δ); and a (dual) basis vector for each rational one-dimensional eigenspace. Recapitulation. At this point we have completed the first phase of the computation at level N, in which we have been working in the space H+ (N). To summarize, we know (1) the number of rational newforms f in S2(N); and, for each f, (2) the sign w of the functional equation for L(f, s); (3) the ratio L(f, 1)/Ω(f); (4) all Wq-eigenvalues εq of f; (5) a large number of Tp-eigenvalues ap of f. In particular, we know the number of modular elliptic curves Ef of conductor N (up to isogeny); for each curve, we know the sign of its functional equation and whether or not its L-series vanishes at s = 1. All computations carried out so far are exact and algebraic. In addition, we can also in this first phase compute approximations to the value L(f, 1) (when it is non-zero) and to the 2.10 COMPUTING PERIODS I 33 period Ω(f), though we do not know at this stage whether Ω(f)/Ω0(f) = 1 or 2. In other words, we can compute the projection of the period lattice Λf onto the real axis. Of course, this is insufficient information from which to construct the curve Ef . In the second phase, which we describe in Sections 2.10–2.14, we compute the period lattice Λf of each rational newform f, and hence obtain an (approximate) equation for the curve C/Λf . These “analytic” quantities (periods) will necessarily be computed approximately, by summing certain infinite series whose coefficients involve the Fourier coefficients of f (see below). In order to achieve sufficient accuracy, we may have to compute many thousands of these Fourier coefficients, and it is therefore necessary to have efficient ways of doing this, such as the method described in this section. 2.10 Computing periods I In order to compute the full period lattice Λf for each rational newform f found earlier, we have to work in the full space H(N). By working in H+ (N) we could only compute the real period Ω0(f). Although we could also compute the least imaginary period Ωim(f) by working similarly in H− (N) (which would be slightly faster), the lattice spanned by Ω0(f) and Ωim(f) may have index 2 in Λf . Hence from now on we work in H(N). We begin by computing H(N) using M-symbols as in Section 2.2 (omitting relations (2.5.1)). Let γ1, γ2, . . . , γ2g be a Z-basis for H1(X0(N), Z) (and hence also a Q-basis for H(N)). Using this basis we will identify H(N) with the space of rational column vectors, and dual vectors will be represented by row vectors. Next we read from the data file (created during the first phase) the number of rational newforms and, for each, the eigenvalues ap and εq. For each form f we now compute two integer dual (row) eigenvectors with eigenvalues ap and εq for all p and q: one, v+ , with eigenvalue +1 for the ∗ operator, and one, v− , with eigenvalue −1. This is much faster than repeating the splitting step described in Section 2.7, since we already know the eigenvalues which determine each one-dimensional eigenspace. As before, the eigenvectors v± we compute must be dual eigenvectors, since we will use them for projecting onto the eigenspaces in question. Let γ± ∈ H± (N) (respectively) be eigenvectors with the same eigenvalues as v± , such that v+ γ+ = v− γ− = 1. We view γ± as column vectors in Q2g by expressing them as linear combinations of the basis γ1, γ2, . . . , γ2g for H(N). Thus the product v+ γ+ is the product of a row vector by a column vector: essentially a dot product. Set x = γ+ , f and y = −i γ− , f (so that x, y ∈ R). We do not actually compute these vectors γ± in practice; they are only needed for this exposition, as they determine the real numbers x and y. Moreover, although the eigenvectors v± which we do use are only determined up to a scalar multiple, we shall see that this choice does not (as it should not) affect the specific period lattice we obtain. Let γ = 2g j=1 cjγj be an arbitrary integral cycle in H(N). We identify γ with the column vector with component cj. Then we have (2.10.1) γ, f = (v+ γ)γ+ + (v− γ)γ− , f = (v+ γ)x + (v− γ)yi. The period lattice Λf is the set of all such integral periods γ, f . To determine a Z-basis for Λf we proceed as follows. Write v+ = (a1, a2, . . . , a2g) and v− = (b1, b2, . . . , b2g) with aj, bj ∈ Z. Then as a special case of (2.10.1) we have γj, f = ajx + bjyi, since v+ γj = aj and v− γj = bj. Hence Λf is spanned over Z by the 2g periods γj, f = ajx + bjyi. Let Λ be the Z-span in Z2 of the 2g pairs (aj, bj), and let (λ1, µ1), (λ2, µ2) be a 34 II. MODULAR SYMBOL ALGORITHMS Z-basis for Λ. Then we find that Λf = { γ, f | γ ∈ H(N)} = Zω1 + Zω2, where (2.10.2) ωj = λjx + µjyi (j = 1, 2). Thus ω1 and ω2 form a Z-basis for Λf . We may compute (λ1, µ1) and (λ2, µ2) from v+ and v− using the Euclidean algorithm in Z. In fact it is easy to see that there are only two possibilities, since v± are determined within the subspace they generate by being the +1 and −1 eigenvectors for an involution. Normalize v± so that each is primitive in Z2g ; that is, gcd(a1, . . . , a2g) = gcd(b1, . . . , b2g) = 1. In the first case (which we will call “Type 1”), v+ ≡ v− (mod 2), and we may take (λ1, µ1) = (2, 0) and (λ2, µ2) = (1, 1), so that ω1 = 2x and ω2 = x + yi. In this case Ω(f) = Ω0(f) and the elliptic curve has negative discriminant. In the second case (“Type 2”), when v+ and v− are independent modulo 2, we will be able to take (λ1, µ1) = (1, 0) and (λ2, µ2) = (0, 1), so that ω1 = x and ω2 = yi. In this case the period lattice is rectangular, Ω(f) = 2Ω0(f), and the elliptic curve has positive discriminant. It remains to compute the real numbers x and y. We describe two methods: the first computes periods directly, while the second computes them indirectly by computing L(f ⊗χ, 1) for suitable quadratic characters χ. The latter method is in certain cases more accurate (in that fewer ap are needed for the same accuracy) but cannot be used when N is a perfect square, as we shall see below. Observe that the cycles γ± do not enter into the calculations directly, but are merely used to define x and y. Also, if either v+ or v− is replaced by a scalar multiple of itself, then γ+ and γ− (and hence x and y) are scaled down by the same amount, but λj and µj are scaled up. In particular, it is no loss of generality to assume that v± are primitive integer vectors. Thus (2.10.2) defines ω1 and ω2 unambiguously, as generators of the full period lattice of f. Direct method. The simplest method is essentially the same as that used by Tingley in [67]. Using a recent improvement (see [18]), this method can now be made to converge as well as the indirect method described later. From (2.10.1) it suffices to compute γ, f for a single cycle γ such that v+ γ and v− γ are both nonzero; then by taking real and imaginary parts we can solve (2.10.1) for x and y and compute the periods ω1 and ω2 from (2.10.2). (In some cases it may be better in practice to use two different cycles, one for the real period and one for the imaginary period, but for simplicity we will assume that this is not the case.) We denote by If (α, β) the integral If (α, β) = β α 2πif(z)dz, and set If (α) = If (α, ∞). Let M ∈ Γ0(N); since f is holomorphic, the period integral If (α, M(α)) is independent of the basepoint α, and can be expressed as If (α) − If (M(α)). We will denote this period of f by Pf (M). Note that any cycle γ ∈ H(N) can be expressed as {α, M(α)} for a suitable matrix M ∈ Γ0(N), and then γ, f = Pf (M). The map M → Pf (M) is (by Corollary 2.1.2) a group homomorphism from Γ0(N) to the additive group of complex numbers, whose image is the period lattice Λf . Our basic tool for computing periods is the following easy result. Proposition 2.10.1. Let z0 = x0 + iy0 ∈ H, so that y0 > 0. Let f be a cusp form of weight 2 with Fourier coefficients a(n, f). Then (2.10.3) If (z0) = ∞ z0 2πif(z)dz = − ∞ n=1 a(n, f) n e2πinx0 e−2πny0 . 2.10 COMPUTING PERIODS I 35 Proof. Using the Fourier expansion f(z) = ∞ n=1 a(n, f) exp(2πinz), we can integrate termby-term over a vertical path from z0 to ∞ to obtain the result. The term-by-term integration is justified since the series converges absolutely, since |a(n, f)| n and y0 > 0. We can sum the series (2.10.3) to obtain an approximation to If (z0), provided that we have sufficiently many Fourier coefficients a(n, f). The important point to notice is that this series is a power series in exp(−2πy0) (with bounded coefficients since |a(n, f)| < n for all n), so will converge best when y0 is large (or at least, not too small). Suppose we are given a matrix M = a b cN d ∈ Γ0(N), where a, b, c, d ∈ Z, and we wish to compute the associated period Pf (M) = If (α) − If (M(α)) of f. How should we choose α? If α has large imaginary part, then M(α) will tend to have a small imaginary part; we would like to maximize both of these simultaneously. The simplest solution, used by Tingley in his thesis [67] for the original computations of modular elliptic curves2 , is to choose α = −d + i cN , so that M(α) = a + i cN . Thus both α and M(α) have imaginary part (cN)−1 . (Note that, by replacing M by −M if necessary, we may assume that c > 0; we are not interested in M with c = 0 since these are parabolic, and hence have zero period.) Hence we obtain the following. Proposition 2.10.2. Let f ∈ S2(N). Then, for all M = a b cN d ∈ Γ0(N), the period Pf (M) is given by (2.10.4) Pf (M) = If (α) − If (M(α)), where α ∈ H is arbitrary. Taking α = −d+i cN , we have: (2.10.5) Pf (M) = If −d + i cN − If a + i cN = ∞ n=1 a(n, f) n e−2πn/cN e2πina/cN − e−2πind/cN . To use this result, we take a rational number b/d with denominator d coprime to N, solve ad − bcN = 1 for a and c, and set M = a b cN d . The integral cycle γ = {0, b/d} should have the properties that v+ γ and v− γ are both nonzero; also, since y0 = 1/(Nc) with c > 0 we should also choose b/d so that c is as small as possible, to speed convergence in the series (2.10.5). This series converges adequately quickly for small N, but when N increases we require too many terms in order to obtain the periods to sufficient precision. (Not only does it take longer to sum the series when we use more terms, but more significantly, computing the coefficients a(n, f) by modular symbols becomes more expensive as n increases.) The series (2.10.5) is a power series in exp(−2π/cN) for some small positive integer c; at best we might hope to use c = 1 and have a power series in exp(−2π/N). We can improve this, however, to give a better formula which involves power series in exp(−2π/d √ N) for a small positive integer d. This greatly improves the convergence of the series. 2and also used in the first edition of this book 36 II. MODULAR SYMBOL ALGORITHMS In order to do this, we make use of the fact that the newform f is an eigenform for the Fricke involution WN , which for brevity we will here denote simply W. Thus (as in Section 2.8 above) we have f(z) = ε (f | W)(z) where ε = ±1 is the Fricke eigenvalue. By changing variables in the integrals, we see that (2.10.6) If (W(α), W(β)) = If|W (α, β) = ε If (α, β). In particular, if β = W(α) we obtain If (α, W(α)) = −εIf (α, W(α)), so that when ε = +1 we have If (α, W(α)) = 0 for all α. Assume we are in this case (ε = +1). Then in any period integral, we may replace an endpoint α with W(α) without affecting the value of the integral. In particular, Pf (M) = If (α, M(α)) = If (W(α), M(α)). Setting α = di/( √ N − cNi) we find that M(α) = b d + i d √ N and W(α) = c d + i d √ N , which both have the same imaginary part 1/d √ N. (We may assume that d > 0, again by replacing M by −M if necessary.) Hence Pf (M) = If (W(α)) − If (M(α)) = If ( c d + i d √ N ) − If ( b d + i d √ N ), where both integrals converge relatively well. When ε = −1, we can obtain a slightly more complicated result which is just as good in practice. Combining both cases gives the following. Proposition 2.10.3. Let f ∈ S2(N), such that f | W = εf with ε = ±1. Then for all M = a b cN d ∈ Γ0(N) the period Pf (M) is given by (2.10.7) Pf (M) = (1 − ε)If (i/ √ N) + εIf (W(α)) − If (M(α)), where α ∈ H is arbitrary. Taking α = M−1 ( b d + i d √ N ), so that W(α) = c d + i d √ N , we have (2.10.8) Pf (M) = (1 − ε)If (i/ √ N) + εIf c d + i d √ N − If b d + i d √ N = ∞ n=1 a(n, f) n (ε − 1)e−2πn/ √ N + e−2πn/d √ N e2πinb/d − εe2πinc/d . Proof. Using W(i/ √ N) = i/ √ N, we simply compute: If (α, M(α)) = If (α, i/ √ N) + If (i/ √ N, W(α)) + If (W(α), M(α)) = εIf (W(α), i/ √ N) + If (i/ √ N, W(α)) + If (W(α), M(α)) = (1 − ε)(If (i/ √ N) − If (W(α))) + If (W(α)) − If (M(α)) = (1 − ε)If (i/ √ N) + εIf (W(α)) − If (M(α)) which establishes (2.10.7). Then (2.10.8) follows from (2.10.3), using the value of α defined before. 2.11 COMPUTING PERIODS II: INDIRECT METHOD 37 Note that the term (1 − ε)If (i/ √ N), which appears in (2.10.7), is equal to −L(f, 1), by (2.11.1) below. Hence this term is zero unless the analytic rank of f is zero. When we use this method for computing the periods, before proceeding to the next stage we store the following data: type, M, v+ γ, v− γ. Here type = 1 or 2 denotes the lattice type, M is a matrix in Γ0(N) such that γ = {0, M(0)}, and the integers v+ γ and v− γ are nonzero. Then we will be able to compute the periods from stored data quickly without having to recompute H(N) or the eigenvectors v± . We compute the period Pf (M) using (2.10.8), set x = Re(Pf (M))/v+ γ and y = Im(Pf (M))/v− γ from (2.10.1), and take the period lattice Λf to be the lattice with Z-basis 2x, x + yi (if type 1) or x, yi (if type 2). If we later find that we need greater accuracy here, then after computing more ap, we can obtain more accurate values for the periods ω1 and ω2 very quickly, without having to repeat the expensive calculation in H(N). 2.11 Computing periods II: Indirect method The idea here is to compute Ω(f) indirectly by computing L(f, 1) and dividing by the ratio L(f, 1)/Ω(f), which we know from (2.8.10). If L(f, 1) = 0, and in any case to find the imaginary period, we can use the technique of twisting by a quadratic character χ, since the value L(f ⊗χ, 1) is a rational multiple of a real or imaginary period of f (depending on whether χ(−1) = +1 or −1), and is non-zero for suitable χ. We are also interested in the value of L(f, 1) for its own sake, in relation to the Birch–Swinnerton-Dyer conjecture for the modular curve Ef . We will return to this, and the method of computing L(r) (f, 1) for r > 0, in Section 2.13. If L(f, 1) = 0, then we may compute L(f, 1) accurately from (2.8.7) as follows. Let εN = ±1 be the eigenvalue of the Fricke involution WN on f. Then in the notation of the previous section, using (2.10.6) and WN (i/ √ N) = i/ √ N: (2.11.1) L(f, 1) = − i∞ 0 2πif(z)dz = If (∞, 0) = If (∞, i/ √ N) + If (i/ √ N, 0) = If (∞, i/ √ N) + εN If (i/ √ N, ∞) = (εN − 1)If (i/ √ N). Thus if L(f, 1) = 0, then necessarily εN = −1, and in this case L(f, 1) = −2If (i/ √ N). Using Proposition 2.10.1 then gives the following result. Proposition 2.11.1. If f(z) = ∞ n=1 a(n, f) exp(2πinz) ∈ S2(N) and f |WN = −f then (2.11.2) L(f, 1) = 2 ∞ n=1 a(n, f) n exp(−2πn/ √ N). Remark. If in (2.11.1) we split the range of integration at Ai/ √ N for some positive real number A (instead of taking A = 1) then we obtain the more general formula L(f, 1) = ∞ n=1 a(n, f) n exp(−2πAn/ √ N) − εN exp(−2πn/A √ N) , 38 II. MODULAR SYMBOL ALGORITHMS where the right-hand side is independent of A. This can be useful in situations where we do not know the value of εN , since we can evaluate this expression for two values of A, say A = 1 and A = 1.1, and check that the values obtained are approximately the same. For only one of the two possible values of εN will this happen. This idea is due to H. Cohen (see [9, Section 7.5]). More generally, let l be an odd prime not dividing N, and χ the quadratic character modulo l. Define (f ⊗ χ)(z) = ∞ n=1 χ(n)a(n, f) exp(2πinz) and L(f ⊗ χ, s) = (2π)s Γ(s)−1 i∞ 0 (−iz)s (f ⊗ χ)(z) dz z ; then for Re(s) > 3/2 we can integrate term-by-term to obtain L(f ⊗ χ, s) = ∞ n=1 χ(n)a(n, f)n−s . Suppose, as above, that f |WN = εN f. Then f ⊗ χ is in S2(Nl2 ), and (f ⊗ χ) |WNl2 = χ(−N)εN f ⊗ χ (special case of equation (14) in [64]). Hence we can immediately generalize Proposition 2.11.1 to obtain the following. Proposition 2.11.2. Let f be as above. Let l be an odd prime not dividing N. If χ(−N) = εN then L(f ⊗ χ, 1) = 0, while if χ(−N) = −εN , then (2.11.3) L(f ⊗ χ, 1) = 2 ∞ n=1 χ(n)a(n, f) n exp(−2πn/l √ N). The values L(f ⊗ χ, 1) are related to the periods of f by a formula similar to (2.8.10). Let g(χ) be the Gauss sum attached to χ: if l ≡ 1 (mod 4) then χ(−1) = +1 and g(χ) = √ l, while if l ≡ 3 (mod 4) then χ(−1) = −1 and g(χ) = i √ l. If we set l∗ = χ(−1)l then in all cases we have g(χ) = √ l∗. By [64, equation(12)] we have f ⊗ χ = g(χ) l l−1 k=0 χ(−k)f l k 0 l . Hence L(f ⊗ χ, 1) = − {0, ∞}, f ⊗ χ = − g(χ) l χ(−k) {0, ∞}, f l k 0 l = − g(χ) l χ(−k) {k/l, ∞} , f = χ(−1) g(χ) l γl, f = 1 √ l∗ γl, f , 2.11 COMPUTING PERIODS II: INDIRECT METHOD 39 where γl = l−1 k=0 χ(k) {0, k/l} . Here we have used the identity χ(k) = 0. Since l does not divide N, the cycle γl is in the integral homology. Thus for each prime l not dividing 2N we can define an integral period P(l, f) = γl, f , and we have shown that P(l, f) = √ l∗L(f ⊗ χ, 1). Clearly (γl)∗ = χ(−1)γl, since {0, k/l}∗ = {0, −k/l}. So, if χ(−1) = +1, then γl ∈ H+ (N), hence P(l, f) is an integer multiple of the real period Ω0(f), and thus of the form m+ (l, f)x for some integer m+ (l, f) . So, provided that m+ (l, f) = 0, we have (2.11.4) x = √ l L(f ⊗ χ, 1) m+(l, f) = P(l, f) m+(l, f) . In practice, if we express γl as a linear combination of the basis cycles γj and thus view it as a column vector, then m+ (l, f) = v+ γl. Similarly, if χ(−1) = −1 then γl ∈ H− (N), and P(l, f) = m− (l, f)yi, where m− (l, f) = v− γl is an integer, so that if m− (l, f) = 0 then (2.11.5) y = √ l L(f ⊗ χ, 1) m−(l, f) = P(l, f) im−(l, f) . Assuming that N is not a perfect square, we find the smallest primes l+ ≡ 1 (mod 4) and l− ≡ 3 (mod 4) (not dividing N) such that m+ = m+ (l+ , f) and m− = m− (l− , f) are nonzero. A necessary (but not sufficient) condition for this to be true is that for the associated quadratic characters, χ1(−N) = χ2(−N) = −εN ; for if χ(−N) = εN then the sign of the functional equation for L(f ⊗ χ, s) is −1, and hence L(f ⊗ χ, 1) = 0. Suitable primes always exist, provided that N is not a perfect square, by a theorem of Murty and Murty (see [44]). We then compute L(f ⊗ χj, 1) for j = 1, 2 from (2.11.3), obtain x and y from (2.11.4) and (2.11.5), and finally substitute in (2.10.2) as before to obtain the periods ω1 and ω2. If N is a square, however, then χ(−N) = χ(−1) for all primes l not dividing 2N; hence we will only be able to find the real period this way if εN = −1, and only the imaginary period if εN = +1. Rather than seek a way round this difficulty we always use the “direct” method to compute the periods when N is square. To assist convergence in (2.11.3) we clearly want to choose l as small as possible. It is a simple matter to estimate the error obtained in truncating the series (2.11.3) for L(f ⊗ χ, 1) at a certain point n = nmax. In practice we may use this to estimate the number of eigenvalues ap needed to obtain the desired accuracy. However, to save time, we did not in all cases compute this many ap, if the computed values of c4 and c6 (see Section 2.14) were close to integers, and when rounded led us to the coefficients of an elliptic curve of conductor N. Note that, apart from the numerical evaluation of the periods P(l± , f) (using the series (2.11.3) for L(f ⊗ χ, 1)), all these computations are purely algebraic: we express the cycles γl in terms of our homology basis using continued fractions, and take the dot products of the resulting column vectors with our dual eigenvectors v± to obtain the integers m± . The result of this algebraic computation consists of the following data for each rational newform f: primes l± congruent respectively to ±1 modulo 4; nonzero integers m± ; and the 40 II. MODULAR SYMBOL ALGORITHMS type (1 or 2) of the lattice. As in the direct method, before proceeding we store the following data for each newform f: type, l+ , m+ , l− , m− . To compute the lattice from this data set of five integers, we compute the periods P(l± , f) using formula (2.11.3), divide by m± respectively to obtain x and y, and take Λf to be the lattice with Z-basis 2x, x+yi (if type 1) or x, yi (if type 2). In practice we store just these five integers, and recompute the periods when we need them. In particular, if at the first attempt we are unable to compute the integer invariants c4, c6 of the curve Ef to sufficient precision to recognize them, then we will return to H+ (N) in order to compute more Hecke eigenvalues, and then recompute the periods to greater precision without having to recompute H(N). Tricks and shortcuts. In fact, the data l+ and m+ can be computed earlier in the first H+ (N) phase, since they only depend on the real projection of the period lattice. Hence we can already compute the real period x from the data we have from the first phase. Moreover, it is easy to find a suitable prime l− once we know the Hecke eigenvalues of f, by numerically computing P(l, f) for several primes l ≡ −1 (mod 4) until we find a value which is clearly non-zero. It follows that the only purpose of the extremely expensive second phase of the computation, working in H(N), is to determine the integer factor m− and the type of the lattice. An alternative approach, which we have used systematically for larger levels (N > 3200), is simply to guess the value of m− by trying each positive integer m in turn. For each m ≥ 1 we set y = P(l− , f)/m and test the two possible lattices (one of each type). If either lattice has approximate integer invariants c4 and c6, and the rounded integral values are valid invariants of an elliptic curve over Q, and the resulting curve has conductor N, then we store for later use the successful value m− of m and the type, and consider the curve Ef we have found as a possible candidate for the actual modular elliptic curve Ef . The curves Ef and Ef are certainly isogenous; they even have the same real period. In many cases, the curve Ef has no rational isogenies; in such a case we can conclude that Ef = Ef with no ambiguity. In any case, we can compute the isogeny class of curves isogenous to Ef via rational isogenies, and the only loss is that we do not always know exactly which curve in the class is the “strong Weil curve” Ef . (A further disadvantage is that we cannot compute the degree of the modular parametrization of Ef , as this requires knowledge of H(N): see Section 2.15 below.) The great advantage of this method is that in only a few seconds computation time, as soon as we have a rational newform, we can (almost always) write down an associated curve Ef ; before this was implemented, it could take many hours of computation time to determine H(N), find the eigenvectors v± , and hence determine the factor m− and the lattice type, before we could compute Ef . Finally we discuss some variants of the trick just described. 1. We may use the same trick to find l+ and m+ if we have not computed them earlier. Then we are obtaining the period lattice and equation of the curve using only the Fourier coefficients of f (i.e. the coefficients of the L-series of the curve); the sign of the functional equation; and the conductor N. No modular symbol information at all is needed in this case. In fact, one may even guess the sign of the functional equation if all one has is the L-series; see the remark after Proposition 2.11.1. 2. Let l1 and l2 be two primes ≡ −1 (mod 4) for which −N has the correct quadratic character, so that P(l1, f) and P(l2, f) are both not trivially zero. We may compute the periods P(lj, f); assume that these are nonzero (or use different primes lj). We know that there exist nonzero integers mj such that P(lj, f) = mjyi for j = 1, 2. Therefore P(l2, f)/P(l1, f) = 2.12 COMPUTING PERIODS III: EVALUATION OF THE SUMS 41 m2/m1, and we may compute a floating point approximation to this rational number. In practice (provided we have many Fourier coefficients, and the primes lj are small) we will be able to recognize this rational number (say by using continued fractions). Its denominator is a factor of the unknown integer m1. If we do this for several different values of l2 (with the same l1) then the least common multiple of the denominators may give us a nontrivial factor of m1, and then in our search for the exact value we may restrict to multiples of this factor. This is useful in practice. 3. Another possibility, which we have not implemented, is to compute H− (N) in order to determine m− exactly, as we do m+ from H+ (N). This would be no harder than the original computation of H+ (N), and in fact it would be easier to find the eigenvector corresponding to each newform f, since we already know its eigenvalues. The result would be that we would have computed exactly all the data we need in a shorter time than would be required for computing H(N), except for the type of the lattice. Then the only ambiguity is that we would not know the type, and would have to try both types to see which results in a curve with integral coefficients. If both types succeeded (as does happen), we would only know the curve Ef up to a 2-isogeny. 2.12 Computing periods III: Evaluation of the sums The results of the previous two sections express the periods of a rational newform f(z) = a(n, f) exp(2πinz), and the value L(f, 1), in terms of various infinite series, each of the form a(n, f)c(n). In each case the factor c(n) is a simple function of n, but the coefficient a(n, f) must be computed more indirectly from the a(p, f) for prime p as in Section 2.9. In practice we will know a(p, f) for the first few primes, say p ≤ pmax. An elegant and efficient recursive procedure for summing a series of the form a(n)c(n) over {n : 1 ≤ n ≤ nmax, and p|n ⇒ p ≤ pmax}, with a(n) defined in a similar recursive manner, was described in [5, pages 27–28]. This method has the advantage of minimizing the number of multiplications involved and the number of a(n) which must be stored. Also, if some a(n) = 0 then there is a whole class of integers m for which a(m) = 0 that the procedure avoids automatically. Although in our program this part of the computation was not critical for either time or storage space, we found this algorithm to be very useful. It may also be applied in other similar situations for other kinds of modular forms: we have ourselves used it in [14], with cusp forms of weight 2 for Γ1(N), and also in our work over imaginary quadratic fields. To evaluate such a sum, assume that the array p[i] hold the first pmax primes pi, and that the array ap[i] holds the coefficients ap[i] = a(pi) for pi ≤ pmax. We can evaluate the sum a(n)c(n) over all n ≤ nmax all of whose prime divisors are less than or equal to pmax with the following pseudo-code. Algorithm for recursively computing a multiplicative sum 1. BEGIN 2. Sum = c(1); 3. FOR i WHILE p[i] ≤ pmax DO 4. BEGIN 5. add(p[i],i,ap[i],1) 6. END 7. END 42 II. MODULAR SYMBOL ALGORITHMS (Subroutine to add the terms dependent on p) subroutine add(n,i,a,last a) 1. BEGIN 2. IF a=0 THEN j0 = i ELSE Sum = Sum + a*c(n); j0 = 1 FI; 3. FOR j FROM j0 TO i WHILE p[j]*n ≤ nmax DO 4. BEGIN 5. next a = a*ap[j]; 6. IF j=i AND (N ≡ 0 (mod p[j])) THEN 7. next a = next a - p[j]*last a 8. FI; 9. add(p[j]*n,j,next a,a) 10. END 11. END Here the recursive function add(n,i,a,last a) is always called under the following conditions: (i) pi = p[i] is the smallest prime dividing n = n; (ii) a = a(n); (iii) last a = a(n/pi). The procedure for n calls itself with pn in place of n, for all primes p ≤ pi, having first computed next a = a(pn) using the recurrence formulae from Section 2.6; if a(n) = 0 then only p = pi need be used, since then a(pn) = a(p)a(n) = 0 for all p < pi. 2.13 Computing L(r) (f, 1) In investigating the Birch–Swinnerton-Dyer conjecture for the modular curves Ef we will need to compute the numerical value of the rth derivative L(r) (Ef , 1) = L(r) (f, 1), where r is the order of L(f, s) at s = 1. This integer r is sometimes called the ‘analytic rank’ of the curve Ef , since it is also, according the the Birch–Swinnerton-Dyer conjecture, the rank of Ef (Q). Following earlier work with examples of rank 0 and 1, this computation was carried out by Buhler, Gross and Zagier in [6], for the curve of conductor 5077 and rank 3. Their method works in general, and we describe it here. Recall the definition of Λ(f, s) from Section 2.8: (2.8.5) Λ(f, s) = Ns/2 (2π)−s Γ(s)L(f, s) = ∞ 0 f(iy/ √ N)ys−1 dy. Let the WN -eigenvalue of f be ε. Using f(−1/Nz) = εNz2 f(z) we obtain Λ(f, s) = ∞ 1 f(iy/ √ N) ys−1 − εy1−s dy (from which the functional equation (2.8.6) follows immediately). Differentiating k times with respect to s gives Λ(k) (f, s) = ∞ 1 f(iy/ √ N)(log y)k ys−1 − ε(−1)k y1−s dy, so at s = 1 we have Λ(k) (f, 1) = (1 − (−1)k ε) ∞ 1 f(iy/ √ N)(log y)k dy. 2.13 COMPUTING L(r)(f, 1) 43 Trivially this gives Λ(k) (f, 1) = 0 if ε = (−1)k . In particular, since Λ(r) (f, 1) = 0, by definition of r, we must have (−1)r = −ε so that r is even if and only if ε = −1. Hence setting k = r, we have Λ(r) (f, 1) = 2 ∞ 1 f(iy/ √ N)(log y)r dy = 2 ∞ n=1 a(n, f) ∞ 1 exp(−2πny/ √ N)(log y)r dy.(2.13.1) If r = 0, of course, we recover the formula Λ(f, 1) = √ N π ∞ n=1 a(n, f) n exp(−2πn/ √ N) which agrees with (2.11.2) since Λ(f, 1) = ( √ N/2π)L(f, 1). Now assume that r ≥ 1. Integrating (2.13.1) by parts gives Λ(r) (f, 1) = r √ N π ∞ n=1 a(n, f) n ∞ 1 exp(−2πny/ √ N)(log y)r−1 dy y . Since Λ(f, s) vanishes to order r at s = 1 we have L(r) (f, 1) = (2π/ √ N)Λ(r) (f, 1), and hence the following result. Proposition 2.13.1. Let f be a newform in S2(N) with WN -eigenvalue ε, and suppose that the order of L(f, s) at s = 1 is at least r, where ε = (−1)r−1 . Then (2.13.2) L(r) (f, 1) = 2r! ∞ n=1 a(n, f) n Gr 2πn √ N where Gr(x) = 1 (r − 1)! ∞ 1 e−xy (log y)r−1 dy y . In order to evaluate the series in (2.13.2) we may use the summation procedure of the preceding section, provided that we are able to compute the function Gr(x). When r = 1, G1(x) is the exponential integral ∞ 1 e−xy dy/y, which may be evaluated for small x (say x < 3) by the power series (2.13.3) G1(x) = log 1 x − γ − ∞ n=1 (−x)n n · n! where γ is Euler’s constant 0.577 . . .. For larger x (say x > 2) it is better to use the continued fraction expansion G1(x) = e−x x + 1 1 + 1 x + 2 1 + 2 x + 3 1 + . . . . 44 II. MODULAR SYMBOL ALGORITHMS To generalize the series (2.13.3) for G1(x), we observe that the functions Gr(x) satisfy the functional equations Gr(x) = (−1/x)Gr−1(x), with G0(x) = e−x . It follows that Gr(x) = Pr log 1 x + ∞ n=1 (−1)n−r nr · n! xn where Pr(t) is a polynomial of degree r satisfying Pr(t) = Pr−1(t) and P0(t) = 0. From our earlier expression for G1(x) we see that P1(t) = t − γ. In general Pr(t) = Qr(t − γ) where Q1(t) = t; Q2(t) = 1 2 t2 + π2 12 ; Q3(t) = 1 6 t3 + π2 12 t − ζ(3) 3 ; Q4(t) = 1 24 t4 + π2 24 t2 − ζ(3) 3 t + π4 160 ; Q5(t) = 1 120 t5 + π2 72 t3 − ζ(3) 6 t2 + π4 160 t − ζ(5) 5 − ζ(3)π2 36 . For N < 5077 we always found that r ≤ 2, and determining the value of r in such cases is easy. Certainly r = 0 if and only if L(f, 1) = 0, which can be determined algebraically by (2.8.10). When L(f, 1) = 0 and ε = +1 we know that r is odd; by computing L (f, 1) to sufficient precision using (2.13.2) we could verify that L (f, 1) = 0, so that r = 1. Similarly, when L(f, 1) = 0 and ε = −1, we know that r is even and at least 2, and we could check that r = 2 by computing L (f, 1) to sufficient precision to be certain that L (f, 1) = 0. In higher rank cases we have the problem of deciding whether L(k) (f, 1) = 0, since no approximate calculation can by itself determine this. The first case where this occurs is for N = 5077, the rank 3 case considered in [6]. Here one finds that L (f, 1) = 0 to 13 decimal places using (2.13.2) with 250 terms; then it is possible to conclude that L (f, 1) = 0 exactly, by applying the theorem of Gross and Zagier concerning modular elliptic curves of rank 1 (see [25] or [26]) which relates the value of L (f, 1) to the height of a certain Heegner point on Ef . In this case no point on Ef has sufficiently small positive height, and one can therefore deduce that L (f, 1) = 0, so that r ≥ 3. Finally the value of L(3) (f, 1) can be computed numerically and hence shown to be non-zero (approximately 1.73 in this case). See [6] for more details. Using more recent work of Kolyvagin (see [29]) this argument can be simplified, since it is now known that when L(f, s) has a simple zero at s = 1, the curve Ef has rank exactly 1. But in this case Ef has rank 3 (computed via two-descent, though finding three independent points of infinite order is easy and shows that the rank is at least 3), so again the analytic rank must be at least 3, and is therefore exactly 3 as before. The results of Kolyvagin in [29] imply that when L(f, s) has a zero of order r = 0 or 1 at s = 1 then3 the rank of Ef is exactly r. For the tables we also verified that the rank of Ef (Q) was r directly in almost all cases (the exceptions being curves where the coefficients were so large that the two-descent algorithm, described in the next chapter, would have taken too long to run). These results apply to all but 18 of the rational newforms f we found at levels up to 1000. The remaining cases all had r = 2 (determined as above) and we verified that the rank of Ef (Q) was 2 in each case. For a summary of the ranks found in the extended computations to N = 5077, see Chapter IV. 3In fact, Kolyvagin’s result in the rank 0 case was conditional on a certain technical hypothesis, which was later proved independently by Murty and Murty and by Bump, Friedberg and Hoffstein. See [44]. The analogous hypothesis in the rank 1 case was already known as a consequence of a theorem of Waldspurger. The rank 0 result was previously proved in the case of complex multiplication by Coates and Wiles. 2.14 OBTAINING EQUATIONS FOR THE CURVES 45 2.14 Obtaining equations for the curves So far we have described how to compute, to a certain precision, the periods ω1 and ω2 which generate the period lattice Λf of the modular curve Ef = C/Λf attached to each rational newform f in S2(N). Now we turn to the question of finding an equation for Ef . Set τ = ω1/ω2. Interchanging ω1 and ω2 if necessary, we may assume that Im(τ) > 0. By applying the well-known algorithm for moving a point in the upper half-plane H into the usual fundamental region for SL(2, Z) we may assume that |Re(τ)| ≤ 1/2 and |τ| ≥ 1, so that Im(τ) ≥ √ 3/2. One merely replaces (ω1, ω2) by (ω1 − nω2, ω2) for suitable n ∈ Z and (ω1, ω2) by (−ω2, ω1) until both conditions are satisfied. In practice one must be careful about rounding errors, as it is quite possible to have both |τ| < 1 and | − 1/τ| < 1 after rounding, which is liable to prevent the algorithm from terminating. Set q = exp(2πiτ). Then the lattice invariants c4(= 12g2) and c6(= 216g3) are given by (2.14.1) c4 = 2π ω2 4 1 + 240 ∞ n=1 n3 qn 1 − qn and c6 = 2π ω2 6 1 − 504 ∞ n=1 n5 qn 1 − qn (see, for example, [31, p.47]). Since |q| = exp(−2πIm(τ)) ≤ exp(−π √ 3) < 0 · 005, these series converge extremely rapidly. Thus, assuming that ω1 and ω2 are known to sufficient precision, we can compute c4 and c6 as precisely as required. Since Ef is defined over Q, the numbers c4 and c6 are rational, but there is no simple reason why they should be integral. Fortunately, a result of Edixhoven (see [21]) states that in fact they are integral. Hence, provided that we have computed the periods and then c4 and c6 to sufficient precision, we will be able to recognize the corresponding exact integer values. This only presents practical difficulties when c4 and c6 are large, since standard double precision arithmetic only yields around 16 decimal places. In several cases this means that we can recognize c4, but the last digit or digits of c6 are undetermined. One obvious way round these difficulties is to use multiprecision arithmetic, though the resulting programs are slower, which can be an important consideration when large numbers of curves are being processed. In these situations, we are helped by the fact that we know that c4 and c6 are the invariants of an elliptic curve of conductor N. This implies the following congruence conditions (see [30], [10] or Section 3.2 below): (1) c3 4 − c2 6 = 1728∆, where ∆ is a non-zero integer divisible by the primes dividing N; (2) for primes p ≥ 5 which divide N, we have p | c4 ⇐⇒ p | c6 ⇐⇒ p2 | N; (3) c6 ≡ 9 (mod 27); (4) either c6 ≡ −1 (mod 4), or c4 ≡ 0 (mod 16) and c6 ≡ 0, 8 (mod 32). Note that in condition (1) we should not assume that ∆ is only divisible by the “bad primes” which divide N, since we do not know that c4 and c6 are the invariants of a minimal model. However, Edixhoven’s result does bound the non-minimality, and in practice all the equations of curves we have constructed are minimal, verifying the conjecture (Manin’s “c = 1” conjecture) that this should always be the case. Conditions (2)–(4) do assume minimality at the relevant primes. Since c4 tends to be smaller than c6, the common situation is that we know c4, but may need to use the above congruence conditions to help us find c6 in case it has more than 16 digits. Given integral invariants c4, c6 satisfying (1), (3) and (4) above, the coefficients of a standard Weierstrass equation for the curve may be obtained as follows (see Section 3.1), where all divisions are exact: 46 II. MODULAR SYMBOL ALGORITHMS b2 = −c6 mod 12 ∈ {−5, . . . , 6}; b4 = (b2 2 − c4)/24; b6 = (−b3 2 + 36b2b4 − c6)/216; a1 = b2 mod 2 ∈ {0, 1}; a3 = b6 mod 2 ∈ {0, 1}; a2 = (b2 − a1)/4; a4 = (b4 − a1a3)/2; a6 = (b6 − a3)/4. Having the coefficients [a1, a2, a3, a4, a6] of a curve E, we may apply Tate’s algorithm (see Section 3.2 below) to check that E has conductor N. We also check whether this model for E is minimal. These conditions do hold for all the cases we have computed to date (N ≤ 5077). We also verify in each case that the traces of Frobenius of Ef and E for all primes under 1000 agree in each case, and in nearly all cases (see the previous section) that the rank of E, computed via two-descent, agrees with the ‘analytic rank’ of Ef . Finally, we can compute all curves isogenous to E over Q: see Section 3.8 for one way to do this. This final list of curves will, according to the Shimura–Taniyama–Weil conjectures, contain all elliptic curves defined over Q with conductor N (up to isomorphism). At the time of writing4 this has been proved (by Wiles and Taylor, following Ribet, Frey and others) provided that N is divisible by neither 4 nor 25. Our computations do not give any verification of the Shimura–Taniyama–Weil conjecture, since if there did exist elliptic curves over Q which were not modular, then we would simply not find them. We could only verify the conjecture if we had an independent method for listing all curves of conductor N, up to isogeny. For example, this has been done • when N is a power of 2 (Ogg) or of the form 2a 3b (Coghlan, see [2, Table 4]); • when N = 11 (by Agrawal, Coates, Hunt and Van der Poorten, using the theory of Baker; and independently by Serre, using a variant of Faltings’s method based on quartic fields [54]); • for certain prime values of N (see [4]). Our results are compatible with those of Brumer and Kramer in [4] for curves of prime conductor under 1000. The algorithms we used to study these curves E further will be the subject of the next chapter. 2.15 Computing the degree of a modular parametrization The modular elliptic curves we have shown how to construct in this chapter can be parametrized by modular functions for the subgroup Γ0(N) of the modular group Γ = PSL(2, Z). Equivalently, there is a non-constant map ϕ from the modular curve X0(N) to E. In the paper [17], we presented a method of computing the degree of such a map ϕ for arbitrary N. Our method is derived from a method of Zagier in [69]; by using those ideas, together with the modular symbol and M-symbol techniques which have been used above, we are able to derive an explicit formula for deg(ϕ) which is in general much simpler to implement than Zagier’s, for arbitrary subgroups of finite index in Γ. To implement this formula one needs to have explicit coset representatives for the subgroup, but it is not necessary to determine an explicit fundamental domain for its action on the upper half-plane H. In particular, it is 4October 1996 2.15 COMPUTING THE DEGREE OF A MODULAR PARAMETRIZATION 47 simple to implement for Γ0(N) for arbitrary N, in contrast with Zagier’s formula which is only completely explicit for N prime. In this section we present the algorithm described in [17]. For more details and proofs, see [17]. Worked examples are given in the appendix to this chapter, and results for N ≤ 1000 may be found in Chapter IV. 2.15.1. Modular Parametrizations. Let G be a congruence subgroup of the modular group Γ = PSL(2, Z). The quotient X = XG = G\H∗ is a Riemann surface, and an algebraic curve defined over a number field, and is called a modular curve. An elliptic curve E defined over Q is called a modular elliptic curve if there is a nonconstant map ϕ: XG → E for some modular curve XG. The pull-back of the (unique up to scalar multiplication) holomorphic differential on E is then of the form 2πif(z)dz, where f ∈ S2(G). According to the Shimura–Taniyama–Weil conjecture, this should be the case for every elliptic curve defined over Q, with G = Γ0(N), where N is the conductor of E. Moreover, the cusp form f should be a newform in the usual sense. We will suppose that we are given a cusp form f ∈ S2(G). Since the differential f(z)dz is holomorphic, the function z0 → If (z0) = 2πi ∞ z0 f(z)dz is well-defined for z0 ∈ H∗ (independent of the path from z0 to ∞). Also, for M ∈ G, the function M → Pf (M) = If (z0) − If (M(z0)) = 2πi M(z0) z0 f(z)dz is independent of z0, and defines a function Pf : G → C which is a group homomorphism. The image Λf of this map will, under suitable hypotheses on f which we will assume to hold, be a lattice of rank 2 in C, so that Ef = C/Λf is an elliptic curve. Hence If induces a map ϕ: X = G\H∗ → Ef = C/Λf z mod G → If (z) mod Λf . The period map Pf : G → Λf is surjective (by definition) and its kernel contains all elliptic and parabolic elements of G. We may write Λf = Zω1 + Zω2 with Im(ω2/ω1) > 0. Then Pf (M) = n1(M)ω1 + n2(M)ω2 where n1, n2: G → Z are homomorphisms. These functions are explicitly computable in terms of modular symbols as seen in earlier sections. Alternatively, given sufficiently many Fourier coefficients of the cusp form f(z) we may evaluate the period integrals If (z) (using the formula (2.10.8), for example) to sufficient precision that (assuming that the fundamental periods ω1 and ω2 are also known to some precision) one can determine the integer values of n1(M) and n2(M) for any given M ∈ G. The latter approach is used in [69]. The advantage of the modular symbol approach is that exact values are obtained directly, and that it is not necessary to compute (or even know) any Fourier coefficients of f. On the other hand, it becomes computationally infeasible to carry out the modular symbol computations when the index of G in Γ is too large, whereas the approximate approach can still be used, provided that one has an explicit equation for the curve E to hand, from which one can compute the periods and the Fourier coefficients in terms of traces of Frobenius (assuming that E is modular and defined over Q). This method was used, for example, to compute deg(ϕ) for the curve of rank 3 48 II. MODULAR SYMBOL ALGORITHMS with conductor 5077, in [69]; we verified the value obtained (namely 1984) using our modular symbol implementation. The special case we are particularly interested in is where G = Γ0(N) and f(z) is a normalized newform for Γ0(N). Then the periods of 2πif(z) do form a suitable lattice Λf , and the modular elliptic curve Ef = C/Λf is defined over Q and has conductor N. In order to compute the degree of the map ϕ: X → Ef , the idea used in [69] is to compute the Petersson norm ||f|| in two ways. The first way involves deg(ϕ) explicitly, while the second expresses it as a sum of terms involving periods, which can be evaluated as above. Proposition 2.15.1. Let f(z) be a cusp form of weight 2 for G as above, and ϕ: X → Ef the associated modular parametrization. Then 4π2 ||f||2 = deg(ϕ)Vol(Ef ). Remark. In terms of the fundamental periods ω1, ω2 of Ef , the volume is given by Vol(Ef ) = |Im (ω1ω2)|. More generally, if ω, ω ∈ Λf , with ω = n1(ω)ω1 + n2(ω)ω2 and ω = n1(ω )ω1 + n2(ω )ω2, then (up to sign) we have Im (ωω ) = Vol(Ef ) · n1(ω) n1(ω ) n2(ω) n2(ω ) . 2.15.2. Coset representatives and Fundamental Domains. Let S = 0 −1 1 0 and T = 1 1 0 1 be the usual generators for Γ, so that S has order 2 and TS has order 3. Let F be the usual fundamental domain for Γ defined above in (2.1.1), and T the “ideal triangle” with vertices at 0, 1 and ∞. Recall from Section 2.1 that M denotes the transform of T by M for M ∈ Γ, which is the ideal triangle with vertices at the cusps M(0), M(1) and M(∞). These triangles form a triangulation of the upper half-plane H, whose vertices are precisely the cusps Q ∪ {∞}. Recall that M = MTS = M(TS)2 but that otherwise the triangles are distinct. The triangle M has three (oriented) edges; these are the modular symbols (M), (MTS) and (M(TS)2 ). Assume, for simplicity, that G has no non-trivial elements of finite order, i.e., no conjugates of either S or TS. (This assumption is merely for ease of exposition; in fact, it is easy to see that elliptic elements of G contribute nothing to the formula in Theorem 2.15.4 below in any case.) Choose, once and for all, a set S of right coset representatives for G in Γ, such that M ∈ S ⇒ MTS ∈ S; this is possible since, by hypothesis, G contains no conjugates of TS. Let S be a subset of S which contains exactly one of each triple M, MTS, M(TS)2 , so that S = S ∪ S TS ∪ S (TS)2 . Then a fundamental domain for the action of G on H is given by FG = M∈S M . In general, this set need not be connected, but this does not matter for our purposes: it can be treated as a disjoint union of triangles, whose total boundary is the sum of the oriented edges (M) for M ∈ S. The key idea in the algebraic reformulation of Zagier’s method is to make use of the coset action of Γ on the set S. We now introduce notation for the actions of the generators S and T. 2.15 COMPUTING THE DEGREE OF A MODULAR PARAMETRIZATION 49 Action of S. For each M ∈ S we set MS = s(M)σ(M), where s: S → G is a function and σ: S → S is a permutation. Since S2 is the identity, the same is true of σ, and s(σ(M)) = s(M)−1 . For brevity we will write M∗ = σ(M), so that M∗∗ = M for all M ∈ S. (This conflicts with an earlier use of the notation M∗ in Section 2.1, but this should not cause confusion.) Note that the triangles M and MS are adjacent in the triangulation of H, since they share the common side (M) = {M(0), M(∞)} = −(MS). However, since in general we do not have MS ∈ S, in the fundamental domain FG for G it is the triangles M and M∗ which are glued together by the element s(M) ∈ G which takes (M∗ ) to −(M) (the orientation is reversed). Action of T. Similarly, for M ∈ S we set MT = t(M)τ(M) with t(M) ∈ G and τ(M) ∈ S. The permutation τ of S plays a vital part in what follows. The following lemma will not be used later, but is included for its own interest as it explains the geometric significance of this algebraic permutation. Lemma 2.15.2. (a) Two elements M and M of S are in the same τ-orbit if and only if the cusps M(∞) and M (∞) are G-equivalent; hence the number of τ-orbits on S is the number of G-equivalence classes of cusps. (b) The length of the τ-orbit containing M ∈ S is the width of the cusp M(∞) of G. Proof. (a) M and M are in the same τ-orbit if and only if M0 = M Tj M−1 ∈ G for some j, which is if and only if M0M(∞) = M (∞), since the stabilizer of ∞ in Γ is the subgroup generated by T. (b) The length of the orbit of M is the least k > 0 such that MT k M−1 = MTM−1 k ∈ G, which is the width of the cusp M(∞), since the stabilizer of M(∞) in Γ is generated by MTM−1 . Thus there is a one–one correspondence between the orbits of τ on S and the classes of G-inequivalent cusps, with the length of each orbit being the width of the corresponding cusp. In each τ-orbit in S, we choose an arbitrary base-point M1, and set Mj+1 = τ(Mj) for 1 ≤ j ≤ k, where k is the length of the orbit and Mk+1 = M1. Thus MjT = t(Mj)Mj+1, so that M1Tj = t(M1)t(M2) . . .t(Mj)Mj+1. In particular, M1Tk = M0M1, where M0 = t(M1)t(M2) . . .t(Mk) ∈ G. Since M0 is parabolic and Pf is a homomorphism, we obtain the following. Lemma 2.15.3. k j=1 Pf (t(Mj)) = 0. Write M M if M and M are in the same τ-orbit in S, and M precedes M in the fixed ordering determined by choosing a base-point for each orbit. In the notation above, M M if and only if M = Mi and M = Mj where 1 ≤ i < j ≤ k. We can now state the main results of this section. Theorem 2.15.4. Let f be a cusp form of weight 2 for G with associated period function Pf : G → C. Then (the square of) the Petersson norm of f is given by ||f||2 = 1 8π2 M M Im(Pf (t(M))Pf (t(M ))). 50 II. MODULAR SYMBOL ALGORITHMS Here the sum is over all ordered pairs M M in S which are in the same orbit of the permutation τ of S induced by right multiplication by T. Combining this result with Proposition 2.15.1, we immediately obtain our explicit formula for the degree of the modular parametrization. Theorem 2.15.5. With the above notation, deg(ϕ) = 1 2Vol(Ef ) M M Im(Pf (t(M))Pf (t(M ))) = 1 2 M M n1(t(M)) n1(t(M )) n2(t(M)) n2(t(M )) . Hence to compute deg(ϕ), we only have to compute the right coset action of T on an explicit set S of coset representatives for G in Γ, and evaluate the integer-valued functions n1 and n2 on each of the matrices t(M) for M ∈ S. In the case of Γ0(N), these steps can easily be carried out using M-symbols, and we will give some further details below. Remarks. 1. The formula given in Theorem 2.15.5 expresses deg(ϕ) explicitly as a sum which can be grouped as a sum of terms, one term for each cusp, by collecting together the terms for each τ-orbit. It is not at all clear what significance, if any, can be given to the individual contributions of each cusp to the total. 2. The form of our formula is identical to the one in [69]. However, we stress that in [69], the analogue of our coset action τ is defined not algebraically, as here, but geometrically, as a permutation of the edges of a fundamental polygonal domain for G (and dependent on the particular fundamental domain used). Then it becomes necessary to have an explicit picture of such a fundamental domain, including explicit matrices which identify the edges of the domain in pairs. This is only carried out explicitly in [69] in the case G = Γ0(N) where N is a prime. In our formulation, the details are all algebraic rather than geometric, which makes the evaluation of the formula more practical to implement. Also, we have the possibility of evaluating the functions n1 and n2 exactly using modular symbols, instead of using numerical evaluation of the periods, which reduces the computation of deg(ϕ) entirely to linear algebra and integer arithmetic. 2.15.3. Implementation for Γ0(N). We now discuss the case G = Γ0(N) in greater detail, using M-symbols to represent the coset representatives. The right coset action of Γ on P 1 (N) is given by (2.2.4), so we have σ(c : d) = (c : d)S = (d : −c) and τ(c : d) = (c : d)T = (c : c + d). Lemma 2.15.6. The length of the τ-orbit of (c : d) is N/ gcd(N, c2 ). Proof. τk (c : d) = (c : d) ⇐⇒ (c : kc + d) = (c : d) ⇐⇒ cd ≡ c(kc + d) (mod N) ⇐⇒ kc2 ≡ 0 (mod N) ⇐⇒ k ≡ 0 (mod N/ gcd(N, c2 )). In earlier sections, it was immaterial exactly which coset representatives were used, or in practice which pair (c, d) ∈ Z2 was used to represent the M-symbol (c : d). For the application of Theorem 2.15.5, however, we must ensure that our set is closed under right multiplication by TS, where (c : d)TS = (c + d : −c), unless (c : d) is fixed by TS, which is if and only if c2 + cd + d2 ≡ 0 (mod N). Thus each M-symbol (c : d) will be represented by a specific pair (c, d) ∈ Z2 with gcd(c, d) = 1, in such a way that our set S of representatives contains the pairs (c+d, −c) and (−d, c+d) whenever it contains (c, d), unless (c : d) is fixed by TS. (Even when working with pairs (c, d) ∈ Z2 we will identify (c, d) and (−c, −d).) Fixing these triples of pairs (c, d) corresponds to fixing the triangles M which form a (possibly disconnected) fundamental domain for Γ0(N). If M = a b c d , the pair (c, d) 2.15 COMPUTING THE DEGREE OF A MODULAR PARAMETRIZATION 51 corresponds to the directed edge {M(0), M(∞)} = {b/d, a/c}. For this reason, we will refer to the pairs (c, d) as edges, and the triples of pairs as triangles. Right multiplication by TS corresponds geometrically to moving round to the next edge of the triangle, while right multiplication by S corresponds to moving across to the next triangle M∗ adjacent to the current one. The τ-action is given by composing these, taking (c : d) (or edge {b/d, a/c}) to the symbol (c : d)T = (c : c + d) with corresponding edge {(a + b)/(c + d), a/c}, up to translation by an element of Γ0(N). Note how in this operation the endpoint at the cusp M(∞) = a/c is fixed, in accordance with Lemma 2.15.2 above. We may therefore proceed as follows. For each orbit, start with a standard pair (c, d), chosen in an M-symbol class (c : d) not yet handled. Apply T to obtain the pair (c, c + d). If this pair is the standard representative for the class (c : c + d), we need take no action and may continue with the orbit. But if (c, c + d) ≡ (r, s), say, with (r, s) ∈ S, then we must record the “gluing matrix” M = a a + b c c + d p q r s −1 ∈ Γ0(N), where ad − bc = ps − qr = 1, whose period Pf (M) will contribute to the partial sum for this orbit. When this happens, we say that the orbit has a “jump” at this point. Different choices for a, b, p and q only change M by parabolic elements, and so do not affect the period Pf (M). We continue until we return to the starting pair, and then move to another orbit, until all M-symbols have been used. As checks on the computation we may use Lemmas 2.15.2 and 2.15.6: the length of the orbit starting at (c, d) can be precomputed as N/ gcd(N, c2 ), and the number of orbits is the number of Γ0(N)-inequivalent cusps. A worked example for the case N = 11 is included in the appendix to this chapter.