This comes up a fair bit in math, as we like to name things after people but then later change the name to make more sense, OR vice versa. For instance, “Outer Space” is actually written as $CV_n$, which stands for Culler-Vogtmann space, even though everyone says “outer space” aloud. Funnily, in that article I just linked, Vogtmann writes it with her own notation, but I haven’t seen anyone else write it that way. Another funny one is right-angled Artin groups, which were originally called “graph groups” but now everyone says “raags”. Incidentally, this is a great introduction to RAAGs (sometimes written raAgs).

Some spaces don’t have any alternative names and should. The one I’m thinking of now is Teichmüller space: every day dozens of mathematicians and physicists refer to this space and the accompanying theory, which feels like we’re honoring Teichmüller. This is not a person whom I particularly want to honor every day, but like the Down syndrome problem I doubt we’ll be able to change the name to “complex structure space” or “marked surfaces space”. I didn’t know any of this stuff about Teichmüller until reading a wonderful interview of Autumn Kent by Evelyn Lamb. Here’s a pull quote; most of it is Autumn and the Note is by Evelyn.

There is a dangerous amount of tolerance of intolerable people in academia based on the principle that we are all dedicated to the pursuit of knowledge and beauty and that a person’s academic work makes them a person worthy of mutual respect. This principle is wrong.

Bers famously quoted Plutarch in defense of his admiration for Teichmueller’s work: “It does not of necessity follow that, if the work delights you with its grace, the one who wrought it is worthy of your esteem.” This is of course true, but Teichmueller was still a piece of sh*t and if he were alive today I would not be his friend on Facebook. [Note: Oswald Teichmueller (1913-1943) was a German mathematician and literally a card-carrying Nazi. As a student, he organized a boycott of Edmund Landau, a Jewish math professor at the University of Göttingen. He was killed fighting for the Third Reich in World War II.] I would not invite him to an academic conference. The pursuit of knowledge and beauty is admirable, but it should not be undertaken at the expense of the bodies and souls of marginalized people. If my work would result in violence I would abandon it.

There are a LOT of goodies in that interview and I highly, highly recommend it. In fact I wrote this entire post just to share this interview with you, but I snuck it in via eponyms (and also I’ve been having a lot of practice contractions lately and wanted you to know. Due date is March 29!)

This post is based on the paper “Deep driven fMRI decoding of visual categories” by Michele Svanera, Sergio Benini, Gal Raz, Talma Hendler, Rainer Goebel, and Giancarlo Valente. First, here’s my schematic of the paper:

We’ll read this schematic from top to bottom, left to right.

- On top is the experiment: they had a lot of people watch 5-10 minute movies. The left white arrow indicates that the people were in fMRI machines (I know an fMRI machine does not look like an EEG but that’s the picture you get) and so they have a bunch of data sitting around from that. The right white arrow indicates that they used a computer algorithm (“math!”) to extract information directly from the movies [this is the fc7 data]. So far they haven’t contributed anything new to the literature; they’ve just used existing techniques to come up with raw data.
- The orange diagonal arrows are when things get interesting. The fMRI data and fc7 data come in giant matrices, and they use another math algorithm to come up with a set of “decoding” matrices. Not pictured in schematic: they test these matrices using some of the data.
- The goal is indicated by the green arrows: to use the brain data and these matrices they came up with to reconstruct what people are seeing and *classify* these things (aka: are subjects seeing people’s faces on the screen, or entire human figures?)

Now for a few details on each of the steps.

0. The **motivation** behind the paper seems to be to link the brain imaging community (those who work with the fMRI, EEG, etc. data) with the deep neural network community (computer people) to answer questions that involve both. The main question they have is: **how do people associate low-level information like colors, shapes, etc. with semantic concepts like car, person, etc.?** Here’s the picture:

- The data: they grabbed a bunch of existing data from other experiments, where scientists took 5-10 minute clips from five different movies (side note: I would never want to be in these studies, because one of the clips was from The Ring 2), showed them to subjects (ranging from 27 to 74 participants per movie), and recorded all the fMRI data, which creates a huge three-dimensional dataset every 3 seconds. Then they threw the movie frames into a computer algorithm (called the **faster R-CNN method**) which detects objects in the video frames (with varying confidence levels) and spits out a 4096-dimensional vector for each frame. They averaged these vectors over 15 frames so that the two datasets could match up (the movies were shown at 5 frames per second, so 15 frames is exactly the 3 seconds between fMRI snapshots). These vectors form the **fc7** data.
- The math: they use an algorithm called **Canonical Correlation Analysis (CCA)** to spit out two orthogonal matrices **U** and **V** which are highly correlated (hence the middle C). Looks like linear algebra with some linear projection! To do this, they took a subset (about 75%) of the fMRI data and the corresponding fc7 data and plugged it into the math. The goal of this step (the **training step**) is actually to get the helper matrices *A* and *B*. To make sure these matrices are a-OK, they used the remaining fMRI data to reconstruct the fc7 data within a reasonable margin of error. Remember U and V are highly (in fact maximally) correlated, so that middle arrow actually makes sense in this step (the **testing step**).
- The result: for one movie, they did the training math step using different subsets of data (they did it 300 times) to make sure those helper matrices *A* and *B* are the best possible ones. Then, to show that this whole paper does what they want it to do, they do the testing step using *the other movies*. [The whole point of a decoding method is to *predict* what people are seeing.]
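If you want to get a feel for what CCA does in the training step, here is a minimal sketch on made-up data (my own toy code via whitening and an SVD, not the authors' pipeline; all names, shapes, and numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-ins for the real data: rows = time points, columns = voxels / features
n_samples, n_voxels, n_feats = 200, 30, 20
X = rng.standard_normal((n_samples, n_voxels))                         # "fMRI" matrix
Y = X[:, :n_feats] + 0.1 * rng.standard_normal((n_samples, n_feats))   # "fc7" matrix

def cca(X, Y, k):
    """Classical CCA: whiten each block, then SVD the cross-covariance."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    Ux, Sx, Vxt = np.linalg.svd(X, full_matrices=False)
    Uy, Sy, Vyt = np.linalg.svd(Y, full_matrices=False)
    U, S, Vt = np.linalg.svd(Ux.T @ Uy)          # correlations between whitened blocks
    A = Vxt.T @ np.diag(1 / Sx) @ U[:, :k]       # "helper" projection matrix A
    B = Vyt.T @ np.diag(1 / Sy) @ Vt.T[:, :k]    # "helper" projection matrix B
    return A, B, S[:k]

A, B, corrs = cca(X, Y, k=5)
U_proj, V_proj = X @ A, Y @ B    # the maximally correlated pair U, V
print(np.round(corrs, 3))        # canonical correlations, near 1 on this toy data
```

Since Y here is just a noisy copy of some columns of X, the top canonical correlations come out close to 1; on real fMRI/fc7 data they would be much weaker.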

Conclusion: their method works on this preliminary result (faces versus bodies)! They want to expand to other movies and other semantic concepts in the future.

General keywords: machine learning, fMRI, linear algebra. Also CCA, faster R-CNN, fc7, but those are keywords for specialists.

My conclusion: this was cool and fun! I like reading new things and learning. I hope you do too!

Anyways, let’s first talk about tilings by triangles. We first choose some integers *p, q, r* and set the three angles of a base triangle equal to $\pi/p$, $\pi/q$, and $\pi/r$. Now reflect over each of the three sides to start tiling your space. Turns out this tiling will lead to a group. Here’s an example with *p=q=4* and *r=2* (so we have a right isosceles triangle):

I picked a particularly good example, so that my triangles could tile the Euclidean plane. We learned some time ago about non-Euclidean geometries: the space is *spherical* if the sum of triangle angles is more than $\pi$, and *hyperbolic* if triangles are thin and their sum of angles is less than $\pi$. So based on how I choose my *p, q,* and *r,* I’ll find myself tiling different spaces. Here’s an example of one iteration on a sphere for *p=q=2* and *r=5*:

It’s pretty easy to find the integer solutions for *p, q, r* to tile each space. The only triangles that tile the flat plane are when *(p,q,r) = (2,3,6), (2,4,4),* and *(3,3,3)*. We already saw (2,4,4), and I’ll just tell you that (3,3,3) is when you use equilateral triangles (so there are 6 around each vertex), and (2,3,6) gives those 30-60-90 triangles we learned about back in trigonometry class: here’s the picture from wikipedia:

Similarly there are only a few (relatively) that tile the sphere: (2,3,3), (2,3,4), (2,3,5), and (2,2,*n*), where *n* is some integer bigger than 1. Of course this forms an infinite family of tilings, since you can choose *n*. In the example above I chose *n=5*, and if *n* is bigger the base pink triangle just gets skinnier.

But I say there are only a few that tile the sphere because **everything else tiles the hyperbolic plane**. That’s a lot of choices! It might also make you think, “hm, maybe the hyperbolic plane is interesting.”
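A quick way to convince yourself of this classification is to check the angle-sum condition by brute force; here's a little script of my own (not from the post):

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def geometry(p, q, r):
    # the base triangle has angle sum pi*(1/p + 1/q + 1/r)
    s = Fraction(1, p) + Fraction(1, q) + Fraction(1, r)
    if s > 1:
        return "spherical"
    if s == 1:
        return "euclidean"
    return "hyperbolic"

# enumerate small triples p <= q <= r
euclidean = [t for t in combinations_with_replacement(range(2, 10), 3)
             if geometry(*t) == "euclidean"]
spherical = [t for t in combinations_with_replacement(range(2, 10), 3)
             if geometry(*t) == "spherical"]

print(euclidean)  # [(2, 3, 6), (2, 4, 4), (3, 3, 3)]
print(spherical)  # the (2,2,n) family plus (2,3,3), (2,3,4), (2,3,5)
```

Everything else the loop touches comes out "hyperbolic", matching the claim above.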

Let’s get back to groups. How does a tiling of a space lead to a group? Well, let the reflections over the (extended) sides of the base triangle be the **generators** of our group. If I name these *a, b,* and *c,* I immediately get the **relators** $a^2 = b^2 = c^2 = 1$ (reflecting twice does nothing). Next we have to figure out the rest of the relators. I hinted at them in the pictures above. They are $(ab)^p = (bc)^q = (ca)^r = 1$. Now we have a group presentation [link for definition]: $\langle a, b, c \mid a^2, b^2, c^2, (ab)^p, (bc)^q, (ca)^r \rangle$.

Also, fun coincidence: if you create the **dual graph** to the tiling by putting a vertex inside each triangle and connecting two vertices by an edge if one triangle is the image of the other under one of the reflections, you get something that looks a lot like the **Cayley graph** of the reflection triangle group. The only difference is that each edge needs to be two edges (like a little loop) to reflect that each generator has order 2.

So what’s special about the (2,3,7) triangle group? We know from above that it tiles the hyperbolic plane. Check out this great picture from wikipedia of the tiling:

Maybe we’ll take a second here to point out that you can see the *p, q, r* values in the tilings, both in this picture and the other wikipedia picture above for (2,3,6): pick your favorite triangle, and look at its three vertices. Each vertex has some number *n* of triangles around it, and since there is $2\pi$ worth of angle around any vertex, each triangle contributes an angle of $2\pi/n$ there. So *p, q, r* are just the values of $n/2$ at the three corners.

Also, of all the integers you can pick for *p, q, r*, it turns out that 2, 3, and 7 maximize the sum $\frac{1}{p} + \frac{1}{q} + \frac{1}{r}$ while keeping it less than 1. [It ends up giving you $\frac{1}{2} + \frac{1}{3} + \frac{1}{7} = \frac{41}{42}$, for those of you keeping track at home.]
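You can check the maximization claim by brute force too (again my own quick script):

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def angle_sum(t):
    # 1/p + 1/q + 1/r, done with exact fractions
    return sum(Fraction(1, x) for x in t)

# among hyperbolic triples, find the one whose sum is closest to 1 from below
hyperbolic = [t for t in combinations_with_replacement(range(2, 50), 3)
              if angle_sum(t) < 1]
best = max(hyperbolic, key=angle_sum)
print(best, angle_sum(best))  # (2, 3, 7) 41/42
```

The bound of 50 is just for the search; making any of *p, q, r* larger only pushes the sum further below 41/42.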

So we maximize *something* with the numbers 2, 3, 7. Well it turns out we do more than that: we also *minimize* the volume of the resulting quotient (we touched on and defined quotients in this post about manifolds). And this group is unique (up to conjugation/fiddling), and makes the smallest possible quotient. Huzzah!

On a personal note, I’ve had a demonic cold keeping me in bed for the past two weeks, so forgive me if I messed up (pretty sure I did on the reflections; I’ll try to fix those soon). Also, hope you voted today! I voted a week and a half ago.

A little bit catchy, but not quite my cup of tea. What is my cup of tea? An integer sequence that does not appear in the amazing OEIS! 138 is the smallest product of three primes such that the third is the concatenation of the first two: 138 is $2 \cdot 3 \cdot 23$, 777 is $3 \cdot 7 \cdot 37$, and 4642 and 10263 are the next in this sequence (I applied for an account to OEIS to submit this, so if you find a smaller one or the next several in the sequence let me know). So off we go, to venture into the great unknown (aka the internet)! This month we have a fun mix of grammar, history/sociology, math, and games!
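Since the sequence hunt above is begging to be automated, here's a quick script of my own that searches for products $p \cdot q \cdot r$ of primes with $p \le q$ and $r$ the digit-concatenation of $p$ and $q$:

```python
def is_prime(m):
    """Trial-division primality test; plenty fast for small searches."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    f = 3
    while f * f <= m:
        if m % f == 0:
            return False
        f += 2
    return True

# products p*q*r where p <= q are prime and r = concat(p, q) is prime
found = sorted(p * q * int(str(p) + str(q))
               for p in range(2, 60) if is_prime(p)
               for q in range(p, 60)
               if is_prime(q) and is_prime(int(str(p) + str(q))))
print(found[:6])
```

If I haven't misread the definition, the search also turns up $10258 = 2 \cdot 23 \cdot 223$, which would slot in just before 10263; that seems worth double-checking before anything goes to the OEIS.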

- First, **grammar**: Manan Shah over at Math Misery [which hosted CoM #136] posted Does 11 = 8 + 3? to chat about linguistics and mathematics. If you love those memes about the Oxford comma, this post might be right up your alley. My favorite sentence: “I bought orange juice and dishwasher detergent and drank it for breakfast.” He argues we should think more about linguistics when teaching math, and the implicit biases that language might have on students as they learn mathematical reasoning.
- Also in grammar, Thomas Oléron Evans of Mathistophiles posted Understanding an Abused Unit: The Kilometre-Hour, which delves deeper into a specific example of how common language can hurt mathematical understanding, and he uses a DELIGHTFUL example of turtles. Favorite sentence here: “Unless they sidestep across the beach, of course, like some sort of synchronised reptilian dressage.” Here’s an amazing graph from his example:

- Keeping up with his fun theme, Thomas submitted a SUPER FUN **game** by David Peter called cube composer, which is all about functions. Please do go play it! It’s so fun! It’s an intuitive individual puzzle game, and you can fly through the levels. It also remembers your progress, but you can skip levels if you want to. Here’s a screenshot of me picking random functions and not solving the puzzle:

- A high school math teacher friend texted me about another game called Square It on a website for math educators supported by the University of Cambridge. We proceeded to lose, lose, lose, and continue to lose against the computer. I played against my spouse a few times, and he won his first time. Those darn programmers and their algorithm-minds! For the record I did eventually win. This one is faster and lighter than cube composer, and it offers different mathematical questions to think about in different size grids.

Square It! is also great to play with kids. Two more submissions are also **kid-friendly**.

- Brent Yorgey of Math Less Traveled submitted his post about Making Tesselations, which delves into the math behind the delightful new children’s book Tessalation! that I will definitely be reading to my toddler. In his post he also talks about ants on a doughnut, and y’all know how we mathematicians love our analogies to lean toward the absurd. Here’s a wonderful joke Brent nonchalantly sneaks into his post: “The ant is so small that it can’t tell that the surface it lives on is curved. To the ant, it just looks flat. (You may know some tiny creatures in a similar situation who live on a sphere.)” So fun.
- Matthew Scroggs submitted a post dear to my heart, written by Belgin Seymenoglu over at Chalkdust Magazine: an ode to the delightful half hour film Donald in Mathmagic Land. The post includes a link to the 1959 cartoon, which I’ve watched many times since I first saw it when I was 15. I guess this is a fun month, because I really really recommend watching this cute and funny cartoon that includes actual math in it.

Now for a sort of random assortment of history/sociology/whatever posts that have to do with math:

- Paul Plowman wrote a 2 minute 20 second guitar song using the number *e*, and posted about it over on his blog as Transcendental Music. Sort of a silly but fun little exercise, like memorizing digits of $\pi$, and the riff sounds nice.
- Poppy Brady submitted a story she wrote for The Voice about Nira Chamberlain, titled Teachers said I could be a boxer, but Never a Mathematician, one of those feel-good stories about mathematicians. One fun quote by Chamberlain: “I also like what British mathematician Sir John Kingman once said that ‘mathematicians are better if they stay a bit childish and play the game as a game.’” I think that keeps in line with our theme this month!
- Over in *Nature*, Davide Castelvecchi wrote a news story about how the Majority of mathematicians hail from just 24 scientific ‘families’, a result by Floriana Gargiulo, who analyzed the Mathematics Genealogy Project. Every grad student has used the MGP to stalk their advisor/potential advisor to see their lineage, and you can print out a tree and give it to your advisor as a defense present! So it was fun to read about someone actually analyzing this trove of data.
- Finally, saving the best for last, the brilliant Evelyn Lamb, explainer extraordinaire, wrote a post on Roots of Unity at Scientific American about How Converting Between Addition and Multiplication makes Math Easier. If you don’t follow/read everything that Evelyn writes, you really should. So approachable and so lucid. She also wrote a fun piece about using different bases for your birthday candles, so you should read these two articles, follow her on Twitter, and tell her happy birthday!

If you run into anything interesting on the math-internet from now until October 15th, submit them to the Carnival of Mathematics site; the Carnival will be hosted next month at the online magazine Gonit Sora. Hope you had as much fun as I did with the submissions this month!

Instead, you’ll read about the basics behind the research of my friend Gillian (no website), which is being supported somehow by people who want to know about cancer. First, some background: the field of study of phylogenetic trees is inspired by and informs applications in evolutionary biology. All of the life-forms today (or at some slice of time) trace back to a root life-form at time 0. Each current life-form follows a series of branchings away from the root to get to where it is now.

Then we can represent this evolution via a labeled rooted binary tree: the root represents the root life-form at time 0, and each branching of the tree represents a different evolution. The labels mark which life-form is which. Of course this model isn’t perfect (I can’t find the word for it but it’s a thing where two different species evolve separately from the same ancestor, then meet again and make one species. If we were to represent this information in a graph, it’d make a cycle and not be a tree), but it’s been fruitful.

Now let’s mathify this. We’d like to encode the evolutionary information into our tree. We’ve already decided that all life-forms will end at the same time (now), so if we just assign lengths to each of the non-leaf edges this will automatically determine the lengths of the leaf edges. A **leaf** in a tree is a vertex with only one neighbor, and we call the edge leading to that vertex a leaf edge. Let’s call the non-leaf edges **interior edges**. In the picture above, we have 5 interior edges, which determine a tree with 7 leaves. Using this exact configuration of labels and edges, we have five degrees of freedom: we can make those interior edges whatever lengths we want, as long as they are positive numbers. So in math-terms, the set of **phylogenetic trees** (aka rooted, binary, labeled trees) in this configuration forms a **positive orthant** of $\mathbb{R}^5$. You can smoothly change any one of the edges to a slightly longer or shorter length, and still have a phylogenetic tree with the same combinatorial data.

What about phylogenetic trees with *different *combinatorial data? Say, with different labels or different branching, but the same number of leaves and the same number of interior edges? First we need to figure out what we mean by ‘different’. For instance, the following picture from the seminal paper in this field shows three trees that don’t immediately look the same, but we don’t count as different:

Why aren’t they different? Because they encode the same data for each life-form: reading from node 0 we see that first 1 branches off, then 2, then 3 and 4 in all three cases. There’s some combinatorics here with partitions that you can do (one can label a tree with a set of partitions). However, changing the labels so that first 2 branches off, then 1, then 3 and 4 will be a different phylogenetic tree. In fact I can smoothly go from one to the other in the space that we’re creating: first I shrink the length of the green edge below to zero, which takes us to the middle tree (not binary!), and then extend the blue edge.

We’re going to add these non-binary trees with one less edge length to our space. Remember the tree on the left has an entire positive orthant, and the tree on the right has an entire positive orthant. Shrinking the green length to zero means that we’re moving to one side of the left orthant: so we add this axis to our space (we allow the green coordinate to be $\geq 0$ instead of strictly greater than 0). We can glue the green and blue orthants together along this axis. Here’s a picture from the paper:

Notice that they also have the origin filled in, with a tree with no interior edges. This is the **cone point** of this space. Now we’re finally ready to describe the space of phylogenetic trees: within each combinatorial structure/labeling, we have a Euclidean orthant in (n-2) dimensions. Then these orthants are glued together along their axes in a specific way, and all of them are connected to the cone point. This is called **BHV(n)**, short for Billera-Holmes-Vogtmann space (in the paper they call it T(n) but that’s confusing to everyone else). Here’s the picture of T(4):

There are 15 different orthants glued together in this picture, because the number of labelled rooted binary trees with *n* leaves is (2n-3)!!. The double !! means you only multiply the odds, a.k.a. (2n-3)(2n-5)(2n-7)… This is also known as Schroeder’s fourth problem, which as far as I can tell was open for 100 years. Pretty cool!
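The double-factorial count is easy to compute; here's a tiny script of my own to spell it out:

```python
def double_factorial_odd(k):
    """Product k*(k-2)*(k-4)*... down to 1, for odd k (and 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def num_orthants(n):
    # number of labelled rooted binary trees with n leaves is (2n-3)!!,
    # which is the number of orthants glued together in BHV(n)
    return double_factorial_odd(2 * n - 3)

print([num_orthants(n) for n in range(2, 7)])  # [1, 3, 15, 105, 945]
```

The n = 4 entry is the 15 orthants visible in the picture of T(4).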

If you truncate BHV(n) so it’s not infinite (just pick some compact bound), then it forms a nonpositively curved cube complex, and we love those! CAT(0) cube complexes are great. I haven’t blogged too much about them (first terrible post and then those truncated Haglund notes) but they are the basis of all that I do and the number one thing I talk about when I give math talks. Whoops! The gist is that you glue cubes together in not-terrible ways, and then the resulting complex has great and fun properties (like you can cut it in half the way you want to).

That’s about all I have to say about this! Gillian is working on some stuff about putting a probability measure on BHV(n) [you can’t do it with certain conditions], embedding it into a small enough Euclidean space that still preserves some of its features, and finding an isometrically embedded copy of the phylogenetic tree inside BHV(n) instead of just the coordinate point. Also, fun fact to prove to yourself (actually please don’t scoop my friend), find the automorphism group of BHV(n)! It’s just the symmetric group on some number that has to do with n (n+1 or something like that; I can’t remember and didn’t take notes).

Again, the main reference for this is the seminal paper that should also be accessible as it’s meant for biologists and statisticians.

Anyway, I wanted to blog about some math I saw there, based on a talk by Chassidy Bozeman, a grad student at Iowa State and a fellow EDGE-r. The talk is from a paper that resulted from her graduate research course, which Iowa State offers to intermediate grad students to get them started on research (hence the eight authors on that paper). I thought it was fun and approachable! And I have one crumpled page of two-month-old notes, so we’ll see how this goes.

First, let’s remember what a **graph** is: a collection of vertices (dots) and edges (lines between dots). A graph is **simple **if there’s at most one edge between any two vertices, and if no vertex has an edge to itself (aka a **loop**). A **loop graph **allows loops, but still no multiple edges.

You can associate an infinite family of symmetric matrices to a loop graph. These square matrices have one row and one column for each vertex of the graph, and the (*i*, *j*) entry is 0 if and only if there is no edge between the corresponding vertices *i* and *j* in the graph. The rest of the entries can be any real nonzero numbers. This infinite family is useful for defining the **minimum rank of a loop graph**: it’s the minimum rank over all of this infinite family of matrices. The **rank** of a matrix is the number of linearly independent rows (or columns) it has. For definition by example, $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ has rank 1, and $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ has rank 2.
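To make rank concrete, here's a quick check with numpy (example matrices of my own choosing):

```python
import numpy as np

# rank = number of linearly independent rows (or columns)
A = np.array([[1, 2],
              [2, 4]])   # second row is twice the first, so rank 1
B = np.array([[1, 0],
              [0, 1]])   # rows are independent, so rank 2

print(np.linalg.matrix_rank(A))  # 1
print(np.linalg.matrix_rank(B))  # 2
```

The minimum rank of a loop graph is then the smallest such number you can achieve while respecting the graph's zero pattern.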

So the goal of Chassidy’s talk was to characterize all loop graphs whose minimum rank is the full size of the graph. A couple more definitions before the theorem: a *generalized cycle* is a subgraph whose connected components are either a loop, a single edge, or a cycle. It’s *spanning* if it touches every vertex in the graph. Spanning generalized cycles don’t always exist!

**Theorem:** A loop graph has minimum rank equal to its size if and only if there exists a **unique** spanning generalized cycle.

Quick nonexample: here’s a picture of a loop graph that has more than one spanning generalized cycle, so from the theorem we know it doesn’t have minimum rank four.

This is theorem 3.1 in their paper. It’s also comically the end of my notes on this talk. Here are a few facts they use to prove the theorem:

- If *H* is an induced subgraph of *G* (that means the vertices of H are a subset of the vertices of G, and its edges are all the edges that appear in G within that subset), then its minimum rank is bounded above by the minimum rank of G. Succinctly, $\operatorname{mr}(H) \leq \operatorname{mr}(G)$.
- If *G* is a disjoint union of a bunch of connected components $G_1, \ldots, G_k$, then its minimum rank is the sum of the minimum ranks of those components: $\operatorname{mr}(G) = \sum_{i=1}^{k} \operatorname{mr}(G_i)$.

Here’s a picture of my notes! Read the actual paper if you want to know more!

I’ve read Sageev’s PCMI lecture notes several times by this point; they’re the basis of my work and many other people’s (in particular, my friend Kasia has an impressive number of publications related to this stuff). And every single time I get stumped on the same exercise near the end, so I thought I’d try to write up a solution, crowd-source it among my blog readers, and figure out something correct. For reference, these are exercises 4.27 and 4.28 in his notes, but I’ll lay out the problems so you don’t need to look them up if you don’t want to. Please comment with corrections/ideas!

A thing that mathematicians care about is the structure of a group. We say that a particular subgroup $H < G$ is **separable** if for any group element *g* that’s not in H, we can find a finite index subgroup K that contains H but doesn’t contain *g*. Intuitively, we can separate *g* from H using a finite index subgroup. Here’s the cartoon:

The first exercise is to show that this definition is implied by another one: that for any group element *g* that’s not in H, we can find a homomorphism $f: G \to F$, where F is a finite group, so that the image of *H* under the map doesn’t contain *f*(*g*).

So let’s say we start with such a homomorphism, and our goal is to find a finite index subgroup *K* that contains *H* but not *g*. Since we’ve got a homomorphism, let’s use it and try $K = f^{-1}(f(H))$. Since $f(g) \notin f(H)$, we know this definition of *K* excludes *g*, as desired. Then we need to show that K is finite index in G and we’ll be done.

What about the first isomorphism theorem? We have a map $f: G \to F$, and we know $f(H) \leq F$, and it is a proper subgroup since f(g) isn’t in f(H). **This next bit is iffy and I could use help!**

- (*Original*) Then we have a map $G \to F/f(H)$ induced by the map *f*, and the kernel of this map is *K*. By the first isomorphism theorem, the index of *K* in *G* is the size of the image of this map. Since *F/f(H)* is finite, the image of the map is finite. So *K* has finite index in *G*, as desired. [What’s iffy here? You can’t take quotients with random subgroups, just with normal subgroups, and I don’t see why f(H) would be normal in F unless there’s something I don’t know about finite groups.]
- (*Based on Zach Himes’ comment*) By the first isomorphism theorem, ker *f* has finite index in G. We know ker *f* is contained in *K*, since 1 is contained in f(H) [since 1 is contained in H, and f(1)=1, where 1 indicates the identity elements of G and F]. It’s a fact that if $\ker f \leq K \leq G$, then $[G : \ker f] = [G : K]\,[K : \ker f]$. Since the left hand side is finite, the right hand side is also finite, which means that K has finite index in G, as desired.
- (*Based on conversation with Nick Cahill*) We can think of F/f(H) as a set of cosets, which is not necessarily a group, and say that G acts on this set by $g \cdot x f(H) = f(g)\,x\,f(H)$. Then $K = \operatorname{Stab}(f(H))$. By the orbit-stabilizer theorem, $[G : K] = |\operatorname{Orb}(f(H))|$. Since F is finite, the size of the orbit must be finite, so K has finite index in G, as desired.
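None of this needs anything fancier than pushing elements through a homomorphism, so here is a toy sanity check of the whole construction with $G = \mathbb{Z}$ (additive), $H = 4\mathbb{Z}$, $g = 2$, and $f$ reduction mod 4 (my own example, not from the notes):

```python
# Toy check: G = Z (additive), H = 4Z, g = 2, f: Z -> Z/4Z reduction mod 4.
def f(x):
    return x % 4

f_H = {f(4 * k) for k in range(-25, 25)}   # image of H, here just {0}
g = 2
assert f(g) not in f_H                      # the homomorphism separates g from H

# K = f^{-1}(f(H)) contains H, excludes g, and has finite index in Z
window = range(-100, 100)
K = {x for x in window if f(x) in f_H}
assert g not in K
assert all(4 * k in K for k in range(-20, 20))   # H sits inside K
assert K == {x for x in window if x % 4 == 0}    # K is exactly 4Z here

# the index [Z : K] equals [f(Z) : f(H)], which is 4 in this example
print(len({f(x) for x in window}) // len(f_H))
```

Of course this example is abelian, so everything is normal and the "iffy" step is invisible; the point is just to see the definitions move.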

The second exercise has to do with the **profinite topology**. Basic open sets in the profinite topology of a group are finite index subgroups and their cosets. For instance, in the integers, the even numbers $2\mathbb{Z}$ and the odd numbers $1 + 2\mathbb{Z}$ are both open sets in the profinite topology. Being closed in the profinite topology is equivalent to being a separable subgroup (this is the second exercise).

So we have to do both directions. First, assume we have a separable subgroup *H*. We want to show that the complement of *H* is open in the profinite topology. Choose a *g* in the complement of *H*. By separability, there exists a finite index subgroup *K* that contains *H* and not *g*. Then there’s a coset *tK* of *K* which contains *g*. Since *H* lies inside *K*, and distinct cosets are disjoint, *tK* misses *H* entirely. This coset is a basic open set, so *g* is contained in a basic open set inside the complement of *H*, and the complement of *H* is open.

Next, assume that *H* is closed in the profinite topology, so we want to show that *H* is separable. Again, choose some *g* in the complement of *H*. Since the complement of *H* is open, *g* is contained in a coset *tK* of some finite index subgroup *K*, and this coset misses *H* entirely. Say *K* has finite index *n*. We can form a map $f: G \to S_n$, to the symmetric group on *n* letters, which tells us how each group element permutes the cosets of *K*. If *f*(*g*) were equal to *f*(*h*) for some *h* in *H*, then *g* and *h* would send the coset *K* to the same coset, so *gK = hK = tK*; but *tK* misses *H*, a contradiction. So we’ve made a homomorphism to a finite group so that *g* and *H* have disjoint images, which we said implies separability by the previous exercise.

Okay math friends, help me out so I can help out my summernar! So far in summernar we’ve read these lectures by Sageev and some parts of the Primer on Mapping Class Groups, and I’m curious what people will bring out next.

Here are some pictures from last week:

The picture on the left describes technique 1, which we did last time to count how many points are in an octahedron with vertices at (0,0,*n*), (0,0,-*n*), (0,*n*,0), (0,-*n*,0), (*n*,0,0), and (-*n*,0,0). The edges of the octahedron are lines between vertices that live in the same octant, and the faces live in each octant (so there are eight faces). Now we’ll do **Technique 2**: counting the points in each *octahedral shell*, which is also called the spherical number.

Let’s look at just the *yz *plane, that is, where *x=0. *So we want the integer points in the *yz *plane that lie on the *n-*shell of the octahedron. Let’s stop and think about what we said in the last paragraph: there are eight faces because there are eight identical octants. What if we count up all the points in one octant, then multiply by 8, and subtract any points we double counted? Sounds great to me! Let’s now just focus on when *x, y, *and *z *are all greater than or equal to 0.

So now we have these points on a line such that *x=0*, and the line runs between (0,n,0) and (0,0,n). Well thanks to point-slope formula, or slope-intercept form, or however you want to write your equation, we know that these points satisfy the equation of the line: *y+z=n*. How many ways can I cut up *n* into two numbers *y *and *z *such that both are at least 1? (I’m going to say they both have to be at least one because the vertices (0,n,0) and (0,0,n) will be special in the next paragraph). Well if I choose *y*, that automatically chooses *z*. I have *n-1* choices for *y* (they are {1,…,n-1}), so that means I have exactly *n-1* ways to choose *y *and *z*. Then there are *n-1* points on the line. I can either count and see I have 12 edges in my octahedron, so insides of edges contribute *12n-12* to the total count, or I can see I have three edges in my positive octant, multiply by 8 (because we have eight octants), and then divide by two since each edge lies in two octants. Either way, we contribute *12n-12 *to the final count.

By the same logic, I have exactly three vertices in my positive octant, each of which appears in 4 neighboring octants. These contribute 6 to the final count (3·8/4 = 6 if we do the double counting method, or again we can just look at how many vertices are in an octahedron).

Now I’ve counted the edges and vertices in my octahedral shell, so I only need to figure out how many points lie inside the triangular face. Again I can write an equation for the face: *x+y+z=n*. Then I need to figure out how many ways I can partition *n* balls into three boxes, each with at least one ball inside. This is a partition problem.

Let’s try to tackle this the way we did the last one. But maybe let’s use an example, like the number 6. How many ways can I cut up 6? Well, I can choose 3 for *x*, which leaves 3 leftover to split into two numbers *y* and *z*. From above I know there are 3-1=2 ways to choose, which I can check by hand: I can pick 2,1 or 1,2 for *y* and *z*. I can also choose 4 for *x*, which leaves 2 leftover, or 2 for *x*, which leaves 4 leftover, or 1 for *x*, which leaves 5 leftover. Let’s organize this into a table.

| *x* | leftover for *y+z* | ways |
|---|---|---|
| 1 | 5 | 4 |
| 2 | 4 | 3 |
| 3 | 3 | 2 |
| 4 | 2 | 1 |

Notice then that the number of ways I have to partition 6 into three bins, each of which has value at least 1, is 1+2+3+4. This logic generalizes: I can pick any number from 1 to *n-2* for *x*, and each of those options gives me *n-x* leftover to split into *y* and *z*, and from above that’s *n-x-1* such options. So I have $\sum_{x=1}^{n-2}(n-x-1) = 1+2+\cdots+(n-2)$ ways. Thanks to nine-year-old Gauss via fable, I know how to sum up the numbers from 1 to *k*: it’s a formula, $1+2+\cdots+k = \frac{k(k+1)}{2}$. There’s a bunch of ways to explain this formula; here’s an easy to read collection. So if I apply this formula to my sum, I have $\frac{(n-2)(n-1)}{2}$. This is all the points inside the face that lie completely in the octant, so if I multiply this by 8 I have contributed 4(*n-1*)(*n-2*) points.
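This count is easy to sanity check by brute force. Here’s a quick Python sketch (the function name is mine, not from the post):

```python
def compositions_into_three(n):
    """Ways to write n = x + y + z with x, y, z all at least 1 (order matters)."""
    return sum(1 for x in range(1, n - 1)   # x can be 1, ..., n-2
                 for y in range(1, n - x))  # then z = n - x - y >= 1 is automatic

# the worked example: 6 splits in 1+2+3+4 = 10 ways
assert compositions_into_three(6) == 10

# and in general the count matches (n-1)(n-2)/2
for n in range(3, 50):
    assert compositions_into_three(n) == (n - 1) * (n - 2) // 2
```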

Then in the octahedral shell, I can add up all the contributions and get 6+(12n-12)+4(n-1)(n-2), which simplifies to $4n^2+2$.
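If you want to double check the whole shell bookkeeping, a few nested loops will do it. This Python sketch (names mine) classifies shell points by how many coordinates are zero, which is exactly the vertex/edge/face split above:

```python
def shell_breakdown(n):
    """Classify points on the shell |x|+|y|+|z| = n by how many coordinates are 0."""
    counts = {0: 0, 1: 0, 2: 0}
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            for z in range(-n, n + 1):
                if abs(x) + abs(y) + abs(z) == n:
                    counts[(x, y, z).count(0)] += 1
    return counts

for n in range(2, 7):
    c = shell_breakdown(n)
    assert c[2] == 6                       # vertices (two zero coordinates)
    assert c[1] == 12 * n - 12             # interiors of the 12 edges
    assert c[0] == 4 * (n - 1) * (n - 2)   # interiors of the 8 faces
    assert sum(c.values()) == 4 * n**2 + 2
```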

We actually want to count all the lattice points *inside* an octahedron. The octahedron is a bit special, since any lattice point inside it will lie on one of these octahedral shells, like a bunch of nesting Russian dolls. So now I need to do $1+\sum_{k=1}^{n}(4k^2+2)$, where I added the 1 at the beginning to represent the single point (0,0,0). Luckily I also have a formula for the sum of squares, though without fun folklore to go with it. The formula is $\sum_{k=1}^{n}k^2 = \frac{n(n+1)(2n+1)}{6}$. Plugging that into my formula, I have that the number of points in the octahedron is $\frac{4n^3+6n^2+8n+3}{3}$, which is **EXACTLY WHAT WE GOT LAST WEEK!!!!**
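Working out that sum gives $\frac{4n^3+6n^2+8n+3}{3}$, and we can check the final formula directly against a brute-force count (Python sketch, names mine):

```python
def octahedron_count(n):
    """Brute-force count of lattice points with |x|+|y|+|z| <= n."""
    return sum(1 for x in range(-n, n + 1)
                 for y in range(-n, n + 1)
                 for z in range(-n, n + 1)
                 if abs(x) + abs(y) + abs(z) <= n)

# multiply through by 3 to stay in exact integer arithmetic
for n in range(8):
    assert 3 * octahedron_count(n) == 4 * n**3 + 6 * n**2 + 8 * n + 3
```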

I hope this was as exciting for you as it was for me. My friend Jeremy Kun and an anonymous number theorist commented last week about the Gauss Circle Problem and the error terms in it. I just wanted to add that Gauss offered up the problem and the error guess, and it took 100 years until Sierpinski came up with a proof for something not that close to the error guess, and now it’s been like 200 years and we don’t quite have a proof for what we think the answer is yet. Rumor has it that several years ago someone posted a proof to the arXiv, then quietly withdrew it.

Partner has pointed out that there’s a wikipedia entry for this problem but it doesn’t go into as much depth or have as cute pictures as my two blog posts.

First, some two dimensional background and set up. The **Gauss Circle Problem** has been around for a looong time (200+ years) and is not actually solved. Let’s live in the *xy* plane, and mark points where *x* and *y* are integers in the plane. We get a lattice of points:

The Gauss circle problem asks: if I make a circle with radius *n* centered at the origin, how many lattice points are inside that circle? We can count the first few:

There are four points on the pink circle and one point inside, so *G*(1)=5. The orange circle has four points on it, and it also adds the four points between the pink and orange circles. So *G*(2)=4+4+5=13. And the beige circle has eight points on it, and also adds more inside points, so *G*(3)=29. Gauss thought that $G(n) = \pi n^2 + O(n)$. We haven’t talked about this big O notation (it’s literally called big O notation; we’re very creative nomenclators), and I’m not going to do so here: take it as an “error term up to linearity”. That means that while *G*(273) is around 74529π, it’s off by an amount that’s not “too far” from 273, a.k.a. probably closer to 273 than to 74529.
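These counts are easy to verify by brute force; here’s a Python sketch (the name *G* just follows the post’s notation):

```python
import math

def G(n):
    """Lattice points inside or on the circle of radius n centered at the origin."""
    return sum(1 for x in range(-n, n + 1)
                 for y in range(-n, n + 1)
                 if x * x + y * y <= n * n)

# the counts from the picture
assert G(1) == 5 and G(2) == 13 and G(3) == 29

# the error G(n) - pi*n^2 stays linear-ish in n
# (the easy geometric bound from unit squares is about 4.5n)
for n in (10, 100, 273):
    assert abs(G(n) - math.pi * n * n) < 5 * n
```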

How do you count *G*(3)? One way you can do it is count *G*(2), then count all the added points, like we did above. You can also find a smaller region to look at, then multiply to fill in the circle and subtract double counted points. In the example above, let’s just look at one quadrant.

It’s pretty easy to see how many points are in there: I count 11 integer lattice points. Now we multiply by 4 to get 44 lattice points. But we double counted! Imagine the second copy as picking up that pie slice, rotating it, and putting it back down in the tin in the northwest corner. I’ve overlapped the first slice along one edge, so if I want the right count I’ll have to subtract the four points on that edge from my total. Similarly, multiplying by four double counts all of the edge points on the green lines, and counts the origin four times. So I need to subtract 3 for the quadruple counted origin, and 12 for the double counted edge points: that leaves 44-15=29, which is good because it’s what I had above!
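The multiply-and-subtract bookkeeping can also be checked in a few lines of Python (names mine):

```python
def G(n):
    """Direct count of lattice points in the circle of radius n."""
    return sum(1 for x in range(-n, n + 1)
                 for y in range(-n, n + 1)
                 if x * x + y * y <= n * n)

def G_by_quadrants(n):
    """Count one closed quadrant, multiply by 4, subtract the double counts."""
    quad = sum(1 for x in range(n + 1)
                 for y in range(n + 1)
                 if x * x + y * y <= n * n)
    # 4n non-origin axis points each counted twice, and the origin counted 4 times
    return 4 * quad - 4 * n - 3

assert G_by_quadrants(3) == 4 * 11 - 15 == 29   # the worked example
for n in range(1, 30):
    assert G_by_quadrants(n) == G(n)
```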

For the next problem, we’re going to use a black box called Pick’s Theorem. It’s not an extremely dark black box, since there’s a proof of it on that wikipedia page, but I’m not going to explain the proof here, just state it: **if you have a polygon with vertices at integer coordinates in the plane, then A=i+b/2-1**, where *A* is the area of the polygon, *i* is the number of lattice points inside the polygon, and *b* is the number of lattice points on the boundary of the polygon. Notice this works for *polygons*, not circles (circles are the Gauss circle problem, still unsolved).

If I let *A*, *i*, and *b* represent what they do for my initial polygon, I can write a formula for the number of lattice points after I scale the first polygon by *n*: the count is $i_n + b_n = A_n + \frac{b_n}{2} + 1$ by Pick’s theorem. Since *A* stands for area and I’m scaling linearly by a factor of *n*, $A_n = n^2 A$. Similarly, *b* counts boundary points, which scale by a linear factor, so $b_n = nb$. Plugging these both in to the original equation gives $i_n + b_n = An^2 + \frac{b}{2}n + 1$, which is great! You only need to find *A* and *b* for the first polygon to find the number of lattice points in any scaling of it.
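Here’s a quick check of the scaled Pick formula on a hypothetical example: the right triangle with vertices (0,0), (2,0), (0,2), which has area A=2 and b=6 boundary lattice points (Python sketch, names mine):

```python
def triangle_points(n):
    """Lattice points in the n-scaled triangle, i.e. vertices (0,0), (2n,0), (0,2n)."""
    return sum(1 for x in range(2 * n + 1)
                 for y in range(2 * n + 1)
                 if x + y <= 2 * n)

A, b = 2, 6  # area and boundary lattice points of the unscaled (n = 1) triangle
for n in range(1, 30):
    assert triangle_points(n) == A * n**2 + (b // 2) * n + 1
```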

So here’s the next problem: find *G(n)* for an *octahedron* in three dimensions (we’ve run into the octahedron in the Poincaré homology sphere posts). For a recap, the octahedron is what happens when you draw vertices at (0,0,*n*), (0,*n*,0), (*n*,0,0), (-*n*,0,0), (0,-*n*,0), (0,0,-*n*), draw the edges between them, and then fill it in.

So now it’s a little harder and weirder to count, but we can still do it, using a mixture of the techniques above. I’ll show two different ways to count this.

**Technique 1**: use level sets! My friend and co-author Andrew Sanchez came up with this clever idea, which uses the theorem we have above. Imagine that we’re at the top of the octahedron, at point (0,0,*n*), and we ride an elevator down the *z*-axis. At height *z*, the slice of the octahedron is a scaled copy of the diamond with vertices (±1,0) and (0,±1): precisely, it’s the set of points with *|x|+|y|≤k*, where *k=n-z*. So at each level we just need to count the lattice points in a diamond scaled by *k*.

This is exactly $G_D(k)$, where *D* is the diamond! So from the formula above, we can calculate $G_D(k) = 2k^2 + 2k + 1$, by finding the initial area (2) and number of boundary points (4) of the initial diamond.

Now we’ll need to add up all of these points from all of the level sets. The elevator starts with one point, which corresponds to z=n and k=0, and in the fattest part of the octahedron we have z=0 and k=n. Then it gets skinny again. So we can count all the points from z=0 to z=n, multiply by two, and subtract what we double counted (the fat base). This shows up as follows: $2\sum_{k=0}^{n}(2k^2+2k+1) - (2n^2+2n+1)$. There are a bunch of proofs that $\sum_{k=1}^{n}k^2 = \frac{n(n+1)(2n+1)}{6}$ and that the sum of the first *n* numbers is $\frac{n(n+1)}{2}$. So if I go through the arithmetic (you can do it, I believe in you!) I get $\frac{4n^3+6n^2+8n+3}{3}$.
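All of this level-set bookkeeping can be verified in a few lines (Python sketch, names mine):

```python
def G_D(k):
    """Lattice points in the diamond |x|+|y| <= k, from A = 2, b = 4 and Pick."""
    return 2 * k * k + 2 * k + 1

# sanity check G_D itself against a direct count
for k in range(6):
    assert G_D(k) == sum(1 for x in range(-k, k + 1)
                           for y in range(-k, k + 1)
                           if abs(x) + abs(y) <= k)

def octahedron_by_levels(n):
    """Sum the level sets k = 0..n, double, subtract the double-counted fat base."""
    return 2 * sum(G_D(k) for k in range(n + 1)) - G_D(n)

# matches (4n^3 + 6n^2 + 8n + 3)/3, checked in integer arithmetic
for n in range(10):
    assert 3 * octahedron_by_levels(n) == 4 * n**3 + 6 * n**2 + 8 * n + 3
```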

Do you believe it? Let’s do a second technique too! Actually, I hate to leave you on a cliffhanger, but a second technique will take a lot more words and this post is already long enough. So you’ll have to wait til next week to see the technique I used with my friend Priyam Patel, whose current research I have previously blogged about. Preview: we use partitions!

*Side note: when I told my spouse that there are exactly 5 Platonic solids he reacted with astonishment. “Only 5?!” I’d taken this fact for granted for a long time, but it is pretty amazing, right?!*

Last week we learned about how to make the Poincaré homology sphere by identifying opposite sides of a dodecahedron through a minimal twist. I thought I’d go a little further into the proof that this dodecahedral space is $S^3/2I$, where the latter is the way that Kirby and Scharlemann denote the Poincaré homology sphere in their classic paper. This post is a guided meditation through pages 12-16 of that paper, and requires some knowledge of algebraic topology, group actions, and complex numbers.

Honestly, I don’t know too much about $2I$ (the binary icosahedral group), but I do know that it’s a double cover of *I*, which is the group of symmetries of the icosahedron. For instance, if you pick a vertex, you’ll find five rotations around it, which gives you a group element of order 5 in *I*. Every symmetry will be a product of such rotations.

Last time we watched this awesome video to see how you can tessellate the three sphere by 120 dodecahedra, and explained that we can think of the three sphere as {Euclidean three space plus a point at infinity} using stereographic projection.

Hey, that’s great! Because it turns out that there are 60 elements in *I*, which means that $2I$ has 120 elements in it. Let’s try to unpack how *I* acts on the three sphere.

First, we think of how the three sphere acts on the three sphere. By “three sphere,” I mean all the points equidistant from the origin in four space. The complex plane is a way to think of complex numbers, and it happens to look exactly like $\mathbb{R}^2$. So if I take the product of two copies of the complex plane, I’ll get something that has four real dimensions. We can think of the three sphere as all points distance 1 from the origin in this space. So a point on the three sphere can be thought of as a pair of points *(a,b)*, where *a* and *b* are both complex numbers and $|a|^2+|b|^2=1$. Finally, we identify this point with a matrix $\begin{pmatrix} a & -\bar{b} \\ b & \bar{a} \end{pmatrix}$, and then we can see how the point acts on the sphere: by matrix multiplication! So for instance, the point *(a,b)* acts on the point *(c,d)* by $(a,b)\cdot(c,d) = (ac - \bar{b}d,\ bc + \bar{a}d)$, where I abused a bit of notation to show it in coordinate form.
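Here’s a small Python sketch of this action, using one common matrix convention (the post’s original formulas, which were stripped, may differ by a sign convention); the names are mine:

```python
import cmath

def mat(a, b):
    """The 2x2 complex matrix attached to the point (a, b) on the three sphere."""
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def mul(M, N):
    """Ordinary 2x2 matrix multiplication."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 0.6 * cmath.exp(0.4j), 0.8j    # |a|^2 + |b|^2 = 1
c, d = 0.0 + 0.0j, 1.0 + 0.0j         # another point on the three sphere
P = mul(mat(a, b), mat(c, d))

# the product is again mat(a', b') for a unit pair (a', b'):
# the three sphere really does act on itself
a2, b2 = P[0][0], P[1][0]
assert abs(P[0][1] + b2.conjugate()) < 1e-12
assert abs(P[1][1] - a2.conjugate()) < 1e-12
assert abs(abs(a2)**2 + abs(b2)**2 - 1) < 1e-12
```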

What does this actually do? It rotates the three sphere in some complicated way. But we can actually see this rotation somewhat clearly: set *b* equal to 0, and choose *a* to be a point on the unit circle of its complex plane. Because *a* is a complex unit, this is the same as choosing an angle of rotation θ [$a = e^{i\theta}$].

Remember how we put two toruses together to make the three-sphere, earlier in the video? Each of those toruses has a middle circle so that the torus is just a fattening around that middle circle. Now think about those two circles living in our stereographic projection. One is just the usual unit circle in the *xy* plane of $\mathbb{R}^3$, and the other is the *z* axis plus the point at infinity. So how does *(a,0)* act on these circles? We can choose the basis cleverly so that it rotates the *xy* unit circle by θ, and ‘rotates’ the *z* axis also by θ. That means that it translates things up the *z* axis, but by a LOT when they’re far out on the *z*-axis and by only a little bit when they’re close to the origin.

Side note: this makes it seem like points are moving a lot faster the farther you look from the origin, which is sort of like how the sun seems to set super fast but moves slowly at noon (the origin if we think of the path of the sun as a line in our sky + a point at infinity aka when we can’t see the sun because it’s on the other side of the Earth).

Similarly, if we don’t have *b=0*, we can do some fancy change of coordinates via matrix multiplication and find some pair of circles that our *(a,b)* rotates in some way. In either case, once we define how the point acts on these two circles, we can define how it acts on the rest of the space. Think of the space without those two circles: it’s a collection of concentric tori (these ones are hollow) whose center circle is the blue unit circle, and whose hole centers on the red axis. If you have a point on one of those tori, we move it along that torus in a way consistent with how the blue and red circles got rotated.

This is a schematic: the blue and green circles get rotated, so the purple point on the pink torus gets rotated the way the blue circle does, and then up the way the green circle does.

What does this have to do with I? Fun fact: the symmetries of the icosahedron are the same as the symmetries of the dodecahedron! (Because they’re duals). So let’s look back at that tessellation of the 120-cell by dodecahedra, and stereographically project it again so that we have one dodecahedron centered at the origin, with a flat face at (0,0,1) and (0,0,-1), and a tower of ten dodecahedra up and down the z-axis (which is a circle, remember).

Now imagine rotating around the green axis by a click (a $2\pi/5$ rotation). This is definitely a symmetry of the dodecahedron. It rotates the blue circle, and by our action as we described earlier, it also rotates the green circle, taking the bottom of our dodecahedron to the top of it (because the green circle gets rotated by $2\pi/10$, exactly one dodecahedron’s worth of the tower). So this identifies the bottom and top faces with that minimal rotation. We said earlier that this rotation has order 5 in *I*, which means that it has some corresponding group element in $2I$ with order 10. 10 is great, because that’s the number of dodecahedra we have in our tower before we come back to the beginning: so if we keep doing this group element rotation, we end up identifying the top and bottom of every dodecahedron in our *z*-axis tower of 10.
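We can watch this order-10 phenomenon happen right in the unit complex numbers, using the *b=0* picture from earlier: the lift of a $2\pi/5$ click rotates the circles by $2\pi/10$ (a sketch under that assumption, in Python):

```python
import cmath

a = cmath.exp(1j * cmath.pi / 5)   # theta = 2*pi/10, the lift of one 2*pi/5 click

assert abs(a**5 + 1) < 1e-12    # a^5 = -1: five clicks is NOT the identity upstairs
assert abs(a**10 - 1) < 1e-12   # a^10 = 1: the element has order 10 in the double cover
assert all(abs(a**k - 1) > 1e-6 for k in range(1, 10))  # and no smaller power works
```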

Similarly, using change of coordinate matrix basis madness, we can figure out how the rotations around the centers of each of the faces act on the 120-cell (hint: each one will identify all the dodecahedra in a tower just like our first one did). With 120 elements in $2I$, each element ends up identifying one of the dodecahedra in the 120-cell with our origin dodecahedron, including that little twist we had when we defined the space.

So that’s it! That’s how you go from the tessellation of the 120-cell to the Poincaré homology sphere dodecahedral space. Huzzah!
