Visualisation Of Scottish Demographic Data

This project explores the use of interactive visualisations to augment the extensive data published by the National Records of Scotland (NRS). Good visualisation can illustrate key trends in statistical data, increasing impact and accessibility; great visualisation can go further, and enable us to identify and explore unexpected connections. Data visualisations can therefore support operational research, but we will see that producing them also entails solving problems of an OR flavour.

We survey the existing literature for principles of good design in presenting data visually; much of this is aimed at hand-produced imagery for print, so we examine how it can best be used in the new context of procedurally generated, interactive visualisations for the web. In the first instance, we consider this for chart types which have proven popular or successful for static visualisations, particularly if already used by NRS.

This leads us to investigate more complicated data sets which can be interpreted as having a graph-theoretic structure. We will show how the constrained layout of networks of vertices with an associated size can be posed as an optimisation problem, and develop a visualisation that operates under such constraints. Further, we will consider the use of geographic clustering to represent migration flow, describing and implementing a novel ‘re-wiring’ algorithm to generate tree structures that produce better visualisations than standard agglomerative approaches.

Finally, we present a portfolio of visualisations created for NRS that follow the design principles identified and make use of the software tools developed during the project.

There is also an online version of the appendix with links to the various visualisations developed, including source code and sample data files. The rest of this post gives at-a-glance versions.

The Cause of Death Explorer

Cause of Death Treemap

*Experimental alternative presentation of the above data set; not suitable for Internet Explorer*

Fertility data (cohort effects)

Popular baby names for boys

*Also available for girls!*

Life Expectancy

*The first to be used in NRS reporting, appearing here.*

My Erdős number…

If I do not publish any more papers, the best I can hope for is three, if Gary later collaborates with a one. But for now my goal should be to obtain a Bacon number…

Recent popular baby names in Scotland

The data available is the top 100 names for each of boys and girls born in Scotland, every year 1998-2013 except 2000 (for unknown reasons). For each name that features, the precise count is also given – but for any that fail to make the cut, we don’t have this figure. As well as this censoring effect – for which the precise threshold will vary each year – raw counts should really be considered in the context of varying birth rates too: there may be fewer children with a particular name simply because there are fewer children! So for the visualisation project I focused my efforts on the rankings. After various experiments, I settled on simply showing the top 20 each year. Interestingly, this doesn’t require too much data. For example, there are just 25 different boys’ names that feature in any of the 13 top 10s; and only another 16 are needed to form the pool for the top 20s, as many of those are past or future members of the top 10. So, here are the boys:

and similarly for the girls:

However, I couldn’t resist going back to the raw counts to look at some of these in more detail. For instance, at the top of the charts we seem to have captured “peak *Emma*”; from highs of around 630 in 2003-04, it not only lost the top spot to *Sophie* but plummeted out of the top 10 (and nearly the top 20), with just 237 of them a decade later. The shift is even more pronounced when you consider that *Sophia* cracked the top 20 from 2011, and *Sofia* is also to be found further down the top 100. For the boys, *Lewis* has also declined substantially from its chart-topping days, but still holds a top three position despite there being fewer than half as many in 2013 as in 2003.
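The earlier observation that the top 10 and top 20 charts draw on a small pool of names can be sketched in a few lines. This is only an illustration: the function name `name_pool` and the toy rankings (placeholder letters rather than real names or NRS figures) are assumptions made for the example.

```python
def name_pool(rankings, top_n):
    """Distinct names that appear in the top_n of at least one year."""
    pool = set()
    for ranked_names in rankings.values():
        pool.update(ranked_names[:top_n])
    return pool


# Invented rankings, most popular first; letters stand in for names.
# Note that 2000 is absent, as in the published data.
rankings = {
    1998: ["A", "B", "C", "D"],
    1999: ["B", "A", "D", "E"],
    2001: ["B", "E", "A", "F"],
}

top2_pool = name_pool(rankings, 2)        # names ever in a top 2
top3_pool = name_pool(rankings, 3)        # names ever in a top 3
extra_for_top3 = top3_pool - top2_pool    # names needed only for the larger chart
```

With real data, the same two calls with `top_n` set to 10 and 20 would give the sizes of the pools discussed above.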

The Sophie/Sophia/Sofia situation is an example of a rather common phenomenon in girls’ names. Although the truncated rankings will suppress the least popular variants, a sufficiently popular name can carry with it homophones (such as

As mentioned, the most popular names usually spend some time as moderately popular ones first, and take a while to disappear entirely. But an interesting example of a name that has very recently sprung into prominence is

Finally, I couldn’t resist an egotistical look at the data. However, in a sure sign of my advancing age, neither

^{1} Yes, those sound the same. Blame Gaelic. For bonus marks, can you pronounce 2007’s twentieth most popular name, *Eilidh*?


Fertility in Scotland

*The Registrar General’s Annual Review of Demographic Trends (158th Edition)*. The user can select from many more years, but as only one is shown at a time clutter is reduced (whilst animation helps to reveal the changing patterns, and tooltips provide clarification and precise data values). Moreover, there is a cohort effect within this data: it is not entirely accurate that fertility of 25 year olds fell from 1973 to 1974, as these are different groups of women. The transition animations therefore instead show the changing experiences of each of these groups as they age, identified by colour coding. For an alternative slice through the data along these lines, this version instead shows fertility at each age for a selected cohort; I am still considering if there is an effective way to combine the two.

**Source: Vital Events Reference Tables 2012** Table 3.6: Age-specific birth rates, per 1,000 female population, Scotland, 1951 to 2012.

- Live births only. Excludes births where mother’s age is not stated.
- Rate for age 15 includes births at younger ages and for age 44 includes births at older ages.
- The average age is calculated by adding 0.5 years to the mother’s age at her last birthday (e.g. it is assumed that 30-year-old mothers were, on average, aged 30 years and 6 months when they gave birth).
- The age-specific birth rates for 2002 to 2010 are the revised figures calculated using the rebased population estimates which were published on 17th December 2013.
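The difference between the two slices through this table (one calendar year at a time, or one cohort as it ages) can be sketched as follows. The rates are invented for illustration and are not NRS figures, the function names are hypothetical, and treating `year - age` as the birth year is a simplifying assumption.

```python
# Hypothetical age-specific birth rates per 1,000 women,
# keyed by (calendar year, mother's age). Not real data.
rates = {
    (1973, 24): 130.0, (1973, 25): 125.0, (1973, 26): 118.0,
    (1974, 24): 122.0, (1974, 25): 117.0, (1974, 26): 110.0,
    (1975, 24): 115.0, (1975, 25): 111.0, (1975, 26): 104.0,
}


def period_slice(rates, year):
    """Rates at every age in a single calendar year (one frame of the chart)."""
    return {age: r for (y, age), r in sorted(rates.items()) if y == year}


def cohort_slice(rates, birth_year):
    """Rates experienced by one group of women as they age.

    Assumes a woman aged `age` in year `y` was born in `y - age`.
    """
    return {age: r for (y, age), r in sorted(rates.items())
            if y - age == birth_year}


period_1974 = period_slice(rates, 1974)   # reads along a row of the table
cohort_1949 = cohort_slice(rates, 1949)   # reads along a diagonal
```

Comparing age 25 in 1973 with age 25 in 1974 compares two different cohorts; following the diagonal instead tracks the same women from 24 to 26.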

DataViz first steps

*Data Visualisation of Scottish Demographic Information*. Here’s a first dip into the world of D3, lightly adapted from these examples of chord diagrams. The data shown are Migration flows between Council areas for 2011-12 (most recent).

The Giant 4D Buckyball

Earlier in the year I participated in the building of a ‘Giant 4D Buckyball’ sculpture; the first of its kind in the UK, and assembled by a team of twenty during the opening day of the University of Edinburgh’s Innovative Learning Week. I then represented the project at the ASCUS Art and Science Salon as part of TEDxUniversityofEdinburgh at the end of the week. The build was one of several ILW events organised by Julia Collins from the School of Mathematics, and you can read her account here. There was a lot of coverage of this event, from student blogs to Scottish Television – although of varying standards of mathematical literacy! So I’ve put together a series of posts describing the fundamental building block, the ‘buckyball’:

Whilst the sculpture definitely counts as mathematical artwork, it also gave me a chance to indulge some of my other creative interests. As well as the images above, during the construction (and more recent deconstruction) I was able to capture the action through a pair of time-lapse videos (as always, setting to HD is recommended!):

What is a buckyball? Part 3: Fullerenes

The Platonic solids were extremely regular: every face had to be the same, with all angles and side lengths the same, and the same number of faces meeting at each vertex. In a *Fullerene* we allow the faces to be either pentagons or hexagons, but we require exactly three faces to meet at each vertex. We can still take a one-point projection and get a planar graph: it’ll be 3-regular from the vertex condition, and every face has degree five or six.

This turns out to force a seemingly stronger condition on our graphs:

A fullerene has exactly twelve faces of degree five.

To see this, suppose there are P faces of degree five (pentagons) and H faces of degree six (hexagons). Then all-in-all there are F = P + H faces, and we know from Euler’s formula that F = 2 + E - V. By 3-regularity we know 2E = 3V. So P + H = F = 2 + 3V/2 - V = 2 + V/2. Further, by handshaking for planar graphs we know 2E = 5P + 6H; so

3V = 5P + 6H = 5(2 + V/2 - H) + 6H = 10 + 5V/2 + H.

This tells us that V=2H+20, so H = V/2 -10. As P + H = 2 +V/2, we conclude P = 2 + V/2 – H = 2+ V/2 – (V/2 -10) = 12, as claimed.
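As a quick sanity check on the counting, the relations above can be evaluated for particular vertex counts. This is a minimal sketch: the function name `fullerene_counts` is hypothetical, and the code only checks the arithmetic, not whether a fullerene with that many vertices actually exists.

```python
def fullerene_counts(vertices):
    """Return (V, E, F, P, H) implied by the counting argument."""
    if vertices < 20 or vertices % 2:
        raise ValueError("need an even vertex count of at least 20")
    edges = 3 * vertices // 2        # 3-regular: 2E = 3V
    faces = 2 + edges - vertices     # Euler: F = 2 + E - V
    hexagons = vertices // 2 - 10    # from V = 2H + 20
    pentagons = faces - hexagons     # F = P + H
    # Handshaking for faces: 2E = 5P + 6H
    assert 2 * edges == 5 * pentagons + 6 * hexagons
    return vertices, edges, faces, pentagons, hexagons


dodecahedron = fullerene_counts(20)  # (20, 30, 12, 12, 0)
c60 = fullerene_counts(60)           # (60, 90, 32, 12, 20)
c70 = fullerene_counts(70)
```

Whatever even value of V is supplied, the pentagon count comes out as twelve; only the number of hexagons changes.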

So in a certain degenerate sense we’ve already seen a fullerene – if we have 12 pentagons and *no* hexagons, with three pentagons meeting at every point, then we have one of our Platonic solids – specifically, the dodecahedron. However, the motivation for studying fullerenes comes from molecular chemistry, where they arise as different *allotropes* of carbon. But the laws of physics get in the way of having adjacent pentagonal faces when building with carbon – the bonds are not stable. To be a viable fullerene in the chemical sense, our fullerene graph has to have isolated pentagons. That means that none of the five vertices of each of the twelve pentagons can be shared, so a fullerene has to have at least sixty vertices. But, remarkably, we *can* exhibit a 60 vertex planar graph with twelve pentagonal faces, all other faces hexagonal, three faces meeting at every vertex, and no two pentagons touching:

This, as you may have guessed, is our long-awaited Buckyball! Or, more properly, *Buckminsterfullerene*. This is the simplest possible isolated pentagon fullerene, but it is still much more complicated than the more familiar allotropes of carbon: graphite and diamond. The theoretical existence of the C_{60} allotrope had been advanced several times in the 60s and 70s, but was not generally accepted as a realistic possibility by the scientific community. That had to change in the 1980s, when it was first synthesised by Kroto, Curl and Smalley. They named it Buckminsterfullerene due to its resemblance to geodesic dome constructions by the architect Richard Buckminster Fuller:

They also produced C_{70}, showing that C_{60} was just one instance of a general class, the fullerenes: they received the 1996 Nobel prize in Chemistry for opening up this field of study. It has subsequently been shown that C_{60} is naturally occurring – it can be found in soot, created by lightning, and has even been identified in clouds of cosmic dust!

What is a Buckyball? Part 2: Projection

How can we represent a 3-dimensional object such as a cube in only 2 dimensions, such as on a flat piece of paper? This is the problem of *projection*, and it inevitably introduces inaccuracies. Different choices of *perspective* will alter what features survive the projection process. For instance, a perfect cube has all faces square, with corner angles of 90 degrees, and opposite sides of each square are parallel. But in the two-point perspective shown only the vertical lines remain parallel; the introduction of vanishing points has distorted the horizontal ones and thus the angles.

Instead of thinking of the cube as a solid object, we can describe it in terms of its vertices (the corners or points) and the edges that join them – that is, as a graph! But in the previous post we were interested in planar graphs, and in our two-point perspective we have edges crossing. This might seem unavoidable – the ‘front’ blocking our view of the ‘back’ – but through a different choice of projection we can get a planar graph of the cube, or indeed any suitably well-behaved solid. This is the key to using graph theory to study those solids.

Instead of the two-point perspective, we can draw the cube using one-point perspective: our viewpoint is the centre of one of the faces, and corresponds to a single vanishing point. We still have all the vertical lines parallel, but now the top and bottom of the two faces aligned with our view (the red and blue ones) remain parallel too. Whilst we have substantially distorted sizes now – in the perfect cube, the red and blue squares would be the same size – this perspective has given us a planar graph representation of the cube, known mathematically as its Schlegel diagram.

A polyhedron is a 3-dimensional solid made up of flat faces, joined along straight edges that meet at vertices (sharp corners): this terminology should already feel familiar from our discussion of planar graphs! And indeed there is a connection:

If a polyhedron is convex (its surface does not intersect itself, and straight lines joining any two points remain within its volume or on its surface) then its Schlegel diagram is a planar graph.

In Julia’s writeup, a special case of convex polyhedra, the regular or Platonic solids, is mentioned: these have the additional restriction that every face is the same regular polygon (all angles and side lengths equal), and the same number of faces meet at each vertex. The cube is an example: each face is a regular 4-gon (a square!), and we have three squares meeting at every vertex (corner).

The claim is that there are exactly five (convex) examples of these, and we can use our results on planar graphs to prove this! By example, we know there are at least five:

So our task is to rule out the possibility of any others. To do this, we need a little more notation for graphs: we call the number of edges meeting at a vertex the *degree* of that vertex; we call the graph *d*-regular if every vertex has the same degree, *d*. By the ‘double counting’ argument mentioned in the previous post, if you add up the degrees of all the vertices then you’ll get twice the number of edges (since each edge contributes 1 to the degree count at each of its ends). So a *d*-regular graph with V vertices has dV/2 edges.
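The double counting argument can be checked directly on the cube. The sketch below is illustrative only: labelling the cube's corners by triples of 0s and 1s, and joining two corners when they differ in exactly one coordinate, is a standard construction but one chosen for this example rather than taken from the posts.

```python
from itertools import combinations, product

# The eight corners of the cube, as coordinate triples.
vertices = list(product((0, 1), repeat=3))

# Two corners share an edge when they differ in exactly one coordinate.
edges = [(u, v) for u, v in combinations(vertices, 2)
         if sum(a != b for a, b in zip(u, v)) == 1]

# Degree of a vertex: the number of edges that have it as an endpoint.
degree = {v: sum(v in e for e in edges) for v in vertices}

d = 3
is_regular = all(deg == d for deg in degree.values())
degree_sum = sum(degree.values())   # equals twice the number of edges
```

Here V = 8 and d = 3, so the formula dV/2 predicts 12 edges, which matches the length of `edges`.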

Now let’s imagine we have some Platonic solid; whatever it is, it is made out of faces with some fixed number k of sides, and there are F of them. There are also V vertices, and at each of these we have d edges (the sides of the polygons) meeting. We can then project to a planar graph G, the Schlegel diagram, which will also have F faces (of degree k), V vertices, and E edges (corresponding to the sides). The graph G will be d-regular because d edges met at each vertex of the solid, so we know that 2E=dV, or V=2E/d. Every face has degree k, and there are F of them, so the total sum of face degrees is kF; but we know from last time that this sum is also twice the number of edges (the handshaking lemma). So 2E=kF too, or F= 2E/k.

We also know – by Euler’s formula – that to be planar G must satisfy V – E + F = 2. So this gives us 2E/d – E + 2E/k = 2, which (dividing through by 2E) rearranges to 1/d + 1/k = 1/E + 1/2. Whatever 1/E is, it’s bigger than zero, so that means that 1/d + 1/k is strictly bigger than 1/2. Suppose both d and k were bigger than 3. Then 1/d would be 1/4 or smaller, and so would 1/k: so their sum is at most 1/2, which is not strictly bigger than 1/2. So at least one of d and k must be at most 3. Since we can’t have a two-sided polygon, we know k is at least 3.

So we can start with k=3; the possible Platonic solids with triangular faces. Using 1/d + 1/3 > 1/2, we see that 1/d > 1/6, so d is less than 6; as at least 3 edges have to meet at each vertex, d can only be 3, 4 or 5. All 3 of these are possible: they are the tetrahedron, octahedron, and icosahedron.

If we take k strictly bigger than 3, then we know that d has to be no more than 3, but since at least 3 edges have to meet at a point in 3D space, d is also at least 3, so it must be precisely 3. Again we want 1/k + 1/3 > 1/2, so 1/k > 1/6 which means k is 4 or 5. k=4 gives us the cube (3 squares meeting at every point), and k=5 gives us the dodecahedron (3 pentagons meeting at every point), and we’ve ruled out any other possible pair of k and d.
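The case analysis can also be confirmed by brute force. This is a sketch under stated assumptions: the function name `platonic_candidates` and the search bound `limit` are hypothetical choices for the example, and the code simply tests the inequality 1/d + 1/k > 1/2 for every pair with d and k at least 3, recovering V, E and F from 1/E = 1/d + 1/k - 1/2, V = 2E/d and F = 2E/k.

```python
from fractions import Fraction


def platonic_candidates(limit=50):
    """List (k, d, V, E, F) for all 3 <= k, d < limit with 1/d + 1/k > 1/2."""
    found = []
    for k in range(3, limit):          # sides per face
        for d in range(3, limit):      # edges meeting at each vertex
            excess = Fraction(1, d) + Fraction(1, k) - Fraction(1, 2)
            if excess > 0:
                edges = 1 / excess     # since 1/E = 1/d + 1/k - 1/2
                found.append((k, d,
                              int(2 * edges / d),   # V = 2E/d
                              int(edges),           # E
                              int(2 * edges / k)))  # F = 2E/k
    return found


solids = platonic_candidates()
```

Exact fractions are used so that the strict inequality is not affected by floating-point rounding. The five tuples returned correspond, in order, to the tetrahedron, octahedron, icosahedron, cube and dodecahedron.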

So there can only be 5 Platonic solids, and we were able to prove this without having to consider any geometric properties like angles or side lengths – just the ‘combinatorial’ information about how many faces, edges and vertices are involved and the way they connect, as captured by a graphical representation. So, let’s close this part by taking a look at their Schlegel diagrams.
