Back’s boats

Senator Back is doing the rounds with a strong anti-boat-arrival theme. I fired back a letter in frustration, which I’ll get to in a moment.

First, I’ll mention something else I discovered. Back sent out two letters, about a month apart, each accompanied by a pamphlet on how Labor is failing to “stop the boats”. The content in general is no great surprise (i.e. thoroughly depressing), except when it comes to comparing the numbers. Here are the graphs shown in the pamphlets:

1st pamphlet (arrived June 2010)

2nd pamphlet (arrived July 2010)

Now, of course, the first uses financial years while the second uses calendar years, but look closely. The numbers do not add up. Specifically:

  • the first graph shows three arrivals in ’03-’04, while the second shows only one in ’03 and none in ’04; and
  • the first shows eight arrivals in ’05-’06, while the second shows only one in ’05 and three in ’06.

The first pamphlet is (roughly) consistent with official figures. (The Labor-era figures differ between the two pamphlets, but that’s roughly consistent with their having been printed a few months apart; I can’t spot any definite inconsistencies there.)

Here’s my more general response to Senator Back:

Dear Senator Back,

I read with great annoyance your second letter and pamphlet regarding boat arrivals and the mining tax.

Labor has capitulated on asylum seekers (and climate change). Your party might claim some credit for this, but now that the moral high ground is there for the taking, why do you persist in this spurious and degrading line of argument?

I am not worried in the least about the number of boat arrivals, and your graphs and numbers mean nothing to me. Frankly, I find the whole issue bizarre and offensive. How does the Liberal Party propose to assist those people fleeing persecution who are clearly unable to come via the official channels? If you do “stop the boats”, surely you will only increase the suffering felt by such people, who are apparently not wanted anywhere. You don’t seem to offer an alternative, other than suggesting that Australia wash its hands of the problem.

I would vote for the absence of policy sooner than I would vote for yours.

It’s almost as though the two major parties are actively vying to be the more perverse and incompetent. Labor has done everything it can to break our trust, and yet the Liberal Party runs scared of offering anything better. I find it incredible that you’re not able to put together a policy framework to put Labor to shame, because Labor has handed you this opportunity on a silver platter.

On the mining tax, very few disinterested experts seem to agree with your point of view. As you know, the mining tax was proposed by Ken Henry in a comprehensive review of the tax system; the Labor Party merely adopted it. Moreover, I’m unsure of the relevance of the figure you quote – the proportion of revenue coming from Western Australia. I’m an Australian before I’m a West Australian, as I hope you are. WA is not a nation in its own right. Australia and all its people own the resources on which the mining tax is to be levied; that much of that mineral wealth happens to be found in WA is neither here nor there.

There are many genuine reasons for changing the government. It’s time that the Liberal Party stood up and took notice of them, because as it stands now you do not offer an alternative.

Open source science

Slashdot notes an article from the Guardian: “If you’re going to do good science, release the computer code too”. The author, Darrel Ince, is a Professor of Computing at The Open University. You might recognise something of the mayhem that is the climate change debate in the title.

Both the public release of scientific software and the defect content thereof are worthwhile topics for discussion. Unfortunately, Ince seems to go for over-the-top rhetoric without having a great deal of evidence to support his position.

For instance, Ince cites an article by Professor Les Hatton (whom I also cite, on account of his recent study on software inspection checklists). Hatton’s article here was on defects in scientific software. The unwary reader might get the impression that Hatton was specifically targeting recent climate modelling software, since that’s the theme of Ince’s article. However, Hatton discusses studies conducted between 1990 and 1994, in different scientific disciplines. The results might still be applicable, but it’s odd that Ince would choose to cite such an old article as his only source. There are much newer and more relevant papers; for instance:

S. M. Easterbrook and T. C. Johns (2009), Engineering the Software for Understanding Climate Change, Computing in Science and Engineering.

I stumbled across this article within ten minutes of searching. While Hatton takes a broad sample of software from across disciplines, Easterbrook and Johns delve into the processes employed specifically in the development of climate modelling software. Hatton reports defect densities of around 8 to 12 per KLOC (thousand lines of code), while Easterbrook and Johns suggest 0.03 defects per KLOC for the current version of the climate modelling software under analysis. Quite a difference – more than two orders of magnitude, for those counting.
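For anyone who wants to check that claim, here’s the back-of-envelope arithmetic, using the defect densities as reported above:

```python
# Ratio between the defect densities reported by Hatton and by Easterbrook & Johns.
hatton_defects_per_kloc = (8, 12)    # Hatton's broad sample of scientific software
climate_model_per_kloc = 0.03        # Easterbrook & Johns' figure for the current model version

for density in hatton_defects_per_kloc:
    print(f"{density} / {climate_model_per_kloc} = factor of ~{density / climate_model_per_kloc:.0f}")
# Prints factors of roughly 267 and 400 -- somewhere between two and three orders of magnitude.
```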

Based on Hatton’s findings of the defectiveness of scientific software, Ince says:

This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program.

This is a profoundly strange thing for a Professor of Computing to say. It’s certainly true that one single error can invalidate a computer program, but whether it usually does this is not so obvious. There is no theory to support this proclamation, nor any empirical study (at least, none cited). Non-scientific programs are littered with bugs, and yet they are not useless. Easterbrook and Johns report that many defects, before being fixed, had been “treated as acceptable model imperfections in previous releases”, clearly not the sort of defects that would invalidate the model. After all, models never correspond perfectly to empirical observations anyway, especially in such complex systems as climate.

Ince claims, as a running theme, that:

Many climate scientists have refused to publish their computer programs.

His only example of this is Mann, who by Ince’s own admission did eventually release his code. The climate modelling software examined by Easterbrook and Johns is available under licence to other researchers, and RealClimate lists several more publicly-available climate modelling programs. I am left wondering what Ince is actually complaining about.

Finally, Ince seems to have a rather brutal view of what constitutes acceptable scientific behaviour:

So, if you are publishing research articles that use computer programs, if you want to claim that you are engaging in science, the programs are in your possession and you will not release them then I would not regard you as a scientist; I would also regard any papers based on the software as null and void.

This is quite a militant position, and does not sound like a scientist speaking. If Ince himself is to be believed (in that published climate research is often based on un-released code), then the reviewers of those papers who recommended publication clearly didn’t think as Ince does – that the code must be released.

Ince may be convinced that scientific software must be publicly-auditable. However, scientific validity ultimately derives from methodological rigour and the reproducibility of results, not from the availability of source code. The latter may be a good idea, but it is not necessary in order to ensure confidence in the science. Other independent researchers should be able to confirm or contradict your results without requiring your source code, because you should have explained all the important details in published papers. (In the event that your results are not reproducible due to a software defect, releasing the source code may help to pinpoint the problem, but that’s after the problem has been noticed.)

There was a time before computing power was widely available, when model calculations were evaluated manually. How on Earth did science cope back then, when there was no software to release?

The Mad Monk’s modelling mockery

Tony Abbott has tried his hand at modelling the economic costs of carbon emissions reduction. The results are a little disturbing. Unless Abbott was being deliberately, deceptively simplistic in order to appeal to the burn-the-elitists demographic of Australian society, he truly doesn’t have a clue what he’s talking about:

He says given a 5 per cent reduction in greenhouse gas emissions will cost Australian taxpayers $120 billion, the cost of the emissions trading scheme’s 10-year aim of a 25 per cent reduction will be much greater.

“The Federal Government has never released the modelling,” Mr Abbott said.

“Now if there is modelling that shows the costs of a 15 per cent and a 25 per cent emissions reduction, let’s see the modelling, let’s release the figures.

“I think it’s reasonable to assume in the absence of other plausible evidence that five times that reduction, a 25 per cent reduction in emissions, might cost five times the price – half a trillion dollars, 50 per cent of Australia’s annual GDP.”

I’m no economist, but I suspect the experts might shy away from confidently predicting that 5 times the reduction implies 5 times the cost. We’re talking about billions of dollars flowing through all the intricate structures that make up the economy. There are feedback mechanisms, economies of scale, and the little fact that a “5%” reduction in CO2 is measured against 2000 levels while the cost is incurred against 2020 emissions (because that’s when the cuts actually happen). Even a “0%” change from 2000 levels represents a substantial cut in what our 2020 CO2 emissions would otherwise have been, yet according to Abbott’s model this scenario would cost nothing.
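To see the baseline problem in isolation, before any real economics even enters into it, here’s a toy calculation. The 2020 business-as-usual figure is invented purely for illustration; the point is only that targets measured against 2000 levels translate into much larger cuts against 2020 levels:

```python
# Toy illustration (hypothetical numbers) of why pricing emissions cuts at a constant
# dollar amount per percentage point below 2000 levels doesn't make sense.
emissions_2000 = 100.0        # index year-2000 emissions at 100 (arbitrary units)
emissions_2020_bau = 130.0    # hypothetical business-as-usual emissions in 2020

for target_pct in (0, 5, 15, 25):                     # reduction targets relative to 2000 levels
    target_level = emissions_2000 * (1 - target_pct / 100)
    required_cut = emissions_2020_bau - target_level  # abatement actually needed in 2020
    print(f"{target_pct:>2}% below 2000 levels -> cut of {required_cut:.0f} units from 2020 BAU")

# Even the "0%" target requires a 30-unit cut in this toy example, yet a model that
# simply multiplies the percentage by a constant dollar figure prices it at zero.
```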

Why even have economists if a constant factor is all it takes to convert a percentage CO2 reduction into a dollar amount? If Tony, our alternative Prime Minister, thinks it’s “reasonable to assume” such things, perhaps we can get him to try out this approach to economic modelling in a controlled environment where he can’t hurt anyone else. Say, in a padded cell with Monopoly money.

Software defect costs

In my pursuit of software engineering data, I’ve recently been poring over a 2002 report to the US Government on the annual costs of software defects. The report is entitled “The Economic Impacts of Inadequate Infrastructure for Software Testing”. Ultimately, it estimates that software defects cost the US economy $59.5 billion every year.

Modelling such economic impacts is an incredibly complex task, and I haven’t read most of the report’s 309 pages (because much of it isn’t immediately relevant to my work). However, since trying to use some of the report’s data for my own purposes, certain things have been bothering me.

For instance, the following (taken from the report):

[nist_table – table reproduced from the report]

This table summarises the consequences to users of software defects (where “users” are companies in the automotive and aerospace industries).

Strictly speaking, it shouldn’t even be a table. The right-most column serves no purpose, and what remains is a collection of disparate pieces of information. There is nothing inherently “tabular” about the data being presented. Admittedly, for someone skimming through the document, the data is much easier to spot in table form than as plain text.

The last number piqued my curiosity, and my frustration (since I need to use it). What kind of person considers a $4 million loss to be the result of a “minor” error? This seems to be well in excess of the cost of a “major” error. If we multiply it by the average number of minor errors for each company (70.2) we arrive at the ludicrous figure of $282 million. For minor errors. Per company. Each year.
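The multiplication, for reference, using the figures as quoted (the $4 million is rounded here, which is presumably why this lands slightly under the $282 million above):

```python
# Sanity check of the "minor error" figures quoted above.
cost_per_minor_error = 4_000_000    # dollars -- rounded figure as quoted
minor_errors_per_company = 70.2     # average minor errors per company per year

print(f"${cost_per_minor_error * minor_errors_per_company:,.0f} per company per year")
# -> $280,800,000, i.e. roughly $281 million per company, every year, for "minor" errors.
```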

If the $4 million figure is really the total cost of minor errors – which would place it more within the bounds of plausibility – why does it say “Costs per bug”?

The report includes a similar table for the financial services sector. There, the cost per minor error is apparently a mere $3,292.90, less than a thousandth of that in the automotive and aerospace industries. However, there the cost of major errors is similarly much lower, and still fails to exceed the cost of minor errors. Apparently.

What’s more, the report seems to be very casual about its use of the words “bug” and “error”, and uses them interchangeably (as you can see in the above table). The term “bug” is roughly equivalent to “defect”. “Error” has a somewhat different meaning in software testing. Different definitions for these terms abound, but the report provides no definitions of its own (that I’ve found, anyway). This may be a moot point, because none of these terms accurately describe what the numbers are actually referring to – “failures”.

A failure is the event in which the software does something it isn’t supposed to do, or fails to do something it should. A defect, bug or fault is generally the underlying imperfection in the software that causes a failure. The distinction is important, because a single defect can result in an ongoing sequence of failures. The cost of a defect is the cost of all failures attributable to that defect, put together, as well as any costs associated with finding and removing it.
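To make the distinction concrete, here’s a minimal sketch – a hypothetical structure of my own, not anything taken from the report – of how a defect’s total cost relates to the failures it causes:

```python
from dataclasses import dataclass, field

@dataclass
class Failure:
    """One event in which the software misbehaves (or fails to do what it should)."""
    cost: float                       # cost to the user of this single event

@dataclass
class Defect:
    """The underlying imperfection in the software that gives rise to failures."""
    failures: list[Failure] = field(default_factory=list)
    find_and_fix_cost: float = 0.0    # cost of locating and removing the defect

    def total_cost(self) -> float:
        # The cost of a defect is the cost of all failures attributable to it,
        # plus the cost of finding and removing it.
        return sum(f.cost for f in self.failures) + self.find_and_fix_cost

# One defect can produce an ongoing sequence of failures:
bug = Defect(failures=[Failure(cost=500.0) for _ in range(12)], find_and_fix_cost=2_000.0)
print(bug.total_cost())   # 8000.0
```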

The casual use of the terms “bug” and “error” extends to the survey instrument – the questionnaire through which data was obtained – and this is where the real trouble lies. Here, potential respondents are asked about bugs, errors and failures with no suggestion of any difference in the meanings of those terms. It is not clear what interpretation a respondent would have taken. Failures are more visible than defects, but if you use a piece of buggy software for long enough, you will take note of the defects so that you can avoid them.

I’m not sure what effect this has on the final estimate given by the report, and I’m not suggesting that the $59.5 billion figure is substantially inaccurate. However, it worries me that such a comprehensive report on software testing is not more rigorous in its terminology and more careful in its data collection.

The American hypothesis

I have a hypothesis on politics – a somewhat unfortunate hypothesis given its implications. Roughly speaking, it’s this: the workability of democracy diminishes with large populations. I’m not talking about the logistics of holding elections, but about the ability of society to engage in meaningful debate.

My reasoning goes like this. Insofar as I can tell, in any given (relatively democratic) country, the media tends to focus predominantly on the national politics of that country. At the same time, there are of course a variety of political parties and interest groups seeking to alter public perception for their own ends. We can think of this in two parts:

  1. the effort expended on politically-charged adverts, campaigns, editorials, etc.; and
  2. the resulting effects on the public mindset.

Due to mass media (TV, radio and the Internet), a fixed amount of “effort” will probably yield the same result, independent of the population size. That is, the effectiveness of a single TV ad will not diminish simply because more people are viewing it.

However, countries with larger populations will naturally have a larger talent pool from which to draw people to promote particular causes. Thus, more effort will be expended on political advertising, campaigns, editorials, etc., and so the effect on the public mindset will be greater. (I also assume that the proportion of people employed to promote particular causes is independent of population size.)

Now, we might naïvely assume that all this political advertising “balances out”, since there’s always an array of competing interests. I say this is naïve, because all efforts to promote political causes have one thing in common – one thing that can’t easily be balanced out: deception. I’m not only talking about outright lies (though it does come to that with tedious regularity), but also errors of omission, logical fallacies, appeals to emotion and any other psychological tricks used to blunt your critical thinking. They’re not even necessarily deliberate.

Without wanting to generalise, there is certainly a subset of PR people, political strategists and so on who do seem to hold an “ends justify the means” view. These are the people who really feed the political machine, who take things out of context, invent strawmen, engage in character assassination, and generally pollute the political debate with outrageous propaganda. The larger the population, the more of these people there will be, and so the louder, better organised, more pervasive and more inventive the disinformation.

The effect of disinformation is to disconnect public perception from reality. At a sufficient level this would cripple democracy, because democracy relies on the people having at least some understanding of government policy and its consequences.

I can’t comment too much on India – the world’s largest democracy – because I honestly know very little about it.

I don’t claim much expertise on American politics either, but I suspect the US is suffering this affliction. To me, American politics now seems to languish in a state of heated anachronism. The political machine instantly suffocates any sign of meaningful debate with ignorant fear and rage. You’re still perfectly able to exercise your rights to free speech and free expression, but it’s not going to achieve anything. Meanwhile, in a desperate attempt to climb above the fray, the media sometimes treats political debate more like a sporting match than a tool of democracy. I’m sure there is an element of this in every democratic country, but in the US it seems to be boiling over.

It might pay to consider this if we intend to move towards a World Federation, as science fiction often proposes, and which appeals to me intuitively. Of course, a “One World Government” is the nightmare-fantasy shared by so many conspiracy theorists. However, the danger is not that the government will have too much control, but that even with our rights fully protected, democracy will nevertheless be pummelled to oblivion by global armies of political strategists and PR hacks.

Just a thought.

Asylum statistics

One of Amnesty International’s media releases reports on a survey of Australians’ knowledge and opinions on asylum seekers. However, the point of the media release is clearly to highlight some of the facts themselves, not just the extent to which people are aware of them. This seems reasonable, given that:

The opinion poll also showed that a large majority of Australians have major misconceptions regarding the percentage of asylum seekers who arrive in Australia by boat. On average, Australians believe that about 60 per cent of asylum seekers come to Australia by boat. More than a third of Australians believe that over 80 per cent of asylum seekers arrive by boat. In fact, only 3.4 per cent of people who sought asylum in Australia in 2008 arrived by boat – the other 96.6 per cent arrived by plane.

This is a fairly important statistic. However, the article is utterly devoid of citations, which, as a researcher, annoys the hell out of me. Amnesty is a kind of lobbying organisation. As such it has an interest in altering opinions, and so it shouldn’t always expect people to take it at face value.

The other thing that troubles me is the discussion of processing costs (it costs more to process asylum seekers on Christmas Island than on the mainland). Why would Amnesty even care about asylum seeker processing costs? It’s hardly an issue on which human rights hinge. I’d venture that it cares only because it’s another means of altering opinions. It certainly wouldn’t be reporting processing costs if they were less on Christmas Island.

(This reminded me of the nuclear power debate. Greenpeace has argued that nuclear power is unwise because the economics don’t stack up. This is actually quite dishonest, in my opinion, even if it’s entirely accurate. It’s hard to imagine that Greenpeace cares about the economic argument against nuclear power for its own sake. Coming from an authority on economics, such an argument may be taken seriously. The same argument coming from Greenpeace just looks like someone trying to push our buttons.)

In general I don’t wish to denigrate Amnesty. The lobbying it does is directed at a genuinely worthy cause, unlike that conducted by a large number of other lobbyists. However, worthy causes are almost always served by open discussion, and this includes the ability to verify the facts and statistics for oneself.

There is of course much discussion of the statistics in the media. For instance, Crikey has a list of statistics on asylum seekers with numerous but not terribly good references. I eventually managed to (more or less) confirm that only 179 out of 4750 asylum seekers arrived by boat in 2008. This report gives the 179 figure on page 4, while a media release on the Immigration Minister’s website mentions the 4750 figure. That comes out at roughly the same percentage (3.8%) as quoted by Amnesty.

The processing costs, I’m guessing, came from a 2007 report for Oxfam. The report states:

The latest figures given to a budget estimates hearing on 22 May 2006 suggest that it cost $1,830 per detainee per day to keep someone on Christmas Island compared to $238 per detainee per day at Villawood in Sydney.

So why am I interested in asylum seeker processing costs? I’m not; not directly, anyway. I consider it to be an argument that largely misses the point – mechanisms intended to discourage unauthorised boat arrivals incur a human cost, not just a financial one. However, from the financial cost I note that not even selfish motives would justify a hardline position on unauthorised boat arrivals. What, then, are the hardliners actually arguing about? If both altruism and self-interest suggest the same course of action, what kind of corrupt mode of thinking can possibly raise an objection?

It’s inexcusable that we should make asylum seekers the object of such irrational concern. By definition, these are people who possess the least political power of anyone in the world. However, as a direct result, their suffering also carries the least political risk; not that you’d know it from listening to some of the myopic reactionary logic floating around over the last few years.

It seems that ideology can thrive where beliefs are not merely simplistic or unsupported, but where they are demonstrably false.

The doomsday argument

This has recently been the source of much frustration for some of my friends, as I’ve attempted to casually plow through a probabilistic argument that most people would instinctively recoil at. So, I thought, it might work better when written down. Of course, plenty of others have also written it down, including Brandon Carter – its originator – and Stephen Baxter – a science fiction author (who referred to it as the “Carter Catastrophe” in his novel Time).

The main premise of the argument is the Copernican principle. Copernicus, of course, heretically suggested that the Earth was not the centre of the universe. Thus, the Copernican principle is the idea that the circumstances of our existence are not special in any way (except insofar as they need to be special for us to exist in the first place).

We are now quite comfortable with the Copernican principle applied to space, but the doomsday argument applies it to time. Just as we do not live in any particularly special location, so we do not live at any particularly special moment. This is distorted by the fact that the human population has exploded in the last century, to the point where about 10% of all the humans ever to have lived (over the course of Homo sapiens’ ~200,000-year history) are still alive today. We can deal with this distortion by (conceptually) assigning a number to each human, in chronological order of birth, from 1 to N (where N is the total number of humans that have lived, are currently alive, or will ever be born in the future). We can then say, instead, that we are equally likely to have been assigned any number in that range.

In probability theory, this is equivalent to saying that you have been randomly selected from a uniform distribution. Yes, it must be you (the observer) and not someone else, because from your point of view you’re the only person who has a number selected from the entire range – past, present and future. You could have been assigned a number at any point along the human timeline (by the Copernican principle), but you still cannot observe the future, and so by selecting any other specific individual you’d automatically be restricting the range to the past and present. The number you’ve actually been assigned is something on the order of 60 billion (if we estimate that to be the total number of humans to have ever lived so far).

So where does that leave us? Well, in a uniform distribution, any randomly selected value is 95% likely to fall in the final 95% of the range. (If your number n is drawn uniformly from 1 to N, then with 95% probability n > N/20, which rearranges to N < 20n.) If your randomly selected number is 60 billion, then it’s 95% likely that the total number of humans ever to live will be less than 60 billion × 20 = 1.2 trillion. Similarly, it’s 80% likely that the total will be less than 60 billion × 5 = 300 billion, and 50% likely that it will be less than 60 billion × 2 = 120 billion. Now, events with 50%, 20% or even 5% probability do crop up, but we must draw the line somewhere, because we cannot demand absolute certainty (or else science would be impossible).
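For the sceptical, here’s a quick simulation of that 95% bound – not part of the argument itself, just a numerical check that the uniform-selection reasoning does what it claims:

```python
import random

# Monte Carlo check: if your birth rank n is uniformly distributed on 1..N,
# the claim "N <= 20 * n" should hold in about 95% of cases.
trials = 100_000
true_total = 1_000_000     # an arbitrary "true" total number of humans, N

hits = sum(
    1 for _ in range(trials)
    if true_total <= 20 * random.randint(1, true_total)   # your randomly assigned rank
)
print(f"Bound held in {100 * hits / trials:.1f}% of trials")   # ~95%
```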

This should make us think. The doomsday argument doesn’t give an exact number, nor does it directly give us a time, but this can be estimated from trends in population growth. However, the prospect of a scenario in which humanity spreads out beyond the solar system and colonises the galaxy, to produce a population of countless trillions over tens of thousands or even millions of years, would seem vanishingly unlikely under this logic. Even the prospect that humanity will survive at roughly its current population on Earth for more than a few thousand years seems remote.

It’s also worth pointing out, as others have, that the doomsday argument is entirely independent of the mechanism by which humanity’s downfall might occur. That is, if you accept the argument, then there is nothing we can do to stop it.

Needless to say, the objections to this reasoning come thick and fast, especially if you bumble like I have through a hasty verbal explanation (hopefully I’ve been more accurate and articulate in this blog post). One should bear in mind that this isn’t simply some apocalyptic pronouncement from a random, unstable individual (it wasn’t my idea). It is work that was published some time ago by three physicists independently (Brandon Carter, J. Richard Gott and Holger Bech Nielsen) in peer-reviewed journals. That’s not to say it’s without fault, but given the level of scrutiny already applied, one might at least pause before dismissing it out of hand.

The objections I’ve heard (so far) to the doomsday argument usually fall along the following lines:

  1. Often they discard the notion that the observer is randomly selected, thus reaching a different (and trivial) conclusion. One can point out that there always has to be a human #1, and a human #2, and so on, and that this says nothing about the numbers that come after. However, in pointing this out, one is not randomly selecting those numbers, and random selection is the premise of the argument.
  2. They object that a sample size of one is useless. Indeed, in the normal course of scientific endeavour, a sample size of one is useless, but that’s just because in a practical setting we’re trying to achieve precision. If we’re just trying to make use of what we know, one sample is infinitely more useful than no samples at all. The doomsday argument does not at any point assume that its single randomly-selected number represents anything more than a single randomly-selected number. If we had more than one random sample, we’d be able to make a stronger case, but that does not imply there’s currently no case at all.
  3. Sometimes they object on the grounds of causality – that we simply can’t know the future. I think this is just a manifestation of personal incredulity. There is no physical law that says we cannot know the future, and here we’re not talking about some divine revelation or prophecy. We’re only talking about broad probabilistic statements about the future, and we make these all the time (meteorology, climatology, city planning, resource management, risk analysis, software development, etc. ad infinitum).

However, I’m sure that won’t be the end of it.

Theoretical frameworks, part 3

The first and second instalments of this saga discussed the thinking and writing processes. However, I also need to fess up to reality and do some measuring.

A theoretical framework is not a theory. The point of a theoretical framework is to frame theories – to provide all the concepts and variables that a theory might then make predictions about. (If I were a physicist these might be things like light and mass). You can test whether a theory is right or wrong by comparing its predictions to reality. You can’t do that for theoretical frameworks, because there are no predictions, only concepts and variables. The best you can do is determine whether those concepts and variables are useful. This really means you have to demonstrate some sort of use.

And so it falls to me to prove that there’s a point to all my cogitations, and to do so I need data. In fact, I need quite complex data, and in deference to approaching deadlines and my somewhat fatigued brain, I need someone else’s quite complex data.

The truth is – I’m probably not going to get it; at least, not all of it. Ideally, I need data on:

  • the length of time programmers take to assimilate specific pieces of knowledge about a piece of software;
  • the specific types of knowledge required to assimilate other specific types of knowledge;
  • the probability that programmers will succeed in understanding something, including the probability that they find a defect;
  • the probability that a given software defect will be judged sufficiently important to correct;
  • the precise consequences, in terms of subsequent defect removal efforts, of leaving a defect uncorrected;
  • the cost to the end user of a given software defect;
  • the propensity of programmers to find higher-cost defects; and
  • the total number of defects present in a piece of software in the first place.

I also need each of these broken down according to some classification scheme for knowledge and software defects. I also need not just ranges of values but entire probability distributions. Such is the pain of a theoretical framework that attempts to connect rudimentary cognitive psychology to economics via software engineering.
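To give a feel for the shape of the thing, here’s a hypothetical sketch of the kind of record I’d need for each combination of knowledge type and defect type (the field names and categories are purely illustrative, not an actual classification scheme):

```python
from dataclasses import dataclass

@dataclass
class KnowledgeDefectRecord:
    """Hypothetical record: the data I'd need for one knowledge type / defect type pairing."""
    knowledge_type: str                # e.g. "control flow", "domain concept" (illustrative)
    defect_type: str                   # e.g. "interface", "algorithmic" (illustrative)
    assimilation_times: list[float]    # observed times to assimilate this kind of knowledge
    prerequisite_knowledge: list[str]  # knowledge types needed before this one can be assimilated
    p_understand: float                # probability of successfully understanding the relevant code
    p_detect: float                    # probability of finding a defect of this type
    p_worth_fixing: float              # probability a found defect is judged worth correcting
    user_costs: list[float]            # observed costs to the end user of defects left in place
```

Strictly, the single-probability fields want to be whole distributions (or at least sample lists) too, which only makes the data harder to come by.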

With luck, I may be able to stitch together enough different sources of data to create a usable data set. I hope to demonstrate usefulness by using this data to make recommendations about how best to find defects in software.

When statistics attack

I swear stats is trying to kill me. I’ve redesigned my experiment so that it’s a nice elegant “two-factor repeated measures” flavour. I won’t trouble you with exactly what that means, or exactly what the nine separate hypotheses I’m testing are. What I will trouble you with, for it’s certainly been troubling me, is this:

To analyse the data I will collect, I need to use a stats test, which broadly speaking is a factory that converts numbers into truth (or lies if you’re not careful).

Jim, Mr Stats, has a stats handbook that tells you how to do this. It has a nifty little flowchart at the beginning that you can trace through to work out which of the several dozen different kinds of stats tests you need to use. Easy enough, I think to myself as my fingers follow the little arrows across the page. And where do I end up? At a little box that states helpfully: “It may be possible to devise an ad hoc statistical test for the design under consideration.”

That’s right – with my new, improved, elegant design, the Oracle of Statistics reckons it may be possible, with not so much as a hint as to how one might actually go about it.

Not to be defeated, however, I turn to the Oracle of Everything – Google – through which I stumble upon something called Factorial Logistic Regression. I certainly won’t trouble you with what this means, not because I don’t want to but because I currently have no idea myself. Neither do my two supervisors – one of whom is Jim Himself.
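For the curious (and for future me), here’s roughly what a factorial logistic regression looks like in practice. This is a minimal sketch on made-up data using statsmodels – two factors, their interaction, and a binary outcome – not my actual experiment, and it ignores the repeated-measures wrinkle entirely:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: two categorical factors and a binary outcome
# (say, whether a participant found a particular defect).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "factor_a": rng.choice(["a1", "a2"], size=n),
    "factor_b": rng.choice(["b1", "b2", "b3"], size=n),
})
# Make the outcome depend on both factors, so there's something for the model to find.
logit_p = -0.5 + 1.0 * (df["factor_a"] == "a2") + 0.7 * (df["factor_b"] == "b3")
df["found"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# Factorial logistic regression: main effects for each factor plus their interaction.
model = smf.logit("found ~ C(factor_a) * C(factor_b)", data=df).fit()
print(model.summary())
```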

My only hope appears to lie in a library book entitled Regression Modeling Strategies. So the campaign continues…

How does this experiment work?

Statistics. It all seems so easy until you have to do it.

No worries Dave, I confidently assured myself as I fitted the last details of my delicate experimental design into place, all set to be unleashed on as many undergraduates as I had chocolate to bribe. Now all I have to do is plug in the stats terminology and I’ll… oh shit.

As a research student, you know you’re onto a “winning” idea when you have to write a Python script just to work out what factors your experiment is testing. Somewhat like realising, after you’ve found the ultimate answer to life, the universe and everything, that you didn’t know what the question was.