Open source science

Slashdot notes an article from the Guardian: “If you’re going to do good science, release the computer code too”. The author, Darrel Ince, is a Professor of Computing at The Open University. You might recognise something of the mayhem that is the climate change debate in the title.

Both the public release of scientific software and the defect content thereof are worthwhile topics for discussion. Unfortunately, Ince seems to go for over-the-top rhetoric without having a great deal of evidence to support his position.

For instance, Ince cites an article by Professor Les Hatton (who I also cite, on account of his recent study on software inspection checklists). Hatton’s article here was on defects in scientific software. The unwary reader might get the impression that Hatton was specifically targeting recent climate modelling software, since that’s the theme of Ince’s article. However, Hatton discusses studies conducted between 1990 and 1994, in different scientific disciplines. The results might still be applicable, but it’s odd that Ince would choose to cite such an old article as his only source. There are much newer and more relevant papers; for instance:

S. M. Easterbrook and T. C. Johns (2009), Engineering the Software for Understanding Climate Change, Computing in Science and Engineering.

I stumbled across this article within ten minutes of searching. While Hatton takes a broad sample of software from across disciplines, Easterbrook and Johns delve into the processes employed specifically in the development of climate modelling software. Hatton reports defect densities of around 8 or 12 per KLOC (thousand lines of code), while Easterbrook and Johns suggest 0.03 defects per KLOC for the current version of the climate modelling software under analysis. Quite a difference – more than two orders of magnitude, for those counting.
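
For anyone who wants the arithmetic spelled out (taking Hatton’s lower figure and the Easterbrook and Johns figure at face value, and glossing over differences in how and when the defects were counted):

    8 defects/KLOC ÷ 0.03 defects/KLOC ≈ 267 ≈ 10^2.4

so the gap actually sits somewhere between two and three orders of magnitude.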

Based on Hatton’s findings of the defectiveness of scientific software, Ince says:

This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program.

This is a profoundly strange thing for a Professor of Computing to say. It’s certainly true that one single error can invalidate a computer program, but whether it usually does this is not so obvious. There is no theory to support this proclamation, nor any empirical study (at least, none cited). Non-scientific programs are littered with bugs, and yet they are not useless. Easterbrook and Johns report that many defects, before being fixed, had been “treated as acceptable model imperfections in previous releases”, clearly not the sort of defects that would invalidate the model. After all, models never correspond perfectly to empirical observations anyway, especially in such complex systems as climate.

Ince claims, as a running theme, that:

Many climate scientists have refused to publish their computer programs.

His only example of this is Mann, who by Ince’s own admission did eventually release his code. The climate modelling software examined by Easterbrook and Johns is available under licence to other researchers, and RealClimate lists several more publicly-available climate modelling programs. I am left wondering what Ince is actually complaining about.

Finally, Ince seems to have a rather brutal view of what constitutes acceptable scientific behaviour:

So, if you are publishing research articles that use computer programs, if you want to claim that you are engaging in science, the programs are in your possession and you will not release them then I would not regard you as a scientist; I would also regard any papers based on the software as null and void.

This is quite a militant position, and does not sound like a scientist speaking. If Ince himself is to be believed (in that published climate research is often based on unreleased code), then the reviewers who recommended those papers for publication clearly didn’t think as Ince does – that the code must be released.

Ince may be convinced that scientific software must be publicly-auditable. However, scientific validity ultimately derives from methodological rigour and the reproducibility of results, not from the availability of source code. The latter may be a good idea, but it is not necessary in order to ensure confidence in the science. Other independent researchers should be able to confirm or contradict your results without requiring your source code, because you should have explained all the important details in published papers. (In the event that your results are not reproducible due to a software defect, releasing the source code may help to pinpoint the problem, but that’s after the problem has been noticed.)

There was a time before computing power was widely available, when model calculations were evaluated manually. How on Earth did science cope back then, when there was no software to release?

Peer review

I’ve stumbled across yet another “ClimateGate” article (by way of James Delingpole), this one going right for the jugular of science: peer review. The author is journalist Patrick Courrielche, who I hadn’t come across until now.

Courrielche argues that peer review is kaput and is being replaced by what he calls “peer-to-peer review”, an idea that brings to mind community efforts like Wikipedia. This has apparently been catalysed by “ClimateGate”, an event portrayed by the denialist community as something akin to the Coming of the Messiah.

Courrielche asserts that peer review is an old system of control imposed by the “gatekeepers” of the “establishment”, while peer-to-peer review is a new system gifted to us by the “undermedia”. Courrielche has very little time for nuance in the construction of this moralistic dichotomy, and clearly very little idea why peer review exists in the first place.

It should be noted from the start (and many an academic will agree) that peer review is a flawed system. It’s well known that worthwhile papers are rejected from reputable journals from time to time, while the less reputable journals have the opposite problem. Nevertheless, there is a widely-recognised need for at least some form of review system to find any weaknesses in papers before publication. It seems obvious that the people best placed to review any given piece of work are those working in the same field. Peer review acts both as a filter and a means of providing feedback (a sort of last-minute collaborative effort). The reviewers are not some sort of closed secret society bent on stamping their authority on science, as Courrielche seems to imply. Anyone working in the field can be invited by one relevant journal or another to review a paper, and it’s in a journal’s best interests to select the best qualified reviewers.

Courrielche sticks the word “review” on the end of “peer-to-peer” so that it can appear to fulfil this function. The premise seems to be that hordes of laypeople are just as good at reviewing a given work as those who work in the relevant field, if not better. This is really just thinly-veiled anti-intellectualism. How can a layperson possibly know whether the author of a technical paper has used the appropriate statistical or methodological techniques, or considered previous empirical/theoretical results, or drawn appropriate conclusions?

That’s why papers are peer-reviewed. Reputable journals get their reputation from the high quality (i.e. usefulness and scientific rigour) of the work presented therein, as determined by experts in the field. Barring the very occasional lapse of judgment, the flat earth society, the intelligent design movement, the climate change denialists, and any number of other weird and wonderful parties are prevented from publishing their dogma in Science, Nature and other leading journals. There’s no rule forbidding such publication; that’s just what happens when you apply consistent standards in the pursuit of knowledge. Ideologues are frequently given an easy ride in politics, and it clearly offends them that science is not so forgiving.

However, Courrielche appears to be more interested in describing how the “undermedia” is up against some sort of vast government-sponsored conspiracy to hide the truth. His tone is one of rebellion, of exposing the information to the media, and doing battle with dark forces trying to prevent its disclosure. Even if such a paranoid fantasy were true, it has nothing to do with peer review. Peer review is not a means of quarantining information from the public, but simply a way of deciding the credibility of that information. In reality, the information is already out there, and in fact it’s always been out there (just not necessarily in the mass media). The problem is not the lack of information, but the prevalence of disinformation. We are all free to ignore the information vetted by the peer review system, but we don’t because it’s intrinsically more trustworthy than anything else we have.

Courrielche makes mention of the “connectedness” of the climate scientists, as if mere scientific collaboration is to be regarded with deep suspicion. Would he prefer that scientists work in isolation, without communicating? This is quite blatantly hypocritical, because his peer-to-peer review system is based on connectedness.

Well, sort of. I also suspect that most of the many and varied denialist memes floating around have not resulted from some sort of collective intelligence of the masses, but from a few undeserving individuals exalted as high priests by certain ideologically-driven journalists. There is nothing “peer-to-peer” about that at all.

From my point of view, what Courrielche describes as the “fierce scrutiny of the peer-to-peer network” is more like ignorant nitpicking and groupthink. There are no standards for rigour or even plausibility in many of the discussions that occur in the comments sections of blog sites. Free speech is often held sacrosanct, but free speech is not science.

The denialists are up against much more than a government conspiracy. They’re up against reality itself.

The colloquium

An “official communication” from early June demanded that all Engineering and Computing postgraduate students take part in the Curtin Engineering & Computing Research Colloquium. Those who didn’t might be placed on “conditional status”, the message warned.

A slightly rebellious instinct led me to think of ways to obey the letter but not the spirit of this new requirement. Particularly, the fact that previous colloquiums have been published online introduced some interesting possibilities:

  • a randomly-generated talk;
  • a discussion of some inventively embarrassing new kind of pseudo-science/quackery; or
  • the recitation of a poem.

In the end I yielded, and on the day (August 25) I gave a reasonably serious and possibly even somewhat comprehensible talk on a controlled experiment I’d conducted on defect detection in software inspections.

A while afterwards, I received in the mail a certificate of participation, certifying that I had indeed given the talk I had given. It felt a little awkward. Giving a 15 minute talk isn’t something I’d have thought deserving of a certificate. It might be useful for proving that I’ve done it, since it now appears to be a course requirement, but a simple note would have sufficed.

Interestingly, I later received another certificate, identical except that my thesis title had been substituted for the actual title of my talk. In essence, I now have a piece of paper, signed personally by the Dean of Engineering, certifying that I’ve given a talk that never happened.

Meta-engineering

I’m beginning to think I should have approached this maths modelling stuff from an engineering point of view: with a requirements document, version control and unit testing. Constructing a reasonably complicated mathematical model seems to have enough in common with software development that such things could be quite useful.

I’m calling this “meta-engineering”, because I’d be engineering the development of a model which itself describes (part of) the software engineering process.
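
To make the unit-testing idea concrete, here is a minimal sketch of what a test for one fragment of such a model might look like. The function below is entirely hypothetical (it is not a piece of my actual framework); it merely illustrates that a model’s building blocks can be pinned down with executable checks:

    import unittest

    def expected_defects_found(total_defects, detection_prob, inspectors):
        """Hypothetical model fragment: expected number of defects found when
        each of `inspectors` reviewers independently detects any given defect
        with probability `detection_prob`."""
        prob_missed_by_all = (1 - detection_prob) ** inspectors
        return total_defects * (1 - prob_missed_by_all)

    class TestExpectedDefectsFound(unittest.TestCase):
        def test_no_inspectors_find_nothing(self):
            self.assertEqual(expected_defects_found(10, 0.5, 0), 0)

        def test_certain_detection_finds_everything(self):
            self.assertEqual(expected_defects_found(10, 1.0, 3), 10)

        def test_two_independent_inspectors(self):
            # 1 - (1 - 0.5)^2 = 0.75, so 7.5 of 10 defects expected
            self.assertAlmostEqual(expected_defects_found(10, 0.5, 2), 7.5)

    if __name__ == "__main__":
        unittest.main()

The particular formula doesn’t matter; the point is that once the model lives in version control with a handful of tests like these, reworking its assumptions becomes far less nerve-wracking.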

The only problem is that formal maths notation can’t just be compiled and executed like source code, and source code is far too verbose (and lacking in variety of symbols) to give you a decent view of the maths.

Fortunately, Bayesian networks provide a kind of high-level design notation; perhaps the UML of probability analysis. Mine look like some sort of demented public transport system. However, drawing them in LaTeX using TikZ/PGF gives me a warm fuzzy feeling.
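
For anyone who wants to replicate the warm fuzzy feeling, here is a minimal, entirely toy example of drawing a three-node network with TikZ. The node names are invented purely for illustration and have nothing to do with my actual variables:

    \documentclass[tikz]{standalone}
    \usetikzlibrary{positioning}
    \begin{document}
    \begin{tikzpicture}[var/.style={draw, circle, minimum size=1cm}, >=stealth]
      % Toy network: A is a parent of both B and C
      \node[var] (A) {$A$};
      \node[var] (B) [below left=1cm and 1cm of A] {$B$};
      \node[var] (C) [below right=1cm and 1cm of A] {$C$};
      \draw[->] (A) -- (B);
      \draw[->] (A) -- (C);
    \end{tikzpicture}
    \end{document}

Scale that up to a few dozen nodes and you too can have your very own demented public transport system.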

What am I doing?

Over the past few weeks I’ve had numerous questions of the form: “how’s your work going?” I find I can only ever answer this with banalities like “good” or “meh”.

It’s not that I don’t know what I’m doing. At any given point in time, I have a list of minor challenges written up on the whiteboard (which accumulate with monotonous regularity). However, my first problem is that I never remember what these are when I’m not actually working on them. I write them down so that I don’t have to remember, of course.

My second problem is that, even if I did remember what I was supposed to be doing, there just isn’t any short explanation. Currently I have on the whiteboard such startling conversation pieces as “Express CI in terms of S and U”. This may or may not tickle your curiosity (depending on how much of a nerd you are), but explaining what it means – and granted, I’ll have to do that eventually anyway – demands as much mental energy as solving the problem itself.

My third problem is that I regularly shuffle around the meaning of the letters, to ensure I don’t run out of them and also to resolve inconsistencies. I’m currently using the entire English alphabet in my equations and a large proportion of the Greek one, so naming variables is a minor headache in itself. For instance, since I wrote the todo item “Express CI in terms of S and U”, I’ve decided to rename the variable “CI” to “CS”. Also, “S” used to be “T”, and “U” used to be two separate variables. This is mostly cosmetic, but I recoil at the prospect of explaining something so obviously in flux.

I choose to believe that I’ll be able to explain everything once I’ve written my thesis… and hopefully as I’m writing my thesis.

Science fail

Apparently one of the world’s foremost experts on global warming – as far as the denialist camp is concerned – is Viscount Monckton of Brenchley. The sum total of his qualifications appears to be his propensity to comment on the subject. A Google search turned up the Heartland Institute’s take on Monckton.

Observe the ad on the left of the page: “Why Does Gore Refuse To Debate His Critics? CLIMATE CHANGE IS NOT A CRISIS”. It looks like something straight out of a political campaign, which ought to be enough to toss it aside without further contemplation. But let’s contemplate for a second. The ad shows Al Gore’s face above four people who – we presume – are “his critics” (one of whom is our esteemed Viscount Monckton). How much tomfoolery can you squeeze into something so small?

  1. The one-versus-four theme makes Al Gore look like he’s on his own, which couldn’t be further from the truth.
  2. The ad conjures up images of public debates of the sort that have nothing to do with science. One does not resolve anything, least of all matters of scientific enquiry and public policy, by having proponents of each viewpoint stand up on a stage and hurl sound bites at each other.
  3. If anyone did need to be involved in a debate, it would be the hundreds of scientists who contribute to the IPCC’s reports, not Al Gore, who is after all just the messenger.

Science. We’ve heard of it.

Artificial intelligence

A thought occurs, spurred on by my use of Bayesian networks. They’re used in AI (so I’m led to believe), though I’m using them to model the comprehension process in humans. However, I do also work in a building filled with other people applying AI techniques.

My question is this: how long until Sarah Connor arrives and blows up level 4? And if she doesn’t, does that mean that the machines have already won? Or does it simply mean that we’re all horrible failures and that nothing will ever come of AI?

A good friend (you know who you are) is working with and discovering things about ferrofluids. In my naivety, I now find myself wondering if you could incorporate some kind of neural structure into it, and get it to reform itself at will…

Theoretical frameworks, part 3

The first and second instalments of this saga discussed the thinking and writing processes. However, I also need to fess up to reality and do some measuring.

A theoretical framework is not a theory. The point of a theoretical framework is to frame theories – to provide all the concepts and variables that a theory might then make predictions about. (If I were a physicist these might be things like light and mass). You can test whether a theory is right or wrong by comparing its predictions to reality. You can’t do that for theoretical frameworks, because there are no predictions, only concepts and variables. The best you can do is determine whether those concepts and variables are useful. This really means you have to demonstrate some sort of use.

And so it falls to me to prove that there’s a point to all my cogitations, and to do so I need data. In fact, I need quite complex data, and in deference to approaching deadlines and my somewhat fatigued brain, I need someone else’s quite complex data.

The truth is – I’m probably not going to get it; at least, not all of it. Ideally, I need data on:

  • the length of time programmers take to assimilate specific pieces of knowledge about a piece of software;
  • the specific types of knowledge required to assimilate other specific types of knowledge;
  • the probability that programmers will succeed in understanding something, including the probability that they find a defect;
  • the probability that a given software defect will be judged sufficiently important to correct;
  • the precise consequences, in terms of subsequent defect removal efforts, of leaving a defect uncorrected;
  • the cost to the end user of a given software defect;
  • the propensity of programmers to find higher-cost defects; and
  • the total number of defects present in a piece of software in the first place.

I also need each of these broken down according to some classification scheme for knowledge and software defects. I also need not just ranges of values but entire probability distributions. Such is the pain of a theoretical framework that attempts to connect rudimentary cognitive psychology to economics via software engineering.

With luck, I may be able to stitch together enough different sources of data to create a usable data set. I hope to demonstrate usefulness by using this data to make recommendations about how best to find defects in software.

Theoretical frameworks

One of the chapters of my much-delayed thesis describes (or rather will describe) a theoretical framework, which is academic-speak for “a way of understanding stuff” in a given field. In my case, stuff = software inspections, and my way of understanding them is a mixture of abstractions of abstractions of abstractions and some slightly crazy maths, just to give it that extra bit of abstractedness that seemed to be lacking.

It’s very easy when engaged in abstract theorising to forget what it is you’re actually modelling. All those boxes and lines look positively elegant on a whiteboard, but when you come to describe what the concepts represent and how someone would actually use it, things frequently go a bit pear-shaped. The problem, as far as I’ve been able to tell, is the limited short-term memory available for all this mental tinkering. What you need is to keep the concrete and the abstract in your head simultaneously, but this is easier said than done (especially if one’s head is full of concrete to begin with). When the abstract gets very abstract and there’s lots of it, the real-world stuff slips quietly out of your consciousness without telling you.

Sometimes it’s only a small thing that gets you. Sometimes you realise that it all mostly makes sense, if only this box was called something else. Then there are times when you finish your sketch with a dramatic flourish, try to find some way of describing the point of the whole thing, and shortly after sit back in an embarrassing silence.

My latest accomplishment, or perhaps crime against reason, is the introduction of integrals into my slightly crazy maths (already liberally strewn with capital sigmas). An integral, for the uninitiated, looks a bit like an S, but is rather pronounced “dear god, no”. You can think of it as the sum of an infinite number of infinitely small things, which of course is impossible. However, it does allow my theoretical framework to abstract… no, never mind.
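
For the curious, the respectable way of phrasing “the sum of an infinite number of infinitely small things” is as a limit of perfectly ordinary finite sums (for a suitably well-behaved function f):

    \[
      \int_a^b f(x)\,dx
        = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x,
      \qquad \Delta x = \frac{b-a}{n}, \quad x_i = a + i\,\Delta x
    \]

which is also why swapping some of my capital sigmas for integrals is less of a leap than it sounds.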

Ponderings of sanity

There are many things to be said about debating in online forums. One, that you learn early on, is that it doesn’t take much effort to find the fruitcakes. It really doesn’t. The people who firmly believe that the World Trade Centre was brought down by explosives, as evidenced by the “indisputable fact” that it “fell faster than gravity”, because just look at that YouTube video. The people who believe you’re going to hell not just because you don’t believe in God, but because you haven’t performed the 54-day version of the “Rosary Novena” (a type of prayer) and that TV shows made since the 1960s are so unforgivably immoral that they must be the work of Satan Himself. The people who equate taxation with slavery and socialism with atheism. The people who believe that oil is not derived from ancient organic matter but instead is simply “produced” by the Earth’s core. The people who proudly challenge you to disprove their three-paragraph thesis on why the entirety of science on evolution and cosmology is flat-wrong and the literal Biblical account is the only possible alternative.

One person I encountered had a pet theory on the nature of photons (particles of light): that each in fact comprises an electron and a positron in orbit around each other. Facts, such as the one where photons have no mass, unlike electrons and positrons, do not pose a hindrance to such theories, I’ve discovered. The idea, more generally, that experts in the field have been looking into this sort of thing for quite some time, publishing multitudes of peer-reviewed journal articles along the way, is of little concern.

Not that I’d wish to put you off online debating, but as you’re encountering these varied and interesting specimens, you’re bound to pick up a few insults, depending on what fascinating theory you’re being unreasonably sceptical of. As a change of pace from the usual names I get called – leftist, liberal, socialist, atheist (which at least is true), materialist or totalitarian – I’ve recently been called a “Bushbot”. This is an interesting and somewhat disturbing thought, considering some of the stuff that’s popped up in my George Bush “Out of Office Countdown” off-the-wall calendar.

Not even Bush, though, can match some of the wisdom of the Internet, which I’ve decided to share with you:

“In addition, the Earth is continually producing oil, because “Peak oil” was a carefully crafted myth. Oil does not come from dead dinosaurs as you skulls full of mush have been brainwashed to believe.”

“Scientists are usually the last to know about anything”

“A price chart is how I make my living….It represents truth.”

“A truth to point, all the Atheists I know have no children and it is always due to thier Atheistic mental state as compared to normal (spiritual) people. I know 7 Atheists; three couples. Sure many Atheists do produce children but certainly a large number possessing the Atheistic mind, refuse and will therefore generally NOT pass on either their genetic or social make up to the younger generations.”

“The constant social and technological progress resulting from the constant advancement of the metaphysical mind set means that we now have societies full of people, some of whom now can survive to adulthood with all alorts of personal shortcomings. This obviously includes Atheists.”

So now you know.