Admit me to the conspiracy

Deltoid takes a look at a piece of code taken from the Climate Research Unit (CRU) that apparently has the denialists salivating. Buried therein is the following comment: “Apply a VERY ARTIFICAL [sic] correction for decline!!” Are you convinced yet of the global leftist socialist global warming alarmist conspiracy?! I certainly am.

I’d also like to apply for membership. You see, trawling through my own code for handling experimental data (from September 2008), I’ve re-discovered my own comment: “Artificially extends a data set by a given amount”. Indeed, I appear to have written two entire functions to concoct artificial data*, clearly in nefarious support of the communist agenda. I therefore submit myself as a candidate for the conspiracy. The PhD is only a ruse, after all. Being a member of the Conspiracy is the only qualification that really counts in academia.

* I’m not making this up – I really do have such functions. However, lest you become concerned about the quality of my research, the artificial data was merely used to test the behaviour of the rest of my code. It was certainly not used to generate actual results. I can sympathise with the researcher(s) who leave such untidy snippets of code lying around – and I’m a software engineer, who should know better!

Old computers

The Linux boot up message of the moment:

/ has gone 49710 days without being checked, check forced.

This would place the manufacturing date of the computer in question at around 1872 or earlier – a century before the UNIX epoch (the official Dawn of Time for UNIX-based computers), and at least 86 years before the invention of the microchip.
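
For what it’s worth, my guess – and it is only a guess – is that the real culprit is a 32-bit wraparound rather than Victorian-era hardware: 49710 days is almost exactly 2^32 seconds. A quick sanity check, in Java for no particular reason:

// Checking the wraparound theory (an assumption on my part; the boot
// message itself says nothing about how the figure was computed).
public class FsckDays {
    public static void main(String[] args) {
        long seconds = 49710L * 86400L;     // the reported days, in seconds
        long wrap = 1L << 32;               // 2^32
        System.out.println(seconds);        // 4294944000
        System.out.println(wrap);           // 4294967296
        System.out.println(wrap - seconds); // 23296, i.e. about a quarter of a day short
    }
}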

Marvelous spam

Though it normally goes unseen by the small number of real live humans who visit this blog, I get a steady stream of comment spam. None of it makes it through, most being caught by a set of fairly simple filter rules and the rest being swiftly cut down by a heartless moderator (i.e. me).

Recent comment spammers have become a little smarter. Instead of blasting away with meaningless comments containing dozens of links to automatically-generated URLs, it seems they’re now trying a sort of social engineering. They post flattering comments with no embedded links at all, possibly hoping that if a single comment gets through from a given email address, they’ll have free rein to post anything from that same address (taking advantage of an option in WordPress).

This is fairly easy to spot, because such comments are still completely generic and contain no hint that the commenter has understood anything I’ve written. In particular, in response to my irreverent post on the H1N1 flu, one “person” had the following contribution to make:

Hi, Congratulations to the site owner for this marvelous work you’ve done. It has lots of useful and interesting data.

Well, it was a truly important piece of research, after all, so I’m happy to take the credit for it.

Artificial intelligence

A thought occurs, spurred on by my use of Bayesian networks. They’re used in AI (so I’m led to believe), though I’m using them to model the comprehension process in humans. However, I do also work in a building filled with other people applying AI techniques.

My question is this: how long until Sarah Connor arrives and blows up level 4? And if she doesn’t, does that mean that the machines have already won? Or does it simply mean that we’re all horrible failures and that nothing will ever come of AI?

A good friend (you know who you are) is working with and discovering things about ferrofluids. In my naivety, I now find myself wondering if you could incorporate some kind of neural structure into it, and get it to reform itself at will…

Horrible Java

Apologies to non-geeks. The following Java code determines whether infinity is even or odd. It compiles, runs, finishes immediately, and outputs “false” (meaning that infinity is odd).

class Infinity \u007b static \u0062\u006f\u006f\u006c\u0065\u0061\u006e\u0020\u0065\u0076\u0065\u006e\u003b static
{
    // Configure infinite speed
    System.nanoTime(\u002f\u002a);
    boolean even = true;
    double i = 0.0;
    while(i <= infinity)
    {
        even = !even;
        i += 1.0;
    }
    System.normalTime(\u002a\u002f);
    System.out.println("Infinity is even: " + even);
    System.exit(0);\u007d
}

Yes, it's all smoke and mirrors, but I've been having fun with it.
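
For anyone who wants the trick spelt out: the compiler decodes \uXXXX escapes before it does any parsing, so the /* and */ hidden inside those escapes comment out the entire loop. Roughly speaking (this is my own de-obfuscated sketch, not a second program), what gets compiled is this:

// What the compiler effectively sees once the Unicode escapes are decoded:
class Infinity { static boolean even; static
{
    // Configure infinite speed
    System.nanoTime(/* the entire "infinite" loop lives inside this comment */);
    System.out.println("Infinity is even: " + even);  // 'even' is never touched, so it stays false
    System.exit(0);}  // bail out before the JVM complains about the missing main() method
}

The whole thing hinges on the JVM running the static initialiser when the class is loaded, and on System.exit(0) getting in before the complaint about the missing main() method.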

The Bayesian rabbit hole

You may recall previous rants about my theoretical framework. The recent evolution of my thought processes (much like all other times) has been something like this: hurrah, done… except… [ponder]… I should see if I can fix this little problem… [ponder]… How the hell is this supposed to work?… [ponder]… Damn, the library doesn’t have any books on that… [ponder]… Gah, I’ll never finish this.

This all concerns the enormous equation slowly materialising in Chapter 7 of my thesis – the one that calculates the “cost effectiveness” of a software inspection. It used to be finished. I distinctly recall finishing it several times, in fact.

The equation was always long, but it used to contain relatively simple concepts like no. defects detected × average defect cost. Then I decided, in a state of mild insanity, that it would be much better if I had matrix multiplication in there. Then I decided that this wasn’t good enough either, and that what I really needed were some good solid Bayesian networks (often discussed in the context of artificial intelligence). I only just talked myself down from using continuous-time Bayesian networks, because – though I like learning about these things – at some point I’d like to finish my thesis and have a life.

(Put simply, Bayesian networks are a great way of working out probabilities when there are complex causal relationships, and you have limited knowledge. They also allow you to insert pretty diagrams into an otherwise swampy expanse of hard maths.)
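
To make that slightly less abstract, here’s a toy two-node example in the same spirit – though I should stress that the variable names and numbers below are invented purely for illustration, and bear no resemblance to the actual model in Chapter 7:

// A toy Bayesian network with one arc: DefectPresent -> DefectDetected.
// All probabilities are made up for the sake of the example.
public class ToyInspection {
    public static void main(String[] args) {
        double pDefect = 0.3;                // P(defect present)
        double pDetectGivenDefect = 0.7;     // P(detected | defect present)
        double pDetectGivenNoDefect = 0.05;  // P(detected | no defect), i.e. a false alarm

        // Marginalise over the unknown cause to get P(detected):
        double pDetect = pDefect * pDetectGivenDefect
                       + (1 - pDefect) * pDetectGivenNoDefect;

        // Bayes' rule: given that something was flagged, how likely is a real defect?
        double pDefectGivenDetect = pDefect * pDetectGivenDefect / pDetect;

        System.out.printf("P(detected) = %.3f%n", pDetect);                     // 0.245
        System.out.printf("P(defect | detected) = %.3f%n", pDefectGivenDetect); // 0.857
    }
}

Scale that up to dozens of interrelated variables and you have some idea of what the “cost effectiveness” equation is turning into.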

On the up side, I’ve learnt what 2^S means, where S is a set, and that there’s such a thing as product integration (as opposed to the normal area-under-the-curve “summation” integration). It’s all happening here.
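
For the record – and this is just me restating the standard definitions, not anything novel – 2^S denotes the power set of S (the set of all its subsets), and the “geometric” flavour of product integration replaces a limit of sums with a limit of products:

% The type II (geometric) product integral, in LaTeX notation:
\prod_a^b f(x)^{\mathrm{d}x}
    = \lim_{\Delta x_i \to 0} \prod_i f(x_i)^{\Delta x_i}
    = \exp\!\left( \int_a^b \ln f(x)\, \mathrm{d}x \right)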

Temporal malfunction

You may have noticed that Dave’s Archives, in a fit of celebratory humour, temporarily reverted to a two-week-old version on Good Friday. So did my inbox, and for a while I thought I’d lost all my emails and blog posts since March 26.

Email isn’t a problem, because I have a convoluted forwarding scheme where, by design, I end up with three copies of all email sent to me. The blog shouldn’t have been a problem either, because it’s automatically backed up via email. I noted with some humility, however, that these particular emails were only being sent to one of my IMAP accounts – the account that had just lost the last two weeks of email.

It was eventually fixed by a poor Jumba customer rep working Good Friday.

I did find a backup, though, after rummaging around in my laptop’s IMAP cache while not connected to the Internet, but it was a few days old and missing the most recent post (before this one). I’ve since ratcheted up my backup scheme a notch.

The doomsday argument

This has recently been the source of much frustration for some of my friends, as I’ve attempted to casually plough through a probabilistic argument that most people would instinctively recoil at. So, I thought, it might work better when written down. Of course, plenty of others have also written it down, including its originator Brandon Carter, and the science fiction author Stephen Baxter (who referred to it as the “Carter Catastrophe” in his novel Time).

The main premise of the argument is the Copernican principle. Copernicus, of course, heretically suggested that the Earth was not the centre of the universe. Thus, the Copernican principle is the idea that the circumstances of our existence are not special in any way (except insofar as they need to be special for us to exist in the first place).

We are now quite comfortable with the Copernican principle applied to space, but the doomsday argument applies it to time. Just as we do not live in any particularly special location, so we do not live at any particularly special moment. Applying this directly to dates is complicated by the fact that the human population has exploded in the last century, to the point where about 10% of all the humans ever to have lived (over the course of Homo sapiens’ ~200,000-year history) are still alive today. We can deal with this distortion by (conceptually) assigning a number to each human, in chronological order of birth, from 1 to N (where N is the total number of humans who have lived, are currently alive, or will ever be born in the future). We can then say, instead, that we are equally likely to have been assigned any number in that range.

In probability theory, this is equivalent to saying that you have been randomly selected from a uniform distribution. Yes, it must be you (the observer) and not someone else, because from your point of view you’re the only person who has a number selected from the entire range – past, present and future. You could have been assigned a number at any point along the human timeline (by the Copernican principle), but you still cannot observe the future, and so by selecting any other specific individual you’d automatically be restricting the range to the past and present. The number you’ve actually been assigned is something on the order of 60 billion (if we estimate that to be the total number of humans to have ever lived so far).

So where does that leave us? Well, in a uniform distribution, any randomly selected value is 95% likely to be in the final 95% of the range. If your randomly selected number is 60 billion, then it’s 95% likely that the total number of humans ever to live will be less than 60 billion × 20 = 1.2 trillion. Similarly, it’s 80% likely that the total will be less than 60 billion × 5 = 300 billion, and 50% likely that it will be less than 60 billion × 2 = 120 billion. Now, events with 50%, 20% and even 5% probabilities do happen, but we must draw the line somewhere, because you cannot demand absolute certainty (or else science would be impossible).
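
For anyone who wants the arithmetic behind those figures spelt out – this is just a restatement of the paragraph above, not an extra assumption – with your birth rank n drawn uniformly from 1 to N:

% Being in the final 95% of the range means n >= 0.05 N, so:
P(n \ge 0.05\,N) = 0.95
    \;\Longrightarrow\; P\!\left(N \le \tfrac{n}{0.05}\right) = P(N \le 20\,n) = 0.95
% and with n \approx 6 \times 10^{10}, that gives N \le 1.2 \times 10^{12} at the 95% level.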

This should make us think. The doomsday argument doesn’t give an exact number, nor does it directly give us a date – though one can be estimated from trends in population growth. However, the prospect of a scenario in which humanity spreads out beyond the solar system and colonises the galaxy, to produce a population of countless trillions over tens of thousands or even millions of years, would seem vanishingly unlikely under this logic. Even the prospect that humanity will survive at roughly its current population on Earth for more than a few thousand years seems remote.

It’s also worth pointing out, as others have, that the doomsday argument is entirely independent of the mechanism by which humanity’s downfall might occur. That is, if you accept the argument, then there is nothing we can do to stop it.

Needless to say, the objections to this reasoning come thick and fast, especially if, like me, you bumble through a hasty verbal explanation (hopefully I’ve been more accurate and articulate in this blog post). One should bear in mind that this isn’t simply some apocalyptic pronouncement from a random, unstable individual (it wasn’t my idea). It’s work that was published some time ago, in peer-reviewed journals, independently by three physicists (Brandon Carter, J. Richard Gott and Holger Bech Nielsen). That’s not to say it’s without fault, but given the level of scrutiny already applied, one might at least pause before simply dismissing it out of hand.

The objections I’ve heard (so far) to the doomsday argument usually fall along the following lines:

  1. Often the objector discards the notion that the observer is randomly selected, and thus reaches a different (and trivial) conclusion. One can point out that there always has to be a human #1, and a human #2, and so on, and that this says nothing about the numbers that come after. However, in pointing this out, one is not randomly selecting those numbers, and random selection is the premise of the argument.
  2. Some object that a sample size of one is useless. Indeed, in the normal course of scientific endeavour, a sample size of one is useless, but that’s just because in a practical setting we’re trying to achieve precision. If we’re just trying to make use of what we know, one sample is infinitely more useful than no samples at all. The doomsday argument does not at any point assume that its single randomly-selected number represents anything more than a single randomly-selected number. If we had more than one random sample, we’d be able to make a stronger case, but that does not imply there’s currently no case at all (see the sketch after this list).
  3. Sometimes the objection is made on the grounds of causality – that we simply can’t know the future. I think this is just a manifestation of personal incredulity. There is no physical law that says we cannot know the future, and here we’re not talking about some divine revelation or prophecy. We’re only talking about broad probabilistic statements about the future, and we make these all the time (meteorology, climatology, city planning, resource management, risk analysis, software development, etc. ad infinitum).
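
On point 2, here’s a small simulation I knocked together for this post (an illustration of the statistics, not part of the published argument): even a single uniformly drawn sample produces a bound that holds 95% of the time.

import java.util.Random;

// Monte Carlo check that one uniform sample gives a valid 95% bound.
// The "true" total is arbitrary; what matters is how often N <= 20n holds.
public class DoomsdayBound {
    public static void main(String[] args) {
        Random rng = new Random();
        long trueTotal = 1000000L;  // pretend this is the (unknown) total number of humans
        int trials = 1000000;
        int boundHolds = 0;

        for (int t = 0; t < trials; t++) {
            // Your birth rank, drawn uniformly from 1..trueTotal:
            long rank = 1 + (long) (rng.nextDouble() * trueTotal);
            // The doomsday-style claim: with 95% confidence, total <= 20 * rank.
            if (trueTotal <= 20 * rank) {
                boundHolds++;
            }
        }

        System.out.printf("Bound held in %.1f%% of trials%n",
                          100.0 * boundHolds / trials);  // prints roughly 95.0%
    }
}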

However, I’m sure that won’t be the end of it.

Students

Here’s what diversity means to a university tutor.

Student A appears with a deer-in-the-headlights look at the door to the senior tutor room and asks (in a bewildering tone that sounds as if a layer of righteous outrage has been suppressed and petrified beneath another layer of sheer blinding terror) if there is going to be a tutorial now for the unit that I tutor. I stumble through an explanation of the weekly tutorial times – there are only two, and neither of them are now – and leave him with a look of deep suspicion and confusion. This is half-way through semester.

Student B appears at the door to the senior tutor room with a demeanour that could very well be those transfixing headlights. She doesn’t have a question – she’s just bored. She bounds over to see what I’m doing and recoils at the tutorial exercise I’m preparing to give in an hour. Nevertheless, I begin to explain it and within a minute she rips the paper out of my hand and sits down to undertake the exercise: disassembling a Java class file by hand. She isn’t even enrolled in the unit, and won’t be for another year.

Conroy and Bolt on filtering

The ABC’s Q&A programme spent about 30 minutes last night pondering Senator Conroy’s mandatory Internet filtering plan… well, idea, because it’s increasingly clear that “plan” is too strong a word. Conroy was, frankly, an embarrassment. To be fair, most of the questions put to him were not especially articulate, but Conroy still made a mockery of himself. What disturbs me is that he seems to be fully cognisant of the reality of public opposition, the technical barriers and even the dangers of encroaching on political freedoms, and yet at the same time he shows no inkling that any of it means anything. Sure, ACMA may have blacklisted a dentist’s website, among a number of other worrying examples, but somehow that’s perfectly alright and acceptable simply because Conroy is able to explain how it happened (something about the Russian mafia, apparently). Forgive me if the idea of a secret blacklist doesn’t fill me with confidence. If said blacklist hadn’t been leaked recently, such errors would never have come to light, and so there would have been no pressure to correct them.

Andrew Bolt’s remarks on the filter were mostly directed at the Internet libertarian strawman. The argument – not terribly innovative – lays down a few of the worst examples of criminal behaviour and suggests that you can’t allow free access to everything. Possibly true, and utterly beside the point. Mandatory Internet filtering is and should be opposed on the grounds that there just isn’t a workable mechanism, by which I mean one that is effective while being compatible with basic democratic principles. Ultimately, it doesn’t matter what your filtering criteria are. Computers aren’t smart enough, humans aren’t honest enough and the Internet is just too damn big.