The Bayesian rabbit hole

You may recall previous rants about my theoretical framework. The recent evolution of my thought processes (much like all other times) has been something like this: hurrah, done… except… [ponder]… I should see if I can fix this little problem… [ponder]… How the hell is this supposed to work?… [ponder]… Damn, the library doesn’t have any books on that… [ponder]… Gah, I’ll never finish this.

This all concerns the enormous equation slowly materialising in Chapter 7 of my thesis – the one that calculates the “cost effectiveness” of a software inspection. It used to be finished. I distinctly recall finishing it several times, in fact.

The equation was always long, but it used to contain relatively simple concepts like number of defects detected × average defect cost. Then I decided, in a state of mild insanity, that it would be much better if I had matrix multiplication in there. Then I decided that this wasn’t good enough either, and that what I really needed were some good solid Bayesian networks (often discussed in the context of artificial intelligence). I only just talked myself down from using continuous-time Bayesian networks, because – though I like learning about these things – at some point I’d like to finish my thesis and have a life.

(Put simply, Bayesian networks are a great way of working out probabilities when there are complex causal relationships, and you have limited knowledge. They also allow you to insert pretty diagrams into an otherwise swampy expanse of hard maths.)
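
To make that a bit more concrete, here is a minimal sketch in Python of about the smallest Bayesian network possible. Everything in it is invented for illustration and none of it comes from the thesis: two nodes (a defect exists, and an inspector flags it), with made-up probabilities. It works out how likely a defect is, given what the inspector reported, by enumerating the joint distribution.

```python
# Hypothetical two-node network: Defect -> Detected.
# All probabilities below are invented for illustration only.
P_DEFECT = 0.3                                # P(defect present)
P_DETECT_GIVEN = {True: 0.8, False: 0.05}     # P(flagged | defect present?)


def joint(defect: bool, detected: bool) -> float:
    """P(defect, detected), factorised along the single causal arrow."""
    p_defect = P_DEFECT if defect else 1.0 - P_DEFECT
    p_flag = P_DETECT_GIVEN[defect]
    p_detected = p_flag if detected else 1.0 - p_flag
    return p_defect * p_detected


def posterior_defect(detected: bool) -> float:
    """P(defect | detected), by summing the joint over the unknown node."""
    numerator = joint(True, detected)
    evidence = numerator + joint(False, detected)
    return numerator / evidence


if __name__ == "__main__":
    print(posterior_defect(True))    # roughly 0.87: a flag is strong evidence
    print(posterior_defect(False))   # roughly 0.08: silence is reassuring, not proof
```

Real networks have many more nodes and need cleverer inference than brute-force enumeration, but the idea is the same: multiply along the arrows, then sum out whatever you don't know.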

On the up side, I’ve learnt what 2^S means, where S is a set, and that there’s such a thing as product integration (as opposed to the normal area-under-the-curve “summation” integration). It’s all happening here.
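
For anyone wondering about those two, here is a rough sketch in Python. The power set of S is the set of all subsets of S, so a set with n elements has 2^n of them. For the product integral there are several definitions around; this sketch assumes the one that takes the limit of a product of (1 + f(x)·dx) terms, which for an ordinary scalar function comes out as exp of the usual integral. The function names and the step count are just illustrative choices.

```python
from itertools import chain, combinations
import math


def power_set(s):
    """All subsets of s, returned as a list of sets (2**len(s) of them)."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(subset) for subset in subsets]


def product_integral(f, a, b, n=100_000):
    """Approximate the product integral of f over [a, b].

    Assumes the definition as the limit of prod(1 + f(x_i) * dx); an
    ordinary integral would instead sum the f(x_i) * dx terms.
    """
    dx = (b - a) / n
    result = 1.0
    for i in range(n):
        result *= 1.0 + f(a + i * dx) * dx
    return result


if __name__ == "__main__":
    print(len(power_set({"a", "b", "c"})))            # 8 subsets, i.e. 2**3
    print(product_integral(lambda x: 1.0, 0.0, 1.0))  # close to math.e
```

With f(x) = 1 on [0, 1] the product is just (1 + 1/n)^n, which is the classic limit that gives e.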