Curing viral misinformation

A great deal of mischief is caused, regularly, by viral misinformation. Factoids that support one side of any controversial issue are rapidly copied and pasted many times over (the “echo chamber”). By the time anyone manages to marshal the truth into a coherent response, it’s too late — the lie has convinced enough people for it to become self-reinforcing. Everyone can probably name some examples of this, particularly in day-to-day national politics.

I can’t help but quote Churchill:

A lie gets halfway around the world before the truth has a chance to get its pants on.

(Given the Internet, this actually seems rather conservative.)

For me, the frenzied reaction in 2009 to the hacked CRU emails springs to mind. All manner of nefarious interpretations were placed on isolated snippets of private correspondence of climate scientists, before anyone in a position to understand the emails’ context (or at least the lack thereof) could conduct an honest evaluation. And in cases like this, the lies are often more complete than the truth, and certainly more interesting.

I don’t have an exact model of how this process unfolds. However, I suspect that, if we sat down and analysed a sample of propagated misinformation, we’d find that important parts of the original wording have largely been preserved, with very little paraphrasing. Misinformation only manages to propagate so fast because higher cognitive levels1 are (probably) never reached in the initial hours of propagation. This means that the propagation of misinformation is largely a mechanical process (not a creative one), which places it within the reach of automated or semi-automated analysis.

To come to the point, we can and should devise a tool to automatically detect this misinformation, and build it into the web browser — a browser extension. It should highlight and annotate misinformation in any web page the user views, based on a regularly-updated database. There are a few sites already dedicated to correcting misinformation (Snopes, Skeptical Science, etc.), and they are certainly invaluable, but a greater prize is to have misinformation annotated without any immediate human effort at all.

I’ve been toying with this idea for over a year, considering how to engineer communication between the browser extension and the database, how to provide flexibility in searching for different types of misinformation, while avoiding software security vulnerabilities, etc. (I should probably have written a prototype by now, but paid work took priority.)

It turns out — unsurprisingly — that others have considered some of these issues as well. The existing research tool Dispute Finder is very similar to what I’d envisaged. (It was well reported back in 2009, but clearly escaped my attention at the time). However, that project has apparently ended, and its principal investigator Rob Ennals has moved on. The Firefox browser extension has been removed, so I haven’t seen it in action, and presumably the database is no longer available either. The project did get as far as conducting user evaluations of the software. Perhaps Dispute Finder was only intended to have a fixed lifetime, or perhaps the authors decided that the project was not sufficiently successful.

Skeptical Science has its own Firefox browser extension, but this is climate-change-specific, and so is most likely to be used by those who consciously and actively accept the reality of climate change. That’s not to say it isn’t useful, but its effects on public discourse are probably indirect.

A generic “lie detector” tool might have a disproportionately greater impact on public discourse compared to a domain-specific tool. The generic tool would cover a much greater array of misinformation, and as a result would probably also gain wider acceptance. For instance, at least some of those who don’t particularly care about or believe climate science may nonetheless choose to use the generic tool for its treatment of other issues. (Hard core denialists of any stripe may complain about the “anomalous” treatment of their pet topics. Such complaints might be a blessing in disguise, actually boosting awareness.)

In fact, there are really two pieces of software here: the browser extension itself and the database. Given an appropriate means of communication, they could be developed quite independently.

The source code for Dispute Finder (previously “Think Link”) seems to be available here. I still intend to write my own independently, because I have different views on the technical architecture, which I may elucidate in future. The research findings of the Dispute Finder / “Confrontational Computing” project are certainly worth pondering, though. It would be a waste to ignore the experience gained, and it seems too good an idea to give up on.

  1. Bloom’s taxonomy breaks cognition into distinct levels: knowledge, comprehension, application, analysis, synthesis and evaluation. The “knowledge” level is pure rote learning, while “evaluation” represents critical thinking. []