draft: true
title: A Primer on Probabilistic Thinking
date: 2024-02-29

(Never Tell Me The Odds)

Hollywood’s writers reflect our culture’s idea that this is what probabilistic thinking looks like: you’re ambiently Smart and Think Real Hard and then come up with probabilities with ludicrous numbers of significant digits.

If there is one thing accurate about this scene, and every scene where this trope is trotted out, it is the subsequent proof that the probability estimate was wrong when the long shot comes to pass anyhow.

In day-to-day life you do not generally get the sort of precise probabilities amenable to rigorous mathematical analysis, and rigorous mathematical analysis is its own hornet’s nest anyhow, as is any attempt to interpret the results of rigorous analysis. There are a lot of ways such analysis can inform your day-to-day probabilistic thinking, but here I’m going to try to lay out an intuitive approach to thinking probabilistically in real life, which accounts for having only rough estimates and incomplete information.

Categorizing Probabilities

The baseline representation of a probability in mathematics is a number between 0 and 1. This has a variety of very nice characteristics, such as the way you can obtain the probability of two independent events both occurring by multiplying their probabilities together in exactly the way you’ve learned to multiply every other number.
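As a minimal sketch of that multiplication rule, here is a made-up pair of independent events (a fair coin and a fair die; the function name and the numbers are illustrative assumptions, not anything from real data):

```python
from fractions import Fraction


def both_occur(p_a, p_b):
    """Probability that two *independent* events both happen."""
    return p_a * p_b


# Hypothetical example: a fair coin lands heads AND a fair die shows a six.
p_heads = Fraction(1, 2)
p_six = Fraction(1, 6)

p_both = both_occur(p_heads, p_six)
print(p_both)          # 1/12
print(float(p_both))   # the same value as a number between 0 and 1
```

The rule only holds when the events really are independent; if one event changes the chance of the other, plain multiplication gives the wrong answer.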

It’s mathematically useful, but for intuitive use, deeply flawed. The problem is that almost all the probabilities worth thinking about end up squished up on just the ends. A 1-in-2 chance of happening is 0.5. A 1-in-4 chance, 0.25. A 1-in-10 chance is 0.1, a 1-in-100 chance 0.01, and a 1-in-a-million chance 0.000001.

The corresponding opposites of those last few are 9-in-10 at 0.9, 99-in-100 at 0.99, and 999,999-in-a-million at 0.999999.

The difference between “out of ten” and “out of a million” is quite substantial and often the very thing we are pondering, yet the numbers representing them are all squished up at zero and one, and you’re left sitting there counting zeros or nines to determine if something is out of thousands or out of billions. The vast range of numbers from 0.1 to 0.9, fully 80% of the probability interval, represents a swing only from 1-in-10 to 9-in-10, whereas all the long shots and near-certainties get squeezed into ever-smaller bits right at the ends.

The natural mathematical solution to this is to take the logarithm of probabilities. This is used quite a lot in computers for various reasons, but is unpleasant for humans to work with because it maps the range from 0 to 1 onto the range from negative infinity to 0. The resulting mathematical operations are nice, where for instance combining independent probabilities is now a simple addition rather than multiplication (the primary utility of a logarithm), but who wants to intuitively think of probabilities on a scale from 0, “certain to happen”, to negative infinity, “certain to not happen”?
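To make the logarithm trick concrete, here is a rough sketch using base-10 logs and three invented probabilities (the list `probs` is purely illustrative). It shows that adding log-probabilities matches multiplying the raw ones, and that long shots which crowd together near zero get spread out on the log scale:

```python
import math

# Hypothetical probabilities of three independent events.
probs = [0.5, 0.1, 0.01]

# Ordinary scale: combine by multiplying.
product = math.prod(probs)

# Log scale: combine by adding, then exponentiate to get back.
log_sum = sum(math.log10(p) for p in probs)
recovered = 10 ** log_sum

print(product, recovered)  # equal up to floating-point rounding

# Long shots that all sit right next to zero land about one unit
# apart per factor of ten on the log scale.
for n in (10, 100, 1_000, 1_000_000):
    print(f"1-in-{n:,}: p = {1 / n}, log10(p) = {math.log10(1 / n):.1f}")
```

The price is exactly the one described above: every value is zero or negative, and “certain to not happen” sits at negative infinity.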

What I use intuitively is based on the idea of a logarithmic map, but it is not mathematically sound. However, the degree to which it is inaccurate is generally dwarfed by the fact that I’m doing a Fermi calculation anyhow: I don’t have precise probabilities, so it’s not like I’m working with 20 digits of precision.

Example: Are lottery probabilities accurate?

One place in real life where you do seem to have precise probabilities is in the case of lotteries. It is a middle-school math problem to compute the probability of winning the jackpot on some lottery ticket, and if you can’t do that, the number is probably printed somewhere on the ticket itself.
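For the curious, the middle-school computation might look something like this. It assumes the Powerball format of five white balls drawn from 69 (order not mattering) plus one red ball drawn from 26; those rule parameters are my assumption here, so check them against the current rules before relying on the result:

```python
import math

# Assumed Powerball format: 5 white balls from 69, plus 1 red ball from 26.
WHITE_BALLS = 69
WHITE_PICKS = 5
RED_BALLS = 26


def jackpot_combinations(white=WHITE_BALLS, picks=WHITE_PICKS, red=RED_BALLS):
    """Count the equally likely tickets, exactly one of which wins the jackpot."""
    # Ways to choose the white balls (order ignored) times choices of red ball.
    return math.comb(white, picks) * red


combos = jackpot_combinations()
print(f"1 in {combos:,}")  # 1 in 292,201,338
```

Note what the calculation takes for granted: every combination is equally likely, and the drawing is carried out exactly as specified.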

But is that number correct? Do you really have exactly a 1 in 292,201,338 chance of winning the Powerball jackpot per ticket?

Example: Is This Kid Vaping?

A friend of a friend’s kid failed a drug test at school for vaping. The kid claims that this was the first time they’d ever done it. Their friends just pressured them so hard in the bathroom right beforehand that they finally gave in and took the very first puffs of their life, and then wouldn’t you know it, there was a drug test.

The parents face a question: How likely is it that their kid is lying to them and has actually been routinely vaping for a while?

Now, the people I expect will be reading this have already intuited that the kid is almost certainly, if not certainly, lying. But I hope you’d also agree that there are a lot of credulous parents out there who would happily buy this excuse without blinking, and that it is worth examining the question through the lens of probabilistic thinking rather than gut feelings.
