While I was refreshing some concepts in statistics, I came up with a problem I thought I could ask ChatGPT (talk about attention deficit). I am still using the cheap ChatGPT3.5, so what I am about to show may not happen on the expensive ChatGPT4.0. Here’s an example of how it can mislead you while solving math problems: you should already have an inkling of how to solve the problem (or at least an estimate of the answer) so that you can do some prompt engineering and finally get to a solution.
Here’s the conversation (I edited out quite a bit of fluff). My prompts are in italics.
The conversation
Me: What is the probability of 3 samples from a normal distribution to be one standard deviation more than the mean?
ChatGPT: In a standard normal distribution (with a mean of 0 and a standard deviation of 1), the probability that a random variable falls within one standard deviation of the mean is approximately 68.27%. This is a well-known property of the normal distribution. The probability that the first sample is one standard deviation more than the mean is approximately 0.6827 (68.27%).
Since each sample is independent, the probability that the second sample is also one standard deviation more than the mean is also 0.6827 (68.27%).
Similarly, the probability that the third sample is one standard deviation more than the mean is 0.6827 (68.27%).
To find the combined probability that all three events occur, you simply multiply these probabilities together because they are independent:
0.6827 * 0.6827 * 0.6827 ≈ 0.3194
Me (to myself): That seems wrong, so let’s prompt again…
Me (to ChatGPT): What is the probability of a sample from a normal distribution to be more than one standard deviation away from the mean?
ChatGPT: In a standard normal distribution (with a mean of 0 and a standard deviation of 1), the probability that a random variable falls more than one standard deviation away from the mean (in either direction) is approximately 31.73%. Using standard normal tables or a calculator, you can find the probabilities associated with Z < -1 and Z > 1, which are approximately 0.1587 and 0.1587, respectively.
Now, add these probabilities:
0.1587 + 0.1587 ≈ 0.3174
So, the probability that a sample from your normal distribution is more than one standard deviation away from the mean is approximately 31.74%.
Me: Does this change your answer to the previous question?
ChatGPT: Yes, the probability that a sample from a normal distribution is more than one standard deviation away from the mean is approximately 31.74%, as calculated in the previous response. This probability applies to each individual sample.
In the previous question, you asked about the probability that three samples from the same distribution are each one standard deviation more than the mean.
Therefore, the probability that all three samples are each one standard deviation more than the mean remains the same:
0.3174 * 0.3174 * 0.3174 ≈ 0.0320 or approximately 3.20%.
Concluding thoughts
The astute reader will recognize that this is still wrong: I asked for the probability of each sample being more than one std above the mean, whereas it is still calculating the answer for one std away from the mean (i.e., it counts both tails of the distribution instead of only the upper tail).
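To make the difference concrete, here is a minimal Python sketch of the three single-sample probabilities that got mixed up in the conversation. It assumes a standard normal distribution and uses only the standard library; the helper name std_normal_cdf is mine, not something from the chat.

```python
import math


def std_normal_cdf(z: float) -> float:
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Within one std of the mean: |Z| < 1 (ChatGPT's first answer used this).
p_within = std_normal_cdf(1.0) - std_normal_cdf(-1.0)

# More than one std away from the mean: |Z| > 1, both tails (its second answer).
p_away = 1.0 - p_within

# More than one std above the mean: Z > 1, upper tail only (what I asked for).
p_above = 1.0 - std_normal_cdf(1.0)

print(f"within one std:     {p_within:.4f}")  # 0.6827
print(f"away (both tails):  {p_away:.4f}")    # 0.3173
print(f"above (upper tail): {p_above:.4f}")   # 0.1587
```

The 0.3174 in the conversation comes from adding the two rounded tail values (0.1587 + 0.1587); the unrounded figure is closer to 0.3173.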
The correct answer is (0.1587)^3 ≈ 0.0040, or about 0.4%, assuming the three samples are independent.
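As a sanity check on (0.1587)^3, here is a rough sketch that computes the upper-tail value and compares it with a quick simulation. It assumes the three samples are independent draws from a standard normal distribution; the function name upper_tail, the seed, and the trial count are arbitrary choices of mine.

```python
import math
import random


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Exact value: all three independent samples land above mean + 1 std.
p_one = upper_tail(1.0)
p_all_three = p_one ** 3

# Monte Carlo estimate of the same event (seed fixed for repeatability).
rng = random.Random(0)
trials = 200_000
hits = sum(
    all(rng.gauss(0.0, 1.0) > 1.0 for _ in range(3))
    for _ in range(trials)
)
estimate = hits / trials

print(f"exact:     {p_all_three:.4f}")
print(f"simulated: {estimate:.4f}")
```

The simulated value should land close to the exact one, and both are far from the 0.3194 and 0.0320 that ChatGPT produced.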
Anyway, thought I’d share this non-wordly stuff. Can’t fault it, it’s a Large Language Model, not a Large Math Model… yet (weird, evil, nervous laughter, umbwaahhaaha).