In finite probability spaces, like rolling a die, "impossible" and "probability zero" are fundamentally the same. You can remove every outcome which has probability zero, or add however many probability-zero outcomes you like, and the probability distribution is essentially unchanged. No trial you run will ever select an outcome which has probability zero. This is also true of some countable probability spaces, but that's not important here.

Say you pick a point randomly on a 1 meter by 1 meter square.

You can't do probability on this the way you would for, say, rolling a six-sided die. There are infinitely many (in fact uncountably many) outcomes (points on the square which could be selected), so if we assume that each point is equally likely, we would be adding up the same number infinitely many times... which gives you either infinity (if that number is positive) or 0 (if it is zero). But the probability of landing on *some* point is one, and in the land of finite probability spaces, summing up the probabilities of every point needs to give you 1.
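
Written out, the problem looks like this. Here S stands for the square and c for the common probability given to each single point; both symbols are introduced just for this illustration.

```latex
\sum_{x \in S} P(\{x\}) \;=\; \sum_{x \in S} c \;=\;
\begin{cases}
0      & \text{if } c = 0, \\
\infty & \text{if } c > 0,
\end{cases}
\qquad \text{but we need } P(S) = 1.
```

Neither choice of c gives 1.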

So, if we try to do probability on this infinite space the same way we do on finite ones, we get a contradiction.

Instead, we go another way: suppose you split the square into a grid of tiny squares, each of the same area, and suppose that the probability of picking a point in each of the tiny squares is equal. This is now a finite probability problem, and it works exactly how you would think it works. Landing in a particular square is just like rolling a particular number on a many-sided die.
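
If you want to see the grid idea in action, here is a minimal simulation sketch. It assumes Python's built-in pseudo-random generator is a good enough stand-in for "picking a point uniformly"; the function name `grid_cell_frequencies`, the 4-by-4 grid, the trial count and the seed are all arbitrary choices made for illustration.

```python
import random


def grid_cell_frequencies(n, trials=200_000, seed=1):
    """Split the unit square into an n-by-n grid and return, for each
    cell, the fraction of random points that landed in it."""
    rng = random.Random(seed)
    counts = [[0] * n for _ in range(n)]
    for _ in range(trials):
        x, y = rng.random(), rng.random()  # each coordinate lies in [0, 1)
        # Map the point to its grid cell; min() guards against rounding
        # pushing an index up to n.
        col = min(int(x * n), n - 1)
        row = min(int(y * n), n - 1)
        counts[row][col] += 1
    return [[c / trials for c in row] for row in counts]


freqs = grid_cell_frequencies(4)
flat = [f for row in freqs for f in row]
# With 16 equal cells, every frequency should be close to 1/16 = 0.0625.
print(min(flat), max(flat))
```

Each cell behaves like one face of a 16-sided die: the frequencies all hover around 1/16.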

Now suppose that this were true for *every possible grid of tiny squares*. Then you can start to do stuff like take limits, and in the end you will find that the probability of landing in any (reasonable) set is equal to the area of that set, since the whole square has area 1.
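
Here is a rough numerical check of "probability equals area", again only a sketch. The two regions, the helper name `estimate_probability`, the trial count and the seed are made up for illustration, and a simulation only approximates the statement; it doesn't prove it.

```python
import math
import random


def estimate_probability(in_region, trials=200_000, seed=0):
    """Estimate P(random point of the unit square lies in the region)
    as the fraction of sampled points for which in_region(x, y) is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if in_region(x, y):
            hits += 1
    return hits / trials


def in_rectangle(x, y):
    # A 0.5 by 0.3 rectangle inside the square: area 0.15.
    return 0.2 <= x < 0.7 and 0.1 <= y < 0.4


def in_quarter_disk(x, y):
    # The part of the unit disk inside the square: area pi/4.
    return x * x + y * y < 1.0


rect_estimate = estimate_probability(in_rectangle)
disk_estimate = estimate_probability(in_quarter_disk)
print(rect_estimate, 0.15)
print(disk_estimate, math.pi / 4)
```

The estimates should land close to the areas, for the curved region just as well as for the rectangle.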

But... points don't take up any area, so the chance of randomly picking any *particular* point is zero, even though every point could possibly be picked. However, anything outside the square is simply impossible: if your random process always picks points from a 1x1 meter square, you will never get "banana".

(This is something a lot of people get wrong when philosophizing: the fact that there are infinitely many possibilities does *not* imply that literally anything is possible.)

This unintuitive result was just the consequence of a completely reasonable, intuitive fact about dividing the square into tinier and tinier squares. It's just how the math works out if you want things to make sense no matter how tiny the squares are.

The notion of probability density functions helps us reason about continuous probability (where "probability zero but not impossible" can show up) in a way that doesn't have problems with "probability zero". The probability density function p(x) at a point x is the limit of the ratios P(landing at distance less than r from point x)/(area of the disk with radius r) as r goes to 0. What's more, we can "sum up" a probability density function by taking its integral over the whole space, which will always give us 1, showing a consistency with finite probability. In the example above, for any point inside the square this ratio is exactly one as soon as r is small enough for the disk to fit inside the square, so the pdf says something that we were trying to express from the start: every point is "equally likely" in some sense!
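
To make the ratio concrete, here is a small sketch that estimates it around the centre of the square for shrinking r. The function name `density_ratio`, the chosen radii, the trial count and the seed are illustrative assumptions, and the estimates get noisier as r shrinks because fewer sample points fall in the disk.

```python
import math
import random


def density_ratio(cx, cy, r, trials=400_000, seed=2):
    """Estimate P(point lands within r of (cx, cy)) and divide it by the
    area of the disk of radius r. Returns (probability, ratio)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if (x - cx) ** 2 + (y - cy) ** 2 < r * r:
            hits += 1
    prob = hits / trials
    return prob, prob / (math.pi * r * r)


# Disks around the centre of the square; each one fits inside the square.
results = [(r, *density_ratio(0.5, 0.5, r)) for r in (0.4, 0.2, 0.1, 0.05)]
for r, prob, ratio in results:
    print(f"r={r}: probability ~ {prob:.4f}, ratio ~ {ratio:.3f}")
```

The probability of landing in the disk shrinks towards 0 as r does, while the ratio stays near 1. That is the "probability zero, density one" picture in numbers.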