In other words, how many samples do I need in order to know the total number of elements in a set?

If I give you a random sequence like AABABBBABABABBABABA and say "every element has the same chance of appearing", you can reasonably assume there are only two possibilities: A or B. But can you assume the same from AABABBBABA? Or AABAB? How do I calculate the exact (or at least a 'reasonable') number of samples I need in order to infer the number of possibilities?
ago
0 like 0 dislike
I don't think there's a set number of trials you'll need, tbh. The more trials you take, the more confident you can be in the answer.

>But can you assume the same from AABABBBABA? Or AABAB?

Strictly speaking, you can't, because there's always some chance that a third element (say C) just didn't show up. If the sequence is only AABAB, I can assume A and B are the only elements and call it a day: those are the only ones I saw, and you told me that all elements have an equal chance of popping up. But if there actually were a C, I would be wrong. As you increase the number of trials, the probability of C never appearing becomes extremely small, small enough that you can treat it as 0 in practice (it never actually reaches 0 for a finite number of trials). So the more trials you do, the more likely it is that every element has shown up.
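
To put a rough number on that, here's a minimal sketch. It assumes k equally likely elements and independent draws, so the chance that one particular element (say C) never shows up in n draws is ((k-1)/k)^n. The function name is just something I made up for illustration.

```python
def prob_element_missing(k: int, n: int) -> float:
    """Chance that one specific element never appears in n independent
    draws from k equally likely elements (assumed model)."""
    if k < 1 or n < 0:
        raise ValueError("need k >= 1 and n >= 0")
    # Each draw misses the chosen element with probability (k - 1) / k.
    return ((k - 1) / k) ** n

# Hypothetical third element C, for the three sequence lengths in the question.
for n in (5, 10, 19):
    print(n, prob_element_missing(3, n))
```

Under those assumptions, a hidden C would have about a 13% chance of staying hidden through AABAB, under 2% through the 10-letter sequence, and well below 0.1% through the 19-letter one.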
ago
0 like 0 dislike
Suppose you roll a die with an unknown number of sides twenty times, and you get twenty different numbers.  What can you conclude?
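
One way to explore this, as a sketch only: assume a fair die with k sides and independent rolls, and compute the chance that n rolls all come up different (the same calculation as the birthday problem). The function name is made up for illustration.

```python
def prob_all_distinct(k: int, n: int) -> float:
    """Chance that n independent rolls of a fair k-sided die are all
    different (assumed model)."""
    if n > k:
        return 0.0  # more rolls than sides: a repeat is certain
    p = 1.0
    for i in range(n):
        # The (i + 1)-th roll must avoid the i values already seen.
        p *= (k - i) / k
    return p

# Compare a few hypothetical side counts for twenty rolls.
for k in (20, 100, 1000):
    print(k, prob_all_distinct(k, 20))
```

With exactly 20 sides the chance of twenty different results is on the order of 1e-8, so seeing no repeats at all suggests the die has far more than 20 sides.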
ago
0 like 0 dislike
With a finite number of samples, you will never know for sure. For example, you could flip a fair coin a trillion times and get heads each time; it's just absurdly unlikely.

My instinct is to approach this using hypothesis testing. Let's say you've seen only two distinct sides after twenty samples, and you're wondering about the possibility of a third side. You essentially ask:

Assume, for the sake of argument, that there are three sides. What is the probability of taking twenty samples and failing to sample at least one of the sides?

If that probability is very low, then we can conclude that the assumption of three sides is probably wrong, and that there are probably only two sides. You should spend some time learning about what it does and doesn't mean to reject the null hypothesis, because it can be a source of confusion, but that's the general idea.
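
Here's a rough sketch of that calculation, assuming every side is equally likely and the samples are independent. It uses inclusion-exclusion to get the probability that all k sides are seen in n samples, then takes the complement. The function name is hypothetical.

```python
from math import comb

def prob_some_side_missing(k: int, n: int) -> float:
    """P(at least one of k equally likely sides never appears in n
    independent samples), via inclusion-exclusion (assumed model)."""
    # P(all k sides seen) = sum over j of (-1)^j * C(k, j) * ((k - j) / k)^n
    p_all_seen = sum(
        (-1) ** j * comb(k, j) * ((k - j) / k) ** n
        for j in range(k + 1)
    )
    return 1.0 - p_all_seen

# Three sides assumed, twenty samples taken.
print(f"{prob_some_side_missing(3, 20):.6f}")
```

For three sides and twenty samples this comes out to roughly 0.0009, which most people would count as "very low" in the sense described above.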
ago
0 like 0 dislike
I think this is basically the "German tank problem," so maybe that's a search term that will turn up something helpful.