Outliers and Mean Problem

## 4 Answers

I don't know why you're not telling us what these data sets actually represent. That would be pretty helpful.

However, it sounds like each one intentionally contains exactly one large number and a bunch of smaller ones. So you probably need to produce two numbers -- one for the large score, and another for some kind of average of the small ones.

Then if you *really* need to combine those into a single number, you can average the two numbers (possibly a weighted average).
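
A rough sketch of that idea in Python, assuming each data set is a plain list of numbers with exactly one large value. The function name `split_summary` and the `weight_large` parameter are made up for illustration; pick the weight that suits your problem.

```python
from statistics import mean


def split_summary(data, weight_large=0.5):
    """Return (largest value, mean of the rest, weighted combination).

    Assumes the single largest value is the "large score" and
    everything else counts as the small values.
    """
    if len(data) < 2:
        raise ValueError("need at least one large and one small value")
    ordered = sorted(data)
    largest = ordered[-1]            # the one large score
    small_mean = mean(ordered[:-1])  # plain average of the remaining values
    # weight_large = 0.5 is an ordinary average of the two numbers
    combined = weight_large * largest + (1 - weight_large) * small_mean
    return largest, small_mean, combined


largest, small_mean, combined = split_summary([2, 3, 4, 3, 100])
print(largest, small_mean, combined)
```

Keeping the two numbers separate for as long as possible is usually more informative than the combined one.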

Try asking on r/statistics. This is their bread and butter.

Have you tried using percentiles instead? Can you calculate the 75th percentile of each data set?
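
If it helps, here is one way to get the 75th percentile with only the Python standard library (3.8+). The helper name `percentile_75` and the sample data are invented for illustration, and the "inclusive" interpolation method is just one of several common conventions, so other tools may give slightly different values.

```python
from statistics import quantiles


def percentile_75(data):
    """75th percentile, i.e. the third quartile cut point."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the 75th percentile.
    return quantiles(data, n=4, method="inclusive")[2]


sample = [1, 2, 3, 4, 100]  # made-up data with a single large value
print(percentile_75(sample))
```

Note that the single large value barely moves this number, which is the point of using a percentile here.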

You might consider the median-of-means estimator: partition each data set into groups of the same size, calculate the mean of each group, and then take the median of these means.

That will give you an estimate of the mean that is very robust to outliers, as long as the outliers land in fewer than half of the groups.
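
A minimal sketch of median of means in Python. The shuffle before partitioning, the fixed `seed`, and the choice to drop leftover values so that every group has the same size are my assumptions, not part of the estimator's definition; `num_groups` is something you have to choose yourself.

```python
import random
from statistics import mean, median


def median_of_means(data, num_groups, seed=0):
    """Median of the means of `num_groups` equal-sized groups."""
    values = list(data)
    if not 1 <= num_groups <= len(values):
        raise ValueError("num_groups must be between 1 and len(data)")
    # Shuffle so the grouping does not depend on the original order
    # (an assumption; any partition into equal groups would do).
    random.Random(seed).shuffle(values)
    group_size = len(values) // num_groups
    # Leftover values beyond num_groups * group_size are dropped.
    groups = [
        values[i * group_size:(i + 1) * group_size]
        for i in range(num_groups)
    ]
    return median(mean(group) for group in groups)


sample = [1, 2, 3, 4, 5, 6, 7, 8, 1000]  # made-up data, one outlier
print(mean(sample), median_of_means(sample, num_groups=3))
```

With one outlier and three groups, only one group mean is contaminated, so the median ignores it. Use more groups if you expect more outliers.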
