Outliers and Mean Problem

## 4 Answers

I don't know why you're not telling us what these data sets actually represent. That would be pretty helpful.

However, it sounds like each one intentionally contains exactly one large number and a bunch of smaller ones. So you probably need to produce two numbers -- one for the large score, and another for some kind of average of the small ones.

Then if you *really* need to combine those into a single number, you can average the two numbers (possibly a weighted average).
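
A rough sketch of that two-number approach in Python, assuming each data set really does have exactly one large value. The function name `summarize`, the `weight_large` parameter, and the sample numbers are all made up for illustration:

```python
from statistics import mean


def summarize(scores, weight_large=0.5):
    """Return (largest value, mean of the rest, weighted combination).

    Assumes the data set contains one large value and several smaller ones.
    weight_large is a hypothetical knob: the weight given to the large value.
    """
    if len(scores) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(scores)
    large = ordered[-1]             # the single large score
    small_avg = mean(ordered[:-1])  # plain average of everything else
    combined = weight_large * large + (1 - weight_large) * small_avg
    return large, small_avg, combined


# Made-up example data: one large value among several small ones.
large, small_avg, combined = summarize([2, 3, 4, 3, 100], weight_large=0.5)
print(large, small_avg, combined)
```
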

Try asking on r/statistics. This is their bread and butter.

Have you tried using percentiles instead? Can you calculate the 75th percentile of each data set?

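
For what it's worth, here is one way to get a 75th percentile with only the Python standard library. The helper name `percentile_75` and the sample data are made up, and different interpolation methods (here `method="inclusive"`) can give slightly different answers on small data sets:

```python
from statistics import quantiles


def percentile_75(data):
    """Return the 75th percentile (third quartile) of a data set.

    quantiles(..., n=4) returns the three quartile cut points;
    index 2 is the upper one. Needs at least two data points.
    """
    return quantiles(data, n=4, method="inclusive")[2]


# Made-up example data: the single large value barely affects the result.
p75 = percentile_75([2, 3, 4, 3, 100])
print(p75)
```
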
You might consider the median-of-means estimator: randomly partition each data set into groups of equal size, calculate the mean of each group, and then take the median of those means.

That gives you an estimate of the mean that is robust to outliers, as long as the outliers land in fewer than half of the groups.
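
A minimal sketch of that estimator in Python, under a couple of assumptions of mine: the data is shuffled first so the groups are random, and any leftover points that don't fill a complete group are simply dropped. The function name, the `num_groups` and `seed` parameters, and the sample data are illustrative only:

```python
import random
from statistics import mean, median


def median_of_means(data, num_groups, seed=None):
    """Estimate the mean as the median of equal-sized group means."""
    if not 1 <= num_groups <= len(data):
        raise ValueError("num_groups must be between 1 and len(data)")
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # random partition
    group_size = len(shuffled) // num_groups
    # Leftover points beyond num_groups * group_size are dropped
    # so that every group has the same size.
    groups = [
        shuffled[i * group_size:(i + 1) * group_size]
        for i in range(num_groups)
    ]
    return median(mean(g) for g in groups)


# Made-up example data: eight small values and one large one.
sample = [2, 3, 4, 3, 2, 3, 4, 3, 100]
estimate = median_of_means(sample, num_groups=3, seed=0)
print(estimate, mean(sample))
```

With three groups, the single large value can only inflate one group mean, so the median of the three means stays in the range of the small values, while the plain mean is pulled well above them.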

