When I was in academia I always dreamed of having good (free) datasets like those in industry. Now I am in industry and I have good data, but I don't see it treated as rigorously as I was expecting. In my field it's mostly regression analysis, for which even low R^2 values are accepted, and A/B tests where normality is just assumed and rarely checked. The argument is that "we need to make business decisions, not publish a paper". I suppose an indicative figure is better than guesswork. I am nonetheless surprised.

How is it for you guys? I'd love to hear opinions from people in highly specialised fields as well.
---
The worst is when they ask you to justify, ex post, a decision that has already been made. I had this happen a few days ago when I was running a difference-in-differences (DiD) analysis to see whether a past policy had any observable impact on growth. While it appeared that it did (a significant DiD estimator), the parallel trends assumption did not seem to hold, so I couldn't draw a scientifically sound conclusion. They didn't care - they really wanted to repeat the policy. So they framed it as 'our models conclusively demonstrate that policy X had a positive impact on Y'.
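
Roughly what that looked like, sketched on synthetic data (statsmodels; the column names `growth`, `treated`, `post`, `unit`, `time` are illustrative, not from the actual project):

```python
# Minimal difference-in-differences sketch (statsmodels); the data are
# synthetic and the column names are illustrative, not from my project.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_units, n_periods = 40, 8
df = pd.DataFrame(
    [(u, t) for u in range(n_units) for t in range(n_periods)],
    columns=["unit", "time"],
)
df["treated"] = (df["unit"] < n_units // 2).astype(int)
df["post"] = (df["time"] >= n_periods // 2).astype(int)
# Synthetic DGP: common trend plus a true effect of 1.0 for treated
# units after the policy.
df["growth"] = (
    0.5 * df["time"]
    + 1.0 * df["treated"] * df["post"]
    + rng.normal(0, 1, len(df))
)

# The DiD estimate is the coefficient on the interaction term.
did = smf.ols("growth ~ treated * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["unit"]}
)
print(did.params["treated:post"])

# Informal parallel-trends check: in the pre-policy periods, are the
# two groups already trending apart?  A small p-value is a red flag.
pre = df[df["post"] == 0]
trend = smf.ols("growth ~ treated * time", data=pre).fit()
print(trend.pvalues["treated:time"])
```

The pre-trend check is informal, but if it fails (as it did in my case), the interaction coefficient has no causal interpretation, however significant it looks.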
---
With good data you can often use quite basic methods. For example, if you have good domain knowledge, you do not need to test normality on every data batch. And in any case, least squares is still the best linear unbiased estimator in many non-normal cases.

A low R^2 is not good, though.
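
A quick sketch of that Gauss-Markov point, on synthetic data (numpy only): the errors are mean-zero but heavily skewed, and the least-squares slope is still unbiased.

```python
# Sketch: OLS slope estimates remain unbiased.  Gauss-Markov needs
# zero-mean, homoscedastic, uncorrelated errors -- not normality.
# Errors here are centered exponential: heavily skewed, non-normal.
import numpy as np

rng = np.random.default_rng(1)
true_slope, n, reps = 2.0, 200, 2000
x = rng.uniform(0, 1, n)
estimates = []
for _ in range(reps):
    eps = rng.exponential(1.0, n) - 1.0   # mean-zero, skewed errors
    y = true_slope * x + eps
    estimates.append(np.polyfit(x, y, 1)[0])  # least-squares slope

print(np.mean(estimates))  # ~2.0: unbiased despite non-normal errors
```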
---
What do you mean by 'accepted'? If your R^2 is low, it's low. How can you not accept it?

I check for normality and typically assume I don't have it.

Yeah, we need to make business decisions. When I worked in academia, there was a tendency (sometimes implicit, sometimes not) to want to find significant or interesting results, so that we could publish papers, lengthen our CVs, and get raises and promotions. Now we want the truth.

When I'm analyzing an A/B test I don't know what the groups are, I don't know what anyone 'wants', and I'm completely dispassionate. All I care about is getting the 'right' answer. If I ever say "I'm not sure this analysis is appropriate because X" I would be (and am) taken very seriously. (Right now I'm working on detecting differences in rare events - events happen around 0.1% of the time. We have a sample of 1000 per group. If you have a significant result, I don't believe it.)
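
Back-of-the-envelope for that rare-events example (scipy/statsmodels; the 0.1% rate and n = 1000 per group are as above, and the doubled rate in the power calculation is just an illustrative alternative):

```python
# Back-of-the-envelope: 0.1% event rate, n = 1000 per group (from the
# post above); the doubled rate below is my illustrative alternative.
from scipy.stats import binom
import statsmodels.stats.api as sms

n, p = 1000, 0.001
print(n * p)               # expected events per group: 1.0
print(binom.pmf(0, n, p))  # ~0.37 chance an arm sees zero events

# Power of a two-sided two-proportion z-test to detect a doubling of
# the rate (0.1% -> 0.2%) at alpha = 0.05:
effect = sms.proportion_effectsize(0.001, 0.002)
power = sms.NormalIndPower().power(effect, nobs1=n, alpha=0.05, ratio=1.0)
print(power)               # ~0.09: a "significant" hit at this sample
                           # size is almost certainly noise
```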

That might not be true for everyone.
---
Normality is not a terribly important assumption in A/B testing.
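
A quick simulation sketch of why, on synthetic lognormal data (numpy/scipy): under the null, the two-sample t-test stays near its nominal 5% false-positive rate even with grossly skewed data, courtesy of the central limit theorem.

```python
# Sketch: type I error of a two-sample t-test on heavily skewed
# (lognormal) data.  The CLT keeps the test near its nominal 5% level
# at moderate sample sizes, which is why normality checks rarely
# change the verdict of an A/B test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps, alpha = 200, 5000, 0.05
false_positives = 0
for _ in range(reps):
    a = rng.lognormal(0, 1, n)   # same skewed distribution in both
    b = rng.lognormal(0, 1, n)   # arms, so the null is true
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(false_positives / reps)    # ~0.05 despite gross non-normality
```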
---
I try to be rigorous, and sometimes people don’t like it.

If you are not trying to be rigorous, you are not using data to support or derive conclusions. You are using data to justify decisions that were really made on gut feeling, or that have already been made.

I also see no general problem with a low R^2. It depends on the underlying question. If your data-generating process is essentially a coin toss, even the best possible model will have an R^2 near 0, yet its predictions might still be useful.
And if you are doing causal inference, R^2 doesn't matter at all.
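
A sketch of what I mean on synthetic data (numpy/statsmodels): a tiny R^2 coexisting with a real, precisely estimated effect.

```python
# Sketch: a noisy data-generating process where R^2 is tiny, yet the
# effect of x is real and precisely estimated.  A low R^2 limits how
# well you predict individuals, not whether the association is genuine.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(0, 1, n)
y = 0.1 * x + rng.normal(0, 1, n)   # true effect 0.1, lots of noise

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(fit.rsquared)                 # ~0.01: "low" by any standard
print(fit.params[1], fit.bse[1])    # slope ~0.10 with a tiny std. error
```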
---
> even low R^2 are accepted

I know it's not the point of your post, but models with low R^2 can still capture meaningful, if weak, associations. Finding a weak signal in noise is not necessarily a bad thing.
---
But ultimately, if the decision turns out wrong because assumptions weren't checked, the perceived value of our work drops, which is dangerous for this profession and for consultants.
---
How rigorous you are really depends on what is necessary in any given situation and what you are trying to do with your data. I would argue that as a statistician, you should always be rigorous in how you are reporting on analyses, assumptions, and shortcomings.

In the drug development world, we see both ends of this:

Primary (and secondary) hypothesis testing in a late-phase study that will support regulatory filings? You had better be damn sure that your multiplicity adjustment is correct and that the assumptions behind an analysis are both reasonable and stress-tested via many sensitivity analyses.

Exploratory post-hoc analysis of unexpected findings? Go crazy with whatever you want, but report the results accurately nonetheless: as post hoc and not controlled for type I error.
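
To illustrate the multiplicity point, a minimal sketch with invented p-values (Holm's procedure via statsmodels):

```python
# Sketch of a standard multiplicity adjustment.  The raw p-values are
# invented for illustration; Holm's step-down procedure controls the
# familywise type I error without assuming independence between tests.
from statsmodels.stats.multitest import multipletests

raw_p = [0.012, 0.030, 0.041, 0.20]   # e.g. a primary + secondaries
reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
print(list(zip(raw_p, adj_p.round(3), reject)))
# Endpoints that looked "significant" at 0.05 may no longer be after
# adjustment -- unadjusted post-hoc findings should be reported as such.
```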
---
It depends on what you mean by rigorous. If you are making important decisions, you should try not to be wrong.

But the two examples you gave are not the most compelling. As others have said, normality is not an important thing to check for a t-test. There are ten other things more likely to be a real problem, and if you are making decisions, a t-test is probably not the right thing to do in the first place.

And what's wrong with a small R^2? As others have said, that might mean there is weak, but potentially useful, predictive value in the model.

It sounds to me like the problem is not that your colleagues are not rigorous (by your definition), but that the statistical methods you are using are not appropriate to the problems they care about.
---
Have you asked yourself how much you actually need normality for A/B tests?