0 like 0 dislike
I have training in physics and maths and have been looking at statistical programming jobs in the private sector (mostly biotech), and it seems like every single company wants to use SAS. I usually just use Python or R, so I gave it a shot over the weekend, and holy shit this language is such garbage. Why do companies willingly use this? It's extortionate, syntactically awful, closed-source, has terrible docs, and is missing a LOT of functionality that modern statistical packages in Python and R already have.

A lot of the statistical programming work sounds interesting *except* that it's in SAS, and I just cannot fathom why anybody would keep using this garbage instead of R + Tableau or something. Am I missing something? Is this something I'll just have to get over and learn?

26 Answers

0 like 0 dislike
Two good reasons and two extremely shitty reasons. One good reason is that because the language is extremely stable from one release to the next, legacy code keeps running on production versions of SAS basically indefinitely.

The second good reason is that it's got pretty solid memory management when your data requires more RAM than your machine has. It won't just crash; it'll work from disk and virtual memory without any user effort or input. You can work around this in R or Python, but you have to be deliberate about it, afaik.
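
To illustrate what "being deliberate" can look like in Python, here is a minimal sketch that computes a mean and standard deviation in one pass, holding only one row in memory at a time. It uses Welford's online algorithm, which is my choice for the illustration and not something SAS is known to do internally; the column name `dose` and the path `big.csv` are made up.

```python
import csv
import io
import math


def streaming_mean_sd(rows, column):
    """Welford's one-pass mean/SD: only one row is held in memory at a time."""
    n, mean, m2 = 0, 0.0, 0.0
    for row in rows:
        x = float(row[column])
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # running sum of squared deviations
    sd = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
    return n, mean, sd


# Stand-in for a file too large for RAM. With a real file, pass
# open("big.csv", newline="") instead ("big.csv" is a hypothetical path).
fake_file = io.StringIO("dose\n1\n2\n3\n4\n")
n, mean, sd = streaming_mean_sd(csv.DictReader(fake_file), "dose")
print(n, mean, sd)
```

The point is only that the streaming has to be written out by hand here, whereas a SAS data step reads observations from disk this way by default.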

The shitty reasons are 1) that managers are dinosaurs who don't know how to code and aren't willing to learn, and because of that they don't know what they're missing, and too many of the people who know better care too much about being polite and diplomatic to confront them on just how asinine this is. 2) Other dinosaurs who know even less than those managers believe in the persistent myth that paying for software provides some kind of liability protection compared to open source, despite being wildly unable to articulate what sort of liability they're concerned about.
0 like 0 dislike
Where SAS is used, it is often the database as well as the analytic language. It competes with MySQL. Migrating 50 years of data from a SAS DB to MySQL, BigQuery, Azure, or Redshift may be a good idea, but it is costly.

And SAS automation is reliable. Code created 30 years ago may still be running, unmaintained, at the core of a company's ETL.

SAS has had its strengths. At one time, its manuals were the leading resource about statistics and were excellent. And the way it shows you rows of the results each time it runs a data step is nice enough that I have coded work-arounds in Python and R to submit my SQL steps and show me a random sample of each table as it is created. JMP-IN's strategy of showing you the graphics you should have asked for with the results of each test was also a great idea, and one I'd like to see as a standard approach in R.
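
A rough sketch of that kind of work-around in Python, assuming an SQLite database purely for illustration. The helper `run_step` and the table and column names are hypothetical, not my actual code:

```python
import sqlite3


def run_step(conn, table, select_sql, n=5):
    """Create `table` from a SELECT, then print and return up to n random rows."""
    # Table names cannot be bound as SQL parameters, so they are interpolated;
    # only pass trusted names and queries here.
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} AS {select_sql}")
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    sample = conn.execute(
        f"SELECT * FROM {table} ORDER BY RANDOM() LIMIT ?", (n,)
    ).fetchall()
    print(f"{table}: {total} rows, showing {len(sample)}")
    for row in sample:
        print("  ", row)
    return sample


# Made-up data: 20 visits with doses 0.5, 1.0, ..., 10.0
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id INTEGER, dose REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)", [(i, i * 0.5) for i in range(1, 21)]
)
high_dose = run_step(conn, "high_dose", "SELECT * FROM visits WHERE dose > 5", n=3)
```

`ORDER BY RANDOM()` scans the whole table, so on very large tables a cheaper sampling method would be worth swapping in.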
0 like 0 dislike
From a pharma perspective, I think there are two scenarios to think about:

(1) Exploratory analysis, ad hoc analysis, simulation studies, etc.

(2) Production statistical reporting of clinical trial data

In the case of #1, the use of R is not at all uncommon. Most folks in biotech are well aware of the advantages of R in these scenarios.

In the case of #2, I think you are not taking the business view of SAS in the pharma industry. Large pharma companies have huge macro pipelines and templates built around SDTM/ADaM/TFLs that took an enormous amount of human capital to develop and are easily deployable using SAS for all historical, ongoing, and near-future studies. So while this could theoretically be achieved using R, there would be absolutely no benefit to doing so, and redoing that entire pipeline would introduce a lot of expense and complications. Standard analyses in pharma are wwwaaaayyyyyy within the bounds of SAS' technical capabilities.

Also, you have to keep in mind that an NDA being submitted today includes data from a phase I study conducted 10 years ago. To aid in the evaluation of your submission package, it goes an awfully long way to keep a large degree of consistency in the SDTM/ADaM/TFL production between your various studies. So why would you do the analysis of a Ph3 study in R all of a sudden after the first several clinical trials were all done in SAS? Right, you wouldn't.

Ok, so then what about smaller biotechs? Well, they are outsourcing the work to CROs (they don't have the resources in house) which all have the exact same pipeline set up as the large pharmas. CROs would have to charge wwwaaayyyyyyyy more to redo all of these pipelines using R. Thus the end result would be way more expensive to cash-strapped small biotechs with little to no upside. So also not gonna happen any time soon.

We can argue about whether #2 is a "good" thing until we are blue in the face. But at least in 2022 this is why SAS remains dominant in clinical trial reporting.

Could this change 10 or 20 years in the future? Perhaps. But seeing the lack of penetration of R in the industry in the \~10 years I have been in it, I am a bit skeptical that it will happen any time soon.
0 like 0 dislike
SAS is extremely common in pharma because it's been used for so long that it's a known entity and hence less risky. The FDA will even accept SAS files directly as part of regulatory submissions.

Also, from a regulatory perspective, companies in medical device and pharma tend to shy away from open source because it makes validation of the software more difficult. If you purchase software, you can then audit the company that wrote it to verify that they followed FDA guidelines in producing it.

I also hate it.
0 like 0 dislike
“I gave it a shot over the weekend” lol
0 like 0 dislike
It's really prevalent in public health because that's what the CDC uses, and they create a lot of really complex code that helps standardize the country's surveillance efforts. But I don't need any of that since I just work with my state data, so I do everything in R since it's so much more flexible and Rmarkdown makes it super easy to make reproducible documents.
0 like 0 dislike
\- Reliable

\- Has lots of modules built

\- Legacy
0 like 0 dislike
There's no greater high than when you first realise you can hack linked lists into the macro language!
0 like 0 dislike
‘Statistical Programming’ in pharma means working with clinical trial data files to produce standardized reports that go to regulatory agencies. They use SAS because it is highly, centrally controlled and QCed, because the industry is conservative and risk averse, and because of the huge amount of legacy code that is available.

If you want to avoid this and do more Python and R in the pharma industry, I’d look at bioinformatics positions. Or possibly statistical methods or RWD, but you probably need more formal statistical training for those areas.
0 like 0 dislike
In hospital research, we had an issue with SAS, and we were able to call the 800 number and send in some made-up data, and they walked us through it.

I am not sure if R offers something like this?
