The platform is great, but I want to know about the challenges of using it to build ML models and take them to production. One point I know of is that building models on a platform like this can be a black-box situation. Why would you prefer DataRobot, or avoid using it?
I used DataRobot, and a lot of the time there was a problem with overfitting. DataRobot has a blueprint of the model you want to use, so you know almost all the methods used to obtain the final result, although it does not fully explain all the transformations and results obtained. There is also a problem with some ML models: you can't use its feature creation engine unless the model is used only in batch deployment.
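
One rough way to spot the overfitting mentioned above, whatever tool built the model, is to compare accuracy on the training data with accuracy on data the model never saw. The sketch below is not DataRobot code. It uses a toy 1-nearest-neighbour model and made-up noisy data (both are assumptions for illustration) because that model memorises its training set, which makes the gap easy to see.

```python
import random

def one_nn_predict(train, x):
    """Predict the label of x from its single nearest training point."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]

def accuracy(train, rows):
    """Share of rows whose label the 1-NN model gets right."""
    hits = sum(one_nn_predict(train, x) == y for x, y in rows)
    return hits / len(rows)

rng = random.Random(0)

# Hypothetical toy data: one feature, label is "x > 0.5" with 20% label noise.
data = []
for _ in range(400):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.2:
        y = 1 - y  # flip the label to simulate noise
    data.append((x, y))

train, holdout = data[:300], data[300:]

train_acc = accuracy(train, train)      # scored on data the model memorised
holdout_acc = accuracy(train, holdout)  # scored on unseen data
gap = train_acc - holdout_acc

print(f"train={train_acc:.2f} holdout={holdout_acc:.2f} gap={gap:.2f}")
```

A large gap between the two numbers is the warning sign. What counts as "large" depends on the problem, so treat this as a sanity check and not a rule.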
TL;DR: If you're the scale of a Fortune 20, you're not gonna use these tools unless it's a hosting platform or just a start. If you're the scale of a Fortune 100, they are a great start that you can tweak and you will likely use them. If you're the scale of a Fortune 1000, then these are great tools to add to your arsenal until you decide that you want to focus on getting good data scientists. If you're smaller than that you might not be able to afford these tools.  


Longer explanation: DataRobot is probably the second best autoML solution out there (I'm biased, I realize).

AutoML tools are amazing at going from the 'we want something' to 'we have something' level of specificity and completeness. They do a bunch of (basic, thoughtless) feature engineering and then try a bunch of (basic, thoughtless) parameter tuning to get to the best result you ask for. Huge caveat - you still need to understand what to ask for and how to interpret the result(s).
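
To make the "try a bunch of features, then try a bunch of parameters" idea concrete, here is a minimal sketch of that kind of brute-force search. It is not how DataRobot or any specific product works internally. The transforms, the threshold grid, the toy data and the tiny "stump" model are all assumptions picked to keep the example short.

```python
import itertools
import random

# Candidate feature transforms and hyperparameters (hypothetical, for illustration).
TRANSFORMS = {"identity": lambda x: x, "square": lambda x: x * x, "abs": abs}
THRESHOLDS = [0.25, 0.5, 1.0, 2.0, 4.0]

def fit_stump(rows, threshold):
    """Learn whether values above a fixed threshold should be labelled 1."""
    above = [y for x, y in rows if x > threshold]
    return sum(above) * 2 >= len(above) if above else True

def predict(x, threshold, positive_above):
    """Return 1 on the side of the threshold that the stump learned is positive."""
    return int((x > threshold) == positive_above)

def accuracy(rows, threshold, positive_above):
    hits = sum(predict(x, threshold, positive_above) == y for x, y in rows)
    return hits / len(rows)

rng = random.Random(0)
raw = [rng.uniform(-2, 2) for _ in range(400)]
data = [(x, int(abs(x) > 1.0)) for x in raw]  # toy target: "is |x| greater than 1?"
train, valid = data[:300], data[300:]

# Try every transform/threshold pair and keep the best validation score.
best = None
for (name, transform), threshold in itertools.product(TRANSFORMS.items(), THRESHOLDS):
    t_train = [(transform(x), y) for x, y in train]
    t_valid = [(transform(x), y) for x, y in valid]
    positive_above = fit_stump(t_train, threshold)
    score = accuracy(t_valid, threshold, positive_above)
    if best is None or score > best[0]:
        best = (score, name, threshold)

best_score, best_name, best_threshold = best
print(f"best: transform={best_name} threshold={best_threshold} accuracy={best_score:.2f}")
```

The loop has no idea why one transform works better than another. It only ranks candidates by the metric it was given, which is why the caveat about knowing what to ask for still applies.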

So if you're at a firm without much in the way of capacity, it can scale your capability really, really well, as long as what you want is basic, thoughtless models. And then these tools let you host a model so you can score against it.

Don't take that as a necessarily bad thing. Most firms don't have this level of capability and so if what you lack is someone who can write the code then these tools can be amazing at scaling your capabilities. And if you're coming from nothing or a 10% solution then this will get you to an 80% solution and do so quickly.

Now...that being said...if you want it to work better you can't stop with just DataRobot (or any of the other autoML tools). You're not gonna get great. You'll just get good enough.

For what it's worth, I think there is going to be a niche of implementation folks that use tools like DataRobot to get orgs from nearly zero analytics to basic ML level for the next 2 - 10 years.
So I totally waxed poetic and answered the wrong question.

The problems tend to be with the business solution. The data will usually work, but if it's a classification model, what next? What does that mean?

So let's say you have a classification model that predicts customer churn...what the hell do you do with that? You can't just fire those customers so now you need some sort of thing to do with them and you'll need to set up a treatment & control group to measure the result. And you need to do something with the groups of customers now that you've defined them so you need some sort of customer journey setup to understand how to intake and deal with the results of your model.
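
As a rough illustration of the treatment and control setup described above, the sketch below randomly splits the customers a churn model flagged, then compares churn rates between the two groups once the retention action has run. The customer IDs, the 50/50 split and the outcomes are all made up for the example, and a real analysis would also need a significance test and a large enough sample.

```python
import random

def assign_groups(customer_ids, treatment_share=0.5, seed=0):
    """Randomly split flagged customers into treatment and control groups."""
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * treatment_share)
    return {"treatment": ids[:cut], "control": ids[cut:]}

def churn_rate(group, churned):
    """Share of customers in the group who ended up churning."""
    return sum(1 for c in group if c in churned) / len(group)

def uplift(groups, churned):
    """Control churn rate minus treatment churn rate (positive = treatment helped)."""
    return churn_rate(groups["control"], churned) - churn_rate(groups["treatment"], churned)

# Hypothetical example: 10 customers the model flagged as likely to churn.
flagged = [f"cust_{i}" for i in range(10)]
groups = assign_groups(flagged)

# After the retention offer has run, record who actually churned (made-up outcomes).
churned = set(groups["control"][:3]) | set(groups["treatment"][:1])

result = uplift(groups, churned)
print(f"churn-rate difference (control - treatment): {result:.2f}")
```

The random assignment is the important part: without it you can't tell whether a lower churn rate came from the retention action or from how the customers were picked.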

In short, if you have an existing analytics group that has a robust experimentation system (like A/B testing or the like) then you can just slot in the results of your model into an existing framework that you already use. But otherwise it's just like any other massive project that you have to implement. It's usually a nightmare unless you have buy-in all the way at the beginning.

I actually do this part for a living so you can message me if you want to talk through a bunch more specifics about what next and all that.
I worked with DR before. It reminded me of “a depressingly stupid machine”. It can be used as a quick start to get initial statistics and visual insights into the dataset, and its model can serve as a baseline. But it's missing common sense, so its feature selection and modeling usually lead to overfitting. A simple EDA in Jupyter and the baseline model that comes out of it outperform DR in most cases.