Statistics always struck me as a layer of terminology atop concepts from analysis and probability. I remember one time, we did a proof in statistics that involved the Borel-Lebesgue characterization of compact sets (from any open cover of a compact set, a finite subcover can be extracted), in order to cover a compact set with finitely many balls of radius epsilon... That is highly unusual for statistics and was the only glimpse of intelligible math I found in all the statistics courses I had to suffer through. I would describe statistics as object-oriented mathematics, where everything is hidden from view. But on that day, I was allowed to see that there is something behind the terminological "thicket", to quote Numerical Recipes in Fortran. Sorry, this turned into a rant.
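For reference, here is the property in question, stated from memory (notation mine, so treat it as a sketch rather than a textbook statement):

```latex
% Borel-Lebesgue (Heine-Borel) property: a set K is compact iff
% every open cover of K admits a finite subcover.
\[
K \subseteq \bigcup_{i \in I} U_i \ (U_i \text{ open})
\;\Longrightarrow\;
\exists\, i_1, \dots, i_m \in I : \; K \subseteq \bigcup_{k=1}^{m} U_{i_k}.
\]
% Applying this to the open cover \{ B(x, \varepsilon) : x \in K \}
% yields the finite cover by \varepsilon-balls mentioned above:
% finitely many points x_1, \dots, x_m with
% K \subseteq \bigcup_{k=1}^{m} B(x_k, \varepsilon).
\]
```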

Machine learning and such is basically interpolation with complicated or steep functions (like logistic functions, though they go by other names there), naturally using optimization (called fitting or training, because why not). What they call a "loss function" is what everyone else calls a "cost function", so there is a pattern here: renaming. They also rely mostly on stochastic optimization because they have many variables, but that does not make for a field per se, in my opinion. In that respect it is like statistics: an old field used for new purposes, hence the new terminology.
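To make the point concrete, here is a toy sketch of the claim that "training" is just minimizing a cost function by stochastic optimization. Everything in it (the synthetic data, the step size, the iteration count) is made up for illustration; it is a minimal example, not anyone's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1-D synthetic data: the label is 1 exactly when x > 0.
x = rng.uniform(-2.0, 2.0, size=200)
y = (x > 0).astype(float)

def sigmoid(z):
    # The "steep function" in question: the logistic function.
    return 1.0 / (1.0 + np.exp(-z))

# Model: p(y = 1 | x) = sigmoid(w*x + b).
# Cost ("loss") function: the usual log-loss, minimized below
# by stochastic gradient descent, one sample at a time.
w, b = 0.0, 0.0
lr = 0.5  # step size, chosen arbitrarily for this sketch
for epoch in range(100):
    for i in rng.permutation(len(x)):
        p = sigmoid(w * x[i] + b)
        g = p - y[i]  # gradient of the log-loss w.r.t. the logit w*x + b
        w -= lr * g * x[i]
        b -= lr * g

# The fitted curve should now separate the two classes.
preds = (sigmoid(w * x + b) > 0.5).astype(float)
accuracy = (preds == y).mean()
print(accuracy)
```

The whole "training" step is the two update lines inside the loop; the rest is bookkeeping. Swap in a bigger model and a fancier optimizer and the picture does not fundamentally change.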