It's more math than anything else. I think it originated in the optimization community. Basically, you try to recover sparse signals (i.e. signals that have only a few non-zero entries) from as few linear measurements as possible. That is an NP-hard problem, but luckily there are convex relaxations that, under certain conditions, find the exact solution. A lot of the research focused on how many measurements you need for a given sparsity level, on designing fast algorithms, and on proving under which conditions on the sensing matrix these algorithms find the correct solution. It has found a ton of applications everywhere, e.g. wireless communication, but the hard part of it is the mathematical results. Terence Tao co-wrote one of the first papers on CS, around 2006 I think.
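
To make the relaxation idea concrete, here's a minimal sketch of the usual l1 relaxation (basis pursuit: minimize ||x||_1 subject to Ax = y), solved as a linear program. The Gaussian sensing matrix and the problem sizes are just made-up illustration values, not anything canonical:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# n-dimensional signal, m << n linear measurements, k non-zero entries
n, m, k = 200, 60, 5

# k-sparse ground-truth signal
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.normal(size=k)

# Random Gaussian sensing matrix and measurements y = A x
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x_true

# Basis pursuit: min ||x||_1  s.t.  A x = y
# Split x = u - v with u, v >= 0 and minimize sum(u) + sum(v),
# which turns the l1 objective into a linear program.
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]

print("recovery error:", np.linalg.norm(x_hat - x_true))
```

With a random Gaussian matrix and enough measurements relative to the sparsity, the recovery error comes out essentially zero, which is exactly the kind of guarantee those conditions on the sensing matrix are about.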