1. I am working on a dataset from a disease survey, and I am trying to identify risk factors for a sub-category of the disease from that survey. As a first step, I selected 4 states from the survey to examine the data.
2. For each state, I checked whether the data collected on each candidate predictor of the disease (for example, age, gender, smoking status, diabetes status, and socioeconomic status) was representative of the total population surveyed in that state. For example, I found that in state X, testing efforts are more focused on 60-74-year-olds.
3. With this information, I looked up baseline data for each predictor in national data sources.
4. Now that I had an idea of how representative the surveyed population was within each state, the next step was to see whether it was also representative of the country's population and whether it could be replicated in a synthetic population.
I have tried to use copula theory in R to describe the dependence structure between the random variables while keeping the marginals fixed.
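To make the idea concrete, here is a minimal sketch of "dependence from a copula, marginals held fixed" using `mvdc()` from the `copula` package. The two predictors, their marginal distributions, and the parameter values are hypothetical and chosen only for illustration; they are not taken from the survey.

```r
library(copula)

set.seed(42)

# Hypothetical dependence structure: a Gaussian copula with correlation 0.4
dep <- normalCopula(param = 0.4, dim = 2)

# Hypothetical fixed marginals: age ~ Normal(60, 10), smoker ~ Bernoulli(0.2)
joint <- mvdc(
  copula       = dep,
  margins      = c("norm", "binom"),
  paramMargins = list(list(mean = 60, sd = 10),
                      list(size = 1, prob = 0.2))
)

# Draw a synthetic population: each column follows its own marginal,
# while the copula controls how the columns move together
x <- rMvdc(1000, joint)
colnames(x) <- c("age", "smoker")
head(x)
```

Changing `dep` changes only the dependence between the columns; the marginals stay as specified in `paramMargins`.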
I built a correlation matrix for the synthetic data, using each predictor's marginal distribution. I am trying to use this matrix to fit my copula, but I keep getting this error:
clayton_copula_fit <- fitCopula(copula = claytonCopula(), data = pobs(syn_data), method = "mpl")
Error in (copula, u = data, method = method, start = start, :
The dimension of the data and copula do not match
Can someone please help? Thank you!
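A possible cause, assuming `syn_data` has more than two columns: `claytonCopula()` called with no arguments creates a 2-dimensional copula (its `dim` argument defaults to 2), so it cannot match data with more columns. The sketch below reproduces the setup with a hypothetical 5-column stand-in for `syn_data` (simulated here, not the real survey data) and passes the dimension explicitly.

```r
library(copula)

set.seed(1)

# Hypothetical stand-in for syn_data: 500 rows, 5 positively dependent columns
true_cop <- claytonCopula(param = 2, dim = 5)
syn_data <- qnorm(rCopula(500, true_cop))

# Pseudo-observations: rank-transform each column into (0, 1)
u <- pobs(syn_data)

# Give the copula the same dimension as the data before fitting
cop <- claytonCopula(dim = ncol(u))
clayton_copula_fit <- fitCopula(cop, data = u, method = "mpl")

coef(clayton_copula_fit)
```

Note that a Clayton copula has a single parameter, so it imposes the same strength of dependence on every pair of variables and does not use a correlation matrix. If the goal is to fit a full correlation matrix, an elliptical copula such as `normalCopula(dim = ncol(u), dispstr = "un")` may be a closer match, though whether it suits this data is something to check rather than assume.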