Here is a model I’d like to fit: What proportion of statisticians actually like to work with data?
Some statisticians like the idea of data, the theory of data, stuff like that. Real data means data cleaning, which is kind of fun, in limited doses. But I suspect that some (most?) classically trained, hardcore, mathematical statisticians do not like working on real data problems. Those problems are just too messy. Such statisticians would rather spend time finding the right methods to apply, helping others apply those methods, advocating, researching, and so on.
Research is hardest when the data are messy and the questions are poorly defined. One way to make a real problem more fun is to put in the upfront work to properly define the research questions in terms of a good model for the data. Actually, this is one of the best ways to convince people of the utility of statistics.