I saw something interesting on Frank Harrell’s Datamethods forum. A user wanted to know what to do with her ‘big p, small n’ data. She was analyzing data from a clinical trial that had been cut short. The outcome was a binary variable (something like whether the medicine successfully reduced inflammation), but she’d only collected \(20\) observations, and had a handful of predictors.

This is exactly the situation I don’t want to be in. I knew that there was not
enough data to support much of an analysis: binary outcomes do not carry very much
information. But I was surprised when Frank Harrell responded to the thread
saying that the minimum sample size for a logistic regression with **no** predictors
is 96 observations (check out page 8-13 of this
handout).

Wow! What a nice number to have in your back pocket!
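
A number like 96 falls out of a simple margin-of-error calculation. The sketch below is my own reading, not a quote from the handout: I am assuming the goal is to estimate the single probability in an intercept-only model to within ±0.1 with 95% confidence, at the worst case of p = 0.5. The function name `min_n_for_proportion` and its parameters are made up for illustration.

```python
from statistics import NormalDist


def min_n_for_proportion(margin=0.1, confidence=0.95, p=0.5):
    """Sample size so a normal-approximation CI for a proportion has the
    given half-width. Assumed setup, not taken from the handout."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve margin = z * sqrt(p * (1 - p) / n) for n
    return z**2 * p * (1 - p) / margin**2


n = min_n_for_proportion()
print(f"{n:.1f}")  # about 96
```

Under these assumptions, halving the margin to ±0.05 roughly quadruples the required sample size, which gives a feel for how little information a binary outcome carries.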

But if you search for “minimum sample size for logistic regression”, Google will give you a very different answer!