[1] 0.001327097
2024-04-01
Most model selection procedures will choose features that have large \(T\)-statistics when testing whether they are 0 or not…
Even when nothing is happening some features will have large \(T\)-statistics!
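A minimal sketch of this point, assuming Gaussian noise and illustrative sizes n = 100 and p = 50 (neither is taken from the slides). The response is generated independently of every feature, yet some \(T\)-statistics still come out large.

```r
# Assumed, illustrative sizes (not from the original slides)
set.seed(1)
n <- 100   # observations
p <- 50    # candidate features, none related to Y

X <- matrix(rnorm(n * p), n, p)
Y <- rnorm(n)              # pure noise: every true coefficient is 0

fit <- lm(Y ~ X)
# T-statistics for the p slopes (drop the intercept row)
tvals <- coef(summary(fit))[-1, "t value"]

largest <- max(abs(tvals))   # the feature a selection rule would favor
# Count of slopes that look "significant" at the 5% level purely by chance;
# on average this is about 0.05 * p
n_signif <- sum(abs(tvals) > qt(0.975, df = n - p - 1))
print(largest)
print(n_signif)
```

A selection rule that keeps the features with the largest |T| will pick exactly these chance winners.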
Using \(p\)-values from summary() of a selected model is misleading.
Using confidence intervals from confint() of a selected model is misleading.
Y and X.
[1] 0.001327097
[1] 0.83
80% of the time we’ll falsely declare a true relationship between Y and X!
80% of our confidence intervals won’t cover 0 (truth)…
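The two figures match because a slope's \(p\)-value is below 0.05 exactly when its 95% confidence interval excludes 0. A rough sketch, assuming a simple selection rule (keep the feature most correlated with Y) and illustrative sizes n = 100, p = 30, and 200 repetitions; this is not necessarily the procedure behind the numbers above. With 30 independent null features, the chance that at least one has \(p < 0.05\) is \(1 - 0.95^{30} \approx 0.79\).

```r
set.seed(2)
n <- 100; p <- 30; nsim <- 200   # assumed, illustrative sizes

one_run <- function() {
  X <- matrix(rnorm(n * p), n, p)
  Y <- rnorm(n)                          # null: Y unrelated to every column of X
  best <- which.max(abs(cor(X, Y)))      # select the most promising feature
  fit <- lm(Y ~ X[, best])               # then analyze it as if chosen in advance
  pval <- coef(summary(fit))[2, "Pr(>|t|)"]
  ci <- confint(fit)[2, ]                # naive 95% interval for the slope
  c(reject = pval < 0.05, covers0 = ci[1] <= 0 && 0 <= ci[2])
}

res <- replicate(nsim, one_run())
false_positive_rate <- mean(res["reject", ])   # share of runs with p < 0.05
noncoverage_rate <- mean(!res["covers0", ])    # share of intervals missing 0
print(c(false_positive_rate, noncoverage_rate))
```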
Let’s look at a selection procedure we have used…
We’ll build up 100 null data sets and store them for a few analyses.
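One way this could look, as a sketch: the sizes n = 100 and p = 20 and the names make_null_data and null_data are hypothetical, chosen only for illustration.

```r
set.seed(3)
n <- 100; p <- 20          # assumed sizes
n_datasets <- 100          # 100 data sets, as in the text

# Hypothetical helper: one data set in which Y is independent of all features
make_null_data <- function() {
  X <- matrix(rnorm(n * p), n, p)
  colnames(X) <- paste0("X", seq_len(p))
  data.frame(Y = rnorm(n), X)
}

# Store the data sets in a list so every later analysis reuses the same draws
null_data <- replicate(n_datasets, make_null_data(), simplify = FALSE)
```

Reusing the stored data sets means differences between analyses come from the methods, not from fresh simulation noise.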
In practice, there will likely be some signals – here there are none…
With no signal, valid \(p\)-values are uniform on [0, 1], so their empirical distribution function here should lie along the diagonal…
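For reference, a sketch of what the diagonal looks like when there is no selection step: the \(p\)-value of a feature fixed in advance (column 1) is computed on many null data sets. Sizes are assumed for illustration.

```r
set.seed(4)
n <- 100; p <- 20; nsim <- 500   # assumed, illustrative sizes

# p-value for a feature fixed in advance (column 1), with no selection step
pvals <- replicate(nsim, {
  X <- matrix(rnorm(n * p), n, p)
  Y <- rnorm(n)
  coef(summary(lm(Y ~ X)))[2, "Pr(>|t|)"]
})

Fhat <- ecdf(pvals)
grid <- seq(0, 1, by = 0.01)
max_gap <- max(abs(Fhat(grid) - grid))   # near 0 when the ECDF hugs the diagonal
print(max_gap)

if (interactive()) {
  plot(Fhat, main = "ECDF of null p-values, no selection")
  abline(0, 1, lty = 2)                  # reference diagonal
}
```

After selection, the same plot bends well above the diagonal near 0, because small \(p\)-values are over-represented.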
50% of our 95% confidence intervals will not cover 0 (truth)
[1] 0.5005834
[1] 0.04
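A sketch of how non-coverage after selection can be estimated. It assumes forward stepwise selection by AIC via step() as a stand-in selection rule, with illustrative sizes n = 100 and p = 10; the procedure and sizes behind the figures above may differ, so the rate here need not match them.

```r
set.seed(5)
n <- 100; p <- 10; n_datasets <- 100   # n and p are assumed, illustrative sizes
features <- paste0("X", seq_len(p))
upper <- reformulate(features)          # ~ X1 + ... + X10

missed <- unlist(lapply(seq_len(n_datasets), function(i) {
  D <- data.frame(Y = rnorm(n),
                  matrix(rnorm(n * p), n, p, dimnames = list(NULL, features)))
  # Forward stepwise selection by AIC (an assumed stand-in selection rule)
  sel <- step(lm(Y ~ 1, data = D), scope = list(lower = ~ 1, upper = upper),
              direction = "forward", trace = 0)
  if (length(coef(sel)) == 1) return(logical(0))   # nothing was selected
  ci <- confint(sel)[-1, , drop = FALSE]           # naive intervals, slopes only
  ci[, 1] > 0 | ci[, 2] < 0                        # TRUE when 0 (the truth) is missed
}))

noncoverage <- mean(missed)   # typically well above the nominal 0.05
print(noncoverage)
```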