Eigen-Stratified Models
J. Tuck and S. Boyd
Optimization and Engineering, 23:397–419, January 2022.
Stratified models depend in an arbitrary way on a selected categorical
feature that takes K values,
and depend linearly on the other n features.
Laplacian regularization with respect to a graph on the feature values
can greatly improve the performance of a stratified model,
especially in the low-data regime.
A significant issue with Laplacian-regularized stratified models is that the model
is K times the size of the base model, which can be quite large.
We address this issue by formulating eigen-stratified models,
which are stratified models with an additional constraint that the model parameters
are linear combinations of some modest number m of bottom eigenvectors of the graph
Laplacian, i.e., those associated with the m smallest eigenvalues.
With eigen-stratified models,
we only need to store the m bottom eigenvectors and
the corresponding coefficients as the stratified model parameters.
This leads to a reduction, sometimes large,
of model size when m ≤ n and m ≪ K.
In some cases, the additional regularization implicit in
eigen-stratified models can improve out-of-sample performance over
standard Laplacian-regularized stratified models.
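
As a rough illustration of the storage argument, the following minimal sketch builds the bottom eigenvectors of a graph Laplacian and forms stratified parameters as linear combinations of them. The path graph, the sizes K, n, and m, the random coefficients, and the helper names path_graph_laplacian and bottom_eigenvectors are all assumptions made for this example; nothing here is taken from the paper's code, and no model is fit.

```python
import numpy as np

def path_graph_laplacian(K):
    """Laplacian of a path graph on K nodes (a hypothetical choice of graph)."""
    A = np.zeros((K, K))
    idx = np.arange(K - 1)
    A[idx, idx + 1] = 1.0   # edge between value k and value k + 1
    A[idx + 1, idx] = 1.0   # symmetric counterpart
    return np.diag(A.sum(axis=1)) - A

def bottom_eigenvectors(L, m):
    """Return the m smallest eigenvalues of L and their eigenvectors."""
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    evals, evecs = np.linalg.eigh(L)
    return evals[:m], evecs[:, :m]

# Hypothetical sizes: K categorical values, n base-model parameters, m eigenvectors.
K, n, m = 50, 10, 5

L = path_graph_laplacian(K)
evals, Q = bottom_eigenvectors(L, m)   # Q has shape (K, m)

# Placeholder coefficients; in practice these would come from fitting the model.
rng = np.random.default_rng(0)
Z = rng.standard_normal((n, m))        # shape (n, m)

# Column k of theta is the parameter vector for the k-th categorical value,
# a linear combination of the entries of the m bottom eigenvectors.
theta = Z @ Q.T                        # shape (n, K)

full_size = n * K            # numbers stored by an unconstrained stratified model
eigen_size = m * K + n * m   # m eigenvectors plus the coefficient matrix
print(full_size, eigen_size)
```

With these assumed sizes, the unconstrained model stores 500 numbers while the eigen-stratified representation stores 300, and the gap widens as K grows relative to m.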