Introduction
The gPLS method reduces dimensionality by still constructing new components, but it selects whole groups of variables rather than individual ones. The method is therefore ideally suited to a context where several groups of predictors (in $X$) are correlated with several groups of responses (in $Y$); these groups must then be selected.
Concretely, the matrix $X$ is divided into $K$ blocks representing the groups of variables. The same is true for $Y$, which is divided into $L$ blocks. These blocks are denoted $X^{(k)}$, $k = 1, \dots, K$, and $Y^{(l)}$, $l = 1, \dots, L$.
The scores are therefore given by:
$$\xi = Xu = \sum_{k=1}^{K} X^{(k)} u^{(k)}, \qquad \omega = Yv = \sum_{l=1}^{L} Y^{(l)} v^{(l)}.$$
Caution: $u^{(k)}$ and $v^{(l)}$ are no longer real numbers (dimension 1) but vectors of dimensions $p_k$ and $q_l$ respectively.
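To make the block notation concrete, here is a minimal NumPy sketch; all dimensions, group sizes, and variable names are illustrative assumptions, not values from the text. It checks that the block-wise scores coincide with the plain matrix products $Xu$ and $Yv$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n = 20 observations, X has K = 2 groups of sizes
# p_1 = 3 and p_2 = 4, Y has L = 2 groups of sizes q_1 = 2 and q_2 = 3.
n, p_sizes, q_sizes = 20, [3, 4], [2, 3]
X_blocks = [rng.standard_normal((n, p)) for p in p_sizes]
Y_blocks = [rng.standard_normal((n, q)) for q in q_sizes]

# Group loading sub-vectors u^(k) and v^(l) (arbitrary here, just for shapes).
u_blocks = [rng.standard_normal(p) for p in p_sizes]
v_blocks = [rng.standard_normal(q) for q in q_sizes]

# Scores as block sums: xi = X u = sum_k X^(k) u^(k), omega = Y v = sum_l Y^(l) v^(l).
xi = sum(Xk @ uk for Xk, uk in zip(X_blocks, u_blocks))
omega = sum(Yl @ vl for Yl, vl in zip(Y_blocks, v_blocks))

# Equivalent to stacking the blocks and multiplying once.
assert np.allclose(xi, np.hstack(X_blocks) @ np.concatenate(u_blocks))
assert np.allclose(omega, np.hstack(Y_blocks) @ np.concatenate(v_blocks))
```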
Minimization problem
The function to be minimized is written as:
$$f(u, v) = \|M - uv^\top\|_F^2 + P_{\lambda_1}(u) + P_{\lambda_2}(v),$$
where $M = X^\top Y$. The penalties are:
$$P_{\lambda_1}(u) = \lambda_1 \sum_{k=1}^{K} \sqrt{p_k}\, \|u^{(k)}\|_2, \qquad P_{\lambda_2}(v) = \lambda_2 \sum_{l=1}^{L} \sqrt{q_l}\, \|v^{(l)}\|_2.$$
The Frobenius norm can also be written:
$$\|M - uv^\top\|_F^2 = \operatorname{tr}(M^\top M) - 2\, u^\top M v + \|u\|_2^2\, \|v\|_2^2.$$
So the function becomes:
$$f(u, v) = \operatorname{tr}(M^\top M) - 2\, u^\top M v + \|u\|_2^2\, \|v\|_2^2 + \lambda_1 \sum_{k=1}^{K} \sqrt{p_k}\, \|u^{(k)}\|_2 + \lambda_2 \sum_{l=1}^{L} \sqrt{q_l}\, \|v^{(l)}\|_2.$$
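The trace expansion above is easy to verify numerically. The following sketch (the function name `gpls_objective` and the random data are my own, not from the text) evaluates $f(u,v)$ and checks the identity against a direct computation of the Frobenius norm:

```python
import numpy as np

def gpls_objective(M, u, v, p_groups, q_groups, lam1, lam2):
    """Penalized objective f(u, v); p_groups / q_groups are lists of index arrays."""
    fro = np.trace(M.T @ M) - 2 * u @ M @ v + (u @ u) * (v @ v)
    pen_u = lam1 * sum(np.sqrt(len(g)) * np.linalg.norm(u[g]) for g in p_groups)
    pen_v = lam2 * sum(np.sqrt(len(g)) * np.linalg.norm(v[g]) for g in q_groups)
    return fro + pen_u + pen_v

# Check the trace expansion against a direct computation of ||M - u v^T||_F^2.
rng = np.random.default_rng(1)
M = rng.standard_normal((7, 5))
u, v = rng.standard_normal(7), rng.standard_normal(5)
direct = np.linalg.norm(M - np.outer(u, v), "fro") ** 2
expanded = np.trace(M.T @ M) - 2 * u @ M @ v + (u @ u) * (v @ v)
assert np.isclose(direct, expanded)

# Example call with a two-group partition of the rows and of the columns.
p_groups = [np.arange(0, 3), np.arange(3, 7)]
q_groups = [np.arange(0, 2), np.arange(2, 5)]
print(gpls_objective(M, u, v, p_groups, q_groups, lam1=0.5, lam2=0.5))
```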
For all $k \in \{1, \dots, K\}$:
$$\frac{\partial f}{\partial u^{(k)}} = -2\, \frac{\partial\, (u^\top M v)}{\partial u^{(k)}} + \|v\|_2^2\, \frac{\partial\, \|u\|_2^2}{\partial u^{(k)}} + \lambda_1 \sqrt{p_k}\, \frac{\partial\, \|u^{(k)}\|_2}{\partial u^{(k)}}$$
with:
$$\frac{\partial\, (u^\top M v)}{\partial u^{(k)}} = M^{(k)} v$$
(because $u^\top M v = \sum_{k=1}^{K} u^{(k)\top} M^{(k)} v$, where $M^{(k)} = X^{(k)\top} Y$ is the block of rows of $M$ associated with group $k$) and with, for $u^{(k)} \neq 0$:
$$\frac{\partial\, \|u^{(k)}\|_2}{\partial u^{(k)}} = \frac{u^{(k)}}{\|u^{(k)}\|_2}.$$
Hence:
$$\frac{\partial f}{\partial u^{(k)}} = -2\, M^{(k)} v + 2\, u^{(k)}\, \|v\|_2^2 + \lambda_1 \sqrt{p_k}\, \frac{u^{(k)}}{\|u^{(k)}\|_2}.$$
In the same way:
$$\frac{\partial f}{\partial v^{(l)}} = -2\, M_{(l)}^\top u + 2\, v^{(l)}\, \|u\|_2^2 + \lambda_2 \sqrt{q_l}\, \frac{v^{(l)}}{\|v^{(l)}\|_2}$$
with $M_{(l)} = X^\top Y^{(l)}$, the block of columns of $M$ associated with group $l$.
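These gradient formulas can be checked by central finite differences. Here is a small sketch, again with arbitrary random data and an assumed two-group partition of the rows of $M$:

```python
import numpy as np

rng = np.random.default_rng(2)
p_groups = [np.arange(0, 3), np.arange(3, 7)]    # two X-groups, sizes 3 and 4
M = rng.standard_normal((7, 5))
u, v = rng.standard_normal(7), rng.standard_normal(5)
lam1, k = 0.7, 1
g = p_groups[k]

def f_u(u):
    # Terms of f that depend on u, for fixed v (tr(M^T M) is constant in u).
    return (-2 * u @ M @ v + (u @ u) * (v @ v)
            + lam1 * sum(np.sqrt(len(gi)) * np.linalg.norm(u[gi]) for gi in p_groups))

# Analytic gradient with respect to the sub-vector u^(k), valid for u^(k) != 0.
Mk = M[g]                                        # rows of M belonging to group k
grad = -2 * Mk @ v + 2 * u[g] * (v @ v) + lam1 * np.sqrt(len(g)) * u[g] / np.linalg.norm(u[g])

# Central finite differences on each coordinate of u^(k).
eps, num = 1e-6, np.zeros(len(g))
for i, j in enumerate(g):
    up, dn = u.copy(), u.copy()
    up[j] += eps
    dn[j] -= eps
    num[i] = (f_u(up) - f_u(dn)) / (2 * eps)

assert np.allclose(grad, num, atol=1e-5)
```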
Problem solving
The values of $u^{(k)}$ and $v^{(l)}$ must make the gradient vanish. We therefore solve $\partial f / \partial u^{(k)} = 0$, which gives:
$$u^{(k)} = \frac{1}{\|v\|_2^2} \left(1 - \frac{\lambda_1 \sqrt{p_k}}{2\, \|M^{(k)} v\|_2}\right)_+ M^{(k)} v.$$
In the same way, we also find:
$$v^{(l)} = \frac{1}{\|u\|_2^2} \left(1 - \frac{\lambda_2 \sqrt{q_l}}{2\, \|M_{(l)}^\top u\|_2}\right)_+ M_{(l)}^\top u.$$
A group $k$ (respectively $l$) is thus set entirely to zero as soon as $\|M^{(k)} v\|_2 \leq \lambda_1 \sqrt{p_k}/2$ (respectively $\|M_{(l)}^\top u\|_2 \leq \lambda_2 \sqrt{q_l}/2$).
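This closed form is a group-wise soft-thresholding: the factor $(\cdot)_+$ either shrinks the whole sub-vector or sets it exactly to zero. A minimal sketch of one such update (the helper name `group_update` and the numbers are mine, for illustration only):

```python
import numpy as np

def group_update(Mv_k, p_k, lam, v_norm_sq):
    """Closed-form update for one group sub-vector:
    u^(k) = (1 - lam*sqrt(p_k) / (2*||M^(k) v||))_+ * M^(k) v / ||v||^2."""
    norm = np.linalg.norm(Mv_k)
    if norm == 0.0:
        return np.zeros_like(Mv_k)
    shrink = max(0.0, 1.0 - lam * np.sqrt(p_k) / (2.0 * norm))
    return shrink * Mv_k / v_norm_sq

# A group is dropped (u^(k) = 0) exactly when ||M^(k) v|| <= lam * sqrt(p_k) / 2.
Mv_k = np.array([0.3, -0.1, 0.2])
print(group_update(Mv_k, p_k=3, lam=1.0, v_norm_sq=1.0))   # zeros: group deselected
print(group_update(Mv_k, p_k=3, lam=0.1, v_norm_sq=1.0))   # shrunken M^(k) v
```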
Convergence algorithm
To solve the minimization problem, we first perform an SVD of $M$ as before and take the first columns of the matrices $U$ and $V$ as initial values of $u$ and $v$.
The vectors found are not yet the solutions. We must therefore apply a convergence algorithm for each component $h$, for each group $k \in \{1, \dots, K\}$ and for each group $l \in \{1, \dots, L\}$. In the same way as before, we must calculate:
$$u_{\text{new}}^{(k)} = \frac{1}{\|v_{\text{old}}\|_2^2} \left(1 - \frac{\lambda_1 \sqrt{p_k}}{2\, \|M^{(k)} v_{\text{old}}\|_2}\right)_+ M^{(k)} v_{\text{old}}$$
then, after having calculated the last sub-vector $u_{\text{new}}^{(K)}$, assemble:
$$u_{\text{new}} = \left(u_{\text{new}}^{(1)\top}, \dots, u_{\text{new}}^{(K)\top}\right)^\top.$$
In the same way, we also find:
$$v_{\text{new}}^{(l)} = \frac{1}{\|u_{\text{new}}\|_2^2} \left(1 - \frac{\lambda_2 \sqrt{q_l}}{2\, \|M_{(l)}^\top u_{\text{new}}\|_2}\right)_+ M_{(l)}^\top u_{\text{new}}$$
then, after having calculated the last sub-vector $v_{\text{new}}^{(L)}$, assemble:
$$v_{\text{new}} = \left(v_{\text{new}}^{(1)\top}, \dots, v_{\text{new}}^{(L)\top}\right)^\top.$$
Finally, the solutions found must be normalized:
$$u_{\text{new}} \leftarrow \frac{u_{\text{new}}}{\|u_{\text{new}}\|_2}, \qquad v_{\text{new}} \leftarrow \frac{v_{\text{new}}}{\|v_{\text{new}}\|_2}.$$
These updates are repeated while
$$\|u_{\text{new}} - u_{\text{old}}\|_2 > \varepsilon \quad \text{or} \quad \|v_{\text{new}} - v_{\text{old}}\|_2 > \varepsilon.$$
We then select the groups $X^{(k)}$ and $Y^{(l)}$ respectively when $u^{(k)} \neq 0$ and $v^{(l)} \neq 0$.
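Putting the pieces together, here is a sketch of the full convergence algorithm for one component, under the same assumptions as the earlier snippets (random data, illustrative group partitions and penalty values; the function name `gpls_component` is mine). It initializes $u$ and $v$ from the SVD of $M$, alternates the group-thresholded updates with normalization, and stops when both vectors have moved by less than $\varepsilon$:

```python
import numpy as np

def gpls_component(M, p_groups, q_groups, lam1, lam2, eps=1e-6, max_iter=500):
    """One gPLS component: alternate the group-thresholded updates of u and v.

    p_groups / q_groups are lists of index arrays partitioning the rows /
    columns of M = X^T Y. Assumes the penalties are small enough that at
    least one group on each side survives the thresholding.
    """
    # Initialization: first left and right singular vectors of M.
    U, _, Vt = np.linalg.svd(M)
    u, v = U[:, 0], Vt[0, :]
    for _ in range(max_iter):
        u_old, v_old = u.copy(), v.copy()
        # Update each sub-vector u^(k), assemble u_new, then normalize.
        u_new = np.zeros_like(u)
        for g in p_groups:
            Mv = M[g] @ v
            shrink = max(0.0, 1.0 - lam1 * np.sqrt(len(g)) / (2.0 * np.linalg.norm(Mv)))
            u_new[g] = shrink * Mv / (v @ v)
        u = u_new / np.linalg.norm(u_new)
        # Same scheme for each sub-vector v^(l).
        v_new = np.zeros_like(v)
        for g in q_groups:
            Mu = M[:, g].T @ u
            shrink = max(0.0, 1.0 - lam2 * np.sqrt(len(g)) / (2.0 * np.linalg.norm(Mu)))
            v_new[g] = shrink * Mu / (u @ u)
        v = v_new / np.linalg.norm(v_new)
        # Stop when both vectors have stabilized.
        if np.linalg.norm(u - u_old) <= eps and np.linalg.norm(v - v_old) <= eps:
            break
    return u, v

# Tiny illustrative run (dimensions, partitions and penalties are arbitrary).
rng = np.random.default_rng(3)
M = rng.standard_normal((7, 5))
u, v = gpls_component(M, [np.arange(0, 3), np.arange(3, 7)],
                      [np.arange(0, 2), np.arange(2, 5)], lam1=0.5, lam2=0.5)
print(np.round(u, 3), np.round(v, 3))   # zero blocks indicate deselected groups
```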