Skip to contents

Introduction

The gPLS method allows dimensionality reduction by still creating HH new components but by selecting groups of variables. This method is therefore ideally suited in a context where several groups of predictors (in XX) are correlated with several groups of responses (in YY); these groups must then be selected.

Concretely, the matrix XX is divided into KK blocks representing the groups of variables. The same is true for YY, which is divided into LL blocks. These are then denoted X=(X1,X2,...,XK)X = (X_1, X_2, ..., X_K) and Y=(Y1,Y2,...,YL)Y = (Y_1, Y_2, ..., Y_L).

The scores are therefore given by:

th=k=1KXk(h)uk(h)=Xut_{h} = \sum_{k = 1}^{K} X^{(h)}_k u^{(h)}_k = Xush=l=1LYl(h)vl(h)=Yvs_{h} = \sum_{l = 1}^{L} Y^{(h)}_l v^{(h)}_l = Yv

Caution: uk(h)u^{(h)}_k and vl(h)v^{(h)}_l are no longer real numbers (dimension 1) but vectors of dimensions pkp_k and qlq_l respectively.

Minimization problem

The function to be minimized is written as:

k=1Kl=1LMk,lukvlTF2+Pλ(u)+Pμ(v)\sum_{k=1}^{K} \sum_{l=1}^{L} \left\| M_{k,l} - u_k v_l^T \right\|_F^2 + P_\lambda(u) + P_\mu(v)

where Mk,l=XkTYlM_{k,l} = X_k^T Y_l.

The penalties are:

Pλ(u)=λk=1Kpkuk2etPμ(v)=μl=1Lqlvl2P_\lambda(u) = \lambda \sum_{k=1}^{K} \sqrt{p_k} \|u_k\|_2 \quad \text{et} \quad P_\mu(v) = \mu \sum_{l=1}^{L} \sqrt{q_l} \|v_l\|_2

The Frobenius norm can also be written:

Mk,lukvlTF2=Tr(Mk,lMk,lT)2Tr(ukvlMk,lT)+Tr(ukukTvlTvl)\|M_{k,l} - u_k v_l^T\|_F^2 = \mathrm{Tr}(M_{k,l}M_{k,l}^T) - 2\,\mathrm{Tr}(u_k v_l M_{k,l}^T) + \mathrm{Tr}(u_k u_k^T v_l^T v_l)

So the function becomes:

k=1Kl=1LTr(Mk,lMk,lT)2k=1Kl=1LTr(ukvlTMk,lT)+k=1Kl=1LTr(ukukTvlTvl)\sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(M_{k,l}M_{k,l}^T) - 2 \sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(u_k v_l^T M_{k,l}^T) + \sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(u_k u_k^T v_l^T v_l)+λk=1Kpkuk2+μl=1Lqlvl2+ \lambda \sum_{k=1}^{K} \sqrt{p_k} \|u_k\|_2 + \mu \sum_{l=1}^{L} \sqrt{q_l} \|v_l\|_2

For all h{1,...,H}h \in \{1, ..., H\} :

u=argminuk=1Kl=1LTr(ukukTvlTvl)2k=1Kl=1LTr(ukvlTMk,lT)+λk=1Kpkuk2=argminuf(uk)2g(uk)+λk=1Kpkuk2\begin{align*} u &= \operatorname*{argmin}_{u} \sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(u_k u_k^T v_l^T v_l) -2 \sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(u_k v_l^T M_{k,l}^T) + \lambda \sum_{k=1}^{K} \sqrt{p_k} \|u_k\|_2 \\ &= \operatorname*{argmin}_{u} f(u_k) - 2g(u_k) + \lambda \sum_{k=1}^{K} \sqrt{p_k} \|u_k\|_2 \end{align*}

with :

f(uk)=k=1Kl=1LTr(ukukTvlTvl)=k=1Kl=1LvlTvlTr(ukukT)=k=1KTr(ukukT)=k=1Kuk2=u2\begin{align*} f(u_k) &= \sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(u_k u_k^T v_l^T v_l) = \sum_{k=1}^{K} \sum_{l=1}^{L} v_l^T v_l \, \mathrm{Tr}(u_k u_k^T) \\ &= \sum_{k=1}^{K} \mathrm{Tr}(u_k u_k^T) = \sum_{k=1}^{K} \|u_k\|^2 = \|u\|^2 \end{align*}

(because v(h1)=1\|v^{(h-1)}\| = 1)

g(uk)=k=1Kl=1LTr(ukvlTMk,lT)=k=1KTr(ukl=1LvlTMk,lT)=k=1KTr(ukvTMkT)=Tr(uvTMT)\begin{align*} g(u_k) &= \sum_{k=1}^{K} \sum_{l=1}^{L} \mathrm{Tr}(u_k v_l^T M_{k,l}^T) = \sum_{k=1}^{K} \mathrm{Tr} \left( u_k \sum_{l=1}^{L} v_l^T M_{k,l}^T \right)\\ &= \sum_{k=1}^{K} \mathrm{Tr}(u_k v^T M_k^T) = \mathrm{Tr}(u v^T M^T) \end{align*}

with Mk=XkTZM_k = X_k^T Z

Hence :

u=argminuku22Tr(uvTMT)+λk=1Kpkuk2uk=argminukuk22Tr(ukvTMkT)+λpkuk2\begin{align*} u &= \operatorname*{argmin}_{u_k} \|u\|^2 - 2\,\mathrm{Tr}(u v^T M^T) + \lambda \sum_{k=1}^{K} \sqrt{p_k} \|u_k\|_2 \\ \iff u_k &= \operatorname*{argmin}_{u_k} \|u_k\|^2 - 2\,\mathrm{Tr}(u_k v^T M_k^T) + \lambda \sqrt{p_k} \|u_k\|_2 \end{align*}

In the same way :

vl=argminvlvl22Tr(vluTMl)+μqlvl2\begin{align*} v_l &= \operatorname*{argmin}_{v_l} \|v_l\|^2 - 2\mathrm{Tr}(v_l u^T M_l) + \mu \sqrt{q_l} \|v_l\|_2 \end{align*}

with Ml=XlTZM_l = X_l^T Z.

Problem solving

The values of uku_k and vlv_l must cancel the gradient. We therefore solve:

dduk(uk22×Tr(ukvTMkT)+λpkuk2)=0 \frac{d}{du_k} \left( \|u_k\|^2 - 2 \times \mathrm{Tr}(u_k v^T M_k^T) + \lambda \sqrt{p_k} \|u_k\|_2 \right) = 0

ddukuk22ddukTr(ukvTMkT)+λpkddukuk2=0 \iff \frac{d}{du_k} \|u_k\|^2 - 2 \frac{d}{du_k} \mathrm{Tr}(u_k v^T M_k^T) + \lambda \sqrt{p_k} \frac{d}{du_k} \|u_k\|_2 = 0

2uk2Mkv+λpkukuk2=0 \iff 2u_k - 2M_k v + \lambda \sqrt{p_k} \frac{u_k}{\|u_k\|_2} = 0

2uk+λpkukuk2=2Mkv \iff 2u_k + \lambda \sqrt{p_k} \frac{u_k}{\|u_k\|_2} = 2M_k v

......

uk=Mkv×(1λpk2Mkv) \iff u_k = M_k v \times \left(1 - \frac{\lambda \sqrt{p_k}}{2 \|M_k v\|} \right)

In the same way, we also find :

vl=MlTu×(1μql2MlTu) v_l = M_l^T u \times \left(1 - \frac{\mu \sqrt{q_l}}{2 \|M_l^T u\|} \right)

Convergence algorithm

To solve the minimization problem, we first perform a SVD decomposition as before (first column of matrices UU and VV). The vectors found are not yet the solutions. We must therefore apply, for each component hh, for each group kk, for each group ll, a convergence algorithm. In the same way as before, we must calculate:

ũnew,k(h)=Mk(h)vold(h)(1λhpk2Mk(h)vold(h)) \tilde{u}^{(h)}_{new,k} = M_k^{(h)} v^{(h)}_{old} \left(1 - \frac{\lambda_h \sqrt{p_k}}{2 \|M_k^{(h)} v^{(h)}_{old}\|} \right)

then, after having calculated the last ũnew,k(h)\tilde{u}^{(h)}_{new,k} :

ũnew(h)=(ũnew,1(h)...ũnew,pk(h)) \tilde{u}^{(h)}_{new} = \begin{pmatrix} \tilde{u}^{(h)}_{new,1} \\ ...\\ \tilde{u}^{(h)}_{new,p_k} \end{pmatrix}

In the same way, we also find :

ṽnew,l(h)=Ml(h)Tuold(h)(1μhql2Ml(h)Tuold(h))\tilde{v}^{(h)}_{new,l} = M_l^{(h)T} u^{(h)}_{old} \left(1 - \frac{\mu_h \sqrt{q_l}}{2 \|M_l^{(h)T} u^{(h)}_{old}\|} \right)

then, after having calculated the last ṽnew,l(h)\tilde{v}^{(h)}_{new,l} :

ṽnew(h)=(ṽnew,1(h)...ṽnew,ql(h)) \tilde{v}^{(h)}_{new} = \begin{pmatrix} \tilde{v}^{(h)}_{new,1} \\ ...\\ \tilde{v}^{(h)}_{new,q_l} \end{pmatrix}

Finally, the solutions found must be standardized :

unew(h)=ũnew(h)ũnew(h)2vnew(h)=ṽnew(h)ṽnew(h)2\begin{align*} u^{(h)}_{new} &= \frac{\tilde{u}^{(h)}_{new}}{\|\tilde{u}^{(h)}_{new}\|_2} \\ v^{(h)}_{new} &= \frac{\tilde{v}^{(h)}_{new}}{\|\tilde{v}^{(h)}_{new}\|_2} \end{align*}

We then select unew(h)u^{(h)}_{new} and vnew(h)v^{(h)}_{new} respectively when |unew(h)uold(h)|<eps|u^{(h)}_{new} - u^{(h)}_{old}| < eps and |vnew(h)vold(h)|<eps|v^{(h)}_{new} - v^{(h)}_{old}| < eps

While |unew(h)uold(h)|>eps|u^{(h)}_{new} - u^{(h)}_{old}| > eps or |vnew(h)vold(h)|>eps|v^{(h)}_{new} - v^{(h)}_{old}| > eps:

  • we assign the following values: uold(h)=unew(h)u^{(h)}_{old} = u^{(h)}_{new} and vold(h)=vnew(h)v^{(h)}_{old} = v^{(h)}_{new}

  • we repeat one more loop.