sPLS-DA performance
2025-08-01
sPLSDA_performance.Rmd
Introduction
This page presents an application of the sPLS-DA performance assessment. PLS is a rather particular method: a fitted model yields several sets of predictions, one for each number of components retained. The same holds for sPLS-DA. The goal is therefore to choose the number of components that gives the best possible predictions. To illustrate this, we use two datasets:
one is a dataset with only five predictor variables and two classes.
the other is a dataset with forty predictor variables and three classes. With more predictors (p = 40) than observations (n = 30), this dataset approaches realistic conditions for PLS training.
To access the predefined functions of the sgPLSdevelop package and create these datasets, run the following lines:
library(sgPLSdevelop)
data1 <- data.cl.create(p = 5, list = TRUE) # 2 classes by default
data2 <- data.cl.create(n = 30, p = 40, classes = 3, list = TRUE)
Now it is time to train a PLS model on each dataset.
Leave-one-out CV
First model
perf.res1 <- perf.sPLSda(model1)
h.best <- perf.res1$h.best
keepX.best <- perf.res1$keepX.best
The perf.sPLSda function returns an optimal number of components equal to 1, so we suggest keeping 1 component in our first model. The function also suggests selecting 5 variables for each component.
Second model
perf.res2 <- perf.sPLSda(model2)
h.best <- perf.res2$h.best
keepX.best <- perf.res2$keepX.best
The perf.sPLSda function returns an optimal number of components equal to 1, so we suggest keeping 1 component in our second model. The function also suggests selecting 40 variables for each component.
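To make the leave-one-out scheme concrete, here is a minimal base-R sketch of the splits it implies. It does not call sgPLSdevelop and is not meant to describe how perf.sPLSda works internally. The helper name `loo_splits` is hypothetical, and `n = 30` is simply the sample size used for the second dataset.

```r
# Leave-one-out splits: observation i is the only test point of fold i,
# and the remaining n - 1 observations form the training set.
# NOTE: `loo_splits` is an illustrative helper, not part of sgPLSdevelop.
loo_splits <- function(n) {
  lapply(seq_len(n), function(i) {
    list(train = setdiff(seq_len(n), i), test = i)
  })
}

n <- 30                      # sample size of the second dataset
splits <- loo_splits(n)

length(splits)               # number of model fits: one per observation
length(splits[[1]]$train)    # training-set size in each fit: n - 1
```

Because every observation is held out exactly once, the splits are deterministic: running the procedure twice on the same data gives the same folds.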
10-fold CV
First model
perf.res1 <- perf.sPLSda(model1, K = 10)
h.best <- perf.res1$h.best
keepX.best <- perf.res1$keepX.best
The perf.sPLSda function returns an optimal number of components equal to 1, so we suggest keeping 1 component in our first model. The function also suggests selecting 5 variables for each component.
Second model
perf.res2 <- perf.sPLSda(model2, K = 10)
h.best <- perf.res2$h.best
keepX.best <- perf.res2$keepX.best
The perf.sPLSda function returns an optimal number of components equal to 1, so we suggest keeping 1 component in our second model. The function also suggests selecting 40 variables for each component.
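As an illustration of what a 10-fold split looks like, the base-R sketch below assigns 30 observations to 10 folds at random. This is an assumption made for illustration: the page does not say how perf.sPLSda builds its folds, and the helper `kfold_ids` and the seed value are hypothetical.

```r
# Randomly assign n observations to K folds of (near-)equal size.
# NOTE: `kfold_ids` is an illustrative helper, not part of sgPLSdevelop.
kfold_ids <- function(n, K) {
  sample(rep(seq_len(K), length.out = n))
}

set.seed(123)                # arbitrary seed, only to make the split reproducible
fold <- kfold_ids(n = 30, K = 10)

table(fold)                  # 10 folds of 3 observations each

# Indices used when fold 1 plays the role of the test set
test_idx  <- which(fold == 1)
train_idx <- which(fold != 1)
```

If the folds are drawn at random in this way, two runs without a fixed seed can produce different splits, and the selected number of components may then vary slightly from one run to the next.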
5-fold CV
First model
perf.res1 <- perf.sPLSda(model1, K = 5)
h.best <- perf.res1$h.best
keepX.best <- perf.res1$keepX.best
The perf.sPLSda function returns an optimal number of components equal to 1, so we suggest keeping 1 component in our first model. The function also suggests selecting 5 variables for each component.
Second model
perf.res2 <- perf.sPLSda(model2, K = 5)
h.best <- perf.res2$h.best
keepX.best <- perf.res2$keepX.best
The perf.sPLSda function returns an optimal number of components equal to 1, so we suggest keeping 1 component in our second model. The function also suggests selecting 40 variables for each component.
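The three validation schemes used on this page differ mainly in how many models are fitted and how many observations are held out each time. The base-R sketch below tabulates this for n = 30, the sample size of the second dataset. It is plain arithmetic and does not depend on sgPLSdevelop; the object names are illustrative.

```r
n <- 30  # sample size of the second dataset; divisible by both 10 and 5

# Number of folds K for each scheme (leave-one-out corresponds to K = n)
schemes <- c("leave-one-out" = n, "10-fold" = 10, "5-fold" = 5)

summary_tab <- data.frame(
  scheme     = names(schemes),
  model_fits = unname(schemes),          # one fit per fold
  test_size  = n / unname(schemes),      # observations held out per fold
  train_size = n - n / unname(schemes)   # observations used for fitting
)

print(summary_tab)
```

Fewer folds mean fewer model fits but also smaller training sets, which is the usual trade-off when moving from leave-one-out to 10-fold or 5-fold cross-validation.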