Exploring body composition and physical condition profiles in relation to playing time in professional soccer: a principal components analysis and Gradient Boosting approach
Fábrica-Barrios, Gabriel
Jorquera-Aguilera, Carlos
Guede-Rojas, Francisco
Pérez-Contreras, Jorge
Lozano-Jarque, Demetrio
Carvajal-Parodi, Claudio
Romero-Vera, Luis
Frontiers
2025
Background: This study aimed to explore whether a predictive model based on body composition and physical condition could estimate seasonal playing time in professional soccer players.
Methods: Twenty-four professional soccer players, each with 5–7 years of professional experience, participated. Body composition and physical condition variables were assessed, and total minutes played during the season were recorded as the dependent variable. Correlations among the predictor variables were examined to reduce multicollinearity, followed by a principal component analysis (PCA) of the selected predictors. The first three components were used as inputs to a Gradient Boosting model. Model performance was evaluated using 5-fold cross-validation and leave-one-out cross-validation (LOOCV).
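The correlation screening step can be illustrated with a minimal sketch. The record does not say how redundant variables were chosen for removal, so the greedy keep-first rule below is an assumption, as are the variable names and the toy values, which are invented and are not data from the study. Only the 0.70 threshold comes from the abstract.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_collinear(columns, threshold=0.70):
    """Greedy screen (an assumed rule): keep a variable only if its |r|
    with every previously kept variable does not exceed the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson_r(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Invented toy values for three hypothetical variables (not study data).
toy = {
    "body_mass_kg":     [70.0, 72.0, 75.0, 78.0, 80.0, 83.0],
    "fat_free_mass_kg": [60.0, 61.5, 64.0, 66.0, 68.5, 70.0],
    "sprint_30m_s":     [4.10, 4.30, 4.05, 4.25, 4.15, 4.20],
}

selected = screen_collinear(toy)
print(selected)
```

In this toy case the two mass variables are almost perfectly correlated, so only the first is kept, while the sprint time is retained. The retained variables would then be standardized and passed to the PCA.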
Results: High intercorrelations among independent variables (r > 0.70) justified dimensionality reduction through PCA. The first three components explained 70% of the total variance. However, no direct correlations were observed between individual variables and minutes played, and the Gradient Boosting model did not achieve positive predictive performance under cross-validation (5-fold CV: R² = −0.04; LOOCV: R² < 0).
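A negative cross-validated R² means the held-out predictions had a larger squared error than simply predicting the overall mean. As a point of reference, the sketch below computes R² for a baseline that predicts each player's minutes as the mean of the other players, pooling the leave-one-out predictions before scoring. The pooling convention is an assumption (the record does not state how the LOOCV R² was computed), and the minutes values are invented. For this baseline the pooled R² equals 1 − (n/(n − 1))² for any non-constant data, which is about −0.089 for n = 24.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def loocv_mean_baseline(y):
    """For each observation, predict the mean of all the other observations."""
    n, total = len(y), sum(y)
    return [(total - y_i) / (n - 1) for y_i in y]


# Invented minutes for 24 hypothetical players (not study data).
minutes = [100.0 * i for i in range(1, 25)]

preds = loocv_mean_baseline(minutes)
r2 = r2_score(minutes, preds)
print(round(r2, 4))
```

Under these assumptions, a pooled LOOCV R² slightly below zero is what a model with no predictive signal would be expected to produce in a sample of this size.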
Conclusion: In this small dataset, a multivariate approach combining PCA and Gradient Boosting did not yield predictive accuracy for playing time. Nonetheless, the PCA revealed meaningful structures in the players’ physical and body composition profiles, which may inform future research. Larger and more heterogeneous samples are required to determine whether component-based predictors can reliably estimate playing time in professional soccer.
Name
Exploring body composition and physical condition profiles in relation to playing time in professional soccer- a principal components analysis and Gradient Boosting approach.pdf
Size
12.8 MB
Soccer
Body composition
Physical condition
Principal component
Playing time