Frailty in older adults patients: a prospective observational cohort study on subtype identification
European Journal of Medical Research volume 30, Article number: 336 (2025)
Abstract
Background
While the FRAIL scale has been used in primary care, cluster analysis on frail patients in a hospital setting has not been performed.
Objectives
To identify potential subtypes of frail patients, and develop a simple, clinically applicable model for improved patient management.
Methods
The study included 214 frail patients aged 65 and above who were hospitalized in a hospital in Beijing from September 2018 to April 2019. This study applied the K-means clustering algorithm to analyze 27 variables, determining the optimal cluster number using the Elbow method and Silhouette coefficient. Key variables for predictive modeling were identified through LASSO (least absolute shrinkage and selection operator) regression, SVM–RFE (support vector machine–recursive feature elimination), and random forest techniques. A logistic regression model was then developed to predict patient subtypes, aimed at enhancing clinical identification and management of frailty subtypes.
Results
Clustering analysis distinguished two subgroups among the frail patients, with significant differences in clinical characteristics and survival outcomes. One-year survival rates for Class 1 and Class 2 were 62.51% and 47.51%, respectively. The logistic regression model showed good predictive capability, with an area under the curve (AUC) of 0.88. Validation through 1000 bootstrap resamples supported the model's reliability, with a mean AUC of 0.8707 and a 95% confidence interval (CI) of 0.8572 to 0.8792.
Conclusions
This study identifies two frailty subtypes in a hospital setting using unsupervised machine learning, demonstrating significant differences in survival outcomes.
Clinical Trial registration ChiCTR1800017204; date of registration: 07/18/2018.
Introduction
Frailty, an aging-related syndrome characterized by a marked reduction in physiological reserve and increased susceptibility to stress events, leads to adverse health outcomes such as dependency and mortality among older adults [1, 2]. Given its extensive impact and the challenges it poses to individuals and society, frailty is considered one of the most serious public health challenges of the twenty-first century [3]. The International Association of Nutrition and Aging has developed the FRAIL scale, a simple questionnaire of five items: fatigue, resistance, illness, ambulation, and weight loss [4]. Frailty is diagnosed when an individual meets three or more of these criteria, while meeting one or two indicates pre-frailty and meeting none indicates robustness [5]. Owing to its simplicity, this scale has been recommended as an effective tool for the preliminary screening of frailty among older adults in community settings [6].
However, the Fried phenotype and FRAIL questionnaire have significant practical limitations: they require personal interviews with patients [1, 7], which introduces a strong degree of subjectivity in the results. Frail patients identified through interviews may exhibit slight variations in physical health, and applying the same treatment measures to all may lead to wasted healthcare resources. Therefore, there is an urgent need for objective methods (such as objective indicators) to differentiate between frail patients with better or worse prognoses, enabling personalized treatment and preventive measures for different patient types.
Profiling or distinguishing subpopulations is a common practice in various disciplines, such as social sciences, economics, and medical sciences, and is particularly relevant for conditions like frailty [8]. Frailty is inherently heterogeneous, leading to differences in its definition and assessment [9]. Research indicates that the factors contributing to frailty mainly include demographic and socioeconomic factors (such as advanced age, female gender, and low income), diseases (comorbidities, obesity, malnutrition), lifestyle choices (sedentary behavior, lack of exercise), and age-related pathological processes (such as DNA damage, inflammation, and metabolic imbalance) [10,11,12,13]. The combination of these factors further exacerbates the heterogeneity of frailty, making the identification of its subtypes particularly important. However, existing classification methods have not been uniformly validated, and self-assessment tools such as frailty scales are potentially subjective; both points underscore the need to recognize frailty subtypes. By understanding these subtypes, we can better elucidate the complex interactions among the factors affecting older adults, thereby refining our understanding of the condition and improving the methods used for its assessment [14, 15]. One research team has used the frailty scale to explore frailty in primary care settings, emphasizing its relevance there [16]; in a hospital setting, however, frailty may be more complex and its management more critical.
This study aims to uncover potential subtypes within a cohort of hospitalized frail patients using unsupervised machine learning techniques, drawing on medical history, demographic data, and laboratory test results. A logistic regression model was subsequently developed to predict each patient's subtype, providing clinicians with a streamlined tool for more effective identification and management of frailty subtypes.
Materials and methods
Data collection and participant selection
A prospective study was conducted based on data from previous research on frailty in older adults (defined as individuals aged 65 years and above in China), collecting information on 1,068 older patients who were consecutively admitted to a tertiary referral hospital in Beijing, China, from September 2018 to April 2019 [17]. The frailty status of each patient was assessed using the FRAIL scale, which includes five items: fatigue, ability to climb one flight of stairs, ability to walk one block, having more than five illnesses, and loss of weight greater than 5%, with each item scoring one point [5]. Patients with a FRAIL score of less than 3, who were not considered frail according to the scale's criteria, were excluded from further analysis. Consequently, 214 patients who met the criteria for frailty (FRAIL score ≥ 3) were included in the final analysis. The collected data encompassed a wide range of variables, including demographic characteristics (age, gender), medical history (comorbid conditions, previous hospitalizations), and clinical parameters observed during the hospital stay (laboratory indicators, treatments received). The primary outcomes of interest were death or readmission to the hospital within 1 year following the initial hospital discharge. These events were systematically recorded and verified through hospital records and follow-up communications with patients or their families. While these clinical outcomes were recorded, the core aim of the study was to perform clustering analysis on frail patients and to build a predictive model for the resulting subgroups.
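The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the published cut-offs (0 robust, 1 to 2 pre-frail, 3 or more frail); the item keys and function names are hypothetical and do not come from the study's data set.

```python
# Hypothetical item keys for the five FRAIL questions; each positive item scores one point.
FRAIL_ITEMS = ("fatigue", "resistance", "ambulation", "illnesses", "loss_of_weight")


def frail_score(answers):
    """Return the FRAIL score (0-5) from a dict of item -> bool; missing items count as negative."""
    return sum(1 for item in FRAIL_ITEMS if answers.get(item, False))


def frail_category(score):
    """Map a FRAIL score to robust / pre-frail / frail using the cut-offs in the text."""
    if not 0 <= score <= 5:
        raise ValueError("FRAIL score must be between 0 and 5")
    if score >= 3:
        return "frail"
    if score >= 1:
        return "pre-frail"
    return "robust"


# Example: a patient reporting fatigue, difficulty with stairs, and weight loss.
patient = {"fatigue": True, "resistance": True, "loss_of_weight": True}
category = frail_category(frail_score(patient))
```

Under this rule, only patients whose category is "frail" would enter the cohort analyzed here.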
Unsupervised machine learning and survival analysis
In this study, 27 variables (the first 27 variables listed in Table 1), each with less than 40% missing values, were analyzed using the K-means algorithm after imputation (see Sect. "Data processing and statistical analysis"). K-means is a basic yet widely used clustering method in both academia and industry, valued for its simplicity and efficiency [18]. In this method, data points are considered in a multidimensional Euclidean space, with 'k' representing the number of predefined clusters [18, 19]. The optimal k value was obtained through the Elbow method and the Silhouette method [20, 21]. Subsequently, based on the cluster assignments, Kaplan–Meier curves were drawn using death or rehospitalization as the endpoint events [22]. In the survival analysis, the status variable was defined as '0' for patients who were alive without readmission and '1' for those who experienced readmission or death within 1 year. The survival analysis started at the time of patient admission and ended when the event of interest occurred or when the follow-up concluded.
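To make the choice of k concrete, the following is a minimal Python sketch of K-means with the two selection criteria named above: the within-cluster sum of squares (the quantity plotted in an elbow curve) and the mean silhouette width, which by definition lies between -1 and 1. It is not the authors' R code; the toy points, the number of restarts, and all function names are assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm on tuples of floats; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k starting centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center in Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def wcss(points, labels, centers):
    """Within-cluster sum of squared distances (the elbow-plot quantity)."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def mean_silhouette(points, labels):
    """Average silhouette width; each point's value lies in [-1, 1]."""
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in set(labels):
            d = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        others = [v for c, v in mean_dist.items() if c != labels[i]]
        if labels[i] not in mean_dist or not others:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a, b = mean_dist[labels[i]], min(others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_of(points, k, restarts=10):
    """Run several random starts and keep the solution with the lowest WCSS."""
    runs = [kmeans(points, k, seed=s) for s in range(restarts)]
    return min(runs, key=lambda run: wcss(points, *run))


# Toy, already-scaled data with two obvious groups (not study data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
results = {}
for k in (2, 3, 4):
    labels, centers = best_of(points, k)
    results[k] = (wcss(points, labels, centers), mean_silhouette(points, labels))
best_k = max(results, key=lambda k: results[k][1])  # k with the highest mean silhouette
```

In practice the elbow is read off a plot of the WCSS against k, whereas the silhouette criterion picks the k with the largest average width; the two are used together because neither is decisive on its own.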
Variable selection
To enhance the practicality of this classification for clinicians, we sought to construct the predictive classification model using as few variables as possible. Machine learning is more adept than traditional statistical methods at managing high-dimensional data (characterized by numerous variables and complex relationships) [23]. Hence, LASSO regression, SVM–RFE, and random forest were employed for variable selection [24,25,26]. The variables selected by the three approaches were intersected, after which a collinearity examination was conducted using the Variance Inflation Factor (VIF) to exclude variables with high VIF values, thereby finalizing the variables for the model.
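The intersection-then-VIF step can be illustrated with a short Python sketch. It relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to 1 / (1 - R²) from regressing that predictor on the others. The feature names and numbers below are placeholders, not the selections reported in Table S3, and the helper names are hypothetical.

```python
import math


def shared_features(*selections):
    """Features chosen by every selection method."""
    return set.intersection(*(set(s) for s in selections))


def correlation_matrix(columns):
    """Pearson correlation matrix of equally long numeric columns."""
    n = len(columns[0])
    devs = [[v - sum(col) / n for v in col] for col in columns]
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in devs]
    p = len(columns)
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors: VIF is undefined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor of each column given all the others."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


# Placeholder selections from three hypothetical methods.
common = shared_features(["a", "b", "c", "d"], ["b", "c", "d", "e"], ["c", "d", "f"])

# Two nearly collinear toy predictors versus three mutually uncorrelated ones.
vif_collinear = vif([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]])
vif_orthogonal = vif([[-1.0, -1.0, 1.0, 1.0], [-1.0, 1.0, -1.0, 1.0], [1.0, -1.0, -1.0, 1.0]])
```

With a threshold such as VIF < 2, both toy collinear predictors would be flagged, and one of them would typically be dropped before refitting.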
Model building and validation
In this study, models were constructed and validated using a variety of classifiers (decision trees, DT; k-nearest neighbors, KNN; support vector machines, SVM; logistic regression, LR; and random forests, RF) through the “mlr3” package, with prediction outcomes presented as probabilities [27]. A fivefold cross-validation approach was used to evaluate model performance based on accuracy, precision, recall, F1 score, and AUC. Following this, the constructed models underwent internal validation through 1000 bootstrap resamples [28]. The optimism-adjusted AUC was calculated using bootstrap resampling with 1,000 iterations [29]. In each iteration, a model was trained on a bootstrap sample drawn from the original data set, and the AUC was computed on both the resampled data (acting as the training set) and the original data set (used as the validation set). The optimism was calculated as the difference between the two (Optimism = AUC on resampled data − AUC on original data set). The final adjusted AUC was obtained by subtracting the mean optimism across all iterations from the original AUC (Adjusted AUC = Original AUC − Mean Optimism).
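The optimism correction described above can be sketched as follows. This Python example is a simplified stand-in, not the authors' mlr3 pipeline: the "model" is a deliberately trivial one-feature scorer that replaces logistic regression, the data are invented, and the names `fit`, `predict`, and `optimism_adjusted_auc` are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(X, y):
    """Toy stand-in for a classifier: learn only the direction of a single feature."""
    mean1 = sum(x[0] for x, t in zip(X, y) if t == 1) / sum(y)
    mean0 = sum(x[0] for x, t in zip(X, y) if t == 0) / (len(y) - sum(y))
    return 1.0 if mean1 >= mean0 else -1.0


def predict(model, X):
    """Score each row; a higher score means class 1 is more likely."""
    return [model * x[0] for x in X]


def optimism_adjusted_auc(X, y, n_boot=1000, seed=0):
    """Return (apparent AUC, mean optimism, adjusted AUC)."""
    rng = random.Random(seed)
    apparent = auc(y, predict(fit(X, y), X))
    optimism = []
    n = len(y)
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip resamples that contain a single class
        Xb = [X[i] for i in idx]
        model = fit(Xb, yb)
        auc_boot = auc(yb, predict(model, Xb))  # performance on the resample
        auc_orig = auc(y, predict(model, X))    # same model on the original data
        optimism.append(auc_boot - auc_orig)
    mean_optimism = sum(optimism) / len(optimism)
    return apparent, mean_optimism, apparent - mean_optimism


# Invented one-feature data set with overlapping classes.
X = [[v] for v in (1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 8, 9)]
y = [0] * 6 + [1] * 6
apparent, mean_optimism, adjusted = optimism_adjusted_auc(X, y, n_boot=200)
```

Because optimism is usually positive, the adjusted AUC is normally at or slightly below the apparent AUC; a large gap between the two would indicate overfitting.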
Data processing and statistical analysis
Considering the high-dimensional nature of the data, imputation of missing data was performed using the learner (regression tree learner) from the mlr3 package in R [27, 30]. The continuous variables were standardized before the cluster analysis [20]. Details on the specific proportions of missing data can be found in the supplementary material (Table S1). All analyses in this study were performed using R (v4.2.2). Analysis of variance was used for normally distributed continuous variables, the Pearson chi-square test for categorical variables, and the Mann–Whitney test for non-normally distributed data. Continuous baseline variables were presented using the mean with standard deviation (SD) or median with interquartile range (IQR) where applicable. All P values were two-sided, and a P value of < 0.05 was considered statistically significant.
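Standardization matters for K-means because the algorithm relies on Euclidean distance, so a variable measured on a large scale would otherwise dominate the clustering. A minimal z-score sketch in Python is shown here; it uses the sample standard deviation (n − 1 denominator), which is an assumption, since the text does not state which scaling routine was used, and the example values are invented.

```python
import math


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - mean) / sd for v in values]


# Invented ejection-fraction values (%), for illustration only.
z_scores = standardize([55.0, 60.0, 62.0, 65.0, 68.0])
```

After this transformation each continuous variable has mean 0 and unit variance, so all variables contribute on a comparable scale to the distance calculation.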
Ethics statement
The study conformed to the Declaration of Helsinki and was approved by the Ethics Committee of Beijing Hospital (No. 2018BJYYEC-121–02). All participants gave their written informed consent.
Results
Characteristics and clustering of hospitalized frail patients
This study included a total of 214 patients with FRAIL scores greater than or equal to 3. The median age was 78.9 years (interquartile range [IQR] 73.6–83.5), and 115 patients (53.7%) were female. Most patients had hypertension (170 individuals, 79.4%), and the majority of the cohort (163 patients, 76.2%) had a FRAIL score equal to 3. The optimal number of clusters was determined using the Elbow method (Fig. 1A) and the Silhouette method (Fig. 1B) and was identified as two on the basis of the average silhouette width. The clustering results were then displayed using a PCA plot (Fig. 1C), which revealed a satisfactory degree of separation between the two subpopulations within the cohort.
Clinical features and Kaplan–Meier survival analysis
Prior to descriptive analysis, the normality of the variables was assessed using the Shapiro–Wilk test, with results presented in Supplementary Table S2. Descriptive statistics were performed for two distinct classes of the population, detailed in Table 1. The study segregated 214 hospitalized frail patients into two distinct classes, with Class 1 comprising 114 patients and Class 2 consisting of 100 patients. Class 1 is characterized by a higher proportion of females (70.2%) and a median Left Ventricular Ejection Fraction (LVEF) of 65%. Class 2 differs significantly, with higher rates of alcohol consumption (39%), polypharmacy (80%), myocardial infarction (33%), coronary heart disease (70%), diabetes (61%), and a mean of 2.26 ± 1.36 readmissions in the past year. Additionally, the Body Mass Index (BMI) of 26.19 ± 3.46 and left atrial anteroposterior diameter (41 mm) also demonstrated significant differences from Class 1.
During the 1-year observation period, the survival rate of Class 1 was higher than that of Class 2. Specifically, the 1-year survival rate for Class 1 was approximately 62.51% (95% CI 54.06–72.28%), while that for Class 2 was 47.51% (95% CI 38.41–58.77%). The standard errors were 4.63% for Class 1 and 5.16% for Class 2. The Kaplan–Meier survival analysis showed a statistically significant difference in outcomes between the two classes, with Class 2 showing a higher event rate within 1 year (log-rank test, p = 0.036; Fig. 2). Kaplan–Meier curves were also plotted separately for mortality within 1 year and rehospitalization within 1 year as the endpoint events. Figure S4 shows the survival curves for the two frailty subtypes with mortality within 1 year as the endpoint (p = 0.33), while Figure S5 shows the survival curves with rehospitalization within 1 year as the endpoint (p = 0.034).
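As a worked check, the reported intervals are numerically consistent with a log-transformed confidence interval, S × exp(±1.96 × SE / S), computed from the stated survival estimates and standard errors. This is an inference from the published numbers rather than a method the text states, and the function name below is hypothetical.

```python
import math


def log_transformed_ci(survival, std_err, z=1.96):
    """Confidence interval for a survival probability on the log scale."""
    half_width = z * std_err / survival
    lower = survival * math.exp(-half_width)
    upper = min(1.0, survival * math.exp(half_width))  # a probability cannot exceed 1
    return lower, upper


# Survival estimates and standard errors as reported in the text (as proportions).
class1_ci = log_transformed_ci(0.6251, 0.0463)
class2_ci = log_transformed_ci(0.4751, 0.0516)
```

Small differences in the last decimal place are expected, because the standard errors are only reported to two decimals.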
Variable selection for predictive modeling
To identify the two subtypes in clinical screening with a minimal set of variables, supervised learning methods were applied to the collected variables. LASSO, SVM–RFE, and random forest methods were used for variable reduction.
LASSO regression produced a cross-validation error plot (Fig. 3A) and a coefficient path plot showing how the model coefficients change with different λ values (Fig. 3B), ultimately selecting between 11 and 17 variables.
Subsequently, the support vector machine–recursive feature elimination (SVM–RFE) method was applied, using fivefold cross-validation as the model assessment method, to identify and select the subset of features most relevant to the target variable, with the top 10 variables being chosen.
Finally, a random forest model with 1000 trees was constructed, and the model's performance was evaluated with an error curve plot (Fig. 3C), which showed that a configuration of 587 trees minimized the error rate. The model was retrained with 587 trees, and a feature importance plot was created (Fig. 3D), from which the top 10 features were selected. The variables selected by each of the three algorithms are presented in the supplementary material (Table S3). Intersecting the results from these three methods identified five variables: body surface area (BSA), left atrial anteroposterior diameter, left ventricular ejection fraction (LVEF), creatinine, and diabetes. After a collinearity test requiring a Variance Inflation Factor (VIF) < 2, BSA was excluded, leaving the left atrial anteroposterior diameter, LVEF, creatinine, and diabetes as the final features for the predictive classification model.
Performance of predictive models
To select a suitable model algorithm, this study benchmarked several machine learning classification methods, using fivefold cross-validation as the resampling strategy for model assessment. The performance metrics assessed were accuracy, precision, recall, F1 score, and area under the curve (AUC) (Table 2), with a selection of results presented in Supplementary Figures S2 and S3. In this benchmark, logistic regression achieved the highest AUC (0.8619). The final logistic regression model yielded an AUC of 0.88, with the formula and detailed information available in the appendix (see Fig. 4A and Figures S6 and S7). To assess the stability and reliability of the logistic regression model, 1000 bootstrap replications were performed. The distribution of the AUC values from these replications is shown in Supplementary Figure S1. The mean AUC across the bootstrap samples was 0.8707, with a 95% confidence interval of 0.8572 to 0.8792, indicating good stability (Fig. 4B). The optimism estimate was 0.01, and the final optimism-adjusted AUC was 0.869, indicating that model performance decreased only slightly after adjusting for optimism.
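For reference, the threshold-based metrics listed above follow directly from the four cells of a confusion matrix. The Python sketch below uses invented counts, not values from Table 2, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented counts: 40 true positives, 10 false positives, 10 false negatives, 40 true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

Unlike these four metrics, the AUC does not depend on a single probability cut-off, which is one reason it is commonly used as the headline measure when comparing classifiers.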
Discussion
This study represents a step toward elucidating the heterogeneity within frail populations by employing machine learning techniques to identify distinct subtypes of frailty. As populations age, the number of frail individuals is also rising [31]. The FRAIL scale, a very simple clinical screening tool, has been shown to predict prognosis [32]. Despite the scale's widespread application in screening, its potential for differentiating subtypes with varied prognoses remains underexplored [14]. Our approach not only confirms the feasibility of such differentiation but also extends the understanding of frailty beyond its traditional association with disability and comorbidities.
Frailty has traditionally been regarded as synonymous with disability, comorbidities, and other characteristics, yet there is growing recognition of its potential biological basis, making it a distinct clinical syndrome [1]. In this study, Class 1 and Class 2 frailty differed in certain characteristics; for instance, individuals with Class 2 frailty had a higher BMI and a greater burden of disease, which may be associated with their poorer prognosis. Notably, there was no statistically significant difference in FRAIL scores between the two groups (P = 0.052), which underscores the value of the classification adopted in this research. Furthermore, other studies have indicated that frailty and comorbidities are significant factors in the poor prognosis of colorectal cancer (CRC) [33].
A European study explored the use of machine learning with texture-based muscle ultrasound image analysis to assess frailty. This method achieved moderate to good accuracy in classifying frailty and was able to predict mortality in pre-frail and frail participants [34]. Another study from Taiwan, China, utilized machine learning to explore the 5-item FRAIL scale, Cardiovascular Health Study index, and Study of Osteoporotic Fractures index, finding that the XGBoost algorithm had higher predictive accuracy than random forest across all three frailty assessments [35]. Additionally, Spanish scholars pointed out that commonly used frailty scales are difficult to apply in practice. Therefore, a scoring system was developed in primary care settings that predicts frailty levels using only clinical variables [16]. This study, however, focuses on hospitalized patients.
Additionally, a noteworthy observation in this study was that the proportion of females in Class 2 frailty is significantly lower than in Class 1 frailty, with Class 1 frailty being associated with a relatively better prognosis. In most countries worldwide, women have a longer lifespan than men, yet men possess better health conditions [36,37,38]. This is the male–female health survival paradox [37, 39]. In our research, Class 2 frailty, which has a higher disease burden, poorer prognosis, and a greater proportion of males, could serve as evidence supporting this paradox to some extent.
Among the variables critical for subtype classification, echocardiographic indicators (ejection fraction and left atrial anteroposterior diameter) and markers of metabolic health (diabetes and creatinine levels) were paramount. Studies have shown that there is no difference in LVEF between frail and non-frail populations [40]. Diabetes and creatinine have been reported in previous studies. Research indicates that the prevalence of frailty among diabetic patients can be as high as 48%, and frail diabetic patients have higher mortality rates and a higher risk of complications compared to diabetic patients without frailty [41,42,43]. This further suggests that frailty and diabetes often indicate a poor prognosis. Moreover, in other studies, creatinine has been demonstrated to be associated with the prediction of weight loss [44].
This study has several strengths. First, unsupervised machine learning techniques were applied to explore potential subtypes among hospitalized frail older adults, achieving a more detailed patient classification. Second, the study used a variety of variables, including medical history, demographic characteristics, and laboratory indicators, to provide a comprehensive view of frail patients. Finally, a logistic regression model was constructed and its robustness was supported by internal validation, offering better interpretability than "black box" machine learning models [23, 45].
Furthermore, while the high AUC values in Table S3 reflect optimized performance within individual feature selection methods, they may not translate well into clinical practice. Each method has inherent strengths and limitations: LASSO tends to exclude highly correlated variables, potentially omitting key predictors [46]; random forest may retain low-frequency variables that are less clinically relevant [47]; and SVM–RFE, particularly in data sets with a small sample size relative to the number of features, can lead to overfitting [48]. To enhance interpretability and usability, we selected four variables consistently identified across all three methods (SVM–RFE, RF, and LASSO). While this approach may slightly reduce predictive accuracy, it favors robustness and practical applicability.
However, there are some limitations. First, this study is based on a single-center sample, which may limit the generalizability of the model. Second, the cohort had only 1 year of follow-up data, and when death was selected as the endpoint, there was no significant difference in the survival curves between the two classes of patients (Figure S4), which could be due to the short follow-up period or insufficient sample size. Future research should aim to replicate these findings in multi-center studies with longer follow-up durations, improving the predictive model’s applicability and exploring targeted interventions for the identified frailty subtypes.
Conclusion
This study not only advances the understanding of frailty's clinical heterogeneity but also lays the groundwork for personalized frailty management, potentially revolutionizing the care of this vulnerable population. The insights garnered here should inform both future research and clinical practice, emphasizing the integration of machine learning tools in geriatric care.
Data availability
The data that support the findings of this study are available upon reasonable request. For inquiries, please contact the corresponding author.
References
Fried LP, Tangen CM, Walston J, et al. Frailty in older adults: evidence for a phenotype. J Gerontol A Biol Sci Med Sci. 2001;56:M146–57. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/gerona/56.3.m146.
Morley JE, Vellas B, Abellan van Kan G, et al. Frailty consensus: a call to action. J Am Med Dir Assoc. 2013;14:392–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jamda.2013.03.022.
Dent E, Martin FC, Bergman H, et al. Management of frailty: opportunities, challenges, and future directions. The Lancet. 2019;394:1376–86. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/S0140-6736(19)31785-4.
Van Kan GA, Rolland Y, Bergman H, et al. The I.A.N.A. task force on frailty assessment of older people in clinical practice. J Nutr Health Aging. 2008;12:29–37. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/BF02982161.
van Kan GA, Rolland YM, Morley JE, Vellas B. Frailty: toward a clinical definition. J Am Med Dir Assoc. 2008;9:71–2. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jamda.2007.11.005.
Ruiz JG, Dent E, Morley JE, et al. Screening for and managing the person with frailty in primary care: ICFSR consensus guidelines. J Nutr Health Aging. 2020;24:920–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s12603-020-1492-3.
Rolfson DB, Majumdar SR, Tsuyuki RT, et al. Validity and reliability of the Edmonton Frail Scale. Age Ageing. 2006;35:526–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/ageing/afl041.
Lafortune L, Béland F, Bergman H, Ankri J. Health status transitions in community-living elderly with complex care needs: a latent class approach. BMC Geriatr. 2009;9:6. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2318-9-6.
Iqbal J, Denvir M, Gunn J. Frailty assessment in elderly people. The Lancet. 2013;381:1985–6. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/S0140-6736(13)61203-9.
de Labra C, Maseda A, Lorenzo-López L, et al. Social factors and quality of life aspects on frailty syndrome in community-dwelling older adults: the VERISAÚDE study. BMC Geriatr. 2018;18:66. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12877-018-0757-8.
Eissa HM, Lu L, Baassiri M, et al. Chronic disease burden and frailty in survivors of childhood HSCT: a report from the St. Jude Lifetime Cohort Study Blood Adv. 2017;1:2243–6. https://doiorg.publicaciones.saludcastillayleon.es/10.1182/bloodadvances.2017010280.
Poli S, Cella A, Puntoni M, et al. Frailty is associated with socioeconomic and lifestyle factors in community-dwelling older subjects. Aging Clin Exp Res. 2017;29:721–8. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s40520-016-0623-5.
Valdiglesias V, Sánchez-Flores M, Marcos-Pérez D, et al. Exploring genetic outcomes as frailty biomarkers. J Gerontol: Series A. 2019;74:168–75. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/gerona/gly085.
Liu L-K, Guo C-Y, Lee W-J, et al. Subtypes of physical frailty: latent class analysis and associations with clinical characteristics and outcomes. Sci Rep. 2017;7:46417. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/srep46417.
Looman WM, Fabbricotti IN, Blom JW, et al. The frail older person does not exist: development of frailty profiles with latent class analysis. BMC Geriatr. 2018;18:84. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12877-018-0776-5.
Aznar-Tortonda V, Palazón-Bru A, la Rosa DMF, et al. Detection of frailty in older patients using a mobile app: cross-sectional observational study in primary care. Br J Gen Pract. 2020;70:e29–35. https://doiorg.publicaciones.saludcastillayleon.es/10.3399/bjgp19X706577.
Liang Y-D, Zhang Y-N, Li Y-M, et al. Identification of frailty and its risk factors in elderly hospitalized patients from different wards: a cross-sectional study in china. CIA. 2019;14:2249–59. https://doiorg.publicaciones.saludcastillayleon.es/10.2147/CIA.S225149.
Liu H, Chen J, Dy J, Fu Y. Transforming complex problems into K-means solutions. IEEE Trans Pattern Anal Mach Intell. 2023;45:1–20. https://doiorg.publicaciones.saludcastillayleon.es/10.1109/TPAMI.2023.3237667.
Schroeder L, de Souza EM, Rosset C, et al. Fire association with respiratory disease and COVID-19 complications in the State of Pará, Brazil. Lancet Regional Health - Am. 2022;6: 100102. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.lana.2021.100102.
Pasin O, Gonenc S. An investigation into epidemiological situations of COVID-19 with fuzzy K-means and K-prototype clustering methods. Sci Rep. 2023;13:6255. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-023-33214-y.
Kirti K, Singh SK. Obesogenic diet and metabolic syndrome among adolescents in India: data-driven cluster analysis. BMC Cardiovasc Disord. 2023;23:393. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12872-023-03429-y.
Barakat A, Mittal A, Ricketts D, Rogers BA. Understanding survival analysis: actuarial life tables and the Kaplan-Meier plot. Br J Hosp Med. 2019;80:642–6. https://doiorg.publicaciones.saludcastillayleon.es/10.12968/hmed.2019.80.11.642.
Deo RC. Machine learning in medicine. Circulation. 2015;132:1920–30. https://doiorg.publicaciones.saludcastillayleon.es/10.1161/CIRCULATIONAHA.115.001593.
Huang M-L, Hung Y-H, Lee WM, et al. SVM-RFE based feature selection and taguchi parameters optimization for multiclass SVM classifier. Sci World J. 2014;2014:1–10. https://doiorg.publicaciones.saludcastillayleon.es/10.1155/2014/795624.
Tibshirani R. The lasso method for variable selection in the cox model. Statist Med. 1997;16:385–95.
Izmirlian G. Application of the random forest classification algorithm to a seldi-tof proteomics study in the setting of a cancer prevention trial. Ann N Y Acad Sci. 2004;1020:154–74. https://doiorg.publicaciones.saludcastillayleon.es/10.1196/annals.1310.015.
Lang M, Binder M, Richter J, et al. mlr3: a modern object-oriented machine learning framework in R. J Open Source Software. 2019;4:1903. https://doiorg.publicaciones.saludcastillayleon.es/10.21105/joss.01903.
Mizus M, Li J, Goldman D, Petri MA. Autoantibody clustering of lupus-associated pulmonary hypertension. Lupus Sci Med. 2019;6: e000356. https://doiorg.publicaciones.saludcastillayleon.es/10.1136/lupus-2019-000356.
Smith GCS, Seaman SR, Wood AM, et al. Correcting for optimistic prediction in small data sets. Am J Epidemiol. 2014;180:318–24. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/aje/kwu140.
Alsaber A, Al-Herz A, Pan J, et al. Handling missing data in a rheumatoid arthritis registry using random forest approach. Int J of Rheum Dis. 2021;24:1282–93. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/1756-185X.14203.
Clegg A, Young J, Iliffe S, et al. Frailty in elderly people. The Lancet. 2013;381:752–62. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/S0140-6736(12)62167-9.
Morley JE, Malmstrom TK, Miller DK. A simple frailty questionnaire (frail) predicts outcomes in middle aged African Americans. J Nutr Health Aging. 2012;16:601–8.
Boakye D, Rillmann B, Walter V, et al. Impact of comorbidity and frailty on prognosis in colorectal cancer patients: a systematic review and meta-analysis. Cancer Treat Rev. 2018;64:30–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.ctrv.2018.02.003.
Mirón-Mombiela R, Ruiz-España S, Moratal D, Borrás C. Assessment and risk prediction of frailty using texture-based muscle ultrasound image analysis and machine learning techniques. Mech Ageing Dev. 2023;215: 111860. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.mad.2023.111860.
Yang C-C, Chen P-H, Yang C-H, et al. Physical frailty identification using machine learning to explore the 5-item FRAIL scale, cardiovascular health study index, and study of osteoporotic fractures index. Front Public Health. 2024;12:1303958. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fpubh.2024.1303958.
Theou O, Brothers TD, Peña FG, et al. Identifying common characteristics of frailty across seven scales. J Am Geriatr Soc. 2014;62:901–6. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/jgs.12773.
Hubbard RE, Rockwood K. Frailty in older women. Maturitas. 2011;69:203–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.maturitas.2011.04.006.
Corbi G, Cacciatore F, Komici K, et al. Inter-relationships between gender, frailty and 10-year survival in older Italian adults: an observational longitudinal study. Sci Rep. 2019;9:18416. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-019-54897-2.
Jenkins ND, Hoogendijk EO, Armstrong JJ, et al. Trajectories of frailty with aging: coordinated analysis of five longitudinal studies. Innov Aging. 2022;6:059. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/geroni/igab059.
Vidán MT, Blaya-Novakova V, Sánchez E, et al. Prevalence and prognostic impact of frailty and its components in non-dependent elderly patients with heart failure. Eur J Heart Fail. 2016;18:869–75. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/ejhf.518.
Yanase T, Yanagita I, Muta K, Nawata H. Frailty in elderly diabetes patients. Endocr J. 2018;65:1–11. https://doiorg.publicaciones.saludcastillayleon.es/10.1507/endocrj.EJ17-0390.
Jang HC. Sarcopenia, frailty, and diabetes in older adults. Diabetes Metab J. 2016;40:182. https://doiorg.publicaciones.saludcastillayleon.es/10.4093/dmj.2016.40.3.182.
Cacciatore F, Testa G, Galizia G, et al. Clinical frailty and long-term mortality in elderly subjects with diabetes. Acta Diabetol. 2013;50:251–60. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00592-012-0413-2.
Pillatt AP, Silva BD, Franz LBB, et al. Muscle, endocrine, and immunological markers of frailty in older people. Exp Gerontol. 2021;151:111405. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.exger.2021.111405.
Handelman GS, Kok HK, Chandra RV, et al. Peering into the black box of artificial intelligence: evaluation metrics of machine learning methods. Am J Roentgenol. 2019;212:38–43. https://doiorg.publicaciones.saludcastillayleon.es/10.2214/AJR.18.20224.
Zhu X-W, Xin Y-J, Ge H-L. Recursive random forests enable better predictive performance and model interpretation than variable selection by LASSO. J Chem Inf Model. 2015;55:736–46. https://doiorg.publicaciones.saludcastillayleon.es/10.1021/ci500715e.
Gilholm P, Lister P, Irwin A, et al. Comparison of random forest and stepwise regression for variable selection using low prevalence predictors: a case study in paediatric sepsis. Matern Child Health J. 2025. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s10995-025-04038-1.
Sanz H, Valim C, Vegas E, et al. SVM-RFE: selection and visualization of the most relevant features through non-linear kernels. BMC Bioinformatics. 2018;19:432. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12859-018-2451-4.
Funding
This work was supported by the Capital Health Development Fund for Scientific Research (2022‑1‑4052), the National High Level Hospital Clinical Research Funding (BJYY-2023-070), and the CAMS Innovation Fund for Medical Sciences (2021-I2M-1-050).
Author information
Contributions
Y.Z. wrote the main manuscript text and conducted statistical analyses. J.C. and W.T. were responsible for data visualization. H.W., W.Y., and Z.M. handled data cleaning. G.D. and C.L. managed the data. W.H. reviewed and revised the manuscript. All authors reviewed the manuscript.
Ethics declarations
Ethics approval and consent to participate
The study involving human participants was reviewed and approved by the Ethics Committee of Beijing Hospital, China (ID number: 2018BJYYEC-121-02). The approved protocol is version 1.0, dated September 18, 2018. The patients/participants provided written informed consent to participate in this study.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yang, Z., Ji, C., Wang, T. et al. Frailty in older adults patients: a prospective observational cohort study on subtype identification. Eur J Med Res 30, 336 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40001-025-02450-5
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40001-025-02450-5