Machine learning-based risk prediction model for arteriovenous fistula stenosis
European Journal of Medical Research volume 30, Article number: 217 (2025)
Abstract
Background
Arteriovenous fistula stenosis is a common complication in hemodialysis patients, yet effective predictive tools are lacking. This study aims to develop an interpretable machine learning model for stenosis risk prediction.
Methods
Clinical data from 974 patients (55 features) undergoing arteriovenous fistula dialysis at The Central Hospital of Wuhan (2017–2024) were analyzed retrospectively. The dataset was split into training (70%) and test (30%) sets. Seven models—Random Forest, XGBoost, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Artificial Neural Network, and Decision Tree—were trained. Performance was evaluated using F1 score, accuracy, specificity, precision, recall, and AUC-ROC. SHAP values identified key predictors in the optimal model.
Results
XGBoost achieved the highest AUC (0.829, 95% CI 0.785–0.880). SHAP analysis highlighted seven critical predictors: number of surgeries, prothrombin time activity, lymphocyte count, fistula duration, triglycerides, vitamin B12, and C-reactive protein.
Conclusion
The XGBoost model effectively predicts arteriovenous fistula stenosis risk using clinical data. SHAP explanations enhance clinical interpretability, aiding personalized care strategies.
Introduction
The prevalence of chronic kidney disease (CKD) is rising annually, posing significant challenges to human health and well-being. Epidemiological studies indicate that the global prevalence of CKD is approximately 9.1%, affecting around 697.5 million individuals. In China, the prevalence is notably higher at 10.8%, with an estimated 120 million individuals affected by the disease [1]. Dialysis serves as the primary treatment modality for patients with end-stage renal disease (ESRD). KDOQI (Kidney Disease Outcomes Quality Initiative) has indicated that the arteriovenous fistula (AVF) is the preferred vascular access for hemodialysis due to its extended service life, low infection risk, and minimal complications, making it widely utilized in clinical practice [2]. However, AVF stenosis remains a prevalent complication that significantly impairs vascular access patency and adversely affects dialysis outcomes. Research indicates that the one-year functional patency rate of a mature postoperative AVF is approximately 87%, while the two-year patency rate decreases to 75%. Furthermore, nearly half of mature AVF require additional interventions or treatments to address complications and maintain patency [3]. The presence of AVF stenosis exacerbates patient morbidity and can result in suboptimal dialysis, elevated hospitalization rates, and potentially life-threatening complications. Consequently, the early identification of patients at high risk for AVF stenosis, coupled with the implementation of effective interventions, is crucial for enhancing patient outcomes and reducing healthcare expenditures [4]. Although various clinical techniques, such as ultrasonography and angiography, are available for assessing the risk of AVF stenosis, these methods have inherent limitations. For instance, ultrasonography is highly dependent on the operator's expertise and exhibits low sensitivity in detecting minor stenosis [5]. 
While digital subtraction angiography is considered the "gold standard" for diagnosing AVF stenosis, it is an invasive procedure that is both costly and associated with a risk of complications [6, 7]. Consequently, there is an urgent need in clinical practice and research for the development of an efficient, accurate, and noninvasive tool to predict the risk of AVF stenosis.
In recent years, machine learning technology has demonstrated significant advancements in the medical field, with its robust data processing capabilities and pattern recognition abilities showing considerable potential for applications in disease diagnosis, treatment plan optimization, and risk prediction [8,9,10]. Machine learning algorithms have demonstrated the capability to extract intricate relationships between potential features and diseases from extensive clinical datasets, thereby offering substantial support for clinical decision-making. Within the domain of vascular diseases, machine learning models have been effectively employed for risk assessment in conditions such as coronary artery disease and peripheral vascular disease, yielding favorable outcomes [11,12,13]. Relevant research has also been conducted on predicting the risk of stenosis in arteriovenous fistulae, with existing models showing notable predictive performance and clinical utility [14, 15]. However, despite the high predictive accuracy of certain models in various studies, the complexity of these models often obscures the specific contribution of each feature to the prediction process.
To enhance the interpretability and clinical applicability of the model, this study employs the SHAP (SHapley Additive exPlanations) method. Grounded in the Shapley value from cooperative game theory, SHAP quantifies the contribution of each feature to a model's prediction by averaging the feature's marginal contribution across all possible combinations of the other features. In doing so, it opens up the “black box” of machine learning models and increases their transparency. The SHAP method not only identifies the features that most strongly influence the predicted risk of arteriovenous fistula stenosis but also helps reveal interactions between features and their collective impact on model predictions. This study aims to develop seven machine learning-based models for predicting AVF stenosis risk, select the optimal model, and interpret it using the SHAP method. The ultimate goal is to provide clinicians with a novel decision-making tool to facilitate the early identification of high-risk patients, thereby enabling timely interventions to extend the lifespan of the arteriovenous fistula and improve the prognosis for hemodialysis patients.
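The marginal-contribution idea behind Shapley values can be illustrated with a brute-force calculation on a toy model. The sketch below is not the SHAP package and not the study's model: the scoring function `toy_model`, its coefficients, the patient values, and the baseline are all hypothetical, and "absent" features are simply set to a baseline value, which is one common simplification.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    # Hypothetical risk score with an interaction between surgeries and months of use.
    surgeries, crp, months = z
    return 0.10 * surgeries + 0.02 * crp + 0.01 * surgeries * months

patient = [3.0, 10.0, 24.0]   # hypothetical patient
baseline = [1.0, 5.0, 12.0]   # hypothetical reference values
phi = shapley_values(toy_model, patient, baseline)
```

The contributions in `phi` add up to the difference between the patient's score and the baseline score, which is the additivity property that makes per-patient explanations possible. Enumerating every feature subset is only feasible for a handful of features; practical tools rely on faster, model-specific algorithms.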
Methodology
Study design
This retrospective cohort study adhered to the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines and used demographic and laboratory data from 1,168 patients who received dialysis at The Central Hospital of Wuhan between January 1, 2017, and October 2024. The cohort was stratified into two groups based on the presence or absence of arteriovenous fistula stenosis. A common rule of thumb in machine learning suggests 10–20 samples per feature; with the 974 patients included in the final analysis and 55 features (approximately 17.7 samples per feature), our sample size falls within this range and was considered sufficient.
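The sample-size heuristic can be checked with simple arithmetic. The short sketch below uses only the figures stated in the text (974 patients, 55 features, 10–20 samples per feature); the variable names are illustrative.

```python
n_patients = 974
n_features = 55

samples_per_feature = n_patients / n_features   # about 17.7
lower_bound = 10 * n_features                   # 550 patients at 10 per feature
upper_bound = 20 * n_features                   # 1100 patients at 20 per feature

within_rule_of_thumb = lower_bound <= n_patients <= upper_bound
```

The cohort therefore exceeds the lower end of the heuristic but does not reach the 20-per-feature upper end.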
Inclusion and exclusion criteria
Inclusion Criteria: 1. Patients who underwent dialysis using an arteriovenous fistula. 2. Patients aged 18 years or older. 3. Patients with a dialysis history of at least three months. 4. Patients who received dialysis three times per week for four hours per session. Exclusion Criteria: 1. Patients who utilized artificial arteriovenous fistulas for dialysis. 2. Patients whose clinical records contained more than 30% missing data. This study received approval from the Ethics Committee of The Central Hospital of Wuhan, under approval number WHZXKYL2024-115. Given the retrospective nature of the study, the Ethics Committee granted an exemption from obtaining informed consent.
Selection of variables
Using the electronic medical record system of our hospital, we systematically compiled demographic and clinical information comprising a total of 55 variables. The primary variables of interest included: gender, age, presence of underlying diseases (such as diabetes mellitus, hypertension, chronic glomerulonephritis, coronary artery disease, vasculitis, and other conditions, along with comorbidities involving two or more diseases), educational attainment, smoking history, alcohol consumption history, literacy levels, incidence of blood clots, body mass index (BMI), diameter of fistulae, and various hematological parameters including erythrocyte count, leukocyte count, hemoglobin concentration, hematocrit, platelet count, lymphocyte count, neutrophil count, monocyte count, D-dimer levels, plasminogen activity, prothrombin time, and procalcitonin levels. Furthermore, lipid profiles were assessed, including triglycerides (TG), high-density lipoprotein (HDL), low-density lipoprotein (LDL), apolipoprotein A1 (ApoA1), apolipoprotein B (ApoB), and free fatty acids, alongside total cholesterol, total protein, globulin, direct bilirubin, indirect bilirubin, total bilirubin, creatinine, cystatin C, creatine kinase, urea, uric acid, and electrolyte levels (potassium, sodium, chloride, phosphorus, calcium). Additionally, markers related to iron metabolism and anemia (serum iron, total iron-binding capacity, unsaturated iron-binding capacity, ferritin, folate, and vitamin B12), together with erythrocyte sedimentation rate, parathyroid hormone, C-reactive protein, serum troponin, and the duration of arteriovenous fistula use, were included in the analysis.
The primary outcome measured was arteriovenous fistula stenosis
Stenosis was defined as a localized narrowing exceeding 50% of the diameter of adjacent normal vessels, accompanied by one or more of the following conditions: a natural blood flow of the AVF less than 500 ml/min, inability to fulfill the dialysis prescription, elevated venous pressure during dialysis, difficulty in cannulation, reduced dialysis adequacy, and abnormal clinical signs of the fistula [6].
Construction of machine learning models
(1) Prior to the development of the machine learning model, the dataset underwent a thorough cleaning process. Patients with more than 30% missing data in individual records were excluded from the analysis. (2) Despite this, additional missing values persisted in the original dataset. To optimize data utilization and mitigate the potential bias and information loss associated with the deletion of records with missing values, we applied different imputation methods based on the type of variable. Specifically: For categorical variables, missing values were imputed using the mode (most frequent category) of each variable. For continuous variables, missing values were imputed using the mean of each variable. In addition to these single imputation methods, we also employed the Multiple Imputation by Chained Equations (MICE) method to handle missing data comprehensively. MICE is an iterative algorithm that constructs distinct predictive models for each variable with missing data using chained equations, generating multiple complete datasets through successive iterations. For a dataset X containing n observations and p variables, where the subset of variables with missing values is denoted as Xmiss and the subset of observed variables as Xobs, the MICE algorithm formulates an appropriate model for each missing variable Xj.miss (where j represents the variable index), conditional on the other variables. The choice of model, such as a linear regression model or logistic regression model, is contingent upon the type of variable being imputed. In this study, we used the 'mice' package in the R programming language to conduct the MICE process, resulting in the creation of five complete datasets. Subsequent analyses were conducted on each of these datasets, and the results were synthesized in accordance with Rubin's rules to derive the final statistical inferences, inclusive of uncertainty estimates.
(3) Data Standardization: After imputation, we standardized the continuous variables using the z-score method. This process involved transforming each variable to have a mean of zero and a standard deviation of one. The z-score standardization was applied to ensure that all continuous variables were on a comparable scale, thereby enhancing the stability and interpretability of the models.
(4) After data imputation, we further refined the feature set using Recursive Feature Elimination (RFE) to identify the most relevant predictors for model training. RFE is a wrapper method that recursively removes the least important features based on model performance, thereby enhancing model efficiency and interpretability. The dataset was randomly partitioned into training and test sets in a 7:3 ratio. We developed seven predictive models, namely logistic regression (LR), random forest (RF), k-nearest neighbors (KNN), support vector machine (SVM), eXtreme Gradient Boosting (XGBoost), artificial neural network (ANN), and decision tree (DT), using R version 4.3.2. The clinical demographics and laboratory test results from the training set served as predictors, and the occurrence of arteriovenous fistula stenosis served as the target variable. Optimal model parameters were identified through a grid search coupled with fivefold cross-validation (for more details, please refer to Attachment 1).
(5) Upon finalizing the models, their performance was assessed on the held-out validation (test) dataset using the area under the receiver operating characteristic curve (AUC), sensitivity (recall), specificity, precision, accuracy, and F1 score.
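Steps (2) to (4) above can be sketched in a few lines. The study itself used R (including the 'mice' package), so the Python below is only an illustration of the simpler parts of the pipeline: mean and mode imputation, z-score standardization, a 7:3 split, and fivefold cross-validation folds. The toy columns, function names, random seed, the use of the sample standard deviation, and the reuse of training-set statistics for new values are assumptions rather than details reported in the text; MICE and RFE are not implemented here.

```python
import random
from statistics import mean, mode, stdev

def impute(values, kind):
    """Fill None with the mean (continuous) or mode (categorical) of observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if kind == "continuous" else mode(observed)
    return [fill if v is None else v for v in values]

def zscore_fit(values):
    """Return the (mean, sample SD) used for standardization."""
    return mean(values), stdev(values)

def zscore_apply(values, center, scale):
    """Standardize values with a previously fitted mean and SD."""
    return [(v - center) / scale for v in values]

def split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle 0..n-1 and split into (train, test) index lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_folds(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, valid

# Hypothetical toy columns; None marks a missing value.
crp = impute([4.0, None, 10.0, 7.0], "continuous")
smoker = impute(["no", "yes", None, "no"], "categorical")

# Fit the scaling on "training" values, then apply it to all values.
center, scale = zscore_fit([2.0, 4.0, 6.0])
z = zscore_apply([2.0, 4.0, 6.0, 8.0], center, scale)

# 7:3 split of 974 patient indices, then five folds within the training part.
train_idx, test_idx = split_indices(974)
folds = list(k_folds(train_idx, k=5))
```

In a grid search, each candidate parameter setting would be scored on the five validation folds and the setting with the best average score retained; the test indices stay untouched until the final evaluation.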
SHAP is a Python-based package designed for model interpretation. To understand the model outputs, we used the SHAP package to interpret and rank the features of the trained model, thereby assessing the contribution of each feature to its predictions. After the optimal model was selected, SHAP values were used to visualize the importance of the study features.
Statistical analysis
Data processing and statistical analysis were performed using R software (version 4.0.2). Categorical variables were presented as frequencies and percentages, and compared using Fisher's exact test or chi-square test. Continuous variables were assessed for normality using the Shapiro–Wilk test. Normally distributed variables were compared using the independent samples t-test (mean ± standard deviation), while non-normally distributed variables were presented as median with interquartile range and compared using the Mann–Whitney U test.
Statistical Tests and Assumptions: (1) Fisher's Exact Test/Chi-square Test: Used for categorical variables. Fisher's exact test was applied when expected cell frequencies were < 5; otherwise, the chi-square test was used. Assumptions: independence of observations and categorical data. (2) Independent Samples t-test: Used for normally distributed continuous variables (Shapiro–Wilk test, p ≥ 0.05). Assumptions: independence of observations, normality within groups, and homogeneity of variances (Levene's test). Adjusted for unequal variances if necessary. (3) Mann–Whitney U Test: Used for non-normally distributed continuous variables (Shapiro–Wilk test, p < 0.05). Assumptions: independence of observations and ordinal/continuous data. A p-value < 0.05 was considered statistically significant.
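The rule for choosing between Fisher's exact test and the chi-square test depends on the expected cell frequencies of the contingency table, which follow from its row and column totals. The sketch below shows that calculation and the resulting choice; the two tables are hypothetical counts, not study data, and the function names are illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def choose_categorical_test(table, threshold=5):
    """Return 'fisher' if any expected count is below the threshold, else 'chi-square'."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher" if smallest < threshold else "chi-square"

# Hypothetical 2x2 tables (rows: stenosis / no stenosis, columns: exposed / unexposed).
sparse_table = [[3, 2], [10, 40]]
dense_table = [[30, 20], [25, 45]]

sparse_choice = choose_categorical_test(sparse_table)
dense_choice = choose_categorical_test(dense_table)
```

For the sparse table the smallest expected count is 5 × 13 / 55 ≈ 1.2, so Fisher's exact test would be selected; for the dense table every expected count is well above 5.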
Results
Comparative analysis of patient characteristics between two groups
A total of 974 patients were included in this study, selected according to specific inclusion and exclusion criteria (Fig. 1). Among these, 368 patients were diagnosed with stenosis, while 606 patients did not have stenosis. Table 1 presents the demographic and clinical characteristics of both the stenosis and non-stenosis groups. Our analysis identified statistically significant differences between the two groups concerning variables such as gender, smoking status, alcohol consumption, thrombus presence, underlying diseases, age, red blood cell count, white blood cell count, hemoglobin concentration, neutrophil count, plasminogen activity, prothrombin time, creatinine levels, blood calcium concentration, and erythrocyte sedimentation rate (p < 0.05). Additionally, a heat map was constructed to illustrate the correlations among these variables (Fig. 2).
Note: The following variables were considered in the study: Educational Background (EB), Fistula Diameter (FD), Number of Operations (NOO), Underlying diseases (BD), Hematocrit (HCT), Lymphocyte Count (LYMPH), Monocyte Count (MONO), Neutrophil Count (NEUT), D-dimer (DD), Prothrombin Time Activity (PTA), Prothrombin Time (PT), triglycerides (TG), high-density lipoprotein (HDL), low-density lipoprotein (LDL), apolipoprotein A1 (ApoA1), apolipoprotein B (ApoB), free fatty acids (FFA), total cholesterol (TC), total protein (TP), globulin (GLB), direct bilirubin (DBIL), indirect bilirubin (IBIL), total bilirubin (TBIL), creatinine (CR), cystatin C (CysC), creatine kinase (CK), urea (UREA), uric acid (UA), serum iron (SI), total iron-binding capacity (TIBC), folic acid (FA), vitamin B12 (B12), unsaturated iron-binding capacity (UIBC), erythrocyte sedimentation rate (ESR), parathyroid hormone (PTH), serum troponin (cTn), and use time of AVF (UT).
Underlying diseases (BD) included: 1. Hypertension, 2. Diabetes mellitus, 3. Polycystic kidney disease, 4. Chronic glomerulonephritis, 5. Multiple conditions (two or more types), 6. Coronary heart disease, 7. Vasculitis, and 8. Other conditions.
The age described refers to the specific age at which the patient developed stenosis and underwent surgical intervention. For patients who did not develop complications, their age was determined based on the age recorded during the last examination in the medical records that we were able to access.
Comparative analysis of model features
In this study, we developed seven distinct machine learning models and compared their performance on several evaluation metrics, including accuracy, F1 score, specificity, recall, area under the curve (AUC), and precision. Among these models, the XGBoost model performed best, achieving the highest AUC of 0.829, as illustrated in Fig. 3. This model outperformed the others across multiple evaluation criteria, as detailed in Table 2. Consequently, the XGBoost model was selected for further investigation in this research.
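For reference, the evaluation metrics named above can all be derived from predicted probabilities and true labels. The sketch below is a minimal pure-Python illustration with hypothetical labels and scores; the 0.5 classification threshold is an assumption, and the AUC is computed as the proportion of positive/negative pairs that are ranked correctly (ties count as half).

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity, precision, accuracy, and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / len(y_true)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy, "f1": f1}

def auc_roc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = stenosis) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed threshold of 0.5

metrics = confusion_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
```

Unlike the threshold-dependent metrics, the AUC depends only on how the scores rank patients, which is why it is commonly used to compare models before an operating threshold is chosen. The helper does not guard against empty classes, so it assumes both outcomes occur in the data.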
Analysis of characteristic interpretation
In the optimal (XGBoost) model, feature importance analysis of the factors influencing arteriovenous fistula stenosis (Fig. 4) identified the following seven variables as the most critical determinants: the number of surgeries, prothrombin time activity, lymphocyte count, duration of AVF use, triglyceride levels, vitamin B12 levels, and C-reactive protein levels.
Figure 5 presents a feature density scatter plot, illustrating the influence of these primary features on the model's predictive performance. The horizontal axis represents the SHAP values, which quantify each feature's contribution to the model's overall output. SHAP values less than 0, equal to 0, and greater than 0 correspond to negative, neutral, and positive contributions, respectively. The left vertical axis displays the features ranked by their importance. The color gradient on the right vertical axis, ranging from blue to red, signifies the feature values from low to high. The SHAP values associated with the red dot clusters on the right side of the figure are both positive and substantial, suggesting that a high "NOO" feature value significantly enhances the model's predictive value for stenosis. Conversely, the SHAP values linked to the blue dot clusters on the left side are negative, indicating that a lower "NOO" feature value tends to reduce the model's predicted value for stenosis.
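The reading of such a plot can be made concrete with a small numeric example. The sketch below ranks features by their mean absolute SHAP value, a common convention for importance rankings, and labels individual contributions by sign. Whether the figures in this study were ranked in exactly this way is an assumption, and the SHAP values shown are hypothetical.

```python
def rank_by_mean_abs(shap_rows, names):
    """Rank features by mean absolute SHAP value across patients (largest first)."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)

def direction(shap_value):
    """Interpret the sign of a single SHAP value."""
    if shap_value > 0:
        return "positive"
    if shap_value < 0:
        return "negative"
    return "neutral"

# Hypothetical SHAP values: one row per patient, one column per feature.
names = ["NOO", "PTA", "LYMPH"]
shap_rows = [
    [0.30, -0.05, 0.02],
    [-0.20, 0.10, 0.00],
    [0.40, -0.10, -0.01],
]

ranking = rank_by_mean_abs(shap_rows, names)
first_patient = [direction(v) for v in shap_rows[0]]
```

A feature can rank highly even when its contributions differ in sign between patients, because the ranking uses magnitudes while the direction is read from each individual point.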
Interpretation of model applications
Figure 6 illustrates the contribution values and impact of the top 20 significant features on the model. The SHAP values associated with the "number of operations" (NOO) feature exhibit a wide range and are predominantly negative, suggesting that this feature frequently contributes to a decrease in the model's predictive outcomes, with its contribution demonstrating considerable variability. In contrast, the SHAP values for other features, such as PTA, UT, and LYMPH, are relatively concentrated around zero, indicating that these features exert a comparatively minor and balanced influence on the model's predictions.
To further examine the contribution of specific features to individual patients and their clinical applicability within the XGBoost model, a patient was randomly selected from the validation cohort. Figure 7 provides a visual interpretation of the XGBoost model's prediction for this patient. The features highlighted in red (NEUT, SI) contribute positively, increasing the predicted value, whereas the features highlighted in blue (NOO, PTA, UT) contribute negatively, decreasing the predicted value.
Discussion
Arteriovenous fistula stenosis is a prevalent complication among patients undergoing hemodialysis via an arteriovenous fistula, with an incidence rate ranging from 4.6 to 10.8% [16]. Despite this, there is currently no sufficiently robust model for predicting its occurrence. To address this gap, Hongbo Li developed a nomogram for predicting the risk of arteriovenous fistula complications using clinical data from 178 patients, achieving an area under the curve of 0.823 [17]. Additionally, Jae Hyon Park and colleagues constructed three convolutional neural network architectures (DenseNet201, EfficientNetB5, and ResNet50) using audio data from the arteriovenous fistulae of 40 patients. The areas under the receiver operating characteristic curve for the DenseNet201, EfficientNetB5, and ResNet50 models were 0.70, 0.98, and 0.99, respectively [18]. Other researchers have identified risk factors for arteriovenous fistula stenosis through multivariate regression analyses, highlighting variables such as gender, age, body mass index, cholesterol, and albumin as significant contributors to the complication score of arteriovenous fistulae. However, the findings across these studies have not been consistent [19, 20].
In our study, we analyzed clinical demographics and laboratory tests from 974 cases to train seven distinct machine learning models. Among these, the XGBoost model performed best, achieving an area under the curve of 0.829 (95% CI 0.785–0.880), which indicates robust discriminatory ability for stenosis risk prediction. However, clinical adoption requires balancing sensitivity and specificity based on local resource availability. Integration into EMR systems as a decision support tool could optimize monitoring schedules and reduce costs. Prospective studies are needed to validate its impact on hard endpoints (e.g., stenosis rates) and its compliance with healthcare regulations. XGBoost also showed performance advantages over the traditional LR model in predicting arteriovenous fistula stenosis risk: its AUC was notably higher than that of LR (0.787, 95% CI 0.733–0.835), indicating a stronger ability to discriminate between high- and low-risk patients. XGBoost outperformed LR in precision (0.840 vs. 0.734) and balanced sensitivity and specificity, despite a slightly lower recall (0.785 vs. 0.823). The value of XGBoost is further enhanced by its interpretability through SHAP analysis, which identified critical predictors such as the number of surgeries, prothrombin time activity, and lymphocyte count. These features provide valuable insights for clinical decision-making. The ability of XGBoost to handle complex, nonlinear relationships and feature interactions makes it a more robust tool than traditional models, offering enhanced accuracy and interpretability for predicting arteriovenous fistula stenosis risk and supporting personalized patient care.
Through feature importance analysis, we identified the number of surgeries, prothrombin time activity, lymphocyte count, duration of arteriovenous fistula use, triglycerides, vitamin B12, and C-reactive protein as the seven factors with the greatest influence on the model's predictions. Notably, the number of surgeries emerged as the most critical factor, exerting the greatest influence on the model.
Previous studies have indicated that the 3-, 6-, and 9-month primary patency rates of arteriovenous fistulas in patients with arteriovenous fistula stenosis who underwent percutaneous transluminal angioplasty were 84.0%, 68.0%, and 60.0%, respectively. Additionally, patients who developed stenosis post-maturation required more frequent annual interventions than those who did not experience stenosis, suggesting that prolonged use of a patient's arteriovenous fistula increases the likelihood of stenosis development [21]. Multiple surgeries cause mechanical damage to the vascular endothelium, which activates the TGF-β pathway and promotes intimal proliferation. Repeated punctures trigger local inflammation and increase the risk of fibrosis, which in turn leads to stenosis [22].
Several studies have demonstrated that the neutrophil-to-lymphocyte count ratio and the platelet-to-lymphocyte count ratio serve as effective predictors of arteriovenous fistula stenosis [23,24,25]. Additionally, C-reactive protein has been identified as a reliable predictor for the occurrence of thrombosis and stenosis in arteriovenous fistulas [26]. CRP plays a critical role in the development and progression of arteriovenous fistula stenosis by significantly increasing the risk through mechanisms that promote endothelial injury, accelerate thrombosis, and exacerbate local inflammatory responses. Consequently, monitoring and regulating CRP levels are crucial for the prevention and management of AVF stenosis in clinical practice [27,28,29].
Triglycerides, a significant variable, were incorporated into our model but excluded from the LASSO regression analysis, and they did not exhibit a direct association with stenosis. In contrast, a study by Feng Wenjun et al. identified triglycerides ≥ 2.26 mmol/L as an independent risk factor for stenosis [30]. This discrepancy with our findings may be attributed to our lack of stratification of triglyceride values during model training [29]. Additionally, other studies that did not stratify triglyceride levels have also reported that triglycerides were not associated with stenosis in their final models [31]. Our findings are largely consistent with existing research. Elevated triglyceride levels have been shown to activate inflammatory mediators, such as tumor necrosis factor α (TNF-α) and nuclear factor κB (NF-κB), thereby intensifying the inflammatory response in vascular endothelial cells. This inflammatory response not only results in endothelial cell damage but also facilitates the progression of atherosclerosis. Furthermore, elevated triglyceride levels can induce apoptosis in vascular endothelial cells, disrupting the endothelial barrier and leading to vascular dysfunction. This dysfunction subsequently contributes to the development of complications associated with arteriovenous fistulae [32].
Research has demonstrated that altered prothrombin time activity predisposes individuals to thrombosis, ultimately impacting the function of arteriovenous fistulas [33]. However, few studies have adequately addressed the biochemical indicators of prothrombin time activity. Previous investigations have indicated that deficiencies in folic acid and vitamin B6 can lead to elevated plasma homocysteine levels. Hemodialysis patients frequently present with hyperhomocysteinemia, often accompanied by deficiencies in folic acid and B vitamins. Supplementation with folic acid has been shown to effectively reduce plasma homocysteine levels in these patients. Furthermore, the combined administration of B complex vitamins enhances the improvement of hyperhomocysteinemia. Long-term supplementation with folic acid and vitamin B6 has been associated with reduced plasma homocysteine levels and a decreased incidence of arteriovenous fistula stenosis and thrombosis [34, 35]. The SHAP analysis revealed that vitamin B12 deficiency is an independent risk factor for stenosis, which is consistent with the KDOQI consensus on mineral and bone disorder in chronic kidney disease. Vitamin B12 is an essential cofactor for homocysteine metabolism, which is disrupted when B12 is deficient. This condition impairs vascular repair capacity through a dual mechanism involving oxidative stress and inhibition of endothelial nitric oxide synthase (eNOS). It is recommended to routinely monitor serum B12 levels in patients with fistulas and to supplement with methylcobalamin for those with levels below 200 pg/mL (Evidence Level: B) [2].
Our study demonstrates that vitamin B12 is a significant factor in the stenosis of arteriovenous fistula, contributing substantially to our predictive model. Despite the limited research on the impact of vitamins on the pathology of arteriovenous fistula, further investigation into the specific mechanisms of this effect is warranted. In conclusion, we have developed a risk model to predict the likelihood of arteriovenous fistula stenosis. As clinical nurses, who are most closely engaged with patients, we should prioritize care for individuals who have undergone multiple surgeries, exhibit elevated C-reactive protein levels, have used the arteriovenous fistula for extended periods, show high lymphocyte counts, and present with elevated triglyceride levels. Providing targeted nursing interventions may help mitigate the risk of stenosis in these patients. Given the variability among clinical patients, it is advisable to develop individualized care strategies based on the risk models.
We have found that the inflammatory state within patients influences the occurrence of stenosis in arteriovenous fistulas. Lymphocyte count and CRP are both indicators of inflammation. Chronic inflammatory responses may lead to local tissue hypoxia, which affects cellular repair and functional recovery. The hypoxic environment further promotes the activation of fibroblasts and the deposition of extracellular matrix (ECM), thereby exacerbating the progression of fibrosis [36].
In conclusion, our study has delineated several specific risk factors, such as procedural damage inflicted by the operating surgeon, repeated punctures during dialysis, and the biochemical status of the patient, all of which are correlated with inflammation in patients. These risk factors collectively contribute to a cascade of oxidative stress responses and activate various inflammatory signaling pathways, ultimately culminating in the stenosis of the patients' arteriovenous fistulas. In the clinical management of hemodialysis patients, a comprehensive understanding of their inflammatory status is crucial. Various strategies should be employed to ameliorate their inflammatory condition. For instance, addressing hyperlipidemia through active treatment, proactively supplementing with vitamins, and administering pharmacological agents such as Hirudoid cream following dialysis puncture can effectively mitigate puncture-related injuries [2].
Conclusion
An advanced machine learning model based on XGBoost was developed, with SHAP values employed to elucidate the risk factors and underscore their clinical significance in predicting arteriovenous fistula stenosis. Fistula stenosis was associated with multiple surgical interventions, elevated C-reactive protein levels, prolonged duration of fistula usage, increased lymphocyte counts, and elevated triglyceride levels. The integration of XGBoost and SHAP offers transparent explanations for risk prediction, which holds substantial potential for future clinical research applications. This methodological combination could assist clinicians in implementing early clinical interventions and provide comprehensive insights for the long-term management of patients with arteriovenous fistula stenosis, including prevention and stratification of stenosis-related risks.
Study limitations
This study has several limitations. First, its single-center, retrospective design limits the generalizability of the findings, and without external validation the results may not apply to other populations or healthcare settings. Second, unmeasured confounders may be present: differences in patient demographics, such as race, and in treatment modalities may have influenced the incidence of arteriovenous fistula stenosis and introduced biases that our analysis did not fully account for. Third, the model was developed only in adult participants, so the predictive performance of the XGBoost model for arteriovenous fistula stenosis in pediatric patients remains uncertain; further validation and adjustment would be needed before the findings are extended to pediatric populations. Fourth, because the data were drawn from existing medical records, missing, incomplete, or inaccurate documentation may have affected the performance and reliability of the model. Retrospective collection may also introduce selection bias, since patients with incomplete records may have been excluded, which further limits generalizability.
To address these limitations, future research should focus on multicenter, large-sample, prospective cohort studies, which would improve the accuracy and generalizability of predictive models by incorporating diverse patient populations and healthcare settings. External validation in multicenter cohorts is also essential to confirm the robustness of our findings and to support more universally applicable clinical guidelines.
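To make the idea of external validation concrete, the sketch below shows one way a frozen model's discrimination could be checked on an independent cohort: compute the AUC from the model's predicted risks and attach a percentile bootstrap confidence interval. The labels and scores are invented for illustration, the function names are hypothetical, and the bootstrap settings are assumptions rather than the procedure used in this study.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Labels must be 0 or 1; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class, where AUC is undefined.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented external cohort: 1 = stenosis, 0 = no stenosis, with predicted risks.
external_labels = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
external_scores = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.20, 0.80, 0.65, 0.90]

point_estimate = auc(external_labels, external_scores)
ci_lower, ci_upper = bootstrap_auc_ci(external_labels, external_scores)
print(f"External AUC: {point_estimate:.3f} (95% CI {ci_lower:.3f} to {ci_upper:.3f})")
```

In a real validation the model would be applied unchanged to the external cohort, and calibration would usually be assessed alongside discrimination.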
Availability of data and materials
All data generated during this study are available upon request. Interested parties should contact the corresponding author or the first author, in accordance with hospital policy requirements and patient privacy considerations.
References
Bikbov B, Purcell CA, Levey AS, Smith M, Abdoli A, Abebe M, Adebayo OM, Afarideh M, et al. Global, regional, and national burden of chronic kidney disease, 1990–2017: a systematic analysis for the global burden of disease study 2017. Lancet. 2020;395(10225):709–33. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/s0140-6736(20)30045-3.
Lok CE, Huber TS, Lee T, et al. KDOQI clinical practice guideline for vascular access: 2019 update. Am J Kidney Dis. 2020;75(4):S1–164. https://doiorg.publicaciones.saludcastillayleon.es/10.1053/j.ajkd.2019.12.001.
Huber TS, Berceli SA, Scali ST, et al. Arteriovenous fistula maturation, functional patency, and intervention rates. JAMA Surg. 2021;156(12):1111–8. https://doiorg.publicaciones.saludcastillayleon.es/10.1001/jamasurg.2021.4527.
Wang ZQ, Liu Z, Wang YJ, Zhao SL, Fu LP, Du ST. Treatment of central venous stenosis in hemodialysis patients and the impact of reimbursement methods on treatment selection. Chin J Blood Purif. 2018;17(7):476–80.
Abreo K, Amin BM, Abreo AP. Physical examination of the hemodialysis arteriovenous fistula to detect early dysfunction. J Vasc Access. 2019;20(1):7–11. https://doiorg.publicaciones.saludcastillayleon.es/10.1177/1129729818768183.
Chinese Hospital Association Hemodialysis Center Branch Vascular Access Working Group. Chinese expert consensus on vascular access for hemodialysis. Chin J Blood Purif. 2019;18(6):365–81.
Liu J, Qin J, Lei R, He LY, Wen R. Clinical study of open surgery combined with tunneling device for the repair of arteriovenous fistula stenosis in hemodialysis patients. Int J Urol. 2022;42(5):858–62.
Jiang XR, Jiang T, Sun JY, et al. Application of deep learning artificial intelligence technology in medical imaging analysis. Chin Med Equip. 2021;36(6):164–71.
Liu JL, Hu SL, Zhou PR, Mo HQ, Huang JW, Hu B. Research trends of artificial intelligence in hemodialysis: a visual analysis based on VOSviewer. Chin J Blood Purif. 2023;22(8):633–7.
Zhang D. Application of artificial intelligence in hemodialysis. J Nephrol Dial Transplant. 2018;27(4):383–6.
Ebrahimi S, Bagchi P. Application of machine learning in predicting blood flow and red cell distribution in capillary vessel networks. J R Soc Interface. 2022;19(193):20220306. https://doiorg.publicaciones.saludcastillayleon.es/10.1098/rsif.2022.0306.
Al’Aref SJ, Anchouche K, Singh G, et al. Clinical applications of machine learning in cardiovascular disease and its relevance to cardiac imaging. Eur Heart J. 2019;40(24):1975–86. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/eurheartj/ehy404.
Liu L, Bi B, Cao L, Gui M, Ju F. Predictive model and risk analysis for peripheral vascular disease in type 2 diabetes mellitus patients using machine learning and shapley additive explanation. Front Endocrinol. 2024;15:1320335. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fendo.2024.1320335.
Wang FL, Yang YL, Xu YK, et al. Feasibility analysis of monitoring arteriovenous fistula function based on machine learning and audio technology. Chin J Blood Purif. 2024;23(9):701–5.
Wu F, Li Y, Xue R. Research progress on risk prediction models for autologous arteriovenous fistula failure based on different research schemes. Chin J Integr Tradit West Med Nephrol. 2024;25(6):553–5.
Meola M, Marciello A, Di Salle G, Petrucci I. Ultrasound evaluation of access complications: thrombosis, aneurysms, pseudoaneurysms and infections. J Vasc Access. 2021;22(1):71–83. https://doiorg.publicaciones.saludcastillayleon.es/10.1177/11297298211018062.
Li HB. Analysis of risk factors for complications of autologous arteriovenous fistula in maintenance hemodialysis patients and construction of risk prediction model. Chin J Blood Purif. 2023;22(11):861–5.
Park JH, Park I, Han K, et al. Feasibility of deep learning-based analysis of auscultation for screening significant stenosis of native arteriovenous fistula for hemodialysis requiring angioplasty. Korean J Radiol. 2022;23(10):949–58. https://doiorg.publicaciones.saludcastillayleon.es/10.3348/kjr.2022.0364.
Chen B, Tu B, Lai Q, et al. Patency and factors related to patency after percutaneous transluminal angioplasty for inflow arterial stenosis in native arteriovenous fistula dysfunction: a single-center retrospective study. Sci Rep. 2024;14(1):21072. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-024-71776-7.
Huang JY, Duan LL. Establishment of a prediction model for arteriovenous fistula dysfunction in hemodialysis patients based on synthetic minority oversampling technique algorithm. J Clin Nephrol. 2024;24(5):378–84.
Allon M, Robbin ML, Young CJ, et al. Preoperative venous intimal hyperplasia, postoperative arteriovenous fistula stenosis, and clinical fistula outcomes. Clin J Am Soc Nephrol. 2013;8(10):1750–5. https://doiorg.publicaciones.saludcastillayleon.es/10.2215/cjn.02740313.
Wang W, Wang Z, Tian D, et al. Integrin β3 mediates the endothelial-to-mesenchymal transition via the notch pathway. Cell Physiol Biochem. 2018;49(3):985. https://doiorg.publicaciones.saludcastillayleon.es/10.1159/000493229.
Li H, Zhai YL, Yuan L, et al. Impact of neutrophil-to-lymphocyte ratio on patency after percutaneous transluminal angioplasty for arteriovenous fistula in hemodialysis patients. Chin J Blood Purif. 2020;19(8):554–8.
Sarioglu O, Capar AE, Belet U. Relationship of arteriovenous fistula stenosis and thrombosis with the platelet-lymphocyte ratio in hemodialysis patients. J Vasc Access. 2020;21(5):630–5. https://doiorg.publicaciones.saludcastillayleon.es/10.1177/1129729819894113.
Yilmaz H, Bozkurt A, Cakmak M, et al. Relationship between late arteriovenous fistula (AVF) stenosis and neutrophil-lymphocyte ratio (NLR) in chronic hemodialysis patients. Ren Fail. 2014;36(9):1390–4. https://doiorg.publicaciones.saludcastillayleon.es/10.3109/0886022x.2014.945183.
Gong MQ, Wu M, Li Z. Analysis of influencing factors of arteriovenous fistula stenosis in maintenance hemodialysis patients. Chin J Front Med Sci. 2019;11(7):134–7.
Ren ZT, Zhou T, He LX, Ma YM, Zhang HP. Research progress on the molecular biological mechanisms of arteriovenous fistula dysfunction. Clin Nephrol. 2023;23(12):1041–4.
Xue ZQ, Tan JZ, Kong YY, Xiang DM. Analysis of factors influencing primary patency time and short-term patency rate after upper extremity graft arteriovenous fistula surgery. Clin Nephrol. 2023;23(8):621–7.
Li B, Deng XG, Li J, Liu HW. Relationship between plasma growth arrest-specific protein 6 levels and arteriovenous fistula thrombosis in maintenance hemodialysis patients. Clin Nephrol. 2022;22(9):737–41. https://doiorg.publicaciones.saludcastillayleon.es/10.3969/j.issn.1671-2390.2022.09.006.
Feng WJ, Xu HY, Chen L. Analysis of high-risk factors for autologous arteriovenous fistula stenosis in hemodialysis patients. China Modern Doctor. 2021;59(29):49–52.
Li J, Chen WD, Zhang Y, Zhang JQ. Analysis of influencing factors of fistula calcification in autologous arteriovenous fistula in maintenance hemodialysis patients. Chin J Blood Purif. 2014;13(11):793–6.
Liang HJ, Zhang QY, Hu YT, Liu GQ, Qi R. Hypertriglyceridemia: a neglected risk factor for ischemic stroke? J Stroke. 2022;24(1):21–40. https://doiorg.publicaciones.saludcastillayleon.es/10.5853/jos.2021.02831.
Salmela B, Hartman J, Peltonen S, Albäck A, Lassila R. Thrombophilia and arteriovenous fistula survival in ESRD. Clin J Am Soc Nephrol. 2013;8(6):962–8. https://doiorg.publicaciones.saludcastillayleon.es/10.2215/cjn.03860412.
Chen WD, Zhang Y, Chang BC, Yang P. Relationship between plasma homocysteine and arteriovenous fistula stenosis in hemodialysis patients. Chin J Blood Purif. 2009;8(6):319–21.
Tu YR, Tu KH, Lee CC, et al. Supplementation with folic acid and cardiovascular outcomes in end-stage kidney disease: a multi-institution cohort study. Nutrients. 2022;14(19):4162. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/nu14194162.
Tang F, Liu M, Zeng O, Tan W, Long J, Liu S, Yang J, Chu C. Gefitinib-coated balloon inhibits the excessive hyperplasia of intima after vascular injuries through PI3K/AKT pathway. Technol Health Care. 2019;27(S1):331–43. https://doiorg.publicaciones.saludcastillayleon.es/10.3233/THC-199031.
Funding
This study was funded by the Chen Xiaoping Foundation for the Development of Science and Technology of Hubei Province (No. CXPJJH124001-2404).
Author information
Contributions
Peng Shu and Ling Huang were responsible for conceptualizing the study, designing the research framework, developing the machine learning model, analyzing and interpreting the results, and drafting the manuscript. Haitao Bai and Jun Qiu contributed by collecting and preprocessing the input data, conducting data analysis, and participating in the manuscript writing. Shanshan Huo was involved in data collection and preprocessing. Fang Xu and Xia Wang provided critical feedback on the manuscript's conceptual framework, suggested revisions, and assisted in the manuscript's refinement. All authors have agreed to be accountable for their respective contributions and have reviewed and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
This study received approval from the Ethics Committee of The Central Hospital of Wuhan, under approval number WHZXKYL2024-115. Given the retrospective nature of the study, the Ethics Committee granted an exemption from obtaining informed consent.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shu, P., Huang, L., Huo, S. et al. Machine learning-based risk prediction model for arteriovenous fistula stenosis. Eur J Med Res 30, 217 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40001-025-02490-x
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40001-025-02490-x