Clustering

Aug 02, 2023

Scientific Reports volume 13, Article number: 12701 (2023)

Machine learning applied to digital pathology has been increasingly used to assess kidney function and diagnose the underlying cause of chronic kidney disease (CKD). We developed a novel computational framework, clustering-based spatial analysis (CluSA), that leverages unsupervised learning to learn spatial relationships between local visual patterns in kidney tissue. This framework minimizes the need for time-consuming and impractical expert annotations. A total of 107,471 histopathology images obtained from 172 biopsy cores were used in the clustering and in the deep learning model. To incorporate spatial information over the clustered image patterns on the biopsy sample, we spatially encoded clustered patterns with colors and performed spatial analysis through a graph neural network. A random forest classifier with various groups of features was used to predict CKD. For predicting eGFR at the biopsy, we achieved a sensitivity of 0.97, specificity of 0.90, accuracy of 0.95, and AUC of 0.96. For predicting eGFR changes in one year, we achieved a sensitivity of 0.83, specificity of 0.85, accuracy of 0.84, and AUC of 0.85. To our knowledge, this study presents the first spatial analysis based on unsupervised machine learning algorithms for predicting kidney function in CKD. Without expert annotation, the CluSA framework can not only accurately classify and predict the degree of kidney function at the biopsy and in one year, but also identify novel predictors of kidney function and renal prognosis.

Chronic kidney disease (CKD) involves a gradual loss of kidney function and is not easily detected until the condition is advanced. According to the Centers for Disease Control and Prevention, more than 37 million people (15% of US adults) are estimated to have CKD, and as many as 9 in 10 adults with CKD do not know they have it1. Diabetes, high blood pressure, heart disease, and a family history of kidney failure are the most common causes of kidney disease. Currently, CKD, which causes more deaths than breast cancer or prostate cancer, is the 9th leading cause of death in the U.S.1.

As the degree of kidney dysfunction is associated with increased mortality and risk of heart disease2,3, an early and accurate diagnosis is crucial to slow the progression to kidney failure4. Current typical measures of kidney function and risk of progression, such as creatinine level in the blood and protein in the urine5,6, have several limitations and are not accurate at higher levels of kidney function7. Although kidney biopsy samples may provide further prognostic information, e.g., degree of glomerular sclerosis and interstitial fibrosis8, these features are often visually estimated, and interpretation may vary among pathologists. Computer-aided algorithms may provide a more objective kidney assessment and help to overcome substantial inter-observer variability.

Deep learning and machine learning approaches to histopathological image analysis have become increasingly common with the growing availability of whole-slide digital scanners9. Coudray et al. used convolutional neural networks (CNNs) on whole-slide images (WSI) to classify them into lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), or normal lung tissue10. CNNs have also been applied to WSI to classify sclerosed and non-sclerosed glomeruli11,12. Kolachalama et al. demonstrated that CNN deep learning models can outperform the pathologist-estimated fibrosis score across the classification tasks and can be applied to routine renal biopsy images13.

To date, most machine learning and deep learning algorithms applied to histopathology images have been based on supervised (training) approaches. However, supervised algorithms require a large amount of labeled training data, and producing those labels is time-consuming, expensive, and often impractical. To overcome this problem, several studies have proposed methods such as weakly-supervised learning and multiple instance learning (MIL), which gave relatively high performance14,15,16,17,18. However, these are still supervised methods that require patient-level labels.

Spatial analysis of tissue microenvironment and cellular organization has become increasingly popular with multiplexed staining and advanced visualization techniques19. Investigating the spatial context in digital histopathology images is key to understanding the microenvironment heterogeneity with clinical implications20. In recent years, graph neural networks (GNNs)21, such as the graph convolutional network (GCN)22, deep graph convolutional neural network (DGCNN)23, and graph attention network (GAT)24, have demonstrated strong performance in various deep learning applications.

The primary objective of this study was to propose a novel computational framework that does not require expert annotation and to investigate the spatial context of histopathology images through CluSA with a GNN. We hypothesize that adding spatial neighborhood information, which is an important characteristic feature of all forms of CKD, to the clustering analysis can help to improve the predictive model performance. The overall computational pipeline of this study is summarized in Fig. 1.

Figure 1. Overall workflow. (a) Biopsy core image; (b) clustered patches obtained using a k-means clustering algorithm; (c) histogram of clusters; (d) color-coded map consisting of clustered patches; (e) graph representation converted from the color-coded map; and (f) DGCNN model that outputs core-level prediction. The normalized area-weighted score (aw-score) was computed with the area of each core biopsy sample for the patient-level prediction. (g) Random forest machine learning classifier was used with various groups of features, such as aw-scores, the number of clustered patches or nodes, clinical features, and polynomial fit coefficient features.

In this study, we proposed a novel computational ensemble machine learning framework, clustering-based spatial analysis (CluSA), that utilizes both unsupervised and supervised machine learning methods to predict patient outcomes as well as to identify important patterns or features associated with the level of kidney function and risk of progression. The unsupervised component uses a clustering method that learns from an unlabeled dataset and automatically finds structures or patterns in the data by extracting useful features25. The supervised component uses graph analysis to learn spatial information through neighbor relationships between adjacent structures or patterns in the data.

The dataset consisted of human subjects enrolled in the C-PROBE cohort, a multicenter cohort of patients with CKD established under the auspices of the George O’Brien Kidney Center at the University of Michigan (https://kidneycenter.med.umich.edu/clinical-phenotyping-resource-biobank-core). C-PROBE aims to collect high-quality data and biosamples for translational studies and was approved by the Institutional Review Boards of the University of Michigan Medical School (IRBMED) with approval number HUM00020938. C-PROBE enrolls patients at the time of clinically indicated biopsy and follows them with phenotypic data prospectively. Written informed consent was obtained from all subjects and/or their legal guardians.

A total of 107,471 histopathology images (256 × 256 pixels) were used in the clustering and in the deep learning model. All images were obtained from 172 biopsy cores from 57 cases in the form of trichrome-stained slides. This study was carried out in accordance with relevant guidelines and regulations. The Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula was used to calculate the estimated glomerular filtration rate (eGFR)7,26.

In modern digital pathology, stain normalization is an important processing task for computer-aided diagnosis (CAD) systems. In this study, we applied Reinhard color normalization to all whole-slide imaging (WSI) data as a preprocessing step to avoid confounding due to variations in color and intensity of the images and to increase computational efficiency and performance27. The Reinhard normalization algorithm maps the color distribution of the stained image to that of a reference image by using a linear transform in a perceptual color space. We computed the global mean and standard deviation of each channel in Lab color space over all data and used them as reference values to normalize our data. Figure S1 in the Supplementary materials provides an example of stain-normalized images.
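The linear transform described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the pixels have already been converted to Lab color space (the conversion is not shown), and the function names and the reference statistics passed in are hypothetical.

```python
from statistics import mean, pstdev


def reinhard_channel(values, ref_mean, ref_std):
    """Shift and scale one channel so its mean and std match the reference."""
    src_mean = mean(values)
    src_std = pstdev(values)
    if src_std == 0:
        # A flat channel has no spread to rescale; move it to the reference mean.
        return [ref_mean for _ in values]
    return [(v - src_mean) / src_std * ref_std + ref_mean for v in values]


def reinhard_normalize(lab_pixels, ref_means, ref_stds):
    """Normalize a list of (L, a, b) pixels channel by channel.

    ref_means and ref_stds are the per-channel reference statistics
    (assumed here to be global values computed over the whole dataset).
    """
    channels = list(zip(*lab_pixels))
    normalized = [
        reinhard_channel(list(channel), m, s)
        for channel, m, s in zip(channels, ref_means, ref_stds)
    ]
    return list(zip(*normalized))
```

After the transform, each channel of the image has the reference mean and standard deviation, which is what removes slide-to-slide differences in stain intensity.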

Feature extraction is an important step in clustering; it aims to extract relevant information that characterizes each image pattern. In this process, relevant features are extracted from images to form feature vectors that are used to cluster image patterns based on similarity measures. There are several feature extraction methods, such as Histogram of Oriented Gradients (HOG), Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Features from Accelerated Segment Test (FAST)28,29. In this study, for feature extraction, we used one of the most popular machine learning methods, transfer learning30,31,32. Transfer learning is especially popular in medical image analysis for deep learning, where the data are often not sufficient for training33,34,35. Transfer learning takes a deep learning model that was developed and pre-trained for one task and reuses it as the starting point for a model on another related task36.

We used DeepLab V3+ with the ResNet-18 architecture37,38 pre-trained on ImageNet39. We performed feature extraction on this pre-trained deep learning model by propagating each input image forward and taking the outputs of a specific layer as our features. We extracted features from a layer (res5b) of ResNet-1840, which serves as the backbone of DeepLab V3+, a deep neural network for semantic segmentation.

We used one of the most popular and simplest unsupervised machine learning algorithms, K-means clustering, which forms groupings using similarity or distance measures. First, the optimal number of clusters, K, was estimated with the silhouette function in MATLAB (Supplementary Fig. S2). Then, the K-means clustering algorithm was applied to the feature vectors of the image tiles, obtained from all the images across the patients, to obtain cluster indices, centroid locations, and distances from each point to every centroid. Next, we constructed the histogram representation for each case based on the nearest-centroid group labels assigned to each point, resulting in 9 visual pattern group bins for the histogram. The frequency on the cluster histogram represents how often each clustered image pattern is encountered for each case. This occurrence or frequency for each visual pattern was used as a feature for predicting the patient’s eGFR. The Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula was used to calculate eGFR7,26. The order of clusters on the x-axis of the cluster histogram was obtained using multidimensional scaling (MDS). MDS allows us to visualize how near groups are to each other in histogram plots by computing the similarity between clustered visual pattern groups from their Euclidean distances.
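The step from cluster centroids to a per-case histogram can be sketched as below. This is an illustrative sketch, not the authors' MATLAB code: the function names are hypothetical, the centroids are taken as given (K-means fitting itself is not shown), and reporting relative frequencies rather than raw counts is an assumption.

```python
import math
from collections import Counter


def nearest_centroid(vector, centroids):
    """Return the index of the centroid closest to vector (Euclidean distance)."""
    distances = [math.dist(vector, c) for c in centroids]
    return distances.index(min(distances))


def cluster_histogram(feature_vectors, centroids):
    """Relative frequency of each cluster among one case's patch features."""
    if not feature_vectors:
        return [0.0] * len(centroids)
    labels = [nearest_centroid(v, centroids) for v in feature_vectors]
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(len(centroids))]
```

With K = 9 centroids, `cluster_histogram` returns a 9-element vector per case, one bin per visual pattern group.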

In this study, we engineered a feature that describes a nonlinear fit of the relationship between the local visual patterns and the frequency of each pattern, and used the fourth-degree polynomial fit coefficients as an additional feature for predicting patients’ outcomes. We applied fourth-degree polynomial fitting to the frequency histogram and obtained five coefficients for each case.

where c1, c2, …, c5 are the coefficients of the fourth-degree polynomial function f(x). This polynomial fitting on the histogram (Supplementary Fig. S3) provided overall information, or a trend, across all histogram frequency features.
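The polynomial itself is not reproduced in this text. A fourth-degree polynomial with five coefficients has the general form below, where x is the cluster position on the histogram axis and f(x) the fitted frequency. The numbering of the coefficients from the highest power downward is an assumption (it follows the convention of MATLAB's polyfit) and is not confirmed by the source.

```latex
f(x) = c_1 x^4 + c_2 x^3 + c_3 x^2 + c_4 x + c_5
```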

In this study, we developed a clustering-based spatial analysis framework using a graph deep learning model to extract key features of visual patterns from whole-slide imaging (WSI) of renal biopsies. In order to incorporate spatial information over the clustered image patterns on the biopsy sample, we spatially encoded clustered visual patterns on the original biopsy images with colors. The cluster indices obtained from the K-means clustering algorithm can be used to find the original location of each image tile on the biopsy sample. These spatially color-encoded visual patterns can be considered as nodes for the spatial analysis. In this study, to obtain the spatial neighborhood relationship between clustered image patterns on biopsy samples, we used a state-of-the-art graph neural network model called the DGCNN model41.

The DGCNN consists of four graph convolutional layers, a sort pooling layer, two 1-dimensional convolutional layers, a max pooling layer, and a fully connected layer. The representation of an entire graph can be obtained by summing the representation vectors of all nodes in the graph with a DGCNN algorithm (Fig. 2).

Figure 2. DGCNN architecture. (a) Biopsy core image. (b) Clustered color map. (c) Graph representation of the clustered color map. (d) DGCNN architecture. The DGCNN model consisted of four graph convolutional layers, a sort pooling layer, two 1-dimensional convolutional layers, a max pooling layer, and a fully connected layer.

The graph representation used as input to the DGCNN was generated from the spatial location of visual patches using its adjacency and node feature matrices. The node features were obtained using MDS, which calculates the dissimilarity between clustered visual pattern groups from their Euclidean distances to show how near clustered visual pattern groups are to each other. The DGCNN outputs a class prediction for each biopsy core image. Since each patient has multiple biopsy cores, we defined a normalized area-weighted score (aw-score) for the patient-level prediction, which multiplies the normalized size of each core biopsy sample by its corresponding class prediction (1 or − 1); these aw-scores were then summed to get the patient-level outcome by Eq. (2).

where P is the predicted class, either 1 or − 1, output by the DGCNN for each biopsy core; n is the number of biopsy cores in each patient; and A is the normalized area of each biopsy core. The aw-score is defined as \(A \times P\) for each core. We performed a threefold cross validation, stratified by patients, on the DGCNN; Tables 2 and 3 list the area and DGCNN prediction for each core as well as the aw-scores for all patients, obtained from the test sets of the threefold cross validation. For the spatial analysis with GNN, we used StellarGraph, a Python library for machine learning on graph-structured data that is built on TensorFlow and the Keras API.
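The patient-level aggregation can be sketched as follows: each core contributes \(A \times P\), and the contributions are summed over the n cores of a patient. This is a minimal sketch under one stated assumption, namely that the area is normalized by the total area of that patient's cores; the source does not spell out the normalization, and the function name is hypothetical.

```python
def patient_aw_score(core_areas, core_predictions):
    """Return per-core aw-scores and their sum for one patient.

    core_areas: raw area of each biopsy core (any consistent unit).
    core_predictions: DGCNN class prediction per core, each 1 or -1.
    Assumption: areas are normalized by the patient's total core area.
    """
    if len(core_areas) != len(core_predictions):
        raise ValueError("one prediction is required per core")
    if any(p not in (1, -1) for p in core_predictions):
        raise ValueError("predictions must be 1 or -1")
    total_area = sum(core_areas)
    if total_area <= 0:
        raise ValueError("total core area must be positive")
    aw_scores = [
        (area / total_area) * prediction
        for area, prediction in zip(core_areas, core_predictions)
    ]
    return aw_scores, sum(aw_scores)
```

Under this normalization the summed score lies between -1 and 1, and a large core outweighs several small cores that disagree with it.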

We obtained various groups of features: frequency or occurrence of each visual pattern group, polynomial fit coefficients from a histogram representation of clusters for each patient, spatial neighborhood relationship between clustered visual patterns, and clinical features such as age, race, and diagnosis. In this study, we used multi-stage feature extraction and classification pipelines with a random forest classifier to predict kidney function at the biopsy and in one year. In general, multi-stage feature extraction and classification pipelines can achieve high predictive accuracy compared to end-to-end learning, which requires a huge amount of data to obtain a high accuracy42,43. Further, the random forest classifier is a commonly used machine learning algorithm that generally provides higher accuracy than a single decision tree. It can also handle large datasets efficiently, produce a reasonable prediction without extensive hyper-parameter tuning, and reduce the overfitting to which single decision trees are prone. For the prediction at the biopsy, the patients were dichotomized into low and high eGFR groups with a threshold at eGFR 60: eGFR ≥ 60 (n = 36) and eGFR < 60 (n = 21). For the prediction of whether eGFR decreased or increased in one year, the patients were dichotomized into two groups based on the eGFR slope: eGFR slope ≥ 0 (n = 30) and eGFR slope < 0 (n = 27). The eGFR slope is defined as Eq. (3).

where “age at year 1” is the age in days approximately 1 year after the biopsy. A receiver operating characteristic (ROC) curve analysis was performed to illustrate the diagnostic ability of the binary classifier system. To evaluate the performance of our model, we estimated the area under the ROC curve (AUC) and its 95% confidence interval44,45,46. The clustering analysis was performed using the “kmeans” and “silhouette” functions in MATLAB (R2020a, The MathWorks, Inc.). The ROC and AUC were computed using R (R Foundation for Statistical Computing, Vienna, Austria).
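For readers unfamiliar with the AUC, it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. It is not the authors' R code, it does not compute the confidence interval, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.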

To cluster image patterns, all 172 biopsy core images were tiled into 107,471 patches, and features were extracted from each patch using the pre-trained DeepLab V3+ with ResNet-18 model38,47. The patches were then clustered through K-means clustering to group similar image patterns together (Fig. 3a). K-means clustering uses similarity or distance measures to form groupings such that image patches in the same group are more similar to each other than to those in other groups. The optimal number of clusters (K = 9) was obtained using the silhouette function in MATLAB (Supplementary Fig. S2).
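The silhouette value behind this choice compares, for each point, the mean distance a to the other members of its own cluster with the smallest mean distance b to any other cluster, giving s = (b - a) / max(a, b). The sketch below is a plain-Python illustration of that definition, not MATLAB's implementation. Selecting K as the value with the highest mean silhouette is a common rule and is assumed here rather than stated by the source.

```python
import math


def silhouette_values(points, labels):
    """Silhouette value of every point for a given cluster labeling."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Mean distance from p to the members of each cluster, excluding p.
        mean_dist = {}
        for k in clusters:
            d = [
                math.dist(p, q)
                for j, (q, lab) in enumerate(zip(points, labels))
                if lab == k and j != i
            ]
            mean_dist[k] = sum(d) / len(d) if d else None
        a = mean_dist[own]
        if a is None:
            values.append(0.0)  # convention for a single-member cluster
            continue
        b = min(v for k, v in mean_dist.items() if k != own)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average silhouette over all points; higher means better separation."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)
```

Running K-means for several candidate values of K and comparing `mean_silhouette` across them gives one way to choose the number of clusters.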

Figure 3. Clustered image patterns and examples. (a) Nine clustered visual pattern groups with representative image patches; (b) an example of a biopsy sample; (c) its cluster color-coded map; and (d) each color-coded patch can be considered a node for the spatial analysis.

Cluster indices, centroid locations, and distances from each point to every centroid were computed for the analysis. Figure 3 shows (a) 9 clustered visual pattern groups with representative image patches, (b) an example of a biopsy sample, (c) its color-coded cluster map, and (d) how each color-coded patch can be considered a node for the spatial analysis. In this study, we developed a clustering-based deep learning methodology to find previously unknown visual patterns as well as spatial neighboring relationships between clustered kidney structure patterns that could predict patient outcomes. Table 1 summarizes a detailed description of each visual pattern group and its importance for predicting eGFR at the biopsy and in one year. The important visual patterns were established using the Gini index, which measures the probability that a randomly selected sample would be classified incorrectly if it were labeled at random according to the class distribution.

A graph for the GNN model is a data structure consisting of two components, nodes and edges, and is used to analyze the pair-wise relationships between objects or entities. In this study, the graph representation used as input to the DGCNN was created from nodes, which are the clustered visual patterns spatially located on the biopsy images.
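Turning a color-coded cluster map into nodes and edges can be sketched as below. The sketch rests on assumptions that the source does not state: patches are indexed by (row, column) on a regular grid, and two patches are connected when they touch in the 8-neighborhood. It also uses the cluster index as the only node attribute, whereas the study derives node features from MDS. The function name is hypothetical.

```python
def build_patch_graph(label_grid):
    """Build nodes and undirected edges from a sparse grid of cluster labels.

    label_grid maps (row, col) -> cluster index for patches that contain tissue.
    Returns (node_labels, edges), where edges are pairs of node indices.
    """
    nodes = sorted(label_grid)
    index = {pos: i for i, pos in enumerate(nodes)}
    edges = set()
    # Checking only four of the eight directions visits each undirected
    # neighbor pair exactly once.
    for (r, c) in nodes:
        for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
            neighbor = (r + dr, c + dc)
            if neighbor in index:
                edges.add((index[(r, c)], index[neighbor]))
    node_labels = [label_grid[pos] for pos in nodes]
    return node_labels, sorted(edges)
```

The edge list can be converted to an adjacency matrix, and `node_labels` replaced by any per-node feature vector, before passing the graph to a GNN library.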

Each case has multiple biopsy cores. The DGCNN outputs a class prediction for each core, and the patient-level prediction was obtained with a normalized area-weighted score (aw-score), which weights each core's class prediction by the normalized size of that core. We performed a threefold cross validation, stratified by patients, on the DGCNN; Tables 2 and 3 list the normalized area and DGCNN prediction for each core and the aw-score for each case. These scores were obtained from the test sets of the 3 different models (threefold cross validation).

A histogram representation of clusters for each patient was created to describe the distribution of each type of cluster at the patient level. This cluster frequency information from the histogram gives us the cluster frequency features for each patient. Investigating these visual patterns for each case can give us information to find previously unknown features that predict patient outcomes (Table 1). Some visual pattern groups match well with distinct microscopic kidney structures: visual pattern group #2 (blue nodes) is the glomerular structure, which is the most important visual pattern for predicting kidney function at the biopsy and visual pattern group #5 is arterioles with some white space. Visual pattern group #7 does not exactly match with distinctive microscopic kidney structures but contains both normal and near normal tubulointerstitial (TI) and some interstitial areas, which are the most important visual patterns for predicting kidney function in one year. Figure 4 shows two examples of biopsy samples; one (right) has complex heterogeneous visual patterns and one (left) has relatively few and distinctive visual patterns. Through our CluSA framework, we can assess complex heterogeneous visual patterns of biopsy samples not only through their quantities in CKD patient tissue but also through their spatial configuration in the tissue.

Figure 4. Two examples of biopsy samples. (a,e) Two examples of biopsy samples; (b,f) color-coded nodes from clustered visual pattern groups; (c,g) zoom images; (d,h) histogram representation of clusters with fourth-degree polynomial fitting curves for both cases, respectively.

In addition to the individual frequency or occurrence of visual patterns on the histogram, the polynomial fitting on the histogram provided overall information about all histogram cluster frequency features (Fig. 4d,h; Supplementary Fig. S3). In this study, we used a machine learning classifier to incorporate all features, such as histogram frequency features, polynomial coefficient features, aw-scores from DGCNN, and clinical features such as age, race, gender, and diagnosis, to predict patient outcomes. The detailed clinical features including patient diagnosis and demographics are shown in Supplementary Table S1. We used a random forest model as a classifier to calculate AUC and to assess the association of visual patterns and features with clinical patient outcomes such as eGFR. Tables 4 and 5 show the ranking of the important features for the dichotomized level of kidney function at the biopsy and in one year, respectively. We used additional clinical features, eGFR and UPC at the biopsy, for the prediction of eGFR changes in one year. These feature importances were computed using the Gini index, also known as Gini impurity, which measures the probability that a randomly selected sample would be classified incorrectly if it were labeled at random according to the class distribution.
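Gini impurity for a set of class labels is 1 minus the sum of squared class proportions. A tree split is scored by how much it lowers the size-weighted impurity of the child nodes relative to the parent, and a random forest's Gini importance for a feature accumulates these decreases over the splits that use it. The sketch below illustrates the two quantities. The function names are hypothetical, and the source does not say which random forest implementation produced its rankings.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum over classes of (class proportion squared)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity reduction achieved by splitting parent into left and right."""
    n = len(parent)
    if n == 0 or len(left) + len(right) != n:
        raise ValueError("left and right must partition a non-empty parent")
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children
```

A balanced two-class node has impurity 0.5, and a split that separates the classes perfectly recovers that full 0.5 as its decrease.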

We selected the top 7 most important features based on the importance index rank (Tables 4, 5): one spatial feature, three frequency features (f2, f5, and f8), one polynomial feature (c1), and two clinical features (age and diagnosis) for predicting eGFR at the biopsy; and one spatial feature, three frequency features (f4, f6, and f8), two polynomial features (c2 and c4), and one clinical feature (age) for predicting eGFR changes in 1 year. Selecting the top 7 features ensured that all four categories of feature types were included in our analysis. For predicting eGFR at the biopsy, the error from the random forest was 0.05, the sensitivity was 0.97, and the specificity was 0.90. The ROC for the top 7 features is illustrated in Fig. 5a. The AUC was 0.96 and the 95% confidence interval was 0.89–1.0. The accuracy was 0.95.

Figure 5. ROC curves and AUC values. ROC curves for the prediction of the level of kidney function (a) at the biopsy and (b) in the future. Top7 represents the top 7 features selected based on the importance rank. The x-axis is the true negative rate (TNR) or specificity and the y-axis is the true positive rate (TPR) or sensitivity.

For predicting whether eGFR increased or decreased in one year, the error from the random forest was 0.16; sensitivity, 0.83; specificity, 0.85; and accuracy, 0.84. The ROC for the top 7 features is illustrated in Fig. 5b. The area under the ROC curve (AUC) was 0.85 and the 95% confidence interval was 0.76–0.95. The accuracies were calculated using Eq. (4) for this model,

\(\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}\)  (4)

where TP, FP, TN, and FN represent true-positive, false-positive, true-negative, and false-negative predictions, respectively. The detailed confusion matrices, AUC, 95% confidence intervals (CI), and accuracies are shown in Supplementary Tables S2 and S3. Based on the results, the spatial feature of neighborhood information between clustered visual patterns from the graph neural network was the most important feature associated with the prediction of kidney function at the biopsy as well as in one year. The accuracy and AUC for all combined features were higher than those for each type of feature alone. The accuracy and AUC for each type of feature are summarized in Table 6.
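Sensitivity, specificity, and accuracy all follow from the four confusion-matrix counts by their standard definitions, as sketched below. The function name is hypothetical, and the counts in the usage note are made up for illustration rather than taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both actual classes must be present")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

For example, hypothetical counts of 8 true positives, 1 false positive, 9 true negatives, and 2 false negatives correspond to 17 correct predictions out of 20.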

In histopathology image analysis, artificial intelligence and machine learning methods have been used in computer-aided studies to solve diagnostic decision-making problems. Most of the machine learning methods applied to histopathology slides have relied on fully supervised learning and pixel-level expert annotations to extract features or train a model48,49, although some researchers have tried to reduce the labeling effort by using weakly supervised learning and semi-supervised learning for the classification tasks50,51. However, a deep learning segmentation model requires significant labeling effort, which is very time-consuming and often impractical for histopathology images because of their large size and high resolution. Labeling microstructures or regions of interest (ROIs) on histopathology images also requires domain knowledge of those microstructures. Further, the model’s results depend on the quality of labels in the training set, which could involve human error in manual labeling. On the other hand, unsupervised learning does not require labeled data, and the model learns from raw data without any prior knowledge. In addition, it can discover previously unknown patterns in the data. However, unsupervised learning has some disadvantages as well, such as the difficulty of measuring accuracy or effectiveness due to the lack of predefined answers during training. Also, a typical disadvantage of clustering algorithms is that they do not consider spatial relationships in the data.

We developed a computational framework that uses unsupervised learning to overcome the burden of manual labeling and supervised machine learning to incorporate the spatial relationships between visual patterns. Unsupervised clustering, which does not require labeled data, groups the image patches, and the cluster indices are then used to find the original location of each clustered patch on the biopsy sample image. Also, clustering-based analysis has no specific sample size limitation. As shown in Fig. 3, each clustered color-coded patch can be considered a node, and a graph representation was obtained from these nodes for the spatial analysis. In this study, we identified the most important features among four groups: the normalized aw-score feature, which contains spatial information between neighboring image patterns; cluster frequency features, which quantify the amount of each clustered image pattern or node within a case; polynomial fit coefficient features, which provide overall information or a trend about all cluster frequency features; and clinical features, which include age, race, and diagnosis. Feature importance was computed with the Gini index, or Gini impurity. The most important feature was the normalized aw-score feature obtained from the graph deep learning model, for predictions both at the biopsy and in one year. This suggests that the spatial pattern of neighboring image patterns or fibrosis, which is a characteristic feature of all forms of CKD, could be an important factor to be considered for the level of kidney function in CKD. To our knowledge, CluSA is the first study in which unsupervised machine learning has been used to cluster morphologic visual patterns and assess the spatial neighborhood relationship between clustered visual patterns to predict kidney function in CKD.

Our retrospective study has several limitations. First, we fixed the patch size at 256 × 256 pixels based on the input image size required by the pretrained deep learning model. However, other image sizes with rescaling, or with some degree of overlap between adjacent patches, could be investigated in a future study. Second, although k-means clustering is one of the popular unsupervised learning methods for clustering unlabeled data into k clusters, identifying an optimal number of clusters in a dataset is a fundamental issue, and there is no definitive answer as to the true number of clusters. To determine the optimal number of clusters in k-means clustering, we used one of the most popular algorithms, silhouette, which measures the quality of a clustering. Its value indicates how similar an object is to its own cluster compared to other clusters. However, the effects of the number of clusters on the clustering performance should be explored in a future study with different patch sizes. Similarly, to ensure generalizability of the study, future studies are needed that examine more systematically the effects of stain color normalization, the optimal number of clusters, and various stains such as H&E and PAS. Also, we utilized a pre-trained ResNet-18 for feature extraction in this study. The choice of pre-trained neural network may affect the performance of feature extraction, and the effects of using different networks for feature extraction should be explored in a future study. Lastly, drugs introduced in recent years, including RAAS inhibitors and SGLT2 inhibitors, may affect the change or rate of change in eGFR after treatment52. In this study, we did not take drug effects into account; the effects of medication should be explored in a future study with more cases, using eGFR or other accurate measures of kidney function (e.g., pathology evaluation of the disease severity) as outcomes53.

Previously, we have shown that unsupervised machine-learned clustering features are potential surrogates for predicting eGFR and can be used as tools for prognosis as well as for objective assessment of the level of kidney function in CKD47. In the present study, our results demonstrate that the addition of spatial information improves the AUC by 2.4% for the prediction at the biopsy and by 5.1% for the one-year prediction, compared to the previous study. Furthermore, we identified that the aw-score, which is derived from the GNN model outputs, is the most important feature for predicting patient outcomes. The clustering of visual patterns enables pathologists to inspect these key image segments for clinically significant data. In contrast to traditional deep learning approaches, in which an algorithm learns from data labeled by a pathologist with known pathologic features for classifying disease, this unsupervised approach via CluSA automatically identifies the most discriminative features, some of which may be new, for understanding and prognosticating disease. Importantly, our framework can find visual patterns of the kidney tissue that correspond to patient outcomes without human input and can predict future kidney function. Although further study is required for complex disease analysis, our computational CluSA framework may offer higher execution speed and accuracy and incorporates spatial information while minimizing the need for time-consuming, impractical expert annotations.

In this study, we showed that identifying morphological characteristics through clustering, together with the spatial relationships between them, can not only remove the burden of obtaining manually labeled training datasets but also provide interpretability in the form of spatial visualizations of predictive features. The results also indicate that the spatial relationship between visual patterns obtained from unsupervised machine learning is the most important feature for predicting outcomes. Our objective computational CluSA framework will be useful for discriminating levels of kidney function as well as other diseases in digital histopathology image analysis. Because clustering-based analysis has no sample size limitation, the CluSA framework is of practical use with relatively small datasets and could help in decision making during follow-up.

All data associated with this study are in the paper or the Supplementary Material. The code and materials used in the analysis are available on GitHub (https://github.com/aznetz/BoSVW), and the datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

CDC. Centers for Disease Control and Prevention. Chronic Kidney Disease Surveillance System website. https://nccd.cdc.gov/CKD. Accessed 8 June 2020.

Romagnani, P. et al. Chronic kidney disease. Nat. Rev. Dis. Primers 3, 17088. https://doi.org/10.1038/nrdp.2017.88 (2017).

Gansevoort, R. T. et al. Lower estimated GFR and higher albuminuria are associated with adverse kidney outcomes. A collaborative meta-analysis of general and high-risk population cohorts. Kidney Int. 80, 93–104. https://doi.org/10.1038/ki.2010.531 (2011).

Qaseem, A. et al. Screening, monitoring, and treatment of stage 1 to 3 chronic kidney disease: A clinical practice guideline from the American College of Physicians. Ann. Intern. Med. 159, 835–847. https://doi.org/10.7326/0003-4819-159-12-201312170-00726 (2013).

da Silva Selistre, L. et al. Diagnostic performance of creatinine-based equations for estimating glomerular filtration rate in adults 65 years and older. JAMA Intern. Med. 179, 796–804. https://doi.org/10.1001/jamainternmed.2019.0223 (2019).

Tangri, N. et al. A predictive model for progression of chronic kidney disease to kidney failure. JAMA 305, 1553–1559. https://doi.org/10.1001/jama.2011.451 (2011).

Levey, A. S. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150, 604–612. https://doi.org/10.7326/0003-4819-150-9-200905050-00006 (2009).

Nath, K. A. Tubulointerstitial changes as a major determinant in the progression of renal damage. Am. J. Kidney Dis. 20, 1–17. https://doi.org/10.1016/s0272-6386(12)80312-x (1992).

Bhargava, R. & Madabhushi, A. Emerging themes in image informatics and molecular analysis for digital pathology. Annu. Rev. Biomed. Eng. 18, 387–412. https://doi.org/10.1146/annurev-bioeng-112415-114722 (2016).

Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567. https://doi.org/10.1038/s41591-018-0177-5 (2018).

Bueno, G., Fernandez-Carrobles, M. M., Gonzalez-Lopez, L. & Deniz, O. Glomerulosclerosis identification in whole slide images using semantic segmentation. Comput. Methods Programs Biomed. 184, 105273. https://doi.org/10.1016/j.cmpb.2019.105273 (2020).

Kannan, S. et al. Segmentation of glomeruli within trichrome images using deep learning. Kidney Int. Rep. 4, 955–962. https://doi.org/10.1016/j.ekir.2019.04.008 (2019).

Kolachalama, V. B. et al. Association of pathological fibrosis with renal survival using deep neural networks. Kidney Int. Rep. 3, 464–475. https://doi.org/10.1016/j.ekir.2017.11.002 (2018).

Sudharshan, P. J. et al. Multiple instance learning for histopathological breast cancer image classification. Expert Syst. Appl. 117, 103–111. https://doi.org/10.1016/j.eswa.2018.09.049 (2019).

Vu, T. et al. A novel attribute-based symmetric multiple instance learning for histopathological image analysis. IEEE Trans. Med. Imaging 39, 3125–3136. https://doi.org/10.1109/Tmi.2020.2987796 (2020).

Xu, Y., Zhu, J. Y., Chang, E. I., Lai, M. & Tu, Z. Weakly supervised histopathology cancer image segmentation and classification. Med. Image Anal. 18, 591–604. https://doi.org/10.1016/j.media.2014.01.010 (2014).

Kanavati, F. et al. Weakly-supervised learning for lung carcinoma classification using deep learning. Sci. Rep. 10, 9297. https://doi.org/10.1038/s41598-020-66333-x (2020).

van der Laak, J., Litjens, G. & Ciompi, F. Deep learning in histopathology: the path to the clinic. Nat. Med. 27, 775–784. https://doi.org/10.1038/s41591-021-01343-4 (2021).

Schapiro, D. et al. histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat. Methods 14, 873–876. https://doi.org/10.1038/nmeth.4391 (2017).

Heindl, A., Nawaz, S. & Yuan, Y. Mapping spatial heterogeneity in the tumor microenvironment: a new era for digital pathology. Lab. Invest. 95, 377–384. https://doi.org/10.1038/labinvest.2014.155 (2015).

Zhang, Z. et al. Graph neural network approaches for drug-target interactions. Curr. Opin. Struct. Biol. 73, 102327. https://doi.org/10.1016/j.sbi.2021.102327 (2022).

Xuan, P., Pan, S., Zhang, T., Liu, Y. & Sun, H. Graph convolutional network and convolutional neural network based method for predicting lncRNA-disease associations. Cells 8, 25. https://doi.org/10.3390/cells8091012 (2019).

Peng, H. et al. Large-scale hierarchical text classification with recursively regularized deep graph-CNN. In Web Conference 2018: Proceedings of the World Wide Web Conference (Www2018), 1063–1072. https://doi.org/10.1145/3178876.3186005 (2018).

Veličković, P. et al. Graph attention networks. (2017).

Lopez, C., Tucker, S., Salameh, T. & Tucker, C. An unsupervised machine learning method for discovering patient clusters based on genetic signatures. J. Biomed. Inform. 85, 30–39. https://doi.org/10.1016/j.jbi.2018.07.004 (2018).

Levey, A. S. & Stevens, L. A. Estimating GFR using the CKD epidemiology collaboration (CKD-EPI) creatinine equation: more accurate GFR estimates, lower CKD prevalence estimates, and better risk predictions. Am. J. Kidney Dis. 55, 622–627. https://doi.org/10.1053/j.ajkd.2010.02.337 (2010).

Reinhard, E., Ashikhmin, N., Gooch, B. & Shirley, P. Color transfer between images. IEEE Comput. Graph. 21, 34–41. https://doi.org/10.1109/38.946629 (2001).

Routray, S., Ray, A. K. & Mishra, C. Analysis of various image feature extraction methods against noisy image: SIFT, SURF and HOG. In Proceedings of the 2017 IEEE Second International Conference on Electrical, Computer and Communication Technologies (Icecct) (2017).

Kumar, G. & Bhatia, P. K. A detailed review of feature extraction in image processing systems. Int. C Adv. Comput. Comput. https://doi.org/10.1109/Acct.2014.74 (2014).

Cheplygina, V., de Bruijne, M. & Pluim, J. P. W. Not-so-supervised: A survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med. Image Anal. 54, 280–296. https://doi.org/10.1016/j.media.2019.03.009 (2019).

Liu, S. P., Tian, G. H. & Xu, Y. A novel scene classification model combining ResNet based transfer learning and data augmentation with a filter. Neurocomputing 338, 191–206. https://doi.org/10.1016/j.neucom.2019.01.090 (2019).

Morid, M. A., Borjali, A. & DelFiol, G. A scoping review of transfer learning research on medical image analysis using ImageNet. Comput. Biol. Med. 128, 104115. https://doi.org/10.1016/j.compbiomed.2020.104115 (2021).

Shin, H. C. et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35, 1285–1298. https://doi.org/10.1109/TMI.2016.2528162 (2016).

van Opbroek, A., Ikram, M. A., Vernooij, M. W. & de Bruijne, M. Transfer learning improves supervised image segmentation across imaging protocols. IEEE Trans. Med. Imaging 34, 1018–1030. https://doi.org/10.1109/TMI.2014.2366792 (2015).

Christopher, M. et al. Performance of deep learning architectures and transfer learning for detecting glaucomatous optic neuropathy in fundus photographs. Sci. Rep. 8, 16685. https://doi.org/10.1038/s41598-018-35044-9 (2018).

Pratt, L. Y. Advances in neural information processing systems, p. 204–11.

Chen, L. C. E., Zhu, Y. K., Papandreou, G., Schroff, F. & Adam, H. Encoder–decoder with atrous separable convolution for semantic image segmentation. Lect. Notes Comput. Sci. 11211, 833–851. https://doi.org/10.1007/978-3-030-01234-2_49 (2018).

He, K. M., Zhang, X. Y., Ren, S. Q. & Sun, J. Deep residual learning for image recognition. Proc. Cvpr IEEE. https://doi.org/10.1109/Cvpr.2016.90 (2016).

Russakovsky, O. et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115, 211–252. https://doi.org/10.1007/s11263-015-0816-y (2015).

He, K., Zhang, X., Ren, S. & Sun, J. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778.

Zhang, M. H., Cui, Z. C., Neumann, M. & Chen, Y. X. An end-to-end deep learning architecture for graph classification. In Thirty-Second Aaai Conference on Artificial Intelligence/Thirtieth Innovative Applications of Artificial Intelligence Conference/Eighth Aaai Symposium on Educational Advances in Artificial Intelligence, 4438–4445 (2018).

Zheng, X. Q., Tao, Y. F., Zhang, R. K., Yang, W. M. & Liao, Q. M. TimNet: A text-image matching network integrating multi-stage feature extraction with multi-scale metrics. Neurocomputing 465, 540–548. https://doi.org/10.1016/j.neucom.2021.09.001 (2021).

Keshta, I. et al. Multi-stage biomedical feature selection extraction algorithm for cancer detection. Sn Appl. Sci. https://doi.org/10.1007/s42452-023-05339-2 (2023).

Bradley, A. P. The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern Recogn. 30, 1145–1159. https://doi.org/10.1016/S0031-3203(96)00142-2 (1997).

Khatun, M. S., Shoombuatong, W., Hasan, M. M. & Kurata, H. Evolution of sequence-based bioinformatics tools for protein-protein interaction prediction. Curr. Genom. 21, 454–463. https://doi.org/10.2174/1389202921999200625103936 (2020).

Khatun, M. S. et al. Recent development of bioinformatics tools for microRNA target prediction. Curr. Med. Chem. https://doi.org/10.2174/0929867328666210804090224 (2021).

Lee, J. et al. Unsupervised machine learning for identifying important visual features through bag-of-words using histopathology data from chronic kidney disease. Sci. Rep. 12, 4832. https://doi.org/10.1038/s41598-022-08974-8 (2022).

Bouteldja, N. et al. Deep Learning-based segmentation and quantification in experimental kidney histopathology. J. Am. Soc. Nephrol. 32, 52–68. https://doi.org/10.1681/ASN.2020050597 (2021).

Kim, Y. et al. A deep learning approach for automated segmentation of kidneys and exophytic cysts in individuals with autosomal dominant polycystic kidney disease. J. Am. Soc. Nephrol. 33, 1581–1589. https://doi.org/10.1681/ASN.2021111400 (2022).

Zhou, Z. H. A brief introduction to weakly supervised learning. Natl. Sci. Rev. 5, 44–53. https://doi.org/10.1093/nsr/nwx106 (2018).

Kingma, D. P., Rezende, D. J., Mohamed, S. & Welling, M. Semi-supervised learning with deep generative models. In Advances in Neural Information Processing Systems 27 (Nips 2014) 27 (2014).

Zhang, F. et al. Effects of RAAS inhibitors in patients with kidney disease. Curr. Hypertens. Rep. https://doi.org/10.1007/s11906-017-0771-9 (2017).

Bjornstad, P., Karger, A. B. & Maahs, D. M. Measured GFR in routine clinical practice-the promise of dried blood spots. Adv. Chron. Kidney Dis. 25, 76–83. https://doi.org/10.1053/j.ackd.2017.09.003 (2018).

We would like to thank all the staff of the J.B.H. lab for their contributions to this study. This work was supported by the Department of Defense W81XWH2210032 (to J.L.), Department of Defense W81XWH2010436 (to J.B.H. & A.R.), and NCI Grant R37-CA214955 (to A.R.).

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA

Joonsang Lee, Elisa Warner, Xin Zhang & Arvind Rao

Department of Pathology, University of Michigan, Ann Arbor, MI, USA

Jharna Saha, Yingbao Yang, Jinghui Luo & Jeffrey B. Hodgin

Department of Internal Medicine, Nephrology, University of Michigan, Ann Arbor, MI, USA

Salma Shaikhouni, Markus Bitzer, Matthias Kretzler, Subramaniam Pennathur & Laura Mariani

Department of Pediatrics, Pediatric Nephrology, University of Michigan, Ann Arbor, MI, USA

Debbie Gipson

Department of Internal Medicine, Nephrology, St. Clair Nephrology Research, Detroit, MI, USA

Keith Bellovich

Department of Internal Medicine, Nephrology, Wayne State University, Detroit, MI, USA

Zeenat Bhat

Department of Internal Medicine, Nephrology, Cleveland Clinic, Cleveland, OH, USA

Crystal Gadegbeku

Department of Pediatrics, Pediatric Nephrology, Levine Children’s Hospital, Charlotte, NC, USA

Susan Massengill

Department of Internal Medicine, Nephrology, Department of JH Stroger Hospital, Chicago, IL, USA

Kalyani Perumal

Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA

Arvind Rao

Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, USA

Arvind Rao

Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, USA

Arvind Rao

Project conception and design were by J.L., J.B.H., and A.R. The data collection and preprocessing were performed by J.L., E.W., S.S., M.B., M.K., D.G., S.P., K.B., Z.B., C.G., S.M., K.P., J.S., Y.Y., J.L., X.Z., L.M., and J.B.H. The software programming, statistical analysis, and interpretation were performed by J.L. The manuscript was written by J.L. and all authors reviewed the manuscript.

Correspondence to Joonsang Lee, Jeffrey B. Hodgin or Arvind Rao.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Lee, J., Warner, E., Shaikhouni, S. et al. Clustering-based spatial analysis (CluSA) framework through graph neural network for chronic kidney disease prediction using histopathology images. Sci Rep 13, 12701 (2023). https://doi.org/10.1038/s41598-023-39591-8

Received: 21 April 2023

Accepted: 27 July 2023

Published: 05 August 2023

DOI: https://doi.org/10.1038/s41598-023-39591-8
