Study population
After excluding individuals with missing information on relevant covariates and those with a diagnosis of sudden cardiac arrest (SCA) before baseline, the exploration cohort consisted of 349,648 participants from the UK Biobank (UKB) (Supplementary Fig. 1). Data for the validation cohort were retrieved from the Changsha cohort, an ongoing, large-scale longitudinal cohort study in Hunan, China34,35. Mortality information for participants in the Changsha cohort was obtained from the National Mortality Surveillance System (NMSS), which is operated by the Centre of Disease Control and Prevention in Hunan Province. The unique ID number in each resident’s health record was matched against the NMSS to obtain the time of death, cause of death, and International Classification of Diseases 10th Edition (ICD-10) code. Because death information in the NMSS was tracked from 1 January 2009 to 24 February 2021, we included the 129,279 participants from the Changsha cohort who were aged ≥18 years and had complete information on important covariates (age, sex), renal function tests, and mortality tracked from 1 January 2009 to 31 December 2020. In addition, to ensure that the renal function tests were sufficiently robust, we excluded individuals in the Changsha cohort who had data from only one physical examination (Supplementary Fig. 2). The proteomic analysis was replicated with data from participants in the Framingham Offspring Study. We included participants with CKD who attended the seventh Heart Study visit and underwent proteomic testing. Detailed descriptions of the UKB, the Changsha cohort, and the Framingham Offspring Study are given in the Supplementary Methods, Section 1. All participants provided written informed consent.
The UKB study protocol was approved by the North West Multi-centre Research Ethics Committee (11/NW/0382), and the study in the Changsha cohort was approved by the institutional review board of the Third Xiangya Hospital of Central South University (no. 23309).
CKD definition
We calculated the baseline eGFR from serum creatinine using the 2021 CKD-EPI race-free equation36, and the urine albumin–creatinine ratio (uACR) from urine creatinine and microalbumin (as detailed in the Supplementary Methods, Section 2). Individuals in the exploration cohort were considered to have CKD if they met any of the following criteria: 1) a diagnosis of CKD according to the ICD-10 codes; 2) a baseline eGFR < 60 mL/min/1.73 m²; or 3) a baseline uACR ≥ 30 mg/g. Owing to limited data availability, in the validation cohort and the Framingham Offspring cohort we considered only participants with an eGFR < 60 mL/min/1.73 m² to have CKD. To explore the risk of SCD across CKD stages, we divided the participants into four groups on the basis of the ICD-10 codes and the eGFR level of the patients with CKD: non-CKD, CKD stages 1–3, CKD stage 4, and CKD stage 5 (Supplementary Information).
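As an illustration of the definitions above, the following minimal Python sketch computes the eGFR with the published 2021 CKD-EPI creatinine equation and applies the three exploration-cohort criteria. The function names and input units are assumptions made for this example (serum creatinine in mg/dL, urine albumin in mg/L, urine creatinine in g/L); the exact procedure used in the study is described in the Supplementary Methods, Section 2, and may differ in detail.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """eGFR (mL/min/1.73 m^2) from the 2021 CKD-EPI race-free creatinine equation.

    Assumes serum creatinine in mg/dL (divide umol/L by 88.4 to convert).
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    return egfr * 1.012 if female else egfr


def uacr_mg_per_g(urine_albumin_mg_l: float, urine_creatinine_g_l: float) -> float:
    """Urine albumin-creatinine ratio in mg/g (input units are an assumption)."""
    return urine_albumin_mg_l / urine_creatinine_g_l


def has_ckd(icd10_ckd: bool, egfr: float, uacr: float) -> bool:
    """Exploration-cohort rule: any one of the three criteria is sufficient."""
    return bool(icd10_ckd or egfr < 60.0 or uacr >= 30.0)


if __name__ == "__main__":
    # Hypothetical participant: 65-year-old woman, creatinine 1.1 mg/dL.
    egfr = egfr_ckd_epi_2021(1.1, 65, female=True)
    uacr = uacr_mg_per_g(12.0, 1.0)
    print(round(egfr, 1), uacr, has_ckd(False, egfr, uacr))
```

In the validation and Framingham Offspring cohorts, only the eGFR criterion would apply, which corresponds to calling `has_ckd(False, egfr, 0.0)` in this sketch.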
Outcome definitions
For the exploration cohort, the outcome of this study was SCD (defined according to the relevant ICD-10 codes; Supplementary Methods, Section 3), including both patients who were diagnosed with SCA and those who died as a result of SCD. The participants were followed from baseline until the date of first SCA, death, or the last follow-up (31 October 2022). In the validation cohort, the outcome was mortality due to SCD. Patients for whom the primary or secondary cause of death was listed as SCD or cardiac arrest were considered to have SCD. These participants were followed from baseline until the date of death or the last follow-up (31 December 2020). In the proteomic analysis replication cohort (Framingham Offspring Study), CVD mortality was used as the primary outcome of interest because of the limited sample size.
Data collection
In the exploration cohort, information on socioeconomic status (average household income) and lifestyle factors (smoking history, alcohol use, and use of medications for cholesterol, blood pressure, and diabetes) was collected with a self-administered questionnaire. Information on age, sex, ethnicity, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), total cholesterol (TC), and serum creatinine was collected by trained nurses during the baseline assessment visit at the corresponding centre. A history of hypertension, diabetes, and hyperlipidaemia at baseline was determined from the corresponding ICD-10 codes. Detailed descriptions of the covariates of these cohorts are provided in the Supplementary Information.
Plasma proteomics
In the original UKB cohort, over 50,000 participants were randomly selected to undergo proteomic profiling of blood plasma samples acquired at baseline from 2007 to 2010. The plasma samples were transported to the Olink Analysis Service in Sweden, where 2923 unique proteins were measured using the antibody-based Olink Explore 3072 proximity extension assay (PEA)37,38,39. The proteins measured spanned four panels: cardiometabolic, inflammation, neurology, and oncology. Stringent quality control ensured inter- and intra-panel coefficients of variation below 20% and 10%, respectively. After quality control, protein levels were converted into normalized protein expression (NPX) values. For each protein, values more than five median absolute deviations from the median were removed from our analyses. We ultimately included the data of 52,687 UKB participants who underwent complete baseline plasma proteomic profiling. In the proteomic analysis replication cohort, stored EDTA plasma samples acquired at the seventh (1998–2001) Heart Study visit of the Framingham Offspring Study were similarly sent to the Olink Analysis Service for proteomics40. These fasting plasma samples (stored at −80 °C) were analyzed for protein biomarkers using the antibody-based Olink Explore PEA. Protein expression levels are reported as NPX values, where a one-unit increase represents a twofold change in protein concentration.
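The outlier rule and the interpretation of the NPX scale described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the median absolute deviation is used unscaled (the text does not state whether a consistency constant such as 1.4826 was applied), and the handling of a zero deviation is an assumption.

```python
from statistics import median


def remove_mad_outliers(values, n_mads=5.0):
    """Keep values within n_mads median absolute deviations of the median.

    Assumption: unscaled MAD; if the MAD is zero, nothing is removed.
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return values
    return [v for v in values if abs(v - med) <= n_mads * mad]


def npx_fold_change(delta_npx):
    """NPX is on a log2 scale, so a difference of d units is a 2**d-fold change."""
    return 2.0 ** delta_npx


if __name__ == "__main__":
    npx = [1.0, 1.1, 0.9, 1.2, 0.8, 10.0]  # made-up NPX values for one protein
    print(remove_mad_outliers(npx))        # the value 10.0 is dropped
    print(npx_fold_change(1.0))            # one NPX unit corresponds to a doubling
```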
Statistical analysis
The baseline characteristics of the CKD and non-CKD participants are presented as medians with interquartile ranges (IQRs) for continuous variables and as frequencies (percentages) for categorical variables. Differences between the groups were assessed with the χ² test or the Mann‒Whitney U test, as appropriate.
To assess outcome disparities between participants with and without CKD, we performed stratified Kaplan‒Meier (KM) analysis followed by the log-rank test to compare KM curves. Cox proportional hazards models were subsequently constructed to estimate the association of CKD and CKD stage with the risk of incident SCD. In the exploration cohort, two models were constructed: one adjusted for age, sex, and ethnicity (Model 1) and another adjusted for age, sex, and ethnicity plus obesity, lifestyle factors, socioeconomic status, history of hypertension, diabetes, hyperlipidaemia, and CVD, use of medication for cholesterol, blood pressure, and diabetes, and eGFR (Model 2). In the validation cohort, Model 1 was adjusted for age and sex (since >90% of the validation cohort was Han Chinese, adjusting for ethnicity would have served no purpose), and Model 2 was further adjusted for obesity, lifestyle factors, history of hypertension, diabetes, and hyperlipidaemia, use of medication for cholesterol, blood pressure, and diabetes, and eGFR. In addition, to reduce the potential impact of missed SCD, we conducted an additional analysis in the exploration cohort using an expanded set of ICD-10 codes proposed by previous researchers41. We also separately restricted the outcome to death due to SCD and to diagnosis of SCA in Cox proportional hazards analyses in the exploration cohort.
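For readers less familiar with the KM analysis mentioned above, the product-limit estimator can be written as a short, dependency-free Python sketch. The function name and the toy data are hypothetical; the study itself used Stata and R, and this sketch covers only the survival curve, not the log-rank test or the Cox models.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival estimate.

    times  : follow-up time for each participant
    events : 1 if the outcome occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all participants whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set.
        n_at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Toy data: four participants, one censored at time 2.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Computing one such curve per group (for example, CKD versus non-CKD) gives the stratified curves that the log-rank test then compares.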
Additionally, we conducted stratified analyses to investigate the potential effects of age (≤60 years or >60 years), sex (male or female), hypertension (yes or no), diabetes (yes or no), hyperlipidaemia (yes or no), and a history of CVD (yes or no). To ensure the reliability of our findings, we conducted several separate sensitivity analyses in the exploration cohort: (1) excluding participants with CVD; (2) excluding participants with preexisting conditions such as hyperlipidaemia, diabetes, or hypertension; (3) excluding participants with diagnosed ESRD or who were dependent on dialysis; (4) excluding individuals who experienced either outcome within the first year after baseline; (5) excluding individuals who experienced either outcome within 3 years after baseline; (6) changing the confounders in the final model; and (7) restricting the definition of CKD to a diagnosis in the medical records and the eGFR.
Circulating protein analysis
A three-step analysis combined with observational analysis was conducted in the 36,530 participants of the exploration cohort with available circulating protein data. First, we examined the differentially expressed proteins between participants without CKD (N = 34,250) and patients with CKD (N = 2280) using the t-test. If a protein was absent in >50% of the population, the mean value of that protein was used. If the protein was absent in >80% of the population, it was excluded from the analysis. We also constructed a linear regression model and a Cox proportional hazards model to separately identify CKD- and SCD-related proteins. We then intersected the three sets of proteins to obtain our initial list of proteins of interest. Second, we applied a penalized Cox proportional hazards model with adaptive LASSO, using the minimum mean cross-validated error criterion, to select the optimal biomarkers among the overlapping significant proteins in CKD patients with data on all overlapping proteins. Finally, we constructed Cox proportional hazards models with the data from these CKD patients to examine the associations of the candidate proteins selected by the LASSO regression with the risk of SCD. Moreover, in CKD patients with available protein data in the proteomic analysis replication cohort, we constructed Cox proportional hazards models to validate the candidate proteins identified in the exploration cohort.
All statistical analyses were conducted using Stata software version 17.0 and R version 4.4.1. A two-sided P value < 0.05 was considered statistically significant. In the analysis of circulating proteins, to account for multiple testing, statistical significance was instead defined as a Benjamini–Hochberg false discovery rate (FDR)-adjusted P value < 0.05 and a Bonferroni-corrected P value < 0.05/2923.
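The two multiple-testing corrections named above can be illustrated with a short Python sketch. The function names and example P values are hypothetical; the Benjamini–Hochberg adjustment shown follows the standard step-up definition (as in R's `p.adjust(method = "BH")`), and the Bonferroni threshold uses the 2923 proteins stated in the text.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR-adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def bonferroni_threshold(alpha=0.05, n_tests=2923):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.005]          # made-up P values for four proteins
    print(benjamini_hochberg(p))
    print(bonferroni_threshold())           # about 1.71e-05 for 2923 proteins
```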
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
