Determination of the risk factors for breast cancer survival using the Bayesian method, Yazd, Iran
Vida Pahlevani1, Morteza Mohammadzadeh2, Nima Pahlevani3, Vajiheh Nayeb Zadeh1
1 Department of Biostatistics and Epidemiology, Shahid Sadoughi University of Medical Sciences and Health Services, Yazd, Iran
2 Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
3 Department of Medical Sciences, Kashan University, Kashan, Iran
|Date of Submission||12-Jul-2019|
|Date of Decision||14-Aug-2019|
|Date of Acceptance||28-Aug-2020|
|Date of Web Publication||26-Nov-2021|
Dr. Vajiheh Nayeb Zadeh
Shahid Sadoughi University of Medical Sciences and Health Services, Yazd
Source of Support: None, Conflict of Interest: None
Background: Numerous studies have investigated the risk factors of breast cancer (BC). The purpose of this paper is to use Bayesian modeling to incorporate such prior information when determining the factors affecting the survival of women with BC in Yazd city. Materials and Methods: A checklist covering the characteristics of the patients and the factors studied was prepared. The records of patients with BC who were referred to the Shahid Ramezanzadeh Radiotherapy Center from April 2005 to March 2012 were then reviewed, and the survival of 538 persons was recorded by census. Data were analyzed with R software version 3.4.2, and 0.05 was considered the significance level. Results: The mean age at BC diagnosis was 48.03 ± 11.66 years. The Bayesian Cox regression showed that surgery (hazard ratio [HR] = 1.631, 95% probability interval [PI]: 1.102–2.422), Ki67 (HR = 3.260, 95% PI: 1.6308–6.372), stage (HR = 5.620, 95% PI: 4.079–7.731), lymph node involvement (HR = 1.765, 95% PI: 1.127–2.790), and ER (HR = 2.600, 95% PI: 2.023–3.354) were significantly related to survival time. Conclusion: The parametric and Cox models were compared using the standard error, and the Cox model was selected as the optimal model. Accordingly, stage, Ki67, lymph node involvement, ER, and surgery increased the death hazard.
Keywords: Bayesian method, breast cancer, regression analysis, risk factors, survival analysis
|How to cite this article:|
Pahlevani V, Mohammadzadeh M, Pahlevani N, Nayeb Zadeh V. Determination of the risk factors for breast cancer survival using the Bayesian method, Yazd, Iran. Adv Biomed Res 2021;10:35
|How to cite this URL:|
Pahlevani V, Mohammadzadeh M, Pahlevani N, Nayeb Zadeh V. Determination of the risk factors for breast cancer survival using the Bayesian method, Yazd, Iran. Adv Biomed Res [serial online] 2021 [cited 2022 Jan 23];10:35. Available from: https://www.advbiores.net/text.asp?2021/10/1/35/331280
| Introduction|| |
Cancer is a chronic disease that has accounted for a high proportion of deaths in many societies in recent decades. Breast cancer (BC) is the most frequent malignancy in women around the world. Its incidence and mortality rates have been rising in Asian countries. The available cancer treatments are generally costly and carry multiple complications, and the response to treatment is incomplete in many cases. Furthermore, BC is the most common cause of death in women aged 40–44 years in many developed and developing countries. It is the second leading cause of cancer death after lung cancer. It is estimated that cancer deaths will reach 13.1 million in 2030. However, 40% of deaths from cancer can be prevented by identifying and controlling the risk factors. BC is an abnormal, malignant proliferation of breast tissue cells and is generally divided into two groups: carcinoma in situ (noninvasive) and invasive cancer. According to the latest statistics, the incidence of BC in Iranian women is 27.5/1000 people. The 5-year survival rate in these patients ranges from 48% to 84%, and the overall survival rate is 72%. This cancer is a hormone-related disease involving malignant proliferation of the epithelial cells that line the breast lobules and lactiferous ducts. Sixteen percent of all cancers in Iran are BC, which ranks first among women. In this paper, the tumor markers ER, PR, Ki67, and Her2 were studied. In survival analysis, the time variable is usually called the “survival time” because it gives the length of time that a person has “survived” during the follow-up period. In the majority of medical studies, methods such as Cox regression are used when the purpose is to evaluate the survival distribution.
Cox regression is a much more popular choice than parametric regression because the nonparametric estimate of the hazard function offers much greater flexibility than most parametric approaches. However, the model rests on the proportional hazards (PH) assumption for every predictor variable: the ratio of hazards must remain constant over time. Cox regression may be fitted to the data only if the PH assumption is satisfied. Many studies have used the Cox regression model, but according to a systematic review, only 5% of them considered the PH assumption. In parametric models, survival times follow a known probability distribution such as the Weibull, exponential, lognormal, or log-logistic. Medical researchers use Bayesian methods for survival analysis to deal with limitations such as a small sample size or heavy censoring. The main appeal of the Bayesian method is that it combines previous information with the data, which makes the estimates more accurate and enriches the findings. Optimal decisions can then be made by observing new data and reasoning about the probability distribution. In this paper, the parametric and Cox models are fitted to the data using the Bayesian method, considering different prior distributions and information from meta-analyses. Initially, various parametric survival models, including exponential, Weibull, Gompertz, log-logistic, lognormal, generalized Fisher, and generalized gamma, as well as the Cox model, were fitted to the data by the Bayesian method. The final model was selected based on the Bayesian standard deviation criterion, and the factors affecting the survival of BC patients were determined accordingly.
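The PH assumption can be written compactly. The following is the standard textbook form of the Cox model; the symbols h_0 (baseline hazard), β (coefficient vector), and x_0, x_1 (covariate vectors of two patients) are notation introduced here for illustration and are not taken from the original text.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\mathrm{HR} = \frac{h(t \mid x_1)}{h(t \mid x_0)}
            = \exp\!\left(\beta^{\top}(x_1 - x_0)\right)
```

Because the baseline hazard h_0(t) cancels in the ratio, the HR does not depend on t. This is the property that must hold for each predictor before the Cox model is fitted.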
This study has two goals. The primary goal is to investigate the effect of tumor markers on patients' survival while incorporating prior information about them into the analysis. The second is to find the optimal model among several parametric models and the Cox model from a Bayesian perspective. Bayesian models can combine previous information with the data. Thus, when rich sources of prior information about risk factors are available, this approach can outperform classical statistical models.
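How prior information and data are combined can be illustrated with a small sketch. It assumes a normal prior on a log hazard ratio and a normal approximation to the estimate from the data, which gives a closed-form (conjugate) update. All numbers and the function name `combine_normal` are hypothetical; the study itself fitted its models by MCMC in R, not with this shortcut.

```python
import math


def combine_normal(prior_mean, prior_sd, est, est_se):
    """Precision-weighted normal-normal update on the log-HR scale."""
    w_prior = 1.0 / prior_sd ** 2   # precision of the prior
    w_data = 1.0 / est_se ** 2      # precision of the data estimate
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * est)
    return post_mean, math.sqrt(post_var)


# Hypothetical inputs: a prior from earlier studies and an estimate from new data.
prior_mean, prior_sd = math.log(2.0), 0.30
est, est_se = math.log(1.5), 0.25

post_mean, post_sd = combine_normal(prior_mean, prior_sd, est, est_se)
lower = math.exp(post_mean - 1.96 * post_sd)
upper = math.exp(post_mean + 1.96 * post_sd)
print(f"posterior HR = {math.exp(post_mean):.3f} "
      f"(95% interval {lower:.3f} to {upper:.3f})")
```

The posterior mean falls between the prior mean and the data estimate, and the posterior standard deviation is smaller than either input, which is the sense in which an informative prior can sharpen estimates from a modest sample.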
| Materials and Methods|| |
Initially, a checklist was prepared containing the patients' characteristics and all the studied factors: age, size of the tumor, lymph node involvement, primary metastasis, stage of disease, type of pathology (vascular or neurological lymph invasion), grade of disease, tumor markers (Her2, ER, PR, and Ki67), type of surgery (mastectomy or breast-conserving therapy [BCT]), adjuvant treatments (post-surgery radiotherapy, post-surgery chemotherapy, hormone therapy), distant metastasis, and survival of patients. Then, the files of patients with BC in the archives of the Shahid Ramezanzadeh Radiotherapy Center in Yazd, from the beginning of 2005 to the end of 2012, were examined, and the patients were called to determine the survival status of 538 patients. In addition, the dates of death were obtained from the provincial health center. In this analytical survival study, because the model contained a large number of predictor variables, exploratory factor analysis was used to identify correlated predictors and to reduce the dimension of the data. The variables were allocated to five independent factors.
Finally, seven variables, namely Her2, Ki67, estrogen receptor (ER), age (under or over 40 years), type of surgery (mastectomy or BCT), stage of disease (primary or advanced), and lymph node involvement (positive or negative), were considered as risk factors for BC survival according to the results of the factor analysis and expert opinion. In the next step, parametric survival models, including exponential, Weibull, Gompertz, log-logistic, lognormal, generalized Fisher, and generalized gamma, were fitted to these variables. The optimal parametric model was selected using the Akaike information criterion (AIC) and was then fitted by the Bayesian method.
The AIC, presented by Akaike in 1974 to assess the goodness of fit of models, was used to determine the optimal parametric model. It is defined as follows:
AIC = −2 log(likelihood) + 2 (a + c)
where a is the number of model parameters and c is a constant that depends on the model applied; a smaller AIC indicates a better fit. Furthermore, the Bayesian Cox model was fitted, and finally, the optimal model was determined based on the Bayesian standard error (SE).
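The AIC comparison above can be sketched in a few lines. The log-likelihood values below are hypothetical and are not results from this study. The values of c (1 for exponential, 2 for Weibull and lognormal, 3 for generalized gamma) follow a common textbook convention and are an assumption, since the text does not list them.

```python
def aic(log_likelihood, a, c):
    """AIC = -2 log(likelihood) + 2 (a + c), as defined in the text."""
    return -2.0 * log_likelihood + 2.0 * (a + c)


a = 7  # number of covariates in the model (seven variables were used)

# Hypothetical (log-likelihood, c) pairs for a few candidate models.
candidates = {
    "exponential": (-240.0, 1),
    "weibull": (-232.5, 2),
    "lognormal": (-229.0, 2),
    "generalized gamma": (-228.8, 3),
}

scores = {name: aic(ll, a, c) for name, (ll, c) in candidates.items()}
best = min(scores, key=scores.get)  # smaller AIC means a better fit

for name, value in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} AIC = {value:.2f}")
print("selected:", best)
```

In this made-up comparison the generalized gamma model has the highest log-likelihood, but its extra parameter is penalized, so the lognormal model attains the smallest AIC.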
In this study, the R packages survMisc, icenReg, and coda were used for Bayesian survival analysis. In all tests, α = 0.05 was considered the significance level.
| Results|| |
Of the 538 patients with BC in this study, 109 (20.3%) died. Using the Kaplan–Meier method, the 1-, 3-, and 8-year survival rates were estimated at 0.976, 0.898, and 0.737, respectively. The mean age of patients was 48.03 ± 11.66 years, and the mean survival time was 97.64 ± 4.23 months. [Table 1] reports descriptive statistics. In the parametric modeling, all the listed parametric models were fitted to the seven variables, and the lognormal model was selected as the optimal parametric model (AIC = 464.77). The lognormal model was then fitted with the Bayesian method. Gibbs sampling was used as the Markov chain Monte Carlo (MCMC) technique. The fitting results of the Bayesian lognormal model are given in [Table 2]. For the Bayesian Cox method, the prior mean and variance of the significant variables were determined from previous studies. The normal distribution was selected as the informative prior distribution. The Bayesian Cox model was considered in its multiple form. The following prior distributions were used for the parameters h, hk, and β:
|Table 1: Frequency of patients with breast cancer in terms of the variables affecting the disease (n=538)|
Posterior distribution for β is
Moreover, the Schoenfeld test was used to test the proportional hazards (PH) assumption. The results of the test were satisfactory (global P = 0.399).
In the Bayesian method, MCMC estimates the posterior distribution of the parameters by drawing a large number of samples, and a point estimate is then obtained under a specific loss function. In this research, Gibbs sampling was performed with the settings nburn = 6000, nsave = 60,000, nskip = 20 + 4, and niter = nburn + nsave, and the squared error loss function was used. Therefore, the posterior mean was taken as the final estimate of the HR for each variable. The samples should behave as random draws without any particular trend (they should not be autocorrelated). The Geweke and Raftery-Lewis diagnostics were used to check the randomness of the samples (close to 1) and the lack of autocorrelation (<2.6). In addition, statistical inference is based on the probability interval (credible interval) instead of the P value. [Figure 1] shows the Bayesian survival curves for the significant variables in this study, and [Table 3] gives the results of the Bayesian Cox method.
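The post-processing steps described above (discarding burn-in, thinning, taking the posterior mean as the estimate under squared error loss, and reading off a probability interval) can be sketched as follows. The chain is simulated from a hypothetical autocorrelated process rather than from the study's model, the thinning interval of 20 is an assumed reading of nskip, and the Geweke score uses plain sample variances instead of the spectral estimates used by dedicated packages, so it is only a rough check.

```python
import math
import random
import statistics


def simulate_chain(n_iter, target_mean, target_sd, rho=0.9, seed=1):
    """Hypothetical AR(1) chain standing in for MCMC draws of a log-HR."""
    rng = random.Random(seed)
    innov_sd = target_sd * math.sqrt(1.0 - rho ** 2)
    x = 0.0  # deliberately poor starting value
    draws = []
    for _ in range(n_iter):
        x = target_mean + rho * (x - target_mean) + rng.gauss(0.0, innov_sd)
        draws.append(x)
    return draws


def burn_and_thin(draws, nburn, nskip):
    """Drop the first nburn draws, then keep every nskip-th draw."""
    return draws[nburn::nskip]


def geweke_z(draws, first=0.1, last=0.5):
    """Simplified Geweke score: compare early and late segment means."""
    n = len(draws)
    a = draws[: int(first * n)]
    b = draws[int((1.0 - last) * n):]
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def summarize_hr(log_hr_draws, level=0.95):
    """Posterior mean of the HR and an equal-tailed probability interval."""
    hr = sorted(math.exp(b) for b in log_hr_draws)
    lo_idx = int((1.0 - level) / 2.0 * (len(hr) - 1))
    hi_idx = int((1.0 + level) / 2.0 * (len(hr) - 1))
    return statistics.fmean(hr), hr[lo_idx], hr[hi_idx]


nburn, nsave, nskip = 6000, 60000, 20  # nskip = 20 is an assumption
raw = simulate_chain(nburn + nsave, target_mean=math.log(1.7), target_sd=0.2)
kept = burn_and_thin(raw, nburn, nskip)

hr_mean, hr_lo, hr_hi = summarize_hr(kept)
z = geweke_z(kept)
print(f"retained draws: {len(kept)}")
print(f"HR = {hr_mean:.3f}, 95% PI {hr_lo:.3f} to {hr_hi:.3f}, Geweke z = {z:.2f}")
```

A Geweke score far from zero would suggest that the early and late parts of the chain disagree, in which case a longer burn-in or more iterations would be needed before the posterior mean is reported.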
|Figure 1: Survival plot for significant tumor markers|
| Discussion|| |
Bayes factors are a good measure for comparing parametric models with the Cox model, but in practice, statistical packages for Bayesian survival analysis do not compute them, and calculating them manually is not feasible given the large sample size and number of variables. Hence, the SE was selected as the comparison criterion in this paper. On this basis, the Bayesian Cox model was selected as the optimal model because it had the smallest SE among the models, and the factors affecting BC were chosen accordingly. In this study, the age variable was not significant, which is consistent with the study of Tazhibi et al., who analyzed the survival data of 996 patients with BC in Isfahan to evaluate the risk factors in metastatic patients. This agreement may be due to the use of the same cut point for the age variable. The Ki67 variable was significant: Ki67-positive patients had a 3.260 times greater death hazard than Ki67-negative patients. This result has been confirmed in other studies. However, the role of Ki67 in BC has long been debated. In the study of Nishimura et al., a high level of Ki67 was associated with low survival, but in the study of Bryan, there was no relationship between Ki67 and androgen receptor. For the type of surgery variable, the death hazard of patients with mastectomy was 1.631 times that of patients with BCT. Moslemi et al. recommended the BCT method because it preserves the breast and eases the patients' psychological burden, but unlike in our results, the difference was not significant. Saadatmand et al. investigated the effect of disease stage on the survival of 173,797 patients. They reported that mastectomy carried a higher death hazard, and, as in this study, the Her2 variable was not significant. By using the Bayesian method, our study obtained results similar to those of Saadatmand et al. even with a small sample size. The stage of disease was significant in most studies, such as that of Rakha et al.
In this study, patients with an advanced stage of disease had a 5.620 times higher death hazard than those in the primary stages of the disease. In addition, ER was significant, and ER-positive patients had a 2.6 times higher death hazard than ER-negative patients, which was similar to most studies. Finally, lymph node involvement was statistically significant, and patients with lymph node involvement had a 1.765 times higher death hazard than patients without it. These results were similar to those of Khodarahmi et al. (2015) and Fallahzadeh et al. in the same year. Medically, lymph node involvement is established as one of the most influential factors in BC. Haghighat et al. evaluated the survival rate in BC patients. In a longitudinal study, they evaluated 623 patients with BC who were referred to the Center of Breast Disease at Jihad University in 1997–2006. They concluded from the Cox analysis that lymph node involvement (HR = 2.52) and negative ER (HR = 2.60) were important factors significantly associated with the survival of the patients. Joseph Ibrahim, a researcher in Bayesian survival analysis, reported in 2011 that if no information about the parameters is available from previous studies and the prior distributions are taken to be noninformative, then the classical Cox and Bayesian Cox models lead to the same findings. In this study, informative prior distributions were used, so the results can be cited with greater confidence. One of the main limitations of this research is the small sample size, although collecting large samples has its own limitations. The Bayesian method allows accurate results even with small samples by drawing on information from meta-analyses and scientific resources. Combining Bayesian and parametric methods in survival analysis for diseases such as cancer gives more accurate results in terms of estimation error than the classical methods.
In diseases such as BC, whose mortality rate is lower than that of other cancers, cure models are a better alternative to standard survival analysis. The combination of the Bayesian method and cure models can be considered in future studies.
| Conclusion|| |
The Bayesian Cox model was selected as the optimal model because it had the smallest standard error among the models, and the factors affecting BC were chosen on this basis. In this study, stage, Ki67, lymph node involvement, ER, and surgery increased the death hazard.
We thank Hossein Fallahzadeh and all the participants in this study for their help with this project.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| References|| |
Lawson DA, Bhakta NR, Kessenbrock K, Prummel KD, Yu Y, Takai K, et al. Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells. Nature 2015;526:131-5.
Nelson HD, Fu R, Cantor A, Pappas M, Daeges M, Humphrey L. Effectiveness of breast cancer screening: Systematic review and meta-analysis to update the 2009 US preventive services task force recommendation. Ann Intern Med 2016;164:244-55.
Kazemi A, Omid E, Amin MM, Nesaee P. A survey on breast cancer status in Kurdistan province on medical geography viewpoint during 2006-2010. HSR 2015;11:459-72.
Potosky AL, O'Neill SC, Isaacs C, Tsai HT, Chao C, Liu C, et al. Population-based study of the effect of gene expression profiling on adjuvant chemotherapy use in breast cancer patients under the age of 65 years. Cancer 2015;121:4062-70.
McGale P, Taylor C, Correa C, Cutter D, Duane F, Ewertz M, et al. Effect of radiotherapy after mastectomy and axillary surgery on 10-year recurrence and 20-year breast cancer mortality: Meta-analysis of individual patient data for 8135 women in 22 randomised trials. Lancet 2014;383:2127-35.
Jafari-Koshki T, Schmid VJ, Mahaki B. Trends of breast cancer incidence in Iran during 2004-2008: A Bayesian space-time model. Asian Pac J Cancer Prev 2014;15:1557-61.
Sharifian A, Pourhoseingholi MA, Emadedin M, Rostami Nejad M, Ashtari S, Hajizadeh N, et al. Burden of breast cancer in Iranian women is increasing. Asian Pac J Cancer Prev 2015;16:5049-52.
Cleves M, Gould W, Gould WW, Gutierrez R, Marchenko Y. An Introduction to Survival Analysis Using Stata, Second Edition. Stata Press; 2008. 398p.
Cox DR, Oakes D. Analysis of Survival Data. Boca Raton: Chapman and Hall/CRC; 2017. 212 p.
Bennett S. Analysis of survival data by the proportional odds model. Stat Med 1983;2:273-7.
Klein JP, Houwelingen HC van, Ibrahim JG, Scheike TH. Handbook of Survival Analysis. CRC Press; 2016. 635p.
Zhou H, Hanson T. spBayesSurv: Bayesian Modeling and Analysis of Spatially Correlated Survival Data. R Package Version; 2014. p. 1.
Kabacoff RI. R in action: Data analysis and graphics with R. Simon and Schuster; 2015.
Plummer M, Best N, Cowles K, Vines K. CODA: Convergence diagnosis and output analysis for MCMC. R News. 2006 Mar; 6:7–11.
Therneau TM. Extending the Cox Model. In: Lin DY, Fleming TR, editors. Proceedings of the First Seattle Symposium in Biostatistics. New York, NY: Springer US; 1997. p. 51–84. (Lecture Notes in Statistics).
Elandt-Johnson RC, Johnson NL, Statistiker M. Survival models and data analysis. Wiley Online Library; 1980.
Tazhibi M, Fayaz M, Mokarian F. Detection of prognostic factors in metastatic breast cancer. J Res Med Sci 2013;18:283-90.
Yerushalmi R, Woods R, Ravdin PM, Hayes MM, Gelmon KA. Ki67 in breast cancer: Prognostic and predictive potential. Lancet Oncol 2010;11:174-83.
Vera-Badillo FE, Chang MC, Kuruzar G, Ocana A, Templeton AJ, Seruga B, et al. Association between androgen receptor expression, Ki-67 and the 21-gene recurrence score in non-metastatic, lymph node-negative, estrogen receptor-positive and HER2-negative breast cancer. J Clin Pathol 2015;68:839-43.
Nishimura R, Osako T, Okumura Y, Hayashi M, Toyozumi Y, Arima N. Ki-67 as a prognostic marker according to breast cancer subtype and a predictor of recurrence time in primary breast cancer. Exp Ther Med 2010;1:747-54.
Bryan RM, Mercer RJ, Bennett RC, Rennie GC, Lie TH, Morgan FJ. Androgen receptors in breast cancer. Cancer 1984;54:2436-40.
Moslemi D, Gholizadeh PA, Hajian K, Sum SH, Pourghasem M, Jahantigh R. Comparison of Modified Radical Mastectomy with Breast Conservative Therapy and Radiotherapy in Patients with Breast Cancer; 2012.
Saadatmand S, Bretveld R, Siesling S, Tilanus-Linthorst MM. Influence of tumour stage at breast cancer detection on survival in modern times: Population based study in 173,797 patients. BMJ 2015;351:h4901.
Rakha EA, El-Sayed ME, Green AR, Lee AH, Robertson JF, Ellis IO. Prognostic markers in triple-negative breast cancer. Cancer 2007;109:25-32.
Ren Y, Black DM, Mittendorf EA, Liu P, Li X, Du XL, et al. Crossover effects of estrogen receptor status on breast cancer-specific hazard rates by age and race. PLoS One 2014;9:e110281.
Vostakolaei FA, Broeders MJ, Rostami N, van Dijck JA, Feuth T, Kiemeney LA, et al. Age at diagnosis and breast cancer survival in Iran. Int J Breast Cancer 2012;2012:517976.
Khodarahmi S, Rezaianzadeh A. The role of prognostic factors on the survival of breast cancer patients: Bayesian approach. Iran J Epidemiol 2015;11:23-33.
Fallahzadeh H, Momayyezi M, Akhundzardeini R, Zarezardeini S. Five year survival of women with breast cancer in Yazd. Asian Pac J Cancer Prev 2014;15:6597-601.
de Boniface J, Frisell J, Andersson Y, Bergkvist L, Ahlgren J, Rydén L, et al. Survival and axillary recurrence following sentinel node-positive breast cancer without completion axillary lymph node dissection: The randomized controlled SENOMAC trial. BMC Cancer 2017;17:379.
Haghighat S. Survival rate and its correlated factors in breast cancer patients referred to Breast Cancer Research Center. Iran J Breast Dis 2013;6:28-36.
Ibrahim JG, Chen MH, Sinha D. Bayesian semiparametric models for survival data with a cure fraction. Biometrics 2001;57:383-8.
Othus M, Mitchell A, Barlogie B, Morgan G, Crowley J. Cure-Rate Survival Models and Their Application to Cancer Clinical Trials. In: Matsui S, Crowley J, editors. Frontiers of Biostatistical Methods and Applications in Clinical Oncology [Internet]. Singapore: Springer; 2017. p. 165-78. Available from: https://doi.org/10.1007/978-981-10-0126-0_11. [Last accessed on 2021 Jun 26].
Ibrahim JG, Chen M-H, Sinha D. Bayesian survival analysis. Wiley Online Library; 2005.
[Table 1], [Table 2], [Table 3]