Abstract
Whether parametric or nonparametric statistical modeling is appropriate depends on data sufficiency. When data are sufficient, the parametric method is preferred because the fitted model converges well to the population distribution. When data are insufficient, the nonparametric method is preferred because it is flexible and models the given data conservatively. In practice, however, users find it difficult to choose between the two, because the adequacy of either method depends on how well the given data represent the population model, which is unknown to them. For insufficient data or limited prior information on random variables, the interval approach, which uses interval information on the data or random variables, can be used; however, it remains difficult to apply in uncertainty analysis and design because of its imprecise probabilities. To overcome these problems, this study proposes an integrated statistical modeling (ISM) method that combines the parametric, nonparametric, and interval approaches. The ISM method uses the two-sample Kolmogorov–Smirnov (K–S) test to decide between the parametric and nonparametric methods according to data sufficiency. Sequential statistical modeling (SSM) and kernel density estimation with estimated bounded data (KDE-ebd) are used as the parametric and nonparametric methods, respectively, in combination with the interval approach. To verify its modeling accuracy, conservativeness, and convergence, the proposed method is compared with the original SSM and KDE-ebd in simulation tests over various sample sizes and distribution types. An engineering and reliability analysis example shows that the proposed ISM method has the highest accuracy and reliability in statistical modeling, regardless of data sufficiency. The ISM method is applicable to real engineering data. Unlike SSM, it is conservative in reliability analysis for insufficient data, and it converges to the exact probability of failure more rapidly than KDE-ebd as the amount of data increases.
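As a rough illustration of the two-sample K–S test named in the abstract, the sketch below computes the K–S statistic from two empirical CDFs and compares it with the standard large-sample critical value. The abstract does not state which two samples are compared or how the outcome maps to a method, so the function `choose_method`, its arguments, and its decision rule are hypothetical and are not the authors' procedure.

```python
import math
from bisect import bisect_right


def ks_two_sample_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # Both ECDFs are step functions, so the supremum is reached at a data point.
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / n
        fb = bisect_right(b_sorted, x) / m
        d = max(d, abs(fa - fb))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def choose_method(parametric_draws, nonparametric_draws, alpha=0.05):
    """Hypothetical rule (an assumption, not taken from the paper):
    keep the parametric model when the test does not reject equality
    of the two samples, otherwise fall back to the nonparametric one."""
    d = ks_two_sample_statistic(parametric_draws, nonparametric_draws)
    crit = ks_critical_value(len(parametric_draws), len(nonparametric_draws), alpha)
    return "parametric" if d <= crit else "nonparametric"
```

The critical value uses the asymptotic approximation, which can be inaccurate for very small samples; an exact small-sample test would be a better fit when data are scarce.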
Funding
This work was supported by a grant from the National Research Foundation of Korea (NRF), funded by the Korean Government (NRF-2015R1A1A3A04001351) and by the Technology Innovation Program (10048305, Launching Plug-In Digital Analysis Framework for Modular System Design) funded by the Ministry of Trade, Industry, and Energy (MOTIE, Korea).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Responsible Editor: Byeng D Youn
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1. Probability density functions
Types | Probability density function | Parameters |
---|---|---|
Normal | \( f\left(x|\mu, \sigma \right)=\frac{1}{\sigma \sqrt{2\pi }}\exp \left\{\frac{-{\left(x-\mu \right)}^2}{2{\sigma}^2}\right\} \) | μ: Location (mean); σ: Scale (standard deviation) |
Logistic | \( f\left(x|a, b \right)=\frac{\exp \left(\frac{x-a}{b}\right)}{b{\left\{1+\exp \left(\frac{x-a}{b}\right)\right\}}^2} \) | a: Location (mean); b: Scale |
t location-scale | \( f\left(x|\mu, \sigma, \nu \right)=\frac{\Gamma \left(\frac{\nu +1}{2}\right)}{\sigma \sqrt{\nu \pi}\Gamma \left(\frac{\nu }{2}\right)}{\left[\frac{\nu +{\left(\frac{x-\mu }{\sigma}\right)}^2}{\nu}\right]}^{-\left(\frac{\nu +1}{2}\right)} \) | μ: Location (mean); σ: Scale; ν: Shape |
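The three densities in the table can be evaluated directly with the Python standard library. The following is a minimal sketch for checking values by hand; the function names are illustrative, and the numerically stable forms (log-gamma, and the symmetric rewrite of the logistic density) are implementation choices not taken from the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with location mu and scale sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def logistic_pdf(x, a, b):
    """Logistic density with location a and scale b.

    exp(z) / (1 + exp(z))^2 equals exp(-z) / (1 + exp(-z))^2, so the
    density depends only on |z|; using exp(-|z|) avoids overflow.
    """
    z = (x - a) / b
    e = math.exp(-abs(z))
    return e / (b * (1.0 + e) ** 2)


def t_location_scale_pdf(x, mu, sigma, nu):
    """t location-scale density with location mu, scale sigma, shape nu."""
    z = (x - mu) / sigma
    # Work in log space so large nu does not overflow the gamma function.
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - math.log(sigma * math.sqrt(nu * math.pi))
    )
    log_kernel = -0.5 * (nu + 1.0) * math.log((nu + z * z) / nu)
    return math.exp(log_norm + log_kernel)
```

All three densities are symmetric about their location parameter, so evaluating at `mu + d` and `mu - d` should give the same value up to rounding.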
Appendix 2. Flow chart of SSM
Appendix 3. Flow chart of KDE-ebd
Appendix 4. Quantile function value ratio
Cite this article
Kang, YJ., Noh, Y. & Lim, OK. Integrated statistical modeling method: part I—statistical simulations for symmetric distributions. Struct Multidisc Optim 60, 1719–1740 (2019). https://doi.org/10.1007/s00158-019-02402-8
DOI: https://doi.org/10.1007/s00158-019-02402-8