Chinese References
[1]凌士雄 (2004), A Performance Comparison of Solution Strategies for Asymmetric Classification Analysis, Master's thesis, Department of Information Management, National Sun Yat-sen University, Kaohsiung.
[2]張琦, 吳斌, and 王柏 (2005), “A survey of training methods for imbalanced data,” Computer Science, Vol. 32, No. 10, pp. 181-186.
[3]郭琇靜 (2007), Applying Support Vector Machines and Statistical Process Features to On-line Detection of Process Abnormalities, Master's thesis, Institute of Industrial Engineering and Management, National Formosa University, Yunlin.
English References
[1]A. An, and Y. Wang, (2001), “Comparisons of classification methods for screening potential compounds,” Proceedings of the 2001 IEEE International Conference on Data Mining, pp. 11-18.
[2]A. Orriols-Puig, and E. Bernadó-Mansilla, (2009), “Evolutionary rule-based systems for imbalanced datasets,” Soft Computing, vol. 13, pp. 213-225.
[3]A. Fernández, S. García, M.J. del Jesus, and F. Herrera, (2008), “A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets,” Fuzzy Sets and Systems, Vol. 159, pp. 2378-2398.
[4]A. Fernández, M.J. del Jesus, and F. Herrera, (2009), “On the influence of an adaptive inference system in fuzzy rule based classification systems for imbalanced data-sets,” Expert Systems with Applications, Vol. 36, pp. 9805-9812.
[5]M.A. Mazurowski, P.A. Habas, J.M. Zurada, J.Y. Lo, J.A. Baker, and G.D. Tourassi, (2008), “Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance,” Neural Networks, Vol. 21, No. 2-3, pp. 427-436.
[6]B. Raskutti, and A. Kowalczyk, (2004), “Extreme rebalancing for SVMs: a case study,” SIGKDD Explorations, Vol. 6, No. 1, pp. 60-69.
[7]B. Zadrozny, and C. Elkan, (2001), “Learning and making decisions when costs and probabilities are both unknown,” Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, pp. 204-213.
[8]C. Drummond, and R.C. Holte, (2003), “C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling,” Workshop on Learning from Imbalanced Datasets, NRC 47381.
[9]C.-I. Lee, C.-J. Tsai, T.-Q. Wu, and W.-P. Yang, (2008), “An approach to mining the multi-relational imbalanced database,” Expert Systems with Applications, Vol. 34, No. 4, pp. 3021-3032.
[10]C.-T. Su, C.-H. Yang, K.-H. Hsu, and W.-K. Chiu, (2006), “Data mining for the diagnosis of type II diabetes from three-dimensional body surface anthropometrical scanning data,” Computers & Mathematics with Applications, Vol. 51, pp. 1075-1092.
[11]C.-T. Su, and Y.-H. Hsiao, (2007), “An evaluation of the robustness of MTS for imbalanced data,” IEEE Transactions on Knowledge and Data Engineering, Vol. 19, No. 10, pp. 1321-1332.
[12]C.-T. Su, L.-S. Chen, and T.-L. Chiang, (2006), “A neural network based information granulation approach to shorten the cellular phone test process,” Computers In Industry, Vol. 57, No. 5, pp. 412-423.
[13]C.-T. Su, L.-S. Chen, and Y. Yih, (2006), “Knowledge acquisition through information granulation for imbalanced data,” Expert Systems with Applications, Vol. 31, No. 3, pp. 531-541.
[14]D. Lewis, and J. Catlett, (1994), “Heterogeneous uncertainty sampling for supervised learning,” Proceedings of the 11th International Conference on Machine Learning, pp. 144-156.
[15]D.R. Wilson, and T.R. Martinez, (2000), “Reduction techniques for instance-based learning algorithms,” Machine Learning, Vol. 38, No. 3, pp. 257-286.
[16]E. Andrews, Q. Morris, and A. Bonner, (2008), “Neural networks approaches for discovering the learnable correlation between gene function and gene expression in mouse,” Neurocomputing, Vol. 71, No. 16-18, pp. 3168-3175.
[17]G. Batista, R. Prati, and M. Monard, (2004), “A study of the behaviour of several methods for balancing machine learning training data,” SIGKDD Explorations, Vol. 6, No. 1, pp. 20-29.
[18]G. Batista, R.C. Prati, and M.C. Monard, (2004), “A study of the behavior of several methods for balancing machine learning training data,” SIGKDD Explorations, Vol. 6, No. 1, pp. 20-29.
[19]G.H. Nguyen, A. Bouzerdoum, and S.L. Phung, (2008), “A supervised learning approach for imbalanced data sets,” IEEE Xplore, pp. 1-4.
[20]G.M. Weiss, (2004), “Mining with rarity: a unifying framework,” SIGKDD Explorations, Vol. 6, No. 1, pp. 7-19.
[21]G.M. Weiss, and F. Provost, (2001), “The effect of class distribution on classifier learning,” Technical Report ML-TR-43, Department of Computer Science, Rutgers University.
[22]G.V. Kass, (1980), “An exploratory technique for investigating large quantities of categorical data,” Applied Statistics, Vol. 29, No. 2, pp. 119-127.
[23]G. Weiss, and F. Provost, (2003), “Learning when training data are costly: the effect of class distribution on tree induction,” Journal of Artificial Intelligence Research, No. 19, pp. 315-354.
[24]G. Wu, and E.Y. Chang, (2005), “KBA: kernel boundary alignment considering imbalanced data distribution,” IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 6, pp. 786-795.
[25]H. Altincay, and C. Ergun, (2004), “Clustering based undersampling for improving speaker verification decisions using AdaBoost,” Lecture Notes in Computer Science, Vol. 3138, pp. 698-706.
[26]H. Guo, and H.L. Viktor, (2004), “Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach,” SIGKDD Explorations, Vol. 6, No. 1, pp. 30-39.
[27]H.T. Lin, and C.J. Lin, (2003), “A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods,” Technical report, Department of Computer Science & Information Engineering, National Taiwan University.
[28]C.-W. Hsu, C.-C. Chang, and C.-J. Lin, (2006), “A practical guide to support vector classification,” available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/index.html.
[29]I.H. Witten, and E. Frank, (2002), “Data mining: practical machine learning tools and techniques with Java implementations,” Morgan Kaufmann, San Francisco.
[30]J.G. Xie, and Z.D. Qiu, (2007), “The effect of imbalanced data sets on LDA: a theoretical and empirical analysis,” Pattern Recognition, Vol. 40, No. 2, pp. 557-562.
[31]J.R. Quinlan, (1986), “Induction of decision trees,” Machine Learning, Vol. 1, No. 1, pp. 81-106.
[32]J.R. Quinlan, (1993), “C4.5: programs for machine learning,” Morgan Kaufmann, San Mateo, CA.
[33]J. Yu, and L. Xi, (2008), “A hybrid learning-based model for on-line monitoring and diagnosis of out-of-control signals in multivariate manufacturing processes,” International Journal of Production Research, DOI: 10.1080/00207540801942208.
[34]L.M. Taft, R.S. Evans, C.R. Shyu, M.J. Egger, N. Chawla, J.A. Mitchell, S.N. Thornton, B. Bray, and M. Varner, (2009), “Countering imbalanced datasets to improve adverse drug event predictive models in labor and delivery,” Journal of Biomedical Informatics, Vol. 42, pp. 356-364.
[35]L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone, (1984), “Classification and regression trees,” Wadsworth, Belmont, CA.
[36]L. Xu, and M.Y. Chow, (2006), “A classification approach for power distribution systems fault cause identification,” IEEE Transactions on Power Systems, Vol. 21, No. 1, pp. 53-60.
[37]L. Zhuang, H. Dai, and X. Hang, (2005), “A novel field learning algorithm for dual imbalance text classification,” International Conference on Fuzzy Systems and Knowledge Discovery, Lecture Notes in Artificial Intelligence, Vol. 3614, pp. 39-48.
[38]J. Liu, Q. Hu, and D. Yu, (2008), “A comparative study on rough set based class imbalance learning,” Knowledge-Based Systems, Vol. 21, No. 8, pp. 753-763.
[39]K. Coussement, and D. Van den Poel, (2008), “Churn prediction in subscription services: An application of support vector machines while comparing two parameter selection techniques,” Expert Systems with Applications, Vol. 34, No. 1, pp. 313-327.
[40]M.A. Maloof, (2003), “Learning when data sets are imbalanced and when costs are unequal and unknown,” ICML-2003 Workshop on Learning from Imbalanced Data Sets.
[41]M.-C. Chen, L.-S. Chen, C.-C. Hsu, and W.-R. Zeng, (2008), “An information granulation based data mining approach for classifying imbalanced data,” Information Sciences, Vol. 178, No. 16, pp. 3214-3227.
[42]M.J.A. Berry, and G. Linoff, (1997), “Data mining techniques: for marketing, sales, and customer support,” John Wiley & Sons, Inc.
[43]M. Kubat, and S. Matwin, (1997), “Addressing the curse of imbalanced training sets: one-sided selection,” Proceedings of the 14th International Conference on Machine Learning, pp. 179-186.
[44]M. Kubat, R. Holte, and S. Matwin, (1997), “Learning when negative examples abound,” Proceedings of the European Conference on Machine Learning, pp. 146-153.
[45]M. Rosell, V. Kann, and J. Litton, (2004), “Comparing comparisons: document clustering evaluation using two manual classifications,” Proceedings of the International Conference on Natural Language Processing, pp. 207-216.
[46]N. Chawla, A. Lazarevic, L. Hall, and K. Bowyer, (2003), “SMOTEBoost: improving prediction of the minority class in boosting,” 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Cavtat-Dubrovnik, Croatia, pp. 107-119.
[47]N. Chawla, K. Bowyer, L. Hall, and W. Kegelmeyer, (2002), “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, Vol. 16, pp. 321-357.
[48]N. Japkowicz, and S. Stephen, (2002), “The class imbalance problem: a systematic study,” Intelligent Data Analysis, Vol. 6, No. 5, pp. 429-449.
[49]N.V. Chawla, N. Japkowicz, and A. Kolcz, (2004), “Editorial: special issue on learning from imbalanced data sets,” SIGKDD Explorations, Vol. 6, No. 1, pp. 1-6.
[50]P. Campadelli, E. Casiraghi, and G. Valentini, (2005), “Support vector machines for candidate nodules classification,” Neurocomputing, Vol. 68, pp. 281-288.
[51]P. Hart, (1968), “The condensed nearest neighbor rule,” IEEE Transactions on Information Theory, Vol. 14, No. 3, pp. 515-516.
[52]R. Barandela, J.S. Sanchez, V. Garcia, and E. Rangel, (2003), “Strategies for learning in class imbalance problems,” Pattern Recognition, Vol. 36, No. 3, pp. 849-851.
[53]S. Lessmann, and S. Voß, (2009), “A reference model for customer-centric data mining with support vector machines,” European Journal of Operational Research, pp. 520-530.
[54]S.S. Keerthi, and C.-J. Lin, (2003), “Asymptotic behaviors of support vector machines with Gaussian kernel,” Neural Computation, Vol. 15, No. 7, pp. 1667-1689.
[55]S.J. Press, and S. Wilson, (1978), “Choosing between logistic regression and discriminant analysis,” Journal of the American Statistical Association, Vol. 73, pp. 699-705.
[56]S.R. Gunn, (1998), “Support vector machines for classification and regression,” Technical Report, University of Southampton.
[57]S. Li, L. Shue, and S. Lee, (2008), “Business intelligence approach to supporting strategy-making of ISP service management,” Expert Systems with Applications, Vol. 35, No. 3, pp. 739-754.
[58]S.-J. Yen, and Y.-S. Lee, (2009), “Cluster-based under-sampling approaches for imbalanced data distributions,” Expert Systems with Applications, Vol. 36, pp. 5718-5727.
[59]A. Estabrooks, T. Jo, and N. Japkowicz, (2004), “A multiple resampling method for learning from imbalanced data sets,” Computational Intelligence, Vol. 20, No. 1, pp. 18-36.
[60]T. Fawcett, and F.J. Provost, (1997), “Adaptive fraud detection,” Data Mining and Knowledge Discovery, Vol. 1, No. 3, pp. 291-316.
[61]T. Kohonen, (1990), “The self-organizing map,” Proceedings of the IEEE, Vol. 78, No. 9, pp. 1464-1480.
[62]T.W. Liao, (2008), “Classification of weld flaws with imbalanced class data,” Expert Systems with Applications, Vol. 35, No. 3, pp. 1041-1052.
[63]I. Tomek, (1976), “Two modifications of CNN,” IEEE Transactions on Systems, Man, and Cybernetics, Vol. 6, No. 11, pp. 769-772.
[64]U. Seiffert, and L. Jain, (2002), “Self-organizing neural networks: recent advances and applications,” Studies in Fuzziness and Soft Computing, Vol. 78, Springer, Berlin.
[65]V.N. Vapnik, (1995), “The nature of statistical learning theory,” Springer-Verlag, New York, NY, USA.
[66]V.S. Desai, J.N. Crook, and G.A. Overstreet, (1996), “A comparison of neural networks and linear scoring models in the credit union environment,” European Journal of Operational Research, Vol. 95, pp. 24-37.
[67]Y.M. Chae, S.H. Ho, K.W. Cho, D.H. Lee, and S.H. Ji, (2001), “Data mining approach to policy analysis in a health insurance domain,” International Journal of Medical Informatics, Vol. 62, No. 2-3, pp. 103-111.
[68]Y. Xie, X. Le, E.W.T. Ngai, and W. Ying, (2009), “Customer churn prediction using improved balanced random forests,” Expert Systems with Applications, Vol. 36, pp. 5445-5449.
[69]Z.H. Zhou, and X.Y. Liu, (2006), “Training cost-sensitive neural networks with methods addressing the class imbalance problem,” IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 1, pp. 63-77.