|
英文參考文獻 Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3), 175-185. Aslam, J. A., & Pavlu, V. (2007). Query hardness estimation using Jensen-Shannon divergence among multiple scoring functions. Paper presented at the European Conference on Information Retrieval. Baoli, L., Qin, L., & Shiwen, Y. (2004). An adaptive k-nearest neighbor text categorization strategy. ACM Transactions on Asian Language Information Processing (TALIP), 3(4), 215-226. Blekanov, И., & Korelin, V. (2015). Hierarchical clustering of large text datasets using Locality-Sensitive Hashing. Proceedings of the International Workshop on Applications in Information Technology (IWAIT-2015). Boyack, K. W., & Klavans, R. (2010). Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately? Journal of the American Society for Information Science and Technology, 61(12), 2389-2404. Broder, A. Z. (1997, June). On the resemblance and containment of documents. In Compression and Complexity of Sequences 1997. Proceedings (pp. 21-29). IEEE. Broder, A. Z., Charikar, M., Frieze, A. M., & Mitzenmacher, M. (1998, May). Min-wise independent permutations. In Proceedings of the thirtieth annual ACM symposium on Theory of computing (pp. 327-336). ACM. Buhler, J. (2001). Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics, 17(5), 419-428. Carletta, J. (1996). Assessing agreement on classification tasks: the kappa statistic. Computational linguistics, 22(2), 249-254. Cha, S.-H. (2007). Comprehensive survey on distance/similarity measures between probability density functions. City, 1(2), 1. Chen, H., Chung, Y. M., C., Marshall, R., & Yang, C. C. (1998). An intelligent personal spider (agent) for dynamic Internet/Intranet searching. Decision Support Systems, 23(1), 41-58. Chowdhury, G. (2010). Introduction to modern information retrieval: Facet publishing. Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE transactions on information theory, 13(1), 21-27. Dutta, D., Guha, R., Jurs, P. C., & Chen, T. (2006). Scalable partitioning and exploration of chemical spaces using geometric hashing. Journal of chemical information and modeling, 46(1), 321-333. Feldman, R., Fresko, M., Kinar, Y., Lindell, Y., Liphstat, O., Rajman, Y., Schler & Zamir, O. (1998, September). Text mining at the term level. In European Symposium on Principles of Data Mining and Knowledge Discovery (pp. 65-73). Springer Berlin Heidelberg. Gwet, K. (2002). Inter-rater reliability: dependency on trait prevalence and marginal homogeneity. Statistical Methods for Inter-Rater Reliability Assessment Series, 2, 1-9. Han, E. H. S., Karypis, G., & Kumar, V. (2001, April). Text categorization using weight adjusted k-nearest neighbor classification. In Pacific-asia conference on knowledge discovery and data mining (pp. 53-65). Springer Berlin Heidelberg. Haveliwala, T., Gionis, A., & Indyk, P. (2000). Scalable Techniques for Clustering the Web (Extended Abstract). In: Third International Workshop on the Web and Databases (WebDB 2000), May 18-19, 2000, Dallas, Texas,. Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the national academy of sciences, 79(8), 2554-2558. Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of classification, 2(1), 193-218. Hull, D. A. (1996). Stemming algorithms: A case study for detailed evaluation. JASIS, 47(1), 70-84. Indyk, P., & Motwani, R. (1998, May). Approximate nearest neighbors: towards removing the curse of dimensionality. In Proceedings of the thirtieth annual ACM symposium on Theory of computing (pp. 604-613). ACM. Jaccard, P. (1901). Distribution de la Flore Alpine: dans le Bassin des dranses et dans quelques régions voisines: Rouge. Sciences Naturelles, 1901, 241-272. Leskovec, J., Rajaraman, A., & Ullman, J. D. (2014). Mining of massive datasets: Cambridge University Press. Levandowsky, M., & Winter, D. (1971). Distance between sets. Nature, 234(5323), 34-35. MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Paper presented at the Proceedings of the fifth Berkeley symposium on mathematical statistics and probability. Oprişa, C., Checicheş, M., & Năndrean, A. (2014). Locality-sensitive hashing optimizations for fast malware clustering. Paper presented at the Intelligent Computer Communication and Processing (ICCP), 2014 IEEE International Conference on. Park, D. C., El-Sharkawi, M., Marks, R., Atlas, L., & Damborg, M. (1991). Electric load forecasting using an artificial neural network. Power Systems, IEEE Transactions on, 6(2), 442-449. Powers, D. (2007). Evaluation: From Precision, Recall and F Factor to ROC, Informedness, Markedness & Correaltion. Sch. Informatics Eng. Flinders. Ravichandran, D., Pantel, P., & Hovy, E. (2005, June). Randomized algorithms and nlp: using locality sensitive hash function for high speed noun clustering. InProceedings of the 43rd Annual Meeting on Association for Computational Linguistics (pp. 622-629). Association for Computational Linguistics. Salton, G., & McGill, M. J. (1986). Introduction to modern information Retrieval. New York, NY, USA: McGraw-Hill, Inc. Salton, G., Wong, A., & Yang, C.-S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613-620. Santos, J. M., & Embrechts, M. (2009). On the use of the adjusted rand index as a metric for evaluating supervised classification. Paper presented at the International Conference on Artificial Neural Networks. Slaney, M., & Casey, M. (2008). Locality-sensitive hashing for finding nearest neighbors [lecture notes]. IEEE Signal Processing Magazine, 25(2), 128-131. Sparck Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of documentation, 28(1), 11-21. Stehman, S. V. (1997). Selecting and interpreting measures of thematic classification accuracy. Remote sensing of Environment, 62(1), 77-89. Sullivan, D. (2001). Document warehousing and text mining: techniques for improving business operations, marketing, and sales. John Wiley & Sons, Inc.. Wu, G., Boydell, O., & Cunningham, P. (2014). High-throughput, Web-scale data stream slustering. In Proceedings of the 4th Web Search Click Data workshop (WSCD 2014). Yang, Y., & Pedersen, J. O. (1997, July). A comparative study on feature selection in text categorization. In ICML (Vol. 97, pp. 412-420). Zhang, J., Song, R., Yu, W.-X., Xia, S.-P., & Hu, W.-D. (2005). Construction of hierarchical classifiers based on the confusion matrix and fisher's principle. Ruan Jian Xue Bao(Journal of Software), 16(9), 1560-1567. 中文參考文獻 江珅薇(2007),相關學術論文集合關鍵詞擷取-學術領域自動命名,國立臺北大學資訊管理研究所。 曾有德(2008),以 Web 2.0 概念建構自動化文件分群與內容相似性比對之研究, 國立高雄第一科技大學資訊管理研究所。 黃馨儀(2015),智識建構方法論之改進研究,國立臺北大學資訊管理研究所。 鄭宇傑(2015),以核運算方法與LDA主題模型產生文字標籤之比較研究,國立臺北大學資訊管理研究所。 謝祥榆(2016),應用區域敏感雜湊進行中文文獻分類之研究,國立臺北大學資訊管理研究所。
|