Analysis of Various Clustering and Classification Algorithms in Datamining

Valsala, Sandhia; Thomas, Bindhya; George, Jissy Ann
November 2012
International Journal of Computer Science & Network Security;Nov2012, Vol. 12 Issue 11, p54
Academic Journal
Clustering and classifying data is a difficult problem that arises in many fields and applications. The challenge grows as the dimensionality of the input space increases and as features are measured on different scales. The term "classification" is frequently used loosely to cover all data mining (DM) tasks [1]. Instead, it is best reserved for the category of supervised learning algorithms used to search for interesting data patterns. While classification algorithms have become very popular and ubiquitous in DM research, they are but one of the many types of algorithms available to solve a specific type of DM task [12]. This paper addresses various clustering and classification algorithms in detail. It surveys existing algorithms and examines the scalability of some of the existing classification algorithms.


Related Articles

  • Overview of Emperical Data Mining Research. Saroja Thota, Lalitha; Appa Rao, Allam // International Journal of Advanced Research in Computer Science;Sep/Oct2013, Vol. 4 Issue 10, p49 

    Data and information play a significant role in human activities. The explosive growth in databases has created a need to develop technologies that use information and knowledge intelligently. Data mining is the knowledge discovery process of analyzing large volumes of data from various...

  • A novel hybrid feature selection method based on rough set and improved harmony search. Inbarani, H.; Bagyamathi, M.; Azar, Ahmad // Neural Computing & Applications;Nov2015, Vol. 26 Issue 8, p1859 

    Feature selection is the process of selecting optimal features that produce the most prognostic outcome. It is one of the essential steps in knowledge discovery. The problem is that not all features are important. Most of the features may be redundant, and the rest may be irrelevant and noisy. This...

  • Large-scale supervised similarity learning in networks. Chang, Shiyu; Qi, Guo-Jun; Yang, Yingzhen; Aggarwal, Charu; Zhou, Jiayu; Wang, Meng; Huang, Thomas // Knowledge & Information Systems;Sep2016, Vol. 48 Issue 3, p707 

    The problem of similarity learning is relevant to many data mining applications, such as recommender systems, classification, and retrieval. This problem is particularly challenging in the context of networks, which contain different aspects such as the topological structure, content, and user...

  • A Decision Tree Classification Model for University Admission System. Mashat, Abdul Fattah; Fouad, Mohammed M.; Yu, Philip S.; Gharib, Tarek F. // International Journal of Advanced Computer Science & Application;Oct2012, Vol. 3 Issue 10, p17 

    Data mining is the science and techniques used to analyze data to discover and extract previously unknown patterns. It is also considered a main part of the process of knowledge discovery in databases (KDD). In this paper, we introduce a supervised learning technique of building a decision tree...

  • Hiding Sensitive XML Association Rules with Supervised Learning Technique. Iqbal, Khalid; Asghar, Sohail; Mirza, Abdulrehman // Intelligent Information Management;Nov2011, Vol. 3 Issue 6, p219 

    In the privacy preservation of association rules, sensitivity analysis should be reported after the quantification of items in terms of their occurrence. The traditional methodologies, used for preserving confidentiality of association rules, are based on the assumptions while safeguarding...

  • Semi-Supervised Learning Based Social Image Semantic Mining Algorithm. AO Guangwu; SHEN Minggang // Journal of Multimedia;Feb2014, Vol. 9 Issue 2, p245 

    Social image semantic mining is of great importance in social image retrieval, and it can also help solve the problem of the semantic gap. In this paper, a novel social image semantic mining algorithm based on semi-supervised learning is proposed. Firstly, labels which tagged the images in the test...

  • A Study on Data Mining Classification Algorithms For Medical Data. Rani, P. R. Sudha; Rao, M. R. Narasinga; Lakshmi, D. T. Vijaya // International Journal of Advanced Research in Computer Science;Mar2014 Special Issue, Vol. 5 Issue 2, p13 

    This paper briefly describes various classification algorithms used on medical data. We introduce four widely used supervised methods: Artificial Neural Networks, Bayesian classifiers, Decision Trees, and Support Vector Machines.

  • Recent Trends in Data Classifications. Shukran, Mohd Afizi Mohd; Khairuddin, Mohammad Adib; Maskat, Kamaruzaman // International Proceedings of Computer Science & Information Tech;2012, Vol. 31, p69 

    Data classification is a widely used technique in various fields, including data mining, whose goal is to classify a large set of objects into predefined classes, described by a set of attributes, using supervised learning methods. Due to the explosive growth of both business and scientific...

  • Comparison of Supervised Learning Techniques for Binary Text Classification. Doshi, Hetal; Zalte, Maruti // International Journal of Computer Science & Information Security;Sep2012, Vol. 10 Issue 9, p52 

    An automated text classifier is a useful aid in information management. In this paper, supervised learning techniques like Naïve Bayes, Support Vector Machine (SVM) and K Nearest Neighbour (KNN) are implemented for classifying certain categories from the 20 Newsgroups and WebKB datasets. Two...

