An Efficient Density-based Approach for Data Mining Tasks

Domeniconi, Carlotta; Gunopulos, Dimitrios
November 2004
Knowledge & Information Systems;Nov2004, Vol. 6 Issue 6, p750
Academic Journal
We propose a locally adaptive technique to address the problem of setting the bandwidth parameters for kernel density estimation. Our technique is efficient and can be performed in only two dataset passes. We also show how to apply our technique to efficiently solve range query approximation, classification and clustering problems for very large datasets. We validate the efficiency and accuracy of our technique by presenting experimental results on a variety of both synthetic and real datasets.
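For readers unfamiliar with the idea, here is a minimal one-dimensional sketch of what a locally adaptive bandwidth can look like. It is not the authors' algorithm, which the abstract does not describe, and it does not reproduce their two-pass efficiency: the pilot step below costs quadratic time. The sketch assumes Silverman's rule of thumb for a global pilot bandwidth and Abramson's square-root law for the local adjustment. All function names are hypothetical.

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def fixed_kde(x, data, h):
    """Ordinary KDE at x using a single global bandwidth h."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)


def adaptive_bandwidths(data):
    """One bandwidth per sample: wider in sparse regions, narrower in dense ones.

    Assumption: Silverman's rule for the pilot bandwidth and Abramson's
    square-root law for the local factor (not taken from the paper).
    """
    n = len(data)
    h0 = 1.06 * statistics.stdev(data) * n ** (-0.2)
    # Pilot density at every sample; always positive because each sample
    # contributes its own kernel peak.
    pilot = [fixed_kde(xi, data, h0) for xi in data]
    # Geometric mean of the pilot densities normalizes the local factors.
    g = math.exp(sum(math.log(p) for p in pilot) / n)
    return [h0 * math.sqrt(g / p) for p in pilot]


def adaptive_kde(x, data, bandwidths):
    """Sample-point adaptive KDE: each sample uses its own bandwidth."""
    total = sum(
        gaussian_kernel((x - xi) / hi) / hi for xi, hi in zip(data, bandwidths)
    )
    return total / len(data)


# Five tightly grouped samples and one isolated sample.
data = [0.0, 0.1, 0.2, 0.15, 0.05, 5.0]
bandwidths = adaptive_bandwidths(data)
density_at_cluster = adaptive_kde(0.1, data, bandwidths)
density_at_gap = adaptive_kde(2.5, data, bandwidths)
```

In this toy dataset the isolated sample at 5.0 receives a wider kernel than the clustered samples, which is the behavior a fixed global bandwidth cannot provide.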


Related Articles

  • The Research of Distributed Data Mining Knowledge Discovery Based on Extension Sets.  // International Journal of Computer Applications;Oct2010, Vol. 8, p12 

    The article offers information on the importance of distributed data mining knowledge to researchers in Pradesh, India. It says that this method is vital to them because it helps them discover and generate new knowledge from large databases. Moreover, it furnishes methods in obtaining...

  • Data Mining is Dead--Long Live Predictive Analytics. Agosta, Lou // DM Review;Jan2004, Vol. 14 Issue 1, p37 

    Compares the data mining and predictive analytics techniques in the U.S. Reasons for the failure of data mining; Functions of the Data Mining software of Oracle; Formulation and validation of hypothesis in data mining.

  • A review on particle swarm optimization algorithms and their applications to data clustering. Rana, Sandeep; Jasola, Sanjay; Kumar, Rajesh // Artificial Intelligence Review;Mar2011, Vol. 35 Issue 3, p211 

    Data clustering is one of the most popular techniques in data mining. It is a method of grouping data into clusters, in which each cluster must have data of great similarity and high dissimilarity with other cluster data. The most popular clustering algorithm K-means and other classical...

  • Characterizing and Mining the Citation Graph of the Computer Science Literature. An, Yuan; Janssen, Jeannette; Milios, Evangelos E. // Knowledge & Information Systems;Nov2004, Vol. 6 Issue 6, p664 

    Citation graphs representing a body of scientific literature convey measures of scholarly activity and productivity. In this work we present a study of the structure of the citation graph of the computer science literature. Using a web robot we built several topic-specific citation graphs and...

  • Managing Multiuser Database Buffers Using Data Mining Techniques. Feng, Ling; Lu, Hongjun // Knowledge & Information Systems;Nov2004, Vol. 6 Issue 6, p679 

    In this paper, we propose a data-mining-based approach to public buffer management for a multiuser database system, where database buffers are organized into two areas: public and private. While the private buffer areas contain pages to be updated by particular users, the public buffer area...

  • Document Similarity Using a Phrase Indexing Graph Model. Hammouda, Khaled M.; Kamel, Mohamed S. // Knowledge & Information Systems;Nov2004, Vol. 6 Issue 6, p710 

    Document clustering techniques mostly rely on single term analysis of text, such as the vector space model. To better capture the structure of documents, the underlying data model should be able to represent the phrases in the document as well as single terms. We present a novel data model, the...

  • On the time series support vector machine using dynamic time warping kernel for brain activity classification. W. Chaovalitwongse; P. Pardalos // Cybernetics & Systems Analysis;Jan2008, Vol. 44 Issue 1, p125 

    A new data mining technique used to classify normal and pre-seizure electroencephalograms is proposed. The technique is based on a dynamic time warping kernel combined with support vector machines (SVMs). The experimental results show that the technique is superior to the...

  • Finding centric local outliers in categorical/numerical spaces. Xu Yu, Jeffrey; Weining Qian; Hongjun Lu; Aoying Zhou // Knowledge & Information Systems;Mar2006, Vol. 9 Issue 3, p309 

    Outlier detection techniques are widely used in many applications such as credit-card fraud detection, monitoring criminal activities in electronic commerce, etc. These applications attempt to identify outliers as noises, exceptions, or objects around the border. The existing density-based local...

  • Quality assessment of individual classifications in machine learning and data mining. Kukar, Matjaž // Knowledge & Information Systems;Mar2006, Vol. 9 Issue 3, p364 

    Although machine learning algorithms have been used successfully on many problems in the past, their serious practical use is affected by the fact that they often cannot produce reliable and unbiased assessments of their predictions' quality. In the last few years, several approaches for estimating...

