Question

* Large secondary (standard) datasets are published on the
internet for case studies such as:

· Case 1: Text categorization: spam detection.

· Case 2: Face detection.

· Case 3: Signature recognition.

· Case 4: Customer discovery.

· Case 5: Medicine: classify whether a patient has heart
ischemia by a spectral analysis of his/her ECG.

Find ONE (1) of the secondary datasets from the internet. Name the
secondary dataset that you have found.

* Based on your answer in 1, perform the following computation using
the RapidMiner application to compute accuracy performance (you
must screenshot each process in detail):

* Classification method: Artificial Neural Network (ANN) algorithm.
* Feature construction: Independent component analysis (ICA) method.
* Feature selection: Wrapper method.
* Results presentation in testing, training and validation: m-fold
cross-validation (CV) method.

Accepted Answer

1(a) For spam detection, we use the SpamAssassin datasets. Text
classification datasets such as these are used to categorize
natural-language text.

1(b) For face detection, we use the Labeled Faces in the Wild (LFW)
dataset.

1(c) For signature recognition, we mainly use three different
datasets: CEDAR, UTSig, and ICDAR.

1(d) For customer discovery, we use Kaggle datasets.

1(e) For ECG-based heart ischemia classification, validated ECG
Holter datasets are used to validate machine learning algorithms.
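
For the second part of the question, the evaluation logic that RapidMiner runs (a wrapper feature selection around a neural network, scored by m-fold cross-validation) can be illustrated outside RapidMiner. The following is only a minimal sketch in plain Python under several assumptions: the data is synthetic, a single sigmoid neuron stands in for a full ANN, the ICA step is omitted, and every function name (m_fold_indices, train_neuron, cv_accuracy, forward_select) is hypothetical rather than part of RapidMiner.

```python
import math

def m_fold_indices(n, m):
    """Split indices 0..n-1 into m disjoint, near-equal test folds."""
    return [list(range(j, n, m)) for j in range(m)]

def train_neuron(X, y, epochs=50, lr=0.5):
    """Train one sigmoid neuron (a stand-in for an ANN) by stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(model, xi):
    """Return class 1 if the neuron's weighted input is positive, else 0."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0

def cv_accuracy(X, y, cols, m=5):
    """Mean test accuracy over m folds, using only the feature columns in cols."""
    scores = []
    for test_idx in m_fold_indices(len(X), m):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_neuron([[X[i][c] for c in cols] for i in train_idx],
                             [y[i] for i in train_idx])
        hits = sum(predict(model, [X[i][c] for c in cols]) == y[i]
                   for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / m

def forward_select(X, y, m=5):
    """Wrapper method: greedily add the feature that most improves CV accuracy."""
    remaining = list(range(len(X[0])))
    chosen, best = [], -1.0
    while remaining:
        score, col = max((cv_accuracy(X, y, chosen + [c], m), c)
                         for c in remaining)
        if score <= best:  # stop when no candidate improves the score
            break
        chosen.append(col)
        remaining.remove(col)
        best = score
    return chosen, best

# Synthetic data (an assumption): column 0 carries the class signal,
# columns 1 and 2 are deterministic noise unrelated to the label.
y = [i % 2 for i in range(60)]
X = [[(1.0 if y[i] else -1.0) + ((i * 37) % 11 - 5) / 10,
      float(i % 3 - 1),
      ((i * 3) % 7 - 3) / 3] for i in range(60)]

selected, score = forward_select(X, y, m=5)
print("selected feature columns:", selected)
print("mean 5-fold accuracy:", score)
```

In RapidMiner itself the same structure would be built from nested operators rather than code; the sketch is meant only to show why the wrapper retrains and re-scores the classifier for every candidate feature subset.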

Question #160716

Expert's answer