CSIT 7th Semester
Data Warehousing And Data Mining Board Question Paper 2078


CSC 410-2078
Tribhuvan University
Institute of Science and Technology
2078
Bachelor Level/Fourth Year/Seventh Semester/Science
Computer Science Information Technology (CSC 410)
(Data Warehousing And Data Mining)
(New Course)
Full Marks: 60    Pass Marks: 24    Time: 3 hours

Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.

Section A
Long Answer Questions
Attempt any two questions.
[2x10=20]
1.

Write down any one advantage and one disadvantage of MOLAP over ROLAP. Define a signed network. How do you check whether it is balanced? How does beam search reduce space complexity? Illustrate with an example.

2.

How is a concept hierarchy used in extracting information? Generate the frequent patterns from the following data set using FP-growth, where minimum support = 3.

T_ID    Items
1       A, B, C, D, F, H
2       A, D, E, F
3       C, D, F
4       B, H
5       A, C, F, G, H
6       C, D, E, G
7       A, C, D, I
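
As a cross-check for a hand-built FP-tree, the frequent itemsets of the transactions above can be enumerated by brute force. This is a minimal sketch, not FP-growth itself: it counts the support of every candidate itemset directly, and it assumes support means the number of transactions containing the itemset. The variable names are illustrative.

```python
from itertools import combinations

# Transactions copied from the question's table (T_ID 1 to 7).
transactions = [
    {"A", "B", "C", "D", "F", "H"},
    {"A", "D", "E", "F"},
    {"C", "D", "F"},
    {"B", "H"},
    {"A", "C", "F", "G", "H"},
    {"C", "D", "E", "G"},
    {"A", "C", "D", "I"},
]
MIN_SUPPORT = 3  # absolute count, as assumed above

items = sorted(set().union(*transactions))
frequent = {}
for size in range(1, len(items) + 1):
    found = False
    for candidate in combinations(items, size):
        # Support = number of transactions that contain every item of the candidate.
        support = sum(1 for t in transactions if set(candidate) <= t)
        if support >= MIN_SUPPORT:
            frequent["".join(candidate)] = support
            found = True
    if not found:
        # No frequent itemset of this size, so none of a larger size either.
        break

for itemset, support in frequent.items():
    print(itemset, support)
```

FP-growth should yield the same set of patterns; it only avoids generating and testing candidates one by one.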

3.

How do you compare two classifiers? Given the points A(3,7), B(4,6), C(5,5), D(6,4), E(7,3), F(6,2), G(7,2) and H(8,4), find the core points, border points and outliers using DBSCAN. Take Eps = 2.5 and MinPts = 3.
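
The point classification asked for here can be sketched in a few lines. The sketch makes three assumptions: Euclidean distance, an Eps-neighbourhood that includes the point itself when counted against MinPts, and the label H for the eighth point (8,4). Under a convention that excludes the point itself, the result may differ.

```python
from math import dist

# Points from the question; the label "H" for (8, 4) is an assumption.
points = {
    "A": (3, 7), "B": (4, 6), "C": (5, 5), "D": (6, 4),
    "E": (7, 3), "F": (6, 2), "G": (7, 2), "H": (8, 4),
}
EPS, MIN_PTS = 2.5, 3

def neighbours(name):
    """Names of all points within EPS of the given point, itself included."""
    return {other for other, q in points.items() if dist(points[name], q) <= EPS}

# Core: at least MIN_PTS points in the Eps-neighbourhood.
core = {p for p in points if len(neighbours(p)) >= MIN_PTS}
# Border: not core, but inside the neighbourhood of some core point.
border = {p for p in points if p not in core and neighbours(p) & core}
# Outlier (noise): neither core nor border.
noise = set(points) - core - border

print("core:", sorted(core))
print("border:", sorted(border))
print("outliers:", sorted(noise))
```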

Section B

Attempt any eight questions.

[8x5=40]
4.

When is a pattern said to be interesting? List the issues in data mining.

5.

Define data discretization. Describe the tasks for data preprocessing.

6.

Define spatial data mining. What are the challenges of multimedia mining? Describe with an example.

7.

Consider the following data set.

Confident    Studied    Sick    Result
Yes          No         No      Fail
Yes          No         Yes     Pass
No           Yes        Yes     Fail
No           Yes        No      Pass
Yes          Yes        Yes     Pass

Find out whether the object with attributes Confident = Yes, Studied = Yes, Sick = No will Fail or Pass using Bayesian classification.
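
The arithmetic behind this question can be checked with a short naive Bayes sketch. It assumes conditional independence of the attributes given the class and uses plain relative frequencies with no Laplace smoothing. Exact fractions are used so the scores match a hand calculation. The function name `score` is illustrative.

```python
from fractions import Fraction

# Rows from the question's table: (Confident, Studied, Sick, Result).
rows = [
    ("Yes", "No", "No", "Fail"),
    ("Yes", "No", "Yes", "Pass"),
    ("No", "Yes", "Yes", "Fail"),
    ("No", "Yes", "No", "Pass"),
    ("Yes", "Yes", "Yes", "Pass"),
]
query = ("Yes", "Yes", "No")  # Confident, Studied, Sick

def score(label):
    """Unnormalised posterior: P(label) times the product of P(attribute | label)."""
    matching = [r for r in rows if r[3] == label]
    result = Fraction(len(matching), len(rows))  # class prior
    for i, value in enumerate(query):
        count = sum(1 for r in matching if r[i] == value)
        result *= Fraction(count, len(matching))  # conditional probability
    return result

scores = {label: score(label) for label in ("Pass", "Fail")}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

The class with the larger score is the prediction; the scores are not normalised, since the common denominator P(X) does not affect the comparison.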

8.

What are the choices for data cube materialization? Explain the strategies for cube computation.

9.

Show the conflict between the theories of balance and status. How do you improve Apriori?

10.

Differentiate between star schema and snowflake schema. List any two methods for data normalization.

11.

How do you evaluate the accuracy of a classifier? Discuss the advantages of using k-fold cross-validation.

12.

Apply the K-Means algorithm (K = 2) to the data (185,72), (170,56), (168,60), (179,68), (182,72), (188,77) for up to two iterations and show the clusters. Choose the first two objects as the initial centroids.
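
The two iterations requested here can be traced with a minimal sketch. It assumes Euclidean distance and the usual assign-then-update loop; the variable names are illustrative.

```python
from math import dist

# Data points from the question; the first two serve as initial centroids.
data = [(185, 72), (170, 56), (168, 60), (179, 68), (182, 72), (188, 77)]
centroids = [data[0], data[1]]

for iteration in range(2):
    # Assignment step: each point joins the cluster of its nearest centroid.
    clusters = [[] for _ in centroids]
    for p in data:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    # Update step: each centroid moves to the mean of its cluster.
    # (An empty cluster would divide by zero; that does not occur with this data.)
    centroids = [
        tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
        for cluster in clusters
    ]
    print(f"iteration {iteration + 1}: clusters={clusters}, centroids={centroids}")
```

With this data the assignments do not change between the first and second iteration, so the centroids are already stable after the first update.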