CSIT 7th Semester
Data Warehousing And Data Mining Board Question Paper 2079


CSC 410-2079
Tribhuvan University
Institute of Science and Technology
2079
Bachelor Level/Fourth Year/Seventh Semester/Science
Computer Science Information Technology (CSC 410)
(Data Warehousing And Data Mining)
(New Course)
Full Marks: 60    Pass Marks: 24    Time: 3 hours

Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.

Section A
Long Answer Questions
Attempt any two questions.
[2x10=20]
1.

Discuss any two drawbacks of the Apriori algorithm. Find the frequent itemsets and association rules from the transaction database given below using the FP-growth algorithm. Assume that the minimum support is 50% and the minimum confidence is 60%.

Transaction ID    Items Purchased
1                 Sausage, Peanut, Beer
2                 Peanut, Beer, Apple
3                 Apple, Milk
4                 Sausage, Peanut, Apple
5                 Sausage, Peanut, Beer, Milk
6                 Sausage, Peanut, Beer, Apple
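
For checking a hand-worked answer, the frequent itemsets and rules for the transaction table above can be enumerated by brute force. This is a minimal sketch, not FP-growth itself: it counts every candidate itemset directly, which is only practical for a toy database like this one. The names `support_count`, `frequent`, and `rules` are illustrative.

```python
from itertools import combinations

# Transactions copied from the table in Question 1.
transactions = [
    {"Sausage", "Peanut", "Beer"},
    {"Peanut", "Beer", "Apple"},
    {"Apple", "Milk"},
    {"Sausage", "Peanut", "Apple"},
    {"Sausage", "Peanut", "Beer", "Milk"},
    {"Sausage", "Peanut", "Beer", "Apple"},
]
MIN_SUPPORT = 0.5      # 50% of 6 transactions = 3 transactions
MIN_CONFIDENCE = 0.6   # 60%


def support_count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


items = sorted(set().union(*transactions))
min_count = MIN_SUPPORT * len(transactions)

# Brute-force enumeration of all itemsets (assumption: fine for 5 items).
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        count = support_count(set(combo))
        if count >= min_count:
            frequent[frozenset(combo)] = count

# Rules X -> Y with confidence = support(X ∪ Y) / support(X).
# Every subset of a frequent itemset is frequent, so the lookup is safe.
rules = []
for itemset, count in frequent.items():
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            antecedent = frozenset(antecedent)
            confidence = count / frequent[antecedent]
            if confidence >= MIN_CONFIDENCE:
                rules.append((set(antecedent), set(itemset - antecedent), confidence))

for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

An FP-growth answer should produce the same frequent itemsets; only the way they are found (FP-tree and conditional pattern bases) differs.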

2.

When is a multilayer perceptron a better choice than other classification algorithms? Consider the multilayer feed-forward neural network given below. Let the learning rate be 0.5. Assume the initial values of the weights and biases given in the table below. Train the network for the training tuples (1, 1, 0) and (0, 1, 1), where the last number is the target output. Show the weight and bias updates using the back-propagation algorithm. Assume that activation function is used in the network.

w1     w13    w23    w24    w35    w45    b3     b4     b5
0.5    0.2    -0.3   0.5    0.1    0.3    0.6    -0.4   0.8

[Figure: multilayer feed-forward neural network diagram]

3.

Why are OLAP operations used? Discuss the various OLAP operations with a suitable example of each.

Section B

Attempt any eight questions.

[8x5=40]
4.

Suppose that we have 5-dimensional data. What will be the total number of cuboids generated? If each dimension has 5 levels, what will be the number of cuboids generated?
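
The arithmetic behind this question can be sketched with the usual cuboid-count formula, the product of (L_i + 1) over all dimensions. The sketch assumes that L_i counts the hierarchy levels of dimension i excluding the virtual top level "all", and that a dimension without a hierarchy has L_i = 1. The function name `cuboid_count` is illustrative.

```python
from math import prod


def cuboid_count(levels_per_dimension):
    """Total cuboids = product of (L_i + 1) over all dimensions.

    The +1 accounts for the virtual top level "all" of each dimension
    (assumption: the stated level counts do not already include it).
    """
    return prod(levels + 1 for levels in levels_per_dimension)


# 5 dimensions, no concept hierarchy: one level each, so 2^5 cuboids.
no_hierarchy = cuboid_count([1] * 5)

# 5 dimensions with 5 levels each: (5 + 1)^5 cuboids.
with_hierarchy = cuboid_count([5] * 5)

print(no_hierarchy, with_hierarchy)
```

If the 5 levels are taken to include "all" already, the second count becomes 5^5 instead, so an answer should state which convention it uses.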

5.

Discuss the different types of attributes with a suitable example of each.

6.

Why is data normalization important in data mining? Explain the min-max and Z-score normalization approaches.
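
The two normalization approaches named here can be illustrated with a short sketch. The sample values are hypothetical and not part of the question paper. The Z-score version uses the population standard deviation, which is one common convention; neither function handles the case where all values are equal.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Min-max: v' = (v - min) / (max - min) * (new_max - new_min) + new_min."""
    lo, hi = min(values), max(values)
    # Assumption: hi > lo, otherwise this divides by zero.
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Z-score: v' = (v - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed convention)
    return [(v - mu) / sigma for v in values]


incomes = [200, 300, 400, 600, 1000]  # hypothetical sample data
print(min_max(incomes))
print(z_score(incomes))
```

Min-max maps the values into a fixed range, while Z-score centres them at zero with unit standard deviation, which makes it less sensitive to the exact minimum and maximum.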

7.

What are the two categories of hierarchical clustering? Divide the following data points into two clusters using agglomerative clustering.
{(2,10), (2,5), (8,4), (5,8), (7,5), (6,4)}
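
A hand-worked answer for these data points can be cross-checked with a small sketch of agglomerative clustering. The question does not name a linkage criterion or distance measure, so the sketch assumes single linkage with Euclidean distance; complete or average linkage may give a different grouping. The function names are illustrative.

```python
from itertools import combinations
from math import dist

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4)]


def single_link(a, b):
    """Single linkage: distance between the closest pair across two clusters."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points, k, linkage=single_link):
    """Start with one cluster per point and merge the closest pair until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Pick the pair of cluster indices (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # Merge cluster j into cluster i; j > i, so index i stays valid.
        clusters[i] += clusters.pop(j)
    return clusters


clusters = agglomerate(points, 2)
for c in clusters:
    print(sorted(c))
```

Passing a different `linkage` function, for example one that uses `max` instead of `min`, is enough to try complete linkage on the same points.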

8.

Discuss the concepts of the K-means and Mini-batch K-means algorithms.

9.

What is a confusion matrix? Discuss various classification measures along with their mathematical formulae.
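
The standard measures derived from a binary confusion matrix can be sketched as follows. The counts are hypothetical and only illustrate the formulae; the function name `classification_measures` is illustrative.

```python
def classification_measures(tp, fp, fn, tn):
    """Common measures computed from a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    recall = tp / (tp + fn)           # sensitivity, true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
measures = classification_measures(tp=40, fp=10, fn=20, tn=30)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```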

10.

What are the application areas of graph mining? Explain the concept behind inductive logic programming with a suitable demonstration.

11.

Discuss the concept of text mining with its practical implications.

12.

Write short notes on:
a. Data Mart
b. Market Basket Analysis