Data Mining and Data Warehousing Question Paper of 2079

This is the Data Mining and Data Warehousing question paper of 2079.

University: Tribhuvan University, Faculty of Management.

Data Mining and Data Warehousing Question Paper of 2079

Full Marks: 40

Pass Marks: 20

Time: 2 hrs

Semester: 8th

Subject Code: IT 308

Subject: Data Mining and Data Warehousing

Candidates are required to answer all the questions in their own words as far as practicable.

Group “A”

1. Brief Answer Questions [10 × 1 = 10]

i. Mention any two data mining techniques.

ii. What is the attribute selection measure in a decision tree?

iii. How do you validate the classification model?

iv. What are the different types of data used for cluster analysis?

v. Define base cuboid.

vi. List the advantages of MOLAP.

vii. Mention the purpose of the FP-tree.

viii. List any two pitfalls of data mining.

ix. Give any two applications of web mining.

x. Differentiate between agglomerative and divisive hierarchical clustering.

Group “B”

Exercise Problems [5 × 4 = 20]

2. If epsilon = 2 and MinPts = 2, which points would DBSCAN identify as core points, border points, and outliers in the data set A(3,10), B(2,3), C(3,4), D(6,7), and E(7,6)?
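A minimal sketch of how the point labels in Question 2 could be checked. It assumes Euclidean distance and that a point counts as its own neighbor when comparing against MinPts (a common convention, but the question does not state it). The names `neighbors`, `core`, `border`, and `outliers` are illustrative only.

```python
from math import dist

# Data set and parameters taken from Question 2.
points = {"A": (3, 10), "B": (2, 3), "C": (3, 4), "D": (6, 7), "E": (7, 6)}
EPS, MIN_PTS = 2, 2

def neighbors(name):
    """Points within EPS of `name` (Euclidean), including the point itself (assumption)."""
    return {other for other, xy in points.items() if dist(points[name], xy) <= EPS}

# Core: at least MIN_PTS points in the EPS-neighborhood.
core = {p for p in points if len(neighbors(p)) >= MIN_PTS}
# Border: not core, but within EPS of at least one core point.
border = {p for p in points if p not in core and neighbors(p) & core}
# Outlier (noise): neither core nor border.
outliers = set(points) - core - border

print("core:", sorted(core))
print("border:", sorted(border))
print("outliers:", sorted(outliers))
```

If the neighborhood is taken to exclude the point itself, the labels change, so the convention should be stated in a written answer.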

3. Given the following transaction set, find the frequent itemset using the Apriori algorithm.

T1{ pasta, lemon, bread, orange }
T2{ pasta, lemon }
T3{ pasta, orange, cake }
T4{ pasta, lemon, orange, cake }

Minimum support = 2
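The level-wise search asked for in Question 3 can be sketched as follows. This is a small illustration rather than a reference implementation: it treats minimum support as an absolute transaction count, and the helper names `support` and `apriori` are hypothetical.

```python
from itertools import combinations

# Transactions from Question 3 (T1 to T4).
transactions = [
    {"pasta", "lemon", "bread", "orange"},
    {"pasta", "lemon"},
    {"pasta", "orange", "cake"},
    {"pasta", "lemon", "orange", "cake"},
]
MIN_SUPPORT = 2  # assumed to be an absolute count

def support(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)

def apriori():
    items = sorted(set().union(*transactions))
    # Level 1: frequent single items.
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= MIN_SUPPORT]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        k += 1
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = [c for c in candidates if support(c) >= MIN_SUPPORT]
    return frequent

frequent_itemsets = apriori()
for itemset, count in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```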

4. Illustrate the significance of authorities and hubs in ranking web pages.

5. What is an operational data source? List some guidelines to be considered in data warehouse implementation.

6. Assume the following training set with two classes, food and beverage.

Food: “turkey stuffing”

Food: “Buffalo wings”

Beverage: “cream soda”

Beverage: “orange soda”

Apply K-nearest neighbor with K = 3 to classify the new document “turkey soda”.
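One possible way to work through Question 6 in code is sketched below. The question does not fix a document representation or distance measure, so this sketch assumes binary bag-of-words vectors over the training vocabulary, Euclidean distance, and a simple majority vote. The names `vectorize` and `classify` are illustrative.

```python
from collections import Counter
from math import dist

# Training documents from Question 6.
training = [
    ("turkey stuffing", "food"),
    ("Buffalo wings", "food"),
    ("cream soda", "beverage"),
    ("orange soda", "beverage"),
]
K = 3

# Vocabulary built from the training documents only (lowercased).
vocab = sorted({w for text, _ in training for w in text.lower().split()})

def vectorize(text):
    """Binary bag-of-words vector over `vocab` (assumed representation)."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in vocab]

def classify(text):
    """Majority label among the K training documents closest to `text`."""
    query = vectorize(text)
    ranked = sorted(training, key=lambda pair: dist(query, vectorize(pair[0])))
    votes = Counter(label for _, label in ranked[:K])
    return votes.most_common(1)[0][0]

print(classify("turkey soda"))
```

A different representation (for example term frequency with cosine similarity) could be used instead; the choice should be stated in the answer.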

Group “C”

Comprehensive Questions [2 × 5 = 10]

7. Define the dimension table. List the responsibilities of the query manager.

8. What might be the cause of overfitting in the classifier? Explain some data cube operations.

