
Data Engineering Lab

List of Programs:

1. Create a star schema, snowflake schema, or fact constellation schema using any tool.

a) All Electronics sales application.

b) Identify the facts and dimensions for a banking environment.

2. Compute all the cuboids of a 4D cube using group-bys.

3. Compute all the cuboids of a 4D cube using the ROLLUP and CUBE operators of Oracle SQL.

4. SQL queries for implementing different OLAP operations.

5. Write high-level language programs to implement different data preprocessing techniques.

a. Suppose that the data for analysis includes the attribute age. The age values for the data
tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. Write a C program to implement smoothing
by bin means to smooth the data, using a bin depth of 3.
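
As a starting point for part (a), the following is a minimal sketch of smoothing by bin means, assuming equal-depth (equal-frequency) bins taken in order from the sorted data. The function name `smooth_by_bin_means` and its parameters are illustrative choices, not part of the assignment, and the handling of a short final bin is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Smooth sorted data by bin means using equal-depth bins.
 * Every value in `in` is replaced in `out` by the mean of its bin.
 * Assumption: if n is not a multiple of depth, the last (shorter)
 * bin is averaged over the values it actually holds. */
void smooth_by_bin_means(const double *in, double *out, size_t n, size_t depth)
{
    assert(depth > 0);
    for (size_t start = 0; start < n; start += depth) {
        size_t end = start + depth;
        if (end > n)
            end = n;

        /* Mean of the current bin. */
        double sum = 0.0;
        for (size_t i = start; i < end; i++)
            sum += in[i];
        double mean = sum / (double)(end - start);

        /* Replace each member of the bin by that mean. */
        for (size_t i = start; i < end; i++)
            out[i] = mean;
    }
}
```

Called from `main` with the 27 age values and a depth of 3, this yields nine bins whose means are approximately 14.67, 18.33, 21, 24, 26.67, 33.67, 35, 40.33, and 56.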

b. Write a C program to calculate the correlation coefficient. Use the following data to check
your code.
Suppose a hospital tested the age and body fat data for 18 randomly selected adults with
the following result:

Are these two variables positively or negatively correlated?

6. Write a C program to implement:

(a) min-max normalization
(b) z-score normalization
(c) normalization by decimal scaling

7. Write a high-level language program for the following:

Suppose that the data for analysis includes the attribute age. The age values for the data tuples are
(in increasing order)

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 36, 40, 45, 52, 70.

i) What is the mean of the data? What is the median?
ii) What is the mode of the data? Comment on the data's modality (bimodal, trimodal, etc.).
iii) What is the midrange of the data?
iv) Can you find the first quartile (Q1) and the third quartile (Q3) of the data?
v) Give the five-number summary of the data.
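
Parts (i) to (v) can be approached with a single pass over the sorted data, as in the sketch below. The `Summary` struct and the function names are illustrative. Quartile conventions differ between textbooks and tools; this sketch assumes Q1 and Q3 are the medians of the lower and upper halves, excluding the overall median when the number of values is odd.

```c
#include <assert.h>
#include <stddef.h>

/* Median of the sorted slice a[lo..hi), hi exclusive. */
static double median_sorted(const double *a, size_t lo, size_t hi)
{
    size_t n = hi - lo;
    assert(n > 0);
    size_t mid = lo + n / 2;
    return (n % 2) ? a[mid] : (a[mid - 1] + a[mid]) / 2.0;
}

typedef struct {
    double mean, midrange;
    double min, q1, median, q3, max; /* five-number summary */
    double mode;        /* smallest of the most frequent values */
    size_t mode_freq;   /* how often a mode occurs */
    size_t mode_count;  /* number of distinct modes: 1 = unimodal, 2 = bimodal */
} Summary;

/* Summarize data that is already sorted in increasing order. */
Summary summarize_sorted(const double *a, size_t n)
{
    assert(n >= 2);
    Summary s;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    s.mean = sum / (double)n;

    s.min = a[0];
    s.max = a[n - 1];
    s.midrange = (s.min + s.max) / 2.0;
    s.median = median_sorted(a, 0, n);
    s.q1 = median_sorted(a, 0, n / 2);        /* lower half */
    s.q3 = median_sorted(a, (n + 1) / 2, n);  /* upper half */

    /* Modes: track the longest runs of equal values in the sorted data. */
    size_t run = 0;
    s.mode = a[0];
    s.mode_freq = 0;
    s.mode_count = 0;
    for (size_t i = 0; i < n; i++) {
        run = (i > 0 && a[i] == a[i - 1]) ? run + 1 : 1;
        if (run > s.mode_freq) {
            s.mode_freq = run;
            s.mode = a[i];
            s.mode_count = 1;
        } else if (run == s.mode_freq) {
            s.mode_count++;
        }
    }
    return s;
}
```

For the 25 values as listed above, this convention gives a mean of 29.12, a median of 25, a single mode of 25 (occurring four times), a midrange of 41.5, Q1 = 20, Q3 = 35, and a five-number summary of 13, 20, 25, 35, 70. Other quartile definitions can give slightly different Q1 and Q3.
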
8. Implement various classification techniques on data sets using a data mining tool.

9. Estimate the values of numeric attributes, through prediction, using a data mining tool.

10. Mine strong association rules out of a given data set using a mining tool.

11. Cluster the given set of data objects by applying various clustering techniques using a data
mining tool.

12. Write high-level language programs to implement association rule mining/classification
techniques.
