Data Mining Methods Basics

---------------------------------

1. Which of the following is not applicable to Data Mining? - Involves working with known information
2. The process of extracting valid, useful, previously unknown information from data and using it to make
proactive, knowledge-driven business decisions is called - Data mining
3. Which of the following modelling types should be used for labelled data? - Predictive modelling
4. Which of the following roles is responsible for performing validation on analysis datasets? - Statisticians
5. Noisy values are values that are valid for the dataset but are incorrectly recorded - True
6. What is the other name for the Data Preparation stage of the Knowledge Discovery Process? - ETL
7. Which of the following activities is performed as part of data preprocessing? - Detect missing
values / All the options
8. If time is used as the independent variable in a simple linear regression analysis, which of the following
assumptions could be violated? - Successive observations of the dependent variable are uncorrelated
9. The probability of theft in an area is 0.03, with an expected loss of 20% or 30% of belongings with
probabilities 0.55 and 0.45, respectively. An insurance policy from A costs $150 per annum with 100%
repayment. A policy from B costs $100 per annum, and the first $500 of any loss has to be paid by the
owner. Which data mining technique can be used to choose the policy? - [answer missing in source]
10. The statistical technique used for investigating and modelling the relationship between two or more
variables is - Regression analysis
11. What is the type of learning where a function is inferred to describe hidden structure from unlabeled
data? - Unsupervised learning
12. Which statistical technique deals with finding a structure in a collection of unlabeled data? - Clustering

13. _________ are the values that mark the boundaries of the confidence interval. - Confidence limits
14. The machine learning task of inferring a function from labelled training data is known as - Supervised
learning
15. Regression is typically carried out to develop a mathematical model of the process - True
16. Which of the following is a multi-class classification problem? - Is this movie a comedy, a documentary,
or a thriller?
17. Simulations are carried out to develop a mathematical model of the process - False
18. Which data mining method groups together objects that are similar to each other and dissimilar to
objects in other groups? - Clustering
19. Association rule mining is also known as _________ - Affinity analysis
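
For question 9, one way to compare the two policies is to compute the expected annual cost of each. The sketch below does this in R. The question does not state the total value of the belongings, so `property_value` is a hypothetical figure chosen only for illustration, and all variable names are assumptions rather than part of the quiz.

```r
# Expected annual cost of each policy in question 9.
# ASSUMPTION: the question does not give the value of the belongings;
# property_value is a hypothetical figure used only for illustration.
property_value <- 10000

p_theft       <- 0.03           # probability of a theft in a year
loss_fraction <- c(0.20, 0.30)  # share of belongings lost if a theft occurs
loss_prob     <- c(0.55, 0.45)  # probability of each loss level, given a theft

# Loss in dollars for each of the two loss levels
loss <- loss_fraction * property_value

# Policy A: $150 premium, every loss repaid in full.
cost_a <- 150

# Policy B: $100 premium, the owner pays the first $500 of any loss.
owner_share <- pmin(loss, 500)
cost_b <- 100 + p_theft * sum(loss_prob * owner_share)

# No insurance: the owner bears the full expected loss.
cost_none <- p_theft * sum(loss_prob * loss)

print(c(A = cost_a, B = cost_b, uninsured = cost_none))
```

With the assumed value of 10000, the expected annual cost is 150 for policy A and 115 for policy B. A different property value changes the uninsured figure and may change policy B's figure.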

R Basic

--------------

1. Which command allows you to get the median tree Height of the R sample dataset "trees"? -
median(trees$Height)
2. What would be the result of the following code: x <- 4; class(x) - numeric
3. What is the class of the object defined by the expression x <- c(4, "a", TRUE) in R? - character
4. What would be the output of the following code: x <- c("a", "b", "c", "c", "d", "a"); x[c(1, 3, 4)] - "a" "c"
"c"
5. Which function can be used to create a collection of vector objects? - c()
6. Which function is used to generate sequences in R? - seq()
7. Which of the following statements is correct? - All
8. If I have two vectors, x <- c(1,3,5) and y <- c(3,2,10), what does rbind(x,y) give? - A 2 x 3 matrix
9. What command will you enter in the R console to get help on how to quit R? - help(q)
10. What is the function in R to get the number of observations in a data frame? - nrow() (n() counts rows only inside dplyr verbs)
11. What would be the result of the following code: x <- c("x", "y", "z"); as.logical(x) - NA NA NA
12. In R, the following are all atomic data types except - Data frame
13. A key property of vectors in R is that - All elements must be of the same class
14. Which of the following while loops will print the numbers from 1 to 4? - x <- 1; while (x < 5) { print(x); x <- x + 1 }
15. Which R command creates a 2 by 2 matrix with the values 1, 2, 3 and 4? - m <- matrix(1:4, 2, 2)
16. What is the output of the R code: m <- c(1, 2, 3); n <- c(6, 5, 4); (m < 2) & (n > 5) - TRUE FALSE FALSE
17. Which of the following statements is correct? - All
18. Suppose I have a vector x <- c(3,5,1,10,12,6) and I want to set all elements of this vector that are less
than 6 to be equal to zero. What R code achieves this? - None of the options
19. What would be the result of the following code: x <- 0:4; as.logical(x) - FALSE TRUE TRUE TRUE TRUE
20. Which of the following statements is correct? - The number Inf represents infinity in R
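
Several of the answers in this section can be checked directly in an R console. The following is a minimal sketch using only base R. The expected values in the `stopifnot()` calls are taken from the answers listed in this section. The `v[v < 6] <- 0` line is one common idiom for question 18 and is not necessarily among the quiz options, which the source does not list. The variable names `i` and `v` are chosen here for illustration.

```r
# Q2, Q3: class of a number, and of a mixed vector (coerced to character)
stopifnot(class(4) == "numeric")
stopifnot(class(c(4, "a", TRUE)) == "character")

# Q4: indexing a vector by position
x <- c("a", "b", "c", "c", "d", "a")
stopifnot(identical(x[c(1, 3, 4)], c("a", "c", "c")))

# Q8, Q15: rbind() stacks vectors as rows; matrix() builds a 2 x 2 matrix
stopifnot(identical(dim(rbind(c(1, 3, 5), c(3, 2, 10))), c(2L, 3L)))
stopifnot(identical(dim(matrix(1:4, 2, 2)), c(2L, 2L)))

# Q11, Q19: coercion to logical
stopifnot(all(is.na(as.logical(c("x", "y", "z")))))
stopifnot(identical(as.logical(0:4), c(FALSE, TRUE, TRUE, TRUE, TRUE)))

# Q14: while loop that prints the numbers 1 to 4
i <- 1
while (i < 5) {
  print(i)
  i <- i + 1
}

# Q16: element-wise comparison combined with element-wise AND
m <- c(1, 2, 3)
n <- c(6, 5, 4)
stopifnot(identical((m < 2) & (n > 5), c(TRUE, FALSE, FALSE)))

# Q18: set every element below 6 to zero using logical indexing
v <- c(3, 5, 1, 10, 12, 6)
v[v < 6] <- 0
stopifnot(identical(v, c(0, 0, 0, 10, 12, 6)))
```

If every `stopifnot()` call passes silently, the listed answers agree with what R returns.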
