
SUMMARIZING BIVARIATE DATA

-goal of the analysis is to give a quantitative description of the relationship between two variables
-strong/weak relationship
-if the goal of the study is to describe the strength of the relationship and the scatterplot shows a linear pattern, use the correlation coefficient or the coefficient of determination

Correlation coefficient
-numerical assessment of the strength of relationship between the x and y values in a bivariate data set consisting of (x,y) pairs
• Pearson’s Sample Correlation Coefficient
-measures the strength of any linear relationship between two numerical variables

Sample Correlation Coefficient
-utilizes z-scores
-z-score = (value - mean)/standard deviation

Properties of r
-value of r is between -1 and 1
-strength of the linear relationship by value of r:
  r from -1 to -0.8: strong
  r from -0.8 to -0.5: moderate
  r from -0.5 to 0.5: weak
  r from 0.5 to 0.8: moderate
  r from 0.8 to 1: strong
-value near +1 or -1 indicates a strong positive/negative linear relationship
-value between -0.5 and 0.5 indicates a weak positive/negative linear relationship

Regression
-predict the value of a second variable y using the information about x
-predict y (dependent variable) using x (independent variable)
FORMULA: y = a + bx

Principle of Least Squares
-a particular line is a good fit to the data if the deviations from the line are small in magnitude
-combine deviations into a single measure of fit by squaring the deviations and summing them up
-sum of squared deviations
$\sum [y - (a + bx)]^2$
-slope of the least-squares line
$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$
-y-intercept
$a = \bar{y} - b\bar{x}$

Residuals
-vertical deviation of a point in the scatterplot from the regression line
-difference between an observed y value and the corresponding predicted y value
-positive residual: point lies above the line
-negative residual: point lies below the line
$y_n - \hat{y}_n$ = nth residual
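The correlation and least-squares ideas on this page can be checked numerically. The following is a minimal sketch, not a reference implementation. It assumes the usual z-score form of Pearson's r (sum of the products of the x and y z-scores, divided by n - 1), which these notes only hint at with "utilizes z-scores". The data values and function names are made up for illustration.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sample correlation from z-scores (assumed form: sum(zx * zy) / (n - 1))."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    z_products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + bx."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b


def residuals(xs, ys, a, b):
    """Observed y minus predicted y for each (x, y) pair."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
a, b = least_squares(xs, ys)
res = residuals(xs, ys, a, b)

print(f"r = {r:.4f}")
print(f"least-squares line: y = {a:.2f} + {b:.2f}x")
print("residuals:", [round(e, 2) for e in res])
```

A positive entry in `res` corresponds to a point above the fitted line, a negative entry to a point below it.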
• Influential observation
-an observation whose x value is far away from the rest of the data and whose removal has a large impact on the value of the slope or intercept of the least-squares line
• Outliers
-an observation that has a large residual

Standard Deviation
$s_e = \sqrt{SSResid/(n-2)}$
-standard deviation about the least-squares line
-typical amount by which an observation deviates from the least-squares line
-SSResid is the residual sum of squares (see Coefficient of determination); n is the number of (x,y) pairs
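The standard deviation about the least-squares line, s_e = sqrt(SSResid/(n-2)), can be sketched as follows. This is an illustration under stated assumptions: the data are made up, and the line y = 2.2 + 0.6x is taken as given (it is the least-squares line for this particular made-up data set). The function name is hypothetical.

```python
from math import sqrt


def std_dev_about_line(xs, ys, a, b):
    """s_e = sqrt(SSResid / (n - 2)) for the line y = a + bx."""
    n = len(xs)
    # SSResid: sum of squared residuals (observed y minus predicted y).
    ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(ss_resid / (n - 2))


# Hypothetical data and an assumed fitted line y = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_e = std_dev_about_line(xs, ys, a=2.2, b=0.6)

print(f"s_e = {s_e:.4f}")
```

The divisor is n - 2 rather than n because two quantities (slope and intercept) are estimated from the data before the residuals are computed.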

Coefficient of determination
-measure of the proportion of variability in the y variable that can be attributed to an approximate linear relationship between x and y

-total sum of squares (SSTo)
-measure of total variation in y; a larger value of SSTo means a greater amount of variability

-residual sum of squares (SSResid)
-measure of unexplained variation
-amount of variation in y that cannot be attributed to the linear relationship between x and y
-a larger value of SSResid means the points in the scatterplot deviate more from the least-squares line

-SSResid/SSTo
-fraction/proportion of total variation that is unexplained by a straight-line relation (least-squares line)

-$r^2 = 1 - SSResid/SSTo$
-fraction/proportion of total variation that is explained by a straight-line relation (least-squares line)
-the closer this percentage is to 100%, the more successful the linear relationship is in explaining variation in y
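A short numerical sketch of the coefficient of determination follows. It assumes the standard definitions SSTo = sum of (y - mean of y)^2 and SSResid = sum of (y - predicted y)^2, which these notes describe in words but do not spell out. The data, the fitted line y = 2.2 + 0.6x, and the function name are made up for illustration.

```python
def sums_of_squares(ys, predicted):
    """Return (SSTo, SSResid) for observed values and their predictions."""
    y_bar = sum(ys) / len(ys)
    ss_to = sum((y - y_bar) ** 2 for y in ys)                            # total variation
    ss_resid = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted))  # unexplained variation
    return ss_to, ss_resid


# Hypothetical data; predictions come from an assumed line y = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
predicted = [2.2 + 0.6 * x for x in xs]

ss_to, ss_resid = sums_of_squares(ys, predicted)
unexplained = ss_resid / ss_to  # proportion of variation left unexplained
r_squared = 1 - unexplained     # proportion of variation explained

print(f"SSTo = {ss_to:.2f}, SSResid = {ss_resid:.2f}")
print(f"r^2 = {r_squared:.2f} ({r_squared:.0%} of the variation in y explained)")
```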
PROBABILITY

-the extent to which an event is likely to occur, as measured by the ratio of the favorable cases to the whole number of cases possible
-chance experiment
-sample space

Chance experiment
-any activity or situation in which there is uncertainty about which of two or more possible outcomes will result

Event
-any collection of outcomes from the sample space of a chance experiment

Relationships
-union
-intersection
-complement

Approaches
• classical approach
-approach to probability for equally likely outcomes
-ratio of the number of outcomes favorable to E to the total number of outcomes in the sample space
• subjective approach
-a probability is interpreted as a personal measure of the strength of belief that a particular event will occur
• relative frequency approach
-uses how often an event occurs
-ratio of the number of times an event occurs in a process to the total number of times the process is done
-P(E) = number of times E occurs/number of times the experiment is performed

Basic Properties of Probability
1 For any event E, 0 ≤ P(E) ≤ 1.
2 If S is the sample space for a chance experiment, P(S) = 1.
3 If two events E and F are mutually exclusive, then P(E or F) = P(E) + P(F).
4 For any event E, P(E) + P(not E) = 1.

Conditional Probability
-probability that an event E occurs given that another event F has occurred
-P(E|F) = P(E∩F)/P(F)

Independence
-two events are independent if knowing that one has occurred does not change the assessment of the probability of occurrence of the second event
-E and F are independent if P(E|F) = P(E) and P(F|E) = P(F)
-if two events E and F are not independent, they are dependent events
-for any two events, P(E|F) = P(E∩F)/P(F), so P(E∩F) = P(E|F) P(F)
-for independent events, P(E∩F) = P(E|F) P(F) = P(E) P(F)
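The classical approach, conditional probability, and the independence check can be tried on a small sample space. The sketch below uses two fair six-sided dice as an assumed chance experiment; the event names and helper functions are hypothetical and chosen only for illustration. The last part estimates a probability by the relative frequency approach using simulated rolls.

```python
from fractions import Fraction
from itertools import product
import random

# Sample space for rolling two fair dice: 36 equally likely outcomes.
sample_space = list(product(range(1, 7), repeat=2))


def prob(event):
    """Classical approach: favorable outcomes / total outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))


def cond_prob(e, f):
    """P(E|F) = P(E and F) / P(F), assuming P(F) > 0."""
    return prob(lambda o: e(o) and f(o)) / prob(f)


def first_is_six(o):
    return o[0] == 6


def second_is_six(o):
    return o[1] == 6


def sum_is_twelve(o):
    return o[0] + o[1] == 12


# Independent pair: conditioning on the second die does not change P(first is six).
independent = cond_prob(first_is_six, second_is_six) == prob(first_is_six)

# Dependent pair: knowing the first die is six changes P(sum is twelve).
dependent = cond_prob(sum_is_twelve, first_is_six) != prob(sum_is_twelve)

# Relative frequency approach: proportion of sixes in simulated single-die rolls.
rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 10_000
hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
rel_freq = hits / trials

print("P(first is six) =", prob(first_is_six))
print("P(sum is twelve | first is six) =", cond_prob(sum_is_twelve, first_is_six))
print("independent pair:", independent, "| dependent pair:", dependent)
print("relative frequency of a six:", rel_freq)
```

Using `Fraction` keeps the classical probabilities exact, so the equality test for independence is not affected by floating-point rounding.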
