
Correlation and linear regression

Q. Explain briefly the differences between simple linear regression and correlation and indicate the assumptions
common to both

Correlation
 No assumption is made as to whether one variable is dependent on the other(s), and it is not concerned with the relationship between variables
 It gives an estimate of the degree of association between the variables

Regression
 Regression attempts to describe the dependence of a variable on one (or more) explanatory variables
 It implicitly assumes that there is a one-way causal effect from the explanatory variable(s) to the response variable, regardless of whether the path of effect is direct or indirect

Common to both
 Both methods attempt to describe the association between two (or more) variables

Overview
What is correlation ?
 A correlation is a number between -1 and +1 that measures the degree of association between two variables
(call them X and Y).

What is regression?
 Regression is a method that describes the dependence of one variable on another variable

What term is used to summarize correlation?
 the correlation coefficient

What factors summarize the regression?
 the slope and the intercept

What is a correlation coefficient?
 a term that summarizes the strength of association between 2 variables

What is the function of correlation coefficients?
 to test the hypothesis that two variables are associated
 from this we can judge whether an observed association could be due to chance (random error)

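Does correlation examine the direction of the relationship (which variable depends on which)?
 No; the sign of the coefficient shows only whether the association is positive or negative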

What are the 2 types of correlation coefficient?
 Spearman correlation coefficient
 Pearson correlation coefficient

What is the Pearson correlation coefficient?
 a correlation coefficient based on the original data values

What is the Spearman correlation coefficient?
 a correlation coefficient based on the ranks of the data, so it suits ranked (ordinal) data

What is the symbol for the correlation coefficient?
 r

What is r? What values can r take?
 a dimensionless value
 it ranges from -1 to +1

What are the 2 directions of correlation?
 positive correlation & negative correlation

What is a positive correlation?
 r is greater than 0: as one variable increases, the other tends to increase

What are the 2 types of positive correlation?
 strong positive correlation
 weak positive correlation

What are the 2 types of negative correlation?
 strong negative correlation
 weak negative correlation
 an r of 0 means no (linear) correlation

Is the correlation coefficient affected by the units of measurement?
 no

Can correlation be used for a non-linear relationship?
 no; it is designed to describe a linear association

What are 2 cautions to consider when measuring correlation coefficients?
 first, the presence of outliers
 second, whether the variables were measured in more than 1 distinct group
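
The points above (r lies between -1 and +1, is unaffected by units, and the Spearman coefficient is based on ranks) can be sketched in a few lines of Python. This is only an illustration: the age and hemoglobin values are made up, and the helper names `pearson_r`, `ranks` and `spearman_r` are hypothetical, not taken from any library.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from the original values."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_r(xs, ys):
    """Spearman coefficient: Pearson correlation applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: age in years and hemoglobin level.
age_years = [20, 30, 40, 50, 60]
hb = [11.8, 12.9, 13.1, 14.6, 15.0]

r = pearson_r(age_years, hb)
r_in_months = pearson_r([a * 12 for a in age_years], hb)  # same data, new units
rho = spearman_r(age_years, hb)

print(round(r, 3), round(r_in_months, 3), round(rho, 3))
```

Changing age from years to months leaves r unchanged, and because hemoglobin rises steadily with age in this made-up sample, the rank-based coefficient equals 1.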

Overview
What is regression?
 it examines the dependence of one variable on another variable

How is the regression relationship summarized?
 the relationship is summarized by the slope and the intercept

Summary
What is slope?
 the amount by which the dependent variable changes for each one-unit increase in the independent variable

What is intercept?
 the value of the dependent variable when the independent variable is zero

Classification
What are the 2 types of regression?
 multivariate regression & linear regression

What is multivariate regression?
 regression that examines the simultaneous relationship between a dependent variable and several continuous variables

What is the formula for multivariate regression?
 y = a + b1x1 + b2x2

Give an example?
 predicted hemoglobin level for different age and hematocrit level
 Hb = 5.24 + 0.11 (age) + 0.097 (Hct)
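
As a quick check of how such an equation is used, the sketch below evaluates the hemoglobin example for one patient. The coefficients come from the example above; the function name `predicted_hb` and the patient values are hypothetical, and the notes do not state the units, so none are assumed here.

```python
def predicted_hb(age, hct, a=5.24, b1=0.11, b2=0.097):
    """Evaluate y = a + b1*x1 + b2*x2 with the coefficients from the example."""
    return a + b1 * age + b2 * hct

# Hypothetical patient: age 30, hematocrit 40.
hb = predicted_hb(age=30, hct=40)
print(round(hb, 2))  # 5.24 + 3.3 + 3.88 = 12.42
```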

What is linear regression?
 it examines how a change in one (independent) variable changes the other (dependent) variable

Give an example of linear regression?
 hemoglobin increases with age

What are the 2 components of linear regression?
 the y axis is the dependent variable
 the x axis is the independent variable

What is the y variable?
 the dependent variable

What is the x variable?
 the independent variable

Regression line
What is the regression line?
 it is the line given by the regression equation, which describes the relationship between the dependent variable and the independent variable

Give the regression equation?
 y = a + bx
 where a is the intercept and b is the slope

How do we calculate b?
 $b = \sum (x - \bar{x})(y - \bar{y}) / \sum (x - \bar{x})^2$
 where $\bar{x}$ and $\bar{y}$ are the means of x and y
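
The slope formula can be tried out directly. The sketch below is a minimal illustration with made-up data; the helper name `fit_line` is hypothetical, and the intercept is obtained as a = (mean of y) - b x (mean of x), which is the usual least-squares result but is not stated in these notes.

```python
import statistics

def fit_line(xs, ys):
    """Return (a, b) for the regression line y = a + b*x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    # Slope: sum of cross-products over sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept (assumption, standard least squares): the line passes
    # through the point (mean of x, mean of y).
    a = mean_y - b * mean_x
    return a, b

# Made-up data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_line(xs, ys)
print(a, b)  # intercept 1.0, slope 2.0
```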

Measures of data symmetry


What are the 2 types of data symmetry?
 Data may have a symmetric distribution or a skewed distribution

What does symmetric data look like?
 First, the mean & median will be close to each other
 Second, the data can be expressed as mean & SD

How do we express symmetric data?
 By mean & SD

What does an asymmetric distribution look like?

What is a skewed distribution?
 Data are skewed to the right when the tail is longer on the right
 Data are skewed to the left when the tail is longer on the left

How do we express skewed data?
 Data are expressed as median and interquartile range

Why don't we use mean & SD?
 Because mean & SD are very sensitive to skewness

What is skewness?
 It is an index of the extent to which a distribution is not symmetric, that is, of how far the tail of the distribution extends to the left or right

How do we calculate skewness ?


 Sk= 3(mean-median)/SD
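
A small sketch of this formula, assuming the SD is the sample standard deviation (divisor n - 1). The data sets and the helper name `pearson_skewness` are made up for illustration.

```python
import statistics

def pearson_skewness(values):
    """Sk = 3 * (mean - median) / SD, using the sample SD (n - 1)."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)
    return 3 * (mean - median) / sd

symmetric = [1, 2, 3, 4, 5]                # mean equals median
right_tail = [1, 2, 2, 3, 3, 3, 4, 20]     # one extreme high score
left_tail = [-20, 1, 2, 2, 3, 3, 3, 4]     # one extreme low score

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    print(name, round(pearson_skewness(data), 2))
```

The symmetric sample gives zero, the sample with a long right tail gives a positive value, and the sample with a long left tail gives a negative value.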

How do we know from the data distribution that the data are normally distributed?
 The mean & median of the data are close to each other

What is the use of skewness?
 To assess whether the data are normally distributed

What is the skewness of a normal distribution?
 The normal distribution is symmetric
 Therefore, its skewness is zero

What about skewness to the left and right?
 If the data are skewed to the left, the distribution has negative skewness
 When a curve has extreme scores on the left-hand side of the distribution, it is said to be negatively skewed
 If the data are skewed to the right, the distribution has positive skewness
 When a curve has extreme scores on the right-hand side of the distribution, it is said to be positively skewed

What does a skewed curve look like?

What is kurtosis?
 A measure of the extent to which observations are clustered in the tails

What is the use of kurtosis?
 To assess whether the data are normally distributed
 Used together with skewness

What is the value of kurtosis if the data are normally distributed?
 Zero (when expressed as excess kurtosis, that is, kurtosis minus 3)

What does the curve look like?

What happens if the data have negative kurtosis?
 The tails of the distribution become lighter
 Fewer cases fall into the tails
 The curve tends to have a lower, flatter peak than a normal curve
 Also referred to as platykurtic

What happens if the data have positive kurtosis?
 The tails of the distribution become heavier
 Many cases fall into the tails
 The curve tends to have a higher, sharper peak than a normal curve
 Also referred to as leptokurtic
 Mnemonic: lepto = thin ("kurus" in Malay), a slender, peaked curve
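
The sign of kurtosis can be checked numerically. The notes give no formula, so the sketch below assumes the common moment-based definition of excess kurtosis (fourth central moment divided by the squared second central moment, minus 3), which is zero for a normal distribution. The data and the helper name `excess_kurtosis` are hypothetical.

```python
import statistics

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (population moments)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3.0

# Evenly spread scores: few cases in the tails.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Mostly central scores plus two extreme ones: many cases in the tails
# relative to the spread.
heavy_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

print(round(excess_kurtosis(flat), 2))         # negative
print(round(excess_kurtosis(heavy_tails), 2))  # 2.0
```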

Non parametric test

Define nonparametric data?
 Nonparametric data are qualitative data, such as ordinal or nominal data, or numerical data that are not normally distributed
 Example: ordinal data, nominal data

What is a nonparametric test?
 A statistical test that is used to analyse ordinal and categorical data

What is the function of the test?
 The test compares medians
 It doesn't compare means
 It doesn't estimate the standard deviation

Advantages of using a non-parametric test?
1. No need to know the mean and SD
2. Outliers have little effect on the statistical outcome
3. No need to rely on estimation of parameters
4. Appropriate for small sample sizes

What is the disadvantage?
 Not as powerful as a parametric test
 Not as likely to detect a difference as a parametric test

Give examples of non-parametric tests


 Wilcoxon signed rank
 Mann-Whitney U
 Rank correlation (eg Spearman)
 Chi-squared

Types of test used for non-normal distributions


 the various forms of chi-square tests
 the Fisher Exact Probability test
 the Mann-Whitney Test
 the Wilcoxon Signed-Rank Test
 the Kruskal-Wallis Test
 and the Friedman Test

Wilcoxon signed-rank test


 The Wilcoxon signed-rank test is a non-parametric test that is equivalent to the paired Student's t-test
 This test analyses the difference between two paired observations
 Like the t-test, the Wilcoxon test involves comparisons of differences between measurements, so it requires that the data are measured at an interval level of measurement
 However, it does not require assumptions about the form of the distribution of the measurements
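
To make the "differences between paired measurements" concrete, here is a rough sketch of how the signed-rank sums are formed. It assumes the usual convention of dropping zero differences and giving tied absolute differences their average rank; the pain scores and the helper name `signed_rank_sums` are hypothetical, and no p-value is computed (in practice the smaller sum is compared against published tables or handled by a statistics package).

```python
def signed_rank_sums(before, after):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Paired differences; pairs with no change are dropped (assumed convention).
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Rank the absolute differences, averaging ranks for ties.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical pain scores for six patients before and after a treatment.
before = [7, 6, 8, 5, 9, 7]
after = [5, 6, 4, 6, 6, 2]
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)  # 1.0 14.0
```

With five non-zero differences the two sums add up to 5 x 6 / 2 = 15; a very small W_plus suggests that scores fell after treatment.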

Mann-Whitney test
 a non-parametric statistical test, equivalent to the unpaired t-test (two-sample test), of whether two independent samples came from the same population

Wilcoxon rank sum test
 equivalent to the Mann-Whitney test: a non-parametric alternative to the unpaired t-test (two-sample test) that tests whether two independent samples came from the same population

Kruskal-Wallis one-way analysis of variance


 The Kruskal-Wallis one-way analysis of variance is a non-parametric method for testing equality of population medians among groups
 It is identical to a one-way analysis of variance with the data replaced by their ranks
 It is an extension of the Mann-Whitney U test to 3 or more groups
 The Kruskal-Wallis test does not assume a normal population, unlike the analogous one-way analysis of variance

Parametric test
What is parametric data?
 Parametric data are quantitative data that involve population parameters, such as continuous data and discrete data

What is continuous data?

What is a parametric test?
 A statistical test that is used to analyse normally distributed continuous data
 It is used to analyse numerical data such as cardiac output and renal blood flow

What are five parametric tests?


 t-test
 ANOVA
 MANOVA
 Regression
 Correlation

Parametric test
 A parametric statistical test is one that makes assumptions about the parameters (defining properties) of the population distribution(s) from which one's data are drawn

Q. In a clinical trial why is adequate power important? What factors affect the determination of an adequate
sample size?

Power of statistical tests


What is the power of a statistical test?
 It is the likelihood that the study detects a difference
 It is the probability that the study detects a difference when a real difference actually exists

How do we relate the power of a study to the null hypothesis?
 It is the probability of rejecting the null hypothesis when it is false
 Rejecting the null hypothesis means rejecting the statement of no difference between treatments
 "When it is false" means there is a difference between treatments

How does it relate to type 2 error?
 It is represented as 1 - β (beta)

How do we calculate the power of a study?
 We need to know β, the probability of a type 2 error
 Researchers usually set β at a value between 0.05 and 0.2
 The power of the test is 1 - β
 Example: a study comparing the duration of neuromuscular block in normal and postpartum patients, with a power analysis using β = 0.1 and α = 0.05, has a power of 0.9

What is the significance of the power of a study?
 To estimate the appropriate sample size for the study
 If the sample size is too small: lack of precision to provide a reliable answer
 If the sample is too large: wastes time and resources, and puts patients at risk

How does the power of a study relate to the difference observed between groups?
 the greater the power of the study, the smaller the difference that can be detected by the test
 that means the test is efficient at detecting a true difference, so even a small difference can be detected
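
The link between power, the size of the difference and sample size can be illustrated with a rough calculation. The notes do not give a sample-size formula, so this sketch assumes the common normal-approximation formula for comparing two means, n per group = 2 x ((z for α + z for β) x SD / difference)^2; the function name `sample_size_per_group` and the SD and difference values are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_group(alpha, beta, sd, difference):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(1 - beta)        # about 1.28 for beta = 0.1
    n = 2 * ((z_alpha + z_beta) * sd / difference) ** 2
    return math.ceil(n)

beta = 0.1
power = 1 - beta  # power of 0.9, as in the example above

# Hypothetical values: SD of 1 unit, difference to detect of 1 or 0.5 units.
n_large_difference = sample_size_per_group(0.05, beta, sd=1.0, difference=1.0)
n_small_difference = sample_size_per_group(0.05, beta, sd=1.0, difference=0.5)
print(power, n_large_difference, n_small_difference)
```

Halving the difference to be detected roughly quadruples the number of patients needed per group, which is why the smallest clinically important difference has to be fixed before the study starts.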

Q. Write short notes on bias in drug trials and how its influence may be reduced
Bias
What is bias?
 Bias is a systematic, non-random deviation of results and inferences from the truth, or a process that leads to such deviation
 Bias is any trend in the collection, analysis, interpretation, publication or review of data that leads to conclusions that are systematically different from the truth
 Effect of bias: distortion of the truth

Why is bias a systematic error?
 Bias is a systematic error caused by any factor that systematically affects the measurement of a variable across the sample
 It cannot be compensated for by increasing the sample size

What is the difference between bias and random error?
 Random error is error caused by any factor that randomly affects the measurement of a variable across the sample
 It can be minimized by studying a large number of patients
 Bias is a systematic error caused by any factor that systematically affects the measurement of a variable across the sample
 It cannot be compensated for by increasing the sample size

What are some sources of bias?

Selection bias
 Systematic differences between the groups being compared, due to incomplete randomisation
 Examples: volunteer vs non-volunteer, employed vs unemployed, selective referral, loss to follow-up

Detection bias
 Bias that occurs when the outcome is not measured as vigilantly in one group as in the other
 Example: pain score; some researchers may use more comprehensive pain scoring

Observer bias
 Bias that occurs when the measurement of outcomes depends on a person's judgement
 Example

Reporting bias or recall bias
 Bias that occurs due to intentional or unintentional differential recall, and thus reporting, of information about the exposure or outcome of an association by subjects in one group compared to the other

Response bias
 Bias that occurs when patients enrolled in the study give answers that do not reflect their true beliefs
 Example: they answer questions in the way they think the questioner wants them to answer rather than according to their true beliefs

Publication bias
 Bias that occurs when studies with negative results are less likely to be submitted or accepted
 Publication bias arises from the tendency for researchers and editors to handle experimental results that are positive (they found something) differently from results that are negative (found that something did not happen) or inconclusive

How to control bias?
 After carefully reviewing our study and determining what might affect our results that is not part of the experiment, we need to control for these biases
 To control for selection bias, most experiments use Random Assignment, which means assigning the subjects to each group based on chance rather than human decision
 To control for the placebo effect, subjects are often not informed of the purpose of the experiment. This is called a Blind study, because the subjects are blind to the expected results
 To control for experimenter bias, we can use a Double-Blind study, which means that both the experimenter and the subjects are blind to the purpose and anticipated results of the study
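
Random assignment "based on chance rather than human decision" can be sketched very simply. This is only an illustration of the idea, not a trial-grade randomisation scheme; the function name `randomly_assign`, the patient labels and the fixed seed are all hypothetical.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(subjects)      # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Twenty hypothetical patients.
subjects = [f"patient_{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(subjects, seed=42)
print(len(treatment), len(control))  # 10 10
```

Because the allocation depends only on the shuffle, neither the investigator nor the patient chooses the group.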
Q. Define Standard Deviation. Explain its application to the use of drugs in clinical anaesthetic practice

Overview

 Standard deviation is the most common measure of statistical dispersion, measuring how widely spread the values in a data set are.

relationship

 If many data points are close to the mean, then the standard deviation is small;

 if many data points are far from the mean, then the standard deviation is large.

 If all the data values are equal, then the standard deviation is zero.

Formula

 For a population, the standard deviation (sigma, σ) can be estimated by a modified standard deviation (s) of a sample

application in clinical practice


 standard deviation is important in stating the estimate of an unknown parameter
 the standard deviation can be used to state a confidence interval
 $95\%\ \mathrm{CI} = \bar{x} \pm 1.96 \times s / \sqrt{n}$

Example
 suppose you want to study the mean dose of opioid that gives a pain score of 1 in 24 hours
 you take a sample of, say, 100 patients
 from the sample, calculate the mean dose of opioid needed to give a pain score of 1, say x gram
 then calculate the standard deviation, s
 from the mean and the standard deviation you can state the mean dose of opioid with its 95% confidence interval
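
The example can be followed step by step in code. The sketch below uses ten made-up doses rather than 100 to keep it short, and applies the 95% CI formula given above; the helper name `mean_with_ci95` and the dose values are hypothetical.

```python
import math
import statistics

def mean_with_ci95(values):
    """Return the mean, the sample SD and the 95% CI: mean +/- 1.96 * s / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)          # sample SD, divisor n - 1
    half_width = 1.96 * s / math.sqrt(n)
    return mean, s, (mean - half_width, mean + half_width)

# Hypothetical opioid doses needed to reach a pain score of 1.
doses = [8, 10, 12, 9, 11, 10, 13, 7, 10, 10]
mean, s, (low, high) = mean_with_ci95(doses)
print(f"mean {mean:.1f}, SD {s:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With a small sample like this a t-based multiplier would normally replace 1.96; the constant is kept here only to match the formula in the notes.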

Standard deviation & measurements of dispersion


How do you describe your data dispersion (variability or scatter)?
 Range
 Interquartile range
 Standard deviation
 Degrees of freedom

What is variability?
 How the data cluster around the central location

How do we describe variability?
 Range, percentiles, SD, degrees of freedom

What is range?
 Range is a measure of data dispersion given by the difference between the smallest & the largest values
 Vulnerable to outliers

What is the interquartile range?
 Distance between the 25th centile and the 75th centile
 Not vulnerable to outliers

What is standard deviation?
 It is a measure of the dispersion of data around the mean

What is the abbreviation and formula for standard deviation?
 SD or s or sigma
 Formula: $s = \sqrt{\sum (x - \bar{x})^2 / (n - 1)}$

Give example

What is n-1?
 It is the degrees of freedom

What happens if the data, x, are scattered over a wide range?
 The standard deviation, s, becomes large

Can you show it in the graph of normal distribution ?

What is variance?
 It is a measure of the dispersion of values about the mean

What is the formula?
 $s^2 = \sum (x - \bar{x})^2 / (n - 1)$
 It is the square of the standard deviation

What is the coefficient of variation?
 SD divided by the mean x 100%
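
The measures of dispersion above can be computed side by side. This is a small sketch with made-up values; the helper name `describe_dispersion` is hypothetical, and the quartiles come from Python's `statistics.quantiles`, whose default method is only one of several conventions for centiles.

```python
import math
import statistics

def describe_dispersion(values):
    """Range, interquartile range, variance, SD and coefficient of variation."""
    n = len(values)
    mean = statistics.fmean(values)
    # Variance: sum of squared deviations divided by the degrees of freedom.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": variance,
        "sd": sd,
        "cv_percent": sd / mean * 100,
    }

data = [8, 10, 12, 9, 11, 10, 13, 7, 10, 10]   # hypothetical measurements
summary = describe_dispersion(data)
for name, value in summary.items():
    print(name, round(value, 2))
```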

For normally distributed data, what is the chance that a value falls within 1 SD of the mean?
 68.26%

For normally distributed data, what is the chance that a value falls within 2 SD of the mean?
 95.45%

For normally distributed data, what is the chance that a value falls within 3 SD of the mean?
 99.73%

Can you draw 1 SD


Can you draw 2 SD
How much of the population falls within 1 SD?
