
DATA EDITING AND VALIDATION

Preeti Shukla

Department of Botany DSB Campus Kumaun University Nainital

Data Editing

Technically speaking, data editing is the processing of data, involving text editing, coding, classification and tabulation, so that the data can be analyzed and further validated.

Data editing operations:

1. Text editing
2. Coding
3. Classification
4. Tabulation

[Figure: example table of students' weight and students' age values, classified into groups A and B]

Analysis of data
(Searching for patterns of relationship among the groups of data)

1. Descriptive analysis (study of the distributions of variables)
   One-dimensional analysis
   Bivariate analysis
   Multivariate analysis
2. Inferential analysis (hypothesis testing for drawing inferences)
   Parametric tests
   Non-parametric tests

One-dimensional Analysis
It involves the calculation of several measures, mostly concerning one variable. These measures include:
1. Measures of central tendency (mean, median and mode)
2. Measures of dispersion (range, mean deviation and standard deviation)
3. Measures of asymmetry (skewness)

Measures of central tendency:

A measure of central tendency describes a point considered the most representative figure for the entire mass of data.

Mean: It is the sum of the values of a variable divided by the total number of values.

$\bar{X} = \frac{\sum X_i}{n}$

Median: It is the value of the middle item of the series when it is arranged in ascending or descending order of magnitude.

$M = \text{value of the } \left(\frac{n + 1}{2}\right)\text{th item}$

Mode: It is the most frequently occurring value in a series.
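The three measures can be illustrated with a short Python sketch using the standard `statistics` module. The sample values and the function name `central_tendency` are hypothetical and chosen only for illustration. Note that for an even number of items `statistics.median` averages the two middle values, which is one common reading of the $(n + 1)/2$ rule.

```python
from statistics import mean, median, multimode


def central_tendency(values):
    """Return the mean, the median and the list of modes of a series."""
    return {
        "mean": mean(values),         # sum of values divided by n
        "median": median(values),     # middle item of the sorted series
        "modes": multimode(values),   # most frequent value(s)
    }


# Hypothetical sample, for illustration only.
sample = [12, 15, 15, 18, 20]
summary = central_tendency(sample)
print(summary)
```

`multimode` is used rather than `mode` so that a series with several equally frequent values reports all of them.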

Measures of dispersion
These give an idea of how the values of a variable are scattered around the mean value of a series. These measures are:

1. Range: The simplest possible measure of dispersion, defined as the difference between the two extreme values of a variable.

Range = (Highest value of variable) - (Lowest value of variable)

2. Mean deviation: It is the average of the absolute differences of the values of a variable from the average of the series.

$\text{Mean deviation} = \frac{\sum |X_i - \bar{X}|}{n}$

3. Standard deviation: It is defined as the square root of the average of the squares of the deviations from the arithmetic mean.

$\text{Standard deviation} = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$
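A minimal Python sketch of the three dispersion measures follows. It divides by n, matching the definition of standard deviation as the square root of the average squared deviation. The sample values and the function name `dispersion` are hypothetical.

```python
from math import sqrt


def dispersion(values):
    """Return (range, mean deviation, standard deviation) of a series."""
    n = len(values)
    mean = sum(values) / n
    value_range = max(values) - min(values)
    # Average absolute deviation from the arithmetic mean.
    mean_deviation = sum(abs(x - mean) for x in values) / n
    # Square root of the average squared deviation (divisor n, not n - 1).
    std_deviation = sqrt(sum((x - mean) ** 2 for x in values) / n)
    return value_range, mean_deviation, std_deviation


# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
value_range, mean_deviation, std_deviation = dispersion(sample)
print(value_range, mean_deviation, std_deviation)
```

Many statistical packages divide by n - 1 by default when estimating a population standard deviation from a sample, so their results can differ slightly from this sketch.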

Measure of Asymmetry (Skewness)

Skewness gives information about the shape of the series and shows the manner in which the items are clustered around the average.

In the case of positive skewness, Mode < Median < Mean; in the case of negative skewness, Mean < Median < Mode.
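The ordering rule for mode, median and mean can be checked directly. The sketch below is only an illustration of that rule: the function name `skew_direction` and both data sets are hypothetical, and the function returns "undetermined" whenever the three measures do not fall in either strict order.

```python
from statistics import mean, median, mode


def skew_direction(values):
    """Classify skewness from the ordering of mode, median and mean."""
    mo, md, mn = mode(values), median(values), mean(values)
    if mo < md < mn:
        return "positive"
    if mn < md < mo:
        return "negative"
    return "undetermined"


# Hypothetical series with a long right tail and a long left tail.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 12]
left_tailed = [12, 11, 11, 11, 10, 10, 9, 8, 4, 1]

print(skew_direction(right_tailed))
print(skew_direction(left_tailed))
```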

Bivariate Analysis
It involves the analysis of two variables or attributes in a two-way classification. It involves the following methods:
1. Simple regression and correlation
2. Association of attributes
3. Two-way ANOVA (software: SPSS)

Example variables: height, weight, altitude
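Simple regression and correlation between two variables can be sketched in plain Python as below. The paired values are hypothetical and only stand in for measurements such as height and weight. The function name `simple_regression` is likewise an assumption made for this example.

```python
from math import sqrt


def simple_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r = simple_regression(x, y)
print(slope, intercept, r)
```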

Multivariate analysis
It is the simultaneous analysis of more than two variables or attributes in a multiway classification. It involves:
1. Multiple regression and multiple correlation
2. Multi-ANOVA
3. Canonical analysis
4. Cluster analysis

Example variables: height, weight, father's income

Inferential analysis
It is done through parametric tests of hypotheses and nonparametric tests of hypotheses.

Parametric tests (assumptions are made about the population) include:

z test and t test
For judging the significance of the difference between the means of two independent samples.

$\chi^2$ test
For comparing the sample variance to a theoretical variance.
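As a sketch of how a t statistic for the difference between two sample means can be computed, the Python below uses the pooled-variance form, which assumes the two groups are independent and share a common variance. That choice of variant, the function name `pooled_t` and the data are assumptions for this example. The resulting value would then be compared with a tabulated t value for the given degrees of freedom.

```python
from math import sqrt


def pooled_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    # Sums of squared deviations within each sample.
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = sqrt(pooled_variance * (1 / na + 1 / nb))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical samples, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t_value, df = pooled_t(group_a, group_b)
print(t_value, df)
```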

Nonparametric Tests
(No assumptions are made about the form of the population distribution)

1. Sign test
2. Fisher-Irwin test
3. Rank correlation
4. One-sample runs test
5. Chi-square test

Difference in elongation of treated and untreated seedlings

Pair    Difference (mm)    Signed rank
1         6.0                5
2         1.3                1
3        10.2                7
4        23.9               10
5         3.1                3
6         6.8                6
7        -1.5               -2
8       -14.7               -9
9        -3.3               -4
10       11.1                8

The smallest difference is ranked 1, and so on. First, absolute values are taken (ignoring signs) and ranked; then the signs are restored.

1. The ranks with negative signs total 15 and those with positive signs total 40.
2. From the theoretical (tabulated) value, the critical rank sum for 10 pairs is 8.
3. Since 15 > 8, the null hypothesis is not rejected: the data are consistent with elongation being unaffected by the electric current treatment.
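The ranking steps can be reproduced with a short Python sketch using the ten differences from the seedling table. It assumes there are no tied absolute values and no zero differences, which holds for this data set. The function name `signed_ranks` is hypothetical, and the critical value 8 is taken from the text rather than computed.

```python
def signed_ranks(differences):
    """Rank absolute differences from 1 upward, then restore the signs."""
    # Indices ordered by absolute difference, smallest first (no ties assumed).
    order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
    ranks = [0] * len(differences)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank if differences[i] > 0 else -rank
    return ranks


# Differences in elongation (mm) for the 10 pairs in the table.
differences = [6.0, 1.3, 10.2, 23.9, 3.1, 6.8, -1.5, -14.7, -3.3, 11.1]
ranks = signed_ranks(differences)

positive_sum = sum(r for r in ranks if r > 0)
negative_sum = -sum(r for r in ranks if r < 0)

critical_value = 8  # tabulated value for 10 pairs, as quoted in the text
# The null hypothesis is rejected only if the smaller sum is at or below
# the critical value.
reject_null = min(positive_sum, negative_sum) <= critical_value

print(ranks, positive_sum, negative_sum, reject_null)
```

Handling ties would require assigning average ranks, which this sketch deliberately leaves out.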

Significance of Data Editing and Validation

The data are processed and analysed in accordance with the research plan. Analysis searches for patterns of relationship that exist among data groups; these are then subjected to statistical tests of significance to determine with what validity the data can be said to indicate any conclusions. Generalizations can be made on the basis of data analysis and may be stated as hypotheses to be tested by subsequent researchers.

References:
1. en.wikipedia.org
2. C. R. Kothari, Research Methodology
3. Snedecor & Cochran, Statistical Methods

Acknowledgement
Dr. L. M. Tewari sir
Dr. Ashish Tewari sir
Dr. Chitra Pandey mam
Dr. Geeta Tewari mam
Respected teachers and dear friends
