Data Editing and Validation

DATA EDITING AND VALIDATION
Preeti Shukla
Department of Botany, DSB Campus, Kumaun University, Nainital
Data Editing
Technically speaking, data editing is the processing of data through text editing, coding, classification and tabulation so that the data can be analysed and then validated.
Data editing operations:

1. Text editing
2. Coding
3. Classification
4. Tabulation
[Figure: example of tabulated data. Students' weight: 20, 25, 23; students' age: 10, 12, 8; class labels: A, B, A]
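The four editing operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the numeric codes for sex and the age class bounds are all hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical raw records (not from the source): (student, sex, age in years).
raw_records = [
    ("S1", " Female", 21),
    ("S2", "male ", 24),
    ("S3", "FEMALE", 19),
    ("S4", "Male", 22),
    ("S5", "female", 26),
]

SEX_CODES = {"female": 1, "male": 2}  # assumed coding scheme


def classify_age(age):
    """Classification: place an age in a class interval (assumed bounds)."""
    if age < 20:
        return "under 20"
    if age < 25:
        return "20-24"
    return "25 and over"


coded = []
for student, sex, age in raw_records:
    clean_sex = sex.strip().lower()   # text editing: normalise the entry
    sex_code = SEX_CODES[clean_sex]   # coding: category -> number
    coded.append((student, sex_code, classify_age(age)))

# Tabulation: count the records that fall in each age class.
age_table = Counter(age_class for _, _, age_class in coded)

for age_class, count in sorted(age_table.items()):
    print(f"{age_class:12s} {count}")
```

Once the data are in this coded and tabulated form they can be passed to the descriptive and inferential analyses that follow.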
Analysis of data
(Searching for patterns of relationship among the groups of data)

Descriptive analysis (studies of the distributions of variables):
1. One-dimensional analysis
2. Bivariate analysis
3. Multivariate analysis

Inferential analysis (hypothesis testing for drawing inferences):
1. Parametric tests
2. Nonparametric tests
One-dimensional Analysis
It involves the calculation of several measures, mostly concerning one variable. These measures include:

1. Measures of central tendency (mean, median and mode)
2. Measures of dispersion (range, mean deviation and standard deviation)
3. Measures of asymmetry (skewness)
Measures of central tendency:

A measure of central tendency describes a point considered the most representative figure for the entire mass of data.

Mean: the sum of the observed values divided by the total number of values.

$\bar{X} = \frac{\sum X_i}{n}$

Median: the value of the middle item of the series when it is arranged in ascending or descending order of magnitude.

$M = \text{value of the } \left(\frac{n + 1}{2}\right)\text{th item}$

Mode: the most frequently occurring value in a series.
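As a rough illustration of the three measures, the sketch below uses Python's standard `statistics` module. The sample values are hypothetical (they are not taken from the slides); an odd number of items is chosen so that the (n + 1)/2 rule picks a single middle item.

```python
import statistics

# Hypothetical sample of seedling heights in cm (not from the source).
heights = [12, 15, 15, 18, 20, 22, 15]
n = len(heights)

mean = statistics.mean(heights)      # sum of the values / number of values
median = statistics.median(heights)  # middle item of the sorted series
mode = statistics.mode(heights)      # most frequently occurring value

# The median by the slide's rule: the ((n + 1) / 2)-th item of the sorted
# series (1-based), valid here because n is odd.
middle_item = sorted(heights)[(n + 1) // 2 - 1]

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```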
Measures of dispersion
These measures indicate how the values of a variable are scattered around the mean value of a series. They are:

1. Range: the simplest possible measure of dispersion, defined as the difference between the two extreme values of a variable.

$\text{Range} = (\text{highest value of variable}) - (\text{lowest value of variable})$

2. Mean deviation: the average of the absolute differences between the values of the variable and the average of the series.

$\text{Mean deviation} = \frac{\sum |X_i - \bar{X}|}{n}$

3. Standard deviation: the square root of the average of the squared deviations from the arithmetic mean.

$\text{Standard deviation} = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$
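The three measures of dispersion can be worked through on a small series. The values below are hypothetical, and the standard deviation uses the divisor n, following the definition given here (the average of the squared deviations), not the n - 1 divisor often used for samples.

```python
import math

# Hypothetical series (not from the source).
values = [4, 8, 6, 2, 10]
n = len(values)
mean = sum(values) / n

# Range: highest value minus lowest value.
value_range = max(values) - min(values)

# Mean deviation: average absolute deviation from the mean.
mean_deviation = sum(abs(x - mean) for x in values) / n

# Standard deviation: square root of the average squared deviation (divisor n).
standard_deviation = math.sqrt(sum((x - mean) ** 2 for x in values) / n)

print(value_range, mean_deviation, round(standard_deviation, 3))
```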
Measure of Asymmetry (Skewness)

Skewness gives information about the shape of the series and shows the manner in which the items are clustered around the average.
In the case of positive skewness we have Mode < Median < Mean, and in the case of negative skewness we have Mean < Median < Mode.
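The ordering of the three averages can be checked on small examples. Both series below are hypothetical; the second is the mirror image of the first, so its long tail lies on the left instead of the right.

```python
import statistics


def averages(series):
    """Return (mean, median, mode) of a series."""
    return (
        statistics.mean(series),
        statistics.median(series),
        statistics.mode(series),
    )


# Hypothetical series with a long right tail (positive skewness).
right_tailed = [1, 2, 2, 2, 3, 4, 5, 9, 17]
# Its mirror image, with a long left tail (negative skewness).
left_tailed = [18 - x for x in right_tailed]

pos_mean, pos_median, pos_mode = averages(right_tailed)
neg_mean, neg_median, neg_mode = averages(left_tailed)

print(pos_mode < pos_median < pos_mean)  # Mode < Median < Mean
print(neg_mean < neg_median < neg_mode)  # Mean < Median < Mode
```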
Bivariate Analysis
It involves the analysis of two variables or attributes in a two-way classification. It uses the following methods:

1. Simple regression and correlation
2. Association of attributes
3. Two-way ANOVA (software: SPSS)

[Figure: example variables: height, weight, altitude]
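Simple regression and correlation, the first method listed, can be sketched with plain Python. The height and weight pairs are hypothetical, and the formulas are the usual least-squares slope and Pearson correlation coefficient, which the slides name but do not spell out.

```python
import math

# Hypothetical paired observations (not from the source).
heights = [150, 160, 170, 180, 190]  # cm
weights = [50, 56, 63, 70, 76]       # kg

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Sums of squares and cross-products of the deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
s_xx = sum((x - mean_x) ** 2 for x in heights)
s_yy = sum((y - mean_y) ** 2 for y in weights)

slope = s_xy / s_xx                   # regression of weight on height
intercept = mean_y - slope * mean_x
r = s_xy / math.sqrt(s_xx * s_yy)     # Pearson correlation coefficient

print(f"weight = {intercept:.1f} + {slope:.2f} * height, r = {r:.4f}")
```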
Multivariate analysis
It is the simultaneous analysis of more than two variables or attributes in a multiway classification. It involves:

1. Multiple regression and multiple correlation
2. Multi-ANOVA
3. Canonical analysis
4. Cluster analysis

[Figure: example variables: height, weight, father's income]
Inferential analysis
It is done through parametric tests of hypotheses and nonparametric tests of hypotheses.

Parametric tests (assumptions are made about the population):

z test and t test: for judging the significance of the difference between the means of two independent samples.

χ² test: for comparing the sample variance with a theoretical variance.
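As one possible illustration of a parametric test, the sketch below computes a two-sample t statistic with a pooled variance. It assumes two independent samples from normal populations with equal variance; the function name `pooled_t_statistic` and the data are hypothetical. The resulting value would then be compared with the tabulated t value for the stated degrees of freedom.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for two independent samples,
    assuming normal populations with equal variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance, divisor n - 1
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_error
    return t, df


# Hypothetical measurements (not from the source).
treated = [5, 7, 9]
control = [1, 3, 5]

t_value, df = pooled_t_statistic(treated, control)
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```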
Nonparametric Tests
(No assumptions are made about the form of the population distribution)

1. Sign test
2. Fisher-Irwin test
3. Rank correlation
4. One-sample runs test
5. Chi-square test
Difference in elongation of treated and untreated seedlings

Pair    Difference (mm)    Signed rank
1            6.0                5
2            1.3                1
3           10.2                7
4           23.9               10
5            3.1                3
6            6.8                6
7           -1.5               -2
8          -14.7               -9
9           -3.3               -4
10          11.1                8

The smallest difference is ranked 1, and so on. First the absolute values are taken (ignoring signs), and then the signs are restored.

1. The ranks with negative signs total 15 and those with positive signs total 40.
2. For 10 pairs, the tabulated critical value of the smaller rank sum is 8 (5% level).
3. Since 15 > 8, the null hypothesis that elongation is unaffected by the electric current treatment is not rejected.
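The ranking steps above can be reproduced in Python from the ten differences listed for the seedling pairs. This is a minimal sketch: it assumes there are no ties or zero differences (true for this data), takes the critical value 8 from the text, and applies the usual rule that the null hypothesis is rejected only when the smaller rank sum is at or below that value.

```python
# Differences in elongation (mm) for the 10 pairs, as listed above.
differences = [6.0, 1.3, 10.2, 23.9, 3.1, 6.8, -1.5, -14.7, -3.3, 11.1]

# Rank by absolute value (1 = smallest), then restore the signs.
order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
signed_ranks = [0] * len(differences)
for rank, i in enumerate(order, start=1):
    signed_ranks[i] = rank if differences[i] > 0 else -rank

positive_sum = sum(r for r in signed_ranks if r > 0)
negative_sum = -sum(r for r in signed_ranks if r < 0)
smaller_sum = min(positive_sum, negative_sum)

CRITICAL_VALUE = 8  # value quoted in the text for 10 pairs
reject_null = smaller_sum <= CRITICAL_VALUE

print(signed_ranks)
print(positive_sum, negative_sum, reject_null)
```

With tied or zero differences the ranking step would need average ranks and dropped pairs, which this sketch does not handle.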
Significance of Data Editing and Validation

The data are processed and analysed in accordance with the research plan. Patterns of relationship that exist among the data groups are searched for and then subjected to statistical tests of significance, to determine with what validity the data can be said to indicate any conclusions. Generalizations can be made on the basis of the data analysis and may be stated as hypotheses to be tested by subsequent researchers.
References:

1. en.wikipedia.org
2. C. R. Kothari, Research Methodology
3. Snedecor & Cochran, Statistical Methods
Acknowledgement
Dr. L. M. Tewari sir, Dr. Ashish Tewari sir, Dr. Chitra Pandey mam, Dr. Geeta Tewari mam, respected teachers and dear friends
