
MB0034-Unit 1-An Introduction to Research

Unit 1 An Introduction to Research


Meaning and Definition of Research
Research simply means a search for facts: answers to questions and solutions to problems. It is
a purposive investigation. It is an organized inquiry. It seeks to find explanations for unexplained
phenomena, to clarify doubtful facts and to correct misconceived facts.
The search for facts may be made through either:
- Arbitrary (or unscientific) Method: This method of seeking answers to questions
relies on imagination, opinion, blind belief or impression. E.g., it was once believed that the
earth was flat, and that a big snake swallows the sun or moon, causing a solar or lunar
eclipse. It is subjective; the findings vary from person to person depending on individual
impression or imagination. It is vague and inaccurate. Or
- Scientific Method: This is a systematic, rational approach to seeking facts. It eliminates
the drawbacks of the arbitrary method. It is objective and precise, and arrives at conclusions
on the basis of verifiable evidence.
Therefore, the search for facts should be made by the scientific method rather than by the arbitrary
method. Only then can we obtain verifiable and accurate facts. Hence research is a systematic and
logical study of an issue, problem or phenomenon through the scientific method.
Young defines research as a scientific undertaking which, by means of logical and systematic
techniques, aims to:
a) Discover new facts or verify and test old facts,
b) Analyze their sequences, interrelationships and causal explanations,
c) Develop new scientific tools, concepts and theories which would facilitate reliable and
valid study of human behaviour.
Kerlinger defines research as a systematic, controlled, empirical and critical
investigation of hypothetical propositions about the presumed relations among natural
phenomena.



Objectives:
After studying this lesson the students should be able to understand:
- Research and scientific method
- Characteristics of Research
- Purpose of research
- Different types of Research
- Research Approaches
- Significance of research in Social and Business Sciences
1.1.1 Research and Scientific Method
Research is a scientific endeavour. It involves scientific method. The scientific method is a
systematic step-by-step procedure following the logical processes of reasoning. Scientific
method is a means for gaining knowledge of the universe. It does not belong to any particular
body of knowledge; it is universal. It does not refer to a specific field of subject matter, but
rather to a procedure or mode of investigation.
The scientific method is based on certain articles of faith. These are:
- Reliance on Empirical Evidence: Truth is established on the basis of evidence.
A conclusion is admitted only when it is based on evidence. The answer to a question is
not decided by intuition or imagination. Relevant data are collected through observation
or experimentation. The validity and the reliability of data are checked carefully and the
data are analyzed thoroughly, using appropriate methods of analysis.
- Use of Relevant Concepts: We experience a vast number of facts through our senses.
Facts are things which actually exist. In order to deal with them, we use concepts with
specific meanings. They are symbols representing the meaning that we hold. We use
them in our thinking and communication. Otherwise, clarity and correct understanding
cannot be achieved.
- Commitment to Objectivity: Objectivity is the hallmark of the scientific method. It
means forming judgement upon facts unbiased by personal impressions. The conclusion
should not vary from person to person. It should be the same for all persons.
- Ethical Neutrality: Science does not pass moral judgment on facts. It does not say that
they are good or bad. According to Schrödinger, "Science never imposes anything,
science states. Science aims at nothing but making true and adequate statements about its
object."
- Generalization: In formulating a generalization, we should avoid the danger of
committing the particularistic fallacy, which arises through an inclination to generalize on
insufficient or incomplete and unrelated data. This can be avoided by the accumulation of
a large body of data and by the employment of comparisons and control groups.
- Verifiability: The conclusions arrived at by a scientist should be verifiable. He must
make known to others how he arrives at his conclusions. He should thus expose his own
methods and conclusions to critical scrutiny. When his conclusion is tested by others
under the same conditions, then it is accepted as correct.
- Logical reasoning process: The scientific method involves the logical process of
reasoning. This reasoning process is used for drawing inferences from the findings of a
study or for arriving at conclusions.
Characteristics of Research
- It is a systematic and critical investigation into a phenomenon.
- It is a purposive investigation aiming at describing, interpreting and explaining a
phenomenon.
- It adopts scientific method.
- It is objective and logical, applying possible tests to validate the measuring tools and the
conclusions reached.
- It is based upon observable experience or empirical evidence.
- Research is directed towards finding answers to pertinent questions and solutions to
problems.
- It emphasizes the development of generalization, principles or theories.
- The purpose of research is not only to arrive at an answer but also to stand up to the test of
criticism.

Purpose of Research
The objectives or purposes of research are varied. They are:
- Research extends knowledge of human beings, social life and environment. It searches
for answers to various types of questions (the what, where, when, how and why of
various phenomena) and thus enlightens us.
- Research brings to light information that might never be discovered fully during the
ordinary course of life.
- Research establishes generalizations and general laws and contributes to theory building
in various fields of knowledge.
- Research verifies and tests existing facts and theory, and this helps improve our
knowledge and ability to handle situations and events.
- General laws developed through research may enable us to make reliable predictions of
events yet to happen.
- Research aims to analyze inter-relationships between variables and to derive causal
explanations, and thus enables us to have a better understanding of the world in which we
live.
- Applied research aims at finding solutions to problems: socio-economic problems,
health problems, human relations problems in organizations and so on.
- Research also aims at developing new tools, concepts and theories for a better study of
unknown phenomena.
- Research aids planning and thus contributes to national development.

Types of Research
Although any typology of research is inevitably arbitrary, research may be classified broadly
according to its major intent or its methods. According to the intent, research may be classified
as:
Pure Research
It is undertaken for the sake of knowledge without any intention to apply it in practice, e.g.,
Einstein's theory of relativity, Newton's contributions, Galileo's contributions, etc. It is also
known as basic or fundamental research. It is undertaken out of intellectual curiosity or
inquisitiveness. It is not necessarily problem-oriented. It aims at extension of knowledge. It may
lead to either discovery of a new theory or refinement of an existing theory. It lays foundation for
applied research. It offers solutions to many practical problems. It helps to find the critical
factors in a practical problem. It develops many alternative solutions and thus enables us to
choose the best solution.
Applied Research
It is carried on to find a solution to a real-life problem requiring an action or policy decision. It is
thus problem-oriented and action-directed. It seeks an immediate and practical result, e.g.,
marketing research carried on for developing a new market or for studying the post-purchase
experience of customers. Though the immediate purpose of applied research is to find
solutions to a practical problem, it may incidentally contribute to the development of theoretical
knowledge by leading to the discovery of new facts, the testing of theory or conceptual clarity. It
can put theory to the test. It may aid in conceptual clarification. It may integrate previously
existing theories.
Exploratory Research
It is also known as formulative research. It is a preliminary study of an unfamiliar problem about
which the researcher has little or no knowledge. It is ill-structured and much less focused on
pre-determined objectives. It usually takes the form of a pilot study. The purpose of this research
may be to generate new ideas, to increase the researcher's familiarity with the problem, to
make a precise formulation of the problem, to gather information for clarifying concepts or to
determine whether it is feasible to attempt the study. Katz conceptualizes two levels of
exploratory studies. At the first level is the discovery of the significant variables in the situation;
at the second, the discovery of relationships between variables.
Descriptive Study
It is a fact-finding investigation with adequate interpretation. It is the simplest type of research. It
is more specific than an exploratory research. It aims at identifying the various characteristics of
a community or institution or problem under study and also aims at a classification of the range
of elements comprising the subject matter of study. It contributes to the development of a young
science and is useful in verifying focal concepts through empirical observation. It can highlight
important methodological aspects of data collection and interpretation. The information obtained
may be useful for prediction about areas of social life outside the boundaries of the research.
Descriptive studies are valuable in providing facts needed for planning social action programmes.
Diagnostic Study
It is similar to descriptive study but with a different focus. It is directed towards discovering
what is happening, why it is happening and what can be done about it. It aims at identifying the
causes of a problem and the possible solutions for it. It may also be concerned with discovering
and testing whether certain variables are associated. This type of research requires prior
knowledge of the problem, its thorough formulation, clear-cut definition of the given population,
adequate methods for collecting accurate information, precise measurement of variables,
statistical analysis and test of significance.
Evaluation Studies
It is a type of applied research. It is made for assessing the effectiveness of social or economic
programmes implemented or for assessing the impact of developmental projects on the
development of the project area. It is thus directed to assess or appraise the quality and quantity
of an activity and its performance, and to specify its attributes and conditions required for its
success. It is concerned with causal relationships and is more actively guided by hypothesis. It is
concerned also with change over time.
Action Research
It is a type of evaluation study. It is a concurrent evaluation study of an action programme
launched for solving a problem or for improving an existing situation. It includes six major steps:
diagnosis, sharing of diagnostic information, planning, developing change programme, initiation
of organizational change, implementation of participation and communication process, and post
experimental evaluation.
According to the methods of study, research may be classified as:
1. Experimental Research: It is designed to assess the effects of particular variables on a
phenomenon by keeping the other variables constant or controlled. It aims at determining
whether and in what manner variables are related to each other.
2. Analytical Study: It is a system of procedures and techniques of analysis applied to
quantitative data. It may consist of a system of mathematical models or statistical
techniques applicable to numerical data. Hence it is also known as the Statistical Method.
It aims at testing hypothesis and specifying and interpreting relationships.
3. Historical Research: It is a study of past records and other information sources with a
view to reconstructing the origin and development of an institution or a movement or a
system and discovering the trends in the past. It is descriptive in nature. It is a difficult
task; it must often depend upon inference and logical analysis of recorded data and
indirect evidence rather than upon direct observation.
4. Survey: It is a fact-finding study. It is a method of research involving collection of
data directly from a population or a sample thereof at a particular time. Its purpose is to
provide information, explain phenomena and make comparisons; where it is concerned with
cause-and-effect relationships, it can be useful for making predictions.

Research Approaches
There are two main approaches to research, namely quantitative approach and qualitative
approach. The quantitative approach involves the collection of quantitative data, which are put to
rigorous quantitative analysis in a formal and rigid manner. This approach further includes
experimental, inferential, and simulation approaches to research. Meanwhile, the qualitative
approach uses the method of subjective assessment of opinions, behaviour and attitudes.
Research in such a situation is a function of the researcher's impressions and insights. The results
generated by this type of research are either in non-quantitative form or in the form which cannot
be put to rigorous quantitative analysis. Usually, this approach uses techniques like depth
interviews, focus group interviews, and projective techniques.

Significance of Research in Social and Business Sciences
According to the famous Hudson Maxim, "All progress is born of inquiry. Doubt is often better
than overconfidence, for it leads to inquiry, and inquiry leads to invention." It brings out the
significance of research, increased amounts of which make progress possible. Research
encourages scientific and inductive thinking, besides promoting the development of logical
habits of thinking and organization.
The role of research in applied economics in the context of an economy or business is greatly
increasing in modern times. The increasingly complex nature of government and business has
raised the use of research in solving operational problems. Research assumes significant role in
formulation of economic policy, for both the government and business. It provides the basis for
almost all government policies of an economic system. Government budget formulation, for
example, depends particularly on the analysis of needs and desires of the people, and the
availability of revenues, which requires research. Research helps to formulate alternative
policies, in addition to examining the consequences of these alternatives. Thus, research also
facilitates the decision making of policy-makers, although decision making in itself is not a part
of research. In the process, research also helps in the proper allocation of a country's scarce
resources. Research is also necessary for collecting information on the social and economic
structure of an economy to understand the process of change occurring in the country. Collection
of statistical information, though not a routine task, involves various research problems.
Therefore, a large staff of research technicians or experts is engaged by the government these
days to undertake this work. Thus,
research as a tool of government economic policy formulation involves three distinct stages of
operation which are as follows:
- Investigation of economic structure through continual compilation of facts
- Diagnosis of events that are taking place and the analysis of the forces underlying them;
and
- The prognosis, i.e., the prediction of future developments
Research also assumes a significant role in solving various operational and planning problems
associated with business and industry. In several ways, operations research, market research, and
motivational research are vital and their results assist in taking business decisions. Market
research refers to the investigation of the structure and development of a market for the
formulation of efficient policies relating to purchases, production and sales. Operational research
relates to the application of logical, mathematical and analytical techniques to find solutions to
business problems such as cost minimization, profit maximization or other optimization
problems. Motivational research helps to determine why people behave in the manner they do
with respect to market characteristics. More specifically, it is concerned with analyzing the
motivations underlying consumer behaviour. All these types of research are very useful for business
and industry, which are responsible for business decision making.
Research is equally important to social scientists for analyzing social relationships and seeking
explanations for various social problems. It gives the intellectual satisfaction of knowing things for
the sake of knowledge. It also has practical utility for the social scientist, who gains knowledge
so as to be able to do something better or in a more efficient manner. Thus, research in social
sciences is concerned with both knowledge for its own sake and knowledge for what it can
contribute to solving practical problems.

Summary
Research simply means a search for facts. The search for facts may be made through either
arbitrary (or unscientific) method or scientific method. Young defines research as a scientific
undertaking which, by means of logical and systematic techniques, aims to discover new
facts or verify and test old facts, analyze their sequences, interrelationships and causal
explanations, develop new scientific tools, concepts and theories which would facilitate reliable
and valid study of human behaviour. Kerlinger defines research as a systematic, controlled,
empirical and critical investigation of hypothetical propositions about the presumed relations
among natural phenomena.
The scientific method is based on certain articles of faith. These are:
1. Reliance on empirical evidence
2. Use of relevant concepts
3. Commitment to objectivity
4. Ethical neutrality
5. Generalization
6. Verifiability
7. Logical reasoning process
Research is directed towards finding answers to pertinent questions and solutions to problems. It
emphasizes the development of generalization, principles or theories. The purpose of research is
not only to arrive at an answer but also to stand up to the test of criticism. The purpose of research
is to extend knowledge of human beings. Research establishes generalizations and general laws
and contributes to theory building in various fields of knowledge. Research verifies and tests
existing facts and theory, and this helps improve our knowledge and ability to handle situations
and events. General laws developed through research may enable us to make reliable predictions
of events yet to happen. Research aims to analyze inter-relationships between variables and to
derive causal explanations, and thus enables us to have a better understanding of the world in
which we live.
Applied research aims at finding solutions to problems: socio-economic problems, health
problems, human relations problems in organizations and so on. Research also aims at
developing new tools, concepts and theories for a better study of unknown phenomena. Research
aids planning and thus contributes to national development. Pure Research is undertaken for the
sake of knowledge without any intention to apply it in practice. Applied Research is carried on to
find a solution to a real-life problem requiring an action or policy decision. It is thus
problem-oriented and action-directed. Exploratory Research is also known as formulative research. It is
a preliminary study of an unfamiliar problem about which the researcher has little or no
knowledge. Descriptive Study is a fact-finding investigation with adequate interpretation.
Diagnostic Study is similar to descriptive study but with a different focus. Evaluation Study
is a type of applied research. Action Research is a type of evaluation study. The role of research
in applied economics in the context of an economy or business is greatly increasing in modern
times. Research also assumes a significant role in solving various operational and planning
problems associated with business and industry. Research is equally important to social
scientists for analyzing social relationships and seeking explanations for various social problems.
Copyright 2009 SMU
Powered by Sikkim Manipal University
MB0034-Unit 2-Selection and Formulation of a Research Problem
Unit 2 Selection and Formulation of a Research Problem
Meaning of Research Problem
Research really begins when the researcher experiences some difficulty, i.e., a problem
demanding a solution within the subject-area of his discipline. This general area of interest,
however, defines only the range of subject-matter within which the researcher would see and
pose a specific problem for research. Personal values play an important role in the selection of a
topic for research. Social conditions do often shape the preference of investigators in a subtle and
imperceptible way.
The formulation of the topic into a research problem is, really speaking, the first step in a
scientific enquiry. A problem, in simple words, is some difficulty experienced by the researcher in
a theoretical or practical situation. Solving this difficulty is the task of research.
R.L. Ackoff's analysis affords considerable guidance in identifying a problem for research. He
visualizes five components of a problem.
1. Research-consumer: There must be an individual or a group which experiences some
difficulty.
2. Research-consumer's Objectives: The research-consumer must have some objectives,
i.e., some ends he seeks to achieve.
3. Alternative Means to Meet the Objectives: The research-consumer must have available
alternative means for achieving the objectives he desires.
4. Doubt in Regard to Selection of Alternatives: The existence of alternative courses of
action is not enough; in order to experience a problem, the research-consumer must have
some doubt as to which alternative to select.
5. There must be One or More Environments to which the Difficulty or Problem Pertains: A
change in environment may produce or remove a problem. A research-consumer may
have doubts as to which will be the most efficient means in one environment but would
have no such doubt in another.
Objectives:
After studying this unit you should be able to understand:
- The meaning of Research Problem
- Choosing the problem
- Review of Literature
- Criteria for formulating the problem
- Objective of Formulating the Problem
- Techniques involved in Formulating the Problem
- Criteria of Good Research Problem
Choosing the Problem
The selection of a problem is the first step in research. The term problem means a question or
issue to be examined. The selection of a problem for research is not an easy task; it is itself a
problem. It is least amenable to formal methodological treatment. Vision, an imaginative insight,
plays an important role in this process. One who has a critical, curious and imaginative mind and is
sensitive to practical problems can easily identify problems for study.
The sources from which one may be able to identify research problems or develop problem
awareness are:
- Review of literature
- Academic experience
- Daily experience
- Exposure to field situations
- Consultations
- Brainstorming
- Research
- Intuition
Review of literature
Frequently, an exploratory study is concerned with an area of subject matter in which explicit
hypotheses have not yet been formulated. The researcher's task then is to review the available
material with an eye on the possibilities of developing hypotheses from it. In some areas of the
subject matter, hypotheses may have been stated by previous research workers. The researcher
has to take stock of these various hypotheses with a view to evaluating their usefulness for
further research and to consider whether they suggest any new hypotheses. Sociological journals,
economic reviews, the bulletin of abstracts of current social sciences research, the directory of
doctoral dissertations accepted by universities, etc. afford a rich store of valuable clues. In addition
to these general sources, some governmental agencies and voluntary organizations publish
listings of summaries of research in their special fields of service. Professional organizations,
research groups and voluntary organizations are a constant source of information about
unpublished works in their special fields.
Formulating the problem
The selection of one appropriate researchable problem out of the identified problems requires
evaluation of those alternatives against certain criteria, which may be grouped into:
Internal Criteria
Internal criteria consist of:
1) Researcher's interest: The problem should interest the researcher and be a challenge to
him. Without interest and curiosity, he may not develop sustained perseverance. Even a small
difficulty may become an excuse for discontinuing the study. Interest in a problem depends
upon the researcher's educational background, experience, outlook and sensitivity.
2) Researcher's competence: A mere interest in a problem will not do. The researcher
must be competent to plan and carry out a study of the problem. He must have the ability to
grasp and deal with it. He must possess adequate knowledge of the subject-matter, relevant
methodology and statistical procedures.
3) Researcher's own resources: In the case of research to be done by a researcher on his
own, consideration of his own financial resources is pertinent. If the study is beyond his means, he
will not be able to complete the work unless he gets some external financial support. Time
is an even more important resource than finance. Research is a time-consuming process; hence
time should be properly utilized.
External Criteria
1) Research-ability of the problem: The problem should be researchable, i.e., amenable
for finding answers to the questions involved in it through scientific method. To be
researchable a question must be one for which observation or other data collection in the real
world can provide the answer.
2) Importance and urgency: Problems requiring investigation are unlimited, but available
research efforts are very much limited. Therefore, in selecting problems for research, their
relative importance and significance should be considered. An important and urgent problem
should be given priority over an unimportant one.
3) Novelty of the problem: The problem must have novelty. There is no use in wasting
one's time and energy on a problem already studied thoroughly by others. This does not
mean that replication is always needless. In social sciences, in some cases, it is appropriate to
replicate (repeat) a study in order to verify the validity of its findings in a different situation.
4) Feasibility: A problem may be a new one and also important, but if research on it is not
feasible, it cannot be selected. Hence feasibility is a very important consideration.
5) Facilities: Research requires certain facilities such as well-equipped library facility,
suitable and competent guidance, data analysis facility, etc. Hence the availability of the
facilities relevant to the problem must be considered.
6) Usefulness and social relevance: Above all, the study of the problem should make
significant contribution to the concerned body of knowledge or to the solution of some
significant practical problem. It should be socially relevant. This consideration is particularly
important in the case of higher level academic research and sponsored research.
7) Research personnel: Research undertaken by professors and by research organizations
requires the services of investigators and research officers. But in India and other developing
countries, research has not yet become a prospective profession. Hence talented persons are not
attracted to research projects.
Each identified problem must be evaluated in terms of the above internal and external criteria
and the most appropriate one may be selected by a research scholar.
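As an illustration only, the evaluation described above can be sketched as a simple weighted scoring exercise. The text prescribes no numerical procedure; the candidate problems, the 1 to 5 scores, the weights and the function names below are all hypothetical assumptions, and only a few of the internal and external criteria are included.

```python
# Hypothetical weights for a few of the internal and external criteria.
WEIGHTS = {
    "interest": 2,
    "competence": 2,
    "resources": 1,
    "researchability": 3,
    "importance": 2,
    "novelty": 1,
    "feasibility": 3,
}

# Hypothetical candidate problems, each scored from 1 (poor) to 5 (good).
CANDIDATES = {
    "Financing small-scale industries by commercial banks": {
        "interest": 4, "competence": 4, "resources": 3,
        "researchability": 5, "importance": 4, "novelty": 3, "feasibility": 4,
    },
    "Post-purchase experience of customers": {
        "interest": 5, "competence": 3, "resources": 2,
        "researchability": 4, "importance": 3, "novelty": 2, "feasibility": 3,
    },
}


def weighted_score(scores, weights):
    """Return the weighted average of the criterion scores."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def rank_problems(candidates, weights):
    """Return (problem, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(s, weights)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_problems(CANDIDATES, WEIGHTS):
        print(f"{score:.2f}  {name}")
```

Such a score is only an aid to judgement: a problem that fails outright on research-ability or feasibility should be rejected whatever its total.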
Objective of Formulating the Problem
A problem well put is half-solved. The primary task of research is collection of relevant data and
the analysis of data for finding answers to the research questions. The proper performance of this
task depends upon the identification of exact data and information required for the study. The
formulation serves this purpose. The clear and accurate statement of the problem, the
development of the conceptual model, the definition of the objectives of the study, the setting of
investigative questions, the formulation of hypothesis to be tested and the operational definition
of concepts and the delimitation of the study determine the exact data needs of the study. Once
the exact data requirement is known, the researcher can plan and execute the other steps without
any waste of time and energy. Thus formulation gives a direction and a specific focus to the
research effort. It helps to delimit the field of enquiry by singling out the pertinent facts from a
vast ocean of facts and thus saves the researcher from becoming lost in a welter of irrelevancies.
It prevents a blind search and indiscriminate gathering of data which may later prove irrelevant
to the problem under study. It helps in determining the methods to be adopted for sampling and
collection of data.
Techniques involved in Formulating the Problem
The problem selected for research may initially be a vague topic. The question to be studied or
the problem to be solved may not be known. Hence the selected problem should be defined and
formulated. This is a difficult process. It requires intensive reading of a few selected articles or
chapters in books in order to understand the nature of the problem selected.
The process of defining a problem includes:
1. Developing title: The title should be carefully worded. It should indicate the core of the
study, reflect the real intention of the researcher, and show what the focus is, e.g.,
"Financing small-scale industries by commercial banks". This shows that the focus is on
commercial banks and not on small-scale industries. On the other hand, if the title is "The
Financial Problem of Small-scale Industries", the focus is on small-scale industries.
2. Building a conceptual model: On the basis of our theoretical knowledge of the
phenomenon under study, the nature of the phenomenon, its properties / elements and
their inter-relations should be identified and structured into a framework. This conceptual
model gives an exact idea of the research problem and shows its various properties and
variables to be studied. It serves as a basis for the formulation of the objectives of the
study and the hypotheses to be tested. In order to work out a conceptual model we must
make a careful and critical study of the available literature on the subject-matter of the
selected research problem. It is for this reason that a researcher is expected to select a
problem for research in his field of specialization. Without adequate background
knowledge, a researcher cannot grasp and comprehend the nature of the research
problem.
3. Defining the objectives of the study: The objectives refer to the questions to be
answered through the study. They indicate what we are trying to get through the study.
The objectives are derived from the conceptual model. They state which elements in the
conceptual model (which levels of, which kinds of cases, which properties, and which
connections among properties) are to be investigated, but it is the conceptual model that
defines, describes, and states the assumptions underlying these elements. The objectives
may aim at description or explanation or analysis of causal relationship between
variables, and indicate the expected results or outcome of the study. The objectives may
be specified in the form of either the statements or the questions.
Criteria of Good Research Problem
Horton and Hunt have given the following characteristics of scientific research:
1. Verifiable evidence: That is factual observations which other observers can see and
check.
2. Accuracy: That is describing what really exists. It means truth or correctness of a
statement or describing things exactly as they are and avoiding jumping to unwarranted
conclusions either by exaggeration or fantasizing.
3. Precision: That is making it as exact as necessary, or giving exact number or
measurement. This avoids colourful literature and vague meanings.
4. Systematization: That is attempting to find all the relevant data, or collecting data in a
systematic and organized way so that the conclusions drawn are reliable. Data based on
casual recollections are generally incomplete and give unreliable judgments and
conclusions.
5. Objectivity: That is being free from all biases and vested interests. It means observation
is unaffected by the observer's values, beliefs and preferences to the extent possible and
he is able to see and accept facts as they are, not as he might wish them to be.
6. Recording: That is jotting down complete details as quickly as possible. Since human
memory is fallible, all data collected are recorded.
7. Controlling conditions: That is controlling all variables except one and then attempting
to examine what happens when that variable is varied. This is the basic technique in all
scientific experimentation allowing one variable to vary while holding all other
variables constant.
8. Training investigators: That is imparting necessary knowledge to investigators to make
them understand what to look for, how to interpret it, and how to avoid inaccurate data
collection.
Summary
Research really begins when the researcher experiences some difficulty, i.e., a problem
demanding a solution within the subject area of his discipline. The formulation of the topic into a
research problem is, really speaking, the first step in a scientific enquiry. The selection of one
appropriate researchable problem out of the identified problems requires evaluation of those
alternatives against certain criteria, which may be grouped into internal criteria and external
criteria. A problem well put is half-solved. The primary task of research is collection of relevant
data and the analysis of data for finding answers to the research questions. The problem selected
for research may initially be a vague topic. The process of defining a problem includes:
- Developing title
- Building a conceptual model
- Define the Objective of the Study
Horton and Hunt have given following characteristics of scientific research:
- Verifiable evidence
- Accuracy
- Precision
- Systematization
- Objectivity
- Recording
- Controlling conditions
- Training investigators
Copyright 2009 SMU
Powered by Sikkim Manipal University
MB0034- Unit 3 Hypothesis
Unit 3 Hypothesis
Introduction
A hypothesis is an assumption about relations between variables. It is a tentative explanation of
the research problem or a guess about the research outcome. Before starting the research, the
researcher has a rather general, diffused, even confused notion of the problem. It may take a long
time for the researcher to say what questions he had been seeking answers to. Hence, an adequate
statement about the research problem is very important. What is a good problem statement? It is
an interrogative statement that asks: what relationship exists between two or more variables? It
then further asks questions like: Is A related to B or not? How are A and B related to C? Is A
related to B under conditions X and Y? Proposing a statement pertaining to relationship between
A and B is called a hypothesis.

Objectives:
After studying this lesson you should be able to understand:
- Meaning and Examples of Hypothesis
- Criteria for constructing of hypothesis
- Nature of Hypothesis
- the need for having Hypothesis
- Characteristics of good hypothesis
- Types of hypothesis
- Null Hypothesis and alternative hypothesis
- Concepts of Hypothesis
- The level of Significance
- Decision rule of testing hypothesis
- Type I and Type II Errors
- Two Tailed and One Tailed Test
- Procedures for Testing hypothesis
- Testing of Hypothesis

Meaning and Examples of Hypothesis
According to Theodorson and Theodorson, a hypothesis is a tentative statement asserting a
relationship between certain facts. Kerlinger describes it as a conjectural statement of the
relationship between two or more variables. Black and Champion have described it as a
tentative statement about something, the validity of which is usually unknown. This statement is
intended to be tested empirically and is either verified or rejected. If the statement is not
sufficiently established, it is not considered a scientific law. In other words, a hypothesis carries
clear implications for testing the stated relationship, i.e., it contains variables that are measurable
and specifies how they are related. A statement that lacks variables or that does not explain
how the variables are related to each other is no hypothesis in the scientific sense.
Criteria for Hypothesis Construction
Hypothesis is never formulated in the form of a question. The standards to be met in formulating
a hypothesis:
- It should be empirically testable, whether it is right or wrong.
- It should be specific and precise.
- The statements in the hypothesis should not be contradictory.
- It should specify variables between which the relationship is to be established.
- It should describe one issue only.

Nature of Hypothesis
A scientifically justified hypothesis must meet the following criteria:
- It must accurately reflect the relevant sociological fact.
- It must not be in contradiction with approved relevant statements of other scientific
disciplines.
- It must consider the experience of other researchers.
The Need for having Working Hypothesis
- A hypothesis gives a definite point to the investigation, and it guides the direction of the
study.
- A hypothesis specifies the sources of data, which shall be studied, and in what context
they shall be studied.
- It determines the data needs.
- A hypothesis suggests which type of research is likely to be most appropriate.
- It determines the most appropriate technique of analysis.
- A hypothesis contributes to the development of theory.
Characteristics of Good Hypothesis
1. Conceptual Clarity
2. Specificity
3. Testability
4. Availability of Techniques
5. Theoretical relevance
6. Consistency
7. Objectivity
8. Simplicity

Types of Hypothesis
There are many kinds of hypothesis the researcher has to work with. One type of
hypothesis asserts that something is the case in a given instance; that a particular object,
person or situation has particular characteristics. Another type of hypothesis deals with
the frequency of occurrence or of association among variables; this type of hypothesis
may state that X is associated with Y in a certain proportion of cases (e.g., urbanism tends
to be accompanied by mental disease), or that something is greater or lesser than some
other thing in specific settings. Yet another type of hypothesis asserts that a particular
characteristic is one of the factors which determine another characteristic, i.e., X is the
producer of Y. Hypotheses of this type are called causal hypotheses.
Null Hypothesis and Alternative Hypothesis
In the context of statistical analysis, we often talk about the null and alternative hypotheses. If we
are to compare method A with method B about its superiority and if we proceed on the
assumption that both methods are equally good, then this assumption is termed the null
hypothesis. As against this, we may think that method A is superior; this is the alternative
hypothesis. Symbolically these are presented as:
Null hypothesis = $H_0$ and Alternative hypothesis = $H_a$

Suppose we want to test the hypothesis that the population mean ($\mu$) is equal to the
hypothesized mean ($\mu_{H_0}$) = 100. Then we would say that the null hypothesis is that the
population mean is equal to the hypothesized mean 100, and symbolically we can express it
as: $H_0: \mu = \mu_{H_0} = 100$
If our sample results do not support this null hypothesis, we should conclude that
something else is true. What we conclude on rejecting the null hypothesis is known as the
alternative hypothesis. If we accept $H_0$, then we are rejecting $H_a$, and if we reject $H_0$, then
we are accepting $H_a$. For $H_0: \mu = \mu_{H_0} = 100$, we may consider three possible alternative
hypotheses as follows:
Alternative hypothesis | To be read as follows
$H_a: \mu \neq \mu_{H_0}$ | The alternative hypothesis is that the population mean is not equal to 100, i.e., it may be more or less than 100
$H_a: \mu > \mu_{H_0}$ | The alternative hypothesis is that the population mean is greater than 100
$H_a: \mu < \mu_{H_0}$ | The alternative hypothesis is that the population mean is less than 100
The null hypothesis and the alternative hypothesis are chosen before the sample is drawn
(the researcher must avoid the error of deriving hypothesis from the data he collects and
testing the hypothesis from the same data). In the choice of null hypothesis, the following
considerations are usually kept in view:
- The alternative hypothesis is usually the one which one wishes to prove and the null hypothesis
is the one which one wishes to disprove. Thus the null hypothesis represents the hypothesis we are
trying to reject, and the alternative hypothesis represents all other possibilities.
- If the rejection of a certain hypothesis when it is actually true involves great risk, it is
taken as the null hypothesis because then the probability of rejecting it when it is true is
$\alpha$ (the level of significance), which is chosen very small.
- The null hypothesis should always be a specific hypothesis, i.e., it should not state that a
parameter is about or approximately a certain value.
- Generally, in hypothesis testing we proceed on the basis of the null hypothesis, keeping the
alternative hypothesis in view. Why so? The answer is that on the assumption that the null
hypothesis is true, one can assign probabilities to the different possible sample results,
but this cannot be done if we proceed with the alternative hypothesis. Hence the use of the null
hypothesis (at times also known as the statistical hypothesis) is quite frequent.

Concepts of Hypothesis Testing
Basic concepts in the context of testing of hypothesis need to be explained.
The Level of Significance
This is a very important concept in the context of hypothesis testing. It is always some
percentage (usually 5%) which should be chosen with great care, thought and reason. In case we
take the significance level at 5%, then this implies that $H_0$ will be rejected when the sampling
result (i.e., observed evidence) has a less than 0.05 probability of occurring if $H_0$ is true. In other
words, the 5% level of significance means that the researcher is willing to take as much as a 5% risk
of rejecting the null hypothesis when it ($H_0$) happens to be true. Thus the significance level is the
maximum value of the probability of rejecting $H_0$ when it is true and is usually determined in
advance before testing the hypothesis.
Decision Rule of Test of Hypothesis:
Given a hypothesis $H_0$ and an alternative hypothesis $H_a$, we make a rule, known as the
decision rule, according to which we accept $H_0$ (i.e., reject $H_a$) or reject $H_0$ (i.e., accept $H_a$). For
instance, if $H_0$ is that a certain lot is good (there are very few defective items in it) against $H_a$
that the lot is not good (there are many defective items in it), then we must decide the number of
items to be tested and the criterion for accepting or rejecting the hypothesis. We might test 10
items in the lot and plan our decision saying that if there are none or only 1 defective item among
the 10, we will accept $H_0$; otherwise we will reject $H_0$ (or accept $H_a$). This sort of basis is known
as a decision rule.
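The decision rule above can be sketched in a few lines of Python. This is only an illustration: the two defect rates are hypothetical values chosen for the example, the items are assumed to be defective independently of each other (a binomial model), and the function names are not from the text.

```python
from math import comb

def prob_accept_lot(defect_rate, n=10, max_defectives=1):
    """Probability that a sample of n items contains at most
    max_defectives defective items (binomial model, an assumption)."""
    return sum(
        comb(n, k) * defect_rate**k * (1 - defect_rate)**(n - k)
        for k in range(max_defectives + 1)
    )

def decide(defectives_found, max_defectives=1):
    """Decision rule from the text: accept H0 if none or only 1 item is defective."""
    return "accept H0" if defectives_found <= max_defectives else "reject H0"

# Hypothetical defect rates, used only to illustrate the two kinds of risk.
good_lot_rate = 0.02   # H0 true: very few defective items
bad_lot_rate = 0.30    # Ha true: many defective items

risk_reject_good_lot = 1 - prob_accept_lot(good_lot_rate)
risk_accept_bad_lot = prob_accept_lot(bad_lot_rate)

print(decide(1), "|", decide(2))
print(f"P(reject a good lot) = {risk_reject_good_lot:.4f}")
print(f"P(accept a bad lot)  = {risk_accept_bad_lot:.4f}")
```

Changing the sample size or the cut-off of 1 defective item changes both risks, which is why the number of items tested and the acceptance criterion must be fixed in advance.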

Type I & Type II Errors
In the context of testing of hypothesis there are basically two types of errors that researchers
make. We may reject $H_0$ when $H_0$ is true, and we may accept $H_0$ when it is not true. The former is
known as Type I error and the latter is known as Type II error. In other words, Type I error means rejection of a
hypothesis which should have been accepted, and Type II error means acceptance of a hypothesis
which should have been rejected. Type I error is denoted by $\alpha$ (alpha), also called the level of
significance of the test; and Type II error is denoted by $\beta$ (beta).

Actual situation | Decision: Accept $H_0$ | Decision: Reject $H_0$
$H_0$ (true) | Correct decision | Type I error ($\alpha$ error)
$H_0$ (false) | Type II error ($\beta$ error) | Correct decision
The probability of Type I error is usually determined in advance and is understood as the level of
significance of testing the hypothesis. If Type I error is fixed at 5%, it means there are about 5
chances in 100 that we will reject $H_0$ when $H_0$ is true. We can control Type I error just by fixing it
at a lower level. For instance, if we fix it at 1%, we will say that the maximum probability of
committing Type I error would only be 0.01.
But with a fixed sample size, n, when we try to reduce Type I error, the probability of committing
Type II error increases. Both types of errors cannot be reduced simultaneously. There is a trade-off
between the two types of errors. In business situations, decision-makers decide the appropriate level
of Type I error by examining the costs or penalties attached to both types of errors. If Type I error
involves the time and trouble of reworking a batch of chemicals that should have been accepted, whereas
Type II error means taking a chance that an entire group of users of this chemical compound will be
poisoned, then in such a situation one should prefer a Type I error to a Type II error. As a result one
must set a very high level for Type I error in one's testing technique for a given hypothesis. Hence, in
testing of hypothesis, one must make all possible effort to strike an adequate balance between Type I
and Type II errors.
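The trade-off between the two errors for a fixed sample size can be illustrated numerically. The sketch below assumes an upper one-tailed z-test with a known population standard deviation; the values of the true mean, the standard deviation and the sample size are hypothetical and are not taken from the text.

```python
from statistics import NormalDist

def type_two_error(alpha, mu_h0, mu_true, sigma, n):
    """Beta for an upper one-tailed z-test of H0: mu = mu_h0 against
    Ha: mu > mu_h0, when the true mean is mu_true and sigma is known."""
    std_error = sigma / n ** 0.5
    # Sample means above this cut-off lead to rejection of H0.
    cutoff = NormalDist(mu_h0, std_error).inv_cdf(1 - alpha)
    # Beta: chance that the sample mean stays below the cut-off
    # although the true mean is mu_true (H0 is false).
    return NormalDist(mu_true, std_error).cdf(cutoff)

# Hypothetical setting: mu_h0 = 100, true mean 103, sigma = 10, n = 25.
for alpha in (0.10, 0.05, 0.01):
    beta = type_two_error(alpha, mu_h0=100, mu_true=103, sigma=10, n=25)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

With the sample size held fixed, each reduction in alpha moves the cut-off further from the hypothesized mean, so beta rises.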
Two Tailed Test & One Tailed Test
In the context of hypothesis testing these two terms are quite important and must be clearly
understood. A two-tailed test rejects the null hypothesis if, say, the sample mean is significantly
higher or lower than the hypothesized value of the mean of the population. Such a test is
appropriate when we have $H_0: \mu = \mu_{H_0}$ and $H_a: \mu \neq \mu_{H_0}$, which may mean
$\mu > \mu_{H_0}$ or $\mu < \mu_{H_0}$. If the
significance level is 5% and the two-tailed test is to be applied, the probability of the rejection area
will be 0.05 (equally split on both tails of the curve as 0.025) and that of the acceptance region will
be 0.95. If we take $\mu$ = 100 and our sample mean deviates significantly from $\mu$ in either direction,
we shall reject the null hypothesis; if it does not deviate significantly from $\mu$, we shall accept the
null hypothesis. But there are situations when only a one-tailed test is considered
appropriate. A one-tailed test would be used when we are to test, say, whether the population
mean is either lower than or higher than some hypothesized value.
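As a minimal sketch of how the rejection region differs between the two kinds of test, the following Python code computes the critical values of the standard normal distribution at the 5% level. The function names and the sample value z = 1.8 are illustrative assumptions.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed):
    """Positive critical value of the standard normal distribution."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def reject_h0(z, alpha=0.05, tail="two"):
    """tail is 'two', 'right' (Ha: mu > mu_H0) or 'left' (Ha: mu < mu_H0)."""
    if tail == "two":
        return abs(z) > critical_z(alpha, two_tailed=True)
    if tail == "right":
        return z > critical_z(alpha, two_tailed=False)
    if tail == "left":
        return z < -critical_z(alpha, two_tailed=False)
    raise ValueError("tail must be 'two', 'right' or 'left'")

print(round(critical_z(0.05, two_tailed=True), 2))    # 1.96
print(round(critical_z(0.05, two_tailed=False), 3))   # 1.645
# The same statistic can fall outside the two-tailed rejection region
# but inside the right-tailed one.
print(reject_h0(1.8, tail="two"), reject_h0(1.8, tail="right"))
```

Because the whole 5% sits in one tail, the one-tailed critical value is smaller than the two-tailed one, so the choice of test must be made from the form of the alternative hypothesis and not from the data.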

Procedure for Testing Hypothesis
To test a hypothesis means to tell (on the basis of the data the researcher has collected) whether or
not the hypothesis seems to be valid. In hypothesis testing the main question is whether to accept the null
hypothesis or not to accept the null hypothesis. Procedure for hypothesis testing refers to all
those steps that we undertake for making a choice between the two actions i.e., rejection and
acceptance of a null hypothesis. The various steps involved in hypothesis testing are stated
below:

Making a Formal Statement
The step consists in making a formal statement of the null hypothesis ($H_0$) and also of the
alternative hypothesis ($H_a$). This means that the hypotheses should be clearly stated, considering the
nature of the research problem. For instance, Mr. Mohan of the Civil Engineering Department
wants to test the load bearing capacity of an old bridge, which must be more than 10 tons; in that
case he can state his hypotheses as under:
Null hypothesis $H_0: \mu = 10$ tons
Alternative hypothesis $H_a: \mu > 10$ tons
Take another example. The average score in an aptitude test administered at the national level is
80. To evaluate a state's education system, the average score of 100 of the state's students
selected on a random basis was 75. The state wants to know if there is a significant difference
between the local scores and the national scores. In such a situation the hypotheses may be stated
as under:
Null hypothesis $H_0: \mu = 80$
Alternative hypothesis $H_a: \mu \neq 80$
The formulation of hypotheses is an important step which must be accomplished with due care in
accordance with the object and nature of the problem under consideration. It also indicates
whether we should use a one-tailed test or a two-tailed test. If $H_a$ is of the type greater than
(or of the type lesser than), we use a one-tailed test, but when $H_a$ is of the type whether greater
or smaller, then we use a two-tailed test.
Selecting a Significance Level
The hypothesis is tested on a pre-determined level of significance and, as such, the same should be
specified. Generally, in practice, either the 5% level or the 1% level is adopted for the purpose. The
factors that affect the level of significance are:
- The magnitude of the difference between sample means;
- The size of the sample;
- The variability of measurements within samples;
- Whether the hypothesis is directional or non-directional (a directional hypothesis is one
which predicts the direction of the difference between, say, means).
In brief, the level of significance must be adequate in the context of the purpose and nature of enquiry.
Deciding the Distribution to Use
After deciding the level of significance, the next step in hypothesis testing is to determine the
appropriate sampling distribution. The choice generally remains between the normal distribution and the
t-distribution. The rules for selecting the correct distribution are similar to those which we have
stated earlier in the context of estimation.

Selecting A Random Sample & Computing An Appropriate Value
Another step is to select a random sample(s) and compute an appropriate value from the sample
data concerning the test statistic utilizing the relevant distribution. In other words, draw a sample
to furnish empirical data.
Calculation of the Probability
One has then to calculate the probability that the sample result would diverge as widely as it has
from expectations, if the null hypothesis were in fact true.
Comparing the Probability
Yet another step consists in comparing the probability thus calculated with the specified value
for $\alpha$, the significance level. If the calculated probability is equal to or smaller than the $\alpha$ value in case
of a one-tailed test (and $\alpha/2$ in case of a two-tailed test), then reject the null hypothesis (i.e., accept
the alternative hypothesis), but if the probability is greater, then accept the null hypothesis. In
case we reject $H_0$, we run a risk of (at most the level of significance) committing an error of Type I,
but if we accept $H_0$, then we run some risk of committing an error of Type II.
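The steps above can be traced on the aptitude-test example (national mean 80, sample of 100 students, sample mean 75). The text does not give the population standard deviation, so the value of 20 used below is purely a hypothetical assumption, as are the function and variable names.

```python
from statistics import NormalDist

def z_test(sample_mean, mu_h0, sigma, n, alpha=0.05, two_tailed=True):
    """Follow the procedure in the text for a mean with known sigma.
    For a one-tailed test this sketch assumes the sample mean deviates
    in the direction stated by Ha."""
    std_error = sigma / n ** 0.5
    z = (sample_mean - mu_h0) / std_error
    # Probability of a result at least this far from mu_h0 (in the
    # observed direction) if the null hypothesis were in fact true.
    tail_prob = 1 - NormalDist().cdf(abs(z))
    # Compare with alpha (one-tailed) or alpha / 2 (two-tailed).
    threshold = alpha / 2 if two_tailed else alpha
    decision = "reject H0" if tail_prob <= threshold else "accept H0"
    return z, tail_prob, decision

# sigma = 20 is an assumed value, not given in the text.
z, tail_prob, decision = z_test(sample_mean=75, mu_h0=80, sigma=20, n=100)
print(f"z = {z:.2f}, tail probability = {tail_prob:.4f}, decision: {decision}")
```

Under this assumed standard deviation the tail probability falls below 0.025, so the null hypothesis of no difference between local and national scores would be rejected at the 5% level; a different standard deviation could lead to a different decision.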
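Flow Diagram for Testing Hypothesis
[Flow diagram not reproduced. Its final branches: rejecting $H_0$ runs the risk of committing Type I
error; accepting $H_0$ runs the risk of committing Type II error.]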
Testing of Hypothesis
Hypothesis testing determines the validity of the assumption (technically described as the null
hypothesis) with a view to choosing between the conflicting hypotheses about the value of a
population parameter. Hypothesis
testing helps to decide, on the basis of sample data, whether a hypothesis about the population
is likely to be true or false. Statisticians have developed several tests of hypothesis (also known
as tests of significance) for the purpose of testing of hypothesis, which can be classified as:
- Parametric tests or standard tests of hypothesis;
- Non-parametric tests or distribution-free tests of hypothesis.
Parametric tests usually assume certain properties of the parent population from which we draw
samples. Assumptions such as that the observations come from a normal population, that the sample size is large,
and assumptions about the population parameters like mean, variance, etc. must hold good before
parametric tests can be used. But there are situations when the researcher cannot or does not want
to make such assumptions. In such situations we use statistical methods for testing hypotheses which
are called non-parametric tests because such tests do not depend on any assumption about the
parameters of the parent population. Besides, most non-parametric tests assume only nominal or
ordinal data, whereas parametric tests require measurement equivalent to at least an interval
scale. As a result, non-parametric tests need more observations than parametric tests to achieve
the same size of Type I and Type II errors.
Important Parametric Tests
The important parametric tests are:
- z-test
- t-test
- $\chi^2$-test
- F-test
All these tests are based on the assumption of normality i.e., the source of data is considered to
be normally distributed. In some cases the population may not be normally distributed, yet the
test will be applicable on account of the fact that we mostly deal with samples and the sampling
distributions closely approach normal distributions.
Z-test is based on the normal probability distribution and is used for judging the significance of
several statistical measures, particularly the mean. The relevant test statistic is worked out and
compared with its probable value (to be read from the table showing area under the normal curve) at
a specified level of significance for judging the significance of the measure concerned. This is a
most frequently used test in research studies. This test is used even when the binomial distribution or
t-distribution is applicable, on the presumption that such a distribution tends to approximate the
normal distribution as n becomes larger. Z-test is generally used for comparing the mean of a
sample to some hypothesized mean for the population in case of a large sample, or when the population
variance is known. Z-test is also used for judging the significance of the difference between the means
of two independent samples in case of large samples, or when the population variance is known.
Z-test is also used for comparing the sample proportion to a theoretical value of the population
proportion, or for judging the difference in proportions of two independent samples when n
happens to be large. Besides, this test may be used for judging the significance of the median, mode,
co-efficient of correlation and several other measures.
T-test is based on the t-distribution and is considered an appropriate test for judging the significance
of a sample mean or for judging the significance of the difference between the means of two
samples in case of small samples when the population variance is not known (in which case we use the
variance of the sample as an estimate of the population variance). In case two samples are related,
we use the paired t-test (difference test) for judging the significance of the mean of the differences
between the two related samples. It can also be used for judging the significance of the co-efficients
of simple and partial correlations. The relevant test statistic, t, is calculated from the sample data
and then compared with its probable value based on the t-distribution at a specified level of
significance for the concerning degrees of freedom for accepting or rejecting the null hypothesis. It
may be noted that the t-test applies only in case of small samples when the population variance is
unknown.
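A minimal sketch of the one-sample t-test described above is given here. The ten scores and the hypothesized mean of 80 are hypothetical illustration data, and the critical value 2.262 is the tabled two-tailed 5% value for 9 degrees of freedom.

```python
from statistics import mean, stdev

def one_sample_t(sample, mu_h0):
    """Return the t statistic and its degrees of freedom."""
    n = len(sample)
    # stdev() uses the n - 1 divisor, i.e. the sample variance serves
    # as the estimate of the unknown population variance.
    std_error = stdev(sample) / n ** 0.5
    return (mean(sample) - mu_h0) / std_error, n - 1

# Hypothetical small sample of 10 scores.
scores = [78, 82, 75, 80, 77, 74, 79, 81, 76, 78]
t, df = one_sample_t(scores, mu_h0=80)

critical_t = 2.262  # tabled value: two-tailed, 5% level, 9 degrees of freedom
decision = "reject H0" if abs(t) > critical_t else "accept H0"
print(f"t = {t:.3f}, df = {df}, decision: {decision}")
```

For other sample sizes or significance levels the critical value has to be read afresh from the t-table for the concerning degrees of freedom.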
$\chi^2$-test is based on the chi-square distribution and, as a parametric test, is used for comparing a
sample variance to a theoretical population variance.
F-test is based on the F-distribution and is used to compare the variances of two independent
samples. This test is also used in the context of analysis of variance (ANOVA) for judging the significance
of more than two sample means at one and the same time. It is also used for judging the
significance of multiple correlation coefficients. The test statistic, F, is calculated and compared with
its probable value for accepting or rejecting $H_0$.
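For reference, the test statistics named above are commonly written as follows in their simplest one-sample and two-sample forms. The symbols are assumptions introduced here and do not appear in the text: $\bar{X}$ is the sample mean, $s^2$ the sample variance, $n$ the sample size, $\sigma$ the known population standard deviation and $\sigma_0^2$ the theoretical population variance.

```latex
\begin{align*}
z &= \frac{\bar{X} - \mu_{H_0}}{\sigma / \sqrt{n}}
  && \text{(population variance known, or large sample)} \\
t &= \frac{\bar{X} - \mu_{H_0}}{s / \sqrt{n}}
  && \text{with } n - 1 \text{ degrees of freedom} \\
\chi^2 &= \frac{(n - 1)\, s^2}{\sigma_0^2}
  && \text{with } n - 1 \text{ degrees of freedom} \\
F &= \frac{s_1^2}{s_2^2}
  && \text{with } (n_1 - 1,\; n_2 - 1) \text{ degrees of freedom}
\end{align*}
```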

Summary
A hypothesis is an assumption about relations between variables. It is a tentative explanation of
the research problem or a guess about the research outcome. Before starting the research, the
researcher has a rather general, diffused, even confused notion of the problem. A hypothesis
gives a definite point to the investigation, and it guides the direction of the study. A hypothesis
specifies the sources of data, which shall be studied, and in what context they shall be studied. In
the context of hypothesis testing, the terms two-tailed test and one-tailed test are quite important
and must be clearly understood. A two-tailed test rejects the null hypothesis if, say, the sample mean is significantly
higher or lower than the hypothesized value of the mean of the population.
The hypothesis is tested on a pre-determined level of significance and, as such, the same should be
specified. Generally, in practice, either the 5% level or the 1% level is adopted for the purpose. After
deciding the level of significance, the next step in hypothesis testing is to determine the
appropriate sampling distribution. Hypothesis testing determines the validity of the
assumption (technically described as the null hypothesis) with a view to choosing between the
conflicting hypotheses about the value of a population parameter. Z-test is
based on the normal probability distribution and is used for judging the significance of several
statistical measures, particularly the mean. The relevant test statistic is worked out and compared
with its probable value (to be read from the table showing area under the normal curve) at a
specified level of significance for judging the significance of the measure concerned. This is a
most frequently used test in research studies. T-test is based on the t-distribution and is considered
an appropriate test for judging the significance of a sample mean or for judging the significance of the
difference between the means of two samples in case of small samples when the population
variance is not known (in which case we use the variance of the sample as an estimate of the
population variance). $\chi^2$-test is based on the chi-square distribution and, as a parametric test, is used
for comparing a sample variance to a theoretical population variance. F-test is based
on the F-distribution and is used to compare the variances of two independent samples.
MB0034- Unit 4 Research Design
Unit 4 Research Design
Meaning of Research Design
The research designer understandably cannot hold all his decisions in his head. Even if he could,
he would have difficulty in understanding how these are inter-related. Therefore, he records his
decisions on paper or record disc by using relevant symbols or concepts. Such a symbolic
construction may be called the research design or model. A research design is a logical and
systematic plan prepared for directing a research study. It specifies the objectives of the study,
the methodology and techniques to be adopted for achieving the objectives. It constitutes the
blue print for the collection, measurement and analysis of data. It is the plan, structure and
strategy of investigation conceived so as to obtain answers to research questions. The plan is the
overall scheme or program of research. A research design is the program that guides the
investigator in the process of collecting, analyzing and interpreting observations. It provides a
systematic plan of procedure for the researcher to follow. Selltiz, Jahoda, Deutsch and Cook
describe it thus: "A research design is the arrangement of conditions for collection and analysis of data
in a manner that aims to combine relevance to the research purpose with economy in procedure."

Objectives:
After studying this lesson you should be able to understand:
- Needs of Research Design
- Characteristics of a Good Research Design
- Components of Research Design
- Experimental and Non-experimental Hypothesis Testing Research
- Different Research Designs
- Research Design for Studies in Commerce and Management
- Research Design in Case of Exploratory Research Studies
- Research Design in case of Descriptive and Diagnostic Research Studies
- Research Design in case of Hypothesis testing Research Studies
- Principles of Experimental Designs
- Important Experimental Designs
- Formal Experimental Designs
Needs of Research Design
The need for the methodologically designed research:
a- In many a research inquiry, the researcher has no idea as to how accurate the results of his
study ought to be in order to be useful. Where such is the case, the researcher has to
determine how much inaccuracy may be tolerated. In quite a few cases he may be in a
position to know how much inaccuracy his method of research will produce. In either case he
should design his research if he wants to assure himself of useful results.
b- In many research projects, the time consumed in trying to ascertain what the data mean
after they have been collected is much greater than the time taken to design a research which
yields data whose meaning is known as they are collected.
c- The idealized design is concerned with specifying the optimum research procedure that
could be followed were there no practical restrictions.
Characteristics of a Good Research Design
1. It is a series of guide posts to keep one going in the right direction.
2. It reduces wastage of time and cost.
3. It encourages co-ordination and effective organization.
4. It is a tentative plan which undergoes modifications, as circumstances demand, when the
study progresses, new aspects, new conditions and new relationships come to light and
insight into the study deepens.
5. It has to be geared to the availability of data and the cooperation of the informants.
6. It has also to be kept within manageable limits.

Components of Research Design
It is important to be familiar with the important concepts relating to research design. They are:
1. Dependent and Independent variables: A magnitude that varies is known as a
variable. The concept may assume different quantitative values, like height, weight, income,
etc. Qualitative variables are not quantifiable in the strictest sense of objectivity. However,
the qualitative phenomena may also be quantified in terms of the presence or absence of the
attribute considered. Phenomena that assume different values quantitatively even in decimal
points are known as continuous variables. But, all variables need not be continuous. Values
that can be expressed only in integer values are called non-continuous variables. In
statistical terms, they are also known as discrete variables. For example, age is a continuous
variable, whereas the number of children is a non-continuous variable. When changes in one
variable depend upon the changes in one or more other variables, it is known as a dependent
or endogenous variable, and the variables that cause the changes in the dependent variable
are known as the independent or explanatory or exogenous variables. For example, if demand
depends upon price, then demand is a dependent variable, while price is the independent
variable. And if, more variables determine demand, like income and prices of substitute
commodity, then demand also depends upon them in addition to the own price. Then,
demand is a dependent variable which is determined by the independent variables like own
price, income and price of substitute.
2. Extraneous variable: The independent variables which are not directly related to the
purpose of the study but affect the dependent variable are known as extraneous variables. For
instance, assume that a researcher wants to test the hypothesis that there is a relationship
between children's school performance and their self-concepts, in which case the latter is an
independent variable and the former, the dependent variable. In this context, intelligence may
also influence school performance. However, since it is not directly related to the purpose
of the study undertaken by the researcher, it would be known as an extraneous variable. The
influence caused by the extraneous variable on the dependent variable is technically called
an experimental error. Therefore, a research study should always be framed in such a
manner that the change in the dependent variable is attributable entirely to the independent
variable and not to any extraneous variable or variables.
3. Control: One of the most important features of a good research design is that it minimizes
the effect of extraneous variables. Technically, the term control is used when a researcher
designs the study in such a manner that it minimizes the effects of extraneous independent
variables. The term control is used in experimental research to refer to the restraint imposed on
experimental conditions.
4. Confounded relationship: The relationship between dependent and independent
variables is said to be confounded by an extraneous variable, when the dependent variable is
not free from its effects.
- Research hypothesis: When a prediction or a hypothesized relationship is tested by
adopting scientific methods, it is known as a research hypothesis. The research hypothesis
is a predictive statement which relates a dependent variable and an independent variable.
Generally, a research hypothesis must contain at least one dependent variable and one
independent variable. Relationships that are assumed but not to be tested, and
predictive statements that are not to be objectively verified, are not classified as research
hypotheses.
- Experimental and control groups: When a group is exposed to usual conditions in an
experimental hypothesis-testing research, it is known as a control group. On the other
hand, when the group is exposed to certain new or special conditions, it is known as an
experimental group. In the example of student Groups A and B given under experimental
hypothesis-testing research, Group A can be called a
control group and Group B an experimental one. If both Groups A and B are
exposed to some special feature, then both groups may be called experimental
groups. A research design may include only the experimental group or both
experimental and control groups together.
- Treatments: Treatments refer to the different conditions to which the
experimental and control groups are subjected. In the example considered, the two
treatments are parents with regular earnings and parents with no regular earnings.
Likewise, if a research study attempts to examine through an experiment the
comparative impact of three different types of fertilizers on the yield of a rice crop, then
the three types of fertilizers would be treated as the three treatments.
- Experiment: An experiment refers to the process of verifying the truth of a statistical
hypothesis relating to a given research problem. For instance, an experiment may be
conducted to examine the yield of a newly developed variety of rice crop. Further,
experiments may be categorized into two types, namely, absolute experiments and
comparative experiments. If a researcher wishes to determine the impact of a chemical
fertilizer on the yield of a particular variety of rice crop, then it is known as an absolute
experiment. If, instead, the researcher wishes to determine the impact of chemical
fertilizer as compared to the impact of bio-fertilizer, then the experiment is known as a
comparative experiment.
- Experimental unit: Experimental units refer to the predetermined plots, characteristics or
blocks to which the different treatments are applied. It is worth mentioning here that
such experimental units must be selected with great caution.
Experimental and Non-Experimental Hypothesis Testing Research
When the objective of a research study is to test a research hypothesis, it is known as
hypothesis-testing research. Such research may be of experimental design or non-experimental
design. Research in which the independent variable is manipulated is known as experimental
hypothesis-testing research, whereas research in which the independent variable is not
manipulated is termed non-experimental hypothesis-testing research. E.g., assume that a
researcher wants to examine whether family income influences the school attendance of a group
of students, by calculating the coefficient of correlation between the two variables. Such an
example is known as non-experimental hypothesis-testing research, because the independent
variable, family income, is not manipulated. Again, assume that the researcher randomly selects
150 students from a group of students who pay their school fees regularly and then classifies
them into two sub-groups by randomly including 75 in Group A, whose parents have regular
earnings, and 75 in Group B, whose parents do not have regular earnings. Assume further that at
the end of the study, the researcher conducts a test on each group in order to examine the
effects of the regular earnings of the parents on the school attendance of the students. Such a
study is an example of experimental hypothesis-testing research, because in this particular study
the independent variable, regular earnings of the parents, has been manipulated.
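
The non-experimental example above rests on the coefficient of correlation between family income and school attendance. The following is a minimal sketch of that calculation, assuming the Pearson coefficient is meant; the function name pearson_r and the data values are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson coefficient of correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: family income (in thousands) and days attended in a year.
family_income = [10, 15, 20, 25, 30, 35]
days_attended = [160, 170, 168, 180, 185, 190]

r = pearson_r(family_income, days_attended)
print(round(r, 3))
```

A value of r close to +1 or -1 indicates a strong linear association, but since income is only observed and not manipulated, the coefficient by itself does not establish causation.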

Different Research Designs
There are a number of crucial research design choices, and various writers advance different
classification schemes, some of which are:
1. Experimental, historical and inferential designs (American Marketing Association).
2. Exploratory, descriptive and causal designs (Selltiz, Jahoda, Deutsch and Cook).
3. Experimental and ex post facto (Kerlinger)
4. Historical method, and case and clinical studies (Good and Scates)
5. Sample surveys, field studies, experiments in field settings, and laboratory experiments
(Festinger and Katz)
6. Exploratory, descriptive and experimental studies (Boyd and Westfall)
7. Exploratory, descriptive and causal (Green and Tull)
8. Experimental, quasi-experimental designs (Nachmias and Nachmias)
9. True experimental, quasi-experimental and non-experimental designs (Smith).
10. Experimental, pre-experimental, quasi-experimental designs and Survey Research
(Kidder and Judd).
These different categorizations exist because research design is a complex concept. In fact,
there are different perspectives from which any given study can be viewed. They are:
1. The degree of formulation of the problem (the study may be exploratory or formalized)
2. The topical scope, that is, the breadth and depth of the study (a case or a statistical study)
3. The research environment: field setting or laboratory (survey, laboratory experiment)
4. The time dimension (one-time or longitudinal)
5. The mode of data collection (observational or survey)
6. The manipulation of the variables under study (experimental or ex post facto)
7. The nature of the relationship among variables (descriptive or causal)


Research Design for Studies in Commerce and Management
The various research designs are:
Research design in case of exploratory research studies
Exploratory research studies are also
termed formulative research studies. The main purpose of such studies is that of formulating a
problem for more precise investigation or of developing a working hypothesis from an
operational point of view. The major emphasis in such studies is on the discovery of ideas and
insights. As such, the research design appropriate for such studies must be flexible enough to
provide opportunity for considering different aspects of the problem under study. Inbuilt flexibility
in research design is needed because the research problem, broadly defined initially, is
transformed into one with a more precise meaning in exploratory studies, which may
necessitate changes in the research procedure for gathering relevant data. Generally, the
following three methods are discussed in the context of research design for such studies:
1. The survey of the concerned literature is the simplest and most fruitful method
of formulating the research problem precisely or developing hypotheses. Hypotheses
stated by earlier workers may be reviewed and their usefulness evaluated as a basis for
further research. It may also be considered whether the already stated hypotheses suggest
new hypotheses. In this way the researcher should review and build upon the work
already done by others, but in cases where hypotheses have not yet been formulated, his
task is to review the available material and derive the relevant hypotheses from it.
Besides, a bibliographical survey of studies already made in one's area of interest may
as well be made by the researcher for precisely formulating the problem. He should also
make an attempt to apply concepts and theories developed in different research contexts
to the area in which he is himself working. Sometimes the works of creative writers also
provide a fertile ground for hypothesis formulation and as such may be looked into by the
researcher.
2. Experience survey means the survey of people who have had practical experience with
the problem to be studied. The object of such a survey is to obtain insight into the
relationships between variables and new ideas relating to the research problem. For such
a survey, people who are competent and can contribute new ideas may be carefully
selected as respondents to ensure a representation of different types of experience. The
respondents so selected may then be interviewed by the investigator. The researcher must
prepare an interview schedule for the systematic questioning of informants. But the
interview must ensure flexibility in the sense that the respondents should be allowed to
raise issues and questions which the investigator has not previously considered.
Generally, the experience-collecting interview is likely to be long and may last for a few
hours. Hence, it is often considered desirable to send a copy of the questions to be
discussed to the respondents well in advance. This will also give the
respondents an opportunity to do some advance thinking over the various issues involved so that, at
the time of the interview, they may be able to contribute effectively. Thus, an experience
survey may enable the researcher to define the problem more concisely and help in the
formulation of the research hypothesis. This survey may as well provide information
about the practical possibilities for doing different types of research.
3. Analysis of insight-stimulating examples is also a fruitful method for suggesting
hypotheses for research. It is particularly suitable in areas where there is little experience
to serve as a guide. This method consists of the intensive study of selected instances of the
phenomenon in which one is interested. For this purpose the existing records, if any, may
be examined, unstructured interviewing may take place, or some other approach may
be adopted. The attitude of the investigator, the intensity of the study and the ability of the
researcher to draw together diverse information into a unified interpretation are the main
features which make this method an appropriate procedure for evoking insights. Now,
what sorts of examples are to be selected and studied? There is no clear-cut answer to it.
Experience indicates that for particular problems certain types of instances are more
appropriate than others. One can mention a few examples of insight-stimulating cases
such as the reactions of strangers, the reactions of marginal individuals, the study of
individuals who are in transition from one stage to another, the reactions of individuals
from different social strata and the like. In general, cases that provide sharp contrasts or
have striking features are considered relatively more useful while adopting this method of
hypothesis formulation.
Thus, in an exploratory or formulative research study which
merely leads to insights or hypotheses, whatever method or research design outlined
above is adopted, the only thing essential is that it must continue to remain flexible so
that many different facets of a problem may be considered as and when they arise and
come to the notice of the researcher.
Research design in case of descriptive and diagnostic research studies
Descriptive research studies are those studies which are concerned with describing the
characteristics of a particular individual or of a group, whereas diagnostic research studies
determine the frequency with which something occurs or its association with something else.
Studies concerning whether certain variables are associated are examples of diagnostic
research studies. As against this, studies concerned with specific predictions, or with narration of
facts and characteristics concerning an individual, group or situation, are all examples of descriptive
research studies. Most social research comes under this category. From the point of view
of the research design, descriptive as well as diagnostic studies share common requirements
and as such we may group together these two types of research studies. In descriptive as well as
in diagnostic studies, the researcher must be able to define clearly what he wants to measure and
must find adequate methods for measuring it, along with a clear-cut definition of the population he
wants to study. Since the aim is to obtain complete and accurate information in the said studies,
the procedure to be used must be carefully planned. The research design must make enough
provision for protection against bias and must maximize reliability. With due concern for the
economical completion of the research study, the design in such studies must be rigid and not
flexible and must focus attention on the following:
1. Formulating the objective of the study
2. Designing the methods of data collection
3. Selecting the sample
4. Collecting the data
5. Processing and analyzing the data
6. Reporting the findings.
In a descriptive / diagnostic study the first step is to specify the objectives with sufficient
precision to ensure that the data collected are relevant. If this is not done carefully, the study may
not provide the desired information. Then comes the question of selecting the methods by which
the data are to be obtained. While designing the data-collection procedure, adequate safeguards
against bias and unreliability must be ensured. Whichever method is selected, questions must be
well examined and made unambiguous; interviewers must be instructed not to express their
own opinions; observers must be trained so that they uniformly record a given item of behaviour.
More often than not, a sample has to be designed. Usually, one or more forms of probability
sampling, or what is often described as random sampling, are used. To obtain data free from
errors introduced by those responsible for collecting them, it is necessary to supervise closely the
staff of field workers as they collect and record information. Checks may be set up to ensure that
the data-collecting staff perform their duty honestly and without prejudice. The data collected
must be processed and analyzed. This includes steps like coding the interview replies,
observations, etc., tabulating the data, and performing several statistical computations.
Last of all comes the question of reporting the findings. This is the task of communicating the
findings to others, and the researcher must do it in an efficient manner.

Research Design in case of Hypothesis-Testing Research Studies
Hypothesis-testing research studies (generally known as experimental studies) are those where
the researcher tests the hypothesis of causal relationships between variables. Such studies require
procedures that will not only reduce bias and increase reliability, but will permit drawing
inferences about causality. Usually, experiments meet these requirements. Hence, when we talk
of research design in such studies, we often mean the design of experiments.
Principles of Experimental Designs
Professor Fisher has enumerated three principles of experimental designs:
1. The principle of replication: The experiment should be repeated more than once. Thus,
each treatment is applied to many experimental units instead of one. By doing so, the
statistical accuracy of the experiment is increased. For example, suppose we are to examine
the effect of two varieties of rice. For this purpose we may divide the field into two parts and
grow one variety in one part and the other variety in the other part. We can compare the yield
of the two parts and draw a conclusion on that basis. But if we are to apply the principle of
replication to this experiment, then we first divide the field into several parts, grow one
variety in half of these parts and the other variety in the remaining parts. We can collect the
yield data of the two varieties and draw a conclusion by comparing them. The result so
obtained will be more reliable than the conclusion we draw without applying the
principle of replication. The entire experiment can even be repeated several times for better
results. Conceptually replication does not present any difficulty, but computationally it does.
However, it should be remembered that replication is introduced in order to increase the
precision of a study; that is to say, to increase the accuracy with which the main effects and
interactions can be estimated.
2. The principle of randomization: Randomization provides protection, when we conduct an
experiment, against the effect of extraneous factors. In other words, this
principle indicates that we should design or plan the experiment in such a way that the
variations caused by extraneous factors can all be combined under the general heading of
chance. For instance, if we grow one variety of rice in the first half of the parts of a
field and the other variety in the other half, then it is quite possible that the soil
fertility is different in the first half in comparison to the other half. If this is so, our
results would not be realistic. In such a situation, we may assign the variety of rice to be
grown in different parts of the field on the basis of some random sampling technique, i.e., we
may apply the randomization principle and protect ourselves against the effects of extraneous
factors. As such, through the application of the principle of randomization, we can have a
better estimate of the experimental error.
3. Principle of local control: It is another important principle of experimental designs.
Under it the extraneous factor, the known source of variability, is made to vary deliberately
over as wide a range as necessary, and this needs to be done in such a way that the variability
it causes can be measured and hence eliminated from the experimental error. This means that
we should plan the experiment in a manner that we can perform a two-way analysis of
variance, in which the total variability of the data is divided into three components attributed
to treatments, the extraneous factor and experimental error. In other words, according to the
principle of local control, we first divide the field into several homogeneous parts, known as
blocks, and then each such block is divided into parts equal to the number of treatments.
Then the treatments are randomly assigned to these parts of a block. In general, blocks are
the levels at which we hold an extraneous factor fixed, so that we can measure its
contribution to the variability of the data by means of a two-way analysis of variance. In
brief, through the principle of local control we can eliminate the variability due to extraneous
factors from the experimental error.
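
The three principles above can be sketched together in a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names randomized_block_layout and two_way_sums_of_squares, and the yield figures, are hypothetical and do not come from the text. The first function assigns every treatment once to each block in random order (replication across blocks, randomization within a block); the second divides the total variability into the three components named in the principle of local control.

```python
import random


def randomized_block_layout(treatments, n_blocks, seed=None):
    """Return, for each block, a random ordering of all treatments."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # randomization within the block
        layout.append(order)
    return layout


def two_way_sums_of_squares(yields):
    """Split total variability into treatment, block and error components.

    yields[b][t] is the observed yield of treatment t in block b.
    """
    n_blocks = len(yields)
    n_treat = len(yields[0])
    grand = sum(sum(row) for row in yields) / (n_blocks * n_treat)
    block_means = [sum(row) / n_treat for row in yields]
    treat_means = [sum(row[t] for row in yields) / n_blocks
                   for t in range(n_treat)]
    ss_total = sum((v - grand) ** 2 for row in yields for v in row)
    ss_blocks = n_treat * sum((m - grand) ** 2 for m in block_means)
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    # Whatever is left after removing treatments and blocks is error.
    ss_error = ss_total - ss_blocks - ss_treat
    return {"total": ss_total, "treatments": ss_treat,
            "blocks": ss_blocks, "error": ss_error}


# Two rice varieties grown once in each of three blocks (hypothetical yields).
layout = randomized_block_layout(["variety 1", "variety 2"], n_blocks=3, seed=42)
yields = [[20, 24], [30, 34], [25, 29]]  # columns: variety 1, variety 2
parts = two_way_sums_of_squares(yields)
print(layout)
print(parts)
```

In this made-up data the block differences account for most of the total variability; without blocking, that variability would have been counted as experimental error.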
Important Experimental Designs
Experimental design refers to the framework or structure of an experiment and as such there are
several experimental designs. We can classify experimental designs into two broad categories,
viz., informal experimental designs and formal experimental designs. Informal experimental
designs are those designs that normally use a less sophisticated form of analysis based on
differences in magnitudes, whereas formal experimental designs offer relatively more control
and use precise statistical procedures for analysis.
Informal experimental designs:
- Before and after without control design: In such a design, a single test group or area is
selected and the dependent variable is measured before the introduction of the treatment.
The treatment is then introduced and the dependent variable is measured again after the
treatment has been introduced. The effect of the treatment would be equal to the level of
the phenomenon after the treatment minus the level of the phenomenon before the
treatment.
- After only with control design: In this design, two groups or areas (test and control area)
are selected and the treatment is introduced into the test area only. The dependent
variable is then measured in both the areas at the same time. Treatment impact is assessed
by subtracting the value of the dependent variable in the control area from its value in the
test area.
- Before and after with control design: In this design two areas are selected and the
dependent variable is measured in both the areas for an identical time-period before the
treatment. The treatment is then introduced into the test area only, and the dependent
variable is measured in both for an identical time-period after the introduction of the
treatment. The treatment effect is determined by subtracting the change in the dependent
variable in the control area from the change in the dependent variable in test area.
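
The arithmetic of the three informal designs can be written out as follows. This is a minimal sketch: the function names and the sales figures are hypothetical, chosen only to show how each design arrives at its treatment effect.

```python
def before_after_without_control(before, after):
    """Effect = level after the treatment minus level before it."""
    return after - before


def after_only_with_control(test_after, control_after):
    """Effect = value in the test area minus value in the control area."""
    return test_after - control_after


def before_after_with_control(test_before, test_after,
                              control_before, control_after):
    """Effect = change in the test area minus change in the control area."""
    return (test_after - test_before) - (control_after - control_before)


# Hypothetical weekly sales: the test area rises from 100 to 130 after the
# treatment, while the untreated control area rises from 100 to 110.
print(before_after_without_control(100, 130))        # 30
print(after_only_with_control(130, 110))             # 20
print(before_after_with_control(100, 130, 100, 110)) # 20
```

With these figures the design without a control area attributes the whole rise of 30 to the treatment, whereas the designs with a control area remove the rise of 10 that occurred even without the treatment.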
Formal Experimental Designs
1. Completely randomized design (CR design): It involves only two principles, viz., the
principle of replication and the principle of randomization. It is generally used when
experimental areas happen to be homogeneous. Technically, when all the variations due to
uncontrolled extraneous factors are included under the heading of chance variation, we refer
to the design of the experiment as a CR design.
2. Randomized block design (RB design): It is an improvement over the CR
design. In the RB design the principle of local control can be applied along with the other
two principles.
3. Latin square design (LS design): It is used in agricultural research. The treatments in an
LS design are so allocated among the plots that no treatment occurs more than once in
any row or column.
4. Factorial design: It is used in experiments where the effects of varying more than one
factor are to be determined. Such designs are especially important in several economic and
social phenomena where usually a large number of factors affect a particular problem.
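
The allocation rule of the Latin square design in the list above can be illustrated with a short sketch. It assumes a simple cyclic construction, which is only one of many valid Latin squares; in practice the rows, columns and treatment labels would normally also be randomized. The function names are hypothetical.

```python
def cyclic_latin_square(treatments):
    """Build an n x n layout by shifting the treatment list one place per row."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]


def is_latin_square(square):
    """Check that every treatment occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols
                  for c in range(n))
    return rows_ok and cols_ok


# Four hypothetical fertilizer treatments laid out on a 4 x 4 grid of plots.
square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```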

Summary
A research design is a logical and systematic plan prepared for directing a research study. In
many research projects, the time consumed in trying to ascertain what the data mean after they
have been collected is much greater than the time taken to design a research which yields data
whose meaning is known as they are collected. Research design is a series of guide posts to keep
one going in the right direction. It is a tentative plan which undergoes modifications, as
circumstances demand, as the study progresses, as new aspects, new conditions and new
relationships come to light, and as insight into the study deepens. Exploratory research studies are
also termed formulative research studies. The main purpose of such studies is that of
formulating a problem for more precise investigation or of developing a working hypothesis
from an operational point of view. Descriptive research studies are those studies which are
concerned with describing the characteristics of a particular individual or of a group, whereas
diagnostic research studies determine the frequency with which something occurs or its
association with something else.
Copyright 2009 SMU
Powered by Sikkim Manipal University
MB0034- Unit 5-Case Study Method
Unit 5-Case Study Method
Meaning of Case Study
Case study is a method of exploring and analyzing the life of a social unit or entity, be it a
person, a family, an institution or a community. The aim of the case study method is to locate or
identify the factors that account for the behaviour patterns of a given unit, and its relationship
with the environment. The case data are always gathered with a view to tracing the natural
history of the social unit, and its relationship with the social factors and forces operative and
involved in its surrounding milieu. In short, the social researcher tries, by means of the case
study method, to understand the complex of factors that are working within a social unit as an
integrated totality. Looked at from another angle, the case study serves a purpose similar to the
clue-providing function of expert opinion. It is most appropriate when one is trying to find clues
and ideas for further research.

The major credit for introducing the case study method into social investigation goes to Frederic
Le Play. Herbert Spencer was the first social philosopher who used case study in comparative
studies of different cultures. William Healy used case study in his study of juvenile
delinquency. Anthropologists and ethnologists have liberally utilized case study in the systematic
description of primitive cultures. Historians have used this method for portraying some historical
character or particular historical period and describing the developments within a national
community.

Objectives:
After studying this lesson you should be able to understand:
- Assumptions of Case Study Method
- Advantages of Case Study Method
- Disadvantages of Case Study Method
- Making Case Study Effective
- Case Study as a Method of Business Research

Assumptions of Case Study Method
- Case study depends upon the wit, common sense and imagination of the person doing
the case study. The investigator makes up his procedure as he goes along.
- If the life history has been written in the first person, it must be as complete and coherent
as possible.
- Life histories should have been written for knowledgeable persons.
- It is advisable to supplement case data by observational, statistical and historical data
since these provide standards for assessing the reliability and consistency of the case
material.
- Efforts should be made to ascertain the reliability of life history data through examining
the internal consistency of the material.
- A judicious combination of techniques of data collection is a prerequisite for securing
data that are culturally meaningful and scientifically significant.

Advantages of Case Study Method
Case study is of particular value when a complex set of variables may be at work in generating
observed results and intensive study is needed to unravel the complexities. For example, an
in-depth study of a firm's top salespeople and comparison with its worst salespeople might reveal
characteristics common to stellar performers. Here again, the exploratory investigation is best
served by an active curiosity and a willingness to deviate from the initial plan when findings
suggest that new courses of inquiry might prove more productive. It is easy to see how the
exploratory research objectives of generating insights and hypotheses would be well served by
the use of this technique.

Disadvantages of Case Study Method
Blumer points out that, taken independently, the case documents hardly fulfil the criteria of
reliability, adequacy and representativeness, but to exclude them from any scientific study of
human life would be a blunder, inasmuch as these documents are necessary and significant both for
theory building and practice.

Making Case Study Effective
Let us discuss the criteria for evaluating the adequacy of the case history or life history, which is
of central importance for case study. John Dollard has proposed seven criteria for evaluating
such adequacy, as follows:
i) The subject must be viewed as a specimen in a cultural series. That is, the case drawn out from
its total context for the purposes of study must be considered a member of the particular cultural
group or community. The scrutiny of the life histories of persons must be done with a view to
identifying the community values, standards and shared way of life.
ii) The organic motors of action must be socially relevant. That is, the actions of the individual
cases must be viewed as a series of reactions to social stimuli or situations. In other words, the
social meaning of behaviour must be taken into consideration.
iii) The strategic role of the family group in transmitting the culture must be recognized. That is,
in the case of an individual being the member of a family, the role of the family in shaping his
behaviour must never be overlooked.
iv) The specific method of elaboration of organic material into social behaviour must be clearly
shown. That is, case histories that portray in detail how a basically biological organism, man,
gradually blossoms forth into a social person are especially fruitful.
v) The continuous related character of experience from childhood through adulthood must be
stressed. In other words, the life history must be a configuration depicting the inter-relationships
between the person's various experiences.
vi) The social situation must be carefully and continuously specified as a factor. One of the important
criteria for the life history is that a person's life must be shown as unfolding itself in the context
of, and partly owing to, specific social situations.
vii) The life history material itself must be organised according to some conceptual framework;
this in turn would facilitate generalizations at a higher level.

Case Study as a Method of Business Research
In-depth analysis of selected cases is of particular value to business research when a complex set
of variables may be at work in generating observed results and intensive study is needed to
unravel the complexities. For instance, an in-depth study of a firm's top salespeople and
comparison with its worst salespeople might reveal characteristics common to stellar
performers. The exploratory investigator is best served by an active curiosity and a willingness to
deviate from the initial plan when the findings suggest that new courses of enquiry might prove
more productive.

Summary
Case study is a method of exploring and analyzing the life of a social unit or entity, be it a
person, a family, an institution or a community. Case study depends upon the wit,
common sense and imagination of the person doing the case study. The investigator makes up his
procedure as he goes along. Efforts should be made to ascertain the reliability of life history data
through examining the internal consistency of the material. A judicious combination of
techniques of data collection is a prerequisite for securing data that are culturally meaningful and
scientifically significant. Case study is of particular value when a complex set of variables may be
at work in generating observed results and intensive study is needed to unravel the complexities.
The case documents hardly fulfil the criteria of reliability, adequacy and representativeness, but
to exclude them from any scientific study of human life would be a blunder, inasmuch as these
documents are necessary and significant both for theory building and practice. In-depth analysis
of selected cases is of particular value to business research for the same reason.
MB0034- Unit 6-Sampling
Unit 6-Sampling
Meaning of Sampling
A part of the population is known as a sample. The method of selecting for study a
portion of the universe, with a view to drawing conclusions about the universe or population, is
known as sampling. A statistical sample ideally purports to be a miniature model or replica of the
collectivity or the population constituted of all the items that the study should principally
encompass, that is, the items which potentially hold promise of affording information relevant to
the purpose of a given research.
Sampling helps in saving time and cost. It also makes it easier to check the accuracy of the data
collected. But on the other hand it demands the exercise of great care and caution; otherwise the
results obtained may be incorrect or misleading.
Objectives
After studying this lesson you should be able to understand:
- Advantages of sampling
- Sampling procedure
- Characteristics of good sample
- Methods of Sampling
- Probability or Random Sampling
- Non-probability or Non Random Sampling

Advantages of Sample Survey
Sampling is advantageous under the following conditions:
- The size of the population: If the population to be studied is quite large, sampling is
warranted. However, the size is a relative matter. Whether a population is large or small
depends upon the nature of the study, the purpose for which it is undertaken, and the time
and other resources available for it.
- Amount of funds budgeted for the study: Sampling is opted for when the amount of
money budgeted is smaller than the anticipated cost of a census survey.
- Facilities: The extent of facilities available (staff, access to computer facilities and
accessibility of population elements) is another factor to be considered in deciding whether to
sample or not. When the availability of these facilities is limited, sampling is preferable.
- Time: The time limit within which the study should be completed is another important factor to
be considered in deciding the question of sample survey. This, in fact, is a primary reason
for using sampling by academic and marketing researchers.

Sampling Procedure
The decision process of sampling is a complicated one. The researcher has to first identify the
limiting factor or factors and must judiciously balance the conflicting factors. The various
criteria governing the choice of the sampling technique are:
1. Purpose of the Survey: What does the researcher aim at? If he intends to generalize the
findings based on the sample survey to the population, then an appropriate probability
sampling method must be selected. The choice of a particular type of probability
sampling depends on the geographical area of the survey and the size and the nature of
the population under study.
2. Measurability: The application of statistical inference theory requires computation of the
sampling error from the sample itself. Probability samples only allow such computation.
Hence, where the research objective requires statistical inference, the sample should be
drawn by applying simple random sampling method or stratified random sampling
method, depending on whether the population is homogenous or heterogeneous.
3. Degree of Precision: Should the results of the survey be very precise, or could even rough
results serve the purpose? The desired level of precision is one of the criteria for
selecting a sampling method. Where a high degree of precision of results is desired,
probability sampling should be used. Where even crude results would serve the purpose
(e.g., marketing surveys, readership surveys, etc.), any convenient non-random sampling
method like quota sampling would be enough.
4. Information about Population: How much information is available about the
population to be studied? Where no list of population and no information about its nature
are available, it is difficult to apply a probability sampling method. Then an exploratory
study with non-probability sampling may be made to gain a better idea of the population.
After gaining sufficient knowledge about the population through the exploratory study,
an appropriate probability sampling design may be adopted.
5. The Nature of the Population: In terms of the variables to be studied, is the population
homogenous or heterogeneous? In the case of a homogenous population, even a simple
random sampling will give a representative sample. If the population is heterogeneous,
stratified random sampling is appropriate.
6. Geographical Area of the Study and the Size of the Population: If the area covered by
a survey is very large and the size of the population is quite large, multi-stage cluster
sampling would be appropriate. But if the area and the size of the population are small,
single stage probability sampling methods could be used.
7. Financial resources: If the available finance is limited, it may become necessary to
choose a less costly sampling plan like multistage cluster sampling or even quota
sampling as a compromise. However, if the objectives of the study and the desired level
of precision cannot be attained within the stipulated budget, there is no alternative but to
give up the proposed survey. Where finance is not a constraint, a researcher can
choose the most appropriate method of sampling that fits the research objective and the
nature of the population.
8. Time Limitation: The time limit within which the research project should be completed
restricts the choice of a sampling method. Then, as a compromise, it may become
necessary to choose less time consuming methods, such as simple random sampling instead
of stratified sampling or sampling with probability proportional to size, or multi-stage cluster
sampling instead of single-stage sampling of elements. Of course, the precision has to be
sacrificed to some extent.
9. Economy: It should be another criterion in choosing the sampling method. It means
achieving the desired level of precision at minimum cost. A sample is economical if the
precision per unit cost is high or the cost per unit of variance is low.
The above criteria frequently conflict and the researcher must balance and blend them
to obtain a good sampling plan. The chosen plan thus represents an adaptation of the sampling
theory to the available facilities and resources. That is, it represents a compromise between
idealism and feasibility. One should use simple workable methods instead of unduly elaborate
and complicated techniques.

Characteristics of a Good Sample
The characteristics of a good sample are described below:
- Representativeness: A sample must be representative of the population. Probability
sampling techniques yield representative samples.
- Accuracy: Accuracy is defined as the degree to which bias is absent from the sample. An
accurate sample is one which exactly represents the population.
- Precision: The sample must yield precise estimates. Precision is measured by the standard
error.
- Size: A good sample must be adequate in size in order to be reliable.
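Since precision is measured by the standard error, a short worked example may help. The sketch below is only an illustration: it assumes a simple random sample and uses the usual textbook estimate of the standard error of a sample mean (sample standard deviation divided by the square root of the sample size). The function name and the data are hypothetical.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical data: monthly spending reported by 8 respondents.
spending = [12, 15, 11, 14, 13, 16, 12, 15]
print("standard error:", round(standard_error(spending), 3))
```

A smaller standard error means a more precise estimate; other things being equal, it falls as the sample size grows.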

Methods of Sampling
Sampling techniques or methods may be classified into two generic types:
Probability or Random Sampling
Probability sampling is based on the theory of probability. It is also known as random sampling.
It provides a known nonzero chance of selection for each population element. It is used when
generalization is the objective of study, and a greater degree of accuracy of estimation of
population parameters is required. The cost and time required are high; hence the benefit derived
from it should justify the costs.
The following are the types of probability sampling:
i. Simple Random Sampling: This sampling technique gives each element an equal and
independent chance of being selected. An equal chance means equal probability of
selection. An independent chance means that the draw of one element will not affect the
chances of other elements being selected. The procedure of drawing a simple random
sample consists of enumerating all elements in the population, in two steps:
1. Preparation of a list of all elements, giving them numbers in serial order 1, 2, 3, and so
on, and
2. Drawing sample numbers by using (a) the lottery method, (b) a table of random numbers or
(c) a computer.
Suitability: This type of sampling is suited for a small homogeneous population.
Advantages: It is one of the easiest methods, all the elements
in the population have an equal chance of being selected, it is simple to understand, and it does not
require prior knowledge of the true composition of the population.
Disadvantages: It is often impractical because of non-availability of a population list or
difficulty in enumerating the population, it does not ensure proportionate representation, and
it may be expensive in time and money. The amount of sampling error associated with any
sample drawn can easily be computed, but it is greater than that in other probability
samples of the same size, because simple random sampling is less precise than other methods.
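The two-step procedure described for simple random sampling (number the elements, then draw numbers at random) can be sketched with a computer as the drawing device. This is a minimal illustration, not a prescribed implementation; the function name, the seed parameter and the list of students are assumptions made for the example.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement, each with an equal chance."""
    rng = random.Random(seed)
    # Step 1: prepare the list and number the elements serially 1, 2, 3, ...
    frame = {i: element for i, element in enumerate(population, start=1)}
    # Step 2: draw n serial numbers at random (the "computer" option).
    chosen_numbers = rng.sample(sorted(frame), n)
    return [frame[i] for i in chosen_numbers]


# Hypothetical sampling frame of 100 students.
students = [f"student_{i}" for i in range(1, 101)]
sample = simple_random_sample(students, 10, seed=42)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking; in an actual survey the draw would normally not be re-run until a preferred sample appears.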
ii. Stratified Random Sampling: This is an improved type of random or probability
sampling. In this method, the population is sub-divided into homogenous groups or strata,
and from each stratum, random sample is drawn. E.g., university students may be divided
on the basis of discipline, and each discipline group may again be divided into juniors and
seniors. Stratification is necessary for increasing a sample's statistical efficiency, providing
adequate data for analyzing the various sub-populations and applying different methods to
different strata. Stratified random sampling is appropriate for a large heterogeneous
population. The stratification process involves three major decisions: the stratification
base or bases, the number of strata and the strata sample sizes.
Stratified random sampling may be classified into:
a) Proportionate stratified sampling: This sampling involves drawing a sample
from each stratum in proportion to the latter's share in the total population. It gives
proper representation to each stratum and its statistical efficiency is generally higher.
This method is therefore very popular. E.g., suppose the Management Faculty of a University
consists of the following specialization groups:
Specialization stream    No. of students    Proportion of each stream
Production               40                 0.4
Finance                  20                 0.2
Marketing                30                 0.3
Rural development        10                 0.1
Total                    100                1.0

The researcher wants to draw an overall sample of 30. Then the strata sample sizes would
be:
Strata                   Sample size
Production               30 x 0.4 = 12
Finance                  30 x 0.2 = 6
Marketing                30 x 0.3 = 9
Rural development        30 x 0.1 = 3
Total                    30

Advantages: Stratified random sampling enhances the representativeness of the
sample, gives higher statistical efficiency, is easy to carry out, and gives a self-weighting
sample.
Disadvantages: It requires prior knowledge of the composition and
distribution of the population, it is very expensive in time and money, and identification
of the strata may lead to classification errors.
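The Management Faculty example can be reproduced in a few lines. The sketch below simply multiplies the overall sample size by each stratum's share of the population; the function name is hypothetical, and the rounding step is an assumption that happens to be harmless here because every product is a whole number.

```python
def proportionate_allocation(strata_sizes, total_sample):
    """Allocate the overall sample to strata in proportion to their sizes."""
    population = sum(strata_sizes.values())
    # With other figures the rounded sizes may not add up exactly to
    # total_sample and would need a small manual adjustment.
    return {
        name: round(total_sample * size / population)
        for name, size in strata_sizes.items()
    }


# Figures taken from the Management Faculty example in the text.
faculty = {
    "Production": 40,
    "Finance": 20,
    "Marketing": 30,
    "Rural development": 10,
}
allocation = proportionate_allocation(faculty, 30)
print(allocation)
```

Within each stratum, the allocated number of students would then be drawn by simple random sampling.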
b) Disproportionate stratified random sampling: This method does not give
proportionate representation to strata. It necessarily involves giving over-representation
to some strata and under-representation to others. The desirability of disproportionate
sampling is usually determined by three factors, viz, (a) the sizes of strata, (b) internal
variances among strata, and (c) sampling costs.
Suitability: This method is used when the population contains some small but
important subgroups, when certain groups are quite heterogeneous, while others are
homogeneous and when it is expected that there will be appreciable differences in the
response rates of the subgroups in the population.
Advantages: It is less time consuming and facilitates
giving appropriate weighting to particular groups which are small but more important.
Disadvantages: It does not give each stratum proportionate
representation, requires prior knowledge of the composition of the population, is subject to
classification errors, and its practical feasibility is doubtful.
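To see how the three factors (stratum size, internal variance and sampling cost) can jointly shape a disproportionate allocation, the sketch below uses one common textbook rule, often called optimum allocation, in which a stratum's sample size is proportional to its size times its standard deviation divided by the square root of its unit cost. This rule is not stated in this unit; it is offered only as an illustration, and all names and figures are hypothetical.

```python
import math


def optimum_allocation(strata, total_sample):
    """strata maps name -> (size N_h, std dev S_h, unit cost c_h).

    Each stratum receives a share proportional to N_h * S_h / sqrt(c_h).
    Fractional sizes are returned; round them as the study requires.
    """
    scores = {
        name: size * std_dev / math.sqrt(cost)
        for name, (size, std_dev, cost) in strata.items()
    }
    total_score = sum(scores.values())
    return {
        name: total_sample * score / total_score
        for name, score in scores.items()
    }


# Hypothetical strata: equal sizes and costs, but B is three times as variable.
strata = {"A": (100, 1.0, 1.0), "B": (100, 3.0, 1.0)}
print(optimum_allocation(strata, 40))
```

Under this rule larger, more variable and cheaper-to-survey strata are over-represented, which matches the description of disproportionate sampling given above.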
iii. Systematic Random Sampling: This method of sampling is an alternative to random
selection. It consists of taking every kth item in the population after a random start with an item
from 1 to k. It is also known as the fixed interval method. E.g., the 1st, 11th, 21st items and so on.
Strictly speaking, this method of sampling is not a probability sampling. It possesses characteristics
of randomness and some non-probability traits.
Suitability: Systematic selection can be applied to various populations such as students
in a class, houses in a street, telephone directory etc.
Advantages: It is simpler than random sampling, easy to use, easy to
instruct, requires less time, is cheaper and easier to check, spreads the sample evenly over the
population, and is statistically more efficient.
Disadvantages: It ignores all elements between two successive kth elements
selected, the elements do not have independent chances of being selected, and this method
sometimes gives a biased sample.
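The fixed interval method is easy to express in code. The following is a minimal sketch assuming the population is available as an ordered list; the function name and the house numbers are hypothetical.

```python
import random


def systematic_sample(population, k, seed=None):
    """Take every k-th item after a random start between 1 and k."""
    rng = random.Random(seed)
    start = rng.randint(1, k)  # random start, inclusive of both 1 and k
    # Positions are 1-based in the text, so shift by one for list indexing.
    return population[start - 1::k]


# Hypothetical frame: houses numbered 1 to 100 along a street, interval k = 10.
houses = list(range(1, 101))
picked = systematic_sample(houses, 10, seed=3)
print(picked)
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined, which is why the method carries some non-probability traits.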
Cluster Sampling
It means random selection of sampling units consisting of population elements. Each such
sampling unit is a cluster of population elements. Then from each selected sampling unit, a
sample of population elements is drawn by either simple random selection or stratified random
selection. Where the population elements are scattered over a wide area and a list of population
elements is not readily available, the use of simple or stratified random sampling method would
be too expensive and time-consuming. In such cases cluster sampling is usually adopted. The
cluster sampling process involves identifying clusters, examining the nature of clusters, and
determining the number of stages.
Suitability: The application of cluster sampling is extensive in farm management surveys, socio-
economic surveys, rural credit surveys, demographic studies, ecological studies, public opinion
polls, and large scale surveys of political and social behaviour, attitude surveys and so on.
Advantages: This method is easier and more convenient, costs
much less, promotes the convenience of field work as it can be done in compact places,
does not require more time, allows units of study to be readily substituted for other units, and is more
flexible.
Disadvantages: The cluster sizes may vary and this variation could increase the bias of the
resulting sample. The sampling error in this method of sampling is greater, as adjacent units
of study tend to have more similar characteristics than do units distantly apart.
Area sampling
This is an important form of cluster sampling. In larger field surveys, clusters consisting of
specific geographical areas like districts, taluks, villages or blocks in a city are randomly drawn.
As the geographical areas are selected as sampling units in such cases, their sampling is called
area sampling. It is not a separate method of sampling, but forms part of cluster sampling.
Multi-stage and sub-sampling
In multi-stage sampling method, sampling is carried out in two or more stages. The population is
regarded as being composed of a number of first stage sampling units, each of which is composed
of a number of second stage units, and so forth. That is, at each
stage, a sampling unit is a cluster of the sampling units of the subsequent stage. First, a sample of
the first stage sampling units is drawn; then, from each of the selected first stage sampling units, a
sample of the second stage sampling units is drawn. The procedure continues down to the final
sampling units or population elements. Appropriate random sampling method is adopted at each
stage. It is appropriate where the population is scattered over a wider geographical area and no
frame or list is available for sampling. It is also useful when a survey has to be made within a
limited time and cost budget. The major disadvantage is that the procedure of estimating
sampling error and cost advantage is complicated.
Sub-sampling is a part of multi-stage sampling process. In a multi-stage sampling, the sampling
in second and subsequent stage frames is called sub-sampling. Sub-sampling balances the two
conflicting effects of clustering i.e., cost and sampling errors.
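A two-stage design, the simplest case of multi-stage sampling, can be sketched as follows: clusters are drawn at random in the first stage and elements are sub-sampled within each selected cluster in the second. The function name, the villages and the households are hypothetical, and simple random selection is assumed at both stages.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: draw clusters at random. Stage 2: sub-sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in chosen:
        elements = clusters[name]
        # A small cluster may hold fewer elements than requested.
        take = min(n_per_cluster, len(elements))
        result[name] = rng.sample(elements, take)
    return result


# Hypothetical first stage units (villages) and second stage units (households).
villages = {
    "village_a": ["a1", "a2", "a3", "a4", "a5"],
    "village_b": ["b1", "b2", "b3", "b4"],
    "village_c": ["c1", "c2", "c3", "c4", "c5", "c6"],
}
print(two_stage_sample(villages, n_clusters=2, n_per_cluster=3, seed=7))
```

Note that only the selected villages need a household list, which is why the design suits populations for which no complete frame exists.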
Random Sampling with Probability Proportional to Size
The procedure of selecting clusters with probability proportional to size (PPS) is widely used. If
one primary cluster has twice as large a population as another, it is given twice the chance of
being selected. If the same number of persons is then selected from each of the selected clusters,
the overall probability of selection of any person will be the same. Thus PPS is a better method for securing a
representative sample of population elements in multi-stage cluster sampling.
Advantages: Clusters of various sizes get proportionate representation; PPS
leads to greater precision than would a simple random sample of clusters with a constant
sampling fraction at the second stage; and equal-sized samples from each selected primary cluster are
convenient for field work.
Disadvantages: PPS cannot be used if the sizes of the primary sampling clusters are not known.
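The claim that PPS selection followed by equal-sized samples gives every person the same overall chance can be checked numerically. The sketch below makes a simplifying assumption that clusters are drawn with replacement, and it computes the chance for a single draw; the function names and cluster sizes are hypothetical.

```python
import random


def pps_select(cluster_sizes, n_clusters, seed=None):
    """Select clusters (with replacement) with chance proportional to size."""
    rng = random.Random(seed)
    names = list(cluster_sizes)
    weights = [cluster_sizes[name] for name in names]
    return rng.choices(names, weights=weights, k=n_clusters)


def overall_probability(cluster_sizes, cluster, take):
    """Chance that a given person in `cluster` is picked in one PPS draw
    followed by a simple random sample of `take` persons from the cluster."""
    total = sum(cluster_sizes.values())
    p_cluster = cluster_sizes[cluster] / total
    p_within = take / cluster_sizes[cluster]
    return p_cluster * p_within


# Hypothetical primary clusters and their populations.
sizes = {"A": 200, "B": 100, "C": 700}
for name in sizes:
    print(name, overall_probability(sizes, name, take=20))
print(pps_select(sizes, n_clusters=2, seed=1))
```

The larger chance of selecting a big cluster is exactly offset by the smaller chance of being picked within it, so the product is the same for every person.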
Double Sampling and Multiphase Sampling
Double sampling refers to the subselection of the final sample from a pre-selected larger sample
that provided information for improving the final selection. When the procedure is extended to
more than two phases of selection, it is called multi-phase sampling. This is also known as
sequential sampling, as sub-sampling is done from a main sample in phases. Double sampling or
multiphase sampling is a compromise solution for a dilemma posed by undesirable extremes.
The statistics based on the sample of n can be improved by using ancillary information from a
wide base, but this is too costly to obtain from the entire population of N elements. Instead,
information is obtained from a larger preliminary sample n_L, which includes the final sample n.

Replicated or Interpenetrating Sampling
It involves selection of a certain number of sub-samples rather than one full sample from a
population. All the sub-samples should be drawn using the same sampling technique and each is
a self-contained and adequate sample of the population. Replicated sampling can be used with
any basic sampling technique: simple or stratified, single or multi-stage or single or multiphase
sampling. It provides a simple means of calculating the sampling error. It is practical. The
replicated samples can throw light on variable non-sampling errors. Its disadvantage is that it
limits the amount of stratification that can be employed.
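The remark that replicated sampling provides a simple means of calculating the sampling error can be illustrated as follows. The sketch assumes the sub-samples are independent simple random samples of equal size and estimates the standard error of the combined mean from the spread of the sub-sample means. The function name and the data are hypothetical.

```python
import math
import random
import statistics


def replicated_estimate(population, n_replicates, replicate_size, seed=None):
    """Draw several sub-samples by the same method and return the combined
    mean together with a standard error based on the sub-sample means."""
    if n_replicates < 2:
        raise ValueError("at least two sub-samples are needed")
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, replicate_size))
        for _ in range(n_replicates)
    ]
    estimate = statistics.mean(means)
    # Spread of the sub-sample means, scaled for the mean of all replicates.
    std_error = statistics.stdev(means) / math.sqrt(n_replicates)
    return estimate, std_error


# Hypothetical population of 1,000 values; 10 sub-samples of 30 each.
values = list(range(1, 1001))
estimate, std_error = replicated_estimate(values, 10, 30, seed=1)
print(round(estimate, 1), round(std_error, 1))
```

No formula specific to the sampling design is needed, which is the practical appeal of the method.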
Non-probability or Non Random Sampling
Non-probability sampling or non-random sampling is not based on the theory of probability. This
sampling does not provide a known chance of selection to each population element.
Advantages: The only merits of this type of sampling are simplicity, convenience and low cost.
Disadvantages: It does not ensure a selection chance to each population unit. The selected
sample may not be a representative one. The selection probability is unknown. It suffers from
sampling bias, which will distort results.
This type of sampling is used when there is no other feasible alternative due to non-availability
of a list of the population, when the study does not aim at generalizing the findings to the
population, when the costs required for probability sampling may be too large, or when probability
sampling requires more time than the time limit for completing the study permits. It may be
classified into:

Convenience or Accidental Sampling
It means selecting sample units in a hit and miss fashion, e.g., interviewing people whom
we happen to meet. This sampling also means selecting whatever sampling units are
conveniently available, e.g., a teacher may select students in his class. This method is also known
as accidental sampling because the respondents whom the researcher meets accidentally are
included in the sample.
Suitability: Though this type of sampling has no status, it may be used for simple purposes such
as testing ideas or gaining ideas or rough impression about a subject of interest.
Advantage: It is the cheapest and simplest, it does not require a list of population and it does not
require any statistical expertise.
Disadvantage: It is highly biased because of the researcher's subjectivity, it
is the least reliable sampling method and the findings cannot be generalized.
Purposive (or judgment) sampling
This method means deliberate selection of sample units that conform to some pre-determined
criteria. This is also known as judgment sampling. This involves selection of cases which we
judge as the most appropriate ones for the given study. It is based on the judgement of the
researcher or some expert. It does not aim at securing a cross section of a population. The chance
that a particular case be selected for the sample depends on the subjective judgement of the
researcher.
Suitability: This is used when what is important is the typicality and specific relevance of the
sampling units to the study and not their overall representativeness to the population.
Advantage: It is less costly and more convenient and guarantees inclusion of relevant elements
in the sample.
Disadvantage: It is less efficient for generalizing, does not ensure the representativeness,
requires more prior extensive information and does not lend itself for using inferential statistics.
Quota sampling
This is a form of convenience sampling involving selection of quota groups of accessible sampling
units by traits such as sex, age, social class, etc. It is a method of stratified sampling in which the
selection within strata is non-random. It is this non-random element that constitutes its greatest
weakness.
Suitability: It is used in studies like marketing surveys, opinion polls, and readership surveys
which do not aim at precision, but to get quickly some crude results.
Advantage: It is less costly, takes less time, has no need for a list of the population, and field work can
easily be organized.
Disadvantage: It is impossible to estimate sampling error, strict control of field work is difficult,
and it is subject to a higher degree of classification error.
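Quota sampling can be pictured as accepting whoever is accessible until each group's quota is full. The sketch below is one possible way to express this; the function name, the respondent records and the quotas are hypothetical.

```python
def fill_quotas(respondents, quotas, trait):
    """Accept respondents in the order met until every quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        group = person[trait]
        # Selection within a group is non-random: first come, first taken.
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break
    return sample


# Hypothetical passers-by in the order the interviewer meets them.
passers_by = [
    {"name": "p1", "sex": "M"},
    {"name": "p2", "sex": "M"},
    {"name": "p3", "sex": "F"},
    {"name": "p4", "sex": "M"},
    {"name": "p5", "sex": "F"},
    {"name": "p6", "sex": "F"},
]
chosen = fill_quotas(passers_by, {"F": 2, "M": 1}, trait="sex")
print([p["name"] for p in chosen])
```

Because the order of arrival decides who is included, no selection probability can be attached to any respondent, which is why sampling error cannot be estimated.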
Snow-ball sampling
This is the colourful name for a technique of building up a list or a sample of a special
population by using an initial set of its members as informants. This sampling technique may
also be used in socio-metric studies.
Suitability: It is very useful in studying social groups, informal groups in a formal organization,
and diffusion of information among professionals of various kinds.
Advantage: It is useful for smaller populations for which no frames are readily available.
Disadvantage: The disadvantage is that it does not allow the use of probability statistical
methods. It is difficult to apply when the population is large. It does not ensure the inclusion of
all the elements in the list.
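The way a snow-ball sample grows from an initial set of informants can be sketched as a simple referral chain. The function name, the referral data and the size limit are hypothetical; in practice the referrals come from interviews rather than from a ready-made table.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Start from the seed informants and add the members they name,
    wave by wave, until max_size is reached or no new names appear."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # each member is listed only once
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical referrals: each informant names other members of the group.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["A"],
}
print(snowball_sample(referrals, seeds=["A"], max_size=10))
```

Members whom nobody names never enter the list, which illustrates why the method cannot ensure the inclusion of all elements.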

Summary
A statistical sample ideally purports to be a miniature model or replica of the collectivity or the
population. Sampling helps in time and cost saving. If the population to be studied is quite large,
sampling is warranted. However, the size is a relative matter. The decision regarding census or sampling
depends upon the budget of the study. Sampling is opted when the amount of money budgeted is
smaller than the anticipated cost of a census survey. The extent of facilities available (staff, access to
computer facilities and accessibility of population elements) is another factor to be considered in
deciding whether to sample or not. In the case of a homogenous population, even a simple random sampling will
give a representative sample. If the population is heterogeneous, stratified random sampling is
appropriate. Probability sampling is based on the theory of probability. It is also known as random
sampling. It provides a known non-zero chance of selection for each population element.
Simple random sampling technique gives each element an equal and independent chance of
being selected. An equal chance means equal probability of selection.
Stratified random sampling is an improved type of random or probability sampling. In this
method, the population is sub-divided into homogenous groups or strata, and from each stratum,
random sample is drawn.
Proportionate stratified sampling involves drawing a sample from each stratum in proportion
to the latter's share in the total population.
Disproportionate stratified random sampling does not give proportionate representation to
strata.
Systematic random sampling method is an alternative to random selection. It consists of taking
every kth item in the population after a random start with an item from 1 to k. It is also known as
the fixed interval method.
Cluster sampling means random selection of sampling units consisting of population elements.
In Area sampling, clusters consisting of specific geographical areas like districts, taluks,
villages or blocks in a city are randomly drawn in larger field surveys.
Multi-stage sampling is carried out in two or more stages. The population is regarded as being
composed of a number of first stage sampling units, each of which is composed of a number of
second stage units, and so forth. That is, at each stage, a sampling unit
is a cluster of the sampling units of the subsequent stage.
Double sampling and multiphase sampling refer to the subselection of the final sample from a
pre-selected larger sample that provided information for improving the final selection.
Replicated or interpenetrating sampling involves selection of a certain number of sub-samples
rather than one full sample from a population.
Non-probability or non random sampling is not based on the theory of probability. This
sampling does not provide a known chance of selection to each population element.
Purposive (or judgment) sampling method means deliberate selection of sample units that
conform to some pre-determined criteria. This is also known as judgment sampling.
Quota sampling is a form of convenience sampling involving selection of quota groups of
accessible sampling units by traits such as sex, age, social class, etc. It is a method of stratified
sampling in which the selection within strata is non-random.
Snow-ball sampling is the colourful name for a technique of building up a list or a sample of a
special population by using an initial set of its members as informants.
MB0034- Unit 7-Sources of Data
Unit 7-Sources of Data
Meaning and Importance of Data
The search for answers to research questions is called collection of data. Data are facts, and other
relevant materials, past and present, serving as bases for study and analyses. The data needed for
a social science research may be broadly classified into (a) Data pertaining to human beings, (b)
Data relating to organization and (c) Data pertaining to territorial areas.
Objectives
After studying this lesson you should be able to understand:
- Primary sources of data
- Advantages of primary data
- Disadvantages of primary data
- Methods of collecting primary data
- Secondary sources of data
- Features of secondary data
- Use of secondary data
- Advantages of secondary data
- Disadvantages of secondary data
- Evaluation of secondary data
Personal data or data related to human beings consist of:
1. Demographic and socio-economic characteristics of individuals: Age, sex, race, social
class, religion, marital status, education, occupation, income, family size, location of the
household, life style, etc.
2. Behavioral variables: Attitudes, opinions, awareness, knowledge, practice, intentions, etc.
Organizational data consist of data relating to an organization's origin, ownership,
objectives, resources, functions, performance and growth.
Territorial data are related to geo-physical characteristics, resource endowment,
population, occupational pattern, infrastructure, degree of development, etc. of spatial
divisions like villages, cities, taluks, districts, state and the nation.
The data serve as the bases or raw materials for analysis. Without an analysis of factual data, no
specific inferences can be drawn on the questions under study. Inferences based on imagination
or guess work cannot provide correct answers to research questions. The relevance, adequacy
and reliability of data determine the quality of the findings of a study.

Data form the basis for testing the hypothesis formulated in a study. Data also provide the facts
and figures required for constructing measurement scales and tables, which are analyzed with
statistical techniques. Inferences on the results of statistical analysis and tests of significance
provide the answers to research questions. Thus, the scientific process of measurements,
analysis, testing and inferences depends on the availability of relevant data and their accuracy.
Hence the importance of data for any research study.
The sources of data may be classified into (a) primary sources and (b) secondary sources.

Primary Sources of Data
Primary sources are original sources from which the researcher directly collects data that have
not been previously collected, e.g., collection of data directly by the researcher on brand
awareness, brand preference, brand loyalty and other aspects of consumer behaviour from a
sample of consumers by interviewing them. Primary data are first hand information collected
through various methods such as observation, interviewing, mailing, etc.
Advantages of Primary Data
- It is an original source of data.
- It is possible to capture the changes occurring in the course of time.
- It is flexible, to the advantage of the researcher.
- Extensive research studies are based on primary data.
Disadvantages of Primary Data
1. Primary data is expensive to obtain.
2. It is time consuming.
3. It requires extensive research personnel who are skilled.
4. It is difficult to administer.
Methods of Collecting Primary Data
Primary data are directly collected by the researcher from their original sources. In this
case, the researcher can collect the required data precisely according to his research
needs; he can collect them when he wants them and in the form he needs them. But the
collection of primary data is costly and time consuming. Yet, for several types of social
science research required data are not available from secondary sources and they have to
be directly gathered from the primary sources.
Where the available data are inappropriate, inadequate or obsolete, primary
data have to be gathered. Such cases include: socio-economic surveys, social anthropological
studies of rural communities and tribal communities, sociological studies of social
problems and social institutions, marketing research, leadership studies, opinion polls,
attitudinal surveys, readership, radio listening and T.V. viewing surveys,
knowledge-awareness-practice (KAP) studies, farm management studies, business management
studies, etc.
There are various methods of data collection. A method is different from a tool: while
a method refers to the way or mode of gathering data, a tool is an instrument used for the
method. For example, a schedule is used for interviewing. The important methods are
(a) observation, (b) interviewing, (c) mail survey, (d) experimentation,
(e) simulation and (f) projective technique. Each of these methods is discussed in detail in
the subsequent sections of the later chapters.

Secondary Sources of Data
These are sources containing data which have been collected and compiled for another
purpose. The secondary sources consist of readily available compendia and already compiled
statistical statements and reports whose data may be used by researchers for their studies,
e.g., census reports, annual reports and financial statements of companies, statistical
statements, reports of Government Departments, annual reports on currency and finance
published by the Reserve Bank of India, statistical statements relating to co-operatives
and regional banks published by NABARD, reports of the National Sample Survey
Organization, reports of trade associations, publications of international organizations
such as UNO, IMF, World Bank, ILO, WHO, etc., trade and financial journals,
newspapers, etc.
Secondary sources consist of not only published records and reports, but also unpublished
records. The latter category includes various records and registers maintained by the
firms and organizations, e.g., accounting and financial records, personnel records, register
of members, minutes of meetings, inventory records etc.
Features of Secondary Sources
Though secondary sources are diverse and consist of all sorts of materials, they have
certain common characteristics.
First, they are readymade and readily available, and do not require the trouble of
constructing tools and administering them.
Second, they consist of data over whose collection and classification the researcher has
no original control. Both the form and the content of secondary sources are shaped by
others. Clearly, this is a feature which can limit the research value of secondary sources.
Finally, secondary sources are not limited in time and space. That is, the researcher using
them need not have been present when and where they were gathered.

Use of Secondary Data
Secondary data may be used in three ways by a researcher. First, some specific
information from secondary sources may be used for reference purposes. For example, the
general statistical information on the number of co-operative credit societies in the
country, their coverage of villages, their capital structure, volume of business etc., may be
taken from published reports and quoted as background information in a study on the
evaluation of performance of co-operative credit societies in a selected district/state.
Second, secondary data may be used as bench marks against which the findings of
research may be tested, e.g., the findings of a local or regional survey may be compared
with the national averages; the performance indicators of a particular bank may be tested
against the corresponding indicators of the banking industry as a whole; and so on.
Finally, secondary data may be used as the sole source of information for a research
project. Studies such as securities market behaviour, financial analysis of companies,
trends in credit allocation in commercial banks, sociological studies on crimes, historical
studies, and the like, depend primarily on secondary data. Year books, statistical reports
of government departments, reports of public organizations, reports of the Bureau of Public
Enterprises, Census Reports etc. serve as major data sources for such research studies.
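
The second use described above, testing local findings against published bench marks, can be sketched in a few lines of Python. This is only an illustration: the function name compare_with_benchmark, the indicator names and all figures are hypothetical and do not come from any published report.

```python
def compare_with_benchmark(local, benchmark):
    """Return, for each indicator, (absolute gap, percentage gap) from the benchmark."""
    gaps = {}
    for name, local_value in local.items():
        if name not in benchmark:
            continue  # no published figure to compare against
        base = benchmark[name]
        diff = local_value - base
        pct = diff / base * 100 if base else None  # percentage gap undefined for a zero base
        gaps[name] = (diff, pct)
    return gaps

# Hypothetical indicators of one bank (primary data) and of the banking industry
# as a whole (secondary data). All numbers are invented for illustration.
bank = {"credit_deposit_ratio": 60.0, "overdue_ratio": 5.0, "branches_surveyed": 12}
industry = {"credit_deposit_ratio": 75.0, "overdue_ratio": 4.0}

gaps = compare_with_benchmark(bank, industry)
for name, (diff, pct) in gaps.items():
    print(f"{name}: gap {diff:+.1f} ({pct:+.1f}% relative to the industry figure)")
```

Indicators for which no secondary figure exists are simply skipped, which reflects the limitation that available data may not cover every variable of the study.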

Advantages of Secondary Data
Secondary sources have some advantages:
1. Secondary data, if available, can be secured quickly and cheaply. Once the
source documents and reports are located, collection of data is just a matter of
desk work. Even the tediousness of copying the data from the source can now be
avoided, thanks to photocopying facilities.
2. Wider geographical area and longer reference period may be covered without
much cost. Thus, the use of secondary data extends the researcher's space and
time reach.
3. The use of secondary data broadens the data base from which scientific
generalizations can be made.
4. Secondary data can supply the environmental and cultural settings required for the study.
5. The use of secondary data enables a researcher to verify the findings based on
primary data. It readily meets the need for additional empirical support. The
researcher need not wait for the time when additional primary data can be collected.

Disadvantages of Secondary Data
The use of secondary data has its own limitations.
1. The most important limitation is that the available data may not meet our specific
needs. The definitions adopted by those who collected the data may be
different; units of measure may not match; and time periods may also be different.
2. The available data may not be as accurate as desired. To assess their accuracy we
need to know how the data were collected.
3. Secondary data are often not up to date and may already be obsolete by the time they
appear in print, because of the time lag in producing them. For example, population census
data are published two or three years after compilation, and no new figures will
be available for another ten years.
4. Finally, information about the whereabouts of sources may not be available to all
social scientists. Even if the location of the source is known, the accessibility
depends primarily on proximity. For example, most of the unpublished official
records and compilations are located in the capital city, and they are not within
easy reach of researchers based in far-off places.
Evaluation of Secondary Data
When a researcher wants to use secondary data for his research, he should evaluate them
before deciding to use them.
1. Data Pertinence
The first consideration in evaluation is to examine the pertinence of the available secondary
data to the research problem under study. The following questions should be considered.
- What are the definitions and classifications employed? Are they consistent?
- What are the measurements of variables used? What is the degree to which they conform
to the requirements of our research?
- What is the coverage of the secondary data in terms of topic and time? Does this
coverage fit the needs of our research?
On the basis of the above considerations, the pertinence of the secondary data to the research on
hand should be determined. A researcher who is imaginative and flexible may be able to redefine
his research problem so as to make use of otherwise unusable available data.
2. Data Quality
If the researcher is convinced that the available secondary data suit his needs, the next step is
to examine the quality of the data. The quality of data refers to their accuracy, reliability and
completeness. The accuracy and reliability of the available secondary data depend on the
organization which collected them and the purpose for which they were collected. What is the
authority and prestige of the organization? Is it well recognized? Is it noted for reliability? Is
it capable of collecting reliable data? Does it use trained and well qualified investigators? The
answers to these questions determine the degree of confidence we can have in the data and their
accuracy. It is important to go to the original source of the secondary data rather than to use an
intermediate source which has quoted from the original. Only then can the researcher review the
cautionary and other comments that were made in the original source.
3. Data Completeness
The completeness refers to the actual coverage of the published data. This depends on the
methodology and sampling design adopted by the original organization. Is the methodology
sound? Is the sample size small or large? Is the sampling method appropriate? Answers to these
questions may indicate the appropriateness and adequacy of the data for the problem under
study. The question of possible bias should also be examined. Did the purpose for which the
original organization collected the data have a particular orientation? Has the study been made
to promote the organization's own interest? How was the study conducted? These are important
clues. The researcher must be on guard when the source does not report the methodology and
sampling design. Then it is not possible to determine the adequacy of the secondary data for the
researcher's study.

Summary
Data are facts and other relevant materials, past and present, serving as bases for study and
analyses. The data needed for a social science research may be broadly classified into (a) Data
pertaining to human beings, (b) Data relating to organization and (c) Data pertaining to territorial
areas. Personal data or data related to human beings consist of:
Demographic and socio-economic characteristics of individuals: Age, sex, race, social class,
religion, marital status, education, occupation, income, family size, location of the household,
life style etc.
Behavioural variables: Attitudes, opinions, awareness, knowledge, practice, intentions, etc.
Organizational data consist of data relating to an organization's origin, ownership, objectives,
resources, functions, performance and growth. Territorial data are related to geophysical
characteristics, resource endowment, population, occupational pattern, infrastructure, degree of
development, etc. of spatial divisions like villages, cities, taluks, districts, state and the nation.
Data form the basis for testing the hypothesis formulated in a study. Data also provide the facts
and figures required for constructing measurement scales and tables. The sources of data may be
classified into (a) primary sources and (b) secondary sources. Primary data are first hand
information collected through various methods such as observation, interviewing, mailing etc.
The secondary sources consist of readily available compendia and already compiled statistical
statements and reports. Secondary sources are not limited in time and space. That is, the
researcher using them need not have been present when and where they were gathered. Secondary
data, if available, can be secured quickly and cheaply.
Wider geographical area and longer reference period may be covered without much cost. Thus,
the use of secondary data extends the researcher's space and time reach. The use of secondary
data broadens the data base from which scientific generalizations can be made. The use of
secondary data has its own limitations. The most important limitation is that the available data
may not meet our specific needs. Secondary data are often not up to date and may already be
obsolete by the time they appear in print, because of the time lag in producing them. Primary data
are directly collected by the researcher from their original sources. There are various methods of
data collection. A method is different from a tool: while a method refers to the way or mode of
gathering data, a tool is an instrument used for the method. For example, a schedule is used for
interviewing. The important methods are (a) observation, (b) interviewing, (c) mail survey,
(d) experimentation, (e) simulation and (f) projective technique.
Copyright 2009 SMU
Powered by Sikkim Manipal University
MB0034- Unit 8-Observation
Unit 8-Observation
Meaning of Observation
Observation means viewing or seeing. Observation may be defined as a systematic viewing of a
specific phenomenon in its proper setting for the specific purpose of gathering data for a
particular study. Observation is a classical method of scientific study.
Objectives:
After studying this lesson you should be able to understand:
- General characteristics of observation method
- Process of observation
- Types of observation
- Participant Observation
- Non-participant observation
- Direct observation
- Indirect observation
- Controlled observation
- Uncontrolled observation
- Prerequisites of observation
- Advantages of observation
- Limitations of observation
- Use of observation in business research

General Characteristics of Observation Method
Observation as a method of data collection has certain characteristics.
1. It is both a physical and a mental activity: The observing eye catches many things that
are present. But attention is focused on data that are pertinent to the given study.
2. Observation is selective: A researcher does not observe anything and everything, but
selects the range of things to be observed on the basis of the nature, scope and objectives
of his study. For example, suppose a researcher desires to study the causes of city road
accidents and has formulated a tentative hypothesis that accidents are caused by
violation of traffic rules and overspeeding. When he observes the movements of vehicles
on the road, many things are before his eyes: the type, make, size and colour of the
vehicles, the persons sitting in them, their hair style, etc. All such things which are not
relevant to his study are ignored, and only overspeeding and traffic violations are keenly
observed by him.
3. Observation is purposive and not casual: It is made for the specific purpose of noting
things relevant to the study. It captures the natural social context in which a person's
behaviour occurs. It grasps the significant events and occurrences that affect social
relations of the participants.
4. Observation should be exact and be based on standardized tools of research, such
as an observation schedule, sociometric scale etc., and on precision instruments, if any.

Process of Observations
The use of observation method requires proper planning.
- First, the researcher should carefully examine the relevance of observation method to the
data needs of the selected study.
- Second, he must identify the specific investigative questions which call for use of
observation method. These determine the data to be collected.
- Third, he must decide the observation content, viz., specific conditions, events and
activities that have to be observed for the required data. The observation content should
include the relevant variables.
- Fourth, for each variable chosen, the operational definition should be specified.
- Fifth, the observation setting, the subjects to be observed, the timing and mode of
observation, the recording procedure, the recording instruments to be used, and other details
of the task should be determined.
- Last, observers should be selected and trained. The persons to be selected must have
sufficient concentration powers, strong memory power and unobtrusive nature. Selected
persons should be imparted both theoretical and practical training.

Types of Observations
Observations may be classified in different ways. With reference to the investigator's role, it
may be classified into (a) participant observation and (b) non-participant observation. In terms
of mode of observation, it may be classified into (c) direct observation and (d) indirect
observation. With reference to the rigor of the system adopted, observation is classified into
(e) controlled observation and (f) uncontrolled observation.
Participant Observation
In this observation, the observer is a part of the phenomenon or group which is observed and he
acts as both an observer and a participant. For example, an anthropologist may study tribal
customs by taking part in tribal activities like folk dance. The persons who are observed
should not be aware of the researcher's purpose. Only then will their behaviour be natural. The
concealment of the research objective and the researcher's identity is justified on the ground
that it makes it possible to study certain aspects of the group's culture which are not revealed
to outsiders.
Advantages: The advantages of participant observation are:
- The observer can understand the emotional reactions of the observed group, and get a
deeper insight of their experiences.
- The observer will be able to record context which gives meaning to the observed
behaviour and heard statements.
Disadvantages: Participant observation suffers from some demerits.
1. The participant observer narrows his range of observation. For example, if there is a
hierarchy of power in the group/community under study, he comes to occupy one
position within it, and thus other avenues of information are closed to him.
2. To the extent that the participant observer participates emotionally, the objectivity is lost.
3. Another limitation of this method is the dual demand made on the observer. Recording
can interfere with participation, and participation can interfere with observation.
Recording on the spot is often not possible and has to be postponed until the observer is
alone. Such time lag results in some inaccuracy in recording.
Non-participant observation
In this method, the observer stands apart and does not participate in the phenomenon observed.
Naturally, there is no emotional involvement on the part of the observer. This method calls for
skill in recording observations in an unnoticed manner.
Direct observation
This means observation of an event personally by the observer when it takes place. This method
is flexible and allows the observer to see and record subtle aspects of events and behaviour as
they occur. He is also free to shift places and change the focus of the observation. A limitation
of this method is that the observer's perception circuit may not be able to cover all relevant
events when the latter move quickly, resulting in the incompleteness of the observation.
Indirect observation
This does not involve the physical presence of the observer, and the recording is done by
mechanical, photographic or electronic devices, e.g. recording customer and employee
movements by a special motion picture camera mounted in a department of a large store. This
method is less flexible than direct observation, but it is less biasing and less erratic in
recording accuracy. It also provides a permanent record for an analysis of different aspects of
the event.
Controlled observation
This involves standardization of observational techniques and the exercise of maximum control
over extrinsic and intrinsic variables by adopting experimental design and systematically
recording observations. Controlled observation is carried out either in the laboratory or in the
field. It is typified by clear and explicit decisions on what, how and when to observe.
Uncontrolled observation
This does not involve control over extrinsic and intrinsic variables. It is primarily used for
descriptive research. Participant observation is a typical uncontrolled one.

Prerequisites of Effective Observation
The prerequisites of observation consist of:
- Observations must be done under conditions which will permit accurate results. The
observer must be at a vantage point to see clearly the objects to be observed. The distance
and the light must be satisfactory. The mechanical devices used must be in good working
condition and operated by skilled persons.
- Observation must cover a sufficient number of representative samples of the cases.
- Recording should be accurate and complete.
- The accuracy and completeness of recorded results must be checked. A certain number of
cases can be observed again by another observer/another set of mechanical devices, as the
case may be. If it is feasible, two separate observers and sets of instruments may be used
in all or some of the original observations. The results could then be compared to
determine their accuracy and completeness.
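
The comparison of two observers' records described in the last point can be sketched as follows. The text does not name a statistic; simple percent agreement and Cohen's kappa are used here as common choices, and the function names, category labels and the ten coded cases are all hypothetical.

```python
from collections import Counter

def percent_agreement(first, second):
    """Share of cases on which two observers recorded the same category."""
    if len(first) != len(second) or not first:
        raise ValueError("both observers must code the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)

def cohens_kappa(first, second):
    """Agreement between two observers, corrected for agreement expected by chance."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_first = Counter(first)
    counts_second = Counter(second)
    # Chance agreement: probability that both pick the same category independently.
    expected = sum(
        (counts_first[c] / n) * (counts_second[c] / n)
        for c in set(first) | set(second)
    )
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes recorded by two observers for the same ten vehicles.
observer_a = ["speeding", "ok", "ok", "violation", "ok",
              "speeding", "ok", "ok", "violation", "ok"]
observer_b = ["speeding", "ok", "ok", "ok", "ok",
              "speeding", "ok", "violation", "violation", "ok"]

print(f"percent agreement: {percent_agreement(observer_a, observer_b):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(observer_a, observer_b):.2f}")
```

A low value on either measure would suggest that the observation schedule or the observers' training needs attention before the recorded results are relied upon.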

Advantages of observation
Observation has certain advantages:
1. The main virtue of observation is its directness: it makes it possible to study behaviour as
it occurs. The researcher need not ask people about their behaviour and interactions; he
can simply watch what they do and say.
2. Data collected by observation may describe the observed phenomena as they occur in
their natural settings. Other methods introduce elements of artificiality into the researched
situation; for instance, in an interview, the respondent may not behave in a natural way.
There is no such artificiality in observational studies, especially when the observed
persons are not aware of their being observed.
3. Observation is more suitable for studying subjects who are unable to articulate
meaningfully, e.g. studies of children, tribal people, animals, birds etc.
4. Observations improve the opportunities for analyzing the contextual background of
behaviour. Furthermore, verbal reports can be validated and compared with behaviour
through observation. The validity of what men of position and authority say can be
verified by observing what they actually do.
5. Observations make it possible to capture the whole event as it occurs. For example, only
observation can provide an insight into all the aspects of the process of negotiation
between union and management representatives.
6. Observation is less demanding of the subjects and has less biasing effect on their conduct
than questioning.
7. It is easier to conduct disguised observation studies than disguised questioning.
8. Mechanical devices may be used for recording data in order to secure more accurate data
and also to make continuous observations over longer periods.

Limitations of Observation
Observation cannot be used indiscriminately for all purposes. It has its own limitations:
1. Observation is of no use for studying past events or activities. One has to depend upon
documents or narrations by people for studying such things.
2. Observation is not suitable for studying opinions and attitudes. However, an observation of
related behaviour affords a good clue to the attitudes. E.g., an observation of the seating
pattern of high caste and class persons in a general meeting in a village may be useful for
forming an index of attitude.
3. Observation poses difficulties in obtaining a representative sample. For interviewing and
mailing methods, the selection of a random sample can be readily ensured. But
observing people of all types does not make the sample a random one.
4. Observation cannot be used as and when the researcher finds it convenient to use it. He
has to wait for the event to occur. For example, an observation of the folk dance of a tribal
community is possible only when it is performed.
5. A major limitation of this method is that the observer normally must be at the scene of
the event when it takes place. Yet it may not be possible to predict where and when the
event will occur, e.g., road accident, communal clash.
6. Observation is a slow and expensive process, requiring human observers and/or costly
surveillance equipment.

Use of Observation in Business Research
Observation is suitable for a variety of research purposes. It may be used for studying (a) the
behaviour of human beings in purchasing goods and services: life style, customs and manners,
interpersonal relations, group dynamics, crowd behaviour, leadership styles, managerial style,
other behaviours and actions; (b) the behaviour of other living creatures like birds, animals etc.;
(c) physical characteristics of inanimate things like stores, factories, residences etc.; (d) flow of
traffic and parking problems; and (e) movement of materials and products through a plant.

Summary
Observation means viewing or seeing. Observation may be defined as a systematic viewing of a
specific phenomenon in its proper setting for the specific purpose of gathering data for a
particular study. Observation is a classical method of scientific study. Observation as a method
of data collection has certain characteristics. Observations may be classified in different ways.
With reference to the investigator's role, it may be classified into (a) participant observation
and (b) non-participant observation. In terms of mode of observation, it may be classified into
(c) direct observation and (d) indirect observation. With reference to the rigor of the system
adopted, observation is classified into (e) controlled observation and (f) uncontrolled
observation. Indirect observation does not involve the physical presence of the observer, and the
recording is done by mechanical, photographic or electronic devices, e.g. recording customer and
employee movements by a special motion picture camera mounted in a department of a large store.
Controlled observation involves standardization of observational techniques and the exercise of
maximum control over extrinsic and intrinsic variables by adopting experimental design and
systematically recording observations. Uncontrolled observation does not involve control
over extrinsic and intrinsic variables. It is primarily used for descriptive research. Participant
observation is a typical uncontrolled one.
Observation has certain advantages, but it cannot be used indiscriminately for all
purposes; it has its own limitations. Observation is suitable for a variety of research purposes,
for example studying the behaviour of human beings in purchasing goods and services: life style,
customs and manners, interpersonal relations, group dynamics, crowd behaviour, leadership styles,
managerial style, other behaviours and actions.








MB0034- Unit 9-Schedule and Questionnaire
Unit 9 Schedule and Questionnaire
Meaning of Schedule and Questionnaire
The mail survey is another method of collecting primary data. This method involves sending
questionnaires to the respondents with a request to complete them and return them by post. This
can be used in the case of educated respondents only. The mail questionnaire should be simple
so that the respondents can easily understand the questions and answer them. It should preferably
contain mostly closed-end and multiple choice questions so that it can be completed within a
few minutes.
The distinctive feature of the mail survey is that the questionnaire is self-administered by the
respondents themselves and the responses are recorded by them, and not by the investigator as in
the case of personal interview method. It does not involve face-to-face conversation between the
investigator and the respondent. Communication is carried out only in writing and this requires
more cooperation from the respondents than verbal communication does.

Objectives
After studying this lesson you should be able to understand:
- Types of questionnaire
- Structured or standard questionnaire
- Unstructured questionnaire
- Processes of data collection
- Alternate method of sending questionnaires
- Importance of questionnaire
- Advantages of questionnaire
- Disadvantages of Questionnaire
- Distinction between schedule and questionnaire

Types of Questionnaires
Questionnaires may be classified as:
Structured/ standardized questionnaire
Structured questionnaires are those in which there are definite, concrete and preordained
questions with additional questions limited to those necessary to clarify inadequate answers or to
elicit more detailed responses. The questions are presented with exactly the same wording and in
the same order to all the respondents.
Unstructured questionnaire
In unstructured questionnaires the respondent is given the opportunity to answer in his own terms
and in his own frame of reference.

Process of Data Collection
The researcher should prepare a mailing list of the selected respondents by collecting the
addresses from the telephone directory of the association or organization to which they belong.
A covering letter should accompany a copy of the questionnaire. Exhibit 7.1 is a copy of a
covering letter used by the author in a research study on corporate planning. The covering letter
must explain to the respondent the purpose of the study and the importance of his cooperation to
the success of the project. Anonymity may be assured.
Alternative Modes of Sending Questionnaires
There are some alternative methods of distributing questionnaires to the respondents. They are:
(1) personal delivery, (2) attaching the questionnaire to a product, (3) advertising the
questionnaire in a newspaper or magazine, and (4) news-stand inserts.
Personal Delivery
The researcher or his assistant may deliver the questionnaires to the potential respondents with a
request to complete them at their convenience. After a day or two he can collect the completed
questionnaires from them. Often referred to as the self-administered questionnaire method, it
combines the advantages of the personal interview and the mail survey. Alternatively, the
questionnaires may be delivered in person and the completed questionnaires may be returned by
mail by the respondents.
Attaching Questionnaire to a Product
A firm test marketing a product may attach a questionnaire to a product and request the buyer to
complete it and mail it back to the firm. The respondent is usually rewarded by a gift or a
discount coupon.
Advertising the Questionnaires
The questionnaire with the instructions for completion may be advertised on a page of a magazine
or in a section of a newspaper. The potential respondent completes it, tears it out and mails it
to the advertiser. For example, the committee on banks' customer services used this method
for collecting information from the customers of commercial banks in
India. This method may be useful for large-scale studies on topics of common interest.
News-Stand Inserts
This method involves inserting the covering letter, questionnaire and self-addressed reply-paid
envelope into a random sample of news-stand copies of a newspaper or magazine.
Improving the Response Rate in a Mail survey
The response rate in mail surveys is generally very low, more so in developing countries like
India. Certain techniques have to be adopted to increase the response rate. They are:
1. Quality Printing: The questionnaire may be neatly printed on quality light-coloured
paper, so as to attract the attention of the respondent.
2. Covering Letter: The covering letter should be couched in a pleasant style so as to
attract and hold the interest of the respondent. It must anticipate objections and answer
them briefly. It is desirable to address the respondent by name.
3. Advance Information: Advance information can be provided to potential respondents by
a telephone call or advance notice in the newsletter of the concerned organization or by a
letter. Such preliminary contact with potential respondents is more successful than follow-up
efforts.
4. Incentives: Money, stamps for collection and other incentives are also used to induce
respondents to complete and return the mail questionnaire.
5. Follow-up contacts: In the case of respondents belonging to an organization, they may
be approached through someone in that organization known to the researcher.
6. Larger sample size: A larger sample may be drawn than the estimated sample size. For
example, if the required sample size is 1000, a sample of 1500 may be drawn. This may
help the researcher to secure an effective sample size closer to the required size.
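
The arithmetic behind the larger-sample technique can be sketched in Python. Drawing 1500 to secure 1000 amounts to assuming that about two respondents in three will reply; that reading, the function names and the 40 per cent figure are illustrative assumptions, not statements from the text. Fractions are used so that the division is exact.

```python
import math
from fractions import Fraction

def mailout_size(required_responses, expected_response_rate):
    """Number of questionnaires to mail so that the expected returns reach the required size."""
    rate = Fraction(expected_response_rate)
    if not 0 < rate <= 1:
        raise ValueError("expected_response_rate must be greater than 0 and at most 1")
    return math.ceil(required_responses / rate)

def implied_response_rate(required_responses, mailed):
    """Response rate that a given mail-out size implicitly assumes."""
    return Fraction(required_responses, mailed)

# The example in the text: 1500 drawn for a required sample of 1000.
print(implied_response_rate(1000, 1500))      # the rate this choice assumes
print(mailout_size(1000, Fraction(2, 3)))     # mail-out needed at that rate

# A hypothetical lower response rate of 40 per cent needs a larger mail-out.
print(mailout_size(1000, Fraction(2, 5)))
```

A larger mail-out raises the number of returns but does not by itself make the respondents representative of those who did not reply.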

Importance of Questionnaire
The significance of questionnaire method is that it affords great facilities in collecting data from
large, diverse, and widely scattered groups of people. It is used in gathering objective,
quantitative data as well as for securing information of a qualitative nature. In some studies,
questionnaire is the sole research tool utilised but it is more often used in conjunction with other
methods of investigation. In the questionnaire technique, great reliance is placed on the
respondent's verbal report for data on the stimuli or experiences to which he is exposed as well
as for data on his behaviour.
Advantages of Questionnaires
The advantages of mail surveys are:
- They are less costly than personal interviews, as the cost of mailing is the same throughout
the country, irrespective of distance.
- They can cover extensive geographical areas.
- Mailing is useful in contacting persons such as senior business executives who are
difficult to reach in any other way.
- The respondents can complete the questionnaires at their convenience.
- Mail surveys, being more impersonal, provide more anonymity than personal interviews.
- Mail surveys are totally free from the interviewer's bias, as there is no personal contact
between the respondents and the investigator.
- Certain personal and economic data may be given accurately in an unsigned mail
questionnaire.
Disadvantages of Questionnaires
The disadvantages of mail surveys are:
1. The scope for mail surveys is very limited in a country like India where the percentage of
literacy is very low.
2. The response rate of mail surveys is low. Hence, the resulting sample may not be a
representative one.

Distinction between schedules and questionnaires
Questionnaires are mailed to the respondent whereas schedules are carried by the investigator
himself. A questionnaire can be filled in by the respondent only if he is literate and able to
understand the language in which it is written. This problem can be overcome
in the case of a schedule, since the investigator himself carries the schedule and records the
respondent's responses. A questionnaire is filled in by the respondent himself whereas the
schedule is filled in by the investigator.

Summary
The mail survey is another method of collecting primary data. This method involves sending
questionnaires to the respondents with a request to complete them and return them by post. The
distinctive feature of the mail survey is that the questionnaire is self-administered by the
respondents themselves and the responses are recorded by them, and not by the investigator as in
the case of personal interview method. There are some alternative methods of distributing
questionnaires to the respondents. They are: (1) personal delivery, (2) attaching the questionnaire
to a product, (3) advertising the questionnaire in a newspaper or a magazine, and (4) newsstand
inserts. The response rate in mail surveys is generally very low, more so in developing countries
like India. Certain techniques have to be adopted to increase the response rate. Mail surveys are
less costly than personal interviews, as the cost of mailing is the same throughout the country,
irrespective of distances. They can cover extensive geographical areas. Mailing is useful in
contacting persons such as senior business executives who are difficult to reach in any other way.
The respondents can complete the questionnaires at their convenience.
Mail surveys, being more impersonal, provide more anonymity than personal interviews. Mail
surveys are totally free from the interviewer's bias, as there is no personal contact between the
respondents and the investigator. Certain personal and economic data may be given accurately in
an unsigned mail questionnaire. The scope for mail surveys is very limited in a country like India
where the percentage of literacy is very low. The response rate of mail surveys is low. Hence, the
resulting sample may not be a representative one. The significance of questionnaire method is
that it affords great facilities in collecting data from large, diverse, and widely scattered groups
of people. Questionnaires are mailed to the respondent whereas schedules are carried by the
investigator himself. A questionnaire is filled by the respondent himself whereas the schedule is
filled by the investigator.
MB0034-Unit 10-Interviewing
Unit 10 Interviewing
Meaning of Interview
Interviewing is one of the prominent methods of data collection. It may be defined as a two-way
systematic conversation between an investigator and an informant, initiated for obtaining
information relevant to a specific study. It involves not only conversation, but also learning from
the respondent's gestures, facial expressions and pauses, and his environment. Interviewing
requires face-to-face contact or contact over telephone and calls for interviewing skills. It is done
by using a structured schedule or an unstructured guide.
Interviewing may be used either as a main method or as a supplementary one in studies of
persons. Interviewing is the only suitable method for gathering information from illiterate or less
educated respondents. It is useful for collecting a wide range of data, from factual demographic
data to highly personal and intimate information relating to a person's opinions, attitudes, values,
beliefs, past experience and future intentions. When qualitative information is required, or probing
is necessary to draw out full responses, interviewing is required. Where the area covered by the
survey is compact, or when a sufficient number of qualified interviewers are available, personal
interview is feasible.

Interview is often superior to other data-gathering methods. People are usually more willing to
talk than to write. Once rapport is established, even confidential information may be obtained. It
permits probing into the context and reasons for answers to questions.
Interview can add flesh to statistical information. It enables the investigator to grasp the
behavioural context of the data furnished by the respondents.
Objectives
After studying this lesson you should be able to understand:
- Types of interviews
- Structured Directive interview
- Unstructured non-directive interview
- Focused interview
- Clinical interview
- Depth interview
- Approaches to the interview
- Qualities of interview
- Merits of interview method
- Demerits of interview method
- Interview techniques in business research
- Interview Problems
- Methods and Aims of controlling non-response
- Telephone Interviewing
- Group Interviews

Types of Interviews
The interview may be classified into: (a) structured or directive interview, (b) unstructured or
non-directive interview, (c) focused interview, (d) clinical interview and (e) depth interview.

Structured Directive Interview
This is an interview made with a detailed standardized schedule. The same questions are put to
all the respondents and in the same order. Each question is asked in the same way in each
interview, promoting measurement reliability. This type of interview is used for large-scale
formalized surveys.
Advantages: This interview has certain advantages. First, data from one interview to the next
one are easily comparable. Second, recording and coding data do not pose any problem, and
greater precision is achieved. Lastly, attention is not diverted to extraneous, irrelevant and
time-consuming conversation.
Limitation: However, this type of interview suffers from some limitations. First, it tends to lose
the spontaneity of natural conversation. Second, the way in which the interview is structured may
be such that the respondent's views are minimized and the investigator's own biases regarding
the problem under study are inadvertently introduced. Lastly, the scope for exploration is limited.
Unstructured or Non-Directive Interview
This is the least structured one. The interviewer encourages the respondent to talk freely about a
given topic with a minimum of prompting or guidance. In this type of interview, a detailed
pre-planned schedule is not used. Only a broad interview guide is used. The interviewer avoids
channelling the direction of the interview. Instead he develops a very permissive atmosphere.
Questions are neither standardized nor ordered in a particular way.
This interviewing is more useful in case studies rather than in surveys. It is particularly useful in
exploratory research where the lines of investigations are not clearly defined. It is also useful for
gathering information on sensitive topics such as divorce, social discrimination, class conflict,
generation gap, drug-addiction etc. It provides opportunity to explore the various aspects of the
problem in an unrestricted manner.
Advantages: This type of interview has certain special advantages. It can closely approximate
the spontaneity of a natural conversation. It is less prone to interviewer's bias. It provides greater
opportunity to explore the problem in an unrestricted manner.
Limitations: Though the unstructured interview is a potent research instrument, it is not free
from limitations. One of its major limitations is that the data obtained from one interview is not
comparable to the data from the next. Hence, it is not suitable for surveys. Time may be wasted
in unproductive conversations. By not focusing on one or another facet of a problem, the
investigator may run the risk of being led up a blind alley. As there is no particular order or
sequence in this interview, the classification of responses and coding may require more time.
This type of informal interviewing calls for greater skill than the formal survey interview.
Focused Interview
This is a semi-structured interview where the investigator attempts to focus the discussion on the
actual effects of a given experience to which the respondents have been exposed. It takes place
with respondents known to have been involved in a particular experience, e.g., seeing a particular
film, viewing a particular program on TV, being involved in a train/bus accident, etc. The situation
is analysed prior to the interview. An interview guide specifying topics relating to the research
hypothesis is used. The interview is focused on the subjective experiences of the respondent, i.e.,
his attitudes and emotional responses regarding the situation under study. The focused interview
permits the interviewer to obtain details of personal reactions, specific emotions and the like.

Merits: This type of interview is free from the inflexibility of formal methods, yet gives the
interview a set form and ensures adequate coverage of all the relevant topics. The respondent is
asked for certain information, yet he has plenty of opportunity to present his views. The
interviewer is also free to choose the sequence of questions and determine the extent of probing.
Clinical Interview
This is similar to the focused interview but with a subtle difference. While the focused interview
is concerned with the effects of specific experience, clinical interview is concerned with broad
underlying feelings or motivations or with the course of the individual's life experiences.
The personal history interview used in social case work, prison administration, psychiatric
clinics and in individual life history research is the most common type of clinical interview. The
specific aspects of the individual's life history to be covered by the interview are determined
with reference to the purpose of the study and the respondent is encouraged to talk freely about
them.
Depth Interview
This is an intensive and searching interview aiming at studying the respondent's opinions,
emotions or convictions on the basis of an interview guide. This requires much more training in
inter-personal skills than the structured interview. It deliberately aims to elicit unconscious as
well as extremely personal feelings and emotions.
This is generally a lengthy procedure designed to encourage free expression of affectively
charged information. It requires probing. The interviewer should totally avoid advising or
showing disagreement. Of course, he should use encouraging expressions like "uh-huh" or "I
see" to motivate the respondent to continue narration. Sometimes the interviewer has to face the
problem of affections, i.e. the respondent may avoid expressing affective feelings. The interviewer
should handle such situations with great care.

Approaches to Interview
Interviewing as a method of data collection has certain features. They are:
The Participants: The interviewer and the respondent are strangers. Hence, the investigator
has to get himself introduced to the respondent in an appropriate manner.
The Relationship between the Participants is a Transitory one: It has a fixed beginning and
termination points. The interview proper is a fleeting, momentary experience for them.
Interview is not a mere casual conversational exchange: Interview is a conversation with a
specific purpose, viz., obtaining information relevant to a study.
Interview is a mode of obtaining verbal answers to questions put verbally: The interaction
between the interviewer and the respondent need not necessarily be on a face-to-face basis,
because interview can be conducted over the telephone also. Although interview is usually a
conversation between two persons, it need not be limited to a single respondent. It can also be
conducted with a group of persons, such as family members, or a group of children or a group of
customers, depending on the requirements of the study.
Interview is an interactional process: The interaction between the interviewer and the
respondent depends upon how they perceive each other.
The respondent reacts to the interviewer's appearance, behaviour, gestures, facial expression and
intonation, his perception of the thrust of the questions and his own personal needs. As far as
possible, the interviewer should try to be closer to the socio-economic level of the respondents.
Moreover, he should realize that his respondents are under no obligation to respond.
One should, therefore, be tactful and be alert to such reactions of the respondents as lame excuses,
suspicion, reluctance or indifference, and deal with them suitably. One should also not argue or
dispute. One should rather maintain an impartial and objective attitude. Information furnished by
the respondent in the interview is recorded by the investigator. This poses the problem of ensuring
that recording does not interfere with the tempo of conversation.
Interviewing is not a standardized process like that of a chemical technician; it is rather a
flexible psychological process. The implication of this feature is that the interviewer cannot
apply an unvarying standardized technique, because he is dealing with respondents with varying
motives and diverse perceptions. The extent of his success as an interviewer is very largely
dependent upon his insight and skill in dealing with varying socio-psychological situations.

Qualities of Interviews
The requirements or conditions necessary for a successful interview are:
Data availability: The needed information should be available with the respondent. He should
be able to conceptualize it in terms of the study, and be capable of communicating it.
Role perception: The respondent should understand his role and know what is required of him.
He should know what is relevant and how complete it should be. He can learn much of this
from the interviewer's introduction, explanations and questioning procedure.
The interviewer should also know his role: He should establish a permissive atmosphere and
encourage frank and free conversation. He should not affect the interview situation through
subjective attitude and argumentation.
Respondent's motivation: The respondent should be willing to respond and give accurate
answers. This depends partly on the interviewer's approach and skill. The interviewer has an
interest in the interview for the purpose of his research, but the respondent has no personal
interest in it. Therefore, the interviewer should establish a friendly relationship with the
respondent, and create in him an interest in the subject-matter of the study. The interviewer
should try to reduce the effect of demotivating factors like desire to get on with other activities,
embarrassment at ignorance, dislike of the interview content, suspicion about the interviewer,
and fear of consequences. He should also try to build up the effect of motivating factors like
curiosity, loneliness, politeness, sense of duty, respect for the research agency and liking for the
interviewer.
The above requirements are a reminder that the interview is an interaction process. The
investigator should keep this in mind and take care to see that his appearance and behaviour do
not distort the interview situation.

Merits of Interview Method
There are several real advantages to personal interviewing.
- First, the greatest value of this method is the depth and detail of information that can be
secured. When used with well-conceived schedules, an interview can obtain a great deal
of information. It far exceeds the mail survey in amount and quality of data that can be
secured.
- Second, the interviewer can do more to improve the percentage of responses and the
quality of information received than with any other method. He can note the conditions of the
interview situation, and adopt appropriate approaches to overcome such problems as the
respondent's unwillingness, incorrect understanding of questions, suspicion, etc.
- Third, the interviewer can gather other supplemental information like economic level,
living conditions etc. through observation of the respondent's environment.
- Fourth, the interviewer can use special scoring devices, visual materials and the like in
order to improve the quality of interviewing.
- Fifth, the accuracy and dependability of the answers given by the respondent can be
checked by observation and probing.
- Last, interview is flexible and adaptable to individual situations. Even more, control can
be exercised over the interview situation.

Demerits of Interview Method
Interviewing is not free from limitations.
- Its greatest drawback is that it is costly both in money and time.
- Second, the interview results are often adversely affected by the interviewer's mode of
asking questions and interactions, and incorrect recording, and also by the respondent's
faulty perception, faulty memory, inability to articulate etc.
- Third, certain types of personal and financial information may be refused in face-to-face
interviews. Such information might be supplied more willingly on mail questionnaires,
especially if they are to be unsigned.
- Fourth, interview poses the problem of recording information obtained from the
respondents. No foolproof system is available. Note taking is invariably distracting to
both the respondent and the interviewer and affects the thread of the conversation.
- Last, interview calls for highly skilled interviewers. The availability of such persons is limited
and the training of interviewers is often a long and costly process.

Interviewing techniques in Business Research
The interview process consists of the following stages:
- Preparation
- Introduction
- Developing rapport
- Carrying the interview forward
- Recording the interview
- Closing the interview
Preparation
Interviewing requires some preplanning and preparation. The interviewer should keep the
copies of the interview schedule/guide (as the case may be) ready to use. He should have the list
of names and addresses of respondents, and he should regroup them into contiguous groups in
terms of location in order to save time and cost in travelling. The interviewer should find out the
general daily routine of the respondents in order to determine the suitable timings for interview.
Above all, he should mentally prepare himself for the interview. He should think about how he
should approach a respondent, what mode of introduction he could adopt, what situations he may
have to face and how he could deal with them. The interviewer may come across such situations
as the respondent's avoidance, reluctance, suspicion, diffidence, inadequate responses, distortion,
etc. The investigator should plan the strategies for dealing with them. If such preplanning is not
done, he will be caught unawares and fail to deal appropriately when he actually faces any such
situation. It is possible to plan in advance and still keep the plan and the mind flexible and
expectant of new developments.

Introduction
The investigator is a stranger to the respondents. Therefore, he should be properly introduced to
each of the respondents. What is the proper mode of introduction? There is no one appropriate
universal mode of introduction. Mode varies according to the type of respondents. When making
a study of an organization or institution, the head of the organization should be approached first
and his cooperation secured before contacting the sample inmates/employees. When studying a
community or a cultural group, it is essential to approach the leader first and to enlist
cooperation. For a survey of urban households, the research organization's letter of introduction
and the interviewer's identity card can be shown. In these days of fear of opening the door to a
stranger, residents' cooperation can be easily secured if the interviewer attempts to get himself
introduced through a person known to them, say a popular person in the area, e.g., a social
worker. For interviewing rural respondents, the interviewer should never attempt to approach
them along with someone from the revenue department, for they would immediately hide
themselves, presuming that they are being contacted for collection of land revenue or
subscription to some government bond. He should not also approach them through a local
political leader, because persons who do not belong to his party will not cooperate with the
interviewer. It is rather desirable to approach the rural respondents through the local teacher or
social worker.
After getting himself introduced to the respondent in the most appropriate manner, the
interviewer can follow a sequence of procedures as under, in order to motivate the respondent to
permit the interview:
1. With a smile, greet the respondent in accordance with his cultural pattern.
2. Identify the respondent by name.
3. Describe the method by which the respondent was selected.
4. Mention the name of the organization conducting the research.
5. Assure the anonymity or confidential nature of the interview.
6. Explain the usefulness of the study.
7. Emphasize the value of the respondent's cooperation, making such statements as "You are
among the few in a position to supply the information", "Your response is invaluable" and "I
have come to learn from your experience and knowledge".
Developing Rapport
Before starting the research interview, the interviewer should establish a friendly relationship
with the respondent. This is described as rapport. It means establishing a relationship of
confidence and understanding between the interviewer and the respondent. It is a skill which
depends primarily on the interviewer's common sense, experience, sensitivity, and keen
observation.
Start the conversation with a general topic of interest such as weather, current news, a sports
event, or the like, perceiving the probable interests of the respondent from his context. Such
initial conversation may create a friendly atmosphere and a warm interpersonal relationship and
mutual understanding. However, the interviewer should guard against "over-rapport" as cautioned
by Herbert Hyman. Too much identification and too much courtesy result in tailoring replies to the
image of a "nice interviewer". The interviewer should use his discretion in striking a happy
medium.
Carrying the Interview Forward
After establishing rapport, the technical task of asking questions from the interview schedule
starts. This task requires care, self-restraint, alertness and ability to listen with understanding,
respect and curiosity. In carrying on this task of gathering information from the respondent by
putting questions to him, the following guidelines may be followed:
1. Start the interview. Carry it on in an informal and natural conversational style.
2. Ask all the applicable questions listed in the schedule, in the same order as they appear
there, without any elucidation or change in the wording. Do not take answers for
granted.
3. If interview guide is used, the interviewer may tailor his questions to each respondent,
covering of course, the areas to be investigated.
4. Know the objectives of each question so as to make sure that the answers adequately
satisfy the question objectives.
5. If a question is not understood, repeat it slowly with proper emphasis and appropriate
explanation, when necessary.
6. Take all answers naturally, never showing disapproval or surprise. When the respondent
does not meet interruptions, denial, contradiction and other harassment, he may feel
free and may not try to withhold information. He will be motivated to communicate when
the atmosphere is permissive and the listener's attitude is non-judgmental and
genuinely absorbed in the revelations.
7. Listen quietly with patience and humility. Give not only undivided attention, but also
personal warmth. At the same time, be alert and analytic to incomplete, non-specific and
inconsistent answers, but avoid interrupting the flow of information. If necessary, jot
down unobtrusively the points which need elaboration or verification for later and
timelier probing. The appropriate technique for this probing is to ask for further
clarification in such a polite manner as "I am not sure I understood fully. Is this ... what
you meant?"
8. Neither argue nor dispute.
9. Show genuine concern and interest in the ideas expressed by the respondent; at the same
time, maintain an impartial and objective attitude.
10. Do not reveal your own opinion or reaction. Even when you are asked for your views,
laugh off the request, saying "Well, your opinions are more important than mine."
11. At times the interview runs dry and needs re-stimulation. Then use such expressions as
"Uh-huh" or "That's interesting" or "I see, can you tell me more about that?" and the like.
12. When the interviewee fails to supply his reactions to related past experiences, re-present
the stimulus situation, introducing appropriate questions which will aid in revealing the
past, such as "Under what circumstances did such and such a phenomenon occur?" or "How did
you feel about it?" and the like.
13. At times, the conversation may go off the track. Be alert to discover drifting, and steer the
conversation back to the track by some such remark as, "You know, I was very much
interested in what you said a moment ago. Could you tell me more about it?"
14. When the conversation turns to some intimate subjects, and particularly when it deals
with crises in the life of the individual, emotional blockage may occur. Then drop the
subject for the time being and pursue another line of conversation for a while so that a
less direct approach to the subject can be made later.
15. When there is a pause in the flow of information, do not hurry the interview. Take it as a
matter of course with an interested look or a sympathetic half-smile. If the silence is too
prolonged, introduce a stimulus saying "You mentioned that ... What happened then?"


Additional Sittings
In the case of qualitative interviews involving longer duration, one single sitting will not do, as it
would cause interview weariness. Hence, it is desirable to have two or more sittings with the
consent of the respondent.
Recording the Interview
It is essential to record responses as they take place. If the note taking is done after the interview,
a good deal of relevant information may be lost. Notes should be made in the schedule under the
respective question. They should be complete and verbatim. The responses should not be
summarized or paraphrased. How can complete recording be made without interrupting the free
flow of conversation? Electronic transcription through devices like a tape recorder can achieve
this. It has obvious advantages over note-taking during the interview. But it also has certain
disadvantages. Some respondents may object to or fear going on record. Consequently the risk
of lower response rate will rise especially for sensitive topics.
If the interviewer knows shorthand, he can use it with advantage. Otherwise, he can write
rapidly by abbreviating words and using only key words and the like. However, even the fast
writer may fail to record all that is said at conversational speed. At such times, it is useful to
interrupt by some such comment as "That seems to be a very important point. Would you mind
repeating it, so that I can get your words exactly?" The respondent is usually flattered by this
attention and the rapport is not disturbed.
The interviewer should also record all his probes and other comments on the schedule, in
brackets to set them off from responses. With pre-coded structured questions, the
interviewer's task is easy. He has simply to ring the appropriate code or tick the appropriate box,
as the case may be. He should not make mistakes by carelessly ringing or ticking a wrong item.
Closing the Interview
After the interview is over, take leave of the respondent, thanking him with a friendly smile. In
the case of a qualitative interview of longer duration, select the occasion for departure more
carefully. Assembling the papers for putting them in the folder at the time of asking the final
question sets the stage for a final handshake, a thank-you and a good-bye. If the respondent
desires to know the result of the survey, note down his name and address so that a summary of
the result could be posted to him when ready.

Editing
At the close of the interview, the interviewer must edit the schedule to check that he has asked all
the questions and recorded all the answers and that there is no inconsistency between answers.
Abbreviations in recording must be replaced by full words. He must ensure that everything is
legible. It is desirable to record a brief sketch of his impressions of the interview and
observational notes on the respondent's living environment, his attitude to the survey,
difficulties, if any, faced in securing his cooperation and the interviewer's assessment of the
validity of the respondent's answers.

Interview Problems
In personal interviewing, the researcher must deal with three major problems: inadequate
response, non-response and interviewer's bias.
Inadequate response
Kahn and Cannell distinguish five principal symptoms of inadequate response. They are:
- partial response, in which the respondent gives a relevant but incomplete answer
- non-response, when the respondent remains silent or refuses to answer the question
- irrelevant response, in which the respondent's answer is not relevant to the question
asked
- inaccurate response, when the reply is biased or distorted and
- verbalized response problem, which arises on account of the respondent's failure to
understand a question or lack of information necessary for answering it.
Interviewer's Bias
The interviewer is an important cause of response bias. He may resort to cheating by "cooking
up" data without actually interviewing. The interviewer can influence the responses by
inappropriate suggestions, word emphasis, tone of voice and question rephrasing. His own
attitudes and expectations about what a particular category of respondents may say or think may
bias the data. The respondent's perception of the interviewer's characteristics (education, apparent
social status, etc.) may also bias his answers. Another source of response bias arises from the
interviewer's perception of the situation: if he regards the assignment as impossible or sees the
results of the survey as possible threats to personal interests or beliefs, he is likely to introduce
bias.
As interviewers are human beings, such biasing factors can never be overcome completely, but
their effects can be reduced by careful selection and training of interviewers, proper motivation
and supervision, standardization of interview procedures (use of standard wording in survey
questions, standard instructions on probing procedure and so on) and standardization of
interviewer behaviour. There is need for more research on ways to minimize bias in the
interview.

Non-response
Non-response refers to failure to obtain responses from some sample respondents. There are
many sources of non-response: non-availability, refusal, incapacity and inaccessibility.
Non-availability
Some respondents may not be available at home at the time of call. This depends upon the nature
of the respondent and the time of calls. For example, employed persons may not be available
during working hours. Farmers may not be available at home during cultivation season. Selection
of appropriate timing for calls could solve this problem. Evenings and weekends may be
favourable interviewing hours for such respondents. If someone else is available at home, then
the respondent's hours of availability can be ascertained and the next visit can be planned
accordingly.
Refusal
Some persons may refuse to furnish information because they are ill-disposed, or approached at
the wrong hour and so on. Although a hard core of refusals remains, another try or perhaps
another approach may find some of them cooperative.
Incapacity
Incapacity or inability may refer to illness which prevents a response during the entire survey
period. It may also arise on account of a language barrier.
Inaccessibility
Some respondents may be inaccessible. Some may not be found due to migration and other
reasons. Non-responses reduce the effective sample size and its representativeness.
Methods and Aims of control of non-response
Kish suggests the following methods to reduce either the percentage of non-response or its
effects:
1. Improved procedures for collecting data are the most obvious remedy for non-response.
Improvements advocated are (a) guarantees of anonymity, (b) motivation of the
respondent to co-operate, (c) arousing the respondent's interest with clever opening
remarks and questions, (d) advance notice to the respondents.
2. Call-backs are the most effective way of reducing not-at-homes in personal interviews, as are
repeated mailings to no-returns in mail surveys.
3. Substitution for the non-response is often suggested as a remedy. Usually this is a
mistake because the substitutes resemble the respondents rather than the non-respondents.
Nevertheless, beneficial substitution methods can sometimes be designed with reference
to important characteristics of the population. For example, in a farm management study,
the farm size is an important variable and if the sampling is based on farm size,
substitution for a respondent with a particular size holding by another with the holding of
the same size is possible.
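The farm-size substitution in point 3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the unit identifiers and the size classes ("small", "medium", "large") are hypothetical and do not come from Kish. It assumes that each sampled unit carries a size class and that a reserve list of unsampled units with known size classes is available.

```python
from collections import defaultdict

def substitute_by_stratum(non_respondents, reserves):
    """Pair each non-respondent with an unused reserve unit of the same size class.

    Both arguments are lists of (unit_id, size_class) tuples (hypothetical layout).
    """
    # Group the reserve units by size class, preserving their listed order.
    pool = defaultdict(list)
    for unit_id, size_class in reserves:
        pool[size_class].append(unit_id)

    substitutions = {}
    for unit_id, size_class in non_respondents:
        if pool[size_class]:
            # Take the next unused reserve unit from the same size class.
            substitutions[unit_id] = pool[size_class].pop(0)
        else:
            # No matching reserve: leave the case as a non-response.
            substitutions[unit_id] = None
    return substitutions

# Illustrative data only.
non_respondents = [("F07", "small"), ("F19", "large"), ("F23", "small")]
reserves = [("R01", "small"), ("R02", "medium"), ("R03", "small"), ("R04", "small")]
result = substitute_by_stratum(non_respondents, reserves)
```

A unit with no matching reserve is deliberately left unsubstituted, since replacing it with a farm of a different size would defeat the purpose of matching on the important variable.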
Attempts to reduce the percentage or effects of non-response aim at reducing the bias caused by
differences between non-respondents and respondents. The non-response bias should not be confused
with the reduction of sample size due to non-response. The latter effect can be easily overcome,
either by anticipating the size of non-response in designing the sample size or by compensating
for it with a supplement. These adjustments increase the size of the response and the sampling
precision, but they do not reduce the non-response percentage or bias.
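
The idea of anticipating non-response when fixing the sample size can be sketched as a small calculation. The function name and the figures used below are assumptions chosen only for illustration; the sketch enlarges the number of units approached, which, as noted above, restores the size of the response but does nothing about non-response bias.

```python
def units_to_approach(target_responses: int, nonresponse_pct: int) -> int:
    """Sample size needed so that about `target_responses` units respond.

    Integer ceiling division is used to avoid floating-point rounding.
    """
    if not 0 <= nonresponse_pct < 100:
        raise ValueError("nonresponse_pct must be between 0 and 99")
    response_pct = 100 - nonresponse_pct
    # Ceiling of target_responses * 100 / response_pct
    return -(-target_responses * 100 // response_pct)


# Hypothetical figures: 400 completed interviews wanted, 20% non-response expected.
print(units_to_approach(400, 20))  # 500
```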

Telephone Interviewing
Telephone interviewing is a non-personal method of data collection. It may be used as a major
method or supplementary method.

It will be useful in the following situations:
1. When the universe is composed of those persons whose names are listed in telephone
directories, e.g. business houses, business executives, doctors, other professionals.
2. When the study requires responses to five or six simple questions, e.g. a radio or
television program survey.
3. When the survey must be conducted in a very short period of time, provided the units of
study are listed in telephone directory.
4. When the subject is interesting or important to respondents, e.g. a survey relating to trade
conducted by a trade association or a chamber of commerce, a survey relating to a
profession conducted by the concerned professional association.
5. When the respondents are widely scattered.
Advantages: The advantages of telephone interview are:
1. The survey can be completed at very low cost, because telephone survey does not involve
travel time and cost and all calls can be made from a single location.
2. Information can be collected in a short period of time. 5 to 10 interviews can be
conducted per hour.
3. Quality of response is good, because interviewer bias is reduced as there is no face-to-
face contact between the interviewer and the respondent.
4. This method of interviewing is less demanding upon the interviewer.
5. It does not involve field work.
6. Individuals who could not be reached or who might not care to be interviewed personally
can be contacted easily.
Disadvantages: Telephone interview has several limitations:
1. It is limited to persons with listed telephones. The sample will be distorted if the
universe includes persons not on the phone. In several countries like India, only a few persons
have a phone facility, and that too in urban areas only. Telephone facility is very rare in
rural areas. Hence, the method is not useful for studying the general population.
2. There is a limit to the length of interview. Usually, a call cannot last over five minutes.
Only five or six simple questions can be asked. Hence, telephone cannot be used for a
longer questionnaire.
3. The type of information to be collected is limited to what can be given in simple, short
answers of a few words. Hence, telephone is not suitable for complex surveys, and there
is no possibility of obtaining detailed information.
4. If the questions cover personal matters, most respondents will not cooperate with the
interviewer.
5. The respondent's characteristics and environment cannot be observed.
6. It is not possible to use visual aids like charts, maps, illustrations or complex scales.
7. It is rather difficult to establish rapport between the respondent and the interviewer.
8. There is no possibility to ensure the identity of the interviewer and to overcome
suspicions.

Group Interviews
A group interview may be defined as a method of collecting primary data in which a number of
individuals with a common interest interact with each other. In a group interview, the flow of
information is multidimensional. The group may consist of about six to eight individuals with a
common interest. The interviewer acts as the discussion leader. Free discussion is encouraged on
some aspect of the subject under study. The discussion leader stimulates the group members to
interact with each other.

The desired information may be obtained through a self-administered questionnaire or interview,
with the discussion serving as a guide to ensure consideration of the areas of concern. In
particular, the interviewer looks for evidence of common elements of attitudes, beliefs, intentions
and opinions among individuals in the group. At the same time, the interviewer must be aware that a
single comment by a member can provide important insight.
Samples for group interviews can be obtained through schools, clubs and other organized groups.
The group interview technique can be employed by researchers in studying people's reactions to
public amenities, public health projects, welfare schemes etc. It is a popular method in marketing
research to evaluate new product or service concepts, brand names, packages, promotional
strategies and attitudes. When an organization needs a great variety of information in as much
detail as possible at a relatively low cost and in a short period of time, the group interview
technique is more useful. It can be used to generate primary data in the exploratory phase of a
project.
Advantages: The advantages of this technique are:
1. The respondents comment freely and in detail.
2. The method is highly flexible. The flexibility helps the researcher work with new concepts
or topics which have not been previously investigated.
3. Visual aids can be used.
4. A group can be interviewed in the time required for one personal interview.
5. The client can watch the interview unobserved.
6. Respondents are more articulate in a group than in individual interviews.
7. The technique eliminates the physical limitations inherent in individual interviews.


Disadvantages: This method is not free from drawbacks.
1. It is difficult to get a representative sample.
2. There is the possibility of the group being dominated by one individual.
3. The respondents may answer to please the interviewer or the other members in the group.
Nevertheless, the advantages of this technique outweigh the disadvantages, and the
technique is found to be useful for surveys on topics of common interest.

Summary
Interviewing is one of the prominent methods of data collection. Interviews may be classified
into: (a) structured or directive interview, (b) unstructured or non-directive interview,
(c) focused interview, (d) clinical interview and (e) depth interview. A structured interview is
conducted with a detailed standardized schedule. The same questions are put to all the respondents
and in the same order. The non-directive method is the least structured one. The interviewer
encourages the respondent to talk freely about a given topic with a minimum of prompting or
guidance. The focused interview is a semi-structured interview in which a detailed pre-planned
schedule is not used; the investigator attempts to focus the discussion on the actual effects of a
given experience to which the respondents have been exposed. The clinical interview is similar to
the focused interview but with a subtle difference. While the focused interview is concerned with
the effects of a specific experience, the clinical interview is concerned with broad underlying
feelings or motivations or with the course of the individual's life experiences. The depth
interview is an intensive and searching interview aimed at studying the respondent's opinions,
emotions or convictions on the basis of an interview guide. It requires much more training in
inter-personal skills than the structured interview, and it deliberately aims to elicit
unconscious as well as extremely personal feelings and emotions.
Interviewing as a method of data collection has certain features:
1. Certain requirements or conditions are necessary for a successful interview.
2. There are several real advantages to personal interviewing.
3. Interviewing is not free from limitations.
In personal interviewing, the researcher must deal with three major problems: inadequate
response, non-response and interviewer's bias. Telephone interviewing is a non-personal
method of data collection. It may be used as a major method or supplementary method.
A group interview may be defined as a method of collecting primary data in which a number
of individuals with a common interest interact with each other. In a group interview, the
flow of information is multidimensional. The group may consist of about six to eight
individuals with a common interest. The interviewer acts as the discussion leader. The
quality of data collected depends ultimately upon the capabilities of interviewers. Hence,
careful selection and proper training of interviewers is essential.



Copyright 2009 SMU
Powered by Sikkim Manipal University
MB0034- Unit 11-Processing Data
Unit 11-Processing Data

Meaning of Data Processing
Data in the real world often come in such large quantities and in such a variety of formats that a
meaningful interpretation cannot be achieved straightaway. Social science research, in
particular, draws conclusions using both primary and secondary data. To arrive at a
meaningful interpretation of the research hypothesis, the researcher has to prepare the data for
this purpose. This preparation involves the identification of data structures, the coding of data
and the grouping of data for preliminary research interpretation. This data preparation for
research analysis is termed processing of data. Further, the selection of tools for analysis would
to a large extent depend on the results of this data processing.
Data processing is an intermediary stage of work between data collection and data
interpretation. The data gathered in the form of questionnaires/interview schedules/field
notes/data sheets mostly take the form of a large volume of research variables. The research
variables recognized are the result of the preliminary research plan, which also sets out the data
processing methods beforehand. Processing of data requires advance planning, and this planning
may cover such aspects as identification of variables, hypothetical relationships among the
variables and the tentative research hypothesis.
The various steps in processing of data may be stated as:
- Identifying the data structures
- Editing the data
- Coding and classifying the data
- Transcription of data
- Tabulation of data.
Objectives:
After studying this lesson you should be able to understand:
- Checking for analysis
- Editing
- Coding
- Classification
- Transcription of data
- Tabulation
- Construction of Frequency Table
- Components of a table
- Principles of table construction
- Frequency distribution and class intervals
- Graphs, charts and diagrams
- Types of graphs and general rules
- Quantitative and qualitative analysis
- Measures of central tendency
- Dispersion
- Correlation analysis
- Coefficient of determination

Checking for Analysis
In the data preparation step, the data are put into a format that allows the analyst to
use modern analysis software such as SAS or SPSS. The major criterion in this is to define the
data structure. A data structure is a dynamic collection of related variables and can be
conveniently represented as a graph whose nodes are labelled by variables. The data structure
also defines the stages of the preliminary relationships between variables/groups that have been
pre-planned by the researcher. Most data structures can be presented graphically to give clarity
as to the framed research hypothesis. A sample structure could be a linear structure, in which
one variable leads to the other and, finally, to the resultant end variable.
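
A linear data structure of this kind can be sketched as a small graph in which each node is a variable and each link means "is presumed to lead to". The variable names below are hypothetical and the helper function is only an illustration of the idea, not a prescribed procedure.

```python
# Hypothetical variables; each key leads to the variables listed against it.
structure = {
    "education": ["occupation"],
    "occupation": ["income"],
    "income": ["savings"],
    "savings": [],  # resultant end variable
}


def chain_order(graph):
    """Return the variables in order if `graph` is one linear chain, else None."""
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        if len(successors) > 1:
            return None  # a variable branches to several others
        for nxt in successors:
            indegree[nxt] += 1
    starts = [node for node, deg in indegree.items() if deg == 0]
    if len(starts) != 1 or max(indegree.values()) > 1:
        return None
    order, node = [starts[0]], starts[0]
    while graph[node]:
        node = graph[node][0]
        order.append(node)
    # Every variable must lie on the chain for the structure to be linear.
    return order if len(order) == len(graph) else None


print(chain_order(structure))
# ['education', 'occupation', 'income', 'savings']
```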
The identification of the nodal points and the relationships among the nodes can sometimes be
a more complex task than estimated. When the task is complex, for example when several types of
instruments are collected for the same research question, the procedure for drawing the data
structure involves a series of steps. In several intermediate steps, the heterogeneous data
structures of the individual data sets are harmonized to a common standard, and the separate
data sets are then integrated into a single data set. A clear definition of such data
structures helps in the further processing of data.
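
As a minimal sketch of harmonizing heterogeneous data sets and integrating them into one, assume two hypothetical instruments that record the same variables under different names and codes. All names, codes and records below are invented for illustration.

```python
# Two hypothetical data sets with different variable names and codings.
questionnaire = [
    {"id": "Q01", "gender": "M", "age": 34},
    {"id": "Q02", "gender": "F", "age": 51},
]
interviews = [{"resp_no": "I01", "sex": 2, "age_years": 29}]

# Assumed common standard: variable names id/gender/age, gender 1 = male, 2 = female.
RENAME = {"resp_no": "id", "sex": "gender", "age_years": "age"}
GENDER_CODES = {"M": 1, "F": 2, 1: 1, 2: 2}


def harmonize(record, source):
    """Rename variables and recode gender to the common standard."""
    row = {RENAME.get(name, name): value for name, value in record.items()}
    row["gender"] = GENDER_CODES[row["gender"]]
    row["source"] = source  # keep track of the originating instrument
    return row


combined = ([harmonize(r, "questionnaire") for r in questionnaire]
            + [harmonize(r, "interview") for r in interviews])
for row in combined:
    print(row)
```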

Editing
The next step in the processing of data is editing of the data instruments. Editing is a process of
checking to detect and correct errors and omissions. Data editing happens at two stages, one at
the time of recording of the data and second at the time of analysis of data.
Data Editing at the Time of Recording of Data
Document editing and testing of the data at the time of data recording is done with the
following questions in mind.
- Do the filters agree or are the data inconsistent?
- Have missing values been set to codes that are the same for all research questions?
- Have variable descriptions been specified?
- Have labels for variable names and value labels been defined and written?
All editing and cleaning steps are documented, so that, the redefinition of variables or later
analytical modification requirements could be easily incorporated into the data sets.
Data Editing at the Time of Analysis of Data
Data editing is also a requisite before the analysis of data is carried out. This ensures that the data
are complete in all respects before they are subjected to further analysis. Some of the usual checklist
questions that a researcher can use for editing data sets before analysis are:
1. Is the coding frame complete?
2. Is the documentary material sufficient for the methodological description of the study?
3. Is the storage medium readable and reliable?
4. Has the correct data set been framed?
5. Is the number of cases correct?
6. Are there differences between questionnaire, coding frame and data?
7. Are there undefined and so-called wild codes?
8. Does the first count of the data agree with the original documents of the
researcher?
The editing step checks for the completeness, accuracy and uniformity of the data as created by
the researcher.
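
Two of the checklist questions, the number of cases and the presence of wild codes, lend themselves to a simple automated check. The sketch below assumes a hypothetical coding frame in which 9 stands for a missing value; the variable names and records are illustrative only.

```python
# Hypothetical coding frame: the set of permitted codes for each variable.
CODING_FRAME = {"gender": {1, 2, 9}, "owns_vehicle": {1, 2, 9}}  # 9 = missing (assumed)
EXPECTED_CASES = 4

data = [
    {"id": 1, "gender": 1, "owns_vehicle": 2},
    {"id": 2, "gender": 2, "owns_vehicle": 1},
    {"id": 3, "gender": 7, "owns_vehicle": 1},  # 7 is not in the coding frame
    {"id": 4, "gender": 2, "owns_vehicle": 9},
]


def wild_codes(rows, frame):
    """List (case id, variable, value) for every code absent from the coding frame."""
    return [(row["id"], var, row[var])
            for row in rows
            for var, allowed in frame.items()
            if row[var] not in allowed]


cases_ok = len(data) == EXPECTED_CASES
problems = wild_codes(data, CODING_FRAME)
print(cases_ok, problems)  # True [(3, 'gender', 7)]
```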
Completeness: The first step of editing is to check whether there is an answer to all the
questions/variables set out in the data set. If there is any omission, the researcher sometimes
would be able to deduce the correct answer from other related data on the same instrument. If
this is possible, the data set has to be rewritten on the basis of the new information. For example,
the approximate family income can be inferred from other answers to probes such as occupation
of family members, sources of income, approximate spending and saving and borrowing habits
of family members etc. If the information is vital and has been found to be incomplete, then the
researcher can take the step of contacting the respondent personally again and soliciting the
requisite data. If none of these steps is possible, the data must be marked as
missing.
Accuracy: Apart from checking for omissions, the accuracy of each recorded answer should be
checked. A random check process can be applied to trace the errors at this step. Consistency in
response can also be checked at this step. Cross-verification of a few related responses would
help in checking for consistency in responses. The reliability of the data set would heavily
depend on this step of error correction. While clear inconsistencies should be rectified in the data
sets, fake responses should be dropped from the data sets.
Uniformity: In editing data sets, another keen lookout should be kept for any lack of uniformity in
the interpretation of questions and instructions by the data recorders. For instance, the responses
towards a specific feeling could have been queried from a positive as well as a negative angle.
While interpreting the answers, care should be taken to record each answer consistently as a positive
question response or as a negative question response. Uniformity checks thus ensure consistency in
coding throughout the questionnaire/interview schedule response/data set.
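
One common way of achieving this uniformity, assumed here purely for illustration, is to reverse the scores of negatively worded statements so that every item points in the same direction. The 5-point scale and the item names are hypothetical.

```python
SCALE_MAX = 5             # assumed 5-point scale, 1 = strongly disagree ... 5 = strongly agree
NEGATIVE_ITEMS = {"q2"}   # hypothetical: q2 is the negatively worded statement


def to_positive_direction(answers):
    """Reverse-score negatively worded items so all scores read the same way."""
    return {item: (SCALE_MAX + 1 - score) if item in NEGATIVE_ITEMS else score
            for item, score in answers.items()}


print(to_positive_direction({"q1": 4, "q2": 2}))  # {'q1': 4, 'q2': 4}
```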
The final point in the editing of data set is to maintain a log of all corrections that have been
carried out at this stage. The documentation of these corrections helps the researcher to retain the
original data set.
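
A correction log can be as simple as a list of entries recording the old and new value of every change, from which the original data set can be rebuilt. The following is a minimal sketch with hypothetical field names and records, not a prescribed format.

```python
import copy


def apply_correction(rows, log, case_id, variable, new_value, reason):
    """Change one value and record the old value, new value and reason in the log."""
    for row in rows:
        if row["id"] == case_id:
            log.append({"id": case_id, "variable": variable,
                        "old": row[variable], "new": new_value, "reason": reason})
            row[variable] = new_value
            return
    raise KeyError(case_id)


def restore_original(rows, log):
    """Rebuild the data set as it stood before the logged corrections."""
    restored = copy.deepcopy(rows)
    by_id = {row["id"]: row for row in restored}
    for entry in reversed(log):  # undo the latest correction first
        by_id[entry["id"]][entry["variable"]] = entry["old"]
    return restored


data = [{"id": 1, "income": 12000}, {"id": 2, "income": None}]
log = []
apply_correction(data, log, 2, "income", 8000, "inferred from occupation and savings")
print(data[1]["income"], restore_original(data, log)[1]["income"])  # 8000 None
```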


Coding
The edited data are then subjected to codification and classification. The coding process assigns
numerals or other symbols to the several responses of the data set. It is therefore a pre-requisite
to prepare a coding scheme for the data set. The recording of the data is done on the basis of this
coding scheme.
The responses collected in a data sheet vary: sometimes the response could be a choice
among multiple responses, sometimes the response could be in terms of values and sometimes
the response could be alphanumeric. If some codification is done at the recording stage itself
to the responses collected, it will be useful in the data analysis. When codification is done, it is
imperative to keep a log of the codes allotted to the observations. This code sheet will help in the
identification of variables/observations and the basis for such codification.
The first coding done on primary data sets is that of the individual observations themselves. This
response sheet coding benefits the research in that the verification and editing of
recordings and further contact with respondents can be achieved without any difficulty. The
codification can be made at the time of distribution of the primary data sheets itself. The codes
can be alphanumeric to keep track of where and to whom each sheet has been sent. For instance, if the
data are collected from members of the public in different localities, the sheets that are
distributed in a specific locality may carry a unique part code which is alphabetic. To this
alphabetic code, a numeric code can be attached to distinguish the person to whom the primary
instrument was distributed. This also helps the researcher to keep track of who the respondents are
and who are the probable respondents from whom primary data sheets are yet to be collected. Even at
a later stage, any specific queries on a specific response sheet can be clarified.
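
The alphanumeric sheet code described above can be sketched as an alphabetic locality part followed by a numeric serial. The locality codes, the three-digit serial and the list of returned sheets are assumptions made for the example.

```python
def sheet_code(locality_code: str, serial: int) -> str:
    """Alphabetic locality part plus a zero-padded numeric part, e.g. 'AN001'."""
    return f"{locality_code}{serial:03d}"


# Hypothetical distribution: three sheets in each of two localities.
issued = [sheet_code(loc, n) for loc in ("AN", "BK") for n in range(1, 4)]
returned = {"AN001", "AN003", "BK002"}

# Sheets still to be collected from probable respondents.
pending = [code for code in issued if code not in returned]
print(pending)  # ['AN002', 'BK001', 'BK003']
```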

The variables or observations in the primary instrument would also need codification, especially
when they are categorized. The categorization could be on a scale, i.e., most preferable to not
preferable, or it could be very specific, such as Gender classified as Male and Female. Certain
classifications can lead to an open ended category, such as an education classification of Illiterate,
Graduate, Professional, "Others, please specify". In such instances, the codification needs to be
carefully done to include all possible responses under "Others, please specify". If the preparation
of the exhaustive list is not feasible, then it will be better to create a separate variable for the
"Others, please specify" category and record all responses as such.
Numeric Coding: Coding need not necessarily be numeric. It can also be alphabetic. Coding has
to be compulsorily numeric, when the variable is subject to further parametric analysis.
Alphabetic Coding: A mere tabulation or frequency count or graphical representation of the
variable may be given in an alphabetic coding.
Zero Coding: A coding of zero has to be assigned carefully to a variable. In many instances,
when manual analysis is done, a code of 0 would imply no response from the respondents.
Hence, if a value of 0 is to be given to specific responses in the data sheet, it should not lead to
the same interpretation of non-response. For instance, there will be a tendency to give a code of
0 to a "No"; in that case a code other than 0 should be given to "No" in the data sheet. An
illustration of the coding process of some of the demographic variables is given in the following table.



Note: An "Others" response could be treated as a separate variable/observation and the actual
response could be recorded. The new variable could be termed "other occupation".
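
As a purely hypothetical illustration of these coding rules (it does not reproduce the table referred to above), the sketch below reserves 0 for non-response, gives "No" a non-zero code, and keeps the actual text of an "Others" answer so that it can be stored in a separate variable. All categories and code values are assumptions.

```python
MISSING = 0  # reserved for "no response", so it is never used for a real answer

# Hypothetical coding scheme for a few demographic variables.
CODES = {
    "gender": {"Male": 1, "Female": 2},
    "owns_vehicle": {"Yes": 1, "No": 2},  # "No" is coded 2, not 0
    "occupation": {"Farmer": 1, "Trader": 2, "Salaried": 3, "Others": 4},
}


def code_response(variable, answer):
    """Return (numeric code, free text for a separate 'other' variable or None)."""
    if answer is None or answer == "":
        return MISSING, None
    scheme = CODES[variable]
    if answer in scheme:
        return scheme[answer], None
    if "Others" in scheme:
        return scheme["Others"], answer  # keep the actual response
    raise ValueError(f"no code for {answer!r} in {variable}")


print(code_response("owns_vehicle", "No"))     # (2, None)
print(code_response("occupation", "Weaver"))   # (4, 'Weaver')
print(code_response("gender", None))           # (0, None)
```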
The coding sheet needs to be prepared carefully, if the data recording is not done by the
researcher, but is outsourced to a data entry firm or individual. In order to enter the data in the
same perspective, as the researcher would like to view it, the data coding sheet is to be prepared
first and a copy of the data coding sheet should be given to the outsourcer to help in the data
entry procedure. Sometimes, the researcher might not be able to code the data from the primary
instrument itself. He may need to classify the responses and then code them. For this purpose,
classification of data is also necessary at the data entry stage.

Classification
When open ended responses have been received, classification is necessary to code the
responses. For instance, the income of the respondent could be an open-ended question. From all
responses, a suitable classification can be arrived at. A classification method should meet certain
requirements or should be guided by certain rules.
First, classification should be linked to the theory and the aim of the particular study. The
objectives of the study will determine the dimensions chosen for coding. The categorization
should meet the information required to test the hypothesis or investigate the questions.
Second, the scheme of classification should be exhaustive. That is, there must be a category for
every response. For example, the classification of marital status into three categories, viz.,
"married", "single" and "divorced", is not exhaustive, because responses like "widowed" or
"separated" cannot be fitted into the scheme. Here, an open ended question will be the best mode
of getting the responses. From the responses collected, the researcher can fit a meaningful and
theoretically supportive classification. The inclusion of the classification "Others" tends to absorb
the scattered, but few, responses from the data sheets. But the "Others" categorization has to be
carefully used by the researcher, because it tends to defeat the very purpose of
classification, which is designed to distinguish between observations in terms of the properties
under study. The classification "Others" will be very useful when a minority of respondents in the
data set give varying answers. For instance, the newspaper reading habits of respondents may be surveyed.
95 respondents out of 100 could be easily classified into 5 large reading groups, while 5
respondents could each have given a unique answer. These answers, rather than being separately
considered, could be clubbed under the "Others" heading for meaningful interpretation of
respondents and reading habits.
Third, the categories must also be mutually exclusive, so that each case is classified only once.
This requirement is violated when some of the categories overlap or different dimensions are
mixed up.
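
The two requirements, exhaustive and mutually exclusive categories, can be checked mechanically. The sketch below classifies an open-ended income response using hypothetical class limits; each class includes its lower limit and excludes its upper limit, so a value on a boundary falls into exactly one class.

```python
# Hypothetical income classes: (lower limit, upper limit or None for open end, label).
CLASSES = [
    (0, 10000, "Low"),
    (10000, 25000, "Middle"),
    (25000, None, "High"),
]


def classify(income):
    """Return the single class an income falls into, or raise if none or several match."""
    matches = [label for low, high, label in CLASSES
               if income >= low and (high is None or income < high)]
    if len(matches) != 1:
        # Zero matches: scheme not exhaustive; several: classes overlap.
        raise ValueError(f"{income} falls into {len(matches)} classes")
    return matches[0]


print([classify(x) for x in (0, 9999, 10000, 40000)])
# ['Low', 'Low', 'Middle', 'High']
```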
The number of categories for a specific question/observation at the coding stage should be the
maximum permissible, since reducing the categories at the analysis level would be easier
than splitting an already classified group of responses. However, the number of categories is
limited by the number of cases and the anticipated statistical analyses that are to be used on the
observation.

Transcription of Data
When the observations collected by the researcher are not very large, the simple inferences
which can be drawn from the observations can be transferred to a data sheet, which is a
summary of all responses on all observations from a research instrument. The main aim of
transcription is to minimize the shuffling between several responses and several
observations. Suppose a research instrument contains 120 responses and the observations have
been collected from 200 respondents; a simple summary of one response from all 200
observations would require shuffling of 200 pages. The process is quite tedious if several
summary tables are to be prepared from the instrument. The transcription process helps in the
presentation of all responses and observations on data sheets which can help the researcher to
arrive at preliminary conclusions as to the nature of the sample collected etc. Transcription is
hence an intermediary process between data coding and data tabulation.
Methods of Transcription
The researcher may adopt a manual or computerized transcription. Long work sheets, sorting
cards or sorting strips could be used by the researcher to manually transcribe the responses. The
computerized transcription could be done using a data base package such as spreadsheets, text
files or other databases.
The main requisite for a transcription process is the preparation of the data sheets where
observations are the rows of the data sheet and the responses/variables are the columns of the data
sheet. Each variable should be given a label so that long questions can be covered under the label
names. The label names are thus the links to specific questions in the research instrument. For
instance, opinion on consumer satisfaction could be identified through a number of statements
(say 10); the data sheet does not contain the details of the statements, but gives a link to the
question in the research instrument through variable labels. In this instance the variable names
could be given as CS1, CS2, CS3, CS4, CS5, CS6, CS7, CS8, CS9 and CS10. The label CS
indicates consumer satisfaction and the numbers 1 to 10 indicate the statements measuring
consumer satisfaction. Once the labelling process has been done for all the responses in the
research instrument, the transcription of the responses is done.
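
A computerized data sheet of this kind can be sketched with the Python standard library: observations are rows, the labels CS1 to CS10 are columns, and a separate dictionary links each label back to the question. The observation codes and scores below are invented for illustration.

```python
import csv
import io

# Label names link each column back to a statement in the research instrument.
labels = {f"CS{i}": f"Consumer satisfaction, statement {i}" for i in range(1, 11)}
header = ["obs_id"] + list(labels)

# Hypothetical coded responses: one row per observation, one column per variable.
data_sheet = [
    ["R001", 4, 5, 3, 4, 4, 2, 5, 3, 4, 4],
    ["R002", 2, 3, 3, 1, 2, 2, 4, 3, 2, 1],
]

buffer = io.StringIO()  # stands in for a text file or spreadsheet export
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(data_sheet)

print(buffer.getvalue())
print(labels["CS3"])
```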
Manual Transcription
When the sample size is manageable, the researcher need not use any computerization process to
analyze the data. The researcher could prefer a manual transcription and analysis of responses.
Manual transcription would be the choice when the number of responses in a research
instrument is small, say 10 responses, and the number of observations collected is within
100. A transcription sheet of 100 × 50 rows/columns (assuming each response has 5 options) can
be easily managed by a researcher manually. If, on the other hand, the variables in the research
instrument are more than 40 and each variable has 5 options, this leads to a worksheet of 100 × 200
size or more, which might not be easily managed by the researcher manually. In the second instance, if
the number of responses is less than 30, then a manual worksheet could still be attempted.
In all other instances, it is advisable to use a computerized transcription process.
Long Worksheets
Long worksheets require quality paper, preferably chart sheets, thick enough to last several
usages. These worksheets normally are ruled both horizontally and vertically, allowing responses
to be written in the boxes. If one sheet is not sufficient, the researcher may use multiple ruled
sheets to accommodate all the observations. Headings of responses, which are variable names and
their coding (options), are filled in the first two rows. The first column contains the code of
observations. For each variable, the responses from the research instrument are then
transferred to the worksheet by ticking the specific option that the respondent has chosen. If the
variable cannot be coded into categories, requisite space for recording the actual response of the
respondent should be provided for in the work sheet.
The worksheet can then be used for preparing the summary tables or can be subjected to further
analysis of data. The original research instruments can now be kept aside as safe documents.
Copies of the data sheets can also be kept for future reference. As has been discussed under the
editing section, the transcribed data have to be subjected to testing to ensure error-free
transcription of data.

Transcription can be made as and when the edited instrument is ready for processing. Once all
schedules/questionnaires have been transcribed, the frequency tables can be constructed straight
from worksheet. Other methods of manual transcription include adoption of sorting strips or
cards.
In earlier days, data entry and processing were carried out through mechanical and semi-automatic
devices such as key punches using punch cards. The arrival of computers has changed the data
processing methodology altogether.

Tabulation
The transcription of data can be used to summarize and arrange the data in compact form for
further analysis. The process is called tabulation. Thus, tabulation is a process of summarizing
raw data and displaying them in compact statistical tables for further analysis. It involves counting
the number of cases falling into each of the categories identified by the researcher.
Tabulation can be done manually or through the computer. The choice depends upon the size and
type of study, cost considerations, time pressures and the availability of software packages.
Manual tabulation is suitable for small and simple studies.
Manual Tabulation
When data are transcribed in a classified form as per the planned scheme of classification,
category-wise totals can be extracted from the respective columns of the work sheets. A simple
frequency table counting the number of "Yes" and "No" responses can be made easily by
counting the "Y" response column and "N" response column in the manual worksheet table
prepared earlier. This is a one-way frequency table, and such tables are readily inferred from the
totals of each column in the work sheet. Sometimes the researcher has to cross tabulate two variables,
for instance, the age group of vehicle owners. This requires a two-way classification and cannot
be inferred straight from the column totals, although it does not need any technical knowledge or
skill. If one wants to prepare a table
showing the distribution of respondents by age, a tally sheet showing the age groups horizontally
is prepared. Tally marks are then made for the respective group, i.e., vehicle owners, from each
line of response in the worksheet. After every four tallies, the fifth tally is cut across the previous
four tallies. This represents a group of five items. This arrangement facilitates easy counting of
each one of the class groups. An illustration of this tally sheet is presented below.

Although manual tabulation is simple and easy to construct, it can be tedious, slow and error-
prone as responses increase.
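
The tally procedure can be imitated in a few lines. The worksheet lines below are hypothetical, and the string "||||/" is simply a plain-text stand-in for four strokes cut across by the fifth.

```python
from collections import Counter

# Hypothetical worksheet lines: (owns a vehicle Y/N, age group).
worksheet = [
    ("Y", "18-30"), ("Y", "31-45"), ("N", "18-30"), ("Y", "31-45"),
    ("Y", "31-45"), ("Y", "46-60"), ("Y", "31-45"), ("Y", "31-45"),
    ("Y", "31-45"), ("N", "46-60"), ("Y", "18-30"),
]


def tally_marks(count):
    """Render a count as tally marks grouped in fives."""
    fives, rest = divmod(count, 5)
    groups = ["||||/"] * fives + (["|" * rest] if rest else [])
    return " ".join(groups)


# Tally only the vehicle owners, age group by age group.
owner_ages = Counter(age for owns, age in worksheet if owns == "Y")
for age_group in sorted(owner_ages):
    print(age_group, tally_marks(owner_ages[age_group]), owner_ages[age_group])
```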
Computerized tabulation is easy with the help of software packages. The input requirement will
be the column and row variables. The software package then computes the number of records in
each cell of the row-column categories. The most popular package is the Statistical Package for
the Social Sciences (SPSS). It is an integrated set of programs suitable for analysis of social science
data. This package contains programs for a wide range of operations and analysis such as
handling missing data, recoding variable information, simple descriptive analysis, cross
tabulation, multivariate analysis and non-parametric analysis.
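
What such a package does for a cross tabulation can be sketched in plain Python: given a row variable and a column variable, count the records in each cell. This is only an illustration with invented records and a hypothetical helper name; it is not the procedure of SPSS or of any other package.

```python
from collections import Counter

# Hypothetical records with two classified variables.
records = [
    {"age_group": "18-30", "owns_vehicle": "Y"},
    {"age_group": "18-30", "owns_vehicle": "N"},
    {"age_group": "31-45", "owns_vehicle": "Y"},
    {"age_group": "31-45", "owns_vehicle": "Y"},
    {"age_group": "46-60", "owns_vehicle": "N"},
]


def crosstab(rows, row_var, col_var):
    """Count the records falling in each row-column cell."""
    counts = Counter((r[row_var], r[col_var]) for r in rows)
    row_cats = sorted({r[row_var] for r in rows})
    col_cats = sorted({r[col_var] for r in rows})
    # A Counter returns 0 for cells with no records.
    return {rc: {cc: counts[(rc, cc)] for cc in col_cats} for rc in row_cats}


table = crosstab(records, "age_group", "owns_vehicle")
for age_group, cells in table.items():
    print(age_group, cells)
```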

Construction of Frequency Table
Frequency tables provide a shorthand summary of data. The importance of presenting
statistical data in tabular form needs no emphasis. Tables facilitate comprehending masses of
data at a glance; they conserve space and reduce explanations and descriptions to a minimum.
They give a visual picture of relationships between variables and categories. They facilitate
summation of items and the detection of errors and omissions and provide a basis for
computations.
It is important to make a distinction between general purpose tables and special purpose tables. The
general purpose tables are primary or reference tables designed to include large amounts of source
data in convenient and accessible form. The special purpose tables are analytical or derivative ones
that demonstrate significant relationships in the data or the results of statistical analysis. Tables
in reports of government on population, vital statistics, agriculture, industries etc., are of the general
purpose type. They represent extensive repositories of statistical information. Special purpose
tables are found in monographs, research reports and articles and are used as instruments of
analysis. In research, we are primarily concerned with special purpose tables.

Components of a Table
The major components of a table are:
A. Heading:
(a) Table Number
(b) Title of the Table
(c) Designation of units
B. Body:
1. Stub-head: Heading of all rows or blocks of stub items.
2. Body-head: Headings of all columns or main captions and their sub-captions.
3. Field/body: The cells in rows and columns.
C. Notations:
- Footnotes, wherever applicable.
- Source, wherever applicable.
Principles of Table Construction
There are certain generally accepted principles or rules relating to the construction of tables. They
are:
1. Every table should have a title. The title should represent a succinct description of the
contents of the table. It should be clear and concise. It should be placed above the body of
the table.
2. A number facilitating easy reference should identify every table. The number can be
centred above the title. The table numbers should run in consecutive serial order.
Alternatively, tables in chapter 1 may be numbered 1.1, 1.2, 1.3, those in chapter 2 as 2.1, 2.2,
2.3, and so on.
3. The captions (or column headings) should be clear and brief.
4. The units of measurement under each heading must always be indicated.
5. Any explanatory footnotes concerning the table itself are placed directly beneath the table.
To obviate any possible confusion with the textual footnotes, reference
symbols such as the asterisk (*), the dagger (†) and the like may be used.
6. If the data in a series of tables have been obtained from different sources, it is ordinarily
advisable to indicate the specific sources in a place just below the table.
7. Usually lines separate columns from one another. Lines are always drawn at the top and
bottom of the table and below the captions.
8. The columns may be numbered to facilitate reference.
9. All column figures should be properly aligned. Decimal points and plus or minus
signs should be in perfect alignment.
10. Columns and rows that are to be compared with one another should be brought close
together.
11. Totals of rows should be placed at the extreme right column and totals of columns at the
bottom.
12. In order to emphasize the relative significance of certain categories, different kinds of
type, spacing and identifications can be used.
13. The arrangement of the categories in a table may be chronological, geographical,
alphabetical or according to magnitude. Numerical categories are usually arranged in
descending order of magnitude.
14. Miscellaneous and exceptional items are generally placed in the last row of the table.
15. Usually the larger number of items is listed vertically. This means that a table's length is
more than its width.
16. Abbreviations should be avoided whenever possible and ditto marks should not be used
in a table.
17. The table should be made as logical, clear, accurate and simple as possible.
Text references should identify tables by number, rather than by such expressions as "the table
above" or "the following table". Tables should not exceed the page size; a larger table can be
reduced to page size by photostating. Tables that are too wide for the page may be turned sidewise,
with the top facing the left margin or binding of the script. Where should tables be placed in a
research report or thesis? Some writers
place both special purpose and general purpose tables in an appendix and refer to them in the text
by numbers. This practice has the disadvantage of inconveniencing the reader who wants to
study the tabulated data as the text is read. A more appropriate procedure is to place special
purpose tables in the text and primary tables, if needed at all, in an appendix.

Frequency Distribution and Class Intervals
Variables that are classified according to magnitude or size are often arranged in the form of a
frequency table. In constructing this table, it is necessary to determine the number of class
intervals to be used and the size of the class intervals.
A distinction is usually made between continuous and discrete variables. A continuous variable
has an unlimited number of possible values between the lowest and highest with no gaps or
breaks. Examples of continuous variable are age, weight, temperature etc. A discrete variable can
have a series of specified values with no possibility of values between these points. Each value
of a discrete variable is distinct and separate. Examples of discrete variables are gender of
persons (male/female), occupation (salaried, business, profession) and car size (800cc, 1000cc,
1200cc).
In practice, all variables are treated as discrete units, the continuous variables being stated in
some discrete unit size according to the needs of a particular situation. For example, length is
described in discrete units of millimetres or a tenth of an inch.
Class Intervals: Ordinarily, the number of class intervals should be not less than 5 and not more than
15, depending on the nature of the data and the number of cases being studied. After noting the
highest and lowest values and the features of the data, the number of intervals can be easily
determined.
For many types of data, it is desirable to have class intervals of uniform size. The intervals
should be neither too small nor too large. Whenever possible, the intervals should represent
common and convenient numerical divisions such as 5 or 10, rather than odd divisions such as 3
or 7. Class intervals must be clearly designated in a frequency table in such a way as to obviate
any possibility of misinterpretation or confusion. For example, to present the age groups of a
population, the use of intervals of 1-20, 20-50, and 50 and above would be confusing, because
the ages 20 and 50 each fall in two classes. This may
be presented as 1-20, 21-50, and above 50.
Every class interval has a midpoint. For example, the midpoint of an interval 1-20 is 10.5 and
the midpoint of class interval 1-25 would be 13. Once class intervals are determined, it is routine
work to count the number of cases that fall in each interval.
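The counting step can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table`, the marks, and the choice of classes that include their lower limit but exclude their upper limit are assumptions made for the example, not part of the text.

```python
from collections import Counter


def frequency_table(values, lower, width, n_classes):
    """Count the values falling in each class [start, start + width)."""
    counts = Counter()
    for v in values:
        index = int((v - lower) // width)  # which class the value belongs to
        if 0 <= index < n_classes:
            counts[index] += 1
    table = []
    for i in range(n_classes):
        start = lower + i * width
        end = start + width
        midpoint = (start + end) / 2
        table.append((start, end, midpoint, counts[i]))
    return table


# Hypothetical marks of 12 students (illustrative data only).
marks = [12, 47, 33, 58, 25, 41, 39, 52, 18, 44, 36, 29]
table = frequency_table(marks, lower=10, width=10, n_classes=5)
for start, end, midpoint, freq in table:
    print(f"{start}-{end}  midpoint {midpoint}  frequency {freq}")
```

Uniform class widths of 10 are used here because the text recommends convenient divisions such as 5 or 10.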
One-Way Tables: One-way frequency tables present the distribution of cases on only a single
dimension or variable. For example, the distributions of respondents by gender, by religion, by
socio-economic status and the like are shown in one-way tables. Table 10.1 illustrates a one-way table.
One-way tables are rarely used since the result of a frequency distribution can be described in
a simple sentence. For instance, the gender distribution of a sample may be described as
"58% of the sample are males and 42% are females".
Two-Way Tables: Distributions in terms of two variables and the relationship between
the two variables are shown in a two-way table. The categories of one variable are presented one
below another on the left margin of the table, and those of the other variable at the upper part of the
table, one by the side of another. The cells represent particular combinations of both variables. To
compare the distributions of cases, raw numbers are converted into percentages based on the
number of cases in each category. Table 10.2 illustrates a two-way table.
Table 10.2

Another method of constructing a two-way table is to state the percentage in brackets
within each cell rather than in a separate column. Here, special care must be taken as to
how the percentages are calculated: either horizontally (on row totals) or vertically (on
column totals). Sometimes the table heading itself indicates the method
of representation used in the two-way table.
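As a rough sketch of how a two-way table with bracketed percentages could be built, the Python fragment below counts each combination of two variables and converts the counts into percentages of the row totals. The records, the variable names and the helper `two_way_table` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical survey records as (gender, answer) pairs; illustrative only.
records = [
    ("Male", "Yes"), ("Male", "No"), ("Male", "Yes"), ("Male", "Yes"),
    ("Female", "No"), ("Female", "Yes"), ("Female", "No"), ("Female", "No"),
]


def two_way_table(pairs):
    """Return cell counts and row-wise percentages for (row, column) pairs."""
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    percentages = {
        (row, col): 100 * count / row_totals[row]
        for (row, col), count in cells.items()
    }
    return cells, percentages


cells, row_pct = two_way_table(records)
for (row, col), count in sorted(cells.items()):
    # Percentages here are horizontal: each row adds up to 100%.
    print(f"{row:<8}{col:<5}{count} ({row_pct[(row, col)]:.0f}%)")
```

Dividing by column totals instead of row totals would give the vertical representation, which is why the basis of the percentages should always be stated.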

Graphs, Charts & Diagrams
In presenting the data of frequency distributions and statistical computations, it is often desirable
to use appropriate forms of graphic presentation. In addition to tabular forms, graphic
presentation involves the use of graphs, charts and other pictorial devices such as diagrams. These
forms and devices reduce large masses of statistical data to a form that can be quickly understood
at a glance. The meaning of figures in tabular form may be difficult for the mind to grasp or
retain. Properly constructed graphs and charts relieve the mind of burdensome details by
portraying facts concisely, logically and simply. By emphasizing new and significant
relationships, they are also useful in discovering new facts and in developing hypotheses.
The device of graphic presentation is particularly useful when the prospective readers are
non-technical people or the general public. It is useful even to technical people for dramatizing certain
points about data, for important points can be more effectively captured in pictures than in tables.
However, graphic forms are not substitutes for tables, but are additional tools for the researcher
to emphasize the research findings.
Graphic presentation must be planned with utmost care and diligence. Graphic forms used should
be simple, clear and accurate and also be appropriate to the data. In planning this work, the
following questions must be considered.
(a) What is the purpose of the diagram?
(b) What facts are to be emphasized?
(c) What is the educational level of the audience?
(d) How much time is available for the preparation of the diagram?
(e) What kind of chart will portray the data most clearly and accurately?
Types of Graphs and General Rules
The most commonly used graphic forms may be grouped into the following categories:
a) Line Graphs or Charts
b) Bar Charts
c) Segmental presentations.
d) Scatter plots
e) Bubble charts
f) Stock plots
g) Pictographs
h) Chernoff Faces
The general rules to be followed in graphic representations are:
1. The chart should have a title placed directly above the chart.
2. The title should be clear, concise and simple and should describe the nature of the data
presented.
3. Numerical data upon which the chart is based should be presented in an accompanying
table.
4. The horizontal line measures time or independent variable and the vertical line the
measured variable.
5. Measurements proceed from left to right on the horizontal line and from bottom to top on
the vertical.
6. Each curve or bar on the chart should be labelled.
7. If there is more than one curve or bar, they should be clearly differentiated from one
another by distinct patterns or colours.
8. The zero point should always be represented and the scale intervals should be equal.
9. Graphic forms should be used sparingly. Too many forms detract from rather than illuminate
the presentation.
10. Graphic forms should follow and not precede the related textual discussion.
Line Graphs
The line graph is useful for showing changes in data relationships over a period of time. In this
graph, figures are plotted in relation to two intersecting lines or axes. The horizontal line is called
the abscissa or X-axis and the vertical, the ordinate or Y-axis. The point at which the two axes
intersect is zero for both the X and Y axes. This point O is the origin of coordinates. The two lines divide
the plane into four sections known as quadrants, which are numbered anti-clockwise.
Measurements to the right of and above O are positive (plus) and measurements to the left of and
below O are negative (minus). These are the features of a rectangular coordinate type
of graph. Any point in the plane is plotted in terms of its distances from the two axes, reading from the
origin O. Scale intervals on both axes should be equal. If a part of the scale is omitted, a set
of parallel jagged lines should be used to indicate the break in the scale. The time dimension or
independent variable is represented by the X-axis and the other variable by the Y-axis.

Quantitative and Qualitative Analysis
Measures of Central Tendency
Analysis of data involves understanding the characteristics of the data. The following are the
important characteristics of statistical data:
- Central tendency
- Dispersion
- Skewness
- Kurtosis
In a data distribution, the individual items may have a tendency to cluster around a central position or
an average value. For instance, in a distribution of marks, individual students may score
between zero and a hundred, yet many students score marks that are close
to the average mark, say 50. Such a tendency of the data to concentrate around the central position of
the distribution is called central tendency. Central tendency of the data is measured by statistical
averages. Averages are classified into two groups.
1. Mathematical averages
2. Positional averages

Arithmetic mean, geometric mean and harmonic mean are mathematical averages. Median and
mode are positional averages. These statistical measures show how the individual values
in a distribution concentrate around a central value. If the values of a distribution
lie close to the average value, we conclude that the distribution has central
tendency.
Arithmetic Mean
Arithmetic mean is the most commonly used statistical average. It is the value obtained by
dividing the sum of the items by the number of items in a series. Symbolically,
if $x_1, x_2, x_3, \ldots, x_n$ are the values of a series, then the arithmetic mean of the series is
$(x_1 + x_2 + x_3 + \cdots + x_n) / n$.
If we put $x_1 + x_2 + x_3 + \cdots + x_n = \sum x$,
then arithmetic mean $= \sum x / n$
When frequencies are also given with the values, to calculate the arithmetic mean the values are
first multiplied by the corresponding frequencies. Their sum is then divided by the sum of the
frequencies. Thus in a discrete series, the arithmetic mean is calculated by the following formula.
Arithmetic mean $= \sum fx / \sum f$
Where, $\sum fx$ = sum of the values multiplied by the corresponding frequencies
$\sum f$ = sum of the frequencies
If $x_1, x_2, x_3, \ldots, x_n$ are the values of a series and $f_1, f_2, f_3, \ldots, f_n$ are their
corresponding frequencies, the arithmetic mean is calculated by
$(f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n) / (f_1 + f_2 + f_3 + \cdots + f_n)$, or
Arithmetic mean $= \sum fx / \sum f$
Individual series
- Find the arithmetic mean of the following data.
58 67 60 84 93 98 100
Arithmetic mean $= \sum x / n$
Where $\sum x$ = the sum of the items
n = the number of items in the series.
$\sum x$ = 58 + 67 + 60 + 84 + 93 + 98 + 100 = 560
n = 7
Arithmetic mean = 560/7 = 80
- Find the arithmetic mean for the following distribution
2.0 1.8 2.0 2.0 1.9 2.0 1.8 2.3 2.5 2.3
1.9 2.2 2.0 2.3
Arithmetic mean $= \sum x / n$
Where $\sum x$ = the sum of the items
n = the number of items in the series.
$\sum x$ = 2.0 + 1.8 + 2.0 + 2.0 + 1.9 + 2.0 + 1.8 + 2.3 + 2.5 + 2.3 + 1.9 + 2.2
+ 2.0 + 2.3 = 29
n = 14
Arithmetic mean = 29/14 = 2.07
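The two individual-series calculations above can be checked with a short Python sketch. The function name `arithmetic_mean` is chosen for this illustration only; the data are those of the two examples.

```python
def arithmetic_mean(values):
    """Sum of the items divided by the number of items."""
    return sum(values) / len(values)


scores = [58, 67, 60, 84, 93, 98, 100]
readings = [2.0, 1.8, 2.0, 2.0, 1.9, 2.0, 1.8, 2.3, 2.5, 2.3,
            1.9, 2.2, 2.0, 2.3]

mean_scores = arithmetic_mean(scores)      # 560 / 7
mean_readings = arithmetic_mean(readings)  # 29 / 14
print(mean_scores)
print(round(mean_readings, 2))
```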
Discrete series
- Calculate the arithmetic mean of the daily wages of the following 50 workers.
Daily wage : 15 18 20 25 30 35 40 42 45
Number of workers : 2 3 5 10 12 10 5 2 1

Arithmetic mean using direct formula

Arithmetic mean $= \sum fx / \sum f$
Where, $\sum fx$ = 1473
$\sum f$ = 50
Arithmetic mean = 1473 / 50
= 29.46
Continuous Series
- Find the arithmetic mean for the following distribution.
Marks : 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
No. of students : 6 12 18 20 20 14 8 2

In a continuous series, x is taken as the midpoint of each class (15, 25, 35, ..., 85).

Arithmetic mean $= \sum fx / \sum f$
Where, $\sum fx$ = 4700
$\sum f$ = 100

Arithmetic mean = 4700 / 100
= 47
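A minimal Python sketch of the formula Σfx / Σf for discrete and continuous series follows. The helper name `grouped_mean` is an assumption of this illustration. The wage data are taken with the final column (wage 45, 1 worker), which is what makes the totals 50 workers and Σfx = 1473 as used in the worked solution.

```python
def grouped_mean(values, frequencies):
    """Arithmetic mean of a frequency distribution: sum(f * x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)


# Discrete series: daily wages and number of workers.
wages = [15, 18, 20, 25, 30, 35, 40, 42, 45]
workers = [2, 3, 5, 10, 12, 10, 5, 2, 1]
wage_mean = grouped_mean(wages, workers)

# Continuous series: the midpoint of each class stands in for x.
classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
students = [6, 12, 18, 20, 20, 14, 8, 2]
midpoints = [(low + high) / 2 for low, high in classes]
marks_mean = grouped_mean(midpoints, students)

print(wage_mean, marks_mean)
```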
Geometric Mean
Geometric mean is defined as the $n$th root of the product of the $n$ items of a series. If there
are two items in the data, we take the square root; if there are three items we take the
cube root, and so on.
Symbolically,
GM $= \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$
Where $x_1, x_2, \ldots, x_n$ are the items of the given series. To simplify calculations, logarithms
are used.

Accordingly,
GM = Antilog of $(\sum \log x / n)$
In discrete series
GM = Antilog of $(\sum f \log x / \sum f)$
Illustration

GM = Antilog of $(\sum \log x / n)$
= Antilog of (19.9986 / 10)
= Antilog of 1.99986
= 99.967
Geometric mean for discrete series
Calculate the geometric mean of the data given below:

Class                 No. of families   Income
Landlords                    1           1000
Cultivators                 50             80
Landless labourers          25             40
Money lenders                2            750
School teachers              3            100
Shop keepers                 4            150
Carpenters                   3            120
Weavers                      5             60

GM = Antilog of $(\sum f \log x / \sum f)$
= Antilog of (173.7907 / 93)
= Antilog of 1.86871
= 73.91
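The logarithm method can be sketched in Python as below. The function name `geometric_mean` is an assumption of this example. For the family data, the landlords' income is taken as 1000, the value consistent with the total Σ f log x = 173.7907 used in the solution; with an income of 100 the total would be about one unit smaller.

```python
import math


def geometric_mean(values, frequencies=None):
    """Antilog of the (frequency-weighted) mean of the base-10 logarithms."""
    if frequencies is None:
        frequencies = [1] * len(values)  # individual series
    total_log = sum(f * math.log10(x) for x, f in zip(values, frequencies))
    return 10 ** (total_log / sum(frequencies))


# Two items: the geometric mean is the square root of their product.
gm_pair = geometric_mean([4, 9])

# Discrete series: income of each class weighted by the number of families.
incomes = [1000, 80, 40, 750, 100, 150, 120, 60]
families = [1, 50, 25, 2, 3, 4, 3, 5]
gm_income = geometric_mean(incomes, families)

print(round(gm_pair, 4), round(gm_income, 2))
```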
Harmonic Mean
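Harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the items.
In individual series
HM $= N / \sum (1/x)$
N = number of items
In discrete and continuous series
HM $= N / \sum f(1/m)$
N = total frequency
m = value of the item (in a continuous series, the mid-value of the class)
Illustration
For individual series
1. Find the harmonic mean of the following data
5 10 3 7 125 58 47 80 45 26

HM $= N / \sum (1/x)$
$\sum (1/x)$ = 0.8959
HM = 10 / 0.8959
= 11.16
Harmonic mean for discrete series
Compute the harmonic mean for the following data
Marks : 10 20 25 30 40 50
Frequency : 20 10 15 25 10 20

HM $= N / \sum f(1/x)$
$\sum f(1/x)$ = 4.5833
HM = 100 / 4.5833
= 21.82
Harmonic mean for continuous series
1. Calculate the harmonic mean for the given data.
Class : 10-20 20-30 30-40 40-50 50-60 60-70
Frequency : 5 7 3 15 12 8

HM $= N / \sum f(1/m)$, with m the mid-values 15, 25, 35, 45, 55 and 65
$\sum f(1/m)$ = 1.3736
HM = 50 / 1.3736 = 36.40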
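The three harmonic mean problems can be recomputed with a few lines of Python. The function name `harmonic_mean` is an assumption of this sketch; carrying the sums of reciprocals at full precision gives approximately 11.16, 21.82 and 36.40.

```python
def harmonic_mean(values, frequencies=None):
    """Total frequency divided by the sum of f * (1 / x)."""
    if frequencies is None:
        frequencies = [1] * len(values)  # individual series
    reciprocal_sum = sum(f / x for x, f in zip(values, frequencies))
    return sum(frequencies) / reciprocal_sum


# Individual series.
hm_individual = harmonic_mean([5, 10, 3, 7, 125, 58, 47, 80, 45, 26])

# Discrete series: marks with their frequencies.
hm_discrete = harmonic_mean([10, 20, 25, 30, 40, 50],
                            [20, 10, 15, 25, 10, 20])

# Continuous series: class mid-values stand in for the values.
mid_values = [15, 25, 35, 45, 55, 65]
hm_continuous = harmonic_mean(mid_values, [5, 7, 3, 15, 12, 8])

print(round(hm_individual, 2), round(hm_discrete, 2), round(hm_continuous, 2))
```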
Median
Median is the middlemost item of a given series. In individual series, we arrange the
given data in ascending or descending order and take the middlemost item as
the median. When two values occur in the middle, we take the average of these two
values as the median. Since the median is the central value of an ordered distribution, an
equal number of values lie to its left and to its right.
Individual series
Median = $\left(\frac{N+1}{2}\right)$th item

Illustration
1. Find the median of the following scores.
97 50 95 51 90 60 85 64 81 65 80 70 75
First we arrange the series in ascending order.
50 51 60 64 65 70 75 80 81 85 90 95 97
Median = $\left(\frac{N+1}{2}\right)$th item
= ((13 + 1) / 2)th item
= (14 / 2)th item
= 7th item
= 75
Median for distribution with even number of items
2. Find the median of the following data.
95 51 91 60 90 64 85 69 80 70 78 75
First we arrange the series in ascending order.
51 60 64 69 70 75 78 80 85 90 91 95
Median = $\left(\frac{N+1}{2}\right)$th item
= ((12 + 1) / 2)th item
= (13 / 2)th item
= 6.5th item
= (6th item + 7th item) / 2
= (75 + 78) / 2
= 153/2
= 76.5
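The rule for individual series, including the averaging of the two middle items when N is even, can be sketched as follows. The function name `median` is chosen for this illustration; the data are those of the two problems above.

```python
def median(values):
    """Middle item of the ordered series; average of the two middle items if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # the ((N + 1) / 2)th item
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_scores = [97, 50, 95, 51, 90, 60, 85, 64, 81, 65, 80, 70, 75]
even_scores = [95, 51, 91, 60, 90, 64, 85, 69, 80, 70, 78, 75]

median_odd = median(odd_scores)
median_even = median(even_scores)
print(median_odd, median_even)
```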
Median for Discrete Series
To find the median of a discrete series, we first cumulate the frequencies and then locate
the median at the $\left(\frac{N+1}{2}\right)$th cumulative frequency, where N is the total
frequency.
Steps
1. Arrange the values of the data in ascending order of magnitude.
2. Find out the cumulative frequencies.
3. Apply the formula $\left(\frac{N+1}{2}\right)$th item.
4. Look at the cumulative frequency column and find the value of the variable
corresponding to that position.
Find the median for the following data.
Income : 100 150 80 200 250 180
Number of persons : 24 26 16 20 6 30
First of all arrange the data in ascending order and cumulate the frequencies.
Income : 80 100 150 180 200 250
Number of persons : 16 24 26 30 20 6
Cumulative frequency : 16 40 66 96 116 122

Median = $\left(\frac{N+1}{2}\right)$th item
= ((122 + 1) / 2)th item
= (123 / 2)th item
= 61.5th item
The 61.5th item falls within the cumulative frequency 66, so the corresponding value is taken as the median.
Therefore Median = 150


Median for Continuous Series
To find the median of a grouped series with class intervals, we first cumulate the
frequencies and locate the median class at the (N/2)th cumulative frequency. We then apply the
interpolation formula to obtain the median.
Median $= L_1 + \frac{N/2 - m}{f} \times C$
$L_1$ = lower limit of the median class
N/2 = total frequency / 2
m = cumulative frequency of the class preceding the median class
f = frequency of the median class
C = class interval
Find the median of the following data.
Class : 12-14 15-17 18-20 21-23 24-26
Frequency : 1 3 8 2 6

Median $= L_1 + \frac{N/2 - m}{f} \times C$
$L_1$ = 18
N/2 = 10
m = 4
f = 8
C = 2
Median = 18 + (10 − 4) / 8 × 2
= 18 + 6/8 × 2
= 18 + (12/8)
= 18 + 1.5
= 19.5
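The cumulative-frequency procedure for discrete series and the interpolation formula for continuous series can be sketched together. The function names are assumptions of this example, and the class interval C = 2 is taken exactly as in the worked solution above.

```python
def discrete_median(values, frequencies):
    """Value at which the cumulative frequency first reaches (N + 1) / 2."""
    position = (sum(frequencies) + 1) / 2
    cumulative = 0
    for value, freq in sorted(zip(values, frequencies)):
        cumulative += freq
        if cumulative >= position:
            return value


def continuous_median(lower_limits, frequencies, width):
    """Interpolated median: L1 + (N/2 - m) / f * C."""
    half = sum(frequencies) / 2
    cumulative = 0  # m: cumulative frequency before the current class
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative + freq >= half:
            return lower + (half - cumulative) / freq * width
        cumulative += freq


income_median = discrete_median([100, 150, 80, 200, 250, 180],
                                [24, 26, 16, 20, 6, 30])
class_median = continuous_median([12, 15, 18, 21, 24], [1, 3, 8, 2, 6], width=2)
print(income_median, class_median)
```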
Merits of Median
1. Median is easy to calculate and simple to understand.
2. When the data is very large, median is the most convenient measure of central
tendency.
3. Median is useful for finding an average for data with open-ended classes.
4. The median distributes the values of the data equally to either side of the median.
5. Median is not influenced by the extreme values present in the data.
6. Value of the median can be graphically determined.
Demerits of Median
- To calculate median, data should be arranged according to ascending order. This is
tedious when the number of items in a series is numerous.
- Since the value of median is determined by observation, it is not a true representative of
all the values.
- Median is not amenable to further algebraic treatment.
- The value of median is affected by sampling fluctuation.
Mode
Mode is the most frequently occurring value of a distribution. When no item repeats more often
than the others, or when two or more items repeat an equal number of times, the mode is ill defined. In such cases,
the mode is calculated by the formula Mode = 3 Median − 2 Mean.
Mode is a widely used measure of central tendency in business. We speak of the modal wage, which
is the wage earned by most of the workers. The modal shoe size is the size most in demand.
Merits of Mode
- Mode is the most typical and frequented value of the distribution.
- It is not affected by extreme values.
- Mode can be determined even for series with open-ended classes.
- Mode can be graphically determined.
Demerits of Mode
1. It is difficult to determine the mode when no single item repeats more often than the others.
2. Mode is not capable of further algebraic treatment.
3. Mode is not based on all the items of the series.
4. Mode is not rigidly defined. There are several formulae for calculating mode.
Mode for Individual Series
1. Calculation of mode for the following data.
7 10 8 5 8 6 8 9
Since the item 8 repeats the greatest number of times, mode = 8.
Calculation of mode when mode is ill defined.
2. Calculation of mode for the following data.
15 25 14 18 21 16 19 20
Since no item repeats more often than the others, the mode is ill defined.
Mode = 3 Median − 2 Mean
Mean = 148/8 = 18.5
Median = (18 + 19)/2
= 18.5
Mode = (3 × 18.5) − (2 × 18.5)
= 55.5 − 37 = 18.5
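The two cases, a clear mode and an ill-defined mode estimated by 3 Median − 2 Mean, can be sketched in Python. The function name `mode_of_series` is an assumption of this illustration. For the second data set the mean and the median are both 18.5, so the formula gives 3 × 18.5 − 2 × 18.5 = 18.5.

```python
from collections import Counter


def mode_of_series(values):
    """Most frequent item; falls back to 3 * median - 2 * mean when ill defined."""
    ranked = Counter(values).most_common()
    top_count = ranked[0][1]
    leaders = [value for value, count in ranked if count == top_count]
    if len(leaders) == 1:
        return leaders[0]
    # No single item repeats most often: use the empirical relation.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    mean = sum(values) / n
    return 3 * median - 2 * mean


clear_mode = mode_of_series([7, 10, 8, 5, 8, 6, 8, 9])
estimated_mode = mode_of_series([15, 25, 14, 18, 21, 16, 19, 20])
print(clear_mode, estimated_mode)
```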
Mode for Discrete data Series
In discrete series the item with highest frequency is taken as mode.
3. Find mode for the following data.

Since 65 is the highest frequency, the value corresponding to it is taken as the mode.
Mode = 31
Calculation of Mode Using Grouping Table and Analysis Table
To make Grouping Table
1. Group the frequency in two
2. Frequencies are grouped in two leaving the first frequency.
3. Group the frequency in three
4. Frequencies are grouped in three leaving the first frequency.
5. Frequencies are grouped in three leaving the first and second frequency.
To make Analysis Table
1. Analysis table is made based on grouping table.
2. Circle the highest value of each column.
3. Assign marks to classes, which constitute the highest value of the column.
4. Count the number of marks.
5. The class with the highest marks is selected as the modal class.
6. Apply the interpolation formula and find the mode.
Mode $= L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C$
$L_1$ = lower limit of the modal class
$f_1$ = frequency of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class succeeding the modal class
C = class interval

Illustration
Find mode for the following data using grouping table and analysis table.


Steps
1. In column I, the frequencies are grouped in two
2. In column II, frequencies are grouped in two, leaving the first frequency.
3. In column III, frequencies are grouped in three
4. In column IV frequencies are grouped in three, leaving the first frequency.
5. In column V frequencies are grouped in three, leaving the first and second frequency.

The highest mark is 5 and is obtained by the class 40-60.
Therefore modal class = 40-60
Mode is calculated by the formula
Mode $= L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C$
$L_1$ = lower limit of the modal class = 40
$f_1$ = frequency of the modal class = 27
$f_0$ = frequency of the class preceding the modal class = 15
$f_2$ = frequency of the class succeeding the modal class = 13
C = class interval = 20
Mode = 40 + (27 − 15) / (2 × 27 − 15 − 13) × 20
= 40 + (12 / (54 − 28)) × 20
= 40 + (12 / 26) × 20
= 40 + (0.4615) × 20
= 40 + 9.23
= 49.23
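Once the modal class has been identified, the interpolation step is a single expression. The Python sketch below uses a function name, `grouped_mode`, that is an assumption of this illustration, and applies it to the figures of the example above.

```python
def grouped_mode(lower_limit, f1, f0, f2, width):
    """Mode = L1 + (f1 - f0) / (2 * f1 - f0 - f2) * C."""
    return lower_limit + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Modal class 40-60: f1 = 27, preceding class f0 = 15, succeeding class f2 = 13.
mode_value = grouped_mode(lower_limit=40, f1=27, f0=15, f2=13, width=20)
print(round(mode_value, 2))
```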

Dispersion
Dispersion is the tendency of the individual values in a distribution to spread away from the
average. Many economic variables, such as income and wages, vary widely from the mean.
A measure of dispersion indicates the degree of variation of the items from the
average.
Objectives of Measuring Dispersion
Study of dispersion is needed to:
1. Test the reliability of the average
2. Control the variability of the data
3. Enable comparison of two or more distributions with regard to their variability
4. Facilitate the use of other statistical measures.
Measures of dispersion point out how far the average value is representative of the
individual items. If the dispersion value is small, the average tends to closely represent the
individual values and it is reliable. When dispersion is large, the average is not a typical
representative value.
Measures of dispersion are useful to control the cause of variation. In industrial production,
efficient operation requires control of quality variation.
Measures of variation enable comparison of two or more series with regard to their variability. A
high degree of variation would mean little consistency and low degree of variation would mean
high consistency.
Properties of a Good Measure of Dispersion
A good measure of dispersion should have the following properties:
1. It should be simple to understand
2. It should be easy to calculate
3. It should be rigidly defined
4. It should be based on all the values of a distribution
5. It should be amenable to further statistical and algebraic treatment.
6. It should have sampling stability
7. It should not be unduly affected by extreme values.
Measures of Dispersion
1. Range
2. Quartile deviation
3. Mean deviation
4. Standard deviation
5. Lorenz curve
Range, Quartile deviation, Mean deviation and Standard deviation are mathematical
measures of dispersion. Lorenz curve is a graphical measure of dispersion.
Measures of dispersion can be absolute or relative. An absolute measure of dispersion is
expressed in the same unit of the original data. When two sets of data are expressed in
different units, relative measures of dispersion are used for comparison. A relative
measure of dispersion is the ratio of absolute measure to an appropriate average.
The following are the important relative measures of dispersion.
1. Coefficient of range
2. Coefficient of Quartile deviation
3. Coefficient of Mean deviation
4. Coefficient of Standard deviation
Range
Range is the difference between the highest and the lowest value.
Symbolically, range = highest value − lowest value
Range = H − L
H = highest value
L = lowest value
The relative measure of dispersion is the co-efficient of range. It is obtained by the following
formula.
Coefficient of range = (H − L) / (H + L)

1. Calculate the range of the following distribution, giving the income of 10 workers.
Also calculate the co-efficient of range.
25 37 40 23 58 75 89 20 81 95
Range = H − L
H = highest value = 95
L = lowest value = 20
Range = 95 − 20 = 75
Coefficient of range = (H − L) / (H + L)
= (95 − 20) / (95 + 20)
= 75 / 115
= 0.6522
Range is simple to understand and easy to calculate. But it is not based on all items of the
distribution. It is subject to fluctuations from sample to sample. Range cannot be
calculated for open-ended series.
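The range and its co-efficient for the income data above can be verified with a short Python sketch. The function name `range_and_coefficient` is an assumption of this example.

```python
def range_and_coefficient(values):
    """Return (H - L, (H - L) / (H + L)) for a series."""
    high, low = max(values), min(values)
    return high - low, (high - low) / (high + low)


incomes = [25, 37, 40, 23, 58, 75, 89, 20, 81, 95]
income_range, income_coefficient = range_and_coefficient(incomes)
print(income_range, round(income_coefficient, 4))
```

Only the two extreme items enter the calculation, which is why the range is not based on all items of the distribution.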
Quartile Deviation
Quartile deviation is defined as half the inter-quartile range. It is based on the first and the third
quartiles of a distribution. When a distribution is divided into four equal parts, the three
dividing points are the quartiles $Q_1$, $Q_2$ and $Q_3$.
The first quartile $Q_1$ is the point of the distribution where 25% of the items lie
below $Q_1$ and 75% of the items lie above it. $Q_2$ is the median of
the distribution, where 50% of the items lie below $Q_2$ and 50% of the
items lie above it. The third quartile $Q_3$ is the point of the distribution
where 75% of the items lie below $Q_3$ and 25% of the items
lie above it.
Quartile deviation is based on the difference between the third and first quartiles, that is, on
the inter-quartile range.
Symbolically, inter-quartile range $= Q_3 - Q_1$
Quartile Deviation $= (Q_3 - Q_1) / 2$
Co-efficient of Quartile Deviation $= (Q_3 - Q_1) / (Q_3 + Q_1)$
Merits of Quartile Deviation
1. Quartile Deviation is superior to range as a rough measure of dispersion.
2. It has a special merit in measuring dispersion in open-ended series.
3. Quartile Deviation is not affected by extreme values.
Demerits of Quartile Deviation
1. Quartile Deviation ignores the first 25% of the distribution below $Q_1$ and the last 25% of
the distribution above $Q_3$.
2. Quartile Deviation is not amenable to further mathematical treatment.
3. Quartile Deviation is very much affected by sampling fluctuations.
Problems
Individual Series
1. Find the Quartile Deviation and its co-efficient.
20 28 40 12 30 15 50
First of all arrange the data in ascending order.
12 15 20 28 30 40 50
$Q_1$ = size of the ((N + 1) / 4)th item
= size of the ((7 + 1) / 4)th item
= size of the (8 / 4)th item
= 2nd item
= 15
$Q_3$ = size of the (3(N + 1) / 4)th item
= size of the (3 × (7 + 1) / 4)th item
= size of the (3 × 8 / 4)th item
= 6th item
= 40
Quartile Deviation $= (Q_3 - Q_1) / 2$
= (40 − 15) / 2
= 12.5
Co-efficient of Quartile Deviation $= (Q_3 - Q_1) / (Q_3 + Q_1)$
= (40 − 15) / (40 + 15)
= 25/55
= 0.4545
Discrete Series
2. Find the quartile deviation and its co-efficient for the following data.
Income : 110 120 130 140 150 160 170 180 190 200
Frequency : 50 45 40 35 30 25 20 15 10 5
Cumulative frequency : 50 95 135 170 200 225 245 260 270 275

$Q_1$ = size of the ((N + 1) / 4)th item
= size of the ((275 + 1) / 4)th item
= size of the (276 / 4)th item
= size of the 69th cumulative frequency
= 120
$Q_3$ = size of the (3(N + 1) / 4)th item
= size of the (3 × (275 + 1) / 4)th item
= size of the (3 × 69)th item
= size of the 207th cumulative frequency
= 160
Quartile Deviation = (160 − 120) / 2
= 40/2
= 20
Co-efficient of Quartile Deviation $= (Q_3 - Q_1) / (Q_3 + Q_1)$
= (160 − 120) / (160 + 120)
= 40/280
= 0.1429
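The (N + 1)/4 rule for individual and discrete series can be sketched as follows. The helper names are assumptions of this illustration, and the individual series is taken in its ordered form 12 15 20 28 30 40 50. As a simplification, when the position is fractional the sketch takes the next item instead of interpolating; in both problems here the positions are whole numbers.

```python
def item_at(values, frequencies, position):
    """Value at which the cumulative frequency first reaches `position`."""
    cumulative = 0
    for value, freq in sorted(zip(values, frequencies)):
        cumulative += freq
        if cumulative >= position:
            return value


def quartile_deviation(values, frequencies=None):
    """Return Q1, Q3, (Q3 - Q1) / 2 and (Q3 - Q1) / (Q3 + Q1)."""
    if frequencies is None:
        frequencies = [1] * len(values)  # individual series
    n = sum(frequencies)
    q1 = item_at(values, frequencies, (n + 1) / 4)
    q3 = item_at(values, frequencies, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


individual = quartile_deviation([20, 28, 40, 12, 30, 15, 50])
discrete = quartile_deviation(
    [110, 120, 130, 140, 150, 160, 170, 180, 190, 200],
    [50, 45, 40, 35, 30, 25, 20, 15, 10, 5],
)
print(individual)
print(discrete)
```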
Continuous Series
Find the quartile deviation for the following series
Marks : 0-20 20-40 40-60 60-80 80-100
Frequency : 10 30 36 30 14
Cumulative frequency : 10 40 76 106 120

$Q_1$ lies in the class containing the (N/4)th item
= (120/4)th item
= 30th item, which falls in the class 20-40
$Q_1$ is obtained by applying the interpolation formula
$Q_1 = L_1 + \frac{N/4 - m}{f} \times C$
= 20 + (30 − 10) / 30 × 20
= 20 + 20/30 × 20
= 20 + 400/30
= 20 + 13.33
= 33.33
$Q_3$ lies in the class containing the (3N/4)th item
= (3 × 30)th = 90th item, which falls in the class 60-80
$Q_3$ is obtained by applying the interpolation formula
$Q_3 = L_1 + \frac{3N/4 - m}{f} \times C$
= 60 + (90 − 76) / 30 × 20
= 60 + (14/30) × 20
= 60 + 280/30
= 60 + 9.33
= 69.33
Quartile Deviation $= (Q_3 - Q_1) / 2$
= (69.33 − 33.33) / 2
= 36/2
= 18
Co-efficient of Quartile Deviation $= (Q_3 - Q_1) / (Q_3 + Q_1)$
= (69.33 − 33.33) / (69.33 + 33.33)
= 36 / 102.66
= 0.351
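For a continuous series the same interpolation formula serves for both quartiles, with N/4 or 3N/4 as the target position. The sketch below assumes a helper named `grouped_quantile`, which is not part of the text, and reuses the marks data of the problem above.

```python
def grouped_quantile(lower_limits, frequencies, width, fraction):
    """Interpolated value at `fraction` of N: L1 + (target - m) / f * C."""
    target = fraction * sum(frequencies)
    cumulative = 0  # m: cumulative frequency before the current class
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative + freq >= target:
            return lower + (target - cumulative) / freq * width
        cumulative += freq


lower_limits = [0, 20, 40, 60, 80]
frequencies = [10, 30, 36, 30, 14]

q1 = grouped_quantile(lower_limits, frequencies, 20, 0.25)
q3 = grouped_quantile(lower_limits, frequencies, 20, 0.75)
qd = (q3 - q1) / 2
qd_coefficient = (q3 - q1) / (q3 + q1)
print(round(q1, 2), round(q3, 2), round(qd, 2), round(qd_coefficient, 3))
```

Passing a fraction of 0.5 to the same function gives the median, since the median is the second quartile.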
Mean Deviation
Range and quartile deviation do not measure the scatter of the items about an average.
Mean deviation and standard deviation do measure this scatter.
Mean deviation is the average of the deviations of the items in a distribution from an
appropriate average. Thus, we calculate mean deviation from mean, median or mode.
Theoretically, mean deviation from median has an advantage because the sum of the deviations
of items from the median is the minimum when signs are ignored. However, in practice,
mean deviation from mean is frequently used. That is why it is commonly called mean
deviation.
Formula for calculating mean deviation $= \sum |D| / N$

Where
$\sum |D|$ = sum of the deviations of the items from mean, median or mode, ignoring signs
N = number of items
$|D|$ is read as "mod D", meaning that the deviations are taken without signs.

Steps
1. Calculate mean, median or mode of the series
2. Find the deviation of items from the mean, median or mode
3. Sum the deviations, ignoring signs, and obtain $\sum |D|$
4. Take the average of the deviations, $\sum |D| / N$, which is the mean deviation.
The co-efficient of mean deviation is the relative measure of mean deviation. It is
obtained by dividing the mean deviation by the particular average used for
measuring mean deviation.
If mean deviation is obtained from median, the co-efficient of mean deviation is obtained
by dividing mean deviation by median.
The co-efficient of mean deviation = mean deviation / median
If mean deviation is obtained from mean, the co-efficient of mean deviation is obtained
by dividing mean deviation by mean.
The co-efficient of mean deviation = mean deviation / mean
If mean deviation is obtained from mode, the co-efficient of mean deviation is obtained
by dividing mean deviation by mode.
The co-efficient of mean deviation = mean deviation / mode
Problems
Calculate mean deviation for the following data from mean
Daily wages : 15 18 20 25 30 35 40 42 45
Frequency : 2 3 5 10 12 10 5 2 1


Mean = 1473/50
= 29.46
Mean deviation $= \sum f|D| / N$
= 310.4/50
= 6.208
The co-efficient of mean deviation = mean deviation / mean
= 6.208 / 29.46
= 0.2107
Continuous series
The procedure remains the same. The only difference is that we have to obtain the
midpoints of the various classes and take deviations of these midpoints. The deviations
are multiplied by their corresponding frequencies. The value so obtained is added and its
average is the mean deviation.
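Calculate mean deviation for the following data.
Class : 5-10 10-15 15-20 20-25 25-30 30-35 35-40 40-45
Frequency : 6 5 15 10 5 4 3 2

The midpoints are 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5 and 42.5. Taking the assumed mean A = 22.5 and d = midpoint - A:
Arithmetic mean = A + $\sum fd / N$
= 22.5 + (-65)/50
= 22.5 - 1.3
= 21.2
Mean deviation from mean = $\sum f|D| / N$
= 362.4/50
= 7.248
The co-efficient of mean deviation = mean deviation / mean
= 7.248 / 21.2
= .3419
Mean deviation from median
To find median
The cumulative frequencies are 6, 11, 26, ..., so the median class is 15-20, with L1 = 15, n/2 = 25, m = 11, f = 15 and C = 5.
Median = $L_1 + \frac{n/2 - m}{f} \times C$
= 15 + ((25 - 11) / 15) X 5
= 15 + (14/15) X 5
= 15 + 70/15
= 15 + 4.67
= 19.67
Mean deviation from median = $\sum f|D| / N$
= 359.33/50
= 7.187
The co-efficient of mean deviation = mean deviation / median
= 7.187/19.67
= .3654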
Mean deviation from mode
To find mode: the modal class is 15-20, with L1 = 15, f1 = 15, f0 = 5, f2 = 10 and C = 5.
Mode = $L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C$
= 15 + ((15 - 5) / (2 X 15 - 5 - 10)) X 5
= 15 + (10 / (30 - 5 - 10)) X 5
= 15 + (10 / 15) X 5
= 15 + 3.33
= 18.33

Mean deviation from mode = $\sum f|D| / N$
= 356.67/50
= 7.13
The co-efficient of mean deviation = mean deviation / mode
= 7.13 / 18.33
= .389
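The mode calculation and the mean deviation about it can be checked with a short Python sketch. The helper names `grouped_mode` and `grouped_mean_deviation` are hypothetical, introduced only for this illustration; the classes and frequencies are those of the continuous series above.

```python
def grouped_mode(lower, width, f1, f0, f2):
    """Mode by interpolation inside the modal class.

    lower: lower limit of the modal class, width: class interval,
    f1: frequency of the modal class, f0 / f2: frequencies of the
    classes just before and just after it.
    """
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * width


def grouped_mean_deviation(class_limits, freqs, centre):
    """Mean deviation of the class midpoints about a given centre."""
    midpoints = [(lo + hi) / 2 for lo, hi in class_limits]
    n = sum(freqs)
    return sum(f * abs(m - centre) for m, f in zip(midpoints, freqs)) / n


classes = [(5, 10), (10, 15), (15, 20), (20, 25),
           (25, 30), (30, 35), (35, 40), (40, 45)]
freqs = [6, 5, 15, 10, 5, 4, 3, 2]

mode = grouped_mode(lower=15, width=5, f1=15, f0=5, f2=10)
md_mode = grouped_mean_deviation(classes, freqs, mode)
print(round(mode, 2), round(md_mode, 2))  # 18.33 7.13
```

The same `grouped_mean_deviation` helper can be reused with the mean or the median as the centre.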
Merits of Mean Deviation
1. Mean deviation is simple to understand and easy to calculate
2. It is based on each and every item of the distribution
3. It is less affected by the values of extreme items compared to standard deviation.
4. Since deviations are taken from a central value, comparison about formation of
different distribution can be easily made.
Demerits of Mean Deviation
1. Algebraic signs are ignored while taking the deviations of the items.
2. Mean deviation gives the best result when it is calculated from median. But
median is not a satisfactory measure when variability is very high.
3. Various methods give different results.
4. It is not capable of further mathematical treatment.
5. It is rarely used for sociological studies.
Standard deviation
Standard deviation is the most important measure of dispersion. It satisfies most of the
properties of a good measure of dispersion. It was introduced by Karl Pearson in 1893. Standard
deviation is defined as the square root of the mean of the squared deviations from the arithmetic mean.
Standard deviation is denoted by the Greek letter $\sigma$ (sigma).
Both mean deviation and standard deviation are calculated from the deviations of each and every item.
Standard deviation differs from mean deviation in two respects. First, algebraic signs
are ignored in calculating mean deviation, whereas they are taken into account in calculating
standard deviation. Secondly, mean deviation can be found from the mean, median or mode,
whereas standard deviation is found only from the mean.
Standard deviation can be computed by two methods
1. Taking deviation from actual mean
2. Taking deviation from assumed mean.
Formula for finding standard deviation is
$\sigma = \sqrt{\sum (x - \bar{x})^2 / N}$
Steps
1. Calculate the actual mean of the series, $\bar{x} = \sum x / N$
2. Take the deviation of each item from the mean, $(x - \bar{x})$
3. Find the square of each deviation from the actual mean, $(x - \bar{x})^2$
4. Sum the squares of the deviations, $\sum (x - \bar{x})^2$
5. Find the average of the squares of the deviations, $\sum (x - \bar{x})^2 / N$
6. Take the square root of the average of the squared deviations
Problems
1. Calculate the standard deviation of the following data
49 50 65 58 42 60 51 48 68 59
Standard deviation from actual mean
Arithmetic mean = $\sum x / N$
= 550 /10
= 55

S.D = $\sqrt{\sum (x - \bar{x})^2 / N}$
= $\sqrt{614 / 10}$
= $\sqrt{61.4}$
= 7.836
Standard deviation from assumed mean
Assumed mean = 50
Taking deviations d = x - 50, $\sum d$ = 50 and $\sum d^2$ = 864.

S.D = $\sqrt{\sum d^2 / N - (\sum d / N)^2}$
= $\sqrt{864/10 - (50/10)^2}$
= $\sqrt{86.4 - 5^2}$
= $\sqrt{86.4 - 25}$
= $\sqrt{61.4}$
= 7.836
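That both methods agree can be confirmed with a small Python sketch. The function names `sd_actual_mean` and `sd_assumed_mean` are hypothetical, chosen for this illustration; the data are the ten items of the problem above.

```python
from math import sqrt


def sd_actual_mean(xs):
    """Standard deviation from deviations about the actual mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sd_assumed_mean(xs, assumed):
    """Standard deviation from deviations about an assumed mean.

    The second term corrects for the gap between the assumed and actual mean.
    """
    n = len(xs)
    d = [x - assumed for x in xs]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


data = [49, 50, 65, 58, 42, 60, 51, 48, 68, 59]
print(round(sd_actual_mean(data), 3))       # 7.836
print(round(sd_assumed_mean(data, 50), 3))  # 7.836
```

Any assumed mean gives the same result; a value close to the actual mean simply keeps the hand arithmetic small.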
Discrete Series
Standard deviation can be obtained by three methods.
1. Direct method
2. Short cut method
3. Step deviation
Direct method
Under this method formula is
S.D = $\sqrt{\sum f x^2 / N - (\sum f x / N)^2}$

Calculate standard deviation for the following frequency distribution.
Marks : 20 30 40 50 60 70
Frequency : 8 12 20 10 6 4

S.D = $\sqrt{\sum f x^2 / N - (\sum f x / N)^2}$
= $\sqrt{112200/60 - (2460/60)^2}$
= $\sqrt{1870 - 41^2}$
= $\sqrt{1870 - 1681}$
= $\sqrt{189}$
= 13.748
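The direct method for a discrete series can be reproduced with the following sketch. The function name `sd_frequency` is an assumption made for the example; the marks and frequencies are those of the problem above.

```python
from math import sqrt


def sd_frequency(values, freqs):
    """Direct method: sqrt(sum(f*x^2)/N - (sum(f*x)/N)^2)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    return sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


marks = [20, 30, 40, 50, 60, 70]
freqs = [8, 12, 20, 10, 6, 4]
print(round(sd_frequency(marks, freqs), 2))  # 13.75
```

Here N is the total frequency (60), the sum of f times x is 2460 and the sum of f times x squared is 112200, matching the working above.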
Correlation Analysis
Economic and business variables are related. For instance, the demand for and supply of a
commodity are related to its price. Demand for a commodity increases as its price falls
and decreases as its price rises. We say demand and price are
inversely related or negatively correlated. But sellers supply more of a commodity when
its price rises, and supply of the commodity decreases when its price falls. We say supply and
price are directly related or positively correlated. Thus, correlation indicates the
relationship between two variables in which a change in the value of one variable is
accompanied by a change in the value of the other variable.
According to L.R. Connor, if two or more quantities vary in sympathy so that
movements in the one tend to be accompanied by corresponding movements in the
other(s), they are said to be correlated.
W.I. King defined it thus: correlation means that between two series or groups of data, there
exists some causal connection.
The definitions make it clear that the term correlation refers to the study of the relationship
between two or more variables. Correlation is a statistical device which studies the
relationship between variables. If two variables are correlated, a change in
the value of one variable is accompanied by a corresponding change in the value of the other variable.
Heights and weights of a group of people, ages of husbands and wives etc., are examples
of bivariate data that change together.
Correlation and Causation
Although the term correlation is used in the sense of mutual dependence of two or more
variables, it is not always necessary that they have a cause and effect relation. Even a high
degree of correlation between two variables does not necessarily indicate a cause and
effect relationship between them. Correlation between two variables can be due to the
following reasons:
1. Cause and effect relationship: Heat and temperature are cause and effect variables.
Heat is the cause of temperature. The higher the heat, the higher will be the temperature.
2. Both the correlated variables are being affected by a third variable. For instance,
the price of rice and the price of sugar are affected by rainfall. Here there may not be any
cause and effect relation between the price of rice and the price of sugar.
3. Related variables may be mutually affecting each other so that neither of them is
solely a cause or an effect. Demand may be the result of price, but there are also cases
when price rises due to increased demand.
4. The correlation may be due to chance. For instance, a small sample may show
a correlation between wages and productivity in which higher wages go with lower
productivity. In real life this need not be true. Such correlation is due to chance.
5. There might be a situation of nonsense or spurious correlation between two
variables. For instance, the number of divorces and television
exports may appear correlated, although there cannot be any real relationship between divorces and
exports of televisions.
The above points make it clear that correlation is only a statistical relationship and it does
not necessarily signify a cause and effect relationship between the variables.
Types of Correlation Analysis
Correlation can be:
- Positive or negative
- Linear or non-linear
- Simple, multiple or partial

Positive and Negative Correlation
When values of two variables move in the same direction, correlation is said to be positive.
When prices rise, supply increases and when prices fall supply decreases. In this case, an
increase in the value of one variable, on an average, results in an increase in the value of the other
variable, or a decrease in the value of one variable, on an average, results in a decrease in the
value of the other variable.
If, on the other hand, values of two variables move in opposite directions, correlation is said
to be negative. When prices rise, demand decreases and when prices fall demand increases. In
this case, an increase in the value of one variable, on an average, results in a decrease in the
value of the other variable.
Linear and Non-Linear Correlation
When the change in one variable leads to a constant ratio of change in the other variable,
correlation is said to be linear. In the case of linear correlation, the points plotted on a
graph will give a straight line. Correlation is said to be non-linear when the change in one
variable is not accompanied by a constant ratio of change in the other variable. In the case of
non-linear correlation, the points plotted on a graph do not give a straight line. It is also called
curvilinear correlation because the graph of such correlation results in a curve.
Simple, Partial and Multiple Correlations
Simple correlation studies the relationship between two variables only. For instance, correlation
between price and demand is simple as only two variables are studied in this case. Multiple
correlation studies the relationship of one variable with many variables. For instance, correlation of
agricultural production with rainfall, fertilizer use and seed quality is a multiple correlation.
Partial correlation studies the relationship of a variable with one of the many variables with
which it is related, keeping the others constant. For instance, seed quality, temperature and rainfall are three variables which
determine the yield of a crop. In this case, the correlation between yield and rainfall alone, with seed quality and temperature held constant, is a partial correlation.
Utility of Correlation
Study of correlation is of immense practical use in business and economics.
o Correlation analysis enables us to measure the magnitude of relationship existing
between variables under study.
o Once we establish correlation, we can estimate the value of one variable on the
basis of the other. This is done with the help of regression equations.
o The correlation study is useful for the formulation of economic policies. In
economics, we are interested in estimating the important dependent variables on the
basis of independent variables.
o Correlation study helps us to make relatively more dependable forecasts.
Methods of Studying Correlation
Following methods are used in the study of correlation:
o Scatter diagram
o Karl Pearson's method of Correlation
o Spearman's Rank correlation method
o Concurrent Deviation method
Scatter Diagram
This is a graphical method of studying correlation between two variables. In scatter diagram,
one variable is measured on the x-axis and the other is measured on the y-axis of the graph.
Each pair of values is plotted on the graph by means of dot marks. If plotted points do not show
any trend, two variables are not correlated. If the trend shows upward rising movement,
correlation is positive. If the trend is downward sloping, correlation is negative.

Karl Pearson's Co-Efficient of Correlation
Karl Pearson's Co-Efficient of Correlation is a mathematical method for measuring correlation.
Karl Pearson developed the correlation from the covariance between two sets of variables. Karl
Pearson's Co-Efficient of Correlation is denoted by the symbol r. The formula for obtaining Karl
Pearson's Co-Efficient of Correlation is:
Direct method
$r = \frac{\text{Covariance between } x \text{ and } y}{SD_x \times SD_y}$
Covariance between x and y = $\sum xy / N - (\sum x / N)(\sum y / N)$
$SD_x$ = standard deviation of x series = $\sqrt{\sum x^2 / N - (\sum x / N)^2}$
$SD_y$ = standard deviation of y series = $\sqrt{\sum y^2 / N - (\sum y / N)^2}$
Shortcut Method using Assumed Mean
If the short cut method is used with assumed means, the formula for obtaining Karl Pearson's
Co-Efficient of Correlation is the same ratio, with the deviations dx and dy from the assumed means in place of x and y:
Covariance between x and y = $\sum dx\,dy / N - (\sum dx / N)(\sum dy / N)$
$SD_x = \sqrt{\sum dx^2 / N - (\sum dx / N)^2}$
$SD_y = \sqrt{\sum dy^2 / N - (\sum dy / N)^2}$

Steps in calculating Karl Pearson's Correlation Coefficient using Shortcut Method
o Assume means of x and y series
o Take deviations of x and y series from the assumed means to get dx and dy
o Square dx and dy and sum the squares to get $\sum dx^2$ and $\sum dy^2$.
o Multiply the corresponding deviations of x and y series and total the products to
get $\sum dx\,dy$.

Shortcut Method using Arithmetic Mean
If the deviations are taken from the actual arithmetic means, $\sum dx = 0$ and $\sum dy = 0$, and the formula
for obtaining Karl Pearson's Co-Efficient of Correlation becomes:
$r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \times \sum dy^2}}$

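The direct and assumed-mean formulas for r can be checked against each other with a short Python sketch. The function name `pearson_r`, its parameters and the five pairs of values are assumptions made for illustration only; they are not taken from the text.

```python
from math import sqrt


def pearson_r(xs, ys, assumed_x=0.0, assumed_y=0.0):
    """Karl Pearson's r from deviations about assumed means.

    With assumed means of 0 this is the direct method; any other
    assumed means must give the same value of r.
    """
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy)) / n - (sum(dx) / n) * (sum(dy) / n)
    sd_x = sqrt(sum(a * a for a in dx) / n - (sum(dx) / n) ** 2)
    sd_y = sqrt(sum(b * b for b in dy) / n - (sum(dy) / n) ** 2)
    return cov / (sd_x * sd_y)


# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(round(pearson_r(x, y), 4))        # direct method: 0.7746
print(round(pearson_r(x, y, 2, 3), 4))  # assumed means 2 and 3: 0.7746
```

Because r depends only on deviations, shifting both series by any assumed means leaves it unchanged, which is what makes the shortcut method valid.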
Interpreting Co-Efficient of Correlation
The Co-Efficient of Correlation measures the correlation between two variables. The value of the
Co-Efficient of Correlation always lies between +1 and -1. It can be interpreted in the following
ways.
If the value of Co-Efficient of Correlation r is 1, it is interpreted as perfect positive correlation.
If the value of Co-Efficient of Correlation r is -1, it is interpreted as perfect negative correlation.
If the value of Co-Efficient of Correlation r is 0 < r < 0.5, it is interpreted as poor positive
correlation.
If the value of Co-Efficient of Correlation r is 0.5 < r < 1, it is interpreted as good positive
correlation.
If the value of Co-Efficient of Correlation r is 0 > r > -0.5, it is interpreted as poor negative
correlation.
If the value of Co-Efficient of Correlation r is -0.5 > r > -1, it is interpreted as good negative
correlation.
If the value of Co-Efficient of Correlation r is 0, it is interpreted as zero correlation.
Probable Error
Probable Error of Correlation coefficient is estimated to find out the extent to which the
value of r is dependable. If Probable Error is added to or subtracted from the correlation
coefficient, it would give such limits within which we can reasonably expect the value of
correlation to vary.
If the coefficient of correlation is less than Probable Error it will not be significant. If the
coefficient of correlation r is more than six times the Probable Error, correlation is definitely
significant. If Probable Error is 0.5 or more, it is generally considered as significant. Probable
Error is estimated by the following formula
$PE = 0.6745 \times \frac{1 - r^2}{\sqrt{N}}$
Coefficient of Determination
Besides probable error, another important method of interpreting the coefficient of
correlation is the Coefficient of Determination. The Coefficient of Determination is the square
of the correlation coefficient, or $r^2$. For instance, suppose the coefficient of correlation between price and
supply is 0.8. We calculate the coefficient of determination as $r^2$, which is $0.8^2$ or 0.64. It
means that 64% of the variation in supply is on account of changes in price.
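Probable error and the coefficient of determination are both simple functions of r, as the sketch below shows. The function names are hypothetical, and the sample size of 25 pairs is an assumed figure chosen only to make the example run; the value r = 0.8 is the one used in the text.

```python
from math import sqrt


def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


def coefficient_of_determination(r):
    """Share of variation explained: the square of r."""
    return r ** 2


r = 0.8
n = 25  # assumed number of pairs, not given in the text

pe = probable_error(r, n)
print(round(pe, 4))                               # 0.0486
print(round(coefficient_of_determination(r), 2))  # 0.64
print(abs(r) > 6 * pe)                            # True: r exceeds six times PE
print(round(r - pe, 4), round(r + pe, 4))         # limits for r: 0.7514 0.8486
```

With these figures r is far more than six times its probable error, so by the rule stated above the correlation would be treated as significant.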
Spearman's Rank Correlation Method
Charles Edward Spearman, a British psychologist, devised a method for measuring
correlation between two variables based on ranks given to the observations. This method
is adopted when the variables are not capable of quantitative measurement, like
intelligence, beauty etc. In such cases, it is impossible to assign numerical values to the
changes taking place in such variables, and it is here that rank correlation is useful.
Spearman's rank correlation coefficient is given by
$r_k = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$
Where D is the difference between the two ranks of a pair and n is the number of pairs correlated.
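A minimal Python sketch of the rank correlation formula follows. The function name `spearman_rank` and the two sets of ranks (imagined as two judges ranking five candidates) are hypothetical; the sketch also assumes there are no tied ranks.

```python
def spearman_rank(rank_x, rank_y):
    """Spearman's coefficient: 1 - 6 * sum(D^2) / (n * (n^2 - 1)).

    Assumes the inputs are already ranks and contain no ties.
    """
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks given by two judges to five candidates.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

print(round(spearman_rank(judge_a, judge_b), 2))  # 0.8
```

Here the rank differences are -1, 1, -1, 1 and 0, so the sum of squared differences is 4 and the coefficient is 1 - 24/120.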
Concurrent Deviation Method
In this method, correlation is calculated between the directions of deviations and not their
magnitudes. As such, only the direction of deviations is taken into account in the
calculation of this coefficient and their magnitude is ignored.
The formula for the calculation of the coefficient of concurrent deviations is given below:
$r_c = \pm \sqrt{\pm \frac{2C - n}{n}}$
Here C is the number of concurrent deviations and n is the number of pairs of deviations compared. The sign of (2C - n) is used both inside and outside the square root, so that the quantity under the root is never negative.
Steps in the Calculation of Concurrent Deviation
o Find out the direction of change of the x-variable. When a successive figure in the
series increases, the direction is marked as +, and when a successive figure in the series
decreases, the direction of change is marked as -. It is denoted as dx.
o Find out the direction of change of the y-variable in the same way. It is denoted as dy.
o Multiply dx and dy and determine the value of C. C is the number of positive
products of dxdy
(- X - or + X +).
o Use the formula $r_c = \pm \sqrt{\pm \frac{2C - n}{n}}$ to obtain the value of the coefficient $r_c$.
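The steps above can be sketched in Python as follows. The function name `concurrent_deviation` and the two series are hypothetical. Treating an unchanged successive value as a direction of 0, which is never counted as concurrent, is an assumption of this sketch and is not stated in the text.

```python
from math import sqrt


def concurrent_deviation(xs, ys):
    """Coefficient of concurrent deviations from two equal-length series."""

    def directions(series):
        # +1 for a rise, -1 for a fall, 0 for no change (assumed handling).
        return [1 if cur > prev else -1 if cur < prev else 0
                for prev, cur in zip(series, series[1:])]

    dx, dy = directions(xs), directions(ys)
    n = len(dx)  # number of pairs of deviations
    c = sum(1 for a, b in zip(dx, dy) if a * b > 0)  # concurrent deviations
    ratio = (2 * c - n) / n
    # Carry the sign of (2C - n) inside and outside the square root.
    return sqrt(ratio) if ratio >= 0 else -sqrt(-ratio)


# Hypothetical series, for illustration only.
x = [10, 12, 11, 15, 18, 17, 20]
y = [20, 22, 21, 25, 24, 23, 27]

print(round(concurrent_deviation(x, y), 3))  # 0.816
```

In this example five of the six pairs of deviations move in the same direction, so C = 5 and n = 6.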
Problems
1. Calculate Karl Pearson's co-efficient of correlation for the following data.
X : 43 44 46 40 44 42 45 42 38 40 42 57
Y : 29 31 19 18 19 27 27 29 41 30 26 10

Using the shortcut method with assumed means of 40 for the x series and 21 for the y series, so that dx = x - 40 and dy = y - 21:
$\sum dx\,dy$ = -115
N = 12
$\sum dx$ = 43
$\sum dy$ = 54
$\sum dx^2$ = 407
$\sum dy^2$ = 944

Covariance between x and y = $\sum dx\,dy / N - (\sum dx / N)(\sum dy / N)$
= -115/12 - (43/12)(54/12)
= -9.583 - 16.125
= -25.708
$SD_x = \sqrt{\sum dx^2 / N - (\sum dx / N)^2}$ = $\sqrt{33.917 - 12.840}$ = 4.591
$SD_y = \sqrt{\sum dy^2 / N - (\sum dy / N)^2}$ = $\sqrt{78.667 - 20.25}$ = 7.643
r = -25.708 / (4.591 X 7.643)
= -0.733
Interpretation: There is good negative correlation between x and y variable.

Summary
Data processing is an intermediary stage of work between data collections and data
interpretation. The various steps in processing of data may be stated as:
- Identifying the data structures
- Editing the data
- Coding and classifying the data
- Transcription of data
- Tabulation of data.
The identification of the nodal points and the relationships among the nodes could
sometimes be a more complex task than estimated. When the task is complex, involving
several types of instruments being collected for the same research question, the
procedures for drawing the data structure would involve a series of steps. Data editing
happens at two stages, one at the time of recording the data and second at the time of
analysis of data. All editing and cleaning steps are documented, so that the redefinition of
variables or later analytical modification requirements could be easily incorporated into
the data sets. The editing step checks for the completeness, accuracy and uniformity of
the data set created by the researcher. The edited data are then subject to codification and
classification. Coding process assigns numerals or other symbols to the several responses
of the data set. It is therefore a pre-requisite to prepare a coding scheme for the data set.
The recording of the data is done on the basis of this coding scheme.
o Numeric Coding: Coding need not necessarily be numeric. It can also be
alphabetic. Coding has to be compulsorily numeric, when the variable is subject
to further parametric analysis.
o Alphabetic Coding: A mere tabulation or frequency count or graphical
representation of the variable may be given an alphabetic coding.
o Zero Coding: A coding of zero has to be assigned carefully to a variable.
The transcription of data can be used to summarize and arrange the data in compact form
for further analysis. Computerized tabulation is easy with the help of software packages.
Frequency tables provide a shorthand summary of data. The importance of presenting
statistical data in tabular form needs no emphasis. The major components of a table are:
- A Heading:
- Table Number
- Title of the Table
- Designation of units
- B Body
- Stub-head, Heading of all rows or blocks of sub items
- Body-head: Headings of all columns or main captions and their sub-captions.
- Field/body: The cells in rows and columns.
- C Notations:
- Footnotes, wherever applicable.
- Source, wherever applicable.
Variables that are classified according to magnitude or size are often arranged in the form
of a frequency table. In constructing this table, it is necessary to determine the number of
class intervals to be used and the size of the class intervals. The most commonly used
graphic forms may be grouped into the following categories:
- Line Graphs or Charts
- Bar Charts
- Segmental presentations.
- Scatter plots
- Bubble charts
- Stock plots
- Pictographs
- Chernoff Faces
Copyright 2009 SMU
Powered by Sikkim Manipal University
MB0034- Unit 12 -Research Report Writing
Unit 12 -Research Report Writing
Meaning of Research Report
Research report is a means for communicating research experience to others. A research report is
a formal statement of the research process and its results. It narrates the problem studied, the methods
used for studying it and the findings and conclusions of the study.
Objectives:
After learning this lesson you should be able to understand:
- Purpose of Research Report
- Characteristics of Research Report
- Functions of Research Report
- Types of Research Report
- Contents of Reports
- Styles of Reporting
- Steps in Drafting Reports
- Editing the Final Draft
- Evaluating the Final Drafts
Purpose of Research Report
The purpose of the research report is to communicate to interested persons the methodology and
the results of the study in such a manner as to enable them to understand the research process and
to determine its validity. The aim is not to convince but to convey what was done, why and what
was its outcome.
Characteristics of Research Report
Research report is a narrative and authoritative document on the outcome of a research effort. It
presents highly specific information for a clearly designated audience. It is a simple, readable
and accurate form of communication.

Functions of Research Report
It serves as a means for presenting the problem studied, methods and techniques used for
collecting and analyzing data, findings and conclusions and recommendations. It serves as a
basic reference material for future use.
- It is a means for judging the quality of research project.
- It is a means for evaluating the researcher's competency.
- It provides a systematic knowledge on problems and issues analyzed.

Types of Research Report
Research reports can be classified as:
- Technical reports
- Popular reports
- Interim reports
- Summary reports
- Research abstract
- Research article
These differ in terms of the degree of formality, physical form, scope, style and size.
Technical Reports
In a technical report, a comprehensive, full account of the research process and its outcome is
included. It covers all the aspects of the research process: a description of the problem studied,
the objectives of the study, the methods and techniques used, a detailed account of sampling, field and
other research procedures, sources of data, tools for data collection, methods of data processing
and analysis, detailed findings and conclusions, and suggestions.
Popular Reports
In a popular report the reader is less interested in the methodological details and more interested in
the findings of the study. Complicated statistics are avoided and pictorial devices are used. After
a brief introduction to the problem and the objectives of the study, an abstract of the findings of
the study, conclusions and recommendations are presented. More headlines, underlining, pictures
and graphs may be used. Sentences and paragraphs should be short.
Interim Report
When there is a time lag between data collection and presentation of the result, the study may
lose significance and usefulness. An interim report in such case can narrate what has been done
so far and what was its outcome. It presents a summary of the findings of that part of analysis
which has been completed.
Summary Reports
Summary report is meant for a lay audience, i.e., the general public. It is written in non-technical,
simple language with pictorial charts, and contains just the objectives, findings and their implications.
It is a short report of two to three pages.
Research Abstract
Research abstract is a short summary of a technical report. It is prepared by a doctoral student on
the eve of submitting the thesis. It contains a brief presentation of the statement of the problem,
the objectives of the study, methods and techniques used and an overview of the report. A brief
summary of the results of the study may also be included.
Research Article
Research article is designed for publication in a professional journal. A research article must be
clearly written in concise, unambiguous language. It must be logically organized, progressing
from a statement of the problem and the purpose of the study, through the analysis of evidence, to the
conclusions and implications.

Contents of the Research Report
The outline of a research report is given below:
I. Prefatory Items
- Title page
- Declaration
- Certificates
- Preface/ acknowledgements
- Table of contents
- List of tables
- List of graphs/ figures/ charts
- Abstract or synopsis
II. Body of the Report
- Introduction
- Theoretical background of the topic
- Statement of the problem
- Review of literature
- The scope of the study
- The objectives of the study
- Hypothesis to be tested
- Definition of the concepts
- Models if any
- Design of the study
- Methodology
- Method of data collection
- Sources of data
- Sampling plan
- Data collection instruments
- Field work
- Data processing and analysis plan
- Overview of the report
- Limitation of the study
- Results: findings and discussions
- Summary, conclusions and recommendations
III. Reference Material
- Bibliography
- Appendix
- Copies of data collection instruments
- Technical details on sampling plan
- Complex tables
- Glossary of new terms used.

Styles of Reporting
Communicate to a Specific Audience
The first step is to know the audience, its background, and its objectives. Most effective
presentations seem like conversations or memos to a particular person as opposed to an
amorphous group. Audience identification affects presentation decisions such as selecting the
material to be included and the level of presentation. Excessive detail or material presented at too
low a level can be boring. The audience can become irritated when material perceived as relevant
is excluded or the material is presented at too high a level. In an oral presentation, the presenter
can ask the audience whether they already know some of the material.
Frequently, a presentation must be addressed to two or more different audiences. There are ways
to deal with such a problem. In a written presentation, an executive summary at the outset can
provide an overview of the conclusions for the benefit of those in the audience who are not
interested in details. The presentation must respect the audience's time constraints. An appendix
can be used to reach some people selectively, without distracting the others. Sometimes the
introduction to a chapter or a section can convey the nature of the contents, which certain
audiences may bypass. In an oral presentation, the presence of multiple audiences should be
recognized.
Structure the Presentation
Each piece of the presentation should fit into the whole, just as individual pieces fit into a jigsaw
puzzle. The audience should not be left muttering about how a piece of material fits in. The solution to this is to provide a well-defined
structure. The structure should include an introduction, a body, and a summary. Further, each of
the major sections should be structured similarly. The precept is to tell the audience what you are
going to say, say it and then tell them what you said. Sometimes, however, you may want to withhold the
conclusion to create interest.
The introduction should play several roles. First, it should generate audience interest. A second
function is to identify the presentation's central idea or objective. Third, it should provide a road
map to the rest of the presentation so that the audience can picture its organisation and flow.
It is better to divide the body of the presentation into two to five parts. The audience will be able
to absorb only so much information. If that information can be aggregated into chunks, it will be
easier to assimilate. Sometimes the points to be made cannot be combined easily or naturally. In
that case, it is necessary to use a longer list. One way to structure the presentation is by the
research questions. Another method that is often useful when presenting the research proposal is
to base it on the research process. The most useful presentations will include a statement of
implications and recommendations relevant to the research purpose. However, when researcher
lacks information about the total situation because the research study addresses only a limited
aspect of it, the ability to generate recommendations may be limited.
The purpose of the presentation summary is to identify and underline the important points of the
presentation and to provide some repetition of their content. The summary should support the
presentation's communication objectives by helping the audience to retain the key parts of the
content. The audience should feel that there is a natural flow from one section to another.
Create Audience Interest
The audience should be motivated to read or listen to the presentation's major parts and to the
individual elements of each section. The audience should know why the presentation is relevant to
them and why each section was included. A section that cannot hold interest should be excluded
or relegated to an appendix.
The research purpose and objectives are good vehicles to provide motivation. The research
purpose should specify decisions to be made and should relate to the research questions. A
presentation that focuses on those research questions and their associated hypotheses will
naturally be tied to relevant decisions and hold audience interest. In contrast, a presentation that
attempts to report on all the questions that were included in the survey and in the
cross-tabulations often will be long, uninteresting and of little value.
As the analysis proceeds and the presentation is being prepared, the researcher should be on the
lookout for results that are exceptionally persuasive, relevant, interesting, or unusual.
Sometimes, the deviant respondent with strange answers can provide the most insight if his or
her responses are pursued and not discarded.


Be Specific and Visual
Avoid talking or writing in the abstract. If different members of the audience have different or
vague understandings of important concepts, there is a potential problem. Terms that are
ambiguous or not well known should be defined and illustrated or else omitted. The most
interesting presentations usually use specific stories, anecdotes, studies, or incidents to make
points.
Address Validity and Reliability Issues
The presentation should help the audience avoid misinterpreting the results. The wording of the
questions, the order in which they are asked, and the sampling design are among the design
dimensions that can lead to biased results and misinterpretations. The presentation should not
include an exhaustive description of all the design considerations. Nobody is interested in a
textbook discussion of the advantages of telephone over mail surveys, or how you locate homes
in an area sampling design.
The presentation should include some indication of the reliability of the results. At the minimum,
it always should be clear what sample size was involved. The key results should be supported by
more precise information in the form of interval estimates or a hypothesis test. The hypothesis
test basically indicates, given the sample size, what probability exists that the results were merely
an accident of sampling. If the probability of the latter is not low, then the results probably would
not be repeated. Do not imply more precision than is warranted.

Steps in Drafting the Research Report
Along with the related skill of working with and motivating people, the ability to communicate
effectively is undoubtedly the most important attribute a manager can have. Effective
communication between research users and research professionals is extremely important to the
research process. The formal presentation usually plays a key role in the communication effort.
Generally, presentations are made twice during the research process. First, there is the research
proposal presentation. Second, there is the presentation of the research results.
Guidelines for successful presentations
In general a presenter should:
- Communicate to a specific audience.
- Structure the presentation.
- Create audience interest
- Be specific and visual
- Address validity and reliability issues

Editing the Final draft
A research report requires clear organisation. Each chapter may be divided into two or more
sections with appropriate headings and in each section margin headings and paragraph headings
may be used to indicate subject shifts. Physical presentation is another aspect of organisation. A
page should not be fully filled in from top to bottom. Wider margins should be provided on both
sides and on top and bottom as well.
A centred section heading is placed in the centre of the page and is usually in solid font size. It is
separated from other textual material by two or three line spaces.
A marginal heading is used for a subdivision in each section. It starts from the left side margin
without leaving any space.
A paragraph heading is used to head an important aspect of the subject matter discussed in a
subdivision. There is some space between the margin and this heading.
The presentation should be free from spelling and grammar errors. A writer who is not strong in
grammar should get the manuscript corrected by a language expert.
Follow the rules of punctuation.
Use the present tense for presenting the findings of the study and for stating generalizations.
Do not use masculine nouns and pronouns when the content refers to both genders. Do not
abbreviate words in the text; spell them out in full. A footnote citation is indicated by placing an
index number, i.e., a superscript or numeral, at the point of reference. The reference style should
have a clear format and be used consistently.

Evaluating the Final Draft
The general guidelines discussed so far are applicable to both written and oral presentations.
However, it is important to generate a research report that will be interesting to read. Most
researchers are not trained in effective report writing. In their enthusiasm for research, they often
overlook the need for a good writing style. In writing a report, long sentences should be
reconsidered and the critical main points should stand out.
Here are some hints for effective report writing.
- Use main headings and subheadings to communicate the content of the material discussed.
- Use the present tense as much as possible to communicate information.
- Whether the presentation is written or oral, use active voice construction to make it lively
and interesting; passive voice is wordy and dull.
- Use computer-generated tables and graphs for effective presentations.
- Use informative headings.
- Use double-sided presentation if possible. For example, tables or graphs could be
presented on the left side of an open report and their descriptions on the right side.

Summary
A research report is a means of communicating research experience to others. The purpose of the
research report is to communicate to interested persons the methodology and the results of the
study in such a manner as to enable them to understand the research process and to determine its
validity. A research report is a narrative and authoritative document on the outcome of a research
effort. It presents highly specific information for a clearly designated audience. It serves as a
means of presenting the problem studied, the methods and techniques used for collecting and
analyzing data, and the findings, conclusions and recommendations. It serves as basic reference
material for future use. It is a means of judging the quality of a research project and of
evaluating the researcher's competency. It provides systematic knowledge on the problems and
issues analyzed. A technical report is a comprehensive, full report of the research process and its
outcome. It covers all aspects of the research process. In a popular report, the reader is less
interested in the methodological details and more interested in the findings of the study. An
interim report narrates what has been done so far and what its outcome was. It
presents a summary of the findings of that part of the analysis which has been completed. A summary
report is meant for a lay audience, i.e., the general public. It is written in non-technical, simple
language with pictorial charts, and contains just the objectives, the findings and their
implications. It is a short report of two to three pages. A research abstract is a short summary of
a technical report. It is prepared by a doctoral student on the eve of submitting the thesis. A
research article is designed for publication in a professional journal. It must be clearly written
in concise and unambiguous language.









Copyright 2009 SMU
Powered by Sikkim Manipal University