
Data Collection

Lesson 1: Understanding the Nature and Role of Data in Research

 Research is crucial in answering the problems or meeting the objectives of an
organization, firm, or researcher, with the support of data.
 Data are like pieces of a puzzle that provide a better picture of the behaviour or
performance of a certain organization.
 Data is important to help a company analyse its behaviour or performance in terms of
sales, revenue, cost, etc.

 Collection of Primary Data


 Primary data – those which are considered as first-hand information gathered for a
specific purpose; may be collected through: interview, questionnaire, observation, and
experiment.
 Secondary data – second-hand information that has already been gathered by an
institution such as the Securities and Exchange Commission, DepEd, or the Philippine
Statistics Authority.

 Interview to collect primary data:


 Oral exchange of questions and answers, where the researcher has to take down
notes and has the opportunity to give follow up questions that will lead to
collecting the most appropriate information needed to answer the research
problem.
 Using a questionnaire to collect primary data:
 This is the most common method of collecting data. A questionnaire contains a
set of questions and may be handed out personally or sent through email or
other messaging platforms, such as social media messaging features. The
questionnaire is also considered a data-collecting tool, which may be
researcher-developed, adapted, or adopted. Researcher-developed or adapted
instruments should undergo reliability and validity tests.
 Direct observation to collect primary data:
 Data may also be acquired through direct observation of a situation as it
happens. This may be relatively expensive and time-consuming, and it is also
prone to bias on the part of the researcher.

 Collection of Secondary Data


 Data may also be acquired through reviewing existing documents stored by others
that are made available to researchers.
 May include time-series data, cross-section data, panel data, and pooled data.
 To define the data needed by the researcher, it is a must to determine first the
objective of the study
Lesson 2: Collecting Data Using Appropriate Instruments

 Data collection – the process by which the researcher collects the information needed to
answer the research problem
 Collect information related to your study and obtain data that will answer the research
problem
 In quantitative research, the researcher decides what to study; asks specific, narrow
questions; collects quantifiable data from participants; analyses these numbers using
statistics; and conducts the inquiry in an unbiased, objective manner.

 Before beginning to collect the data needed for the study, consider the following:
a. The aim of the research and the questions to be answered
b. The type of data that are needed to be collected
c. The time frame and the resources
d. The methods and procedures that should be used and followed to process the data

 To collect high-quality data that is relevant to your purposes, follow these steps:
1. Define the aim of your research
 Recall the aim of your research.
 If you aim to test a hypothesis, measure something precisely, or gain large-
scale statistical insights, then collect quantitative data.
 If you aim to explore ideas, understand experiences, or gain detailed insights
into a specific context, collect qualitative data.
 If you have several aims, you can use a mixed-method approach that collects
both types of data.
2. Choose your data collection method
 Based on the data you want to collect, decide on which method is best suited
for your research.
 Consider the statement of the problem (SOP), and keep in mind what kind of
answer you need in order to address it.
3. Plan your data collection procedure
 Coming up with a systematic plan is a must
 Plan what you should do from the very beginning, way before collecting data
 All documents must be prepared: approval letter, permission letter, etc.
 Methodology must reflect all the preparations you need before carrying out
the collection of data
 Some variables can be measured directly, such as a participant’s age or
length of service in a company; however, most of the time, researchers are
interested in abstract concepts that can only be measured indirectly.
 In order to measure abstract conceptual ideas/variables, you need to turn them
into measurable observations through operationalization
 Operationalization – means to turn abstract concepts/ideas into measurable
observations. When planning how you will collect the data, you need to
translate the conceptual definition of what you want to study into an
operational definition of what you will actually measure.

 Example of Operationalization
 You have decided to use surveys to collect quantitative data. In this case, the
concept that you wanted to measure is the leadership of classroom presidents,
which you need to operationalize.
 You may operationalize this concept in two ways:
a. You ask the classroom presidents to rate their own leadership skills on 5-point
scales, assessing their ability to delegate, decisiveness, and dependability.
b. You ask their classmates to provide anonymous feedback on the classroom
presidents regarding the same topics.
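
The operationalization above can be sketched in a few lines of Python. This is only an illustration: the ratings are made up, and defining the leadership score as the mean of the three 5-point ratings is an assumption, not a rule from the lesson.

```python
from statistics import mean

# Hypothetical 5-point ratings (1 = very weak, 5 = very strong) that one
# classroom president gave to himself or herself on the three indicators.
self_rating = {"delegation": 4, "decisiveness": 5, "dependability": 4}

# Hypothetical anonymous ratings from three classmates on the same indicators.
peer_ratings = [
    {"delegation": 3, "decisiveness": 4, "dependability": 4},
    {"delegation": 4, "decisiveness": 4, "dependability": 5},
    {"delegation": 2, "decisiveness": 3, "dependability": 4},
]


def leadership_score(rating):
    """Assumed operational definition: the mean of the three indicator ratings."""
    return mean(list(rating.values()))


# Option (a): the president's own rating. Option (b): the classmates' average.
self_score = leadership_score(self_rating)
peer_score = mean([leadership_score(r) for r in peer_ratings])

print(f"Self-rated leadership score: {self_score:.2f}")
print(f"Peer-rated leadership score: {peer_score:.2f}")
```

Either score turns the abstract concept "leadership" into a number that can be compared across classroom presidents.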

 Sampling
 Sample – a specific group that you will collect data from
 The size of the sample is always less than the total size of the population, or the
entire group that you want to draw conclusions about
 In research, a population doesn’t always refer to people. It can mean a group
containing elements of anything you want to study, such as objects, events,
organizations, countries, species, organisms, etc.
 You may need to develop a sampling plan to obtain data systematically. This
involves defining a population and a sample.
 Your sampling method will determine how you recruit participants or obtain
measures for your study. To decide on a sampling method, you will need to
consider factors like the required sample size, accessibility of the sample, and
timeframe of the data collection

 Standardizing procedures
 If multiple researchers are involved, write a detailed manual to standardize data
collection procedures in your study.
 This means laying out specific step-by-step instructions so that everyone in your
research team collects data in a consistent way – for example, by conducting
experiments under the same conditions and using objective criteria to record and
categorize observations.

2.1: Pre-collection of data

The Preparation for the Data Collection

 Identify the unit of the analysis: who will be studied?


 Before collecting the data, you must first identify the unit of analysis. This is the
level from which your data will be gathered.
 For example, your data may be collected from an individual, a family, a school, a
school district, or any particular community
 There may be different units of analysis:
a. One for the dependent variable
b. One for the independent variable

 Population and Sample


 Population – a group of individuals that have the same characteristics
 A sample is a subgroup of the target population that the researcher plans to study
for the purpose of making generalizations about the target population
 Samples are only estimates
 Studying a sample is more efficient since it includes a small number of
participants and this will cost less.
 The difference between the sample estimate and the true population value is the
“sampling error”
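
As a rough sketch of sampling error, the Python snippet below draws a sample from a made-up population and compares the two means. The population (ages of 1,000 employees), the sample size of 50, and the fixed random seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch gives the same numbers on every run

# Hypothetical population: the ages of 1,000 employees.
population = [random.randint(20, 60) for _ in range(1000)]

# Draw a sample of 50 employees without replacement.
sample = random.sample(population, 50)

population_mean = mean(population)
sample_mean = mean(sample)

# Sampling error: the sample estimate minus the true population value.
sampling_error = sample_mean - population_mean

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
print(f"Sampling error:  {sampling_error:+.2f}")
```

Changing the seed or the sample size changes the error, which is why a sample mean is treated as an estimate rather than the exact population value.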

 Two types of sampling


1. Probability sampling – the selection of individuals from the population so that they
are representative of the population. Each member has a known chance of being
selected; in simple random sampling, every member has an equal chance.
2. Nonprobability sampling – the selection of participants because they are available,
convenient, or represent some characteristic the investigator wants to study. Members
of the population do not have an equal chance of being included in this sample.
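
The contrast between the two types can be sketched in Python. The sampling frame of 200 student ID numbers is hypothetical, simple random sampling stands in for probability sampling, and "the first 20 on the list" stands in for a convenience sample.

```python
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: ID numbers of 200 students.
population = list(range(1, 201))
sample_size = 20

# Probability sampling (simple random sampling): every student on the list
# has the same chance of being drawn.
random_sample = random.sample(population, sample_size)

# Nonprobability sampling (convenience sampling): take whoever is easiest to
# reach, represented here by the first 20 students on the list.
convenience_sample = population[:sample_size]

print("Random sample:     ", sorted(random_sample))
print("Convenience sample:", convenience_sample)
```

The convenience sample never includes students 21 to 200, so conclusions drawn from it may not generalize to the whole population.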

 Data-collection instruments and measurement processes


1. Survey/Questionnaires
 The questionnaire is a tool designed for the collection of quantitative data and
is widely used in construction research as it is a good research instrument for
collecting standardized data and making generalizations.
 Questionnaires can provide quick responses but adequate care must be taken
when developing questionnaires, to ensure you don’t influence the response
you receive
 The design of your questionnaire should reflect your research aims and
objectives.
2. Interviews
 Interviews are a tool mainly for the collection of qualitative data and are
popular as a data-collection tool because of their flexibility.
 Interviews may be done even in a quantitative study, usually to back up
explanations or interpretations of data.
 According to Silverman (1997: 98), interviews are active interactions between
two or more people leading to a negotiated contextually based result. These
interactions can come in a structured or semi-structured form to generate
insights and concepts.
 When planning and considering an interview, the following factors are taken
into consideration:
 Completeness
 Tact
 Precision
 Accuracy
 Confidentiality
 Interviews require specialized skills from the interviewer, who will need to
negotiate a good partnership with the respondent to ensure a highly detailed and
valid set of qualitative data is collected and transcribed effectively.
 In order to understand other persons’ constructions of reality, we would do well to
ask them […] and to ask them in such a way that they can tell us in their terms
[…] and in-depth which addresses the rich context that is the substance of their
meanings.
 Different types of interview:
 Individual, face-to-face verbal interchange
 Face-to-face group interviews (focus groups)
 Telephone surveys
 Interviews can be:
 Conducted as a one-time occurrence
 Conducted as multiple, longer sessions
 Structured, semi-structured, or unstructured
3. Observation
 Observation is a systematic data-collecting technique that involves watching
individuals in their natural environment or in a naturally occurring situation.
 The processes under observation are normal and not contrived. They can range
from individual cases to groups and whole communities. They provide highly
detailed information about natural processes. The data collection is laborious and
time-consuming and may have to be repeated to ensure reliability. However,
observation schedules based on a set of expectations can make data collection
easier.
 The level of observer participation can vary from wholly participant to a non-
participant. The non-participant observer has limited interaction with the people
being observed.
 Observers can collect data through field notes, video, or audio recordings, which
can be analyzed using qualitative analytical tools. If you code your observations
to exact numerical data, it can be analyzed using a quantitative approach.
 One of the main benefits of wholly or partially participant observation is
that the level of immersion and prolonged involvement with participants can lead
to good rapport, thereby encouraging participants to speak freely. This adds
to the richness of the collected data.
4. Document analysis
5. Laboratory experiments
6. Quasi-experiment
7. Scales (measuring and weighing tapes)

 Obtaining permissions: what permissions are needed?


 Institutional or organizational (ex. School or district)
 Site-specific (ex. Secondary school)
 Individual participants
 Parents of minor participants
 Campus approval (ex. University or college)
 Institutional review board (IRB)

 Linking Data Collection to Variables and Questions


 Types of data measures: what information to collect?
 Performance measures (ex. Test performance)
 Attitudinal measures (feelings toward topics)
 Behavioural measures
 Factual measures (documents, records)

 Locating or developing an instrument for data collection


2.2: Measuring the instrument

 Measuring instruments
 What is Reliability? Reliability means consistency: scores from measuring
variables are stable and consistent. The question that needs to be answered is,
“Do we get the same result if we measure repeatedly?”
 Types of reliability
1. Test-retest – scores are stable over time; Ask respondents again. Example: Measure a
second time using the same sample and the same instrument.
2. Alternate forms – equivalence of two measurements
3. Split half – Compare the same sample using different parts of the same instrument
(applies to survey instruments, where we often assess using a combination of
questions).
4. Inter-rater reliability – similarity in the observation of behavior by two or more
individuals; Compare the same sample across different interviewers/instruments.
5. Internal consistency – consistent scores across the instrument
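
Internal consistency is often summarized with Cronbach's alpha. The lesson does not name a specific statistic, so using alpha here is an assumption, and the five respondents and three items below are made-up data. A minimal Python sketch:

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items in the instrument
    items = list(zip(*rows))              # regroup the scores item by item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical answers of five respondents to three 5-point items.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 2, 3],
    [4, 4, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values closer to 1 suggest that the items produce consistent scores across the instrument.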
 What is Validity? Validity is about whether our measurement really measures our
concept (or what needs to be measured). Is it meaningful as a measurement tool for
this study?
 Validity: Scores from measuring variables that are meaningful

 Types of validity
1. Face validity - Does it even look like it is measuring the right thing?
2. Content validity - representative of all possible questions that could be asked; Does it
cover the range of meanings contained in the concept or study? For example, the concept
of prejudice contains prejudice on grounds of race, minority, gender, etc.
3. Criterion-referenced (predictive) validity - Scores are a predictor of an outcome or
criterion they are expected to predict. Does it predict the outcomes that we believe relate
to the concept or study? For example, you are to measure degrees of anxiety, your
instrument should be able to measure or predict the degrees of anxiety.
4. Construct validity - determination of the significance, meaning, purpose, and use of the
scores; Does it relate to the variables in the way our theory predicts? For example, if we
are measuring marital satisfaction and our theory says that it should be associated with
marital fidelity, is that true?

 Scales of measurement
A. Categorical measurement
1. Nominal: categories that describe traits or characteristics participants can check
2. Ordinal: participants rank order a characteristic, trait, or attribute
B. Continuous measurement
1. Interval: provides continuous response possibilities to questions with assumed
equal distance
2. Ratio: a scale with a true zero and equal distances among units
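
The scale of measurement affects which summary statistic makes sense. The pairing used below (mode for nominal, median for ordinal, mean for interval and ratio) is a common convention rather than something stated in this lesson, and the data are hypothetical.

```python
from statistics import mean, median, mode


def summarize(values, scale):
    """Return a summary statistic that suits the given scale of measurement."""
    if scale == "nominal":
        return mode(values)      # most frequent category
    if scale == "ordinal":
        return median(values)    # middle rank
    if scale in ("interval", "ratio"):
        return mean(values)      # arithmetic average
    raise ValueError(f"Unknown scale: {scale}")


# Hypothetical data for each kind of scale.
strand = ["STEM", "ABM", "STEM", "HUMSS", "STEM"]   # nominal
satisfaction_rank = [1, 2, 2, 3, 5]                 # ordinal
age = [16, 17, 17, 18, 17]                          # ratio

print("Most common strand:", summarize(strand, "nominal"))
print("Median rank:", summarize(satisfaction_rank, "ordinal"))
print("Mean age:", summarize(age, "ratio"))
```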

2.3: Procedures for Administering the Data Collection

1. Develop standard written procedures for administering an instrument


2. Train researchers to collect observational data

3. Obtain permission to collect and use public documents

4. Respect individuals and sites during data gathering (ethics)


Lesson 3: Data Organization

 The organization of data refers to the process of classifying collected data for ease
of presentation.
 It may be done by encoding data in a spreadsheet. However, before encoding
the data, make sure that the data set is accurate and complete. If you have
missing values, or a respondent failed to answer some of the items in the
questionnaire, do not encode anything for that item; just leave it blank.
 Do the same when organizing or encoding secondary data. Nevertheless, it is
better to have complete data with no missing values. To avoid missing any
necessary information, check all the items of the questionnaire before
leaving the participant or respondent.
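
A short Python sketch of why blank cells matter: the made-up spreadsheet below (written as CSV text) has two unanswered items, which are read as missing values and skipped when averaging. The column names and answers are hypothetical.

```python
import csv
import io
from statistics import mean

# Hypothetical encoded questionnaire data; blank cells are unanswered items.
raw = """respondent,q1,q2,q3
R01,4,5,3
R02,3,,4
R03,5,4,
"""

items = ("q1", "q2", "q3")
rows = []
for record in csv.DictReader(io.StringIO(raw)):
    row = {"respondent": record["respondent"]}
    for item in items:
        # Keep a blank cell as None instead of turning it into 0.
        row[item] = int(record[item]) if record[item] != "" else None
    rows.append(row)


def item_mean(rows, item):
    """Average an item over the respondents who actually answered it."""
    answered = [r[item] for r in rows if r[item] is not None]
    return mean(answered)


for item in items:
    print(item, item_mean(rows, item))
```

Had the blank for q2 been encoded as 0, its average would drop from 4.5 to 3, which is why a missing answer is left blank rather than filled in.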
Lesson 4: Presenting and Interpreting Data in Tabular and Graphical Forms

 To create and present an organized picture of information from a research
report, it is important to use certain techniques that communicate the findings and
interpretations of research studies in visual form.
 The common techniques being used to display results are tabular, textual, and graphical
methods.

 Textual presentation of data


 Uses words, statements, or paragraphs with numbers to describe data.
 Tabular presentation of data
 The Tabular Presentation of Data uses tables to present clear and organized data.
A table must be clear and simple but complete.
 A good table should include the following parts:
 Table number and title - these are placed above the table. The title is
usually written right after the table number.
 Caption subhead – refers to columns and rows
 Body – contains all the data under each subhead
 Source – indicates if the data is secondary and it should be acknowledged
 Graphical presentation of data
 A graph or chart portrays the visual presentation of data using symbols such as
lines, dots, bars or slices. It depicts the trend of a certain set of measurements or
shows a comparison between two or more sets of data or quantities.
 Line graph
 A line graph is a graphical presentation of data that shows a continuous change or
trend. It may show an ascending or descending trend.
 Bar graph
 A bar graph uses bars to compare categories of data. It may be drawn vertically or
horizontally. A vertical bar graph is best to use when comparing means or
percentages between distinct categories. The categories are measured
independently and compared with one another. A horizontal bar graph may
contain more than 5 categories. A bar graph is plotted on either the x-axis or the
y-axis.
 Pie charts or circle graphs
 A pie chart is usually used to show how parts of a whole compare to each other
and to the whole. The entire circle represents the total and the parts are
proportional to the amount of the total they represent.
 When presenting data using the Tabular and Graphical Presentations, it is a must
that a textual analysis follows each Table or Graph/Figure.
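
Because each slice of a pie chart is proportional to its share of the total, the slice sizes can be computed directly from the counts. The sketch below uses made-up counts of respondents per data-collection method; the function name is only illustrative.

```python
def pie_slices(counts):
    """Convert raw counts into the percentage and angle of each pie slice."""
    total = sum(counts.values())
    return {
        label: {"percent": count * 100 / total, "degrees": count * 360 / total}
        for label, count in counts.items()
    }


# Hypothetical number of respondents reached through each method.
counts = {"Interview": 10, "Questionnaire": 30, "Observation": 60}
slices = pie_slices(counts)

for label, size in slices.items():
    print(f"{label}: {size['percent']:.0f}% of the circle, {size['degrees']:.0f} degrees")
```

The percentages add up to 100 and the angles to 360, since the entire circle represents the total.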
Lesson 5: Using Statistical Techniques to Analyse Data

 What is Bivariate Analysis?


 Bivariate analysis means the analysis of bivariate data. It is one of the simplest
forms of statistical analysis, used to find out if there is a relationship between two
sets of values. It usually involves the variables X and Y.
 Univariate analysis is the analysis of one (“uni”) variable.
 Bivariate analysis is the analysis of exactly two variables.
 Multivariate analysis is the analysis of more than two variables.

 The results from bivariate analysis can be stored in a two-column data table.
 Bivariate analysis is not the same as two sample data analysis. With two sample
data analysis (like a two sample z test in Excel), the X and Y are not directly
related. You can also have a different number of data values in each sample; with
bivariate analysis, there is a Y value for each X.

 Types of Bivariate Analysis


1. Scatter plot – gives you a visual idea of the pattern your variables follow
2. Regression analysis – a catch-all term for a wide variety of tools that you can use
to determine how your data points might be related. For example, the points in a
scatter plot might look like they follow an exponential curve (as opposed to a straight
line). Regression analysis can give you the equation for that curve or line. It can also
give you the correlation coefficient.
3. Correlation coefficients – Correlation coefficients are usually calculated on a
computer, although they can also be computed by hand. This coefficient tells you
whether the variables are related. Basically, a zero means they aren’t correlated
(i.e. not linearly related), while a 1 (either positive or negative) means that the
variables are perfectly correlated (i.e. they are perfectly in sync with each other).
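
For readers who want to see the by-hand steps, here is a minimal Python sketch of the Pearson correlation coefficient. The lesson does not say which coefficient it means, so Pearson's r is an assumption, and the hours-studied and quiz-score data are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    if len(x) != len(y):
        raise ValueError("Bivariate data needs one Y value for each X value.")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of the deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical bivariate data: hours studied (X) and quiz score (Y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

With these numbers r is about 0.775, a fairly strong positive relationship: students who studied longer tended to score higher.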
5.2: Bivariate: Statistical Computing and Analysis

 What do we need to understand when using statistical computations and analysis?


1. Appropriateness – which method will best fit the study? The answer is, it depends
on how data were collected and what type of data was collected.
2. Interpretation - How should the results be interpreted? Basically, you need to relate
your interpretations back to theory/ theories.
3. Diagnostics – How should we check whether the methods are appropriate? Look at
the data collected and juxtapose it with the research questions. Does the
interpretation of the data answer what was asked?
 For the Bivariate analysis, we will cover, specifically, the Chi-square, T-test,
ANOVA, Correlation, Linear Regression, and Multiple regression.
 The analysis will always be based on the type of research questions a study has. The
research question may talk about differences and relationships.
 Under a research question that talks about differences, the type of bivariate analysis that a
researcher should use will be determined by the type of dependent variables. If the
dependent variable is ordinal or nominal, Chi-square should be used. On the other hand,
if the dependent variable is interval or ratio then, use T-test analysis.
 Chi-square (symbolized by χ²) is a statistical test that compares differences
between observed and expected frequencies.
 T-test (symbolized by t) compares the means of the dependent variable of two
groups to determine whether the difference is significant or just due to chance.
 ANOVA (analysis of variance) compares the difference in the means of the
dependent variable of more than two groups.
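
The test statistics for the chi-square test and the t-test can be sketched in plain Python. The data below are hypothetical, the t statistic shown is the independent-samples version with pooled variance (one of several variants), and the sketch stops at the statistic: judging significance still requires a p-value from a table or statistical software.

```python
import math
from statistics import mean, variance


def chi_square_statistic(observed):
    """Chi-square statistic for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    return chi2


def t_statistic(group_a, group_b):
    """Independent-samples t statistic using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical nominal data: rows are two sections, columns are pass / fail counts.
observed = [[20, 30], [30, 20]]
chi2 = chi_square_statistic(observed)

# Hypothetical interval data: quiz scores of two groups.
t = t_statistic([5, 6, 7, 8, 9], [3, 4, 5, 6, 7])

print(f"chi-square = {chi2:.2f}")
print(f"t = {t:.2f}")
```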
 Let us continue with bivariate analyses in which the research question talks about
relationships. If the researcher is not interested in making predictions, correlation
should be used. Otherwise, with one (1) predictor, use linear regression, and with
two (2) or more predictors, use multiple regression.
 Correlation (symbolized by r) is a statistical test to determine the strength of the
relationship between two variables. There are no dependent or independent variables to
look at, just the strength of the variables' relationship. It is illustrated on a scatterplot.
 Linear regression is a statistical test that uses one (1) predictor variable to predict one (1)
criterion variable. Regression is often illustrated by a scatterplot.
 In regression, the predictor (independent variable) is on the x-axis and the criterion
variable (dependent variable) is on the y-axis
 Multiple regression is a statistical test that predicts the criterion variable from two or
more predictors.
 R-squared (symbolized by R²) is a statistical measure of how close the data are
to the regression line. It is an indication of how good our model is at predicting
the criterion variable. R-squared ranges from 0 to 1. For example, if R-squared
is 0.4, our model explains 40% of the variation in the criterion variable.
 The beta coefficient shows how strongly each predictor predicts the dependent
variable. A standardized beta coefficient typically falls in the range of -1 to 1.
A positive beta coefficient means the relationship between the predictor and the
dependent variable is positive; a negative one means the relationship is negative.
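
To make linear regression and R-squared concrete, here is a minimal Python sketch that fits a straight line by least squares. The predictor and criterion values are made up, and the function names are only illustrative.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for one predictor and one criterion."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variation in y that the fitted line explains."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Hypothetical data: hours studied (predictor, x-axis) and quiz score (criterion, y-axis).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

slope, intercept = fit_line(hours, scores)
r2 = r_squared(hours, scores, slope, intercept)

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"R-squared = {r2:.2f}")
```

Here the slope is positive (0.6), so the relationship between the predictor and the criterion is positive, and an R-squared of 0.60 means the line explains about 60% of the variation in the scores.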
