
Chapter 3 Research Methodology

The objective of this chapter is to describe the research design. It explains the methodology used in the project and discusses the different statistical methods that are important for the analysis and that strengthen the significance of the study.

3.1 Introduction
It is very important to introduce and evaluate the methods used in the study, as they determine its statistical outline. In the past, various studies on rural banking have been conducted by different researchers, yet it remains difficult to study the effect of eBanking on the satisfaction level of customers in the rural market. Most studies on rural banking have been conducted in developing countries, especially in Africa and South Asia. In India, research has been undertaken, but its focus has largely been limited to Regional Rural Banks and microfinance companies operating in the rural market. Only a few studies have examined the presence of private banks in the rural market and the significance of internet banking. It is therefore important to analyse the effectiveness of eBanking in the rural market and to select appropriate statistical methods that increase the significance of the study. This chapter accordingly presents the methodology adopted for the study, covering data collection, data analysis and a detailed description of the procedures.

3.2 Research Design


A research design is a systematic plan to study a scientific problem. The design of a study defines the study type (descriptive, correlational, semi-experimental, experimental, review, meta-analytic) and sub-type (e.g., descriptive-longitudinal case study), the research question, hypotheses, independent and dependent variables, experimental design and, if applicable, data collection methods and a statistical analysis plan. A descriptive research design has been used in this study.

Descriptive research design is a method in which both qualitative and quantitative variables are used. Descriptive research includes three methods: (1) the observational method, (2) the case study method and (3) the survey method. In this study the survey method has been used. In survey research, participants answer questions administered through interviews or questionnaires, and the researchers then describe the responses given. A close-ended questionnaire has been used in this study, based on the eBanking satisfaction of people in the rural segment. The surveyed population was required to respond to the different variables on a five-point Likert scale, where 1 was rated as least satisfactory and 5 as most satisfactory. The collected data have been analysed through the IBM statistical software SPSS. At the outset, Cronbach's alpha was employed to check the internal consistency (reliability) of the data. Later on, the Kaiser-Meyer-Olkin and Bartlett's tests were conducted to test the sampling adequacy and sphericity of the collected data. To diagnose the problem of multicollinearity, the degree of correlation has been estimated. As the results showed a collinearity problem, factor analysis has been applied as a dimension-reduction tool. The results have further been analysed through regression to establish the relation of the regression factor (REGR) scores with the overall satisfaction level of customers.

3.3 Statement of Problem


Since the banking sector was opened to new private players in 1994, the Indian banking sector has grown at a rapid pace. Due to intense competition and the entry of new players, it is very tough for existing players to maintain their profitability in the market. As time passes, banks are moving beyond the urban market, and the new RBI guidelines are also increasing the pressure on banking players to increase their presence in the rural market. As a result, big private players such as HDFC Bank and ICICI Bank are following aggressive strategies to explore opportunities in the rural market. However, rural banking involves huge administration costs, which is a tough task for rural banks. Here eBanking can play a significant role in reducing the administration cost (cost of operations) for banks, so it is very important for private banks to understand the importance of eBanking. Banks can understand the significance of eBanking by conducting studies on their existing rural customers.

Hence this study aims at determining the satisfaction level of rural customers with eBanking. A simple questionnaire method has been used, and the collected data have been analysed with the IBM statistical software SPSS following the procedure described in Section 3.5: Cronbach's alpha to check internal consistency (reliability), the Kaiser-Meyer-Olkin and Bartlett's tests for sampling adequacy and sphericity, estimation of the degree of correlation to diagnose multicollinearity, factor analysis as a dimension-reduction tool, and regression analysis to establish the relation of the REGR factor scores with the overall satisfaction level of customers.

3.4 Objectives of the Study

Main Objective: The main objective of the study is to understand the role of eBanking in helping a bank to penetrate the rural market.

Secondary Objectives:
I. To review the scope of banking in the rural market.
II. To analyse the overall satisfaction of rural customers with e-banking services.
III. To identify the factors that influence rural customers' satisfaction with e-banking.
IV. To identify the primary obstacles hindering the wide acceptability and propensity to use e-banking as a primary banking channel in rural areas.
V. To find out the ability of rural people to understand banking activities.

3.5 Research Methodology


3.5.1 Data Collection

The study is primarily based upon primary data collected through a questionnaire administered to rural users of e-banking channels in different villages of Shimla District of Himachal Pradesh (India). The questionnaire comprises 2 general questions and 17 questions relating to the variables to be studied. The selection of variables is based upon previous research work. The surveyed population was required to respond to the different variables on a five-point Likert scale, where 1 was rated as least satisfactory and 5 as most satisfactory.

3.5.2 Analysis of Data


The collected data have been analysed through the IBM statistical software SPSS. At the outset, Cronbach's alpha has been employed to check the internal consistency (reliability) of the data. Later on, the Kaiser-Meyer-Olkin and Bartlett's tests were conducted to test the sampling adequacy and sphericity of the collected data. To diagnose the problem of multicollinearity, the degree of correlation has been estimated. As the results have shown a problem of collinearity, factor analysis has been done as a dimension-reduction tool. The results have further been analysed through regression to establish the relation of the regression factor (REGR) scores with the overall satisfaction level of customers.

3.6 Statistical Tools Used in the Study

3.6.1 Likert Scale: A method of ascribing quantitative value to qualitative data in order to make it amenable to statistical analysis. A numerical value is assigned to each potential choice, and a mean figure for all the responses is computed at the end of the evaluation or survey. Generally, five response categories are used, ranging from "strongly disagree" at one end to "strongly agree" at the other. The scale is quite useful for evaluating a respondent's opinion of important purchasing, product or satisfaction features. The scores can be used to create a chart of the distribution of opinion across the population, and for further analysis the mean score can be cross-tabulated with contributing factors. For the score to have meaning, each item in the scale should be closely related to the same topic of measurement. In this study the Likert scale is based on the satisfaction level of customers, where 1 stands for least satisfaction and 5 for most satisfaction.
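As a small illustration of the scoring described above, the sketch below computes the mean rating and the distribution of responses for one Likert item. The data are made up for illustration only; the actual analysis in this study was carried out in SPSS.

    from collections import Counter

    # Illustrative responses to one item (1 = least satisfied, 5 = most satisfied)
    responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

    mean_score = sum(responses) / len(responses)
    distribution = Counter(responses)          # how many respondents chose each point

    print(f"Mean satisfaction score: {mean_score:.2f}")
    for point in range(1, 6):
        print(f"Rated {point}: {distribution.get(point, 0)} respondent(s)")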

3.6.2 Cronbach's Alpha Test: Cronbach's alpha is a coefficient of internal consistency. It is commonly used as an estimate of the reliability of a psychometric test for a sample of examinees. It was first named alpha by Lee Cronbach in 1951, as he had intended to continue with further coefficients. The measure can be viewed as an extension of the Kuder-Richardson Formula 20 (KR-20), which is an equivalent measure for dichotomous items. Alpha is not robust against missing data. Several other Greek letters have been used by later researchers to designate other measures used in a similar context. The theoretical value of alpha varies from zero to 1, since it is the ratio of two variances. However, depending on the estimation procedure used, estimates of alpha can take on any value less than or equal to 1, including negative values, although only positive values make sense. Higher values of alpha are more desirable. Some professionals, as a rule of thumb, require a reliability of 0.70 or higher (obtained on a substantial sample) before they will use an instrument. Obviously, this rule should be applied with caution when alpha has been computed from items that systematically violate its assumptions. Furthermore, the appropriate degree of reliability depends upon the use of the instrument. Cronbach's alpha will generally increase as the intercorrelations among test items increase, and it is thus known as an internal-consistency estimate of the reliability of test scores. Because intercorrelations among test items are maximized when all items measure the same construct, Cronbach's alpha is widely believed to indirectly indicate the degree to which a set of items measures a single unidimensional latent construct. However, the average intercorrelation among items is affected by skew just like any other average. Thus, whereas the modal intercorrelation among test items will equal zero when the set of items measures several unrelated latent constructs, the average intercorrelation among test items will be greater than zero in this case. Indeed, several investigators have shown that alpha can take on quite high values even when the set of items measures several unrelated latent constructs. As a result, alpha is most appropriately used when the items measure different substantive areas within a single construct.

Commonly cited rules of thumb for interpreting Cronbach's alpha are:

    Cronbach's alpha           Internal consistency
    alpha >= 0.9               Excellent (high-stakes testing)
    0.7 <= alpha < 0.9         Good (low-stakes testing)
    0.6 <= alpha < 0.7         Acceptable
    0.5 <= alpha < 0.6         Poor
    alpha < 0.5                Unacceptable
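For reference, Cronbach's alpha for k items can be computed as alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score). The following minimal Python sketch illustrates this computation on made-up data; the variable names and values are illustrative assumptions only, as the analysis in this study was performed in SPSS.

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """Cronbach's alpha for an (n_respondents x k_items) matrix of scores."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)       # variance of each item
        total_variance = items.sum(axis=1).var(ddof=1)   # variance of the total score
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Illustrative data: 5 respondents x 4 Likert items (1-5)
    responses = np.array([[4, 5, 4, 4],
                          [3, 3, 2, 3],
                          [5, 5, 5, 4],
                          [2, 2, 3, 2],
                          [4, 4, 4, 5]])
    print(round(cronbach_alpha(responses), 3))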

3.6.3 Kaiser-Meyer-Olkin (KMO) Test: The KMO measure of sampling adequacy predicts whether data are likely to factor well, based on correlation and partial correlation. In the old days of manual factor analysis this was extremely useful. KMO can still be used, however, to assess which variables to drop from the model because they are too multicollinear.

There is a KMO statistic for each individual variable, and their sum is the KMO overall statistic. KMO varies from 0 to 1.0 and KMO overall should be .60 or higher to proceed with factor analysis. If it is not, drop the indicator variables with the lowest individual KMO statistic values, until KMO overall rises above .60.

To compute KMO overall, the numerator is the sum of squared correlations of all variables in the analysis (except the 1.0 self-correlations of variables with themselves, of course). The denominator is this same sum plus the sum of squared partial correlations of each variable i with each variable j, controlling for others in the analysis. The concept is that the partial correlations should not be very large if one is to expect distinct factors to emerge from factor analysis.
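In symbols, writing r_ij for the observed correlations and p_ij for the partial correlations described above, the overall KMO statistic is

KMO = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} p_{ij}^{2}}.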

It is an index used to examine whether sufficient correlation exists among the variables, and hence whether the sampling is adequate. It compares the magnitudes of the observed correlation coefficients with the partial correlation coefficients. The minimum acceptable value of KMO is 0.50.

3.6.4 Bartlett's Test: Bartlett's test (see Snedecor and Cochran, 1989) is used to test whether k samples are from populations with equal variances. Equality of variances across samples is called homoscedasticity or homogeneity of variances. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples, and the Bartlett test can be used to verify that assumption. Bartlett's test is sensitive to departures from normality; that is, if the samples come from non-normal distributions, Bartlett's test may simply be testing for non-normality. Levene's test and the Brown-Forsythe test are alternatives to the Bartlett test that are less sensitive to departures from normality.

Bartlett's test is used to test the null hypothesis H0 that all k population variances are equal against the alternative that at least two are different. If there are k samples with sizes N_i and sample variances s_i^2, and N = \sum_{i=1}^{k} N_i denotes the total sample size, then Bartlett's test statistic is

T = \frac{(N-k)\,\ln s_p^{2} - \sum_{i=1}^{k} (N_i - 1)\,\ln s_i^{2}}{1 + \frac{1}{3(k-1)} \left( \sum_{i=1}^{k} \frac{1}{N_i - 1} - \frac{1}{N-k} \right)},
\qquad
s_p^{2} = \frac{\sum_{i=1}^{k} (N_i - 1)\, s_i^{2}}{N - k},

where s_p^2 is the pooled estimate of the variance. The null hypothesis is rejected when T exceeds the critical value of the chi-square distribution with k - 1 degrees of freedom.
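As an illustration of the homogeneity-of-variances version of the test described above, SciPy provides scipy.stats.bartlett. The sketch below uses made-up sample data, so the groups and values are illustrative assumptions only.

    from scipy import stats

    # Three illustrative samples (e.g., satisfaction scores from three villages)
    group_a = [3, 4, 4, 5, 3, 4]
    group_b = [2, 3, 2, 4, 3, 3]
    group_c = [5, 4, 5, 5, 4, 4]

    statistic, p_value = stats.bartlett(group_a, group_b, group_c)
    print(f"Bartlett statistic = {statistic:.3f}, p-value = {p_value:.3f}")
    # A small p-value would lead to rejecting H0 of equal variances.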

3.6.5 Multicollinearity: Multicollinearity is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated, meaning that one can be linearly predicted from the others with a non-trivial degree of accuracy. In this situation the coefficient estimates of the multiple regression may change erratically in response to small changes in the model or the data. Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data themselves; it only affects calculations regarding individual predictors. That is, a multiple regression model with correlated predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with respect to the others.

Detection of Multicollinearity: Indicators that multicollinearity may be present in a model include:

1. Large changes in the estimated regression coefficients when a predictor variable is added or deleted.

2. Insignificant regression coefficients for the affected variables in the multiple regression, but a rejection of the joint hypothesis that those coefficients are all zero (using an F-test).

3. If a multivariable regression finds an insignificant coefficient for a particular explanator, yet a simple linear regression of the explained variable on this explanatory variable shows its coefficient to be significantly different from zero, this situation indicates multicollinearity in the multivariable regression.

4. Some authors have suggested a formal detection tolerance or the variance inflation factor (VIF) for multicollinearity: Tolerance_j = 1 - R_j^2 and VIF_j = 1 / Tolerance_j, where R_j^2 is the coefficient of determination of a regression of explanator j on all the other explanators. A tolerance of less than 0.20 or 0.10 and/or a VIF of 5 or 10 and above indicates a multicollinearity problem. (A computational sketch is given after this list.)

5. Condition number test: The standard measure of ill-conditioning in a matrix is the condition index. It indicates that the inversion of the matrix is numerically unstable with finite-precision numbers (standard computer floats and doubles), and hence signals the potential sensitivity of the computed inverse to small changes in the original matrix. The condition number is computed by finding the square root of the maximum eigenvalue divided by the minimum eigenvalue. If the condition number is above 30, the regression is said to have significant multicollinearity.

6. Farrar-Glauber test: If the variables are found to be orthogonal, there is no multicollinearity; if the variables are not orthogonal, then multicollinearity is present. C. Robert Wichers has argued that the Farrar-Glauber partial correlation test is ineffective in that a given partial correlation may be compatible with different multicollinearity patterns. The Farrar-Glauber test has also been criticized by other researchers.

7. Construction of a correlation matrix among the explanatory variables will yield indications as to the likelihood that any given couplet of right-hand-side variables is creating multicollinearity problems. Correlation values (off-diagonal elements) of at least 0.4 are sometimes interpreted as indicating a multicollinearity problem.
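The following is a minimal NumPy sketch of the tolerance/VIF diagnostic from item 4 above. It assumes the predictors are held in a NumPy array X; the variable names and generated data are illustrative assumptions, as the diagnostics in this study were produced in SPSS.

    import numpy as np

    def vif(X: np.ndarray) -> np.ndarray:
        """Variance inflation factor for each column of the predictor matrix X."""
        n, k = X.shape
        vifs = np.empty(k)
        for j in range(k):
            y = X[:, j]
            others = np.delete(X, j, axis=1)
            # add an intercept and regress column j on the remaining predictors
            A = np.column_stack([np.ones(n), others])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ beta
            r_squared = 1 - resid.var() / y.var()
            vifs[j] = 1.0 / (1.0 - r_squared)     # tolerance is 1 - R^2
        return vifs

    # Illustrative predictor matrix with two strongly related columns
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=100)
    x2 = x1 + rng.normal(scale=0.1, size=100)     # nearly collinear with x1
    x3 = rng.normal(size=100)
    X = np.column_stack([x1, x2, x3])
    print(np.round(vif(X), 2))                    # large values for the first two columns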

Consequences of Multicollinearity:

1. Even extreme multicollinearity (so long as it is not perfect) does not violate OLS assumptions. OLS estimates are still unbiased and BLUE (Best Linear Unbiased Estimators).

2. Nevertheless, the greater the multicollinearity, the greater the standard errors. When high multicollinearity is present, confidence intervals for coefficients tend to be very wide and t-statistics tend to be very small. Coefficients will have to be larger in order to be statistically significant, i.e. it will be harder to reject the null hypothesis when multicollinearity is present.

3. Note, however, that large standard errors can be caused by things besides multicollinearity.

4. When two variables are highly and positively correlated, their slope coefficient estimators will tend to be highly and negatively correlated: when, for example, the estimate of one coefficient comes out above its true value, the estimate of the other will tend to come out below its true value, and a different sample will likely produce the opposite result. In other words, if you overestimate the effect of one parameter, you will tend to underestimate the effect of the other. Hence, coefficient estimates tend to be very unstable from one sample to the next.

Remedies for Multicollinearity:

1. Make sure you have not fallen into the dummy variable trap; including a dummy variable for every category (e.g., summer, autumn, winter and spring) together with a constant term in the regression guarantees perfect multicollinearity.

2. Try seeing what happens if you use independent subsets of your data for estimation and apply those estimates to the whole data set. Theoretically you should obtain somewhat higher variance from the smaller datasets used for estimation, but the expectation of the coefficient values should be the same. Naturally, the observed coefficient values will vary, but look at how much they vary.

3. Leave the model as is, despite multicollinearity. The presence of multicollinearity does not affect the efficacy of extrapolating the fitted model to new data, provided that the predictor variables follow the same pattern of multicollinearity in the new data as in the data on which the regression model is based.

4. Drop one of the variables. An explanatory variable may be dropped to produce a model with significant coefficients. However, you lose information (because you have dropped a variable), and omission of a relevant variable results in biased coefficient estimates for the remaining explanatory variables.

5. Obtain more data, if possible. This is the preferred solution. More data can produce more precise parameter estimates (with lower standard errors), as seen from the formula for the variance inflation factor, which relates the variance of the estimate of a regression coefficient to the sample size and the degree of multicollinearity.

6. Mean-center the predictor variables. Generating polynomial terms can cause some multicollinearity if the variable in question has a limited range, and mean-centering will eliminate this special kind of multicollinearity. In general, however, this has no effect; it can be useful in overcoming problems arising from rounding and other computational steps if a carefully designed computer program is not used.

7. Standardize your independent variables. This may help reduce a false flagging of a condition index above 30.

8. It has also been suggested that, using the Shapley value, a game-theory tool, the model could account for the effects of multicollinearity. The Shapley value assigns a value to each predictor and assesses all possible combinations of importance.

9. Ridge regression or principal component regression can be used (a brief ridge regression sketch follows this list).

10. If the correlated explanators are different lagged values of the same underlying explanator, then a distributed lag technique can be used, imposing a general structure on the relative values of the coefficients to be estimated.
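As an illustration of remedy 9, the sketch below fits a ridge regression with scikit-learn. The penalty strength alpha and the toy data are illustrative assumptions; ridge regression was not the remedy applied in this study, which used factor analysis instead.

    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=200)
    x2 = x1 + rng.normal(scale=0.05, size=200)       # nearly collinear predictor
    y = 2 * x1 + 3 * x2 + rng.normal(size=200)
    X = np.column_stack([x1, x2])

    ols = LinearRegression().fit(X, y)
    ridge = Ridge(alpha=1.0).fit(X, y)               # L2 penalty stabilises the estimates

    print("OLS coefficients:  ", np.round(ols.coef_, 2))    # can be unstable under collinearity
    print("Ridge coefficients:", np.round(ridge.coef_, 2))  # shrunken, more stable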

In this study, multicollinearity has been analysed between the different variables. Where multicollinearity is present, factor analysis and regression analysis have been used to reduce it.

3.6.6 Factor Analysis: Factor analysis is a tool used to reduce a large number of variables to a small number of factors that are capable of explaining the observed variance in the larger set. Initially, the communalities of the variables have been calculated to represent the amount of variation extracted from each variable; a variable with a higher communality is expected to be better represented. The extraction of factors has been done by the principal component analysis method.
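The sketch below illustrates principal-component extraction and communalities using scikit-learn. It assumes the standardized questionnaire responses are held in a NumPy array named items; the data, array name and number of retained components are illustrative assumptions, since the extraction in this study was carried out in SPSS.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # Illustrative response matrix: rows = respondents, columns = Likert items
    rng = np.random.default_rng(2)
    items = rng.integers(1, 6, size=(120, 6)).astype(float)

    Z = StandardScaler().fit_transform(items)        # standardize each item
    pca = PCA(n_components=2).fit(Z)                 # extract two principal components

    # Loadings: correlations of the items with the retained components
    loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
    communalities = (loadings ** 2).sum(axis=1)      # variance of each item explained
    print(np.round(communalities, 2))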

3.6.7 Regression Analysis: Regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables, that is, the average value of the dependent variable when the independent variables are fixed. Less commonly, the focus is on a quantile or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis it is also of interest to characterize the variation of the dependent variable around the regression function, which can be described by a probability distribution.

Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Regression analysis is also used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships. In restricted circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. However, this can lead to illusory or false relationships, so caution is advisable; for example, correlation does not imply causation.

Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear regression and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data. Nonparametric regression refers to techniques that allow the regression function to lie in a specified set of functions, which may be infinite-dimensional.

The performance of regression analysis methods in practice depends on the form of the data generating process, and how it relates to the regression approach being used. Since the true form of the data-generating process is generally not known, regression analysis often depends to some extent on making assumptions about this process. These assumptions are sometimes testable if a sufficient quantity of data is available. Regression models for prediction are often useful even when the assumptions are moderately violated, although they may not perform optimally. However, in many applications, especially with small effects or questions of causality based on observational data, regression methods can give misleading results.

Regression models involve the following variables: the unknown parameters, denoted as β, which may represent a scalar or a vector; the independent variables, X; and the dependent variable, Y.

In various fields of application, different terminologies are used in place of dependent and independent variables. A regression model relates Y to a function of X and β, Y ≈ f(X, β). The approximation is usually formalized as E(Y | X) = f(X, β). To carry out regression analysis, the form of the function f must be specified. Sometimes the form of this function is based on knowledge about the relationship between Y and X that does not rely on the data. If no such knowledge is available, a flexible or convenient form for f is chosen. Assume now that the vector of unknown parameters β is of length k. In order to perform a regression analysis the user must provide information about the dependent variable Y:

If N data points of the form (Y, X) are observed, where N < k, most classical approaches to regression analysis cannot be performed: since the system of equations defining the regression model is underdetermined, there are not enough data to recover β.

If exactly N = k data points are observed and the function f is linear, the equations Y = f(X, β) can be solved exactly rather than approximately. This reduces to solving a set of N equations with N unknowns (the elements of β), which has a unique solution as long as the X are linearly independent. If f is nonlinear, a solution may not exist, or many solutions may exist.

The most common situation is where N > k data points are observed. In this case, there is enough information in the data to estimate a unique value for β that best fits the data in some sense, and the regression model, when applied to the data, can be viewed as an overdetermined system in β. In the last case, the regression analysis provides the tools for:

1. Finding a solution for the unknown parameters β that will, for example, minimize the distance between the measured and predicted values of the dependent variable Y (also known as the method of least squares; a small numerical sketch follows this list).

2. Under certain statistical assumptions, using the surplus of information to provide statistical information about the unknown parameters β and the predicted values of the dependent variable Y.
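To make the overdetermined (N > k) case concrete, the following minimal sketch estimates β by ordinary least squares using NumPy. The data are generated for illustration only and the variable names are assumptions, not values from this study.

    import numpy as np

    rng = np.random.default_rng(3)
    N, k = 50, 3                                   # N data points, k unknown parameters
    X = np.column_stack([np.ones(N),               # intercept column
                         rng.normal(size=N),
                         rng.normal(size=N)])
    true_beta = np.array([1.0, 2.0, -0.5])
    y = X @ true_beta + rng.normal(scale=0.3, size=N)

    # Least squares: minimize ||y - X beta||^2 over beta
    beta_hat, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    print(np.round(beta_hat, 2))                   # close to the true parameters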

Classical assumptions for regression analysis:

1. The sample is representative of the population for the inference prediction.

2. The error is a random variable with a mean of zero conditional on the explanatory variables.

3. The independent variables are measured with no error. (Note: if this is not so, modeling may instead be done using errors-in-variables techniques.)

4. The predictors are linearly independent, i.e. it is not possible to express any predictor as a linear combination of the others.

5. The errors are uncorrelated, that is, the variance-covariance matrix of the errors is diagonal and each non-zero element is the variance of the error.

6. The variance of the error is constant across observations (homoscedasticity). If not, weighted least squares or other methods might instead be used.

Summary and Conclusion


In this chapter an attempt has been made to provide the justification for the research design. The study uses a descriptive research design based on a questionnaire. The questionnaire has been framed in the simplest possible way so that positive responses can be received, and it has been constructed to determine the satisfaction level with eBanking in the rural market, covering almost all the necessary elements related to eBanking. The questionnaire has been prepared on the basis of a five-point Likert scale.

The collected data have been analysed through the IBM statistical software SPSS. At the outset, Cronbach's alpha has been employed to check the internal consistency (reliability) of the data. Later on, the Kaiser-Meyer-Olkin and Bartlett's tests were conducted to test the sampling adequacy and sphericity of the collected data. To diagnose the problem of multicollinearity, the degree of correlation has been estimated. As the results have shown a problem of collinearity, factor analysis has been done as a dimension-reduction tool. The results have further been analysed through regression to establish the relation of the REGR factor scores with the overall satisfaction level of customers.
