The study is based primarily on secondary data collected from various
national sources (National Horticultural Board1, 2, Department of Economics and
Statistics - IASRI3, 4, ITC5, DAC/Directorate of Economics and Statistics6,
Ministry of Agriculture7, APEDA8, NHRDF9, and Planning Commission, Govt.
of India10) and international sources such as FAO11, AVRDC12 and ADB13, pertaining to
3 IASRI. 2004. Agricultural Research Data Book 2004. New Delhi: Indian Agricultural
Statistics Research Institute, Indian Council of Agricultural Research, http://
www.iasri.res.in/agridata/04data%5Cchapter%206%5Cdb2004tb6_l 1 .htm.
4 IASRI. 2006. Agricultural Research Data Book 2006. New Delhi: Indian Agricultural
Statistics Research Institute, Indian Council of Agricultural Research,
http://www.iasri.res.in/agridata/HOME.HTML.
5 ITC. 2007. International Trade Statistics by Country and Product Group. International
Trade Centre, http://www.intracen.org/tradstat/sitc3-3d/ indexre.htm.
8 APEDA. 2007. Export statistics for agro-food products, India 2005-2006. Agricultural
and Processed Food Products Export Development Authority. 610 pp.
http://apeda.com/apedawebsite/
10 Planning Commission. (2007). National 5-Year Plans: 11th plan proposals. Planning
Commission, Govt. of India, http://planningcommission.nic.in/plans/planrel/
appl l_16jan.pdf.
11 FAOSTAT. 2007. FAOSTAT On-line. Rome: United Nations Food and Agriculture
Organization, http://faostat.fao.org/default.aspx.
area, production, productivity and marketing of vegetables in India and the
export of vegetables to the world market. Time series analysis was carried out to
evaluate the pattern in the data series, and extrapolation of that pattern was used
to inform future planning by policy makers.
The data were collected from secondary sources and analyzed graphically. Time
plots were made and examined for trends over time in area, production,
productivity and marketing behavior, and for other systematic features relevant
to planning strategy in the vegetable sector. Although a linear trend is commonly
used in practice, it rarely gave the best fit to the production data, so other
trend patterns were also tried. The rate of growth or decline is not constant
throughout but varies considerably over time and across vegetable crops.
Time series analysis was used to estimate the rate of growth in area, production
and productivity of different vegetable crops grown in different agro-climatic
zones of India. The collected data were processed and analyzed to determine the
type and nature of the variation and disparity in the vegetable sector across
states. In addition, appropriate statistical tools (MS-EXCEL, COSTAT) were used
to compare current performance with the time series data on production, trade
and marketing of vegetables, in order to understand the causes of such
variation, if any, across regions of the country.
The analysis should help to understand and forecast the behavior of the vegetable
sector in future years, which is needed by policy makers, administrative planners,
and research and development managers.
13 ADB. (2007). India’s Economic Growth to Moderate in 2007, ADB Says. ADB News
Release, 27 March 2007. http://www.adb.org/Media/Articles/ 2007/11664-
indian-developmentsoutlooks/
Statistical analysis
AVERAGE
Syntax
AVERAGE(number1, number2, ...)
Number1, number2, ... are 1 to 30 numeric arguments for which you want the
average.
Remarks
COVAR
Returns covariance, the average of the products of deviations for each data point
pair. Use covariance to determine the relationship between two data sets. For
example, you can examine whether greater income accompanies greater levels of
education.
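Syntax
COVAR(array1, array2)
Remarks
The covariance is:
$$\mathrm{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$
where $\bar{x}$ and $\bar{y}$ are the sample means AVERAGE(array1) and
AVERAGE(array2), and n is the sample size.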
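As a rough illustration of the description above (the average of the products of paired deviations, dividing by the sample size n), the calculation can be sketched in Python. The function name `covar` and the two short series are hypothetical and are not data from this study.

```python
def covar(xs, ys):
    """Covariance as the mean of the products of paired deviations (divides by n)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both series need the same, non-zero number of points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical illustrative values, not data from the study.
area = [3, 2, 4, 5, 6]
production = [9, 7, 12, 15, 17]
print(covar(area, production))
```

A positive result suggests that the two series tend to move together; the magnitude depends on the units of both series.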
RSQ
Syntax
RSQ(known_y's, known_x's)
Remarks
• If an array or reference argument contains text, logical values, or empty
cells, those values are ignored; however, cells with the value zero are
included.
• If known_y's and known_x's are empty or have a different number of data
points, RSQ returns the #N/A error value.
• The equation for the Pearson product moment correlation coefficient, r,
is:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
where $\bar{x}$ and $\bar{y}$ are the sample means AVERAGE(known_x's) and
AVERAGE(known_y's). RSQ returns $r^2$.
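A minimal Python sketch of the same quantity, the square of the Pearson product moment correlation coefficient, is given here for illustration. The function names are hypothetical, and the sketch assumes that neither series is constant (otherwise the denominator is zero).

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both series need the same, non-zero number of points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rsq(known_ys, known_xs):
    """Square of the Pearson correlation between known_ys and known_xs."""
    return pearson_r(known_xs, known_ys) ** 2


# Hypothetical illustrative values.
print(rsq([2, 3, 9, 1, 8, 7, 5], [6, 5, 11, 7, 5, 4, 4]))
```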
AVEDEV
Returns the average of the absolute deviations of data points from their mean.
AVEDEV is a measure of the variability in a data set.
Syntax
AVEDEV(number1, number2, ...)
Number1, number2, ... are 1 to 30 arguments for which you want the average
of the absolute deviations. You can also use a single array or a reference to an
array instead of arguments separated by commas.
Remarks
• If an array or reference argument contains text, logical values, or empty
cells, those values are ignored; however, cells with the value zero are
included.
• The equation for average deviation is:
$$\frac{1}{n} \sum |x - \bar{x}|$$
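The average absolute deviation described above can be sketched in a few lines of Python. The function name `avedev` and the sample values are hypothetical and only illustrate the arithmetic.

```python
def avedev(values):
    """Mean of the absolute deviations of the values from their mean."""
    if not values:
        raise ValueError("at least one value is required")
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / n


# Hypothetical illustrative values.
print(avedev([4, 5, 6, 7, 5, 4, 3]))
```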
CONFIDENCE
Returns a value that you can use to construct a confidence interval for a
population mean. The confidence interval is a range of values. Your sample
mean, x, is at the center of this range and the range is x ± CONFIDENCE. For
example, if x is the sample mean of delivery times for products ordered through
the mail, x ± CONFIDENCE is a range of population means. For any population
mean, po, in this range, the probability of obtaining a sample mean further from
po than x is greater than alpha; for any population mean, po, not in this range, the
probability of obtaining a sample mean further from po than x is less than alpha.
In other words, assume that we use x, standard_dev, and size to construct a two-
tailed test at significance level alpha of the hypothesis that the population mean
is po. Then we will not reject that hypothesis if po is in the confidence interval
and will reject that hypothesis if po is not in the confidence interval. The
confidence interval does not allow us to infer that there is probability 1 - alpha
that our next package will take a delivery time that is in the confidence interval.
Syntax
CONFIDENCE(alpha, standard_dev, size)
Alpha is the significance level used to compute the confidence level. The
confidence level equals 100*(1 - alpha) %, or in other words, an alpha of 0.05
indicates a 95 percent confidence level.
Standard_dev is the population standard deviation for the data range and
is assumed to be known.
Size is the sample size.
Remarks
For alpha = 0.05, the confidence interval is:
$$\bar{x} \pm 1.96 \left( \frac{\sigma}{\sqrt{n}} \right)$$
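The half-width of such an interval can be sketched in Python under the same assumption of a known population standard deviation. The function name `confidence` mirrors the spreadsheet function for readability only, and the numbers in the example are hypothetical.

```python
import math
from statistics import NormalDist


def confidence(alpha, standard_dev, size):
    """Half-width of a two-sided normal confidence interval for a mean."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if standard_dev <= 0 or size < 1:
        raise ValueError("standard_dev must be positive and size at least 1")
    # Two-tailed critical value of the standard normal (about 1.96 for alpha = 0.05).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)


# Hypothetical example: sample mean 30, known sigma 2.5, n = 50.
half_width = confidence(0.05, 2.5, 50)
print(30 - half_width, 30 + half_width)
```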
Time series analysis
Trend analysis
Trend analysis uses a technique called least squares to fit a trend line to a set of
time series data and then project the line into the future for a forecast. Trend
analysis is a special case of regression analysis where the dependent variable is
the variable to be forecasted and the independent variable is time. While a
moving average model limits the forecast to one period ahead, trend analysis
can be used to make forecasts more than one period into the future.
A study of time trends may focus, therefore, on one or more of the following:
The selection of a strategy for analyzing trend data will depend in part on the
purpose of the analysis, and on careful consideration of all of the issues
discussed above. Once there is a sound conceptual framework, tables, graphs and
statistical analysis are tools for examining and analyzing trend data; graphs, in
particular, are an effective tool for presenting the pattern of change over time.
Regardless of whether statistical techniques will be used for analyzing data over
time, the most straightforward and intuitive first step in assessing a trend is to
plot the actual observed data by year (or some other appropriate time period).
In addition, the data should be examined in tabular form. These initial steps are
indispensable for understanding the general shape of the trend and for
identifying any outliers in the data. Inspection of the data provides the basis for
subsequent analysis choices and should never be bypassed. Visual inspection
may even indicate that the use of statistical procedures is inappropriate.
One step toward improving the interpretability of the data is to put the rates on a
logarithmic scale. A log transformation of the data provides more appropriate
and realistic results because it "flattens" the series of rates. While the overall
shape of the trend is unchanged, the rate of increase or decrease is somewhat
altered.
Statistical Procedures
Regression Analysis
Another advantage of using regression methods for analyzing trends and making
projections is that other variables can be included in the model. Without
some form of regression modeling, the impact of other variables cannot be
accounted for in the results. Several regression approaches can be employed to
examine trend data. A generic description of these approaches follows.
A forecast is calculated by inserting a time value into the regression equation.
The regression equation is determined from the time-series data using the "least
squares method", which determines the values of a and b so that the resulting
line is the best-fit line through the historical data. After a and b have been
determined, the equation can be used to forecast future values.
The general equation for a trend line is F = a + bt, where F is the forecast, t is
the time value, a is the y-intercept and b is the slope of the line. This data
pattern is linear and fits the straight-line equation y = mx + c, where y is the
predicted (dependent) variable, x is the independent variable, c is the intercept
and m is the slope of the line.
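The least-squares trend line F = a + bt described above can be sketched as follows. This is only an illustration in Python; the function names and the five-period series are hypothetical, not values from the study.

```python
def fit_linear_trend(t, y):
    """Least-squares estimates (a, b) for the trend line F = a + b*t."""
    if len(t) != len(y) or len(t) < 2:
        raise ValueError("need at least two paired observations")
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    b = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y)) / sum(
        (ti - mean_t) ** 2 for ti in t
    )
    a = mean_y - b * mean_t
    return a, b


def forecast(a, b, t):
    """Insert a time value into the fitted trend equation."""
    return a + b * t


# Hypothetical production series for periods 1..5.
periods = [1, 2, 3, 4, 5]
production = [12.0, 14.0, 16.0, 18.0, 20.0]
a, b = fit_linear_trend(periods, production)
print(a, b, forecast(a, b, 7))
```

Because the forecast extrapolates the fitted line, it is only as reliable as the assumption that the historical pattern continues.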
Prerequisites: 2. Correlation
There should be sufficient correlation between the time parameter and the
values of the time-series data. More specifically, the higher the coefficient of
determination (R2) of the trend line equation, the more accurately the
dependent variable can be predicted from a given value of the independent
variable.
The coefficient of determination, R2, measures the proportion of variation in the
dependent variable that is explained by the regression or trend line. It has a value
between zero and one, with a high value indicating a good fit.
Goodness of fit: determination coefficient RSQ (range: [0, 1]; RSQ = 1 means
the best fit, RSQ = 0 the worst fit).
1) Roughly: visually, by comparing the data pattern to one of the five trend
types (linear, logarithmic, polynomial, power, exponential).
2) In detail: by means of the determination coefficient. For example, trends in
area, production, and yield of various vegetable crops would be quantified
econometrically by plotting the time curve and adding a trendline through the
chart options in an MS-EXCEL worksheet.
The Add Trendline dialog provides six trend/regression types: Linear,
Logarithmic, Polynomial, Power, Exponential and Moving Average. Selecting
one option at a time produces the corresponding trend graph. Selecting the
options bar makes the EXCEL window display:
Trendline name: Automatic (selected by default once one of the
trend/regression types has been chosen).
Forecast: Forward and Backward (entering the desired number of periods yields
the predicted values from the trendline equation).
Set intercept: generally this option should not be selected, since selecting it
fixes the intercept at the entered value (0 by default); however, if the intercept
is known for a given set of variables, it may be entered to obtain a better-fitting
equation.
Display equation on chart: tick this check box to show the best-fit equation on
the time graph.
Display R-squared value on chart: tick this check box to show R2 alongside the
best-fit equation on the time graph.
Based on the highest R2 value, one can choose the trend equation that fits a
given data set best.
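The idea of choosing the trend type with the highest R2 can be sketched outside the spreadsheet as well. The Python sketch below covers four of the trend types (linear, logarithmic, power, exponential) by fitting a straight line to suitably log-transformed data; it assumes strictly positive x and y values and computes R2 on the transformed scale, which is an assumption of this sketch rather than a statement about how any particular spreadsheet reports it. All function names are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line ys = intercept + slope * xs, plus its R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, (sxy * sxy) / (sxx * syy)


def compare_trends(xs, ys):
    """R-squared of each candidate trend type, fitted on its linearized form."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    candidates = {
        "linear": (xs, ys),            # y = a + b*x
        "logarithmic": (log_x, ys),    # y = a*ln(x) + b
        "power": (log_x, log_y),       # ln(y) = ln(a) + b*ln(x)
        "exponential": (xs, log_y),    # ln(y) = ln(a) + b*x
    }
    return {name: linear_fit(u, v)[2] for name, (u, v) in candidates.items()}


def best_trend(xs, ys):
    """Name of the candidate trend type with the highest R-squared."""
    scores = compare_trends(xs, ys)
    return max(scores, key=scores.get)


# Hypothetical series that follows a power curve exactly.
x_values = [1, 2, 3, 4, 5, 6]
y_values = [2 * x ** 1.5 for x in x_values]
print(compare_trends(x_values, y_values))
print(best_trend(x_values, y_values))
```

A high R2 alone does not guarantee sensible forecasts, so the visual check in step 1 remains useful alongside this comparison.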
Analysis in MS-EXCEL
Excel includes multiple functions for regression analysis. It can actually be used
for a large variety of different types of trends such as polynomial, logarithmic
and exponential. The following figure shows five different types of trend:
polynomial of order 3, linear, logarithmic, natural exponential, and power
function.
Figure: Five different types of trend line
As shown in the above figure, there are many different types of trendline possible.
Each reflects a different relationship between the independent and the dependent
variables. Some trend functions of a single variable - other than a linear or a
polynomial trend - are listed below in a tabular form.
Trend type     Equation
Logarithmic    $y = a \ln(x) + b$
Power          $y = a x^b$
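The exponential function to base b, $y = a b^x$, can be transformed into
$\ln(y) = \ln(a) + x \ln(b)$, with $\ln(a)$ as the intercept and $\ln(b)$ as the
slope. Similarly, the natural exponential function $y = a e^{bx}$ can be
transformed into $\ln(y) = \ln(a) + bx$; fitting this line to the given x and y
values yields $\ln(a)$ as the intercept and b as the slope.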
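The log transformation described above can be illustrated with a short Python sketch that fits the natural exponential form by regressing ln(y) on x and then recovers a from the intercept. The function name and the sample series are hypothetical, and the sketch assumes all y values are positive.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y) = ln(a) + b * x."""
    log_ys = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    slope = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_ly - slope * mean_x
    # The intercept is ln(a), so a is recovered with exp(); the slope is b itself.
    return math.exp(intercept), slope


# Hypothetical series generated from y = 5 * exp(0.3 * x).
x_values = [0, 1, 2, 3, 4]
y_values = [5 * math.exp(0.3 * x) for x in x_values]
a, b = fit_exponential(x_values, y_values)
print(a, b)
# The same curve written to a general base: y = a * base**x with base = exp(b).
print(math.exp(b))
```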
The above case, with its single independent variable, can easily be extended to
include multiple independent variables. When the dependent variable is a
function of multiple independent variables, the problem is called multiple
regression. The regression equation would then be, for example,
$y = m_4 x_4 + m_3 x_3 + m_2 x_2 + m_1 x_1 + m_0$.
In a polynomial regression each power term is like a different variable, i.e.
$y = a_3 z^3 + a_2 z^2 + a_1 z + a_0$; written in this form, it becomes apparent
that a polynomial regression is no different from a multiple regression.
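The equivalence between polynomial and multiple regression can be made concrete with a small Python sketch: the powers z, z^2, z^3 are simply passed to a multiple regression routine as separate columns. The routine below solves the normal equations with plain Gaussian elimination, which is adequate for small, well-scaled illustrations but is not how a statistical package would necessarily do it; all names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_regression(columns, y):
    """Least-squares coefficients [m0, m1, ...] for y = m0 + m1*x1 + m2*x2 + ..."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def polynomial_regression(z, y, degree):
    """Treat z, z**2, ..., z**degree as separate regressors."""
    columns = [[zi ** p for zi in z] for p in range(1, degree + 1)]
    return multiple_regression(columns, y)


# Hypothetical data generated from y = 1 + 2z - z^2 + 0.5z^3.
z_values = [-2, -1, 0, 1, 2, 3]
y_values = [1 + 2 * z - z ** 2 + 0.5 * z ** 3 for z in z_values]
print(polynomial_regression(z_values, y_values, 3))
```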
Econometric analysis
The individual input cost included not only the market price of the input but
also its transportation and spreading cost. The irrigation cost included the cost
of water, if any, in terms of the water tax levied by the government or the
purchase cost from neighboring farmers, the irrigation labor cost, and the
depreciation cost of irrigation equipment. Where the source of water was a tube
well, the irrigation cost additionally included the cost of maintenance,
depreciation, and operation of the tube well.
Total production cost for each crop was estimated by adding the individual cost
items. Cash cost was estimated as the total cost less the value of family
labor and family-produced manure and seeds. Interest on the cash cost was
also included in the total cost at the rate of 10% per crop season. The share
of each cost item (factor share) in the total cost was estimated in percentage
terms. The factor shares for labor, seed, fertilizer, manure, irrigation, pesticide,
and others (staking and mulching) have been reported. In estimating these shares,
the cost of the labor used to apply an input was taken out of the input cost
and aggregated into the labor cost.
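The cost accounting described above can be sketched in Python as follows. Two points are assumptions of this sketch and are not stated in the text: interest is computed on the cash cost before interest is added, and interest is shown as its own line when factor shares are computed so that the shares sum to 100. The function name and all figures are hypothetical.

```python
def cost_summary(cost_items, family_supplied_value, interest_rate=0.10):
    """Cash cost, interest, total cost and factor shares (per cent) for one crop.

    cost_items: dict mapping cost item name to its cost.
    family_supplied_value: value of family labor and family-produced manure and seeds.
    """
    items_total = sum(cost_items.values())
    cash_cost = items_total - family_supplied_value
    interest = interest_rate * cash_cost  # assumption: charged on cash cost only
    total_cost = items_total + interest
    all_items = dict(cost_items, interest=interest)
    factor_shares = {
        name: 100.0 * value / total_cost for name, value in all_items.items()
    }
    return {
        "cash_cost": cash_cost,
        "interest": interest,
        "total_cost": total_cost,
        "factor_shares": factor_shares,
    }


# Hypothetical per-hectare costs; labor already includes labor used to apply inputs.
items = {
    "labor": 400.0, "seed": 100.0, "fertilizer": 150.0, "manure": 50.0,
    "irrigation": 100.0, "pesticide": 100.0, "others": 100.0,
}
summary = cost_summary(items, family_supplied_value=300.0)
print(summary["cash_cost"], summary["total_cost"])
print(summary["factor_shares"])
```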
Gross revenue
Gross revenue has been estimated as outputs (main and by-products) produced in
one planting period multiplied by market price of the output.
Economic efficiency in production
Net returns have been estimated as gross revenue less the cost of all variable
inputs. All inputs, including family labor and other farm-owned resources but
excluding land and management labor, have been considered as variable inputs.
Higher net returns, therefore, indicate efficiency of land and input management
(seed, fertilizers etc.) combined.
This has been estimated as net return (as defined above) divided by all variable
costs and multiplied by one hundred. The costs of all inputs including family
owned resources, except land, have been treated as variable cost in this case.
The equation developed for calculation of PIP as:
Efficiency of Technology
The production function in equation (1) was specified using the best-fit trend
line equations. The contribution of individual input components to the yield
function was estimated statistically. Significant differences in yield
attributable to different inputs were tested using standard statistical designs of
experiments, mostly the Randomized Block Design (RBD) in the present study. A
significant difference in yield due to the contribution of any individual
input in equation (1) represents the extent of the difference in technical
efficiency at the given level of use of that input.
In the present study, yield was treated as the dependent variable, and factors
such as seed (genotype), fertilizers and technical interventions were treated as
independent variables. The standard statistical package COSTAT was used for
the analysis. Yield was further considered as a function of the input variables,
and multiple regression analysis was used to derive the partial coefficient
values. The partial coefficient values, representing the influence of each factor,
were converted into the per cent contribution of that factor to the yield
parameter.
Market integration
Indian markets across states are not well integrated, as evidenced by wide
variability in the seasonality of a vegetable across markets. For example, prices
of brinjal may be high in Calcutta in October while being in the low range in the
Delhi and Madras markets. A similar situation may prevail for other
vegetables across the major markets in India.
Therefore, integrating markets by providing information on market arrivals and
prices can help to reduce seasonality in Indian markets.
15 Ali, S. (2005). Total Factor Productivity Growth and Agricultural Research and
Extension: An Empirical Analysis for Pakistan’s Agriculture, 1960-96. Pakistan
Development Review. 44: 4 Part II, Pp. 729-746.
estimation. If January prices were not available (indicated by -) for all of the
three years, June was taken as the base. Weighted average prices of a vegetable
were calculated by weighting the individual monthly prices by the share of each
market in the total monthly all-India arrival of that vegetable. Similarly, the
weighted average price index of all vegetables in a month in India was calculated
by weighting by the relative share of each vegetable in that month. The same
procedure was followed to obtain the monthly arrival indices for different
vegetables in the major markets in India.
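The arrival-weighted average price described above can be sketched in Python. The function name, the three markets and the figures are hypothetical; the sketch only shows the weighting of one month's market prices by each market's share in total arrivals.

```python
def weighted_average_price(prices, arrivals):
    """Average of market prices weighted by each market's share in total arrivals."""
    if len(prices) != len(arrivals) or not prices:
        raise ValueError("need one price and one arrival figure per market")
    total_arrival = sum(arrivals)
    if total_arrival <= 0:
        raise ValueError("total arrival must be positive")
    return sum(p * a / total_arrival for p, a in zip(prices, arrivals))


# Hypothetical prices and arrivals of one vegetable in three markets in one month.
prices = [10.0, 20.0, 30.0]
arrivals = [100.0, 300.0, 600.0]
print(weighted_average_price(prices, arrivals))
```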
Market variation
The three years average price index (PI) and market arrival (MA) data for a
particular vegetable crop for a particular market in twelve different months were
sorted in terms of maximum, minimum and mean price index values as well as
market arrival values. The per cent variation was calculated by subtracting
maximum and minimum values, as under:
Seasonality
Similarly the actual market arrival of January for any crop in any market was
converted to 100 by using a suitable conversion factor and accordingly market
arrival indices were calculated in MS-EXCEL for all the data set.
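The conversion of a monthly series to an index with January = 100 can be sketched in Python as below. The fallback to June when the January value is missing follows the rule stated for prices and is applied here to any series as an assumption; missing values are represented by `None`, and the function name and data are hypothetical.

```python
def to_index(monthly_values, base_month=0, fallback_month=5):
    """Convert 12 monthly values to an index with the base month set to 100.

    Months are numbered from 0 (January); June (5) is used if January is missing.
    """
    base = monthly_values[base_month]
    if base is None:
        base = monthly_values[fallback_month]
    if not base:
        raise ValueError("no usable base value for the index")
    return [None if v is None else 100.0 * v / base for v in monthly_values]


# Hypothetical monthly market arrivals (tonnes), January to December.
arrivals = [50.0, 60.0, 40.0, 45.0, 55.0, 70.0, 65.0, 30.0, 35.0, 80.0, 90.0, 75.0]
print(to_index(arrivals))
```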
To study the trend in trade, the export performance ratio (EPR) was estimated to
examine the comparative advantage of India in the export of major vegetables,
using the method suggested by Balassa (1965)16. The EPR of India in potato and
tomato was estimated by the equation:
$$\mathrm{EPR} = \frac{S_{it}}{S_{wt}} \qquad (1)$$
where
$S_{it}$ = share of the reference commodity in India's total exports, and
$S_{wt}$ = share of that commodity in total world exports.
Since EPR is based on observed pattern of trade flows, it is also called Revealed
Comparative Advantage (RCA).
If EPR or RCA is greater than unity, the country has the comparative advantage
in export of the concerned commodity and vice versa.
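The ratio of the two export shares can be sketched in Python as below. The function name, argument names and trade values are hypothetical and serve only to show the arithmetic and the comparison with unity.

```python
def export_performance_ratio(
    commodity_exports_india,
    total_exports_india,
    commodity_exports_world,
    total_exports_world,
):
    """EPR (RCA): commodity share in India's exports over its share in world exports."""
    share_india = commodity_exports_india / total_exports_india
    share_world = commodity_exports_world / total_exports_world
    return share_india / share_world


# Hypothetical trade values in a common currency unit.
epr = export_performance_ratio(2.0, 100.0, 50.0, 10000.0)
print(epr, "comparative advantage" if epr > 1 else "no comparative advantage")
```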
Revealed Symmetric Comparative Advantage (RSCA)
To study variability, the per cent coefficient of variation (CV) was used as an
index of instability. The sustainability of exports of tomato, potato and onion
was estimated by computing the coefficient of variation as suggested by Kumar et
al. (2005)18.
The high CV value for exports from India indicated a high degree of
instability in the market, which may be due to many bottlenecks and the
involvement of many factors. Similarly, low CV values indicate a high degree of
stability in the export market.
18 Kumar, N.R., Singh, B.P., Paul Khurana, S.M. and Pandey, N.K. (2005). Impact of
WTO on Potato Export from India. Agricultural Economics Research
Review, 18:291-304