697-704
Printed in India. DOI: 10.16943/ptinsa/2014/v80i3/55144
Review Article
Atmospheric Data Collection, Processing and Database Management in
India Meteorological Department
A K JASWAL, N M NARKHEDE and RACHEL SHAJI
India Meteorological Department, Pune 411 005, India
The availability of a proper meteorological database is a major prerequisite for scientists and policy makers studying
climate processes. The acquisition of pertinent meteorological data, timely access and database management are
important components that make the climate information valuable. New technologies in the field of communication and
great strides in the automation age have offered new ways of making data products available to the user community. Data
collection and database management are the key components of the availability of climate data. Due to increased observation
frequency, huge amounts of data are received at data centres. Fast and effective quality control for identification and
flagging of suspicious observations is needed at the data centres to provide easy access to information and dissemination
of quality assured observations to the users.
India Meteorological Department (IMD) collects, processes, archives and disseminates a wide range of weather, climate
and other environmental data generated by the department’s observational network. With the installation of ‘CliSys’
Climate Database Management System (CDMS) in the National Data Centre located at Pune, the data acquisition and
processing operations have been automated. This system provides a set of tools and procedures that allow all data relevant
to climate studies to be properly stored and managed. It serves the main objectives of climate data collection, storage,
quality control, easy access, data protection, climate data analysis and customized product generation.
Key Words: Database; Data collection; Data Management; Metadata; Quality Control
activities at the National Data Centre (NDC) in IMD were upgraded with a new Climatological Database Management System (CDMS) named ‘CliSys’, which has been developed according to the World Meteorological Organisation (WMO) guidelines by Meteo France International (MFI). This is a new step to secure the climate data heritage of the country and manage the atmospheric data efficiently. CliSys is a set of tools and procedures that allow all data relevant to climate studies to be properly stored and managed. This paper outlines the current methods of atmospheric data collection, processing, quality control and database management in IMD making use of the newly acquired CDMS.

Methods of Data Collection

Surface weather observations are fundamental to all meteorological services; forecasts and warnings in support of a wide range of weather-sensitive activities are based on them. Collection of meteorological data electronically at the source allows automatic quality controls to be applied, including error checking, prior to data transmission from the observation site. The methods of observation in IMD can be categorized into two major classes: manually observed and automatic data collection stations. Manual observations are recorded in manuscript form at the observing stations and sent to the designated Meteorological Centres (MCs) or Regional Meteorological Centres (RMCs) as shown in Fig. 1. After manual scrutiny at the RMCs/MCs, these data are keyed in using well-defined formats and sent to the National Data Centre (NDC). Data generated by automatic weather stations are subjected to minimum quality checks before being transmitted through the Automatic Message Switching System (AMSS) and are archived at NDC in near real time. The processes involved in data acquisition, processing, quality control and dissemination in IMD are shown in Fig. 2.

Manually Observed Data

Manual surface observatories are located at almost one in each district so as to meet the requirements of weather forecasting, agriculture and other operations of the country. There are at present 530 surface observatories, 45 radiation observatories, 10 air pollution monitoring observatories, 62 Pilot Balloon observatories, 39 Radiosonde and Radiowind observatories, 219 agro-meteorological observatories and more than 700 hydro-meteorological observatories. In addition, marine surface data are recorded and transmitted to IMD by more than 200 ships of the Indian Voluntary Observing Fleet (IVOF). Of the 530 surface observatories, nearly two-thirds are manned by non-departmental staff. The observers of non-departmental observatories are trained in taking observations, while the necessary instruments and other stores are provided by IMD. The meteorological data thus collected from all over the country are used on a real-time basis for operational weather forecasting. These data records are further quality checked and archived in NDC for various kinds of usage, including climate-related studies.

Manually observed data form the bulk of the meteorological database in IMD, which is used in preparing meteorological information not only for planning but also in operational fields. The observations taken manually include air temperature, relative humidity, sunshine duration, solar radiation, wind speed and direction, precipitation, evaporation and cloud cover. The main mode of transmitting data from observation sites to the RMCs and MCs is by internet and telephone. The surface and upper-air observations in World Meteorological Organization (WMO) standard coded messages are sent through the Global Telecommunication System (GTS) for operational use and real-time data processing, quality control and archival use. However, not all meteorological observations recorded by observatories in IMD’s network flow through GTS. Those that do not are manually scrutinised and keyed in at RMCs and MCs in delayed mode in standard formats supplied by NDC. These data files are then transferred to NDC for further processing and archival use. The availability of various kinds of data archived in NDC is given in Table 1.
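
The Methods of Data Collection section notes that electronic collection at the source allows automatic quality controls, including error checking, before transmission. As a rough illustration only, the Python sketch below holds back an observation that lacks a required element or falls outside gross plausibility limits. The element names, the limit values and the function `precheck` are assumptions made for this example; they are not IMD's or CliSys's actual rules.

```python
# Hypothetical gross limits per element; the values are illustrative only.
GROSS_LIMITS = {
    "air_temperature_c": (-60.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
    "wind_speed_ms": (0.0, 100.0),
    "pressure_hpa": (500.0, 1100.0),
}


def precheck(observation):
    """Return a list of problems; an empty list means the record may be sent."""
    problems = []
    for element, (low, high) in GROSS_LIMITS.items():
        value = observation.get(element)
        if value is None:
            problems.append(f"{element}: missing")
        elif not (low <= value <= high):
            problems.append(f"{element}: {value} outside [{low}, {high}]")
    return problems


# Invented example records.
good = {"air_temperature_c": 31.2, "relative_humidity_pct": 64.0,
        "wind_speed_ms": 3.5, "pressure_hpa": 1006.4}
bad = {"air_temperature_c": 131.2, "relative_humidity_pct": 64.0,
       "wind_speed_ms": 3.5}  # implausible temperature, pressure missing

print(precheck(good))
print(precheck(bad))
```

A check of this kind only catches gross errors; subtler problems are left to the quality control applied after the data reach the data centre.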
Table 1: Atmospheric data availability in the archives of the National Data Centre (NDC). The availability of a particular parameter depends upon its year of start of recording.

Data type | Frequency | Start year | Parameters
AWS-SYNOP | Hourly | 2007 | Atmospheric pressure, Air temperature, Dew point temperature, Rainfall, Wind (direction and speed), Sunshine hours
SURFACE | Daily | 1969 | Atmospheric pressure, Air temperature, Dew point temperature, Humidity, Vapour pressure, Evaporation, Rainfall, Wind (direction and speed), Sunshine hours, Visibility, Weather phenomenon
SURFACE | Monthly | 1901 | Atmospheric pressure, Air temperature, Dew point temperature, Humidity, Vapour pressure, Evaporation, Rainfall, Wind (direction and speed), Sunshine hours, Visibility, Weather phenomenon
UPPER AIR | Daily, Monthly | 1951 | Air temperature, Dew point temperature, Wind (direction and speed)
AUTOGRAPHIC | Hourly | 1969 | Atmospheric pressure, Air temperature, Humidity, Wind speed, Pressure, Rainfall, Sunshine
MARINE | Hourly | 1961 | Atmospheric pressure, Air temperature, Dew point temperature, Sea surface temperature, Present and past weather, Wind (direction and speed), Visibility, Clouds, Wind wave, Swell wave
AGROMET | Hourly | 1972 | Air temperature, Wet bulb temperature, Relative humidity, Vapour pressure, Evaporation, Evapotranspiration, Rainfall, Wind (direction and speed), Sunshine hours, Atmospheric pressure, Soil temperature, Soil moisture

AMSS and offline loading by assigning semi-colon delimited ASCII files. The system includes data entry, data monitoring, data retrieval, reporting, quality control and metadata sub-systems integrated into a comprehensive CDMS. The system has been designed to operate in a three-tier configuration, viz. Database server, Application server and Web server. It is mainly composed of five subsystems, namely the Database (in charge of the storage of climate data), Data Acquisition (responsible for real-time and non-real-time acquisition of climate data), Metadata Management (takes care of the acquisition and management of metadata), Data Management (responsible for climate data management) and Production (in charge of data products elaboration and data access), as shown in Fig. 3. The main features of the CDMS installed in NDC are:

A unified Data Storage Structure

It is built around a Relational Database Management System (RDBMS) where data, including all the relevant metadata, are stored in one unique, powerful database, thereby ensuring the centralization and uniqueness of all the information. Furthermore, the features offered by the RDBMS ensure, among others, performance in climate data retrieval, reliability through various backup mechanisms, climate data
Fig. 4: Data consistency and integrity controls in Climate Data Management System
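
To make the idea of database-level consistency and integrity controls concrete, the following sketch loads semicolon-delimited ASCII records into a relational table and lets the database itself reject duplicates and records from unknown stations. It is a minimal illustration under stated assumptions: SQLite stands in for the RDBMS, and the table layout, column names, station identifiers and the helper `load` are all hypothetical rather than taken from CliSys.

```python
import csv
import io
import sqlite3

# Hypothetical semicolon-delimited input: station;timestamp;element;value
ASCII_INPUT = """\
ST001;2014-06-01T03:00;air_temperature;29.4
ST001;2014-06-01T03:00;dew_point;24.1
ST001;2014-06-01T03:00;air_temperature;29.4
ST999;2014-06-01T03:00;air_temperature;30.0
"""


def load(conn, text):
    """Insert each row; return the rows rejected by the database constraints."""
    rejected = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        station, timestamp, element, value = row
        try:
            conn.execute(
                "INSERT INTO observation VALUES (?, ?, ?, ?)",
                (station, timestamp, element, float(value)),
            )
        except sqlite3.IntegrityError as error:
            rejected.append((row, str(error)))
    return rejected


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE station (station_id TEXT PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE observation (
        station_id  TEXT NOT NULL REFERENCES station(station_id),
        observed_at TEXT NOT NULL,
        element     TEXT NOT NULL,
        value       REAL,
        UNIQUE (station_id, observed_at, element)
    )
""")
conn.execute("INSERT INTO station VALUES ('ST001', 'Example station')")

rejected = load(conn, ASCII_INPUT)
stored = conn.execute("SELECT COUNT(*) FROM observation").fetchone()[0]
print(stored, "rows stored;", len(rejected), "rows rejected")
```

In this example the third record repeats an existing station, time and element combination, and the fourth refers to a station that is not in the station table, so both are refused by the constraints instead of by application code.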
Fig. 5: Schematic functional view of Task Management in Climate Data Management System

changes in the observatories, data formats for storing data and its domains (data types), etc., it ensures the referential integrity of data and avoids redundancy and ambiguity of data. Tables storing this vital information about the data (metadata) are referred to as the catalogue (directory of directories). The information can be stored in groups as listings, as required (state-wise lists, subdivision-wise lists, etc.). Computation is an important activity responsible for automated operation of system components to achieve desired goals. It focuses mainly on extracting data from the network (GTS, AMSS, etc.) and putting it in structured database tables following recommended norms. It also includes computation of many derived parameters and tables on the fly. Quality control includes applying standard algorithms to check the authenticity of extracted data values. It also covers database-related constraints that ensure the uniqueness of data records and various relational aspects of parameters within records. Operations consist of the many activities needed to run the system and obtain maximum throughput. They include monitoring the system for availability of services and data 24x7 and loading historical data into the system with matching data loading formats.

The data catalogue is generated automatically

Real-time Data

Real-time data are observations retrieved by automatic data processing methods from the site in real or almost real time, transmitted to a data collection centre instantaneously and delivered to real-time data users, e.g. weather forecasters. Real-time automatic acquisition processes in CliSys consist of two main background modules. The first module reads input ASCII files and converts them into the CliSys format already described. The second one inserts these files into the database. Real-time meteorological observational data such as synoptic messages and AWS/ARG data flowing through GTS are decoded to the proper variables of the CliSys database and are archived. These data consist of WMO standard coded messages, viz. SYNOP, AWS, ARG, TEMP, METAR and SHIPS. Data values are checked against global limits and for internal consistency before ingestion into the database.

Real-time data reception in the database is visualized in near real time. A geographical data monitoring tool displays IMD’s observation network on a map of India and shows the status of the incoming data from stations through colour codes. In addition, a tabular form indicates the status (missing value rate) of a given parameter for a given period and station, in accordance with the measurement frequency expected from the metadata. It displays the real and expected time series in both tabular and graphical forms.

Non-Real Time Data

The non-real time data are those from the manual stations which are sent in delayed mode and received from the Regional Meteorological Centres (RMCs) and Meteorological Centres (MCs) of IMD. These consist of surface, autographic, upper-air, radiation, rainfall and agro-meteorological data. All observatories send their observation records to their
respective data-keying centres on a monthly basis. These observations are first manually scrutinized by the technical sections of the respective data-keying centres located at RMCs and MCs and then keyed in using the pre-defined data formats supplied by NDC. After these keyed-in observations are received from the data-keying centres, quality checks, viz. valid character check for each field, duplicate checks, extreme value limits, internal consistency and hydrostatic checks for the upper-air data, are applied at NDC as per the guidelines given by the WMO for data processing (WMO 1992). These data in ASCII format are then converted into the CliSys format and ingested into the database. The non-real time data acquisition processes use the same data loading procedures as the real-time data.

Quality Control

An important part of the management of a National Meteorological and Hydrological Service (NMHS) is the data quality control process in operation. Due to station automation, increased observation frequency and increased data transmission speeds, huge amounts of data are received in data centres the world over. The quality control process is designed to check the quality of the whole data flow, ensuring that data are error-free as far as possible. Therefore, fast and effective quality control for identification and flagging of suspicious observations is needed to provide easy access to correct information and dissemination of reliable observations to the users. There are several points where errors can creep into the data, so these must be detected and eliminated, and, if possible, the errors should be replaced by corrected values while the original values are also retained. The World Meteorological Organisation (WMO 1981) prescribes that certain quality control procedures must be applied to all meteorological data for their international exchange.

stages. All data quality checks and the flagging system in the CDMS have been developed as per the WMO guidelines (WMO 1993). Quality controls are mainly designed for hourly and daily meteorological data in the system. They make real-time checks at the data ingestion level through tolerance tests. After data loading, the data control process is run, its results are stored in a doubt table and the quality control flags are updated. The quality controls pass through four levels: Syntax cum Tolerance tests, Priority test, Filter tests (Oracle table constraints) and consistency controls. The “Meteorological” controls are performed once the data are in the database. These are Global limit check, Related elements global limits, Internal consistency, Global rate of change and station record limits, Standard deviation check and Station rate of change check. Further, spatial consistency checks can be applied by visualizing climate values spatially, with a direct link to data modification functionalities. Once the data are in the database, non-real time tests are applied, consisting of internal consistency (physical relationships among climatological elements) and temporal consistency (variation of a climatological element in time).

Quality Control Flags

From data ingestion to the archiving processes, and subsequently through the data management module, the CDMS checks the validity of each data value and flags it with the appropriate coded value. During quality control, suspicious values or certain errors are identified and quality control flags are associated with each climate element. Flagging information is assigned to each data element in order to indicate the level of data quality. However, some erroneous and suspicious values are inspected manually to avoid exclusion of extreme weather phenomena.
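
The flagging approach described under Quality Control Flags can be sketched as follows. This is an illustration under stated assumptions, not the CliSys implementation: the flag codes, the thresholds, the `Reading` record and the three checks (a global limit check, an internal consistency check between dew point and air temperature, and a simple rate-of-change check) are hypothetical simplifications. The point it shows is that values are flagged rather than deleted, so that suspect readings can still be inspected manually.

```python
from dataclasses import dataclass
from enum import IntEnum


class Flag(IntEnum):
    """Hypothetical flag codes; the coded values used in CliSys are not given here."""
    GOOD = 0
    SUSPECT = 1
    ERRONEOUS = 2


@dataclass
class Reading:
    hour: int
    air_temp_c: float
    dew_point_c: float
    flag: Flag = Flag.GOOD


# Illustrative thresholds only.
GLOBAL_TEMP_LIMITS_C = (-60.0, 60.0)
MAX_HOURLY_CHANGE_C = 10.0


def raise_flag(reading, flag):
    """Keep the most severe flag; the observed value itself is never overwritten."""
    reading.flag = max(reading.flag, flag)


def run_checks(series):
    previous = None  # last reading not flagged as erroneous
    low, high = GLOBAL_TEMP_LIMITS_C
    for reading in series:
        # Global limit check: value outside physically plausible bounds.
        if not (low <= reading.air_temp_c <= high):
            raise_flag(reading, Flag.ERRONEOUS)
        # Internal consistency: dew point should not exceed air temperature.
        if reading.dew_point_c > reading.air_temp_c:
            raise_flag(reading, Flag.SUSPECT)
        # Temporal consistency: large jump from the previous usable reading.
        if previous is not None and abs(reading.air_temp_c - previous.air_temp_c) > MAX_HOURLY_CHANGE_C:
            raise_flag(reading, Flag.SUSPECT)
        if reading.flag is not Flag.ERRONEOUS:
            previous = reading
    return series


# Invented hourly series for demonstration.
series = run_checks([
    Reading(0, 28.0, 22.0),
    Reading(1, 27.5, 28.5),   # dew point above air temperature
    Reading(2, 95.0, 22.0),   # outside global limits
    Reading(3, 41.0, 20.0),   # sharp rise: possibly a real extreme
    Reading(4, 40.0, 21.0),
])
for_manual_review = [r.hour for r in series if r.flag is Flag.SUSPECT]
print([r.flag.name for r in series])
print("hours queued for manual inspection:", for_manual_review)
```

Keeping the original value next to its flag means a reading such as the sharp rise at hour 3 can be confirmed or rejected by a human reviewer, which is how a genuine extreme event avoids being discarded automatically.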

geographical co-ordinates, address, etc.), station history, sensor history (e.g. calibration, repairs, etc.), station environment (e.g. terrain, exposure, etc.), kind of equipment, state of station, sensor statistics (e.g. frequency and kind of error, etc.), service information and plans (WMO 2003). CliSys provides a module specifically to manage the metadata. It has features for adding new values (stations, instruments, etc.) and modifying existing values. It has in-built tools to consult and display the metadata values. It uses more than 50 tables to manage station details, instruments, parameters, geographic characteristics, sensor information, site pictures, etc. Metadata have a key role in the process of creating datasets, as knowledge of the station history provides increased confidence in the statistical techniques employed to ensure that the only variations that remain in a climate time series are due to actual climate variability and change. Most metadata have been derived from the station’s documentation, both current and historical, available in the department.

Data Retrieval and Supply

In line with the data policies existing in IMD, a variety of climatological data, both basic and derived products, are regularly supplied by the NDC in response to a large number of queries received from central and state governments, universities, research institutes, public sector undertakings and the private sector. These data queries are categorised as departmental, research institutes and commercial organisations. Research institutes can register with IMD, giving specific details of their projects, to get meteorological data at very nominal rates. Data are supplied to commercial organisations at the actual cost of the data, which is revised every three years. The information supplied is used for lay-out of runways, town planning, air-conditioning, industry, port installations, installation of high towers, bridges and other structures, operation of multipurpose energy projects, water and power management, defence operations in inaccessible regions, calibration of defence equipment, environmental studies, renewable energy sources and many more. Daily AWS/ARG data are freely available to the general public on the IMD Pune website (http://imdpune.gov.in/aws/aws_index.html). Monthly extremes of temperature and rainfall for surface observatory stations under the IMD network are compiled every month and are also hosted on the IMD Pune website for the benefit of all users.

Summary

Understanding and predicting changes in climate are important activities for the NMHS of any country. With the development of information technologies, research on weather and climate has also made great progress. The power and user-friendliness of computers in collection, transmission, processing and storage of meteorological data, and the ability to record and transfer information electronically, have given climatologists new tools to rapidly improve the understanding of climate. Hence, it is essential to archive and provide access to the reliable climatological data needed to describe, understand and predict changes in the Earth System as well as the impacts of these changes on society. With the WMO Information System (WIS) architecture at national level, better data management and storage will allow easy discovery, better access and retrieval of climate information and services in future.

Acknowledgements

The authors are grateful to Dr. L. S. Rathore, Director General of Meteorology, for encouragement and support. We are also thankful to the anonymous reviewers for constructive suggestions that helped to improve this manuscript.

References

WMO (1983) Guide to Climatological Practices, Second Edition. WMO No. 100, Geneva
WMO (1992) Manual on the Global Data Processing System, WMO No. 485, Geneva
WMO (1993) Guide on the Global Data Processing System, WMO No. 305, Geneva
WMO (2003) Guidelines on Climate Metadata and Homogenization, World Climate Data and Monitoring Programme (WCDMP-53), Geneva