DEPARTMENT
August 2016
No part of this document may be reproduced in any form or by any means, or disclosed or
distributed to any person, without the prior consent of APTS, except to the extent required
for submitting a bid.
This RFP is meant to invite proposals from interested companies capable of delivering the
services described herein. The content of this RFP has been documented as a set of four
volumes explained below:
Volume II: Instructions to Bidders, Scope of Work, and Financial and Bidding Terms & Forms
Volume II of the RFP details all that potential bidders may need in order to understand the
Terms & Conditions, project implementation approach, commercial terms, and bidding
process.
Kindly note that all volumes of the RFP have to be read in conjunction, as there are cross-
references to sections across these volumes. The selected System Integrator will be solely
responsible for any gaps in scope coverage caused by not referring to all the volumes.
CR Core Package
DR Disaster Recovery
SI System Integrator
Use of information technology to automate and digitise Government activities and services is not
new to India. The Digital India program launched by the Government of India aims to propel the
country to the next level of e-Governance maturity. Envisaged to be a programme to transform India
into a digitally empowered society and knowledge economy, it sets the long term direction.
Similarly, the States have their respective e-Governance initiatives. Collectively, there is no
dearth of programmes and projects to implement this vision. The question that emerges is:
with so many initiatives under way, what should bind them together into a holistic approach
that ensures convergence and coherence?
e-Pragati, the Andhra Pradesh State Enterprise Architecture, is this new paradigm. It is a Whole-of-
Government framework and adopts a mission-centric approach to implementation. e-Pragati seeks
to help realise the vision of Sunrise AP 2022 by supporting the seven development Missions
launched by the Government in the areas of Primary Sector (Agriculture & Allied), Social
Empowerment (Education & Healthcare), Skill Development, Urban Development, Infrastructure,
Industrial Development, and the Services Sector.
1.2. Vision
e-Pragati is not a project. It is a large programme with a long-term vision for creating a
sustainable ecosystem of e-Governance. The vision of e-Pragati is stated below:
"e-Pragati is a new paradigm in governance based on a Whole-of-Government
framework, transcending the departmental boundaries. It adopts a Mission-centric
approach in its design and implementation and seeks to realize the Vision of Sunrise
AP 2022, by delivering citizen-centric services in a coordinated, integrated, efficient
and equitable manner."
e-Pragati is a framework to provide integrated services to citizens through a free flow of
information, and to usher in an era of good governance, characterised by efficiency, effectiveness,
transparency, and foresight. The different dimensions of the vision of e-Pragati are described below:
Developmental
1. e-Pragati will be a catalyst for enhancing the effectiveness of implementing various
developmental projects and welfare schemes undertaken by the Government, by providing
insights and foresights through analysis of data.
2. Planning and monitoring of public sector schemes and projects shall take advantage of IT,
GIS, and satellite imaging technologies.
Aspirational
1. e-Pragati shall be an effective tool in realising the vision of Sunrise Andhra Pradesh.
Citizen-Centric
1. Citizens and businesses will have a seamless and smooth interface with Government.
2. Departments and Government agencies will interoperate with ease and provide integrated
services to citizens and businesses.
3. The medium of paper will be minimised in all G2C, C2G, G2B, B2G, and G2G interactions.
Inclusive
1. The digital divide will be adequately addressed, especially by leveraging mobile technologies.
2. e-Pragati will enhance realisation of participative and inclusive governance, by analysing the
devolution of the benefits of development upon various sections of the society and by
measuring the impact thereof.
3. Citizen engagement will be accomplished with ease.
Technological
1. Government and citizens will be enabled to take advantage of leading technologies such as
SMAC (Social, Mobile, Analytics, Cloud), IoT, and Big Data analytics.
2. Principles of open data, open standards and open APIs will be ingrained in the designs of all
information systems.
3. e-Pragati will ensure the right balance between information security and privacy of personal
data.
In line with its vision, e-Pragati seeks to move away from the existing systems of Governance
(Government 1.0) towards establishing Government 2.0. The siloed and hierarchical systems will be
replaced by an integrated and collaborative operating model. The single-channel, 'one-size-fits-all'
models of service delivery will give way to personalised services delivered through multiple
channels. The output-driven processes will be replaced by transparent, outcome-driven procedures.
The citizens will no longer be passive spectators of governance and mere recipients of services, but
will be empowered to be active participants in the governance process.
All these aspirations will be achieved through establishment of a common and shared digital
infrastructure and applications, delivering a set of integrated and cross-cutting services based on
common standards and enterprise principles.
Value to Government:
1. The effectiveness of implementing various development projects and welfare schemes
undertaken by the Government will be enhanced, through extensive use of Enterprise
Project/Program/Scheme Management Systems.
Value to Society:
1. The implementation of e-Pragati will unlock significant potential in the software, hardware,
electronics, and networking sectors. This will have a significant multiplier effect, to the
tune of 4x.
2. e-Pragati, being based on open technologies, will open up new windows for innovation and
produce IT and non-IT employment in various sectors.
3. The economic development of the State will be spurred by increased productivity in all the
seven major sectors comprising the Sunrise AP Mission.
4. The successful implementation of e-Pragati is likely to motivate several such initiatives
across the country, as witnessed in respect of the CARD and e-Seva projects pioneered by
AP, and would lead to faster development of the nation.
A large and complex program like e-Pragati may not make a radical impact unless certain
fundamental principles of Enterprise Architecture are adopted by all the stakeholders, namely, the
Government, the System Integrators and the users. These are stated below. It is the responsibility
of the System Integrator selected for the implementation of DataLytics to meticulously observe
and/or promote the observance of these principles by the other stakeholders.
1. Data is defined consistently throughout Government, and the definitions are understandable
and available to all users. Defining Metadata and Data Standards (MDDS) within each
domain assumes great significance.
2. Data is an asset that has a specific and measurable value to the Government and hence, it
must be managed accordingly.
3. Applications are independent of specific technology choices and therefore can operate on a
variety of technology platforms.
A set of Common Mandatory Standards is prescribed for strict compliance by all the System
Integrators selected for implementing different packages of e-Pragati, to ensure interoperability,
maintainability, and uniformity of user experience across the entire landscape of e-Pragati. These
relate to the areas listed below:
The technical specifications of the above standards can be accessed at the URL
http://e-pragati.ap.gov.in/bestpractices.html.
It is necessary to upgrade the e-Pragati systems to the relevant standards, whenever the standards
are revised, during the currency of the contract. It is part of the responsibility of the SI (to be
selected through this RFP) to follow the relevant standards for system upgrade.
Business Intelligence is a set of methodologies, processes, and architectures that leverage the
output of information management processes for analysis, reporting, performance management,
and information delivery. Data Analytics is the process of developing actionable insights through
problem definition and application of statistical models and analysis against existing and/or
simulated future data. Big data analytics is the process of examining large data sets containing a
variety of data types -- i.e., big data -- to uncover hidden patterns, unknown correlations, and other
useful information.
A Business Intelligence and Data Analytics system comprises applications and technologies for
gathering, storing, analysing, and providing access to data, to help decision makers make
decisions. Typically, the applications include decision support systems, query and reporting, Online
Analytical Processing (OLAP), statistical analysis, forecasting, and data mining.
Business Intelligence and Data Analytics Technologies could help government policy makers draw
key conclusions from data, and become a critical component of a large e-Governance initiative like
the e-Pragati. These technologies apply more to G2G than to other forms of interaction. All major
government plans can be designed, and major decisions arrived at, on the basis of detailed multi-
dimensional analyses of all the relevant data, which, in the context of a Government, is bound to
be immense. Business Intelligence and Data Analytics can play a role in gaining a better insight into
what citizens' needs are and the manner in which they should be met. The use of such analytical
tools also allows decision makers to gain insights, take more informed decisions, and plan more
effectively to introduce new services and improve the quality of existing services.
The DataLytics application is an integrated Business Intelligence and Data Analytics system which
covers both conventional and Big Data. The system is proposed to be a state-wide Analytical Engine
which takes in data from various government department databases, the internet, sensors, machine
logs, and other sources, transforms them, and presents them in an analysable format. DataLytics also
provides tools for performing analysis on the data, gaining insights, making data-based predictions,
and identifying the best course of action for improving operational efficiency and governance.
The role of Big data analytics in Government is gaining increasing significance, as indicated below:
1. Insight: Big Data Analytics opens up the opportunity to use an unlimited amount of data,
particularly data from the internet, to understand public sentiment and improve governance. Data
from newspapers, blogs, social media, news channels, emails, reports, satellites, images, etc. can be
scanned, related, and structured in a manner that can be used for analysis. The insights provided
by big data analytics enable the Government to predict the future and plan appropriate
interventions or make mid-course corrections in all societally important sectors such as healthcare.
In short, the e-Pragati DataLytics system is proposed to be an integrated Business Intelligence and
Big Data Analytics system. It is envisioned to provide data-based and informed decision-making
support to GoAP in its mission of realising Sunrise AP 2022.
The Table below lists the stakeholders of DataLytics and their expectations from it:
DataLytics is a new, whole-of-state analytics system, a pioneering effort among Governance systems.
The Table below gives the key facts of the proposed DataLytics initiative.
The following Table lists DataLytics user scenarios. The scenarios are grouped by departments, and
are meant to provide a quick view of how the proposed DataLytics system will be used to improve
department processes.
S. No User Scenarios
Integrated Scenarios
1. Analysing the content in electronic and social media and other sources to understand
public sentiment on the 10 selected flagship programs of the Government, conducting a
root-cause analysis and suggesting appropriate interventions and mid-course corrections
to improve the delivery of the programs. (Complex)
2. Predicting a disaster (drought only), identifying the areas (districts & mandals) likely to be
affected, and suggesting advance interventions required to mitigate the adverse impact
on the population. (Complex)
3. Analysing the text inputs (unstructured data) in the Grievance Portal (Mee Kosam) and
the popular print media (3 newspapers), identifying key problem areas (Region / Type
of Problem / Frequency / Severity) and suggesting suitable remedial action (once a month)
(Medium)
4. Designing a Happiness Index, appropriate to the socio-economic profile of the State,
supporting the Government in conducting appropriate sample surveys, analysing the
results, and making suitable recommendations for enhancement of the Index. (twice in
the project period) (Medium)
Planning Department
1. Analysing the patterns of public expenditure on top 10 sectors of the economy,
identifying the correlations with the progress in achieving the relevant Sustainable
Development Goals and suggesting the desired areas and sectors for intervention
(Annually) (Simple)
2. Analysing the medium-term impact of development and welfare schemes, identifying the
gaps and realigning the schemes for enhanced effectiveness. (half-yearly) (Simple)
3. Analysing the geographical spread of various schemes and making corrections for even
distribution (Annual) (Simple)
4. Analysing the distribution of community assets, identifying the gaps in demand vs
locations and suggesting the right locations for creating new community assets. (Annual)
(Simple)
5. Analysing the trends of growth of GSDP, geographically and sector-wise (top 20 sectors),
identifying causal factors for high and low growth rates and suggesting the right mix of
interventions required to optimise the growth rate of the economy of the State.
(Complex)
Department of Energy
1. Demand-Supply Analytics and Optimization of generation and power purchase planning
(Medium)
- Data Analysis & DSS: Qualitative & Quantitative analysis of potable drinking water
supplied to the rural people in the habitations as per defined norms through
implementation of various water supply schemes under different programs in the
State (Complex)
- Data Analysis & DSS: Analysis of bus-plying all-weather road connectivity to
habitations through implementation of various road-laying works under different
programs in the State (Medium)
Notes:
1. The GoAP will be responsible for sharing the internal data relating to or generated by the
departments, as required for the SI to generate the reports/insights. The SI will be
responsible for creating the formats and online collection mechanisms for such data.
2. The SI shall be responsible for collecting all the required data external to GoAP.
Project Period    First 11 Depts.    Second 11 Depts.    Total Reports in Year
Year 1                 26                   0                     26
Year 2                166                  86                    252
Year 3                166                 166                    332
The following Table provides indicative details on the distribution of user scenarios by
complexity and year-wise execution.
Note:
I. Report complexity is rounded off to the nearest percentage to give an
indicative number of reports.
II. In the first year, the SI has to deliver the last-quarter reports, after the PoC and
Pilot, for the first 11 departments.
III. In the second year, the SI has to deliver the last two quarters' reports for the
second 11 departments.
IV. From Year 1 onwards, 11 departments, and from Year 2 onwards, an additional
11 departments, shall be considered within the project scope. Hence the SI needs
to procure the necessary software as per the solution requirements on a
year-on-year incremental basis and should also propose, in the technical
proposal, the rollout strategy for scaling up hardware. Hardware shall be
deployed at the SDC/NOC near the new Capital of GoAP. Not all the hardware need
be deployed on day one; however, the bidder should ensure that all hardware is
installed by the end of 2 years from the date of signing of the contract.
Hardware deployment shall be on an incremental basis, spread over 2 years on a
need basis, to meet the performance requirements of the System as per the SLA
specified in the RFP.
V. The indicative complexity of the Year 1 and Year 2 user scenarios is provided in
this Volume of the RFP. The description of the Year 2 user scenarios and any
pending Year 1 user scenarios will be provided to the SI during the SRS phase of
the project.
The logical view of DataLytics shows the key components and layers of the DataLytics system. A
brief description of these components and layers is provided in this section.
DataLytics shall support processing of all types of data from a variety of sources. Given below is an
indicative list of data sources, and categories.
Internal:
  - Structured: Departmental Databases, Data Hubs, Data Warehouse, Data Marts
  - Semi/Unstructured: e-Mails, Documents, XML documents
External:
  - Sensor Data, Log Stream Data, Web sites, Satellite Data, Social media,
    Bioinformatics, Blogs/Articles, Documents, E-mails, Audio-visuals, XML
1. Streaming Data – Streaming data comprises unstructured data coming in from various
sources. The data shall be held in a buffer area and, when a set limit is reached, transmitted
to the DataLytics system (Hold-Transmit).
2. Batch Data – Batch data is normally extracted from within Government departments using ETL
or ELT processes. Structured data may be loaded directly into the Data Warehouse, and
unstructured/semi-structured data into Hadoop or an equivalent (or better) unstructured-data
processing platform.
Special Note: Unstructured Data Processing Platform considerations – The unstructured data
processing platform shall be open source, horizontally scalable, and capable of handling
large volumes, variety, and velocity of unstructured and structured data (Big Data).
Apache Hadoop fulfils most of the requirements; however, better alternatives may be
available in the market today. The rate at which such products are emerging is
staggering, and it is possible that an entirely new open-source Big Data analytics
framework might be in place. In this context, the term "Hadoop" is used in this
document to represent "open-source Big Data processing framework" for the sake of
convenience.
3. Near Real-time Data Analytics Zone – This capability shall process incoming stream data in real
time to provide quick insights into the data. The data may then be persisted on the Hadoop
system. Near real-time analytics shall provide capabilities such as log stream analysis, sensor
data analysis, etc. The real-time analytics system must be able to quickly identify useful data
and discard data that is not useful. Near real-time data shall augment the insights obtained
from batch analysis.
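The Hold-Transmit buffering described for streaming data above can be sketched as follows (a minimal illustration; the class and callback names are invented, not part of the RFP):

```python
# Sketch of "Hold-Transmit": incoming stream records are held in a buffer,
# and once a set limit is reached, the batch is transmitted downstream.
class StreamBuffer:
    def __init__(self, limit, transmit):
        self.limit = limit          # flush threshold (number of records)
        self.transmit = transmit    # callback that ships a batch downstream
        self.held = []

    def ingest(self, record):
        self.held.append(record)            # Hold
        if len(self.held) >= self.limit:
            self.flush()

    def flush(self):
        if self.held:
            self.transmit(self.held)        # Transmit
            self.held = []

# Usage: collect sensor readings and ship them in batches of 3.
batches = []
buf = StreamBuffer(limit=3, transmit=batches.append)
for reading in [21.5, 21.7, 22.0, 22.4, 22.1]:
    buf.ingest(reading)
buf.flush()   # transmit any remainder at shutdown
# batches now holds [[21.5, 21.7, 22.0], [22.4, 22.1]]
```

In a real deployment the transmit callback would hand the batch to the DataLytics ingestion endpoint rather than a local list.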
Derived Data Sources – The Big Data platform includes Data Warehouses, the Hadoop Data Store,
the Discovery Lab, Data Marts, and Data Hubs.
1. Data Warehouse – A state-wide Data Warehouse shall be created to store structured data. The
warehouse shall store whole-of-state data, comprising structured data from departmental
databases and data hubs. The warehouse shall support Massively Parallel Processing and
provide optimal performance for structured and unstructured data. It shall have no single
point of failure.
2. Data Lake - The lake shall have capabilities required to make it easy for developers, data
scientists and analysts to store data of any size, shape and speed and do all types of processing
and analytics. It shall ease ingesting and storing all data while making it faster to get up and
running with batch, streaming and interactive analytics.
3. Data Marts – Data marts shall hold department specific data. For example: Agriculture
department mart, Welfare department mart etc.
4. Data Model – The Big Data Platform shall have a common data model, which should be able to
manage all information, both structured and unstructured, so that department users are
able to get a complete view for supporting enterprise reporting, statistical analysis, forecasting,
and high-performance advanced analytics.
Analytical Data Virtualisation – Data Virtualisation shall provide a layer of abstraction that hides
the complexity of data storage and retrieval underneath. It shall hide cryptic names of tables and
columns from users and provide business-friendly definitions of data, which can be used to create
reports even by non-technical people. The data abstraction layer shall also be capable of accessing
structured data, unstructured data, or both in a single query. The query language shall be standard
SQL, and a query initiated at any level should be able to process data from all data stores
(structured and unstructured). The layer should support a strong optimiser to tune query execution,
for response time as well as throughput.
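The business-friendly abstraction described above can be illustrated with a minimal SQLite sketch (the table, column, and view names are purely hypothetical; in practice the virtualisation layer would be a product capability rather than hand-written views):

```python
import sqlite3

# A cryptic physical table, as it might exist in a departmental database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t_agr_01 (dst_cd TEXT, yr INT, prd_qtl REAL)")
con.executemany("INSERT INTO t_agr_01 VALUES (?,?,?)",
                [("Guntur", 2016, 1250.0), ("Krishna", 2016, 980.5)])

# Business-friendly definition layered over the cryptic base table, so that
# non-technical users can query readable names.
con.execute("""CREATE VIEW crop_production AS
               SELECT dst_cd AS district, yr AS year, prd_qtl AS production_quintals
               FROM t_agr_01""")

rows = con.execute(
    "SELECT district, production_quintals FROM crop_production ORDER BY district"
).fetchall()
# rows == [('Guntur', 1250.0), ('Krishna', 980.5)]
```

A full virtualisation layer would extend the same idea across heterogeneous stores, including the unstructured side, behind one SQL interface.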
Data Usage Layer – These are the usage scenarios of DataLytics. Different users may want different
types of outputs based on their roles, responsibilities, and functions. DataLytics shall provide the
following usage capabilities:
1. Reports and Ad-hoc Queries – Analytical reporting (based on the Data Warehouse/Data Marts).
The system shall provide a scripting language and the ability to handle complex headers,
footers, nested subtotals, and multiple report bands on a single page.
2. The system shall support simple, medium, and complex queries against both structured and
unstructured data.
3. User-specific reports shall be configured by the SI after soliciting requirements from the
respective users.
4. Online Analytical Processing (OLAP) – Slicing and dicing, measuring dependent variables
against multiple independent variables. It enables users to regroup, re-aggregate, and re-sort
by dimensions.
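The OLAP-style regrouping and re-aggregation described in item 4 can be sketched as follows (a toy illustration; the dimensions and figures are invented, not actual departmental data):

```python
from collections import defaultdict

# Toy fact records: the same data can be rolled up along whichever
# dimensions the user picks (slice and dice).
facts = [
    {"district": "Guntur",  "scheme": "Housing", "year": 2016, "spend": 40},
    {"district": "Guntur",  "scheme": "Roads",   "year": 2016, "spend": 25},
    {"district": "Krishna", "scheme": "Housing", "year": 2016, "spend": 30},
]

def rollup(records, dims, measure="spend"):
    """Re-aggregate the measure over any combination of dimensions."""
    totals = defaultdict(float)
    for r in records:
        key = tuple(r[d] for d in dims)
        totals[key] += r[measure]
    return dict(totals)

by_district = rollup(facts, ["district"])   # regroup by district
by_scheme = rollup(facts, ["scheme"])       # re-aggregate by scheme
```

An OLAP engine performs the same regrouping over pre-built cubes at scale, but the operation itself is exactly this re-keyed aggregation.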
Delivery Layer – Describes how users and applications consume output from the DataLytics system.
This may be in the form of DataLytics services, alerts via email and phone, actions, integration with
office applications such as word processors and spreadsheets, collaboration (discussion threads,
etc.), mobile, and so on. The Delivery Layer shall support delivery through the following mechanisms:
1. DataLytics Services – Offer the ability to embed actions, alerts, and reports in another
application, tool, or UI. They shall have the ability to refresh automatically based on a
predefined schedule.
2. Alerts – Notify stakeholders that a certain event has occurred. Alerts may be delivered
in the form of emails, reports, or messages.
3. Actions – Enable users to take some action based on alerts or reports, for example
removing a duplicate record or fixing corrupted data.
4. Portal – Portals provide a mechanism to catalogue, index, classify, and search for
DataLytics objects such as reports or dashboards. All DataLytics reports are to be made
available to department users on the portals, based on their roles and responsibilities.
5. Mobile – Reports, dashboards, and portals shall be accessible on mobile devices too.
6. Office Applications – The system should integrate with standard office products at the
minimum. Data and reports should be importable and exportable from/to office
products.
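The alert mechanism in item 2 could, for instance, be driven by simple threshold rules (a hedged sketch; the rule format, metric names, and channels are assumptions, not RFP specifications):

```python
# Fire a notification for every rule whose threshold is breached.
def check_alerts(metrics, rules, notify):
    fired = []
    for rule in rules:
        value = metrics.get(rule["metric"])
        if value is not None and value > rule["threshold"]:
            message = f"{rule['metric']} = {value} exceeds {rule['threshold']}"
            notify(rule["channel"], message)    # dispatch to email/SMS/etc.
            fired.append(rule["metric"])
    return fired

# Usage: one invented rule watching an invented metric.
sent = []
rules = [{"metric": "grievances_open", "threshold": 100, "channel": "email"}]
fired = check_alerts({"grievances_open": 140}, rules,
                     lambda channel, msg: sent.append((channel, msg)))
```

In the actual system the notify callback would hand the message to the delivery layer (email gateway, SMS, portal notification) rather than a local list.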
The complete Big Data Platform, comprising the Active Data Warehouse, Hadoop, and the advanced
Discovery Lab, should have a single web-based tool to manage and administer hardware, OS, and
software. The management tool should allow DBAs to determine system status, trends, capacity
usage, and even individual queries across the Big Data Platform. It should allow administrators to
manage system throughput, congestion, and health. The management tool should provide a rewind
feature to go back in time and understand the changes in performance a query has gone through.
The management tool should be an integral part of the Big Data Platform and should provide
security through a fully featured role-based permission engine that can function in standalone or
integrated mode.
DataLytics shall contain a GUI that allows users to configure the system, manage metadata, set up
workflows and business rules, set up ETL rules, etc. This interface shall be available to the DataLytics
application maintenance team. It should support removing certain entities or attributes, cleaning up
text, and merging/splitting attributes.
It is envisaged that the DataLytics system will be deployed in the interim data centre of GoAP near
the new Andhra Pradesh Capital. The SI has to propose a suitable deployment architecture taking
this assumption into consideration.
[Figure: Deployment architecture of the DataLytics system. It depicts the delivery channels
(reports, alerts, BI services, mobile, portals, office apps, actions), security systems
(identity & access management, security/SSO and LDAP servers), web and DB servers behind
switches, a multi-node Data Warehouse system, the Data Collection & Processing System with
Hadoop/Discovery nodes 1 to N, portal and reporting servers, a SAN for structured data, a
backup solution, and a centralised monitoring system, connected to the cloud and other
e-Pragati packages.]
The e-Pragati greenfield and brownfield applications are distributed across multiple Data Centres
and different cloud environments. Hence, the deployment location of the DataLytics system shall be
connected to these Data Centres and cloud environments with suitable network bandwidth. The
GoAP will procure and establish the network connectivity with the required bandwidth. The Figure
below indicates the network architecture view of the DataLytics system.
To protect information throughout its DataLytics life cycle, the security controls of participating IT
systems have to adhere to information system security policies. To achieve this objective, the
following processes should be considered and implemented at a minimum:
The following Table gives high level scope of work of the SI in implementing DataLytics project:
The SI shall prepare the SRS document and the Solution Architecture and Design documents in the
form of a BLUEPRINT for the complete functionality of DataLytics. The remaining activities,
including software product configuration/customisation, testing, etc., shall be undertaken as per
standard software development processes.
1. The SI shall perform the testing of the solution based on the approved test plan and
criteria; document the results and shall fix the bugs found during testing.
2. The application should undergo comprehensive testing which includes at least Unit
Testing, System Testing, Integration Testing, Performance Testing, Regression
Testing (in case of any change in the software) and Load & Stress testing.
3. The SI should preserve the test case results and should make them available to the
third party auditor (if any, to be appointed by the PMU at its own cost) for review,
on the directions of the PMU.
4. The SI shall share the tools used for testing the application system with the PMU. If
the tool is a proprietary tool then it should share at least one license with the PMU.
5. The testing of the application system shall include all components vis-à-vis the
functional, operational, performance and security requirements of the project, as
envisioned in this RFP.
6. Though the PMU is required to provide the formal approval for the test plan, it is
ultimately the responsibility of the SI to ensure that the end product delivered
meets all the requirements specified in this RFP and the signed off SRS. The
responsibility of testing the system lies with the SI.
7. The SI shall create a testing environment and ensure that all the application
software upgrades/releases are appropriately tested in this environment before
applying them on the actual production system. Any downtime/system outage for
Application system caused by applying such patches shall be attributed to the SI as
system downtime and shall attract penalties as per SLA.
8. GoAP shall engage a panel of TPAs for conducting the Acceptance Testing of
DataLytics system. A detailed Testing and Quality Assurance methodology to be
followed by SI is given in Annexure 5.
Solution Documentation
The SI shall prepare/update the documents including the following minimal set of Project
Documentation:
a. User Manual
b. Training Manual
c. Operations Manual
d. Maintenance Manual
PoC
After the initial period of system study and analysis, the SI shall design a Proof-of-Concept
covering the 3 scenarios each in respect of the following 2 departments:
i. Civil Supplies
ii. Agriculture
The purpose of the PoC shall be to:
1. Establish the basic functionalities of the Product deployed for the DataLytics
solution
2. Establish the method of extracting data from a variety of sources (e-Pragati Projects,
legacy systems, external sources, web and social media), in a variety of modes
(online, batch mode, messages from sensors and other IoT devices)
3. Establish the ETL methodology
4. Prove the initial set of algorithms designed by the data scientists of the SI in
consultation with the SMEs of the 2 departments chosen for the PoC.
5. Provide the Analytical Reports (3 for each of the 2 departments)
6. Validate the soundness of the methodology adopted for the preparation of the
blueprint, and confirm/change/enhance the contents thereof.
7. Validate the results thrown up by the PoC by comparing with ground realities.
Rollout
THE PILOT IN 4 DEPARTMENTS SHALL BE COMPLETED IN ALL RESPECTS AND ACCEPTED BY
THE TARGET DEPARTMENTS IN A PERIOD NOT EXCEEDING 5 MONTHS FROM THE SIGNING OF
THE AGREEMENT WITH THE SI. After the successful implementation of the PoC and Pilot in
the Civil Supplies, Agriculture, Healthcare, and Labour Departments, the rollout for the
remaining 6 departments, namely Planning, Municipal Administration, ITE&C, Energy,
Welfare, and Panchayat Raj, and one cross-sectional scenario of Phase 1, shall be taken up
for the development of use-case scenarios as per the terms of the RFP. In the second phase,
development of use cases for the remaining 11 departments, namely Finance, Police,
Marketing, Registration, Revenue, Excise, Irrigation, Tourism, Women Development,
Housing, and Transportation, shall be taken up as per the terms of the RFP.
The Rollout shall happen over a period of 12 months from the launch of the pilot.
It shall be in full compliance with all the requirements of the RFP, i.e., the remaining
departments targeted for Year 1 shall be covered within 7 months of the launch of the pilot,
and 11 more departments in the following 6 months.
The Table below briefly summarises the various phases of implementation:
Note: The total project scope is 1 year of development and 3 years of maintenance.
Solution Certification
a. The application has to be free from any security threats, and the SI shall produce
third-party audit certification to that effect.
b. Further, the SI shall obtain third-party certification from agencies empaneled and
approved by APTS/GoAP and shall submit the testing certificate to the PMU before
Go-Live of DataLytics. The cost of certification shall be borne by the APTS/ITE&C
department.
c. In addition, the PMU, at its own cost, may also engage any other third-party agency
to get the application tested. The SI has to provide full support for this activity.
Operational Acceptance for DataLytics shall be awarded by the PMU only if all user
scenarios are met by the SI with full functionality.
In order to accept the System, PMU must be satisfied that all of the work has been
completed and delivered to PMU’s complete satisfaction and that all aspects of the System
perform acceptably. The operational acceptance of the system shall be certified only when
the proposed system is installed and configured at the sites according to the design, and all
the detailed procedures for operating them have been carried out by the SI in the presence
of PMU staff. The system is said to be "Go-Live" when it is installed, configured, and
operationalised for all use scenarios related to the 11 departments, including the
cross-sectoral one.
Based on the above and only after being completely satisfied that at least a minimum
percentage of all the users of internal stakeholders have access to the System and are using
the System for the respective functional areas, the PMU shall issue such OPERATIONAL
ACCEPTANCE of DataLytics.
The SI has to work with departments on data collection and the design of new models for the
implementation of new use-case scenarios, if any.
The SI should develop Standard Operating Procedures (SOPs) in accordance with ISMS, ISO
27005 and ITIL standards. These SOPs shall cover all aspects including infrastructure installation,
monitoring, management, data backup and restoration, security policy, business continuity and
disaster recovery, operational procedures, etc. The SI shall obtain sign-offs on the SOPs from the
department and shall make necessary changes, as and when required, to the fullest satisfaction of
GoAP. GoAP IT and IT-related policies and the security policy shall be adhered to.
The SI shall provide automated, tool-based monitoring of all performance indices and an online
reporting system for the SLAs defined in Volume III of the RFP. The tools should allow the PMU to
log in at any time to see the status.
The weekly SLA report is the summary of the daily SLA reports. The Monthly SLA report is the
summary of the Weekly SLA reports. The Quarterly SLA report is the summary of the Monthly SLA
reports.
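The roll-up described above (weekly summarising daily, monthly summarising weekly, quarterly summarising monthly) can be sketched as below. The report structure and field names are illustrative assumptions only, not prescribed by the RFP:

```python
# Illustrative sketch of the SLA report roll-up: each level aggregates the
# reports of the level below it. Field names ("calls", "within_sla") are
# assumptions for the sketch, not RFP-mandated fields.

def summarise(reports):
    """Aggregate a list of SLA reports into one summary report."""
    total_calls = sum(r["calls"] for r in reports)
    met = sum(r["within_sla"] for r in reports)
    return {
        "calls": total_calls,
        "within_sla": met,
        "compliance_pct": round(100.0 * met / total_calls, 2) if total_calls else 100.0,
    }

daily = [{"calls": 10, "within_sla": 9}, {"calls": 20, "within_sla": 20}]
weekly = summarise(daily)       # weekly = summary of the daily reports
monthly = summarise([weekly])   # monthly = summary of the weekly reports
```

The same function composes upward to the quarterly level, so a single aggregation routine serves all four reporting intervals.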
Besides the SLA reports, the SI also needs to submit the following annually:
a. Certification stating all patches/ upgrades/ service releases have been properly installed
b. Asset Information Register
c. Standard operating procedure
d. Updated Project Exit Management Plan
Further, in the last quarter of the Operation and Maintenance phase, the SI needs to submit the
Project Exit report.
The broad activities to be undertaken by the SI during the operation and maintenance phase are
discussed in subsequent paragraphs.
Setting up and Management of Technical Support (including manpower and other field support staff)
a. The SI shall be required to provide Technical Support (Tech Support) services to enable effective
support to internal and external users for technical queries.
b. Additionally, the SI shall be required to provide Field Support Staff to enable effective field support.
c. The SI shall ensure that the helpdesk facility has the following:
i. Call logging mechanism through Phone
ii. Call logging mechanism through e-mail
iii. Call logging mechanism through portals and applications
d. The SI shall provide at least the following services:
i. Provision and supervision of personnel for the Tech Support. Minimum qualification
requirements for personnel for this process are stated in Volume II of the RFP. Further, the SI
shall ensure that the helpdesk resources and the technical field staff have knowledge of
the local language, i.e., Telugu.
ii. The helpdesk shall provide its services on all working days of GoAP between 08:00 Hrs. and
20:00 Hrs. However, minimal support shall be available for the remaining hours of the day
and on non-working days.
iii. All grievances shall be assigned a ticket number and the number shall be made available
to the user along with the identification of the agent, without the user having to make a
request in this regard, at the beginning of the interaction.
iv. Tech Support team shall provide support for technical queries and other software
related issues arising during day to day operations
e. The Physical space for the helpdesk and any other required infrastructure shall be provided by
the SI.
f. The SI shall categorise the technical issues and potential faults in three levels – Low, Medium
and High, in consultation with the PMU. The levels shall be based on the following criteria.
i. Impact on business through disruption of services and operations;
ii. Number of offices and geographical locations being affected by the issue;
g. The SI shall adhere to the service level agreement with respect to the resolution of issues at
various levels.
h. The interactions shall also be recorded and the records maintained for reference for a period of
3 months from the date of resolution of the problem.
i. All complaints/ grievances of users shall be recorded and followed up for resolution and an
escalation matrix to be developed for any delay in resolution.
j. Apart from the helpdesk recording grievances received through telephone and e-mail, a
portal facility should be made available to users to record their grievances.
k. The Technical Team should register complaints with the Tech Support for
server/network/application related problems. It shall be ensured that complaints logged by
the technical team are treated on a high-priority basis.
l. The SI shall provide the following helpdesk performance monitoring reports –
i. Calls per week, month or other period;
ii. Numeric and graphical representation of call volume;
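The three-level categorisation in item (f) above, driven by business impact and the number of offices affected, can be sketched as a simple rule. The thresholds below are hypothetical; the actual levels are to be agreed with the PMU:

```python
# Hypothetical sketch of the Low/Medium/High issue categorisation described
# in item (f): criteria are business-service disruption and the number of
# offices/locations affected. Thresholds are illustrative assumptions only.

def categorise_issue(service_disruption, offices_affected):
    """Return the severity level for a reported technical issue."""
    if service_disruption and offices_affected > 10:
        return "High"      # widespread disruption of services
    if service_disruption or offices_affected > 10:
        return "Medium"    # disruption OR wide geographic spread
    return "Low"           # localised, non-disruptive issue
```

Resolution targets per level would then be tied to the SLA in Volume III.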
3.3. Deliverables
The following Table outlines the detailed deliverables expected from the SI. A PMU will be created
for each package with an SME (department person), and the PMU will be made responsible for the
sign-off of deliverables.
x. Security design
xi. NoSQL database and Data lake design
Boundary conditions
ii. Machine Configuration
The e-Pragati program is user-centric and aims at providing better government services to the end
user. The goal is to build a common vision of success amongst all stakeholders and ensure that the
institutional and implementation arrangements support the strategy.
The trainer needs to provide special training on DataLytics to the training/faculty teams of the
departments.
Sl. Description
3. A detailed training plan shall be created, and training material shall be prepared, with soft
copies distributed to the participants. 5 copies of printed training manuals shall be
supplied to each of the 22 departments.
4. The training plan shall include details like participant names, training location, date, and
time, and all necessary arrangements shall be made to enable smooth running of the
sessions.
6. The PMU will conduct an exit test at the end of the training session and allot grades to
all participants. There will not be any fail criteria. All participants with the bottom grade
will have to be re-trained. If more than 30% of the participants receive the bottom
grade, then the training will have to be re-conducted.
Sl. Details
1. Training Plan
2. Training Manuals
3. User Guides and Materials
4. Documented Evidence of Successful User Training.
TABLE 10: Training Deliverables
Training shall introduce the GoAP resources to the systems, procedures and processes in an
elaborate manner. The actual requirement of training may be assessed while implementing
DataLytics and will be decided mutually by the Government-designated team and the SI. A
Trainer's Training (train-the-trainer) program will be organised by the Government-designated
team to train the trainers of the DataLytics SI.
A training need analysis of all key stakeholders has to be done, and then a training plan will
have to be developed in line with the overall project plan. Given below are the high-level
requirements of the DataLytics Training Plan:
TABLE 11: DATALYTICS TRAINING REQUIREMENTS
The Change Management initiative shall focus on addressing key aspects of the project, including
building awareness among stakeholders. Change management shall also include the
development and execution of a communication strategy for stakeholders.
Change Management workshops shall be planned and conducted based on the needs of the various
stakeholders of DataLytics. Key considerations for the Change Management Process are given
below:
Sl. Description
2. Assess Change Readiness – How ready are the departments and stakeholders?
Sl. Description
Data Analytics Data Analytics capabilities should provide minimum functionalities of data
integration, ETL/ELT, data quality, reporting, dashboards, data query, advanced
analytics (descriptive, causal, mining, predictive, prescriptive, statistical,
mathematical) and real-time analytics (sensor analytics, log stream analytics).
These functionalities should be available for all types of data, i.e., structured,
semi-structured and unstructured.
Big Data Platform The Big Data platform would integrate and analyse all types of data, i.e.,
structured or unstructured, big or small, real-time or batch, across multiple
data stores. This platform will comprise the following three core components:
the Enterprise Data Warehouse (EDW), the Advance Discovery Lab, and the
Hadoop Platform.
a. The EDW will be an RDBMS-based warehouse; it will hold structured
data in a centralised, consistent manner and will deliver strategic and
operational analytics.
b. The Hadoop Platform will be used for capturing, storing, archiving, and
refining all types of structured/unstructured data.
c. The Advance Discovery Lab will be used by business analysts to unlock
insights from big data with rapid exploration capabilities and a variety of
analytic techniques.
The Big Data platform should also provide seamless query across the 3
components.
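The "seamless query across the 3 components" requirement amounts to federating one query over the EDW, the Hadoop Platform, and the Discovery Lab. A real platform would use connectors or a SQL-on-Hadoop federation engine; the sketch below only illustrates the dispatch-and-merge idea, with plain in-memory stores standing in for the three components:

```python
# Minimal illustration of federated query dispatch across the three platform
# components. The stores here are plain dicts of rows; this is a conceptual
# sketch, not a description of any particular federation product.

def federated_query(stores, predicate):
    """Run one predicate over every component store and merge the results,
    tagging each row with the component it came from."""
    results = []
    for name, rows in stores.items():
        for row in rows:
            if predicate(row):
                results.append({"source": name, **row})
    return results

stores = {
    "edw": [{"dept": "Finance", "records": 100}],
    "hadoop": [{"dept": "Police", "records": 250}],
    "discovery_lab": [{"dept": "Finance", "records": 30}],
}
finance = federated_query(stores, lambda r: r["dept"] == "Finance")
```

The point of the sketch is that the caller issues one query and receives a unified result set regardless of which component holds the data.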
73. Server Operating System The DataLytics system must be deployable on an open-source
Big Data processing platform (Hadoop-equivalent or
better). The system should be deployable on Windows,
Linux, or any common operating system
74. Application development Application development must be language-agnostic and
interactive, with no dependency on Hadoop or any niche
skill sets.
75. Development tools Tools must provide an interactive and wizard-driven
application development environment, but at the same
time must provide native development kits or tools so
that people with programming knowledge and skills can
develop and deploy applications
76. Cloud readiness The DataLytics solution should be deployable in virtualised
environments
Sl Requirements
no
1. The SI shall provide a single system that is optimised and tuned to provide maximum
performance, scalability, and efficiency for DataLytics
2. The hardware and software configuration must be built to protect against component failures
such as disk failures, CPU failures, memory failure, network card failures, and system
controller failures.
5. The proposed system should have an integrated management and monitoring system from
disk to applications.
6. The proposed system should have a unified patching approach where a single release should
patch the entire system, viz. firmware, BIOS, OS, server, network and system software.
7. The proposed system should have a high-speed network interconnect between all
components.
9. The SI should provide single-point support for all the DataLytics components, operating
system, and hardware.
Operating System
Virtualisation
Servers
Storage
Network
Embedded network switching technology
2. Audit Logs should be written once and be readable on multiple devices in a secure manner
4. DataLytics System shall develop and apply Data Lifecycle policies to the Log files
6. Audit Logs should be useful for debugging, error reconstruction, and attack detection
8. Exception messages shall ensure that no unintended information, which could compromise
data, is displayed
9. DataLytics System shall always be fail safe
10. DataLytics shall have a tamper-proof audit trail mechanism for user access. It should record
details of all access to the system at the individual user level. This includes, in part,
logon/logoff times, query run times, data accessed or modified, etc.
11. DW Platform should also track and log attempted access to unauthorised data. In case of
critical data it should trigger an alert and send a notification to the administration and
security personnel
12. DataLytics should have the capability to perform database activity monitoring and blocking
on the network combined with consolidation of audit data from popular databases like
Oracle, MySQL, Microsoft SQL Server, SAP Sybase, and IBM DB2 as well as from operating
system logs.
13. The system should have the capability to perform White list, black list, and exception list
based enforcement on the network.
14. DataLytics should have extensible audit collection framework with templates for XML and
table-based audit data.
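One common way to satisfy the tamper-proof audit trail requirement (item 10 above) is hash chaining: each log entry carries a hash over its own contents plus the previous entry's hash, so any later modification breaks the chain. This is an illustrative sketch of the technique, not a prescribed design for DataLytics:

```python
import hashlib
import json

# Hash-chained, tamper-evident audit log sketch. Entry fields (user, action,
# time) are illustrative; a real trail would follow the RFP's list of events
# (logon/logoff, query run times, data accessed or modified, etc.).

def append_entry(log, entry):
    """Append an entry, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    log.append({"entry": entry, "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(log):
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = "0" * 64
    for item in log:
        payload = json.dumps(item["entry"], sort_keys=True) + prev_hash
        if hashlib.sha256(payload.encode()).hexdigest() != item["hash"]:
            return False
        prev_hash = item["hash"]
    return True

log = []
append_entry(log, {"user": "u1", "action": "logon", "time": "08:01"})
append_entry(log, {"user": "u1", "action": "query", "time": "08:05"})
```

Writing the chained entries to write-once media, as requirement 2 above asks, would further strengthen the guarantee.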
Usage
Criticality      High      Medium    Low
High             Type 1    Type 2    Type 3
Medium           Type 4    Type 5    Type 6
Low              Type 7    Type 8    Type 9
TABLE 14: SERVICE CATEGORIES
Criticality shall be defined in terms of Application and Users. An indicative categorisation matrix is
provided in Table 14. Resources shall be assigned in the order of priority.
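The criticality-by-usage matrix of Table 14 can be encoded directly as a lookup, since the nine service types follow a regular row-by-column pattern:

```python
# Direct encoding of the Table 14 service-category matrix: rows are
# criticality, columns are usage, Type 1 (High/High) through Type 9 (Low/Low).

LEVELS = ["High", "Medium", "Low"]

def service_category(criticality, usage):
    """Look up the service type for a criticality/usage pair."""
    row = LEVELS.index(criticality)
    col = LEVELS.index(usage)
    return "Type %d" % (row * 3 + col + 1)
```

Resource assignment in priority order then reduces to sorting services by their type number.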
Sl. no Description
1 Database should have capability to form in-memory columnar format to accelerate
analytics
2 The in-memory capability should work transparently with existing applications, BI and
Reporting tools
3 The in-memory capability should be compatible with Cloud Computing, Big Data and Data
Warehousing
4 The in-memory capability should be scalable and should ensure data availability and
scalability
5 The in-memory capability should not have any hardware lock-in or limitations
6 The in-memory capability should not have any database size limit
Sl. no Description
1 The DataLytics system shall have in-database mining, analytical and querying capabilities,
and should be able to interoperate with other DBMSs.
2 It should combine large amounts of data with sophisticated analytical processing capabilities
available within database for fast, efficient, parallel and scalable execution of queries.
3 Features and interfaces such as Map Reduce, embedded statistical algorithm and mining
libraries, predictive modelling integration, decision automation, and mixed workload
management would be preferred.
3. Personal information may be processed only insofar as it is adequate, relevant and not
excessive in relation to the purposes for which it is collected and/or further processed.
4. Personal information must be accurate and, where necessary, kept up to date. Every
reasonable step must be taken to ensure that personal information which is inaccurate or
incomplete, having regard to the purposes for which it was collected or further
processed, is deleted or rectified.
5. The personal information in the custody of the DataLytics or any other body, which is a part
of DataLytics, shall not be transmitted to any other body or person without the appropriate
legal authority.
6. Encryption requirements must be identified and applied where relevant. For example:
Passwords, and other sensitive data
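For the password example in item 6, the usual practice is to never store the password itself but a salted, slow key-derivation digest. The sketch below uses the standard-library PBKDF2; the iteration count and algorithm are illustrative assumptions, not values mandated by the RFP:

```python
import hashlib
import hmac
import os

# Illustrative handling of the "Passwords" encryption example above: store a
# salted PBKDF2 digest, never the cleartext. Iteration count and hash choice
# are assumptions for the sketch; a deployment would follow GoAP policy.

def hash_password(password, salt=None):
    """Return (salt, digest) for storage."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def check_password(password, salt, digest):
    """Constant-time comparison against the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)
```

Other sensitive fields would use reversible encryption with managed keys rather than hashing, since they must be read back.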
S. No Requirements Description
1. DataLytics system shall have capability to collect structured, unstructured, and semi-
structured data from various sources such as logs, webpages, sensors, emails, documents
etc. Relevant pull or push based mechanisms shall be used to collect data. Example: Web
crawler to pull data from Web pages.
2. DataLytics shall provide the capability to hold and transmit raw data collected from various
sources to the Data Centre
3. DataLytics shall provide the capability to transfer collected data within data centres to place
it on the right devices for processing
4. DataLytics system shall provide mechanisms to cleanse different types of data –
Traditional, Sensor based, Log Data, and data from internet. Relevant data cleansing
frameworks have to be included. For example: BIO-AJAX framework for standardising
biological data and so on
5. DataLytics system shall provide mechanisms to eliminate data redundancy. Data
deduplication, redundancy detection, and data compression are some of the common
ways to eliminate redundancy, and DataLytics shall adapt these in an optimal manner so
as to not to adversely impact computational capability
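Requirement 5's data deduplication is most simply done by content hashing: records with identical digests are stored once. A minimal sketch of the idea, with string records as a stand-in for real payloads:

```python
import hashlib

# Sketch of content-hash deduplication for requirement 5: compute a digest
# per record and keep only the first occurrence. Records are plain strings
# here for illustration; real payloads would be serialised first.

def deduplicate(records):
    """Return the records with exact duplicates removed, order preserved."""
    seen = set()
    unique = []
    for rec in records:
        digest = hashlib.sha256(rec.encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(rec)
    return unique
```

Keeping only digests in memory (rather than whole records) is what keeps the computational and storage overhead low, as the requirement asks.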
As scalability is one of the key requirements for e-Pragati, and considering the fast developments
in the hardware industry, it is essential that the data warehouse system platform allow co-
residence support of at least two generations of hardware. Co-residence would preserve the
current investment in the data warehouse system as new hardware generations are added to the
existing Data Warehouse system, with the whole acting as one warehouse.
3.13. Performance
3.13.1. Queries per second
DataLytics shall support the following number of queries per second:
Total number of users – 150, with an anticipated growth rate of 5% per annum
3.16. Manageability
DataLytics should efficiently manage simultaneous mixed workload like loading, transformation,
Business Analytics, data mining, development, etc.
3.17. Usability
Sl Description
no
1 The SI shall describe the solution's capability to support multiple user interfaces and any
limitations in its ability to support major web browsers (e.g., Internet Explorer, Firefox,
etc.).
S.No. Requirements
1. The SI shall provide a single system that is optimised and tuned to provide maximum
performance, scalability, and efficiency for DataLytics
2. The hardware and software configuration must be built to protect against component
failures such as disk failures, CPU failures, memory failure, network card failures, and
system controller failures.
3. The proposed system should have an integrated management and monitoring system
from disk to applications.
4. The proposed system should have a unified patching approach where a single release
should patch the entire system, viz. firmware, BIOS, OS, server, network and system
software.
5. The proposed system should have a high-speed network interconnect between all
components.
6. The SI should provide single-point support for all the DataLytics components,
operating system, and hardware.
Operating System
Virtualisation
Servers
Storage
Network
Embedded network switching technology
TABLE 21: GENERIC REQUIREMENTS
The Tables given below provide the details on compute, storage and memory requirements for
structured and unstructured data processing units, and discovery platform:
Usable Data Storage (uncompressed): 25 TB storage (indicative usable storage; the SI can choose
any standard storage, e.g., 20 TB or any XX TB (Unit)).
Usable Data & compute core ratio: GoAP will provide a standard Usable Data & Compute ratio as
part of the proposal as acceptance criteria (Unit); for any future change in the usable data store
during the contract period, the SI should install the required compute as per the acceptance
criteria.
Memory: GoAP will provide a standard RAM (Unit) for the proposed usable storage and compute
as a standard. For any addition of storage or compute during the contract period, GoAP will add
equivalent memory to the DPU.
TABLE 23: STRUCTURED DATA UNIT
Usable Data Storage (uncompressed): 5 TB storage (indicative usable storage; the SI can choose
any standard storage, e.g., 5 TB or any XX TB (Unit)).
Usable Data & compute core ratio: GoAP will provide a standard Usable Data & Compute ratio as
part of the proposal as acceptance criteria (Unit); for any future change in the usable data store
during the contract period, the SI should install the required compute as per the acceptance
criteria.
Memory: GoAP will provide a standard RAM (Unit) for the proposed usable storage and compute
as a standard. For any addition of storage or compute during the contract period, GoAP will add
equivalent memory to the DPU.
TABLE 24: ADVANCED DISCOVERY LAB UNIT
Raw Data Storage (uncompressed): 48 TB storage (indicative usable storage; the SI can choose
any standard storage, e.g., 20 TB or any XX TB (Unit)).
Raw Data & compute core ratio: GoAP will provide a standard Usable Data & Compute ratio as
part of the proposal as acceptance criteria (Unit); for any future change in the usable data store
during the contract period, the SI should install the required compute as per the acceptance
criteria.
Memory: GoAP will provide a standard RAM (Unit) for the proposed usable storage and compute
as a standard. For any addition of storage or compute during the contract period, GoAP will add
equivalent memory to the DPU.
TABLE 25: UNSTRUCTURED DATA PROCESSING UNIT
Notes:
From an implementation perspective, GoAP will provision infrastructure for the initial data size
and YoY growth.
Should be deployable at the Interim Data Centre of GoAP near the Andhra Pradesh New Capital.
Meets the proposed solution requirements and SLA requirements for DataLytics.
Should support the concept of “Unit” for the modular infrastructure requirements of DataLytics,
and:
o The SI has to provision the appropriate type and number of cells for the initial data and
YoY growth.
o The data sizing (either usable or raw) has to be done as per the Unit definitions.
o In case additional infrastructure capacity is required beyond the planned YoY growth,
GoAP will have to supply the number of “Units” needed to address that requirement.
o Supports a high-performing, optimised infrastructure (server, storage, and network)
deployment for a reduced data centre footprint.
Should have separate environments for Development, Testing, and Pre-production for the entire
project duration.
All other supporting hardware like load balancers, network equipment, and backup architecture
will be provided for by GoAP to support the implementation and working of the proposed
DataLytics solution.
total project duration. However, in case of any difference in the number of resources proposed by
the SI, the SI has to provide a detailed justification of how the proposed resources would meet the
overall manpower requirements for the total project duration. Also, please note that as the Phase 2
implementation will still be ongoing in Year 2 of the project, the entire operations team may not be
required; the table below for Operations and Maintenance represents the indicative resources
from Year 3 onwards.
Specific SLAs
The following describes the various SLAs for queries:
The teams are expected to follow a very systematic approach and use appropriate tools for
bidirectional traceability, including defect tracking. The tool is expected to provide end-to-end
traceability from requirements to defects and vice versa (reverse traceability). Traceability has
to be achieved at least by mapping defects to test cases, which in turn are mapped to
requirements that are bundled into sub-modules under modules. As every Business Process is
expected to be achieved by composition of services (orchestration or choreography), all such
services have to be mapped to specific requirements. Similarly, all non-functional requirements
have to be mapped to test cases, which in turn should help substantiate the SLAs related to
Performance, Scalability, Availability, Security and other criteria as defined in the non-functional
requirement sections of each application-specific ePRS document of Volume I.
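The traceability chain described above (requirements mapped to test cases, test cases to defects, with reverse lookup) can be sketched as two mappings and their inversions. The identifiers below are illustrative, not taken from any actual requirements catalogue:

```python
# Minimal sketch of bidirectional traceability: requirement -> test case ->
# defect, plus the reverse walk from a defect back to its requirement.
# REQ-*/TC-*/DEF-* identifiers are hypothetical examples.

req_to_tests = {"REQ-1": ["TC-1", "TC-2"], "REQ-2": ["TC-3"]}
test_to_defects = {"TC-2": ["DEF-7"]}

def defects_for_requirement(req):
    """Forward traceability: all defects reachable from a requirement."""
    return [d for tc in req_to_tests.get(req, [])
              for d in test_to_defects.get(tc, [])]

def requirement_for_defect(defect):
    """Reverse traceability: walk a defect back to its requirement."""
    for req, tests in req_to_tests.items():
        for tc in tests:
            if defect in test_to_defects.get(tc, []):
                return req
    return ""
```

A real traceability tool maintains the same relations with services and non-functional requirements as additional node types in the graph.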
Methodology of development is the decision of the SI. The developed solution should adhere to
industry standards and the principles of e-Pragati.
The quality certification for every intermediate, demo or test release is expected to be given based
on detailed analysis and necessary reports substantiating the relevant criterion. Passing or failing
requirements based on traceability is mandatory. Every business process, service, module and sub-
module has to be certified not only based on the requirements that are passed or failed but also the
test cases and defects still pending. All the senior managers from development, quality
assurance, project management, technical managers and GoAP would decide whether a version is fit
to release based on the reports. The TPA is only for the Security Audit and does not involve testing
the application. Any improved quality control processes and tools beyond the minimal scope stated
above are welcome, and such proposals by the system integrator will be given further attention
during the proposal review.
Different types of testing are anticipated like user interface testing, functional testing, compliance
testing, acceptance testing, smoke testing, integration testing, systems integration testing,
operational readiness testing, performance testing, load testing, stress testing, pre-production and
production testing. All Services and Business Processes have to be tested in a standalone mode. A
brief description of the different types of testing presumed is given below in this section. Please
note that Products, Applications (Desktop, Browser or Mobile) and Portals have to be tested for all
user interface requirements in addition to the functional and non-functional requirements, as
applicable. In all stages and kinds of testing, appropriate tools are to be used wherever possible.
Functional Testing
As a part of the functional testing all the services (granular web services), business services and
business processes are expected to be tested independently in standalone mode using appropriate
tools. All messages (request/response) have to be tested for requirements including the compliance,
security, performance and other criteria. The products, applications and portals have to be tested
for the functional and non-functional requirements.
Integration Testing
The integration testing is the functional testing for integration requirements. All requirements for
integration between sub-modules, modules, intra-package and inter-package that are identified
during the requirements documentation have to be tested in this phase.
Compliance Testing
All requirements that are cross mapped to specific clauses in a specification, policy or standard have
to be tested in this phase. This testing is expected for all the clauses that are relevant in a
specification, standard or policy for the services, processes, products, applications and portals. A
report on the clauses of the specification with pass or fail results for compliance is mandatory.
Performance Testing
Performance Testing includes, but is not limited to, load, stress, scalability and availability testing
requirements and related criteria. To meet specific SLAs or requirements, necessary testing tools
have to be used to confirm that the results meet the defined criteria.
Acceptance Testing
This is otherwise called User Acceptance Testing: the product owners/stakeholders validate all
functionalities as per the business. Such a subject-matter-expert team can also review
requirements and can pass or fail them using the User Acceptance Test cases, which are usually
end-to-end in nature.
Smoke Testing
The application or services are tested after deployment in their environment to ensure application
sanity. Some basic test cases are identified and run to check this.
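A smoke suite of the kind described above is just a handful of basic checks run right after deployment. The sketch below simulates the deployed application with a stub; endpoint paths and status codes are illustrative assumptions, and a real suite would make HTTP calls against the live service:

```python
# Illustrative post-deployment smoke test: a few basic checks confirm the
# application responds. deployed_service() is a stub standing in for HTTP
# calls to the freshly deployed application; routes are hypothetical.

def deployed_service(path):
    """Stand-in for an HTTP GET against the deployed application."""
    routes = {"/health": 200, "/login": 200}
    return routes.get(path, 404)

def smoke_test():
    """Run the identified basic test cases; all must return HTTP 200."""
    checks = ["/health", "/login"]
    return all(deployed_service(p) == 200 for p in checks)
```

If any check fails, the deployment is rolled back before deeper testing begins.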
Pre-Production Testing
This is otherwise called limited user testing. Before taking a release version to production, limited
users are identified and the release is rolled out to them to monitor the product or application. Any
fixes required are applied before taking it to production.
Production Testing
A full release version is tested with full loads by end users for a limited period. This is otherwise
called the warranty state or stabilisation period.
A solution for automated testing and automated test case generation is recommended. This
ensures complete and appropriate test cases are generated, reducing waste and enhancing
application quality, as long as the scope and coverage of the test cases and their results are
verified and signed off by the PMU.
Analytical reports may be categorised as Simple, Medium, and Complex reports. On a high-level, the
criteria for categorisation are:
1. Number of fields to be displayed on reports – As the number of fields increases, the complexity
increases
2. Number of data sources from which data has to be extracted – As the number of data sources
increases, the complexity increases
3. Timeliness of Reports (real-time or batch) – More effort and resources are required to
configure real-time reports than batch reports. Also, if a large report has to be generated in a
few seconds or a minute, then naturally it will require more resources than running the same
report in a few hours.
4. Type of Data – Structured or unstructured. It is much easier to run a query and generate a
report using structured data than unstructured data.
5. Complexity of query – Queries using complex joins and multiple sources will increase the
complexity of the reports
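The five criteria above can be combined into a rule-of-thumb score. The thresholds and weights below are purely illustrative assumptions; as the RFP notes, the SI settles the actual rules with the departments during the system study:

```python
# Hypothetical rule-of-thumb classifier for report complexity based on the
# five criteria above. All thresholds and weights are illustrative only.

def report_complexity(fields, sources, realtime, unstructured, complex_joins):
    """Score a report against the five criteria and bucket it."""
    score = 0
    score += 2 if fields > 20 else (1 if fields > 8 else 0)   # criterion 1
    score += 2 if sources > 3 else (1 if sources > 1 else 0)  # criterion 2
    score += 2 if realtime else 0                             # criterion 3
    score += 2 if unstructured else 0                         # criterion 4
    score += 1 if complex_joins else 0                        # criterion 5
    if score >= 5:
        return "Complex"
    if score >= 2:
        return "Medium"
    return "Simple"
```

Such a scoring scheme makes the categorisation reproducible when the SI and departments validate the anticipated distribution of report complexity.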
The Table below gives a rule of thumb to determine whether a report is simple, medium or complex.
Note that these rules shall be considered high-level guidelines; if required, the SI may use its own
rationale/rules to determine the complexity of reports in consultation with the departments
during the system study:
The following Table provides anticipated distribution of report complexity. However, these are high-
level figures, and it is the responsibility of the SI to validate these numbers.