
BCBS 239: The What, When, Why, Who, Where and How
by Rupert Brown, February 24, 2015

Introduction
The January 2015 BIS BCBS 239 adoption progress report stated that
compliance with Principle 2 (data architecture/IT infrastructure) was rated lowest. BCBS
239 marks the first time the Enterprise IT Architecture profession has been subject to
regulatory scrutiny like its construction and transportation industry forebears.
This has become necessary because distributed Enterprise IT architectures have now hit
the same engineering maturity inflection point, with many partial subsystem issues
and, in some cases, public catastrophic supply-chain failures.
Enterprise Architects must shift focus from solely functional analysis models to address
the sustained volume, velocity and variety of data that they now aggregate and process to
produce meaningful measures of Financial Sector business risks.
Let's remind ourselves what the banks are being asked to do when it comes to data
architecture/IT infrastructure:

"A bank should design, build and maintain data architecture and IT
infrastructure which fully supports its risk data aggregation
capabilities and risk reporting practices not only in normal times
but also during times of stress or crisis, while still meeting the
other Principles."
Perhaps the reason for the lack of compliance is that:
a. There are no concrete definitions of what a Data Architecture for a large, complex
Enterprise should look like, let alone any internationally agreed standards. Many
of the first-generation Chief Data Officers now being appointed by banks have no
track record of "Design, Build or Maintain" in any engineering discipline, let
alone IT.
b. BIS itself hasn't set any concrete expectations either as to what artefacts should
be produced.

What Comprises a Risk Data Architecture?


I didn't find much help from a quick visit to Wikipedia, which says that a risk data
architecture should comprise 3 Pillars:

Conceptual
Logical
Physical

So I drew on my three decades of CTO experience, as well as the thoughts of
regulatory experts, to offer some clarity by having you focus on the following questions:

Conceptual:
o What are the core risk data entities that have to be measured, reported
and mitigated?
o What are the component pieces of data that make up these risks?
Logical:
o What are the linkages between the key entities and the component pieces
of data i.e. synthesis, aggregation, transformations?
Physical:
o Similarly, what are the IT Applications that perform the composition of the
risk entities and their subsequent analysis and distribution?

What You Really Have to Do


BIS offers a few clues, as follows:

"A bank's risk data aggregation capabilities and risk reporting
practices should be:
1. Fully documented and subject to high standards of validation.
2. Considered as part of any new initiatives, including
acquisitions and/or divestitures, new product development, as
well as broader process and IT change initiatives.
Banks do not necessarily need to have one data model; rather,
there should be robust automated reconciliation procedures where
multiple models are in use.
Given the scale of most Financial Institutions' IT estates, which comprise hundreds of
applications deployed over thousands of physical servers, this can only be addressed by
automation, i.e.:

Consistent Discovery and Monitoring of Data Flows, both messaging and block
file transfers, at both Routing and ETL endpoints

Automated Validation Processes to ensure naming and format standards are
being conformed to; this can be done on a sampling basis, as per typical factory
quality-control processes
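Such sampling-based validation is straightforward to automate. A minimal sketch, in which the naming standard, file names and sample size are all illustrative assumptions rather than any agreed convention:

```python
import random
import re

# Hypothetical naming standard for risk data feeds:
# <region>_<entity>_<yyyymmdd>.csv  (illustrative only)
FEED_NAME_RE = re.compile(r"^(emea|amer|apac)_[a-z]+_\d{8}\.csv$")

def sample_conformance(filenames, sample_size=100, seed=0):
    """Check a random sample of observed feed names against the standard,
    factory-QC style, and return the conformance rate plus the failures."""
    rng = random.Random(seed)
    sample = rng.sample(filenames, min(sample_size, len(filenames)))
    failures = [f for f in sample if not FEED_NAME_RE.match(f)]
    return 1 - len(failures) / len(sample), failures

rate, bad = sample_conformance(
    ["emea_rates_20150224.csv", "riskFeedFinal_v2.csv", "apac_fx_20150224.csv"],
    sample_size=3,
)
# bad == ["riskFeedFinal_v2.csv"]
```

In a real estate the filenames would come from the discovery/monitoring layer described above, and the rate would feed a quality-control dashboard rather than a print statement.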

At the moment banks only have very crude Application Portfolio systems, which use an
in-house, arbitrary classification of the business functions applications perform,
with occasional manual audit/regulatory fire-drill processes to provide some form of
information connectivity catalogue.
Ironically, this lack of data analysis rigor leads to banks being repeatedly charged
for unauthorized use of fee-liable data during audit cycles, which often runs to many
millions.

When Are the Core Risk Data Entities Assembled/Distributed?


Timeliness, repeatability and frequency of risk calculations are also key factors in the
BIS Principles; let's now apply the same macro Data Architecture pillars to this section of
their requirements.

Conceptual: When are the key business and information-system events in the
trading day? Are there multiple operational daily cycles running across the
business entities? Are these catalogued, or have they become part of an
organizational folklore?

Logical: The logical When model is fundamentally a classic
Gantt-chart critical-path-analysis problem of resources and dependencies.
Although IT application development tends to favour a range of Agile/Sprint
iterative delivery models, once things are in production the basic laws of
physics must be applied to ensure correct and timely delivery of the core risk data
entities. During the 2008 crisis many banks' daily risk batches ran very late
because the job steps had been added piecemeal and no review
processes/tooling existed to derive the critical path and highlight bottlenecks.

Physical: Both time synchronization and job scheduling capabilities need to be
standardized across all the computing and ancillary equipment of the IT estate to
ensure that consistent production of the key risk data entities can be maintained
and that the necessary capacity/performance headroom required in times of
market stress is either available or can be enabled on demand.
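The critical-path analysis called for under the Logical pillar can be derived mechanically from job durations and dependencies. A minimal sketch, with entirely hypothetical job names and run times:

```python
def critical_path(durations, deps):
    """Longest (critical) path through a batch-job DAG.
    durations: {job: minutes}; deps: {job: [prerequisite jobs]}."""
    finish = {}   # earliest finish time per job
    pred = {}     # predecessor along the critical path

    def ef(job):
        if job not in finish:
            best, best_p = 0, None
            for p in deps.get(job, []):
                t = ef(p)
                if t > best:
                    best, best_p = t, p
            finish[job] = best + durations[job]
            pred[job] = best_p
        return finish[job]

    end = max(durations, key=ef)          # job that finishes last
    path, j = [], end
    while j is not None:                  # walk predecessors backwards
        path.append(j)
        j = pred[j]
    return list(reversed(path)), finish[end]

# Hypothetical nightly risk batch
jobs = {"load_trades": 60, "load_market": 30, "calc_var": 120, "report": 15}
deps = {"calc_var": ["load_trades", "load_market"], "report": ["calc_var"]}
path, total = critical_path(jobs, deps)
# path == ["load_trades", "calc_var", "report"], total == 195 minutes
```

Run nightly against the real scheduler's job definitions, this is exactly the review tooling whose absence let the 2008-era batches drift late: piecemeal job additions show up immediately as a lengthening critical path.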

What You Really Have to Do


Most major Financial Institutions have good implementations of time synchronization
infrastructure across their estates; for BCBS 239 compliance this does not need to be at
the same degree of precision required by low-latency/algorithmic platforms.

Conversely, the same institutions have largely failed to maintain single or
consistently-federated scheduling toolsets across their application portfolios. This is due
to a combination of underinvestment and weak technical leadership, coupled with a
lack of interest from the Enterprise Architecture functions, which have failed to document the
core dimension of time in their taxonomies.
The well-publicized failure of RBS's core mainframe batch processing platform, and its
knock-on effects across the UK banking system for several weeks, should have been a
wake-up call to the industry to invest in comprehension and strategic
optimisation of this key enabling asset.

Why Is the Risk Aggregation and Production Process Implemented in a Particular Way?
Sir Christopher Wren's tomb in St Paul's Cathedral carries the inscription "SI
MONUMENTUM REQUIRIS CIRCUMSPICE", i.e. "If you seek his monument, look around
you": all architectures should have a purpose, and that purpose should be self-evident. BIS hints at
this too with the statement "Risk data should be reconciled with bank's sources, including
accounting data where appropriate, to ensure that the risk data is accurate."
Again we can apply the 3 Pillars to clarify these requirements, as follows:

Conceptual: The "why", or rationale, for how the Risk Aggregation process is
conducted for a particular institution lies at the heart of its Core Business
Operating Model and Business Risk Appetite. If this cannot be documented, and is
not reviewed in line with quarterly financial statements, significant regulatory
scrutiny can be expected.

Logical: Clear, automatically derived process metrics lie at the heart of ensuring
that the Risk Aggregation process is being operated in line with an institution's
Core Business Operating Model, and that it is helping to alert and protect the
institution in times of stress.

Physical: KPIs must be immutable and irrefutable, i.e. directly derived from the
underlying data flows and operations, to have any value.

What You Really Have to Do


Currently the reporting of KPIs is often massaged by a set of human aggregation
processes into monthly Status Decks; cultural change needs to occur to ensure that
real-world dashboard data is automatically embedded into the reports, to avoid ambiguity
and political emphasis.

Who Is Responsible for the Governance of the Risk Data Entity Production and Delivery Processes?
As with the other pieces of this jigsaw, BIS gives few tangible clues, i.e. "The owners
(business and IT functions), in partnership with risk managers, should ensure there are
adequate controls throughout the lifecycle of the data and for all aspects of the technology
infrastructure".
Let's apply our 3-tiers approach again to try to decode this sentence and determine the
Whos.

Conceptual: The key controls in this process are a combination of Job Roles +
Operating Committees + Exception/Escalation Chains; these need to be
maintained in a sustainable, consistent archive and regularly reviewed.
Deputization/succession planning for attendees also needs to be addressed.

Logical: The committee agendas/minutes need to be bound to both the metrics
and any Exception/Escalation activities that have occurred during each reporting
period.

Physical: Where possible, direct binding to KPI data and Exception/Escalation
workflow data should be at the heart of the Operating Committee's minutes/actions,
rather than manual aggregation/interpretation of the data.

What You Really Have to Do


You have to systematically blend the document-based approach of operating committees
and scorecards with the live operational dashboards of process/technical monitoring
technologies, which admittedly is very difficult to do currently. The gap gives rise to the
commonly used "interpreted" RAG scorecard, which is often manually assembled and
massaged to manage bad news and/or accentuate positive results. With the advent of
document-based databases and semantics, this area is ripe for automation and
simplification.

Where Are the Core Risk Data Entities Created/Stored/Distributed?
The notions of Geography and City Planning are much more comfortable spaces for
architects to describe and operate in, and indeed some of the concepts are very mature
in large corporations, so applying the 3-pillar approach would appear to be straightforward.

Conceptual: There are multiple geographic concepts that need to be captured to answer this
question, i.e.:
o Real-World Geopolitical entities and the judicial/regulatory constraints
they impose on the content of the data
o Organizational Geography (i.e. Business Units/Divisions/Legal Entities):
these typically act as segments and groupings of the core risk data entities
and in many cases will be aligned to Real-World constraints/boundaries
o Application Portfolio geography: as noted in earlier sections, there needs
to be a clear linkage between risk data entities and the applications
involved in the production and distribution processes
Logical: Real-World and Organizational Geographies are generally well
understood concepts; the notion of "What is an Application", however, needs to be
clearly defined and agreed in both IT and its client business units. It is notable that
ITIL failed to define a standard for what an application comprises, and often
confuses it with the notion of "Service", which can become a very overloaded term.

Physical: Geopolitical Entities and Organization Hierarchies/Accounts are
typically core components of an Enterprise Reference Data Platform; they are
largely well understood concepts with concrete taxonomies, curated data,
standardized coordinate systems and semantic relationships.
Application portfolios are typically owned by the Enterprise Architecture/CTO
function in Financial Institutions, and in many cases are little more than a collection
of manually maintained spreadsheets supported by PowerPoint or Visio
function/swimlane diagrams, aka "McKinsey charts".
Enterprise IT Architects are often cynically viewed by their departmental peers as
"Ivory Tower Thinkers" because of their inability to make application definitions
concrete, sustainable data entities. This is a key weakness that permeates most
EA departments: they almost always focus on refining the functional
classifications of what applications do, and often have to play catch-up as
corporations reorganise or refocus their strategies.

What You Really Have to Do


The application portfolio of a Financial Institution should be a first-class entity within its
Enterprise Reference Data platform; the CTO/Enterprise Architecture function has
responsibility for its maintenance, which must be largely automated, with
validation/conformance checking processes against the physical infrastructure.
NOTE: An application will often span multiple logical production/dev/test environments, as
well as now being able to be instantiated dynamically on premise or external to an
institution, so the data model and maintenance functions must be able to handle these
short-lived instances.
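As a sketch of what "application as a first-class reference data entity" might look like, including the short-lived instances noted above; every name and field here is an illustrative assumption, not a proposed standard:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class AppInstance:
    """One concrete instantiation of a logical application."""
    app_id: str            # key into the Enterprise Reference Data platform
    environment: str       # "prod", "uat", "dev", ...
    location: str          # "on-prem" or a cloud region
    started: datetime
    stopped: datetime = None   # None while the instance is live

class PortfolioRegistry:
    def __init__(self):
        self.instances = []

    def register(self, inst):
        self.instances.append(inst)

    def live(self, at):
        """Instances that existed at a given point in time - short-lived
        cloud instances are first-class citizens, not an afterthought."""
        return [i for i in self.instances
                if i.started <= at and (i.stopped is None or i.stopped > at)]

reg = PortfolioRegistry()
t0 = datetime(2015, 2, 24, 9, 0)
reg.register(AppInstance("RISK-AGG-01", "prod", "on-prem", t0))
reg.register(AppInstance("RISK-AGG-01", "uat", "cloud-eu", t0, t0 + timedelta(hours=2)))
# three hours on, only the prod instance is still live
assert len(reg.live(t0 + timedelta(hours=3))) == 1
```

The point of the sketch is the `live(at)` query: conformance checking against the physical infrastructure only works if the registry can answer "what existed at time t", not just "what exists now".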

How Is the Risk Aggregation Process Built and Operated?


Finally we get to the most detailed set of architectural artefacts specified by BIS: "A bank
should establish integrated data taxonomies and architecture across the banking group,
which includes information on the characteristics of the data (metadata), as well as use of
single identifiers and/or unified naming conventions for data including legal entities,
counterparties, customers and accounts."
It is interesting to note that BIS focuses on the notion of standardized identifiers and
naming conventions, which are quite basic hygiene concepts; in reality there are some
much more important components of the architecture that need to be defined first.

Conceptual: The key functional elements of the aggregation system are the
corollary of the What section earlier in this document, i.e. the operations
performed on the data and the event stimuli along the critical path of the
production process discussed in the When section.
The other key entity that needs to be described to answer the How question is
the notion of process state, i.e. how to introspect what is occurring at any point in
time, rather than only at process or time boundaries.

Logical: The logical model of state needs to be able to answer the questions: do
we know What is in flight, is it on time (When), is it sufficiently
complete/correct (Why), and if not, Who is working to remediate things?

Physical: The When section discussed the need for temporal consistency and a
coherent task scheduling capability; in this section all of the connections, data flows,
storage and control points, and the physical mechanisms that deliver them, need to be
described.
As with any complex command-and-control system, some level of
redundancy/duplication should be provided, but the range and fidelity of integration
and storage techniques needs to be actively managed.

What You Really Have to Do


As noted above, in many large Financial Institutions the Integration, Storage, Reporting,
Delivery and Command+Control systems have become fragmented and continue to grow, so to
achieve effective Risk Data Aggregation and Reporting compliance a single, integrated
toolset needs to be applied along the supply chain.

So What Have We Learned?

Being compliant with the principles stated by BIS in BCBS 239 requires real IT work, with
tangible Design+Build assets and sustainable Maintenance processes linking
multiple transaction and reference data systems, with real artefacts owned by both Line
and Enterprise Architecture teams.
The Enterprise Application Portfolio and supporting functional taxonomy must become
concrete reference data entities.
Linkage of dataflows to each of the Application/Environment instances must be achieved
through significant mechanisation and automated physical discovery toolsets; manual
fire-drill collections of user opinions are not an option.
NB: Excel, PowerPoint and Word artefacts are only ancillary to the core solution.
And finally: the data produced by this exercise should be used for the strategic
optimisation of the organisation, not just for appeasing a set of faceless regulatory
bureaucrats. "You can only manage what you measure" is a very old business maxim.

Digitizing Risk Data Architecture Reporting to Avoid Plea Bargaining
by Rupert Brown, June 10, 2015

Can applying semantics make BCBS 239 reporting consistent and comparable across all financial services regulatory bodies?
Let's explore.
When Best Practice isn't good enough: the regulatory vagaries and the punitive fines
are making for a tense summer for bank risk officers. On the one hand, regulators are
demanding that a whole lot of architectural work be done, including using standard
LEIs, but haven't been too specific on what all the tasks should be. Lots of hoops, all in
motion.
In my previous blog posting I discussed how to go about constructing a risk data
architecture that would meet the specifications set by BCBS; now let's look in detail at
how to report on a Risk Data Architecture in a financial services institution so that it can be
objectively measured against a peer organization.

Let's remind ourselves what the BCBS asks are:

"A bank should have in place a strong governance framework, risk data architecture and IT infrastructure."
"A bank should establish integrated data taxonomies and architecture across the banking group, which
includes information on the characteristics of the data (metadata), as well as use of single identifiers
and/or unified naming conventions for data including legal entities, counterparties, customers and
accounts."
"A bank's risk data aggregation capabilities and risk reporting practices should be: Fully documented
and subject to high standards of validation. This validation should be independent and review the
bank's compliance with the Principles in this document. The primary purpose of the independent
validation is to ensure that a bank's risk data aggregation and reporting processes are functioning
as intended and are appropriate for the bank's risk profile. Independent validation activities should
be aligned and integrated with the other independent review activities within the bank's risk
management program, and encompass all components of the bank's risk data aggregation and reporting
processes. Common practices suggest that the independent validation of risk data aggregation and risk
reporting practices should be conducted using staff with specific IT, data and reporting expertise."

Hmm, note the use of the term "Independent Validation".


Data Governance Consultant Naomi Clark recently drafted a quick picture of the basis of a 100-page essay.

What is actually happening in practice is that each major institution's regional banks are
lobbying/negotiating with their local/regional regulators to agree an initial form of
compliance, typically as some form of MS Word or PowerPoint presentation, to prove
that they have an understanding of their risk data architecture. One organization might
write a 100-page tome to show its understanding; another might write 10 pages. The
ask is vague and the interpretation is subjective. Just what is adequate?

Stack the Compliance Odds in Your Favor


XBRL has become the dominant standard for most of the major Regulatory Reporting
initiatives; behind each of its dialects is what is known as a data point model, i.e. all the
key risk and other financial values that comprise the report. What if we could extend XBRL
to report BCBS 239 compliance, and then be able to compare each of the submissions
produced by the banks?

Some ground rules


There are no right answers as to what a risk data architecture should look like; the key
thing is to find a way of systematically comparing different architectures to find
inconsistencies and outliers. This is what the regulators do with the financial content, so
why not do it with the system architectures?
NOTE: We need to build a model which, of course, needs to be anonymous. For public
reporting purposes we don't put any system names in the model (e.g. Murex, Calypso, etc.
would be meaningless), nor do we put in any technology types for the interchanges or
schedulers (e.g. MQ, EMS, Autosys, Control-M, etc.).

The Parts: Data Points, Paths, Applications, Interchanges & Schedulers

Instead of a massive PowerPoint or Word document, I propose a comparative logical
model comprising 5 key components:

Data Points: A Data Point is what is actually reported as a number, based on the intersections/dimensions
of an XBRL report, i.e. effectively a cell in a spreadsheet.

Paths: Each Data Point is created from a Path, i.e. the flow of data via a group of Applications and
Interchanges. NOTE: Many Data Points will be generated from a single path, although given the number of
data points in a regulatory report we can expect that many different paths will exist across the application
estate.

Applications: An Application is a collection of computers (physical or virtual) that run a group of
operational functions in a bank. In reality applications are really financial entities, a set of hardware
and software costs grouped together; this is one reason why this model is defined at the logical level,
as the groupings are arbitrary within and across banks. NOTE: The same Application or Interchange may
appear in multiple data point paths.

Interchanges: Multiple data interchanges will exist between applications (even within a path) to produce
a data point; these will be message- or batch-based, and related to different topics and formats. As
a rule of thumb, each data entity/format being moved, any distinct topics used to segregate it, and each
transport used to move it, results in a separate interchange being defined. When you start to think about
the complexity this rule of thumb implies, you get to the heart of why BCBS 239's requirement for accurately
describing Data Architectures is needed, and why moving to a logical approach levels the playing field.

Schedulers: Applications and Interchanges are both controlled by schedulers. In an ideal world there would
be a single master scheduler, but in reality multiple schedulers abound in large, complex organisations.
Identifying the number of places where temporal events are triggered is a major measure of technical and
configuration risk and variability in a bank's risk architecture. As I stated, a single controlling scheduler
would be the right thing to do in an ideal world, but the lessons of the RBS batch failure are a textbook
case of concentration risk in a system design.

An Example Path: Putting it all together, the illustrated path comprises 5 Applications and
4 Interchanges (2 of which are message-based, 2 of which are scheduled, e.g. FTP), under 2
controlling Schedulers.
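The five-part model is compact enough to capture in a few lines of code. Here is one possible sketch of the example path and the anonymous metrics a regulator could compare; the labels A1..A5, I1..I4, S1/S2 are placeholders, exactly as the ground rules require:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Application:
    app_id: str                # anonymous label, e.g. "A1"

@dataclass(frozen=True)
class Interchange:
    ix_id: str
    kind: str                  # "message" or "batch"
    scheduler: str = None      # controlling scheduler, if batch

def path_metrics(apps, interchanges):
    """Summary measures for one Data Point path - the numbers that can be
    compared across banks without revealing any product or vendor names."""
    return {
        "path_length": len(apps),
        "interchanges": len(interchanges),
        "message_based": sum(1 for i in interchanges if i.kind == "message"),
        "batch_based": sum(1 for i in interchanges if i.kind == "batch"),
        "schedulers": len({i.scheduler for i in interchanges if i.scheduler}),
    }

# The example path: 5 applications, 4 interchanges, 2 schedulers
apps = [Application(f"A{n}") for n in range(1, 6)]
ixs = [Interchange("I1", "message"), Interchange("I2", "message"),
       Interchange("I3", "batch", "S1"), Interchange("I4", "batch", "S2")]
# path_metrics(apps, ixs) ->
# {"path_length": 5, "interchanges": 4, "message_based": 2,
#  "batch_based": 2, "schedulers": 2}
```

Because the model carries no system or technology names, two banks can publish these metrics side by side without leaking anything commercially sensitive.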

So what does this abstract model achieve?


The abstract model allows us to look objectively at the variability of each bank's data
architecture without getting lost in the often emotive technical weeds of "my approach is
better than yours".
NOTE: There are no "right" answers, but there may be some interesting features.
What if some banks reported path lengths of 20 systems to compute a particular Data
Point when the average is, say, 12? If they had, say, 20 different scheduling
mechanisms for triggering data movements and calculations, this should also be a
significant concern.
Banks that report very short path lengths of 2 or 3 would also raise concerns: are they
really much more efficient, or have they failed to accurately map out their data
flows/formats/interchange technologies?
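The outlier screening suggested here becomes straightforward once path lengths are reported consistently. A sketch, where the bank names, the numbers and the 1.5-sigma threshold are all illustrative choices rather than a proposed rule:

```python
from statistics import mean, stdev

def flag_outliers(path_lengths, z=1.5):
    """Flag banks whose reported path length sits more than z sample
    standard deviations from the peer-group mean, in either direction -
    both the 20-system sprawl and the suspiciously short path stand out."""
    mu = mean(path_lengths.values())
    sigma = stdev(path_lengths.values())
    return {bank: n for bank, n in path_lengths.items()
            if abs(n - mu) > z * sigma}

reported = {"BankA": 12, "BankB": 11, "BankC": 13, "BankD": 12, "BankE": 25}
print(flag_outliers(reported))   # -> {'BankE': 25}
```

Note that with small peer groups the outlier itself inflates the sample standard deviation, which is why a threshold below the textbook 2-sigma is used in this toy example; a real regulator would pick a method robust to that (e.g. median-based).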

Potential extensions to the abstract model


It would be instructive to extend the attributes of Applications, Interchanges and
Schedulers to include measures of the rate of change, i.e. both functional changes and
platform changes; the need now to support both stress testing and back-testing of
risk calculations means that regulators need to know that systems are being remediated.

So for each reporting period the dates of the last functional change and platform change
should be reported, along with the number of functional and platform changes in the
reporting period.

Adding Bitemporality to the mix


The degree to which each of the systems and interchanges in a path is bitemporally aware
is also a significant attribute that can be added to this mix. Without this capability,
back-testing can only be done by building copies of system environments and setting the
content and clocks back to the required period.
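A bitemporal store keeps two time axes per fact: when the value applied in the business world (valid time) and when the system recorded it (transaction time). A toy illustration of the "as-of" query that makes back-testing possible without rebuilding environments; the entity name, values and dates are invented:

```python
from datetime import date

# (entity, value, valid_from, recorded_at) - a bitemporal fact list
facts = [
    ("VaR_limit", 100, date(2015, 1, 1), date(2015, 1, 1)),
    ("VaR_limit", 120, date(2015, 1, 1), date(2015, 3, 1)),  # later restatement
]

def as_of(entity, valid_date, known_date):
    """What did we believe on known_date about the entity's value on
    valid_date? This is the query a back-test needs: replay the world as
    it was known then, restatements excluded."""
    candidates = [f for f in facts
                  if f[0] == entity and f[2] <= valid_date and f[3] <= known_date]
    return max(candidates, key=lambda f: f[3])[1] if candidates else None

# In February 2015 the restatement did not exist yet...
assert as_of("VaR_limit", date(2015, 2, 1), date(2015, 2, 1)) == 100
# ...but by April the corrected value is what we know.
assert as_of("VaR_limit", date(2015, 2, 1), date(2015, 4, 1)) == 120
```

Systems that can answer `as_of` directly remove the need for the clock-rolled environment copies described above.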

Normalizing the Architecture: Adding Dimensionality


If you've got this far, you now understand that we've managed to define a set of abstract
logical architectures across the banks, meaning we can compare scale, complexity and
volatility, as well as potentially driving statistical convergence to a consistent level of detail.
To the untrained eye, however, the picture above looks like some sort of abstract Maths PhD
thesis. There is no easy way for any banking/technology subject matter expert to intuitively
interpret the results and compare banks objectively in terms of their functional
approach/evolution to Risk Data Management/Delivery.
So we need to add some sort of consistent functional classification to the Application list
described in the Path specifications.
Any functional model will do; my starter proposal would be BIAN (www.bian.org), as it
seems to have some traction in the FS Architecture community. Its massive PDF gives a
standard set of names for bank services (albeit it is not an ontology, just a spreadsheet;
BIAN really needs to amend this). So build your abstract model for each Application A1
through An, and then map each of your applications to BIAN's services. Currently each
bank has its own service model and cannot be compared, so this mapping is your
normalization: it adds a form of functional geography, if you will.
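Once every application carries a BIAN-style service label, comparing portfolios reduces to counting. A sketch; the service-domain names below are loosely BIAN-flavoured but are illustrative assumptions, not quoted from the actual BIAN standard:

```python
from collections import Counter

# Hypothetical in-house application IDs mapped to assumed BIAN-style
# service-domain names (the real BIAN list is far larger).
bank_a = {"A1": "Party Reference Data Directory",
          "A2": "Party Reference Data Directory",
          "A3": "Market Risk Models"}
bank_b = {"B1": "Party Reference Data Directory",
          "B2": "Market Risk Models",
          "B3": "Market Risk Models",
          "B4": "Market Risk Models"}

def service_profile(portfolio):
    """Count applications per service domain. Once both banks speak BIAN,
    'why does Bank A have 2 reference data systems vs Bank B's 1?' becomes
    a question a regulator can actually ask."""
    return Counter(portfolio.values())

profile_a = service_profile(bank_a)   # 2 ref-data apps, 1 risk-model app
profile_b = service_profile(bank_b)   # 1 ref-data app, 3 risk-model apps
```

The same counting works for interchanges between service domains, which is where the more interesting cross-bank variability shows up.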
Footnote: Today each major bank defines its own private Enterprise functional model,
with the content often mastered in PowerPoint; my previous two employers both did this
and then transcribed the contents into Planning IT for various technology/systems
portfolio reports.
This classification was then used to produce functional system estate maps, aka
"McKinsey charts", used by IT management to try to spot duplicative functionality
across the application portfolio. However, because the charts do not contain the data
interchanges between the systems, the approach is fundamentally flawed, as the true
complexity is never shown.

Almost there

By applying a BIAN (or similar) mapping to the abstract Applications + Interchanges graph
we now have a comparable set of functional and data-flow-based topologies across the
set of reporting banks, described in banking language rather than abstract algebra. We
can now ask why Bank A has 6 reference data systems vs Bank B's 2, or why the number
of interchanges between 2 functional domains ranges from 20 in Bank A to 120 in Bank B.
As with the abstract model, adding this form of consistent dimensionality drives reporting
convergence; if data lies outside statistical norms it will naturally provoke curiosity
and scrutiny.

Using FIBO as a dimension for Interchanges


FIBO (http://www.omg.org/spec/EDMC-FIBO/) can be applied to the set of Interchanges
in similar fashion to BIAN. At the moment, however, FIBO is incomplete, so comparisons will
be limited to the foundational entities rather than the high-level asset classes. Initially this
does not matter, as the initial focus of any analysis will be on scale and macro-level
variability, but over time FIBO will need to be completed to allow full data entity flow analysis
of a risk system topology.
Using FIBO should also have an interesting side effect: it can be assumed that each
standard output Data Point will be comprised of the same FIBO entities for each bank, so
any discrepancies will be very easy to spot.
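Spotting those discrepancies is then a simple set comparison per Data Point. A sketch in which the FIBO-style labels and the data point name are invented for illustration:

```python
# Assumed FIBO-style entity labels per reported Data Point. If the same
# standard Data Point decomposes differently at two banks, the difference
# is immediately visible as a set difference.
bank_a = {"own_funds": {"fibo:LegalEntity", "fibo:DebtInstrument", "fibo:Equity"}}
bank_b = {"own_funds": {"fibo:LegalEntity", "fibo:DebtInstrument"}}

def discrepancies(a, b):
    """Per-data-point symmetric difference of entity sets: everything one
    bank uses in a decomposition that the other does not."""
    return {dp: a[dp] ^ b[dp]
            for dp in a.keys() & b.keys() if a[dp] != b[dp]}

# discrepancies(bank_a, bank_b) == {"own_funds": {"fibo:Equity"}}
```

A non-empty result does not say who is wrong, only where a regulator should ask the next question.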

So How Do I Actually Add This Data to My XBRL-Based Regulatory COREP/FINREP etc. Submissions?
This is where the Semantics part of the title comes in. We could use triples to visualize
this, but it would not be a pretty picture (literally). Instead I am using triples to
compute the associations. We can describe the graph-based path model, the scheduling
and temporal attributes of the system, as well as the BIAN and FIBO classifications, as a
set of semantic triples, e.g. "A1 has a ServiceType of Instrument Reference Data".
Conveniently, XBRL is an XML-based reporting syntax, so we can extend the XBRL
report content using the industry RDF specifications of triples;
see http://en.wikipedia.org/wiki/Resource_Description_Framework for full details of the
syntax and plenty of examples.
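As a flavour of what such triples might look like when serialized, here is a hand-rolled sketch; a real implementation would use an RDF library and an agreed vocabulary, and the namespace and predicate names below are invented for illustration:

```python
EX = "http://example.org/bcbs239#"   # hypothetical namespace

triples = [
    (EX + "A1", EX + "hasServiceType", "Instrument Reference Data"),
    (EX + "A1", EX + "feeds", EX + "I1"),
    (EX + "I1", EX + "controlledBy", EX + "S1"),
]

def to_ntriples(ts):
    """Serialize (subject, predicate, object) tuples as N-Triples lines:
    URIs in angle brackets, plain strings as quoted literals. This is the
    kind of fragment that would ride alongside the XBRL report content."""
    lines = []
    for s, p, o in ts:
        obj = f"<{o}>" if o.startswith("http") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```

The serialized lines state, machine-readably, exactly the "A1 has a ServiceType of Instrument Reference Data" example given above.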

Is There Really an Alternative?


Without triples, the agreed "essay" format of BCBS 239 exam submission (the 100 pages
or so) seems to be the norm in the industry. As a taxpayer and shareholder in a number of
banks, it seems only logical to question how that approach will:
a. Build confidence that our political and regulatory masters are being systematic and
consistent in their reporting content and metrics;

b. Reduce the likelihood of future banking IT failures, since it does not deliver
continuously improving comparative measurement and analysis;
c. Improve the ROI to bank shareholders of their holdings and dividend payouts, as it
does not provide means to target efficiency improvements in IT systems and
supporting institutional knowledge.
It won't. We are paying people to draw pictures, and we have no idea if they are correct. We
have to move into a more grown-up world and take the human element out; we can
actually create a comparative scale.

Let's Recap, Summarize and Comply


Today BCBS 239 is a set of qualitative concepts that the FS industry is trying to achieve
with no comparable, objective measures of success. As a result, an inordinate amount of
professional services fees is being spent by the major banks to try to achieve
compliance, coupled with a series of negotiations with regulatory bodies, themselves
not deeply IT-literate, to agree an initial level of completeness and correctness.
The semantic approach proposed above has the following benefits:

An extension of existing agreed industry-standard reporting formats/syntax
An objective way of assessing different FS risk management architectures and
processes
Application of standardized terminology to any logical Risk Data architecture
topologies
A range of statistical measures to benchmark FS Data Architectures and drive
convergence/remediations
Architectural Risk and Volatility measures independent of specific application and
technical platform religious/political factions
A reporting process that is repeatable, with evolutionary measures of volatility,
convergence and improvement
Actual embedding of standards such as BIAN and FIBO to solve real-world industry
problems and drive consistent adoption