Institute for Informatics and Automation Problems of the
National Academy of Sciences of the Republic of Armenia

Suren A. Chilingaryan

Universal Data Exchange Solution for Modern
Distributed Data Acquisition Systems and Its
Implementation for Cosmic Ray Monitor Networks

Thesis
submitted for the PhD degree in Computer Science

Collaborations:

Institut für Prozessdatenverarbeitung und Elektronik,
Forschungszentrum Karlsruhe, Karlsruhe, Germany

Cosmic Ray Division,
Yerevan Physics Institute, Yerevan, Armenia

Scientific adviser: Prof. Dr. Hartmut Gemmeke

Co-adviser: Dr. Wolfgang Eppler

February 2006
ACKNOWLEDGMENT

The research presented in this thesis was carried out at the Institut für Prozessdatenverarbeitung
und Elektronik at Forschungszentrum Karlsruhe. The results were implemented in the Cosmic
Ray Division of the Yerevan Physics Institute.
I wish to thank Hartmut Gemmeke for the opportunity to carry out this interesting research on
data exchange topics at Forschungszentrum Karlsruhe. I would also like to thank Wolfgang
Eppler for introducing me to the problems of data exchange in modern physics experiments
and for his support throughout all my work on the thesis. Without his help this work would never
have been accomplished. I wish to express my gratitude to Cord Lefhalm for helping me to better
understand the demands of control systems, especially those of the KATRIN experiment. I would
also like to mention his valuable help in launching the LabVIEW bindings of the prototype
implementation together with LabVIEW Real-Time running on the NI FieldPoint devices. Useful
discussions on optimization topics with Michael Zapf inspired a number of the ideas used in the
prototype implementation. I want to thank all my colleagues (Rainer Stotzka, Volker Hartmann,
Nicole Ruiter, Michael Beller, Rong Liu, Tim Mueller, Heinz Francone, Armen Beglarian) at
Forschungszentrum Karlsruhe for their friendship and help.
The implementation of the dissertation results for the Aragats Space Environmental Center was
done in collaboration with the electronics and data analysis groups of the Cosmic Ray Division of
the Yerevan Physics Institute. I want to express my gratitude to Varuzhan Danielyan, Aram
Eghikyan, Nerses Gevorgyan, Karen Arakelyan and Ashot Avetisyan for the fruitful collaboration
and many useful discussions. I am also grateful to the staff of the Nor Amberd and Aragats
research stations for their hospitality and support in implementing the new data acquisition system
for the ASEC monitors. Additionally, I want to thank the lecturers of the International Conference
“Solar Extreme Events 2005” for introducing me to the problems of solar physics and Space Weather.
The implementation was developed on the Linux operating system by means of a wide range of
sophisticated tools provided by the GNU, Apache and other open source communities. The
XML functionality is based on the Gnome XML Library. I wish to extend many thanks to the
people contributing to all these projects. I also wish to thank the Free Software Foundation and
other organizations supporting free software development.
Many of the materials used during the thesis preparation were obtained from the CiteSeer.IST
Scientific Literature Digital Library. Therefore, I would like to express my gratitude to the people
maintaining this fine archive of electronic publications.
CONTENT

Acknowledgment................................................................................................................................. 2
Content ................................................................................................................................................ 3
Acronyms ............................................................................................................................................ 8
Introduction ....................................................................................................................................... 11
Chapter 1 Distributed Data Acquisition Systems .............................................................................. 16
1.1. Introduction ....................................................................................................................... 16
1.1.1. DDAS Data Exchange ............................................................................................... 18
1.2. Existing DDAS .................................................................................................................. 19
1.2.1. Aragats Space Environmental Center (ASEC).......................................................... 19
1.2.2. Karlsruhe Tritium Neutrino Experiment (KATRIN)................................................. 21
1.2.3. H.E.S.S. Telescope .................................................................................................... 22
1.3. Data Exchange Protocol Requirements ............................................................................. 23
1.4. Existing RPC Solutions ..................................................................................................... 25
1.4.1. Open Network Computing Remote Procedure Call .................................................. 25
1.4.2. Distributed Computing Environment / Remote Procedure Call................................ 25
1.4.3. Internet Inter-ORB Protocol ...................................................................................... 26
1.4.4. Internet Communications Engine .............................................................................. 26
1.4.5. Inter-Client Exchange................................................................................................ 26
1.4.6. Desktop Communication Protocol ............................................................................ 27
1.4.7. Multimedia Communication Protocol ....................................................................... 27
1.4.8. Object Remote Procedure Call .................................................................................. 27
1.4.9. Java Remote Method Protocol................................................................................... 27
1.4.10. D-Bus......................................................................................................................... 27
1.4.11. XML Remote Procedure Call .................................................................................... 28
1.4.12. Simple Object Access Protocol ................................................................................. 28
1.4.13. Open Process Control – Data Access ...................................................................... 28
1.4.14. PROFInet ................................................................................................................... 29
1.4.15. Open Process Control - XML Data Access ............................................................... 30
1.4.16. Summary.................................................................................................................... 30
1.5. OPC XML-DA Specification ............................................................................................ 31
1.5.1. OPC XML-DA Interfaces.......................................................................................... 32

1.6. OPC Complex Data Specification ..................................................................................... 33
1.6.1. Complex Type Descriptions ...................................................................................... 33
1.6.2. OPC Binary Type System ......................................................................................... 35
1.6.3. Complex Data Representations ................................................................................. 36
1.6.4. Data Filters and Queries ............................................................................................ 37
1.7. Issues of the OPC XML-DA Protocol............................................................................... 38
1.8. Summary............................................................................................................................ 40
Chapter 2 New High Data Rate Approach ........................................................................................ 41
2.1. Introduction ....................................................................................................................... 41
2.2. Linkage between SOAP message and Binary Data........................................................... 42
2.2.1. Base64 Encoding ....................................................................................................... 42
2.2.2. SOAP message with Attachment............................................................................... 42
2.2.3. Web Service Attachment ........................................................................................... 43
2.2.4. Summary.................................................................................................................... 43
2.3. Binary Formats in Heterogeneous Environment ............................................................... 44
2.3.1. External Data Representation .................................................................................... 44
2.3.2. Common Data Representation................................................................................... 45
2.3.3. Abstract Syntax Notation One................................................................................... 45
2.3.4. Native Data Representation ....................................................................................... 46
2.3.5. XML .......................................................................................................................... 46
2.3.6. Summary.................................................................................................................... 47
2.4. HDR Binary....................................................................................................................... 48
2.5. Historical Data Access....................................................................................................... 49
2.6. Multicasting....................................................................................................................... 50
2.6.1. Pragmatic General Multicast ..................................................................................... 51
2.6.2. Summary.................................................................................................................... 52
2.7. Security Aspects ................................................................................................................ 52
2.7.1. SSL Security .............................................................................................................. 52
2.7.2. X509 Certificates....................................................................................................... 53
2.7.3. OPC XML-DA Security ............................................................................................ 54
2.7.4. XML Encryption and XML Signature Specifications............................................... 54
2.7.5. Multicast Security...................................................................................................... 55
2.7.6. XML Security ............................................................................................................ 56
2.8. Compatibility Aspects ....................................................................................................... 57

2.9. The OPC XML-DA HDR Specification............................................................................ 57
2.9.1. HDR Specific Data Space ......................................................................................... 59
2.9.2. HDR Specific Metadata Properties............................................................................ 60
2.9.3. Type Systems............................................................................................................. 61
2.9.4. Synchronous Clients Support .................................................................................... 61
2.9.5. Standard Queries ....................................................................................................... 63
2.9.6. Custom Data Representation Query .......................................................................... 63
2.9.7. Historical Data Access Query.................................................................................... 64
2.9.8. Query Extracting Component Parts from Compound Data....................................... 65
2.9.9. Security...................................................................................................................... 66
2.9.10. Compatibility ............................................................................................................. 68
2.10. Summary............................................................................................................................ 70
Chapter 3 Prototype Implementation................................................................................................. 71
3.1. Introduction ....................................................................................................................... 71
3.2. Multitasking Environments ............................................................................................... 72
3.2.1. Real-Time Constraints............................................................................................... 73
3.2.2. Scheduling ................................................................................................................. 75
3.2.3. Deadline Driven Scheduling Algorithm .................................................................... 77
3.2.4. Overload Management .............................................................................................. 78
3.2.5. Scheduling on the Multiprocessor ............................................................................. 79
3.3. Data Representation Concept ............................................................................................ 80
3.3.1. Standard Representations .......................................................................................... 81
3.3.2. Converting Binary Representations........................................................................... 82
3.3.3. Representation Caching............................................................................................. 83
3.3.4. Security Caching ....................................................................................................... 85
3.4. Component Model ............................................................................................................. 85
3.4.1. Backend ..................................................................................................................... 87
3.4.2. Frontend..................................................................................................................... 87
3.5. Data Flow .......................................................................................................................... 88
3.6. Threading Model ............................................................................................................... 90
3.6.1. Scheduler ................................................................................................................... 91
3.7. Memory Management........................................................................................................ 93
3.7.1. Dedicated Ring Buffers ............................................................................................. 93
3.7.2. Transparent Memory ................................................................................................. 95

3.8. Evaluation of Prototype Implementation........................................................................... 96
3.8.1. Framework................................................................................................................. 97
3.8.2. LabVIEW Bindings ................................................................................................... 97
3.8.3. Performance Analysis................................................................................................ 98
3.9. Future Development ........................................................................................................ 100
3.10. Summary.......................................................................................................................... 101
Chapter 4 Data Acquisition System of Aragats Cosmic Ray Monitors Network ........................... 103
4.1. Introduction ..................................................................................................................... 103
4.2. Physical Aspects .............................................................................................................. 104
4.3. ASEC Detectors............................................................................................................... 106
4.3.1. Aragats Solar Neutron Telescope ............................................................................ 107
4.3.2. Nor-Amberd Multidirectional Muon Monitor......................................................... 108
4.3.3. Aragats Multidirectional Muon Monitor ................................................................. 109
4.3.4. MAKET-ANI .......................................................................................................... 110
4.3.5. Frontend Electronics................................................................................................ 110
4.4. ASEC Data Acquisition System ...................................................................................... 111
4.4.1. Embedded Software................................................................................................. 111
4.4.2. Frontend Computers ................................................................................................ 112
4.4.3. Unified Readout and Control Servers...................................................................... 113
4.4.4. Registry.................................................................................................................... 114
4.4.5. Operator Frontend ................................................................................................... 114
4.4.6. Error Handling and Notifications ............................................................................ 115
4.4.7. Data Storage ............................................................................................................ 116
4.5. Unified Readout and Control Server ............................................................................... 116
4.5.1. Layered Architecture ............................................................................................... 116
4.5.2. Threading Model ..................................................................................................... 117
4.5.3. Detector Network .................................................................................................... 118
4.5.4. Configuration........................................................................................................... 118
4.5.5. Commands ............................................................................................................... 119
4.5.6. Error Logging .......................................................................................................... 119
4.5.7. Self-Announcement ................................................................................................. 120
4.6. Data Representation......................................................................................................... 120
4.6.1. Basic Data Format ................................................................................................... 120
4.6.2. Detector Description................................................................................................ 121

4.6.3. Detector Geometry .................................................................................................. 123
4.6.4. Data Layout ............................................................................................................. 124
4.7. OPC XML-DA Interface ................................................................................................. 127
4.7.1. URCS Server ........................................................................................................... 128
4.7.2. Registry Server ........................................................................................................ 129
4.7.3. ASEC-Specific Metadata Properties ....................................................................... 130
4.7.4. Complex Data .......................................................................................................... 131
4.8. Summary.......................................................................................................................... 131
Conclusion....................................................................................................................................... 133
Appendix A The XML Benchmark ................................................................................................. 139
A.1. Introduction ..................................................................................................................... 139
A.2. DDAS Requirements ....................................................................................................... 140
A.3. XML Libraries................................................................................................................. 141
A.3.1. Apache XML ........................................................................................................... 142
A.3.2. Gnome XML ........................................................................................................... 143
A.3.3. Oracle XML............................................................................................................. 143
A.3.4. Expat........................................................................................................................ 143
A.3.5. Sun JAXP ................................................................................................................ 144
A.3.6. Other XML Libraries............................................................................................... 144
A.4. Performance Evaluation .................................................................................................. 144
A.4.1. XML Parsing ........................................................................................................... 145
A.4.2. DOM Manipulations................................................................................................ 146
A.4.3. XML Schema Validation......................................................................................... 147
A.4.4. XML Transformation .............................................................................................. 148
A.4.5. XML Security .......................................................................................................... 149
A.4.6. Memory Consumption............................................................................................. 149
A.4.7. XML Schema and XSL Transformation Behavior.................................................. 150
A.5. Summary.......................................................................................................................... 150
Bibliography .................................................................................................................................... 152

ACRONYMS

ACK Acknowledgement
ADAS Aragats Data Acquisition System
ADC Analog-Digital Converter
AMMM Aragats Multidirectional Muon Monitor
ArNM Aragats Neutron Monitor
ASEC Aragats Space Environmental Center
ASN Abstract Syntax Notation
BER Basic Encoding Rules
CBS Constant Bandwidth Server
CDR Common Data Representation
CME Coronal Mass Ejection
CORBA Common Object Request Broker Architecture
CR Cosmic Rays
CPU Central Processing Unit
DAQ Data Acquisition
DAS Data Acquisition System
DCE Distributed Computing Environment
DCOM Distributed Component Object Model
DCOP Desktop Communication Protocol
DCS Distributed Control System
DDAS Distributed Data Acquisition System
DHCP Dynamic Host Configuration Protocol
DSC Data logging and Supervisory Control
DVIN Data Visualization Interactive Network
GCR Galactic Cosmic Rays
GLE Ground Level Enhancement
GPS Global Positioning System
GUI Graphical User Interface
EAS Extensive Air Showers
EDF Earliest Deadline First

EDZL Earliest Deadline until Zero Laxity
EPDF Earliest Pseudo Deadline First
FIFO First In First Out
FTP File Transfer Protocol
HDR High Data Rate
HESS High Energy Stereoscopic System
HMI Human-Machine Interface
HTTP HyperText Transfer Protocol
HTTPS Secure HTTP
ICE Inter-Client Exchange
IIOP Internet Inter-ORB Protocol
IMF Interplanetary Magnetic Field
IP Internet Protocol
JRMP Java Remote Method Protocol
KATRIN KArlsruhe TRItium Neutrino
KSC KATRIN Slow Control
LabVIEW Laboratory Virtual Instrumentation Engineering Workbench
LDAP Lightweight Directory Access Protocol
LLF Least Laxity First
LST Least Slack Time
MCOP Multimedia Communication Protocol
MIME Multipurpose Internet Mail Extension
MPI Message Passing Interface
NAK Negative Acknowledgement
NAMMM Nor-Amberd Multidirectional Muon Monitor
NANM Nor-Amberd Neutron Monitor
NDR Native Data Representation
NFS Network File System
NUMA Non-Uniform Memory Access
ONC RPC Open Network Computing Remote Procedure Call
OPC Open Process Control
OPC DA OPC Data Access
OPC HDA OPC Historical Data Access
OPC A&E OPC Alarms and Events

OPC XML-DA OPC XML Data Access
ORPC Object Remote Procedure Call
PCI Peripheral Component Interconnect
PGM Pragmatic General Multicast
PLC Programmable Logic Controller
POSIX Portable Operating System Interface X
PXI PCI eXtensions for Instrumentation
RMI Remote Method Invocation
RMT Reliable Multicast Transport
RPC Remote Procedure Call
SCR Solar Cosmic Rays
SMP Symmetric Multi-Processing
SNT Solar Neutron Telescope
SOAP Simple Object Access Protocol
SQL Structured Query Language
SSL Secure Socket Layer
TCP Transmission Control Protocol
TLB Translation Lookaside Buffer
TLS Transport Layer Security
TBS Total Bandwidth Server
UDDI Universal Description, Discovery and Integration
UDP User Datagram Protocol
UI User Interface
URCS Universal Readout and Control Server
WCET Worst Case Execution Time
WS Web Service
WSDL Web Services Description Language
XDR eXternal Data Representation
XML eXtensible Markup Language
XPATH XML Path Language
XSD XML Schema Definition
XSL eXtensible Stylesheet Language
XSLT XSL Transformation

INTRODUCTION

Modern experiments in high energy physics and astrophysics pose ever-growing challenges to data
acquisition and data analysis systems. There is a growing need for more sophisticated and faster
systems that provide on-line decision making, store huge amounts of data, and perform artificial
intelligence tasks such as pattern recognition and multiple comparisons. It is impossible to handle
the whole data flow of high energy physics experiments such as the HESS (High Energy
Stereoscopic System) telescope or the LHC (Large Hadron Collider) based experiments at CERN
on a single PC [1, 2]. Many modern experiments are distributed over the world; the detectors are
often located in different geographic regions. For example, the detectors of the ASEC (Aragats
Space Environmental Center [3, 4]) operate at two scientific stations located on the slopes of
Mount Aragats at a distance of 20 km from each other. In addition, there are plans to extend the
ASEC monitor network with detectors placed in several equatorial countries during the years
2007-2008 [5]. All detectors are used for joint measurements of major solar events in progress.
Each detector, located at a specific latitude and longitude, provides very specific information to be
taken into account when forecasting Space Weather hazards for satellites and surface industries.

Such requirements make distributed data acquisition systems more and more important.
However, the lack of standardization causes a certain amount of chaos in the development of
distributed data acquisition systems. At the moment a wide range of different data exchange
solutions is used to provide inter-process communication for distributed components. The DCOM
(Distributed Component Object Model [6]), CORBA (Common Object Request Broker
Architecture [7]), Java RMI (Remote Method Invocation [8]), SOAP (Simple Object Access
Protocol [9]) and MPI (Message Passing Interface [10]) protocols are the most popular among them.
Furthermore, several industrial systems using OPC DA (Open Process Control Data Access [11])
or OPC XML-DA (OPC XML Data Access [12]) are available. For example, the data acquisition
system of the H.E.S.S. telescope uses the omniORB implementation of CORBA for inter-process
communication [2]. The DVIN (Data Visualization Interactive Network), based on a legacy ASCII
data representation and the FTP data transfer protocol, is used in the ASEC (Aragats Space
Environmental Center) experiment [3]. The KATRIN (KArlsruhe TRItium Neutrino) experiment is
an excellent example of a heavily heterogeneous system. Besides the industrial solutions controlling
subsystems of KATRIN, the experimental data is currently acquired using the LabVIEW DSC (Data
logging and Supervisory Control) system from National Instruments [13]. As a result of such
different approaches, data sharing and the reuse of data analysis software become inefficient
and tedious tasks. The evident consequences of this situation are the impaired effectiveness of
scientific work, a lack of cooperation between groups working on similar projects, and slowdowns
in the adoption of promising new technologies such as Grid Computing [14].
Despite the availability of a large number of existing data exchange solutions, there is no industry-
approved protocol which combines high compatibility and flexibility with the outstanding performance
required for real-time tasks. Pure RPC (Remote Procedure Call) solutions lack
standardization of the metadata required in data acquisition and slow control systems. Several
widely used slow control systems are limited to one specific platform and are inoperable in
heterogeneous environments. The XML based solutions are mostly too slow for real-time systems
working with fast data flows [15].

This thesis reviews the requirements of modern data acquisition and slow control systems and
proposes a universal middleware solution which can be used for almost any experiment,
independent of its data rates, real-time demands and metadata requirements. Standardizing the data
exchange will speed up software development, simplify the data exchange between related
experiments, and allow the reuse of software components made by scientists from different groups
around the world. To achieve maximal compatibility with existing software, the OPC XML-DA and
OPC Complex Data specifications are used as the protocol foundations [12, 16]. The OPC XML-DA
specification is a restatement of the industry-accepted OPC DA specification in terms of XML. It
defines a Web Service interface facilitating the exchange of plant data in heterogeneous
environments across the internet. According to the specification, the server data space is divided
into a set of OPC Items which are addressed using “ItemPath” and “ItemName” identifiers. Each
OPC Item stores a data variable along with standard and user-defined metadata. The Web Service
provides interfaces for reading and writing current values and for examining the metadata.
Changes of variable values are reported to the clients using a polled-style subscription mechanism.
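This interaction pattern can be pictured with a minimal C sketch. It is purely illustrative: the item
identifiers are invented, and the stub function stands in for a real SubscriptionPolledRefresh round
trip performed over SOAP.

    #include <stdio.h>

    /* OPC Items are addressed by the pair (ItemPath, ItemName). */
    typedef struct {
        const char *item_path;   /* hierarchical location in the server data space */
        const char *item_name;   /* item identifier within that path               */
    } opc_item_id;

    /* Stub standing in for a SubscriptionPolledRefresh call: a real client
     * would issue a SOAP request and receive only the values that have
     * changed since the previous poll. */
    static int polled_refresh(const opc_item_id *id, double *value) {
        (void)id;
        *value = 42.0;
        return 1;                /* 1 = the value changed since the last poll */
    }

    int main(void) {
        opc_item_id item = { "ASEC/Aragats", "NeutronMonitor.CountRate" };
        double v;
        if (polled_refresh(&item, &v))
            printf("%s : %s = %g\n", item.item_path, item.item_name, v);
        return 0;
    }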
Despite the fact that the OPC XML-DA specification is designed mainly for control systems, the
subscription mechanism, used together with buffering, makes it possible to use the protocol in data
acquisition systems as well. The only drawback lies in its XML nature. XML is today’s standard
for the representation and exchange of structured data, especially if that data must be read and
interpreted by different applications working in heterogeneous environments and written by
different groups. Still, the scientific data of most physics experiments is produced in structured
binary formats at high data rates. It often consists of series of large vectors or matrices. The XML
representation of this data is significantly larger than the native one, requires much more
computational resources for the serialization of the numerical data, and is practically unusable for
complex manipulations on matrices and vectors. For these applications the XML representation
certainly cannot be used. However, there is enormous value in the automation of data processing
and publishing, especially when scientists from different groups world-wide are involved in the
development of data analysis software. This is achieved by associating XML metadata with the
binary scientific data. The metadata describes properties like the data accuracy and how, when and
by whom the data was produced (parameters, algorithms used, etc.), so that every user can interpret
the associated binary data.
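The size penalty mentioned above is easy to estimate. The short C program below (an illustration
only: the element name and number formatting are my own, not from any specification) serializes a
vector of doubles the way a generic XML encoder would and compares the result with the native size:

    #include <stdio.h>

    int main(void) {
        double v[1024];
        char buf[64];
        size_t native = sizeof(v);   /* 8192 bytes with 8-byte IEEE-754 doubles */
        size_t xml = 0;
        for (int i = 0; i < 1024; i++) {
            v[i] = 1.0 / (i + 1);
            /* 17 significant digits are required to preserve a double exactly */
            xml += (size_t)snprintf(buf, sizeof buf, "<v>%.17g</v>", v[i]);
        }
        printf("native: %zu bytes, XML: %zu bytes (%.1fx larger)\n",
               native, xml, (double)xml / (double)native);
        return 0;
    }

Besides the three- to four-fold growth in size, every value also passes through a binary-to-text
conversion on the sender and a text-to-binary conversion on the receiver.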
This thesis proposes several extensions of the standard OPC XML-DA specification providing a
mixed XML / binary approach. The protocol and metadata information are transferred using the OPC
XML-DA protocol, while the scientific data is transferred in pure binary form, linked with the main
OPC XML-DA message using the WS-Attachment (Web Service Attachments) technology. A slightly
modified OPC Binary notation from the OPC Complex Data specification is used to describe the
layout of the transferred data in heterogeneous environments. Therefore, the XML message only
carries a data type name and a reference to the binary stream, while the scientific data is transferred
unaltered in pure binary format [13, 17].
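On the receiving side, the result of this linkage can be modeled by a structure like the one below.
This is only a sketch of the idea; the field names are mine and are not taken from the specifications:

    #include <stddef.h>

    /* A value received via the mixed XML/binary path: the SOAP part carries
     * the type name and an attachment reference, while the payload arrives
     * as a separate binary attachment and is never converted to text. */
    struct hdr_value {
        char        type_name[64];   /* OPC Binary type describing the data layout  */
        char        content_id[64];  /* reference resolved among the attachments    */
        const void *data;            /* pointer into the received binary attachment */
        size_t      size;            /* attachment length in bytes                  */
    };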

The first chapter of the thesis contains an overview of distributed data acquisition system
concepts and introduces the basic requirements for the data exchange middleware. Then, the
available data exchange solutions are reviewed in the light of these protocol requirements. The
chapter concludes with a detailed description of the OPC XML-DA and OPC Complex Data
specifications.
In the second chapter the developed High Data Rate (HDR) extensions are introduced and described
in full. The chapter starts with a description of the basic concepts used in the design of the
extensions, which open the way to utilizing the protocol in high data rate real-time environments.
Then, the HDR specification is presented.
The third chapter introduces the main concepts used in the prototype protocol implementation. The
main attention is given to the algorithms that allow XML processing tasks to coexist with tasks
requiring a fast real-time response. The chapter ends with an overall performance evaluation of the
developed protocol implementation under different conditions. The presented evaluation results
make it possible to estimate the achievable performance depending on the network bandwidth, the
computational power and the number of clients.

The application of the methods described in the first chapters to the new data acquisition
system of the ASEC detector network is described in the last chapter. The ASEC detectors are
located at several high-altitude scientific stations on the slopes of Mount Aragats and are
used to measure different characteristics of the cosmic ray fluxes. The data from the detectors are
used to monitor space weather and forecast upcoming radiation and geomagnetic storms. The last
chapter gives insight into how the newly developed software is (and will be) deployed in the DAQ
system of the distributed particle detector network.
A multi-aspect comparison of the available XML toolkits is performed in the Appendix. The
libraries are analyzed with respect to their ability to provide the XML processing functionality
needed by the developed protocol implementation. An emulation of a real OPC XML-DA data flow
was implemented and executed to obtain the results.

The contributions of the thesis author can be summarized as follows:

1. A thorough study of the existing data exchange solutions and of the data representations suitable
for heterogeneous environments is performed.
2. Based on existing high-level standards, High Data Rate extensions to the OPC XML-DA
specification are developed, facilitating the data exchange in high performance data acquisition
systems. In particular, based on the WS-Attachment technology and the OPC Complex Data
specification, an approach is developed that allows data dissemination using different data
representations. To achieve the highest possible efficiency while at the same time maintaining
interoperability in heterogeneous environments, a data exchange approach based on the native
server data representation is proposed. Based on the XML Encryption and XML Signature
specifications, a new, improved data protection mechanism is established which is capable of
protecting data disseminated by the multicasting protocol and stored on intermediary servers.
The ability to handle data protection on a per-element basis is another benefit of the proposed
security approach. Finally, the syntax of queries providing a customizable interface for the data
access is defined; in particular, this gives the ability to access historical data.
3. In order to design an effective server implementation capable of dynamic data conversions, a
“data representation tree” based architecture is designed. To enhance the system scalability and
the ability to sustain soft real-time demands, memory management algorithms optimized for the
protocol outline are developed.
4. The developed protocol is implemented as a two-layer multiplatform OPC library. The first
layer provides abstraction from the operating system details, and the second layer implements
the protocol itself. The library is implemented in pure C and depends only on the POSIX
standard (alternatively WINAPI for compatibility with the Microsoft Windows platform) and the
LibXML open source library. Therefore, it can be used in a very broad range of environments.
5. The server threading and component models are designed, and on top of the OPC library a
prototype server implementation is developed. To facilitate the integration of NI LabVIEW
components into the data acquisition system, bindings providing the OPC XML-DA HDR
capabilities to LabVIEW applications are implemented.
6. A new unified architecture of the ASEC data acquisition system, thoroughly utilizing the proposed
data exchange solution, is developed and implemented.

CHAPTER 1

DISTRIBUTED DATA ACQUISITION SYSTEMS

1.1. Introduction
At the highest level, distributed and stand-alone data acquisition systems have similar architectures.
Both have distinct application components (consisting of a combination of hardware and software)
that perform specific actions. Each of these components shares data with the others according to its
given task. Each component is also an independent module that can be scaled or modified without
affecting the other parts of the application. For example, in LabVIEW, a modular application can be
created with function-specific VIs that are used as building blocks by higher level modules [18-20].
A distributed architecture is not only used to share the processing load among multiple computers,
but also solves a range of problems arising in the design of sophisticated systems.
Performance Typical data rates in modern high energy physics experiments exceed a hundred
megabytes per second. A single computer is unable to process data at such high rates; high
performance clusters are used to handle this task.
Scalability Distributed configurations scale well to satisfy the growing demands of long-term
experiments. In a distributed system, individual components can be upgraded or modified without
affecting the other parts of the system. Computing capacity and functionality can be added where
needed by upgrading hardware components or by adding computer nodes.
Determinism Multiple tasks performed on a single computer can interfere with each other and
cause jitter in their individual loop execution times. Such jitter is especially harmful to time-critical
tasks controlling precise hardware. Heavy tasks providing the user interface may disturb the
whole system if they coexist on the same PC with the time-critical tasks. Therefore, a
reduction in the number of tasks performed on a single PC improves the determinism of the time-
critical tasks and, thereby, the system stability.
Compatibility In situations where a distributed system is already in use, widening its functionality
by adding computing nodes can be more attractive than altering the existing software. Distributed
systems can be heterogeneous, so newer technologies can be mixed with older ones.
Fault Tolerance Several PC nodes may replicate the same functionality, providing strong fault
tolerance: in the case of a PC failure, its role is automatically taken over by a replicating PC. Fault
tolerance is especially important when detectors are located in hardly accessible places and operate
without permanent supervision.
Distributed Systems Many experiments consist of hundreds of detectors spread over the world. In
this case a distributed design, rather than multiple isolated nodes, makes much better data
preprocessing possible by correlating the data from different places.

A distributed data acquisition system can be modeled as a set of nodes communicating with
connected hardware devices and with other nodes, performing data processing and serving the data
to other nodes. Four different node types can be distinguished (a compact model is sketched below).
• The frontend nodes interface with the experiment electronics. They are able to read out
the experiment data from the underlying devices, acquire the current status and configuration, and
possibly adjust them.
• The processing nodes perform data processing, issue warnings on the system behavior and
automatically adjust the parameters of the system depending on the observed conditions. The
processing nodes can filter the data of scientific interest out of the enormous data flows, try to
recover from system failures, adjust the system to new conditions, etc.
• The data logging nodes are used to store the data in permanent storage. In most cases this is
one of the available relational databases. However, some old data acquisition systems may use
plain ASCII or binary files. There is also a trend towards XML databases.
• The UI nodes are used by operators to monitor and control the system behavior.
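As a compact illustration of these roles (the host names are invented; a real system wires several
instances of each type into a graph), the node types and a minimal chain could be modeled in C as:

    /* Node roles in a distributed data acquisition system. */
    enum node_type {
        NODE_FRONTEND,      /* reads out the experiment electronics */
        NODE_PROCESSING,    /* filters, correlates, issues warnings */
        NODE_DATA_LOGGING,  /* stores the data in permanent storage */
        NODE_UI             /* operator monitoring and control      */
    };

    struct node { enum node_type type; const char *host; };

    /* The simplest chain: one frontend per station, a combined processing
     * and data logging pair, plus an operator console. */
    static const struct node chain[] = {
        { NODE_FRONTEND,     "station1-frontend" },
        { NODE_FRONTEND,     "station2-frontend" },
        { NODE_PROCESSING,   "lab-processing"    },
        { NODE_DATA_LOGGING, "lab-db"            },
        { NODE_UI,           "operator-console"  },
    };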

A typical data acquisition system contains several frontend nodes reading the data from the
underlying electronics. The data types vary greatly, ranging from simple scalar values to multi-
dimensional spectra of vector values. Further, one or more nodes perform the data preprocessing:
filtering out uninteresting data, adjusting the system configuration and issuing warnings to the
system operators. The processing nodes may adjust the types of the data, create new composite data
types or pass the data unaltered. In the case of fast data flows this can be a cluster of nodes rather
than a single node, or even a GRID-like network [1, 21, 22]. Finally, the data is stored in a database
using one or more data logging nodes. Currently mostly relational SQL databases are used;
however, there is a strong trend towards more sophisticated XML databases [13].
Afterwards, higher level software uses the data from the database to present information to the end
users and to perform scientific analysis correlating the data from different sources. Finally, the
UI nodes are used by operators to monitor and control the system operation. Of course, in
simple systems a single PC can carry out several of the described functions; the simplest
case is a single PC performing all reading, processing and storing.

1.1.1. DDAS Data Exchange
Just as in a single-computer system, the components of a distributed system need to share data with
each other in an easy and transparent way. The nodes should be able to establish fast and reliable
connections with each other. Since parts of a data acquisition system can be installed in hardly
accessible places, sometimes in different countries, the connection should be operable through
standard internet connections, with support for proxies and firewalls. Slow, high-latency
radio network links should be considered as well.
In most cases sophisticated data exchange solutions are required. Normally, metadata
should be transferred along with the data. The metadata contains information helping
the software to process the data without human interaction. It includes the
data type, the physical meaning of the data, engineering units, the time the data was acquired, and
the data quality. Acceptable data ranges and operator notification conditions are also important for
automatic monitoring of the system state.
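A per-item metadata record of this kind could look as follows in C; the field names are illustrative
only and are not taken from any specification:

    /* Metadata accompanying every transferred value, allowing software to
     * interpret and monitor the data without human interaction. */
    struct item_metadata {
        const char *data_type;     /* name of the value type, e.g. "Float"  */
        const char *description;   /* physical meaning of the value         */
        const char *eng_units;     /* engineering units, e.g. "counts/s"    */
        long long   timestamp_us;  /* acquisition time in microseconds      */
        int         quality;       /* good / uncertain / bad                */
        double      range_min;     /* lower bound of the acceptable range   */
        double      range_max;     /* upper bound, for operator alarms      */
    };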
The requirements on connection speed and latency may vary greatly depending on the installation
type. In high energy physics experiments the data rates can reach hundreds of megabytes per second
and even more. On the other hand, in many control systems the data rates are not as high, but
strict real-time criteria must be met. The metadata requirements are highly experiment-dependent
as well; however, it is possible to distinguish a set of attributes common to most
experiments. In some cases the nodes may run on different platforms.
The frontend nodes mostly run on real-time systems. PharLap ETS is used on the
FieldPoint and PXI devices together with LabVIEW Real-Time [18, 23]. QNX and a wide range of
real-time Linux solutions are used as well [24]. The processing and data logging nodes are normally
built on standard PC boxes running Linux or Windows systems.
Communication between nodes running a wide range of different operating systems,
together with many special requirements on the metadata, the data rates and the commercial systems
to be supported, imposes high demands on the data exchange solution used. At the
moment most data acquisition systems use non-standard, self-developed protocols for the
communication between nodes. A wide range of solutions based on the DCOM, CORBA, RMI,
SOAP and MPI protocols has been used [1, 2, 21, 22, 25, 26]. Furthermore, several industrial systems
using OPC or OPC XML-DA are available [13]. In most cases the data acquisition and the slow
control are treated separately, and different data exchange protocols are used to handle these
tasks. For example, the data acquisition system of the H.E.S.S. telescope uses the omniORB
implementation of CORBA for inter-process communication [2]. The FTP data transfer protocol
was used for data acquisition in the old ASEC data acquisition system, while the control was
organized using a LabVIEW solution from National Instruments [3, 4]. The KATRIN (KArlsruhe
TRItium Neutrino) experiment is an excellent example of a heavily heterogeneous system. Besides
the industrial solutions controlling subsystems of KATRIN, the experimental data is currently
acquired using the LabVIEW DSC (Data logging and Supervisory Control) system from National
Instruments [13].
As a consequence of such different approaches, data sharing and the reuse of data
analysis software become inefficient and tedious tasks. The impaired effectiveness of scientific work,
the lack of cooperation between groups working on similar projects and the slowdowns in
the adoption of promising new technologies, like Grid Computing, are evident
consequences of this situation.
Therefore, the unification of the data exchange protocol is an extremely important and urgent task.
A single protocol used for both data acquisition and control will simplify design, speed up
development and allow closer cooperation between collaborating groups of scientists. Engineers,
operators and scientists alike will benefit from protocol unification.

1.2. Existing DDAS


To give a more detailed view of the requirements imposed on the data transfer protocol by
different data acquisition systems, I will give a short overview of several operating solutions.
These systems differ dramatically in their goals and requirements. The data acquisition system of the
Aragats Space Environmental Center is an example of a highly distributed system with nodes spread
over the world; however, its data rates are not very high. KATRIN is an excellent example of a
heavily heterogeneous system: besides the industrial solutions controlling subsystems of KATRIN,
the experimental data is currently acquired using the LabVIEW DSC (Data logging and
Supervisory Control) system from National Instruments. The data acquisition system of the H.E.S.S.
telescope imposes high demands on the connection speed; its data flow reaches ten megabytes per
second.

1.2.1. Aragats Space Environmental Center (ASEC)


The data acquisition system of ASEC (Aragats Space Environmental Center [3, 4]) currently
consists of 6 readout and control nodes located at the two research stations on the slopes of Mount
Aragats. Additionally, a project for a new low-latitude world-wide particle detector network
with the participation of Costa Rica, Croatia, Egypt, Bulgaria, Armenia and Indonesia was discussed
at the UN/NASA/ESA IHY workshop [5]. The detector setup measures the number of incident charged
and neutral particles depending on their energy and arrival direction, along with atmospheric pressure
and temperature. The particles are born in cascades initiated in the atmosphere by primary ions.
Studies of these particles shed light on high-energy particle acceleration by solar flares and by
shocks driven by coronal mass ejections.
The particles are classified by several properties, including the particle charge, energy range and
direction. The number of events in each class is summed over a configured amount of time by the
ASEC detectors and transferred to the frontend nodes as time series. The frontend node stores the data
in ASCII files and serves it to the processing node using the FTP protocol. The processing node
performs preliminary data processing and issues warnings about software or hardware failures,
approaching geomagnetic storms and other important events. Afterwards the data is passed to the
data logging node, which stores it in a MySQL relational database. The DVIN web interface is used
by the end users to visualize the data from the database. The operators run National Instruments
LabVIEW to control detector properties, including ADC (Analog-Digital Converter) thresholds,
power supply high voltages, counting duration and so on. The scientific stations are connected with
the laboratory in Yerevan by a long-range radio network; therefore, network bandwidth
optimizations are very important. All nodes run under the Linux operating system (Fig. 1).

Fig. 1. Layout of the old ASEC data acquisition system.
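As an illustration of the kind of record the frontend nodes handle (a sketch only: the field set and
channel count are inferred from the description above, not taken from the actual file format), one
time-series entry could be modeled as:

    /* One time-series entry produced by an ASEC frontend node: event counts
     * per particle class, summed over one measurement interval. */
    struct asec_record {
        long long timestamp;      /* start of the summation interval */
        long      counts[16];     /* event counts per particle class */
        double    pressure_mbar;  /* atmospheric pressure            */
        double    temperature_c;  /* ambient temperature             */
    };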

Currently a single combined processing/data logging node, located in the main laboratory in
Yerevan, is used to handle all the data. However, the center is developing rapidly; new detectors and
channels are coming into operation. Therefore, a detector network spread over the world will
require more autonomous processing. In that case, dedicated processing nodes should be able to
communicate with each other, correlating the data from different detectors. Furthermore,
the software should adjust automatically to new detector and channel configurations. This
requires the data to be accompanied by metadata describing channel assignments, measurement
units, acceptable value limits, etc.

The main requirements are:


1. Ability to work in a highly distributed environment across slow, unstable communication
links, through proxies and firewalls.
2. Support for extensible metadata describing the data meaning and the acquisition
conditions.

1.2.2. Karlsruhe Tritium Neutrino Experiment (KATRIN)


The KATRIN (KArlsruhe Tritium Neutrino [13]) experiment is designed to measure the mass of the
electron neutrino directly with a precision of 0.23 eV. It is a tritium beta-decay experiment scaling up
the size of previous experiments by an order of magnitude, with a much more intense tritium source.
The KATRIN Slow Control (KSC) is characterized by its heterogeneous nature. It consists of
several subsystems: the tritium source, the magnet and vacuum subsystems, the spectrometer, etc. The
control of these subsystems is performed using commercial control systems from Siemens and National
Instruments. The data collection is performed on FieldPoint devices from National
Instruments running LabVIEW Real-Time (see Fig. 2).

Fig. 2. Layout of the KATRIN data acquisition and control system. Commercial control
systems are used to control the subsystems. The data is moved between the subsystems and the database
using the high-level OPC XML-DA protocol with High Data Rate extensions.

The ability to operate in such a heterogeneous environment is achieved through the thorough use of
standard protocols and standard interfaces at a high level. This objective is most important for
physics experiments in particular, as they have typical life cycles of 10 to 20 years. Obviously, the
demands placed on the data acquisition system change continually over such a long time period.
The system hardware and software components must be upgraded to reflect these changes and are,
therefore, every now and then replaced by others targeted at the new environments. Hardware
components, operating systems and web tools develop rapidly; however, well-selected
interfaces and protocols have been shown to exhibit the longest life cycles.
Connecting the various components by means of such interfaces gives the opportunity to upgrade
subsystems individually, to join software development efforts, to reuse existing software products
and to simplify software maintenance.
The main requirements are:
1. Support by the major slow control system vendors.
2. Compliance with the highest-level standards.

1.2.3. H.E.S.S. Telescope


The HESS (High Energy Stereoscopic System [2]) is a next generation imaging air Cherenkov
telescope designed to provide a comprehensive study of non-thermal phenomena in the universe.
The experiment will consist of a four telescope array. Each telescope in the array is a heterogeneous
system with several subsystems that must be controlled and read out. The telescope subsystems
comprise a camera with 960 individual photo-multiplier tubes, light pulsar systems for calibration
purposes, a source tracking system, etc… The main data flow is the event information from each
camera to the farm. The event size from a camera is 4 KB, which with data reduction decreases to
approximately 1.5 KB. Hence, the expected trigger rate of the four telescope system of 1 kHz yields
a data rate of 6 MB/s. Additionally, the subsystems produce data of different sizes and at different
rates.
The H.E.S.S. experiment site in the Khomas Highland in Namibia has only a small bandwidth
telecommunication connection to the participating institutes, so only minor intervention from the
outside is possible. Thus, the DAQ system is designed in a robust fashion and easily operated. The
software is composed of a network of distributed C++ and Python objects, living in approximately
100 multi-threaded processes. For inter-process communication the omniORB implementation of
the CORBA protocol standard is used. Each DAQ process contains a StateController object which
implements inter-process communication and runs control state transitions. The DAQ system
distinguishes four different process categories. The Controllers directly interact with the hardware
and read out the data. Each hardware component is controlled by one Controller process. The
Controllers push the data to intermediate Receivers, which perform further processing and store the
data. The Receivers also provide an interface that allows other processes to sample processed data.
Readers actively request data from the Receivers at a rate different from the actual data-taking rate. The data or derived quantities are then available for display and monitoring purposes. Manager processes are not involved in the data transport but control the data-taking (see Fig. 3).

The main requirements are:


1. Robust and easy operation.
2. Support of high data rates.

Fig. 3. The figure illustrates the intercommunication of nodes in the H.E.S.S. data acquisition system.

1.3. Data Exchange Protocol Requirements


Prior to analyzing the different existing solutions, there is a need to define the essential requirements for the universal data exchange framework. As was shown in the previous section, various physical experiments may demand diverse functionality from the protocol. For experiments measuring particles coming from space it is important to accept information at rather high data rates and store it in reliable storage, while the limits on the delay between data receipt and delivery to the client are rather relaxed. For control systems, vice versa, the data rates are not so high, but strict real-time criteria must be met. Another trade-off illustrating this idea is high standardization versus high speed. For long running international experiments with restricted data rates, compatibility and the utilization of the highest possible approved industry standards become the main goal, whose importance can hardly be overestimated. Other experiments, which generate data at enormous rates, require an adequate compatibility level with existing specifications, but more attention should be devoted to the performance of the framework and its ability to sustain the required data rates. The ability to work in heterogeneous environments is essential to the data exchange protocol. In many data acquisition systems the frontend nodes are built on top of special real-time hardware-software complexes, while the processing and UI nodes utilize cheap Linux and Windows PCs. However, often big parts of the system are homogeneous and may benefit from optimizations disregarding the heterogeneity. The universal data processing components should be able to automatically process a wide range of different scientific information coming from various parts of the data acquisition system.
Automatic processing in complex and mutable environments imposes requirements on the descriptions of the transferred data. All data should be accompanied by well-defined metadata describing the data type, the physical meaning of the data, engineering units, the time the data was acquired, the data quality and other important information. Besides the predefined metadata entries, it should be possible to extend the metadata with custom experiment-specific information. Furthermore, to support complex data sources containing a variety of experimental information, the server implementation should support a browsable address space of data items. The clients should be able to browse the address space and search for the required items using comprehensive filters. The server should provide several interfaces giving the ability to read and adjust the information associated with the data items. Data streaming and notifications reporting the data changes should be considered as well.
Finally, the existence of a significant number of widely used commercial control and data acquisition systems should be considered. WinCC from Siemens and LabVIEW from National Instruments are widely used in many running systems. Complex distributed systems often integrate commercial solutions to control several independent subsystems; the KATRIN control system is an excellent example. Therefore, the protocol should provide the possibility to interoperate with the software from major commercial vendors and, as a consequence, rely on the highest-level standards.

A short summary of the described requirements follows. They are essential for the data exchange protocol in the case that it should be used as a universal middleware solution.
• It should be based on the highest level, industry accepted standards.
• It should be fast enough to sustain the very high data rates used in high energy physics and astrophysics experiments.
• It should have predictable request-response latencies allowing usage in the real-time systems.
• It should support extensible metadata. The metadata should include the predefined set of
records common to the most experiments as well as a possibility to define custom experiment-
specific records.
• It should operate in the heterogeneous environments. However, the special optimizations for
the homogeneous cases are welcome.
• It should work transparently through internet proxies and firewalls.
• The server implementation should allow clients to browse the server address space using comprehensive filters, provide interfaces to read and adjust the data, issue notifications reporting data changes to the clients, and implement functionality allowing data streaming.

1.4. Existing RPC Solutions


A wide choice of data exchange solutions is currently available. The first RPC (Remote Procedure Call) solutions were introduced at the end of the 1980s. The rapid development of internet communications in recent years has urged the appearance of new sophisticated object-oriented middleware solutions. CORBA and Enterprise Java Beans are the best examples of such solutions. Furthermore, the OPC Foundation proposed a large set of standards for slow control systems. The rest of the section reviews the existing solutions in the light of the stated requirements. I will leave out the simple data exchange protocols like HTTP, FTP and others, since they are oriented towards solid, unstructured data.

1.4.1. Open Network Computing Remote Procedure Call


The ONC RPC (Open Network Computing Remote Procedure Call [27]) is a widely deployed RPC protocol. It was originally developed by Sun Microsystems for the NFS (Network File System) implementation and is sometimes referred to as the Sun RPC protocol. The ONC RPC supports procedure calls by means of the UDP and TCP protocols; transparent operation over internet proxies is, therefore, not supported. The protocol is widely supported by software on all major platforms. Bindings are available for many programming languages including C, C++, Perl, Java, .NET and probably the majority of others. The XDR (eXternal Data Representation [28]) binary representation is used for data serialization.

1.4.2. Distributed Computing Environment / Remote Procedure Call


The DCE (Distributed Computing Environment [29]) is one of the oldest middleware solutions. Although DCE is still actively developed, it is not very popular nowadays. Implementations are available for most UNIX systems and several others including Windows. There are not many bindings for languages other than C. The protocol works by means of the UDP and TCP protocols. Proxies are not supported. The NDR (Network Data Representation) is used for data serialization. In contrast to XDR, NDR allows primitive data types to be transferred in several widely used formats. A small header preceding the data is used to specify the format. Such a design allows format conversions to be omitted in homogeneous cases; the protocol, therefore, outperforms the ONC solution in such cases. Support of streaming communications is another benefit of the DCE approach.

1.4.3. Internet Inter-ORB Protocol
The IIOP (Internet Inter-ORB Protocol [30]) is the communication protocol from the CORBA (Common Object Request Broker Architecture [7]) specification set. CORBA is standardized by the Object Management Group and appears to be one of the most popular distributed object infrastructures outside of the Microsoft world. CORBA has numerous and well tested implementations on the majority of available systems. Language bindings are available for most programming languages.
The IIOP protocol works over the TCP/IP protocol. It uses the CDR (Common Data Representation [30]) binary representation for data serialization. The CDR representation is slightly better optimized for Linux/Windows environments than the XDR one. Therefore, IIOP-based solutions are able to give better performance than solutions based on ONC RPC while running on these platforms. Like ONC RPC, IIOP is missing proxy support.
It should be mentioned that CORBA is a much more sophisticated solution, providing not a simple RPC mechanism but a comprehensive set of protocols allowing the creation of full-fledged distributed object infrastructures. In a general sense, CORBA wraps code written in some language into a bundle containing additional information on the capabilities of the code inside and how to call it. The resulting wrapped objects can then be transparently called over the network from other programs (or CORBA objects).

1.4.4. Internet Communications Engine


The ICE (Internet Communications Engine [31]) aims to provide the same functionality as CORBA does, but in a simpler, clearer and faster way. The protocol implementation is developed by the ZeroC company and is available under the GNU GPL and proprietary licenses. Bindings for C++, Java, .NET, Visual Basic, Python and PHP are available. Additionally, ICE supports data compression on slow links. According to the performance comparisons provided by ZeroC, ICE outperforms the TAO CORBA implementation in most tests [32].

1.4.5. Inter-Client Exchange


The ICE (Inter-Client Exchange [33]) protocol is a part of the X11R6 system. However, it is implemented as a separate library and can be used independently. The ICE C library is available on many platforms, but bindings for other languages are rare. Connections over proxy servers are not supported. The protocol has a very limited set of supported data types and lacks even floating point number support.

1.4.6. Desktop Communication Protocol
The DCOP (Desktop Communication Protocol [34]) is a simple protocol developed by the KDE team on top of the ICE protocol. The DCOP transport mechanism does not assume any data serialization; therefore, in heterogeneous environments the data serialization should be performed prior to sending the data. The protocol is mainly used for application intercommunication in the KDE desktop environment.

1.4.7. Multimedia Communication Protocol


The MCOP (Multimedia Communication Protocol [35]) is another protocol providing functionality similar to CORBA. It lacks some of the CORBA functionality; however, it supports the data streaming missing from the official CORBA specifications. The MCOP documentation says: “It has conceptual similarities to CORBA, but it is intended to extend it in all ways that are required for real time multimedia operations”. The protocol is developed by the aRts team for their streaming audio server.

1.4.8. Object Remote Procedure Call


The ORPC (Object Remote Procedure Call) is the RPC mechanism used by DCOM (Distributed Component Object Model [6]) for inter-process communication on the Windows platform. Although there are several commercial DCOM implementations outside of the Microsoft world, they are not well tested and work only on a limited number of UNIX platforms.

1.4.9. Java Remote Method Protocol


The JRMP (Java Remote Method Protocol) is the RPC protocol for the Java environment. The JRMP is used by Java RMI (Java Remote Method Invocation [8]) to allow a server Java application to provide interfaces that can be used from outside of the current Java virtual machine. It depends heavily on proprietary features of the Java programming language like object serialization for passing objects, class loaders for instantiating remote objects and Java interface definitions for describing classes and objects, respectively. Thus, it is only supported in the Java environment. Software written in other programming languages is not able to access services provided using Java RMI.

1.4.10. D-Bus
The D-Bus is a newly developed system for inter-process communication. It is mainly designed to provide communication between desktop components and the core of the operating system. The D-Bus system consists of a wire protocol for exposing a typical object-oriented language to other applications and a bus daemon that allows applications to find and monitor one another. Compared with CORBA, D-Bus hardcodes and specifies a lot of things that CORBA leaves open-ended, because CORBA is more generic while D-Bus has specific use-cases in mind. The D-Bus uses a fast binary protocol, like CORBA. The connection operates over Unix domain sockets or TCP/IP and does not work over proxy servers [36, 37].

1.4.11. XML Remote Procedure Call


XML-RPC is a remote procedure call protocol which uses XML to encode its calls and HTTP as a transport mechanism. Hence, it works transparently with all kinds of internet proxies and firewalls. Moreover, HTTP is not the only acceptable transport; there are solutions working over e-mail and other widely used transports [38].
Protocol implementations are available for many programming languages on the whole variety of available platforms. XML-RPC is designed to be as simple as possible. Although most RPC demands can be accomplished, it is seriously limited in its ability to operate in object-oriented environments.
The XML data representation is excellent for heterogeneous environments. However, the data serialization requires a lot of CPU resources, and the data takes twice or even more bandwidth compared with binary protocols. The comparisons show that XML-based data exchange protocol implementations are approximately ten times slower than CORBA [39].
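For illustration, a complete XML-RPC call is just a small XML document posted over HTTP. The sketch below shows such a call; the method and parameter names are hypothetical:

<?xml version="1.0"?>
<methodCall>
  <!-- hypothetical method exposed by a monitoring server -->
  <methodName>monitor.readValue</methodName>
  <params>
    <!-- a single string parameter naming the channel to read -->
    <param><value><string>temperature01</string></value></param>
  </params>
</methodCall>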

1.4.12. Simple Object Access Protocol


The SOAP (Simple Object Access Protocol [9]) is a much more sophisticated and full-featured version of XML-RPC. It integrates a wide range of different XML technologies to construct an accomplished middleware solution. Scalability and extensibility are the main advantages of the SOAP protocol over the XML-RPC solution. SOAP uses XSD Schema for data prototyping. XML Namespaces allow SOAP usage in heavily object-oriented environments. The “SOAP Message with Attachment” extension allows the integration of binary data into the SOAP message; however, no serialization is defined and the data is embedded as a binary stream.
Furthermore, SOAP, WSDL (Web Service Definition Language) and UDDI (Universal Description, Discovery and Integration) form the Web Services framework. Finally, SOAP is much more widely supported compared with XML-RPC. However, it has the same performance issues and is approximately ten times slower than the competing CORBA solutions [39].

1.4.13. Open Process Control – Data Access


OPC (Open Process Control or, alternatively, OLE for Process Control [11, 40, 41]) is a set of specifications helping the PLC (Programmable Logic Controller), DCS (Distributed Control System), HMI (Human-Machine Interface), and other factory-floor software vendors to move real-time data between field devices, control systems and other applications in a standard way, promoting multi-vendor compatibility and interoperability. The OPC specifications are developed by the OPC Foundation. The most important specifications for data exchange are OPC DA (OPC Data Access), OPC A&E (OPC Alarms and Events), and OPC HDA (OPC Historical Data Access). The OPC DA is used to move real-time, continually changing control data. In contrast, the OPC HDA specification defines an interface to access already stored data. OPC A&E provides alarm and event notifications on demand. These include process alarms, operator actions, informational messages, and tracking/auditing messages.
OPC DA (OPC Data Access) uses Microsoft DCOM technology to provide a communication link between OPC servers and OPC clients and is poorly supported outside of the Microsoft world. According to the OPC DA specification, the server data space is divided into a set of OPC Items. Each of the OPC Items stores a data variable along with predefined and custom metadata information. The specification defines interfaces for reading and writing current values and examining the metadata information. The subscription mechanism can be used to track the data changes.
The OPC DA is supported by the major vendors of commercial control systems and appears to be the de facto standard in the slow control industry. Implementations are only available on the Microsoft platform. Proxies and firewalls are not supported.

1.4.14. PROFInet
The PROFInet is another solution for control automation based on the Microsoft DCOM technology. The PROFInet concept supports the structuring of an automation system into autonomous subsystems called technological modules. Technological modules comprise all the mechanics, electronics and software needed to perform a specific task. Each technological module is modeled by means of a DCOM component. The external behavior of the component is described by its interface variables. The system integrator may define certain relations between the subsystems in order to facilitate data exchange automation. Besides the data read and write requests, the PROFInet architecture considers the possibility to define custom interfaces providing subsystem-dependent functionality. The interface variables may be exposed to the clients by means of the OPC interfaces; each component of the PROFInet architecture is represented as a separate OPC server in this case. Therefore, OPC-compliant software may be used to control a PROFInet system. More information about PROFInet is available in [42, 43].
However, the approach suffers from the same drawbacks as OPC does. The PROFInet is based on the Microsoft DCOM technology and, therefore, is badly suited for heterogeneous systems and systems distributed across the internet.

1.4.15. Open Process Control - XML Data Access
The OPC XML-DA (OPC XML Data Access [12]) specification is a restatement of the industry-accepted OPC DA specification in terms of XML. It is based on the SOAP protocol and defines a Web Service interface facilitating the exchange of plant data in heterogeneous environments across the internet. According to the specification, the server data space is divided into a set of OPC Items which are addressed using “ItemPath” and “ItemName” identifiers. Each of the OPC Items stores a data variable along with standard and user-defined metadata information. By default, the OPC XML-DA specification anticipates the usage of variables and arrays of several basic data types (floating-point and integer numbers, enumerations, strings, time variables, etc.). However, the
OPC Complex Data specification gives an idea of how complex data types can be constructed, described and served to the clients. The OPC Complex Data specification also defines a way in which the data can be served to the clients using several different data representations [16].
The Web Service provides interfaces for reading and writing current values and examining the metadata information. The variable value changes are reported to the clients using a polled-style subscription mechanism. The client initiates a subscription to a group of OPC Items (OPC Group) and agrees to issue periodic refresh requests. The subscription behavior is controlled using several properties. “EnableBuffering” demands buffering of the value changes detected in between client polling requests. “Deadband” specifies the percentage of the full engineering unit range of an item value that must change before the value becomes interesting to the clients. “Holdtime” and “Waittime” are used to reduce the latency of reporting a value change to the client and to minimize the number of round trips between the client and server.
Despite the fact that the OPC XML-DA specification is designed mainly for control systems, the subscription mechanism used together with buffering makes the protocol usable in data acquisition systems as well.
The OPC XML-DA is based on the SOAP protocol. Hence, it is well suited for the internet and supports all available proxy and firewall solutions. The slow control industry has slowly started adopting the OPC XML-DA in new products. There are several commercial protocol implementations operating on the Windows and Linux platforms; however, no open source solution is available yet. The only major drawback of the OPC XML-DA is performance. Being approximately ten times slower than the DCOM-based OPC DA protocol, the OPC XML-DA can hardly be used for high-performance real-time system design in its current form [44].

1.4.16. Summary
In spite of the availability of a wide choice of powerful distributed object frameworks (CORBA and others), the OPC XML-DA seems to be the better solution. It performs well in highly heterogeneous network environments. It works over all kinds of proxies and firewalls. Furthermore, it is the only multi-platform solution supported (or scheduled to be supported) by the major slow control system vendors. Actually, from the list of requirements stated in section 1.3, it has problems only with performance and data streaming support.
Of course, good performance is essential to a universal data exchange solution. To address these problems, a High Data Rate (HDR) extension was developed. The extension uses a mixed XML/binary approach: XML is used to represent the protocol and metadata information, while the scientific data is transferred in a pure binary format. The extension is only used with systems imposing very high performance demands. Hence, it will not break compatibility with existing solutions in the cases when the performance of the OPC XML-DA protocol is satisfactory. The HDR extension is described in Chapter 2.

1.5. OPC XML-DA Specification


The OPC XML-DA specification [12] defines a set of interfaces for reading and writing current values and examining the metadata information. The variable value changes are reported to the clients using a polled-style subscription mechanism.
The server data space is organized as a tree, providing folders and data items similar to the folders and files used in a standard file system. The data items are called OPC Items and, in contrast to files, can carry out both folder and data item functionality simultaneously. The OPC Items are addressed using “ItemPath” and “ItemName” identifiers supplied with each request. Besides the associated data value, each OPC Item stores a set of standard and user-defined metadata. The client is allowed to request the currently associated value together with all or part of the contained metadata. The metadata specifying the quality of the currently associated value is mandatorily returned to the client.
A wide range of predefined metadata properties is described by the OPC XML-DA specification. These properties include information on the item data type, precision, timestamp, quality, access permissions, the fastest rate at which the server could obtain information from the underlying data source, engineering units, the smallest and largest positive values that can be stored in the item, the time difference between the item's timestamp and the local time at which the item was obtained, and so on. The metadata can be extended with user-defined properties as well.
Strings, booleans, 32 and 64 bit floating point numbers, 8-64 bit signed and unsigned integers,
enumerations, base64 encoded strings, date and time variables, XML qualified names as well as
arrays and custom structures of the above types are supported by the OPC XML-DA specification.
The localization support allows clients to get string values in the desired language. Clients can query the server for the full list of supported languages as well.
The OPC XML-DA specification does not define any special means of transport security. However, the HTTPS (Secure HTTP) protocol can be used to protect the data between the client and server. Each of the OPC Items may support reading of the associated data, adjusting of the associated data, or both. The information about access permissions is available to the client through the “accessRights” metadata property. Sophisticated systems may allow or disallow access depending on the X.509 certificate used by the client to establish the HTTPS connection [45].

1.5.1. OPC XML-DA Interfaces


The following interfaces are supported by the OPC XML-DA specification:
GetStatus: The GetStatus interface provides a common mechanism for checking the status of the server – whether it is operational or in need of maintenance. Vendor-specific information about the server (version number, etc.) can be obtained as well.
Browse: The Browse interface provides a common mechanism to browse the server's data space. The list of nodes contained in a subtree of the specified OPC Item is returned in the response. The client can request only the first-level children of the item or a complete subtree. Sophisticated filters selecting subnodes depending on their name or vendor-associated information are allowed as well. It is possible to limit the maximal number of items reported in the response. If not all items are reported, a reference is returned to the client in the “ContinuationPoint” attribute. To retrieve the remaining items, the client should send a new browse request along with this reference.
GetProperties: The GetProperties interface provides the ability to read the metadata associated with one or more OPC Items. The client is able to specify the metadata fields of interest or request all available metadata.
Read: The Read interface provides the ability to read the value and associated metadata of one or more OPC Items. The Read request runs to completion before the response is returned. The server obtains the data for an item, or it determines that the data cannot be read. It places either the data or an error code for each requested item into the response, according to the structure and order of the items in the request. The latest absolute time at which the server should respond is specified by the “RequestDeadline” attribute. The data can be returned from the server's cache or read from the underlying physical device. The cache usage can be controlled by the client through the “MaxAge” attribute.
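As a sketch (SOAP envelope and namespaces omitted), a Read request for a single item may look as follows; the item name is hypothetical and the element names follow the conventions described above:

<Read>
  <!-- return the source timestamp together with the value -->
  <Options ReturnItemTime="true"/>
  <!-- accept a cached value if it is not older than 500 ms -->
  <ItemList MaxAge="500">
    <Items ItemPath="" ItemName="Sensors/Temperature01"/>
  </ItemList>
</Read>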
Write: The Write interface provides the ability to adjust the value and quality of one or more OPC Items. The service runs to completion before the response is returned. The server writes the data for each item, or it determines that the data cannot be written. If requested, after all writes complete, the server performs a read of the items. The server places either the data or an error code (write or read) for each requested item into the response, matching the structure and order of the request. The latest absolute time at which the server should respond is specified by the “RequestDeadline” attribute.
Subscription: The Subscription interface provides the ability to establish a polled-style subscription to a group of OPC Items (OPC Group). The client initiates the subscription and agrees to issue periodic refresh requests. The values associated with the OPC Group items are returned along with the responses to these requests. The subscription behavior is controlled using several properties. The “RequestedSamplingRate” indicates the fastest rate at which the server should poll the underlying devices for data changes. Polling at faster rates is acceptable, but not necessary to meet the client's needs. “EnableBuffering” demands buffering of the value changes detected in between client polling requests. “Deadband” specifies the percentage of the full engineering unit range of an item value that must change before the value becomes interesting to the clients. “Holdtime” and “Waittime” are used to reduce the latency of reporting a value change to the client and to minimize the number of round trips between the client and server. The “HoldTime” instructs the server to hold off returning from the polled refresh call until the specified absolute server time is reached. The “WaitTime” instructs the server to wait the specified duration after the “Holdtime” is reached before returning if there are no changes to report. A change in one of the subscribed items during this wait period will result in the server returning immediately rather than completing the wait time.
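The subscription exchange can be sketched as follows (SOAP envelope and namespaces omitted; the item name and subscription handle are hypothetical):

<Subscribe ReturnValuesOnReply="true">
  <Options ReturnItemTime="true"/>
  <!-- buffer changes between polls; report changes larger than 1% -->
  <ItemList EnableBuffering="true" Deadband="1.0" RequestedSamplingRate="500">
    <Items ItemName="Sensors/Temperature01"/>
  </ItemList>
</Subscribe>

<!-- periodic refresh: the server holds the reply until "HoldTime" and then
     waits up to "WaitTime" milliseconds for a change before returning -->
<SubscriptionPolledRefresh HoldTime="2006-02-01T12:00:05Z" WaitTime="1000">
  <ServerSubHandles>hypothetical-handle</ServerSubHandles>
</SubscriptionPolledRefresh>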

1.6. OPC Complex Data Specification


The OPC Complex Data specification [12] describes how to represent and access complex data within the existing DCOM-based OPC DA framework. However, the same ideas can be used together with the OPC XML-DA framework.
The remaining part of the section describes the methods and concepts defined in the OPC Complex Data specification to provide complex data to the clients in different forms. These concepts, in conjunction with the OPC XML-DA specification, will later be adopted by the High Data Rate extension to enable fast binary data dissemination.

1.6.1. Complex Type Descriptions


The OPC Complex Data specification provides a mechanism to describe and distribute complex data structures contained within the OPC XML-DA server data space. The Complex Data Type Descriptions are provided through metadata properties. The “Type System ID”, “Dictionary ID” and “Type ID” properties are used to describe the OPC Item type.
The Dictionary is an entity that describes one or more complex types using a syntax defined by a
Type System. A Type Description is a portion of a Dictionary that describes a single complex type.
A Type Description may contain references to other complex types within the same Dictionary, as a
result of which, a Type Description may not contain all information required to understand the
complex type. A Dictionary, on the other hand, should contain all information that a client needs to
understand the complex types it contains.
An OPC Complex Data compliant server should provide all type description information through the CPX branch in the server data space (see Fig. 4). For each of the supported Type Systems a corresponding node should be registered under the “CPX” node. The “Type System ID” specifies the name of the node used for the considered Type System.
Most of the Type Systems provide a set of standard and user-defined types. The Dictionaries are used to scope all types required to fully identify an individual Type Description. The “Dictionary ID” property contains the string that identifies the Type Dictionary containing the type definition of the OPC Item. The value of this property depends on the server implementation and the used Type System. All supported dictionaries are registered under the Type System node. The server may, but is not required to, make the dictionary available to the clients using the “Dictionary” property. Finally, all data types available in the dictionary are registered under the dictionary node and contain the complete data type description in the “TypeDescription” property. The “Type ID” property uniquely identifies the Type Description in the Type Dictionary.

Fig. 4. The figure represents the layout of the complex type description items inside the OPC XML-DA data
space.
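To make the layout concrete, the three type-describing properties of a hypothetical complex OPC Item could carry the following values (shown in a simplified notation, not the literal wire format):

Item:           Detector/EventRecord
Type System ID: OPCBinary       (resolved to the node CPX/OPCBinary)
Dictionary ID:  DetectorTypes   (resolved to the node CPX/OPCBinary/DetectorTypes)
Type ID:        event_record    (its "TypeDescription" property holds the definition)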

It is expected that dictionaries may change during the lifetime of the OPC server. Changes may be a result of a change in a complex data type, but it is more likely that dictionary changes are a result of the addition of new type descriptions. The clients may detect the dictionary changes by subscribing to the OPC Item associated with the Type Dictionary.
The client must recognize the type system in order to make use of any of the type description information. The following Type Systems are defined by the OPC Complex Data specification:
XMLSchema: The XML Schema Type System is used to describe complex data values represented in XML. The Type System utilizes XML Schema to describe the data type. In that case the location of the XSD containing the type description is used to identify the Type Dictionary. The “Type ID” corresponds to the schema type of the data element.
OPCBinary: The OPC Binary Type System is used to describe complex binary data values. The OPC Binary Type System is defined within the OPC Complex Data specification itself.

1.6.2. OPC Binary Type System


The OPC Binary dictionary is composed of a set of Type Descriptions. These Type Descriptions are expressed using an XML meta-language and are used to define constructed types. A constructed type is composed of a set of fields. Each field is represented by an XML element and references some basic or previously defined type. The XML attributes are used to qualify type properties (see the example in Fig. 5). The data described by the types follow in the binary stream in the same order as the fields appear in the XML document. The type name is used as the element name for basic types. The “TypeReference” element is used for previously defined types; the “TypeID” attribute specifies the type name in that case.

<TypeDescription TypeID="constructed_type">
  <Float Name="sensor" Length="8"/>
  <Integer Name="n" Length="4" Signed="false"/>
  <Integer Name="counters" Length="1" Signed="false" ElementCountRef="n"/>
</TypeDescription>

Fig. 5. An example Type Description. The data corresponding to the type consist of a double precision (8-byte) floating point number, a 32-bit unsigned integer value specifying the size of the following array, and a variable-length array of 8-bit unsigned integer values.

The “Integer”, “Float”, “BitString” and “CharString” basic types are supported. The “Integer” and “Float” types represent integer and floating point numbers. The type size (number of bytes) is specified using the “Length” attribute. For integer numbers the “Signed” attribute specifies whether the integer is signed or unsigned. The “BitString” is used to describe raw binary data; the “Length” attribute specifies the number of bits in that case. The “CharString” is used for strings.
The “StringEncoding” attribute specifies the string encoding; ASCII, UCS-2, UCS-4, UTF-8 and UTF-16 are supported by the specification. The “CharWidth” attribute can be used to specify the character width (how many bytes are used to encode one character). This option allows the client to parse data containing strings in an unknown encoding. The “CharCountRef” attribute is used to specify the string length.
The “ElementCount”, “ElementCountRef” and “FieldTerminator” attributes are used to construct arrays. The “ElementCount” attribute is used to define a fixed-length array; the attribute value specifies the number of elements in the array. The “ElementCountRef” attribute is used to define a variable-length array. The attribute value should contain the name of a preceding field in the same Type Description; the referenced field must be of an integer type and contains the number of elements in the array. The “FieldTerminator” attribute is used to describe fields containing an arbitrary sequence of values. The value of the “FieldTerminator” is used to indicate the end of the field. It must be a sequence of bytes represented as a string formatted with the XML 'hexBinary' notation. The “FieldTerminator” can be used, for example, to define null-terminated strings; the “CharCountRef” is not specified in that case. Only one of the “ElementCount”, “ElementCountRef”, “FieldTerminator” and “CharCountRef” attributes is allowed to be present in a field description.
Several attributes of the Type Description element control the binary encoding. The “DefaultBigEndian” attribute defines the default byte order for all integers and all multi-byte characters contained in this Type Description, including those contained in nested Type Descriptions. When the value of this attribute is “true”, Big Endian byte order is used; when “false”, Little Endian byte order is used. The “DefaultFloatFormat” attribute defines the default format used for all floating point numbers contained in this Type Description, including those contained in nested Type Descriptions. Only the 'IEEE-754' format is defined by the OPC Complex Data specification. More information about binary encoding will be presented in section 2.3.
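As a sketch, the encoding attributes and a reference to the type from Fig. 5 could be combined as follows (the type and field names are hypothetical):

<TypeDescription TypeID="event_record" DefaultBigEndian="false"
                 DefaultFloatFormat="IEEE-754">
  <!-- 8-byte unsigned little-endian integer -->
  <Integer Name="timestamp" Length="8" Signed="false"/>
  <!-- the constructed type defined in Fig. 5 follows in the stream -->
  <TypeReference Name="payload" TypeID="constructed_type"/>
</TypeDescription>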

1.6.3. Complex Data Representations


Normally, the OPC XML-DA servers expose data using the SOAP encoding rules. This data representation is well suited for a control system working in highly heterogeneous environments at moderate data rates. However, it is not acceptable for clients requesting the data at high rates. The OPC XML-DA server can support the needs of both types of clients by representing the data item using the SOAP encoding rules as well as one or more alternative binary formats.
Servers that support data representations would represent the set of available conversions for a specific complex data item by creating a branch named “CPX” under the complex data item within the server data space. This branch would contain one or more items that expose the same value in different formats. This structure is shown in Fig. 6.

Fig. 6. The figure illustrates how the CPX branch is constructed in the OPC XML-DA server data space to
make the data representations, filters and queries available to the clients. The CPX branch is defined by the
OPC Complex Data Specification and registered for all complex items supporting type conversions and queries.

A client would be able to determine the set of available type conversions by browsing the items in the “CPX” branch. A client would then be able to read and write data in alternate formats by accessing the items contained in this CPX branch instead of the base complex data item. It is up to the server to decide whether it will support writes to the OPC Item in the alternate formats.
These alternate items also support all the properties described in section 1.6.1. Those properties are necessary for the client to read the Type Description information for the alternate formats. The “Unconverted Item ID” property is used to specify the original unconverted OPC Item to the clients; the value of this property is its “ItemName”. Therefore, the OPC Complex Data specification requires that all items in the “CPX” branch under a complex data item have the same “ItemPath”.
The OPC XML-DA servers that support data representations should choose names for alternate
items that are descriptive enough to allow a client to present the names in a list for the end user.

1.6.4. Data Filters and Queries


Another important aspect of the OPC Complex Data specification is the definition of queries. Queries can control what the OPC XML-DA server returns to the client. Applications for filters or queries range from a simple mask, where the client tells the server to ignore some fields when
testing whether a data change update has occurred, to complex queries that the server uses to extract the data from an underlying data source. For this reason, the OPC Complex Data specification defines a general query specification mechanism. Normally, the queries are constructed using some well known query language, like XPath or SQL. However, a custom implementation-specific approach is allowed; the metadata properties can be used to describe the query syntax in that case.
The server registers a corresponding “Data Filter” branch in the server data space for all items supporting queries. The branch is created under the “CPX” item for the primary data representation and directly under the representation item for the alternative data representations (Fig. 6). However, not all representations are required to support queries.
A client who wishes to specify a query must write the query parameters to the “Data Filter” item. The following format is used: <DataFilter Name="Query Name">query content</DataFilter>
Here, “Query Name” specifies a unique query identifier, and “query content” provides the query parameters. These parameters are not defined by the OPC Complex Data specification and depend on the server implementation. Supplying the identifier of an already existing query will result in an error. The server validates the query parameters and creates a new item under the “Data Filter” branch; the query identifier is used as the name of the newly created item. After that, the client is able to read from or subscribe to the item in order to receive the data with the query applied. Write support is not considered by the specification. The client may update the query by writing a new query value to the OPC Item associated with the query, and may destroy the query by writing an empty string to it.
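For example, a client could create a query named “OnlyCounters” by writing the following value to the “Data Filter” item; the XPath-like query content is hypothetical and server-specific:

<DataFilter Name="OnlyCounters">/constructed_type/counters</DataFilter>

After a successful write, the server would register a new item named “OnlyCounters” under the “Data Filter” branch; reading or subscribing to that item would then return the data with the query applied.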
The client must be able to find all existing queries by browsing the “Data Filter” branch. The query parameters are stored in the “Data Filter Value” metadata property of the query-associated OPC Item. The servers are not required to make query items created by one client visible to other clients; however, they may choose to do so.
The “Unfiltered Item ID” metadata property is used to specify the original unqueried OPC Item to the clients. If the query is applied to an item in an alternative format, the OPC Item associated with the considered representation is referenced.

1.7. Issues of the OPC XML-DA Protocol


The major drawback of the OPC XML-DA protocol lies in its XML nature, obviously prohibiting its utilization in high performance real-time systems. XML is today's standard of choice for the representation and exchange of structured data, particularly where that data must be read and interpreted by different applications working in heterogeneous environments and written by different groups. Still, the scientific data of most physical experiments is produced in a structured binary format. It often consists of series of large vectors or matrices. The XML representation of this data is significantly larger than the native one, requires much more computational resources for serialization of the numerical data, and is practically unusable for complex manipulations on matrices and vectors (see Fig. 7). Another issue limiting system scalability is caused by the properties of the HTTP protocol. HTTP is a connection-oriented client-server protocol. Hence, it is impossible to support multiple clients at once using multicasting approaches. Finally, the polled-style subscription mechanism considered by the OPC XML-DA specification is less effective than data streaming solutions while supporting a single client at high data rates [15, 17].

<Items ValueType="ArrayOfAnyType">
<anyType xsi:type="xsd:float">0.4354354</anyType>
<anyType xsi:type="xsd:float">0.4354354</anyType>
<anyType xsi:type="xsd:float">0.6767675</anyType>
<anyType xsi:type="xsd:float">0.5644423</anyType>
<anyType xsi:type="xsd:float">0.5644423</anyType>
<anyType xsi:type="xsd:float">0.5644423</anyType>
<anyType xsi:type="xsd:float">0.5644423</anyType>
</Items>
Fig. 7. The OPC XML-DA representation of an 8-value floating point array. The natural binary representation consumes 64 bytes in the system memory, whereas the OPC XML-DA representation needs 387 bytes. Hence, the representation causes a 6-fold data size increase.

Another limitation of the OPC XML-DA protocol lies in the fact that no mechanism is defined to access older data values (Historical Data). This feature is essential for data acquisition systems and systems working in environments with unstable network connections. The caching possibilities anticipated by the OPC XML-DA subscription mechanism are not enough to establish reliable data readout in such environments. The OPC HDA (OPC Historical Data Access [40]) specification defines that functionality for DCOM based systems; however, its equivalent in XML terms does not exist.
Therefore, the following summarized list of issues should be addressed by the High Data Rate extension to allow protocol usage in high performance systems as well.
• The XML format used by the OPC XML-DA for the data serialization is inefficient for
describing complex numerical data.
• The OPC XML-DA protocol does not consider a multicasting approach for system scaling.
• The polled-style subscription mechanism considered by the OPC XML-DA specification is less effective compared with data streaming solutions.
• Access to the Historical Data is not anticipated by the OPC XML-DA protocol.
• Two-way compatibility with legacy OPC XML-DA clients and servers should obviously be provided.

1.8. Summary
An introduction to distributed data acquisition systems and the available data exchange protocols was provided in this chapter. Further, the requirements for a universal data exchange solution were stated. On the strength of the formulated requirements, the OPC XML-DA protocol was selected as the basis for the development of the universal data exchange solution. The main reasons are its high-level standard compliance and its interoperability in heavily heterogeneous network environments. However, several limitations prevent protocol utilization in high performance and data acquisition systems; the slow performance and the absence of a mechanism providing access to historical data are the most important of them. To address these issues, several extensions to the original specification are introduced in the next chapter.

CHAPTER 2

NEW HIGH DATA RATE APPROACH

2.1. Introduction
The scientific data of most physical experiments is produced in a structured binary format at high data rates. It often consists of series of large vectors or matrices. The XML representation of this data is significantly larger than the native one, requires much more computational resources for serialization of the numerical data, and is practically unusable for complex manipulations on matrices and vectors. Therefore, the SOAP data encoding rules assumed by the OPC XML-DA specification are very inefficient and can hardly be used in high performance systems. In order to improve the OPC XML-DA protocol performance and solve the other issues stated in Chapter 1, the HDR (High Data Rate) extensions were developed.
The HDR extensions use a mixed XML/binary approach to handle the performance issue. The protocol and metadata information are transferred strictly following the OPC XML-DA specification, while the scientific data is transferred in a pure binary format. The OPC XML-DA message only carries a data type name and a reference to a binary stream.
Support of several simultaneous clients is achieved using a multicasting approach. The HDR extension allows the binary data to be transferred in a separate multicast stream, referenced from the OPC XML-DA message. A reliable multicasting protocol is used to guarantee the data consistency.
The extension is only used with systems imposing very high performance demands and provides two-way compatibility with legacy OPC XML-DA clients and servers. Hence, it will not break compatibility with existing solutions in the cases when the performance of the OPC XML-DA protocol is satisfactory.

The HDR extension scope:


• Allows separation of the scientific data from the protocol and metadata information.
• Defines rules to exchange binary data in heterogeneous environments.
• Adopts the OPC Complex Data specification to distribute data in different formats.
• Utilizes the OPC Binary notation to describe the data format.
• Anticipates a mechanism providing access to Historical Data.
• Provides support for multicast data transfer to maintain several synchronous clients.
• Revises security considerations.
• Manages two-way compatibility with legacy clients and servers.

2.2. Linkage between SOAP message and Binary Data


As stated in the previous section, the High Data Rate extension considers that the OPC XML-DA message carries only metadata and protocol specific information, while the scientific data is transferred using an alternative approach. However, a certain technology should be used to set a reference to the binary stream, so that the client application can find and extract the data. Several technologies can be used to achieve this goal; encapsulation of the binary data into the SOAP message using Base64 encoding, the SOAP Message with Attachment and the WS-Attachment technologies are the most widely used among them.

2.2.1. Base64 Encoding


The simplest way to couple the XML message with binary data is achieved using BASE64 data encoding. BASE64 is a binary-to-text encoding scheme whereby an arbitrary sequence of bytes is converted to a sequence of ASCII characters. The only characters used are upper and lower case Roman alphabet characters, the digits and the “+” and “/” symbols, with the “=” symbol as a special suffix code. Therefore, the BASE64 encoded string can be simply embedded in the XML body as an attribute or element content [46].
The resulting base64-encoded data has a length that is greater than the original length by the ratio 4:3, and a considerable amount of processor resources is required for serialization. Therefore, this approach is badly suited for high performance systems.
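For example, the 8-byte little-endian IEEE 754 representation of the double value 1.1 (bytes 9A 99 99 99 99 99 F1 3F) becomes a 12-character string when embedded into the XML body:

<Value xsi:type="xsd:base64Binary">mpmZmZmZ8T8=</Value>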

2.2.2. SOAP message with Attachment


A SOAP Message with Attachment [47] is constructed using MIME's Multipart/Related media type, with each part embedded within a MIME boundary (defined in the Content-Type header). Each MIME part has header information describing its content. Content-Type specifies the type of the data embedded in this MIME part. Content-Transfer-Encoding specifies the encoding used for the MIME part. Content-ID contains an identifier used to refer to this content from other parts of the MIME package. The CID URL scheme (defined in RFC 2392) is used to construct the identifier.
The root part of the MIME message contains the SOAP envelope with Content-Type set to text/xml. Both the SOAP header and body of a SOAP message may refer to other entities in the message package. Following the SOAP encoding rules, the “href” attribute can be used to reference any resource.
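A minimal sketch of such a package (headers abbreviated, content identifiers hypothetical) looks as follows:

Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-ID: <envelope@example.org>

<?xml version="1.0"?>
<SOAP-ENV:Envelope>
  ... <Value href="cid:payload@example.org"/> ...
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary
Content-ID: <payload@example.org>

...raw binary data...
--MIME_boundary--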

2.2.3. Web Service Attachment
WS-Attachment (Web Service Attachment [48]) uses DIME (Direct Internet Message Encapsulation [49]) for sending and receiving SOAP messages with additional attachments, like binary files, XML fragments, and even other SOAP messages. DIME is designed to encapsulate a SOAP message and its related attachments in a MIME-like way. As with SOAP, DIME messages can be sent using the HTTP transport protocol; the HTTP "Content-Type" header must be set to "application/dime" in that case.

A DIME message consists of a series of one or more DIME records. Records are serialized into the stream one after the other and are delineated with a binary header. The first record in a DIME message has the MB (Message Begin) flag set in the header and the last record has the ME (Message End) flag set. Large records, or records where the size of the data is not initially known, can be broken down into a series of chunks. Each chunk has a header and a payload like normal records; however, the CF (Chunk Flag) is set in the header. In addition to the mentioned flags, the DIME record header also includes the length of the record, the content type, an ID to uniquely identify each record, and support for adding any optional information that may be transmitted with a particular record. In contrast with the MIME technology, DIME sets no restrictions on the content of a DIME record; in fact, the payload of a DIME record can contain another DIME message.
The WS-Attachment specification indicates that the SOAP message must be contained in the first record of a DIME message. The attachments can be referenced from the SOAP message using the ID field of the DIME record header; the “href” attribute is considered for that purpose.
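A two-record DIME message can be sketched as follows; the record IDs are hypothetical and the binary record headers are shown symbolically:

Record 0 (MB set, type=text/xml, id=uuid:...-envelope, length=...):
  <SOAP-ENV:Envelope>
    ... <Value href="uuid:...-payload"/> ...
  </SOAP-ENV:Envelope>
Record 1 (ME set, type=application/octet-stream, id=uuid:...-payload, length=...):
  ...raw binary data...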

2.2.4. Summary
The SOAP Message with Attachment technology utilizes separator strings to specify the data record boundaries. Such a design requires the receiving side to process the whole multipart message in order to extract the data fields, and the sizes of the receiving buffers can hardly be guessed. Although the "Content-Length" header can be used to specify the full length of the records, handling the described problem, this approach limits the possibility to stream data of initially unknown length.
The DIME encapsulation used in the WS-Attachment technology allows achieving both goals: the chunked transfers provide streaming possibilities, while the length headers allow fast parsing. Additionally, a certain amount of computational resources can be saved by using binary headers instead of the text ones assumed by the SOAP Message with Attachment specification. Therefore, the HDR extension uses the WS-Attachment technology to link the SOAP message with binary attachments.

2.3. Binary Formats in Heterogeneous Environment
As stated before, binary data transfer is more efficient than the XML-based one. Therefore, high performance systems under high load should utilize a binary representation for data exchange. However, native binary representations may differ among the various parts of a heterogeneous environment. Different architectures have different byte ordering and sizes of basic types, and even the floating point format may differ (as is the case for some older architectures like VAX stations). Additionally, various compilers choose different data alignments to accelerate access to variables. Such differences raise the necessity to establish a standard binary representation, understandable by all participants of a data exchange.
There is a variety of different standardized binary formats used for data exchange in heterogeneous environments. The XDR (eXternal Data Representation [28]) standardized by Sun Microsystems as RFC-1014, the CDR (Common Data Representation [30]) used by the CORBA implementations, and ASN.1 (Abstract Syntax Notation One [50]) defined by ISO/IEC 8824-8825 are the best known among them.
However, it should be considered that all these formats require some data conversions even in the case of a homogeneous environment. Assuming that most experiment computers have similar architectures, this solution is sub-optimal and limits peak performance. This assumption is rather often true for modern data acquisition systems, since many of them, like ASEC, are based on clusters of x86 Linux PCs [51]. Another often used solution is a mixed Linux/Windows environment running on x86 based computers. In this case the systems also have similar data representations; the only difference is the alignment of the data.
To optimize performance in the homogeneous cases, the native data representation (NDR) of the server can be used to transfer the binary data. The description of the data format is sent along with the data in that case. The client analyzes the format description and performs data conversion only in the case of format incompatibility.

2.3.1. External Data Representation


XDR (eXternal Data Representation [28]) was proposed by Sun Microsystems Inc. and standardized as RFC-1014. It is widely used in Sun RPC servers and in the NFS file system. The advantages of this representation are the rather small size, simple encoding rules and the large number of available software implementations for data encoding and decoding. The XDR representation uses the big-endian byte order. The IEEE 754 standard is used to represent floating point numbers. All data is aligned to 4 bytes.
The PC platform utilizes the little-endian byte order to represent multi-byte data. Therefore, XDR requires complete data encoding/decoding while transferring data between two PC computers. Since the PC is currently the most abundant platform in the data acquisition world, the XDR representation will incur serialization overhead in the most frequent case.

2.3.2. Common Data Representation


CDR (Common Data Representation [30]) is the data representation of the IIOP (Internet Inter-ORB Protocol) considered for data exchange by the CORBA specification. It is similar to XDR with the following exception: it uses the natural alignment of the PC architecture, and messages are passed in the same byte order as the source platform creates them. Therefore, the CDR representation is slightly better optimized for the PC platform compared with the XDR one.
There are numerous CORBA implementations available for most existing platforms. However, all these implementations provide the sophisticated set of functionality defined by the CORBA specifications; hence, they are big in size and complicated to use. The XTL is known to be the only library providing simple CDR serialization support.

2.3.3. Abstract Syntax Notation One


The ASN.1 (Abstract Syntax Notation One [50]) is an ISO/IEC 8824/8825 standard for multi-vendor data exchange in heterogeneous environments. ASN.1 defines a set of specifications facilitating the exchange of structured data, especially between application programs over networks, by describing data structures in a way that is independent of machine architecture and implementation language.
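For illustration, a simple data structure could be declared in the ASN.1 notation roughly as follows (a sketch; the type and field names are hypothetical):

SensorReading ::= SEQUENCE {
    channel    INTEGER,
    value      REAL,
    timestamp  GeneralizedTime
}

Values of such a type can then be serialized with any of the encoding rules described below.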

The ASN.1 specification defines the notation for describing structured data but does not restrict the way the data is encoded; various ASN.1 encoding rules are used for data serialization. Currently, the following encoding rules are defined:
BER (Basic Encoding Rules): BER is a self-describing, variable-length encoding scheme: each data value can be identified, extracted and decoded without any additional information. It is very size-efficient, using approximately 30% less memory than the platform-specific data representation. However, serialization and extraction are very expensive; floating point numbers, for example, must be converted from the system IEEE 754 representation to the BER-specific one. The performance comparisons in [52, 53] indicate that ASN.1/BER is 3-4 times slower than XDR encoding.
CER (Canonical Encoding Rules): CER is a restricted variant of BER. BER gives more than one way to represent data; for example, a Boolean TRUE value can be encoded as any non-zero value. Where BER gives choices as to how data values may be encoded, CER selects just one encoding from those allowed by the basic encoding rules, eliminating all options. It is useful for cryptographic needs, ensuring that a structure that has to be digitally signed produces a unique serialized representation.
DER (Distinguished Encoding Rules): DER is another restricted variant of BER. The DER encoding
is used by cryptographic software to store certificates.
PER (Packed Encoding Rules): PER provides a much more compact encoding than BER. It tries to represent the data units using the minimal possible number of bits. Unlike BER, PER is not self-describing and requires the decoder to know the complete abstract syntax of the data structure being decoded.
GSER (Generic String Encoding Rules): GSER defines a human readable UTF-8 character string encoding of ASN.1 values. It is primarily intended for defining the LDAP-specific encoding of new LDAP attribute syntaxes and assertion syntaxes. The GSER representation is larger than BER and requires more processing for serialization.
XER (XML Encoding Rules): XER provides ASN.1 encoding in terms of XML. The resulting data representation is large and requires a lot of resources for encoding/decoding.

ASN.1 implementations are widely available for all platforms and programming environments. However, the performance of the existing encoding rules does not satisfy the requirements of high-performance data acquisition systems.

2.3.4. Native Data Representation


The basic concept behind NDR (Native Data Representation) is to transport data in the native server representation, exactly as it is stored in system memory. The NDR approach assumes the data to be accompanied by metadata describing the used format: byte order, basic data sizes, floating point format and alignments. This metadata can be transferred along with the data or served by the server upon client request. The latter approach saves bandwidth in cases when client and server exchange messages without maintaining a permanent connection.
The NDR approach is very effective in homogeneous cases, since no serialization/decoding is performed at all. In heterogeneous cases the approach performs conversion only on the client's side. This lowers the burden on highly loaded servers by moving processing to the clients. The measurements in [54] show that the approach is faster than the common data exchange methods even in a heterogeneous environment.
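As an illustration, the client-side format check might look like the following C sketch; the descriptor layout is hypothetical, the real metadata being expressed in the HDR Binary notation described in section 2.4:

#include <stdint.h>

/* Hypothetical descriptor accompanying NDR data. */
typedef struct {
    uint8_t little_endian;  /* 1 if the server stores values LSB first */
    uint8_t sizeof_long;    /* size of the server's long type in bytes */
    uint8_t alignment;      /* alignment the server uses for long values */
} ndr_format;

/* Returns non-zero when the local platform matches the server format,
 * so the received payload can be used directly, without any conversion. */
static int ndr_compatible(const ndr_format *fmt)
{
    const uint16_t probe = 1;
    int local_le = *(const uint8_t *)&probe;     /* detect local byte order */
    return fmt->little_endian == local_le
        && fmt->sizeof_long   == sizeof(long)
        && fmt->alignment     == sizeof(long);   /* natural alignment assumed */
}

Only when this check fails does the client fall back to a per-field conversion driven by the transferred type description.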

2.3.5. XML
As said before, the SOAP data encoding rules are too slow for use in high-performance systems. Of course, it is possible to design another XML data representation that would be smaller and faster than the SOAP one. However, the requirement to serialize data into ASCII strings imposes too big an overhead for contemporary systems serving data at speeds of dozens of megabytes per second.

2.3.6. Summary
Fig. 8 presents a performance comparison of the XDR, CDR and NDR approaches. Floating point arrays of different sizes were used to estimate performance. The benchmarks show the amount of time required for a round-trip message exchange between a Linux server and a Windows client. The availability and popularity of the PC architecture make this environment preferred for data acquisition system development: it is supported by most operating systems, all data acquisition systems reviewed in section 1.2 are based on the PC architecture, and the PXI and FieldPoint devices considered by the KATRIN control system design are based on this architecture as well.
The comparison results confirm that the NDR approach is much more efficient than the XDR and CDR ones in a Linux/Windows environment. The benchmark indicates a three-fold performance difference between NDR and the fastest tested CDR solution. Therefore, the NDR approach is adopted by the HDR extension to transport data in heterogeneous environments in the most efficient way.

[Chart: round-trip time (µs) versus array size (KB) for the NDR, XDR (Glibc), CDR (XTL) and CDR (OmniORB) serializations]

Fig. 8. The chart represents the time required for a round-trip message exchange between Linux and Windows x86 PCs depending on the portable binary format used for serialization.

2.4. HDR Binary
The OPC Complex Data specification [16] defines the OPC Binary Type System (see section 1.6.2). The Type System is used to describe the binary data distributed by the server. However, the OPC Binary notation is insufficient for describing data encoded using the NDR representation: it provides no syntax for describing the data alignments used by various compilers to optimize access to variables, and references are not anticipated by the OPC Binary specification either.

The HDR specification extends the original OPC Binary notation to address these shortcomings.
The syntax entities are defined as follows:
1. An "alignment" attribute is used to describe data alignments. This XML attribute is used together with all basic and constructed data types and specifies the number of bytes the data is aligned to. For example, a 4-byte alignment means that the value corresponding to the current field should be stored in memory at an address that is a multiple of 4 (the skipped addresses between the end of the last field and the actual start of the current field, if any, are filled with undefined padding bytes). If the "alignment" attribute is omitted, the following default values are used:
a. For the Integer, Float and Reference basic types, the alignment is considered to be equal to the type's own length (in bytes).
b. No alignment (alignment equal to 1) is considered for the other basic types.
c. For constructed types, the alignment is considered to be equal to the maximal alignment of the encompassed variables.
2. A new basic type "Reference" is defined to describe data pointers referencing detached data blocks. The "Reference" concept allows transferring complex structures consisting of several memory blocks referenced from the primary one using memory pointers. It inherits all attributes supported by the Integer basic type; the "Length" attribute should be used to define the pointer length in bytes.

OPC Binary with the described extensions is defined as the HDR Binary notation and is used to describe data types in HDR compliant systems. The "http://dside.dyndns.org/HDRBinary/" XML namespace is used by the HDR Binary schema definition. The Type System should be referenced using the "HDRBinary" identifier from the "Type System ID" metadata property, and the corresponding type description branch should be created within the data space of HDR compliant servers as anticipated by the OPC Complex Data specification.
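For illustration, a record containing a 2-byte integer, an 8-byte float and a 4-byte reference might be described roughly as follows (a sketch: the "alignment" attribute and the "Reference" type are the HDR extensions defined above, while the remaining element and attribute names follow the OPC Binary conventions and are given here for illustration only):

<hb:TypeDictionary xmlns:hb="http://dside.dyndns.org/HDRBinary/">
 <hb:TypeDescription TypeID="SensorRecord">
  <hb:Int16 Name="channel" alignment="2"/>
  <hb:Float64 Name="value" alignment="8"/>
  <hb:Reference Name="history" Length="4" alignment="4"/>
 </hb:TypeDescription>
</hb:TypeDictionary>

According to the default rules above, the constructed type itself receives the maximal alignment of the encompassed fields, 8 bytes in this example.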

2.5. Historical Data Access
Access to older data values (Historical Data) is not anticipated by the OPC XML-DA specification. However, this feature is very important for data acquisition systems and for systems working in environments with unreliable network connections. The caching possibilities provided by the OPC XML-DA subscription mechanism are not enough to establish reliable data readout in such environments. Control clients can benefit from the availability of historical data in that case as well.

To overcome this limitation the HDR specification defines a special query (see more details in section 2.9.7). The query extracts Historical Data depending on the parameters supplied by the client. Afterwards, the extracted data is made available to the client using a specially created OPC Item. The following parameters are considered by the query:
Start time: The start time specifies the start of the interesting time slice. If no start time is specified,
the timestamp of the earliest available value is considered.
End time: The end time specifies the end of the interesting time slice. If no end time is specified, all values after the start time are considered.
Bounds: By default the query selects all values between the start and end time, and any value that
falls exactly on the start time, but not any value that falls exactly on the end time. If the "bounds" parameter is set to TRUE, the server will return the value that falls exactly on the end time.
Furthermore, if no value falls exactly on the start time, then the latest available value prior to the
start time is returned as well. Similarly, if no value falls exactly on the end time, then the earliest
available value after the end time is returned.
Aggregate: The parameter enables support for aggregates as defined by the OPC HDA (OPC
Historical Data Access [40]) specification. The aggregates are methods that summarize data values.
Common aggregates include averages over a given time range, minimum over a time range,
maximum over a time range. Only numeric data types (integer and floating-point numbers) are
supported by the aggregators. All aggregates should omit bad data values from the calculation. The
aggregate quality should be "uncertain/subnormal" in that case. It is up to the server implementation to decide whether values with "uncertain" quality should be kept or skipped.
The list of aggregates defined by the OPC HDA specification is presented in Table 1. The complete
description of these aggregates is available in [40]. Additionally, the server may support implementation-specific aggregates. In that case the names of these aggregates, along with their descriptions, should be made available to the end users using the standard OPC XML-DA data dissemination capabilities.

Table 1. The aggregates defined by the OPC HDA specification
Aggregate Description
INTERPOLATIVE This aggregate is used to retrieve interpolated values
TIMEAVERAGE Time-weighted average over the resample interval
TOTAL Totalized value (integral) over the resample interval
AVERAGE Average data value
COUNT Number of available values in the resample interval
STDEV Standard deviation over the resample interval
VARIANCE Variance over the resample interval
MINIMUM ACTUAL TIME Minimum value with its timestamp
MINIMUM Minimum value
MAXIMUM ACTUAL TIME Maximum value with its timestamp
MAXIMUM Maximum value
START Value at the beginning of the resample interval
END Value at the end of the resample interval
DELTA Difference between the first and last values
REGSLOPE Slope of the regression line
REGCONST Intercept of the regression line at the start of the resample interval
REGDEV Standard deviation of the regression line
RANGE Difference between the minimum and maximum values
DURATION GOOD Duration of time in the interval during which the data is good
DURATION BAD Duration of time in the interval during which the data is bad
PERCENT GOOD Percent of data in the interval that has good quality
PERCENT BAD Percent of data in the interval that has bad quality
WORST QUALITY Worst quality of data in the interval

Resample interval: The server divides the time slice specified by the start and end time into a sequence of intervals and then calculates an aggregate for each interval. The resample interval specifies the duration of these intervals (a sketch of one aggregate computation is given at the end of this section).
Query: The query provides a possibility of selecting certain values from the time slice specified by the start and end time. The format of the query is not defined by the HDR specification and depends on the server implementation. However, a high-level query language such as SQL or XQuery is recommended for that purpose.

2.6. Multicasting
Multicast communication allows a single transmission of data from the OPC server to be received by multiple clients. In contrast to conventional point-to-point unicast delivery, the multicasting approach transfers the data over each link of the network only once and creates copies only where the links to the destinations split. This capability is usually essential for highly loaded data acquisition systems that need to improve fault tolerance or implement some kind of load balancing. Fig. 9 represents a typical scenario demanding the multicasting approach.

[Diagram: a server multicasts a data stream to a database and to a data processing cluster of PCs]
Fig. 9. The figure represents an example of a typical multicasting system layout. A cluster of PCs is used to process data acquired from the server using a multicast connection.

However, today's multicasting implementations rely on the unreliable UDP protocol, so data consistency cannot be guaranteed. To solve this problem, a working group on RMT (Reliable Multicast Transport) was organized by the IETF (Internet Engineering Task Force). An RMT protocol uses multicasting to provide the message dissemination capability and adds reliable delivery mechanisms on top of it.
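At the transport level, group membership is established with the standard socket interface; the following C sketch joins a hypothetical multicast group (plain UDP only, the reliability layer discussed below is omitted):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Join a multicast group; returns a socket delivering the group datagrams. */
int join_stream(const char *group, unsigned short port)
{
    struct sockaddr_in addr;
    struct ip_mreq mreq;
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) { close(s); return -1; }

    mreq.imr_multiaddr.s_addr = inet_addr(group);   /* e.g. "224.2.0.1" */
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) < 0) {
        close(s);
        return -1;
    }
    return s;   /* recv() on this socket now yields the multicast datagrams */
}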

2.6.1. Pragmatic General Multicast


There is still no RMT protocol accepted as a mandatory standard. However, PGM (Pragmatic General Multicast [55]) is accepted as an experimental protocol by the IETF.
PGM runs over a standard datagram multicast protocol. The data source multicasts a sequence of data packets using the standard multicasting approach. However, the data is not discarded after transmission but is kept in a cache for a configured amount of time.
Upon detection of a missing data packet, a receiver repeatedly unicasts a NAK (Negative Acknowledgement) to the last-hop PGM network element on the distribution tree from the data source. The receiver repeats this NAK until it receives an NCF (NAK confirmation) from that PGM network element. In turn, the network element forwards the NAK upstream, and so forth, until the NAK reaches the source of the original data packet or a DLR (Designated Local Repairer). On receipt of the NAK, the source or DLR multicasts the missing packet from the cache again.
Additionally, special SPM (Source Path Message) messages are periodically multicast by the PGM data source. The SPM messages are used by all PGM network elements to establish the reverse path to the source. In addition, SPMs complement the role of data packets in provoking further NAKs from the receivers and in maintaining the receive window state in the receivers.
PGM is currently supported under the Windows Server 2003 operating system, and several implementations are also available for the Linux platform. PGM-enabled routers are provided by Cisco, Nortel and perhaps some other vendors.

2.6.2. Summary
The PGM protocol is adopted by our system to support multiple synchronous clients requesting the same data. The realization is oriented toward high speed data acquisition systems serving multiple intranet clients. Some of the supported OPC Items are selected by the OPC XML-DA HDR server for broadcast depending on their data rates and the estimated number of clients. A dedicated multicast group is allocated for each of these OPC Items, and the address of the multicast group is stored along with the other metadata information.
During server operation, every value change of the data variable associated with such an OPC Item is broadcast into the corresponding group using the native data representation. Both the data type conversion and the OPC Item grouping are performed on the client side. However, OPC XML-DA HDR clients are not obliged to support PGM multicast; the data to PGM-incapable clients is transferred using the WS-Attachments approach.

2.7. Security Aspects


One of the main advantages of the OPC XML-DA protocol over its competitors is the ability to work in highly distributed environments via the internet, utilizing proxy servers on the message way. In such a distributed environment the data will inevitably pass through different kinds of routers and proxy servers it is not intended for. If the data is transferred unprotected, persons having access to one of these routers or proxy servers may read and, in some circumstances, alter the passed information. Even worse, if writable data items controlling the installation behavior are available, a malicious agent may cause serious damage to the system. Therefore, a reliable security approach is mandatory for the OPC XML-DA protocol when supporting clients across the internet.

2.7.1. SSL Security


SSL (Secure Socket Layer [56]) and TLS (Transport Layer Security [57]) technologies provide endpoint authentication and communication privacy across the internet using public and symmetric key cryptography. The SSL handshake involves a number of basic phases; at the simplest level the following sequence is used. First, the peers exchange information about the supported cryptographic algorithms. Then a symmetric key is generated and passed between server and client using the public key approach. The key exchange is digitally signed to ensure that it comes from the server and not from a malicious agent in the middle between server and client. Finally, the data exchange starts: the generated symmetric key is used to encrypt all transferred information using the conventional cryptography approach [45, 58].
Symmetric-key cryptography, also known as conventional cryptography, requires the sender and receiver to share a key: a secret piece of information that may be used to encrypt or decrypt a message. If this key is kept secret, then nobody other than the sender or receiver may read the message. The task of privately agreeing on a key before communicating, however, can be problematic.
Public key cryptography solves the key exchange problem by defining an algorithm which uses two
keys, each of which may be used to encrypt a message. If one key is used to encrypt a message then
the other must be used to decrypt it. This makes it possible to receive confidential messages by
simply publishing one key (the public key) and keeping the other secret (the private key).

A message digest is used to guarantee message integrity. The sender creates a concise summary of the message, called the message digest, and sends it along with the message. Upon receipt, the receiver creates its own summary and compares it with the one received from the sender; if they agree, the message was received intact. Hash (one-way) functions, which create short, fixed-length representations of longer, variable-length messages, are used to create message digests. Digest algorithms are designed to produce unique digests for different messages, making it infeasible to determine the message from the digest and to find two different messages that produce the same digest.
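For example, the 20-byte SHA-1 summary of a message could be computed with the OpenSSL library as in the following sketch:

#include <openssl/sha.h>
#include <stdio.h>

/* Print the digest of a buffer; the receiver repeats the calculation
 * and compares the result with the digest received from the sender. */
void print_digest(const unsigned char *data, size_t len)
{
    unsigned char md[SHA_DIGEST_LENGTH];
    int i;
    SHA1(data, len, md);                 /* one-way hash: 20-byte summary */
    for (i = 0; i < SHA_DIGEST_LENGTH; i++)
        printf("%02x", md[i]);
    printf("\n");
}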
Digital signatures are used to identify the sender of a message and to ensure that the message really comes from them. Digital signatures are created by encrypting a digest of the message, and other information (such as a sequence number), with the sender's private key. Though anyone may decrypt the signature using the public key, only the signer knows the private key; this means that only they may have signed it. Including the digest in the signature means the signature is only good for that message; it also ensures the integrity of the message, since no one can change the digest and still sign it. To guard against interception and reuse of the signature by an intruder at a later date, the signature contains a unique sequence number.

2.7.2. X509 Certificates


Public key cryptography assumes that a message intended for personal use by the receiver is encrypted with the receiver's public key. Therefore, it is required to establish a mechanism providing an association between public keys and real world entities. Security certificates are used for that purpose. A certificate associates a public key with the real identity of an individual, server, or other entity, known as the subject. Certificates are issued by the server itself or by trusted third-party agencies (Certificate Authorities, CA) and are signed with the CA private key. Hence, preliminary knowledge of the certificates belonging to the trusted CAs is enough to assure the client identity.
The information about the subject includes the identifying information (the distinguished name) and the public key. Distinguished names are defined by the X.509 standard [59], which defines the fields, field names, and abbreviations used to refer to the fields. These fields include information on the entity name, organization and location; an e-mail address is often included as a part of the entity name. The certificate also includes the identification and signature of the Certificate Authority that issued the certificate and the period of time during which the certificate is valid.

2.7.3. OPC XML-DA Security


The OPC XML-DA specification expects the security to be provided by the transport level protocol; HTTPS (Secure Hypertext Transfer Protocol) is considered for that purpose. HTTPS uses the standard HTTP approach for the data transfer. However, instead of plain text communication, the data is encrypted using the SSL or TLS protocols, thus ensuring reasonable protection from eavesdropping and message forgery.

The SSL handshake uses X509 certificates for data protection. Prior to the actual data exchange, the client and server exchange their certificates in order to establish a protected data channel. Therefore, authentication based on the client-supplied certificate is possible: the OPC XML-DA server is able to identify a person based on the certificate supplied to establish the HTTPS connection. Further, the server may allow or disallow access to certain portions of the server data space depending on this identity.

2.7.4. XML Encryption and XML Signature Specifications


The XML Encryption and Signature specifications provide a standard way to encrypt and digitally sign the content of an XML document [60, 61]. It is possible to handle the whole document at once or to specify a desired subset using the XPath language. Hence, particular parts of a single XML document can be signed and encrypted by different authorities.
Both encryption and digital signatures are based on the secure key concepts. Symmetric and asymmetric keys are used by the XML specifications to establish secure data exchange in the same way as SSL does. The XML Signature specification defines an XML structure to describe the used keys. Besides the key itself, the structure has additional fields allowing the description of X509 certificates as well. The keys can be stored either in clear text or ciphered by means of another symmetric or asymmetric key. The clear text keys are stored inside a "KeyInfo" element and are used to distribute public keys or certificates. The ciphered keys are normally symmetric keys encrypted using the client's public key; the "EncryptedKey" element is used to attach ciphered keys to the document.
In many cases several keys are used within a single XML document. Each key has a defined name and can be referenced from other parts of the XML document using this name.

The XML Encryption specification defines an "EncryptedData" element structure. In accordance with the specification, the element provides the full set of information required by the receiving side to decrypt the data. This information includes the list of used algorithms, the symmetric key used for the data encryption and the cipher data itself. Similarly to the described SSL approach, the symmetric key is encrypted using the receiver's public key and stored along with the other information. The "EncryptedData" element replaces the protected element or its content in the encrypted version of the XML document; when an entire XML document is encrypted, the "EncryptedData" element becomes the root of a new document.
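A typical "EncryptedData" element might look roughly as follows (a sketch; the cipher values are placeholders):

<EncryptedData xmlns="http://www.w3.org/2001/04/xmlenc#">
 <EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#tripledes-cbc"/>
 <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
  <EncryptedKey>
   <EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-1_5"/>
   <CipherData><CipherValue>symmetric key encrypted with the receiver public key</CipherValue></CipherData>
  </EncryptedKey>
 </ds:KeyInfo>
 <CipherData><CipherValue>BASE64 encoded cipher text</CipherValue></CipherData>
</EncryptedData>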

The XML specification allows the same XML data to be represented in more than one way (order of attributes, etc.). Therefore, the XML Signature specification defines transformations which are used to convert XML content into a unique canonical form. The "Signature" element defined by the specification contains the information required by the receiving side to verify data integrity: the sender certificate, the used canonicalization transformation, the digest algorithm and the encrypted digest itself. The "Signature" element can be included inside the signed element content (enveloped signature) or placed elsewhere in the XML content, referencing the signed elements using an XPath expression (detached signature).

2.7.5. Multicast Security


HTTPS is a point-to-point security approach; hence, it is incapable of providing security for multicast data, and some other mechanism has to be implemented. The HDR solution assumes symmetric key cryptography for multicast data connections, as shown in Fig. 10. The OPC XML-DA server generates a random symmetric secret key to encrypt all data passed to a multicast group. The client establishes a secure HTTPS control connection to the server, using asymmetric key cryptography as specified by the OPC XML-DA specification. Further, it confirms PGM protocol support and requests the data. The OPC XML-DA HDR server responds with the address of the multicast stream; the key used to protect the data is sent along with the response. Therefore, the client can receive data from the multicast stream and decrypt it with the provided key.

[Diagram: the client obtains the symmetric key from the OPC XML-DA HDR server over an HTTPS/XMLSec OPC XML-DA connection encrypted using asymmetric-key cryptography; the high data rate binary connections between server, multicast group and client are encrypted with the symmetric key]

Fig. 10. OPC XML-DA HDR security layout used to protect data between the server and multicast clients.

2.7.6. XML Security


A major security improvement is achieved by moving from transport level security to the XML-only security approach described by the XML Encryption and XML Signature specifications. The XML-based security solution brings a number of benefits over the standard HTTPS based implementation. Most importantly, it provides more control over authentication and read-write access. The XML Encryption and XML Signature specifications allow protection of individual parts of an XML document using different credentials. Therefore, each of the transferred OPC Items can be encrypted or signed using a different certificate. This helps disseminate protected content in large distributed environments with many people controlling various system operations.
Another benefit comes from the possibility to protect only the critical part of the data. The XML security approach can save a lot of computational resources if the server streams dozens of megabytes per second but only a small part of that data is critical and requires protection from forgery. Finally, the approach is extremely useful in cases where intermediate OPC servers are used (see Fig. 11): it does not require the intermediate server to decode and re-encode the data. The data is passed through encrypted, inaccessible to the running software. Therefore, a hacker will not get any valuable information by breaking into the intermediate server.

Fig. 11. The figure illustrates the differences between the HTTPS and XML security approaches when an intermediate OPC server is used. The left diagram represents a system using the HTTPS approach; the XML security approach is illustrated on the right.

The XML Security approach is well suited for pure XML messages. However, binary attachments are not part of the XML document; hence, the XML Encryption and Signature specifications do not define any standard way of protecting them. To protect the attachments by means of the XML Security approach, the following procedure is defined by the HDR specification.
Digital Signature: The attachment digest should be calculated using one of the algorithms defined in the XML Encryption specification. SHA1, SHA256, SHA512 and RIPEMD-160 are defined by the current version of the specification; however, only SHA1 is defined as a mandatory-to-implement feature. Therefore, the HDR specification recommends the use of the SHA1 algorithm for digest calculation. The calculated digest value, along with the used algorithm, is stored using BASE64 encoding inside the appropriate OPC Item element within the primary OPC XML-DA message.
Encryption: The encryption is handled in a way similar to the multicasting security approach. A symmetric key is generated and used for the attachment encryption; if multicasting is enabled, the same key can be used for encrypting both the attachment and the multicast stream. The data encryption is performed using one of the block encryption algorithms from the XML Encryption specification. TRIPLEDES, AES128, AES192 and AES256 are defined in the current version; the TRIPLEDES algorithm is recommended by the HDR specification (a sketch of this encryption step is given after this list). After encryption, the symmetric key, along with the used encryption algorithm, is stored using BASE64 encoding inside the appropriate OPC Item element within the primary OPC XML-DA message.
After the digest and symmetric key have been stored inside the primary OPC XML-DA message, they can be easily protected by means of the XML Encryption and Signature specifications.
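A sketch of the attachment encryption step using the OpenSSL EVP interface and the recommended TRIPLEDES algorithm might look as follows (key and IV generation, padding details and error reporting are reduced to a minimum):

#include <openssl/evp.h>

/* Encrypt an attachment with Triple DES in CBC mode using a generated
 * 24-byte symmetric key; returns 0 on success, -1 on failure. */
int encrypt_attachment(const unsigned char *in, int in_len,
                       unsigned char *out, int *out_len,
                       const unsigned char key[24], const unsigned char iv[8])
{
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    int len1 = 0, len2 = 0;
    if (!ctx) return -1;
    if (EVP_EncryptInit_ex(ctx, EVP_des_ede3_cbc(), NULL, key, iv) != 1 ||
        EVP_EncryptUpdate(ctx, out, &len1, in, in_len) != 1 ||
        EVP_EncryptFinal_ex(ctx, out + len1, &len2) != 1) {
        EVP_CIPHER_CTX_free(ctx);
        return -1;
    }
    *out_len = len1 + len2;
    EVP_CIPHER_CTX_free(ctx);
    return 0;
}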

2.8. Compatibility Aspects


Two-way compatibility with legacy OPC XML-DA clients and servers is a strict requirement. An HDR capable server should support legacy clients; however, it can disallow access to the OPC Items generating data at very high rates, or limit the minimal renewal interval. Likewise, HDR clients are required to support generic OPC XML-DA servers and retrieve data in the legacy way.

2.9. The OPC XML-DA HDR Specification


The OPC XML-DA HDR specification defines a set of rules providing high performance binary data exchange on top of the OPC XML-DA protocol [12]. An HDR capable server is allowed to distribute data in several data representations. The integration of binary data with the primary SOAP message is achieved using the WS-Attachments technology [48]: the SOAP message is transferred in the first DIME record, and the attachments are transferred sequentially in the following records. The DIME record ID should contain the full identifier of the affiliated data item, and the attachments should be referenced from the SOAP message using the "href" attribute (see the example in Fig. 12). The client is not allowed to request several representations of the same data item in one request.

Fig. 12. The figure illustrates a sample layout of the OPC XML-DA HDR server response to a Read request. The response carries the OPC XML-DA message along with two binary attachments inside a DIME envelope. The "href" attribute is used to reference the corresponding attachment from the OPC XML-DA message.

The Options header of a DIME chunk may contain optional information describing the attachment behavior. The flag variable is stored in the first byte; the meaning of each flag is described in Table 2. The second byte is reserved for future use. The following bytes carry optional information depending on the given flag configuration. The Options header can be omitted, indicating the default behavior.

Table 2. Defined DIME flags

Bit 1: New block indicator. If the DIME message encompasses a complex data item consisting of several memory blocks, the flag indicates the start of a new block. The block address is stored along with the flag variable in the Options header using the Native Server Representation.
Bit 2: New key indicator. This flag indicates that the symmetric key used for the data protection has changed.

Complex data consisting of several memory blocks linked together with references is transferred sequentially according to the specification. Each block should be transferred in one or more separate DIME chunks, and the first chunk's Options header should contain the block address in the server memory space. This address allows the client application to resolve references according to the block layout in the server memory space.
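On the client side, reference resolution could be implemented roughly as in the following C sketch (the block table layout is hypothetical; the server addresses come from the DIME Options headers):

#include <stddef.h>
#include <stdint.h>

/* One entry per received memory block. */
typedef struct {
    uint64_t server_addr;   /* block address from the DIME Options header */
    void    *local_copy;    /* where the block content was stored locally */
    size_t   size;
} block_t;

/* Translate a server-side pointer value into a pointer to the local copy. */
static void *resolve(uint64_t ref, const block_t *blocks, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (ref >= blocks[i].server_addr &&
            ref <  blocks[i].server_addr + blocks[i].size)
            return (char *)blocks[i].local_copy
                   + (ref - blocks[i].server_addr);
    return NULL;            /* dangling reference */
}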

2.9.1. HDR Specific Data Space


The CPX branch, as defined by the OPC Complex Data specification, is used to provide the data in different representations (see section 1.6). Although the native server representation is defined as the default representation for high performance data exchange, the HDR specification allows one or more other XML and binary formats to be used. The OPC XML-DA HDR server may decide to support certain clients with data in the clients' native representation. This functionality may be required by simple OPC clients running on slow embedded hardware that are unable to convert the data on their side. Additionally, the HDR specification defines a query allowing a client to request data in the desired custom representation (see section 2.9.6).
The standard data representation defined by the core OPC XML-DA specification must be present in the server data space and available to all clients. However, servers are allowed to completely or partially restrict read and write access to that representation. Alternatively, the server can limit the minimal renewal intervals for the items disseminating XML data representations. The access can be restricted depending on the client identity and item data rate; the client source address and/or the provided certificate are normally used to distinguish clients. For OPC Items corresponding to representations other than the standard one, the server is allowed to define visibility constraints: these items can be browsable by all clients or only by certain clients, at the server's discretion.

The HDR specification defines names for the following standard representations:
HDR: The "HDR" item should be used for data dissemination using the fastest available binary representation. The specification assumes the native server representation for that purpose. However, in some circumstances another representation may be used to achieve the fastest possible data exchange. An intermediate OPC XML-DA server streaming data to clients from underlying master servers without any processing is a good example: the data is then served to the clients in the native representations of the master servers, since conversion to the intermediate server's native representation would only waste system resources.
XDR: The "XDR" item should be used for disseminating the data in the XDR (eXternal Data Representation) format if this representation is supported by the server.
CDR: The "CDR" item should be used for disseminating the data in the CDR (Common Data Representation) format if this representation is supported by the server.
Additionally, the "Multicast" item is used to distribute the data using the multicasting approach (see section 2.9.4), and the "Data Filter" item is used by the query mechanism (see section 1.6.4). It is up to the server implementation to choose names for the other supported data representations; however, the requirement "to choose names descriptive enough to allow a client to present names in a list to the end user" stated by the OPC Complex Data specification should be observed.

2.9.2. HDR Specific Metadata Properties


The HDR specification defines a set of metadata properties to help client applications comprehend the different representations provided by the server. These properties describe the structure and layout of the associated data and of the concomitant multicast stream, and specify security requirements. Table 3 lists all metadata properties defined by the HDR specification in addition to the properties provided by the OPC XML-DA specification.

Table 3. Metadata properties defined by the HDR specification (all properties are of type String, except "Access Type", which is an enumeration of Any, Plain, Authentication and Encryption)

Type System ID: Specifies the Type System used to provide the description of the associated data.
Dictionary ID: Specifies the Dictionary within the Type System used to provide the type description.
Type ID: Specifies the type of the associated data.
Unconverted Item ID: For items in an alternative representation, specifies the original OPC Item.
Unfiltered Item ID: For items associated with queries, specifies the original OPC Item the query is applied to.
Data Filter Value: For items associated with queries, specifies the used query parameters.
Stream Type System ID: Specifies the Type System used to describe the data broadcast in the concomitant multicast group.
Stream Dictionary ID: Specifies the Dictionary used to provide the type description of the broadcast data.
Stream Type ID: Specifies the type of the data broadcast in the associated multicast group.
Access Type: Defines the security requirements assigned by the server for the comprising OPC Item. "Authentication" specifies that the client should authenticate itself to access the item; "Encryption" additionally requires request encryption from the client; "Plain" specifies that security is not supported by the server for that item and the client should use the plain HTTP protocol to request data; "Any" allows the client to choose the desired security.

2.9.3. Type Systems
The "XMLSchema" and "HDRBinary" Type Systems (defined by the OPC Complex Data specification, see section 1.6, and by the HDR specification, see section 2.4) are used by the HDR extension for the data type description. The HDR Binary notation is used for all binary representations; the XML Schema [62] is used to describe XML based representations.
In accordance with the OPC Complex Data specification, an HDR compliant server should provide all type description information through the CPX branch in the server data space. However, the HDR specification, in contrast with the OPC Complex Data specification, not only allows but requires that the server provide a full dictionary to the clients using the "Dictionary" property of the dictionary-associated OPC Item. Each of the OPC Items representing data in formats other than the default one should provide references to the used Type System, Dictionary and Type; the "Type System ID", "Dictionary ID" and "Type ID" metadata properties are used for that purpose. The OPC Items disseminating information in the native OPC XML-DA representation are not required and not allowed to provide any of these attributes, to avoid legacy client confusion.

2.9.4. Synchronous Clients Support


The reliable multicasting approach using the PGM (Pragmatic General Multicast [55]) protocol is adopted by the HDR specification to support multiple synchronous clients requesting the same data. The realization is oriented toward high speed data acquisition systems serving multiple intranet clients. Some of the supported OPC Items are selected by the OPC XML-DA HDR server for broadcast depending on their data rates and the estimated number of clients. A dedicated multicast group is allocated for each of these OPC Items.
The information about the multicast group is provided to the client application through the "Multicast" item registered under the "CPX" branch of the considered OPC Item within the server data space. The client application should issue a read request to obtain the address of the multicast group and the symmetric key used for data protection. The server is allowed to change the symmetric key during operation; this is not required by the HDR specification, but it is recommended to change the key on a daily or weekly basis to strengthen security. The client can get notifications about key changes by subscribing to the "Multicast" item.
The "Multicast" item provides the information about the multicast group using a simple XML document; the XML Schema in Fig. 13 describes its structure. In accordance with the OPC Complex Data specification, this XML Schema is part of the HDR Dictionary within the "XMLSchema" Type System. Therefore, the client may query the "/CPX/XMLSchema/HDR/Stream" item to get the schema definition.

XML Schema Definition
<xs:complexType name="Encryption">
 <xs:simpleContent><xs:extension base="xs:base64Binary">
  <xs:attribute name="Algorithm" type="xs:anyURI" use="required"/>
 </xs:extension></xs:simpleContent>
</xs:complexType>
<xs:complexType name="PGMStream">
 <xs:sequence>
  <xs:element minOccurs="0" maxOccurs="1"
   name="Encryption" type="hdr:Encryption"/>
 </xs:sequence>
 <xs:attribute name="address" type="xs:string" use="required"/>
</xs:complexType>
<xs:element name="Stream">
 <xs:complexType>
  <xs:sequence>
   <xs:element name="PGM" type="hdr:PGMStream"/>
  </xs:sequence>
 </xs:complexType>
</xs:element>
Example
<Stream>
 <PGM address="224.2.0.1">
  <Encryption Algorithm="http://www.w3.org/2001/04/xmlenc#tripledes-cbc">
   BASE64 encoded symmetric key
  </Encryption>
 </PGM>
</Stream>

Fig. 13. The schema defines the structure of the XML document returned by the server to provide information about the multicast group used for the data dissemination. A sample of such a document is shown at the end of the figure: the multicast group is identified by the "224.2.0.1" IP address, and the Triple DES crypto-algorithm is used to encrypt all data multicast into the group.

The data format used by the server to broadcast the values in a multicast stream is defined by the "Stream Type System ID", "Stream Dictionary ID" and "Stream Type ID" metadata properties. Table 4 lists the properties which the "Multicast" item should provide to describe the data type as defined by the OPC Complex Data and HDR specifications.
During server operation, every value change of the data variable associated with the OPC Item is broadcast into the corresponding multicast group. Each data value is encapsulated in a separate DIME envelope. The first record of that DIME envelope contains the OPC XML-DA message, which coincides with the OPC XML-DA subscription polling response message used when the standard WS-Attachments approach is employed for transfer. It provides the client with information about the encompassed value: timestamp, quality, etc. The next DIME record contains the data value itself. To avoid UDP fragmentation, the data should be transferred in small DIME chunks. Complex items consisting of several memory blocks are transferred as described in section 2.9.

Table 4. Multicast item type description metadata properties
Metadata Property Value
Type System ID XMLSchema
Dictionary ID HDR
Type ID Stream
Stream Type System ID HDRBinary
Stream Dictionary ID appropriate HDRBinary dictionary
Stream Type ID appropriate HDRBinary type

2.9.5. Standard Queries


The OPC Complex Data specification defines a query mechanism. The query applications range from a simple mask, where the client tells the server to ignore some fields when testing whether a data change update has occurred, to complex queries that the server uses to extract the data from an underlying data source. The OPC Complex Data specification defines only the mechanism but leaves the specific query definition up to the server implementation. The HDR specification defines a number of standard queries. These queries specify a standard way of requesting a client-specific binary representation, accessing historical data and extracting data component parts. Additionally, the HDR specification states that the parameters for all queries should be expressed in XML terms.

2.9.6. Custom Data Representation Query


The OPC XML-DA HDR server may allow clients to specify the binary representation they want to use for the data exchange. This functionality is mainly required by simple clients that are unable to perform data conversion on their side.

XML Schema Definition


<xs:element
 name="CreateCustomRepresentation" xmlns:hb="http://dside.dyndns.org/HDRBinary/">
 <xs:complexType><xs:sequence>
  <xs:element name="TypeDictionary" type="hb:TypeDictionary"
   minOccurs="1" maxOccurs="1"/>
 </xs:sequence></xs:complexType>
</xs:element>
Example
<CreateCustomRepresentation xmlns:hb="http://dside.dyndns.org/HDRBinary/">
 <hb:TypeDictionary>
  The localized version of the item type description should be included here
  using the HDR Binary notation.
 </hb:TypeDictionary>
</CreateCustomRepresentation>
Fig. 14. The schema defines the structure of the query requesting the server to create a client-specific data representation. A sample of such a query is shown at the end of the figure.

The HDR specification defines a standard query for that purpose. The client should read the NDR representation description and localize it. The localization is achieved by replacing the original type qualifiers with the qualifiers appropriate for the client configuration; these qualifiers include properties specifying the data alignments, byte order, basic type sizes, etc. After the new type description is ready, the client should issue a query whose structure is defined by the XML Schema presented in Fig. 14. If client-specific representations are supported by the server, it will create a new OPC Item using the name provided by the client in the query. The requested data representation will then be used when reading the data from that item.
The described query can only be applied to items corresponding to the original OPC XML-DA data representation.

2.9.7. Historical Data Access Query


Access to Historical Data is not provided by the core OPC XML-DA specification. To overcome this limitation, the HDR specification defines a query which can be used to access historical data. The query parameters allow the client to specify the data it is interested in: the client is able to select the required time slice, filter interesting values using implementation-specific queries and aggregate the data over long time intervals. The request structure is described by the XML Schema presented in Fig. 15.

XML Schema Definition


<xs:element name="GetHistoricalData">
 <xs:complexType>
  <xs:sequence>
   <xs:element minOccurs="0" maxOccurs="1" name="Filter"/>
  </xs:sequence>
  <xs:attribute name="start_time" type="xs:dateTime" use="optional"/>
  <xs:attribute name="end_time" type="xs:dateTime" use="optional"/>
  <xs:attribute name="bounds" type="xs:boolean" default="false"/>
  <xs:attribute name="resample_interval" type="xs:duration" use="optional"/>
  <xs:attribute name="aggregate" type="xs:string" use="optional"/>
 </xs:complexType>
</xs:element>
Example
<GetHistoricalData start_time="2005-01-01T00:00:00" end_time="2006-01-01T00:00:00">
 <Filter>
  The query used to filter values can be included here. The format of the
  query is not defined by the HDR specification and depends on the server
  implementation.
 </Filter>
</GetHistoricalData>
Fig. 15. The schema defines the structure of the query requesting Historical Data from the server. A sample of such a query is shown at the end of the figure.

The following query parameters are supported (more information about the parameters is provided in section 2.5):
start_time: The "start_time" parameter specifies the start time of the interesting time slice.
end_time: The "end_time" parameter specifies the end time of the interesting time slice.
bounds: The "bounds" parameter specifies whether the start and end times are treated as inclusive or exclusive.
aggregate: The "aggregate" parameter specifies the aggregate function as defined by the OPC HDA (OPC Historical Data Access [40]) specification. This option should be used together with the "resample_interval" parameter.
resample_interval: This option is only meaningful if an "aggregate" function is specified. The server divides the time slice specified by the start and end time into a sequence of intervals and then calculates an aggregate for each interval. The "resample_interval" parameter specifies the duration of these intervals.
query: The "query" element provides the possibility to select certain values from the time slice specified by the start and end time. The format of the query is not defined by the HDR specification and depends on the server implementation. However, a high-level query language such as SQL or XQuery is recommended for that purpose.

Besides the standard aggregates defined in the OPC HDA specification, the server may support implementation-specific aggregates. In that case the names of these aggregates, along with descriptions, should be reported to the clients; the "Aggregates" branch within the server data space is used for that purpose. An OPC Item for each supported implementation-specific aggregate should be registered under the "Aggregates" branch, and the complete description of the aggregate behavior should be supplied to the client upon a read request to that item.

Upon receipt of the query, the server should create an OPC Item using the name provided by the client in the query. Afterwards, read requests to that item will return Historical Data values starting from the oldest ones. The item should be automatically destroyed after the last value is read. To prevent access interference, the item should be available only to the client that issued the query. The HDR server is allowed to limit the maximal time slice range or the maximal number of values falling into the specified time slice.

2.9.8. Query Extracting Component Parts from Compound Data


In some cases a client may be interested only in a specific part of a complex data item. The HDR specification defines a query allowing clients to request component parts of compound data items. The XML Schema represented in Fig. 16 defines the structure of that query. However, the query syntax slightly differs between XML and binary representations: for XML representations an XPath expression should be used to specify the required part of the data, while for binary representations the client should specify the field name defined in the HDR Binary type description, with the full type path separated by '/' symbols used to indicate sub-records.
After the query item is created, the client is able to retrieve only the specified component part of the data by accessing that item.

XML Schema Definition


<xs:element name="ExtractComponent">
 <xs:complexType>
  <xs:attribute name="query" type="xs:string" use="required"/>
 </xs:complexType>
</xs:element>
Example
<ExtractComponent query="/sensor1"/>
Fig. 16. The schema defines the structure of the query requesting extraction of a component part from a complex data item. A sample of such a query, demanding extraction of the data obtained by the first sensor, is shown at the end of the figure.

2.9.9. Security
The OPC XML-DA HDR server is allowed to restrict unauthorized access to certain OPC Items or to the server as a whole. The access can be restricted depending on the client source address or the certificate used to establish the secure connection. The "Access Type" metadata property should be used by the server to specify the security requirements on the OPC Item level. The following four modes are provided by the HDR specification:
Authentication: This mode states that the client should authenticate itself to access the item. The client can be authenticated based on its source address or the X509 certificate used to establish the secure connection.
Encryption: This mode, in addition to authentication, places the requirement to encrypt the whole network traffic used to transmit the item values.
Plain: This mode states that security is not supported by the server for that item and a client should use the plain HTTP protocol to access the data.
Any: This mode specifies that it is up to the client to choose an appropriate security mode. It is the default behavior if the "Access Type" property is not available.

The server should provide the security approach using the HTTPS (Secure HTTP) protocol as defined by the OPC XML-DA specification. However, the server may additionally support the security approach based on the XML Encryption and XML Signature specifications [60, 61] (see section 2.7.6 for more details). In that case the appropriate compatibility information should be made available to the clients under the "/Extensions" branch within the server data space, as specified in section 2.9.10.
The XML Security approach allows clients to separately encrypt and sign the data associated with certain OPC Items within the request; different security entities can be used for each such item. Prior to standard request processing, the XML Security compliant server should decrypt all encrypted items, validate the supplied digital signatures and validate the access permissions to all requested items depending on the desired security mode. The client source address together with the supplied certificate may be used for validation purposes. It is significant that the certificate used on the OPC Item level should have precedence over the general one used to sign the whole message if both are specified.
The server should adopt for the response the security layout used in the request: the response entities corresponding to encrypted and signed request entities should be correspondingly encrypted and signed. The certificate used to digitally sign a request entity should be used to encrypt the corresponding response entity.
The XML Security approach is aimed at protecting the XML content. However, in the HDR mode the binary attachments should be protected as well; the security approach used for the attachment protection is described in section 2.7.6. The message encryption should be implemented by means of a generated symmetric key, and an attachment digest is calculated in order to guarantee data consistency. Both the digest and the key are stored inside the "Value" element of the appropriate OPC Item within the OPC XML-DA response. Hence, they are protected from modification and eavesdropping by means of the XML Security approach.
The HDR specification requires only the digest to be present if the Authentication security mode is used; the data is attached unencrypted in this case. When the Encryption security mode is used, the attached data must be encrypted. Therefore, both the digest and the symmetric key should be encompassed in the "Value" element. The XML Schema in Fig. 17 defines how the symmetric key and digest should be included in the OPC XML-DA message: the "Digest" element is used to store the data digest and the "Encryption" element is used to carry the symmetric key.
The described procedure is only used together with the XML Security approach. If standard HTTPS
security is used, the data protection is achieved by means of the Secure HTTP protocol and separate
handling of the attached data is not required.
The security of the concomitant multicast streams is carried out using symmetric-key cryptography. The server generates a symmetric key for each supported multicast stream. This key is used to encrypt all data broadcast to the corresponding multicast group. For better safety, the HDR specification recommends regenerating the key on a daily or weekly basis.

The clients authorized to receive the broadcast data will get the key, along with the multicast group address, from the special OPC Item as described in section 2.9.4. "Encryption" is the only supported security mode for that item.

XML Schema Definition


<xs:complexType name="Digest">
<xs:simpleContent>
<xs:extension base=“xs:base64Binary”>
<xs:attribute name="Algorithm" type="xs:anyURI" use="required"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="Encryption">
<xs:simpleContent>
<xs:extension base=“xs:base64Binary”>
<xs:attribute name="Algorithm" type="xs:anyURI" use="required"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name=”BinaryData”>
<xs:sequence>
<xs:element name=”Digest” minOccurs=”0” maxOccurs=”1”
type=”hdr:Digest”/>
<xs:element name=”Encryption” minOccurs=”0” maxOccurs=”1”
type=”hdr:Encryption”/>
</xs:sequence>
<xs:attribute name=”href” type=”xs:anyURI” use=”required”/>
</xs:complexType>
Example
<Value href="thismessage://Data/Item1" xsi:type="BinaryData"
       xmlns:hdr="http://dside.dyndns.org/HDR/">
  <hdr:Digest Algorithm="http://www.w3.org/2000/09/xmldsig#sha1">
    BASE64 encoded message digest
  </hdr:Digest>
  <hdr:Encryption Algorithm="http://www.w3.org/2001/04/xmlenc#tripledes-cbc">
    BASE64 encoded symmetric key
  </hdr:Encryption>
</Value>
Fig. 17. The XML Schema defines how the attachment digest and symmetric key should be included into the
primary OPC XML-DA message. A sample is demonstrated at the bottom. The message digest calculated
using the SHA1 crypto-algorithm is encapsulated into the "Digest" XML node and the Triple DES symmetric
key is encapsulated into the "Encryption" node. The BASE64 encoding is used for the data encapsulation in
both cases.

2.9.10. Compatibility
The OPC XML-DA HDR-enabled server should advertise information about all supported
extensions along with their versions under the "/Extensions" branch within the server data space.

The HDR specification defines the following extensions: HDR, Multicast, Custom Representations,
Historical Data Access, Component Extraction, and XML Security.
HDR: The core of the High Data Rate extension. It implements data dissemination using
multiple XML and binary representations. This extension is mandatory and should be supported by
all HDR capable servers.
Multicast: This optional extension provides the possibility to disseminate the data to multiple
synchronous clients using the PGM multicasting approach (see section 2.9.4).
Custom Representations: This optional extension provides clients with the possibility to request data
in a client-specific binary format (see section 2.9.6).
Historical Data Access: This optional extension defines a query which allows requesting
historical data from the server (see section 2.9.7).
Component Extraction: This optional extension provides clients with the possibility to retrieve
desired component parts from compound data items (see section 2.9.8).
XML Security: This optional extension provides a per-item security mechanism based on the XML
Encryption and XML Signature specifications (see section 2.9.9).
The HDR compatibility is ensured using the following approach. A client in the compatibility
mode checks whether the HDR extensions are supported; if this is the case, it switches to the HDR
mode. In detail, the OPC XML-DA HDR server should register the "HDR" item under the
"/Extensions/" path within the server data space. In order to detect the supported HDR version the
client should read the value of that item. If the "HDR" item is not available under the "/Extensions/"
path, the server is HDR incapable and the client should utilize the legacy OPC XML-DA protocol.
After the core HDR support is assured, the client can query the "/Extensions/" branch to detect the
support of optional extensions.
While communicating with an OPC XML-DA HDR server, the HDR capable client should proclaim
HDR compatibility by declaring the "http://dside.dyndns.org/HDR/" namespace in the root element of
the SOAP request. However, the server's HDR compliance should be assured beforehand. To avoid
confusing legacy clients, the OPC XML-DA HDR server should expose the HDR specific items and
attributes only to the clients that have proclaimed HDR compatibility. The "/Extensions" branch is
the only exception.

Altogether, the following handshake approach should be used. A client in the legacy mode should
query the "HDR" item under the "/Extensions/" path. If the query succeeds, the client can switch to
the high data rate mode by declaring the HDR namespace in its requests. After that it is able to
browse the OPC Items providing binary data exchange.
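
The handshake can be sketched in C as follows; the client API used here (opcxmlda_read,
opcxmlda_declare_namespace) and the opc_server_t type are hypothetical names introduced only to
make the sequence concrete:

/* Sketch of the HDR capability handshake; the client API is hypothetical. */
enum mode { MODE_LEGACY, MODE_HDR };

enum mode negotiate_hdr(opc_server_t *server) {
    char version[32];
    if (opcxmlda_read(server, "/Extensions/HDR", version, sizeof(version)) == 0) {
        /* Server is HDR capable: declare the HDR namespace in further requests. */
        opcxmlda_declare_namespace(server, "hdr", "http://dside.dyndns.org/HDR/");
        return MODE_HDR;      /* switch to the high data rate mode */
    }
    return MODE_LEGACY;       /* fall back to the plain OPC XML-DA protocol */
}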

2.10. Summary
The High Data Rate extension to the OPC XML-DA protocol is introduced in this chapter in order
to provide a universal data exchange solution for modern data acquisition systems.
The OPC XML-DA protocol is an apparent successor of the de facto standard for the automation
industry and control systems, the OPC DA protocol. It is able to work in highly heterogeneous
environments through various kinds of Internet proxies and firewalls. However, several serious
limitations prevent its utilization in high performance data acquisition systems. The extremely
slow performance and the absence of a mechanism providing access to historical data are the most
important issues. The High Data Rate extension was developed to handle these problems while
preserving a high level of compatibility with legacy clients. Hence, it enables the utilization of
the OPC XML-DA protocol in a variety of formerly unsupported environments with severe
performance demands.
The separation of the data from the protocol and metadata information, the usage of the server native
binary representation for data exchange, and the opportunity to support multiple clients with the
newest PGM multicasting standard are the main concepts of this extension. The query mechanism
adopted from the OPC Complex Data specification allows extending the server functionality with
complex queries. A number of standard queries were defined to help clients in accessing historical
data and acquiring the most appropriate binary data format.

CHAPTER 3

PROTOTYPE IMPLEMENTATION

3.1. Introduction
An effective implementation of the OPC XML-DA HDR server requires the handling of a set of
automation and optimization tasks. Conversion of the data between different binary representations,
as well as encoding to and from the OPC XML-DA default representation, should be performed
dynamically. The data may vary from simple integer and floating point numbers to
sophisticated structures containing multiple vectors, matrices, dynamic arrays and references.
The OPC XML-DA HDR server performs many simultaneous tasks. It executes a number
of drivers communicating with the underlying hardware and reading the data, supports simultaneous
client requests, and maintains the data structures. Further complication arises from the requirement
to accomplish specific tasks within the appropriate time slice. The driver threads interfacing precise
hardware should be executed at exactly defined times with microsecond precision. The OPC
XML-DA clients may define the latest time at which the server should respond. When such a client is
controlling a sophisticated control system with multiple components requiring operation
synchronization, a response delay may have fatal consequences. All these requirements should be
considered by the manager scheduling tasks in the OPC XML-DA system. Further, memory
management optimizations are required for long-running, heavily multithreaded server applications
with big amounts of rapidly changing data. This is especially important when several real-time
threads are running among lower priority ones with active memory utilization. In particular,
OPC XML-DA based systems should be optimized, since the real-time threads managing the
underlying hardware coexist with the threads generating XML content in response to the
clients. Real-time task scheduling is another important aspect that should be considered by the
system implementation.
The algorithms and concepts used to handle the described problems are discussed in this chapter.
The Data Representation concept allowing automatic data representation management, the
Dedicated Cyclic Buffers and Transparent Memory technologies used to optimize memory behavior,
and the scheduling policies are described. Further, the developed implementation is reviewed. The
abstraction layers, the request processing chain, and the object and threading models are described. The
chapter is concluded with a performance evaluation. The developed system is compared with simple
implementations providing similar capabilities. The CORBA and SOAP based data exchange
solutions are considered. The performance depending on the computational power, network
bandwidth and number of clients is analyzed.

3.2. Multitasking Environments


Nowadays most server applications consist of a set of cooperating threads. A thread is a
single sequential flow of control within an application. Threads work within the same
memory address space and share application resources. Threads can exchange data using
shared memory buffers and pipes. Notifications are provided using conditions and signals.
Locks and semaphores are used to synchronize access to the shared resources. For POSIX compliant
systems the thread behavior is defined by the POSIX 1003.1b (Real-time extensions) and 1003.1c
(Threads extensions) specifications. Microsoft implements proprietary Windows threads for
its operating system. However, a POSIX threads implementation is available for the Windows
platform as well. A complete description of the POSIX thread implementation is beyond the scope
of this thesis; a good introduction is available in [63].
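
As a minimal illustration of the POSIX interface, the following self-contained C program creates a
thread and waits for its completion:

#include <pthread.h>
#include <stdio.h>

/* The thread body: a single sequential flow of control. */
static void *worker(void *arg) {
    printf("worker running: %s\n", (const char *)arg);
    return NULL;
}

int main(void) {
    pthread_t thread;
    pthread_create(&thread, NULL, worker, "readout");  /* start the thread    */
    pthread_join(thread, NULL);                        /* wait for completion */
    return 0;
}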
The OPC XML-DA HDR server should accomplish many simultaneous tasks. The server should
execute a number of drivers communicating with the underlying hardware and reading the data,
support simultaneous client requests, and maintain the data structures. Threads provide a way for the
application to split itself into two or more simultaneously running tasks. However, a single CPU
system is able to execute only one thread at a given instant. Although SMP systems can execute
multiple threads at once, the number of scheduled tasks usually surpasses the number of available
processors. Therefore, all modern operating systems include so-called schedulers involved in
deciding when to start a given task and how much processor time to allocate to it. In a multiprocessing
environment the scheduler should decide on which processor to run the task as well. The task
placement problem is not as easy as it seems at first. For example, in NUMA (Non-Uniform
Memory Access) systems the memory access latency may depend significantly on the CPU
performing the task. The NUMA architecture is considerably popular nowadays; in particular, it is
incorporated in multiprocessor systems based on the AMD Opteron platform. Another
example is the Hyper-Threading technology utilized in the high performance line of Intel processors.
The Hyper-Threading technology emulates two virtual processors from a single physical one. This
approach allows better CPU resource utilization, since integer and floating point operations can be
executed in parallel. Therefore, an SMP system equipped with such processors will have four
virtual processors. However, if only two tasks are executed, they should be scheduled on
virtual processors belonging to different physical cores in order to obtain optimal performance.
Even more complication arises from the requirement to execute specific tasks within the appropriate
time slice. The driver threads interfacing precise hardware should execute at exactly defined
times with microsecond precision. The OPC XML-DA clients may define the latest time at which the
server should respond. When such a client is controlling a sophisticated control system with
multiple components requiring operation synchronization, a response delay may have fatal
consequences. Therefore, all these requirements should be considered by the manager scheduling
tasks in the OPC XML-DA system.
This section gives a brief introduction to the existing scheduling algorithms and real-time
requirements, and introduces the threading model used for the prototype system implementation.

3.2.1. Real-Time Constraints


Although fast response and high performance are desired by most standard non-real-time
applications, the only criterion of valid behavior is correct output. A program requiring a long
time to produce the correct output can be considered inefficient and poorly designed; nevertheless, it
is correct. In contrast, for real-time applications the execution time is essential. The results are
only valid if they are obtained before a certain deadline. For example, the data acquisition systems
interfacing the cosmic ray monitors are required to perform the event readout from the hardware
buffer before a new event arrives. The count rate of the ASEC monitors reaches several kHz.
Therefore, the readout should be performed within several hundreds of microseconds to avoid loss
of events. However, many data acquisition systems have much stricter requirements.

Various systems apply different requirements concerning how much time a task may utilize and
how precise this time slice is. These requirements are called real-time requirements or real-time
demands. Three main types of real-time demands are distinguished [64, 65].
Interactive: The applications usually interact with users during data processing. The user may issue
a command to the program, which performs the desired actions and presents the results. The time
needed to perform a command must be reasonably short. Otherwise, the user will perceive the
system as sluggish and, if the delay is unexpected, perhaps misinterpret the state of operation. For
example, if an HMI control interface occasionally needed several seconds to update the screen
in response to the operator pressing a key, the operator might get the impression that the HMI has
missed the key press and repeat it. The real-time demands of interactive systems are relatively
relaxed. Delays shorter than 100 milliseconds are in most cases not noticed by the user.
Deadline failures are not very critical and would not cause a system failure.
Soft Real-Time: The real-time requirements are tightened when the task controls some kind of
external hardware. The response times in this case may be much shorter, reaching microsecond
precision. However, this is not an intrinsic rule for real-time systems; in fact, the deadlines may
vary from nanoseconds to hours. For real-time systems the really important ability consists
in meeting the required deadlines independent of their duration. For soft real-time systems a
deadline failure lowers the task utility. However, a successful completion after the deadline still has
a positive impact, and the system is degraded but has not failed completely. For example, in the case of
a primary data storage failure, a longer delay before switching to the backup storage will result in
the loss of data for the period of the delay. However, after the switching has been done the system is
functional again. Therefore, the losses from a missed deadline are measured in terms of lateness
(completion time minus deadline). Large values of lateness represent lower utility of the task
completion for the system.
Hard Real-Time: A hard deadline is a special case where utility is binary. Completion after the
deadline is completely useless for the system, or may even have a negative effect on it. The
interception of an incoming missile by an anti-missile when it is only a few meters away from its
target can serve as an example of a negative effect. Fig. 18 illustrates the difference between hard
and soft real-time demands.

Fig. 18. The figure illustrates differences between soft real-time (left) and hard real-time (right) demands.

A real-time system typically consists of multiple tasks mixing soft and hard real-time demands.
Moreover, the system may include tasks with unrestricted execution time. In a standard
uniprocessor computer only one task can be executed at a moment, so the task executions should
be scheduled in order to optimize the system behavior. The most important characteristic of a
real-time system is predictability. The term predictability in this case means that an upper bound of
the worst-case response times of the system exists and can be computed. In contrast to standard
tasks, the optimization of real-time tasks is devoted to the minimization of the worst-case response
time rather than the average response time. In order to confine the maximum response time, each
primitive operation performed by the system must be predictable. Thereby, for hard real-time
tasks the optimization should minimize worst-case deadline failures. For soft real-time tasks,
besides the deadline failure minimization, the average failure lateness should be minimized as well.
There are basically two approaches for the developer of a safety-critical real-time system to verify
that the system will meet its deadlines. One method is to actually run the software and measure the
performance of the system. The second approach to verification of the schedulability of a real-time
system is to perform a theoretical analysis of the software processes before actually running the
system.

3.2.2. Scheduling
The schedulers used in modern operating systems usually distinguish different task classes:
kernel tasks, services and user-level tasks. All classes may have different scheduling policies, and
tasks may have different priorities within a class.
The Linux operating system distinguishes two classes: real-time tasks and normal priority user-level
tasks. The real-time tasks can only be executed by the super-user, using two different
scheduling policies. The FIFO (First In First Out) policy allows tasks to run to completion on a
first-come-first-serve basis. The scheduler selects from the queue the task with the highest priority and
executes it until completion; then the next task is chosen. A new higher priority task will preempt
a running lower priority one. The tasks scheduled by the Round Robin policy are executed in a round
robin fashion. Short time slices are allocated by the scheduler to all highest priority round robin
tasks. When the allocated time slice is over, control is passed to the next task of the same
priority in the queue, and the executed task is placed at the bottom of the list in its priority queue. The
Round Robin tasks are always preempted by the FIFO policy based tasks. The user-level threads are
scheduled when there are no real-time tasks waiting. A certain timeslice is allocated for each user-level
task. The timeslice is essentially the period of time a task is allowed to execute before a
chance is given to other tasks. It is allocated depending on the task priority (a nice value). The
scheduler may dynamically boost or punish the priority in order to give some benefit to
interactive tasks at the expense of CPU-hungry tasks [66].
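
For illustration, a Linux process can request the FIFO real-time policy with the standard POSIX call
shown below; as noted above, this requires super-user privileges:

#include <sched.h>
#include <stdio.h>

int main(void) {
    struct sched_param param = { .sched_priority = 50 };   /* 1..99 on Linux */
    /* Request the FIFO real-time policy for the calling process. */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
        perror("sched_setscheduler");   /* fails without super-user rights */
    return 0;
}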
On the Microsoft Windows NT platform the threads are scheduled based on their priority as well.
The priority levels range from zero (lowest priority) to 31 (highest priority). The system treats all
threads with the same priority as equal. It assigns time slices in a round-robin fashion to all threads
with the highest priority. If none of these threads are ready to run, the system assigns time slices in
a round-robin fashion to all threads with the next highest priority. If a higher-priority thread
becomes available to run, the system ceases to execute the lower-priority thread and assigns a full
time slice to the higher-priority thread. The thread priority is combined from the process priority
class and the thread priority within this class. The system can boost and lower the current
thread priority in order to optimize interactive thread behavior and prevent starvation. However, this
is only performed for low priority classes with a maximal priority level below 16 [67, 68].
Other general-purpose multitasking systems have more-or-less similar behavior. The major difficulty
with a real-time system implementation using a general-purpose operating system is poor predictability.
The following reasons make these systems incapable of sustaining hard real-time demands.
Preemption: Not all system calls are reentrant. Hence, these calls should be executed to completion
before context switching is able to pass control to a higher priority thread.
External Interrupts: Normally, the external interrupts are disabled in several critical sections of
the kernel to protect system tables from corruption.
Timer Resolution: The general-purpose systems do not normally provide a reliable mechanism to
wake up a task at a certain time. The sleep timer can only promise to wake a task up within a certain
period after the specified time. For Linux systems equipped with a kernel in the default
configuration this period is approximately 10 milliseconds; Windows has millisecond
precision.
Memory Allocation: All threads within the same process share the same memory heap. Therefore,
during dynamic memory allocation higher priority threads can be blocked by lower priority
ones. Due to heavy memory fragmentation this effect is especially harmful for long-running
server applications with a big amount of rapidly changing data.
Memory Paging: The modern operating systems may swap out rarely used memory pages into
a special repository on the hard drive. Access to such swapped-out memory pages causes
severe unpredictability.
Context Switching: Context switching is a resource expensive task. The scheduler should save
the state of the currently executing thread and restore the state of the newly executed one. The state
information includes caches, registers, TLB (Translation Lookaside Buffer) tables, etc.

As stated before, these issues complicate real-time system development on top of a
general-purpose operating system. The real-time operating systems are designed with the primary
goal of providing better handling of the described problems. A review of the most popular
available solutions is given in [69, 70]. However, soft real-time demands may be satisfied
even by certain general-purpose operating systems. One of the widely used solutions is Linux
equipped with a 2.6 series kernel [66]. The kernel 2.4 with Ingo Molnar's real-time patches is
acceptable as well. The problem is that the schedulers used in Linux and other general-purpose
systems are not aware of deadline handling. The scheduler allocates an arbitrary sequence of
timeslices to execute tasks having the same priority. However, a certain assignment sequence may
help in keeping the required deadlines. Consider the following example. The scheduler should execute
two periodic tasks every six seconds. The first task has a deadline of 5 seconds and requires 3 seconds
for completion. The second one requires only one second for completion and its deadline is 2
seconds. If the scheduler executes the second task after the first one, the second task will miss its
deadline. In contrast, if the first task is started after the second one is completed, both tasks will meet
their deadlines.
To provide scheduling optimization, a certain priority assignment procedure should be performed by
the running system. Extensive research on such procedures has been carried out since the early
1970s. The most important results are reviewed in [64, 71, 72]. The standard server implementation
considers supporting arbitrary requests without a priori information about request frequency,
resources required for request processing, or deadlines. Therefore, static priority assignment is
inappropriate for the server task scheduling. In the rest of this section the research in the area
of dynamic priority assignment is briefly reviewed.

3.2.3. Deadline Driven Scheduling Algorithm


The EDF (Earliest Deadline First [64]) is a dynamic priority assignment algorithm designed for
task scheduling in dynamic systems with an arbitrary amount of periodic and sporadic tasks.
Using this algorithm, priorities are assigned to the tasks according to the deadlines of their current
requests. The highest priority is assigned to the task with the nearest deadline and the lowest
one to the task with the furthest deadline. If a new task becomes ready for execution, the
scheduler checks whether the deadline of the newly released process is shorter than the deadline of
the currently executing one. If so, the currently executing process is preempted. In the single
processor case the EDF algorithm is optimal in the sense that if a set of tasks can be scheduled by
any algorithm, it can be scheduled by EDF [73, 74]. Liu and Layland have proven that EDF
is able to schedule a set of periodic independent tasks with fixed periods equal to their deadlines if
and only if the total CPU usage does not exceed 100% [74]. These results indicate that the algorithm is
effective in the sense of CPU utilization.
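
Written as a formula, the Liu and Layland condition for a set of $n$ periodic tasks with execution
times $C_i$ and periods $T_i$ (deadlines equal to the periods) reads:

$$\sum_{i=1}^{n} \frac{C_i}{T_i} \leq 1$$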
However, this is not the case in multiprocessor environments. The optimality is not guaranteed by
the EDF algorithm there [75]. Consider, for example, three periodic tasks T1, T2 and T3 that
must be executed on 2 processors. Let T1 and T2 have identical deadline requirements, namely a
period of 50 units and an execution requirement of 25 units. Let T3 have requirements of 100 and
80. If the EDF algorithm is used to schedule these tasks, T1 and T2 will have the maximal priorities
and will be executed on the two processors in parallel for 25 time units. Therefore, T3, with 80 units
of processing time required and only 75 units available, will miss its deadline. However, if T1 and
T2 are mapped to one processor and T3 is executed on another one, all tasks will easily meet their
deadlines.
Another problematic behavior of the algorithm is operation in overloaded environments. In that case
the scheduler continuously gives priority to processes that are close to missing their deadlines.
Therefore, due to the domino effect, the system performance degrades rapidly in the case of
overload.

3.2.4. Overload Management


To handle the resource starvation problem several techniques have been developed. The TBS (Total
Bandwidth Server [76, 77]) approach assigns deadlines to the sporadic tasks in such a way that the
overall sporadic load never exceeds a specified maximum value. If the assigned deadline falls
beyond the deadline required by a task with hard real-time demands, the task should be rejected with
an error code indicating resource starvation. However, since the task may finish before the
actual deadline, the scheduler may give the task a chance and only cancel it after the deadline is
actually missed. The CBS (Constant Bandwidth Server [78, 79]) approach is based upon the budget
concept. The maximum budget and the server period are defined by the algorithm in order to limit the
resources available for the sporadic tasks. Initially the budget available for the sporadic tasks is
equal to the server maximum budget. For each sporadic task a suitable deadline equal to the
current server deadline is assigned. As the job is scheduled, the available budget is decreased by the
job execution time. Every time the budget reaches zero, the server budget is recharged to
the maximum value and the server deadline is postponed by a specified period to reduce the
interference with the periodic tasks. Although the deadline is postponed, the task is still eligible for
execution. These solutions provide control over the resources spent on sporadic request
handling. In the case when the resources required for the periodic task scheduling are known a priori,
these approaches allow preventing resource starvation by calculating the appropriate amount
of resources available for the arbitrary sporadic tasks.
If preliminary knowledge of the periodic task requirements is not available, overloads can be
handled through the resource reservation approach. The idea behind resource reservation is to allow
each task to request a fraction of the available resources, just enough to satisfy its timing
constraints. The scheduler must prevent each task from consuming more than the requested amount to
protect the other tasks in the system. The CBS can be used to implement the protection mechanism.
Each scheduled task in that case should be handled by a dedicated CBS with an appropriate
bandwidth. If the precise amount of the required resource is not known, a feedback statistical
approach may be used to adjust the reservations. The statistical approach cannot guard a system
from deadline misses. However, a system overload would not cause a domino effect in that
case.
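
The CBS bookkeeping described above can be sketched in C as follows; the cbs_t structure and its
fields are assumptions made for illustration only:

/* Simplified Constant Bandwidth Server bookkeeping (hypothetical types). */
typedef struct {
    double budget;       /* currently available budget  */
    double max_budget;   /* server maximum budget       */
    double period;       /* server period               */
    double deadline;     /* current server deadline     */
} cbs_t;

void cbs_charge(cbs_t *s, double executed) {
    s->budget -= executed;             /* decrease by the job execution time */
    while (s->budget <= 0) {
        s->budget += s->max_budget;    /* recharge to the maximum value      */
        s->deadline += s->period;      /* postpone the server deadline       */
    }
}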

3.2.5. Scheduling on the Multiprocessor


Optimal scheduling in the multiprocessor case is a much more complicated task. As shown
before, the EDF approach is not optimal for scheduling tasks in multiprocessor
environments. Various solutions have been developed to handle this problem. However,
all these solutions demand a priori knowledge of the computational resources required for
task completion.
An alternative to the EDF dynamic priority assignment algorithm is constructed using the laxity concept.
The laxity is the maximum amount of time a task execution can be delayed without causing the task
to miss its deadline. Mathematically, the laxity is the task deadline minus the remaining computational
time. The LLF (Least Laxity First) algorithm, alternatively known as LST (Least Slack Time),
schedules tasks depending on their laxity [80]. The process with the least laxity is assigned
the highest priority in the system and is therefore executed. While a process is executing, it can be
preempted by another one whose laxity has decreased below the laxity of the currently running
process. As with EDF, LLF is optimal: if a set of tasks can be scheduled by any
algorithm, it can be scheduled by LLF. However, the scheduling of two tasks with similar
laxities causes difficulties. In this case one task will run for a short while and then get preempted by
the other, and vice versa. Thus, a lot of context switch operations will be performed during the run
time. Each context switch requires a considerable amount of processor time. Moreover, context
switching hampers cache utilization. Therefore, the system efficiency will be significantly
diminished.
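
With $d_i$ denoting the absolute deadline of task $i$, $t$ the current time and $c_i(t)$ its remaining
computation time, the laxity can be written as:

$$\ell_i(t) = d_i - t - c_i(t)$$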
The EDF-US [81] and fpEDF [82] algorithms are extensions of EDF. To provide better
schedulability in multiprocessor environments they give the highest priority to tasks with
CPU utilizations above a certain threshold and then assign priorities to the rest in the EDF order.
However, these algorithms are not optimal, and there exist task sets which are schedulable by
EDF but not schedulable by these algorithms [83].
The next set of solutions is based on the concept of Proportional Fairness, or P-Fairness [84]. This
approach is designed for optimal scheduling of periodic tasks on a multiprocessor. Pfair
scheduling tries to maintain proportionate progress for all tasks using a quantum based model.
Time is subdivided into fixed-length slots. Within each slot each processor may be allocated to
one task at most. The P-Fairness rules require the tasks to be scheduled in such a way that all slots
assigned to a single task are uniformly distributed over its period. To achieve the described
scheduling, each task is divided into a set of quantum length subtasks that must execute within
windows of approximately equal length. The end of a subtask's window defines a pseudo-deadline
for that subtask. The EDF algorithm is used to schedule the subtasks. This prioritization is called EPDF
(Earliest Pseudo Deadline First). At present three optimal Pfair algorithms are known: PF [84], PD
[85] and PD2 [86]. These algorithms are built on the EPDF basis but differ in the choice of
tie-breaking rules. However, because of the requirement to distribute each task's execution over
time, the Pfair algorithms suffer from the same context switching problem as LLF does.

As stated before, EDF is not optimal in the multiprocessor case, and LLF causes a large
number of preemptions under certain conditions. The EDZL (Earliest Deadline until Zero Laxity)
algorithm combines the two mentioned algorithms in order to eliminate the stated problems. EDZL
considers both deadline and laxity for priority assignment. It schedules jobs according to EDF
while all tasks have positive laxities. When the laxity of a certain task becomes zero, EDZL gives
that task the highest priority. According to [83], the number of preemptions required by EDZL is
comparable with that of the EDF algorithm; however, more task configurations can be successfully
scheduled. The most important result is the ability of EDZL to schedule every task set schedulable by
EDF, which means that the algorithm is optimal in the single processor case.
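
The EDZL selection rule can be sketched in C as follows; the task_t structure is an assumption
made for illustration only:

/* EDZL selection sketch: a zero-laxity task runs immediately, otherwise
 * the task with the earliest deadline is chosen (plain EDF order).      */
typedef struct { double deadline; double remaining; } task_t;

task_t *edzl_select(task_t *ready, int n, double now) {
    task_t *best = NULL;
    for (int i = 0; i < n; i++) {
        double laxity = ready[i].deadline - now - ready[i].remaining;
        if (laxity <= 0)
            return &ready[i];          /* zero laxity: highest priority */
        if (!best || ready[i].deadline < best->deadline)
            best = &ready[i];
    }
    return best;
}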

3.3. Data Representation Concept


As is evident from the first chapter, the data has to be supplied to different clients in the
corresponding representations. Converting the data between different binary representations, as well
as encoding it to and from the OPC XML-DA default representation, should be performed by the
client and server software. Moreover, multiple XML representations are allowed by the HDR
specification, and the server implementation should be able to convert the XML data between them
as well. The data may vary from simple integer and floating point numbers to custom
structures containing multiple vectors, matrices, dynamic arrays and references.
To automate the handling of the data in different formats by means of dynamic conversion, the
"Data Representation" concept is introduced. The system design considers that each OPC Item is
associated with a set of different data representations. For example, the data can be available in
the server native representation, an arbitrary client native representation, and the OPC XML-DA
representation. The native server representation is assumed to be the primary representation. For
each representation except the primary one, a so-called base representation is defined. Two functions
are defined to convert the data from/to the base representation (see Fig. 19). Therefore, the
conversion between any two representations is carried out by a sequence of operations recoding the
data to/from its base representations. For example, to convert data from a custom XML
representation to the client native representation the following sequence is used: the data is converted
from the XML representation to the OPC XML-DA standard representation; then the OPC XML-DA
standard representation is converted to the native server representation; finally, the native server
representation is converted to the native client representation.
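
A minimal sketch of how such a representation graph could be expressed in C is shown below; the
type and field names are assumptions made for illustration and do not reproduce the actual prototype
code:

#include <stddef.h>

/* Hypothetical sketch of one node of the Data Representation graph. */
typedef struct representation representation_t;
struct representation {
    const char *name;
    representation_t *base;   /* NULL for the primary (server native) one */
    /* Convert a value of this representation from/to its base form. */
    int (*from_base)(const void *in, size_t in_len, void **out, size_t *out_len);
    int (*to_base)(const void *in, size_t in_len, void **out, size_t *out_len);
};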

[Fig. 19 diagram: the Server Native Representation is the primary one; the Client X, Y and Z Binary
Representations and the OPC XML-DA Representation are based on it; the First, Second and Third
XML Representations are based on the OPC XML-DA Representation; a representation holding part
of the data in client Z's binary format is based on the Client Z Binary Representation.]

Fig. 19. The figure illustrates the relations of the standard representations. The arrows indicate the
corresponding base representations.

3.3.1. Standard Representations


A set of standard representations is defined to handle the conversions defined by the HDR
specification. This set includes the NDR (Native Data Representation), the OPC XML-DA standard
representation, and arbitrary XML and binary representations. The arbitrary binary and XML
representations are used to provide the data in the appropriate format to clients that are unable
to handle format conversion on their side. For the binary representations this mostly includes
simple clients running on embedded hardware with limited abilities to handle NDR conversions.
The XML representations are mostly used together with various web based software. For example,
simple web applets or Flash animations may request the data in a format appropriate for
immediate display.
Another, more powerful example is the possibility to use web components to represent the data in a
graphical form. Currently a number of SVG and Flash based solutions are available that
construct sophisticated charts from provided XML data. Therefore, it is possible to request the
data in the appropriate XML format and pass it to these components in order to present graphical
information to the end user. Moreover, the XForms specification provides a standard way of
implementing web based GUI interfaces. Hence, the operators are able to control the system by
means of predefined XML documents. Both these technologies brought together allow simple
construction of web based HMI (Human-Machine Interface) interfaces using the variety of available
components.
The conversions between the NDR and binary data representations, as well as the conversion between
the NDR and the OPC XML-DA standard representation, are carried out based on the HDR Binary type
descriptions (see section 2.4). The conversions between the different XML representations, as well
as the conversions between the standard OPC XML-DA representation and an arbitrary XML
representation, are performed using supplied XSL Transformation stylesheets.
The Data Representations can be used more widely than just defining different representations of
the same data. A data representation may represent the data obtained by applying various data
filters and transformations to the base representation. This approach is very convenient for
providing support for all kinds of data filters and queries considered by the OPC Complex Data
specification (see section 1.6.4). The possibility to accept queries from clients combined with the
"Data Representation" concept gives a broad range of powerful options to the system designers. For
example, it is possible to automatically extract critical information for the operators, perform server-side
data preprocessing, or filter out parts of the data not important for the client. The standard query
extracting component parts from compound data (see section 2.9.8) is implemented using this
approach.

3.3.2. Converting Binary Representations


In contrast to the XML representations, which are converted at once using the XSL transformation
engine of the underlying XML library, the binary representations are converted in parts by the
following conversion procedure.
For each basic type a set of functions is defined. These functions are used to convert a value of that
basic type. For the conversion from the NDR representation to the OPC XML-DA representation these
functions simply serialize numeric values into the corresponding XML string. During the reverse
conversion the numeric values are correspondingly extracted from the XML string. To convert the
data between the NDR representation and an arbitrary binary representation, the function computes the
difference between the considered types. This difference includes information about type alignment,
byte order and basic type size. If the types are not equal, a conversion altering the binary
representation is performed; otherwise the data is just copied.
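
As an illustration of such an altering conversion, the following C helper swaps the byte order of a
single 32-bit value when the source and destination representations differ only in endianness (a
simplification of the general case, which also handles alignment and size):

#include <stdint.h>

/* Swap the byte order of one 32-bit value; when the source and destination
 * byte orders are equal the caller simply copies the data instead.         */
static uint32_t swap_u32(uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) | ((v & 0xFF000000u) >> 24);
}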
The handling of references is a little bit more complicated. As described in section 2.9, complex
data consisting of several memory blocks linked together with references is transferred
sequentially in separate DIME chunks. The chunk's "Options" header contains the block
address in the server memory space. Obviously, in the client memory space the blocks are placed
differently. Therefore, it is required to resolve all references from the server positions to the client
ones. To convert a reference, a search through all chunks is performed. When the chunk with the
appropriate "Options" header is found, its current memory address is used as the new reference value.
The conversion routine analyzes the HDR description sequentially, record by record. For each
record the corresponding function related to the record type is called. The converted value is
encapsulated in the DOM tree if the conversion into the standard SOAP representation is
performed. In other cases the values are stored sequentially in a dynamically allocated memory
buffer. For the dynamic allocation dedicated per-representation memory pools are used. A
speculative statistical approach is used to preallocate the memory pools (see details in section 3.7).
Therefore, the memory management has a negligible impact on the system performance.

Finally, it should be noted that direct data conversion between two non-native binary
representations is not supported. The data should be converted to the NDR representation first and
then to the required representation in order to perform the recoding.

3.3.3. Representation Caching


The data conversion between the considered representations, especially the XML ones, implies
significant CPU usage. In the case of long representation chains the data conversions actually
hurt the server performance. In fact, even in the most standard case, when the standard OPC
XML-DA server supports legacy clients with SOAP encoded data, the conversions between native
and XML representations take more than 90% of the used computational resources.
Therefore, in the case of several clients, representation caching saves a considerable amount of
resources and drastically improves performance.

With the default approach the data is cached on the server in the NDR representation. Upon a client
request a memory buffer is allocated and the representation conversion is performed into the buffer.
The buffer is released after the data is sent to the client. Therefore, the request of the next client to
the same data item is handled in the same way. If Representation Caching is enabled, the server
allocates a set of buffers for each OPC Item, one buffer for each supported representation. The
NDR representation is cached in the corresponding buffer; the other buffers are initially empty. Upon
the arrival of a client request the server performs the representation conversion and stores the derived
representations in the corresponding buffers. The next client will get the cached representation.
Even in the case of clients requesting different representations the approach can improve
performance. For example, if a first client requests the data in one XML representation and a
second one in another, the following procedure will be performed. After the receipt of the first client
request the server will convert the NDR representation into the OPC XML-DA standard
representation, and then the latter into the XML representation desired by the first client. Both the OPC
XML-DA representation and the first client's desired representation will be cached in the
appropriate buffers. When the second client request is received, the server will not convert the NDR
into the OPC XML-DA representation, but will use the cached one and simply convert it into the client
desired format. After the data value is renewed, all non-native representations are cleared and the
conversions are performed again on client request.
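
The caching logic can be summarized by the following C-style sketch; the data structures and
helper names (buffer_t, rep_cache_t, convert) are hypothetical:

/* Hypothetical sketch of the Representation Caching lookup. The NDR entry
 * is always valid, since the data is cached in the NDR form by default.   */
const buffer_t *get_representation(opc_item_t *item, int rep) {
    rep_cache_t *c = &item->cache[rep];
    if (!c->valid) {
        /* Obtain the base representation (recursively, cached or not),
         * convert it into this representation and remember the result. */
        const buffer_t *base = get_representation(item, c->base_rep);
        convert(base, rep, &c->buffer);
        c->valid = 1;
    }
    return &c->buffer;   /* further clients are served from the cache */
}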

In Fig. 20 the client request processing time using the Representation Caching optimization is compared
against the implementation without it. The conversion between the Native Data Representation and the
standard OPC XML-DA representation is performed to make the comparison. The upper diagram
indicates the time required to support the first client; the bottom diagram stands for the second
and further accesses. As can be seen from the figure, the approach using Representation Caching
is slightly slower than the default one if a single client is supported. However, for the following
clients it is considerably faster. The results confirm that Representation Caching becomes efficient
starting from the 3rd simultaneous client and can bring up to 20% of performance gain per additional
simultaneous client.

[Fig. 20 chart: client request processing time (0-600 ms) with and without caching, shown
separately for the 1st client and for the 2nd and further clients.]

Fig. 20. The impact of the representation caching on the request processing time. The chart represents the
time required for client request processing with and without caching. The upper part represents the time
required to process the first request to the considered data. The time required for processing the following
requests is presented in the lower part of the chart.

3.3.4. Security Caching
The security handling operations imply considerable CPU usage as well. The security
subsystem can be significantly optimized for multi-client environments if a caching
mechanism similar to Representation Caching is implemented for digital signatures and
encrypted content.
Security caching certainly cannot be performed while the HTTPS security approach is used.
The messages destined for arbitrary clients contain different OPC Group configurations and
properties. Moreover, the response time is included in the OPC XML-DA message. Hence, no
two identical messages exist. However, the XML Security approach defined by the
HDR specification (see section 2.7.6) allows security handling on a per-item basis. In that case only
the data values are protected. The members of the OPC Group are handled completely
independently. Different security entities are allowed inside a single message. Therefore, the values
of certain data items can be stored in pre-protected form and just included into the final message.
However, only binary representations can be supported in that way. The OPC XML-DA
specification allows clients to request certain data conversions from the server. The specification states
that the server must support conversions from any supported scalar data type to a
string. Moreover, the string values should be delivered to the client using one of the supported
locales. Therefore, the final OPC XML-DA message carrying the considered data value may
vary significantly depending on the request properties. So there is no possibility to cache security
entities for the standard OPC XML-DA representation.
For the binary data representations the HDR specification defines that the type alteration and string
localization should be performed on the client side. The data is transferred using the plain NDR
representation uniformly for all clients. Therefore, it is possible to produce the digital
signature in advance for subsequent utilization in the responses. Moreover, the data may be encrypted
with a symmetric key and stored for further use.
Altogether, to prepare the data for a client the server would only need to encode the data digest and
the symmetric key by means of XML Security. No data conversions need to be performed in this case;
the data can be sent directly from the internal server buffer.

3.4. Component Model


As described in the first chapter, the OPC XML-DA HDR server should provide a set of well
defined interfaces for accessing structured objects contained within the server address space. These
objects are referenced as OPC Items and encompass a certain amount of associated information
along with metadata describing how this information was obtained. The OPC XML-DA prototype
implementation is constructed around these OPC Items (see Fig. 21).
The OPC Item is designed as a self-contained service: the item does not maintain a connection with
the data source; instead, the data source maintains the item through the provided interface. More
precisely, the OPC Item is a data structure surrounded by an API (Application Programming
Interface). The interface supports reading and writing of the data as well as subscription to data
change notifications. The client application is allowed to read the current value, write a new value or
ask for a data renewal. In exactly the same way the data source may update the current value or read
it. Upon a data renewal request a special flag is set within the OPC Item data structure. The data
source is able to monitor this flag in order to support user renewal requests. The important point is
that the data can be read or written using any of the supported data representations. All required
conversions are performed automatically.
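
The shape of this interface can be sketched in C as follows; the exact signatures are assumptions
made for illustration and do not reproduce the actual prototype headers:

#include <stddef.h>

/* Hypothetical sketch of the OPC Item interface. */
typedef struct opc_item opc_item_t;

int  opc_item_read (opc_item_t *item, int representation,
                    void *buf, size_t *len);          /* read the current value */
int  opc_item_write(opc_item_t *item, int representation,
                    const void *buf, size_t len);     /* write a new value      */
int  opc_item_subscribe(opc_item_t *item,
                    void (*notify)(opc_item_t *, void *), void *ctx);
void opc_item_request_renewal(opc_item_t *item);      /* set the renewal flag   */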
Besides the OPC Item, the OPC Server object is defined. It maintains global data structures,
including the run-time status, access control lists, the server item tree and so on. The server object
provides two interfaces: frontend and backend. Using the frontend API the clients are able to search
for a required item, get the current server status and query the supported capabilities. The backend
API provides a standard interface for OPC Item management.

Fig. 21. The figure represents the prototype implementation component model. The core system is composed
of the structured data items along with public interfaces. The Hardware Backend executes multiple
implementation-specific drivers communicating with the underlying hardware. The OPC XML-DA Frontend
processes requests from the clients. The HTTP encapsulation is performed by means of the stand-alone
Apache WWW server.

3.4.1. Backend
The system design considers the implementation of the data sources by means of vendor supported
plugins. These plugins are used to control the underlying hardware components via the appropriate
hardware-specific protocol. The data read out from the hardware is made available to the system
using one or more OPC Items. It is also possible to register items providing the ability to control
the driver and hardware behavior. The backend API defines a standard interface for manipulations
with the considered OPC Items. It provides a standard way to register new items, link them into the
server data tree and destroy them when they are no longer required. By means of the backend API the
hardware plugins are also able to specify the set of supported representations and the desired caching
and security model.
Besides the implementation-specific plugins, the backend provides a CORBA (Common Object
Request Broker Architecture [7]) service interface. The CORBA interface allows implementing
OPC XML-DA HDR support in various existing applications without the redesign indispensable
for plugin-style integration. As a major benefit, it provides a way to use the provided server
implementation from a whole range of programming environments. However, to achieve maximal
performance the plugins should be implemented.

3.4.2. Frontend
The frontend accepts requests from the client applications using a certain communication scheme.
The required data is then obtained using the frontend API and provided to the client in the desired
format. The default communication scheme is based on the OPC XML-DA HDR specification using
HTTP as a wire protocol. However, it is possible to implement other communication schemes in
order to integrate the system into arbitrary computing environments. For example, a GridFTP [87]
frontend would allow simple system integration into the existing GRID framework. Furthermore, in
late 2005 the OPC Foundation introduced the concepts of a new promising technology, OPC UA (OPC
Unified Architecture [88]). It is based on the Microsoft .NET technology and provides a further
development of the OPC data access interfaces. Although the full specification set is not yet provided
and, therefore, an implementation is not yet possible, OPC UA support can be implemented
using a dedicated frontend at the moment the specifications arrive. Finally, the described approach
makes it possible to use various third party components talking different protocols in the system.
Our implementation in this case should be used as a bridge between these components.
The OPC XML-DA HDR frontend is developed as anticipated by the OPC XML-DA and HDR
specifications. However, to simplify the implementation, the HTTP/HTTPS protocol support is not
embedded into the server. The frontend accepts OPC XML-DA messages using a simple TCP based
protocol. The HTTP support is provided by means of the Apache web server. An Apache module
was developed to transfer the client requests from the Apache web server to the OPC XML-DA server.
Fig. 22 confirms that such an approach does not need many more resources compared with a direct
connection, even under high loads.

Fig. 22. The figure illustrates the performance downgrade while using the Apache module compared with a
direct connection. To produce the high load, a bundle of legacy clients requesting the data in XML format
was used. The low load was imposed by a bundle of HDR clients requesting the data in the NDR representation.

3.5. Data Flow


This subsection describes the data flow from the underlying hardware to the end user. To simplify
processing, the following six stages are distinguished: Readout, Preprocessing, Representationize
(creating the appropriate data representation), Userize (altering the data format depending on the client
request), Security (protecting the data) and Send. The Readout is handled by the hardware plugins
in order to submit the data into the OPC Item internal buffer. The Send is called from the frontend
in order to send the data. All other stages are automatically managed by the OPC Item logic. Such an
approach allows constructing the data flow from blocks of simpler reusable components. Furthermore,
arbitrary stages can be skipped in certain cases. For example, the Readout is skipped if the
"MaxAge" parameter allows the server to use the cached value. The Representationize stage can be
skipped if Representation Caching is enabled and the requested representation is available in the cache
(see the illustration in Fig. 23). If the Security Caching is enabled and the appropriate value is cached,
the server should just send the cached data. In more detail, the following stages are singled out:
Readout: At this stage the underlying electronics is examined. The data is read out and cached
inside an internal server memory space using the NDR representation. This stage is highly dependent
on the underlying hardware interface and, therefore, it is carried out by the implementation specific
drivers. Depending on the "MaxAge" parameter the stage can be skipped.

Fig. 23. The figure illustrates the data flow considered by the system design. The left diagram illustrates the
case when the Representation Caching is enabled. The right one illustrates the data flow while the
Representation Caching is disabled. The topmost white box represents the data originally submitted to the
OPC server. The white boxes in the middle represent two different data representations maintained by the
server. The white box at the bottom represents the data representation which is provided to the end-user by
the client library. The edge labels indicate the executed processing stages.

PreProcessing: The system design requires the readout stage to be as simple and fast as possible.
Therefore, at the Readout stage the data is just stored in the memory buffer. If certain
preprocessing is required prior to making the data available, it should be performed at this stage.
Representationize: At this stage the conversion from the cached NDR representation to the
requested representation is performed. If Representation Caching is enabled and the appropriate
cached representation is available, this stage is skipped and the cached value is used.
Userize: At this stage the representation is modified to correspond to the user request. The
modifications are defined by the OPC XML-DA specification and include type altering and
string localization. The stage is only performed while supporting clients in the legacy OPC XML-DA
representation. In the HDR binary mode the data is transferred unaltered using the plain NDR
representation, and all considered type conversions and string localizations are performed by the
client library.
Security: This stage is only considered if the advanced XML Security approach is used. When the
data protection is not required or the data is protected using the standard HTTPS approach, the Security
stage is omitted. At this stage the data is encrypted and digitally signed as defined by the XML
Signature and Encryption specifications.
Send: At this stage all data items considered for the response are grouped into the OPC Group. The
response DOM tree is serialized into a plain string. This string, along with all considered
attachments, is encapsulated in the DIME message and the data is delivered to the HTTP transport layer.

3.6. Threading Model


The OPC XML-DA HDR server implementation considers simultaneous handling of multiple
clients and hardware drivers. The clients may define desired deadlines specifying the latest time at
which the server should respond to the request. In order to handle all these tasks obeying the required
timings, a set of periodic and sporadic tasks is executed within the server. This section reviews the
threading model used in the prototype OPC XML-DA HDR system implementation. The operating
system scheduler is used to execute the tasks. The deadlines are handled using an EDZL based dynamic
priority assignment algorithm (see section 3.2.5). Basically, the system model distinguishes four types of
tasks: Driver, Client-Server, Data Management, and Maintenance.
1. Driver tasks are used to run plugins communicating with the hardware.
2. Client-Server tasks are processing the client requests.
3. Data Management tasks are handling all system data management: data preprocessing,
representation management and security.
4. The Maintenance task manages periodic jobs used for server optimization. Cancellation of
timed-out subscriptions, memory pre-allocation, and configuration auto-adjustment are
examples of such activity.

Data management is the most important and resource-intensive task in the system. All data flow
from the underlying hardware to the network driver is carried out by the pool of data management
threads. The Preprocessing, Representationize, Userize and Security stages are handled by these
threads (see section 3.5). The threading model anticipates the client-server and driver components
to be as simple as possible with minimal resource requirements. Each hardware component is
controlled by a dedicated driver thread. The driver thread maintains the connection with the hardware
component as foreseen by the component specification. The data is read out into a buffer within
the server memory space. In most cases the buffer will be pre-allocated in advance by the
maintenance task. If the driver needs to perform actions that are performance-intensive but not
restricted in time, lower priority subthreads should be scheduled to perform them.
The client requests are handled by a pool of client-server threads. These threads are kept simple as
well. A thread processes the client request and gives orders to the data processing threads to prepare
the required data. It then simply waits until the data is ready and passes it on. Such a design
allows scheduling the driver and client-server tasks with the highest priority without causing a serious
performance penalty to other tasks.
The highest priority is assigned to the drivers in order to prevent hardware failures. If several
drivers are executed on a single system, it is up to the system integrator to assure that the available
processing resources are sufficient to schedule all of them in time. However, the minimal
requirements anticipated by the system design make this task rather easy. The next priority level is
assigned to the client-server tasks. The main reason is that different priorities may be assigned to
arbitrary clients and it is impossible to determine the required priority level prior to request
processing. Therefore, since request processing is not very resource-intensive, the requests are
processed with the highest priority and only then is the appropriate priority assigned. The lowest
priority is assigned to the maintenance thread to schedule server optimizations in the idle time. All
other priorities are used by the data manager to schedule data management tasks in the appropriate
order depending on their priority and deadline requirements. Table 5 lists the priority assignments
considered by the prototype implementation.

Table 5. Prototype implementation threading model


Priority        Tasks
Maximal         Hardware device driver plugins (backend)
Maximal - 1     Client request processing threads (frontend)
Intermediate    Dynamically assigned to the data processing tasks depending on
                the task priority and considered deadline
Minimal         Server maintenance thread

3.6.1. Scheduler
The scheduler is used to manage the data processing threads used to convert the data between different
representations and adjust it to the user's desired format. All data flow stages are treated as separate
tasks and executed by a dedicated pool of threads. The pool can be extended or reduced by the
maintenance thread depending on the server load.
To handle deadline requirements these threads are prioritized by the EDZL dynamic priority
assignment algorithm. EDZL was selected because it is simple to implement. Like the EDF
approach, it is optimal in a single processor environment (used in most operating installations).
However, in the multiprocessor case it can successfully schedule more task configurations.
Compared with the LLF algorithm it incurs far fewer context switches. It is, therefore,
well suited for managing the data processing threads.

The threads are scheduled based on the client priority, task laxity and deadline. The system
integrator may segregate clients by priorities. The tasks posed by high priority clients will be
executed first, independent of the deadlines and laxities of the tasks scheduled by clients
with lower priorities. As stated before, the tasks are executed for each data processing stage
separately. Therefore, it is possible that certain stages are required by several clients with different
deadlines and priorities. In that case the corresponding task is scheduled with the maximal priority
considered by the clients demanding this stage's execution.
The second check is the task laxity. Tasks with a laxity below a certain threshold have
precedence over others and are executed in the order corresponding to their laxities. The remaining
tasks are executed using the EDF policy. Fig. 24 illustrates the thread priority assignments; a
condensed sketch of the resulting comparison rule follows below.
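
The comparison rule can be condensed into a short C sketch; the names and the threshold constant
below are illustrative assumptions, not the prototype's actual code.

    #include <stdint.h>

    typedef struct {
        int     client_priority; /* higher value = more important client   */
        int64_t deadline_us;     /* absolute deadline                      */
        int64_t laxity_us;       /* deadline - now - estimated exec. time  */
    } dm_task_t;

    #define LAXITY_THRESHOLD_US 2000 /* illustrative near-zero laxity band */

    /* qsort-style comparator: negative if a should run before b. */
    int dm_task_cmp(const dm_task_t *a, const dm_task_t *b)
    {
        /* 1. The client priority dominates laxity and deadline. */
        if (a->client_priority != b->client_priority)
            return b->client_priority - a->client_priority;

        int a_urgent = a->laxity_us < LAXITY_THRESHOLD_US;
        int b_urgent = b->laxity_us < LAXITY_THRESHOLD_US;

        /* 2. Tasks near zero laxity precede others, least laxity first. */
        if (a_urgent || b_urgent) {
            if (a_urgent != b_urgent)
                return b_urgent - a_urgent;
            return (a->laxity_us <= b->laxity_us) ? -1 : 1;
        }

        /* 3. All remaining tasks fall back to plain EDF ordering. */
        return (a->deadline_us <= b->deadline_us) ? -1 : 1;
    }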

The major difficulty in the algorithm implementation stems from the fact that the required execution
time is never known a priori. However, for subscriptions, periodic requests are issued. Hence, it is
possible to collect statistics allowing estimation of the maximal amount of effective processor time
required for task completion. This time is used to calculate the laxity. For sporadic read and write
requests, as well as for the first subscription iterations while the execution time is still unknown,
the EDF policy is used anyway.

Fig. 24. The figure illustrates task priority assignments considered by the system design.

The same information about the maximal amount of effective processor time required by the
maintained subscriptions is used to protect the server from resource starvation. After the total
load passes the first threshold, the priority of all sporadic tasks is reduced. The
scheduler will first execute all periodic tasks independent of their deadlines and only then the sporadic
ones. After a second threshold the new requests are rejected with an error message indicating
resource starvation. In the case that a high priority client requests data, the server is permitted to
drop subscriptions kept for lower priority ones.
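
A compact sketch of such a two-threshold admission check follows; the threshold values and names
are illustrative assumptions rather than the prototype's configuration.

    /* Admission decision derived from the estimated total processor load. */
    typedef enum {
        ADMIT_NORMAL,          /* schedule as usual                         */
        ADMIT_DEMOTE_SPORADIC, /* periodic tasks first, sporadic ones after */
        ADMIT_REJECT           /* reply with a resource-starvation error    */
    } admit_t;

    admit_t admission_check(double estimated_load /* 0.0 .. 1.0 */)
    {
        const double first_threshold  = 0.75; /* illustrative values */
        const double second_threshold = 0.95;

        if (estimated_load >= second_threshold)
            return ADMIT_REJECT;
        if (estimated_load >= first_threshold)
            return ADMIT_DEMOTE_SPORADIC;
        return ADMIT_NORMAL;
    }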

3.7. Memory Management


Optimization of memory allocation performance in a multithreaded environment is one of the
most important aspects of client-server architecture development. Standard memory allocators,
like dlmalloc by Doug Lea, implemented in modern operating systems are very effective, and in
most cases custom implementations cannot outperform them [89]. However, in the case of long-
running, heavily multithreaded server applications with large amounts of rapidly changing data,
considerable optimizations are possible, especially when several threads should satisfy
real-time requirements. The necessity of synchronization between multiple threads accessing the
shared memory heap is a major drawback of standard memory managers. This effect especially
affects real-time threads running among lower priority ones with active memory utilization.
The OPC XML-DA HDR based data exchange system is greatly affected by this drawback, since
high priority readout threads coexist with threads processing and generating the XML content,
which requires a large amount of memory and a high intensity of memory management operations.
The requirement to convert data between XML and binary representations makes it difficult to
determine the required size of the memory buffer to hold the whole data. The techniques described
below are used in the prototype server implementation in order to reduce the overall amount of
memory operations and completely eliminate the unpredictability in the scheduling of the high priority
readout threads caused by memory allocation from the shared heap.

3.7.1. Dedicated Ring Buffers


Widely used solutions like “growing buffers” are suboptimal since they involve multiple buffer
resizing operations. These operations are very expensive and intensively utilize the memory
subsystem. In the case of fragmented memory, the displacement of already allocated memory may
be required. A further complication is brought by the requirement of caching older values by the
Subscription mechanism of the OPC XML-DA specification. This demand suppresses the reuse
of the memory buffer and appears as another source of heavy heap utilization. Fortunately, the
sequential nature of the OPC XML-DA protocol allows the assumption that older cached values
will be freed earlier than newer ones. Making an additional assumption about successive submission
of new values (i.e. the allocation of any processing item should be completely finished before starting a
new one), the custom approach described below can be used.
The approach is based on the concept of the “dedicated ring buffers”. For effective memory space
allocation each representation of a data item is associated with a sufficiently big dedicated ring
buffer. The buffer is used to store all memory blocks related to the current data item representation
and the corresponding representations of all cached data item values. Fig. 25 illustrates the difference
between the “growing buffers” and “dedicated ring buffers” memory allocation schemes.
If the buffer is too small to hold all the data required by the caching policy, it is replaced
by a bigger one. The old buffer is freed as soon as no data in use remains in it. To prevent
the overflow of a buffer during memory block allocation, memory usage statistics are collected.
This statistical information is used to pre-allocate a certain amount of space in the buffer and should
prevent copying partly allocated blocks into new buffers.
The approach gives the following benefits:
• Most of the blocks are allocated in successive order, which is extremely efficient, since the data
can be manipulated as one big chunk instead of many small pieces.
• Storage of multiple cached values in the same ring buffer is optimal from both the speed and
memory utilization points of view.
• Usage of pre-allocated dedicated buffers drastically reduces memory fragmentation.
• Statistical information helps to estimate the required space before allocation and reduces the
number of memory management system calls.

[Figure 25 diagram: the “Growing Buffer” scheme keeps the current value (412KB) and two cached
values (296KB, 192KB) in three separate buffers of 512KB, 512KB and 256KB; the “Dedicated Ring
Buffer” scheme stores all three values in a single 1024KB buffer with a free region between the
buffer end and the buffer start.]

Fig. 25. The figure illustrates the difference between the “Growing Buffer” and “Dedicated Ring Buffer” memory
allocation schemes. The storage for the current and two cached values associated with a single OPC Item is
allocated in order to show the difference. The “Growing Buffer” scheme uses three independent buffers; the
“Dedicated Ring Buffer” scheme uses a single big buffer to store all values.
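
A minimal sketch of the allocation logic is given below, assuming values are released strictly
oldest-first as discussed above. The names are illustrative; the prototype additionally handles
buffer replacement and collects the usage statistics.

    #include <stddef.h>
    #include <stdint.h>

    /* Per-representation dedicated ring buffer (simplified). */
    typedef struct {
        uint8_t *mem;  /* pre-allocated storage          */
        size_t   size; /* total buffer size              */
        size_t   head; /* next allocation offset         */
        size_t   tail; /* oldest still-referenced offset */
    } ringbuf_t;

    /* Allocate n bytes; NULL means the buffer is too small and must
     * be replaced by a bigger one (done by the maintenance logic). */
    void *rb_alloc(ringbuf_t *rb, size_t n)
    {
        if (rb->head >= rb->tail) {           /* free space wraps around */
            if (rb->size - rb->head >= n) {   /* fits before the end     */
                void *p = rb->mem + rb->head;
                rb->head += n;
                return p;
            }
            if (rb->tail > n) {               /* wrap to the beginning   */
                rb->head = n;
                return rb->mem;
            }
        } else if (rb->tail - rb->head > n) { /* fits in the middle gap  */
            void *p = rb->mem + rb->head;
            rb->head += n;
            return p;
        }
        return NULL;
    }

    /* The sequential protocol frees the oldest value first, so a
     * release simply advances the tail. */
    void rb_release_oldest(ringbuf_t *rb, size_t n)
    {
        rb->tail = (rb->tail + n) % rb->size;
    }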

The most important gain of the approach is the elimination of the need to lock/unlock the heap on
every memory manipulation request. Combined with the pre-allocation of buffers of certain sizes,
this resolves the blocking of high priority threads by low priority ones due to memory allocation from the
same heap. The evaluation results presented in Fig. 26 show that the performance of the proposed
allocator is practically unaffected by the server load, whereas the growing buffer allocator reveals a
five-fold speed decrease under heavy loads. It is even more important that the time required to
allocate memory with the “dedicated ring buffer” allocator is constant and, therefore, predictable,
while the time required by sequential memory allocation using the growing buffer technique
varies by a factor of three under high loads. It therefore certainly cannot be used in a system
with strict real-time demands.

3.7.2. Transparent Memory


Another important improvement of the memory management subsystem is achieved by the ability
to directly manipulate the ring buffers. The backend API interface provides the hardware driver
threads with the ability to allocate blocks of memory directly from the associated ring buffers. Such
an approach completely eliminates the necessity of process-wide memory allocation routines and
makes the high priority data acquisition threads truly independent of the system load.
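
The resulting driver-side usage pattern can be sketched as follows; all function names are
hypothetical illustrations of the backend API described in the text, not its actual interface.

    #include <stddef.h>

    typedef struct opc_item opc_item_t;

    /* Hypothetical backend API: allocation comes straight from the
     * item's dedicated ring buffer, never from the shared heap.    */
    void *opc_backend_alloc(opc_item_t *item, size_t n);
    void  opc_backend_submit(opc_item_t *item, void *data, size_t n);
    void  hardware_read(void *dst, size_t n); /* device-specific readout */

    #define READOUT_BLOCK_SIZE 4096 /* illustrative block size */

    /* One readout cycle of a high priority driver thread. */
    void readout_cycle(opc_item_t *item)
    {
        void *buf = opc_backend_alloc(item, READOUT_BLOCK_SIZE);
        if (!buf)
            return; /* buffer being replaced; retry on the next cycle */
        hardware_read(buf, READOUT_BLOCK_SIZE);
        opc_backend_submit(item, buf, READOUT_BLOCK_SIZE);
    }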

[Figure 26 chart: allocation time (µs, 0–20000) versus the number of concurrent clients (0–64) for
the “Growing Buffer” and “Dedicated Ring Buffer” allocators.]

Fig. 26. Comparison of the “growing buffer” and “dedicated ring buffer” memory allocation techniques depending
on the server load (X-axis). The charts represent the time required to allocate a 512KB sized value, divided into 8
blocks. The server load is imposed by multiple clients requesting data in OPC XML-DA compatibility mode.

Fig. 27. The figure represents the multi-layer prototype OPC XML-DA HDR server implementation. The
Abstraction Library hides operating system details from the other system components. The Data
Management Library provides a set of interfaces for the server data structure management. On top of
these libraries the prototype server implementation is constructed. It consists of the Frontend and Backend parts
and a controller scheduling the required operations at the appropriate time. The Apache WWW server is used for
the HTTP encapsulation. The LabVIEW bindings enable OPC XML-DA support in LabVIEW based
systems.

3.8. Evaluation of Prototype Implementation


The protocol is implemented as a double-layer library (see Fig. 27). The abstraction layer hides
the operating system details. All memory management, thread synchronization,
network communication and other OS specific operations are performed by means of the
abstraction library. Hardware drivers intended to run on multiple platforms should rely
on the library as well. Currently, it is implemented and thoroughly tested in Linux and
Windows environments. However, the Linux implementation is based on the POSIX specifications
and, therefore, porting to other POSIX compliant systems is an easy task. In addition to support of
the general-purpose Windows OS, the library is ported to and tested on the Phar Lap real-time system,
which is based on the Windows NT core and usually used in conjunction with LabVIEW Real-Time
on FieldPoint and PXI devices [23].
The second layer is constructed on top of the abstraction interlayer and provides the set of API
interfaces described in section 3.4. The server API interface enables the management of
the server configuration and run-time status. The backend API interface provides a way to configure
data sources, allocate memory from the associated ring buffers and submit new data. The
conversions between different data representations and the coupling with the protocol related
information are performed using the frontend API interface.
The library is implemented in pure C utilizing an object oriented approach by means of a design similar
to the one described in [90]. This allows the library's integration into environments programmed in
any language adopting the concept of object files; examples are C, C++, FORTRAN, and Pascal.
The parts of the system written in other languages (Java is the most used
example) can be linked with the library using the CORBA interface (see section 3.4.1).
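
The object oriented C style mentioned above is commonly realized with structures of function
pointers; the following generic sketch illustrates the idea with names that are not the library's
actual types.

    #include <stddef.h>

    /* An "interface" is a struct of function pointers plus private state. */
    typedef struct data_source data_source_t;

    struct data_source {
        int  (*open)(data_source_t *self);
        int  (*read)(data_source_t *self, void *buf, size_t n);
        void (*destroy)(data_source_t *self);
        void  *priv; /* implementation-specific state */
    };

    /* Callers use only the interface, so any language understanding the
     * C object-file format can link against such objects. */
    int poll_source(data_source_t *src, void *buf, size_t n)
    {
        if (src->open(src) != 0)
            return -1;
        return src->read(src, buf, n);
    }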

3.8.1. Framework
On top of the library the OPC XML-DA HDR server framework is constructed. The framework
consists of the backend and frontend interfaces and a controller used to schedule various operations
considering the defined timing demands. The frontend interface accepts connections from the clients
and prepares data in the user requested format in accordance with the OPC XML-DA HDR
specification. The backend interface maintains the hardware plugins and CORBA connections. The
plugins communicate with the physical hardware and submit the data to the server utilizing the
backend API interface.
Currently the framework implements the OPC XML-DA specification and the core of the HDR
specification. Hence, it is able to support data in both the XML compatibility and NDR binary formats.
The support of optional HDR extensions like PGM multicast, XML Security and others is not
available in the prototype implementation yet.
Besides the OPC XML-DA interface the server provides a standard web interface. By means of a
standard web client, users may browse the server data space, monitor and adjust item values.
Sophisticated web-based control interfaces may be designed using the XML data representation by
means of the Chiba XForms implementation [91].

3.8.2. LabVIEW Bindings


LabVIEW (Laboratory Virtual Instrumentation Engineering Workbench [18]) is a platform and
development environment for visual control system programming from National Instruments. It is
used for data acquisition, instrument control, and industrial automation on a variety of platforms
including Microsoft Windows, various flavors of UNIX, Linux, and Mac OS. Besides the standard
version, a real-time LabVIEW implementation operating on the industrial PXI platform is
available [23].
LabVIEW based control and data acquisition solutions are very popular and used in many
operating installations. However, the current LabVIEW edition does not provide any means of OPC
XML-DA protocol support. In order to allow the integration of LabVIEW components into OPC
XML-DA based systems, OPC XML-DA bindings for LabVIEW were released. To achieve that, the
framework is compiled as a shared library and loaded into the LabVIEW memory space, providing
the data exchange capabilities by means of the OPC XML-DA HDR protocol. Both the Windows
and Linux versions of LabVIEW are supported. In order to support LabVIEW Real-Time, the
framework is ported to Phar Lap (a Windows NT based real-time operating system).

3.8.3. Performance Analysis
To compare the performance of the developed system with other available solutions, the omniORB
and gSOAP toolkits were used. omniORB is one of the fastest CORBA implementations
according to [92]. gSOAP is the most featured SOAP toolkit available under Linux. However,
OPC XML-DA is a higher level protocol than CORBA and SOAP. Therefore, direct speed
comparisons between the OPC XML-DA implementation and these toolkits are impossible.

[Figure 28 charts: throughput (MB/s) versus item size (2–2048 KB) for the Binary, XML, CORBA
and gSOAP modes; the top panel spans 0–120 MB/s (1GBit network), the bottom panel 0–12 MB/s
(100MBit network).]
Fig. 28. The chart represents the maximal achievable throughput of a Linux-running server supporting a
single Windows client with arrays of floating-point numbers in HDR (binary) mode. An AMD Athlon 64 X2
4800+ (2.4GHz dual core, 2GB RAM) server connected to a 1GBit network was used to obtain the top diagram,
and an Intel PIV Northwood (2.2GHz, 1GB RAM) server connected to a 100MBit network for the bottom
one.

Nevertheless, an implementation of a simple OPC-like protocol using these toolkits was developed.
Of course, full OPC compatibility is not provided by these implementations, but they give the
possibility to make rough estimations of the system performance compared with the mentioned
environments.
Windows clients, connected to a Linux server using 100MBit and 1GBit network links, were used
to measure the performance and scalability. Both sides used x86 compatible hardware. As can be
seen in Fig. 28 and Fig. 29, the developed framework in the binary mode (XML metadata and binary
data) achieves the performance of a pure binary CORBA system as long as the transferred data
frames have moderate sizes. Even in the XML mode (XML metadata and data) the framework is
faster than a competing solution based on the standard SOAP implementation. Fig. 30 proves that
the system is able to support multiple clients without significant performance loss.

Fig. 29. Evaluation of the server under the load imposed by 3 clients making one request per second. Each of
the clients requests 512KB of floating point data. An Intel PIV Northwood (2.2GHz, 1GB RAM)
server connected to a 100MBit network was used to perform the benchmark. The OPC XML-DA HDR
server load is evaluated in the HDR (Binary) and compatibility (XML) modes and compared with CORBA
and SOAP based solutions.

[Figure 30 charts: throughput (MB/s) versus the number of clients (1–16384 on the left, 1–1024 on
the right); the left panel spans 0–50 MB/s (HDR binary mode), the right panel 0–5 MB/s (OPC
XML-DA compatibility mode).]

Fig. 30. The chart represents the performance dependence on the number of clients. Each client
continuously requests floating point arrays (16KB). The data was served to the clients in HDR binary mode
for the left diagram and in OPC XML-DA compatibility mode for the right one. The performance is
evaluated as the maximal data amount which can be transferred to all clients in a second. An AMD Athlon 64 X2
4800+ (2.4GHz dual core, 2GB RAM) server connected to a 1GBit network was used to perform the
benchmark.

3.9. Future Development
In this chapter it was shown that the proposed protocol can be effectively used to transfer data at high
rates. However, there is wide open space for future improvements in both concepts and
implementation. The most important item is the upcoming OPC UA standard promising to bring a
new level of sophistication to OPC-based systems. Based on the currently available information,
the new specifications will unify OPC A&E, OPC HDA and OPC DA servers in a single
architecture. The new sophisticated information model will represent the server data space as a set
of various objects interconnected by different kinds of references. The OPC Item object consists
of an arbitrary number of different attributes, variables of simple and complex types, commands and
notifiers. This provides the possibility to define complex relations between supported items,
implement custom methods managing the associated data, and so on. The specifications will provide
an enhanced security model and different means of redundancy support. Fortunately, as stated in
section 3.4.2, the protocol implementation is separated from the data model. Therefore, the
OPC UA protocol may be implemented on the existing base by means of an appropriate frontend as
soon as the complete set of specifications becomes available to the public [88].
Another important trend in data acquisition and analysis system development is GRID
technology, which provides unbeatable computational capabilities. For modern experiments
with data rates exceeding dozens of megabytes per second and more, distributed
computations are the only possible solution to handle all the data in real-time. To provide better
integration into GRID frameworks, certain optimizations of the proposed implementation
may be considered.
The current implementation can be improved as well. One aspect is network, computation and
memory resource sharing between multiple clients. The management system can
be constructed using the CBS approach described in section 3.2.4. Sophisticated resource management
schemes may be constructed to allocate quotas depending on the client identity, source address and
other information. For example, a client may be entitled to unrestricted bandwidth while it is
operating in the system intranet, and the same client may be considerably restricted in data rate
when it requests data from outside using a slow internet connection. Another example is sharing
resources between permanent subscription connections and sporadic read requests. The CBS allows
limiting the resource amount allocated to the sporadic tasks in order to prevent resource starvation
among the critical continuous tasks.
Another important topic is the optimization of the scheduler used to prioritize data management tasks.
The current implementation is based upon the EDZL approach. This approach doesn't consider any
relations between tasks. However, the implementation of certain data queries (see section 1.6.4) may

require the consideration of such relations in the task prioritization. When OPC UA is
implemented, this requirement will surely become an important task. Besides relations, Pfair
based schedulers (see section 3.2.5) may provide better resource management in the multiprocessor
case. Nowadays, most data acquisition systems are based on single processor computers.
However, all CPU vendors are switching to multicore processors. Therefore, in the near
future many data acquisition systems will be equipped with multiprocessor solutions. Active
research is ongoing in the area of Pfair based schedulers [72]. The following examples may be
important for the OPC XML-DA task management. SRMS (Stochastic Rate-Monotonic
Scheduling) is designed for use in systems where the periodic tasks have highly variable execution
times and soft rather than hard deadlines. Task synchronization using the Lock/Free mechanism
under Pfair scheduling is discussed in [93]. The Group-based or Hierarchical approach is
proposed in order to reduce the number of context switches required by Pfair scheduling [94]. A
careful study of currently available and developing solutions in this area may help in implementing
a much more optimal and sophisticated scheduler for the data management tasks.
The last topic of interest is the further standardization of metadata properties and queries to help in
establishing universal data exchange rules in various experiments. For example, in cosmic ray
physics many detectors are measuring a number of incident particles depending on energy, direction
and charge. In order to facilitate the data exchange between detectors maintained by different
collaborations, the corresponding metadata properties should be defined. It means that one should
define a standard property indicating the energy of measured particles, a standard property indicating
the charge of measured particles, and so on.

3.10. Summary
In this chapter it was shown that an OPC XML-DA based protocol can be used to transfer data in a very
efficient way if the proposed extensions are added. Support for high data rates is very important for
many scientific applications if OPC XML-DA is to be used as a universal data exchange
protocol.
The main concepts of these extensions involve the separation of the data from the protocol and metadata
information, the usage of the server native binary representation for data exchange, and the opportunity
to support multiple clients with the newest PGM multicasting standard. Furthermore, the following
ideas were used to improve the implementation performance and usability. The “data representation”
concept allows automatic conversion between different representations of the data and furnishes an
opportunity to create custom data filters and masks on the server side. Data tweaking with user
supported queries is another benefit of this approach. The memory management using the “dedicated
ring buffer” technique eliminates most of the memory allocation delays on heavily loaded servers
and makes the high priority data acquisition threads truly independent of the system load. The
scheduling model allows a soft real-time system implementation. The EDZL based scheduling
approach effectively manages the task priorities in order to satisfy the clients' timing requests. In
conjunction with the considered threading model and memory optimizations, it makes it possible to
sustain soft real-time demands even on certain general-purpose operating systems.
The results of the benchmark measurements demonstrate that the implementation achieves the
performance of pure binary CORBA systems while at the same time providing all the XML metadata
considered by the OPC XML-DA specification. Therefore, our solution can serve as a universal
one combining acceptable performance with high level, industry approved standards.

CHAPTER 4

DATA ACQUISITION SYSTEM OF ARAGATS COSMIC RAY MONITORS NETWORK

4.1. Introduction
In this chapter the new data acquisition system of the ASEC (Aragats Space Environmental Center)
detector network is described. Currently the network consists of 6 detectors located at the two
research stations on the slopes of Mount Aragats. Additionally, the possibility of extending the
network to several near-equator countries is considered by the data acquisition system design.
The ASEC particle detectors perform monitoring of various species of secondary cosmic rays with
different energy thresholds. The detector setup measures the number of incident charged and neutral
particles depending on their energy and arrival direction, along with atmospheric pressure and
temperature. These measurements, quantified as time series, are the basic data for the physical
inference on Space Weather issues.
The research stations are located at altitudes of 2000 and 3200 meters above sea level and are
connected with the main lab in Yerevan by means of a wide-range radio network. The OPC XML-DA
protocol is used to provide both the data dissemination and control capabilities for the distributed
data acquisition system. The information from the detectors is used to issue time critical warnings
about Space Weather issues. Because of the slow and not always stable connection between the stations,
centralized data processing is not always possible. Therefore, the preliminary data processing is
performed by the distributed network components operating in all research stations independently.
The main component of the data acquisition system is the URCS (Unified Readout and Control Server).
The URCS server reads out the time series from the underlying electronics by means of detector
specific drivers and makes a preliminary analysis of the received data. Then the data is made
available to other system components by means of the OPC XML-DA protocol. Each detector
channel is disseminated using a separate OPC Item. Predefined metadata properties are used to
indicate the physical meaning of the channel. The URCS components may exchange information in
order to correlate the obtained data with the data collected by other components of the detector
network. Besides the data items, the URCS server provides a set of control items providing the
possibility to control both the detector electronics and the URCS software behavior.
In addition to a certain number of URCS servers, the distributed data acquisition system consists of
the operator web frontends and a data storage subsystem. The web frontend provides operators with the
possibility to monitor current data and adjust the URCS configuration. The data storage servers
periodically inquire the data from all detectors and store it in the MySQL database as rows of
numbers. The physical meaning of each column is obtained from the metadata properties and stored in
an XML document of a special structure in the same database as well.
Further, the stored data is analyzed by the off-line software and made available to scientists by
means of the DVIN (Data Visualization Interactive Network) interface [4, 51].

4.2. Physical Aspects


Galactic Cosmic Rays (GCR, mostly protons and heavier nuclei) are accelerated in the Galaxy in
tremendous explosions of supernovas and by other exotic stellar sources. After traveling tens of
millions of years in the intergalactic magnetic fields they arrive in the solar system as a highly isotropic
and stable flux. In turn, our nearest star, the Sun, is a variable object changing radiation and particle flux
intensities by many orders of magnitude within a few minutes. Therefore, because of the Sun's closeness,
the effects of the changing fluxes have a major influence on the Earth, including climate, safety and
other issues (see for example [95, 96]).
Before the cosmic ray flux can reach the Earth's surface and be registered by a detector, the particles
are exposed to interactions with the magnetosphere and atmosphere. The ability of charged
particles to penetrate into the magnetosphere from outside is limited by the Earth's magnetic field.
The shielding effect of the magnetic field is usually described by the concept of cutoff rigidities,
since the magnetosphere imposes a lower limit on the energy of primary cosmic ray particles entering
the atmosphere. These cutoff rigidities are highly dependent upon the geographic latitude,
reaching their maximum at near-zero latitudes. Near the poles the shielding effect is much
lower. The primary flux then collides with atoms in the atmosphere and, if energetic enough,
produces secondary elementary particles (electrons, muons, neutrons, etc.).
Among other geophysical parameters, the influence of the Sun on the Earth's environment can be
described as the changing (modulation) of the stable galactic cosmic ray “background”. The Sun
modulates the GCR (Galactic Cosmic Rays) in several ways. The explosive flaring processes on the
Sun result in the ejection of huge amounts of solar plasma and in the acceleration of copious electrons
and ions. These particles, along with neutrons produced by protons and ions within the flare,
constitute the so-called SCR (Solar Cosmic Rays). The SCR reach the Earth and, if energetic enough,
initiate secondary elementary particles in the terrestrial atmosphere. This effect is called GLE
(Ground Level Enhancement). Other, non-direct solar modulation effects also influence the
intensity of the GCR. The solar wind “blows out” the lowest energy GCR from the solar system,
thus changing the GCR flux intensity inversely proportionally to the Sun's activity, well described by
the 11 year cycle. The very fast solar wind from the coronal holes, huge magnetized plasma clouds
and shocks initiated by CME (Coronal Mass Ejections), traveling in the interplanetary space with
velocities of up to three thousand kilometers per second (the so-called interplanetary CME, iCME),
disturb the IMF (Interplanetary Magnetic Field). On arrival at the Earth, the magnetic field of the iCME
plasma shock triggers an overall depletion of the GCR, measured as a decrease of the secondary cosmic
rays detected by the networks of particle detectors covering the Earth (the so-called Forbush decrease, Fd).
Charged particles hitting the shock are reflected by it, forming the “depletion” region behind it.
Due to the abundance of low energy primary protons and nuclei, which are normally deflected by the
geomagnetic field at low latitudes anyway, the Fd depletion is pronounced at high latitudes. Vice
versa, geomagnetic storms appearing as a sudden change of the Earth's magnetic field can enlarge the
count rate of middle and low latitude particle detectors without any notable alteration of the high
latitude detectors' count rates. If the magnetic field of the iCME is directed southwards, it reduces the
cutoff rigidity, and GCRs typically deflected effectively by the magnetosphere at middle and low
latitudes now penetrate into the atmosphere and generate additional secondary particles. In high
latitudes the cutoff rigidity is very low and the count rates of particle detectors are determined mostly
by the attenuation of the cascades in the atmosphere, so a decrease of the cutoff rigidity does not
significantly enlarge the number of secondary particles reaching a detector.
Low energy cosmic rays (up to ~1 GeV/nucleon) are effectively registered by particle
spectrometers on board space stations (SOHO, ACE) and satellites (GOES, CORONAS). The
latitudinal dependence of the Earth's magnetic field provides the possibility to use a dispersed network of
NM (Neutron Monitors) as a spectrometer registering GCR in the energy range from 0.5 to ~10
GeV [97]. The surface particle detectors measure the number of secondary particles incident on a
usually not very large detector surface. These measurements, quantified as time series, are the basic
data for the physical inference on the solar modulation effects. There is absolutely no possibility to
distinguish SCR and GCR on an event-by-event basis. The solar modulation effects are detected as
non-random changes of the time series. The SCR modulation affects mostly the lower energy
particles and, therefore, the registered effect is highly dependent on the detector location. In high
latitudes the enhancement of secondary particles produced by the abundant low energy SCR can reach
1000% and more. In low latitudes the enhancements due to SCR can be very small, usually a
fraction of a percent. The direct measurement of the highest energy cosmic rays by space-borne
spectrometers or balloons is not feasible yet. Therefore, recently some large surface detectors
intended to register GCR with energies higher than 10^5 - 10^6 GeV (PeV region) are used for
detecting SCR [98, 99]. The experimental technique used for these detectors, i.e. the registration of
Extensive Air Showers (EAS), is very similar to the techniques used for SCR detection. The
difference is that PeV particles generate millions and millions of secondary particles in the
atmosphere, a large portion of which reaches the surface (in contrast, only a few particles
generated by GeV SCR reach the surface). To detect and measure the energy and type of PeV particles,
hundreds of square meters of particle detectors are used. The detectors are triggered by a special
condition allowing rejection of the low energy particles.
Charged particles travel and reach the Earth by way of the “best magnetic connection paths”, which
are not a straight line between their birthplace and the Earth. The solar neutrons, on the other hand,
not being influenced by solar and interplanetary magnetic fields, reach the Earth directly from their place
of birth on the solar disc. This feature allows us to “map” the flare location and provide the “time
stamp” of the neutron production, making them excellent “probes” of solar accelerators. For this
reason we need to detect the solar neutrons, distinguish them from other incoming particles,
measure their energy, and determine their incoming direction. The first step to achieve these
enhanced possibilities of neutron detection was to establish the network of SNT (Solar Neutron
Telescopes), installed at seven locations on high mountains around the world, forming the second
operating international world-wide particle detector network, the NM (Neutron Monitor) network
being the first [100].
The large variety of solar modulation effects and the stringent limitations of space and surface-
based experimental techniques require new ideas for developing experimental techniques for
measuring the changing fluxes of elementary particles. Therefore, a new type of particle
detector with enhanced flexibility to precisely and simultaneously measure the changing fluxes of
different secondary particles with different energy thresholds will be a key to a better understanding
of the Sun.
Hybrid particle detectors of ASEC (Aragats Space Environmental Center), measuring both the charged
and neutral components of secondary cosmic rays, provide good coverage of different species of
secondary cosmic rays with different energy thresholds. A multivariate correlation analysis of the
detected fluxes of charged and neutral particles is used for the research of geo-effective events,
i.e. Ground Level Enhancements, Forbush decreases and Geomagnetic Storms, and for the
reconstruction of the energy spectra of SCR [4].

4.3. ASEC Detectors


The ASEC provides monitoring of different species of secondary cosmic rays and consists of six
detectors located at two high altitude research stations on Mt. Aragats in Armenia (geographic
coordinates: 40°30'N, 44°10'E; geomagnetic cutoff rigidity: ~7.6 GV; altitudes 2000m and 3200m).
Both stations are connected with the main lab in Yerevan. The specifications of the ASEC monitors are
shown in Table 6. Additionally, the project of a new low-latitude world-wide particle detector
network with the participation of Costa Rica, Croatia, Egypt, Bulgaria, Armenia and Indonesia was
discussed at the UN/NASA/ESA IHY workshop [5]. Establishing a new world-wide network of the
ASEC detectors at low to mid latitudes will give the possibility to measure the energy spectra of primary
particles with energies up to 50 GeV, as well as provide cost-effective possibilities for Space
Weather research.

Table 6. Characteristics of the ASEC monitors

Detector     Altitude, m   Surface, m2                 Threshold(s), MeV     In operation since
NANM         2000          18                          100                   1996
ArNM         3200          18                          100                   2000
SNT          3200          channels: 4 (60cm thick)    120, 200, 300, 500    1998
                           veto: 4 (5cm thick)         7
2 x NAMMM    2000          4.86 + 4.86                 7, 350 (a)            2002
AMMM         3200          45                          5000                  2002
MAKET-ANI    3200          6                           7                     1996

(a) The first value is the energy threshold for the upper detector, the second for the bottom detector.

The two 18NM-64 neutron monitors [97] estimating the number of incident neutrons are in operation at
the Nor-Amberd and Aragats research stations. They are called the NANM (Nor Amberd Neutron
Monitor) and the ArNM (Aragats Neutron Monitor), respectively. The monitors are equipped with
interface cards providing time integration of counts from 1 second up to 1 minute. The other ASEC
detectors are based on the scintillation effect, with layers filtering part of the particle spectrum. The
scintillation is used to detect charged particles passing through the detector. The sensor, called a
scintillator, consists of a transparent plastic material that fluoresces when struck by ionizing
radiation. A sensitive photomultiplier tube (PMT) measures the light from the scintillator. The PMT
is attached to an electronic amplifier and other electronic equipment to count and possibly quantify
the amplitude of the signals produced by the photomultiplier. The following subsections briefly
review the detectors operating at ASEC.

4.3.1. Aragats Solar Neutron Telescope


The SNT (Aragats Solar Neutron Telescope) is a part of the world-wide network of Solar Neutron
Telescopes. The Aragats SNT is formed from 4 separate identical modules, as shown in Fig. 31.
Each module consists of a 60 cm thick scintillation block viewed by a photomultiplier. The
detecting volume is formed from standard 50x50x5 cm3 plastic scintillators stacked vertically on a
100x100x10 cm3 horizontal scintillator slab. One meter above the thick lower scintillator slab is
another scintillator slab of 100x100x5 cm3, with the goal of registering charged particles. A scintillator
light capture cone and a PMT (Photo Multiplier Tube) are located on the bottom and top slabs
separately to measure the number of events in each of them.
Incoming neutrons undergo nuclear reactions in the thick plastic target and produce protons and
other charged particles. The intensity of the scintillation light induced by these charged particles
depends on the neutron energy and is measured by the PMT on the bottom scintillators. In the
upper 5 cm thickness of the scintillator plastic the neutrons do not effectively interact with the
scintillator nuclei and are, therefore, only registered by the bottom scintillators. In contrast, charged
particles are very effectively registered by both the upper thin 5 cm and the lower thick 60 cm
scintillators. Therefore, the absence of a signal in the upper scintillators, coinciding with a signal in the
thick lower scintillators, points to neutral particle detection.
The amplitude of the photomultiplier output signals is discriminated according to 4 threshold
values, corresponding to threshold energies of 120, 200, 300 and 500 MeV, respectively. When
coincidences of the top and bottom scintillators are registered, it is possible to roughly estimate the
direction of the incoming charged particle.

Fig. 31. The figure presents the Aragats Solar Neutron Telescope (left) and the Nor-Amberd Multidirectional
Muon Monitor (right).

4.3.2. Nor-Amberd Multidirectional Muon Monitor


The NAMMM (Nor-Amberd Multidirectional Muon Monitor) consists of two layers of plastic
scintillators above and below one of the three sections of the NANM (6 counters BP28), as shown in
Fig. 31. Each layer is composed of 6 scintillators with an area of 0.81 m2 each. The distance between
the layers is approximately 1 m. The lead filter of the neutron monitor absorbs electrons and low energy
muons. Muons with energies above 350 MeV can reach the bottom scintillator. Therefore, the detector is
able to independently count the intensities of neutrons, high energy muons and other charged particles.
Additionally, the registration of coincidences between detector signals from the upper and lower layers
allows separate measurements of muons arriving from different directions. Combinations in which
multiple detectors are triggered indicate an EAS hitting the detector setup.

4.3.3. Aragats Multidirectional Muon Monitor


The AMMM (Aragats Multidirectional Muon Monitor) includes 15 m2 of scintillation detectors
located on top of a concrete stratum, and 90 m2 of detectors of the same type located 24 m
below, as shown in Fig. 32. The lower layer of the AMMM constitutes a very sensitive high energy
muon monitor, robust to local atmospheric conditions because of its rather high energy threshold.
The 6 m thick concrete blocks plus 7 m of soil filter out the electrons and the low-energy muons. Thus,
only muons with energies above 5 GeV are registered by the bottom detectors.
Using the coincidence technique, it is possible to monitor the changing count rates from numerous
space directions. The detectors at the top are grouped in 3, while those in the underground hall are
grouped in 8 to provide a significant amount of coincidences. The geometry of the detector
arrangement allows one to detect the angle of incidence with an angular accuracy of approximately 5º
for particles arriving from the range of directions from vertical to 60º declination.

Fig. 32. The figure represents the Aragats Multidirectional Muon Monitor.

4.3.4. MAKET-ANI
The MAKET-ANI surface array consists of 92 detectors formed from 5 cm thick plastic scintillators
to measure the particle density of the registered EAS. Twenty-four of them have a 0.09 m2 area and 68
have a 1 m2 area. The central part consists of 73 scintillation detectors and is arranged in a
rectangle of 85 x 65 m2. In order to estimate the zenith and azimuthal angles, 19 of the 92
detectors are equipped with timing readout to measure the EAS front appearance with an
accuracy of approximately 5 nanoseconds. The photomultipliers (PM-49) are placed in
light-tight iron boxes. Logarithmic Analog to Digital Converters (ADC) and Constant Fraction
Discriminators (CFD) are placed just above the photomultipliers.

4.3.5. Frontend Electronics


All frontend equipment was developed by the electronics group of the Cosmic Ray Division of the
Yerevan Physics Institute according to modern, very compact and highly reliable technologies, oriented
toward easy maintenance and production. To minimize the data transmission rate, the raw data is partially
processed in a microcontroller before being sent to the frontend computer. The newly designed readout
is based on the concept of full software control of the detector parameters and maximum utilization
of all detector data. Each photomultiplier has its own local programmable high voltage (HV) power
supply and a buffer preamplifier to condition the pulses in preparation for sending them via long
coaxial cables without degrading the dynamic range and signal-to-noise ratio. The counting modules
are located in the counter room. They have buffer preamplifiers and programmable threshold
comparators (discriminators) at the inputs. The threshold of the counter module input comparators
can be programmed by voltage and polarity in the range from -0.5V to 0.5V. Besides the
comparators, the buffer preamplifier output signals can be transferred to other data processing
devices, such as ADCs, to be installed later. All electronics modules are based on modern
8-bit and 32-bit microcontrollers, used for the detector control system (HV programming and
measurement) and for the main data acquisition, respectively. Currently, Atmel 8-bit and Fujitsu
FR 32-bit controllers are used.
The main pressure sensor of the whole system is placed in a special pressure-tight box with the
possibility of periodic calibration using a standard Hg barometer. It consists of a Motorola
MPXA6115 Integrated Silicon Pressure Sensor and an ATMEL 8-bit microcontroller and has a
frequency modulated output for direct coupling with the counter modules and a serial asynchronous
interface to connect to the PC.
At the moment all detectors are equipped with a standard serial interface. However, this interface
restricts the maximal number of devices connected to a single front-end host. The distance from
the host system is limited as well. Therefore, the Lantronix XPort Ethernet interface will be used in
the next generation of data acquisition boards. Since all ASEC boards currently lack an embedded
Ethernet interface, the UART-Ethernet converters produced by the 1st Mile Company are used as a
temporary solution to provide Ethernet connectivity.

4.4. ASEC Data Acquisition System


In the old data acquisition system each detector was frontended by a computer running the Linux
operating system with specialized readout software. Furthermore, certain detectors with operator
control capabilities were equipped with two frontend computers: the first was used for the data
readout and the second for control by means of the LabVIEW software. Such a heterogeneous
design caused not only the requirement to maintain multiple unreliable computers but also posed
significant difficulties in software management. The software components written for various
detectors at different times utilized different interfaces and file formats. Comparable software
components had to be synchronously developed for the systems controlling diverse detectors. Most
of the detector drivers were developed only for specific Linux versions and there were problems
with porting them to new systems. The support of multiple interfaces and file formats
considered by the various system components complicated the development of the
analysis software. Moreover, the chaos with file formats several times resulted in data
misinterpretation and, therefore, caused researchers to repeat a considerable amount of work.

Therefore, the new ADAS (ASEC Data Acquisition System) was designed. It is based on
universal and well defined interfaces developed in order to uniformly control all parts of the ASEC
experiment. Fig. 33 presents the overall system design. The following subsections briefly
review the main components of the system.

4.4.1. Embedded Software


The microcode running inside all data acquisition boards is implemented using a double buffer
client-server architecture. The devices are initialized with predefined parameters and wait for
control from the host system. The host system should specify the desired detector configuration and
issue an initialization command. Data consistency is assured using CRC16 checksums carried
along with the commands. After the initialization request the detector starts operation in the standard
mode. The double buffer architecture is used to relax timing demands. While the current data is
prepared in the first buffer, the data of the previous operation is available from the second one upon
driver request. After the data in the first buffer is ready, the buffers are switched. If a
configuration adjustment is required, the host system should alter the detector configuration and
restart operation by means of the initialization command.
In order to allow detector auto-detection, the embedded software of all detectors supports a
discovery query of a standard form. In response to this query, information about the detector
type, embedded software version and hardcoded options is returned.
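
A minimal sketch of the double buffer mechanism is shown below; it is illustrative
microcontroller-style C, not the actual ASEC microcode.

    #include <stdint.h>

    #define BUF_WORDS 256 /* illustrative buffer size */

    /* One buffer is being filled while the completed previous one
     * stays available for the host at any moment.                 */
    static uint16_t buffers[2][BUF_WORDS];
    static volatile uint8_t fill_idx  = 0; /* buffer being filled        */
    static volatile uint8_t ready_idx = 1; /* buffer exposed to the host */

    /* Called when the current integration interval is complete. */
    void on_interval_done(void)
    {
        uint8_t t = fill_idx; /* switch the two buffers */
        fill_idx  = ready_idx;
        ready_idx = t;
    }

    /* Host request handler: returns the previous interval's data;
     * a CRC16 would be appended to the reply for consistency checks. */
    const uint16_t *host_read_request(void)
    {
        return buffers[ready_idx];
    }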

Fig. 33. The figure represents a layout of the new ASEC data acquisition system.

4.4.2. Frontend Computers


All detectors are connected to the frontend computers over an Ethernet interface by means of the UDP
protocol using 1st Mile UART-Ethernet converters. A single frontend computer is dedicated to each
research station in order to ensure reliability independent of wireless connection faults. The
same type of computer is used at all research stations; in this way the maintenance is greatly
simplified. Currently, the Minibox M100 (VIA C3 533MHz, 512 MB RAM) system based on the
VIA Eden platform is used. The major advantage of this architecture is the absence of mechanical
parts. The system has passive (fan-less) cooling. Instead of a hard drive, a Compact Flash memory
card is used. This significantly improves the system reliability, and maintenance simplification
comes as an additional benefit: the data acquisition software can be upgraded just by replacing the
Compact Flash card, an operation that can be performed by the technical staff. The Minibox M100 is
equipped with a small LCD display. It is used by the data acquisition software to present the current
system status and notify operators about failures. The computers run a Gentoo Linux based
operating system in conjunction with a 2.6 family kernel optimized for real-time applications.

4.4.3. Unified Readout and Control Servers


The detectors are controlled by the URCS (Unified Readout and Control Server), which operates
on the readout computers described above. Most of the URCS components are the same for all ASEC
detectors. The detector specific actions are performed with the help of dedicated drivers communicating
with the underlying electronics using the appropriate protocol. The URCS server is the primary system
component; both the data readout and the detector control are carried out with its help. The URCS
architecture is described in detail in section 4.5.

The OPC XML-DA protocol is used by the URCS server to provide both the data dissemination and
control capabilities for the remote data acquisition components. However, the developed OPC
XML-DA HDR implementation is a complicated piece of code and has not been thoroughly tested in
real applications yet. In fact, besides the sample applications developed for the performance and
stability evaluation, the ASEC data acquisition system is the first real-world application of the designed
system. Therefore, in order to preserve data acquisition system continuity, the OPC XML-DA
support is not embedded into the URCS server; instead, a separate OPC XML-DA HDR server is
executed on the same computer. The URCS server uses the CORBA interface to publish detector data
and accept control commands from the operators (see section 3.4.1). The described design assures
that the entire system will not fail because of problems in the OPC XML-DA interface. If
problems arise, the OPC XML-DA server is automatically restarted while the underlying
electronics is continuously controlled by the URCS server. All the data becomes available again as
soon as the OPC XML-DA server is restarted.
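
The restart logic can be realized as a simple watchdog around the OPC XML-DA server process; a
minimal POSIX sketch follows, in which the server binary path is a placeholder rather than the
actual ADAS deployment.

    #include <stddef.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Respawn the OPC XML-DA server whenever it exits; meanwhile the
     * URCS server keeps controlling the electronics independently.   */
    int main(void)
    {
        for (;;) {
            pid_t pid = fork();
            if (pid == 0) {
                /* hypothetical server binary path */
                execl("/usr/bin/opcserver", "opcserver", (char *)0);
                _exit(127); /* exec failed */
            }
            if (pid > 0)
                waitpid(pid, NULL, 0); /* block until the server dies */
            sleep(1); /* throttle restart attempts */
        }
    }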
Each detector is represented using a dedicated branch within the OPC XML-DA server data space.
The detector channels are rendered using separate OPC Items. Predefined metadata properties
are used to indicate the physical meaning of each channel (see section 4.7 for details). These
properties may include information on the registered particle types, the accepted energy range, the
particle arrival direction, etc. The URCS components may exchange information in order to
correlate the obtained data with the data collected by other components of the detector network.
However, this capability is not used by the current software yet. Besides the data items, the URCS
server provides a set of control items providing the possibility to control both the detector electronics
and the URCS software behavior.

4.4.4. Registry
The other components of the distributed system should be able to find a full list of the operating URCS
servers. A special registry was developed to handle this goal. The registry runs on a stand-
alone server (optionally together with one of the URCS servers) and provides a self-announcement
interface by means of the OPC XML-DA protocol. The “Registry” item is available within the OPC
XML-DA server data space. The URCS servers periodically write an XML document of a defined
structure into this item in order to announce themselves (the document structure is described in
section 4.7). The registry server analyzes this document and creates an appropriate OPC Item in the
server data space within the “Registry” branch. If the Item already exists, the associated information is
just updated. Thus, the “Registry” branch contains the full list of the running servers. The used names
are descriptive enough to give an idea about the URCS server location and assignment.
The remote applications wishing to get the full list of the active URCS servers browse the
“Registry” branch and query certain items for information. The query returns information about
the server URL, current status, a short assignment description and the last renewal time.
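
The self-announcement cycle of a URCS server can be sketched as follows; the client call is a
hypothetical placeholder, and the actual announcement document structure is the one defined in
section 4.7.

    #include <unistd.h>

    /* Hypothetical OPC XML-DA client call writing a value to an item. */
    void opcxmlda_write_item(const char *server_url, const char *item,
                             const char *value);

    /* Periodically announce this URCS server to the registry. */
    void announce_loop(const char *registry_url, const char *announcement_xml)
    {
        for (;;) {
            opcxmlda_write_item(registry_url, "Registry", announcement_xml);
            sleep(60); /* illustrative renewal period */
        }
    }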

4.4.5. Operator Frontend


The operator web frontend is another component of the distributed system; it is used by the operators
to control the operation of the URCS servers and of the detectors behind them. The web
interface is implemented in PHP and connects to the URCS servers by means of
the OPC XML-DA protocol. The operator is able to browse the published data and monitor the
current status of the system. A Macromedia Flash based animation provides a visual
representation of the time series by means of a periodically updated data curve. The program written
in ActionScript (and used to control the animation) establishes a subscription with the considered
OPC Item and periodically polls for new data. This data, along with older cached values, is
used to draw the time-value curve in real time, showing the time evolution of the specified data
channel. The metadata properties may be used by the URCS server to specify special conditions
demanding the operator's intervention (see section 4.7). If this happens, the frontend sounds an
alarm.
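
The subscription mechanism follows the standard OPC XML-DA Subscribe / SubscriptionPolledRefresh
pattern: the client registers its interest in an item and then periodically asks only for the values
changed since the last poll. The C sketch below outlines this cycle; the opc_* functions, the helper
hooks and the item name are hypothetical placeholders (the actual frontend code is written in
ActionScript, as described above).

/* Hypothetical client API wrapping the Subscribe and
   SubscriptionPolledRefresh requests of the OPC XML-DA protocol. */
typedef struct opc_client opc_client;
opc_client *opc_subscribe(const char *server_url, const char *item);
int opc_poll_refresh(opc_client *c, double *values, long *timestamps, int max);
void opc_unsubscribe(opc_client *c);
void chart_append(long timestamp, double value);   /* placeholder drawing hook */
void sleep_ms(int ms);                             /* placeholder delay hook */

void monitor_channel(const char *url)
{
    /* The item name is illustrative; see the data space layout in Fig. 40. */
    opc_client *c = opc_subscribe(url, "NAMMM/Data/4");
    for (int i = 0; c && i < 3600; i++) {          /* poll for one hour */
        double values[64];
        long timestamps[64];
        int n = opc_poll_refresh(c, values, timestamps, 64);
        for (int j = 0; j < n; j++)
            chart_append(timestamps[j], values[j]); /* extend the data curve */
        sleep_ms(1000);                             /* once per second */
    }
    opc_unsubscribe(c);
}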
The web frontend is also used to manage the URCS configuration, including the configurations of the
underlying electronic devices. Individual OPC Items control the configuration of the
corresponding detectors; an additional item controls the URCS software behavior. Each
configuration is described by an XML file with a detector-specific structure. The structure is not
hard-coded in the software but is described by means of XSD Schema descriptions. This
description is used to present appropriate XForm entries to the operators, providing the ability to adjust
certain options. On the basis of the submitted data, a new XML configuration file is generated
and submitted to the appropriate URCS driver, which can use it at its discretion. The Chiba
software is used to generate the XForm and configuration documents based on the XSD Schema [91].
Chiba is implemented in Java and designed to run in a Servlet 2.3 web container [101]. The
web container is provided by the Apache Tomcat server.
Several hardware components within the data acquisition system are still controlled by
LabVIEW based programs. These components are integrated into the new environment as well. The
integration is achieved by means of the OPC XML-DA bindings for LabVIEW (see 3.8.2) and,
therefore, the operators are able to use the standard web frontend for their control.

4.4.6. Error Handling and Notifications


Besides the sophisticated web-based interface, a simple Linux command-line control application is
available. It implements the basic functionality, including the URCS restart, the restart of a certain
driver, and the configuration renewal. The current status, the current value and the last error message
may be reviewed using this client as well. This makes it possible to implement simple automated
control scripts executed periodically in order to check the system status and perform recovery
operations if necessary. If automatic recovery is not possible, a notification is delivered to the
operators by e-mail.
In order to automatically detect hardware failures and severe Space Weather conditions, acceptable
value ranges are specified for the data channels. If a data value falls outside of these ranges,
indicating either a hardware failure or a sudden Space Weather disturbance, a notification message
is delivered to the operators. Reference [4] demonstrated a strong relation between inter-detector
correlations and CME driven Space Weather conditions. Therefore, the correlations between the
time series obtained from the different detectors are calculated, and if they surpass the defined
threshold, a notification is sent to the operators as well.
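
As an illustration, the following C fragment sketches how such a check could look: it computes the
Pearson correlation coefficient of two equally sampled count-rate series and flags the operators when
the defined threshold is surpassed. The notify_operators() function is a hypothetical placeholder for
the e-mail delivery described above.

#include <math.h>
#include <stddef.h>

void notify_operators(const char *message);  /* hypothetical e-mail hook */

/* Pearson correlation coefficient of two equally sampled time series */
static double correlation(const double *x, const double *y, size_t n)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (size_t i = 0; i < n; i++) {
        sx += x[i]; sy += y[i];
        sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i];
    }
    double cov = sxy - sx * sy / n;
    double var = (sxx - sx * sx / n) * (syy - sy * sy / n);
    return var > 0 ? cov / sqrt(var) : 0;
}

void check_detectors(const double *a, const double *b, size_t n, double threshold)
{
    if (fabs(correlation(a, b, n)) > threshold)
        notify_operators("Inter-detector correlation exceeds threshold");
}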
As described before, slow long-distance wireless links are used to connect the research
stations with the main lab. Due to weather conditions, the link is sometimes completely down. The
URCS servers are designed to allow fully autonomous operation. If a connection is
not available for a long period of time, the data is stored in local files on the flash card and
delivered to the storage server by means of the FTP protocol as soon as the connection is restored.

4.4.7. Data Storage


The data is stored by means of two powerful servers working in parallel at the main lab. Dual-
core AMD Athlon X2 4800+ systems equipped with 2GB of main memory and two Serial-
ATA 400GB hard drives organized in a mirroring RAID are used. These servers periodically
query the data from all detectors and store it in the MySQL database as raw numbers. The
physical meaning of each database column is obtained from the metadata properties and stored in an
XML document of a special structure (see section 4.6) within the same database.
Further, the stored data is analyzed by the off-line software and made available to scientists by
means of the DVIN (Data Visualization Interactive Network) web interface [4, 51].
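
As an illustration of this storage step, the sketch below uses the MySQL C API to insert one data
vector; the table name and column layout (a timestamp plus one column per vector component) are
assumptions made for this example only. The connection itself would be obtained beforehand with
mysql_init() and mysql_real_connect() in the usual way.

#include <stdio.h>
#include <mysql/mysql.h>

/* Insert one data vector into a hypothetical table
   "nammm" (time, duration, ch1, ch2, ..., chN). */
int store_vector(MYSQL *db, const char *time, double duration,
                 const double *v, int n)
{
    char query[8192];
    int len = snprintf(query, sizeof(query),
                       "INSERT INTO nammm VALUES ('%s', %f", time, duration);
    for (int i = 0; i < n; i++)
        len += snprintf(query + len, sizeof(query) - len, ", %f", v[i]);
    snprintf(query + len, sizeof(query) - len, ")");
    return mysql_query(db, query);   /* 0 on success */
}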

4.5. Unified Readout and Control Server


The URCS (Unified Readout and Control Server) is the main system component and is used to control
the detectors. The data readout is performed by means of the URCS server as well. The data is
disseminated to the clients using the OPC XML-DA protocol. For each detector a separate branch is
registered within the OPC XML-DA data space. In this branch several control items are registered
as well. Using these items the operators are able to adjust the detector (and driver) configuration, send
the few supported commands and retrieve error reports. The same control items are registered in a
stand-alone branch associated with the URCS server itself and are used to perform general server operations.

4.5.1. Layered Architecture


Several levels of abstraction form the core of the URCS server layout. The abstraction
library lies at the deepest level, providing the ability to run the data acquisition software under
multiple operating systems (this library is a part of the OPC XML-DA HDR library, see section
3.8). Currently, the Windows NT family and Linux systems are fully supported in both 32 and 64
bit environments. However, support for any POSIX compliant system can be easily added.
The connections and writers form the next abstraction layer. The connection abstraction
interface provides a uniform way of accessing the underlying data acquisition boards and makes it easy
to support new protocols without any changes in the data acquisition code. The current version of the
software supports devices connected through UART, USB and Ethernet interfaces. The writer
abstraction layer provides the ability to save data in different formats. Bringing the data to the
client applications in multiple ways is another possibility provided by the writer abstraction layer.
This possibility is used to distribute the data using the OPC XML-DA protocol. The data
is submitted to the OPC XML-DA server by means of the CORBA protocol; the ORBit CORBA
implementation is used for that purpose [102]. However, for data safety, in addition to the
dissemination by means of the OPC XML-DA protocol, the data is stored locally in files. The file
format consists of two parts: ASCII data compatible with the legacy software
developed for the old data acquisition system and XML data containing a metadata description of the
ASCII data. A more detailed review of the data format is given in section 4.6.
The Device is the topmost system component. It is used to get the data from the connected hardware,
perform the required preprocessing and pass it on. Each device is associated with a single
Connection used to communicate with the underlying data acquisition board and with multiple Writers
disseminating the data.
Multiple such devices are supported within a single URCS server. The list of devices along
with their settings is provided by the system integrator by means of the XML configuration file.
Fig. 34 illustrates the described abstraction layers.

Fig. 34. The figure illustrates URCS server abstraction layers.
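
In C such a layered design is commonly expressed through structures of function pointers. The
declarations below are an illustrative sketch of how the Connection, Writer and Device abstractions
could look; they do not reproduce the actual framework headers.

/* Illustrative interface declarations (not the actual framework headers). */
typedef struct Connection {
    int (*open)(struct Connection *self, const char *address);
    int (*read)(struct Connection *self, void *buf, int size);
    int (*write)(struct Connection *self, const void *buf, int size);
    void (*close)(struct Connection *self);
    void *ctx;            /* UART, USB or Ethernet specific state */
} Connection;

typedef struct Writer {
    int (*put)(struct Writer *self, const void *record, int size);
    void *ctx;            /* file, OPC XML-DA (via CORBA) or other backend state */
} Writer;

typedef struct Device {
    Connection *conn;     /* single connection to the acquisition board */
    Writer **writers;     /* multiple writers disseminating the data */
    int n_writers;
    int (*process)(struct Device *self, void *record, int size);
} Device;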

4.5.2. Threading Model


The URCS server is built around multiple readout threads and a single processing thread. For each
device a dedicated thread with the highest system priority is executed. This thread controls the
underlying data acquisition board, sustaining the strict timing demands. The thread also polls the
board for data and stores it in an intermediate ring buffer. A single buffer is used for all
devices. A pointer to the device which obtained the information is stored along with the data.
The processing and dissemination are performed by a single thread running with a lower priority. It
processes the ring buffer record by record and for each record executes the processing routine of the
appropriate Device. It also passes the data to the appropriate Writers upon request from the
Device processing routine. In order to prevent overall server hang-ups due to Writer delays (for
example, if the OPC XML-DA server has crashed and is temporarily inaccessible), a timeout restricts the
maximal amount of time used for the data storage and dissemination. If the processing is not finished
in the assigned time slice, the record is postponed and the processing of the next record from the buffer
is started.
All the OPC XML-DA threads (section 3.6) run with priorities lower than the priorities
assigned to the URCS server.
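
A minimal sketch of such a shared ring buffer, assuming POSIX threads, is given below; the record
layout is hypothetical and the timeout handling described above is omitted for brevity.

#include <pthread.h>

#define RING_SIZE 1024

struct Device;                        /* producer device, declared elsewhere */

typedef struct {
    struct Device *dev;               /* which device produced the record */
    char data[256];                   /* raw readout record */
} Record;

static Record ring[RING_SIZE];
static int head, tail;                /* next write / next read position */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* Called by the high priority readout threads. */
void ring_put(const Record *r)
{
    pthread_mutex_lock(&lock);
    ring[head] = *r;
    head = (head + 1) % RING_SIZE;
    if (head == tail)                 /* overflow: drop the oldest record */
        tail = (tail + 1) % RING_SIZE;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

/* Called by the single lower priority processing thread. */
void ring_get(Record *r)
{
    pthread_mutex_lock(&lock);
    while (tail == head)              /* wait until a record is available */
        pthread_cond_wait(&nonempty, &lock);
    *r = ring[tail];
    tail = (tail + 1) % RING_SIZE;
    pthread_mutex_unlock(&lock);
}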

4.5.3. Detector Network


As has already been stated, the URCS server communicates with the underlying data acquisition
boards over the Ethernet interface by means of the UDP protocol. The detectors' Ethernet segment is
isolated from the outside world in order to avoid eavesdropping and data corruption. Only the frontend
computer running the URCS server is connected to the segment.
In order to obtain a network address, the connected devices issue a discovery request by means
of the DHCP protocol. The frontend computer accepts the request and assigns an IP address from a
dedicated pool.
Usually, all operating detectors are listed along with their IP addresses in the URCS configuration
file. However, it is possible to specify an IP range and a default configuration in order to enable
device autodetection. In this case the URCS server will probe all IPs in the range using the discovery
command (see section 4.4.1). The detectors found will be initialized using the specified default
configuration.
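
The probing step can be pictured as a simple UDP loop over the configured address range. The sketch
below is schematic only: the payload of the discovery command (see section 4.4.1) is not reproduced
here, so send_discovery() is a placeholder.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Placeholder: sends the discovery command of section 4.4.1 to the given
   address and waits briefly for an answer; returns 1 if a detector replied. */
int send_discovery(int sock, const struct sockaddr_in *addr);

void probe_range(uint32_t first_ip, uint32_t last_ip, int port)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    for (uint32_t ip = first_ip; ip <= last_ip; ip++) {
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = htonl(ip);
        if (send_discovery(sock, &addr))
            /* ...initialize the detector with the default configuration... */
            printf("detector found at %s\n", inet_ntoa(addr.sin_addr));
    }
    close(sock);
}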

4.5.4. Configuration
The behavior of a device is controlled by means of an XML configuration. Initially this configuration
is read from the URCS configuration file. After that, it is mapped to the corresponding OPC Item
and the operators are able to adjust the configuration by writing a new value into this item. Upon a
configuration adjustment, the configuration file is updated, so the current configuration is not
lost after a server restart.
The configuration structure is completely device specific. The DOM in-memory representation of
the appropriate configuration is passed to the device driver. It is up to the driver to process the
configuration and extract the required information. One portion of the configuration controls the
driver behavior. Normally, this part includes the connection properties (type, address, timeouts), a list
of writers to use for the data storage along with their properties, the properties of the data preprocessing
algorithms, et cetera. Another part controls the detector hardware behavior and is passed by the
driver to the detector's embedded software. The configuration is described by an XML Schema
Description, which is available to the clients by means of the OPC XML-DA server as well. Both the
current configuration and this schema description are used by Chiba to generate the XForm entries
providing the control interface.
Besides the device specific configurations, a configuration controlling the global URCS behavior
is available. This configuration is mapped to an OPC Item and may be adjusted by the operators as
well.

4.5.5. Commands
The URCS server and the affiliated Devices are able to accept commands. A "Command" item is
registered for each device in the OPC server data space. By writing an XML document
into this item, a client may request the driver to perform certain actions. A "Command" item is registered
as well in the URCS-wide branch within the OPC server data space. The execution of server-
wide operations may be requested by writing an inquiry into this item.
Currently, only two commands are supported by the URCS server: the server restart and the reload
of the configuration file. It is also possible to restart only a certain driver by writing a command
into the "Command" item associated with that driver.
The structure of the "Restart" and "Configuration Reload" commands is described in section 4.7.
Other device specific commands are not standardized by the URCS server design; in this case an
XML query is passed to the Device driver and it is up to the driver implementation to handle the
request in the proper way.

4.5.6. Error Logging


Under Linux, information about problems during the server operation is reported using the standard
syslog facility. The Linux syslog implementation is a very powerful tool, able to store the
logged information in local files, in a temporary memory buffer and on a remote server by
means of different protocols. Depending on its criticality, the information may also be sent to the
operators by mail. Under Windows the error messages are stored in a local file.
Besides the log files, the URCS server provides information about the last occurred error by means of
the OPC XML-DA protocol. The errors related to each of the operating Devices are reported separately
using the "Error" item registered in the corresponding branch within the OPC server data space.

4.5.7. Self-Announcement
The URCS server announces itself to the other components of the distributed data acquisition
system by means of the OPC XML-DA based registry server. The address of the registry server is
obtained from the configuration file. The URCS server periodically writes information about its
current status and address into the "Registry" item available within the registry server's data space.

4.6. Data Representation


The embedded software controlling the ASEC detectors classifies the incident particles by several
properties. The number of events in each class is summed over the considered amount of time and
transferred to the URCS server upon request. Therefore, the ASEC data is represented as a
sequence of vectors identified by the readout time and the duration of the integration interval (time
series). The vector components represent the number of events registered in each class. The
information from the environmental sensors (pressure, temperature, etc.) is obtained at the end
of the integration time slice and stored in additional vector components.

The data storage subsystem of the old ASEC data acquisition system was based on
unstructured ASCII files. The rows represented vectors and the columns represented
vector components. As stated before, the lack of structure caused misinterpretations of the data
meaning and introduced many obstacles to the automation of the data analysis. Even more problems of
this kind are expected once the international network is set up. Therefore, the new XML based data
format was developed.

4.6.1. Basic Data Format


XML fits the system demands for meta-information describing the meaning of the enclosed data very
well. However, a completely new approach would obsolete all existing data analysis
software. Further, providing the full range of meta-data with each element of the time series would
drastically increase the size of the stored data. This would result in increased requirements on the
network and computational resources and, as a consequence, in higher installation costs. Finally,
a comparison of the XML-based query languages and SQL has shown a tenfold advantage of the
SQL approach [13]. Therefore, to compensate for the described drawbacks, an interim solution is
introduced. In a way similar to the old data acquisition system, the data is represented as
vectors within ASCII strings. These ASCII strings are enclosed in an XML structure providing
basic information about the enclosed data and referencing an external document with the thorough
detector description.

The data depicted in Fig. 35 illustrates the representation of a single vector by means of the new
format. The "installation" attribute references an external entity with the detector description.
However, the detector structure may change several times during the detector operation. Therefore,
the specific layout used to obtain the vector should be referenced as well; the "layout" attribute is
used for that purpose. The "Time" and "Duration" elements indicate the end and the duration of the
integration time slice. The timestamp and duration are represented following the encoding rules
defined by the ISO-8601 specification [103]. Special conditions encountered during the data
acquisition are described using the "Quality" element. Usually, this element indicates hardware failures
resulting in partly or completely inaccurate data. The "Value" element holds the data vector in
ASCII representation.

<Data installation="NAMMM" layout="layoutid">
<Time>2006-02-25T16:50:00.0000000+04:00</Time>
<Duration>P30.0000000</Duration>
<Quality>Good</Quality>
<Value>1846 2760 1956 1848 1763 … </Value>
</Data>
Fig. 35. A sample data vector represented using the new XML based format is shown in the figure.

In this way, the ASCII strings can be easily extracted from the data and used by the legacy
applications, while newly developed applications consult the XML description in order to
extract the appropriate data from the ASCII strings.
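
For instance, a legacy-compatibility tool based on libxml2 (the Gnome XML library used by the
prototype) could recover the plain ASCII string roughly as follows; this is a minimal sketch, not the
production code.

#include <libxml/parser.h>
#include <libxml/tree.h>
#include <stdio.h>

/* Print the ASCII vector enclosed in a <Data> document (see Fig. 35). */
void print_ascii_vector(const char *filename)
{
    xmlDocPtr doc = xmlReadFile(filename, NULL, 0);
    if (!doc) return;
    xmlNodePtr root = xmlDocGetRootElement(doc);   /* the <Data> element */
    for (xmlNodePtr cur = root->children; cur; cur = cur->next) {
        if (cur->type == XML_ELEMENT_NODE &&
            !xmlStrcmp(cur->name, BAD_CAST "Value")) {
            xmlChar *ascii = xmlNodeGetContent(cur);
            printf("%s\n", ascii);                 /* the legacy ASCII string */
            xmlFree(ascii);
        }
    }
    xmlFreeDoc(doc);
}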
Thus, the detector data consists of the data vectors of the aforesaid type and the detector description.
This description is not transported with the data but is available upon request from the URCS
server by means of the OPC XML-DA protocol. Therefore, the network utilization of the new data
acquisition system does not exceed that of the old one. However, the data and description can be
reconciled in a single document destined for the data exchange with collaborating groups of
scientists (see the example in Fig. 36).

Three main components may be distinguished in the detector description: the global detector
description, the description of the detector components and the description of the logical data layout
(the scientific meaning of each vector component). The first two components are filled in beforehand by
the operators, and the data layout is automatically generated by the URCS software.

4.6.2. Detector Description


The information included in the general detector description is presented in Table 7. The description of
the NAMMM monitor is provided as an example in Fig. 36.

<Installation id="NAMMM">
<Title>Nor-Amberd Multidirectional Muon Monitor</Title>
<Type>Multidirectional Muon Monitor</Type>
<Collaboration>
<Title>Aragats Space Environmental Center</Title>
<URL>http://crdlx5.yerphi.am/DVIN/</URL>
<Email>asec@crdlx5.yerphi.am</Email>
</Collaboration>
<Maintainer>
<Title>Cosmic Ray Division of Yerevan Physics Institute</Title>
<URL>http://crdlx5.yerphi.am/</URL>
<Email>crd@crdlx5.yerphi.am</Email>
<PostalAddress>…</PostalAddress>
<Phone>…</Phone>
<FAX>…</FAX>
</Maintainer>
<Location>
<Country>Armenia</Country>
<Region>Aragatsotn</Region>
<Coordinates>
<Latitude>40.5</Latitude>
<Longitude>44.167</Longitude>
<Altitude>2000</Altitude>
</Coordinates>
</Location>
<CutoffRigidity>7.6</CutoffRigidity>

<Geometry id="geometryid">
XML description of the detector components, see below. Multiple geometries may
be used during the experiment operation and are, therefore, described here. The
"geometryid" is used to reference the correspondent geometry from other parts of the
detector description. The "CurrentGeometry" element is used to reference the currently
used detector geometry.
</Geometry>
<Layout id="layoutid">
XML description of the logical data layout, see below. The data layout may be altered
during the experiment operation. The "layoutid" is used to reference the correspondent
data layout from other parts of the detector description. The "CurrentLayout" element
is used to reference the currently used layout.
</Layout>
<CurrentGeometry>geometryid</CurrentGeometry>
<CurrentLayout>layoutid</CurrentLayout>

<Data installation="NAMMM" layout="layoutid">
The optional data vectors embedded into the description, for the format see Fig. 35
</Data>
<Data installation="NAMMM" layout="layoutid">
The optional data vectors embedded into the description, for the format see Fig. 35
</Data>
</Installation>
Fig. 36. The figure contains an example of the detector description. The information corresponds to the
Nor-Amberd Multidirectional Muon Monitor.

Table 7. The standard fields of the detector description.

Title: Detector name.
Type: Type of the detector.
Collaboration: Information (Web Server, E-Mail) about the network the detector participates in, if any.
Maintainer: Information (Web Server, E-Mail, Postal Address, Phone, FAX) about the organization
maintaining the detector.
Location: The detector location (country, region and geographical coordinates in the WGS84 system).
Cutoff Rigidity: The cutoff rigidity at the detector location.
Geometry, Layout, CurrentGeometry, CurrentLayout: Geometry contains the description of the
hardware component parts; Layout contains the description of the data arrangement within the vector.
The configuration may be changed during the detector operation; therefore, several geometries and
layouts may be described. The CurrentGeometry and CurrentLayout elements reference the present-day
descriptions.
Data: The detector description can be reconciled with one or more data vectors by means of Data
elements.

4.6.3. Detector Geometry


The detector geometry describes the detector component parts and their mutual disposition. This
information is used to help hardware engineers find damaged components, to help physicists
determine the direction the particles are coming from, and to provide a basic detector description for
third parties inspecting the data.
The "Coordinate" branch defines a coordinate system by means of two angles: "Slope" is the slope of
the XY-plane and "Angle" is the angle between the northern direction and the Y-axis. The point specified in
the "Location" branch of the general detector description is used as the origin of coordinates. The
detector placement is then described in this coordinate system.

Most ASEC detectors have a layered architecture. Several layers of scintillation sensors are
separated by transition layers filtering part of the particle spectra. Therefore, the sensors are
described layer by layer. For each layer a unique ID number, the vertical alignment, a short assignment
description and a list of sensors are defined. For each sensor the following information is defined: an
ID number unique within the layer, the coordinates, dimensions, type and a short HTML description.
Fig. 38 contains parts of the Nor-Amberd Multidirectional Muon Monitor description as an
example.
Each system component part is uniquely identified by the layer ID and sensor ID. These two ID
numbers are used to reference component parts from the logical data layout in order to describe
which detector component is used to obtain the information contained in a certain data channel.

<Geometry id="geometryid">
<Coordinate>
<Slope>0</Slope>
<Angle>25.4</Angle>
</Coordinate>
<Layer id="1" Z="1">
<Description>The upper layer of the NAMMM</Description>
<Sensor id="1" X="0" Y="0" Width="0.9" Length="0.9" type="scintillator"/>
<Sensor id="2" X="0.9" Y="0" Width="0.9" Length="0.9" type="scintillator"/>

</Layer>
<Layer id="2" Z="0">
<Description>The bottom layer of the NAMMM</Description>

<Sensor id="5" X="0.9" Y="0.9" Width="0.9" Length="0.9" type="scintillator"/>
<Sensor id="6" X="1.8" Y="0.9" Width="0.9" Length="0.9" type="scintillator"/>
</Layer>
<Layer id="3" Z="0.5">
<Description>Filtering layer. The electrons and muons with energy below 350MeV are
filtered by this layer.</Description>
</Layer>
</Geometry>
Fig. 38. This figure contains an example of the detector geometry description. The parts of the description of
the Nor-Amberd Multidirectional Muon Monitor detector are taken as a source.

4.6.4. Data Layout


The most important part of the detector description is the explanation of the component layout
within the data vector. This data layout description indicates the physical meaning and the acceptable
value range of each vector component. The XML description providing the layout explanation
consists of a list of elements. Each element provides information about a single data component and
specifies its location within the data vector (see Fig. 37).

Fig. 37. The figure illustrates how the data layout description is used to provide information on the physical
meaning of the data components.

Utilizing the proposed approach, an application may execute an XPath query on the layout description
and select the nodes of interest. For example, the query "//Layout/Channel[@type='intensity_muon']" will
extract all elements providing information on the muon flux intensity. Then the application
finds the positions in the data vector containing the interesting data and extracts them. Thus, the data
acquisition system can automatically find the required information, and if some
channels are added to the detector, removed from it or rearranged, the software will
automatically handle the new data layout without code adjustments.
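
A sketch of this query step with the libxml2 XPath API (the XML library used by the prototype) is
shown below; error handling is minimal and the fragment is illustrative only.

#include <libxml/tree.h>
#include <libxml/xpath.h>
#include <stdio.h>

/* Select all muon intensity channels from a layout description. */
void find_muon_channels(xmlDocPtr doc)
{
    xmlXPathContextPtr ctx = xmlXPathNewContext(doc);
    xmlXPathObjectPtr res = xmlXPathEvalExpression(
        BAD_CAST "//Layout/Channel[@type='intensity_muon']", ctx);
    if (res && res->nodesetval) {
        for (int i = 0; i < res->nodesetval->nodeNr; i++) {
            xmlNodePtr node = res->nodesetval->nodeTab[i];
            /* The element content holds the position in the data vector. */
            xmlChar *pos = xmlNodeGetContent(node);
            printf("muon channel at vector position %s\n", pos);
            xmlFree(pos);
        }
    }
    xmlXPathFreeObject(res);
    xmlXPathFreeContext(ctx);
}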

Currently, three types of channel description elements are defined: "Channel", "Variance" and
"Correlation". The "Channel" element describes a value representing the measurements of a certain sensor.
As described in the introduction, the ASEC detectors measure the number of incident particles
registered during a certain amount of time. At the moment, one-minute intervals are used.
However, the embedded software is able to operate with intervals with a precision down to
hundreds of milliseconds. This fine-grained information is not stored, but used to calculate the variances of the
stored minute data and the correlations between the data channels. The "Variance" and "Correlation"
elements are used to describe the values containing this variance and correlation information,
accordingly.

The following attributes are used to describe the value meaning and acceptable range:
id: Unique ID number, used to reference the value.
sensor, layer: These attributes contain the sensor and layer ID numbers indicating the detector part
used to obtain the value. The attributes are not included for data obtained by means of several
parts, like coincidences (a count rate of a particle flux registered by the sensors in the upper and
lower layers simultaneously).
type: An enumeration describing the physical meaning of the value. The currently
defined types are listed in Table 8.
units: The attribute specifies the engineering units in which the value is represented.
low, high: These attributes specify the minimal and maximal values likely to be obtained
in normal operation. If the values exceed these limits, an alert is sent to the operator.
energy_min, energy_max: For the sensors counting the number of incident particles, these attributes
specify the minimal and maximal energy thresholds. Only the particles with energy within the
specified thresholds are counted. If only the "energy_min" attribute is specified, then the data value
represents the number of incident particles with energy above the specified threshold. The energy is
specified in megaelectron-volts.

direction: In order to estimate the intensity of the particle flux coming from a certain direction,
the ASEC electronics separately counts the number of particles simultaneously registered by
several layers of the scintillation sensors. The "direction" attribute is used to describe this type of
coincidence data.
To indicate the direction, the ID numbers of the sensors which registered the particles are
presented top-down, divided by the '-' sign. For example, "1-5" means that the
counted particles passed through the first sensor in the upper layer and the fifth sensor in
the lower one.
description: This attribute contains a short free-form description.

Table 8. Basic types of the ASEC sensors

temperature: Temperature sensor
pressure: Pressure sensor
humidity: Humidity sensor
magnetic_field: Sensor measuring components of the geomagnetic field vector
intensity_charged: Sensor measuring the intensity of the incident charged particle flux
intensity_muon: Sensor measuring the intensity of the incident muon flux
intensity_neutron: Sensor measuring the intensity of the incident neutron flux

For the "Variance" and "Correlation" elements only the ID number and information about the primary
data channels (for which the variance and correlation are calculated) are included. For the "Variance"
element the "targetid" attribute is used, and for "Correlation" the "targetid1" and "targetid2"
attributes are used. Fig. 39 illustrates the data layout description.

<Layout id="layoutid" geometry="geometryid">
<Channel id="1" type="pressure">1</Channel>
<Channel id="2" type="intensity_charged" layer="1" detector="1">2</Channel>
<Channel id="3" type="intensity_charged" layer="1" detector="2">3</Channel>
<Channel id="4" type="intensity_muon" direction="1-3"
energy_min="350">4</Channel>
<Channel id="5" type="intensity_muon" layer="2" detector="2"
energy_min="5000">5</Channel>
<Variance id="6" targetid="2">6</Variance>
<Variance id="7" targetid="3">7</Variance>
<Correlation id="8" targetid1="2" targetid2="3">8</Correlation>
</Layout>
Fig. 39. The figure illustrates the data layout description.

4.7. OPC XML-DA Interface
Within the data acquisition system the data dissemination and control are achieved by means of the
OPC XML-DA protocol. The OPC XML-DA interface is available for all URCS and registry
servers. Each detector fronted by a URCS server is represented in the OPC XML-DA data
space by a separate branch (see Fig. 40). A set of general items as well as the detector dependent
data items are provided within this branch. The following general items are mandatorily registered
within the detector branch: "Description", "Configuration", "Command", "Error" and "Data".

Fig. 40. The figure illustrates the OPC XML-DA server data space. Most of the data represents the real layout of
the Nor-Amberd URCS server and the Nor-Amberd Multidirectional Muon Monitor. The "Registry" branch
represents the data space of the Registry server operating in the main lab in Yerevan.

The "Description" is a read-only item providing the detector description introduced in section
4.6. Besides obtaining the current detector layout, a client application may subscribe to this item in
order to be notified about detector layout changes. The "Configuration" item provides a way of
handling the detector configuration. The current configuration can be obtained using a read request.
Certain operators (the authentication is currently performed based on IP addresses) are
permitted to write a new configuration into the item. A client application may also subscribe to
the item in order to get notified about configuration changes. The "Command" item is used by the
client applications to issue commands. Currently, the only supported command is "Restart",
demanding a driver restart (other operating drivers are not affected). The command syntax is
presented in Fig. 41. However, arbitrary drivers may support other commands; in this case it is
up to the driver implementation to define their syntax. The client application
should be able to get a full list of the supported commands along with their descriptions by
reading the "CommandInfo" item. The "Error" is a read-only item providing information
about the last occurred error concerning the detector operation. Finally, the "Data" branch exposes
the data items providing information from the detector sensors. The items are named based on their
positions in the data vector defined by the current detector layout (see section 4.6 and Fig. 40).

XML Schema Definition


<xs:complexType name="DeviceCommandType"/>
<xs:element
name="DeviceCommand" type="DeviceCommandType"
abstract="true"/>
<xs:element
name="DeviceRestart" type="DeviceCommandType"
substitutionGroup="DeviceCommand"/>
Example
<DeviceRestart/>
Fig. 41. The figure illustrates a sample and an XML schema definition of the device restart command.

Besides the data items published within the "Data" branch, the OPC server may expose the available
data by means of structured branches with descriptive names. In this case the data items
available within these branches are linked together with certain items within the "Data" branch.
That is, the data space is organized in such a way that multiple OPC Items provide access to
the same data. The data is not copied; instead, several OPC Items reference the same data structure
in the memory space.
The Nor-Amberd Multidirectional Muon Monitor data layout is presented in Fig. 40 as an example.
All data are available within the "Data" branch by means of the numbered data items. The
appropriate numbers may be obtained from the detector description as described in section 4.6.4.
Additionally, the "Charged Particles", "High Energy Muons", "Directions" and "Variances"
branches provide the same data in a structured way. The "Charged Particles" branch contains the values
from the sensors which are located in the upper layer of the NAMMM detector and count
incidents of all charged particles. In turn, the "High Energy Muons" branch contains information
from the bottom sensors, which count only muons with energies above 350MeV. The
"Directions" branch provides information about the muon fluxes from different directions.
And the "Variances" branch contains various statistical data.

4.7.1. URCS Server


Besides the detector-specific branches available within the data space of the OPC XML-DA server
fronting a URCS server, a set of items controlling the global behavior of the URCS server is
provided as well. The following items are mandatory: "Configuration", "Command" and "Error".
The "Command" item is used to accept commands affecting server-wide operation. Currently the
restart and configuration renewal commands are supported. The restart command has the same
syntax as its driver-specific analogue, but completely restarts the whole URCS server including all
drivers. The syntax of the configuration renewal command is described in Fig. 42. Upon receipt of
this command the server reloads the configuration file and notifies all operating drivers about the
configuration modification. The "Error" item is used by the clients to obtain information about
the last occurred error concerning the server-wide behavior.

<xs:complexType name="URCSCommandType"/>
<xs:element name="URCSCommand" type="URCSCommandType" abstract="true"/>
<xs:element
name="URCSRestart" type="URCSCommandType"
substitutionGroup="URCSCommand"/>
<xs:element
name="URCSReloadConfiguration" type="URCSCommandType"
substitutionGroup="URCSCommand"/>
Example
<URCSReloadConfiguration/>
Fig. 42. The figure illustrates a sample and XML schema definition of the reload configuration command.

XML Schema Definition


<xs:element name="URCSAnnounce">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="1" name="Name" type="xs:string"/>
<xs:element minOccurs="1" maxOccurs="1" name="Status" type="xs:string"/>
<xs:element minOccurs="1" maxOccurs="1" name="URL" type="xs:anyURI"/>
<xs:element minOccurs="0" maxOccurs="1" name="Description" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Example
<URCSAnnounce>
<Name>Nor-Amberd URCS Server</Name>
<Status>Running</Status>
<URL>http://192.168.0.16/URCS.opc</URL>
<Description>Short server assignment description</Description>
</URCSAnnounce>
Fig. 43. The schema represented in the figure defines the structure of the self-announcement request accepted by the
registry from the URCS servers. A sample of such a query is shown at the bottom of the figure.

4.7.2. Registry Server


The registry server exposes a single branch named "Registry" in the OPC XML-DA data space. The
URCS servers write their self-announcement notifications into the corresponding "Registry" item. The XML Schema
presented in Fig. 43 defines the structure of such a notification. The items corresponding to the
registered servers are available inside the "Registry" branch. Clients obtain information about a
certain URCS server by reading the data from these items. The information includes the
URL at which the server is available and the last re-announcement time, along with the short server
description.

4.7.3. ASEC-Specific Metadata Properties


To simplify automatic processing by the OPC XML-DA clients, part of the complete detector
description is provided to them by means of ASEC-specific meta-data
properties. These properties are listed in Table 9 along with short descriptions.

Table 9. The ASEC-specific metadata properties

scanRate: A standard property defined in the OPC XML-DA specification, specifying the fastest rate
at which the server could obtain the data from the underlying data source. For the ASEC detectors,
this property represents the duration of the counting interval in milliseconds.
engineeringUnits: A standard property defined in the OPC XML-DA specification, specifying the
units of measurement.
lowEU, highEU: Properties defined by the OPC XML-DA specification for analog data in order to
specify the lowest and the highest values likely to be obtained in normal operation. These properties
are used by the ASEC servers to indicate acceptable count rates. If the data is out of the specified
range, the monitoring software should issue a notification for the operator.
description: A short description of the data channel. The property is defined by the OPC XML-DA
specification as well.
asec_sensor_type: An enumeration indicating the basic physical meaning of the data. The acceptable
values are presented in Table 8.
asec_sensor_id: The property specifies the ID number of the enclosed data (the position in the data
vector). For the items registered in the structured branches, this property may be used to find the
associated OPC Item within the "Data" branch.
asec_target_id: This property is used in conjunction with statistical data items providing variance
and correlation information on a certain data channel. It references the primary data channel(s) used
for the variance and correlation calculation. A single ID number pointing to the primary data is used
for variance items, and an array of two ID numbers for the items providing correlation information.
asec_energy_min, asec_energy_max: For the sensors counting the number of incident particles, these
two properties specify the minimal and maximal energy thresholds. Only the particles with energy
within the specified thresholds are counted in order to obtain the value disseminated by the OPC
Item. The energy is specified in megaelectron-volts.
asec_direction: This property provides information on the direction of the counted particles. The ID
numbers of the sensors which registered a coincidence are presented top-down and divided by the
'-' sign. For example, "1-5" means that the counted particles passed through the first sensor in the
upper layer and the fifth sensor in the lower one.

4.7.4. Complex Data
The current ASEC data acquisition system does not require the complex data functionality for the
sensor data dissemination. All data values are represented by floating-point or integer scalar
numbers. The data is disseminated in conformance with the original OPC XML-DA specification,
using only the native SOAP data encoding rules. Therefore, the HDR extension is not used for
the sensor data distribution. However, the XML, scheduling and memory management
optimizations described in Chapters 3-4 are very important, since reliable but slow (533MHz
only) frontend computers are used in the data acquisition system design.
The configuration items, however, are implemented using the functionality based on the OPC
Complex Data specification. As stated above, most of the items controlling the server and
detector behavior are based on structured XML data. The structure of these items is described by
means of the "XMLSchema" type system. The schema descriptions are available under the "CPX"
branch within the server data space in conformance with the OPC Complex Data specification.
The client applications may find the appropriate structure description by means of the metadata
properties defined in the same specification (see details in section 1.6).

4.8. Summary
In this chapter the new data acquisition system of the ASEC detector network was described. The
system is based upon uniform components utilizing high level standards. The standard
Ethernet interface is used to connect the detector electronics to the frontend computers. A single
frontend computer handles all detectors within a research station. Single-type Minbox M100
computers are used everywhere. These computers are based on the VIA Eden architecture and do
not include any unreliable mechanical parts. The low power consumption of the processor allows
the use of passive cooling. Instead of hard drives, Compact Flash is used.
The software controlling the detectors is based on a layered architecture. Therefore, the detectors of
all types are controlled within a single application running multiple drivers. The OS abstraction
library allows software execution on any supported platform and simplifies the implementation of
new ones. The new XML based self-describing format eliminates problems with data
misinterpretations and allows the processing automation. Furthermore, the self-describing format
facilitates the data exchange between related experiments. The XML description is designed so that
the data is not enlarged drastically in size and can still be stored within a fast SQL database.
The OPC XML-DA protocol is used to provide both the data dissemination and the detector control
capabilities. The protocol is based on the standard HTTP transport and an XML data representation
and, therefore, allows constructing highly heterogeneous systems distributed over the world. The
protocol is also supported by a number of commercial control solutions, including LabVIEW by
means of the bindings described in the previous chapter. Thus, the components of the data acquisition
system controlled by LabVIEW applications are uniformly embedded in the data acquisition
environment as well.
The data control and monitoring capabilities are provided by means of the sophisticated web
frontend. The web frontend obtains the data and status information from the frontend servers by
means of the OPC XML-DA protocol and makes it available to the operators running a standard
web browser. Macromedia Flash animations are used for the data visualization, and the Chiba
XForm generator is used to provide sophisticated control interfaces. Such a design eliminates
complex requirements on the operator workplace and opens a way for detector monitoring
from practically anywhere. Even GPRS or WiFi enabled PDA and Smartphone devices can
be used for the control purposes.
Besides the sophisticated control frontend, a set of automatic scripts is executed periodically in
order to monitor the detector status and the current sensor values. The information obtained from the
different detectors is correlated, and under certain conditions a notification message is sent to the
operators.
The data is stored by means of two interchangeable servers working in parallel at the main lab.
These servers periodically query the data from all detectors and store it in the MySQL
database. Further, the stored data is analyzed by the off-line software and made available to
scientists by means of the DVIN (Data Visualization Interactive Network) web interface.

CONCLUSION

The life cycle of experiments in high energy astrophysics usually exceeds ten years. Obviously,
the demands presented to the data acquisition system change continuously over this long time
period. The system hardware and software components must be upgraded to reflect these changes
and, therefore, are every now and then replaced by others targeted at the new environments. The
hardware components, operating systems and web tools develop rapidly. However, well
selected interfaces and protocols have a longer life cycle. If the various system components
communicate by means of such interfaces, the opportunities arise to upgrade subsystems
individually, to collaborate on the software development efforts, to reuse existing software products
and to simplify the software maintenance. The presented work is aimed at developing a
universal middleware solution for distributed data acquisition and control systems. This solution
should be able to facilitate the data exchange for almost any distributed system, independent of the
presented demands on the protocol performance, real-time constraints, meta-data information and
compatibility with the existing commercial software.

In the first chapter several operating data acquisition systems were reviewed. Based on this review
and on considerations arising during the design of a new data acquisition system for the ASEC
(Aragats Space Environmental Center [3, 4]) experiment, the fundamental requirements for the
universal data exchange middleware were stated as follows:
• It should be based on the highest level, industry accepted standards.
• It should be fast enough to sustain the very high data rates used in high energy physics and
astrophysics experiments.
• It should have predictable request-response latencies allowing usage in real-time systems.
• It should support extensible metadata. The metadata should include a predefined set of
records common to most experiments as well as a possibility to define custom experiment-
specific records.
• It should operate in heterogeneous environments. However, special optimizations for
the homogeneous cases are welcome.
• It should transparently work through internet proxies and firewalls.
• The server implementation should allow clients to browse the server address space using
comprehensive filters, provide interfaces to read and adjust the data, issue notifications to the
clients reporting data changes, and implement a data streaming capability.

Then, a full range of existing data exchange solutions was analyzed in the light of the stated
demands. However, no protocol has shown the ability to satisfy all stated demands. The pure RPC
(Remote Procedure Call) solutions lack standardization of the metadata required in
data acquisition and slow control systems. Several widely used slow control systems are limited to
one specific platform and are inoperable in heterogeneous environments. The binary protocols have
difficulties operating in highly distributed environments through internet proxies and firewalls.
The XML based solutions are mostly too slow for real-time systems working with fast data flows.
Nevertheless, the OPC XML-DA (Open Process Control XML Data Access [12]) protocol proposed
by the OPC Foundation comes very close to the stated demands. The OPC XML-DA is an apparent
successor of the de facto standard for the automation industry and control systems, the OPC DA
protocol. It performs very well in highly distributed environments across internet proxies
and firewalls. The protocol support gradually appears in commercial control systems. The major
drawback lies in its XML nature, obviously prohibiting its utilization in high performance real-time
systems. The absence of a mechanism providing access to historical data is another issue to be
addressed if the protocol is to be used as a universal data exchange solution.
To obtain the required performance, a set of self-developed extensions to the original
specification was proposed in the dissertation. The HDR (High Data Rate) extensions are based on the OPC
Complex Data specification [16] and the WS-Attachment technology [48]. Two-way compatibility
with the legacy OPC XML-DA clients and servers is preserved. Therefore, the proposed approach
can still be used in conjunction with a variety of commercial components supporting the OPC
XML-DA protocol, while the subsystems demanding extremely high performance may rely on the
proposed extensions in order to sustain the required data rate. The separation of the data from the protocol
and metadata information, the usage of the server native binary representation for data exchange and
the opportunity to support multiple clients with the newest PGM multicasting standard are the main
concepts of this extension. The query mechanism adopted from the OPC Complex Data
specification allows extending the server functionality with complex queries. A number of standard
queries were defined to help clients in accessing historical data and acquiring the most appropriate
binary data format.
The WS-Attachment technology by means of the DIME (Direct Internet Message Encapsulation
[49]) encapsulation is used to link binary data with the OPC XML-DA protocol. However, the
necessity to transfer data in heterogeneous environments still requires an agreement on a binary
exchange format. In order to achieve the best possible performance, a number of available solutions
were evaluated. However, all of them require some data conversions even in the case of a
homogeneous environment. Assuming that most PCs of the experiment have similar architectures,
such a solution is sub-optimal and limits peak performance. This assumption is rather often true for
modern data acquisition systems. Therefore, to achieve the best possible performance, the NDR (Native
Data Representation) is adopted by the HDR extension for the data representation. The data is
transferred in the server's native representation. Any format conversions are performed by
the receiving side. Such an approach is very effective in homogeneous environments because no
conversion is performed at all. According to [54] the approach is faster than the common data
exchange methods even in the case of a heterogeneous environment.

An effective implementation of the OPC XML-DA HDR server requires handling a set of
automation and optimization tasks. Converting the data between different binary representations as
well as encoding to and from the OPC XML-DA default representation should be performed
dynamically. The data may vary from simple integer and floating point numbers to
sophisticated structures containing multiple vectors, matrices, dynamic arrays and references.
The OPC XML-DA HDR server accomplishes many simultaneous tasks. It executes a number of
drivers communicating with the underlying hardware and reading the data, supports simultaneous
client requests, and maintains the data structures. Even more complications arrive with the requirement
to execute specific tasks in the appropriate time slice. The driver threads interfacing precise
hardware should execute at exactly defined times with microsecond precision. The OPC XML-
DA clients may define the latest time by which the server should respond. In the case that such a client is
controlling a sophisticated control system with multiple components requiring operation
synchronization, a response delay may have fatal consequences. All these requirements should be
considered by the manager scheduling tasks in the OPC XML-DA system. Further, memory
management optimizations are required for long-running, heavily multithreaded server applications
with big amounts of rapidly changing data. This is especially important in the cases where several real-
time threads are running among the lower priority ones with active memory utilization. In particular,
the OPC XML-DA based systems should be optimized, since the real-time threads managing the
underlying hardware coexist with the threads generating XML content in response to the
clients. The real-time task scheduling is another important aspect that should be considered during
the system implementation.
In the third chapter the optimizations and techniques developed in order to handle the described
tasks in the prototype implementation are described. The "data representation" concept allows
automatic conversion between different representations of the data and furnishes an opportunity to
create custom data filters and masks on the server side. Data tweaking with user supplied queries
is another benefit of this approach. The memory management using the "dedicated ring buffer"
technique eliminates most memory allocation delays on heavily loaded servers and makes the
high priority data acquisition threads really independent from the system load. The proposed
scheduling model allows a soft real-time system implementation. The EDZL (Earliest Deadline until
Zero Laxity [83]) based scheduling approach effectively manages task priorities in order to
satisfy the client timing requests. In conjunction with the considered threading model and memory
optimizations, it makes it possible to sustain soft real-time demands even on certain general-
purpose operating systems.

The prototype implementation is released as a multiplatform C framework using a multi-layer
architecture. The framework is based on the OS abstraction interface. All memory management,
thread synchronization, network communication and other OS specific operations are hidden from
the other server components by means of the abstraction interface. The next layer is the OPC data
management library. The library provides the means for the data representation conversions and the OPC
data space management. On top of the library, the OPC XML-DA HDR server is constructed. It
consists of the task scheduler, the data manager, and the backend and frontend interfaces. The frontend
interface accepts connections from the clients and prepares the data in the user requested format in
accordance with the OPC XML-DA HDR specification. The backend interface maintains the driver
plugins. These plugins are used to obtain the data from the underlying hardware and submit it to
the server data space. Besides the data plugins, the CORBA interface may be used to publish the
data.
Currently, the framework is tested under the Linux and Windows operating systems. However, porting
it to other POSIX compliant systems should be an easy task. Additionally, bindings for the
National Instruments LabVIEW were developed to provide the possibility of using the OPC XML-DA
interface in LabVIEW based control systems. Both the Windows and Linux versions are supported.
In order to support LabVIEW Real-Time (running on the FieldPoint and PXI devices), the
framework was ported to Phar Lap (a Windows NT based real-time operating system) [18].
Currently the framework implements the OPC XML-DA specification and the core of the HDR
specification. Hence, it is able to support data in both the XML compatibility and the NDR binary formats.
The optional HDR extensions, like the PGM multicast, XML Security and others, are not
available in the prototype implementation yet. Besides the OPC XML-DA interface, the server provides
a standard web interface. By means of a standard web browser, the clients may browse the server data
space, monitor and adjust item values. Sophisticated web-based control interfaces may be
designed using the XML data representation by means of the Chiba XForms implementation [91].

To evaluate the prototype performance, the developed system was compared with
the omniORB and gSOAP solutions. The omniORB is one of the fastest CORBA implementations
according to [92]. The gSOAP is the most featured SOAP toolkit available under Linux. The results
show that XML based protocols can be used to transfer data in a very efficient way if the proposed
extensions are added. The implementation achieves the performance of pure binary CORBA
systems while at the same time providing all the XML metadata considered by the OPC XML-DA
specification. Even in the legacy OPC XML-DA mode the prototype implementation is
considerably faster than the competing SOAP solution.

Finally, the developed approach was used to provide the data dissemination and control capabilities
in the new ASEC data acquisition system. The data acquisition system is based upon uniform
components utilizing high level standards. The standard Ethernet interface is used to connect the
detector electronics to the frontend computers. A single frontend computer handles all detectors
within a research station. Single-type Minbox M100 computers are used for all detectors.
These computers are based on the VIA Eden architecture and do not include any unreliable
mechanical parts. The low power consumption of the processor allows the use of passive
cooling. Instead of hard drives, Compact Flash memory cards are used.
The detector-governing software is based on a layered architecture, so detectors of all types are controlled within a single application running multiple drivers. The OS abstraction library allows the software to run on any supported platform and simplifies porting to new ones. The new XML-based self-describing format eliminates problems with data misinterpretation and allows processing automation. Furthermore, the self-describing format facilitates the data exchange between related experiments. The XML description is designed so that the data size is not enlarged drastically and the data can still be stored within a fast SQL database.
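
As a purely illustrative sketch (the element and attribute names below are invented and need not match the actual ASEC schema), such a self-describing record could tag each channel with its name, units and type while keeping the numeric payload compact:

    <!-- hypothetical example record; names are illustrative only -->
    <record detector="neutron-monitor" time="2005-06-01T12:00:00Z">
      <channel name="counts"   units="1/min" type="uint32">5412</channel>
      <channel name="pressure" units="mbar"  type="float">846.2</channel>
    </record>

A flat structure of this kind maps naturally onto SQL table rows while remaining self-describing.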
The data control and monitoring capabilities are provided by means of a sophisticated web frontend. The web frontend obtains the data and status information from the frontend servers by means of the OPC XML-DA protocol and makes it available to operators running a standard web browser. Macromedia Flash animations are used for the data visualization, and the Chiba XForms generator is used to provide sophisticated control interfaces. Such a design eliminates complex requirements on the operator workplace and allows detector monitoring from practically anywhere; even GPRS or WiFi enabled PDA and smartphone devices can be used for control purposes.
Besides the sophisticated control frontend, a set of automatic scripts is executed periodically in order to monitor the detector status and current sensor values. This communication is performed by means of the OPC XML-DA protocol as well. The information obtained from the different detectors is correlated, and under certain conditions a notification message is sent to the operators.
The data is stored by means of two interchangeable servers working in parallel at the main lab. These servers periodically inquire the data from all detectors and store it in a MySQL database. The stored data is then analyzed by the off-line software and made available to scientists by means of the DVIN (Data Visualization Interactive Network) web interface.

APPENDIX A

THE XML BENCHMARK

A.1. Introduction
XML (eXtensible Markup Language [104]), proposed by the World Wide Web Consortium, has brought a lot of new ideas and abilities to the field of information management systems. XML allows communities to describe new grammars for data structures that meet their needs more efficiently. In the last few years this ability has given birth to sub-dialects which spark numerous new tendencies across the whole spectrum of scientific life. Several parts of modern data acquisition systems should be re-examined in the light of XML.
XML allows the automation of many standard tasks. A ready-made, expressive syntax combined with XSD (XML Schema Definition [105]) means that developers do not have to worry about creating new syntaxes, parsers and validators for configuration and state files. XML RPC (XML Remote Procedure Call [106]), SOAP (Simple Object Access Protocol [107]) and OPC XML-DA (Open Process Control XML Data Access [108]) provide different approaches to XML-based inter-machine communication. Together with UDDI (Universal Description, Discovery and Integration Language [109]) they form a complete distributed computing environment. XSIL (Extensible Scientific Interchange Language [110]) and XDF (Extensible Data Format [111]) are used to represent scientific data in a standard way. All major databases now offer some degree of support for XML storage and querying. The XML Encryption [61] and XML Signature [60] specifications cover the problems of data security and consistency. XSL (eXtensible Stylesheet Language [112]) and XSLT (XSL Transformation [113]), accompanied by the XML ability to separate data structure from presentation, give invaluable capabilities in document management and preparation. MathML (Mathematical Markup Language [114]), CML (Chemical Markup Language [115]) and SVG (Scalable Vector Graphics [116]) make it possible to use the whole range of scientific formulas and illustrations. XUL (XML User Interface Language [117]) provides an opportunity to create rich user interfaces for cross-platform applications.
Many of these technologies can simplify the design of data acquisition and control systems. The SOAP protocol is used to provide data dissemination capabilities in heterogeneous environments. Various XML representations are often used by DAQ systems to store the data in an easily accessible and self-describing way. Data consistency is achieved using the XML Signature technology, while XML Encryption protects the data from unauthorized access. XACL (XML Access Control Language [118]) is used to specify security policies providing client-dependent access to XML documents and services. XSL transformations are often used for creating dynamic web interfaces controlling system behavior.
As is easy to see, a lot of subsystems of modern data acquisition and control environments can be implemented using XML technologies. Richly featured XML solutions are increasingly used in newly developed DDAS systems. Therefore, the performance, reliability and standards compliance of the chosen XML solutions will increasingly affect the overall system behavior. Because of the wide availability of XML tools developed by a range of commercial vendors and open source communities, a thorough investigation of the available solutions becomes an important task in DDAS design.

Currently, many XML benchmarking projects are available on the net; XMark [119], the Sosnoski XML Benchmark [120] and Piccolo [121] are the best known among them. Unfortunately, they do not provide a complex measurement of XML environments: mostly one, or at most two, different aspects of XML processing are evaluated, and a predefined sequence of XML data is used for the comparisons. Moreover, they are mostly oriented towards the Java world, which is of doubtful use in real-time applications.
The implementation of a fast and reliable data exchange solution requires a full-featured multiplatform XML toolkit which is able to process different types and sizes of XML files at the highest speeds using a variety of modern XML technologies. Therefore, in the current appendix the major multi-platform XML implementations are compared in their abilities to handle the tasks posed by modern data acquisition systems. A complete comparison of the supported features as well as a thorough performance evaluation is provided. To get the most precise results, the standard behavior of an OPC XML-DA server is emulated using specially developed benchmarking software.

A.2. DDAS Requirements


Before starting real investigations it is necessary to clearly formulate the primary and secondary demands posed by the system design. Based on the data exchange solution proposed in Chapter 1, the following requirements can be stated as the most important. The first fundamental demand is the ability to process small incoming OPC XML-DA HDR messages at a very high speed. XML Schema is considered for validating the consistency of incoming messages. The DOM (Document Object Model [122-124]) Level 1-3 specifications define the API which should be used for response message composition and serialization. For legacy clients the size of these messages can be very big, since the SOAP encoding rules are used for the data exchange in this case (see section 1.4.15). The XML Encryption and Signature technologies provide a standard way to implement the security approach (see section 2.7.6).
The HDR Binary Type System (see section 2.4) requires XML processing functionality, along with XML Schema validation and DOM tree maintenance capabilities. The XMLSchema Type System depends completely on the XSD definitions (see section 1.6.1). XSL Transformations are considered by the system for converting between different XML representations. XPath [125] is the default syntax for component part extraction queries (see section 2.9.8). Finally, XSL transformations, XForms (XML Forms [126]) and XML Schema are required to present the data to the end users.

The following list summarizes the most important XML functionalities:


• XML message parsing
• XML message validation
• The DOM tree construction and manipulation
• XML Security
• XSLT and XPath queries

It should be stated as well that the toolkit should work on most available platforms, including special real-time solutions. It should be as fast as possible and should be able to work with huge XML documents. The library should also run on embedded hardware with a highly constrained amount of memory and system performance. The ability to work in a multithreaded environment is essential as well.

A.3. XML Libraries


A thorough search for available multiplatform solutions has turned up a number of libraries supporting different XML technologies. However, all these libraries can be grouped into a few toolkits providing a full or almost full set of the required capabilities. Because of the high performance requirements, only C/C++ and Java-based solutions are considered.
The following sub-sections describe these toolkits. The toolkits are reviewed in the light of the stated system requirements. Information about software licensing terms, supported operating systems and implemented technologies is provided.
The precise information about the supported XML specifications is presented in Table 10. The comparison was performed in February 2004; all provided information reflects the state of the toolkits at that time.

Table 10. The specification compliance comparison of available multiplatform XML toolkits

    Specification                    Expat       Gnome        Apache      Oracle     JAXP        QT
    Thread Safety                    Yes         Yes          Yes         Yes        Yes         Yes
    Namespaces                       Yes         Yes          Yes         Yes        Yes         Yes
    DTD callbacks                    Yes         Yes          Yes         Yes        No
    XML Schema                       No          1.0, Partly  1.0         1.0        Yes         No
    Relax NG                         No          Partly       No          No         No          No
    SAX                              2           Minimal      2           1          2           2
    DOM                              L1, L2, L3  L1, L2       L1, L2, L3  L1, L2     L1, L2, L3  L1, L2
    XSLT                             Yes         Yes          Yes         Yes        Yes         No
    EXSLT                            No          Yes          Yes         No         Yes         No
    XPath                            1.0         1.0          2.0         1.0        1.0         No
    Canonical XML                    1.0         1.0          No          No         1.0         No
    Signature                        No          Yes          Partial     No         Partial     No
    Signature XPath Filter           No          Yes          No          No         Partial     No
    Signature Decryption Transform   No          Yes          No          No         No          No
    Encryption                       No          Yes          No          No         Partial     No
    Embedded Network Clients         No          FTP, HTTP    FTP, HTTP   FTP, HTTP  No          No
    SOAP                             1.1         1.1          No          No         1.1         No
    SOAP with Attachments            No          Planned      No          No         Yes         No
    WSDL                             Planned     Yes          No          No         Yes         No
    UDDI                             No          Yes          No          No         Yes         No
    ebXML                            No          No           No          No         Yes         No

A.3.1. Apache XML


The Apache XML toolkit is developed by the Apache Foundation. It consists of three libraries: Apache Xerces for C++, Apache Xalan for C++ and Apache XML Security. Apache Xerces for C++ is a validating XML parser; besides parsing capabilities it supports a full set of DOM manipulations and XML Schema validation. The XPath language and XSL transformations are made available by the Apache Xalan for C++ library. Apache XML Security for C++ aims to provide support for XML Security as defined by the XML Encryption and Signature specifications; however, it is not fully functional at the moment.
Additionally, the Pathan library is available. It provides support for XPath expressions conforming to the XPath v2.0 specification.

The Apache XML toolkit is distributed under the free Apache license. The documentation claims support of the following operating systems: Windows NT, Linux, BSD, Solaris, MacOS X, HP-UX, IRIX, AIX, OS/2, UnixWare, PTX, AS/400, and OS/390. The following versions of the described libraries were used for the performance evaluation: Xerces 2.4.0, Xalan 1.7.0, and an XML Security CVS snapshot from September 2003. The total library size is 7 MB.

A.3.2. Gnome XML


The Gnome XML toolkit is a set of libraries developed by various people for the Gnome desktop environment. The core of the toolkit is the Gnome XML Library, also known as LibXML. It provides XML parsing capabilities, DOM-style tree manipulations, and XPath queries. An XML Schema implementation has been started; however, it is not fully functional at the moment. The GDome library provides DOM-compliant tree manipulations on top of the LibXML library. The XSL transformations are implemented by LibXSLT. The XMLSec library provides complete support for XML Encryption and Signatures. Additionally, a SOAP protocol implementation is provided by the Soup library.
The whole Gnome XML toolkit is distributed under the free MIT license, with the exception of the GDome library, which is available under the LGPL license. The documentation claims that the following operating systems are supported: Windows, Linux, QNX, BSD, Solaris, MacOS X, OpenVMS, and MVS. Bindings for the Python, PHP, Perl, Tcl/Tk, Ruby, Delphi and Kylix programming languages are available. The following versions of the described libraries were used for the performance evaluation: LibXML 2.6.5, LibXSLT 1.1.2, GDome 0.8.1, and XMLSec 1.2.4. The total library size is 2 MB.

A.3.3. Oracle XML


The Oracle XDK (XML Development Kit) is developed by the Oracle Corp. to provide XML capabilities in their database solutions. The Oracle XDK is a closed-source library. It is available for Windows NT, Linux, Solaris and HP-UX using the C, C++ and Java programming languages. It implements XML parsing, Schema validation, XPath queries and XSL transformations.
The Oracle XDK is distributed under the Oracle OTN license. The XDK 9i version 9.2.0.6.0 was used for the performance evaluation. The overall library size is 4.5 MB.

A.3.4. Expat
Expat is a simple XML parser widely used in various open-source software, including the Mozilla framework. Although it implements only parsing capabilities, there are several libraries on top of Expat implementing the DOM, XPath and XSLT specifications. The benchmark evaluates three of them: CenterPoint XML (also known as CSLXML), Sablotron and Arabica.
Expat is distributed under the free MIT license; the GPL is used for Sablotron, the Netscape license for CenterPoint XML, and the BSD license for Arabica. Expat works on most available platforms including Windows, Linux, QNX, LynxOS, BSD, Solaris, MacOS X, HP-UX, IRIX, AIX, OS/2, OpenUnix, and OpenVMS. Bindings for the Python, PHP, Perl, Tcl/Tk, Ruby and Delphi programming languages are available.
The following versions were used for the performance evaluation: Expat 1.95.6, Sablotron 1.0.1, CenterPoint XML 2.1.7, and Arabica Jan04. The total size of all libraries providing the maximal set of capabilities is 1 MB.

A.3.5. Sun JAXP


The Sun JAXP (Java API for XML Processing) is based on the Apache XML for Java toolkit. Currently, it is the most widely used Java solution. Sun JAXP provides XML parsing, Schema validation and XSL transformation capabilities. XML Encryption and Signature support is provided by Apache XML Security for Java. Additionally, the Apache SOAP library implements the SOAP specification. Sun J2SE 1.4.2 with JAXP 1.2 and Apache XML Security for Java 1.0.5D2 were used for the performance evaluation.

A.3.6. Other XML Libraries


Additionally, the RXP Parser 1.2.8 and the Trolltech QT 3.3.0 XML module were used in the benchmarks.

A.4. Performance Evaluation


To get the most accurate performance comparison results, small applications implementing the various required XML capabilities were developed for all described toolkits. Three different types of XML data are used together with these applications for the performance evaluation. A big RDF (Resource Description Framework) file (approximately 10 MB) is used to estimate the toolkits' ability to work with huge XML documents. The "opcgen" XML generator emulates the standard behavior of an OPC XML-DA server working in the High Data Rate mode. Small OPC XML-DA messages are generated sequentially in order to test the competitors' abilities in processing OPC XML-DA traffic. The "xmlgen" XML generator randomly produces common-sense XML documents filled with arbitrary values. This generator is used to evaluate the toolkits' performance in maintaining XML documents with a simple structure; configuration files and HDR Binary type descriptions are examples of such data.

The following procedure is used for all tests and data types. Each of the implemented applications is executed a certain number of times in a loop, so the effective running time is measured with high precision. Then the following normalization approach is applied in order to allow representing several benchmarks with different properties (and, as a result, different scales) on a single chart. A single XML toolkit is selected, and all obtained times are divided by the time obtained for this toolkit. The calculated value is called the normalized time and is used on the provided charts to represent the amount of computation required by the toolkits to accomplish a specific task. The Gnome XML is usually used for that purpose. The only exception is the XML Schema validation benchmark, since not all runs are completed successfully by the Gnome XML toolkit in this test. Therefore, the Apache XML toolkit is selected as the reference for the schema validation benchmark.
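
Expressed as a formula, with t(T, b) denoting the measured running time of toolkit T on benchmark b and R denoting the reference toolkit of that benchmark:

    normalized time(T, b) = t(T, b) / t(R, b)

A normalized time of 2 thus means that a toolkit needs twice the computation of the reference toolkit for the same task.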

The described applications were executed on a P4 platform (PIV 2.2 GHz / i850E / 1 GB PC800 RDRAM) running Mandrake Linux 9.2 (Kernel 2.4.22, GNU C Library 2.3.2, GNU C Compiler 3.3.1, Sun J2SE 1.4.2) in single-user mode to obtain the performance information. The tests were executed in a multi-threaded environment, as normally occurs in server implementations supporting multiple simultaneous clients. However, the observations remain true in the single-threaded case as well.

It should be noted that on a single-CPU machine, parsing 50 requests in each of 10 concurrent threads will be slower than parsing 500 requests in a single thread, due to the context switches that must be performed in the multithreaded case. The load is distributed if the server has multiple processors. However, the necessity to maintain access sharing between multiple threads allocating memory from a single memory heap will cause a significant slowdown in any case.

A.4.1. XML Parsing


The parsing benchmark evaluates the time required to parse an XML document and construct an in-memory representation of its information. Since the parsing functionality is required by all other XML related tasks, the behavior measured by this test is very important and will drastically affect overall system performance.
There are two different parsing approaches available: SAX (Simple API for XML [127]) and DOM (Document Object Model [122]). SAX is a set of streaming interfaces that decompose an XML document into a linear sequence of well-known method calls. DOM is a set of traversal interfaces that decompose an XML document into a hierarchical tree of generic objects/nodes. DOM is best suited to applications that need to retain an XML document in memory for further traversal and manipulation. SAX-based applications typically have no need to retain a generic view of the XML in memory and can parse XML documents sequentially. This gives SAX an advantage over the DOM approach for large XML documents, since a DOM parser has to hold the whole tree in system memory.
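
The difference between the two approaches can be sketched with the LibXML API used later in this appendix; the snippet below is a simplified illustration of the measured operations, not the benchmark code itself:

    #include <string.h>
    #include <libxml/parser.h>
    #include <libxml/tree.h>

    /* SAX style: the parser reports events through callbacks and retains
       no document tree, so memory usage stays flat even for huge inputs. */
    static void on_start_element(void *ctx, const xmlChar *name,
                                 const xmlChar **attrs)
    {
        (void)ctx; (void)name; (void)attrs;   /* inspect the element here */
    }

    static void parse_sax(const char *buf, int len)
    {
        xmlSAXHandler handler;
        memset(&handler, 0, sizeof(handler));
        handler.startElement = on_start_element;
        xmlSAXUserParseMemory(&handler, NULL, buf, len);
    }

    /* DOM style: the whole tree is materialized in memory and can be
       traversed and modified after parsing has finished. */
    static void parse_dom(const char *buf, int len)
    {
        xmlDocPtr doc = xmlParseMemory(buf, len);
        if (doc != NULL) {
            xmlNodePtr root = xmlDocGetRootElement(doc);
            (void)root;                /* traverse / manipulate the tree */
            xmlFreeDoc(doc);
        }
    }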

The benchmark measures the exact parsing time. The document is parsed from a memory buffer, so disk I/O does not affect the results. Document validation is not performed here and is evaluated in a separate test. As can be seen from Fig. 44, the Gnome XML library performs very well in both the SAX and DOM benchmarks. Several modes are used for the Gnome DOM parsing evaluation. The original LibXML uses fast DOM-style hierarchical trees; however, full DOM compliance is not provided. The "LibXML + GDome" result indicates the parsing performance if the standard DOM implementation is used. The "LibXML (no pthreads)" result is achieved in a single-threaded environment. Finally, "LibXML Push" represents the performance when a push parser is used (the XML data is streamed into the parser piece by piece).
Further analysis of Fig. 44 indicates that Expat is the fastest for SAX processing. However, all Expat-based DOM engines have rather poor performance. All other toolkits perform roughly equivalently, with the exception of the slowest one, the Trolltech QT library.

Fig. 44. The XML document parsing performance. The left diagram compares the SAX-based parsers; the right one shows the DOM-based parsers.

A.4.2. DOM Manipulations


This benchmark is used to estimate the performance of various DOM manipulation tasks. The test applications create a DOM document node by node. Afterwards, they move some nodes from one position in the tree to another and finally serialize the resulting tree into an XML string. This functionality is mainly used for response message construction. Good performance is especially important when supplying HDR-incapable clients with data from intensive sources. The HDR Binary Type System is another system component depending on the DOM functionality.
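
A simplified LibXML version of the benchmarked sequence (tree creation, node relocation, serialization) might look as follows; again, this is an illustration rather than the actual test application:

    #include <libxml/tree.h>

    static void dom_roundtrip(void)
    {
        xmlDocPtr  doc  = xmlNewDoc(BAD_CAST "1.0");
        xmlNodePtr root = xmlNewNode(NULL, BAD_CAST "items");
        xmlDocSetRootElement(doc, root);

        /* build the tree node by node */
        xmlNodePtr group = xmlNewChild(root, NULL, BAD_CAST "group", NULL);
        xmlNodePtr item  = xmlNewChild(root, NULL, BAD_CAST "item",
                                       BAD_CAST "3.14");

        /* move a node to another position in the tree */
        xmlUnlinkNode(item);
        xmlAddChild(group, item);

        /* serialize the resulting tree into an XML string */
        xmlChar *out = NULL;
        int      len = 0;
        xmlDocDumpMemory(doc, &out, &len);

        xmlFree(out);
        xmlFreeDoc(doc);
    }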
As can be seen from Fig. 45, the Oracle XDK achieves the best performance in this test. LibXML is approximately two times slower, and the other tested toolkits are slower still. Additionally, it should be noted that CenterPoint XML as well as both tested Java-based toolkits fail to create the biggest required XML file. While for the Java-based toolkits the problem can most likely be solved by tuning the JVM (Java Virtual Machine) memory configuration, CenterPoint XML crashes because of an implementation error.

Fig. 45. The DOM manipulation performance. DOM creation, alteration and serialization operations are used in the benchmark.

As in the DOM parsing benchmark, the Gnome XML is tested using several modes. The original LibXML uses fast DOM-style hierarchical trees; however, full compatibility with the DOM API is not ensured. The "LibXML + GDome" result indicates the performance if the standard DOM is used. It can be observed from the figure that the standard-compliant DOM is more than 3 times slower than the native LibXML trees. Finally, the "LibXML (no pthreads)" results are achieved in a single-threaded environment.

A.4.3. XML Schema Validation


The XSD schema benchmark evaluates the time required to validate an XML document against an XSD schema. For this benchmark both the "xmlgen" and "opcgen" data generators are used. The "xmlgen" generates simple-structured XML documents, which are described by the simplest schema. The "opcgen" generates standard OPC XML-DA messages. These messages are described by a very complicated schema with a lot of rules, restrictions, extensions and other policies defined by the XSD specification. The schema validation functionality is mainly required by the data acquisition system for message and configuration consistency validation.
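
With LibXML, the separate-pass variant of this task reduces to a few calls; the sketch below omits all error handling for brevity and is not the benchmark code itself:

    #include <libxml/parser.h>
    #include <libxml/xmlschemas.h>

    /* Returns 0 if 'doc' is valid against the schema in 'xsd_path'. */
    static int validate(xmlDocPtr doc, const char *xsd_path)
    {
        xmlSchemaParserCtxtPtr pctx   = xmlSchemaNewParserCtxt(xsd_path);
        xmlSchemaPtr           schema = xmlSchemaParse(pctx);
        xmlSchemaValidCtxtPtr  vctx   = xmlSchemaNewValidCtxt(schema);

        int ret = xmlSchemaValidateDoc(vctx, doc);   /* 0 == valid */

        xmlSchemaFreeValidCtxt(vctx);
        xmlSchemaFree(schema);
        xmlSchemaFreeParserCtxt(pctx);
        return ret;
    }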

As can be seen from Fig. 46, the Gnome XML Library is the fastest one, with the Oracle XDK a little slower. However, it should be noted that the Gnome XML schema implementation is still incomplete and does not perform all the required checks. Therefore, the Oracle XDK should be considered the best solution in this area.

Fig. 46. The schema-based XML validation performance.

A.4.4. XML Transformation


The XSL transformation benchmark measures the time required to transform documents between different representations. The XSLT style sheets provide the instructions describing how the conversion between these representations should be performed. The XSLT capability is considered by the HDR specification for providing an automatic XML representation management mechanism. These representations have a mostly simple structure; therefore, only "xmlgen" is used to generate the XML documents for this benchmark.
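
With the LibXSLT library from the Gnome toolkit, the measured operation follows roughly the pattern below (a simplified sketch with error handling omitted):

    #include <libxml/parser.h>
    #include <libxslt/xslt.h>
    #include <libxslt/transform.h>
    #include <libxslt/xsltutils.h>

    /* Transform 'doc' with the stylesheet from 'xsl_path'; the caller
       owns the returned string and must xmlFree() it. */
    static xmlChar *transform(xmlDocPtr doc, const char *xsl_path)
    {
        xsltStylesheetPtr style  = xsltParseStylesheetFile(BAD_CAST xsl_path);
        xmlDocPtr         result = xsltApplyStylesheet(style, doc, NULL);

        xmlChar *out = NULL;
        int      len = 0;
        xsltSaveResultToString(&out, &len, result, style);

        xmlFreeDoc(result);
        xsltFreeStylesheet(style);
        return out;
    }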

Fig. 47. Performance evaluation results for the XSL transformations.

The exact transformation time is measured by this test; the parsing times of the XML and XSLT documents are not included in the final results. As can be seen from the benchmark results presented in Fig. 47, the Gnome XML toolkit is again the fastest. Xalan, the Oracle XDK and JAXP lag considerably behind, and the Expat-based Sablotron implementation is extremely slow.

A.4.5. XML Security


The last benchmark investigates the performance of the security subsystem. This benchmark consists of two independent tests. The first one measures the time required for message encryption and decryption. The second one measures the time required to add a digital signature and to check message integrity by validating an accompanying signature.
At the moment, XML Security is fully supported only by the Gnome XML toolkit. Although alpha implementations are available for Apache and JAXP, the results presented in Fig. 48 demonstrate that these implementations are still unusable.
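
As an illustration, signature verification with the XMLSec library from the Gnome toolkit follows roughly the pattern below. The sketch assumes a PEM-encoded public key, omits error handling, and is not the benchmark code:

    #include <libxml/tree.h>
    #include <xmlsec/xmlsec.h>
    #include <xmlsec/xmldsig.h>
    #include <xmlsec/crypto.h>

    /* Returns 1 if the <dsig:Signature> inside 'doc' verifies against the
       PEM public key in 'key_file'.  xmlSecInit(), xmlSecCryptoAppInit(NULL)
       and xmlSecCryptoInit() are assumed to have been called at startup. */
    static int verify_signature(xmlDocPtr doc, const char *key_file)
    {
        xmlNodePtr node = xmlSecFindNode(xmlDocGetRootElement(doc),
                                         xmlSecNodeSignature, xmlSecDSigNs);
        xmlSecDSigCtxPtr ctx = xmlSecDSigCtxCreate(NULL);
        ctx->signKey = xmlSecCryptoAppKeyLoad(key_file,
                                              xmlSecKeyDataFormatPem,
                                              NULL, NULL, NULL);
        int ok = (xmlSecDSigCtxVerify(ctx, node) >= 0) &&
                 (ctx->status == xmlSecDSigStatusSucceeded);
        xmlSecDSigCtxDestroy(ctx);
        return ok;
    }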

Fig. 48. The performance of the security operations.

A.4.6. Memory Consumption


The final test is used to compare the memory requirements posed by the competing toolkits. Optimal memory utilization is very important when the server is running on an embedded system with a restricted amount of memory. Fig. 49 illustrates the amount of memory required to hold a DOM representation of a half-megabyte XML document. The results indicate that the Gnome XML and Oracle XDK care about memory utilization more than the other solutions. Apache Xerces performs a bit worse. As expected, the Java-based solutions use a lot of memory.

Fig. 49. The amount of memory (in kilobytes) required by the considered XML toolkits (Expat + CSLXML, Expat + Sablotron, LibXML, Xerces, QT XML Module, Oracle XDK for C, Oracle XDK for Java, Sun XML Pack) to represent a half-megabyte XML document.

A.4.7. XML Schema and XSL Transformation Behavior


One important note should be made concerning the toolkits' behavior in the XML Schema validation and XSL Transformation benchmarks. The Apache Xerces and Sun JAXP parsers take a different approach than the Oracle XDK. While the latter performs the XSL transformation and XSD validation as a separate task over already parsed documents, Xerces and JAXP can perform them only in conjunction with parsing. This approach has good and bad sides. On the one hand, it makes it possible to save some memory; hence, this approach is extremely good for processing huge XML documents. On the other hand, when processing bulks of small XML documents, this approach incurs an extra speed penalty if the original DOM representation is required for other tasks. The Gnome XML Library is able to use both of the mentioned approaches for schema validation, and only the second one for XSL transformation.
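
For completeness, the streaming counterpart to the separate-pass validation sketched in section A.4.3 can be illustrated as follows (a sketch only; whether the streaming validator is available depends on the LibXML release):

    #include <libxml/xmlIO.h>
    #include <libxml/xmlschemas.h>

    /* Combined pass: validate while parsing, without retaining a DOM,
       which saves memory on huge documents (cf. xmlSchemaValidateDoc,
       which validates an already parsed tree). */
    static int validate_streaming(xmlSchemaPtr schema, const char *path)
    {
        xmlSchemaValidCtxtPtr vctx = xmlSchemaNewValidCtxt(schema);
        xmlParserInputBufferPtr in =
            xmlParserInputBufferCreateFilename(path, XML_CHAR_ENCODING_NONE);
        int ret = xmlSchemaValidateStream(vctx, in, XML_CHAR_ENCODING_NONE,
                                          NULL, NULL);   /* 0 == valid */
        xmlSchemaFreeValidCtxt(vctx);
        return ret;
    }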

A.5. Summary
Many aspects of the proposed protocol depend on the various XML specifications. XML document processing and construction, consistency checking, encryption and representation alteration form only a short list of the required capabilities. The proposed system design requires the underlying XML library to provide a rich set of features, a high level of reliability and unbeatable performance. Therefore, an extensive comparison of the XML libraries' abilities to sustain these requirements has been performed in this appendix.
From the feature viewpoint the Gnome XML and Apache XML toolkits are the clear winners. Of the required facilities, only the XML Schema support is not fully complete in the Gnome XML, while the XML Security is missing in Apache XML. However, XML Security and schema validation are not primary requirements of the developed system; they are demanded only for the implementation of the optional extensions. Moreover, the rapid development of both toolkits gives hope of fully functional implementations soon.
In most performance evaluations the Gnome XML and Oracle XDK are the best, and in memory consumption the Gnome XML is the best again. Therefore, the Gnome XML toolkit is selected to provide the base XML support for the server implementation.

BIBLIOGRAPHY

[1] J. Vermeulen et al., "ATLAS DataFlow: the Read-Out Subsystem, Results from Trigger
and Data-Acquisition System Testbed Studies and from Modeling," presented at 14th IEEE-
NPSS Real-Time Conference, Stockholm, Sweden, 2005.
[2] Borgmeier, N. Komin, M. d. Naurois, S. Schlenker, U. Schwanke, and C. Stegmann, "The
Central Data Acquisition System of the H.E.S.S. Telescope System," presented at 28th
ICRC, Tsukuba, 2003.
[3] A. Chilingarian et al., "Monitoring and Forecasting of the Geomagnetic and Radiation
Storms During the 23rd Solar Cycle: Aragats Regional Space Weather Center," Advances
in Space Research, vol. 31, pp. 861-865, 2003.
[4] A. Chilingarian et al., "Correlated measurements of secondary cosmic ray fluxes by the
Aragats Space-Environmental Center monitors," Nuclear Instruments & Methods in Physics
Research, vol. A543, pp. 483-496, 2005.
[5] A. Chilingarian, "Report on the United Nations/European Space Agency/National
Aeronautics and Space Administration of the United States of America Workshop on the
International Heliophysical Year 2007," Abu Dhabi, UAE 2005.
[6] Microsoft. (1997). DCOM Architecture. Available: http://msdn.microsoft.com
[7] OMG. (2004). Common Object Request Broker Architecture: Core Specification. Available:
http://www.omg.org
[8] Sun Microsystems. Java Remote Method Invocation. Available:
http://java.sun.com/j2se/1.5/pdf/rmi-spec-1.5.0.pdf
[9] W3C. (2003). SOAP Version 1.2 Part 0: Primer. Available:
http://www.w3.org/TR/2003/REC-soap12-part0-20030624/
[10] B. Barney. (2006). Message Passing Interface. Available:
http://www.llnl.gov/computing/tutorials/mpi/
[11] OPC Foundation. (2003). OPC Data Access 3.0 Specification. Available:
http://opcfoundation.org
[12] OPC Foundation. (2004). OPC XML-DA 1.01 Specification. Available:
http://opcfoundation.org
[13] W. Eppler, A. Beglarian, S. Chilingaryan, S. Kelly, V. Hartmann, and H. Gemmeke, "New
Control System Aspects for Physical Experiments," IEEE Transactions on Nuclear Science,
vol. 51, no. 3, pp. 482-488, 2004.

[14] W. Hoschek and et al., "Data Management in an International Data Grid Project," presented
at IEEE/ACM Workshop on Grid Computing, Bangalore, 2000.
[15] S. Chilingaryan and W. Eppler, "Universal Data Exchange Protocol based on OPC XML,"
presented at 2nd Workshop on Information Technology and Its Disciplines, Kish, Iran, 2004.
[16] OPC Foundation. (2003). OPC Complex Data 1.00 Specification. Available:
http://opcfoundation.org
[17] S. Chilingaryan and W. Eppler, "High Speed Data Exchange Protocol for Modern
Distributed Data Acquisition Systems based on OPC XML-DA," presented at 14th IEEE
NPSS Real Time Conference, Stockholm, Sweden, 2005.
[18] C. Clark, LabVIEW Digital Signal Processing: McGraw-Hill Professional, 2005.
[19] National Instruments. Data Acquisition Fundamentals. Available: http://zone.ni.com
[20] National Instruments. Deterministic Data Streaming in Distributed Data Acquisition
Systems. Available: http://zone.ni.com
[21] X. Grave, R. Canedo, J.-F. Clavelin, S. Du, and E. Legay, "NARVAL a Modular Distributed
Data Acquisition System with Ada 95 and RTAI," presented at 14th IEEE-NPSS Real-Time
Conference, Stockholm, Sweden, 2005.
[22] G. Hegyesi, J. Imrek, G. Kalinka, J. Molnar, D. Novak, J. Vegh, L. Balkay, M. Emri, S. A.
Kis, G. Molnar, L. Tron, I. Valastyan, I. Bagamery, T. Bukki, S. Rozsa, Z. Szabo, and A.
Kerek, "Ethernet Based Distributed Data Acquisition System for a Small Animal PET,"
presented at 14th IEEE-NPSS Real-Time Conference, Stockholm, Sweden, 2005.
[23] National Instruments. (2004). LabView Real-Time Module User Manual. Available:
http://www.ni.com/pdf/manuals/322154e.pdf
[24] Aeolean Inc. (2002). Introduction to Linux for Real-Time Control. Available:
http://aeolean.com/html/RealTimeLinux/RealTimeLinuxReport-2.0.0.pdf
[25] D. Chapin, G. Brooijmans, M. Clements, D. Cutts, A. Haas, S. E. K. Mattingly, M. Mulders,
D. Petravick, R. Rechenmacher, and G. Watts, "The DZERO Level 3 Data Acquisition
System," presented at 13th IEEE-NPSS Real-Time Conference, Montreal, Canada, 2003.
[26] J. Becla and D. L. Wang, "Lessons Learned from Managing a Petabyte," presented at
Conference on Innovative Data Systems Research, Asilomar, CA, USA, 2005.
[27] R. Srinivasan. (1995). RFC 1831. RPC: Remote Procedure Call Protocol Specification
Version 2. Available: http://www.ietf.org/rfc/rfc1831.txt
[28] Sun Microsystems Inc., "XDR: External Data Representation Standard, Sun Microsystems,"
RFC 1014, 1987.
[29] M. M. Kong, "DCE: An Environment for Secure Client/Server Computing," Hewlett-Packard Journal, vol. 46, no.6, pp. 6-15, 1995.
[30] OMG. (1998). General Inter-ORB Protocol. Available: http://www.omg.org
[31] M. Henning and M. Spruiell. Distributed Programming with ICE. Available:
http://www.zeroc.com/download/Ice/3.0/Ice-3.0.1.pdf
[32] ZeroC. (2005). ICE Performance. Available: http://www.zeroc.com/performance/
[33] R. Scheifler and J. Brown. Inter-Client Exchange (ICE) Protocol. X Consortium Standard.
Version 11, Release 6.4. Available:
http://ftp.xfree86.org/pub/XFree86/4.5.0/doc/PDF/ice.pdf
[34] P. Brown and M. Ettrich. (2000). DCOP: Desktop COmmunications Protocol. Available:
http://developer.kde.org/documentation/other/dcop.html
[35] S. Westerfeld. Multimedia Communication Protocol (MCOP) documentation. Available:
http://www.arts-project.org/doc/mcop-doc/
[36] H. Pennington, A. Carlsson, and A. Larsson. D-BUS Specification Version 0.11. Available:
http://dbus.freedesktop.org/doc/dbus-specification.html
[37] H. Pennington, D. Wheeler, J. Palmieri, and C. Walters. D-BUS Tutorial Version 0.4.1.
Available: http://dbus.freedesktop.org/doc/dbus-tutorial.html
[38] D. Winer. (1999). XML-RPC Specification. Available: http://www.xmlrpc.com/spec
[39] N. A. B. Gray, "Comparison of Web Services, Java-RMI, and CORBA service
implementations," presented at Fifth Australasian Workshop on Software and System
Architectures, Melbourne, Australia, 2004.
[40] OPC Foundation. (2003). OPC Historical Data Access 1.20. Available:
http://opcfoundation.org
[41] OPC Foundation. (2002). OPC Alarms and Events Custom Interface Standard Version 1.10.
Available: http://opcfoundation.org
[42] PROFIBUS Working Group. (2001). PROFInet Architecture Description and Specification.
Available: http://www.profinet.felser.ch/technik/2202_d09.pdf
[43] H. Kleines, J. Sarkadi, F. Suxdorf, and K. Zwoll, "PROFInet – An Integrated Automation Concept Based on Ethernet," presented at International Conference on Accelerator and Large Experimental Physics Control Systems, Gyeongju, Korea, 2003.
[44] F. Iwanitz. XML-DA Opens Windows Beyond the Firewall. Available:
http://ethernet.industrial-networking.com/opc/articledisplay.asp?id=21
[45] F. Hirsch and R. S. Engelschall. SSL/TLS Strong Encryption: An Introduction. Available:
http://httpd.apache.org/docs/2.0/ssl/ssl_intro.html

[46] S. Josefsson. (2003). The Base16, Base32, and Base64 Data Encodings (RFC-3548).
Available: http://www.ietf.org/rfc/rfc3548.txt
[47] J. J. Barton, S. Thatte, and H. F. Nielsen. (2000). SOAP Message with Attachments.
Available: http://www.w3.org/TR/SOAP-attachments
[48] H. F. Nielsen, E. Christensen, and J. Farrell. (2002). WS-Attachments. Available:
ftp://www6.software.ibm.com/software/developer/library/ws-attach.pdf
[49] H. Frystyk, H. Sanders, R. Butek, and S. Nash. (2002). Direct Internet Message
Encapsulation. Available: http://msdn.microsoft.com/library/en-us/dnglobspec/html/draft-
nielsen-dime-02.txt
[50] Abstract Syntax Notation One (ASN.1) and ASN.1 Encoding Rules. Available:
http://asn1.elibel.tm.fr/standards/
[51] A. Chilingarian, "Aragats Space-Environmental Center: Status and SEP Forecasting
Possibilities," presented at 22nd ISTC, Nagoya, Japan, 2002.
[52] R. G. Lavender, D. G. Kafura, and R. W. Mullins, "Programming with ASN.1 using
Polymorphic Types and Type Specialization," Upper Layer Protocols, Architecture and
Applications, 1994.
[53] B. C. Meyers and G. Chastek, "The Use of ASN.1 and XDR for Data Representation in
Real-Time Distributed Systems," Carnegie-Mellon University CMU/SEI-93-TR-10, 1993.
[54] F. Bustamante, G. Eisenhauer, K. Schwan, and P. Widener, "Efficient Wire Formats for
High Performance Computing," presented at ACM conference on supercomputing, 2000.
[55] J. Gemmel, T. Montgomery, T. Speakman, N. Bhaskar, and J. Crowcroft, "The PGM
Reliable Multicast Protocol," IEEE Network, 2003.
[56] A. O. Freier, P. Karlton, and P. C. Kocher. (1996). The SSL Protocol, Version 3.0.
Available: http://wp.netscape.com/eng/ssl3/
[57] T. Dierks and C. Allen. (1999). RFC 2246. The TLS Protocol, Version 1.0. Available:
http://www.ietf.org/rfc/rfc2246.txt
[58] B. Schneier, Applied Cryptography, 2nd Edition: Wiley Publishing, 1996.
[59] ITU. (2005). Recommendation X.509, The Directory - Authentication Framework.
Available: http://www.itu.int/rec/recommendation.asp?type=folders&lang=e&parent=T-
REC-X.509
[60] M. Bartel, J. Boyer, B. Fox, B. LaMacchia, and E. Simon. (2002). XML – Signature Syntax
and Processing. Available: http://www.w3.org/TR/xmldsig-core
[61] T. Imamura, B. Dillaway, and E. Simon. (2002). XML Encryption Syntax and Processing.
Available: http://www.w3.org/TR/xmlenc-core/

[62] W3C. (2004). XML Schema Part 0: Primer Second Edition. Available:
http://www.w3.org/TR/xmlschema-0/
[63] K. A. Robbins and S. Robbins, Unix Systems Programming: Communication, Concurrency,
and Threads: Prentice Hall PTR, 2003.
[64] G. C. Buttazzo, Hard Real-Time Computing Systems : Predictable Scheduling Algorithms
and Applications: Kluwer Academic Publishers, 2000.
[65] E. D. Jensen. (2005). Overview of Fundamental Real-Time Concepts and Terms. Available:
http://www.real-time.org/realtimeoverview.htm
[66] J. Aas. (2005). Understanding the Linux 2.6.8.1 CPU Scheduler. Available:
http://josh.trancesoftware.com/linux/linux_cpu_scheduler.pdf
[67] Microsoft. (1995). Real-Time Systems and Microsoft Windows NT. Available:
http://msdn.microsoft.com/
[68] J. M. Hart, Windows System Programming, 3rd Edition: Addison Wesley Professional,
2004.
[69] R. Yerraballi, "Real-Time Operating Systems: An Ongoing Review," presented at 21st IEEE
Real-Time Systems Symposium, Orlando, 2000.
[70] S. Baskiyar and N. Meghanathan. (2004). A Survey of Contemporary Real-time Operating
Systems. Available: http://ai.ijs.si/informatica/PDF/29-2/12_Baskiyar-
A%20Survey%20of%20Contemporary...pdf
[71] N. Audsley and A. Burns, "Real-time System Scheduling," University of York YOR 134, 1990.
[72] L. Sha, T. Abdelzaher, K.-E. Arzen, A. Cervin, T. Baker, A. Burns, G. Buttazzo, M.
Caccamo, J. Lehoczky, and A. K. Mok, "Real-Time Scheduling Theory: A Historical
Perspective," Real-Time Systems, vol. 28, pp. 101-155, 2004.
[73] M. L. Dertouzos, "Control robotics: the procedural control of physical processes,"
Proceedings of IFIP Congress, pp. 807-813, 1974.
[74] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the Association for Computing Machinery, vol. 20, pp. 46-61, 1973.
[75] S. K. Dhall and C. L. Liu, "On a real-time scheduling problem," Operations Research, vol.
26, no.1, 1978.
[76] M. Spuri and G. Buttazzo, "Scheduling aperiodic tasks in dynamic priority systems," Real-
Time Systems, vol. 10, no.2, pp. 179-210, 1996.
[77] M. Spuri and G. Buttazzo, "Efficient aperiodic service under the earliest deadline scheduling," presented at IEEE Real-Time Systems Symposium, 1994.
[78] L. Abeni and G. Buttazzo, "Resource Reservations in dynamic real-time systems.," Real-
Time Systems, vol. 27, no.2, pp. 123-165, 2004.
[79] L. Abeni and G. Buttazzo, "Integrating multimedia applications in hard real-time systems,"
presented at IEEE Real-Time Systems Symposium, Madrid, Spain, 1998.
[80] A. Mok, "Fundamental Design Problems of Distributed Systems for Hard Real-time
Environments," PhD Thesis, Massachusetts Institute of Technology, Cambridge, 1983.
[81] A. Srinivasan and S. Baruah, "Deadline-based scheduling of periodic task systems on
multiprocessors," Informational Processing Letters, vol. 84, no.2, pp. 93-98, 2002.
[82] S. Baruah, "Optimal utilization bounds for the fixed-priority scheduling of periodic task
systems on identical multiprocessors," IEEE Transactions on Computers, vol. 53, no.6, pp.
781-784, 2004.
[83] M. Park, S. Han, H. Kim, S. Cho, and Y. Cho, "Comparison of Deadline-Based Scheduling
Algorithms for Periodic Real-Time Tasks on Multiprocessor," IEICE Transactions on
Information and Systems, vol. E88-D, No.3, pp. 658-661.
[84] S. K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel, "Proportionate Progress: A
Notion of Fairness in Resource Allocation," Algorithmica, vol. 15, no.6, pp. 600-625, 1996.
[85] S. K. Baruah, J. E. Gehrke, and C. G. Plaxton, "Fast Scheduling of Periodic Tasks on
Multiple Resources," presented at 9th International Parallel Processing Symposium, 1995.
[86] J. Anderson and A. Srinivasan, "Early-release Fair Scheduling," presented at 12th
Euromicro Conference on Real-Time Systems, 2000.
[87] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, L. Liming, and S. Tuecke, "GridFTP:
Protocol Extensions to FTP for the Grid," Argonne National Laboratory, 2002.
[88] OPC Foundation. (2005). OPC Unified Architecture Release Candidate Specification.
Available: http://opcfoundation.org
[89] M. Masmano, I. Ripoll, and A. Crespo, "Dynamic storage allocation for real-time embedded
systems," presented at RTSS 2003, Cancun, Mexico, 2003.
[90] L. Deniau. (2001). Object Oriented Programming in C. Available:
http://ldeniau.home.cern.ch/ldeniau/html/oopc/oopc.html
[91] E. Hanson and L. McCay. (2005). Chiba Cookbook. Available:
http://chiba.sourceforge.net/chibacookbook.pdf
[92] P. H. F. M. Verhoeven, J. Huang, and J. J. Lukkien, "Network Middleware and Mobility,"
presented at 2nd Workshop on Embedded Systems, STW, 2001.
[93] P. Holman and J. H. Anderson. (2006). Locking under Pfair Scheduling. Available: http://www.cs.unc.edu/~anderson/papers/tocs04.pdf
[94] P. Holman and J. H. Anderson, "Group-based Pfair Scheduling," Real-Time Systems, vol.
32, no.1-2, pp. 125-168, 2006.
[95] K. S. Carslaw, R. G. Harrison, and J. Kirkby, "Cosmic Rays, Clouds, and Climate," Science,
vol. 289, pp. 1732-1737, 2002.
[96] A. Daglis (editor), Effects of Space Weather on Technology Infrastructure, NATO Science
Series II, vol. 176. Dordrecht, Boston, London: Kluwer, 2004.
[97] H. Moraal, A. Belov, and J. M. Clem, "Design and Co-ordination of Multi-station
International Neutron Monitor Networks," Space Science Reviews, vol. 93, pp. 285-303,
2000.
[98] S. N. Karpov, Z. M. Karpova, Y. V. Balabin, and E. V. Vashenyuk, "Study of the GLE
events with use of the EAS-arrays data," presented at 29th I.C.R.C., Pune, India, 2005.
[99] C. D'Andrea and J. Poirier, "Ground level muons coincident with the 20 January 2005 solar
flare," presented at 29th I.C.R.C., Pune, India, 2005.
[100] K. Munakata et al., "Precursors of geomagnetic storms observed by the muon detector
network," Journal of Geophysical Research, vol. 105, pp. 27,457-27,468, 2000.
[101] M. Andrews. Story of a Servlet: An Instant Tutorial. Available:
http://java.sun.com/products/servlet/articles/tutorial/index.html
[102] E. Birney, M. Lausch, T. Lewis, S. Genaud, and F. Rehberger. ORBit Beginners
Documentation V1.6. Available: http://www.gnome.org/projects/ORBit2/orbit-
docs/orbit/book1.html
[103] G. Klyne and C. Newman. (2002). Date and Time on the Internet: Timestamps (RFC 3339).
Available: http://www.ietf.org/rfc/rfc3339.txt
[104] W3C. (Feb. 2004). Extensible Markup Language (XML) 1.0 (Third Edition). Available:
http://www.w3.org/TR/REC-xml
[105] W3C. (Oct. 2004). XML Schema Part 0: Primer Second Edition. Available:
http://www.w3.org/TR/xmlschema-0/
[106] D. Winer. (Jun. 1999). XML-RPC Specification. Available: http://www.xmlrpc.com/spec
[107] W3C. (Jun. 2003). SOAP Version 1.2 Part 0: Primer. Available:
http://www.w3.org/TR/2003/REC-soap12-part0-20030624/
[108] OPC Foundation. (Dec, 2004). OPC XML-DA 1.01 Specification. Available:
http://opcfoundation.org
[109] T. Bellwood. (2002). Understanding UDDI. Available: http://www-106.ibm.com/developerworks/webservices/library/ws-featuddi/
[110] K. Blackburn, A. Lazzarini, T. Prince, and R. Williams. (2000). XSIL: Extensible Scientific
Interchange Language. Available: http://xml.coverpages.org/blackburnXSIL20000121-
pdf.gz
[111] E. Shaya. XDF, the eXtensible Data Format for Scientific Data. Available:
http://xml.gsfc.nasa.gov/XDF/XDFwhite.txt
[112] W3C. (Oct. 2001). Extensible Stylesheet Language (XSL) Version 1.0. Available:
http://www.w3.org/TR/xsl/
[113] W3C. (Nov. 1999). XSL Transformations (XSLT) Version 1.0. Available: http://www.w3.org/TR/xslt
[114] W3C. (Oct. 2003). Mathematical Markup Language (MathML) Version 2.0 (2nd Edition).
Available: http://www.w3.org/TR/2003/REC-MathML2-20031021/
[115] P. Murray-Rust, "Chemical Markup Language," World Wide Web Journal, 1997.
[116] W3C. (Jan. 2003). Scalable Vector Graphics (SVG) 1.1. Available:
http://www.w3.org/TR/SVG11/
[117] D. Matejka. (1999). Introduction to a XUL Document. Available:
http://www.mozilla.org/xpfe/xptoolkit/xulintro.html
[118] S. Hada and M. Kudo. (2000). XML Access Control Language: Provisional Authorization
for XML Documents. Available: http://www.trl.ibm.com/projects/xml/xacl/xacl-spec.html
[119] XMark (An XML Benchmark Project). Available: http://monetdb.cwi.nl/xml/index.html
[120] Sosnoski Software Solutions. Sosnoski XML Benchmark. Available:
http://www.sosnoski.com/opensrc/xmlbench/results.html
[121] Y. Oren. Piccolo SAX Parsers Benchmark. Available:
http://piccolo.sourceforge.net/bench.html
[122] W3C. (Sep. 2000). Document Object Model (DOM) Level 1 Specification (Second Edition).
Available: http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/
[123] W3C. (Nov. 2000). Document Object Model (DOM) Level 2 Core Specification. Available:
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/
[124] W3C. (Feb. 2003). Document Object Model (DOM) Level 3 Core Specification. Available:
http://www.w3.org/TR/2003/WD-DOM-Level-3-Core-20030226/
[125] W3C. (Nov. 1999). XML Path Language (XPath) Version 1.0. Available:
http://www.w3.org/TR/xpath
[126] W3C. (Oct. 2003). XForms 1.0. Available: http://www.w3.org/TR/xforms/
[127] Simple API for XML (SAX). Available: http://www.saxproject.org/
