Faculty of Engineering
Department of Information Technology
Chair: Prof. Dr. Ir. P. LAGASSE
Thesis submitted to obtain the academic degree of civil engineer in computer science
Academic year 2006-2007
Ghent University
Faculty of Engineering
Master Thesis written to obtain the academic degree of Master of Computer Science Engineering
Preface
Two exciting years have flown by... I still remember the sunny summer day in 2005 when I decided to
continue my studies at Ghent University with a special study bridge program to obtain a Master of
Computer Science Engineering: Software Engineering degree in two years. Thanks to Prof. Dr. Ir. K.
De Bosschere and Prof. Dr. Ir. L. Eeckhout for their support in composing a special personal study
program. I'd like to thank my parents too for giving me this opportunity; housing and supporting a
student for two additional years is a real challenge as well.
Without doubt, it's been a challenging two years to combine courses from the Bachelor and Master
curricula, sometimes having to attend three lessons at the same time, fighting conflicting project
deadlines while reserving time for extra-curricular activities. Luckily, this final year's Master Thesis
allows for a personal touch to put the crown on the work.
The subject of this Master Thesis is Windows Workflow Foundation, a core pillar of Microsoft's latest
release of the .NET Framework. When choosing a topic for the thesis, back in the spring of 2006, the
technology was still in beta, imposing quite some extra challenges. Lots of betas, breaking changes
and sometimes a bit of frustration later, the technology has reached its final version, and to be
honest I've been positively surprised with the outcome and certain aspects of the technology.
Although my interest in .NET goes back to the early 2000s, workflow didn't cross my path until one
year ago, whetting my appetite for this topic even more.
I'd like to thank Prof. Dr. Ir. B. Dhoedt and Prof. Dr. Ir. F. De Turck for their support in researching this
cutting-edge technology and for supporting this thesis. Furthermore, I can't thank Kristof
Steurbaut enough for his everlasting interest in the topic and for providing his insights into practical
use cases for the technology at INTEC.
Researching the topic of workflow also opened up another opportunity during this academic year:
writing a scientific paper entitled "Dynamic workflow instrumentation for Windows Workflow
Foundation" for the ICSEA'07 conference. Without the support from Bart, Filip, Kristof and Sofie this
wouldn't have been possible. Their incredible eye for detail was invaluable in delivering high-quality
work and made it a unique and wonderful experience.
This year hasn't only been a massive year at university; it's been a busy year outside as well. In
November 2006, I went to Barcelona to attend the Microsoft TechEd 2006 conference, where I was
responsible for some Ask-the-Experts booths, attended numerous sessions on various technologies
including Workflow Foundation, and participated in Speaker Idol and won. Special thanks to
Microsoft Belux for supporting this trip and for providing lots of other opportunities to speak at
various conferences.
Looking at the future, I'm happy to face another big overseas challenge. In February 2007 I visited the
US headquarters of Microsoft Corporation in Redmond, WA. After two stressful days with flight
delays, tough interview questions and meeting a bunch of great people, I returned home on my
birthday with what's going to be my most wonderful birthday gift so far: a full-time job offer as
Software Design Engineer at Microsoft Corporation. I'm proud to say I'll take on this opportunity
starting from October this year.
The cutting-edge nature of the technology discussed in this thesis, contacts with Microsoft and my
future plans to work at Microsoft Corporation have driven the decision to write this work in English,
supported by Prof. Dr. Ir. B. Dhoedt. Special thanks to Prof. Dr. Ir. Taerwe and Prof. Dr. Ir. De
Bosschere for granting permission for this.
Finally, I'd also like to thank my fellow students in the Master of Computer Science Engineering
bridge program for their endless support on a day-to-day basis. Arne and Jan, you've been a great
support the last two years and I hope to stay in touch.
Samenvatting
In november 2006 bracht Microsoft de Windows Workflow Foundation (WF) uit als deel van de .NET
Framework 3.0 release. Workflow stelt ontwikkelaars in staat om businessprocessen samen te stellen
op een grafische manier via een designer. In dit werk evalueren we de geschiktheid van
workflow-gebaseerde ontwikkeling in de praktijk, toegepast op medische agents zoals deze in gebruik zijn op
de dienst Intensieve Zorgen (IZ) van het Universitair Ziekenhuis Gent (UZ Gent). Meer specifiek
onderzoeken we hoe workflows dynamisch aangepast kunnen worden om tegemoet te komen aan
dringende structurele wijzigingen of om aspecten zoals loggen en autorisatie in een workflow te
injecteren. Dit deel van het onderzoek resulteerde in het bouwen van een zogenaamd
instrumentatieraamwerk. Verder werd ook onderzoek verricht naar het bouwen van een generieke
set bouwblokken die kunnen helpen bij het samenstellen van data-gedreven workflows zoals dit
typisch gebeurt bij het bouwen van medische agents. Ontwerpbeslissingen worden in detail
besproken en prestatieanalyses worden uitgevoerd om de toepasbaarheid van workflow, via de in dit
werk gebouwde technieken, te toetsen.
Trefwoorden: .NET, workflow, dynamic adaptation, instrumentation, generic frameworks
Summary
Recently, Microsoft released Windows Workflow Foundation (WF) as part of its .NET Framework 3.0
release. Using workflow, business processes can be developed much like user interfaces, in a
graphical fashion. In this work, we evaluate the feasibility of workflow-driven development in
practice, applied on medical agents from the department of Intensive Care (IZ) of Ghent University
Hospital (UZ Gent). More specifically, we investigate how workflows can be modified dynamically to
respond to urgent changes or to crosscut aspects in an existing workflow definition. This resulted in
the creation of an instrumentation framework. Furthermore, a generic framework is created to assist
in the development of data-driven workflows by means of composition of generic building blocks.
Design decisions are outlined and performance analysis is conducted to evaluate the applicability of
workflow in this domain using the techniques created and described in this work.
Keywords: .NET, workflow, dynamic adaptation, instrumentation, generic framework
I. INTRODUCTION
DYNAMIC ADAPTATION
[Figure: bar chart comparing the procedural and workflow-based implementations, with a vertical axis running from 0 to 25]
IV. CONCLUSION
Workflow seems to be a valuable candidate for the
implementation of various types of applications. Using
dynamic adaptation and instrumentation, workflows can
be made highly dynamic at runtime. Generic building
blocks allow for easy composition of fairly complex
(data-driven) workflows, while having the potential to
raise the performance bar.
ACKNOWLEDGEMENTS
The author wants to thank promoters Bart Dhoedt
and Filip De Turck for the opportunity to conduct this
research and to create a paper on dynamic
instrumentation for the ICSEA'07 conference. The
realization of this work wouldn't have been possible
without the incredible support by Kristof Steurbaut
throughout the research.
Table of contents
Chapter 1 - Introduction ........................................................................................ 1
2.2 Definition of workflows............................................................................................................ 7
2.2.1 Code-only ......................................................................................................................... 7
2.3 Compilation .............................................................................................................................. 9
2.4 Activities ................................................................................................................................. 10
Introduction ................................................................................................................................. 13
Conclusion .................................................................................................................................... 27
Logging ......................................................................................................................................... 28
Authorization ............................................................................................................................... 32
Conclusion .................................................................................................................................... 48
Introduction ................................................................................................................................. 50
4.6.1 Chatty ............................................................................................................................. 68
4.6.2 Chunky ............................................................................................................................ 68
ForeachActivity ............................................................................................................................ 69
FilterActivity ................................................................................................................................ 74
PrintXmlActivity .......................................................................................................................... 75
Queue-based communication...................................................................................................... 82
Inter-workflow parallelism .......................................................................................................... 99
Chapter 1 Introduction | 1
Chapter 1 Introduction
1 What's workflow?
The concept of workflow has existed for ages. On a daily basis, humans execute workflows to get their jobs
done. Examples include shopping, the decision-making process during meetings, etc. All of these have
one thing in common: the execution of a flow of individual steps that leads to some desired result. In
the case of the shopping example, one crosses a market place with a set of products in mind to find the
best buy available, making decisions based on price, quality and marketing influences.
In the computing space, programmers have been dealing with workflow for ages as well. Application
development often originates from a flowchart diagram being translated into code. However, that's
where it often ends these days. The explicitness of a visual representation of a workflow is turned
into some dark piece of code, which makes it less approachable for management people, not to mention
the problem of code maintenance, especially when code is shared amongst developers. Today's
workflow concept is all about keeping the graphical representation of some kind of business process
that can be brought to execution by a set of runtime services.
Workflow is based on four tenets. Although not (yet) as well-known as the web service SOA tenets or
the database ACID properties, these four tenets are a good starting point for further discussion:
It's also important to remark that the second tenet on the long-running and stateful character of
workflows is in strong contrast to the stateless character of web services. The combination of both
principles can, however, unlock a lot of potential, for instance by exposing a workflow through a web
service to allow cross-organization business processing (e.g. a supply chain).
2 Why workflow?
In order to be successful, workflow needs a set of compelling reasons to use it. In the previous
paragraph a few advantages were already pointed out. One good reason to use workflows is the
visual representation of workflows that makes them easier to understand and to maintain. This
graphical aid provided by tools makes workflows approachable to a much broader set of people,
including company management.
Furthermore, the need for long-running workflows implies the availability of a set of runtime services
to allow dehydration (i.e. persisting a running workflow when it becomes idle) and rehydration
(i.e. loading a workflow back into memory when it becomes active again). In a
similar way, the need for transparency leads to the requirement of having runtime services for
tracking and runtime inspection. Considering these runtime services (amongst others, like scheduling
and transactions), workflow usage becomes even more attractive. Having to code these runtime
services yourself would be a very time-consuming activity and lead to a productivity decrease.
In the end, workflow is much more than some graphical toy and has a broad set of applications:
Business Process Management (BPM) - Workflows allow for rapid modification in response
to changing business requirements. This makes software a real tool to model business
processes and to use software for what it should be intended for: supporting the business.
Document lifecycle management - Versioning, online document management systems and
interactions between people have become a must for companies to be productive when
dealing with information. Approval of changes is just one example workflow can be used for.
Page or dialog flow - A typical session when working with an application consists of a flow
between input and output dialogs or pages. Using workflow, this flow can be modeled and
changed dynamically based on the user's input and validation of business rules.
Cross-organization integration - Combining workflow with the power of web services, one
can establish a more dynamic way to integrate businesses over the internet in a
Business-to-Business (B2B) fashion, e.g. in order-supply chain processing.
Internal application workflow - The use of workflow inside an application allows for
extension and modification by end-users. Pieces of the application that rely on business rules
can be customized more easily and with out-of-the-box tool support.
With Windows Workflow Foundation (WF), a technology introduced in the .NET Framework 3.0,
workflow is brought to the masses and becomes a first-class citizen of the .NET developer's toolbox.
The .NET Framework 3.0, formerly known as WinFX, is a set of managed code libraries that was created
in the Windows Vista timeframe and ships with Windows Vista out of the box, but is also ported back
to run on Windows XP and Windows Server 2003. Other pillars of .NET Framework 3.0 include (see
Figure 1):
Just like we've seen the availability of the DBMS extend to the desktop with products like SQL Server
2005 Express and more recently SQL Server Everywhere Edition, WF brings the concept of workflow
processing to the desktop. Essentially WF is an in-process workflow processing engine that can be
hosted in any .NET application, ranging from console applications and Windows Forms-based GUI
applications to Windows Services and web services.
Compared to BizTalk Server, WF is a pluggable lightweight component that can be used virtually
anywhere, but lacks out-of-the-box support for complex business integration (e.g. using data
transformations), business activity monitoring (BAM), adapters to bridge with external systems (like
MQ Series, SAP, Siebel and PeopleSoft) and reliability and scalability features. Although there is a
blurry zone between both technologies, it's safe to say BizTalk is better suited to complex
cross-organization business integration scenarios, while WF benefits from its more developer-oriented
fashion and is to be used more often inside an application. For the record, Microsoft has already
announced that it will replace the orchestration portion of BizTalk with WF in a future release of the
BizTalk product, leading to convergence of both technologies.
That Microsoft is betting on workflow-based technologies should be apparent from the adoption of
the WF technology in the next version of the Microsoft Office System, i.e. Office System 2007
(formerly known as Office 12), and the Windows SharePoint Services 3.0 technology for document
management scenarios. Other domains where WF will be implemented are ASP.NET, to create a
foundation for page flow, future releases of BizTalk as mentioned previously, and Visual Studio Team
System for work item processing.
More information on Windows Workflow Foundation can be found on the official technology website
http://wf.netfx3.com.
4 Problem statement
The first part of this work focuses on dynamic adaptation of workflows at runtime. Without doubt,
scenarios exist where it's desirable to modify a workflow instance that's in flight. A possible scenario
consists of various business reasons that mandate a dynamic change (e.g. introducing an additional
human approval step after visual inspection of the workflow instance's state). However, dynamic
adaptation can be beneficial in other situations as well, for example to weave aspects into a workflow
definition without making the core workflow heavier or clumsier.
In the second part, focus is moved towards the creation of generic workflow building blocks that
make composition of data-driven workflows easier. The results of this research and analysis are
applied to workflow-based agent systems that are used by the department of Intensive Care (IZ) of
Ghent University Hospital (UZ Gent).
For both parts, performance tests are conducted to evaluate the feasibility of the discussed
techniques and to get a better picture of possible performance bottlenecks.
Chapter 2 Basics of WF | 5
Chapter 2 Basics of WF
1 Architectural overview
On a macroscopic level, the WF Runtime Engine gets hosted inside some host process such as a
console application, a Windows Forms application, a Windows Service, a web application or a web
service. The tasks of the runtime engine are to instantiate workflows and to manage their lifecycle.
This includes performing the necessary scheduling and threading.
The concept "workflow" is used to refer to the definition of a workflow, which is defined as a class, as
discussed further on. Each workflow is composed from a series of activities, which are the smallest
units of execution in the workflow world. One can reuse existing activities that ship with WF, but the
creation of custom activities, either from scratch or by composition of existing activities, is supported
too. We'll discuss this concept further on.
A single workflow can have multiple workflow instances, just like classes are instantiated. The big
difference compared to simple objects is the runtime support that workflow instances receive, for
example to dehydrate a running workflow instance when it is suspended. The services in charge of
these things are called the Runtime Services.
Figure 2 outlines the basic architecture of WF.
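As a minimal illustration of this hosting model, the canonical console host looks roughly as follows. This is a sketch rather than a listing from this work; it assumes a workflow class named SimpleSequential, such as the one shown later in Code 2.

```csharp
using System;
using System.Threading;
using System.Workflow.Runtime;

class Host
{
    static void Main()
    {
        // The runtime engine is hosted in-process, here in a console application.
        using (WorkflowRuntime runtime = new WorkflowRuntime())
        {
            AutoResetEvent done = new AutoResetEvent(false);

            // Signal the host when the workflow instance completes or fails;
            // without this, Main could exit before the asynchronous workflow runs.
            runtime.WorkflowCompleted += delegate { done.Set(); };
            runtime.WorkflowTerminated += delegate { done.Set(); };

            // Instantiate and start a workflow; the engine takes care of
            // scheduling and threading.
            WorkflowInstance instance =
                runtime.CreateWorkflow(typeof(SimpleSequential));
            instance.Start();

            done.WaitOne();
        }
    }
}
```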
Next to the runtime, there is tools support as well. In the future these tools will become part of
Visual Studio Orcas but for now these get embedded in Visual Studio 2005 upon installation of the
Windows SDK. An interesting feature of the graphical workflow designer is that it allows for re-
hosting in other applications, effectively allowing developers to expose a designer to the end-users
(e.g. managers) to modify workflows using graphical support.
The second type of workflow supported by WF is the state machine workflow. This type of workflow
is in fact a state machine that has a starting state and an ending state. Transitions between states are
triggered by events. An example of a state machine workflow is shown in Figure 4.
partial class SimpleSequential
{
    #region Designer generated code

    /// <summary>
    /// Required method for Designer support - do not modify
    /// the contents of this method with the code editor.
    /// </summary>
    [System.Diagnostics.DebuggerNonUserCode]
    private void InitializeComponent()
    {
        this.CanModifyActivities = true;
        this.helloWorld = new System.Workflow.Activities.CodeActivity();
        //
        // helloWorld
        //
        this.helloWorld.Name = "helloWorld";
        this.helloWorld.ExecuteCode +=
            new System.EventHandler(this.helloWorld_ExecuteCode);
        //
        // SimpleSequential
        //
        this.Activities.Add(this.helloWorld);
        this.Name = "SimpleSequential";
        this.CanModifyActivities = false;
    }

    #endregion

    private CodeActivity helloWorld;
}
Code 2 - Designer generated code of a sequential workflow
XOML gets translated into the equivalent code and is compiled into an assembly that's equivalent to
a code-based definition. It's a common misunderstanding that XOML is only used by WPF (where it is
called XAML) to create GUIs; XOML can be used for virtually any object definition that's based on
composition, including user interfaces and workflows.
2.2.3 Conditions and rules
Besides the workflow definition itself, a workflow often relies on conditional logic. Such conditions
can be defined in code or declaratively using XML. The latter option allows for dynamic changes of
rules at runtime without the need for recompilation, which would cause service interruption. The
creation of such a declarative rule condition is illustrated in Figure 5.
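From code, a declarative condition is attached to an activity by name. The sketch below is not taken from this work; the condition name CheckAmount is a hypothetical example and must correspond to a rule condition defined in the workflow's .rules file.

```csharp
using System.Workflow.Activities;
using System.Workflow.Activities.Rules;

class RuleSketch
{
    static IfElseBranchActivity CreateBranch()
    {
        // Reference a declarative rule condition by name; the actual rule
        // lives in the workflow's .rules file and can therefore be changed
        // at runtime without recompiling the workflow assembly.
        IfElseBranchActivity branch = new IfElseBranchActivity();
        RuleConditionReference condition = new RuleConditionReference();
        condition.ConditionName = "CheckAmount"; // hypothetical condition name
        branch.Condition = condition;
        return branch;
    }
}
```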
2.3 Compilation
Workflow compilation is taken care of by the Visual Studio 2005 IDE using the MSBuild build system.
Under the hood, the Workflow Compiler wfc.exe is invoked to build a workflow definition. This
workflow-specific compiler validates a workflow definition and generates an assembly out of it. The
command-line output of the compiler is shown in Listing 1.
C:\Users\Bart\Documents\WF>wfc sample.xoml sample.cs
Microsoft (R) Windows Workflow Compiler version 3.0.0.0
Copyright (C) Microsoft Corporation 2005. All rights reserved.
Compilation finished with 0 warning(s), 0 error(s).
Listing 1 - Using the Windows Workflow Compiler
Additionally, WF supports dynamic compilation of XOML files using the WorkflowCompiler class, as
shown in Code 4. This allows a new workflow definition to be compiled and executed dynamically. An
applicable scenario is when the workflow designer is hosted in an application and an end-user
defines a workflow using that tool.
WorkflowCompiler compiler = new WorkflowCompiler();
WorkflowCompilerParameters param = new WorkflowCompilerParameters();
WorkflowCompilerResults results =
    compiler.Compile(param, new string[] { "Sample.xoml" });
// Inspect results.Errors to report validation or compilation failures
Code 4 - Invoking the workflow compiler at runtime
2.4 Activities
The creation of workflows is based on the principle of composition. A set of activities is combined
into a workflow definition, in either a sequential or an event-driven (state machine) manner, in a
similar way as GUI applications are composed out of controls.
WF ships with a series of built-in activities that are visualized in the Visual Studio 2005 Toolbox while
working in a workflow-enabled project. An extensive discussion of those activities would lead us too
far, so we'll just illustrate this toolbox in Figure 6. When required through the course of this work,
additional explanation of individual activities will be given in a suitable place.
Note that a large portion of these activities have an equivalent in classic procedural programming,
such as if-else branches, while loops, throwing exceptions and raising events. Others enable more
complex scenarios such as replication of activities, parallel execution and parallel event listening.
A powerful feature of WF and the corresponding tools is the ability to combine multiple activities
into another activity. Using this feature, one is able to create domain-specific activities that allow for
reuse within the same project or across projects by creating a library of custom activities. This also
creates space for third-party independent software vendors (ISVs) to create a business out of
specialized workflow activity creation.
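To give a flavor of activity authoring from scratch (a minimal sketch, not taken from this work): a custom leaf activity derives from Activity and overrides Execute, returning its execution status to the runtime. The class name LogActivity is a hypothetical example.

```csharp
using System;
using System.Workflow.ComponentModel;

// A minimal leaf activity; composite activities would instead derive
// from CompositeActivity (or from an existing one such as SequenceActivity).
public class LogActivity : Activity
{
    protected override ActivityExecutionStatus Execute(
        ActivityExecutionContext executionContext)
    {
        Console.WriteLine("Executing {0}", this.Name);

        // Closed tells the runtime this activity has finished its work.
        return ActivityExecutionStatus.Closed;
    }
}
```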
We won't cover all possible hosts here, as the principle is always the same. Nevertheless, it's worth
noting that the web services hosting model relies on the WebServiceInput, WebServiceOutput and
WebServiceFault activities to take data in and send data or faults out. Another interesting thing to
know is the lack of direct WCF support in WF, something that can only be established through manual
coding. According to Microsoft this is due to timing issues, since WF was introduced relatively late in
the .NET Framework 3.0 development cycle. This shortcoming will be fixed in a future release.
Scheduling Services are used to manage the execution of workflow instances by the
workflow engine. The DefaultWorkflowSchedulerService is used in non-ASP.NET applications
and relies on the .NET thread pool. On the other hand, the ManualWorkflowSchedulerService
is used by ASP.NET hosts and interacts with the ASP.NET host process (e.g. the worker pool).
CommitWorkBatch Services enable custom code to be invoked when committing a work
batch. This kind of service is typically used in combination with transactions and is used to
ensure the reliability of the workflow-based application by implementing additional error-handling code.
Persistence Services [2] play a central role in workflow instance hydration and dehydration.
Putting a workflow instance's runtime data on disk is a requirement to be able to deal with
long-running workflows and to ensure scalability. By default SQL Server is used for
persistence.
Tracking Services [3] allow inspection of a workflow in flight by relying on events that are
raised by workflow instances. Based on a Tracking Profile, only the desired data is tracked by
the tracking service. WF ships with a SQL Server database tracking service out of the box.
Local Communication Services enable communication of data to and from a workflow
instance. For example, external events can be sent to a workflow instance and data can be
sent from a workflow instance to the outside world, based on an interface type. This type of
service is often referred to as data exchange.
Services can be configured through the XML-based application configuration file or through code. All
of these services are implementations of a documented interface in the Windows SDK which allows
for custom implementation, e.g. to persist state to a different type of database.
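Registering these services through code might look as follows. This is a sketch: the connection string is a placeholder, and the databases it points to would need to be created with the SQL scripts that ship in the Windows SDK.

```csharp
using System.Workflow.Activities;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;
using System.Workflow.Runtime.Tracking;

class ServiceSetup
{
    static WorkflowRuntime CreateRuntime(string connectionString)
    {
        WorkflowRuntime runtime = new WorkflowRuntime();

        // Persistence: dehydrates idle instances to SQL Server.
        runtime.AddService(new SqlWorkflowPersistenceService(connectionString));

        // Tracking: records workflow events according to a tracking profile.
        runtime.AddService(new SqlTrackingService(connectionString));

        // Local communication: data exchange between host and instances.
        runtime.AddService(new ExternalDataExchangeService());

        // The DefaultWorkflowSchedulerService is registered automatically
        // when no scheduler service is added explicitly.
        return runtime;
    }
}
```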
4 Dynamic updates
Changing a workflow instance that's in progress is a very desirable feature. In order to respond to
rapidly changing business requirements, workflow changes at runtime are a common requirement.
With WF, it's possible to modify a workflow instance both from the inside (i.e. the workflow's code)
and the outside (i.e. the host application). This is accomplished by means of the WorkflowChanges
class, which we'll deal with quite a lot in this work.
2 The basics
In order to make changes to a workflow instance, one has to create an instance of the type
WorkflowChanges. This class takes the so-called transient workflow, which is the activity tree of the
running workflow, and allows application of changes to it. Once changes have been proposed, these
have to be applied to the running instance. At this point in time, activities in the tree can vote
whether or not they allow the change to occur. If the voting result is positive, changes are applied
and the workflow instance has been modified successfully. This basic process is reflected in code
fragment Code 6.
WorkflowChanges changes = new WorkflowChanges(this);

// Change the transient workflow by adding/removing/... activities
changes.TransientWorkflow.Activities.Add(...);

foreach (ValidationError error in changes.Validate())
{
    if (!error.IsWarning)
    {
        string txt = error.ErrorText;
        // Do some reporting and/or fixing
    }
}

this.ApplyWorkflowChanges(changes);
Code 6 - Basic use of WorkflowChanges
Composite activities, like a WhileActivity or an IfElseActivity, are a bit more difficult to change since
one needs to touch the body of these activities. Basically, the same principles apply, albeit a bit
lower in the tree hierarchy, typically using tree traversal code.
Of course it's also possible to delete activities by means of the Remove method of the Activities
collection.
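Such tree traversal code can be sketched as a simple depth-first search. The helper below is not taken from this work; it only relies on the documented Activity.Name and CompositeActivity.Activities members.

```csharp
using System.Workflow.ComponentModel;

static class ActivityFinder
{
    // Depth-first search through the activity tree; returns null when
    // no activity with the given name is found.
    public static Activity Find(Activity root, string name)
    {
        if (root.Name == name)
            return root;

        CompositeActivity composite = root as CompositeActivity;
        if (composite != null)
        {
            // Recurse into the body of composite activities such as
            // WhileActivity or IfElseActivity.
            foreach (Activity child in composite.Activities)
            {
                Activity match = Find(child, name);
                if (match != null)
                    return match;
            }
        }
        return null;
    }
}
```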
This sample activity opens the doors to data binding. The use of the custom activity in the WF
designer in Visual Studio 2005 is shown in Figure 7.
Binding support at design-time is visualized by the presence of a small blue-white information icon in
the right margin of the property names column. Clicking it allows you to bind the property of the activity
to a property of the enclosing workflow definition, as shown in Figure 8.
However, when applying dynamic updates we want to be able to establish such a binding at runtime.
This is made possible through the use of the ActivityBind class in WF. An example of this class's
usage is illustrated in code fragment Code 11.
For simplicity's sake, this sample inserts a DemoActivity activity at the end of a sequential workflow
that would otherwise return the same value as the input value. By feeding the workflow's input to
the dynamically added DemoActivity and the DemoActivity's output in the reverse direction, we can
adapt the original input value dynamically. This mechanism could be used to apply a correction factor
to numerical data that flows through a workflow, e.g. to account for increased shipping costs.
private void UpdateWorkflow(WorkflowInstance instance)
{
    WorkflowChanges changes =
        new WorkflowChanges(instance.GetWorkflowDefinition());

    DemoActivity da = new DemoActivity(2.0);

    ActivityBind bindInput = new ActivityBind("Workflow1", "Input");
    da.SetBinding(
        MultiDynamicChange.DemoActivity.InputProperty, bindInput);

    ActivityBind bindOutput = new ActivityBind("Workflow1", "Output");
    da.SetBinding(
        MultiDynamicChange.DemoActivity.OutputProperty, bindOutput);

    changes.TransientWorkflow.Activities.Add(da);
    instance.ApplyWorkflowChanges(changes);
}
Code 11 - Establishing dynamic data binding
Each WorkflowInstance can be queried for the underlying workflow definition using a method called
GetWorkflowDefinition that obtains a reference to the root activity. Based on the workflow instance
information, additional filtering logic can be applied to select a subset of workflow instances and/or
workflow instances of a given type. Once such a set of workflow instances is retrieved, changes can
be applied using WorkflowChanges. The WF runtime takes care of suspending the workflow prior to
making the dynamic update and resuming it right after the change was applied. Since changes are
applied on an instance-per-instance basis, exceptions thrown upon failure to change an instance only
affect one particular workflow adaptation at a time, so rich failure reporting and retry logic can be
added if desired.
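Applied to all instances currently loaded by the runtime, with per-instance error handling, this could be sketched as follows. The helper InsertDemoActivity is a hypothetical placeholder for whatever change logic applies.

```csharp
using System;
using System.Workflow.ComponentModel;
using System.Workflow.Runtime;

class BulkUpdater
{
    // Applies a change to every loaded instance; a failure on one
    // instance does not prevent updating the others.
    static void UpdateAll(WorkflowRuntime runtime)
    {
        foreach (WorkflowInstance instance in runtime.GetLoadedWorkflows())
        {
            try
            {
                WorkflowChanges changes =
                    new WorkflowChanges(instance.GetWorkflowDefinition());

                // Hypothetical helper that edits changes.TransientWorkflow.
                InsertDemoActivity(changes.TransientWorkflow);

                instance.ApplyWorkflowChanges(changes);
            }
            catch (Exception ex)
            {
                // Rich failure reporting or retry logic could go here.
                Console.WriteLine("Could not update {0}: {1}",
                    instance.InstanceId, ex.Message);
            }
        }
    }

    static void InsertDemoActivity(CompositeActivity transientWorkflow)
    {
        // Change logic goes here (add/remove/bind activities).
    }
}
```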
Needless to say, this code can be extended to inject multiple suspension points in a configuration-driven
manner. Notice that every suspension point needs to have a unique name (e.g. suspend1),
which can be generated at random. Next, an Error property value has to be set. This will be used in
the decision logic of step two (see next section) to identify the suspension location. Error really is a
bit of a misnomer, since workflow suspension doesn't necessarily imply that an error has occurred. In
our situation it really means that we want to give the host application a chance to take control at
that point in time. Finally, the suspension point has to be inserted at the right place in the workflow,
e.g. before or after an existing activity in the workflow. In our sample, the suspension point is
injected after an activity called demoCode1. The original sample workflow and the corresponding
instrumented one are depicted in Figure 9.
When making this more flexible, we end up with a configurable table of suspension point injections
consisting of tuples {Before/After, ActivityName}. Based on this, instrumentation can be applied in
a rather straightforward iterative fashion. Changing this table, for instance stored in a database, at
runtime will automatically cause subsequent workflow instances to be instrumented accordingly.
Again, additional flexibility will be desirable since we can't recompile the host application to embed
this switching logic. Different approaches exist, ranging from simple adaptations to far more complex
ones. An overview:
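A minimal sketch of such a table-driven instrumentor, assuming a hypothetical InstrumentationPoint record loaded from the configuration table, could look as follows:

```csharp
// Hypothetical representation of one row of the instrumentation table.
class InstrumentationPoint
{
    public bool Before;          // insert before (true) or after (false)
    public string ActivityName;  // name of the anchor activity
}

static void Instrument(WorkflowChanges changes,
                       IEnumerable<InstrumentationPoint> table)
{
    int i = 0;
    foreach (InstrumentationPoint point in table)
    {
        Activity anchor =
            changes.TransientWorkflow.GetActivityByName(point.ActivityName);
        CompositeActivity parent = anchor.Parent;
        int index = parent.Activities.IndexOf(anchor) + (point.Before ? 0 : 1);

        // Every suspension point needs a unique name; the Error value
        // identifies the suspension location to the host.
        SuspendActivity suspend = new SuspendActivity("suspend" + i++);
        suspend.Error = point.ActivityName;
        parent.Activities.Insert(index, suspend);
    }
}
```

Since the table is read on each workflow creation, edits to the stored tuples take effect for all subsequently created instances without recompiling the host.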
- The tuple {Before/After, ActivityName} could be linked to a set of tuples containing actions to
be taken when the suspension point is hit, optionally with additional conditions. Such an
action tuple could look like {Before/After, ActivityName, ActivityType}, meaning that an
activity of type ActivityType has to be inserted before or after ActivityName in the workflow.
Evaluation of conditions could be expressed using the rule engine in WF and by serializing the
rules in an XML file.
- A more generic approach would use reflection and application domains to load the switching
logic dynamically. For each suspension point, the response logic type is kept as well (see the
tuple representation in Code 17). At runtime, these response logic types (which implement
the interface of Code 18) are loaded dynamically (Code 19) when the WorkflowSuspended
event is raised.
Code 17 - Type definition for instrumentation tuples
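Since Code 18 and Code 19 are not reproduced here, the following is a sketch, under the assumption that the response logic implements a hypothetical ISuspensionResponse interface and that its assembly-qualified type name is stored in the instrumentation tuple:

```csharp
// Hypothetical interface for response logic (stands in for Code 18).
public interface ISuspensionResponse
{
    void Respond(WorkflowInstance instance, string suspensionPoint);
}

// Dynamic loading of the response type (stands in for Code 19).
static ISuspensionResponse LoadResponse(string assemblyQualifiedTypeName)
{
    // Type.GetType resolves an assembly-qualified name and loads the
    // containing assembly on demand; throws if the type cannot be found.
    Type t = Type.GetType(assemblyQualifiedTypeName, true);
    return (ISuspensionResponse)Activator.CreateInstance(t);
}
```

Loading into a separate application domain, as the text suggests, would additionally allow the response assemblies to be unloaded and replaced while the host keeps running.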
Notice that the code fragments above need robust error handling to keep the workflow runtime from
crashing unexpectedly when creating workflow instances. The dynamic nature of the code makes
compile-time validation impossible, so runtime exceptions will be thrown when invalid parameters
are passed to the instrumentor. For example, activities with a given name could be non-existent or
the instrumentation action type could be unloadable. One might consider running a separate
WorkflowRuntime instance to check the validity of instrumentation logic using dummy workflow
instances.
As an example, we'll build a simple adaptation action (Code 21) that corresponds to the one shown in
Code 11, including the data binding functionality. However, encapsulation in an IAdaptationAction
type allows for more flexible injection when used together with the instrumentation paradigm.
public class TimesTwoAdaptor : IAdaptationAction
{
    private WorkflowInstance instance;

    public void Initialize(WorkflowInstance instance)
    {
        this.instance = instance;
    }

    public void Execute()
    {
        WorkflowChanges changes =
            new WorkflowChanges(instance.GetWorkflowDefinition());
        DemoActivity da = new DemoActivity(2.0);
        ActivityBind bindInput = new ActivityBind("Workflow1", "Input");
        da.SetBinding(DemoActivity.InputProperty, bindInput);
        ActivityBind bindOutput = new ActivityBind("Workflow1", "Output");
        da.SetBinding(DemoActivity.OutputProperty, bindOutput);
        changes.TransientWorkflow.Activities.Add(da);
        instance.ApplyWorkflowChanges(changes);
    }

    public void Dispose() { }
}
Code 21 - Encapsulation of a workflow adaptation
Using the Instrumentator delegate, it becomes possible to adapt the instrumentation logic at will.
This further increases the overall flexibility of the instrumentation framework. This way, one could
instrument not only with suspension points but also with, for instance, logging activities. Furthermore,
one could get rid of the Console-based interaction by querying some data source through a querying
interface to find out whether or not an activity has to be instrumented.
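The delegate-based hook could be sketched as follows; the name Instrumentator comes from the text, but its exact signature is not shown in this fragment, so the one below is an assumption, as is the InstrumentationStore query interface:

```csharp
// Assumed signature: given the transient root of a WorkflowChanges
// object, decide on and apply instrumentation.
public delegate void Instrumentator(CompositeActivity transientRoot);

// Example: a data-source-driven instrumentator replacing Console prompts.
Instrumentator fromDatabase = delegate(CompositeActivity root)
{
    // Snapshot first: inserting while enumerating the live collection
    // would invalidate the enumerator.
    Activity[] snapshot = new Activity[root.Activities.Count];
    root.Activities.CopyTo(snapshot, 0);

    foreach (Activity child in snapshot)
    {
        // Hypothetical query interface deciding per activity.
        if (InstrumentationStore.ShouldInstrument(child.Name))
        {
            SuspendActivity s = new SuspendActivity("suspend_" + child.Name);
            s.Error = child.Name;
            root.Activities.Insert(
                root.Activities.IndexOf(child) + 1, s);
        }
    }
};
```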
This form of recursion-based instrumentation is most useful when lots of instrumentations will be
required and when the underlying querying interface doesn't cause a bottleneck. On the other hand,
when only a few sporadic instrumentations are likely to be requested and when workflow
definitions are large, it might be a better idea to use an approach based on instrumentation
points, without having to traverse the whole workflow definition tree.
Last but not least, notice that instrumentations can also be performed on already-instrumented
workflow instances, allowing for cumulative instrumentation.
5.4 Conclusion
As we've shown in this paragraph, dynamic instrumentation of workflows through the use of
suspension points is a very attractive way to boost the flexibility of WF. This technique combines the
goodness of external modifications on the hosting layer, having access to contextual information,
with the horsepower of internal modifications, which have a positional characteristic in a workflow
definition allowing for just-in-time adaptation. To realize the flexibility of this mechanism, one should
think of the result obtained by encapsulating the instrumentation logic itself in an adaptation
action (the snake swallowing its own tail).
During the previous discussion, one might have observed a correspondence to the typical approaches
employed in aspect-oriented software development, such as crosscutting concerns. Indeed, as we'll
show further in this work, combining dynamic adaptation with generic components for aspects like
logging, authorization, runtime analysis, etc. will open the door to highly dynamic systems that allow
for runtime modification and, to a certain extent, production debugging.
The primary drawback of this methodology is the invasive nature of dynamic activity injections, which
can touch the inside of the workflow instance, effectively breaking encapsulation. Notice that the WF
architecture using dependency properties still hides the real private members of workflow types, so
unless you're reflecting against the original workflow type you won't be able to get access to private
members from an object-oriented (OO) perspective.
However, philosophically one could argue that the level of abstraction that WF enables is one floor
higher than the abstraction and encapsulation level in the world of OO. Based on this argument,
injection of activities into a workflow instance really is a breach of encapsulation goals. Nevertheless,
when used with care and with rigorous testing in place it shouldn't hurt. As a good practice,
developers should adopt strict rules when writing injections, since there's not much the runtime can
do to protect them against malicious actions. Flexibility comes at the cost of increased risk.
- Adding logging to a workflow. This can be used to gather diagnostic information, for example
to perform production debugging.
- Measurement of service times. This can be accomplished by adding a measurement scope to
the workflow instance, i.e. surrounding the region that needs to be timed with a "start to
measure" activity and a "stop to measure" activity.
- Protecting portions of a workflow from unauthorized access. To do this, the workflow host
application layer could add some kind of "access denied" activities in places that are disallowed
for the user that launches the workflow instance. This decouples the authorization aspect
from the internals of a workflow definition.
6.1 Logging
As a first example, we'll create a logger that can be added dynamically by means of instrumentation.
It allows data to be captured, via dependency properties, from the workflow instance in which the
logger is injected. To illustrate this, consider a workflow defined as in Code 23.
public sealed partial class AgeChecker : SequentialWorkflowActivity
{
    public AgeChecker() { InitializeComponent(); }

    static DependencyProperty FirstNameProperty =
        DependencyProperty.Register("FirstName", typeof(string),
                                    typeof(AgeChecker));
    static DependencyProperty AgeProperty =
        DependencyProperty.Register("Age", typeof(int),
                                    typeof(AgeChecker));

    public string FirstName
    {
        get { return (string)this.GetValue(FirstNameProperty); }
        set { this.SetValue(FirstNameProperty, value); }
    }

    public int Age
    {
        get { return (int)this.GetValue(AgeProperty); }
        set { this.SetValue(AgeProperty, value); }
    }

    private void sayHello_ExecuteCode(object sender, EventArgs e)
    {
        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine("Welcome {0}!", FirstName);
        Console.ResetColor();
    }
}
Code 23 - A simple workflow definition using dependency properties
This workflow definition consists of a single sayHello CodeActivity that prints a message to the
screen. Assume we want to check that the Age property has been set correctly; in order to do so,
we'd like to inject a logging activity into the workflow instance to inspect the internal values
at runtime. We'll accomplish this by means of instrumentation. Code fragment Code 24 shows
the definition of such a simple logging activity, which has a few restrictions we'll talk about in a
minute.
public class LoggingActivity : Activity
{
    private string message;
    private string[] args;

    public LoggingActivity() { }
    public LoggingActivity(string message, params string[] args)
    {
        this.message = message;
        this.args = args;
    }

    protected override ActivityExecutionStatus Execute
        (ActivityExecutionContext executionContext)
    {
        // Body reconstructed from the description below: resolve the named
        // properties on the parent and hand them to the ILogger service.
        object[] values = new object[args.Length];
        for (int i = 0; i < args.Length; i++)
            values[i] = Parent.GetValue(
                DependencyProperty.FromName(args[i], Parent.GetType()));
        executionContext.GetService<ILogger>().LogMessage(message, values);
        return ActivityExecutionStatus.Closed;
    }
}
This activity takes two constructor parameters: a message to be logged (in the shape of a .NET
formatting string) and a parameter list with the names of the properties that need to be fed
into the formatting string. Inside the Execute method, the activity retrieves the values of the
specified properties using the dependency properties of the parent. Notice that this forms a first
limitation, since nesting of composite activities makes it harder to find the right dependency
properties to get a value from. This can be solved using a recursive algorithm that
walks up the activity tree until the Parent property is null. Nevertheless, for the sake of the demo this
definition is sufficient. Next, the message together with the retrieved parameter values is sent to an
ILogger (see Code 25) service using Local Communication Services.
[ExternalDataExchange]
interface ILogger
{
    void LogMessage(string format, params object[] args);
}
Code 25 - The ILogger interface for LoggingActivity output
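The walk up the activity tree mentioned above could be sketched as follows; FindPropertyValue is a hypothetical helper, not part of the sample code:

```csharp
// Walks up the activity tree, starting at a given activity, until an
// ancestor is found that registered a dependency property with the given
// name (or the root is reached). Hypothetical helper for illustration.
static object FindPropertyValue(Activity start, string propertyName)
{
    for (Activity current = start; current != null; current = current.Parent)
    {
        DependencyProperty dp =
            DependencyProperty.FromName(propertyName, current.GetType());
        if (dp != null)
            return current.GetValue(dp);
    }
    return null; // not found anywhere up the tree
}
```

With this helper, LoggingActivity would keep working even when it is injected several levels deep inside nested composite activities.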
Here, we added the logger to the top of the workflow instance's activity tree. At the top of the code
fragment you can see the registration of the logger through Local Communication Services. For the
sake of the demo, we'll just print the logging messages to the console.
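Since that code fragment is not reproduced here, the registration could be sketched as follows, assuming a simple ConsoleLogger implementation of ILogger:

```csharp
// Hypothetical ILogger implementation printing to the console.
class ConsoleLogger : ILogger
{
    public void LogMessage(string format, params object[] args)
    {
        Console.WriteLine(format, args);
    }
}

// Registration through Local Communication Services: the
// ExternalDataExchangeService hosts interfaces marked with
// [ExternalDataExchange] and routes calls from activities to the host.
ExternalDataExchangeService dataService = new ExternalDataExchangeService();
workflowRuntime.AddService(dataService);
dataService.AddService(new ConsoleLogger());
```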
[ExternalDataExchange]
public interface IStopwatchReporter
{
    void Report(Guid workflowInstanceId,
                string watcher, TimeSpan elapsed);
}
Code 29 - Reporting measurement results through an interface for Local Communication Services
Finally, the instrumentation can be done by implementing the IStopwatchReporter interface (Code
30) and injecting the measurement points in the workflow instance (Code 31).
class StopwatchReporter : IStopwatchReporter
{
    public void Report(Guid workflowInstanceId,
                       string watcher, TimeSpan elapsed)
    {
        Console.WriteLine("{0}: {1} - {2}", workflowInstanceId,
                          watcher, elapsed.TotalMilliseconds);
    }
}
Code 30 - Implementation of a stopwatch timing reporter
This sample could easily be extended to allow for more flexibility. For instance, a third stopwatch
action could be introduced to report the result. This would allow for cumulative timings where the
timer is started and stopped in various places, while the result is only reported once, when the
activity is told to do so by means of the action parameter. We could also allow for reuse of the same
Stopwatch instance by providing a reset action that calls the Stopwatch instance's Reset method.
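A sketch of such an extended action set, assuming the stopwatch activity is parameterized with an action value (the enum and dictionary below are illustrative, not the sample's exact code):

```csharp
// Possible actions for a stopwatch activity; Report and Reset extend the
// original start/stop pair as suggested above.
enum StopwatchAction { Start, Stop, Report, Reset }

// One Stopwatch per watcher name so timings can accumulate across
// multiple start/stop cycles before a single Report.
static readonly Dictionary<string, Stopwatch> watches =
    new Dictionary<string, Stopwatch>();

static void Apply(string watcher, StopwatchAction action,
                  IStopwatchReporter reporter, Guid instanceId)
{
    if (!watches.ContainsKey(watcher))
        watches[watcher] = new Stopwatch();
    Stopwatch sw = watches[watcher];

    switch (action)
    {
        case StopwatchAction.Start: sw.Start(); break;
        case StopwatchAction.Stop:  sw.Stop();  break;
        case StopwatchAction.Reset: sw.Reset(); break;
        case StopwatchAction.Report:
            reporter.Report(instanceId, watcher, sw.Elapsed);
            break;
    }
}
```

Note that System.Diagnostics.Stopwatch keeps accumulating elapsed time across Start/Stop pairs until Reset is called, which is exactly what cumulative timing needs.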
6.3 Authorization
The next sample shows how instrumentation can be applied to workflow instances to protect certain
paths from unauthorized access. In some cases, authorization might be built into the workflow
definition itself, for example using Authorization Manager [5], but in other cases it might be more
desirable to inject intra-workflow authorization checks at a later stage in the game.
In order to reject access to a certain portion of a workflow, we'll use a ThrowActivity that throws an
AccessDeniedException as defined in Code 32.
[Serializable]
public class AccessDeniedException : Exception
{
    public AccessDeniedException() { }
    public AccessDeniedException(string message)
        : base(message) { }
    public AccessDeniedException(string message, Exception inner)
        : base(message, inner) { }
    protected AccessDeniedException(
        System.Runtime.Serialization.SerializationInfo info,
        System.Runtime.Serialization.StreamingContext context)
        : base(info, context) { }
}
Code 32 - AccessDeniedException definition for authorization barriers
Now consider a workflow definition like the one shown in Figure 11. In a non-instrumented
workflow, however, we don't want to have the accessDenied activity in place yet.
Our goal is to add this one dynamically through an external modification, driven by instrumentation at
the host level. The code shown in Code 33 illustrates how to accomplish this goal. Notice that this
sample also illustrates how to instrument a composite activity, in this case the left branch of the
IfElseActivity.
Dictionary<string, object> args = new Dictionary<string, object>();
args.Add("OrderValue", 15000);
WorkflowInstance instance = workflowRuntime.CreateWorkflow(
    typeof(WFInstrumentation.Workflow3), args);

WorkflowChanges c = new WorkflowChanges(instance.GetWorkflowDefinition());
ThrowActivity t = new ThrowActivity();
t.FaultType = typeof(AccessDeniedException);
((CompositeActivity)c.TransientWorkflow.GetActivityByName("expensive"))
    .Activities.Insert(0, t);
instance.ApplyWorkflowChanges(c);
instance.Start();
Code 33 - Instrumentation of a workflow instance with an access authorization guard
However, the presence of such handlers assumes prior knowledge of the possibility for the workflow
to be instrumented with an authorization barrier, which might be an unlikely assumption to make.
Nevertheless, situations can exist where it's desirable to have the entire authorization mechanism
built around dynamic instrumentation. In such a scenario, adding fault handlers to the workflow
definition makes perfect sense.
An alternative approach to authorization would be to add static barriers at development time, which
rely on Local Communication Services to ask the host whether or not access is allowed at a certain
point in the workflow tree, based on several indicators like the user identity and various internal
values. This approach could be extended to the dynamic case by providing a generic workflow
inspector activity that can be injected through instrumentation. Such an activity would be
parameterized with a list of names of the to-be-retrieved properties and would take an approach
like the one used in the logging sample (Code 24) to collect the values. Using Local Communication
Services, these values can be sent to the host, where additional logic calculates whether or not access
is allowed. If not, a dynamic update is applied in order to put an authorization barrier in place.
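The host-side decision step could be sketched as follows; AccessPolicy and the callback name are hypothetical, illustrating only the flow just described:

```csharp
// Hypothetical host-side callback: receives inspected values, decides on
// access and, when denied, injects a ThrowActivity barrier dynamically.
static void OnInspectionReported(WorkflowInstance instance,
                                 Dictionary<string, object> values,
                                 string protectedActivityName)
{
    bool allowed = AccessPolicy.Evaluate(           // hypothetical policy
        System.Threading.Thread.CurrentPrincipal, values);
    if (allowed)
        return;

    WorkflowChanges c = new WorkflowChanges(instance.GetWorkflowDefinition());
    ThrowActivity barrier = new ThrowActivity();
    barrier.FaultType = typeof(AccessDeniedException);
    ((CompositeActivity)c.TransientWorkflow
        .GetActivityByName(protectedActivityName))
        .Activities.Insert(0, barrier);
    instance.ApplyWorkflowChanges(c);
}
```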
Notice that the type needs to be marked as Serializable in order to be used correctly by Local
Communication Services when reporting the result. The reporting interface is shown in code
fragment Code 35 and is pretty straightforward. Basically, we'll map an Inspection onto the results of
that inspection. This way, the user of the inspector can trace the retrieved values back to the
inspection target.
[ExternalDataExchange]
interface IInspectorReporter
{
    void Report(Guid workflowInstanceId, string inspection,
                Dictionary<Inspection, object> results);
}
Code 35 - Inspector reporting interface
This interface reports the results with the corresponding inspection name (set by the inspector
activity's constructor, see further) and the workflow instance ID.
Finally, let's take a look at the inspector activity itself. The code is fairly easy to understand: it
performs some activity tree traversal (see the FindActivity method) and a bit of reflection in order
to get the values from fields and properties. Getting the value from a dependency property was
illustrated earlier when we discussed the logging activity.
public class InspectorActivity : Activity
{
    private string inspection;
    private Inspection[] tasks;

    public InspectorActivity() { }
    public InspectorActivity(string inspection, Inspection[] tasks)
    {
        this.inspection = inspection;
        this.tasks = tasks;
    }

    protected override ActivityExecutionStatus Execute
        (ActivityExecutionContext executionContext)
    {
        Dictionary<Inspection, object> lst =
            new Dictionary<Inspection, object>();
        foreach (Inspection i in tasks)
        {
            Activity target = FindActivity(i.Path);
            switch (i.Target)
            {
                case InspectionTarget.Field:
                    // i.Name is assumed to hold the member name.
                    lst[i] = target.GetType().GetField(i.Name,
                        BindingFlags.Instance | BindingFlags.Public |
                        BindingFlags.NonPublic).GetValue(target);
                    break;
                // Property and dependency property cases were elided in
                // the original fragment; they follow the same pattern.
            }
        }
        executionContext.GetService<IInspectorReporter>()
            .Report(WorkflowInstanceId, inspection, lst);
        return ActivityExecutionStatus.Closed;
    }

    // Path-based traversal of the activity tree ("." = current,
    // ".." = parent, a name = child of a composite activity);
    // reconstructed from the surviving fragments.
    private Activity FindActivity(string path)
    {
        Activity current = this;
        foreach (string p in path.Split('/'))
        {
            if (p == "." || p.Length == 0)
                continue;
            else if (p == "..")
                current = current.Parent;
            else
                current = ((CompositeActivity)current).Activities[p];
        }
        return current;
    }
}
Code 36 - Activity for dynamic workflow inspection
Next, we should not forget to hook up this inspector service at the workflow runtime level. This is
done in a way equivalent to what was shown previously in Code 27. Finally, the instrumentation itself
has to be performed, which is outlined in code fragment Code 38.
The result of this inspection when applied to the workflow defined in Code 23 is shown in Figure 13.
Of course we don't want to shut down the workflow runtime in order to attach a workflow inspector
to an instance. Therefore, the hosting application needs to be prepared to accept debugging jobs
on demand. A first scenario consists of attaching a debugger to an existing workflow instance
specified by its instance ID. The host application can then accept calls to debug a workflow
instance and obtain the instrumentation logic (for example by dynamically loading the instrumentation
job that injects the workflow inspector, or by applying an auto-generated inspection activity that
reflects against all the available properties of the workflow instance in order to get access to them).
Another scenario is to instrument workflow instances at creation time to capture information about
newly created instances and what's going on inside them.
From the elaboration above, one might see a pretty big correspondence to the definition of tracking
services. There is indeed quite some overlap with the goals of tracking in WF, since serialized
workflow information can be used to inspect workflow instance internals. However, using
dynamic instrumentation, one can get more fine-grained control over the inspections themselves and
the places where these inspections happen. One could take all of this a step further by using the
inspections as a data mining tool to gather information about certain internal values that occur
during processing, which might otherwise remain hidden. Last but not least, where tracking services
impose a fixed overhead until the workflow runtime is reconfigured to get rid of tracking, dynamic
instrumentation does not and, hence, is more applicable when sporadic inspections are required.
In our world of dynamic workflow updates, the WorkflowMonitor as it ships with the SDK falls short,
since it's not able to visualize dynamic changes in real time while a workflow instance is running. In
order to overcome this limitation, we made a simple change to the WorkflowMonitor's code [6] to
update the designer view on a periodic basis, allowing dynamic updates to become visible.
The result of a dynamic instrumentation (as explained above when talking about the instrumentation
principle) applied to the workflow of Figure 14 is shown in Figure 15. In this illustration one can
clearly observe the three suspension points that were added dynamically as well as a dynamic
update (dynDelay) applied to the workflow through one of these suspension points.
Monitoring workflow instances in progress could be the starting point for dynamic adaptations as
covered earlier:
- For non-instrumented workflow instances, it's possible to perform an update right away
using the external modification methodologies described earlier in this chapter.
- When a workflow instance has been instrumented, applying updates at suspension points
becomes dependent on the way the IAdaptationAction has been implemented. For example,
the action could query a database that contains adaptation hints on a per-workflow-instance
basis. When such a hint is present for the suspended workflow instance and for the current
suspension point, an update is applied, possibly loading an activity dynamically from disk in
order to be inserted in the workflow instance.
8 Performance analysis
8.1 Research goal
In this paragraph we take a closer look at the results of a performance study conducted to measure
the impact of dynamic updates on the overall system throughput. As we've seen previously,
workflow instances are hosted by a workflow runtime that's responsible for the scheduling, the
communication with services and the instance lifecycles. Depending on the application type, the
workflow runtime will have a workload varying from a few concurrent instances to a large number of
instances that need to be served in parallel, a task performed by WF's scheduling service. A few
metrics can be used to get an idea of the system's workload.
Other influences are caused by the services hooked up to the runtime. For instance, when tracking
has been enabled, this will have a big impact on overall performance. In a similar fashion,
persistence services will cause significant additional load on the machine, as well as network traffic to
talk to the database. It's clear that the network layout and other infrastructure decisions will influence
WF's performance. An interesting whitepaper on performance characteristics is available online [7].
We'll systematically measure the time it takes to adapt a workflow dynamically, i.e. the cost of the
ApplyWorkflowChanges method call. When working with suspension points, other information will
be gathered too, such as the time it takes to suspend a workflow and to resume it immediately,
which is a common scenario when instrumented workflows ignore suspension points. Suspending a
workflow instance will give other instances a chance to execute before the former instance is
resumed again, which can lead to performance losses when the runtime is under a high load.
Symbol | Meaning
n | Number of sequential modifications

[Chart: Best, Avg and Worst modification duration in ms against workload N (10, 100, 1000); y-axis 0-80 ms.]
Graph 1 - Impact of workload on internal modification duration (n = 1)
[Chart: Best, Avg and Worst modification duration in ms against workload N (10, 100, 1000); y-axis 0-160 ms.]
Graph 2 - Impact of workload on internal modification duration (n = 2)
[Chart: Best, Avg and Worst modification duration in ms against workload N (10, 100, 1000); y-axis 0-300 ms.]
Graph 3 - Impact of workload on internal modification duration (n = 3)
[Chart with data table; values in ms (decimal comma notation), for n = 1, 2, 3:]
n | 1 | 2 | 3
Avg | 8,981 | 14,570 | 21,871
Worst | 72,600 | 156,500 | 258,000
Best | 6,900 | 9,000 | 11,000
Graph 4 - Impact of update batch size on internal modification duration (N = 1000)
Since internal modifications have a precise timing, meaning that there's no risk of missing an update
because it is applied in-line with the rest of the workflow instance execution, we can conclude that
updating a workflow instance, even under high load conditions, only introduces a minor delay on
average. Therefore, we do not expect much delay on the overall execution time of an individual
workflow instance, a few exceptions set apart (characterized by the worst execution times shown
above).
[Chart: Best, Avg and Worst modification duration in ms against workload N (10, 100, 1000); y-axis 0-7.000 ms.]
Graph 5 - Impact of workload on external modification duration (n = 1)
[Chart: Best, Avg and Worst modification duration in ms against workload N (10, 100, 1000); y-axis 0-9.000 ms.]
Graph 6 - Impact of workload on external modification duration (n = 2)
[Chart: Best, Avg and Worst modification duration in ms against workload N (10, 100, 1000); y-axis 0-10.000 ms.]
Graph 7 - Impact of workload on external modification duration (n = 3)
A first fact that draws our attention is the different order of magnitude (seconds) of the adaptation
durations compared to the equivalent internal adaptations, certainly for higher workloads. As a
matter of fact, the average execution time for external modification grows faster than the equivalent
average execution time in the internal modification case.
This difference can be explained by the fact that external workflow modifications have a far bigger
impact on the scheduling of the workflow runtime since workflow instances need to be suspended
prior to performing a dynamic update. This trend is illustrated in Graph 8, where we measured the
time elapsed between suspending a workflow instance and resuming it directly again, based on the
WorkflowSuspended event provided by the workflow runtime.
[Chart: Best, Avg and Worst suspend-resume time in ms against workload N (10, 100, 1000); y-axis 0-200 ms.]
Graph 8 - Suspend-resume time
In Graph 9 we take a closer look at the impact of multi-adaptation dynamic update batches.
[Chart with data table; values in ms (decimal comma notation), for n = 1, 2, 3:]
n | 1 | 2 | 3
Avg | 3.450,007 | 4.128,337 | 4.872,183
Worst | 6.792,600 | 8.121,500 | 9.441,300
Best | 160,500 | 277,100 | 312,500
Graph 9 - Impact of update batch size on external modification duration (N = 1000)
Again, we observe a linear relationship between the number of adaptations and the execution time
required to apply the dynamic update. This can be reduced by merging neighboring activities into a
SequenceActivity when applicable, in order to reduce the batch size of the dynamic update.
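The batch-size reduction suggested above could be sketched as follows; the activity names are illustrative:

```csharp
// Instead of adding three activities as three separate changes in the
// batch, wrap the neighbors in a single SequenceActivity so the dynamic
// update only adds one top-level activity.
WorkflowChanges changes =
    new WorkflowChanges(instance.GetWorkflowDefinition());

SequenceActivity group = new SequenceActivity("dynGroup");
group.Activities.Add(new DelayActivity("dynDelay1"));
group.Activities.Add(new DelayActivity("dynDelay2"));
group.Activities.Add(new DelayActivity("dynDelay3"));

changes.TransientWorkflow.Activities.Add(group);
instance.ApplyWorkflowChanges(changes);
```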
Comparing internal modifications with external modifications, the former certainly outperform
the latter. However, external modifications offer a more flexible way of adaptation in terms of
the ability to apply unanticipated updates from within the workflow runtime hosting layer context.
Furthermore, external modifications have access to contextual information about the surrounding
system in which the workflow instances participate.
As a conclusion, one should adopt the best practice of evaluating carefully which internal modifications
might be desirable (i.e. update scenarios that can be anticipated upfront) and what information
"gates" (e.g. by means of Local Communication Services, providing access to the hosting layer
context) should be present for internal modifications to have enough information available.
9 Conclusion
The long-running nature of workflows typically contrasts with their static definitions. For this very
reason, it makes sense to look for dynamic adaptation mechanisms that allow one or more instances
of a workflow to be changed at runtime. In this chapter we've outlined the different approaches
available in WF to do this.
Even if workflows are not used in a long-running fashion, dynamic adaptation proves useful to
abstract away aspects like logging that shouldn't be introduced statically. If such aspects were
included in the workflow's definition right away, one of the core goals of workflow, graphical
business process visualization, would be defeated.
Performance-wise, one can draw a distinction between external and internal modification. While
the former allows more flexibility with respect to the direct availability of state from the
environment surrounding the workflow instances, it has a much higher cost than internal
modification, due to the need to suspend a workflow instance, which might also cause persistence to
take place. Furthermore, to allow precise modifications from the outside, suspension points can be
used, but these incur a non-trivial cost too.
As part of this work, the current implementation of the AB Switch agent was investigated in detail.
From this investigation we can conclude that the typical flow of operations performed by the agent
consists of many repetitive tasks, each slightly different from one another, for example to query the
database. It is clear that the real flow of operations is buried under masses of procedural coding,
making the original flow as depicted in Figure 17 invisible to end-users and even to developers.
Therefore, workflow seems an attractive solution to replace the classic procedural coding of the
algorithm, which is full of data retrieval activities, calculation activities, decision logic, formatting and
sending of mails, etc. This methodology would not only allow visual inspection of the agent's inner
workings by various stakeholders, but would also allow hospital doctors to investigate the decision
logic applied on a patient-per-patient basis, i.e. which criteria led to a positive or negative switch
advice. In other words, workflow wouldn't only provide a more visual alternative to the original
procedural coding; the presence of various runtime services such as tracking would also open up
possibilities far beyond what can be implemented in procedural coding in a timely fashion.
- The agent is represented as one big workflow that is triggered once a day and processes all
patients one by one as part of the workflow execution.
- Decision making for an individual patient is treated as the workflow, while the hosting layer
is responsible for creating workflow instances for each patient.
Both approaches have their pros and cons. Having just one workflow that is responsible for the
processing of all patients makes the agent host very simple. It's just a matter of pulling the trigger
once a day to start the processing. However, for inspection of individual patient results based on
tracking (see Tracking Services in Chapter 1, paragraph 7 on page 39), this approach doesn't work out
that well, since one can't gain direct access to an individual patient's tracking information. Also, if one
would like to trigger processing for an individual patient on a different schedule or in an ad-hoc
fashion, the former approach won't help out.
From the performance point of view, having one workflow instance per patient allows the
workflow runtime's scheduling service to run tasks in parallel without any special effort from the
developer. On the dark side of the latter approach, the hosting layer will have more work to do, since
it needs to query the database as well to retrieve the set of patients that have to be processed.
Furthermore, aggregating the information that's calculated for the patients (e.g. for reporting
purposes or for subsequent data processing by other agents or workflows) becomes a task of the
host, which might be more difficult due to the possibility of out-of-order completion of individual
workflow instances. This out-of-order completion stems from the scheduling mechanism employed in
WF: there's no guarantee that workflow instances complete in the same order as they were started.
After all, in the light of enabling long-running processing in workflow, some instances might take
longer to execute than others. But even in the case of short-running workflows, the runtime
reserves the right to swap out a workflow instance in favor of another one. This behavior
becomes more visible on machines hosting multiple workflow definitions that can be instantiated at
any point in time. Therefore, one shouldn't rely on any ordering assumptions whatsoever.
We'll try to find the balance between both approaches by encapsulating the database logic in such a
way that it can be used by both the host and the workflow itself, in order to reduce the plumbing on
the host required to talk to the database.
As a final note, we should point out that composition of individual workflow definitions is possible in
WF by various means. One possibility is to encapsulate the decision-making workflow for a patient in
- The (theoretical) risk of SQL injection attacks because of string concatenation-based query
composition in code;
- Quite a bit of ugly coding to handle unexpected database (connection) failures, partially
caused by the shaky Sybase database provider in .NET;
- A few performance optimizations and .NET and/or C# patterns that could prove useful for
the agent's implementation.
Notice that there are various kinds of parallelism in workflow, both intra-workflow and on the
runtime level. By representing the evaluation process for a single patient as a single workflow
instance, we'll get parallel processing of multiple patients at a time. By putting ParallelActivity blocks
in the workflow definition, we do get parallelism inside workflow instances as well.
It might be tempting to wrap Boolean operators in some kind of parallel activity too. In such a
structure, all branches of the parallel activity would represent an operand of the enclosing operator.
Although such an approach has great visual advantages, it's far from straightforward to implement
and to use during composition. Remember that WF is all about control flow and there is no intrinsic
concept of data exchange between activities in a contract-driven manner. In other words, it's not
possible to apply a (to WF meaningful) interface (such as "returns a Boolean value") to an activity
that's used by an enclosing activity (e.g. the one that collects all Boolean results and takes the
conjunction or disjunction). Instead, WF is built around the concept of (dependency) properties for
cross-activity data exchange, which introduces the need for quite a bit of wiring during composition,
e.g. to connect child activities (the operands) to their parent (the operator). In addition, quite some
validation logic would be required to ensure the correctness of a Boolean composition.
In case data gathering results have to be validated to match certain criteria, e.g. a non-null check,
additional custom activities can be created that perform such logic and make the handling of such
corner cases visible in the workflow definition. In case of a criteria-matching failure, a special custom
exception can be thrown, somewhat the equivalent of putting an implicit throws clause in the
custom activity's contract. Notice however that .NET doesn't use checked exceptions [12]. Therefore,
users of such a custom activity should be made aware of the possibility for the activity to throw an
exception, e.g. by providing accurate documentation.
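To make that contract visible in code, a hedged sketch could look like this. The exception type and helper below are hypothetical, not taken from the AB Switch sources:

```csharp
using System;

// Hypothetical exception signaling that a gathered value failed validation.
[Serializable]
public class CriteriaMatchException : Exception
{
    public CriteriaMatchException(string criterion)
        : base("Data gathering result failed criterion: " + criterion) { }
}

public static class Criteria
{
    // Throws when the gathered value is null, mimicking an implicit
    // "throws" clause on the custom activity's contract.
    public static object RequireNonNull(object value, string name)
    {
        if (value == null)
            throw new CriteriaMatchException(name + " must be non-null");
        return value;
    }
}
```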
- Queries shouldn't be hardcoded, allowing a query to be changed without having to touch the workflow definition. To realize this requirement, queries will be identified by a unique name and will be stored in a configuration base. This should make it possible to optimize queries or even to refactor queries into stored procedures without having to shut down the agent.
- Storage of queries should be implemented in a generic fashion, allowing queries to be stored by various means. An interface to retrieve query definitions will be used to allow this level of flexibility. For example, queries could be stored in XML or in a database, by implementing the query retrieval interface in a suitable way.
- Parameterization of queries should be straightforward in order to ease composition tasks. Typically, the parameters required to invoke a query will originate from the outside (e.g. a patient's unique identifier) or from other sources inside the same workflow. Furthermore, parameters should be passed to the data gathering mechanism in a generic fashion, maximizing the flexibility of data gathering while minimizing composition efforts.
- Results produced by invoking a query should be kept in a generic fashion too. This should make consumption of produced values easy to do, while supporting easy chaining of various queries as well, i.e. outputs of one query could act as (part of) the inputs for another query.
Notice the XML serialization support that's required for workflow properties of this type in order to
be persisted properly when the runtime indicates to do so. This data representation format makes
inspection at runtime, e.g. using instrumentation mechanisms, much easier than facing various
values spread across the entire workflow definition.
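The PropertyBag implementation itself isn't listed here; a minimal sketch consistent with the XML shape shown later could look as follows. This assumes a dictionary-based collection with custom XML serialization, and nested bags are omitted for brevity; the real implementation may differ:

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;

// Sketch of a name/value property bag; the real implementation may differ.
public class PropertyBag : Dictionary<string, object>, IXmlSerializable
{
    public System.Xml.Schema.XmlSchema GetSchema() { return null; }

    public void ReadXml(XmlReader reader)
    {
        reader.Read(); // move past the opening element
        while (reader.NodeType == XmlNodeType.Element)
            this[reader.Name] = reader.ReadElementContentAsString();
        if (reader.NodeType == XmlNodeType.EndElement)
            reader.ReadEndElement();
    }

    public void WriteXml(XmlWriter writer)
    {
        // Emit one element per key, e.g. <Bed>E307</Bed>.
        foreach (KeyValuePair<string, object> entry in this)
            writer.WriteElementString(entry.Key, entry.Value.ToString());
    }
}
```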
Next, a contract for query execution and data retrieval has to be created. In order to keep things
inside the workflow definition as simple as possible, we'll retrieve a query object through Local
Communication Services (LCS) by means of a query manager as shown in Code 42.
[ExternalDataExchange]
public interface IQueryManager
{
IQuery GetQuery(string name);
}
Code 42 - Query manager used by LCS
Finally, the interface for query definition and execution is defined as follows:
public interface IQuery
{
string Name { get; }
List<PropertyBag> Execute(PropertyBag parameters);
}
Code 43 - Query representation and execution interface
Using this query representation, chaining of query results to query inputs becomes pretty easy to do,
because of the use of property bags both as input and as output. The WF designer supports indexing
in List<T> collections, so it's possible to grab the first row from the query's results collection just by
using a [0] equivalent in the designer.
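Assuming the IQueryManager and IQuery contracts above, host-side chaining could be sketched as follows; the query names are made up for illustration:

```csharp
using System.Collections.Generic;

public static class QueryChaining
{
    // Hypothetical chaining: the first result row of one query acts as
    // the parameter bag for the next, mirroring the designer's [0] usage.
    public static List<PropertyBag> GetAntibioticsForFirstPatient(
        IQueryManager queryManager)
    {
        IQuery getPatients = queryManager.GetQuery("GetPatients");
        IQuery getAntibiotics = queryManager.GetQuery("GetAntibiotics");

        List<PropertyBag> patients =
            getPatients.Execute(new PropertyBag());

        // Take the first row and feed it to the next query as input.
        return getAntibiotics.Execute(patients[0]);
    }
}
```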
public static DependencyProperty QueryProperty =
DependencyProperty.Register("Query",
typeof(string),
typeof(GatherDataActivity));
[Browsable(true)]
[Category("Database settings")]
[Description("Name of the query.")]
[DesignerSerializationVisibility(
DesignerSerializationVisibility.Visible)]
public string Query
{
get { return (string)base.GetValue(QueryProperty); }
set { base.SetValue(QueryProperty, value); }
}
public static DependencyProperty ResultsProperty =
DependencyProperty.Register("Results",
typeof(List<PropertyBag>),
typeof(GatherDataActivity));
[Browsable(true)]
[Category("Input/output")]
[Description("Results of the query.")]
[DesignerSerializationVisibility(
DesignerSerializationVisibility.Visible)]
public List<PropertyBag> Results
{
get { return (List<PropertyBag>)base.GetValue(ResultsProperty); }
set { base.SetValue(ResultsProperty, value); }
}
We've dropped the implementation of the validator and designer helper classes which provide
validation support during compilation and layout logic for the workflow designer respectively. All of
the parameters to the GatherDataActivity have been declared as WF dependency properties to assist
with binding. For more information on dependency properties we refer to [14].
The core of the implementation is in the Execute method which is fairly easy to understand. First, it
retrieves the query manager to grab the query object from, based on the query's name. Execution of
the query is then just a matter of calling the Execute method on the obtained IQuery object using the
appropriate parameters. Finally, results are stored in the Results property and the activity signals it
has finished its work by returning the Closed value from the ActivityExecutionStatus enum.
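Pieced together from this description, the Execute method could look roughly as follows. This is a sketch: the Parameters property is assumed to be the activity's input PropertyBag, and error handling is omitted:

```csharp
protected override ActivityExecutionStatus Execute(
    ActivityExecutionContext executionContext)
{
    // Retrieve the query manager registered as a Local Communication
    // Service and look up the query object by its configured name.
    IQueryManager queryManager =
        executionContext.GetService<IQueryManager>();
    IQuery query = queryManager.GetQuery(Query);

    // Execute the query with the bound parameters and keep the results.
    Results = query.Execute(Parameters);

    // Signal that the activity has finished its work.
    return ActivityExecutionStatus.Closed;
}
```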
Each <Query> element is composed of a set of Inputs (the parameters) and Outputs (the columns)
together with a unique name, a category (for administrative purposes) and the parameterized query
statement itself. This parameterized statement will typically be mapped on a parameterized SQL
command that's executed against the database, grabbing the parameter values from the
GatherDataActivity's input PropertyBag and spitting out the corresponding query results, which are
subsequently stored in the GatherDataActivity's output list of PropertyBag objects.
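As an illustration, such a definition could look like the following fragment; the element and attribute names are assumptions, since the schema is not fixed by the core framework:

```xml
<!-- Hypothetical query definition; names and types are illustrative. -->
<Query Name="GetPatients" Category="ABSwitch">
  <Inputs>
    <Parameter Name="@from" Type="datetime" />
    <Parameter Name="@to" Type="datetime" />
  </Inputs>
  <Outputs>
    <Column Name="PatientID" Type="int" />
    <Column Name="Bed" Type="varchar" />
  </Outputs>
  <Statement>
    SELECT PatientID, Bed FROM Patients
    WHERE EnterTime BETWEEN @from AND @to
  </Statement>
</Query>
```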
The data types supported on parameters and columns are directly mapped to database-specific
types. Another level of abstraction could be introduced to make these types abstract, with automatic
translation of type names into database-specific data types. However, the query definition XML isn't
part of our core framework, so one can modify the XML schema at will. Our query representation
maps nicely on a Sybase or SQL Server database, as used by the AB Switch agent.
This implementation loads an XML file from disk and builds an IQuery object of type SybaseQuery
based on a specified query name. We should point out that the implementation shown here is
slightly simplified for the sake of the discussion. A production-ready query manager should allow
more flexibility by detecting changes of the file on disk using a FileSystemWatcher component or by
providing some kind of Refresh method that can be called from the outside when needed. Ideally,
Let's point out a few remarks concerning this piece of code. First of all, the GetOdbcType helper
method was oversimplified, only providing support for the types int and datetime, which happen to
be the most common ones in the AB Switch agent. For a full list of types, take a look at the OdbcType
enumeration. Next, the Execute method has some built-in logic that allows chaining of query blocks
by translating column names into parameter names prefixed with the @ character. In case more flexibility
is needed, CodeActivity activities can be put in place to provide extended translation logic from
gathered data into query parameterization objects. However, when naming parameters and columns
in a consistent fashion, this kind of boilerplate code can be reduced significantly. For the AB Switch
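Reconstructed from this description, the oversimplified GetOdbcType helper could look as follows; this is a sketch, and the real helper may differ:

```csharp
using System;
using System.Data.Odbc;

public static class TypeMapping
{
    // Oversimplified helper: only the two types most common in the
    // AB Switch agent are supported.
    public static OdbcType GetOdbcType(string type)
    {
        switch (type)
        {
            case "int":
                return OdbcType.Int;
            case "datetime":
                return OdbcType.DateTime;
            default:
                throw new NotSupportedException(
                    "Unsupported data type: " + type);
        }
    }
}
```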
5.1 ForeachActivity
WF has one type of loop activity, the WhileActivity, which allows repetitive execution of a sequence
activity based on a given condition. However, when turning workflows into data processing units, as
is the case with the AB Switch agent, data iteration constructs seem useful. To serve this need, the
ForeachActivity was created, allowing a set of data to be iterated over while executing a sequence
activity for each data item in the set, more or less equal to C#'s foreach keyword.
However, before continuing our discussion of this custom activity we should point out that this
activity might not be the ideal choice when dealing with long-running workflow instances that can be
suspended. The limitations imposed by our ForeachActivity are somewhat equivalent to limitations
of iterators in various programming languages: during iteration, the source collection shouldn't
change. In case a workflow instance suspends while iterating over a set of data, that set of data will
Figure 27 - ForeachActivity
Essentially, the ForeachActivity forms a graphical representation for data iteration tasks in flowchart
diagrams, for example to perform some kind of calculation logic that uses intermediary results based
on individual data items in a set of retrieved data. As mentioned before, it's key for the body of the
ForeachActivity to be short-running and non-suspending (at least not explicitly; the workflow
runtime can suspend instances for various reasons). If long-running processing is required on data
items, individual workflow instances should be fired to perform further processing on individual
items in an asynchronous manner, causing the main iteration loop to terminate quickly. This
approach can be realized using the InvokeWorkflow activity of WF. However, in such a case one
should consider getting rid of the ForeachActivity altogether by moving
the iterative task to the host layer where workflow instances can be created for individual data
items.
An example use in the AB Switch agent could be the iteration over the list of patients that have to be
processed for the past 24 hours, as shown in Figure 28.
As long as the SequenceActivity nested inside the ForeachActivity is short-running (kind of a normal
procedural coding equivalent), there's nothing wrong with this approach. However, one shouldn't
forget about the sequential nature of the ForeachActivity, as it is derived from the SequenceActivity
base class from the WF base class library. Because of this, no parallelism can be obtained right away
while processing records. The IList interface used for the input of the ForeachActivity nails down this
sequential ordering.
Therefore, for the macro-level of the AB Switch agent, we'll move the patient iteration logic to the
host layer, effectively representing the processing for one single patient in one workflow definition,
enhancing the degree of parallelism that can be obtained at runtime by executing calculations for
multiple patients at the same time. Nevertheless, we can still take advantage of our query manager
object to retrieve the list of patients in a generic fashion on the host layer, even without having to
use the GatherDataActivity.
Notice that an alternative approach could be the use of a single activity used to signal either a
positive case or a negative case. However, WF doesn't support a designer layout mechanism that can
change an activity's color based on property values set on the activity, so we'll stick with two
separate custom activities with a hardcoded green and red color respectively.
Indeed, in a procedural coding equivalent, we'd optimize this code by assigning the condition directly
to the value variable. However, such an approach in the world of workflow won't provide any visuals
that indicate the outcome of a condition. It should be noted however that the condition builder from
WF seems to be reusable for custom activities, but no further attention was paid to this in particular,
especially since that kind of implementation again would hide the colorful signaling.
As an example of a more complex IfElseActivity combined with such signaling blocks, take a look at
the epilogue of the AB Switch implementation that makes the final decision for a patient (Figure 30).
The corresponding canSwitch condition for the IfElseActivity is shown in Figure 31.
5.3 FilterActivity
As an example to illustrate the richness of the presented approach to data-driven workflows, a
FilterActivity has been created to help protect private patient data that's required throughout the
workflow itself but shouldn't be exposed to the outside. Different approaches to realize this goal
exist, ranging from dropping entries from property bags, through data replacement with anonymous
values, to data encryption so that only authorized parties can read the data.
The skeleton for such a filtering activity is outlined in Code 50 below.
[DefaultProperty("Filters")]
[Designer(typeof(FilterDesigner), typeof(IDesigner))]
[ToolboxItem(typeof(ActivityToolboxItem))]
public class FilterActivity : Activity
{
public static DependencyProperty FiltersProperty =
DependencyProperty.Register("Filters", typeof(string[]),
typeof(FilterActivity));
[Browsable(true)]
[Category("Filtering")]
[Description("Property filters.")]
[DesignerSerializationVisibility(
DesignerSerializationVisibility.Visible)]
public string[] Filters
{
get { return (string[])base.GetValue(FiltersProperty); }
set { base.SetValue(FiltersProperty, value); }
}
public static DependencyProperty ListSourceProperty =
DependencyProperty.Register("ListSource",
typeof(List<PropertyBag>), typeof(FilterActivity));
[Browsable(true)]
[Category("Data")]
[Description("List source.")]
[DesignerSerializationVisibility(
DesignerSerializationVisibility.Visible)]
public List<PropertyBag> ListSource
{
get { return (List<PropertyBag>)base.GetValue(ListSourceProperty); }
set { base.SetValue(ListSourceProperty, value); }
}
protected override ActivityExecutionStatus Execute(
ActivityExecutionContext executionContext)
{
foreach (PropertyBag p in ListSource)
{
foreach (string f in Filters)
{
// Perform filtering logic
}
}
return ActivityExecutionStatus.Closed;
}
}
Code 50 - Code skeleton for a filtering activity
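One possible interpretation of the filtering logic, simply dropping the filtered entries from each property bag, is sketched below; data replacement or encryption would be equally valid strategies:

```csharp
using System.Collections.Generic;

public static class Filtering
{
    // Removes every filtered property from each bag in the list.
    public static void Apply(List<PropertyBag> source, string[] filters)
    {
        foreach (PropertyBag bag in source)
            foreach (string filter in filters)
                bag.Remove(filter);
    }
}
```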
The real richness of this activity stems from its possible combination with dynamic updates and
instrumentation. Especially in research-driven environments, it makes sense to be able to use a
5.4 PrintXmlActivity
A final core building block in our library is the PrintXmlActivity that transforms property bags into an
XML representation and (optionally) applies an XSLT stylesheet to the data. Since property bags are
derived from a built-in collection type, serialization to XML is a trivial thing to do, resulting in a
friendly representation of the name/value pairs, as illustrated in Listing 3.
<?xml version="1.0" encoding="utf-16"?>
<ArrayOfPropertyBag>
<PropertyBag>
<Bed>E307</Bed>
<First>Marge</First>
<Last>Simpson</Last>
<ID>591122</ID>
<Number>078A93</Number>
<Antibiotics>
<PropertyBag>
<Name>Ciproxine IV</Name>
<Dose>200mg/100ml flac</Dose>
</PropertyBag>
<PropertyBag>
<Name>Dalacin C IV</Name>
<Dose>600mg/4ml amp</Dose>
</PropertyBag>
</Antibiotics>
</PropertyBag>
<PropertyBag>
<Bed>E309</Bed>
<First>Homer</First>
<Last>Simpson</Last>
<ID>370818</ID>
<Number>060A39</Number>
<Antibiotics>
<PropertyBag>
<Name>Flagyl IV</Name>
<Dose>500mg/100ml zakje</Dose>
</PropertyBag>
</Antibiotics>
</PropertyBag>
</ArrayOfPropertyBag>
Listing 3 - XML representation of property bags
Applying an XSLT stylesheet isn't difficult at all using the XslCompiledTransform class from .NET's
System.Xml.Xsl namespace.
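A short sketch of such a transformation; the stylesheet path is a placeholder:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlFormatting
{
    // Applies an XSLT stylesheet to the serialized property bag XML.
    public static string Transform(string inputXml, string stylesheetFile)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        transform.Load(stylesheetFile);

        using (StringReader input = new StringReader(inputXml))
        using (XmlReader reader = XmlReader.Create(input))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(reader, null, output);
            return output.ToString();
        }
    }
}
```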
6 Other ideas
6.1 Calculation blocks
Workflows similar to the AB Switch workflow tend to perform quite a bit of calculation work in order
to produce useful results based on the data fed in. The AB Switch workflow contains a good example:
a calculation block used to calculate the levophed dose, as shown in Figure 32.
When retrieving all of the required data for the calculation in only a few data gathering activities, the
code for the calculation block tends to be pretty easy to implement. In case of AB Switch, one well-chosen query was used to retrieve all of the input values for the calculation block at once. This is
shown in Code 52.
SELECT PharmaID, MAX(Rate) AS Rate FROM PharmaInformation WHERE PatientID =
@PatientID AND PharmaID IN (1000582, 1000583, 1000584, 1000586) AND
EnterTime BETWEEN @from AND @to GROUP BY PharmaID
Code 52 - Retrieving input values for levophed calculation
In the original design of AB Switch, the maximum dose for each pharma item was retrieved
individually, resulting in four queries. Using some SQL horsepower such as aggregation and grouping,
this was reduced to one single query, making it more agent-specific but much more efficient.
One could even go one step further by writing a stored procedure that does all of the calculation at
the database layer, returning a single result containing the calculated value, reducing the
calculation block to a regular GatherDataActivity. This would put additional stress on the database
however and should be considered carefully.
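Such a stored procedure could be sketched as follows. This is a hedged sketch: the procedure name and the final aggregation step are assumptions, since the exact levophed formula isn't shown here:

```sql
-- Hypothetical stored procedure moving the calculation to the database
-- layer; the aggregation over the per-pharma maxima is an assumption.
CREATE PROCEDURE CalculateLevophedDose
    @PatientID int, @from datetime, @to datetime
AS
    SELECT SUM(MaxRate) AS LevophedDose
    FROM (SELECT MAX(Rate) AS MaxRate
          FROM PharmaInformation
          WHERE PatientID = @PatientID
            AND PharmaID IN (1000582, 1000583, 1000584, 1000586)
            AND EnterTime BETWEEN @from AND @to
          GROUP BY PharmaID) AS MaxRates
```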
Though the use of a CodeActivity is by far the easiest implementation for (quite complex) calculation
logic, it doesn't have the advantage of human readability. As long as the calculation algorithm is well-known by people who inspect the workflow definition, this shouldn't be a big problem, especially
when the CodeActivity is well-named. In such a case the calculation block can be considered a black
box. Nevertheless, without doubt situations exist where more flexibility is desirable, reducing the
need for recompilation when calculation logic changes.
A first level of relaxation can be obtained by parameterization of calculation logic using a
PropertyBag that's fed in by the host. Alternatively, Local Communication Services can be used to
query the host for parameterization information during a workflow instance's execution lifecycle, in
an interface-based fashion.
However, if the logic itself needs to be modifiable at runtime, more complex mechanisms will have to
be put in place to support this. One such approach could be the use of dynamic type loading (keeping
in mind that types cannot be unloaded from an application domain once they have been loaded, see
[4]) through reflection as outlined in Code 53.
The creation of a calculator activity that invokes a calculator through the ICalculator interface is
completely analogous to the creation of our GatherDataActivity that relied on a similar IQuery
interface. The only difference is the use of reflection in the GetCalculator method implementation, in
order to load the requested calculator implementation dynamically (see Code 54). For better
scalability, some caching mechanism is highly recommended to avoid costly reflection operations
each time a calculation is to be performed.
class CalculatorManager : ICalculatorManager
{
public ICalculator GetCalculator(string calculator)
{
// Get the calculator type from the specified assembly.
// The assembly could also live in the GAC.
string assembly = "put some assembly name here";
string type = calculator;
// CreateInstance(assembly, type) returns an ObjectHandle; Unwrap it
// to obtain the actual instance.
return (ICalculator)Activator.CreateInstance(assembly, type).Unwrap();
}
}
Code 54 - Dynamic calculator type loading
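The recommended caching could be added as sketched below, a variant of Code 54 in which the assembly name remains a placeholder:

```csharp
using System;
using System.Collections.Generic;

// Variant of the calculator manager that caches created instances to
// avoid paying the reflection cost on every calculation (a sketch).
class CachingCalculatorManager : ICalculatorManager
{
    private readonly Dictionary<string, ICalculator> cache =
        new Dictionary<string, ICalculator>();
    private readonly object syncRoot = new object();

    public ICalculator GetCalculator(string calculator)
    {
        lock (syncRoot)
        {
            ICalculator result;
            if (!cache.TryGetValue(calculator, out result))
            {
                string assembly = "put some assembly name here";
                result = (ICalculator)Activator.CreateInstance(
                    assembly, calculator).Unwrap();
                cache[calculator] = result;
            }
            return result;
        }
    }
}
```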
This technology has several benefits, including the unification of various data domains, ranging from
relational data and XML to in-memory objects (even allowing joins between domains), the generation of
efficient queries at runtime, strong type checking against a database model, etc.
From a workflow's point of view, the LINQ entity frameworks could be useful to build the bridge
towards a strongly-typed approach for query execution in a workflow context, using an alternative
GatherDataActivity implementation that binds to a LINQ query or its underlying expression tree
representation in some way.
That being said, the future applicability of LINQ in this case remains to be seen. One question that will
have to be answered is the availability of Sybase database support, which is highly unlikely to be
created by Microsoft itself, although third parties can use LINQs extensibility mechanism to support
it. Time will tell how LINQ will evolve with respect to WF-based applications, if such a scenario is even
under consideration for the moment.
For more advanced execution control, one can use WF's queuing mechanisms to queue work that
can be retrieved elsewhere for further processing. This concept would drive us too far from home
however. For a detailed elaboration, we kindly refer to [14].
Crashing the thread that retrieves data from the completed workflow instance isn't a good thing and
can be avoided completely by making the workflow return nothing and having it queue up the
obtained results. Another component (in a separate process or even on another host) can then drain
the queue, optionally in a transactional manner, for further processing and/or result delivery.
Using designer re-hosting we were capable of composing AB Switch without requiring the Visual
Studio 2005 tools, resulting in an XOML file that can be compiled for execution. As mentioned in
Chapter 2, paragraph 2.3, the WF API has a WorkflowCompiler class aboard that can be used for
dynamic workflow type compilation, and that's exactly what's happening in the sample application
when calling Workflow, Compile Workflow.
However, to be directly usable by end-users, some issues need further investigation, resulting in
further simplification of the tool. The most obvious problem is likely the (lack of) end-user knowledge of
some WF concepts such as property binding. For example, when dragging a GatherDataActivity to
the designer surface, properties have to be set to specify the query name but also to bind the query
parameters and the result object. This is shown in Figure 35.
Although all of the workflow designer tools (e.g. the property binding dialog, the rules editor, etc.)
from Visual Studio 2005 are directly available in the designer re-hosting suite, some are too difficult
for use by end-users directly.
One idea to overcome some of these issues is making the re-hosted designer tool more domain-specific, with inherent knowledge of the query manager in order to retrieve a list of available queries
(notice that the query manager interface will have to be extended to allow this). Once all queries are
retrieved in a strongly-typed fashion (i.e. with full data type information for parameters and
columns), the system can populate the toolbox with all available queries (in the end, one could even
create an additional query designer tool to build new queries within the same environment, provided
the end-user has some SQL knowledge). Queries in WF are nothing more than pre-configured GatherDataActivity
blocks, together with a parameterization PropertyBag and a result list of PropertyBags. This approach
reduces the burden of creating and binding input/output properties manually, increasing the
usability dramatically at the cost of a more domain-specific workflow designer host, but that should
be an acceptable trade-off.
Nevertheless, some basic knowledge of bindings will remain necessary for end-users in order to bind
one activity's output to another activity's input, for example to feed the output from a data retrieval
operation to a PrintXmlActivity. An alternative to make this more user-friendly might be the
visualization of all PropertyBag and List<PropertyBag> objects in some other toolbox pane. Bindings
for activities' default properties could then be established using simple drag-and-drop operations.
Such an approach visualizes the implicit dataflow present in workflow-based applications.
For example, a GatherDataActivity could have its Parameters property decorated with some custom
attribute that indicates it is the default input property. In a similar fashion, the Results property can
then be specified as the default output property. Using reflection, the tool can find out about those
default properties to allow for drag-and-drop based property binding. An early prototype of this
mechanism is shown in Figure 36, where green lines indicate inputs and red lines indicate outputs.
The list on the left-hand side shows all of the (lists of) PropertyBags that are present in the current
workflow, together with their bindings.
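A minimal sketch of such marker attributes and the reflection scan; the attribute names are made up for this prototype idea:

```csharp
using System;
using System.Reflection;

// Hypothetical marker attributes for drag-and-drop property binding.
[AttributeUsage(AttributeTargets.Property)]
public class DefaultInputAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Property)]
public class DefaultOutputAttribute : Attribute { }

public static class BindingDiscovery
{
    // Finds the property on an activity type marked as default input.
    public static PropertyInfo FindDefaultInput(Type activityType)
    {
        foreach (PropertyInfo property in activityType.GetProperties())
            if (property.IsDefined(typeof(DefaultInputAttribute), false))
                return property;
        return null;
    }
}

// Illustrative activity-like class using the markers.
public class SampleActivity
{
    [DefaultInput]
    public object Parameters { get; set; }

    [DefaultOutput]
    public object Results { get; set; }
}
```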
Finally, in order to execute the newly created workflow definition, it has to be compiled and copied
to a workflow runtime host with all of the services configured that enable data gathering. In order to
make it callable from the outside, some kind of interface will need to be provided as well, which
could be generated automatically by inspecting input parameter lists and promoting them to a WSDL
or WCF contract. Building such an auto-deploy tool that takes care of all wiring of the workflow
engine and the web services façade won't be a trivial thing to do.
Questions remain unanswered, such as the need for some simple debugging support helping end-users to spot problems in a workflow composition. WF's activity model supports validators, which
have been implemented for our activities to indicate missing properties. However, validation
messages might look strange in the eyes of a regular end-user. Therefore a good goal is to reduce the
number of properties that have to be set manually, reducing the chance of compilation and/or
validation errors. However, WF base library activities such as IfElseActivity and CodeActivity are likely
to be required in the lion's share of composed workflows, e.g. to drive decision logic. Such activities
will still require manual configuration, possibly resulting in the weakest link of the chain: the
composition tool is only as simple as the most complex activity configuration.
Also, activities like PrintXmlActivity require more complex parameters, such as complete XSLT files,
which are far from easy to create even for most developers.
In the end, we believe there's lots of room for research around this topic, in order to make workflows
easily composable by end-users themselves, especially in a data-processing workflow scenario.
9 Performance analysis
9.1 Research goal
An important question when applying workflow as a replacement for procedural coding is the overall
performance impact on the application. Though performance degradations are largely overshadowed
in long-running workflows, applying workflows for short-running data processing operations can be
impacted by performance issues, especially when these operations are triggered through some
request-response RPC mechanism such as web services. This being said, efficient resource utilization
in long-running workflows matters as well.
In this paragraph we'll investigate the impact of workflow-based programming on data processing
applications. More specifically, we'll take a look at the performance of the ForeachActivity and the
GatherDataActivity compared to their procedural equivalents, the performance characteristics of
per-record processing using individual workflow instances versus iterative workflows, and the
impact of intra-workflow and inter-workflow parallelization.
Notice this piece of code isn't thread-safe since the WorkflowCompleted event handler isn't
synchronized. However, locking introduces a significant overhead that would affect the performance
results for a workflow-based application negatively. Therefore, we don't fix this issue, at the risk of
having a blocked test program when the counter n doesn't reach its final value because of
overlapped write operations. In such a case we simply restart the test application.
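For completeness, note that atomic increments offer a middle ground that avoids full locking; a sketch of such a completion counter using Interlocked (the names are ours, not from the test program):

```csharp
using System.Threading;

public class CompletionCounter
{
    private int n;

    // Called from the WorkflowCompleted event handler; atomic and
    // cheaper than taking a full lock around the handler.
    public void OnWorkflowCompleted()
    {
        Interlocked.Increment(ref n);
    }

    public int Count
    {
        get { return Thread.VolatileRead(ref n); }
    }
}
```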
We measured results like the following:
Procedural: 1015 ms
Workflow: 3983 ms
So, the difference is about a factor of four, due to various mechanisms inside WF that are quite
resource-hungry, especially the scheduler that has to schedule individual workflow instances as well
as each individual activity inside a workflow.
(Graph: execution time in ms for procedural versus workflow-based execution.)
Using a default workflow scheduler configuration, we observe that the gap between procedural and
workflow-based execution shrinks from a factor of 4 down to a factor that drops under 2. In
absolute figures however, workflow remains slower than the procedural equivalent. By tweaking the
number of scheduler threads a slight performance increase can be realized for regular code-based
workflows, as shown in Graph 11 for a workflow definition with a sequence of two CodeActivity
blocks. In this case, it doesn't make much sense to parallelize write operations to the Console, but
when facing database operations with latency, parallelization will pay off, as we'll see further on,
especially when combined with the ParallelActivity from the WF toolbox.
(Graph 11: execution time in ms for a code-based workflow with varying numbers of scheduler threads.)
The cost of a thousand integer operations can be neglected, giving us a good idea of the overhead
caused by workflow instance creation and input/output communication. To separate the costs of
workflow creation from the code execution inside, we measured the time it takes to create and
execute a thousand instances of a void non-parameterized workflow, resulting in 287 ms. Based on
this, the remaining 133 ms can be considered an approximation for the input/output costs for the
three data values used in and produced by the workflow.
In addition we measured the cost associated with pushing the calculation logic outside the workflow
definition using Local Communication Services with a simple calculator as shown in Code 59.
[ExternalDataExchange]
interface ICalculator
{
int Sum(int a, int b);
}
class Calculator : ICalculator
{
public int Sum(int a, int b)
{
return a + b;
}
}
Code 59 - LCS service for a simple calculation
Subtracting the workflow instance creation cost results in a total processing cost of 230 ms where
the communication cost is the most significant due to the trivial cost for an addition operation. From
this we can conclude that LCS introduces a significant cost. Nevertheless, the flexibility of LCS is much
desired so one should be prepared to pay this cost to establish communication with the hosting
layer.
(Graph: execution time in ms, Procedural versus Workflow, against the number of iterations.)
This growing gap between procedural and workflow can be explained by the nature of WF work
scheduling. Our ForeachActivity implementation uses the activity event model to schedule the
execution of the loops body. The core of the ForeachActivity implementation is shown in Code 60.
When the loop body finishes execution for an iteration, an event is raised that's used to call the loop
body again as long as there are items remaining in the source sequence. For the execution of the loop
body, the WF scheduler is triggered indirectly by calling the ExecuteActivity method on the
ActivityExecutionContext object. This scheduling overhead accumulates over time, causing the gap
between procedural and workflow-based iteration to grow.
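Based on this description, the heart of such an event-driven loop could be sketched as follows. This is an assumption-laden reconstruction, not the actual Code 60: member names like Source and Current are made up, validation and cancellation handling are omitted, and WF's per-iteration execution context plumbing is shown in simplified form:

```csharp
private IEnumerator<PropertyBag> enumerator;

protected override ActivityExecutionStatus Execute(
    ActivityExecutionContext executionContext)
{
    enumerator = Source.GetEnumerator();
    return ScheduleNextIteration(executionContext);
}

private ActivityExecutionStatus ScheduleNextIteration(
    ActivityExecutionContext context)
{
    if (!enumerator.MoveNext())
        return ActivityExecutionStatus.Closed;

    // Expose the current item to the loop body.
    Current = enumerator.Current;

    // A child activity can only execute once per execution context,
    // so each iteration runs the body in a fresh context.
    ActivityExecutionContext innerContext =
        context.ExecutionContextManager.CreateExecutionContext(
            EnabledActivities[0]);
    innerContext.Activity.Closed += OnBodyClosed;
    innerContext.ExecuteActivity(innerContext.Activity);
    return ActivityExecutionStatus.Executing;
}

private void OnBodyClosed(object sender,
    ActivityExecutionStatusChangedEventArgs e)
{
    e.Activity.Closed -= OnBodyClosed;
    ActivityExecutionContext context =
        (ActivityExecutionContext)sender;

    // Schedule the next iteration, or close the activity when the
    // source sequence is exhausted.
    if (ScheduleNextIteration(context) ==
        ActivityExecutionStatus.Closed)
        context.CloseActivity();
}
```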
Notice that this scheduling mechanism operates on the level of the workflow engine, across many
workflow instances. This means that individual workflow instances might be paused temporarily in
order to give other instances a chance to make progress. Because of this, the average execution time
of an individual workflow instance will increase. Strictly speaking, the ForeachActivity gives the WF
scheduler an opportunity to schedule another workflow instance every time it schedules the work for
the next iteration of the loop. One can compare this with voluntary yielding of threads in systems
with cooperative scheduling. Furthermore, such switching between instances can occur inside the
At first glance, this loop mechanism seems to be a disadvantage from a performance point of view.
However, it really depends on the contents of the loop. If long-running operations inside the loop
construct are launched asynchronously, the scheduler can switch active execution to another
workflow instance or to a parallel branch in the same workflow instance. When the background
work has completed, the activity is rescheduled so that its execution can complete.
The cost of the loop body is kept to a minimum: for each product, the product of its unit price and
the number of items still in stock is calculated, and these values are summed to yield the total stock
value. Although such a data processing job could be written much more efficiently by means of
(server-side) DBMS features such as aggregates and cursors, a workflow-based variant might be a
better choice to clarify the processing mechanism by means of a sequential diagram.
Notice the little trick in the SELECT statement's FROM clause that takes the Cartesian product of two
(identical) tables just to obtain a massive number of rows, so that the TOP row restriction never runs
out of rows when measuring performance for data sets exceeding the original table's row count.
For the query manager implementation that's hooked up to the workflow engine, we use the same
ADO.NET APIs and the same connection string, so data gathering takes place using the same
underlying connection technology and parameterization.
The results of the performance measurement are shown in Graph 13.
[Graph 13: Procedural versus Workflow series plotted against the number of data records (100 to 2500)]
This feature allows multiple readers to share one database connection and overcomes former
limitations of the Tabular Data Stream (TDS) protocol of SQL Server, which is also used by Sybase.
However, SQL Server 2005 is the only product today that supports MARS out of the box. For similar
nested DbDataReader implementations on other database products, one would have to open
multiple connections to the database to perform parallel data retrieval. Alternatively, the parent
query results could be loaded into memory first, just like the GatherDataActivity does, ready for a
subsequent iteration that executes the child queries for each parent-level record.
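As an illustration, enabling MARS on SQL Server 2005 is a matter of a connection string flag, after which nested readers can share one open connection. The sketch below uses the Northwind sample database; the exact query shapes are illustrative.

```csharp
using System;
using System.Data.SqlClient;

class MarsDemo
{
    static void Main()
    {
        // MultipleActiveResultSets=True turns on MARS (SQL Server 2005+).
        string dsn = "Data Source=.;Initial Catalog=Northwind;" +
                     "Integrated Security=SSPI;MultipleActiveResultSets=True";

        using (SqlConnection conn = new SqlConnection(dsn))
        {
            conn.Open();
            SqlCommand parent = new SqlCommand(
                "SELECT CategoryID FROM Categories", conn);
            using (SqlDataReader categories = parent.ExecuteReader())
            {
                while (categories.Read())
                {
                    // Without MARS, opening this second reader on the same
                    // connection would throw an InvalidOperationException.
                    SqlCommand child = new SqlCommand(
                        "SELECT ProductName FROM Products " +
                        "WHERE CategoryID = @cat", conn);
                    child.Parameters.AddWithValue("@cat",
                        categories["CategoryID"]);
                    using (SqlDataReader products = child.ExecuteReader())
                    {
                        while (products.Read())
                            Console.WriteLine(products["ProductName"]);
                    }
                }
            }
        }
    }
}
```

On databases without MARS, the same nesting requires either a second connection or the load-then-iterate approach mentioned above.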
Again, this sample could be implemented more efficiently by means of SQL join operations, but such
an approach might be impossible if advanced calculation logic or condition-based branching has to
be performed as part of the data processing job, as is the case in AB Switch. Also, if the data
processing spans multiple databases or different kinds of data sources (e.g. calling a web service
inside the query manager implementation) or if cross-domain joins need to be performed (e.g.
combining data from a relational database with data from an XML source), such optimizations are
no longer available. We should mention, however, that future technologies like LINQ might
overcome such limitations, especially the cross-domain querying one.
The results of this test are presented in Figure 39. This time we observe that the performance of the
procedural code variant is hurt by the nested query operations, compared to the results in the
previous section. At the same time, the workflow-based approach keeps its linear trend but still
lags behind by a factor of about 4.
[Figure 39: Procedural versus Workflow series plotted against the number of data records (100 to 2500)]
The inherent parallelism of the workflow doesn't help much; most likely it even hurts performance
a bit, because we're putting a high load on the scheduler inside the outer loop's body by creating
two parallel activities that both have a relatively short execution time.
[Graph: Procedural versus Workflow series; horizontal axis values 2 and 10]
Though the workflow-based variant doesn't catch up with the procedural code (which, by the way,
still performs serialized data retrieval operations), the situation is improving compared to the original
tests with no or little parallelism inside a workflow.
However, we don't expect much more progress while staying in an iterative data processing
model where only intra-workflow parallelism can be employed. The overhead of the iteration logic
seems to be a limiting factor, and sequential processing of records doesn't allow for much flexibility,
neither for the developer nor for the workflow runtime. Therefore, we move our focus towards inter-workflow parallelism.
9.6.4 Inter-workflow parallelism
Instead of using a huge outer loop inside a workflow definition, it makes more sense to
use one workflow instance per unit of work, e.g. the processing of one patient in a medical agent. In
this section, we'll investigate this approach, which was also taken for the AB Switch agent. For
aggregation-style tasks, data can be reported back to the workflow host where it's collected for
subsequent aggregation and/or reporting, optionally even by triggering another workflow that takes
in the collected set of data results and returns the aggregated result.
This time, our workflow definition is reduced to a very simple set of two parallel queries that return
some data based on a workflow parameter (in this case the product's unique identifier) passed as a
dependency property. The result produced by the workflow instance is reported back to the caller
by means of yet another dependency property. Such a fine-grained workflow definition is shown in
Figure 40.
Next, we'll move the outer iteration logic to the workflow host, using a SqlDataReader to iterate over
the products that require processing. For each product in the result set, a workflow instance is
created that performs the subsequent processing for that individual product, in this case reduced to
some data gathering operations with little calculation involved. This is illustrated in Code 63.
using (SqlConnection conn = new SqlConnection(dsn))
{
    SqlCommand cmd = new SqlCommand("SELECT TOP " + N +
        " p.ProductID, p.SupplierID, p.CategoryID" +
        " FROM Products AS p, Products AS p1", conn);
    conn.Open();
    SqlDataReader reader = cmd.ExecuteReader();
    while (reader.Read())
    {
        PropertyBag parameters = new PropertyBag();
        parameters["CategoryID"] = (int)reader["CategoryID"];
        parameters["SupplierID"] = (int)reader["SupplierID"];

        Dictionary<string, object> args =
            new Dictionary<string, object>();
        args["Parameters"] = parameters;

        WorkflowInstance instance = workflowRuntime
            .CreateWorkflow(typeof(Workflow3), args);
        instance.Start();
    }
}
Code 63 - Moving the outer loop to the workflow host
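On the hosting side, one way to collect the results that each instance reports back through its output dependency property is the runtime's WorkflowCompleted event. This is a sketch; the "Result" output parameter name is an assumption, and workflowRuntime and N are the variables from Code 63.

```csharp
using System.Threading;
using System.Workflow.Runtime;

// Collect each instance's result on the host (sketch).
int total = 0;
object sync = new object();
using (ManualResetEvent done = new ManualResetEvent(false))
{
    int pending = N;   // number of instances started, as in Code 63
    workflowRuntime.WorkflowCompleted +=
        delegate(object sender, WorkflowCompletedEventArgs e)
        {
            lock (sync)
            {
                // Output dependency properties surface here as
                // entries in the OutputParameters dictionary.
                total += (int)e.OutputParameters["Result"];
                if (--pending == 0)
                    done.Set();
            }
        };

    // ... start the N workflow instances as shown in Code 63 ...

    done.WaitOne();    // block until every instance has reported back
}
```

The lock is needed because completion callbacks arrive on scheduler worker threads, several of which may finish instances concurrently.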
[Graph: Procedural versus Workflow series plotted against the number of data records (100 to 2500)]
The performance results of this workflow-based approach compared to a procedural equivalent are
shown in Graph 15. This time, workflow is the clear winner because of the inherent parallelism
between workflow instances. Although such parallelism could be created manually in procedural
code too, it's much easier to achieve in a workflow-based design, where one doesn't have to worry
about thread safety and the like.
[Graph: Procedural versus Workflow series; horizontal axis values 1 and 10]
Furthermore, the number of parallel threads available to WF can be configured using the
DefaultWorkflowSchedulerService, as outlined below.
using (WorkflowRuntime workflowRuntime = new WorkflowRuntime())
{
    workflowRuntime.AddService(new DefaultWorkflowSchedulerService(n));
    // ...
}
Code 65 - Tweaking the workflow scheduler
In Code 65, n stands for the number of worker threads available to the WF scheduler. By default, this
number is 5 on uni-processor machines and 4 times the number of processors on multi-processor
machines, including multi-core ones [18]. On my machine with a dual-core processor, this results in a
total of 8 worker threads when this default is not overridden.
We expect better throughput when increasing the number of threads available to the WF
scheduler. A small test was conducted to investigate this, based on the workflow definition from
Figure 40, running 10 to 100 parallel workflow instances with 5 to 25 threads. The result of this test is
depicted in Graph 17. Similarly, the overall performance of the system was measured for 100 parallel
workflow instances with thread counts ranging from 5 to 50 (see Graph 18).
From these graphs we can conclude that it is highly recommended to tweak the default settings
when optimizing for performance. However, increasing the number of threads does put a higher load
on the system, which has to be considered carefully on production servers. Also, there's a ceiling to
the gain that can be realized by adding more threads.
[Graph 17: execution time for 10 to 100 parallel workflow instances with 5, 8 (default), 10, 15, 20 and 25 scheduler threads]
[Graph 18: overall execution time for 100 parallel workflow instances with the number of threads ranging from 5 to 50]
10 Conclusion
The creation of a set of generic building blocks to assist in the definition of domain-specific
workflows is definitely a good idea to improve the approachability of workflow definition by people
from the field, e.g. medical staff. In this chapter we took a closer look at such an approach to create
data-driven workflows for medical agents. Using just a few custom blocks (one to gather data,
another to iterate over sets of data and one to format output data), we were able to express fairly
complex agent structures that have a non-trivial procedural equivalent.
Chapter 5 - Conclusion
In this work we investigated the Windows Workflow Foundation platform of the .NET Framework 3.0
on two fronts: dynamic adaptation, and its suitability as a replacement for procedural coding of
data-driven processing agents.
To support dynamic adaptation of workflow instances, we created an instrumentation engine layered
on top of the Dynamic Updates feature provided by WF. We distinguished between three update
types: those conducted from inside the workflow instance itself (internal modification), updates
applied to a workflow instance from the outside at the hosting layer (external modification), and a
combination of both, where external modification is used to inject an internal modification into a
workflow instance. Using this instrumentation mechanism, aspects can be injected into a workflow to
keep the workflow definition clean and readable. Examples include logging, state inspection, time
measurement and authorization. It was found that the instrumentation engine provides enough
flexibility to adapt workflow instances in various ways and at various places. From a performance
point of view, internal modification outperforms the other options. On the other hand, external
modification is more flexible because of its direct access to contextual information from the host and
because workflows don't have to be prepared for possible changes. Injection of adaptation logic
using the instrumentation framework combines the best of both worlds.
The second part of this work covered the creation of generic building blocks for easy composition of
data-driven processing agents, applied to the AB Switch agent used by the Intensive Care (IZ)
department of Ghent University Hospital (UZ Gent). It was shown that with a relatively small number
of basic building blocks, a great level of expressiveness can be realized in a generic manner. The core
block of this design is without doubt the GatherDataActivity, which retrieves data from some data
source based on a query manager that allows for flexible implementation. Such a set of building
blocks also opens the door to granular publication through (web) services, for example using WCF.
Other communication mechanisms, such as queue-based message exchange, have been touched
upon as well.
We should mention that WF is the youngest pillar of the .NET Framework 3.0 and perhaps the most
innovative one for a general-purpose development framework. Until the advent of WF, most
workflow engines were tied to some server application like BizTalk, which applies workflows in a
broader context, e.g. business orchestrations. With WF, the concept of workflow-driven design has
spread to all sorts of application types with lots of practical use case scenarios. Nevertheless, it
remains to be seen how workflow will fit into existing application architectures, also in combination
with SOA. One could question WF's maturity, but in our humble opinion WF's BizTalk roots
contribute a lot to that maturity. During the research, we saw WF transition out of the beta phase to
the final release in November 2006, each time with lots of small improvements.
To some extent, the impact of the introduction of workflow in a day-to-day programming framework
could be compared to the impact of other programming paradigms in history, such as OO design.
With all such evolutions, the level of abstraction is raised. In the case of WF, facilities such as
persistence of long-running state and near real-time state tracking are introduced as runtime
services that free developers from the burden of implementing them manually. Nevertheless, one
has to gain some basic knowledge of and feeling for how these runtime services behave.
[Figure: WF features and considerations - Designer, SOA, Tracking, Granularity, Scheduling, Performance, Persistence, Updates]
Abstract
As the complexity of business processes grows, the shift towards workflow-based programming
becomes more attractive. The typical long-running characteristic of workflows imposes new
challenges such as dynamic adaptation of running workflow instances. Recently, Windows Workflow
Foundation (WF for short) was released by Microsoft as their solution for workflow-driven
application development. Although WF contains features that allow dynamic workflow adaptation,
the framework lacks an instrumentation framework to make such adaptations more manageable.
Therefore, we built an instrumentation framework that provides more flexibility for applying
workflow adaptation batches to workflow instances, both at creation time and during an instance's
lifecycle. In this paper we present this workflow instrumentation framework and detail the
performance implications caused by dynamic workflow adaptation.
1. Introduction
In our day-to-day life we're faced with the concept of workflow, for instance in decision making
processes. Naturally, such processes are also reflected in business processes. Order processing,
coordination of B2B interaction and document lifecycle management are just a few examples where
workflow is an attractive alternative to classic programming paradigms. The main reasons to prefer
the workflow paradigm over pure procedural and/or object-oriented development are:
Business process visualization: flowcharts are a popular tool to visualize business processes;
workflow brings those representations alive. This helps to close the gap between business people
and the software they're using, since business logic is no longer imprisoned in code.
Transparency: workflows allow for human inspection, not only during development but, even more
importantly, during execution, by means of tracking.
Development model: building workflows consists of putting together basic blocks (activities) from a
toolbox, much like the creation of UIs using IDEs. Runtime services free developers from the burden
of putting together their own systems for persistence, tracking, communication, etc.
2. Dynamic adaptation
2.1. Static versus dynamic: workflow challenges
In lots of cases, workflow instances are long-lived: imagine workflows with human interaction to
approve orders, or systems that have to wait for external service responses. Often, this conflicts with
ever-changing business policies, requiring the possibility to adapt workflow instances that are in
flight. Another application of dynamic workflow adaptation is the insertion of additional activities
into a workflow. By adding logic for logging, authorization, time measurement, state inspection, etc.
dynamically, one can keep the workflow's definition pure and therefore more readable.
3.2. Implementation
Each instrumentation task definition consists of a unique name, a workflow type binding, a
parameter evaluator, a list of injections and, optionally, a set of services. We'll discuss all of these in
this section. The interface for instrumentation tasks is shown in Figure 3. The instrumentation tool
keeps a reference to the workflow runtime and intercepts all workflow instance creation requests.
When it's asked to spawn a new instance of a given type with a set of parameters, all
instrumentation tasks bound to the requested workflow type are retrieved. Next, the set of
parameters is passed to the parameter evaluator that
[Figure: Instrumentation before (left) and after (right)]
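The components enumerated above suggest an interface shape along these lines. This is a hypothetical sketch; the member names are assumptions, and the actual interface appears in Figure 3.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of an instrumentation task (names are illustrative).
interface IInstrumentationTask
{
    string Name { get; }                          // unique name
    Type WorkflowType { get; }                    // workflow type binding
    IParameterEvaluator ParameterEvaluator { get; }
    IList<IInjection> Injections { get; }         // adaptations to apply
    ICollection<object> Services { get; }         // optional services
}

// Decides, based on the creation parameters, whether the task applies.
interface IParameterEvaluator
{
    bool ShouldInstrument(IDictionary<string, object> parameters);
}

// One concrete adaptation to weave into the workflow instance.
interface IInjection
{
    void Apply(object workflowInstance);
}
```

A task registry keyed by workflow type would then let the instrumentation tool look up all tasks bound to a requested type when intercepting instance creation.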
5. Performance evaluation of instrumentation
5.1. Test methodology
To justify the use of workflow instrumentation, we conducted a set of performance tests. More
specifically, we measured the costs imposed by the dynamic update feature of WF in various
scenarios. In order to get a pure idea of the impact of a dynamic update itself, we didn't hook up
persistence and tracking services to the runtime. This eliminates the influences caused by database
performance and communication. In real workflow scenarios, however, services like persistence and
tracking will play a prominent role. For a complete overview of all factors that influence workflow
application performance, see [3]. All tests were performed on a machine with a
[Figure 5. Overhead of applying dynamic modifications to workflows - a) Internal modification, b) External modification]
5.5. Discussion
Based on these results, we conclude that ad hoc workflow instance instrumentation based on
external modification has a significant performance impact, certainly under heavy load conditions.
Instrumentation taking place at workflow instance creation time or adaptation from the inside is
much more attractive in terms of performance. Injection of dynamic (internal) adaptations, as
explained in section 4.2, should be considered a valuable performance booster when little contextual
information from the host layer is required in the update logic itself. Overuse of suspension points,
though allowing a great deal of flexibility, should be avoided if a workflow instance's overall
execution time matters and load on the persistence database has to be reduced. These results
should be put in the perspective of typically long-running workflows. Unless we're faced with
time-critical systems that require a throughput as high as possible, having a few seconds of delay
over the course of an entire workflow's lifecycle shouldn't be the biggest concern. However, in terms
of resource
6. Conclusions
Shifting the realization of business processes from pure procedural and object-oriented coding to
workflow-driven systems is certainly an attractive idea. Nevertheless, this new paradigm poses
software engineers with new challenges, such as the need for dynamic adaptation of workflows
without recompilation ([4]), a need that arises from the long-running characteristic of workflows and
the ever-increasing pace of business process and policy changes. To assist in effective and flexible
application of dynamic updates of various kinds, we created a generic instrumentation framework
capable of applying instrumentations upon workflow instance creation and during workflow instance
execution. The former scenario typically applies to weaving aspects into workflows, while the latter
can assist in production debugging and in adapting a running workflow to reflect business process
changes. The samples discussed in this paper reflect the flexibility of the proposed instrumentation
framework. Especially the concept of suspension points, allowing external modifications to take
place in a time-precise manner, opens up a lot of dynamism and flexibility. From a performance
point of view, instrumentations taking place at workflow instance creation time and internal
modifications are preferred over ad hoc updates applied to running workflow instances. Also,
applying ad hoc updates is a risky business because of the unknown workflow instance state upon
suspension. This limitation can be overcome by the use of suspension points, but these shouldn't be
overused.
7. Future work
One of our next goals is to make the instrumentation framework easier and safer by means of
designer-based workflow adaptation support and instrumentation correctness validation,
respectively. Workflow designer rehosting in WF seems an attractive candidate to realize the former
goal but will require closer investigation. Furthermore, our research focuses on the creation of an
activity library for patient treatment management using workflow, in a data-driven manner. Based
on composition of generic building blocks, workflow definitions are established to drive various
processes applied in medical practice. Different blocks for data gathering, filtering, calculations, etc.
will be designed to allow the creation of data pipelines in WF. Our final goal is to combine this
generic data-driven workflow approach with the power of dynamic updates and online
instrumentation. The need for dynamic adaptation in the health sector was pointed out in [5]. This
introduces new challenges to validate type safety with respect to the data flowing through a
workflow, under the circumstances of
References
[1] D. Shukla and B. Schmidt, Essential Windows Workflow Foundation. Addison Wesley, 2006.
[2]
[3]
[4]
[5]
Bibliography
1. Microsoft Corporation. Windows SDK. s.l. : Microsoft Corporation, 2006.
2. De Smet, Bart. WF - Working with Persistence Services. B# .NET Blog. [Online] October 14, 2006.
[Cited: April 12, 2007.]
http://community.bartdesmet.net/blogs/bart/archive/2006/10/14/4580.aspx.
3. De Smet, Bart. WF - Working with Tracking Services. B# .NET Blog. [Online] October 15, 2006.
[Cited: April 12, 2007.]
http://community.bartdesmet.net/blogs/bart/archive/2006/10/15/4582.aspx.
4. Richter, Jeffrey. CLR via C#, Second Edition. s.l. : Microsoft Press, 2006.
5. Microsoft AzMan Team. Authorization Manager Team Blog. MSDN Blogs. [Online] Microsoft
Corporation. [Cited: May 28, 2007.] http://blogs.msdn.com/azman/.
6. De Smet, Bart. WF - Using the WorkflowMonitor in combination with Dynamic Updates. B# .NET
Blog. [Online] October 16, 2006. [Cited: April 27, 2007.]
http://community.bartdesmet.net/blogs/bart/archive/2006/10/16/4585.aspx.
7. Microsoft Corporation. Performance Characteristics of Windows Workflow Foundation. MSDN.
[Online] November 2006. [Cited: March 14, 2007.] http://msdn2.microsoft.com/en-us/library/aa973808.aspx.
8. De Smet, Bart. Performance measurement in .NET 2.0 - The birth of Stopwatch. B# .NET Blog.
[Online] March 24, 2006. [Cited: May 17, 2007.]
http://community.bartdesmet.net/blogs/bart/archive/2006/03/24/3838.aspx.
9. Steurbaut, Kristof. Intelligent software agents for healthcare decision support - Case 1: Antibiotics
switch agent (switch IV-PO). Ghent : UGent - INTEC, 2006.
10. De Turck, F, et al. Design of a flexible platform for execution of medical decision support agents
in the Intensive Care Unit. Comput Biol Med. 37, 2007, 1.
11. Sybase Corp. Adaptive Server Enterprise. Sybase. [Online] Sybase Corp. [Cited: May 21, 2007.]
http://www.sybase.com/products/databasemanagement/adaptiveserverenterprise.
12. Skeet, Jon. Why doesn't C# have checked exceptions? . MSDN Blogs. [Online] March 12, 2004.
[Cited: May 14, 2007.] http://blogs.msdn.com/csharpfaq/archive/2004/03/12/88421.aspx.
13. De Smet, Bart. WF - Introducing External Data Exchange, the CallExternalMethodActivity and
Local Communications Services. B# .NET Blog. [Online] October 17, 2006. [Cited: May 03, 2007.]
http://community.bartdesmet.net/blogs/bart/archive/2006/10/17/4584.aspx.
14. Shukla, Dharma and Schmidt, Bob. Essential Windows Workflow Foundation. s.l. : Addison
Wesley, 2006.
15. Dalal, Vihang. Windows Workflow Foundation: Everything About Re-Hosting the Workflow
Designer. MSDN. [Online] Microsoft, May 2006. [Cited: May 22, 2007.]
http://msdn2.microsoft.com/en-us/library/aa480213.aspx.
16. Microsoft Corp. Northwind and pubs Sample Databases for SQL Server 2000. [Online] Microsoft
Corporation. [Cited: May 08, 2007.]
http://www.microsoft.com/downloads/details.aspx?familyid=06616212-0356-46a0-8da2-eebc53a68034&displaylang=en.
17. Kleinerman, Christian. Multiple Active Result Sets (MARS) in SQL Server 2005. MSDN. [Online]
Microsoft Corp., June 2005. [Cited: May 24, 2007.] http://msdn2.microsoft.com/en-us/library/ms345109.aspx.
18. Allen, Scott. Hosting Windows Workflow. OdeToCode.com. [Online] August 6, 2006. [Cited: May
24, 2007.] http://www.odetocode.com/Articles/457.aspx.
19. Microsoft Corporation. .NET Framework 3.0. .NET Framework 3.0. [Online] 2007. [Cited: March
26, 2007.] http://www.netfx3.com.