
Software Architecture

1. Software Architecture
2. Single Process Architecture
3. Computer Architecture
4. Client-Server Architecture
5. N Tier Architecture
6. RIA Architecture
7. Service Oriented Architecture (SOA)
8. Event-driven Architecture
9. Peer-to-peer (P2P) Architecture
10. Scalable Architectures
11. Load Balancing
12. Caching Techniques

Software Architecture
Jakob Jenkov
Last update: 2014-10-31

Note: This tutorial is still a work in progress. It will be updated bit by bit until it reaches a more comprehensive and coherent state. However, you may still get something out of it already.

Software architecture and software design are two aspects of the same topic. Both are about how software is
structured in order to perform its tasks. The term "software architecture" typically refers to the bigger structures of a
software system, whereas "software design" typically refers to the smaller structures.

Exactly where the boundary between architecture and design lies is hard to say, since the architecture of a system also affects its design. The design of the bigger structures affects the design of the smaller structures. To set the boundary somewhere meaningful (to decide what should be included and excluded in this tutorial), I have set it at the process level. Software design is thus concerned with the internal design of a single software process, whereas software architecture is concerned with the design of how multiple software processes cooperate to carry out their tasks.

How does my definition of software architecture fit with the term "distributed systems"? The way I see it, software architecture provides the basic structures on top of which the various distributed algorithms can run. Yes, there is a certain overlap between the two terms, but many different distributed algorithms can run on top of the same underlying architectures.

Software architecture is also influenced by the hardware architecture of the whole system (software + hardware). You may need a different architecture (and thus a different design) depending on what hardware you are using. Or, you may choose different hardware depending on your architecture.

Common Software Architectures


There are many different types of architectures, but some architectural patterns occur more commonly than others.
Here is a list of common software architecture patterns:

Single process.
Client / Server (2 processes collaborating).
3 Tier systems (3 processes collaborating in chains).
N Tier systems (N processes collaborating in chains).
Service oriented architecture (lots of processes interacting with each other).
Peer-to-peer architecture (lots of processes interacting without a central server).
Hybrid architectures - combinations of the above architectures.

Here is a simple illustration of these architectures.


Process Communication Channels
Processes typically have three media through which they can communicate with each other. These are:

Network
Disk
Pipes

Processes can communicate with each other via computer networks. Through this medium a process can communicate with processes running on the same computer as itself, or with processes running on a different computer, provided that the two computers are connected by a network.
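
To make this concrete, here is a minimal sketch, not from the original text, of two Java processes exchanging a single message over a local TCP connection. The class names and the port number are chosen purely for illustration:

```java
import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

// Process B: listens on a local port and sends a reply for each request line.
public class ServerProcess {
    public static void main(String[] args) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(9000);
             Socket connection = serverSocket.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(connection.getInputStream()));
             PrintWriter out = new PrintWriter(connection.getOutputStream(), true)) {
            String request = in.readLine();          // read one message from Process A
            out.println("Reply to: " + request);     // send a reply back
        }
    }
}

// Process A: connects to Process B (here on the same machine) and exchanges one message.
class ClientProcess {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("localhost", 9000);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println("Hello from Process A");
            System.out.println(in.readLine());       // prints the reply from Process B
        }
    }
}
```

The same client code works whether Process B runs on "localhost" or on another machine, which is what makes the network the most general of the three communication channels.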

Processes running on the same computer can also communicate with each other via the computer's hard disk (or other disks like USB disks etc.). Process A can write files to the disk which are processed by Process B. A reply can also be sent back from Process B in a file written to disk which Process A then reads.

Processes can also communicate via network storage, which is essentially a hard disk connected to the computer
network. This way processes can also communicate with processes running on different computers, via the
combination of network and disk communication.

Depending on the operating system the processes are running on, processes running on the same machine can also communicate with each other through pipes. Pipes are communication channels provided by the operating system for processes. The communication takes place like network communication, but the messages exchanged are kept in the RAM of the computer. Pipes can be faster than network communication, because a lot of network protocol overhead can be eliminated when the communicating processes run on the same computer.

Processes could also communicate via a RAM disk, which is a virtual hard disk allocated in the RAM of a computer. A
RAM disk looks like a disk to the process, but is much faster than a disk because the data is only stored in RAM.

Process Communication Modes


Processes can communicate with each other in either:

Synchronous mode.
Asynchronous mode.

When a process A communicates with a process B synchronously, it means that process A sends a message to
process B and waits for B to reply. Process A does not do anything until it gets a reply from process B.

When two processes communicate asynchronously, the processes send messages to each other without waiting for each other to reply. Process A may send a message to process B and then continue with some other work. At some point process B sends a message back to process A, and process A processes that message when it has time for it.

Synchronous and asynchronous communication have different advantages and use cases. You can use asynchronous communication to implement synchronous communication, or use synchronous communication to implement asynchronous communication.
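
As a minimal sketch, not from the original text, here is how the two modes typically look in Java code; the callProcessB method is a made-up stand-in for a remote call:

```java
import java.util.concurrent.CompletableFuture;

public class CommunicationModes {

    // Hypothetical remote call, used for illustration only.
    static String callProcessB(String message) {
        // Imagine this sends a request to process B over the network and returns its reply.
        return "reply to " + message;
    }

    public static void main(String[] args) {
        // Synchronous: process A blocks until the reply arrives.
        String reply = callProcessB("request 1");
        System.out.println("Got " + reply);

        // Asynchronous: process A continues working; the reply is handled later via a callback.
        CompletableFuture<String> futureReply =
                CompletableFuture.supplyAsync(() -> callProcessB("request 2"));
        futureReply.thenAccept(r -> System.out.println("Got " + r));

        System.out.println("Process A keeps doing other work...");
        futureReply.join();   // only needed here so the demo does not exit before the reply arrives
    }
}
```

The CompletableFuture variant is one simple way to get asynchronous behaviour on top of a blocking call, which also illustrates the point above about implementing one mode on top of the other.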

The synchronous and asynchronous communication modes are illustrated here:

Single Process Architecture



A software system consisting of a single running process is said to have a single process architecture, or simply to be a single process. You could also call this a 1-tier architecture. Single process applications are also often referred to as standalone applications.

Common examples of single process applications are:

Command line programs.
Desktop applications without network communication.
Mobile applications without network communication.

In today's world, though, more and more applications are open for communication in one way or another. Command line applications might work on the output of another command line application. Desktop applications can update themselves over the internet and report errors to remote error databases. Mobile applications can communicate with other mobile applications installed on the same phone, for instance via intents (Android).


Computer Architecture
Computer Architecture Overview
CPU
RAM
Disk Drives
NIC
The Bus
Other Devices


Computer Architecture Overview


The architecture of computers influences software architecture. In fact, since the computers a system runs on are part of the full system, their architecture is part of the software architecture too. Your software architecture sits on top of your computer architecture. Therefore I have added this quick overview of computer architecture to this software architecture tutorial.

Here is a diagram illustrating the main components of a computer:


The CPU executes the instructions. The RAM stores the instructions and the data the instructions work on. The disk drives can store data persistently between system restarts, whereas the RAM is typically cleared when the computer restarts. The NIC connects the computer to a network so it can communicate with other computers. The Bus connects the CPU with the RAM, disk drives and the NIC so they can exchange data and instructions.

I will get into a bit more detail about each of these components in the following sections.

CPU
The CPU (Central Processing Unit) is what executes all the instructions. The faster the CPU is, the faster the computer can execute instructions.

Some CPUs have multiple cores. Each core functions like a separate CPU which can execute instructions
independently, regardless of what the other cores are doing. If your computer CPU has multiple cores, you should
think about how to utilize all the cores when designing your software architecture.
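
As a minimal sketch, not from the original text, here is one common way in Java to let independent tasks use all available cores, by sizing a worker pool to the core count:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UseAllCores {
    public static void main(String[] args) {
        // Ask the JVM how many cores (hardware threads) are available.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        // Submit independent tasks; the pool spreads them across the cores.
        for (int i = 0; i < cores * 4; i++) {
            final int taskNo = i;
            pool.submit(() -> System.out.println(
                    "Task " + taskNo + " ran on " + Thread.currentThread().getName()));
        }
        pool.shutdown();   // stop accepting new tasks and let the submitted ones finish
    }
}
```

This only helps if the tasks really are independent, which is the same point made later in this tutorial about task parallelization.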

CPUs often have a limited amount of cache memory, which is memory only the CPU can access, and which is much
faster than the ordinary RAM.

RAM
The RAM (Random Access Memory) can store both the instructions for the CPU to execute and the data these instructions are processing. The more memory a computer has, the more data and instructions it can store. The RAM is normally cleared when the computer is shut down or rebooted. It is typically not permanent storage (although persistent RAM actually exists).

Having a lot of memory in a computer is useful for caching data which normally resides on disk, or which is read from a remote system via the network (using the NIC).

The speed of the memory determines how fast you can read and write data in it. The faster the better.

Disk Drives
The disk drives can store data just like the RAM, but unlike the RAM the data is kept even when the computer shuts down. The disk drives are often much slower than RAM to access, so if you need to process large amounts of data, it is preferable to keep that data in RAM.

The speed of the disk drives determines how fast you can read and write data from and to them. The faster the better. The speed of a disk is characterized by two numbers: seek time and transfer time. The seek time tells how fast the disk can move to a certain position on the disk. The transfer time tells how fast the disk can transfer data once it is in the right position.

Some disks have a bit of read cache RAM to speed up reading of data from the disk. When a chunk of data is
requested from the disk, the disk will read a bigger chunk into the cache, hoping that the next chunk requested will be
within the data stored in the disk cache memory.

Some types of disks work more like memory. Among these are SSDs (Solid State Drives). Since SSDs work like memory, the seek time is very low. Every memory cell can be addressed directly. This is great if your software does many small reads from different places on the disk. The transfer rate of SSDs is typically also higher than that of normal hard disks.

NIC
The NIC (Network Interface Card) connects the computer to a network. This enables the computer to communicate
with other computers, for instance via the internet. The speed of the NIC determines how fast the computer can
communicate with other computers. Of course, the speed of the NIC in the other computers and the network equipment
between them matters too.

The Bus
The Bus connects the CPU to the RAM, disk drives and NIC. The speed of the bus impacts how fast the components can exchange data and instructions. Of course the speed of the components themselves impacts this too.

Other Devices
I have left out devices like the keyboard, mouse, monitor, USB devices, sound card, graphics card etc. These typically do not have a big impact on your software architecture (unless you are developing computer games or something similar).


Client-Server Architecture

Client / server architecture is also called 2-tier architecture. A client talks to a server which performs services on behalf of the client.

Common examples of client / server communication are:

Desktop application to database server communication.
Browser to web server communication.
Mobile app to server communication.
FTP client to FTP server communication.

2-Tier Architecture Disadvantages


In the early days of client / server applications, desktop application to database server communication was a normal use case. Most of the business logic was embedded inside the desktop application. Therefore this style of client / server application was also called a "fat client application". Fat client applications are illustrated simply here:

Embedding all the business logic of a client / server application inside the client applications has some disadvantages. First of all, it results in potential race conditions (a parallelization problem) when two desktop applications attempt to update the database at the same time. If two applications read a record, update it, and save it again at the same time, which version of the updated record will be saved in the database?
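
One common way to detect this kind of lost update is optimistic locking. Below is a minimal JDBC sketch, not from the original text; the account table and its version column are hypothetical and only serve to illustrate the idea:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OptimisticUpdate {

    // Returns true if the update succeeded, false if another client changed the row first.
    public static boolean updateBalance(Connection con, long accountId, double newBalance)
            throws SQLException {
        long version;
        try (PreparedStatement read = con.prepareStatement(
                "SELECT version FROM account WHERE id = ?")) {
            read.setLong(1, accountId);
            try (ResultSet rs = read.executeQuery()) {
                if (!rs.next()) return false;
                version = rs.getLong("version");
            }
        }

        // The UPDATE only matches if the version is still the one we read.
        try (PreparedStatement write = con.prepareStatement(
                "UPDATE account SET balance = ?, version = version + 1 " +
                "WHERE id = ? AND version = ?")) {
            write.setDouble(1, newBalance);
            write.setLong(2, accountId);
            write.setLong(3, version);
            return write.executeUpdate() == 1;   // 0 rows updated means we lost the race
        }
    }
}
```

With a 2-tier fat client, every client application would have to implement this kind of check itself; moving such logic to a shared middle tier is one of the motivations for the 3-tier architectures described next.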

Another problem with fat client applications is that the client application has to be installed on each client machine.
Back in the day that had to be done manually, but today automatic installation systems exist that can install
applications on desktop computers. That way applications (including updates) can be administered centrally.

The disadvantages of 2-tier fat client applications are what made software developers move to 3-tier and N-tier architectures.


N Tier Architecture
Web And Mobile Applications
Rich Internet Applications (RIA)
Web Application Advantages


N tier architecture means splitting up the system into N tiers, where N is a number from 1 and up. A 1 tier architecture
is the same as a single process architecture. A 2 tier architecture is the same as a client / server architecture etc.

A 3 tier architecture is a very common architecture. It is typically split into a presentation or GUI tier, an application logic tier, and a data tier. This diagram illustrates a 3 tier architecture:

The presentation or GUI tier contains the user interface of the application. The presentation tier is "dumb", meaning it does not make any application decisions. It just forwards the user's actions to the application logic tier. If the user needs to enter information, this is done in the presentation tier too.

The application logic tier makes all the application decisions. This is where the "business logic" is located. The
application logic knows what is possible, and what is allowed etc. The application logic reads and stores data in the
data tier.

The data tier stores the data used in the application. The data tier can typically store data safely, perform transactions,
search through large amounts of data quickly etc.
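
As a concrete sketch, not taken from the original text, here is what a tiny piece of the application logic tier could look like in a Java web application (the web application case is described in the next section). The presentation tier (a browser) calls this servlet, and the servlet reads from the data tier (a relational database). The URL, table name and credentials are made up, and running it requires a servlet container and a JDBC driver:

```java
import java.io.IOException;
import java.sql.*;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class OrderStatusServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String orderId = req.getParameter("orderId");
        try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://dbhost/shop", "app", "secret");   // the data tier
             PreparedStatement stmt = con.prepareStatement(
                     "SELECT status FROM orders WHERE id = ?")) {
            stmt.setLong(1, Long.parseLong(orderId));
            try (ResultSet rs = stmt.executeQuery()) {
                String status = rs.next() ? rs.getString("status") : "UNKNOWN";
                // The application logic decides what to return; the presentation tier renders it.
                resp.setContentType("application/json");
                resp.getWriter().write(
                        "{\"orderId\": " + orderId + ", \"status\": \"" + status + "\"}");
            }
        } catch (SQLException e) {
            resp.sendError(500, "Data tier unavailable");
        }
    }
}
```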

Web And Mobile Applications


Web applications are a very common example of 3 tier applications. The presentation tier consists of HTML, CSS and JavaScript, the application logic tier runs on a web server in the form of Java Servlets, JSP, ASP.NET, PHP, Ruby, Python etc., and the data tier consists of a database of some kind (MySQL, PostgreSQL, a NoSQL database etc.). Here is a diagram of a typical 3 tier web application:

Actually, it is the same principle with mobile applications that are not standalone applications. A mobile application that connects to a server typically connects to a web server to send and receive data. Here is a diagram of a typical 3 tier mobile application:

Rich Internet Applications (RIA)


In the first generations of web applications, a lot of the HTML, and parts of the CSS and JavaScript, was generated by scripts running on the web server. When a browser requested a certain page on the web server, a script was executed on the web server which generated the HTML, CSS and JavaScript for that page.

Today the world is moving to rich internet applications (RIA). RIAs also use a 3 tier architecture, but all the HTML, CSS and JavaScript is generated in the browser. The browser has to download the initial HTML, CSS and JavaScript files once, but after that the RIA client only exchanges data with the web server. No HTML, CSS or JavaScript is sent back and forth (unless that is part of the data, like with an article that contains HTML code).

RIA applications are explained in more detail in the next text in the software architecture trail.

Web Application Advantages


The purpose of N tier architecture is to insulate the different layers of the application from each other. The GUI client
doesn't know how the server is working internally, and the server doesn't know how the database server works
internally etc. They just communicate via standard interfaces.

Web applications especially have another advantage. If you make updates to the GUI client or the application logic
running on the server, all clients get the latest updates the next time they access the application. The browser
downloads the updated client, and the updated client accesses the updated server.


RIA Architecture
First Generation Web Applications
Second Generation Web Applications
RIA Web Applications
RIA Technologies


RIAs (Rich Internet Applications) are a special breed of web applications where the user interface has much richer functionality than that of first and second generation web applications. They look and feel more like desktop applications. RIA user interfaces are typically developed using HTML5 + JavaScript + CSS3, Flex (Flash), JavaFX, GWT, Dart or some other RIA tool. In the long run, variations of HTML5 + JavaScript + CSS3 seem to be the winner (GWT and Dart can be compiled to JavaScript).

The richer GUI client of RIA user interfaces also results in a somewhat different internal architecture and design of the
web applications. RIA user interfaces and their backends are typically more cleanly separated than for first and second
generation web applications. This makes the RIA GUI more independent of the server side, and also makes it easier
for GUI and server developers to work in parallel. I will explain how in this text, but first I have to explain how the
typical internal design looked in first and second generation web applications.

First Generation Web Applications


First generation web applications were page oriented. You would include all GUI logic and application logic inside the same web page. The web page would be a script which was executed by the web server when requested by the browser. GUI logic and application logic were mixed up inside the same page script. Here is an illustration of this architecture and design:

Being page oriented, every action the application allowed was typically embedded in its own web page script. Each script was like a separate transaction which executed the application logic, and generated the GUI to be sent back to the browser after the application logic was executed. Here is an illustration of the paged nature of first generation web applications:

The GUI was pretty dumb. The browser showed a page. When the user clicked something in the page, the browser
was typically redirected to a new page (script).

First generation web page technologies include Servlets (Java), JSP (JavaServer Pages), ASP, PHP and CGI scripts
in Perl etc.

Anyone who has worked with first generation web applications knows that this design was a mess. GUI logic and application logic were interleaved, making it hard to locate either one when you needed to make changes to the GUI or the application logic. Code reuse was low, because the code was spread over a large number of web page scripts. GUI state (like pressed down buttons) which had to be maintained across multiple pages had to be kept either in the session data on the server, or be sent along to each page and back to the browser again. If you wanted to change the programming language of the web page scripts (like from PHP to JavaServer Pages (JSP)), that would often require a complete rewrite of the web application. It was a nightmare.

Second Generation Web Applications


In second generation web applications, developers found ways to separate the GUI logic from the application logic on the server. Web applications also became more object oriented than they had been when the code was spread over multiple pages. Often, web page scripts were used for the GUI logic, and real classes and objects were used for the application logic. Here is a diagram illustrating the design of second generation web applications:

Frameworks were developed to help make second generation web applications easier to develop. Examples of such frameworks are ASP.NET (.NET), Struts + Struts 2 (Java), Spring MVC (Java), JSF (JavaServer Faces), Wicket (Java), Tapestry (Java) and many others. In the Java community two types of framework designs emerged: Model 2 frameworks and component based frameworks. I will not go too deep into the difference here, because in my opinion both designs are obsolete.

Second generation web applications were easier to develop than first generation web applications, but they still had some problems. Despite the better separation of GUI and application logic, the two domains still often got intertwined. Also, since the GUI logic was written in the same language as the application logic, changing the programming language meant rewriting the whole application again. Additionally, due to the limits of second generation web application technologies, the GUIs developed were often more primitive than what people were used to from desktop applications. So, even if second generation web application technologies were a step forward, they were still a pain to work with.

RIA Web Applications


RIA (Rich Internet Applications) web applications are the third generation of web applications. RIA web applications were first and foremost known for having a user interface that looked much closer to that of desktop applications.

To achieve these more advanced user interfaces, RIA technologies are typically executed in the browser using either
JavaScript, Flash, JavaFX or Silverlight (both Flash and Silverlight are on their way out to my knowledge, but I could
be wrong). Here is a diagram illustrating RIA web application architecture and design:

As you can see, the GUI logic is now moved from the web server to the browser. This complete split from the server
logic has some positive and negative consequences for the design and architecture of the application.

First of all, the GUIs can become much more advanced with RIA technologies. That by itself is an advantage.

Second, because the GUI logic is executed in the browser, the CPU time needed to generate the GUI is lifted off the
server, freeing up more CPU cycles for executing application logic.

Third, GUI state can be kept in the browser, thus further cleaning up the server side of RIA web applications.

Fourth, since the GUI logic is completely separated from the application logic, it becomes easier to develop reusable GUI components, which can be reused no matter what server side programming language is used for the application logic.

Fifth, RIA technologies typically communicate with the web server by exchanging data, not GUI code (HTML, CSS and JavaScript). The data exchange is typically XML or JSON sent via HTTP. This changes the server side from being "pages" to being "services" that perform some part of the application logic (create user, log in, store task etc.). When your server side becomes completely free from generating GUI, your application logic becomes very clear. Your application logic gets some data in, and sends some data out, and doesn't have to think about anything else. No GUI state, no GUI generation etc. This is a really big advantage during development. Yes, your application logic still has to generate either XML or JSON from data, but that is much, much easier than generating GUI, and much, much closer to the application logic and data model.

Sixth, since the GUI and the application logic on the server typically communicate via HTTP + JSON or HTTP + XML, the GUI is 100% independent of the programming language used on the server. The GUI logic's interface to the server is just HTTP calls. This means that you can change the programming language and tools on the client independently of the server, as well as change the server tools without influencing the client.

Seventh, since the GUI logic and application logic are completely separated, and the only interface between them is HTTP + JSON / XML, the GUI and application logic can also be developed independently of each other. The application logic developer can implement the services and test them independently of the GUI. And the GUI developer can just create a fake (mock) service which returns the data his GUI needs, so he can develop and test his GUI. Once the real service is ready, the mock service can be swapped for the real service without too much work.
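
The mock-and-swap idea applies regardless of language; as a minimal sketch, not from the original text, here is what it could look like in Java, with made-up interface and class names:

```java
// The GUI code only depends on this interface, never on a concrete implementation.
public interface UserService {
    String findUserName(long userId);
}

// Used by the GUI developer before the real backend service exists.
class MockUserService implements UserService {
    @Override
    public String findUserName(long userId) {
        return "Test User " + userId;    // canned data, no server required
    }
}

// Swapped in once the real backend service is ready.
class HttpUserService implements UserService {
    @Override
    public String findUserName(long userId) {
        // Imagine an HTTP GET to the backend here, parsing the JSON reply.
        throw new UnsupportedOperationException("real backend call goes here");
    }
}
```

Because the GUI only sees the interface, the swap is a one-line change wherever the service is instantiated.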

Eighth, because the back end just consists of services that send and receive data, it is much easier to add a different type of client in the future, if needed. For instance, you might want to add a native mobile iOS or Android application client which also interacts with your back end services.

Ninth, since the back end now consists of simple services receiving and sending data, your web application is naturally prepared for the "open application" trend, where web applications can be accessed both via a GUI and via an API (in case your users need to write a program to interact with your web application).

Tenth, since the GUI and back end services only exchange data, the traffic load is often smaller than if the back end had to send both the data and the HTML, CSS and JavaScript. Thus your bandwidth requirements may go down (but sometimes you end up sending more small requests than you would otherwise, so the savings may cancel out).

As you can see, what on the surface seems like a small change in how we develop web applications actually has quite a lot of beneficial side effects. Once you experience these advantages in a real project, you will never want to go back to second generation web applications again.

Here is a somewhat more detailed diagram of RIA architecture, so you can see all the separations of responsibility
(and thus theoretically verify the advantages I have mentioned above):

Of course there are some negative consequences of RIA technology too. The complete split of GUI logic from application logic sometimes means that the GUI logic is implemented using a different programming language than the application logic. Your GUI logic might be implemented in JavaScript, ActionScript (Flash) or Dart, and the application logic in Java, C# or PHP. Java developers can use Google Web Toolkit (GWT) to program JavaScript GUIs using Java. The new RIA technologies mean that the developer team must master more technologies. This is of course a disadvantage. But I think you can live with that, given all the advantages RIA web applications have.

RIA Technologies
Here is a list of a few well-known RIA technologies:

HTML5 + CSS3 + JavaScript + JavaScript frameworks
jQuery
jQuery Mobile
AngularJS
Sencha EXT-JS
SmartClient
D3
Dart
GWT (Google Web Toolkit)
JavaFX
Flex (Flash)
Silverlight (now unsupported)

My perception is that the world is moving towards HTML5 + CSS3 + JavaScript RIA solutions, rather than Flex and
JavaFX. Microsoft has already shut down development of their Silverlight, and I suspect that JavaFX and perhaps
even Flex will go the same way in the future. But that is just speculation.

Service Oriented Architecture (SOA)



Service oriented architecture (SOA) is an architecture in which the application logic of one (or more) applications is split up into separate services which can be called individually. Services can call each other too, if needed.

I have written a separate trail on Service Oriented Architecture, so I will not write more about it in this software
architecture tutorial.


Event-driven Architecture
Event-driven Architecture Between Processes
The Event Queue
The Event Log
Event Collectors
The Reply Queue
Read vs. Write Events
Event Log Replay Challenges
Handling Dynamic Values
Coordination With External Systems
Event Log Replay Solutions
Replay Mode
Multi Step Event Queue


Event-driven architecture is an architectural style where components in the system emit events and react to events.
Instead of component A directly calling component B when some event occurs, component A just emits an event.
Component A knows nothing about which components listen for its events.

Event-driven architecture is used both internally within a single process and between processes. For instance, GUI frameworks typically use events a lot. Additionally, the assembly line concurrency model (AKA the reactive, non-blocking concurrency model), as explained in my tutorial about concurrency models, also uses an event-driven architecture.

In this tutorial I will focus on how event-driven architecture between processes looks. Thus, when I write event-driven architecture throughout the rest of this tutorial, that is what I refer to, even if it is not the only meaning of the term.

Event-driven Architecture Between Processes


Event-driven architecture is an architectural style where incoming requests to the system are collected into one or
more central event queues. From the event queues the events are forwarded to the backend services that are to
process the events.

Event-driven architecture is also sometimes referred to as message-driven architecture or stream processing architecture. The events can be seen as a stream of messages - hence the other two names. The stream processing architecture has also been called a lambda architecture. Regardless, I will continue using the name event-driven architecture.

The Event Queue


In an event-driven architecture you have one or more central event queues into which all events are inserted before they are processed. Here is a simple illustration of event-driven architecture with an event queue:

The events are ordered when inserted into the queue, so you know in what sequence your system responds to the events.
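
As a minimal in-process sketch, not from the original text, the core idea can be shown with a Java BlockingQueue; in a real system the central queue would typically be a separate message broker rather than an in-memory queue:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class EventQueueSketch {

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();

        // A backend service consuming events in the order they were inserted.
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String event = eventQueue.take();   // blocks until an event arrives
                    System.out.println("Processing " + event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();

        // Event collectors insert events; queue order preserves the processing sequence.
        for (int i = 1; i <= 3; i++) {
            eventQueue.put("request-" + i);
        }
        Thread.sleep(200);   // give the demo consumer time to drain the queue
    }
}
```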

The Event Log


The messages written to the event queue can be written to an event log (typically on disk). If the system crashes, the
system can rebuild its state simply by replaying the event log. Here is an illustration of an event-driven architecture
with an event log to persist events:

You can also back up the event log and thus take a backup of the state of the system. You can use this backup to run performance tests on new releases before actually deploying them to production. Or, you could replay the backup of the event log to reproduce some error that has been reported.
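
Here is a minimal sketch, not from the original text, of appending events to a log file and replaying it; one event per line, with a file name and format chosen purely for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class EventLogSketch {

    private final Path logFile = Paths.get("event-log.txt");

    // Append a single event before (or while) it is processed.
    public void append(String event) throws IOException {
        Files.write(logFile, List.of(event),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Rebuild system state by re-applying every logged event in order.
    public void replay() throws IOException {
        if (!Files.exists(logFile)) return;
        for (String event : Files.readAllLines(logFile)) {
            apply(event);
        }
    }

    private void apply(String event) {
        System.out.println("Re-applying: " + event);
    }
}
```

The replay challenges discussed later in this text (dynamic values, external systems) are exactly about when such a straightforward replay is not enough.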

Event Collectors
Requests usually arrive over the network, for instance as HTTP requests or via other protocols. The events are
received from their many sources via event collectors. Here is an illustration of an event-driven architecture with event
collectors added:
The Reply Queue
Sometimes you may need to send a reply back to a request (event). Therefore some event-driven architectures have a
reply queue too. Here is a diagram of an event-driven architecture that uses both an event queue (inbound queue) and
a reply queue (outbound queue):

As you can see, the reply may have to be routed back to the correct event collector. For instance, if an HTTP collector
(a web server essentially) sends requests received via HTTP into the event queue, the reply generated for that event
may have to be sent back to the client via the HTTP collector (server) again.

Normally the reply queue is not persisted, meaning it is not written to the event log. Only the incoming events are
persisted to the event log.

Read vs. Write Events


If you categorize all incoming requests as events, they will all get pushed onto the event queue. If the event queue is persistent (is persisted to an event log), that means that all events are persisted. Persisting events is usually slow, so if we could filter out the events that do not need to be persisted, we could potentially increase the performance of the event queue.

The reason the event queue is persisted to the event log is so that we can replay the event log and recreate the exact state of the system as caused by the events. To support this, only events that change system state actually need to be persisted. In other words, you can divide the events into read events and write events. Read events only read system state but do not change it. Write events change system state.

With a division of events between read and write events, only write events need to be persisted. This will give a
performance increase to the event queue. Exactly how big this performance increase is, depends on the ratio between
read and write events.

In order to divide events into read and write events, the distinction must be made already in the event collectors, before
the event reaches the event queue. Otherwise the event queue cannot know if a given event should be persisted or
not.

You could also split your event queue into two: one event queue for read events and one event queue for write events. That way read events are not slowed down behind slower write events, and the event queue does not have to inspect each message to see if it should be persisted or not. The read event queue does not persist events, and the write event queue always persists events.

Here is an illustration of an event-driven architecture with the event queue split up into read and write event queues:

Yes, it looks a bit chaotic with the arrows, but in practice it is not really so chaotic to create 3 queues and distribute messages between them.
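
As a minimal sketch, not from the original text, the routing of events into the two queues could look like this in Java; the event types and the persistence hook are made up for illustration:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class EventRouter {

    interface Event {}
    static class ReadEvent implements Event  { final String query;  ReadEvent(String q)  { this.query = q; } }
    static class WriteEvent implements Event { final String change; WriteEvent(String c) { this.change = c; } }

    private final BlockingQueue<ReadEvent> readQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<WriteEvent> writeQueue = new LinkedBlockingQueue<>();

    // Called by the event collectors, which know whether a request reads or writes state.
    public void route(Event event) throws InterruptedException {
        if (event instanceof WriteEvent) {
            WriteEvent write = (WriteEvent) event;
            persistToEventLog(write);          // only write events are persisted
            writeQueue.put(write);
        } else if (event instanceof ReadEvent) {
            readQueue.put((ReadEvent) event);
        }
    }

    private void persistToEventLog(WriteEvent event) {
        // Append to the event log here (see the event log sketch earlier in this text).
    }
}
```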

Event Log Replay Challenges


The ability to just replay the event log to recreate system state in case of e.g. a system crash or system restart is often
emphasized as one of the big advantages of event-driven architecture. In the cases where a log can just be replayed
independent of time and surrounding systems, this is a big advantage.

However, replaying the event log completely independent of time is not always possible. I will cover some of the
challenges to event log replay in the following sections.

Handling Dynamic Values


As mentioned earlier, write events are events that when processed may change the system state. Sometimes such a
state change depends on dynamic values which are resolved at the time the event is processed. Examples of dynamic
values could be the date and time the event is processed (e.g. an order date) or the currency exchange rate at that
specific date and time.

Such dynamic values represent a challenge to event log replay. If you replay the event log on a different day the
service processing the event may resolve a different dynamic value, like another date and time, or another exchange
rate. Replaying the event log on a different day would thus not result in recreating the exact same system state as
when the events were originally processed.

To solve the problem with dynamic values, you could have the write event queue stamp the needed dynamic values into the event. However, for this to work the event queue would need to know what dynamic values each message needs. This would complicate the design of the event queue. Every time a new dynamic value is needed, the event queue would need to know how to look up that dynamic value.

Another solution is that the write event queue only stamps the write events with the date and time of the event. With the original date and time of the event, the service processing the event can look up what the dynamic value was at the given date and time. For instance, it could look up the exchange rate that was in effect at that time. This of course requires that the service can actually look up dynamic values based on date and time, and this is not always the case.

Coordination With External Systems


Another challenge to event log replay is coordination with external systems. For instance, imagine that your event log
contains product orders from a web shop. When you process an order the first time your system may send the order to
an external payment gateway to charge the amount from the customer's credit card.

If you replay the event log later, you do not want the customer to be charged again for the same order. Thus, you do not want to send the orders to the external payment gateway during replay.

Event Log Replay Solutions


Solving the problems with log replay is not always easy. Some systems have no problems and can replay the event
log as it is. Other systems may need to know the date and time of the original event. And yet other systems may need
to know a whole lot more - like values obtained from external systems during the original processing of the event.

Replay Mode
In any case, any service listening for events from the write event queue must know whether the incoming event is an
original event or a replayed event. That way the service can determine how to handle the resolution of dynamic values
and coordination with external systems.

Multi Step Event Queue


Another solution to the event log replay challenges is to have a multi step event queue. Step one collects all write events. Step two resolves dynamic values. Step three coordinates with external systems. In case the log needs to be replayed, only the third step is replayed; steps 1 and 2 are skipped. Exactly how this would be implemented depends on the concrete system.


Peer-to-peer (P2P) Architecture



Peer-to-peer (P2P) architecture is an architecture in which a number of equal peers cooperate to provide a service for each other, without a central server. All peers are both clients of, and servers for, each other.

I have written a separate trail on peer-to-peer networks, so I will not write more about it in this software architecture
tutorial.


Scalable Architectures
Scalability Factor
Vertical and Horizontal Scalability
Architecture Scalability Requirements
Task Parallelization


A scalable architecture is an architecture that can scale up to meet increased work loads. In other words, if the work
load all of a sudden exceeds the capacity of your existing software + hardware combination, you can scale up the
system (software + hardware) to meet the increased work load.

Scalability Factor
When you scale up your system's hardware capacity, you want the workload it is able to handle to scale up to the
same degree. If you double hardware capacity of your system, you would like your system to be able to handle double
the workload too. This situation is called "linear scalability".

Linear scalability is often not the case though. Very often there is an overhead associated with scaling up, which means that when you double hardware capacity, your system can handle less than double the workload.

The extra workload your system can handle when you scale up your hardware capacity is your system's scalability factor. For example, if doubling the hardware capacity lets the system handle only 1.8 times the original workload, the scalability factor for that upgrade is 1.8 / 2 = 0.9. The scalability factor may vary depending on what part of your system you scale up.

Vertical and Horizontal Scalability


There are two primary ways to scale up a system: Vertical scaling and horizontal scaling.

Vertical scaling means that you scale up the system by deploying the software on a computer with higher capacity
than the computer it is currently deployed on. The new computer may have a faster CPU, more memory, faster and
larger hard disk, faster memory bus etc. than the current computer.

Horizontal scaling means that you scale up the system by adding more computers with your software deployed on them. The added computers typically have about the same capacity as the computers the system is currently running on, or whatever capacity you get for the same money at the time of purchase (computers tend to get more powerful for the same money over time).

Architecture Scalability Requirements


The easiest way to scale up your software from a developer perspective is vertical scalability. You just deploy on a
bigger machine, and the software performs better. However, once you get past standard hardware requirements,
buying faster CPUs, bigger and faster RAM modules, bigger and faster hard disks, faster and bigger motherboards etc.
gets really, really expensive compared to the extra performance you get. Also, if you add more CPUs to a computer,
and your software is not explicitly implemented to take advantage of them, you will not get any increased performance
out of the extra CPUs.

Scaling horizontally is not as easy seen from a software developer's perspective. In order to make your software take advantage of multiple computers (or even multiple CPUs within the same computer), your software needs to be able to parallelize its tasks. In fact, the better your software is at parallelizing its tasks, the better your software scales horizontally.

Task Parallelization
Parallelization of tasks can be done at several levels:

Distributing separate tasks onto separate computers.
Distributing separate tasks onto separate CPUs on the same computer.
Distributing separate tasks onto separate threads on the same CPU.

You may also take advantage of special hardware the computers might have, like graphics cards with lots of cores, or InfiniBand network interface cards etc.

Distributing separate tasks to separate computers is often referred to as "load balancing". Load balancing will be covered in more detail in a separate text.

Executing multiple different applications on the same computer, possibly using the same CPU or different CPUs, is referred to as "multitasking". Multitasking is typically done by the operating system, so this is not something software developers need to think too much about. What you need to think about is how to break your application into smaller, independent but collaborating processes, which can then be distributed onto different CPUs or even computers if needed.

Distributing tasks inside the same application to different threads is referred to as "multithreading". I have a separate
tutorial on Java Multithreading so I will not get deeper into multithreading here.

To be fully parallelizable, a task must be independent of other tasks executing in parallel with it.

To be fully distributable onto any computer, the task must contain, or be able to access, any data needed to execute
the task, regardless of what computer executes the task. Exactly what that means depends on the kind of application
you are developing, so I cannot get into deeper detail here.


Load Balancing
Common Load Balancing Schemes
Even Task Distribution Scheme
DNS Based Load Balancing
Weighted Task Distribution Scheme
Sticky Session Scheme
Even Size Task Queue Distribution Scheme
Autonomous Queue Scheme
Load Balancing Hardware and Software


Load balancing is a method for distributing tasks onto multiple computers. For instance, distributing incoming HTTP
requests (tasks) for a web application onto multiple web servers. There are a few different ways to implement load
balancing. I will explain some common load balancing schemes in this text. Here is a diagram illustrating the basic
principle of load balancing:

The primary purpose of load balancing is to distribute the work load of an application onto multiple computers, so the application can process a higher work load. Load balancing is a way to scale an application.

A secondary goal of load balancing is often (but not always) to provide redundancy in your application. That is, if one server in a cluster of servers fails, the load balancer can temporarily remove that server from the cluster, and divide the load onto the functioning servers. Having multiple servers help each other in this way is typically called "redundancy". When an error happens and a task is moved from the failing server to a functioning server, this is typically called "failover".

A set of servers running the same application in cooperation is typically referred to as a "cluster" of servers. The purpose of a cluster is typically both of the two goals mentioned above: to distribute load onto different servers, and to provide redundancy / failover for each other.

Common Load Balancing Schemes


The most common load balancing schemes are:

Even Task Distribution Scheme
Weighted Task Distribution Scheme
Sticky Session Scheme
Even Size Task Queue Distribution Scheme
Autonomous Queue Scheme

Each of these schemes is explained in more detail below.

Even Task Distribution Scheme


An even task distribution scheme means that the tasks are distributed evenly between the servers in the cluster. This scheme is very simple, which makes it easy to implement. The even task distribution scheme is also known as "Round Robin", meaning the servers receive work in a round robin fashion (evenly distributed). Here is a diagram illustrating even task distribution load balancing:

Even task distribution load balancing is suitable when the servers in the cluster all have the same capacity, and the incoming tasks statistically require the same amount of work.

Even task distribution ignores the difference in the work required to process each task. That means that even if each server is given the same number of tasks, you can have situations where one server has more tasks requiring heavy processing than the others. This may happen due to the randomness of incoming tasks. This will often even itself out over time, since the overloaded server may all of a sudden receive a set of light workload tasks too.

So, even if even task distribution load balancing distributes the tasks evenly onto the servers in the cluster, this may
not result in a truly even distribution of the work load.

DNS Based Load Balancing


DNS based load balancing is a simple scheme where you configure your DNS to return different IP addresses to
different computers when they request an IP address for your domain name. This achieves an effect that is similar to
the even task distribution scheme, except that most computers cache the IP address and thus keep coming back to the
same IP address until a new DNS lookup is made.

While DNS based load balancing is possible, it is not the best way of reliably distributing traffic across multiple
computers. You are better off using dedicated load balancing software or hardware.

Weighted Task Distribution Scheme


A weighted task distribution load balancing scheme distributes the incoming tasks onto the servers in the cluster using
weights. That means that you can specify the weight (ratio) of tasks a server should receive relative to other servers.
This is useful if the servers in the cluster do not all have the same capacity.

For instance, if one of three servers has only 2/3 of the capacity of the two others, you can use the weights 3, 3, 2. This means that for every 8 tasks received, the first server should receive 3 tasks, the second server 3 tasks, and the last server only 2 tasks. That way the server with 2/3 capacity receives only 2/3 of the tasks compared to the other servers in the cluster. Here is a diagram illustrating this example:

As mentioned earlier, weighted task distribution load balancing is useful when the servers in the cluster do not all have
the same capacity. However, weighted task distribution still does not take the work required to process the tasks into
consideration.
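
As a minimal sketch, not from the original text, a weighted distributor using the 3, 3, 2 weights from the example above could look like this in Java (server names are made up; tasks are handed out in small bursts per server, but the ratio over every 8 tasks is the same):

```java
public class WeightedDistributor {

    private final String[] servers = { "server1", "server2", "server3" };
    private final int[] weights = { 3, 3, 2 };
    private int serverIndex = 0;     // which server is currently receiving tasks
    private int sentToCurrent = 0;   // how many tasks that server has received in this round

    // Returns the server the next task should be sent to.
    public synchronized String nextServer() {
        if (sentToCurrent == weights[serverIndex]) {
            sentToCurrent = 0;
            serverIndex = (serverIndex + 1) % servers.length;
        }
        sentToCurrent++;
        return servers[serverIndex];
    }
}
```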

Sticky Session Scheme


The two previous load balancing schemes are based on the assumption that any incoming task can be processed
independently of previously executed tasks. This may not always be the case though.

Imagine if the servers in the cluster keep some kind of session state, like the session object in a Java web application
(or in PHP, or ASP). If a task (HTTP request) arrives at server 1, and that results in writing some value to session state,
what happens if subsequent requests from the same user are sent to server 2 or server 3? Then that session value
might be missing, because it is stored in the memory of server 1. Here is a diagram illustrating the situation:

The solution to this problem is called sticky session load balancing. All tasks (e.g. HTTP requests) belonging to the same session (e.g. the same user) are sent to the same server. That way any stored session values that might be needed by subsequent tasks (requests) are available.
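
As a minimal sketch, not from the original text, sticky routing can be as simple as hashing the session id onto the list of servers (server names are made up; note that this naive scheme reshuffles sessions if the server list changes):

```java
import java.util.List;

public class StickySessionRouter {

    private final List<String> servers = List.of("server1", "server2", "server3");

    // Every request carrying the same session id is mapped to the same server.
    public String serverFor(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), servers.size());
        return servers.get(index);
    }
}
```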

With sticky session load balancing it isn't the tasks that are distributed out to the servers, but rather the task sessions.
This will of course result in a somewhat more unpredictable distribution of work load, as some sessions will contain
few tasks, and other sessions will contain many tasks.

Another solution is to avoid using session variables completely, or to store the session variables in a database or
cache server, accessible to all servers in the cluster. I prefer to avoid session variables completely if possible, but you
may have good reasons to use session variables.

One way to avoid session variables is by using a RIA GUI + architecture which can contain any needed session
scoped variables inside the memory of the RIA client, instead of inside the memory of the web server.

Even Size Task Queue Distribution Scheme


The even size task queue distribution scheme is similar to the weighted task distribution scheme, but with a twist.
Instead of blindly distributing the tasks onto the servers in the cluster, the load balancer keeps a task queue for each
server. The task queues contain all requests that each server is currently processing, or which are waiting to be
processed. Here is a diagram illustrating this principle:

When a server finishes a task, for instance has finished sending back an HTTP response to a client, the task is
removed from the task queue for that server.

The even size task queue distribution scheme works by making sure that each server queue has the same number of tasks in progress at the same time. Servers with higher capacity will finish tasks faster than servers with low capacity. Thus the task queues of the higher capacity servers will empty faster, and thus have space for new tasks sooner.

As you can imagine, this load balancing scheme implicitly takes both the work required to process each task and the
capacity of each server into consideration. New tasks are sent to the servers with fewest tasks queued up. Tasks are
emptied from the queues when they are finished, which means that the time it took to process the task is automatically
impacting the size of the queue. Since how fast a task is completed depends on the server capacity, server capacity is
automatically taken into consideration too. If a server is temporarily overloaded, its task queue size will become larger
than the task queues of the other servers in the cluster. The overloaded server will thus not have new tasks assigned
to it until it has worked its queue down.

The load balancer will have to do a bit more accounting using this scheme. It has to keep track of task queues, and it
has to keep track of when a task is completed, so it can be removed from the corresponding task queue.
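
As a minimal sketch, not from the original text, the core bookkeeping can be reduced to tracking how many tasks each server currently has outstanding and always picking the server with the fewest (in a real load balancer the selection and the increment would need to happen atomically):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class EvenQueueSizeBalancer {

    // Number of tasks currently assigned to, and not yet completed by, each server.
    private final Map<String, AtomicInteger> inProgress = new ConcurrentHashMap<>();

    public EvenQueueSizeBalancer(String... servers) {
        for (String server : servers) {
            inProgress.put(server, new AtomicInteger(0));
        }
    }

    // Pick the server with the fewest outstanding tasks and assign the next task to it.
    public String assignTask() {
        String chosen = inProgress.entrySet().stream()
                .min(Map.Entry.comparingByValue(
                        (a, b) -> Integer.compare(a.get(), b.get())))
                .orElseThrow()
                .getKey();
        inProgress.get(chosen).incrementAndGet();
        return chosen;
    }

    // Called when the server reports the task (e.g. the HTTP response) as finished.
    public void taskCompleted(String server) {
        inProgress.get(server).decrementAndGet();
    }
}
```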

Autonomous Queue Scheme


In the autonomous queue load balancing scheme, all incoming tasks are stored in a task queue. The servers in the server cluster connect to this queue and take the number of tasks they can process. Here is a diagram illustrating this scheme:

In this scheme there is no real load balancer. Each server takes the load it is able to handle. There is just the task queue and the servers. If a server falls out of the cluster, its unprocessed tasks are kept on the task queue, and processed by other servers later. Thus each server functions autonomously of the other servers and of the task queue. No load balancer needs to know which servers are part of the cluster etc. The task queue does not need to know about the servers. Each server just needs to know about the task queue.

Autonomous queue load balancing also implicitly takes the workload and capacity of each server into consideration.
Servers only take tasks from the queue when they have capacity to process them.
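As a sketch of the server side of this scheme, each server could run a worker loop like the Java class below. An in-process java.util.concurrent.BlockingQueue stands in here for what would normally be a networked message queue:

import java.util.concurrent.BlockingQueue;

// Autonomous queue worker sketch: there is no load balancer, just a shared
// task queue. Each server runs a loop like this and only takes a new task
// when it has finished the previous one.
public class AutonomousWorker implements Runnable {

    private final BlockingQueue<Runnable> taskQueue;

    public AutonomousWorker(BlockingQueue<Runnable> taskQueue) {
        this.taskQueue = taskQueue;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Runnable task = taskQueue.take(); // blocks until a task is available
                task.run();                       // process the task before taking the next one
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // stop the worker loop
        }
    }
}

Several workers sharing the same queue give you the implicit load balancing described above: a fast server simply returns to the queue more often than a slow one.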

The autonomous queue scheme has a small disadvantage compared to the even size task queue distribution scheme. A server
that wants a task needs to first connect to the queue, then download the task, and then provide a response. That is 2 to
3 network roundtrips (depending on whether a response needs to be sent back).

The even size task queue distribution scheme has one network roundtrip less. The load balancer sends a task to a
server, and the server sends back a response (if needed). That is just 1 to 2 network roundtrips.

Load Balancing Hardware and Software


In many cases you don't have to implement your own load balancer. You can buy ready-made hardware and software
for this purpose. Many web server implementations have some kind of built-in software load balancing functionality. Be
sure to look at what already exists before you start implementing your own load balancing software!


Caching Techniques
Populating the Cache
Keeping Cache and Remote System in Sync
Write-through Caching
Time Based Expiry
Active Expiry
Managing Cache Size
Caching in Server Clusters
Cache Products

Jakob Jenkov
Last update: 2014-10-31

Caching
Caching is a technique to speed up data lookups (data reading). Instead of reading the data directly from its source,
which could be a database or another remote system, the data is read from a cache on the computer that needs the
data. Here is an illustration of the caching principle:
A cache is a storage area that is closer to the entity needing the data than the original source. Accessing this cache is
typically faster than accessing the data from its original source. A cache is typically stored in memory or on disk. A
memory cache is normally faster to read from than a disk cache, but a memory cache typically does not survive system
restarts.

Caching of data may occur at many different levels (computers) in a software system. In a modern web application
caching may take place in at least 3 locations, as illustrated below:

Most modern web applications use some kind of database. The database may cache data in memory so it does not
have to read it from disk. The web server may cache static files like images, CSS files, JavaScript etc. in memory
instead of reading them from disk every time they are needed. The web application may cache data read from the
database, so it does not have to query the database (via the network) every time the data is needed. And finally the
browser may cache static files and data too. HTML5 browsers have local storage, an application cache, and a
Web SQL database in which they can store data.

When implementing a cache you have the following three issues to think about:

Populating the cache
Keeping the cache and remote system in sync
Managing cache size

I will get into these three issues throughout the rest of this text.

Populating the Cache


The first challenge of caching is to populate the cache with data from the remote system. There are basically two
techniques to do this:

1. Upfront population
2. Lazy population

Upfront population means that you populate the cache with all needed values when the system keeping the cache is
starting up. Being able to do so requires that you know what data to populate the cache with. You may not always
know what data should be inserted into the cache at system startup time.

Lazy population means that you populate the cache the first time a certain piece of data is needed. First you check the
cache to see if the data is already there. If not, you read the data from the remote system and insert it into the cache.
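Here is a minimal Java sketch of lazy population, where a loader function stands in for the read from the remote system:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Lazy population sketch: read from the cache first, and only load the
// value from the remote system (the loader function) on a cache miss.
public class LazyCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public LazyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, loading and caching it first if it is missing.
    public V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }
}

It could be used as e.g. new LazyCache<Long, Article>(id -> articleDao.findById(id)), where Article and articleDao are hypothetical application classes.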

I have summed up the advantages and disadvantages of upfront and lazy population in the table below:

Upfront population
Advantages: No one-off cache read delays like lazy population has.
Disadvantages: The initial build of the cache may take a long time. You risk caching data that is never read.

Lazy population
Advantages: Only caches data that is actually read. No upfront cache build delay.
Disadvantages: The first read of a given piece of data gets no speedup, since that is when the data is fetched from the
remote system and inserted into the cache. This may result in an inconsistent user experience.

Of course it is possible to combine upfront and lazy population. Perhaps you populate the cache with the most read
data upfront, and let the rest of the data be populated lazily.

Keeping Cache and Remote System in Sync


A big challenge of caching is to keep the data stored in the cache and the data stored in the remote system in sync,
meaning that the data is the same. Depending on how your system is structured, there are different ways of keeping
the data in sync. I will cover some of the possible techniques in the following sections.

Write-through Caching
A write-through cache is a cache which allows both reading and writing to it. If the computer keeping the cache writes
new data to the cache, that data is also written to the remote system. That is why it is called a "write-through" cache.
The writes are written through to the remote system.

Write-through caching works if the remote system can only be updated via the computer keeping the cache. If all data
writes go through the computer with the cache, it is easy to forward the writes to the remote system and update the
cache correspondingly.
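Here is a minimal Java sketch of a write-through cache. The RemoteStore interface is a hypothetical stand-in for whatever remote system the writes are forwarded to (a database access object, a web service client etc.):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Write-through cache sketch: every write is forwarded to the remote system
// and the local cache is updated, so the two stay in sync as long as all
// writes go through this cache.
public class WriteThroughCache<K, V> {

    public interface RemoteStore<K, V> {
        V read(K key);
        void write(K key, V value);
    }

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final RemoteStore<K, V> remote;

    public WriteThroughCache(RemoteStore<K, V> remote) {
        this.remote = remote;
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, remote::read); // lazy read on a cache miss
    }

    public void put(K key, V value) {
        remote.write(key, value);   // write through to the remote system first
        cache.put(key, value);      // then update the local cache
    }
}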

Time Based Expiry


If the remote system can be updated independently of the computer keeping the cache, then it can be a problem to
keep the cache and the remote system in sync.

One way to keep the data in sync is to let the data in the cache expire after a certain time interval. When the data has
expired it will be removed from the cache. When the data is needed again, a fresh version of the data is read from the
remote system and inserted into the cache.

How long the expiration time should be depends on your needs. Some types of data (like an article) may not need to
be fully up-to-date at all times. Maybe you can live with a 1 hour expiration time. For some articles you might even be
able to live with a 24 hour expiration time.

Keep in mind that a short expiration time will result in more reads from the remote system, thus reducing the benefit of
the cache.
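A minimal Java sketch of time based expiry could look like this: each cached entry remembers when it was stored, and a read that finds an expired entry removes it and returns null, so the caller re-reads the value from the remote system and re-inserts it:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Time based expiry sketch: entries older than timeToLiveMillis are treated
// as missing, which forces a fresh read from the remote system.
public class ExpiringCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long storedAtMillis;
        Entry(V value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long timeToLiveMillis;

    public ExpiringCache(long timeToLiveMillis) {
        this.timeToLiveMillis = timeToLiveMillis;
    }

    public void put(K key, V value) {
        cache.put(key, new Entry<>(value, System.currentTimeMillis()));
    }

    public V get(K key) {
        Entry<V> entry = cache.get(key);
        if (entry == null) {
            return null;                                   // never cached
        }
        if (System.currentTimeMillis() - entry.storedAtMillis > timeToLiveMillis) {
            cache.remove(key);                             // expired: evict and force a re-read
            return null;
        }
        return entry.value;
    }
}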

Active Expiry
An alternative to time based expiration is active expiration. By active expiration I mean that you actively expire the
cached data. For instance, if your remote system is updated you may send a message to the computer keeping the
cache, instructing it to expire the data that was updated.

Active expiry has the advantage that the data in the cache is made up-to-date as fast as possible after the update in
the remote system. Additionally, you don't have any unnecessary expirations for data that has not changed, as you
may have with time based expiration.

The disadvantage of active expiration is that you need to be able to detect changes to the remote system. If your
remote system is a relational database, and this database can be updated through different mechanisms, each of
these mechanisms needs to be able to report what data it has updated. Otherwise you cannot send an expiration
message to the computer keeping the cache.
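Here is a minimal Java sketch of the cache side of active expiry. How the update message travels (a message queue, a plain socket etc.) is left out; onRemoteUpdate() would be called by whatever listener receives the message:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Active expiry sketch: when the remote system reports that a record has
// changed, only that entry is removed from the cache, so the next read
// fetches a fresh copy.
public class ActivelyExpiredCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        cache.put(key, value);
    }

    public V get(K key) {
        return cache.get(key);
    }

    // Called when a "this record was updated" message arrives from the remote system.
    public void onRemoteUpdate(K key) {
        cache.remove(key);
    }
}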

Managing Cache Size


Managing cache size is an important aspect of caching too. Many systems have so much data stored that it is impossible,
or infeasible, to store all of that data in the cache. Therefore you need a mechanism to manage how much data you
store in the cache. Managing the cache size is typically done by evicting data from the cache, to make space for new
data. There are a few standard cache eviction techniques. These are:

Time based eviction.
First in, first out (FIFO).
First in, last out (FILO).
Least accessed.
Least time between access.

Time based eviction is similar to time based expiration which has been covered earlier. In addition to keeping the
cache in sync with the remote system, time based expiration can also be used to keep the cache size down. Either you
have a separate thread running which monitors the cache, or the clean up is done when attempting to read or write a
new value to the cache.

First in, first out means that when you attempt to insert a new value into the cache, you remove the earliest inserted
value to make space for the new one. Of course you do not remove any values until the cache has reached its size limit.
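In Java, FIFO eviction can be sketched with a java.util.LinkedHashMap kept in its default insertion order, overriding removeEldestEntry() so the earliest inserted entry is evicted once the size limit is exceeded:

import java.util.LinkedHashMap;
import java.util.Map;

// FIFO eviction sketch: when the cache grows beyond maxEntries, the
// earliest inserted entry is removed to make space for the new one.
public class FifoCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public FifoCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

Passing true as the third (accessOrder) argument to the LinkedHashMap constructor orders entries by access instead, which turns the same sketch into least-recently-used eviction, a close relative of the least accessed technique described below.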

First in, last out is the reverse of the FIFO method. This method is useful if the first stored values are also the ones that
are typically accessed the most.

Least accessed eviction means that the cache values that have been accessed the least number of times are evicted
first. The purpose of this technique is to avoid having to re-read and re-store frequently read values. In order to make
this technique work, the cache has to keep track of how many times a given value has been accessed.
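A minimal (and not thread safe) Java sketch of least accessed eviction could keep an access count per entry and scan for the smallest count when space is needed:

import java.util.HashMap;
import java.util.Map;

// Least accessed eviction sketch: each value carries an access count, and
// when the cache is full the entry with the lowest count is evicted.
public class LeastAccessedCache<K, V> {

    private static final class Counted<V> {
        V value;
        long accessCount;
        Counted(V value) { this.value = value; }
    }

    private final Map<K, Counted<V>> cache = new HashMap<>();
    private final int maxEntries;

    public LeastAccessedCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    public V get(K key) {
        Counted<V> counted = cache.get(key);
        if (counted == null) {
            return null;
        }
        counted.accessCount++;        // track how often the value is read
        return counted.value;
    }

    public void put(K key, V value) {
        if (!cache.containsKey(key) && cache.size() >= maxEntries) {
            evictLeastAccessed();     // make space before inserting a new value
        }
        cache.put(key, new Counted<>(value));  // note: re-putting a key resets its count
    }

    private void evictLeastAccessed() {
        K leastKey = null;
        long leastCount = Long.MAX_VALUE;
        for (Map.Entry<K, Counted<V>> entry : cache.entrySet()) {
            if (entry.getValue().accessCount < leastCount) {
                leastCount = entry.getValue().accessCount;
                leastKey = entry.getKey();
            }
        }
        cache.remove(leastKey);
    }
}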

An issue to keep in mind when using least accessed eviction is that old values in the cache automatically have a higher
access count, even if they are no longer accessed. Perhaps an old article has been accessed a lot in the
past, but is now being accessed far less. The article's access count is still high though, meaning it does not get
evicted despite the lower access rate now. To avoid this situation, the access count could cover only accesses within
the last N hours. But this further complicates the access counting.

Least time between access eviction takes the time between accesses of a value into account. When a value is
accessed, the cache records the time of the access and increases the access count. When the value is
accessed the next time, the cache increments the access count again and recalculates the average time between
accesses. Values that were once accessed a lot but are fading in popularity will see their average time between
accesses grow. Sooner or later the average may grow large enough that the value will be evicted.

A variation of the least time between access eviction is to only calculate the average over the last N accesses. N could
be 100, 1,000 or some other number that makes sense in your application. Whenever the access count reaches N, the
access count is reset to 0 and the time of that access is stored. This approach will evict values with fading
popularity faster than if the total access count and time were used.

Another variation of least time between access is to reset the access count at regular intervals, and just use least
accessed eviction. For instance, for every hour a value is cached, the access count for the previous hour is stored in
another variable for use in eviction decisions. The access count for the next hour is reset to 0. This mechanism will
have the same effect as the variation described above.

The difference between the last two variations comes down to checking if the access count has reached N, or if the
time interval has exceeded Y, for every cache access. Since checking an integer is typically faster than reading the
system clock, I would go with the first approach. The first approach only reads the system clock every N accesses,
whereas the second approach reads the system clock for every access (to see if the time interval has expired).

Remember that even if you are using cache size management, you may still have to evict, re-read and re-store values in
order to make sure they are in sync with the remote system. Even if a cached value is accessed a lot and thus
deserves to stay in the cache, it may sometimes need to be synchronized with the remote system.

Caching in Server Clusters


Caching is simpler in systems that run on a single server. With a single server you can ensure that all writes go
through the same server, and thus use a write-through cache. The situation is more complicated when your application
is distributed across a cluster of servers. The following diagram illustrates this situation:
A simple write-through cache will only update the cache on the server that executes the write operation. The cache on
the other servers will know nothing of that write operation.

In a server cluster you may have to either use time based expiration or active expiration to make sure that all caches
are in sync with the remote system.

Cache Products
Implementing your own cache is not too hard, depending on how advanced you need it to be. However, if you are not
in the mood to implement your own cache, many different ready-to-use caching products exist. Some of these
products are:

Memcached
Ehcache

I do not know these products well enough to make a recommendation, but I know that both are in widespread use.
