
Umbrella Activities : The phases and related steps of the generic view of software engineering are complemented by a number of umbrella activities. Typical activities in this category include:

01. Software project tracking and control : Once the plan, tasks and models are in place, a network of software engineering tasks that will enable the team to get the job done on time has to be created, and progress is then tracked against it.

02. Formal technical reviews : These are structured reviews of the work products and techniques used in the project, conducted to uncover errors early.

03. Software quality assurance : This is very important for defining and conducting the activities required to ensure the quality of each part of the software.

04. Software configuration management : Software configuration management (SCM) is a set of activities designed to control change by identifying the work products that are likely to change, establishing relationships among them, and defining mechanisms for managing different versions of these work products.

05. Document preparation and production : All the project planning and other activities should be captured in hard-copy documents, and the production of those documents starts here.

06. Reusability management : This includes preserving each part of the software project so that it can be corrected later, or so that support can be given to update or upgrade the software as user or time demands require.
07. Measurement : This includes the measurement of every aspect of the software project.
08. Risk management : Risk management is a series of steps that help a software team to understand and manage uncertainty. It is good practice to identify each risk, assess its probability of occurrence, estimate its impact, and establish a contingency plan in case the problem actually occurs.

SOFTWARE COSTING
History
The Constructive Cost Model (COCOMO) was first published in Boehm's 1981 book Software Engineering Economics[1] as a model for
estimating effort, cost, and schedule for software projects. It drew on a study of 63 projects
at TRW Aerospace where Boehm was Director of Software Research and Technology. The study
examined projects ranging in size from 2,000 to 100,000 lines of code, and programming languages
ranging from assembly to PL/I. These projects were based on the waterfall model of software
development which was the prevalent software development process in 1981.

References to this model typically call it COCOMO 81. In 1995 COCOMO II was developed and
finally published in 2000 in the book Software Cost Estimation with COCOMO II.[2] COCOMO II is the
successor of COCOMO 81 and is better suited for estimating modern software development
projects. It provides more support for modern software development processes and an updated
project database. The need for the new model came as software development technology moved
from mainframe and overnight batch processing to desktop development, code reusability, and the
use of off-the-shelf software components. This article refers to COCOMO 81.

COCOMO consists of a hierarchy of three increasingly detailed and accurate forms. The first
level, Basic COCOMO, is good for quick, early, rough order-of-magnitude estimates of software costs,
but its accuracy is limited because it lacks factors to account for differences in project attributes (cost
drivers). Intermediate COCOMO takes these cost drivers into account, and Detailed
COCOMO additionally accounts for the influence of individual project phases.

Basic COCOMO
Basic COCOMO computes software development effort (and cost) as a function of program size.
Program size is expressed in estimated thousands of source lines of code (SLOC, KLOC).

COCOMO applies to three classes of software projects:


Organic projects - "small" teams with "good" experience working with "less than rigid" requirements

Semi-detached projects - "medium" teams with mixed experience working with a mix of rigid and less-than-rigid requirements

Embedded projects - developed within a set of "tight" constraints (hardware, software, operational, ...); these projects combine characteristics of organic and semi-detached projects

The basic COCOMO equations take the form

Effort Applied (E) = a_b * (KLOC)^(b_b)  [person-months]

Development Time (D) = c_b * (Effort Applied)^(d_b)  [months]

People Required (P) = Effort Applied / Development Time  [count]

where KLOC is the estimated number of delivered lines of code for the project (expressed in thousands). The coefficients a_b, b_b, c_b and d_b are given in the following table:

Software project    a_b    b_b     c_b    d_b

Organic             2.4    1.05    2.5    0.38

Semi-detached       3.0    1.12    2.5    0.35

Embedded            3.6    1.20    2.5    0.32
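
As a quick illustration, here is a minimal Python sketch of the Basic COCOMO equations using the coefficient table above; the project size of 32 KLOC is just an example value:

    # Basic COCOMO 81: (a_b, b_b, c_b, d_b) per project class, from the table above.
    COEFFS = {
        "organic":       (2.4, 1.05, 2.5, 0.38),
        "semi-detached": (3.0, 1.12, 2.5, 0.35),
        "embedded":      (3.6, 1.20, 2.5, 0.32),
    }

    def basic_cocomo(kloc, project_class):
        a, b, c, d = COEFFS[project_class]
        effort = a * kloc ** b          # Effort Applied (E), person-months
        time = c * effort ** d          # Development Time (D), months
        people = effort / time          # People Required (P)
        return effort, time, people

    effort, time, people = basic_cocomo(32, "organic")
    print(f"{effort:.1f} PM, {time:.1f} months, {people:.1f} people")

For a 32 KLOC organic project this gives roughly 91 person-months over about 14 months, with an average staff of about 7 people.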

Basic COCOMO is good for quick estimates of software costs. However, it does not
account for differences in hardware constraints, personnel quality and experience, use
of modern tools and techniques, and so on.

Intermediate COCOMO
Intermediate COCOMO computes software development effort as a function of program
size and a set of "cost drivers" that include subjective assessments of product, hardware,
personnel and project attributes. This extension considers four categories of cost drivers,
each with a number of subsidiary attributes:

Product attributes
Required software reliability

Size of application database

Complexity of the product

Hardware attributes

Run-time performance constraints

Memory constraints

Volatility of the virtual machine environment

Required turnaround time

Personnel attributes

Analyst capability

Software engineering capability

Applications experience

Virtual machine experience

Programming language experience

Project attributes

Use of software tools

Application of software engineering methods

Required development schedule

Each of the 15 attributes receives a rating on a six-point scale that ranges from
"very low" to "extra high" (in importance or value). An effort multiplier from the
table below applies to the rating. The product of all effort multipliers results in
an effort adjustment factor (EAF). Typical values for EAF range from 0.9 to 1.4.
Ratings

Cost Drivers                                     Very Low   Low    Nominal   High   Very High   Extra High

Product attributes

Required software reliability                    0.75       0.88   1.00      1.15   1.40

Size of application database                                0.94   1.00      1.08   1.16

Complexity of the product                        0.70       0.85   1.00      1.15   1.30        1.65

Hardware attributes

Run-time performance constraints                                   1.00      1.11   1.30        1.66

Memory constraints                                                 1.00      1.06   1.21        1.56

Volatility of the virtual machine environment               0.87   1.00      1.15   1.30

Required turnaround time                                    0.87   1.00      1.07   1.15

Personnel attributes

Analyst capability                               1.46       1.19   1.00      0.86   0.71

Applications experience                          1.29       1.13   1.00      0.91   0.82

Software engineer capability                     1.42       1.17   1.00      0.86   0.70

Virtual machine experience                       1.21       1.10   1.00      0.90

Programming language experience                  1.14       1.07   1.00      0.95

Project attributes

Application of software engineering methods      1.24       1.10   1.00      0.91   0.82

Use of software tools                            1.24       1.10   1.00      0.91   0.83

Required development schedule                    1.23       1.08   1.00      1.04   1.10

The Intermediate COCOMO formula now takes the form:

E = a_i * (KLOC)^(b_i) * EAF

where E is the effort applied in person-months, KLOC is the estimated number of thousands of delivered lines of code for the project, and EAF is the factor calculated above. The coefficient a_i and the exponent b_i are given in the next table:

Software project    a_i    b_i

Organic             3.2    1.05

Semi-detached       3.0    1.12

Embedded            2.8    1.20


The Development time D calculation uses E in the same way as in the Basic
COCOMO.
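
A minimal Python sketch of the Intermediate formula follows; the two non-nominal multipliers chosen (high product complexity, 1.15, and high analyst capability, 0.86) are just example values read off the ratings table above:

    COEFFS_I = {
        "organic":       (3.2, 1.05),
        "semi-detached": (3.0, 1.12),
        "embedded":      (2.8, 1.20),
    }

    def intermediate_cocomo(kloc, project_class, multipliers):
        a, b = COEFFS_I[project_class]
        eaf = 1.0
        for m in multipliers:           # EAF = product of all effort multipliers
            eaf *= m
        return a * kloc ** b * eaf      # effort in person-months

    # 32 KLOC organic project, complexity "high" (1.15), analyst capability "high" (0.86):
    print(intermediate_cocomo(32, "organic", [1.15, 0.86]))   # about 120 person-months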

Detailed COCOMO
Detailed COCOMO incorporates all characteristics of the intermediate version with an assessment of
the cost driver's impact on each step (analysis, design, etc.) of the software engineering process.

The detailed model uses different effort multipliers for each cost driver attribute. These phase-sensitive effort multipliers are each used to determine the amount of effort required to complete each phase. In Detailed COCOMO, the whole software product is divided into modules, COCOMO is applied to each module to estimate its effort, and the module efforts are then summed.

In Detailed COCOMO, the effort is calculated as a function of program size and a set of cost drivers given according to each phase of the software life cycle.

A detailed project schedule is never static. The five phases of Detailed COCOMO are:

plan and requirements.

system design.

detailed design.

module code and test.

integration and test.
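
Building on the Intermediate sketch above (which defines intermediate_cocomo), a module-by-module estimate can be expressed as a short loop; the module sizes and per-module multipliers here are purely illustrative:

    # Detailed-style estimate: apply COCOMO per module, then sum the efforts.
    modules = [
        (10, [1.15]),   # 10 KLOC, high complexity
        (8,  [0.86]),   # 8 KLOC, high analyst capability
        (14, []),       # 14 KLOC, all drivers nominal
    ]
    total = sum(intermediate_cocomo(kloc, "organic", ms) for kloc, ms in modules)
    print(f"total effort: {total:.1f} person-months")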


Deadlock
In concurrent programming, a deadlock is a situation in which two or more competing actions are each waiting for the other to finish, and thus neither ever does. For example, suppose process P1 holds resource R2 and requires additional resource R1, while process P2 holds R1 and requires R2: both processes need resources to continue execution, yet neither can continue.

In a transactional database, a deadlock happens when two processes, each within its own
transaction, update two rows of information in the opposite order. For example,
process A updates row 1 then row 2 in the exact timeframe that process B updates row 2 then row 1.
Process A can't finish updating row 2 until process B is finished, but process B cannot finish
updating row 1 until process A is finished. No matter how much time is allowed to pass, this situation
will never resolve itself, and because of this, database management systems will typically kill the
transaction of the process that has done the least amount of work.

In an operating system, a deadlock is a situation which occurs when a process or thread enters a
waiting state because a requested resource is being held by another waiting process, which in turn
is waiting for another resource held by another waiting process. If a process is unable to change its
state indefinitely because the resources requested by it are being used by another waiting process,
then the system is said to be in a deadlock.[1]

Deadlock is a common problem in multiprocessing systems, parallel computing and distributed
systems, where software and hardware locks are used to handle shared resources and
implement process synchronization.

In telecommunication systems, deadlocks occur mainly due to lost or corrupt signals instead of
resource contention.

Examples
Any deadlock situation can be compared to the classic "chicken or egg" problem.[4] It can also be
considered a paradoxical "Catch-22" situation.[5] A real world example would be an illogical statute
passed by the Kansas legislature in the early 20th century, which stated:[1][6]

When two trains approach each other at a crossing, both shall come to a full stop and neither shall
start up again until the other has gone.

A simple computer-based example is as follows. Suppose a computer has three CD drives and three
processes. Each of the three processes holds one of the drives. If each process now requests
another drive, the three processes will be in a deadlock. Each process will be waiting for the "CD
drive released" event, which can be only caused by one of the other waiting processes. Thus, it
results in a circular chain.

Moving onto the source code level, a deadlock can occur even in the case of a single thread and
one resource (protected by a mutex). Assume there is a function f1 which does some work on the
resource, locking the mutex at the beginning and releasing it after it's done. Next, somebody creates
a different function f2 following that pattern on the same resource (lock, do work, release) but
decides to include a call to f1 to delegate a part of the job. What will happen is the mutex will be
locked once when entering f2 and then again at the call to f1, resulting in a deadlock if the mutex is
not reentrant (i.e. the plain "fast mutex" variety).
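
The following minimal Python sketch reproduces that single-thread scenario; the function names f1 and f2 are taken from the description above:

    import threading

    lock = threading.Lock()     # a plain, non-reentrant mutex

    def f1():
        with lock:              # blocks forever if this thread already holds the lock
            print("f1: working on the shared resource")

    def f2():
        with lock:
            print("f2: working on the shared resource")
            f1()                # deadlock: the mutex is already held by this thread

    # Calling f2() here would hang. Replacing threading.Lock with
    # threading.RLock (a reentrant mutex) makes the nested acquisition safe.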

Necessary conditions
A deadlock situation can arise if all of the following conditions hold simultaneously in a system: [7]

1. Mutual exclusion: at least one resource must be held in a non-shareable mode. [1] Only one
process can use the resource at any given instant of time.

2. Hold and wait or resource holding: a process is currently holding at least one resource and
requesting additional resources which are being held by other processes.

3. No preemption: a resource can be released only voluntarily by the process holding it.

4. Circular wait: a process must be waiting for a resource which is being held by another
process, which in turn is waiting for the first process to release the resource. In general,
there is a set of waiting processes, P = {P1, P2, ..., PN}, such that P1 is waiting for a resource
held by P2, P2 is waiting for a resource held by P3, and so on until PN is waiting for a resource
held by P1, closing the circular chain.

Avoiding database deadlock

An effective way to avoid database deadlocks is to follow this approach from the Oracle Locking
Survival Guide:

Application developers can eliminate all risk of enqueue deadlocks by ensuring that transactions
requiring multiple resources always lock them in the same order.[9]

This single sentence needs some explanation:

First, it highlights the fact that processes must be inside a transaction for deadlocks to
happen. Note that some database systems can be configured to cascade deletes, which
generate implicit transactions which then can cause deadlocks. Also, some DBMS vendors offer
row-level locking, a type of record locking which greatly reduces the chance of deadlocks, as
opposed to page-level locking, which has the potential of locking out much more processing.

Second, the reference to "multiple resources" means "more than one row in one or more
tables." An example of locking in the same order might involve processing all INSERTS first, all
UPDATES second, and all DELETES last; within the processing of each of these handling all
parent-table changes before child-table changes; and processing table changes in the same
order (such as alphabetically, or ordered by an ID or account number).

Third, eliminating all risk of deadlocks is difficult to achieve when the DBMS has automatic
lock-escalation features that raise row-level locks into page locks, which can escalate to table
locks. Although the risk of experiencing a deadlock will not go to zero, as deadlocks
tend to happen more on large, high-volume, complex systems, it can be greatly reduced, and
when required, programmers can enhance the software to retry transactions when the system
detects a deadlock.

Fourth, deadlocks can result in data loss if developers do not write the software specifying
the use of transactions on every interaction with a DBMS; such data loss is difficult to locate and
can cause unexpected errors and problems.

Deadlocks offer a challenging problem to correct as they result in data loss, are difficult to isolate,
cause unexpected problems, and are time-consuming to fix. Modifying every section of software
code in a large database-oriented system in order to always lock resources in the same order when
the order is inconsistent takes significant resources and testing to implement.

Deadlock handling
Most current operating systems cannot prevent a deadlock from occurring. When a deadlock occurs,
different operating systems respond to it in different non-standard manners. Most approaches
work by preventing one of the four Coffman conditions from occurring, especially the fourth
one. Major approaches are as follows.

Ignoring deadlock

In this approach, it is assumed that a deadlock will never occur. This is also an application of
the Ostrich algorithm. This approach was initially used by MINIX and UNIX. It is used when the
time intervals between occurrences of deadlocks are large and the data loss incurred each time is
tolerable.

Detection

Under deadlock detection, deadlocks are allowed to occur. Then the state of the system is examined
to detect that a deadlock has occurred, and it is subsequently corrected. An algorithm is employed
that tracks resource allocation and process states, and rolls back and restarts one or more of the
processes in order to remove the detected deadlock. Detecting a deadlock that has already occurred
is easily possible since the resources that each process has locked and/or currently requested are
known to the resource scheduler of the operating system.[12]

Deadlock detection techniques include, but are not limited to, model checking. This approach
constructs a finite state-model on which it performs a progress analysis and finds all possible
terminal sets in the model. These then each represent a deadlock.
After a deadlock is detected, it can be corrected by using one of the following methods:

1. Process termination: one or more processes involved in the deadlock may be aborted. We
can choose to abort all processes involved in the deadlock. This ensures that the deadlock is
resolved with certainty and speed, but the expense is high, as partial computations will be
lost. Alternatively, we can choose to abort one process at a time until the deadlock is resolved. This
approach has high overhead, because after each abort an algorithm must determine
whether the system is still in deadlock. Several factors must be considered when choosing a
candidate for termination, such as the priority and age of the process.

2. Resource preemption: resources allocated to various processes may be successively preempted and allocated to other processes until the deadlock is broken.

Prevention

Deadlock prevention works by preventing one of the four Coffman conditions from occurring.

Removing the mutual exclusion condition means that no process will have exclusive access
to a resource. This proves impossible for resources that cannot be spooled. But even with
spooled resources, deadlock could still occur. Algorithms that avoid mutual exclusion are
called non-blocking synchronization algorithms.

The hold and wait or resource holding conditions may be removed by requiring processes to
request all the resources they will need before starting up (or before embarking upon a particular
set of operations). This advance knowledge is frequently difficult to satisfy and, in any case, is
an inefficient use of resources. Another way is to require processes to request resources only
when they hold none: first they must release all their currently held resources before
requesting all the resources they will need from scratch. This too is often impractical, because
resources may be allocated and remain unused for long periods. Also, a process
requiring a popular resource may have to wait indefinitely, as such a resource may always be
allocated to some process, resulting in resource starvation.[13] (These algorithms, such
as serializing tokens, are known as the all-or-none algorithms.)

The no preemption condition may also be difficult or impossible to avoid, as a process has to
be able to hold a resource for a certain amount of time, or the processing outcome may be
inconsistent or thrashing may occur. However, inability to enforce preemption may interfere with
a priority algorithm. Preemption of a "locked out" resource generally implies a rollback, and is to
be avoided, since it is very costly in overhead. Algorithms that allow preemption include lock-free
and wait-free algorithms and optimistic concurrency control. If a process holding some resources
requests another resource that cannot be immediately allocated to it, this condition may be
removed by requiring the process to release all the resources it currently holds.

The final condition is the circular wait condition. Approaches that avoid circular waits include
disabling interrupts during critical sections and using a hierarchy to determine a partial
ordering of resources. If no obvious hierarchy exists, even the memory address of resources has
been used to determine ordering and resources are requested in the increasing order of the
enumeration.[1] Dijkstra's solution can also be used.
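
A minimal Python sketch of the resource-ordering idea follows; using the id() of each lock as the ordering key is just one convenient choice of global order:

    import threading

    r1, r2 = threading.Lock(), threading.Lock()

    def use_both(a, b):
        # Always acquire locks in one global order, so no circular wait can form.
        first, second = sorted((a, b), key=id)
        with first:
            with second:
                pass            # work with both resources here

    # Two threads that name the resources in opposite order still cannot deadlock:
    t1 = threading.Thread(target=use_both, args=(r1, r2))
    t2 = threading.Thread(target=use_both, args=(r2, r1))
    t1.start(); t2.start(); t1.join(); t2.join()
    print("finished without deadlock")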

Avoidance
Deadlock can be avoided if certain information about processes is available to the operating system before
allocation of resources, such as which resources a process will consume in its lifetime. For every
resource request, the system sees whether granting the request will mean that the system will enter
an unsafe state, meaning a state that could result in deadlock. The system then only grants requests
that will lead to safe states.[14] In order for the system to be able to determine whether the next state
will be safe or unsafe, it must know in advance at any time:

resources currently available;

resources currently allocated to each process;

resources that will be required and released by these processes in the future.

It is possible for a process to be in an unsafe state but for this not to result in a deadlock. The notion
of safe/unsafe states only refers to the ability of the system to enter a deadlock state or not. For
example, if a process requests A which would result in an unsafe state, but releases B which would
prevent circular wait, then the state is unsafe but the system is not in deadlock.

One known algorithm that is used for deadlock avoidance is the Banker's algorithm, which requires
resource usage limit to be known in advance.[1] However, for many systems it is impossible to know
in advance what every process will request. This means that deadlock avoidance is often
impossible.
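
To make the idea concrete, here is a minimal sketch of the safety check at the heart of the Banker's algorithm for a single resource type; the allocations and declared maxima are invented example values:

    def is_safe(available, allocation, maximum):
        """Return True if some completion order lets every process finish."""
        need = [m - a for m, a in zip(maximum, allocation)]
        finished = [False] * len(allocation)
        while True:
            # Find a process whose remaining need can be satisfied right now.
            for i in range(len(allocation)):
                if not finished[i] and need[i] <= available:
                    available += allocation[i]   # it completes and releases its holdings
                    finished[i] = True
                    break
            else:
                return all(finished)             # no further process can proceed

    # Three processes hold 2, 3 and 2 units (declared maxima 9, 4, 7); 3 units free:
    print(is_safe(3, [2, 3, 2], [9, 4, 7]))      # True: the order P1, P2, P0 is safe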

Two other algorithms are Wait/Die and Wound/Wait, each of which uses a symmetry-breaking
technique. In both these algorithms there exists an older process (O) and a younger process (Y).
Process age can be determined by a timestamp at process creation time. Smaller timestamps are
older processes, while larger timestamps represent younger processes.

                                    Wait/Die    Wound/Wait

O needs a resource held by Y        O waits     Y dies

Y needs a resource held by O        Y dies      Y waits

Another way to avoid deadlock is to avoid blocking, for example by using non-blocking
synchronization or read-copy-update.

Livelock
A livelock is similar to a deadlock, except that the states of the processes involved in the livelock
constantly change with regard to one another, with none progressing. The term was defined formally at
some time during the 1970s; an early sighting in the published literature is in Babich's 1979 article
on program correctness.[15] Livelock is a special case of resource starvation; the general definition
only states that a specific process is not progressing.[16]

A real-world example of livelock occurs when two people meet in a narrow corridor, and each tries to
be polite by moving aside to let the other pass, but they end up swaying from side to side without
making any progress because they both repeatedly move the same way at the same time.

Livelock is a risk with some algorithms that detect and recover from deadlock. If more than one
process takes action, the deadlock detection algorithm can be repeatedly triggered. This can be
avoided by ensuring that only one process (chosen arbitrarily or by priority) takes action. [17]

Distributed deadlock
Distributed deadlocks can occur in distributed systems when distributed transactions or concurrency
control is being used. Distributed deadlocks can be detected either by constructing a global wait-for
graph from local wait-for graphs at a deadlock detector or by a distributed algorithm like edge
chasing.

Phantom deadlocks are deadlocks that are falsely detected in a distributed system due to internal
system delays, but do not actually exist. For example, if a process releases a resource R1 and issues
a request for R2, and the first message is lost or delayed, a coordinator (deadlock detector)
could falsely conclude a deadlock (if the request for R2 while holding R1 would cause a deadlock).

Parallel processing
The simultaneous use of more than one CPU to execute a program. Ideally, parallel
processing makes a program run faster because there are more engines (CPUs)
running it. In practice, it is often difficult to divide a program in such a way that separate
CPUs can execute different portions without interfering with each other.

Most computers have just one CPU, but some models have several. There are even
computers with thousands of CPUs. With single-CPU computers, it is possible to
perform parallel processing by connecting the computers in a network. However, this
type of parallel processing requires very sophisticated software called distributed
processing software.
Note that parallel processing differs from multitasking, in which a single CPU executes
several programs at once.

Hashing techniques
The idea of hashing

If one wants to store a certain set of similar objects and wants to quickly
access a given one (or come back with the result that it is unknown), the first
idea would be to store them in a list, possibly sorted for faster access. This
however would still need log(n) comparisons to find a given element or to
decide that it is not yet stored.

Therefore one uses a much bigger array and a function from the space of
possible objects to integer values to decide where in the array to store a
certain object. If this so-called hash function distributes the actually stored
objects well enough over the array, the access time is constant on average. Of
course, a hash function will usually not be injective, so one needs a strategy
for what to do in case of a so-called "collision", that is, if more than one object
with the same hash value has to be stored. This package provides two ways to
deal with collisions: one is implemented in the so-called "HashTabs" and
another in the "TreeHashTabs". The former simply uses other parts of the
array to store the data involved in the collisions, and the latter uses an AVL tree
to store all data objects with the same hash value. Both are used basically in
the same way but sometimes behave a bit differently.
Memory requirements
Due to the data structure defined above the hash table will need one machine
word (4 bytes on 32bit machines and 8 bytes on 64bit machines) per possible
entry in the hash if all values corresponding to objects in the hash
are true and two machine words otherwise. This means that the memory
requirement for the hash itself is proportional to the hash table length and not
to the number of objects actually stored in the hash!

In addition one of course needs the memory to store the objects themselves.

For TreeHashTabs there are additional memory requirements. As soon as
more than one key hashes to the same value, the memory for an
AVL tree object is needed in addition. An AVL tree object needs about 10
machine words for the tree object and then another 4 machine words for each
entry stored in the tree. Note that for many collisions this can be significantly
more than for HashTab tables. However, the advantage of TreeHashTabs is
that even for a bad hash function the performance is never worse
than log(n) for each operation, where n is the number of keys in the hash with
the same hash value.

or

Hashing is a technique for performing insertion, deletion and find operations in almost constant time. As a very simple example, an array whose index serves as the key is a hash table: each index (key) can be used to access the associated value in constant search time. The mapping key
must be simple to compute and must help in identifying the associated value. A function which helps us
in generating such a key-value mapping is known as a Hash Function.
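
As a small illustration, here is a minimal Python sketch of a hash table that resolves collisions by chaining (keeping all colliding entries together in a per-slot list); note this is a different collision strategy from the HashTab approach described above, which reuses other parts of the array:

    class ChainedHashTable:
        def __init__(self, size=101):
            self.buckets = [[] for _ in range(size)]

        def _bucket(self, key):
            # The hash function maps a key to one of the array slots.
            return self.buckets[hash(key) % len(self.buckets)]

        def put(self, key, value):
            bucket = self._bucket(key)
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)    # overwrite an existing key
                    return
            bucket.append((key, value))         # new key (possibly a collision)

        def get(self, key):
            for k, v in self._bucket(key):      # scan only the colliding entries
                if k == key:
                    return v
            raise KeyError(key)

    table = ChainedHashTable()
    table.put("alpha", 1); table.put("beta", 2)
    print(table.get("beta"))                    # 2

With a hash function that spreads keys well, each bucket stays short and put/get run in constant time on average.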

Buffer
A buffer is a region of memory used to temporarily hold data while it is being moved from
one place to another.

That would be the most simple yet sensible definition of a buffer, irrespective of where it
may appear. Computers often contain devices that work at different speeds; for
example, RAM is much faster than the hard disk. Further, the CPU of a
computer is only capable of handling a specific amount of data in a given time.
These and many other reasons make it necessary for operating systems to have buffers, or
temporary memory locations, they can use. For example, imagine that there are two different
processes. It can be tricky to transfer data between these processes as the processes may be
in two different states at a given time.

Let us say process A is sending a bitmap to the printer driver so that it can send it to the
printer. Unfortunately, the driver is busy printing another page at that time, so until the
driver is ready the OS stores the data in a buffer.

The same concept is applied to other things like copying files to a USB drive, playing a
video, or taking input from an I/O device.
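
The printer scenario can be sketched in Python with a bounded buffer between a fast producer and a slow consumer; the page strings and delay are invented for illustration:

    import queue, threading, time

    buf = queue.Queue(maxsize=8)        # bounded buffer between the two sides

    def application():                  # fast side: hands pages to the driver
        for n in range(3):
            buf.put(f"page {n}")        # blocks only if the buffer is full

    def printer_driver():               # slow side: drains the buffer at its own pace
        for _ in range(3):
            time.sleep(0.1)             # simulate a slow device
            print("printed", buf.get())

    a = threading.Thread(target=application)
    p = threading.Thread(target=printer_driver)
    a.start(); p.start(); a.join(); p.join()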

Kernel Architecture
[Figure: structure of monolithic, micro, and hybrid kernels.]

The kernel is the core of an operating system. It is the software responsible for running programs
and providing secure access to the machine's hardware. Since there are many programs, and
resources are limited, the kernel also decides when and how long a program should run. This is
called scheduling. Accessing the hardware directly can be very complex, since there are many
different hardware designs for the same type of component. Kernels usually implement some level of
hardware abstraction (a set of instructions universal to all devices of a certain type) to hide the
underlying complexity from applications and provide a clean and uniform interface. This helps
application programmers to develop programs without having to know how to program for specific
devices. The kernel relies upon software drivers that translate the generic command into instructions
specific to that device.

An operating system kernel is not strictly needed to run a computer. Programs can be directly loaded
and executed on the "bare metal" machine, provided that the authors of those programs are willing
to do without any hardware abstraction or operating system support. This was the normal operating
method of many early computers, which were reset and reloaded between the running of different
programs. Eventually, small ancillary programs such as program loaders and debuggers were
typically left in-core between runs, or loaded from read-only memory. As these were developed, they
formed the basis of what became early operating system kernels. The "bare metal" approach is still
used today on many video game consoles and embedded systems, but in general, newer systems
use kernels and operating systems.

Four broad categories of kernels:

Monolithic kernels provide rich and powerful abstractions of the underlying hardware.

Microkernels provide a small set of simple hardware abstractions and use applications called
servers to provide more functionality.

Exokernels provide minimal abstractions, allowing low-level hardware access. In exokernel
systems, library operating systems provide the abstractions typically present in monolithic
kernels.

Hybrid (modified microkernel) kernels are much like pure microkernels, except that they include
some additional code in kernel space to increase performance.

Thread (computing)

[Figure: a process with two threads of execution, running on a single processor.]

In computer science, a thread of execution is the smallest sequence of programmed instructions
that can be managed independently by a scheduler, which is typically a part of the operating
system.[1] The implementation of threads and processes differs between operating systems, but in most
cases a thread is a component of a process. Multiple threads can exist within the same process,
executing concurrently (one starting before others finish) and sharing resources such as memory,
while different processes do not share these resources. In particular, the threads of a process share
its instructions (executable code) and its context (the values of its variables at any given moment).
On a single processor, multithreading is generally implemented by time slicing (as in multitasking),
and the central processing unit (CPU) switches between different software threads. This context
switching generally happens frequently enough that the user perceives the threads or tasks as
running at the same time (in parallel). On a multiprocessor or multi-core system, multiple threads can
be executed in parallel (at the same instant), with every processor or core executing a separate
thread simultaneously; on a processor or core with hardware threads, separate software threads can
also be executed concurrently by separate hardware threads.

Threads made an early appearance in OS/360 Multiprogramming with a Variable Number of
Tasks (MVT) in 1967, in which they were called "tasks". Process schedulers of many modern
operating systems directly support both time-sliced and multiprocessor threading, and the operating
system kernel allows programmers to manipulate threads by exposing required functionality through
the system call interface. Some threading implementations are called kernel threads, whereas
lightweight processes (LWP) are a specific type of kernel thread that share the same state and
information. Furthermore, programs can have user-space threads when threading with timers,
signals, or other methods to interrupt their own execution, performing a sort of ad hoc time-slicing.

Threads differ from traditional multitasking operating system processes in that:

processes are typically independent, while threads exist as subsets of a process

processes carry considerably more state information than threads, whereas multiple threads
within a process share process state as well as memory and other resources

processes have separate address spaces, whereas threads share their address space

processes interact only through system-provided inter-process communication mechanisms

context switching between threads in the same process is typically faster than context
switching between processes.

Systems such as Windows NT and OS/2 are said to have "cheap" threads and "expensive"
processes; in other operating systems there is not so great a difference except for the cost of
an address space switch, which on some architectures (notably x86) results in a translation lookaside
buffer (TLB) flush.

Single threading
In computer programming, single threading is the processing of one command at a time.[2] The
opposite of single threading is multithreading.[3] While it has been suggested that the term single
threading is misleading, the term has been widely accepted within the functional
programming community.[4]
Multithreading
Multithreading is mainly found in multitasking operating systems. Multithreading is a widespread
programming and execution model that allows multiple threads to exist within the context of a single
process. These threads share the process's resources, but are able to execute independently. The
threaded programming model provides developers with a useful abstraction of concurrent execution.
Multithreading can also be applied to a single process to enable parallel execution on
a multiprocessing system.

Multithreaded applications have the following advantages:

Responsiveness: multithreading can allow an application to remain responsive to input. In a
single-threaded program, if the main execution thread blocks on a long-running task, the entire
application can appear to freeze. By moving such long-running tasks to a worker thread that
runs concurrently with the main execution thread, it is possible for the application to remain
responsive to user input while executing tasks in the background. On the other hand, in most
cases multithreading is not the only way to keep a program responsive, with non-blocking
I/O and/or Unix signals being available for gaining similar results.[5]

Faster execution: this advantage of a multithreaded program allows it to operate faster
on computer systems that have multiple CPUs or one or more multi-core CPUs, or across
a cluster of machines, because the threads of the program naturally lend themselves to parallel
execution, assuming sufficient independence (that they do not need to wait for each other).

Lower resource consumption: using threads, an application can serve multiple clients
concurrently using fewer resources than it would need when using multiple process copies of
itself. For example, the Apache HTTP server uses thread pools: a pool of listener threads for
listening to incoming requests, and a pool of server threads for processing those requests.

Better system utilization: as an example, a file system using multiple threads can achieve
higher throughput and lower latency since data in a faster medium (such as cache memory) can
be retrieved by one thread while another thread retrieves data from a slower medium (such as
external storage) without either thread waiting for the other to complete.

Simplified sharing and communication: unlike processes, which require a message
passing or shared memory mechanism to perform inter-process communication (IPC), threads
can communicate through data, code and files they already share.

Parallelization: applications looking to utilize multicore or multi-CPU systems can use
multithreading to split data and tasks into parallel subtasks and let the underlying architecture
manage how the threads run, either concurrently on a single core or in parallel on multiple cores.
GPU computing environments like CUDA and OpenCL use the multithreading model, where
dozens to hundreds of threads run in parallel on a large number of cores.

Multithreading has the following drawbacks:

Synchronization: since threads share the same address space, the programmer must be
careful to avoid race conditions and other non-intuitive behaviors. In order for data to be
correctly manipulated, threads will often need to rendezvous in time in order to process the data
in the correct order. Threads may also require mutually exclusive operations (often implemented
using semaphores) to prevent common data from being simultaneously modified, or read
while in the process of being modified. Careless use of such primitives can lead to deadlocks.

Thread crashes a process: an illegal operation performed by a thread crashes the entire
process; therefore, one misbehaving thread can disrupt the processing of all the other threads in
the application.
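
The race condition named in the first drawback can be demonstrated, and fixed with a mutex, in a few lines of Python; the counter and iteration count are arbitrary example values:

    import threading

    counter = 0
    lock = threading.Lock()

    def increment(n):
        global counter
        for _ in range(n):
            with lock:          # makes the read-modify-write atomic
                counter += 1    # without the lock, concurrent updates can be lost

    threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)              # always 200000 with the lock; unpredictable without it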

Scheduling
Operating systems schedule threads either preemptively or cooperatively. Preemptive multithreading
is generally considered the superior approach, as it allows the operating system to determine when
a context switch should occur. The disadvantage of preemptive multithreading is that the system
may make a context switch at an inappropriate time, causing lock convoy, priority inversion or other
negative effects, which may be avoided by cooperative multithreading. Cooperative multithreading,
on the other hand, relies on the threads themselves to relinquish control once they are at a stopping
point. This can create problems if a thread is waiting for a resource to become available.

Until the early 2000s, most desktop computers had only one single-core CPU, with no support
for hardware threads, although threads were still used on such computers because switching
between threads was generally still quicker than full-process context switches. In 2002, Intel added
support for simultaneous multithreading to the Pentium 4 processor, under the name hyper-
threading; in 2005, they introduced the dual-core Pentium D processor and AMD introduced the
dual-core Athlon 64 X2 processor.

Processors in embedded systems, which have higher requirements for real-time behaviors, might
support multithreading by decreasing the thread-switch time, perhaps by allocating a
dedicated register file for each thread instead of saving/restoring a common register file.

Processes, kernel threads, user threads, and fibers

Scheduling can be done at the kernel level or user level, and multitasking can be done preemptively
or cooperatively. This yields a variety of related concepts.

At the kernel level, a process contains one or more kernel threads, which share the process's
resources, such as memory and file handles: a process is a unit of resources, while a thread is a
unit of scheduling and execution. Kernel scheduling is typically uniformly done preemptively or, less
commonly, cooperatively. At the user level a process such as a runtime system can itself schedule
multiple threads of execution. If these do not share data, they are usually analogously called
processes,[6] while if they share data they are usually called (user) threads, particularly if
preemptively scheduled. Cooperatively scheduled user threads are known as fibers; different
processes may schedule user threads differently. User threads may be executed by kernel threads in
various ways (one-to-one, many-to-one, many-to-many). The term "light-weight process" variously
refers to user threads or to kernel mechanisms for scheduling user threads onto kernel threads.
A process is a "heavyweight" unit of kernel scheduling, as creating, destroying, and switching
processes is relatively expensive. Processes own resources allocated by the operating system.
Resources include memory (for both code and data), file handles, sockets, device handles, windows,
and a process control block. Processes are isolated by process isolation, and do not share address
spaces or file resources except through explicit methods such as inheriting file handles or shared
memory segments, or mapping the same file in a shared way (see interprocess communication).
Creating or destroying a process is relatively expensive, as resources must be acquired or released.
Processes are typically preemptively multitasked, and process switching is relatively expensive,
beyond the basic cost of context switching, due to issues such as cache flushing.[a]

A kernel thread is a "lightweight" unit of kernel scheduling. At least one kernel thread exists within
each process. If multiple kernel threads can exist within a process, then they share the same
memory and file resources. Kernel threads are preemptively multitasked if the operating system's
process scheduler is preemptive. Kernel threads do not own resources except for a stack, a copy of
the registers including the program counter, and thread-local storage (if any), and are thus relatively
cheap to create and destroy. Thread switching is also relatively cheap: it requires a context switch
(saving and restoring registers and stack pointer), but does not change virtual memory and is thus
cache-friendly (leaving TLB valid). The kernel can assign one thread to each logical core in a system
(because each processor splits itself up into multiple logical cores if it supports multithreading, or
only supports one logical core per physical core if it does not), and can swap out threads that get
blocked. However, kernel threads take much longer than user threads to be swapped.

Threads are sometimes implemented in userspace libraries, thus called user threads. The kernel is
unaware of them, so they are managed and scheduled in userspace. Some implementations base
their user threads on top of several kernel threads, to benefit from multi-processor machines (M:N
model). In this article the term "thread" (without kernel or user qualifier) defaults to referring to kernel
threads. User threads as implemented by virtual machines are also called green threads. User
threads are generally fast to create and manage, but cannot take advantage of multithreading or
multiprocessing, and will get blocked if all of their associated kernel threads get blocked even if there
are some user threads that are ready to run.

Fibers are an even lighter unit of scheduling which are cooperatively scheduled: a running fiber must
explicitly "yield" to allow another fiber to run, which makes their implementation much easier than
kernel or user threads. A fiber can be scheduled to run in any thread in the same process. This
permits applications to gain performance improvements by managing scheduling themselves,
instead of relying on the kernel scheduler (which may not be tuned for the application). Parallel
programming environments such as OpenMP typically implement their tasks through fibers. Closely
related to fibers are coroutines, with the distinction being that coroutines are a language-level
construct, while fibers are a system-level construct.

Thread and fiber issues

Concurrency and data structures

Threads in the same process share the same address space. This allows concurrently running code
to couple tightly and conveniently exchange data without the overhead or complexity of an IPC.
When shared between threads, however, even simple data structures become prone to race
conditions if they require more than one CPU instruction to update: two threads may end up
attempting to update the data structure at the same time and find it unexpectedly changing
underfoot. Bugs caused by race conditions can be very difficult to reproduce and isolate.

To prevent this, threading APIs offer synchronization primitives such as mutexes to lock data
structures against concurrent access. On uniprocessor systems, a thread running into a locked
mutex must sleep and hence trigger a context switch. On multi-processor systems, the thread may
instead poll the mutex in a spinlock. Both of these may sap performance and force processors
in SMP systems to contend for the memory bus, especially if the granularity of the locking is fine.

Although threads seem to be a small step from sequential computation, in fact, they represent a huge
step. They discard the most essential and appealing properties of sequential computation:
understandability, predictability, and determinism. Threads, as a model of computation, are wildly non-
deterministic, and the job of the programmer becomes one of pruning that nondeterminism.

The Problem with Threads, Edward A. Lee, UC Berkeley, 2006 [7]

I/O and scheduling

User thread or fiber implementations are typically entirely in userspace. As a result, context
switching between user threads or fibers within the same process is extremely efficient because it
does not require any interaction with the kernel at all: a context switch can be performed by locally
saving the CPU registers used by the currently executing user thread or fiber and then loading the
registers required by the user thread or fiber to be executed. Since scheduling occurs in userspace,
the scheduling policy can be more easily tailored to the requirements of the program's workload.

However, the use of blocking system calls in user threads (as opposed to kernel threads) or fibers
can be problematic. If a user thread or a fiber performs a system call that blocks, the other user
threads and fibers in the process are unable to run until the system call returns. A typical example of
this problem is when performing I/O: most programs are written to perform I/O synchronously. When
an I/O operation is initiated, a system call is made, and does not return until the I/O operation has
been completed. In the intervening period, the entire process is "blocked" by the kernel and cannot
run, which starves other user threads and fibers in the same process from executing.

A common solution to this problem is providing an I/O API that implements a synchronous interface
by using non-blocking I/O internally, and scheduling another user thread or fiber while the I/O
operation is in progress. Similar solutions can be provided for other blocking system calls.
Alternatively, the program can be written to avoid the use of synchronous I/O or other blocking
system calls.

SunOS 4.x implemented "light-weight processes" or LWPs. NetBSD 2.x+ and DragonFly
BSD implement LWPs as kernel threads (1:1 model). SunOS 5.2 through SunOS 5.8, as well as
NetBSD 2 to NetBSD 4, implemented a two-level model, multiplexing one or more user-level threads
on each kernel thread (M:N model). SunOS 5.9 and later, as well as NetBSD 5, eliminated user
threads support, returning to a 1:1 model.[8] FreeBSD 5 implemented the M:N model. FreeBSD 6
supported both 1:1 and M:N; users could choose which one should be used with a given program
using /etc/libmap.conf. Starting with FreeBSD 7, 1:1 became the default. FreeBSD 8 no longer
supports the M:N model.

The use of kernel threads simplifies user code by moving some of the most complex aspects of
threading into the kernel. The program does not need to schedule threads or explicitly yield the
processor. User code can be written in a familiar procedural style, including calls to blocking APIs,
without starving other threads. However, kernel threading may force a context switch between
threads at any time, and thus expose race hazards and concurrency bugs that would otherwise lie
latent. On SMP systems, this is further exacerbated because kernel threads may literally execute on
separate processors in parallel.

Models

1:1 (kernel-level threading)

Threads created by the user in a 1:1 correspondence with schedulable entities in the kernel[9] are the
simplest possible threading implementation. OS/2 and Win32 used this approach from the start,
while on Linux the usual C library implements this approach (via the NPTL or older LinuxThreads).
This approach is also used by Solaris, NetBSD, FreeBSD, OS X, and iOS.

N:1 (user-level threading)

An N:1 model implies that all application-level threads map to a single kernel-level scheduled
entity;[9] the kernel has no knowledge of the application threads. With this approach, context switching can
be done very quickly and, in addition, it can be implemented even on simple kernels which do not
support threading. One of the major drawbacks, however, is that it cannot benefit from the hardware
acceleration on multi-threaded processors or multi-processor computers: there is never more than
one thread being scheduled at the same time.[9] For example: if one of the threads needs to execute
an I/O request, the whole process is blocked and the threading advantage cannot be utilized.
GNU Portable Threads uses user-level threading, as does State Threads.

M:N (hybrid threading)

M:N maps some M number of application threads onto some N number of kernel entities, [9] or "virtual
processors." This is a compromise between kernel-level ("1:1") and user-level ("N:1") threading. In
general, "M:N" threading systems are more complex to implement than either kernel or user threads,
because changes to both kernel and user-space code are required. In the M:N implementation, the
threading library is responsible for scheduling user threads on the available schedulable entities; this
makes context switching of threads very fast, as it avoids system calls. However, this increases
complexity and the likelihood of priority inversion, as well as suboptimal scheduling without extensive
(and expensive) coordination between the userland scheduler and the kernel scheduler.

Database Keys
Keys are a very important part of a relational database. They are used to establish and identify
relationships between tables. They also ensure that each record within a table can be uniquely
identified by a combination of one or more fields within the table.

Super Key

A Super Key is defined as a set of attributes within a table that uniquely identifies each record within
the table. A Super Key is a superset of a Candidate Key.

Candidate Key

Candidate keys are defined as the set of fields from which the primary key can be selected. A
candidate key is an attribute or set of attributes that can act as a primary key for a table, uniquely
identifying each record in that table.

Primary Key

A primary key is the candidate key that is most appropriate to become the main key of the table. It is
a key that uniquely identifies each record in the table.
Composite Key

A key that consists of two or more attributes that together uniquely identify an entity occurrence is
called a Composite Key. No attribute that makes up the composite key is a simple key on its
own.

Secondary or Alternative Key

Candidate keys that are not selected as the primary key are known as secondary or
alternative keys.

Non-key Attribute

Non-key attributes are attributes other than candidate key attributes in a table.

Non-prime Attribute

Non-prime attributes are attributes that do not occur in any candidate key.
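
These definitions map directly onto SQL table constraints; the following sketch (using Python's built-in sqlite3 module, with an invented student table) marks one candidate key as the primary key and another as an alternate key:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("""
        CREATE TABLE student (
            roll_no INTEGER PRIMARY KEY,    -- primary key (the chosen candidate key)
            email   TEXT UNIQUE NOT NULL,   -- alternate/secondary candidate key
            name    TEXT                    -- non-key attribute
        )
    """)
    con.execute("INSERT INTO student VALUES (1, 'asha@example.com', 'Asha')")
    try:
        con.execute("INSERT INTO student VALUES (1, 'ben@example.com', 'Ben')")
    except sqlite3.IntegrityError as err:
        print("duplicate primary key rejected:", err)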


Types of Database Management Systems
If you do not know that there are different types of database management systems,
then this is probably the article you should be reading. Read on and brush up on your
database knowledge.

What is a Database Management System?

A Database Management System or DBMS is a single computer program or set of programs
responsible for creating, editing, deleting and generally maintaining a database or collection
of data records. The type of database management system is determined by the database
model. A database model is the manner in which the data collection is stored, managed and
administered. The various database management systems based on these data models are:

Relational Database Management Systems

Relational database management systems are the most widely used database management
systems today. They are relatively easy to use. Relational database management systems
are named so because of the characteristic of normalizing the data, which is usually stored in
tables. The relational model relies on normalizing data within rows and columns in tables.
The data can be related to other data in the same table or in other tables, and has to be
correctly managed by joining one or more tables. Relational models may be somewhat less
efficient than other models; however, this may not be a problem with the processing power
and memory found in modern computers. Data in this type of model is stored in fixed
predefined structures and is usually manipulated using Structured Query Language (SQL).
Relational database management systems include Oracle, MS SQL Server, IBM DB2, MySQL,
SQLite and PostgreSQL, among others.

Flat File Based Database Management Systems

Flat file based database management systems are probably the simplest of them all. These
are sometimes called flat models. They come in human-readable text formats as well as in
binary formats. They are ideal for stand-alone applications, holding software configuration,
and native-format storage models. Flat files in a formatted row and column model rely on
the assumption that every item in a particular model consists of the same data. One common
example of this type of database is the CSV (Comma Separated Values) file, and another is a
spreadsheet such as MS Excel.

Hierarchical Database Management Systems

Hierarchical database management systems operate on the parent-child, tree-like model.
These normally have a 1:N relationship and are good for storing data with items describing
attributes, features and so on. They could store a book with information on chapters and
verses. They can also be used to store a database of songs, recipes, models of phones and
anything that can be stored in a nested format. Hierarchical database management systems
are not very efficient for various real-world operations. One example of a hierarchical
data model is an XML document.

Network Database Management Systems

A network database management system uses a data model similar to that of hierarchical
database management systems. The major difference is that the tree structure in the
network model can have a many-parent-to-many-child relationship. The network model
structure is based on records and sets, and most of these databases use SQL for
manipulation of their data. Network database management systems tend to be very flexible
but are rarely used today; they were quite common in the 1960s and 1970s. Searching for an
item in this model requires the program to traverse the entire data set, which is quite
cumbersome. These have mainly been replaced by relational database management
systems in today's modern computing.

Object-oriented Database Management Systems

Object-oriented database management systems borrow from the model of the object-
oriented programming paradigm. In this database model, the object and its data or
attributes are seen as one and accessed through pointers, rather than stored in relational
table models. Object-oriented database models consist of diverse structures and are quite
extensible. This data model was designed to work closely with programs built with object-
oriented programming languages, thereby almost making the data and the program operate
as one. With this model, applications are able to treat the data as native code. There is little
commercial implementation of this database model as it is still developing. Examples of
object-oriented database management systems include db4o and DTS/S1 from
Obsidian Dynamics.

These are the five major classifications for types of database management systems.
Database schema
A database schema of a database system is its structure described in a formal language supported
by the database management system (DBMS) and refers to the organization of data as a blueprint of
how the database is constructed (divided into database tables in the case of Relational Databases).
The formal definition of database schema is a set of formulas (sentences) called integrity
constraints imposed on a database. These integrity constraints ensure compatibility between parts of
the schema. All constraints are expressible in the same language. A database can be considered a
structure in realization of the database language.[1] The states of a created conceptual schema are
transformed into an explicit mapping, the database schema. This describes how real world entities
are modeled in the database.

"A database schema specifies, based on the database administrator's knowledge of possible
applications, the facts that can enter the database, or those of interest to the possible end-
users."[2] The notion of a database schema plays the same role as the notion of theory in predicate
calculus. A model of this theory closely corresponds to a database, which can be seen at any
instant of time as a mathematical object. Thus a schema can contain formulas representing integrity
constraints specifically for an application and the constraints specifically for a type of database, all
expressed in the same database language.[1] In a relational database, the schema defines
the tables, fields, relationships, views, indexes, packages, procedures, functions, queues,
triggers, types, sequences, materialized views, synonyms, database links, directories, XML
schemas, and other elements.

Schemas are generally stored in a data dictionary. Although a schema is defined in a textual
database language, the term is often used to refer to a graphical depiction of the database
structure. In other words, the schema is the structure of the database that defines the
objects in the database.
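As a small illustration, here is a hypothetical schema expressed in SQL (run here against an in-memory SQLite database) that defines two tables, their fields, and the relationship between them:

import sqlite3

# The schema: a blueprint, written in a database language, of how the
# database is constructed -- tables, fields and a relationship.
schema = """
CREATE TABLE department (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE employee (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INTEGER REFERENCES department(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)  # install the schema into an empty database

# The DBMS records the schema in its own data dictionary (sqlite_master).
print(conn.execute("SELECT name FROM sqlite_master").fetchall())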

Significant Figures

RULES FOR SIGNIFICANT FIGURES

1. All non-zero numbers ARE significant. The number 33.2 has THREE significant figures because all of
the digits present are non-zero.

2. Zeros between two non-zero digits ARE significant. 2051 has FOUR significant figures. The zero is
between a 2 and a 5.

3. Leading zeros are NOT significant. They're nothing more than "place holders." The number 0.54 has
only TWO significant figures. 0.0032 also has TWO significant figures. All of the zeros are leading.
4. Trailing zeros to the right of the decimal ARE significant. There are FOUR significant figures in
92.00.

92.00 is different from 92: a scientist who measures 92.00 milliliters knows his value to the nearest 1/100th
milliliter; meanwhile his colleague who measured 92 milliliters only knows his value to the nearest 1
milliliter. It's important to understand that "zero" does not mean "nothing." Zero denotes actual information,
just like any other number. You cannot tag on zeros that aren't certain to belong there.

5. Trailing zeros in a whole number with the decimal shown ARE significant. Placing a decimal at
the end of a number is usually not done. By convention, however, this decimal indicates a significant zero.
For example, "540." indicates that the trailing zero IS significant; there are THREE significant figures in this
value.

6. Trailing zeros in a whole number with no decimal shown are NOT significant. Writing just "540"
indicates that the zero is NOT significant, and there are only TWO significant figures in this value.

7. Exact numbers have an INFINITE number of significant figures. This rule applies to numbers that
are definitions. For example, 1 meter = 1.00 meters = 1.0000 meters =
1.0000000000000000000 meters, etc.

So now back to the example posed in the Rounding Tutorial: round 1000.3 to four significant
figures. 1000.3 has five significant figures (the zeros are between the non-zero digits 1 and 3, so by
rule 2 above, they are significant). We need to drop the final 3, and since 3 < 5, we leave the last zero
alone; so 1000. is our four-significant-figure answer. (From rules 5 and 6, we see that in order for the
trailing zeros to "count" as significant, they must be followed by a decimal. Writing just "1000" would
give us only ONE significant figure.)

8. For a number in scientific notation, N × 10^x, all digits comprising N ARE significant by the first
six rules; the "10" and the exponent "x" are NOT significant. 5.02 × 10^4 has THREE significant
figures: "5.02". The "10" and the "4" are not significant.

Rule 8 provides the opportunity to change the number of significant figures in a value by manipulating its
form. For example, let's try writing 1100 with THREE significant figures. By rule 6, 1100 has TWO significant
figures; its two trailing zeros are not significant. If we add a decimal to the end, we have 1100., with FOUR
significant figures (by rule 5). But by writing it in scientific notation, 1.10 × 10^3, we create a THREE-
significant-figure value.
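These rules can be mechanized. The following Python sketch (round_sig is our own helper, not a library function) rounds a value to a given number of significant figures and shows one way to format trailing significant zeros:

import math

def round_sig(value: float, sig_figs: int) -> float:
    """Round value to sig_figs significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))  # position of the leading digit
    factor = 10 ** (sig_figs - 1 - exponent)
    return round(value * factor) / factor

print(round_sig(1000.3, 4))   # 1000.0, the example from the text
print(f"{92.0:#.4g}")         # '92.00' -- the '#' flag keeps significant trailing zeros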

File Organization & Access Method

File Access Method


The way by which information/data can be retrieved. There are two methods of file access:

1. Direct Access
2. Sequential Access
Direct Access
With this access method, the information/data stored on a device can be accessed randomly and
immediately, irrespective of the order in which it was stored. Data access with this method is quicker
than with sequential access. It is also known as the random access method. Examples: hard disks,
flash memory.

Sequential Access

With this access method, the information/data stored on a device is accessed in the exact order in
which it was stored. Sequential access methods are seen in older storage devices such as
magnetic tape.
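A minimal sketch of the two access methods on an ordinary binary file of fixed-size records; the records.bin file and the 8-byte record size are assumptions:

RECORD_SIZE = 8  # assumed fixed record length in bytes

# Sequential access: read the records in the exact order they were stored.
with open("records.bin", "rb") as f:
    while record := f.read(RECORD_SIZE):
        print(record)

# Direct (random) access: jump straight to record number 42, in any order.
with open("records.bin", "rb") as f:
    f.seek(42 * RECORD_SIZE)  # the position is computed, not searched for
    print(f.read(RECORD_SIZE))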

File Organization Method


The process that determines how data/information is stored so that file access can be as quick and
easy as possible. There are three main ways of file organization:

1. Sequential
2. Index-Sequential
3. Random
Sequential file organization
All records are stored in some sort of order (ascending, descending, alphabetical). The order is
based on a field in the record. For example, in a file holding the records employee ID, date of birth
and address, the employee ID is used and the records stored are grouped accordingly
(ascending/descending). It can be used with both direct and sequential access.

Index-Sequential organization

The records are stored in some order, but there is a second file, called the index file, that indicates
exactly where certain keys point (see the sketch at the end of this section). It cannot be used with
the sequential access method.

Random file organization

The records are stored randomly, but each record has its own specific position on the disk
(its address). With this method, no time is wasted searching through the file; instead, the system
jumps to the exact position and accesses the data/information. It can only be used with the direct
access method.
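As a sketch of the index idea mentioned above, the following Python snippet builds a small in-memory index that maps each key to the byte offset of its record, so a later lookup can seek straight to it; the employees.dat file and its 6-byte key field are assumptions:

RECORD_SIZE = 32  # assumed fixed record length in bytes

# Build the index: each key maps to the byte offset of its record.
index = {}
with open("employees.dat", "rb") as f:
    offset = 0
    while record := f.read(RECORD_SIZE):
        key = record[:6].decode().strip()  # assume a 6-byte employee ID field
        index[key] = offset
        offset += RECORD_SIZE

# A later lookup seeks straight to the wanted record via the index.
with open("employees.dat", "rb") as f:
    f.seek(index["E00042"])
    print(f.read(RECORD_SIZE))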
Requirements analysis

[Figure: A systems engineering perspective on requirements analysis]

Requirements analysis, in systems engineering and software engineering, encompasses those
tasks that go into determining the needs or conditions to meet for a new or altered product or
project, taking account of the possibly conflicting requirements of the various stakeholders, and
analyzing, documenting, validating and managing software or system requirements.[2]

Requirements analysis is critical to the success of a systems or software project. [3] The requirements
should be documented, actionable, measurable, testable, traceable, related to identified business
needs or opportunities, and defined to a level of detail sufficient for system design.

Overview
Conceptually, requirements analysis includes three types of activities:

Eliciting requirements: gathering requirements from sources such as project documentation
(e.g. the project charter or definition), business process documentation, and stakeholder
interviews. This is sometimes also called requirements gathering.

Analyzing requirements: determining whether the stated requirements are clear, complete,
consistent and unambiguous, and resolving any apparent conflicts.

Recording requirements: requirements may be documented in various forms, usually
including a summary list, and may include natural-language documents, use cases, user stories,
or process specifications.

Requirements analysis can be a long and tiring process during which many delicate psychological
skills are involved. New systems change the environment and relationships between people, so it is
important to identify all the stakeholders, take into account all their needs and ensure they
understand the implications of the new systems. Analysts can employ several techniques to elicit the
requirements from the customer. These may include the development of scenarios (represented
as user stories in agile methods), the identification of use cases, the use of workplace observation
or ethnography, holding interviews or focus groups (more aptly named in this context as
requirements workshops, or requirements review sessions) and creating requirements
lists. Prototyping may be used to develop an example system that can be demonstrated to
stakeholders. Where necessary, the analyst will employ a combination of these methods to establish
the exact requirements of the stakeholders, so that a system that meets the business needs is
produced. Requirements quality can be improved through these and other methods:

Visualization. Using tools that promote better understanding of the desired end-product, such
as visualization and simulation.

Consistent use of templates. Producing a consistent set of models and templates to
document the requirements.

Documenting dependencies. Documenting dependencies and interrelationships among
requirements, as well as any underlying assumptions.

Requirements analysis topics


Stakeholder identification

See Stakeholder analysis for a discussion of people or organizations (legal entities such as
companies, standards bodies) that have a valid interest in the system. They may be affected by it
either directly or indirectly. A major new emphasis in the 1990s was a focus on the identification
of stakeholders. It is increasingly recognized that stakeholders are not limited to the organization
employing the analyst. Other stakeholders will include:

anyone who operates the system (normal and maintenance operators)

anyone who benefits from the system (functional, political, financial and social beneficiaries)

anyone involved in purchasing or procuring the system. In a mass-market product
organization, product management, marketing and sometimes sales act as surrogate consumers
(mass-market customers) to guide development of the product
organizations which regulate aspects of the system (financial, safety, and other regulators)

people or organizations opposed to the system (negative stakeholders; see also Misuse
case)

organizations responsible for systems which interface with the system under design

those organizations who integrate horizontally with the organization for whom the analyst is
designing the system

Stakeholder interviews

Stakeholder interviews are a common technique used in requirement analysis. Though they are
generally idiosyncratic in nature and focused upon the perspectives and perceived needs of the
stakeholder, often this perspective deficiency has the general advantage of obtaining a much richer
understanding of the stakeholder's unique business processes, decision-relevant business rules,
and perceived needs. Consequently, this technique can serve as a means of obtaining the highly
focused knowledge that is often not elicited in Joint Requirements Development sessions, where the
stakeholder's attention is compelled to assume a more cross-functional context, and the desire to
avoid controversy may limit the stakeholder's willingness to contribute. Moreover, the in-person
nature of the interviews provides a more relaxed environment where lines of thought may be
explored at length.

Joint Requirements Development (JRD) Sessions

Requirements often have cross-functional implications that are unknown to individual stakeholders
and often missed or incompletely defined during stakeholder interviews. These cross-functional
implications can be elicited by conducting JRD sessions in a controlled environment, facilitated by a
trained facilitator (Business Analyst), wherein stakeholders participate in discussions to elicit
requirements, analyze their details and uncover cross-functional implications. A dedicated scribe
should be present to document the discussion, freeing up the Business Analyst to lead the
discussion in a direction that generates appropriate requirements which meet the session objective.

JRD Sessions are analogous to Joint Application Design Sessions. In the former, the sessions elicit
requirements that guide design, whereas the latter elicit the specific design features to be
implemented in satisfaction of elicited requirements.

Contract-style requirement lists


One traditional way of documenting requirements has been contract-style requirement lists. In a
complex system, such requirements lists can run to hundreds of pages.

An appropriate metaphor would be an extremely long shopping list. Such lists are very much out of
favour in modern analysis, as they have proved spectacularly unsuccessful at achieving their aims,
but they are still seen to this day.
Strengths
Provides a checklist of requirements.

Provides a contract between the project sponsor(s) and developers.

For a large system, can provide a high-level description from which lower-level requirements
can be derived.

Weaknesses
Such lists can run to hundreds of pages. They are not intended to serve as a reader-friendly
description of the desired application.

Such requirements lists abstract all the requirements and so there is little context. The
Business Analyst may include context for requirements in accompanying design documentation.

This abstraction is not intended to describe how the requirements fit or work
together.

The list may not reflect relationships and dependencies between requirements. While
a list does make it easy to prioritize each individual item, removing one item out of context
can render an entire use case or business requirement useless.

The list doesn't supplant the need to review requirements carefully with
stakeholders in order to gain a better shared understanding of the implications for the design
of the desired system / application.

Simply creating a list does not guarantee its completeness. The Business Analyst must make
a good faith effort to discover and collect a substantially comprehensive list, and rely on
stakeholders to point out missing requirements.

These lists can create a false sense of mutual understanding between the stakeholders and
developers; Business Analysts are critical to the translation process.

It is almost impossible to uncover all the functional requirements before the process of
development and testing begins. If these lists are treated as an immutable contract, then
requirements that emerge in the Development process may generate a controversial
change request.
Alternative to requirement lists

As an alternative to requirement lists, Agile Software Development uses user stories to suggest
requirements in everyday language, for example: "As a customer, I want to reset my password by
email so that I can regain access to my account."

Measurable goals

Best practices take the composed list of requirements merely as clues and repeatedly ask "why?"
until the actual business purposes are discovered. Stakeholders and developers can then devise
tests to measure what level of each goal has been achieved thus far. Such goals change more
slowly than the long list of specific but unmeasured requirements. Once a small set of critical,
measured goals has been established, rapid prototyping and short iterative development phases
may proceed to deliver actual stakeholder value long before the project is half over.

Prototypes

A prototype is a computer program that exhibits a part of the properties of another computer
program, allowing users to visualize an application that has not yet been constructed. A popular form
of prototype is a mockup, which helps future users and other stakeholders to get an idea of what the
system will look like. Prototypes make it easier to make design decisions, because aspects of the
application can be seen and shared before the application is built. Major improvements in
communication between users and developers were often seen with the introduction of prototypes.
Early views of applications led to fewer changes later and hence reduced overall costs considerably.

Prototypes can be flat diagrams (often referred to as wireframes) or working applications using
synthesized functionality. Wireframes are made in a variety of graphic design documents, and often
remove all color from the design (i.e. use a greyscale color palette) in instances where the final
software is expected to have graphic design applied to it. This helps to prevent confusion as to
whether the prototype represents the final visual look and feel of the application.

Use cases

A use case is a structure for documenting the functional requirements for a system, usually involving
software, whether that is new or being changed. Each use case provides a set of scenarios that
convey how the system should interact with a human user or another system, to achieve a specific
business goal. Use cases typically avoid technical jargon, preferring instead the language of
the end-user or domain expert. Use cases are often co-authored by requirements engineers and
stakeholders.

Use cases are deceptively simple tools for describing the behavior of software or systems. A use
case contains a textual description of the ways in which users are intended to work with the software
or system. Use cases should not describe internal workings of the system, nor should they explain
how that system will be implemented. Instead, they show the steps needed to perform a task.
Requirements specification

The output of the requirements analysis process is a requirements specification.

Types of Requirements
Requirements are categorized in several ways. The following are common categorizations of
requirements that relate to technical management: [1]

Customer Requirements
Statements of fact and assumptions that define the expectations of the system in terms of
mission objectives, environment, constraints, and measures of effectiveness and suitability
(MOE/MOS). The customers are those that perform the eight primary functions of systems
engineering, with special emphasis on the operator as the key customer. Operational
requirements will define the basic need and, at a minimum, answer the questions posed in
the following listing:[1]

Operational distribution or deployment: Where will the system be used?

Mission profile or scenario: How will the system accomplish its mission objective?

Performance and related parameters: What are the critical system parameters to
accomplish the mission?

Utilization environments: How are the various system components to be used?

Effectiveness requirements: How effective or efficient must the system be in performing
its mission?

Operational life cycle: How long will the system be in use by the user?

Environment: In what environments will the system be expected to operate effectively?
Architectural Requirements
Architectural requirements explain what has to be done by identifying the necessary systems
architecture of a system.
Structural Requirements
Structural requirements explain what has to be done by identifying the necessary structure of
a system.
Behavioral Requirements
Behavioral requirements explain what has to be done by identifying the
necessary behavior of a system.
Functional Requirements
Functional requirements explain what has to be done by identifying the necessary task,
action or activity that must be accomplished. Functional requirements analysis will be used
as the top-level functions for functional analysis.[1]
Non-functional Requirements
Non-functional requirements are requirements that specify criteria that can be used to judge
the operation of a system, rather than specific behaviors.

Core Functionality and Ancillary Functionality Requirements
Core Functionality requirements are those without which the product cannot be
useful at all. Ancillary Functionality requirements are those that are supportive of Core
Functionality. The product can continue to work even if some or all of the Ancillary
Functionality requirements are not fulfilled, although with some side effects. Security, safety,
user-friendliness and so on are examples of Ancillary Functionality requirements.[4]
Performance Requirements
The extent to which a mission or function must be executed; generally measured in terms of
quantity, quality, coverage, timeliness or readiness. During requirements analysis,
performance (how well does it have to be done) requirements will be interactively developed
across all identified functions based on system life cycle factors; and characterized in terms
of the degree of certainty in their estimate, the degree of criticality to system success, and
their relationship to other requirements.[1]
Design Requirements
The build to, code to, and buy to requirements for products and how to execute
requirements for processes expressed in technical data packages and technical manuals. [1]
Derived Requirements
Requirements that are implied or transformed from higher-level requirement. For example, a
requirement for long range or high speed may result in a design requirement for low weight. [1]
Allocated Requirements
A requirement that is established by dividing or otherwise allocating a high-level requirement
into multiple lower-level requirements. Example: A 100-pound item that consists of two
subsystems might result in weight requirements of 70 pounds and 30 pounds for the two
lower-level items.[1]

Well-known requirements categorization models include FURPS and FURPS+, developed at
Hewlett-Packard.

Requirements analysis issues

Stakeholder issues

Steve McConnell, in his book Rapid Development, details a number of ways users can inhibit
requirements gathering:

Users do not understand what they want or don't have a clear idea of their requirements

Users will not commit to a set of written requirements

Users insist on new requirements after the cost and schedule have been fixed

Communication with users is slow

Users often do not participate in reviews or are incapable of doing so

Users are technically unsophisticated

Users do not understand the development process

Users do not know about present technology

This may lead to the situation where user requirements keep changing even after system or
product development has started.
Engineer/developer issues

Possible problems caused by engineers and developers during requirements analysis are:

A natural inclination towards writing code can lead to implementation beginning before the
requirements analysis is complete, potentially resulting in inelegant refactoring to meet actual
requirements once they are known.

Technical personnel and end-users may have different vocabularies. Consequently, they may
wrongly believe they are in perfect agreement until the finished product is supplied.

Engineers and developers may try to make the requirements fit an existing system or model, rather
than develop a system specific to the needs of the client.

Analysis may often be carried out by engineers or programmers, rather than personnel with the
domain knowledge to understand a client's needs properly.

Attempted solutions

One attempted solution to communications problems has been to employ specialists in business or
system analysis.

Techniques introduced in the 1990s like prototyping, Unified Modeling Language (UML), use cases,
and Agile software development are also intended as solutions to problems encountered with
previous methods.

Also, a new class of application simulation or application definition tools have entered the market.
These tools are designed to bridge the communication gap between business users and the IT
organization and also to allow applications to be 'test marketed' before any code is produced. The
best of these tools offer:

electronic whiteboards to sketch application flows and test alternatives

ability to capture business logic and data needs

ability to generate high fidelity prototypes that closely imitate the final application

interactivity

capability to add contextual requirements and other comments

ability for remote and distributed users to run and interact with the simulation
