2) Analysis of algorithms
2.1) Terminology you might encounter in the analysis of algorithms
a) The computational complexity (or simply complexity) of an algorithm is the amount of resources required to run it (a property unrelated to "complexity" in the everyday sense of difficulty).
Because the amount of resources needed varies with the input, the complexity is generally expressed as a function n → f(n), where n is the size of the input and f(n) is either the worst-case complexity, that is, the maximum amount of resources needed over all inputs of size n, or the average-case complexity, that is, the average amount of resources over all inputs of size n.
Generally, when "complexity" is used without further qualification, it is the worst-case time complexity that is meant.
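To make the distinction concrete, the following small Python sketch (my own illustration, not from the source notes; the helper names are hypothetical) counts the comparisons made by linear search for every possible target position in a list of size n. The maximum count is the worst case, the mean is the average case.

# Illustrative sketch: worst-case vs. average-case cost of linear search,
# counted as comparisons.
def linear_search_comparisons(items, target):
    comparisons = 0
    for value in items:
        comparisons += 1
        if value == target:
            break
    return comparisons

def worst_and_average_case(n):
    items = list(range(n))
    # One input per possible target position: costs are 1, 2, ..., n comparisons.
    costs = [linear_search_comparisons(items, t) for t in items]
    return max(costs), sum(costs) / len(costs)

print(worst_and_average_case(10))   # (10, 5.5): worst case n, average (n + 1) / 2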
b) The computational complexity (or simply complexity) of a problem is the minimum of the complexities of all possible algorithms for this problem (including algorithms not yet discovered). The study of the complexity of explicitly given algorithms is called analysis of algorithms, while the study of the complexity of problems is called computational complexity theory. Clearly, both areas are strongly related, as the complexity of an algorithm is always an upper bound on the complexity of the problem solved by this algorithm.
c) Worst-case complexity (usually denoted in asymptotic notation) measures the resources (e.g. running time, memory) an algorithm requires in the worst case. It gives an upper bound on the resources required by the algorithm.
In the case of running time, the worst-case time complexity is the longest running time the algorithm can take on any input of size n, and thus it guarantees that the algorithm finishes within that time. Moreover, the order of growth of the worst-case complexity is used to compare the efficiency of two or more algorithms.
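As an illustration (my own sketch, not from the notes), the worst case of insertion sort occurs on reverse-sorted input; counting comparisons on inputs of the same size shows the worst case bounding every other case:

# Sketch: the worst-case input for insertion sort is reverse-sorted data.
def insertion_sort_comparisons(items):
    items = list(items)
    comparisons = 0
    for i in range(1, len(items)):
        j = i
        while j > 0:
            comparisons += 1
            if items[j - 1] > items[j]:
                items[j - 1], items[j] = items[j], items[j - 1]
                j -= 1
            else:
                break
    return comparisons

n = 50
print(insertion_sort_comparisons(range(n)))         # already sorted: n - 1 = 49 comparisons
print(insertion_sort_comparisons(range(n, 0, -1)))  # reverse-sorted: n*(n-1)/2 = 1225, the worst case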
d) Average case complexity of an algorithm is the amount of some computational
resource (typically time) used by the algorithm, averaged over all possible inputs.
It is frequently contrasted with worst-case complexity which considers the
maximal complexity of the algorithm over all possible inputs.
There are three primary motivations for studying average-case complexity:
First, although some problems may be intractable in the worst case, the inputs which elicit this behavior may rarely occur in practice, so the average-case complexity may be a more accurate measure of an algorithm's performance.
Second, average-case complexity analysis provides tools and techniques to generate hard instances of problems, which can be utilized in areas such as cryptography and derandomization.
Third, average-case complexity allows discriminating the most efficient algorithm in practice among algorithms of equivalent best-case complexity (for instance Quicksort).
Average-case analysis requires a notion of an "average" input to an algorithm,
which leads to the problem of devising a probability distribution over inputs.
Alternatively, a randomized algorithm can be used. The analysis of such
algorithms leads to the related notion of an expected complexity. (need to
research more for this!)
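As a concrete illustration of the Quicksort remark above (my own sketch, not from the notes), the comparisons made by a simple quicksort with a first-element pivot can be counted on a sorted input, which is its worst case for this pivot choice, and on a random input, which is closer to the average case:

import random

# Sketch: comparisons made by quicksort with a first-element pivot.
# Each call counts one comparison per element partitioned against the pivot.
def quicksort_comparisons(items):
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(left) + quicksort_comparisons(right)

n = 200
print(quicksort_comparisons(list(range(n))))              # sorted input: about n*(n-1)/2 comparisons (worst case)
print(quicksort_comparisons(random.sample(range(n), n)))  # random input: about 2*n*ln(n) on average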
e) Computational resources: The simplest computational resources are
computation time, the number of steps necessary to solve a problem, and memory
space, the amount of storage needed while solving the problem, but many more
complicated resources have been defined. Resources needed to solve a problem are
described in terms of asymptotic analysis, by identifying the resources as a
function of the length or size of the input. Resource usage is often partially
quantified using Big O notation.
f) Algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. An algorithm must be analyzed to determine its resource usage, and the efficiency of an algorithm can be measured based on the usage of different resources.
For maximum efficiency we wish to minimize resource usage. However, different
resources such as time and space complexity cannot be compared directly, so
which of two algorithms is considered to be more efficient often depends on which
measure of efficiency is considered most important.
For example, bubble sort and timsort are both algorithms to sort a list of items from smallest to largest. Bubble sort sorts the list in time proportional to the number of elements squared, O(n²), but only requires a small amount of extra memory which is constant with respect to the length of the list, O(1). Timsort sorts the list in linearithmic time (proportional to a quantity times its logarithm) in the list's length, O(n log n), but has a space requirement linear in the length of the list, O(n). If large lists must be sorted at high speed for a given application, timsort is a better choice; however, if minimizing the memory footprint of the sorting is more important, bubble sort is a better choice.
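A minimal sketch of this trade-off (my own, not from the notes; merge sort stands in here for an O(n log n) algorithm with O(n) extra space, since implementing timsort itself is out of scope):

# Bubble sort works in place: O(1) auxiliary memory, roughly n^2/2 comparisons.
def bubble_sort(items):
    n = len(items)
    for i in range(n):
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

# Merge sort runs in O(n log n) time but allocates O(n) auxiliary memory.
def merge_sort(items):
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0          # the merge buffer is the O(n) extra space
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged

print(bubble_sort([5, 2, 4, 1, 3]))
print(merge_sort([5, 2, 4, 1, 3]))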
g) Performance
2.2) Algorithmic efficiency continued
a) Brief overview
An algorithm is considered efficient if its resource consumption, also known as
computational cost, is at or below some acceptable level. Roughly speaking,
'acceptable' means: it will run in a reasonable amount of time or space on an
available computer, typically as a function of the size of the input.
There are many ways in which the resources used by an algorithm can be
measured: the two most common measures are speed and memory usage; other
measures could include transmission speed, temporary disk usage, long-term disk
usage, power consumption, total cost of ownership, response time to external
stimuli, etc. Many of these measures depend on the size of the input to the
algorithm, i.e. the amount of data to be processed. They might also depend on the
way in which the data is arranged; for example, some sorting algorithms perform
poorly on data which is already sorted, or which is sorted in reverse order.
In practice, there are other factors which can affect the efficiency of an algorithm,
such as requirements for accuracy and/or reliability. As detailed below, the way in
which an algorithm is implemented can also have a significant effect on actual
efficiency, though many aspects of this relate to program/software optimization
issues.
b) Theoretical analysis
In the theoretical analysis of algorithms, the normal practice is to estimate their
complexity in the asymptotic sense. The most commonly used notation to describe
resource consumption or "complexity" is Donald Knuth's Big O notation,
representing the complexity of an algorithm as a function of the size of the input n.
Big O notation is an asymptotic measure of function complexity, where f(n) = O(g(n)) roughly means the time requirement for an algorithm is proportional to g(n), omitting lower-order terms that contribute less than g(n) to the growth of the function as n grows arbitrarily large. This estimate may be misleading when n is small, but is generally sufficiently accurate when n is large, as the notation is asymptotic.
For example, bubble sort may be faster than merge sort when only a few items are
to be sorted; however either implementation is likely to meet performance
requirements for a small list. Typically, programmers are interested in algorithms
that scale efficiently to large input sizes, and merge sort is preferred over bubble
sort for lists of length encountered in most data-intensive programs.
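The following sketch (my own, with made-up constant factors purely for illustration) shows why the asymptotic estimate can be misleading for small n but decisive for large n:

import math

def quadratic_cost(n, c=1):
    return c * n * n                               # hypothetical cost model: c * n^2

def linearithmic_cost(n, c=8):
    return c * n * math.log2(n) if n > 1 else c    # hypothetical cost model: c * n * log2(n)

for n in (4, 16, 64, 1024, 1_000_000):
    print(n, quadratic_cost(n), round(linearithmic_cost(n)))
# For very small n the quadratic algorithm can be cheaper; as n grows it loses badly.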
Implementation issues can also have an effect on efficiency, such as the choice of
programming language, or the way in which the algorithm is actually coded, or the
choice of a compiler for a particular language, or the compilation options used, or
even the operating system being used.
c) Measures of resource usage
Time measure in theory: analyze the algorithm, typically using time complexity
analysis to get an estimate of the running time as a function of the size of the input
data. The result is normally expressed using Big O notation. This is useful for
comparing algorithms, especially when a large amount of data is to be processed.
More detailed estimates are needed to compare algorithm performance when the
amount of data is small, although this is likely to be of less importance. Algorithms
which include parallel processing may be more difficult to analyze.
Time measured in practice, by benchmarking an actual implementation, depends heavily on the selection of a particular programming language, compiler, and compiler options, so algorithms being compared must all be implemented under the same conditions.
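A minimal benchmarking sketch along these lines (my own, not from the notes), using Python's standard-library timeit module so both algorithms run under identical conditions:

import random
import timeit

data = random.sample(range(100_000), 1_000)

def run_builtin_sort():
    sorted(data)                     # CPython's built-in sort (a timsort)

def run_bubble_sort():
    items = list(data)
    for i in range(len(items)):
        for j in range(len(items) - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]

print("built-in sort:", timeit.timeit(run_builtin_sort, number=5))
print("bubble sort:  ", timeit.timeit(run_bubble_sort, number=5))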
Space measure
This section is concerned with the use of memory resources (registers, cache,
RAM, virtual memory, secondary memory) while the algorithm is being executed.
As for the time analysis above, analyze the algorithm, typically using space complexity analysis, to get an estimate of the run-time memory needed as a function of the size of the input data. The result is normally expressed using Big O notation.
Some algorithms, such as sorting algorithms, often rearrange the input data and don't need any additional space for output data. This property is referred to as "in-place" operation.
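A small sketch of my own contrasting an in-place operation, which needs only O(1) extra space, with a variant that builds a new output list and therefore needs O(n) extra space:

def reverse_in_place(items):
    i, j = 0, len(items) - 1
    while i < j:
        items[i], items[j] = items[j], items[i]   # swap within the input buffer
        i += 1
        j -= 1
    return items                                  # O(1) auxiliary space

def reverse_into_copy(items):
    return items[::-1]                            # allocates a second list: O(n) auxiliary space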
i := 0;
while i < 3*n do
    j := 0;
    while j < n*n*n do
        j += 2;
    i++;
The total time for the code above to execute is f(n) = 3n · (40 + n³/2) = 120n + 1.5n⁴, so O(f(n)) = O(n⁴). (The outer loop runs 3n times; each outer iteration contributes a constant of roughly 40 time units of its own work plus n³/2 iterations of the inner loop, since j advances in steps of 2 up to n³.)
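As a cross-check (my own sketch, not part of the original notes), the same nested loops can be instrumented in Python to count inner-loop iterations and compare them with the dominant 1.5n⁴ term:

def count_inner_iterations(n):
    count = 0
    i = 0
    while i < 3 * n:
        j = 0
        while j < n * n * n:
            j += 2
            count += 1               # one unit of work per inner-loop iteration
        i += 1
    return count

for n in (4, 8, 16):
    print(n, count_inner_iterations(n), 1.5 * n ** 4)
# The measured counts (384, 6144, 98304) track 1.5*n^4 exactly for even n,
# consistent with the O(n^4) conclusion above.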