CS 163 Lecture-Module #09:
Checking Performance of Programs


time Anagrams1.java
    recording info. as in homework
time Anagrams1_sort.java
    recording info. as in homework

when multiplying data by 10 multiplies time by 10,
called linear time
(linear function in math e.g. y = 7x + 3)


for the other, observe multiplying number of items by 10 multiplies time by approximately 100 == 10²
called quadratic
see also graph:
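
A minimal sketch of the same effect without a stopwatch: count loop steps instead of seconds. The class GrowthDemo and its methods are hypothetical illustrations, not the Anagrams programs from the homework.

```java
// Hypothetical illustration: counts loop steps instead of measuring seconds.
class GrowthDemo {
    // One pass over n items: the number of steps grows linearly with n.
    static long linearSteps(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            steps++;
        }
        return steps;
    }

    // Every item paired with every item: the number of steps grows quadratically.
    static long quadraticSteps(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                steps++;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 1000;
        // Multiplying the data by 10 multiplies the step count by 10 or by 100.
        System.out.println("linear ratio:    " + linearSteps(10 * n) / linearSteps(n));
        System.out.println("quadratic ratio: " + quadraticSteps(10 * n) / quadraticSteps(n));
    }
}
```

Real timings are noisier than step counts, which is why the measured ratios in the homework are only approximately 10 and 100.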




notations for running time

need notation to express running-times

first point is that we express
"time of algorithm for N items of data"
as "T(N)"
e.g. above first program Anagrams1.java  had
T1(1,000,000) := 0.980s.
T1(10,000,000) := 10.939s.
T1(100,000,000) := s.

could always try to get exact formula for T(N)
e.g. here T1(N) == 
similarly for the other algorithm for this problem
T2(N) == 
Then it starts appearing that the T2 algorithm is better for small values of N:
T1(1) :=  == 
T2(1) :=  == 
T1(2) :=  == 
T2(2) :=  == 
but these additional details   "--------------"
just obscure the significant performance:
as N gets larger, T1() really is much better than T2()
e.g.:
T1() ==  == 
T2() ==  == 
T1() ==  == 
T2() ==  == 

then considering that exact formulas would actually obscure instead of clarifying quality of an algorithm's time-function,
how should we express algorithms' time-functions?
what's done is noting most significant term in what would be T(N)
e.g.
T1(N) ==  + ...
T2(N) ==  + ...
use notation  indicating that T(N) is `on the order of' that term
"Big Oh" for order
T1(N) == O()
T2(N) == O()
from those most significant terms alone, one can predict that
as N gets larger, T_1() is much better than T_2()

in fact can do the prediction without that factor 
T1(N) == O(N)
T2(N) == O(N²)
mathematical justification:
even with a small factor on one or a large factor on the other,
with N² in it T_2() will get larger than T_1() as N gets large.
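
To see the justification concretely, here is a small sketch that puts a large factor on the linear function and no factor on the quadratic one, then searches for the point where the quadratic one becomes larger. The class name and the factor 1000 are assumptions chosen for illustration only.

```java
// Hypothetical illustration: linearFactor*n versus n*n.
class Crossover {
    // Smallest n at which n*n exceeds linearFactor*n.
    static long firstNWhereQuadraticIsLarger(long linearFactor) {
        long n = 1;
        while (n * n <= linearFactor * n) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Even with a factor of 1000 on the linear side, the quadratic side overtakes it.
        System.out.println(firstNWhereQuadraticIsLarger(1000));
    }
}
```

Whatever factor is chosen, the search always terminates: beyond that point the N² function stays larger.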


this O() is actually a bit `loose':
(you may already be thinking that considering what happens to constants, ...)
T(N) in O(something) actually says only that
T(N) ≤ c*something for some constant c (as N gets large).
e.g. since T2(N) == O() and  ≤  (as N gets large),
technically it's correct to write T2(N) == O()
since all that says is that T2(N) ≤ 
if want to be precise with equality,
technically should use "Θ()" Theta instead of "O()":
T1(N) == Θ(N)
T2(N) == Θ(N²)
similarly there's a notation for situations with "≥", Omega "Ω()"
e.g. T2(N) == Ω()
and further there's a notation for situations with "<", little o "o()"
e.g. T1(N) == o()
main notation used is O(), as above
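
For reference, the four notations can be stated side by side. This is the usual textbook formulation, written here for non-negative functions T and f; the symbols c and N_0 are the standard names for the constant factor and the threshold, not something defined earlier in these notes.

```latex
\begin{align*}
T(N) = O(f(N))      &\iff \exists\, c > 0,\ N_0 \text{ such that } T(N) \le c\, f(N) \text{ for all } N \ge N_0 \\
T(N) = \Omega(f(N)) &\iff \exists\, c > 0,\ N_0 \text{ such that } T(N) \ge c\, f(N) \text{ for all } N \ge N_0 \\
T(N) = \Theta(f(N)) &\iff T(N) = O(f(N)) \text{ and } T(N) = \Omega(f(N)) \\
T(N) = o(f(N))      &\iff \lim_{N \to \infty} \frac{T(N)}{f(N)} = 0
\end{align*}
```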



common time-functions, in order of increasing growth rate
    i.e. in order of increasing :
name            function
----            --------
constant        O(1)
    e.g. basic access one element in array or hash-table
    
    if the operation simply always requires same time Zs.,
    then it's O(1)
    

logarithmic     O(log(N))
    e.g. one binary search

linear          O(N)
    e.g. make a copy of file


                O(N*log(N))
    e.g. above multiple binary searches

polynomial
  quadratic     O(N²)
      e.g. above multiple linear searches

  cubic         O(N³)
      e.g. matrix-multiplication

exponential     O(2^N)
    e.g. Tower(s) of Hanoi problem
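
Two of the rows above can be checked by counting, as a rough sketch. The class GrowthExamples is hypothetical; the binary search runs over an imagined sorted array holding 0..n-1, so that no real array is needed.

```java
// Hypothetical illustration of a logarithmic and an exponential step count.
class GrowthExamples {
    // Loop iterations one binary search uses on a sorted array holding 0..n-1.
    static int binarySearchSteps(int n, int target) {
        int lo = 0, hi = n - 1, steps = 0;
        while (lo <= hi) {
            steps++;
            int mid = (lo + hi) >>> 1;      // the element at index mid equals mid here
            if (mid == target) return steps;
            if (mid < target) lo = mid + 1; else hi = mid - 1;
        }
        return steps;
    }

    // Moves needed to solve Towers of Hanoi with the given number of disks.
    static long hanoiMoves(int disks) {
        if (disks == 0) return 0;
        // move the smaller stack aside, move the largest disk, move the smaller stack back
        return 2 * hanoiMoves(disks - 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println("binary search, n = 1024: " + binarySearchSteps(1024, 1023) + " steps");
        System.out.println("Hanoi, 10 disks: " + hanoiMoves(10) + " moves");
    }
}
```

Doubling n adds about one step to the binary search, while adding one disk doubles the number of Hanoi moves.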


static analysis

so how analyze an algorithm?

in CS 251 you learn that all code gets translated into primitive machine language instructions to add, subtract, compare one value to another, load data from memory, store data into memory, a few others
computer handles its machine language straightforwardly:
get program's next instruction, do it,
get program's next instruction, do it,
get program's next instruction, do it,
.
.
.
time to handle each of the primitive instructions approximately the same
    particularly in modern RISC computers
    RISC: Reduced Instruction Set Computer
    vs. CISC: Complex Instruction Set Computer
        had some instructions that were powerful,
        e.g. handling entire loop control in one instruction
    but RISC facilitates speedup with pipelining,
        further enhancements in design
    see "Small Is Beautiful"
how much time per instruction?
depends on particular machine's clock, ...
say in commercially available machines:
1972 Intel 4004 approx. 15000ns.
1980s DEC VAX 11/780 approx. 2000ns.
1983 Intel 80286 approx. 1000ns.
1989 Intel 486 approx. 100ns.
MIPS M2000 approx. 100ns.
1990 Motorola 68040 approx. 50ns.
1994 Pentium I approx. 10ns.
1997 DEC Alpha approx. 2ns.
2002 Pentium 3 around 1ns. depending on details of configuration
2007 Pentium 4 around 0.3ns. depending on details of configuration

that varies: today one thing, tomorrow or on another machine less or more...
but the underlying model is always the same:
base unit is the instruction.
the time unit we'll use is "time to execute one primitive instruction"
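
As a rough sketch of how this unit converts back to wall-clock time: multiply the instruction count by the per-instruction time of the machine. The class TimeEstimate is hypothetical, and the model deliberately ignores pipelining, caches, and everything else that makes real machines less tidy.

```java
// Hypothetical illustration: instruction count times per-instruction time.
class TimeEstimate {
    // Estimated seconds for a given number of primitive instructions.
    static double estimatedSeconds(long instructions, double nsPerInstruction) {
        return instructions * nsPerInstruction / 1e9;
    }

    public static void main(String[] args) {
        long instructions = 1_000_000_000L;
        // Same program, two machines with different per-instruction times.
        System.out.println("at 15000 ns each: " + estimatedSeconds(instructions, 15000.0) + " s");
        System.out.println("at 1 ns each:     " + estimatedSeconds(instructions, 1.0) + " s");
    }
}
```

The instruction count is a property of the program; only the multiplier changes from machine to machine.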

so we analyze our code into such:

actions each of which reduces to one primitive instruction:
* arithmetic operation e.g.  +  or  %
* logical operation e.g.  &&
* comparison operation e.g.  <
* accessing operation
    e.g. array indexing like  A[i]
    or object accessing using  .
* simple assignment: setting a variable to have a value
*  break,  continue

e.g. analyze the following code:
x = 3 + x;
if( x < y  &&  x < 0 )
  y = x;
1 arithmetic operation, 2 assignments,
2 comparisons, 1 logical operation;
total 6 primitive instructions

how about "x++"?
arithmetic operation plus assignment
2 primitive instructions

loop:
for( j = 0; j < n; j++ )
  A[i][j] = 0;
if this loop were 'unrolled',
sequence of instructions would be as follows:
j = 0;

confirm j < n
A[i][j] = 0;
j++;    // j = 1

confirm j < n
A[i][j] = 0;
j++;    // j = 2

confirm j < n
A[i][j] = 0;
j++;    // j = 3

.
.
.

confirm j < n
A[i][j] = 0;
j++;    // j = n-1

confirm j < n
A[i][j] = 0;
j++;    // j = n

check j < n: false
(stop)
analysis:
1 initial assignment,
n+1 comparisons,
n double array dereferences,
    so 2n basic array dereferences,
n assignments to the array elements,
n incrementations := 2n instructions
total 1 + (n+1) + 2n + n + 2n := 6n + 2 primitive instructions
we summarize this as Θ(n)
    (again, simpler than keeping detailed expression
    and contains all the information we'll use:
    general indication of performance)
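
The count can be checked by instrumenting the loop, as a sketch. The class LoopCount is hypothetical, and the counting convention is an assumption matching the list of primitive actions above: each comparison, each single-level array dereference, each assignment, and each half of j++ counts as one primitive instruction.

```java
// Hypothetical illustration: counts primitive instructions for
//   for( j = 0; j < n; j++ ) A[i][j] = 0;
class LoopCount {
    static long countPrimitives(int n) {
        int[][] a = new int[1][n];
        int i = 0;
        long ops = 0;
        int j = 0;
        ops++;                    // initial assignment j = 0
        while (true) {
            ops++;                // comparison j < n (also the final, failing one)
            if (!(j < n)) break;
            ops += 2;             // two single-level dereferences for A[i][j]
            a[i][j] = 0;
            ops++;                // assignment into the array element
            j++;
            ops += 2;             // j++ is an addition plus an assignment
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(countPrimitives(10));
    }
}
```

Under this convention the count is 6n + 2, which grows in proportion to n.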

note: in some analysis we do count in some detail comparisons versus assignments
because some may be more expensive
e.g. comparing long strings could be more time consuming than swapping them — just changing 

but generally, approximate things
and do this not only after determine exact expression,
but while analyzing the code:
build up Θ() expressions, as follows:

1. individual primitive operation
"+", "<=" etc.
each corresponds to 1 primitive instruction
so naturally Θ(1).

2. sequence of pieces of code
sum of individual times

any fixed sequence of primitive operations is Θ(1)
because Θ(1) + Θ(1) := Θ(2*1) := Θ(1)
Θ(1) + Θ(1) + Θ(1) + Θ(1) + Θ(1) := Θ(5*1) := Θ(1)
and so on
but only in cases with a fixed/constant sequence of actions
e.g.    x = y + 3;    is
addition Θ(1) + assignment Θ(1) := Θ(1)
further operations which are a fixed sequence of primitive ones and hence Θ(1):
* the work of invoking a method is Θ(1)
    (See CS 251:
    not the work of the code inside the method;
    here describing the work of
    simply processing the method invocation:
    a few primitive instructions
    to prepare arguments and transfer
    to start executing the method's code,
    and a few primitive instructions to return execution where left.
    code inside method may be more than Θ(1);
    as necessary, we'll address that.)
* actually, even the code inside basic methods is Θ(1)
    e.g.  System.out.println(MATERIAL)
    probably just a restricted number of primitive operations

3. loop with variable number of iterations
multiply number of iterations by time for one iteration
e.g. with the following loop:
for( j = 0; j < n; j++ )
  A[i][j] = 0;
from points 1 and 2 above
we can now say the loop body A[i][j] = 0;  is Θ(1)
and each time through the loop we
confirm  j < n;    and then    j++     totaling Θ(1)
thus the amount of time each iteration through the loop
totals Θ(1)
how many times through the loop? n
so total time is  n * Θ(1) := Θ(n*1) := Θ(n)

another e.g. (like homework):
  ...
  sum = 0;                      // Θ(1) +
  for ( i = 0;                  // Θ(1) +
        i < n*n; i++ )          // Θ(n²) *
    for ( j = 0; j < n; j++ )   // Θ(n) *
      sum++;                    // Θ(1)
total is Θ(1) + Θ(1) + Θ(n² * n * 1)
                    := Θ(n³ + 2)
                    := Θ(n³)
your answers for homework should look similar
noting time contributions of lines
and then with final simplified result
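
A quick way to check such an analysis is to run the loops and look at the final value of sum, since sum++ executes once per inner iteration. The class NestedLoops is a hypothetical wrapper around the code above.

```java
// Hypothetical wrapper around the nested loops analyzed above.
class NestedLoops {
    static long run(int n) {
        long sum = 0;
        for (int i = 0; i < n * n; i++)      // n*n outer iterations
            for (int j = 0; j < n; j++)      // n inner iterations each
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 40; n *= 2) {
            System.out.println("n = " + n + ": sum = " + run(n) + ", n^3 = " + (long) n * n * n);
        }
    }
}
```

The counted value equals n³ exactly, which agrees with the Θ(n³) result.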

another e.g.:
  ...
  sum = 0;                      // Θ(1) +
  for ( i = 0; i < n*n; i++ )   // Θ(n²) *
    for ( j = 0; j < i; j++ )   // Θ(?i?) *
      sum++;                    // Θ(1)
would we then get Θ(1 + n²*i*1) := Θ(1 + i*n²)???
some time/space complexities do indeed use two variables
    e.g. CS 263's textbook's page 341: "O(|E| + |V|)"
but here it seems wrong somehow when stating the time complexity of the program
because  i  is not a parameter to program here
    reflecting extent of data submitted to program
    like n does
i  just holds miscellaneous values fleetingly

in this case using O() instead of Θ() actually simplifies analysis
because   i < n*n   implies   i ≤ n*n   which means   i  is O(n²)
so analysis can be as follows:
  ...
  sum = 0;                      // O(1) +
  for ( i = 0; i < n*n; i++ )   // O(n²) *
    for ( j = 0; j < i; j++ )   // O(n*n) *
      sum++;                    // O(1)
then we get O(1 + n²*n²*1) = O(n⁴)

can't we do Θ() analysis of this code somehow?
yes
O() analysis as immediately above is worst-case analysis
    doing measurements assuming things will always be as bad as possible
for Θ() try doing average-case analysis:
what is average value of  i ?  about n²/2
so Θ() analysis of the code here can be as follows:
  ...
  sum = 0;                      // Θ(1) +
  for ( i = 0; i < n*n; i++ )   // Θ(n²) *
    for ( j = 0; j < i; j++ )   // Θ(n²/2) *
      sum++;                    // Θ(1)
then we get Θ(1 + n²*(n²/2)*1) = Θ(n⁴/2) = Θ(n⁴)

in this example, average-case Θ() analysis actually ends up with same result
as worst-case O() analysis
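
The exact count for this loop can also be obtained by running it, as a sketch. With m = n*n, the inner loop runs 0 + 1 + ... + (m-1) = m(m-1)/2 times in total, which is roughly half of n⁴. The class TriangularLoops is a hypothetical wrapper around the code above.

```java
// Hypothetical wrapper around the loops whose inner bound depends on i.
class TriangularLoops {
    static long run(int n) {
        long sum = 0;
        for (int i = 0; i < n * n; i++)
            for (int j = 0; j < i; j++)      // i iterations, not n*n
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        int n = 10;
        long m = (long) n * n;
        System.out.println("counted:         " + run(n));
        System.out.println("m*(m-1)/2:       " + m * (m - 1) / 2);
        System.out.println("upper bound n^4: " + m * m);
    }
}
```

The counted value sits below the n⁴ upper bound by about a factor of 2, and a constant factor does not change the order.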



to be honest, I should acknowledge that general CS literature doesn't
use O(), Θ() specifically for worst-case, average-case analysis the way I'm specifying here
but it's OK for us in this course; this isn't necessarily inconsistent with the literature.



4. conditional, i.e.  if
what about when execution is irregular?
e.g. code to find index of smallest in array:
index_smallest = 0;
for( j = 1; j < n; j++ )
  if( A[j] < A[index_smallest] )
    index_smallest = j;
inside loop, does stuff sometimes, other times not
time complexity would actually be clearer if there were an  else:
index_smallest = 0;
for( j = 1; j < n; j++ )
  if( A[j] < A[index_smallest] )
    index_smallest = j;
  else
    index_smallest = index_smallest;
with the else, we could say that each time through the loop, does Θ(1) comparison plus one or the other assignment statement.
without the  else, what we do is like that anyway:
we consider that we're finding upper bound on execution time:
assume total time is worst of possibilities
e.g. always doing the body of  if  that has no  else,
or when there's an  if-else  assuming time is worst of the two possibilities
here again using notation "O()" instead of "Θ()"
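
To see how irregular the if can be, here is a sketch that counts how many times its body actually executes. The class SmallestIndex and the sample arrays are hypothetical.

```java
// Hypothetical illustration: how often does the if-body run?
class SmallestIndex {
    // Returns how many times the assignment inside the if executes.
    static int assignmentsInLoop(int[] a) {
        int indexSmallest = 0;
        int assignments = 0;
        for (int j = 1; j < a.length; j++) {
            if (a[j] < a[indexSmallest]) {
                indexSmallest = j;
                assignments++;
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        System.out.println("descending: " + assignmentsInLoop(new int[] {5, 4, 3, 2, 1}));
        System.out.println("ascending:  " + assignmentsInLoop(new int[] {1, 2, 3, 4, 5}));
    }
}
```

A descending array triggers the body on every iteration (n-1 times) and an ascending one never does; since the comparison runs on every iteration either way, assuming the worst possibility each time gives O(n) for the loop.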



is there any other aspect of performance besides time?

how could avoid this program having a "spoon" in it?



are there any other metrics of programs?

managers decide which metrics to emphasize; there are tradeoffs between...
I'll revisit this perspective as our analyses will show some algorithms faster than others
but one might choose a 'worse'-time one anyway because might prefer simplicity...


(Copyright © 2009 by Hugh McGuire   — for thoughts about this, see   http://www.cis.gvsu.edu/~mcguire/teaching/copyright_thoughts.html .)