MTH 225 Lecture-Module #09:
Some Functions on Integers

binary representation of integers


(Section 3.6 of our textbook)

need to know this material to be able to understand how computers encode all data

well, remembering back to ,
what is the interpretation of our normal decimal i.e. base-ten representation of integers?
digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
e.g.   397_ten has 3 in 100s place, 9 in 10s place, 7 in 1s place
i.e.   397_ten  ==  3*10^2 + 9*10^1 + 7*10^0
I use notation "d_i" for the digit in the 10^i place
e.g. for 397_ten,  d_0 = 7,  d_1 = 9,  d_2 = 3

why might decimal representation concern any of you now?
well, someone has to implement  Integer.toString(int)
which depends on handling (individual) digits — what if it's you?
import java.util.Random;

public class ... {
    public static void main(String[] args) {
        int n_given = new Random().nextInt(1000);
        System.out.println("n_given's digit d0: " + digit(n_given, 0));
        System.out.println("n_given's digit d1: " + digit(n_given, 1));
        System.out.println("n_given's digit d2: " + digit(n_given, 2));
        }
for each of a given number n_given's digits d_i corresponding to our base's powers 10^i,
d_i := ⌊n_given / 10^i⌋ mod 10
    (for not-quite-up-to-date Web-browsers:  ... |_ n_given / 10^i _| ...)
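    public static int digit(int n_given, int place) {
        // d_place := ⌊n_given / 10^place⌋ mod 10, assuming n_given >= 0 and place >= 0
        int shifted = n_given;
        for (int i = 0; i < place; i++)
            shifted = shifted / 10;    // |int| division discards the remainder
        return shifted % 10;
        }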
    }
e.g. ⌊397 / 10^0⌋ mod 10 := ⌊397/1⌋ mod 10 := ⌊397⌋ mod 10 := 7 which is d_0
e.g. ⌊397 / 10^1⌋ mod 10 := ⌊397/10⌋ mod 10 := ⌊39.7⌋ mod 10 := 39 mod 10 := 9 which is d_1
e.g. ⌊397 / 10^2⌋ mod 10 := ⌊397/100⌋ mod 10 := ⌊3.97⌋ mod 10 := 3 mod 10 := 3 which is d_2
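The digit formula above is enough to build a decimal string by hand, which is roughly what an implementer of something like Integer.toString(int) has to do. The following is only a sketch under assumptions: the class name DecimalDigits and the helper toDecimalString are hypothetical names chosen for illustration, only nonnegative inputs are handled, and this is not claimed to be how the Java library does it.

```java
class DecimalDigits {
    // d_place := floor(n / 10^place) mod 10, for n >= 0 and place >= 0
    static int digit(int n, int place) {
        for (int i = 0; i < place; i++) {
            n = n / 10;                        // int division discards the remainder
        }
        return n % 10;
    }

    // Hypothetical helper: builds the decimal string of n >= 0 digit by digit.
    static String toDecimalString(int n) {
        if (n == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append((char) ('0' + n % 10));  // lowest digit comes out first
            n = n / 10;
        }
        return sb.reverse().toString();        // so reverse at the end
    }

    public static void main(String[] args) {
        System.out.println(digit(397, 0) + " " + digit(397, 1) + " " + digit(397, 2));
        System.out.println(toDecimalString(397));
    }
}
```

Note that the digits come out lowest place first, so the string has to be reversed at the end.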


anyway, people(/computers) have used and still use some bases other than ten

ancient Babylonians used base sixty(!)
sounds silly; do you do anything like that? 

various LINUX/UNIX utilities such as access-control use
base-eight representation called octal
for base eight, digits are 0, 1, 2, 3, 4, 5, 6, 7
e.g. 162_eight  ==  1*8^2 + 6*8^1 + 2*8^0
                ==  64_ten + 48_ten + 2_ten
                ==  114_ten
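The place-value sum above can be evaluated mechanically for any base up to ten. This is a minimal sketch; the class name PlaceValue and the method fromDigits are hypothetical, and the input is assumed to contain only valid digits for the given base. Integer.parseInt with a radix argument and Integer.toOctalString are existing library calls, shown for comparison.

```java
class PlaceValue {
    // Evaluates a digit string in the given base (2..10), most significant digit first.
    // Multiplying the running value by the base at each step is the same as summing
    // digit * base^place over all places.
    static int fromDigits(String digits, int base) {
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = digits.charAt(i) - '0';    // assumes '0' <= char < '0' + base
            value = value * base + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fromDigits("162", 8));          // 1*8^2 + 6*8^1 + 2*8^0
        System.out.println(Integer.parseInt("162", 8));    // library equivalent
        System.out.println(Integer.toOctalString(114));    // back to octal digits
    }
}
```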

and Computer Science also uses hexadecimal which is base sixteen
used for compactness and convenient conversion with the following base:

fundamental base Computer Science uses is two
ultimately because it's easiest to build machines processing data represented this way
this was an insight of John von Neumann in 1945:
The all-or-none, two-equilibrium arrangements [of electronic components] are [best]. [Then] it is natural to use a system of arithmetic in which the digits are also two-valued.
"First Draft of a Report on the EDVAC [Electronic Discrete Variable Automatic Computer]"
in that paper, von Neumann specified further basic features for design of computers
e.g. central arithmetic unit, memory, stored program, ...
nearly all general-purpose computers now conform to that design called "von Neumann architecture"

anyway,
base-two arithmetic called binary
for base two, digits are only 0 and 1
"bit" abbreviates "binary digit"
e.g. 110_two has 1 in 2^2 place, 1 in 2^1 place, 0 in 2^0 place
i.e. 110_two  ==  1*2^2 + 1*2^1 + 0*2^0
              ==  4_ten + 2_ten + 0_ten
              ==  6_ten
(reversing:   6_ten == 110_two)
e.g. 1001011_two has 1 in 2^6 place, 0 in 2^5 place, 0 in 2^4 place, ...
i.e. 1001011_two  ==  1*2^6 + 0*2^5 + 0*2^4 + 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0
                  ==  64_ten + 8_ten + 2_ten + 1_ten
                  ==  75_ten
thus can obtain decimal representation of a number presented in binary
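The earlier remark that hexadecimal converts conveniently to and from binary can be illustrated with the same number: each hexadecimal digit stands for exactly four bits. The sketch below is illustrative only; the class name HexBinaryDemo and the helper binaryToHex are hypothetical, while Integer.parseInt, Integer.toHexString and Character.forDigit are existing library calls.

```java
class HexBinaryDemo {
    // Converts a string of bits to hexadecimal, four bits at a time.
    static String binaryToHex(String bits) {
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % 4 != 0) {
            padded.insert(0, '0');             // leading zeros do not change the value
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < padded.length(); i += 4) {
            int group = Integer.parseInt(padded.substring(i, i + 4), 2);  // 0..15
            hex.append(Character.forDigit(group, 16));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(Integer.parseInt("1001011", 2));  // binary to decimal
        System.out.println(binaryToHex("1001011"));          // groups 0100 and 1011
        System.out.println(Integer.toHexString(75));         // library result for comparison
    }
}
```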


like with base ten above,
for each of a given number n_given's binary digits b_i corresponding to powers 2^i,
    e.g. for 1001011_two,  b_0 = 1,  b_1 = 1,  b_2 = 0,
        b_3 = 1,  b_4 = 0,  b_5 = 0,  b_6 = 1
b_i := ⌊n_given / 2^i⌋ mod 2
    (for not-quite-up-to-date Web-browsers:  ... |_ n_given / 2^i _| ...)
e.g. for b_0 ⌊75 / 2^0⌋ mod 2 := ⌊75/1⌋ mod 2 := ⌊75⌋ mod 2 := 1 which is b_0
e.g. for b_1 ⌊75 / 2^1⌋ mod 2 := ⌊75/2⌋ mod 2 := ⌊37.5⌋ mod 2 := 37 mod 2 := 1 which is b_1
e.g. for b_2 ⌊75 / 2^2⌋ mod 2 := ⌊75/4⌋ mod 2 := ⌊18.75⌋ mod 2 := 18 mod 2 := 0 which is b_2
incidentally, notice that ⌊75 / 2^2⌋ == ⌊⌊75 / 2^1⌋ / 2⌋
and we had already calculated ⌊75 / 2^1⌋ (it is 37)
generally, for nonnegative integer u and positive integer v, ⌊u / v^(x+1)⌋ == ⌊⌊u / v^x⌋ / v⌋
    (as  u/v^(x+1) == (u/v^x)/v )
enabling us to do these evaluations more easily as follows:
e.g. ⌊75 / 2^3⌋ mod 2 := ⌊⌊75 / 2^2⌋ / 2⌋ mod 2 := ⌊18 / 2⌋ mod 2 := ⌊9⌋ mod 2 := 9 mod 2 := 1 which is b_3
e.g. ⌊75 / 2^4⌋ mod 2 := ⌊⌊75 / 2^3⌋ / 2⌋ mod 2 := ⌊9 / 2⌋ mod 2 := 4 mod 2 := 0 which is b_4
e.g. ⌊75 / 2^5⌋ mod 2 := ⌊⌊75 / 2^4⌋ / 2⌋ mod 2 := ⌊4 / 2⌋ mod 2 := 2 mod 2 := 0 which is b_5
e.g. ⌊75 / 2^6⌋ mod 2 := ⌊⌊75 / 2^5⌋ / 2⌋ mod 2 := ⌊2 / 2⌋ mod 2 := 1 mod 2 := 1 which is b_6

such processing actually provides one algorithm for obtaining binary representation of a number presented in decimal
here's ≈Java encoding of this algorithm:
(≈p.221 in our textbook)
    void base_2_expansion(int n, int[] b) {    // resulting bits stored in |b[]|
                                            // i.e. this code will set |b[]|
                                            // to be {...,1,0,1,1,0,0,...}
                                            // as appropriate
        // let |n_given| denote the initial given value for n
        int k = 0;
        while ( n > 0 ) {
            // at this point n == ⌊n_given / 2^k⌋
            b[k] = n % 2;

            n = n/2;    // in Java, division of nonnegative |int|s discards the remainder (i.e. does floor)
                        // so this statement |n = n/2;| resets n
                        // to: ⌊⌊n_given / 2^k⌋ / 2⌋
                        // which is: ⌊n_given / 2^(k+1)⌋
            k = k + 1;  // then with k changed here to k + 1,
                        // n == ⌊n_given / 2^k⌋ for the new k
            }
        // assume |b[]| otherwise contains zeroes
        }
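To try the expansion on a concrete value, the loop can be wrapped in a small runnable class. This is a sketch with some assumptions: the class name Base2Expansion and the helpers bitsToString and toBinary are hypothetical, the array is given 32 entries because a nonnegative |int| never needs more, and the bits are read back from the highest index because b[0] holds the lowest bit.

```java
class Base2Expansion {
    // Same loop as above: stores bit b_k of n in b[k], assuming n >= 0.
    static void base2Expansion(int n, int[] b) {
        int k = 0;
        while (n > 0) {
            b[k] = n % 2;
            n = n / 2;
            k = k + 1;
        }
    }

    // Hypothetical helper: reads the bits from the highest index down,
    // skipping leading zeroes.
    static String bitsToString(int[] b) {
        StringBuilder sb = new StringBuilder();
        boolean seenOne = false;
        for (int k = b.length - 1; k >= 0; k--) {
            if (b[k] == 1) {
                seenOne = true;
            }
            if (seenOne) {
                sb.append(b[k]);
            }
        }
        return seenOne ? sb.toString() : "0";
    }

    static String toBinary(int n) {
        int[] b = new int[32];                 // new int arrays start out all zeroes
        base2Expansion(n, b);
        return bitsToString(b);
    }

    public static void main(String[] args) {
        System.out.println(toBinary(75));
    }
}
```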


an English presentation of the preceding algorithm for obtaining binary representation of a nonnegative number n is as follows:
    until n is 0, repeatedly do the following:
        * write 1 if n is odd or 0 if n is even
        * divide n by 2, discarding any fractional part or remainder
    the result is the REVERSE of the 1s and 0s that you wrote
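e.g. for 75:
    75 is odd: write 1;  ⌊75/2⌋ = 37
    37 is odd: write 1;  ⌊37/2⌋ = 18
    18 is even: write 0;  ⌊18/2⌋ = 9
    9 is odd: write 1;  ⌊9/2⌋ = 4
    4 is even: write 0;  ⌊4/2⌋ = 2
    2 is even: write 0;  ⌊2/2⌋ = 1
    1 is odd: write 1;  ⌊1/2⌋ = 0
    wrote 1101001; reversed, that is 1001011
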
e.g. for :


an alternative algorithm is as follows:
    * write all the powers of two until you reach (or pass) n
    * for each power of two 2^k here, going from large down to small:
        + if 2^k > n, write a 0
        + otherwise, write a 1 and subtract 2^k from n
    the result is the sequence of 1s and 0s in the order written
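e.g. for 75:
    powers of two: 1, 2, 4, 8, 16, 32, 64, 128
    128 > 75: write 0
    64 <= 75: write 1; n becomes 11
    32 > 11: write 0
    16 > 11: write 0
    8 <= 11: write 1; n becomes 3
    4 > 3: write 0
    2 <= 3: write 1; n becomes 1
    1 <= 1: write 1; n becomes 0
    wrote 01001011, i.e. 1001011 once the leading 0 is dropped
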
e.g. for :
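The alternative algorithm can be sketched in Java as follows. The names SubtractPowers and toBinary are hypothetical. One deliberate difference from the English description, stated as an assumption of this sketch: it starts from the largest power of two that does not exceed n (instead of possibly passing n), so no leading 0 is written.

```java
class SubtractPowers {
    // Binary representation of n >= 0 by subtracting powers of two, largest first.
    static String toBinary(int n) {
        int p = 1;
        while (p <= n / 2) {                   // stop at the largest power of two <= n
            p = p * 2;                         // (testing n / 2 avoids overflowing p)
        }
        StringBuilder sb = new StringBuilder();
        while (p >= 1) {
            if (p > n) {
                sb.append('0');                // this power is not part of n
            } else {
                sb.append('1');                // this power is part of n
                n = n - p;
            }
            p = p / 2;                         // next smaller power; 1/2 becomes 0 and ends the loop
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(75));
    }
}
```

Unlike the halving algorithm, this one produces the bits already in reading order, so nothing has to be reversed.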




application of binary representation of integers:
repeated squaring algorithm good way to do exponentiation

a straightforward way to do exponentiation is as follows:
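the following code is supposed to return a^e:
int exponentiation(int a, int e) {
    int result = 1;
    for (int i = 0; i < e; i++)
        result = result * a;    // one multiplication per repetition
    return result;
    }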
note loop repeats e times
thus do operations in the loop such as the important multiplication e times

well consider what occurs in cryptographic secure communication:
a message such as "HI" represented as number a e.g. a=0708
and then need to calculate a^e (modulo another value m)
where e is a number such as the following:
692,469,504,614,203,622,460,465,868,734,959,711,852,932,318,
302,149,455,793,493,889,446,970,470,508,640,328,165,696,445,
236,740,914,740,354,470,714,166,287,934,901,598,998,105,531,
316,760,680,771,813
($ man ssh-keygen)

the number of seconds required by a computer to do e multiplications would be at least the following:
692,469,504,614,203,622,460,465,868,734,959,711,852,932,318,
302,149,455,793,493,889,446,970,470,508,640,328,165,696,445,
236,740,914,740,354,470,714,166,287,934,901,598,998,105,531,
316,760 s.
in years that's more than the following:
692,469,504,614,203,622,460,465,868,734,959,711,852,932,318,
302,149,455,793,493,889,446,970,470,508,640,328,165,696,445,
236,740,914,740,354,470,714,166,287,934,901,598,998,105 y.

instead of that,
repeated squaring algorithm faster way to calculate a^e
even with that value of e
loop (shown below) will repeat only about 500 times(!)
which requires only a fraction of a second
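The repetition count can be checked directly: a loop that halves e until it reaches 0 runs once per binary digit of e. The sketch below uses the example exponent from the text with the commas removed. The class name ExponentBits and the method halvingSteps are hypothetical; BigInteger and its signum, shiftRight and bitLength methods are existing library calls.

```java
import java.math.BigInteger;

class ExponentBits {
    // The example exponent from the text, commas removed.
    static final BigInteger E = new BigInteger(
          "692469504614203622460465868734959711852932318"
        + "302149455793493889446970470508640328165696445"
        + "236740914740354470714166287934901598998105531"
        + "316760680771813");

    // Counts how many times e can be halved (discarding remainders) before reaching 0.
    static int halvingSteps(BigInteger e) {
        int steps = 0;
        while (e.signum() > 0) {
            e = e.shiftRight(1);               // floor(e / 2) for nonnegative e
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(halvingSteps(E));   // number of loop repetitions
        System.out.println(E.bitLength());     // library's count of binary digits
    }
}
```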

to understand this algorithm,
consider first a simple example:
  • suppose e is 8192
    and suppose the following operations are performed:
      sqrrep := a;
        // now sqrrep is a^1 which is a^(2^0)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^2 which is a^(2^1)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^4 which is a^(2^2)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^8 which is a^(2^3)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^16 which is a^(2^4)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^32 which is a^(2^5)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^64 which is a^(2^6)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^128 which is a^(2^7)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^256 which is a^(2^8)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^512 which is a^(2^9)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^1024 which is a^(2^10)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^2048 which is a^(2^11)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^4096 which is a^(2^12)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^8192 which is a^(2^13)
    
    at that point have a^e, doing only 13 operations, not 8,192(!)

    what if e is 139264 which is  131072 + 8192 ?

      result := 1;
      sqrrep := a;
        // now sqrrep is a^1 which is a^(2^0)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^2 which is a^(2^1)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^4 which is a^(2^2)
      ·
      ·
      ·
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^8192 which is a^(2^13)
      result := result * sqrrep;
        // now result is a^8192 which is a^(2^13)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^16384 which is a^(2^14)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^32768 which is a^(2^15)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^65536 which is a^(2^16)
      sqrrep := sqrrep * sqrrep;
        // now sqrrep is a^131072 which is a^(2^17)
      result := result * sqrrep;
    
    at the end here that last multiplication  result * sqrrep
    is  a^8192 * a^131072 (which is  a^(2^13) * a^(2^17) )
    which equals a^(8192 + 131072) (which is  a^(2^13 + 2^17))
        by Law 1 of Exponents
    which equals a^139264 i.e. a^e which was desired
    (incidentally note that e=139264_ten == 100010000000000000_two
    and the binary representation reflects that e is  1*2^17 + 1*2^13 )
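The decomposition of e into powers of two can be read off mechanically with the same odd/even test used for binary representation. This is a sketch only; the class name ExponentDecomposition and the method powersOfTwoIn are hypothetical, and e is assumed nonnegative.

```java
import java.util.ArrayList;
import java.util.List;

class ExponentDecomposition {
    // Lists the powers of two whose sum is e, smallest first.
    static List<Integer> powersOfTwoIn(int e) {
        List<Integer> powers = new ArrayList<>();
        int power = 1;                         // 2^0
        while (e > 0) {
            if (e % 2 == 1) {
                powers.add(power);             // current bit is 1, so e contains this power
            }
            e = e / 2;
            if (e > 0) {
                power = power * 2;             // only advance while bits remain (avoids overflow)
            }
        }
        return powers;
    }

    public static void main(String[] args) {
        System.out.println(powersOfTwoIn(139264));
        System.out.println(Integer.toBinaryString(139264));
    }
}
```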

    the repeated squaring algorithm for calculating a^e generalizes from the preceding examples:
    start  result  at 1 and  sqrrep  at a ;
    repeatedly square  sqrrep  so  sqrrep  holds values of a raised to successive powers of two;
    and include value of  sqrrep  in  result  if e contains the corresponding power of two.
    here's Java encoding of this algorithm:
    int exponentiation(int a, int e) {
        int result = 1;
        int sqrrep = a;
        while ( e > 0 ) {
            // e == ⌊e_given / 2^i⌋  and  sqrrep == a^(2^i)
            if ( e % 2 == 1 )     // as above with binary representation
                                    // this condition corresponds
                                    // to e_given containing 2^i
                result = result * sqrrep;    // so multiply sqrrep == a^(2^i) into |result|
            sqrrep = sqrrep * sqrrep;        // advance sqrrep to a^(2^(i+1))
            e = e/2;              // advance e to ⌊e_given / 2^(i+1)⌋
            }
        return  result;
        }
    
    as above when determining binary representation,
    e % 2 == 1  holds iff e_given contains the corresponding 2^i ;
    and (as indicated in examples for 8192 etc. above),
    sqrrep is the corresponding a^(2^i)
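To see the saving concretely, the two approaches can be run side by side while counting multiplications. This is a sketch under assumptions: the class name RepeatedSquaring, the methods power and naivePower, and the multiplications counter are all hypothetical and exist only for illustration; |long| is used so small examples fit, but it still overflows silently for larger results.

```java
class RepeatedSquaring {
    static int multiplications;                // counter for illustration only

    // Repeated squaring, same loop structure as above, for e >= 0.
    static long power(long a, int e) {
        multiplications = 0;
        long result = 1;
        long sqrrep = a;
        while (e > 0) {
            if (e % 2 == 1) {
                result = result * sqrrep;      // e contains the current power of two
                multiplications++;
            }
            sqrrep = sqrrep * sqrrep;          // advance to the next power of two
            multiplications++;
            e = e / 2;
        }
        return result;
    }

    // Straightforward version: one multiplication per unit of e.
    static long naivePower(long a, int e) {
        multiplications = 0;
        long result = 1;
        for (int i = 0; i < e; i++) {
            result = result * a;
            multiplications++;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(power(2, 10) + " using " + multiplications + " multiplications");
        System.out.println(naivePower(2, 10) + " using " + multiplications + " multiplications");
    }
}
```

The gap between the two counts is small for an exponent like 10 but grows quickly, since one count is proportional to the number of binary digits of e and the other to e itself.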

    see sample processing actually in homework



    incidentally, further for cryptography,
    we may not care about full  result  so much as result mod another number m
    in this case actually do  mod m  different times during the algorithm
    as shown in our textbook by Rosen p.226 (Algorithm 5)

    e.g. to calculate 708^27 mod 2001
    don't straightforwardly start by computing 708^27 (a number with 77 decimal digits)
        that would overflow |int| and so lose precision
        (and even if a small exponent such as 27 could be managed,
        consider the more realistic large exponent indicated above)
    int modular_exponentiation(
        int a, int e, int m     // a := 708, e := 27, m := 2001
        )
      {
      // (the loop is written out once per repetition below, to trace the values)
      int result = 1;           // result := 1
      int sqrrep = a % m;       // sqrrep :=  708 % 2001  :=  708
      while ( e > 0 ) {         // 27 > 0  :=  true
        if ( e % 2 == 1 )       // 27 % 2 == 1  :=  1 == 1  :=  true
          result = result * sqrrep % m;
                                // result :=  1 * 708 % 2001
                                //        :=  708 % 2001
                                //        :=  708
        sqrrep = sqrrep * sqrrep % m;
                                // sqrrep :=  708 * 708 % 2001
                                //        :=  501264 % 2001
                                //        :=  1014
        e = e/2;                // e :=  ⌊27/2⌋  :=  ⌊13.5⌋  :=  13
        }
      while ( e > 0 ) {         // 13 > 0  :=  true
        if ( e % 2 == 1 )       // 13 % 2 == 1  :=  1 == 1  :=  true
          result = result * sqrrep % m;
                                // result :=  708 * 1014 % 2001
                                //        :=  717912 % 2001
                                //        :=  1554
        sqrrep = sqrrep * sqrrep % m;
                                // sqrrep :=  1014 * 1014 % 2001
                                //        :=  1028196 % 2001
                                //        :=  1683
        e = e/2;                // e :=  ⌊13/2⌋  :=  ⌊6.5⌋  :=  6
        }
      while ( e > 0 ) {         // 6 > 0  :=  true
        if ( e % 2 == 1 )       // 6 % 2 == 1  :=  0 == 1  :=  false
          result = result * sqrrep % m;
        sqrrep = sqrrep * sqrrep % m;
                                // sqrrep :=  1683 * 1683 % 2001
                                //        :=  2832489 % 2001
                                //        :=  1074
        e = e/2;                // e :=  ⌊6/2⌋  :=  ⌊3⌋  :=  3
        }
      while ( e > 0 ) {         // 3 > 0  :=  true
        if ( e % 2 == 1 )       // 3 % 2 == 1  :=  1 == 1  :=  true
          result = result * sqrrep % m;
                                // result :=  1554 * 1074 % 2001
                                //        :=  1668996 % 2001
                                //        :=  162
        sqrrep = sqrrep * sqrrep % m;
                                // sqrrep :=  1074 * 1074 % 2001
                                //        :=  1153476 % 2001
                                //        :=  900
        e = e/2;                // e :=  ⌊3/2⌋  :=  ⌊1.5⌋  :=  1
        }
      while ( e > 0 ) {         // 1 > 0  :=  true
        if ( e % 2 == 1 )       // 1 % 2 == 1  :=  1 == 1  :=  true
          result = result * sqrrep % m;
                                // result :=  162 * 900 % 2001
                                //        :=  145800 % 2001
                                //        :=  1728
        sqrrep = sqrrep * sqrrep % m;
                                // sqrrep :=  900 * 900 % 2001
                                //        :=  810000 % 2001
                                //        :=  1596
        e = e/2;                // e :=  ⌊1/2⌋  :=  ⌊0.5⌋  :=  0
        }
      while ( e > 0 ) {         // 0 > 0  :=  false
    
      return  result;           // return 1728
      }
    
    secure communication would transmit 1728 instead of 0708 ("HI")
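For reference, here is a compact runnable version of the traced computation, written as an ordinary loop. It is a sketch under assumptions: the class name ModularExponentiation and the method modPow are hypothetical, the arguments are assumed to satisfy a >= 0, e >= 0 and m >= 1, and intermediate products are held in |long| so that multiplying two values below m cannot overflow for any |int| modulus.

```java
class ModularExponentiation {
    // Computes a^e mod m by repeated squaring, reducing mod m after every multiplication.
    static int modPow(int a, int e, int m) {
        long result = 1 % m;                   // 1 % m so that m == 1 gives 0
        long sqrrep = a % m;
        while (e > 0) {
            if (e % 2 == 1) {
                result = result * sqrrep % m;  // include this power of two
            }
            sqrrep = sqrrep * sqrrep % m;      // advance to the next power of two
            e = e / 2;
        }
        return (int) result;                   // result < m, so it fits in an int
    }

    public static void main(String[] args) {
        System.out.println(modPow(708, 27, 2001));
    }
}
```

Because every intermediate value is reduced mod m, none of them ever exceeds m*m, no matter how large the exponent is.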



    but then what?
    well, receiver would calculate  (1728^867) mod 2001  :=  708  i.e. 0708 ("HI")
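    The round trip with these example numbers can be checked using the library's own modular exponentiation. This is only a sketch: the class name RoundTrip and the methods encrypt and decrypt are hypothetical, the numbers 27, 867 and 2001 are the example values from this module, and BigInteger.modPow is an existing library method. Why the exponent 867 undoes the exponent 27 is not explained here.

```java
import java.math.BigInteger;

class RoundTrip {
    static final BigInteger M = BigInteger.valueOf(2001);   // example modulus from the text

    // Sender's calculation with the example exponent 27.
    static int encrypt(int message) {
        return BigInteger.valueOf(message).modPow(BigInteger.valueOf(27), M).intValue();
    }

    // Receiver's calculation with the example exponent 867.
    static int decrypt(int received) {
        return BigInteger.valueOf(received).modPow(BigInteger.valueOf(867), M).intValue();
    }

    public static void main(String[] args) {
        int sent = encrypt(708);               // 0708 stands for "HI"
        System.out.println(sent);
        System.out.println(decrypt(sent));
    }
}
```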

    the receiver needs to know 
    stay tuned for explanation of that in next lecture-module


    (Copyright © 2009 by Hugh McGuire   — for thoughts about this, see   http://www.cis.gvsu.edu/~mcguire/teaching/copyright_thoughts.html .)