CS 163 Lecture-Module #08:
Searching

searching sequences

common task in computing & information systems
find something in sequence of material:
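a minimal sketch of such searching code, assuming an int array scanned forward and stopped with  if ... break  (the names ForwardSearchSketch and searchForward are hypothetical, not from the lecture):

```java
class ForwardSearchSketch {

    // Returns the index of the first element equal to target,
    // or a.length if no element matches.
    static int searchForward(int[] a, int target) {
        int i;
        for ( i = 0; i < a.length; i++ )
            if ( a[i] == target )
                break;   // found it: stop with i at the match
        return i;
    }

    public static void main(String[] args) {
        int[] sample = { 55, 33, 77, 88, 22 };   // illustrative values
        System.out.println(searchForward(sample, 88));   // prints 3
        System.out.println(searchForward(sample, 33));   // prints 1
        System.out.println(searchForward(sample, 99));   // prints 5 (not found)
    }
}
```

note the not-found result here is  a.length, which is not a valid index, so the caller has to test for it before using the result.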

e.g. suppose  int[] array = { 55, 33, 77, 88, 22 };
if  target  is 88, then that searching code will set  i := 3
or if  target  is 33, then that searching code will set  i := 1

after that searching,
is it always safe to use  array[i], e.g.:
System.out.println(array[i]);  ?

no: if  target  is not in the array at all,
then that searching code would set  i := array.length  (here 5), which is not a valid index


incidentally the termination in that initial searching code
    using  if
can be folded into the  for head there as follows:
int i;
for ( i = 0;  i < array.length  &&  array[i] != target;  i++ )
    ;
if ( i < array.length )
    // this condition implies |array[i]| must be |target|
    ... // use |array[i]|
could also do searching backward instead of forward through array as follows:
int i;
for ( i = array.length - 1;  i >= 0  &&  array[i] != target;  i-- )
    ;
if ( i >= 0 )
    // this condition implies |array[i]| must be |target|
    ... // use |array[i]|
why might proceed backward instead of forward?
* comparing  i < array.length  may take two or three times as long as comparing  i >= 0
* might be nice to be consistent with our language's API which returns -1 in such situations (e.g. String.indexOf when nothing is found)
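to illustrate the second point, a small sketch comparing a backward search with two standard library methods that also report "not found" as -1 (the names BackwardSearchSketch and searchBackward are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

class BackwardSearchSketch {

    // Scans from the last element toward the first; returns the index of
    // the match, or -1 once the index runs off the front of the array.
    static int searchBackward(int[] a, int target) {
        int i;
        for ( i = a.length - 1;  i >= 0  &&  a[i] != target;  i-- )
            ;
        return i;
    }

    public static void main(String[] args) {
        int[] array = { 55, 33, 77, 88, 22 };
        List<Integer> list = Arrays.asList(55, 33, 77, 88, 22);

        // found: both report index 3
        System.out.println(searchBackward(array, 88) + " " + list.lastIndexOf(88));
        // not found: both report -1
        System.out.println(searchBackward(array, 99) + " " + list.lastIndexOf(99));
        // String.indexOf follows the same convention
        System.out.println("hello".indexOf('z'));   // prints -1
    }
}
```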

e.g. (albeit trivial) application: program storing user-given numbers but not duplicates:
$ java searching

data:  [ ]
Enter value: 55
not already among data; adding it

data:  [ 55 ]
Enter value: 33
not already among data; adding it

data:  [ 55 33 ]
Enter value: 77
not already among data; adding it

data:  [ 55 33 77 ]
Enter value: 33
already among data

data:  [ 55 33 77 ]
Enter value: 88
not already among data; adding it

data:  [ 55 33 77 88 ]
Enter value: 22
not already among data; adding it

data:  [ 55 33 77 88 22 ]
Enter value: 77
already among data

data:  [ 55 33 77 88 22 ]
Enter value: 

That's all, eh?
$ 

import java.util.Scanner;

public class searching {
  
  static int search(int[] a, int target) {
    int result;
    for ( result = a.length - 1 ;
            result >= 0  &&  a[result] != target ;
            result--
            )
      ;
    return  result;
    }

  public static void main(String[] args) {
    Scanner scanner = new Scanner(System.in);
    int[] data = new int[0];
    for ( ; ; ) {   
      System.out.print("\ndata:  ");
      display_array(data);
      System.out.print("Enter value: ");
      if ( !scanner.hasNextInt() ) {
        System.out.println("\nThat's all, eh?");
        break;
        }
      int value = scanner.nextInt();
      int s = search(data, value);
      if ( s >= 0 ) {
        // this condition implies |data[s]| must be |value|
        System.out.println("already among data");
        continue;
        }
      // else {
        // didn't encounter |value| in array
        System.out.println("not already among data; adding it");
        int[] newspace = new int[data.length + 1];
        /*
        for ( int i = 0; i < data.length; i++ )
          newspace[i] = data[i];
        */
        System.arraycopy(data, 0, newspace, 0, data.length);
        newspace[data.length] = value;
        data = newspace;

        /*
         * |ArrayList| could handle this processing.
         
         * (Think code never changes?  Java 1.5 added type parameters
         * to |ArrayList| etc.)
         */
        // }
      }
    }

  static void display_array(int[] a) {
    System.out.print("[ ");
    for ( int v : a )
      System.out.print(v + " ");
    System.out.println(']');
    }
  }
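as the comment in that program notes, ArrayList could handle the growing; a minimal sketch of that alternative, assuming boxed Integer elements (the names NoDuplicatesSketch and addIfAbsent, and the sample inputs, are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

class NoDuplicatesSketch {

    // Adds value to data unless it is already present.
    // Returns true if value was added.
    static boolean addIfAbsent(List<Integer> data, int value) {
        if ( data.contains(value) )   // contains() is itself a linear search
            return false;
        data.add(value);              // the list grows its own storage
        return true;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        int[] inputs = { 55, 33, 77, 33, 88, 22, 77 };   // sample inputs
        for ( int value : inputs )
            System.out.println(value
                + (addIfAbsent(data, value) ? ": adding it" : ": already among data"));
        System.out.println(data);   // prints [55, 33, 77, 88, 22]
    }
}
```

the array version above does the same work by hand: allocate one more slot, copy, append.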
a detail: might be nicer to have elements kept in order:
$ java searching_s

array:  [ ]
Enter value: 55
not already in array; adding it

array:  [ 55 ]
Enter value: 33
not already in array; adding it

array:  [ 33 55 ]
Enter value: 77
not already in array; adding it

array:  [ 33 55 77 ]
Enter value: 33
already in array

array:  [ 33 55 77 ]
Enter value: 88
not already in array; adding it

array:  [ 33 55 77 88 ]
Enter value: 22
not already in array; adding it

array:  [ 22 33 55 77 88 ]
Enter value: 77
already in array

array:  [ 22 33 55 77 88 ]
Enter value: 44
not already in array; adding it

array:  [ 22 33 44 55 77 88 ]
Enter value: 

That's all, eh?
$ 
to achieve that behavior, change parts as follows:
import java.util.Scanner;

public class searching_s {
  
  static int search_s(int[] a, int target) {
    int result;
    for ( result = a.length - 1 ;
            result >= 0  &&  a[result] > target ;
            result--
            )
      ;
    return  result;
    }

  public static void main(String[] args) {
    Scanner scanner = new Scanner(System.in);
    int[] data = new int[0];
    for ( ; ; ) {   // 'forever'
      System.out.print("\narray:  ");
      display_array(data);
      System.out.print("Enter value: ");
      if ( !scanner.hasNextInt() ) {
        System.out.println("\nThat's all, eh?");
        break;
        }
      int value = scanner.nextInt();
      int s = search_s(data, value);
      if ( s >= 0  &&  data[s] == value ) {
        System.out.println("already in array");
        continue;
        }
      // else {
        System.out.println("not already in array; adding it");
        // maintaining array sorted:
        // insert |value| just after position |s|
        int[] newspace = new int[data.length + 1];
        // First, copy elements through position |s|
        // because |data[s] < value|:
        /*
        for ( int i = 0; i <= s; i++ )
          newspace[i] = data[i];
        */
        System.arraycopy(data, 0, newspace, 0, s + 1);
        newspace[s + 1] = value;
        /*
        for ( int i = s + 1; i < data.length; i++ )
          newspace[i + 1] = data[i];
        */
        System.arraycopy(
            data,
            s + 1,
            newspace,
            (s + 1) + 1,
            data.length - (s + 1)
            );
        data = newspace;
        
        // |ArrayList| could further handle such insertion
        // in middle of sequence; but I'm showing you...

        // }
      }
    }

  static void display_array(int[] a) {
    System.out.print("[ ");
    for ( int i = 0; i < a.length; i++ )
      System.out.print(a[i] + " ");
    System.out.println(']');
    }
  }
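the insertion step of that program can be isolated as one method; a minimal sketch, assuming the input array is already in increasing order (the names SortedInsertSketch and insertSorted are hypothetical, and unlike the program above this sketch does not reject duplicates):

```java
import java.util.Arrays;

class SortedInsertSketch {

    // Returns a new array one element longer than sorted, with value
    // placed so the result stays in increasing order.
    static int[] insertSorted(int[] sorted, int value) {
        // Scan backward past every element greater than value.
        int s = sorted.length - 1;
        while ( s >= 0  &&  sorted[s] > value )
            s--;
        int[] result = new int[sorted.length + 1];
        // Elements 0..s are <= value: copy them unchanged.
        System.arraycopy(sorted, 0, result, 0, s + 1);
        result[s + 1] = value;
        // Remaining elements shift one position to the right.
        System.arraycopy(sorted, s + 1, result, s + 2, sorted.length - (s + 1));
        return result;
    }

    public static void main(String[] args) {
        int[] data = { 33, 55, 77, 88 };
        System.out.println(Arrays.toString(insertSorted(data, 22)));   // prints [22, 33, 55, 77, 88]
        System.out.println(Arrays.toString(insertSorted(data, 60)));   // prints [33, 55, 60, 77, 88]
        System.out.println(Arrays.toString(insertSorted(data, 99)));   // prints [33, 55, 77, 88, 99]
    }
}
```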
regardless of whether the elements are left unordered or kept ordered,
that straightforward searching is called linear search

but consider:
would you really search that way
e.g. to find an entry in a telephone-book?

if the number of elements is n,
linear searching checks on average about n/2 of them before finding the target
e.g. if n = 16,000, linear searching checks about 8,000 on average
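that average can be checked by direct counting; a small sketch, assuming the target is present and equally likely to be at any position (the names LinearAverageSketch, checksToFind and averageChecks are hypothetical):

```java
class LinearAverageSketch {

    // A forward linear search that stops at position index (0-based)
    // has examined positions 0..index, i.e. index + 1 elements.
    static int checksToFind(int index) {
        return index + 1;
    }

    // Average number of checks over all n possible target positions.
    static double averageChecks(int n) {
        long total = 0;
        for ( int index = 0; index < n; index++ )
            total += checksToFind(index);
        return (double) total / n;
    }

    public static void main(String[] args) {
        System.out.println(averageChecks(16000));   // prints 8000.5
    }
}
```

the exact average is (n + 1) / 2, which for large n is essentially n/2.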


better way of searching is called binary search
(actually humans search slightly differently from binary search)
we can quantify how much better:
if n = 16,000 elements, binary search checks only about 14 of them
(generally for n elements binary search checks about log2(n) of them)
called "binary" because split elements to search in two repeatedly
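the figure for n = 16,000 can be reproduced by counting halvings; a small sketch (the names HalvingCountSketch and halvings are hypothetical):

```java
class HalvingCountSketch {

    // Counts how many times n can be halved (integer division) before
    // nothing is left; each halving corresponds to one element checked.
    static int halvings(int n) {
        int count = 0;
        while ( n > 0 ) {
            n /= 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(halvings(16000));                 // prints 14
        System.out.println(Math.log(16000) / Math.log(2));   // log2(16000), roughly 13.97
    }
}
```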
e.g. suppose searching items as follows for target-value say 70:
(each "**" is an item)
**  **  **  **  **  **  **  **  **  **  **  **  **  **  **
and suppose you know they're sorted in increasing order
then compare target to middle element,
say as follows:
**  **  **  **  **  **  **  55  **  **  **  **  **  **  **
with this information you suddenly know target (70) can only be in right half:
                                **  **  **  **  **  **  **
then again compare target to middle element of current range, say as follows:
                                **  **  **  77  **  **  **
with this information you know target can only be in left half of that range:
                                **  **  **
then again compare target to middle element of current range
and so on
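the narrowing above can be traced in code; a sketch using a plain textbook binary search that records each index it examines, with 15 made-up sorted values chosen so that the middle element is 55 and the next one examined is 77 (the names BinarySearchTraceSketch and probes are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

class BinarySearchTraceSketch {

    // Binary search that records each index it examines.
    // Assumes a is sorted in increasing order.
    static List<Integer> probes(int[] a, int target) {
        List<Integer> examined = new ArrayList<>();
        int low = 0, high = a.length - 1;
        while ( low <= high ) {
            int middle = (low + high) / 2;
            examined.add(middle);
            if ( target < a[middle] )
                high = middle - 1;      // target can only be in left half
            else if ( a[middle] < target )
                low = middle + 1;       // target can only be in right half
            else
                break;                  // found it
        }
        return examined;
    }

    public static void main(String[] args) {
        // illustrative values only: index 7 holds 55, index 11 holds 77
        int[] items = { 10, 15, 20, 25, 30, 40, 50, 55, 60, 65, 70, 77, 80, 90, 95 };
        System.out.println(probes(items, 70));   // prints [7, 11, 9, 10]
    }
}
```

only 4 of the 15 items get examined: 55, then 77, then 65, then 70.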

main work in this algorithm comprises
repeatedly adjusting low and high bounds of current segment of array:
    /**
     * Performs binary search, adapted to also report an insertion index.
     * @return index where item is found; if not found, index where
     *         item would be inserted (or -1 if the array is empty)
     * @author Mark Allen Weiss and Hugh McGuire
     */
    static int binary_search(int[] a, int target) {
        int middle = -1;
        for ( int low = 0, high = a.length - 1;  low <= high; ) {
            middle = (low + high) / 2;
            if ( target < a[middle] )
                high = middle - 1;
            else if ( a[middle] < target ) {
                low = middle + 1;
                middle = middle + 1;
                  /*
                   * adjust |middle| in case will stop if this new |low|
                   * is greater than |high|
                   * so insertion index |middle| will be
                   * past element less than |target|
                   */
                }
            else    // must have |a[middle] == target| --- found it!
              break;
            }
        return  middle;
        }
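invoke  binary_search(data, value)  in place of  search_s(data, value)
but note: when  value  is absent,  binary_search  returns the insertion index itself,
whereas  search_s  returns the position just before it,
so the found-test and insertion offsets in  main  must be adjusted to match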
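a sketch of one way the calling code might use that result, assuming the return convention of binary_search as written above: found index, else insertion index, else -1 for an empty array (the names BinaryInsertSketch, binarySearch and addIfAbsent are hypothetical):

```java
import java.util.Arrays;

class BinaryInsertSketch {

    // Same logic as binary_search above: index of target if present,
    // else the index where it would be inserted (-1 if a is empty).
    static int binarySearch(int[] a, int target) {
        int middle = -1;
        for ( int low = 0, high = a.length - 1;  low <= high; ) {
            middle = (low + high) / 2;
            if ( target < a[middle] )
                high = middle - 1;
            else if ( a[middle] < target ) {
                low = middle + 1;
                middle = middle + 1;
            }
            else
                break;
        }
        return middle;
    }

    // Returns data unchanged if value is present, else a new sorted
    // array that also contains value.
    static int[] addIfAbsent(int[] data, int value) {
        int s = binarySearch(data, value);
        if ( s < 0 )
            s = 0;                                   // empty array: insert at front
        if ( s < data.length  &&  data[s] == value )
            return data;                             // already in array
        int[] newspace = new int[data.length + 1];
        System.arraycopy(data, 0, newspace, 0, s);   // elements before index s
        newspace[s] = value;
        System.arraycopy(data, s, newspace, s + 1, data.length - s);
        return newspace;
    }

    public static void main(String[] args) {
        int[] data = new int[0];
        for ( int value : new int[] { 55, 33, 77, 33, 88, 22, 77, 44 } )
            data = addIfAbsent(data, value);
        System.out.println(Arrays.toString(data));   // prints [22, 33, 44, 55, 77, 88]
    }
}
```

the bounds test  s < data.length  matters because the insertion index can be one past the last element.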

again, suppose number of items given  a.length := 16,000
but not really looking at each and every one of them like linear search, just have them all stored in array:
**********************************************************
with the data sorted -- necessary for this scheme
and suppose we're searching for  target
then to follow the algorithm:
compare middle element  a[middle]  to  target
suppose  a[middle] < target
then  target  can only be in the right half
then focus attention on that half of data:
                             *****************************
current range of items comprises about 8,000
but not looking at all of them like linear search,
just have them all stored in array;
have looked at only 1 item so far
continuing to follow the algorithm:
compare middle element of current range to target,
then focus attention on the only half of this data where  target  could be
say left half:
                             **************
current range of items comprises about 4,000
but not looking at all of them like linear search,
just have them all stored in array;
have looked at only 2 items so far
continuing to follow the algorithm:
compare middle element of current sequence to target,
then focus attention on the only half of this data where  target  could be
range reduces to about 2,000
but have actually looked at only 3 items so far
continuing,
after looking at 4 items, reduce to range comprising about 1,000
after looking at 5 items, reduce to range comprising about 500
after looking at 6 items, reduce to range comprising about 250
after looking at 7 items, reduce to range comprising about 125
after looking at 8 items, reduce to range comprising about 62
after looking at 9 items, reduce to range comprising about 31
after looking at 10 items, reduce to range comprising about 15
after looking at 11 items, reduce to range comprising about 7
after looking at 12 items, reduce to range comprising about 3
after looking at 13 items, reduce to range comprising 1
then look at last 1 item

and how many items looked at?
from among the original 16,000, only 14 items actually looked at
didn't need to look at even half of them (8,000) to determine presence or absence of  target
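that count can be confirmed by instrumenting a search; a sketch using a plain textbook binary search over 16,000 made-up sorted values, trying every present and every absent target (the names ProbeCountSketch, countProbes and worstCase are hypothetical):

```java
class ProbeCountSketch {

    // Binary search over a sorted array; returns how many elements
    // were examined before the search stopped.
    static int countProbes(int[] a, int target) {
        int probes = 0;
        int low = 0, high = a.length - 1;
        while ( low <= high ) {
            int middle = (low + high) / 2;
            probes++;
            if ( target < a[middle] )
                high = middle - 1;
            else if ( a[middle] < target )
                low = middle + 1;
            else
                break;
        }
        return probes;
    }

    // Largest probe count over all targets, present or absent.
    static int worstCase(int n) {
        int[] a = new int[n];
        for ( int i = 0; i < n; i++ )
            a[i] = 2 * i + 1;           // sorted odd values 1, 3, 5, ...
        int worst = 0;
        for ( int target = 0; target <= 2 * n; target++ )   // even targets are absent
            worst = Math.max(worst, countProbes(a, target));
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstCase(16000));   // prints 14
    }
}
```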


(Copyright © 2008 by Hugh McGuire   — for thoughts about this, see   http://www.cis.gvsu.edu/~mcguire/teaching/copyright_thoughts.html .)