Binary Search Tree


Binary Search Tree is a node-based binary tree data structure which has the following properties:

  • The left subtree of a node contains only nodes with keys lesser than the node’s key.
  • The right subtree of a node contains only nodes with keys greater than the node’s key.
  • The left and right subtree each must also be a binary search tree

The above properties of Binary Search Tree provide an ordering among keys so that the operations like search, minimum and maximum can be done fast. If there is no ordering, then we may have to compare every key to search a given key.


Searching a key
To search a given key in Binary Search Tree, we first compare it with root, if the key is present at root, we return root. If key is greater than root’s key, we recur for right subtree of root node. Otherwise we recur for left subtree.

Illustration to search 6 in below tree:
1. Start from root.
2. Compare the inserting element with root, if less than root, then recurse for left, else recurse for right.
3. If element to search is found anywhere, return true, else return false.



Insertion of a key
A new key is always inserted at leaf. We start searching a key from root till we hit a leaf node. Once a leaf node is found, the new node is added as a child of the leaf node.
         100                                    100
        /   \            Insert 40            /    \
      20     500    --------->          20     500
     /  \                                        /  \ 
    10   30                              10   30
                                                       \  
                                                       40

Illustration to insert 2 in below tree:
1. Start from root.
2. Compare the inserting element with root, if less than root, then recurse for left, else recurse for right.
3. After reaching end,just insert that node at left(if less than current) else right.



Time Complexity: The worst case time complexity of search and insert operations is O(h) where h is height of Binary Search Tree. In worst case, we may have to travel from root to the deepest leaf node. The height of a skewed tree may become n and the time complexity of search and insert operation may become O(n).

Some Interesting Facts:
Deletion 
Now we are going to talk about delete operation. When we delete a node, three possibilities arise.
1)      Node to be deleted is leaf: Simply remove from the tree.
              50                            50
           /     \         delete(20)      /   \
          30      70       --------->    30     70
         /  \    /  \                     \    /  \
       20   40  60   80                   40  60   80
2)      Node to be deleted has only one child: Copy the child to the node and delete the child
              50                            50
           /     \         delete(30)      /   \
          30      70       --------->    40     70 
            \    /  \                          /  \ 
            40  60   80                       60   80
3)      Node to be deleted has two children: Find inorder successor of the node. Copy contents of the inorder successor to the node and delete the inorder successor. Note that inorder predecessor can also be used.

              50                            60
           /     \         delete(50)      /   \
          40      70       --------->    40    70
                 /  \                            \
                60   80                           80
The important thing to note is, inorder successor is needed only when right child is not empty. In this particular case, inorder successor can be obtained by finding the minimum value in right child of the node.

Illustration:



Time Complexity: The worst case time complexity of delete operation is O(h) where h is height of Binary Search Tree. In worst case, we may have to travel from root to the deepest leaf node. The height of a skewed tree may become n and the time complexity of delete operation may become O(n)

Find maximum count of duplicate nodes in a Binary Search Tree
Given a Binary Search Tree (BST) with duplicates, find the node (the most frequently occurred element) in the given BST. If the BST contains two or more such nodes, print any of them.

Note: We cannot use any extra space. (Assume that the implicit stack space incurred due to recursion does not count)

Assume a BST is defined as follows:
  • The left subtree of a node contains only nodes with keys less than or equal to the node’s key.
  • The right subtree of a node contains only nodes with keys greater than or equal to the node’s key.
  • Both the left and right subtrees must also be binary search trees.
Examples:
Input :   Given BST is
 
                    6
                 /    \
                5       7
              /   \    /  \
             4     5  7    7
Output : 7 
 
Input :  Given BST is
 
                   10
                 /    \
                5       12
              /   \    /  \
             5     6  12    16
Output : 5 or 12 
We can print any of the two value 5 or 12.

Approach:
To find the node, we need to find the Inorder Traversal of the BST because its Inorder Traversal will be in sorted order.

So, the idea is to do recursive Inorder traversal and keeping the track of the previous node. If the current node value is equal to the previous value we can increase the current count and if the current count becomes greater than the maximum count, replace the element.

Advantages of BST over Hash Table
Hash Table supports following operations in Θ(1) time.
  1.  Search
  2. Insert
  3. Delete

The time complexity of above operations in a self-balancing Binary Search Tree (BST) (like Red-Black Tree, AVL Tree, Splay Tree, etc) is O(Logn).

So Hash Table seems to beating BST in all common operations. When should we prefer BST over Hash Tables, what are advantages. Following are some important points in favor of BSTs.
  1. We can get all keys in sorted order by just doing Inorder Traversal of BST. This is not a natural operation in Hash Tables and requires extra efforts.
  2. Doing order statistics, finding closest lower and greater elements, doing range queries are easy to do with BSTs. Like sorting, these operations are not a natural operation with Hash Tables.
  3. BSTs are easy to implement compared to hashing, we can easily implement our own customized BST. To implement Hashing, we generally rely on libraries provided by programming languages.
  4. With Self-Balancing BSTs, all operations are guaranteed to work in O(Logn) time. But with Hashing, Θ(1) is average time and some particular operations may be costly, especially when table resizing happens.


Reference:


Comments

Popular posts from this blog

Circular Linked List, Doubly Linked list, and Circular Doubly Linked List

Hashing and Hash Tables, Trees & Binary Tree