Binary Search Tree
A Binary Search Tree is a node-based binary tree data structure which has the following properties:
- The left subtree of a node contains only nodes with keys lesser than the node's key.
- The right subtree of a node contains only nodes with keys greater than the node's key.
- The left and right subtree each must also be a binary search tree.
The above properties of a Binary Search Tree provide an ordering among keys, so that operations like search, minimum, and maximum can be done quickly. Without this ordering, we might have to compare every key to search for a given key.
Searching a key
To search a given key in a Binary Search Tree, we first compare it with the root. If the key is present at the root, we return the root. If the key is greater than the root's key, we recur for the right subtree of the root node. Otherwise, we recur for the left subtree.
Illustration to search 6 in the tree below:
1. Start from the root.
2. Compare the element to search with the root; if it is less than the root, recurse for the left subtree, else recurse for the right.
3. If the element is found anywhere, return true, else return false.
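The search steps above can be sketched in Python as follows; this is a minimal illustration, and the `Node` class and function name are our own rather than part of the original article:

```python
class Node:
    """A node of a binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def search(root, key):
    """Return the node holding key, or None if the key is absent."""
    # Base cases: empty subtree, or the key is at the root.
    if root is None or root.key == key:
        return root
    # Key larger than root's key: it can only be in the right subtree.
    if key > root.key:
        return search(root.right, key)
    # Otherwise it can only be in the left subtree.
    return search(root.left, key)
```

For example, searching a tree built from the keys 8, 3, 10, 1, 6 for the key 6 returns the node containing 6, while searching for 14 returns `None`.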
Insertion of a key
A new key is always inserted at a leaf. We start searching for the key from the root until we hit a leaf node. Once a leaf node is found, the new node is added as a child of that leaf node.
      100                          100
     /   \       Insert 40        /   \
    20    500    --------->     20     500
   /  \                        /  \
  10   30                    10    30
                                     \
                                      40
Illustration to insert 2 in the tree below:
1. Start from the root.
2. Compare the inserting element with the root; if it is less than the root, recurse for the left subtree, else recurse for the right.
3. After reaching the end, insert the node to the left if it is less than the current node, else to the right.
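The insertion procedure can be sketched as below; this is a minimal recursive version (the `Node` class and function name are our own), which reaches a leaf position and attaches the new node there:

```python
class Node:
    """A node of a binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root; return the (possibly new) root."""
    # Reached an empty position: this is where the new leaf goes.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # Equal keys are ignored here; duplicates are discussed later.
    return root
```

Inserting 100, 20, 500, 10, 30, 40 in that order reproduces the diagram above: 40 ends up as the right child of 30.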
Time Complexity: The worst-case time complexity of the search and insert operations is O(h), where h is the height of the Binary Search Tree. In the worst case, we may have to travel from the root to the deepest leaf node. The height of a skewed tree may become n, so the time complexity of search and insert may become O(n).
Some Interesting Facts:
- Inorder traversal of a BST always produces sorted output.
- We can construct a BST from only its Preorder, Postorder, or Level Order traversal. Note that we can always get the inorder traversal by sorting the given traversal.
- The number of unique BSTs with n distinct keys is the nth Catalan number.
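The Catalan-number fact can be checked with a few lines of Python; this sketch computes Catalan numbers via the standard recurrence C(0) = 1, C(n) = sum over i of C(i) * C(n-1-i):

```python
def catalan(n):
    """Return the nth Catalan number: the count of unique BSTs on n distinct keys."""
    c = [1] * (n + 1)  # c[0] = 1 by definition
    for i in range(1, n + 1):
        # Each term picks one key as the root: j keys go left, i-1-j go right.
        c[i] = sum(c[j] * c[i - 1 - j] for j in range(i))
    return c[n]
```

For instance, `catalan(3)` is 5, matching the five distinct shapes a BST on three distinct keys can take.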
Now we are going to talk about the delete operation. When we delete a node, three possibilities arise.
1) Node to be deleted is a leaf: simply remove it from the tree.
       50                           50
      /  \       delete(20)        /  \
    30    70     --------->      30    70
   /  \   / \                      \   / \
  20   40 60  80                   40 60  80
2) Node to be deleted has only one child: copy the child to the node and delete the child.
       50                           50
      /  \       delete(30)        /  \
    30    70     --------->      40    70
      \   / \                        /  \
      40 60  80                    60    80
3) Node to be deleted has two children: find the inorder successor of the node, copy the contents of the inorder successor to the node, and delete the inorder successor. Note that the inorder predecessor can also be used.
       50                           60
      /  \       delete(50)        /  \
    40    70     --------->      40    70
         /  \                           \
       60    80                          80
The important thing to note is that the inorder successor is needed only when the right child is not empty. In this particular case, the inorder successor can be obtained by finding the minimum value in the right child of the node.
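The three deletion cases can be sketched together in Python; this is a minimal illustration (the `Node` class and helper names are our own), using the minimum of the right subtree as the inorder successor:

```python
class Node:
    """A node of a binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def min_value_node(node):
    """Return the leftmost (minimum-key) node of a non-empty subtree."""
    while node.left is not None:
        node = node.left
    return node

def delete(root, key):
    """Delete key from the BST rooted at root; return the new root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        # Cases 1 and 2: leaf, or node with a single child.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children. Copy the inorder successor's key,
        # then delete the successor from the right subtree.
        succ = min_value_node(root.right)
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    return root
```

On the tree from the diagrams (50 with children 30 and 70, grandchildren 20, 40, 60, 80), `delete(root, 50)` replaces the root's key with its inorder successor 60 and removes the old 60 node.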
Time Complexity: The worst-case time complexity of the delete operation is O(h), where h is the height of the Binary Search Tree. In the worst case, we may have to travel from the root to the deepest leaf node. The height of a skewed tree may become n, so the time complexity of delete may become O(n).
Find maximum count of duplicate nodes in a Binary Search Tree
Given a Binary Search Tree (BST) with duplicates, find the most frequently occurring element in the given BST. If the BST contains two or more such elements, print any of them.
Note: We cannot use any extra space. (Assume that the implicit stack space incurred due to recursion does not count.)
Assume a BST is defined as follows:
- The left subtree of a node contains only nodes with keys less than or equal to the node's key.
- The right subtree of a node contains only nodes with keys greater than or equal to the node's key.
- Both the left and right subtrees must also be binary search trees.
Examples:
Input : Given BST is
6
/ \
5 7
/ \ / \
4 5 7 7
Output : 7
Input : Given BST is
10
/ \
5 12
/ \ / \
5 6 12 16
Output : 5 or 12
We can print either of the two values, 5 or 12.
Approach:
To find the node, we use the inorder traversal of the BST, because the inorder traversal of a BST visits keys in sorted order, so equal keys appear consecutively. The idea is to do a recursive inorder traversal while keeping track of the previous node's value. If the current node's value equals the previous value, we increase the current count; if the current count becomes greater than the maximum count so far, we record the current element as the answer.
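The approach above can be sketched as follows; this is a minimal version (the `Node` class and names are our own) that carries only the previous value, the running count, and the best count through the inorder traversal:

```python
class Node:
    """A node of a binary search tree (duplicates allowed)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def find_mode(root):
    """Return one most frequently occurring key in the BST, or None if empty."""
    # O(1) extra state besides the recursion stack.
    state = {"prev": None, "count": 0, "max_count": 0, "mode": None}

    def inorder(node):
        if node is None:
            return
        inorder(node.left)
        # Equal keys are adjacent in inorder order, so a simple
        # run-length count finds each key's frequency.
        if state["prev"] is not None and node.key == state["prev"]:
            state["count"] += 1
        else:
            state["count"] = 1
        if state["count"] > state["max_count"]:
            state["max_count"] = state["count"]
            state["mode"] = node.key
        state["prev"] = node.key
        inorder(node.right)

    inorder(root)
    return state["mode"]
```

On the first example tree (6 with children 5 and 7, grandchildren 4, 5, 7, 7), the inorder sequence is 4, 5, 5, 6, 7, 7, 7, so the function returns 7.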
Advantages of BST over Hash Table
Hash Table supports following operations in Θ(1) time.
- Search
- Insert
- Delete
The time complexity of the above operations in a self-balancing Binary Search Tree (BST) (like Red-Black Tree, AVL Tree, Splay Tree, etc.) is O(log n).
So a Hash Table seems to beat a BST in all common operations. When should we prefer a BST over a Hash Table, and what are the advantages? Following are some important points in favor of BSTs.
- We can get all keys in sorted order just by doing an inorder traversal of the BST. This is not a natural operation in Hash Tables and requires extra effort.
- Order statistics, finding the closest lower and greater elements, and range queries are easy to do with BSTs. Like sorting, these operations are not natural with Hash Tables.
- BSTs are easier to implement than hashing; we can easily implement our own customized BST. To implement hashing, we generally rely on libraries provided by programming languages.
- With self-balancing BSTs, all operations are guaranteed to work in O(log n) time. With hashing, Θ(1) is the average time, and some particular operations may be costly, especially when table resizing happens.