Software Craftsmanship

Sorting Algorithms

Posted by Tom Spencer on 2019-12-19

How we sort depends on the data structure, but the core idea is the same everywhere: we sort by comparing elements, whether as small and big, alphabetically, by size, or by any other ordering we choose. The comparison is usually defined as a method on the class being sorted: in Ruby via the spaceship operator `<=>` (while `==` defines equality), and in Java via `Comparable#compareTo` or a `Comparator` (while `equals` and `hashCode` define equality and hashing); a short sketch of this follows the list below. Every element in a data structure can also be labelled by an index or key, so even a plain array behaves a little like a map from positions to values, with the system handling the assignment of those positions to memory; sorting then rearranges the values behind those labels according to the comparison. Here is a short list of some of the best known sorting algorithms:

  • Bubble sort
  • Insertion sort
  • Heap sort
  • Binary sort
  • Merge sort - recursive, built on a tree of splits
  • Quick sort - usually the fastest, but with limitations
  • Radix sort - stable, well suited to big numbers
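
As a rough illustration of how a class defines its own sort order, here is a minimal Java sketch; the `Box` class and its fields are made up for this example, not taken from any library:

```java
import java.util.Arrays;

// Hypothetical example type: ordered by size first, then by name.
class Box implements Comparable<Box> {
    final String name;
    final int size;

    Box(String name, int size) {
        this.name = name;
        this.size = size;
    }

    @Override
    public int compareTo(Box other) {
        // Negative means "this sorts before other", positive means after.
        int bySize = Integer.compare(this.size, other.size);
        return bySize != 0 ? bySize : this.name.compareTo(other.name);
    }

    @Override
    public String toString() {
        return name + "(" + size + ")";
    }
}

public class CompareDemo {
    public static void main(String[] args) {
        Box[] boxes = { new Box("b", 2), new Box("a", 2), new Box("c", 1) };
        Arrays.sort(boxes); // uses compareTo to decide the order
        System.out.println(Arrays.toString(boxes)); // [c(1), a(2), b(2)]
    }
}
```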

Bubble sort

Bubble sort is a relatively simple sorting algorithm. We iterate over all the elements, compare each pair of neighbours, and move, for example, the smallest values toward the front and the biggest toward the back. Each swap is a three-step procedure: first we create a temporary variable and store one of the values there, say the big one; then we write the small value into the big value's position (the big value disappears from the array at that moment, but we still have it stored); finally we write the stored big value into the small value's old position.
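
A minimal bubble sort sketch in Java (the method and variable names are mine, chosen for the example):

```java
public class BubbleSort {
    // Sorts the array in place: repeatedly compare neighbours and swap
    // them when out of order, so the biggest value "bubbles" to the back.
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            for (int i = 0; i < a.length - 1 - pass; i++) {
                if (a[i] > a[i + 1]) {
                    // The three-step swap described above:
                    int tmp = a[i];   // 1. store the big value
                    a[i] = a[i + 1];  // 2. put the small value in its place
                    a[i + 1] = tmp;   // 3. put the big value where the small one was
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] numbers = { 5, 1, 4, 2, 8 };
        bubbleSort(numbers);
        System.out.println(java.util.Arrays.toString(numbers)); // [1, 2, 4, 5, 8]
    }
}
```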

How much work does bubble sort do? A single pass makes about as many comparisons as there are elements, but one pass only guarantees that one element reaches its final position, so in the worst case the pass has to be repeated for each of the n elements. An n-comparison pass performed n times gives n * n operations, so the complexity is O(n^2): bubble sort has quadratic complexity. For example, 10 elements can take on the order of 100 comparisons, and 1,000 elements on the order of a million.

Before working out the complexity of the more advanced algorithms, we need to talk about recursion, because most of them are recursive; with searching and sorting we very often want recursive algorithms. Why is recursion useful? A function is recursive if it calls itself, but each call should be made with simpler arguments: if the function processes a data structure, the recursive call should receive a smaller piece of it than the original call did, so that the chain of calls eventually reaches a base case.
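
A tiny sketch of "simpler arguments" in Java: each recursive call receives a strictly smaller slice of the array (the helper and its names are made up for illustration):

```java
public class RecursionDemo {
    // Each call works on a strictly smaller slice of the array,
    // so the arguments get simpler until the base case is reached.
    static int sum(int[] a, int from) {
        if (from == a.length) {
            return 0;                      // base case: empty slice
        }
        return a[from] + sum(a, from + 1); // recursive case: shorter slice
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] { 1, 2, 3, 4 }, 0)); // 10
    }
}
```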

What is the fundamental difference between iterative algorithms and recursive ones? Formally, there is none: every algorithm that can be expressed recursively can be expressed iteratively, and vice versa. This is a theoretical point, though, because in practice it can be much harder to find the equivalent formulation in the other style. In the formal setting, recursion and iteration are equally powerful.
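
The classic illustration is factorial, written both ways; the two functions below compute the same result:

```java
public class FactorialBothWays {
    // Recursive version: the mathematical definition maps directly onto code.
    static long factorialRecursive(long n) {
        return n <= 1 ? 1 : n * factorialRecursive(n - 1);
    }

    // Iterative version: the same computation as a loop with an accumulator.
    static long factorialIterative(long n) {
        long result = 1;
        for (long i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorialRecursive(5)); // 120
        System.out.println(factorialIterative(5)); // 120
    }
}
```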

What about complexity? It depends on the algorithm, but in general terms recursive versions are slightly slower: every recursive call pushes a frame onto the call stack, and parts of the computation sit there waiting for the deeper calls to finish before they can execute. If recursion is slower, why use it? Because sometimes the recursive formulation is so natural for the problem that it leads to a better algorithm overall, and the algorithm translates so cleanly into recursion that it works well.

When should we use recursion and when shouldn't we? We want recursion when the data structure is not linear, or when the computation does not fit a simple loop. Bubble sort, the simplest of these algorithms, is iterative, but in general the rest are not, so recursion is the best solution in most cases for searching and sorting; the tree traversal sketched below is a typical case.
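
A binary tree is the standard example of a non-linear structure where recursion is the natural fit (the Node class here is hand-rolled just for the illustration):

```java
public class TreeTraversal {
    // A tiny binary tree node, defined inline for this example.
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // In-order traversal: left subtree, then the node, then right subtree.
    // Each recursive call receives a smaller subtree; an iterative version
    // would need an explicit stack to remember where to come back to.
    static void inOrder(Node node) {
        if (node == null) return;
        inOrder(node.left);
        System.out.print(node.value + " ");
        inOrder(node.right);
    }

    public static void main(String[] args) {
        Node root = new Node(2, new Node(1, null, null), new Node(3, null, null));
        inOrder(root); // prints: 1 2 3
    }
}
```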

Merge sort

Merge sort is the simplest recursive sorting algorithm, and it works in a similar spirit to binary search. It divides the whole data structure into two parts and recursively keeps halving until only single elements remain; then the process reverses, and the pieces are put back together in order, merging them through comparison.
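
A compact Java sketch of the split-then-merge idea (this version copies subarrays for clarity rather than merging in place):

```java
import java.util.Arrays;

public class MergeSort {
    // Recursively split the array in half, sort each half,
    // then merge the two sorted halves back together.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a; // a single element is already sorted
        int mid = a.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Merge by repeatedly comparing the front elements of each half.
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];   // drain leftovers
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mergeSort(new int[] { 5, 1, 4, 2, 8 })));
        // [1, 2, 4, 5, 8]
    }
}
```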

To work out merge sort's complexity, note first that merging two sorted halves is a linear process: we take the first element of each half and progress element by element, so one whole level of merging costs O(n). How many levels of halving are there? Dividing in half k times covers at most 2^k elements, so to find k for n elements we take the logarithm: k = log2 n. Linear work on each of the log n levels gives O(n * log n) in total, a linear-logarithmic function. For example, n = 8 elements need 3 levels of merging, each touching all 8 elements.

Quick sort

Quick sort sees our data structure as a binary tree: it picks a pivot element, partitions the remaining values into those smaller and those larger than the pivot, and recurses on each side, so the pattern of partitions forms a tree. Going through elements in a tree is fast, which is why this tree-shaped sorting is fast, typically O(n log n), although a badly chosen pivot can degrade it to O(n^2). The approach is less intuitive than merge sort, harder to debug, and the index computations can look strange, but if you do everything correctly you have no problems and sorting is fast. In effect we treat the plain array the way a database treats an index: as a structure organised for fast lookup.
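
A minimal in-place quicksort sketch in Java, using the simple Lomuto partition scheme with the last element as pivot (one common choice among many):

```java
import java.util.Arrays;

public class QuickSort {
    // Partition the range around a pivot, then recurse on both sides.
    // The nesting of these partitions is the binary tree mentioned above.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // Lomuto partition: values smaller than the pivot move left of it.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++;
            }
        }
        int tmp = a[i]; a[i] = a[hi]; a[hi] = tmp; // pivot to its final spot
        return i;
    }

    public static void main(String[] args) {
        int[] numbers = { 5, 1, 4, 2, 8 };
        quickSort(numbers, 0, numbers.length - 1);
        System.out.println(Arrays.toString(numbers)); // [1, 2, 4, 5, 8]
    }
}
```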