Benchmarking sorting algorithms in Ruby

For some time now, I wanted to try the implementation of some simple sort algorithms in Ruby. So I did!

I’ve implemented 4 sorting algorithms:
– Quick sort
– Merge sort
– Bubble sort
– Binary tree sort

Quick sort

It’s one of the most efficients. In average, it makes O(n.log(n)) comparisons to sort n items. This is kinda good.

Concept

It takes a pivot element and moves all the inferior elements to the left part of the array and all the superior ones to the right. It iterates on the 2 sub arrays until it has just a one value array. The array is then sorted.

module QSort
  def self.sort(ar)
    return ar if ar.size = 1

    pivot = ar.pop
    sub1 = []
    sub2 = []

    ar.each do |elem|
      (elem = pivot) ? sub1.push(elem) : sub2.push(elem)
    end

    sort(sub1) + [pivot] + sort(sub2)
  end
end

Benchmarks

To sort an array of 500'000 random integers
  user     system      total        real
  3.220000   0.030000   3.250000 (  3.256061)

To sort an array of 1'000'000 random integers
      user     system      total        real
  6.680000   0.130000   6.810000 (  6.845635)

Note

One thing that could be nice is to implement a version using in-place swapping not requiring extra arrays.

Merge Sort

Merge sort is a divide and conquer algorithm. It’s a stable algorithm which means it has the same complexity in average or worst case scenario: O(n.log(n)).

Concept

First, it divides the list in sub-lists until having just one-element lists.
Second, it merges the lists knowing the lists are ordered.

# Merge sort - Tri fusion
module MergeSort

  # merge 2 ordered lists
  def self.mergeList(a, b)
    return b if a.empty?
    return a if b.empty?
    (a[0] = b[0] ? [a.delete_at(0)] : [b.delete_at(0)]) + mergeList(a, b)
  end

  def self.sort(a)
    return a if a.size = 1
    b = a.slice!(0..(a.size/2 - 1))
    mergeList(sort(a), sort(b))
  end
end

Benchmark

The problem with this algorithm or maybe more with Ruby is that if you want to sort a big list (i.e. 1’000’000 elements) you get a ‘Stack level too deep’ exception in your face ! I guess it’s a Ruby problem, I should test it with another language to compare.

I can not even ask it to sort a 100’000 elem lists. I’m a little bit upset there…

I finally found a solution. The easiest way to fix this is increasing the Stack Size of Ruby VM, setting the RUBY_THREAD_VM_STACK_SIZE environment variable:

export RUBY_THREAD_VM_STACK_SIZE=50000000
To sort an array of 100'000 random integers
user      system   total     real
43.790000 0.860000 44.650000 ( 44.763988)

It’s deceiving, I was expecting better performance.

Bubble sort

This is the most time consuming one. The complexity of this one is O(n2) in every scenario (average or worst).

Concept

It repeatedly steps through the list to be sorted, compares each pair of adjacent items and swaps them if they are in the wrong order. Pretty simple no?

module BubbleSort
  def self.sort(ar)
    return ar if ar.size <= 1

    swapped = false
    loop do
      for i in 0..(ar.size - 2) do
        if ar[i] > ar[i + 1]
          ar[i], ar[i + 1] = ar[i + 1], ar[i]
          swapped = true
        end
      end
      break if swapped == false
    end
    return ar
  end
end

Benchmarks

For 1’000’000 entries, I can’t let my computer turn all night. So no benchmark there.

Tree sort

Concept

This one is nice. The idea is to insert list’s elements in a sorted binary tree. Each node has all its inferior elements to the left and all higher ones to its right. Once this insertions are done, we just have to flatten the tree to create the ordered list.

The insertions takes on average O(n.log(n)) which is pretty good.

# Binary tree sort - Tri arborescent
# Tree struct: [leftTree, value, rightTree]
module BinaryTreeSort

  # building the tree
  def self.insert(tree, value)
    return [nil, value, nil] if tree.nil?
    if value <= tree[1]
      [insert(tree[0], value), tree[1], tree[2]]
    else
      [tree[0], tree[1], insert(tree[2], value)]
    end
  end

  # convert the tree to an array
  def self.flatten(tree)
    return [] if tree.nil?
    flatten(tree[0]) + [tree[1]] + flatten(tree[2])
  end

  def self.sort(ar)
    return ar if ar.size <= 1
    tree = nil
    ar.each do |elem|
      tree = insert(tree, elem)
    end
    return flatten(tree)
  end
end

Benchmarks

To sort an array of 500'000 random integers
       user     system      total        real
  9.430000   0.100000   9.530000 (  9.573206)

To sort an array of 1'000'000 random integers
       user     system      total        real
 20.080000   0.090000  20.170000 ( 20.239815)
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s