Randomizing an array.

by Rabbit

Update! Mad props to Ilan Berci on Ruby Forum for the enlightenment. His version is incredible:

Update! Ilan emailed me back saying, “I didn’t come up with it, the technique is very common amongst the community.” Is that so? Learn something new every day!

class Array

  def randomize
    sort_by { rand }
  end

  def randomize!
    self.replace(randomize)
  end

end

Are you ready for these benchmarks?

                      user     system      total        real
100 elements      0.000000   0.000000   0.000000 (  0.000026)
500 elements      0.000000   0.000000   0.000000 (  0.000025)
1000 elements     0.000000   0.000000   0.000000 (  0.000025)
5000 elements     0.000000   0.000000   0.000000 (  0.000025)
50_000 elements   0.000000   0.000000   0.000000 (  0.000024)

That is fucking sweet.

I’ve come up with a few solutions, each with benchmarks. Two are incredibly slow, one is incredibly fast, and one is just right. For those of you who only care about the code, I’m presenting the “best” solution first. Here it is.

The “best” solution

class Array

  def randomize
    duplicated_original, new_array = self.dup, self.class.new
    new_array << duplicated_original.slice!(rand(duplicated_original.size)) until new_array.size.eql?(self.size)
    new_array
  end

  def randomize!
    self.replace(randomize)
  end

end

The benchmarks:

                    user     system      total        real
100 elements    0.000000   0.000000   0.000000 (  0.000143)
500 elements    0.000000   0.000000   0.000000 (  0.000905)
1000 elements   0.010000   0.000000   0.010000 (  0.002304)
5000 elements   0.030000   0.000000   0.030000 (  0.034623)

Hyper-fast. (Destroys the original array.)

class Array

  def randomize
    original_size, new_array = size, self.class.new

    until new_array.size.eql?(original_size)
      new_array << self.slice!(rand(size))
    end

    new_array
  end

  def randomize!
    self.replace(randomize)
  end

end

The benchmarks:

                    user     system      total        real
100 elements    0.000000   0.000000   0.000000 (  0.000013)
500 elements    0.000000   0.000000   0.000000 (  0.000011)
1000 elements   0.000000   0.000000   0.000000 (  0.000011)
5000 elements   0.000000   0.000000   0.000000 (  0.000010)

Before you get all hyped up about those benchmarks, realize that the above randomization method destroys your original array. Yeah, really. Unless you have need for a randomized array (e.g. randomized_array = normal_array.randomize), you should probably stick with the top option. :)

Ignoring the hyper-fast method because it destroys the original array, this is a very, very nice solution if I do say so myself. Feel free to use this in your real-world projects. ;)

— second — Randomize an array in Ruby.

I took another swing at my array#randomize method and came up with this.

class Array

  def randomize
    a = self.class.new

    until a.size.eql?(size)
      indice = rand(size)
      a << self.find { |x| x.eql?(self[indice]) } unless a.include?(self[indice])
    end

    a
  end

  def randomize!
    self.replace(randomize)
  end

end

Here are the benchmarks.

                          user     system      total        real
randomize 100:        0.010000   0.000000   0.010000 (  0.006395)
randomize 500:        0.170000   0.000000   0.170000 (  0.176871)
randomize 1000:       0.650000   0.000000   0.650000 (  0.657394)
randomize 5000:      18.620000   0.050000  18.670000 ( 18.793249)

I know nothing of randomization theory (if there is such a thing), so I’m simply combining the tools available to me via Ruby.

— first —

Inefficient with anything but small arrays, but it’s easy to read, understand and get working. Enjoy.

class Array

  def randomize
    a = self.class.new
    a << self[rand(size)] and a.uniq! until a.size.eql?(size)
    a
  end

  def randomize!
    self.replace(randomize)
  end

end

Some benchmarks. Pretty scary. Don’t use for anything serious, obviously.

                    user        system     total             real
randomize 100:      0.010000    0.000000   0.010000  (  0.014094)
randomize 500:      0.530000    0.010000   0.540000  (  0.539428)
randomize 1000:     4.200000    0.020000   4.220000  (  4.242288)
randomize 5000:     91.330000   0.400000   91.730000 ( 92.531058)

Holy shit! The processing time goes up exponentially as the array gets bigger. Seriously, don’t use this is a real project.