Array Intersection Bake-off

One of those moments where an interview question turns into a research project, or is it really a bake off?  The simple problem is demonstrate an algorithm to intersect two lists of numbers, fundamentally it’s a question about using modern interpreted languages and their associative array bits to make a simple intersection routine.  However many languages support many different ways to do things.  I’ve put together a test of Python vs. Java vs. Ruby vs. Perl vs. PHP and got a few interesting benchmarks.

Short Version of the results: java wins with python comming in second.

But, things are not so simple for instance there is a simple base approach (python example)

1def isect(a, b) :
2    o = []
3    h = {}
4    for i in a :
5        h[i] = True
6    for i in b :
7        if h.get(i, False) :
8            o.append(i)
9    return o

or a more language unique version

1def isect(a, b) :
2    h = dict(iter([(i, True) for i in a]))
3    return [i for i in b if h.get(i,False)]

or just using features of the language

1def isect(a, b) :
2    return list(set(a) & set(b))</pre>

All of these return the same result, but clearly we can think of this as a progression of an alorigthm.   Version 1 focusing on textbook to code, version 2 says there’s some cool language features and version 3 says, dude like don’t you really know what your doing?

The upshot was that I sat down and wrote 12 different versions of this just to see what the language differences were.

Language Version Alg. Time Run Time
java 1 3.076 5.888
perl 1 3.622 6.691
php 1 3.901 22.878
python 1 1.740 4.526
ruby 1 3.517 9.853
java 2 0.817 4.362
python 2 3.550 6.441
ruby 2 7.984 14.281
python 3 1.500 4.356
ruby 3 3.809 10.184
c++ 1 0.830 1.209
python 4 1.040
php 2 2.000
php 3 10.064
java 3 3992.045

Of course you’re probably wondering about language versions:

  • java 1.6.0
  • perl 5.8.8
  • php 5.2.9
  • python 2.5.4
  • ruby 1.8.6
 1// Java Version # 1
 2    private static int[] isect(int a[], int b[]) {
 3        int            l[] = new int[a.length];
 4        TreeMap        h = new TreeMap();
 5        int            idx = 0;
 6
 7        for (int i = 0; i < a.length; i++) {
 8            h.put(new Integer(a[i]), 1);
 9        }
10        for (int i = 0; i < b.length; i++) {
11            if (h.containsKey(new Integer(b[i])))
12                l[idx++] = b[i];
13        }
14
15        int o[] = new int[idx];
16        for (int i = 0; i < idx; i++) {
17            o[i] = l[i];
18        }
19
20        return o;
21    }
 1# Perl Version 1
 2sub isect {
 3    my($a, $b) = @_;
 4    my(@o, %h);
 5
 6    for my $i (@$a) {
 7        $h{$i} = 1;
 8    }
 9    for my $i (@$b) {
10        push(@o, $i) if $h{$i};
11    }
12
13    return @o;
14}
 1# PHP Version 1
 2function isect($a, $b) {
 3    $h = array();
 4    $o = array();
 5
 6    foreach ($a as $i) {
 7        $h[$i] = true;
 8    }
 9    foreach ($b as $i) {
10        if ($h[$i]) {
11            array_push($o, $i);
12        }
13    }
14
15    return $o;
16}
 1# Python Version 1
 2def isect(a, b) :
 3    o = []
 4    h = {}
 5    for i in a :
 6        h[i] = True
 7    for i in b :
 8        if h.get(i, False) :
 9            o.append(i)
10    return o
1# Ruby Version 1
2def isect(a, b)
3  return a & b
4end
 1// Java Version 2
 2    private static Integer[] isect(Integer a[], Integer b[]) {
 3        ArrayList<integer> l = new ArrayList</integer><integer>(a.length);
 4        HashSet</integer><integer>   h = new HashSet</integer><integer>();
 5
 6        for (int i = 0; i < a.length; i++) {
 7            h.add(a[i]);
 8        }
 9        for (int i = 0; i < b.length; i++) {
10            if (h.contains(b[i]))
11                l.add(b[i]);
12        }
13
14        return (Integer[])l.toArray(new Integer[0]);
15    }
1# Python Version 2
2def isect(a, b) :
3    h = dict(iter([(i, True) for i in a]))
4    return [i for i in b if h.get(i,False)]
1# Ruby Version 2
2def isect(a, b) 
3  # Convert array to Set objects and perform intersection
4  a = a.to_set
5  b = b.to_set
6
7  return a & b
8end
1# Python Version 3
2def isect(a, b) :
3    return list(set(a)& set(b))
 1# Ruby Version 3
 2def isect(a, b) 
 3    o = Array.new
 4    h = Hash.new
 5
 6    a.each do |i|
 7        h[i] = 1
 8    end
 9    b.each do |i|
10        if h[i] 
11            o.push(i)
12        end
13    end
14
15    return o
16end
 1# PHP Version 2
 2function isect($a, $b) {
 3    $b = array_flip($b);
 4    $o = array();
 5
 6    foreach ($a as $i) {
 7        if(isset($b[$i])) {
 8            $o[] = $i;
 9        }
10    }
11
12    return $o;
13}
1# PHP Version 3
2function isect($a, $b) {
3    return array_intersect($a, $b);
4}
1// Java Version 3
2    private static Integer[] isect(Integer a[], Integer b[]) {
3        Set<integer> l = new HashSet</integer><integer>(Arrays.asList(a));
4        l.retainAll(Arrays.asList(b));
5
6        return (Integer[])l.toArray(new Integer[0]);
7    }
1# Python Version 4
2def isect(a, b) :
3    h = set(a)
4    return [i for i in b if i in h]
 1// C++ Version 1
 2std::list<int>* isect(int len, int a[], int b[]) {
 3    __gnu_cxx::hash_set</int><int>   h = __gnu_cxx::hash_set</int><int>();
 4
 5    for (int i = 0; i < len; i++) 
 6        h.insert(a[i]);
 7
 8    std::list< int>   *l = new std::list< int>();
 9
10    for (int i = 0; i < len; i++) 
11        if (h.find(b[i]) != h.end()) 
12            l->push_back(b[i]);
13
14    return l;
15}

Would enjoy peoples thoughts and feedback, or alternate implementations for these and other languages