
Problem

Counting Inversions
Problem Statement
Input. Array A containing the numbers 1, 2, 3, ..., n in some arbitrary order.
Output. Number of inversions, i.e. pairs of elements that appear out of order.
Example. 1, 3, 5, 2, 4, 6
Inversions. (3, 2) (5, 2) (5, 4)
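[Figure: the sorted sequence 1 2 3 4 5 6 drawn above the array 1 3 5 2 4 6, with a line joining each number to its copy; each crossing of two lines corresponds to one inversion.]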

Motivation.
Numerical similarity between two ranked lists.
Applications.
Find similar preferences between movies listed by two friends.
Collaborative filtering. Recommendations based on purchases made by similar buyers.
Q. What is the largest possible number of inversions in a six-element array?
A. $\binom{n}{2} = \frac{n(n-1)}{2}$, which is 15 for $n = 6$.
Worst case. The array in backward (descending) order.

Algorithm 1. Brute-force
Basic Idea. Compare each element with every element after it, much like the nested loops of bubble sort, and count the pairs that are out of order.
Running time. T(n) = O(n^2)
Goal. Can we do better?
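
The brute-force idea can be sketched in Python as below. This is a minimal illustration, not part of the original notes; the function name `count_inversions_brute_force` is an assumption, and indices are 0-based here rather than 1-based.

```python
def count_inversions_brute_force(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] by checking every pair."""
    count = 0
    n = len(a)
    for i in range(n):
        # Compare a[i] with every element to its right.
        for j in range(i + 1, n):
            if a[i] > a[j]:
                count += 1
    return count


print(count_inversions_brute_force([1, 3, 5, 2, 4, 6]))  # 3
print(count_inversions_brute_force([6, 5, 4, 3, 2, 1]))  # 15
```

The first call reproduces the example above, and the second shows the worst case: a six-element array in backward order has 6 * 5 / 2 = 15 inversions.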


| problem 3. Counting Inversions

Algorithm 2. Divide and Conquer
Definition. A pair of indices (i, j) is an inversion if i < j and A[i] > A[j].
Types of inversions.

left if $i, j \le \frac{n}{2}$

right if $i, j > \frac{n}{2}$

split if $i \le \frac{n}{2} < j$

The recursive calls count the left and right inversions; the residual task is to count the split inversions and combine the results from all calls.
Pseudocode.
Count (Array A, Length n)
    IF n == 1 THEN return 0
    ELSE
        X = Count(1st half of A, n/2)
        Y = Count(2nd half of A, n/2)
        Z = CountSplit(A, n)
        return X + Y + Z
Our Goal. O(n log n) overall, which follows if we solve CountSplit in linear time. This is ambitious, since the number of split inversions alone can be quadratic in n, and all the inversions may be split (as in the previous example).
Can we count them without examining each one?
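
To make the left/right/split decomposition concrete, here is a small Python sketch of the recursion with a deliberately naive, quadratic split counter. It only illustrates that X + Y + Z gives the right total; it is not the linear-time CountSplit the notes aim for. The names `count` and `count_split_naive` are assumptions, and indices are 0-based, so "i <= n/2 < j" becomes "i < mid <= j".

```python
def count_split_naive(a, mid):
    """Count split inversions: i in the first half, j in the second, a[i] > a[j].

    This checks every (i, j) pair across the two halves, so it is quadratic.
    """
    return sum(
        1
        for i in range(mid)
        for j in range(mid, len(a))
        if a[i] > a[j]
    )


def count(a):
    """Total inversions = left inversions + right inversions + split inversions."""
    n = len(a)
    if n <= 1:
        return 0
    mid = n // 2
    x = count(a[:mid])              # left inversions
    y = count(a[mid:])              # right inversions
    z = count_split_naive(a, mid)   # split inversions
    return x + y + z


print(count([1, 3, 5, 2, 4, 6]))  # 3
```

For the example array both halves are already sorted, so X = Y = 0 and all three inversions are split.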

Algorithm 2. Piggy-backing MergeSort


Basic Idea. Have each recursive call both sort its sub-array and count its inversions.
Motivation. Recall the merge step of MergeSort; it can be used to uncover split inversions.
Pseudocode.
CountSort (Array A, Length n)
    IF n == 1 THEN return [A, 0]
    ELSE
        [B,X] = CountSort(1st half of A, n/2)
        [C,Y] = CountSort(2nd half of A, n/2)
        [D,Z] = CountSplitMerge(A, B, C, n)
        return [D, X + Y + Z]
Note. If array A has no split inversions, then every element of the sorted sub-array B is less
than every element of the sorted sub-array C.
Example.
Consider merging B and C,

B = | 1 | 3 | 5 |
C = | 2 | 4 | 6 |

D = | 1 |   |   |   |   |   |
D = | 1 | 2 |   |   |   |   |
D = | 1 | 2 | 3 |   |   |   |
D = | 1 | 2 | 3 | 4 |   |   |

and so on.

1. When 2 is copied to D, we discover 2 split inversions, (3,2) and (5,2).
2. When 4 is copied to D, we discover 1 split inversion, (5,4).
Claim. Consider x from B and y from C. The split inversions involving y are exactly the pairs
(x, y) where x is still in B (not yet copied to D) when y is copied to D, so their count is the
number of elements remaining in B at that moment. When an element x of B is copied to D, no
split inversion is discovered.
Proof.

- If x is copied to D before y, then x < y, so x and y do not form an inversion.
- If y is copied to D before x, then y < x, so x and y form a split inversion.

Pseudocode.
[D,Z] = CountSplitMerge(A,B,C,n)

- While merging the sorted arrays B and C, keep a running total Z of the number of split inversions.
- When an element of C is copied to the output D, increment Z by the number of elements
remaining in B, i.e. those at or to the right of the current position i in B.
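
The two pieces of pseudocode above can be combined into a runnable Python sketch. This is one possible rendering under stated assumptions, not a definitive implementation: the names `count_split_merge` and `count_sort` are illustrative, and the merge here takes only B and C (the A and n arguments of the pseudocode are not needed once the sorted halves are available). Ties are resolved in favour of B, which never matters for distinct inputs.

```python
def count_split_merge(b, c):
    """Merge sorted lists b and c into d, counting split inversions z."""
    d = []
    z = 0
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            # Element from B is copied: no split inversion discovered.
            d.append(b[i])
            i += 1
        else:
            # Element from C is copied: it is smaller than everything left in B.
            d.append(c[j])
            j += 1
            z += len(b) - i
    # One of the lists is exhausted; the leftovers add no further inversions.
    d.extend(b[i:])
    d.extend(c[j:])
    return d, z


def count_sort(a):
    """Return (sorted copy of a, number of inversions in a)."""
    n = len(a)
    if n <= 1:
        return list(a), 0
    mid = n // 2
    b, x = count_sort(a[:mid])      # sorted left half, left inversions
    c, y = count_sort(a[mid:])      # sorted right half, right inversions
    d, z = count_split_merge(b, c)  # merged result, split inversions
    return d, x + y + z


print(count_split_merge([1, 3, 5], [2, 4, 6]))  # ([1, 2, 3, 4, 5, 6], 3)
print(count_sort([6, 5, 4, 3, 2, 1]))           # ([1, 2, 3, 4, 5, 6], 15)
```

In the first call, copying 2 adds two inversions (3 and 5 remain in B) and copying 4 adds one (5 remains), matching the example traced earlier.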

Running Time.
T(n) = merge + count
     = n + n
     = O(n)



Total Running Time.
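Each call does O(n) work in CountSplitMerge on top of two recursive calls on halves, exactly as in MergeSort:
T(n) = 2T(n/2) + O(n)
     = O(n log n)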
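
For completeness, the O(n log n) bound can be checked by unrolling the MergeSort-style recurrence. The derivation below is a sketch that assumes n is a power of 2 and that the merge-and-count step costs at most cn for some constant c; both are simplifying assumptions not stated in the notes.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn, \qquad T(1) = c \\
     &= 4\,T(n/4) + 2cn \\
     &= \dots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{align*}
```

Each of the log_2 n levels of recursion contributes cn work in total, which is where the n log n term comes from.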