Merge Join Vs Hash Join Vs Nested Loop Join

by Muthukkumaran kaliyamoorthy Published on: October 5, 2011
Tags: Merge join Vs Hash join Vs Nested loop join, Physical join type
Categories:Performance, SQL party
This month's T-SQL Tuesday party is being hosted by Stuart R
Ainsworth. I am very glad to write my first blog post as a T-SQL
Tuesday post on my newly designed website.
SQL Server has three types of internal (physical) joins. Most folks
have probably never heard of these join types, because they are not
logical joins and are rarely written explicitly in code.
So when will each one be used?
Well, the answer is: it depends.
It depends on the record sets and the indexes. The query optimizer
tries to pick the most suitable physical join. As we know, the SQL
Server optimizer builds its plans on a cost basis, so it chooses the
join it estimates to be cheapest for the query.
How does the query optimizer choose the join type internally?
Well, there is an algorithm built into the query optimizer for
choosing the join type.
Let's go through some practical examples and summarize at the end.
First I will give a basic idea of how the joins work and when and how
the optimizer decides to use one of the internal (physical) joins.
It depends on the table size
It depends on the index on the join column
It depends on the sorted order of the join column
Test:
The tests were run with the following configuration.
RAM: 4 GB
Server: SQL Server 2008 (RTM)
-- Big tables: 100 copies of master.dbo.spt_values in each
create table tableA (id int identity, name varchar(50))
declare @i int
set @i = 0
while (@i < 100)
begin
    insert into tableA (name)
    select name from master.dbo.spt_values
    set @i = @i + 1
end
-- select COUNT(*) from dbo.tableA --250600
go
create table tableB (id int identity, name varchar(50))
declare @i int
set @i = 0
while (@i < 100)
begin
    insert into tableB (name)
    select name from master.dbo.spt_values
    set @i = @i + 1
end
-- select COUNT(*) from dbo.tableB --250600
select * from dbo.tableA A join tableB B
on (A.id = B.id)
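If you prefer not to open the graphical plan, one way to see which physical join was picked is to ask for the estimated plan as text. This is only a sketch: it assumes the tableA and tableB tables from this test and a login that is allowed to view plans.

```sql
-- Return the estimated plan as text instead of running the query.
-- SET SHOWPLAN_TEXT has to be the only statement in its batch,
-- which is why each step is separated by GO.
set showplan_text on
go
select * from dbo.tableA A join dbo.tableB B
on (A.id = B.id)
go
set showplan_text off
go
```

In the output, look for the join operator line: Hash Match, Merge Join, or Nested Loops.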
Test1: Without Index
Let's create a clustered index

create unique clustered index cx_tableA on tableA (id)
create unique clustered index cx_tableB on tableB (id)
Test1: With Index
If only one of the two tables is indexed, the optimizer goes for a
hash join. I haven't shown that plan here. You can drop the index on
either table and test it.
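The "only one table indexed" case can be tried with something like the following. It is a sketch that assumes the indexes were created under the names cx_tableA and cx_tableB used in this test; which side you drop is an arbitrary choice here.

```sql
-- Drop the clustered index on one side only, so that just tableA stays indexed.
drop index cx_tableB on dbo.tableB
go
-- Rerun the join and check the plan for the join operator.
select * from dbo.tableA A join dbo.tableB B
on (A.id = B.id)
go
-- Recreate the index afterwards to get back to the "both indexed" setup.
create unique clustered index cx_tableB on dbo.tableB (id)
go
```

The same pattern should work for the medium and small tables by swapping in their table and index names.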
Let's create a medium table
-- Medium tables: one copy of master.dbo.spt_values in each
create table tableC (id int identity, name varchar(50))
insert into tableC (name)
select name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableC --2506
create table tableD (id int identity, name varchar(50))
insert into tableD (name)
select name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableD --2506
select * from dbo.tableC C join tableD D
on (C.id = D.id)
create unique clustered index cx_tableC on tableC (id)
create unique clustered index cx_tableD on tableD (id)
Test2: With Index
If only one of the two tables is indexed, the optimizer goes for a
merge join. I haven't shown that plan here. You can drop the index on
either table and test it.
-- Small tables: 10 rows each
create table tableE (id int identity, name varchar(50))
insert into tableE (name)
select top 10 name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableE --10
create table tableF (id int identity, name varchar(50))
insert into tableF (name)
select top 10 name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableF --10
select * from dbo.tableE E join tableF F
on (E.id = F.id)
Let's create a clustered index
create unique clustered index cx_tableE on tableE (id)
create unique clustered index cx_tableF on tableF (id)
Test3: With Index
If only one of the two tables is indexed, the optimizer goes for a
nested loop join. I haven't shown that plan here. You can drop the
index on either table and test it.
You can also join tables of different sizes: big vs. medium, big vs.
small, and medium vs. small.
select * from dbo.tableA A join tableC C
on (A.id = C.id)
select * from dbo.tableA A join tableE E
on (A.id = E.id)
select * from dbo.tableC C join tableE E
on (C.id = E.id)
In this case, if both tables are indexed, the optimizer goes for a
nested loop join. If neither is, it goes for a hash join. If only one
of the tables is indexed, it goes for a nested loop join. I haven't
shown these plans here.
You can still force the optimizer to use a particular physical join
with a hint, but it is not good practice. The query optimizer is
smart and will usually choose the best one on its own.
Here I used the merge hint, so the optimizer goes for a merge join
instead of a hash join (Test1 without an index).
-- Query-level hint
select * from dbo.tableA A join tableB B
on (A.id = B.id) option (merge join)
-- Join-level hint
select * from dbo.tableA A inner merge join tableB B
on (A.id = B.id)
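The other two physical joins can be forced the same way, which makes it possible to compare all three on one query. The sketch below is an assumption about how you might run that comparison, not part of the original test. It uses the medium tables (tableC and tableD) so that the forced nested loop join stays quick.

```sql
-- Report I/O and timing for each statement in the Messages output.
set statistics io on
set statistics time on
go
-- Same query, three forced physical joins.
select * from dbo.tableC C join dbo.tableD D
on (C.id = D.id) option (hash join)
select * from dbo.tableC C join dbo.tableD D
on (C.id = D.id) option (merge join)
select * from dbo.tableC C join dbo.tableD D
on (C.id = D.id) option (loop join)
go
set statistics io off
set statistics time off
go
```

Comparing the logical reads and elapsed times of the three statements gives a rough idea of why the optimizer prefers one join over the others for a given table size and index setup.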
Table 1: Test uses a unique clustered index
From the table diagram:
If neither table has an index, the query optimizer will choose a
hash join internally.
If both tables have indexes, the query optimizer will choose a merge
join (for big tables) or a nested loop join (for small tables)
internally.
If only one of the tables has an index, the query optimizer will
choose a merge join (for medium tables), a hash join (for big
tables), or a nested loop join (for small tables and for big vs.
small tables) internally.
Table 2: Test using a non-unique clustered index
(create clustered index cx_tableA on tableA (id))
Table size               With index (both)   Without index (both)   Either table has index
Big (both)               HASH                HASH                   HASH
Medium (both)            HASH                HASH                   HASH
Small (both)             NESTED LOOP         NESTED LOOP            HASH
Big vs. small (medium)   HASH                HASH                   HASH
