Outline
Introduction
Overview of Memcached
Modern High Performance Interconnects
Unified Communication Runtime (UCR)
Memcached Design using UCR
Performance Evaluation
Conclusion & Future Work
ICPP - 2011
Introduction
Tremendous increase in interest in interactive web-sites
(social networking, e-commerce etc.)
Dynamic data is stored in databases for future retrieval and
analysis
Database lookups are expensive
Memcached: a distributed memory caching layer,
implemented using traditional BSD sockets
Socket interface provides portability, but entails additional
processing and multiple message copies
High-Performance Computing (HPC) has adopted advanced
interconnects (e.g. InfiniBand, 10 Gigabit Ethernet/iWARP,
RoCE)
Low latency, High Bandwidth, Low CPU overhead
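As a minimal sketch of the caching layer described above (not code from the talk): a lookup consults an in-memory cache first and falls back to the expensive database only on a miss. The `CacheAside` class, the `slow_db_lookup` function, and the plain dict standing in for a Memcached server are hypothetical placeholders.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Look-aside cache: check the cache first, query the database on a miss."""

    def __init__(self, db_lookup: Callable[[str], Optional[str]]) -> None:
        self._db_lookup = db_lookup        # hypothetical expensive database lookup
        self._cache: Dict[str, str] = {}   # dict stand-in for a Memcached server
        self.db_calls = 0                  # counts how often the database is hit

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]        # cache hit: no database work
        self.db_calls += 1
        value = self._db_lookup(key)       # cache miss: pay for the lookup once
        if value is not None:
            self._cache[key] = value
        return value


def slow_db_lookup(key: str) -> Optional[str]:
    """Hypothetical database table used only for this illustration."""
    table = {"user:1": "alice", "user:2": "bob"}
    return table.get(key)


cache = CacheAside(slow_db_lookup)
first = cache.get("user:1")    # miss: goes to the database
second = cache.get("user:1")   # hit: served from the cache
```

In a real deployment the dict would be replaced by calls to a Memcached client over the network, which is where the cost of the sockets path matters.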
Memcached Overview
[Figure: deployment diagram. Components shown: Internet; Proxy Servers (Memcached Clients); Memcached Servers; Database Servers; two System Area Networks connecting the tiers]
[Figure: protocol stacks for modern high-performance interconnects. Layers shown: application interface (Sockets, IB Verbs); protocol implementation (kernel-space TCP/IP, Ethernet driver, IPoIB, hardware-offloaded TCP/IP, SDP, user-space RDMA); network adapter (1/10 GigE, 10 GigE, InfiniBand); network switch (Ethernet, 10 GigE, InfiniBand). Stacks labeled: 1/10 GigE, IPoIB, 10 GigE-TOE, SDP, IB Verbs]
Problem Statement
High-performance RDMA capable interconnects
have emerged in the scientific computation
domain
Applications using Memcached are still relying on
sockets
Performance of Memcached is critical to most of
its deployments
Can Memcached be re-designed from the ground
up to utilize RDMA capable networks?
Our Approach
[Figure: layered stack diagram. Components shown: Application (twice), UCR, Sockets, IB Verbs, 1/10 GigE Network]
[Figure: UCR active-message exchange between Origin and Target, in two panels. Labels shown: Header; Header + Data; Header Handler; Copy Data; RDMA Data; Completion Handler; Target Completion Handler; Set LComplFlag; Set RComplFlag; Set ComplFlag]
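The labels in the active-message figure (header handler, completion handler, completion flags) suggest a flow that can be sketched as a single-process simulation. This is an interpretation, not UCR's API: the `EAGER_LIMIT` threshold, the flag semantics given in the comments, and the names `Flags`, `Target`, and `send_active_message` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

EAGER_LIMIT = 8192  # assumed size threshold; the slides do not state one


@dataclass
class Flags:
    local_complete: bool = False    # "LComplFlag": origin buffer may be reused (assumed meaning)
    remote_complete: bool = False   # "RComplFlag": origin learns the target finished (assumed meaning)
    target_complete: bool = False   # "ComplFlag": set on the target side (assumed meaning)


class Target:
    def __init__(self) -> None:
        self.received: List[bytes] = []

    def header_handler(self, header: Dict[str, int]) -> bytearray:
        # Runs when the header arrives; returns the buffer the payload should land in.
        return bytearray(header["length"])

    def completion_handler(self, buf: bytearray) -> None:
        # Runs once the whole payload is in place at the target.
        self.received.append(bytes(buf))


def send_active_message(target: Target, payload: bytes, flags: Flags) -> str:
    header = {"length": len(payload)}
    if len(payload) <= EAGER_LIMIT:
        path = "eager"                      # header and data travel together
        flags.local_complete = True         # origin buffer is free right after the send
        buf = target.header_handler(header)
        buf[:] = payload                    # the "Copy Data" step
    else:
        path = "rdma"                       # header first, data moved separately
        buf = target.header_handler(header)
        buf[:] = payload                    # stands in for the "RDMA Data" transfer
        flags.local_complete = True         # origin buffer is free once the transfer ends
    target.completion_handler(buf)
    flags.target_complete = True
    flags.remote_complete = True
    return path


demo_flags = Flags()
demo_target = Target()
demo_path = send_active_message(demo_target, b"get key42", demo_flags)
```

The point of the sketch is the ordering: the header handler picks a destination buffer before the data lands, and the completion handler runs only after the data is in place.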
[Figure: Memcached design using UCR. Components shown: Master Thread; Sockets Worker Threads; Verbs Worker Threads; Sockets and RDMA Clients; Shared Data (Memory Slabs, Items)]
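The components in the design figure (a master thread, sockets and verbs worker threads, shared data) can be illustrated with a small threading sketch. The round-robin assignment, the pool size of two workers per transport, and the names `worker` and `run_master` are assumptions for illustration, not details taken from the slides; a lock-protected dict stands in for the shared slabs and items.

```python
import queue
import threading
from typing import Dict, List, Tuple

Request = Tuple[str, str, str]  # (transport, key, value); transport is "sockets" or "verbs"


def worker(name: str, inbox: queue.Queue, items: Dict[str, str],
           lock: threading.Lock, log: List[Tuple[str, str]]) -> None:
    while True:
        request = inbox.get()
        if request is None:           # sentinel: the master asks this worker to stop
            break
        key, value = request
        with lock:                    # every worker updates the same shared store
            items[key] = value
            log.append((name, key))


def run_master(requests: List[Request]):
    items: Dict[str, str] = {}        # stand-in for shared memory slabs and items
    lock = threading.Lock()
    log: List[Tuple[str, str]] = []
    pools = {t: [queue.Queue() for _ in range(2)] for t in ("sockets", "verbs")}
    threads = []
    for transport, inboxes in pools.items():
        for i, inbox in enumerate(inboxes):
            t = threading.Thread(target=worker,
                                 args=(f"{transport}-{i}", inbox, items, lock, log))
            t.start()
            threads.append(t)
    next_slot = {"sockets": 0, "verbs": 0}
    for transport, key, value in requests:
        inboxes = pools[transport]    # master picks the pool that matches the client type
        inboxes[next_slot[transport] % len(inboxes)].put((key, value))
        next_slot[transport] += 1     # assumed round-robin within the pool
    for inboxes in pools.values():
        for inbox in inboxes:
            inbox.put(None)
    for t in threads:
        t.join()
    return items, log


demo_items, demo_log = run_master([
    ("sockets", "a", "1"),
    ("verbs", "b", "2"),
    ("verbs", "c", "3"),
])
```

Both kinds of worker write to the same store, so a value set by a sockets client is visible to an RDMA client and vice versa.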
Experimental Setup
Used Two Clusters
Intel Clovertown
Each node has 8 processor cores on 2 Intel Xeon 2.33 GHz Quad-core CPUs, 6 GB main memory, 250 GB hard disk
Network: 1GigE, IPoIB, 10GigE TOE and IB (DDR)
Intel Westmere
Each node has 8 processor cores on 2 Intel Xeon 2.67 GHz Quad-core CPUs, 12 GB main memory, 160 GB hard disk
Network: 1GigE, IPoIB, and IB (QDR)
Memcached Software
Memcached Server: 1.4.5
Memcached Client: (libmemcached) 0.45
[Figures: latency plots. Axes: Time (us) vs. Message Size, with large-message panels spanning 4K to 512K. Series: SDP, IPoIB, OSU Design, 1G, 10G-TOE]
Memcached Latency (10% Set, 90% Get)
[Figure: two panels. Axes: Time (us) vs. Message Size. Series: SDP, IPoIB, OSU Design, 1G, 10G]
Memcached Latency (50% Set, 50% Get)
[Figure: two panels. Vertical axis: Time (us). Series: SDP, IPoIB, OSU Design, 1G, 10G-TOE]
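To make the mixed Set/Get workloads named in the slide titles concrete, here is a sketch of an operation-mix generator with a simple timing loop. It runs against a local dict rather than a Memcached server, so the latencies it reports are unrelated to the measured results; `build_mix`, `run_mix`, and the 16-key working set are hypothetical choices.

```python
import random
import time
from typing import Dict, List


def build_mix(total_ops: int, set_fraction: float, seed: int = 0) -> List[str]:
    """Return a shuffled list of 'set'/'get' operations with an exact Set share."""
    n_set = round(total_ops * set_fraction)
    ops = ["set"] * n_set + ["get"] * (total_ops - n_set)
    random.Random(seed).shuffle(ops)   # fixed seed keeps the order reproducible
    return ops


def run_mix(ops: List[str], value_size: int) -> float:
    """Run the operations against a dict stand-in; return mean latency in microseconds."""
    store: Dict[str, bytes] = {}
    payload = b"x" * value_size
    total = 0.0
    for i, op in enumerate(ops):
        key = f"key{i % 16}"           # assumed small working set
        start = time.perf_counter()
        if op == "set":
            store[key] = payload
        else:
            store.get(key)
        total += time.perf_counter() - start
    return total / len(ops) * 1e6


mix_10_90 = build_mix(100, 0.10)   # 10% Set, 90% Get
mix_50_50 = build_mix(100, 0.50)   # 50% Set, 50% Get
mean_us = run_mix(mix_10_90, value_size=4)
```

Swapping the dict for a real client would turn the same loop into a rough latency probe for each message size.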
[Figure: two panels comparing 8 Clients and 16 Clients. Series: SDP, 10G - TOE, OSU Design in one panel; SDP, IPoIB, OSU Design in the other. Vertical axes run to 700 and 2000]
[Figure: two panels comparing 8 Clients and 16 Clients. Series: SDP, 10G - TOE, OSU Design in one panel; SDP, IPoIB, OSU Design in the other. Vertical axes run to 350000 and 900000]
Thank You!
{jose, subramon, luom, zhanjmin, huangjia, rahmanmd, islamn, ouyangx, wangh, surs, panda}@cse.ohio-state.edu