
Multiple Sparse Tables Based on Pivot Table for Multi-Tenant Data Storage in SaaS
Wang Xue, Li Qingzhong and Kong Lanju
School of Computer Science and Technology
University of Shandong
Jinan, Shandong Province, China
wangxue0701@gmail.com
Abstract - To support SaaS applications well, a multi-tenant
database system must meet the tenants' requirements for isolation
and on-demand customization, and must therefore provide data storage
and indexing mechanisms that support isolation and flexibility.
Multiple Sparse Tables is a good approach for multi-tenant data
storage in SaaS, but it makes no use of the physical index provided
by the RDBMS. Based on the Multiple Sparse Tables approach, this
paper proposes a metadata-driven indexing mechanism. According to
the tenants' customization requirements, the model constructs the
respective index metadata for the tenants' business data and achieves
isolation and customization of index data; the index maintenance
strategies are also given. According to the tenants' access requests
and the Pivot Table, the model returns the tenants' result sets more
quickly or updates index data on demand. Detailed experimental
results show that index maintenance and data access with this
approach perform well under balanced conditions.
Index Terms - SaaS, multi-tenant, sparse table, pivot table, index
I. INTRODUCTION
In the SaaS business model, one application instance is leased by
multiple tenants; the multi-tenant data belongs to the same
application but is customized individually by different tenants [4].
The sparse table, an effective approach for multi-tenant databases,
stores all the tenants' data in the same wide table [1][3]. Since
different tenants have different schemas and different columns, the
wide table will always be sparse. For example, tenant 1 may need 150
columns for his or her data while tenant 2 may need only 50 columns.
When the data of the two tenants is stored together in one wide
table, the rows for tenant 2 will have at least 100 nulls. To solve
the single sparse table's problem of excessive nulls, [2][9] proposed
a multiple sparse tables approach, which stores the tenants' data on
demand in multiple sparse tables rather than in one.
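The null count in the example above follows from simple arithmetic. The sketch below is only an illustration: the function name `nulls_per_row` is hypothetical, and it assumes the shared wide table is sized exactly for the larger tenant schema (150 columns).

```python
def nulls_per_row(table_width: int, columns_used: int) -> int:
    """Unused (NULL) flex columns in one row of a shared wide table."""
    if columns_used > table_width:
        raise ValueError("tenant schema does not fit in the table")
    return table_width - columns_used


# Assumption: the single wide table has exactly as many flex columns
# as the widest tenant schema in the example (tenant 1, 150 columns).
WIDE_TABLE_COLUMNS = 150

tenant1_nulls = nulls_per_row(WIDE_TABLE_COLUMNS, 150)
tenant2_nulls = nulls_per_row(WIDE_TABLE_COLUMNS, 50)
print(tenant1_nulls, tenant2_nulls)
```

A wider shared table only increases these counts, which is why the text speaks of "at least" 100 nulls.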
This paper analyzes the problems that exist in the traditional
Multiple Sparse Tables model and proposes an index model called the
Pivot Table [6]. Because one sparse table stores data from different
tables of different tenants, a traditional index cannot be created on
the sparse tables. This paper therefore proposes the pivot table,
which serves as a logical index, gives its maintenance strategies,
and demonstrates the effectiveness of the model with detailed
experimental results.
The rest of this paper is structured as follows. Section 2 describes
the multiple sparse tables approach, Section 3 describes the Multiple
Sparse Tables based on the Pivot Table, Section 4 presents the
experiments, and Section 5 concludes.
II. TRADITIONAL APPROACH
A. Single Sparse Table
Force.com proposes a sparse table approach [1] in which all the
tenants' data is stored together in one wide table. A row of the wide
table stands for a row of any logical table owned by any tenant. The
wide table reserves many flex columns, which are used to store the
tenants' customized business data. Tenants customize table columns
freely, and the customization information is stored in the metadata
table; each customized column is mapped to a reserved flex column in
the wide table. A customized data entry is mapped through the logical
and storage layers and stored uniformly in the wide data table.
Customization for the different needs of different tenants leads to
schemas that vary in the number of columns, so storing all the
tenants' data in one sparse data table results in a large number of
null values. To address the different customization needs of
different tenants, [2] proposed a multiple sparse tables approach,
which stores the tenants' data on demand in multiple sparse tables
rather than in one.
B. Multiple Sparse Tables Approach
The shared data storage architecture of the sparse table prevents the
number of tables from growing rapidly as the number of tenants
increases, and tuple reconstruction is also relatively simple
[2][5][7].
In the traditional multiple sparse tables approach (as shown in
Fig.1), the tenants' data is distributed over a number of sparse
tables with different numbers of columns, called SparseDataTable1,
SparseDataTable2 to SparseDataTableN, which have graded column counts
(ColumnMax1, ColumnMax2 to ColumnMaxN) and serve as the data tables
(as shown in Fig.2). The MetaSparseTable stores the metadata of the
multiple sparse tables, including the table names, the number of
columns and so on. The RealTableName column, added to Tables,
indicates which sparse table actually stores the tenant's data.
Tables and Columns store the metadata of the tenants' logical schemas
(as shown in Fig.3).
Proceeding of the IEEE
International Conference on Information and Automation
Shenzhen, China June 2011
978-1-61284-4577-0270-9/11/$26.00 2011 IEEE
Fig.1 Traditional Multiple Sparse Tables (components: MetaData, Data)
SparseDataTable1(Tenantid [PK], TableName [PK], Column1, Column2, Column3, ..., ColumnMAX1)
SparseDataTable2(Tenantid [PK], TableName [PK], Column1, Column2, Column3, ..., ColumnMAX2)
...
SparseDataTableN(Tenantid [PK], TableName [PK], Column1, Column2, Column3, ..., ColumnMAXN)
Fig.2 Multiple Sparse Tables
MetaSparseTable(SparseTableName [PK], MaxColumn)
Tables(Tableid [PK], Tenantid, TableName, RealTableName)
Columns(Columnid [PK], Tenantid, Tableid, ColumnName, DataType, Length, Nullable, ValueNum, IsIndexed)
Fig.3 MetaData Tables
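How a tenant's logical table is assigned to one of the sparse tables can be sketched as follows. This is a minimal illustration under stated assumptions: the paper does not spell out the selection rule, so the sketch assumes the narrowest table that still fits is chosen. The function name `pick_real_table` and the table names are hypothetical; the widths 30, 80, 200 and 500 are the ones used in the experiments of Section IV.

```python
from bisect import bisect_left

# Rows of MetaSparseTable as (SparseTableName, MaxColumn), sorted by MaxColumn.
# Table names are illustrative; widths are taken from the experiment section.
META_SPARSE_TABLE = [
    ("SparseDataTable1", 30),
    ("SparseDataTable2", 80),
    ("SparseDataTable3", 200),
    ("SparseDataTable4", 500),
]


def pick_real_table(column_count: int) -> str:
    """Return the narrowest sparse table that can hold column_count columns.

    Assumption: 'narrowest fit' is the selection rule; the result is what
    would be recorded in Tables.RealTableName.
    """
    widths = [width for _, width in META_SPARSE_TABLE]
    position = bisect_left(widths, column_count)
    if position == len(widths):
        raise ValueError("no sparse table is wide enough")
    return META_SPARSE_TABLE[position][0]


print(pick_real_table(50))   # a 50-column schema
print(pick_real_table(150))  # a 150-column schema
```

Under this rule a 50-column schema wastes 30 flex columns per row instead of 450 in a single 500-column table.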
Traditional database systems rely on indexes to quickly locate the
table rows in which certain fields meet specific matching conditions.
However, creating a local index on the reserved flex columns of the
sparse tables is not practical, because one flex column may store
different columns and data types of different logical schemas.
Moreover, if indexes were created on the reserved columns according
to the needs of individual tenants, the index data space would be
very large and query efficiency would be very low. So, this paper
copies the index data synchronously into the Pivot Table, which
serves as a traditional database index.
III. MULTIPLE SPARSE TABLES BASED ON THE PIVOT TABLE
A. The Architecture
To solve the problems of the traditional sparse tables approach, this
paper extends it with an indexing mechanism based on a mapping table,
the Pivot Table (as shown in Fig.4). The mechanism includes index
metadata, which is stored in IndexMetaData, and index data, which is
stored in IndexData (as shown in Fig.5). The Pivot Table, which
serves as the index of the sparse tables, uses the physical index of
the DBMS to support query optimization.
Fig.4 Multiple Sparse Tables based on Pivot Table (components: MetaData, Data, Pivot Table)
IndexData(Tenantid [PK], Tableid [PK], Columnid [PK], GUID [PK], Value)
IndexMetaData(Tenantid [PK], Tableid [PK], Columnid, Index)
Fig.5 Pivot Table
The Metadata includes schema metadata and index metadata. The schema
metadata contains the tenants' logical schemas, while the index
metadata contains the tenants' index definitions. In this model, the
relational tables Tables, Columns and MetaSparseTable store the
schema metadata, and IndexMetaData stores the index metadata. The
metadata provides support for schema mapping: the users' requests are
transparently redirected to the sparse data tables through the
Metadata.
Tables stores the schema definitions, Columns stores the column
definitions of the schemas, and IndexMetaData stores the index
definitions.
Metadata = {Tables, Columns, MetaSparseTable, IndexMetaData}
Tables(Tableid, Tenantid, TableName, RealTableName)
Columns(Columnid, Tenantid, Tableid, ColumnName, DataType, Length, Nullable, ValueNum, IsIndexed)
MetaSparseTable(SparseTableName, MaxColumn)
IndexMetaData(Tenantid, Tableid, Columnid)
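The metadata relations above can be written down as table definitions. The following is a minimal sketch using SQLite through Python's `sqlite3` module, not the Oracle schema of the prototype. The column types, the composite key on IndexMetaData, the sample rows, the helper `physical_location`, and the reading of ValueNum as the position of the flex column (value1, value2, ...) are all assumptions made for illustration.

```python
import sqlite3

METADATA_DDL = """
CREATE TABLE Tables (
    Tableid       INTEGER PRIMARY KEY,
    Tenantid      INTEGER NOT NULL,
    TableName     TEXT    NOT NULL,
    RealTableName TEXT    NOT NULL
);
CREATE TABLE Columns (
    Columnid   INTEGER PRIMARY KEY,
    Tenantid   INTEGER NOT NULL,
    Tableid    INTEGER NOT NULL REFERENCES Tables(Tableid),
    ColumnName TEXT    NOT NULL,
    DataType   TEXT    NOT NULL,
    Length     INTEGER,
    Nullable   INTEGER NOT NULL DEFAULT 1,
    ValueNum   INTEGER NOT NULL,
    IsIndexed  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE MetaSparseTable (
    SparseTableName TEXT PRIMARY KEY,
    MaxColumn       INTEGER NOT NULL
);
CREATE TABLE IndexMetaData (
    Tenantid INTEGER NOT NULL,
    Tableid  INTEGER NOT NULL,
    Columnid INTEGER NOT NULL,
    PRIMARY KEY (Tenantid, Tableid, Columnid)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(METADATA_DDL)

# Hypothetical sample: tenant 7 defines a logical table "Customer" whose
# column "name" is stored in flex column value1 and is indexed.
conn.execute("INSERT INTO MetaSparseTable VALUES ('SparseDataTable1', 30)")
conn.execute("INSERT INTO Tables VALUES (1, 7, 'Customer', 'SparseDataTable1')")
conn.execute("INSERT INTO Columns VALUES (1, 7, 1, 'name', 'varchar', 100, 1, 1, 1)")
conn.execute("INSERT INTO IndexMetaData VALUES (7, 1, 1)")


def physical_location(tenant_id, table_name, column_name):
    """Map a tenant's logical column to (sparse table name, flex column number)."""
    return conn.execute(
        "SELECT t.RealTableName, c.ValueNum "
        "FROM Tables t JOIN Columns c "
        "  ON c.Tableid = t.Tableid AND c.Tenantid = t.Tenantid "
        "WHERE t.Tenantid = ? AND t.TableName = ? AND c.ColumnName = ?",
        (tenant_id, table_name, column_name),
    ).fetchone()


print(physical_location(7, "Customer", "name"))
```

The lookup returns `None` for a tenant that has not defined the table, which is one way the metadata layer keeps tenants' schemas isolated.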
The Data includes the business data tables and the index data table.
The business data tables store the tenants' business data in the
sparse tables, while the index data table stores the tenants' index
data. In the model, SparseDataTable1, SparseDataTable2, ...,
SparseDataTableN are the business data tables and IndexData is the
index data table, which serves as the index of the sparse data
tables.
Data = {SparseDataTable1, SparseDataTable2, ..., SparseDataTableN, IndexData}
SparseDataTable1(GUID, Tenantid, Tableid, value1, value2, value3, ..., valueMAX1)
SparseDataTable2(GUID, Tenantid, Tableid, value1, value2, value3, ..., valueMAX2)
...
SparseDataTableN(GUID, Tenantid, Tableid, value1, value2, value3, ..., valueMAXN)
IndexData(Tenantid, Tableid, Columnid, GUID, value)
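The data relations can be sketched in the same way. The example below generates the sparse tables and creates IndexData with an ordinary physical index, which is the point of the Pivot Table: the flex columns stay unindexed while the pivot rows are indexed by the DBMS. SQLite stands in for Oracle here; the helper names, the index name `idx_indexdata_value`, the TEXT column types and the choice of key columns are assumptions, and the four widths are those of the experiment section.

```python
import sqlite3

# Widths from the experiment section; table names are illustrative.
SPARSE_WIDTHS = {
    "SparseDataTable1": 30,
    "SparseDataTable2": 80,
    "SparseDataTable3": 200,
    "SparseDataTable4": 500,
}


def sparse_table_ddl(name: str, max_column: int) -> str:
    """Build CREATE TABLE for a sparse table with value1..value<max_column>."""
    flex = ", ".join(f"value{i} TEXT" for i in range(1, max_column + 1))
    return (
        f"CREATE TABLE {name} (GUID TEXT PRIMARY KEY, "
        f"Tenantid INTEGER NOT NULL, Tableid INTEGER NOT NULL, {flex})"
    )


INDEX_DATA_DDL = """
CREATE TABLE IndexData (
    Tenantid INTEGER NOT NULL,
    Tableid  INTEGER NOT NULL,
    Columnid INTEGER NOT NULL,
    GUID     TEXT    NOT NULL,
    value    TEXT,
    PRIMARY KEY (Tenantid, Tableid, Columnid, GUID)
);
-- Assumed physical index: lets the DBMS seek on a tenant's indexed value.
CREATE INDEX idx_indexdata_value
    ON IndexData (Tenantid, Tableid, Columnid, value);
"""

conn = sqlite3.connect(":memory:")
for table_name, width in SPARSE_WIDTHS.items():
    conn.execute(sparse_table_ddl(table_name, width))
conn.executescript(INDEX_DATA_DDL)


def column_count(table: str) -> int:
    """Number of columns SQLite reports for the given table."""
    return len(conn.execute(f"PRAGMA table_info({table})").fetchall())


print({name: column_count(name) for name in SPARSE_WIDTHS})
```

Each sparse table has three fixed columns (GUID, Tenantid, Tableid) plus its flex columns, so the reported counts are the widths plus three.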
B. Maintenance strategy of the Pivot Table
When a tenant customizes an index, the model stores the customization
in IndexMetaData, and the index data is stored, shared among tenants,
in IndexData according to IndexMetaData. When a tenant sends a
request for data and an index exists, the model responds more
quickly; otherwise, the model finds the eligible data by scanning the
data table. The maintenance algorithms of the Pivot Table are given
below.
Definitions:
Logical Operators Set L = {and, or}
Relational Operators Set R' = {=, <, >, and the other comparison operators}
Relational Table R = {A, B, C, D, ...}; A, B, C, D, ... are all
columns of R.
Alg.1 Query Algorithm
Input: select B from R where A = v1 and B = v2 or C = c1 and ...
Output: ResultSet
(1) get Tenantid and logical table R
(2) query Tables, get the physical table P
(3) where clause = "and" + where clause
(4) define the query condition on the index data e = null and the
    unindexed condition e' = null
(5) for every item "l r op k" in the where clause (l in L, r in R, op in R')
(6)     use Tenantid, R, r to query IndexMetaData
(7)     if result is null
            e' = e' l (r op k)
        else
            e = e l (r op k)
        endif
    end for
(8) if e is not null
        query IndexData according to e, get a set of GUIDs, then
        query P according to the set, Tables, Columns and e'
    else
        query P according to Tables, Columns and e'
    return ResultSet
In this algorithm, the physical table P of the tenant's logical table
is obtained first. The algorithm then iterates over the
sub-conditions in the where clause: if the column involved in a
sub-condition is indexed, the sub-condition is added to the query
condition e; otherwise it is appended to e'. Finally, if e is not
null, the IndexData table is queried by e to get a set of GUIDs, and
the sparse table is then queried according to that set and e' to get
the ResultSet; otherwise the sparse table is queried directly by e'
to get the ResultSet.
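The query procedure described above can be sketched in a few lines. This is an illustrative sketch, not the prototype's code: it uses SQLite instead of Oracle, supports only sub-conditions joined by "and" (the paper also allows "or"), and folds the GUID lookup into a subquery rather than materializing the set first. The function `query`, the index name `idx_pivot`, the toy rows and the mapping of ValueNum to flex column names are assumptions.

```python
import sqlite3

# Toy schema: one sparse table with three flex columns (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tables (Tableid INTEGER PRIMARY KEY, Tenantid INTEGER,
                     TableName TEXT, RealTableName TEXT);
CREATE TABLE Columns (Columnid INTEGER PRIMARY KEY, Tenantid INTEGER,
                      Tableid INTEGER, ColumnName TEXT, ValueNum INTEGER);
CREATE TABLE IndexMetaData (Tenantid INTEGER, Tableid INTEGER, Columnid INTEGER);
CREATE TABLE SparseDataTable1 (GUID TEXT PRIMARY KEY, Tenantid INTEGER,
                               Tableid INTEGER, value1 TEXT, value2 TEXT, value3 TEXT);
CREATE TABLE IndexData (Tenantid INTEGER, Tableid INTEGER, Columnid INTEGER,
                        GUID TEXT, value TEXT);
CREATE INDEX idx_pivot ON IndexData (Tenantid, Tableid, Columnid, value);
INSERT INTO Tables VALUES (1, 7, 'Customer', 'SparseDataTable1');
INSERT INTO Columns VALUES (1, 7, 1, 'name', 1), (2, 7, 1, 'city', 2);
INSERT INTO IndexMetaData VALUES (7, 1, 1);
INSERT INTO SparseDataTable1 VALUES
    ('g1', 7, 1, 'Ann', 'Jinan', NULL),
    ('g2', 7, 1, 'Bob', 'Jinan', NULL),
    ('g3', 7, 1, 'Ann', 'Beijing', NULL);
INSERT INTO IndexData VALUES
    (7, 1, 1, 'g1', 'Ann'), (7, 1, 1, 'g2', 'Bob'), (7, 1, 1, 'g3', 'Ann');
""")

OPS = {"=", "<", ">", "<=", ">=", "<>"}  # whitelist of relational operators
PIVOT_LOOKUP = ("GUID IN (SELECT GUID FROM IndexData WHERE Tenantid = ? "
                "AND Tableid = ? AND Columnid = ? AND value {op} ?)")


def query(tenant_id, table_name, select_col, conditions):
    """Run 'select select_col from table_name where c1 and c2 and ...'."""
    # Steps 1-2: map the logical table to its physical sparse table P.
    table_id, real_table = conn.execute(
        "SELECT Tableid, RealTableName FROM Tables "
        "WHERE Tenantid = ? AND TableName = ?", (tenant_id, table_name)).fetchone()
    cols = {name: (cid, f"value{num}") for cid, name, num in conn.execute(
        "SELECT Columnid, ColumnName, ValueNum FROM Columns "
        "WHERE Tenantid = ? AND Tableid = ?", (tenant_id, table_id))}
    indexed = {cid for (cid,) in conn.execute(
        "SELECT Columnid FROM IndexMetaData WHERE Tenantid = ? AND Tableid = ?",
        (tenant_id, table_id))}

    where, params = ["Tenantid = ?", "Tableid = ?"], [tenant_id, table_id]
    for col, op, val in conditions:
        if op not in OPS:
            raise ValueError(f"unsupported operator: {op}")
        col_id, flex = cols[col]
        if col_id in indexed:
            # Condition e: resolved to GUIDs through the pivot table.
            where.append(PIVOT_LOOKUP.format(op=op))
            params += [tenant_id, table_id, col_id, val]
        else:
            # Condition e': evaluated directly on the sparse table.
            where.append(f"{flex} {op} ?")
            params.append(val)
    sql = (f"SELECT {cols[select_col][1]} FROM {real_table} "
           f"WHERE {' AND '.join(where)} ORDER BY GUID")
    return [row[0] for row in conn.execute(sql, params)]


print(query(7, "Customer", "city", [("name", "=", "Ann")]))
```

In this toy data only `name` is listed in IndexMetaData, so a condition on `name` goes through IndexData while a condition on `city` is checked against the flex column of the sparse table.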
Alg.2 Insertion Algorithm
Input: insert into R(A, B, C) values (v1, v2, v3)
Output:
(1) get Tenantid and logical table R
(2) query Tables, get the physical table P
(3) update P according to Tables and Columns
(4) query IndexMetaData according to Tenantid, R, A, B, C;
    if an index exists
        update IndexData
The algorithm performs an additional update of the IndexData table on
top of the traditional insertion. First, the physical table P of the
tenant's logical table is obtained and updated; then the
IndexMetaData table is checked, and if an index exists on R, the
IndexData table is updated as well. Because the IndexData table has a
physical index, the additional work costs little time, which is a
worthwhile price for the gain in query efficiency.
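The insertion path can be sketched as follows. It is a minimal illustration under stated assumptions: SQLite replaces Oracle, the GUID is generated with `uuid.uuid4()` (the paper does not say how GUIDs are produced), both writes run in one transaction, and the function name `insert` and the sample metadata are hypothetical.

```python
import sqlite3
import uuid

# Toy schema and metadata (illustrative only); 'name' is the indexed column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tables (Tableid INTEGER PRIMARY KEY, Tenantid INTEGER,
                     TableName TEXT, RealTableName TEXT);
CREATE TABLE Columns (Columnid INTEGER PRIMARY KEY, Tenantid INTEGER,
                      Tableid INTEGER, ColumnName TEXT, ValueNum INTEGER);
CREATE TABLE IndexMetaData (Tenantid INTEGER, Tableid INTEGER, Columnid INTEGER);
CREATE TABLE SparseDataTable1 (GUID TEXT PRIMARY KEY, Tenantid INTEGER,
                               Tableid INTEGER, value1 TEXT, value2 TEXT, value3 TEXT);
CREATE TABLE IndexData (Tenantid INTEGER, Tableid INTEGER, Columnid INTEGER,
                        GUID TEXT, value TEXT);
INSERT INTO Tables VALUES (1, 7, 'Customer', 'SparseDataTable1');
INSERT INTO Columns VALUES (1, 7, 1, 'name', 1), (2, 7, 1, 'city', 2);
INSERT INTO IndexMetaData VALUES (7, 1, 1);
""")


def insert(tenant_id, table_name, row):
    """Insert one logical row (dict of column name to value); return its GUID."""
    if not row:
        raise ValueError("row must contain at least one column")
    # Steps 1-2: resolve the physical sparse table P and the column mapping.
    table_id, real_table = conn.execute(
        "SELECT Tableid, RealTableName FROM Tables "
        "WHERE Tenantid = ? AND TableName = ?", (tenant_id, table_name)).fetchone()
    cols = {name: (cid, f"value{num}") for cid, name, num in conn.execute(
        "SELECT Columnid, ColumnName, ValueNum FROM Columns "
        "WHERE Tenantid = ? AND Tableid = ?", (tenant_id, table_id))}
    indexed = {cid for (cid,) in conn.execute(
        "SELECT Columnid FROM IndexMetaData WHERE Tenantid = ? AND Tableid = ?",
        (tenant_id, table_id))}

    guid = uuid.uuid4().hex  # assumption: GUIDs are random UUIDs
    flex_cols = ", ".join(cols[name][1] for name in row)
    marks = ", ".join("?" for _ in row)
    with conn:  # commit both writes together, roll back on error
        # Step 3: write the row into the sparse table.
        conn.execute(
            f"INSERT INTO {real_table} (GUID, Tenantid, Tableid, {flex_cols}) "
            f"VALUES (?, ?, ?, {marks})",
            [guid, tenant_id, table_id, *row.values()])
        # Step 4: copy indexed column values into the pivot table.
        for name, value in row.items():
            col_id = cols[name][0]
            if col_id in indexed:
                conn.execute("INSERT INTO IndexData VALUES (?, ?, ?, ?, ?)",
                             (tenant_id, table_id, col_id, guid, value))
    return guid


new_guid = insert(7, "Customer", {"name": "Ann", "city": "Jinan"})
print(conn.execute("SELECT Columnid, value FROM IndexData").fetchall())
```

Only the value of the indexed column reaches IndexData; the unindexed `city` value is stored in the sparse table alone, which keeps the extra write cost proportional to the number of indexed columns.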
Alg.3 Update Algorithm
Input: update R set A = v1, B = v2, ... where C = v3 and D = v4 or ...
Output:
(1) get Tenantid and logical table R
(2) query Tables, get the physical table P
(3) where clause = "and" + where clause
(4) define the query condition on the index data e = null and the
    unindexed condition e' = null
(5) for every item "l r op k" in the where clause (l in L, r in R, op in R')
(6)     use Tenantid, R, r to query IndexMetaData
(7)     if result is null
            e' = e' l (r op k)
        else
            e = e l (r op k)
        endif
    end for
(8) if e is not null
        query IndexData according to e, get a set of GUIDs, then
        update IndexData according to the set, then update P
        according to the set, Tables, Columns and e'
    else
        update P according to Tables, Columns and e'
In this algorithm, the physical table P of the tenant's logical table
is obtained first. The algorithm then iterates over the
sub-conditions in the where clause: if the column involved in a
sub-condition is indexed, the sub-condition is added to the query
condition e; otherwise it is appended to e'. Finally, if e is not
null, the IndexData table is queried by e to get a set of GUIDs, and
the sparse table and the IndexData table are then updated according
to that set and e'; otherwise the sparse table is updated directly by
e'.
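The update path can be sketched along the same lines. The sketch makes several assumptions: SQLite instead of Oracle, sub-conditions joined by "and" only, target GUIDs collected first and then updated row by row, and IndexData touched only when a changed column is indexed. The function `update` and the toy rows are hypothetical.

```python
import sqlite3

# Toy schema, metadata and rows (illustrative only); 'name' is indexed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tables (Tableid INTEGER PRIMARY KEY, Tenantid INTEGER,
                     TableName TEXT, RealTableName TEXT);
CREATE TABLE Columns (Columnid INTEGER PRIMARY KEY, Tenantid INTEGER,
                      Tableid INTEGER, ColumnName TEXT, ValueNum INTEGER);
CREATE TABLE IndexMetaData (Tenantid INTEGER, Tableid INTEGER, Columnid INTEGER);
CREATE TABLE SparseDataTable1 (GUID TEXT PRIMARY KEY, Tenantid INTEGER,
                               Tableid INTEGER, value1 TEXT, value2 TEXT, value3 TEXT);
CREATE TABLE IndexData (Tenantid INTEGER, Tableid INTEGER, Columnid INTEGER,
                        GUID TEXT, value TEXT);
INSERT INTO Tables VALUES (1, 7, 'Customer', 'SparseDataTable1');
INSERT INTO Columns VALUES (1, 7, 1, 'name', 1), (2, 7, 1, 'city', 2);
INSERT INTO IndexMetaData VALUES (7, 1, 1);
INSERT INTO SparseDataTable1 VALUES
    ('g1', 7, 1, 'Ann', 'Jinan', NULL),
    ('g2', 7, 1, 'Bob', 'Jinan', NULL),
    ('g3', 7, 1, 'Ann', 'Beijing', NULL);
INSERT INTO IndexData VALUES
    (7, 1, 1, 'g1', 'Ann'), (7, 1, 1, 'g2', 'Bob'), (7, 1, 1, 'g3', 'Ann');
""")

OPS = {"=", "<", ">", "<=", ">=", "<>"}  # whitelist of relational operators
PIVOT_LOOKUP = ("GUID IN (SELECT GUID FROM IndexData WHERE Tenantid = ? "
                "AND Tableid = ? AND Columnid = ? AND value {op} ?)")


def update(tenant_id, table_name, changes, conditions):
    """Apply 'update table set changes where c1 and c2 ...'; return row count."""
    table_id, real_table = conn.execute(
        "SELECT Tableid, RealTableName FROM Tables "
        "WHERE Tenantid = ? AND TableName = ?", (tenant_id, table_name)).fetchone()
    cols = {name: (cid, f"value{num}") for cid, name, num in conn.execute(
        "SELECT Columnid, ColumnName, ValueNum FROM Columns "
        "WHERE Tenantid = ? AND Tableid = ?", (tenant_id, table_id))}
    indexed = {cid for (cid,) in conn.execute(
        "SELECT Columnid FROM IndexMetaData WHERE Tenantid = ? AND Tableid = ?",
        (tenant_id, table_id))}

    # Split the where clause into pivot lookups (e) and flex-column tests (e').
    where, params = ["Tenantid = ?", "Tableid = ?"], [tenant_id, table_id]
    for col, op, val in conditions:
        if op not in OPS:
            raise ValueError(f"unsupported operator: {op}")
        col_id, flex = cols[col]
        if col_id in indexed:
            where.append(PIVOT_LOOKUP.format(op=op))
            params += [tenant_id, table_id, col_id, val]
        else:
            where.append(f"{flex} {op} ?")
            params.append(val)
    guids = [g for (g,) in conn.execute(
        f"SELECT GUID FROM {real_table} WHERE {' AND '.join(where)}", params)]

    with conn:  # keep the sparse table and the pivot table consistent
        for guid in guids:
            for col, new_value in changes.items():
                col_id, flex = cols[col]
                conn.execute(f"UPDATE {real_table} SET {flex} = ? WHERE GUID = ?",
                             (new_value, guid))
                if col_id in indexed:
                    conn.execute(
                        "UPDATE IndexData SET value = ? WHERE Tenantid = ? "
                        "AND Tableid = ? AND Columnid = ? AND GUID = ?",
                        (new_value, tenant_id, table_id, col_id, guid))
    return len(guids)


changed = update(7, "Customer", {"name": "Anna"},
                 [("name", "=", "Ann"), ("city", "=", "Jinan")])
print(changed)
```

Collecting the GUIDs before writing means a condition on a column that is itself being changed is evaluated against the old values, as in an ordinary SQL update.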
IV. EXPERIMENTAL RESULTS AND ANALYSIS
We built a prototype system by adding the Pivot Table to the
traditional sparse tables approach on the relational DBMS Oracle 10g,
and then tested the efficiency of the approach proposed in this
paper.
Our multiple sparse tables setup has 4 sparse tables with 30, 80, 200
and 500 columns, and every column is of type varchar2(100). There are
50 tenants whose data is evenly distributed over the 4 sparse tables,
and each tenant has 10,000 rows in every sparse table on average. The
experimental results and analysis are given below:
(1) Each of the 50 tenants sends 100 query requests; the target
tables and the columns in the where clauses are evenly distributed
over the sparse tables. Fig.6 records the average execution time of
the traditional multiple sparse tables approach and of the one with
the Pivot Table.
Fig.6 Average Query Time (series: Traditional, PivotTable)
Fig.6 shows that queries with the Pivot Table approach are faster
than with the traditional approach.
(2) Each of the 50 tenants sends 100 update requests; the target
tables and the columns in the where clauses are evenly distributed
over the sparse tables. Fig.7 records the average execution time of
the traditional multiple sparse tables approach and of the one with
the Pivot Table.
Fig.7 Average Update Time (series: Traditional, PivotTable)
Fig.7 shows that the Pivot Table approach takes slightly more time
than the traditional approach, but the small extra cost is worthwhile
given the much quicker query response.
V. CONCLUSION
The multiple sparse tables approach improves on the single sparse
table, saving space and improving efficiency, but it does not use the
physical index of the DBMS.
We propose the Pivot Table approach based on the traditional multiple
sparse tables. It makes full use of both the multiple sparse tables
and the physical index, speeds up queries, and has little impact on
update efficiency. As the experimental results show, the proposed
approach makes the multiple sparse tables more efficient.
ACKNOWLEDGMENT
This work is supported by the National Key Technologies R&D Program
under Grant No.2009BAH44B02, the National Natural Science Foundation
of China under Grant No.90818001, the Natural Science Foundation of
Shandong Province of China under Grant No.2009ZRB019YT and
No.2009ZRB019RW, the Key Technology R&D Program of Shandong Province
under Grant No.2009GG10001002, and the Independent Innovation
Foundation of Shandong University under Grant No.2009TS030.
REFERENCES
[1] Craig D. Weissman, The Design of the Force.com Multitenant Internet
Application Development Platform, SIGMOD'09, 2009.
[2] Chen Weiliang, Zhang Shidong, Kong Lanju, A multiple sparse tables
approach for multi-tenant data storage in SaaS, Industrial and Information
Systems (IIS), 2010 2nd International Conference.
[3] Eric Chu, Jennifer Beckmann, Jeffrey Naughton, The Case for a Wide-
Table Approach to Manage Sparse Relational Data Sets, SIGMOD'07,
June 11-14, 2007, Beijing, China.
[4] Stefan Aulbach, Torsten Grust, Dean Jacobs, Multi-Tenant Databases
for Software as a Service: Schema-Mapping Techniques, SIGMOD'08,
June 9-12, 2008, Vancouver, BC, Canada.
[5] J. L. Beckmann, A. Halverson, R. Krishnamurthy, and J. F. Naughton,
Extending RDBMSs to support sparse datasets using an interpreted
attribute storage format, In Proc. of ICDE, 2006.
[6] C. Cunningham, G. Graefe, and C. A. Galindo-Legaria, Pivot and
unpivot: Optimization and execution strategies in an RDBMS, In
VLDB, pages 998-1009, 2004.
[7] Rakesh Agrawal, Amit Somani, Yirong Xu, Storage and querying of
e-commerce data, In Proc. of VLDB, pages 149-158, 2001.