
ANALYSIS OF SaaS MULTI-TENANT DATABASE IN A CLOUD ENVIRONMENT

Maram Hassan AlAlwan


Alalwan.maram@gmail.com

Soha S. Zaghloul
smekki@ksu.edu.sa

College of Computer and Information Science, Department of Computer Sciences, King Saud University, Riyadh, Saudi Arabia

ABSTRACT
Cloud computing has recently become a dominant field in information technology, in both academia and industry. Cloud Service Providers (CSPs) offer many services, such as storage, platforms, and applications. However, security remains the most critical concern impeding the dominance of cloud usage. Because the security issues of Software as a Service (SaaS) are under the end user's control, SaaS has become more common than other cloud service models. Its popularity also comes from its remote delivery of application functions over the Internet to subscribed users. Multi-tenancy is the main property of SaaS: it allows vendors to serve multiple requests and configurations through a single instance of the application. In this context, a customer is known as a "tenant". In the same way, a single database is shared amongst customers to store all tenants' data; this is known as a "multi-tenant database". This reduces operational and maintenance costs and offers more reliability. On the other hand, the occurrence of a problem affects all customers, and the risk of leaking information is the most undesirable situation in this architecture. In addition, multi-tenant databases are the most suitable architecture for data mining, which may increase the income of CSPs. From the tenants' perspective, this approach lacks flexibility. This paper explores the different implementation approaches used in multi-tenant databases and provides an analytical study of each of the presented approaches.

KEYWORDS
Cloud; SaaS; Multi-tenancy; Multi-tenant database; Chunk Table.

1 INTRODUCTION
In the traditional software model, customers buy the application they need and install it on their local computer or data center. Today, the cloud computing paradigm offers an alternative service model: Software as a Service (SaaS). In this model, the application functions are delivered remotely over the Internet on a subscription basis. SaaS customers are not required to install, maintain, or even manage the application; the Cloud Service Provider (CSP) is responsible for these jobs. By contrast, the IT departments in most organizations spend a lot of effort and money developing and managing their applications. Since cloud computing reduces effort and cost, most organizations tend to outsource their IT applications and devote themselves to other areas of commercial competition. Many SaaS applications are widely used in business, such as Customer Relationship Management (CRM), Supplier Relationship Management (SRM), Business Intelligence (BI), and academic and administrative resources (multiple educational institutions share the same database space to collaborate and benefit from each other) [1], [2]. However, this model works only for specific applications whose behavior is well defined and universally accepted.

The most important SaaS characteristic is multi-tenancy, where a single running instance of an application serves multiple requests from multiple customers; each customer is considered a tenant. All tenants use the same database to store their data. This leads to the use of a multi-tenant database architecture [3].

The rest of the paper is organized as follows: Section 2 explores the different degrees of isolation that might be used in creating a multi-tenant data architecture. Various implementation approaches for multi-tenant databases are presented in Section 3. Section 4 analyzes the implementation approaches presented in Section 3. Finally, Section 5 concludes the paper.

ISBN: 978-0-9853483-3-5 2013 SDIWC


2 MULTI-TENANT DATABASE ARCHITECTURE
Data is the most important asset for any business; it is also the heart of a SaaS application. Therefore, providing a secure and efficient database architecture has become one of the highest priorities for SaaS vendors. The aim is to gain the trust of tenants, who are concerned about surrendering their data to a place that is shared with other tenants and over which they have no control. Many approaches have been implemented to isolate each tenant's data in a database. The choice of approach depends on the complexity of the application and many other business considerations. Three main approaches are covered in the literature:
- Separate database: a separate database is assigned to each tenant for data storage. Each database contains some metadata used to redirect each tenant to the correct database. This approach is considered expensive in both implementation and maintenance.
- Shared database, separate schema: all tenants share the same physical database; however, the schema is different for each tenant. This approach is relatively simple to implement.
- Shared database, shared schema: all tenants share both the physical database and the schema. Tables are shared by all tenants, and customers' information is separated using primary keys that are specified in the database design. This approach is relatively economical because it supports a large number of tenants per database server [1].

Selecting the appropriate approach depends on different criteria. For example, the separate database approach is the appropriate solution for large organizations that need to store large amounts of data. The same approach is also suitable if security and legal requirements are of high concern. On the other hand, the shared database, shared schema approach is the appropriate solution for individual tenants who have small amounts of data to store. It is also the optimum solution for applications that change frequently [4].
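As an illustration of the shared database, shared schema approach described above, the following is a minimal sketch using Python's built-in sqlite3 module. The table name sales, the column name tenant_id, and the helper rows_for_tenant are assumptions made for this example; they do not come from the paper. The sketch only shows the idea of separating tenants' rows through a key that includes the tenant identifier.

```python
import sqlite3

# Shared database, shared schema: one table holds every tenant's rows.
# The tenant_id column (a hypothetical name) is part of the primary key
# and is the only thing that keeps the tenants' data apart.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "tenant_id TEXT NOT NULL, id INTEGER NOT NULL, name TEXT, "
    "PRIMARY KEY (tenant_id, id))"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", 1, "Sara"), ("A", 2, "Omar"), ("B", 1, "Lina")],
)


def rows_for_tenant(conn, tenant_id):
    """Return only the rows that belong to the given tenant."""
    cur = conn.execute(
        "SELECT id, name FROM sales WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return cur.fetchall()


print(rows_for_tenant(conn, "A"))
print(rows_for_tenant(conn, "B"))
```

Every query in such a design has to carry the tenant filter; forgetting it is one way the information leakage mentioned in the abstract could occur.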

3 IMPLEMENTING MULTI-TENANT DATABASES
To implement multi-tenancy in the database, most hosted services use a query transformation layer to map multiple single-tenant logical schemas in the application to one multi-tenant physical schema in the database. For normal workloads, the fundamental limitation of this approach is the number of tables the database can handle. In turn, the number of tables depends on the amount of available memory, on the design of the tables, and on whether they are developed in such a way that the tenant receives an efficient response to any query [5]. Various approaches are explored and implemented in the literature. The general idea is to map the logical source tables into fixed generic structures. The following sub-sections explain these approaches. Each explanation is followed by an example that illustrates how different tenants (Tenant A, Tenant B, and Tenant C) use the approach to configure one sales table, which stores information about their salespersons in a multi-tenant application.

3.1 Private Tables
The private table technique provides a high level of isolation and privacy among tenants. This is achieved by allowing each tenant to have its own private table in the database to satisfy its needs. Figure 1 illustrates the implementation of this approach.

Figure 1: Private Table Implementation
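The private table layout and the role of the query transformation layer can be sketched as follows. This is a deliberately naive illustration, not the implementation used by any hosted service: the naming scheme sales_A / sales_B and the function to_physical are assumptions, and the textual rename would also rewrite a column that happened to share the table's name.

```python
import re
import sqlite3


def to_physical(query, tenant_id, logical_table):
    """Naively rename a logical table to the tenant's private table.

    The <table>_<tenant> naming scheme is an assumption of this sketch.
    """
    if not re.fullmatch(r"\w+", tenant_id):
        raise ValueError("unexpected tenant id")
    physical = f"{logical_table}_{tenant_id}"
    # Replace whole-word occurrences of the logical table name only.
    return re.sub(rf"\b{re.escape(logical_table)}\b", physical, query)


conn = sqlite3.connect(":memory:")
# Each tenant owns a private table and may shape it differently.
conn.execute("CREATE TABLE sales_A (id INTEGER, name TEXT, region TEXT)")
conn.execute("CREATE TABLE sales_B (id INTEGER, name TEXT)")
conn.execute("INSERT INTO sales_A VALUES (1, 'Sara', 'East')")
conn.execute("INSERT INTO sales_B VALUES (1, 'Lina')")

logical_query = "SELECT name FROM sales WHERE id = 1"
for tenant in ("A", "B"):
    physical_query = to_physical(logical_query, tenant, "sales")
    print(physical_query, conn.execute(physical_query).fetchall())
```

The application keeps issuing the same logical query; only the table name changes per tenant, which is why the number of physical tables grows with the number of tenants.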


3.2 Universal Table
The universal table layout holds a large number of generic data columns with a flexible data type such as VARCHAR. The n-th column of a logical source table for each tenant is mapped to ColN in the universal table. In addition, two dedicated columns, Tenant_id and Table, are used: Tenant_id distinguishes tenants from each other, whereas Table identifies the specific table of a given tenant. Each tenant fills its columns with the needed data; the remaining columns that it does not use are filled with NULL values. Figure 2 illustrates the implementation of this approach.
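The mapping into a universal table can be sketched as below, assuming a width of four generic TEXT columns and a small metadata dictionary (LOGICAL_COLUMNS) that records each tenant's logical column names. The width, the metadata structure, and the column name tbl (used instead of Table to avoid a SQL keyword) are assumptions of this sketch.

```python
import sqlite3

N_COLS = 4  # width of the universal table in this sketch (an assumption)

conn = sqlite3.connect(":memory:")
generic = ", ".join(f"col{i} TEXT" for i in range(1, N_COLS + 1))
conn.execute(f"CREATE TABLE universal (tenant_id TEXT, tbl TEXT, {generic})")

# Hypothetical metadata: logical column names per (tenant, table).
LOGICAL_COLUMNS = {
    ("A", "sales"): ["id", "name", "region"],
    ("B", "sales"): ["id", "name"],
}


def insert_row(tenant_id, table, values):
    """Map the n-th logical value to colN and pad the rest with NULL."""
    names = LOGICAL_COLUMNS[(tenant_id, table)]
    if len(values) != len(names):
        raise ValueError("value count does not match the logical schema")
    padded = [str(v) for v in values] + [None] * (N_COLS - len(values))
    placeholders = ", ".join("?" * (N_COLS + 2))
    conn.execute(
        f"INSERT INTO universal VALUES ({placeholders})",
        [tenant_id, table, *padded],
    )


def read_rows(tenant_id, table):
    """Read back a tenant's logical rows; all values come back as text."""
    names = LOGICAL_COLUMNS[(tenant_id, table)]
    select = ", ".join(f"col{i}" for i in range(1, len(names) + 1))
    cur = conn.execute(
        f"SELECT {select} FROM universal "
        "WHERE tenant_id = ? AND tbl = ? ORDER BY rowid",
        (tenant_id, table),
    )
    return [dict(zip(names, row)) for row in cur]


insert_row("A", "sales", [1, "Sara", "East"])
insert_row("B", "sales", [1, "Lina"])
print(read_rows("A", "sales"))
print(read_rows("B", "sales"))
```

Because every generic column is text, the tenant's original data types are lost in storage, and the unused columns of narrower tenants hold NULL values.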

Figure 2: Universal Table Implementation

3.3 Extension Tables
In the Extension Table technique, the logical source table is partitioned into different tables. The attributes common to all tenants are stored in the base table, and each group of additional attributes shared by some tenants is stored in a separate extension table. These separate tables are joined together by adding a Tenant_id column and a row column; the latter represents the specific row in the logical source table. Multiple tenants can therefore share the base tables as well as the extension tables. Figure 3 illustrates the implementation of this approach.

Figure 3: Extension Table Implementation

3.4 Pivot Tables
In this technique, Pivot Tables are shared by all tenants. The source tables are mapped as follows:
- A separate table is created for each data type. For example, we might have two pivot tables: pivot_int to store integer values and pivot_str to store string values.
- Each table consists of five columns: Tenant_id identifies each tenant; the Table column identifies the specific table of that tenant; the col column determines the represented column in the logical source table; the row column determines the represented row in the logical source table; and the data column stores the values of the logical source table, according to their data type, in the designated pivot table.

In general, a new row in a pivot table is created for each field ("cell") in the logical source table. Figure 4 illustrates the implementation of the pivot table.

Figure 4: Pivot Table Implementation
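The pivot layout can be sketched with two tables, pivot_int and pivot_str, as named in the text. The column names tbl, col, row_no, and val and the helper store_row are assumptions of this sketch. The query at the end reconstructs Tenant A's three-column logical table, which needs one join per additional column, aligned on the row number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One pivot table per data type, each with the same five columns.
for suffix, sql_type in (("int", "INTEGER"), ("str", "TEXT")):
    conn.execute(
        f"CREATE TABLE pivot_{suffix} ("
        f"tenant_id TEXT, tbl TEXT, col INTEGER, row_no INTEGER, val {sql_type})"
    )


def store_row(tenant_id, table, row_no, values):
    """Store one logical row as one pivot row per field ("cell")."""
    for col_no, value in enumerate(values):
        target = "pivot_int" if isinstance(value, int) else "pivot_str"
        conn.execute(
            f"INSERT INTO {target} VALUES (?, ?, ?, ?, ?)",
            (tenant_id, table, col_no, row_no, value),
        )


store_row("A", "sales", 0, [1, "Sara", "East"])
store_row("A", "sales", 1, [2, "Omar", "West"])
store_row("B", "sales", 0, [1, "Lina"])

# Rebuild tenant A's logical table (id, name, region): two extra joins.
RECONSTRUCT_A = """
SELECT i.val, n.val, r.val
FROM pivot_int AS i
JOIN pivot_str AS n
  ON n.tenant_id = i.tenant_id AND n.tbl = i.tbl
 AND n.row_no = i.row_no AND n.col = 1
JOIN pivot_str AS r
  ON r.tenant_id = i.tenant_id AND r.tbl = i.tbl
 AND r.row_no = i.row_no AND r.col = 2
WHERE i.tenant_id = ? AND i.tbl = ? AND i.col = 0
ORDER BY i.row_no
"""
rows = conn.execute(RECONSTRUCT_A, ("A", "sales")).fetchall()
print(rows)
```

Eight cells produce eight pivot rows here, each carrying four metadata columns for a single value, which makes the metadata overhead of this layout visible.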


3.5 Chunk Table
The Chunk Table technique is similar to the Pivot Table approach; however, the two differ in two respects:
1. A Chunk Table has a set of data columns of multiple data types, with or without indexes.
2. The col column of the Pivot Table is replaced by the chunk column in the Chunk Table.

In the Chunk Table technique, the columns of the logical source tables are partitioned into groups according to their popularity. Each group is assigned a chunk ID and mapped to the appropriate Chunk Table. In fact, the Universal Table is an extreme form of chunking: only one chunk per row. At the other extreme, the Pivot Table has one row for each field, in contrast to one row for each chunk in the Chunk Tables. Figure 5 illustrates the implementation of the Chunk Tables.
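A Chunk Table can be sketched as follows, assuming a single chunk table whose data columns are one integer and one string (chunk_int_str). The table and column names, and the way the logical columns (id, name, quota, region) are split into two chunks, are assumptions of this sketch rather than the layout shown in Figure 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One chunk table with an (INTEGER, TEXT) pair of data columns.
conn.execute(
    "CREATE TABLE chunk_int_str ("
    "tenant_id TEXT, tbl TEXT, chunk INTEGER, row_no INTEGER, "
    "int1 INTEGER, str1 TEXT)"
)


def store_row(tenant_id, table, row_no, chunks):
    """Store one logical row as one physical row per chunk."""
    for chunk_id, (int_val, str_val) in enumerate(chunks):
        conn.execute(
            "INSERT INTO chunk_int_str VALUES (?, ?, ?, ?, ?, ?)",
            (tenant_id, table, chunk_id, row_no, int_val, str_val),
        )


# Logical columns (id, name, quota, region) split into two chunks:
# chunk 0 = (id, name), chunk 1 = (quota, region).
store_row("A", "sales", 0, [(1, "Sara"), (50, "East")])
store_row("A", "sales", 1, [(2, "Omar"), (70, "West")])

# Four logical columns are rebuilt with a single join between chunks.
RECONSTRUCT = """
SELECT c0.int1, c0.str1, c1.int1, c1.str1
FROM chunk_int_str AS c0
JOIN chunk_int_str AS c1
  ON c1.tenant_id = c0.tenant_id AND c1.tbl = c0.tbl
 AND c1.row_no = c0.row_no AND c1.chunk = 1
WHERE c0.tenant_id = ? AND c0.tbl = ? AND c0.chunk = 0
ORDER BY c0.row_no
"""
rows = conn.execute(RECONSTRUCT, ("A", "sales")).fetchall()
print(rows)
```

The same eight cells would need eight rows and three joins in a pivot layout; here they occupy four rows and need one join, at the cost of a more involved mapping from logical columns to chunks.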

Figure 5: Chunk Table Implementation

3.6 Chunk Folding Table
In the Chunk Folding Table technique, logical source tables are vertically partitioned into two kinds of tables. The first, the base table, stores the heavily used part of the logical source tables; in other words, the columns that are used the most by the tenants. The second stores the remaining columns, which are less frequently used by most of the tenants: these are the Chunk Tables. The Chunk Folding Table technique therefore mixes the ideas of the Extension Table and the Chunk Table. Figure 6 illustrates an implementation of this technique.

Figure 6: Chunk Folding Table Implementation

3.7 The XML Data Type/Document
The XML data type database extension technique combines relational database systems with the Extensible Markup Language (XML). The XML extension can be provided as a native XML data type, or by storing the XML document in the database as a Character Large Object (CLOB) or Binary Large Object (BLOB).

Figure 7: XML Data Type Implementation
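The CLOB variant of the XML extension can be sketched as below. SQLite has no native XML data type, so this example stores the XML document in a TEXT column as a stand-in for a CLOB; the table layout, the extension column, and the helpers to_xml and from_xml are assumptions of this sketch. Field names are used as XML tag names, so they must be valid XML names.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Shared columns plus one TEXT column holding a per-row XML document.
conn.execute(
    "CREATE TABLE sales (tenant_id TEXT, id INTEGER, name TEXT, extension TEXT)"
)


def to_xml(fields):
    """Serialize tenant-specific fields into a small XML document."""
    root = ET.Element("extension")
    for key, value in fields.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(document):
    """Parse the XML document back into a dict of text values."""
    return {child.tag: child.text for child in ET.fromstring(document)}


conn.execute(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    ("A", 1, "Sara", to_xml({"region": "East", "quota": 50})),
)
conn.execute(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    ("B", 1, "Lina", to_xml({"phone": "555-0100"})),
)

for tenant in ("A", "B"):
    name, doc = conn.execute(
        "SELECT name, extension FROM sales WHERE tenant_id = ?", (tenant,)
    ).fetchone()
    print(tenant, name, from_xml(doc))
```

Each tenant adds its own fields without any change to the relational schema, but the database cannot see inside the document in this variant, so filtering on an extension field has to happen in the application.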

The XML data type can be used in the creation of database tables, columns, and views, as well as variables and parameters. This technique satisfies most tenants' needs, because their data can be handled without changing the original relational database schema. The XML data type is supported by several relational database products [6]. Figure 7 illustrates the XML data type implementation.

4 ANALYSIS
This section analyzes the previously mentioned approaches.

4.1 Private Table
This technique is very simple to implement. In addition, it provides a high level of privacy. Moreover, the query transformation layer needs only to rename the tables in order to map each tenant to its own table. The main drawback of this approach is the large number of tables created, since each tenant has its own tables and no sharing is allowed. Therefore, this technique achieves good performance only for a small number of tenants.

4.2 Universal Table
In contrast to the Private Table technique, the Universal Table offers a high degree of consolidation with no extensibility. This approach is relatively easy to implement. In fact, it is considered a flexible approach because tenants can extend their tables as needed. The main drawback of this approach is that the rows need to be very wide, even for narrow source tables. As a result, the database has to handle many null values. Furthermore, indexes on the columns are not supported in this approach. The data type is also an important issue, since each tenant might have a different data type in its logical source table. Consequently, additional structures must be added to provide indexing in this technique.

4.3 Extension Table
This approach lies somewhere between the Private Table and the Universal Table. In other words, it provides better consolidation than the Private Table layout. At the same time, it outperforms the Universal Table in terms of extensibility. However, the number of tables increases proportionally with the number of tenants and the diversity of their business requirements. Another drawback of this approach is that an additional join operation takes place at run time when reconstructing the logical source tables.

4.4 Pivot Table
The Pivot Table technique provides better performance for the following reasons:
- It is a type-safe structure: a separate Pivot Table is created for each data type.
- It supports indexing efficiently: two Pivot Tables can be created for each type, one with indexes and one without, and each value is placed in one of these tables depending on whether it needs to be indexed.
- It eliminates the need to handle many null values.

The main drawback of this technique is that it has more columns of metadata than of actual data. Moreover, reconstructing the logical source table requires many join operations along the row column. As a result, interpreting the metadata leads to a much higher runtime overhead [7].

4.5 Chunk Table
Unlike the Pivot Table, this approach reduces the ratio of stored metadata to actual data. Therefore, there is less overhead for reconstructing the logical source tables. Unlike the Universal Table, this approach provides a well-defined way of adding indexes and reducing the number of columns. Moreover, this technique provides high flexibility, since the width of the Chunk Tables may vary. However, this flexibility adds more complexity to the query transformation layer.

4.6 Chunk Folding Table
The Chunk Folding technique achieves good performance by mapping the most heavily utilized parts of the logical schemas into the base tables; the remaining parts are mapped into Chunk Tables that match their structure as closely as possible. Furthermore, it does not put any limitations on consolidation or extensibility. However, it is not efficient with generic structures that use only a small and/or fixed number of tables. Chunk Folding tries to utilize the database's entire metadata as effectively as possible [7].

4.7 The XML Data Type/Document
Using the XML data type in the database extension technique adds many advantages to the database in terms of simplicity of implementation. It also provides the flexibility to have multiple data types with varying numbers of columns in each logical source table. Furthermore, the insertion and retrieval of XML data is fast; however, overall performance is affected by this data structure [6].

5 CONCLUSION
Multi-tenancy is the primary characteristic of SaaS applications. With SaaS, we need to use a multi-tenant database that all tenants (customers) share, while each tenant stores its own data. In this paper, the approach of multi-tenancy is explained in detail. Several implementation techniques for designing a multi-tenant database are presented. An analytical study of each technique is also presented: its advantages, disadvantages, and performance. In addition, the paper reveals the usability of each technique: when it is preferably used and under which circumstances.

6 REFERENCES
[1] Pippal, S., Sharma, V., Mishra, S., Kushwaha, D.S.: An Efficient Schema Shared Approach for Cloud Based Multitenant Database with Authentication and Authorization Framework. In International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, 213-218 (2011).
[2] Xu, J., Li, X., Zhao, X.: Design of Database Architecture in the SaaS-based Multi-tenant Educational Information System. In the 6th International Conference on Computer Science & Education, 114-119 (2011).
[3] Li, X., Shi, Y., Guo, Y., Ma, W.: Multi-tenancy Based Access Control in Cloud. In International Conference on Computational Intelligence and Software Engineering (CiSE) (2010).
[4] http://cloudcomputing.sys-con.com/node/1610582. Last accessed on 24/12/2012.
[5] Mateljan, V., Cisic, D., Ogrizovic, D.: Cloud Database-as-a-Service (DaaS) ROI. In MIPRO, Proceedings of the 33rd International Convention, 1185-1188 (2010).
[6] Yaish, H., Goyal, M., Feuerlicht, G.: An Elastic Multi-tenant Database Schema for Software as a Service. In the 9th IEEE International Conference on Dependable, Autonomic and Secure Computing, 737-743 (2011).
[7] Aulbach, S., Grust, T., Jacobs, D., Kemper, A., Rittinger, J.: Multi-Tenant Databases for Software as a Service: Schema-Mapping Techniques. In ACM SIGMOD Proceedings International Conference on Management of Data, 1195-1206 (2008).
