
Introduction

Choosing a database for a management platform is a recurring dilemma. Most firms choose either a
relational database or a non-relational database. Both store data much as a library stores books,
but they differ in how the data is stored, and also in how it is processed and manipulated.

In a relational database, data is stored mainly in rows and columns, which together form a table.
The connections between tables are known as relations, and related tables are linked to each other
through foreign keys. There are three types of relation: one-to-one, one-to-many and many-to-many.
In one-to-one and one-to-many relations, a foreign key in one table refers to the primary key of
the other, whereas a many-to-many relation is implemented with a pivot table. The pivot table holds
a foreign key for each of the two entities, and each foreign key refers to the primary key of its
respective table.
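
The one-to-many and many-to-many relations described above can be sketched as follows. This is only a minimal illustration: it uses Python's built-in sqlite3 module as a stand-in for a relational database, and the table names (customer, bike, mechanic, bike_mechanic) and sample rows are assumptions made for this example, not the schema of the actual system.

```python
import sqlite3

# In-memory SQLite database used as a stand-in relational database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- One-to-many: each bike row holds a foreign key to its owner's primary key.
CREATE TABLE bike (
    id INTEGER PRIMARY KEY,
    model TEXT NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);

CREATE TABLE mechanic (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Many-to-many: the pivot table holds one foreign key per related entity.
CREATE TABLE bike_mechanic (
    bike_id INTEGER NOT NULL REFERENCES bike(id),
    mechanic_id INTEGER NOT NULL REFERENCES mechanic(id),
    PRIMARY KEY (bike_id, mechanic_id)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.executemany("INSERT INTO bike VALUES (?, ?, ?)",
                 [(1, "Roadster", 1), (2, "Cruiser", 1)])
conn.executemany("INSERT INTO mechanic VALUES (?, ?)", [(1, "Ram"), (2, "Sita")])
conn.executemany("INSERT INTO bike_mechanic VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])


def mechanics_for_bike(bike_id):
    """Return the names of all mechanics linked to a bike through the pivot table."""
    rows = conn.execute(
        "SELECT m.name FROM mechanic m "
        "JOIN bike_mechanic bm ON bm.mechanic_id = m.id "
        "WHERE bm.bike_id = ? ORDER BY m.name",
        (bike_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With foreign keys enabled, inserting a bike whose customer_id does not exist is rejected, which is the integrity guarantee that the relations provide.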

In a non-relational database, there is no relational schema. Such databases use a different
architecture with no fixed schema, and they do not store data in the traditional form of rows and
columns. Instead, each uses a storage model optimized for the specific data types it is required to
handle. For example, a document-based non-relational database such as MongoDB stores data as
JSON-like documents. Non-relational databases can be grouped by the way they store data as follows:
1. Document-based database
2. Columnar database
3. Key/value-based database
4. Graph-based database
5. Time-series-based database
6. Object-based database
7. External-index-based database [ CITATION Mic18 \l 1033 ]
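
The document model from the first item of this list can be illustrated with a small sketch. It does not use MongoDB itself: a collection is modeled as a plain Python list of dictionaries, and the field names and values are hypothetical. The point is only that two documents in the same collection need not share the same fields and that related data can be nested inside a document.

```python
import json

# A "collection" modeled as a list of dicts (assumption: stand-in for a document store).
customers = [
    {
        "_id": 1,
        "name": "Asha",
        # Related data is nested inside the document instead of living in another table.
        "bikes": [{"model": "Roadster", "services": ["oil change"]}],
    },
    # No fixed schema: this document has a phone field and no bikes field at all.
    {"_id": 2, "name": "Bikram", "phone": "555-0100"},
]


def bike_models(document):
    """Return the bike models stored in a customer document, if any."""
    return [bike["model"] for bike in document.get("bikes", [])]


# Documents serialize directly to JSON text.
serialized = json.dumps(customers[0])
```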

Customers usually want their bikes serviced as soon as possible. But with the huge number of bikes
in cities, bike service centers are always busy. A customer has no way of knowing how busy the
service center is and may end up spending a lot of time there. With an online information
application, the user can see how busy the service center is and also learn what servicing will be
done on the bike. The application helps the user save time by planning the bike service in advance.
Figure 1 Bike Service Management Features

This online application has to store important information about the user and his bike. With
thousands of users, the data structure varies and starts to get complicated. The data also includes
sensitive information about the user, which would be a problem if security were breached. In
addition to this complexity, the load on the database grows as more data is manipulated and as
searches run over a massive amount of data. So, after a proper comparative analysis of relational
and non-relational databases, the preferred database is used to solve the problem of the service
center and its customers.
Keywords
Database, Relational Database, Non-relational Database, MongoDB, MySQL, Web Application,
Bike Service Center

Aims
In this dissertation, I aim to find which database system, SQL or NoSQL, is the more effective and
appropriate for an automobile service management system in terms of performance, scalability and
cost-effectiveness.

Objective
In this dissertation, I propose the following objectives:
• To understand the advantages and disadvantages of different kinds of databases, and to
choose the best data infrastructure for an online management system.
• To test and compare the scalability, performance and cost of different databases.
• To deploy the appropriate database in the online management system after the comparison.
History
The storage of data in libraries, governments and schools goes back to before computers were even
created. Once people realized how important it was to store data, they started researching ways of
storing data in files, indexing it and maintaining it, so that it could be retrieved in future if
needed. With the development of computers around the mid-twentieth century, there was also rapid
development in the field of databases. Computers made data storage easier and more cost-effective,
and maintenance was easy compared to physical files. The data also took less space to store
[ CITATION DrT12 \l 1033 ].
The first database system, the Integrated Data Store (IDS), was developed in the early 1960s by
Charles W. Bachman. Later, IBM created its own database system, known as IMS. As computers
developed and grew in popularity, they were used for many business purposes, which made databases
useful for general purposes too. Because of this, it was important to develop a common standard for
databases. So the Database Task Group, formed within CODASYL (the group responsible for the design
and standardization of the COBOL language), took responsibility for this standard, which became
known as the “CODASYL Approach” in 1971 [CITATION 19No \l 1033 ].
But it was not popular in the market because of its complexity. Edgar F. Codd then proposed a new
way of storing data: in tables with fixed-length records rather than in a free-form list. The
Interactive Graphics and Retrieval System (INGRES) implemented his idea and showed it to be an
effective way of storing data. INGRES worked with a query language called QUEL. Later on, QUEL was
replaced by SQL, as it was more advanced and functional than QUEL, and the implementation of the
relational idea gave rise to the relational database management system (RDBMS)
[ CITATION Kei17 \l 1033 ].
The term NoSQL was first used by Carlo Strozzi in 1998 as the name of his lightweight, open-source
database, which did not use SQL. NoSQL basically means a system that does not use SQL, although
some NoSQL databases still use or support SQL. These databases do not follow the rigid conventions
of relational databases, such as tabular structure and relations between tables. The traditional
(relational) database has a complex structured query language, and its performance degrades as the
data grows in volume. NoSQL uses a distributed database model, which means a database spread across
multiple environments or computers. It was first used in response to web data, that is, for
processing huge volumes of unstructured data [ CITATION Kei17 \l 1033 ].
Problem Statement
An enterprise always faces a dilemma when choosing the storage for its data. Database preferences
differ from one enterprise to another. Some enterprises may need huge storage and be willing to
compromise on data processing efficiency, whereas others may need only limited storage but good
data processing performance.
There are numerous factors that should be considered before choosing an appropriate database for
the management system. Some of them are as follows: [ CITATION And18 \l 1033 ]
• Which data model works best for the system, that is, relational data or unstructured data.
• With a huge amount of data collected, it is important to choose a database that keeps the data
as consistent as needed.
• Consider a database that keeps data available, with proper backup and restoration facilities
and as little downtime as possible.
• Choose a database that has the preferred level of data encryption and proper security, giving
access only to authorized personnel.
• Check whether the database is flexible enough to be extended in future and easy to integrate
with the IT infrastructure for a smooth workflow.
• It is very important to choose a database that has a proper response time regardless of the
amount of data flowing through it.
• Consider a database that is user friendly, as it is used by different authorized personnel
such as administrators and database admins.
• Another important factor is the difficulty and cost of implementing the database and of
servicing it in future.
Figure 2 Confusion on choosing database

As the image above shows, a project manager is confused about whether to use a relational or a
non-relational database. The database expert suggests that the project manager consider the
various factors that matter for the system and choose the database accordingly.

Figure 3 Solutions for choosing database

As the figure above shows, the database expert provides the factors that should be considered
before choosing the database for the system, that is, the factors mentioned above. With these, the
project manager can now decide on a database by comparing the needs of the current system with the
features available in each database.
Motivation for research
The environment around a bike service center is always busy. Many customers come to get their
bikes serviced but always end up in a queue, wasting their time. Even if they leave the bike at the
service center and come back to pick it up when servicing is done, they tend to be less satisfied
with the service. So most of them prefer to wait in the queue, which wastes even more time.
But if there is an online application, the user can make an appointment online and take the bike
to the service center at the appointed time. This makes it easier for users to see how the
servicing is done on their bikes, which improves customer satisfaction. It also saves the
customer's time by avoiding long queues. And if users are satisfied with the work of a specific
mechanic, they can book an appointment with the same mechanic. This improves customer
satisfaction, which in turn attracts more customers. Employees are also motivated when they get
more appointments. And if there is a reward system for this, it motivates other employees as well,
which leads the company towards success.
When it comes to the storage of data, there is always confusion about which database to choose, as
there are relational and non-relational databases as well as the recently developed hybrid
databases called NewSQL. With the variety of databases available nowadays, each has its own
advantages and disadvantages. A relational database is a traditional database in which information
is stored with relations between entities, which makes the information easier to find, whereas a
non-relational database departs from this principle and gains an advantage by saving data as
key-value pairs or documents. It is therefore important to evaluate the requirements of the system
and choose the database that suits it in terms of scalability, security, performance and
cost-effectiveness.
Because of this confusion, I propose to compare and analyze two different kinds of database,
relational and non-relational, and to choose the one that is suitable for a bike service management
system. For that, the scalability and performance of these databases will be tested, and they will
also be compared on the basis of cost. A bike service center is bound to store quite a large amount
of data, so it is important to choose the database that fulfills the requirements of this system.
Research Question
For this dissertation, the research question is as follows:

“Which database is better for the management system in terms of scalability, efficiency,
security and cost-effectiveness?”
Ethical Consideration
Ethical consideration is a major element of a dissertation. It aims not to disclose any information
about users without their permission and not to harm anyone in any way. Users should be aware of
where the gathered data is being used. The work should be transparent to the participants, and the
answers collected should not be biased.
The private information of firms and users should not be disclosed in any way. In this
dissertation, I asked the firms from which I collected data for permission to include it in my
dissertation, where it could become public. With their approval, I published the information
gathered through the questionnaire. Also, I did not ask for the name or email of any participant.
This was done to maintain the privacy of the participants who answered the questionnaire and of the
one whom I interviewed. No offensive, discriminatory or inappropriate words were used during the
primary research, such as the questionnaire for database experts. Questions that were personal or
unconnected to the objectives were not asked, maintaining high objectivity. If there were any
questions that participants did not want to answer, no pressure was applied. Moreover, authors are
cited for their work.
Literature Review
Secondary Research
Facebook, one of the biggest firms these days, has a huge amount of data in its possession.
Managing data of such volume is a hassle, and storing it is close to impossible. To overcome this
problem, Facebook uses both relational and non-relational databases. It uses MySQL as the primary
database to store structured data such as user information, while HBase, known as HydraBase, is
used for chat, email and SMS. In addition, Cassandra, Haystack and Memcached are used for other
areas such as photos and caching [ CITATION gan19 \l 1033 ].

Amazon Web Services (AWS) offers both relational and non-relational databases, with different types
of database for different types of application. According to the AWS overview table, relational
databases such as Amazon RDS or Aurora are used for traditional applications, CRM and e-commerce.
Key-value databases such as Amazon DynamoDB are used for high-traffic e-commerce systems and gaming
applications. In-memory databases such as Amazon ElastiCache for Memcached and Amazon ElastiCache
for Redis are used for caching, session management and gaming leaderboards. Document-based
databases such as Amazon DocumentDB are used for content management, catalogues and user profiles.
Graph-based databases such as Amazon Neptune are used for fraud detection and social networking
[ CITATION 19No1 \l 1033 ].

In their discussion, Ken Ashcraft and Alfred Fuller compared SQL and NoSQL, as both are available
on Google App Engine. They compared them on the basis of queries, transactions, consistency,
scalability, management and schema. Taking Datastore as the non-relational database and Cloud SQL
as the relational database, they concluded that Cloud SQL has the advantage in queries,
transactions and consistency, whereas Datastore has the advantage in scalability, management of
data records and schema. [ CITATION Ken12 \l 1033 ]
This result shows that both databases have advantages and disadvantages, and that neither can cover
every area. That is why it is important to prioritize the needs of a system, decide which criteria
matter most and which can be compromised, and choose the database accordingly.

According to Alon Brody, there are many aspects in which both kinds of database are needed, and
some firms use both in different areas. The author explains why and when to use a relational or a
non-relational database, using the example of CRM applications. He compares the two on the basis of
scalability and indexing, covering their basic functionality and the distinct differences between
them. On scalability, the non-relational database proved better, as it supports replication and
sharding. Sharding is an architecture pattern related to horizontal partitioning, in which the rows
of a table are separated into different partitions. The data is broken up into two or more chunks,
known as logical shards. The logical shards are then distributed across different database nodes,
known as physical shards, each of which can hold multiple logical shards
[ CITATION Mar19 \l 1033 ]. As for indexing, its purpose is the same in both databases, but how it
works differs because their layout and architecture are different. The comparison shows that both
databases perform quite well with indexes. And in the case of the CRM application, it shows that a
relational database removes the security and technical access challenges, whereas a non-relational
database is used for data mining and for analyzing huge amounts of data
[ CITATION Alo17 \l 1033 ].
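
The relation between logical and physical shards described above can be sketched in a few lines. This is an illustrative sketch only: the number of logical shards, the node names and the hash-based routing rule are assumptions chosen for the example, and real databases each use their own sharding strategy.

```python
import hashlib

NUM_LOGICAL_SHARDS = 8           # hypothetical number of logical shards
NODES = ["node-a", "node-b"]     # hypothetical physical shards (database nodes)


def logical_shard(key: str) -> int:
    """Map a record key to a logical shard using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_LOGICAL_SHARDS


def physical_node(shard: int) -> str:
    """Assign a logical shard to a node; each node holds several logical shards."""
    return NODES[shard % len(NODES)]


def route(key: str) -> str:
    """Return the node that stores the record with the given key."""
    return physical_node(logical_shard(key))
```

Because the hash is stable, the same key is always routed to the same node, and the eight logical shards are spread evenly over the two nodes.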
In his paper on Scalable SQL and NoSQL Data Stores, Rick Cattell gives some points in support of
relational database scalability, along with counter-arguments in support of non-relational
databases. Relational databases have dominated the market, as they provide ACID (Atomicity,
Consistency, Isolation and Durability) properties along with good performance. Scaling a relational
database is difficult but not impossible, whereas a non-relational database such as Google's
Bigtable is scalable [ CITATION Bry18 \l 1033 ]. With its rigid schema, a relational database is
not very flexible but gives the advantage of data type integrity, whereas non-relational databases
are highly scalable because of newer architecture such as sharding and a flexible schema
[ CITATION Ric11 \l 1033 ].

Victoria Malaya compared indexing in relational and non-relational databases, using MS SQL Server
and MongoDB to show how indexes work in each. Indexing is a way of optimizing the overall
performance of a database by minimizing the number of disk accesses needed when a query is
executed [ CITATION Avn16 \l 1033 ]. There are three types of indexing: primary index, secondary
index and clustering index [ CITATION tut192 \l 1033 ]. Both databases support indexing, but MS SQL
Server has no index by default, whereas MongoDB does. In MS SQL Server, an index is supported on
any column of a table. In MongoDB, an index is supported on any field or sub-field of a collection.
The index data structure in both MS SQL Server and MongoDB is the B-tree, the most used multilevel
data structure for indexing [ CITATION Gur19 \l 1033 ]. In MS SQL Server, an index points to a heap
or a clustered index, whereas in MongoDB indexes use memory-mapped files. Both databases have
non-clustered indexes, but only MS SQL Server has a clustered index. MS SQL Server lacks a
predefined index, whereas MongoDB creates a unique index on the _id field that cannot be deleted.
The maximum size of an index key is 900 bytes in MS SQL Server and 1024 bytes in MongoDB
[ CITATION Bas13 \l 1033 ]. MS SQL Server supports up to 16 columns per index key, and MongoDB
supports up to 31 fields per index key. MongoDB has a query optimizer that selects an index for a
specific operation, whereas the query optimizer in MS SQL Server chooses the most efficient way to
read the data, either a full table scan or a scan of one or more indexes. Both databases support
unique indexes, which prevent the entry of duplicate records [ CITATION Vic13 \l 1033 ].
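
The effect of an index on how a query is executed can be observed with a small experiment. The sketch below is not MS SQL Server or MongoDB: it uses Python's built-in sqlite3 module as a stand-in, and the table, column and index names are assumptions made for this example. It asks the query planner how it would run the same query before and after a unique index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO user (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the planner's description of how SQLite would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


query = "SELECT name FROM user WHERE email = ?"

# Without an index on email, the planner has to scan the whole table.
before = plan(query, ("user500@example.com",))

# A unique index speeds up lookups and also rejects duplicate email values.
conn.execute("CREATE UNIQUE INDEX idx_user_email ON user (email)")
after = plan(query, ("user500@example.com",))
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan of the table and the second reports a search that uses idx_user_email.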

According to Tim O’Reilly, Web 2.0 is the business revolution in the computer industry caused by
the move to the internet as a platform, and an attempt to understand the rules for success on that
new platform [ CITATION Lip16 \l 1033 ]. This new, or rather improved, form of the web has new
features and functionality. Some examples are blogs, where users can post thoughts and updates,
wikis with online content from around the world, and social networking sites like Facebook
[ CITATION Chr08 \l 1033 ]. Big data is a huge collection of data that grows exponentially with
time. It is so big that it is challenging for a relational database to manage, compute and analyze
it. There are three types of big data: structured, unstructured and semi-structured
[ CITATION Gur191 \l 1033 ]. To manage such data, databases evolved into what are called
non-relational databases, which have become more common in many web applications. A non-relational
database stores data in any possible form or combination of forms. With a finite amount of
structured data, a non-relational database still cannot rival a relational database, because of the
latter's level of support, maturity and widespread use. Where big data is involved, however, a
non-relational database is considered the better option for performance and for analytics on that
data [ CITATION PAT16 \l 1033 ].

In an article in the International Journal of Scientific and Engineering Research, MySQL and
MongoDB are compared under various headings. The headings of the comparison are as below:
Based on Terms/Concept
In MySQL, data is kept in rows and columns, whereas in MongoDB data is kept in documents. Tables
can be joined in MySQL, whereas in MongoDB documents are embedded. The schema of MySQL is fixed,
whereas the schema of MongoDB is flexible [ CITATION Sus17 \l 1033 ].
Based on Schema
In MySQL, the statement to create a table with its metadata is as follows:
CREATE TABLE user (id VARCHAR(30), name VARCHAR(100), age INT);

Whereas in MongoDB, a collection is created implicitly when the first document is inserted:
db.myCollection.insertOne({id: "1", name: "Test", age: 55})

In MySQL, the statement to drop a table with its metadata is as follows:
DROP TABLE user;

Whereas in MongoDB, the command to drop a collection is as follows:
db.user.drop()

In MySQL, the syntax to insert data into a table is as follows:
INSERT INTO user (id, name, age) VALUES ('1', 'Test', 20);

Whereas in MongoDB, the syntax to insert a new document is as follows:
db.myCollection.insertOne({id: "1", name: "Test", age: 55})

In MySQL, the syntax to select data from a table is as follows:
SELECT * FROM user;

Whereas in MongoDB, the syntax to select documents is as follows:
db.user.find();

In MySQL, the syntax to delete data from a table is as follows:
DELETE FROM user WHERE id = '1';

Whereas in MongoDB, the syntax to delete a document is as follows:
db.user.deleteOne({id: "1"})

In MySQL, the syntax to update data in a table is as follows:
UPDATE user SET name = 'Joey' WHERE id = '1';

Whereas in MongoDB, the syntax to update a document is as follows:
db.user.updateOne({name: "Test"}, {$set: {name: "Joey"}})
Based on performance
The researcher in this journal used a textbook management system to test the performance of these
two databases, and the results are shown in graphs as well. For the testing, the author inserted
from 100 to 50,000 textbook records into each database. The total time taken during the testing was
recorded and plotted in a bar graph.
In terms of inserting data, MongoDB proved to be 30 to 50 times faster than MySQL. There does not
seem to be any difference in insertion speed up to 1000 rows. When more than 1000 rows were
inserted, MongoDB took the lead, opening a huge gap in performance.
The author also tested the query performance of the two databases. According to the author, the
query test measures the time taken by a query to fetch data out of the database. In this test,
MongoDB's query performance was three times that of MySQL. Because of this, MongoDB is preferred
over MySQL for the manipulation of big data [ CITATION Sus17 \l 1033 ].
From the comparison made by the author above, it is concluded that MongoDB performs a lot faster
than MySQL. With the new generation of applications that deal with huge amounts of data, it is
important for a database to work as per the needs, that is, in terms of scalability, performance
and schema. On this comparison, MongoDB is the more rational choice because of the large
difference in performance along with its ease of use. If an application mainly manipulates data in
huge volumes with different data models, MongoDB is more suitable.
These two databases have often been compared. In a journal from the University of Oradea, a
relational and a non-relational database are compared, using MySQL and MongoDB on a web
application. The researchers ran insert, select, update and delete operations on both databases
through the same web application, using 1, 500, 1000, 10000, 25000 and 50000 user records for each
operation. Since it is a web application, all of those users are inserted at the same time.
For the insert operation, the researchers used those records on both databases, and the results
are shown in the table below [ CITATION Cor15 \l 1033 ].

Insert users    MongoDB (hh:mm:ss:ms)    MySQL (hh:mm:ss:ms)
1               00:00:00:003             00:00:00:402
500             00:00:00:018             00:00:00:183
1000            00:00:00:033             00:00:00:387
10000           00:00:00:521             00:00:01:085
25000           00:00:00:816             00:00:03:378
50000           00:00:01:835             00:00:08:306
The table above shows quite a large difference in performance between the two databases. MongoDB
takes less time than MySQL to insert 1, 500, 1000, 10000, 25000 and 50000 users simultaneously.
This shows that MongoDB has better insert performance.
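
A measurement of this kind can be sketched as a small timing harness. The sketch below is not the researchers' setup: it times batch inserts into an in-memory SQLite database through Python's built-in sqlite3 module, and the table layout and batch sizes are assumptions chosen for the example. Absolute times from such a sketch say nothing about MySQL or MongoDB; it only shows the shape of the experiment.

```python
import sqlite3
import time


def time_inserts(n):
    """Insert n user rows into a fresh table; return (elapsed seconds, rows stored)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    rows = [(i, f"user{i}", 20 + i % 50) for i in range(n)]

    start = time.perf_counter()
    with conn:  # commit the whole batch as one transaction
        conn.executemany("INSERT INTO user VALUES (?, ?, ?)", rows)
    elapsed = time.perf_counter() - start

    stored = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]
    conn.close()
    return elapsed, stored


# Hypothetical batch sizes, following the smaller sizes used in the table above.
timings = {n: time_inserts(n) for n in (1, 500, 1000, 10000)}
```

Running the same harness against each database under test, with the same records and the same client machine, is what makes the resulting times comparable.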
After that, the researchers ran the select operation on those data, and the output is shown in the
table below [ CITATION Cor15 \l 1033 ].

Select users    MongoDB (hh:mm:ss:ms)    MySQL (hh:mm:ss:ms)
1               00:00:00:003             00:00:00:083
500             00:00:00:017             00:00:00:005
1000            00:00:00:031             00:00:00:006
10000           00:00:00:291             00:00:01:052
25000           00:00:00:830             00:00:03:190
50000           00:00:01:616             00:00:08:327

The table above again shows large differences in performance between these databases. For a single
user, MongoDB takes less time than MySQL. For 500 and 1000 users, MySQL takes less time than
MongoDB to select the previously inserted users, so MySQL leads in this range. From 10000 users
onward, MongoDB is faster again.
After the selection, the researchers ran the update operation on the previously selected users.
The result of that update is shown below [ CITATION Cor15 \l 1033 ].

Update users    MongoDB (hh:mm:ss:ms)    MySQL (hh:mm:ss:ms)
1               00:00:00:005             00:00:00:039
500             00:00:00:059             00:00:00:059
1000            00:00:00:042             00:00:00:159
10000           00:00:00:463             00:00:04:634
25000           00:00:01:294             00:00:19:946
50000           00:00:02:224             00:00:31:205

As the table above shows, MySQL takes more time than MongoDB to update one user. Both databases
take the same time to update 500 users. After that, MongoDB is ahead of MySQL for all the remaining
user counts.
For the delete operation, the researchers deleted the users in both databases simultaneously. The
results of the delete operation are shown below [ CITATION Cor15 \l 1033 ].

Delete users    MongoDB (hh:mm:ss:ms)    MySQL (hh:mm:ss:ms)
1               00:00:00:004             00:00:00:081
500             00:00:00:007             00:00:00:063
1000            00:00:00:017             00:00:00:082
10000           00:00:00:106             00:00:00:200
25000           00:00:00:317             00:00:00:350
50000           00:00:01:508             00:00:01:787

The table above shows that MongoDB is faster than MySQL in the delete operation at every size,
although the gap narrows at 25000 and 50000 users.
This journal shows that MongoDB has an advantage over MySQL in the efficiency of CRUD operations.
However, the difference in performance is not that noticeable as long as an operation takes no
more than one second. The authors also count MongoDB being an open-source database as a further
advantage [ CITATION Cor15 \l 1033 ].
In the journal article by Konrad Fraczek and Malgorzata Plechawska-Wojcik, MongoDB, Cassandra and
PostgreSQL are used in a web-based application for the research. The authors compare the data
models of these databases: MongoDB is a document-based non-relational database, Cassandra is a
column-oriented database and PostgreSQL is a relational database.

The implementations of the different databases by the authors are shown below:

SQL Implementation: The figure below contains the data model for the application built by the
authors in PostgreSQL.

Figure 4 Data Model of Application in PostgreSQL


The query that selects a user's timeline is the most complex query used in this application. A
sample of the query is as follows:

SELECT u.login AS login, su.id AS id, su.date AS date, su.body AS body
FROM user_status_updates su
JOIN users u ON u.id = su.userID
JOIN followers f ON f.followedId = u.id
WHERE f.followerId = ?
ORDER BY su.date DESC
LIMIT 20 OFFSET (CURRENT_PAGE - 1) * 20

MongoDB Implementation: To access the data, the authors used the official Java driver. The data
model of this database consists of three documents named users, comment and hashtags, as shown in
the figure below. Comment documents are nested in status_updates, which results in smaller data
objects compared to the relational database.

Figure 5 Data model of an application in MongoDB

The query for retrieving a user's timeline is as follows:

db.status_updates
  .find({"login": {"$in": ["?", "?"]}})
  .sort({date: -1})
  .skip((CURRENT_PAGE - 1) * 20)
  .limit(20);
Cassandra Implementation: To access the data from the Cassandra database, DataStax was used.
Different keys, marked with different colors, were used to determine the direction of sorting for
the columns during the creation of the tables. The data model of this database is shown below:
[ CITATION Bro15 \l 1033 ]

Figure 6 Data model of an application in Cassandra

Retrieving a user's timeline is more complex in this database. The query to get the first page of
data is as follows:
SELECT statusUpdateLogin, statusUpdateId,
toTimestamp(statusUpdateId) AS date, body
FROM user_status_update_timeline WHERE timelineLogin = ?
To get each following page, an extra condition is added to the WHERE clause, as shown in the query
below:
SELECT statusUpdateLogin, statusUpdateId,
toTimestamp(statusUpdateId) AS date, body
FROM user_status_update_timeline
WHERE timelineLogin = ? AND statusUpdateId < ?
In place of the second ‘?’ in the query above, the id of the last status update from the previous
page is placed. For example, if the id of the last status update on the first page is 20, then 20
is put in place of that ‘?’ for the second page. [ CITATION Fra17 \l 1033 ]
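
The two paging styles used in these implementations, skipping rows by offset and continuing from the last id seen, can be compared in a small sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, MongoDB or Cassandra, and the table name, page size and sample rows are assumptions made for this example.

```python
import sqlite3

PAGE_SIZE = 20

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_update (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO status_update VALUES (?, ?)",
    [(i, f"update {i}") for i in range(1, 51)],  # hypothetical ids 1..50
)


def page_by_offset(page):
    """Offset paging: skip (page - 1) * PAGE_SIZE rows, then read one page."""
    rows = conn.execute(
        "SELECT id FROM status_update ORDER BY id DESC LIMIT ? OFFSET ?",
        (PAGE_SIZE, (page - 1) * PAGE_SIZE),
    )
    return [row[0] for row in rows]


def page_by_key(last_seen_id=None):
    """Key-based paging: continue below the last id of the previous page."""
    if last_seen_id is None:
        rows = conn.execute(
            "SELECT id FROM status_update ORDER BY id DESC LIMIT ?", (PAGE_SIZE,)
        )
    else:
        rows = conn.execute(
            "SELECT id FROM status_update WHERE id < ? ORDER BY id DESC LIMIT ?",
            (last_seen_id, PAGE_SIZE),
        )
    return [row[0] for row in rows]


first = page_by_key()             # newest 20 updates
second = page_by_key(first[-1])   # next 20, starting below the last id seen
```

Both styles return the same pages here. The key-based form needs no row skipping, which is why it suits a store such as Cassandra where the query has to follow the sort order of the table.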

The SQL data model used in relational databases was designed to avoid data redundancy and to keep
the relations between data. Because of this, many complex join queries must be used to fetch data,
which is inefficient for big data sets. In MongoDB, the data model is simple because data can be
fetched as an array or a nested document. The data model for Cassandra is the most complex one, as
one table is created per query pattern to avoid reading from multiple partitions, which leads to a
lot of data redundancy. Cassandra has its own language, called Cassandra Query Language (CQL), but
it does not support the LIKE operator from SQL, which limits the ability of this database to
perform full-text search operations. [ CITATION tut191 \l 1033 ]
The author also carried out performance tests on PostgreSQL, MongoDB and Cassandra, with
JMeter as a supporting tool. The testing was done in three parts, as follows:
Simulating users’ traffic: In this test, 100 users used the application for five
minutes. PostgreSQL was the fastest when dealing with a small amount of data,
but its performance dropped as the data size increased. Cassandra was not the fastest of the three,
and its performance also dropped rapidly as the data volume increased. In the case of MongoDB,
there was only a slight change in performance as the data increased, as shown in the
graph below. [ CITATION Fra17 \l 1033 ]

Figure 7 Number of executed test cycles

Data Inserting: In this test, 1000 records were inserted into each database and the time taken to
insert them was recorded. As shown in the figure below, MongoDB was the slowest at inserting data
compared to PostgreSQL and Cassandra, whereas PostgreSQL was the fastest. [ CITATION Fra17 \l 1033 ]

Figure 8 Time taken by database for inserting records


Full-text search: This is the last test done by the author, where different numbers of hashtags were
inserted into the database and then searched. As shown in the figure below, Cassandra took the
most time to search for the hashtags compared to the other two databases, whereas
PostgreSQL took twice the time taken by MongoDB. [ CITATION Fra17 \l 1033 ]

Figure 9 Time taken for Full text search on database using hash tags

From the research done by the author, it is concluded that implementing a few functions such as
pagination is more complex in Cassandra than in MongoDB and PostgreSQL, as its query language is
not as rich as SQL. The research also shows that non-relational databases have an advantage over
relational databases when a large data set needs to be manipulated, but when the data set is small,
a relational database can handle it well. In the case of writing data, the relational database,
PostgreSQL, is faster than the two non-relational databases.
Figure 10 Bar graph of database popularity

As per the graph, Oracle is the most popular database, with a ranking score of 1346.66. It is
followed by MySQL, with a ranking score of 1279.07. The first four positions
are held by relational databases, whereas the fifth position is held by MongoDB, with
a ranking score of 410.06. Even with the evolution of non-relational databases, there are still more
users of relational databases than of non-relational databases. [ CITATION Sha19 \l 1033 ]
Primary Research
Research Methodology
Research methodology is a systematic process for solving a problem. It describes how
research should be carried out [ CITATION Asi19 \l 1033 ]. In this
dissertation, both research approaches, quantitative and qualitative,
are used.
Qualitative approach refers to information that is concerned with phenomenal
entities. This approach mainly focuses on dynamic and negotiated reality, such as psychology,
behavior and many other non-countable things. The methods of collecting qualitative data
include observation, interviews and group discussions. For this approach, I will send a
questionnaire to database experts and evaluate the outcome from the results of that questionnaire.
I will also study the journals and research papers that relate to the comparative analysis
of relational and non-relational databases.
Quantitative approach refers to information that is concerned with entities that can
be measured. This approach mainly focuses on fixed and measurable reality. The methods of
data collection include surveys, statistical analysis reports, questionnaires, interviews and many more.
For the quantitative approach, I will look for a relational database and a non-relational database
that are popular among different firms, together with their interactive user interface tools, to check
the performance and complexity of each database. After that, I will use the database with the
preferred result in the online application. [ CITATION Sim19 \l 1033 ]
For this dissertation, I have used data collection to gather primary and secondary data.
Data Collection: Data collection is a research method in which information on the
field of interest is gathered in a systematic way, which helps the researcher to answer the research
question by evaluating the data collected. While data collection methods may vary, I have used
questionnaires, interviews and group discussions for primary data collection, and have used
both of the research approaches.
Questionnaire is a research technique in which a fixed set of questions
is prepared and distributed. The set of questions can be online or on a physical piece of paper. The
person who receives a questionnaire has to tick answers or write down opinions [ CITATION Que19 \l 1033
]. As primary research, I have made a questionnaire with a few questions using Google Forms. I have
used objective questions, as they are easy for participants to answer with a tick. Without
objective questions, answering becomes difficult and there is a high chance that the
answers we get are biased. The questionnaire was sent to different institutes, schools and
private companies in Kathmandu through email and messaging applications. It was
filled in by database administrators, IT officers and managers of companies. The responses given by
the participants are saved in the form of pie charts, which makes analysis easy.
Other methods of data collection are as follows:
• Research papers relating to databases
• Interviewing database experts
• Collecting reviews from the internet
• Collecting data from websites
• Interviewing enterprise staff working in the field of databases
• Surveying companies on the databases they use
Development Methodology
A development methodology is the series of processes followed over the
timeline of a product's development. As problems arise during the development of
projects, new methodologies are devised to avoid facing the same problems again and
again. For the development of the product, I have used the waterfall development methodology, in
which one phase is completed before moving to the next and the phases are carried out in series. As
shown in the figure below, there are six stages in the waterfall methodology.

Figure 11 Steps for Waterfall Methodologies

Requirement Analysis: In this step, I gathered all the requirements for the
product. After that, I analyzed the requirements that I got from the bike service center and
studied whether each requirement was feasible. Only the feasible requirements were chosen for the
next phase, so that they do not cause any breakdown of the system.
System Design: In this development phase, I studied all the requirements from the previous
phase and chose the MVC design pattern to construct the project. I also produced an
architectural design that describes the structure and behavior of the system. The class
diagram describes the structure of the system, and the use case diagram describes its
behavior. For the logical design, I produced an ER diagram that
describes the database structure and the relationships between entities. This phase is necessary
because it describes the overall system, and the development of the product is based on the system design.
Implementation: In this development phase, the product finally starts to take shape as
development begins. In this phase, I built the design based on the wireframe of
the product. The actual database was also created as per the ER diagram from the previous phase.
Since this phase consumes more time than the others, I allocated more time to it
than to the other phases. At the end of this phase, an actual deliverable product is
constructed as per the system design.
Testing: In this development phase, the product developed in the previous phase is
tested. Test cases are written and testing is carried out as per the test cases. If any
bug or fault is detected, it is recorded in the test case and fixed later. This phase is especially
important, as it helps to remove bugs from the software before it is deployed.
Deployment: This is the final development phase of the system, in which the product that has been
developed and tested is deployed in the real environment where end users can use it. Other
activities such as code integration and code review are also done in this phase. If every test and
integration succeeds, the product is deployed.
Maintenance: In this phase, the product deployed in the previous phase is maintained if
its performance degrades. Any future work that is
developed is also added in this phase. The software is also monitored for better performance,
and bugs and errors are fixed.
Tools
For this research, I shall use MySQL and MongoDB for the comparison between relational
and non-relational databases. For this test, I will insert, view, modify and delete batches of 100,
500 and 1000 records using the GUI tool for each database and compare the time
taken by the relational database and the non-relational database.
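
The timing procedure described above can be sketched in code as well as through a GUI. The sketch below is only an illustration of the measurement method: it uses an in-memory SQLite database as a stand-in because it runs without a server, so the timings it produces say nothing about MySQL or MongoDB. The table and column names are assumptions.

```python
import sqlite3
import time

def time_crud(n):
    """Time insert, view, modify and delete for n records (seconds per step)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value TEXT)")
    rows = [(i, f"value {i}") for i in range(n)]
    timings = {}

    start = time.perf_counter()
    conn.executemany("INSERT INTO records (id, value) VALUES (?, ?)", rows)
    conn.commit()
    timings["insert"] = time.perf_counter() - start

    start = time.perf_counter()
    fetched = conn.execute("SELECT id, value FROM records").fetchall()
    timings["view"] = time.perf_counter() - start

    start = time.perf_counter()
    conn.execute("UPDATE records SET value = value || ' (edited)'")
    conn.commit()
    timings["modify"] = time.perf_counter() - start

    start = time.perf_counter()
    conn.execute("DELETE FROM records")
    conn.commit()
    timings["delete"] = time.perf_counter() - start

    remaining = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    conn.close()
    return timings, len(fetched), remaining

# Same batch sizes as in the planned test.
results = {n: time_crud(n) for n in (100, 500, 1000)}
```

Repeating each run several times and averaging would give steadier numbers than a single measurement.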
Tools for MySQL
Internet Information Services (IIS) is an extensible web server developed by Microsoft
Corporation for the Windows family (it is not available on Linux or Mac), and it is not active by
default in new versions of Windows. I have used Microsoft Web Platform Installer 5.1 to install
MySQL. It is a tool from IIS that simplifies the installation of different open source applications
[ CITATION Sou16 \l 1033 ]. With it, I have installed version 5.5 of the MySQL database on my system.
To access the MySQL database, a terminal or PowerShell is needed, along with the user name and
password, using the command ‘mysql -u username -p’ [ CITATION Was12 \l 1033 ]. The user should
write his/her username in place of username in the given command. After that, it asks for the
password. With the correct password, the user can access the database as shown in the figure below:

Figure 12 MySQL Monitor on CUI


With this, the user can now work on the database, but it is quite inefficient to use the command line
for every task. To fix that issue, I will use a MySQL database interface: SQLyog, an
interactive tool for carrying out CRUD operations. It is an open source (community
edition) GUI tool mainly used by database administrators, developers and database architects. It
has various power tools such as schema and data sync, SSH and HTTP tunneling, scheduled backup
and import of external data [ CITATION Web19 \l 1033 ].

Figure 13 Dashboard of SQLyog

Tools for MongoDB


To install the MongoDB database, I downloaded it from its official website and installed it on my
system. For the GUI, MongoDB provides Compass, where one can connect with just a click
and create documents as well. There are also other tools in which the performance of the database
can be viewed, as shown in the figure below:

Figure 14 Dashboard of Compass


Technology
Relational databases have been widely used for over 40 years. Although they have appeared to be
inadequate for handling massive data, they are still the mainstream database infrastructure. There are
several famous relational databases, such as Oracle, MySQL and Microsoft SQL Server. These three
databases are the most popular databases in the current market. [ CITATION DrT12 \l 1033 ]
MySQL
MySQL is an open source relational database management system that is currently owned and
managed by Oracle Corporation. It is based on SQL (Structured Query Language), and data are stored
in a tabular structure. It is used for a wide range of purposes, such as e-commerce, logging applications,
data warehousing and many more. [ CITATION 12319 \l 1033 ]

Unlike a relational database, a non-relational database does not follow any rigid principle.
With the recent growth of big data, the popularity of non-relational
databases has grown as well. With their flexibility and ability to process large amounts of data in a
short period of time, they have become more active, leaving relational databases behind. MongoDB and
Cassandra are examples of non-relational databases. [ CITATION Mon19 \l 1033 ]
MongoDB
MongoDB is an open source, document-based non-relational database management system that is
developed and maintained by MongoDB Inc. The data in this database are stored
in the form of BSON, a binary-encoded form of JSON. It is a cross-platform database
with a high level of performance, high availability and easy scalability. It is used by developers at
many big companies such as Facebook, Adobe, Google and many more. [ CITATION tut19 \l 1033 ]

For the development of the product, I have used PHP (PHP: Hypertext Preprocessor) as the core
technology. Its framework, Laravel, is used for both the client side and the server side of the product.
On the client side, I have used HTML (Hyper Text Markup Language) and CSS (Cascading Style
Sheets) for creating and designing the website, whereas I have used JavaScript to make the website
more interactive. JavaScript is used on both the front end and the server side.
Technique
Errors and problems have always arisen during the development of a product. Because of
that, many software development techniques, such as structured programming,
functional programming and object-oriented programming, have been developed
and implemented to avoid some of the previously discovered problems.
In this dissertation, I have used object-oriented programming techniques to build the product. I
chose this technique because it is easy to troubleshoot and find faults,
as you know exactly where to look, whereas in structured programming finding a fault is very
difficult because the code is not separated and sits in huge chunks. With inheritance, it is easy to
reuse code, which is an efficient way of coding, whereas reuse is very difficult in structured
programming. Encapsulation and polymorphism are further reasons why I chose the object-oriented
programming technique.

Figure 15 Advantage of OOPS


For designing the database, I have used the entity relationship model as a technique to map out the
logical structure of the database that will be used in my product. In this model, all the
real-world entities are mapped in a tool such as Visual Paradigm, and they are connected to each
other through relationships. With the help of this model, the tables with their attributes are created in
the actual environment.

Figure 16 ER annotations with example


Conceptual Models
In this dissertation, the application is a management system for a bike service center,
where a customer can register and log in to the system. After that, they can request an appointment
to get their bike serviced. For the appointment, the admin of the system checks whether the workers
are busy with another schedule. If they are not busy, the admin gives an appointment to the customer,
and the customer can later come and get their bike serviced.
A conceptual model is a pictorial representation of an overall system that uses the concepts, ideas and
architecture of the product. It is made to establish or define the entities beforehand,
making the development of the product easier. It is used to define the scope and base of a
product and needs a high level of understanding. In this dissertation, I have created various
conceptual models, and they are listed below:
ER Diagram:
This is a diagram of the entity relationship model of the bike service management system’s database,
where each entity is mapped with its attributes and each relationship is plotted as one-to-one,
one-to-many or many-to-many. There are four main entities in this system: user, bike,
mechanic and appointment. The same length is used for the same data type, which is 225 for
varchar and 10 for integer, while the timestamp data type has no length by default. One-to-one and
one-to-many relationships are used, but there are no many-to-many relationships.

Figure 17 ER Diagram of Bike service management system
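
The four entities can be sketched as a schema. Since the ER diagram itself is not reproduced here, the column names and the direction of each foreign key below are assumptions made for illustration; only the entity names and the varchar length of 225 come from the text. SQLite is used as a runnable stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name VARCHAR(225) NOT NULL
);
CREATE TABLE bikes (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),  -- assumed: one user, many bikes
    model VARCHAR(225) NOT NULL
);
CREATE TABLE mechanics (
    id INTEGER PRIMARY KEY,
    name VARCHAR(225) NOT NULL
);
CREATE TABLE appointments (
    id INTEGER PRIMARY KEY,
    bike_id INTEGER NOT NULL REFERENCES bikes(id),
    mechanic_id INTEGER REFERENCES mechanics(id),
    scheduled_at TIMESTAMP NOT NULL
);
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO bikes (id, user_id, model) VALUES (1, 1, 'Pulsar 150')")

# A bike that points at a missing user is rejected by the foreign key.
try:
    conn.execute("INSERT INTO bikes (id, user_id, model) VALUES (2, 99, 'Unknown')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

bike_count = conn.execute("SELECT COUNT(*) FROM bikes").fetchone()[0]
```

Every relationship in this sketch is one-to-many, expressed with a foreign key on the "many" side, so no pivot table is needed.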


Use Case Diagram:
A use case diagram shows the interaction of users with the system, where each feature
of the system is mapped only to the users authorized to use it.

Figure 18 Use Case diagram of an admin

As shown in the figure above, the admin can log in to the system. After logging in, the admin sees
all the appointment requests and checks whether a mechanic is available. If a mechanic is
available, the admin confirms the appointment and sends the appointment date and time to the user.
When the customer brings in the bike, the admin records the bike and checks it in. After the bike’s
servicing is done, the admin checks out the bike.
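
The availability check the admin performs before confirming an appointment can be expressed as a simple overlap test. This is a sketch under stated assumptions, not the product's actual logic: the fixed one-hour slot length, the function names and the sample mechanics are all hypothetical.

```python
from datetime import datetime, timedelta

SERVICE_DURATION = timedelta(hours=1)  # assumed fixed slot length

def is_free(booked_starts, requested_start):
    """A mechanic is free if no booked slot overlaps the requested one."""
    requested_end = requested_start + SERVICE_DURATION
    return all(
        requested_end <= start or requested_start >= start + SERVICE_DURATION
        for start in booked_starts
    )

def assign_mechanic(schedule, requested_start):
    """Book the slot with the first free mechanic and return the name, or None."""
    for mechanic, booked in schedule.items():
        if is_free(booked, requested_start):
            booked.append(requested_start)
            return mechanic
    return None

schedule = {"Ram": [datetime(2019, 6, 1, 10, 0)], "Sita": []}
first = assign_mechanic(schedule, datetime(2019, 6, 1, 10, 30))   # Ram is busy
second = assign_mechanic(schedule, datetime(2019, 6, 1, 11, 0))   # Ram is free again
third = assign_mechanic(schedule, datetime(2019, 6, 1, 10, 30))   # nobody is free
```

When no mechanic is free the function returns None, which corresponds to the admin not confirming the request.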

Figure 19 Use case diagram of customer

The figure above shows the customer and the functionality the customer has access to. As per the
figure, the customer can register in the system and log in to it with a username and password.
He/she can check whether there is a long queue. If the customer wants, he/she can request an
appointment for bike servicing. After getting the appointment, the customer checks in
the bike at the bike servicing center, and after the servicing is complete, the customer can make
an online payment as well.
Discussions
PEST Analysis
PEST (Political, Economic, Social, Technological) is a simple and widely used strategic business
tool used by organizations to discover the opportunities and threats arising from political,
economic, social and technological factors. With this analysis method, it is easier to find
the external factors that affect the business, plan the strategic process and implement
solutions. There are other variations of this analysis, such as PESTEL. This analysis is done to inform
other business management tools such as SWOT analysis, risk analysis and many more.
Without a PEST analysis, there is a high chance that a business might be
unsuccessful because of external factors, that is, political, economic, social and
technological factors.
In this dissertation, I have made a PEST analysis of the bike service management system and of how
the business is affected politically, economically, socially and technologically. The PEST
analysis is shown in the figure given below:

Figure 20 PEST Analysis of bike service management system

As shown in the image above, the points of the PEST analysis are explained below:
Political (P): In this sector, analysis is made of how political factors such as the government affect
the business. If there is high taxation on automobiles, the number of consumers would drop,
making it difficult to get more customers. Moreover, changes in rules and regulations, such as banning
old automobiles, impact the business: if old vehicles are banned, new bikes are purchased,
which impacts the business positively.
Economic (E): In this sector, analysis is made of how economic factors such as labor cost and
foreign exchange rates can affect the business. In the case of the bike service management system,
if the product cost or labor cost increases, the service charge increases, discouraging
customers from using the service.
Social (S): This is another important sector, where the business is affected by social factors such as
age group, culture and religion. In the case of the bike service management system, the number of
consumers increases when students graduate and start to work. However, transportation problems
caused by traffic jams might discourage consumers, which might affect the
business.
Technological (T): This is another sector, where the technology and techniques used in the
business are analyzed along with their impact. The bike service management system has an online
feature for a consumer to make an appointment. There is also an online payment service,
which makes payment easy for customers and helps to attract more of them.

With the development of the product, many quality attributes are included in the bike
service management system. These quality attributes are built into the system to make it
more secure, easier to use and well maintained. Some of the quality attributes of the
bike service management system are as follows:

Security:
With the use of an online application, much sensitive information is transferred through the internet
in real time. Because of this, hackers and spammers are attracted, and they have many
different ways to attack the system. Therefore, it is important to keep the security attributes of the
system in check. One of the web vulnerabilities that our system might face is cross-site scripting,
where attackers inject malicious scripts into pages viewed by other users. SQL injection is another
common threat, where an attacker might destroy the database using malicious code in an SQL
statement. If a hacker gets hold of a hashed password, he could crack the password and log in to the
admin section, and with this he could bring down the whole system.

Figure 21 Attacks that web application face


To prevent these attacks, I have implemented various methods and techniques that stop
attackers from attacking the system, and they are shown below:

Figure 22 Solution to the online threats

For the authentication and authorization feature, I have used the Laravel framework’s auth feature,
which hashes passwords by default. Laravel also provides the “guard” and “session” features, through
which users are authenticated and only authorized users can access the secured site. The session
guard maintains the session storage, and with form authentication and input validation, attacks like
SQL injection can be avoided.
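
The two defences mentioned above, hashed passwords and protection against SQL injection, can be illustrated outside the framework. This is a minimal sketch and not Laravel's implementation: PBKDF2 from the Python standard library stands in for the framework's hashing, SQLite stands in for MySQL, and the table, function names and sample credentials are hypothetical.

```python
import hashlib
import hmac
import os
import sqlite3

def hash_password(password, salt=None):
    """Derive a salted hash with PBKDF2-HMAC-SHA256; returns (salt, digest)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password, salt, expected):
    """Recompute the hash and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, salt BLOB, password BLOB)")
salt, digest = hash_password("s3cret")
conn.execute("INSERT INTO users VALUES (?, ?, ?)", ("admin@example.com", salt, digest))

def find_user(email):
    """The ? placeholder binds the input as data, never as SQL text."""
    return conn.execute(
        "SELECT email, salt, password FROM users WHERE email = ?", (email,)
    ).fetchone()

injected = find_user("' OR '1'='1")   # classic injection string matches nothing
row = find_user("admin@example.com")
```

Because the query is parameterized, the injection string is treated as an ordinary email value and returns no row; only the hash, never the plain password, is stored.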

Usability:
Usability is another important attribute of the bike service management system. It makes
an application easy to use through easy navigation and familiarity, and for
that I have placed the navigation at the top of the product, where the user can find it easily. The
application should also be easy to learn if it has any special features. To add usability to
my product, I have used help and alert pop-ups for users who need assistance. There is also an alert
that pops up if any error arises during user activity, such as filling in a form or entering a wrong
password.

Figure 23 List of attributes of usability

As shown in the image above, I have used all of these attributes in the system, making it easier to use
and learn through proper navigation and clear visuals, as well as making the web
application responsive to the device. Moreover, different tasks are separated and placed on
different screens. All these factors are considered, and these techniques are used, to make this web
application more usable.
Scalability
It is probable that the number of users will increase in the near future, along with the data traffic
and the data stored in the database. Therefore, it is important for the system to be scalable,
so that it can handle the data storage and data traffic as users increase while reading and
writing records. The bike service management system uses MySQL as its core database. It would be
difficult to maintain scalability in this database. Because of that, I have tried caching database
query results as variables in memory to improve the efficiency and scalability of the application.
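
Caching query results in memory can be sketched as follows. This illustrates the general idea only and is not the product's code: it uses Python's functools.lru_cache and an in-memory SQLite table, and the table, function and sample names are assumptions.

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mechanics (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO mechanics VALUES (?, ?)", [(1, "Ram"), (2, "Sita")])

stats = {"queries": 0}  # counts how often the database is actually hit

@lru_cache(maxsize=128)
def mechanic_name(mechanic_id):
    """Hit the database only on a cache miss."""
    stats["queries"] += 1
    row = conn.execute(
        "SELECT name FROM mechanics WHERE id = ?", (mechanic_id,)
    ).fetchone()
    return row[0] if row else None

names = [mechanic_name(1), mechanic_name(1), mechanic_name(2)]

# After an UPDATE the cached value would be stale, so the cache must be cleared.
conn.execute("UPDATE mechanics SET name = 'Ramesh' WHERE id = 1")
mechanic_name.cache_clear()
refreshed = mechanic_name(1)
```

The second lookup of the same id is served from memory. The trade-off is that cached values must be invalidated whenever the underlying rows change.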

Figure 24 Scalability of database

To increase the scalability of the database, fixed data types are used in the tables, so that only
data of a matching type can be inserted, which makes reading and writing data more scalable.
Maintainability
With the use of a web application, its performance might degrade over time. Useless
cache may build up on the server, or bugs may arise. Therefore, it is important to
maintain the web application, server and database.

Figure 25 Discussion for maintaining web application

For the maintenance of my web application, I have made a maintenance schedule in which the
database backup feature is checked to confirm that it is working and the server cache is cleared.
MySQL queries are also optimized and the server hardware is upgraded from time to time for better
performance. By doing such tasks, maintainability is built into my application.
Automation
Automation plays an important role by reducing human effort, helping the user to complete
tasks in much less time and with much less effort.

Figure 26 Automation in web application

For automation, I have used automatic form filling, where the customer can fill in the form with one
click. Moreover, when the admin is updating a product, all the information about the product is shown
on the admin's screen automatically. Also, when the user logs out of the system, the session
stored in the browser is erased automatically for security purposes, with the
help of the Laravel guard and session features, which are pre-built in the framework.

Issue Log

Figure 27 Issue Log of bike service management system


Final Integrations of Product
For the integration of the product, I have made a separate MySQL database named
bike_servicing_system. For version control, I have used git, where all of the code is backed up.
Finally, I have written migrations for the tables, which make it easy to create
the database, and both the frontend and the backend are in the same repository, which makes it easy to
run on the server, as shown in the figures below.

Figure 28 bike_servicing_system database with the table

Figure 29 GitHub for bike servicing system


Findings and Opinions
There are four questions on the questionnaire, all relating to the databases used by a firm.

Figure 30 Types of database used by firm

As shown in the figure above, there are responses from a few firms. The largest group of respondents,
44.4%, uses a relational database, whereas 22.3% use only a non-relational
database and 33.3% have used both types. This shows that
more respondents use relational databases than non-relational databases.

Figure 31 Chart of preferred database based on performance

The figure above shows that the largest share of users, 44.4%, prefer MySQL for its
performance, whereas MongoDB stands second with 33.3%.
Oracle is the third most preferred database in terms of performance, with 22.2% of users.
No users prefer the remaining databases in terms of performance.
Figure 32 Chart of preferred database based on cost-effectiveness

As shown in the figure above, MySQL is considered the most cost-effective database, with
44.4%. MongoDB is second with 33.3%. Oracle and
PostgreSQL share third place with 11.1% each. For the
remaining databases, none of the nine users think of them as cost-effective.
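
The percentages above follow directly from the number of respondents. The vote counts in the sketch below are an assumption: they are inferred from the reported percentages and the nine respondents mentioned in the text, not taken from the raw survey data.

```python
TOTAL_RESPONDENTS = 9

# Counts inferred from the reported percentages (an assumption).
cost_effective_votes = {"MySQL": 4, "MongoDB": 3, "Oracle": 1, "PostgreSQL": 1}

# Share of each database, rounded to one decimal place.
shares = {
    name: round(100 * votes / TOTAL_RESPONDENTS, 1)
    for name, votes in cost_effective_votes.items()
}
```

With only nine responses, a single vote moves a share by about 11 percentage points, so the rankings should be read as indicative rather than statistically firm.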

Figure 33 Chart of preferred database by user

The figure above shows that the largest share of users, 44.4%, prefer to use MySQL among
the databases. MongoDB stands second with 33.3%.
Oracle is the third most preferred database, with 22.2% of user preferences.

From the questionnaire above, it can be concluded that relational databases are still the most
widely used among the firms surveyed. In terms of performance, MySQL received more votes
than the other databases. In terms of cost and user preference, MySQL beats the other five
databases, whereas MongoDB stands second. Also, as per the secondary research,
when reading data, the speed of MongoDB degrades only a little
compared to MySQL, Cassandra and MS SQL Server. But when it comes to inserting data,
MySQL has good speed compared to MongoDB, Cassandra and MS SQL Server. Also, the scalability of
relational databases is very poor, whereas non-relational databases have good scalability. Not only
that, sharding (a kind of horizontal partitioning) of data is advantageous in non-relational databases,
whereas sharding in relational databases is too complex to manage.
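
Sharding can be illustrated with a small routing function that decides which partition holds a given record. Hash-based routing is only one common strategy, and this sketch does not describe how any particular database implements it; the shard names and key format are hypothetical.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names

def shard_for(key):
    """Map a key to a shard with a stable hash (Python's hash() is salted per run)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

# Distribute 1000 hypothetical user ids across the shards.
placement = {}
for user_id in (f"user-{i}" for i in range(1000)):
    placement.setdefault(shard_for(user_id), []).append(user_id)
```

Each record lives on exactly one shard, so reads and writes for different keys can be served by different machines. A query that joins rows across shards is harder, which is part of why sharding a relational schema is complex to manage.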
It would seem that relational and non-relational databases both have their own
advantages and disadvantages. Choosing the correct type of database for a system is therefore very
important. When dealing with a finite amount of data with structured data types, I would prefer
a relational database. But if “Big Data” is involved and good performance is required
for data analysis and data mining, a non-relational database stands above a relational
database. To use a specific type of database, one has to weigh its disadvantages against
its advantages. For example, if a firm needs high scalability and needs to work with
unstructured data, then it needs to use a non-relational database, compromising on cost and
query language. With the increase in modern applications, many firms have started using both
types of database. For example, Facebook uses MySQL, a relational database, to store its user
information, whereas it uses Haystack to store all the photos that are uploaded to the Facebook
application.
Conclusion
This dissertation set out a comparative analysis of relational and non-relational databases, in which
the performance and scalability of different databases were researched for a bike servicing center.
Before starting the development of the product, I studied different databases such as
MySQL, MS SQL Server, Cloud SQL, MongoDB and Cassandra, with their features, advantages and
disadvantages. I also did some research of my own on database preferences in the market
with the help of a questionnaire. As the requirements for the bike service management system were
uncertain, I used the waterfall methodology, which helped me with the gathering of requirements and
continuous development during the development of the system.
Before the development of the system, it was important for me to choose the database. So, I aimed to
choose the database that is most appropriate for the bike service center based on data types,
performance, scalability and cost-effectiveness. The data for the bike service center has not been
specified, but there is a high probability that no big data will be involved. Not only that,
structured and fixed data types are used in the system. The database also has to be easy to use
and maintain for this system.
In the final analysis, the comparison between relational and non-relational databases, covering
MySQL, MongoDB, MS SQL Server, Cassandra and Cloud SQL, has shown that each database has its own
advantages and disadvantages and that both types are well suited to particular system requirements
in various sectors. For this system, I selected MySQL, which is a relational database. I chose it
because it is easy to implement and has a wide range of support. Moreover, it links information
across tables with the help of foreign keys, making it easy for a database administrator to retrieve
information. Although MySQL has problems with big data, it is a good fit for my system because no
big data is involved and its performance is good: compared to MongoDB, the performance of MySQL is
almost the same when a small amount of data is being dealt with. The scalability of MySQL is not as
good as that of MongoDB, but I still chose MySQL because of its rigid schema, which helps to manage
and maintain the database easily. As MySQL is open source, it is very cost-effective as well, and
SQLyog, the GUI used with this database, has a community edition that is free to use. To access the
extra features of SQLyog, an enterprise version is needed.
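The point about foreign keys linking tables can be illustrated with a minimal sketch. It is not the
schema used in this dissertation: the table names, column names and sample rows are hypothetical,
and Python's built-in sqlite3 module stands in for MySQL so that the example runs without a database
server. The same CREATE TABLE and JOIN ideas apply in MySQL, although the exact syntax differs
slightly.

```python
import sqlite3

# sqlite3 stands in for MySQL here so the sketch runs without a server.
# All table names, column names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE service_booking (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    bike_model  TEXT NOT NULL
);
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Asha')")
conn.execute(
    "INSERT INTO service_booking (customer_id, bike_model) VALUES (1, 'Pulsar 150')"
)
conn.commit()


def bookings_for(customer_name):
    """Follow the foreign key with a JOIN to list a customer's bookings."""
    return conn.execute(
        """
        SELECT c.name, b.bike_model
        FROM service_booking AS b
        JOIN customer AS c ON c.id = b.customer_id
        WHERE c.name = ?
        ORDER BY b.id
        """,
        (customer_name,),
    ).fetchall()


def insert_orphan_booking():
    """Try to book a service for a customer id that does not exist."""
    try:
        conn.execute(
            "INSERT INTO service_booking (customer_id, bike_model) VALUES (99, 'Unknown')"
        )
    except sqlite3.IntegrityError:
        conn.rollback()  # the foreign key constraint rejected the row
        return False
    conn.commit()
    return True
```

Calling bookings_for("Asha") retrieves the booking through the foreign key, while
insert_orphan_booking() returns False because the rigid schema refuses a booking that points to a
missing customer.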
Finally, a large number of firms use both relational and non-relational databases. With the growth
of modern web applications and the heavy involvement of big data, the analysis and manipulation of
such data is best handled with the help of non-relational databases. But for tasks that involve many
transactions, complex queries and a need for consistency, a relational database is the better
approach.
Future Work
In the near future, the bike service management system can collect users’ feedback and use it to
improve the application. E-wallets are a current trend in the market, and if the trend continues,
this application will integrate an online payment service. In this dissertation, the comparison
between MySQL as the relational database and MongoDB as the non-relational database was done within
a limited environment. So, there is more room for improvement of this application in the near future
if “Big Data” becomes involved. As the number of users and the volume of data increase, a hybrid
database (use of both relational and non-relational databases) should be used for efficiency.
With the increase of BYOD (Bring Your Own Device), it is important to work on a mobile application
for this product. This will help to attract even more customers, but it also brings security threats
such as information leakage, data modification and unauthorized access in the future. To avoid these
threats, the application will have built-in authorization and authentication features. The
application will be available on both the Android and iOS platforms.
