
IAETSD Journal for Advanced Research in Applied Sciences, Volume 4, Issue 1, Jan-June /2017

ISSN (Online): 2394-8442

DATABASE THEORY AND IMPLEMENTATION TECHNIQUES
1 SHREYAS P, 2 VINAY R KANGOKAR, 3 SHRIRAKSHA S NATARAJ, 4 R SHAMILI SWETHA
DEPARTMENT OF INFORMATION SCIENCE, DSCE
1 shreyaspoint@gmail.com, 2 vinayrk100@gmail.com, 3 rakshanataraj11@gmail.com, 4 shamiliswetha2828@gmail.com

ABSTRACT
Database theory encompasses a broad range of topics in the theoretical study of databases and database
management systems. This paper surveys deductive databases, temporal and spatial databases, real-time
databases, the management of uncertain data and probabilistic databases, and Web data. Theoretical aspects of
data management include, among other areas, the foundations of query languages, the computational complexity
and expressive power of queries, finite model theory, database design theory, dependency theory, and the
foundations of concurrency control and database recovery, all of which are covered briefly in the paper.
Implementation plays a major role in database theory and is highlighted in the paper through various techniques.

KEYWORDS: databases, computational complexity, database recovery, query languages, database design.

1. INTRODUCTION
Databases often fall into one of two broad categories. The first comprises specific-purpose, limited databases. In academia, these often contain
data gathered to serve a relatively limited role within a particular project. Such a database may be intended to provide the researcher with a
particular set of data, but has no particular function or role once the project concludes. For example, Lockyear's Coin Hoards of the
Roman Republic (CHRR) database included only the data necessary for the project.

The second category comprises general-purpose resource databases. Good examples of resource databases are county archaeological sites and
monuments records (SMRs) and national monuments records (Hansen: 1993). These databases are not project specific but are intended to be of
use to a wide variety of users. Resource databases usually attempt to be comprehensive within their `domain of discourse', are maintained and
updated, and are made available to interested parties. Because these databases attempt to be comprehensive in order to accommodate unpredicted
enquiries and research, they include a wide variety of data, which in turn requires a complex `data structure', or way of storing the information.

The CISP database is intended to be a resource database and as a result has a complex data structure. This structure, however, provides great
power and flexibility both for the retrieval and handling of the data and for future expansion of the database to include other
information and materials.

2. FOUNDATION OF QUERY LANGUAGES


Databases are a key technology in computer science that brings together theoretical topics and highly relevant practical applications.
The main aim of this section is to give an extended introduction to the field, with a special focus on database query languages, their expressive
power, and their computational complexity. The query languages considered, in both their theoretical and practical aspects, include:

First-order logic as a query language and the relational algebra
Conjunctive queries and their unions
Navigational queries: path queries
Datalog and its relatives

To Cite This Article: SHREYAS P, VINAY R KANGOKAR, SHRIRAKSHA S NATARAJ and R SHAMILI
SWETHA,. DATABASE THEORY AND IMPLEMENTATION TECHNIQUES. Journal for Advanced Research in
Applied Sciences ; Pages: 134-137

2.1 FIRST ORDER LOGIC AS A QUERY LANGUAGE

First-order logic, used as a database logic, is powerful enough to formalize and implement not only relational but also
hierarchical and network-type databases [6]. It allows all of these database types to be treated in a uniform manner. The focus here is on a
database language for heterogeneous databases. The language is general enough to specify the constraints of a particular type of
database, so that a specification of a database type can be translated into a specification in the database language, creating a logical
environment in which users can define different views.

2.2 CONJUNCTIVE QUERIES AND THEIR UNIONS

Conjunctive queries are an important class of database queries, equivalent in expressive power to select-project-join (SPJ) queries in the
relational algebra. We consider the classical problem of testing containment of conjunctive queries, that is, deciding whether every answer to
one query is also an answer to another on every database. The problem is well known to be NP-complete [7]. Acyclic queries,
in particular, have been extensively studied in the context of query optimization in distributed database systems, and are well known to have
desirable algorithmic properties.
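The containment test can be sketched with the classical homomorphism criterion: a query q1 is contained in q2 exactly when the variables of q2 can be mapped to terms of q1 so that the head of q2 maps to the head of q1 and every body atom of q2 maps to a body atom of q1. The representation below (a query as a pair of head variables and body atoms) and the function name `contained_in` are illustrative assumptions, and the search is a plain brute force over all mappings, which is exponential in the worst case, as expected for an NP-complete problem.

```python
from itertools import product

def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (every answer of q1 is an answer of q2).

    A query is a pair (head, body): head is a tuple of variables and body is a
    list of atoms (relation_name, tuple_of_variables). Constants are not handled.
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return False
    atoms1 = set(body1)
    vars2 = sorted({v for _, args in body2 for v in args} | set(head2))
    terms1 = sorted({v for _, args in body1 for v in args} | set(head1))
    # Try every mapping h from the variables of q2 to the terms of q1.
    for image in product(terms1, repeat=len(vars2)):
        h = dict(zip(vars2, image))
        if tuple(h[v] for v in head2) != tuple(head1):
            continue  # the head of q2 must map onto the head of q1
        if all((rel, tuple(h[v] for v in args)) in atoms1 for rel, args in body2):
            return True  # h is a homomorphism from q2 to q1
    return False

# Q1(x) :- E(x, y), E(y, x)   (x lies on a cycle of length two)
q1 = (("x",), [("E", ("x", "y")), ("E", ("y", "x"))])
# Q2(x) :- E(x, y)            (x has an outgoing edge)
q2 = (("x",), [("E", ("x", "y"))])

print(contained_in(q1, q2))
print(contained_in(q2, q1))
```

Here q1 is the more restrictive query, so it is contained in q2 but not the other way round.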

2.3 NAVIGATION QUERIES

In its most basic form, a navigation query is a combination of one or more dimension values [4]. These dimension values are referred to as
the navigation descriptors. A navigation query instructs the Endeca MDEX Engine to return the set of records that represents the intersection of
all the dimension values that it contains. For example, in a catalogue of wine bottles, Bottle A represents the intersection between the Red and
USA dimension values, and Bottle C represents the intersection between the White and France dimension values [9].

When no record lies in the intersection of all the dimension values in a navigation query, that query is considered a dead end. For example,
in the same catalogue, the Sparkling and Chile dimension values have no bottles in common and, therefore, no intersection.
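The intersection semantics of a navigation query can be illustrated with plain Python sets. This is only a sketch of the idea and does not use the engine's own API; the catalogue contents other than Bottle A and Bottle C, and the function names `navigate` and `is_dead_end`, are hypothetical.

```python
# Each record is tagged with a set of dimension values.
# Bottles B, D and E are made-up entries added to complete the example.
records = {
    "Bottle A": {"Red", "USA"},
    "Bottle B": {"Red", "France"},
    "Bottle C": {"White", "France"},
    "Bottle D": {"Sparkling", "France"},
    "Bottle E": {"White", "Chile"},
}

def navigate(records, descriptors):
    """Return the names of the records tagged with every navigation descriptor."""
    wanted = set(descriptors)
    return sorted(name for name, values in records.items() if wanted <= values)

def is_dead_end(records, descriptors):
    """A query is a dead end when no record matches all of its descriptors."""
    return not navigate(records, descriptors)

print(navigate(records, ["Red", "USA"]))
print(navigate(records, ["White", "France"]))
print(is_dead_end(records, ["Sparkling", "Chile"]))
```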

2.4 DATALOG AND ITS RELATIVES

A number of languages have been proposed to model, query, and manipulate data, as well as to express very general classes of integrity
constraints, inference procedures, and ontological knowledge [2]. Such languages are nowadays crucial for many applications such as semantic data
publishing and integration, decision support, and knowledge management. We first introduce Datalog, a powerful rule-based
language originally intended for expressing complex queries over relational data, which today forms the basis of languages for specifying
optimization and constraint satisfaction problems as well as ontological constraints in data and knowledge bases. We then discuss the
limitations of Datalog for the semantic web, in particular for ontological modeling and reasoning, and present several extensions that
capture some of the ontology languages of the OWL family, the standard language for semantic data modeling on the semantic web.
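As a small illustration of Datalog's rule-based style, the sketch below evaluates the standard two-rule ancestor program bottom-up until no new facts are derived (naive evaluation). The relation names, the sample facts, and the function name `ancestors` are assumptions made for the example; production Datalog engines use more refined strategies such as semi-naive evaluation.

```python
def ancestors(parent):
    """Naive bottom-up evaluation of the Datalog program

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

    `parent` is a set of (parent, child) pairs.
    """
    ancestor = set(parent)  # first rule: every parent fact is an ancestor fact
    while True:
        # second rule: join parent with the ancestor facts derived so far
        derived = {(x, z) for (x, y) in parent for (y2, z) in ancestor if y == y2}
        new_facts = derived - ancestor
        if not new_facts:
            return ancestor  # fixpoint reached: nothing new can be derived
        ancestor |= new_facts

parent = {("ann", "bob"), ("bob", "cid"), ("cid", "dee")}
for fact in sorted(ancestors(parent)):
    print(fact)
```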

3. COMPUTATIONAL COMPLEXITY
Computational complexity theory is a subfield of theoretical computer science, one of whose primary goals is to classify and compare the
practical difficulty of solving problems about finite combinatorial objects. For example: given two natural numbers $n$ and $m$, are they
relatively prime? Given a propositional formula $\phi$, does it have a satisfying assignment? If we were to play chess on a board of size
$n \times n$, does White have a winning strategy from a given initial position? These problems are equally difficult from the standpoint of
classical computability theory, in the sense that they are all effectively decidable. Yet they still appear to differ significantly in practical
difficulty. Given a pair of numbers $m > n > 0$, it is possible to determine whether they are relatively prime by a method (Euclid's algorithm)
that requires a number of steps proportional to $\log(n)$ [3]. On the other hand, all known methods for solving the latter two problems require a
brute-force search through a large class of cases, whose number grows at least exponentially in the size of the problem instance.
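For the first problem, Euclid's algorithm can be written in a few lines. The sketch below also counts the number of division steps so that the logarithmic growth can be observed on examples; the function names `gcd_steps` and `relatively_prime` are illustrative.

```python
def gcd_steps(m, n):
    """Return (gcd of m and n, number of division steps) using Euclid's algorithm."""
    steps = 0
    while n:
        m, n = n, m % n  # replace the pair (m, n) by (n, m mod n)
        steps += 1
    return m, steps

def relatively_prime(m, n):
    """Two numbers are relatively prime when their greatest common divisor is 1."""
    return gcd_steps(m, n)[0] == 1

print(gcd_steps(35, 12))
print(relatively_prime(35, 12))
print(relatively_prime(48, 18))
```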

Complexity theory attempts to make such distinctions precise by proposing a formal criterion for what it means for a mathematical problem to
be feasibly decidable, i.e. that it can be solved by a conventional Turing machine in a number of steps bounded by a polynomial
function of the size of its input. The class of problems with this property is known as P, or polynomial time, and includes the first of the three
problems described above. P can be formally shown to be distinct from certain other classes such as EXP, or exponential time, which
includes the third problem from above [5, 8, 10]. The second problem from above belongs to a complexity class known as NP, or
non-deterministic polynomial time, consisting of those problems which can be correctly decided by some computation of a non-deterministic Turing
machine in a number of steps which is a polynomial function of the size of its input.
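The gap between checking and finding a solution can be made concrete for the satisfiability problem. In the sketch below, checking a proposed assignment takes time linear in the size of the formula, while the search simply tries all $2^k$ assignments of $k$ variables. The encoding of a formula as a list of clauses of signed integers and the function names are assumptions chosen for the example.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Check an assignment against a CNF formula.

    A formula is a list of clauses; a clause is a list of non-zero integers,
    where k stands for "variable k is true" and -k for "variable k is false".
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses):
    """Return a satisfying assignment, or None, by trying every assignment."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(clauses, assignment):
            return assignment
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
print(brute_force_sat(formula))
# (x1) and (not x1) has no satisfying assignment
print(brute_force_sat([[1], [-1]]))
```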

A famous conjecture, often regarded as the most fundamental in all of theoretical computer science, states that P is properly contained
in NP, i.e. $P \subsetneq NP$. Demonstrating the non-coincidence of these and other complexity classes remains an important open problem in
complexity theory. But even in its present state of development, this subject connects many topics in logic, mathematics, and surrounding fields
in a manner which bears on the nature and scope of our knowledge of these subjects.

4. EXPRESSIVE POWER OF QUERIES


The study of expressive power concentrates on comparing the classes of queries that can be expressed in different languages, and on proving
the expressibility or inexpressibility of certain queries in a query language. Two main topics are addressed. First, an algebraic approach is
presented to define a general notion of expressive power. Heterogeneous algebras represent information systems, and morphisms represent the
correspondences between the instances of databases, the correspondences between answers, and the correspondences between queries. An
important feature of this notion of expressive power is that query languages of different types can be compared with respect to their
expressive power. In the case of relational query languages, the notion is shown to be equivalent to the one used by
Chandra and Harel. In the case of non-relational query languages, its versatility is demonstrated by
comparing the fixpoint query languages with an object-oriented query language called FQL. The expressive power of the Functional Query
Language FQL is the second main topic. The specifications of FQL functions can be recursive or even mutually recursive, and FQL has
a fixpoint semantics based on a complete lattice consisting of bag functions. FQL is shown to be more expressive than the
fixpoint query languages. This result implies that FQL is also more expressive than Datalog with stratified negation. Examples of recursive
FQL functions include functions that determine the ancestors of persons and the bill of materials [6, 8].

5. FINITE MODEL THEORY


Finite model theory (FMT) arose as an independent field of logic from the consideration of problems in theoretical computer science. Basic
concepts in this field are finite graphs, databases, computations, etc. [9]. One of the underlying observations behind the interest in finite model
theory is that many of the problems of complexity theory and database theory can be formulated as problems of mathematical logic, provided that we
limit ourselves to finite structures. While the objects of study in finite model theory are finite structures, it is often possible to make use of
infinite structures in the proofs. Since many central theorems of classical model theory (MT) do not hold when restricted to finite structures,
FMT is quite different from MT in its methods of proof.

FMT is mainly about the discrimination of structures. The usual motivating question is whether a given class of structures can be
described (up to isomorphism) in a given language. For instance, can all cyclic graphs be discriminated (from the non-cyclic ones) by a
sentence of the first-order (FO) logic of graphs? This can also be phrased as: is the property "cyclic" FO-expressible?
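To make the question concrete, the sketch below evaluates one fixed first-order sentence, "there exist x, y, z with E(x,y), E(y,z) and E(z,x)", on small directed graphs and compares it with a direct cycle check by depth-first search. The sentence detects closed walks of length three but misses a cycle of length four, which illustrates why a single sentence with a fixed number of variables does not obviously capture "cyclic". It is a known result of finite model theory that no first-order sentence defines this property over finite graphs; the code only illustrates the phenomenon and is not a proof. All names and the sample graphs are assumptions.

```python
from itertools import product

def has_closed_walk_of_length_3(nodes, edges):
    """Evaluate the FO sentence: exists x, y, z . E(x,y) and E(y,z) and E(z,x)."""
    return any(
        (x, y) in edges and (y, z) in edges and (z, x) in edges
        for x, y, z in product(nodes, repeat=3)
    )

def is_cyclic(nodes, edges):
    """Detect a directed cycle by depth-first search (not a first-order sentence)."""
    successors = {v: [w for (u, w) in edges if u == v] for v in nodes}
    state = {v: 0 for v in nodes}  # 0 = unvisited, 1 = on the DFS stack, 2 = finished

    def visit(v):
        state[v] = 1
        for w in successors[v]:
            if state[w] == 1 or (state[w] == 0 and visit(w)):
                return True  # reached a node that is still on the stack
        state[v] = 2
        return False

    return any(state[v] == 0 and visit(v) for v in nodes)

triangle = ([0, 1, 2], {(0, 1), (1, 2), (2, 0)})
square = ([0, 1, 2, 3], {(0, 1), (1, 2), (2, 3), (3, 0)})

for name, (nodes, edges) in [("triangle", triangle), ("square", square)]:
    print(name, has_closed_walk_of_length_3(nodes, edges), is_cyclic(nodes, edges))
```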

6. REAL TIME DATABASE


Traditionally, real-time systems manage their data (e.g. chamber temperature, aircraft locations) in application-dependent structures. As real-time
systems evolve, their applications become more complex and require access to more data [8, 10]. It thus becomes necessary to manage the data in
a systematic and organized fashion. Database management systems provide tools for such organization, so in recent years there has been interest
in merging database and real-time technology. The resulting integrated system, which provides database operations with real-time constraints, is
generally called a real-time database system (RTDBS). An RTDBS can be viewed as a value-added database system that supports real-time
transactions. A real-time transaction has to be completed by its deadline to be of full benefit to the system. Such guarantees are usually hard to
ensure. If a transaction's deadline is not met, the transaction is called a tardy transaction.
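The notion of a tardy transaction can be illustrated with a toy simulation. The sketch assumes a single processor, transactions that are all released at time 0 and run without preemption, and an earliest-deadline-first ordering; none of these choices comes from the text above, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    name: str
    duration: int  # time units the transaction needs to run
    deadline: int  # absolute time by which it should complete

def run_edf(transactions):
    """Run transactions one after another, earliest deadline first.

    Returns the names of the tardy transactions, i.e. those that finish
    after their deadline.
    """
    clock = 0
    tardy = []
    for txn in sorted(transactions, key=lambda t: t.deadline):
        clock += txn.duration  # the transaction completes at this time
        if clock > txn.deadline:
            tardy.append(txn.name)
    return tardy

feasible = [Transaction("T1", 2, 3), Transaction("T2", 2, 4), Transaction("T3", 1, 10)]
overloaded = [Transaction("T1", 3, 3), Transaction("T2", 2, 4)]

print(run_edf(feasible))
print(run_edf(overloaded))
```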

7. WEB DATA MINING


Web mining aims to discover useful information or knowledge from the Web hyperlink structure, page content, and usage data. Although Web
mining uses many data mining techniques, it is not purely an application of traditional data mining techniques. Because of the
heterogeneity and the semi-structured or unstructured nature of Web data, many new mining tasks and algorithms have been invented in the past
decade. Based on the primary kinds of data used in the mining process, Web mining tasks can be categorized into three types: Web structure
mining, Web content mining and Web usage mining [3, 7].

Web structure mining: Web structure mining discovers useful knowledge from hyperlinks (or links for short), which represent the structure of the Web. For example, from the links, we can discover important Web pages, which is a key technology used in search engines.

Web content mining: Web content mining extracts or mines useful information or knowledge from Web page contents. For example, we can automatically classify and cluster Web pages according to their topics. These tasks are similar to those in traditional data mining.

Web usage mining: Web usage mining refers to the discovery of user access patterns from Web usage logs, which record every click made by each user.
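As an illustration of Web structure mining, the sketch below scores pages by a PageRank-style power iteration over a small link graph. PageRank is used here as one well-known example of link analysis and is not named in the text above; the link graph, the damping factor of 0.85, the iteration count, and the function name `pagerank` are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by PageRank-style power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # a page passes its rank on in equal shares to the pages it links to
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # a page without out-links spreads its rank over all pages
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank

links = {
    "home": ["docs", "blog"],
    "docs": ["home"],
    "blog": ["home", "docs"],
    "about": ["home"],
}
scores = pagerank(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph the page with the most incoming links ends up with the highest score.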

8. PROBABILISTIC DATABASE
A probabilistic database (PDB) $\mathcal{P}$ for a vocabulary $\sigma$ is a finite set of tuples of the form $\langle t : p \rangle$, where $t$ is
a $\sigma$-atom and $p \in [0, 1]$. Moreover, if $\langle t : p \rangle \in \mathcal{P}$ and $\langle t : q \rangle \in \mathcal{P}$, then $p = q$.
Any probabilistic database is a particular type of graphical model (GM), where each random variable is associated
with a tuple (or with an attribute value, depending on whether we model tuple-level or attribute-level uncertainty). Query answers can also be
represented as a GM, by creating new random variables corresponding to the tuples of all intermediate results, including one variable for every
answer to the query.

Thus, GMs can be used both to represent probabilistic databases that have non-trivial correlations between their tuples and to compute the
probabilities of all query answers. However, there are some significant distinctions between the assumptions made in GMs and in probabilistic
databases.
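For the simplest case, in which tuples are assumed to be independent of each other, the probability of a Boolean query can be computed by enumerating all possible worlds. The sketch below does exactly that; it makes no attempt to model correlations between tuples, which is where graphical models become useful. The sample tuples, their probabilities, and the function names are assumptions. Storing the database as a dictionary from tuples to probabilities enforces the condition that each tuple carries a single probability.

```python
from itertools import product

def query_probability(pdb, query):
    """Probability that a Boolean query holds in a tuple-independent PDB.

    `pdb` maps each tuple to its probability; `query` maps a possible world
    (a set of tuples) to True or False. Enumerates all 2^n possible worlds.
    """
    tuples = list(pdb)
    total = 0.0
    for present in product([False, True], repeat=len(tuples)):
        world = {t for t, keep in zip(tuples, present) if keep}
        # weight of this world under the independence assumption
        weight = 1.0
        for t, keep in zip(tuples, present):
            weight *= pdb[t] if keep else 1.0 - pdb[t]
        if query(world):
            total += weight
    return total

pdb = {
    ("Likes", "ann", "jazz"): 0.8,
    ("Likes", "bob", "jazz"): 0.5,
}

def someone_likes_jazz(world):
    return any(t[0] == "Likes" and t[2] == "jazz" for t in world)

print(round(query_probability(pdb, someone_likes_jazz), 3))
```

The query fails only in the world where both tuples are absent, so its probability is $1 - 0.2 \cdot 0.5 = 0.9$.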

9. DATABASE DESIGN THEORY


The overall goal of database design theory is to capture as much of our model's structure as possible, particularly its constraints, in the
database schema itself. Doing so allows the database engine to enforce those constraints automatically and simplifies the application logic built
on top of it. A normalized database schema has two main benefits:

1. Minimal redundancy. Database design theory gives a formal way to identify and eliminate data redundancy in a database [4].

2. Constraint capture. Certain types of constraints can be expressed implicitly by the structure of a relational model, and we will exploit this to
relieve the application of enforcing them [4].

Functional dependencies: Database design theory centers on the concept of a functional dependency. Functional dependencies (FDs)
are a generalization of the key concept. As part of the formal definition of an FD, we will also formalize the various types of keys that can arise
during database design [4].
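The link between functional dependencies and keys can be sketched with the standard attribute-closure computation: a set of attributes is a superkey when its closure under the FDs covers the whole schema. The schema, the two dependencies, and the function names below are assumptions chosen for the example.

```python
def closure(attributes, fds):
    """Return the closure of a set of attributes under functional dependencies.

    `fds` is a list of (lhs, rhs) pairs of attribute sets, read as lhs -> rhs.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # if we already determine lhs, we also determine rhs
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attributes, schema, fds):
    """A superkey functionally determines every attribute of the schema."""
    return closure(attributes, fds) >= set(schema)

schema = {"student", "course", "instructor", "grade"}
fds = [
    ({"student", "course"}, {"grade"}),  # a student has one grade per course
    ({"course"}, {"instructor"}),        # each course has one instructor
]

print(sorted(closure({"course"}, fds)))
print(is_superkey({"student", "course"}, schema, fds))
print(is_superkey({"course"}, schema, fds))
```

In this example the dependency course -> instructor holds on a proper subset of the key, which is the kind of redundancy that normalization is meant to remove.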

10. FUNCTIONAL DEPENDENCY THEORY


Normalization theory is necessary for designing a good XML document, where a good document is one with minimal redundancy. Research
on XML functional dependencies and normalization is still an open problem. The approach first presents a path language for the XML model, and then
proposes a new kind of XML functional dependency (XFD) that is better able to express the functional dependencies that can result in
redundancies. The problems of XFD logical implication and XFD closure are studied, and a corresponding group of inference
rules is proposed. Based on the XFD, an XML normal form (XNF) and an algorithm for converting a document into XNF are also proposed
[2, 6].

11. DATABASE RECOVERY


A major responsibility of the database administrator is to prepare for the possibility of hardware, software, network, process, or system failure. If
such a failure affects the operation of a database system, you must usually recover the database and return to normal operation as quickly as
possible. Recovery should protect the database and associated users from unnecessary problems and avoid or reduce the possibility of having to
duplicate work manually [7].

12. CONCLUSION
In conclusion, database management systems remain an efficient and rapidly growing platform for storing data in the IT
industry. Many IT companies use MySQL Workbench as their platform, but relational databases can also be used in the
cloud, where the information is stored and offered in the form of services by means of virtualization. Practitioners
should therefore know how to create a database and connect it to the cloud, which is currently a major focus of the IT industry.

REFERENCES
[1] R. Abbott, H. Garcia-Molina: What is a Real-Time Database System? Abstracts of the Fourth Workshop on Real-Time Operating systems,
IEEE (July 1987) 134-138
[2] Bing Liu: What is web data mining? Exploring hyperlinks, contents and usage data. Second edition.
[3] S. Abiteboul, I. Manolescu, P. Rigaux, M.-C. Rousset, P. Senellart: Web Data Management. Cambridge University Press (2011).
[4] Abiteboul, S.; Hull, R.; and Vianu, V. 1995. Foundations of databases, volume 8. Addison-Wesley Reading.
[5] Arora, S., and Barak, B., 2009, Computational Complexity: A Modern Approach, Cambridge, England: Cambridge University Press.
[6] A.K. Chandra, P.M. Merlin, Optimal implementation of conjunctive queries in relational databases, Proc. 9th ACM Symp. on Theory of
Computing, 1977, pp. 77-90.
[7] Ling Liu: Encyclopedia of database systems.
[8] Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory,
Implementation and Applications. Cambridge University Press (2003).
[9] Johnson, D.S., Klug, A.C.: Testing containment of conjunctive queries under functional and inclusion dependencies. J. of Computer and
System Sciences 28(1), 167-189 (1984).
[10] International centre for computation logic: https://iccl.inf.tu-dresden.de/web/Foundations_of_Databases_and_Query_Languages_(SS2015)/en.
