ABSTRACT
Database theory encompasses a broad range of topics in the theoretical study of databases and database management systems, including
deductive databases, temporal and spatial databases, real-time databases, the management of uncertain data and probabilistic databases, and
Web data. Theoretical aspects of data management include, among other areas, the foundations of query languages, the computational
complexity and expressive power of queries, finite model theory, database design theory, dependency theory, and the foundations of
concurrency control and database recovery, all of which are treated briefly in this paper. Implementation plays a major role in database
theory and is highlighted in the paper through various techniques.
KEYWORDS: databases, computational complexity, database recovery, query languages, database design.
1. INTRODUCTION
Databases often fall into one of two broad categories. The first comprises specific-purpose, limited databases. In academia, these often contain
data gathered to perform a relatively limited role in a particular project. The database may be intended to provide the researcher with a
particular set of data, but have no particular function or role at the conclusion of the project. For example, Lockyear's Coin Hoards of the
Roman Republic (CHRR) database included only data necessary for the project.
The second category comprises general purpose, resource databases. A good example of a resource database is county archaeological sites and
monuments records (SMRs), or national monuments records (Hansen: 1993). These databases are not project specific but are intended to be of
use to a wide variety of users. Resource databases usually attempt to be comprehensive within their `domain of discourse', are maintained and
updated, and are made available to interested parties. As these databases attempt to be comprehensive in order to accommodate unpredicted
enquiries and research, they include a wide variety of data which in turn requires a complex `data structure', or way of storing the information.
The CISP database is intended to be a resource database and as a result has a complex data structure. This structure, however, provides great
power and flexibility, not only for the retrieval and handling of the data but also for future expansion of the database to include other
information and materials.
To Cite This Article: SHREYAS P, VINAY R KANGOKAR, SHRIRAKSHA S NATARAJ and R SHAMILI
SWETHA. DATABASE THEORY AND IMPLEMENTATION TECHNIQUES. Journal for Advanced Research in
Applied Sciences; Pages: 134-137
The use of first-order logic as database logic is shown to be powerful enough for formalizing and implementing not only relational but also
hierarchical and network-type databases [6]. It enables one to treat all these types of databases in a uniform manner. This paper focuses on the
database language for heterogeneous databases. The language is shown to be general enough to specify constraints for a particular type of
database language for heterogeneous databases. The language is shown to be general enough to specify constraints for a particular type of
database, so that a specification of database type can be translated to the specification given in the database language, creating a logical
environment for different views that can be defined by users.
Conjunctive queries are an important class of database queries, equivalent in expressive power to SPJ queries in the relational algebra. We
consider the classical problem of testing containment of conjunctive queries. The problem is well known to be NP-complete [7]. Acyclic queries,
in particular, have been extensively studied in the context of query optimization in distributed database systems, and are well known to have
desirable algorithmic properties.
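The classical Chandra-Merlin characterization makes the containment test concrete: one conjunctive query is contained in another iff there is a homomorphism from the atoms of the containing query into those of the contained one. The following sketch uses a hypothetical tuple encoding of atoms and, for brevity, treats all terms as variables and ignores distinguished head variables (i.e. Boolean queries); the exhaustive search is exponential, in line with the NP-completeness cited above.

```python
from itertools import product

def homomorphism_exists(q_from, q_to):
    """Brute-force search for a homomorphism mapping the atoms of
    q_from into q_to.  By Chandra-Merlin, the Boolean query q_to is
    contained in q_from iff such a homomorphism exists."""
    targets = sorted({t for atom in q_to for t in atom[1:]})
    sources = sorted({t for atom in q_from for t in atom[1:]})
    for image in product(targets, repeat=len(sources)):
        h = dict(zip(sources, image))
        # Check that every atom of q_from maps onto an atom of q_to.
        if all((rel, *(h[a] for a in args)) in q_to
               for (rel, *args) in q_from):
            return True
    return False

# Q1 :- R(x, y), R(y, z)   and   Q2 :- R(u, u)
q1 = {("R", "x", "y"), ("R", "y", "z")}
q2 = {("R", "u", "u")}
print(homomorphism_exists(q1, q2))  # True: map x, y, z all to u
```

Mapping every variable of Q1 to u turns both of Q1's atoms into R(u, u), so Q2 is contained in Q1.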
In its most basic form, a navigation query is a combination of one or more dimension values [4]. These dimension values are referred to as
the navigation descriptors. A navigation query instructs the Endeca MDEX Engine to return the set of records that represents the intersection of
all the dimension values that it contains. For example, in the illustration below, Bottle A represents the intersection between the Red and USA
dimension values. Bottle C represents the intersection between the White and France dimension values [9].
When an intersection does not exist between all of the dimension values in a navigation query, that query is considered a dead end. For example,
in the illustration above, the Sparkling and Chile dimension values have no bottles in common and, therefore, no intersection.
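The intersection semantics above can be sketched directly as a set operation. The record data below is hypothetical, chosen to mirror the wine-bottle illustration described in the text.

```python
# Hypothetical records, each tagged with its dimension values.
records = {
    "Bottle A": {"Red", "USA"},
    "Bottle B": {"Red", "France"},
    "Bottle C": {"White", "France"},
    "Bottle D": {"Sparkling", "USA"},
}

def navigate(query, records):
    """Return the records containing every dimension value in the query.
    An empty result corresponds to a 'dead end' navigation query."""
    return {name for name, dims in records.items() if query <= dims}

print(navigate({"Red", "USA"}, records))          # {'Bottle A'}
print(navigate({"Sparkling", "Chile"}, records))  # set(): a dead end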
[2]A number of languages have been proposed to model, query, and manipulate data, as well as for expressing very general classes of integrity
constraints, inference procedures, and ontological knowledge. Such languages are nowadays crucial for many applications such as semantic data
publishing and integration, decision support, and knowledge management. In this tutorial we first introduce Datalog, a powerful rule-based
language originally intended for expressing complex queries over relational data, and that today is at the basis of languages for the specification
of optimization and constraint satisfaction problems as well as of ontological constraints in data and knowledge bases. We then discuss the
limitations of Datalog for the semantic web, in particular for ontological modeling and reasoning, and we present several extensions that allow
capturing some of the ontology languages of the OWL family, the standard language for semantic data modeling on the semantic web.
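As a minimal sketch of how a recursive Datalog program is evaluated, the following computes the transitive closure of a relation by naive fixpoint iteration. The relation names edge and path and the sample data are hypothetical; the two rules being evaluated are path(X, Y) :- edge(X, Y) and path(X, Y) :- path(X, Z), edge(Z, Y).

```python
def transitive_closure(edge):
    """Naive Datalog evaluation: apply the rules repeatedly until no
    new facts are derived (the least fixpoint)."""
    path = set(edge)              # rule 1: every edge fact is a path fact
    changed = True
    while changed:
        changed = False
        for (x, z) in list(path): # rule 2: join path with edge
            for (z2, y) in edge:
                if z == z2 and (x, y) not in path:
                    path.add((x, y))
                    changed = True
    return path

edges = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(transitive_closure(edges)))
```

Production Datalog engines use semi-naive evaluation (joining only newly derived facts each round), but the fixpoint reached is the same.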
3. COMPUTATIONAL COMPLEXITY
Computational complexity theory is a subfield of theoretical computer science, one of whose primary goals is to classify and compare the
practical difficulty of solving problems about finite combinatorial objects: e.g. given two natural numbers n and m, are they relatively
prime? Given a propositional formula, does it have a satisfying assignment? If we were to play chess on a board of size n x n, does White
have a winning strategy from a given initial position? These problems are equally difficult from the standpoint of classical computability
theory in the sense that they are all effectively decidable. Yet they still appear to differ significantly in practical difficulty. For, having been
supplied with a pair of numbers m > n > 0, it is possible to determine their relative primality by a method (Euclid's algorithm) which
requires a number of steps proportional to log(n) [3]. On the other hand, all known methods for solving the latter two problems require a
brute-force search through a large class of cases which increases at least exponentially in the size of the problem instance.
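Euclid's algorithm mentioned above can be sketched in a few lines, with a step counter making the logarithmic behavior observable:

```python
def euclid_steps(m, n):
    """Euclid's algorithm: returns (gcd, iteration count).
    For m > n > 0 the count is proportional to log(n)."""
    steps = 0
    while n:
        m, n = n, m % n   # replace the pair by (n, m mod n)
        steps += 1
    return m, steps

def relatively_prime(m, n):
    """Two numbers are relatively prime iff their gcd is 1."""
    return euclid_steps(m, n)[0] == 1

print(relatively_prime(35, 18))  # True: gcd(35, 18) = 1
print(relatively_prime(36, 18))  # False: gcd(36, 18) = 18
```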
Complexity theory attempts to make such distinctions precise by proposing a formal criterion for what it means for a mathematical problem to
be feasibly decidable, i.e. that it can be solved by a conventional Turing machine in a number of steps which is proportional to a polynomial
function of the size of its input. The class of problems with this property is known as P, or polynomial time, and includes the first of the three
problems described above. P can be formally shown to be distinct from certain other classes such as EXP, or exponential time, which
includes the third problem from above [5, 8, 10]. The second problem from above belongs to a complexity class known as NP, or non-
deterministic polynomial time, consisting of those problems which can be correctly decided by some computation of a non-deterministic Turing
machine in a number of steps which is a polynomial function of the size of its input.
A famous conjecture, often regarded as the most fundamental in all of theoretical computer science, states that P is also properly contained
in NP, i.e. P != NP. Demonstrating the non-coincidence of these and other complexity classes remains an important open problem in
complexity theory. But even in its present state of development, this subject connects many topics in logic, mathematics, and surrounding fields
in a manner which bears on the nature and scope of our knowledge of these subjects.
Finite model theory (FMT) is mainly about the discrimination of structures. The usual motivating question is whether a given class of structures can be
described (up to isomorphism) in a given language. For instance, can all cyclic graphs be discriminated (from the non-cyclic ones) by a
sentence of the first-order logic of graphs? This can also be phrased as: is the property "cyclic" FO-expressible?
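While FMT asks whether cyclicity is definable by a first-order sentence, the property itself is easy to decide algorithmically. A sketch for undirected graphs, using a hypothetical vertex/edge encoding and union-find: an edge joining two already-connected vertices closes a cycle.

```python
def has_cycle(n, edges):
    """Detect a cycle in an undirected graph on vertices 0..n-1."""
    parent = list(range(n))
    def find(x):
        # Find the component representative, with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:          # endpoints already connected: cycle found
            return True
        parent[ru] = rv       # otherwise merge the two components
    return False

print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # True: a triangle
print(has_cycle(3, [(0, 1), (1, 2)]))          # False: a path
```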
Web structure mining: Web structure mining discovers useful knowledge from hyperlinks (or links for short), which represent the
structure of the Web. For example, from the links, we can discover important Web pages, which is a key technology used in search
engines.
Web content mining: Web content mining extracts or mines useful information or knowledge from Web page contents. For example,
we can automatically classify and cluster Web pages according to their topics. These tasks are similar to those in traditional data
mining.
Web usage mining: Web usage mining refers to the discovery of user access patterns from Web usage logs, which record every click
made by each user.
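As a minimal illustration of usage mining over such a log, the snippet below counts distinct users per page; the click data and page names are hypothetical.

```python
from collections import Counter

# Hypothetical click log: one (user, page) pair per recorded click.
clicks = [
    ("u1", "/home"), ("u1", "/search"), ("u1", "/product/7"),
    ("u2", "/home"), ("u2", "/product/7"),
    ("u3", "/home"), ("u3", "/search"),
]

def pages_by_distinct_users(clicks):
    """Count how many distinct users visited each page -- one very
    simple access pattern minable from a usage log.  set() first
    deduplicates repeat visits by the same user."""
    return Counter(page for _, page in set(clicks))

print(pages_by_distinct_users(clicks).most_common(1))  # [('/home', 3)]
```

Real usage-mining pipelines go further (sessionization, frequent click sequences), but they start from exactly this kind of aggregation.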
8. PROBABILISTIC DATABASE
A probabilistic database (PDB) P for a vocabulary σ is a finite set of tuples of the form ⟨t, p⟩, where t is a σ-atom and p ∈ [0, 1]. Moreover,
if ⟨t, p⟩ ∈ P and ⟨t, q⟩ ∈ P, then p = q. Any probabilistic database is a particular type of a graphical model (GM), where each random variable
is associated to a tuple (or to an attribute value, depending on whether we model tuple-level or attribute-level uncertainty). Query answers can
also be represented as a GM, by creating new random variables corresponding to the tuples of all intermediate results, including one variable
for every answer to the query.
Thus, GMs can be used both to represent probabilistic databases that have non-trivial correlations between their tuples and to compute the
probabilities of all query answers. However, there are some significant distinctions between the assumptions made in GMs and in probabilistic
databases.
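In the simplest case (tuple-level uncertainty with fully independent tuples, i.e. the trivial GM with one independent Boolean variable per tuple), probabilities of simple events can be computed in closed form. The relation and probabilities below are hypothetical.

```python
# Tuple-independent probabilistic relation: tuple -> marginal probability.
R = {("a",): 0.5, ("b",): 0.4}

def prob_nonempty(rel):
    """P(the relation contains at least one tuple), assuming tuple
    independence: 1 - prod(1 - p) over all tuples.  Correlated tuples
    would require inference in a full graphical model instead."""
    q = 1.0
    for p in rel.values():
        q *= 1.0 - p
    return 1.0 - q

print(prob_nonempty(R))  # 1 - 0.5 * 0.6 = 0.7
```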
1. Minimal redundancy. Database design theory gives a formal way to identify and eliminate data redundancy in a database [4].
2. Constraint capture. Certain types of constraints can be expressed implicitly by the structure of a relational model, and we will exploit this to
relieve the application of enforcing them [4].
Functional dependencies: Database design theory centers on the concept of a functional dependency. Functional dependencies (FDs)
are a generalization of the key concept. As part of the formal definition of an FD, we will also formalize various types of keys that can arise
during database design [4].
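The link between FDs and keys runs through the standard attribute-closure computation: X is a (super)key of R iff the closure of X under the FDs covers all attributes of R. A sketch over a hypothetical schema R(A, B, C, D):

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs
    of attribute sets: repeatedly fire every FD whose left-hand side
    is already covered, until nothing new is added."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return closed

# Hypothetical schema R(A, B, C, D) with FDs A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(closure({"A", "D"}, fds))  # {'A', 'B', 'C', 'D'}: AD is a key
```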
12. CONCLUSION
Finally, we conclude that database management is among the most efficient and fastest-growing platforms for storing data in the IT
industry. Most IT companies use MySQL Workbench as their platform, but relational databases can also be used in the
cloud, where information is stored in the form of services using virtualization techniques. Therefore, people
must be aware of how to create a database and connect it to the cloud, which is the main IT buzz platform right now.
REFERENCES
[1] R. Abbott, H. Garcia-Molina: What is a Real-Time Database System? Abstracts of the Fourth Workshop on Real-Time Operating Systems,
IEEE (July 1987), pp. 134-138.
[2] Bing Liu: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, second edition.
[3] S. Abiteboul, I. Manolescu, P. Rigaux, M.-C. Rousset, P. Senellart: Web Data Management. Cambridge University Press, 2011.
[4] S. Abiteboul, R. Hull, V. Vianu: Foundations of Databases. Addison-Wesley, 1995.
[5] S. Arora, B. Barak: Computational Complexity: A Modern Approach. Cambridge University Press, 2009.
[6] A.K. Chandra, P.M. Merlin: Optimal implementation of conjunctive queries in relational data bases. Proc. 9th ACM Symp. on Theory of
Computing, 1977, pp. 77-90.
[7] Ling Liu: Encyclopedia of Database Systems.
[8] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, P.F. Patel-Schneider (eds.): The Description Logic Handbook: Theory,
Implementation and Applications. Cambridge University Press, 2003.
[9] D.S. Johnson, A.C. Klug: Testing containment of conjunctive queries under functional and inclusion dependencies. Journal of Computer
and System Sciences 28(1), pp. 167-189, 1984.
[10] International Centre for Computational Logic: https://iccl.inf.tu-dresden.de/web/Foundations_of_Databases_and_Query_Languages_(SS2015)/en.