
Database

A database is an organized collection of data. The term originated within the computer industry, but its meaning has been broadened by popular use, to the extent that the European Database Directive (which creates intellectual property rights for databases) includes non-electronic databases within its definition. This article is confined to a more technical use of the term; though even amongst computing professionals, some attach a much wider meaning to the word than others. One possible definition is that a database is a collection of records stored in a computer in a systematic way, so that a computer program can consult it to answer questions. For better retrieval and sorting, each record is usually organized as a set of data elements (facts). The items retrieved in answer to queries become information that can be used to make decisions. The computer program used to manage and query a database is known as a database management system (DBMS). The properties and design of database systems are included in the study of information science.

The central concept of a database is that of a collection of records, or pieces of knowledge. Typically, for a given database, there is a structural description of the type of facts held in that database: this description is known as a schema. The schema describes the objects that are represented in the database, and the relationships among them. There are a number of different ways of organizing a schema, that is, of modeling the database structure: these are known as database models (or data models). The model in most common use today is the relational model, which in layman's terms represents all information in the form of multiple related tables each consisting of rows and columns (the true definition uses mathematical terminology). This model represents relationships by the use of values common to more than one table. Other models such as the hierarchical model and the network model use a more explicit representation of relationships.

Strictly speaking, the term database refers to the collection of related records, and the software should be referred to as the database management system or DBMS. When the context is unambiguous, however, many database administrators and programmers use the term database to cover both meanings. Many professionals would consider a collection of data to constitute a database only if it has certain properties: for example, if the data is managed to ensure its integrity and quality, if it allows shared access by a community of users, if it has a schema, or if it supports a query language. However, there is no agreed definition of these properties.

Database management systems are usually categorized according to the data model that they support: relational, object-relational, network, and so on. The data model will tend to determine the query languages that are available to access the database. A great deal of the internal engineering of a DBMS, however, is independent of the data model, and is concerned with managing factors such as performance, concurrency, integrity, and recovery from hardware failures. In these areas there are large differences between products.
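
As a concrete illustration of these ideas (a schema, records, and a program consulting the database through a DBMS), here is a minimal sketch using Python's built-in sqlite3 module; the table, columns, and data are invented for illustration only.

```python
# Minimal sketch: a schema, some records, and a query, using Python's
# built-in sqlite3 module. Table and column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")          # an in-memory database

# The schema describes the facts the database holds.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, location TEXT)")

# Each record is a set of data elements (facts).
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Ada", "London"), (2, "Grace", "New York")])

# A program consults the database to answer a question.
for row in conn.execute("SELECT name FROM employee WHERE location = ?", ("London",)):
    print(row[0])                           # -> Ada
```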

Database models
Various techniques are used to model data structure. Most database systems are built around one particular data model, although it is increasingly common for products to offer support for more than one model. For any one logical model various physical implementations may be possible, and most products will offer the user some level of control in tuning the physical implementation, since the choices that are made have a significant effect on performance. An example of this is the relational model: all serious implementations of the relational model allow the creation of indexes which provide fast access to rows in a table if the values of certain columns are known. A data model is not just a way of structuring data: it also defines a set of operations that can be performed on the data. The relational model, for example, defines operations such as selection, projection, and join. Although these operations may not be explicit in a particular query language, they provide the foundation on which a query language is built.
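
The following sketch illustrates the selection, projection, and join operations just mentioned over plain in-memory rows; the data and field names are invented, and a real DBMS would of course evaluate these operations internally.

```python
# Illustration of the relational operations named above, over in-memory
# rows represented as dictionaries. The sample data is invented.
employees = [
    {"id": 1, "name": "Ada",   "dept_id": 10},
    {"id": 2, "name": "Grace", "dept_id": 20},
]
departments = [
    {"dept_id": 10, "dept_name": "Research"},
    {"dept_id": 20, "dept_name": "Engineering"},
]

# Selection: keep only the rows that satisfy a predicate.
selected = [e for e in employees if e["dept_id"] == 10]

# Projection: keep only some columns of each row.
projected = [{"name": e["name"]} for e in employees]

# Join: combine rows from two tables on a common value.
joined = [
    {**e, **d}
    for e in employees
    for d in departments
    if e["dept_id"] == d["dept_id"]
]
print(selected, projected, joined, sep="\n")
```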

Network model
The network model (defined by the CODASYL specification) organizes data using two fundamental constructs, called records and sets. Records contain fields (which may be organized hierarchically, as in COBOL). Sets (not to be confused with mathematical sets) define one-to-many relationships between records: one owner, many members. A record may be an owner in any number of sets, and a member in any number of sets. The operations of the network model are navigational in style: a program maintains a current position, and navigates from one record to another by following the relationships in which the record participates. Records can also be located by supplying key values. Although it is not an essential feature of the model, network databases generally implement the set relationships by means of pointers that directly address the location of a record on disk. This gives excellent retrieval performance, at the expense of operations such as database loading and reorganization.
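
A rough, purely illustrative sketch of the owner/member sets and navigational access described above; the class and set names are invented, and real CODASYL systems work at a much lower level.

```python
# Sketch of the network model's records and sets: one owner, many
# members, navigated by following relationships. Names are invented.
class Record:
    def __init__(self, **fields):
        self.fields = fields
        self.sets = {}            # set name -> list of member records

def connect(owner, set_name, member):
    owner.sets.setdefault(set_name, []).append(member)

dept = Record(name="Research")
emp1 = Record(name="Ada")
emp2 = Record(name="Grace")
connect(dept, "EMPLOYS", emp1)    # dept is the owner, employees are members
connect(dept, "EMPLOYS", emp2)

# Navigational access: start from a current record and follow the set.
current = dept
for member in current.sets["EMPLOYS"]:
    print(member.fields["name"])
```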

Relational model
The relational model was introduced in an academic paper by E. F. Codd in 1970 as a way to make database management systems more independent of any particular application. It is a mathematical model defined in terms of predicate logic and set theory. The products that are generally referred to as relational databases (for example Adaptive Server Enterprise, Ingres, Oracle, DB2, Informix, and SQL Server) in fact implement a model that is only an approximation to the mathematical model defined by Codd. The data structures in these products are tables, rather than relations: the main differences being that tables can contain duplicate rows, and that the rows (and columns) can be treated as being ordered. The same criticism applies to the SQL language, which is the primary interface to these products. There has been considerable controversy, mainly due to Codd himself, as to whether it is correct to describe SQL implementations as "relational": but the fact is that the world does so, and the following description uses the term in its popular sense.

A relational database contains multiple tables, each similar to the one in the "flat" database model. Relationships between tables are not defined explicitly; instead, keys are used to match up rows of data in different tables. A key is a collection of one or more columns in one table whose values match corresponding columns in other tables: for example, an Employee table may contain a column named Location which contains a value that matches the key of a Location table. Any column can be a key, or multiple columns can be grouped together into a single key. It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.

A key that can be used to uniquely identify a row in a table is called a unique key. Typically, one of the unique keys is the preferred way to refer to a row; this is defined as the table's primary key. A key that has an external, real-world meaning (such as a person's name, a book's ISBN, or a car's serial number) is sometimes called a "natural" key. If no natural key is suitable (think of the many people named Brown), an arbitrary key can be assigned (such as by giving employees ID numbers). In practice, most databases have both generated and natural keys, because generated keys can be used internally to create links between rows that cannot break, while natural keys can be used, less reliably, for searches and for integration with other databases. (For example, records in two independently developed databases could be matched up by social security number, except when the social security numbers are incorrect, missing, or have changed.)
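
To illustrate generated keys, natural keys, and key-based matching between tables, here is a small sketch using Python's built-in sqlite3 module; it echoes the Employee/Location example above, but all names and values are invented.

```python
# Sketch of generated (surrogate) and natural keys, and of matching rows
# across tables by key values, using sqlite3. Names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (
    loc_id INTEGER PRIMARY KEY,       -- generated key
    city   TEXT UNIQUE                -- natural key
);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,       -- generated key
    name   TEXT,
    loc_id INTEGER REFERENCES location(loc_id)  -- matches the key of location
);
INSERT INTO location VALUES (1, 'Berlin');
INSERT INTO employee VALUES (100, 'Brown', 1);
""")

# The relationship is expressed only through the shared key values.
row = conn.execute("""
    SELECT e.name, l.city
    FROM employee e JOIN location l ON e.loc_id = l.loc_id
""").fetchone()
print(row)   # ('Brown', 'Berlin')
```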

Object database models


In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.

A variety of ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.

Object databases suffered because of a lack of standardization: although standards were defined by ODMG, they were never implemented well enough to ensure interoperability between products. Nevertheless, object databases have been used successfully in many applications: usually specialized applications such as engineering databases or molecular biology databases rather than mainstream commercial data processing. However, object database ideas were picked up by the relational vendors and influenced extensions made to these products and indeed to the SQL language.
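
The following toy sketch illustrates the persistence idea from the application-programming end: objects stay in the program's own type system and are found by their content. The store shown here is invented and not modeled on any particular product; real object databases are far more involved.

```python
# Toy sketch of object persistence: objects are kept as objects and can
# be queried by content, without translation to rows. Purely illustrative.
class Gene:
    def __init__(self, symbol, organism):
        self.symbol = symbol
        self.organism = organism

class ObjectStore:
    def __init__(self):
        self._objects = []

    def persist(self, obj):
        self._objects.append(obj)          # no mapping to tables/rows

    def query(self, predicate):
        return [o for o in self._objects if predicate(o)]

store = ObjectStore()
store.persist(Gene("BRCA1", "human"))
store.persist(Gene("wg", "fruit fly"))
human_genes = store.query(lambda g: g.organism == "human")
print([g.symbol for g in human_genes])
```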

Hierarchical model
In a hierarchical data model, data is organized into a tree-like structure in which each child record has exactly one parent. The structure allows repeating information using parent/child relationships. All attributes of a specific record are listed under an entity type. In a database, an entity type is the equivalent of a table; each individual record is represented as a row and an attribute as a column. Entity types are related to each other using 1:N mappings, also known as one-to-many relationships. An example would be: an organisation has records of employees in a table (entity type) called Employees. In the table there would be attributes/columns such as First Name, Last Name, Job Name and Wage. The company also has data about the employees' children in a separate table called Children, with attributes such as First Name, Last Name, and DOB. The Employees table represents a parent segment and the Children table represents a child segment. These two segments form a hierarchy where an employee may have many children, but each child may only have one parent. Hierarchical structures were widely used in the first mainframe database management systems. However, owing to their restrictions, they often cannot be used to relate structures that exist in the real world. Hierarchical relationships between different types of data can make it very easy to answer some questions, but very difficult to answer others. If a one-to-many relationship is violated (e.g., a patient can have more than one physician) then the hierarchy becomes a network.
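
To make the Employees/Children example concrete, here is a small sketch of the hierarchy as nested records; the data is invented.

```python
# Sketch of the hierarchy described above: each child segment belongs to
# exactly one parent segment. The data is invented for illustration.
employees = [
    {
        "first_name": "Jane", "last_name": "Doe",
        "job_name": "Engineer", "wage": 52000,
        "children": [                      # child segments nested under the parent
            {"first_name": "Sam", "last_name": "Doe", "dob": "2015-04-02"},
            {"first_name": "Lee", "last_name": "Doe", "dob": "2018-09-17"},
        ],
    },
]

# Questions that follow the hierarchy are easy to answer...
for emp in employees:
    for child in emp["children"]:
        print(emp["last_name"], "->", child["first_name"])
# ...but a child record cannot point to a second parent without
# turning the tree into a network.
```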

Applications of databases
Databases are used in many applications, spanning virtually the entire range of computer software. Databases are the preferred method of storage for large multiuser applications, where coordination between many users is needed. Even individual users find them convenient, though, and many electronic mail programs and personal organizers are based on standard database technology. Software database drivers are available for most database platforms so that application software can use a common application programming interface (API) to retrieve the information stored in a database. Two commonly used database APIs are JDBC and ODBC.
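
As a sketch of the common-API idea, Python's DB-API plays a role comparable to JDBC and ODBC: the same query logic can run against different drivers. sqlite3 is used below only because it ships with Python; the table and data are illustrative.

```python
# Sketch of a common database API: any DB-API connection (sqlite3,
# psycopg2, ...) exposes cursor(), execute() and fetchall(), so the
# application code stays largely driver-independent.
import sqlite3

def list_names(conn):
    cur = conn.cursor()
    cur.execute("SELECT name FROM person")
    return [row[0] for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.execute("INSERT INTO person VALUES ('Ada')")
print(list_names(conn))                    # -> ['Ada']
```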

Decision support system

Definitions


The concept of a decision support system (DSS) is extremely broad and its definitions vary depending upon the author's point of view (Druzdzel and Flynn 1999). A DSS can take many different forms and the term can be used in many different ways (Alter 1980). On the one hand, Finlay (1994) and others define a DSS broadly as "a computer-based system that aids the process of decision making." In a more precise way, Turban (1995) defines it as "an interactive, flexible, and adaptable computer-based information system, especially developed for supporting the solution of a nonstructured management problem for improved decision making. It utilizes data, provides an easy-to-use interface, and allows for the decision maker's own insights." Other definitions fill the gap between these two extremes. For Keen and Scott Morton (1978), DSS couple the intellectual resources of individuals with the capabilities of the computer to improve the quality of decisions ("DSS are computer-based support for management decision makers who are dealing with semi-structured problems"). For Sprague and Carlson (1982), DSS are "interactive computer-based systems that help decision makers utilize data and models to solve unstructured problems." On the other hand, Keen (1980) claims that it is impossible to give a precise definition including all the facets of the DSS ("there can be no definition of decision support systems, only of decision support"). Nevertheless, according to Power (1997), the term decision support system remains a useful and inclusive term for many types of information systems that support decision making. He humorously adds that every time a computerized system is not an on-line transaction processing system (OLTP), someone will be tempted to call it a DSS.

As this survey shows, there is no universally accepted definition of DSS; the definitions differ mainly in how narrowly they specify what a DSS must do. In addition, a DSS can also be a specific software application that helps to analyze data contained within a customer database; such a system is used, for example, when deciding on target markets or analyzing customer habits. As this example shows, a DSS can be used for more than just organizing data.

Recommended reading: Druzdzel and Flynn (1999), Power (2000), Sprague and Watson (1993), the first chapter of Power (2002), the first chapter of Marakas (1999), the first chapter of Silver (1991), the first two chapters of Sauter (1997), and Holsapple and Whinston (1996).

Applications
As mentioned above, such systems can in principle be built for any knowledge domain. One example is the clinical decision support system for medical diagnosis. Other examples include a bank loan officer verifying the credit of a loan applicant, or an engineering firm that has bids on several projects and wants to know if it can be competitive with its costs. A specific example concerns the Canadian National Railway system, which tests its equipment on a regular basis using a decision support system. A problem faced by any railroad is worn-out or defective rails, which can result in hundreds of derailments per year. Under a DSS, CN managed to decrease the incidence of derailments at the same time other companies were experiencing an increase. A DSS thus has many applications, some of which have already been mentioned, and it can be used in any field where organized decision making is necessary. For instance, a DSS can be designed to help make decisions on the stock market, or to decide which area or segment to market a product toward. In principle, a DSS can be applied wherever and whenever decisions need to be supported.
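
As a deliberately simplified illustration of the loan-officer example, the sketch below applies a few scoring rules; the thresholds and field names are invented and not drawn from any real system.

```python
# Very simplified sketch of a rule-based check a loan officer's DSS
# might apply. All thresholds and fields are invented for illustration.
def assess_loan(applicant):
    score = 0
    if applicant["credit_score"] >= 680:
        score += 2
    if applicant["debt_to_income"] <= 0.35:
        score += 1
    if applicant["years_employed"] >= 2:
        score += 1
    # The system supports the decision; the officer still makes it.
    return "recommend approval" if score >= 3 else "refer to officer"

print(assess_loan({"credit_score": 700, "debt_to_income": 0.30, "years_employed": 3}))
```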

Designing a Specific DSS


- Ideal planning DSS design methodology
- Iterative nature of situation analysis
- Interviewing techniques matrix
- Design Approaches
- Data and Data Management Considerations in Selecting a DSS Generator
- Models and Model Management Considerations in Selecting a DSS Generator
- User Interface Considerations in Selecting a DSS Generator
- Connectivity Considerations in Selecting a DSS Generator
- Hardware and Software Considerations in Selecting a DSS Generator
- Cost Considerations in Selecting a DSS Generator
- Vendor Considerations in Selecting a DSS Generator
- Characteristics of Good Team Members

Three-tier (computing)

Figure: visual overview of a three-tiered application.

In computing, three-tier is a client-server architecture in which the user interface, functional process logic ("business rules"), data storage and data access are developed and maintained as independent modules, most often on separate platforms. The term "three-tier" or "three-layer", as well as the concept of multitier architectures, seems to have originated within Rational Software. The three-tier model is considered to be a software architecture and a software design pattern. Apart from the usual advantages of modular software with well-defined interfaces, the three-tier architecture is intended to allow any of the three tiers to be upgraded or replaced independently as requirements or technology change. For example, a change of operating system from Microsoft Windows to Unix would only affect the user interface code. Typically, the user interface runs on a desktop PC or workstation and uses a standard graphical user interface, functional process logic may consist of one or more separate modules running on a workstation or application server, and an RDBMS on a database server or mainframe contains the data storage logic. The middle tier may itself be multi-tiered (in which case the overall architecture is called an "n-tier architecture"). The model is similar, although defined in slightly different terms, to the model-view-controller concept and the pipes-and-filters concept.

3- and n-Tier Architectures

Introduction


With the appearance of local area networks, PCs came out of their isolation, and were soon connected not only to each other but also to servers. Client/server computing was born. Servers today are mainly file and database servers; application servers are the exception. However, database servers only offer data on the server; consequently the application intelligence must be implemented on the PC (client). Since there are only the architecturally tiered data server and client, this is called 2-tier architecture. This model is still predominant today, and is actually the opposite of its popular terminal-based predecessor, which had its entire intelligence on the host system. One reason why the 2-tier model is so widespread is the quality of the tools and middleware that have been most commonly used since the 90s: remote SQL, ODBC, and relatively inexpensive and well integrated PC tools (like Visual Basic, PowerBuilder, MS Access, and 4GL tools by the DBMS manufacturers). In comparison, the server side uses relatively expensive tools. In addition, the PC-based tools show good rapid application development (RAD) qualities, i.e. simpler applications can be produced in a comparatively short time. The 2-tier model is the logical consequence of the RAD tools' popularity: for many managers it was and is simpler to attempt to achieve efficiency in software development using tools than to choose the steep and stony path of "brainware".
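
A minimal sketch of this fat-client pattern: the client program issues SQL directly against the database server and also holds the business logic. sqlite3 stands in for the remote database here, and the table and VAT rate are invented for illustration.

```python
# Sketch of a 2-tier "fat client": the client talks to the database
# directly and contains the business logic itself. Names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")           # stands in for the database server
conn.execute("CREATE TABLE price (item TEXT, net REAL)")
conn.execute("INSERT INTO price VALUES ('book', 10.0)")

# Business logic (VAT calculation) lives inside the client program,
# which is exactly what later makes change management expensive.
VAT = 0.19                                   # assumed rate, for illustration
net = conn.execute("SELECT net FROM price WHERE item = 'book'").fetchone()[0]
print("gross:", round(net * (1 + VAT), 2))
```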

Why 3-tier?
Unfortunately, the 2-tier model shows striking weaknesses that make the development and maintenance of such applications much more expensive. These include the following:

- The complete development accumulates on the PC. The PC processes and presents information, which leads to monolithic applications that are expensive to maintain. That is why such a client is called a "fat client".
- In a 2-tier architecture, business logic is implemented on the PC. Even if the business logic never makes direct use of the windowing system, programmers have to be trained for the complex API under Windows.
- Windows 3.x and Mac systems have tough resource restrictions. For this reason application programmers also have to be well trained in systems technology, so that they can optimize scarce resources.
- Increased network load: since the actual processing of the data takes place on the remote client, the data has to be transported over the network. As a rule this leads to increased network stress.
- How transactions are conducted is controlled by the client. Advanced techniques like two-phase commit cannot be used.
- PCs are considered "untrusted" in terms of security, i.e. they are relatively easy to crack. Nevertheless, sensitive data is transferred to the PC, for lack of an alternative.
- Data is only "offered" on the server, not processed. Stored procedures are a form of assistance given by the database provider, but they have a limited field of application and a proprietary nature.
- Application logic cannot be reused because it is bound to an individual PC program.
- The influence on change management is drastic: due to changes in business politics or law (e.g. changes in VAT computation), processes have to be changed. Thus possibly dozens of PC programs have to be adapted because the same logic has been implemented numerous times. Each of these programs then has to undergo quality control, because all of them are expected to generate the same results again.
- The 2-tier model implies a complicated software distribution procedure: as all of the application logic is executed on the PC, all those machines (maybe thousands) have to be updated in case of a new release. This can be very expensive, complicated, prone to error and time consuming. Distribution procedures include distribution over networks (perhaps of large files) or the production of adequate media like floppies or CDs. Once it arrives at the user's desk, the software first has to be installed and tested for correct execution. Due to the distributed character of such an update procedure, system management cannot guarantee that all clients work on the correct copy of the program.

3- and n-tier architectures endeavour to solve these problems. This goal is achieved primarily by moving the application logic from the client back to the server.

What is 3- and n-tier architecture?


From here on we will only refer to 3-tier architecture, that is to say, at least 3-tier architecture. The following diagram shows a simplified form of the reference architecture, though in principle all possibilities are illustrated.

Client-tier
The client tier is responsible for the presentation of data, receiving user events and controlling the user interface. The actual business logic (e.g. calculating value-added tax) has been moved to an application server. Today, Java applets offer an alternative to traditionally written PC applications.

Application-server-tier
This tier is new; that is, it isn't present in 2-tier architecture in this explicit form. Business objects that implement the business rules "live" here, and are available to the client tier. This level now forms the central key to solving 2-tier problems, and it protects the data from direct access by the clients. Object-oriented analysis (OOA), on which many books have been written, aims at this tier: to record and abstract business processes in business objects. This way it is possible to map out the application-server tier directly from the CASE tools that support OOA. Furthermore, the term "component" is also found here. Today the term predominantly describes visual components on the client side; in the non-visual area of the system, components on the server side can be defined as configurable objects which can be put together to form new application processes.

Data-server-tier
This tier is responsible for data storage. Besides the widespread relational database systems, existing legacy systems' databases are often reused here. It is important to note that the boundaries between tiers are logical: it is quite possible to run all three tiers on one and the same (physical) machine. What matters is that the system is neatly structured, and that there is a well planned definition of the software boundaries between the different tiers.
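
To make the division of the tiers concrete, here is a minimal sketch in which each tier is simply a separate class; in a real system they would typically run as separate processes or on separate machines. The item catalogue and the VAT rate are invented, following the VAT example used in this article.

```python
# Minimal sketch of the three tiers as separate modules. All names and
# values are illustrative; the tier boundaries are the point.

class DataTier:                      # data storage and access
    def __init__(self):
        self._prices = {"book": 10.0}
    def get_price(self, item):
        return self._prices[item]

class LogicTier:                     # functional process logic ("business rules")
    VAT = 0.19                       # assumed rate, for illustration
    def __init__(self, data):
        self.data = data
    def price_with_vat(self, item):
        return round(self.data.get_price(item) * (1 + self.VAT), 2)

class ClientTier:                    # user interface
    def __init__(self, logic):
        self.logic = logic
    def show(self, item):
        print(f"{item}: {self.logic.price_with_vat(item)}")

ClientTier(LogicTier(DataTier())).show("book")   # -> book: 11.9
```

Because the client only talks to the logic tier, a change to the VAT rule or to the storage strategy stays invisible to it, which is exactly the advantage discussed in the next section.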

The advantages of 3-tier architecture


As previously mentioned, 3-tier architecture solves a number of problems that are inherent to 2-tier architectures. Naturally it also causes new problems, but these are outweighed by the advantages:

- Clear separation of user-interface control and data presentation from application logic. Through this separation more clients are able to have access to a wide variety of server applications. The two main advantages for client applications are clear: quicker development through the reuse of pre-built business-logic components, and a shorter test phase, because the server components have already been tested.
- Redefinition of the storage strategy won't influence the clients. RDBMS offer a certain independence from storage details for the clients, but cases like changing table attributes still make it necessary to adapt the client's application. In the future, even radical changes, like switching from an RDBMS to an OODBS, won't influence the client: in well designed systems, the client still accesses data over a stable and well designed interface which encapsulates all the storage details.
- Business objects and data storage should be brought as close together as possible; ideally they should be physically together on the same server. This way - especially with complex accesses - network load is eliminated. The client only receives the results of a calculation, through the business object, of course.
- In contrast to the 2-tier model, where only data is accessible to the public, business objects can place application logic or "services" on the net. As an example, an inventory number has a "test digit", and the calculation of that digit can be made available on the server (a minimal sketch follows this list).
- As a rule servers are "trusted" systems. Their authorization is simpler than that of thousands of "untrusted" client PCs, so data protection and security are simpler to obtain. Therefore it makes sense to run critical business processes that work with security-sensitive data on the server.
- Dynamic load balancing: if performance bottlenecks occur, the server process can be moved to other servers at runtime.
- Change management: it is of course easier - and faster - to exchange a component on the server than to furnish numerous PCs with new program versions. To come back to our VAT example: it is quite easy to run the new version of a tax object in such a way that the clients automatically work with the version valid from the exact date on which it has to take effect. It is, however, compulsory that interfaces remain stable and that old client versions remain compatible. In addition, such components require a high standard of quality control, because low-quality components can, at worst, endanger the functioning of a whole set of client applications. At best, they will still irritate the systems operator.
- As shown on the diagram, it is relatively simple to use wrapping techniques in 3-tier architecture. As implementation changes are transparent from the viewpoint of the object's client, a forward strategy can be developed to replace legacy systems smoothly. First, the object's interface is defined, but the functionality is not newly implemented: it is reused from an existing host application. That is, a request from a client is forwarded to a legacy system and processed and answered there. In a later phase, the old application can be replaced by a modern solution. If it is possible to leave the business object's interfaces unchanged, the client application remains unaffected. A requirement for wrapping is, however, that a procedure interface remains available in the old application; it isn't possible for a business object to emulate a terminal.
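
Following the check-digit example in the list above, here is a minimal sketch of a business object that offers such a "service" on the application server; the mod-10 scheme and class name are invented for illustration and not taken from any real system.

```python
# Sketch of a server-side business object exposing a check-digit
# "service": clients call it instead of re-implementing the rule.
# The check-digit scheme (a simple mod-10 digit sum) is invented.
class InventoryNumberService:
    def check_digit(self, number: str) -> int:
        return sum(int(d) for d in number) % 10

    def is_valid(self, number_with_digit: str) -> bool:
        number, digit = number_with_digit[:-1], int(number_with_digit[-1])
        return self.check_digit(number) == digit

service = InventoryNumberService()           # would live on the application server
n = "473921"
print(n + str(service.check_digit(n)))       # the client only receives the result
```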
