
Course outline

Part I: Beginning MS Access. Relational database concepts: tables, records, fields, and keys. Relationships. Designing database schema. Normal forms; normalization of data model. Implementing schema: creating MS Access database. Tables in Design view and Datasheet view.

Chapter 1
Introduction to Microsoft Access

What is Microsoft Access?

A relational database management system. Part of Microsoft Office. Commonly used to create desktop databases and database applications.

Is Access very different from Microsoft Excel?


Excel vs. Access

Excel stores information in spreadsheets. Data, labels, formulas, and charts are all stored together.
There is no defined structure of a spreadsheet. Excel is very easy to use and works fine when the amount of data is on the order of tens of rows. Performance falls sharply as the amount of data increases. The user interface is limited and unsuitable for larger applications.

Excel vs. Access (contd)

Access stores information in tables whose structure must be declared upfront. Each column (field) may only contain data of the type it was declared with. Data integrity is maintained automatically. Performance is very good with tens of thousands of rows or even more. Forms (windows) provide a flexible user interface. A flexible reporting tool allows better data presentation.

Is Access a database server?

No. When you open a database in Access, you open a file. Even if the file is on a network location, Access on each user's workstation has to search the entire file itself, for example to perform a database lookup. By contrast, a database server such as Microsoft SQL Server, Sybase, or Oracle only receives a request for data. It then searches for the data on its own drive and sends back the result. Access is a desktop database system.

Development in MS Access
In Access, as in any other relational database system, development falls into two parts: database design and application design.

Databases and Database Applications

A database is a collection of data. A database application is, generally, a program
that provides a convenient, user-friendly method of accessing the data in the database and performs data processing.

Databases and Database Applications

Unlike in Excel, users of Access should not enter data directly into tables. That's what the applications are for! Users enter data through the application. This greatly reduces the chance of errors going undetected, and generally presents a nice, friendly way to work with data.


Databases and Database Applications

Database design is different from application
design. A database is designed first, then the application. Access offers means for designing both databases and applications.


Databases and Database Applications

Access stores all types of objects in .mdb files. A .mdb file may store both a database and an
application but they may also be stored in separate .mdb files.


Databases and Database Applications

It is possible to create a non-Access application that works with an Access database. And it is possible to work with a non-Access database from an Access application.


MS Access Database Objects

Tables are the primary mechanism for storing data. A table is a two-dimensional structure that stores data about an entity. Each row (record) represents an instance of the entity (a person, a book, etc.).
Queries are used to access the data. With queries you can filter the data, perform operations such as aggregation, and add new or update existing data.


MS Access Database Objects (contd)

Forms are the primary mechanism of the user
interface. The user interacts with the application through forms. Forms also incorporate Visual Basic for Applications (VBA) code that may be used to perform various actions. Reports are used to present the data for output. Reports are optimized for printing.


MS Access Database Objects (contd)

Macros are used to quickly automate a sequence of actions.
Modules store VBA procedures (sequences of code) that can be used from anywhere in the application.


Microsoft Access is a desktop database management system. Access offers much faster performance and more flexibility than Microsoft Excel. An Access developer designs databases and database applications. The former store data and the latter are used to work with the data.


Chapter 2
Database Design


Definition of Relational Design

A relational database is a collection of data stored in a set of predefined tables.

The tables can be linked to each other with relationships.



Database table
A table stores data in records (rows) and fields
(columns). Each table should store information about one entity. Fields are attributes of the entity.


Creating a table
To create a table, one provides the names of the fields, the data type of each field, the restrictions placed on the fields, and which of them is (are) the primary key.


Primary key
The primary key of a table is a field or a group of fields that uniquely identifies each row in the table. There can be no two records with the same primary key in a table. There is always only one primary key in a table. A primary key may consist of one field or several fields. The latter is called a composite primary key.


Primary key (contd)

Primary key serves as a unique identifier of rows
in a table. It is also used in relationships to link tables. Although not required, it is strongly recommended that each table have a primary key.


Choosing the primary key

A table may have a number of candidate keys: several fields or groups of fields that can uniquely identify each record. E.g., for an employee table, the social security number, driver license number, or an enterprise-wide employee index number may all be candidate keys. When there are several candidate keys, the designer chooses one of them to be the primary key.


Choosing the primary key

Often the table will not have a natural primary key, such as an ISBN for a book or a social security number for a person; that is, there are no candidate keys. Then one must create an additional field that will serve as the primary key. Such a primary key is often an integer number. In MS Access there is a data type, Autonumber, which is used for this specific purpose.


Foreign Key
Foreign Key is a field which is a primary key in
another table. Foreign key links two tables, setting up a relationship. In the example below, the tables are linked on the Employee_ID field.


Relationships between tables reflect real-world relationships between the entities that these tables represent. For example, in a shipping company an employee creates an order. In the database, there is a table that stores information on employees, and another that stores information on orders. The tables are linked with a one-to-many relationship, which means that an employee can create many orders, and each order can be created by only one employee.


Types of Relationships
One-to-one. An office computer can be assigned to only one person.
One-to-many. A publisher can print many books, and a book can be printed by only one publisher.
Many-to-many. A book can have many authors, and an author can release many books.


Relationship - examples
We manage a building complex and want to automate the management process. There are several buildings in the complex, and each has a different number of apartments. Apartments are of different sizes, and each has a different number of tenants living in it.


Relationships - examples (contd)


Many-to-many Relationship
In relational databases, a many-to-many relationship
between two tables cannot be set. The answer to the problem is to introduce a third table that links the two main tables. The primary key in the linking table is a combination of primary keys in each of the two tables. The linking table may also contain data specific to the link. This converts one many-to-many relationship into two one-to-many relationships.


Relationships - examples (contd)

Employees work on projects. Each employee can be assigned to one or several projects, and each project can be assigned to one or several employees.
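The linking-table idea can be tried out in a few lines. The sketch below uses Python's built-in sqlite3 module as a stand-in for Access; the table names (employee, project, assignment), the role column, and the sample rows are all hypothetical and only illustrate the pattern.

```python
import sqlite3

def employees_on_project(project_name):
    """Return the names of employees linked to a project via a linking table."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE project  (project_id  INTEGER PRIMARY KEY, name TEXT);
        -- Linking table: its primary key combines the two foreign keys.
        CREATE TABLE assignment (
            employee_id INTEGER REFERENCES employee(employee_id),
            project_id  INTEGER REFERENCES project(project_id),
            role        TEXT,  -- data specific to the link itself
            PRIMARY KEY (employee_id, project_id)
        );
        INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO project  VALUES (10, 'Audit'), (20, 'Survey');
        INSERT INTO assignment VALUES
            (1, 10, 'lead'), (2, 10, 'analyst'), (1, 20, 'analyst');
    """)
    rows = con.execute("""
        SELECT e.name
        FROM employee e
        JOIN assignment a ON a.employee_id = e.employee_id
        JOIN project p    ON p.project_id  = a.project_id
        WHERE p.name = ?
        ORDER BY e.name
    """, (project_name,)).fetchall()
    con.close()
    return [r[0] for r in rows]
```

Each of the two main tables has a one-to-many relationship with assignment, which is how the single many-to-many relationship is represented.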


Relationships - examples (contd)


Normalization is a process of organizing data in an optimal way. That means eliminating redundancy and inconsistency. Normalization is an integral part of database design.


Normalization: an example
We are creating a database on labor force based on existing documents
in the form of spreadsheets.


Normalization: an example (contd)

The first table looks like this:

This table is obviously very difficult to deal with.

For example, to update a value, one will have to read the field and find which part of it must be updated.


Normalization example (contd)

The fields data_1980 and data_1990 hold multiple values. This must be eliminated to simplify database operations.
Normalization Step 1: Ensure that each field holds only one value.

Note that the primary key for this table is composite and
consists of the fields age_group and sex.


Normalization example (contd)

The table as designed can hold values for only two years, 1980 and 1990. In real life we will want to store data for all years. But how many fields should we create for this purpose? These fields are a repeating group. We can eliminate it by having a row (record) for each value of the group.


Normalization example (contd)

Normalization Step 2: Eliminate repeating groups. Note that year now becomes part of the primary key.


Normalization example (contd)

The table is now in the First Normal Form (1NF). This design of the table makes it possible to store data for any number of years. However, each record contains an age group description. This information is redundant. That occurs because group_description depends only on part of the primary key (age_group) and not on the whole key.


Normalization example (contd)

Normalization Step 3: Ensure that each field depends on
the whole primary key. This is achieved by splitting the table into two for each field that does not depend on the whole key and linking them with a primary/foreign key relationship.

Labor Force


Normalization example (contd)

The structure is now in the Second Normal Form (2NF). However, some problems still remain. The field total can be calculated from the other three fields. Storing it is just a waste of storage space, so it must be eliminated. The field officer_name in fact depends on the field officer_id, and not on the primary key. This leads to redundancy, as can be seen in the table.


Normalization example (contd)

While not required by normalization rules, it is
also advisable to have only one value in each row of the main table. This will simplify development and help avoid problems in the future. Normalization Step 4: eliminate derived fields and
ensure that all non-key fields are mutually independent. This is achieved, once again, by splitting the table and setting up a new relationship.


Normalization example (contd)





Entity-Relationship Diagram


Normalization example (contd)

The structure is now in the Third Normal Form
(3NF). The data are now in a logical, non-redundant form. The data can be easily reassembled into their original form. There are higher normal forms but they are rarely used.
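A rough sketch of what the normalized structure might look like, and of how a join reassembles the original wide rows. It uses Python's built-in sqlite3 module rather than Access. The table and field names follow the ones used in these slides, but the exact column set, the activity codes 'A' and 'I', and all sample values are assumptions made for illustration.

```python
import sqlite3

def build_labor_force_db():
    """Create a small in-memory database in (assumed) third normal form."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE age_group (
            age_group_id INTEGER PRIMARY KEY, group_description TEXT);
        CREATE TABLE activity_group (
            activity_id TEXT PRIMARY KEY, activity_description TEXT);
        CREATE TABLE officer (
            officer_id INTEGER PRIMARY KEY, officer_name TEXT);
        CREATE TABLE labor_force (
            age_group_id INTEGER REFERENCES age_group(age_group_id),
            sex          TEXT,
            year         INTEGER,
            activity_id  TEXT REFERENCES activity_group(activity_id),
            value        INTEGER,
            officer_id   INTEGER REFERENCES officer(officer_id),
            PRIMARY KEY (age_group_id, sex, year, activity_id)
        );
        -- Hypothetical sample data.
        INSERT INTO age_group VALUES (1, '15-19');
        INSERT INTO activity_group VALUES
            ('A', 'Economically active'), ('I', 'Economically inactive');
        INSERT INTO officer VALUES (130, 'Davies'), (142, 'Francis');
        INSERT INTO labor_force VALUES
            (1, 'M', 1980, 'A', 500, 130),
            (1, 'M', 1980, 'I', 300, 142);
    """)
    return con

def reassemble(con):
    """Join the tables back into the original, redundant row layout."""
    return con.execute("""
        SELECT a.group_description, l.sex, l.year,
               g.activity_description, l.value, o.officer_name
        FROM labor_force l
        JOIN age_group a      ON a.age_group_id = l.age_group_id
        JOIN activity_group g ON g.activity_id  = l.activity_id
        JOIN officer o        ON o.officer_id   = l.officer_id
        ORDER BY l.activity_id
    """).fetchall()
```

Descriptions and names are stored once in their own tables, and the query brings them back next to every value only when needed.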

Third Normal Form

A database structure is in the third normal form if:
- Each row/column intersection contains only one value and there are no repeating groups;
- Each column depends on the entire primary key of the table;
- Each column depends only on the entire primary key of the table.
Unless there are strong reasons against it, each database should be at least in the Third Normal Form.


Normalization process
A database designer does not have to strictly follow the normalization steps. In fact, if a designer starts with a good picture of the entities and relationships in the future database, restructuring may not be required. Rather, he/she may design a structure and then test it to make sure that the database is normalized. The key to quick and efficient database design is to remember that each table must hold data on only one entity, and each field should be an attribute of that entity.


Physical database design

What we have done so far is logical database design: the development of a data model, or database structure. This process is not platform dependent: a database structure can normally be implemented in any relational database management system. At the physical design stage, DBMS specifics come into the picture.


Referential Integrity
An essential feature of a relational DBMS, supported by Access. When it is set, the database management system will ensure that there are no inconsistencies when inserting, updating or deleting data. In our example, this means that an attempt to add a record with a non-existing age group (an attempt to insert an age_group_id that does not exist in the Age_group table) will fail.


Referential Integrity (contd)

Likewise, an attempt to delete an age group or an officer that is still referenced by records in the Labor_Force table will also fail (unless the cascade update/delete option is set on the relationship).
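Both failures can be reproduced in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Access, with a simplified labor_force table whose record_id key and sample rows are assumptions. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, whereas in Access the option is set on the relationship.

```python
import sqlite3

def make_db():
    """Two age groups, and a labor_force table that references them."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # enforcement is opt-in in SQLite
    con.executescript("""
        CREATE TABLE age_group (
            age_group_id INTEGER PRIMARY KEY, group_description TEXT);
        CREATE TABLE labor_force (
            record_id    INTEGER PRIMARY KEY,
            age_group_id INTEGER NOT NULL REFERENCES age_group(age_group_id),
            value        INTEGER);
        INSERT INTO age_group VALUES (1, '15-19'), (2, '20-24');
    """)
    return con

def insert_is_accepted(age_group_id):
    """Try to add a labor_force record for the given age group."""
    con = make_db()
    try:
        con.execute(
            "INSERT INTO labor_force (age_group_id, value) VALUES (?, ?)",
            (age_group_id, 100))
        return True
    except sqlite3.IntegrityError:
        return False  # the referenced age group does not exist
    finally:
        con.close()

def delete_is_accepted(age_group_id):
    """Try to delete an age group while group 1 is referenced by a record."""
    con = make_db()
    con.execute("INSERT INTO labor_force (age_group_id, value) VALUES (1, 100)")
    try:
        con.execute("DELETE FROM age_group WHERE age_group_id = ?",
                    (age_group_id,))
        return True
    except sqlite3.IntegrityError:
        return False  # the age group is still referenced
    finally:
        con.close()
```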


Data Types
In Access, as in many other relational database management systems, each column (field) can hold values of only one data type. The type of each column is defined when the table is created.


Data Types in Access

Access supports the following data types:
- Text: stores unformatted text up to 255 characters in length.
- Memo: stores unformatted data up to 65535 characters in length.
- Number: stores numerical data, both integer and real.
- Date/Time: stores dates and/or time.
- Currency: stores numerical data with four digits after the decimal point.
- OLE Object: stores data such as a Word document or Excel graph.
- Hyperlink: stores Web addresses (URLs).
- Yes/No: stores Boolean data, either Yes (True) or No (False).
- Autonumber: stores an automatically updated number of each record.

Notes on data types

Memo is very different from Text. Whereas fields of the latter type actually contain text, fields of the former type internally only contain references to the location of the text. That is because in relational databases all records in a table must have the same width. This imposes restrictions on how such fields can be used (e.g., you cannot sort a table on this field). Use Text instead of Memo unless you have to store very large textual data in each record. Even when using Text, use Field Size to specify the limit of the length of text in a field.



Notes on data types (contd)

Although Number is used to store all types of numerical data, the Field Size attribute determines the nature of the data that is stored. Use Integer or Long Integer to store integer values (e.g. passport numbers), and Single or Double to store floating point data.


Notes on data types (contd)

Autonumber is used for primary keys where there is no natural primary key in a table. E.g., a pre-assigned officer_id is a natural primary key for the table Officer. However, in the table Age_Group, Autonumber can be used to assign the primary key. Autonumber fields are never manually updated. Access does this for you. Autonumber is only used for primary keys. It cannot be used on foreign keys! Use Long Integer for foreign keys that reference Autonumber.


NULL values
A field in a record may either have some value, or be NULL. NULL is not the same as zero or even an empty string. The latter are values. NULL means no value, nothing at all. Usually, primary key fields cannot be NULL. No field of a foreign key can be NULL if referential integrity is set.


NULL values (contd)

The result of a comparison with a NULL field is always NULL, which is treated as False. The result of an expression where one of the terms is NULL is NULL. Note: Access treats NULL as an empty string when concatenating strings.
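These NULL rules are easy to check interactively. The sketch below uses Python's built-in sqlite3 module rather than Access; NULL comes back to Python as None. It covers comparison and arithmetic only; the Access-specific concatenation behavior mentioned above is not reproduced, because SQLite's own concatenation operator follows different rules.

```python
import sqlite3

con = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a one-value query and return that value (NULL becomes None)."""
    return con.execute(sql).fetchone()[0]

null_equals_null = scalar("SELECT NULL = NULL")   # comparison with NULL
one_plus_null    = scalar("SELECT 1 + NULL")      # expression with a NULL term
null_is_null     = scalar("SELECT NULL IS NULL")  # the proper test for NULL
zero_is_null     = scalar("SELECT 0 IS NULL")     # zero is a value
empty_is_null    = scalar("SELECT '' IS NULL")    # so is an empty string
```

To test for a missing value, use IS NULL rather than the = operator.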


Field validation
Very often a piece of data can have only a limited range of values. E.g., if the operator enters a year of 3002, this is clearly an error. It is possible to set up field validation at the table level using the Validation Rule property of a field. For the year, one may enter a validation rule like
    >1900 And <2100
With a validation rule in place, all new values in the field will be validated, and violating values will be rejected in all data entry modes.
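Outside Access, the closest equivalent of a Validation Rule is a CHECK constraint on the column. The following sketch shows the year rule in Python's built-in sqlite3 module; it is an analogy under that assumption, not the Access mechanism itself, and the two-column table is hypothetical.

```python
import sqlite3

def year_is_accepted(year):
    """Try to store a year in a column restricted to 1901..2099."""
    con = sqlite3.connect(":memory:")
    con.execute("""
        CREATE TABLE labor_force (
            year  INTEGER CHECK (year > 1900 AND year < 2100),
            value INTEGER)
    """)
    try:
        con.execute("INSERT INTO labor_force VALUES (?, ?)", (year, 0))
        return True
    except sqlite3.IntegrityError:
        return False  # the value violates the rule and is rejected
    finally:
        con.close()
```

Because the rule lives in the table definition, it applies no matter which form, query, or program tries to write the value.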

Cascade update/delete
When setting up a relationship in Access, there are options to cascade update related fields or cascade delete related records. The former means that if a record's primary key is updated in the main table, the value will be automatically updated in the child table(s). The latter means that if a record is deleted from the main table, related records will be deleted from the child table(s).


Cascade update/delete (contd)

While these options may simplify development, they are dangerous, especially cascade delete. E.g., if the cascade delete option were set on the relationship between the tables Labor_Force and Officer, accidental deletion of an officer's record would result in the loss of all data in the Labor_Force table that were entered or approved by this officer. The rule of thumb is to avoid using these options, or to be extremely cautious if they are still used.
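The danger is easy to demonstrate. The sketch below uses Python's built-in sqlite3 module, where the equivalent option is ON DELETE CASCADE on the foreign key; the simplified tables and sample rows are hypothetical. It deletes one officer and counts how many labor_force records survive.

```python
import sqlite3

def records_left_after_deleting_officer(cascade):
    """Delete officer 130 and return the number of remaining records."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")
    action = "ON DELETE CASCADE" if cascade else ""
    con.executescript(f"""
        CREATE TABLE officer (
            officer_id INTEGER PRIMARY KEY, officer_name TEXT);
        CREATE TABLE labor_force (
            record_id  INTEGER PRIMARY KEY,
            officer_id INTEGER REFERENCES officer(officer_id) {action},
            value      INTEGER);
        INSERT INTO officer VALUES (130, 'Davies'), (142, 'Francis');
        INSERT INTO labor_force (officer_id, value)
            VALUES (130, 10), (130, 20), (142, 30);
    """)
    try:
        con.execute("DELETE FROM officer WHERE officer_id = 130")
    except sqlite3.IntegrityError:
        pass  # without cascade, the delete is rejected and nothing is lost
    remaining = con.execute("SELECT COUNT(*) FROM labor_force").fetchone()[0]
    con.close()
    return remaining
```

With cascade on, one DELETE against officer silently removes two of the three data records; with it off, the same statement is refused.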

In this chapter, we have covered:
- The basics of relational databases;
- The role of primary and foreign keys;
- Normalization;
- Types of relationships and how they are set;
- Data types;
- Specifics of physical database design in Access.




The role of queries

In Excel, users normally enter data directly into the spreadsheet. Likewise, searching for data is usually done manually. In Access, a query allows the developer to retrieve data meeting specific criteria. Also, queries can be used to modify a table by inserting, updating or deleting data. The former type of queries are called select queries, and the latter action queries.


Queries in Access (contd)

There are three ways to build a query:
- With a Wizard;
- In the Design View;
- By providing a statement in the SQL language.

Whichever method is used, the result is the same: select queries and action queries. Some advanced queries can only be expressed in SQL.

Queries in Access
Are database objects, just like tables, forms, or
reports. Can be saved or run at any time. Most often, queries are used by or in applications.


Select queries
Are used to retrieve data from a database. Can derive data from many tables at once. Can perform various operations on data, including aggregation. E.g., we can ask for the average price of books in the library, or the total of economically inactive people across all age groups.


Building queries in Design View

In Design View, you can build queries visually.


Specifying tables
First we specify which table(s) will be used in the query.


Specifying fields
Now, we specify which field(s) the query will display. An asterisk (*) stands for all fields. labor_force.* means all fields from the labor_force table.


Simple select query

When a query is ready, we can run it. When we
run the query on the previous page, we get the following results:


Selecting data
The first query selected all fields from the labor_force table; and since we did not specify any criteria, it selected all rows. This is not very useful, because we might just as well open the table. Normally we explicitly specify which field(s) we want to select. Usually, we also specify criteria for records. The query will only return records that meet the criteria.

Select query with criteria

Find and display all records entered by the officer
with Id 142.


The result

Note that the field officer_id participates in selection (a criterion is set on that field) but is not shown in the result. This is because the Show property was turned off in the query.

Using multiple tables

The real power of queries is seen when data from
multiple tables are selected. This is where relationships become very important. Data from tables are joined, usually along primary-foreign keys.


Multiple table query

Find all labor force records entered by officer Davies


How it works
First, Access finds records in the officer table where the officer_name is 'Davies'. Then it finds and displays records in the labor_force table whose officer_id values match the selected records from the officer table.


The SQL language

SQL (Structured Query Language) is the standard query language for relational database systems. First developed by IBM in the early 1970s. Supported by almost all major database systems. Differs from programming languages in that it is declarative rather than procedural. SQL is set-oriented. In practical terms, this means that we do not go record by record but rather work on whole tables.

The SQL language (contd)

SQL exists in different versions and flavors. The standard is ANSI-92; most database systems support ANSI-92 and add their own extensions to the language. The version of SQL used in MS Access 2000 is not fully compatible with the ANSI-92 standard.


SQL queries in Access

Design View is just a visual method of building a
SQL query. Below is a query in the Design View

And the same query in SQL View.


SQL queries in Access (contd)

Whichever view a query is created in, it will be available in the other view, with the exception of some complicated queries that cannot be displayed in Design View. All changes made in one view are immediately reflected in the other.


SQL query types

The following SQL statements are used to retrieve and manipulate data:
- SELECT is used to retrieve data;
- UPDATE is used to modify existing records in a table;
- INSERT is used to append records to a table;
- DELETE is used to delete records from a table.

SELECT creates select queries, and the other three create action queries. SQL also includes data definition statements (CREATE TABLE, etc.).

The SELECT statement: the syntax

SELECT [predicate] * | table.* | [table.]field1 [AS alias1] [, [table.]field2 [AS alias2] [, ...]]
    FROM tableexpression [, ...] [IN externaldatabase]
    [WHERE... ]
    [GROUP BY... ]
    [HAVING... ]
    [ORDER BY... ]
    [WITH OWNERACCESS OPTION]

The SELECT statement: predicate

Optional; can be one of the following:
- DISTINCT: display only records with distinct combinations of the selected fields;
- DISTINCTROW: display the selected fields from unique rows;
- TOP: display only the top records, by the actual number or by percent.


The SELECT statement: field list

In the field list, we specify which fields we select from which tables.
- An asterisk (*) means display all fields from the selected table(s);
- labor_force.* means display all fields from the table labor_force;
- labor_force.value means display the field value from the table labor_force.
If we join several tables, and some of them have fields with identical names, we must qualify field names; otherwise, it is optional.

The SELECT statement: field list (contd)

In the field list, we may also specify what operations we perform on the fields. These can be arithmetic, string, aggregate, or other expressions. We can use the AS operator to name these expressions, or rename fields for output.
    SELECT value/1000 AS [Value (thousands)] FROM labor_force

The SELECT statement: field list (contd)

Square brackets are used with fields whose names
are keywords or contain spaces or special characters. It is better not to use keywords to name fields in tables, or have spaces in field names. It is perfectly fine, however, for output expressions or field names to contain spaces. String values in SQL must be delimited with quotes.

The SELECT statement: field list (contd)

MS Access (or rather, the MS Jet Database Engine) uses the Visual Basic for Applications expression service. This means that you can use VBA functions in queries.
    SELECT Left(officer_name,3) FROM officer

The SELECT statement: FROM

The FROM clause indicates the data source for the query, i.e. where the data comes from. This can be one or more tables or queries. The syntax is simple:
    SELECT officer_name FROM officer

The SELECT statement: WHERE

The WHERE clause is used to specify criteria, i.e. which records we want to select. Can be used in any type of select query. WHERE takes effect at the time data is retrieved: thus, you can specify restrictions on fields you do not select.
    SELECT age_group_id, activity_group_id, sex, value FROM labor_force WHERE officer_id = 130

The SELECT statement: WHERE (contd)

Conditions are combined using the logical operators AND, OR, and NOT. Of these, NOT has the highest precedence, then AND, then OR. VBA functions can be used in the WHERE clause, but not aggregate functions.
    SELECT * FROM labor_force WHERE year = 1980 AND officer_id = 130 OR officer_id = 142
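Because AND binds tighter than OR, the condition above returns every record of officer 142, not only those for 1980. The sketch below checks this with Python's built-in sqlite3 module, which follows the same precedence; the four sample rows and the record_id column are hypothetical.

```python
import sqlite3

def matching_ids(where_clause):
    """Return record_ids of labor_force rows that satisfy the condition."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE labor_force (
            record_id INTEGER PRIMARY KEY, year INTEGER, officer_id INTEGER);
        INSERT INTO labor_force VALUES
            (1, 1980, 130), (2, 1980, 142), (3, 1990, 130), (4, 1990, 142);
    """)
    rows = con.execute(
        "SELECT record_id FROM labor_force WHERE " + where_clause +
        " ORDER BY record_id").fetchall()
    con.close()
    return [r[0] for r in rows]

# The condition as written: AND is evaluated first, then OR.
as_written = matching_ids(
    "year = 1980 AND officer_id = 130 OR officer_id = 142")

# Parentheses change the meaning: 1980 records of either officer.
with_parentheses = matching_ids(
    "year = 1980 AND (officer_id = 130 OR officer_id = 142)")
```

If the intent is "records for 1980 entered by either officer", the parentheses are required.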

The SELECT statement: WHERE (contd)

In conditions, you can specify all standard comparison operators (=, <, >, <=, >=, <>) and a number of other operators. The IN operator lets you restrict the expression to a list of values you provide, e.g.
    officer_name IN ('Francis', 'Davies')

will restrict records to those whose values of the field officer_name are either 'Francis' or 'Davies'.

The SELECT statement: WHERE (contd)

BETWEEN ... AND: this operator specifies the range for a value as a condition, e.g. the condition
    price BETWEEN 20000 AND 30000

will restrict records to those whose values of the field price are from 20000 to 30000 inclusive.


The SELECT statement: WHERE (contd)

The LIKE operator allows the use of wildcards. It is always used on strings. The wildcards are *, which stands for any number of any characters; ?, which stands for any one character; and #, which stands for any one digit.
    officer_name LIKE 'Jo*n' will match Johnson and Johansson.
    officer_name LIKE 'J*s?n' will match Johnson, Jansen and Johansson.

The SELECT statement: WHERE (contd)

NOTE: in standard SQL, % and _ (underscore) are used as wildcards. Apart from the wildcards, you can use square brackets to enclose the list of characters to match.

    officer_name LIKE '[AJ]*' will match officer names beginning with A or J.
    officer_name LIKE '[A-J]*' will match officer names beginning with letters A through J.
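The patterns above can be tried without Access. SQLite's GLOB operator happens to use the same *, ? and [ ] syntax, so the sketch below uses it through Python's built-in sqlite3 module as an approximation. This is an assumption of convenience: GLOB is case-sensitive and has no # wildcard for digits, so it does not reproduce every detail of the Access LIKE operator.

```python
import sqlite3

def glob_match(name, pattern):
    """Return True if name matches the pattern using SQLite's GLOB."""
    con = sqlite3.connect(":memory:")
    result = con.execute("SELECT ? GLOB ?", (name, pattern)).fetchone()[0]
    con.close()
    return bool(result)

# The examples from the slides:
jo_n = [n for n in ("Johnson", "Johansson", "Jansen")
        if glob_match(n, "Jo*n")]
j_s_n = [n for n in ("Johnson", "Johansson", "Jansen")
         if glob_match(n, "J*s?n")]
a_or_j = [n for n in ("Adams", "Davies", "Johnson", "Smith")
          if glob_match(n, "[AJ]*")]
a_to_j = [n for n in ("Adams", "Davies", "Johnson", "Smith")
          if glob_match(n, "[A-J]*")]
```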

The SELECT statement: ORDER BY

The ORDER BY clause is used to specify a sort order of the output. ORDER BY is followed by a field list. The result is first sorted by the first field in the list, then by the second, and so forth. These fields do not have to be selected in the statement.
    SELECT * FROM labor_force WHERE year = 1991 ORDER BY age_group_id, sex, activity_id

Joins are used to combine data from several tables. In the Design View, joins are automatically constructed along the lines of relationships as tables are added to the query. In SQL, joins must be explicitly stated. There are two types of joins: inner joins and outer joins. In (standard) SQL, all joins are defined in the FROM part of a statement, forming a table expression.

Inner joins
Create a result set where there are matches between two tables on the selected field(s). The general syntax:
    a INNER JOIN b ON a.field1 = b.field2 [AND a.field3 = b.field4]
where a, b are tables and fieldX are linking fields. You can use any comparison operator, not just = (<, >, <=, >=, <>), but the others are less frequently used.

Inner joins: example

Show all labor force records entered by officer Francis.
    SELECT labor_force.*, officer.officer_name
    FROM labor_force INNER JOIN officer ON labor_force.officer_id = officer.officer_id
    WHERE officer.officer_name = 'Francis'

Joins with more than two tables

A table expression in parentheses can participate in joins as a table. This makes it possible to select data from more than two tables.


A join with three tables

Find all age groups for which officer Johnson has ever entered records.
    SELECT DISTINCT a.group_description
    FROM (officer AS o INNER JOIN labor_force AS l ON o.officer_id = l.officer_id)
        INNER JOIN age_group AS a ON a.age_group_id = l.age_group_id
    WHERE o.officer_name = 'Johnson'
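The three-table query can be run against a toy data set. The sketch below uses Python's built-in sqlite3 module; the simplified tables, the record_id column, and the sample rows are hypothetical. The join is written without the parentheses, which SQLite does not need; an ORDER BY is added only to make the result order predictable.

```python
import sqlite3

def age_groups_entered_by(officer_name):
    """List the age groups for which the officer has entered records."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE officer (
            officer_id INTEGER PRIMARY KEY, officer_name TEXT);
        CREATE TABLE age_group (
            age_group_id INTEGER PRIMARY KEY, group_description TEXT);
        CREATE TABLE labor_force (
            record_id INTEGER PRIMARY KEY,
            officer_id INTEGER, age_group_id INTEGER, value INTEGER);
        INSERT INTO officer VALUES (130, 'Johnson'), (142, 'Davies');
        INSERT INTO age_group VALUES (1, '15-19'), (2, '20-24'), (3, '25-29');
        INSERT INTO labor_force (officer_id, age_group_id, value) VALUES
            (130, 1, 10), (130, 1, 20), (130, 2, 30), (142, 3, 40);
    """)
    rows = con.execute("""
        SELECT DISTINCT a.group_description
        FROM officer AS o
            INNER JOIN labor_force AS l ON o.officer_id = l.officer_id
            INNER JOIN age_group AS a   ON a.age_group_id = l.age_group_id
        WHERE o.officer_name = ?
        ORDER BY a.group_description
    """, (officer_name,)).fetchall()
    con.close()
    return [r[0] for r in rows]
```

Johnson entered two records for the first age group, but DISTINCT reports that group only once.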

Table aliases
On the previous slide, we took advantage of table aliases. These are used in the FROM clause, and provide a way to give a table a short name. Syntax: table_name AS a, or table_name a, where a is an alias. Either way, the table table_name can then be referred to in the query as simply a. An alias can have several characters just as well, e.g. labor_force AS lf.

Alternative syntax
For inner joins, there is an alternative, shorter syntax: one may simply list the tables in the FROM clause and specify the join conditions in the WHERE clause. The INNER JOIN keywords are then not used. This is the older style of writing joins, and it is supported on almost all database management systems. For outer joins, a valid table expression, with the keywords LEFT JOIN or RIGHT JOIN, must always be provided in the FROM clause.

Alternative syntax: example

Select all records for economically active people for the year 1980.

    SELECT * FROM labor_force l, activity_group a
    WHERE l.activity_id = a.activity_id
        AND a.activity_description = 'Economically active'
        AND l.year = 1980

Outer joins
Unlike inner joins, where matching records must exist in both tables, an outer join takes all records from one table and then joins them with another table where there is a match. Outer joins can be left or right. Left joins take all records from the table on the left hand side of the join, and match them with records on the right hand side of the join. A right join does just the reverse. There is no keyword OUTER JOIN. There are, however, LEFT JOIN and RIGHT JOIN.

Outer joins: example

Show all officers and the labor force records that they entered, including officers who have not entered any records.
    SELECT o.officer_name, l.*
    FROM officer o LEFT JOIN labor_force l ON o.officer_id = l.officer_id
An inner join would not have shown officers without any records entered, as it requires matching records in both tables.
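The difference between the two join types shows up as soon as one officer has no records. A small sketch in Python's built-in sqlite3 module, with hypothetical sample data in which officer Johnson has entered nothing:

```python
import sqlite3

def officers_with_values(join_type):
    """Join officer to labor_force with 'LEFT JOIN' or 'INNER JOIN'."""
    if join_type not in ("LEFT JOIN", "INNER JOIN"):
        raise ValueError("unsupported join type")
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE officer (
            officer_id INTEGER PRIMARY KEY, officer_name TEXT);
        CREATE TABLE labor_force (
            record_id INTEGER PRIMARY KEY, officer_id INTEGER, value INTEGER);
        INSERT INTO officer VALUES
            (130, 'Davies'), (142, 'Francis'), (150, 'Johnson');
        INSERT INTO labor_force (officer_id, value) VALUES
            (130, 10), (130, 20), (142, 30);
    """)
    rows = con.execute(f"""
        SELECT o.officer_name, l.value
        FROM officer o {join_type} labor_force l
            ON o.officer_id = l.officer_id
        ORDER BY o.officer_name, l.value
    """).fetchall()
    con.close()
    return rows
```

In the left join, Johnson still appears, with NULL (None in Python) in place of the missing labor_force fields.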

Full outer joins

Full outer joins produce all records from both joined tables, matching them where possible. They are not currently supported in Access. They are supported in major DBMS systems, such as Oracle and MS SQL Server, but not in all versions.


Parameter queries
Parameters are a mechanism that makes it possible to supply restrictions on the query at execution time (vs. design time). In the easiest form, you can include parameters as terms in the WHERE clause by just providing a prompt:
    WHERE officer_name = [Enter officer's name]
At execution time, a window will be displayed requesting the user to enter the name. The prompt will be "Enter officer's name". The value will be used as though it was provided at design time.

Parameter queries
To reuse a parameter, specify the same parameter (prompt) several times. The user will have to enter a value only once.
    SELECT l.*, [Occupation group:]
    FROM labor_force l, activity_group a
    WHERE l.age_group_id = a.age_group_id
        AND a.group_description = [Occupation group:]
        AND selected_group IN ('15-19', '20-24')
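Other database interfaces offer the same idea as named placeholders instead of prompts. The sketch below, using Python's built-in sqlite3 module with hypothetical tables and data, passes one named parameter, :officer, that appears twice in the statement but is supplied only once.

```python
import sqlite3

def records_for(officer_name):
    """Return (requested name, value) pairs for one officer."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE officer (
            officer_id INTEGER PRIMARY KEY, officer_name TEXT);
        CREATE TABLE labor_force (
            record_id INTEGER PRIMARY KEY, officer_id INTEGER, value INTEGER);
        INSERT INTO officer VALUES (130, 'Davies'), (142, 'Francis');
        INSERT INTO labor_force (officer_id, value) VALUES
            (130, 10), (130, 20), (142, 30);
    """)
    # :officer is used in the field list and in the WHERE clause,
    # but its value is provided a single time.
    rows = con.execute("""
        SELECT :officer AS requested, l.value
        FROM labor_force l
            INNER JOIN officer o ON o.officer_id = l.officer_id
        WHERE o.officer_name = :officer
        ORDER BY l.value
    """, {"officer": officer_name}).fetchall()
    con.close()
    return rows
```

The value arrives at execution time, so the same saved statement serves any officer.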

Aggregation is one of the most important features of queries. It makes it possible to perform such operations on data as summing up, finding averages, and finding minimum or maximum values. Aggregate functions are used to aggregate data. The most common are Sum, Avg, Min, Max, and Count. Can be combined with regular criteria.

Aggregation (contd)
The difference between regular functions and
aggregate functions is that the former always work within a record, whereas the latter operate across multiple records.


Aggregation (contd)
Find the average number of economically active men in age group 1 for years 1980 through 1989.
    SELECT Avg(value) FROM labor_force
    WHERE year BETWEEN 1980 AND 1989 AND age_group_id = 1 AND activity_id = 'A'

Grouping
In the previous example, the Avg aggregate
function ran against the entire table restricted by the WHERE clause. Grouping is used to make an aggregate run against specific groups of records.


Grouping (contd)
List the number of records entered by each officer.
    SELECT officer_id, Count(*) AS [Number of records]
    FROM labor_force
    GROUP BY officer_id
    ORDER BY officer_id


Grouping (example)
Access first finds each distinct combination of
fields in the select list that are not aggregated. Then it runs the aggregate function against all records for each combination of these fields. In this example, Access finds all distinct officer_ids. Then it counts the number of records for each officer_id.
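The two steps described above can be watched on a handful of rows. This sketch uses Python's built-in sqlite3 module with a simplified, hypothetical labor_force table; the output column is named n instead of [Number of records] to keep the statement portable.

```python
import sqlite3

def records_per_officer():
    """Count labor_force records for each distinct officer_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE labor_force (
            record_id INTEGER PRIMARY KEY, officer_id INTEGER, value INTEGER);
        INSERT INTO labor_force (officer_id, value) VALUES
            (130, 10), (130, 20), (142, 30);
    """)
    rows = con.execute("""
        SELECT officer_id, Count(*) AS n
        FROM labor_force
        GROUP BY officer_id
        ORDER BY officer_id
    """).fetchall()
    con.close()
    return rows
```

Three input rows collapse into one output row per distinct officer_id, each carrying its own count.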


Grouping rules
Grouping requires the use of aggregate functions. If grouping, all fields in the select list must either
be grouped by or be aggregated (results of aggregate functions). There are no restrictions on which fields may appear in the WHERE clause. However, aggregate functions cannot be used in the WHERE clause. Several aggregates can be used in a query.

Aggregate functions
Sum(expr): sums up the values of the expression expr. The expression can include field names, functions, and constants, but its result must be numeric. Example: Sum(value/1000)
Avg(expr): calculates the arithmetic mean of the values of expr. Restrictions on the expression are the same as for Sum. Example: Avg(value)
Min(expr), Max(expr): return the minimum or maximum value of the expression expr. The expression may return a number, a string, or a date.

Aggregate functions (contd)

Count: calculates the number of records returned
by a query. Count(*) returns the number of all records returned by a query, including records that contain NULL fields. Count(field) returns the number of records where the field field is not NULL.
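The difference between Count(*) and Count(field) shows up as soon as a field contains NULL. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Access, with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (year INTEGER, value INTEGER);
    INSERT INTO labor_force VALUES (1980, 10), (1981, NULL), (1982, 30);
""")

# COUNT(*) counts every record; COUNT(value) skips the record whose value is NULL.
all_records, non_null_values = conn.execute(
    "SELECT COUNT(*), COUNT(value) FROM labor_force"
).fetchone()
print(all_records, non_null_values)
```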


The HAVING clause

At the time data is retrieved, the results of
aggregate functions are not yet known. This is why aggregate functions cannot appear in the WHERE clause. The HAVING clause, however, may contain aggregate functions. The restrictions specified in the HAVING clause take effect after the data has been retrieved and all aggregate results have been calculated.

The HAVING clause (example)

Find the total number of people aged 15 and over for each year and age group.
    SELECT a.age_group_id, a.group_description, l.year, Sum(l.value) AS Total
    FROM labor_force l, age_group a
    WHERE l.age_group_id = a.age_group_id
    GROUP BY a.age_group_id, a.group_description, l.year
    HAVING Count(l.value) = 6

The HAVING clause (example)

On the previous slide, the purpose of Count in the
HAVING clause is to eliminate totals obtained from incomplete data. E.g., if a record for economically active men, age group 5, year 1980, is missing, we cannot add the other values and arrive at a total for that group, because such a number would be based on incomplete data and therefore incorrect. Where does the figure 6 come from?
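Filtering out incomplete groups with HAVING can be sketched on a reduced data set. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in for Access; here a complete group has 2 records (one per sex), and the constant EXPECTED_PER_GROUP is a name introduced only for this example.

```python
import sqlite3

EXPECTED_PER_GROUP = 2  # records a complete group must have in this toy data set

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (year INTEGER, sex TEXT, value INTEGER);
    INSERT INTO labor_force VALUES
        (1980, 'F', 100), (1980, 'M', 200),   -- complete year
        (1981, 'F', 110);                     -- the record for 'M' is missing
""")

# HAVING is applied after grouping, so it can test the aggregate COUNT(value).
totals = conn.execute("""
    SELECT year, SUM(value) AS total
    FROM labor_force
    GROUP BY year
    HAVING COUNT(value) = ?
    ORDER BY year
""", (EXPECTED_PER_GROUP,)).fetchall()
print(totals)
```

The year 1981 is dropped from the output rather than reported with a misleading total.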

Subqueries
You can nest SQL queries within other queries.
Such queries are called subqueries, and are usually used in the WHERE or HAVING clauses. Subqueries must always be in parentheses. You can use the result of a subquery as a term in an expression.


Subqueries: example
Find the total labor force by sex and year.
    SELECT year, sex, Sum(value) AS Total
    FROM labor_force
    GROUP BY year, sex
    HAVING Count(value) = (SELECT Count(*) FROM age_group) * (SELECT Count(*) FROM activity_group)
    ORDER BY year, sex

Subqueries: predicates
So far we have covered the use of subqueries
as terms in expressions. Such subqueries must always return exactly one value. Subqueries can also be used in more complex comparisons using the predicates ANY, ALL, [NOT] EXISTS, and [NOT] IN.


Subqueries: predicates (contd)

Find records for age groups 15-19 and 20-24.
    SELECT *
    FROM labor_force
    WHERE age_group_id IN (
        SELECT age_group_id
        FROM age_group
        WHERE group_description IN ('15-19', '20-24'))

Subqueries: predicates (contd)

Find officers who have entered more records than
Francis or Johnson.
    SELECT o.officer_name
    FROM officer o, labor_force l
    WHERE o.officer_id = l.officer_id
    GROUP BY o.officer_name, l.officer_id
    HAVING Count(*) > ANY (
        SELECT Count(*)
        FROM labor_force l, officer o
        WHERE l.officer_id = o.officer_id
          AND o.officer_name IN ('Francis', 'Johnson')
        GROUP BY o.officer_id)

Subqueries: predicates (contd)

Find officers who have entered more records than
Francis and Johnson.
    SELECT o.officer_name
    FROM officer o, labor_force l
    WHERE o.officer_id = l.officer_id
    GROUP BY o.officer_name, l.officer_id
    HAVING Count(*) > ALL (
        SELECT Count(*)
        FROM labor_force l, officer o
        WHERE l.officer_id = o.officer_id
          AND o.officer_name IN ('Francis', 'Johnson')
        GROUP BY o.officer_id)

Subqueries: predicates (contd)

ALL is more restrictive than ANY. The former
requires that the value in the main query satisfy the criterion with all values returned by the subquery. ANY requires that the value in the main query satisfy the criterion with at least one value returned by the subquery.
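The contrast between ANY and ALL can be sketched with a small data set. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in for Access. SQLite has no ANY or ALL predicates, so the sketch swaps in an equivalent: for a non-empty subquery without NULLs, "> ANY" is the same as "greater than the minimum" and "> ALL" is the same as "greater than the maximum". The officer Smith, the record counts, and the function name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE officer (officer_id INTEGER PRIMARY KEY, officer_name TEXT);
    CREATE TABLE labor_force (officer_id INTEGER);
    INSERT INTO officer VALUES
        (1, 'Francis'), (2, 'Johnson'), (3, 'Turner'), (4, 'Smith');
""")
# Hypothetical record counts per officer: Francis 2, Johnson 4, Turner 3, Smith 5.
for officer_id, n in {1: 2, 2: 4, 3: 3, 4: 5}.items():
    conn.executemany("INSERT INTO labor_force VALUES (?)", [(officer_id,)] * n)

def officers_with_more_records(bound):
    # bound = "MIN" emulates > ANY; bound = "MAX" emulates > ALL.
    if bound not in ("MIN", "MAX"):
        raise ValueError(bound)
    sql = f"""
        SELECT o.officer_name
        FROM officer o JOIN labor_force l ON o.officer_id = l.officer_id
        GROUP BY o.officer_id, o.officer_name
        HAVING COUNT(*) > (
            SELECT {bound}(n) FROM (
                SELECT COUNT(*) AS n
                FROM labor_force l2 JOIN officer o2 ON l2.officer_id = o2.officer_id
                WHERE o2.officer_name IN ('Francis', 'Johnson')
                GROUP BY o2.officer_id
            ) AS per_officer
        )
        ORDER BY o.officer_name
    """
    return [name for (name,) in conn.execute(sql)]

more_than_any = officers_with_more_records("MIN")
more_than_all = officers_with_more_records("MAX")
print(more_than_any, more_than_all)
```

Note that Johnson appears in the ANY result because his own count exceeds that of Francis.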


Subqueries: predicates (contd)

Find all officers who have not entered any records.
    SELECT o.officer_name
    FROM officer o
    WHERE NOT EXISTS (
        SELECT l.officer_id
        FROM labor_force l
        WHERE o.officer_id = l.officer_id)
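The NOT EXISTS query above can be tried on a few rows. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Access; the rows are invented so that one officer has no records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE officer (officer_id INTEGER PRIMARY KEY, officer_name TEXT);
    CREATE TABLE labor_force (officer_id INTEGER, value INTEGER);
    INSERT INTO officer VALUES (1, 'Francis'), (2, 'Johnson'), (3, 'Turner');
    INSERT INTO labor_force VALUES (1, 100), (2, 200), (2, 210);
""")

# The subquery is evaluated for each officer; an officer is kept only
# when it returns no rows at all.
idle_officers = [name for (name,) in conn.execute("""
    SELECT o.officer_name
    FROM officer o
    WHERE NOT EXISTS (
        SELECT 1 FROM labor_force l WHERE l.officer_id = o.officer_id)
    ORDER BY o.officer_name
""")]
print(idle_officers)
```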

UNION queries
Union queries consist of two or more SELECT
statements whose results are merged in the output. All of the SELECT statements in a union query must retrieve exactly the same number of fields, and data types of the fields must be identical.


UNION queries
    SELECT *, 'Old' AS [Revision] FROM labor_force
    UNION
    SELECT *, 'New' AS [Revision] FROM revised_data
    ORDER BY year, age_group_id, sex, activity_id, Revision
For this statement to work, the tables labor_force and revised_data must have identical structures.
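A reduced version of this union can be run as follows. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Access; both tables are cut down to two columns and filled with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (year INTEGER, value INTEGER);
    CREATE TABLE revised_data (year INTEGER, value INTEGER);
    INSERT INTO labor_force VALUES (1980, 100), (1981, 110);
    INSERT INTO revised_data VALUES (1981, 115);
""")

# Both SELECTs return the same number of columns; the literal column
# records which table each row came from.
rows = conn.execute("""
    SELECT *, 'Old' AS revision FROM labor_force
    UNION
    SELECT *, 'New' AS revision FROM revised_data
    ORDER BY year, revision
""").fetchall()
print(rows)
```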

Query as a data source

In Access, queries can be saved and later referred
to as if they were tables. In this way, they resemble what are called views in other relational database systems. This feature helps overcome some of the tradeoffs of normalization: data that is spread among many tables can be brought together in one query. Moreover, the data can be aggregated, or other operations can be done on it.

Query as a data source (contd)

We often have to print reports on the total labor force by year and age group. To simplify the operation, we may use the following query:
    SELECT year, age_group_id, Sum(value) AS Total
    FROM labor_force
    GROUP BY year, age_group_id
    HAVING Count(*) = 6
Save it as qryTotalByYearAge.

Query as a data source (contd)

Now we can run the following query for a complete report:
    SELECT t.year, a.group_description, t.Total
    FROM age_group a, qryTotalByYearAge t
    WHERE a.age_group_id = t.age_group_id
    ORDER BY t.year, a.group_description

Query as a data source (contd)

Query results are obtained each time the query is
run. Query results change when the underlying tables change, so a saved query always returns up-to-date data from the underlying tables.
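That a saved query always reflects the current table contents can be sketched with a view, the closest counterpart in other systems. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in for Access; the HAVING filter is left out to keep the data set small, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (year INTEGER, age_group_id INTEGER, value INTEGER);
    INSERT INTO labor_force VALUES (1980, 1, 100), (1980, 1, 50);
    CREATE VIEW qryTotalByYearAge AS
        SELECT year, age_group_id, SUM(value) AS Total
        FROM labor_force
        GROUP BY year, age_group_id;
""")

before = conn.execute("SELECT * FROM qryTotalByYearAge").fetchall()
# Change the underlying table; the view itself is not touched.
conn.execute("INSERT INTO labor_force VALUES (1980, 1, 25)")
after = conn.execute("SELECT * FROM qryTotalByYearAge").fetchall()
print(before, after)
```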


Crosstab queries
Crosstab queries provide a mechanism to
transpose data. Normally, when we select data, we specify output columns, which can be fields from a table, or results of expressions on these fields. Crosstab queries, by contrast, let us retrieve output field names (and values) from a table. Crosstab queries group data both vertically and horizontally: on the left side of the output table and across the top.

Crosstab queries (contd)

Crosstab queries are an extension to the ANSI
standard. They are not supported in MS SQL Server or Sybase. Oracle database servers have means for data transposition, but the syntax is different.


Crosstab queries (contd)

General syntax:
TRANSFORM aggfunction selectstatement PIVOT pivotfield [IN (value1[, value2[, ...]])]


Crosstab queries (contd)

aggfunction is a SQL aggregate function on a selected field, e.g. Max(value). This part defines what data will appear in the body of the output table, and what operations will be performed on it.
selectstatement is a SQL statement that defines which fields will appear as row headings. All these fields must be grouped by: this is our horizontal grouping.


Crosstab queries (contd)

pivotfield is the field on which the vertical
grouping is performed. This is the field whose values will be retrieved from a table and become column headings.


Crosstab queries (contd)

Find the total number of people aged 15 and over
for each year and age group, showing age groups at the top.
    TRANSFORM Sum(l.value)
    SELECT l.year
    FROM age_group a, labor_force l
    WHERE a.age_group_id = l.age_group_id
    GROUP BY l.year
    PIVOT a.group_description

Crosstab queries (contd)

The data are now grouped both horizontally and vertically. The labels 15-19 and 20-24 appear at the top of the table, even though they are values stored in the database.

Crosstab queries (contd)

The IN clause is used to restrict the column
headings to specific values, as well as to specify the order of column headings (by default, column headings are sorted alphabetically). For example, if we change the last line of the query to
    PIVOT a.group_description IN ('30-34', '15-19')
the age groups will appear as columns in this order, and all other age groups will be ignored.

Crosstab queries
List labor force by year, sex, and economic activity.
    TRANSFORM Max(l.value)
    SELECT l.year, ag.group_description, l.sex
    FROM activity_group ac, labor_force l, age_group ag
    WHERE l.age_group_id = ag.age_group_id
      AND l.activity_id = ac.activity_id
    GROUP BY ag.group_description, l.year, l.sex
    PIVOT ac.activity_description

Crosstab queries

Thus, we have presented the data in its original form. Note that although we execute an aggregate function (Max) against the data, the values shown in the output are not actually aggregated: they are the figures stored in the table. Why?

Characteristic functions
In VBA, there is a function IIf:
    IIf(expression, true_part, false_part)
expression is a condition to be evaluated; true_part is the value returned if the expression is true; false_part is the value returned if the expression is false. Thus,
    IIf(5<4, "5 is less than 4", "5 is not less than 4")
will return "5 is not less than 4".

Characteristic functions (contd)

We can use the function IIf to filter values in the
field list, rather than on the record level as we normally do in WHERE or HAVING. IIf is included with Visual Basic for Applications. It is not available in database management systems other than Access. However, in most of those the CASE statement may perform a similar function.


Characteristic functions (contd)

Thus we can transpose the table without using a crosstab query:
    SELECT l.year, a.group_description, l.sex,
        Max(IIf(l.activity_id = 'A', l.value, NULL)) AS [Economically active],
        Max(IIf(l.activity_id = 'I', l.value, NULL)) AS [Economically inactive],
        Max(IIf(l.activity_id = 'N', l.value, NULL)) AS [Not stated]
    FROM labor_force l, age_group a
    WHERE l.age_group_id = a.age_group_id
    GROUP BY l.year, a.group_description, l.age_group_id, l.sex
    ORDER BY l.year, l.age_group_id, l.sex


Characteristic functions (contd)

We group by all primary key fields save one (in
this case). In each output field, we use the IIf function in conjunction with an aggregate function to select which value in the group will be placed in this field. If need be, we might combine several conditions, e.g.
    IIf(activity_id='A' AND sex='F', value, NULL)

Characteristic functions (contd)

Both the IIf function and the CASE statement are
more versatile than pure characteristic functions, which return 1 when a comparison is true and 0 otherwise. But an expression with the IIf function can still be said to be a characteristic function. This approach is very often used in all database systems. In many ways, it is more flexible than crosstab statements, which are available in few database management systems.
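The same transposition can be sketched with CASE in a system that has no IIf or TRANSFORM. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in; the table is reduced to four columns, and the rows and output column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (year INTEGER, sex TEXT, activity_id TEXT, value INTEGER);
    INSERT INTO labor_force VALUES
        (1980, 'F', 'A', 10), (1980, 'F', 'I', 20), (1980, 'F', 'N', 30),
        (1980, 'M', 'A', 40), (1980, 'M', 'I', 50), (1980, 'M', 'N', 60);
""")

# Group by every key field except activity_id. In each group, CASE lets
# exactly one value through per output column (the rest are NULL), so MAX
# simply picks that stored value.
rows = conn.execute("""
    SELECT year, sex,
        MAX(CASE WHEN activity_id = 'A' THEN value END) AS active,
        MAX(CASE WHEN activity_id = 'I' THEN value END) AS inactive,
        MAX(CASE WHEN activity_id = 'N' THEN value END) AS not_stated
    FROM labor_force
    GROUP BY year, sex
    ORDER BY year, sex
""").fetchall()
print(rows)
```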

INSERT statement
Is used to append record(s) to a table. Can append values typed in the statement (syntax 1) or values from another table (syntax 2).
Syntax 1:
    INSERT INTO target [(field list)] VALUES (value list)
target is the table to which records are to be appended.

INSERT statement: example

    INSERT INTO Officer (officer_id, officer_name) VALUES (150, 'Turner')
This statement adds a record to the table officer. The field list can be omitted, but then one must know the physical order in which fields are arranged in the table to insert values in the correct order. Generally, this is not recommended.

Inserting records from another table

INSERT statement syntax 2
INSERT INTO target [(field list)] select_statement

This way, we can select data from one table and append the results to another table.


INSERT statement syntax 2

    INSERT INTO labor_force (age_group_id, sex, year, activity_id, value, officer_id)
    SELECT age_group_id, sex, year, activity_id, value, officer_id
    FROM new_data
    WHERE new_data.officer_id = 150

INSERT statement (contd)

Autonumber fields must never appear either in the
field or value list. These fields are never manually updated or inserted: their values are maintained automatically by Access. When you insert a record, Access will provide a value for such a field.
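Appending selected rows while leaving the automatically numbered field out of the field list can be sketched as follows. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in for Access, with an AUTOINCREMENT column playing the role of an Autonumber field. The column record_id is hypothetical, and the tables are reduced to a few columns with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (
        record_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- hypothetical autonumber field
        year INTEGER, value INTEGER, officer_id INTEGER);
    CREATE TABLE new_data (year INTEGER, value INTEGER, officer_id INTEGER);
    INSERT INTO new_data VALUES (1995, 10, 150), (1995, 20, 151), (1996, 30, 150);
""")

# record_id appears in neither the field list nor the SELECT list;
# the database assigns it for each appended row.
conn.execute("""
    INSERT INTO labor_force (year, value, officer_id)
    SELECT year, value, officer_id
    FROM new_data
    WHERE officer_id = 150
    ORDER BY year
""")
conn.commit()

rows = conn.execute(
    "SELECT record_id, year, value FROM labor_force ORDER BY record_id"
).fetchall()
print(rows)
```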


UPDATE statement
Is used to modify existing data. Syntax:
    UPDATE target SET field1=expr1, field2=expr2, ... [WHERE criteria]
target specifies the table whose records are to be modified, fieldX denotes the fields whose contents are to be updated, and exprX is a value or expression whose result will be assigned to the field.

UPDATE statement: example

UPDATE labor_force SET value = value/1000

This statement will update all records in the table labor_force, dividing their values by 1000. UPDATE statements without a WHERE clause are very dangerous because they affect all records in a table.

UPDATE statement (contd)

The WHERE clause for the UPDATE statement
is basically the same as for the SELECT statement. WHERE is almost always used with UPDATE; it specifies the records that must be updated.


UPDATE statement: example

An operator mistakenly entered a number of
records with year 2202 instead of 2002. To correct the error, run
    UPDATE labor_force SET year = 2002 WHERE year = 2202

Subqueries may be used in UPDATE statements.


DELETE statement
Deletes records from a table. Syntax:
DELETE FROM target WHERE criteria

Delete all records entered by officer Johnson:

    DELETE FROM labor_force
    WHERE officer_id = (
        SELECT officer_id FROM officer WHERE officer_name = 'Johnson')


DELETE and UPDATE statements are
dangerous: their actions cannot be undone, and they can easily render an entire database useless. To protect against undesired modification, it is best to first develop the WHERE clause of an UPDATE or DELETE statement and test it with a SELECT statement to make sure that affected records are indeed those meant to be modified or deleted. Then one may execute the UPDATE or DELETE with the WHERE clause.
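The precaution described above, testing the WHERE clause with a SELECT before running the DELETE, can be sketched as follows. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in for Access; the rows are invented, and the variable names are introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE officer (officer_id INTEGER PRIMARY KEY, officer_name TEXT);
    CREATE TABLE labor_force (year INTEGER, officer_id INTEGER);
    INSERT INTO officer VALUES (1, 'Francis'), (2, 'Johnson');
    INSERT INTO labor_force VALUES (1980, 1), (1981, 2), (1982, 2);
""")

# One WHERE clause, reused verbatim by the preview and by the DELETE.
where = "officer_id = (SELECT officer_id FROM officer WHERE officer_name = ?)"
params = ("Johnson",)

# Step 1: preview how many records the clause selects.
previewed = conn.execute(
    f"SELECT COUNT(*) FROM labor_force WHERE {where}", params
).fetchone()[0]

# Step 2: only after checking the preview, run the DELETE.
deleted = conn.execute(f"DELETE FROM labor_force WHERE {where}", params).rowcount
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM labor_force").fetchone()[0]
print(previewed, deleted, remaining)
```

In practice the preview would list the records themselves (SELECT *) so that they can be inspected, not just counted.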

SELECT...INTO statement
The SELECT...INTO statement produces what
is called a make-table query in Access. SELECT...INTO creates a new table and writes the results of the SELECT statement into it.

SELECT * INTO labor_force_1995 FROM labor_force WHERE year = 1995
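Other systems offer the same make-table operation under a different syntax. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in; SQLite has no SELECT...INTO, so the equivalent CREATE TABLE ... AS SELECT is used, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labor_force (year INTEGER, value INTEGER);
    INSERT INTO labor_force VALUES (1994, 90), (1995, 100), (1995, 110);
    -- SQLite counterpart of: SELECT * INTO labor_force_1995 FROM labor_force WHERE year = 1995
    CREATE TABLE labor_force_1995 AS
        SELECT * FROM labor_force WHERE year = 1995;
""")

copied = conn.execute(
    "SELECT year, value FROM labor_force_1995 ORDER BY value"
).fetchall()
print(copied)
```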


In this chapter, we have covered:
The role of queries; Types of queries; Building queries in the Design View; The SQL language: simple select queries, inner and outer joins, aggregation and grouping, subqueries, union queries, crosstab queries; Action queries in SQL.

Forms and Reports


The purpose of forms

To provide a user interface: forms make data entry,
retrieval, and manipulation much more convenient and user-friendly than queries or tables can. Forms also offer advanced data filtering, automation, and validation. Forms can, and very often do, contain business logic: programming code that is used to process the data.


Ways to create a form

In the Design view (all elements are added to the form manually);
With a wizard (elements are added automatically).
In this chapter, we shall concentrate on the Design view.


Creating a form
In the Design view, a form is created and edited.
In the Form view, the form interacts with the user.


Form elements
An Access form in reality is a window. Elements
that a form may contain are the elements of window interface: text boxes, combo boxes, labels, etc. These elements are called controls. The user enters data and receives feedback with the use of controls. There can be Visual Basic for Applications (VBA) code behind any form or control.


Available controls are listed as icons in the
toolbox at the left-hand side of the Access window in the Design view. To place a control on a form, select it in the toolbox and draw it on the form.


Form and control properties

The appearance and behavior of an object is
largely governed by its properties. Wizards help us configure forms or controls by setting some of their properties and possibly adding code behind the controls. The properties can also be set manually in the Design view.


Form and control properties

To access the properties of a form or control,
right-click on the object and choose Properties.


Some commonly used controls

Label: static (unchangeable) text. Cannot be bound to a data source.
Text box: is used to enter or display text or numeric data.
Radio button: lets the user select one of several predefined options.
Checkbox: lets the user turn an option on or off.
Combo box: lets the user select one of a number of values.
List box: very much like a combo box, but several values can be selected.

Some commonly used controls (contd)

Command Button: triggers an action, such as opening another form, updating the underlying table, refreshing the form, executing a VBA procedure, etc.
Option Group: simplifies the design of groups of radio buttons, checkboxes, or toggle buttons.


Bound and unbound forms

One of the most important form properties is
Record Source. If Record Source is set, a form is said to be bound to a table or query. That means controls will derive their data from, and write to, the table or query. Each control may then be bound to a field or expression (with the Control Source property), but even on a bound form a control may or may not be bound.

Bound and unbound forms (contd)

If a control is bound to an expression (rather than
a field), it is not updateable. If a control is bound to a field, changes the user makes to the control (e.g., typing text or selecting a different value from a combo) are automatically written to the table when the user proceeds to the next record. An unbound form may still retrieve data from, or write to, a table or query, but the developer then has to manage the process.

Simple data entry form

To design a data entry form for the table officers:
Create a form; Bind it to the table officers; Create controls and bind them to individual fields in the table.

Wizards simplify the binding and configuring of controls, which for some types of controls (e.g. combos) may be tedious.

Create a form


Bind the form

The Record Source property of a form sets the
table or query to which the form is bound.


Create and bind controls (text box)

To create a control (e.g., a text box), click on the
icon on the toolbox and draw the control on the form. Right-click on the control and click Properties to bring up the properties of the control. These are grouped by category. The Control Source property determines the field to which the control is bound.


Editing data with the form

Once controls have been bound, we can use the
form to view and edit data in the Form view.

The navigation buttons at the bottom of the form are used to browse the records of the table. The buttons appear automatically and usually indicate that the form is bound.

Combining data from different tables

Consider a data entry form for the table
labor_force. One of the properties of a labor force record is its activity group, represented by the field activity_id in the table. activity_id is a numeric key, which we may generate automatically. While it is possible to have the user type in this value when adding a record, it is bad style: the user would have to remember all activity ids, which are meaningless numbers.

Combining data from different tables

The correct way is to let the user select from a list
of activity group descriptions, not type in a key value. The same applies to age groups and officers. Various controls can be used for this purpose, but combo boxes and list boxes are among the most commonly used in such situations.


Combining data from different tables

In the following example, we are going to use a
wizard to configure the combo. Control Wizards are quite useful and simplify this process. If the Control Wizards are turned on, a wizard will be invoked automatically whenever you add a control to the form. However, control wizards are not available for some controls.


Combo box wizard

Configures a combo for different situations. Here,
we specify that the values for the combo come from a table.


Combo box wizard

Data can be fetched from tables or queries. It is
important to remember that values for the activity_id field in the table labor_force come from the table activity_group.


Combo box wizard (contd)

Usually combos contain several columns. In our
case, there will be two: activity_description, which will display values for the user to select from, and activity_id, which will be stored in the table labor_force.


Combo box wizard (contd)

The key column (activity_id in this case) is
usually hidden from the user.


Combo box wizard (contd)

When a user selects an activity group from the
list, internally the control takes the value of the associated key. We specify that this value must be stored in the field activity_id.


Combo box wizard (contd)

At the last stage, we give the combo a title, which
will be placed in a label next to the combo.


The result
In the Form view, the user can select an activity
group from the list.


Option Groups
Option (or radio) buttons let the user select one,
and only one, of a set of values; in this way, they are similar to combo boxes. Since several controls participate in the selection (several buttons are present for the user to select one), the option buttons are grouped. We will use the Option Group control to let the user select an age group.


Option group wizard

Labels help the user identify available options.
They describe possible values for the field, and may not be equal to the values.


Option group wizard (contd)

The default button is the one that is automatically
selected when the form opens. The user can then select another option. In this case we do not want a default value, as any age group is equally likely to be selected.


Option group wizard (contd)

Values are to an option group what the key column is to a combo
box. Labels are just for the user to see; the whole group takes the value of the selected option when the user makes a choice. In this case, the values must be age_group_ids.


Option group wizard (contd)

This value can be written to a table or otherwise
processed. We will write the value into the field age_group_id.


Option group wizard (contd)

An option group can be presented with option
(radio) buttons, check boxes or toggle buttons. In Windows, option groups are usually displayed as radio buttons.


Option group wizard (contd)

The last stage is to set a caption for the option group.


The result


Command buttons
Command buttons always trigger an action. The
wizard can configure a button for the most common actions. E.g., let's place a button "Delete Record" on the form.


Command buttons (contd)

Choose what text or picture to place on the face of
the button.


Command buttons (contd)

Give a name to the button. This name is used to
reference the control in code. All controls have names, but not all wizards offer to provide a name for the controls they configure.


Command buttons (contd)

Now, a click on the button in the Form view will cause the
current record to be deleted (after a confirmation).

Configuring command buttons for other actions is equally simple.


Reports
Reports are primarily intended for printing out the contents
of the database, presented and formatted in a way that fits the user's needs. Their design is similar to that of forms, but reports are read-only and usually do not interact with the user. Reports can be created:
With the AutoReport Wizard; With the (general) Report Wizard; In the Design View.


AutoReport Wizard
Creates a report that is simply a printout of a table
or query. To launch, select a table or query and select AutoReport from the New Object menu.


AutoReport Wizard (contd)

Does not have any options, and is quite limited in
capabilities and use.


Report Wizard

The general Report Wizard is quite flexible. It takes as input one or several tables or queries, offers various grouping options, and offers several layouts and styles.

We will use the Report Wizard to create a report on labor force.


Report: labor force

The first step is to choose the data we want displayed.
The data may come from several queries or tables, which will be joined automatically.


Report: labor force (contd)

We can group by activity group, age group, or
labor force (either of the source tables). We choose to group by Labor_Force.


Report: labor force (contd)

More grouping levels can be added. For example,
if we add grouping by year, each year will be displayed individually, and all records for that year below it.


Report: labor force (contd)

On the next step we choose how the report will be
sorted. We may also add aggregation.


Report: labor force (contd)

Choose one of the available layouts.


Report: labor force (contd)

Choose a style


Report: labor force (contd)

The last step is to give the report a title. It will
also become the report's name.


Reports in the Design View

In the Design View, reports can be developed
from scratch, or you can edit reports generated by the Report Wizard or AutoReport Wizard. It is often easiest to generate a report with the wizard and then add the finishing touches in the Design View.


Reports in the Design View (contd)

This is how our report rptLaborForce looks in the
Design View.


Report Section
While the report looks similar to the forms we have covered,
the most noticeable difference is that the report is divided into several sections. In fact, forms too can be divided into sections, but this is not used as often as with reports. Report Header, Page Header, Detail, Page Footer, and Report Footer are the standard sections. The Report Header section appears once, at the beginning of the report, and the Report Footer once, at the end. The Page Header appears at the top of each page and the Page Footer at the bottom of each page.

Report Section (contd)

All other sections are the body of the report. Their
number depends on the number of grouping levels. When the report is generated, a section is printed once for each distinct value in the field on which the section is grouped. The order, from top to bottom, in which these sections appear determines how the data will be grouped. On our report, each distinct year will be printed once. Age_group_description, activity_description, and sex will be printed as many times as there are such entries for each year.

Report Controls
Although any control can be placed on a report,
Labels and Text Boxes are most commonly used. That is because report controls are not interactive. Labels contain static text, which never changes. The Caption property of a label determines what text it contains. Text Boxes are bound to fields. When it actually appears on the report, a text box takes the value of the field it is bound to as each record is displayed.

Text boxes on reports

Consider this section: the control "year" on the left is a label. Its contents, i.e. the
word "year", will never change on the page. The control on the right is a text box bound to the field year. It will take the value of each year as the report is generated. Thus, on the report they will look like this:
    year 1980
    year 1991

Grouping in reports
The Sorting and Grouping dialog box lets you
manually set grouping and sorting preferences. Fields on which the data are grouped are shown and can be selected in the Field/Expression column.


Grouping in reports (contd)

Group Header/Group Footer determine whether the report
is grouped on the field, and how many sections to create for the group. Group On/Group Interval determine how Access groups the data. Group On determines whether the data will be grouped on the entire field or a part of it, and Group Interval then determines what part of the field is grouped. Keep Together sets whether the entire section must appear on the same page, or whether it may be carried over to the next page.


Aggregation in reports
Is performed with the same aggregate functions
that are used in queries (Sum, Min, Max, Avg, Count). To get an aggregated field, the Control Source property of the control (usually a text box) must be set to an aggregate expression, for example =Sum([Economically active]).


Aggregation in reports (contd)

The way aggregation is carried out depends on the
location of the control that contains the aggregating expression. If such a control is in a group header or footer, it will aggregate for that group. If it is in a report header or footer, it will aggregate the data of the entire report. This way, sums in group headers/footers are group subtotals, and a sum in the report header or footer is a grand total.
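How the position of a sum decides its scope can be sketched in plain code. This is a minimal illustration in Python, not how Access renders a report; the function report_lines and the sample records are hypothetical. A sum placed in the "group footer" position sees only its group, and the one in the "report footer" position sees every record.

```python
from itertools import groupby

# (year, age group, value): invented sample records
records = [(1980, "15-19", 100), (1980, "20-24", 150), (1981, "15-19", 110)]

def report_lines(rows):
    lines = []
    grand_total = 0
    # Sort first: groupby only merges adjacent rows with the same year.
    for year, group in groupby(sorted(rows), key=lambda r: r[0]):
        subtotal = 0
        for _, age_group, value in group:
            lines.append(f"  {age_group}: {value}")     # Detail section
            subtotal += value
        lines.append(f"Total for {year}: {subtotal}")   # group footer
        grand_total += subtotal
    lines.append(f"Grand total: {grand_total}")          # report footer
    return lines

for line in report_lines(records):
    print(line)
```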


Reports: when to aggregate

Aggregation in reports comes in very handy when
you want to display Detail and Summary: both data as they appear in the database and their aggregates. This is something that queries cannot handle. However, if you only want to display aggregates, it is far easier to create a query that aggregates the data, and create a report bound to the query.


Reports: example
We want to reproduce the spreadsheets based on
which we designed the database. First, create a crosstab query that represents the data in their original format.
    TRANSFORM Max(l.value)
    SELECT l.year, ag.group_description, l.sex
    FROM labor_force l, age_group ag, activity_group ac
    WHERE l.activity_id = ac.activity_id
      AND l.age_group_id = ag.age_group_id
    GROUP BY l.year, ag.group_description, l.age_group_id, l.sex
    PIVOT ac.activity_description

Reports: example (contd)

Create a report and bind it to the query.


Reports: example (contd)

Do not group yet. Grouping is done in the query.


Reports: example (contd)

Set a sort order.


Reports: example (contd)

Choose a layout. Note: the list of available
layouts is different because this is a crosstab query.


Reports: example (contd)

Choose a style


Reports: example (contd)

Name the report.


Reports: example (contd)

The report could look better. Also, we need to add the totals.

Reports: example (contd)

First, let's change, move, and realign the labels
and text boxes.


Reports: example (contd)

We have two kinds of totals:
Line totals, which sum up the values on the same line (a total for labor force across all activity groups by year, age group, and sex).
Group totals: the total of both sexes in an age group in a year; the total of all people in an activity group in a year; and the total of all labor force in a year.


Reports: example (contd)

To introduce the line total, simply place a text box
that sums up fields of the record.
The Control Source for the text box is =[Economically active] + [Economically inactive] +[Not stated]


Reports: example (contd)

To introduce group totals, we need to group by
the year and by the age group. We do not need group headers, but for each of these we will add group footers, where the totals should be placed.


Reports: example (contd)

With the group footers in place, we need to add
labels and text boxes. For both sections the control sources of these text boxes are the same:
    =Sum([Economically active])
    =Sum([Economically inactive])
    =Sum([Not stated])


Reports: example (contd)

To calculate totals for all activity groups, the
easiest way is to place a text box in each section that sums up the contents of the three totals text boxes. In the group_description footer, name the text boxes for totals txtActive, txtInactive, and txtNotStated. The control source for the text box that calculates the line total is
    =txtActive+txtInactive+txtNotStated
The same must be done in the year footer, but the text boxes' names must be different.

Reports: example (contd)


Reports: example (contd)


In this chapter, we have covered:

The purpose of forms and ways to create them; The purpose of controls and different types of controls; Creating an advanced form with control wizards; The purpose of reports in Access; Creating a report with Report Wizard; Reports in the Design View; Grouping and aggregation in reports.


The End