
Data Flow Diagram Symbols

Data Flow Diagrams Symbols

There are some symbols that are used in the drawing of business
process diagrams (data flow diagrams). These are now explained,
together with the rules that apply to them.
Flow diagrams in general are usually drawn with simple symbols such
as a rectangle, an oval or a circle, each depicting a process, a data
store or an external entity, and arrows are generally used to depict
the data flow from one step to another.
A DFD usually comprises four components, each represented by a simple
symbol: external entities (sources or destinations of data) are
represented by squares; processes (input-processing-output) are
represented by rectangles with rounded corners; data flows (physical
or electronic data) are represented by arrows; and data stores
(physical or electronic, such as XML files) are represented by
open-ended rectangles.
Data flow diagrams (DFDs) present the logical flow of information
through a system in graphical or pictorial form, showing the data used
and provided by processes within the system. Because they use only
four basic symbols, they are useful for communication between
analysts and users.
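As a rough illustration, the four components can be modelled as plain
data. The sketch below is not part of any DFD standard: the class names
(Kind, Node, DataFlow) and the customer/order example are hypothetical
and chosen only to mirror the symbols described above.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three node symbols; the fourth symbol, the data flow, is an arrow."""
    EXTERNAL_ENTITY = "square"
    PROCESS = "rounded rectangle"
    DATA_STORE = "open-ended rectangle"


@dataclass(frozen=True)
class Node:
    name: str
    kind: Kind


@dataclass(frozen=True)
class DataFlow:
    """An arrow carrying named data from one node to another."""
    source: Node
    target: Node
    label: str


# Hypothetical example: a customer places an order that is then recorded.
customer = Node("Customer", Kind.EXTERNAL_ENTITY)
take_order = Node("Take order", Kind.PROCESS)
orders = Node("Orders", Kind.DATA_STORE)

flows = [
    DataFlow(customer, take_order, "order details"),
    DataFlow(take_order, orders, "accepted order"),
]

for flow in flows:
    print(f"{flow.source.name} --[{flow.label}]--> {flow.target.name}")
```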

Create structured analysis, information flow, process-oriented,
data-oriented, and data process diagrams as well as data flowcharts.

External Entity
An external entity is a source or destination of a data flow which is
outside the area of study. Only those entities which originate or
receive data are represented on a business process diagram. The
symbol used is an oval containing a meaningful and unique identifier.
Process
A process shows a transformation or manipulation of data flows within
the system. The symbol used is a rectangular box which contains three
descriptive elements:
Firstly, an identification number appears in the upper left-hand
corner. This is allocated arbitrarily at the top level and serves as a
unique identifier.
Secondly, a location appears to the right of the identifier and
describes where in the system the process takes place. This may, for
example, be a department or a piece of hardware. Finally, a
descriptive title is placed in the centre of the box. This should be a
simple imperative sentence with a specific verb, for example 'maintain
customer records' or 'find driver'.
Data Flow
A data flow shows the flow of information from its source to its
destination. A data flow is represented by a line, with arrowheads
showing the direction of flow. Information always flows to or from a
process and may be written, verbal or electronic. Each data flow may
be referenced by the processes or data stores at its head and tail, or
by a description of its contents.
Data Store
A data store is a holding place for information within the system.
It is represented by an open-ended narrow rectangle. Data stores may
be long-term files such as sales ledgers, or short-term
accumulations, for example batches of documents that are waiting to
be processed. Each data store should be given a reference letter
followed by an arbitrary number.
Resource Flow
A resource flow shows the flow of any physical material from its source
to its destination. For this reason resource flows are sometimes
referred to as physical flows.
The physical material in question should be given a meaningful name.
Resource flows are usually restricted to early, high-level diagrams and
are used when a description of the physical flow of materials is
considered to be important to help the analysis.

External Entities
It is normal for all the information represented within a system to have
been obtained from, and/or to be passed on to, an external source or
recipient. These external entities may be duplicated on a diagram to
avoid crossing data flow lines. Where they are duplicated, a stripe is
drawn across the left-hand corner.
Adding a lowercase letter to each entity on the diagram is a good way
to identify them uniquely.

Processes
When naming processes, avoid glossing over them without really
understanding their role. A sign that this has happened is the use of
vague terms in the descriptive title area, such as 'process'.
The most important thing to remember is that the description must be
meaningful to whoever will be using the diagram.

Data Flows
Double headed arrows can be used (to show two-way flows) on all but
bottom level diagrams. Furthermore, in common with most of the
other symbols used, a data flow at a particular level of a diagram may
be decomposed to multiple data flows at lower levels.
Data Stores
Each store should be given a reference letter, followed by an arbitrary
number. These reference letters are allocated as follows:
'D' - indicates a permanent computer file
'M' - indicates a manual file
'T' - indicates a transient store, one that is deleted after processing.
In order to avoid complex flows, the same data store may be drawn
several times on a diagram. Multiple instances of the same data store
are indicated by a double vertical bar on their left hand edge.
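
The reference scheme above is simple enough to check mechanically. The
following is a minimal sketch, assuming the letter is written directly
before the number with no separator (for example 'D3'); the function
name parse_store_reference is hypothetical.

```python
import re
from typing import Tuple

# Prefix meanings as listed above.
STORE_TYPES = {
    "D": "permanent computer file",
    "M": "manual file",
    "T": "transient store",
}

# Assumption: one of the three letters followed immediately by digits.
_REFERENCE = re.compile(r"^([DMT])(\d+)$")


def parse_store_reference(reference: str) -> Tuple[str, int]:
    """Split a reference such as 'D3' into (store type, number)."""
    match = _REFERENCE.match(reference)
    if match is None:
        raise ValueError(f"not a valid data store reference: {reference!r}")
    letter, number = match.groups()
    return STORE_TYPES[letter], int(number)


print(parse_store_reference("D3"))
print(parse_store_reference("T1"))
```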

Process modelling

• Involves graphically representing the functions, or processes, which capture,
manipulate, store and distribute data between a system and its environment and
between components of a system.

Process modelling > Deliverables

• A context diagram which provides the boundaries, or scope of the system.

• Physical DFD of the current system, specifying which people and what
technologies are being used during which processes that move and/or transform
data.

• Logical DFD, that is technology independent.

Process modelling > The decomposition diagram

• This represents the breakdown of activities in the functional area to be automated.

• It uses a tree structure and does not show sequence.

• Shows a snapshot of the functional breakdown of an organization, not the
organizational structure.

• Processes are supported by narrative descriptions.

• Can be used either to better understand the business processes or to model
the system.

• The root process is shown at the very top of the functional decomposition
diagram, which later feeds into the context level of the DFD diagram.

Process modelling > The structure chart

• Not the same as the process decomposition diagram

o It is a structural breakdown of the proposed system where the hierarchical
format is used to model the system structure in modules.
o Each module calls on other modules, and each module is described by
process logic in the form of pseudo code:

   A = B + C
   Data structure A consists of data structures B and C.

   A = B + C + (D)
   Data structure A consists of data structures B and C and optionally D.

   A = B + [C/D/E]
   Data structure A consists of data structure B and one of either C, D or E.

   A = B + { C+D }
   Data structure A consists of data structure B and repetitions of the
   group comprised of C and D.
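
The notation above can be sketched as a small data model. This is only
an illustration under stated assumptions: the class names (Opt, OneOf,
Repeat) and the functions render and matches are hypothetical,
"repetitions" is taken to mean zero or more, and the matcher is greedy,
so it is not a general parser for ambiguous definitions.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class Opt:
    """(D): an optional component."""
    item: str


@dataclass(frozen=True)
class OneOf:
    """[C/D/E]: exactly one of the listed components."""
    choices: Tuple[str, ...]


@dataclass(frozen=True)
class Repeat:
    """{ C+D }: repetitions of a group (assumed: zero or more)."""
    group: Tuple[str, ...]


Part = Union[str, Opt, OneOf, Repeat]


def render(name: str, parts: Sequence[Part]) -> str:
    """Write a definition back out in the notation used above."""
    rendered = []
    for part in parts:
        if isinstance(part, Opt):
            rendered.append(f"({part.item})")
        elif isinstance(part, OneOf):
            rendered.append("[" + "/".join(part.choices) + "]")
        elif isinstance(part, Repeat):
            rendered.append("{ " + "+".join(part.group) + " }")
        else:
            rendered.append(part)
    return f"{name} = " + " + ".join(rendered)


def matches(parts: Sequence[Part], items: Sequence[str]) -> bool:
    """Greedily check whether a list of component names fits a definition."""
    pos = 0
    for part in parts:
        if isinstance(part, Opt):
            if pos < len(items) and items[pos] == part.item:
                pos += 1
        elif isinstance(part, OneOf):
            if pos >= len(items) or items[pos] not in part.choices:
                return False
            pos += 1
        elif isinstance(part, Repeat):
            size = len(part.group)
            # Consume the group as many times as it appears in sequence.
            while size and tuple(items[pos:pos + size]) == part.group:
                pos += size
        else:
            if pos >= len(items) or items[pos] != part:
                return False
            pos += 1
    return pos == len(items)


definition = ["B", Repeat(("C", "D"))]
print(render("A", definition))
print(matches(definition, ["B", "C", "D", "C", "D"]))
```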

Process modelling > Purpose of a DFD

• Ironically, data movement is NOT the original purpose of a DFD as described by
Tom DeMarco.

• The original purpose is to divide the system into pieces in such a way that the
pieces are reasonably independent, and to declare the interfaces among the pieces.

Process modelling > Eight simple rules for drawing DFDs

1. Establish the context of the DFD, by laying out the sources and sinks and the
major inputs and outputs in the context level diagram.

2. Select where to start: The sources/sinks or the root process... don't try and be a
hero... follow iteration and discipline yourself.

3. All labels should be meaningful.

4. Processes should be labelled with action verbs.

5. Omit insignificant functions from the DFD, such as initialization and termination.

6. Do not include control information or control-of-flow information, such as "Read
Next Record" and so on...

7. Don't put too much information on any one DFD... partition for information
efficiency... a crowded DFD will not be utilized efficiently.

8. Be prepared to start over; false starts are inevitable.

Process modelling > The data flow diagram

• Captures the flow of data through a system. The system in question can be
physical or logical, manual or computer-based.

• The mechanics of a DFD are pretty simple because there are only four symbols
that are used:

1. The data flow

2. The data store
3. The data process
4. The data source or sink

• The above four symbols conform to two different yet equally substantive and
applicable standards:

o Gane and Sarson standard

o DeMarco and Yourdon standard

(The above image is from the lecture notes passed out in class by Prof. Shubashish
Dasgupta in September 2005, and is NOT the work of Panos Marcoullis)

Process modelling > DFD rules

• A DFD is made up of data flows, data stores, data processes and data sources or
sinks.

• Naming conventions for data processes:

o A data process transforms data from an input to an output, and it is named
by combining a verb with a noun and/or adjective such that:

[action verb].([noun]+[optional adjective])


o A data process should represent the functional requirements of a system.

o There should be one data process for each of the processes in the
functional decomposition diagram.

o Naming is key: use crisp verb and noun combinations.

o If a process cannot be named easily, then the partitioning is not efficient.

• Naming conventions for data flows:

o This shows data in transit between processes and is named such that:



o A data flow should be decomposable.

• Naming conventions for data stores:

o A data store shows data packets at rest and is named such that:


example: APPLICANT

o They are connected to data processes by data flows.

o Each data store must correspond to an entity in the logical model of the
system (the entity-relationship diagram).

• Naming conventions for data sources or sinks

o It is where information originates and/or ends up after processing in the
system, and is named such that:



o There should be no data flowing between sources or sinks because this is
outside the scope of the system. They should communicate only through
the system.

Process modelling > DFD context level diagram

• It is the highest-level view of the system, and it shows the entire system as one
process. The process depicted should correspond to the root process in your
functional decomposition diagram. This is the only place where a process can be
named after a noun, such as "System".

• Furthermore no data stores should be visible in the context level diagram, because
they are all encapsulated within the root process and are hence not visible here.

• The single process should be labelled "0" (the number zero).

• It looks something like this:


Process modelling > DFD level 0 diagram

• It depicts the major processes within the system. All the data stores should be
visible here and, what is more, the data flows to and from the root process depicted
in the context level diagram should be the same here in the level 0. This is called
balancing. Each process in the level 0 should be numbered 1.0, 2.0, 3.0 and so on.

• It looks something like this:


Process modelling > DFD level 1 diagram

• It explodes one specific process from the level 0, so that we can see more detail.
Again we must ensure balancing between levels.

• It looks something like this:


Process modelling > Concept of balancing

• States that inputs and outputs to a process should be conserved at the next level of
decomposition.
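
Balancing can be checked mechanically once flows are written down as
data. The sketch below is one possible reading, not a standard
algorithm: flows are hypothetical (source, target, label) tuples,
"internal" is the set of nodes that exist only inside the child
diagram, and labels are compared exactly, which ignores the fact that a
parent flow may be decomposed into several flows at a lower level. The
process numbers 1.1 and 1.2 are illustrative.

```python
from typing import Iterable, Set, Tuple

Flow = Tuple[str, str, str]  # (source, target, label)


def boundary_flows(flows: Iterable[Flow], internal: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return the labels of flows entering and leaving the child diagram."""
    entering, leaving = set(), set()
    for source, target, label in flows:
        if source not in internal and target in internal:
            entering.add(label)
        elif source in internal and target not in internal:
            leaving.add(label)
    return entering, leaving


def is_balanced(parent_inputs: Iterable[str], parent_outputs: Iterable[str],
                child_flows: Iterable[Flow], internal: Set[str]) -> bool:
    """True if the child diagram conserves the parent's inputs and outputs."""
    entering, leaving = boundary_flows(child_flows, internal)
    return entering == set(parent_inputs) and leaving == set(parent_outputs)


# Hypothetical parent process with one input and one output.
child = [
    ("Customer", "1.1", "order"),
    ("1.1", "1.2", "validated order"),
    ("1.2", "Customer", "invoice"),
]
print(is_balanced({"order"}, {"invoice"}, child, {"1.1", "1.2"}))
```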

Process modelling > Using a data dictionary

• A place to store information for the support of system analysis, design,
implementation and other life cycle activities. It should be:

o automated or manual.

o easily accessible.

o easily modifiable.

o easily maintainable.

• It contains information about data elements, data structures and data flows.

• Entries in the data dictionary:

o Data elements
o Data structures
o Data files
o Data processes

• Definitions in the data dictionary:

o Narrative
o Format
o Volumes
o Samples
o Standard names

What is a data flow diagram (DFD)?


By oderog

Data flow diagrams defined

A data flow diagram is a graphical tool that shows the processes, flows, stores and
external entities in a system. A data flow diagram shows the transformation of data
within a system. A DFD has the following symbols

Process symbol

The process symbol has the following elements: a process number (which identifies
the process), a locality (where the activity happens) and a process name

Data flow diagram process symbol rules

· It symbolizes the transformation of data

· There must be data flowing into and out of the process

· A process can have several inputs and several outputs

· A process with no output is a null process

Data store Symbol

It consists of the following elements: a data store number and the name of the data
store. The function of a data store is to designate the storage of data in a DFD

Rules of Data store

· A data store does not change from level to level, but it may reappear where needed

· The symbol and the numbering remain the same

Data flow symbol

The data flow symbol may appear in different shapes; it signifies the movement of data.
It does not signify the movement of people, goods etc

· A double-headed arrow signifies that activities occur at the same time, which is wrong

· The data flowing into a process is never the same as the data flowing out

External entity symbol

An external entity is a source or destination of data: a source is where data
originates, and a destination is the sink where data ends up

Dos and Don’ts of external entity

· External entities never communicate directly with each other; a direct flow between
them would mean that no process is needed

· External entities should not communicate directly with a data store, because external
entities can be identified with the records of files and databases

How to develop a logical data flow diagram

Below are the guidelines for developing data flow diagrams

1. Develop a physical DFD
2. Explode the processes for more detail
3. Maintain consistency between the processes
4. Follow meaningful leveling conventions
5. Ensure that the DFD clarifies what is happening in the system
6. Remember the DFD audience
7. Add controls on the lower-level DFDs only
8. Assign meaningful labels
9. Evaluate the DFD for correctness

Steps in drawing DFDs

1. Make a list of all business activities and use it to determine the various external
entities, data flows, processes and data stores
2. Create a context diagram which shows the external entities and the data flows to
and from the system
3. Do not show any detailed processes or data stores in the context diagram
4. Draw diagram zero, the next level, to show the processes but keep them general. Show
data stores at this level
5. Create a child diagram for each of the processes in diagram zero
6. Check for errors and make sure the labels you assign to each process and data
flow are meaningful
7. Develop a physical DFD from the logical DFD: distinguish between manual and
automated processes, describe actual files and reports by name, and add
controls to indicate when processes are complete or errors occur
8. Partition the physical DFD by separating or grouping parts of the diagram in order
to facilitate programming and implementation

Advantages of data flow diagrams

• It gives further understanding of the interrelatedness of the system and sub-systems

• It is useful for communicating current system knowledge to the user
• It is used as part of the system documentation files
• It helps to substantiate the logic underlying the data flow
• It gives a summary of the system
• Errors are easy to trace in a DFD, which also gives the development team a quick
reference for locating and controlling errors

Disadvantages of data flow diagram

• A DFD is likely to go through many alterations before agreement with the user
• Physical considerations are usually left out
• It is difficult to understand because it is ambiguous to users who have little or no

Data Flow Diagram Symbols

The following rules are taken from: Accounting, Information Technology, and Business Solutions, 2nd Edition, by
Hollander, Denna, and Cherrington (McGraw-Hill, 1999).
1 All processes should have unique names. If two data flow lines (or data
stores) have the same label, they should both refer to the exact same data
flow (or data store).
2 The inputs to a process should differ from the outputs of a process.
3 Any single DFD should not have more than about seven processes.
4 No process can have only outputs. (This would imply that the process is
making information from nothing.) If an object has only outputs, then it
must be a source.

5 No process can have only inputs. (This is referred to as a “black hole”.) If
an object has only inputs, then it must be a sink.

6 A process has a verb phrase label.
7 Data cannot move directly from one data store to another data store.
Data must be moved by a process.

8 Data cannot move directly from an outside source to a data store. Data
must be moved by a process that receives data from the source and places
the data in the data store.

9 Data cannot move directly to an outside sink from a data store.
Data must be moved by a process.

10 A data store has a noun phrase label.
11 Data cannot move directly from a source to a sink. It must be moved by a
process if the data are of any concern to the system. If data flows directly
from a source to a sink (and does not involve processing) then it is
outside the scope of the system and is not shown on the system data flow
diagram (DFD).

12 A source/sink has a noun phrase label.
13 A data flow has only one direction between symbols. It may flow in both
directions between a process and a data store to show a read before an
update. To effectively show a read before an update, draw two separate
arrows because the two steps (reading and updating) occur at separate
times.

14 A fork in a data flow means that exactly the same data goes from a
common location to two or more different processes, data stores, or
sources/sinks. (This usually indicates different copies of the same data
going to different locations.)

15 A join in a data flow means that exactly the same data comes from any of
two or more different processes, data stores, or sources/sinks, to a
common location.

16 A data flow cannot go directly back to the same process it leaves. There
must be at least one other process that handles the data flow, produces
some other data flow, and returns the original data flow to the originating
process.


17 A data flow to a data store means update (i.e., delete, add, or change).
18 A data flow from a data store means retrieve or use.
19 A data flow has a noun phrase label. More than one data flow noun
phrase can appear on a single arrow as long as all of the flows on the
same arrow move together as one package.
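
Several of these rules are structural and can be checked automatically.
The sketch below covers only rules 3, 4, 5, 7, 8, 9, 11 and 16, and it
is an illustration rather than a complete validator: the node kinds,
the (source, target, label) tuple layout, the function name check_dfd
and the default limit of seven processes are assumptions chosen for the
example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

PROCESS, STORE, EXTERNAL = "process", "store", "external"


def check_dfd(nodes: Dict[str, str],
              flows: Iterable[Tuple[str, str, str]],
              max_processes: int = 7) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    inputs, outputs = Counter(), Counter()
    for source, target, label in flows:
        outputs[source] += 1
        inputs[target] += 1
        if source == target:
            problems.append(
                f"flow '{label}' returns directly to {source} (rule 16)")
        elif PROCESS not in (nodes[source], nodes[target]):
            # Store-to-store, source-to-store, store-to-sink, source-to-sink.
            problems.append(
                f"flow '{label}' from {source} to {target} "
                "is not moved by a process (rules 7, 8, 9, 11)")
    processes = [name for name, kind in nodes.items() if kind == PROCESS]
    if len(processes) > max_processes:
        problems.append(
            f"{len(processes)} processes on one diagram (rule 3)")
    for name in processes:
        if inputs[name] == 0:
            problems.append(f"process '{name}' has no inputs (rule 4)")
        if outputs[name] == 0:
            problems.append(f"process '{name}' has no outputs (rule 5)")
    return problems


# Hypothetical diagram: the last flow moves data between two stores.
nodes = {"Customer": EXTERNAL, "Take order": PROCESS,
         "Orders": STORE, "Archive": STORE}
flows = [
    ("Customer", "Take order", "order"),
    ("Take order", "Orders", "accepted order"),
    ("Orders", "Archive", "old orders"),
]
for problem in check_dfd(nodes, flows):
    print(problem)
```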