1. INTRODUCTION TO SYSTEMS ANALYSIS AND DESIGN (Anand Prakash Ruhil)
2. APPLICATIONS OF BIOSENSORS FOR FOOD & DAIRY INDUSTRIES (Neelam Verma)
3. OPTIMIZATION TECHNIQUES IN FOOD AND DAIRY PROCESSES (R.K. Sharma)
4. ARTIFICIAL INTELLIGENCE AND ITS APPLICATIONS (Ashwani Kumar Kush)
5. GRAPHICAL USER INTERFACE (Ashwani Kumar Kush)
6. INTRODUCTION TO MULTIMEDIA AND ITS APPLICATIONS IN DAIRYING (Jancy Gupta)
7. AN OVERVIEW OF MIS APPLICATIONS IN DAIRYING (D.K. Jain and R.C. Nagpal)
8. DATA WAREHOUSE AND ITS APPLICATION IN AGRICULTURE (Anil Rai)
9. DECISION SUPPORT SYSTEMS AND THEIR APPLICATIONS IN DAIRYING (D.K. Jain and Adesh K. Sharma)
10. UTILIZING THE POTENTIAL OF DYNAMIC MICROBIAL MODELING IN QUALITY ASSURANCE OF FOODS (Sanjeev K. Anand)
11. AN INTRODUCTION TO PROTEIN STRUCTURE AND STRUCTURAL BIOINFORMATICS (Jai K. Kaushik)
12. GENETIC ALGORITHMS AND THEIR APPLICATIONS IN DAIRY PROCESSING (Avnish K. Bhatia)
13. EXPERT SYSTEMS IN DAIRYING (Avnish K. Bhatia)
14. COMPUTATIONAL NEURAL NETWORKS WITH DAIRY AND FOOD PROCESSING APPLICATIONS (Adesh K. Sharma and R.K. Sharma)
15. GIS APPLICATIONS IN DAIRYING (D.K. Jain)
16. APPLICATION OF MATLAB FOR DESIGNING NEURAL NETWORK SOLUTIONS IN DAIRY PROCESSING (Anand Prakash Ruhil, R.R.B. Singh and R.C. Nagpal)
17. APPLICATION OF DIGITAL IMAGING FOR ASSESSING PHYSICAL PROPERTIES OF FOODS (Ruplal Choudhary)
18. DYNAMIC MODELING OF DAIRY AND FOOD PROCESSING OPERATIONS (Ruplal Choudhary)
19. MODELING OF FOOD PROCESSING OPERATIONS (S.N. Jha)
20. NIR MODELING FOR QUALITY EVALUATION OF DAIRY AND FOOD PRODUCTS (S.N. Jha)
21. COMPUTERIZED PROCESS CONTROL IN DAIRY AND FOOD INDUSTRY (K. Narsaiah)
22. DETERMINATION AND PREDICTION OF SHELF LIFE OF MOISTURE SENSITIVE FOOD PRODUCT BASED ON PRODUCT-PACKAGE-ENVIRONMENT INTERACTION (I.K. Sawhney)
23. SENSORS IN RELATION TO COMPUTER APPLICATION IN FOOD PROCESS CONTROL (A.A. Patel)
24. USE OF COMPUTER SOFTWARE LINDO IN DAIRY PROCESSING (Smita Sirohi)
25. STATISTICAL METHODS FOR DAIRY PROCESSING (D.K. Jain)
26. SORPTION ISOTHERMS AND GENERATION OF SORPTION PARAMETERS (G.R. Patil & R.R.B. Singh)
27. REACTION KINETICS AND MODELING FOR PREDICTION OF SHELF LIFE OF FOODS (G.R. Patil)
28. e-TONGUE IN MONITORING SENSORY QUALITY OF FOODS (S.K. Kanawjia)
29. TIME-TEMPERATURE INDICATORS IN CONTROLLING QUALITY OF FOODS (R.R.B. Singh)
30. FUZZY LOGIC SYSTEM WITH EMPHASIS ON FOOD AND DAIRY PROCESSING APPLICATIONS (Adesh K. Sharma)
31. BASIC CONCEPTS OF DATABASES (A.P. Ruhil)
32. INTRODUCTION TO MANAGEMENT INFORMATION SYSTEM (A.P. Ruhil)
33. GRAPHICAL REPRESENTATION OF SCIENTIFIC DATA (A.P. Ruhil)
34. INFORMATION AND COMMUNICATION TECHNOLOGY (Shuchita Upadhyaya Bhasin)
INTRODUCTION TO SYSTEMS ANALYSIS AND DESIGN

Anand Prakash Ruhil


Computer Centre
NDRI, Karnal

1.0 Introduction
In the early days, systems analysis and design was concerned with man-made systems involving inputs, processes and outputs. In modern times, it deals with examining and understanding the working of an existing system of an organization and identifying problems (if any) in it, with the objective of improving the system through better methods, technology and procedures. The subject has three basic components:

1. System
2. System Analysis
3. System Design

In the present context, system analysis and system design are interfaced with information technology to achieve the basic system objectives using computer-based systems. The word system has been used many times above, and the activities of system analysis and design also revolve around it. So the question arises: what is a system?

2.0 Definition of System


The word system is possibly the most overused word in our vocabulary. We commonly speak about the political system, education system, human body system, social system, transportation system, banking system, computer system, etc. The term system is derived from the Greek word systema, which means an organized relationship among functioning units or components. A system exists because it is designed to achieve one or more objectives. In the present context, a system can be defined as a group of interdependent components organized in such a way that they work together to achieve one or more objectives as per a specified plan. The interdependent components may be physical parts or managerial steps and are known as the subsystems of the system.

3.0 Characteristics of a System


As defined a system is an organized relationship of various subsystems or components
to achieve some objectives. For example a Computer System has a number of subsystems
like Keyboard, Monitor, CPU, Mouse, etc. A Computer system is just an organized
relationship of these subsystems in a planned way. These subsystems are also dependent on
each other to achieve set targets. Thus the definition of a system suggests the following
characteristics that are present in all systems.

1. Organization: This refers to structure and order. It is the arrangement of subsystems that helps to achieve the objectives.
2. Interaction: This refers to the manner in which each subsystem functions with other
components of the system.
3. Interdependence: This means that subsystems of a system are dependent on one
another. They are coordinated and linked together according to a plan.
4. Integration: This refers to the holistic view of the system. This is concerned with
how subsystems are tied together to achieve central objective.
5. Central Objective: Each system is developed in order to achieve centralized
objective. All subsystems are developed and integrated to achieve the centralized
objective keeping the unique identity of each subsystem.

4.0 Elements of a System


A system has following key elements:

1. Output and Inputs: One of the major objectives of a system is to produce an output
that has value to its user. The nature of output may be goods, services, information
etc. Inputs are the elements like material, manpower, information etc. that enter the
system for processing. Output is the outcome of processing.
2. Processor(s): The processor is the element of a system that involves the actual
transformation of input into output. It is the operational component of a system.
3. Control: The control element guides the system. It is the decision making
subsystem that controls the activities governing input, processing, and output.
4. Feedback: Control in a dynamic system is achieved by feedback. Feedback compares the output of a system against performance standards, and the resulting information is communicated to the system for necessary action. This may lead to changes in the input or processing and consequently in the output.
5. Environment: The environment is a super system within which an organization
operates. It is the source of external elements that affect the system. It often
determines how a system should work. For example the vendors, competitors, Govt.
policies, tax department etc. may provide constraints and consequently influence the
actual performance of the system.
6. Boundaries and Interface: A system is defined by its boundaries - the limits that identify its components, processes, and interrelationships when it interfaces with another system. For example, the Marketing & Sales section is concerned with the sale of milk and milk products, collection of sale amounts, and determining the demand for products in the near future. This section is not concerned with how the products are manufactured or what the manufacturing losses in producing them are.

5.0 Illustration of Business of a Dairy Plant as a System


Organizations are complex systems that consist of interrelated and intertwined
subsystems. Business of a dairy plant can be defined in system terms. The business setup of
a dairy plant consists of various components like Milk procurement and billing, Processing
of milk, Manufacturing of dairy products, Inventory of milk and milk products, Inventory of
engineering parts, Marketing and Sale, Quality Control of dairy products, Manufacturing
and handling losses, Accounting, HRD, R&D etc. All these components work together to
enhance profit and produce good quality of milk products based on set procedures and rules
of their interactions. It can be observed that each component (i.e. subsystem) stated above is
a complete system within itself, and these subsystems interact with each other to pass on the necessary information for the smooth functioning of each subsystem as well as of the whole
system. For example marketing section determines the demand of a particular dairy product
for next few days and this information is passed on to Production section to fulfill demand
in the market. Changes in one part of system may yield anticipated and unanticipated
consequences in other parts of the system. Thus the business of a dairy plant is a system
which receives resources (capital, people, plant, milk etc) and processes received milk and
produces the milk products.

6.0 Types of Systems


Systems have been classified in different ways. Common classifications are:

1. Physical and Abstract System


2. Deterministic and Probabilistic System
3. Open and Closed System
4. Information System

6.1 Physical and Abstract System

A physical system consists of tangible (i.e. physically available) entities that may be static or dynamic in the operation of the system. For example, the physical parts of a dairy plant are the equipment, employees, office establishment, furniture and buildings that facilitate operation of the plant. These parts remain more or less the same and can be seen and counted; hence they are static. In contrast, the daily receipt of milk, the quantity of products manufactured per day, the sale of products, etc. keep on changing as per need; hence these are dynamic.

Abstract systems are conceptual or non-physical entities. They may be as straightforward as formulas expressing relationships among a set of variables, or models - abstract conceptualizations of physical situations. A model is a representation of a real or a planned system.

6.2 Deterministic and Probabilistic System

A deterministic system is one in which the occurrence of all events is perfectly predictable. Given the description of the system state at a particular time and of its operation, the next state can be perfectly predicted. For example, numerically controlled machines are deterministic in nature.

A probabilistic system is one in which the occurrence of events cannot be perfectly predicted. Examples of such systems are the status of items in inventory, the occurrence of telephone calls in a particular time period, etc.

6.3 Open and Closed System

Open systems are those which interact with their environment and have many interfaces with the outer world. An open system permits interaction across its boundaries and receives inputs from and provides outputs to the outside. These systems are usually adaptive, i.e. their interaction with the environment is such as to favour their continued existence. For example, information systems must adapt to the changing demands of their users. A human being is also an example of an open system.

Closed systems are isolated from environmental influences, i.e. they do not interact with their environment and do not adapt to changes. Such systems are rare, but relatively closed systems are common. A relatively closed system is one which controls its input and so is protected from environmental disturbances. For example, a computer program is a relatively closed system which processes predefined inputs in a predefined way. Closed or relatively closed systems find it difficult to survive for long, since they do not interact with their changing environment and will eventually decline.

6.4 Information System

Information regarding a subject helps in taking decisions with higher confidence since it reduces uncertainty about a state or event. It provides instructions, commands, and
feedback. It determines the nature of relationships among decision makers. An information
system is the basis for interaction between users at different levels. An information system
can be defined as a set of devices, procedures, and operating systems designed around user
based criteria to produce information and communicate it to the user for planning, control,
and performance.

In order to work as an effective unit, the business has to make use of information. An
information system is often regarded as another subsystem. Information system provides
information for decision and control and acts as linking mechanism between the functional
subsystems. The major information systems are formal, informal, and computer based.

1. Formal Information System: A formal information system is concerned with the communication flow and work flow on the basis of authority or responsibility at
different levels within an organization from top to bottom level. Information is
formally disseminated in form of instructions, memos, or reports from top
management to the intended user in the organization. The formal information is very
useful in achieving organization objectives and better control within an organization.
2. Informal Information System: Informal information system is concerned with the
personal needs and support related needs of an employee. It is an employee based
system designed to meet personnel and professional needs and to help solve work
related problems. It also funnels information upward through indirect channels. Thus
it is a useful system because it works within the framework of the business and its
stated policies.
3. Computer based System: The computer based information system uses the strength
and capabilities of a computer system for handling business applications to produce
information. A computer based system has the capability for storing large volume of
data, processing the data efficiently, doing complex computations very fast,
producing required information instantaneously, maintaining up to date online
information and providing the same to decision makers in real time and interacting
with the public databases globally available. The computer based information
systems can be further classified into following broad categories based on nature of
applications.

a. Transaction Processing System (TPS)


b. Management Information System (MIS)
c. Decision Support System (DSS)
d. Internet based Information System

7.0 System Analysis
System analysis is a process of fact gathering of the present system, diagnosing
problems, analyzing the business requirements and recommending improvements based on
information gathered. System analysis creates a background and base for system design in
order to achieve system objectives. Various components involved in system analysis are
project initiation, preliminary investigation (fact gathering), feasibility analysis and input/
output report etc. Finally system analysis brings out what a system should do.

8.0 System Design


After drafting out the complete requirements of a system in detail during the system
analysis phase, system design focuses on how to accomplish the same in order to achieve
system objectives. System design is a process of translating the user oriented functional
specifications to technical design specifications or from logical design to physical design.
The physical design process concentrates on how the data will be entered, how the files are
designed and organized, what will be format or layout of input/ output forms, how the
reports will be generated, etc.

9.0 System Analyst


The dictionary meaning of an analyst is "a person who conducts a methodical study and evaluation of an activity such as a business to identify its desired objectives in order to determine procedures by which these objectives can be gained". A similar definition is offered by Nichols: "The task of the system analyst is to elicit needs and resource constraints and to translate these into a viable operation" [1]. Designing and implementing systems to suit
organizational needs are the functions of the system analyst. System analyst plays a vital
role in analysis and design of a system or an application in any business/ scientific
environment.

The analyst needs to identify problems by studying the existing system; collect all
background information of the organization; study the historical background of the
establishment; establish business and system objectives; gather all systems requirements
and their specifications keeping in view the future expansion and other business
requirements; find out various alternative feasible solutions and design suiting to
organization resources; identify hardware and software requirements and their selection
based on design recommended; design all input, output and procedures; develop the entire
system, test and implement it, and carry out the process till system becomes workable
solution in accordance with the system objectives.

Therefore the system analyst must have a number of qualities and skills to play the roles of investigator, change agent, psychologist, architect, IT expert, trainer, motivator, etc. To meet this challenge an analyst must have the following qualities and skills:

- Communication skill
- Quick assimilation and sharpness in understanding
- Foresightedness and vision
- Patience and rationality
- Personality reading
- Project management skills
- Leadership qualities
- Selling/marketing skills
- Training and documentation capabilities
- IT knowledge
- Creativity
- Technical capability of questioning

[1] John M. Nichols, "Transactional Analysis for Systems Professionals", Journal of Systems Management, October 1978, p. 6.

10.0 System Development Approaches


Since the advent of computers, the approach to developing computerized systems has changed dynamically along with changes in software development technology. Software technology has moved from modular to structured and now to object-oriented technology. The major approaches for system development are as follows:

1. System Development Life Cycle method


2. Structured Analysis Development method
3. Prototype method
4. Object oriented Development method

The most prevalent and important approach for system analysis and design is the System Development Life Cycle (SDLC) method. Though it is a traditional method, it still holds good within any methodology or approach, as it is a generic framework for system development. This approach monitors and controls system development from fact gathering through design, implementation and maintenance in a cyclic way, as a continuous process. Therefore we will focus more on this approach.

11.0 System Development Life Cycle (SDLC)


The system development process has a life cycle just like a living system: systems are conceived, designed, developed and maintained. Over a period of time, the system analyst makes a number of changes in the existing computer system to accommodate new user requirements and technological developments, which gives a totally different look to the system that was developed in the beginning. This shows that systems are like living organisms: they come into existence and after some time they die (i.e. are replaced with new ones). Systems analysis and design are important parts of the system development life cycle. For computer professionals the SDLC is a systematic approach to developing a computer application that solves a business or scientific problem in order to fulfill the needs of the organization or customer for whom the application is being developed. The SDLC helps the system analyst to monitor progress while developing the application. It also helps them to manage, plan and control the whole set of activities during development of an application.

The different activities that are carried out in the SDLC, in order to follow a systematic approach to solving a problem, are grouped into different phases as described below, along with the key questions to be answered in each phase/activity and the output of each activity.

1 System Study and Analysis Phase

i. Need recognition
   Activities: Preliminary survey/initial investigation.
   Key questions: What is the problem or opportunity? Why is a computerized system required?
   Results: Statement of scope and objectives; performance criteria.

ii. Feasibility study
   Activities: Evaluation of the existing system and procedures; analysis of alternative candidate systems; cost estimates.
   Key questions: What are the user's demonstrable needs? Is the problem worth solving? How can the problem be redefined? Will the solution be cost effective, technically feasible and operationally viable?
   Results: Economic, technical and operational feasibility; cost-benefit analysis; system scope and objectives; statement of new scope and objectives.

iii. Detailed system study
   Activities: Study the existing system in detail to understand its terminology, components, procedures, workflow and data flow, and the relationships between components; data collection.
   Key questions: What must be done to solve the problem? What are the facts?
   Results: Statement showing the working of the existing system; logical model of the system (data flow diagrams, data dictionary, system flow chart, etc.); pertinent data; questionnaires/interviews for data collection.

iv. Input-output requirements
   Activities: Identify input and output requirements and their formats.
   Key questions: How does the system interact with the outer environment? How do the different components interact with each other?
   Results: More suitable new input and output formats.

v. Iteration for improvement
   Activities: The above activities are repeated.
   Key questions: How can the system be further improved?
   Results: Final result of the system study and analysis phase.

2 System Design Phase

i. General design specifications
   Activities: Translation of logical design to physical design; selection of hardware and software.
   Key questions: How must the problem be solved? What is the new system processing flow?
   Results: Physical design of the system; flow of information into and out of the system; implementation plan; hardware and software specifications.

ii. Input-output design
   Activities: Identify inputs to the system; identify outputs of the system.
   Key questions: How will the reports be generated? How will the data be entered? How will the data be validated? What will be the formats or layouts of input and output?
   Results: New formats for output reports; new formats/screens for data entry forms; types of data validation checks used during data entry; devices and methods to be used for generating output and for data entry.

iii. File/database design
   Activities: Storage of input data and generated data.
   Key questions: How to design an efficient database? How will the data be stored in data files using the concepts of normalization? What will be the structure of the data files? How will the data files be related to each other?
   Results: Logical design of the database; schema and sub-schema of the database; classification of data files; identification of the primary key for each data file; creation of relationships between data files.

iv. Code design
   Activities: Designing of codes for input and output data to identify and retrieve records uniquely in files.
   Key questions: How to design an efficient coding pattern?
   Results: List of codes for the different data items for input and output.

v. Program logic design
   Activities: Identify the procedures of the system to be designed and developed for processing input/output data.
   Key questions: Does the user approve the system? How to develop the logic of programs using different tools?
   Results: Algorithms, flow charts and pseudo code of programs for data entry, output reports, menu design, intermediate data processing, etc.

3 System Development Phase

i. Database creation
   Activities: Create the designed database using a DBMS or some other file creation method.
   Key questions: Which DBMS should be selected to meet the organization's requirements?
   Results: Physical creation of the database using the logical design.

ii. Program writing (coding)
   Activities: Writing of programs using SQL or other programming languages like VC++, C++, VB, Java, VB.NET, etc.
   Key questions: Which programming language should be used for writing the programs?
   Results: Programs for solving problems, data entry, report generation, menus, etc.

iii. Testing
   Activities: Program testing; module testing; system testing; complete quality and performance testing.
   Key questions: How well do individual programs/modules test out? How ready are the programs for the acceptance test?
   Results: Test plans; formal system tests (α and β tests); performance of the software with respect to the hardware.

4 System Implementation Phase

i. File/system conversion (installation)
   Activities: Software is installed at the client site; data is entered in files using the new software, or existing data is converted into the file/database structure of the new system.
   Key questions: What is the actual operation? Are there delays in loading/converting data files?
   Results: Plan to change over to the new system (at once / parallel / location-wise / modular); data is entered in new files/database.

ii. User training
   Activities: Train the end users in the operation of the software developed; give executive staff exposure to the software and its capabilities.
   Key questions: What to teach the users? How to train the staff?
   Results: Plan for training the staff (brainstorming sessions, seminars, operational, awareness, on-the-job trainings, etc.).

iii. Documentation
   Activities: Prepare different kinds of manuals with respect to the level of usage.
   Key questions: Are the manuals ready?
   Results: User manual for beginners; technical manual; system manual.

5 Post-implementation, Maintenance and Review Phase

i. Evaluation
   Activities: The implemented system is evaluated by the end users.
   Key questions: Is the key system running?
   Results: User requirements and standards are met.

ii. Maintenance
   Activities: Collect feedback from users about the system; focus on changes associated with error correction, exceptional situations, performance of the system, backups and recovery of data files, etc.
   Key questions: Should the system be modified?
   Results: System updated as per the requirements/suggestions; satisfied users; backup of data.

iii. Enhancement
   Activities: Addition of new plans/modules/schemes due to changes in user priorities, environmental factors, organizational requirements, etc.
   Key questions: How to incorporate new plans/modules into the existing system?
   Results: The steps involved in system analysis, design and implementation are repeated to add the new modules.

12.0 System Analysis and Design Tools


The information collected by the System Analyst (SA) during system analysis phase
has to be documented for further discussions and reviews to ensure the correctness of
information. The document prepared by SA should allow others to understand the existing
system, its components and their interrelationship. In order to make the document better
understandable some graphical methods of representation known as structured tools are
used by the SA. These tools not only help others to understand the system accurately and
easily but also help the SA to continue the system analysis phase in a structured manner.
Before discussing the structured tools, let us first have a look at the traditional method of showing the flow of data in the system: the system flow chart.

12.1 System Flow Chart:

This is a graphical/pictorial representation of the system details related to an application, showing the flow of data in the system through the flow chart method. A system flow chart shows the chronological order of the various processes, along with the data coming into and going out of each process, to achieve the final meaningful results of the system. Some of the important symbols used in drawing a system flow chart are given below:

[Figure: standard system flow chart symbols - punch card, manual input, document, process, off-page connector, on-line display, hard disk, sorting, merge, manual operation, on-line storage, on-page connector and flow line.]

Example: The figure below shows, in brief, a system flow chart for a milk receipt and billing system.

[Figure: System flow chart for milk receipt and billing - receipt of milk and expenditure details are captured through on-line data entry with validation checks into a milk receipt file; the milk receipt data is then processed to generate bills for farmers/societies and reports for managers.]

12.2 Structured Tools:

Besides the system flow chart, which is useful for drawing the flow of data through different processes and procedures, a number of other tools are also used to depict information flow. Some of the structured tools used in system analysis, as well as for preparation of the document, are as follows:

1. Data Flow Diagram (DFD)


2. Decision Tree
3. Decision Table
4. Data Dictionary

12.3 Data Flow Diagram (DFD)

DFD is a graphical or pictorial representation of flow of data. This tool describes the
movement or flow of data within a system from initiation up till end regardless of whether
the system is manual or automated. It models a system by using external entities from which
data flows to a process which transforms the data and creates output data flow which go to
other processes or external entities or files. Data in files may also flow to processes as
inputs. The main advantage of a DFD is that it can provide an overview of what data a system would process, what transformations of data are done, what files are used and where the results flow. A DFD is also known as a bubble chart, as it consists of a series of bubbles joined by lines. A DFD consists of the following four pictorial symbols:

a. External Entity: An external entity is a source or destination of data. It can be files, departments, persons, vendors, customers, etc.
b. Process: Process is the people or procedures involved in the system that are used to
transform data.
c. Data store: Data stores are used to show the data stored by a process within a
system like files, registers, documents, reports, vouchers, etc.
d. Data flow: Data flow shows the flow of data in the process and out of process in a
DFD from origin to destination.

[Figure: DFD symbols used for external entity, process, data store and data flow.]
Procedure for drawing a DFD:

Step 1: Represent the whole system by a single process and identify its inputs and outputs.
This is known as context level DFD or zero level DFD.

Step 2: Identify major processes of the system and draw DFD considering these major
processes only and also identify the inputs and outputs for each and every major
process. This is known as top level DFD or first level DFD.
Step 3: After identification of the major processes in the first level DFD, the analyst has to identify those processes from the first level DFD which can be further expanded. The analyst then draws the DFD of each process identified for expansion. This is known as an exploded DFD or expanded DFD.

Example: The figure below shows a context level DFD for a milk procurement and billing system (MPBS).

[Figure: Context level (zero level) DFD of MPBS - external entities (Management, Procurement Section, Society/Collection Centres, and Accounts & Bill Section) exchange milk receipt details, procurement and expenditure details, payment details, technical input (TIA) requirements, amounts collected for the sale of subsidized inputs, and reports on milk received and procurement cost with the central MPBS process, which stores its data in different files.]

12.4 Decision Tree

It is a graphical representation for describing logical rules where decisions are to be taken depending on the situation as per the rules. It describes all the actions that result from various combinations of conditions as per the logical rules. The graphical representation of conditions and outcomes resembles the branches of a tree.

Example: The following example shows the subsidy given on technical inputs to farmers by dairy plants for promotion of dairy in the area.

Subsidy on Technical Inputs:
- Regular members, more than 5 years of membership: 20% subsidy
- Regular members, 1 to 5 years of membership: 10% subsidy
- Occasional members: 5% subsidy
- Non-members: no subsidy
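The same rules translate directly into conditional logic. The sketch below is a minimal, hypothetical Python rendering of this decision tree; the function name and the way membership is encoded are illustrative assumptions, not part of any actual dairy-plant system.

def subsidy_percent(member_type, years_of_membership=0):
    """Return the subsidy (%) on technical inputs as per the decision tree above."""
    if member_type == "regular":
        # Regular members: subsidy depends on the length of membership
        return 20 if years_of_membership > 5 else 10
    elif member_type == "occasional":
        return 5
    else:
        # Non-members get no subsidy
        return 0

print(subsidy_percent("regular", 7))   # 20
print(subsidy_percent("regular", 3))   # 10
print(subsidy_percent("non-member"))   # 0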

12.5 Decision Table

A decision table describes all the conditions, actions and decision rules for initiating actions on the occurrence of various conditions. It presents the same information as the decision tree, but in tabular form, showing conditions and actions in a simplified and orderly manner. It is divided into two vertical parts known as the stub (left part) and the entry (right part), as shown in the figure given below. The stub part is further divided into two parts (upper and lower) as described under:

a. Condition Stub (upper part): All conditions are written in this area.
b. Action Stub (lower part): All actions are written in this area as per the requirement.

Similarly entry part is further divided into two parts (lower and upper) as described
under:

a. Condition Entry (upper part): In this part all possible combinations of conditions are marked either 'Y' or 'N', where 'Y' stands for a true condition and 'N' for a false one.
b. Action Entry (lower part): In this part the action to be taken for each combination of conditions specified in the condition entry is shown by marking 'X' against that action.

[Figure: layout of a decision table - the condition stub (left) and condition entry (right) form the upper half; the action stub (left) and action entry (right) form the lower half.]

Example: The following example shows a decision table to determine the rates of fresh milk received from farmers.

Conditions                              Rule 1   Rule 2   Rule 3
Fat% and SNF% within specified range      Y        N        N
Low Fat% and SNF%                         -        Y        N
High Fat% and SNF%                        -        -        Y
Actions
Normal rate                               X        -        X
Penalty                                   -        X        -
Incentive                                 -        -        X
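A decision table maps naturally onto an ordered list of rules in code. The following is a minimal, hypothetical Python rendering of the milk-rate table above; the boolean condition entries are passed in directly, since the actual Fat%/SNF% ranges are not specified in the table.

def milk_rate_actions(fat_snf_in_range, fat_snf_low, fat_snf_high):
    """Evaluate the decision rules of the milk-rate decision table.

    Each argument is a condition entry (True for 'Y', False for 'N');
    the function returns the list of actions marked 'X' for the matching rule.
    """
    if fat_snf_in_range:                   # Rule 1
        return ["Normal rate"]
    if fat_snf_low:                        # Rule 2
        return ["Penalty"]
    if fat_snf_high:                       # Rule 3
        return ["Normal rate", "Incentive"]
    return []

print(milk_rate_actions(True, False, False))   # ['Normal rate']
print(milk_rate_actions(False, True, False))   # ['Penalty']
print(milk_rate_actions(False, False, True))   # ['Normal rate', 'Incentive']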

12.6 Data Dictionary

It is the description of all the data elements, data stores or data structures, where a data structure consists of a group of data elements and a data element is the simplest unit of data, which cannot be further decomposed. For example, an employee name may consist of three parts: first name, middle name and surname. Thus employee name is a data structure and its parts are data elements. Data stores are also considered as data structures.
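As a small illustration, the employee-name example can be written as a data structure in code together with its data-dictionary entry; the field names below are assumptions chosen to mirror the text, not part of any real system.

from dataclasses import dataclass

@dataclass
class EmployeeName:
    """Data structure 'employee name' composed of three data elements."""
    first_name: str    # data element - cannot be decomposed further
    middle_name: str   # data element
    surname: str       # data element

# A data-dictionary entry describing the structure and its elements
data_dictionary = {
    "employee_name": {
        "type": "data structure",
        "elements": ["first_name", "middle_name", "surname"],
        "description": "Full name of an employee",
    }
}

print(EmployeeName("Anand", "Prakash", "Ruhil"))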

APPLICATIONS OF BIOSENSORS FOR FOOD & DAIRY
INDUSTRIES

Neelam Verma
Department of Biotechnology
Punjabi University, Patiala-147 002.

1.0 Introduction:
Biosensors are attracting the attention of many investigators in the field of analytical biotechnology. They are gaining importance and popularity over conventional analytical techniques because of their specificity, low cost, fast response time, portability, ease of use and continuous real-time signal, while conventional methods are costly, laboratory-bound and need trained personnel. A biosensor is an analytical tool or system consisting of an immobilized biological material in intimate contact with a suitable transducer that converts the biochemical signal into a quantifiable electrical signal. The concept of the biosensor was pioneered by Clark and Lyons in 1962, who proposed that an enzyme could be immobilized at an electrical detector to form an enzyme electrode. This first enzyme electrode was devised for monitoring the glucose level in blood using glucose oxidase.
A typical biosensor thus consists of two parts: a biological component and a transducer. The biological component may be enzymes, whole cells (bacterial, fungal, animal or plant), organelles, tissues, receptors, antibodies, nucleic acids, etc. The specificity of enzymes is the main reason for their use in biosensors. Furthermore, as the structure of enzymes becomes better defined, it is possible to tailor enzymes that can function under stressed environments for a long period of time. Microorganisms or tissues are useful because the enzymes in them remain in their natural environment, which increases stability and activity and eliminates costly enzyme purification procedures. Various immobilization procedures have been used in biosensor construction. In general, the choice of procedure depends on the nature of the biological element, the type of transducer used, the physico-chemical properties of the analyte and the operating conditions in which the biosensor is to function.
The various immobilization methods used are - binding: crosslinking, carrier binding (physical adsorption, ionic binding, metal binding, covalent binding); and entrapping: gel entrapping, fiber entrapping, microencapsulation.
Transducer: It converts the biochemical signal into an electrical signal. Several types of
transducers have been used in biosensor construction.

1. Amperometric transducer: At set voltage, current flows depending upon the


concentration of an analyte. This type of transducer is used to monitor oxygen or
hydrogen peroxide for the construction of biosensor for glucose measurement in food
samples.
2. Ion-selective electrode (ISE) / pH electrode: Here the potential varies with the analyte concentration, e.g. Na+, K+, NH4+. Urea can be determined by immobilizing urease in close proximity to an ISE in synthetic milk.

3. Gas sensing electrode: Here the potential varies with the gas content; e.g., the carbon dioxide content can be measured. This type of transducer is suitable for determining microbial cell counts based on respiration rate studies.
4. Photomultiplier with fiber optics: It measures the amount of light emitted and can be
used to determine ATP concentration by immobilizing luciferase on fiber optics.
5. Photomultiplier with photodiode: It measures the amount of light absorbed. The indicator may change color because of a change in pH. The amount of penicillin can be monitored by this transducer using the penicillinase enzyme.
6. Thermistor: It measures the heat content and is useful for studies of exothermic reactions. Its use is almost universal. Heavy metal ions, pesticides and ethanol can be monitored by a thermistor.
7. Piezoelectric crystal: It measures oscillation frequency depending upon the mass
absorbed on piezoelectric crystal. It is useful to construct immunosensors to study
antigen-antibody interaction.
8. Field effect transistor (FET): It is based on microelectronic device and is very useful
for miniaturization of biosensors.

Hence, for the construction of a successful biosensor, the following requirements are important:
1. Characterization of bioassay principle
2. Characterization of sensor (transducer) to match the bioassay principle.

2.0 Applications of Biosensors:


2.1 Biosensor for quality control in milk:

The food industry needs suitable analytical methods for quality control, that is, methods that are rapid, reliable, specific and cost effective, as current wet chemistries and analytical practices are time consuming and may require highly skilled labour and expensive equipment. The need arises from heightened consumer concern about food composition and safety. The present study was carried out keeping in view the recently emerging concern about the presence of urea in milk, so-called "synthetic milk". This urea biosensor consists of immobilized urease-yielding bacterial biomass (Bacillus sphaericus isolated from soil) coupled to the ammonium ion-selective electrode of a potentiometric transducer. Samples of milk were collected and analyzed for the presence of urea by the developed biosensor, with a response time as low as 2 min. The results were in good correlation with the pure enzyme system. However, it is worth mentioning that since milk is a complex system it contains many interfering substances, which make conventional methods less reliable (1).

2.2 Biosensor for determination of glucose and sucrose in food samples:

Development of the biosensor for glucose estimation is based on the principle of oxidation of glucose by glucose oxidase isolated and purified from Aspergillus niger MTCC 281. In the oxidation process the FAD component of the enzyme gets reduced to FADH2, which is then reoxidized by molecular oxygen. There is therefore a fall in dissolved oxygen (DO), which is measured by a DO probe using a Clark electrode. The biocomponent consists of GOX immobilized in calcium alginate beads confined to a nylon membrane and coupled to the Clark electrode. The response time for detection of glucose has been optimized to two minutes. It was observed that the DO fall increases with increasing glucose concentration, and the linear range of detection by the developed biosensor is 25 ppb to 200 ppb. The bioprobe was fully active for three days and can be used for more than 16 cycles. The developed biosensor was applied for estimation of glucose in honey and grape juice and has 83.3% reliability (2).
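In practice, readings from such a probe are converted to concentration through a calibration curve fitted over the sensor's linear range. The sketch below shows a generic straight-line calibration in Python; the numbers are invented for illustration and are not data from the reported biosensor.

import numpy as np

# Hypothetical calibration data: glucose standards (ppb) vs. observed DO fall (mg/L)
glucose_ppb = np.array([25, 50, 100, 150, 200])
do_fall = np.array([0.11, 0.21, 0.40, 0.62, 0.81])     # illustrative values only

# Fit a straight line: DO fall = slope * concentration + intercept
slope, intercept = np.polyfit(glucose_ppb, do_fall, 1)

def glucose_from_do_fall(measured_do_fall):
    """Invert the calibration line to estimate glucose concentration (ppb)."""
    return (measured_do_fall - intercept) / slope

print(round(glucose_from_do_fall(0.50), 1))  # estimated ppb for a sample reading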

Saccharomyces cerevisiae cells containing invertase, co-immobilized with glucose oxidase, have been employed for the construction of a bienzyme biosensor for monitoring sucrose in food samples. The transducers used are a thermistor and a DO probe. Sucrose concentration was determined in cold drinks, juices, jams and honey. The biosensor probes can be stored in buffer of pH 5.5 and can be reused. The stability of the probes was found to be two months. The linear range of detection is 0.01-1.0 mM with a response time of 15 min for the thermistor, and 0.05-0.5 mM with a response time of 10 min for the DO probe (3).

2.3 BOD Biosensor for monitoring dairy industry effluents:

A microbial electrode biosensor consisting of immobilized viable whole cells, a nylon membrane and a DO probe has been developed for the estimation of biochemical oxygen demand (BOD). Saccharomyces cerevisiae, Bacillus sp. and bacterial isolates from dairy effluent were employed for the microbial electrode sensor for BOD. When a sample solution containing equivalent amounts of glucose and glutamic acid was injected into the sensor system, the DO decreased markedly with time until a steady state was reached. The DO value was determined by the DO probe and confirmed by the Winkler method. The response time was found to be 12 min. A linear relationship was observed between the DO decrease and the concentration below 30 mg/L of glucose and glutamic acid. The DO decrease was reproducible within ±4.0-6.6% relative error when a sample solution containing 15 mg/L of glucose and glutamic acid was employed. The microbial electrode sensor was applied to monitor BOD in dairy effluent. Good comparative results were obtained between the BOD estimated by the microbial electrode and that determined by the conventional 5-day method (4).

2.4 Biosensor for on line monitoring of ethanol in fermentation process of dairy effluent:

Isolated and purified alcohol dehydrogenase (ADH) from Saccharomyces cerevisiae MTCC 249 was immobilized in calcium alginate beads wrapped in a nylon membrane and coupled to an RTD probe or a pH probe. The response time for estimation of ethanol was found to be 20 and 30 minutes respectively. The linear range of detection of ethanol using the RTD probe (5) and the pH probe (6) is 1 nM to 1 M and 0.1 M to 15 M respectively. On-line ethanol production was monitored in the fermentation process of dairy effluent. The reliability of the developed biosensor was checked by comparison with conventional methods, and the results obtained were in good agreement with them. Hence the developed biosensor could be successfully employed for monitoring ethanol production in fermented broths. The biocomponent can be stored at 4°C and reused. Further studies can be made on the determination of ethanol concentration in a sample using pH strips coated with ADH, so that a portable, cost-effective method can be developed for monitoring ethanol production in fermentation processes.

2.5 Microbial and enzyme based biosensors for monitoring heavy metal ions in food
samples for quality control:

Heavy metal contamination is of serious concern to human health since these substances are non-biodegradable and are retained by the ecological system. Conventional analytical techniques for heavy metals, such as cold vapour atomic absorption spectrometry and inductively coupled plasma mass spectrometry, are precise but very costly. The analysis of heavy metal ions can be carried out with biosensors using both protein (enzyme, metal binding protein and antibody)-based and whole-cell (natural and genetically engineered microorganism)-based approaches (7).
A yeast biosensor has been developed for the analysis of mercury and silver ions using a polarimeter and a thermistor. The immobilized yeast invertase furnished better kinetic characteristics than the free enzyme. The linear ranges of detection are 0.0125-0.1 ppb for mercury and 0.1-0.6 ppb for silver with the polarimeter, and 0.05-0.2 ppb for mercury using the thermistor. The biosensor responds specifically to mercury at pH 4.0 and to silver in the presence of the masking agent DPTA. The reliability of the developed biosensor has been found to be 92.5% and 100% respectively for mercury and silver ions. For food analysis, the reliability of the mercury biosensor is 96.4%. The biocomponent is stable up to 12 days, and its ability to regenerate allows it to be reused (8,9).
An enzyme biosensor using urease has been developed for the analysis of copper ions using an ammonium ion-selective electrode. The biosensor has a linear range of 0.05-0.25 ppb and a response time of 8 minutes, with a reliability of 84.6% for food sample analysis (10).

A microbial biosensor has been developed from isolated Bacillus sphaericus for the analysis of copper, nickel and lead ions. The relatively simple and cost-effective technique of physical adsorption for immobilization allowed full retention of enzyme activity. The linear ranges of detection are 0.01-0.4, 0.002-0.04 and 4.0-24.0 ppb for copper, nickel and lead ions respectively. It is observed that inhibition by the ions is instantaneous and reversible; hence response times of 30 seconds for copper ions and 1.5 min for nickel and lead ions have been optimized. Copper ions have been analyzed in meat, liver and almonds, and nickel ions in wheat flour, after complexing with specific reagents, by the developed biosensors (11,12).

2.6 Enzyme based biosensor for monitoring malathion pesticide residues :

Direct, fast and easy determination of various insecticides (organophosphorus and carbamate) and herbicides has been achieved by integrating various biocomponents with different transducers. For the construction of biosensors, the bioassay principle, the effect of solvents and the immobilization techniques used, and the compatibility of the bioassay principle with the transducer are important. The close integration of the biological events with the generation of a signal offers the potential for fabricating compact and easy-to-use analytical tools of high sensitivity and specificity. Their biological basis makes them ideal for toxicological measurements of pesticides, while conventional techniques can only measure concentration. Screening of a particular source for the biological component can be helpful in designing a specific biosensor (13).
A biosensor has been developed for monitoring malathion residues. Acetylcholinesterase, an enzyme found in blood, is specific for organophosphorus pesticides, and chlorinated pesticides do not interfere. The enzyme was immobilized in calcium alginate beads and coupled to an RTD probe. Less than 1.0 ppm of malathion can be detected by the developed biosensor (14).
A carbon paste electrode was used for the construction of an electrochemical biosensor for monitoring malathion in soil, spiked water samples and vegetables. The linear range of detection is 5-100 ppb with a response time of 8 min (15).
Hence, biosensors have potential for on-line, in-line and off-line monitoring of analytes in the food and dairy industries.

3.0 References:
1. Verma, N. and Singh, M. (2003). A disposable microbial based biosensor for quality control in milk. Biosensors and Bioelectronics, 18: 1219-1224.
2. Verma, N. and Singh, J. (2004). Studies on the microbial production, enhancement, purification, kinetic characterization and immobilization of glucose oxidase and its applications as biosensor. Ph.D. Thesis.
3. Verma, N. and Gupta, M. (2001). Development of bienzyme biosensor for determination of sucrose in food samples. M.Sc. Project Report.
4. Verma, N. and Smriti (1998). Studies on the development of BOD biosensor for monitoring paper pulp and dairy effluents. Abstract, Pittcon 98, 328. New Orleans, USA.
5. Verma, N. and Hora, B. (1998). Isolation and purification of alcohol dehydrogenase from Saccharomyces cerevisiae and its application as biosensor for monitoring ethanol production from dairy effluent. M.Sc. Project Report.
6. Verma, N. and Khosla, B. (1999). Studies on the development of biosensor for ethanol monitoring using alcohol dehydrogenase from Saccharomyces cerevisiae. M.Sc. Project Report.
7. Verma, N. and Singh, M. (2005). Biosensors for heavy metals. BioMetals, 18: 121-129.
8. Verma, N., Singh, M. and Dhillon, S. (2000). Studies on the development of biosensor for monitoring silver and mercury ions from industrial effluent. Accepted for On-Site Analysis 2000 - Lab Comes to the Field, Las Vegas, NV, USA, January 2000.
9. Verma, N. and Singh, M. (2005). Development of a yeast biosensor for determination of silver ions in industrial effluents. Intern. J. Environ. Studies, 62(1): 3-3.
10. Verma, N., Singh, M. and Kumar, V. (2005). Development of enzyme biosensor for the monitoring of copper ions in industrial effluent and food samples. Chem. Environ. Res. (accepted).
11. Verma, N. and Singh, M. Patent accepted: Rapid and disposable microbial biosensor for determination of copper in electroplating industrial effluents. Application No. 822/Del/2002.
12. Verma, N. and Singh, M. (2005). A Bacillus sphaericus based biosensor for monitoring nickel ions in industrial effluents and foods. The Pittsburgh Conference - Pittcon 2005, Orlando, Florida, USA.
13. Verma, N. and Dhillon, S.S. (2003). Biosensor for monitoring insecticides and herbicides - a survey. Intern. J. Environ. Studies, 60: 29-43.
14. Verma, N. and Malaku, E.T. (2001). Studies on the development of disposable biosensor for monitoring malathion pesticide residues. In: Biochemistry - Environment and Agriculture. Eds. A.P.S. Mann, S.K. Munshi, A.K. Gupta. Kalyani Publishers. pp 265-269.
15. Verma, N. and Dhillon, S.S. (2004). Development of biosensor for monitoring malathion and 2,4-dichlorophenoxy acetic acid in foodstuffs, soil and water samples. Ph.D. Thesis.

OPTIMIZATION TECHNIQUES IN FOOD AND DAIRY
PROCESSES

R. K. Sharma
School of Mathematics and Computer Applications
TIET, Patiala-147 004, (Punjab)

1.0 Optimization Techniques - Historical Perspective


The subject of optimization techniques has its roots in the study of linear inequalities,
which can be traced as far back as 1826 to the work of Fourier. Since then, many mathematicians have contributed to the development of the subject. The applied side of the subject got its start in 1939 when L.V. Kantorovich noted the practical importance of a certain class of linear programming problems and gave an algorithm for their solution. Unfortunately, for several years, Kantorovich's work was unknown in the West and unnoticed in the East. The subject really took off in 1947 when G.B. Dantzig invented the simplex method for solving the linear programming problems that arose in U.S. Air Force planning problems. The earliest published accounts of Dantzig's work appeared in 1951, and his monograph published in 1963 remains an important reference.
In the same year that Dantzig invented the simplex method, T.C. Koopmans showed that linear programming provided the appropriate model for the analysis of classical economic theories. In
programming provided the appropriate model for the analysis of classical economic theories. In
1975, the Royal Swedish Academy of Sciences awarded the Nobel Prize in economic science
to L.V. Kantorovich and T.C. Koopmans for their contributions to the theory of optimum
allocation of resources. The textbooks by Bradley et al., Bazaraa et al., Hillier and Lieberman,
and H. A. Taha are known for their extensive collections of interesting practical applications.

2.0 Optimization Models


A number of problems arising in science and engineering, including food and dairy science, can be solved using different optimization models. One has to first formulate the problem within the framework of a model and then solve it with the help of a suitable procedure. Putting real-life problems into a model remains a challenging task in the theory of optimization models. We have several optimization models that can be used to solve real-life problems. Some of the models are:

1. Linear Programming Models


2. Transportation and Assignment Models
3. Network Models: Minimal Spanning Tree, Shortest-Route Problem, Maximal Flow
Problem, CPM and PERT
4. Integer Programming Models
5. Dynamic Programming Models
6. Inventory Models
7. Queueing Models
8. Multi-Objective Programming Models
9. Non-Linear Programming Models and other models.

2.1 Linear Programming Models
In linear programming models, we have to decide the values of the variables in some optimal fashion. These variables are referred to as decision variables and are usually written as xj, j = 1, 2, ..., n. In linear programming, the objective is always to maximize or to minimize some linear function of these decision variables, say z = c1x1 + c2x2 + ... + cnxn.
This function is called the objective function. It often seems that real-world problems are most
naturally formulated as minimizations, but when discussing mathematics it is usually nicer to
work with maximization problems. We either minimize cost or maximize profit. Of course,
converting from one to the other is trivial. In addition to the objective function, we also have
the constraints. Some of these constraints might be very simple, such as the requirement that
some decision variable be nonnegative. Others may be complex involving other decisions. But
in all cases the constraints consist of either equality or an inequality associated with some linear
combination of the decision variables. We generally use the Simplex Method to solve the linear
programming models.
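As an illustration of how an objective function and constraints are passed to a simplex-type solver, the sketch below sets up a small, arbitrary maximization problem; it assumes the SciPy library is available.

from scipy.optimize import linprog

# Maximize z = 3*x1 + 5*x2 subject to x1 <= 4, 2*x2 <= 12, 3*x1 + 2*x2 <= 18, x1, x2 >= 0.
# linprog minimizes, so the objective coefficients are negated.
c = [-3, -5]
A_ub = [[1, 0],
        [0, 2],
        [3, 2]]
b_ub = [4, 12, 18]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # optimal decision variables and maximum value of z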

2.2 Transportation and Assignment Models


The transportation model is basically a linear program that can be solved by simplex
method, but its special structure allows the development of a solution procedure that is
computationally more efficient. The transportation model seeks to determine the transportation
plan for a single commodity from a number of sources to a number of destinations so that the
transportation cost is minimum.

The assignment model is used in the situations when a number of jobs, say, m are to be
assigned to a given number of machines, say, n and the assignment cost for a job when
assigned to a machine is known. The objective in such situations is to assign the jobs to the
machines, one job per machine, at the least cost. The special structure of assignment model
allows it to be solved by an efficient technique called the Hungarian method.
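A minimal sketch of the assignment model, assuming SciPy is available (its linear_sum_assignment routine implements a Hungarian-type algorithm); the cost matrix below is invented for illustration.

import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i][j] = hypothetical cost of assigning job i to machine j
cost = np.array([[9, 2, 7],
                 [6, 4, 3],
                 [5, 8, 1]])

rows, cols = linear_sum_assignment(cost)   # one job per machine, minimum total cost
for job, machine in zip(rows, cols):
    print(f"Job {job} -> Machine {machine} (cost {cost[job, machine]})")
print("Total cost:", cost[rows, cols].sum())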

2.3 Network Models


We have a number of network models that can be used in various situations. We may sometimes be interested in finding the shortest path between two nodes in a network; at other times we may be interested in finding the project completion time for a given schedule of a project, etc. What follows is a brief description of some of the network models.

2.3.1 Minimal Spanning Tree


This algorithm deals with linking the nodes of a network, directly or indirectly, using
the shortest length of connecting branches. This can be used in the situation when one wants to
establish the linkage between a number of nodes (villages, towns or cities) with minimum
connection cost.
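A minimal sketch of the idea, using Kruskal's algorithm on a hypothetical network of villages; the edge lengths below are invented for illustration.

def kruskal_mst(nodes, edges):
    """Return the edges of a minimum spanning tree.

    edges is a list of (length, node_a, node_b) tuples.
    """
    parent = {n: n for n in nodes}

    def find(n):                       # union-find with path halving
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    mst = []
    for length, a, b in sorted(edges):  # consider edges in order of increasing length
        ra, rb = find(a), find(b)
        if ra != rb:                    # adding this edge creates no cycle
            parent[ra] = rb
            mst.append((a, b, length))
    return mst

villages = ["A", "B", "C", "D"]
roads = [(4, "A", "B"), (1, "A", "C"), (3, "B", "C"), (2, "C", "D"), (5, "B", "D")]
print(kruskal_mst(villages, roads))     # [('A', 'C', 1), ('C', 'D', 2), ('B', 'C', 3)]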

2.3.2 Shortest-Route Problem


The shortest-route problem deals with finding the shortest route between a source and a destination in a transportation network. The equipment replacement problem and the most reliable route problem, among others, can also be solved using algorithms for the shortest-route problem. Dijkstra's and Floyd's algorithms are widely used to solve shortest-route problems. Dijkstra's algorithm is designed to find the shortest route between a source node and every other node in the network, while Floyd's algorithm is more general in the sense that it allows one to determine the shortest route between any two nodes in the network.
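A compact sketch of Dijkstra's algorithm using a priority queue; the small milk-distribution network below is hypothetical.

import heapq

def dijkstra(graph, source):
    """Shortest distance from source to every reachable node.

    graph maps each node to a list of (neighbour, distance) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                          # stale queue entry
        for neighbour, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist

network = {
    "Plant": [("Depot1", 4), ("Depot2", 2)],
    "Depot2": [("Depot1", 1), ("Market", 7)],
    "Depot1": [("Market", 5)],
}
print(dijkstra(network, "Plant"))   # {'Plant': 0, 'Depot1': 3, 'Depot2': 2, 'Market': 8}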

2.3.3 Maximal Flow Problem


Consider the flow of milk from some source to some destination through pipelines, where the capacity of each pipeline is given. We may be interested in calculating the maximum flow of milk that can take place through such a network of pipelines. The discussion of the maximal flow problem helps in answering such questions.
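A minimal sketch of the maximal flow computation, assuming the networkx package is available; the pipeline capacities below are invented for illustration.

import networkx as nx

# Hypothetical pipeline network: edge capacities in thousand litres of milk per hour
G = nx.DiGraph()
G.add_edge("Source", "A", capacity=10)
G.add_edge("Source", "B", capacity=5)
G.add_edge("A", "B", capacity=4)
G.add_edge("A", "Sink", capacity=6)
G.add_edge("B", "Sink", capacity=8)

flow_value, flow_per_edge = nx.maximum_flow(G, "Source", "Sink")
print(flow_value)   # maximum milk flow from source to sink (here 14)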

2.3.4 CPM and PERT


These are network-based methods designed to assist in planning, scheduling, and control of projects. A project here can be defined as a collection of interrelated activities, with each activity consuming time and resources. The CPM and PERT techniques provide analytic means for scheduling the activities. CPM deals with situations in which deterministic time estimates for the activity completion times are given, while PERT is used when the activity completion times cannot be given precisely. In PERT, we take three estimates of the activity completion time, and the β-distribution helps in finding the average activity completion time as well as its variance, which in turn help us find the project completion time and the probability of completing the project in a given time.
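For each activity, PERT takes an optimistic estimate a, a most likely estimate m and a pessimistic estimate b; under the β-distribution assumption the expected time is (a + 4m + b)/6 and the variance is ((b - a)/6)². A minimal sketch, with invented estimates for the activities on a critical path:

def pert_estimates(a, m, b):
    """Expected activity time and variance under the PERT beta-distribution assumption."""
    expected = (a + 4 * m + b) / 6
    variance = ((b - a) / 6) ** 2
    return expected, variance

# Hypothetical critical-path activities: (optimistic, most likely, pessimistic) in days
critical_path = [(2, 4, 6), (3, 5, 9), (1, 2, 3)]

estimates = [pert_estimates(a, m, b) for a, m, b in critical_path]
project_time = sum(t for t, _ in estimates)   # expected project completion time
project_var = sum(v for _, v in estimates)    # variance of project completion time
print(project_time, project_var)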

2.4 Dynamic Programming Models


This model determines the optimum solution of an n-variable problem by decomposing it into n stages, with each stage constituting a single-variable subproblem. This gives the computational advantage that, at each stage, we have to solve only a single-variable problem. However, the nature of the stage differs depending upon the optimization problem.

2.5 Inventory Models


This model is very useful when an item has to be stored as stock. The item may be produced in-house or procured from some other agency. The total cost of carrying the item, made up of the holding cost, the setup cost, and the shortage cost, is to be minimized. The inventory model can suggest a policy such as: order y0 units of the item after every t0 time units.
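
For the simplest case, with no shortages and instantaneous replenishment, the policy reduces to the classical economic order quantity. The sketch below applies the standard EOQ formula to invented figures; it is an illustration, not a prescription for any particular plant.

    import math

    def eoq(demand_per_year, setup_cost, holding_cost_per_unit_year):
        """Classical economic order quantity with no shortages allowed."""
        q_star = math.sqrt(2 * demand_per_year * setup_cost / holding_cost_per_unit_year)
        orders_per_year = demand_per_year / q_star
        time_between_orders = 1.0 / orders_per_year    # in years
        return q_star, orders_per_year, time_between_orders

    print(eoq(demand_per_year=12000, setup_cost=300, holding_cost_per_unit_year=2.0))
    # roughly 1897 units per order, 6.3 orders a year, 0.16 years between orders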

2.6 Queueing Models


This model can be applied in situations where one has to wait for service. A customer may be waiting at the milk booth to get milk, or a machine may be waiting for a process to start. We cannot eliminate waiting without incurring inordinate expense; however, we can hope to keep its adverse impact at a tolerable level. Poisson and exponential distributions have played an important role in the development of queueing models.
2.7 Multi-Objective Programming Models
There can be practical situations in which we have to optimize multiple, possibly conflicting, objective functions with respect to given constraints. For example, management is generally interested in maximizing the profit obtained from a process while making the least investment in it. It may be impossible to find a single solution that optimizes all the conflicting objectives; instead we find a compromise solution based on the relative importance of each objective. The weights method and the preemptive method are two techniques that can be used to solve multi-objective programming models.

3.0 Problem Set

3.1 A dairy product manufacturer has to produce 200 kg of a product mixture containing two
ingredients, X1 and X2. X1 costs Rs. 3 per kg and X2 costs Rs. 8 per kg. Not more than 80
kg of X1 can be used and a minimum quantity to be used for X2 is 60 kg. Formulate a linear
programming problem (LPP) for the above situation. Find optimum quantities of both the
ingredients to be used if the manufacturer wants to minimize the cost using the simplex
method for solving LPP.
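
The simplex calculations for problem 3.1 are meant to be done by hand, but as a cross-check the same model can be typed into a solver. A hedged sketch, assuming the SciPy library is available, follows; it only encodes the constraints stated in the problem.

    from scipy.optimize import linprog

    c = [3, 8]                        # cost in Rs per kg of X1 and X2
    A_eq, b_eq = [[1, 1]], [200]      # the mixture must total exactly 200 kg
    bounds = [(0, 80),                # at most 80 kg of X1 can be used
              (60, None)]             # at least 60 kg of X2 must be used

    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    print(res.x, res.fun)             # optimal quantities (kg) and minimum cost (Rs)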

3.2 A dairy firm has to produce 10,000 liters of ice-cream. The main ingredients for the product are milk, cream, and skimmed milk powder (SMP). Milk costs Rs. 12 per liter, cream costs Rs. 40 per kg and SMP costs Rs. 14 per kg. Not more than 3,000 liters of milk can be used, at least 1,500 kg of cream must be used, and at least 2,000 kg of SMP is required to be used. The firm desires to minimize the cost of production while maintaining the PFA standards of the product, so the quantities used must conform to the levels prescribed above. Find what amount of each ingredient the firm should use.

3.3 Use Vogel's Approximation Method (VAM) to obtain initial basic feasible solution of the
following transportation problem:

                         Destination
  Origin       D1     D2     D3     D4    Availability
  O1           11     13     17     14        250
  O2           16     18     14     10        300
  O3           21     24     13     10        400
  Demand      200    225    275    250

Also find an optimum solution for the same.

3.4 A dairy firm manufactures milk powder. The powder is manufactured at its three different plants, and the product is shipped by road to its three major distribution depots. Since the shipping costs are a major expense, the firm is interested in minimizing them. In this context the cost coefficients, demands and supplies are given in the following transportation table:

                          To
  From            A      B      C    Available
  I              50     30    220        1
  II             90     45    170        3
  III           250    200     50        4
  Requirement     4      2      2        8

3.5 A dairy production manager has four B.Tech. (Dairy Technology) students (undergoing in-plant training in the plant) as his associates, and there are four different tasks to be performed. These technologists under training differ in efficiency, and the tasks differ in their inherent complexity. The manager's assessment of the time (in man-hours) each technologist would take to perform each task is given in the following matrix:

              Dairy Technologists (under training)
  Tasks         E      F      G      H
  A            20     25     17     14
  B            15     28     14     23
  C            36     21     18     16
  D            18     26     24     10

How should the tasks be allocated, one to a dairy technologist, so as to minimize the
total man-hours?
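
A small sketch of how the Hungarian method of section 2.2 can be applied to this matrix, assuming the SciPy library is available (its linear_sum_assignment routine implements that method):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # rows are tasks A-D, columns are technologists E-H (man-hours from the matrix above)
    times = np.array([[20, 25, 17, 14],
                      [15, 28, 14, 23],
                      [36, 21, 18, 16],
                      [18, 26, 24, 10]])

    rows, cols = linear_sum_assignment(times)    # minimises the total assignment time
    for r, c in zip(rows, cols):
        print("task", "ABCD"[r], "->", "EFGH"[c])
    print("total man-hours:", times[rows, cols].sum())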

3.6 Ice-cream manufacturing unit of Mother Dairy is selling its product through five different
marketing agencies in five different cities. At the onset of summer there is an immediate
demand for ice-cream in another five cities not having the agencies of the company. The
company is faced with the problem of deciding on how to assign the existing agencies to
dispatch the ice-cream to the needy cities in such a way that the total traveling distance is
minimized. The distances between surplus and deficit cities are given below.

                      Deficit Cities
  Surplus Cities     A'     B'     C'     D'     E'
  A                  10      5      9     10     11
  B                  13     19      6     12     14
  C                   3      2      4      4      5
  D                  18      9     12     17     15
  E                  11      6     14     19     10

3.7 Find the critical path for the project illustrated in the following network diagram (show all
the calculations and tabulate the same properly):

[Project network diagram with nodes 1 to 10; the figure could not be reproduced in this text version.]

3.8 If for a period of 2 hours in a day (i.e. 8.00 a.m. to 10.00 a.m.), milk tankers arrive at a
chilling centre every 20 minutes but the service time continues to remain 36 minutes, then
calculate the following for this period:
(i) The probability that the chilling centre is empty
(ii) The average number of tankers in the system on the assumption that the waiting
capacity of the chilling centre is 4 tankers only.
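
A hedged sketch of the finite-capacity, single-server queueing calculations this problem appears to call for (Poisson arrivals, exponential service, at most K tankers in the system, i.e. M/M/1/K in the usual notation) is given below. Whether the tanker being serviced counts towards the stated limit of 4 is an interpretation the reader must settle before quoting the numbers, so the capacity K is left as a parameter.

    def mm1k_metrics(arrival_rate, service_rate, K):
        """Steady-state measures for a single-server queue holding at most K customers."""
        rho = arrival_rate / service_rate
        weights = [rho ** n for n in range(K + 1)]     # unnormalised P(n in system)
        total = sum(weights)
        p = [w / total for w in weights]               # valid even when rho > 1
        p_empty = p[0]                                 # probability the centre is empty
        avg_in_system = sum(n * p[n] for n in range(K + 1))
        return p_empty, avg_in_system

    # arrivals every 20 minutes -> 3 per hour; a 36-minute service -> 5/3 per hour
    print(mm1k_metrics(arrival_rate=3.0, service_rate=60.0 / 36.0, K=4))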

3.9 A dairy manufacturer has to supply 20,000 pieces of a dairy product daily. The dairy plant
has the maximum production capacity of 30,000 pieces of the item every day. The cost of
holding a piece in stock is Rs. 3 per year and the set-up cost per production run is Rs. 50.
How frequently, and of what size, should the production runs be made?

3.10 The requirement of a particular raw material used for manufacturing various dairy
products at NDRI experimental dairy is 18,000 units per year. The holding cost per unit is
Rs. 1.20 per year, and the cost of one procurement is Rs.400.00. No shortages are allowed,
and the replenishment is instantaneous. Determine the following:
i) Optimum order quantity
ii) Number of orders per year
iii) Time between orders, and
iv) Total cost per year when the cost of one unit is Re.1.

ARTIFICIAL INTELLIGENCE AND ITS APPLICATIONS

Ashwani Kumar Kush


Computer Centre,
Kurukshetra University,
Kurukshetra

1.0 AI
It is the science and engineering of making intelligent machines, especially intelligent computer
programs. It is related to the similar task of using computers to understand human intelligence.

1.1 Intelligence:

Intelligence is the computational part of the ability to achieve goals in the world.

1.2 Comparisons between human and computer intelligence

Arthur R. Jensen, a leading researcher in human intelligence, suggests "as a heuristic hypothesis" that all normal humans have the same intellectual mechanisms and that differences in intelligence are related to "quantitative biochemical and physiological conditions". These are speed, short-term memory, and the ability to form accurate and retrievable long-term memories.

Whenever people do better than computers on some task, or computers use a lot of computation to do as well as people, this demonstrates that the program designers lack understanding of the intellectual mechanisms required to do the task efficiently.

1.3 The start of AI research

After WW II, a number of people independently started to work on intelligent machines. The
English mathematician Alan Turing may have been the first. He gave a lecture on it in 1947.
He also may have been the first to decide that AI was best researched by programming
computers rather than by building machines. By the late 1950s, there were many researchers on
AI, and most of them were basing their work on programming computers.

1.4 Turing test

Alan Turing's 1950 article Computing Machinery and Intelligence discussed conditions for
considering a machine to be intelligent. He argued that if the machine could successfully
pretend to be human to a knowledgeable observer then you certainly should consider it
intelligent. This test would satisfy most people but not all philosophers. The observer could
interact with the machine and a human by teletype (to avoid requiring that the machine imitate the appearance or voice of the person), and the human would try to persuade the observer that it
was human and the machine would try to fool the observer.

The Turing test is a one-sided test. A machine that passes the test should certainly be
considered intelligent, but a machine could still be considered intelligent without knowing
enough about humans to imitate a human.

The important features of Turing's test are:

1. It attempts to give an objective notion of intelligence, i.e., the behavior of a known intelligent being in response to a particular set of questions. This provides a standard for determining intelligence that avoids the inevitable debates over its "true" nature.

2. It prevents us from being sidetracked by such confusing and currently unanswerable questions as whether or not the computer uses the appropriate internal processes, or whether or not the machine is actually conscious of its actions.

3. It eliminates any bias in favor of living organisms by forcing the interrogator to focus solely on the content of the answers to questions.

Because of these advantages, the Turing test provides a basis for many of the schemes actually
used to evaluate modern AI programs. A program that has potentially achieved intelligence in
some area of expertise may be evaluated by comparing its performance on a given set of
problems to that of a human expert. This evaluation technique is just a variation of the Turing
test: a group of humans are asked to blindly compare the performance of a computer and a
human being on a particular set of problems. As we will see, this methodology has become an
essential tool in both the development and verification of modern expert systems.

2.0 Branches of AI
2.1 Logical AI
What a program knows about the world in general, the facts of the specific situation in which it must act, and its goals are all represented by sentences of some mathematical logical language. The program decides what to do by inferring that certain actions are appropriate for achieving its goals.
2.2 Search
AI programs often examine large numbers of possibilities, e.g. moves in a chess game
or inferences by a theorem proving program. Discoveries are continually made about
how to do this more efficiently in various domains.
2.3 Pattern recognition
When a program makes observations of some kind, it is often programmed to compare
what it sees with a pattern. For example, a vision program may try to match a pattern of
eyes and a nose in a scene in order to find a face.
2.4 Representation
Facts about the world have to be represented in some way. Usually languages of
mathematical logic are used.

2.5 Inference
From some facts, others can be inferred. Mathematical logical deduction is adequate for
some purposes, but new methods of non-monotonic inference have been added to logic
since the 1970s. The simplest kind of non-monotonic reasoning is default reasoning in
which a conclusion is to be inferred by default, but the conclusion can be withdrawn if
there is evidence to the contrary. For example, when we hear of a bird, we may infer
that it can fly, but this conclusion can be reversed when we hear that it is a penguin. It is
the possibility that a conclusion may have to be withdrawn that constitutes the non-
monotonic character of the reasoning. Ordinary logical reasoning is monotonic in that
the set of conclusions that can be drawn from a set of premises is a monotonically increasing function of the premises.
2.6 Common sense knowledge and reasoning
This is the area in which AI is farthest from human-level, in spite of the fact that it has
been an active research area since the 1950s. While there has been considerable
progress, e.g. in developing systems of non-monotonic reasoning and theories of action,
yet more new ideas are needed.

2.7 Learning from experience


Programs can learn from experience. The approaches to AI based on connectionism and neural nets specialize in this. There is also learning of laws expressed in logic. Programs can only learn what facts or behaviors their formalisms can represent, and unfortunately learning systems are almost all based on very limited abilities to represent information.
2.8 Planning
Planning programs start with general facts about the world (especially facts about the
effects of actions), facts about the particular situation and a statement of a goal. From
these, they generate a strategy for achieving the goal. In the most common cases, the
strategy is just a sequence of actions.
2.9 Ontology
Ontology is the study of the kinds of things that exist. In AI, the programs and sentences
deal with various kinds of objects, and we study what these kinds are and what their
basic properties are. Emphasis on ontology begins in the 1990s.
2.10 Heuristics
A heuristic is a way of trying to discover something, or an idea embedded in a program.
The term is used variously in AI. Heuristic functions are used in some approaches to
search to measure how far a node in a search tree seems to be from a goal. Heuristic
predicates that compare two nodes in a search tree to see if one is better than the other,
i.e. constitutes an advance toward the goal, may be more useful.
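
A toy sketch of a heuristic function guiding a search is given below: greedy best-first search always expands the frontier node that the heuristic h() judges closest to the goal. The graph and the h values are invented purely for illustration.

    import heapq

    graph = {"S": ["A", "B"], "A": ["G"], "B": ["A"], "G": []}
    h = {"S": 3, "A": 1, "B": 2, "G": 0}      # heuristic estimate of distance to goal G

    def greedy_best_first(start, goal):
        frontier = [(h[start], start, [start])]
        visited = set()
        while frontier:
            _, node, path = heapq.heappop(frontier)   # most promising node first
            if node == goal:
                return path
            if node in visited:
                continue
            visited.add(node)
            for nxt in graph[node]:
                heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
        return None

    print(greedy_best_first("S", "G"))        # ['S', 'A', 'G']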
2.11 Genetic programming
Genetic programming is a technique for getting programs to solve a task by mating random Lisp programs and selecting the fittest over millions of generations.

3.0 Applications of AI
3.1 Game playing
You can buy machines that can play master-level chess for a few hundred rupees. There is some AI in them, but they play well against people mainly through brute-force computation, looking at hundreds of thousands of positions.

3.2 Speech recognition
In the 1990s, computer speech recognition reached a practical level for limited
purposes. Thus United Airlines has replaced its keyboard tree for flight information by a
system using speech recognition of flight numbers and city names. It is quite
convenient.
3.3 Understanding natural language
Just getting a sequence of words into a computer is not enough. Parsing sentences is not
enough either. The computer has to be provided with an understanding of the domain
the text is about, and this is presently possible only for very limited domains.
3.4 Computer vision
The world is composed of three-dimensional objects, but the inputs to the human eye
and computers' TV cameras are two dimensional. Some useful programs can work
solely in two dimensions, but full computer vision requires partial three-dimensional
information that is not just a set of two-dimensional views. At present there are only
limited ways of representing three-dimensional information directly, and they are not as
good as what humans evidently use.
3.5 Expert systems
A "knowledge engineer" interviews experts in a certain domain and tries to embody
their knowledge in a computer program for carrying out some task. How well this
works depends on whether the intellectual mechanisms required for the task are within
the present state of AI. One of the first expert systems was MYCIN in 1974, which
diagnosed bacterial infections of the blood and suggested treatments. It did better than
medical students or practicing doctors, provided its limitations were observed. Namely,
its ontology included bacteria, symptoms, and treatments and did not include patients,
doctors, hospitals, death, recovery, and events occurring in time. Its interactions
depended on a single patient being considered. Since the experts consulted by the
knowledge engineers knew about patients, doctors, death, recovery, etc., it is clear that
the knowledge engineers forced what the experts told them into a predetermined
framework. In the present state of AI, this has to be true.
3.6 Heuristic classification
One of the most feasible kinds of expert system given the present knowledge of AI is to
put some information in one of a fixed set of categories using several sources of
information. An example is advising whether to accept a proposed credit card purchase.
Information is available about the owner of the credit card, his record of payment and
also about the item he is buying and about the establishment from which he is buying it
(e.g., about whether there have been previous credit card frauds at this establishment).

4.0 Relations between AI and Philosophy


AI has many relations with philosophy, especially modern analytic philosophy. Both study
mind, and both study common sense.

5.0 Expert Systems


Expert knowledge is a combination of a theoretical understanding of the problem and a
collection of heuristic problem-solving rules that experience has shown to be effective in the domain. Expert systems are constructed by obtaining this knowledge from a human expert and
coding it into a form that a computer may apply to similar problems.

This reliance on the knowledge of a human domain expert for the system's problem solving
strategies is a major feature of expert systems. The AI specialist, or knowledge engineer, as
expert systems designers are often known, is responsible for implementing this knowledge in a
program that is both effective and seemingly intelligent in its behavior.
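
As a toy illustration (not the architecture of any particular expert-system shell), heuristic if-then knowledge elicited from an expert can be stored as rules and applied by simple forward chaining; the dairy-flavoured rules and facts below are invented.

    # each rule: (set of conditions, conclusion)
    rules = [
        ({"high_somatic_cell_count", "reduced_yield"}, "suspect_mastitis"),
        ({"suspect_mastitis"}, "recommend_veterinary_check"),
    ]

    def forward_chain(facts):
        facts = set(facts)
        changed = True
        while changed:                 # keep firing rules until no new fact is added
            changed = False
            for conditions, conclusion in rules:
                if conditions <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    print(forward_chain({"high_somatic_cell_count", "reduced_yield"}))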

One of the earliest systems to exploit domain-specific knowledge in problem solving was
DENDRAL, developed at Stanford in the late 1960s. DENDRAL was designed to infer the
structure of organic molecules from their chemical formulas and mass spectrographic
information about the chemical bonds present in the molecules. Because organic molecules
tend to be very large, the number of possible structures for these molecules tends to be huge.

Whereas DENDRAL was one of the first programs to effectively use domain-specific
knowledge to achieve expert level problem-solving performance, MYCIN established the
methodology of contemporary expert systems. MYCIN uses expert medical knowledge to
diagnose and prescribe treatment for spinal meningitis and bacterial infections of the blood.

Other classic expert systems include the PROSPECTOR program for determining the probable
location and type of ore deposits based on geological information about a site, the INTERNIST
program for performing diagnosis in the area of internal medicine, and XCON for configuring
VAX computers. XCON was developed in 1981, and at one time every VAX sold by Digital
Equipment Corporation was configured by that software. Numerous other expert systems are
currently solving problems in areas such as medicine, education, business, design, and science.

It is interesting to note that most expert systems have been written for relatively specialized,
expert level domains. These domains are generally well studied and have clearly defined
problem-solving strategies. Problems that depend on a more loosely defined notion of
"common sense" are much more difficult to solve by these means. In spite of the promise of
expert systems, it would be a mistake to overestimate the ability of this technology. Current
deficiencies include:

1. Difficulty in capturing "deep" knowledge of the problem domain.

2. Lack of robustness and flexibility.

3. Inability to provide deep explanations.

4. Difficulties in verification.

5. Little learning from experience.

In spite of these limitations, expert systems have proved their value in a number of important
applications. It is hoped that these limitations will only encourage the student to pursue this
important branch of computer science.

5.1 Applications:

1. Natural Language Understanding and Semantics
2. Modeling Human Performance
3. Planning and Robotics
4. Languages and Environments for AI
5. Machine Learning
6. Alternative Representations: Neural Nets and Genetic Algorithms

6.0 Artificial Intelligence--A Summary


We have attempted to define artificial intelligence through discussion of its major areas
of research and application. This survey reveals a young and promising field of study whose
primary concern is finding an effective way to understand and apply intelligent problem
solving, planning, and communication skills to a wide range of practical problems. In spite of
the variety of problems addressed in artificial intelligence research, a number of important
features emerge that seem common to all divisions of the field; these include:
1. The use of computers to do reasoning, pattern recognition, learning, or some other form of
inference.
2. A focus on problems that do not respond to algorithmic solutions. This underlies the reliance
on heuristic search as an AI problem-solving technique.
3. A concern with problem solving using inexact, missing, or poorly defined information and
the use of representational formalisms that enable the programmer to compensate for these
problems.
4. Reasoning about the significant qualitative features of a situation.
5. An attempt to deal with issues of semantic meaning as well as syntactic form.
6. Answers that are neither exact nor optimal, but are in some sense "sufficient". This is a result
of the essential reliance on heuristic problem-solving methods in situations where optimal or
exact results are either too expensive or not possible.
7. The use of large amounts of domain-specific knowledge in solving problems. This is the
basis of expert systems.
8. The use of meta-level knowledge to effect more sophisticated control of problem solving
strategies. Although this is a very difficult problem, addressed in relatively few current
systems, it is emerging as an essential area of research.

7.0 The Applications Of Expert Systems


Applications tend to cluster into seven major classes.
 Diagnosis and Troubleshooting of Devices and Systems of All Kinds
 Planning and Scheduling
 Configuration of Manufactured Objects from Subassemblies
 Financial Decision Making
 Knowledge Publishing
 Process Monitoring and Control
 Design and Manufacturing

8.0 AI Applications in Industry / Dairy


1. Animal cloning by nuclear transfer; In vitro embryo production; Reproductive
physiology
2. Analysis of genetic relationships and breeding population structures in wild animal
species using molecular markers.
3. Genetic polymorphism of milk proteins - its association with production traits,
composition and properties of milk. Identification of silent variants of milk proteins;
Control of the quality of milk samples submitted for routine analysis; Effects of milk
composition on cheese yield and quality
4. Molecular endocrinology and genetics in poultry and dairy cattle aimed at identifying
genes and gene pathways that affect reproduction and production traits.
5. Dairy cattle biochemistry and physiology; Milk synthesis; endocrine-immune reaction;
biologically active peptides from milk proteins; neutrophil diapedesis and mastitis
6. Dairy cattle nutrition, ruminant carbohydrate and protein metabolism, nutritional
evaluation of new forages for dairy cows, optimizing the feeding value of agricultural
byproducts for ruminant animals.
7. Protein nutrition; Appetite control and forage utilization in ruminants; Nutritive value of
silage; Nutrition of farmed deer.
8. Poultry physiology and Nutrition; Pulmonary hypertension syndrome in broilers;
Nutritional and biochemical factors affecting cardiopulmonary function and ascites
mortality in broilers.
9. Estimation of genetic parameters from large-scale milk recording data for production
and reproduction traits; mixed model methodologies; Modeling; Impact of selection
decisions.

10. Dairy cattle population genetics. Estimation of genetic parameters, particularly for
lifetime production and fertility, and longevity; Estimation of associations of production
traits with RFLP's and protein polymorphisms. Identification of genetic markers for
resistance to mastitis.
11. Dairy cattle population genetics. Dairy herd recording operations; Estimation of genetic
parameters from large-scale milk recording data for conformation traits,
mastitis/somatic cell count, milk proteins, lifetime performance, and reasons for
disposal in dairy cows.
12. Information systems in dairy cattle breeding and farm management; On-farm decision-
support systems; Artificial intelligence; Data mining techniques; Rule induction for
expert-system development.
13. An expert system called LAIT-XPERT VACHES, developed to evaluate technical
management of dairy enterprises, was tested using case data. The expertise of the
system was provided from information obtained from interviews of three dairy
management or nutrition experts. LAIT-XPERT VACHES contains over 950 rules and
runs on IBM-compatible personal computers. It calculates objectives in milk
production, fat and protein production, feeding cost, reproduction, and other areas. In
addition, it detects problems and high performance according to these objectives;
researches the causes of problems in herd management, feeding, genetics, health,
housing, and other areas; and lists conclusions by sector. Using a monthly report of 10
farms registered in the DHI program of Quebec, LAIT-XPERT VACHES issued 92.3%
of the conclusions also issued by experts. However, the experts revealed only 53.3% of
conclusions reached by the expert system. With Agri-Lait reports of three farms, all
conclusions of LAIT-XPERT VACHES were validated by the experts. These results
demonstrated that use of an expert system makes it possible to obtain analyses of dairy
performance data equivalent to those of human experts.
14. An example of Sightech's artificial intelligence sorting technology for dairy and produce applications: egg breaking and egg-white inspection, in which the Eyebolt detects the presence of yolk in egg whites and actuates ejection.
15. Department of Natural Resources and Environment, Australia – Seven “Online
Consultants” - Provides useful applications to help dairy farmers in their decision
making. Most of the applications relate to the Target 10 Dairy Extension programs run
throughout Victoria. Systems include: Pasture management, Nutrition of the Dairy
Cow, Soil and Fertilizer, Dairy Business, Managing Dairy Effluent, Nutrients for
Sustainable Agriculture, and Dealing with Wet Soils.
9.0 References

1. Shortliffe EH. Computer Based Medical Consultations: MYCIN. New York: Elsevier, 1976.
2. Waterman DA. A Guide to Expert Systems. Addison-Wesley Publishing Company, 1986.
3. Durkin J. Expert Systems: Catalogue of Applications. Intelligent Computer Systems, PO
BOX 4117, Akron, Ohio, USA, 1993.
4. Aikens JS. PUFF: An expert system for interpretation of pulmonary function data. Computers and Biomedical Research 1983; 16: 199-208.
5. Wang CH, Tseng SS. A brain tumors diagnostic system with automatic learning

GRAPHICAL USER INTERFACE

Ashwani Kumar Kush


Computer Centre,
Kurukshetra University,
Kurukshetra
1.0 Introduction

A GUI is a program interface that takes advantage of the computer's graphics capabilities to
make the program easier to use. Well-designed graphical user interfaces can free the user from
learning complex command languages.

Graphical user interfaces feature the following basic components:


pointer: A symbol that appears on the display screen and that you move to select objects and commands. Usually, the pointer appears as a small angled arrow. Text-processing applications, however, use an I-beam pointer that is shaped like a capital I.
pointing device: A device, such as a mouse or trackball, that enables you to select objects on the display screen.
icons: Small pictures that represent commands, files, or windows. By moving the pointer to the icon and pressing a mouse button, you can execute a command or convert the icon into a window. You can also move the icons around the display screen as if they were real objects on your desk.
desktop: The area on the display screen where icons are grouped is often referred to as the desktop because the icons are intended to represent real objects on a real desktop.
windows: You can divide the screen into different areas. In each window, you can run a different program or display a different file. You can move windows around the display screen, and change their shape and size at will.
menus: Most graphical user interfaces let you execute commands by selecting a choice from a menu.

The first graphical user interface was designed by Xerox Corporation's Palo Alto Research
Center in the 1970s, but it was not until the 1980s and the emergence of the Apple Macintosh
that graphical user interfaces became popular. One reason for their slow acceptance was the
fact that they require considerable CPU power and a high-quality monitor, which until recently
were prohibitively expensive.

The hallmark of GUI programming lies in its graphical control features, such as toolbar
buttons or icons. Unlike pure text interfaces which accept only keystroke commands, GUIs
allow for a variety of input devices (such as a mouse or light pen) for the user to manipulate text
and images as visually displayed. In addition to their visual components, graphical user
interfaces also make it easier to move data from one application to another. A true GUI
includes standard formats for representing text and graphics. Because the formats are well-
defined, different programs that run under a common GUI can share data. This makes it possible, for example, to copy a graph created by a spreadsheet program into a document
created by a word processor.

Visual Basic, Delphi and C++ are tools commonly used for GUI application development;
one can also use Help and HTML processors, numerous database scripting tools, ToolBook,
Hypercard, and many other products.

Examples of systems that support GUIs are Mac OS, Microsoft Windows, NEXTSTEP and
the X Window System. The latter is extended with toolkits such as Motif, Qt (KDE) and GTK+
(GNOME).

(Figure: an example of the graphical user interface in Windows XP.)

Today, all major operating systems provide a graphical user interface. Applications
typically use the elements of the GUI that come with the operating system and add their own
graphical user interface elements and ideas. A GUI sometimes uses one or more metaphors for
objects familiar in real life, such as the desktop, the view through a window, or the physical
layout in a building. Elements of a GUI include such things as: windows, pull-down menus,
buttons, scroll bars, iconic images, wizards, the mouse, and no doubt many things that haven't
been invented yet. With the increasing use of multimedia as part of the GUI, sound, voice,
motion video, and virtual reality interfaces seem likely to become part of the GUI for many
applications. A system's graphical user interface along with its input devices is sometimes
referred to as its "look-and-feel."

Examples of application specific touchscreen GUIs include the most recent automatic
teller machines, airline self-ticketing, information kiosks and the monitor/control screens in
embedded industrial applications which employ a real time operating system (RTOS). The
latest cell phones and handheld game systems also employ application specific touchscreen
GUI.

2.0 Design
Whether you aspire to develop the next big software hit or simply create computer
applications for your personal and office use, your applications will need effective user
interfaces to fulfill their potential. Designing such an interface is part discipline (following
platform conventions and good design principles), part science (usability testing) and part art
(creating screen layouts that are informative, intuitive and visually pleasing).

3.0 Ten Principles for Good GUI Design

1. The user must be able to anticipate a widget's behavior from its visual properties.
Widgets in this context refer to visual controls such as buttons, menus, check boxes,
scroll bars, and rulers. So let's call this the Principle of Consistency at the Widget Level.
This principle stresses that every widget in your application should behave consistently.
If one button responds to a single mouse click then every button should respond to a
single click.
2. The user must be able to anticipate the behavior of your program using knowledge
gained from other programs. This is the Principle of Consistency at the Platform
Level. Consistency is important not only to visual elements like widgets but to
abstractions such as mouse gestures, accelerator keys, and the placement of menus, icons and toolbar buttons. There are plenty of decisions regarding GUIs that are arbitrary and
platform-specific. Obtain a good GUI application design guide for your target platform,
and follow it. Maintaining consistency with the host platform trumps achieving
consistency of your application across platforms. Your users will change applications
on the same platform far more frequently than they will run your application on
different platforms.

3. View every user warning and error dialog that your program generates as an
opportunity to improve your interface. Good GUI interfaces rarely need or use
warnings and error dialogs. Exceptions include those that signal hardware problems
such as a disk failure or lost modem connection, or warnings that ask the user's
permission to perform an irreversible and potentially erroneous step. Otherwise, error
dialogs in GUI interfaces represent interface design flaws. Prevent, don't complain
about, user errors. The most common flaws arise from improperly formatted user input
and inappropriate task sequencing. Design your program interface to help your users
enter appropriate data. If your program requires formatted data (dates, currency,
alphanumeric only, or a particular numeric range) use bounded input widgets that
appropriately limit the user's input choices. If a certain program step cannot be
legitimately performed until your user completes other steps, disable the dependent step
until all its dependencies are satisfied. Most GUI environments dim disabled widgets to
signal that the widget cannot be selected. Use disabled widgets to limit user actions to those that are valid (a minimal sketch of this idea appears after this list).

4. Provide adequate user feedback. Like the Consistency Principle, the Principle of User
Feedback applies to widgets and to program activity. Widgets often provide visual
feedback; click a button, and it briefly suggests it has been depressed. Click a check box
and its new appearance alerts the user it has been selected or deselected. If you create
your own widgets, be sure to provide users with visual feedback for each affordance.
User feedback at the program level requires that users know whether a step is in
progress or completed. Change the cursor (the Mac wristwatch or the Windows
hourglass) to indicate brief delays, and use progress bars to indicate progress of longer
tasks. Design every screen in your application so a novice user can easily tell what
steps, especially critical steps, have been performed.
5. Create a safe environment for exploration. Humans are born explorers. Great
interfaces invite and reward exploration, and offer the novice both the thrill of discovery
and the satisfaction of unassisted accomplishment. Some interfaces encourage users to
explore unfamiliar features, others do not. By allowing users to undo or redo, you
encourage them to explore your application without fear of corrupting the database.
Undo/Redo capabilities also eliminate the need for dialogs requesting permission to
perform a seemingly erroneous function. A good interface makes a user feel competent,
while a poor interface leaves the same user feeling incompetent.
6. Strive to make your application self-evident. Good applications have comprehensive
manuals and online help materials explaining program features and how to use them to
solve real world problems. Great applications are those whose novice users rarely need
to refer to the manuals or online help. The difference between good and great is often
the degree to which the application and its interface are self-evident. From your choice
of labels and widget captions to the arrangement of widgets on the screen, every
interface design decision you make needs to be tested by users. Your goal is to create an
interface that needs no explanation.
7. Use sound, color, animation and multimedia clips sparingly. Sound, color,
animation and multimedia presentations are appropriate for education or entertainment,
but effective use in other applications is difficult. Most platforms have written
conventions that describe the appropriate use of sound, color and animation. Follow
them, and remember never to use color or sound as the sole means of communicating
with the user (many users are colorblind or hearing-impaired).
8. Help users customize and preserve their preferred work environment. If your
application will be installed and operated by a single user, preserving the work
environment may be as simple as creating a few registry entries such as the window's
latest size and position. Keep in mind that regardless of programming, the
characteristics of the user's display will affect your application's appearance; your full-
screen interface may look fine on a 14-inch VGA display, but will those 8-point Times-
Roman labels and captions be legible on a 17-inch display at a resolution of 1152x864?
One popular solution to any hardware irregularities or user preferences is to permit the
user to tailor the basic interface. Common user-tailored details include fonts, toolbar
location and appearance, menu entries, and (especially important for users with
impairments) color and sound. It is helpful to give users a way to choose among several predefined schemes, and always include a way to return to the default color or sound scheme (a small sketch of saving such preferences to a file appears after this list).

9. Avoid modal behaviors. Programs using modal behavior force the user to perform
tasks in a specific order or otherwise modify the user's expected responses. Modal
behaviors generally feel uncomfortable to the user because they restrict more intuitive
or natural responses, but if consciously and thoughtfully applied they can be used to
advantage. For example, "Wizard" type tools simplify complex tasks by modal
behavior. Warnings and error messages are also typically modal, forcing users to first
address a critical issue before returning to the task. Modality in this latter context is
necessary but interrupts the user's concentration and goal-oriented behavior, and so is
another reason to avoid unnecessary warning and error messages. The best modal
behaviors are subtle but not hidden, and come forth naturally as a consequence of the
metaphor.

In a typical painting program, for example, selecting a widget generally alters the
subsequent function of the program and therefore results in modal behavior. Pick the
brush widget, and you are ready to paint. Pick the letter stencil widget, and you type
some text. Pick a shape widget, and you then draw that shape. This rarely causes
confusion because the modal behavior is based on a real world analogy; we already
know that by selecting a drawing instrument we are limiting the color, texture and line
thickness that will appear on our paper. Good interfaces reveal the palette selection at a
glance, and change the cursor to provide additional visual feedback once a selection is
made.

10. Design your interface so that your users can accomplish their tasks while being
minimally aware of the interface itself. We could call this the Principle of
Transparency. Interface transparency occurs when the user's attention is drawn away
from the interface and naturally directed at the task itself. It is the product of several
factors, including a screen layout that puts both tools and information where the user
expects them to be; icons and labels that are meaningful to the user; and metaphors
(including gestures) that are easy for users to recognize, learn, remember, and perform.
Choosing good metaphors and adhering to the above principles are an important start,
but the most direct way to ensure a transparent interface is to perform user testing
throughout the program's creation.

Following these 10 principles should help you create more effective, user-friendly interfaces
while avoiding many design errors.
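
As a minimal sketch of Principle 3, the fragment below (written with the standard-library tkinter toolkit, which the text itself does not prescribe) keeps a Submit button disabled until a name has been typed and the quantity field contains only digits, so the invalid action simply cannot be taken.

    import tkinter as tk

    root = tk.Tk()
    root.title("Bounded input demo")

    name = tk.Entry(root)
    quantity = tk.Entry(root)
    submit = tk.Button(root, text="Submit", state=tk.DISABLED)

    def refresh_state(event=None):
        # enable Submit only when a name is present and the quantity is numeric
        ok = bool(name.get().strip()) and quantity.get().strip().isdigit()
        submit.config(state=tk.NORMAL if ok else tk.DISABLED)

    for widget in (name, quantity):
        widget.bind("<KeyRelease>", refresh_state)
        widget.pack(padx=10, pady=5)
    submit.pack(padx=10, pady=10)

    root.mainloop()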

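And as a small sketch of Principle 8, user-chosen interface settings can be kept in a plain file between sessions; the file name and the particular settings below are assumptions made only for this example.

    import json, os

    PREFS_FILE = "app_prefs.json"
    DEFAULTS = {"font": "Arial 10", "toolbar": "top", "colour_scheme": "default"}

    def load_prefs():
        if os.path.exists(PREFS_FILE):
            with open(PREFS_FILE) as f:
                saved = json.load(f)
            return {**DEFAULTS, **saved}   # unknown or missing keys fall back to defaults
        return dict(DEFAULTS)

    def save_prefs(prefs):
        with open(PREFS_FILE, "w") as f:
            json.dump(prefs, f, indent=2)

    prefs = load_prefs()
    prefs["colour_scheme"] = "high_contrast"   # e.g. chosen by a user with impaired vision
    save_prefs(prefs)
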
4.0 Examples in Dairy Industry


1. GIDM: a GUI-based model for dairy waste management analysis. GIDM is designed to be an additional tool for answering questions related to the environmental impacts of dairy operations. It operates on an individual dairy basis and incorporates the GLEAMS water quality model for simulating nutrient transport of both nitrogen and phosphorus for specific fields of a dairy. It is designed to be generic, so that any dairy represented by a coverage for which relevant data such as topography, soil characteristics, weather and field boundaries are available can be simulated. GLEAMS is a field-scale water quality model which includes hydrology, erosion/sediment yield, pesticide and nutrient transport submodels.

The graphical user interface is designed to help the user in planning a balanced nutrient
management program for the dairy. The user can select a dairy or cancel the menu by
selecting 'none'. If the dairy is successfully selected, a base map of that dairy is
displayed showing the contours of the dairy and its field boundaries. The field selection,
simulation and map utilities menus also appear at this time on the left side of the screen.
Now the user can create thematic maps of the selected dairy or define, modify, and
simulate field management practices. Once a dairy is selected, the user can either select
an existing plan to work with (Management plans sub-menu) or start defining new field
and crop management practices.

2. The Victorian Dairy Industry Authority (VDIA) and the Victorian Meat Authority (VMA) contacted GUI when they decided to rebuild their legacy business systems. GUI not only helped redesign their systems and architect the new software from the system level up, but also led the development effort, supplied additional programming resources, and helped restructure their network environment to support the new
development. A web-based interface was also implemented to allow Dairy Technical
Services (DTS) and other sample testing organizations to be able to transmit test result
data into the system and make it available for analysis and reporting as part of the core
application.

3. Dairy Information Services Kiosk and Dairy Portal: Centre for Electronic
Governance, Indian Institute of Management, Ahmedabad. The application aims at
helping the dairy farmers with timely messages and educating them on the care for their
milk cattle and enhance the production of quality milk. It also aims at assisting the dairy
unions in effectively scheduling and organizing the veterinary, artificial insemination,
cattle feed and other related services. The application uses personal computers at the milk collection centres. The application includes two components: a Dairy Portal (DP) and a Dairy Information Services Kiosk (DISK).

It stores and maintains the databases of cooperative society members, their cattle,
artificial insemination, veterinary, cattle feed and other service transactions in addition
to the daily milk transactions. Based on this data, the DISK software generates alert as
well as routine messages in the regional language, to be given to the farmers when they
come to deliver the milk. These messages typically draw the attention of the farmer towards
the health and productivity aspects of his milk cattle. In addition, the DISK generates
several summary reports for the management of the society.

The Dairy Portal has textual as well as multi-media content useful to the farmers,
extension workers, business executives and researchers dealing with the dairy sector.

The portal mainly offers services such as education, entertainment, discussion forum,
frequently asked questions, data transfers, application forms for submission to various
agencies, e-commerce.

4. GUICS (Graphical User Interface for Crop Simulation): here the user interface is the key component. The software aims at the simulation of crops in different environments.

5. DeLaval Performance Tester VPR100: It is used in the dairy industry for critical
testing to determine the efficiency of the milking process through monitoring of the
milking apparatus. The hand-held device measures all key performance characteristics
of the milking plant including vacuum, pulsation, pump shaft speed and air leakage
using internal vacuum transducers and Infra-Red light. A wireless-data system is used
to connect to an expanding range of sensors that give the user freedom from
cumbersome wiring in the dairy environment. The results are monitored and recorded
through a touch-screen interface and can be transferred to a PC for analysis and long-term storage. It offers a highly language-independent, easy-to-use graphical user interface.

6. ProSched: ProSched is a powerful and user-friendly production scheduling software package designed for the process industries. It is used for dairy and beverage industry scheduling.

7. MadcapV, an acronym for Milk Analysis Data Capture and Processing: the company's "Madcap" dairy software now operates in its fifth generation, serving the majority of the processing industry in NZ and Australia. Client companies range in size from specialist processors with just ten supply farms to an industry of some 15,000 suppliers. The Madcap "Testing" module also handles data capture and manipulation for the largest independent milk testing laboratories in Australasia. "The design-inherent flexibility and user-definability allows Madcap to shape up readily to the most variant company requirements and to what we term 'cultural customization' to industry structures, from the UK to India." Madcap has delivered a consistent and uniform national system to Dairy Farmers for all processes relating to the collection, testing and payment for milk.

5.0 References
Norman DA. “The Design of Everyday Things”, 1988 Doubleday; ISBN 0-385-26774-6
Tognazzini B. “Tog on Interface”,1993 Addison-Wesley; ISBN 0-201-60842-1
Mandel T. “The GUI-CUI War Windows vs. OS/2 - The Designer's Guide to Human-Computer
Interfaces”, 1994 Van N.Reinhold; ISBN 0-442-01750-2
“The Windows Interface - An Application Design Guide”, 1991 Microsoft Press; ISBN 1-55615-439-9
Knudson R. User Interface Design Guidelines - A Brief Overview for Skiers and Applications
Developers.
Fraisse, C. W., K. L. Campbell, J. W. Jones, W. G. Boggess, and B. Negahban. 1995a. Integration of
GIS and Hydrologic Models for Nutrient Management Planning. Proceedings of the National
Conference on Environmental Problem-Solving with GIS. EPA/625/R-95/004. pp. 283-291.
Fraisse, C. W., K. L. Campbell, J. W. Jones, and W. G. Boggess. 1995b. GIDM User's and Developer's
Manual. Research Report AGE 95-3. Agricultural and Biological Engineering Department, University
of Florida. Gainesville, FL. 112 p.

Jensen, M. E., R. D. Burman and R. G. Allen (Eds.). 1990. Evapotranspiration and Irrigation Water
Requirements. American Society of Civil Engineers, Manuals and Reports on Engineering Practice, No.
70. 332 p.

INTRODUCTION TO MULTIMEDIA AND ITS
APPLICATIONS IN DAIRYING

Mrs. Jancy Gupta


Dairy Extension Division,
NDRI, Karnal

1.0 Introduction
Animals contribute to the national economy in general, and to the livestock farming economy in particular, by producing milk, meat, skin, hides, slaughterhouse byproducts, dung for manure and fuel, and draft power. India has emerged as the highest milk-producing country in the world, with milk production touching 91 million tonnes. However, in comparison to dairy-developed countries like Israel (8616 kg), USA (7767 kg), South Africa (7672 kg), Sweden (7376 kg), Netherlands (6890 kg) and Denmark (6716 kg), the productivity of the native animals is very low, with milk yield averaging 445 kg per lactation for cows and 811 kg for buffaloes. The bacteriological quality of the milk is also very poor.

2.0 Multimedia System Aims:


1. To reach the 110 million farmers, spread over 500 districts and more than 6000 blocks located in different agro-ecological and techno-socio-economic situations, the best approach is through information and communication technologies (ICTs).
2. Development of an e-extension system will help to build a reliable database on livestock products and productivity, increase the adoption of technological interventions for higher productivity, speed up genetic upgradation of livestock, and improve the delivery mechanism for inputs and services to farmers.
In this paper, an effort has been made to introduce multimedia and its applications in the promotion of dairying.

3.0 What Is Multimedia?
Multimedia can be defined as any combination of text, graphic art, sound, animation, and video, authored using authoring software and delivered by computer or other electronic means.

4.0 What Is a Multimedia Computer System?


A multimedia computer system is a computer system, which has the capability to integrate
two or more types of media (text, graphics, image, audio, and video) for the purpose of
generation, storage, representation, manipulation and access of multimedia information. In
general, the data size for multimedia information is much larger than textual information,
because representation of graphics, animation, audio or video media in digital form requires a much larger number of bits than that required for representation of plain text. Due to this,
multimedia computer systems require:

1. Faster CPU (for quick processing of large amount of data),


2. Larger storage devices (for storing large data files),
3. Larger main memory (for running programs with large data size),
4. Good graphics terminals (for displaying graphics, animation and video), and
5. I/O devices required to play any audio associated with a multimedia application
program.

5.0 Multimedia Components


5.1 Text. Alphanumeric characters are used to present information in text form. Computers are
widely used for text processing.
5.1.1 Hardware Requirements for Text
Text processing, with the use of computers, generally involves the following hardware devices:

1. Keyboards are most commonly used to input text data.


2. OCRs (Optical Character Recognizers) are used for direct input of printed text to
computers.
3. Computer screens are used to display text information.
4. Printers are most commonly used to output text in hard copy form.

5.1.2 Software Requirements for Text


The following text processing capabilities are highly desirable in a multimedia computer system for better presentation and use of textual information:
1. Text editing. Text and word processing packages are used to generate, edit and
properly layout a text document.
2. Text Style. Presentation of text information can be made more effective by using
text of various sizes, fonts and styles (bold, italics, shadow, etc.)
3. Hypertext. Both presentation and use of textual information can be greatly
enhanced by using hypertext features. It allows users to obtain information by
clicking on an anchor (a word or phrase linked to another document) within a document.
4. Text importing and exporting. The task of creating a textual document can often
be greatly simplified, if the document preparation software has text-importing
feature. This is because some of the text you want to incorporate in your document
may already exist as a file created by a word processor or a database file. The file
(partly or fully) can be simply imported into the new document at the desired location.

5.2 Graphics

Computer graphics deals with the generation, representation, manipulation and display of
pictures with the aid of a computer. Graphics is an important component of multimedia because
a picture is a powerful way to illustrate information.

5.2.1 Graphics Types
The pictures used in computer graphics can be broadly classified into two types:
1. Line drawings. These are drawings and illustrations in the form of 2D and 3D
pictures, which are created from mathematical representation of simple objects, like
lines, circles, arcs, etc. Simple object types are used to create complex objects. For
example, the picture of chair can be drawn using lines and arcs.
2. Images. These are pictures and photographs, which are composed of a collection of pixels (short for "picture element", the smallest addressable unit of a picture on a computer screen).
On a CRT display screen, each color component of a pixel corresponds to a phosphor. A phosphor glows when excited by an electron gun. Various combinations of different red, green and blue (RGB) intensities produce different colors.
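
A tiny sketch, using only the Python standard library, of the idea that a raster image is simply a grid of pixels whose red, green and blue intensities mix to give each color; it writes a 2x2 image in the plain-text PPM format (the file name is arbitrary).

    # pixel grid: each entry is an (R, G, B) triple in the range 0-255
    pixels = [
        [(255, 0, 0), (0, 255, 0)],    # top row: pure red, pure green
        [(0, 0, 255), (255, 255, 0)],  # bottom row: pure blue, red + green = yellow
    ]

    height, width = len(pixels), len(pixels[0])
    with open("demo.ppm", "w") as f:
        f.write("P3\n%d %d\n255\n" % (width, height))    # PPM header: format, size, max value
        for row in pixels:
            f.write(" ".join("%d %d %d" % rgb for rgb in row) + "\n")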

5.2.2 Hardware Requirements for Graphics


Computer graphics generally involves the following hardware devices:
1. Locating devices (such as a mouse or a joystick).
2. Flatbed or rectangular-coordinate digitizers (for inputting existing line drawings).
3. Scanners (optical scanners, image-scan digitizers, etc.).
4. Digital cameras or frame-capture hardware (such as a video camera) for capturing digital images directly.
5. Computer screens with graphics display capability.
6. Laser printers.

5.2.3 Software Requirements for Graphics


The following graphics processing capabilities are highly desirable in a multimedia computer
system for better presentation and use of graphics information:
1. Painting or drawing software. This software allows the user to create graphics
from scratch by using a mouse and various simple objects, such as lines, circles,
and polygons with supporting colors.
2. Screen capture software. Often we need to incorporate images from computer
screen displays in some document.
3. Clip art. Clip art is a library of commonly used graphic images or objects, such as
a personal computer, printer, telephone etc. These images can be directly imported
from the library and used in a multimedia application.
4. Graphics importing. The task of creating a multimedia application incorporating
graphics can often be greatly simplified, if the application software can import
graphic images in some standard formats. Common graphics formats include
.BMP, .GIF and .PCX.

5.3 Animation
Computer animation deals with the generation, sequencing, and display (at a specified rate) of a set of images (called frames) to create an effect of visual change or motion, similar to a movie film (video). Animation thus displays a sequence of images at a reasonable speed to create an impression of movement. For jerk-free, full-motion animation, 25 to 30 frames have to be displayed per second.

5.3.1 Hardware Requirements for Animation
Computer animation generally involves the following hardware devices:
1. Image generation tools and devices, such as scanners, digital camera, and video
capture board interfaced to some standard video source, like video camera or video
cassette recorder, are used to generate images to be used in animation.
2. A computer monitor with image display capability is the minimum requirement for outputting animation.

5.3.2 Software Requirements for Animation


The following software capabilities are highly desirable in a multimedia computer system with
animation facility:
1. Animation creation software. It allows the user to create animation sequences
from scratch by using a mouse and various simple objects, such as lines, circles
with various supporting colors.
2. Screen capture software.
3. Animation clips. This is a library of animation clips from which one can select
and directly import an animation clip.
4. Animation file importing. The task of creating a multimedia application
incorporating animation can often be greatly simplified, if the application software
can import animation files in some standard formats. Common animation file
formats include .FLI and .FLC.
5. Software support for high resolution. If the animation sequences of multimedia
application are made up of very high quality images, it is important to have not
only the necessary hardware, but also software support for displaying high
resolution images.
6. Recording and playback capability. It allows the user to control the recording
and display of an animation sequence.
7. Transition effect. Animation can be even more interesting, if it is enhanced with
transition effects, such as fade-in and fade-out, and rotation of objects.

5.4 Audio
Computer audio deals with synthesizing, recording, and playback of audio or sound with the
aid of a computer.

5.4.1 Analog and Digital Audio


Audio information travels in a natural medium in the form of sound waves, which are analog in
nature. For a computer to be able to process audio information, sound waves must be
converted from analog to digital form. A transducer (e.g., a microphone) is a device capable of changing signals from
one form to another.
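As a simple illustration of this analog-to-digital conversion, the following minimal Python sketch (all parameter values are assumed for illustration) samples a continuous tone at a fixed rate and quantizes the samples to 16-bit integers, which is essentially what a sound card's A/D converter does:

    import numpy as np

    # Illustrative A/D conversion: an "analog" tone is sampled and quantized.
    # sample_rate, duration and frequency are assumed values, not from the text.
    sample_rate = 8000                  # samples per second (Hz)
    duration = 0.01                     # seconds of sound to capture
    frequency = 440.0                   # tone frequency in Hz

    # Sampling: read the signal at discrete, equally spaced instants.
    t = np.arange(0, duration, 1.0 / sample_rate)
    analog_signal = np.sin(2 * np.pi * frequency * t)

    # Quantization: map each sample to the nearest 16-bit integer level.
    digital_samples = np.round(analog_signal * 32767).astype(np.int16)

    print(len(digital_samples), "samples captured")
    print(digital_samples[:8])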

5.4.2 Hardware Requirements for Audio


Computer audio generally involves the following hardware devices:
1. A sound board (or sound card), which is equipped with A/D (Analog-to-Digital)
and D/A (Digital-to-Analog) converters.
2. Some type of input device (such as a microphone) is used for audio input to record
a human voice or music or any type of sound in a computer.

3. Some type of output device (such as speakers or headphones) is used for audio
output to listen to a recorded sound.
4. Synthesized sound can also be generated on a computer by using a keyboard (as an
interaction device) and sound sequencer software.
5. Sound editors are used to cut and paste sound sequences.

5.4.3 Software Requirements for Audio


The following software capabilities are highly desirable in a multimedia computer system with
audio facility:

1. Audio clips. This is a library of audio clips (pre-made sound effects, music, and
narrations) from which one can select and directly import an audio clip and use it in a
multimedia application.
2. Audio file importing. The task of creating a multimedia application incorporating
audio can often be greatly simplified, if the application software can import audio
files in some standard formats. Common audio file formats include .WAV (windows
file), .MID (MIDI files), .VOC and .INS.
3. Software support for high quality sound. If a multimedia application uses very high
quality audio, to reproduce the sound effectively, it is important to have software
support for both recording and playback of high quality audio.
4. Text-to-speech conversion software. It is used to convert written text into
corresponding sound.
5. Speech-to-text conversion software. It is used to convert speech into corresponding
text.
5.5 Video
Like animation, computer video deals with the recording and display of a sequence of images at
a reasonable speed to create an impression of movement. Each individual image of a sequence
of images is called a frame. For a jerk-free full motion video, 25 to 30 frames have to be
displayed per second.

Video information travels in a natural medium in the form of light waves, which are analog in
nature. For computer usage of video information, light waves must be converted from analog
to digital form. A video camera is a transducer that is commonly used to convert light waves
into electrical signals.

5.5.1 Hardware Requirements for Video


The following hardware devices are generally required in a computer system capable of
handling video:
1. Video camera is the most commonly used input device for capturing video data.
2. TV monitor or computer monitor.
3. Video board (or video card), which is equipped with A/D and D/A converters.
4. Video editors are used to cut and paste video sequences.

5.5.2 Software Requirements for Video


The following software capabilities are highly desirable in a multimedia computer system with
video facility:

1. Video clips. This is a library of video clips from which one can select and directly
import a video clip, and use it in a multimedia application.
2. Recording and playback capability. It allows the user to control the recording
and display of a video sequence.

6.0 Applications of Multimedia in Dairying

Multimedia can be effectively utilized in knowledge management with respect to dairying. Interactive
multimedia on various aspects like genetic improvement, breeding, feeding, healthcare,
management of dairy animals, milking, processing of milk and milk products, and storage,
handling, distribution and marketing of dairy products can be prepared and used for wide
dissemination of information to milk producers. The right way of performing various operations
can be shown to them at a single click of a button.

6.1 Revitalization of Dairying with the help of Multimedia


Multimedia has emerged as a powerful tool for the development of dairying and the dairy
industry by providing access to vast global information and resources. The following objectives
can be set to reach the expected goals.

1. To offer opportunities for knowledge sharing, awareness of alternative perspectives,
higher efficiency, participatory improvement in social and human conditions, access to
better quality education and healthcare, stronger disaster relief capacity, poverty
reduction, etc.
2. To offer the farmer a choice of technological options appropriate to his livestock wealth,
land, capital, labour and knowledge resources.
3. To offer opportunities to manage newer technologies, such as the optimal use of inputs.
4. To build capacity to assess domestic and foreign market demand for products and
product quality criteria.
5. To source reputable suppliers of inputs and forge trust-based alliances with them.
6. To establish cooperation between small-scale producers to increase their presence and
power in the market.
7. To assess the implications for the dairy enterprise of changing policies on input
subsidies and trade liberalization.
8. To use video clips to learn complex procedures.
9. To provide access to the Internet and video conferencing at the village e-chaupal for
question-answer services and feedback.
10. To access knowledge for intensification and diversification of the existing production
pattern.

[Figure: Multimedia – a tool for information access. Village-to-district/state level linkage of a Dairy Farming System KIOSK (backed by a knowledge database and an expert system) to disseminate and access information on dairying. The facets of multimedia use in dairy promotion shown are: Management (animal sheds, shelters, flooring, aeration, sunlight, drainage, sanitation, animal record keeping); Breeding (selection of elite dairy animals, heifer raising, heat detection, AI, pregnancy diagnosis, calving periods); Health Care (vaccination against contagious diseases, mastitis, prophylactic measures, first aid to animals, facilities at the veterinary dispensary); Fodder Production (round-the-year fodder production, area under fodder crops, varieties, grasses, trees, fodder preservation); Feeding (feeding requirements, balanced feeding, crop residues, green fodder, concentrates, minerals, treatment of straws, hay and silage making); and Clean Milk Production, Milk Processing and Marketing.]
7.0 Conclusion

Across the globe, countries have recognized multimedia technology as an effective tool for
catalyzing dairying-related economic activities, for efficient management and for developing
human resources. The multimedia extension system will provide a powerful tool to dairy
extension functionaries, dairy farmers and dairy industries for fast, accurate, cost-effective and
efficient multi-way communication necessary for the overall improvement of the dairy farming
business.

An Overview of MIS Applications in Dairying

D.K. Jain and R.C. Nagpal


Computer Centre
NDRI, Karnal

1.0 Introduction
Information systems refer to computer-based information processing
systems that are designed to support the operations, management and
decision functions of an organization. Information systems in organizations
provide information support for decision makers at various management and
decision levels. Thus, they encompass transaction processing systems,
management information systems (MIS), decision support systems, and
strategic information systems. MIS is a system required to obtain tactical
information, i.e., information used for short-range planning by middle managers.
MIS raises management from the level of piecemeal, spotty
information, intuitive guesswork and isolated problem solving to the level of
system insights, system information, sophisticated data processing and
systematic problem solving. Hence, it is a powerful method for aiding
managers in solving problems and making decisions. An MIS must fulfil the
following requirements: a) the input data and the processing rules must be
correct, leading to accurate resultant information; b) the information should be
complete, i.e., it should include all relevant data; c) the information should be
reliable, i.e., it should not conceal vital information; d) the information should
be regular and timely; e) the information should be relevant and concise, i.e.,
presented in such a way that one may immediately perceive its significance,
e.g., information in graphical form.

2.0 MIS at NDRI – Cattle Management


There is a well-organized cattle farm at the National Dairy Research
Institute, Karnal, which consists of about 1500 head of cattle belonging to
different species and breeds in different age groups. Of these, there are about
400 milch cattle, 165 milch buffaloes and 80 milch goats. There is also a
mechanized farm section, which produces seasonal fodders to meet the
green fodder requirement of these cattle. In addition to green fodder, the
cattle are also fed a concentrate mixture formulated in the Institute itself using
recommended ingredients. A feeding schedule is drawn up every fortnight to
suggest the quantum of green fodder, dry and semi-dry fodders and
concentrate mixtures keeping in mind the number of cattle, their average milk
yield and other related factors.
A lot of animal farm data is generated every day. This data
provides the basis for carrying out a number of studies on different aspects,
and it is therefore very important to monitor the farm data. It is for this
reason that the Director of the Institute and others in managerial positions need
various kinds of information relating to the production of fodder, herd milk
production, milk yield per animal per day, variability in milk yield, etc. The
daily data generated at the farm, when processed correctly and efficiently,
can provide answers to all these queries.
The Computer Centre of the Institute maintains the farm data on a
regular basis and prepares various reports at the press of a button; these
reports earlier used to be prepared manually, taking days and even weeks.
The Centre has thus not only standardized the various routine reports that are
prepared at fixed intervals – daily, fortnightly and monthly – but also
prepares periodical reports for varying periods as and when required. Various
reports are prepared on a regular basis for submission to the appropriate
authorities for their perusal and remedial action. A brief description of how
the data is maintained and what types of reports are generated at the Centre is
presented in the ensuing paragraphs.

2.1 Maintenance of animal data


In order to manage Cattle Yard activities effectively, relevant data must
flow from its different sources of origin to the Computer Centre on a regular
basis, where it is maintained on computer for report generation. Accordingly,
detailed data on a well-structured schedule are received at the Centre
from the concerned sources and updated on the computer every day. The data
comprise the breed-wise and species-wise number of animals in milk and dry,
their daily milk yield, the highest-yielding animal of the day with its milk yield,
milk delivered for processing into various milk products, fodder supplies
received from the farm section, revenue earned from the sale of milk products, etc.

2.2 Preparation of ration schedule


A ration schedule is prepared every fortnight with the help of a
standardized computer package, which computes, on a scientific basis, the dry
matter supplied to the NDRI herd (including the experimental herd) through
fodders and feeds, using the previously determined percentage of dry matter in
these feeds and the average strength of the herd to be fed during the fortnight.
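The arithmetic behind such a schedule can be illustrated with the following minimal Python sketch; the feed quantities, dry-matter fractions and herd strength used here are hypothetical and are not NDRI figures:

    # Hypothetical dry-matter arithmetic for a fortnightly ration schedule.
    # Feed quantities, dry-matter fractions and herd strength are assumed values.
    feeds = {
        # feed name: (quantity offered per day in kg, dry matter fraction)
        "green fodder": (4000.0, 0.22),
        "dry fodder":   (1200.0, 0.90),
        "concentrate":  (800.0,  0.88),
    }
    herd_strength = 600                                      # average animals fed

    total_dm = sum(qty * dm for qty, dm in feeds.values())   # kg dry matter/day
    dm_per_animal = total_dm / herd_strength                 # kg DM/animal/day

    print(f"Total dry matter supplied: {total_dm:.1f} kg/day")
    print(f"Dry matter per animal:     {dm_per_animal:.2f} kg/day")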
2.3 Preparation of periodical MIS reports
Based on the data updated on the computer daily, various reports,
both tabular and visual, are prepared and passed on to all concerned, in
addition to the Director, for their perusal and corrective action where needed.
The periodicity of these reports is as follows:
1. Daily
2. Fortnightly
3. Monthly
4. Yearly
5. Any other period
These reports have been prepared for more than 20 years and have
been modified from time to time.

2.3.1 Daily Report


A daily report is prepared every day for the previous day and is
sent to the Director, I/c Cattle Yard, I/c Farm Section, I/c Record Cell in
DCB Division and Head, DCB Division. In case the previous day happens to
be a holiday, two reports are prepared, one each for the previous working
day and the holiday, and so on. Such a report, besides giving the actual
number of animals in milk and dry and their milk production for the flock in that
breed/species, also indicates the average milk yield per animal, both wet (per
animal in milk) and overall, to give a fair idea of the breed's performance, which
managers intend to monitor. The report also depicts the feed supplies and the sale
proceeds of milk and milk products, both for the day and cumulatively since the
beginning of the year, which is also very useful information for the authorities.
Since the report is for a single day where no comparative
performance is taken into account, the report is tabular and no visuals/graphs
are prepared in this case.

2.3.2 Fortnightly report


In the fortnightly report, in addition to the average milk yield per animal
per day, the variability in milk yield in terms of standard deviation (S.D.) is
also worked out and reported so that, if the S.D. is undesirably high, this can
be looked into. Moreover, the performance is also compared with two previous
fortnights, the one just preceding the current fortnight and the same fortnight a
year ago, so as to evaluate the performance over time as well. Similarly, fodder
and feed supplies are also reported for all three fortnights. This enables the
authorities to study the changes in milk yield in relation to changes in the feed
supplies.
The report is made more meaningful and capable of yielding
instantaneous conclusions by the preparation of line graphs for all
important breeds of cattle and buffalo, i.e., Karan Swiss (KS), Karan Fries
(KF), Sahiwal (SW), Tharparkar (TP), all cows and Murrah (MU) buffaloes.
The average daily milk yield per animal for each breed forms the basis of the
line diagram, drawn by joining as many points as there are days in the
fortnight under reference. Suitable legends are given on the graphs for a clear
understanding of the trend in milk yield. At times, when some important
conclusions are to be drawn, the graphs are prepared using different colour
prints for different lines depicting different periods/fortnights. Though the
report includes similar information for goats as for cattle and buffaloes, no
graph is prepared in this case, considering the lesser importance of this species.
Besides the tabular and graphical reports, a summary report is
provided with the Director's copy for a quick grasp of the major
observations.

2.3.3 Monthly reports


Only two kinds of monthly reports are presently being prepared. These
are:
1. Demand and supply of fodders and feeds at NDRI Cattle Yard.
2. Milk production performance of cattle, buffaloes and goats at
NDRI.
Of these, the report on fodders and feeds gives an account of the
demand and supply of various green and dry fodders as well as concentrates,
both during the current month and cumulatively since the beginning of the year.
It also depicts the average daily demand and supply of fodders and feeds. A
quick glance at the report gives an immediate, fair idea of how nearly the
demand has been met in the period under reference.
The report is supplemented with a graph, which depicts the quantities
of fodders and feeds demanded and supplied on each day of the month on
the basis of the dry matter content of the various fodders/feeds fed to the cattle.
The graph serves as a quick aid to understanding what has been reported in the
preceding tabular report. Whenever there is a shortage of green fodder, this is
made good through extra feeding of concentrate. How far the demanded
quantities have been supplied can easily be judged by a glance at such a graph.
The second report, on the milk production performance of cattle, buffaloes and
goats at NDRI, Karnal, gives an account of the production performance of different
breeds/species of cattle and buffaloes during the current month as well as the
cumulative performance since the beginning of the year. Besides the milking and
overall averages for animals of different breeds, the report also gives the
percentage of animals in milk and the percentage of animals dry. This helps the
manager to monitor that the percentage of animals in milk does not fall below the
expected level in a given period of time.
The report is supplemented by a number of graphs. These graphs
depict: a) the actual total milk produced by cattle and buffaloes in absolute terms
on a daily basis during the current year and the year before, b) monthly milk
production during different months of the current year and the two preceding
years, c) the average number of animals in milk during the corresponding period,
and d) the average milk yield per day per animal during the current year and the
year before. For the graph on average milk yield per day per animal, the average
daily milk yield is worked out for each of the 52 weeks of the year and as
many points are plotted to draw line diagrams showing the productivity
performance of all breeds of cattle and buffaloes, one graph being prepared for
each breed. A quick appraisal of the productivity performance is possible with
a single glance at the graph.
Of late, the concept of composite graphs showing milk production vis-a-
vis fodder and feed supply on the same graph has been introduced. Two
kinds of graphs are prepared in this category: one showing annual milk
production and fodder supply for the last ten years, and the other showing
monthly milk production and fodder supply for all twelve months of the last
completed calendar year. These graphs indicate whether the variations
in milk production are in consonance with the variations in fodder supply, so
that the reasons can be looked into in case of a mismatch between the two.
To further aid the appraisal of the monthly reports, a summary of
observations is recorded on the top page of the reports.
Another report which was prepared in the past pertained to production
and disposal of semen. This report showed the quantities of neat semen
collected from different donor bulls, production of chilled and frozen semen,
usage of semen doses by various consumption points and inventory of frozen
semen during the period under report.
3.0 MIS – Plant Management
Software tools can be applied to generate useful information relating to
processing of milk and milk products. Some of the possible applications
include : a) procurement and billing system, b) handling losses, c) cost of
production of dairy products, d) labour efficiency, e) formulation of ice-cream
mix and f) sale proceeds of dairy products.
A package had been developed by the Computer Centre to generate
bills for periodic payment to farmers for the supply of milk based on its fat and
SNF content. Though it did not have much relevance to NDRI, where
procurement of milk from farmers is not practised, the package is useful
in situations where milk is procured by milk plants and
payment is made at suitable intervals based on SNF and fat content. The
package could be suitably modified to incorporate any other parameter as per
the requirement.
The daily data on milk handled and products manufactured, if entered
in appropriate software, say MS-Excel, can be used to determine handling
losses in fluid milk in absolute and percentage terms. Similarly, losses in fat
and SNF content could also be studied during the manufacture of various
dairy products. Reports can be generated on a daily basis, on a batch basis or
periodically to indicate the losses incurred in various operations, which can then
be compared with the admissible losses or norms so as to check excess
losses, if any.
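The underlying calculation is straightforward and is sketched below in Python with purely hypothetical figures; the admissible-loss norm shown is an assumption for illustration only:

    # Hypothetical handling-loss calculation for a day's fluid milk.
    milk_received_kg = 10000.0        # assumed milk received for the day
    milk_accounted_kg = 9880.0        # assumed milk dispatched / made into products
    admissible_loss_pct = 1.0         # assumed norm for acceptable handling loss

    loss_kg = milk_received_kg - milk_accounted_kg
    loss_pct = 100.0 * loss_kg / milk_received_kg

    print(f"Handling loss: {loss_kg:.1f} kg ({loss_pct:.2f} %)")
    if loss_pct > admissible_loss_pct:
        print("Loss exceeds the admissible norm - to be looked into.")
    else:
        print("Loss is within the admissible norm.")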
The cost of manufacturing dairy products needs to be worked out from
time to time so as to fix the selling prices of products, which should, as far as
possible, match the prevailing market prices. Software can be developed
to compute these costs instantly by providing the variable input costs, e.g., raw
material, labour, etc., and by using the predetermined overhead costs of the
manufacturing process.
Software can also be developed to find out labour efficiency in milk
plants. The data maintained on the labour employed in different shifts and the
milk handled can be used to work out the turnout per man-hour employed and
compare it with the norms available in this regard. The information so
generated can serve as the basis for determining the quantum of bonus payment
to be made to the workers engaged in dairy operations.
Suitable packages like LP-88 can be employed to formulate linear
programming applications, say in the preparation of ice-cream mix, to
minimize the cost of production and thereby maximize the sales margin, i.e., the
profit earned.
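As a hedged illustration of such a formulation, the following Python sketch uses a general-purpose solver (scipy's linprog, not LP-88) to find a least-cost ice-cream mix; all ingredient compositions, prices and constraint levels are assumed for illustration only:

    from scipy.optimize import linprog

    # Hypothetical least-cost ice-cream mix; compositions, prices and constraint
    # levels are assumed. Variables (kg per 100 kg mix): cream, skim milk,
    # skim milk powder, sugar.
    cost = [120.0, 25.0, 250.0, 40.0]          # Rs per kg of each ingredient
    fat  = [0.40, 0.0005, 0.010, 0.0]          # fat fraction of each ingredient
    snf  = [0.054, 0.090, 0.960, 0.0]          # solids-not-fat fraction

    # Minimum-content constraints written as "<=" by negating both sides:
    # total fat >= 10 kg, total SNF >= 11 kg, sugar >= 15 kg per 100 kg of mix.
    A_ub = [[-f for f in fat],
            [-s for s in snf],
            [0.0, 0.0, 0.0, -1.0]]
    b_ub = [-10.0, -11.0, -15.0]

    A_eq = [[1.0, 1.0, 1.0, 1.0]]              # ingredients must sum to 100 kg
    b_eq = [100.0]

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * 4, method="highs")
    print("Least-cost mix (kg):", [round(x, 2) for x in res.x])
    print("Cost of 100 kg mix (Rs):", round(res.fun, 2))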
The sales proceeds of different products, if maintained on a daily basis,
can be used to generate daily and periodical reports showing product-wise,
product-group-wise and overall sales volumes by developing suitable software
for the purpose. Such a report was prepared at NDRI on a monthly basis for
several years in the past.

4.0 Conclusion
These MIS reports have been found very useful for decision making by all
Directors ever since they were introduced; some Directors have suggested
minor modifications from time to time, which were incorporated as and when
suggested.

Data warehouse and its applications in agriculture
Anil Rai

Indian Agricultural Statistics Research Institute

Library Avenue: New Delhi-110012

1.0 Introduction

"A Data warehouse is a repository of integrated information, available for queries


and analysis. Data and information are extracted from heterogeneous sources as
they are generated. This makes it much easier and more efficient to run queries
over data that originally came from different sources." In other words, a data
warehouse is a database that is used to hold data for reporting and analysis.

2.0 Goals of data warehousing

 To facilitate reporting as well as analysis
 To maintain an organization's historical information
 To be an adaptive and resilient source of information
 To be the foundation for decision making

3.0 Data warehouse Architecture

A data warehouse architecture comprises the following components:

 Operational source systems


 A data staging area
 One or more conformed data marts
 A data warehouse database
4.0 Operational source systems

Operational source systems are developed to capture and process original


business transactions. These systems are designed for data entry, not for
reporting, but it is from these systems that the data warehouse is populated.

5.0 Data staging area

The data staging area is where the raw operational data is extracted, cleaned,
transformed and combined so that it can be reported on and queried by users.
This area lies between the operational source systems and the user database
and is typically not accessible to users.

Data staging is a major process that includes the following sub-procedures:

 Extraction

The extract step is the first step of getting data into the data warehouse
environment. Extracting means reading and understanding the source data, and
copying the parts that are needed to the data staging area for further work.

 Transformation

Once the data is extracted into the data staging area, there are many
transformation steps, including

1. Cleaning the data by correcting misspellings, resolving domain conflicts,


dealing with missing data elements, and parsing into standard formats.
2. Purging selected fields from the legacy data that are not useful for the
data warehouse.
3. Combining data sources by matching exactly on key values or by
performing fuzzy matches on non-key attributes.
4. Creating surrogate keys for each dimension record in order to avoid
dependency on legacy-defined keys; the surrogate key generation
process enforces referential integrity between the dimension tables and the
fact tables.
5. Building the aggregates for boosting the performance of common queries.

 Loading and indexing

At the end of the transformation process, the data is in the form of load record
images. Loading in the data warehouse environment usually takes the form of
replicating the dimensional tables and fact tables and presenting these tables
to the bulk-loading facilities of each recipient data mart. Bulk loading is a very
important capability that is to be contrasted with record-at-a-time loading, which
is far slower. The target data mart must then index the newly arrived data for
query performance.
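A minimal, purely illustrative staging sketch in Python is given below; the field names, records and rules are assumptions and do not describe any particular warehouse, but they show the cleaning, surrogate-key and aggregation steps listed above:

    # Hypothetical staging of raw operational rows: clean, assign surrogate keys,
    # build a simple aggregate. Field names and values are assumptions.
    raw_rows = [
        {"district": " Karnal ", "year": 2003, "milk_tonnes": "512.4"},
        {"district": "karnal",   "year": 2004, "milk_tonnes": "530.1"},
        {"district": "Ambala",   "year": 2004, "milk_tonnes": None},     # missing
    ]

    # 1. Cleaning: standardize spellings, drop (or impute) missing measures.
    clean = []
    for r in raw_rows:
        if r["milk_tonnes"] is None:
            continue
        clean.append({"district": r["district"].strip().title(),
                      "year": r["year"],
                      "milk_tonnes": float(r["milk_tonnes"])})

    # 2. Surrogate keys for the district dimension, independent of legacy keys.
    district_dim, next_key = {}, 1
    for r in clean:
        if r["district"] not in district_dim:
            district_dim[r["district"]] = next_key
            next_key += 1
        r["district_key"] = district_dim[r["district"]]

    # 3. A pre-computed aggregate to speed up common queries.
    total_by_district = {}
    for r in clean:
        total_by_district[r["district"]] = (
            total_by_district.get(r["district"], 0.0) + r["milk_tonnes"])

    print(district_dim)
    print(total_by_district)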

6.0 Data mart

A data mart is a logical subset of an enterprise-wide data warehouse. For example,
a data warehouse for a retail chain is constructed incrementally from individual,
conformed data marts dealing with separate subject areas such as product sales.

Dimensional data marts are organized by subject area (such as sales, finance,
and marketing) and coordinated by data category, (such as customer, product,
and location). These flexible information stores allow data structures to respond
to business changes—product line additions, new staff responsibilities, mergers,
consolidations, and acquisitions.
7.0 Data warehouse database

A data warehouse database contains the data that is organized and stored
specifically for direct user queries and reports. It differs from an OLTP database
in that it is designed primarily for reads, not writes.

An OLAP application is a system designed for few but complex (read-only)
requests. An OLTP application is a system designed for many but simple
concurrent (and updating) requests.

8.0 OLTP vs OLAP

8.1 OLTP (Online Transactional Processing)

o OLTP servers handle mission-critical production data accessed
through simple queries
o They usually handle queries of an automated nature
o OLTP applications consist of a large number of relatively simple
transactions
o They most often contain data organised on the basis of logical relations
between normalised tables

8.2 OLAP (Online Analytical Processing)

o OLAP servers handle management-critical data accessed through
iterative analytical investigation
o They usually handle queries of an ad-hoc nature
o They support more complex and demanding transactions
o They contain data logically organised in multiple dimensions
Differences between Data warehouse and Data mart

1. A data warehouse is a multi-subject information store; a data mart is a single-subject data warehouse.
2. A data warehouse is hundreds of gigabytes in size; a data mart is usually less than 100 gigabytes.
3. A data warehouse is difficult to build; a data mart is easier to build.
Differences between Operational Systems and Data Warehousing (operational systems vs. data warehousing):

 Query: predefined vs. ad hoc
 Amount of information involved in queries: little vs. little to much
 Time horizon of required information: up-to-date vs. historical and up-to-date
 Information level: detailed vs. detailed and summarized
 Multidimensional data: no vs. yes
 CPU use: all day long vs. at maximum or not used
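The ad hoc, multidimensional and summarized character of warehouse queries can be illustrated with a small Python sketch using pandas; the milk-production figures below are hypothetical:

    import pandas as pd

    # A tiny "fact table" of hypothetical milk production, rolled up ad hoc by
    # district and year - the kind of summarized, multidimensional view a
    # warehouse query produces.
    fact = pd.DataFrame({
        "district":    ["Karnal", "Karnal", "Ambala", "Ambala"],
        "year":        [2003, 2004, 2003, 2004],
        "species":     ["cattle", "buffalo", "cattle", "buffalo"],
        "milk_tonnes": [512.4, 530.1, 410.0, 428.6],
    })

    cube = fact.pivot_table(values="milk_tonnes",
                            index="district", columns="year",
                            aggfunc="sum", margins=True)   # margins = totals
    print(cube)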
9.0 Warehouse schema design

Dimensional modeling is a term used to refer to a set of data modeling techniques
that have gained popularity and acceptance for data warehouse implementation,
and it is one of the key techniques in data warehousing.

Two types of tables are used in dimensional modeling: fact tables and
dimensional tables.

Fact tables: These are used to record the actual facts and measures of the
business. Facts are numeric data items that are of interest to the business.
Examples (telecommunication): length of a call in minutes, average number of calls.

Dimensional tables: These establish the context of the facts and store fields
that describe the facts. Examples (telecommunication): call origin, call destination.

A schema is a fact table plus its related dimensional tables.

Star schema

 One fact table


 De-normalized dimension tables
 One column per level/attribute
 Simple and easy overview -> ease-of-use
 Relatively flexible
 Fact table is normalized
 Dimension tables often relatively small
 “Recognized” by many RDBMSes -> good performance
 Hierarchies are “hidden” in the columns
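As a concrete, hypothetical illustration of a star schema (table and column names are invented for this sketch), the following Python/SQLite fragment creates one fact table with two de-normalized dimension tables and runs a typical star-join query:

    import sqlite3

    # Hypothetical star schema: one fact table referencing two de-normalized
    # dimension tables, plus a typical star-join query.
    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE dim_date (
        date_key   INTEGER PRIMARY KEY,
        day INTEGER, month INTEGER, year INTEGER     -- one column per level
    );
    CREATE TABLE dim_animal (
        animal_key INTEGER PRIMARY KEY,
        breed TEXT, species TEXT                     -- de-normalized attributes
    );
    CREATE TABLE fact_milk_yield (
        date_key   INTEGER REFERENCES dim_date(date_key),
        animal_key INTEGER REFERENCES dim_animal(animal_key),
        milk_kg    REAL                              -- the numeric measure
    );
    """)
    con.execute("INSERT INTO dim_date VALUES (1, 15, 6, 2004)")
    con.execute("INSERT INTO dim_animal VALUES (1, 'Karan Fries', 'cattle')")
    con.execute("INSERT INTO fact_milk_yield VALUES (1, 1, 12.5)")

    # Aggregate the fact table by attributes of the joined dimensions.
    for row in con.execute("""
        SELECT a.breed, d.year, SUM(f.milk_kg)
        FROM fact_milk_yield f
        JOIN dim_animal a ON f.animal_key = a.animal_key
        JOIN dim_date   d ON f.date_key   = d.date_key
        GROUP BY a.breed, d.year"""):
        print(row)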
Snowflake schemas

 Dimensions are normalized


 One dimension table per level
 Each dimension table has integer key, level name, and one column
per attribute
 Hierarchies are made explicit/visible
 Very flexible
 Dimension tables use less space
 Harder to use due to many joins
 Worse performance
10.0 How a data warehouse is different from other IT projects

 A data warehouse project is not a package implementation project.

A data warehouse project requires a number of tools and software utilities that
are available from multiple vendors. At present there is still no single suite of
tools that can automate the entire data warehouse effort.

 A data warehouse never stops evolving; it changes with business.

Unlike OLTP systems, which are subject only to occasional changes, a data
warehouse evolves with the decisional information requirements of decision
makers, i.e., it is subject to any changes in the business context of the enterprise.

 Data warehouses are huge.

A pilot data warehouse can easily be more than 10 gigabytes in size. A data
warehouse in production for more than a year can easily reach 1 terabyte,
depending on the granularity and volume of data. Databases of this size require
different database optimization and tuning techniques.

11.0 Data warehouse implementation

The data warehouse implementation team builds or extends an existing


warehouse schema based on the final logical schema design produced during
planning. The team also builds the warehouse subsystems that ensure a steady,
regular flow of clean data from the operational systems into the data warehouse.
Other team members install and configure the selected front-end tools to
provide users with access to warehouse data. An implementation project should
be scoped to last between three and six months. Once the warehouse has been
deployed, the day-to-day warehouse management, maintenance, and
optimization tasks begin. Some members of the implementation team may be
asked to stay on and assist with the maintenance activities to ensure continuity.
The other members of the project team may be asked to start planning the next
warehouse rollout or may be released to work on other projects.

The tasks performed during a warehouse implementation include the following:

 Acquire and set up development environment.


 Obtain copies of operational tables.
 Finalize physical warehouse schema design.
 Build or configure extraction and transformation subsystems.
 Build or configure data quality subsystems.
 Build warehouse load subsystem.
 Set up data warehouse schema.
 Set up data warehouse metadata.
 Set up data access and retrieval tools.
 Conduct user training, testing & acceptance.
12.0 Applications in Agriculture:

Recently, a NATP Mission Mode Project, “Integrated National Agricultural Resources
Information System (INARIS)”, has been taken up in the field of agriculture. Under this project,
a state-of-the-art Central Data Warehouse (CDW) of the country's agricultural resources is
under development at IASRI, New Delhi. This is probably the first attempt at data
warehousing of agricultural resources in the world. It will provide systematic and
periodic information to research scientists, planners, decision makers and developmental
agencies in the form of an On-line Analytical Processing (OLAP) based decision support system.
The project is in progress with active collaboration and support from 13
other ICAR institutions, namely NBSSLUP Nagpur (for soil resources), CRIDA
Hyderabad (for agro-meteorology), PDCSR Modipuram (for crops and cropping
systems), NBAGR Karnal (for livestock resources), NBFGR Lucknow (for fish
resources), NBPGR New Delhi (for plant genetic resources), NCAP New Delhi (for
socio-economic resources), CIAE Bhopal (for agricultural implements and machinery),
CPCRI Kasargod (for plantation crops), IISR Calicut (for spices crops), ICAR Research
Complex for Eastern Region, Patna (for water resources), NRC-AF Jhansi (for agro-
forestry) and IIHR Bangalore (for horticultural crops).
In all, 59 databases on agricultural technologies generated by the Council, research projects in
operation and the related agricultural statistics at the district level from published sources
(from the year 1990 onwards) are being integrated into this information system.
The system is currently in the development phase, with subject-wise data marts being
created, multi-dimensional data cubes being developed for publishing on the Internet/Intranet,
and validation checks being implemented.
The system has been developed keeping in view three groups of users, i.e., (1)
research managers and planners, (2) research scientists and (3) general users. The
information in this data warehouse will be available to users in the form of a decision
support system in which full flexibility in the presentation of the information and its
online analysis, including graphics, is built into the system. The system also provides
the facility of spatial analysis of the data over the Web using the functionalities of a Geographic
Information System (GIS). Apart from this, subject-wise information systems have been
developed for general users, who have access to subject-wise dynamic reports through the Web.
The facilities of data mining and ad-hoc querying will also be extended to a limited set of users.
Therefore, the dissemination of information from this data warehouse to different categories of
users will be through a Web browser with proper user authentication. The web site of the
project has already been launched (www.inaris.gen.in) and the multidimensional cubes,
dynamic reports, GIS maps and some of the information systems are already available to users.

DECISION SUPPORT SYSTEMS AND THEIR APPLICATIONS IN
DAIRYING

D.K. Jain and Adesh K. Sharma


Computer Centre
National Dairy Research Institute, Karnal-132001

1.0 INTRODUCTION
Dairy Sector in India has made commendable progress during the past three decades.
The potential of Information Technology (IT) has not been fully tapped in this sector.
The rapid strides that the country has registered in the IT field will remain incomplete
unless IT is optimally utilized to enhance the productivity, profitability, stability and
sustainability of the agricultural and dairying sector so that the quality of life of the
present and future generation improves. Technologies and tools that have been
developed in this field can play a significant role in overcoming most of the problems
related to agriculture, dairying and rural development.
In the agricultural and dairying sector, which is in the midst of powerful changes
influenced by industrialization and modernization, farm consolidations, reduced or
eliminated subsidies, environmental problems, land use conflicts, biotechnology, and
increased overall risk, the availability, accessibility, and application of contemporary
expert agricultural / dairying information is of high priority for farmers and
researchers. In view of the demands for the most current information, numerous
scientific and academic institutions and industry have turned to computerized
Decision Support Systems (DSS) as a means of packaging biological, agricultural,
and technical information to make it more easily accessible and useful for a
large number of beneficiaries in a rapidly transforming and competitive world.
Decision Support Systems (DSS), Group Decision Support Systems (GDSS),
Executive Information Systems (EIS) and Executive Support Systems (ESS), Expert
Systems (ES), and Artificial Neural Networks (ANN) are some of the major
technologies being designed to bring a desired change in this direction.

2.0 Decision Support Systems


A Decision Support System (DSS) can be defined as a system under the control of
one or more decision makers that assists in the activity of the decision making by
providing an organized set of tools intended to impart structure to portions of the
decision-making situation and to improve the ultimate effectiveness of the decision
outcome. Decision Support Systems (DSS) are a specific class of computer-based
information system that support technological and managerial decision making by
assisting the organization with knowledge about ill-structured or semi-structured
issues. Ill-structured problems are usually less tangible: the decision maker has a
feeling or recognition of dissatisfaction or uneasiness with the way things are, but
has difficulty clearly defining that uneasiness, its causal factors, and possible
solutions. By nature, unstructured problems do not have structured solutions. These
are the problems that require creativity, logic and reasoning, intuition,
experimentation, and in some cases best guesses and gut feelings. Thus, the major
focus of a DSS is on problem structuredness, the effectiveness of a given decision,
and managerial control. The major components of a DSS are the data management
system, the model management system, the knowledge engine, the user interface and the user.
DSS is an interactive computer program that uses analytical methods and models to
help decision-makers formulate alternatives for large unstructured problems, analyze
their impacts, and then select appropriate solutions for implementation. The DSS will
essentially solve or give options for solving a given problem. The decision process is
structured in a hierarchical manner, the user inputs various parameters, and the DSS
essentially evaluates the relative impact of taking decision x instead of decision y.
The broader definition incorporates the above narrow definition but also includes
other technologies that support decision-making such as knowledge or information
discovery systems, database systems, and geographic information systems (GIS).
A properly designed DSS is an interactive software-based system intended to help
decision makers compile useful information from raw data, documents, personal
knowledge, and/or business models to identify and solve problems and make
decisions.
Decision support systems are used by organizational decision-makers to improve
strategic, tactical and operational decisions.
DSS can be classified into the following five broad categories:
1. Communications-driven DSS emphasizes communications, collaboration,
and shared decision-making support. Examples are simple bulletin boards,
threaded e-mails, audio conferencing, web conferencing, document sharing,
electronic mail, computer-supported face-to-face-meeting software, and
interactive video. It enables two or more people to communicate with each
other, share information, and coordinate their activities. Communications-
driven DSS is often categorized according to a time/location matrix, using the
distinction between same time (synchronous) and different times
(asynchronous), and that between same place (face-to-face) and different
places (distributed).
2. Data-driven DSS emphasizes access to and manipulation of time-series data
from an internal or external database source. Users can access relevant data by
simple query and retrieval tools for further synthesis and analysis for example
weather-related databases, agricultural and animal farm related databases.
3. Document-driven DSS integrates a variety of storage and processing
technologies to provide users document retrieval and analysis: this may
sometimes be found in libraries.
4. Knowledge-driven DSS is an expert or rule-based system where facts, rules,
information, and procedures are organized into schemes that allow for more
informed and effective decision-making. This is also sometimes referred to as
the "expert" type of DSS.
5. Model-driven DSS emphasizes access to and manipulation of a model, for
example, statistical, financial, optimization, simulation, and deterministic,
stochastic, or logic modeling. Model-driven DSS generally requires input data
from the end-user to aid in analyzing a situation.
A decision support system may present information graphically and may include an
expert system or artificial intelligence (AI). It may be aimed at business executives or
some other group of knowledge workers.

3.0 Group Decision Support Systems and Groupware Technologies


Although a large proportion of daily decisions faced by a typical manager must be
made by an individual, many of the decisions faced in today’s context are made by
group of individuals. Group decision making process has several advantages in the
sense that it brings a wider variety of perspectives, and the potential synergy
associated with collaborative activity besides having multiple sources of knowledge
and experience. Group decision making also has disadvantages which, when left
unattended, can result in decision outcomes ranging from the problematic to the
catastrophic. Sometimes too many decision makers result in either a bad decision or
no decision at all. Some researchers have argued for replacing the term group
decision maker with the term Multiparticipant Decision Maker (MDM) and
accordingly there are MDM support technologies.
The technologies as applied in a particular MDM context are Electronic Board Room,
Teleconference Room, Group Network, Information Centre, Collaboration Laboratory
and Decision Room. The simplest type of MDM support system is the Electronic
Board Room in which the primary technology is used in support of audiovisual and
multimedia activities. The highest level of system is the Decision Room in which
sophisticated computer technologies provide tools to support such activities as
brainstorming, analysis of issues, commentary, and consensus assessment or voting.

4.0 Executive Information Systems (EIS)


An EIS is a computer based system intended to facilitate and support the information
and decision making needs of senior executives by providing easy access to both
internal and external information relevant to meeting the stated goals of the
organization. The basic purpose of EIS is to provide senior managers with the
information they need about their operating environment, internal operations, industry
knowledge and events, markets, customers, and suppliers.

5.0 Expert Systems and Artificial Intelligence


An expert system is defined as a computer-based application that employs a set of
rules based upon human knowledge to solve problems that require human expertise;
in essence, it replicates the expertise of human experts in different fields. An
expert system is a computer program that contains stored knowledge and solves
problems in a specific field in much the same way that a human expert would. The
knowledge typically comes from a series of conversations between the developer of
the expert system and one or more experts. The completed system applies the
knowledge to problems specified by a user.
An Expert System (ES), also called a Knowledge Based System (KBS), is a computer
program designed to simulate the problem-solving behaviour of an expert in a narrow
domain or discipline. In agriculture, expert systems unite the accumulated expertise of
individual disciplines, e.g., dairying, horticulture and agriculture, into a framework
that best addresses the specific, on-site needs of farmers. Expert systems combine the
experimental knowledge with the intuitive reasoning skills of a multitude of
specialists to aid farmers in making the best decisions for their animals / crops.
Some expert systems with application to food industry in general and dairy industry in
particular exist for the assessment of dairy processing plants with regard to product
safety, product quality and energy economy, etc.
HeatProc is one such expert system program. This facilitates assessment of milk
pasteurization plants. The most outstanding principle involved in the software is the
P*-concept characterizing processes differing in terms of time and temperature as to
microorganism death rate. The software has been developed using Crystal 3.5 expert-
system-shell (a 4GL written in “C”), which is a product of Intelligent Environment
Ltd. Richmond (UK). The software is a DOS-based program.
CookSim is another example of a knowledge-based system that guides the user
towards a safe thermal process design by automatically solving the associated
microbial kinetics. The CookSim package comprises of four components, viz., 1)
knowledge-base decision support : it facilitates data browsing and editing, and tools to
display simulation results and compare processor and bacterial distribution rates; 2)
object-oriented database : it contains the recipe, consisting of the food, the package,
the process in the CookSim terminology; 3) simulation component : it includes a
finite difference procedure for conduction heat transfer analysis, and algorithms to
calculate the thermal inactivation; 4) The user interface : it is a graphically oriented
and provides an easy way for the user to interact with the software.
A related approach was also followed in the development of the ChefCad package for
computer-aided design of complicated recipes consisting of consecutive heating /
cooling steps. This package is implemented on top of the C-based proKappa expert
system shell, which provided rule-based reasoning and object-oriented data structures,
and which runs on a UNIX workstation. The user interface works under the Windows
environment.
Artificial Neural Networks (ANNs) have emerged to be an important IT tool in the
area of artificial intelligence, where the goal is to develop tools capable of performing
cognitive functions such as reasoning and learning. A neural network is a powerful
data modeling tool that is able to capture and represent complex input / output
relationships. The development of neural network technology stemmed from the
desire to develop an artificial system that could perform intelligent tasks similar to
those performed by the human brain. A neural network acquires knowledge about
hidden data patterns through learning. It can represent both linear and non-linear
relationships and can learn these relationships directly from the data being modeled.
Such relationships/models are later used on the remaining data sets
to make predictions. The approach here consists of creating systems in which the
structure of the human nervous system is mimicked and the signal treatment as
perceived to be performed in human nervous system is approximated. ANNs have
been used by researchers to solve problems related to agricultural and dairying
systems in the western world. ANN tools have been applied for quality assessment of
agricultural produce, predicting the milk yield and quality and shelf-life of dairy
products.
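A minimal sketch of this idea is given below; it uses scikit-learn's MLPRegressor on synthetic records (lactation number, days in milk, daily concentrate) to predict daily milk yield. The data, features and network size are assumptions for illustration and do not reproduce any of the published applications mentioned above:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Synthetic records of [lactation number, days in milk, concentrate (kg/day)]
    # and daily milk yield (kg); the relationship and noise level are invented.
    rng = np.random.default_rng(0)
    X = rng.uniform([1, 10, 2], [6, 300, 8], size=(200, 3))
    y = 6 + 1.2 * X[:, 0] - 0.02 * X[:, 1] + 1.5 * X[:, 2] \
        + rng.normal(0, 0.5, 200)

    # A small feed-forward network learns the input/output relationship.
    model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
    model.fit(X[:150], y[:150])

    print("Test R-squared:", round(model.score(X[150:], y[150:]), 3))
    print("Predicted yield:", round(model.predict([[3, 120, 5]])[0], 2), "kg/day")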

6.0 Creative Decision Making and Problem Solving


Creativity, intuition and innovativeness are important components of the decision-
making process. Creativity is the ability to see the same things as everyone else but
to think of something different; it involves the ability to generate novel and useful
ideas and solutions to problems and challenges. From research, it is known that
human creativity can be supported by computer-based systems and that the use of
intelligent agents can both extend our reach and relieve us of many tasks, allowing us
to spend more time developing the creative side of our solutions and allowing
intuition to permeate the process. The Analytic Hierarchy Process (AHP) is one such
creative problem-solving technique. The AHP is a mathematically based theory and
comprehensive methodology for decision making that employs two key inputs, viz.,
data on the various variables that make up the decision, and the judgements of
decision makers about these variables, in order to organise and evaluate the relative
importance of selected objectives and the relative importance of alternative solutions
to a decision. The AHP consists of two steps, viz., the structuring of a decision into a
hierarchical model, and the pairwise comparison of all objectives and alternative
solutions. Intuition, although not the same as creativity, is one of the important
elements in creative decision making and problem solving. Managers who can
harness their intuition can often respond more quickly to a given situation and apply
both experience and judgement. In a DSS, the decision maker should bring intuition
into the decision process as a supplement to other problem-solving resources, including creativity.
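The pairwise-comparison step of the AHP can be made concrete with the following small Python sketch; the three objectives and the judgement values on Saaty's 1-9 scale are hypothetical, and the priority weights are obtained from the principal eigenvector of the comparison matrix:

    import numpy as np

    # Hypothetical pairwise comparisons of three objectives on Saaty's 1-9 scale
    # (the matrix is reciprocal: A[j, i] = 1 / A[i, j]).
    A = np.array([[1.0, 3.0, 0.5],
                  [1/3, 1.0, 0.25],
                  [2.0, 4.0, 1.0]])

    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                 # principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                             # normalized priority weights

    # lambda_max close to n indicates consistent judgements.
    consistency_index = (eigvals[k].real - len(A)) / (len(A) - 1)

    print("Priority weights:", np.round(w, 3))
    print("Consistency index:", round(consistency_index, 3))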

7.0 Data Mining


Data mining is a set of activities used to find new, hidden, or unexpected patterns in
data. Some data mining technologies are statistical techniques capable of
accommodating nonlinearity, multiple outliers, and the non-numerical data typically
found in a data warehouse environment. Neural networks, genetic algorithms and
fuzzy logic are machine learning techniques that can be used to derive meaning from
complicated and imprecise data, so that patterns and trends within the data can be
extracted and detected. The decision tree is another data mining technique, often
used to assist in the classification of items or events contained within the warehouse.
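As a small illustration of the decision-tree technique, the sketch below fits a shallow tree to synthetic milk-quality records; the features, values and labels are assumed for illustration only:

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Synthetic milk-quality records of [fat %, acidity] with invented labels.
    X = [[3.5, 0.14], [4.1, 0.15], [3.2, 0.19], [4.0, 0.21], [3.8, 0.13], [3.0, 0.22]]
    y = ["accept", "accept", "reject", "reject", "accept", "reject"]

    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["fat_pct", "acidity"]))   # learned rules
    print(tree.predict([[3.9, 0.16]]))                               # new sample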

8.0 DSS Applications


International DSS programs
 Cow Culling Decision Support System - An interactive Decision Support
System (DSS) developed by the Department of Agricultural and Resource
Economics, The University of Arizona. This provides recommended
educational information on cow culling decisions for all combinations of cow
ages and cost differentials, to the ranchers. The major components of the DSS
include: 1) Biological Factors; 2) Market Factors; 3) Costs of Production; 4)
Management Alternatives; 5) Joint Consideration of Biological and Market
Factors; 6) The Value of Pregnancy Testing; and 7) Managing Herd
Composition. (Visit http://ag.arizona.edu/arec/cull/culling.html for more
details).
 Dairypert - an expert system (Dairypert) pertinent to a broad array of
biological and economic factors critical to the successful management and
operation of dairy herds throughout the United States has been developed
using both rule and model based components. DairyPert permits both
diagnostic evaluation of a dairy's current operation and limited predictive
analysis of changes in management practices. Heuristics used by experts from
a number of fields are combined with a modified version of the "Cornell Net
Carbohydrate and Protein" nutrition model, and natural language data entry
and advice routines. The knowledge base is structured modularly to determine
constraints to greater profitability and/or production in eight key management
areas. The areas are nutrition (including feed ration evaluations), physical
facilities and environment, herd health, reproduction, replacement, milking
practices, herd management ability, and economic constraints (including
financial, labor and risk factors). Each key area is further subdivided. The
system contains over 320 substantive rules. NEXPERT shell is used for
implementation, along with links to FoxPro database programs for data entry
and transcript readout, and Excel spreadsheets for nutrition models. Both IBM
and Apple Macintosh personal computers are used as platforms for the effort.
 NutMan - a nutrient management DSS program developed by Information
Systems and Insect Studies (ISIS; http://www.isis.vt.edu/) Lab of Virginia
Tech (http://www.vt.edu) and the US’s Virginia Department of Conservation
and Recreation (http://www.state.va.us/~dcr/sw/swindex.htm) and is in use in
Virginia farms since 1994. For further details and downloads visit
http://www.isis.vt.edu /dss/nutman/.
 DAIRYPRO - a computer package designed to assist dairy farmers in
northern Australia to make strategic decisions on their farm. The package is a
knowledge based decision support system (KBDSS) i.e. a combination of a
knowledge based system or expert system and a decision support system. The
statistical package SAS has been used for statistical model development. An
iterative prototyping method called 'evolutionary prototyping' has been used
for software development. The expert system shell 'Level 5 object for
Windows' was used to assist in the development of DAIRYPRO as it provided
a convenient method of separating the knowledge base from the program
control mechanisms.
 An Individual Feed Allocation Decision Support System for the Dairy
Farm - A fuzzy logic based decision support system that allocates
concentrated feed to cows through individual feeders according to their
performance has been jointly developed by the Dept. of Industrial Engineering
and Management, Ben-Gurion University of the Negev, Israel; and the
Institute of Agricultural Engineering Israel. Fuzzy sets were constructed by
analysing experimental data based on which ranges of values for the fuzzy sets
members were determined by an expert. The model was developed on a group
of 15 cows in a dairy of 50 milking cows (two milkings a day) under a
controlled dietary regime, assuming characteristic cow behaviour concerning
body weight, milk yield, lactation period, and concentrate feed consumption.
The data were analysed by the decision-support system and compared to
weekly decisions of the expert. Results indicated a relationship between
historical cow performance and concentrate feed intake. The advantage of
using fuzzy logic was that it enabled the concentrate feeding ration to be
adjusted according to performance. The system made it possible to automate
decision making, thus providing the farmer with a valuable tool. However,
from an economic point of view, no statistically significant improvement was
achieved by the fuzzy logic system.
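To make the fuzzy-logic idea concrete, the following minimal Python sketch grades a cow's daily milk yield with triangular membership functions and blends three assumed concentrate levels accordingly; none of the numbers are taken from the system described above:

    import numpy as np

    # Hypothetical fuzzy grading of daily milk yield into "low", "medium" and
    # "high", blended into a concentrate ration; all numbers are assumed.
    def triangular(x, a, b, c):
        """Triangular membership with support [a, c] and peak at b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def concentrate_kg(milk_yield_kg):
        grades = np.array([triangular(milk_yield_kg, 0, 8, 16),    # low
                           triangular(milk_yield_kg, 10, 18, 26),  # medium
                           triangular(milk_yield_kg, 20, 30, 45)]) # high
        levels = np.array([2.0, 5.0, 8.0])     # assumed kg concentrate per grade
        if grades.sum() == 0.0:
            return float(levels[0])            # fallback outside all supports
        return float(np.dot(grades, levels) / grades.sum())

    for y in (6, 15, 28):
        print(f"Milk yield {y} kg/day -> concentrate {concentrate_kg(y):.1f} kg/day")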
9.0 DSS developed at NDRI, Karnal
 An Interactive data-driven DSS on dairy production aspects – A
comprehensive interactive relational database oriented information system on
several aspects of dairy production pertaining to India has been developed by
the Institute Computer Centre with the main goal to integrate the scattered data
(available in different formats) on a single uniform platform so as to gauge the
temporal and spatial trends in dairy statistics. The aspects included in the
database are bovine population, milk production, average milk yield of milch
animals, area under fodder crops, dairy, meat and feed factories,
imports/exports of livestock products, value of output from milk and livestock
sector, Codex and PFA standards for milk and milk products, wholesale/retail
prices of selected brands of milk products in Delhi markets, wholesale/retail
price indices of various livestock products and excise duty on dairy products.
The secondary data is being collected regularly from various government and
R&D organizations. Data have been standardized using database
normalization techniques and then converted into database format using a
relational database management system (RDBMS). A graphical user interface
(GUI) has been developed for storage and retrieval of information from the
database. The present information system provides customized information on
the above noted aspects that is being shared by the potential users including
researchers, planners, administrators, policy makers, dairy industry, and the
farming community at large with a view to understand and visualize the dairy
scenario in the country.
 MSI-NDRI – is a GUI-oriented, model-based DSS software package that runs under the MS-
Windows operating environment. It was developed by using MS-Access 97
RDBMS program as back end tool and MS-Visual Basic 6.0 and Seagate
Crystal Reports 7.0 as front end software development tools. The software can
plot and analyze Moisture Sorption Isotherms (MSI) and estimate model
constants for BET, Caurie, Halsey, Iglesias and Chirife, Smith, Oswin, GAB
and Modified Mizrahi equations. The accuracy of prediction can be
determined by estimating residuals and RMS values. The program can
calculate GAB constants, and determine their temperature dependence. Other
information that can be generated by the software are isosteric heat of
sorption, pore size, monolayer moisture, properties of sorbed water viz.
number of adsorbed monolayers, bound or non-freezable water, density of
sorbed water and surface area of adsorption. The software MSI-NDRI is a
versatile tool for scientists and R & D laboratories engaged in the areas of
food processing and preservation.
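As a hedged illustration of the kind of curve fitting MSI-NDRI performs (this is not the software's own code; the sorption data and starting values below are placeholders), the GAB equation can be fitted to water activity–moisture data with a generic least-squares routine:

import numpy as np
from scipy.optimize import curve_fit

def gab(aw, m0, c, k):
    # GAB isotherm: equilibrium moisture content as a function of water activity aw.
    return m0 * c * k * aw / ((1 - k * aw) * (1 - k * aw + c * k * aw))

# Placeholder sorption data: water activity and equilibrium moisture (% dry basis).
aw  = np.array([0.11, 0.33, 0.44, 0.57, 0.69, 0.75, 0.84])
emc = np.array([2.1, 4.0, 5.1, 6.8, 9.2, 11.0, 15.3])

params, _ = curve_fit(gab, aw, emc, p0=[5.0, 10.0, 0.8])
m0, c, k = params
residuals = emc - gab(aw, *params)
rms = 100 * np.sqrt(np.mean((residuals / emc) ** 2))   # percent root-mean-square error
print(f"GAB constants: M0={m0:.2f}, C={c:.2f}, K={k:.3f}; RMS = {rms:.1f}%")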
 Web-enabled information system for online searching, ordering and
maintenance of dairy cultures – is a Web-enabled communication-based
DSS that facilitates online search for desired strains. After searching the
particular strain(s), the same can be ordered online too. The software also
facilitates viewing a list of orders received during a specific interval of time
along with the other details about the ordered strains. The program comprises
the following three major modules: 1) Searching and ordering of strains; 2)
Database maintenance for strains and members; and 3) Dispatch of strains to
members. The system has been developed using various Web-authoring and
publishing software tools such as Microsoft Access 97 (an RDBMS) as
backend, Internet Information Server (IIS), Active Server Pages (ASP), VB
Script, JavaScript, HTML, etc. as the front end. The system is capable of carrying out
both elementary and advanced operations online, viz. adding new
records, modifying existing information, deleting obsolete entries, system
administration, and online querying and ordering of strains. The system is
password protected at every level: online members have passwords for their
accounts, with which they can log in and order strains, and the administrator
also has a password-protected system for overall control.
 Multimedia information system on transferable dairy technologies - The
institute has been engaged in evolving appropriate dairy production and
processing technologies to suit Indian conditions. Several technologies have
been successfully developed over time. Many of these innovations are ready
for commercial exploitation. These transferable technologies have been
documented in the form of an interactive multimedia document-driven DSS.
The end-users can directly access the desired information about any technology, as
well as detailed information about the developer for further interaction.
10.0 Emerging trends – Intelligent DSS
Intelligent decision support systems extend the notion of decision support by
adding techniques originating from Artificial Intelligence such as knowledge bases,
neural networks, genetic algorithms, fuzzy logic and learning. They offer approaches
for assisting decision making where information is incomplete or uncertain and where
decisions must be made using human judgment and preferences.
Multimedia technology is emerging as a key element for the adequate
presentation of the complex information managed by a DSS as the ultimate goal is to
provide the human decision maker with the relevant information on the basis of
available underlying data. Enhancing DSS with multimedia components allows more
effective and appealing presentations through the combination of different
representation formats (e.g., text, various kinds of graphics, and animations) so that
the strength of one medium will overcome the weakness of another. In particular, the
capabilities for 3-D graphics and animation offered by Virtual Reality (VR) tools may
increase the effectiveness of visualizations by mapping high-volume, multi-
dimensional data into meaningful presentations and by enabling interaction with these
displays.

11.0 DSS of the Future
Short-term DSS trends forecasted by Sprague and Watson (1996), which will shape the
future of decision making, are given below.
 Personal computer-based DSS will continue to grow mainly for personal support.
Integrated micro packages containing spreadsheets will take on more and more
functions. Newer packages for `creativity support' will become more popular as
extensions of analysis and decision making.
 For pooled inter-dependent decision support, group DSS will become much more
prevalent in the next few years. The growing availability of local area networks
and group communication services like electronic mail will make this type of DSS
increasingly available.
 Decision Support System products will in future include the tools and techniques
of artificial intelligence. DSS will provide the system for the assimilation of
expert-system knowledge representation, natural language query, and voice and
pattern recognition, resulting in "intelligent DSS" that can `suggest',
`learn' and `understand' in dealing with managerial tasks and problems.
 The development of dialog support hardware, such as light pens, mouse devices,
touch screens, and high resolution graphics, will be further advanced by speech
recognition and voice synthesis. Dialog support software such as menus,
windows and `help' functions will also advance.

12.0 Conclusions
The face of dairying in India can be transformed by a well conceived deployment of
IT as has been illustrated in this write-up. The potential of IT as yet remains untapped
and urgent measures are required to derive maximum benefit. The key players
involved in the process, such as government departments, educational and research
institutions and extension agencies, are required to contribute to this endeavour.
There is a need to develop the necessary IT-based animal husbandry and dairying services
in the country by having a separate web portal to disseminate the know-how
available in this field. Parallel steps are also required to develop the necessary IT
communication infrastructure, utilizing the fibre-optic network wherever it passes
through rural segments, as well as wireless communication devices.

13.0 References:
Bakker-Dhaliwal, R., M. A. Bell, P. Marcotte, and S. Morin, (2001) Decision Support
Systems (DSS): Information Technology in a Changing World. Published on
the Internet: http://www.irri.org/irrn262mini1.pdf.
Conlin, Joe (1990) Dairy Farm Decision Making. Business Management Collection,
Issue 103. Published by Department of Animal Science, University of
Minnesota, 101 Haecker Hall 1364 Eckles Avenue St. Paul, Minnesota 55108.
Herzog, Gerd (2003) Multimedia Information Presentation for Decision Support
Systems Published on the Internet: http://www.dfki.de/~flint/ (E-mail:
herzog@acm.org).
Ido Morag, Yael Edan, and Ephraim Maltz (2000) An Individual Feed Allocation
Decision Support System for the Dairy Farm. Journal of Agricultural
Engineering Research. E-mail: yael@bgumail.bgu.ac.il
Kalter, Robert J. and Skidmore, Andrew L. (1991) Dairypert: An Expert System for
Management Diagnostics of Dairy Farms. In: Dairy Decision Support Systems
Lawrence R. Jones (Ed), Vol. II, No. 2, Cornell University, 272 Morrison
Hall, Ithaca, NY 14853.
Kerr, D.V., Cowan, R.T. and Chaseling, J. (2003) The development and evaluation of
a Knowledge-Based Decision Support System for northern Australian Dairy
Farms. Published on the Internet: http://wcca.ifas.ufl.edu/archive/7thProc/
KERR/KERR.htm#N_1_; (E-mail: kerrd@dpi.qld.gov.au)
Power, D. J. (2000) Web-Based and Model-Driven Decision Support Systems:
Concepts and Issues. Proc. the 2000 Americas Conference on Information
Systems, Long Beach, California, August 10–13.
Power, D.J. (2003) A Brief History of Decision Support Systems. (Version 2.8)
Published on the Internet: http://DSSResources.COM/history/dsshistory.html.
Sharda, R., S. Barr, and J. McDonnell (1988) Decision Support Systems
Effectiveness: A Review and an Empirical Test, Management Science,
34(2)139-159.
Sharma, Adesh K., Verma, Deepak, Jain, D. K. and Singh, Rameshwar. Web-Enabled
Database System for Online Searching, Ordering and Maintenance of Dairy
Cultures. Indian Dairyman (In Press).
Smith, Terry R., Eastwood, Basil R. and Radtke, Angela Faris (2003) Agricultural
Databases for Decision Support (ADDS) - A Strategy for Developing and
Disseminating Knowledge-Based Systems. Published on the Internet:
http://wcca.ifas.ufl.edu/archive/ 7thProc/SMITH/SMITH.htm#N_1_
Sprague Jr., Ralph H. and Watson, H.J. 1996. Decision Support For Management.
Prentice Hall, New Jersey.
Sprague, R. H., Jr. (1980) A Framework for the Development of Decision Support
Systems. Management Information Systems Quarterly, 4(4): 1-26.
Sprague, R. H., Jr. and E. D. Carlson (1982). Building Effective Decision Support
Systems. Prentice-Hall, Inc., Englewood Cliffs, N.J.
Turban, E. 1995. Decision Support And Expert Systems. Prentice Hall, New Jersey.
UTILIZING THE POTENTIAL OF DYNAMIC MICROBIAL
MODELING IN QUALITY ASSURANCE OF FOODS

Sanjeev K Anand
Dairy Microbiology Division
NDRI, Karnal

1.0 Introduction
Ensuring the microbial safety and shelf life of foods depends on minimizing the
initial level of microbial contamination, preventing or limiting the rate of microbial growth,
or destroying microbial populations. With many foods, these strategies have been practiced
successfully for years. However, in the last decade, the incidence of food borne disease has
increased, despite the introduction of the Hazard Analysis and Critical Control Points
(HACCP) concept and the promulgation of food safety regulations. This has led to a great
deal of interest in risk analysis for ensuring the safety of food products under diverse
operating conditions throughout the food supply chain.

One component of Risk Analysis is Risk Assessment that is primarily based on the
dynamic microbiological modeling involving population dynamic models of potential food
borne pathogens. This concept is also referred to as predictive microbiology as it uses
mathematical models to define the growth kinetics of food microorganisms and predict
microbial behavior over a range of conditions. Predictive microbiology is used to assess the
risks of food processing, distribution, storage and food handling; and, to implement control
measures in order to protect the microbiological quality of foods, important for both food
safety and product quality. Ecological theory suggests a wide ranging continuum of
microbial community dynamics is possible and the competitive inhibition is not inevitable.
This would indicate that the path and outcome of competitive interactions may be highly
sensitive to initial conditions and random variation in the key factors such as growth rates
and inter-specific competition coefficients. Even if the outcomes of competitive interactions
are the same, the paths taken may differ, and path-dependent public health outcomes may
produce altogether different situations.

2.0 Phases of microbial growth in foods

Microbial load in a food source depends upon the initial level of bacterial
contamination as well as environmental conditions (temperature, pH, water activity,
preservatives, antimicrobials and the composition of the atmosphere) which influence
growth, inactivation and survival in the food. Most studies in food microbiology are
concerned with the rapid growth of populations, but in many ecosystems, the survival
characteristics of the population also need to be considered. The longevity of bacterial
spores and their resistance to harsh conditions are well documented. However, the ability of
vegetative cells to resist stressful conditions is increasingly recognized as an important
ecologic trait. Attention also needs to be given to relatively slow-growing populations in
various situations, e.g., when the shelf life of a product is extended by control of rapidly
growing spoilage organisms. When populations are in the biokinetic range, the rate at which
they develop is determined by several factors such as temperature, water availability, and
pH applied in food preservation procedures. The extent of microbial growth is thus a
function of the time the population is exposed to combinations of intrinsic food properties
(e.g., salt concentration and acidity) and extrinsic storage conditions (e.g., temperature,
relative humidity, and gaseous atmosphere).

Bacterial growth can exhibit at least four different phases: lag phase, growth phase,
stationary phase and death phase.

2.1 Lag Phase


During the lag phase, cells increase in size but not in number because they are
adapting to a new environment, and, synthesis and repair are taking place. The length of the
lag phase depends on the current environment as well as the previous physiological state of
the cells. Cells that are from a very different environment or are damaged from their
previous physiological state may require more time to adjust. In some foods a lag phase
does not exist, and the cells are ready for immediate growth.

2.2 Growth Phase


During the growth phase, cells grow exponentially and at a constant rate. The
maximum slope of the curve is the specific growth rate of the organism. Cell growth is
dependent upon the current environment (nutrients, temperature, pH, etc.), but is not
dependent upon the previous physiological state. In the field of predictive microbiology,
growth rate is commonly expressed as the change in cell number per time interval.
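For example, assuming two viable counts taken within the exponential phase (the counts and sampling times below are made up for illustration), the specific growth rate and generation time can be estimated as:

import math

def specific_growth_rate(n1, n2, t1, t2):
    # mu = (ln N2 - ln N1) / (t2 - t1); units are 1/h if times are in hours.
    return (math.log(n2) - math.log(n1)) / (t2 - t1)

mu = specific_growth_rate(1e4, 1e6, t1=2.0, t2=8.0)   # counts in CFU/g, times in h
generation_time = math.log(2) / mu                    # doubling time in h
print(f"mu = {mu:.2f} per h, generation time = {generation_time:.2f} h")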

2.3 Stationary Phase


The stationary phase occurs at the maximum population density, the point at which
the maximum number of bacterial cells can exist in an environment. This typically
represents the carrying capacity of the environment. However, environmental factors such
as pH, preservatives, antimicrobials, native microflora and atmospheric composition as well
as depletion of growth-limiting nutrients can affect the maximum population density.

2.4 Death Phase


The death phase occurs when cells are being inactivated or killed because conditions
no longer support growth or survival. Some environmental factors such as temperature can
cause acute inactivation. Others may cause mild inactivation as with growth in the presence
of organic acids.

3.0 Dynamic Microbiology Models


Kinetic or dynamic microbiology involves knowledge of microbial growth responses
to environmental factors summarized as equations or mathematical models. The raw data
and models may be stored in a database from which the information can be retrieved and
used to interpret the effect of processing and distribution practices on microbial
proliferation. Coupled with information on environmental history during processing and
storage, predictive microbiology provides precision in making decisions on the
microbiologic safety and quality of foods. The term "quantitative microbial ecology" has
been suggested as an alternative to "predictive microbiology".

The development, validation, and application of predictive microbiology have been
extensively studied in the last decade. Most of the modeling studies have concentrated on
descriptions of the effect of constraints on microbial growth (rather than survival or death),
often using a kinetic model approach (rather than probability modeling) and most often
describing the effect of temperature as the sole or one of a number of controlling factors.
For example, the temperature dependence model for growth of Clostridium botulinum
demonstrated a good fit to data, but the authors noted that "care must be taken at extremes
of growth, as no growth may be registered in a situation where growth is indeed possible but
has a low probability".

The emphasis in modeling efforts on temperature (often in combination with other
factors) may be justified, given its crucial role in the safe distribution and storage of foods.
Surveys carried out over several decades in the United Kingdom, United States, Canada,
and Australia point to the predominant role of temperature abuse in outbreaks of food borne
disease.

3.1 Growth and no growth inter-phases


Because growth of pathogenic bacteria in foods always increases the risk for food
borne disease, defining the conditions at which no growth is possible is of considerable
practical significance for food manufacturers and regulators. Bacterial growth/no growth
interface models quantify the combined effect of various hurdles on the probability of
growth and define combinations at which the growth rate is zero. Increasing the level of one
or more hurdles at the interface by only a small amount will significantly increase the
probability of "fail safe" events and decrease the probability that a few cells in the
population will resolve the lag phase and begin to grow (a "fail dangerous" event). The
growth/no growth interface also has great physiologic significance because at that point
biosynthetic processes are insufficient to support population growth, and survival
mechanisms are in place.
A procedure to derive the interface was proposed by Ratkowsky and Ross; it
employs a logistic regression model to define the probability of growth as a function of one
or more controlling environmental factors. From this model, the boundary between growth
and no growth, at some chosen level of probability, can be determined. The form of the
expression containing the growth limiting factors is suggested by a kinetic model, while the
response at a given combination of factors is either presence or absence (i.e., growth/no
growth) or probabilistic (i.e., the fraction of positive responses in n trials). This approach
represents an integration of probability and kinetic approaches to predictive modeling.
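A minimal sketch of the logistic-regression idea behind such growth/no-growth interface models is given below; the observations, factor values and probability threshold are invented for illustration, and this is not the Ratkowsky–Ross parameterization itself:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented observations: columns are temperature (deg C) and pH; label = 1 if growth was observed.
X = np.array([[25, 6.5], [25, 5.0], [15, 6.5], [15, 4.5],
              [10, 6.0], [10, 4.5], [5, 6.5], [5, 5.5]])
y = np.array([1, 1, 1, 0, 1, 0, 0, 0])

model = LogisticRegression().fit(X, y)

# Probability of growth at a candidate condition; a low threshold (here 0.1)
# marks a conservative growth/no-growth boundary at a chosen level of probability.
p_growth = model.predict_proba([[12, 5.0]])[0, 1]
print(f"P(growth) = {p_growth:.2f} ->", "no-growth region" if p_growth < 0.1 else "growth possible")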
3.2 Dynamic pathogen models
The incorporation of predictive models into devices such as temperature loggers has
been described for E. coli and Pseudomonas, as has the development of expert systems from
predictive modeling databases.
Based on the work carried out in our lab, we have also been able to develop the
quality and safety prediction models for indigenous milk products like paneer. The models
are based on the Cobb-Douglas equation and can predict the spoilage of the product based
on the total viable counts, moisture and pH. The modifications of the model have also been
developed by incorporating the proteolytic and lipolytic microflora. The model can also
predict the product safety for the common food pathogens such as Staphylococcus aureus
and Escherichia coli. Attempts are currently under way to convert these models into
computer-based applications by developing the necessary software based on artificial
neural network (ANN) modeling.
In general, mathematical models used in predictive microbiology are simplified,
imperfect expressions of the numerous processes that affect bacterial growth in foods. They
are classified as primary, secondary and tertiary.
3.3 Primary models
These models reflect changes in microbial load as a function of time. Primary
models typically have parameters based on cellular mechanisms that affect bacterial
behavior, but since the total number of such cellular processes has not been defined, the
majority of such models are empirical. If all cellular processes could be defined and
incorporated into one model, the resulting model would be too complex for routine use.
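As one widely used empirical example of a primary model, the modified Gompertz curve describes the log count as a function of time; the parameter values in the sketch below are placeholders:

import numpy as np

def gompertz_log_count(t, n0, a, mu_max, lag):
    # Modified Gompertz primary model: log10 count versus time t (h).
    # n0 = initial log10 count, a = asymptotic increase in log10 count,
    # mu_max = maximum growth rate (log10 units/h), lag = lag time (h).
    e = np.e
    return n0 + a * np.exp(-np.exp(mu_max * e / a * (lag - t) + 1))

t = np.linspace(0, 48, 7)                       # sampling times in hours
print(gompertz_log_count(t, n0=3.0, a=6.0, mu_max=0.25, lag=6.0))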

3.4 Secondary models


These models predict changes in primary model parameters based on single or
multiple environmental conditions. An example of a secondary model would be the growth
of a microorganism as a function of temperature.
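A classical example of a secondary model is the Ratkowsky square-root relationship between the maximum growth rate and temperature; in the sketch below the coefficient b and the notional minimum temperature Tmin are placeholder values:

def ratkowsky_mu_max(temp_c, b=0.02, t_min=-2.0):
    # Ratkowsky square-root secondary model (sub-optimal temperature range):
    # sqrt(mu_max) = b * (T - Tmin), so mu_max = (b * (T - Tmin))**2.
    return (b * (temp_c - t_min)) ** 2 if temp_c > t_min else 0.0

for temp in (4, 10, 20, 30):
    print(temp, "deg C ->", round(ratkowsky_mu_max(temp), 3), "per h")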

3.5 Tertiary models


These models express secondary model predictions in a primary model using
spreadsheets and computer software. Several pathogen modeling programmes have been
developed as user-friendly software. However, the accuracy of these predictions
cannot be guaranteed for other bacterial strains and/or environments, without proper
validation studies.

4.0 Process for developing predictive models


Developing a useful model begins with a good experimental design. The following
variables must be considered when planning a predictive microbiology experiment:

4.1 Bacterial Strain(s)


Single strain models are strain specific and may not represent a worst-case scenario
of bacterial growth for a food pathogen or within a food source. On the other hand, a
multiple strain cocktail allows for the most negative outcome to be determined because the
strain with the highest growth rate will predominate. The strain source is important because
pathogens isolated directly from foods are preferred.

4.2 Test Matrix


Microorganisms sometimes grow differently in food matrices than in
microbiological media. Experiments planned for model development should use food as the
growth medium. In the past, the majority of published microbial models used
microbiological media. It is now possible to compare models using food as the growth
medium to past models to learn about the food matrix and how it affects pathogen growth.

4.3 Inoculum Preparation


Typically cells developed in growth media and taken from late growth phase are
used to inoculate the test matrix. What is known about the previous environment for the
inoculum should be a consideration in experimental design, and inoculum preparation
should be consistent from experiment to experiment. Prior environments that are relevant to
food safety issues are not commonly used in experiments though this could be a factor in
relating models to real-world food situations.

4.4 Environmental Conditions


The experimental design should consider what environmental conditions are relevant
to the food source of interest. For each condition, it is necessary to define the potential
range to be encountered in the food so that the model will be able to provide predictions
within that range of values. It is crucial that experiments cover several test values over the
array of values defined. Growth of the inoculum in environments more pertinent to the food
source is essential to detect environmental effects on pathogen growth.

4.5 Microbial Flora


The presence or absence of native microflora (background organisms) is an
important characteristic of the test matrix. Numerous reports document competition between
added pathogen and native microflora in retail foods and a resulting decrease in maximum
population density for the inhibited strain.

5.0 Measures of Model Performance


Models must undergo validation before they are used to aid in food safety decisions.
Validation involves comparing model predictions to experimental observations not used in
model development. Two primary tools for measuring model performance are the bias and
accuracy factor.
The bias factor is a multiplicative factor that compares model predictions with observations
and is used to determine whether the model over- or under-predicts the response time of
bacterial growth. A bias factor greater than 1.0 indicates that a growth model is
fail-dangerous, whereas a bias factor less than 1.0 generally indicates that a growth model
is fail-safe. Perfect agreement between predictions and observations would give a bias
factor of 1.0. The accuracy factor, on the other hand, summarizes the absolute differences
between predictions and observations and thus measures the overall model error.
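In the log-ratio form commonly used in the predictive microbiology literature, these two indices can be computed as follows (the predicted and observed generation times below are placeholders):

import math

def bias_and_accuracy_factor(predicted, observed):
    # Bf = 10**(mean(log10(pred/obs))); Af = 10**(mean(|log10(pred/obs)|)).
    logs = [math.log10(p / o) for p, o in zip(predicted, observed)]
    bf = 10 ** (sum(logs) / len(logs))
    af = 10 ** (sum(abs(l) for l in logs) / len(logs))
    return bf, af

# Placeholder generation times (h): model predictions versus independent observations.
bf, af = bias_and_accuracy_factor([2.0, 3.5, 5.1], [2.2, 3.1, 4.8])
print(f"Bias factor = {bf:.2f}, accuracy factor = {af:.2f}")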

6.0 Modeling Software


Modeling microbial growth and survival using analytical dynamic models has been
successfully used to simulate several organisms with public health significance for risk
analysis. These models are available in a number of commercial packages. One such
programme is Pathogen Modelling Programme (PMP version 7.0). This programme has
been developed and produced at the USDA-ARS Eastern Regional Research Center
(ERRC) in Wyndmoor, Pennsylvania. The PMP is a package of models that can be used to
predict the growth and inactivation of foodborne bacteria, primarily pathogens, under
various environmental conditions. These predictions are specific to certain bacterial strains
and specific environments (e.g., culture media, food, etc.) that were used to generate the
models.
Similarly, ComBase (the Combined Database for Predictive Microbiology) is a database
of microbial responses to food environments, supplied with browser and other supporting
programs. The ComBase is jointly run by the Institute of Food Research, UK, and the
USDA Eastern Regional Research Center; funded by the Food Standards Agency, UK and
the USDA Agriculture Research Service. Its internet-based version is freely available via
http://wyndmoor.arserrc.gov/combase. In this case, the classical predictive microbiology is
based on the assumption that the rate of growth/death of a given micro-organism in the
exponential phase is characteristic of its environment. The maximum rate is the maximum
slope of the “log (cell-conc.) versus time” curve, in a given environment. The most
important environment parameters considered are the temperature, the pH and the water
activity (a quantification of water available to the cells). Other factors such as the
concentrations of additives, preservatives, etc. may also influence the growth rate.
Recently, a software package, the Seafood Spoilage Predictor (SSP), has been developed to
predict the shelf-life of seafood under constant as well as fluctuating temperature storage
conditions. This software can read data from different types of loggers and in this way
evaluate the effect of fluctuating temperatures on shelf-life of seafood. SSP contains relative
rates of spoilage (RRS) models and microbial spoilage (MS) models. Models included in
the software should only be used in products stored within the range of conditions where
they have been successfully validated. A markedly expanded version of the SSP software is
now available at www.dfu.min.dk/micro/sssp/.
Another example of a dynamic model is the Growth Predictor or Perfringens
Predictor. Growth Predictor provides a set of models for predicting the growth of the
organisms as a function of environmental factors, including temperature, pH and water
activity. Some models also include an additional, fourth factor, such as the concentration of
carbon dioxide or acetic acid. No survival or death models are included in the current
version of Growth Predictor. Similarly, Perfringens Predictor provides a prediction of
growth of Clostridium perfringens during the cooling of meats. The input is temperature and
the output is the viable count of C. perfringens.
7.0 Modeling GMOs : A greater challenge
Pragmatic systems biologists normally model the interaction of molecules from the
bottom-up. Systems-theoretic biologists, on the other hand, tend to model systems from the
top-down. For both systems biologies, the narrative or simple diagrammatic models that are
routinely encountered in biology are insufficient for modelling systems. It is believed that
systems modelling has to be mathematical in order to capture the complexities of
higher-level biological organization. Although the cell is an obvious system candidate,
neither form of current systems biology is clear about the properties that merit the label of
a system. This has led to a new approach, 'metagenomics', which is the large-scale study of
the DNA of naturally existing microbial communities (rather than 'artificial' lab cultures). It
can involve the shotgun sequencing of all the genomes in these communities, but is most
likely to be about sequencing or screening large segments of DNA extracted from wide-
ranging environmental samples. Metagenomics does more, however, than merely provide us
with lots of interesting DNA sequence data. It takes a non-traditional focus on the genomic
resource of a dynamic microbial community, not just individual strains of microbes or
individual genes and their functions.

Similarly, intracellular metabolite concentrations are of vital importance at the genomic
levels for the regulation of the cellular metabolic network of microorganisms. Through
quantification of the in vivo concentrations of intermediary metabolites and the
incorporation of knowledge about the kinetic properties of the enzymes involved, it is
possible to develop a dynamic model of the microbial metabolism. These models would
represent an important tool for calculating the effects of genetic manipulations, therefore
allowing a rational 'metabolic design' of microorganisms.

8.0 Conclusions:
The dynamic microbiology models can be effectively used in Total Quality and Safety
Management programmes. Several benefits of mathematical models to predict pathogen
growth, survival and inactivation in foods are: ability to account for changes in microbial
load in food as a result of environment and handling; use of predictive microbiology in
management of foodborne hazards; and, preparation of Hazard Analysis Critical Control
Point (HACCP) plans. In this way, these models depicting dynamic bacterial growth or
inactivation are very efficient tools for predicting the evolution of a microbial population
throughout the complete food supply chain.
Figure. Left: external glucose concentration and intracellular glucose-6-phosphate concentrations from two experiments (G6P1 and G6P2); right: corresponding intracellular phosphoenolpyruvate concentrations from the two experiments.

References:

Adams M, Moss MO. Food microbiology. Cambridge: Royal Society of Chemistry; 1995.
Adair C, Briggs PA. The concept and application of expert systems in the field of
microbiological safety. Journal of Industrial Microbiology 1993;12:263-7.
Anand, S.K. , Chander, H. and Singh, S. 1996. " Predictive microbiology - Role In Dairy
Industry. Indian Dairyman . 48(12) : 25-27
Anand, S.K., Singh, S. and Chander, H. 1996. “A Mathematical Approach To Predict
Microbiological Shelf Life Of Paneer”. International Symposium On
Microbiological Safety Of Processed Foods Organised By Hindustan Lever
Research Foundation At Bangalore From Dec. 2-3.
Anand, S.K., Singh, S. and Chander, H.1997. “A Cobb-Douglas Model To Predict
Microbiological Safety Of Paneer”. 38th Annual Conference And Symposium Of
AMI Held At Jamia Milia Islamia, New Delhi. Dec 12-14.
Anand, S.K., Singh, S. And Chander, H. 1998. “Safety Models For Paneer With Regard To
S.Aureus And E.Coli “. XXIX Dairy Industry Conference, Held At NDRI From 28-
29 Nov.
Anand, S.K., 2000. “Computer Applications In Quality Assurance”. Workshop On
Information Technology In Dairy Research, Organised By Computer Center,
NDRI, Karnal From Jan04 To 07
Anand, S.K. 2005. “Role Of Predictive Models In Microbiological Risk Assessment Of
Food Products” As An Invited Paper In The National Seminar On „Risk Assessment
In Dairy Production And Processing, Organized By NDRI Alumni Association And
IDA (NZ) At NDRI, Karnal During Jan 14-15 2005
Archer DL. Preservation microbiology and safety: evidence that stress enhances virulence
and triggers adaptive mutations. Trends in Food Science Technology 1996;7;91-5.
Davey GR. Food poisoning in New South Wales: 1977-84. Food Technology in Australia
1985;37:453-7.
Enneking U. Hazard analysis of critical control points (HACCP) as part of the Lufthansa in-
flight service quality assurance. International Food Safety News 1993;2:52-3.
Graham A, Lund BM. The effect of temperature on the growth of non proteolytic type B
Clostridium botulinum. Letters in Applied Microbiology 1993;16:158-60.
Gill CO, Harrison JCL, Phillips DM. Use of a temperature function integration technique to
assess the hygienic efficiency of a beef carcass cooling process. Food Microbiology
1991;8:83-94.
Jones JE. A real-time database/models base/expert system in predictive microbiology.
Journal of Industrial Microbiology 1993;12:268-72.
Labuza TP, Fu B. Growth kinetics for shelf-life prediction: theory and practice. J of
Industrial Microbiology 1993;12:309-23.
Leistner L. Food preservation by combined methods. Food Research International
1992;25:151-8.
Maurice J. The rise and rise of food poisoning. New Scientist 1994;144:28-33.
McMeekin TA, Olley J. Predictive microbiology and the rise and fall of food poisoning.
ATS Focus 1995;88:14-20.
McMeekin TA, Olley J, Ross T, Ratkowsky DA. Predictive microbiology: theory and
application. Taunton, UK: Research Studies Press; 1993.
McMeekin TA, Ross T. Shelf life prediction: status and future possibilities. Int J Food
Microbiol 1996;33:65-83.
McMeekin TA, Ross T. Modeling applications. Journal of Food Protection 1996 (Suppl): 1-88.
McMeekin TA, Brown J, Krist K, Miles D, Neumeyer K, Nichols DS, Olley J, Presser K,
Ratkowsky DA, Ross T, Salter M, Soontranon S. Quantitative microbiology: a basis
for food safety. Emerging Infect Dis 1997;3(4):541-49.
Ratkowsky D, Ross T, McMeekin TA, Olley J. Comparison of Arrhenius-type and
Belehradek-type models for the prediction of bacterial growth in foods. J Appl
Bacteriol 1996;71:452-9.
Ratkowsky DA, Ross T, Macario N, Dommett TW, Kamperman L. Choosing probability
distributions for modelling generation time variability. J Appl Bacteriol
1996;80:131-7.
Ratkowsky DA, Ross T. Modelling the bacterial growth/no growth interface. Letters in
Applied Microbiology 1995;20:29-33.
Ross T, McMeekin TA. Predictive microbiology and HACCP. In: Pearson AM, Dutson TR,
editors. HACCP in meat, poultry and fish processing. London: Blackie Academic
and Professional; 1995. P. 330-53.
Singh, S., Anand, S.K. and Chander, H. 1996. “Effect Of Storage Period And Temperature
On Prediction Of Growth Models In Paneer”. International Symposium On
Microbiological Safety Of Processed Foods Organised By Hindustan Lever
Research Foundation At Bangalore From Dec. 2-3.
Snyder OP. Use of time and temperature specifications for holding and storing food in retail
operations. Dairy Food Environ Sanitat 1996;116:374-88.
Stewart T. Growth and inactivation models, FSIRO; fsrio@nal.usda.gov
Whiting RC, Buchanan RLB. Predictive modeling. In: Doyle MP, Beuchat LR, Montville
TJ, editors. Food microbiology fundamentals and frontiers. Washington (DC):
American Society for Microbiology Press 1997. P. 728-39.
AN INTRODUCTION TO PROTEIN STRUCTURE AND
STRUCTURAL BIOINFORMATICS

Jai K Kaushik
Molecular Biology Unit
NDRI, Karnal
1.0 Introduction
Proteins are the central workers in an organism to carry out mechanical, physical,
(metaphysical!), chemical, defense, electronics, postal and diverse other functions. They
may look and act like silky fibers as well as dagger-like prickly needles (amyloids). They can
act as anything from a mechanical pump or a high-energy-generating machine to a simple
appliance for switching gene functions on and off. These wonderful biomolecules play a pivotal
role in all the biochemical reactions in our body. In this talk we shall discuss the structure
and the mechanism by which they assume such unique and diverse functions from a limited
set of blocks known as amino acids. These biomolecules have a hierarchical organization,
simplest being their primary structure which is formed by fusion of the head (amine group)
of one amino acid with the tail (carboxyl group) of another amino acid to form a peptide
bond. The fusion is repeated a number of times in a sequence guided by the gene on which a
polypeptide chain is built to give the primary structure of proteins. This process is
accompanied concomitantly by the folding of the linear polypeptide chain into local
structures known as secondary structures, which in turn reorient and fold to give a unique
three-dimensional structure known as the tertiary structure. In many cases the tertiary structure
is the biologically active structural unit of the protein; however, in some cases the tertiary
structures may assemble further into a higher-order form, known as the quaternary structure of
the protein, to give rise to the biologically active unit.

At this stage, there are several pertinent questions which one may like to ask, e.g.:

1. What sort of building blocks are needed to make a protein? (Chemical nature of proteins).
2. How does the primary structure of a protein know which secondary and tertiary structures it
should assume? (Protein folding code / protein folding problem).
3. Where does the process of protein structure formation begin? (Nucleation of protein folding).
4. How fast and efficiently can a protein fold? (Levinthal’s paradox).
5. Is only one final structure, known as the native structure, possible for a polypeptide chain?
(Thermodynamic versus kinetic control of folding).
6. How many sequences are needed to fulfill all the biological functions? (The
conformational space occupied by the sequences).

The building blocks of a protein are 20 natural amino acids (Figure 1) which consist
of basic amino acids, acidic amino acids, polar amino acids and apolar (hydrophobic) amino
acids spanning all the chemical requirements which may be needed in structure formation,
biological function and solubility and packing in a particular biological milieu. It is argued
that the chemical nature of the polypeptide backbone is the central determinant of the 3-D
structures of proteins. The requirement that buried polar groups form intramolecular
hydrogen bonds limits the fold of the backbone to the well known units of secondary
structure while amino acid sequence chooses among the set of conformations available to
the backbone. Anfinsen, back in the early seventies, suggested that all the information needed
to fold a polypeptide chain into its native structure is encoded in the polypeptide chain itself.
However, evidence exists suggesting an important role played by the environment in
modulating the folding and stability of proteins. In the rest of this lecture we address
some of the issues mentioned above.

Figure 1. Chemical structure of amino acids.

2.0 Structure and Conformation of Proteins

The amino acids are joined by peptide bonds to form a linear polypeptide chain.
The peptide bond is planar, and adjacent planes can be rotated with respect to each other.
Figure 2 shows the three main-chain torsion angles of a polypeptide: phi (φ), psi (ψ)
and omega (ω).
Figure 2. Clockwise from top left: formation of a peptide bond between the amino and
carboxyl groups of two adjacent amino acids; top right: sequential addition of new amino
acids, indicating the new peptide bonds and the phi (φ), psi (ψ) and omega (ω) torsional
angles.

The planarity of the peptide bond restricts ω to 180 degrees in very nearly all of the
main-chain peptide bonds; in rare cases ω ≈ 0 degrees for a cis peptide bond, which usually
involves proline. In a polypeptide the main-chain N–Cα (φ) and Cα–C (ψ) bonds are relatively
free to rotate. Using computer models of small polypeptides, G. N. Ramachandran
systematically studied the relationship between the φ and ψ angles for various secondary
structural elements. He observed that a polypeptide chain cannot take any random
conformation; rather, its conformation is dictated by steric hindrances, resulting in a much
smaller conformational space available for proteins to assume a 3-D structure. The
relationship between the φ and ψ angles for the regular structures is shown in Figure 3;
the plot is also known as the Ramachandran plot.
The white space indicates the disallowed region, while the other secondary structural
elements can assume φ and ψ values only in the regions specified. The only exception is the
glycine residue, which can take φ and ψ torsion values in the white region because the
absence of a side chain results in a large conformational flexibility around this residue.
The presence of glycine in a polypeptide chain introduces a break in the continuity of a
regular structural element. Given the small to bulky side chains of amino acids, only a small
number of structures may be allowed, as shown in the Ramachandran plot (Figure 3).

Figure 3. Ramachandran plot: the left panel shows the conformational space allowed to
different secondary structural elements, while the right panel shows the results obtained by
running PROCHECK (a utility to judge the quality of a model) on pyrrolidone carboxyl
peptidase from Pyrococcus furiosus.
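As an aside, backbone φ/ψ angles for a Ramachandran plot can be extracted from an experimental or modelled structure; the sketch below assumes the Biopython package and a hypothetical coordinate file named model.pdb, neither of which is part of the original text:

import math
from Bio.PDB import PDBParser, PPBuilder

parser = PDBParser(QUIET=True)
structure = parser.get_structure("model", "model.pdb")    # hypothetical file name

phi_psi = []
for peptide in PPBuilder().build_peptides(structure):
    for residue, (phi, psi) in zip(peptide, peptide.get_phi_psi_list()):
        if phi is not None and psi is not None:            # chain termini have undefined angles
            phi_psi.append((residue.get_resname(), math.degrees(phi), math.degrees(psi)))

# Each (residue name, phi, psi) tuple is one point on the Ramachandran plot.
print(phi_psi[:5])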
The optimization of the intramolecular interactions like hydrogen bonding results in
the formation of α-helix, β-strands, turns and random coils due to repetitions of similar
interactions among the backbone atoms of a polypeptide chain. The simplest and most elegant
arrangement is a right-handed spiral conformation known as the α-helix (Figure 4). A left-
handed helix is also possible, and the handedness of a helix can be easily differentiated. An
easy way to remember how a right-handed helix differs from a left-handed one is to hold
both your hands in front of you with your thumbs pointing up and your fingers curled
towards you. For each hand the thumb indicates the direction of translation and the fingers
indicate the direction of rotation. The structure repeats itself every 5.4 Å along the helix
axis, i.e. we say that the α-helix has a pitch of 5.4 Å. α-helices have 3.6 amino acid residues
per turn, i.e. a helix 36 amino acids long would form 10 turns. The separation of residues
along the helix axis is 5.4/3.6 or 1.5 Å, i.e. the α-helix has a rise per residue of 1.5 Å.
Figure 4. Left side shows the conformation of an α-helix while the right side shows the
parallel and anti-parallel β-sheet secondary structural elements and the corresponding
hydrogen bonds making the structure.
Another common structure is the β-sheet. Amino acid residues in the β-conformation
have negative  angles and the  angles are positive. Typical values are  = -140 degrees
and  = 130 degrees. In contrast, α-helical residues have both  and  negative. A section
of polypeptide with residues in the β-conformation is referred to as a β-strand and these
strands can associate by main chain hydrogen bonding interactions to form a sheet.
In a β-sheet two or more polypeptide chains run alongside each other and are linked in a
regular manner by hydrogen bonds between the main chain C=O and N-H groups.
Therefore all hydrogen bonds in a β-sheet are between different segments of polypeptide.
This contrasts with the α-helix where all hydrogen bonds involve the same element of
secondary structure. The R-groups (side chains) of neighboring residues in a β-strand point
in opposite directions. Other important local structures are turns and random coils. A typical
protein is usually made of all the elements, the contents and location of these are dictated by
the primary structure. On the other hand some proteins may be made of only helices while
other may be purely made of sheets. Figure 5 shows typical proteins with different
composition of local structures.

Figure 5. Tertiary structure of Proteins. Left side shows α-helix-coil-β-strand structure while
the right side shows a structure with only β-sheets
and coils.

3.0 Problem of Protein Folding

Figure 6. Energy landscapes of protein folding. The flat-surface landscape indicates that
each residue assumes a conformation independent of the other residues’ conformations,
which provides an astronomical number of equivalent solutions from which the protein has
to find a way to the state N; the central funnel landscape shows a cooperative pathway, in
which each step guides the next one towards the N state, with no kinetic traps and a single
minimum at the bottom of the funnel; the landscape on the right shows that there can be
several pathways to reach the N state, but with varying degrees of kinetics depending upon
the pathway taken. The model indicates that a protein may get trapped in a local minimum
in its search for the global minimum (N state).
One of the most studied but least understood parts in protein science is the protein
folding code. How a protein folds is by and large still enigmatic, in spite of the immense data
available on the process, owing to its sheer complexity. The information for the
folding is encoded in the primary structure of the protein. Each protein has a unique
structure which differentiates it from the other related or unrelated proteins. Homologous
proteins usually have similar fold and overall structure but may differ at atomic levels. It is
also possible that two unrelated sequences may assume a similar fold, structure and
functions, on the other hand two similar sequences may assume quite different function due
to differences in the structure. In the last decade, quantum progress has been made to
understand the code of folding and in many cases it has been possible to simulate the
folding of a protein based on de novo and knowledge-based procedures. The ab initio
methods based on first principle can give the most accurate results but due to extremely
time intensive computations, they are by and large limited to small peptides or proteins. The
protein folding problem can be well appreciated if we assume that many bonds connecting
the atoms in an amino acid and around peptide bond can exist in several conformations.
Using a conservative estimate of 10 conformations per residue, one may come to a
staggering number of 10^100 possible structures for a polypeptide chain of 100 amino acids.
But a protein under physiological conditions usually has only one major conformation,
known as the native-state conformation. Then, how does the protein select that particular
conformation from a list of 10^100 possible conformations within milliseconds, in vivo as well
as in vitro? This is a classic problem known as “Levinthal’s paradox” in protein science
parlance. Most proteins can fold on time scales ranging from sub-milliseconds to minutes, depending upon
the complexity of the structure and the solution conditions. The paradox can be resolved if
we assume that not all possible conformations are productive and a large number of
conformations are precluded due to excluded volume effect as well as due to
interdependence of conformation of one residue on the conformation of other neighborhood
residues.
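A rough back-of-the-envelope calculation makes the paradox concrete; the sampling rate of 10^13 conformations per second used below is an assumed, purely illustrative figure:

conformations_per_residue = 10                 # the conservative estimate used above
n_residues = 100
total_conformations = conformations_per_residue ** n_residues   # 10**100

sampling_rate = 1e13                           # assumed conformations tried per second
seconds_per_year = 3.15e7

search_time_years = total_conformations / sampling_rate / seconds_per_year
print(f"Exhaustive search would need about {search_time_years:.1e} years")   # ~3e79 years
print("Observed folding times: sub-milliseconds to minutes")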
If we go back to the Ramachandran plot, clearly a protein samples structures only in a
limited conformational space by precluding many conformations (those in the white region
of the plot), thus simplifying the problem tremendously. Also, a protein may not try
high-energy conformations but rather follow a fairly direct path to the lowest-energy
conformation, known as the global minimum, as shown in Figure 6 (central landscape). This
is the thermodynamic view of protein folding, which has experimental support but lacks
strong reasoning. On the other hand, arguments support kinetic control of the folding
pathway, but there is not much evidence for it (Figure 6, right panel). However, the latter
mechanism is not without some experimental support. Recent findings of protein misfolding,
mispairing of disulfide bonds and transient non-native intermediate contacts suggest that
protein folding could well be partially controlled by
kinetics, as shown in Figure 7. An unfolded or a nascent protein
kinetics as shown in Figure 7. An unfolded or a nascent protein
has a choice to enter the productive pathway to form a native
structure following the pathway c→b→a or a misfolded protein
(amyloid fibrils) following a c→b→d pathway. The relative
speed of the two processes decides the final structure. The
partially folded protein (b) has exposed hydrophobic surfaces
which may form intramolecular interactions to form the native
globular protein (a) or through competitive intermolecular interactions may lead to the
formation of an amyloid fibril (d). Under in vivo conditions, the accessory proteins known
as chaperones may help state (b) to reach state (a); however, mutations in the amino
acid sequence may still cause a shift in equilibrium in favor of state (d).
Figure 7. The competition between the productive protein folding and unproductive protein
folding pathways. See text for explanation.
Many mechanisms of protein folding have been proposed; some fit a particular class of
proteins, while others fit another group. No single mechanism has been found suitable
to explain the folding pathways of all proteins. Over the years, several mechanisms such as
hydrophobic collapse, the framework model, nucleation-growth, and the jigsaw and box
mechanisms have been proposed. The sequential model, which states that proteins fold
through several hierarchical steps, has been the most successful protein folding mechanism proposed.

4.0 In-silico Protein Folding

There have been tremendous efforts going on to understand the protein folding code
so that novel proteins can be designed for varied therapeutics and industrial applications.
With the quantum jump in computational power, especially in parallel and distributed
computing, it is expected that we may soon be able to fold at least small peptides and
proteins to their final folded state (N). Mayo’s group (at Caltech) designed a
Zn-finger peptide which could be folded into a helical conformation. The group of David Baker
at the University of Washington has developed several tools to design and fold new proteins. This
has been possible due to massive parallel computation, improved force-fields and the
possibility of extending the ab initio (first-principles) methods to proteins. Recently, the
Folding@home programme, which involves distributed computing by employing the idle
power of computers lying in various parts of the world, has been launched. In this
environment a part of the code is sent to a large number of participating client machines on
the internet to simulate protein folding. The idea is that most computers do not use
their full computational power all the time; it is therefore possible to make use of the under-
utilized power of computers on the web by dividing a big problem into small bits
among a large number of machines, which after computation send back the results to the
central server. The project has successfully solved the folding of several small proteins and
peptides, such as a zinc finger, HIV integrase, small β-hairpin peptides and the villin
headpiece (a 36-residue helical protein). More and more success stories are pouring in, and
in the near future it should be possible to take up more difficult problems requiring massive
computational power. In the near
future, it should be possible to understand the protein assembly, aggregation, misfolding
and disease causing protein-misfolding/amyloid formation processes at a much greater
details.

5.0 Protein Structure Prediction

Prediction of protein structure at various levels has been a long standing problem in
protein science and molecular biology. Due to poor understanding of the interatomic
interactions and force-fields, and the lack of enough experimental data, had in the past made
the problem enormously difficult. Prediction of protein structure based on first principles (ab
initio) method was beyond the computational power available, while the semi-empirical or
force-field methods were not accurate enough due to a number of approximations used in
these methods. The knowledge-based methods, which need a lot of experimental data, were not
successful due to the lack of representative protein structures (templates) in each and every
family or class of proteins which had a high sequence homology with the target sequence.
In recent times, due to the availability of immense computing power (mainly through
cluster-based and distributed computing) and a better understanding of molecular interactions,
it has been possible to use ab initio as well as empirical methods with fewer
approximations. Due to the availability of a large number of experimental structures, determined
by NMR or X-ray crystallographic methods, it has also become possible to predict protein
structure with a high accuracy, which may surpass the accuracy obtained by ab initio
methods. In the rest of the talk, the following methods and their use in protein structure
prediction will be discussed:
Ab-initio Methods
Semi-empirical Methods
Knowledge-based Methods
Comparative Modeling (Homology Modeling)

6.0 Structural Thermodynamics

The structure of a protein is dictated by the molecular interactions. The
thermodynamics of the folding process guides a polypeptide chain to assume a specific 3-D
structure. The optimization of the molecular interactions is achieved via passing through
many cooperative interactions leading to the native state of the molecule. Understanding the
protein folding and stability is important not only from academic point of view but also due
to their immense importance in biotechnological industries. Therefore, it is crucial to
understand the contribution of various molecular interactions in the folding and stability of
protein structures. A large number of studies have been carried out to derive the values of
parameters contributing to protein stability. This has been achieved by using small
molecules as model compounds in combination with mutational studies and calculating the
contribution of individual residues and then evaluating the contribution of individual
interactions like hydrogen bonding, hydrophobic interactions, van der Waal interactions,
salt-bridges and disulfide bonds, etc. In recent decades, the solution of a large number of
structures of proteins and their mutants has helped in establishing the structure-stability
relationship governed by molecular interactions. The mutational data in combination with 3-
D structures of proteins have provided an opportunity to understand the protein stability
with a high degree of precision. Now it is possible to theoretically estimate the stability of
mutant proteins which can help in designing of more stable and rugged proteins and ligands.
It has been proposed that the change in stability due to mutations can be evaluated from the
changes in the contributing components like the hydrophobic effect (ΔΔGHP), side-chain
conformational entropy (ΔΔGconf), hydrogen bonds (ΔΔGHB), entropic loss due to internal
water molecules (ΔΔGH2O), cavity volume (ΔΔGcav), and secondary structural propensity of
 helix and  strand (ΔΔGproα and ΔΔGproβ, respectively). Assuming that the structural
changes due to substitutions are negligible in the region except for the mutation site, the
stability difference due to the mutation can be represented by the following equation:

ΔΔG = ΔGmut – ΔGwt
     = ΔΔGHP + ΔΔGconf + ΔΔGHB + ΔΔGcav + ΔΔGH2O + ΔΔGproα + ΔΔGproβ + ΣΔΔGothers   (1)

Parameters defining the contribution of these factors have been refined by a least-square
analysis of the structure-stability data derived from analysis of the mutant proteins relative
to Gly (ΔΔGaa) and it has been found that:
ΔΔGHP = 0.146  ΔΔASAnp – 0.21  ΔΔASAp (2)

where ΔΔASAnp is the change in non-polar accessible surface area (C + S) in Å2, and
ΔΔASAp is the change in polar surface area (N + O) of the protein over unfolding.

ΔΔGconf = – TΔΔSconf (3)


ΔΔGH2O = – 4.51  ΔNH2O (4)
ΔΔGcav = – 0.073  ΔVcav (5)
ΔΔGHB = 22.08Σr[pp]-1+9.13Σr[pw]-1+7.7Σr[ww]-1 (6)
ΔΔGproα = 0.33  ΔPα and ΔΔGproβ = 0.11  ΔPβ (7)

where Pα and Pβ are the propensities of a residue for the α-helix and the β-sheet,
respectively. ΔNH2O and ΔVcav are the changes in the number of water molecules and cavity
volume (ΔVcav = Vcavmut – Vcavwild) close to the mutation site, respectively. r[pp], r[pw] and r[ww]
are the donor–acceptor distances (in Å) for the protein-protein, protein-water and water-
water hydrogen bonds. The change in stability of each mutant with respect to the wild-type
protein (ΔΔGmut-wild) can be evaluated from the difference in ΔΔG between the mutant and
wild-type protein.
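Purely as an illustration, equations (2)–(7) can be translated into a small routine that sums the component terms; the function name and all numerical inputs are hypothetical:

def delta_delta_g(d_asa_np, d_asa_p, t_d_s_conf, d_n_h2o, d_v_cav,
                  r_pp, r_pw, r_ww, d_p_alpha, d_p_beta, others=0.0):
    # Estimate the stability change of a mutant relative to the wild type
    # by summing the component terms of equations (2)-(7); the units follow
    # those of the fitted coefficients quoted in the text.
    ddg_hp   = 0.146 * d_asa_np - 0.21 * d_asa_p          # eq. (2), hydrophobic term
    ddg_conf = -t_d_s_conf                                # eq. (3), -T * ddS_conf supplied directly
    ddg_h2o  = -4.51 * d_n_h2o                            # eq. (4), internal water molecules
    ddg_cav  = -0.073 * d_v_cav                           # eq. (5), cavity volume change
    ddg_hb   = (22.08 * sum(1.0 / r for r in r_pp)        # eq. (6), hydrogen bonds:
                + 9.13 * sum(1.0 / r for r in r_pw)       #   protein-protein, protein-water and
                + 7.7  * sum(1.0 / r for r in r_ww))      #   water-water donor-acceptor distances (Å)
    ddg_pro  = 0.33 * d_p_alpha + 0.11 * d_p_beta         # eq. (7), secondary-structure propensities
    return ddg_hp + ddg_conf + ddg_h2o + ddg_cav + ddg_hb + ddg_pro + others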
In summary, we will discuss various facets of protein structure and architecture, the
problem of protein folding/misfolding/unfolding and the methods used for studying the
protein structure and folding mechanisms.
GENETIC ALGORITHMS AND THEIR APPLICATIONS IN
DAIRY PROCESSING

Avnish K. Bhatia
NBAGR, Karnal-132001

1.0 Problem Solving and Algorithms

Solving a problem involves computing a function from the set of input to the set of
output of the problem. Problem solving with computer involves formulation of an algorithm
for the problem and writing a computer program on the basis of the algorithm. A problem is
easy to solve with a computer if there exists an algorithm of polynomial (in the number of
variables) time complexity. Computers can solve such problems in a reasonable
computational time. But there are problems for which no algorithms of
polynomial time complexity are known. A computer may take an intolerable computational time to solve
such difficult problems as the problem size increases.
Optimization problems occur in the field of operations research and engineering
design. These problems require finding a minimum or maximum value of an objective
function subject to certain constraints. These problems are modeled as mathematical
program of the form:
Optimize F(X ), X  (x , x , x )
1 2 n
Subject toC j ( X ), j  1, m
Where X is a vector of n  1 model variables and Cj(X) is a set of m  0 constraints.
An unconstrained optimization problem contains no constraint.
A solution vector X that satisfies all the constraints of the problem is called a feasible
solution. A feasible solution in the search space that maximizes or minimizes the objective
function is called an optimal solution.
Most of the real-world optimization problems are difficult (NP-hard), indicating that
polynomial time algorithms to solve them do not exist. Therefore, various heuristics have
been designed to solve these problems that may provide sub-optimal but acceptable solution
in a reasonable computational time. Many of the difficult problems have also been solved
using meta-heuristics such as simulated annealing, evolutionary algorithms, etc. derived
from natural physical and biological phenomena.
Evolutionary Algorithms (EAs) have been derived from Darwin's principle of
survival of the fittest in natural selection. Like natural evolution, EAs maintain a population
of individuals and evolve by manipulation of individual’s genetic structure (the genotypes)
through some genetic operators. There are three major variants of EAs: evolution strategies,
evolutionary programming and genetic algorithms. Genetic Algorithms (GAs) [Goldberg,
1989] are the most frequently used form of evolutionary algorithms.

2.0 Genetic Algorithms

John Holland invented genetic algorithms in the 1960s and early 1970s. They attracted
attention of scientific community with the publication of Holland’s book, “Adaptation in
Natural and Artificial Systems”. Genetic algorithms differ from the other evolutionary
algorithms mainly by a sexual reproduction operator, that is, recombination (crossover)
between the parents selected for reproduction to produce the offspring. Mutation has been
given background role in genetic algorithms while it is the major operator in the other
evolutionary algorithms.
The standard genetic algorithm is made up of five major steps shown in the Figure-1.

[Figure-1: Flowchart of a genetic algorithm — create initial population; evaluate fitness; select the chromosomes; apply genetic operators (crossover and mutation); repeat until all generations are finished / the stopping criteria are met.]

Step 1: Randomly create an initial population.


Creating the initial population of chromosomes generally involves encoding every variable into binary form, which then serves as the genotype of the individuals. Binary coding has been found insufficient for many problems; therefore, codings other than binary, such as real-number coding and permutation coding, have also been used.
The initial population consists of a pre-specified number of randomly generated strings / chromosomes built on the basis of the chosen genetic representation / coding, as sketched below.
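
A minimal Python sketch of this step, assuming a fixed-length binary encoding; the population size and chromosome length are illustrative values only:

import random

def create_initial_population(pop_size, chrom_length):
    """Randomly create pop_size binary chromosomes of chrom_length bits each."""
    return [[random.randint(0, 1) for _ in range(chrom_length)]
            for _ in range(pop_size)]

population = create_initial_population(pop_size=20, chrom_length=12)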
Step 2: Calculate a score for each individual against some fitness criterion.
An integral part of GAs is the fitness function, which is derived from the objective
function of the optimization problem. The fitness function is the measure of an individual’s
fitness.
Step 3: Select the high fitness individuals to create the next generation.
The top scoring individuals are then selected to breed the next generation using a
selection method, which selects prospective parents from the population on the basis of the
fitness values.
The selection scheme used by a GA is intended to be analogous to natural selection. To bias the selection toward more fit and thus higher performing individuals, each individual is assigned a probability P(x), which is proportional to the fitness of the individual x relative to the rest of the population. Fitness-proportionate selection is the most commonly used selection method. Given fi as the fitness of the ith individual, P(x) in this method is calculated as:

P(x) = fx / Σ fi

Highly fit individuals are selected with higher probability. Fitness-proportionate selection thus makes it more likely that the top performing individuals get the opportunity to spread their genes through the new population.
After assignation of the expected values P(x), the individuals are selected using the
roulette wheel sampling that works in the following steps.
 Let C be the sum of expected values of individuals in the population.
 Repeat two or more times to select the parents for mating.
i. Choose a uniform random integer r in the interval [1,C].
ii. Loop through the individuals in the population, summing the expected
values, until the sum is greater than or equal to r. The individual index
where the sum crosses this limit is selected.
It is quite possible that an individual may be selected several times for breeding. It is
also reasonable to expect some of the relatively unfit individuals to be selected for breeding,
due to the inherent randomness of this process. Other selection strategies, such as tournament selection and rank selection, are used to avoid this bias [Mitchell, 1996]. A sketch of roulette wheel sampling in code is given below.
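
A minimal Python sketch of fitness-proportionate (roulette wheel) selection along the lines described above; fitness values are assumed to be positive, and a uniform random number is used in place of the integer in the description:

import random

def roulette_select(population, fitnesses, num_parents=2):
    """Select individuals with probability proportional to their fitness."""
    total = sum(fitnesses)                       # C, the sum of the expected values
    parents = []
    for _ in range(num_parents):
        r = random.uniform(0, total)             # spin the wheel
        running = 0.0
        for individual, fit in zip(population, fitnesses):
            running += fit
            if running >= r:                     # the slot where the running sum crosses r
                parents.append(individual)
                break
    return parents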
Step-4: Apply genetic operators to the selected parents.
Generally, two parents are selected at a time and are used to create two new children
for the next generation using crossover operator with a pre-specified probability of
crossover. Single-point crossover is the most common form of this operator, which works
by randomly marking a crossover spot within the chromosome length and exchanging the
genetic material to the right of the spot, as shown below.

Parent 1:  01010101 0100   →   Child 1:  010101011101

Parent 2:  01110101 1101   →   Child 2:  011101010100

These newly created children may be subjected to mutation that involves “flipping of a
bit”, i.e. changing the value of each gene, with a pre-specified probability of mutation. An
example of working of the mutation operator appears below.
0 1 1 1 0 1 0 1 0 1 1   →   0 1 1 1 1 1 0 1 0 1 1

The fifth bit has been mutated (flipped from 0 to 1) in the individual.
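
The two operators of this step can be sketched in Python as follows; the default mutation probability is an illustrative value:

import random

def single_point_crossover(parent1, parent2):
    """Exchange the genetic material to the right of a random crossover spot."""
    point = random.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def mutate(chromosome, p_mutation=0.01):
    """Flip each bit independently with the pre-specified probability of mutation."""
    return [1 - gene if random.random() < p_mutation else gene
            for gene in chromosome]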


Step 5: Repeat steps 2-4 until some stopping condition is reached.
The steps 2-4 complete one generation. The stopping criterion may be defined in many
ways. Pre-fixed number of generations is the most used criterion where the GA stops on
completion of the given number of generations. Other stopping criteria used are the desired
quality of the solution, and the number of generations of the GA run without any improvement in results.

Figure 2 shows pseudocode of a standard genetic algorithm.


Begin GA
g:=0 { generation counter }
Initialize population P(g)
Evaluate population P(g) {i.e. compute fitness values}
While not stopping-criteria
g:=g+1
Select P(g) from P(g-1)
Crossover P(g)
Mutate P(g)
Evaluate P(g)
End while
End GA

Figure 2: Pseudo-code of the standard genetic algorithm.

A standard genetic algorithm utilizes three operators: reproduction, crossover and mutation. The main difficulty in solving a problem with genetic algorithms is premature convergence of the algorithm to a sub-optimal solution, from which the GA cannot advance further toward the global optimum. Many extensions such as elitist recombination, niching and restricted mating have been added to genetic algorithms to avoid premature convergence.

Genetic Parameters:
The values of parameters such as population size (N), crossover probability (pc), mutation probability (pm) and total number of generations (T) affect the convergence properties
of the genetic algorithms. Values of these parameters are generally decided before start of
GA execution on the basis of previous experience.
Experimental studies recommend the values of these parameters as: population size 20-
30, crossover rate 0.75-0.95, and mutation rate 0.005-0.01.
The parameters are generally fixed by tuning in trial GA runs before the actual run of
the GA. Deterministic control and adaptation of the parameter values to a particular
application have also been used. In deterministic control, the value of a genetic parameter is
altered by some deterministic rule during the GA run. Adaptation of parameters allows
change in their values during the GA run on the basis of previous performance. In self-
adaptation the operator settings are encoded into each individual in the population that
evolves during the GA run.

3.0 Constraint Handling in GAs

The constrained optimization problems contain very small proportion of feasible search
space. The chromosomes in the initial population are generated randomly. Also, the genetic
operators such as crossover and mutation alter the composition of chromosomes in the
population. The initialized and altered chromosomes may violate one or more constraints in
the constrained optimization problem and thus represent infeasible solutions. Several
methods have been used with GAs to treat the infeasibility. One straightforward method is to remove the infeasible chromosomes from the population and generate new ones, but this results in the loss of valuable genes. Use of penalty functions is the most common method to treat infeasibility, where a penalty term is added to (subtracted from) the fitness function of a minimization (maximization) problem. This worsens the fitness of infeasible chromosomes and thus discourages their selection, as the selection is fitness-biased. A sketch of this approach appears below.
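
A minimal Python sketch of a penalized fitness for a maximization problem; the constraint-violation measure and the penalty weight are illustrative assumptions:

def penalized_fitness(objective_value, violations, penalty_weight=1000.0):
    """Subtract a penalty proportional to the total constraint violation
    from the raw objective value (maximization problem)."""
    total_violation = sum(max(0.0, v) for v in violations)   # only violated constraints contribute
    return objective_value - penalty_weight * total_violation

# Example: objective value 42.0, one constraint violated by 0.3, one satisfied
fitness = penalized_fitness(42.0, violations=[0.3, -1.2])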

4.0 Applications in Dairy Processing

To design a control system for a nonlinear engineering system such as a chemical or dairy plant, a linearized model of the plant is required. Non-differentiable and multi-input, multi-output systems are difficult to handle using the traditional calculus-based Taylor expansion around an equilibrium operating point. A genetic algorithm has been used to solve the frequency
domain system linearization problem [Tan et al. 1996]. The problem has been formulated as
minimization of linearization error and solved with a genetic algorithm. The method utilizes
the plant input/output data directly and requires no derivatives. It allows linearization of an
entire operating region and for the entire interested frequency range, the benefit of which
cannot be matched by existing methods.

5.0 References
D. E. Goldberg (1989) Genetic Algorithms in Search, Optimization and Machine Learning.
Addison Wesley.

M. Mitchell (1996) An Introduction to Genetic Algorithms. MIT Press, MA.

K. C. Tan, M. R. Gong, Y. Li (1996) Evolutionary linearization in the frequency domain. Electronics Letters 32(1): 74-76.
EXPERT SYSTEMS IN DAIRYING

Avnish K. Bhatia
NBAGR, Karnal – 132001
1.0 Introduction
An expert system is a computing system that is capable of expressing and reasoning
about some domain of knowledge. It has been found to be effective in solving real world
problem using human knowledge and following human reasoning skills. Expert system is
capable of representing and reasoning about some knowledge rich domain, which usually
requires a human expert with a view toward solving problems and/or giving advice.
The primary goal of expert systems research is to make expertise available to
decision makers and technicians who need answers quickly. Enough expertise is not always
available at the right place and the right time. Portable computers loaded with in-depth knowledge of specific subjects can bring decades of expertise to a problem. The systems
can assist supervisors and managers with situation assessment and long-range planning.
Many systems now exist that bring a narrow slice of in-depth knowledge to a specific
problem.

2.0 Considerations for Building Expert Systems

• Can the problem be solved effectively by conventional programming?
• Is there a need and a desire for an expert system?
• Is there at least one human expert who is willing to cooperate?
• Can the expert explain the knowledge so that the knowledge engineer can understand it?
• Is the problem-solving knowledge mainly heuristic and uncertain?

3.0 Components of an Expert System

[Figure: Components of an Expert System]

3.1 The Knowledge Base


It consists of knowledge about the problem domain in the form of static and dynamic databases. Static knowledge consists of rules and facts compiled as part of the system that do not change during execution of the system. It contains the domain-specific knowledge
acquired from the domain experts. It involves object descriptions, problem-solving
behaviors, experience, constraints, judgment, heuristics and uncertainties. The success of
an ES relies on the completeness and accuracy of its knowledge base.
3.2 The Inference Engine
It consists of inference mechanism and control strategy. Inference means search
through knowledge base and derive the new knowledge. It involves formal reasoning
involving matching and unification performed by human expert to solve problems in a
specific area. The knowledge is put to use for producing solutions. The engine is capable of
performing deduction or inference based on knowledge contained in the knowledge base. It
is also capable of using inexact or fuzzy reasoning based on probability or pattern matching.
Three steps characterize an inference cycle:
1. Match rules with the given facts.
2. Select the rule that is to be executed.
3. Execute the rule by adding the deduced fact to the working memory.
Inference engines use a method of chaining to produce a line of reasoning. In forward chaining, the engine begins with the initial content of the workspace and proceeds towards a final conclusion. In backward chaining, the engine starts with a goal and finds knowledge to support that goal.
Forward chaining is a reasoning process that begins with the known facts and tries to work forward to find a successful goal. In order to implement it, we require the use of a dynamic database. In forward chaining, the facts from the static and dynamic knowledge bases are taken and used to test the rules through the process of unification. When a rule succeeds, the rule is said to have fired and the conclusion (head of the rule) is added to the dynamic knowledge base, as sketched below.
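
A minimal Python sketch of such a forward-chaining cycle over IF-THEN rules; the rule format (a tuple of condition facts plus one conclusion fact) is an illustrative assumption rather than the syntax of any particular shell:

def forward_chain(rules, initial_facts):
    """Repeatedly fire rules whose conditions are all present in the working
    memory, adding each conclusion (head of the rule) as a new fact."""
    working_memory = set(initial_facts)
    fired = True
    while fired:
        fired = False
        for conditions, conclusion in rules:
            if conclusion not in working_memory and all(c in working_memory for c in conditions):
                working_memory.add(conclusion)        # the rule fires
                fired = True
    return working_memory

rules = [(("fever", "cough", "running_nose", "rash", "conjunctivitis"), "measles")]
print(forward_chain(rules, ["fever", "cough", "running_nose", "rash", "conjunctivitis"]))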

3.3 Knowledge Acquisition


Potential sources of knowledge include human experts, textbooks, databases and your
own experience. The main source should always be the human expert, as they must already
have the knowledge we seek for problem solving. When highly specialized knowledge is
required, the expertise of multiple experts can be desirable. Textbooks (or other literature)
and databases should be used as a secondary source of information, since documentation can quickly become obsolete, whereas the domain expert can be relied upon to possess up-to-the-minute knowledge.
Knowledge Representation: This process formalizes and organizes the knowledge. One widely used representation is the production rule, or
simply rule. A rule consists of an IF part and a THEN part (also called a condition and an
action). The IF part lists a set of conditions in some logical combination. The piece of
knowledge represented by the production rule is relevant to the line of reasoning being
developed if the IF part of the rule is satisfied; consequently, the THEN part can be
concluded, or its problem-solving action taken. Expert systems whose knowledge is
represented in rule form are called rule-based systems. The other techniques include frames
and Semantic networks.

3.4 Explanation Module

It explains to the user the reasoning behind a particular problem solution. It consists of 'How' and 'Why' modules. The 'How' sub-module tells the user about the process through which the system has reached a particular solution, whereas the 'Why' sub-module explains why that particular solution was suggested.

4.0 Building an Expert System


An early step is to identify the type of tasks (interpretation, prediction, monitoring,
etc.) the system will perform. Another important step is choosing the experts who will
contribute knowledge. It is common for one or more of these experts to be part of the
development team. Unlike more general information systems design projects, the software
tools and hardware platform are selected very early.
Expert System Shells: These are generic systems that contain reasoning mechanisms but not the problem-specific knowledge. Early shells were cumbersome but still allowed the user to avoid having to program the system completely from scratch. Modern shells contain two primary modules: a rule set builder and an inference engine. Clipwin is an example of a modern shell.
Validation
Validation is the most important stage in developing an expert system. It is the process
whereby a system is tested to show that its performance matches the original requirements
of the proposed system.
The basic idea of the adopted technique is to evaluate the behavior of the expert
system against that of human experts. A collection of carefully selected test cases is
generated. These test cases are solved by a number of domain experts as well as the expert
system. Different human experts will evaluate all test cases solutions using some evaluation
formula. Later on, an open discussion is to be held to let the human experts justify their
solutions. According to this discussion, evaluation of solutions may change and the final
ranking of solutions will be reached. If the expert system's solutions rank far behind those of the human experts, the knowledge base must be modified.
The User Interface in an ES
Design of the UI focuses on human concerns such as ease of use, reliability and
reduction of fatigue. Design should allow for a variety of methods of interaction (input,
control and query). Mechanisms include touch screen, keypad, light pens, voice command,
hot keys. The UI allows the user to communicate with the system in interactive mode and helps the system to create working knowledge for the problem to be solved.

5.0 Rule Based Expert Systems


The most popular way of storing knowledge in expert systems is in the form of rules and facts of a particular domain. A rule-based expert system is one in which the knowledge base is in the form of rules and facts. It is also called a production system. It contains knowledge of the domain in the form of rules and facts used to make decisions. Suppose a doctor gives a rule for measles as follows:
"If symptoms are fever, cough, running nose, rash and conjunctivitis then patient probably
has measles".
Prolog is a programming language well suited to implementing such systems. Rules (called production rules) and facts of a production system can be easily expressed in Prolog as follows:
hypothesis(measles) :- symptom(fever), symptom(cough), symptom(running_nose),
symptom(conjunctivitis), symptom(rash).
Prolog has its own inference engine. It performs unification, determines the order in which
the rules are scanned and performs conflict resolution.
These features of Prolog make designing a production system easy. Prolog uses backward chaining. It uses depth-first search, in which all the rules relevant to a particular goal are scanned as deeply as possible for a solution before Prolog backtracks and tries an alternative goal.
Production Rule sets
Experts typically form sets of rules to apply to a given problem, which reflects the skill of
the expert on a topic. Rule sets are often represented in a tree-like structure with the most
general, strategic rules at the top of the tree and the most specific rules at leaf nodes.
Some examples of rules and interface statements in Prolog are given below.
hypothesis(cough) :- symptom(cough), symptom(sneezing), symptom(running_nose).
hypothesis(chicken_pox) :- symptom(fever), symptom(chills), symptom(body_ache),
    symptom(rash).

symptom(fever) :- positive_symp('Do you have fever (y/n)?', fever).
symptom(rash) :- positive_symp('Do you have rash (y/n)?', rash).
symptom(body_ache) :- positive_symp('Do you have body_ache (y/n)?', body_ache).
symptom(cough) :- positive_symp('Do you have cough (y/n)?', cough).
symptom(chills) :- positive_symp('Do you have chills (y/n)?', chills).
symptom(conjunctivitis) :- positive_symp('Do you have conjunctivitis (y/n)?', conjunctivitis).
symptom(headache) :- positive_symp('Do you have headache (y/n)?', headache).
symptom(sore_throat) :- positive_symp('Do you have sore_throat (y/n)?', sore_throat).
symptom(running_nose) :- positive_symp('Do you have running_nose (y/n)?', running_nose).
symptom(sneezing) :- positive_symp('Do you have sneezing (y/n)?', sneezing).
symptom(swollen_glands) :- positive_symp('Do you have swollen_glands (y/n)?', swollen_glands).

% A symptom is positive if already recorded; otherwise the user is asked.
positive_symp(_, X) :- positive(X), !.
positive_symp(Q, X) :- not(negative(X)), ask_query(Q, X, R), R = 'y'.

% Ask the question, read the reply and remember it as a dynamic fact.
ask_query(Q, X, R) :- writeln(Q), readln(R), store(X, R).
store(X, 'y') :- asserta(positive(X)).
store(X, 'n') :- asserta(negative(X)).

% Clear the remembered answers between consultations.
clear_consult_facts :- retractall(positive(_)), retractall(negative(_)).

6.0 Benefits of Expert Systems

 Increased timeliness in decision making


 Increased productivity of experts.
 Improved consistency in decisions
 Improved understanding
 Improved management of uncertainty
 Formalization of knowledge
 Integrate with other systems, e.g., Decision Support Systems
 Transfer knowledge more easily
 Retain scarce knowledge

7.0 Limitations of Building ES

One important limitation is that expertise is difficult to extract and encode. Human experts adapt naturally to a changed situation, whereas an ES must be recoded. Human experts also recognize more readily when a problem is outside their knowledge domain, whereas an ES may just keep working. ES work in narrow domains only. Possible lack of training and experience of knowledge engineers is also a limiting factor in developing expert systems.
8.0 Expert Systems in Dairying
DAIRYMAP: A Web-Based Expert System For Dairy Herd Management:
Dairy MAP is a web-based expert system aimed at the dairy producers. The system consists
of two major components – a preliminary statistical benchmarking analysis (based on Dairy
Herd Information reports provided by the Dairy Records Management Systems, Inc., in
Raleigh, NC) and, a detailed expert evaluation of the four major areas of dairy herd
management, viz., Somatic Cell Count and Mastitis, Reproduction, Genetics, and Milk
Production. The preliminary analysis provides information to the producer about the areas
of concern within each component of dairy management, and suggests further evaluation
and diagnosis by the Expert System, concluding with comments and recommendations for
improving the producer’s herd. The entire system is Web-based, allowing a producer from
anywhere in the United States to utilize the system. It provides a single integrated system
evaluating all the four major areas, with the preliminary analysis limited only by the dairy
herd data available. The system is available at the URL http://dairymap.ads.uga.edu/

Expert System Development for Control System Selection for Whole Processes
The selection of a control system for a whole process is a distinctly different task from
the controller-tuning problem. There are a number of objectives that must be satisfied
simultaneously and a suitable control system must be chosen from a huge number of
alternatives. These are typical characteristics of a design problem solved routinely by
experts using a combination of experience and algorithm. Our understanding of these sorts
of problems has not progressed to the stage where software, such as heat and material
balance packages, is available for their solution. There is a possibility of extending software based on structural controllability and expert systems to select control systems.
COMPUTATIONAL NEURAL NETWORKS WITH DAIRY
AND FOOD PROCESSING APPLICATIONS

Adesh K. Sharma and R. K. Sharma


School of Mathematics and Computer Applications
TIET, Patiala-147 004, (Punjab)

1.0 Introduction
Neural computing implies the use of computational neural networks (CNN) to carry
out various tasks. A CNN (also known as artificial neural network, neural network,
connectionist model, or neural system) can be described as a computational system made up
of a number of simple but highly connected processing elements (PE) that tend to store
experimental knowledge by a dynamic state response to external inputs and make the
information available for use. Using available data, a typical CNN “learns” the essential
relations between given inputs and outputs by storing information in a weighted distribution
of connections. A learning algorithm provides the rule or dynamical equation that changes
the distribution of the weight (parameter) space to propagate the learning process. CNN are
especially useful for problems that can be represented as a mapping between vector spaces
in which hard-and-fast rules are not applicable. They are also exceptionally robust against
noise and tend to be immune to violations of assumptions that would cripple many
traditional methods. These properties make CNN powerful and versatile tools for a number
of far-reaching, interdisciplinary applications. There are a vast number of different CNN
paradigms making up the arsenal that comprises neural computing.

2.0 Computational Neural Networks (CNN)


Computational neural networks consist of the following three principal elements:
 Topology – the way a CNN is organized into layers and the manner in which
these layers are interconnected;
 Learning – the technique by which information is stored in the network; and
 Recall – how the stored information is retrieved from the network.
The basic structure of a CNN consists of PE that are also known as artificial
neurons, nodes, neurodes, units, etc., and are analogous to biological neurons in the human brain. These PE are grouped into layers (also called slabs). The most common CNN structure
consists of an input layer, one or more hidden layer(s) and an output layer.
The input signals to the neurons are modified by assigning weights representing the
strengths of the synapses associated with each input; e.g., a number of inputs x1, x2, ..., xn associated with respective weights wj,1, wj,2, ..., wj,n form a combined input netj to the jth neuron, which is expressed as the weighted sum of the inputs:

netj = Σ(i=1 to n) wj,i xi        … (1)

Sometimes a bias is added to the net input; that is, we add a new synapse having fixed input x0 = 1 and weight wj,0 = biasj. Thus Eq. (1) can be rewritten as:

netj = Σ(i=0 to n) wj,i xi        … (2)

Bias has the effect of increasing or lowering the net input of the activation function, depending on whether it is positive or negative, respectively. This sum of weighted signals, netj, is transformed into an activation level using a transfer or activation function to produce an output signal yj only when it exceeds a certain threshold, as follows:

yj = f(netj) = f(Σi wj,i xi)        … (3)
The output of a neuron is determined by the nature of its activation function. The transfer (or activation) function determines the output from the summation of the weighted inputs of a neuron. The transfer functions for neurons in the hidden layer are often nonlinear and provide the nonlinearities for the network. Some frequently used activation functions are:
i) Sigmoid function:
    f(x) = 1 / (1 + e^(-x))        … (4)
ii) Hyperbolic tangent function:
    f(x) = tanh(x) = (e^x – e^(-x)) / (e^x + e^(-x))        … (5)
iii) Gaussian function:
    f(x) = e^(-x²/σ²)        … (6)
where x and σ in the above expressions denote the weighted sum of the inputs and a parameter value, respectively.
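
To make Eqs. (1)–(4) concrete, a small Python sketch of a single neuron's forward pass is given below; the weights, bias and inputs are illustrative values only:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))            # Eq. (4)

def forward(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus bias (Eqs. 1-2), passed through an activation (Eq. 3)."""
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(net)

y = forward(inputs=[0.5, 0.2, 0.9], weights=[0.4, -0.7, 0.1], bias=0.05)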

3.0 CNN Architecture

Connectionist architecture (or network topology) refers to the types of


interconnections between neurons. A network is said to be fully connected if the output
from a neuron is connected to every neuron in the next layer. A network whose connections pass outputs in a single direction only, to neurons in the next layer, is called a feed-forward network. A feed-back network allows its outputs to be inputs to preceding layers. Networks that work with closed loops are known as recurrent networks.
Feed-forward networks are faster than feedback networks as they require only a single pass
to obtain a solution. Classification of networks is given below:
3.1 Feed-forward supervised CNN
These networks are typically used for prediction and function approximation tasks
pertaining to various food and dairy processing applications. Specific examples include:
 Back-propagation networks
 Radial basis function networks
3.2 Feed-forward unsupervised CNN
These networks are used to extract important properties of the input data and to map
input data into a “representation” domain. Two basic groups of methods belong to this
category:
 Hebbian networks performing the Principal Component Analysis of the input
data, also known as the Karhunen-Loeve transform
 Competitive networks used to perform Learning Vector Quantization of the
input data set
 Self-organizing Kohonen Feature Maps
3.3 Feed-back CNN
These networks are used to learn or process the temporal features of the input data
and their internal state evolves with time. Specific examples include:
 Recurrent back-propagation networks
 Associative memories
 Adaptive resonance networks

4.0 CNN Parameters


There are several network parameters that are adjusted on trial and error basis so as
to fine tune the network performance. A brief description of these parameters is given
below:
4.1 Hidden layers and hidden neurons – the number of hidden layers and
number of hidden neurons in each hidden layer is of prime importance in network
designing. In general, single hidden layer network models perform the best in dairy applications, as evidenced by the literature as well as the authors' own experience. However, there are situations when networks with more than one hidden layer exhibit better prediction ability. This is ascertained by a trial and error approach in each application. Also, there is no general method available to determine the number of hidden neurons in each hidden layer.
However, some researchers have recommended techniques such as: i) add the number of input and output layer nodes and divide by 2; ii) use the formula Nh = (n × ε) / Ni, where Nh is the number of hidden neurons, n is the number of patterns used for training, ε is the testing tolerance and Ni is the number of input variables; etc.
4.2 Learning rate - the learning rate determines the amount of correction term
that is applied to adjust the neuron weights during training. Small values of the learning rate
increase learning time but tend to decrease the chance of overshooting the appropriate
optimal solution. Large values of the learning rate may train the network faster, but may
result in no learning taking place at all. The adaptive learning rate varies according to the
amount of error being generated: the larger the error, the smaller the learning rate, and vice versa. Therefore, if the CNN is heading towards the optimal solution it will accelerate, and it will decelerate when it is heading away from the optimal solution.
4.3 Momentum - the momentum value determines how much of the previous corrective term should be remembered and carried forward in the current training step. The larger the momentum value, the more emphasis is placed on the previous correction terms and the less on the current term. It serves as a smoothing process that 'brakes' the learning process from heading in an undesirable direction.
4.4 Training and testing tolerances - the training tolerance is the amount of
accuracy that the network is required to achieve during its learning stage on the training
data set. The testing tolerance is the accuracy that will determine the predictive result of
the CNN on the test data set.
4.5 Neural network learning - learning in present context may be defined as a change in
connection weight values that results in the capture of information that can later be
recalled. Generally, the initial weights for the network prior to training are preset to
random values within a predefined range. This technique is used extensively in error-
correction learning systems that are widely used in food and dairy applications. There
are following three classes of learning:
4.5.1 Supervised learning - it is most common type of learning in CNN. It requires
many samples to serve as exemplars. Each sample of this training set contains input values
with corresponding desired output values (or target values). Then the network will attempt
to compute the desired output from the set of given inputs of each sample by minimizing the
error of the model output to the desired output. It attempts to do this by continuously
adjusting the weights of its connection through an iterative learning process called training.

4.5.2 Unsupervised learning - it is sometimes called self-supervised learning and requires


no explicit output values for training. Each of the sample inputs to the network is assumed
to belong to a distinct class. Thus, the process of training consists of allowing the network
to uncover these classes.
4.5.3 Reinforcement learning - It is a hybrid learning method in that no desired outputs are
given to the network, but the network is told whether the computed output is heading in the correct direction or not.
4.5.4 Off-line and on-line learning - we can categorize the learning methods yet into
another group, off-line or on-line. When the system uses input data to change its weights to
learn the domain knowledge, the system could be in training mode or learning mode. When
the system is being used as a decision aid to make recommendations, it is in the operation
mode, this is also sometimes called recall. In the off-line learning methods, once the system
enters into the operation mode, its weights are fixed and do not change any more. Most of
the networks are of the off-line learning type. In on-line or real time learning, when the
system is in operating mode (recall), it continues to learn while being used as a decision
tool. This type of learning has a more complex design structure.

4.6 Learning algorithms


A prescribed set of well-defined rules for the solution of a learning problem is called
learning algorithm. Some learning algorithms are discussed below.
There are several learning rules such as error-correction learning, Hebbian learning,
principal component learning, competitive learning, min-max learning, stochastic learning,
Boltzman learning, etc. Two most commonly used rules for food and dairy applications are
described as follows:
Let wk,j(n) denote the value of the synaptic weight wk,j of neuron k excited by element xj(n) of the input vector x(n) at time step n. The adjustment Δwk,j(n) applied to the synaptic weight wk,j at time step n is defined by the different rules given below:
a) Hebbian learning rule - the simplest form of the Hebbian learning rule is described by

Δwk,j(n) = η yk(n) xj(n)        … (7)

b) Delta rule - the delta rule states that the adjustment made to a synaptic weight of a neuron is proportional to the product of the error signal and the input signal of the synapse in question. Mathematically, it is described as

Δwk,j(n) = η ek(n) xj(n)        … (8)

where η is a positive constant that determines the rate of learning as we proceed from one step in the learning process to another (hence η is referred to as the learning-rate parameter); yk(n) is the output signal of neuron k; dk(n) is the corresponding desired (or target) output; and ek(n) = dk(n) – yk(n) is the corresponding error signal.

5.0 Crafting a CNN

The process to setup a neural network model involves the following steps:
i) The data to be used is defined and presented to the CNN as a pattern of input data with
the desired outcome or target.
ii) The data are partitioned into two sets viz., “training set” and “test set”. The CNN uses
the former set in its learning process in developing the model. The latter set is used to
validate the model for its predictive ability and when to stop the training of the CNN.
iii) The CNN's structure is defined by selecting the number of hidden layers to be
constructed and the number of neurons for each hidden layer.
iv) All the CNN parameters are set before starting the training process.
v) Now, the training process is started, which involves the computation of the output from
the input data and the weights. A learning algorithm is used to 'train' the CNN by adjusting its
synaptic weights to minimize the difference between the current CNN output and the
desired output.
vi) Finally, an evaluation process is carried out in order to determine if the CNN has
'learned' to solve the task at hand. This evaluation process may involve periodically halting
the training process and testing its performance until an acceptable accuracy is achieved.
When an acceptable level of accuracy is achieved, the CNN is then deemed to have been
trained and ready to be utilized.
As no fixed rules exist in determining the CNN structure or its parameter values, a large
number of CNN may have to be constructed with different structures and parameters before
determining an acceptable model. Determining when the training process needs to be halted
is of vital significance to obtain a good CNN model. If a CNN is over-trained, an over-fitting (curve-fitting) problem may occur whereby the CNN starts to fit itself to the training set instead of
creating a generalized model. This typically results in poor predictions of the test and
validation data set.
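
The workflow above can be illustrated with a brief Python sketch using scikit-learn's MLPRegressor (assumed to be installed); the synthetic data, network size and parameter values are purely illustrative:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Step (i): illustrative data - 100 patterns, 3 input variables, one target
rng = np.random.default_rng(0)
X = rng.random((100, 3))
y = X @ np.array([0.5, -0.2, 0.8])

# Step (ii): partition the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Steps (iii)-(iv): define the structure (one hidden layer of 8 neurons) and the parameters
model = MLPRegressor(hidden_layer_sizes=(8,), activation="logistic",
                     learning_rate_init=0.01, max_iter=2000, random_state=0)

# Step (v): train the network on the training set
model.fit(X_train, y_train)

# Step (vi): evaluate predictive ability on the unseen test set
print("R-squared on test data:", model.score(X_test, y_test))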

6.0 Dairy and food processing applications of CNN

CNN based models have been successfully applied in various real-life problems.
The research in this field is still under development across the globe. There has been
relatively little research into application of CNN in the field of agriculture in general and
dairying in particular especially in India. Some potential applications of connectionist
models in dairy and food processing are briefly presented:
i) Modeling of pH and acidity for cheese production has been made using CNN;
ii) Shelf-life prediction of pasteurized milk has been achieved using connectionist
models;
iii) Neural networks have been successfully employed to predict temperature, moisture
and fat in slab-shaped foods with edible coatings during deep-fat frying;
iv) Model predictive control (MPC) of an Ultra-High Temperature (UHT) milk
treatment plant has been realized using a neural system;
v) The CNN technique has been used to determine protein concentration in raw milk;
vi) Analysis of dairy patterns from a large biological database has been performed using
neural networks;
vii) Neural network models based on feed-forward back-propagation learning have been
found useful for prediction of dairy yield, i.e., 305-day milk yield, fat and protein in
Holstein dairy cattle;
viii) In a similar study, the CNN have been employed for dairy yield prediction as well as
cow culling classification;
ix) Prediction of cow performance with connectionist model has shown better results
than conventional methods;
x) Milk production estimates have been successfully obtained in a study by using feed-
forward artificial neural networks;
xi) The CNN have been applied to predict milk yield in dairy sheep;
xii) Neural network modeling of urinary excretion by farm animals has been reported;
xiii) The connectionist models have been used for detecting influential variables in the
prediction of incidence of clinical mastitis in dairy animals;
xiv) Health predictions of dairy cattle from breath samples have been carried out using
neural network models;
xv) Neural network approach has been used to synthesize an online feedback optimal
medication strategy for the parturient paresis problem of cows; and
xvi) Recently, the authors have employed neural computing paradigm for prediction of
lactation yields in Karan Fries crossbred dairy cattle.
GIS Applications in Dairying

D.K. Jain
National Dairy Research Institute, Karnal-132001

1.0 Introduction
There has been an increasing role of computer use for research and
management support. The microcomputer revolution has made computers available
on many managers’ and researchers’ desks. The research managers can now have
access to thousands of databases all over the world through the advent of Internet
Technology. Most of the organisations are now using computerised analysis in their
decision making. With the decline in the cost of hardware and software, the
capabilities of information system and networks continue to rise. Various
organisations are developing distributed information systems that enable easy
accessibility to data stored in multiple locations. Various information systems are
being integrated with each other and through this managers are able to make better
decisions due to availability of more accurate and timely information.
Though, a large number of databases have been created at the international
level, India needs to boost up efforts to create its own databases in respective fields.
Required attention has not been paid to the creation of databases in the field of
animal husbandry and dairying due to non-availability of reliable data. No single
department or agency can be made responsible for the creation and retrieval of databases. Each one of us working in the field has to contribute towards building the information base. This will help us in improving our research capabilities.

2.0 Importance of Databases


Reliable statistics are essentially required for scientific planning and policy
formulation. Planners, Policy Makers and Administrators cannot visualise the role
and scope of development at the field level if they do not have access to the hard
data reflecting the field conditions. The successful implementation of development
plan depends upon the statistics which present the ground realities in the right
perspective. A challenge facing the livestock sector is how to plan and manage more
effectively available resources in terms of land, livestock and human being. There is
ample scope to improve efficiency and productivity at every level. The future growth
of livestock sector will depend on the quality of available data both at the micro as
well as macro level. In the case of Indian livestock sector, the absence of regular
data and its systematic periodic updates are major limitations in chalking out plans to
put its enormous potential. The lack of funds and inadequacy of required
infrastructure for data collection and transmission has hindered the development of a
comprehensive database on livestock. Development of databases and an integrated
information system covering scientific and technical information on animal husbandry
and dairying technologies developed, crops, animal husbandry, natural, genetic
resources, food processing, agro-climatic conditions, economic and social indicators,
dairy education and library resources, etc. needs attention. In this context, IASRI in
collaboration of 13 institutes of ICAR has completed a Mission Mode Project under
NATP 'Integrated National Agricultural Resources Information System' which is
expected to provide data warehouse on the soil, water, climate, animal, fisheries,
crops and cropping system along with socio-economic and geographical features on
a single platform.

3.0 Computer Application in Database Development


Most applications of the computer in livestock research and production at
present originate from the economically developed countries of the world. The
innovations of this technology are thus designed for intensive systems where large
herd size is kept. In India, as in many other economically developing countries,
production systems are extensive and subsistence-based. Therefore, computer applications originating from developed countries cannot be implemented directly in the Indian situation. Livestock production systems research in India is marginal and this
prevents useful and relevant applications of computers in all the areas of animal
science. The earliest applications of the computer in animal science, besides the analysis of data from experiments, were maintaining Dairy Herd Improvement Association (DHIA) records and processing them for breed improvement,
compounding of feeds and disease surveillance using clinical and mortality data from
veterinary hospitals.

4.0 Databases Available at NDRI


National Dairy Research Institute at Karnal has been making efforts in the
development of databases right from the establishment of Computer Centre. The
Centre has been engaged in creation of databases in the field of Animal Husbandry
and Dairying and other related fields, development of in-house software packages for
animal management, breeding data analysis and milk procurement and billing
system, etc. Some of the important databases available at NDRI are:

4.1 Indian livestock census


This database consists of information on bovine population according to
species in different age groups across various states of the country for the period
1966-2003. The projections of bovine population for future years for different states
are also available. There has been no single reference date for the collection of data
in different states of the country and hence, there are problems pertaining to
comparison of growth rates. In order to make available the livestock census data
reliably and uniformly, there is a need to streamline this work.

4.2 Milk production and its availability


The database on milk production consists of time series data on milk
production right from 1970-71 till 2003-04 across different states of the country. The
milk production projections have also been obtained up to 2010-11 on the basis of
growth rates estimated by fitting various growth models to the time series data for
different states and the country as a whole.
Although this database is based on the information collected through various
surveys being conducted regularly for the estimation of major livestock products and
the cost of production studies, there are still shortcomings in coverage, particularly of the less important products and byproducts. These data are essentially
required for the estimation of value of output from this sector. The gaps in this field
which need immediate attention are estimation of yield rate of livestock products like
other meat products, hair, pig, bristles, bone, etc.; estimate of the value of inputs;
production estimate of dungs specially of small animals and droppings, etc.;
estimation of losses of various livestock products; estimation of animal draught
power; production of estimates of poultry meat, etc.
A database has also been prepared at NDRI which gives information on per
capita per day availability of milk across different states right from 1970-71 till 2003-
04.

4.3 Feeds and fodder availability and requirement


A database has been prepared for state-wise land use pattern. This
database consists of information on total irrigated area; total cropped area; area
under various crops like food, non-food, fodder, etc.; area under permanent pasture
and other grazing land. Further, an attempt has also been made to estimate the
feeds / fodder availability from various sources like green fodder, dry fodder in the
form of crop residues, oilseed cakes, wheat and rice bran, vegetable and fruit
processing wastes, tops of certain crops, other agro-industrial by-products across
different states and all India. The projections of availability of feeds and fodder have
also been obtained. Similarly, requirement of feeds and fodder for various categories
of bovines was also obtained for the country as a whole. Thus, the nutrient
requirement and availability, and their gap has also been estimated and projected for
future years for the country.

4.4 Average milk yield of cows


This particular database consists of information on the average milk productivity of non-descript cows, crossbred cows, buffaloes and goats, maintained across major states from 1986-87 to 2003-04 and obtained from Integrated Sample Survey reports for the estimation of production of various livestock products. A number of weaknesses exist in obtaining such estimates, relating to sampling, weighment / measurement of the product and data collection.

4.5 Bovine population and milk production in selected countries of the world
This particular database comprises the bovine population according to different species for selected countries of the world from 1974 to 2003 and is being constantly updated. Similarly, a database on milk production in different countries of the world according to different species (cow, buffalo and goat) also exists, covering the period from 1974 to 2003, and is being updated regularly.

5.0 Data Visualisation


Desktop mapping is a powerful way to analyse research data. One can give graphic form to statistical data so that it can be seen on a map. Patterns and trends which are otherwise impossible to detect in lists of data become obvious when displayed on a map. Hence, 'Desktop Mapping' turns our computer from a data processor into a 'data visualizer', allowing us to see patterns and meanings in the mass of information.
Hence, data visualisation is also becoming a major component of decision
support and the formation of organisational strategies. It is the process by which
numerical data are converted into meaningful images. The ability to create
multidimensional structures and models from raw data can assist in analysing
complex data sets by mapping physical properties to the data, thus taking advantage
of human visual systems. The identification of hidden patterns contained within the
data can be accelerated. Geographical Information System (GIS) is one such
technique of data visualisation, which is gaining importance.

6.0 Geographical Information Systems (GIS)


GIS is a computer-based system for capturing, storing, checking, integrating,
manipulating and displaying data using digitised maps. GIS can be defined as a
system of hardware, software, and procedures designed to support the capture,
management, manipulation, analysis, modelling, and display of spatially- referenced
data for solving complex planning and management problems. In other words, GIS is
both a database system with specific capabilities for spatially referenced data and a
set of operations for working with those data. GIS is a special- purpose digital
database in which a common spatial coordinate system is the primary means of
reference. In a GIS, every record or digital object has an identified geographical
location. A comprehensive GIS requires :
 data input, from maps, aerial photos, satellites, surveys, and other sources,
 data storage, retrieval and querry,
 data transportation, analysis, and modelling, including spatial statistics, and
 data reporting, such as maps, reports, and plans.
Finally, it can be said that GIS is often referred to as the 'spreadsheet' of the 1990s: just as the computer spreadsheet changed the way people organised and used information in the 1980s, so GIS is doing the same thing today, though in a much more powerful way. GIS facilitates wider use of limited resources by clarifying characteristics and patterns over space. GIS can provide access to types of information not otherwise available.
GIS can be used to solve a broader range of problems compared to any isolated system for spatial or non-spatial data alone. For example, using a GIS:
a) Users can interrogate geographical features displayed on computer
map and retrieve associated attribute information for display or further
analysis.
b) Maps can be constructed by querying or analysing attribute data.
c) New sets of information can be generated by performing spatial
operations.
d) Different items of attribute data can be associated with one another through shared location codes.
7.0 Important GIS Packages
Some of the important GIS packages are :
1. ARC/INFO is one of the first GIS packages that was available commercially
and is a package used all over the world. It has been developed by
Environmental System Research Institute (ESRI), USA. It is available on wide
range of platforms-PC’s, workstations, PRIME systems. It is also available on
variety of operating systems-DOS, UNIX, VMS, etc.
2. PAMAP is a product of Graphic Limited, Canada and is an integrated group of
software products designed for an open system environment. The package is
modular and is designed to address the wide range of mapping and analysis
requirements of the natural resource sector. It is also available on variety of
platforms – Pentium, 486/386/286 PC’s, UNIX, SUN, VAX system.
3. MAPINFO is a popular package translated in to several languages and ported
to several platforms like Windows, Macintosh, Sun, and HP workstations.
4. GRASS is a public domain UNIX package with a large established user base, which actually contributes code that is incorporated into new versions.
5. ISROGIS is a state-of-art GIS package with efficient tools of integration and
manipulation of spatial and non-spatial data and consists of a set of powerful
module. It is available on PC platforms on MS-Windows and on UNIX and
SUN platforms.
6. IDRISI has been developed by Clark University, USA, and is an inexpensive PC-based package with advanced features including a good import/export facility, a new digitisation module, and some image processing facilities.
7. GRAM is a PC based GIS tool developed by IIT, Bombay. It can handle both
vector and raster data and has functionality for raster based analysis, image
analysis, etc.

8.0 A Case Study of GIS Application


An attempt has been made at the institute to visualise bovine-related attributes using a digitised map of Indian states to identify zones specific to bovine characteristics. The data for the investigation on livestock and other related parameters were obtained from secondary sources, namely Basic Animal Husbandry Statistics-2004 and the Indian Livestock Census. The collected data were tabulated and keyed in to an MS-Access database. Single-layer and double-layer geographical maps have been developed using the ArcGIS software package by superimposing data on different aspects of dairying on the digitised map of the country so as to have a comparative assessment of these parameters across states.
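
A comparable single-layer choropleth overlay can be sketched with open-source Python tools (geopandas and matplotlib, assumed to be installed); the file names and column names below are hypothetical and are not those of the NDRI study:

import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt

# Digitised state boundaries (hypothetical shapefile) and attribute data (hypothetical CSV)
states = gpd.read_file("india_states.shp")         # assumed to contain a 'STATE' column
bovines = pd.read_csv("bovine_population.csv")     # assumed columns: STATE, BUFFALO_DENSITY

# Superimpose the attribute data on the digitised map via the shared state name
layer = states.merge(bovines, on="STATE")

# Single-layer choropleth: buffalo density across states
layer.plot(column="BUFFALO_DENSITY", cmap="OrRd", legend=True, edgecolor="black")
plt.title("Buffalo density across states (illustrative)")
plt.show()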

8.1 Distribution of Indigenous, Crossbred Cattle and Buffaloes and their


Density across States
The digitised maps showing distribution of bovine population for 2003
Livestock Census showed that maximum concentration of livestock population was
observed in UP, MP, Rajasthan, Maharashtra, AP and West Bengal followed by
Gujarat, Karnataka, Tamil Nadu, Bihar and Orissa with minimum being in the North
East region except Assam. Density of livestock population was relatively higher in
smaller states. Highest concentration of Indigenous cattle population was found in
UP, MP, Maharashtra and West Bengal and lowest in Punjab, Kerala and majority of
north-eastern states except Assam, Meghalaya and Tripura. Density of indigenous
cattle population was found relatively more in West Bengal, Assam, Jharkhand and
Orissa. Highest concentration of crossbred cattle population was observed in Tamil
Nadu followed by J&K, Punjab, UP, Bihar, Orissa, Maharashtra, AP, Karnataka and
Kerala. Density of crossbred cattle population was found highest in smaller states like
Pondicherry, Lakshwadeep Islands, Kerala and Uttaranchal and relatively low in
states like Rajasthan, MP, Chhatisgarh and Jharkhand. Highest concentration of
buffaloes was found in UP, Rajasthan and AP followed by Punjab, Haryana, Bihar,
MP, Maharashtra and Gujarat with relatively lower concentration in NE states, West
Bengal, Kerala and Goa. Density of buffaloes was found to be relatively higher in
Punjab, Haryana, UP, AP and Gujarat.

8.2 Green and Dry Fodder Production across States


The digitised maps showing green fodder and dry fodder production across
different states for 2002-03 showed that the total fodder availability was found to be
relatively more in UP, Rajasthan, Gujarat and Maharashtra. Green fodder production was found to be relatively higher in states like Rajasthan and Maharashtra followed by UP, MP and Gujarat. Dry fodder availability was found to be relatively more in UP and Maharashtra followed by Punjab and Karnataka. The area under fodder crops was found
to be relatively higher in Rajasthan followed by UP, Gujarat and Maharashtra. Area
under permanent pastures and grazing lands was found to be highest in MP followed
by HP, Rajasthan and Maharashtra.

8.3 Species-wise Milk Production and Per Capita Milk Availability and
Processing Capacity of Milk Plants across States
The digitised maps showing milk production of different species and per
capita per day milk availability showed that total milk production was found to be
higher in Punjab, UP, Rajasthan, Gujarat, Maharashtra and AP. The milk production
of non-descript cows was found to be relatively higher in UP, MP, Rajasthan and
West Bengal while the milk production of crossbred cows was found to be relatively
more in Punjab, Maharashtra, Tamil Nadu and Kerala. The milk production of
buffaloes was found to be relatively higher in Punjab, Haryana, UP, Rajasthan,
Gujarat and AP. The per capita per day milk availability was found to be relatively
more in Punjab, Haryana, Rajasthan and Gujarat. The processing capacity of milk
plants was found to be higher in Gujarat and Maharashtra followed by Karnataka and
Tamil Nadu.
8.4 Average Productivity of Cows and Buffaloes across States
The digitised maps showing the average productivity of cows and buffaloes
across states indicated that the average milk yield of non-descript cows was relatively
higher in Punjab, Haryana, Rajasthan and Gujarat, while that of crossbred cows was higher
in Punjab, Gujarat, Meghalaya and Mizoram. The average milk yield of buffaloes was
relatively higher in Punjab, Haryana, Jharkhand and Kerala.

9.0 Conclusions
We are in the age of information. The management of knowledge and the
decisions of tomorrow represent new horizons and challenges that come with such
adventures. The focus of management in this century will shift from the productivity of
the firm/organisation to the productivity of its knowledge. Knowledge will be the key
organisational resource of this century. The world's organisations will compete on a
playing field of knowledge competency, and the successful managers will be those who
are able to harness the power of technology to solve complex problems.

Application of MATLAB for designing neural network solutions in
dairy processing
Anand Prakash Ruhil¹, R.R.B. Singh² and R.C. Nagpal¹
¹Computer Centre, NDRI, Karnal
²Dairy Technology Division, NDRI, Karnal

1.0 Introduction
MATLAB (MATrix LABoratory) is a special-purpose computer program optimized to
perform engineering and scientific calculations. MATLAB is a huge program with an incredibly
rich variety of functions. It implements the MATLAB language and an extensive library of
predefined functions to make technical programming tasks easier and more efficient. This
extremely wide variety of functions makes it much easier to solve technical problems in
MATLAB than in other languages such as FORTRAN, C, C++, VB and VC++.

The Neural Network Toolbox (NNT) of MATLAB is a powerful toolbox for solving neural
network related problems. NNT not only provides established neural network procedures but also
allows users to write and add their own methods and procedures to the toolbox for solving
specific problems. The components required to design a neural network are available as objects
in the toolbox. A user can design a neural network either through the NNT wizard or by writing
a program in the MATLAB language using the network objects.

The purpose of this demo is to explain how to design a neural network for solving a problem
using NNT. This lecture consists of the following sections:

1. Problem Definition
2. Network Architecture
3. Data Preparation
4. Training and Simulation
5. Results

2.0 Problem Definition


A total of 148 observations on various input and output parameters of UHT milk, at different
storage temperatures and periods, were collected experimentally for predicting the shelf life of
UHT milk. A detailed description of the collected data is given below. A suitable neural network
model was designed to predict the values of the output parameters from the input parameters.

Input Parameters:
• REFL – Reflectance (Colour)
• THMF – Total hydroxy methyl furfural (Browning)
• TBA – Thiobarbituric acid ( Oxidation)
• FFA – Free fatty acid (Lipolysis)
• TNBS –Tri nitro benzene sulphonic acid (Proteolysis)

Output Parameters:
• Flavor Score
• Total Sensory Score

Storage Temperature Levels and Periods:

Temperature levels (°C)    Period (days)
9                          0, 4, 8, 12, 16
15                         0, 4, 8, 12, 16
25                         0, 2, 4, 6, 8, 10, 12, 14, 16
35                         0, 2, 4, 6, 8, 10, 12, 14, 16
45                         0, 2, 4, 6, 8, 10, 12, 14, 16

3.0 Network Architecture


After defining the problem, the next step is to design a suitable neural network model to obtain
an accurate prediction of the output parameters. To design the neural network, we have to specify
the following components of the network:

1. Network Model: Feedforward neural network with backpropagation


2. Network Layers: 1 and 2 hidden layers excluding input and output data layers.
3. Neurons in each layer: Varied from 3 to 30 neurons in each hidden layer.
4. Weight and Bias Matrix: Randomly initialized.
5. Transformation Function on each hidden layer: Tangent Sigmoid function
6. Transformation Function on output layer: Pure linear function
7. Training Algorithm: Trainbr (Bayesian Regularization)
8. Performance function: The relative percentage of Root Mean Square (%RMS) was
calculated to evaluate the performance of network model
\%RMS = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{O_i - E_i}{O_i}\right)^{2}} \times 100

where Oi = Observed value,


Ei = Predicted value and
N = Number of observations
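In MATLAB this measure can be computed directly on the observed and predicted matrices; a minimal sketch, assuming O and E are matrices of observed and predicted values with one row per output parameter and one column per observation:

relErr  = (O - E)./O;                    % relative error of each prediction
pctRMS  = 100*sqrt(mean(relErr.^2, 2));  % one %RMS value per output parameter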

There are some additional network parameters, depending on the chosen training algorithm, whose
values have to be specified before training the network. For some of these parameters the default
values serve the purpose, while for others it is better to change them; for example, it is usually
worth increasing the value of net.trainParam.epochs.
4.0 Data Preparation
The fundamental unit of data in any MATLAB program is the array. Input and output data are
therefore stored in the form of two-dimensional arrays (matrices). The rows of a matrix represent
parameters and the columns represent observations. The input and output data are further divided
into two or three subsets as per the requirement of the selected network training algorithm. The
output data set is known as the target in NNT terminology.

For the above problem, there are 5 input parameters, 2 output parameters and 148 observations.
We divide the data set into two subsets, namely a training subset comprising 112 observations
(75% of the total) and a testing subset comprising 36 observations (25% of the total). The input
matrix will have 5 rows and 112 columns, whereas the target matrix will have 2 rows and 112
columns for training the network. Similarly, the sizes of the input and target matrices for testing
the neural network will be 5x36 and 2x36, respectively.

Data files can also be imported into MATLAB from external sources like MS-Excel. In the present
context we have imported the data from an Excel file, since it is easy to prepare and verify data
in an MS-Excel worksheet; a sketch of such an import is given below.
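A minimal sketch of such an import using MATLAB's xlsread function; the file name, sheet name and column layout below are only illustrative assumptions:

% Illustrative only: assumes 'uht_data.xls' holds one observation per row,
% with the 5 input parameters in columns 1-5 and the 2 scores in columns 6-7.
raw        = xlsread('uht_data.xls', 'Sheet1');
inData     = raw(:, 1:5)';   % transpose: NNT expects parameters in rows
targetData = raw(:, 6:7)';   % 2 x 148 target matrix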

5.0 Training and Simulation


After preparing the input and output data and initializing the values of the network components,
the next step is to train the network with the training data set and then simulate the trained
network with the test data set to obtain the network output and to compute the performance
parameter used to evaluate the network. The network is trained and simulated a number of times
to get accurate and consistent results. Once the results are consistent, the value of the
performance parameter is recorded. Similarly, the value of the performance parameter is recorded
for different combinations of the number of layers, the number of neurons in each layer, the
transformation and training functions, etc. The network model for which the value of the
performance parameter is minimum is selected as the best model; a sketch of such a search is
given below. In the present context we have varied only two network components, namely the
number of layers and the number of neurons in each layer.
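The repeated train-and-simulate cycle can be scripted. The following is a rough sketch of such a search over the number of neurons in a single hidden layer; it assumes the data have already been pre-processed and partitioned as in the program given later (ptr, ttr, testing.P, testTargetData, meant, stdt) and is not the exact program used for the results reported here:

% Illustrative search over the number of neurons in one hidden layer
neuronCounts = [5 10 15 20 25];
bestRMS = Inf;
for S1 = neuronCounts
    net = newff(minmax(ptr), [S1 size(ttr,1)], {'tansig','purelin'}, 'trainbr');
    net.trainParam.epochs = 2000;
    net  = train(net, ptr, ttr);
    atst = poststd(sim(net, testing.P), meant, stdt);   % back to original units
    relErr = (testTargetData - atst)./testTargetData;
    pctRMS = 100*sqrt(mean(relErr(1,:).^2));            % %RMS for the first output
    if pctRMS < bestRMS, bestRMS = pctRMS; bestS1 = S1; end
end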

Neural network training can be more efficient if certain preprocessing steps are performed on the
inputs and targets. NNT provides several preprocessing routines for this purpose. For example,
the above mentioned training algorithm trainbr generally works best when the network inputs
and targets are scaled so that they fall approximately in the range [-1, 1]. If the inputs and targets
do not fall in this range, we can use the preprocessing function premnmx or prestd to perform the
scaling.
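For instance, scaling with premnmx and mapping the simulated output back with postmnmx could look like the following sketch (shown here only as an alternative to the prestd/poststd pair used in the program below; net is assumed to have been created already with newff):

% Scale inputs and targets to the range [-1, 1] before training
[pn, minp, maxp, tn, mint, maxt] = premnmx(inData, targetData);
net = train(net, pn, tn);
an  = sim(net, pn);
a   = postmnmx(an, mint, maxt);   % convert network output back to original units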

Script of the Program:

%Sample program to create a feed forward backpropagation network.


%Network is trained and simulated for UHT milk data.
%Demonstration given in a Workshop on "Trends in Computational Biology" on 20/10/05
clear all
clc
fileName = 'Demo_uht_Data';
load (fileName);
targetData = targetDataFS; % Assign the target data to be processed (FS: flavour score)
layers = 2; % enter the number of layers required in the network
S1=10; % Size of (Number of neurons in) first layer
S2=3; % Size of (Number of neurons in) second layer
S3=size(targetData,1); % Size of (number of neuron in) output layer
TF1 = 'tansig'; % Transfer function of first hidden layer
TF2 = 'tansig'; % Transfer function of second hidden layer
TF3 = 'purelin'; % Transfer function of output layer
BTF = 'trainbr' ; % Backprop network training function

% preprocessing of data to scale input and target data in the range [-1, 1]
[pn,meanp,stdp,tn,meant,stdt] = prestd(inData, targetData);
[R,Q] = size(pn);
iitst = 3:4:Q;
iitr = [1:4:Q 2:4:Q 4:4:Q];
testing.P = pn(:,iitst); % partitioning of input data in testing data subset
testing.T = tn(:,iitst); % partitioning of output data in testing data subset
ptr = pn(:,iitr); % partitioning of input data in training data subset
ttr = tn(:,iitr); % partitioning of output data in training data subset

testTargetData = targetData(:,iitst); % to extract original test data from target data

%creates a feed forward backpropagation network with two or three layers


if (layers == 2)
net = newff(minmax(ptr),[S1 S3],{TF1 TF3},BTF);
else
net = newff(minmax(ptr),[S1 S2 S3],{TF1 TF2 TF3},BTF);
end

net = init(net); % initialize the network


net.trainParam.epochs = 2000; %Maximum number of epochs to train network
net.trainParam.show = 50; %Display the results after every 50 epochs

%TRAIN trains a network NET according to NET.trainFcn and NET.trainParam.


[net, tr, trY, trE]=train(net, ptr, ttr);

%Plot the training, validation and test errors.


pause
clf
plot(tr.epoch,tr.perf,'r')
legend('Training',-1);
ylabel('Squared Error')

%Simulate a neural network for testing the results on unforeseen data i.e. test data
[antst, ID, ID1, Etest, prftst] = sim(net,testing.P);
atst = poststd(antst,meant,stdt); % reconverting the network output into its original units

% the following few lines compute the value of the performance parameter (%RMS)
testError = testTargetData - atst;
teste1 = testError./testTargetData;
teste2 = teste1.^2;
testSume2 = 0;
n = size(teste2,2);
for i = 1:n
testSume2 = testSume2 + teste2(1,i);
end
testAvge2 = testSume2 / n;
testRMS = sqrt(testAvge2) * 100;

%The following lines postprocess the trained network response with a linear regression
% through function POSTREG
for i=1:S3
pause % Strike any key to display the next output for test data...
[mtest(i),btest(i),rtest(i)] = postreg(atst(i,:),testTargetData(i,:));
end

disp('Performance of network for test data...')


disp ('SSE = ')
disp(prftst)

disp('Test data set RMS = ')


disp(testRMS)

disp('End of UhtmilkBR')

6.0 Results
A. Performance of ANN model for predicting flavour score using test data

Number of Hidden Layers    Neurons in each Layer    %RMS    R       Number of Effective Parameters
1                          5                        6.19    0.945   19
1                          10                       6.18    0.94    19
1                          15                       5.85    0.950   15
1                          20                       5.85    0.950   15
1                          25                       5.85    0.950   15
2                          3                        7.95    0.901   28
2                          5                        9.17    0.851   60

B. Performance of neural network for predicting total sensory score using test data

Number of Hidden Layers    Neurons in each Layer    %RMS    R       Number of Effective Parameters
1                          5                        4.39    0.966   22
1                          10                       4.36    0.967   20
1                          15                       4.36    0.967   20
1                          20                       4.33    0.967   20
1                          25                       4.36    0.967   20
2                          3                        4.20    0.97    21
2                          5                        6.09    0.896   52

7.0 References
1. Demuth, H. and Beale, M. Neural Network Toolbox User's Guide. The MathWorks, Inc.
2. Chapman, S. J. 2004. MATLAB Programming for Engineers. Thomson Press.
Application of Digital Imaging for assessing Physical Properties of Foods
Dr. Ruplal Choudhary
Senior Scientist
Dairy Engineering Division, NDRI Karnal 132001

1.0 Introduction

Inspection of food quality by digital image sensing is gaining importance in modern high-
volume, fully-automatic food processing plants. It offers the advantage of rapid, accurate and
non destructive quantification of various quality characteristics of food products. The images
of biological products, such as food and agricultural products can be acquired in visible or
non-visible range of electromagnetic waves. A vision based sensing system consists of a
source of illumination or radiation, and a spatial sensor, which measures the distribution of
reflected or absorbed radiation at each point of an object. The signal generated by a spatial
sensor is a 2-D image data, which needs to be correlated with the characteristics of the
products under inspection. Thus, a computer is an integral part of an image-based sensing and
inspection system. Computer software is used for data acquisition, processing, analysis and
interpretation. Rapid decisions regarding acceptance, rejection or recycling can be made by
image-based sensing and control systems. For online quality inspection of food products,
computer-integrated imaging systems are therefore indispensable tools for the modern automatic
food processing industry.

A simple machine vision system uses charge-coupled devices (CCDs) that are sensitive to visible
light in the electromagnetic spectrum. The visible-range cameras are of color or monochrome type
and determine the reflectance characteristics of products illuminated by a light source. The steps
involved in visible-image-based quality sensing are presented in the following sections.

2.0 Image acquisition

A laboratory visible imaging system, also known as a computer vision system, consists of a
sensor or camera that acquires two- or three-dimensional images of products, which are
converted into digital images by a digitizer and stored in the computer (Fig. 1).
The digital images are processed using computer algorithms to recognize the product and to
determine its characteristics. Based on the characteristics determined, products can be
classified or inspected for rejection. Thus an online machine vision inspection system
consists of image acquisition, digitization, processing, classification and actuation. Vision
systems are affected by the level and quality of illumination. A well designed illumination
system can help to improve the success of the image analysis by improving image contrast.
Good lighting can reduce reflection, shadow and some noise, giving decreased processing
time. Various aspects of illumination, including location, lamp type and color quality, need
to be considered when designing an illumination system for application in the food industry
(Panigrahi and Gunasekaran, 2001).
Fig 1: A laboratory computer vision system (Wang and Sun, 2002)

3.0 Image processing and Analysis

Image processing involves a series of image operations that enhance the quality of an image
in order to remove defects such as geometric distortion, improper focus etc. Image analysis is
the process of distinguishing the objects (regions of interest) from the background and
producing quantitative information, which is used in the subsequent control systems for
decision making. Image processing/analysis involve a series of steps, which can be broadly
divided into three levels: low, intermediate, and high, as shown in figure 2.

3.1 Low level processing: includes image acquisition and pre-processing. Digital image
acquisition is the transfer of the electronic signal from the sensing device (camera) to the
computer in digital form. A digital image is represented by a matrix of numerical values,
each representing a quantized image intensity value. Each matrix element is known as a pixel
(picture element). The total number of pixels in an image is determined by the size of the 2-D
array used in the camera. The intensity of a monochrome image is known as the grey level.
When an 8-bit integer is used to store each pixel value, gray levels range from 0 to 255,
where 0 is black and 255 is white. All intermediate values are shades of gray varying from
black to white. Each pixel in a color image is represented by 3 digits representing the RGB (Red,
Green, Blue) components, with each digit varying from 0 to 255. The RGB values can also be
converted to the HSI (Hue, Saturation and Intensity) color model for further processing.

Preprocessing of raw data involves improving image quality by suppressing undesirable


distortions or by enhancing important features of interest.
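As a small illustration of these low-level operations, an image can be read and its gray levels inspected with standard MATLAB/Image Processing Toolbox functions; the file name below is only an example:

% Illustrative: read a colour image, convert to gray levels and inspect a pixel
img  = imread('cheese_sample.jpg');   % M x N x 3 matrix of RGB values (0-255)
gray = rgb2gray(img);                 % M x N matrix of gray levels (0-255)
size(gray)                            % number of rows and columns (pixels)
gray(50, 60)                          % gray level of the pixel at row 50, column 60
hsv  = rgb2hsv(img);                  % hue-saturation-value, a close relative of HSI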

3.2 Intermediate level processing: involves image segmentation, and image


representation and description. Image segmentation is the operation of selecting a region of
interest that has strong correlation with objects. It is therefore one of the most important steps
in the entire image processing technique, as subsequent extracted data are highly dependent
on the accuracy of this operation. Segmentation can be achieved by three different
techniques: thresholding, edge-based segmentation and region based segmentation.
Thresholding is a simple and fast technique for characterizing image regions based on
constant reflectivity of their surfaces. Edge-based segmentation relies on edge detection by
edge operators. Edge operators detect discontinuities in grey level, color, texture, etc. Region
segmentation involves the grouping together of similar pixels to form regions representing
single objects within the image. The segmented image may be represented as a boundary or a
region. Boundary representation is suitable for analysis of size and shape features while
region representation is used in the evaluation of image texture and defects.

Image description deals with the extraction of quantitative information from the previously
segmented image regions. Various algorithms are used for this purpose with morphological,
textural, and photometric features quantified so that subsequent object recognition and
classification may be performed.

Image morphology refers to the geometric structure within an image, which includes size,
shape, particle distribution, and texture characteristics. Texture is characterized by the spatial
distribution of gray levels in a neighborhood. For most image processing purposes, texture is
defined as a repeating pattern of local variations in image intensity that are too fine to be
distinguished as separate objects at the observed resolution. Image texture can be used to
describe such image properties as smoothness, coarseness and regularity.
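A minimal sketch of such segmentation and feature extraction with standard Image Processing Toolbox functions is given below; Otsu's method is used here as one common way of choosing the threshold, and the variable gray is assumed to hold a grey-level image (as in the earlier sketch):

% Segment objects from the background by global thresholding, then extract
% simple morphological descriptors of each segmented region.
level = graythresh(gray);          % threshold in [0,1] chosen from the histogram
bw    = im2bw(gray, level);        % binary image: 1 = object, 0 = background
L     = bwlabel(bw);               % label connected regions
stats = regionprops(L, 'Area', 'Perimeter', 'Centroid');
areas = [stats.Area];              % size feature of every segmented region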

3.3 High level processing: involves recognition and interpretation, typically using
statistical classifiers or multilayer neural networks of the region of interest. These steps
provide the information necessary for the process control for quality sorting and grading.

At each stage of image processing process, interaction with a knowledge database is essential
for more precise decision making. Algorithms such as neural networks, fuzzy logic and
genetic algorithms are some of the techniques of building knowledge base into computer
structures. Such algorithms involve image understanding and decision making capacities thus
providing system control capabilities.

Fig 2: Image processing steps in food quality inspection


4.0 Applications of digital imaging

With the decreasing price of hardware and software, computer vision systems are being
increasingly used for automated quality inspection systems. It has been successfully
implemented for objective, online measurement of quality of several food products, such as,
horticultural produce, meat and fish, dairy and bakery, and food grains.

4.1 Dairy and Bakery

Internal and external appearance of baked products is an important quality attribute, generally
correlated with the overall consumer acceptability of the product. Scott (1994) described a
system which measured defects in baked loaves of bread by measuring the height and
slopes of the top. The internal structure of bread and cake, such as cell size, density and cell
distribution, was analyzed and directly correlated with texture (Sapirstein, 1995).
More recently, the consumer acceptability of chocolate chip cookies was correlated with the
size, shape and percentage of chocolate on the top surface of the cookies (Davidson et al., 2001).

Functional properties of cheese were evaluated by Wang and Sun (2002). Meltability and
browning properties of cheddar and mozzarella cheese were evaluated under different
cooking conditions and size of samples using machine vision. Ni and Gunasekaran (1995)
developed algorithms for evaluation of cheese shred dimensions using machine vision. This
will help maintain the quality of cheese shreds to be used in pizza toppings.

4.2 Meat, fish and poultry


Visual inspection is used extensively for the quality assessment of meat products, in
processes ranging from initial grading to consumer purchase. McDonald and Chen (1990)
investigated image-based beef grading. They discriminated between fat and lean in the l.d.
muscle based on reflectance characteristics; however, poor results were reported. Recently,
Subbiah (2004) examined computer vision for predicting the tenderness of aged, cooked
beef. Textural features extracted from images of fresh beef using statistical methods, Gabor
filters, and wavelets were used to predict tenderness. The adaptive segmentation algorithm
for color beef images separated l.d. muscle with 98% accuracy. A linear regression model
using statistical textural features predicted shear force tenderness with an R2 value of 0.72. A
canonical discriminant model using Gabor textural features classified carcasses into three
tenderness groups with 79% accuracy. A stepwise regression model using wavelet textural
features successfully classified carcasses into nine tenderness certification levels. Poultry
carcasses were characterized using multispectral imaging techniques. The multispectral
images of chicken carcasses were able to detect and separate bruise, tumorous, and skin torn
carcasses from normal carcasses (Park et al., 1996). Artificial neural network (ANN) models
were employed for image feature extraction and classification. The ANN models performed
with 91% accuracy in classification of carcasses. Automatic fish sorting techniques using
image analysis has been investigated to reduce tedious human inspection and costs (Strachan
and Kell, 1995). Using this technique, fish species were identified and sorted online from a
conveyor belt.
Fig 3: Segmentation of l.d. muscle by convex hull algorithm (Subbiah, 2004). Left: original
image, Right: segmented l.d. muscle.

4.3 Fruits and vegetables

Computer vision has been extensively used for classification, defect detection, quality
grading and variety classification of fruits and vegetables. Defect segmentation on Golden
Delicious apples was performed by a color machine vision system (Leemans et al., 1998), as
shown in figure 4.

The developed algorithm for color images gave satisfactory results for well contrasted
defects. Tao and Wen (1999) developed a novel adaptive spherical transform for a machine
vision defect sorting system. The transform used fast feature extraction and improved the
speed of inspection up to 3000 apples/min. The system had an accuracy of 94% while sorting
defective apples from good ones. Machine vision based sorting systems for peaches,
strawberries, tomatoes, oranges and mushrooms have been developed, based on the shape,
size and color features of the fruits and vegetables (Tao et al., 1995). Sugar content,
acidity and other physico-chemical parameters of fruits and vegetables were predicted from
the visible and nonvisible images of fruits and vegetables (Kondo et al., 2000, Steinmetz et
al., 1999).

5.0 Conclusions

A review of digital imaging techniques and their applications shows that these nondestructive
quality evaluation tools have great potential in the food industry. Automated, objective,
rapid and hygienic inspection of raw and processed foods can be achieved by computer vision
systems. Machine vision systems have the potential to become a vital component of automated
food processing operations, as computer capabilities and the processing speed of algorithms
continue to increase to meet the requirements of online quality control systems.

6.0 References:

Davidson, V.J., J. Rykes, and T. Chu (2001). Fuzzy models to predict consumer ratings for
biscuits based on digital features. IEEE Transactions on fuzzy systems, 9(1), 62-67.

Jayas, D.S. and C. Karunakaran (2005). Machine vision system in postharvest technology.
Stewart Postharvest Review, 2:2.

Kondo, N., U. Ahmad, M. Monta, and H. Murase (2000). Machine vision based quality
evaluation of Iyokan orange fruit using neural networks. Computers and Electronics in
Agriculture, 29(1-2), 135-147.

Leemans, V., H. Magein, and M.F. Destain (1998). Defects segmentation on Golden
Delicious apples by using color machine vision. Computers and Electronics in Agriculture.
20, 117-130.

Ni, H., and S. Gunasekaran (1995). A computer vision system for determining quality of
cheese shreds. In: Food processing automation IV. Proceedings of the FPAC conference,
St. Joseph, MI, USA: ASABE.

Panigrahi, S. and S. Gunasekaran (2001). Computer Vision. In: Nondestructive food


evaluation. Edited by: S. Gunasekaran. Marcel Dekker Inc., NY.

Park, B., Y.R. Chen, M. Nguyen, and H. Hwang (1996). Characterizing multispectral images
of tumorous, bruised, skin-torn and wholesome poultry carcasses. Transactions of the ASAE.
39(5), 1933-1941.
Sapirstein, H.D. (1995). Quality control in commercial baking: machine vision inspection of
crumb grain in bread and cake products. In: Food processing automation IV. Proceedings of
the FPAC conference, St. Joseph, MI, USA: ASABE.

Scott, A. (1994). Automated continuous online inspection, detection and rejection. Food
Technology Europe, 1(4), 86-88.

Steinmetz, V., J.M. Roger, E. Molto, and J.Blasco (1999). Online fusion of color camera and
spectrophotometer for sugar content prediction of apples. Journal of Agricultural Engineering
Research, 73, 207-216.

Strachan, N.J.C. and L. Kell (1995). A potential method for differentiation between haddock
fish stocks by computer vision using canonical discriminant analysis. ICES Journal of
Marine Science, 52, 145-149.

Subbiah J. (2004). Nondestructive evaluation of beef palatability. Unpublished PhD thesis,


Oklahoma State University, Stillwater, OK, USA.
Tao, Y. and Z. Wen (1999). An adaptive spherical image transform for high-speed fruit
defect detection. Transactions of the ASAE, 42(1), 241-246.

Tao, Y., P. Heinemann, Z. Varghese, C.T. Morrow, and H.J. Sommer (1995). Machine vision
for color inspection of potatoes and apples. Transactions of the ASAE, 38(5), 1555-1561.

Wang, H.H. and D.W. Sun (2002). Correlation between cheese meltability determined with a
computer vision method and with Arnott and Schreiber. Journal of Food Science, 67 (2),
745-749.
Dynamic modeling of dairy and food processing operations
Ruplal Choudhary
Senior Scientist
Dairy Engineering Division
NDRI, Karnal 132001

1.0 INTRODUCTION
A model is a description of the useful information extracted from a given practical
process. In general there are two methods in mathematical modeling: theoretical and
empirical. Theoretical models are built from the fundamental laws such as Newton’s law,
Laws of thermodynamics, Ohm’s law etc. For complex processes such as food process
operations, often it is difficult to obtain models theoretically. Empirical models are
therefore built by assuming the processes as black box. The inputs and outputs from such
systems are correlated by using statistical tools. In food process control, it is essential to
model the dynamics of the system, that is, how the relation between input and output
changes with time. Figure 1 shows a black box model of a process.
2.0 DYNAMIC MODELLING
Dynamic models are used to describe the relationship between state variables in a
transient state.

u Process y

Figure 1: A black box process model

The dynamic models characterize the changes in outputs caused by the changes in inputs
to make the system move from one state to another. The dynamic systems generally can
be assumed as following the equation:

y(t) = f(y(t-1), y(t-2), … y(t-p), u(t-1), u(t-2), … u(t-q),  (t), (t-1),… (t-r) (1)
where y(t)=output vector with m variables, u = input vector with n variables,  are the set
of coefficients in the model. For linear models, it will be a (n+1) element vector, for
nonlinear, depends on model structure,  are the m dimensional residual variables.
Equation (1) is a general form of discrete time nonlinear AutoRegression Moving
Average with eXogenous input (NARMAX). If the system is a linear system with a
single input and single output (SISO), it can be simplified to ARX models such as:

y(t) = 1.5 y(t-T)-0.5y(t-2T)+0.9u(t-2T)+0.5u(t-3T) (2)

The output at time t is thus computed as a linear combination of past outputs and past
inputs. It follows, for example, that the output at time t depends on the input signal at
many previous time instants. This is what the word dynamic refers to. The equation (2) is
an example of an ARX linear model of a system, obtained by the system identification
algorithm of Matlab. The system identification algorithm uses the input u and output y
of a system to figure out:

1. The coefficients () in this equation (i.e., -1.5, 0.7, etc.).


2. How many delayed outputs to use in the description (two in the example: y(t-T)
and y(t-2T)).
3. The time delay in the system (2T in the example equation (2) it can be seen from
this equation that it takes 2T time units before a change in u will affect y).
4. How many delayed inputs to use (two in the example: u(t-2T) and u(t-3T)). The
number of delayed inputs and outputs are usually referred to as the model
order(s).
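The dynamic behaviour implied by Eqn (2) can be seen by simply iterating the difference equation for a chosen input sequence; the short sketch below uses a unit-step input and zero initial conditions, both assumed here purely for illustration:

% Simulate the example model y(t) = 1.5y(t-T) - 0.5y(t-2T) + 0.9u(t-2T) + 0.5u(t-3T)
N = 50;                     % number of sampling instants
u = ones(1, N);             % unit-step input (an assumed test signal)
y = zeros(1, N);            % zero initial conditions assumed
for t = 4:N
    y(t) = 1.5*y(t-1) - 0.5*y(t-2) + 0.9*u(t-2) + 0.5*u(t-3);
end
plot(1:N, y), xlabel('sample'), ylabel('y')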

To understand the concept of the developing dynamic from available input output data,
let us take an example. Given:
{u(1), y(1)},{u(2), y(2)},...,{u(N), y(N)} ---------(3)
The first objective is to determine the model structure. If the model structure is ARX, it
can be written as follows:
y(t)  ay(t  1)  bu(t  1)  e(t) ---------(4)
The objective is to determine the coefficients a and b. The system of equation from given
data matrix can be written as:
y(2)  ay(1)  bu(1) 
 
y(3)  ay(2)  bu(2)  -------(5)
y(2) y(1) u(1) a
 y(3)    y(2) u(2) b 
    
The coefficients ‘a’ and ‘b’ for a model structure presented in equation (4) can be
estimated by the system identification toolbox using the following equation:

 y(2) 
 a  y(3) 
     T    T 
1

 ... 
 b   
 y(N)
--------(6)
 y(1) u(1) 
 
y(2) u(2) 
 
 ...... 
 
y(N  1) u(N 1)

The prediction error can be found by taking the difference between the values predicted by the
model and the actual data. The data set used for assessing the accuracy of the model should
be different from the one used for developing the dynamic model.
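Eqn (6) is an ordinary least-squares problem that MATLAB solves directly with the backslash operator; a minimal sketch, assuming the measured input u and output y are column vectors of length N:

% Least-squares estimate of a and b in y(t) = a*y(t-1) + b*u(t-1)
Y     = y(2:N);                  % left-hand side of Eqn (5)
Phi   = [y(1:N-1) u(1:N-1)];     % regressor matrix
theta = Phi \ Y;                 % solves the normal equations of Eqn (6)
a = theta(1);  b = theta(2);
yhat  = Phi*theta;               % one-step-ahead predictions
err   = Y - yhat;                % prediction error on this data set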

The System Identification Toolbox of Matlab can be used for model development and
validation. Once the model parameters are estimated with least error, the models are
analyzed for their stability by obtaining the step response and Bode plot of the model. Using
these analytical tools, the system behaviour is analyzed and suitable controllers can be
designed to implement real-time process control during dairy and food processing
operations.
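A brief sketch of this workflow with the System Identification Toolbox is given below; the sampling interval, model orders and variable names are illustrative assumptions:

% Illustrative ARX identification and validation workflow
Ts    = 1;                          % sampling interval (assumed)
ze    = iddata(yEst, uEst, Ts);     % estimation data set
zv    = iddata(yVal, uVal, Ts);     % separate validation data set
model = arx(ze, [2 2 2]);           % ARX with na=2, nb=2, delay nk=2 (assumed orders)
compare(zv, model)                  % predicted vs. measured output on validation data
step(model)                         % step response for behaviour/stability analysis
bode(model)                         % frequency response (Bode plot)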

3.0 REFERENCES:
1. Huang, Y., A.D. Whittaker and R.E. Lacey. 2002. Automation for Food
Engineering. CRC Press, Washington, DC.
2. Ljung, L. 1999. System Identification: Theory for the User, 2nd ed. Prentice Hall,
Upper Saddle River, NJ.
3. MATLAB 6.5. 2002. System Identification Toolbox. The MathWorks, Inc.
MODELING OF FOOD PROCESSING OPERATIONS

S. N. Jha
CIPHET, Ludhiana – 141 004

1.0 Introduction
The process of solving food processing problems can roughly be divided into four
phases. The first consists of constructing a mathematical model for the corresponding
problem; this could be in the form of differential or algebraic equations. In the second
phase, the mathematical model is converted to a numerical model through approximations
that make it easier to solve. The third phase is the solution of the numerical model.
The fourth and final phase is the application of the solution in the actual design of process
equipment and operations.
Ideally, modeling of food processing starts with the general laws of chemistry and
physics. The power and pitfalls of the modeling approach are illustrated in this lecture with
two real-field examples that are familiar to most food scientists: one is the roasting of grain
during popping operations, and the other is the pressing of paneer or citrus fruit for whey/juice
expulsion.

2.0 Grain Roasting


The grains that are roasted for getting their kernel popped are paddy, corn, millets
and some nuts such as peanut, gorgon nut etc. The popping process consists of
conditioning, high temperature roasting, and popping. Conditioning creates a small gap
between the kernel and the shell, equilibrates the moisture, and gelatinizes the kernel’s
starch. High temperature roasting generates superheated steam and builds pressure within
the grain. The majority of grains get popped during roasting, while a few whose shell wall is
hard do not pop easily; the hot roasted grain is then hit externally to develop a crack for a
sudden release of the internal pressure and popping. A regular shape and uniform size are the
desirable features of a popped kernel, but the kernel gets distorted if the shell wall breaks and
popping takes place during roasting due to the development of high pressure. The internal
pressure is mainly a function of the roasting time, moisture content and temperature of the grain.

2.1 Mathematical modeling for knowing internal pressure in grain during roasting
A generalized model of conditioned spherical grain (Fig. 1) is comprised of an outer
shell, the starchy kernel and a small gap filled with air-water vapour mixture between the
kernel and the shell. If the grain is roasted to high temperature, it is expected that the
internal pressure developed due to air-water vapour mixture within the grain will induce
hoop stress in the shell wall (Fig. 2.). Analysis of thermal treatment of the grain mainly
involves three problems, inter-linked due to the volume constraint. They are:
(i) thermo-elasticity of the spherical shell
(ii) expansion and shrinkage of the kernel, and
(iii) compression of air-water vapour mixture in the gap between the kernel and
the shell.
If Vs, Vk, and Va are respectively the volume enclosed by the shell, the volumes of the
kernel and the gap between the kernel and the shell, small change in Vs is written as:

Fig. 1. A generalized spherical model of conditioned grain (outer shell of thickness t and radius Rs at wall temperature Tw, starchy kernel of radius Rk, and a gap filled with air-water vapour mixture)

\Delta V_s = \Delta V_k + \Delta V_a \qquad (1.1)


2.1.1. Change in volume enclosed by the shell, ΔVs
In deriving the expression for change in volume enclosed by the shell, ΔVs , the
following assumptions were used:
(i) the shell is spherical,
(ii) the shell material is elastically isotropic,
(iii) the thermal anisotropy of the shell material is restricted to the radial direction only,
(iv) the shell's thickness is small compared to its radius, and there is no abrupt change in
curvature, so that bending stress can be neglected,
(v) the roasting pan temperature is very high and the moisture content of the thin shell of
the grain is so low that it attains equilibrium in moisture content and temperature almost
immediately, and
(vi) the temperature of the shell wall remains uniform with respect to thickness, so there is
no thermal stress within the shell.

Fig. 2. Force diagram for a spherical shell of grain (internal pressure p acting on a shell of radius Rs)
The net radial expansion of a thin spherical shell subjected to an internal pressure p,
uniform shell wall temperature field Tw, and radial contraction due to moisture loss can
be obtained by using the membrane theory of shells of revolutions as follows:
Based on the above assumptions, the simpler ‘membrane theory’ for shell may be
written as:
\sigma_\theta = \sigma_\varphi = \frac{pR_s}{2t} \qquad (1.2)

where σθ, σφ, p, Rs and t are the circumferential stress (Pa), meridional stress (Pa), internal
pressure (Pa), and radius and thickness (m) of the shell of the grain, respectively.

The circumferential membrane strain in the shell is given by

\varepsilon_\theta - \varepsilon_{th} - \varepsilon_m = \frac{1}{E}\left(\sigma_\theta - \nu\,\sigma_\varphi\right) \qquad (1.3)

where εθ, εth, εm, ν and E are the elastic circumferential strain, thermal strain, circumferential
moisture strain, Poisson's ratio and modulus of elasticity (Pa) of the shell material, respectively.
The circumferential thermal strain in the shell is given by

\varepsilon_{th} = \alpha\,\Delta T_s \qquad (1.4)

where α and ΔTs are the coefficient of linear thermal expansion (°C⁻¹) and the change in
temperature (°C) with respect to the initial temperature of the shell, respectively.
The elastic circumferential strain is given by

\varepsilon_\theta = \frac{\Delta R_s}{R_s} \qquad (1.5)

where Rs and ΔRs are the radius of the shell (m) and the change in the same, respectively.
The circumferential moisture strain is expressed as

\varepsilon_m = \beta\,\Delta M_s \qquad (1.6)

where β and ΔMs are the coefficient of linear moisture contraction (dimensionless) and the
change in moisture content of the shell (fraction, dry basis), respectively. From Eqn (1.3),

\varepsilon_\theta = \frac{1}{E}\left(\sigma_\theta - \nu\,\sigma_\varphi\right) + \varepsilon_{th} + \varepsilon_m \qquad (1.7)

Substituting Eqns (1.2), (1.4), (1.5) and (1.6) into Eqn (1.7), using the resulting expression for
the change in radius and neglecting the higher-order terms, the expression for the change in
volume enclosed by the shell becomes

\Delta V_s = 3V_s\left[\frac{pR_s(1-\nu)}{2tE} + \alpha\,\Delta T_s + \beta\,\Delta M_s\right] \qquad (1.8)

2.1.2. Change in volume of the kernel, ΔVk


The change in volume of the kernel can be obtained by heat and mass transfer
analysis. To deduce expressions for moisture transfer from the kernel and for heat transfer to
the kernel from the shell, the following assumptions were made:
(i) the kernel is incompressible and spherical
(ii) heat transfer to the kernel from the shell is mainly through conduction via
contact area of kernel and the shell wall during roasting. Heat transfer through
convection and radiation, if any, are ignored, and
(iii) specific heat and latent heat of vaporization of kernel are constant at a
particular roasting temperature.

2.1.3. Moisture transfer
The moisture content of the kernel during roasting is dependent on initial and
equilibrium moisture content of whole grain, roasting time and temperature. The
relationship between the moisture content of kernel and the grain with roasting
temperature and time can be defined by the following functions:

Mk  f (Mg ) (1.9)
Mg  f1 (Dg , θ) (1.10)
Dg  f2 (Mgo, T,Mge ) (1.11)
where, Mk, Mg, Dg, θ , Mgo, T, Mge respectively are moisture content of kernel, decimal,
dry basis (d.b.), moisture content of grain, decimal, d.b.; moisture diffusivity of grain,
m2min-1; roasting time, min; initial moisture content of grain, fraction, db; roasting pan
temperature, oC; and equilibrium moisture content of grain, decimal, db.
Substituting Eqns (1.10) and (1.11) into Eqn (1.9), the final function for Mk may be written as:
M_k = f\left[\,f_1\{f_2(M_{go}, T, M_{ge})\}, \theta\,\right] \qquad (1.12)
Expressions for Dg, Mg and Mk assuming equilibrium moisture content of the grain at
high temperature conduction roasting, as zero (Jha & Prasad 1993, Jha, 1993) can be
written as:
Dg  Do  D1M go  D2T  D3M go D M
2
4 Tgo (1.13)

Mg  Mgo C1 exp(C2θ) (1.14)

M k  C3M g  C4 (1.15)
Dg π 2
in which C2 
R2
where, Do, D1, D2, D3, D4, C1, C2, C3, C4, are constants given in Table 1, and R is the
radius of the grain, m. Putting the expression for Mg from Eqn (1.14) into Eqn (1.15)

Mk  C1C3 Mgo exp (C2θ)  C4 (1.16)

The rate of moisture transfer from the kernel could be written as:
dMk  C C C M exp(C θ) (1.17)
1 2 3 go 2

2.1.4. Heat transfer
Heat input to the kernel from the shell by conduction, neglecting the convection
and radiation loss and heat transfer through the point contact, can be written as:
qin  4πkv R g R k  Tw  Tk 
 

(1.18)
 

 Rk 
 Rg 



4
where, qin, kv, Rg, Rk, Tw, and Tk, respectively are heat input to the kernel, kJmin-1;
thermal conductivity of air-water vapor mixture, kJmin-1m-1oC-1; radius of kernel plus gap
between the kernel and shell, if any, m; radius of kernel, m; shell wall and kernel
temperature, oC at any time θ , min. The heat accumulated in the kernel is:
q_{ac} = C_{pk} W_{kd} (1 + M_k) \frac{dT_k}{d\theta} \qquad (1.19)

where qac, Cpk and Wkd are the heat accumulated in the kernel (kJ min⁻¹), the specific heat of
the kernel (kJ kg⁻¹ °C⁻¹) and the bone-dry mass of the kernel (kg), respectively. The heat
utilized for vaporization of moisture can be written as:

q_{out} = -L_k W_{kd} \frac{dM_k}{d\theta} \qquad (1.20)

where qout and Lk are the heat utilized in vaporization of moisture from the kernel (kJ min⁻¹)
and the latent heat of vaporization of the same (kJ kg⁻¹), respectively. Now,

q_{ac} = q_{in} - q_{out} \qquad (1.21)

On substituting Eqns (1.18), (1.19) and (1.20) into Eqn (1.21) and rearranging the terms,

\frac{dT_k}{d\theta} = \frac{4\pi k_v R_g R_k (T_w - T_k)}{C_{pk} W_k (R_g - R_k)} + \frac{L_k}{C_{pk}(1 + M_k)}\,\frac{dM_k}{d\theta}

Let \delta_1 = \dfrac{W_{sd}}{W_{gd}}; then W_k = W_{gd}\left[(1 + M_g) - \delta_1 (1 + M_s)\right]

where Wsd, Wgd, Wk and Ms are the bone-dry mass of the shell (kg), the bone-dry mass of the
whole grain (kg), the mass of the kernel (kg) and the moisture content of the shell (fraction,
dry basis), respectively. Expressing the mass of the kernel in terms of the mass of the grain and
rearranging the terms in the above equation, the simplified form of the heat transfer equation
can be written as:

\frac{dT_k}{d\theta} = A(T_w - T_k) + B\frac{dM_k}{d\theta} \qquad (1.22)

where

A = \frac{4\pi k_v R_g R_k (1 + M_{go})}{C_{pk} W_{gd}\{(1 + M_g) - \delta_1(1 + M_s)\}(R_g - R_k)} \quad \text{and} \quad B = \frac{L_k}{C_{pk}(1 + M_k)}

If T = T_w - T_k, then \frac{dT}{d\theta} = -\frac{dT_k}{d\theta}, and using the expression for
\frac{dM_k}{d\theta} from Eqn (1.17), Eqn (1.22) can be written as:

\frac{dT}{d\theta} = -AT + C \qquad (1.23)

where C = B_1 \exp(-C_2\theta) and B_1 = B\,C_1 C_2 C_3 M_{go}.

Eqn (1.23) is a first-order differential equation; with the boundary (initial) condition
\theta = 0, T_k = T_{ko}, the solution is (Agnew, 1960):

T = T_w - T_k = \frac{B_1}{A - C_2}\left[\exp(-C_2\theta) - \exp(-A\theta)\right] + (T_w - T_{ko})\exp(-A\theta) \qquad (1.24)

From Eqns (1.17) and (1.24) the moisture content and temperature of the kernel
respectively could be obtained at different time interval during roasting.
Now change in volume of the kernel due to change in moisture and temperature
can be written as:

ΔVk  Vk [γ (Tk  Tko )  γc (Mko  Mk )] (1.25)

where, γ and γ c are coefficients of volume expansion and contraction due to change in
temperature and moisture of the kernel, respectively.

2.1.5. Change in volume of air-water vapour mixture, Va


Let, Po, Vo, To and m respectively be the initial pressure, Pa; volume, m3; absolute
temperature, K; and mass of the air-water vapour mixture, kg; in the gap under ambient
conditions. Assuming the shell of grain impermeable to the air water vapour mixture for
the short period of roasting, the changes in pressure, volume, temperature and mass
during roasting period of θ be p,ΔV, ΔT, and Δm respectively. If the air water vapour
mixture obeys the ideal gas law, it can be written for ambient condition as :
Po Vo  m Go To (1.26)
and after elapsed time θ
(Po  p)(Vo  ΔV)  (m  Δm) G1 (To  ΔT) (1.27)
where, Go and G1 are gas constants, Jkg-1 K-1; at initial conditions and after time θ,
respectively. Dividing Eqn (1.27) by Eqn (1.26)

To (Po  p)(Vo  V) G1 (m  m)



(To  T) Po Vo Gom
As V  Va and Vo  Va
 G1Po (M  m) (To  T) 

Va  Va  -1 (1.28)


 Go mTo (Po  p) 
Let the volume of the air-water vapour gap Va be a fraction of  of the total
enclosed initial volume of the shell, Vs
 VδV (1.29)
a s

and Vk  (1 δ) Vs (1.30)

Now substituting the expressions for ΔVs , Vk and Va from the Eqns (1.8),
(1.25) and (1.28) respectively in Eqn (1.1) and replacing Va and Vk by Eqns (1.29) and

6
(1.30) respectively in the above equation, rearranging the terms, and assuming the right
hand side of the resulting equation as X, the simplified equation can be written as:

3R s mG o To (1 ν ) p 2  G o To m[3R s Po (1 ν)  2 t E X] p

 2tEPo [δ G1 (m  Δm) (To  ΔT)  G o mTo X]  0 (1.31)


Eqn (1.31) is in a quadratic from of p which can be solved for any time θ
2.2. Solution of the model
The gorgon nut (Euryale ferox), which fully represents the assumed generalized
model (Fig. 1), was taken as an example. A material surface temperature of 213 °C, attained
when the nut was roasted at a pan temperature of 335 °C for the highest percentage of popping,
was taken as the point of prediction. Properties and processing conditions reported elsewhere
were used. Some other basic data required for the solution of the model were determined
separately (Table 1), and a computer program was developed for computing the moisture content
of the whole nut and kernel, the change in volume of the nut, and the pressure built up within
the nut during roasting.
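A sketch of the moisture and temperature part of such a program, based on Eqns (1.16) and (1.24) and the constants of Table 1, is given below; the values of C2, A and B1 are placeholders only and would in practice be computed from Eqns (1.13) and (1.22):

% Kernel moisture (Eqn 1.16) and temperature (Eqn 1.24) during roasting
C1 = 0.979;  C3 = 1.357;  C4 = 0.75;   % constants from Table 1
Mgo = 25.9;  Tw = 213;  Tko = 30;      % % d.b. and deg C, from Table 1
C2 = 0.3;  A = 1.0;  B1 = 50;          % illustrative placeholders only; compute
                                       % these from Eqns (1.13) and (1.22) in practice
theta = 0:0.1:4;                       % roasting time, min
Mk = C1*C3*Mgo*exp(-C2*theta) - C4;    % kernel moisture content, % d.b.
T  = B1/(A - C2)*(exp(-C2*theta) - exp(-A*theta)) + (Tw - Tko)*exp(-A*theta);
Tk = Tw - T;                           % kernel temperature, deg C
plot(theta, Mk), xlabel('Roasting time, min'), ylabel('M_k, % d.b.')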
2.3. Validation of model
Accuracy of prediction of the model was determined by comparing the predicted values
with the measured ones. As measurement of the pressure developed within the moving nut
during roasting was not possible, the accuracy of its predicted values was assumed to be that
of the moisture content and change in volume of the nut. The internal pressure predicted in the
nut during roasting increased rapidly in the beginning, and later its rate of increase declined
(Fig. 3). The pressure built up within the nut at 2.8 min (the optimum roasting time reported
in the literature) and 213 °C nut temperature was found to be about 45 MPa.

Fig. 3. Predicted internal pressure in gorgon nut during roasting
Fig. 4. Observed and predicted change in volume of whole gorgon nut during roasting

Fig. 5. Variation between predicted and observed volume of gorgon nut during roasting

The measurement of pressure built-up within the nut during roasting was neither
possible nor reported in the literature for any grain as far as direct validation is
concerned. The predicted and measured values of change in volume of the gorgon nut,

which is very much dependent on internal pressure, were thus compared (Fig. 4).
Variation between predicted and measured values was found to be high (24 to 136 %) till
1 min of roasting (Fig. 5). The large variation at initial stage could be due to the
assumptions made that the shell of the nut is impervious and attains an equilibrium
temperature almost immediately at high temperature roasting, which may not be
completely valid in the real situation. The maximum variation between the observed and
predicted values after 1 min however was within 8 %, and is only about 0.1 % beyond 2.8
min of roasting which suggests that the model could be used well for the purpose beyond
1 min of roasting time of the gorgon nut.
Internal pressure built up within the nut during roasting is also very much
dependent on the moisture content of the nut during roasting. Scatter plots of measured
versus predicted moisture content of the whole gorgon nut and its kernel are presented in
Figs. 6-7 for the entire period of roasting. Slopes and correlation coefficients of the curves
are very near to 1. The small intercepts in the regression equations show little deviation from
the real values, which is negligible as compared to the initial moisture contents of 25.9% and
33.4% of the whole nut and kernel, respectively. The moisture content of about 14% predicted
by the model for the best time of popping (2.8 min of roasting) is also in agreement with the
reported values for maximum expansion ratio of popcorn. Prediction of moisture by this model
is also better than that by models available for drying and roasting, and one may thus use it
to regulate the internal pressure in the grain so as to avoid unregulated self-bursting during
roasting, obtain a popped kernel of regular shape and uniform size, and prevent injury to
nearby persons by minimizing jumping of grain from the roasting pan.

Fig. 7. Observed and predicted moisture content of kernel during roasting (M_kp = 1.0073 M_ko - 0.6849, R = 0.9985)

3.0 Modeling for paneer/citrus fruit pressing for whey/juice extraction


3.1. Basis of the Model
The process of expression of liquid (juice or whey, as the case may be) from the peeled fruit
or the curdled paneer mass is carried out by applying pressure to the material within an
envelope (cylinder) which retains the fibre of the fruit (insoluble solids) or the paneer mass
skeleton and allows the liquid juice or whey to escape across the envelope through a filter
medium fitted at the bottom. This model assumes that such whey expression is a case of flow
through a porous medium.
After application of the pressure, the solid parts begin to consolidate and
simultaneously the juice/whey flows through the cell wall pores and the inter-curd-lump
voids until it passes through the retaining envelope. It is assumed that this process can be
divided into the following three components:
(i) flow of whey or juice, as the case may be, through the cell wall pores of an individual
lump/fruit sac,
(ii) flow through the voids between curd lumps or fruit sacs, and
(iii) consolidation of the solid mass.
Table 1. Roasting conditions and properties of gorgon nut used in validation of the model

Radius of grain, Rg                                        4.5 mm
Shell thickness, t                                         1 mm
Gap between shell and kernel, g                            0.2 mm
Poisson's ratio, ν                                         0.3
Modulus of elasticity, E                                   703 × 10^6 Pa
Coefficient of linear thermal expansion, α                 3.88 × 10^-4 °C^-1
Coefficient of linear moisture contraction, β              0.0152
Coefficient of volume thermal expansion, γ                 10.15 × 10^-4 °C^-1
Coefficient of volume moisture contraction, γc             0.0603
Initial moisture content of grain (gorgon nut), Mgo        25.9% (dry basis)
Initial surface temperature of grain, Two                  30 °C
Final surface temperature of grain, Tw                     213 °C
Initial mass of a single grain, Wgo                        4.585 × 10^-4 kg
Initial moisture content of kernel, Mko                    33.4% (db)
Initial moisture content of shell, Mso                     5% (db)
Final moisture content of shell, Ms                        2% (db)
Latent heat of vaporization of moisture from kernel, Lk    2175.7 kJ kg^-1
Specific heat of kernel, Cpk                               2.015 kJ kg^-1 °C^-1
Thermal conductivity of vapour, kv                         1.41 J m^-1 min^-1 °C^-1
Ratio δ                                                    0.513
Normal atmospheric pressure, Po                            101.3 kPa
Normal temperature, To                                     303.15 K
C1, C3, C4, respectively                                   0.979, 1.357, 0.75

3.2. Flow Through Cell Wall Pore within a fruit sac/curd lump
Natural flow of liquid through cell wall pores may be described by the Hagen-Poiseuille
equation for the flow of Newtonian fluids through a pipe as follows:

q_p = \frac{\pi r^{4} (P_1 - P_2)}{8\eta L} = H_c\,\Delta P \qquad (2.1)

where qp is the flow rate of liquid, r is the pore radius, P1 - P2 = ΔP is the pressure drop
across a pore of length L, η is the fluid viscosity, and Hc is the hydraulic conductivity.

3.3 Flow Through the Voids Between the Fruit Sac/Curd Lumps
Flow through porous media such as curdled mass of paneer or peeled fruit sac
may be described by Darcy’s law as follows:

q_t = \frac{\partial q}{\partial t} = \frac{k}{\rho g}\,\frac{\partial p}{\partial z} \qquad (2.2)

where q is the flow through the voids, t is time, k is the coefficient of permeability, ρ is the
fluid density, g is the gravitational constant, and ∂p/∂z is the hydraulic gradient in the fluid,
i.e. the pressure difference p over the height z. The amount of liquid, either whey or fruit
juice, expressed is given by integrating Eqn (2.2) and multiplying by the total cylinder area
of flow, Ac, as given below:

Q = A_c \int_{0}^{t} q_t \, dt \qquad (2.3)
3.4 Consolidation of Medium During Pressing
Application of pressure in vertical direction to a medium saturated with water is
partitioned as follows:
ζ t  ζi  p (2.4)
where, ζ t = total applied pressure, ζ i = pressure carried by the medium skeleton, and
p = pressure carried by the medium fluid; pore or fluid pressure.
The process of consolidation of paneer mass may be expressed in form of
differential equation which could be obtained by combining Eqn. (2.1) to Eqn. (2.4) and
applying the law of conservation of mass as follows:
 2 P p  ζt
Cv 2  t (2.5)
z t
k
where, Cv is the coefficient of consolidation defined as: Cv  , k = coefficient of
mv γw
permeability of coagulum or peeled fruit, mv = coefficient of compressibility; γ w = unit
weight of whey or juice. Whey or juice during pressing is expressed through a screen
(filter medium) at the base of the cylinder (Mudgal and Agrawala, 1995). The mode of
application of pressure to the peeled fruit or curdled mass after draining the whey in case
of paneer constitute more than 80% liquid, may be of two simple type: (a) linearly
increase in pressure and (b) application of constant load. The solutions of the Eqn (2.5)
for the above two cases are summarized below:
Case I: Linear increase in load \left(\frac{\partial \sigma_t}{\partial t} = R\right)

The following initial and boundary conditions may be applied to Eqn (2.5):
p = 0 at t = 0
\frac{\partial p}{\partial z} = 0 at z = H, i.e. no discharge of liquid whey/juice at the top, where the load is applied.

The solution to Eqn (2.5) can thus be obtained as (Myers, 1971):

p(z,t) = \frac{16 H^{2} R}{\pi^{3} C_v} \sum_{n=0}^{\infty} \frac{1}{(2n+1)^{3}} \left[1 - \exp\left(-\frac{(2n+1)^{2} \pi^{2} C_v t}{4H^{2}}\right)\right] \sin\left(\frac{(2n+1)\pi z}{2H}\right) \qquad (2.6)
where, H is initial height of coagulum or peeled fruit mass and z is a varying height from
top to bottom .
Case II: Application of constant pressure
If a constant pressure is applied, as is usually done manually in small-scale processing,
the term \frac{\partial \sigma_t}{\partial t} of Eqn (2.5) becomes zero. Thus the modified form
of Eqn (2.5) can be written as:

C_v \frac{\partial^{2} p}{\partial z^{2}} = \frac{\partial p}{\partial t} \qquad (2.7)

The solution of Eqn (2.7) can be obtained as below, with the same initial and boundary
conditions as in Case I:

p(z,t) = A\,e^{-Nt} \sin\left(\frac{\pi z}{2H}\right) \qquad (2.8)

where A is a constant known as the Fourier trigonometric coefficient and N = \frac{C_v \pi^{2}}{4H^{2}} is a constant.

Eqns (2.6) and (2.8) give the solution for p, the pore or fluid pressure, at any time t and
height z from the top of the coagulum/peeled fruit under their respective conditions of
pressing. These equations may be expressed in terms of the amount of liquid expressed at any
height z and time t > 0 by evaluating \frac{\partial p}{\partial z} from either Eqn (2.6) or
Eqn (2.8), as the case may be. Here the exercise is shown for the constant-load condition only,
because of its simplicity and practical use. By differentiating Eqn (2.8) with respect to z, the
following equation is obtained:

\frac{\partial p}{\partial z} = A\,\frac{\pi}{2H}\,e^{-Nt} \cos\left(\frac{\pi z}{2H}\right) \qquad (2.9)

Putting the value of \frac{\partial p}{\partial z} from Eqn (2.9) into Eqn (2.2) and evaluating
the expression for Q from Eqn (2.3):

Q = A_c \int_{0}^{t} \frac{k}{\rho g}\cdot\frac{A\pi}{2H}\,e^{-Nt} \cos\left(\frac{\pi z}{2H}\right) dt \qquad (2.10)

or, Q = K_1\left(1 - e^{-Nt}\right) \qquad (2.11)

where K_1 = \frac{A\,A_c\,\pi\,k}{2H\,\rho\,g\,N} \cos\left(\frac{\pi z}{2H}\right)

Equation (2.11) is the final form of the model for the constant-load condition, which gives the
amount of liquid (juice or whey) expressed for a particular time and height of pressing.
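As a small worked example, Eqn (2.11) can be evaluated in MATLAB with the constants reported in Table 2.1 for the 600 micron screen and 40 kPa load:

% Whey expressed vs. pressing time from Eqn (2.11), Q = K1*(1 - exp(-N*t))
K1 = 159;      % ml     (Table 2.1, 600 micron screen, 40 kPa)
N  = 0.387;    % min^-1 (Table 2.1)
t  = 0:0.5:20; % pressing time, min
Q  = K1*(1 - exp(-N*t));
plot(t, Q), xlabel('Pressing time, min'), ylabel('Whey expressed, ml')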

4.0 Validation of Model


The constants in the final model (Eqn 2.11) were computed (Table 2.1) using the
experimental data on drainage of whey reported elsewhere and on the amount of juice
extracted, under a constant load of 40 kPa, from locally available oranges in a separate
experiment. Scatter plots of predicted versus experimental amounts of extracted whey and
juice showed a very close relation, with correlation coefficients (R) of about 0.999 in the
case of whey and 0.998 in the case of juice (Figs. 2.1-2.2). Differences between the extracted
and predicted amounts of whey and juice with extraction time (Figs. 2.3 and 2.4) are attributed
mainly to four reasons. First, choking of the holes of the filter medium due to transport of a
few solid particles by the whey and/or juice during drainage. Second, the presence of occluded
air in the voids between the lumps/fruit sacs, which delays the building of the necessary pore
pressure (p). Most of the differences in whey/juice expelled occurred in the initial five
minutes of pressing, and at a later stage the difference became almost constant; in both cases
there is good agreement between the predicted and experimental values of whey/juice extracted.
Third, in deriving Eqn (2.11) the area of flow has been taken as the whole cylinder area Ac.
This implies that the screen at the base of the cylinder does not offer any resistance to the
flow of liquid juice or whey passing through it. However, if the drainage area of the screen is
reduced, either by reducing the screen hole size or by choking of holes with pressing time, to
a level that offers resistance to the flow of liquid through the screen, then this resistance
should be incorporated in the coefficient of permeability of the mass under compression to
reduce the deviation between the predicted and experimental values. This fact is evident from
the changing values of the constants K1 and N (Table 2.1) with screen size. Fourth, the
coefficient of permeability k and the medium property parameter Cv were tacitly assumed to be
constant. Even with these assumptions, the differences between the predicted and actual amounts
of extraction varied between only 1.5 and about 8%, well within the range of 10% usually
assumed for the design of an instrument or machine.
Table 2.1. Constants of Equation (2.11) computed from experimental data (Mudgal, 1993)

Applied pressure, kPa    Screen size, micron    K1, ml    N, min-1
40                       600                    159       0.387
40                       500                    142       0.401
40                       212                    108       0.415
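
As an illustration, Eqn. (2.11) can be evaluated numerically with the fitted constants of Table 2.1; the short Python sketch below assumes only those tabulated K1 and N values.

```python
import math

# Fitted constants from Table 2.1 (40 kPa load): screen size (micron) -> (K1 in ml, N in 1/min)
CONSTANTS = {600: (159, 0.387), 500: (142, 0.401), 212: (108, 0.415)}

def expressed_volume(t_min, screen_micron=600):
    """Amount of liquid expressed (ml) after t_min minutes: Q = K1*(1 - exp(-N*t)), Eqn (2.11)."""
    k1, n = CONSTANTS[screen_micron]
    return k1 * (1.0 - math.exp(-n * t_min))

if __name__ == "__main__":
    for t in (1, 5, 10, 20):
        print(t, "min ->", round(expressed_volume(t), 1), "ml")
```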
[Fig. 2.1. Actual versus predicted amount of whey (ml) using different screens (600, 500 and 212 micron) as filter materials.]
[Fig. 2.2. Actual versus predicted amount of extracted orange juice (ml) for the 600 micron sieve as filter medium and 40 kPa load.]
[Fig. 2.3. Deviation (%) of predicted from actual amount of whey with extraction time for the 600 micron sieve as filter medium and 40 kPa load.]
[Fig. 2.4. Deviation (%) of predicted from actual amount of juice with extraction time for the 600 micron sieve as filter medium and 40 kPa load.]

References
Agnew, R. P. (1960). Linear equations of first-order. In: Differential Equations. McGraw-Hill Book Company Inc., New York, 65.
Gibson, J.E. (1965). Linear Elastic Theory of Thin Shells. Pergamon Press, New York.
Jha, S. N. and Prasad, S. (1993a). Thermal and physical properties of gorgon nut. Journal of Food Process Engineering, 16(3), 237-245.
Jha, S. N. and Prasad, S. (1993b). Moisture diffusivity and thermal expansion of gorgon nut. Journal of Food Science and Technology, 30(3), 163-165.
Jha, S. N. and Prasad, S. (1996). Determination of processing conditions of gorgon nut (Euryale ferox). Journal of Agricultural Engineering Research, 63, 103-112.
Jumikis, A.R. (1965). Soil Mechanics. Affiliated East-West Press Pvt. Ltd., New Delhi.
Mudgal, V. D. and Agrawal, S. P. (1995). Whey drainage and matting studies of paneer (Part I: design, development and application of test-cell). Indian Journal of Dairy Science, 48(12), 671-675.
Mudgal, V.D. and Agrawala, S.P. (1995a). Whey drainage and matting studies of paneer (Part II: influence of filter medium, duration of pressing and applied pressure). Indian Journal of Dairy Science, 48(12), 676-680.
Myers, G.E. (1971). Analytical Methods in Conduction Heat Transfer. McGraw-Hill, New York.
Shvets, L., Tolubonsky, V., Kivakovsky, N., Neduzhy, I. and Sheludko, I. (1987). Basic concepts of thermodynamics and definitions. In: Heat Engineering, Mir Publishers, Moscow, 11-36.
Terzaghi, K. (1983). Theoretical Soil Mechanics. John Wiley, New York.

NIR Modeling for Quality Evaluation of Dairy and Food Products
S. N. Jha,
Senior Scientist,
Central Institute of Postharvest Engineering & Technology,
Ludhiana, 141 004

1.0 Introduction
Quality-conscious consumers nowadays want to be assured about various quality attributes of food
items before they purchase them. Fruits, vegetables and milk are increasingly popular in the daily diets of
both developed and developing countries. Product quality and its measurement techniques are therefore
extremely important. Decisions concerning constituents, level of freshness, ripeness
and many other quality parameters are mostly based either on chemical analysis or on subjective visual
inspection of external appearance. The former methods are destructive in nature, while the latter often
become inaccurate. Several nondestructive techniques, such as nuclear magnetic resonance,
X-ray and computed tomography, and NIR and visible spectroscopy, have been developed for quality evaluation in terms of
levels of adulterants, color, gloss, flavor, firmness, texture, taste and freedom from external as well as
internal defects. Internal quality factors of fruits, such as maturity, sugar content,
acidity, oil content and internal defects, and the constituents of food materials, however, are difficult to evaluate.
Methods are needed to predict these parameters nondestructively, with better accuracy, at faster
speed and with less cost and less trained manpower. NIR spectroscopy
has all these desirable characteristics once a method is developed for a particular parameter of a particular food
material. Developing an NIR method, a kind of modeling of the transmittance and/or reflectance data of
spectra acquired over a range of wavelengths at distinct intervals, for a particular parameter or a group of
parameters requires considerable effort. Once the model is developed, anyone can predict the parameters
anywhere in a fraction of a minute.

The use of near-infrared spectroscopy as a rapid and often nondestructive technique for measuring the
composition of biological materials has been demonstrated for many commodities. The method is no
longer new; it started in Japan in the early 1970s, just after some reports from America. An official
NIR method to determine the protein content of wheat is even available. The National Food Research Institute
(NFRI), Tsukuba, has since become a leading institute in NIR research in Japan and has played a pivotal
role in expanding near-infrared spectroscopy technology all over the country. In Japan, NIR as a
nondestructive method of quality evaluation was first applied to the determination of sugar content in intact
peaches, Satsuma orange and similar soluble-solids measurements. Recently the Central Institute of Postharvest
Engineering and Technology (CIPHET), Ludhiana, has taken the lead in India and is working on this
method. For faster development, however, many other institutes and private and public sector companies have to
come together to develop this technique for wide applicability in our country. This lecture therefore
devotes considerable time to giving participants sufficient exposure to the development of NIR
models for nondestructive prediction of the quality of food materials, including dairy products.

2.0 Advantages of NIR Spectroscopy


1. No Sample Preparation
Since the bands in the NIR region are mainly overtones and combinations, they are less intense
than the fundamental vibration bands in the mid-IR region. Because of this, samples can be
measured directly without any dilution.
2. No Waste
Spectroscopic methods are ideal since the sample is measured directly and is retained.
Thus, there is no tedious sample preparation involved and there are no waste materials
such as toxic solvents.
3. Fast Measurements
NIR Spectrometers can analyze the sample and calculate result in seconds, thereby
providing instant answers and increasing sample throughput.
4. Glass Containers- No Problem
Glass is transparent in the NIR thus samples can be measured in their containers, or
liquids can be easily analyzed in inexpensive disposable glass vials.
5. Water – No Problem: Water absorbs less strongly in the NIR than in the mid-IR region. Thus, aqueous
solutions can be measured directly, provided the sample temperature is carefully controlled.
6. Fibre Optic Sampling: High quality quartz fibre optics can be used to transmit NIR light
over a long distance without significant loss of light intensity. These sampling
accessories are very robust and are ideally suited for use in factory environment. Using
fibre optic probes, materials can be analyzed remotely in large containers and reaction
vessels.
7. Easy and Accurate Analysis: Since NIR methods require no sample preparation, the
amount of sampling error is significantly reduced thereby improving the accuracy and
reproducibility of the measurement. Furthermore since the sample is not destroyed during
the analysis the measurement can also be repeated. Once an instrument is calibrated the
day-to-day analysis is a simple task and does not require the user to learn any elaborate
procedures.
8. Analysis Costs: In comparison with wet chemical analysis, NIR shows an inverse relationship
between the number of samples and the cost per analysis. The major costs for NIR are
incurred during the initial implementation of the method; thus, as the number of
samples increases, the cost per analysis decreases.

3.0 Transmittance/absorbance/reflectance spectra


Spectra are curves drawn as a continuous function of relative reflectance or absorbance data with
respect to wavelength. When light energy in a range of wavelengths falls on an object, portions of it
are reflected, transmitted and absorbed, as shown in Fig. 1.

[Fig. 1. Schematic representation of the interaction of light with matter: θ1 = angle of incidence, θR = angle of reflectance, θT = angle of transmittance, n1, n2 = refractive indices of medium 1 and 2, respectively.]

The transmitted/reflected light is measured and plotted against wavelength (Fig. 2). The nature of the spectra
depends mainly on the range of wavelengths and the chemical composition of the food sample.
Table 1. Divisions of the infrared region

Region                  Characteristic transitions    Wavelength range (nm)    Wavenumber (cm-1)
Near infrared (NIR)     Overtones, combinations       700 – 2500               14300 – 4000
Middle infrared (IR)    Fundamental vibrations        2500 – 5 × 10^4          4000 – 200
Far infrared            Rotations                     5 × 10^4 – 10^6          200 – 10

The infrared region is divided into different wavelength ranges (Table 1). When light energy, a type of
electromagnetic wave, in certain ranges of wavelengths interacts with matter, vibrations are induced
in its molecules. Depending upon the quantum and nature of the vibrations, we obtain overtone bands in the spectral
curve, and these bands are directly associated with the nature of the chemical bonding of the constituents of food
materials. Individual constituents can thus be identified and their amounts estimated. Various
overtone bands at different wavelengths have been identified, and through data analysis and modeling we
estimate the amount or percentage of certain constituents of a given food material.

4.0 Data analysis and NIR modeling


NIR data analysis is a process like churning cream to get a better and larger amount of butter or
ghee from the same quantity of milk. There is no limit to the number of independent and dependent
variables; it is an ocean into which you have to dive to extract the useful information you need, to your
level of satisfaction. It is a multivariate analysis: a large number of variables are considered and their
effect on selected attributes is examined. To simplify the model, the independent variables are reduced
to the bare minimum possible number by following certain rules and techniques, without sacrificing the
accuracy of prediction of the attributes of interest. Though various types of analysis methods are
available in statistical texts, only those that are important and directly usable in NIR spectral
modeling/analysis are described briefly hereunder.

[Fig. 2. Absorbance spectra (700–1150 nm) of procured unadulterated milk.]

(a) Partial Least Squares Regression (PLS),


(b) Principal Component Regression (PCR), and
(c) Multiple Linear Regressions (MLR).

(a) Partial Least Square Regression

It is also known as Projection to Latent Structures (PLS): a method for relating the variations in
one or several response variables (Y-variables) to the variations of several predictors (X-variables), with
explanatory or predictive purposes. This method performs particularly well when the various X-
variables express common information, i.e. when there is a large amount of correlation, or even
collinearity. PLS is a method of bilinear modeling where information in the original X-data is projected
onto a small number of underlying ("latent") variables called PLS components. The Y-data are actively
used in estimating the "latent" variables to ensure that the first components are those that are most
relevant for predicting the Y-variables. Interpretation of the relationship between X-data and Y-data is
then simplified, as this relationship is concentrated on the smallest possible number of components.

By plotting the first PLS components one can view the main associations between X-variables and
Y-variables, and also the interrelationships within the X-data and within the Y-data. There are two versions of the
PLS algorithm: PLS1 deals with only one response variable at a time, while PLS2 handles several
responses simultaneously. The procedure of PLS can be depicted by the following figure.
[Schematic of the PLS procedure: the X-data and Y-data are projected onto corresponding latent scores t and u.]

Bilinear modeling: Bilinear modeling (BLM) is one of several possible approaches for data
compression. These methods are designed for situations where co-linearity exists among the original
variables. Common information in the original variables is used to build new variables that reflect the
underlying (“latent”) structure. These variables are therefore called latent variables. The latent variables
are estimated as linear functions of both the original variables and the observations, hence the name
bilinear. PCA, PCR and PLS are bilinear methods.

For the observations, the model decomposes the data as: Data = Structure + Error.

In these methods, each sample can be considered as a point in a multi-dimensional space. The
model will be built as a series of components onto which the samples - and the variables - can be
projected. Sample projections are called scores and variable projections are called loadings. The model
approximation of the data is equivalent to the orthogonal projection of the samples onto the model. The
residual variance of each sample is the squared distance to its projection. PLS models both the X- and Y-
matrices simultaneously to find the latent variables in X that will best predict the latent variables in Y.
These PLS components are similar to principal components and will also be referred to as PCs.
5.0 Principles of Projection
Bearing that in mind, the principle of PCA is to find the directions in space along which the
distance between data points is the largest. This can be translated as finding the linear combinations
of the initial variables that contribute most to making the samples different from each other.
These directions, or combinations, are called Principal Components (PCs). They are
computed iteratively, in such a way that the first PC is the one that carries the most information (or, in
statistical terms, the most explained variance). The second PC will then carry the maximum share of the
residual information (i.e. not taken into account by the previous PC), and so on. Fig. 3 depicts PCs
1 and 2 in a multidimensional space. This process can go on until as many PCs have been computed
as there are variables in the data table. At that point, all the variation between samples has been
accounted for, and the PCs form a new set of coordinate axes, which has two advantages over the
original set of axes (the original variables). First, the PCs are orthogonal to each other. Second, they
are ranked so that each one carries more information than any of the following ones. Thus, you can
prioritize their interpretation: start with the first ones, since you know they carry more information.
[Fig. 3. Description of principal components: PC1 and PC2 in the space of Variables 1–3.]

The way it was generated ensures that this new set of coordinate axes is the most suitable
basis for a graphical representation of the data that allows easy interpretation of the data structure.
(b) Principal Component Regression (PCR)
PCR is a method suited to the same situations as PLS. It is a two-step method: first, a principal
component analysis is carried out on the X-variables; the principal components are then used as
predictors in an MLR model. Fig. 4 depicts the PCR procedure.

(c) Multiple Linear Regressions (MLR)

It is a method for relating the variations in a response variable (Y-variable) to the variations
of several predictors (X-variables), with explanatory or predictive purposes. An important assumption for the
method is that the X-variables are linearly independent, i.e. no linear relationship exists between them.
When the X-variables carry common information, problems can arise due to exact or
approximate collinearity.

In MLR, all the X-variables are supposed to participate in the model independently of each
other. Their co-variations are not taken into account, so X-variance is not meaningful there. Thus the
only relevant measure of how well the model performs is provided by the Y-variances.

[Fig. 4. Description of the PCR procedure: a PCA is first run on X to obtain components PCi = f(Xi), which are then used as predictors in an MLR step, Y = f(PCj).]

6.0 Selection of Regression Methods

Selection of the regression method is of paramount importance in NIR modeling for
nondestructive quality evaluation of foods. One should know which type of
analysis suits one's data for better prediction. Knowledge of the characteristics of the regression methods
is essential for judging their suitability and saving time; otherwise, one has to analyze the data
by all the techniques and compare their results before selecting one.

MLR Vs. PCR Vs. PLS


MLR has the following properties and behavior:
The number of X-variables must be smaller than the number of samples;
In case of collinearity among the X-variables, the b-coefficients are not reliable and may be
unstable;
MLR tends to overfit when noisy data are used.
PCR and PLS are projection methods, like PCA.

Model components are extracted in such a way that the first PC conveys the largest amount
of information, followed by the second PC, etc. At a certain point, the variation modeled by any new
PC is mostly noise. The optimal number of PCs - modeling useful information but avoiding
overfitting - is determined with the help of the residual variances. If the differences between the standard error of
calibration (SEC) and the standard error of prediction (SEP), and between the biases of the calibration and
prediction sample sets, are minimal, as shown in Table 2, one may assume that the model is stable. SEP is
the variation in the precision of predictions over several samples. It is computed as the standard
deviation of the residuals, where the standard deviation is the square root of the mean square
of deviations from the mean.

Table 2. NIR calibration and validation statistics for detection of adulterants in milk samples

Component   Elements      Calibration                    Validation
                          R*      SEC     Bias           R       SEP     Bias
Milk        58            0.89    4.33    -0.00          0.89    4.32    -0.34
Urea        58            0.98    0.76    -0.00          0.98    0.78     0.29
NaOH        58            0.95    0.69    -0.00          0.86    0.88     0.06
Oil         58            0.89    1.99    -0.00          0.74    2.53    -0.55
Shampoo     58            0.69    4.24    -0.00          0.58    3.83    -0.33

*R – multiple correlation coefficient, SEC – standard error of calibration, SEP – standard error of prediction.

Bias is the systematic difference between predicted and measured values. It is computed as the
average value of the residuals.
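
Following the definitions just given (bias = average residual; SEP = standard deviation of the residuals, with SEC computed analogously on the calibration set), these statistics can be obtained from reference and predicted values as in the minimal sketch below; the numbers used are hypothetical.

```python
import numpy as np

def bias_and_sep(y_measured, y_predicted):
    """Bias = mean residual; SEP = standard deviation of the residuals (as defined in the text)."""
    residuals = np.asarray(y_predicted, dtype=float) - np.asarray(y_measured, dtype=float)
    return residuals.mean(), residuals.std()

# Hypothetical reference vs. NIR-predicted values for a validation set
measured = [3.1, 4.0, 5.2, 6.1, 7.3]
predicted = [3.0, 4.2, 5.1, 6.4, 7.1]
b, sep = bias_and_sep(measured, predicted)
print(f"bias = {b:.3f}, SEP = {sep:.3f}")
```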

PCR uses MLR in the regression step; a PCR model using all PCs gives the same solution as
MLR (and so does a PLS1 model using all PCs). If one runs MLR, PCR and PLS1 on the same data,
their performance can be compared by checking the validation errors (RMSEP, from predicted vs.
measured Y-values for the validation samples). Note also that both MLR and PCR model only one Y-
variable at a time.

The difference between PCR and PLS lies in the algorithm. PLS uses the information lying in
both X and Y to fit the model, switching between X and Y iteratively to find the relevant PCs. So
PLS often needs fewer PCs to reach the optimal solution because the focus is on the prediction of the
Y-variables (not on achieving the best projection of X as in PCA).
If there is more than one Y-variable, PLS2 is usually the best method if you wish to interpret
all variables simultaneously. It is often argued that PLS1 or PCR has better prediction ability; this is
usually true if there are strong non-linearities in the data. On the other hand, if the Y-variables are
somewhat noisy but strongly correlated, PLS2 is the best way to model the whole information and
leave the noise aside. The difference between PLS1 and PCR is usually quite small, but PLS1 will
usually give results comparable to PCR results using fewer components.

MLR should only be used if the number of X-variables is low and there are only small
correlations among them. Formal tests of significance for the regression coefficients are well known
and accepted for MLR. If one chooses PCR or PLS, the stability of the results and the
significance of the regression coefficients may be checked with Martens' uncertainty test.
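
To make the comparison concrete, the sketch below fits MLR, PCR (PCA followed by MLR) and PLS1 to the same simulated, collinear "spectra" and reports SEP and bias on a validation set; it assumes scikit-learn and NumPy are available and uses synthetic data rather than real NIR measurements.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_samples, n_wavelengths = 60, 100
X = rng.normal(size=(n_samples, n_wavelengths)).cumsum(axis=1)      # smooth, collinear "spectra"
y = X[:, 20] - 0.5 * X[:, 60] + rng.normal(scale=0.1, size=n_samples)  # hidden "constituent"

Xc, Xv, yc, yv = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "MLR (subset of wavelengths)": LinearRegression(),
    "PCR (PCA + MLR, 5 PCs)": make_pipeline(PCA(n_components=5), LinearRegression()),
    "PLS1 (5 components)": PLSRegression(n_components=5),
}
for name, model in models.items():
    if name.startswith("MLR"):
        # MLR needs fewer X-variables than samples, so use every 10th wavelength only
        model.fit(Xc[:, ::10], yc)
        resid = yv - model.predict(Xv[:, ::10])
    else:
        model.fit(Xc, yc)
        resid = yv - np.ravel(model.predict(Xv))
    print(f"{name}: SEP = {resid.std(ddof=1):.3f}, bias = {resid.mean():.3f}")
```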
7.0 Data Pre-processing
What is Pre-processing?

Introducing changes in the values of your variables, e.g. so as to make them better suited for an
analysis, is called pre-processing. One may also talk about applying a pre-treatment or a
transformation. Benefits of pre-processing are:
1. Improves the distribution of a skewed variable by taking its logarithm.
2. Removes some noise in your spectra by smoothing the curves.
3. Improves the precision in your sensory assessments by taking the average of the sensory ratings
over all panelists.
4. It may improve overall precision in prediction.
A wide range of transformations can be applied to data before they are analyzed. The main
purpose of transformations is to make the distribution of the given variables more suitable for a
powerful analysis. The various types of transformations available are listed below (a brief illustration of two of them follows the list):
Computation of various functions
Smoothing
Normalization
Spectroscopic transformations
Multiplicative scatter correction
Adding noise
Derivatives
Transposition
Shifting variables
User-defined transformation
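
As indicated above, here is a brief illustration of two of these pre-treatments, smoothing and a normalization (standard normal variate, SNV), using generic NumPy/SciPy routines; it is a sketch on synthetic spectra, not tied to any particular instrument software.

```python
import numpy as np
from scipy.signal import savgol_filter

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    return (spectra - spectra.mean(axis=1, keepdims=True)) / spectra.std(axis=1, keepdims=True)

def smooth(spectra, window=11, poly=2, deriv=0):
    """Savitzky-Golay smoothing (set deriv=1 or 2 for smoothed derivative spectra)."""
    return savgol_filter(spectra, window_length=window, polyorder=poly, deriv=deriv, axis=1)

# Example on three noisy synthetic absorbance spectra of 200 points each
raw = np.random.default_rng(1).normal(0.02, 0.01, size=(3, 200)).cumsum(axis=1) + 1.2
pretreated = snv(smooth(raw))
print(pretreated.shape)
```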
Details of these techniques are available in the suggested readings. After all these efforts, we finally
obtain a model (Eqn 1) to predict the desired constituents of any dairy or food product, of the form:

    yi = ai + Σ (j = 1 to m) bij · xij(λj),    i = 1, …, n            …(1)

where yi is the ith component of the food sample to be predicted, ai is the regression constant for the ith
component, and bij are the coefficients of the absorbance/reflectance/transmittance values xij of NIR radiation
for the ith component at the jth wavelength λj.
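
Applying a calibration of the form of Eqn (1) to a new spectrum then reduces to a constant plus a weighted sum of the spectral readings at the selected wavelengths, as in the small sketch below; the coefficients shown are arbitrary placeholders that would normally come from PLS/PCR/MLR calibration.

```python
def predict_constituent(spectrum, a, b):
    """y = a + sum_j b_j * x_j over the selected wavelengths (the form of Eqn 1)."""
    return a + sum(bj * xj for bj, xj in zip(b, spectrum))

absorbances = [1.62, 1.71, 1.85, 2.02]      # x_j at the chosen wavelengths (placeholders)
coefficients = [0.9, -1.4, 2.1, -0.3]       # b_j from a hypothetical calibration
print(predict_constituent(absorbances, a=0.25, b=coefficients))
```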
Examples of the prediction of total soluble solids content and maturity of various intact fruits
and of the taste of juices are available, and such models are being used commercially in developed countries.
For details, the following suggested readings may be consulted.
References
Dull, G.G. and Birth, G.S. (1989). Nondestructive evaluation of fruit quality: Use of near infrared
    spectrophotometry to measure soluble solids in intact honeydew melons. HortScience, 24, 754.
Dull, G.G., Birth, G.S., Smittle, D.A. and Leffler, R.G. (1989). Near infrared analysis of soluble solids in
    intact cantaloupe. J. Food Sci., 54, 393-395.
Iwamoto, M., Kawano, S. and Yukihiro, O. (1995). An overview of research and development of near infrared
    spectroscopy in Japan. J. Near Infrared Spectrosc., 3, 179-189.
Jha, S. N. and Matsuoka, T. (2004). Detection of adulterants in milk using near infrared spectroscopy. Journal
    of Food Science and Technology, 41(3), 313-316.
Jha, S. N. and Matsuoka, T. (2004). Nondestructive determination of acid brix ratio (ABR) of tomato juice
    using near infrared (NIR) spectroscopy. International Journal of Food Science and Technology, 39(4),
    425-430.
Jha, S. N., Chopra, S. and Kingsly, A.R.P. (2005). Determination of sweetness of intact mango using spectral
    analysis. Biosystems Engineering, 91(2), 157-161.
Jha, S.N. and Matsuoka, T. (2000). Review: Nondestructive techniques for quality evaluation of intact fruits and
    vegetables. Food Science and Technology Research, 6(4), 248-251.
Kawano, S. (1994). Nondestructive near infrared quality evaluation of fruits and vegetables in Japan. NIR
    News, 5, 10-12.
Kawano, S. (1998). New application of nondestructive methods for quality evaluation of fruits and vegetables in
    Japan. J. Jpn. Soc. Hort. Sci., 67, 1176-1179.
Kawano, S., Fujiwara, T. and Iwamoto, M. (1993). Nondestructive determination of sugar content in satsuma
    mandarin using NIR transmittance. J. Jpn. Soc. Hort. Sci., 62, 465-470.
Kawano, S., Watanabe, H. and Iwamoto, M. (1992). Determination of sugar content in intact peaches by near
    infrared spectroscopy with fibre optics in interactance mode. J. Jpn. Soc. Hort. Sci., 61, 445-451.
Kim, S.M., Chen, P., McCarthy, M.J. and Zion, B. (1999). Fruit internal quality evaluation using on-line
    nuclear magnetic resonance sensors. J. Agric. Eng. Res., 74, 293-301.
Miquel, M.E., Evans, S.D. and Hall, L.D. (1998). Three dimensional imaging of chocolate confectionery by
    magnetic resonance methods. Food Sci. Technol., 31, 339-343.
Osborne, B.G., Fearn, T. and Hindle, P.H. (1993). Practical NIR Spectroscopy with Applications in Food and
    Beverage Analysis. Longman Scientific & Technical, Singapore.
Osborne, S.D. and Kunnemeyer, R. (1999). A low cost system for the grading of kiwifruit. J. Near Infrared
    Spectrosc., 7, 9-15.
Slaughter, D.C. (1995). Nondestructive determination of internal quality in peaches and nectarines. Trans.
    ASAE, 38, 617-623.
Tsenkova, R., Atanassova, S., Ozaki, Y., Toyoda, K. and Itoh, K. (2001). Near-infrared spectroscopy for
    biomonitoring: influence of somatic cell count on cow's milk composition analysis. International Dairy
    Journal, 11, 779-783.
COMPUTERIZED PROCESS CONTROL IN DAIRY
AND FOOD INDUSTRY

K. Narsaiah
Dairy Engineering Division
NDRI, Karnal-132001

1.0 Introduction
Dairy and food processing plants form a very important link in the food supply chain,
i.e. from farm to fork. They convert farm produce into products with desired
attributes using unit operations such as drying, evaporation, cooking, etc. Process control
is used to run these operations economically so as to give safe products consistently. A brief
overview of process control is presented below, followed by some applications of
automatic process control in the dairy and food industry.

Simply stated, the term control means methods to force parameters in the
environment to have specific values. This can be as simple as making the room
temperature stay at 21°C or as complex as guiding a spacecraft to Mars. All the
elements necessary to accomplish the control objective are described by the term control
system. The technology of artificial control was first developed with a human as an integral
part of the control action. With the use of machines, electronics and computers, the
human function has been replaced and the term automatic control came into use. The rapid
developments in digital electronics and associated computer technology have brought
about the widespread introduction of digital techniques in the field of process control. For
these reasons the degree of automation is increasing in the dairy industry, as in
other process industries. Consequently, process control systems in dairy and food
processing plants are becoming increasingly complex. A qualitative knowledge of
process control is no longer sufficient for the technical personnel of the dairy and food industry
who have to handle modern process control systems. A brief review of the basics of automatic
process control is given below.

2.0 Process Control Block Diagram


A block diagram is a working description of process control that is applicable to all
control situations and independent of any particular application; that is, it uses
general terms.

2.1 Process
In general, a process can consist of a complex assembly of phenomena that relate to some
manufacturing sequence. There are single-variable processes, in which only one variable
is to be controlled, as well as multivariable processes in which many variables, perhaps
interrelated, may require regulation.
2.2 Measurement
A sensor measures a variable and converts it into analogous electrical or pneumatic
information. Further transformation or signal conditioning may be required to complete
the measurement function.

2.3 Error Detector


It compares the measured variable with the set point and passes the error, or deviation, to the
controller. Often, it is a part of the control device.

2.4 Controller
It evaluates the error and determines the action to be taken, if needed. The evaluation
may be performed by an operator, by electronic signal processing (transistors, circuits) or
by a computer. Because of their inherent capacity to handle multivariable systems and their
decision-making ability, computers are widely used in process control. Besides the measured
value of the controlled variable, the controller requires an input of the desired value (the set
point), expressed in the same terms as the measured value.

2.5 Control Element


The final control element exerts a direct influence on the process. It accepts input from the
controller, which is then transformed into some proportional operation performed on the
process.
[Figure 1. Process control block diagram: the set point r and the measured value b enter the summing point, which passes the error e = r − b to the controller; the controller output p drives the final control element, whose output u acts on the process; the measurement block feeds the controlled variable back as b.]

The block diagram in Figure 1 is constructed from the elements described above. The
controlled variable in the process is denoted by ‘c’ in the diagram, and the measured
representation of the controlled variable is labeled ‘b’. The controlled variable set
point is labeled ‘r’ for reference. The error detector is a subtracting summing point that
outputs an error signal, e = r − b, to the controller for evaluation and action. The output
signal of the controller, ‘p’, activates the final control element to drive the process in the
desired direction.
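
As a rough illustration of how the loop in Figure 1 behaves, the sketch below simulates a single-variable process with a simple proportional control law; the first-order process model, the gain kp and the 21°C set point are illustrative assumptions, not part of the text.

```python
def simulate(setpoint=21.0, kp=0.8, steps=50, dt=1.0):
    """Simulate the loop of Figure 1 for a first-order process under proportional control."""
    c = 15.0                                 # controlled variable (e.g., room temperature, deg C)
    for _ in range(steps):
        b = c                                # measurement of the controlled variable
        e = setpoint - b                     # error detector: e = r - b
        p = kp * e                           # controller output
        u = p                                # final control element acting on the process
        c += dt * (u - 0.1 * (c - 15.0))     # process response with heat losses to 15 deg C ambient
    return c

# Settles near the 21 deg C set point (a small proportional offset remains)
print(round(simulate(), 2))
```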

3.0 Digital Electronics in Process Control


The evolution of digital electronics in process control is outlined below.

1) Direct use: There are numerous areas where digital circuits are used directly, such as
alarms and multivariable control.

2) Data logging: Computers were first used for data logging only, i.e., the collection and
storage of vast amounts of measurement data in a complex process and display of the data
for review by process engineers to determine the process condition. Gradually, computers
performed certain kinds of reduction of these data using control equations and even
indicated the corrective action, if any, that should be taken to regulate the process.

3) Supervisory computer control: The computer itself performs adjustments of loop set
points and provides a record of process parameters. The loops are still analog, but the set
points that determine the overall performance are set by the computer on the basis of
equations solved using measured values of process parameters as inputs.

4) Programmable Logic Controller (PLC): It is a computer/microprocessor based device that
implements the required sequence of events of a discrete-state process. Control action
consists of driving a process through a pattern of such states in time, where the
progression of states depends on measurements of the present state of the system.
Everything associated with such systems is expressed digitally, i.e., as two-state variables
that are either on or off. Before the PLC, a relay sequencer/relay logic panel was used to carry out
this control action.

5) Direct digital control (DDC): The ultimate aim of computer application in process
control has been to use the computer to perform the continuous controller functions. In such a
DDC system, the only analog elements left in the process control loop are the
measurement function and the final control element.

6) Advanced control concepts: Fuzzy set theory and neural network modeling,
separately or combined, bring still further potential extensions of computerized process
control.

4.0 Computerized Process Control in Dairy and Food Industry

Consumption of dairy and food products can be traced back to antiquity. However, the
dairy and food industry lagged behind other manufacturing industries, such as the
automobile and petrochemical industries, in introducing automation and computerized
process control. Valid reasons for this slow adoption include the variation in size, shape
and composition of food materials, the large number of recipes, the perishability of raw and
processed foods, the lack of suitable sensors, the simultaneous use of both batch and continuous
processes, and low profit margins. The strides made in digital electronics in the last two
decades are paving the way for increased automation in all industrial processes. Increasing
global competitive pressure with respect to quality, environmental concerns and cost
reduction is driving automation in the dairy and food industry as well. The
application of computers also provides enormous documentation capability for inventory
control. Though the food industry is vast, we cover a few applications in fruits and
vegetables and focus more on the dairy industry. Meat processing is skipped here,
though automation of poultry and carcass processing lines is increasing.

The fruit juice industry is rapidly adopting process control technologies to fully automate
all aspects of plant operations. The intent is to improve plant efficiency, reduce costs and
provide the tools for better operations management. The most modern plant installations
in recent years have a modular control package fully integrated into a management
information system with an advanced network. Management has the capability to review
any operational data and identify areas of concern very quickly. Corrective measures can
be implemented and the processes realigned.

Automation can start at the plant's gate, beginning with documentation of truck weight
and fruit identification at the scale house, and integrated with maturity/quality data
information (State Test) for the corresponding load of fruit. Next, it continues through
fruit handling, bin blending and juice room control. The system operates based on juice
demand from downstream processes and makes adjustments to fruit flow through the line
to deliver juice as required. Evaporator automation assures the juice is within a set range
of concentration according to pre-determined values. Finisher operation, by-product
recovery systems, and clean-up operations can be automated to assure consistent results.

Production of premium quality fruit juices requires tighter sanitation control relative to
concentrate production since heat treatments are less severe to minimize changes to
flavor. The application of dairy standards for the equipment line and operations is not
unusual today. Stainless steel fruit handling/conveying equipment such as fruit storage
bins, bucket elevators, conveyor belt side rails, and other equipment are more common.
Washed fruit may be sanitized before leaving the washer. Furthermore, conveyor belts
may be sprayed with a sanitizing solution to minimize surface mold growth. Automatic
clean up systems are instrumental in consistently keeping extraction lines clean and
sanitized. Frequency of cleaning depends on finished product and could vary based on
fruit condition. Consistency of finished product is an important parameter in tomato
processing. This property can be measured on line by using optical sensor to know total
solids. This can be used in feed back control of tomato processing.

The dairy industry is uniquely positioned for easy adoption of computerized process
control because it requires extensive record keeping, its finished products are generally
homogeneous with relatively few ingredients, and its fluid operations are often of long
duration, sequential and adaptable to software systems developed for continuous
processes. If we trace the progress of mechanization and automation in the dairy industry
all over the world, until 1948 there was only a slight increase in the fluid milk industry,
but thereafter the increase in the quality and healthfulness of processed milk products was
tremendous. The high capacity of modern continuous pasteurization was made possible
by the development of the hygienic automatic diversion valve, used in conjunction with
very reliable temperature sensing and equally reliable logic control (initially with relays,
then with programmable logic controllers).

Standardization of milk is the primary operation after receipt of milk in a dairy plant. If
the fat content of the incoming milk is known and it is supplied at a constant rate, it is
sufficient to measure and control the flow of cream from the separator to obtain milk of
the desired fat content. A single-loop controller on a spray dryer controls the heat input by
measuring the outlet temperature, with manual regulation of the feed rate. The standard
control configuration of a spray dryer includes two loops: in one loop, the heat input is
controlled by measuring the inlet temperature of the hot air; in the other, the feed rate is
controlled by measuring the outlet temperature.

Another example is the process of making cheese. Cheeses are made by enzymatic curd
coagulation followed by inoculation with special bacteria. The inoculated curd is pressed,
further salted, and held for a curing period. Controlled stirring of the curd during
coagulation gives control of consistency, which is a function of entrapped air. From the
preparation vats, curd flows at a controlled rate on to a whey removal conveyor. The
amount of drainage is controlled by varying the speed of the conveyor.

There is great scope for automation and computer application in the traditional dairy products
of India. Rasogolla is a popular Indian sweetmeat. Owing to its increasing demand, a
large-scale industrial production system is being developed at the National
Dairy Research Institute, Karnal, Haryana. In large-scale production, a digital image
processing system with machine vision can be used as a rapid nondestructive quality evaluation
tool for quality assurance of rasogolla (digital image processing is covered elsewhere in this
short course). In the continuous rasogolla cooker of this system, the sugar concentration needs
to be maintained at a constant level, as some water evaporates and some sugar leaves the
cooker entrapped in the rasogolla. An on-line refractometer can be used to measure the sugar
concentration, and a feedback control loop can be used to replenish water and sugar in the
cooker automatically. In addition, a level sensor can be used to maintain the level of sugar
syrup in the cooker.
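
The feedback idea described above can be sketched as follows; read_brix() is a hypothetical stand-in for the on-line refractometer signal, and the target value and deadband are illustrative assumptions, not plant settings.

```python
TARGET_BRIX = 60.0      # assumed target sugar concentration of the syrup, degrees Brix
DEADBAND = 0.5          # degrees Brix; avoids dosing on measurement noise

def read_brix():
    """Hypothetical stand-in for the on-line refractometer reading."""
    return 61.2

def control_step():
    """One pass of the feedback loop: compare measurement with target and choose an action."""
    error = TARGET_BRIX - read_brix()
    if error > DEADBAND:
        return "dose concentrated sugar syrup"   # syrup too dilute
    if error < -DEADBAND:
        return "dose water"                      # syrup too concentrated (evaporation)
    return "hold"

print(control_step())   # -> "dose water" for the example reading of 61.2 deg Brix
```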

DETERMINATION AND PREDICTION OF SHELF LIFE OF
MOISTURE SENSITIVE FOOD PRODUCT BASED ON
PRODUCT-PACKAGE-ENVIRONMENT INTERACTION

Prof. I.K.Sawhney
Dairy Engineering Division
NDRI, Karnal-132001

1.0 Introduction
Shelf life of a food product depends on a multiplicity of variables and their
changes, including the product, the environmental conditions and the packaging.
Different authors have defined shelf life differently, but the essence of the shelf-life concept is
the same: it is the time period over which the packaged food retains its quality level with regard to
safety, physical/chemical properties, morphological characteristics and organoleptic
acceptability. During this period the packaged product passes through a dynamic
deteriorative process, which is affected by the type of product, the packaging material, the
handling processes and the environmental conditions.
Obviously, the first step in studying the shelf life of a food product is to define the
quality parameter. This is not an easy task. Many physical, chemical and biochemical
processes occur simultaneously and are affected by the product ingredients. The
ingredients vary widely from food to food, but some of them can be correlated with the
quality parameters, which are then used as indicators of product stability.
The incorporation of all the known relevant parameters permits the food scientist to
accurately and precisely predict the effect of packaging material and environmental
conditions on the shelf life of the product contents.

2.0 Shelf Life Concerns of Moisture Sensitive Food Product


The shelf life of a product is affected by many factors, including the product itself,
package design and properties, storage conditions, etc. Knowing those factors makes it
possible to predict the shelf life of the product. It has long been recognized that water
activity correlates well with many degenerative reactions in food products,
such as oxidation, enzymatic hydrolysis, the Maillard reaction, vitamin loss, etc. This makes
water activity a useful indicator of product stability and microbial activity.
Information derived from the moisture sorption isotherm is useful for predicting the chemical
and physical stability of a food as a function of the moisture content of the product. A
relationship is to be worked out between the moisture content and the target quality
parameter for the moisture-sensitive food. The target quality parameter could be the
crispness of the moisture-sensitive food or the vitamin loss in a fruit product. The shelf life
can then be studied by monitoring the change in moisture content of the food product.

3.0 Predicting the Shelf Life of Moisture Sensitive Food
There is no universal simulation model for studying the shelf life of all food products;
each product has its own unique characteristics in terms of both physical and chemical
properties and stability. To establish a shelf-life prediction equation for a given food
product, the unique product properties must be studied and correlated with the
varying environmental conditions and the package characteristics. Simulating real-life
situations is complicated, so some assumptions are always made to make the prediction of
shelf life easier and workable.
The process of predicting the shelf life of a moisture-sensitive food product can then be
delineated as below:

 Study the properties of moisture-sensitive product, such as initial moisture


content, critical moisture content and moisture sorption isotherm.
 Decide the critical deteriorative indices and establish their dependence on the
environmental conditions.
 Characterize the mass transfer properties of packaging material and their
dependence on environmental conditions.
 Develop the predictive equation describing the storage condition and time-
dependent change in the predominant quality attribute of the food product.
 Integrate the predictive equation for package properties with the predictive
equation for product and establish the product-package-environment interaction
equations.
 Measure the in-package shelf life of the food product experimentally.
 Verify the validity and suitability of the developed predictive model.

4.0 Product Characteristics

4.1 Initial Moisture Content: The first step in studying the shelf life of moisture
sensitive product is to determine its initial moisture content. It is usually
expressed on dry basis.

4.2 Moisture Sorption Isotherm: The moisture sorption isotherm describes the relation
      between the moisture content of the product and the water vapour partial pressure
      of the surrounding environment, assuming that equilibrium of moisture exchange
      between the product and the environment has been reached. The isotherm
      curve depends on the product as well as on the temperature: different products
      at the same temperature, or the same product at different temperatures, will give
      different isotherm data. The equilibrium moisture content of the product
      corresponding to different water activities can be determined using the following
      equation:

      Me = (Mi + 1) (We / Wi) − 1                       -------------------------(1)

Where:
Me = equilibrium moisture content of the product (dry basis)
Wi = initial weight of the product, g
Mi = initial moisture content of the product (dry basis)
We = equilibrium or final weight of the product, g

There are different mathematical models describing the moisture sorption
isotherm. The most commonly used one is the GAB (Guggenheim-Anderson-de Boer) model,
which can be rearranged as a second-order polynomial in Aw and is written in the following form:

      Me = Wm · C · k · Aw / [(1 − k·Aw)(1 − k·Aw + C·k·Aw)]          ------------------- (2)
Where:
Me = equilibrium moisture content of the product (%, dry basis)
Aw = water activity (RH%/100)
Wm = the monolayer moisture content (from BET theory)
C = Guggenheim constant
k = a factor correcting the properties of the multilayer molecules with
respect to the bulk liquid
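
For illustration, the GAB model of Eqn (2) can be evaluated directly once Wm, C and k are known (e.g., fitted from sorption data); the parameter values in the sketch below are placeholders, not measured constants.

```python
def gab(aw, wm, c, k):
    """Equilibrium moisture content (dry basis) from the GAB isotherm, Eqn (2)."""
    kaw = k * aw
    return wm * c * kaw / ((1.0 - kaw) * (1.0 - kaw + c * kaw))

# Example: equilibrium moisture content over a range of water activities
for aw in (0.2, 0.4, 0.6, 0.8):
    print(aw, round(gab(aw, wm=0.05, c=10.0, k=0.85), 4))
```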

4.3 Critical Moisture Content: The critical moisture content is a very important factor
      for moisture-sensitive food products such as cereals, crackers, milk powder,
      powdered juice, etc. Different food products have different critical moisture
      contents, and the critical moisture content can be determined in different ways. It
      could be based on sensory evaluation of perceived taste, flavor or mouthfeel, or on
      the results of market research. Any specific chemical reaction, such as browning or
      lipid oxidation, could also be the parameter defining the critical moisture content,
      which is required to be correlated with these deteriorative indices. Determination
      of the critical moisture content makes it possible to calculate the shelf life of the
      moisture-sensitive food product.

5.0 Package Characteristics


The most important package characteristic is the water vapour permeability
coefficient. It determines how fast or slowly water vapour can permeate through the
film, and consequently it affects the shelf life of the product. The most common method used
to determine the WVTR of a polymeric material is the gravimetric (desiccant pouch) method. In this method
pouches are made from the polymeric material; one is empty and sealed and is used as a
control, while the others are filled with desiccant. All the pouches are stored at known
temperature and humidity and are weighed regularly until a constant rate of weight gain is observed.
The permeability coefficient is then calculated as below:

      P(H2O) = WVTR × T / (A × Δp)                      ---------------------(3)

Where:
WVTR = water vapour transmission rate, kg/s
T = film thickness, m
A = surface area for permeation, m2
Δp = partial pressure difference (ps × ΔRH/100), Pa
ΔRH = difference in relative humidity between the inside and outside of the package
ps = saturated water vapour pressure at the specified temperature

The water vapour permeability coefficient of polymeric materials is dependent upon
temperature. This dependence is described by the Arrhenius equation as follows:

      ln(P2 / P1) = −(Ea / R) (1/T2 − 1/T1)             -------------------------(4)

Where:
P= Permeability coefficient, kg.m/m2/sec/Pa
Ea = activation energy, kJ/mol
R = gas constant, 8.314 J/mol/K
T = temperature, K
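
Eqns (3) and (4) can be combined in a few lines of code: the permeability is obtained from a gravimetric WVTR test and then corrected to another storage temperature with the Arrhenius relation. All input values in the sketch below are illustrative placeholders.

```python
import math

def permeability(wvtr, thickness, area, ps, delta_rh):
    """P = WVTR*T/(A*dp), with dp = ps*dRH/100 (Eqn 3). Units as defined in the text."""
    dp = ps * delta_rh / 100.0
    return wvtr * thickness / (area * dp)

def permeability_at(p1, t1_k, t2_k, ea_j_per_mol, r=8.314):
    """P2 from ln(P2/P1) = -(Ea/R)(1/T2 - 1/T1) (Eqn 4); Ea supplied in J/mol to match R."""
    return p1 * math.exp(-(ea_j_per_mol / r) * (1.0 / t2_k - 1.0 / t1_k))

# Placeholder test data: WVTR in kg/s, 50 micron film, 0.01 m2 pouch, 25 deg C, 75% RH gradient
p_25 = permeability(wvtr=2.0e-9, thickness=50e-6, area=0.01, ps=3169.0, delta_rh=75.0)
print(p_25, permeability_at(p_25, t1_k=298.15, t2_k=310.15, ea_j_per_mol=30e3))
```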

6.0 Determination of Shelf Life of Moisture Sensitive Food Experimentally
To determine the shelf life of a moisture-sensitive food experimentally, a sample of
about 10 to 20 grams is packed in LDPE pouches and stored under known storage
conditions. The pouch containing the product is weighed every alternate day until the
critical moisture content of the product is reached. Since this critical moisture content of
the product has already been correlated with the product deteriorative indices, as
described in 4.3 above, the period after which this critical moisture content is reached can
be taken as the shelf life of the product.

7.0 Prediction of Shelf Life of Moisture Sensitive Food Analytically
The shelf life of moisture sensitive food can be predicted analytically using the
following equation:
      dp = (P · A / T) (po − pi) dt                     -------------------------(5)

Where:
p = partial pressure of water vapour, Pa
po = partial pressure of water vapour of storage environment, Pa
pi = partial pressure of water vapour inside the package, Pa
t = storage time, sec
P = permeability coefficient of package, kg.m/m2/sec/Pa
A = area of permeation, m2
T = thickness of polymeric material, m

Taking ‘ dp = Wd dM’ the above equation can be rewritten in the following form:

      t = [T · Wd / (P · A)] ∫ (from M1 to M2) dM / [po − pi(M)]        --------------------------(6)

Where
Wd = dry weight of the product, kg
M = moisture content of the product on dry basis, %

There are several different methods to evaluate the above integral.

7.1 Linear method: In this method the moisture sorption isotherm is treated as linear:

      Me = β · Aw + c                                   ---------------------------(7)

where β is the slope and c the intercept of the linearized isotherm.

Using this in Eq. (6) above, we get:

      t = [T · Wd · β / (P · A · ps)] · ln[(Aw0 − Aw,t=0) / (Aw0 − Aw,t=t)]        ------------------------(8)

Where:
ps = saturated water vapour pressure
β = slope of the linearized moisture sorption isotherm (Eqn 7)
Aw0 = water activity (relative humidity) of the storage condition
Aw,t=0 = headspace water activity inside the package at time t = 0
Aw,t=t = headspace water activity inside the package at time t = t
7.2 Log method: In this method the isotherm curve is divided into n intervals and each
interval is treated as a linear function. This will give us one value of time for each
interval and the total shelf life will be the summation of all the time values as below:

      t = [T · Wd / (P · A · ps)] · Σ (i = 1 to n) βi · ln[(Aw0 − Aw,t=i) / (Aw0 − Aw,t=i+1)]        ------------------(9)
Where:
βi = value of the isotherm slope β for interval i
Aw0 = water activity of the storage conditions
Aw,t=i = headspace water activity inside the package at the beginning of the interval
Aw,t=i+1 = headspace water activity inside the package at the end of the interval

7.3 Integrated GAB equation: The use of the GAB equation for describing the moisture
sorption isotherm has been discussed in section 4.2 above. If the moisture sorption isotherm of the
moisture-sensitive product can be described by this equation, Eq. (6) can be integrated using it,
and the solution is obtained in the following form:

H 
t --------------------(10)
  
Where: 
 2WmC  .M f  2WmC 
ln 
H  M f  M i   .M  2W C  -------------------(11)
 
 i m 



  1 C 2kAwo   2 --------------------(12)

P.A. ps
 ---------------------(13)
2k 1  C T.Wd

8.0 Error Analysis


In studying the shelf life of a product analytically, we make several assumptions to
simplify the computation. Hence each model that is developed can simulate
the real situation only to a limited accuracy. Therefore each model is tested for the accuracy of
prediction of the shelf life of the product. There are several statistical tools available to
test the accuracy of the model employing the experimental values of the shelf life vis-à-
vis the analytical value of the shelf life predicted by the model under the given set of the
packaging and storage conditions. Any of the above models could give the best fit. Other
isotherm equations could be tried if the results obtained for the above-described methods
are not up to the mark.

9.0 Conclusions
The moisture sorption isotherm of a moisture-sensitive food product describes the exchange of
moisture between the food product and the environment in which it is stored. The
critical moisture content of the food product can be correlated with its deteriorative
indices, and its determination makes it possible to calculate the shelf life of the
moisture-sensitive food product. With a knowledge of the dry weight of the product, its
initial moisture content, its critical moisture content, the package size and permeability,
and the storage conditions (temperature and humidity), the shelf life of the product can be
predicted on the basis of the product-package-environment interaction.


SENSORS IN RELATION TO COMPUTER APPLICATION
IN FOOD PROCESS CONTROL

A.A. Patel
Dairy Technology Division
National Dairy Research Institute,
Karnal-132001

1.0 Introduction
Relevance of computers and electronics in dairy and food processing is evident in two
particular areas: Process control and product evaluation, the first being more obvious.
Food process control as a part of process automation entails product monitoring (or
measurements) integral to process monitoring. Thus, measurements to be made on
raw materials, intermediate product and finished product constitute a key element in
process control. Measurements are also essential in the context of the process state or
process conditions external to the product. Irrespective of the measurement object, the
most crucial aspect of process or product monitoring is real-time instrumentation
coupled with process state modeling for one to be able to effect process control aimed
at automation of the food processing activity.
Real time measurement refers to obtaining information about a product (or process
medium) and transforming it into useful information within a response time coherent
with the evolution of the process. In other words, the measurement made should be
rapid enough to be effectively utilized for appropriate process manipulation so as to
maintain the measured parameter within acceptable limits as the process progresses.
While this can be easily conceived in a continuous process line, it also applies to
batch processing systems. The response time, which may vary from less than a second
to several minutes, depends on the measurement technique or the device employed.
Sensors are commonly used devices for continuous or continual product monitoring.
The type of sensor therefore, determines the response time and thereby, its utility in
process control.
2.0 Sensors and their Classification
2.1 Definition of a Sensor
A sensor is a device or an instrument intended for transforming a quantity to be
measured or detected into another quantity that is accessible to human senses or to an
acquisition system. In general, a sensor detects, locates or quantifies energy or matter
and thereby offers detection or measurement of a physical or chemical property to
which it responds. For the device to qualify as a sensor its design principle must allow
the provision of a continuous output signal. This is particularly relevant to a sensor
that constitutes a component of a process control loop, wherein the sensor output, through
a transmitter, serves as an input to the controller responsible for process manipulation
in a continuous manner.
Sensors are, in effect, ‘transducers’ which indicate change. A transducer is defined as
the device which converts signals from one signal domain (type) to another, i.e.,
which converts one form of energy into another. Some transducers are able not only
to indicate a change but also to effect a change. The transducers intended for causing
a change are termed “actuators”. Ideally, a transducer would convert all the energy of
the input signal into energy in the output signal domain without dissipation (loss) of
any energy in a form outside the intended output signal domain. However, in practice,
full conversion is not achieved. Very few transducers operate without requiring any
other power source than the input signal.

2.2 Basic Components of a Sensor


A sensor generally comprises three components: one (a receptor) that interacts with
the ‘analyte’ or the sample or the product on which measurement is being carried out;
second, a base device or a transducer (usually, a physical one); and third, electronics
intended to magnify, transmit or display the output from the transducer. Often, the
first component, i.e., the interacting part may be integral to the physical transducer or
the base device, and the third, electronic component may form a part of the
‘transmitter’ as an element of a process control loop.
2.3 Sensor Classification based on the Interaction with the Analyte
Sensors may be classified depending on the property being measured, or the type of
interaction with the chemical or physical property of the sample. Thus, the sensor
measuring a chemical property is a chemical sensor or a chemosensor, the one
measuring a physical property is a physical sensor and the sensor devoted to
measurement of a biological property is a biosensor. However, the interaction-based
classification is more commonly used. Thus, physical, chemical, biochemical and
biological interactions will correspondingly designate the sensors.
Physical sensors rely on a physical interaction with the sample as, for example,
measurement of mechanical resonance characteristics as a function of viscosity of a
process fluid, which in turn may be influenced by other process variables leading to
chemical changes in the fluid. Chemical sensors, which may sometimes be taken to
include biosensors are, more specifically, the sensors (chemosensors) that rely on non-
biological chemical or sorption interactions with the analyte. Chemosensors include
those based on the interaction of catalytic metals, or of redox-sensitive metal oxides
with the process fluid. Biosensors are the sensors based on interaction of a
biocatalyst, such as an enzyme, with the analyte. The biocomponent, (bioreceptor)
may, alternatively, be an antibody, in which case the sensor is called an
„immunosensor‟, or it may be a DNA where the resulting sensor is known as a „DNA
probe‟. Such a bioreceptor is suitably placed together with a physical transducer and
electronics to give the required output.

2.4 Sensor Classification based on Proximity with the Measurement Point


Depending on the closeness to the process line, a sensor can be „on-line‟ or „off-line‟.
An on-line sensor provides a continuous measurement on the process material and its
output can be used to adjust process variables using a control loop. The data on a
process stream are available in „real-time‟ or after a short time lapse (i.e. after a short
distance from the measuring point). On-line measurement can be direct in-line or at-
line. An in-line sensor may be placed in the main process line where the measurement
is desired, or the process line may be accessed through a window (or container wall)
transparent to the radiation (visible, microwave or infra-red) used for the
measurement. At-line sensors, on the other hand, are located in a bypass (closed loop)
or a bleed line (open loop) which permits sampling and conditioning (tempering,
dilution, filtration, etc.) to suit the measurement. Automated sampling and computer-
aided sample conditioning permit many instruments to be located in the production
area and used as sensors giving a continuous or intermittent input for process control.
In batch processes, in-situ measurement by means of sensors or instruments provides
real-time information, and process control can likewise be effected based on the output
of these sensors.
When instruments cannot be fed with samples from the line, the samples are subjected
to off-line measurement in the QC laboratory, where rapid analytical tests are carried
out using suitable instruments (off-line sensors) located there. The response time in such
cases again determines the utility of the data for process control, which of course would
be less relevant to continuous process manipulation.
3.0 Intelligent (or Smart) Sensors or Integrated Sensors
Semiconductor devices or semiconducting materials form the basis of many of the
sensors developed in the past. The electrical properties of semiconductors can be
varied widely by the choice of the material which may have conductivities as low as
insulators or as high as metals. These materials, whose properties can be tailored not
only by the crystal size and orientation but also by doping with impurities, are
sensitive to optical, thermal, electrical, magnetic, mechanical and chemical variables
depending on the chosen material and device structure. Silicon, like the quartz crystal
used in a quartz crystal microbalance, is a piezoelectric material (whose electrical
properties, e.g., oscillation frequency, depend on its geometry and/or mass) as well as a
semiconductor, and one or both of these properties have been used in many sensor designs.
Semiconductor devices such as photovoltaic silicon or germanium diodes are
employed as infra-red (IR) detectors in remote thermometry, lead sulphide
photoconductive devices in both IR thermometry and NIR compositional analysis,
and microwave semiconductor devices are used in instrumentation for the
determination of, for example, water content.
While silicon-based field effect transistor (FET) devices and light-addressable
potentiometric sensors (LAPS) are commonly used, solid-state transducers have been
developed as compact and robust substitutes based on the semiconducting and
piezoelectric properties of silicon. Rapid response and the possibility of integration with
other sensing and electronic circuit elements are other advantages which have led to
the use of solid-state transducers in integrated sensors.
Integrated sensors are combined solid-state sensors of the same type, as in a sensor
array, or of different types for the purpose of simultaneous measurement of several
variables. Solid-state sensors can also be integrated with interface electronics
comprising electronic circuitry for signal conditioning and processing aimed at
obtaining an amplified, linearized output signal. Such sensors, also known as „smart
sensors‟, generate, via current/frequency conversion and analog/digital conversion, an
output in a form suitable for input into digital systems:

SENSOR → SIGNAL PROCESSING → OUTPUT PRESENTATION

Sensors can also be integrated with actuators. An example of application of such
smart sensors is the drum-dryer speed control based on the 'time constant'
representing the decreasing thermal efficiency with the progressing dryer run, wherein
the moisture sensor integrated into the dynamic model actuates the speed control for a
constant moisture level in the dried product. Application of integrated sensors can
also be found in fouling and cleaning of heat exchangers, sterilization of canned
foods, and the alcoholic fermentation process.
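As a rough illustration of how such an integrated sensor-actuator arrangement might be
wired up in software, the following sketch adjusts drum speed in proportion to the deviation
of the measured moisture from its set point. The function names and tuning values are
hypothetical and are not taken from the text; this is a minimal sketch, not the actual
drum-dryer model referred to above.

# Minimal sketch of a moisture-based drum-speed control step (illustrative only).
def control_step(measured_moisture, set_point, current_speed, gain=0.5,
                 min_speed=1.0, max_speed=10.0):
    """Proportional correction: slow the drum when the product is too wet."""
    error = set_point - measured_moisture      # negative if product is too wet
    new_speed = current_speed + gain * error   # wetter product -> lower speed -> longer drying
    return max(min_speed, min(max_speed, new_speed))

# Example: product at 4.2% moisture against a 3.5% target, drum currently at 6 rpm
print(control_step(4.2, 3.5, 6.0))             # -> 5.65 rpm (drum slowed down)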
4.0 Conclusion
Sensors as a means of making real-time measurement on a process material form an
integral part of the process control system whether applied to a continuous operation
or to a batch process. Rapid response is their distinguishing feature whether in on-line
application or in use as a laboratory tool. Sensors are based on various interaction
principles, viz., physical (electromagnetic radiation, sound or ultrasound, pressure,
mechanical, electrical, thermal, etc.), chemical or biological/biochemical ones. An
essential component of any sensor is a „transducer‟ i.e. a device converting one type
of signal (input) into another type (or, changing one form of energy into another
form), the output from which is amplified, digitalized and displayed or transmitted to
an acquisition system or a process controller by means of suitable electronics.
Semiconductor-based sensors, in their solid-state structure, have evolved into the so-
called „intelligent‟ or „smart‟ sensors which are essentially combined or integrated
sensing devices for industrial process control. Sensors employed in level, flow or
temperature control have long been the classical applications. Sorting by colour,
consistency monitoring and compositional control are other common sensor
applications in practice. Biosensors, though having limited on-line application as
yet, are emerging as powerful tools whose high specificity is of considerable value in
monitoring chemical, biochemical and microbiological contaminants in milk and food
products. Rapid developments in sensors promise to raise the degree of process
automation and assure product quality and safety.

References

Bresnahan, D. (1997). Process control. In: Handbook of Food Engineering Practice, Band IV, CRC Press, Boca Raton (USA), pp. 633-666.

Castillo, J. et al. (2004). Biosensors for life quality: Design, development and applications. Sensors and Actuators, 102: 179-194.

Deshpande, S.S. and Rocco, M.M. (1994). Biosensors and their potential use in food quality control. Food Technol., 48(6): 1316-1320.

Davenel, A. (1996). On-line control and problems with sensors. In: Quality Control for Food and Agricultural Products, J.L. Multon (Ed.), VCH Publ., London, pp. 95-113.

Kress-Rogers, E. and Brimelow, C.J. (Eds.) (2001). Instrumentation and Sensors for the Food Industry. CRC Press, Boca Raton (USA), 836 pp.

Patel, A.A. (2003). Application of biosensors in dairy and food industries. In: Application of Biotechnology in Dairy and Food Processing, Lecture Compendium, Centre of Advanced Studies, Dairy Technology Division, Nov. 2003, pp. 89-94.
USE OF COMPUTER SOFTWARE LINDO IN DAIRY
PROCESSING

Smita Sirohi
DESM Division,
NDRI, Karnal

1.0 Introduction
LINDO (Linear, Interactive, and Discrete Optimizer) is a convenient but powerful
software package for solving linear, integer, and quadratic programming problems. This lecture
note outlines the basic steps in using LINDO for linear, integer and quadratic
programming.
Linear Programming (LP) deals with the optimization of a linear function of several variables
(called the objective function) subject to the conditions that the variables are non-negative
and satisfy a set of linear equations and/or inequalities (called linear constraints). In all
linear programming problems the objective is the maximization or minimization of some quantity,
and all linear programming problems also have a second property: restrictions
or constraints that limit the degree to which the objective can be pursued. A linear
programming problem can be formulated with n decision variables and m constraints.
A problem involving only two decision variables can be solved graphically.
Problems with more decision variables are solved by the Simplex method. However, a
manual solution is extremely difficult when the number of variables and constraints is
very large, and computer software like LINDO is an easy tool in such cases.
2.0 Solution of LP problem using LINDO: When you start LINDO, two
windows are immediately displayed. The outer window labeled “LINDO” contains all the
command menus and the command toolbar. The smaller window labeled “<untitled>” is
the model window. This window is used to enter and edit the linear programming model
that you want to solve.
Consider the following LP problem, which is the one entered into LINDO below:
Maximize 10S + 9D
subject to
0.7S + 1D ≤ 630
0.5S + 0.83333D ≤ 600
1S + 0.66667D ≤ 708
0.1S + 0.25D ≤ 135
S, D ≥ 0
2.1 Entering Objective Function: The first item you must enter into the model
window is the objective function. Thus, for the above problem, enter MAX 10S + 9D.
To indicate that the objective function has been completely entered and that the model
constraints will follow, press the enter key to move to the next line.
2.2 Entering the Constraints: Type the words SUBJECT TO (or just the letters ST)
in this line. Next, after pressing the enter key to move to a new line, enter the first
constraint. In the example above the first constraint will be entered as:
0.7S + 1D < 630
Two things are worth noting here. First, as the computer input recognizes decimal rather
than fractional data values, the linear program must be stated with decimal
coefficients. Second, LINDO interprets the symbol < as ≤.
After entering the first constraint, move to the next line by pressing the enter key.
Enter the second constraint in the next line
0.5S + 0.83333D< 600
Press the enter key again and enter the third constraint
1S+ 0.66667D <708
Then, press the enter key again and enter the fourth and final constraint,
0.1S+ 0.25D <135
Finally, after pressing the enter key, type END to signal LINDO that the model input is
complete. The model window will now contain the following model:
MAX 10S + 9D
ST
0.7S + 1D < 630
0.5S + 0.83333D< 600
1S+ 0.66667D <708
0.1S+ 0.25D <135
END
If you make an error entering the model, you can correct it at any time by simply
positioning the cursor where you made the error and entering the necessary corrections.
Solving the model: Select the Solve command from the Solve menu, or press the Solve
button on the LINDO toolbar. If LINDO does not find any errors in the model input, it
will begin to solve the model. As part of the solution process, LINDO displays a Status
Window that can be used to monitor the progress of the solver. When the solver is
finished, LINDO will ask whether you want to do range (sensitivity) analysis. If you
select the YES button and close the Status Window, LINDO displays the complete
solution to the LP problem on a new window titled “Reports Window.” The output that
appears in the Reports Window is shown below:
        OBJECTIVE FUNCTION VALUE

        1)      7667.994

 VARIABLE        VALUE          REDUCED COST
        S       539.998413          0.000000
        D       252.001114          0.000000

      ROW   SLACK OR SURPLUS     DUAL PRICES
        2)         0.000000          4.374956
        3)       120.000717          0.000000
        4)         0.000000          6.937531
        5)        17.999882          0.000000

 NO. ITERATIONS=       2

 RANGES IN WHICH THE BASIS IS UNCHANGED:

                        OBJ COEFFICIENT RANGES
 VARIABLE       CURRENT        ALLOWABLE        ALLOWABLE
                 COEF          INCREASE         DECREASE
        S      10.000000         3.499932         3.700000
        D       9.000000         5.285714         2.333300

                        RIGHTHAND SIDE RANGES
      ROW       CURRENT        ALLOWABLE        ALLOWABLE
                  RHS          INCREASE         DECREASE
        2     630.000000        52.363155       134.400009
        3     600.000000        INFINITY        120.000717
        4     708.000000       192.000000       127.998589
        5     135.000000        INFINITY         17.999882
2.3 Interpretation of the solution: The first section of the output shown above is self-
explanatory. For example, we see that the optimal solution is S =540 and D= 252, the
value of the optimal solution is 7668, and the slack variables for the four constraints are
0, 120, 0, and 18. The rest of the output can be used to determine how a change in a
coefficient of the objective function or a change in the right-hand-side value of a
constraint will affect the optimal solution.
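For readers who wish to verify such results outside LINDO, the same LP can be solved with
open-source tools. The sketch below assumes Python with SciPy installed (it is not part of the
LINDO workflow) and reproduces the solution S = 540, D = 252 by minimizing the negated
objective.

from scipy.optimize import linprog

# Maximize 10S + 9D  ==  minimize -(10S + 9D)
c = [-10.0, -9.0]
A_ub = [[0.7, 1.0],
        [0.5, 0.83333],
        [1.0, 0.66667],
        [0.1, 0.25]]
b_ub = [630.0, 600.0, 708.0, 135.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)        # approx. [540. 252.]
print(-res.fun)     # approx. 7668 (value of the objective function)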

3.0 Integer Programming: Integer LP models are ones whose variables are
constrained to take integer or whole number (as opposed to fractional) values. It may not
be obvious that integer programming is a very much harder problem than ordinary linear
programming, but that is nonetheless the case, in both theory and practice. Integer models
are known by a variety of names and abbreviations, according to the generality of the
restrictions on their variables. Mixed integer (MILP or MIP) problems require only some
of the variables to take integer values, whereas pure integer (ILP or IP) problems require
all variables to be integer. Zero-one (or 0-1 or binary) MIPs or IPs restrict their integer
variables to the values zero and one. (The latter are more common than you might expect,
because many kinds of combinatorial and logical restrictions can be modeled through the
use of zero-one variables.)

LINDO recognizes two kinds of integer variables: zero/one variables (binary)
and general integer variables. Zero/one variables are restricted to the values 0 or 1. They
are useful for representing go/no-go type decisions. General integer variables are
restricted to the non-negative integer values (0, 1, 2, ...). GIN and INTEGER statements
are used, respectively, to identify general and binary integer variables. The statements
should appear after the END statement in your model. Variables which are restricted to
the values 0 or 1 are identified using the INTEGER statement. It is used in one of two
forms:

INTEGER <VariableName>
or
INTEGER <N>

The first (and recommended) form identifies variable <VariableName> as being 0/1. The
second form identifies the first N variables in the current formulation as being 0/1. The
order of the variables is determined by their order encountered in the model. This order
can be verified by observing the order of variables in the solution report. The second
form of this command is more powerful because it allows the user to identify several
integer variables with one line, but it requires the user to be aware of the exact order of
the variables. This may be confusing if not all variables appear in the objective. If there
are several variables, but they are scattered throughout the internal representation of the
model, the variables would have to be identified as integers on separate lines.
General integer variables are identified with the GIN statement. Otherwise, the
GIN command is used exactly as the INT command. For example, GIN 4 makes the first
4 variables general integer. GIN TONIC makes the variable TONIC a general integer
variable.
A Small Example - Integer Programming Model
MIN X + 3 Y + 2 Z
ST
2.5 X + 3.1 Z > 2
.2 X + .7 Y + .4 Z > .5
END
INTEGER X
INTEGER Y
INTEGER Z

Notice the INTEGER statements appear after the END statement. These three
INTEGER statements declare our three variables to be zero/one. When we solve this
model the following results appear:
 LP OPTIMUM FOUND AT STEP      3
 OBJECTIVE VALUE =   2.25714278

 NEW INTEGER SOLUTION OF    3.00000000     AT BRANCH      0 PIVOT      3

 RE-INSTALLING BEST SOLUTION...

        OBJECTIVE FUNCTION VALUE

        1)      3.000000

 VARIABLE        VALUE          REDUCED COST
        X         1.000000          1.000000
        Y         0.000000          3.000000
        Z         1.000000          2.000000

      ROW   SLACK OR SURPLUS     DUAL PRICES
        2)         3.600000          0.000000
        3)         0.100000          0.000000

 NO. ITERATIONS=       3
 BRANCHES=    0 DETERM.=  1.000E    0
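The same binary model can be cross-checked with SciPy's mixed-integer solver. The sketch
below assumes Python with SciPy 1.9 or later (which provides scipy.optimize.milp) and is
independent of LINDO.

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

c = np.array([1.0, 3.0, 2.0])                      # minimize X + 3Y + 2Z
A = np.array([[2.5, 0.0, 3.1],                     # 2.5X + 3.1Z >= 2
              [0.2, 0.7, 0.4]])                    # 0.2X + 0.7Y + 0.4Z >= 0.5
cons = LinearConstraint(A, lb=[2.0, 0.5], ub=[np.inf, np.inf])

res = milp(c,
           constraints=cons,
           integrality=np.ones(3),                 # all three variables integer
           bounds=Bounds(0, 1))                    # ...and restricted to 0/1
print(res.x, res.fun)                              # approx. [1. 0. 1.]  3.0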

4.0 Quadratic Programming: It involves optimization of a quadratic objective
function. An optimization problem is considered to be quadratic when all the constraints
are linear and the objective contains at least one quadratic (second-degree) term. As an
example, Model 1 below is a quadratic program and Model 2 is not:
Model 1:
Maximize X² - XY + 3X + 10Y
Subject to X + Y < 10
X < 7
Y < 6

Model 2:
Maximize X³ - XY + 3X + 10Y
Subject to X² + Y² < 10
X < 7
Y < 6

This is not a quadratic program due to the cubic term in the objective and the quadratic
terms in the first constraint.
To use LINDO to solve quadratic programs, models should be converted into
equivalent linear form. This is accomplished by writing a LINDO model with the first
order conditions (also called the Karush/Kuhn/Tucker/LaGrange conditions) as the first
set of rows and the Ax<=b constraints, which we refer to as the "real" constraints, as the
second set of rows. The first order conditions are the optimality conditions, which must
be satisfied by a solution to a quadratic model. It turns out that the first order conditions
to a quadratic model are all linear. LINDO exploits this fact to use its linear solver to
solve what is effectively a nonlinear model. If you are uncertain as to exactly what first
order conditions are and how they are used here is a simple illustrative example (for
details you can refer to any introductory calculus text).
Suppose the quadratic objective function for minimization is:
3X² + 2Y² + Z² + 2XY – XZ – 0.8YZ
and the constraints are:
X+Y+Z=1
1.3 X + 1.2 Y + 1.08 Z > 1.12
X < 0.75
Y < 0.75
Z < 0.75

The input procedure for LINDO requires this model be converted to true linear
form by writing the first order conditions. To do this, we must introduce a dual variable,
or LaGrange multiplier, for each constraint. For the above 5 constraints, we will use 5
dual variables denoted, respectively, as: UNITY, RETURN, XFRAC, YFRAC, and
ZFRAC.
The LaGrangean expression corresponding to this model is then:

Min f(X,Y,Z) = 3X² + 2Y² + Z² + 2XY – XZ – 0.8YZ
               + (X + Y + Z – 1) UNITY
               + (1.12 – (1.3X + 1.2Y + 1.08Z)) RETURN
               + (X – 0.75) XFRAC
               + (Y – 0.75) YFRAC
               + (Z – 0.75) ZFRAC

Basically, we have moved all the constraints into the objective, weighting them
by their corresponding dual variables.
The next step is to compute the first order conditions. The first order conditions
are computed by taking the partial derivatives of f(X,Y,Z) with respect to each of the
decision variables and setting them to be non-negative. For example, for the variable X
the first order condition is:

6 X + 2 Y - Z + UNITY - 1.3 RETURN + XFRAC ≥ 0


After the specification of the first order condition with respect to each variable
viz., X,Y and Z there is one final restriction on the input of quadratic models. Namely,
the real constraints, which are:
X+Y+Z=1
1.3 X + 1.2 Y + 1.08 Z > 1.12
X < 0.75
Y < 0.75
Z < 0.75
in our example, must appear last.

Finally, the command QCP must be used to specify which constraint is the first of
the real constraints. Since we have an objective and one first order condition for each of
the three variables, the real constraints begin with row 5. Thus, we will need a "QCP 5"
statement to complete our model.
The final model will hence be:

MIN X + Y + Z + UNITY + RETURN + XFRAC + YFRAC + ZFRAC


ST
6 X + 2 Y - Z + UNITY - 1.3 RETURN + XFRAC > 0
2 X + 4 Y - 0.8 Z + UNITY - 1.2 RETURN + YFRAC > 0
- X - 0.8 Y + 2 Z + UNITY - 1.08 RETURN + ZFRAC > 0
X+Y+Z=1
1.3 X + 1.2 Y + 1.08 Z > 1.12
X < .75
Y < .75
Z < .75
END
QCP 5
After solving the model, following solution will be obtained:

        OBJECTIVE FUNCTION VALUE

        1)      0.4173749

 VARIABLE        VALUE          REDUCED COST
        X         0.154863          0.000000
        Y         0.250236          0.000000
        Z         0.594901          0.000000
    UNITY        -0.834750          0.000000
   RETURN         0.000000          0.024098
    XFRAC         0.000000          0.595137
    YFRAC         0.000000          0.499764
    ZFRAC         0.000000          0.155099

      ROW   SLACK OR SURPLUS     DUAL PRICES
        2)         0.000000         -0.154863
        3)         0.000000         -0.250236
        4)         0.000000         -0.594901
        5)         0.000000         -0.834750
        6)         0.024098          0.000000
        7)         0.595137          0.000000
        8)         0.499764          0.000000
        9)         0.155099          0.000000

 NO. ITERATIONS=       7

Thus, the objective function is minimized when the values of X, Y and Z are
0.154863, 0.250236 and 0.594901, respectively.
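As a cross-check outside LINDO, the original quadratic program (before conversion to
first-order conditions) can be solved numerically. The sketch below assumes Python with SciPy
and uses a general nonlinear solver rather than LINDO's QCP mechanism; since LINDO assumes
non-negative variables, the bounds 0 ≤ X, Y, Z ≤ 0.75 are imposed explicitly.

from scipy.optimize import minimize

def f(v):
    X, Y, Z = v
    return 3*X**2 + 2*Y**2 + Z**2 + 2*X*Y - X*Z - 0.8*Y*Z

constraints = [
    {"type": "eq",   "fun": lambda v: v[0] + v[1] + v[2] - 1.0},
    {"type": "ineq", "fun": lambda v: 1.3*v[0] + 1.2*v[1] + 1.08*v[2] - 1.12},
]
bounds = [(0.0, 0.75)] * 3                     # each fraction between 0 and 0.75

res = minimize(f, x0=[1/3, 1/3, 1/3], bounds=bounds,
               constraints=constraints, method="SLSQP")
print(res.x, res.fun)                          # approx. [0.155 0.250 0.595]  0.4174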
STATISTICAL METHODS FOR DAIRY PROCESSING

D.K. Jain
National Dairy Research Institute,
Karnal-132001 (Haryana)

1.0 Introduction
The discipline of statistics and computer applications is indispensable in research work particularly so
in agricultural and dairy research. Most of the advancements in various facets of knowledge have
taken place because of experiments conducted in agricultural and dairy sciences with the help of
statistical methods. These experiments are frequently designed and data so obtained are analyzed with
the help of various statistical methodologies and interpreted accordingly. In fact, there is hardly any
research work today that one can find complete without statistical data and its subsequent statistical
analysis and interpretation. Statistics plays a vital role in diverse fields from basic research to practical
decision-making on the basis of numerical data and calculated risks. The development of statistics has
been closely related to evolution of electronic computers. The applications of complex statistical
theories have increased manifold with the availability of high-speed computers and associated
software. The disciplines of Statistics and Computer Applications have now assumed greater
importance in all the fields of science including Dairy Science in view of fast developments taking
place in these areas.

The quality of milk and milk products can be judged on the basis of chemical and bacteriological as
well as sensory evaluation. Advances in micro-electronics and computing in the past decade have
strengthened the incentives for automation of food manufacturing processes. The use of computers to
control the operations of industrial units has already enabled increased product quality and process
flexibility. However, the dairy industry in the country has been relatively late in embracing advanced
automation systems. The application of computers in processing, design and simulation is
transforming the industry, and the majority of these applications are in Computer-Aided Design (CAD),
mathematical modelling, quality control and so on. In the majority of food plants today,
automation plays an important role. A number of statistical techniques and statistical software
packages are available to analyze the data obtained from chemical, bacteriological and sensory
evaluation. In this write-up, a brief review of various statistical methods useful for dairy
processing is given.

2.0 Research Methodology


Research is a systematic scientific method consisting of enunciating the problem, formulating a
hypothesis, collecting data, analyzing them and finally arriving at a conclusion or solution.
3.0 Hypothesis Testing
The purpose of hypothesis testing is to aid the researcher or administrator in reaching a decision
concerning a population by examining a sample from that population.
A hypothesis may be defined simply as a statement about one or more populations. The hypothesis is
usually concerned with the parameters of the populations about which the statement is made.
Researchers are concerned with two types of hypotheses – research hypotheses and statistical
hypotheses. The research hypothesis is the conjecture or supposition that motivates the research. It
may be the result of years of observation on the part of the researcher.
Research hypotheses lead directly to statistical hypotheses. Statistical hypotheses are stated in such a
way that they may be evaluated by appropriate statistical techniques. For convenience, hypothesis
testing is presented as a six-step procedure.
Data : The nature of the data that form the basis of the testing procedures must be understood,
since this determines the particular test to be employed. Whether the data consist of counts or
measurements, for example, must be determined.
Assumptions : These include, among others, assumptions about the normality of the population
distribution, equality of variances, and independence of samples.
Hypothesis : There are two statistical hypotheses involved in hypothesis testing and these should
be explicitly stated. The first is the hypothesis to be tested, usually referred to as the null
hypothesis of no difference, since it is a statement of agreement with (or no difference from) true
conditions in the population of interest. In general, the null hypothesis is set up for the express
purpose of being discredited. Consequently, the complement of the conclusion that the researcher
is seeking to reach becomes the statement of the null hypothesis. In the testing process the null
hypothesis either is rejected or is not rejected. If the null hypothesis is not rejected, we will say
that the data on which the test is based do not provide sufficient evidence to cause rejection. If the
testing procedure leads to rejection, we will conclude that the data at hand are not compatible with
the hypothesis, but are supportive of some other hypothesis. This other hypothesis is known as the
alternative hypothesis and may be designated by the symbol H1.
Compute Test Statistic : From the data contained in the sample we compute a value of the test
statistic and compare it with the acceptance and rejection regions that have already been specified.
Statistical Decision : The statistical decision consists of rejecting or of not rejecting the
null hypothesis. It is rejected if the computed value of the test statistic falls in the
rejection region, and it is not rejected if the computed value of the test statistic falls in the
acceptance region.
Conclusion : If Ho is rejected, we conclude that H1 is true. If Ho is not rejected, we
conclude that Ho may be true.
3.1 Large Sample Tests
By "large sample" we mean that the sample size is large enough for the sampling distribution of
the test statistic to be taken as normal, so that the properties of the Normal Distribution apply.
The different possible cases and the corresponding test statistics are as follows:

Cases :

1. Testing the mean of a normal distribution with known S.D. (σ):

   Z = | X̄ – µ0 | / (σ / √n)

2. Testing the mean of a normal distribution with unknown S.D. (σ):

   t = | X̄ – µ0 | / (S / √n),   where S = √[ Σ(Xi – X̄)² / (n – 1) ]
   (S is called the sample standard deviation)

3. (a) Test of equality of two means with known variances (Ho : µ1 = µ2):

   Z = | X̄1 – X̄2 | / √( σ1²/n1 + σ2²/n2 )

   (b) Test of equality of two means when the variances are not known:

   t = | X̄1 – X̄2 | / [ S √( 1/n1 + 1/n2 ) ],
   where S = √[ ( (n1 – 1)S1² + (n2 – 1)S2² ) / (n1 + n2 – 2) ]

4. (i) Test of a proportion (single sample):

   Z = | P – P0 | / √( P0Q0 / n ),   where Q0 = 1 – P0

   (ii) Test of equality of proportions of two samples:

   Z = ( P1 – P2 ) / √( p1q1/n1 + p2q2/n2 )

3.2 Small Sample Tests


t Test
Apply when : i) Sample size is 30 or less,
ii) Population variance or standard deviation is unknown.
While testing the hypothesis, the following assumptions are usually made:
a) the population is normal (or Approximately Normal),
b) observations are independently drawn for the random sample,
c) in case of 2 samples, population variances are assumed to be equal (for the test of equality
of Means).
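As an illustration of how such tests are run in practice, the sketch below assumes Python with
SciPy and uses made-up fat-percentage data purely for demonstration; it carries out a two-sample
t test with a pooled standard deviation, as described above.

from scipy import stats

# Hypothetical fat percentages of two batches of milk (illustrative data only)
batch_a = [3.9, 4.1, 4.0, 4.2, 3.8, 4.0]
batch_b = [3.6, 3.8, 3.7, 3.9, 3.7, 3.8]

# Two-sample t test assuming equal population variances (pooled S)
t_stat, p_value = stats.ttest_ind(batch_a, batch_b, equal_var=True)
print(t_stat, p_value)   # reject Ho: mu1 = mu2 at the 5% level if p_value < 0.05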
4.0 Chi-Square as a Test of Independence
Many times, the management of a dairy plant wants to know whether the differences they
observe among several sample proportions are significant or only due to chance. Suppose the dairy
plant manager conducting consumer preference survey for a dairy product wants to know whether the
proportion of persons liking the dairy product differs significantly among different age groups /
income groups / education groups / food habit groups / regions, etc. This is carried out using chi-
square test for independence of attributes, where the objective is to test whether the two attributes A
and B each at „r‟ and „s‟ levels, respectively, possessed by persons are independent. The chi-square
formula to test the independence of attributes is given as

   χ² = Σi=1..r Σj=1..s [ (AiBj) – (Ai)(Bj)/N ]² / [ (Ai)(Bj)/N ]

which is distributed as χ² with (r – 1)(s – 1) degrees of freedom, where (AiBj) is the number of persons
possessing attribute A at the ith level and attribute B at the jth level, (Ai) is the number of persons
possessing attribute A at the ith level irrespective of the level of attribute B (i.e., the column total),
(Bj) is the number of persons possessing attribute B at the jth level irrespective of the level of
attribute A (i.e., the row total), and N is the total number of persons.
If the calculated value of χ² is greater than the tabulated χ², we may conclude that the
attributes under consideration are not independent, which may imply that the liking for the dairy product
differs significantly among the different groups considered in the study.
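A worked example of this test is sketched below, assuming Python with SciPy; the preference
counts by age group are invented purely for illustration.

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts of consumers liking / not liking a dairy product, by age group
#                 liked   not liked
table = np.array([[40,      10],      # < 25 years
                  [35,      15],      # 25-40 years
                  [20,      30]])     # > 40 years

chi2, p_value, dof, expected = chi2_contingency(table)
print(chi2, dof, p_value)   # a small p_value -> liking is not independent of age group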

5.0 Correlation and Regression Techniques


Correlation and regression techniques are used for assessing relationships among variables and
for prediction. The different procedures of correlation, like simple, partial, multiple, rank, intra-
class and the correlation ratio, are useful in different situations for assessing the relationships
among variables. Simple and multiple regression are useful for predicting a
dependent variable from a set of explanatory variables in different situations. The
estimates of the correlation and regression coefficients are tested using different statistical tests
of significance for valid inference. The criteria for model selection, comparison of
models and sensitivity of the regression are discussed, as are some of the problems, such as
multicollinearity and extreme observations, encountered in data analysis.

5.1 Measures of Correlation


Simple Correlation : It measures the relationship between two variables X and Y. It ranges
from –1 to +1.

   r = [ ΣXY – (ΣX)(ΣY)/n ] / √{ [ ΣX² – (ΣX)²/n ] [ ΣY² – (ΣY)²/n ] }

Partial Correlation : It measures the partial relationship between two variables Y and X1,
keeping the effect of a third variable X2 constant. It ranges between –1 and +1.

   rYX1.X2 = ( rYX1 – rYX2 rX1X2 ) / √[ (1 – r²YX2)(1 – r²X1X2) ]

Multiple Correlation : It measures the correlation of a dependent variable Y with a set of
two or more independent variables X taken together. It ranges between 0 and 1.

   R²Y.X1X2 = ( r²YX1 + r²YX2 – 2 rYX1 rYX2 rX1X2 ) / ( 1 – r²X1X2 )

Correlation Ratio : The correlation ratio η is the appropriate measure of curvilinear
relationship when the relationship between two variables is not linear. If the relation is linear
then η = r, otherwise η > r. It ranges between 0 and 1.

   η²YX = [ Σi (Ti²/ni) – T²/N ] / (N σ²Y)

   where Ti = Σj fij Yij is the total of Y in the ith class, ni is the number of observations in that
   class, T = Σi Ti is the grand total and N = Σi ni.
5.2 Measures of Regression
Simple Regression : It measures the functional relationship between a dependent variable
Y and an explanatory variable X through estimates of a constant term (α) and a slope (β). The
estimates of α and β can be negative, zero or positive. The linear regression is given as:

   Y = α + βX

   β = Σ xiyi / Σ xi²   (xi and yi being deviations from the respective means)
   α = ȳ – β x̄

Multiple Regression : If the dependent variable Y is a function of a set of explanatory
variables X, then the regression coefficients of the different variables (β) along with the
constant term (α) are estimated using matrix algebra. The multiple regression of Y on the
explanatory variables can be given as:

   Y = α + β1X1 + β2X2 + β3X3 + β4X4 + ………

   β = (X'X)⁻¹ X'Y

The regression coefficients can be negative, zero or positive and measure the rate of
change in Y for a unit change in the corresponding explanatory variable.

Polynomial Regression : If the dependent variable Y is a function of linear and higher
order effects of an explanatory variable X, then a polynomial regression is fitted to
quantify the effects of the variable, and its significance at different orders, for prediction of Y.
The nth order polynomial regression of Y can be given as:

   Y = α + β1X1 + β2X1² + β3X1³ + β4X1⁴ + ……… + βnX1ⁿ

   β = (X'X)⁻¹ X'Y

The polynomial regression coefficients can likewise be negative, zero or positive.
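The following sketch assumes Python with NumPy and uses invented data purely for
illustration; it computes a simple correlation coefficient and fits a multiple regression
β = (X'X)⁻¹X'Y by least squares.

import numpy as np

# Hypothetical data: sensory score (Y) vs. fat % (X1) and sugar % (X2)
X1 = np.array([3.0, 3.5, 4.0, 4.5, 5.0, 5.5])
X2 = np.array([12., 13., 12., 14., 15., 14.])
Y  = np.array([6.1, 6.8, 7.0, 7.9, 8.4, 8.6])

# Simple correlation between Y and X1
r = np.corrcoef(X1, Y)[0, 1]

# Multiple regression Y = a + b1*X1 + b2*X2 (least-squares estimate)
X = np.column_stack([np.ones_like(X1), X1, X2])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(r)       # close to +1 for these data
print(beta)    # [a, b1, b2]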

6.0 RANKING METHODS


Ranking methods are used where the characteristics are of a qualitative nature. Sometimes
recourse to these methods is taken even where measurements are made, in order to reduce the labour
of computation or to get a rapid result. There are several such techniques:
6.1 Spearman’s Rank Correlation Method

The best known technique in this field is Spearman‟s Rank Correlation Coefficient. Suppose
we have ten makes of ice-cream ranked in order of preference by two judges. The problem is: Do the
judges show evidence of agreement among themselves with regard to ranking? In this case, we use
Spearman's Rank Correlation Coefficient, which is defined by the formula

   R = 1 – [ 6 Σi=1..n di² ] / [ n(n² – 1) ]

Where,

„di‟ is the difference between ranks given by two judges for ith make of ice-cream and „n‟ is
the number of makes of ice-cream.

Whenever two rankings are identical, the rank correlation has the value +1 and when the
rankings are as greatly in disagreement as possible, i.e., when one ranking is exactly the reverse of the
other, the rank correlation coefficient is equal to –1. The significance of the rank correlation
coefficient is tested through t-test, which is computed by the formula

   t = R √[ (n – 2) / (1 – R²) ]

distributed as a t-statistic with (n – 2) degrees of freedom. If the computed 't' value is greater
than the tabulated value of 't', it is concluded that there is evidence of significant agreement between the
two judges. This should not be taken to mean that the two judges are really placing the different makes of
ice-cream in approximately the correct order; it is quite possible that they agree and are both wrong.
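The sketch below assumes Python with SciPy, with invented rankings, and computes R and its
significance for two judges' rankings of ten makes of ice-cream.

from scipy import stats

# Hypothetical rankings of ten makes of ice-cream by two judges
judge1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
judge2 = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

rho, p_value = stats.spearmanr(judge1, judge2)
print(rho, p_value)   # rho near +1 and a small p_value indicate significant agreement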

6.2 Coefficient of Concordance

Very often, as in the case of sensory evaluation of dairy products, we are not concerned
simply with the agreement between two judges, but have several judges and want to know whether
there is significant measure of agreement among them. In such cases, coefficient of concordance,
which is a measure of overall agreement among the judges is used. Suppose we have „n‟ makes of ice-
cream being subjected to sensory evaluation by a panel of „m‟ judges. Let us suppose that the judges
assign rankings to the different makes of ice-cream and it is desired to test whether there is an
evidence of overall agreement among the judges. The coefficient of concordance as measured by „w‟
is defined by the formula,
   w = S / Smax = 12S / [ m²(n³ – n) ]

Where Smax is the maximum possible sum of squares, attained when the judges are in complete agreement,
and S is the sum of squared differences between the observed rank totals and their expectation under the
null hypothesis that there is no agreement among the judges.

The coefficient of concordance w lies between zero, signifying complete randomness in the
allocation of rankings, and 1, signifying complete agreement among the judges. w can be tested for
significance using Snedecor's F distribution as follows:

Step 1. A 'continuity correction' is applied, whereby unity is subtracted from the calculated value of S and
the divisor m²(n³ – n)/12 is increased by 2 before w is calculated.

Step 2. Calculate Snedecor's F = (m – 1)w / (1 – w), distributed as an F statistic with
[(n – 1) – 2/m] degrees of freedom for the greater estimate and (m – 1)[(n – 1) – 2/m] degrees of
freedom for the lesser estimate.

If the calculated value of the F statistic is greater than the tabulated F, then the judges do exhibit a
notable degree of agreement in their judgements of the palatability of the different makes of ice-cream
from different manufacturers.
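A direct computation of w from a table of rankings is sketched below, assuming Python with
NumPy; the ranks are invented and no correction is made for tied ranks.

import numpy as np

# Hypothetical ranks: m = 3 judges (rows) ranking n = 5 makes of ice-cream (columns)
ranks = np.array([[1, 2, 3, 4, 5],
                  [2, 1, 3, 5, 4],
                  [1, 3, 2, 4, 5]])
m, n = ranks.shape

rank_totals = ranks.sum(axis=0)                     # observed rank totals per make
S = np.sum((rank_totals - rank_totals.mean())**2)   # squared deviations from expectation
w = 12 * S / (m**2 * (n**3 - n))
print(w)                                            # 1 = complete agreement, 0 = randomness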

6.3 Coefficient of Consistency


Sometimes a question arises whether ranking is a legitimate procedure at all. It can often be
the case that we sensibly have a preference for one item over the other without being able to show
logical justification for a ranking procedure. In our ice-cream example, there are many factors, which
might influence several judges differently. One judge may be influenced by taste, the other by colour,
another by attractive package, and so on. Such type of judgements are called as „multidimensional‟.
Characteristic of all such cases is inconsistency of judgements expressed by the same observer. The
problem of multi-dimensional judgements can be tackled on the basis of paired comparisons, rather
than straight ranking. In this case, the judge is provided with every possible combination of two items
from the set of items to be evaluated and leave him scope for inconsistent judgements.

In general, given „n‟ items, pairs can be chosen in nc2 ways. Here, the judge is not allowed to
declare himself unable to decide between one item of a pair and another. He has to decide either way.
For example, if we have 7 items (A, B, C, D, E, F and G) to be compared then we have 21
pairs. A table is prepared, where we record choice of judges in which the symbol „1‟ is allotted, if the
item denoting the row of the table is preferred to the item denoting the column, whereas symbol „0‟ is
allotted if the item denoting the row of the table is rejected in favour of the item denoting the column.
Obviously since no item is compared with itself the diagonal of the table will be blank.

The consistency of the judge is tested by the coefficient of consistency, which is given as under:

   K = 1 – 24d / (n³ – n)    when n is odd
   K = 1 – 24d / (n³ – 4n)   when n is even

where d is the number of circular triads observed in a given set-up of choices. The value of d is
given as

   d = (Tmax – T) / 2,   where Tmax = (n³ – n) / 12   and   T = Σi=1..n (Si – Ei)²

Here Si is the sum of the ith row and Ei is the expected frequency of the symbol per row, given by
(n – 1)/2. The value of K lies between 0 and 1; K attains the value 1 if there are no inconsistent
triads of judgement, otherwise it is less than 1. Whenever K = 1, we are justified in setting up an
ordinary ranking of the items, but not otherwise.
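For a concrete illustration, the sketch below (Python with NumPy; the preference matrix is
invented) computes d and K for one judge's paired comparisons of four items.

import numpy as np

# Hypothetical paired-comparison matrix for n = 4 items (1 = row item preferred to column item)
pref = np.array([[0, 1, 1, 1],
                 [0, 0, 1, 0],
                 [0, 0, 0, 1],
                 [0, 1, 0, 0]])
n = pref.shape[0]

S = pref.sum(axis=1)                  # row sums (times each item is preferred)
E = (n - 1) / 2                       # expected frequency per row
T = np.sum((S - E)**2)
T_max = (n**3 - n) / 12
d = (T_max - T) / 2                   # number of circular triads
K = 1 - 24*d/(n**3 - n) if n % 2 else 1 - 24*d/(n**3 - 4*n)
print(d, K)                           # here: one circular triad, K = 0.5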

6.4 Coefficient of Agreement


It may be noted that the existence of significant consistency does not necessarily guarantee
that a judge's judgement is sound. A judge may well be consistently wrong. Even a group of observers or
judges may be consistently and jointly wrong. In such cases, the possible solution is to increase the
number of judges doing our paired comparison test. The question then is whether there is a degree of
agreement among the judges. This is examined by computing the coefficient of agreement.

Suppose there are „m‟ judges doing the paired comparison test. Then in our table of results we
can record square by square, the number of judges stating the preference in question. In this way each
cell in our table might contain any number from „m‟ (when all the judges state the preference AB) to
zero (when all the judges state preference BA). Given „n‟ items to be compared, there will be nc2
paired comparisons. If all the judges are in perfect agreement, there will thus be nc2 cells in our table
containing the score „m‟, and nc2 cells containing the score „0‟.

The next step is to find the number of agreements between pairs of judges. For example, if a
particular cell in the table contains the score „j‟ to indicate that „j‟ judges had agreed in ranking of
that particular choice, these „j‟ judges could be arranged in jc2 pairs, all agreeing about the judgement
in question. The same calculation has to be repeated for the number of agreements between pairs of
judges for every cell in the table getting for each cell a term of the type jc2, where „j‟ is the number of
judges in various cells. Adding up all these jc2 terms for the whole table we get

   J = Σ jC2 ,  the summation being taken over all the cells of the table, with n items to be compared.

Now, with m judges and n items to be compared, the coefficient of agreement is defined by the formula

   A = [ 2J / (mC2 · nC2) ] – 1

The coefficient of agreement lies between –1 signifying complete disagreement between two judges
and +1 signifying maximum number of agreements occurring when nc2 cells each contain the number
„m‟, and the maximum number of agreements between judges will be mc2 . nc2.

The coefficient of agreement can be tested for statistical significance through the chi-square test.
The expression

   Z = 4J / (m – 2)  –  [ m(m – 1)(m – 3) n(n – 1) ] / [ 2(m – 2)² ]

is distributed as χ² with  m(m – 1) n(n – 1) / [ 2(m – 2)² ]  degrees of freedom.

7.0 Analysis of Variance


Analysis of variance (ANOVA) is an important statistical technique, which enables us to test
the significance of the differences among more than two sample means. This technique consists in
splitting up the total variation into components of variation due to independent factors, where each
component gives an estimate of the population variance, with the remainder attributed to random
causes of variation.

Analysis of variance is very useful in sensory and chemical evaluation studies. Here one can
compare chemical characteristics of the product / average flavour score / body and texture score /
overall acceptability score given by a panel of judges among different types of dairy products
prepared with different ingredients varying at different levels. In each of these cases, we compare the
means of more than two samples.

In order to use analysis of variance, it is assumed that each of the samples is drawn from a
normal population and that each of these populations has the same variance σ². If, however, the
sample sizes are large enough, the assumption of normality is not so crucial.

7.1 One-way Analysis of Variance


Here, the researcher wants to test the hypothesis of whether the k treatment / class means
are equal, which can be formally stated as

   µ1 = µ2 = µ3 = ……… = µk

Suppose, for example, the researcher wants to test whether the average chemical or bacteriological
characteristic on sensory score assigned by a panel of judges for different dairy products differ
significantly or not. Here, the mathematical model under consideration is:

   Xij = µ + ti + eij ,    i = 1, 2, …, k;   j = 1, 2, …, ni

where Xij is the jth observation belonging to the ith class, µ is the general mean, ti is the effect of the
ith class and eij is the random error component, which is normally and independently distributed with
zero mean and constant variance σe². For detailed statistical analysis one may consult any standard textbook of
statistics.
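A one-way ANOVA of this kind can be run, for example, as in the sketch below (assuming Python
with SciPy; the sensory scores are invented purely for illustration).

from scipy import stats

# Hypothetical flavour scores for three dairy products judged by a panel
product_a = [7.5, 8.0, 7.8, 8.2, 7.9]
product_b = [6.9, 7.2, 7.0, 7.4, 7.1]
product_c = [8.1, 8.4, 8.0, 8.3, 8.5]

f_stat, p_value = stats.f_oneway(product_a, product_b, product_c)
print(f_stat, p_value)   # a small p_value -> at least one product mean differs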

7.2 Two-way Analysis of Variance (with one observation per cell)

The basic objective in this method is comparison among all treatments within a small block of
experimental material, thus reducing the effect of error variation. The mathematical model for two-
way analysis of variance is given as under :

Xij = µ + ti + bj + eij

where µ is the general mean, ti is the effect due to the ith treatment (i = 1, 2, ….., t), bj is the effect due to the jth block
(j = 1, 2, ….., b) and eij is the random error component assumed to be NID(0, σe²). The model assumes
additivity of block and treatment effects, i.e., there is no joint effect of the ti and bj. For example,
suppose that a processor of dairy products is interested in comparing four storage procedures. The
variable of interest is the index of the bacterial count after 72 hrs of storage. Since the milk received is
variable with respect to the bacterial count, the researcher wishes to experiment with several lots /
batches of milk, say, five batches which constitute his replicates of experimental data. His objective is
to find the significant differences among storage treatments and batches with respect to bacteria count
present in the milk, which can be done by using two-way analysis of variance.

8.0 Factorial Experiments


Experiments where the effects of more than one factor, say fat / TS level and sugar level, on the
dairy product are considered together are called factorial experiments, while experiments
with one factor, say only fat / TS or only sugar, may be called simple experiments. For example, the quality
of ice cream or kulfi depends on the particular fat / TS level as well as on the particular sugar
level used, and a particular combination gives the best quality of ice-cream or kulfi. The only way to know the effect of
different fat / TS levels in the presence of different sugar levels (or vice-versa) is to have all possible
combinations of the fat / TS and sugar levels in the same experiment with two factors and to select the
combination of the two factors which gives the best quality ice-cream or kulfi. Similarly, factorial
experiments can be extended to three or more factors, each at different levels, but the analysis then
becomes more complicated.

8.1 Two-Factor Experiments

Sometimes a researcher may be interested in examining the effect of two classes of
treatments, or 'factors'. For example, in the milk storage experiment mentioned in the previous section,
he may be interested in studying both the effect of different types of containers and the effect of
different storage temperatures. Here, he is interested not only in the individual effect of each of these
factors, but also in their joint effect known as interaction effect. That is, certain temperatures may
work better in conjunction with particular containers than with others.

To take another example, the researcher may be interested in examining the effect of various
malt levels and sugar levels and their interaction on the quality of sterilized malted milk evaluated for
overall acceptability by a panel of judges. For this design, we establish the following mathematical
model :

Xijk = µ + Mi + Sj + (MS)ij + eijk

Where,

µ = General mean;

Mi = Effect of ith malt level;

Sj = Effect of jth sugar level;

(MS)ij = Effect of interaction between malt level and sugar levels in ij th cell, and
eijk is the random error component assumed to be independently
normally distributed with zero mean and constant variance.

We assume that ΣMi = ΣSj = Σ(MS)ij = 0. Here, the total sum of squares will be split into the sums
of squares due to malt level, sugar level and their interaction, and the error sum of squares. For
detailed statistical methodology one can refer to any standard textbook of statistics given at the end.

If, in the above experiment, it is found that the interaction effect is significant one can find out
best possible combination of malt and sugar level, which gives the best quality of sterilized malted
milk based on mean overall acceptability scores for various combinations of malt and sugar levels.
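Such a two-factor analysis with interaction can be carried out, for instance, as in the sketch
below, assuming Python with pandas and statsmodels; the acceptability scores are invented
purely for illustration.

import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical acceptability scores for 2 malt levels x 2 sugar levels, 3 replicates each
data = pd.DataFrame({
    "malt":  ["M1"]*6 + ["M2"]*6,
    "sugar": (["S1"]*3 + ["S2"]*3) * 2,
    "score": [7.1, 7.3, 7.2, 7.8, 7.9, 7.7,
              6.8, 6.9, 7.0, 8.2, 8.4, 8.3],
})

model = ols("score ~ C(malt) * C(sugar)", data=data).fit()   # main effects + interaction
print(anova_lm(model, typ=2))                                # ANOVA table with F and p values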

8.2 Three Factor Experiments

It is not necessary that the researcher should limit his investigation to two factors. For
example, in a study of Process Optimization for Acidified Milk Beverages, the researcher prepared
beverage using four sugar levels, three pH levels and three types of acids. The beverage so prepared
was subjected to chemical as well as sensory evaluation by a panel of judges for its body and texture
and flavour. The objective of the experiment was to find whether the quality of beverage is affected
by different sugar levels, pH levels, types of acids and their interaction. This is three factor
experiment. The details of the experiment are given as under :

Factor A : Sugar Levels : 14%, 16%, 18%, 20%

Factor B : pH Levels : 2.75, 4.00, 4.25

Factor C : Type of Acids : Phosphoric acid, citric acid, lactic acid

Number of Replicates : Three

On the basis of significant interaction effect, the researcher selected pH-sugar combination
and type of acid which gave best quality of beverage based on flavour and body and texture score.
The mathematical model used can be given as under:

Xijkl = µ + Si + Pj + Ak + (SP)ij + (SA)ik + (PA)jk + (SPA)ijk + eijkl

Where, µ is general mean, Si is effect of ith sugar level, Pj is effect of jth pH level and Ak is effect of
kth type of acid and (SP)ij, (SA)ik and (PA)jk are two factor interaction effects and (SPA)ijk is three
factor interaction effect, and eijkl is the random error component assumed to be independently normally
distributed with zero mean and constant variance σe².

In factorial experiments, as the number of levels of each factor increases, the number of
experimental runs required becomes unreasonably large. For this reason, fractional
replication of factorial experiments is generally recommended for designs with four or more
levels. With an increase in the number of factors, the factorial design also becomes more
complicated and involves a large number of interaction effects, and interpretation of the results becomes
quite cumbersome. In such cases, the higher order interactions which are irrelevant and non-
significant may be ignored and their sums of squares and degrees of freedom pooled with the error
component.

9.0 Conclusion

Although an exposition of a number of statistical procedures has been given here, it is not
complete in all respects. However, depending upon the objective of the
investigation and the type of data, the right statistical procedures can be chosen for
meaningful analysis of data. Moreover with the availability of powerful computers and
statistical software packages, the problem of data analysis has become quite easy. The
mathematical modelling based on complete biochemical profile is also being considered as a
suitable alternative. The computer manipulation of a large database on the characteristics of
the organisms has been successfully used to accurately identify different microorganisms.
One such application has been the commercially available tests for Enterobacteriaceae. This
has been utilized to biochemically analyze the given microorganism and create a database on
the test results, which are integrated with computerized mathematical modelling for rapid
interpretation of results. Predictive microbiological computer modelling is another area
which enables predictions to be made, even to determine the influence of a single factor or a
combination of factors.

References

Bryant, E.C. 1966. Statistical Analysis. McGraw Hill, NY.


Cochran, W.G. and Cox, G.M. 1957. Experimental Designs. John Wiley & Sons Inc., NY.
Goulden, C.H. 1959. Methods of Statistical Analysis. Asia Publishing House, Bombay.
Moroney, M.J. 1971. Facts from Figures. Penguin Books, U.K.
Snedecor, G.W. and Cochran, W.G. 1967. Statistical Methods. Oxford & IBH Pub. Co., New
Delhi.
SORPTION ISOTHERMS AND GENERATION OF
SORPTION PARAMETERS

G. R. Patil & R. R. B. Singh,


Dairy Technology Division,
NDRI, KARNAL

1.0 Introduction
Water is an integral part of all food systems. It determines behavior of food
products during many processing operations and significantly affects the quality of food.
An understanding of the state of water in foods that is characterized by water activity aw is
therefore essential to control and optimize various physical, chemical and microbial changes
in food systems. Determination of sorption isotherms thus, has several applications in food
science.
In mixing operations and development of a new product formulation, sorption
isotherm data of each component will help to predict transfer of moisture from one product
to another, which is essential for controlling the deterioration of the final product.
Determination of enthalpy of sorption and desorption of water at two different temperatures
gives an indication of binding strength of water molecules to the solid and has definite
bearing on the energy balance during drying and freezing operations. The sorption isotherm is
also important in packaging operations, as knowledge of the initial and maximum allowable
moisture content and aw, along with the surface area and permeability of the packaging material,
will help in determining the shelf life of the packaged foods under varying conditions of
storage.

2.0 Methods for Determination of Sorption Isotherms


Methods that have been developed for determining sorption isotherms can be
broadly classified under two heads: A. gravimetric methods; B. manometric and
hygrometric methods. Although several innovations have been tried in both groups of
methods to improve the rapidity and accuracy of measurement, the following gravimetric
method has been recommended by the COST projects 90 and 90-bis on physical properties
of foodstuffs (COST = Co-operation in the field of Scientific and Technical Research in
Europe) and remains by far the most widely used and reliable method of determining
sorption isotherms.

2.1 Principle of Measurement


The principle underlying the method of measurement is that food product is exposed
to a controlled environment of relative humidity at defined temperature condition. The
weight of the sample is monitored at definite intervals till a time there is no change in
weight as the food attains equilibrium with the environment. Such determinations at several
relative humidity (RH) conditions will yield a sorption isotherm.
2.2 Design of the Sorption Apparatus
The equipment which has been recommended consists of a simple arrangement
comprising glass jars as sorbostats with vapour-tight lids. The sorbate source in sufficient
quantity is placed in the jar to maintain a large sorbate-to-sample ratio. The substance is
placed in small weighing bottles standing on trivets directly above the sorbate source. The
jars are then placed in thermostatically controlled incubators or water baths maintained at
predetermined temperatures.

Fig. 1. Sorption apparatus: 1. Sorption container; 2. Weighing bottle with ground-in stopper;
3. Petri dish on trivets; 4. Saturated salt solution

2.3 Sorbate Sources for Creating Constant ERH


2.3.1 Sulphuric Acid Solutions of Varying Concentrations: Depending on the
concentration, sulphuric acid solutions will have varying water vapour pressure and
the ERH in the headspace will accordingly change. The major limitation of using
H2SO4, however, remains the change in concentration due to loss or gain of moisture,
thereby altering the ERH conditions. The following table gives the aw of sulfuric acid
solutions at different temperature.

Table 1. Water activity of sulfuric acid solutions at different concentrations and temperatures

H2SO4 (%)  Density at 25oC (g/cm3)            Temperature (oC)
                                     5      10      20      25      30      40      50
5 1.0300 0.9803 0.9804 0.9806 0.9807 0.9808 0.9811 0.9814
10 1.0640 0.9554 0.9555 0.9558 0.9560 0.9562 0.9565 0.9570
15 1.0994 0.9227 0.9230 0.9237 0.9241 0.9245 0.9253 0.9261
20 1.1365 0.8771 0.8779 0.8796 0.8805 0.8814 0.8831 0.8848
25 1.1750 0.8165 0.8183 0.8218 0.8235 0.8252 0.8285 0.8317
30 1.2150 0.7396 0.7429 0.7491 0.7521 0.7549 0.7604 0.7655
35 1.2563 0.6464 0.6514 0.6607 0.6651 0.6693 0.6773 0.6846
40 1.2991 0.5417 0.5480 0.5599 0.5656 0.5711 0.5816 0.5914
45 1.3437 0.4319 0.4389 0.4524 0.4589 0.4653 0.4775 0.4891
50 1.3911 0.3238 0.3307 0.3442 0.3509 0.3574 0.3702 0.3827
55 1.4412 0.2255 0.2317 0.2440 0.2502 0.2563 0.2685 0.2807
60 1.4940 0.1420 0.1471 0.1573 0.1625 0.1677 0.1781 0.1887
65 1.5490 0.0785 0.0821 0.0895 0.0933 0.0972 0.1052 0.1135
70 1.6059 0.0355 0.0377 0.0422 0.0445 0.0470 0.0521 0.0575
75 1.6644 0.0131 0.0142 0.0165 0.0177 0.0190 0.0218 0.0249
80 1.7221 0.0035 0.0039 0.0048 0.0053 0.0059 0.0071 0.0085

Source: Rao and Rizvi, 1986

2.3.2 Glycerol Solutions: Glycerol solutions of varying concentrations (adjusted with


water) can also be used for creating constant ERH conditions. The difficulty in using

glycerol solutions, however, arises from the fact that glycerol can volatilise and absorb into
the foods, thereby causing error. Unlike H2SO4, it is non-corrosive, but it gets diluted or
concentrated during sorption due to loss or gain of moisture from the sample. Table 2 below
gives aw of glycerol solutions.

Table 2. Water activity of glycerol solutions at 20oC

Concentration (kg/L)    Refractive index    Water activity
- 1.3463 0.98
- 1.3560 0.96
0.2315 1.3602 0.95
0.3789 1.3773 0.90
0.4973 1.3905 0.85
0.5923 1.4015 0.80
0.6751 1.4109 0.75
0.7474 1.4191 0.70
0.8139 1.4264 0.65
0.8739 1.4329 0.60
0.9285 1.4387 0.55
0.9760 1.4440 0.50
- 1.4529 0.40
Source: Rao and Rizvi, 1986

2.3.3 Salt Slurries: Saturated slurries of various inorganic and organic salts produce
constant ERH in the headspace of sorption container. The ERH decreases with increasing
temperature due to increased solubility of salts with increasing temperatures. The Table 3
below gives aw of different salt slurries at varying temperatures.

Table 3. Water activities of different salt slurries at various temperatures

Salt                            Temperature (oC)
                        5      10      20      25      30      40      50
Lithium chloride 0.113 0.113 0.113 0.113 0.113 0.112 0.111
Potassium acetate - 0.234 0.231 0.225 0.216 - -
Magnesium chloride 0.336 0.335 0.331 0.328 0.324 0.316 0.305
Potassium carbonate 0.431 0.431 0.432 0.432 0.432 - -
Magnesium nitrate 0.589 0.574 0.544 0.529 0.514 0.484 0.454
Potassium iodide 0.733 0.721 0.699 0.689 0.679 0.661 0.645
Sodium chloride 0.757 0.757 0.755 0.753 0.751 0.747 0.744
Ammonium sulfate 0.824 0.821 0.813 0.810 0.806 0.799 0.792
Potassium chloride 0.877 0.868 0.851 0.843 0.836 0.823 0.812
Potassium nitrate 0.963 0.960 0.946 0.936 0.923 0.891 0.848
Potassium sulfate 0.985 0.982 0.976 0.970 0.970 0.964 0.958
Source: Rao and Rizvi, 1986

2.3.3.1 Preparation of salt slurries: Table 4 gives the proportions of different salts to
water for preparing saturated slurries.

Table 4. Preparation of recommended saturated salt solutions at 25 oC

Salt          RH (%)    Salt (g)   Water (mL)
LiCl 11.15 150 85
CH3COOK 22.60 200 65
MgCl2 32.73 200 25
K2CO3 43.80 200 90
Mg(NO3)2 52.86 200 30
NaBr 57.70 200 80
SrCl2 70.83 200 50
NaCl 75.32 200 60
KCl 84.32 200 80
BaCl2 90.26 250 70
Source: Spiess and Wolf, 1987

The equations in Table 5 can be used for predicting the aw of the listed salt slurries
at any temperature; a small computational sketch follows the table.

Table 5. Regression equations of water activity of selected salt solutions at different temperatures

Salt Equation R2
LiCl Ln aw=(500.95/T)-3.85 0.976
KC2H3O2 Ln aw=(861.39/T)-4.33 0.965
MgCl2 Ln aw=(303.35/T)-2.13 0.995
K2CO3 Ln aw=(145.0/T)-1.3 0.967
Mg(NO3)2 Ln aw=(356.6/T)-1.82 0.987
NaNO2 Ln aw=(435.96/T)-1.88 0.974
NaCl Ln aw=(228.92/T)-1.04 0.961
KCl Ln aw=(367.58/T)-1.39 0.967
Temperature ‘T’ in Kelvin
Source: Rao and Rizvi, 1986
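As a minimal computational sketch (not part of the original text), the regression constants of Table 5 can be coded directly to estimate the aw of a saturated salt slurry at any storage temperature; the salt names and constants below are those of Table 5, while everything else is illustrative.

# Illustrative sketch: estimating the water activity of a saturated salt slurry
# at any temperature from the regression equations of Table 5,
# ln aw = (A/T) - B with T in Kelvin (Source: Rao and Rizvi, 1986).
import math

SALT_CONSTANTS = {          # salt: (A, B) as listed in Table 5
    "LiCl": (500.95, 3.85),
    "KC2H3O2": (861.39, 4.33),
    "MgCl2": (303.35, 2.13),
    "K2CO3": (145.0, 1.30),
    "Mg(NO3)2": (356.6, 1.82),
    "NaNO2": (435.96, 1.88),
    "NaCl": (228.92, 1.04),
    "KCl": (367.58, 1.39),
}

def salt_slurry_aw(salt: str, temp_c: float) -> float:
    """Predicted aw of a saturated slurry of 'salt' at temp_c (degrees C)."""
    a, b = SALT_CONSTANTS[salt]
    return math.exp(a / (temp_c + 273.15) - b)

if __name__ == "__main__":
    for salt in ("NaCl", "KCl", "MgCl2"):
        # Values agree closely with Table 3 (e.g. NaCl ~ 0.75 at 25 degC)
        print(salt, round(salt_slurry_aw(salt, 25.0), 3))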

While preparing salt slurries, the following care must be taken to improve precision
of measurement:

- Only AR grade salts should be used.
- Salt crystals in excess should be present at the bottom.
- Before the samples are placed, the containers with slurries should be maintained at the required temperatures for 3-4 days for allowing equilibration.
- The ratio of slurry surface to sample surface should preferably be >10:1.
- The ratio of air volume to sample volume should be 20:1.
- The salt slurries should be occasionally stirred to prevent change in concentration of the top liquid layer due to loss or gain of moisture from the sample.

Precautions:

- Some salts are caustic: potassium dichromate, potassium chloride
- Some salts are highly toxic: lithium chloride, sodium nitrite
- Alkaline solutions such as K2CO3 absorb large amounts of CO2 with time, thereby decreasing aw significantly

3.0 Standardization of Sorption Apparatus with Reference Materials


The recommended material for this purpose is microcrystalline cellulose (MCC).
This material is very stable against changes in its sorption behaviour and can be used even
after 2 to 3 repeated adsorption and desorption cycles. It does not exhibit hysteresis between
adsorption and desorption and requires very short periods for reaching equilibrium.

4.0 Preparation of Samples:


The test substrate should be prepared in a way that ensures homogeneity, so that the
sample drawn for sorption studies is representative of the bulk. The sample size should
normally be 1 g, and at least three replications should be used to minimize error in the study.
For adsorption isotherms, samples should be vacuum dried, preferably at 30°C for 30-40 h,
followed by freeze drying and desiccant drying to reduce the moisture to a level lower than
that corresponding to the lowest water activity of the saturated salts being used. Once the
samples have been weighed accurately and placed in the sorption jar, weighing should
follow at regular intervals until the sample reaches equilibrium, i.e., the weight in three
successive weighings does not change by more than 2 mg per g of sample.

5.0 Types of Moisture Sorption Curves


The sorption isotherms are obtained by drawing a plot of moisture (g/100 g of
sample, db) vs water activity. The isotherms thus obtained could be classified according to
the following five general types.

[Figure: plots of moisture vs. aw for the five isotherm types - Type I (Langmuir; anticaking agents), Type II (sigmoid; most foods), Type III (sugars), Type IV and Type V.]

Fig. 2. The five types of van der Waals adsorption isotherms (Source: Bhandari, 2002)

6.0 Isotherm Models
Over the years, a large number of isotherm models have been proposed and tested
for food materials. These can be categorized as two, three or four parameter models. Some
of the most commonly used models are presented hereunder:

6.1 Two Parameter Models

1. Oswin:         W = a [aw / (1 - aw)]^b

2. Caurie:        ln(1/W) = ln(1/(C.W0)) + (2C/W0) ln[(1 - aw)/aw]

3. Halsey:        aw = exp(- a / W^b)

4. BET equation:  W = (W0 B aw) / [(1 - aw)(1 + (B - 1) aw)]

6.2 Three Parameter Models

1. GAB:               W = (W0 G k aw) / [(1 - k aw)(1 - k aw + G k aw)]

2. Modified Mizrahi:  W = [a + aw (c.aw + b)] / (aw - 1)

Where,
W = Equilibrium moisture content, g/100 g solids
W0 = Moisture content equivalent to the monolayer
aw = water activity
a, b = Constants
B = Constant
C = Density of sorbed water
G = Guggenheim constant
k = Correction factor for properties of multilayer molecules with respect to the bulk liquid

Of these, the GAB model has been found to be the most appropriate for describing the
sorption behaviour of food systems over a wide range of water activity. Both the BET
(application range aw = 0.05-0.45) and GAB equations can be used for obtaining the
monolayer moisture that is critical for the quality and shelf life of foods.
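As an illustration of how the GAB parameters (including the monolayer moisture W0) can be extracted from equilibrium sorption data, the following hedged sketch uses non-linear least squares from SciPy; the (aw, W) values and the initial guesses are hypothetical.

# Minimal sketch (hypothetical data): fitting the GAB model
#   W = W0*G*k*aw / ((1 - k*aw) * (1 - k*aw + G*k*aw))
# to experimental (aw, W) pairs by non-linear least squares.
import numpy as np
from scipy.optimize import curve_fit

def gab(aw, w0, g, k):
    """GAB equation: equilibrium moisture content (g/100 g solids) vs. aw."""
    return w0 * g * k * aw / ((1 - k * aw) * (1 - k * aw + g * k * aw))

# Hypothetical equilibrium data from a salt-slurry sorption experiment
aw_exp = np.array([0.11, 0.23, 0.33, 0.44, 0.53, 0.68, 0.75, 0.84])
w_exp = np.array([2.1, 3.4, 4.3, 5.4, 6.6, 9.5, 11.8, 16.0])

# Initial guesses: monolayer ~5 g/100 g solids, G ~10, k ~0.9
(w0, g, k), _ = curve_fit(gab, aw_exp, w_exp, p0=[5.0, 10.0, 0.9], maxfev=10000)
print(f"W0 = {w0:.2f} g/100 g solids, G = {g:.2f}, k = {k:.3f}")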

The Caurie plot of ln[(1 - aw)/aw] vs. ln(1/W) over the given aw range can be used to
obtain the Caurie slope (S), from which the number of adsorbed monolayers (N) may be
obtained.

Density of bound water is represented by 'C' in the Caurie equation, and percent
bound water (non-freezable water) is the product of the monolayer value W0 in the equation
and the number of adsorbed monolayers. The surface area of adsorption is determined by the
formula

A = 54.54 / S

The accuracy of fit of these equations can be evaluated by calculating the root mean
square percent error (RMS %) and the residuals (R*):

RMS % = sqrt[ (1/n) SUM { (wEXP - wCAL) / wEXP }^2 ] x 100

R* = [(ue - up) / ue] x 100

where wEXP and wCAL are the experimental and calculated moisture contents, and ue and
up denote the experimental and predicted values, respectively.
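A short sketch of these two goodness-of-fit statistics, assuming wEXP/wCAL (and ue/up) are the experimental and model-predicted values; the numbers in the example calls are invented.

# Sketch of the fit statistics defined above:
#   RMS %: root mean square of the relative deviations, expressed as a percentage
#   R*   : percentage residual between experimental (ue) and predicted (up) values
import numpy as np

def rms_percent(w_exp, w_cal):
    w_exp, w_cal = np.asarray(w_exp, float), np.asarray(w_cal, float)
    return np.sqrt(np.mean(((w_exp - w_cal) / w_exp) ** 2)) * 100.0

def residual_percent(u_exp, u_pred):
    u_exp, u_pred = np.asarray(u_exp, float), np.asarray(u_pred, float)
    return (u_exp - u_pred) / u_exp * 100.0

print(rms_percent([2.1, 3.4, 4.3], [2.0, 3.6, 4.2]))      # overall % error of the fit
print(residual_percent([2.1, 3.4, 4.3], [2.0, 3.6, 4.2])) # per-point residuals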

7.0 Effect of Temperature on Water Activity

Knowledge of the temperature dependence of sorption phenomena provides valuable
information about changes related to the energetics of the system. The shift in water
activity as a function of temperature at constant moisture content is due to changes in water
binding, dissociation of water or increased solubility of solutes in water. At constant water
activity, most foods hold less water at higher temperatures. The constant in moisture
sorption isotherm equations that represents temperature, or a function of temperature, is
used to calculate the temperature dependence of water activity. The Clausius-Clapeyron
equation is often used to predict aw at any temperature if the isosteric heat and the aw value
at one temperature are known. The equation for water vapour in terms of the net isosteric
heat (Qst) is given by:

ln(aw1 / aw2) = -(Qst / R) [(1/T1) - (1/T2)]

Where Qst is the net isosteric heat of sorption, or excess binding energy for the removal
of water, also called the excess heat of sorption.
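A minimal sketch of this prediction, assuming Qst is expressed in J/mol and that aw1 at temperature T1 is known; the numerical values used are hypothetical.

# Minimal sketch (hypothetical values): predicting aw at a second temperature with
# the Clausius-Clapeyron relation, ln(aw1/aw2) = -(Qst/R)(1/T1 - 1/T2), i.e.
# aw2 = aw1 * exp((Qst/R) * (1/T1 - 1/T2)), with Qst in J/mol and T in Kelvin.
import math

R = 8.314  # J/(mol K)

def aw_at_second_temperature(aw1, t1_c, t2_c, qst):
    t1, t2 = t1_c + 273.15, t2_c + 273.15
    return aw1 * math.exp((qst / R) * (1.0 / t1 - 1.0 / t2))

# Hypothetical example: aw = 0.45 at 25 degC with Qst = 5 kJ/mol, predicted at 40 degC
print(round(aw_at_second_temperature(0.45, 25.0, 40.0, 5000.0), 3))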

[Figure: net isosteric heat of desorption (kJ/kg) plotted against moisture content (% db).]

Figure 3. Typical diagram showing net isosteric heat of sorption

8.0 Hysteresis in Adsorption–Desorption Isotherms

When adsorption and desorption isotherms for the same food material are plotted on
the same graph, usually the desorption isotherm lies above the adsorption isotherm and
sorption hysteresis loop is formed. Moisture sorption hysteresis has both theoretical and
practical implications. The theoretical implications include considerations of the
irreversibility of the sorption process and also the question of validity of thermodynamic
functions derived therefrom. The practical implications concern the effects
of hysteresis on chemical and microbiological deterioration in processed foods intended for
prolonged storage. The hysteresis property of foods is generally affected by the
composition, temperature, storage time, drying temperature, and number of successive
adsorption and desorption cycles. Several theories have been proposed to explain hysteresis
phenomenon in foods. A typical diagram showing hysteresis in sorption isotherm is given
below:

[Figure: adsorption and desorption isotherms (moisture vs. water activity), with the desorption curve lying above the adsorption curve.]

Figure 4. Typical diagram showing hysteresis phenomenon

References

Bhandari, B.R. (2002)

Kapsalis, J.G. (1981) Moisture sorption hysteresis. In : water activity: influences on food
quality (eds. L. B. Rockland and G. F. Stewart), Academic Press. Inc., New
York, USA, pp. 143.

Rizvi, S.S. H. (1986) Thermodynamic properties of foods in dehydration. In : Engineering


properties of foods (eds. M. A. Rao and S. S. H. Rizvi), Marcel Dekker, Inc.,
New York, USA, pp. 133

Spiess, W.E.L. and Wolf, W. (1987). Critical evaluation of methods to determine moisture
sorption isotherms. In: Water activity: Theory and applications to foods
(eds. L.B. Rockland and L.R. Beuchat.), Marcel Dekker, Inc., New York,
USA, pp. 215.

Wolf, W., Spiess, W.E.L. and Jung, G. (1985) Standardization of isotherm measurement
(COST- Project 90 and 90-bis). In: Properties of water in foods (eds. D.
Simatos and J. L. Multon), Martin Nijhoff Publishers, Dordrecht, The
Netherlands, pp. 661.

REACTION KINETICS AND MODELLING FOR PREDICTION OF
SHELF LIFE OF FOODS

Dr. G.R. Patil


Head
DT Division
NDRI, Karnal-132001

1.0 Introduction
Shelf life is an important feature of all foods and is a matter of concern to all,
including food manufacturers, wholesalers, retailers and consumers. The shelf life of a
food may be defined as the time between production and packaging of the product and the
point at which it becomes unacceptable under defined environmental conditions.

Prior to determining the shelf life of a food, it is essential to establish which factors
limit its shelf life. The factors which affect the shelf life of food can be divided
according to whether they are causes or effects. The latter may be determined by
laboratory tests when the shelf life is evaluated and can be grouped into four sub-categories:

1.1 Physical

Changes in colour, size & shape, texture and structure of foods.

1.2 Chemical

Lipid oxidation, Lipid hydrolysis, acidity, maillard reaction, loss of water, nutrients
etc.

1.3 Microbial

Change in total viable count, coliform, yeast and mold count, growth of pathogens
such as Salmonella spp., E.coli, Staphylococcus areus, Clostridium botulinum etc.

1.4 Sensory

Changes in flavour, texture and appearance.

However, a more useful approach in evaluating these factors is to consider the causes,
which can further be grouped into intrinsic and extrinsic factors (Lewis and Dale, 1994;
Eburne and Prentice, 1994).

2.0 Intrinsic Factors


2.1 Raw Materials

The importance of the quality and consistency of the raw materials in maintaining
the shelf life of foods cannot be overemphasised. Variation in the source and supply of raw
materials will result in variation of the factors which influence shelf life. The quality of the
raw materials is crucial if the assigned shelf life is to be met the first time and every time.
Generally, all raw materials should be handled according to good manufacturing practice and,
if necessary, any sensitive ingredients should be decontaminated before use.

2.2 Composition and Formulation

This is perhaps the most important factor that influences the shelf life of foods,
because the exact composition and formulation of a product have a profound influence on
the changes that may occur during its storage. These changes can be microbiological,
physical and chemical/biochemical in nature and are intimately linked to the shelf life of
the product. The water content, fat content, the type and state of fat, pH, Eh, salt and sugar
content, etc. of a food profoundly influence both biochemical and microbial changes.

The presence of preservatives (either naturally present or added), and the synergistic
effect of organic acids, pH, salt and sugar on microbial stability, also play a decisive role in
the shelf life of foods.

2.3 Initial Microflora

The type and number of initial microflora associated with raw material before food
formulation has a great influence on shelf life of the product.

3.0 Extrinsic Factors

3.1 Processing

Thermal processing of foods at various time-temperature combinations, lowering of
pH by fermentation or by the addition of organic acids, lowering of water activity by
dehydration or the addition of salt or sugar, etc. are processing techniques aimed at
improving the microbiological stability of foods. An understanding of the synergistic and
antagonistic reactions that take place with different processing methods when a variety of
functional ingredients are used is necessary, as these may have an important influence on
the stability of the final product.

3.2 Plant Hygiene

It is necessary to ensure that plant and machines are capable of being cleaned to a
satisfactory standard and if there are potential problems they must be identified and resolved
as early as possible. Regular environmental monitoring (e.g. using air sampling, swabbing of
machine parts and surfaces and so on) during factory trials can provide early warning of such
problems.

3.3 Packaging

Packaging, both in design and material terms, can have a major influence on the shelf
life of the product. Packaging fulfils a number of functions, including protection of the
product from physical damage and from changes in the environment, and can therefore
influence the shelf life of the product in a number of ways. The choice of packaging material
is an important factor, as it influences the water vapour transmission rate, oxygen
transmission rate, temperature of heat processing, etc., thereby affecting the
microbiological, biochemical and sensory changes in the product during storage.

3.4 Storage and Distribution

All food, including dairy products, may spoil during storage and distribution due to
(1) growth of microorganism (2) biochemical and chemical changes such as oxidation,
rancidity, moisture migration, etc. (3) attack by vermin, birds or other pests. The
microbiological or chemical changes are greatly reduced by lower storage temperature.
Effective temperature control during storage and distribution is, therefore, a critical factor in
preventing the development of spoilage organisms and ensuring that dairy products achieve
their potential shelf-life, as well as minimising any food poisoning risk. The other major
factor that can influence shelf life during the storage and distribution operation is light. This
is most likely to be a critical factor in the display cabinet.

3.5 Consumer Storage

The shelf life of a product mainly depends upon maintaining the integrity of "cold
chain". Therefore, full account must be given for the potential abuse of the product by the
retailer and consumer.

4.0 Evaluation of Shelf-Life


A common practice employed to evaluate the shelf life of a given food product is to
determine changes in selected quality characteristics over a period of time. Empirical
techniques such as sensory evaluation, and analytical techniques such as microbiological
analysis or determination of chemical parameters, may be used to quantify the quality
attributes of the food. Shelf life failure is often identified by the "Just Noticeable Difference"
(JND), i.e. the earliest time when a difference between the quality of test and control samples
can be detected by a trained sensory panel (Van Arsdel et al., 1969).

As higher storage temperatures lead to increased quality deterioration, attempts have
been made in the past to use mathematical models to describe changes in food quality as
influenced by storage temperature. Use of chemical kinetics to model changes in food
quality and Arrhenius relationship to describe the influence of temperature on the reaction
rate constant has been suggested by many workers (Kwolek and Bookwalter, 1971; Saguy
and Karel, 1980; Lai and Heldman, 1982). A computer aided method to simulate changes in
food quality during storage of frozen foods was used by Singh (1976).

5.0 Reaction Kinetics


Chemical kinetics involves the study of the rates and mechanisms by which one
chemical species converts to another. The rates of reactions are determined by monitoring
the concentration of either the reactants or the products of the reactions. To analyse general
quality changes in foods, the following approach is commonly used (Singh, 1994).

A general rate expression for quality attribute and may be written as follows

dQ/dt = ± k Q^n                                    (1)

Where ± refers to either a decreasing or an increasing value of the attribute Q, k is the rate
constant and n is the observed order of reaction. It is assumed that environmental factors
such as temperature, humidity and light, and the concentrations of other components, are
kept constant.

5.1 Zero Order Reaction

Consider a quality attribute Q that decreases linearly during the storage period, implying
that the rate of loss of the quality attribute is constant throughout the storage period and
does not depend on the concentration of Q. This linear relationship between quality attribute
and storage time represents a zero order reaction; therefore, substituting n = 0 in equation (1)
we get

dQ/dt = - k                                        (2)

Equation (2) may be integrated to obtain

Q = Q0 - kt                                        (3)

Where Q0 represents the initial value of the quality attribute and Q is the amount of that
attribute left after time t.

If the end of shelf life, ts, is marked by the quality attribute reaching a certain level, say
Qe, then

Qe = Q0 - k ts                                     (4)

Therefore, the shelf life, ts, may be calculated as

ts = (Q0 - Qe) / k                                 (5)
The zero order rate equation (2) is useful in describing reactions such as enzymatic
degradation, non-enzymatic browning and lipid oxidation.
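A small worked sketch of Eq. (5), with hypothetical numbers: a quality score that falls linearly from 9.0 to an end-point of 6.0 at 0.05 units per day gives a shelf life of 60 days.

# Worked sketch of Eq. (5) for a zero order loss (hypothetical numbers).
def zero_order_shelf_life(q0, qe, k):
    """Shelf life ts = (Q0 - Qe)/k; the time units follow from the units of k."""
    return (q0 - qe) / k

# Colour score falling from 9.0 to the acceptability limit of 6.0 at 0.05 units/day
print(zero_order_shelf_life(9.0, 6.0, 0.05))  # 60.0 days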

5.2 First Order Reaction

Consider a quality attribute Q that decreases in an exponential manner with storage
time. The rate of loss of the quality attribute is dependent on the amount of quality attribute
remaining; this implies that as time proceeds and the quality attribute decreases, so does the
rate of reaction. This exponential relationship between quality attribute and time represents a
first order reaction (n = 1), and equation (1) is modified as follows:

dQ/dt = - k Q                                      (6)
by integration, we get

ln(Q / Q0) = - kt                                  (7)

Where Q is the amount of quality attribute left at time 't'.

At the end of shelf life, ts, for a certain final level of quality attribute Qe, we can also
write equation (7) as:

ln(Qe / Q0) = - k ts                               (8)

or

ts = - ln(Qe / Q0) / k                             (9)

The types of food deterioration reactions that show first order losses include vitamin
and protein losses and microbial growth.

Most reactions showing losses in food quality may be described by zero or first order
kinetics; however, there are some studies in the literature that indicate the use of other orders
(Singh and Heldman, 1976; Jayraj Rao, 1993).
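A corresponding sketch for the first order case of Eq. (9); the retention limit and rate constant are hypothetical.

# Worked sketch of Eq. (9) for a first order loss (hypothetical numbers).
import math

def first_order_shelf_life(q0, qe, k):
    """Shelf life ts = -ln(Qe/Q0)/k = ln(Q0/Qe)/k."""
    return math.log(q0 / qe) / k

# A vitamin retained down to 50% of its initial level with k = 0.01 per day
print(round(first_order_shelf_life(100.0, 50.0, 0.01), 1))  # ~69.3 days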

6.0 Temperature Effects


The influence of temperature on the reaction rate may be described by using the
Arrhenius relationship, as follows:
k = ko exp(- Ea / RT)                              (10)

Where ko is the pre-exponential factor, Ea is the activation energy, R is the ideal gas
constant and T is the absolute temperature.

Another parameter that is often used in the literature to describe the relationship
between temperature and the reaction rate constant is the Q10 value. Q10 is defined as
follows:

Q10 = (reaction rate at temperature (T+10)°C) / (reaction rate at temperature T°C)      (11)
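The two temperature parameters are easily related in code: the sketch below evaluates Eq. (10) and then Eq. (11) as the ratio of rate constants 10°C apart (the Arrhenius parameters chosen are hypothetical).

# Sketch linking Eq. (10) and Eq. (11): the Arrhenius rate constant at T and T+10
# degC, and their ratio Q10 (hypothetical Arrhenius parameters).
import math

R = 8.314  # J/(mol K)

def arrhenius_k(k0, ea, temp_c):
    """Rate constant from Eq. (10); Ea in J/mol, temperature in degrees C."""
    return k0 * math.exp(-ea / (R * (temp_c + 273.15)))

def q10(k0, ea, temp_c):
    """Eq. (11): ratio of the rates at (T+10) degC and T degC."""
    return arrhenius_k(k0, ea, temp_c + 10.0) / arrhenius_k(k0, ea, temp_c)

# Ea = 80 kJ/mol gives a Q10 of about 2.8 around 25 degC (k0 cancels in the ratio)
print(round(q10(1.0e12, 80000.0, 25.0), 2))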
7.0 Development of Comprehensive Shelf Life Prediction Model
Shelf life of food product is the period of storage during which the organoleptic
quality remains suitable for consumption. The sensory properties of the product during
storage are in turn, governed by progression of a host of deteriorative reactions. These
reactions may induce changes of chemical, biochemical and physical nature. The multiplicity
and extent of these changes determine the period up to which the product may remain
acceptable to consumers. The rates of these changes are, in turn, influenced by the temperature
of storage. For prediction of the shelf life of any food product, therefore, information relating
to the physico-chemical parameters affecting sensory quality, as well as their dependence on
storage temperature and time, must be known and integrated into a model. Such comprehensive
models have been reported by Jayraj Rao (1993) and Ananthanarayanan et al., (1993).

Most of the changes occurring in the product can be measured by objective tests and
interpreted in quantifiable terms. Relationships between these changes and the sensory score
help in the objective measurement of human sensory perception. Different models, such as
the linear, exponential and power models given below, may be employed to establish the
relationship between sensory score and chemical parameters.

Linear :       S = a + bX                          (12)

Exponential :  S = a exp(bX)                       (13)

Power :        S = a X^b                           (14)

Where, S is sensory score


X is chemical parameter
a and b are coefficients

Sometimes, the sensory score is dependent on more than one chemical parameter,
such as

S = a + b1 X1 + b2 X2                              (15)

Where X1 and X2 are chemical parameters.

The rate of change of chemical parameter (s) during storage can be obtained from
equation (1) for reaction kinetics and dependence and reaction rate constants (k) with
temperature can be worked out using Arrhenius equation (10). The equation relating sensory
data with chemical parameter, equation describing rate of change of chemical parameter, and
equation relating reaction rate with storage temperature can be combined to yield a model
which could predict the sensory quality of product at a given storage time and temperature
condition. Assuming the chemical parameter follows a zero order reaction kinetics and
sensory and chemical parameters are related linearly, the model can be written as:

S = a + b [X0 - (k0 exp(-Ea/RT)) t]

or, solving for the storage time,

t = [X0 - (S - a)/b] / [k0 exp(-Ea/RT)]
Where,

S = Sensory score after storage time t
t = Storage period
X0 = Chemical parameter at t = 0
k0 = Arrhenius constant
Ea = Activation energy
R = Universal gas constant
T = Absolute storage temperature
a and b = Coefficients
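A hedged sketch of this combined model, assuming zero order kinetics for the chemical parameter and a linear sensory-chemical relation as above; every parameter value below is hypothetical and would in practice come from storage trials.

# Hedged sketch of the combined shelf-life model (all parameter values hypothetical):
# zero order loss of the chemical parameter X and a linear sensory-chemical relation,
#   S = a + b*(X0 - k0*exp(-Ea/(R*T))*t)
# inverted to give the time t at which S falls to the acceptability limit Se.
import math

R = 8.314  # J/(mol K)

def rate_constant(k0, ea, temp_c):
    return k0 * math.exp(-ea / (R * (temp_c + 273.15)))

def sensory_score(t_days, temp_c, a, b, x0, k0, ea):
    return a + b * (x0 - rate_constant(k0, ea, temp_c) * t_days)

def shelf_life_days(se, temp_c, a, b, x0, k0, ea):
    return (x0 - (se - a) / b) / rate_constant(k0, ea, temp_c)

# Hypothetical parameters: X is positively related to the sensory score (b > 0)
params = dict(a=0.5, b=1.0, x0=9.0, k0=1.0e11, ea=70000.0)
print(round(shelf_life_days(6.0, 30.0, **params), 1), "days at 30 degC")
print(round(sensory_score(20.0, 30.0, **params), 2), "sensory score after 20 days")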

References

Ananthanarayanan, K.R., Abhay Kumar and Patil, G.R. (1993) Kinetics of various deteriorative changes during
storage of UHT soy beverage and development of a shelf life prediction model. Lebensmittel Wiss., u.
Technol. 26 : 191-197.
Eburne, R.C. and Prentice, G. (1994) Modified atmosphere packed ready-to-eat meat products. In Shelf-life
evaluation of foods (eds. CMD Man and A.A. Jones), Blackie Academic & Professional, Glasgow, UK.
Jayraj Rao, K. (1993) Application of Hurdle technology in development of long life paneer based convenience
food. Ph.D thesis submitted to NDRI Deemed Univ.
Kwolek, W.F. and Bookwalter G.N. (1971) Predicting storage stability from time-temperature data. Food
Technology 25 (10) 1025, 1026, 1029, 1031, 1037.
Lai, D. and Heldman, D.R. (1982) Analysis of kinetics of quality changes in frozen foods. J. Process Engg 6 :
179-200.
Lewis, M. and Dale, R.H. (1994) Chilled Yoghurt and other dairy products. In Shelf-life evaluation of foods
(eds. CMD Man and A.A. Jones) Blackie Academic and Professional, Glasgow, U.K.
Saguy, I. and Karel, M. (1980) Modelling of quality deterioration during food processing and storage. Food
Technol., 34(2), 78-85.
Singh, R.P. (1994) Scientific principles of shelf-life evaluation. In Shelf-life evaluation of foods (eds. CMD
Man and A.A. Jones), Blackie Academic & Professional, Glasgow, U.K.
Singh, R.P. and Heldman, D.R. (1976) Simulation of liquid food quality during storage. Trans. Am. Soc. Agric.
Eng. 19(1): 178-84.
Van Arsdel, W.B., Copley, M.J. and Olson, Q.L. (1969) Quality and Stability of Frozen Foods, Wiley-
Interscience, New York.

e-TONGUE IN MONITORING SENSORY QUALITY OF
FOODS

S. K. Kanawjia and Dr. S. Makhal*


Dairy Technology Division
NDRI, Karnal-132001
*Asstt. Manager (R & D), JKDF, Gajraula

1.0 Introduction
Taste is the most important sensory attribute of any food product, which determines
its acceptability. The senses of taste have always been used in monitoring and judging the
quality of foods. There are several problems associated with human tasters, which include
sensory fatigue, varied perception of the same taste by different people, health risks
associated with tasting certain chemicals, and dependence on human mood and adaptation.
The e-Tongue mimics the biological tongue; it is actually a group of sensor chips
capable of transmitting real-time data to control the quality of a liquid process. When a liquid
flows over this "tongue", its exact chemical makeup can be ascertained and controlled by
computer. This innovation is expected to save several lakhs of rupees in industrial quality
control. The food and beverage industry wants to develop it for rapid testing of new food
products for comparison with a computer library of tastes proven popular with consumers.
The e-Tongue is the most advanced device of its type worldwide and has no analogues.
2.0 Development of e-Tongue
One of the overall goals of NASA's Space Life Sciences Division of Advanced
Human Technology Program (AHST) research project is to understand the principles,
concepts, and science which will enable the development of an integrated, rugged, reliable,
low mass/power, electro-analytical device which can identify and quantitatively determine a
variety of water quality parameters, including inorganics, organics and gases, along with
physical properties such as pH, oxidation-reduction potential and conductivity.
To accomplish these goals a group of scientists in collaboration with the NASA's Jet
Propulsion Laboratory and Thermo Orion Research, undertook the research necessary to
lead to an electrochemically-based integrated array of chemical sensors based on several
novel transduction and fabrication concepts. Even though this type of sensor array might be
thought of as an "Electronic Tongue", it is exceedingly more capable. Working in
conjunction with a neural network, it will provide both qualitative and quantitative
information for a much broader range of components, such as cations, anions, and inorganic
and organic species, than a human tongue ever could. Their work has led to the discovery of
a unique electro-immobilization technique, which imparts special selectivity properties to
each sensor. Unlike previous devices, though, this electrochemically-based sensor will
provide both identification and reliable quantitative data.
3.0 e-Tongue Capability
The researchers designed e-Tongue to be structurally similar to the human tongue,
which has four different kinds of receptors that respond to distinct tastes. The human tongue

creates a pattern in the brain to store and recall the taste of a particular food. e-Tongue is an
analytical instrument comprising an array of chemical sensors with partial specificity (cross-
sensitivity) to different components in solution, and an appropriate method of pattern
recognition and/or multivariate calibration for data processing. It is a new generation
analytical instrument based on an array of non-selective chemical sensors (electrodes) and
pattern recognition methods. Chemical sensors incorporated into the array exhibit high
cross-sensitivity to different components of the analyzed liquids, inorganic and organic,
ionic and non-ionic. Utilization of sensors with high cross-sensitivity, in conjunction with
modern data processing methods, permits multi-component quantitative analysis of liquids
(determination of composition and components), and also recognition (identification,
classification, distinguishing) of complex liquids.
With the miniature sensor array, data streams composed of red, green and blue
(RGB) colour patterns distinctive for the analytes in the solution are rapidly acquired. The
e-Tongue contains tiny beads analogous to taste buds. A small silicon chip with micro beads
arrayed on it, in a fashion similar to the taste buds on the tongue, has been made at the
University of Texas. Each of the beads responds to different analytes, just as the tongue
responds to sweet, sour, salty and bitter. There is potential to make taste buds for almost
any analyte. To build the e-Tongue, the scientists positioned 10 to 100 polymer micro beads
on a silicon chip about one centimetre square. The beads are arranged in tiny pits to
represent taste buds, and each pit is marked with dye to create a red, green and blue (RGB)
colour bar.
The colours change when chemicals are introduced to the e-Tongue. A camera on a
chip connected to a computer then examines the colours and performs a simple RGB
analysis that, in turn, determines what tastes are present. Yellow, for example, would be a
response to high acidity, or a sour taste. The e-Tongue currently uses simple markers to
detect different types of taste: calcium and metal ions for salty, pH levels for sour, and
sugars for sweet.
The e-Tongue features an auto-sampler and integrated software (Giese, 2001). It is
designed to replicate the human taster and consists of an array of chemical sensors, each
with partial specificity to a wide range of non-volatile taste molecules, coupled with a
suitable pattern recognition system. For instance, the Alpha M.O.S. e-Tongue, called the
Astree, is composed of a 16-position auto-sampler, an array of liquid sensors, and an
advanced chemometric software package (Alpha M.O.S., 2001a, b, c). The instrument also
has an option of sample temperature control to ensure analytical reproducibility of
measurements.
4.0 Features of e-Tongue
One of the unique features of the system is the possibility of correlating the output of
the e-Tongue with human perception of taste, odour and flavour, e.g. with food evaluations
made by a trained taster. The typical sensitivity limit of most such sensors for the majority
of components is about several micrograms per litre. Of primary importance are the stability
of sensor behaviour and enhanced cross-sensitivity, which is understood as a reproducible
response of a sensor to as many species as possible. The system responds to a number of
organic and inorganic non-volatile compounds at the ppm level in a liquid environment, and
the response can be highly reproducible.
5.0 Principle of e-Tongue
Each taste sensation may correspond to a fingerprint signal induced by the
differential activation of the various taste receptors. e-Tongue works on this principle. It

works by measuring dissolved compounds and taste substances in liquid samples (Giese,
2001).
It contains four different chemical sensors. The sensors comprise very thin films of
three polymers and a small molecule containing ruthenium ions. These materials are
deposited onto gold electrodes hooked up to an electrical circuit. In a solution of
flavoursome substances, such as sugar, salt, quinine (bitter) and hydrochloric acid (sour), the
thin sensing films absorb the dissolved substances. This alters the electrical behaviour (the
capacitance) of the electrodes in a measurable way. A composite sensor that incorporates all
four therefore produces an electronic fingerprint of the taste. The researchers combine these
responses into a single data point on a graph. The position on the graph reflects the type of
taste: sweet lies towards the top left, for example, sour towards the top right (Riul et al.,
2002).
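The commercial instruments use their own proprietary chemometric software, but the general idea of collapsing an array of sensor responses to a single point on a "taste map" can be illustrated with ordinary principal component analysis; the sensor readings below are invented.

# Rough illustration (invented readings): projecting multi-sensor e-Tongue data onto
# two principal components, so that each sample becomes one point on a "taste map".
# This only mirrors the pattern-recognition idea described above.
import numpy as np
from sklearn.decomposition import PCA

# Rows = samples, columns = capacitance responses of four sensing films (arbitrary units)
readings = np.array([
    [0.82, 0.11, 0.05, 0.30],   # sweet standard
    [0.79, 0.14, 0.07, 0.28],   # sweet sample
    [0.10, 0.75, 0.20, 0.33],   # sour standard
    [0.12, 0.78, 0.18, 0.35],   # sour sample
    [0.15, 0.22, 0.81, 0.40],   # bitter standard
])

scores = PCA(n_components=2).fit_transform(readings)
for label, (pc1, pc2) in zip(["sweet-1", "sweet-2", "sour-1", "sour-2", "bitter-1"], scores):
    print(f"{label}: PC1 = {pc1:+.2f}, PC2 = {pc2:+.2f}")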
6.0 Advantages of e-Tongue
The typical analysis time using the e-Tongue is about 3 min from when the
sensors are introduced into the beaker containing the sample; analysis and sensor cleaning
together take only about 5 minutes. The instrument has been shown to be so sensitive that it
can respond to levels as low as 1-2 molar of sucrose, caffeine, salt (NaCl), sour (HCl) and
umami (MSG) (Tan et al., 2001).
The e-Tongue would be advantageous for analyzing the taste of toxic substances
which humans dare not taste. Successful application of the e-Tongue may offer on-line
monitoring and documentation of taste, thereby permitting better control of product and
process. The e-Tongue may therefore help the food processor to reduce wastage from poorly
controlled processes and increase productivity. Since the e-Tongue readily lends itself to
automation and computerization, monitoring of taste quality can be incorporated into the
manufacturing process. Another advantage is its versatility: e-Tongues are being developed
ranging from small, inexpensive, handheld devices for the periodic taste analysis of goods
for household purposes to sophisticated devices for continuous, on-line monitoring of taste
quality. Eventually, e-Tongues may be inexpensive disposable units, placed on a roll of tape
to be used quickly and easily. The e-Tongue permits many sorts of diverse samples to be
examined and, once a protocol has been established, the instrument does not require highly
skilled operators.
7.0 Training of e-Tongue
The e-Tongue, like a human being, needs to be trained with a correctly selected sample
set to ensure good recognition and reproducibility. The e-Tongue is, in fact, a black box;
it knows nothing until it is taught. Alpha M.O.S. (2000) suggested the procedure for
training, model building and validation of the instrument.

8.0 Operation of an e-Tongue


The auto-sampler allows 15 samples to be evaluated automatically once the
samples have been prepared. Preparation of samples typically involves filling the 100 ml
beakers to three-fourths full; no other sample preparation is required. One beaker position
is reserved for cleaning the sensor array following analysis of each individual sample. The
auto-sampler also includes fluidic pumps for cleaning out the beaker for sensor rinsing when
needed. Cooling to 2 to 4°C also ensures that there is limited sample change during the
analysis cycle. Analysis of a sample is followed by a wash cycle to ensure that there is no
carryover of sample to the next analysis and also to ensure good reproducibility. Typically,
up to five replicate measurements are made for each sample.
9.0 Correlation between e-Tongue Output and Human Perception
A good agreement was observed for coffee, wine and soft drinks. That is why
"artificial tasting" of beverages and foodstuffs based on sensor arrays and multivariate data
processing seems to be a highly interesting emerging field. The performance of e-Tongue is
presented in Table-1.
Table-1. Performance of e-Tongue

Attribute                        Qualitative Analysis                Quantitative Performance
Typical sensor array size        20-40 sensors                       10-30 sensors
Typical number of
measuring sessions               4-8                                 12-50
Number of measurements
within each measuring session    3                                   3-10
Examples                         Discrimination of different types   Classification of different coffees
                                 of beverages; discrimination of     depending on acidity; glycerol rate
                                 different coffees by name;          in wine samples; determination of
                                 discrimination of orange juices     components in model blood plasma
                                 by their quality

10.0 Applications of e-Tongue


The e-Tongue has a multitude of potential applications, which include its use in quality
control laboratories, in space stations and in medicine/body functions. The e-Tongue can sense low
levels of impurities in water. It could discriminate between Cabernet Sauvignons of the
same year from two different wineries, and between those from the same winery but
different years. It can also spot molecules such as sugar and salt at concentrations too low
for human detection. The electronic fingerprint allows the scientists to predict what a
particular solution will taste like. Martin Taylor of the University of Wales at Bangor has
anticipated that the device will probably be able to discriminate the umami taste too, giving
it a refined palate for sushi. The food and beverage industries may want to use e-Tongue to
develop a digital library of tastes proven to be popular with consumers, or to monitor the
flavours of existing products.
The e-Tongue has been designed to replace human tasters. It can also "taste"
cholesterol levels in blood, cocaine in urine, or toxins in water. The e-Tongue can be applied for
quantitative analysis and recognition (identification, classification) of a very wide range of
liquids on an aqueous and water-organic basis. The most promising prospects are the
application of the e-Tongue for quality control and for checking conformity to standards for
different foodstuffs - juices, coffee, beer, wine, spirits, etc. The system can also be
successfully utilized in complicated tasks of industrial and environmental analysis, such as
determination of the composition of groundwater in the abandoned uranium mines.
In the brewing industry, the e-Tongue can be used to monitor batch-to-batch
variation of beers following the brewing process. The e-Tongue allows product conformity
testing, taste defect detection and origin identification (Giese, 2001). An extremely important
taste attribute of beer is its bitterness. A range of beers has also been analyzed using the
e-Tongue, and the results show good linearity in the quantification of BU (Bitterness Units)
using PLS.
e-Tongue has also been used for the analysis of quality of high-fructose corn syrup
to detect some taint compounds responsible for the off-flavours, such as fish taste/flavour
formed by microbiological oxidation of protein residues and other taste/odour descriptors
including fruity, astringent, SO2, salty, corn-caramel, and moldy. Bleibaum et al. (2001)
tested a series of nine 100 % apple juices, including a three-apple blend, vitamin-C fortified
apple/pear juice, and an apple cider using e- Tongue.
There are numerous fields in the food industry where the e-Tongue may prove beneficial
in food processing, in principle and in practice. Quality management is of utmost importance
in the food industry, especially with the advent of good quality assurance programmes.
Application of the e-Tongue would allow the taste quality of a food to be monitored
continuously from the raw material stage right through to the final product. In recent years,
the e-Tongue has found the food and beverage industry to be a challenging environment for
its routine application in taste control and analysis.
11.0 e-Tongue-The Tomorrow
The technology is projected to save millions of rupees as it becomes an integral part
of industrial process quality control systems. Once it is established in the pharmaceutical
industry, scientists plan to expand and apply the instrument to the clinical diagnostic market,
besides developing equipment that will help physicians diagnose patients at the bedside. By
saving time at the point of care, it will save lives; that is the whole point of commercializing
this technology. In the epoch of miniaturization, scientists have turned their attention to
developing the e-Tongue of tomorrow based on a 'single chip'. Scientists believe that medical
diagnosis and food quality assessment will be the most challenging application fields for the
e-Tongue. It may be applied to solve some environmental problems, such as analyzing
hazardous factory wastes and tap as well as ground water quality. Applications of the
e-Tongue would also include inspection of the quality of fish, meat and fermented products
during household or commercial storage.
12.0 Conclusion
For many years, assessment of the sensory quality of food has been based upon the
traditional method, i.e. the application of the human senses. Sensor technology has offered
the food industry a new, rapid type of monitoring and measuring device for the taste analysis
of foods, i.e. the e-Tongue, whose speed, sensitivity, stability and ease of use exceed the
efficiency of the human taster.

While the application of the e-Tongue will bring a radical revolution in the quality
control of foods, providing the food industry with a great opportunity to exploit this novel
technology, the industry will face the dual challenge of identifying and progressing the
technology in order to capitalize on it. A day is coming when one would not need to wait at
the door of a panel member with a tray containing a cube of cheese for sensory evaluation;
an e-Tongue fitted on-line would automatically analyse and document product quality batch
by batch, or the household refrigerator would automatically alert the family that the yoghurt
kept since last Durga Puja has turned sour.
References
Alpha MOS. (2001a). Astree electronic tongue user manual. Toulouse, France.
Alpha MOS. ( 2001b). Astree sensor technical note. Toulouse, France.
Alpha MOS. ( 2001c). Special newsletter “Basell Interview”. Toulouse, France.
Alpha MOS. (2000). FOX2000/3000/4000 Electronic Nose advanced manual. Toulouse, France.
Bleibaum, R.N., Stone, H., Isz, S, Labreche, S., Saint Martin, E., and Tan, T.T. (2001). Comparison
of sensory and consumer results with Electronic Nose and Tongue sensors for apple juices.
Submitted for publication (http//www.google.com/).
Giese, J. (2001). Electronic Tongues, Noses and much more. Food Technology, 55(5): 74-81.
Riul, A. et al. (2002). Artificial taste sensor: efficient combination of sensors made from Langmuir-
Blodgett films of conducting polymers and a ruthenium complex and self-assembled films of an
azobenzene-containing polymer. Langmuir, 18: 239 – 245.

Tan, T.; Lucas, Q.; Moy, L; Gardner, J.W. and Bartlett, P.N. (1995). The Electronic Nose – A new
instrument for sensing vapours. LC-GC 1NT, 8(4): 218-225.

TIME-TEMPERATURE-INDICATORS IN CONTROLLING
QUALITY OF FOODS

Dr. R. R. B. Singh
Sr. Scientist
Dairy Technology Division
NDRI, Karnal-132001

1.0 Introduction

Quality retention in foods has been shown to be greatly influenced by time and
temperature exposure of the food during storage and handling. High storage temperatures will
reduce the useful commercial life of the food product depending on the exposure time.
Indicators which can detect the time-temperature history of the product will be an effective
tool to monitor the quality of food. Several such devices which can indicate either a
temperature that was reached, a duration of exposure to a temperature or an integration of
both have been developed and reviewed in the past (Byrne, 1976; Dharmendra et al., 2000).

Time-temperature indicators are small, simple, inexpensive wireless devices that
exhibit the time-temperature history of the food. The indicator's change with time and
temperature is irreversible and can be measured; it mimics the change of a target quality
parameter of a food undergoing the same variable temperature exposure (Taoukis and
Labuza, 1989a, b; Hendrickx et al., 1991). The target could be any safety or quality attribute
of interest such as destruction of
spores, inactivation of enzymes, loss of vitamins, texture or colour. Commercially available
time-temperature indicators which can provide a simple means to monitor cumulative time
and temperature exposure have been broadly classified into two categories. Partial history
indicators respond only to temperature fluctuations that exceed a pre-determined threshold
temperature and are used to detect severe temperature abuse. Full-history indicators respond
to all temperature conditions and are useful for comparing different temperature histories
(Wells and Singh, 1988a, b). These indicators do not record the precise temperature but
monitor time-temperature history through irreversible change of colour or shape of indicator
elements depending on their response mechanism (Blixt, 1983; Legrond et al., 1986; Wells
and Singh, 1988a, b).

2.0 Types of Commercial TTIs

A number of TTIs based on different working principles are commercially available.


Most of these TTIs are patented and very limited information is accessible. Some of the
most widely used commercial continuous-response TTIs are:

2.1 TTIs Based on Diffusion Principle

Marketed as the Monitor Mark TM (3M Co., St. Paul, Minnesota, USA), the monitor
consists of a pad saturated with the migrating chemical mixture, serving as a reservoir.
Superimposed on the pad is the end of a long porous wick (the track) along which the
chemical can diffuse.
Before use, the pad is separated from the wick by a barrier film so that no diffusion
occurs. Upon activation by removal of the barrier, diffusion starts if the temperature is above
the melting point of the chemical mixture. Fatty acid esters and phthalates (mixed with a blue
dye) are the types of chemicals used viz. butyl stearate (melting point of 12 oC), dimethyl
phthalate (-1.1oC), and octyl octanoate (-17oC). The response of the indicator is the distance
of the advancing “blue front” from the origin. The advancement of the diffusing substance
can be viewed through the openings along the wick or measured on an appropriate scale with
the whole length of the wick visible.

By varying the type and concentration of the ester, different melting temperatures and
response life can be chosen. Thus the indicators can be used either as a CTTI, with critical
temperature equal to the melting temperature of the ester, or as a TTI if the melting
temperature is lower than the range of temperatures the food is stored at, e.g., below 0oC for
refrigerated storage. The tags can have a shelf life of years before activation if they are kept
at cool temperatures.

2.2 TTIs Based on Enzymatic Reactions

Commercially available as I-point® Time Temperature Monitor (I-Point


Biotechnologies A.B., Malmo, Sweden), the indicator is based on a colour change caused by
a pH decrease resulting from a controlled enzymatic hydrolysis of lipid substrate. Before
activation, the indicator consists of two separate compartments, in the form of plastic mini
pouches. One compartment contains an aqueous solution of a lipolytic enzyme, such as
pancreatic lipase. The other contains the lipid substrate absorbed in a pulverized polyvinyl
chloride carrier which is suspended in an aqueous phase with a pH indicator.

Different combinations of enzyme-substrate types and concentrations can be used to


give a variety of response life and temperature dependence. Usual substrates are glycerine
tricapronate (tricaproin), tripelargonin, tributyrin, bis-3,5,5-trimethyl-hexyladipate (THA),
and mixed esters of polyvalent alcohols and organic acids.

The indicator is activated by breaking the barrier that separates the two compartments
by exertion of mechanical pressure either manually or with a special mechanical activator,
mixing enzyme and the substrate. Hydrolysis of the substrate (e.g., tricaproin) causes release
of an acid (e.g., caproic acid) and a drop in pH, translated into a colour change of the pH
indicator. Colours 0 (green), 1 (yellow), and 3 (red) are printed around the reaction window
to allow comparison and easy visual recognition and measurement of the colour change. The
continuous colour change can also be measured instrumentally by a portable colorimeter, data
for which could be developed into a simple kinetic model (Taoukis and Labuza, 1989a).

2.3 TTIs Based on Polymerization Reaction

Marketed as LifeLinesTM Fresh-Scan (previously LifeLines Freshness Monitor;


LifeLines Technology Inc.; Morris Plains, New Jersey, USA), the operating principle of the
TTI is based on the ability of disubstituted diacetylene crystals (R-C≡C-C≡C-R) to
polymerize through a lattice-controlled solid-state reaction. The reaction proceeds via 1,4-
addition polymerization, and the resulting polymer is highly coloured. During
polymerization, the crystal structure of the monomer is usually retained, and the polymer
crystals are chain aligned and are effectively one dimensional in their optical properties. The
colour of the chain is due to the unsaturated, highly conjugated backbone. The side groups
have little effect on the colour of the backbone but affect the reaction properties of the
monomer. The change in colour measured as a decrease in reflectance is the basis of the TTI,
and the response follows typical first order kinetics (ln reflectance vs time).

The indicator consists of an orthogonal piece of laminated paper, the front of which
includes a strip with a thin coat of the colourless diacetylenic monomer and two bar codes.
The indicating strip has a red background colour so that the change is perceived as a change
from transparent to black.

[Figure: Fresh-Scan TTI label showing (A) the time-temperature indicator label, (B) product-specific information and (C) the indicator type, which denotes the relative sensitivity of the colour-developing polymer material; a hand-held microcomputer decodes the bar code information and reads the progressive colour change of the TTI.]

The reflectance is measured by scanning with a laser optic wand. The scanning data
are stored in a hand-held device supplied by the TTI manufacturer. Each time the TTI is
scanned, the reflectance readings are internally standardized in relation to the background of
the indicator strip, which has a 100% reflectance, and the black box enclosing the coated
section of the strip, which has 0% reflectance. During the scanning of the TTI, the two bar
codes are also read, one giving information about the product and the other identifying the
model of the indicator. The information stored in the portable unit can be processed or
downloaded to a microcomputer for storage and evaluation. The TTI manufacturer provides
software for the systemization and presentation of the data.

The indicator is active from the time of production, therefore, before use, it has to be
stored in a freezer, where change is minimal, and used within a relatively short period. The
tags have activation energies of about 20 and 24 kcal/mole (Taoukis and Labuza, 1989a), but
the catalyst can be varied to change this.

2.4 TTIs Based on Analog Principle

Manufactured under the trade name SmartlogTM (Remonsys Ltd., Bristol, England),
these indicators are small, battery powered, microprocessor controlled devices which record
the temperature history of the product during transport and storage. Higher costs of the
product have however limited their widespread use. Such devices are placed at strategic
locations with the product in the transport carrier to record temperature fluctuations both at
the most stable and variable temperature environments.

2.5 Consumer Readable TTIs

Consumer readable TTIs are simpler and lower in cost and function on the same
principle as the continuous-response indicators. These are however designed to show a single
end point, visually recognizable by consumers. Such TTIs are configured in a “bulls-eye”
pattern, with the outer ring being a reference colour and the center circle changing colour.
The center of the enzymatic type (I-point) shows a seemingly one step change from dark
green to yellow, while the polymer type (branded by LifeLines as FreshCheckTM ) shows a
gradual darkening until the center matches the colour of the outer ring, signaling the end of
shelf life. The consumer readable TTIs when tested using instrumental methods have been
found to be reliable under both constant and variable temperature conditions (Sherlock et
al., 1991).

[Figure: steps involved in indicator label preparation; the consumer-readable label reads "Use by date unless center of label below is darker than ring".]

3.0 TTIs and Shelf Life Prediction of Foods


The ability of TTI to function as cumulative recorders of temperature history from
their activation time to the time each response measurement is taken, make them useful for
two types of applications.

3.1 Correlations Between TTIs Response and Quality Characteristics

Several studies have been conducted to demonstrate the use of full-history time-
temperature indicators in monitoring quality changes in different food systems (Mistry and
Kosikowski, 1983; Singh and Wells, 1985; Zall et al., 1986; Wells and Singh, 1988a). This
has been accomplished by correlating the response of indicator models with sensory and
objective changes in food quality when both the indicator and food were exposed to the same
time-temperature conditions. High correlation suggested usefulness of such devices in
monitoring food quality. Such studies, though offering useful information, do not involve any
modeling of the TTI response as a function of time and temperature, and thus are applicable
only to the specific foods and the exact conditions that were used. Extrapolation to similar
foods or quality loss reactions, or even use of the correlation equations for the same foods at
other temperatures or under fluctuating conditions, would not be accurate.

3.2 TTIs Response and Kinetic Modeling

An alternative approach is to use the fundamentals of chemical kinetics to develop a


scheme that allows the correlation of the response of a certain type of TTI to the change in
quality and the remaining shelf life of a food product that has undergone any temperature
exposure. Wells and Singh (1988b) prepared a generalized mathematical model to predict
changes in food quality from the response of a full-history time-temperature indicator. It was
suggested that because of the similar mathematical constructions a constant temperature
equivalent can be estimated for any interval between successive indicator inspections. The
constant temperature equivalent can then be used to predict the change in food quality during
that same interval. Taoukis and Labuza (1989a) applied mathematical modeling using
Arrhenius kinetics for evaluating three types of commercially available TTI. A scheme was
introduced that allows the correlation of the TTI response, X, to the quality index of the food.
X can be expressed as a function of time:

F(X)t = k t = k1 exp(-EA/RT)t (1)

Where F(X) is the response function of the TTI, t is the time and k the response rate constant;
the constant k1 and the activation energy EA are the Arrhenius parameters, and T is the
absolute temperature. For a variable temperature distribution, T(t), the response function can
be expressed as:

F(X)t = ∫0→t k dt = k1 ∫0→t exp[-EA/(R T(t))] dt                  (2)
Defining as effective temperature, Teff, the constant temperature that causes the same
response or change as the variable temperature T(t), we have:

F(X)t = k1 exp[-EA/(R Teff)] t                                    (3)

Similarly, the change in quality A of the food can be modeled. Defining the food quality
function f(A) such that f(A) = kt (the form of f(A) depends on the reaction order, e.g., f(A) =
ln(Ao/At) for 1st order) and using the food's Arrhenius parameters, kA and EA, Eqs. (2) and
(3) can be applied to f(A)t. For a TTI going through the same temperature distribution T(t)
as the monitored food, the value of F(X)t is known from the response X; Teff can then be
calculated from Eq. (3). Teff and knowledge of the kinetic parameters of deterioration of the
food allow the evaluation of f(A) and, hence, of the quality loss of the product.
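A minimal sketch of this scheme, assuming the simple response function F(X) = kt for the TTI and first order quality loss for the food; all Arrhenius parameters and the measured response below are hypothetical.

# Minimal sketch of the Teff scheme of Eqs. (1)-(3): invert Eq. (3) for Teff from the
# measured TTI response, then evaluate the food quality function f(A) over the same
# period with the food's own (hypothetical) Arrhenius parameters. For a first order
# quality loss, f(A) = ln(A0/At), so the fraction retained is exp(-f(A)).
import math

R = 8.314  # J/(mol K)

def effective_temperature(f_x, t, k1, ea_tti):
    """Teff (K) from Eq. (3): F(X) = k1*exp(-EA/(R*Teff))*t."""
    return -ea_tti / (R * math.log(f_x / (k1 * t)))

def quality_function(teff, t, ka, ea_food):
    """f(A) = kA*exp(-EA/(R*Teff))*t for the monitored food."""
    return ka * math.exp(-ea_food / (R * teff)) * t

# Hypothetical numbers: TTI response F(X) = 0.9 read after 120 h of distribution
k1, ea_tti = 4.6e13, 84000.0       # TTI Arrhenius parameters
ka, ea_food = 1.5e15, 95000.0      # food quality-loss Arrhenius parameters
teff = effective_temperature(0.9, 120.0, k1, ea_tti)
fa = quality_function(teff, 120.0, ka, ea_food)
print(f"Teff = {teff - 273.15:.1f} degC, fraction of quality retained = {math.exp(-fa):.2f}")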
4.0 Significance of TTIs to the Food Industry
A closely monitored temperature exposure during distribution chain and an optimized
stock rotation at the retail level based on the temperature of each product, instead of an often
meaningless expiration date (Table 1), can lead to better control of quality and a significant
decrease in food waste. TTI technology is thus compelling to the food industry for a number
of reasons.

Table 1. Open dating terminology

Production date or pack date - Historical meaning; gives the date on which the product was manufactured or put into the final package. Used on prepackaged fresh fruits and vegetables, where shelf life depends on the freshness of the product when harvested.

Sell-by date - Helps in stock rotation to get the products out, so the consumers can purchase the product at a point which will still give them adequate time for home storage before the end of shelf life. Printed dates are usually very good guesses or industry practice.

Best-if-used-by date - The estimated point where the product quality loss reaches a level still generally acceptable but after which it fails to meet the high quality standard. An ambiguous date as to when the product should be taken off the supermarket shelves, and confusing for the stock rotators.

Combination date - "Best if used within ... days of date". The "... days of" part makes this phrase a best-if-used-by date, while the date given represents a sell-by date.

Use-by date - Commonly interpreted as "it dies or you die if you eat it". The date determined by manufacturers as the end of the useful quality life of the product.

Freeze-by date - Often used on meat or poultry in conjunction with another date, such as a use-by date. Helpful to the consumer and helps the store in terms of product movement.

Closed or coded date - Numbers used by the industry that indicate the production date, but not written for the consumer to understand. Important numbers in the case of product recalls.

4.1 Cold Chain Monitoring

Quality and safety of thermally sensitive products depend upon the cold chain, which
is a term expressing the refrigeration from processor to receiver, to warehouse, to retail, to
consumer. Temperature abuse has been proven to be cumulative, so each link in the chain is
important. TTI “labels” attached to individual cases or pallets can give a measure of the
preceding temperature conditions along the cold chain and integrate these preceding
conditions. In the typical shipped product scenario, the TTI labels are activated at the
shipping point, sense temperature during transit, and continue to sense temperature as the
shipment is broken up, redistributed and kept in cold storage at multiple final destinations.
This monitoring continues right up to the point at which retail sale occurs. The overall
monitoring of the distribution system, thus allows for recognition and possible correction of
the more problematic links.
4.2 Management of Inventory

The more common system of inventory management that is used in conjunction with
a product date stamping system is FIFO (First In, First Out). Using FIFO, the product with
the soonest expiration date is preferentially placed on the retail shelf for sale. With this
system, it is still possible to place product in front of a customer that is spoiled, not fresh to the taste, or possibly not wholesome or safe. This is because the variation in the temperature history of any given product parcel is fairly large, and some parcels may actually expire before the printed expiration date says they will. Thus, when abusive temperature conditions are encountered
during storage, transport and handling, the FIFO policy is unable to compensate for the
increased deterioration, and the uniformity in the quality of the product distributed from the
stockpile is compromised.

An alternative would be to determine the issue of stock on the basis of observed or estimated food quality rather than elapsed time in storage. This is called Least-Shelf-Life,
First Out (LSFO) or Shortest-Remaining Shelf-Life (SRSL) policy. In this system, if the
temperature sensing and integration functions of the tag show an earlier signal in the three dots of the tag (signaling a lower remaining shelf life), then that product is rotated to the retail shelf. This rotation is totally independent of the product dating. Under this scenario, the possibility of placing "bad product thought to be good" in front of the consumer is reduced almost
to zero. This policy would thus reduce food waste and provide more consistent quality at the
time of issue for food items which have been exposed to differing temperature conditions.
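
The difference between the two issue policies can be sketched in a few lines of Python. The pallet data and remaining-shelf-life figures below are invented for illustration; the remaining shelf life stands in for what would in practice be estimated from the TTI response.

from dataclasses import dataclass

@dataclass
class Pallet:
    pallet_id: str
    days_to_printed_expiry: float     # what FIFO looks at
    remaining_shelf_life_days: float  # what LSFO looks at (estimated from the TTI)

stock = [
    Pallet("A", days_to_printed_expiry=10, remaining_shelf_life_days=3),  # temperature-abused
    Pallet("B", days_to_printed_expiry=6,  remaining_shelf_life_days=8),
    Pallet("C", days_to_printed_expiry=8,  remaining_shelf_life_days=7),
]

fifo_order = sorted(stock, key=lambda p: p.days_to_printed_expiry)
lsfo_order = sorted(stock, key=lambda p: p.remaining_shelf_life_days)

print("FIFO issues:", [p.pallet_id for p in fifo_order])  # ['B', 'C', 'A']
print("LSFO issues:", [p.pallet_id for p in lsfo_order])  # ['A', 'C', 'B'] - abused pallet goes out first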

4.3 Monitoring of Quality or Shelf Life

Commercially available TTIs have been evaluated and found to give satisfactory prediction of the quality of a number of processed foods. The studies observed the responses of several indicators and correlated them with sensory and objective measures of food quality,
when both indicator and food were exposed to the same temperature conditions. Highly
significant statistical correlations were found between indicator response and food quality
indices. TTIs have been also tested for prediction of remaining food quality and shelf life
based on response information and kinetic models from isothermal testing. The prediction
models have been satisfactorily employed even for products stored under variable
temperature conditions.

4.4 HACCP Advantages

TTIs now have a very important advantage: they can be custom-manufactured to match the exact characteristics of the monitored food product as it relates to
quality and safety. HACCP program designers are beginning to incorporate TTI tags and
configure them to have colour change points which match the critical control points in the
program. Since the tags give a distinct “yes/no” type of answer, they provide clear cut
answers and do not require data inspection. This is ideal for HACCP, where the emphasis is
on real time decision making and action. Several industry organizations endorse the TTI
concept and they are already acceptable in mandated HACCP program.

4.5 TTIs and Food Safety

Minimally processed foods that are high in quality, nutritionally superior, easy to
prepare and have extended shelf life present challenges to ensure microbial quality and
safety. The chief microbial concerns associated with these products center around two types
of microorganisms - psychrotrophic and mesophilic pathogens-that could grow during
extended refrigerated storage or temperature abuse at some point during handling. The
parameters that have to be seriously viewed in modeling quality are: an upper limit of the
initial microbial population under a set quality control scheme based on HACCP and the food
composition; the temperature behaviour of both lag phase and growth phase; an upper limit
for microbial load corresponding to the end of shelf life; and the probability of pathogen
growth and toxin production. From a regulatory standpoint, the presence of any viable
pathogen or microbial toxin would make the food legally adulterated. However, a better
alternative would be a statistical sampling scheme tied with a TTI. The design would have to
be such that the TTI response would signal the food to be discarded even when the actual
pathogenicity has not occurred. The use of TTIs for predicting potential pathogenic growth
would however require accurate and extensive modeling of the temperature dependence of
growth. Thus, if reliable data exist on the temperature behaviour of the different pathogens, a
TTI tag could serve as a warning sign that a certain temperature has been exceeded for an
unacceptable length of time.
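
As a rough sketch of how such a warning could be derived, the following Python fragment accumulates log10 microbial growth over a recorded hourly temperature history using a square-root (Ratkowsky-type) growth model with an assumed fixed lag phase. The model parameters, the temperature history and the growth limit are all illustrative assumptions, not data for any particular pathogen.

import math

def sqrt_model_rate(T, b=0.023, T_min=-1.0):
    """Square-root (Ratkowsky-type) specific growth rate in 1/h; parameters are illustrative."""
    return 0.0 if T <= T_min else (b * (T - T_min)) ** 2

def log10_growth(temperature_history_c, dt_hours=1.0, lag_hours=12.0):
    """Accumulate log10 growth over an hourly temperature history, ignoring an assumed fixed lag phase."""
    log_growth, elapsed = 0.0, 0.0
    for T in temperature_history_c:
        elapsed += dt_hours
        if elapsed > lag_hours:
            log_growth += sqrt_model_rate(T) * dt_hours / math.log(10)
    return log_growth

history = [4] * 48 + [12] * 10 + [4] * 60   # 10 h of abuse at 12 deg C within a 4 deg C chain
limit = 3.0                                 # allowed log10 increase before end of shelf life
growth = log10_growth(history)
print(f"log10 increase = {growth:.2f} ->", "discard" if growth > limit else "within limit")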

5.0 Challenges and Prospects


Although the food industry in developed economies has recognized the need to use TTIs to reduce food waste and deliver better food quality to consumers, cost, reliability and the lack of a scientific basis of application have been deterrents to their widespread use. Low demand for TTIs has inhibited production at mass scale. Economies of scale, therefore, do not permit manufacture of indicators at low cost (present cost estimates are up to 50 cents per unit), so smaller food processing units cannot afford to use them. The need of the hour is to develop prototypes that are reliable and can be produced at relatively lower cost. Furthermore, standardization of the unit being monitored at an appropriate size (case, pallet, car-load, etc.) would also help in tilting the cost-benefit ratio in favour of the indicators. Besides, most studies have used limited quantitative data without much meaningful mathematical modeling, so extrapolation of results to different types of foods and variable temperature conditions is a great limitation. Application of kinetic principles, too, has certain limitations. The actual Teff(food) and the estimated Teff(TTI) often differ for different temperature distributions. The three major sources of error in the estimated
Teff(TTI) (Taoukis et al., 1991) are: the variation of TTI response which may increase as the
TTI tag ages; the statistical uncertainty in the Arrhenius equation parameters; and the
difference in activation energies between the TTI and the food. Thus, concerted research
efforts are needed to generate sufficient information on accuracy and reliability of TTIs to
temperature change particularly under isothermal storage conditions. Response of such
indicators under variable temperature conditions and their stability prior to activation are
some of the areas that need to be thoroughly investigated. Particular emphasis needs to be given to experimental determination of the kinetic parameters of TTIs prior to use. It would be useful for indicator manufacturers to report activation energies and reference rate constants
to aid in the selection of TTIs for different purposes. More TTI designs and types are
expected to meet the requirements of different food products. And finally, consumer
education with regard to usefulness of TTIs in predicting quality and shelf life of foods will
go a long way in ensuring greater acceptability of these indicators.
References

Byrne, C. H. (1976) Temperature indicators - the state of the art. Food Technology, 30: 66-68.
Blixt, K. G. (1983) The I-point® TTM-a versatile biochemical time temperature indicator. In : Refrigeration in
the services of Man. XIV International Congress of Refrigeration, Paris, 3: 793-795.
Dharmendra, B. V., Ganesh Kumar, C. and Mathur, B. N. (2000) Time-temperature indicators as shelf life
monitors in food industry. Indian Food Industry, 19: 181-189.
Hendrickx, M., Weng, J., Maesmans, G. and Tobback, P. (1991) Validation of time temperature indicator for
thermal processing of foods under pasteurization conditions. International Journal of Food Science and
Technology, 27: 21-31.
Legrond, C. G., Kongmark, N. and Roussel, E. (1986) Device for management of stored perishable products in
relation to storage time and temperature. French Patent Application FR 2571 145 Al. Cited from Food Science
and Technology Abstracts, 19: 2V11.
Mistry, V. V. and Kosikowski, F. V. (1983) Use of time-temperature indicators as quality control devices for
market milk. Journal of Food Protection, 46: 52-57.
Sherlock, M., Fu, B., Taoukis, T.P. and Labuza, T. P. (1991) A systematic evaluation of time-temperature
indicators for use as consumer tags. Journal of Food Protection, 54: 885-889.
Singh, R. P. and Wells, J. H. (1985) Use of time temperature indicators to monitor quality of frozen hamburger.
Food Technology, 39: 42-50.
Taoukis, P. S. and Labuza, T. P. (1989a) Applicability of time-temperature indicators as food quality monitors
under non-isothermal conditions. Journal of Food Science, 54: 783-788.
Taoukis, P. S. and Labuza, T. P. (1989b) Reliability of time-temperature indicators as food quality monitors
under non-isothermal conditions. Journal of Food Science, 54: 789-792.
Taoukis, P. S., Bin Fu and Labuza, T. P. (1991) Time-temperature indicators. Food Technology, 45: 70-82.
Wells, J.H. and Singh, R. P. (1988a) Application of time-temperature indicators in monitoring changes in
quality attributes of perishable and semi-perishable foods. Journal of Food Science, 53: 148-152,156.
Wells, J.H. and Singh, R. P. (1988b) A kinetic approach to food quality prediction using full-history time-
temperature indicators. Journal of Food Science, 53: 1866-1871.
Zall, R., Chen, J. and Fields, S. C. (1986) Evaluation of automated time-temperature monitoring system in
measuring freshness of UHT milk. Dairy and Food Sanitation, 7: 285-290.
FUZZY LOGIC SYSTEMS WITH EMPHASIS ON FOOD
AND DAIRY PROCESSING APPLICATIONS

Adesh K. Sharma† and R. K. Sharma


School of Mathematics and Computer Applications,
Thapar Institute of Engineering and Technology,
Patiala-147 004, Panjab, India.

E-mail: adeshkumar.sharma@gmail.com.

1.0 Introduction
Human beings show uncanny ability to make robust decisions in the face of uncertain and
imprecise information. Fuzzy logic (FL) was conceived and developed to emulate this process and to
use techniques expressed by humans, which are imprecise by nature.
Fuzzy logic is a valuable tool that can be used to solve highly complex problems where a
mathematical model is too difficult or impossible to create. Due to the growing complexity of current
and future problems, the ability to find relatively simple and less expensive solutions has fueled the
research into fuzzy technology.
It was designed to mathematically represent vagueness and develop tools for dealing with
imprecision inherent in several problems. Normally, in digital computers one uses the 'binary' or Boolean logic where the digital signal has two discrete levels, viz., Low (i.e., binary 0) or High (i.e., binary 1); nothing in-between. This phenomenon can be expressed mathematically with the help of classical set theory. Consider that U denotes a given universe of discourse (i.e., the universe of all the possible elements from which a set can be formed) and u be an element, i.e., u ∈ U. Then a crisp (discrete) set A can be described by the characteristic function χA as follows:

χA(u) = 1, if and only if u ∈ A
χA(u) = 0, if and only if u ∉ A          ... (1)
Fuzzy systems use soft linguistic variables (e.g., hot, tall, slow, light, heavy, dry, small,
positive, etc.) and a range of their weightage (or truth) values, called membership functions, in the
interval [0, 1], enabling the computers to make human-like decisions. This phenomenon can be
expressed mathematically using fuzzy set theory. A fuzzy set A in the universe of discourse U (i.e., the range of all possible values for an input to a fuzzy system) can be defined as a set of ordered pairs:

A = {(u, μA(u)) | u ∈ U}          ... (2)

where μA : U → [0, 1]          ... (3)
Thus, fuzzy logic is basically a multi-valued logic that allows intermediate values to be
defined between conventional evaluations like yes/no, true/false, black/white, etc. Notions like rather
warm or pretty cold can be formulated mathematically and processed by computers. In this way an
attempt is made to apply a more human-like way of thinking in the programming of computers.
In essence, FL deals with events and situations with subjectively defined attributes:
• A proposition in FL is not restricted to being either 'True' or 'False'.
• An event (or situation) can be, for example, 'a bit true', 'fairly true', 'almost true', 'very true', or 'not true', depending on the event (or situation) attributes.
Fuzzy logic application to problem solving involves three steps: converting crisp (numerical) values to a set of fuzzy values (fuzzification), an inference system (based on fuzzy if-then rules), and defuzzification. In the first step, live inputs such as temperature, pressure, etc., are converted into fuzzy values with the help of special functions, viz., membership functions, which are the equivalents of adverbs and adjectives in speech such as very, slightly, extremely, somewhat and so on; after inference and defuzzification, the output generates a real action, for example a pulse-width or voltage signal driving a motor or a relay. Next, the membership functions are put in the form of 'if-then' statements and a set of rules is defined for the problem under investigation. These rules are of the form:
IF temperature is cold AND humidity is high THEN fan_speed is high
where temperature and humidity are input variables (i.e., known data) and fan_speed is an output variable (i.e., a data value to be computed). The adjectives 'cold' in relation to 'temperature', 'high' in relation to 'humidity' and 'high' in relation to 'fan_speed' are the membership functions in the present case.

The IF part is the antecedent, usually a sensor reading, while the THEN part is the consequent, a command in control applications. Unlike 'binary logic', each antecedent can lead to several premises and several consequences. Hence, in fuzzy logic more than one rule may operate at the same time, but to varying degrees. This set of rules (with different weightages) leads to a crisp control action through the process
of defuzzification.
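
The three steps can be illustrated with a minimal Mamdani-style sketch in Python (using NumPy) built around the fan-speed rule above. The triangular membership functions, their break points and the crisp input readings are assumptions made only for this example, not part of any particular controller design.

import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Fuzzification: degrees of membership of the crisp sensor readings
temperature, humidity = 12.0, 78.0
mu_temp_cold = trimf(temperature, 0.0, 10.0, 20.0)
mu_hum_high = trimf(humidity, 60.0, 80.0, 100.0)

# Inference for: IF temperature is cold AND humidity is high THEN fan_speed is high
firing_strength = min(mu_temp_cold, mu_hum_high)   # AND taken as the minimum

# Clip the consequent membership at the firing strength (Mamdani implication)
speed = np.linspace(0.0, 100.0, 201)
mu_speed_high = np.minimum(trimf(speed, 50.0, 75.0, 100.0), firing_strength)

# Defuzzification by the centre-of-area (centroid) method
fan_speed = np.sum(speed * mu_speed_high) / np.sum(mu_speed_high)
print(f"firing strength = {firing_strength:.2f}, crisp fan_speed = {fan_speed:.1f}%")

A practical controller would evaluate several such rules in parallel and aggregate their clipped consequents before defuzzification; the single-rule case above is kept deliberately small.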

2.0 Architecture of FLC System


The typical architecture of a fuzzy logic control (FLC) system consists of fuzzification [1]
interface, knowledge base, fuzzy inference[2] unit, and defuzzification[3] interface (Fig. 1). The
fuzzification interface transforms the crisp measured data into fuzzy set representing the suitable
linguistic value. The knowledge base stores empirical knowledge of process operation, such as fuzzy
rules in a rule base, and membership functions in a database. The fuzzy inference unit, similar to
human decision making, performs approximate reasoning[4]. The defuzzification interface gets a non-
fuzzy decision or control action from an inferred fuzzy control action.

Figure 1: Architecture of an FLC system for a food process

3.0 FLC Applications in Food and Dairy Processing


The computerized automation of food and dairy processes is more challenging than that of
chemical or pharmaceutical processes. Food processes largely rely on operators' rules of thumb and are
not fully automated. Processes may be difficult to control or model by conventional methods, where
simplifications or linearization are often made. Processes become too complicated because of tightly
coupled control loops, nonlinear parameters around the operation point, or some parameters being
subject to unpredictable noise. Food processing operations are usually of batch type. A few selected unit operations have been mechanized; hence, some of these can be categorized as semi-continuous or continuous operations. Experienced operators, however, usually control these processes. It is not easy to
transfer the knowledge from process experts to mathematical models when designing a control system
for food processes.
The other problem in food process automation is that only a limited number of variables are
measurable online. Some of them such as odor, taste, appearance and texture are subjective and usually
evaluated qualitatively in linguistic terms or need a longer time to analyze in the laboratory. The
properties of food process usually vary and depend on unpredictable factors such as seasons, location,
and climate. Because of these reasons, automation of food process may cost time and money. Fuzzy
logic and neural network techniques separately or combined, can be used to facilitate computerized
automation.

3.1 Extrusion Process


Extrusion cooking is one of the most versatile food processes as applied in many conventional
and novel food-manufacturing operations. Therefore, automatic control of food extrusion processes is
of great interest. However, the automatic control of food extrusion cookers faces great difficulties
because of the lack of online measurable quality variables and of the appropriate sensors, and because
of the uncertainties involved in the behavior of the biological materials and complex and unclear
interactions between the various process variables.
Because of the complexity of these problems, extrusion cookers have until recently been controlled by relatively simple on-off controllers and by experts' directions relying on empirical knowledge of past experience. However, a fuzzy controller has been designed and simulated for an
extrusion process. The water feed rate is inferred based on the feed moisture and mass feed rate. The
fuzzy rules have been derived by changing the water feed rate proportional to the mass feed rate and
the difference between the optimal and current moisture levels.

3.2 Pasteurization Process


The temperature of a High-Temperature-Short-Time (HTST) heat exchanger has been
controlled by a fuzzy logic control system. Its rule base consisted of only five fuzzy control rules that
were used for inference of the steam valve throttle based on the observation of the product temperature
error or hot water temperature error. The final control action was then defuzzified by the centre of area
method. The fuzzy controller was built in a discrete framework; therefore, the fuzzy rule could be put
in the form of a rule table to shorten the time for inference. The result was not very successful in some
cases, because of difficulty in deriving better fuzzy rules and selecting proper membership functions.

3.3 Fermentation Process


Fermentation processes are also ubiquitous in the food industry, ranging from large-scale
operations for products such as beer to small-scale production of specialty ingredients; from traditional
processes such as brewing and winemaking to recent developments based on advances in
biotechnology; from production of high-quality foods for consumers to waste-treatment processes. All
of these processes rely on growth and maintenance of microorganisms in complex reaction
environments. Fuzzy controllers have been developed to deal with the control of cultivation of
microorganisms in a fermentation process. It is difficult to control the process using a classical
controller, because the biological mechanism is not completely understood and online measurements of
some variables are not reliable or accurate.

3.4 Dehydration Process


The moisture content of a food product is reduced with the primary purpose of stabilizing water activity with a view to increasing its shelf life. There are many dryer configurations and different modes of operation (batch vs. continuous, manual vs. automatic), but for all modes of operation there is a common motivation for good process control. The objective is to achieve a final moisture content that minimizes damage to the thermally sensitive components (e.g., flavor, enzymes, etc.) and, in some cases, to produce desired functional properties such as rehydration characteristics and texture in the dried product. Prolonged drying causes shrinkage losses as well as quality deterioration due to oxidative changes, unnecessary flavor loss, or stress cracking. Under-drying is also a problem. Rejected
material may be reworked but incurs costs in warehousing, reduction in productivity and the potential
to lower the quality factors. Experienced operators control many dryers in the food industry manually.
Automatic control has been introduced for some types of food dryers. Fuzzy control systems for grain
and food dryers have been developed to the pilot plant stage, and work on industrial scale systems is in
progress.
Footnotes
[1]
It is the encoding of observed non-fuzzy data into fuzzy sets defined in the universe of input variables. Most data measured
by sensors in processes are crisp data such as temperature, flow rate, and pressure.
[2]
Various forms of tautologies are used for making deductive inferences. In FL they are referred to as inference rules, e.g., 'x is A', 'if x is A then y is B', etc.
[3]
It is a mapping from the space of a fuzzy set to a space of crisp values. For most applications in process control, final non-
fuzzy output is required to actuate the process, but the result of fuzzy inference is a fuzzy set. Therefore, defuzzification is
required to decode the fuzzy control output into crisp control output.
[4]
Fuzzy logic is the basis of approximate reasoning. A proposition is a sentence that can be expressed in the form P: “x is A”, where x is the subject and A is a predicate that characterizes the subject x. For example, in the proposition “Vitamins in milk are heat sensitive”, “vitamins in milk” is the subject, which is characterized by the predicate “heat sensitive”. In
classical logic, the predicates are non-fuzzy and truth-values of the propositions are either true or false. However, in fuzzy
logic we could have many different ways to express the vagueness.
BASIC CONCEPTS OF DATABASES

Anand Prakash Ruhil


Computer Centre
NDRI, Karnal

1.0 Introduction
The term database became popular in the late 1960s. Prior to that, data were maintained in different files for data processing jobs. Each section in an organization maintained its own data files, and even common data files were not shared with others. Application programs and data organization methods were interrelated: if a change was made to the data organization or to the data storage units, the application programs also had to be modified accordingly. There was no data independence. This caused problems of redundancy and inconsistency of data in the organization. When "database" became a catchy word, everyone started using the name by retitling their data file applications as databases without incorporating the features of the database concept. The main features of a database are data independence, controlled redundancy, interconnectedness, security protection, real-time accessibility, etc.

2.0 Database
A database may be defined as a collection of interrelated data stored together without
harmful or unnecessary redundancy to serve multiple applications; the data are stored so that
they are independent of programs which use the data; a common and controlled approach is
used in adding new data and in modifying and retrieving existing data within the data base. The
data is structured so as to provide a foundation for future application development. One system
is said to contain a collection of databases if they are entirely separate in structure.

A database may be designed for batch processing, real-time processing, or in-line processing (in which single transactions are processed to completion one at a time but without
the tight time constraints of a real-time system). Many databases serve a combination of these
processing methods. A database can be of any size and varying complexity.

In an organization, a database is dynamic in nature. Over a period of time the size of a database grows, storage devices are changed, and new applications are added, which requires restructuring of the existing database. The concept of the database approach provides data
independence (logical and physical) at two levels to the database designer, developers, and
application programmers to save their efforts whenever there is a change in database structure.

Logical data independence means that the overall logical structure of the data may be
changed without changing the application programs. Physical data independence means that
the physical layout and organization of data may be changed without changing either the
overall logical structure of the data or the application programs.

2.1 Objective of a Database

The main objective of any database is to handle all the files or related data in an
integrated form in order to ensure data authenticity, reliability and security with optimum
performance rate having least redundant data and permitting multiple users for processing
concurrently.

2.2 Basic Terminology

1. Byte: The smallest individually addressable group of bits (usually 8 bits).


2. Data: These are the facts that can be recorded and that have implicit meaning.
3. Data Item: A data item is the smallest unit of named data. It may consist of any number
of bits or bytes. It is also referred as field or data element. For example, name of
student, telephone number, and address, etc.
4. Data Aggregate: It is a collection of data items, which is given a name and referred to
as a whole. For example data item DATE may be composed of data items MONTH,
DAY, and YEAR.
5. Record: A record is a named collection of related data items or data aggregates.
6. File: A file is collection of all occurrences of a given type of record.
7. Database: A database is a collection of the occurrences of multiple record types,
containing the relationships between records.
8. Database System: A collection of databases.

2.3 Characteristics of a Database

Some of the desirable characteristics of a database system are as follows:

1. Ability to represent the inherent structure of the data: The database system should
be able to represent the true properties of the data. The implementation procedures
should not force the data in to structures which do not represent its inherent nature. For
example a system which can only represent tree structures is inadequate.
2. Performance: Database applications designed for use by a terminal operator must give
a response time appropriate for the man-terminal dialogue. In addition the database
system must be able to handle an appropriate throughput of transactions.
3. Minimum Cost: Techniques are used to reduce the cost for storage of data and
programming. Also minimize the cost of making changes.
4. Minimal Redundancy: Eliminate redundant data to avoid inconsistence in the data
base.
5. Search Capability: The database should be able to process anticipated as well as
unanticipated queries quickly.
6. Constant Growth: One of the most important characteristics of databases is that they
will need to change and grow. Easy restructuring of the database must be possible as
new data types and new applications are added. The restructuring should be possible
without having to rewrite the application programs and in general should cause as little
upheaval as possible.

7. Integrity: It is important that the data items and associations between items not be
destroyed. Hardware failures and various types of accident may occur occasionally. The
storage of data and its updating, and insertion procedures, must be such that the system
can recover from these circumstances without harm to the data. In addition to
protecting data from system problems like hardware failure, the integrity checks may
also be designed to ensure that data values conform to certain specified rules. Tests may be made by checking the relationships between data values.
8. Privacy and Security: Data in data base systems must be kept secure and private. It
must not be stolen or lost since it is vital for the organization. It must be protected from
hardware or software failure from catastrophes, and from criminals, incompetent, and
people who would misuse it.
Data security refers to protection of data against accidental or intentional disclosure to
unauthorized persons or unauthorized modifications or destruction.
Data privacy refers to the rights of individuals and organizations to determine for
themselves when, how, and to what extent information about them is to be transmitted
to others.
9. Interface with the Past: When an organization installs new database software it is
important that it can work with the existing programs and procedures, and that existing data can
be converted. This kind of compatibility can be a major constraint in switching over to
new database system.
10. Interface with the Future: This is an important feature since in future the data and its
storage media will change in many ways. An organization has to move along with
advancement taking place in technology for its survival. Moreover, the needs of an
organization also grow over the years due to changes in policies, expansion in business,
etc. Under these circumstances, therefore, the database should be designed in such a way that these changes can be incorporated in it at minimum cost. This can be achieved to some extent by using the concepts of logical and physical data independence while designing the database.
11. Simplicity: The means that are used to represent the overall logical view of data should
be conceived in a simple and neat fashion. In many systems pointers are used in the
logical representation to show relationships between data items. But, as more and more
relationships are added the representation becomes more complicated with pointers.

3.0 Entities and Attributes


1. Entity: The item about which information is stored is known as entity. It is a “thing” in
the real world with an independent existence. An entity may be a tangible object with
physical existence like a car, house or employee, student, etc. It may be intangible such
as an event, bank account, MS-windows, abstract concepts, etc.
2. Entity set: Collection of similar entities.
3. Attributes: Properties of an entity (e.g. student) such as name, age, class, marks
obtained, and address of a student are referred as attributes. Attributes may be atomic or
composite. Composite attributes can be divided into smaller subparts which represent
more basic attributes with independent meaning. For example, the Address can be

divided further into Street, City, State, Zip, etc. Attributes that are not divisible are
called as simple or atomic attributes.
4. Flat files: This is an arrangement of the attributes, and the values of these attributes for each instance of an entity, in columns and rows. For example, attributes of a student (student is an entity) like name, age, class and marks can be placed in columns, and the values of the attributes for each student are placed row-wise. The related set of values of all attributes in one row, for one instance of an entity, is referred to as a tuple.
5. Primary key: An attribute which uniquely identifies a record from a given set of values of an entity is referred to as a primary key. For example, a student can be uniquely identified by roll number in a class.
6. Secondary key: An attribute which identifies a set of records that have a certain property is referred to as a secondary key. For example, the name of a student.
7. Entity Relationship Diagram: This diagram shows the relationships between different entities and their attributes in a database in a graphical way.

4.0 Data Models and Data Association


A data model is a concept to describe the structure of a database (i.e. data types, relationships
and constraints that hold on data). There are three categories of data models as described
below:
1. High level or conceptual data model: It provides concepts that are close to the way
many users perceive data.
2. Low level or physical data models provide concepts that describe details of how data is
stored in the computer.
3. Between two extremes is a class of representational data models, which provide
concepts that may be understood by end users.

Data in database may be described as follows from user point of view:

1. Schema or global logical database description: A schema is a chart of the entire logical database. This is the overall view of the data as seen by the database administrator. It gives names to entities and attributes, and specifies the relationships between them. It is a
framework into which the values of data items can be filled. Schema is specified during
database design and is not expected to change frequently. A displayed schema is called
a schema diagram.

2. Subschema: This refers to an application programmer’s view of the data he/she uses.
This is a chart of a portion of the data which is oriented to the needs of one or more
application programs. Many different subschemas can be derived from one schema.

3. Physical database description: This is concerned with the physical representation and
layout and organization of data on storage units. It is concerned with the indices,
pointers, chains, and other means of physically locating records and with overflow areas
and techniques used for inserting new records and deleting records.

4.1 Associations between Data:

Schema and subschema are maps showing the data item types and associations between
them. There are various ways of drawing the associations. The association between two data
items can be of three types as described below:

1. One-to-one (1 : 1): In this type of association there is a one-to-one mapping from data item A to data item B. This means that, at every instance in time, each value of A has one and only one value of B associated with it. For example, a product name and its code have a one-to-one relationship.
2. One-to-many (1 : M): In this type of association there is a one-to-many mapping from data item A to data item B. This means that one value of A has one or more values of B associated with it. For example, a product name and its packing sizes have a one-to-many relationship.
3. Many-to-many (M : M): In this type of association there is a many-to-many mapping from data item A to data item B and vice versa. This means that one value of A has one or more values of B associated with it and, similarly, one value of B has one or more values of A associated with it. For example, one bill may contain many products and one product name may be contained in many bills. A sketch of how these three associations map onto relational tables is given after this list.
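
The sketch below uses SQLite from Python; the table and column names are hypothetical and simply follow the product/packing/bill examples above. A 1:1 association is enforced with unique keys on both sides, a 1:M association with a foreign key in the "many" table, and an M:M association with a separate linking table (which is also where intersection data such as quantity or price would be kept).

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- 1:1  : each product name has exactly one code and vice versa
CREATE TABLE product (code TEXT PRIMARY KEY,
                      name TEXT UNIQUE NOT NULL);

-- 1:M  : one product may come in many packing sizes
CREATE TABLE packing (pack_id INTEGER PRIMARY KEY,
                      product_code TEXT REFERENCES product(code),
                      size TEXT);

-- M:M  : one bill holds many products and one product appears on many bills,
--        so the association itself becomes a table
CREATE TABLE bill (bill_no INTEGER PRIMARY KEY, bill_date TEXT);
CREATE TABLE bill_item (bill_no INTEGER REFERENCES bill(bill_no),
                        product_code TEXT REFERENCES product(code),
                        quantity REAL,
                        PRIMARY KEY (bill_no, product_code));
""")
print("tables:", [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")])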

5.0 Database Management System (DBMS)


DBMS is a collection of programs that enables users to create and maintain a database.
DBMS is a general purpose software that facilitates the process of defining, constructing, and
manipulating databases for various applications. Defining a database involves specifying the
data types, structures and constraints for the data to be stored in the database. Constructing the
database is the process of storing the data itself on some storage medium that is controlled by
the DBMS. Manipulating a database includes functions such as querying database to retrieve
specific data, updating database, etc.

For example, a simple STUDENT relation (table) maintained under a DBMS might look as follows:

STUDENT:
Name     Roll Number     Subject     Marks
Ram      101             Math        98
Sham     102             CS          95

5.1 Advantages of DBMS

1) Controlling Redundancy:-DBMS have the capability to control the redundancy so as


to prohibit inconsistencies among the files. This may be done by automatically
checking each table.
2) Systems integration: - In DBMS, all files are integrated into one system thus
reducing redundancies and making data management more efficient. In addition,
DBMS provides centralized control of the operational data.

3) Restricting Unauthorized Access:- When multiple users share a database, it is likely
that some users will not be authorized to access all information in the database. For
example, financial data is often considered confidential and hence only authorized
persons are allowed to access such data.
4) Providing Persistent Storage for Program Objects and Data Structures:-
Databases can be used to provide persistent storage for program objects and data
structures.
5) Providing Backup and Recovery:- A DBMS must provide facilities for recovering
from hardware or software failures. The backup and recovery subsystem of the DBMS
is responsible for recovery.
6) GUI for generating Forms, Queries and Reports: Almost all DBMS have GUI
features for generating forms for data entry, reports and queries. This helps users to
complete the tasks without going into the complexities of programming.

5.2 Languages:

To accomplish the objectives of a DBMS, several languages are needed at different levels: to create a database, to manipulate a database, to store the database on media, etc. In the present context we are interested in the languages/procedures for the creation and manipulation of databases, described below:

1. Schema description language: It helps the programmer to describe schema of a


database by providing the mechanism to create entities and their attributes in a database.
It also helps in establishing the relationships among various entities. For example data
definition language (DDL) is used to create and define entities and attributes (data files
and fields) in database.

2. Data manipulation language: This language helps users to manipulate the database by providing facilities to enter, modify, and delete data into and from the database. It is also used for retrieving data from the database by putting queries to generate reports. For example, Structured Query Language (SQL) is used to retrieve and manipulate data efficiently in a database. Most DBMSs also have interfaces with other programming languages such as VB, C++, VC++, etc., to generate complex reports from the database. A short sketch showing DDL and DML statements in use is given below.
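
The division of labour between DDL and DML can be seen in a small example using SQLite from Python. The STUDENT table mirrors the sample table shown earlier; the column names are assumptions made only for illustration.

import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define the STUDENT entity and its attributes
con.execute("CREATE TABLE student (name TEXT, roll_number INTEGER PRIMARY KEY, "
            "subject TEXT, marks INTEGER)")

# DML: enter data, then query it back
con.executemany("INSERT INTO student VALUES (?, ?, ?, ?)",
                [("Ram", 101, "Math", 98), ("Sham", 102, "CS", 95)])
for row in con.execute("SELECT name, marks FROM student WHERE marks >= 95 "
                       "ORDER BY roll_number"):
    print(row)   # ('Ram', 98) then ('Sham', 95)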

6.0 Database Structures


The relationships among various entities in a database may be categorized as follows:

1. Tree or Hierarchical Structure


2. Plex or Network Structure
3. Relational or Normalized Structure

6.1 Tree or Hierarchical Structure

A tree is composed of a hierarchy of elements called nodes. The uppermost level of the
hierarchy has only one node called the root. Except the root node every node has one node
related to it at a higher level and this is called its parent. No element can have more than one
parent. Each element can have one or more elements related to it at a lower level. These are
called children. Elements at the end of the branches are called leaves.

A tree can be defined as a hierarchy of nodes with bi-nodal relationship such that:

1. The highest level in the hierarchy has one node called root
2. All nodes except the root are related to one and only one node on a higher level than
themselves

Hierarchical files: This refers to a file with a tree structure relationship between the
records. Data tend to break down into hierarchical categories. One category of data is a
subset of another.

Balanced Tree: In a balanced tree each node can have the same number of branches, and
the branching capacity of the tree is filled starting at the top and working down
progressing from left to right in each row.

Binary Tree: It is a special category of balanced tree structure which permits up to two
branches per node. Any tree can be converted into binary tree.

Path Dependency: The lower records in a hierarchy (tree structure) may be incomplete in meaning without their parents.
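
A hierarchy with this one-parent (bi-nodal) property is easy to sketch in Python as a table of parent links. The node names below are hypothetical, and the path_to_root helper illustrates path dependency, i.e. that a lower record is fully meaningful only together with its ancestors.

# Parent links: every record except the root points to exactly one parent node.
hierarchy = {
    "Dairy Plant": None,          # root
    "Procurement": "Dairy Plant",
    "Processing": "Dairy Plant",
    "Route-1": "Procurement",
    "Route-2": "Procurement",
}

def path_to_root(node):
    """Walk up the parent links; a lower record is meaningful only with this path."""
    path = [node]
    while hierarchy[node] is not None:
        node = hierarchy[node]
        path.append(node)
    return path

print(path_to_root("Route-1"))   # ['Route-1', 'Procurement', 'Dairy Plant']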

6.2 Plex or Network Structure

If a child in a data relationship has more than one parent then such relationship is
described as plex or network structure. Any node in a plex (or network) structure can be linked
to any other node on lower or higher side.

Simple Network Structure: A schema in which no line has double arrows in both
directions will be called a simple network structure.

Complex Network Structure: A schema in which lines have double arrows in both
directions will be called a complex network structure.

Intersection Data: In some cases data can be related to the association between data
items. For example, part A may be supplied by several vendors who each charge a different price for it. The data item price cannot be associated with the PART record
alone or with the SUPPLIER record alone. It can only be associated with the
combination of the two. Such information is sometimes called intersection data - data
associated with the association between records.

6.3 Relational or Normalized Structure

In this structure the data items are represented in the form of rows and columns. Tree and network relationships are broken down into simple two-dimensional structures with the help of normalization techniques. In the next section we will discuss this kind of structure in detail, to design and create databases for solving scientific and business applications.

7.0 Relational Databases


A relational database consists of a set of tables. In each table, rows (called tuples)
represent unique instances of an entity or records and columns (called fields) represent
attributes. Each table is a relation and so a relational database can be thought of as a collection
of tables. Relationships between tables are represented by common data items in different
relations (tables).

Degree of a relation: Number of columns in a relation.

Domain: A set of values that a data item can have.

Primary Key: An attribute (field) which uniquely identifies a record (row) in a relation is known as the primary key. A key may also be a composite key, foreign key, etc.

Foreign Key: An attribute of one relation is called a foreign key if it is the primary key of some other relation. Hence keys, primary and foreign, provide a means of representing relationships between tuples.

Operations on tables: The concept of a relational database is based on the theory of relational algebra; therefore we can apply various types of operations on the tables in a relational database. The following three types of operations are important; a small sketch of all three is given after this list:

1. Projection: Selects specified columns from a table to create a new table.
2. Selection: Creates a new table by selecting rows that satisfy a condition.
3. Join: Creates a new table from the rows in two tables that have fields satisfying a condition. The join operation is performed when two tables share a common field. The projection operation splits tables, while the join operation puts together columns from different tables.
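
As indicated above, here is a small sketch of the three operations in plain Python, with lists of dictionaries standing in for relations; the sample student and department data are invented purely for illustration.

students = [
    {"roll": 101, "name": "Ram", "dept": "D1", "marks": 98},
    {"roll": 102, "name": "Sham", "dept": "D2", "marks": 95},
]
departments = [
    {"dept": "D1", "dept_name": "Dairy Technology"},
    {"dept": "D2", "dept_name": "Dairy Engineering"},
]

def project(rows, cols):
    """Projection: keep only the named columns."""
    return [{c: r[c] for c in cols} for r in rows]

def select(rows, condition):
    """Selection: keep only the rows satisfying the condition."""
    return [r for r in rows if condition(r)]

def join(rows1, rows2, field):
    """Join: combine rows from two tables that agree on a common field."""
    return [{**r1, **r2} for r1 in rows1 for r2 in rows2 if r1[field] == r2[field]]

print(project(students, ["name", "marks"]))
print(select(students, lambda r: r["marks"] > 96))
print(join(students, departments, "dept"))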

8.0 Normalization

The practice of storing data in separate files to minimize redundancy and simplify basic
database file management is called normalization. It is a process of analyzing the given
relations (schema) based on their Functional Dependencies (FDs) and primary key to achieve
the properties
 Minimizing redundancy
 Minimizing insertion, deletion and update anomalies.

The purpose of normalization is to produce a stable set of relations that is a faithful model of
operations of an enterprise. By following the principles of normalization we can achieve a design that is highly flexible, allowing the model to be extended in the future.

There are several normal forms for normalizing a database, but the first three are important and sufficient to build a stable database and also take care of insertion, deletion and update anomalies. Therefore, in the present context, we will focus only on the first three normal forms, as described below:

8.1 First Normal Form (1 NF): The domain of attributes must include only atomic (simple,
indivisible) values. Repeated data values are removed from a single database file and are placed
into separate database files using a key field to link database files. After completing this task
the database is said to be in first normal form.

Functional Dependence: Attribute B of a relation R is functionally dependent on


attribute A of R if, at every instant of time, each value in A has no more than one value
in B associated with it in relation R. It means that A identifies B uniquely.

Full Functional Dependency: An Attribute or a collection of attributes, B, of a relation


R can be said to be fully functionally dependent on another collection of attributes, A,
of relation R if B is functionally dependent on the whole of A but not on any subset of
A.

8.2 Second Normal Form (2 NF): A relation R is in 2NF if it is in 1NF and every non-prime
attribute A in R is fully functionally dependent on primary key.

Transitive dependence: The dependence of a field on some other non-key field in the same record is known as a transitive dependency. In
other words, suppose that A, B, and C are three attributes or distinct collection of
attributes of a relation R. If C is functionally dependent on B and B is functionally
dependent on A, then C is functionally dependent on A. If the inverse mapping is non
simple (i.e. if A is not functionally dependent on B or B is not functionally dependent
on C), then C is said to be transitively dependent on A.

8.3 Third Normal Form (3 NF): A relation R is in 3NF if it is in 2NF and every non-prime
attribute is non-transitively dependent on primary key.

In simple terms, if we look at most manual systems that are used to store and manage
information, we will often find that the information is already structured in the third normal
form. So do not let a lot of theory confuse you. Try to reduce the redundancies in your database
files and make particular bodies of information independent and easy to work with, and you
will find that your databases will naturally fall into the desired third normal form.
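
A small sketch of this idea, using SQLite from Python with hypothetical table and column names: the flat relation below carries a transitive dependency (dept_name depends on dept_code, which in turn depends on the key), and the 3NF decomposition removes it by moving the department attributes into their own table.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Not in 3NF: dept_name depends on dept_code, and dept_code depends on the key
-- (roll_number), so dept_name is transitively dependent on the primary key.
CREATE TABLE student_flat (roll_number INTEGER PRIMARY KEY,
                           name TEXT,
                           dept_code TEXT,
                           dept_name TEXT);

-- 3NF decomposition: every non-key attribute now depends only on the key of its own table.
CREATE TABLE department (dept_code TEXT PRIMARY KEY,
                         dept_name TEXT);
CREATE TABLE student (roll_number INTEGER PRIMARY KEY,
                      name TEXT,
                      dept_code TEXT REFERENCES department(dept_code));
""")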

INTRODUCTION TO MANAGEMENT INFORMATION
SYSTEMS

Anand Prakash Ruhil


Scientist, Computer Centre
National Dairy Research Institute, Karnal

1.0 Introduction
Information processing is a major societal activity. At every stage we need information to take decisions, and we spend much of our time recording, searching, processing and absorbing information. The terms information and data are frequently used interchangeably, so first let us understand what information is. Information can be defined as data that is meaningful or useful to the recipient. Data items are therefore the raw material for producing information. Information is a vital resource to an individual as well as to an organization for its efficient operation and management. Information can be processed manually or with the help of computers. Nowadays, computers have become an essential part of organizational data processing for producing information, because of the power of the technology and the volume of data to be processed.

2.0 Types of Information


We have seen that data is processed to produce information. The same data may
be processed in different ways to produce different types of information. For example, the management of a dairy plant will require different types of information to run the plant efficiently. The information required by the management may be classified into the
following categories:

1. Strategic information: This information is required for long range planning and
directing the course the business should take, for example, whether to start manufacturing a new product in the plant or to open a new branch of the plant at some other location. This type of information is less structured, its volume is small, and it is difficult to obtain.

2. Tactical information: This type of information is needed to take short-range decisions to run the business efficiently, for example, what the reorder level of items in the inventory should be, or how much loan should be sanctioned to a milk producer for the purchase of animals, feed, etc., based on his/her past record. Tactical information requires specifically designed processing of data. Unlike strategic information, most of it is obtained easily from the day-to-day collection of routine data. Further, the volume of data here is larger than that of strategic data.

3. Operational information: This type of information is needed for the day-to-day operation of a business organization, for example the receipt of milk from farmers per day, inventory items issued to different sections, sale proceeds of milk and milk products per day, etc. Operational information is usually easy to obtain by straightforward clerical processing of data. The volume of such information is much larger than that of tactical information.

4. Statutory information: Information and reports which are required by law to be sent to government authorities are normally clearly specified and require straightforward processing of data, for example, income tax statements, sales tax, excise duty paid, etc.

3.0 Management Structure


In a large business organization like a dairy plant, it is essential to delegate responsibilities to specialists in each area and make them accountable for their efficient functioning. For example, a dairy may have a number of sections such as procurement, processing, production, sales and marketing, HRD, accounts and finance, etc. Each section is headed by a middle-level manager, who reports to the General Manager, the overall in-charge of the organization. Middle-level managers in turn have many assistants who are responsible for specific day-to-day operations; these are called line managers. Line managers are further assisted by clerical staff who perform clerical and routine tasks. The management structure thus has a pyramid shape, as shown in Figure 1 below. The information required at each level of management is also shown in the same figure.

Figure 1: Management structure and the information required at different levels. (Pyramid, top to bottom: top level - planning - strategic information, condensed and unstructured; middle level - planning - tactical information; line managers - operational information; users at the inquiry level - transaction processing - detailed and structured information.)

4.0 Qualities of Information


The information provided to managers should have the following qualities:
• Accurate: The correctness of the input data and that of the processing rules
should be ensured so that the resulting information is accurate. Incorrect
information is worse than no information.
• Complete: Information should be complete. It should include all data and not exclude any.
• Trustworthy: Information should be trustworthy. The processing should not hide
some vital information which may for example, point out the inefficiency of some
individuals.
• Timely: Information should be timely. It should be given to the manager when he
needs it. Delayed information may sometimes be of no value.
• Uptodate: Information should be uptodate. It should include all data available at
the time of processing.
• Relevant: Information should be tailored to the needs of the user and be relevant
to him. Therefore before producing reports first understand user needs.
• Brief: Massive volumes of irrelevant information would waste a lot of manager’s
time and there is danger of his missing important relevant information. Therefore,
summarize relevant information.
• Significance understandable: Information should be presented when he needs it
and where he needs it in such a way that he may immediately perceive its
significance. For example, use of attractive format and graphical charts ensures
quick recognition and significance of the information.

5.0 Types of Systems

Following are the major types of Information Systems that are used in an organization for
efficient operations. Each type of system is designed to accomplish a specific function
with an integrated approach. We will look at them in the context of operations of a dairy
plant.

1. Online Transaction Processing Systems (OLTPS): This system records the routine transactions that take place in everyday operations to support the day-to-day activities of an organization. Detailed transactions are entered at this level to produce operational information for line managers. These systems also process status inquiries raised by users. OLTPS combine data in various ways to fulfill the hundreds of information needs an organization requires to be successful. For example, data entry of milk received from farmers/societies is used to produce statements like the daily receipt of milk, route-wise; similarly, a manager can query the system to know the status of inventory items, etc.

2. Management Information System (MIS): This system produces tactical information for middle-level management. Data collected over a period from OLTPS are processed to generate information for short-range planning. Information is processed on a periodic basis instead of the daily, recurring basis of a Transaction Processing System. For example, the trend of milk received from different areas/locations can help the manager take corrective measures if the trends are not as per expectations.
3. Decision Support System (DSS): These systems produce strategic information
for top-level management. The strategic information is produced from the OLTPS and MIS and may also be collected and generated from the external environment. Since strategic information is less structured in comparison with tactical information, it is difficult to develop such systems (DSS). An MIS uses internal
data to supply useful information. A DSS uses internal data also but combines it
with external data to help analyze various decisions management must make.
Analyzing complex, interactive decisions is the primary reason for an
organization to use a DSS.

6.0 Management Information Systems


There is no consensus on the definition of the term MIS. A number of
terminologies are being used for MIS such as “information processing system”,
“information and decision system”, “organizational information system”, or simply
“information system”. In the present context we have used the term MIS. Management
information system is an integrated user-machine system for providing information to
support operations, management, and decision making functions in an organization. The
system utilizes computer hardware and software; manual procedures; models for analysis,
planning, control and decision making; and a database.

It is a broad concept rather than a single system. Actually OLTPS and DSS may
also be considered as a part of MIS. MIS supplies decision makers with facts; it supports
and enhances the overall decision making process. MIS also enhances job performance
throughout an organization. At the most senior levels, it provides the data and
information to help the board and management make strategic decisions. At other levels,
MIS provides the means through which the institution's activities are monitored and
information is distributed to management, employees, and customers. An effective MIS should ensure that the appropriate presentation formats and time frames required by operations and senior management are met.

An MIS can be maintained and developed by either manual or automated systems, or a combination of both. The computer-based activities in an organization used to perform transaction processing, to provide processing for a formal information and reporting system, and to accomplish managerial decision support are broadly classified as the organization's management information system (MIS). It should always be sufficient to meet an organization's unique business goals and objectives. These systems should be accessible and usable at all appropriate levels of the organization. MIS is also a critical component of the organization's overall risk management strategy: it should be used to recognize, monitor, measure, limit, and manage risks, and it supports management's ability to perform such risk reviews.

MIS are designed to achieve the following goals:

• Enhance communication among employees.


• Deliver complex material throughout the institution.
• Provide an objective system for recording and aggregating information.
• Reduce expenses related to labor-intensive manual activities.
• Support the organization's strategic goals and direction.

MIS and the information it generates are generally considered essential
components of prudent and reasonable business decisions. MIS should have a clearly
defined framework of guidelines, policies or practices, standards, and procedures for the
organization. These should be followed throughout the organization in the development,
maintenance, and use of MIS. MIS is viewed and used at many levels of management. It
should support the institution's longer term strategic goals and objectives; at the other
extreme, it also covers the routine transaction systems used to ensure that basic control
is maintained over day-to-day activities.

In order to further clarify the definition of MIS we elaborate the key concepts like
user-machine system, the concept of an integrated system, the need for a database, and
the role of planning and decision models as under:

1. Computer-Based User-Machine System: Conceptually, an MIS can exist without
computers, but it is the power of computers that makes a comprehensive MIS
feasible; the practical question is the extent to which information processing is
computerized. Computer based means that the designer of a management
information system must have knowledge of computers and of their use in
information processing. The concept of a user-machine system implies that some
tasks are best performed by humans, while others are best done by machine.
Further, the system designer should also understand the capabilities of humans as
system components (as information processors) and the behavior of humans as
users of information.

2. Integrated System: MIS provides the basis for the integration of organizational
information processing. Individual applications developed in isolation may be
inconsistent and incompatible: data items may be specified differently and may not
be compatible across applications that use the same data, and there may be
redundancy of data in such an individualistic approach. The first step in integration
is overall information system planning. Information system integration is also
achieved through standards, guidelines, and procedures set by the MIS function.
The enforcement of such standards and procedures permits diverse applications to
share data, meet audit and control requirements, and be shared by multiple users.

3. Need of a database: The underlying concept of a database is that the data needs of
an organization are managed so that data are available for processing and are of
appropriate quality. A common and controlled approach is used to access, update
and maintain data. This helps to minimize the redundancy of data.

4. Decision models: Data usually need to be processed and presented in such a way
that the result is directed towards the decision to be made. Processing of data
items is based on decision models. Decision models can be used to support
different stages of the decision making process; models are used to identify and
analyze possible solutions.
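
As a purely illustrative example, the classical economic order quantity (EOQ) model is
the kind of decision model an MIS might apply to an ordering decision. A minimal sketch
in Python, with hypothetical figures for a packaging-material item, could look as follows:

    # Illustrative decision model: economic order quantity (EOQ)
    # EOQ = sqrt(2*D*S/H), the order size that minimizes total inventory cost, where
    # D = annual demand, S = ordering cost per order, H = holding cost per unit per year.
    import math

    def economic_order_quantity(annual_demand, ordering_cost, holding_cost):
        return math.sqrt(2 * annual_demand * ordering_cost / holding_cost)

    # Hypothetical figures, for illustration only
    print(economic_order_quantity(annual_demand=12000, ordering_cost=500.0, holding_cost=2.5))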

6.0 MIS versus Data Processing

When the term MIS became popular, many organizations started calling their data
processing systems MIS without understanding the concept of MIS. The two can be
distinguished as follows:

• A data processing system processes transactions and produces reports. It
represents the automation of fundamental, routine processing to support
operations. Prior to computers, data processing was performed manually or
with simple machines.
• An MIS is more comprehensive; it encompasses the processing of a wider range
of organizational functions and management processes. A TPS is also a part of
an MIS.
• An MIS has the capability to provide analysis, planning and decision making
support.
• In an MIS, users have access to decision models and can query the database on
an ad hoc basis, whereas the database in a TPS, though essential, supports only
routine transaction processing.
• MIS orientation means that information resources are utilized so as to improve
decision making and achieve improved organizational effectiveness.
GRAPHICAL REPRESENTATION OF SCIENTIFIC DATA

Anand Prakash Ruhil


Scientist, Computer Center,
NDRI, Karnal

1.0 Introduction
Science is a discipline that involves the planning of experiments, the collection and
organization of data, the evaluation of results and the presentation of findings to the
public or the persons concerned. Therefore, as a researcher one needs to develop the
skills for researching information, designing experiments, and then analyzing and
presenting the data produced. The data produced can be presented in various forms such
as text, tables or graphs. Each method has its merits and demerits. It is commonly said
that “a graph is worth a thousand words”, which underlines the importance of graphs in
the scientific community. Among all the methods, the graphical method is the most
effective way to describe, explore and summarize a set of numbers. In modern society
graphs are extensively used to represent trends, profit and loss, quantities, sales figures,
growth and decline, comparisons, ratios, etc. Statistical methods in combination with
graphical representation form a powerful tool to analyze and communicate
information. According to one estimate, two trillion images of statistical graphs are
produced every year, and new techniques of graphical representation are evolving
rapidly.

Though it is difficult to trace the history of the creation of graphs, it is generally held
that graphs first appeared around 1770 and became popular only around 1820-30.
They appeared independently, at different times, in three different places: the
statistical atlases of William Playfair, the indicator diagrams of James Watt, and the
writings of Johann Heinrich Lambert. William Playfair first presented graphs in
his Commercial and Political Atlas of 1785. James Watt’s indicator was the first self-
recording instrument; it drew a pressure-volume graph of the steam in the cylinder of
an engine while it was in action. Johann Heinrich Lambert used graphs
extensively in the eighteenth century: in the 1760s-70s he used them not only to present
data but also to average out random errors by drawing the best curve through
experimental data points. By the 1790s graphs of several different forms were available,
but they were largely ignored until the 1820s-30s, when statistical and experimental
graphs became much more common.

With the advent of computers the generation of graphs has become an easier task, and
even a novice can produce beautiful graphs within a few minutes with the help of a
computer. However, this has also restrained the creative process involved in graphical
representation of data by making such graphs easy to create only in the standardized
forms available in software packages. In the literature, the words diagram, chart and
graph are commonly used interchangeably.
Therefore, let us first understand the meaning of each of these terms before moving
ahead. The dictionary meanings of these words are as follows:
Diagram: 1. A figure, usually consisting of a line drawing, made to accompany and
illustrate a geometrical theorem, mathematical demonstration, etc.
2. A drawing, sketch or plan that outlines and explains the parts,
operation, etc. of something, for example a diagram of an engine.
3. A pictorial representation of a quantity or of a relationship.

Chart: 1. A map designed to aid navigation by sea or air.


2. An outline map showing special conditions or facts e.g. a weather chart.
3. A sheet exhibiting information in tabulated or methodical form.
4. A graphic representation of data, as by lines, curves, bars, etc., of a
dependent variable, e.g. temperature, price, etc.

Graph: 1. A drawing representing the relationship between certain set of numbers


or quantities by means of a series of dots, lines, bars, etc. plotted with
reference to a set of axes.
2. A drawing depicting a relationship between two or more variables by
means of a curve or surface containing only those points whose
coordinates satisfy the relation.
3. A network of lines connecting points.

This lecture will introduce different tools for generating graphs through computers
and presenting the scientific data graphically.

2.0 Comparison between Tabular and Diagrammatic Presentation

The most popular ways of presenting data are the tabular form and the graphical form.
Both of these methods have their own importance. Sometimes data can be better
presented by a table than by a graph. For example, the price of milk determined by a
two-axis formula based on fat and SNF percentages is presented more effectively in the
form of a table than as a graph (a small computational sketch of such a price table is
given after the comparison below). On the other hand, graphs are generally more
effective when a trend or a comparison is to be shown. The following table highlights
the comparison between the tabular presentation of data and the diagrammatic or
graphic presentation of data.

    Diagrammatic Presentation                    |  Tabular Presentation
1.  Diagrams and graphs are meant for a          |  Tables are meant for statisticians for
    layman.                                      |  the purpose of further analysis.
2.  Diagrams give only an approximate idea.      |  Tables contain precise figures; exact
                                                 |  values can be read from tables.
3.  Diagrams can be more easily compared,        |  Comparison and interpretation of tables
    and can be interpreted by a layman.          |  can only be done by statisticians, and
                                                 |  it is a difficult task.
4.  Diagrams and graphs cannot present           |  Tables can present more information.
    much information.                            |
5.  Diagrams are more attractive and have        |  Tables are dull for a layman (though they
    a visual appeal.                             |  may be attractive to a statistician).
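
As mentioned above, a milk price table based on fat and SNF is a case where a table
serves better than a graph. A minimal computational sketch follows (in Python), assuming
a two-axis formula of the form price per kg of milk = fat% x fat rate + SNF% x SNF rate;
the rates and the fat/SNF levels used here are purely hypothetical, for illustration only:

    # Hypothetical two-axis milk price table (illustrative rates, not official values)
    FAT_RATE = 5.0   # Rs contributed per kg of milk for each 1% fat
    SNF_RATE = 3.0   # Rs contributed per kg of milk for each 1% SNF

    fat_levels = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
    snf_levels = [8.0, 8.5, 9.0]

    print("Fat%\\SNF%" + "".join(f"{s:8.1f}" for s in snf_levels))
    for f in fat_levels:
        row = "".join(f"{f * FAT_RATE + s * SNF_RATE:8.2f}" for s in snf_levels)
        print(f"{f:9.1f}" + row)
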
3.0 Different Forms of Graphical Representations and Visualization
The various forms of graphical representation and visualization of data or information
are as follows:

 Graphs
 Charts
 Diagrams
 Data maps and geographical information systems (GIS)
 Time series
 Narrative graphs
 Animation
 Virtual reality

4.0 Difference between Graph and Diagram


Actually, it is difficult to draw a clear-cut line of demarcation between a diagram
and a graph; however, based on experience and exposure to different situations, the
following points of difference between a diagram and a graph can be summarized:

 A graph needs a graph paper, but a diagram can be drawn on plain paper. In
technical terms, a graph depicts a mathematical relation between two variables,
which is not necessarily the case for a diagram.
 As diagrams are attractive to look at, they are used for publicity and
propaganda. Graphs, on the other hand, are more useful to statisticians and
research workers for the purpose of further analysis.
 For representing frequency distributions, diagrams are rarely used compared
with graphs; for example, for a time series, graphs are more appropriate than
diagrams.
 Graphs are drawn using axes, while a diagram can be drawn in any form.

5.0 Purpose of a Graph/ Diagram


The purpose of any graph is to communicate complex ideas visually, with clarity,
precision and efficiency. The information presented in a graph must be understood
easily and quickly. A graph should serve the following purposes:

 To present the data visually.


 To show trends, not detail.
 To show and compare changes.
 To show and compare relationship between quantities.
 To explore data and identify areas worthy of further study.
 To communicate the meaning of large volumes of data in summarized form.
 To make the viewer think about the substance of the data, not the graphic itself.
6.0 General Principles of Constructing Effective Graph/ Diagram
How the information is conveyed through a graph is an important part of the presentation.
The objective of a graph is to depict data visually; therefore it is important to avoid visual
elements that do not add to the data, and to choose a graph design that visually shows
the comparisons you intend to make. Some general rules to keep in mind while
preparing graphs are given below:

 The graph should be simple and not too messy.


 Show data without changing the data’s message.
 Visual sense should prevail.
 Each diagram must be given a clear, concise and suitable title without
damaging clarity.
 Make all text horizontal, practical and readable.
 A proper proportion between height and width must be maintained in order to
avoid an unpleasant look.
 Select a proper scale; it should preferably be in even numbers or in multiples of
five or ten, e.g. 25, 50, 75 … or 10, 20, 30, 40 …, though there is no fixed rule.
 Use footnotes to clarify certain points.
 An index (or legends), explaining different lines, shades and colors should be
given.
 Graphs should be absolutely neat and clean.
 Use a consistent format when showing groups of graphs.
 The methodology, design and technology used to create graph should be
transparent.
 Independent, explanatory and category variables are usually plotted on the X-
axis.
 Dependent or response variables are usually plotted on Y-axis.
 The position, size, shape, length, symbols, angle and color are all visual codes
that carry messages, so be sure they are the right ones.
 The graph should be closely supported with statistical and textual description
of data.

"The important point that must be borne in mind at all times that the pictorial
representation chosen for any situation must depict the true relationship and point out
the proper conclusion. Above all the chart must be honest.".... C. W. LOWE.

6.1 Advantages of Graphs

 Quick way for the reader to visualize what we want to convey – trends up or
down
 Forceful – Emphasize main point
 Convincing – proves a point, see and hear
 Compact way of conveying information
 More interesting than talk
6.2 Disadvantages of Graphs

 Time consuming to make – decision must be made in advance for layout,
color, etc.
 Technical in nature – knowledge is required to interpret and understand.
 Costly – depending on the medium used

7.0 Software Packages for Constructing Graphs


Following is the list of some commonly used software packages for constructing
graphs, charts, etc.

1. MS Excel
2. Lotus 1-2-3
3. Harvard Graphics
4. Borland Quattro Pro
5. Prism
6. Sigma Plot
7. MATLAB
8. SYSTAT
9. FOCUS
10. Graphical Analysis http://www.vernier.com/downloads/ga3demo.html
11. SmartDraw http://www.smartdraw.com/specials/charts.asp?id=35557

8.0 Types of Graphs/ Charts


It is important to know what type of graph should be used when presenting statistics.
There are several types of graphs available for different purposes. Each graph has
characteristics that make it useful and suitable in certain situations. Some of the
most commonly used graphs/ charts are given as follows:

 Line Graph
 Scatter Plot
 Bar Graph
 Histogram
 Pie Chart
 Flow Chart
 Organizational Charts

The following sections introduce these graphs, which can be generated through
computer packages such as MS Excel, for presenting scientific data graphically.

8.1 Line Graph

A line graph is a common way of presenting statistics. It is used with continuous data
and compares two variables, each plotted along an axis: the dependent variable is drawn
on the Y-axis and the independent variable on the X-axis. Line graphs are designed to
show trends in data by plotting values over a period of time. A line graph may comprise
one or more lines (preferably not more than 3-5), each representing a specific piece of
data. Some of the strengths of a line graph are:

 Given the value of one variable, the other can easily be determined.
 It shows trends, i.e. how one variable is affected by the other as it increases
or decreases.
 It enables the reader to make predictions about the results of data not yet
recorded.

It is important to use a correct scale while drawing line graphs; otherwise the line’s
shape may give an incorrect impression of the information. It is also important to use a
consistent scale when comparing two graphs.

An area chart is similar to the line chart, with the exception that it provides shading
underneath each line to place emphasis on the values.
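
As a brief sketch outside the MS Excel demonstration (assuming Python with the
matplotlib package is available), a line graph showing a trend over time can be produced
as follows; the production figures used are hypothetical:

    # Line graph: trend of a variable over time (hypothetical data)
    import matplotlib.pyplot as plt

    years = [2001, 2002, 2003, 2004, 2005]               # independent variable (X-axis)
    production = [84.4, 86.2, 88.1, 92.5, 95.6]          # dependent variable (Y-axis)

    plt.plot(years, production, marker="o", label="Milk production")
    plt.xlabel("Year")
    plt.ylabel("Production (million tonnes)")
    plt.title("Trend of milk production (illustrative data)")
    plt.legend()
    plt.show()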

8.2 Scatter Plots

Scatter plots are similar to line graphs in that they use horizontal and vertical axes to
plot data points, but without a line joining them. However, they have a very specific
purpose: scatter plots are used to study possible relationships between two variables. The
relationship between two variables is called their correlation. Scatter plots show how
much one variable is affected when another variable is changed.
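
A minimal sketch (again assuming Python with matplotlib and numpy; the fat and SNF
values are hypothetical) that plots a scatter diagram and reports the correlation between
the two variables:

    # Scatter plot of two variables and their Pearson correlation (hypothetical data)
    import numpy as np
    import matplotlib.pyplot as plt

    fat = np.array([3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0])   # variable 1
    snf = np.array([8.1, 8.3, 8.4, 8.6, 8.8, 8.9, 9.1])   # variable 2

    r = np.corrcoef(fat, snf)[0, 1]                        # correlation coefficient

    plt.scatter(fat, snf)                                  # points only, no joining line
    plt.xlabel("Fat (%)")
    plt.ylabel("SNF (%)")
    plt.title(f"Relationship between fat and SNF (r = {r:.2f})")
    plt.show()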

8.3 Bar Graph

A bar graph represents only one variable. Data are displayed as bars, usually used to
compare totals, and it is used when the data or information are discrete; for example,
sales, production or population figures for various years may be shown by simple bar
charts. More than one series of data can be used in a bar graph. Since the bars in a bar
graph are of the same width and vary only in height or length (the greater the length,
the greater the value), it becomes very easy for readers to study the relationship. A bar
graph can be either vertical or horizontal: horizontal bar graphs are known as bar
graphs, while vertical bar graphs are known as column graphs.

A stacked bar graph (also known as a sub-divided bar graph) shows the summation of
individual components. While constructing such a graph, the various components in
each bar should be kept in the same order. A common and helpful arrangement is to
present each bar in order of magnitude, with the largest component at the bottom and
the smallest at the top. The components are shown with different shades or colors, with
a proper index.
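
A short sketch of a simple bar (column) graph and a stacked bar graph, assuming Python
with matplotlib; the production figures are hypothetical:

    # Simple and stacked bar graphs (hypothetical data)
    import matplotlib.pyplot as plt

    years = ["2003", "2004", "2005"]
    buffalo_milk = [48, 50, 53]       # larger component, drawn at the bottom of the stack
    cow_milk = [40, 44, 47]

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

    ax1.bar(years, cow_milk)                        # simple column graph
    ax1.set_title("Simple bar graph")
    ax1.set_ylabel("Production")

    ax2.bar(years, buffalo_milk, label="Buffalo milk")
    ax2.bar(years, cow_milk, bottom=buffalo_milk, label="Cow milk")   # stacked on top
    ax2.set_title("Stacked bar graph")
    ax2.legend()

    plt.tight_layout()
    plt.show()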

8.4 Histogram

A histogram is defined as a pictorial representation of a grouped frequency distribution
by means of adjacent rectangles whose areas are proportional to the frequencies. A
histogram has a similar appearance to a column graph but has no gaps between the
columns. A histogram is used to depict data from the measurement of a continuous
variable. Technically, the difference between a column graph and a histogram is that in a
histogram frequency is measured by the area of the column, while in a column graph
frequency is measured by the height of the column.

To construct a histogram, the class intervals are plotted along the X-axis and the
corresponding frequencies along the Y-axis. The rectangles are constructed such that the
height of each rectangle is proportional to the frequency of that class and the width is
equal to the length of the class. If all the classes have equal width, all the rectangles
stand on bases of equal width; for classes of unequal width, the rectangles stand on
unequal bases. For open-ended classes, a histogram is constructed only after making
certain assumptions. As the rectangles are adjacent, leaving no gaps, class intervals
given in the inclusive form require an adjustment of the end points (class boundaries)
so that the classes become continuous.

A frequency polygon is a graph formed by joining the mid-points of the histogram
column tops. It is used only when depicting data from a continuous variable shown on a
histogram. It smoothes out the abrupt changes that may appear in a histogram and is
therefore useful for demonstrating the continuity of the variable being studied.
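
A brief sketch of a histogram with a superimposed frequency polygon, assuming Python
with matplotlib and numpy; the fat-percentage measurements are simulated purely for
illustration:

    # Histogram of a continuous variable with a frequency polygon (simulated data)
    import numpy as np
    import matplotlib.pyplot as plt

    np.random.seed(0)
    fat = np.random.normal(loc=4.5, scale=0.5, size=100)   # 100 simulated fat % readings

    counts, edges, _ = plt.hist(fat, bins=8, edgecolor="black")   # adjacent rectangles

    midpoints = (edges[:-1] + edges[1:]) / 2                # mid-points of the column tops
    plt.plot(midpoints, counts, marker="o", color="red")    # frequency polygon

    plt.xlabel("Fat (%) - class intervals")
    plt.ylabel("Frequency")
    plt.title("Histogram with frequency polygon (simulated data)")
    plt.show()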

8.5 Pie Chart

A pie chart (also known as a circle graph) is a circular diagram in which a circle (pie) is
divided by radii into sectors (like slices of a cake or pie). The area of each sector is
proportional to the size of the corresponding component, so each component displays its
contribution to the whole of an entity, typically expressed as a percentage of the whole.
A pie chart does not use a set of axes to plot points.

When a statistical phenomenon is composed of several different components (say four
or more), bar charts are not well suited to represent them because, in this situation, they
become very complex and their visual impression suffers. A pie chart is suitable for such
situations. A pie chart is useful when we want to show the relative positions
(proportions) of the figures which make up the total.

The sectors of a pie chart are ordered from largest to the smallest for easier
interpretation of the data and they must be drawn in the counter-clockwise direction.
The advantage of a pie chart is that it is simple to draw. However, its disadvantage is
that it can be very difficult to see the difference in sector sizes when their values are
similar.

Pie charts can misrepresent data in the following situations:

 Leaving out one or more of the parts of the whole.
 Not defining what the whole stands for.
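
A minimal sketch of a pie chart, assuming Python with matplotlib; the component shares
of milk utilisation are hypothetical:

    # Pie chart: contribution of each component to the whole (hypothetical shares)
    import matplotlib.pyplot as plt

    components = ["Fluid milk", "Ghee", "Curd", "Paneer", "Others"]
    share = [46, 28, 12, 8, 6]                 # percentages of the whole (sum to 100)

    # Sectors are drawn counter-clockwise; autopct labels each sector with its percentage
    plt.pie(share, labels=components, autopct="%1.0f%%", startangle=90)
    plt.title("Utilisation of milk (illustrative shares)")
    plt.show()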

8.6 Flow Chart

A flow chart is defined as a pictorial representation describing a process being studied,
or even used to plan the stages of a project. It graphically shows the sequential steps
involved in solving a problem or in a process. Flow charts provide an excellent form
of documentation for a process, and quite often are useful when examining how the
various steps in a process work together. Specific symbols are used for different
purposes in drawing a flow chart (these are provided in MS Word, Excel and PowerPoint).

Different types of flow charts used to deal with process analysis are: top-down flow
chart, detailed flow chart, work flow chart and deployment flow chart. Each of the
different types of flow charts tends to provide a different aspect of a process or a task.
A flow chart can be customized to fit any need or purpose.

8.7 Organizational Chart

An organizational chart depicts the relationships among the various components of an
organization. It graphically shows the hierarchy, or chain of command, in the
organization and also how one department is related to another.

8.8 Selection of the Right Graph

Purpose                        Line     Scatter   Bar      Histogram   Pie      Flow     Organizational
                               Graph    Plots     Graph                Chart    Chart    Chart
Whole and its parts              -        -       May be      -        Yes       -           -
Proportion of components         -        -       May be      -        Yes       -           -
Hierarchy of components          -        -         -         -         -        -          Yes
Simple comparisons             May be     -        Yes        -        Yes       -           -
Multiple comparisons           May be     -        Yes        -         -        -           -
Trends                          Yes       -        Yes        -         -        -           -
Frequency                       Yes       -        Yes       Yes        -        -           -
Sequence                         -        -         -         -         -       Yes          -
Relationship between two       May be    Yes      May be      -         -        -           -
variables
Information And Communication Technology
Dr. Shuchita Upadhyaya Bhasin
Dept. of Computer Science & Applications,
Kurukshetra University, Kurukshetra.

1.0 Objectives
Information Technology
- Information & Data
- Information Technology
- Computers and Information Technology
- Hardware & Software

Communication Technology
- Data Communication
- Transmission Media

Computer Networks
- Computer Network
- Network Goal
- Network Categorization
- Internet

2.0 What is Information Technology


Data: Raw facts or knowledge or statistics about something forming information
Information: Information is derived from Data.
Technology: The application of science to various requirements or applications
meant for industrial, commercial or individual objectives.
Information Technology: refers to the Creation, Gathering, Processing, Storage,
and Delivery of Information and the processes and devices that make all this
possible.
Computers and Information Technology: Earlier, information was processed with
mechanical devices such as typewriters, telephones, etc. Today, all such devices
have been automated and their functions are controlled by computers, which give
extraordinary capabilities to ordinary devices. Information Technology is
advancing so rapidly that previously isolated fields such as television, telephony
and computing are now converging into a single field. Information Technology
stands firmly on two legs:
Hardware: Physical equipment usually containing electronic components and
performing some kind of function in information processing. E.g. Computers and
its components, printers etc.
Software: Instructions that guide the hardware in the performance of its duties.

3.0 Communication Technology


3.1 Data Communication:

The powerful capabilities of computers to process, manipulate and create data to
provide information are strengthened by the use of data communication systems.
A data communication system provides for the transport of data and information
between and among computers located at geographically distant locations.

Source → Transmitter → Transmission System → Receiver → Destination

Source: Device that generates data.
Transmitter: Transforms (encodes) information into electromagnetic signals that
can be transmitted across a transmission system.
Transmission System: Can be a single transmission line or a complex telephone
network.
Receiver: Accepts the signal from the transmission system and converts it into a form
that can be handled by the destination.
Destination: Device that accepts the data for applications.

3.2 Transmission Media:

Computers and other telecommunication devices use signals to represent data.
These signals are transmitted from one device to another in the form of electromagnetic
energy. Electromagnetic signals can travel through a vacuum, through air, or through
other transmission media. Electromagnetic energy, a combination of electrical and
magnetic fields vibrating in relation to each other, includes power, voice, radio waves,
infrared light, visible light, ultraviolet light, and X, gamma, and cosmic rays.

Transmission media can be divided into two broad categories or types:

 Guided (Wired): Copper wires (Twisted pairs, coaxial cables) and Optical fibers.
 Unguided (Wireless): Radio waves, Micro Waves and Infrared waves.

Guided or Wired media are those in which the signal energy is contained and
guided within a solid medium, and wireless or unguided media are those in which the
signal propagates in the form of unguided electromagnetic signals. Copper twisted pair,
copper coaxial cable and optical fiber are examples of guided media. The atmosphere and
outer space are examples of unguided media that provide a means of transmitting
electromagnetic signals but do not guide them. Not all portions of the spectrum are
currently usable for telecommunications. Voice-band frequencies are generally
transmitted as current over metal cables, such as twisted-pair or co-axial cable. Radio
frequencies can travel through air or space, but require specific transmitting and receiving
mechanisms. Visible light, the last type of electromagnetic energy currently used for
communications, is harnessed using fiber-optic cable.

For unguided media, transmission and reception are achieved by means of an antenna.
For transmission, the antenna radiates electromagnetic energy into the medium (usually
air), and for reception, the antenna picks up electromagnetic waves from the surrounding
medium. There are basically two types of configurations for wireless transmission:
directional and omnidirectional. For the directional configuration, the transmitting antenna
puts out a focused electromagnetic beam; the transmitting and receiving antennas must
therefore be carefully aligned. In the omnidirectional case, the transmitted signal spreads
out in all directions and can be received by many antennas. In general, the higher the
frequency of a signal, the more it is possible to focus it into a directional beam.

Three general ranges of frequencies are common for wireless transmission and are
identified as radio waves, microwaves and infrared waves.

4.0 Computer Networks


The merging of Computers and Communications has given rise to computer networks
where a large number of separate but interconnected computers do the job. In contrast to
the earlier generations of computing in which computation and data storage was
centralized, we now have distributed systems in which a user may retrieve a program
from one place, run it on any of a variety of processors, and send the result to a third
location. Such a system connecting different devices such as PCs, printers, disk drives
etc. is a computer network.

A Computer Network may specifically be defined as “an interconnected collection of
autonomous computers”. Two computers are said to be interconnected if they are able to
exchange information. The connections may be through copper wires, optical fibers, and
wireless electromagnetic or optical media. ‘Autonomous’ means that there is no master/
slave relationship between the connected devices. Typically, each device in a network
serves a specific purpose for one or more individuals. For example, a PC can provide
access to information or software; another PC, on the other hand, may be a file server
devoted to managing a disk drive containing shared files. A network may cover a small
geographic area, connecting devices in a single building or group of buildings. Such a
network is a Local area network (LAN). A network that covers a large area such as a
state, country or the world is called a Wide area network (WAN).

4.1 Goals and Applications of Computer Networks

Network goals can be summarized in terms of the uses of Networks for companies,
organizations, people etc. These uses can be viewed as the facilities provided by
computer networks. Some of the goals or objectives can be summarized as:

I. Resource sharing: The goal is to make all programs, data, and equipment
available to anyone on the network without regard to the physical location of
the resource and the user. This provides high availability of resources to
users.
II. Load sharing: this is another aspect of resource sharing. Sharing load
between multiple computers connected together can reduce the delays for
carrying out time intensive applications.
III. High reliability: High reliability can be achieved due to alternative sources of
supply. For example, all files could be replicated on two or three machines, so
that if one of them is unavailable (due to a hardware failure), the other copies
could be used. In addition, the presence of multiple CPUs means that if one
goes down the others may be able to take over its work. For real-time
applications such as military, banking and air traffic control, the ability to
continue operating in the face of hardware problems is of great importance.
IV. Cost effectiveness: Small computers have a much better price/performance
ratio than large ones. Mainframes are roughly a factor of ten faster than the
fastest single-chip microprocessor, but they cost a thousand times more. Thus
it may be more appropriate to have a network of low-cost PCs running in
parallel rather than terminals (users) connected to a single high-cost
mainframe operating in time-sharing mode. This imbalance has caused
designers to build client-server systems, in which data are kept on one or more
shared file server machines and users (clients) can share (access) these data
through their personal computers connected to the server(s) on a network (a
minimal client-server sketch is given after this list). Such a network with
many computers located in the same room or building is called a Local Area
Network (LAN). In contrast, there can be far-flung networks covering entire
countries or continents; such networks are called Wide Area Networks (WAN).
V. Scalability: A closely related point is the ability to increase system
performance gradually as the workload increases just by adding more PCs.
With a central mainframe, when a system is full, it must be replaced by a
larger one, usually at great expense and with even greater disruption to the
users.
VI. Powerful communication medium: Real-time communication is possible
between two persons who are online but far apart (at distant geographical
locations). Two authors sitting far apart can prepare a report together, making
changes to the document and viewing it together at the same time, instead of
waiting several days for a letter.
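
As a minimal client-server sketch (an illustration in Python using the standard socket
library; the address, port and the shared record are all hypothetical), a server process
holds a piece of data and any client on the network can connect and retrieve it:

    # Minimal client-server illustration (hypothetical address and data)
    import socket

    HOST, PORT = "127.0.0.1", 5000          # assumed address of the data server

    def run_server():
        # Server: waits for a client and sends it the shared record
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen()
            conn, _ = srv.accept()          # one client, for brevity
            with conn:
                conn.sendall(b"shared record: daily milk yield = 12.4 kg")

    def run_client():
        # Client: connects to the server and retrieves the shared record
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect((HOST, PORT))
            print(cli.recv(1024).decode())

    # run_server() would be started on the server machine and run_client() on each client.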

4.2 Network Categorization

In its simplest form, data communication takes place between two devices that
are directly connected by some form of point-to-point transmission medium. Often,
however, it is impractical for two devices to be connected directly, point-to-point. This is
so because of one (or both) of the following reasons:

 The devices are very far apart. It would be inordinately expensive to string a
dedicated link between two devices thousands of miles apart.
 There is a set of devices, each of which may require a link to many of the others
at various times. Except for the case of a very few devices, it is impractical to
provide a dedicated wire between each pair of devices.

The solution to this problem is to attach each device to a communications
network. The way in which different devices are connected may be different depending
upon the distance between the devices. Two dimensions stand out as important in the
classification of networks according to the connectivity between the devices, namely:

- Transmission technology
- Scale

Transmission Technology: There are two types of transmission technologies, identified
according to the architecture and techniques they use for transmission.

 Point-to-point networks: Provides a dedicated link between two devices.


 Broadcast/Multipoint networks: A single communication channel that is
shared by all the machines on the network.

Scale: An alternative criterion for classifying networks is their scale/distances. These can
be divided into:

 Local Area Network (LAN)


 Metropolitan Area Network (MAN)
 Wide Area Network (WAN)

LANs, WANs and MANs can be wired or wireless. The connection of two or
more networks is called an internetwork.

Network Components: Communicating devices (servers, clients, printers, fax machines,
etc.), cables, wireless media, modems, switches, hubs, repeaters, routers, gateways, and
network interface cards.

4.2.1 Local area networks

Local area networks, generally called LANs, are privately owned networks within a
single office, building or campus of up to a few kilometers in size. They are widely used
to connect personal computers and workstations in company offices and factories to share
resources (e.g., printers) and exchange information. LANs, with their emphasis on low
cost and simplicity, have been based on the broadcast approach. LANs are distinguished
from other kinds of networks by three characteristics: (1) their size, (2) their transmission
technology, and (3) their topology.

Size: LANs are restricted in size, which means that the worst-case transmission time is
bounded and known in advance.
Transmission Technology: LANs often use a transmission technology consisting of a
single cable to which all the machines are attached. Traditional LANs run at speeds of 10
to 100 Mbps, have low delay (tens of microseconds), and make very few errors. Newer
LANs may operate at higher speeds, up to hundreds of megabits per second.

Topology: Various topologies are possible for broadcast LANs. A bus (a linear cable)
and a ring are the topologies commonly used in LANs.

(Figure: bus and ring LAN topologies, showing computers attached to a common cable)
This LAN also has a file server that performs the same functions as the central
host computer in a WAN. The file server is usually a microcomputer (usually a more
powerful one), but it may also be a minicomputer or mainframe. The bridge (or a router
or a gateway) is a computer or special device that connects two or more networks. A
bridge enables computers on this LAN to communicate with computers on other LANs
or WANs.

LAN components: There are five basic components of a LAN (Fig. 8). These are: client
computers, servers, network interface cards, network cables and hubs, and a network
operating system.

4.2.2 Metropolitan Area Network (MAN)

A MAN is a network spanning a geographical area that usually encompasses a
city or county area. It interconnects various buildings or other facilities within this city-
wide area. MANs connect LANs and backbone networks (BNs) located in different areas
to each other and to wide area networks. MANs typically span from 3 to 30 miles.

A MAN may be a single network, such as a cable television network, or it may be a
means of connecting a number of LANs into a larger network so that resources may be
shared LAN-to-LAN as well as device-to-device. For example, a company can use a
MAN to connect the LANs in all of its offices throughout a city.

(Figure: LANs connected through a public city network)

A MAN may be wholly owned and operated by a private company, or it may be a
service provided by a public company such as a local telephone company. When LANs in
close proximity need to exchange data, they can be connected privately using cable and
routers or gateways. When the LANs of a single enterprise are distributed over a larger
area (such as a city or large campus), however, privately owned connection infrastructure
is impractical. Even if it is permitted to lay cable on public land, a better alternative is to
use the services of existing utilities, such as the telephone company. SMDS and DQDB
are two such services. Many telephone companies provide a popular MAN service called
Switched Multi-megabit Data Service (SMDS).

4.2.3 Wide Area Networks (WAN)

A wide area network provides long-distance transmission of data, voice, image,
and video information over large geographical areas that may comprise a country, a
continent, or even the whole world. In contrast to LANs (which depend on their own
hardware for transmission), WANs may utilize public, leased, or private communication
devices, usually in combination, and can therefore span an unlimited number of miles.

A WAN that is wholly owned and used by a single company is often referred to as
an enterprise network. A WAN interconnects computers, LANs, BNs, MANs and other
data transmission facilities on a countrywide or worldwide basis. Most organizations do
not build their own WANs by laying cables, building microwave towers, or sending up
satellites. Instead, most organizations lease circuits from inter-exchange carriers (e.g., the
telephone network) and use those to transmit their data.

A Wide Area Network (WAN) typically consists of a collection of machines
intended for running user (i.e. application) programs. These machines may be called
hosts or end systems. The hosts are connected by a communication subnet, whose job is
to carry messages from host to host, thus separating the communication aspects of the
network (the subnet) from the application aspects (the hosts). In most wide area
networks, the subnet consists of two distinct components:

Transmission lines (channels or trunks) move bits between machines. The
switching elements are specialized computers used to connect two or more transmission
lines. When data arrive on an incoming line, the switching element must choose an
outgoing line on which to forward them. These switching elements are variously called
packet switching nodes, interface message processors or routers.

Each host is generally connected to a LAN on which a router is present, although
in some cases a host can be connected directly to a router. The collection of
communication lines and routers forms the subnet.

In most WANs, the network contains numerous cables or telephone lines, each
one connecting a pair of routers. If two routers that do not share a cable wish to
communicate, they must do this indirectly, via other routers.

Another possibility for a WAN is a satellite or ground radio system. Each router
has an antenna through which it can send and receive. All routers can hear the output
from the satellite, and in some cases they can also hear the upward transmissions of their
fellow routers to the satellite as well. Sometimes the routers are connected to a
substantial point-to-point subnet, with only some of them having a satellite antenna.

4.4 Internet

The Internet is a worldwide network of networks that acts as an integrated
whole and provides an infrastructure to connect universities, government offices,
companies, students, scientists, researchers and private individuals. It uses common
protocols for communication and provides common services. It supports millions of
‘server’ computers housing large volumes of all sorts of information.

Services provided by the Internet

1) World Wide Web (WWW): Provides access to information containing
text, pictures, sound and video with embedded links to other pages.
2) E-Mail: Facilitates the composition, transfer and receiving of mails from
one computer system to another.
3) Usenet News: News groups are specialized forums in which users with a
common interest can exchange messages.
4) Remote login: User can login with a remote computer and work on that
computer.
5) File Transfer: Files of data can be transferred from one computer to
another.
6) Chat: Synchronous (happening in real time) line-by-line communication
with another user over a network.
