Copyright © 2008, New Age International (P) Ltd., Publishers

Published by New Age International (P) Ltd., Publishers

No part of this ebook may be reproduced in any form, by photostat, microfilm,

xerography, or any other means, or incorporated into any information retrieval

system, electronic or mechanical, without the written permission of the publisher.

All inquiries should be emailed to rights@newagepublishers.com

NEW AGE INTERNATIONAL (P) LIMITED, PUBLISHERS

4835/24, Ansari Road, Daryaganj, New Delhi - 110002

Visit us at www.newagepublishers.com

Dedicated to the memory of

JESUS CHRIST

and

SARASWATI

Preface

This book deals with a novel paradigm of neural networks, called multidimensional neural

networks. It also provides a comprehensive description of a certain unified theory of control,

communication and computation. This book can serve as a textbook for an advanced course

on neural networks or computational intelligence/cybernetics. Both senior undergraduate

and graduate students can benefit from such a course. It can also serve as a reference

book for practising engineers utilizing neural networks. Furthermore, the book can be

used as a research monograph by neural network researchers.

In the field of electrical engineering, researchers have innovated sub-fields such as

control theory, communication theory and computation theory. Concepts such as logic

gates, error correcting codes and optimal control vectors arise in the computation,

communication and control theories respectively. In one dimensional systems, the concepts

of error correcting codes and logic gates are related to neural networks. The author, in his

research efforts, showed that the optimal control vectors (associated with a one dimensional

linear system) constitute the stable states of a neural network. Thus a unified theory was

discovered and formalized in one dimensional systems. Questioning the possibility of

logic gates operating on higher dimensional arrays resulted in the discovery as well as

formalisation of the research area of multi/infinite dimensional logic theory. The author

has generalised the known relationship between one dimensional logic theory and one

dimensional neural networks to multiple dimensions. He has also generalised the

relationship between one dimensional neural networks and error correcting codes to

multidimensions (using a generator tensor).

On the way to unification in multidimensional systems the author has discovered and

formalised the concept of tensor state space representation of certain multidimensional

linear systems.

It is well accepted that the area of complex valued neural networks is a very promising

research area. The author has proposed a novel activation function called the complex signum

function. This function has enabled proposing a complex valued neural associative memory

on the complex hypercube.

He has also proposed novel models of the neuron (such as a linear filter model of the synapse).

This book contains 10 chapters. The first chapter provides an introduction to the unified

theory of control, communication and computation. Chapter 2 introduces a mathematical

Chapter 3, the concepts of multidimensional error correcting codes, multidimensional

neural networks and optimization of multi-variate polynomials (associated with a tensor)

over various subsets of multidimensional lattice, are related from different view points. In

Chapter 4, Tensor State Space Representation (TSSR) of certain multidimensional linear

systems is discussed. In Chapter 5, Unified Theory of Control, Communication and

Computation in multidimensional linear systems is summarized. In Chapter 6, the author

proposes a novel complex signum function. In Chapter 7, a novel optimal filtering problem

associated with a one dimensional linear system is formulated and solved. In Chapter 8, a

linear filter model of synapse is proposed. Also a novel continuous time associative memory

and the associated convergence theorem are discussed. In Chapter 9, a novel model of

neuron and associated real/complex neural networks are proposed. Finally, in Chapter 10,

an advanced theory of evolution based on the unified theory is briefly discussed.

The chapters in this book are organised in such a way that there is considerable

flexibility in its use by its readers. For instance, Chapters 1 to 5 can form the basis for a

graduate course on multidimensional neural networks and unified theory. This course is a

compulsory course for students interested in doing research on computational intelligence

(cybernetics). The students/researchers interested in doing research on complex valued

neural networks will find interesting material in Chapters 6 and 9. Further, the students/

researchers interested in exploring the interrelationship between signal processing and neural

networks will enjoy understanding the material in Chapters 7 and 8. Finally, Chapter 10

will provide counter-intuitive insights into the theory of organic evolution.

This writing project would not have been possible without the cooperation of my brother Dr.

G.V.S.R. Prasad and my beloved mother. I thank many colleagues at IIIT and those around

the world who believe that this book is my first masterpiece. I especially thank Sri Damodaran

and other employees of New Age International (P) Ltd. for making my dream of publishing

this book a reality.

G. Rama Murthy

Contents

PREFACE

1. INTRODUCTION

LOGICAL BASIS FOR COMPUTATION

LOGICAL BASIS FOR CONTROL

LOGICAL BASIS OF COMMUNICATION

ADVANCED THEORY OF EVOLUTION

DIMENSIONAL LOGIC THEORY

2.1. INTRODUCTION

2.2. MATHEMATICAL MODEL OF MULTIDIMENSIONAL NEURAL NETWORKS

2.3. CONVERGENCE THEOREM FOR MULTIDIMENSIONAL NEURAL NETWORKS

2.4. MULTIDIMENSIONAL LOGIC THEORY, LOGIC SYNTHESIS

2.5. INFINITE DIMENSIONAL LOGIC THEORY: INFINITE DIMENSIONAL LOGIC SYNTHESIS

2.6. NEURAL NETWORKS, LOGIC THEORIES, CONSTRAINED STATIC OPTIMIZATION

2.7. CONCLUSIONS

DIMENSIONAL NEURAL NETWORKS – CONSTRAINED STATIC OPTIMIZATION

3.1. INTRODUCTION

3.2. MULTIDIMENSIONAL NEURAL NETWORKS: MINIMUM CUT COMPUTATION IN THE CONNECTION STRUCTURE: GRAPHOID CODES

3.3. MULTIDIMENSIONAL ERROR CORRECTING CODES: ASSOCIATED ENERGY FUNCTIONS – GENERALIZED NEURAL NETWORKS

3.4. MULTIDIMENSIONAL ERROR CORRECTING CODES: RELATIONSHIP TO STABLE STATES OF ENERGY FUNCTIONS

3.5. NON-BINARY LINEAR CODES

3.6. NON-LINEAR CODES

3.7. CONSTRAINED STATIC OPTIMIZATION

3.8. CONCLUSIONS

4.1. INTRODUCTION

4.2. STATE OF THE ART IN MULTI/INFINITE DIMENSIONAL STATIC/DYNAMIC SYSTEM THEORY: REPRESENTATION BY TENSOR LINEAR OPERATOR

4.3. STATE SPACE REPRESENTATION OF CERTAIN MULTI/INFINITE DIMENSIONAL DYNAMICAL SYSTEMS: TENSOR LINEAR OPERATOR

4.4. MULTI/INFINITE DIMENSIONAL SYSTEM THEORY: LINEAR DYNAMICAL SYSTEMS – STATE SPACE REPRESENTATION BY TENSOR LINEAR OPERATORS

4.5. STOCHASTIC DYNAMICAL SYSTEMS

4.6. DISTRIBUTED DYNAMICAL SYSTEMS

4.7. CONCLUSIONS

COMPUTATION: MULTIDIMENSIONAL NEURAL NETWORKS

5.1. INTRODUCTION

5.2. ONE DIMENSIONAL LOGIC FUNCTIONS, CODEWORD VECTORS, OPTIMAL CONTROL VECTORS: ONE DIMENSIONAL NEURAL NETWORKS

5.3. OPTIMAL CONTROL TENSORS: MULTIDIMENSIONAL NEURAL NETWORKS

5.4. MULTIDIMENSIONAL SYSTEMS: OPTIMAL CONTROL TENSORS, CODEWORD TENSORS AND SWITCHING FUNCTION TENSORS

5.5. CONCLUSIONS

6.1. INTRODUCTION

6.2. FEATURES OF THE PROPOSED MODEL

6.3. CONVERGENCE THEOREMS

6.4. CONCLUSIONS

7.1. INTRODUCTION

7.2. OPTIMAL SIGNAL DESIGN PROBLEM: SOLUTION

7.3. OPTIMAL FILTER DESIGN PROBLEM: SOLUTION (DUAL OF SIGNAL DESIGN PROBLEM)

7.4. CONCLUSIONS

VALUED NEURAL NETWORKS

8.1. INTRODUCTION

8.2. CONTINUOUS TIME PERCEPTRON AND GENERALIZATIONS

8.3. ABSTRACT MATHEMATICAL STRUCTURE OF NEURONAL MODELS

8.4. FINITE IMPULSE RESPONSE MODEL OF SYNAPSES: NEURAL NETWORKS

8.6. MULTIDIMENSIONAL GENERALIZATIONS

8.7. GENERALIZATION TO COMPLEX VALUED NEURAL NETWORKS (CVNNS)

8.8. CONCLUSIONS

9.1. INTRODUCTION

9.2. DISCRETE FOURIER TRANSFORM: SOME COMPLEX VALUED NEURAL NETWORKS

9.3. COMPLEX VALUED PERCEPTRON

9.4. NOVEL MODEL OF A NEURON: ASSOCIATED NEURAL NETWORKS

9.5. CONTINUOUS TIME PERCEPTRON LEARNING LAW

9.6. SOME IMPORTANT GENERALIZATIONS

9.7. SOME OPEN QUESTIONS

9.8. CONCLUSIONS

10.1. UNIFIED THEORY: CYBERNETICS

10.2. ORGANIC EVOLUTION

10.3. EVOLUTION OF LIVING SYSTEMS: INNOVATIVE PRINCIPLES

10.4. CONCLUSIONS

INDEX

CHAPTER

1

Introduction

Ever since the dawn of civilization, the homo-sapien animal, unlike other lower level animals,

has constantly created tools that enabled the community not only to take advantage of

the physical universe but also to develop a better understanding of physical reality through

the discovery of underlying physical laws. The homo-sapien, like other lower level animals,

had two primary necessities: metabolism and reproduction. But more important was the

obsession with other developed necessities such as art, painting, music and sculpture.

These necessities naturally led to the habit of concentration. This most important habit

enabled him to develop abstract tools utilized to study nature in most advanced civilizations.

Thus the homo-sapien animal achieved the distinction of being a higher animal compared

to the other animals in nature.

In ancient Greece, the homo-sapien civilization was highly advanced in many matters

compared to all other civilizations. Such a lead was symbolized by the development of

mathematics in various important stages. The most significant indication of such

development is left to posterity in the form of the 13 books called Euclid’s Elements. These

books provide the first documented effort at the axiomatic development of a mathematical

structure, namely Euclidean geometry. Also, the Greek and Babylonian civilizations made

important strides in algebra: solving linear and quadratic equations and studying the quadratic

homogeneous forms in two variables (for conic sections). Algebra was revived during the

Renaissance in Italy, where the solution of cubic and quartic equations was carried out by the

Italian algebraists. This constituted the intellectual and cultural heritage, along with

religious and social traditions.

To satisfy the curiosity of observing the heavens, various star constellations and

astronomical objects were classified. In navigating ships for battle as well as for

trade, astronomical observations were made. These provided the first curious data related

to the natural world. In an effort to understand the non-living material universe, homo-

sapiens have devised various tools: measuring equipment, experimental equipment,

mathematical procedures, mathematical tools etc.

2 Multidimensional Neural Networks: Unified Theory

With the discovery by Copernicus that the Sun is the center of our relative motion system,

Ptolemaic theory was permanently forsaken. It gave Galileo the motivation for

deriving empirical laws of far-reaching significance in natural philosophy/natural

science/physics. Kepler, after strenuous efforts, derived the laws of planetary motion,

leading to some of the laws of Newton. Isaac Newton formalized the laws of Galileo by

developing calculus. He also developed a theory of gravitation based on the empirical

laws of Kepler. Michael Faraday derived the empirical laws of electric and magnetic

phenomena. Though Newton’s mechanical laws were successfully utilized to explain

heat phenomena and the kinetic theory of gases as being due to the mechanical motion of molecules and

atoms, they were inadequate for electrical phenomena. Maxwell formalized Faraday’s

laws of electro-magnetic induction leading to his field equations. Later physics developed

at a feverish pace.

These results in physics were paralleled by developments in other related areas such

as chemistry, biology etc. Thus, the early efforts of homo-sapiens matured into a clearer

view of the non-living world. The above description summarizes the pre-20th-century

progress of homo-sapien contributions to understanding the non-living

material universe.

In making conclusive statements on the origin and evolution of physical reality,

the developments of the 20th century are more important. In that endeavor, Einstein’s

general theory of relativity was one of the most important cornerstones of 20th century

physics. It enabled him to develop a more general and more accurate theory of gravitation,

superseding the Newtonian theory. It showed that gravitation is due to the curvature of the space-time

continuum. The general theory of relativity also showed that all natural physical

laws are invariant under non-linear transformations. This result was a significant

improvement over the special theory of relativity, where he showed that all natural physical

laws are invariant under linear Lorentz transformations. This result (in the special theory

of relativity) was achieved when Einstein realized that, due to the finiteness of the velocity of

light, one must discard the notions of absolute space and time. They must be replaced

by the notion of a space-time continuum, i.e. space and time are not independent of one

another, but are dependent. Thus, the special and general theories of relativity constrained

the form of natural physical law.

In the 20th century, along with the Theory of Relativity, Quantum Mechanics was

developed due to the efforts of M. Planck, E. Schrodinger and W. Heisenberg. This theory

showed that the electromagnetic field at the quantum level was quantized. This, along

with the wave-particle duality of light, was considered irreconcilable with the general theory

of relativity. To reconcile general theory of relativity with various quantum theories,

Y. Nambu proposed a string model for fundamental particles and formalized the

dynamics of light string. Utilizing the experimentally verified quantum theories of

chromodynamics, electrodynamics, supersymmetry of fundamental particles (unifying

Bosons and Fermions), it was possible to supersymmetrize the string model of

fundamental particles, resulting in the so-called superstring (supersymmetric string)

experimentally verifiable, theoretically viable model.

But the material universe consists of a living universe as well as a non-living universe. All

efforts in science probed the non-living universe using experimental as well as theoretical

methodology. The efforts of all scientists enabled them to see farther by “standing on the

shoulders of earlier giants”. The homo-sapien animals, by devising various tools, discovered

and formalized various laws and theories related to dynamical systems in non-living physical

reality. The homo-sapien animal learned to build machines to facilitate his

life and that of the community surrounding him. By understanding the mechanisms of

various functional units in living systems, such as the ear and the eye, various machines such as the

telephone, television and loudspeaker were built. Also, in the research area of artificial

intelligence in Electrical Engineering, various functions of the human brain are simulated in

machines called robots.

In the case of the living universe, the scene was entirely different. The author made various

pioneering innovations on living systems, unlike the study of non-living systems, which was an

extended effort by various eminent scientists. The objective of this work is to provide

artificial/manufacturable models of living systems, i.e. robots which resemble living systems in every

respect. In arriving at artificial models, the efforts of various eminent

mathematicians and scientists, culminating in those of N. Wiener (who coined the word

CYBERNETICS), were helpful. The important discovery and the associated formalization

belonged to the pioneering efforts of the author.

George Boole developed the algebra of variables that assume “true” or “false” values.

This algebra is called Boolean algebra. Certain elementary Boolean algebraic expressions

are realized in equipment called “logic gates”. When the logic gates are combined/co-ordinated,

an arbitrary Boolean algebraic expression can be computed. The combination of

Boolean logic gates (an assemblage with some minimum configuration of gates) and

memory elements forms an arithmetic unit. When such a unit is coupled with a control

unit, the Central Processing Unit (CPU) in a computer is realized. The CPU, in association

with memory, input and output units, forms a computational unit without intelligence.

This is just a machine which can be utilized to perform computational tasks quickly.

Various thought provoking modifications make it operate on data in an efficient manner

and provide computational results related to various problems.
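As a hedged illustration of how a few co-ordinated gates compute a larger Boolean expression, the following Python sketch builds NOT, AND, OR and XOR from a single NAND primitive and combines them into a one-bit half adder. The function names (`nand`, `half_adder` and so on) are assumptions made for this example and do not come from the text.

```python
# Illustrative sketch: every gate below is composed from one NAND primitive.
def nand(a: bool, b: bool) -> bool:
    return not (a and b)

def not_(a: bool) -> bool:
    return nand(a, a)

def and_(a: bool, b: bool) -> bool:
    return not_(nand(a, b))

def or_(a: bool, b: bool) -> bool:
    # De Morgan: a OR b = NOT(NOT a AND NOT b)
    return nand(not_(a), not_(b))

def xor_(a: bool, b: bool) -> bool:
    # True when exactly one input is true
    return and_(or_(a, b), nand(a, b))

def half_adder(a: bool, b: bool):
    """Return (sum_bit, carry_bit) for two one-bit inputs."""
    return xor_(a, b), and_(a, b)

# Print the full truth table of the half adder
for a in (False, True):
    for b in (False, True):
        print(a, b, half_adder(a, b))
```

The half adder is the kind of small assemblage of gates from which an arithmetic unit can be composed.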

discovered the laws of electro-magnetic induction. Based on his investigations, Fleming

discovered that a time varying electric field leads to a magnetic field which can be exploited

for the motion of a neutral body. He also discovered that a time varying magnetic field leads

to an electric field inside a neutral conductor, so that a current flows. These formed

Fleming’s left hand and right hand rules relating the relativistic effects between the electric

field, magnetic field and conductor. These investigations of Faraday and other scientists

naturally paved the way for electric circuits consisting of resistors, inductors and capacitors.

Such initial efforts led to canonical circuits such as the RL, RLC and RC circuits. The

associated systems of differential equations and their responses were computed utilizing analytical

techniques. The ability to control the motion of an arbitrary neutral object led to applications

of electrical circuits and their modifications for control of the trajectories of aircraft. Thus,

automata which can perform CONTROL tasks were generated. These control automata were

primarily based on electrical circuits and operate in continuous time with the ability to

synchronize at discrete instants. Later, utilizing the Sampling Theorem, sampled-data

control systems operating in discrete time were developed.
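The response computation for a canonical circuit can be sketched for the simplest case, a series RC circuit driven by a step input. The component values, the function names and the choice of forward Euler integration are assumptions made for illustration; the closed form v(t) = V(1 - exp(-t/RC)) is the standard analytical solution of dv/dt = (V - v)/(RC) with v(0) = 0.

```python
import math

def rc_step_analytic(t, r_ohm, c_farad, v_in):
    """Closed-form capacitor voltage for a step of height v_in applied at t = 0."""
    return v_in * (1.0 - math.exp(-t / (r_ohm * c_farad)))

def rc_step_euler(t_end, r_ohm, c_farad, v_in, dt):
    """Integrate dv/dt = (v_in - v) / (R C) with forward Euler from v(0) = 0."""
    v = 0.0
    for _ in range(round(t_end / dt)):
        v += dt * (v_in - v) / (r_ohm * c_farad)
    return v

# Assumed example values: R = 1 kOhm, C = 1 uF, so the time constant is 1 ms
R, C, V_IN = 1_000.0, 1e-6, 5.0
exact = rc_step_analytic(1e-3, R, C, V_IN)
approx = rc_step_euler(1e-3, R, C, V_IN, dt=1e-6)
print(exact, approx)
```

The discrete-time loop is also a small instance of the sampled-data viewpoint: the continuous-time circuit is advanced only at instants spaced `dt` apart.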

The problem of communication is to convey a message from one point in space to another

point in space as reliably as possible. The message, on being transmitted through the channel,

is changed/garbled by various forms of disturbance (noise). By coding

the message (through addition of redundancy), it is possible to retrieve the original message

from the received message.
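A minimal sketch of coding through added redundancy is the repetition code below: each message bit is sent three times and the receiver takes a majority vote over each block. The code itself, the function names and the positions of the flipped bits are illustrative assumptions and are not a scheme taken from the text.

```python
# Illustrative sketch: rate-1/3 repetition code with majority-vote decoding.
def encode(bits, r=3):
    """Repeat every message bit r times (r odd)."""
    return [b for b in bits for _ in range(r)]

def decode(received, r=3):
    """Majority vote over each consecutive block of r received bits."""
    blocks = [received[i:i + r] for i in range(0, len(received), r)]
    return [1 if sum(block) > r // 2 else 0 for block in blocks]

message = [1, 0, 1, 1]
codeword = encode(message)

garbled = codeword.copy()
garbled[1] ^= 1   # noise flips one bit in the first block
garbled[9] ^= 1   # and one bit in the last block

recovered = decode(garbled)
print(recovered == message)
```

One flipped bit per block is corrected; two flips in the same block would defeat this particular code, which is why stronger error correcting codes are of interest.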

Thus, the three problems: control, communication and computation can be described

through the illustration in Figure 1.1. From the illustration, the message that is generated

may be in continuous time or discrete time. Utilizing the Sampling Theorem, if the original

signal is band-limited, then the message can be sampled. The sampled signal forms the

message in discrete time. The message is then encoded through an encoder. It is then

transmitted through a channel. If the channel is a waveform channel, various digital

modulation schemes are utilized in encoding. The signal, on reaching the receiver, is

demodulated through the demodulator and then it is decoded. This whole assembly of

hardware equipment forms the COMMUNICATION equipment.
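The band-limit condition in the Sampling Theorem can be illustrated numerically. In the sketch below, a 1 Hz tone and a 9 Hz tone become indistinguishable when sampled at 8 Hz, because 9 Hz exceeds half the sampling rate, whereas sampling at 20 Hz keeps them distinct. The tone frequencies, sampling rates and function names are assumptions chosen for the example.

```python
import math

def sample(freq_hz, fs_hz, n_samples):
    """Samples of cos(2*pi*freq*t) taken at t = n / fs for n = 0 .. n_samples - 1."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]

def max_difference(a, b):
    """Largest absolute sample-by-sample difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))

# fs = 8 Hz is below twice 9 Hz: the 9 Hz tone aliases onto the 1 Hz tone
aliased_gap = max_difference(sample(1, 8, 32), sample(9, 8, 32))

# fs = 20 Hz is above twice 9 Hz: the two tones give different samples
resolved_gap = max_difference(sample(1, 20, 32), sample(9, 20, 32))

print(aliased_gap, resolved_gap)
```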

The above summary described the efforts of engineers, scientists and mathematicians

to synthesize the automata which serve the purposes of CONTROL, COMMUNICATION

AND COMPUTATION. These functions are the basis of automata that simulate living

systems. These automata model the living systems. In other words, control, communication

and computation automata when properly assembled and co-ordinated lead to robots which

simulate some functions of various living systems.

In the above effort at simulating the functions of living systems in machines, traditionally

the control, communication and computation automata led to sophisticated robots (which

served the purpose pretty well). Thus, the utilitarian viewpoint was partially satisfied.

But, the author took a more FUNDAMENTAL approach to the problem of simulating a

are extended to multi/infinite dimensional linear systems. Also, the results developed

in one dimension for computation of optimal control are immediately extended to certain

multi/infinite dimensional linear systems. This result in association with the formalization

of multi/infinite dimensional logic theory, multi/infinite dimensional coding theory

(as an extension of one dimensional linear and non-linear codes) provided the formal

UNIFIED THEORY in multi/infinite dimensional linear systems. The formal

mathematical details on models of living system functions are provided in Chapters 2 to

5. These chapters provide the details on control, communication and computation

automata in multiple dimensions. Several generalized models of neural networks are

discussed in Chapters 5 to 9. Also, the relationship between neural networks and optimal filters

is discussed in Chapter 7. In Chapter 10, an advanced theory of evolution is discussed.

Mathematical models of living system functions motivated us to take a closer look at the

functions of natural living systems observed in physical reality. In physical reality, we observe

homo-sapiens as well as lower level animals such as tigers, lions, snakes etc. It is reasoned

that some of the functions of natural living systems are misunderstood or not understood at all.

Biological living systems such as homo-sapiens lead to a biological culture. In a

biological culture that originated during the ice age in oceans, various living species

were living in the oceans. Through some process, the two necessities of metabolism and

reproduction were developed by all living species. The homo-sapien species was

responsible for our current understanding of various activities, functions of observed

living systems. The author hypothesizes that the homo-sapien interpretations are totally

wrong. For instance,

• Metabolism which leads to killing of one species by another is unnecessary to

sustain life.

• The belief (like many superstitions) that death and aging are inevitable is only

partially true.

To be more precise, it should be possible to take non-decayed organs of a living species

and by recharging the dead cells, make it living. Many such innovative ideas on living

systems are discussed in Chapter 10.

The only necessities of natural living systems that are observed are ‘metabolism’ and

‘reproduction’. By and large the only organization and community formation that we see

in other (than homo-sapiens) natural systems are of the following form:

• Migratory pattern of birds

• Sharing the information on the place of food

• Forming a group of families to satisfy the reproductive needs

• Occasional bird songs of mutual courtship

• Occasional rituals related to protecting the members of their group etc.

The organization and culture observed in other biological systems and other natural living

systems are nowhere near comparable to those observed in the homo-sapien species. But the

author hypothesizes that this marginal/poor organization is primarily due to a lack of the co-ordination

which is achieved through language. Thus, a major effort in organizing the

lower level species of living systems is through teaching a language. Thus, organization of

living systems other than the homo-sapiens (for homo-sapien and other purposes) should

be possible.

An important part of organizing the homo-sapiens was the educational system through

an associated language. In the same spirit, by teaching some lower level animals to speak

a certain language, they could be organized/educated to understand as well as develop

science and technology. When the lower level animals are organized in a zoo through

various methods, they could lead to a culture and a civilization.

Various natural living machines have developed organs/functional units due to

evolutionary needs. These functional units essentially include sensors to collect video,

audio information or more generally sensors to collect data on the surrounding environment

in the universe. The data gathered by the living machine from the surrounding environment

in physical reality is utilized to perform some primary functions such as metabolism,

reproduction etc. The data is processed by various functional sub-units inside the brain of

a living machine. Thus the understanding of the operation of various functional sub-units

in the brain of natural living machines leads to building artificial living machines which

are far superior in functional capabilities.

CHAPTER

2

Multi/Infinite Dimensional

Neural Networks, Multi/Infinite

Dimensional Logic Theory

2.1 INTRODUCTION

transformations on one dimensional arrays of zeroes and ones to arrive at arrays of

zeroes and ones. Various standard logic gates such as AND, OR, NOT, NAND, XOR,

NOR are defined on one dimensional arrays/vectors. The logic synthesis of digital

integrated circuits, consisting of the interconnection of logic gates which transit through

a set of states, is performed through the utilization of the associated state transition

diagram. The set of allowed transitions in the state space leads to various classes of

digital circuits such as shift registers, counters, flip flops etc. In one dimensional logic

theory various theorems on the decomposition, synthesis of Boolean functions are

proved and are utilized in the logic synthesis of complex digital integrated circuits. In

the practical implementation of such digital integrated circuits, semiconductor

technology with devices such as diodes, transistors, field effect transistors was

effectively utilized.
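The two ideas in this paragraph, gates acting on one dimensional arrays of zeroes and ones and circuits that transit through a set of states, can be sketched briefly. The function names are assumptions made for the example, and the two-bit counter is one simple illustrative choice of state transition diagram.

```python
# Illustrative sketch: logic gates applied elementwise to one dimensional binary arrays.
def gate_and(x, y):
    return [a & b for a, b in zip(x, y)]

def gate_or(x, y):
    return [a | b for a, b in zip(x, y)]

def gate_xor(x, y):
    return [a ^ b for a, b in zip(x, y)]

def gate_not(x):
    return [1 - a for a in x]

def gate_nand(x, y):
    return gate_not(gate_and(x, y))

def counter_next(state):
    """State transition of a two-bit binary counter: (q1, q0) -> next (q1, q0)."""
    q1, q0 = state
    return (q1 ^ q0, 1 - q0)

def run(state, steps):
    """Follow the state transition diagram for a number of clock steps."""
    visited = [state]
    for _ in range(steps):
        state = counter_next(state)
        visited.append(state)
    return visited

print(gate_and([1, 0, 1, 1], [1, 1, 0, 1]))
print(run((0, 0), 4))
```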

The design and implementation of complex digital integrated circuits led to the

development of highly sophisticated computers, computer systems serving various practical

applications. Some practical applications such as those in medical imaging, remote sensing,

pattern recognition led to the design and implementation of various types of parallel

computers. These computers operate on two dimensional arrays of zeroes and ones. But the

processing units in these computers treat the two dimensional array elements as those from

one dimensional arrays. Thus, the two dimensional nature of an array with dependency

structure is never exploited. This limitation led the author to innovate information processing

units which operate on two/multidimensional arrays. Such information processing units

should necessarily be based on sub-units which operate on arrays of binary data and produce

binary arrays. These sub-units constitute the two/multidimensional logic circuits. A more

general class of information processing sub-units, and thus units, operates on arrays whose

entries are allowed to assume multiple (not necessarily binary) values.
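To make the idea of a unit that uses the two dimensional dependency structure concrete, the sketch below applies a majority rule over each cell and its vertical and horizontal neighbours of a binary array. This particular rule and the name `majority_2d` are assumptions chosen for illustration; they are not the author's definition of a two dimensional logic gate.

```python
# Illustrative sketch: a binary-array-in, binary-array-out operation that depends
# on the two dimensional neighbourhood of each entry.
def majority_2d(grid):
    """Each output bit is the majority of a cell and its up/down/left/right
    neighbours; neighbours outside the array are skipped."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            cells = [grid[i][j]]
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    cells.append(grid[ni][nj])
            out[i][j] = 1 if 2 * sum(cells) > len(cells) else 0
    return out

noisy = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
cleaned = majority_2d(noisy)
print(cleaned)
```

If the same array were read as a one dimensional list of nine bits, the vertical neighbours of an entry would no longer be adjacent to it, which is the kind of lost structure the text refers to.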

Automata which operate on multidimensional arrays to perform a desired operation

can be defined heuristically in many ways. In some applications such as in 3-d array/

image processing, the information processing operation can only be defined heuristically

based on the required function. But a more organized approach to defining multidimensional

logic functions has been discovered and formalized by the author. In this chapter, the author

describes the mathematical formalization for multidimensional logic units. The relationship

between multidimensional logic units and multidimensional neural networks is also

discussed. The generalization of the results to infinite dimensions is also briefly described.

Two dimensional neural networks were utilized by various researchers working in the

area of neural networks. The application of two dimensional neural networks to various

real world problems was also extensively studied. But, an effective mathematical abstraction

for modeling two/multi/infinite dimensional neural networks was lacking. The author in

this chapter demonstrates that tensors provide a mathematical abstraction to model multi/

infinite dimensional neural networks.

The contents of this chapter are summarized as follows:

A mathematical model of an arbitrary multidimensional neural network is developed. A

convergence theorem for an arbitrary multidimensional neural network represented by a

fully symmetric tensor is stated and proved. The input and output signal states of a

multidimensional logic gate/neural network are related through an energy function,

defined over the fully symmetric tensor representing the multidimensional logic gate, such

that the minimum/maximum energy states correspond to the output states of the logic

gate realizing a logic function. Similarly, a logic circuit consisting of the interconnection of

logic gates, represented by a symmetric tensor, is associated with a quadratic/higher degree

energy function. Multidimensional logic synthesis is described. Infinite dimensional logic

theory and logic synthesis are briefly discussed through the utilization of infinite dimension/

order tensors.

This chapter is organized as follows. In section 2, a mathematical model of an arbitrary

multidimensional neural network and associated terminology is developed. In section 3, a

convergence theorem for an arbitrary multidimensional neural network is proved. In section

4, the input/stable states of a multidimensional neural network are associated with the

input/output signal states of a multidimensional logic gate. A mathematical model of an

arbitrary multidimensional logic gate/circuit is described. Thus, multidimensional logic

theory and logic synthesis are formalized. In section 5, infinite dimensional logic theory and logic

synthesis are described. In section 6, the relationship between multidimensional neural

networks, multidimensional logic theories and various constrained static optimization problems

is elaborated. Various constrained optimization problems that commonly arise in

applications are listed. Various innovative ideas in multidimensional neural networks are

briefly described. The chapter ends with a set of conclusions.

in discrete time. It can be represented by a weighted connectionist structure in

multidimensions. Thus, there is a weight attached to each edge of the connectionist structure

in multidimensions and a threshold value attached to each node. At each node of the

connectionist structure, a certain algebraic threshold function is computed.

It is well known in the theory of one dimensional neural networks that a symmetric matrix can be utilized to represent a one dimensional neural network. Motivated by the applications of one dimensional neural networks, two dimensional neural networks were heuristically designed and utilized for various applications. But the author realized, for the first time, that a tensor is the most natural mathematical abstraction that can be utilized to represent two/multidimensional neural networks.

Before describing the mathematical model of multidimensional neural networks, the

following discussion on tensors and associated concepts is very relevant.

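It is important to realize that, given n independent variables $X_1, \ldots, X_n$, the expression

$$\sum_{i=1}^{n} C_i X_i \tag{2.1}$$

is called a homogeneous form of degree one, the expression

$$\sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} X_i X_j \tag{2.2}$$

is called a homogeneous form of degree two, and the expression

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} C_{ijk} X_i X_j X_k \tag{2.3}$$

is called a homogeneous form (BoT) of degree three and so on. Given the components of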

a tensor of order n, of dimension m , it is possible to define a homogeneous form of

degree n.
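
The correspondence between tensor components and homogeneous forms can be sketched in a few lines of Python. This is only an illustration: the function name `homogeneous_form` and the coefficient values are assumptions made for the example and do not come from the text.

```python
from itertools import product

def homogeneous_form(coeff, x, degree):
    """Sum of coeff[i1][i2]...[id] * x[i1] * ... * x[id] over all index tuples."""
    total = 0
    for idx in product(range(len(x)), repeat=degree):
        term = coeff
        for i in idx:              # walk the nested lists down to one coefficient
            term = term[i]
        for i in idx:              # multiply by the matching variables
            term = term * x[i]
        total += term
    return total

# Illustrative coefficients (not from the text).
C1 = [5, 7]                        # degree one, as in (2.1)
C2 = [[1, 2], [3, 4]]              # degree two, as in (2.2)
x = [1, 2]

linear = homogeneous_form(C1, x, 1)
quadratic = homogeneous_form(C2, x, 2)
# Homogeneity: scaling x by t scales a degree-d form by t**d.
scaled = homogeneous_form(C2, [3 * v for v in x], 2)
```

The last line checks the defining property of a homogeneous form of degree two: tripling every variable multiplies the value by nine.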

The connection structure of a one dimensional neural network, the symmetric matrix,

is naturally associated with a homogeneous quadratic form as the energy function, which

is optimized over the one dimensional hypercube. Thus, in one dimension, to utilize a

homogeneous form of degree n as the energy function, a generalized neural network is

employed, in which, at each neuron, an arbitrary algebraic threshold function is computed.

But, in multidimensions, to describe the connection structure of a neural network, a tensor

is necessarily utilized.

12 Multidimensional Neural Networks: Unified Theory

multidimensional neural networks, some notation related to tensors is provided to facilitate

the description of mathematical model of an arbitrary multidimensional neural network.

Matrices are utilized to represent quadratic forms, whereas tensors are necessary to

represent a homogeneous form of degree n.

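Suppose one second order tensor is a linear function of another second order tensor, i.e.

$$A_{ik} = \lambda_{iklm} B_{lm} \tag{2.4}$$

with summation over the repeated indices $l$ and $m$, where $\lambda_{iklm}$ is a set of $k^4$ coefficients (here $k$ also denotes the number of values each index can assume). It is easy to see that $\lambda_{iklm}$ is a tensor of dimension $k$ and order 4. This is illustrative of linear transformation of tensors.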
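
A small numerical sketch of such a linear transformation follows. It assumes that (2.4) is read as a sum over the last two indices of the coefficient tensor, and it picks one particular coefficient tensor (chosen here for illustration only) that transposes its argument.

```python
def transform(lam, B):
    """A[i][k] = sum over l, m of lam[i][k][l][m] * B[l][m]."""
    n = len(B)
    return [[sum(lam[i][k][l][m] * B[l][m]
                 for l in range(n) for m in range(n))
             for k in range(n)]
            for i in range(n)]

n = 2
# Illustrative choice: lam[i][k][l][m] is 1 when (l, m) == (k, i), else 0,
# so the transformation maps B to its transpose.
lam = [[[[1 if (l, m) == (k, i) else 0 for m in range(n)]
         for l in range(n)]
        for k in range(n)]
       for i in range(n)]

B = [[1, 2], [3, 4]]
A = transform(lam, B)
# Number of coefficients in lam: n**4.
count = sum(1 for i in lam for k in i for l in k for _ in l)
```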

Now, we discuss some concepts in the multiplication of tensors.

Let $A_{ik}$ and $B_{lm}$ be the components of two second order tensors. Consider all possible products of the form

$$C_{iklm} = A_{ik} B_{lm} \tag{2.5}$$

Then, the numbers $C_{iklm}$ are the components of a fourth-order tensor, called the outer product of the tensors with components $A_{ik}$ and $B_{lm}$.

Multiplication of any number of tensors of arbitrary order is defined similarly (BoT),

i.e. the product of two or more tensors is the product of the components of the tensors,

which are factors. The order of a tensor product is clearly the sum of the orders of the

factors.

Contraction of Tensors: The operation of summing a tensor of order n (n ≥ 2) over two of its indices is called contraction. It is clear that contraction of a tensor of order n leads to a tensor of order n−2. This tensor can be repeatedly contracted to arrive at a scalar (order 0) when n is even or at a tensor of order 1 when n is odd.

The result of multiplying two or more tensors and then contracting the product with

respect to indices belonging to different factors is often called an Inner Product of the

given tensors.
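
The three operations just described (outer product, contraction, inner product) can be illustrated for second order tensors as follows. The helper names `outer` and `contract_middle` and the numerical values are assumptions made for this sketch; for two matrices, the inner product obtained this way coincides with the ordinary matrix product.

```python
def outer(A, B):
    """C[i][k][l][m] = A[i][k] * B[l][m], with no summation: a fourth-order tensor."""
    n = len(A)
    return [[[[A[i][k] * B[l][m] for m in range(n)]
              for l in range(n)]
             for k in range(n)]
            for i in range(n)]

def contract_middle(C):
    """Sum C[i][k][k][m] over the repeated index k, leaving a second-order tensor."""
    n = len(C)
    return [[sum(C[i][k][k][m] for k in range(n)) for m in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

C = outer(A, B)                    # order 2 + 2 = 4
inner = contract_middle(C)         # order 4 - 2 = 2, equals the matrix product A B
scalar = sum(inner[i][i] for i in range(2))   # one more contraction: order 0
```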

Thus, based on the notation associated with the indices, it is understood from the context

whether inner product or outer product of tensors is utilized.

With the above requisite notation from tensor algebra summarized, before describing

a mathematical model of an arbitrary multidimensional neural network, the following

intuitive discussion is provided to facilitate easier understanding.

The state of a neuron at the discrete time instant n+1 is computed by summing the

contributions from other neurons connected to it through synaptic weights which are the

components of a fully symmetric tensor S, representing the connection structure and the

state tensor of neuronal states at the time instant n. Thus, we first compute the outer product

of connection tensor and the state tensor of neurons at the time instant n and perform the

contraction over all the indices (representing the neurons) connected to a chosen neuron.

Thus, this inner product operation followed by determining its sign/parity/polarity

(positive or negative value) gives us the state tensor at time instant n+1. This procedure is

repeated at all the neurons where the state is updated.

Remark

Throughout this chapter, the term “multidimensional neural network” is utilized. The standard notation associated with tensors utilizes the term “dimension” to represent the number of values an independent variable can assume and the term “order” to represent the number of independent variables. Thus, the order of the state tensor represents the number of independent dimensions in the multidimensional neural network, MN. Any confusion between the usage of the terms “order” and “dimension” should be resolved from the context.

Let MN be a multidimensional neural network of dimension m and order n, where m is the number of neurons along each independent variable (dimension/order index). Then MN is uniquely specified by (S, T), where

S is a fully symmetric tensor of order 2n and dimension m. S, the connection structure of the multidimensional neural network, is a fully symmetric tensor in the following sense:

$$S_{i_1, i_2, \ldots, i_n;\; j_1, j_2, \ldots, j_n} = S_{j_1, j_2, \ldots, j_n;\; i_1, i_2, \ldots, i_n} \tag{2.6}$$

for all {i1,i2,...,in}, {j1,j2,...,jn}. This captures the intuitive notion that the multidimensional

neural network has nodes which correspond to the multidimensional neurons. The

connectionist structure of the network, in the fully connected case, has a synaptic connection

from every neuron to every other neuron and thus specifies the number of order indices/

dimensions/variables of the fully symmetric tensor. Furthermore, it is fully symmetric

since there is a link between any two nodes and the weight attached to the link is the same

in both directions.
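
As a concrete sketch of the symmetry condition (2.6), the connection tensor can be stored as a mapping from pairs of multi-indices to weights. The representation (a Python dictionary keyed by pairs of index tuples), the function names and the random weights are all assumptions made for illustration.

```python
from itertools import product
import random

def nodes(m, n):
    """All multi-indices (i1, ..., in) with each index in range(m): m**n nodes."""
    return list(product(range(m), repeat=n))

def symmetric_connection(m, n, seed=0):
    rng = random.Random(seed)
    V = nodes(m, n)
    W = {(a, b): rng.uniform(-1, 1) for a in V for b in V}
    # Average each weight with its mirror image so that S[a, b] == S[b, a].
    return {(a, b): 0.5 * (W[a, b] + W[b, a]) for a in V for b in V}

S = symmetric_connection(m=2, n=2)
is_symmetric = all(S[a, b] == S[b, a] for (a, b) in S)
```

With m = 2 and n = 2 there are four nodes, and S has 2^4 = 16 components, in agreement with a tensor of order 2n and dimension m.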

T is a tensor compatible with S such that each component is the threshold at the node

(i1, i2,...,in) of the multidimensional neural network.

Every node (multidimensional neuron) can be in one of two possible states, either +1 or –1. The state of node $(i_1, i_2, \ldots, i_n)$ at time t is denoted by $X_{i_1, i_2, \ldots, i_n}(t)$. The state of MN at time t is the tensor with components $X_{i_1, i_2, \ldots, i_n}(t)$, where X is a tensor of dimension m and order n. The state evolution at node $(i_1, i_2, \ldots, i_n)$ is computed by

$$X_{i_1, i_2, \ldots, i_n}(t+1) = \mathrm{Sign}\left(H_{i_1, i_2, \ldots, i_n}(t)\right), \tag{2.7}$$

where

$$H_{i_1, i_2, \ldots, i_n}(t) = \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\; j_1, \ldots, j_n}\, X_{j_1, \ldots, j_n}(t) - T_{i_1, \ldots, i_n} \tag{2.8}$$

The next state of the network Xi1,...,in (t +1) is computed from the current state by

performing the evaluation (2.7) at a subset of the nodes of the multidimensional neural

network, to be denoted by G. The modes of operation of the network are determined by

the method by which the subset G is selected in each time interval.

If the computation is performed at a single node in any time interval, i.e. $|G| = 1$, then we will say that the network is operating in a serial mode, and if $|G| = m^n$, then we will say that the network is operating in a fully parallel mode. All other cases, i.e. $1 < |G| < m^n$, will be called parallel modes of operation. Unlike a one dimensional neural network, a multidimensional neural network lends itself to various parallel modes of operation. It is

possible to choose G to be the set of neurons placed in each independent dimension or a

union of such sets. The set G can be chosen at random or according to some deterministic

rule. A state of the network is called stable if and only if

$$X_{i_1, \ldots, i_n}(t) = \mathrm{Sign}\left(S \otimes X_{i_1, \ldots, i_n}(t) - T_{i_1, \ldots, i_n}\right) \tag{2.9}$$

where ⊗ denotes inner product i.e. outer product followed by contraction over the

appropriate indices. Once the network reaches such a state, there is no further change in

the state of the network no matter what the mode of operation is.
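
The update rule, the modes of operation and the stability test can be sketched as follows. This is a minimal illustration under stated assumptions: the state and the thresholds are dictionaries keyed by multi-indices, S is keyed by pairs of multi-indices, Sign(0) is taken as +1, and the small network (four nodes, unit coupling between distinct nodes, zero thresholds) is invented for the example.

```python
from itertools import product

def sign(h):
    return 1 if h >= 0 else -1      # assumed convention: Sign(0) = +1

def local_field(S, T, X, node):
    """Contract S with the state over all nodes, then subtract the threshold."""
    return sum(S[node, other] * X[other] for other in X) - T[node]

def step(S, T, X, G):
    """Update every node in G simultaneously from the current state X."""
    new_X = dict(X)
    for node in G:
        new_X[node] = sign(local_field(S, T, X, node))
    return new_X

def is_stable(S, T, X):
    return all(X[v] == sign(local_field(S, T, X, v)) for v in X)

# Illustrative network with m = 2, n = 2 (four nodes).
V = list(product(range(2), repeat=2))
S = {(a, b): (0.0 if a == b else 1.0) for a in V for b in V}
T = {v: 0.0 for v in V}
X = {v: 1 for v in V}
X[(0, 0)] = -1                       # perturb one neuron

serial = step(S, T, X, G=[(0, 0)])   # serial mode: |G| = 1
parallel = step(S, T, X, G=V)        # fully parallel mode: |G| = m**n
```

In this example both modes restore the all-ones state, which satisfies the stability condition, while the perturbed starting state does not.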

NETWORKS

Utilizing the notation of tensor products, a convergence theorem for an arbitrary multidimensional neural network is stated and proved in the following.

Theorem 2.1: Let MN = (S, T) be a multidimensional neural network of dimension m and order n. S is a fully symmetric tensor of order 2n and dimension m with $S_{i_1, \ldots, i_n;\; i_1, \ldots, i_n} \ge 0$. The network MN always converges to a stable state while operating in the serial mode (i.e. there are no cycles in the state space) and to a cycle of length at most 2 while operating in a fully parallel mode (i.e. the cycles in the state space are of length ≤ 2).

Proof: Serial mode of operation of the multidimensional neural network is first considered.

In this mode of operation, during each time step of the operation of the neural network,

the state of only one neuron is updated. In other words, the state of each neuron is only

updated serially. At each multidimensional neuron in the network MN, the total synaptic

contribution from all neurons is first determined and its sign is determined to arrive at the

updated state of the neuron. Mathematically, this is achieved by computing the outer

product of the fully symmetric tensor S and the {+1, –1} state tensor of the multidimensional

neural network. In tensor notation, this is specified by

$$C_{i_1, \ldots, i_n;\; j_1, \ldots, j_n} = S_{i_1, \ldots, i_n;\; j_1, \ldots, j_n}\, X_{j_1, \ldots, j_n}. \tag{2.10}$$

The total synaptic contribution at any neuron located at the location (i1, i2,..., in) is

determined by contracting the above outer product over all the indices {j1, j2,..., jn} i.e.

over all the neurons connected to it through the synaptic weights determined by the

components of the fully symmetric tensor S. The resultant scalar synaptic contribution

at any neuron (i1, i2,..., in) is thus determined by the inner product operation. The sign of

the resulting scalar constitutes the updated state of neuron. Thus, the state of any neuron

(i1, i2,..., in) in the multidimensional neural network in the serial mode of operation is

given by

$$X_{i_1, i_2, \ldots, i_n}(k+1) = \mathrm{Sign}\left(\sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} C_{i_1, \ldots, i_n;\; j_1, \ldots, j_n}(k) - T_{i_1, \ldots, i_n}\right) \tag{2.11}$$

The contracted sum in (2.11) is the inner product $S \otimes X(k)$, where ⊗ is utilized as the symbol to denote the inner product between compatible tensors. This symbol is sometimes suppressed and it should be understood from the context whether inner product/outer product between the tensors is meant.

With the state updating scheme in the tensor notation specified, the energy function

that is optimized in the network MN is described. It is given by

$$E = \langle X(k),\, S \otimes X(k) \rangle = \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\; j_1, \ldots, j_n}\, X_{i_1, \ldots, i_n}(k)\, X_{j_1, \ldots, j_n}(k) \tag{2.13}$$

where < > denotes the inner product operator between the compatible tensors. It is

assumed in the above specification of the energy function of the neural network MN that

the threshold at each neuron is zero. This is no loss of generality, since by augmenting

the tensor S and the state tensor, the threshold values can be forced to be zero. It is easy

to see that such a thing can always be done by considering a one dimensional neural

network in which the threshold at each neuron is non-zero and arriving at a network in

which the threshold at each neuron can be made zero by augmenting the state vector as

well as the connection matrix.
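
The augmentation argument can be checked numerically in the one dimensional case. The sketch below appends one extra neuron whose state is held at –1 and whose couplings to the original neurons are the thresholds; keeping that extra neuron fixed is an assumption of this illustration, and the function names and numbers are invented for the example.

```python
def augment(S, T):
    """Append a row and column holding the thresholds; the new diagonal entry is 0."""
    S_aug = [row[:] + [T[i]] for i, row in enumerate(S)]
    S_aug.append(T[:] + [0.0])
    return S_aug

def field(S, X, i, T=None):
    """Local field at neuron i, with an optional threshold vector T."""
    h = sum(S[i][j] * X[j] for j in range(len(X)))
    return h - (T[i] if T is not None else 0.0)

S = [[0.0, 2.0],
     [2.0, 0.0]]
T = [0.5, -1.0]
X = [1, -1]

S_aug = augment(S, T)
X_aug = X + [-1]                    # extra neuron held at -1 (assumption)

with_threshold = [field(S, X, i, T) for i in range(2)]
zero_threshold = [field(S_aug, X_aug, i) for i in range(2)]
```

The local fields of the original neurons agree in both formulations, so the sign updates agree as well, and the augmented matrix remains symmetric.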

Utilizing the definition of the above energy function of the network, let $\Delta E = E(t+1) - E(t)$ (the discrete time index t is used instead of k) be the difference in the energy associated with two consecutive states (transited in the serial mode of operation of the multidimensional neural network), and let $\Delta X_{i_1, \ldots, i_n}$ denote the difference between the next state and the current state of the node at location $(i_1, i_2, \ldots, i_n)$ at some arbitrary time t. Clearly,

$$\Delta X_{i_1, \ldots, i_n} = \begin{cases} 0, & \text{if } X_{i_1, \ldots, i_n}(t) = \mathrm{Sign}(H_{i_1, \ldots, i_n}(t)) \\ -2, & \text{if } X_{i_1, \ldots, i_n}(t) = 1 \text{ and } \mathrm{Sign}(H_{i_1, \ldots, i_n}(t)) = -1 \\ +2, & \text{if } X_{i_1, \ldots, i_n}(t) = -1 \text{ and } \mathrm{Sign}(H_{i_1, \ldots, i_n}(t)) = +1 \end{cases} \tag{2.14}$$

By assumption, the computation (2.14) is performed only at a single node at any given

time. Suppose this computation is performed at any arbitrary node at location

(i1, i2,..., in) ; then the difference in energy resulting from updating the network state is

given by

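$$\Delta E = \Delta X_{i_1, \ldots, i_n} \left( \sum_{j_1} \cdots \sum_{j_n} S_{i_1, \ldots, i_n;\; j_1, \ldots, j_n}\, X_{j_1, \ldots, j_n} + \sum_{j_1} \cdots \sum_{j_n} S_{j_1, \ldots, j_n;\; i_1, \ldots, i_n}\, X_{j_1, \ldots, j_n} \right) + S_{i_1, \ldots, i_n;\; i_1, \ldots, i_n} \left(\Delta X_{i_1, \ldots, i_n}\right)^2 \tag{2.15}$$

Utilizing the fact that S is fully symmetric and the definition of $H_{i_1, \ldots, i_n}(t)$, it follows that

$$\Delta E = 2\, \Delta X_{i_1, \ldots, i_n}\, H_{i_1, \ldots, i_n} + S_{i_1, \ldots, i_n;\; i_1, \ldots, i_n} \left(\Delta X_{i_1, \ldots, i_n}\right)^2 \tag{2.16}$$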

Hence, since $\Delta X_{i_1, \ldots, i_n} H_{i_1, \ldots, i_n} \ge 0$ and $S_{i_1, \ldots, i_n;\; i_1, \ldots, i_n} \ge 0$, it follows that at every time instant, $\Delta E \ge 0$. Thus, since the energy E is bounded from above by the appropriate norm of S, the value of energy will converge. Now, it is proved in the following that convergence of energy implies convergence to a stable state.
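
The energy difference produced by a single serial update can be verified numerically in the one dimensional case with zero thresholds. The sketch below compares the directly computed change in the quadratic energy with the closed form "twice the state change times the local field, plus the diagonal weight times the squared state change", which is how (2.16) is read here. The random symmetric matrix with nonnegative diagonal is an assumption made for the example.

```python
import random

def sign(h):
    return 1 if h >= 0 else -1

def energy(S, X):
    n = len(X)
    return sum(S[i][j] * X[i] * X[j] for i in range(n) for j in range(n))

rng = random.Random(1)
N = 6
S = [[0.0] * N for _ in range(N)]
for i in range(N):
    S[i][i] = rng.uniform(0, 1)                # nonnegative diagonal
    for j in range(i + 1, N):
        S[i][j] = S[j][i] = rng.uniform(-1, 1)  # symmetric off-diagonal weights
X = [rng.choice([-1, 1]) for _ in range(N)]

records = []                                    # (actual change, predicted change)
for i in range(N):
    H = sum(S[i][j] * X[j] for j in range(N))   # local field with zero threshold
    dX = sign(H) - X[i]                         # 0, +2 or -2
    Y = X[:]
    Y[i] = sign(H)                              # update only neuron i
    dE = energy(S, Y) - energy(S, X)
    predicted = 2 * dX * H + S[i][i] * dX ** 2
    records.append((dE, predicted))
```

Each pair in `records` agrees up to rounding, and no single update lowers the energy.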

Once the energy in the network has converged, it is clear from the following facts that the network will reach a stable state after at most $m^{2n}$ time intervals.

with $H_{i_1, \ldots, i_n} = 0$.

In the fully parallel mode of operation of the network MN, the state updating scheme

for the state tensor of MN is given by

$$X_{i_1, \ldots, i_n}(t+1) = \mathrm{Sign}\left(S \otimes X_{i_1, \ldots, i_n}(t) - T_{i_1, \ldots, i_n}\right) \tag{2.17}$$

where ⊗ denotes the inner product between compatible tensors. Since the serial mode proof shows that a stable state is always reached with the above stated updating scheme, it is immediate that by pairwise flipping of the values of any two dimension variables in the state tensor, the same energy function value is attained. This, in turn, implies that in the parallel mode of operation of a multidimensional neural network, either a stable state is reached or a cycle of length at most 2 is reached (the two state tensors lead to the same value of the energy function). This approach to the proof for the parallel mode of operation follows the one provided in reference [Br G]. Q.E.D.
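
The two claims of Theorem 2.1 can be observed experimentally on a small random network. The following simulation is a sketch under stated assumptions: thresholds are zero, Sign(0) is taken as +1, the weights are random with a nonnegative diagonal, and all function names are hypothetical. It is an empirical illustration, not a substitute for the proof.

```python
import random
from itertools import product

def sign(h):
    return 1 if h >= 0 else -1

def make_network(m, n, seed):
    """Random fully symmetric S with nonnegative diagonal and a random state."""
    rng = random.Random(seed)
    V = list(product(range(m), repeat=n))
    S = {}
    for a in V:
        S[a, a] = rng.uniform(0, 1)
        for b in V:
            if a < b:
                S[a, b] = S[b, a] = rng.uniform(-1, 1)
    X = {v: rng.choice([-1, 1]) for v in V}
    return V, S, X

def field(S, X, v):
    return sum(S[v, u] * X[u] for u in X)

def energy(S, X):
    return sum(S[a, b] * X[a] * X[b] for a in X for b in X)

def run_serial(V, S, X, max_sweeps=1000):
    """Update one node at a time until a full sweep changes nothing."""
    X = dict(X)
    energies = [energy(S, X)]
    for _ in range(max_sweeps):
        changed = False
        for v in V:
            new = sign(field(S, X, v))
            if new != X[v]:
                X[v] = new
                changed = True
                energies.append(energy(S, X))
        if not changed:
            return X, energies
    raise RuntimeError("no stable state found within max_sweeps")

def parallel_cycle_length(V, S, X, max_steps=1000):
    """Update all nodes at once; return the length of the cycle that is entered."""
    seen = {}
    X = dict(X)
    for t in range(max_steps):
        key = tuple(X[v] for v in V)
        if key in seen:
            return t - seen[key]
        seen[key] = t
        X = {v: sign(field(S, X, v)) for v in V}
    raise RuntimeError("no repeated state found within max_steps")

V, S, X0 = make_network(m=3, n=2, seed=0)
X_final, energies = run_serial(V, S, X0)
cycle = parallel_cycle_length(V, S, X0)
```

A returned cycle length of 1 means that the fully parallel iteration reached a stable state; a length of 2 means that it alternates between two states.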

One dimensional logic theory as well as logic synthesis deal with information processing logic gates and logic circuits which operate on one dimensional arrays of zeroes and ones (or, more generally, one dimensional arrays containing finitely many symbols). The operations performed by AND, OR, NOR, NAND, XOR gates have appropriate intuitive interpretation in terms of the entries of the one dimensional arrays, i.e. vectors. Any effort to generalize the one dimensional logic operations to multidimensions leads to various heuristic possibilities and requires considerable ingenuity in formalizing a definition. But, in the following, utilizing the multidimensional neural network model described above, a formal/mathematical approach to multidimensional logic theory is described.

The input and output signal states of a multidimensional logic gate are related through an energy function. Equivalently, the multidimensional logic functions are associated with the local optima of various energy functions defined over the set of input multidimensional arrays. In view of the mathematical model of a multidimensional neural network described in section 2, it is most logical to define the minimum/maximum energy states of a multidimensional neural network (optimizing an energy function over the multidimensional hypercube) to correspond to the multidimensional logic gate functions operating on the input arrays.

Definition 2.1

A multidimensional logic function realized through a multidimensional logic gate (with

inputs and outputs) is defined to be the local minimum/maximum of the energy function

of an associated multidimensional neural network.

Equivalently, the local optima of the energy function of a multidimensional neural

network correspond to the logic functions that are realized through various logic gates.

The following detailed description is provided to consolidate the above definition vital

to multidimensional logic theory.

The logic functions which operate on the input array are identified to be the stable states

of a multidimensional neural network ( in multiple independent variables i.e. time, space

etc.). These are the transformations between a set of input states of a multidimensional neural

network which converge to a stable state on iteration of a multidimensional neural network.

In other words, in multiple independent variables, the mapping between the input states

and the stable states to which the network converges on iteration are defined to be the logic

functions realized by a multidimensional logic gate.
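
The mapping from input states to stable states can be tabulated for a very small one dimensional network. In the sketch below, the three-neuron weight matrix, the zero thresholds, the fixed serial update order and the convention Sign(0) = +1 are all assumptions chosen for illustration. For this particular choice, the stable state reached depends only on the third input: the first neuron settles to its negation and the second to a copy of it.

```python
from itertools import product

# Illustrative symmetric weights with zero diagonal (not from the text).
S = [[0, 1, -2],
     [1, 0, 3],
     [-2, 3, 0]]

def sign(h):
    return 1 if h >= 0 else -1

def settle(S, x, max_sweeps=100):
    """Serial updates in the order 0, 1, 2 until a full sweep changes nothing."""
    x = list(x)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(x)):
            h = sum(S[i][j] * x[j] for j in range(len(x)))
            if sign(h) != x[i]:
                x[i] = sign(h)
                changed = True
        if not changed:
            return tuple(x)
    raise RuntimeError("no stable state found within max_sweeps")

# The "logic function" of this network: input state -> stable state reached.
truth_table = {x: settle(S, x) for x in product([-1, 1], repeat=3)}
```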

By the proof of the convergence theorem, the logic functions are invariants of a tensor on

the multidimensional hypercube. The definition of multidimensional logic function is illustrated

in Figure 2.1.

In the case of one dimensional logic theory, it has been shown that the set of stable

states of a neural network correspond to various one dimensional logic functions (CAB).

With the definition of multidimensional logic function stated in several equivalent ways above, multidimensional logic synthesis is described in the following.

A multidimensional logic circuit consists of an arbitrary interconnection of

multidimensional logic gates. Multidimensional logic synthesis, as in one dimension,

involves synthesizing logic circuits for different purposes.

In view of the above definition of multidimensional logic functions defined through

the local optima of energy functions (realized through multidimensional neural networks),

it is natural to see if it is possible to associate energy functions with multidimensional

logic circuits. When such a scalar valued energy function can be associated with logic

circuits, the problem of multidimensional logic synthesis, is reduced to realizing such energy

functions. In the following, this important idea is developed.

A multidimensional logic circuit consists of an interconnection of multidimensional logic gates. The interconnection structure of a multidimensional logic gate is represented by a fully symmetric tensor. Since every two gates in a logic circuit need not necessarily be connected to one another, the connection structure of a multidimensional logic circuit is represented by a tensor of necessary/compatible order which is not necessarily fully symmetric, but is required to be minimally symmetric. Thus, this block symmetric tensor, which is fully symmetric within the blocks (each block representing the connection structure of a multidimensional neural network corresponding to a component logic gate), provides a representation of the multidimensional logic circuit. This tensor is utilized to associate

quadratic/higher degree energy functions with the multidimensional logic circuit. The set

of local optima of the energy functions constitute the stable states of one or more

interconnected logic gates. Thus, the set of input states (input pins) and output states (output

pins) of an entire multidimensional logic circuit are related through an energy function,

defined over the connection structure of a very high dimensional neural network. The set

of local optima of the energy function relating the input and output pins of a

multidimensional logic circuit realize various multidimensional logic functions.
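
One possible reading of the block structure, restricted to the one dimensional (matrix) case, is sketched below. Two gate matrices occupy the diagonal blocks, and only the listed gate-to-gate links populate the off-diagonal blocks. The function `block_circuit`, the link format and the weights are hypothetical; the text does not prescribe this construction.

```python
def block_circuit(gate_a, gate_b, links):
    """Assemble a circuit matrix from two gate matrices and a dictionary
    links[(i, j)] = weight coupling node i of gate A to node j of gate B."""
    na, nb = len(gate_a), len(gate_b)
    size = na + nb
    S = [[0.0] * size for _ in range(size)]
    for i in range(na):                      # diagonal block for gate A
        for j in range(na):
            S[i][j] = gate_a[i][j]
    for i in range(nb):                      # diagonal block for gate B
        for j in range(nb):
            S[na + i][na + j] = gate_b[i][j]
    for (i, j), w in links.items():          # sparse coupling, same weight both ways
        S[i][na + j] = w
        S[na + j][i] = w
    return S

gate_a = [[0.0, 1.0], [1.0, 0.0]]
gate_b = [[0.0, -1.0], [-1.0, 0.0]]
S = block_circuit(gate_a, gate_b, links={(1, 0): 0.5})
```

Here node 1 of the first gate is wired to node 0 of the second gate; every other cross-gate entry stays zero, so the gates are not fully connected to one another while the overall matrix remains symmetric.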

From the above description, it is evident that the multidimensional logic synthesis

depends on how the multidimensional logic gates are connected to one another. The

structure of interconnection determines the structure of symmetric tensor representing

the multidimensional logic circuit. The essential result in multidimensional logic synthesis

is summarized through the following theorem.

Theorem 2.2: Given a multidimensional logic circuit, there exists a block symmetric

tensor S, representing the inter-connection structure of multidimensional neural

networks (modeling the multidimensional logic gates). The mapping between the input

and output states of a multidimensional logic circuit corresponds to that between input

tensors, local optima of energy function (quadratic/higher degree) represented by the

block symmetric tensor. The stable states of interconnected multidimensional neural

networks represent the multidimensional logic functions synthesized by the logic

circuit.

The proof of the above theorem follows from the convergence theorem and is avoided

for brevity.

The classification of multidimensional logic circuits is based on the type of transitions allowed between the states in the multidimensional state space. The state transitions are classified according to the following:

(a) whether the next state reached depends on the past state only or not, as in one

dimensional logic synthesis,

(b) the type of neighbourhood of states about the current state on which the next state

reached depends. The type of neighbourhoods about the current state are classified

into few classes. These classes are similar to those utilized in the theory of random

fields, multidimensional image processing,

(c) the classification of trajectories transited by the multidimensional neural network

or a local optimum computing circuit/scheme.

In the above discussion, we considered quadratic forms as the energy functions (motivated by the simplest possible neural network model) optimized by the logic gates, which when connected together lead to logic circuits. This approach toward multidimensional logic theory motivates the definition of more general switching/logic functions as the local optima of higher degree forms over the various subsets of the multidimensional lattice (hypercube, bounded lattice etc.).

Definition 2.2

A generalized logic function (representing a generalized logic gate or generalized logic

circuit) is defined as a mapping between an m -dimensional input array and the local

optimum of a tensor based form of degree greater than or equal to two, over various

subsets of multidimensional lattice (the multidimensional hypercube,

multidimensional bounded lattice). These local optima of a higher degree form (based on a tensor) are realized through the stable states of a generalized multidimensional neural network.
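
To make the notion of a local optimum of a higher degree form concrete, the sketch below searches for a local maximum of a cubic form over the hypercube by flipping one component at a time. This greedy single-flip ascent is a stand-in chosen for illustration; it is not claimed to be the update rule of the generalized neural network, and the random integer coefficients are invented for the example.

```python
import random
from itertools import product

def form3(C, x):
    """Cubic form: sum of C[i][j][k] * x[i] * x[j] * x[k] over all index triples."""
    n = len(x)
    return sum(C[i][j][k] * x[i] * x[j] * x[k]
               for i, j, k in product(range(n), repeat=3))

def flip(x, i):
    return x[:i] + (-x[i],) + x[i + 1:]

def local_max(C, x):
    """Accept single flips that strictly increase the form until none does."""
    x = tuple(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            y = flip(x, i)
            if form3(C, y) > form3(C, x):
                x, improved = y, True
    return x

rng = random.Random(3)
n = 4
C = [[[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)] for _ in range(n)]
x_star = local_max(C, (1, 1, 1, 1))
```

The search terminates because the value of the form strictly increases with every accepted flip and the hypercube has finitely many points.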

In (Rama 3), it is shown that the strictly generalized logic function defined above has better properties than the ordinary logic function described in Definition 2.1. The generalized logic function is related to a multidimensional encoder utilized for communication through multidimensional channels.

Now, with the generalized multidimensional logic gate defined above, logic synthesis with these types of logic gates involves interconnecting them in a certain topology.

This ordinary and generalized approach to multidimensional logic gate definition and

logic synthesis is depicted in Figures 2.1 to 2.3. Detailed documentation on logic synthesis

and design of future information processing machines is being pursued.

Proof: One dimensional neural network with state vector size infinity is uniquely defined

by (S, T) where S is an infinite dimensional (rows as well as columns) symmetric matrix

and T is an infinite dimensional vector of thresholds at all the neurons.

The state of the neural network at time t is a vector whose components are +1 and –1. The

next state of a node is computed by

$$X_i(t+1) = \mathrm{Sign}(H_i(t)) = \begin{cases} +1, & \text{if } H_i(t) \ge 0 \\ -1, & \text{otherwise} \end{cases} \tag{2.18}$$

where

$$H_i(t) = \sum_{j=1}^{\infty} S_{ji}\, X_j(t) - T_i \tag{2.19}$$

The entries of S are such that the infinite sum in the above expression converges.
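
The convergence requirement can be illustrated with truncated sums. The sketch below assumes weights that decay geometrically, so that every row of S is absolutely summable; this is one sufficient condition, chosen for illustration, whereas the text only requires that the sum converges. The threshold is taken as zero and the weight formula is hypothetical.

```python
def truncated_field(i, L, state=lambda j: 1):
    """Partial sum of (2.19) over j = 1..L with zero threshold and
    illustrative weights S_ji = 2 ** -(i + j)."""
    return sum(2.0 ** -(i + j) * state(j) for j in range(1, L + 1))

# Local field at neuron 1 when every neuron is in state +1.
partial = [truncated_field(1, L) for L in (5, 10, 20, 40)]
limit = 0.5      # 2**-1 times the geometric series 2**-1 + 2**-2 + ... = 1
```

The partial sums increase toward the limit, so the sign of the local field, and hence the next state of the neuron, is already determined by a finite truncation in this example.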

The next state of the network i.e. X ( t+1 ), is computed from the current state by

performing the evaluation (2.18) at a subset of the nodes of the network, to be denoted by

K. The mode of operation of the network is determined by the method by which the set K

is selected at each time interval i.e. if |K| = 1, then we will say that the network is operating

in a serial mode. Without loss of generality T = 0.

In the following, we consider the serial mode of operation. We argue that with the above stated updating scheme at an arbitrarily chosen neuron, the quadratic energy function

$$E(k) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} S_{ij}\, X_i(k)\, X_j(k) \tag{2.20}$$

does not decrease.

Without loss of generality, consider the case where all the thresholds are set to zero. It is easy to see (set the last component of the state vector to –1 and appropriately augment the entries of S) that for any finite L, we have

$$\sum_{i=1}^{L} \sum_{j=1}^{L} S_{ij}\, X_i(k)\, X_j(k) \le \sum_{i=1}^{L} \sum_{j=1}^{L} S_{ij}\, X_i(k+1)\, X_j(k+1) \tag{2.21}$$

by the convergence theorem for one dimensional neural networks of order L, for any arbitrary L. Now let L tend to infinity. Hence

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} S_{ij}\, X_i(k)\, X_j(k) \le \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} S_{ij}\, X_i(k+1)\, X_j(k+1) \tag{2.22}$$

By the Convergence Theorem for one dimensional neural network (with the state vector

size finite) in the parallel mode of operation, if any finite set of nodes is state updated,

there is either convergence or existence of a cycle of length 2. Thus, when an infinite

dimensional vector is state updated in the parallel mode, for every finite segment of it, either there is convergence or a cycle of length 2 (at most two vectors for which the energy values are the same) exists. Since the energy function associated with the infinite dimensional vector is the limit of those associated with the finite segments, it is evident that the scalar energy values converge or a cycle of length at most two exists. Q.E.D.

Now, we discuss briefly, the other infinite dimensional neural networks of dimension

infinity and order finite/infinite ( modeling tensor variables).

The following lemma is well known from the set theory.

The above lemma implies that the convergence theorem proved above in association

with the convergence theorem for multidimensional neural networks (its proof argument

in section 3) provides us with the convergence proof for a large class of infinite dimensional

neural networks (dimension and/or order of tensors utilized in modeling is infinity). Details

on the convergence theorem for infinite dimensional neural networks are provided below.

Tensors utilized to represent the connection structure and the state of neurons of an infinite dimensional neural network are such that the dimension, the order, or both are infinite.

In one dimension, when the number of neurons is infinite and a quadratic energy function

is optimized through a neural network scheme, by a straightforward extension of the results

in (Rama 3), the stable states of the neural network constitute a graph-theoretic code (with

the length of the codeword being infinite). The set over which optimization is carried out is

the unbounded unit hypercube (countable number of entries in the infinite dimensional

state vector), a subset of the lattice ( based on one independent variable ).

The following theorem is concerned with the points on the lattice in multi/infinite dimensions.

This theorem is the infinite dimensional extension of the result proved in section 3.

infinite and dimension infinity (number of neurons in each dimension). S is a fully symmetric tensor of dimension infinity and order 2n/infinity with $S_{i_1, \ldots, i_n;\; i_1, \ldots, i_n} \ge 0$.

The network MN always converges to a stable state while operating in a serial mode

(i.e., there are no cycles in the state space), while in the parallel mode, the network

will always converge to a stable state or to a cycle of length 2 (i.e., the cycles in the

state space are of length ≤ 2).

Proof: For a multidimensional neural network modeled by a tensor of dimension and

order finite, in the serial mode of operation, the network always converges to a stable

state. Since, the quadratic energy function is a scalar value defined over the connection

tensor (whose order, dimension are finite ), by letting the dimension and/or order tend to

infinity in (2.13), it is immediate that the energy function value increases in the serial mode

until a stable state is reached starting in a certain initial state. Thus, for various infinite

dimensional neural networks considered, convergence to a stable state in the serial mode

of operation is ensured ( i.e. there are no cycles in the state space ).

In the parallel mode of operation of the infinite dimensional neural network, by the

same reasoning as in Theorem (2.1), the network will always converge to a stable state or

to a cycle of length 2 depending on the order of the network ( i.e. the cycles in the state

space are of length less than or equal to 2). Q.E.D

As in the case of multidimensional logic theory, the above convergence theorem is

utilized as the basis to describe infinite dimensional logic theory as well as logic

synthesis. It should be noted that the infinite dimensional logic synthesis only has

theoretical importance. Brief discussion on infinite dimensional versions is provided

for the sake of completeness.

Definition 2.3

An infinite dimensional logic function realized through an infinite dimensional logic gate (with inputs and outputs) is defined to be the local minimum/maximum of the energy function of an associated infinite dimensional neural network. Equivalently, the local optima of the energy function of an infinite dimensional neural network correspond to the logic functions that are realized through various logic gates.

With the above definition of infinite dimensional logic function, detailed results in

infinite dimensional logic synthesis are being developed along the lines of those in

multidimensional logic synthesis. Brief description is provided in the following for the

sake of completeness.

An infinite dimensional logic circuit consists of an arbitrary interconnection of infinite dimensional logic gates. Infinite dimensional logic synthesis, as in one dimension, involves synthesizing logic circuits for different purposes. These infinite dimensional logic circuits are of theoretical interest only. Infinite dimensional logic synthesis depends on how the infinite dimensional logic gates are connected to one another. The structure of interconnection determines the structure of the symmetric tensor (order and/or dimension is infinity) representing the infinite dimensional logic circuit.

OPTIMIZATION

local optima of quadratic as well as higher degree forms defined in terms of tensors

(including matrices) over various subsets of the multidimensional lattice. These units which

map a multidimensional array/tensor to a local optimum (stable state of the

multidimensional neural network), thus constitute the multidimensional logic gates.

Interconnection of such multidimensional logic gates constitutes a multidimensional logic

circuit. Thus, multidimensional logic circuits are interconnected multidimensional neural

tensor. Thus, multidimensional logic theory/logic synthesis are associated with the theory

of multidimensional neural networks. These theories are in turn related to static

optimization of various forms (quadratic as well as higher degree) over different subsets

of lattice and other sets.

Various constrained static optimization problems that are of interest in different

applications (neural networks, logic theories etc.) are summarized below:

(1) Optimization of a quadratic form in finitely many variables over the one

dimensional hypercube (one independent variable),

(2) Optimization of a higher degree form in finitely many variables over the one

dimensional hypercube (one independent variable),

(3) Optimization of a quadratic form over the infinite dimensional (size of the state

vector) hypercube in one dimension,

(4) Optimization of a higher degree form over the infinite dimensional (size of the

state vector) hypercube in one dimension,

(5) Optimization of a quadratic form over the finite/infinite dimensional hypercube

in finitely/infinitely many dimensions,

(6) Optimization of a higher degree form over the finite/infinite dimensional

hypercube in finitely/infinitely many dimensions,

(7) Optimization of a quadratic form over a bounded lattice in finitely/infinitely

many dimensions,

(8) Optimization of a higher degree form over a bounded lattice in finitely/infinitely

many dimensions,

(9) Optimization of a quadratic form over the unbounded lattice in finitely/infinitely

many dimensions,

(10) Optimization of a higher degree form over the unbounded lattice in finitely/

infinitely many dimensions.
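As a small illustration of problem (1) in the list above, the following sketch maximizes a quadratic form over the one dimensional hypercube {-1, +1}^N by exhaustive search. The function names and the example matrix `S_example` are hypothetical choices made for this illustration and do not come from the text; exhaustive search is only feasible for very small N.

```python
from itertools import product


def quadratic_form(S, x):
    """Return sum over i, j of S[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(S[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def maximize_over_hypercube(S):
    """Exhaustively search {-1, +1}^N for a maximizer of the quadratic form."""
    n = len(S)
    best = max(product((-1, 1), repeat=n), key=lambda x: quadratic_form(S, x))
    return best, quadratic_form(S, best)


# Hypothetical symmetric matrix with zero diagonal (illustration only).
S_example = [[0, 1, -2],
             [1, 0, 3],
             [-2, 3, 0]]

best_state, best_value = maximize_over_hypercube(S_example)
print(best_state, best_value)
```

The higher degree and multidimensional problems in the list differ only in the objective being summed and in the set being searched.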

When the constraint set is the unbounded lattice in finitely/infinitely many
dimensions and the number of state variables is countably infinite, the objective
function is a power series each of whose terms is a quadratic/higher degree form. It is
proved in (Rama 3) that some of the constrained optimization problems arise in the design
of multi/infinite dimensional codes. In (Rama 4), various optimization problems described
above are utilized in a dynamic optimization setting.

In the following, various innovative themes in multi/infinite dimensional neural
networks are briefly discussed.


The well known model of a neural network is a discrete time system in one or multiple
dimensions. A signal design problem for optical/magnetic recording channels, modeled as
linear systems, led to the discovery of continuous time neural networks (Rama 5). The state
updating scheme of the continuous time neural network takes the following form:

[Equation not recoverable from the source; only the symbols T and 0 survive.]

In this technical memorandum, the author for the first time associates energy functions

with the state updating scheme. The multidimensional versions of these continuous time

neural networks are discussed in (Rama 4).

Neural networks in which the entries of the connection structure as well as the state variables
(indicating the binary states of the neuronal networks) are complex valued have already been
studied in one dimension. These results have corresponding multidimensional versions and
parallel the results for real neural networks. They are aided by the fact that
the quadratic form associated with a Hermitian symmetric matrix is always real and thus the
eigenvalues of the Hermitian symmetric matrix are always real.
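The fact that a Hermitian quadratic form is real can be checked numerically. The sketch below is only an illustration: the matrix `S_herm` and the state `x_state` are hypothetical values, and the complex states are assumed here to be unit-modulus numbers such as 1 and j.

```python
def hermitian_form(S, x):
    """Return sum over i, j of conj(x[i]) * S[i][j] * x[j]."""
    n = len(x)
    return sum(x[i].conjugate() * S[i][j] * x[j]
               for i in range(n) for j in range(n))


# Hypothetical Hermitian matrix: S[j][i] equals the conjugate of S[i][j].
S_herm = [[2 + 0j, 1 - 1j],
          [1 + 1j, -1 + 0j]]

# Hypothetical complex valued state (illustration only).
x_state = [1 + 0j, 0 + 1j]

value = hermitian_form(S_herm, x_state)
print(value)  # the imaginary part vanishes
```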

Time varying neural networks are those in which the connection structure of the
one/multidimensional neural network varies with a discrete/continuous time index. More
explicitly, the connection tensor, whose elements constitute the synaptic weights between
the neurons located in one/two/multiple dimensions, varies with the time index in some
orderly (or random) manner. Such one/multidimensional neural networks are currently
being analyzed.

2.7 CONCLUSIONS

A mathematical model of an arbitrary multidimensional neural network is described. This

model is utilized to prove the convergence theorem for multidimensional neural networks.

Utilizing the convergence theorem, multidimensional logic functions are defined and

multidimensional logic synthesis is discussed. Infinite dimensional logic synthesis is briefly

described. Various constrained static optimization problems of utility in control,

communication, computation and other applications are summarized. Several innovative

themes on one/multidimensional neural networks are summarized.


REFERENCES

(BoT) A. I. Borisenko and I. E. Tarapov, “Vector and Tensor Analysis with Applications,” Dover
Publications Inc., New York.

(BrG) J. Bruck and J. W. Goodman, “A Generalized Convergence Theorem for Neural
Networks,” IEEE Transactions on Information Theory, Vol. 34, No. 5, Sept. 1988.

(CAB) S. T. Chakradhar, V. D. Agrawal and M. L. Bushnell, “Neural Models and Algorithms
for Digital Testing,” Kluwer Academic Publishers.

(HoT) J. J. Hopfield and D. W. Tank, “Neural Computation of Decisions in Optimization
Problems,” Biological Cybernetics, Vol. 52, pp. 141-152, 1985.

(Rama 1) Garimella Rama Murthy, “Multi/Infinite Dimensional Logic Synthesis,” Manuscript
in Preparation.

(Rama 2) Garimella Rama Murthy, “Unified Theory of Control, Communication and
Computation-Part 1,” Manuscript to be submitted to IEEE Proceedings.

(Rama 3) Garimella Rama Murthy, “Multi/Infinite Dimensional Coding Theory: Multi/Infinite
Dimensional Neural Networks: Constrained Static Optimization,” Proceedings of 2002 IEEE
Information Theory Workshop, October 2002.

(Rama 4) Garimella Rama Murthy, “Optimal Control, Codeword, Logic Function Tensors:
Multidimensional Neural Networks,” International Journal of Systemics, Cybernetics and
Informatics, October 2006, pages 9-17.

(Rama 5) Garimella Rama Murthy, “Signal Design for Magnetic and Optical Recording Channels:
Spectra of Bounded Functions,” Bellcore Technical Memorandum, TM-NWT-018026.

CHAPTER 3

Multi/Infinite Dimensional Coding Theory: Multi/Infinite Dimensional Neural Networks:
Constrained Static Optimization

3.1. INTRODUCTION

In recent years, technological developments in parallel data transfer mechanisms led
to HIPPI (high performance parallel interface), SMDS (switched multi-megabit data
service) and FDDI (fiber distributed data interface). To match these high speed parallel data
transfer mechanisms, multidimensional coding theory originated and some ad
hoc procedures were developed for designing linear as well as non-linear codes.

Multidimensional codes are utilized to encode arrays of symbols for transmission over

a multidimensional communication channel. Thus, the central objective in multidimensional

coding theory is to design codes that can correct many errors and whose encoding/decoding

procedures are computationally efficient. A multidimensional error correcting code can

be described by an energy landscape, with the peaks of the landscape being the codewords.

The decoding of a corrupted codeword (array) which is a point in the energy landscape

that is not a peak is equivalent to looking for the closest peak in the energy landscape. An

alternative way to describe the problem is to design a constellation which consists of a set of

points on a multidimensional lattice that are enclosed within a finite region, in such a way

that a certain optimization constraint is satisfied.

Neural network models, simulated annealing and relaxation techniques are among the
computation models (based on optimization) that have been attracting much interest because
they seem to have properties similar to those of biological and physical systems. The standard
computation performed in a neural network is the optimization of the energy function. The state
space of a neuro-dynamical system can be described by the topography defined by the energy
function associated with the network. The connection structure of a neural network can be
distributed either on a plane or in multidimensions (Rama 2).

Thus, the field of multidimensional neural network theory and the field of

multidimensional coding theory are linked through the common thread of optimization of


lattice. In a nutshell, multidimensional error correcting codes and multidimensional neural
networks can be associated with such polynomials.

In contrast to the traditional ad hoc attempts to design multidimensional codes by a

generation of researchers, the author for the first time discovered and formalized the idea of

utilizing the theory of tensor spaces to represent and study multidimensional error correcting codes.

The theory of tensor spaces enables the design of codes in one dimension (encoding as

well as decoding techniques) to be translated to multi/infinite dimensions.

Utilizing this representation, the author took a significant step forward in formally

demonstrating the relationship between multidimensional neural networks,

multidimensional codes and optimization of multivariate polynomials/monomials over

various subsets of multidimensional lattice. This relationship provides new insights into the

design of multidimensional encoders as well as decoders. Also, the relationships between

concepts such as minimum distance, correctable errors of multidimensional codes can be

derived through new proof arguments. Furthermore, the relationship enables the utilization

of multidimensional decoding techniques for the solution of optimization of multivariate

polynomials over the multidimensional hypercube (or other subsets of the multidimensional lattice),

a difficult problem that arises in various applied fields such as operations research, theoretical

computer science etc. Also, utilizing the powerful techniques developed in these applied

areas for such problems, new algorithms for maximum likelihood decoding of

multidimensional error correcting codes can be designed.

Thus, the results in this chapter are summarized in the following three paragraphs.

The concepts of multidimensional neural networks, multidimensional error correcting

codes, optimization of quadratic/higher degree forms based on components of a tensor

(tensor component based multivariate polynomials), over various subsets of

multidimensional lattice, are related from different viewpoints.

It is proved that given a multidimensional linear block code, a neural network

(generalized neural network) can be constructed in such a way that every local maximum

of the energy function corresponds to a codeword tensor and every codeword tensor

corresponds to a local maximum. It is shown that determining the global maximum of the

energy function of a multidimensional neural network/generalized neural network is

equivalent to performing the maximum likelihood decoding in a linear block

multidimensional code. The results are generalized to multidimensional non-linear as well

as non-binary codes.

Theorems related to optimization of tensor based multivariate polynomials (terms/

monomials are based on the components of tensors) over arbitrary open/closed sets are

proved. Infinite dimensional extension of the results is briefly discussed.

This chapter is organized as follows. In section 2, after briefly reviewing the theory

of multidimensional neural networks, it is proved that finding the global optimum of

the energy function of the network is equivalent to finding a minimum cut in a certain


connection between the multidimensional neural network model and graphoid based

codes is established. It is shown that maximum likelihood decoding in a graphoid based

code is equivalent to finding a minimum cut in a certain graphoid. Thus, it is shown that

maximum likelihood decoding in a graphoid based code is equivalent to finding a

maximum of the energy function in a multidimensional neural network. In section 3, the

results are extended to general multidimensional linear block codes. A general energy

function, not necessarily quadratic, is defined based on the generator tensor of a given

linear block code. It is proved that finding the global maximum of the energy function is

equivalent to maximum likelihood decoding in the code. In section 3, it is briefly discussed

how the infinite dimensional codes are represented through the infinite order/dimension

(either the order or the dimension or both is infinite) generator tensor (the entries of

which satisfy some regularity conditions) and thus enable the infinite dimensional

versions of the results to be derived. In section 4, the energy function associated with the

parity check tensor of the multidimensional linear block code is described. When the

tensor is written in the systematic form, it is shown that each codeword tensor corresponds

to a local maximum of the multivariate polynomial associated with the parity check

tensor and that each local maximum corresponds to a codeword tensor. The results are

interpreted as the dual to the ones in the previous section for defining the Maximum

Likelihood Decoding (MLD) problem. In section 5, the results are generalized to non-

binary codes. Further, in section 6, the results are generalized to non-linear

multidimensional codes. In section 7, by means of a decomposition principle, theorems

related to optimization of tensor based (based on the components of a tensor) multivariate

polynomials over arbitrary open/closed sets are proved. Also, various innovative ideas

on the utilization of results in previous sections, to derive very general results in static

optimization are described. The chapter concludes with a summary of results derived.

The results in this chapter are exactly the multidimensional versions of those in (BrB).

COMPUTATION IN THE CONNECTION STRUCTURE: GRAPHOID CODES

represented by a weighted undirected connectionist structure in multidimensions. At each

multidimensional neuronal element, there is a threshold value which will fire each neuron

on crossing it. Each neuronal element computes an algebraic threshold function in the

input variables.

Let MN be a multidimensional neural network of dimension m and order n, where m is the
number of neurons in each dimension, i.e. the number of values assumed by each independent
dimension variable. Then MN is uniquely specified by (S, T), where S is a fully symmetric
tensor of dimension m and order 2n, and T is a tensor of thresholds attached to the neuronal
elements, with compatible order (n) and dimension (m). Every node can be in one of two


possible states, +1 and -1. The state of node $(i_1, i_2, \ldots, i_n)$ at time $t$ is denoted by
$X_{i_1, i_2, \ldots, i_n}(t)$. The state of MN at time $t$ is the tensor $X_{i_1, i_2, \ldots, i_n}(t)$ of
dimension m and order n. The state evolution at node $(i_1, i_2, \ldots, i_n)$ is computed by

$$X_{i_1, i_2, \ldots, i_n}(t + 1) = \mathrm{Sign}\left(H_{i_1, i_2, \ldots, i_n}(t)\right) \tag{3.1}$$

where

$$H_{i_1, \ldots, i_n}(t) = \sum_{j_1 = 1}^{m} \cdots \sum_{j_n = 1}^{m} S_{i_1, \ldots, i_n; j_1, \ldots, j_n} X_{j_1, j_2, \ldots, j_n}(t) - T_{i_1, \ldots, i_n}(t)$$

The next state of the network, i.e. $X_{i_1, i_2, \ldots, i_n}(t + 1)$, is computed from the current state by
performing the evaluation (3.1) at a subset of nodes of the multidimensional neural network,
denoted by G. The mode of operation is determined by the method by which the
subset G is selected in each time interval. If the computation (3.1) is performed at a single
node in any time interval, i.e. $|G| = 1$, the network is said to operate in the
serial mode; if $|G| = m^n$, i.e. the computation is performed at all nodes, the network is
said to operate in the fully parallel mode. A state is called stable if and only if

$$X_{i_1, i_2, \ldots, i_n}(t) = \mathrm{Sign}\left(S \otimes X_{i_1, \ldots, i_n}(t) - T_{i_1, \ldots, i_n}\right) \tag{3.2}$$

where $\otimes$ denotes the inner product (the symbol is sometimes suppressed for notational brevity).
Once a neural network has reached such a state, the state of the network no longer changes,
whatever the mode of operation.
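The serial mode of operation and the stability condition (3.2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the multi-index $(i_1, \ldots, i_n)$ is flattened into a single index so that S becomes a symmetric matrix, the diagonal of S is taken to be zero, Sign(0) is taken to be +1 (the text does not fix this convention), and the names `run_serial`, `S_net` and `T_net` are hypothetical.

```python
def sign(v):
    # Assumption: Sign(0) = +1; the text does not fix this convention.
    return 1 if v >= 0 else -1


def local_field(S, T, x, i):
    """H_i = sum over j of S[i][j] * x[j] - T[i], with a flattened node index."""
    return sum(S[i][j] * x[j] for j in range(len(x))) - T[i]


def is_stable(S, T, x):
    """Check the stability condition (3.2) at every node."""
    return all(x[i] == sign(local_field(S, T, x, i)) for i in range(len(x)))


def run_serial(S, T, x, max_sweeps=100):
    """Serial mode: evaluate (3.1) at one node per time step (|G| = 1)."""
    x = list(x)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(x)):
            new_state = sign(local_field(S, T, x, i))
            if new_state != x[i]:
                x[i] = new_state
                changed = True
        if not changed:
            return x  # a full sweep without change: the state is stable
    raise RuntimeError("no stable state reached within max_sweeps")


# Hypothetical symmetric connection matrix and zero thresholds.
S_net = [[0, 1, -2],
         [1, 0, 3],
         [-2, 3, 0]]
T_net = [0, 0, 0]

stable = run_serial(S_net, T_net, [1, 1, 1])
print(stable, is_stable(S_net, T_net, stable))
```

Updating every node simultaneously from the same current state, instead of one node at a time, would give the fully parallel mode.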

An important feature of the network MN is the convergence theorem stated below.

Theorem 3.1: Let MN = (S, T) be a multidimensional neural network of dimension m and
order n, where S is a fully symmetric tensor of order 2n and dimension m. The network MN
always converges to a stable state while operating in the serial mode (i.e. there are no
cycles in the state space) and to a cycle of length at most 2 while operating in the fully
parallel mode (i.e. cycles in the state space are of length at most 2).

This theorem is proved in (Rama 2). This theorem suggests the utilization of MN as a

device for performing a local search of the optimum of an energy function. In the following,

we formulate a problem that is equivalent to determining the global maximum of an energy

function and how to map it onto a multidimensional neural network.

Definition 3.1

Let G = (V, E) be a weighted and undirected non-planar graph in multidimensions, where
V denotes the set of nodes of G and E denotes the set of edges of G. Let K be the fully
symmetric tensor whose components are the weights of the edges of G.

Let $V_1$ be a subset of V, and let $V_{-1} = V - V_1$. The set of edges each of which is incident at
a node in $V_1$ and at a node in $V_{-1}$ is called a cut in G.


Definition 3.2

The weight of a cut is the sum of its edge weights. A minimum cut (MC) of a non-planar

graph/graphoid is a cut with minimum weight.

In the following, we show the equivalence between the minimum cut problem in a
graphoid (from now onwards, we also call the connection structure of a multidimensional
neural network a graphoid) and the problem of maximizing the quadratic form that is the
energy function of a multidimensional neural network. Every non-planar graph, including
the connection structure of a multidimensional neural network, is a graphoid (by definition).

Theorem 3.2: Let MN = (S, T) be a multidimensional neural network with all the thresholds

being zero i.e. T = 0. The problem of finding a state V for which the quadratic energy

function E is maximum is equivalent to finding a minimum cut in the graphoid

corresponding to MN.

Proof: Since T = 0, the energy function is given by

$$E = \sum_{i_1 = 1}^{m} \cdots \sum_{i_n = 1}^{m} \sum_{j_1 = 1}^{m} \cdots \sum_{j_n = 1}^{m} S_{i_1, \ldots, i_n; j_1, \ldots, j_n} X_{i_1, \ldots, i_n} X_{j_1, \ldots, j_n} \tag{3.3}$$

Let $S_{++}$ denote the sum of the weights of the edges of MN
with both the end points being in the same vertex set of the cut, i.e. i = j = 1, and let $S_{--}$,
$S_{+-}$ denote the corresponding sums of the other two cases (both end points in state -1, and
end points in opposite states). It follows that

$$E = 2\,(S_{++} + S_{--} - S_{+-})$$

which can also be written as

$$E = 2\,(S_{++} + S_{--} + S_{+-}) - 4\,S_{+-} \tag{3.4}$$

Since the first term in the above equation is constant (it is twice the sum of the weights of
all the edges), it follows that the maximization of E is equivalent to the minimization of $S_{+-}$.
Clearly, $S_{+-}$ is the weight of the cut in MN with $V_1$ being the nodes of MN with a state
equal to 1. Q.E.D.
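The identity (3.4) used in the proof can be verified by brute force on a small example. The sketch below assumes an ordinary weighted graph on three nodes as a stand-in for a graphoid, with the multi-index flattened to a single index; the matrix `S_graph` and the function names are hypothetical.

```python
from itertools import product


def energy(S, x):
    """Quadratic energy with zero thresholds: sum over i, j of S[i][j] x[i] x[j]."""
    n = len(x)
    return sum(S[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def cut_weight(S, x):
    """Sum of the weights of edges whose end points are in opposite states."""
    n = len(x)
    return sum(S[i][j] for i in range(n) for j in range(i + 1, n)
               if x[i] != x[j])


def total_weight(S):
    """Sum of the weights of all edges (each edge counted once)."""
    n = len(S)
    return sum(S[i][j] for i in range(n) for j in range(i + 1, n))


# Hypothetical symmetric weight matrix with zero diagonal.
S_graph = [[0, 1, -2],
           [1, 0, 3],
           [-2, 3, 0]]

states = list(product((-1, 1), repeat=len(S_graph)))

# E = 2 * (sum of all edge weights) - 4 * (weight of the cut), for every state.
identity_holds = all(
    energy(S_graph, x) == 2 * total_weight(S_graph) - 4 * cut_weight(S_graph, x)
    for x in states
)

max_energy_state = max(states, key=lambda x: energy(S_graph, x))
min_cut_state = min(states, key=lambda x: cut_weight(S_graph, x))
print(identity_holds, max_energy_state, min_cut_state)
```

Because the energy is an affine, decreasing function of the cut weight, the state of maximum energy and the state of minimum cut coincide.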

In this subsection, the relationship between multidimensional neural networks and error
correcting codes based on graphoids is investigated. The ‘multidimensional error correcting
codes’ associated with graphoids (the connection structure of a multidimensional neural
network) are called “graphoid-theoretic” codes.

The family of graphoid codes is defined based on the tensors naturally associated with
the connection structure of a multidimensional neural network with nodes as well as edges.

Let G = (V, E) be an undirected connectionist structure of a multidimensional neural

network with weights on the edges. Like a graph in the plane, this is a representation for a


non-planar graph type structure called a graphoid (not necessarily the connection structure
of a multidimensional neural network). Consider a fully symmetric tensor of dimension m
and order 2n, which is utilized to describe the connection structure of a multidimensional
neural network.

A subset of the set of edges of G can be represented by a characteristic tensor of order 2n,
with an edge between two nodes $V_{i_1, i_2, \ldots, i_n}$ and $V_{j_1, j_2, \ldots, j_n}$ leading to an entry of +1 at
the corresponding locations in the tensor. Thus, an edge characteristic tensor $\hat{E}$ of a graphoid
is defined such that

$$\hat{E}_{i_1, \ldots, i_n; j_1, \ldots, j_n} = \begin{cases} 1 & \text{if the subset contains an edge between nodes } (i_1, \ldots, i_n) \text{ and } (j_1, \ldots, j_n), \\ 0 & \text{otherwise.} \end{cases} \tag{3.5}$$

Definition 3.3

The incidence tensor of a graphoid $\hat{G} = (\hat{V}, \hat{E})$ is a block tensor of the form

$$D_{\hat{G}} = \begin{bmatrix} T_{\hat{V}_1} \\ T_{\hat{V}_2} \\ \vdots \\ T_{\hat{V}_n} \end{bmatrix} \tag{3.6}$$

where $T_{\hat{V}_i}$ represents the tensor of the set of edges incident upon the node $V_i$. It should be
noted that the incidence tensor is a blocked tensor and the above illustration is shown to
aid the imagination of the reader.

Various concepts associated with planar graphs are utilized as the basis to define the

following concepts associated with a graphoid (non-planar). They provide the notation

associated with graphoid theoretic codes.

The following lemmas are very easy to verify.

Lemma 3.1: The set of characteristic tensors that correspond to the cuts in a connection
structure $\hat{G} = (\hat{V}, \hat{E})$ of a multidimensional neural network forms a linear tensor/m-d vector
space (depending on the notational convenience) over GF(2) in multidimensions, of
dimension $(|\hat{V}| - 1)$.

The linear tensor/m-d vector space that corresponds to the cuts of a graphoid $\hat{G}$ will
be called the cut space of $\hat{G}$. Furthermore, the circuits in a graphoid also constitute a
linear tensor/vector space.

Lemma 3.2: Given a connected graphoid $\hat{G} = (\hat{V}, \hat{E})$, the incidence tensor of $\hat{G}$ has rank
$(|\hat{V}| - 1)$. Every block tensor in $D_{\hat{G}}$ associated with a node is a characteristic tensor of a cut,
and every $(|\hat{V}| - 1)$ block tensors of $D_{\hat{G}}$ corresponding to different vertices/nodes of the
graphoid form a basis for the cut space of $\hat{G}$.

Hence, given a connection structure $\hat{G}$, the cut space of the graphoid is a
multidimensional linear block code of dimension $(|\hat{V}| - 1)$.

For the sake of brevity, in the following, we only consider ‘cut codes’.
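The statements of Lemmas 3.1 and 3.2 can be checked on a small example. The sketch below is an illustration only: it uses an ordinary connected graph on four nodes as a stand-in for a graphoid, represents each cut by its (0, 1) characteristic vector over the edge list, and the names `cut_space` and `edges_example` are hypothetical.

```python
from itertools import product


def cut_vector(edges, side):
    """Characteristic (0, 1) vector of the cut induced by a two-way node labelling."""
    return tuple(1 if side[u] != side[v] else 0 for u, v in edges)


def cut_space(num_nodes, edges):
    """Collect the characteristic vectors of all cuts of the graph."""
    return {cut_vector(edges, side)
            for side in product((0, 1), repeat=num_nodes)}


def xor(a, b):
    """Componentwise addition over GF(2)."""
    return tuple(p ^ q for p, q in zip(a, b))


# Hypothetical connected graph on 4 nodes (illustration only).
edges_example = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

cuts = cut_space(4, edges_example)

# A linear space over GF(2) is closed under componentwise XOR.
closed_under_xor = all(xor(a, b) in cuts for a in cuts for b in cuts)

print(len(cuts), closed_under_xor)  # 2 ** (4 - 1) cuts are expected
```

The count of distinct cuts equals 2 raised to the number of nodes minus one, which is the number of codewords of a binary linear code of that dimension.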

Given a graphoid $\hat{G}$, an interesting question is how to formulate the maximum
likelihood decoding (MLD) problem of the code $C_{\hat{G}}$ in a graphoid-theoretic language.
That is, given a graphoid $\hat{G} = (\hat{V}, \hat{E})$ and a (0, 1) tensor Y of dimension m and order
2n, what is the codeword in $C_{\hat{G}}$ closest to Y in Hamming distance?

The following lemma answers this question.

Hamming Distance: Given two (0, 1) tensors X, Y of dimension m and order 2n, the
Hamming distance between them is the number of places where they differ.

This definition is motivated by transmitting a binary tensor X through a noisy
multidimensional channel, observing the output Y and counting the number of errors that have
occurred.

Lemma 3.3: Let $\hat{G} = (\hat{V}, \hat{E})$ be a graphoid. Let $C_{\hat{G}}$ be the multidimensional code associated
with $\hat{G}$. Let Y be a (0, 1) tensor of order 2n (dimension m). Construct a new graphoid,
denoted by $\hat{G}_Y$, by assigning weights to the edges of $\hat{G}$ as follows:

$$W_{i_1, i_2, \ldots, i_n; j_1, \ldots, j_n} = (-1)^{Y_{i_1, \ldots, i_n; j_1, \ldots, j_n}} \tag{3.7}$$

where $W_{i_1, i_2, \ldots, i_n; j_1, \ldots, j_n}$ is the weight associated with the edge $(i_1, \ldots, i_n; j_1, \ldots, j_n)$ in $\hat{G}$. Then the
maximum likelihood decoding of the tensor Y with respect to $C_{\hat{G}}$ is equivalent to finding
the minimum cut in $\hat{G}_Y$.

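Proof: Assume the number of ones in Y is b. Let P be an arbitrary codeword in $C_{\hat{G}}$. Let $L_{i,j}$ denote
the number of positions in which P contains an $i \in \{0, 1\}$ and Y contains a $j \in \{0, 1\}$. Clearly,

$$b = L_{0,1} + L_{1,1} \tag{3.8}$$

Thus,

$$-L_{1,1} + L_{1,0} = L_{0,1} - b + L_{1,0} \tag{3.9}$$

$$= L_{0,1} + L_{1,0} - b \tag{3.10}$$

The quantity $L_{0,1} + L_{1,0}$ is the Hamming distance between P and Y, and b does not depend on P.
Hence minimizing the right hand side of the above expression over all $P \in C_{\hat{G}}$ is equivalent to
finding a codeword which is the closest to Y. On the other hand, the left hand side is the weight
of the cut P in $\hat{G}_Y$, since by (3.7) each edge of the cut contributes -1 where Y is 1 and +1
where Y is 0. Minimizing the left hand side is therefore equivalent to finding the minimum cut
in $\hat{G}_Y$. Q.E.D.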
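The equivalence in Lemma 3.3 can be checked by brute force on a small cut code. The sketch below is an illustration under stated assumptions: an ordinary connected graph on four nodes stands in for the graphoid, tensors are flattened to vectors over the edge list, and the received word `Y_received` and all function names are hypothetical.

```python
from itertools import product


def cut_vector(edges, side):
    """Characteristic (0, 1) vector of the cut induced by a two-way node labelling."""
    return tuple(1 if side[u] != side[v] else 0 for u, v in edges)


def hamming(a, b):
    """Number of places where two (0, 1) words differ."""
    return sum(p != q for p, q in zip(a, b))


def cut_weight_in_gy(codeword, Y):
    """Weight of the cut in G_Y, where edge e carries the weight (-1) ** Y[e]."""
    return sum((-1) ** y for c, y in zip(codeword, Y) if c == 1)


# Hypothetical connected graph and received word (illustration only).
edges_y = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
Y_received = (1, 1, 0, 1, 1)

# The cut code: all characteristic vectors of cuts, in a fixed order.
code = sorted({cut_vector(edges_y, side)
               for side in product((0, 1), repeat=4)})

b = sum(Y_received)

# Relation (3.8)-(3.10): cut weight in G_Y = Hamming distance - b.
relation_holds = all(
    cut_weight_in_gy(c, Y_received) == hamming(c, Y_received) - b
    for c in code
)

nearest_codeword = min(code, key=lambda c: hamming(c, Y_received))
minimum_cut = min(code, key=lambda c: cut_weight_in_gy(c, Y_received))
print(relation_holds, nearest_codeword, minimum_cut)
```

Since the two objectives differ by the constant b, the nearest codeword and the minimum cut are found at the same codeword.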

From the above lemma, the following theorem follows.

Theorem 3.3: Let $\hat{G} = (\hat{V}, \hat{E})$ be a graphoid. Then, maximum likelihood decoding of a
tensor word Y with respect to $C_{\hat{G}}$ is equivalent to finding the maximum of the quadratic
energy function E of the multidimensional neural network defined by the graphoid $\hat{G}_Y$
with all its threshold values equal to zero.

Proof: By Lemma 3.3, maximum likelihood decoding of Y with respect to $C_{\hat{G}}$ is equivalent
to finding the minimum cut in $\hat{G}_Y$. By Theorem 3.2, finding the minimum cut in a graphoid
is equivalent to finding the global maximum of the energy function (quadratic) of a
multidimensional neural network defined by a graphoid with all the thresholds at each
neuronal element set to zero. Q.E.D.

Graphoid based error correcting codes are very limited since the connection structure

of a multidimensional neural network is represented by a fully symmetric tensor. This

imposes restrictions on the minimum distance of multidimensional codes. Thus, a natural

question that arises is whether the equivalence stated above in the Theorem 3.3 can be

generalized to arbitrary multidimensional linear block codes.

Graphoid codes arose naturally out of the topological properties of the connection structure

of a multidimensional neural network. The connection structure required a fully symmetric

tensor to represent it. The neural network model enabled the association of a quadratic energy

function with the fully symmetric tensor and its optimization over the multidimensional

hypercube. Thus, the encoders and decoders of graphoid codes are defined through topological

structure and optimization of multivariate polynomials. Since an arbitrary tensor, like the fully
symmetric tensor, constitutes a linear operator, arbitrary multidimensional linear codes,
unlike graphoid codes, are first defined through their algebraic structure in the next
section. Then the maximum likelihood decoding problem of such codes is discussed.

ENERGY FUNCTIONS—GENERALIZED NEURAL NETWORKS

Recent advances in high speed parallel data transfer mechanisms based on light wave/

optical networks motivated the design and analysis of multidimensional codes. Several

researchers utilized ad hoc techniques (sometimes pseudo-mathematical techniques) to

design and analyze multidimensional codes based on the extensions of the ideas in one

dimensional error control coding theory.

The author for the first time developed the idea of utilizing ‘tensor linear operator’ for

the design and analysis of multi/infinite dimensional linear as well as non-linear codes

conceived as sub-spaces over tensor spaces.


linear space defined through the generator tensor. The encoding operation of a

multidimensional (m, n; m, l) linear code, defined by an m dimensional generator
tensor $G_{i_1, i_2, \ldots, i_n; j_1, j_2, \ldots, j_l}$ of order (n + l), is performed in the following manner:

An m dimensional information tensor of order n, $B_{i_1, i_2, \ldots, i_n}$ (with 0, 1 symbols), is encoded
into the m dimensional “codeword tensor” $C_{j_1, j_2, \ldots, j_l}$ (constellation member) of order l by
the following tensor inner product (outer product followed by contraction over appropriate
indices) scheme:

$$B_{i_1, i_2, \ldots, i_n} \otimes G_{i_1, i_2, \ldots, i_n; j_1, j_2, \ldots, j_l} = C_{j_1, j_2, \ldots, j_l} \tag{3.11}$$

where $\otimes$ denotes the inner product operation between tensors defined over a finite field:
the products of binary components of the outer product are summed over the contracted
indices by means of the exclusive or operation.
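The encoding rule (3.11) can be sketched for the smallest non-trivial case: dimension m = 2, information order n = 1 and codeword order l = 2, so that the generator tensor has order 3. The generator tensor `G_example` below is a hypothetical choice made for this illustration and is not taken from the text.

```python
def encode(B, G):
    """Contract B[i] with G[i][j1][j2] over i, with sums taken modulo 2 (XOR)."""
    m = len(B)
    rows = len(G[0])
    cols = len(G[0][0])
    return [[sum(B[i] * G[i][j1][j2] for i in range(m)) % 2
             for j2 in range(cols)]
            for j1 in range(rows)]


def xor_arrays(A, C):
    """Componentwise XOR of two binary arrays of the same shape."""
    return [[a ^ c for a, c in zip(row_a, row_c)]
            for row_a, row_c in zip(A, C)]


# Hypothetical generator tensor of order n + l = 3 (illustration only).
G_example = [
    [[1, 0],
     [0, 1]],
    [[0, 1],
     [1, 1]],
]

C1 = encode([1, 0], G_example)
C2 = encode([0, 1], G_example)
C12 = encode([1, 1], G_example)

# Linearity over GF(2): the codeword of a sum is the sum of the codewords.
print(C12 == xor_arrays(C1, C2))
```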

The above procedure of generating the codeword tensor from an information tensor leads to

the following interesting considerations which are inherent to multidimensional code design.

In one dimension, a binary information vector of length k is encoded into a codeword

vector of length n by padding the parity bits to it. The parity check equations obtained

through the parity check matrix determine these bits. In the case of two/multidimensional

array of information bits, there are many ways to encode the array into a codeword array.

Even in the simplest two dimensional array case, by padding a border of parity bits along

the row wise as well as column wise directions, the codeword array can be generated. In

the following, this degree of freedom in multidimensional coding is formally described.
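The simplest two dimensional case mentioned above, padding a border of parity bits along the rows and the columns, can be sketched as follows. Even parity is assumed here, and the function name and the example array are hypothetical.

```python
def add_parity_border(info):
    """Append an even-parity bit to every row, then an even-parity row."""
    rows = [row + [sum(row) % 2] for row in info]
    column_parity = [sum(column) % 2 for column in zip(*rows)]
    return rows + [column_parity]


# Hypothetical 2 x 3 information array (illustration only).
info_array = [[1, 0, 1],
              [0, 1, 1]]

codeword_array = add_parity_border(info_array)
for row in codeword_array:
    print(row)
```

In the resulting 3 x 4 codeword array every row and every column has an even number of ones, so a single flipped bit can be located by the row and the column whose parity fails.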

A multidimensional information array (information tensor) is mapped into a codeword

array in the following ways:

(1) An m-dimensional information tensor of order n is mapped into an m-dimensional

codeword tensor of order l (l > n),

(2) An m -dimensional information tensor of order n is mapped into k -dimensional

codeword tensor (k > m ) of order n,

(3) An m-dimensional information tensor of order n is mapped into a k -dimensional

(k > m) codeword tensor of order l (l > n).

For notational convenience, only encoding through operation (1) is utilized in the
following. It is easy to realize that operation (2) is achieved by transposing the information as
well as generator tensors. To encode an information tensor into a codeword tensor through
operation (3), however, a second generator type tensor is utilized.

Various ideas familiar in one dimensional coding theory (parity check matrices,

primitive polynomials, basis, cosets etc.) have corresponding parallels in multi/infinite

dimensional coding theory based on the tensor linear operator defined over a finite field.

The detailed translation from one dimensional encoding/decoding algorithms to


concepts with parallel linear algebra concepts.

Now let us consider infinite dimensional codes. An infinite dimensional tensor can be of

the following types: (a) the dimension of the tensor is finite, whereas the order is infinite,

(b) the dimension of the tensor is infinite, whereas the order is finite, (c) the dimension as

well as order of the tensor are infinite.

An infinite dimensional code can be generated in the following manner. It is assumed

that the generator tensor of the code is such that either the dimension or the order or both

are infinite. Also, it is assumed that the entries of the generator/information tensor satisfy

the regularity conditions necessary to ensure that the inner product makes sense

(convergence of the partial sums of outer product to a limit etc.).

(i) An information tensor of finite dimension/order is mapped into a codeword

tensor of infinite dimension/order. This type of encoding can happen in practical

multidimensional communication systems,

(ii) An infinite dimension/order ( or both are infinite) tensor is mapped into a

codeword tensor with either the dimension or the order or both being infinity.

In the above encoding schemes, the generator/parity check tensors are of compatible

dimension/order (with the information tensor being encoded) to ensure that a proper

infinite dimensional codeword is generated.

Infinite dimensional extensions of the results in sections 3, 4, 5, 6, 7 (to be described in

the following paragraphs) follow from the immediate extensions of the formal arguments

to infinite dimensional tensors that satisfy the regularity conditions. They are not explicitly

repeated.

In the following, a very brief summary of multidimensional information theory is

provided as it is based on the tensor linear space structure idea necessary to model

multidimensional arrays.

In one dimension, a mathematical theory of communication is developed utilizing the

concepts of information/entropy associated with a random variable, conditional entropy,

joint entropy etc. These concepts are the vital tools to prove the noiseless channel coding

theorem. Various channel models are developed. The concepts of mutual information,

capacity of a discrete memoryless channel are utilized to prove the second channel coding

theorem. One dimensional information theory then led to rate distortion theory.

In multidimensions, a source generates multidimensional arrays of information which

pass through a multidimensional channel. A multidimensional independent, identically

distributed information array of symbols is associated with the concept of entropy

$H(X_{i_1, i_2, \ldots, i_n})$ in the following manner:

$$H(X_{i_1, i_2, \ldots, i_n}) = \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} P_{i_1, i_2, \ldots, i_n} \log\left(\frac{1}{P_{i_1, i_2, \ldots, i_n}}\right) \tag{3.12}$$
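As a small numerical illustration of (3.12), the following sketch computes the entropy of a probability array indexed by tuples $(i_1, \ldots, i_n)$. The dictionary layout, the function name `array_entropy` and the base-2 logarithm are assumptions made for this example only; (3.12) itself does not fix the base of the logarithm.

```python
import math
from itertools import product

def array_entropy(pmf):
    """Entropy (in bits) of a pmf stored as {index tuple: probability}."""
    # Terms with zero probability contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in pmf.values() if p > 0)

# Hypothetical example: uniform probabilities over an index set with m = 2, n = 2.
m, n = 2, 2
uniform = {idx: 1.0 / m ** n for idx in product(range(m), repeat=n)}
entropy_bits = array_entropy(uniform)
```

For the uniform array over $m^n = 4$ index tuples the sum evaluates to 2 bits, the largest value possible for four outcomes.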

Given the basic idea of the above definition, results from one dimension are generalized

to multidimensions utilizing the principles described in (Rama 3). Complex sources such

as a Markovian Source require some sophistication in defining the entropy/uncertainty of

the source. The interesting channel model in multidimensions is the discrete memoryless

channel represented through a stochastic tensor whose elements are conditional

probabilities $P_{j_1,\ldots,j_l,\, i_1,\ldots,i_k}$. This corresponds to a Markov random field. Detailed theorems

are derived utilizing the principles described in (Rama 3).

With the multidimensional encoding scheme formally described, it is proved in the

following that the maximum likelihood decoding problem of a multidimensional linear

block code is equivalent to the maximization of a multivariate polynomial (whose terms/

monomials are described in terms of the entries of received, generator tensors) associated

with the generator/received tensors over the multidimensional hypercube.

The essential idea in the derivation of the desired result (a generalization of Theorem

3.3 to arbitrary multidimensional linear codes) is to represent the symbols of the additive

group as symbols in the multiplicative group through the following transformation:

$$a \to (-1)^a, \quad \text{i.e.} \quad 0 \to 1, \; 1 \to -1. \tag{3.13}$$

Thus, the information tensor $B_{i_1,\ldots,i_n}$ is represented by the tensor $X_{i_1,\ldots,i_n}$, where the

component $X_{i_1,\ldots,i_n} = (-1)^{B_{i_1,\ldots,i_n}}$. The encoded codeword $C_{j_1,\ldots,j_l}$ is thus represented by the

tensor $Y_{j_1,\ldots,j_l}$. Hence, a component of the tensor Y is given by

$$Y_{j_1, j_2, \ldots, j_l} = (-1)^{C_{j_1,\ldots,j_l}} = \prod_{i_1=1}^{m} \cdots \prod_{i_n=1}^{m} X_{i_1,\ldots,i_n}^{\,G_{i_1,\ldots,i_n;\, j_1,\ldots,j_l}} \tag{3.14}$$
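The multiplicative encoding of (3.13) and (3.14) can be illustrated with a minimal sketch. For readability the information and codeword tensors are flattened into vectors, which is an assumption of this example, and the small generator `G` and all function names are hypothetical. The loop checks that encoding with products of {+1, –1} symbols agrees with the usual modulo-2 encoding followed by the map $a \to (-1)^a$.

```python
from itertools import product

def to_pm1(bits):
    """Map {0,1} symbols to {+1,-1} via a -> (-1)**a, as in (3.13)."""
    return [(-1) ** b for b in bits]

def encode_gf2(B, G):
    """Additive encoding: C_j = sum_i B_i * G[i][j] (mod 2)."""
    return [sum(b * g for b, g in zip(B, col)) % 2 for col in zip(*G)]

def encode_pm1(X, G):
    """Multiplicative encoding (3.14): Y_j = prod_i X_i ** G[i][j]."""
    Y = []
    for col in zip(*G):
        y = 1
        for x, g in zip(X, col):
            if g:          # X_i ** 1 = X_i, X_i ** 0 = 1
                y *= x
        Y.append(y)
    return Y

# Hypothetical flattened generator: 2 information symbols -> 5 code symbols.
G = [[1, 0, 1, 1, 0],
     [0, 1, 0, 1, 1]]

# Both routes give the same {+1,-1} codeword for every information word.
for B in product([0, 1], repeat=len(G)):
    assert encode_pm1(to_pm1(B), G) == to_pm1(encode_gf2(B, G))
```

Each component of the encoded word is a single monomial in the information symbols, which is the property that Definition 3.4 below relies on.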

Definition 3.4

In the {1, –1} representation of a multidimensional linear code, instead of a generator tensor,

given an information tensor $X_{i_1,\ldots,i_n}$, an encoding procedure $X \to Y$ is utilized, where each

component $Y_{j_1,\ldots,j_l}$ of the tensor Y is a monomial in a subset of the components

$X_{i_1,\ldots,i_n}$. An encoding procedure is systematic if and only if $Y_{j_1,\ldots,j_s} = X_{j_1,\ldots,j_s}$ for 1 < s < n.

Definition 3.5

Let $G_{i_1, i_2, \ldots, i_n;\, j_1, j_2, \ldots, j_l}$ be a generator tensor of ones and zeroes. The polynomial representation

of the generator tensor G with respect to a {+1, –1} received tensor W of dimension m and order

l, denoted by $E_W$, is

$$E_W(X) = W \otimes \prod_{i_1=1}^{m} \cdots \prod_{i_n=1}^{m} X_{i_1,\ldots,i_n}^{\,G_{i_1,\ldots,i_n;\, j_1,\ldots,j_l}} \tag{3.15}$$

$$= W \otimes Y(X) \tag{3.16}$$

where ⊗ denotes the inner product between the tensors (i.e. the outer product of the tensors

followed by contraction over appropriate indices).

Consider the linear multidimensional block code defined by the generator tensor G (or

equivalently by the encoding procedure associated with G ). The polynomial representation

of G, i.e. $E_W(X)$, will be called the energy function of W with respect to the encoding

procedure X → Y .

To establish the connection between the energy functions (optimized by neural/

generalized neural networks over various subsets of the multidimensional lattice) and linear

multidimensional block codes, we will prove that finding the global maximum of EW (X) is

equivalent to maximum likelihood decoding of a tensor W with respect to the code C.

Theorem 3.4: Given a code C defined by the

encoding procedure $X \to Y$, and a tensor W of ones and minus ones, i.e. a {+1, –1} tensor,

the closest codeword (in Hamming distance) to W in C corresponds to an information

tensor B if and only if

$$E_W(B) = \max_{X} E_W(X), \quad \text{the maximum being taken over all tensors } X. \tag{3.17}$$

Proof: For a {+1, –1} information tensor X, the scalar energy function is given by

$$E_W(X) = W \otimes Y(X) \tag{3.18}$$

$$= \left|\{(j_1, j_2, \ldots, j_l) : W_{j_1,\ldots,j_l} = Y_{j_1,\ldots,j_l}(X)\}\right| - \left|\{(j_1, \ldots, j_l) : W_{j_1,\ldots,j_l} \neq Y_{j_1,\ldots,j_l}(X)\}\right| \tag{3.19}$$

$$= m^l - 2\left|\{(j_1, \ldots, j_l) : W_{j_1,\ldots,j_l} \neq Y_{j_1,\ldots,j_l}(X)\}\right|$$

$$= m^l - 2\, d_H(W, Y) \tag{3.20}$$

where $d_H$ denotes the Hamming distance between the multidimensional codewords W, Y.

From the above expression, $E_W(B)$ will achieve a maximum if and only if $d_H(W, Y)$ achieves

a minimum. Q. E. D.
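The equivalence in Theorem 3.4 can be checked by brute force on a small example. The sketch below again flattens tensors into vectors (an assumption of this example) and uses a hypothetical generator `G` and received word `W`; it verifies the identity (3.20) for every information word and compares the energy maximizer with the minimum-distance decoder.

```python
from itertools import product

# Hypothetical flattened generator: 2 information symbols -> 5 code symbols.
G = [[1, 0, 1, 1, 0],
     [0, 1, 0, 1, 1]]

def encode_pm1(X, G):
    """Y_j = product of the X_i selected by column j of G."""
    Y = []
    for col in zip(*G):
        y = 1
        for x, g in zip(X, col):
            if g:
                y *= x
        Y.append(y)
    return Y

def energy(W, X, G):
    """E_W(X) = W (x) Y(X): inner product of W with the encoded word."""
    return sum(w * y for w, y in zip(W, encode_pm1(X, G)))

def hamming(U, V):
    """Number of positions in which U and V differ."""
    return sum(u != v for u, v in zip(U, V))

W = [-1, 1, -1, 1, 1]          # hypothetical received {+1,-1} word
N = len(W)                     # plays the role of m**l
candidates = list(product([1, -1], repeat=len(G)))

# (3.20): E_W(X) = N - 2 * d_H(W, Y(X)) for every information word X.
for X in candidates:
    assert energy(W, X, G) == N - 2 * hamming(W, encode_pm1(X, G))

best_by_energy = max(candidates, key=lambda X: energy(W, X, G))
best_by_distance = min(candidates, key=lambda X: hamming(W, encode_pm1(X, G)))
```

The exhaustive search is only meant to make the statement concrete; it is exponential in the number of information symbols and is not a practical decoder.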

Given an encoding procedure, we can use the same argument as in the above theorem, to

express the minimum distance of the code. Consider the encoding procedure:

$$X = (X_{i_1,\ldots,i_n}) \to Y = (Y_{j_1,\ldots,j_l}) \tag{3.21}$$

and the energy function with W, a tensor with all the components equal to one.

$$E_W(X) = \sum_{j_1=1}^{m} \cdots \sum_{j_l=1}^{m} Y_{j_1,\ldots,j_l} \tag{3.22}$$

As in Theorem 3.4,

$$E_W(X) = m^l - 2\, d_H(\text{all ones tensor}, Y) \tag{3.23}$$

and the minimum over all tensors X ≠ (all ones tensor) of $d_H$((all ones tensor), Y)

occurs at M = Maximum over all tensors X other than the all ones tensor of $E_W(X)$. (3.24)

Thus $d^*$, the minimum distance of the code, is given by

$$d^* = \frac{m^l - M}{2} \tag{3.25}$$
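Formula (3.25) can be tried out on a small code. In the sketch below the tensors are flattened into vectors and the generator `G` is hypothetical (both are assumptions of this example). The quantity `M` is the largest energy, with respect to the all-ones word, over all information words other than the all-ones word.

```python
from itertools import product

# Hypothetical flattened generator: 2 information symbols -> 5 code symbols.
G = [[1, 0, 1, 1, 0],
     [0, 1, 0, 1, 1]]

def encode_pm1(X, G):
    """Y_j = product of the X_i selected by column j of G."""
    Y = []
    for col in zip(*G):
        y = 1
        for x, g in zip(X, col):
            if g:
                y *= x
        Y.append(y)
    return Y

N = len(G[0])                       # number of codeword components (m**l)
all_ones = (1,) * len(G)            # corresponds to the all-zero information word

# With W = all ones, E_W(X) is just the sum of the codeword components (3.22).
M = max(sum(encode_pm1(X, G))
        for X in product([1, -1], repeat=len(G)) if X != all_ones)

d_star = (N - M) // 2               # minimum distance according to (3.25)
```

For this generator the nonzero codewords have weights 3, 3 and 4, so the value obtained from (3.25) agrees with the minimum weight computed directly.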

The above results are being generalized to infinite dimensional codes utilizing infinite

dimension/order tensors.

In the theory of error correcting codes, minimum distance of a linear code provides a

measure of the number of errors that can be corrected. From (3.25), it is evident that the

maximization of minimum distance of a multidimensional linear block code requires

minimizing M. Thus, we have the following Lemma.

Lemma 3.4: The multidimensional (m, n ; m, l) linear block code which minimizes M in

(3.24) enables the correction of the maximum number of errors among all such error

correcting codes.

Proof: From (3.25), maximization of minimum distance of an (m, n ; m , l) linear code is

equivalent to minimizing M, i.e. minimizing the maximum value of the energy function

over the multidimensional hypercube (excluding the all ones tensor). Such a code design problem fits in

the game-theoretic framework. It is well known that maximization of minimum distance

also maximizes the number of errors that can be corrected. Q. E. D.

TO STABLE STATES OF ENERGY FUNCTIONS

Let C be a linear multidimensional block code (over GF(2)) defined by the generator

tensor G. Let EC be a polynomial over the components of {+1, –1} tensors (energy function)

with the property that every local maximum in EC corresponds to a codeword in C and

every codeword in C corresponds to a local maximum in $E_C$. An interesting problem is

how to construct $E_C$.

In the following, the above problem is solved by considering the parity check tensor of

a multidimensional linear block code.

Consider an (m, l ; m , n) linear multidimensional block code. Without loss of generality,

let us consider the generator tensor G given in the systematic form i.e.

$$G_{j_1, j_2, \ldots, j_n} = \left[\, I_{j_1,\ldots,j_l} \;\; P_{j_{l+1},\ldots,j_n} \,\right] \tag{3.26}$$

The parity check tensor of C is denoted by H and is given by

$$H^T = H_{j_{(n-1)},\ldots,j_1} = \begin{bmatrix} P \\ I \end{bmatrix} \tag{3.27}$$

i.e. a blocked tensor with sub-tensors of compatible dimension and order. From the

definition of a parity check tensor of a multidimensional linear block code,

$$C_{j_1,\ldots,j_n} \otimes H_{j_{(n-1)},\ldots,j_1} = 0 \tag{3.28}$$

where the multidimensional tensor codeword on appropriate/compatible inner product

(outer product followed by contraction over the appropriate indices) with the parity check

tensor gives the zero tensor.

The above equation can be rewritten using the polynomial representation of generator

tensor devised in the previous section (with the tensor of coefficients being the all-ones

tensor. It should be noted that the all-ones tensor in the {1, –1} representation corresponds

to all-zero tensor in the {0,1} representation).

Lemma 3.5: Let E(X) be the polynomial representation of the parity check tensor $H^T$ with respect

to the all ones tensor. Then X ∈ C, the multidimensional linear block code, if and only if

$E(X) = m^{(n-l)}$.

Proof: E, the polynomial representation of the parity check tensor, has $m^{(n-l)}$ terms, and all

the coefficients are equal to one. Hence, $E = m^{(n-l)}$ if and only if all the terms are equal to

one. Q. E. D.

The above Lemma ensures that in the polynomial representation, E (X), every codeword

corresponds to a global maximum (stable state). An interesting question is whether every local

maximum corresponds to a codeword. This question is answered by the following theorem.

Theorem 3.5: Let C be a linear multidimensional block code, with G, H, EC, and E as

defined above. Then E is a polynomial with the properties of EC. That is, X corresponds to

a local maximum in E if and only if X ∈ C.

Proof: From the above Lemma, the global maximum of E is $m^{(n-l)}$; thus every codeword is

a global (and thus a local) maximum. The converse follows from the fact that the tensor H

has a systematic form. Specifically, each of the last $m^{(n-l)}$ variables in E, i.e. $x_{i_1, i_2, \ldots, \hat{i}_{n-l+1}, \ldots, \hat{i}_n}$,

where each of the order indices $\hat{i}_{n-l+1}, \ldots, \hat{i}_n$ assumes m values, appears in only

one term. That is, since I is an identity tensor in the parity tensor H, the first such variable appears

only in the first term, and so on. Now, assume that a tensor V exists that corresponds to a local

maximum which is not a global maximum. That is, E(V) = L, where $L < m^{(n-l)}$. Hence, at

least one term exists in E(V) that is not one. However, this term can be made one by flipping

the value of the variable that appears only in this term, which increases E without changing

any other term. This contradicts the fact that V is

a local maximum. Q. E. D.
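Theorem 3.5 can be checked exhaustively on a small systematic code. In the sketch below tensors are flattened into vectors, the parity part `P` is hypothetical, and a "local maximum" is taken to mean a point whose energy cannot be increased by flipping the sign of a single component; all three are assumptions of this example.

```python
from itertools import product

# Hypothetical parity part of a systematic generator G = [I | P].
P = [[1, 1, 0],
     [0, 1, 1]]
k, r = len(P), len(P[0])        # information symbols, parity symbols

def parity_energy(X):
    """Sum of the r parity-check monomials, all coefficients equal to one."""
    total = 0
    for c in range(r):
        term = X[k + c]          # parity variable: appears in this term only
        for i in range(k):
            if P[i][c]:
                term *= X[i]
        total += term
    return total

def is_codeword(X):
    """Every parity-check monomial equals one, i.e. the energy is r."""
    return parity_energy(X) == r

def is_local_max(X):
    """No single sign flip strictly increases the energy."""
    e = parity_energy(X)
    for pos in range(k + r):
        flipped = list(X)
        flipped[pos] = -flipped[pos]
        if parity_energy(flipped) > e:
            return False
    return True

words = list(product([1, -1], repeat=k + r))
codewords = [X for X in words if is_codeword(X)]
local_maxima = [X for X in words if is_local_max(X)]
```

In this example the two lists coincide: a word that violates a parity check always gains energy when the parity variable of that check is flipped, exactly as in the proof.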

To summarize, given a linear code C, the algorithm for constructing a polynomial is as follows:

(1) Construct the systematic generator tensor of C by the standard techniques in

tensor algebra,

(2) Construct the systematic parity check tensor of C in accordance with (3.27)

(3) Construct E , which is the polynomial representation of H with respect to the all-

ones tensor. By the above Theorem 3.5, EC = E .

In the following, generalizations of the above results are discussed, and some important

remarks are provided.

(A) The construction just described also works for cosets of linear multidimensional

block codes. Let W be a tensor of dimension m and order (n – l) of the coefficients

of E. In the construction described above, the all-ones coefficient tensor was

chosen and it was concluded that EC = E . It corresponds to the all-zero syndrome

tensor. Let C′ be a coset of C, and let T be the syndrome which corresponds to

C′. Utilizing the proof argument of Theorem 3.5, it can be proven that a one-to-one

correspondence exists between the local maxima of the polynomial representation

of the parity check tensor H with W = T and the tensors in the coset C′. Clearly,

the syndrome that corresponds to the code C is the all-ones tensor (by noting that

in the transformation in section 3, 0 goes to 1).

(B) The construction described in this section is a dual way of defining the maximum

likelihood decoding (MLD) problem (with respect to the one suggested in section 3).

Consider a linear multidimensional block code defined by the parity check tensor

H. Given a tensor V, the maximum likelihood decoding (MLD) problem can be

defined as finding the local maximum in EC closest to V or, equivalently, finding

a local maximum of the energy function associated with the syndrome

(corresponding to V) that is achieved by a tensor of minimum weight.

The above results are generalized to some infinite dimension/order tensors in a

straightforward manner.

In the following section, the above results are generalized to non-binary codes.


are discussed.

Consider a linear multidimensional block code over a finite field GF(p) with p being a

prime. For the sake of notational convenience, we first consider an (m, k ; m, n) linear

multidimensional block code which maps a transmitted input tensor of dimension m, order

k into a codeword tensor of dimension m and order n. Let G denote the generator tensor of

the code which maps the (m, k) input tensor into an (m, n) codeword tensor. Then, $m^k$

symbols of the input tensor B in $Z_p$ are encoded into the codeword V by the procedure:

$$V_{i_1, i_2, \ldots, i_n} = \left( B_{i_1, i_2, \ldots, i_k} \otimes G_{i_1,\ldots,i_k;\, j_1,\ldots,j_n} \right) \bmod p \tag{3.29}$$

The essential idea is once again to utilize the multiplicative representation. Let u be the p-th

root of unity

$$u = e^{j 2\pi / p} \tag{3.30}$$

The additive $Z_p$ group can be represented as a multiplicative group of p-th roots of

unity through the transformation $a \to u^a$.

In the multiplicative representation, the information symbols in the information tensor

are represented as

$$X_{i_1,\ldots,i_k} = u^{B_{i_1,\ldots,i_k}} \tag{3.31}$$

Thus, the encoded codeword tensor V is represented by a new tensor Y, where

$$Y_{i_1,\ldots,i_n} = u^{V_{i_1,\ldots,i_n}} \tag{3.32}$$

$$= u^{\sum_{i_1=1}^{m} \cdots \sum_{i_k=1}^{m} \left( B_{i_1,\ldots,i_k} G_{i_1,\ldots,i_k;\, j_1,\ldots,j_n} \right) \bmod p}$$

$$= \prod_{i_1=1}^{m} \cdots \prod_{i_k=1}^{m} u^{B_{i_1,\ldots,i_k} G_{i_1,\ldots,i_k;\, j_1,\ldots,j_n}}$$

$$= \prod_{i_1=1}^{m} \cdots \prod_{i_k=1}^{m} X_{i_1,\ldots,i_k}^{\,G_{i_1,\ldots,i_k;\, j_1,\ldots,j_n}}$$

Thus, as in the binary case, we can represent a linear

multidimensional code over a finite field with p elements (p is a prime) by an encoding

procedure. The elements are now p-th roots of unity. Thus, given an information tensor

$X = (X_{i_1,\ldots,i_k})$, we have the one-to-one assignment

$$X = (X_{i_1,\ldots,i_k}) \to Y = (Y_{i_1,\ldots,i_n}) \tag{3.33}$$

where each component of $Y = (Y_{i_1,\ldots,i_n})$ is a monomial.

We discuss the maximum likelihood decoding problem with respect to two different

distance measures. In the first generalization, we consider solving the maximum likelihood


decoding (MLD) problem with the metric being the Hamming distance between the tensors

while in the second case, we consider the Lee distance.

The generalization for the case where the Hamming distance is utilized in the maximum

likelihood decoding (MLD) problem is based on the following well known Lemma.

Lemma 3.6: Let p be a prime, and let $u = e^{j 2\pi / p}$. Assume $k \in \{0, 1, 2, \ldots, (p-1)\}$; then

$$\frac{1}{p} \sum_{m=0}^{p-1} u^{km} = \begin{cases} 1, & \text{if } k = 0, \\ 0, & \text{otherwise.} \end{cases} \tag{3.34}$$
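Lemma 3.6 is easy to confirm numerically. The sketch below evaluates the left-hand side of (3.34) in floating point for a few small primes; the function name `root_sum` and the tolerance are choices made for this example.

```python
import cmath

def root_sum(k, p):
    """(1/p) * sum over m = 0..p-1 of u**(k*m), with u = exp(2*pi*j/p)."""
    u = cmath.exp(2j * cmath.pi / p)
    return sum(u ** (k * m) for m in range(p)) / p

# The average is 1 for k = 0 and vanishes (up to rounding) for every other k.
for p in (2, 3, 5, 7):
    for k in range(p):
        expected = 1.0 if k == 0 else 0.0
        assert abs(root_sum(k, p) - expected) < 1e-9
```

Applied to $k$ equal to the difference between a received symbol and a codeword symbol, the sum acts as an indicator of agreement, which is how the Lemma enters the Hamming-distance energy function of the next theorem.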

The generalization is stated through the following theorem.

Theorem 3.6: Consider an (m, k ; m, n) multidimensional linear block code over GF(p), with

p being a prime. Let $X \to Y$ be the corresponding encoding procedure. Let $E_W^p$ be the

following multivariate polynomial representation of the generator tensor G with respect to

an arbitrary received tensor W:

$$E_W^p(Y) = \sum_{l=0}^{p-1} \left( W^{\bullet}_{i_1,\ldots,i_n} \otimes Y_{i_1,\ldots,i_n} \right)^l \tag{3.35}$$

where $W^{\bullet}$ denotes the complex conjugate of W and ⊗ denotes the inner product between

the tensors. Then, the maximum likelihood decoding of $W_{i_1,\ldots,i_n}$ is equivalent to finding

the maximum of $E_W^p(Y)$.

Proof: It follows by the same argument as Theorem (3.4), adapted to the variables appearing

in the polynomial $E_W^p(Y)$, and the application of the above Lemma. Q.E.D.

The essence of the above theorem stated in more explicit language leads to the following

conclusion.

Given a received tensor $W_{i_1,\ldots,i_n}$, the closest codeword tensor (in Hamming distance) to

W in C (the code utilized at the input to the multidimensional channel) corresponds to a

tensor B if and only if

$$E_W(B) = \max_{\text{all tensors}} E_W(Y) = \max_{\text{all tensors}} \sum_{l=0}^{p-1} \left( W^{\bullet}_{i_1,\ldots,i_n} \otimes Y_{i_1,\ldots,i_n} \right)^l \tag{3.36}$$

Next, we consider the maximum likelihood decoding problem with respect to the Lee

distance. We first consider the cases where p = 3 or 5. In these cases, there are easy

expressions for the energy function. It is convenient to redefine the energy function in the

following manner:

Given an encoding procedure for a transmitted tensor $X = (X_{i_1,\ldots,i_k})$ into a codeword

tensor $Y = (Y_{i_1,\ldots,i_n})$, i.e.

$$X = (X_{i_1,\ldots,i_k}) \to Y = (Y_{i_1,\ldots,i_n}) \tag{3.37}$$

and $W = (W_{i_1,\ldots,i_n})$, a tensor whose entries are the p-th roots of unity, we redefine the energy

function as follows:

$$E_W(X) = \left\lfloor \mathrm{Re}\left( W^{\bullet}_{i_1,\ldots,i_n} \otimes Y_{i_1,\ldots,i_n} \right) \right\rfloor \tag{3.38}$$

where Re(x) denotes the real part of the complex number x, $\lfloor x \rfloor$ denotes the integral part of

the number x and $x^{\bullet}$ denotes the complex conjugate of x.

It should be noted that the energy function coincides with the one for p = 2 (in the case

u = –1). The definition of Lee distance is provided to facilitate the

understanding of the further discussion.

Definition 3.6

The Lee weight of an m-dimensional tensor of order k, $X = (X_{i_1,\ldots,i_k})$, $X_{i_1,\ldots,i_k} \in Z_p$, where p is a

prime, is defined as

$$W_L = \sum_{i_1=1}^{m} \cdots \sum_{i_k=1}^{m} \left| X_{i_1,\ldots,i_k} \right| \tag{3.39}$$

where

$$\left| X_{i_1, i_2, \ldots, i_k} \right| = \begin{cases} X_{i_1, i_2, \ldots, i_k}, & 0 \le X_{i_1, i_2, \ldots, i_k} \le p/2, \\ p - X_{i_1, i_2, \ldots, i_k}, & p/2 < X_{i_1, i_2, \ldots, i_k} \le p - 1. \end{cases}$$

The Lee distance between any two compatible tensors is defined as the Lee weight $W_L$

of their difference.
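Definition 3.6 translates directly into a few lines of code. In this sketch a tensor over $Z_p$ is flattened into a list of integers (an assumption of the example), and the function names are hypothetical. The expression `min(x, p - x)` reproduces the two cases of the definition.

```python
def lee_weight(symbols, p):
    """Lee weight of a flattened tensor with entries in Z_p, as in (3.39)."""
    # min(x, p - x) equals x for x <= p/2 and p - x otherwise.
    return sum(min(x % p, p - (x % p)) for x in symbols)

def lee_distance(a, b, p):
    """Lee distance: the Lee weight of the componentwise difference mod p."""
    return lee_weight([(x - y) % p for x, y in zip(a, b)], p)

# Over Z_5 the symbols 0, 1, 2, 3, 4 have Lee weights 0, 1, 2, 2, 1.
weights_z5 = [lee_weight([x], 5) for x in range(5)]
```

For p = 2 and p = 3 every nonzero symbol has Lee weight one, so the Lee distance coincides with the Hamming distance; p = 5 is the first prime for which the two differ.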

With the above definition, we study the cases where p = 3, p = 5. From now on, in the

following discussion, $X \to Y$ denotes the encoding procedure that defines a (multidimensional)

code, and X, Y are tensors of dimension m and order k, n respectively, of

third or fifth roots of unity.

In the following, two new theorems are proved. The first one is the counterpart of

Theorem (3.4). It states that maximum likelihood decoding (MLD) in a ternary code is

equivalent to the maximization of the energy function in (3.38). The Theorem is formally

stated below:

Theorem 3.7: Let p = 3, $A \to B$; then B is the closest multidimensional codeword (in the

Hamming distance) to a received tensor word W if and only if

$$E_W(A) = \max_{X} E_W(X). \tag{3.40}$$

Proof: The proof is similar to that of Theorem (3.4) and is omitted for brevity. Q.E.D.

The proof of Theorem (3.7) as well as Theorem (3.8) requires the utilization of Lemma

(3.6) and a clear understanding of when the energy function is maximized. The new


again based on understanding when the energy function is maximized and how utilizing

Re ( ), the real part of a complex number does not alter the end effect in decoding a

received word.

Now, we consider the problem of maximum likelihood decoding (MLD) with respect

to the Lee distance:

Theorem 3.8: Let p = 5, $A \to B$; then B is the closest multidimensional codeword (in Lee

distance) to a received tensor word W if and only if

$$E_W(A) = \max_{X} E_W(X). \tag{3.41}$$

Proof:

$$E_W(A) = \left|\{(i_1, i_2, \ldots, i_n) : W^{\bullet}_{i_1,\ldots,i_n} \otimes B_{i_1,\ldots,i_n} = 1\}\right| - \left|\{(i_1, i_2, \ldots, i_n) : W^{\bullet}_{i_1,\ldots,i_n} \otimes B_{i_1,\ldots,i_n} = u^2 \text{ or } u^3\}\right| \tag{3.42}$$

$$= m^n - \left|\{(i_1, i_2, \ldots, i_n) : W^{\bullet}_{i_1,\ldots,i_n} \otimes B_{i_1,\ldots,i_n} = u \text{ or } u^4\}\right| - 2\left|\{(i_1, i_2, \ldots, i_n) : W^{\bullet}_{i_1,\ldots,i_n} \otimes B_{i_1,\ldots,i_n} = u^2 \text{ or } u^3\}\right|$$

$$= m^n - d_L(W, B) \tag{3.43}$$

where $W^{\bullet}$ denotes the complex conjugate of all components of W and $d_L$ denotes the

Lee distance. Hence, $E_W(A)$ reaches a maximum if and only if $d_L(W, B)$ reaches the minimum.

Q.E.D.
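The counting argument in (3.42) and (3.43) can be checked numerically for p = 5. The sketch below rests on several assumptions that are not stated explicitly in the text: symbols are stored as exponents in $Z_5$ rather than as complex roots of unity, the "integral part" in (3.38) is read as the floor function, and it is applied to each component before summing. All names are hypothetical.

```python
import math
from itertools import product

p = 5

def lee(x):
    """Lee weight of a single symbol of Z_p."""
    x %= p
    return min(x, p - x)

def component_energy(w, b):
    """floor(Re(conj(u**w) * u**b)) = floor(cos(2*pi*(b - w)/p))."""
    return math.floor(math.cos(2 * math.pi * ((b - w) % p) / p))

# For p = 5 each component contributes 1, 0 or -1, i.e. 1 - (Lee distance).
for w, b in product(range(p), repeat=2):
    assert component_energy(w, b) == 1 - lee(b - w)

def energy(W, B):
    """Sum of the component energies over a flattened tensor."""
    return sum(component_energy(w, b) for w, b in zip(W, B))

# Hypothetical received word and codeword, given by their exponents.
W = [0, 1, 4, 2]
B = [0, 3, 0, 2]
lee_dist = sum(lee(b - w) for w, b in zip(W, B))
```

Under these assumptions the energy equals the number of components minus the Lee distance, so maximizing one is the same as minimizing the other.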

The above results are generalized to infinite dimension/order tensors in a

straightforward manner.

In the theory of error control codes in one dimension, linear block codes were first extensively

studied, and various problems including the sphere packing problem were subjected to intense

theoretical investigations. The research and development led to various theoretical as well

as practical encoding/decoding algorithms. Then, because it was thought that linear codes

are limited from the point of view of various code parameters such as the number of

correctable errors/minimum distance (Ara), non-linear block codes were studied. The research in

this direction culminated in the discovery of codes from algebraic geometry based techniques.

The encoding algorithm was generally easy from the point of view of theory as well as

physical hardware. It is the decoding algorithm which was considered difficult and was the

subject of intense investigations resulting in several decoders. The maximum likelihood

decoding (MLD) problem of linear codes and the relationship to energy functions (discussed


in the previous sections) naturally suggest a search for similar techniques for non-linear

codes. In the following, non-linear multidimensional codes are investigated.

The essential idea in generalizing the results in previous section to non-linear

multidimensional codes is to consider the representation of Boolean functions as polynomials

over the field of real numbers. In the context of one dimensional non-linear codes, part of the

discussion is known (BrB) and is repeated here for the sake of completeness. Also, the utilization

of some subtle ideas associated with tensor products makes the presentation an essential aid for

realizing that non-linear multidimensional codes share various features with linear codes.

Definition 3.7

A Boolean function f on n variables is a mapping

$$f : \{0,1\}^n \to \{0,1\} \tag{3.44}$$

For the present discussion, it is useful to define Boolean functions using the symbols 1

and –1 instead of the symbols 0 and 1, respectively.

Definition 3.8

A Hadamard matrix of order m, denoted by $H_m$, is an m × m matrix of +1’s and –1’s such that

$$H_m H_m^T = m I_m, \tag{3.45}$$

where $I_m$ is the m × m identity matrix. The above definition is equivalent to the assertion

that any two distinct rows of $H_m$ are orthogonal.

Hadamard matrices of order $2^k$ exist for all k ≥ 0. The construction is as follows:

$$H_1 = [1], \qquad H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad H_{2^{n+1}} = \begin{bmatrix} H_{2^n} & H_{2^n} \\ H_{2^n} & -H_{2^n} \end{bmatrix}. \tag{3.46}$$
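The recursion (3.46) can be written down directly. The following sketch builds the matrices as nested lists and checks the defining property (3.45); the function names are chosen for this example only.

```python
def sylvester(k):
    """Hadamard matrix of order 2**k built by the recursion (3.46)."""
    H = [[1]]
    for _ in range(k):
        # New matrix [[H, H], [H, -H]] from the previous H.
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def is_hadamard(H):
    """Check H * H^T = m * I, i.e. distinct rows are orthogonal."""
    m = len(H)
    return all(
        sum(a * b for a, b in zip(H[i], H[j])) == (m if i == j else 0)
        for i in range(m) for j in range(m)
    )

H8 = sylvester(3)
```

Because the matrix is non-singular (its inverse is its transpose divided by the order), a linear system whose coefficient matrix is such a Hadamard matrix always has a unique solution, a fact used in the proof of Theorem 3.9 below.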

Definition 3.9

Given a Boolean function f of order n, $P_f$ is a polynomial (with the coefficients over the field

of real numbers) equivalent to f if and only if for all vectors $X \in \{1, -1\}^n$

$$f(X) = P_f(X). \tag{3.47}$$

An important problem that is relevant to the investigation of non-linear multidimensional

codes is the following:

Given a Boolean function f of order n, compute $P_f$, the polynomial which is equivalent to f.

From the results in section 3, it is evident that the components of the codeword tensor

(of a linear code), in the {1, – 1} representation are Boolean functions (monomials) in the


Definition 3.4. Thus, the idea once again is to represent the corresponding Boolean functions

of non-linear codes as polynomials/monomials over the field of real numbers.

In the context of vector variable, the following inferences from Theorem (3.4) are

well known in switching theory textbooks. But, in the case of vector variables, in (BrB),

an alternative proof is given. In the following, it is shown that given the Boolean

functions which are the components of a codeword tensor, there exist polynomials

(with coefficients over the field of real numbers) equivalent to them over the

multidimensional hypercube.

Theorem 3.9: Let f be a Boolean function of order at most $m^n$ (in

the components of a tensor X of dimension m and order n). Let $P_f$ be a polynomial equivalent

to f. Let B denote the tensor of coefficients of $P_f$. Let P denote the tensor of at most $2^{m^n}$

values of $P_f$ (corresponding to the $m^n$ {+1, –1} components of the tensor X). Then,

(1) the polynomial $P_f$ always exists and is unique,

(2) the following relationship is satisfied:

P = G ⊗ B, where ⊗ denotes the inner product of tensors.

Proof: The proof is constructive in nature. The essential idea is to determine the

coefficients of the polynomial by solving a system of linear equations, possibly imbedded

in tensors.

First, let us consider a Boolean function f of one variable and let us determine the

coefficients of the polynomial Pf .

$$P_f(x) = b_0 + b_1 x \tag{3.48}$$

Evaluating the polynomial on the domain of the Boolean function, we have

$$P_f(1) = b_0 + b_1 \tag{3.49}$$

$$P_f(-1) = b_0 - b_1 \tag{3.50}$$

Thus, P = G ⊗ B, where

$$G = \begin{bmatrix} +1 & +1 \\ +1 & -1 \end{bmatrix} \tag{3.51}$$

G is a Hadamard matrix and B, as defined before, is the vector of coefficients of

$P_f(X_1, X_2, \ldots, X_{n+1})$.

Remark

Before proceeding with the proof, the following comparison/discussion on the similarities

and differences between tensor products, matrix products is very relevant. Consider a G

matrix and a column vector B. The tensor product, when the variables (matrix, column

vector) are treated as tensors, is given by

$$G \otimes B = G_{i,j} B_k = P_{ijk} \xrightarrow{\text{CONTRACTION}} P_i, \qquad P_{ijk}: \begin{bmatrix} G_{11} B_1 & G_{11} B_2 & G_{12} B_1 & G_{12} B_2 \\ G_{21} B_1 & G_{21} B_2 & G_{22} B_1 & G_{22} B_2 \end{bmatrix} \tag{3.52}$$

Now, we perform contraction on certain indices of the tensors. The resulting tensor is

a first order tensor. Specifically, suppose we do the contraction over the indices j, k. Then,

we have

$$\begin{bmatrix} G_{11} B_1 + G_{12} B_2 \\ G_{21} B_1 + G_{22} B_2 \end{bmatrix} \tag{3.53}$$

Thus, the tensor product, in contrast to the matrix product allows more freedom in

summing the components over different indices (contraction over different indices in the

language of tensor algebra) of the tensor.

Now, we return to the original proof.

The above argument is now generalized to at most $m^n$ variables (or an arbitrary

finite/countable number of variables which are possibly the components of a tensor) by

the method of mathematical induction.

The case m = 1, n = 1 is proved at the beginning of the proof. Since $m^n$ is still a large

(finite) number, say l, it is sufficient as well as necessary to prove the result for a finite

number l (in the case considered, the binary variables are imbedded inside a tensor. Also,

the polynomial representing the Boolean function is expressed through the inner product

operation over appropriate tensors).

Now, as an induction hypothesis, assume that the claim is true for l variables:

$$P = G_{2^n} B \tag{3.54}$$

Since every polynomial of (l + 1) variables can be written as a combination of

two polynomials each of l variables, we have

$$P_f(X_1, X_2, \ldots, X_{l+1}) = P_{f_1}(X_1, X_2, \ldots, X_l) + X_{l+1} P_{f_2}(X_1, X_2, \ldots, X_l) \tag{3.55}$$

There are two possibilities, either $X_{l+1} = +1$ or $X_{l+1} = -1$. Hence, by the induction

hypothesis and (3.55), the system of linear equations in (l + 1) variables becomes

$$P = \begin{bmatrix} G_{2^n} & G_{2^n} \\ G_{2^n} & -G_{2^n} \end{bmatrix} B \tag{3.56}$$

With

$$G_1 = [1]; \quad G_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}; \quad G_{2^{n+1}} = \begin{bmatrix} G_{2^n} & G_{2^n} \\ G_{2^n} & -G_{2^n} \end{bmatrix} \tag{3.57}$$

we have $P = G_{2^{n+1}} B$.

Hadamard matrices are non-singular; thus, for any given f, a unique Pf exists (defined

by a vector of coefficients).


In the language of tensor algebra, the same argument holds true except that the tensor

utilized to couple the coefficients of the polynomial representing a

Boolean function to the values of the polynomial can have ‘0’ (zero) entries in addition to +1, –1

entries (when contraction is performed over the appropriate indices). Uniqueness of such

a polynomial is ensured by its uniqueness as a representation of the Boolean function (from

the discussion/proof above). Thus, in the tensor algebra notation, we have

$$P = G \otimes B \tag{3.58}$$

where ⊗ denotes the inner product of two tensors. Q. E. D.
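The constructive proof can be turned into a short computation for the vector (flattened) case. The sketch below assumes a particular ordering that is not fixed by the text: bit k of the row index r says whether $X_k = -1$, and bit k of the column index s says whether $X_k$ occurs in the monomial. With this ordering the coupling matrix has entries $(-1)^{\mathrm{popcount}(r \wedge s)}$, which is the matrix generated by the recursion (3.57). The Boolean function `f_and` and all other names are hypothetical.

```python
def hadamard_entry(r, s):
    """Entry (r, s) of the order-2**n matrix: (-1)**popcount(r & s)."""
    return -1 if bin(r & s).count("1") % 2 else 1

def coefficients(values):
    """Solve P = G B for B, using G G^T = 2**n I and the symmetry of G."""
    size = len(values)
    return [sum(hadamard_entry(s, r) * values[r] for r in range(size)) / size
            for s in range(size)]

def evaluate(coeffs, X):
    """Evaluate sum_s B_s * (product of X_k with bit k set in s)."""
    total = 0.0
    for s, b in enumerate(coeffs):
        term = b
        for k, x in enumerate(X):
            if (s >> k) & 1:
                term *= x
        total += term
    return total

n = 2
# Row r of the value vector: bit k of r set  <=>  X_k = -1.
points = [tuple(-1 if (r >> k) & 1 else 1 for k in range(n))
          for r in range(2 ** n)]

def f_and(X):
    """Logical AND in the {+1,-1} convention, where -1 stands for 'true'."""
    return -1 if all(x == -1 for x in X) else 1

B = coefficients([f_and(X) for X in points])
```

For this example the computation yields $P_f(X_1, X_2) = \tfrac{1}{2}(1 + X_1 + X_2 - X_1 X_2)$, which agrees with the function at all four points of the domain.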

It should be clear that the above representation theory has relevance to the minimum

sum of products representation of a Boolean function. As is easily seen, the above theory

also holds true if one is interested in finding the equivalent polynomial of a Boolean function

which assumes {0,1} values. One way to see the result is by the following claim.

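CLAIM: Every monomial over {1, –1} can be written as a polynomial over {0,1} by the

change of variable (BrB) x = 1 – 2u, as follows:

$$\prod_{i=1}^{k} X_i = 1 + \sum_{i=1}^{k} (-2)^i \sum_{S_i} \prod_{j \in S_i} U_j \tag{3.59}$$

where the inner sum runs over all subsets $S_i$ of {1, ..., k} with i elements. For example, for k = 2,

$$X_1 X_2 = (1 - 2U_1)(1 - 2U_2) = 1 - 2(U_1 + U_2) + 4 U_1 U_2$$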
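The identity (3.59) can be verified exhaustively for small k. In the sketch below the inner sum is taken over all index subsets of size i, and the function name is hypothetical.

```python
from itertools import combinations, product
from math import prod

def monomial_via_u(U):
    """Right-hand side of (3.59) evaluated at a {0,1} point U."""
    k = len(U)
    total = 1
    for i in range(1, k + 1):
        # Sum of the products of U_j over all index subsets of size i.
        inner = sum(prod(U[j] for j in S) for S in combinations(range(k), i))
        total += (-2) ** i * inner
    return total

# Compare with the monomial X_1 * ... * X_k under X_j = 1 - 2 * U_j.
for k in range(1, 5):
    for U in product([0, 1], repeat=k):
        X = [1 - 2 * u for u in U]
        assert prod(X) == monomial_via_u(U)
```

The check simply expands the product of the factors $(1 - 2U_j)$, so it holds for every k; the loop stops at k = 4 only to keep the example small.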

The representation theory developed above is now utilized for representing the

multidimensional error correcting codes in a way that generalizes the representation

described in section 3.

Consider the linear multidimensional (m, k ; m, n) code C. The code can be represented

by viewing each component of the codeword tensor as a Boolean function of at most $m^k$

variables. A tensor V ∈ C if and only if there exists an m-dimensional tensor of order k

(binary entries) such that, with the {1, –1} tensor $X = (X_{i_1,\ldots,i_k})$,

The Boolean functions associated with the components of a linear multidimensional

codeword tensor are determined by the generator tensor entries through which the code

is represented. For linear multidimensional codes, every component of the codeword

tensor $f_{i_1, i_2, \ldots, i_n}(X)$ corresponds to an XOR operation of some variables of the information

tensor (determined by the corresponding entries of the generator tensor). Thus, for every

component $(i_1, i_2, \ldots, i_n)$, the Boolean function $f_{i_1, i_2, \ldots, i_n}(X)$ can be transformed by the method

described above into an equivalent polynomial in at most $m^k$ variables which consists of

one monomial only.

Now, by the same argument as in Theorem (3.4), the maximum likelihood decoding

(MLD) of a given received tensor word is equivalent to solving the following

maximization problem:

$$\max \left( W_{i_1,\ldots,i_n} \otimes f_{i_1,\ldots,i_n}(X) \right). \tag{3.61}$$

The reasoning through which we arrived at the above conclusion shows

that the MLD formulation (3.61) also holds for non-linear

multidimensional codes. For non-linear multidimensional codes, a component of the

codeword tensor $f_{i_1, i_2, \ldots, i_n}(X)$ can consist of more than one monomial. Other than that, each

component satisfies all the conditions to arrive at the above conclusion.

From the above generalization, it follows that, for both linear as well as non-linear

multidimensional codes, the maximum likelihood decoding problem is equivalent to the

maximization of a multi-variate polynomial defined over the components of {1, –1} tensor

i.e. over a tensor X with entries, of dimension m and order k .

Hence, the following interesting theorem follows: the three problems listed below are equivalent.

(1) Maximization of multivariate polynomials with rational coefficients over the

multidimensional hypercube,

(2) Maximum likelihood decoding (MLD) problem of an (m, k ; m, n) multidimensional

linear code,

(3) Maximum likelihood decoding (MLD) problem of a not necessarily linear (possibly

non-linear) multidimensional code, each of whose codewords are tensors of

dimension m and order k .

In view of the results in section 5 for non-binary codes which parallel those in section

3 for binary codes, maximum likelihood decoding of non-binary, nonlinear

multidimensional codes is again equivalent to maximization of multivariate (variables

being the components of a tensor) polynomial over a subset of multidimensional lattice.

Various results (theorems, concepts, designs etc.) on optimization of multi-variate

polynomials over various subsets of lattice were developed in various scientific fields

such as electrical engineering, mathematics, computer science, operations research

etc. These results are being translated to multidimensions and also the repercussions

which follow immediately from the tensor linear operator are being documented.

For instance, in one dimensional logic theory, various theorems including the

representation of a Boolean function in the minimum sum of products (MSOP) form

are well studied. In view of the results in (Rama 3), utilizing the fact that matrix

linear operator is a special case of tensor linear operator, various theorems on


the polynomials/monomials are imbedded inside tensors. Inasmuch as the linear

space structure is utilized in deriving the results/theorems, the translation of the

results from one dimensional logic theory to multidimensions is done with the generic

principles described in (Rama 3), (Chapter 4).

The results in sections 3, 4, 5, 6 effectively demonstrated the relationship between

multidimensional codes, the energy functions optimized by multidimensional neural

networks over various subsets of the lattice, optimization of multivariate polynomials (the

terms/monomials of which are based on the generator, other tensors) over the various

subsets of the multidimensional lattice. Thus, these local optima of the multivariate polynomials have a structure parallel to various linear transformation groups and bases of certain linear spaces.

Utilizing a natural leap of imagination, the author considers univariate as well as

multivariate polynomials, power series in tensor variables with tensor coefficients.

Specifically, an interesting problem that arises in structured Markov random fields is the

problem of determination of tensor zeroes of the following univariate tensor polynomial,

power series equations (Rama 6).

$$X^2 \otimes A_2 + X \otimes A_1 + A_0 = 0$$

$$\sum_{j=1}^{m} X^j \otimes A_j = 0$$

$$\sum_{j=1}^{\infty} X^j \otimes A_j = 0 \tag{3.62}$$

where X, {A} are tensors of compatible dimension, order such that the inner/outer

product operations are well defined. The solution techniques developed in (Rama 11)

when the linear operators are matrices are extended to the tensor linear operator case

in (Rama 6). Also, various results that are well documented in the books such as (Gol)

for matrix polynomials based on the properties of matrix linear operator are extended

to tensor linear operator. Furthermore, in one dimensional system theory, various results

are developed for systems of matrix polynomial equations utilizing only linear operator

properties of a matrix. These results are extended to systems of tensor polynomial

equations (Rama 3). In (Rama 6), the author formulates as well as solves the problem

of determination of tensor variate zeroes of multi-tensor variate polynomial, power

series equations

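$$\sum_{i_1=0}^{L} \cdots \sum_{i_m=0}^{L} X_1^{i_1} \otimes X_2^{i_2} \cdots X_m^{i_m} \otimes A_{i_1,\ldots,i_m} = 0$$

$$\sum_{i_1=0}^{\infty} \cdots \sum_{i_m=0}^{\infty} X_1^{i_1} \otimes X_2^{i_2} \cdots X_m^{i_m} \otimes A_{i_1,\ldots,i_m} = 0 \tag{3.63}$$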

52 Multidimensional Neural Networks: Unified Theory

Various other associated results are documented in (Rama 6). It is well known that the zeroes of a uni-variate scalar polynomial constitute a group. By utilizing the set of zeroes of a determinantal polynomial associated with the uni-variate/multi-variate (tensor variables) polynomial, the set of tensor zeroes is divided into a set of equivalence classes. Thus, a group structure is imbedded onto the linear subspace of tensor zeroes of uni-variate/multi-variate polynomial equations.

Unlike the multivariate polynomials (whose terms/monomials are based on the

components of tensors) optimized in sections 3, 4, 5, 6; in view of the above results, a

natural question that arises is whether the local optima of multi-tensor variate polynomials (each variable being a tensor) over various subsets of the multidimensional (very high dimensional) lattice lead to codeword sets with better properties. When the information tensor,

generator tensor, codeword tensors are blocked into sub-tensors and the objective function

for the optimization problem over a subset of multidimensional lattice is rewritten, it is

evident that a multi-tensor variate polynomial appears. Thus, such polynomials are

subsumed in the ones considered in sections 3, 4, 5, 6.

In computer science, operations research and other fields, problems of the following

form arise very often:

$$\text{Maximize} \quad \sum_{i=1}^{n} W_i \prod_{j \in S_i} X_j \tag{3.64}$$

where $S_i$ is a subset of {1, 2, ..., n} and $X_j \in \{0, 1\}$. Thus, the problem is concerned with optimizing a multivariate polynomial whose variables assume integer values. By the discussion in this section, every polynomial over {1, –1} can be transformed to an equivalent one over {0, 1} by a change of variable. It is shown in section 2 that a special case of the above problem, i.e. maximization of a quadratic form in {1, –1} variables, arises in connection with the determination of the global optimum stable state of a neural network and is equivalent to the minimum cut problem. This problem is known to be NP-hard.
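The change of variable between {1, –1} and {0, 1} polynomials can be illustrated with a small Python sketch. The substitution x_j = 1 – 2 b_j is an assumption made here for illustration (the text does not fix a particular map), and the dictionary representation, the function names and the example polynomial are all hypothetical.

```python
from itertools import combinations, product

def evaluate(poly, values):
    """Evaluate sum_S w_S * prod_{j in S} values[j] for a multilinear polynomial.

    poly maps a frozenset of variable indices S to its coefficient w_S.
    """
    total = 0
    for subset, weight in poly.items():
        term = weight
        for j in subset:
            term *= values[j]
        total += term
    return total

def pm1_to_01(poly):
    """Rewrite a polynomial in x_j in {1, -1} as one in b_j in {0, 1}.

    Assumes the substitution x_j = 1 - 2*b_j, so that
    prod_{j in S} x_j = sum over subsets T of S of (-2)^|T| * prod_{j in T} b_j.
    """
    result = {}
    for subset, weight in poly.items():
        members = sorted(subset)
        for size in range(len(members) + 1):
            for chosen in combinations(members, size):
                key = frozenset(chosen)
                result[key] = result.get(key, 0) + weight * (-2) ** size
    return {k: v for k, v in result.items() if v != 0}

def maximize_01(poly, n):
    """Brute-force maximum of a 0-1 polynomial in n variables (exponential time)."""
    return max(evaluate(poly, b) for b in product((0, 1), repeat=n))

# Hypothetical example: 3*x0*x1 - 2*x1*x2 + x0 over x in {1, -1}^3.
poly_pm1 = {frozenset({0, 1}): 3, frozenset({1, 2}): -2, frozenset({0}): 1}
poly_01 = pm1_to_01(poly_pm1)
```

The brute-force search is only meant to confirm that both forms take the same values; it does not address the complexity of the problem.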

The problem in (3.64) has been studied extensively by various researchers, and the main effort has concentrated on identifying the special cases which are solvable in polynomial time and on devising approximation techniques. The most common technique for solving an unconstrained {0, 1} program of the form in (3.64) is to transform it into the problem of finding the maximum weight independent set in a graph, which is itself an NP-hard problem. The transformation uses the concept of a conflict graph of a 0-1 polynomial. In (BrB), it is shown how decoding techniques can be utilized to maximize 0-1 nonlinear programs.

The multidimensional version of the 0-1 nonlinear programming problem in (3.64) is

given by

$$\text{Maximize} \quad \sum_{i=1}^{n} W_i \otimes X_i \tag{3.65}$$

where W, X are tensors containing, respectively, the known coefficients (the w's in W) and the monomials in the variable components of the unknown tensor X. The inner product between these two tensors provides the scalar objective function, whose variables are allowed to assume only {0, 1} or, more generally, finitely many values. It is shown in (Rama 6) that such an integer programming problem can be solved utilizing the multidimensional decoding techniques for linear block multidimensional codes. These results in operations research are omitted here and relegated to (Rama 6).

In one/two independent dimensions, various static optimization problems are solved under

the sub-fields of optimization theory such as (a) linear programming, (b) non-linear

programming, (c) calculus of variations, (d) combinatorial optimization etc. With the

innovative idea of formulating and solving the parallel problems in multidimensions (Rama

3) through the utilization of tensor linear operator (motivated by practical applications),

vast literature in multidimensional optimization theory is generated. Various consequences of

this innovative idea of the author are fully explained in the companion research article

(Rama 3) on dynamic optimization. In the following, some innovative ideas of generic

consequence in static optimization are described.

In view of the results in section 5, the constraint set over which a multivariate

polynomial (terms of the polynomial expressed in terms of the components of a generator

tensor, received tensor in the case of MLD) is optimized is a subset of the multidimensional

lattice (or bounded lattice, say, in multidimensions) and subsumes the multidimensional

hypercube as its subset. These results naturally lead to a question as to whether it is

possible to utilize the results in sections 3, 4, 5, 6 for optimizing multivariate polynomials

over more general constraint sets in multidimensions. In the following theorems, constrained

optimization over more general constraint sets utilizing the results of sections 3, 4, 5, 6 is

discussed.

Theorem 3.11: Consider a compact set in a multidimensional metric space. The local

optimum of a multivariate polynomial (with the terms/monomials expressed in terms of

the components of tensors and assuming binary/finitely many integer values) whose

variables are allowed to assume finitely many values, over the compact set, occurs at the

union of codewords of finitely many multidimensional non-binary/binary codes.

Proof: From real/complex analysis (also topology), we have the Heine-Borel Theorem,

which states that every open covering of the compact set ( in the space described by multiple

independent variables ) has a finite sub-covering. The covering generally consists of open

sets could be utilized for covering ). But, it can be chosen to be a collection of convex hulls

of bounded lattices in multidimensions. This (possibly countable) collection covers the

compact set and thus has a finite sub covering. This implies that the constraint set chosen

for optimization can be covered by finitely many bounded lattices (convex hulls of bounded

lattices in multiple independent dimensions ).

But, by the results of sections 3, 4, 5, 6, the local optima of multivariate polynomials (terms/monomials expressed in terms of the components of tensors) over a multidimensional bounded lattice (subsets of the multidimensional lattice) constitute linear/non-linear multidimensional codewords. Hence, the local optimum is achieved at the set of codewords (tensors) of finitely many linear/non-linear codes. Q. E. D.

It should be noted that the determination of global/local optimum of the multivariate

polynomial over the compact set is reduced to determining the global/local optimum of

the energy functions of finitely many neural/generalized neural networks. It should be

understood that some codewords may not be in the feasible region i.e. strictly inside the

compact set. Also, when specific compact sets are chosen, further detailed information can

be obtained on the local optima.

Theorem 3.12: Optimization of a multivariate polynomial (with the terms/monomials

expressed in terms of the components of tensors) over an arbitrary open set in a

multidimensional space (metric space) is equivalent to the optimization over the union

of codewords of countably many multidimensional linear codes or an infinite

dimensional code.

Proof: Let us consider an arbitrary open set in a multidimensional space (metric space). By Lindelöf's covering lemma, the open set can always be covered by at most a countable collection of open balls or other sets. It is evident that the covering can be chosen to be a countable collection of convex hulls of bounded lattices (in multidimensional space). But, by the results in section 4, the local optima of the multivariate polynomial (the monomials/terms being expressed in terms of the components of tensors) over a bounded lattice constitute a multidimensional codeword set. Thus, the local optima of a multivariate polynomial occur at the union of codewords of countably many multidimensional codes. Q. E. D.

Remark

Suppose, the compact set/open set (in multidimensions) is covered by finitely/countably

many hyperspheres (multidimensional) and a quadratic/higher degree form is optimized.

By the spectral representation theorem, the local optima of quadratic/higher degree form

occur at the eigentensors with the eigenvalues being the corresponding values. This corresponds to $L^2$ norm based optimization.

The above theorems illustrate two essential ideas of generic utility in static

optimization: (a) optimization over more general constraint sets, (b) decomposition principle.

static optimization.

Decomposition Principle

Consider an arbitrary constraint set over which an objective function is optimized in

one or more independent dimension variables. The constraint set is decomposed into the

union of finitely many special sets with interesting structure. Optimization of various

objective functions over the special sets has various interesting features: (a) various results

are well known, (b) the local optima have interesting structure, (c) it is thoroughly studied

etc. Utilizing these features, optimization of any objective function over the original set is

decomposed into simpler problems. The above two theorems are only illustrative.

The discovery and application of the above decomposition principle to

multidimensional constrained optimization problems naturally led the author to investigate

various other innovative ideas in static optimization.

(I) Approximation of Objective Function by Polynomials, Power Series ( other Special

Classes of Functions)

Polynomials and power series (uni/multi-variate) are very important classes of functions.

The optimization results (unconstrained as well as constrained) associated with these functions

enable one to derive the local optimum of some classes of functions over various constraint

sets invoking standard theorems (from approximation theory). For instance, the following

theorem enables deriving results on continuous objective functions utilizing polynomials:

Theorem 3.6 is utilized in association with the following theorem.

Theorem 3.13: Every continuous function over a compact set always attains its maximum/minimum over the set. Every continuous function over a compact set can be arbitrarily closely approximated by polynomials (multi-variate/univariate).

Also, invoking the standard theorems from approximation theory, various classes of

functions are arbitrarily closely approximated by polynomials: uni-variate/multi-variate.

Thus, when these functions are utilized as objective functions, results associated with

polynomials (derived in sections 3-6 ) are invoked.
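As a hedged illustration of polynomial approximation of a continuous objective function, the sketch below uses Bernstein polynomials on the interval [0, 1]. The choice of Bernstein polynomials, the test function and the function names are assumptions made for this example; the text itself only invokes the approximation theorem.

```python
from math import comb

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable."""
    samples = [f(k / n) for k in range(n + 1)]

    def p(x):
        return sum(samples[k] * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))
    return p

def max_error(f, p, points=201):
    """Largest |f - p| on a uniform grid of [0, 1]."""
    grid = [i / (points - 1) for i in range(points)]
    return max(abs(f(x) - p(x)) for x in grid)

# Hypothetical objective: continuous on [0, 1] but not differentiable at 0.5.
f = lambda x: abs(x - 0.5)

# The grid error shrinks as the degree of the approximating polynomial grows.
errors = {n: max_error(f, bernstein(f, n)) for n in (10, 40, 160)}
```

Bernstein polynomials converge slowly, so they serve here only to make the statement of the theorem concrete, not as a recommended numerical method.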

(II) Discovery of new local/global optimization techniques

This requires utilizing either new classes of functions or new constraint sets. The constraint set structure endows the local optima of some functions with interesting structure, and the properties satisfied by the objective functions enable the discovery of efficient techniques.

NP-Hard Problems:

In computer science, operations research and other applied/theoretical research fields, various NP-hard problems are well identified and studied. It is well known that any NP-complete problem is as complex (in the terminology of complexity theory in theoretical computer science) as any other NP-complete problem, since each can be reduced to the other in polynomial time. Finding algorithms which are efficient (in terms of complexity) for an NP-hard problem is well recognized as a difficult problem. The following is a difficult open problem in theoretical computer science:

Problem: Does a polynomial time algorithm exist for an NP-complete problem? In other words, is the class of problems in NP the same as the class of problems in P, i.e. is P = NP?

In the following, an innovative algorithm/approach to solve various NP-hard problems

in one dimension is described. The multidimensional generalization of this algorithm/

approach to any NP-hard problem (in multidimensions) is being formalized. It is an

extension of the following results to multidimensions.

In section 2, the problem of computation of minimum cut in a graph is shown to be

equivalent to the problem of determining the global optimum of the energy function of a

neural network i.e. maximizing a quadratic form over the hypercube. It is well known that

this is an NP-hard problem. In the following, an attack on this problem is described.

Positive Definite Synaptic Weight Matrix: Determination of Global Optimum Stable State of a

Neural Network:

Consider a neural network whose synaptic weight matrix is symmetric as well as

positive definite. In the following, an algorithm to determine the global optimum stable

state of such a neural network is described.

(a) By a well known theorem in linear algebra, every positive definite symmetric matrix S can be decomposed into the following form by means of the Cholesky decomposition:

$$S = N N^T \tag{3.66}$$

where N is a lower triangular matrix.

(b) The quadratic form being optimized by the neural network over the hypercube can be expressed in the following form:

$$X^T S X = X^T N N^T X = Y^T Y, \quad \text{where } Y = N^T X. \tag{3.67}$$

Since S is positive definite, $X^T S X > 0$. Thus, $Y^T Y > 0$. The scalar expression for the quadratic form in terms of Y is given by $\sum_{j=1}^{n} Y_j^2$. Thus, it is evident that the value of the quadratic form is either

k

each of the terms is a linear form. Thus, to maximize the quadratic form, each linear form

maximum/minimum value (whichever is larger) is determined over the constraint set, hypercube.

Thus, the original NP-hard problem (of maximization of a quadratic form over the

hypercube) is reduced to several linear programming problems i.e. optimization of several

linear forms over the hypercube.
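The decomposition in (3.66) and (3.67) can be checked numerically with a minimal Python sketch. The 3 x 3 matrix, the function names and the brute-force comparison are assumptions introduced for illustration; the sketch only evaluates the linear forms $Y_j$ and their separate maxima over the hypercube.

```python
import math
from itertools import product

def cholesky(S):
    """Lower-triangular N with S = N N^T, for a symmetric positive definite S."""
    n = len(S)
    N = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(N[i][k] * N[j][k] for k in range(j))
            if i == j:
                N[i][j] = math.sqrt(S[i][i] - partial)
            else:
                N[i][j] = (S[i][j] - partial) / N[j][j]
    return N

def quadratic_form(S, X):
    n = len(S)
    return sum(X[i] * S[i][j] * X[j] for i in range(n) for j in range(n))

def linear_forms(N, X):
    """Components of Y = N^T X; each component is a linear form in X."""
    n = len(N)
    return [sum(N[i][j] * X[i] for i in range(n)) for j in range(n)]

# Hypothetical symmetric positive definite synaptic weight matrix.
S = [[4.0, 1.0, 1.0],
     [1.0, 3.0, 0.0],
     [1.0, 0.0, 2.0]]
N = cholesky(S)

hypercube = list(product((-1, 1), repeat=3))
exact_max = max(quadratic_form(S, X) for X in hypercube)

# Maximising |Y_j| on its own over the hypercube gives sum_i |N[i][j]|.
per_term_bound = sum(sum(abs(N[i][j]) for i in range(3)) ** 2 for j in range(3))
```

In this sketch the maximisers of the individual linear forms need not coincide, so `per_term_bound` is in general only an upper bound on `exact_max`; the two values can be compared directly for small examples.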

In this novel algorithm/approach for various classes of NP-hard problems (minimum

cut computation in an undirected graph, knapsack problem etc.), the complexity of the

algorithm is determined by

(a) Complexity of determination of Cholesky decomposition of a positive definite

symmetric matrix. Since there are various polynomial time procedures for the spectral

decomposition, computationally well studied efficient algorithms are available,

(b) Solving the linear programming problems related to optimization of linear forms

( maximization or minimization whichever leads to a larger value for the term)

over the hypercube. It is well known that there are polynomial time algorithms

for linear programming problems.

In some problems that arise in operations research, communication theory etc.,

constraint set is a convex polygon/polytope (convex hull of various finite structures

leading to convex sets bounded by hyperplanes) etc. and a quadratic/higher degree

form is optimized over the constraint set. Then, by means of Spectral/Cholesky type

decomposition of the positive definite symmetric linear operator (in one as well as

multidimensions), various linear programming problems are solved through efficient

polynomial time procedures. The computation of complexity of such procedures,

efficient algorithms for NP-hard problems in one and multi-dimension, are being

documented. When the connection matrix has other special structure efficient

algorithms are found.

In the general framework of a linear programming problem, the constraint set is a

convex polytope. By means of the decomposition principle, utilizing the hyperplanes

bounding the feasible set, the convex polytope is expressed as the union of finitely

many rotated, translated hypercubes. The linear objective function is converted into a

quadratic objective function (reversing the technique utilized in the above algorithm)

and the results from neural network theory are invoked to determine the local optima

of the objective function (union of stable states of various neural networks). Thus, unlike

the simplex algorithm, only a subset of the vertices of the feasible region that constitute

the stable states of neural networks is searched for determining the global optimum

for the linear program.

In the following, an alternative algorithm for any NP-hard problem is formally described.

This algorithm is designed/analyzed by the author for maximum likelihood decoding

of linear block codes (Rama 1). From (BrB), and Theorem (3.3), maximum likelihood

decoding of a received word , Y with respect to a graph-theoretic code is equivalent to

finding the maximum of the energy function E of a neural network defined by the graph G (the weights on the edges of G are given by $W = (-1)^{y_i}$), with all its threshold values equal to zero.

But, it is well known that the local optima of a quadratic form over the unit hypersphere occur at the eigenvectors of the associated symmetric matrix (eigentensors of the symmetric second order tensor), with the value of the quadratic form being the corresponding eigenvalue. Thus, the maximum eigenvector (the eigenvector corresponding to the largest eigenvalue) of the symmetric matrix maximizes the quadratic form over the hypersphere. The sign structure (sign of the components of the vector) of the maximum eigenvector is therefore utilized as the initial condition to run a neural network. Mathematically, let $X_0$ be the vector given by

$X_0 = \mathrm{Sign}(X_{max})$, where $X_{max}$ is the normalized maximum eigenvector, $X_0$ is the initial state in which the neural network starts, and A is the symmetric matrix.

The analysis of the hop-and-skip algorithm is provided below.

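$$X^T A X = (X - X_0 + X_0)^T A (X - X_0 + X_0) \tag{3.68}$$

$$= (X - X_0)^T A (X - X_0) + X_0^T A X_0 + 2 X_0^T A (X - X_0)$$

$$= \lambda_{max} + (X - X_0)^T A (X - X_0) + 2 X_0^T A (X - X_0)$$

$$= \lambda_{max} + (X - X_0)^T A (X - X_0) + 2 \lambda_{max} X_0^T (X - X_0)$$

$$= -\lambda_{max} + (X - X_0)^T A (X - X_0) + 2 \lambda_{max} X_0^T X \tag{3.69}$$

The steps use the relations $X_0^T A X_0 = \lambda_{max}$, $A X_0 = \lambda_{max} X_0$ and $X_0^T X_0 = 1$, which hold exactly for the normalized maximum eigenvector $X_{max}$.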

The above manipulations enable one to compare the value of the quadratic form on the

hypercube at any discrete time instant against the maximum value on the unit hypersphere.

The particular choice of initial condition minimizes the Hamming distance between the maximum eigenvector and the initial condition vector to run the neural network.

The set of eigenvectors of the connection matrix of neural network span the entire

space or a subspace of it. Similarly, the set of stable states/ stable vectors span the space

or a sub-space. To determine the maximum stable state, the essential idea of the above

approach is to find the vector closest to the maximum stable state and utilize it as the

initial condition to run the neural network. Detailed analysis of the algorithm is being

investigated.
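The initialization described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the maximum eigenvector is approximated by plain power iteration, the network is run in serial mode with zero thresholds, and the matrix `A` and all function names are hypothetical. It is not claimed to reproduce the author's analysis of the algorithm.

```python
import math

def power_iteration(A, steps=500):
    """Approximate the dominant eigenvector of a symmetric matrix A.

    Power iteration converges to the eigenvector of largest |eigenvalue|,
    which is the maximum eigenvector when A is positive definite.
    """
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(steps):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def sign(c):
    return 1 if c >= 0 else -1

def energy(A, X):
    """Quadratic form X^T A X."""
    n = len(A)
    return sum(X[i] * A[i][j] * X[j] for i in range(n) for j in range(n))

def run_serial(A, X0, max_sweeps=100):
    """Serial-mode updates X_i <- sign(sum_j A_ij X_j) until no component changes."""
    n = len(A)
    X = list(X0)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            new = sign(sum(A[i][j] * X[j] for j in range(n)))
            if new != X[i]:
                X[i], changed = new, True
        if not changed:
            break
    return X

# Hypothetical symmetric positive definite connection matrix.
A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]

x_max = power_iteration(A)          # approximately (0.5, -0.707, 0.5)
X0 = [sign(c) for c in x_max]       # sign structure of the maximum eigenvector
stable = run_serial(A, X0)
```

For this small example the sign vector of the maximum eigenvector is already a stable state, so the network does not move away from its initial condition.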

Dynamic Optimization

In (Rama 3), certain multidimensional system, in discrete/ continuous time is described

by the following state space representation through tensors:

Discrete Time:

$$X(n+1) = A(n) \otimes X(n) + B(n) \otimes U(n),$$

$$Y(n) = C(n) \otimes X(n) + D(n) \otimes U(n). \tag{3.70}$$

Continuous Time:

$$\dot{X}(t) = A(t) \otimes X(t) + B(t) \otimes U(t),$$

$$Y(t) = C(t) \otimes X(t) + D(t) \otimes U(t). \tag{3.71}$$

where ⊗ denotes the inner product between compatible tensors in the system description in continuous/discrete time. Utilizing this state space representation, the author formalized a unified theory of control, communication and computation in multi/infinite dimensional systems, first discovered in (Rama 1) for one dimensional systems. This theory enabled the author to develop a highly advanced version of the theory of evolution of life from organic matter. In this theory the author reasons that various body organs and functions of living systems have evolved over time and that biological systems are organic/inorganic matter based dynamical systems.
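One step of the discrete-time tensor state space recursion can be sketched in Python. The sketch assumes that the state, input and output are order-2 tensors, that the coefficient tensors are time-invariant order-4 tensors, and that ⊗ contracts the last two indices of the coefficient tensor with the two indices of the state; these choices and all names are assumptions made for illustration.

```python
def contract(T, X):
    """Inner product of an order-4 tensor T with an order-2 tensor X:
    (T (x) X)[i][j] = sum over k, l of T[i][j][k][l] * X[k][l]."""
    return [[sum(T[i][j][k][l] * X[k][l]
                 for k in range(len(X)) for l in range(len(X[0])))
             for j in range(len(T[0]))]
            for i in range(len(T))]

def add(P, Q):
    """Entry-wise sum of two order-2 tensors of the same shape."""
    return [[p + q for p, q in zip(row_p, row_q)] for row_p, row_q in zip(P, Q)]

def step(A, B, C, D, X, U):
    """One step of X(n+1) = A (x) X(n) + B (x) U(n), Y(n) = C (x) X(n) + D (x) U(n)."""
    X_next = add(contract(A, X), contract(B, U))
    Y = add(contract(C, X), contract(D, U))
    return X_next, Y

def identity4(p, q, scale=1):
    """Order-4 tensor acting as scale * identity on p-by-q order-2 tensors."""
    return [[[[scale if (i, j) == (k, l) else 0 for l in range(q)]
              for k in range(p)] for j in range(q)] for i in range(p)]

# Hypothetical system: the state doubles each step and the input is added to it.
A = identity4(2, 2, 2)
B = identity4(2, 2, 1)
C = identity4(2, 2, 1)
D = identity4(2, 2, 0)

X0 = [[1, 0], [0, 1]]
U0 = [[1, 1], [1, 1]]
X1, Y0 = step(A, B, C, D, X0, U0)
```

When the state is an order-1 tensor and the coefficients are order-2 tensors, the same recursion reduces to the usual matrix state space equations.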

3.8 CONCLUSIONS

Tensor linear spaces over finite fields are utilized to describe and study the structure/

properties of multi/infinite dimensional linear codes. The three concepts: multidimensional

neural/generalized neural networks, multidimensional codes, multivariate polynomial

(terms/monomials being expressed in terms of the components of generator, other tensors)

optimization over various subsets of lattice, are related.

It is shown that (a) the problem of maximum likelihood decoding of error correcting

codes (multidimensional), (b) finding the global maximum of the energy function of neural/

generalized neural networks, and (c) solving integer/non-linear programming problems

in multidimensions are related. The equivalence is proved for binary as well as non-binary

cases. This equivalence naturally suggests applying the solvable cases of one problem to the equivalent problem and vice versa. Fully exploiting the equivalence leads to various new results (Rama 6).

The programming problem of multidimensional neural networks is solved. Several

new heuristic procedures for NP-hard problems in multidimensions are suggested from

the equivalence. The decoding techniques of various (multidimensional extensions of one

dimensional codes) codes are utilized to find approximate solutions of NP-hard problems.

Various innovative results in static optimization are described. Infinite dimensional

generalization of the results is briefly described.

REFERENCES

(Ara) B. Arazi, “Common Sense Approach to the Theory of Error Correcting Codes,” MIT Press book.

(BoT) A.I. Borisenko and I.E. Tarapov, “Vector and Tensor Analysis with Applications,” Dover Publications Inc., New York, 1968.

(BrB) J. Bruck and M. Blaum, “Neural Networks, Error Correcting Codes and Polynomials Over the Binary Hypercube,” IEEE Transactions on Information Theory, Vol. 35, No. 5, September 1989.

(Gaal) Gaal, “Group Theory,” Academic Press, 1982.

(Gol) I. Goldberg, “Matrix Polynomials,” Academic Press, 1972.

(Rama 1) Garimella Rama Murthy, “Unified Theory of Control, Communication and Computation—Part-1,” Manuscript to be submitted to the IEEE Proceedings.

(Rama 2) Garimella Rama Murthy, “Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory, Logic Synthesis,” International Journal of Neural Systems, Vol. 15, No. 3, pp. 223-235, 2005.

(Rama 3) G. Rama Murthy, “Optimal Control, Codeword, Logic Function Tensors: Multidimensional Neural Networks,” International Journal of Systemics, Cybernetics and Informatics, October 2006, pp. 9-17. See also Chapter 4.

(Rama 4) Garimella Rama Murthy, “Multi/Infinite Dimensional Logic Synthesis,” Manuscript to be submitted to the IEEE Transactions on Computers.

(Rama 5) Garimella Rama Murthy, “Signal Design for Magnetic and Optical Recording Channels,” Bellcore Technical Memorandum, TM-NWT-018026.

(Rama 6) Garimella Rama Murthy, “Tensor Variate Polynomials/Power Series, Tensor based Functions, Tensor Algebraic Geometry: Optimization,” Manuscript to be submitted to the Transactions of American Mathematical Society.

(Rama 10) Garimella Rama Murthy, “Unified Theory of Control, Communication and Computation: Dynamical Systems,” Manuscript in preparation.

(Rama 11) Garimella Rama Murthy, “Transient and Equilibrium Analysis of Computer Networks: Finite Memory and Matrix Geometric Recursions,” Ph.D. Thesis, Purdue University, West Lafayette, Indiana.

(Rama 12) Garimella Rama Murthy, “Origin of Universe: Living/Non-Living: Grand-unification Theory of Universe,” Manuscript in preparation.

Tensor State Space Representation: Multidimensional Systems 61

CHAPTER 4

Tensor State Space Representation: Multidimensional Systems

4.1 INTRODUCTION

With the efforts of researchers in electrical engineering, linear system theory started with

abstract models of arbitrary linear systems through forced/unforced nth order difference

equations in discrete time and differential equations in continuous time. Such representations

are called the input-output representations of the linear system. These arbitrary system (electrical,

mechanical, chemical, hybrid systems) evolution equations were then converted into first

order differential/difference equations in state, control, input, output vectors through state,

input, output coupling matrices. Such a representation is called the state space representation.

The state space equations take the following form (Gop)

Discrete Time Systems:

$$X(n+1) = A(n) X(n) + B(n) U(n),$$

$$Y(n) = C(n) X(n) + D(n) U(n),$$

Continuous Time Systems:

$$\dot{X}(t) = A(t) X(t) + B(t) U(t),$$

$$Y(t) = C(t) X(t) + D(t) U(t) \tag{4.1}$$

where {A(n), B(n), C(n), D(n)} as well as {A(t), B(t), C(t), D(t)} are matrices of compatible

dimensions.

Thus, in the design, analysis and synthesis of linear systems, linear algebra techniques were extensively utilized. Various input-output representation related concepts such as impulse response and system function were shown to be derivable from the state space description. Also, new concepts such as controllability and observability were studied in terms of the state space representation. Thus, the state space representation of linear systems proved to be a far better description of arbitrary systems.

studied. Various system theorists tried to extend one dimensional system theory to

multidimensions utilizing the ideas of local state and local control. For instance, consider

a typical discrete time, two dimensional system. The evolution of a prototype linear model

is described by the state updating equation

$$X(h+1, k+1) = A_1 X(h, k+1) + A_2 X(h+1, k) + B_1 U(h, k+1) + B_2 U(h+1, k)$$

where $X(h, k) \in R^n$ and $U(h, k) \in R^m$ are the local state and local input value at (h, k), and $A_1, A_2, B_1, B_2$ are real matrices of suitable dimensions. This type of approach based

on local state and local control was utilized in association with partial differential equation

based continuous time linear multidimensional systems. These representations of continuous

time as well as discrete time multidimensional systems required considerable amount of

ingenuity, careful tracking of the indices, in designing and analyzing such systems. To a

certain degree, this notation impeded further progress in multidimensional system theory.

With this type of approach/notation, modeling, design and analysis of certain linear/non-

linear, multi/infinite dimensional systems was a complicated task.
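The local-state updating equation above can be simulated on a finite grid with a short Python sketch. Zero boundary states X(h, 0) and X(0, k), the scalar example system and the function names are assumptions made for illustration; the index bookkeeping in the loop is exactly the kind of tracking the text refers to.

```python
def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vadd(*vectors):
    """Entry-wise sum of several vectors of equal length."""
    return [sum(parts) for parts in zip(*vectors)]

def simulate_2d(A1, A2, B1, B2, U, H, K, n):
    """Fill the grid of local states X[h][k] for 0 <= h <= H, 0 <= k <= K.

    Implements X(h+1,k+1) = A1 X(h,k+1) + A2 X(h+1,k) + B1 U(h,k+1) + B2 U(h+1,k).
    Boundary states X[h][0] and X[0][k] are assumed to be zero.
    """
    X = [[[0] * n for _ in range(K + 1)] for _ in range(H + 1)]
    for h in range(H):
        for k in range(K):
            X[h + 1][k + 1] = vadd(
                matvec(A1, X[h][k + 1]),
                matvec(A2, X[h + 1][k]),
                matvec(B1, U[h][k + 1]),
                matvec(B2, U[h + 1][k]),
            )
    return X

# Hypothetical scalar system (n = m = 1) driven by a single unit input at (0, 1).
H = K = 2
U = [[[1] if (h, k) == (0, 1) else [0] for k in range(K + 1)] for h in range(H + 1)]
X = simulate_2d([[1]], [[1]], [[1]], [[0]], U, H, K, n=1)
```

Each local state depends on its neighbours to the left and below, so the grid must be filled in an order that respects the quarter-plane dependence.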

The author for the first time realized that, for the evolution of CERTAIN

multidimensional linear systems, tensor linear operator based state space description is

necessary as well as helpful. This mathematically formal tensor state space representation was an important contribution to further progress in multi/infinite dimensional system theory (linear/non-linear dynamical systems). Also, the author, after carefully observing

various multi/infinite dimensional systems (explicitly stated as a static or dynamical system

or when a proper abstraction is made the multidimensional nature of problem/

phenomenon becomes apparent) such as those that arise in multi/infinite dimensional

neural networks (Rama 2), databases ( utilizing multiple attribute tree etc. ), multi/infinite

dimensional coding theory (Rama 3), proposed the utilization of tensors ( of order,

dimension finite/infinite ) as the linear operators in the design, analysis and synthesis of

such systems. This idea is already utilized in some applications. It should be noted that in

the analysis of some systems defined over finite fields and other discrete structures,

utilization of tensors considerably simplifies the analysis.

In the case of multidimensional systems, there is no natural notion of causality. Various

types of causality ( quarter-plane causality, half-plane causality) are artificially imposed

by different choices of neighbourhood sets. With such an approach (for all multidimensional

systems), it is very difficult to study controllability, observability and stability. The author

realized that for certain multidimensional systems, utilization of tensor linear operators

to represent the state, control, input, output variables, is very convenient (from the point

of view of design and analysis of such systems) (Rama 1).

multidimensional system theory are summarized. It is also described how the utilization

of tensor linear operator associated with multidimensional linear spaces provides a new

approach for formulating as well as solving the problems related to static as well as

dynamical systems (defined over multidimensional linear space). In section 3, state space

representation of certain multi/infinite dimensional linear systems utilizing the tensor

linear operator is formally described. In section 4, it is illustrated how the utilization of

tensor based state space representation enables one to translate the results from one

dimensional systems to certain multidimensional systems. Various generic principles of

how to translate the results from one dimensional system theory to multi/infinite

dimensional system theory are provided. In section 5, multi/infinite dimensional time

series analysis models are described. In section 6, utilizing the concepts of local state, local

input, local control in the multi/infinite dimensional state space, various state space

representations for multi/infinite dimensional distributed systems are formally described.

These state space representations enable one to translate the results developed for conventional multi/infinite dimensional systems to those described through the tensor state space representation. The chapter ends with concluding remarks.

SYSTEM THEORY: REPRESENTATION BY TENSOR LINEAR OPERATOR

One of the main tools in the design and analysis of one dimensional linear dynamic systems

as well as static systems is linear algebra. Motivated by practical applications in image

processing and other fields, system theorists proposed various input-output models for

two/multidimensional systems. Models which exhibit quarter plane causality were initially investigated from the input-output point of view (BiF) in the framework of two dimensional filter theory, where two dimensional filters are represented by proper rational functions in two indeterminates of the following type:

$$W(Z_1, Z_2) = \frac{\sum_{i+j \ge 1} n_{ij}\, Z_1^{i} Z_2^{j}}{1 + \sum_{i+j \ge 1} d_{ij}\, Z_1^{i} Z_2^{j}} \tag{4.2}$$

The idea of associating two dimensional state space models with two dimensional filters arose very naturally. However, from the beginning it appeared that the canonical technique based on the Nerode equivalence leads to an infinite dimensional state space. The reason was the use of a matrix as the linear operator describing the state dynamics. So, following some heuristic procedures, several finite dimensional models have been introduced (BiF), where two notions of state play different roles:

1. local states: X(h,k) belong to a finite dimensional vector space. They enter in the

state updating equation and determine the value of the output.


set of Z × Z. These belong to an infinite dimensional vector space (in one

independent dimension), which provides an extension of the space of Nerode

equivalence classes. The most common state space model with quarter plane

causality is represented by the following equation.

$$X(h+1, k+1) = A_1 X(h, k+1) + A_2 X(h+1, k) + B_1 u(h, k+1) + B_2 u(h+1, k)$$

$$y(h, k) = C X(h, k) \tag{4.3}$$

where X(h, k) ∈ R^n, u(h, k) ∈ R^m, y(h, k) ∈ R^p are the values of the local state, the input and the output at (h, k) ∈ Z × Z. Since the local state at (h+1, k+1) is computed by solving a first order difference equation, the system (4.3), denoted by Σ1 = (A1, A2, B1, B2, C), is named a first order system.
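The first order recursion above can be sketched numerically. The following is a minimal illustration, not the author's implementation: the function name `simulate_first_order`, the zero boundary states x(h, 0) = x(0, k) = 0, and the output read-out y = C x are assumptions made for the example.

```python
import numpy as np

def simulate_first_order(A1, A2, B1, B2, C, u, H, K):
    """Iterate x(h+1,k+1) = A1 x(h,k+1) + A2 x(h+1,k) + B1 u(h,k+1) + B2 u(h+1,k).

    Assumptions for this sketch: zero boundary states x(h,0) = x(0,k) = 0,
    input array u of shape (H+1, K+1, m), output y(h,k) = C x(h,k).
    """
    n = A1.shape[0]
    x = np.zeros((H + 1, K + 1, n))
    for h in range(H):
        for k in range(K):
            x[h + 1, k + 1] = (A1 @ x[h, k + 1] + A2 @ x[h + 1, k]
                               + B1 @ u[h, k + 1] + B2 @ u[h + 1, k])
    y = x @ C.T  # apply C to the local state at every grid point
    return x, y

# Scalar example: a unit impulse at (0, 1) spreads over the quarter plane.
one = np.array([[1.0]])
u = np.zeros((4, 4, 1))
u[0, 1, 0] = 1.0
x, y = simulate_first_order(one, one, one, np.array([[0.0]]), one, u, 3, 3)
```

The double loop visits grid points in an order that respects quarter plane causality: each local state depends only on its left and lower neighbours, which have already been computed.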

The above model has been extensively studied in its general form and under some conditions/constraints on the system matrices. The most popular special case of (4.3) is Roesser’s model, where the local state space X is assumed to be the direct sum of two vector spaces Xh and Xv, and the matrices of the model are constrained to have the following partitioned form:

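$$A_1 = \begin{bmatrix} A_{11}^{(1)} & A_{12}^{(1)} \\ 0 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 0 \\ A_{21}^{(2)} & A_{22}^{(2)} \end{bmatrix}, \quad B_1 = \begin{bmatrix} B_{1}^{(1)} \\ 0 \end{bmatrix}, \quad B_2 = \begin{bmatrix} 0 \\ B_{2}^{(2)} \end{bmatrix} \tag{4.4}$$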

Second order models are less frequently used: the typical structure of their equation is

given by

$$X(h+1, k+1) = A_1 X(h, k+1) + A_2 X(h+1, k) + A_0 X(h, k) + B\,U(h, k)$$

$$Y(h, k) = C\,X(h, k) \tag{4.5}$$

In Attasi’s model, A1 and A2 are commuting matrices and A1 A2 = –A0. It realizes separable filters only and constitutes an interesting second order model, as the underlying theory is very close to the one dimensional theory (BiF).
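The separability of Attasi's model can be checked numerically. In the sketch below, the matrices, the zero boundary states and the helper names are hypothetical choices made for illustration. With a unit impulse at the origin, the simulated local state is compared with the product form A1^(h-1) A2^(k-1) B, which follows from the recursion when A1 and A2 commute and A0 = –A1 A2.

```python
import numpy as np

def simulate_second_order(A1, A2, A0, B, u, H, K):
    """Iterate x(h+1,k+1) = A1 x(h,k+1) + A2 x(h+1,k) + A0 x(h,k) + B u(h,k).

    Zero boundary states x(h,0) = x(0,k) = 0 are assumed for this sketch.
    """
    n = A1.shape[0]
    x = np.zeros((H + 1, K + 1, n))
    for h in range(H):
        for k in range(K):
            x[h + 1, k + 1] = (A1 @ x[h, k + 1] + A2 @ x[h + 1, k]
                               + A0 @ x[h, k] + B @ u[h, k])
    return x

# Hypothetical commuting pair: both matrices are polynomials in the same M.
M = np.array([[0.5, 0.2], [0.0, 0.3]])
A1 = 0.5 * M
A2 = 0.3 * M + 0.1 * np.eye(2)
A0 = -A1 @ A2                      # Attasi's constraint
B = np.array([[1.0], [1.0]])

H = K = 4
u = np.zeros((H + 1, K + 1, 1))
u[0, 0, 0] = 1.0                   # unit impulse at the origin
x = simulate_second_order(A1, A2, A0, B, u, H, K)

def separable_state(h, k):
    # Product form expected for the impulse response of the local state
    return (np.linalg.matrix_power(A1, h - 1)
            @ np.linalg.matrix_power(A2, k - 1) @ B[:, 0])

max_error = max(np.abs(x[h, k] - separable_state(h, k)).max()
                for h in range(1, H + 1) for k in range(1, K + 1))
```

The horizontal and vertical dependence factor into separate matrix powers, which is the sense in which the model stays close to the one dimensional theory.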

Recently, the behavior approach has been extended to two dimensional systems. Following this theory, a two dimensional system is defined by a family β of admissible functions (the behavior), defined over the discrete plane. These functions are characterized by the property of belonging to the kernel of a polynomial matrix M(Z1, Z2) in two variables:

$$\beta = \Big\{ \omega = \sum_{i,j \in \mathbb{Z}} w_{ij}\, z_1^{i} z_2^{j} \;:\; M\omega = 0 \Big\} \tag{4.6}$$

Associated with the external description provided by the behavior, different internal representations can be given by introducing the so called latent variable models. State variable models constitute a particular type of latent variable model, whose variables hold the memory of the system with respect to the notion of past introduced on Z × Z. When a state description is possible, i.e. when the notions of past, present and future are allowed by the structure of β, the behavior is called Markovian. Since there is no natural direction for the evolution in Z × Z, the Markovian property appears more general than the familiar quarter plane causality and has been exploited in the analysis of non-causal two dimensional dynamics.

Also, various static systems that involve simple linear transformations in the multidimensional space were previously abstracted utilizing the matrix linear operator. Such systems arise in practical applications such as databases (modeling storage of multiple attribute trees), computerized tomography etc. The techniques developed for the design and analysis of such systems were thus very elementary.

The above efforts in two/multidimensional system theory primarily utilized the matrix linear operator on an n-dimensional (in one independent variable) vector space. System theorists did not realize that utilization of the tensor linear operator (in multidimensions) could lead to the design and analysis of a large class of multidimensional systems.

In the following areas, utilization of the tensor linear operator to describe the multi/infinite dimensional state space enables one to formulate new problems, introduce new concepts, and derive new results/theorems. Some of the areas of interest where such an idea could be utilized are

(1) Multi/Infinite dimensional computation theory,

(2) Multi/Infinite dimensional information/communication/coding theory,

(3) Multi/Infinite dimensional rate distortion theory,

(4) Multi/Infinite dimensional stochastic systems—Theory of Markov random fields,

(5) Multi/Infinite dimensional time series analysis,

(6) Multi/Infinite dimensional digital signal processing,

(7) Theory of Multi/Infinite dimensional connectionist structures—graphoids,

(8) Theory of databases utilizing multidimensional storage,

(9) Matroid theory,

(10) Multi/Infinite dimensional Game theory.

By the utilization of the idea of capturing a multidimensional state space through a

tensor linear operator, new research problems can be formulated and solved.

DIMENSIONAL DYNAMICAL SYSTEMS: TENSOR LINEAR OPERATOR

A multidimensional system transforms an m-dimensional tensor (array) of order r into a k-dimensional tensor of order s. In the following, some confusion that arises in the terms utilized is cleared up.

Remark: Notation

In the tensor notation, the word “dimension of a tensor” stands for the number of

values each independent variable assumes, whereas the word, “order” represents the


system seems incorrect compared to “multi-order” system. But, this is a matter of notation.

To stick with familiar jargon, in the following the author utilizes the term “multidimensional

systems”. From the context, the reader should be able to ascertain the usage of words,

“order”, “dimension”.

Infinite dimensional systems lead to further confusion. If each independent variable assumes infinitely many values (in contrast to the finitely many values assumed by each independent variable in a multidimensional system) and there are only finitely many independent variables, the system description utilizes infinite dimensional tensors of finite order for the state, input and output variables. But if the number of independent variables is also infinite, the dimension as well as the order of the tensors utilized in the representation of the variables is infinite.

It should be noted that, in the case of discrete time systems, each component of the

tensor input, output, state variables is a function of a discrete time index. But, in the case of

continuous time systems, each component of the tensor input, output, state variables is a

function of the continuous time index. Also, in the case of time varying systems, the

transformation is a function of the index (discrete or continuous), whereas in the case of

time-invariant systems, the transformation is independent of the index.

Definition

A dynamical system is linear if and only if, given any two points (scalar, vector, tensor variables) in the input space, say U1 and U2, and given any two scalar (real or complex) constants C1 and C2, the following property is satisfied by the transformation L describing the dynamical system:

$$L(C_1 U_1 + C_2 U_2) = C_1 L(U_1) + C_2 L(U_2); \quad C_1, C_2 \in \mathbb{C} \text{ or } \mathbb{R} \text{ or any field} \tag{4.7}$$

If the above property is violated by the dynamical system, we call it a non-linear system.
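The linearity property can be verified numerically for a transformation given by a tensor contraction. The operator tensor `T`, the sizes and the constants below are hypothetical values chosen for illustration; the contraction is carried out with NumPy's `tensordot`.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r, s = 3, 2, 2
T = rng.standard_normal((m,) * (s + r))   # hypothetical operator tensor of order s + r

def L(U):
    # Contract the last r indices of T with the r indices of the input tensor U
    return np.tensordot(T, U, axes=r)

def N(U):
    # A non-linear map for comparison: elementwise square of the contraction
    return L(U) ** 2

U1 = rng.standard_normal((m,) * r)
U2 = rng.standard_normal((m,) * r)
c1, c2 = 2.0, -0.5 + 1.0j                  # one real and one complex constant

lhs = L(c1 * U1 + c2 * U2)
rhs = c1 * L(U1) + c2 * L(U2)
is_linear = np.allclose(lhs, rhs)

nonlinear_is_linear = np.allclose(N(c1 * U1 + c2 * U2), c1 * N(U1) + c2 * N(U2))
```

The contraction passes the superposition check, while the squared map does not, so the latter would be called a non-linear system in the sense of the definition.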

Conventionally, in multidimensional (multi-order may be more appropriate, but is not utilized by the author) system theory, the evolution of a discrete time dynamical system (an example is provided in section 2) is described by means of local state, local control, local input and local output variables. This is very cumbersome. In the case of certain multidimensional systems, the state space representation by means of tensors (described below) enables one to compactly capture a higher order difference equation through tensor notation.

In order to describe the tensor state space representation, the following concepts/ideas

from tensor analysis are explained.

Tensor Function of a Scalar Argument

It is a rule assigning a unique value of a tensor to each admissible value of a scalar t

(BoT). The variable t can be a discrete index assuming countably many values or a


write

$$A_{i_1, i_2, \dots, i_n} = A_{i_1, i_2, \dots, i_n}(t) \tag{4.8}$$

For instance, the state of stress of an elastic medium varies in time. Then, the stress

tensor becomes a function of time i.e.

$$P_{ik} = P_{ik}(t) \tag{4.9}$$

By the derivative of the function (4.8) with respect to time/index, we mean the tensor

with the components,

$$\frac{d A_{i_1, i_2, \dots, i_n}(t)}{dt} = \lim_{\Delta t \to 0} \frac{A_{i_1, i_2, \dots, i_n}(t + \Delta t) - A_{i_1, i_2, \dots, i_n}(t)}{\Delta t} \tag{4.10}$$

calculated in a coordinate system which does not vary in time. The derivative is clearly of

the same order as the tensor itself.

With the above notation from tensor analysis, certain multi/infinite dimensional discrete time/index dynamical systems can be described by means of a state space description of the following form:

Discrete Time Systems:

$$X_{(i_1,\dots,i_r)}(n+1) = A_{(i_1,\dots,i_r;\,j_1,\dots,j_r)}(n)\, X_{(j_1,\dots,j_r)}(n) + B_{(i_1,\dots,i_r;\,j_1,\dots,j_p)}(n)\, U_{(j_1,\dots,j_p)}(n),$$

$$Y_{(l_1,\dots,l_s)}(n) = C_{(l_1,\dots,l_s;\,j_1,\dots,j_r)}(n)\, X_{(j_1,\dots,j_r)}(n) + D_{(l_1,\dots,l_s;\,j_1,\dots,j_p)}(n)\, U_{(j_1,\dots,j_p)}(n). \tag{4.11}$$

where A(n) is an m dimensional tensor of order 2r (called the state coupling tensor), X(n) is the state of the dynamical system at the discrete time index n, and X(n+1) is the state of the system at the discrete time index n+1. Furthermore, B(n) is an m dimensional tensor of order r+p (called the input coupling tensor), and Y(n) is an output tensor of dimension m and order s. U(n) is an m dimensional input tensor of order p (varying with the discrete time index), C(n) (called the state coupling tensor to the output dynamics) is an m dimensional tensor of order s+r, and D(n) is the input coupling tensor to the output dynamics, of dimension m and order s+p.

In the above state space description of a certain type of multidimensional discrete time dynamical system, there are r dimension variables which are inherently discrete. The evolution of the system (changes in the system parameters) occurs at discrete time instants. The notation for the index set in the state equations requires some explanation. Since the state tensor is an m-dimensional tensor of order r, it has m^r components. When the system evolves, it transits through tensors in the state space.
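One step of the discrete time tensor state space equations can be sketched with NumPy, reading the repeated indices j as summed over (an assumption consistent with the inner product used later in the chapter). The function name `tssr_step`, the sizes and the random tensors are illustrative only. The sketch also flattens the tensors to show that the update agrees with an ordinary matrix-vector state equation on m^r components.

```python
import numpy as np

def tssr_step(A, B, C, D, X, U):
    """One step of X(n+1) = A X(n) + B U(n), Y(n) = C X(n) + D U(n),
    with the trailing indices of each coupling tensor contracted away."""
    r, p = X.ndim, U.ndim
    X_next = np.tensordot(A, X, axes=r) + np.tensordot(B, U, axes=p)
    Y = np.tensordot(C, X, axes=r) + np.tensordot(D, U, axes=p)
    return X_next, Y

rng = np.random.default_rng(1)
m, r, p, s = 2, 2, 1, 1
A = rng.standard_normal((m,) * (2 * r))   # state coupling tensor, order 2r
B = rng.standard_normal((m,) * (r + p))   # input coupling tensor, order r + p
C = rng.standard_normal((m,) * (s + r))   # order s + r
D = rng.standard_normal((m,) * (s + p))   # order s + p
X = rng.standard_normal((m,) * r)         # state tensor with m**r = 4 components
U = rng.standard_normal((m,) * p)

X_next, Y = tssr_step(A, B, C, D, X, U)

# Equivalent flattened form: group the i-indices into rows, the j-indices into columns.
A_mat = A.reshape(m**r, m**r)
B_mat = B.reshape(m**r, m**p)
x_next_flat = A_mat @ X.reshape(-1) + B_mat @ U.reshape(-1)
```

The flattening is only a computational convenience; the tensor form keeps the index structure of the state explicit.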

With the summary of tensor functions of scalar argument provided above, the dynamics

of certain type of multi/infinite dimensional continuous time/index systems is described

by the following state space description:


$$\dot{X}_{(i_1,\dots,i_r)}(t) = A_{(i_1,\dots,i_r;\,j_1,\dots,j_r)}(t)\, X_{(j_1,\dots,j_r)}(t) + B_{(i_1,\dots,i_r;\,j_1,\dots,j_p)}(t)\, U_{(j_1,\dots,j_p)}(t),$$

$$Y_{(l_1,\dots,l_s)}(t) = C_{(l_1,\dots,l_s;\,j_1,\dots,j_r)}(t)\, X_{(j_1,\dots,j_r)}(t) + D_{(l_1,\dots,l_s;\,j_1,\dots,j_p)}(t)\, U_{(j_1,\dots,j_p)}(t). \tag{4.12}$$

where A(t) is an m dimensional tensor of order 2r (called the state coupling tensor to the state dynamics), X(t) is the state of the dynamical system at the continuous time/index t, and Ẋ(t) is the derivative of the state of the system. Furthermore, B(t) is the m dimensional tensor of order r+p (called the input coupling tensor to the state dynamics), and Y(t) is the output tensor of dimension m and order s. Also, U(t) is an input tensor of dimension m and order p, C(t) is an m dimensional tensor of order s+r, and D(t) is the input coupling tensor to the output, of dimension m and order s+p. It should be noted that the state space descriptions provided above for certain continuous/discrete index systems hold true even for certain infinite dimensional systems. In the case of infinite dimensional systems, the tensors utilized in the state space descriptions are of infinite dimension, infinite order, or both. Now, the above tensor state space representations are contrasted with the conventional approaches to the representation of certain multidimensional systems.
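The continuous time state equation can be integrated numerically in tensor form. The sketch below uses a classical fourth order Runge-Kutta scheme, which is a standard numerical method and not something prescribed by the text, and it assumes constant coupling tensors and a constant input. The test case A = –I (the identity tensor of order 2r, negated) is chosen because its solution X(t) = exp(–t) X(0) is known in closed form.

```python
import numpy as np

def tssr_derivative(A, B, X, U):
    # Right-hand side of the continuous time state equation
    return np.tensordot(A, X, axes=X.ndim) + np.tensordot(B, U, axes=U.ndim)

def rk4_integrate(A, B, X0, U, t_end, steps):
    """Classical RK4 for constant A, B and constant input U (assumptions of this sketch)."""
    dt = t_end / steps
    X = X0.copy()
    for _ in range(steps):
        k1 = tssr_derivative(A, B, X, U)
        k2 = tssr_derivative(A, B, X + 0.5 * dt * k1, U)
        k3 = tssr_derivative(A, B, X + 0.5 * dt * k2, U)
        k4 = tssr_derivative(A, B, X + dt * k3, U)
        X = X + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return X

m, r = 2, 2
identity = np.eye(m**r).reshape((m,) * (2 * r))   # identity tensor of order 2r
A = -identity
B = np.zeros((m,) * (r + 1))                      # no input coupling in this example
X0 = np.arange(1.0, 5.0).reshape(m, m)
U = np.zeros(m)

X1 = rk4_integrate(A, B, X0, U, t_end=1.0, steps=100)
```

Only one scalar index t is advanced, while the state remains a tensor throughout, so no partial differential equation in several independent variables is involved.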

It is reasoned that the Tensor State Space Representation is an important leap in multi/infinite dimensional system theory. Another objective is to remove the confusion in the mind of the reader who has read the classical literature in multi/infinite dimensional system theory with matrix linear operator notation. The primary source of confusion lies not so much in discrete time/index multidimensional systems as in continuous time/index multidimensional systems.

Conventional Multidimensional System State Space Representation versus Modern Tensor State

Space Representation:

In section 2 as well as section 3, the limitations of the way system theorists tried to represent and analyze two/multidimensional discrete time/index systems are discussed. Also, the advantages of the tensor state space representation (of a certain large class of multi/infinite dimensional systems) discovered and formalized by the author are described. In the conventional mode of thinking, the system is represented by means of multiple independent variables, and the local state/local control are coupled to the system dynamics by means of matrices. The transition from this mode to the modern version, where tensor notation is utilized, requires the realization that the linear space utilized in multidimensions is captured through the tensor, and that the system dynamics, when described in discrete time, require a discrete variable.

The continuous index case requires more imagination to understand the transition from the conventional approaches to the modern approach. In the conventional multidimensional system representation, partial differential equations are utilized to describe the input-output behavior as well as the state (internal description) dynamics. In the conventional approaches, multiple independent variables are tracked through separate indices, leading to partial differential equations. But the utilization of the tensor linear operator and the tensor function of a scalar argument enables one to describe the dynamics of the tensor state variable as a function of one continuous time/index variable. Thus, the discrete as well as continuous multi/infinite dimensional system state space representation utilizing tensors resembles the familiar one dimensional system state space description.

The above tensor state space description reduces to the one dimensional case when the order of the tensors is one. Thus, various results developed on one dimensional linear spaces for one dimensional linear systems are readily translated to certain multi/infinite dimensional systems described through tensor linear spaces (with some care taken in pathological cases as well as when the problem being solved depends heavily on the neighborhood set).

DYNAMICAL SYSTEMS – STATE SPACE REPRESENTATION BY

TENSOR LINEAR OPERATORS

The state space representation of one dimensional linear systems resembles that in (4.11), (4.12). In fact, one dimensional linear systems are a very special case of certain multi/infinite dimensional systems described through (4.11), (4.12). A natural question that

arises is whether it is possible to transfer the results from one dimensional systems to

certain multidimensional systems described through (4.11), (4.12). It is explained in the

following that it is possible to do such a translation provided some care is taken in deriving

the results for certain class of multi/infinite dimensional systems. Some principles which

can be utilized as a guideline in deriving the results for multi/infinite dimensional systems

are provided below:

(1) In the case of one dimensional systems utilizing the state space representation of a linear system, if a result is derived on the system response (invoking the standard theorems in the theory of ordinary difference/differential equations), that result has a corresponding version for multi/infinite dimensional systems when the inner and outer products between the state, input and output vectors and the matrices appearing in the state space descriptions are replaced by those between compatible tensors in multi/infinite dimensions. One must exercise care in making sure that the tensor products make sense.

(2) The tensor state space representation (rather than vectors and matrices in one

dimensional case) enables one to translate the results on controllability,

observability, stability from one dimensional linear space based dynamical systems

to certain multidimensional linear space based dynamical systems. The tensor

state space representation enables one to translate various problems for one

dimensional systems, in a one to one manner to certain multi/infinite dimensional

systems. These problems are defined utilizing the state space structure to be

linear (linear spaces in one/multi/infinite dimensions). In translating the solution


product between vector-matrix variables are replaced by those between tensor-

tensor variables. Care should be taken to ensure that the problem statement in

multidimensions doesn’t utilize the neighbourhood structure.

(3) The multi/infinite dimensional state space structure is such that there is no notion of causality. From the 1970s, system theorists, electrical engineers and computer scientists developed various notions such as quarter plane causality, half plane causality and other types of causality (to introduce some form of ordering on the two/multidimensional state space) for providing an input-output description. But the state space representation through tensors (of certain multidimensional systems) enables one to get the associated input-output description as a special case (for such systems). Thus, various problems in image processing, database theory and the theory of random fields are reformulated utilizing the tensor state space description and solved in this context. When these problems have a multi/infinite dimensional state space structure (implicitly or explicitly specified) embedded in the statement, they are considered to be solved by utilizing the tensor linear operator (or the theory of tensor linear spaces) and the results in this chapter. The systems in which problems are formulated can be static or dynamic.

It should be noted that various problems in different scientific disciplines (as listed in section 2) which are based on a multi/infinite dimensional description are affected by the tensor state space description for linear dynamical systems. Even for static systems where the state space structure is a multidimensional linear space, the tensor linear operator and tensor algebra techniques provide convenient tools for the formulation as well as the solution of such problems.

The above generic principles are easily illustrated with the typical problem of response

determination of certain multi/infinite dimensional linear systems (whose dynamics are

captured through the Tensor State Space Representation ). Details are avoided for brevity.

In the following, multi/infinite dimensional versions of time-series models are discussed.

They are the multi/infinite dimensional versions of Auto-Regressive (AR), Auto-Regressive

Moving Average (ARMA) models. The models are formally described utilizing the tensor

linear operator for the variables. The discrete time, multi/infinite dimensional versions of

AR, ARMA models are given by

$$Y_{i_1,\dots,i_r}(n+1) = A_{i_1,\dots,i_r;\,j_1,\dots,j_r}(n) \otimes Y_{j_1,\dots,j_r}(n) + W_{i_1,\dots,i_r}(n), \tag{4.13}$$

$$Y_{i_1,\dots,i_r}(n+1) = B_{i_1,\dots,i_r;\,j_1,\dots,j_r} \otimes Y_{j_1,\dots,j_r}(n) + V_{i_1,\dots,i_r}(n) + C_{i_1,\dots,i_r;\,j_1,\dots,j_r} \otimes V_{j_1,\dots,j_r}(n-1) + \cdots \tag{4.14}$$

where ⊗ denotes the inner product and variables such as Y_{j1,...,jr}(n) are tensors. The noise models W_{i1,...,ir}(n), V_{i1,...,ir}(n) are multidimensional versions of white noise.

As in one dimension, the continuous time versions of these models are obtained by utilizing a continuous time index t in place of the discrete time index n and replacing the noise models in (4.13) and (4.14) by continuous time white noise or colored noise models. The formal description is omitted for brevity.
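A discrete time tensor autoregressive model of the form (4.13) can be simulated as follows. This is a sketch under simplifying assumptions: the coupling tensor is held constant rather than varying with n, the white noise is taken to be Gaussian with independent components, and the function name and parameter values are hypothetical.

```python
import numpy as np

def simulate_tensor_ar1(A, Y0, n_steps, noise_std, rng):
    """Simulate Y(n+1) = A (x) Y(n) + W(n), where (x) contracts the last r indices of A.

    W(n) is Gaussian white noise with independent components (an assumption).
    """
    r = Y0.ndim
    path = [Y0]
    for _ in range(n_steps):
        W = noise_std * rng.standard_normal(Y0.shape)
        path.append(np.tensordot(A, path[-1], axes=r) + W)
    return np.stack(path)

m, r = 3, 2
# Hypothetical stable coupling tensor: half the identity tensor of order 2r
A = 0.5 * np.eye(m**r).reshape((m,) * (2 * r))
Y0 = np.ones((m,) * r)

noisy_path = simulate_tensor_ar1(A, Y0, n_steps=5, noise_std=0.1,
                                 rng=np.random.default_rng(2))
noise_free = simulate_tensor_ar1(A, Y0, n_steps=5, noise_std=0.0,
                                 rng=np.random.default_rng(2))
```

With the noise switched off, each step simply halves the tensor, which gives a quick sanity check of the contraction.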

The above models (which reduce to the one dimensional models in the one dimensional case) enable one to derive various important details related to such stochastic processes in multi/infinite dimensions. For instance, the autocorrelation tensors and the power spectrum are derived based on the well known techniques for one dimensional systems. It should be noted that the multi/infinite dimensional power spectrum estimation problem (formulated using local state etc.) was well known to be very difficult. Thus, the utilization of tensor linear operators in certain multidimensional systems enables one to extend the results from one dimensional systems to certain multidimensional systems. Various interesting identities arise in the actual analysis. The details are omitted.

In the following, state space representations for arbitrary stochastic linear systems are described. In one dimension, it is well known that the widely utilized Markov chains constitute the one dimensional stochastic linear systems. Thus, there has been research effort to extend this idea and approach to multi/infinite dimensions. As with the deterministic multi/infinite dimensional linear systems, various models based on the local state approach were conventionally developed. These are traditionally called random field models. With the Tensor State Space Representation (TSSR) (of certain multidimensional systems) provided in section 3, stochastic multi/infinite dimensional linear systems, called structured Markov random fields, are based on the tensor linear operator. In the spirit of the one dimensional approach, the multi/infinite dimensional structured Markov random fields are homogeneous stochastic linear systems, described by a difference equation of the following form in discrete time/index:

$$\Pi(n+1) = \Pi(n) \otimes P(n) \tag{4.15}$$

where Π(n) is the tensor of probabilities of the states in the state space, P (n) is the state

transition tensor of the discrete time structured Markov random field. When the structured

Markov random field is homogeneous, then P(n) = P . Both P(n), P are stochastic tensors.
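The propagation of state probabilities in (4.15) can be sketched for a homogeneous transition tensor. The text does not spell out the defining property of a stochastic tensor here, so the sketch assumes the natural analogue of a stochastic matrix: nonnegative entries, with each slice indexed by the current state summing to one over the next-state indices. The sizes and the random tensor are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
m, r = 2, 2

# Assumed convention: P[i1, i2, j1, j2] is the probability of moving from
# state (i1, i2) to state (j1, j2); each (i1, i2) slice sums to one.
P = rng.random((m,) * (2 * r))
P /= P.sum(axis=tuple(range(r, 2 * r)), keepdims=True)

Pi0 = np.zeros((m,) * r)
Pi0[0, 0] = 1.0                      # start in state (0, 0) with certainty

def propagate(Pi, P):
    # Pi(n+1) = Pi(n) (x) P: contract all indices of Pi with the leading indices of P
    return np.tensordot(Pi, P, axes=Pi.ndim)

history = [Pi0]
for _ in range(10):
    history.append(propagate(history[-1], P))
```

Under this convention, every tensor in `history` remains a probability tensor: its entries stay nonnegative and sum to one.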

In the continuous time, the multi/infinite dimensional structured Markov random field

is described by means of a generator tensor. It is given by

$$\dot{\Pi}(t) = \frac{d}{dt}\,\Pi(t) = \Pi(t) \otimes Q(t) \tag{4.16}$$

where Π(t) is the tensor of probabilities of states in the state space at time t, and Q(t) is the generator tensor of the continuous time structured Markov random field. Q(t) satisfies the properties of a generator tensor.

The equilibrium distribution of states in the discrete as well as continuous time/index structured Markov random field is derived through the spectral representation theorem of the linear operator (tensor), utilizing the eigenvalues and eigentensors of the linear operator.
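One way to compute such an equilibrium distribution numerically is to flatten the transition tensor into a matrix, take the left eigenvector for the eigenvalue one, and reshape it back into an eigentensor. This flattening route is an illustrative choice rather than the derivation used in the text. The helper name `equilibrium` is hypothetical, and the transition tensor is assumed strictly positive so that the eigenvalue one is simple.

```python
import numpy as np

def equilibrium(P):
    """Equilibrium probability tensor Pi with Pi (x) P = Pi.

    P has order 2r; its first r indices label the current state, the last r
    the next state (assumed convention). P is assumed strictly positive.
    """
    r = P.ndim // 2
    shape = P.shape[:r]
    size = int(np.prod(shape))
    P_mat = P.reshape(size, size)
    eigvals, eigvecs = np.linalg.eig(P_mat.T)   # left eigenvectors of P_mat
    idx = np.argmin(np.abs(eigvals - 1.0))      # eigenvalue closest to one
    v = np.real(eigvecs[:, idx])
    return (v / v.sum()).reshape(shape)         # normalise to total probability one

rng = np.random.default_rng(4)
m, r = 2, 2
P = rng.random((m,) * (2 * r)) + 0.1            # strictly positive entries
P /= P.sum(axis=tuple(range(r, 2 * r)), keepdims=True)

Pi_star = equilibrium(P)
residual = np.abs(np.tensordot(Pi_star, P, axes=r) - Pi_star).max()
```

The reshaped eigenvector plays the role of the eigentensor associated with the eigenvalue one.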

When the state transition tensor as well as generator tensor have the G/M/1-type

structure, M/G/1-type structure (Neu), the invariant distribution of the random field has

the tensor geometric form. The derivation of the form of invariant distribution and efficient

recursions for the invariant distribution follow from a generalization of the results in one

dimension.

In the following, state space representations for various types of multidimensional stochastic

dynamical systems that are commonly utilized in electrical engineering are discussed.

In the discrete time, the multi/infinite dimensional dynamical system is described by

the difference equation of the following form:

$$X(n+1) = A(n) \otimes X(n) + B(n) \otimes U(n) + W(n)$$

$$Y(n) = C(n) \otimes X(n) + V(n) + D(n) \otimes U(n) \tag{4.17}$$

The tensors A(n), B(n), C(n), D(n) and the state, input, output tensors are of compatible

dimension and order. The noise terms are multi/infinite dimensional extensions of the

independent, identically distributed noise model in one dimension. It is based on the following

tensor based random variable/random process (like vector random variables, vector

random processes) specification. Generally, they are zero mean tensors (each component

random variable has zero mean) and as a sequence constitute independent tensor random

variables. This model is the simplest model that is commonly utilized in stochastic control

theory (ZoP), (SaW). Utilizing Tensor State Space Representation (TSSR), Unified Theory

of Control, Communication and Computation is formalized in (Rama 4).

$$\text{Covariance tensor}\,\{W(m), W(n)\} = Q(m)\,\delta(m-n),$$

$$\text{Covariance tensor}\,\{V(m), V(n)\} = R(m)\,\delta(m-n), \tag{4.18}$$

$$\text{Covariance tensor}\,\{W(m), V(n)\} = 0$$

These plant noise and measurement noise models are assumed to be independent of the normal random initial state tensor, X(0). The continuous time multi/infinite dimensional stochastic models utilize continuous time I.I.D. noise (as in one dimension). The state space model description adds an I.I.D. noise term to those described in section 3. With the above state model, theorems in one dimensional stochastic control are extended to multi/infinite dimensions, since the matrix linear operator is replaced by the tensor linear operator. In translating the results, inner/outer products between vectors and matrices are replaced by those between tensors.
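The covariance tensor of a white tensor noise sequence can be estimated from samples with an outer product, which gives a tensor of order 2r. The following sketch assumes zero mean Gaussian components with unit variance and uses hypothetical sizes. It estimates the same-time covariance tensor and the lag-one covariance tensor, the latter being expected to be close to zero for an independent sequence.

```python
import numpy as np

rng = np.random.default_rng(5)
m, r, N = 2, 2, 20000
W = rng.standard_normal((N,) + (m,) * r)      # N independent zero-mean tensor samples

# Same-time covariance tensor of order 2r: Q[a, b, c, d] estimates E[W[a, b] W[c, d]]
Q_hat = np.einsum('nab,ncd->abcd', W, W) / N

# Lag-one covariance tensor: outer products of consecutive samples
Q_lag1 = np.einsum('nab,ncd->abcd', W[:-1], W[1:]) / (N - 1)

identity = np.eye(m**r).reshape((m,) * (2 * r))
same_time_error = np.abs(Q_hat - identity).max()
lag_one_size = np.abs(Q_lag1).max()
```

For unit variance independent components, the same-time estimate approaches the identity tensor of order 2r, which corresponds to the delta function structure of the noise covariance.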

Now, we consider a noise model which describes processes that are more complicated than the ones considered previously. The colored noise model considered in the ARMA time series model is a special case of the following noise model. In this model, the noise processes constitute a structured Markov random field in multi/infinite dimensions. The plant noise and the measurement noise are uncorrelated/independent. The noise models satisfy the following equations.

$$X(n+1) = A(n) \otimes X(n) + B(n) \otimes U(n) + L(n)$$

$$Y(n) = C(n) \otimes X(n) + M(n) + D(n) \otimes U(n) \tag{4.19}$$

L(n), M(n) are discrete time structured Markov random fields. The fact that a Markov random field is a stochastic linear system enables one to apply stochastic dynamic programming. In the above noise model, the plant and measurement noise are made to be the most general models that are conceivable, while remaining tractable. The continuous time version of the state space model has an additive term added to those in section 3.

With the above state space representation, various results developed in one dimensional

stochastic control theory (SaW) are extended to multi/infinite dimensional systems utilizing

the generic principles described in section 3. Thus, various recursive forms for state

estimation, filtering and prediction are translated from one dimensional systems to

multidimensional systems, particularly with the I.I.D. form of noise.

The time series model discussed at the beginning of the section with tensor state space

representation, led the author to provide very detailed linear prediction type results in multi/

infinite dimensions when the noise process is white as well as colored. Thus, the linear prediction

theory, which was so successful in theoretical as well as practical applications is successfully (in

mathematical completeness) advanced to multi/infinite dimensions by the author with the tensor

state space representation. The mathematical equations look familiar with tensor products

being utilized in the equations.

It should be noted that using the signal and noise models described in this section, multidimensional versions of Wiener and Kalman filters can easily be derived. Various results on estimation, prediction and control are translated from one dimension to multiple dimensions (Rama 4) (when the multidimensional system has a Tensor State Space Representation, i.e. TSSR).
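As an illustration of how a one dimensional result carries over, the sketch below runs one predict/update cycle of the standard Kalman filter on a tensor state by flattening all tensors into vectors and matrices. This is a hedged example and not the derivation of (Rama 4): the function name `kalman_step`, the omission of the input term, the time-invariant tensors and the example values are all assumptions.

```python
import numpy as np

def kalman_step(A, C, Q, R, x_est, P_est, Y):
    """One Kalman predict/update cycle for X(n+1) = A (x) X(n) + W(n),
    Y(n) = C (x) X(n) + V(n), with noise covariance tensors Q and R.

    Tensors are flattened so that the standard matrix formulas apply.
    """
    state_shape = x_est.shape
    nx, ny = x_est.size, Y.size
    A_m, C_m = A.reshape(nx, nx), C.reshape(ny, nx)
    Q_m, R_m = Q.reshape(nx, nx), R.reshape(ny, ny)
    P_m = P_est.reshape(nx, nx)

    # Predict
    x_pred = A_m @ x_est.reshape(-1)
    P_pred = A_m @ P_m @ A_m.T + Q_m

    # Update: gain K = P_pred C^T S^{-1}, computed with a linear solve
    S = C_m @ P_pred @ C_m.T + R_m
    K = np.linalg.solve(S.T, (P_pred @ C_m.T).T).T
    x_new = x_pred + K @ (Y.reshape(-1) - C_m @ x_pred)
    P_new = (np.eye(nx) - K @ C_m) @ P_pred
    return x_new.reshape(state_shape), P_new.reshape(state_shape * 2)

m, r = 2, 2
identity = np.eye(m**r).reshape((m,) * (2 * r))
A = 0.5 * identity                     # hypothetical state coupling tensor
C = identity                           # the full state tensor is observed
Q = identity                           # plant noise covariance tensor
R = 0.1 * identity                     # measurement noise covariance tensor
x_est = np.zeros((m,) * r)
P_est = identity
Y = np.arange(4.0).reshape(m, m)       # one hypothetical measurement

x_new, P_new = kalman_step(A, C, Q, R, x_est, P_est, Y)
```

The returned error covariance is an order 2r tensor, matching the covariance tensors used for the noise models.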

In summary, various results developed in one dimensional stochastic control theory and the theory of one dimensional random processes are extended to multi/infinite dimensions through the Tensor State Space Representation.

Distributed dynamical systems are a class of systems which are more general than the

dynamical systems considered above in some sense. They arise in various practical

applications such as the electrical transmission lines (distributed inductance, capacitance,

resistance along the line), image models, models of tomographic images of brain etc.

One/multi/infinite dimensional systems in which the tensors appearing in the system dynamics vary with time are one of the simplest illustrations of distributed dynamical systems. These systems exhibit a form of non-homogeneity in the evolution of the system in the state space: the state coupling, input coupling and output coupling tensors depend on the discrete/continuous time index, so that the manner of state transitions depends on the location, i.e. on the discrete/continuous time index. This naturally motivates considering systems, based on practical applications, in which the state transitions in multi/infinite dimensions depend on the location. This is once again reminiscent of the conventional models of two/multidimensional signal processing. To formally provide models of distributed dynamical systems in multi/infinite dimensions, the following notation from tensor algebra/analysis is introduced.

A tensor function of several scalar arguments is a rule assigning a value of a tensor B to each admissible value of a set of variables (t1, t2, ..., ts). To indicate such a function, we write

$$B = B(t_1, t_2, \dots, t_s) \tag{4.20}$$

In the models of distributed systems described in the following, utilizing tensor linear

operators, the state, input, output variables are functions of multiple discrete time/index

or continuous time/index.

The following concept from tensor analysis is also extremely helpful.

Tensor Field:

By a tensor field, we mean a rule assigning a unique value of a tensor to each point of

a certain volume V ( V may be all of space). Let r be the radius vector of a variable point

of V with respect to the origin of some coordinate system. Then, a tensor field is indicated

by writing

$$A_{i_1,\dots,i_n} = A_{i_1,\dots,i_n}(r) \tag{4.21}$$

if the tensor is of order n. A special class of tensor fields are nonstationary fields, which are

functions of both space and time i.e. of both the vector r and the scalar t:

$$\phi = \phi(r, t), \qquad A = A(r, t) \tag{4.22}$$

A tensor field is said to be homogeneous if it has no spatial dependence. In this case,

the above reduces to

$$A = A(t) \tag{4.23}$$

Tensor fields which are continuous are of utility in physical applications and in modeling

various real life dynamical systems. Non-stationary fields are of utility in modeling

distributed dynamical systems.

It will be evident to the reader how the above concepts are utilized in the

following models of distributed dynamical systems. Particularly, tensor fields enable one

to define dynamical systems over regions in the higher dimensional space which are not


Motivated by the quarter plane causal model familiar from conventional two dimensional

system theory, the author defines the following model of a linear system distributed in the

plane or two dimensional distributed system. It is given by

$$X_{i_1,\ldots,i_r}(h+1,k+1) = A^{(1)}_{i_1,\ldots,i_r;j_1,\ldots,j_r}(h,k+1) \otimes X_{j_1,\ldots,j_r}(h,k+1) + A^{(2)}_{i_1,\ldots,i_r;j_1,\ldots,j_s}(h+1,k) \otimes \cdots$$

where X(h, k ), U(h,k), Y(h,k) are the values of the local state tensor, input tensor and output

tensor at (h, k) ∈ Z × Z. The multidimensional extension of this model is described based

on the same spirit in the sense that the nearest/farthest neighbourhood set is partitioned

into causal/non-causal parts and utilizing it in writing the multidimensional difference

equation describing the system dynamics.
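The quarter plane recursion can be sketched numerically as follows. This is only an illustration under stated assumptions: the coefficient tensors are taken to be constant, the state is an order 2 tensor, the input is an order 1 tensor, and the input terms `B1`, `B2` are an assumed completion of the model, since only the state terms of the equation are visible above. The names `inner` and `simulate_quarter_plane` are hypothetical helpers, and indices are shifted so that X(h, k) is computed from X(h-1, k) and X(h, k-1).

```python
import numpy as np

def inner(T, X):
    """Contract the trailing X.ndim indices of T with all indices of X."""
    return np.tensordot(T, X, axes=X.ndim)

def simulate_quarter_plane(A1, A2, B1, B2, U, X_boundary, H, K):
    """Fill X(h, k) for 0 <= h <= H, 0 <= k <= K from boundary values.

    X(h, k) = A1 (x) X(h-1, k) + A2 (x) X(h, k-1)
            + B1 (x) U(h-1, k) + B2 (x) U(h, k-1)   (input terms are assumed)
    """
    X = {}
    for h in range(H + 1):
        for k in range(K + 1):
            if h == 0 or k == 0:
                # Boundary of the quarter plane: prescribed local state.
                X[(h, k)] = X_boundary.copy()
            else:
                X[(h, k)] = (inner(A1, X[(h - 1, k)]) + inner(A2, X[(h, k - 1)])
                             + inner(B1, U[h - 1, k]) + inner(B2, U[h, k - 1]))
    return X

m = 2
# Order-4 identity tensor: inner(identity4, X) returns X.
identity4 = np.einsum("ik,jl->ijkl", np.eye(m), np.eye(m))
A1 = 0.5 * identity4
A2 = 0.5 * identity4
B1 = np.zeros((m, m, m))
B2 = np.zeros((m, m, m))
U = np.zeros((4, 5, m))          # U[h, k] is the order-1 input at (h, k)
X = simulate_quarter_plane(A1, A2, B1, B2, U, np.ones((m, m)), H=3, K=4)
print(X[(3, 4)])
```

With zero input and each state coupling tensor equal to half the identity, every interior local state is the average of its two causal neighbours, so the boundary value propagates unchanged across the grid.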

For instance, the half plane causal model familiar in two dimensional signal processing

is written utilizing the tensor linear operator in the same spirit as the above quarter plane

causal model.

Various notions of causality are introduced into the system evolution in the state space by means of a natural or artificially induced decomposition of the state space. The state space is partitioned into neighbourhoods and the dynamical system is described by means of a difference equation (multi/infinite dimensional) of the following form

$$X_{\{(i_1,\ldots,i_n)(k) \in (N+1)\}} = A_{(i_1,\ldots,i_n;j_1,\ldots,j_n)}(k) \otimes X_{\{(j_1,\ldots,j_n)(k) \in N\}} + \cdots$$

$$Y_{\{(i_1,\ldots,i_n)(k) \in N\}} = C_{(i_1,\ldots,i_n;j_1,\ldots,j_n)}(k) \otimes X_{\{(j_1,\ldots,j_n)(k) \in N\}} + \cdots$$

where N, N+1 are neighbourhood sets in the multi/infinite dimensional state space which

are not necessarily bounded by hyperplanes (captured by a structure like tensors/matrices).

The above state space description of a dynamical system in discrete index variables is in the most general format conceivable. The advantage of such a model is the ability to make an arbitrary choice of the neighbourhood. If the neighbourhood is chosen to be one among those in the set utilized for embedding causality structure onto the state space, various models result.

The continuous time version of the above model utilizes non-stationary tensor fields. The typical system evolution equations are given by

$$\dot{X}_{\{(i_1,\ldots,i_n)(t) \in (N+1)\}} = A_{(i_1,\ldots,i_n;j_1,\ldots,j_n)}(t) \otimes X_{\{(j_1,\ldots,j_n)(t) \in N\}} + \cdots$$

$$Y_{\{(i_1,\ldots,i_n)(t) \in N\}} = C_{(i_1,\ldots,i_n;j_1,\ldots,j_n)}(t) \otimes X_{\{(j_1,\ldots,j_n)(t) \in N\}} + \cdots$$

where $\dot{X}_{i_1,\ldots,i_n}(t)$ is the tensor of partial derivatives (like the Jacobian matrix, we can call it a Jacobian tensor) of $X_{i_1,\ldots,i_n}(t)$. Once again this is the most general model conceivable. If the neighbourhood set is represented by a tensor, we have a very important special case.

With a careful understanding of the notions of local state and local control, and of the essential ideas of the theory of ordinary/partial difference/differential equations, many results developed in those fields are adapted to the case where vector-matrix variables are replaced by tensor-tensor variables. The outcome of this mathematically formal approach is:

(i) Results developed by the differential/partial differential equations community are adapted to the tensor-tensor based equations. Once again, the translation is done with relative ease.

(ii) Distributed dynamical systems are modeled by using the half plane, quarter plane causal type neighbourhood models. In these models, the matrices/vectors are replaced by tensors. Various other models based on local state, local control and various types of decompositions of the state space that arise in fields such as image processing, tomography etc. are translated to the multidimensional case by replacing the vectors/matrices by tensors.

Various types of problems formulated and solved in conventional two/multidimensional system theory are adapted to the tensor based difference/differential equations by utilizing tensor products and tensor algebra/analysis. Some illustrations of the design and analysis of distributed systems are reported utilizing the tensor linear operators for local state, local control, local input and local output variables and replacing the vector/matrix products by tensor-tensor products. They are omitted here for brevity.

4.7 CONCLUSIONS

Utilization of the tensor linear operator associated with dynamic as well as static linear systems enables one to formulate and solve various known as well as new problems using the powerful tools of tensor algebra (Rama 1). It is hoped that this representation will prove useful in various scientific/mathematical fields. State space representation by tensor linear operators is discovered and formalized (Rama 1). It is formally demonstrated how the theory of certain multidimensional systems is developed utilizing the tensor state space representation and translations of the results from one dimensional system theory. Approaches to translate one dimensional stochastic control theory to multi/infinite dimensional systems are briefly described. New state space representations for distributed dynamical systems are developed which enable translating the results from conventional state space models of multidimensional systems. Thus, in essence, the tensor linear operator based representation of static as well as dynamic systems has an important impact on various fields of scientific endeavour.

REFERENCES

(BiF) M. Bisiacco and E. Fornasini, “Optimal Control of Two Dimensional Systems,” SIAM Journal on Control and Optimization, Vol. 28, pp. 582-601, May 1990.

(BoT) A. I. Borisenko and I. E. Tarapov, “Vector and Tensor Analysis with Applications,” Dover Publications Inc., New York, 1968.

(Gop) M. Gopal, “Modern Control System Theory,” John Wiley and Sons, New York.

(Neu) M. F. Neuts, “Matrix Geometric Solutions in Stochastic Models,” Marcel-Dekker, Baltimore.

(Rama 1) Garimella Rama Murthy, “Tensor State Space Representation: Multidimensional Systems,” International Journal of Systemics, Cybernetics and Informatics (IJSCI), January 2007, pp. 16-23.

(Rama 2) Garimella Rama Murthy, “Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory,” International Journal of Neural Systems, Vol. 15, No. 3, June 2005.

(Rama 3) Garimella Rama Murthy, “Multidimensional Neural Networks: Multidimensional Coding Theory: Constrained Static Optimization,” Proceedings of 2002 IEEE International Workshop on Information Theory.

(Rama 4) “Optimal Control, Codeword, Logic Function Tensors: Multidimensional Neural Networks,” IJSCI, October 2006, pp. 9-17.

(SaW) Sage and White, “Optimal Control Theory,” Academic Press.

(Zop) R. Zoppoli and T. Parisini, “Learning Techniques and Neural Networks for the Solution of N-stage Non-linear Non-quadratic Optimal Control Problems,” Topics in 2-d System Theory, 1992.


CHAPTER 5

Unified Theory of Control, Communication and Computation: Multidimensional Neural Networks

5.1 INTRODUCTION

In the mid 1940s, Norbert Wiener coined the word Cybernetics for the research field dedicated to understanding the control, communication, computation and other such functions of living systems. It is widely agreed that these functions of living systems are controlled by various functional sub-assemblies in the brain synthesized through bio-chemical circuits. Research in this field was pursued by several researchers in diverse fields. The multi-disciplinary effort advanced the literature on the subject, but no formally precise discoveries were made.

Also, starting in the 1950s, research efforts in the electrical engineering discipline led to the isolated theories of control, communication and computation. The central goals of these three fields are summarized in the following:

• The problem of communication is to convey a message from one point in space and time to another point in space and time as reliably as possible.

• The problem of control is to move a system from one point in state space to another point in state space such that a certain objective function is minimized.

• The problem of computation is to process a set of input symbols and produce another set of output symbols based on some information processing operation.

These three problems, on the surface, seem to be unrelated to one another.

Also, in the mid 1960s, several researchers became interested in mathematical models of the nervous system. This effort was meant to complement the research in cybernetics. Hopfield/Amari succeeded in providing an abstract model of associative memory. Based on this abstract model, researchers were led to the following question, which remained unanswered.

Question: Is it true that the functional units responsible for control, communication

and computation are synthesized through a network of homogeneous neurons?


Occasionally, research efforts led to establishing some relationship between the three fields. In this chapter it is shown (with mathematical clarity and precision) that, in the sense of optimization of some objective function (consolidating the earlier efforts of other authors), these three problems are related to one another, leading to one form of unification. From a practical point of view, this unification is relevant to the design of the brains of powerful robots.

With the efforts of the author, Boolean logic theory was generalized to multi/infinite dimensions using an optimization approach (Rama 1). This approach led to the area of multidimensional neural networks (Rama 1). Also, using the generalization of the one dimensional results in (BrB), multidimensional linear as well as non-linear codes are related to multidimensional neural networks. Thus, using these results, the research fields of computation and communication are related through the common thread of neural networks. In this chapter, the main achievement of the author is to show that optimal control tensors of certain multidimensional systems are synthesized as the stable states of neural networks. Thus, utilizing the results summarized in this paragraph, the Unified Theory of Control, Communication and Computation is generalized to multidimensional systems.

This chapter is organized in the following manner. In Section 2, unification of control, communication and computation in one dimensional systems is summarized. In Section 3, the discovery and formalization of the Tensor State Space Representation of certain multidimensional systems is briefly discussed. Using this representation, optimal control tensors (under a well known criterion of optimality) are shown to constitute the stable states of a multidimensional Hopfield neural network. In Section 4, utilizing the results in (Rama 1), (Rama 2), the Unified Theory of Control, Communication and Computation in multidimensional systems is formally described. Conclusions are reported in Section 5.

OPTIMAL CONTROL VECTORS: ONE DIMENSIONAL NEURAL NETWORKS

optimizing a quadratic form over the hypercube. Other authors also realized that the concept

of a logic gate (CAB) (in one dimension), concept of error correcting code (BrB) could be

related to one dimensional neural networks (optimizing a quadratic/higher degree form).

These efforts are summarized in the following paragraphs. The essential goal of this section

is to summarize unification of control, communication and computation functions (in one

dimensional systems) through the common thread of one dimensional neural networks.

One dimensional logic theory as well as logic synthesis deal with information processing

logic gates and logic circuits which operate on one dimensional arrays of zeroes and ones

(or more generally one dimensional arrays containing finitely many symbols). The


operations performed by AND, OR, NOR, NAND, XOR gates have appropriate intuitive

interpretation in terms of the entries of the one dimensional arrays i.e. vectors.

Research in the area of artificial neural networks led to the question of whether all one dimensional logic gates can be synthesized using a single layer neural network. Chakradhar et al. provided an answer to this question. They showed that the set of stable states of a Hopfield neural network corresponds to one dimensional logic functions (CAB). Equivalently, the input and output signal states of a logic gate are related through an energy function. The outputs correspond to the stable states of the neural network (which constitute the local optima of the energy function). Thus, in a well defined sense, one dimensional neural networks and logic theory are related.
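As a small illustration of relating a logic gate to an energy function, the sketch below uses a quadratic energy over the binary variables (a, b, c) whose minimum energy states are exactly the consistent states of an AND gate c = a AND b. The particular coefficients are chosen here for illustration and are checked by enumeration; they are not claimed to be the exact construction given in (CAB). A state is called stable here if no single bit flip strictly lowers the energy.

```python
from itertools import product

def and_gate_energy(a, b, c):
    """Illustrative quadratic energy: zero iff c == a AND b, positive otherwise."""
    return a * b - 2 * a * c - 2 * b * c + 3 * c

def is_stable(state):
    """True if no single bit flip strictly lowers the energy."""
    energy = and_gate_energy(*state)
    for i in range(3):
        flipped = list(state)
        flipped[i] ^= 1
        if and_gate_energy(*flipped) < energy:
            return False
    return True

states = list(product((0, 1), repeat=3))
stable_states = sorted(s for s in states if is_stable(s))
truth_table = sorted((a, b, a & b) for a, b in product((0, 1), repeat=2))

print(stable_states)
print(stable_states == truth_table)
```

The four stable states coincide with the four rows of the AND truth table, and each of the four inconsistent states can lower its energy by flipping the output bit c.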

In (BrB), several ways of relating the concept of neural networks and the concept of error correcting codes are presented. Specifically, it is shown that, given a linear block code, a neural network can be constructed in such a way that every local maximum of the energy function corresponds to a codeword and every codeword corresponds to a local maximum. It is also shown that performing maximum likelihood decoding in a linear block error correcting code is equivalent to finding a global maximum of the energy function of a certain neural network. Thus, one dimensional neural networks and error correcting codes are related.

In dealing with the problem of storage of data in magnetic and optical recording systems, Wyner formulated an important open research problem (GoC). The problem is: “Consider a Single Input, Single Output (SISO) linear time invariant continuous time system. Consider the input which is constrained to assume values between +1 and –1. Determine the optimal input signals which maximize the total output energy over a finite horizon.” This problem was solved by the author in (Rama 5) and independently by Honig et al. (HoS). In (RKB), the author formulated and solved the problem in the case of SISO discrete time, linear time invariant systems. The result in the case of discrete time SISO systems shows that the optimal control vectors over a finite horizon constitute the stable states of a Hopfield neural network. Thus, optimal control vectors are synthesized as the local optima of the energy function associated with a Hopfield neural network. The associated derivation is provided in Chapter 7.

Thus, the research work summarized in the previous paragraphs shows that optimal

control vectors, optimal codeword vectors and optimal logic gate outputs are synthesized

as the stable states of a one dimensional neural network (not necessarily same). Hence the

three research areas of control, communication and computation are unified using the

common thread of neural networks. One should note that the unification is done in one

dimension (one independent variable). In the following, we extend the unification to

multidimensions. Particularly in the following section, the main achievement of the author

(in this chapter) is discussed.


NETWORKS

In the case of one dimensional linear systems, it was shown that the state space

representation of the dynamics is much better than input-output description. Specifically,

state space representation naturally leads to concepts such as controllability, observability

associated with the system.

Unfortunately, in the case of multidimensional systems, there is no natural notion of

causality. Thus system theorists introduced notions such as quarter-plane causality, half-

plane causality etc by partitioning the index set for state variables. In contrast to these

approaches, the author discovered and formalized (Rama 3), Tensor State Space

Representation (TSSR) of CERTAIN multidimensional systems. It is discussed in (Rama 3)

that this particular representation enables transferring results from one dimensional

systems (with vector-matrix state space representation) to certain multidimensional

systems.

In summary, CERTAIN multi/infinite dimensional discrete time/index dynamical

systems can be described by means of a state space description of the following form:

Discrete Time Systems

$$X_{(i_1,\ldots,i_r)}(n+1) = A_{(i_1,\ldots,i_r;j_1,\ldots,j_r)}(n) \otimes X_{(j_1,\ldots,j_r)}(n) + B_{(i_1,\ldots,i_r;j_1,\ldots,j_p)}(n) \otimes U_{(j_1,\ldots,j_p)}(n),$$

$$Y_{(l_1,\ldots,l_s)}(n) = C_{(l_1,\ldots,l_s;j_1,\ldots,j_r)}(n) \otimes X_{(j_1,\ldots,j_r)}(n) + D_{(l_1,\ldots,l_s;j_1,\ldots,j_p)}(n) \otimes U_{(j_1,\ldots,j_p)}(n). \tag{5.1}$$

where ⊗ denotes the inner product operation between compatible tensors (BoT). Also in (5.1), A(n) is an m dimensional tensor of order 2r (called the state coupling tensor), X(n) is the state of the dynamical system at the discrete time index n, whereas X(n+1) is the state of the system at the discrete time index n+1. Furthermore, B(n) is an m dimensional tensor of order r+p (called the input coupling tensor) and Y(n) is an output tensor of dimension m and order s. U(n) is an m dimensional input tensor of order p (varying with the discrete time index n), C(n) (called the state coupling tensor to the output dynamics) is an m dimensional tensor of order s+r, and D(n) is the input coupling tensor to the output dynamics, of dimension m and order s+p.
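One step of the representation (5.1) can be sketched with NumPy as follows. This is a minimal illustration, assuming the inner product ⊗ contracts the trailing indices of the coupling tensor with all indices of the tensor it acts on; the helper names `inner` and `tssr_step` and the sizes m, r, p, s are chosen here purely for the example.

```python
import numpy as np

def inner(T, X):
    """Contract the trailing X.ndim indices of T with all indices of X."""
    return np.tensordot(T, X, axes=X.ndim)

def tssr_step(A, B, C, D, X, U):
    """One step of (5.1): returns X(n+1) and Y(n) from X(n) and U(n)."""
    X_next = inner(A, X) + inner(B, U)
    Y = inner(C, X) + inner(D, U)
    return X_next, Y

rng = np.random.default_rng(0)
m, r, p, s = 3, 2, 1, 1                      # dimension and tensor orders
A = rng.standard_normal((m,) * (2 * r))      # state coupling tensor, order 2r
B = rng.standard_normal((m,) * (r + p))      # input coupling tensor, order r+p
C = rng.standard_normal((m,) * (s + r))      # order s+r
D = rng.standard_normal((m,) * (s + p))      # order s+p
X = rng.standard_normal((m,) * r)            # state tensor, order r
U = rng.standard_normal((m,) * p)            # input tensor, order p

X_next, Y = tssr_step(A, B, C, D, X, U)

# The same step in vector-matrix form, obtained by flattening the index groups.
X_flat = A.reshape(m**r, m**r) @ X.ravel() + B.reshape(m**r, m**p) @ U.ravel()
print(np.allclose(X_next.ravel(), X_flat))
```

The flattened check makes concrete why results from vector-matrix state space theory carry over: grouping the indices turns each tensor inner product into an ordinary matrix-vector product.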

With the above important representation of certain multidimensional systems, we formulate and solve an important problem in optimal control of certain multidimensional systems. The solution of the problem shows that the optimal control tensors are synthesized as the stable states of a multidimensional Hopfield neural network (the connection structure of an m-d Hopfield neural network is a fully symmetric tensor).

Problem Definition

Find an admissible sequence of (realizable) input signal tensors $U(k)$, $k \in \{0, 1, 2, \ldots\}$, with each component of the tensor bounded in amplitude by unity (or, without loss of generality, by a fixed constant), i.e. $|U_{i_1,i_2,\ldots,i_r}(k)| \le 1$, in order to minimize the criterion

$$J = -\frac{1}{2} \sum_{k=0}^{k_f} Y_{i_n,\ldots,i_1}(k) \otimes Y_{i_1,\ldots,i_n}(k) \tag{5.2}$$

subject to

$$X(n+1) = A(n) \otimes X(n) + B(n) \otimes U(n) \tag{5.3}$$

$$Y(n) = C(n) \otimes X(n) \tag{5.4}$$

where A(n), B(n), C(n), D(n) are tensors arising in the system dynamics of the discrete

time multi/infinite dimensional system. Furthermore, X(n) is the state tensor of the system.

These tensors which arise in the system dynamics are of compatible dimensions. Without

loss of generality, a multi-input, multi-output multidimensional linear system is considered.

Let the impulse response tensor of the system be denoted by h(k, l). This is the discrete

time version of the problem given in (GoC) for CERTAIN discrete time multidimensional

systems. The open problem given in (GoC) is solved in (Rama 5).

Solution

The optimality condition is derived through the application of the maximum principle or

equivalently, the dynamic programming principle. The application of dynamic

programming enables us to derive the necessary as well as sufficient condition through

the principle of optimality in some cases.

Let $U(k)$, $k = 0, 1, 2, \ldots, k_f - 1$, be the optimal control tensor sequence, and let $X(k)$, $k = 0, 1, 2, \ldots$, be the state response of the linear system due to the input tensors $U(k)$, uniquely specified by (5.3), (5.4) and the initial condition of the linear dynamical system. Then, under reasonable assumptions, discussed in the application of the discrete maximum principle (SaW), it is shown that there exists a non-trivial function satisfying

$$\lambda_k = \frac{\delta H(X_k, U_k, \lambda_{k+1}, k)}{\delta X_k} \tag{5.5}$$

where the Pontryagin function/Hamiltonian is given by

$$H(X_k, U_k, \lambda_{k+1}, k) = -\frac{1}{2} (C(k) \otimes X(k))_{i_n,\ldots,i_1} \otimes (C(k) \otimes X(k))_{i_1,\ldots,i_n} + \lambda_{i_l,\ldots,i_1}(k+1) \otimes [A(k) \otimes X(k) + B(k) \otimes U(k)] \tag{5.6}$$

$$\lambda(k) = -C_{j_m,j_{m-1},\ldots,j_1}(k) \otimes Y_{i_1,\ldots,i_p}(k) + A_{i_s,\ldots,i_1}(k) \otimes \lambda(k+1) \tag{5.7}$$

Since, the terminal state is unspecified, we have

$$\lambda(k_f) = -C_{j_m,\ldots,j_1}(k_f) \otimes Y_{i_1,\ldots,i_p}(k_f) \tag{5.8}$$

This will provide the terminal condition for solving (5.7). Since the input tensor sequence

is constrained, it must necessarily satisfy

$$H(X_k, U_k, \lambda_{k+1}, k) = \min_{V \in T} H(X_k, V, \lambda_{k+1}, k) \quad \text{for all } k = 0, 1, \ldots, k_f - 1.$$

Thus,

$$U(k) = -\operatorname{Sign}\big(B_{s_l,\ldots,s_1}(k) \otimes \lambda_{t_1,\ldots,t_n}(k+1)\big) \tag{5.9}$$

Solving (5.7) for λ(k + 1) and substituting in (5.9), we arrive at the optimal control

sequence. When the constraint set is other than a hypercube, various well known techniques

from mathematical programming for different constraint sets such as a convex polytope,

convex polyhedra are invoked in the context of quadratic programming. The cost function

is quadratic and it is optimized over various types of constraint sets such as the one

described previously.

With the terminal condition specified, equation (5.7) is recursed backwards to arrive at the optimal control tensor in the case of multi/infinite dimensional systems. Thus, an efficient computational form for solving the two point boundary value problem is derived in the following. It should be noted that we derive the expression for $\lambda_{k+1}$ in the case of certain linear time varying multi/infinite dimensional dynamical systems

$$\lambda(k) = -C_{j_m,\ldots,j_1}(k) \otimes Y_{i_1,\ldots,i_p}(k) + A_{i_s,\ldots,i_1}(k) \otimes \lambda(k+1) \tag{5.10}$$

starting with the terminal condition, recursing backwards.
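The backward recursion can be sketched in the simplest special case, where the state, input and output tensors are of order one (vectors), the coupling tensors are constant matrices, and the transposed tensors in (5.7) and (5.9) become matrix transposes. The function names and the random data below are illustrative assumptions only. Note that the sketch takes the output sequence Y as given; in the actual two point boundary value problem, Y itself depends on the controls.

```python
import numpy as np

def costates(A, C, Y):
    """Backward recursion lam(k) = -C^T Y(k) + A^T lam(k+1), lam(kf) = -C^T Y(kf)."""
    kf = len(Y) - 1
    lam = [None] * (kf + 1)
    lam[kf] = -C.T @ Y[kf]
    for k in range(kf - 1, -1, -1):
        lam[k] = -C.T @ Y[k] + A.T @ lam[k + 1]
    return lam

def bang_bang_controls(B, lam):
    """U(k) = -Sign(B^T lam(k+1)) for k = 0, ..., kf - 1."""
    return [-np.sign(B.T @ lam[k + 1]) for k in range(len(lam) - 1)]

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 1))
C = rng.standard_normal((2, 3))
kf = 4
Y = [rng.standard_normal(2) for _ in range(kf + 1)]   # assumed output samples

lam = costates(A, C, Y)
U = bang_bang_controls(B, lam)

# Unrolled form of lam(kf - 2), obtained by applying the recursion twice.
unrolled = (-C.T @ Y[kf - 2]
            - A.T @ C.T @ Y[kf - 1]
            - A.T @ A.T @ C.T @ Y[kf])
print(np.allclose(lam[kf - 2], unrolled))
```

Comparing the recursive and unrolled values shows how each backward step adds one more product of transposed state coupling terms.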

Remark

Before we proceed further, the reader is reminded that the symbols used for the indices describing the order of each tensor are chosen arbitrarily. The tensors in the above state space representation are of compatible order to ensure that inner and outer products make sense. Now, we return to the derivation. In the following, the notation ⊗ is utilized to denote the inner product (BoT) between tensors of compatible order.

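$$\lambda_{t_1,\ldots,t_l}(k_f) = -C_{j_m,\ldots,j_1}(k_f) \otimes Y_{i_1,\ldots,i_p}(k_f) \tag{5.11}$$

$$\lambda_{t_1,\ldots,t_l}(k_f-1) = -C_{j_m,\ldots,j_1}(k_f-1) \otimes Y_{i_1,\ldots,i_p}(k_f-1) - A_{i_s,\ldots,i_1}(k_f-1) \otimes C_{j_m,\ldots,j_1}(k_f) \otimes Y_{i_1,\ldots,i_p}(k_f) \tag{5.12}$$

$$\lambda_{t_1,\ldots,t_l}(k_f-2) = -C_{j_m,\ldots,j_1}(k_f-2) \otimes Y_{i_1,\ldots,i_p}(k_f-2) - A_{i_s,\ldots,i_1}(k_f-2) \otimes C_{j_m,\ldots,j_1}(k_f-1) \otimes Y_{i_1,\ldots,i_p}(k_f-1) - A_{i_s,\ldots,i_1}(k_f-2) \otimes A_{i_s,\ldots,i_1}(k_f-1) \otimes C_{j_m,\ldots,j_1}(k_f) \otimes Y_{i_1,\ldots,i_p}(k_f) \tag{5.13}$$

$$\lambda_{t_1,\ldots,t_l}(k_f-3) = -C_{j_m,\ldots,j_1}(k_f-3) \otimes Y_{i_1,\ldots,i_p}(k_f-3) - A_{i_s,\ldots,i_1}(k_f-3) \otimes C_{j_m,\ldots,j_1}(k_f-2) \otimes Y_{i_1,\ldots,i_p}(k_f-2) - A_{i_s,\ldots,i_1}(k_f-3) \otimes A_{i_s,\ldots,i_1}(k_f-2) \otimes C_{j_m,\ldots,j_1}(k_f-1) \otimes Y_{i_1,\ldots,i_p}(k_f-1) - A_{i_s,\ldots,i_1}(k_f-3) \otimes A_{i_s,\ldots,i_1}(k_f-2) \otimes A_{i_s,\ldots,i_1}(k_f-1) \otimes C_{j_m,\ldots,j_1}(k_f) \otimes Y_{i_1,\ldots,i_p}(k_f) \tag{5.14}$$

Thus, continuing the solution of the difference equation backwards, we have

$$\lambda_{t_1,\ldots,t_l}(k_f-l) = -C_{j_m,\ldots,j_1}(k_f-l) \otimes Y_{i_1,\ldots,i_p}(k_f-l) - A_{i_s,\ldots,i_1}(k_f-l) \otimes C_{j_m,\ldots,j_1}(k_f-l+1) \otimes Y_{i_1,\ldots,i_p}(k_f-l+1) - \cdots - A_{i_s,\ldots,i_1}(k_f-l) \otimes \cdots \otimes A_{i_s,\ldots,i_1}(k_f-1) \otimes C_{j_m,\ldots,j_1}(k_f) \otimes Y_{i_1,\ldots,i_p}(k_f) \tag{5.15}$$

Equivalently, setting $k + 1 = k_f - l$,

$$\lambda_{t_1,\ldots,t_l}(k+1) = -C_{j_m,\ldots,j_1}(k+1) \otimes Y_{i_1,\ldots,i_p}(k+1) - \cdots - A_{i_s,\ldots,i_1}(k+1) \otimes A_{i_s,\ldots,i_1}(k+2) \otimes \cdots \otimes A_{i_s,\ldots,i_1}(k+l) \otimes C_{j_m,\ldots,j_1}(k+l+1) \otimes Y_{i_1,\ldots,i_p}(k+l+1) \tag{5.16}$$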

Thus we have the optimal control solution for the problem given by (utilizing (5.9))

$$U_{v_1,\ldots,v_r}(k) = \operatorname{Sign}\Big( B_{s_l,\ldots,s_1}(k) \otimes C_{j_m,\ldots,j_1}(k+1) \otimes Y(k+1) + \sum_{i=1}^{l} B_{s_l,\ldots,s_1}(k) \otimes A_{i_s,\ldots,i_1}(k+1) \otimes \cdots \otimes A_{i_s,\ldots,i_1}(k+i) \otimes C_{j_m,\ldots,j_1}(k+i+1) \otimes Y_{i_1,\ldots,i_p}(k+i+1) \Big) \tag{5.17}$$

Now, utilizing the definition of the impulse response tensor of the time varying linear system, we have

$$U_{v_1,\ldots,v_r}(k) = \operatorname{Sign}\Big( B_{s_l,\ldots,s_1}(k) \otimes C_{j_m,\ldots,j_1}(k+1) \otimes Y(k+1) + \sum_{i=1}^{l} \tilde{h}(k+i+1,k) \otimes Y(k+i+1) \Big) = \operatorname{Sign}\Big( \sum_{i=0}^{l} \tilde{h}(k+i+1,k) \otimes Y(k+i+1) \Big) \tag{5.18}$$

where $\tilde{h}(\cdot,\cdot)$ is the transposed tensor of the impulse response tensor $h(\cdot,\cdot)$. The term in the parenthesis is given by

$$\sum_{i=0}^{l} \tilde{h}(k+i+1,k) \otimes Y(k+i+1) = \sum_{i=0}^{l} \tilde{h}(k+i+1,k) \otimes \sum_{j=0}^{k+i+1} h(k+i+1,j) \otimes U(j) \tag{5.19}$$

Exchanging the order of summation (with the help of the associated index grid), with $l = k_f - k - 1$, we have

$$\sum_{j=0}^{k_f} \; \sum_{i=\max\{0,\,j-k-1\}}^{k_f-k-1} \big( \tilde{h}(k+i+1,k) \otimes h(k+i+1,j) \big) \otimes U(j) \tag{5.20}$$

Hence,

$$U^*(k) = \operatorname{Sign}\Big( \sum_{j=0}^{k_f} \; \sum_{i=\max\{0,\,j-k-1\}}^{k_f-k-1} \big( \tilde{h}(k+i+1,k) \otimes h(k+i+1,j) \big) \otimes U(j) \Big) \tag{5.21}$$

Let us define

$$R(k,j) = \sum_{i=\max\{0,\,j-k-1\}}^{k_f-k-1} \big( \tilde{h}(k+i+1,k) \otimes h(k+i+1,j) \big) \tag{5.22}$$

Then

$$U^*(k) = \operatorname{Sign}\Big( \sum_{j=0}^{k_f} R(k,j) \otimes U(j) \Big) \tag{5.23}$$

For a time invariant linear system, this reduces to

$$R(k,j) = \sum_{i=\max\{0,\,j-k-1\}}^{k_f-k-1} \tilde{h}(i+1) \otimes h(k+i+1-j) \tag{5.24}$$

This is the energy density tensor of time invariant linear system obtained from the

impulse response tensor. Thus the optimal control tensor is the stable state of a

multidimensional Hopfield neural network.
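The fixed point condition U* = Sign(R ⊗ U*) can be illustrated in the scalar, time invariant special case. In the sketch below, the impulse response samples `h`, the horizon `kf` and the convention Y(k) = Σ_{j ≤ k} h(k−j) u(j) are assumptions made for the example (the indexing is simplified relative to the derivation above). The matrix R = HᵀH plays the role of the energy density tensor, and a serial mode Hopfield iteration is run until a stable state is reached.

```python
import numpy as np

def energy_matrix(h, kf):
    """R = H^T H, where H[k, j] = h[k - j] for k >= j (causal convolution matrix)."""
    H = np.zeros((kf, kf))
    for k in range(kf):
        for j in range(k + 1):
            H[k, j] = h[k - j]
    return H.T @ H

def hopfield_stable_state(R, u0, max_sweeps=100):
    """Serial updates u_i <- Sign(sum_j R_ij u_j) until no component changes."""
    u = u0.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(u)):
            field = R[i] @ u
            new = u[i] if field == 0 else np.sign(field)  # keep state on a tie
            if new != u[i]:
                u[i] = new
                changed = True
        if not changed:
            break
    return u

kf = 6
h = (-0.7) ** np.arange(kf)      # assumed impulse response samples
R = energy_matrix(h, kf)
u0 = np.ones(kf)                 # initial control vector on the hypercube
u = hopfield_stable_state(R, u0)

print(u)
print(u0 @ R @ u0, u @ R @ u)    # output energy before and after
```

Because R is symmetric with a non-negative diagonal, each accepted flip strictly increases the quadratic form uᵀRu (the output energy), so the serial iteration terminates at a stable state, i.e. a local optimum over the hypercube.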

Now, we formulate and solve the continuous time version of the problem. The continuous time version of the problem provides us with the structure of the local optimum of a quadratic form over the continuous time multi/infinite dimensional hypercube. This is the problem where the $L_\infty$ norm of the control tensors is constrained in amplitude by unity.

In the derivation of the optimal control, the following definition is necessary.

By the integral of a tensor of a scalar continuous argument, we mean the tensor with

the components,


Consider a multi/infinite dimensional linear system with continuous index/argument.

The system dynamics are given by

$$\dot{X}_{i_1,\ldots,i_r}(t) = A_{i_1,\ldots,i_r;j_1,\ldots,j_r}(t) \otimes X_{j_1,\ldots,j_r}(t) + B_{i_1,\ldots,i_r;j_1,\ldots,j_p}(t) \otimes U_{j_1,\ldots,j_p}(t) \tag{5.26}$$

The objective function being minimized in the optimal control problem is given by

$$J = -\frac{1}{2} \int_{t_0}^{t_f} Y_{l_s,\ldots,l_1}(t) \otimes Y_{l_1,\ldots,l_s}(t)\, dt = \int_{t_0}^{t_f} \phi(X, U, t)\, dt \tag{5.27}$$

subject to the constraint given in (5.26) and the input tensors are constrained to be on the

continuous time multi/infinite dimensional hypercube.

Solution:

Form the Pontryagin function ( or Hamiltonian) of the problem. It is given by

$$H(X, U, \lambda, t) = -\frac{1}{2} (C(t) \otimes X(t))_{l_s,\ldots,l_1} \otimes (C(t) \otimes X(t))_{l_1,\ldots,l_s} + \lambda_{i_r,\ldots,i_1}(t) \otimes (A(t) \otimes X(t) + B(t) \otimes U(t)) \tag{5.28}$$

control tensors i.e. control tensors whose components are constrained in amplitude by

unity. Thus,

$$U^*_{j_1,\ldots,j_p}(t) = -\operatorname{Sign}\{ B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \lambda_{i_1,\ldots,i_r}(t) \}$$

Thus, the optimal control tensor for the problem is obtained from the above equation. To explicitly determine the optimal control, the adjoint equations and associated boundary conditions are given by

$$-\dot{\lambda}_{i_1,\ldots,i_r}(t) = \frac{\delta H(X, U, \lambda, t)}{\delta X} = \frac{\delta \phi(X, U, t)}{\delta X} + \frac{\delta}{\delta X}[A(t) X(t) + B(t) U(t)]^T \otimes \lambda(t), \tag{5.29}$$

$$\lambda_{i_1,\ldots,i_r}(t_f) = 0$$

where $\frac{\delta}{\delta X}$ is a partial derivative operator. The above equations (5.28), (5.29), along with the system dynamics described through (5.26), are solved for determining $\lambda_{i_1,\ldots,i_r}(t)$.

$$-\dot{\lambda}_{i_1,\ldots,i_r}(t) = -C_{l_s,\ldots,l_1}(t) \otimes C_{l_1,\ldots,l_s}(t) \otimes X_{i_1,\ldots,i_r}(t) + A_{j_r,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \lambda_{i_1,\ldots,i_r}(t)$$

$$\dot{\lambda}_{i_1,\ldots,i_r}(t) = -A_{j_r,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \lambda_{i_1,\ldots,i_r}(t) + C_{l_s,\ldots,l_1}(t) \otimes Y_{l_1,\ldots,l_s}(t)$$

The above differential equation is solved, like the state equations for the linear dynamical

system, to arrive at

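$$\lambda_{i_1,\ldots,i_r}(t) = \phi^a(t, t_f) \otimes \lambda_{i_1,\ldots,i_r}(t_f) + \int_{t_f}^{t} \phi^a(t, \tau) \otimes C_{l_s,\ldots,l_1}(\tau) \otimes Y_{l_1,\ldots,l_s}(\tau)\, d\tau \tag{5.32}$$

with

$$\frac{d}{dt} \phi^a(t, \tau) = -A_{j_r,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \phi^a(t, \tau) \tag{5.33}$$

$$\phi^a(\tau, \tau) = I; \qquad \phi^a(t, t_f) = \phi(t_f, t)$$

Since $\lambda_{i_1,\ldots,i_r}(t_f) = 0$,

$$\lambda_{i_1,\ldots,i_r}(t) = \int_{t_f}^{t} \phi^a(t, \tau) \otimes C_{l_s,\ldots,l_1}(\tau) \otimes Y_{l_1,\ldots,l_s}(\tau)\, d\tau = -\int_{t}^{t_f} \phi(\tau, t) \otimes C_{l_s,\ldots,l_1}(\tau) \otimes Y_{l_1,\ldots,l_s}(\tau)\, d\tau \tag{5.34}$$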

Hence, we have

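$$-B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \lambda_{i_1,\ldots,i_r}(t) = \int_{t}^{t_f} B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \phi(\tau, t) \otimes C_{l_s,\ldots,l_1}(\tau) \otimes Y_{l_1,\ldots,l_s}(\tau)\, d\tau \tag{5.35}$$

where

$$Y_{l_1,\ldots,l_s}(\tau) = \int_{t_0}^{\tau} C_{l_1,\ldots,l_s}(\tau) \otimes \phi(\tau, s) \otimes B(s) \otimes U(s)\, ds \tag{5.36}$$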

Thus, we have

$$-B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \lambda_{i_1,\ldots,i_r}(t) = \int_{t}^{t_f} B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \phi(\tau, t) \otimes C_{l_s,\ldots,l_1}(\tau) \otimes \Big[ \int_{t_0}^{\tau} C(\tau) \otimes \phi(\tau, s) \otimes B(s) \otimes U(s)\, ds \Big] d\tau \tag{5.37}$$

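$$-B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \lambda_{i_1,\ldots,i_r}(t) = \int_{0}^{t_f} \int_{s}^{t_f} B_{j_p,\ldots,j_1;i_r,\ldots,i_1}(t) \otimes \phi(\tau, t) \otimes C_{l_s,\ldots,l_1}(\tau) \otimes C_{l_1,\ldots,l_s}(\tau) \otimes \phi(\tau, s) \otimes B_{i_1,\ldots,i_r;j_1,\ldots,j_p}(s) \otimes U(s)\, d\tau\, ds \tag{5.38}$$

Let the impulse response tensor be defined as

$$H_{j_1,\ldots,j_p}(t, \tau) = \begin{cases} C_{l_1,\ldots,l_s}(t) \otimes \phi(t, \tau) \otimes B_{i_1,\ldots,i_r;j_1,\ldots,j_p}(\tau), & t \ge \tau \\ 0, & t < \tau \end{cases} \tag{5.39}$$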

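Utilizing the above expression in (5.37),

$$U^*_{j_1,\ldots,j_p}(t) = \operatorname{Sign} \int_{0}^{t_f} \int_{s}^{t_f} H_{j_p,\ldots,j_1}(\tau, t) \otimes H_{j_1,\ldots,j_p}(\tau, s) \otimes U^*_{j_1,\ldots,j_p}(s)\, d\tau\, ds \tag{5.40}$$

$$= \operatorname{Sign} \int_{0}^{t_f} R(t, s) \otimes U^*_{j_1,\ldots,j_p}(s)\, ds \tag{5.41}$$

where $H_{j_p,\ldots,j_1}$ denotes the transposed impulse response tensor and R(t,s) is the energy density tensor of the linear system, given by

$$R(t, s) = \int_{s}^{t_f} H_{j_p,\ldots,j_1}(\tau, t) \otimes H_{j_1,\ldots,j_p}(\tau, s)\, d\tau$$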

For linear time invariant multidimensional systems, the impulse response tensor $H(\tau, s)$ depends only on the difference between its arguments/indices. Thus, the necessary condition on the optimal control (for continuous time multidimensional systems) is given by (5.41). It shows that the optimal control tensor is the stable state of a continuous time (Hopfield type) neural network. The concept of a continuous time multidimensional neural network is conceived by the author in (Rama 4). It should be noted that when the objective function is a higher degree form (rather than a quadratic form), similar derivations are carried out. Details are omitted for brevity.


CODEWORD TENSORS AND SWITCHING FUNCTION TENSORS

In view of the results in previous section, in the following, we briefly summarize the results

discussed in (Rama 1) and (Rama 2) so that we realize that the unification (of control,

communication and computation functions) in one dimension, discussed and formalized

by the author also naturally extends to multidimensional systems.

In one dimensional logic theory, operations performed by AND, OR, NOR, NAND, XOR, NOT gates have appropriate intuitive interpretation in terms of the entries of the one dimensional arrays, i.e. vectors. Any effort to generalize the one dimensional logic operations to multidimensions leads to various heuristic possibilities and requires considerable ingenuity in formalizing the definition. But, in the following, utilizing the multidimensional neural network model [described in (Rama 1)], a formal/mathematical approach to multidimensional logic theory is described.

The input and output signal states of a multidimensional logic gate are related through

an energy function. Equivalently, the multidimensional logic functions are associated with

the local optima of various energy functions defined over the set of input multidimensional

arrays. In view of the mathematical model of multidimensional neural network [ conceived

in (Rama1) ], it is most logical to define the maximum/minimum energy states of a

multidimensional neural network (optimizing an energy function over the

multidimensional hypercube) to correspond to the multidimensional logic gate functions

operating on the input arrays. Similarly generalized multidimensional neural networks

(optimizing a higher degree form) are utilized to define generalized logic functions.

Now we summarize the relationship between multidimensional codes and

multidimensional neural networks.

In the field of one dimensional error correcting codes, various linear as well as non-linear codes have been designed by various researchers. The approaches utilized were mathematically very sound, but attempts by some researchers to extend them to two, three and multidimensional code design met with limited success.

The author (for the first time) conceived the idea of utilizing a generator tensor to represent

a multidimensional linear code. Given this idea and the results in (BrB) [ relating one

dimensional linear/non-linear codes to generalized neural networks], multidimensional linear

as well as non-linear codes are related to multidimensional generalized neural networks.

One specific result is the following (Details can be found in (Rama 2) ): Given a

multidimensional block code (linear or non-linear), a neural network can be constructed in


such a way that every local maximum of the energy function corresponds to a codeword

tensor and every codeword tensor corresponds to a local maximum (i.e. stable state).

Unification: Now utilizing the results in Section 3 (relating optimal control tensors and

multidimensional neural networks), we readily have the unification of control,

communication and computation (through the common thread of neural networks).

Formally, the optimal control tensors, optimal multidimensional logic functions,

multidimensional codeword tensors are synthesized through the stable states of

multidimensional neural (generalized) networks.

In the above unification discussion, we only considered neural (and generalized neural) networks in discrete time. In equation (5.41), we discovered and formalized the concept of continuous time neural associative memory (with the energy function being a quadratic form associated with a certain kernel).

Continuous time generalized neural networks are defined and associated with optimal control tensors, optimal codeword tensors and optimal switching functions. The unified theory with generalized neural networks follows in a similar fashion. Details are omitted here for brevity.

For formal clarity, the following theorem is a comprehensive statement of the unification of control, communication and computation functions (with a quadratic energy function/objective function). The generalization to the case of a higher degree energy function follows in a similar manner.

Theorem 5.1: Consider a linear time varying multidimensional system with the state space representation provided in (5.3), (5.4). The optimal control tensor (subject to a finite amplitude constraint, i.e. $|U_{v_1,\ldots,v_r}| \le 1$), the optimal switching function (in the sense of a transformation between an input tensor and an output tensor) and the optimal linear multidimensional code constitute local optima of a quadratic form in the components of the state variable, input and output tensors. Thus, in the case of linear dynamical systems with a quadratic energy/objective function, the optimal control tensors, optimal switching function and optimal linear code are unified as the local optima of a quadratic form (with argument/index/time varying coefficient tensors for time varying systems) over the multidimensional hypercube. These local optima are therefore synthesized as the stable states of a neural/generalized neural network.

Proof: From (Rama1), the stable states of a multidimensional neural network constitute

the local optimum of a quadratic form with the fully symmetric connection tensor as the

weighting tensor. The convergence theorem for (infinite) multidimensional neural networks

provides a formal result. These local optima are defined to be the multidimensional logic

functions in the sense of a mapping between the input tensors and the stable state tensor.

But, from (5.23), the optimal control tensors which optimize a quadratic objective function

have the stable state structure of an interconnected multidimensional neural network with


block fully symmetric connection structure. Thus, the optimal control and optimal switching

function which optimize a quadratic objective function constitute the stable states of a

multidimensional neural network.

From (Rama 2), it is formally true that the connection structure of a multidimensional graph-type structure (say graphoid) is associated with a multidimensional linear code through its cut space. These cutset codes are termed graph-theoretic codes. It is also proved in (Rama 2) that maximum likelihood decoding of a corrupted word (received word) with respect to the graphoid theoretic code is equivalent to finding the global optimum of the quadratic energy function associated with a multidimensional neural network. Furthermore, it is shown that a tensor constitutes a local optimum of a multi-variate polynomial in the components of the input and output tensors (a quadratic tensor form associated with the parity check tensor) if and only if it is a codeword of the multidimensional linear code. Thus, associated with the generator/parity check tensor of a graphoid theoretic code, there exists a quadratic form over the multidimensional hypercube whose local optima constitute the codewords.

Hence the optimal code, optimal control and optimal switching function, which constitute the local optima of a multi-variate quadratic form (in the components of the state, input and output tensors), are unified to be the same. This constitutes the statement of the unified theory of control, communication and computation in linear dynamical systems (time varying as well as time invariant systems) with a quadratic form as the objective function.

Q. E. D.

A future revision will discuss how the unification extends to other important functions, together with the generalization of the results to certain infinite dimensional systems.

5.5 CONCLUSIONS

In this chapter, based on the work of the author and earlier authors, the unification of control, communication and computation functions (through the common thread of neural networks) is formalized. The main contribution of the author to unification in one dimension is to show that the optimal control vectors (under a well known optimality criterion) constitute the stable states of a Hopfield network. The next important step was to envision

unification in multidimensions. Based on the concept of multidimensional neural networks

(Rama1, Rama 2), the author was able to formally unify communication and computation

functions. Tensor State Space Representation (TSSR) conceived and formalized by the author

was utilized to prove that the optimal control tensors constitute the stable states of a

multidimensional neural network (in discrete as well as continuous time systems). With

this important result, the author was able to show that optimal codewords, optimal logic

functions and optimal control tensors constitute the stable states of a multidimensional

neural network.


REFERENCES

(BoT) A. I. Borisenko and I. E. Tarapov, “Vector and Tensor Analysis with Applications”, Dover Publications Inc., New York, 1968.

(BrB) J. Bruck and M. Blaum, “Neural Networks, Error Correcting Codes and Polynomials Over the Binary Hypercube”, IEEE Transactions on Information Theory, Vol. 35, No. 5, September 1989.

(CAB) S. T. Chakradhar, V. D. Agrawal and M. L. Bushnell, “Neural Models and Algorithms for Digital Testing”, Kluwer Academic Publishers, 1991.

(GoC) B. Gopinath and T. Cover, “Open Problems in Control, Communication and Computation”, Springer, Heidelberg, 1987.

(HoS) M. Honig and K. Stieglitz, “On Wyner’s conjecture”, Bellcore Technical Memorandum.

(Rama 1) Garimella Rama Murthy, “Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory”, International Journal of Neural Systems, Vol. 15, No. 3, Pages 223-235, June 2005.

(Rama 2) Garimella Rama Murthy, “Multidimensional Coding Theory: Multidimensional Neural Networks”, in part presented at the 2002 IEEE International Workshop on Information Theory.

(Rama 3) G. Rama Murthy, “Tensor State Space Representation: Multidimensional Systems”, International Journal of Systemics, Cybernetics and Informatics (IJSCI), January 2007, Pages 16-23.

(Rama 4) G. Rama Murthy, “Optimal Control, Codeword, Logic Function Tensors: Multidimensional Neural Networks”, International Journal... (IJSCI), October 2006, Pages 9-17.

(Rama 5) G. Rama Murthy, “Signal Design for Magnetic and Optical Recording Channels: Spectra of Bounded Functions”, Bellcore Technical Memorandum, TM-NWT-018026, December 1990.

(RKB) G. Rama Murthy, P. Krishna Reddy and L. Behera, “Neural Network Based Optimal Binary Filters”, submitted to Elsevier Signal Processing Journal.

(Gop) M. Gopal, “Modern Control System Theory”, John Wiley and Sons, New York,

(SaW) Sage and White, “Optimum Systems Control”, Prentice-Hall Inc., Englewood Cliffs, New Jersey 07632.


CHAPTER

6

Complex Valued Neural

Associative Memory on the

Complex Hypercube

6.1 INTRODUCTION

The Hopfield model of the neural network is based on the McCulloch-Pitts neuron. In this network the algebraic threshold function is computed at each node. The edge between two nodes is associated with a weight. The network can hence be represented by a symmetric weight matrix, where $W_{i,j}$ represents the weight associated with the edge connecting neurons $i$ and $j$. Since the matrix is symmetric (i.e., the network is represented by an undirected graph), we have $W_{i,j} = W_{j,i}$. The threshold function is calculated at each neuron as

\[ V_i(t+1) = \operatorname{sgn}(H_i(t)) = \begin{cases} 1, & \text{if } H_i(t) \ge 0 \\ -1, & \text{otherwise} \end{cases} \]

where,

\[ H_i(t) = \sum_{j=1}^{n} W_{j,i} V_j(t) - T_i \]

Here, Vi (t + 1) represents the value of the function i.e. state value at node i at time (t +1)

(which is the next time instant).

Energy function: The model also associates an energy function, the quadratic form $V^T(t) W V(t)$ (neglecting the threshold value without loss of generality), where $V(t)$ is the column vector of the states of all neurons at time instant $t$. This vector lies on the hypercube whose dimension equals the order of the synaptic weight matrix.


Modes of operation: The Hopfield model can operate in one of two modes, serial or fully parallel, or in a combination of these. In the serial mode, the next state computation (i.e., the evaluation of the threshold function) takes place at a single node at each time instant, node after node. In the fully parallel mode, the evaluation takes place at every node at each time instant. A combination implies that the evaluation occurs at a group of nodes at each time instant.

A stable state is defined as a state such that after reaching it, the network output does

not change i.e., V(t) = sgn(WV(t)).

The model results in the following convergence theorems:

Theorem 1: If the neural network is operating in the serial mode and the elements on the

diagonal of connection matrix are non-negative, the network will converge to a stable

state i.e., there are no cycles in the state space.

Theorem 2 : If the network is operating in the fully parallel mode, the network will either

converge to a stable state or to a cycle of length two i.e., it oscillates between two states in

the state space.
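The two theorems can be illustrated numerically. The following is a minimal sketch, assuming zero thresholds, a small randomly generated symmetric weight matrix with a non-negative diagonal, and the convention sgn(0) = +1 used above; the helper names (`run_serial`, `run_parallel`) are illustrative and not part of the original text.

```python
import numpy as np

def sgn(x):
    # Threshold function of the text: +1 when x >= 0, otherwise -1.
    return np.where(x >= 0, 1, -1)

def run_serial(W, v, max_sweeps=10_000):
    # Serial mode: exactly one node is evaluated at each time instant.
    v = v.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(v)):
            new = 1 if W[i] @ v >= 0 else -1
            if new != v[i]:
                v[i] = new
                changed = True
        if not changed:
            return v          # stable state: v == sgn(W v)
    raise RuntimeError("no stable state found within max_sweeps")

def run_parallel(W, v, max_steps=10_000):
    # Fully parallel mode: every node is evaluated at each time instant.
    # Returns the length of the cycle that the trajectory enters.
    seen = {}
    v = v.copy()
    for t in range(max_steps):
        key = tuple(int(x) for x in v)
        if key in seen:
            return t - seen[key]
        seen[key] = t
        v = sgn(W @ v)
    raise RuntimeError("no cycle found within max_steps")

rng = np.random.default_rng(0)
n = 8
A = rng.integers(-3, 4, size=(n, n))
W = A + A.T                                  # symmetric weight matrix
np.fill_diagonal(W, np.abs(np.diag(W)))      # non-negative diagonal
v0 = sgn(rng.standard_normal(n))

v_stable = run_serial(W, v0)                 # Theorem 1: a stable state
cycle_len = run_parallel(W, v0)              # Theorem 2: cycle length 1 or 2
```

A cycle length of 1 corresponds to a stable state, and a cycle length of 2 to an oscillation between two states.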

Goals: The goals of this chapter are to consider the possibilities of implementing a complex

valued associative memory and observe the behavior of the model in the serial and the

fully parallel modes.

Remark

The concept of a complex valued neural network has been explored since the work of [ZURADA 1996] and has been applied with some success to the fields of image processing and pattern recognition. A collection of papers on the subject appears in [HIROSE 2003]. Following this literature, our work is based on the implementation of a newer method to realize a complex valued neural network.

The chapter is organized into three parts. The first part of section 2 discusses the features

of the model the authors are proposing. Also implications to convergence of the network

are briefly pointed out. The second part of the same section provides a proof technique

used for arguing the convergence properties of the discussed form of the complex valued

associative memory. The third part actually presents the proof of convergence and considers

how it is similar to real valued Hopfield associative memory.

The model that we are about to propose considers the case where the neuron output is complex valued and lies on the complex hypercube. A vector of size $n$ on the complex hypercube has each entry belonging to the set $\{1+\mathrm{j},\ 1-\mathrm{j},\ -1+\mathrm{j},\ -1-\mathrm{j}\}$. Thus, the state vector also lies on the complex hypercube. This is in many ways similar to the Hopfield model.


The synaptic weights are complex valued and the weight matrix is Hermitian, unlike the real valued case, where it is symmetric. The next state $V(t+1)$ can be computed as

\[ V(t+1) = \operatorname{sgn}\big(\operatorname{Re}(W V(t))\big) + \mathrm{j}\,\operatorname{sgn}\big(\operatorname{Im}(W V(t))\big) \tag{6.1} \]

Thus the entries of the column vector $V(t+1)$ are always confined to the set $\{1+\mathrm{j},\ 1-\mathrm{j},\ -1+\mathrm{j},\ -1-\mathrm{j}\}$, unlike the real case, where the values are confined to the set $\{1, -1\}$. The total number of values that $V(t+1)$ can take, i.e., the number of points of the “complex hypercube”, therefore equals $4^n$, where $n$ is the order of the neural network.
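The update rule (6.1) can be sketched in a few lines. This is only an illustration under stated assumptions: the Hermitian weight matrix is randomly generated, thresholds are zero, sgn(0) is taken as +1, and the function names `csgn` and `next_state` are hypothetical.

```python
import numpy as np

def csgn(z):
    # Complex signum of (6.1): sgn is applied separately to the real and
    # imaginary parts, with sgn(0) taken as +1 as in the real valued model.
    re = np.where(z.real >= 0, 1.0, -1.0)
    im = np.where(z.imag >= 0, 1.0, -1.0)
    return re + 1j * im

def next_state(W, v):
    # Fully parallel evaluation of V(t+1) from V(t).
    return csgn(W @ v)

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W = M + M.conj().T                    # Hermitian weight matrix (assumed example)
v = csgn(rng.standard_normal(n) + 1j * rng.standard_normal(n))
v_next = next_state(W, v)

corners = {1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j}
num_states = len(corners) ** n        # 4^n points on the complex hypercube
```

Every entry of `v_next` is one of the four corner values, so the state never leaves the complex hypercube.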

The energy function is thus $E(t) = (V^T(t))^* W V(t)$ (neglecting the threshold value without loss of generality). The authors would like to prove that an important property of this model is that it converges to a stable state when operating in the serial mode and to a cycle of length at most 2 when operating in the fully parallel mode.

The proof technique adopted by the authors is to separate the Hermitian synaptic weight matrix into its real and imaginary parts and to evaluate them separately. When the Hermitian matrix is separated in this way, the matrix corresponding to the real part is real symmetric and the matrix corresponding to the imaginary part is real anti-symmetric.
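The decomposition can be checked numerically. The sketch below assumes a randomly generated Hermitian matrix and a random point on the complex hypercube; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W = M + M.conj().T              # Hermitian: W equals its conjugate transpose

W_R = W.real                    # real part of the weight matrix
W_I = W.imag                    # imaginary part, so that W = W_R + j W_I

signs = rng.choice([-1.0, 1.0], size=(2, n))
v = signs[0] + 1j * signs[1]    # a point on the complex hypercube

# np.vdot conjugates its first argument, giving (V^T)^* W V.
E = np.vdot(v, W @ v)
E_R = np.vdot(v, W_R @ v)           # contribution of the real part
E_I = np.vdot(v, (1j * W_I) @ v)    # contribution of the imaginary part
```

Because W is Hermitian, the energy `E` is real, and it splits into the two contributions `E_R` and `E_I` that the proof treats separately.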

Remark

It is an interesting observation that the energy function, when evaluated for the real part of the weight matrix with the complex valued vector, behaves exactly as it would for the real valued neural network proposed in [HOPFIELD]. That is, we have a complex valued associative memory with a real connection matrix. The exact details of the proof follow.

Before delving into the proof we summarize convergence issues.

1. Given a neural network N = [W,T] with the synaptic weights in the weight matrix

being complex and the matrix itself being Hermitian and the calculation of the

algebraic threshold function generating a complex number, the network always

converges to a stable state when operating in the serial mode.

2. The same network when operating in the parallel mode either converges to a

stable state or to a cycle of length (maximum) two.

Generalized proof of convergence of an arbitrary neural net in the serial mode.

From the proof technique discussed in section (II) the proof follows.

Ek (t) is the evaluation of the energy function value at node k at time instant t.


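\[ E_k(t) = \begin{pmatrix} V_1^*(t) & \cdots & V_k^*(t) & \cdots & V_n^*(t) \end{pmatrix} \begin{pmatrix} W_{11} & \cdots & W_{1n} \\ \vdots & \ddots & \vdots \\ W_{n1} & \cdots & W_{nn} \end{pmatrix} \begin{pmatrix} V_1(t) \\ \vdots \\ V_k(t) \\ \vdots \\ V_n(t) \end{pmatrix} \]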

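If we break the expression for $E_k(t)$ into two parts, $E_{kR}(t)$ and $E_{kI}(t)$, the contributions of the real and imaginary parts of the weight matrix to the energy function, they come out like this (the upright $\mathrm{j}$ denotes the imaginary unit):

\[ E_{kR}(t) = \begin{pmatrix} V_1^*(t) & \cdots & V_k^*(t) & \cdots & V_n^*(t) \end{pmatrix} \begin{pmatrix} W_{11R} & \cdots & W_{1nR} \\ \vdots & \ddots & \vdots \\ W_{1nR} & \cdots & W_{nnR} \end{pmatrix} \begin{pmatrix} V_1(t) \\ \vdots \\ V_k(t) \\ \vdots \\ V_n(t) \end{pmatrix} \tag{6.2} \]

\[ E_{kI}(t) = \begin{pmatrix} V_1^*(t) & \cdots & V_k^*(t) & \cdots & V_n^*(t) \end{pmatrix} \begin{pmatrix} 0 & \cdots & \mathrm{j}W_{1nI} \\ \vdots & \ddots & \vdots \\ -\mathrm{j}W_{1nI} & \cdots & 0 \end{pmatrix} \begin{pmatrix} V_1(t) \\ \vdots \\ V_k(t) \\ \vdots \\ V_n(t) \end{pmatrix} \]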

Expanding the real part (6.2) of the energy function, we have,

\[ \begin{aligned} E_{kR}(t) = {} & \sum_{i=1}^{k-1} V_i^*(t) \Big[ \sum_{j=1}^{k-1} W_{ijR} V_j(t) + W_{ikR} V_k(t) + \sum_{j=k+1}^{n} W_{ijR} V_j(t) \Big] \\ & + V_k^*(t) \Big[ \sum_{j=1}^{k-1} W_{kjR} V_j(t) + W_{kkR} V_k(t) + \sum_{j=k+1}^{n} W_{kjR} V_j(t) \Big] \\ & + \sum_{i=k+1}^{n} V_i^*(t) \Big[ \sum_{j=1}^{k-1} W_{ijR} V_j(t) + W_{ikR} V_k(t) + \sum_{j=k+1}^{n} W_{ijR} V_j(t) \Big] \end{aligned} \]

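Similarly,

\[ E_{kR}(t+1) = \begin{pmatrix} V_1^*(t) & \cdots & V_k^*(t+1) & \cdots & V_n^*(t) \end{pmatrix} \begin{pmatrix} W_{11R} & \cdots & W_{1nR} \\ \vdots & \ddots & \vdots \\ W_{n1R} & \cdots & W_{nnR} \end{pmatrix} \begin{pmatrix} V_1(t) \\ \vdots \\ V_k(t+1) \\ \vdots \\ V_n(t) \end{pmatrix} \tag{6.3} \]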


The expression for Ek R(t+1) results because it is operating in the serial mode and the

updating of the function value takes place at only one node, i.e., the node at which we are

evaluating(Vk ). In the parallel mode, however, all the function values in the vector will be

updated.

\[ \begin{aligned} E_{kR}(t+1) = {} & \sum_{i=1}^{k-1} V_i^*(t) \Big[ \sum_{j=1}^{k-1} W_{ijR} V_j(t) + W_{ikR} V_k(t+1) + \sum_{j=k+1}^{n} W_{ijR} V_j(t) \Big] \\ & + V_k^*(t+1) \Big[ \sum_{j=1}^{k-1} W_{kjR} V_j(t) + W_{kkR} V_k(t+1) + \sum_{j=k+1}^{n} W_{kjR} V_j(t) \Big] \\ & + \sum_{i=k+1}^{n} V_i^*(t) \Big[ \sum_{j=1}^{k-1} W_{ijR} V_j(t) + W_{ikR} V_k(t+1) + \sum_{j=k+1}^{n} W_{ijR} V_j(t) \Big] \end{aligned} \]

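Let $\Delta E_{kR}(t) = E_{kR}(t+1) - E_{kR}(t)$ and $\Delta V_k(t) = V_k(t+1) - V_k(t)$, with real and imaginary parts $\Delta V_{kR}(t)$ and $\Delta V_{kI}(t)$. Subtracting the two expansions,

\[ \begin{aligned} \Delta E_{kR}(t) = {} & \sum_{j=1,\, j \ne k}^{n} W_{kjR} \big( V_k^*(t+1) V_j(t) - V_k^*(t) V_j(t) \big) + \sum_{j=1,\, j \ne k}^{n} W_{jkR} \big( V_k(t+1) V_j^*(t) - V_k(t) V_j^*(t) \big) \\ & + W_{kkR} \big( (V_{kR}^2(t+1) + V_{kI}^2(t+1)) - (V_{kR}^2(t) + V_{kI}^2(t)) \big) \end{aligned} \]

But since the real part of the weight matrix is symmetric, $W_{kjR} = W_{jkR}$.

Thus,

\[ \begin{aligned} \Delta E_{kR}(t) = {} & \sum_{j=1,\, j \ne k}^{n} W_{kjR} \big( V_k^*(t+1) V_j(t) - V_k^*(t) V_j(t) + V_k(t+1) V_j^*(t) - V_k(t) V_j^*(t) \big) \\ & + W_{kkR} \big( (V_{kR}^2(t+1) + V_{kI}^2(t+1)) - (V_{kR}^2(t) + V_{kI}^2(t)) \big) \\ = {} & \sum_{j=1,\, j \ne k}^{n} W_{kjR} \big( V_j(t) \Delta V_k^*(t) + V_j^*(t) \Delta V_k(t) \big) \\ & + W_{kkR} \big( \Delta V_{kR}(t) (V_{kR}(t+1) + V_{kR}(t)) + \Delta V_{kI}(t) (V_{kI}(t+1) + V_{kI}(t)) \big) \\ = {} & \sum_{j=1,\, j \ne k}^{n} W_{kjR} \big( V_j(t) \Delta V_{kR}(t) - \mathrm{j} V_j(t) \Delta V_{kI}(t) + V_j^*(t) \Delta V_{kR}(t) + \mathrm{j} V_j^*(t) \Delta V_{kI}(t) \big) \\ & + W_{kkR} \big( \Delta V_{kR}(t) (V_{kR}(t+1) + V_{kR}(t)) + \Delta V_{kI}(t) (V_{kI}(t+1) + V_{kI}(t)) \big) \\ = {} & \sum_{j=1,\, j \ne k}^{n} 2 W_{kjR} \big( V_{jR}(t) \Delta V_{kR}(t) + V_{jI}(t) \Delta V_{kI}(t) \big) \\ & + W_{kkR} \big( \Delta V_{kR}(t) (V_{kR}(t+1) + V_{kR}(t)) + \Delta V_{kI}(t) (V_{kI}(t+1) + V_{kI}(t)) \big) \end{aligned} \]

Writing $V_{kR}(t+1) = V_{kR}(t) + \Delta V_{kR}(t)$ and $V_{kI}(t+1) = V_{kI}(t) + \Delta V_{kI}(t)$ in the diagonal term, which contributes $W_{kkR} \Delta V_{kR}(t) V_{kR}(t) + W_{kkR} \Delta V_{kI}(t) V_{kI}(t)$ twice together with the squared increments,

we get,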

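\[ \begin{aligned} \Delta E_{kR}(t) = {} & 2 \Big[ W_{kkR} V_{kR}(t) + \sum_{j=1,\, j \ne k}^{n} W_{kjR} V_{jR}(t) \Big] \Delta V_{kR}(t) + W_{kkR} \Delta V_{kR}^2(t) \\ & + 2 \Big[ W_{kkR} V_{kI}(t) + \sum_{j=1,\, j \ne k}^{n} W_{kjR} V_{jI}(t) \Big] \Delta V_{kI}(t) + W_{kkR} \Delta V_{kI}^2(t) \end{aligned} \tag{6.4} \]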
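Expression (6.4) can be verified numerically for a single-node update. The following sketch assumes a randomly generated real symmetric matrix for the real part of the weights and an arbitrary change at node k; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 2
S = rng.standard_normal((n, n))
W_R = S + S.T                            # real symmetric part of the weights

v = rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)
v_new = v.copy()
v_new[k] = -v[k]                         # node k changes, all other nodes stay

def energy_R(u):
    # Real-part energy (V^T)^* W_R V, which is real for symmetric W_R.
    return np.vdot(u, W_R @ u).real

dV = v_new[k] - v[k]
H_re = W_R[k] @ v.real                   # W_kkR V_kR + sum_{j != k} W_kjR V_jR
H_im = W_R[k] @ v.imag                   # W_kkR V_kI + sum_{j != k} W_kjR V_jI

# Right-hand side of (6.4) and the directly computed energy difference.
rhs = (2 * H_re * dV.real + W_R[k, k] * dV.real ** 2
       + 2 * H_im * dV.imag + W_R[k, k] * dV.imag ** 2)
lhs = energy_R(v_new) - energy_R(v)
```

The two quantities `lhs` and `rhs` agree up to floating point error, which confirms that (6.4) is two copies of the real Hopfield expression, one for the real parts and one for the imaginary parts of the state.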

If we regard $W_{kkR} V_{kR}(t) + \sum_{j=1,\, j \ne k}^{n} W_{kjR} V_{jR}(t)$ and $W_{kkR} V_{kI}(t) + \sum_{j=1,\, j \ne k}^{n} W_{kjR} V_{jI}(t)$ as expressions for $H_k(t)$ in the real mode with some arbitrary $V_{jR}(t)$ and $V_{jI}(t)$, then each half of (6.4) has the form $\Delta E = 2 H_k \Delta V_k(t) + W_{kk} \Delta V_k^2(t)$ of the real valued case. From the Hopfield convergence theorem, this is a value that eventually goes to zero, which means that $E_k(t)$ converges to a local maximum. Hence $\Delta E_{kR}(t)$ in the complex case also reaches zero, and the corresponding energy also converges to a local maximum.

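Thus it remains to evaluate the imaginary part contribution to the energy function, i.e. (with the upright $\mathrm{j}$ denoting the imaginary unit)

\[ E_{kI}(t) = \begin{pmatrix} V_1^*(t) & \cdots & V_k^*(t) & \cdots & V_n^*(t) \end{pmatrix} \begin{pmatrix} 0 & \cdots & \mathrm{j}W_{1nI} \\ \vdots & \ddots & \vdots \\ -\mathrm{j}W_{1nI} & \cdots & 0 \end{pmatrix} \begin{pmatrix} V_1(t) \\ \vdots \\ V_k(t) \\ \vdots \\ V_n(t) \end{pmatrix} \]

\[ \begin{aligned} E_{kI}(t) = {} & \sum_{i=1}^{k-1} V_i^*(t) \Big[ \sum_{j=1,\, j \ne i}^{k-1} \mathrm{j}W_{ijI} V_j(t) + \mathrm{j}W_{ikI} V_k(t) + \sum_{j=k+1}^{n} \mathrm{j}W_{ijI} V_j(t) \Big] \\ & + V_k^*(t) \Big[ \sum_{j=1}^{k-1} \mathrm{j}W_{kjI} V_j(t) + \sum_{j=k+1}^{n} \mathrm{j}W_{kjI} V_j(t) \Big] \\ & + \sum_{i=k+1}^{n} V_i^*(t) \Big[ \sum_{j=1}^{k-1} \mathrm{j}W_{ijI} V_j(t) + \mathrm{j}W_{ikI} V_k(t) + \sum_{j=k+1,\, j \ne i}^{n} \mathrm{j}W_{ijI} V_j(t) \Big] \end{aligned} \]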

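and similarly,

\[ E_{kI}(t+1) = \begin{pmatrix} V_1^*(t) & \cdots & V_k^*(t+1) & \cdots & V_n^*(t) \end{pmatrix} \begin{pmatrix} 0 & \cdots & \mathrm{j}W_{1nI} \\ \vdots & \ddots & \vdots \\ -\mathrm{j}W_{1nI} & \cdots & 0 \end{pmatrix} \begin{pmatrix} V_1(t) \\ \vdots \\ V_k(t+1) \\ \vdots \\ V_n(t) \end{pmatrix} \]

\[ \begin{aligned} E_{kI}(t+1) = {} & \sum_{i=1}^{k-1} V_i^*(t) \Big[ \sum_{j=1,\, j \ne i}^{k-1} \mathrm{j}W_{ijI} V_j(t) + \mathrm{j}W_{ikI} V_k(t+1) + \sum_{j=k+1}^{n} \mathrm{j}W_{ijI} V_j(t) \Big] \\ & + V_k^*(t+1) \Big[ \sum_{j=1}^{k-1} \mathrm{j}W_{kjI} V_j(t) + \sum_{j=k+1}^{n} \mathrm{j}W_{kjI} V_j(t) \Big] \\ & + \sum_{i=k+1}^{n} V_i^*(t) \Big[ \sum_{j=1}^{k-1} \mathrm{j}W_{ijI} V_j(t) + \mathrm{j}W_{ikI} V_k(t+1) + \sum_{j=k+1,\, j \ne i}^{n} \mathrm{j}W_{ijI} V_j(t) \Big] \end{aligned} \]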

\[ \Delta E_{kI}(t) = 2 \sum_{j=1,\, j \ne k}^{n} \big( W_{kjI} V_{jR}(t) \Delta V_{kI} - W_{kjI} V_{jI}(t) \Delta V_{kR} \big) \tag{6.5} \]

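Thus, adding (6.4) and (6.5),

\[ \begin{aligned} \Delta E_k(t) = {} & 2 \Big[ W_{kkR} V_{kR}(t) + \sum_{j=1,\, j \ne k}^{n} W_{kjR} V_{jR}(t) \Big] \Delta V_{kR}(t) + W_{kkR} \Delta V_{kR}^2(t) \\ & + 2 H_{kB}\, \Delta V_{kI}(t) + W_{kkR} \Delta V_{kI}^2(t) \\ & + 2 \sum_{j=1,\, j \ne k}^{n} \big( W_{kjI} V_{jR}(t) \Delta V_{kI} - W_{kjI} V_{jI}(t) \Delta V_{kR} \big) \end{aligned} \tag{6.6} \]

where $H_{kB} = W_{kkR} V_{kI}(t) + \sum_{j=1,\, j \ne k}^{n} W_{kjR} V_{jI}(t)$ is the factor multiplying $\Delta V_{kI}(t)$ in (6.4).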

This is the expression for ∆Ek (t) in the complex valued neural network.
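As a sanity check, the sum of the real-part terms (6.4) and the imaginary-part terms (6.5) can be compared with the directly computed change of the full energy. This is a sketch under assumptions: the Hermitian matrix is random, a single node k is changed, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 6, 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W = M + M.conj().T                       # Hermitian weight matrix
W_R, W_I = W.real, W.imag

v = rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)
v_new = v.copy()
v_new[k] = -v[k]                         # only node k changes (serial mode)

dR = (v_new[k] - v[k]).real              # Delta V_kR
dI = (v_new[k] - v[k]).imag              # Delta V_kI

# Terms coming from the real part of the weights, as in (6.4).
real_terms = (2 * (W_R[k] @ v.real) * dR + W_R[k, k] * dR ** 2
              + 2 * (W_R[k] @ v.imag) * dI + W_R[k, k] * dI ** 2)

# Terms coming from the imaginary part of the weights, as in (6.5).
imag_terms = 2 * sum(W_I[k, j] * (v[j].real * dI - v[j].imag * dR)
                     for j in range(n) if j != k)

# Directly computed change of the full energy (V^T)^* W V.
delta_E = np.vdot(v_new, W @ v_new).real - np.vdot(v, W @ v).real
```

Here `delta_E` equals `real_terms + imag_terms` up to rounding, so the split of the energy change into the two groups of terms is exact for a single-node update.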


As one can observe from the above expression, the terms involving the real part of the weight matrix are zero when the neural net converges to a stable state (from [BRUCK 1987]). The sum involving the imaginary part of the weight matrix is real but may take negative values, depending on the imaginary parts of the corresponding entries of the weight matrix. But when the net converges to a stable state, $\Delta V_{kR}$ and $\Delta V_{kI}$ are zero, hence this sum is also zero at the stable state. This means that the energy function of a complex valued neural net converges to a positive value. It also shows that the complex valued associative memory constrained to a real connection matrix converges to a stable state with a behavior that matches the real valued neural net described by Hopfield.

Graphs of the convergence of the energy function to a stable value and of ∆E to zero are depicted below. They show the relative performance of three cases: (i) a complex synaptic weight matrix with complex vectors, (ii) the real valued neural network, and (iii) an intermediate case, i.e., complex vectors on a real valued synaptic weight matrix.

[Figure: Convergence of Energy function. Energy function value E versus time t for three cases: complex weights with complex vectors, real weights with complex vectors, and real weights with real vectors.]

[Figure: Energy difference versus time t for the same three cases: complex weights with complex vectors, real weights with complex vectors, and real weights with real vectors.]


Thus, as we see from the figures above, in the first graph the energy function value is greatest for the case where the weights and the vectors are both complex. That is because the imaginary part of the weight matrix is non-zero and contributes to the energy value. Next comes the case where the weights are real but the vectors are complex. In this case, though no imaginary part of the weights contributes, the imaginary part of the vectors contributes to the increase in the energy value over the original Hopfield case, which in turn converges to the lowest energy value.

It remains to show that this analogy between the complex and the real cases of the Hopfield associative memory in the serial mode also extends to the fully parallel mode. A separate proof is not required to establish the behavior of the model in the fully parallel mode: the same proof can be extended in the manner shown in the work of [BRUCK 1987].

Before going into the extension needed, let the general form of the expression for the

fully parallel mode be observed first.

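\[ E_{kR}(t+1) = \begin{pmatrix} V_1^*(t+1) & \cdots & V_k^*(t+1) & \cdots & V_n^*(t+1) \end{pmatrix} \begin{pmatrix} W_{11R} & \cdots & W_{1nR} \\ \vdots & \ddots & \vdots \\ W_{n1R} & \cdots & W_{nnR} \end{pmatrix} \begin{pmatrix} V_1(t+1) \\ \vdots \\ V_k(t+1) \\ \vdots \\ V_n(t+1) \end{pmatrix} \]

\[ E_{kI}(t+1) = \begin{pmatrix} V_1^*(t+1) & \cdots & V_k^*(t+1) & \cdots & V_n^*(t+1) \end{pmatrix} \, \mathrm{j} \begin{pmatrix} W_{11I} & \cdots & W_{1nI} \\ \vdots & \ddots & \vdots \\ W_{n1I} & \cdots & W_{nnI} \end{pmatrix} \begin{pmatrix} V_1(t+1) \\ \vdots \\ V_k(t+1) \\ \vdots \\ V_n(t+1) \end{pmatrix} \]

where, as before, $\mathrm{j}$ is the imaginary unit and the imaginary part of the weight matrix is anti-symmetric, so that $W_{iiI} = 0$ and $W_{ijI} = -W_{jiI}$.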

These expressions change because the computation of the function is done at every

node of the neural net at a certain time instant.

Instead of evaluating the above expressions, an easier method is to define a special neural net N’ such that,

\[ N' = [W', T'] \]

where

\[ W' = \begin{pmatrix} 0 & W \\ W & 0 \end{pmatrix} \quad \text{and} \quad T' = \begin{pmatrix} T \\ T \end{pmatrix} \]


As can be seen from the above definition, N’ is a new neural net which corresponds to a bipartite graph with 2n nodes. Let the two subsets of nodes be P1 and P2, which are independent sets of nodes.

It has been proved in previous work that

1. for any serial mode of operation in N there exists an equivalent serial mode of operation in N’, provided W has a non-negative diagonal.

2. there exists a serial mode of operation in N’ which is equivalent to a fully parallel mode of operation in N.

This can be seen because if N is operating in a fully parallel mode, since P1 and P2

correspond to independent sets of nodes, it would be equivalent to evaluating one node at

a time in N’ which is nothing but the serial mode.

Now, since N’ is operating in a serial mode, when N’ reaches a stable state one of the

following things happen.

1. The current state of operation of both the partitions P1 and P2 that correspond to

N’ may be the same which means that both P1 and P2 converge to a stable state.

2. The current state of operation of both P1 and P2 are distinct which implies N

will oscillate between the two states at which P1 and P2 are existing currently,

thus converging to a cycle of length two.
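The construction can be tried out on the real valued model. The sketch below is an illustration only: it assumes zero thresholds, a random symmetric integer weight matrix W, and the convention sgn(0) = +1. It builds W’ for N’, runs N’ in the serial mode until it is stable, and then reads off the two partitions.

```python
import numpy as np

def sgn(x):
    # Threshold function with sgn(0) taken as +1.
    return np.where(x >= 0, 1, -1)

rng = np.random.default_rng(5)
n = 6
A = rng.integers(-3, 4, size=(n, n))
W = A + A.T                                   # symmetric weights of N
Z = np.zeros_like(W)
W_prime = np.block([[Z, W], [W, Z]])          # weights of N' (2n nodes)

v0 = sgn(rng.standard_normal(n))
state = np.concatenate([v0, v0])              # nodes of P1 followed by nodes of P2

# Serial mode in N': one node per time instant, sweeping P1 and then P2.
for _ in range(10_000):
    changed = False
    for i in range(2 * n):
        new = 1 if W_prime[i] @ state >= 0 else -1
        if new != state[i]:
            state[i] = new
            changed = True
    if not changed:
        break
else:
    raise RuntimeError("N' did not reach a stable state")

p1, p2 = state[:n], state[n:]
is_stable_state_of_N = bool(np.array_equal(p1, p2))
```

At the stable state of N’, a fully parallel step of N maps `p2` to `p1` and `p1` to `p2`. If the two partitions agree, N has reached a stable state; otherwise N oscillates between `p1` and `p2`, a cycle of length two.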

The graphs which depict the operation in the parallel mode are shown below.

[Figure: Energy function E versus time t in the fully parallel mode.]

[Figure: Energy difference Delta E versus time t between consecutive states as the energy oscillates.]

The above graphs depict the oscillation of the value of the energy function for a particular case and that of the value of ∆E as it oscillates about zero.

6.4 CONCLUSIONS

From the above discussion one can observe that evaluating the signum function at each node, and thereby determining the state vector for the next time instant, makes the complex valued neural network similar in behavior to the real valued one. However, the designs of neural networks proposed so far ([ZURADA 1996], [ZURADA 2003] and [HIROSE 2003]) have not considered the above mentioned method of performing the complex signum function. Since the network with this function converges just as the real valued one does, it can be conveniently applied to applications such as image processing and pattern recognition.


REFERENCES

[1]. [HOPFIELD 1982] J. J. Hopfield and D. W. Tank, “Neural Computations of Decisions in Optimization Problems”, Proc. Nat. Acad. Sci. USA, Vol. 79, pp. 2554-2558, 1982.

[2]. [BRUCK 1987] Jehoshua Bruck and Joseph W. Goodman, “A Generalized Convergence Theorem for Neural Networks”, IEEE First Conference on Neural Networks, San Diego, CA, June 1987.

[3]. [ZURADA 1996] Stanislaw Jankowski, Andrzej Lozowski and Jacek M. Zurada, “Complex-valued Multistate Neural Associative Memory”, IEEE Transactions on Neural Networks, Vol. 7, No. 6, November 1996.

[4]. [ZURADA 2003] Mehmet Kerem Muezzinoglu, Cuneyt Guzelis and Jacek M. Zurada, “A New Design Method for the Complex-valued Multistate Hopfield Associative Memory”, IEEE Transactions on Neural Networks, Vol. 14, No. 4, July 2003.

[5]. [HIROSE 2003] Akira Hirose, “Complex Valued Neural Networks: Theories and Applications”, World Scientific Publishing Co, November 2003.

[6]. G. Rama Murthy and D. Praveen, “Complex valued Neural Associative Memory on the Complex Hypercube”, Proceedings of 2004 IEEE Conference on Cybernetics and Intelligent Systems (CIS 2004).


CHAPTER

7

Optimal Binary Filters:

Neural Networks

7.1 INTRODUCTION

is corrupted by noise. A natural important practical/theoretical problem is filtering. Thus it

is also desirable to design an optimal filter. In many traditional approaches, the criterion of optimality is the minimization of the mean square error between the actual output and the estimated output. Based on this criterion, Wiener formulated the problem and discovered the optimal filter. This filter is derived from a transfer function description. Kalman formulated the problem based on the state space description of a linear system, and in his case the filter is recursive.

Independent of the research in system theory, Shannon formalized information theory

by formulating the notion of block codes for noisy communication channels. The relationship between the system theory approach and the coding theory approach has been seriously investigated by only a few researchers. In [2] the problem of optimal signal design for linear systems/channels

was formulated and solved. It is shown in this chapter that investigating the relationship

between system theory approach and coding theory approach leads to the formulation

and solution of a new optimal filtering problem.

In section 2 an optimal signal design problem is formulated and solved. It forms the basis of the optimal filtering problem and its solution, which are discussed in section 3. Finally, section 4 concludes the chapter.

Our essential goal is to formulate and solve an optimal filtering problem. But the problem

is related to optimal signal design problem, which is formulated and solved in the following.

The signal design problem is an open research problem formulated by A.Wyner, in

continuous time [8].


Find an admissible sequence of input signals (possibly vectors) $u_k$, $k = 0, 1, 2, \ldots, k_f - 1$, i.e. a sequence in which the $i$th component of each vector is bounded in amplitude by one, $|u_k^i| \le 1$, in order to minimize

\[ -\frac{1}{2} \sum_{k=0}^{k_f - 1} X_k^T C^T C X_k \]

subject to

\[ X_{k+1} = A X_k + B u_k, \quad X(0) = 0 \]

\[ Y_k = C X_k \]

Here A is n × n matrix, B is n × p matrix, C is an m × n matrix. Single input, single output

(SISO) as well as multi-input, multi-output (MIMO) channels are considered. Let the m ×

p impulse response matrix [6] be denoted by h(.).

7.2.2 Solution

The optimal control vectors which maximize the total output energy of a linear discrete time filter over a finite horizon $[0, k_f]$ are given by

\[ u_k^* = \operatorname{sign}\Big( \sum_{j} R_{kj}\, u_j \Big) \]

where

\[ R_{kj} = \sum_{i=\max\{1,\, j-k-1\}}^{k_f - k - 1} h^T(i+1)\, h(k+i+1-j) \]

and $u_k^*$ is the optimal choice. The condition above is a necessary condition on the optimum input signal/control. The stable states of a neural network constitute the locally optimal control vectors, and the global optimum stable state provides the globally optimal control vector [2].
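The link between the sign condition and the stable states of a neural network can be sketched as follows. This is an illustration under assumptions, not the derivation of [2]: the system matrices are hypothetical, X(0) = 0, and instead of the kernel $R_{kj}$ the sketch uses the Gram matrix Q of the input to output map, so that the total output energy is the quadratic form $u^T Q u$. Serial sign updates then act as a Hopfield network with weight matrix Q.

```python
import numpy as np

def output_map(A, B, C, kf):
    # Block matrix G such that the stacked outputs Y_1..Y_kf equal G @ u
    # for the stacked inputs u_0..u_{kf-1}, assuming X(0) = 0.
    m, p = C.shape[0], B.shape[1]
    G = np.zeros((kf * m, kf * p))
    for k in range(1, kf + 1):                # output time
        for j in range(k):                    # input time, j < k
            h = C @ np.linalg.matrix_power(A, k - 1 - j) @ B
            G[(k - 1) * m:k * m, j * p:(j + 1) * p] = h
    return G

# Hypothetical single input, single output example system.
A = np.array([[0.9, 0.2], [-0.1, 0.7]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
kf = 6

G = output_map(A, B, C, kf)
Q = G.T @ G                                   # output energy equals u^T Q u

rng = np.random.default_rng(6)
u0 = rng.choice([-1.0, 1.0], kf * B.shape[1])
u = u0.copy()

# Serial mode: one input component is updated at a time until nothing changes.
for _ in range(10_000):
    changed = False
    for i in range(len(u)):
        new = 1.0 if Q[i] @ u >= 0 else -1.0
        if new != u[i]:
            u[i] = new
            changed = True
    if not changed:
        break

energy_start = float(u0 @ Q @ u0)
energy_end = float(u @ Q @ u)
```

The final `u` satisfies `u = sign(Q u)`, i.e. it is a stable state, and its output energy is at least that of the starting sequence. It is a local optimum only; finding the global optimum in general requires searching over the stable states.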

Proof : Discrete Maximal Principle well known in literature is utilized to provide the

solution. Consider a one-dimensional linear dynamical system. Its state space description

is given by

X(k +1)= A(k) X(k) + B(k)u(k),

X(0)=X 0 (7.1)

It should be noted that we are purposefully considering a linear time varying system.

In the above state space description of the linear system, A(k ) is an ‘n × n’ matrix (a second


order tensor) and X(k ) is the state vector (first order tensor) i.e. an ‘n × 1’ vector. C(k ) is an

‘m × n’ matrix, B(k ) is an ‘n × p ’ matrix.

In the case of certain multidimensional linear systems as well as infinite dimensional

linear systems, the state transition tensor [7], the state tensor, the input tensor [5], the

output tensor are of compatible dimension as well as order. The discrete time dynamical

system evolution (linear or non-linear) is described through tensors [7]. The inner product

between the linear operators is carried out with the standard method. To restrict oneself

to the problem considered in the present chapter, we return to the one-dimensional system

keeping in mind that the authors already made the extension to multi/infinite

dimensional systems [7].

The input sequence satisfies constraints of the following form: $|u_k^i| \le 1$, where $u_k^i$ is the ith component of the input vector. Thus, $U_k \in V$, a subset of $R^r$.

The cost function J is given below. The problem is to find an admissible sequence $\hat{u}_k$, k = 0, 1, ..., k_f − 1, that satisfies the constraints and minimizes the objective function J.

$$J = -\frac{1}{2}\sum_{k=0}^{k_f-1} X_k^T C^T(k)\,C(k)\,X_k \;-\; \frac{1}{2}\,X_{k_f}^T C^T(k_f)\,C(k_f)\,X_{k_f} \tag{7.3}$$

response of (7.1) uniquely specified by u. Then, under reasonable assumptions, there exists

a non-trivial function satisfying

$$\lambda_k = \frac{\partial H(x_k, u_k, \lambda_{k+1}, k)}{\partial x_k} \tag{7.4}$$

where the Hamiltonian is given by

$$H(x_k, u_k, \lambda_{k+1}, k) = -\frac{1}{2}\,X_k^T C^T(k)\,C(k)\,X_k + \lambda_{k+1}^T\,[A(k)\,X_k + B(k)\,u_k] \tag{7.5}$$

Thus, the adjoint vector equation is given by

$$\lambda_k = -C^T(k)\,C(k)\,X_k + A^T(k)\,\lambda_{k+1} \tag{7.6}$$

Since the terminal state is unspecified, we have

$$\lambda(k_f) = -C^T(k_f)\,C(k_f)\,X(k_f) \tag{7.7}$$

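This provides the terminal condition for solving (7.6). Since the input is constrained, it must necessarily satisfy

$$H(x_k, u_k^*, \lambda_{k+1}, k) = \min_{v \in V} H(x_k, v, \lambda_{k+1}, k) \tag{7.8}$$

Thus,

$$u_k^* = -\mathrm{Sign}\{B^T(k)\,\lambda_{k+1}\} \tag{7.9}$$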

In most textbooks [6] and references for optimal control /signal design (7.9) is derived

as a necessary condition. We make the following detailed derivation, to expose the structure

of optimal control and its relationship to stable states of a neural network.

Solving (7.6) for $\lambda_{k+1}$ and substituting in (7.9), we arrive at the optimal control sequence. It is immediate that if V is a convex polytope, then we have a mathematical programming problem. Our chief contribution is the following derivation.

With the terminal state specified, the equation (7.6) is recursed backwards to arrive at

the optimal vector (optimal control tensor in the multi–dimensional case). Thus, an efficient

computational form for solving the two-point boundary value problem [9] is derived in

the following.

It should be noted that we derive the expression for $\lambda_{k+1}$ in the case of a linear time-varying dynamical system.

Starting with the terminal condition and recursing backwards, with Y(k) = C(k)X(k),

$$\begin{aligned}
\lambda_{k_f-1} &= -C^T(k_f-1)\,C(k_f-1)\,X(k_f-1) + A^T(k_f-1)\,\lambda_{k_f} \\
&= -C^T(k_f-1)\,Y(k_f-1) - A^T(k_f-1)\,C^T(k_f)\,Y(k_f) \\
\lambda_{k_f-2} &= -C^T(k_f-2)\,C(k_f-2)\,X(k_f-2) \\
&\quad - A^T(k_f-2)\,C^T(k_f-1)\,Y(k_f-1) \\
&\quad - A^T(k_f-2)\,A^T(k_f-1)\,C^T(k_f)\,Y(k_f) \\
\lambda_{k_f-3} &= -C^T(k_f-3)\,Y(k_f-3) \\
&\quad - A^T(k_f-3)\,C^T(k_f-2)\,Y(k_f-2) \\
&\quad - A^T(k_f-3)\,A^T(k_f-2)\,C^T(k_f-1)\,Y(k_f-1) \\
&\quad - A^T(k_f-3)\,A^T(k_f-2)\,A^T(k_f-1)\,C^T(k_f)\,Y(k_f)
\end{aligned}$$

Continuing the pattern, for linear time-invariant filters we have

$$\lambda_{k_f-l} = -C^T Y(k_f-l) - A^T C^T Y(k_f-l+1) - (A^T)^2 C^T Y(k_f-l+2) - (A^T)^3 C^T Y(k_f-l+3) - \cdots - (A^T)^l C^T Y(k_f) \tag{7.11}$$

Hence, with l = k_f − k − 1,

$$\lambda_{k+1} = -C^T Y(k+1) - A^T C^T Y(k+2) - (A^T)^2 C^T Y(k+3) - \cdots - (A^T)^{l} C^T Y(k+l+1) \tag{7.12}$$

Substituting in (7.9),

$$u_k^* = \mathrm{Sign}\Big(\sum_{i=0}^{l} B^T (A^T)^i C^T\,Y(k+i+1)\Big) \tag{7.13}$$

i.e.

$$u_k^* = \mathrm{Sign}\Big(\sum_{i=0}^{l} h^T(i+1)\,Y(k+i+1)\Big) \tag{7.14}$$

Expressing the output through the impulse response,

$$\sum_{i=0}^{l} h^T(i+1)\,Y(k+i+1) = \sum_{i=0}^{l} h^T(i+1) \sum_{j=0}^{k+i+1} h(k+i+1-j)\,u(j) \tag{7.15}$$

Exchanging the order of summation with the help of the grid in Figure 7.1, we have

[Figure 7.1: grid of summation indices used to exchange the order of summation; one axis is marked 0, 1, 2, ..., k_f − k − 1 and the other is marked k + 1 and k_f.]

$$u_k^* = \mathrm{Sign}\Big(\sum_{j=0}^{k_f}\;\sum_{i=\max\{1,\,j-k-1\}}^{k_f-k-1} [h^T(i+1)\,h(k+i+1-j)]\,u(j)\Big) \tag{7.16}$$

Let us define

$$R_{kj} = \sum_{i=\max\{1,\,j-k-1\}}^{k_f-k-1} h^T(i+1)\,h(k+i+1-j) \tag{7.17}$$

Thus,

$$u_k^* = \mathrm{Sign}\Big(\sum_{j=0}^{k_f} R_{kj}\,u(j)\Big) \tag{7.18}$$

In the case of linear time-varying systems, the above derivation still applies. The optimal control vector over a finite horizon of a time-varying linear system has an expression of the following form:

$$u_k^* = \mathrm{Sign}\Big\{\sum_{j} S_{kj}\,u(j)\Big\}$$

where, Skj is the energy density matrix of time varying linear system. This can be stated as

a theorem. The authors derived similar results for multidimensional and infinite

dimensional systems[2].
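The fixed-point condition in (7.18) can be explored numerically. The following is a minimal sketch for a SISO filter, under stated assumptions: instead of building R_kj from (7.17), it uses the symmetric matrix W = HᵀH, where H is the finite-horizon convolution matrix of the impulse response, so that uᵀWu is exactly the output energy. The function names, the tie rule for a zero field and the serial update order are illustrative choices, not taken from the text.

```python
def convolution_matrix(h, n):
    """Matrix H with y = H u, where y(t) = sum_j h(t - j) u(j) for an input of length n."""
    rows = n + len(h) - 1
    return [[h[t - j] if 0 <= t - j < len(h) else 0.0 for j in range(n)]
            for t in range(rows)]


def energy_matrix(h, n):
    """Assumed stand-in for the energy density matrix: W = H^T H (symmetric)."""
    H = convolution_matrix(h, n)
    return [[sum(row[k] * row[j] for row in H) for j in range(n)] for k in range(n)]


def output_energy(W, u):
    """Quadratic form u^T W u, i.e. the total output energy for input u."""
    n = len(u)
    return sum(u[k] * W[k][j] * u[j] for k in range(n) for j in range(n))


def serial_mode(W, u0, max_sweeps=100):
    """Update one component at a time with u_k <- Sign(sum_j W_kj u_j) until nothing changes."""
    u = list(u0)
    for _ in range(max_sweeps):
        changed = False
        for k in range(len(u)):
            field = sum(W[k][j] * u[j] for j in range(len(u)))
            # Assumption: a zero field leaves the component unchanged.
            new = u[k] if field == 0 else (1 if field > 0 else -1)
            if new != u[k]:
                u[k] = new
                changed = True
        if not changed:
            break
    return u


W = energy_matrix([1.0, 1.0, 1.0], 4)
u = serial_mode(W, [1, 1, -1, 1])
print(u, output_energy(W, u))
```

Because W is symmetric with a non-negative diagonal, each serial update in this sketch cannot decrease uᵀWu, so the iteration stops at a binary vector satisfying u = Sign(Wu). Different starting vectors can stop at different stable states, which is the local versus global distinction made in the text.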


DESIGN PROBLEM)

The optimal filter problem is formulated below. For simplicity we consider a single input,

single output linear filter/channel/system. It is easily seen that extension to multi input,

multi output systems follows in a straightforward manner. It is discussed how the solution

of optimal signal design problem also leads to a solution to this problem.

Find an admissible sequence of impulse response values h_k, k = 0, 1, 2, ..., k_f − 1, which are bounded in amplitude by one, i.e. |h_k| ≤ 1, in order to minimize the criterion

$$J = -\frac{1}{2}\sum_{k=0}^{k_f-1} X_k^T C^T C\,X_k$$

subject to

$$X_{k+1} = A X_k + B u_k, \quad X(0) = 0, \qquad Y_k = C X_k$$

where A is an n × n matrix, B is an n × 1 matrix and C is a 1 × n matrix. Let the impulse response at time k be denoted by h(k).

That is, find bounded-support, bounded-magnitude impulse response values such that the total output energy over a finite horizon is maximized. The input is unconstrained.

7.3.2 Solution

Since convolution is a commutative operator, as far as the output of linear filter is concerned,

the roles of input and impulse response can be exchanged.

y ( n ) = u ( n ) * h ( n) = h ( n ) * u ( n )

The input and impulse response have a dual role in determining output. Maximizing

output subject to bounded extent, bounded support input is equivalent to maximizing

output subject to bounded extent, bounded support impulse response. The optimal input

vector is given as the stable state of a neural network [3]. Thus optimal input signals

constitute a linear code [1], [4].
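The exchange of roles between input and impulse response rests only on the commutativity of convolution, which is easy to check on finite sequences. The sketch below is illustrative; the function name and the sample sequences are hypothetical.

```python
def convolve(x, h):
    """Linear convolution of two finite sequences: y(n) = sum_k x(k) h(n - k)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


u = [1.0, -1.0, 1.0]      # a binary input sequence (illustrative values)
h = [0.5, 0.25]           # an impulse response (illustrative values)
print(convolve(u, h))
print(convolve(h, u))     # same output with the roles exchanged
```

Since both calls give the same output sequence, any statement about the output energy as a function of a bounded input can be restated for a bounded impulse response with the input held fixed.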

The optimal set of impulse responses constitutes stable states of a neural network whose

connection matrix is the input energy density matrix. The components of optimal impulse

response vector assume binary values. They constitute a linear code. The linear filtering

operation reduces to the “binary filtering” operation. The optimal binary filters are related

to optimal codes matched to the input. The derivation follows by replacing the “input” by an “impulse response” that is finite in extent and finite in support; it duplicates the effort required for the derivation of the “optimal input”. Thus, linear filtering involves weighting the input values in the window by binary values. It is shown in [3] that the logic functions in multidimensions also constitute the stable states of an m-d neural network.

[Figure: flowcharts of the two procedures. First chart: (1) ... of linear channel over finite horizon; (2) determine the connection matrix of the neural network (the energy density matrix); (3) determine the local/global optimum input signal, i.e. the stable state of the neural network, by running it in serial mode. Second chart: (1) ... finite horizon; (2) ... density matrix, the connection matrix of a neural network; (3) ... network in serial mode to compute the local/global optimum impulse response.]

7.4 CONCLUSIONS

shown that this filtering problem is related to an optimal control problem [6]. By solving


the optimal control/ signal design problem, it is shown that the global optimum impulse

response constitutes the stable state of a Hopfield neural network.

REFERENCES

* Journal:

[1] Jehoshua Bruck, Mario Blaum, “Neural Networks, Error-Correcting Codes, and Polynomials over the Binary n-Cube”, IEEE Transactions on Information Theory, Vol. 35, No. 5, September 1989.

* Conference Proceedings:

[2] G. Rama Murthy, “Optimal Control, Codeword, Logic Function Tensors: Multidimensional

Neural Networks,” International Journal of Systemics, Cybernetics and Informatics (IJSCI),

October 2006, pages 9-17.

[3] G. Rama Murthy, “Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional

Logic theory,” International Journal of Neural Systems, Vol. 15, No. 3, pp. 223-235, 2005.

[4] G. Rama Murthy, “Multi/Infinite Dimensional Coding Theory: Multi/Infinite Dimensional

Neural Networks: Constrained Static Optimization,” Proceedings of IEEE Information Theory

Workshop, October 2002.

* Books:

[5] A.I. Borisenko and I. E. Tarapov, “Vector and Tensor Analysis with Applications“, Dover

Publications Inc., New York, 1968.

[6] M. Gopal, “Modern Control System Theory“, John Wiley and Sons, New York.

[7] G. Rama Murthy, “Tensor State Space Representation: Multidimensional Systems”, International

Journal of Systemics, Cybernetics and Informatics (IJSCI), January 2007, pp.16-23

[8] B. Gopinath, T. Cover, “Open Problems in Control and Communication and Computation”, Springer, Heidelberg, 1987.

[9] A.E. Bryson and Y.C. Ho, “Applied Optimal Control: Optimization, Estimation and Control”,

Taylor and Francis Inc. 1995.


CHAPTER 8

Linear Filter Model of a Synapse: Associated Novel Real/Complex Valued Neural Networks

8.1 INTRODUCTION

Artificial neural networks were devised to provide models of biological neural networks. The currently available models of neurons are used to build single-layer networks (e.g. the single-layer perceptron) as well as multi-layer neural networks (e.g. the multi-layer perceptron). These neural networks have been used successfully in several applications. Various other paradigms of neural networks, such as radial basis functions and self-organizing memory, have also been devised and used in applications.

In the case of conventional real-valued neural networks, the inputs and outputs belong to Euclidean space ($R^n$ or $R^m$). In these neural networks, a synapse is modeled by a single synaptic weight that is lumped at one point. These synaptic weights are updated in the training phase using one of the learning laws (for example, the perceptron learning law or the gradient rule). In the case of supervised training, these learning laws enable one to classify the input patterns into finitely many classes (based on the training samples).

• By reflecting on the modeling of biological neurons, we are naturally led to the realistic assumption that synapses constitute distributed elements rather than lumped elements. Thus, a realistic model of a synapse that still maintains tractability is a linear system (characterized by its impulse response).

• In conventional neuronal models, the input at each synapse is a constant and is acted on by the scalar synaptic weight. But in biological neurons, it is most natural to consider that the input signal samples are not scalar values, but functions defined over a finite support. The synapses (characterized by their impulse responses) act on these input signals, which are defined on the domain (restricted to a support) [0,T]. Thus the class of input signals belongs to a function space (defined on [0,T]). For the sake of notational convenience, let the synaptic weight functions also be defined on [0,T].

In summary, a continuous-time, real valued neuron has input signals (which are real valued

functions of time) defined over a finite support. The input signals are fed to synapses acting as

linear systems/filters and sum of responses is operated on by an activation function. Using this

model of a neuron, various feed-forward/recurrent networks of neurons are designed and studied.

This chapter is organized as follows. In section 2, continuous time perceptron model is

discussed. Also in this section, the continuous time perceptron learning law is discussed.

In section 3, abstract mathematical structure of neuronal models is discussed. In section 4,

neuronal model based on finite impulse response model of synapse is discussed. Also the

associated neural networks are proposed. In section 5, a novel continuous time associative

memory is proposed and the convergence theorem is discussed. In section 6, various

multidimensional neural network generalizations are discussed. In section 7, complex

valued neural networks based on the continuous time neuronal model are discussed. The

chapter concludes in section 8.

The area of artificial neural networks was pioneered by the efforts of McCulloch and Pitts to provide a model of the neuron. Soon, it was realized that such a model of the neuron has no training of the synaptic weights. Rosenblatt therefore proposed the model of a single perceptron as well as a single layer of perceptrons, together with the perceptron learning law. This law was proved to converge when the input patterns are linearly separable. Later it was shown that a multi-layer perceptron, a feed-forward network, can be trained (using the back-propagation algorithm) to classify non-linearly separable patterns. In the following (as discussed in the Introduction), we propose a biologically more accurate model of the neuron and use it to construct various artificial neural networks.

Consider finitely many (say M) input signals which are defined on a bounded support [0, T]. Let each of these signals be input to synapses which are characterized by synaptic weight functions (also defined on the support [0, T]). Since each of the synapses acts as a linear filter, the output of each synapse is the convolution of the input function with the synaptic weight function. Mathematically, let $a_i(t)$ and $W_i(t)$, for 1 ≤ i ≤ M, be the input functions and the synaptic weight functions respectively. Let the signum function be the activation function of the neuron. Thus the output of the neuron is given by

$$y(t) = \mathrm{Sign}\Big(\sum_{i=1}^{M} a_i(t) \otimes W_i(t) - T\Big) \tag{8.1}$$

where ⊗ denotes the convolution operation between two time functions and T is the threshold at the neuron (without loss of generality, the threshold can be assumed to be zero). More explicitly,

$$y(t) = \mathrm{Sign}\Big(\sum_{i=1}^{M} \int_0^T a_i(\tau)\,W_i(t-\tau)\,d\tau - T\Big) \tag{8.2}$$

The successive input functions are defined over the interval [0,T]. They are fed as inputs to

the continuous time neurons at successive SLOTS.

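[Figure: continuous-time neuron. The inputs $a_1(t), a_2(t), \ldots, a_M(t)$ pass through synapses with weight functions $w_1(t), w_2(t), \ldots, w_M(t)$, and the output is $y(t) = \mathrm{Sign}\big(\sum_{i=1}^{M} a_i(t) \otimes w_i(t)\big)$. In the figure, ⊗ denotes the convolution operator.]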
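The neuron of (8.2) can be approximated numerically by sampling the input and weight functions on a uniform grid and replacing the integral with a Riemann sum. The sketch below is only an illustration under stated assumptions: the sampling step dt, the convention Sign(0) = +1, the function names and the assumption that the weight functions vanish outside [0, T] are all introduced here and do not come from the text.

```python
def sign(x):
    """Signum activation; Sign(0) = +1 is an assumed convention."""
    return 1 if x >= 0 else -1


def neuron_output(inputs, weights, dt, threshold=0.0):
    """Sampled version of y(t) = Sign(sum_i integral a_i(tau) W_i(t - tau) dtau - threshold).

    inputs[i][n] and weights[i][n] hold samples of a_i and W_i at t = n * dt.
    W_i is taken to be zero outside [0, T], so the sum over tau stops at t.
    """
    n_samples = len(inputs[0])
    y = []
    for n in range(n_samples):
        z = 0.0
        for a, w in zip(inputs, weights):
            z += sum(a[m] * w[n - m] for m in range(n + 1)) * dt
        y.append(sign(z - threshold))
    return y


# One synapse, constant input and constant weight function, five samples.
print(neuron_output([[1.0] * 5], [[1.0] * 5], dt=0.1, threshold=0.25))
```

With a constant input and a constant weight function, the convolution grows linearly in time, so the sampled output switches from −1 to +1 once the accumulated response exceeds the threshold.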

Proof: As in the case of “conventional perceptron”, a continuous time perceptron learning

law is given by:

$$W_i^{(n+1)}(t) = W_i^{(n)}(t) + \eta\,\big(S(t) - g(t)\big)\,a_i(t) \tag{8.3}$$

where S(t) is the target output for the current training example, g(t) is the output generated by the perceptron and η is a positive constant called the learning rate.

Proof: In this model of continuous time perceptron, the weights are functions of time defined

on the interval [0, T]. Thus, since the synaptic weights are functions of time, we are led to

investigating the type of convergence: (i) Pointwise or (ii) Uniform.

Suppose we fix the time point, t. The convergence of synaptic weights in (8.3) is assured

by the proof of convergence in the case of conventional perceptron. Since, the choice of

time point is arbitrary, we are assured of pointwise convergence of synaptic weights based

on training sample input functions.

It is interesting to know under what conditions the sequence of synaptic weight functions converges uniformly. Q.E.D.
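One step of the learning law (8.3) can be written out for sampled functions, which makes the pointwise nature of the update explicit: each time sample of each weight function is corrected independently. This is a minimal sketch; the function name and the representation of functions as lists of samples are assumptions made for illustration.

```python
def update_weights(weights, inputs, target, output, eta):
    """One application of W_i(t) <- W_i(t) + eta * (S(t) - g(t)) * a_i(t) at every sample t.

    weights[i] and inputs[i] hold samples of W_i and a_i; target and output hold
    samples of S and g on the same time grid.
    """
    return [
        [w_t + eta * (s_t - g_t) * a_t
         for w_t, a_t, s_t, g_t in zip(w, a, target, output)]
        for w, a in zip(weights, inputs)
    ]


weights = [[0.0, 0.0]]
inputs = [[1.0, 2.0]]
# The output is wrong at the first sample and right at the second.
print(update_weights(weights, inputs, target=[1, -1], output=[-1, -1], eta=0.5))
```

Only the sample where the target and the generated output disagree is changed, which is the behaviour the fixed-time-point argument in the proof relies on.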

Using the above approach to model a neuron, it is straightforward to arrive at a multi-layer feed-forward network. In such a multi-layer perceptron, the activation function at each neuron is changed from a signum function to a sigmoid function, i.e.

$$y(t) = \frac{1}{1 + e^{-z(t)}} \tag{8.4}$$

where y(t) is the output of the neuron and

$$z(t) = \sum_{i=1}^{M} a_i(t) \otimes W_i(t) \qquad (\otimes \text{ denotes the convolution operator}) \tag{8.5}$$

follows essentially in a one-to-one manner. The details are avoided for brevity. Also various

recurrent networks based on the continuous-time neuron can be designed and implemented.

It is possible to consider a model of a neuron in which the input functions are defined over the interval [0, ∞). It is possible to consider neural networks based on such inputs. The inputs are divided into training and testing classes.

Consider the inputs to a continuous-time neuron which are defined on a finite support

[0,T]. Let the impulse responses of synapses modeled as linear filters be defined on the

finite support [0,T]. Thus, the inputs as well as synaptic weight functions belong to the

function space defined over the finite support [0,T]. We answer the following question.

Q: Under reasonable assumptions, what is the mathematical structure of the function

space defined over [0,T] ?

Let F be the set (function space) on which the following operations are well defined:

Addition, Convolution (These operations are like addition, multiplication defined on the

sets: real numbers, complex numbers).

Lemma: Let the identically zero function be the additive identity element and let the delta function (δ(t) = 1 for t = 0 and δ(t) = 0 for t ≠ 0) be the multiplicative identity. Then the set F, on which the addition and multiplication (convolution) operations on functions defined on [0,T] are defined, constitutes a ring.

Proof: The proof involves routine verification of the ring axioms (closure under the addition and convolution operations between the members of F, i.e. functions) and is omitted for brevity. In fact, F is “close” to being a field, except that a “multiplicative inverse” does not always exist. Q.E.D.

Include such functions and convert F into field.

Now define a vector space, G over the field. The set of input functions incorporated

into a vector belongs to G. The usual ‘multiplication’ operation is replaced by ‘convolution’.

‘vector’ (specified by synaptic weight functions { $W_i(t)$, 1 ≤ i ≤ M }) is given by

$$\sum_{i=1}^{M} a_i(t) \otimes W_i(t) = L(t) \tag{8.5}$$

be the vector space defined over F. A class of functions is separable into two classes if there exists a hyperplane such that the two regions are defined by

$$\sum_{i=1}^{M} a_i(t) \otimes W_i(t) < L(t) \quad \text{and} \quad \sum_{i=1}^{M} a_i(t) \otimes W_i(t) > L(t) \tag{8.6}$$

into “N” classes.

It is well known that the Fourier/Laplace transform of the convolution of two functions is the product of the Fourier/Laplace transforms of the individual functions. It is found that processing the functions (by applying the activation function) has advantages in the transform domain [Rama5]. In this subsection, we restrict attention to functions with rational Fourier/Laplace transforms.
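The transform property quoted above has a discrete counterpart that can be checked directly: the discrete Fourier transform of a linear convolution equals the product of the transforms of the two sequences, once both are zero-padded to the length of the convolution. The sketch below uses a direct O(n²) transform for clarity; the function names and the sample sequences are illustrative assumptions.

```python
import cmath


def convolve(a, w):
    """Linear convolution of two finite sequences."""
    y = [0.0] * (len(a) + len(w) - 1)
    for i, ai in enumerate(a):
        for j, wj in enumerate(w):
            y[i + j] += ai * wj
    return y


def dft(x):
    """Direct discrete Fourier transform of a finite sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def transform_domain_product(a, w):
    """Product of the transforms of a and w, both zero-padded to the convolution length."""
    n = len(a) + len(w) - 1
    A = dft(list(a) + [0.0] * (n - len(a)))
    W = dft(list(w) + [0.0] * (n - len(w)))
    return [p * q for p, q in zip(A, W)]


a, w = [1.0, 2.0, 3.0], [1.0, -1.0]
lhs = dft(convolve(a, w))
rhs = transform_domain_product(a, w)
print(max(abs(p - q) for p, q in zip(lhs, rhs)))  # close to zero
```

In the transform domain the synapse therefore acts by pointwise multiplication, which is the sense in which convolution plays the role of multiplication in the function space discussed above.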

The function space being operated on by the activation function is now the field of rational functions over [0,T]. Thus there is a natural mapping between the two fields (associated with the continuous-time neuron). With the above discussion summarizing the abstract mathematical structure of the neuronal modeling being considered, we arrive at the following conclusions:

• Consider a single layer of continuous-time perceptrons being trained by input function samples. As long as the input samples are “linearly separable”, the set of synaptic weight functions converges (to an equilibrium vector).

• Consider a continuous-time multi-layer perceptron being trained by input

function samples. The back propagation algorithm utilized to train synaptic

weight functions converges even when the input function samples are non-

linearly separable (provided there are sufficient number of continuous-time

neurons in the hidden layer).

NETWORKS

• So far we have considered continuous-time neural networks in which the synaptic weight function corresponds to an analog linear filter. A natural question arises whether it is possible to conceive a synapse whose impulse response corresponds to that of a digital filter, i.e. a Finite Impulse Response (FIR) filter. In the following, we consider neural networks with such a model of the synapse.

• Typically, let the discrete time input signals be considered over the finite horizon

[ 0, 1, 2, …, S]. For the sake of simplicity, let the length of all FIR filters modeling

the synapses be the same, say T (The generalization to the case where the FIR filters

have different lengths is straightforward). Thus the impulse response sequences

(associated with different synapses) extend over the duration {0,1,2,……….,}.

• The output of the synapse (described by an FIR filter) depends on the input signal

values over a finite horizon (depending on the length of the impulse response).

Typically the length of filter is smaller than the support of a distinct input sequence

i.e. T << S. It should be noted that the successive input sequences are of same length.

$$y(n) = \mathrm{Sign}\Big(\sum_{i=1}^{M} C^i(n) \otimes a^i(n)\Big) = \mathrm{Sign}\Big(\sum_{i=1}^{M} \sum_{k=0}^{T} C^i(k)\,a^i(n-k)\Big) \tag{8.7}$$

where $C^i(k)$ for k = 1, 2, ..., T is the impulse response sequence of the ith synapse and $a^i(k)$ for k = 1, 2, ..., S is the ith input sequence to the neuron.
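The double sum in (8.7) translates directly into code. The following is a minimal sketch under assumptions that the text does not state: input samples before time zero are taken to be zero, Sign(0) is taken to be +1, and the function names and sample values are illustrative.

```python
def sign(x):
    """Signum activation; Sign(0) = +1 is an assumed convention."""
    return 1 if x >= 0 else -1


def fir_neuron(inputs, filters):
    """y(n) = Sign(sum_i sum_k C_i(k) * a_i(n - k)) for every n in the input horizon.

    inputs[i][n] is the ith input sequence and filters[i][k] is the impulse
    response of the ith synapse; samples with n - k < 0 are treated as zero.
    """
    horizon = len(inputs[0])
    y = []
    for n in range(horizon):
        total = 0.0
        for a, c in zip(inputs, filters):
            for k, c_k in enumerate(c):
                if n - k >= 0:
                    total += c_k * a[n - k]
        y.append(sign(total))
    return y


# A single synapse with a two-tap impulse response.
print(fir_neuron([[1, -1, 1, -1]], [[1.0, 0.5]]))
# Two synapses feeding the same neuron.
print(fir_neuron([[1, -1, 1, -1], [0, 2, 0, 2]], [[1.0, 0.5], [1.0, 0.0]]))
```

Each output sample depends only on the most recent input samples covered by the filter length, which is the finite-horizon dependence described in the bullet above.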

• Thus the synaptic weight sequence values (impulse responses of the FIR filters) can be trained according to the following perceptron learning law:

$$C_i^{(n+1)}(k) = C_i^{(n)}(k) + \eta\,\big(S(k) - g(k)\big)\,a_i(k) \tag{8.8}$$

where S(k) is the target output for the current training example, g(k) is the output generated by the perceptron at time k and η is a positive constant called the learning rate.

This update rule converges when the input patterns are linearly separable. Using the

same model of neuron, a multi layer perceptron is trained using a modified version of

Back Propagation Algorithm.

It is possible to consider neuronal models in which the synapse acts as an Infinite

Impulse Response filter. Furthermore, based on such a model of neuron (synapse acting as

an FIR filter), it is possible to discuss a novel associative memory. Currently, the models of

neurons discussed (in section 2, section 4) are being compared with traditional models of

neurons [Rama5].

In addressing the problem of signal design for magnetic/optical recording channels, Wyner formulated an open research problem [GoC]. The problem statement is provided below.

Consider a single-input, single-output linear time-invariant filter (SISO linear filter) modeling a magnetic/optical recording channel. Let the class of inputs (to the linear filter) defined on the bounded support [0, T] be bounded in magnitude by unity. Determine the optimal signals such that the total output energy over the finite horizon,

$$\int_0^T y^2(t)\,dt \qquad (\text{where } y(t) \text{ is the output of the linear filter}),$$

is maximized.

The author [Rama3] as well as Honig et al. [HoS] independently solved the problem. The solution in [Rama3] is more general in the sense that Multi-Input, Multi-Output (MIMO) linear time-varying filters are considered and the optimal input vector is derived. Let Y(t) be an optimal input vector. Then it satisfies the following signed integral equation

$$Y(t) = \mathrm{Sign}\Big(\int_0^T R(t,u)\,Y(u)\,du\Big) \tag{8.9}$$

where R(t,u) is the energy density matrix of the multi-input, multi-output, linear time-varying system. In the case of a multi-input, multi-output, linear time-invariant system, the optimal input vector satisfies the following equation

$$Y(t) = \mathrm{Sign}\Big(\int_0^T R(t-u)\,Y(u)\,du\Big) \tag{8.10}$$

computing the optimal control vector starting with an arbitrary binary vector defined on

the support [0,T].

finite support [0,T]. Let R(t) be the energy density matrix of a multi-input, multi-output

linear system representing the time varying synaptic weight matrix. The following

successive approximation scheme is used to compute the local optimum stable function starting with an initial binary vector $Y^{(0)}(t)$:

$$Y^{(n+1)}(t) = \mathrm{Sign}\Big(\int_0^T R(t-\tau)\,Y^{(n)}(\tau)\,d\tau\Big) \tag{8.11}$$

From practical considerations, it is necessary to know whether the above successive approximation scheme converges. This problem is converted into an equivalent problem by discretizing the continuous-time linear system into a discrete-time system. Such discretization can always be done for some types of systems (satisfying some regularity conditions) without fear of approximating the system dynamics. The standard procedure for discretizing a continuous-time system is summarized in many textbooks, including Gopal's book [Gop, pp. 185-187].

With the discrete-time system equivalent to the continuous-time system, the argument adopted for convergence is once again energy-function hill climbing in successive iterations.

Theorem 8.1: Consider a Multi-Input, Multi-Output (MIMO), linear time-invariant system described by the dynamics

$$\dot{X}(t) = A\,X(t) + B\,Y(t), \qquad Z(t) = C\,X(t) \tag{8.12}$$

The discrete-time simulation (of the above continuous-time system) of the following form

$$X(k+1) = F\,X(k) + G\,Y(k) \tag{8.13}$$

$$Z(k) = H\,X(k) \tag{8.14}$$

can always be done. The discrete simulation is almost exact except for the error introduced by sampling the input and that caused by the iterative procedure for evaluating the matrices.

Proof: Follows from the procedure described in Gopal [Gop, pp. 185-187]. Q.E.D.
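As an illustration of how the matrices F and G in (8.13) can be evaluated iteratively, the sketch below assumes a zero-order hold on the input over each sampling interval h, which gives F = e^{Ah} and G = (∫₀ʰ e^{As} ds) B, and evaluates both by truncated power series. This is the standard textbook construction and may differ in detail from the procedure in [Gop]; the helper names and the number of series terms are illustrative.

```python
def mat_mul(P, Q):
    """Matrix product of two lists-of-rows matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]


def mat_add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def mat_scale(P, c):
    return [[c * p for p in row] for row in P]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def discretize(A, B, h, terms=20):
    """Return (F, G) with F = sum_m (A h)^m / m! and G = (sum_m A^m h^(m+1) / (m+1)!) B."""
    n = len(A)
    F = identity(n)
    S = mat_scale(identity(n), h)   # running series for the integral of exp(A s) over [0, h]
    term = identity(n)              # holds (A h)^m / m!
    for m in range(1, terms):
        term = mat_scale(mat_mul(term, A), h / m)
        F = mat_add(F, term)
        S = mat_add(S, mat_scale(term, h / (m + 1)))
    return F, mat_mul(S, B)


# Double integrator: the series terminates, so the result is exact.
F, G = discretize([[0.0, 1.0], [0.0, 0.0]], [[0.0], [1.0]], h=0.5)
print(F, G)
```

The truncation of the series is the "iterative procedure" error mentioned in the theorem; the remaining error comes from treating the input as constant between samples.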

With such a discrete-time system corresponding to a continuous-time system, we have the following recursion (successive approximation scheme):

$$Y^{(n+1)}(k) = \mathrm{Sign}\{W\,Y^{(n)}(k)\} \quad \text{for } n \ge 0 \tag{8.15}$$

where Y(k) is the optimal control vector associated with the discrete-time linear system (obtained by discretizing a continuous-time system) and W is the energy density tensor (associated with the discrete-time system). Thus we have a Hopfield network with W as the synaptic weight matrix. Hence, starting with an initial vector $Y^{(0)}(k)$, the above recursion converges to a stable state (local optimum vector) or at most a cycle of length 2 (by invoking the convergence theorem associated with the Hopfield neural network whose connection matrix is W). Q.E.D.
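The recursion (8.15) updates all components at once, and the two outcomes named in the theorem (a stable state or a cycle of length 2) can both be observed on small symmetric matrices. The sketch below is illustrative only: the convention Sign(0) = +1, the function names and the example matrices are assumptions, not taken from the text.

```python
def sign(x):
    """Signum activation; Sign(0) = +1 is an assumed convention."""
    return 1 if x >= 0 else -1


def parallel_step(W, y):
    """One application of Y <- Sign(W Y) to all components simultaneously."""
    return [sign(sum(w * v for w, v in zip(row, y))) for row in W]


def run_parallel(W, y0, max_iters=1000):
    """Iterate until a stable state or a cycle of length 2 is detected."""
    prev, cur = None, list(y0)
    for _ in range(max_iters):
        nxt = parallel_step(W, cur)
        if nxt == cur:
            return "stable", cur
        if nxt == prev:
            return "cycle", (cur, nxt)
        prev, cur = cur, nxt
    return "undecided", cur


print(run_parallel([[1, 0], [0, 1]], [1, -1]))   # a stable state
print(run_parallel([[0, 1], [1, 0]], [1, -1]))   # a cycle of length 2
```

The second example shows why the theorem allows for a cycle of length 2: with purely off-diagonal coupling, the two components keep exchanging signs under simultaneous updates.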

Thus, the above approach converts the problem of determining the convergence of the scheme in (8.11) to that associated with a discrete-time linear system. The iteration is reminiscent of an L∞ version of the Neumann series. The energy function (Lyapunov function) optimized over the state trajectory of the continuous-time linear system is a quadratic form [Rama1].

In [BrB], various possible generalized neural networks are discussed. These neural

networks are associated with an energy function which is a higher order form than a

quadratic form (associated with a Hopfield neural network). It is very natural to formalize

associative memories which are generalizations of those discussed in this chapter.

Several generalizations of the results are documented in the technical report [Rama5]. For instance, the complex-valued, continuous-time associative memory is discussed in detail in [Rama5, RaP]. For such a complex-valued associative memory, a convergence theorem is stated and proved.

neural networks, whose input as well as output are vectors. It is straightforward to generalize the results to the case where the input/output is a 3-dimensional/multidimensional array [Rama1, Rama2]. Tensor products are utilized to determine the output of each neuron in the network. Such three/multidimensional neural networks arise in the biological neural network in the human/animal brain.

• In the case of a human/animal brain, the associative memory operates on three

dimensional input patterns. Thus the state of the associative memory is not a

vector (one dimensional array) but a three dimensional array. An appropriate

model of such memory is a three dimensional, continuous time associative memory.

It is easy to see that the model described in section 5 can easily be generalized

along the lines of the work in [Rama 2].

Activation Functions

Consider a complex valued, continuous time neuron whose inputs as well as synaptic

weight functions (defined on support [0, T] ) and thresholds are complex valued functions.

In such a model of neuron, it is possible to utilize various activation functions.

Let z(t) = ( c(t) + j d (t) ) be the net contribution (after convolving the input functions

with the synaptic weight functions) at a neuron. The following activation functions can be

utilized:

Activation Functions: (i) Complex Signum Function;

Sign ( c(t) + j d (t) ) = Sign ( c(t) ) + j Sign ( d (t) ).

With such an activation function, the continuous-time perceptron learning law described in (8.3) for real-valued neurons is easily generalized to continuous-time, complex-valued perceptrons.

In the case of the conventional complex-valued perceptron, it is well known that the perceptron training law is easily generalized [AAV]. Using a similar proof technique, the convergence proof utilized by Aizenberg et al. is generalized to the case of complex-valued, continuous-time neurons.

Also, in the case of the conventional, complex-valued neuron, the above activation function is utilized in [RaP] for arriving at an associative memory.
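The complex signum function defined above acts separately on the real and imaginary parts, so its outputs are the four corners of the complex unit square. A minimal sketch follows; the convention Sign(0) = +1 and the function names are assumptions made for illustration.

```python
def sign(x):
    """Real signum; Sign(0) = +1 is an assumed convention."""
    return 1.0 if x >= 0 else -1.0


def complex_sign(z):
    """Sign(c + j d) = Sign(c) + j Sign(d)."""
    return complex(sign(z.real), sign(z.imag))


print(complex_sign(complex(0.3, -2.0)))   # (1-1j)
print(complex_sign(complex(-4.0, 0.5)))   # (-1+1j)
```

For a continuous-time neuron the same function would be applied to each time sample of the net contribution z(t) = c(t) + j d(t).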

$$g(z(t)) = \frac{1}{1 + e^{-z(t)}} \tag{8.16}$$

or alternatively

In the case of complex valued, continuous time multi-layer perceptron, we utilize the

above complex valued sigmoidal function as the activation function at each (complex

valued) neuron. With such a model of neuron, the backpropagation algorithm in Nitta et al.

[Nit1, Nit2] is generalized to the case of continuous time neural networks.

Utilizing traditional model of a neuron, unified theory of control, communication and

computation is discovered and formalized [Rama 4]. This unified theory is generalized

using the models of neurons discussed in this chapter [Rama1].

8.8. CONCLUSIONS

In this chapter, models of neurons are proposed. The synapses are considered as distributed elements rather than lumped elements. Thus, synapses are modeled as linear filters in continuous time as well as discrete time. Using these novel models of neurons, associated neural networks are proposed. Also, a novel model of associative memory is proposed. Using such a model, the convergence aspects of various modes of operation are discussed. Multidimensional generalizations of neural networks are discussed. Also, associated complex-valued neural networks are discussed.

REFERENCES

[AAV] I. N. Aizenberg, N. N. Aizenberg and J. Vandewalle, “Multi-Valued and Universal

Binary Neurons”, Kluwer Academic Publishers, 2000.

[BrB] J. Bruck and M. Blaum, “Neural Networks, Error Correcting Codes, and Polynomials over

the Binary n-Cube”, IEEE Transactions on Information Theory, pp. 976- 987, Vol. 35, No.5,

September 1989.

[GoC] B. Gopinath and T. Cover, “Open Problems in Control, Communication and Computation”, Springer, Heidelberg, 1987.

[Gop] M. Gopal, “Modern Control System Theory“, John Wiley & Sons, Second Edition,

1993.

[HoS] M. Honig and K. Steiglitz, “On Wyner’s Conjecture”, Bellcore Technical Memorandum.

[Nit1] T. Nitta and T. Furuya: “A Complex Back-propagation Learning”, Transactions of

Information Processing Society of Japan, Vol.32, No.10, pp.1319-1329 (1991) (in Japanese).


[Nit2] T. Nitta : “An Extension of the Back-Propagation Algorithm to Complex Numbers”, Neural

Networks, Vol.10, No.8, pp.1391-1415 (1997).

[Rama1] G. Rama Murthy, “Unified Theory of Control, Communication and Computation”, To be submitted to Proceedings of IEEE.

[Rama2] G. Rama Murthy, “Multi/Infinite Dimensional Neural Networks, Multi/Infinite

Dimensional Logic Theory“, International Journal of Neural Systems, Vol. 15, No. 3, Pages

223-235, June 2005.

[Rama3] G. Rama Murthy, “Signal Design for Magnetic/Optical Recording Channels: Spectra of

Bounded Functions“, Bellcore (Now Telcordia) Technical Memorandum, TM-NWT-018026.

[Rama4] G. Rama Murthy, “Optimal Control, Codeword, Logic Function Tensors:

Multidimensional Neural Networks“, International Journal of Systemics, Cybernetics and

Informatics, October 2006, pages 9-17.

[Rama5] G. Rama Murthy, “Linear Filter Model of Synapses: Associated Novel Real/Complex

Valued Neural Networks“, IIIT Technical Report in Preparation.

[RaP]G. Rama Murthy and D. Praveen, “Complex-Valued Neural Associative Memory on the

Complex Hypercube“, Proceedings of 2004 IEEE International Conference on Cybernetics

and Intelligent Systems (CIS-2004), Singapore.


CHAPTER 9

Novel Complex Valued Neural Networks

9.1 INTRODUCTION

Starting in the 1950s, researchers tried to arrive at models of neuronal circuitry. Thus the research field of artificial neural networks was born. The so-called perceptron was shown to be able to classify linearly separable patterns. Since the Exclusive OR gate cannot be synthesized through any single perceptron (as the gate outputs are not linearly separable), interest in artificial neural networks faded away. In the 1970s, it was shown that a multi-layer feed forward neural network such as a multi-layer perceptron is able to classify non-linearly separable patterns.

Living systems/machines such as homo sapiens, lions, tigers etc. have the ability to associate externally presented one/two/three dimensional information, such as audio signals, images and three dimensional scenes, with the information stored in the brain. This highly accurate ability to associate information is achieved through the bio-chemical circuitry in the brain. In the 1980s, Hopfield revived interest in the area of artificial neural networks through a model of associative memory. The main contribution is a convergence theorem which shows that the artificial neural network reaches a memory/stable state starting from any arbitrary initial input (in a certain important mode of operation). He also demonstrated several interesting variations of associative memory. In [Rama4], a continuous-time version of associative memory is described. It is shown that the celebrated convergence theorem in discrete time generalizes to the continuous time associative memory. In [Rama2], the model of associative memory in one dimension (Hopfield associative memory) is generalized to multi/infinite dimensions and the associated convergence theorem is proven.

It was realized by researchers such as N. N. Aizenberg that the basic model of a neuron must be modified to account for complex valued inputs, complex valued synaptic weights and thresholds [AAV]. In many real world applications, complex valued input signals need to be processed by neural networks with complex synaptic weights [Hir]. Thus the need to study, design and analyze such networks is real. Also, in [Rama3] the results on real valued associative memories are extended to complex valued neural networks. In [Nit1, Nit2], the celebrated back propagation algorithm is generalized to complex valued neural networks. Also, in [Rama4], based on a novel model of neuron, complex valued neural networks are designed. Thus, based on the results in Section 2 and Section 3, it is reasoned that transforming real valued signals into the complex domain and processing them there could have many advantages.

This chapter is organized as follows. In Section 2, the Discrete Fourier Transform (DFT) is utilized to transform a set of real/complex valued sequences into the complex valued (frequency) domain. It is reasoned that, in a well defined sense, processing the signals using complex valued neural networks is equivalent to processing them in the real domain. In Section 3, a novel model of a continuous time neuron is discussed. The associated neural networks (based on the novel model of neuron) are briefly outlined. In Section 4, some important generalizations are discussed. In Section 5, some open questions are outlined. The chapter concludes in Section 6.

NEURAL NETWORKS

In the field of Digital Signal Processing (DSP), discrete sequences are processed by discrete time circuits such as digital filters. One transform which converts time domain information into the frequency domain is called the Discrete Fourier Transform (DFT). One of the main reasons for utilizing the DFT in many applications is the existence of a fast algorithm to compute it. This fast algorithm is called the Fast Fourier Transform (FFT). In the following, we provide the mathematical expressions for the Discrete Fourier Transform (DFT) as well as the Inverse Discrete Fourier Transform (IDFT) of a discrete sequence $\{x(n)\}_{n=0}^{M-1}$, i.e. $\{x_0, x_1, x_2, \ldots, x_{M-1}\}$.

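DFT:

$$X(k) = \sum_{n=0}^{M-1} x(n)\, W_M^{kn} \quad \text{for } 0 \le k \le (M-1) \qquad (9.1)$$

IDFT:

$$x(n) = \frac{1}{M} \sum_{k=0}^{M-1} X(k)\, W_M^{-kn} \quad \text{for } 0 \le n \le (M-1) \qquad (9.2)$$

where

$$W_M = e^{-j\left(\frac{2\pi}{M}\right)} \qquad (9.3)$$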
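As an illustration, equations (9.1) to (9.3) can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function names `dft` and `idft` and the sample sequence are chosen here for illustration and do not come from the text. It uses the direct sums (order M squared operations), not the FFT.

```python
import cmath

def dft(x):
    """Equation (9.1): X(k) = sum over n of x(n) * W_M**(k*n)."""
    M = len(x)
    W = cmath.exp(-2j * cmath.pi / M)  # W_M from equation (9.3)
    return [sum(x[n] * W ** (k * n) for n in range(M)) for k in range(M)]

def idft(X):
    """Equation (9.2): x(n) = (1/M) * sum over k of X(k) * W_M**(-k*n)."""
    M = len(X)
    W = cmath.exp(-2j * cmath.pi / M)
    return [sum(X[k] * W ** (-k * n) for k in range(M)) / M for n in range(M)]

# Illustrative real valued sequence with M = 4 samples.
x = [1.0, 2.0, 0.0, -1.0]
X = dft(x)        # complex valued frequency domain samples
x_back = idft(X)  # recovers x up to floating point rounding
```

Applying `idft` to the output of `dft` returns the original sequence up to rounding, which is the invertibility property used later in the chapter.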

Main Question: Consider a set of samples which are linearly separable in the M-

dimensional Euclidean space. Utilizing an invertible (Bijection) Linear Transformation,

transform the points. In the transformed domain, are the resulting samples, linearly separable?

In answering this question, we are led to the following Lemma.

Lemma 9.1: Under Bijective Linear Transformation, linearly separable patterns in Euclidean

Space are mapped to linearly separable patterns in the transform space.

Proof: For notational convenience, we consider patterns in a 2-dimensional Euclidean space. Let the bijective (invertible) linear transformation be $T: \mathbb{R}^2 \to \mathbb{R}^2$.

Let the original separating line (more generally, hyperplane) be given by

$$W_1 X + W_2 Y = C \qquad (9.4)$$

and let one of the two classes of patterns lie in the closed half-plane

$$S_1 = \{(x, y) \mid W_1 x + W_2 y \ge C\}$$

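Now let us consider the linear transformation $T$:

$$T: \mathbb{R}^2 \to \mathbb{R}^2, \qquad (x, y) \mapsto (px + qy,\; rx + sy) \qquad (9.6)$$

Let the linear transformation be represented by the following matrix:

$$\begin{bmatrix} p & q \\ r & s \end{bmatrix} \qquad (9.7)$$

Under this transformation, the coordinates of the separating line become

$$\begin{bmatrix} X' \\ Y' \end{bmatrix} = \begin{bmatrix} p & q \\ r & s \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix} \qquad (9.8)$$

Thus we readily have

$$X' = pX + qY, \qquad Y' = rX + sY \qquad (9.9)$$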

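On inverting the linear transformation, we have

$$\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} p & q \\ r & s \end{bmatrix}^{-1} \begin{bmatrix} X' \\ Y' \end{bmatrix} = \begin{bmatrix} s/d & -q/d \\ -r/d & p/d \end{bmatrix} \begin{bmatrix} X' \\ Y' \end{bmatrix} \qquad (9.10)$$

where $d$ is the determinant of the matrix, given by $d = ps - qr$. Since the transformation is invertible, $d \neq 0$. We thus have

$$X = \frac{s}{d} X' - \frac{q}{d} Y', \qquad Y = -\frac{r}{d} X' + \frac{p}{d} Y' \qquad (9.11)$$

Thus, substituting for $X$, $Y$ in the original separating line/hyperplane $W_1 X + W_2 Y = C$, we readily have

$$W_1 \left( \frac{s}{d} X' - \frac{q}{d} Y' \right) + W_2 \left( -\frac{r}{d} X' + \frac{p}{d} Y' \right) = C$$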

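From the above equations, it is clear that a point $(x, y)$ in two dimensional Euclidean space belonging to the set $S_1$ gets transformed to the point $T(x, y) = (x', y')$ belonging to the set $S'_1$, i.e.

$$(x, y) \in S_1 \;\Rightarrow\; T(x, y) = (x', y') \in S'_1$$

where the set $S'_1$ is given by

$$S'_1 = \left\{ (x', y') \;\middle|\; W_1 \left( \frac{s}{d} x' - \frac{q}{d} y' \right) + W_2 \left( -\frac{r}{d} x' + \frac{p}{d} y' \right) \ge C \right\}$$

which is again a closed half-plane, bounded by the transformed separating line.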

• Thus we have shown that patterns which are linearly separable in two dimensional Euclidean space remain linearly separable after applying a bijective linear transformation to the samples.

• The above proof is easily generalized to samples in n-dimensional Euclidean space (where ‘n’ is arbitrary). Q.E.D.
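The argument of Lemma 9.1 can be checked numerically on a small example. The sketch below is illustrative only: the matrix `T`, the line coefficients and the two point sets are assumed values chosen here, and the helper names are hypothetical. It maps the points with T, derives the transformed line coefficients by the substitution used in the proof, and confirms that the two classes stay on opposite sides.

```python
def apply_map(T, point):
    """Apply (x, y) -> (p*x + q*y, r*x + s*y), as in equation (9.6)."""
    (p, q), (r, s) = T
    x, y = point
    return (p * x + q * y, r * x + s * y)

def transform_line(T, W1, W2):
    """Coefficients of X', Y' after substituting (9.11) into W1*X + W2*Y = C."""
    (p, q), (r, s) = T
    d = p * s - q * r  # determinant, must be nonzero for a bijection
    if d == 0:
        raise ValueError("transformation is not invertible")
    return ((W1 * s - W2 * r) / d, (-W1 * q + W2 * p) / d)

def in_half_plane(V1, V2, C, point):
    return V1 * point[0] + V2 * point[1] >= C

# Assumed example: line x + 2y = 3 and an invertible matrix with d = 5.
W1, W2, C = 1.0, 2.0, 3.0
T = ((2.0, 1.0), (1.0, 3.0))
class_a = [(3.0, 1.0), (1.0, 2.0), (4.0, 4.0)]    # satisfy W1*x + W2*y >= C
class_b = [(0.0, 0.0), (1.0, 0.5), (-2.0, 1.0)]   # satisfy W1*x + W2*y < C

V1, V2 = transform_line(T, W1, W2)
a_side = [in_half_plane(V1, V2, C, apply_map(T, pt)) for pt in class_a]
b_side = [in_half_plane(V1, V2, C, apply_map(T, pt)) for pt in class_b]
```

The threshold C is unchanged because the transformed coefficients are constructed so that the value of the linear form at a transformed point equals its value at the original point.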

Consider equation (9.1) for computing the Discrete Fourier Transform of a discrete sequence of samples $\{x(n) : 0 \le n \le (M-1)\}$. Let the column vector containing these samples be given by $Y$. Also, let the column vector containing the transformed samples, i.e. $\{X(k) : 0 \le k \le (M-1)\}$, be given by $Z$. It is clear that equation (9.1) is equivalent to the following:

$$Z = F\,Y \qquad (9.14)$$

where $F$ is the Discrete Fourier Transform matrix. This matrix is invertible. Hence the transformation between the discrete sequence vectors $Y$ and $Z$ is bijective. Thus the above Lemma applies.
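The invertibility of F can be made concrete. In the sketch below, which uses only the standard library, the entries of F are taken as $W_M^{kn}$ from (9.1) and (9.3), and the inverse is taken as the conjugate of F divided by M, which is what equation (9.2) expresses in matrix form. The function names are illustrative assumptions.

```python
import cmath

def dft_matrix(M):
    """F[k][n] = W_M**(k*n), so that Z = F Y reproduces equation (9.1)."""
    W = cmath.exp(-2j * cmath.pi / M)
    return [[W ** (k * n) for n in range(M)] for k in range(M)]

def inverse_dft_matrix(M):
    """Inverse read off from equation (9.2): conjugate of F divided by M."""
    return [[entry.conjugate() / M for entry in row] for row in dft_matrix(M)]

def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

M = 4
F = dft_matrix(M)
F_inv = inverse_dft_matrix(M)
product = matmul(F_inv, F)  # should equal the M x M identity up to rounding
```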

Consider a single layer of conventional perceptrons. Let the sequence of input vectors be

{Y1 , Y2 ,..., YL } . The following supervised learning procedure is utilized to classify the patterns:

• Apply the DFT to the successive input training sample vectors resulting in the

vectors. {Z1 , Z2 ,...., ZL } .

• Train a single layer of Complex Valued Perceptrons using the transformed sample

vectors (complex valued version of perceptron learning law provided in [AAV]

is used)

• Apply the IDFT to arrive at the proper class of training samples.

• Utilize the trained complex valued neural network to classify the test patterns.

In view of Lemma 9.1, the above procedure converges when the training samples are linearly separable. Thus the linearly separable test patterns are properly classified.
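The first, second and last steps of the procedure above can be sketched as follows. This is not the complex valued perceptron learning law of [AAV], which is not reproduced in this chapter. As a loudly stated stand-in, the sketch stacks the real and imaginary parts of the DFT coefficients and trains a conventional real valued perceptron on them. The training vectors, labels and function names are assumptions made for illustration.

```python
import cmath

def dft(x):
    """Equation (9.1) evaluated by direct summation."""
    M = len(x)
    W = cmath.exp(-2j * cmath.pi / M)
    return [sum(x[n] * W ** (k * n) for n in range(M)) for k in range(M)]

def features(y):
    """DFT of the sample vector, as real parts, imaginary parts and a bias term."""
    Z = dft(y)
    return [z.real for z in Z] + [z.imag for z in Z] + [1.0]

def predict(w, f):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) >= 0 else -1

def train_perceptron(samples, labels, max_epochs=100):
    """Conventional perceptron rule (stand-in, not the [AAV] complex law)."""
    feats = [features(y) for y in samples]
    w = [0.0] * len(feats[0])
    for _ in range(max_epochs):
        mistakes = 0
        for f, label in zip(feats, labels):
            if predict(w, f) != label:
                w = [wi + label * fi for wi, fi in zip(w, f)]
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return w

# Assumed linearly separable training set: two vectors per class.
train_vectors = [[1, 2, 1, 2], [2, 1, 3, 1], [-1, -2, -1, -1], [-2, -1, -2, -3]]
train_labels = [1, 1, -1, -1]
w = train_perceptron(train_vectors, train_labels)

test_vectors = [[3, 3, 2, 2], [-1, -1, -2, -2]]
predictions = [predict(w, features(y)) for y in test_vectors]
```

Because the DFT is a bijective linear map, the transformed training set is still linearly separable, so the perceptron loop terminates on this example.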

The above procedure is also applied to non-linearly separable patterns using a complex valued Multi-Layer Perceptron. The back propagation algorithm discussed in [Nit1, Nit2] is utilized. Detailed discussion is provided in [Rama1]. It is argued by Nitta et al. that the complex valued version of the back propagation algorithm converges faster than the real valued one. Thus, from a computational viewpoint, the above procedure is attractive.

weights) of current input values is taken and a suitable activation function (signum, sigmoid or hyperbolic tangent) is applied. A biologically more plausible model takes the following facts into account:

• The output of a neuron depends not only on the current input value, but all the

input values over a finite horizon. Thus inputs to neurons are defined over a finite

horizon (rather than a single time point).

• Synapses are treated as distributed elements rather than lumped elements. Thus

synaptic weights are functions defined on a finite support.

For the sake of convenience, let the input as well as synaptic weight functions be defined

on the support [0, T].

Let the synaptic weights be $w_i(t)$, $1 \le i \le M$, i.e. time functions defined on the support $[0, T]$. Also, let the inputs be given by $a_i(t)$, $1 \le i \le M$.

Thus, the output of the neuron is given by

$$y(t) = \mathrm{Sign}\left( \sum_{j=1}^{M} a_j(t)\, w_j(t) \right) \qquad (9.15)$$

More general activation functions ( sigmoid, hyperbolic tangent etc.) could be used. The

successive input functions are defined over the interval [0,T]. They are fed as inputs to the

continuous time neurons at successive SLOTS. For the sake of notational convenience, we

call such a neuron, a continuous time perceptron.

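[Figure: continuous time perceptron with inputs $a_1(t), a_2(t), \ldots, a_M(t)$, synaptic weight functions $w_1(t), w_2(t), \ldots, w_M(t)$ and output $y(t) = \mathrm{Sign}\left[\sum_{i=1}^{M} a_i(t)\, w_i(t)\right]$]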

given by:

$$W_i^{(n+1)}(t) = W_i^{(n)}(t) + \eta\, \bigl( S(t) - g(t) \bigr)\, a_i(t) \qquad (9.16)$$

where S(t) is the target output for the current training example, g(t) is the output

generated by the continuous time perceptron and η is a positive constant called the

learning rate. The proof of convergence of conventional perceptron learning law, also

guarantees the point wise convergence (not necessarily uniform convergence) of

synaptic weight functions.
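Equations (9.15) and (9.16) can be tried out on a time grid. The sketch below rests on an assumption that is not in the text: the interval [0, T] is sampled at K points, and the learning law is applied independently at each sample, which matches the pointwise convergence statement. The two training examples, the convention Sign(0) = +1 and all function names are illustrative.

```python
def sign(v):
    # Convention assumed here: Sign(0) = +1.
    return 1 if v >= 0 else -1

def neuron_output(inputs, weights):
    """Equation (9.15) at each time sample: Sign(sum_j a_j(t_k) * w_j(t_k))."""
    K = len(inputs[0])
    return [sign(sum(a[k] * w[k] for a, w in zip(inputs, weights)))
            for k in range(K)]

def train(examples, M, K, eta=0.5, max_epochs=20):
    """Learning law (9.16), applied pointwise on the sampled time grid."""
    weights = [[0.0] * K for _ in range(M)]
    for _ in range(max_epochs):
        changed = False
        for inputs, target in examples:
            g = neuron_output(inputs, weights)  # output before this update
            for i in range(M):
                for k in range(K):
                    delta = eta * (target[k] - g[k]) * inputs[i][k]
                    if delta != 0.0:
                        weights[i][k] += delta
                        changed = True
        if not changed:  # no weight function changed during a full pass
            break
    return weights

# Assumed setup: T = 1, K = 5 samples, M = 2 input functions per example.
K, M = 5, 2
times = [k / (K - 1) for k in range(K)]
pos_inputs = [[1.0 + t for t in times], [0.5] * K]     # target S(t) = +1
neg_inputs = [[-(1.0 + t) for t in times], [0.5] * K]  # target S(t) = -1
examples = [(pos_inputs, [1] * K), (neg_inputs, [-1] * K)]

weights = train(examples, M, K)
pos_out = neuron_output(pos_inputs, weights)
neg_out = neuron_output(neg_inputs, weights)
```

At each sample point the update is exactly the conventional perceptron rule, so the conventional convergence proof applies sample by sample.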

Using sigmoid function as the activation function and the continuous perceptron as

the model of neuron, it is straightforward to arrive at a continuous time Multi-Layer

Perceptron. The conventional back propagation algorithm is generalized to such a feed

forward network.

Suppose the synaptic weight functions are chosen as sinusoids, i.e. $w_i(t) = \cos \omega_i t$ or $\sin \omega_i t$ (where $\omega_i = 2\pi f_i$ and the $f_i$ are the frequencies of the sinusoids). The weighted contribution at each neuron then corresponds to Amplitude Modulation (where the synaptic weight functions are the carriers and the inputs are the base band signals).

We expect that the well known results in Modulation Theory (of communication systems) could be effectively utilized in supervised learning using a single/multiple layer perceptron.
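The amplitude modulation reading can be illustrated with a single tone. In this sketch the input is assumed to be $a(t) = \cos 2\pi f_m t$ and the weight function $w(t) = \cos 2\pi f_c t$. The frequencies are arbitrary illustrative values. The product $a(t)\,w(t)$ is compared with the two sidebands at $f_c - f_m$ and $f_c + f_m$ given by the product-to-sum identity.

```python
import math

f_m = 2.0    # assumed base band (input) frequency
f_c = 20.0   # assumed carrier (synaptic weight) frequency

def a(t):
    """Input function: a single base band tone."""
    return math.cos(2 * math.pi * f_m * t)

def w(t):
    """Synaptic weight function: the carrier."""
    return math.cos(2 * math.pi * f_c * t)

def sidebands(t):
    """cos(A) * cos(B) = 0.5 * (cos(A - B) + cos(A + B))."""
    lower = math.cos(2 * math.pi * (f_c - f_m) * t)
    upper = math.cos(2 * math.pi * (f_c + f_m) * t)
    return 0.5 * (lower + upper)

times = [k / 1000 for k in range(1000)]
max_gap = max(abs(a(t) * w(t) - sidebands(t)) for t in times)
```

The gap is at the level of floating point rounding, so the weighted contribution of this neuron input is the base band tone shifted to the carrier frequency, as in double sideband amplitude modulation.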

it is possible to consider the case where the inputs constitute a three/

multidimensional array (For instance in biological systems, the neurons are indexed

by three dimension variables). Utilizing tensor products, the outputs of continuous

time neurons are obtained. Also, using the above model of neuron, multi-layer,

multidimensional neural networks (such as Multidimensional Multi-layer

Perceptron) are designed and studied [Rama1].

• Based on the above model of neuron, it is possible to consider complex valued

neural networks in which the input functions, synaptic weight functions, thresholds

are complex valued. It is possible to generalize the perceptron learning law,

complex valued back propagation algorithm to such complex valued neural

networks [Rama1].

• It should be possible to design and study complex valued associative memories

based on the above model of neuron.

Or equivalently, what is the most general version of Lemma 9.1?

• Consider the problem of supervised learning in a function space. Equivalently

consider a function space over [0,T]. Define a distance metric over such a space.

Design a neural network which can be trained to classify the patterns into finitely

many classes (of functions) [Rama 4].

• CLIFFORD NEURAL NETWORKS: Some researchers modeled the neuronal

parameters using quaternions. These quaternion based neural networks are

utilized in practical applications such as colour night vision [KIM]. Also some

authors have utilized geometric algebra in designing novel neural networks. An

important open problem is to show that the Clifford/geometric algebra based

neural networks have important advantages over the real valued neural networks.

9.8 CONCLUSIONS

In this chapter, transforming real valued signals into complex domain (using DFT) and

processing them using complex valued neural network is discussed. A novel model of

neuron is proposed. Based on such a model, real as well as complex valued neural networks

are proposed. Some open research questions are provided.

REFERENCES

[AAV] I. N. Aizenberg, N. N. Aizenberg and J. Vandewalle, “Multi-Valued and Universal Binary Neurons”, Kluwer Academic Publishers, 2000.

[Hir] A. Hirose, “Complex Valued Neural Networks: Theories and Applications”, World Scientific Publishing Company, November 2003.

[KIM] H. Kusamichi, T. Isokawa, N. Matsui et al., “A New Scheme for Colour Night Vision by Quaternion Neural Network”, 2nd International Conference on Autonomous Robots & Agents, Dec. 13-15, 2004, Palmerston North, New Zealand.

[Nit1] T. Nitta and T. Furuya, “A Complex Back-propagation Learning”, Transactions of Information Processing Society of Japan, Vol. 32, No. 10, pp. 1319-1329 (1991) (in Japanese).

[Nit2] T. Nitta, “An Extension of the Back-Propagation Algorithm to Complex Numbers”, Neural Networks, Vol. 10, No. 8, pp. 1391-1415 (1997).

[Rama1] G. Rama Murthy, “Unified Theory of Control, Communication and Computation”, To be submitted to Proceedings of IEEE.

[Rama2] G. Rama Murthy, “Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory”, International Journal of Neural Systems, Vol. 15, No. 3 (2005), 1-13, June 2005.

[Rama3] G. Rama Murthy and D. Praveen, “Complex-Valued Neural Associative Memory on the Complex Hypercube”, Proceedings of 2004 IEEE International Conference on Cybernetics and Intelligent Systems (CIS-2004), Singapore.

[Rama4] G. Rama Murthy, “Linear Filter Model of Synapses: Associated Novel Real/Complex Valued Neural Networks”, IIIT Technical Report in preparation.

[Rama5] G. Rama Murthy, “Some Novel Real/Complex Valued Neural Network Models”, Proceedings of 9th Fuzzy Days (International Conference on Computational Intelligence), September 2006, Dortmund, Germany, Pages 473-483.

CHAPTER

10

Advanced Theory of

Evolution of Living Systems

the author led to the birth of the field of Mathematical Cybernetics. Its formal/mathematical

basis can be contrasted to the several ad hoc-pseudo mathematical developments in the

fields of mathematical biology, psychology, engineering etc. based on the initial enthusiasm

generated by Wiener who coined the word.

Mathematical cybernetics is the field of formal clarity/completeness which provides abstract models of increasing complexity that demonstrate the equivalence of the functions of control, communication and computation in the machine and the living system, from the point of view of theory and practice (physical hardware), through linear/non-linear, real/complex dynamical systems.

This field of mathematical/formal research on LIVING SYSTEMS in the universe

pioneered by the author led him to take a deeper and mathematical approach to the CRUDE,

EMPIRICAL THEORY OF EVOLUTION originated by Darwin and seriously investigated

by various biological/zoological researchers. It is argued that the concepts utilized by the

evolutionists in the theory of organic evolution are incorrect and need to be modified/

updated in view of the unified theory of control, communication and computation. For the

purposes of completeness along with brevity some details are provided in the following.

Various life forms starting with one/few cell organisms such as amoeba, hydra etc. have

evolved from the organic mass in the oceans under certain atmospheric conditions. Some of these organisms started from an ellipsoid based egg. The egg based life forms formed organs such as eyes and a mouth due to the organic reactions taking place inside the egg.

With one/two eyes formed on the surface of the egg, due to the rotation of the earth, the

natural terrain in the oceans, the egg was constantly drifting in the ocean. The homogeneity

of the organic mass has simultaneously led to the formation of several eggs in the same

region. This led to the problem of congestion in the region (of eggs). The life forms, in order

to cope with the problem began to develop limbs for LOCOMOTION. The remaining organs

formed due to the natural environment/atmosphere have similar topological features in

different species of living systems (UNIFICATION of various LIVING species). The life forms

began to jostle and to handle the environmental needs, the ellipsoid based egg deformed to

form various shapes for the body and primitive organs (non-intelligence based). These

differences in the shape/topological features of the body, organs led to the classification of

such living systems into species such as frogs, fish, crocodiles etc.

The initial organic mass based life had no intelligence. The set of characteristics that

are common to various life forms have formed over a large length of time. Some novel and

innovative concepts which are the distinguishing features of this advanced theory are

briefly described below.

(A) Principle of Equivalence of Trainability of Intelligence in all Natural Living Systems: The

observed variation of intelligence in living systems could be due to variation of

the biochemical content in the brain. But by training various living systems to

learn a language, they could be made intelligent and a certain living animal

culture with intelligence can be developed. In essence, various lower/higher level

animals could be organized in a zoo and rendered useful to themselves as well

as homo-sapiens.

(B) Principle of Non-Necessity of Perceived Needs of Living Systems: Various species

characteristics as needs have evolved over a period of time. These cravings/needs

are genetically replicated. Some of these needs/cravings are not necessary to

sustain life. For instance, METABOLISM which leads to killing of one life form

by another was an accident and is not necessary to sustain the life of various

species of organic life based machines.

(C) “Life” and “death” are identical to functioning and non-functioning machines. It

should be possible to take an organically non-decayed species form which is “dead”

due to “bleeding”, heart failure, some organic decay, malignant growth etc. and make

it living. Thus, in some sense, there is no death. In summary, unified theory sheds

a proper light on the previously less understood concepts of death and life.

(D) Reproduction, as a species need has evolved over a long period of time. This has

led to the problem of overpopulation in some parts of the planet. As is well

known, reproductive appetite could be turned off.

geographical boundaries, across countries have arisen due to the programming

of the homo-sapien machines. For instance, the tradition of “battle” in European

communities is a disease. Thus, various human, living machine diseases are cured

through proper understanding like problems in science or mathematics.

(F) By cross fertilizing the eggs of different species of living machine such as monkeys,

homo-sapiens, lions, tigers etc. it might be possible to give birth to lower-higher

animal, sea-land animal, air-land animal, other combinations of living machines.

Thus, creation/organization/civilization/culturing of whole new class of living

machines is a promising possibility.

Utilizing the mathematical theories of optimization (of various types...functional,

multivariate, constrained etc), MORPHOGENESIS etc. various macro-scale aspects of

living systems—shapes of body organs, their functions etc; micro-scale aspects of living

systems—structure of DNA, protein chains, structure of genes etc. are formally explained.

Thus, systems in living as well as non-living material universe are endowed with an

optimization interpretation (at micro-scale as well as macro-scale).

10.4 CONCLUSIONS

In an effort to understand non-living physical reality, various sub-fields of science such as physics and chemistry were developed. Based on experimental observations from physical reality, various mathematical and empirical theories were constructed to derive laws of nature. These laws, principles and theories on non-living physical reality were utilized to develop science and engineering. The field of biology was developed to understand the composition, operation and coordination of various organs/functional units of living systems in nature such as homo sapiens, tigers etc. The distinction between living physical reality and non-living physical reality was very puzzling to scientists. In the mid-1940s, N. Wiener coined the word CYBERNETICS for the field dedicated to understanding the control, communication and computation functions of living systems.

The author pioneered the field of mathematical cybernetics by unifying the control, communication and computation functions of living system functional units. Thus, a mathematical model of natural living systems was developed. It is shown, in the context of one dimensional linear dynamical systems, that the unification includes various other functions along with control, communication and computation. By utilizing the tensor state space representation of certain multi/infinite dimensional linear dynamical systems discovered by the author, cybernetics results for multi/infinite dimensional systems were developed. These results enabled the development of multi/infinite dimensional coding, computation and system theories.

The author also made some pioneering investigations into the functions of various natural living systems. These investigations provided the important conclusion that living machines such as homo sapiens, tigers etc. programmed themselves for functions such as metabolism, sex etc. Many issues of importance to the living machines, such as their control/coordination, diseases and programmed bad habits, are all addressed based on a proper understanding of the theory. The advanced theory of evolution resulting from the unified theory of control, communication and computation resulted in new perspectives into nature based living systems.

In summary, in this book, the author related multidimensional logic, coding and control

theories to the concept of multidimensional neural networks (proposed by him). He

innovated a novel complex signum function and proposed a novel complex valued

associative memory. Several novel models of neuron are proposed and associated real as

well as complex valued neural networks are discussed.

Index 141

Index

A Arbitrary Open 54

ARMA Time Series Model 72

A codeword 91 Array 35

A Human/Animal Brain 125 Artificial 3, 107

A Multi-Layer Feed Forward Network 120 Artificial Neural Networks 81, 117, 118, 129

A Sigmoid Function 120 Associated Boundary Conditions 87

Astable State 16, 124 Associative Memory 79, 80, 125, 129

Abstract Mathematical Structure 118 Attasi’s Model 64

Abstract Model 79 Audio Signal 129

Abstract models 137 Auto-Regressive (AR) 70

Accurate 118 Auto-Regressive Moving Average (ARMA) 70

Activation Function 118, 125, 133, 134 Autocorrelation Tensors 71

Activation Functions 125 Automata 4, 10

Adaptive Neural Networks 24

Addition 120

Additive 42

B

Additive I.I.D. Noise Term 72 Back Propagation Algorithm 118, 122, 126, 130, 133

Adjoint Equations 87 Basic Model of a Neuron 129

Admissible Control Tensors 87 Basis 33

Admissible Functions 64 Behavior 64, 96

Admissible Sequence 83, 108, 109, 113 Behavior Approach 64

Advanced Theory of Evolution 137 Better Model of Neurons 117

Algebraic Geometry 45 Bijection 131

Algebraic Threshold Function 1, 29, 95, 97 Bijective Linear Transformation 131

All One Dimensional Logic Gates 81 Binary Arrays 9

All-Ones Tensor 40 Binary Codes 50

Amplitude Modulation 135 “Binary Filtering” 113

Analog Linear Filter. 121 Binary Linear Multidimensional Code, 42

Analysis 62, 63, 76 Binary Tensor 33

AND, OR , NOR, NAND, XOR Gate 17 Binary Valued Functions 123

AND, OR, NOR, NAND, XOR Gates 81 Binary Vector 123

AND, OR, NOR, NAND, XOR, NOT Gates 90 Biological Neural Networks 117

AND, OR, NOT, NAND, XOR, NOR 9 Biological Neurons 117

Animal 1 Biological Systems 135

Approximation 55 Bipartite Graph 104

Approximation Theory 55 Block Codes 107

Arbitrary Algebraic Threshold Function 12 Block Symmetric Tensor 18

142 Multidimensional Neural Networks: Unified Theory

Block Tensors 33 Coding Theory Approach 107

Blocked Tensor 32 Coding Theory 27

Boolean Algebra 3 Colored Noise 71

Boolean Function 46, 48, 49 Colored Noise Model 72

Boolean Functions 9, 46, 47 Common Thread 80

Boolean Logic Theory 80 Common Thread of Neural Networks 80, 81

Bounded Extent 113 Communication 4, 79, 80, 81

Bounded Lattice 24, 53, 54 Communication and Computation 91

Bounded Lattices 54 Communication Systems 135

Bounded Magnitude 113 Commutative Operator 113

Bounded Support [0, T] 118, 123 Compact 54

Bounded Support Input 113 Compact Set 53, 54, 55

Brain of Powerful Robots 80 Compatible Tensors 15, 59

Complex 97

C Complex Domain 136

Complex Hypercube 96, 97

Causal/Non-Causal Parts 75 Complex Neural Networks 24

Certain Discrete Time Multidimensional Systems Complex Number 97, 120

83 Complex Part of the Weight Matrix 103

Certain Discrete Time System 82 Complex Signum Function 105, 125

Certain Multidimensional Linear System 109 Complex Synaptic Weight Matrix 102

Certain Multidimensional Linear Systems 62 Complex Synaptic Weights 130

Certain Multidimensional System 82 Complex Valued 96

Certain Multidimensional Systems 69, 82 Complex Valued Associative memory 96, 102, 125

Certain Multi/Infinite 65 Complex Valued Backpropagation Algorithm 135

Channels 107 Complex Valued, Continuous Time Associative

Characteristic Tensor 32 Memory 124

Cholesky Decomposition 56, 57 Complex Valued, Continuous Time Multi-layer

Circuits 32 Perceptron 126

Civilization, 1 Complex Valued, Continuous Time Neuron 125

Class of Input Signals 118 Complex Valued Inputs 129

Class of Inputs 123 Complex Valued Multi-Layer Perceptron 133

Class of Problems in P? 56 Complex Valued Neural Net 102

Classes (of Functions) 135 Complex Valued Neural Network 96, 101, 105,

Classes of Functions 55 133, 135, 136

Classes of NP-Hard Problems 56 Complex Valued Neural Networks 118, 125, 130

Clifford Neural Networks 135 Complex Valued Neuron 125

Clifford/Geometric Algebra 135 Complex Valued Perceptron 133

Code 29, 49 Complex Valued Perceptrons 133

Codes 27 Complex Valued Sigmoidal Function 126

Codeword 33, 38, 40, 81 Complex Valued Synaptic Weights 129

Codeword Array 35 Complex Valued Vector 97

Codeword Tensor 35, 42, 43, 46, 47, 49 Complex-Valued 97, 135

Codeword Vector 35 Complexity Theory 55

Codewords 27, 53, 54 Computation 3

Index 143

Computation Functions 92 Control, Communication 4, 90, 92

Concept of a Logic Gate 80 Control, Communication and Computation 92

Concept of Error 81 Controllability 61, 62, 82

Concept of Error Correcting Code 80 Controllability, Observability, Stability 69

Concept of Neural Network 81 Conventional Approaches 68

Concepts 82 Conventional Back Propagation Algorithm 134

Connection Matrix 57, 58, 96, 113 Conventional Model of Neuron 133

Connection Structure 12, 14, 18, 21, 24, 27, 29, Conventional Perceptron 134

31, 32, 34, 92 Conventional Perceptrons 133

Connection Tensor 22 Conventional State Space Models 77

Connectionist Structure 11, 13 Converge 97

Constrained Static Optimization 53 Converge Uniformly 119

Constraint 57 Convergence 22

Constraint set 24, 53, 54, 55, 57, 84 Convergence of Energy 16

Continuous Time 86 Convergence Properties 96

Continuous Function 55 Convergence Theorem 10, 14, 17, 21, 30, 125

Continuous Index/Argument 87 Convergence Theorems 96

Continuous Objective Functions 55 Converges 133

Continuous Time 62, 71 Convex Polygon/Polytope 57

Continuous Time Associative Memory 122 Convex Polytope 57

Continuous Time I. I. D. Noise 72 Convolution 113, 118, 120

Continuous Time Linear Multidimensional Convolution operation 119

Systems 62 Correcting codes 81

Continuous Time Linear System 123 Corrupted codeword 27

Continuous Time Multi-Layer 134 Corrupted word 92

Continuous Time Neural Associative Memory 91 Cosets 41

Continuous Time Neural Networks 24, 121, 126 Cost function 84, 109

Continuous Time Neuron 121 Criterion 108

Continuous Time Neurons 119, 135 Cut 30, 32

Continuous Time or Discrete Time 4 Cut codes 33

Continuous Time Perceptron 118, 134 Cut space 32, 33, 92

Continuous Time Perceptron Learning Law Cuts 32

119, 134 Cybernetics 79

Continuous Time Perceptron Law 118, 134 Cycle 21, 97

Continuous Time Perceptrons 121 Cycle of length 2, 97

Continuous Time Strucured Markov Random field 71 Cycle of length two 104

Continuous Time System 124 Cycles 22, 30

Continuous Time Systems 66 Cycles in the State Space 20

Continuous Time Version 75

Continuous Time Versions 86

Continuous Time Versions of These Models 71 D

Continuous-time 118, 129

Continuous-time Multi-Layer Perceptron 120, Darwin 137

121 Data bases 62

Continuous-Time Neuron 120 “Death” 138

Contraction 12, 14, 35, 48 Decoders 34

144 Multidimensional Neural Networks: Unified Theory

Decoding Algorithm 45 Discrete Time System 123, 124

Decoding Techniques 28, 52 Discrete Time Systems 66, 67, 82

Decomposition Principle 29, 54, 55, 57 Discrete Time, Time Varying Linear Systems: 83

Decompositions of the State Space 76 Discrete Time, Two Dimensional System 62

Definition 23 Discretizing a Continuous Time System 124

Degree n 11 Disease 139

Derivative 67 Distance Measures 42

Design 63 Distance Metric Over Such a Space 135

Design a Constellation 27 Distributed Dynamical Systems 73, 76, 77

Design, Analysis 61, 62 Distributed Elements 117, 126, 133

Design of Codes 28 Distributed Nature of 74

Determinant of the Matrix 132 Dual 29

DFT 133, 136 DUAL of Signal Design Problem 113

Difference Equation 71 Dynamic as well as Static Linear Systems 76

Difference Equations 61 Dynamic Optimization 24

Difference in Energy 16 Dynamic Programming Principle 83

Differential Equations 61, 69 Dynamical System 29, 66

Differential/partial differential equations 76 Dynamical Systems 63, 75, 86, 137

Digital filter 122 Dynamics 82

Digital filters 130

Digital Signal Processing 130 E

Dimension 13, 22, 29

Dimensional 27 Each Neuron Act 135

Dimensional Continuous Time 67 Egg 138

Dimensional Dynamical Systems 65 Eigentensors 72

Dimensional Hypercube 53 Eigenvalue 58

Dimensional Linear as well as Non-Linear Codes 34 Eigenvalues 72

Dimensional Neural Networks 27 Eigenvectors 58

Discovered and Formalized 82 Electrical Transmission Lines 73

Discovery and Formalization 80 Ellipsoid 138

Discrete Fourier Transform 130 Ellipsoid Based Egg 137

Discrete Fourier Transform matrix 133 Empirical Theory of Evolution 137

Discrete Fourier Transformation 132 Encode 35

Discrete Maximal Principle 108 Encoded Codeword 37

Discrete maximum principle 83 Encoder 4

Discrete memoryless channel 37 Encoders 34

Discrete Sequence of Samples 132 Encoding 27, 35

Discrete Sequences 130 Encoding Algorithm 45

Discrete Time 82 Encoding Procedure 37, 38, 43, 44

Discrete Time Input Signals 122 Encoding/Decoding Algorithms 35, 45

Discrete Time Linear System 124 Energy Density Matrix 123

Discrete Time Multidimensional Neural Network 29

Discrete Time Multidimensional Systems 62

Discrete Time Multi/Infinitedimensional System 83

Energy density tensor 86, 89, 124

Energy E 16

Energy Function 10, 1, 15, 16, 17, 21, 27, 28, 29, 30, 31, 38, 43, 44, 45, 58, 59, 81, 90, 91, 95, 97, 98, 102, 103, 105, 124

Index 145

Energy Function Being A Quadratic Form 91 Fundamental 4

Energy Function Hill Climbing 124

G

Energy Functions 18, 19, 20, 24, 38, 45, 51, 54, 90

Energy Landscape 27

Energy Values 21

G/M/1-Type Structure 72

Entropy 36

Game-Theoretic Codes: Optimal Codes 39

Entropy/Uncertainty 37

Generalization of Back Propagation Algorithm 120

Equilibrium Distribution 72

Error Correcting Codes 31, 34, 59, 81

Generalized Logic Circuit 19

Errors 27, 39

Generalized Logic Function 19

Euclidean Space 117

Generalized Logic Gate 19

Every Codeword 91

Generalized Multidimensional Logic Gate 19

Every Local Maximum 81, 91

Generalized Multidimensional Neural Network 19

Evolution 62

Generalized Neural 91

Evolution At Node 14

Generalized Neural Network 11, 28

Evolution Equations 61

Generalized Neural Networks 124

Evolution of The System 74

Generalized/Multidimensional Neural Networks 12

Evolutionists 137

Generator Tensor 29, 34, 35, 37, 38, 40, 42, 49, 53, 71, 72, 90

Exclusive OR 35, 129

Generator Tensor, Codeword Tensors 52

F Generator Tensor G 43

Generator Tensors 35

Fast Fourier Transform 130 Generator/Information Tensor 36

Feed Forward Network 118, 134 Generator/Parity Check Tensor 92

Feed-forward/Recurrent Networks of Neurons Generator/Parity Check Tensors 36

118 Global Maximum 28, 29, 30, 34, 38, 40, 81

‘Field’, F 121 Global Optimization 55

Filter 118 Global Optimum 28, 92

Filtering 73, 107 Global Optimum Control Vector 108

Finite Dimensional Vector Space 63 Global Optimum Impulse Response 115

Finite Field 35, 42 Global Optimum Stable State 52

Finite Fields 59 Global States 63

Finite Impulse Response Filter 122 Global/Local Optimum 54

Finite Impulse Response Model 121 Graph-Theoretic Code 22, 57, 92

Finite Impulse Response Model of Synapse 118 Graphoid 29, 31, 33, 34, 92

Finite Support 118, 120, 123 Graphoid Based Codes 29

Finitely Many Classes 117 Graphoid Codes 31, 34

Fourier Laplace Transform 121 Graphoid Theoretic Codes 32

Fully Parallel Mode 14, 16, 20, 96, 103, 104 Graphs of Convergence 102

Fully Symmetric Connection Tensor 91 Group 42, 52

Fully Symmetric Tensor 10, 14, 29, 30, 82, 84

H

Fully Symmetric Tensor of 22

Fully Symmetric Tensor S 13, 15

Function 31

Hadamard Matrix 46, 47

Function Space 118, 120, 121

Half Plane Causal 75

Half Plane, Quarter Plane Causal Type Impulse Response of FIR Filters 122

Neighbourhood 76 Impulse Response Sequences 122

Hamiltonian 83, 87 Impulse Response Tensor 83, 85, 86, 89

Hamming Distance 33, 38, 42, 43, 58 Impulse Response Values 113

Hardware 5 Impulse Response 113

Heine-Borel Theorem 53 Incidence Tensor 32

Hermitian 97 Independent, Identically Distributed Noise Model

Hermitian Symmetric Matrix 24 72

Hermitian Synaptic Weight Matrix 97 Infinitedimensional 20, 23

Heuristic Procedures 59 Infinitedimensional Code 36

Higher Degree 23 Infinitedimensional Codes 29, 36

Higher Degree Forms 19 Infinitedimensional Hypercube 23

Higher Dimensional Space 75 Infinitedimensional Linear System 109

Hirose 105 Infinitedimensional Logic Function 22, 23

Homo-sapien 1, 2 Infinitedimensional Logic Theory 10, 20, 22

Homo-sapien Machines 139 Infinitedimensional Neural Network 22

Homo-sapiens 2, 129, 138 Infinitedimensional Symmetric Matrix 20

Homogeneous 71, 74 Infinitedimensional Systems 66

Homogeneous Form of Degree N. 11, 12 Infinitedimensional Tensor 36

Homogeneous Quadratic Form 11 Infinitedimensional Vector 21

Homogeneous Stochastic Linear Systems 71 Infinitedimensional Vector Space 20

Honig 123 Infinite Impulse Response Filter 122

Hop-skip Algorithm 57 Infinite Order/Dimension 29

Hopfield 80, 102, 103, 129 Infinitedimensional Logic Circuit 23

Hopfield Associative Memory 96, 103, 129 Infinitedimensional Logic Gate 22

Hopfield Convergence Theorem 100 Infinitedimensional Logic Gates 23

Hopfield Model 95, 96 Infinitedimensional Logic Synthesis 23

Hopfield Network 124 Information Tensor 35, 37, 38, 46, 49, 52

Hopfield Neural Network 81, 115 Information Theory 107

Hopfield/Amari 79 Information Vector 35

Hypercube 23, 58, 84, 95 Initial Condition 83

Hypercubes 57 Initial State Tensor 72

‘Hyperplane’ 121 Inner and Outer Products Make Sense 84

Hypersphere 58 Inner Product 14, 15, 38, 49, 53, 59, 69, 71, 84, 109

Hyperspheres 54 Inner Product of the Given Tensors 12

Inner Product Operation 82

Inner Product Operator 15

I Inner Product/Outer Product 70

Inner/Outer Product 72

Identity Tensor 40 Innovative Ideas 29

IDFT 133 Input 113, 117

Image Models 73 Input and Output Signal States 17, 81, 90

Image Processing 63, 96, 105 Input Coupling Tensor 67, 68, 82

Image Processing, Tomography 76 Input Energy Density Matrix 113

Images 129 Input Functions 135

Imaginary Parts of 97 Input, Output, State Variables 66

Input Sequence 109, 122 Lindelöf’s Covering Lemma 54

Input Signal Samples 117 Linear 66, 118

Input Signal Tensors 83 Linear Algebra 61, 63

Input Signals 108, 118 Linear Algebra Concepts 35

Input Tensor 42, 67, 75, 82, 87 Linear As Well As Non-linear Codes 90

Input-Output Models 63 Linear Block Code 29, 81

Input-Output Point of View 63 Linear Block Codes 45, 57

Input-Output Representations 61 Linear Block Multidimensional Code 28

Input/Stable States 10 Linear Block Multidimensional Codes 53

Inputs 117 Linear Code 113

Inputs to Neurons 133 Linear Discrete Time Filter 108

Integer Programming 52 Linear Dynamical System 88

Integer/Non-linear Programming 59 Linear Dynamical Systems 70, 91

Integral of Tensor Function 86 Linear Equations 47

Intelligence 3, 138 Linear Filtering 113

Interconnection Structure 23 Linear Filters 126

Interesting Observation 97 Linear Multidimensional 49

Internal Representations 64 Linear Multidimensional Block Code 38, 41, 42

Invariant Distribution 72 Linear Multidimensional Block Codes 38, 41

Invariants of a Tensor 17 Linear Multidimensional Codeword Tensor 49

Invertible 133 Linear Operator 63, 70, 72

Isolated Theories 79 Linear Operators 62, 109

Isolating 97 Linear Prediction 73

Linear Programming 56

J Linear Programming Problems 57

Linear Programming Problems: Decomposition

Jacobian Matrix 76 Principle 57

Jacobian Tensor 76 Linear Separability 121

Linear Space 68

K Linear System 107, 117

Linear System Theory 61

Knapsack Problem 57 Linear Systems 5, 24, 61

Linear Systems/Filters 118

L

Linear Tensor/Vector 32

Linear Tensor/Vector Space 32

Linear Time Invariant Continuous Time System 81

Language 138

Linear Time Invariant Multidimensional System 89

Latent Variable Models 64

Linear Time Varying Multidimensional System 91

Lattice 22, 50, 59

Lattice (Unbounded Lattice) 24 Linear Time Varying Multi/Infinitedimensional

Lattice 5 Dynamical Systems 84

Learning Laws 117 Linear Time Varying System 108

Learning Rate 119, 134 Linear Time Varying Systems 112

Lee Distance 42, 43, 44, 45 Linear Time-invariant System 124

Lee Weight 44 Linear Transformation 131

“Life” 138 Linear Transformation Groups 51

Linear/Non-linear 137 M/G/1-type Structure 72

Linear/Non-linear Codewords 54 Machines 3, 4

Linear/Non-linear Dynamical Systems 62 Macro-scale Aspects 139

Linear/Non-linear Multidimensional Codewords 54 Magnetic and Optical Recording Systems 81

Linearly Separable 122, 131 Magnetic/Optical Recording Channel 123

Linearly Separable Patterns 131 Magnetic/Optical Recording Channels 122

Linearly Separable Test Patterns 133 Markov Chains 71

Linearly Separable 132 Markov Random Field is a Stochastic Linear

Living Systems 4, 137 System 73

Living Systems/Machines 129 Markovian 64

Local Control 62, 66, 76 Markovian Property 65

Local Input 66 Markovian Source 37

Local Input, Local Output 76 Mathematical Abstraction 10

Local Maxima 100 Mathematical Clarity 80

Local Maximum 28, 29, 91 Mathematical Cybernetics 137

Local Minimum/Maximum Of 17 Mathematical Model 11, 12, 13, 79, 90

Local Optima 55, 57, 81, 90 Mathematical Structure 120

Local Optima of Energy Function 19, 81 Mathematical Theory of Communication 36

Local Optima of The Energy Functions 18 Matrix 133

Local Optimum 19, 54, 91, 92 Maximization of a Quadratic Form 52

Local Optimum Control Vector 108 Maximization of Multivariate Polynomial 37

Local Output 66 Maximum Eigenvector 58

Local State 62, 66, 76 Maximum Likelihood Decoding 28, 29, 33, 34,

Local State, Local Control 76 41, 43, 44, 45, 50, 57, 59, 81, 92

Local State Tensor 75 Maximum Likelihood Decoding Problem 37, 42, 43

Local State, The Input 64 Maximum Likelihood Problem (MID) 41

Local States 63 Maximum of The Quadratic Energy Function 34

Locomotion 138 Maximum Weight Independent Set 52

Logic Circuit 10, 19 Maximum/Minimum 55

Logic Circuits 10, 19, 80 Maximum/Minimum Energy States 90

Logic Function 10 McCulloch and Pitts 118

Logic Functions 17, 22, 114 McCulloch-Pitts Neuron 95

Logic Gate 10, 81 Mean Square Error 107

Logic Gates 3, 9, 19, 22, 80 Measurement Noise 73

Logic Gates, Logic Circuits 17 Measurement Noise Models 72

Logic Synthesis 9, 10, 80 Message 4

Logic Theory 9, 81 Metabolism 138

Lumped 117 Metabolism and Reproduction 1, 6

Lumped Elements 117, 126, 133 Metric Space 54

Lyapunov Function 124 Minimum 33

Minimum Cut 28, 29, 31, 34, 56

M Minimum Cut Computation 56

Minimum Cut Problem 52

m-d Hopfield Neural Network 82 Minimum Distance 34, 38, 39, 45

m-d Neural Network 114 Minimum Distance, Correctable Errors 28

M-Dimensional Euclidean Space 131 Minimum Weight 31

Minsky 118 Multidimensional Error Correcting Codes 28

Mode of Operation 15 Multidimensional Generalization 56

Model 96, 97 Multidimensional Generalizations 126

Model of Associative Memory 129 Multidimensional Generalized Neural Networks 90

Model of Neuron 118 Multidimensional Graph-type Structure 92

Model of Synapse 122 Multidimensional Hopfield Neural Network 80

Modeling, Design 62 Multidimensional Hypercube 17, 19, 28, 47, 91

Modeling Distributed Dynamical Systems 74 Multidimensional Information Array (Information

Models 117 Theory) 35

Models of Distributed Systems 74 Multidimensional Information Theory 36

Models of Neuronal Circuitry 129 Multidimensional Lattice 19, 23, 27, 28, 38, 51, 53

Models of Neurons 122 Multidimensional Linear As Well As Non-linear 80

Models of Tomographic Images of Brain 73 Multidimensional Linear Block Code

Modern Approaches 68 28, 33, 37, 38, 39, 40, 43

Modes of Operation 14, 96 Multidimensional Linear Block Codes 29, 34

Modes, Serial 96 Multidimensional Linear Code 37, 90, 92

Modulation Theory: Feed Forward Neural Multidimensional Linear Codes 37, 54

Networks 135 Multidimensional Linear Codeword Constellation 34

Monomial 37 Multidimensional Linear Space 69, 70

Monomials 47, 53 Multidimensional Linear Spaces 62

More General Constraint Sets 53 Multidimensional Linear Systems 5

Morphogenesis 139 Multidimensional Logic 19

Multi Layer Perceptron 122 Multidimensional Logic Circuit 18, 23

Multidimensional 66, 90, 110 Multidimensional Logic Function 17, 18

Multidimensional Array/Tensor 23 Multidimensional Logic Functions 10, 18, 19, 25, 90

Multidimensional Arrays 10, 36 Multidimensional Logic Gate 10, 17, 19, 90

Multidimensional Block Code 90 Multidimensional Logic Gate Functions 17, 90

Multidimensional Bounded Lattice 19, 54 Multidimensional Logic Gate/Circuit 10

Multidimensional Channel 33, 36, 43 Multidimensional Logic Gates 18, 23

Multidimensional Code 33, 35 Multidimensional Logic Synthesis 18, 25

Multidimensional Codes 27, 28, 34, 51, 59 Multidimensional Logic Theory 17, 22, 90

Multidimensional Codeword 44, 45 Multidimensional Logic Theory/Logic Synthesis 23

Multidimensional Codeword Set 54 Multidimensional Logic Units 10

Multidimensional Codeword Tensors 91 Multidimensional Metric Space 53

Multidimensional Codewords 38 Multidimensional Neural Network 10, 11, 13, 14,

Multidimensional Coding Theory 27, 90 15, 18, 22, 27, 29, 30, 31, 34, 59, 90, 92, 118

Multidimensional Communication Systems 36 Multidimensional Neural Networks 10, 11, 18, 23,

Multidimensional Constrained Optimization 28, 51, 59, 80, 82, 90, 92

Problem 55 Multidimensional Neuron 15

Multidimensional Decoding Techniques 53 Multidimensional Neuronal Element 29

Multidimensional Discrete Time Dynamical Multidimensional Neurons 13

System 67 Multidimensional Non-Binary/Binary Codes 53

Multidimensional Encoder 19 Multidimensional Optimization Theory 53

Multidimensional Encoders As Well As Decoders 28 Multidimensional Space 65

Multidimensional Encoding Scheme 37 Multidimensional Stochastic Dynamical System 72

Multidimensional System Theory 62

Multidimensional Systems Natural Linear 107

62, 63, 66, 71, 80, 82, 90 Natural Living Systems 138

Multidimensional Tensor Codeword 40 Nearest/Farthest Neighbourhood Set 75

Multidimensions 114 Necessary Condition 108, 110

Multi-input, Multi-output (MIMO) 124 Neighbourhood Sets 75

Multi-input, Multi-output (MIMO) Channels 108 Nerode Equivalence 63, 64

Multi-input, Multi-output (MIMO) Linear Time Networks 91

Varying Filters 123 Neumann Series 124

Multi-Layer Feed Forward Neural Network 129 Neural 91

Multi-layer, Multidimensional Neural Networks 135 Neural Net 102, 104

Multi-Layer Neural Networks 117 Neural Network 28, 30, 56, 57, 58, 90, 95, 97,

Multi-Layer Perceptron 117, 118, 129 108, 110, 113, 122, 135

“Multi-Order” System 66 Neural Network Model 27

Multi-Tensor Variate Polynomials 52 Neural Networks 59, 120, 121, 130, 133, 135

Multi-Variate Polynomial Equations 52 Neural/Generalized Neural Networks 38, 59

Multi-Variate Quadratic Form 92 Neuron 12, 13, 122

Multi/Infinitedimensional 5, 63 Neuron Output 96

Multi/Infinitedimensional Coding Theory 35, 62 Neuronal Element 29

Multi/Infinitedimensional Distributed Systems 63 Neuronal Models 117, 118, 120

Multi/Infinitedimensional Hypercube 86, 87 No Natural Notion of Causality 82

Multi/Infinitedimensional Linear Codes 59 Node 13, 99

Multi/Infinitedimensional Linear System 87 Nodes 13

Multi/Infinitedimensional Linear Systems 63, 70 Noise Model 72, 73

Multi/Infinitedimensional Neural Networks 10, 62 Noise Process 73

Multi/Infinitedimensional State 65 Noise Processes 72

Multi/Infinitedimensional State Space 75 Noise Terms 72

Multi/Infinitedimensional State Space Structure 70 Noisy Communication Channels 107

Multi/Infinitedimensional Structured Markov Non-binary Codes 28, 29, 50

Random Field 71 Non-binary Linear Codes 41

Multi/Infinitedimensional System 63 Non-causal Two Dimensional Dynamics 65

Multi/Infinitedimensional System Theory 62, 68, 69 Non-homogeneity 74

Multi/Infinitedimensional Systems 63, 69, 84 Non-linear 28

Multi/Infinitedimensional Versions of Time-series Non-linear Block Codes 45

Models 70 Non-linear Codes 27, 46, 47, 90

Multi/Infinitedimensional State Space 70 Non-linear Multidimensional Codes 29, 46, 50

Multiple Arguments 74 Non-linear System 66

Multiplication 120 Non-linear Systems 107

Multiplication of Tensors 12 Non-linearly Separable Patterns 118, 129, 133

Multiplicative Group 42 Non-negative Diagonal 104

Multiplicative Representation 42 Non-planar Graph 31

Multivariate Polynomial 29, 52, 53, 54, 59 Non-stationary Fields 74

Multivariate Polynomials 28, 29, 34, 51, 52, 54 Non-stationary Tensor Fields 75

Nonstationary Fields 74

Novel Associative Memory 122

Novel Continuous Time Associative Memory 118 Optimal Binary Filters: Neural Networks 107

Novel Model of a Neuron 133 Optimal Code 92

Novel Model of Associative Memory 126 Optimal Codeword 92

Novel Model of Continuous Time Neuron 130 Optimal Codeword Vector 81

Novel Model of Neuron 130, 136 Optimal Control 5, 92, 110

Novel Models of Neurons 126 Optimal Control of Certain Multidimensional

NP-hard Problem 52, 56, 57 System 82

NP-hard Problems 55, 59 Optimal Control Problem 87, 114

Optimal Control Sequence 84

O Optimal Control Tensor 82, 84, 86, 89, 91, 92

Optimal Control Tensor Sequence 83

Objective Function 55, 57, 80, 87, 91 Optimal Control Tensors 80, 82, 87, 91, 92

Objective Function J 109 Optimal Control Vector 81, 112, 123, 124

Objective Functions 55 Optimal Control Vectors 81, 92, 108

Observability 61, 62, 82 Optimal Control/Signal Design 115

One Dimensional 5 Optimal Filter 107

One Dimensional Arrays 9 Optimal Filter Design Problem 113

One Dimensional Arrays, i.e. vectors 90 Optimal Filter Problem 113

One Dimensional Arrays of Zeroes and Ones 80 Optimal Filtering Problem 107, 114

One Dimensional Coding Theory 35 Optimal Input 113

One Dimensional Error Control Coding Theory 34 Optimal Input Vector 123

One Dimensional Error Correcting Code 81 Optimal Linear Multidimensional Code 91

One Dimensional Error Correcting Codes 90 Optimal Logic Functions 92

One Dimensional Linear Dynamic Systems 63 Optimal Logic Gate Output 81

One Dimensional Linear Space 69 Optimal Multidimensional Logic Functions 91

One Dimensional Linear System 82 Optimal Sequence 109

One Dimensional Linear Systems 69 Optimal Set of Impulse Responses 113

One Dimensional Logic Functions 18, 81 Optimal Signal Design 107

One Dimensional Logic Theory 17, 18, Optimal Switching Function 91, 92

50, 51, 80, 90 Optimality Condition 83

One Dimensional Neural Network 11, 20, 81 Optimization 23, 27, 28, 54, 55, 59, 80

One Dimensional Neural Networks 20, 80, 81 Optimization Approach 80

One Dimensional Non-Linear Codes 46 Optimization Constraint 27

One Dimensional Optimal Control Vectors 81 Optimization of Multivariate Polynomials 28, 50

One Dimensional Stochastic Linear Systems 71 Optimization of Quadratic/Higher Degree Forms 28

One Dimensional System Theory 76 Optimization Over More General Constraint Sets 54

One Dimensional Systems 71, 80, 82 Optimum Input Signal 108

One-dimensional Linear Dynamical System 108 Optimum Stable State 56

One/Two/Three Dimensional Information 129 Order 13, 22, 29

Open Problem 83 Ordinary Difference 69

Open Questions 130 Ordinary/Partial Difference/Differential Equations 76

Open Research Problem 107, 122, 123 Organic Evolution 137

Open Set 54 Organic Life Based Machines 138

Open/Closed Sets 29 Organic Mass 137, 138

Operating in the Fully Parallel Mode 97 Organically Non-decayed 138

Optical Networks 34 Oscillate 104

Oscillation 105 Problem of Communication 79

Outer Product 14, 35, 69 Problem of Computation 79

Outer Product of Tensors 12 Problem of Control 79

Output 64, 113 Procedures 27

Output Generated 119 Product 15

Output of A Neuron 133 Programming 139

Output of Each Synapse 118 Programming Problem 59

Output States 10 Proof Arguments 28

Output Tensor 67, 68, 82 Proof of Convergence 134

Output Tensors 72 Proof Technique 96

Outputs 117 Proper Class 133

pth Root of Unity 42

P pth Roots of Unity 44

Parallel Computers 9 Q

Parallel Data Transfer 34

Parallel Mode 21, 22, 97 Quadratic 84

Parallel Modes 14 Quadratic Energy 91

Parity Check Equations 35 Quadratic Energy Function 22, 31, 34, 92

Parity Check Matrices 35 Quadratic Form 23, 31, 56, 58, 86, 91, 92, 95

Parity Check Matrix 35 Quadratic Form over the Hypercube 80

Parity Check Tensor 29, 40 Quadratic Forms 12, 19

Partial Differential Equations 68, 69 Quadratic Objective Function 57, 91, 92

Pattern Recognition 96 Quadratic/Higher Degree Energy Function 10

Patterns 132 Quarter Plane Causal Distributed Dynamical

Perceptron 118, 119, 129, 134 Systems 75

Perceptron Learning Law 118, 135 Quarter Plane Causal Model 75

Perceptron Model 135 Quarter Plane Causality 63, 64, 65, 70

Planar Graphs 32 Quarter-plane Causality, Half-plane Causality 62,

Plant and Measurement Noise 73 82

Plant Noise 72 Quaternion Based Neural Networks 135

Plant Noise Model 73

Point Wise Convergence 134 R

Polynomial 46, 49

Polynomial Representation 37, 40 Random Field 72

Polynomial Time Algorithms 57 Random Field Models 71

Polynomials 46, 47 Random Process 72

Polynomials, Power Series 55 Random Variable 72

Pontryagin Function 83, 87 Rational Functions 63

Positive Definite Symmetric Matrix 56, 57 Real Anti-symmetric One 97

Positive Definite Synaptic Weight Matrix 56 Real Connection Matrix 102

Power Spectrum 71 Real Mode 100

Preciseness 80 Real Numbers 120

Prediction 73 Real Part 97, 98, 99

Prime 43 Real Symmetric 97

Real Valued Associative Memories 130 SISO Discrete Time, Linear Time Invariant

Real Valued Neural Networks 117 Systems 81

Real Valued Neuron 118 Solution of the Difference Equation 85

Real/Complex Valued Sequences 130 Space 32

Realistic Model 117 Space Representations 71

Realizable 83 Special Sets 55

Received Tensor 43, 53 Species 6

Received Tensor Word 50 Species of Living Systems 138

Received Word 45 Spectral Representation Theorem 72

Recursion 124 Spectral/Cholesky Type Decomposition 57

Redundancy 4 Sphere Packing Problem 45

Representation 38, 82 Stability 62

Reproduction 138 Stable 14, 30

Response Determination 70 Stable State 16, 20, 21, 22, 30, 40, 58, 92, 96,

Ring 120 97, 102, 104, 113, 115

Robots 3 Stable State of a Continuous Time 89

Roesser’s Model 64 Stable State of a Multidimensional Hopfield

Neura 86

S Stable States 17, 18, 19, 22, 80, 81, 91, 92, 108,

110, 113, 114

Samples 131 Stable States (Stable Functions) 5

Scalar Synaptic Weight 117 Stable States of a Hopfield Network 92

Second Order Models 64 Stable States of a Hopfield Neural Network 81

Separable Filters 64 Stable States of a Multidimensional Hopfield

Separating Line/Hyper Plane 132 Neural Network 82

Serial Mode 14, 20, 21, 22, 30, 96, 97, 99, 103, Stable States of a One Dimensional Neural

104 Network 81

Shannon 107 Stable States of Multidimensional Neural (Generalized) Network 91

Sigmoid Function 134

Sign Structure 58 Stable States of Neural Network 81

Signal Design 122 Standard Theorems 55

Signal Design Problem 24 State 13, 31, 67, 95

Signed Integral Equation 123 State Coupling Tensor 67, 68, 82

Signum Function 105, 118, 120 State Equations 88

Signum or Sigmoid or Hyperbolic Tangent 133 State Estimation 73

Simulated Annealing 27 State Evolution 30

Single Input, Single Output 81, 113 State, Input 72

Single Input, Single Output Linear Time Invariant State of a Neuron 13

123 State of a Node 20

Single Layer 117, 121, 133 State of Neuron. 15

Single Layer Neural Network 81 State of Node 30

Single Layer of Perceptrons 118 State of the Dynamical System 68, 82

Single Layer Perceptron 117 State of the Network 14, 20, 30

Single Synaptic Weight 117 State of the Node 16

Single/Multi-Layer Continuous Time Neural State Response 109

Networks 125 State Space 22, 27, 30, 63, 74

State Space Description 61, 67, 82, 107, 108 Synaptic Weight Matrix 56, 124

State Space Description of a Dynamical System 75 Synaptic Weight Sequence Values 122

State Space Model 64 Synaptic Weights 13, 15, 20, 25, 97, 117, 118, 133

State Space Representation 58, 59, 61, 63, 65, Syndrome 41

66, 69, 73, 76, 82, 91 Synthesis 61, 62

State Space Representation Through Tensors 70 System 61

State Space Representations 72, 76 System Dynamics 75, 83, 87, 124

State Space Structure 69 System Theorists 63, 68

State Tensor 13 System Theory 66, 107

State Transition Tensor 71, 72, 88 System Theory Approach 107

State Transitions 74 Systematic Form 29

States of Neural Networks. 80 Systems 67, 139

Static 63 Systems Function 61

Static Optimization 23, 29, 54, 55

Static Optimization Problems 53

Static Systems 65, 70

T

Stochastic Control Theory 73, 76 Target Output 119, 122, 134

Stochastic Dynamic Programming 73 Tensor 11, 29, 32, 68

Stochastic Linear Systems 71 Tensor Algebra 49, 76

Stochastic Models 72 Tensor Algebra Concepts 35

Stochastic Processes 71 Tensor Analysis 66

Stochastic Tensor 37 Tensor Based 29

Storage of Data 81 Tensor Based Difference/Differential Equations 76

Stress Tensor 67 Tensor Based Multivariate Polynomials 28

Structure of Optimal Control 110 Tensor Based State Space Representation 63

Structure of the local Optimum 86 Tensor Field 74

Structured Markov Random Field 71, 72, 73 Tensor Functions 74

Structured Markov Random Fields 71 Tensor Geometric Form 72

Sub-spaces 34 Tensor Inner Product (Outer Product) 35

Subsets of Multidimensional Lattice 19, 28 Tensor Linear Operator 34, 35, 62, 63, 65, 69, 76

Subsets of the Lattice 51 Tensor Linear Operators 62, 71, 76

Successive Approximation Procedure 123 Tensor Linear Space 36

Successive Approximation Scheme 123 Tensor Linear Spaces 59, 69, 70

Successive Input Functions 119, 134 Tensor of Partial Derivatives 76

Supervised Learning 135 Tensor of Probabilities of The States 71

Supervised Learning in a Function Space 135 Tensor Product 12, 48

Supervised Training 117 Tensor Products 46, 73, 76, 125, 135

Switching/Logic Functions 19 Tensor Products, Matrix Products 47

Symmetric 99 Tensor Spaces 28, 34

Symmetric Matrix 11, 20, 58, 95 Tensor State Space 82

Symmetric Tensor 10, 18, 23 Tensor State Space Description 69, 70

Synapse 117, 121 Tensor State Space Representation 62, 63, 66,

Synapses 117, 118, 126, 133 67, 68, 71, 72, 73, 76, 80, 82, 92

Synaptic Contribution 15 Tensor State Space Representations 68

Synaptic Weight 134, 135 Tensor-tensor Products 76

Synaptic Weight Function 121 Tensor-tensor Variables 70

Synaptic Weight Functions Tensors 10, 12, 38, 58, 62

118, 121, 125, 133, 134, 135 Terminal State 84

The Common Thread of Neural Networks 91

The Nervous System 79

The Real and Imaginary Parts 98

U

The Time Varying Linear System 85 Unbounded Lattice 24

Theorems 28 Undirected Graph 57, 95

Theoretical Computer Science 56 Uni-variate Scalar Polynomial 52

Theory of Error Control Codes 45 Unification 80, 90, 92

Theory of Error Correcting Codes 39 Unification of Control 91

Theory of Multidimensional Neural Networks 28

Unified 81

Three Dimensional Array 125

Unified Theory 6, 92

Three Dimensional Scenes 129

Unified Theory of Control, Communication and

Three/Multidimensional Array 135

Computation 80, 126, 137

Threshold 13

Uniform Convergence 134

Threshold Value 29

Univariate 51

Thresholds 21, 29, 130, 135

Universe 1

Time Invariant Linear System 86

Unknown Tensor X 53

Time Series Model 73

Updating of the Function 99

Time Varying Synaptic Weight Matrix 123

Time Varying Systems 66

Time-invariant Systems 66 V

Topography 27

Topological 34 Various Recurrent Networks 120

Topological Features 138 Vector Space 120, 121

Total Output Energy 81, 108 Vector-matrix Variables 70

Trainability of Intelligence 138 Vector/Matrix Products 76

Training 138 Version 124

Training Example 119, 122

W

Training Phase 117

Training Samples 117, 133

Transfer Function 107

Weight Matrix 95, 97

Transformed Samples 132

Weight of a Cut 31

Transitions 19

Weighted and Undirected Non-planar Graph 30

Transmitted Tensor 43

Weighted Contribution 135

TSSR 73

Weighted Undirected Connectionist Structure 29

Two Dimensional Distributed System 75

Weights 119

Two Dimensional Euclidean Space 132

Weights are Complex 103

Two Dimensional Filter Theory 63

White As Well As Colored 73

Two Dimensional Neural Networks 10

Two Dimensional Signal Processing 75 White Noise 71

Two Dimensional State Space Models 63 Wiener 107, 137

Two Dimensional System Theory 75 Wiener and Kalman Filters 73

Two Possible States 13 Wyner 81, 122

Two/Multidimensional Logic Circuits 9

Two/Multidimensional Arrays 9 Z

Two/Multidimensional Neural Networks. 11

Two/Multidimensional System Theory 62, 76 Zero Mean Tensors 72

Two/Multidimensional Systems 63 Zurada 105

- Fuzzy logic, neural network & genetic algorithmsUploaded byJaya Shukla
- Pattern RecognitionUploaded byLawrence Wang
- An Introduction to the Modeling of Neural NetworksUploaded byOvidiu Dumitrescu
- Statistical Models and Causal Inference a Dialogue With the Social SciencesUploaded bycreativewithin
- Negative Binomial RegressionUploaded byCícero Souza
- Artificial Neural Networks in Real-Life Applications [Idea, 2006]Uploaded byelectric_circuits90
- Artificial Neural Networks Architecture ApplicationsUploaded byRaamses Díaz
- Artificial Neural Networks in Finance and ManufacturingUploaded byunicornstar
- Artificial Higher Order Neural Networks for Economics and BusinessUploaded byjasonmiller.uga5271
- Neural Network Learning Theoretical FoundationsUploaded byAntonio Marcegaglia
- Neural EngineeringUploaded bySameeran Amar Nath
- Artificial Intelligence for Humans, Volume 3 - Jeff HeatonUploaded byHemant Chaudhari
- Fundamentals of Neural Networks by Laurene FausettUploaded bysivakumar
- Rao R.P.N. Probabilistic Models of the Brain- Perception and Neural FunctionUploaded bychyansheaujiun
- Neural NetworksUploaded byMarioEs
- Neural NetworksUploaded byrian ngganden
- Speech Recognition Using Neural NetworksUploaded bydervish
- Livingstone, Data AnalysisUploaded bychemistj
- Deep LearningUploaded bynidar
- Practical Methods of OptimizationUploaded byMehdi Poornikoo
- 0521519004ArtificialIntelligenceUploaded byFernando Campos Cano
- Fuzzy NeuralUploaded byapi-3834446
- algorithms-and-architectures-of-artificial-intelligence-frontiers-in-artificial-intelligence-and-applications 9781586037703 29752Uploaded byl3oy_in_l3lack
- Neural Fuzzy System – by Chin Teng LinUploaded bymansoor.ahmed100
- Generalized Linear ModelsUploaded byBoris Polanco
- Algebraic and Geometric Methods in StatisticsUploaded byjuntujuntu
- Koch I. Analysis of Multivariate and High-Dimensional Data 2013Uploaded byrciani
- Planning AlgorithmUploaded byvv
- Tensor Flow 101Uploaded byNarasimhaiah Narahari
- appendix_tensorflow.pdfUploaded bykumar kumar
