Exam Preparation

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Mon, 03 Dec 2012 18:03:43 UTC

Contents
Articles
Fick's laws of diffusion
Conservation of mass
Atom-transfer radical-polymerization
Living polymerization
Dispersity
Molar mass distribution
Biotechnology
Biosensor
Biochemical cascade
Biocatalysis
Enzyme
Active site
Activation energy
Oxidoreductase
Glucose oxidase
Peroxidase
Horseradish peroxidase
Inclusion body
Protein folding
Protein purification
Chromatography
Gel permeation chromatography
Size-exclusion chromatography
Affinity chromatography
High-performance liquid chromatography
Electrophoresis
Gel electrophoresis
Ion chromatography
Antibody
Immunoprecipitation
Coagulation
Protease
Heat equation
Diffusion
Mass diffusivity
Chemical potential
Conservation law
Mass–energy equivalence
Momentum
Angular momentum
Charge conservation
Conservation of energy
First law of thermodynamics
Laws of thermodynamics
Continuity equation
Fluid mechanics
Navier–Stokes equations
Conserved quantity
Energy flux
Mass flow rate
Fluid dynamics

References
Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses
License

Fick's laws of diffusion


Fick's laws of diffusion describe diffusion and can be used to solve for the diffusion coefficient, D. They were derived by Adolf Fick in the year 1855.

Fick's first law


Fick's first law relates the diffusive flux to the concentration under the assumption of steady state. It postulates that the flux goes from regions of high concentration to regions of low concentration, with a magnitude that is proportional to the concentration gradient (spatial derivative). In one (spatial) dimension, the law is

$$J = -D \frac{\partial \phi}{\partial x}$$

where
$J$ is the "diffusion flux" [(amount of substance) per unit area per unit time], example mol/(m² s). $J$ measures the amount of substance that will flow through a small area during a small time interval.
$D$ is the diffusion coefficient or diffusivity in dimensions of [length² time⁻¹], example m²/s.
$\phi$ (for ideal mixtures) is the concentration in dimensions of [amount of substance per unit volume], example mol/m³.
$x$ is the position [length], example m.

Figure: Molecular diffusion from a microscopic and macroscopic point of view. Initially, there are solute molecules on the left side of a barrier (purple line) and none on the right. The barrier is removed, and the solute diffuses to fill the whole container. Top: A single molecule moves around randomly. Middle: With more molecules, there is a clear trend where the solute fills the container more and more uniformly. Bottom: With an enormous number of solute molecules, randomness becomes undetectable: the solute appears to move smoothly and systematically from high-concentration areas to low-concentration areas. This smooth flow is described by Fick's laws.

$D$ is proportional to the squared velocity of the diffusing particles, which depends on the temperature, the viscosity of the fluid and the size of the particles according to the Stokes-Einstein relation. In dilute aqueous solutions the diffusion coefficients of most ions are similar and have values that at room temperature are in the range of $0.6 \times 10^{-9}$ to $2 \times 10^{-9}$ m²/s. For biological molecules the diffusion coefficients normally range from $10^{-11}$ to $10^{-10}$ m²/s.

In two or more dimensions we must use $\nabla$, the del or gradient operator, which generalises the first derivative, obtaining

$$J = -D \nabla \phi.$$

The driving force for the one-dimensional diffusion is the quantity $-\partial \phi / \partial x$, which for ideal mixtures is the concentration gradient. In chemical systems other than ideal solutions or mixtures, the driving force for diffusion of each species is the gradient of chemical potential of this species. Then Fick's first law (one-dimensional case) can be written as:

$$J_i = -\frac{D c_i}{R T} \frac{\partial \mu_i}{\partial x}$$

where the index $i$ denotes the $i$th species, $c$ is the concentration (mol/m³), $R$ is the universal gas constant (J/(K mol)), $T$ is the absolute temperature (K), and $\mu$ is the chemical potential (J/mol).

If the primary variable is mass fraction ($y_i$, given, for example, in kg/kg), then the equation changes to:

$$J_i = -\rho D \nabla y_i$$

where $\rho$ is the fluid density (for example, in kg/m³). Note that the density is outside the gradient operator.
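As an illustration of the one-dimensional first law, $J = -D\,\partial\phi/\partial x$, the following minimal sketch estimates the flux from a sampled concentration profile using a finite difference. The function name `fick_flux_1d`, the grid and the numerical values are assumptions chosen for the example; only the order of magnitude of `D` is taken from the text.

```python
def fick_flux_1d(phi, dx, D):
    """Estimate J = -D * dphi/dx between neighbouring points of a uniform grid."""
    return [-D * (phi[i + 1] - phi[i]) / dx for i in range(len(phi) - 1)]

# Hypothetical profile: concentration falling linearly from 1.0 to 0.0 mol/m^3 over 1 mm.
D = 1.0e-9                          # m^2/s, within the range quoted for ions in water
dx = 2.5e-4                         # m, grid spacing
phi = [1.0, 0.75, 0.5, 0.25, 0.0]   # mol/m^3
flux = fick_flux_1d(phi, dx, D)     # mol/(m^2 s), one value per grid interval
```

For a linear profile the flux is the same in every interval, and its positive sign indicates transport towards increasing x, that is, from high to low concentration.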

Fick's second law


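Fick's second law predicts how diffusion causes the concentration to change with time:

$$\frac{\partial \phi}{\partial t} = D \frac{\partial^2 \phi}{\partial x^2}$$

where
$\phi$ is the concentration in dimensions of [(amount of substance) length⁻³], example mol/m³.
$t$ is time [s].
$D$ is the diffusion coefficient in dimensions of [length² time⁻¹], example m²/s.
$x$ is the position [length], example m.

It can be derived from Fick's first law and mass conservation in the absence of any chemical reactions:

$$\frac{\partial \phi}{\partial t} + \frac{\partial J}{\partial x} = 0 \quad\Rightarrow\quad \frac{\partial \phi}{\partial t} - \frac{\partial}{\partial x}\left(D \frac{\partial \phi}{\partial x}\right) = 0$$

Assuming the diffusion coefficient $D$ to be a constant, we can exchange the order of differentiation and multiplication by the constant:

$$\frac{\partial}{\partial x}\left(D \frac{\partial \phi}{\partial x}\right) = D \frac{\partial}{\partial x} \frac{\partial \phi}{\partial x} = D \frac{\partial^2 \phi}{\partial x^2}$$

and thus recover the form of Fick's equation stated above.

For the case of diffusion in two or more dimensions Fick's second law becomes

$$\frac{\partial \phi}{\partial t} = D \nabla^2 \phi,$$

which is analogous to the heat equation.

If the diffusion coefficient is not a constant, but depends upon the coordinate and/or concentration, Fick's second law yields

$$\frac{\partial \phi}{\partial t} = \nabla \cdot (D \nabla \phi)$$

An important example is the case where $\phi$ is at a steady state, i.e. the concentration does not change with time, so that the left part of the above equation is identically zero. In one dimension with constant $D$, the solution for the concentration will be a linear change of concentrations along $x$. In two or more dimensions we obtain

$$\nabla^2 \phi = 0$$

which is Laplace's equation, the solutions to which are called harmonic functions by mathematicians.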
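The second law with constant $D$, $\partial\phi/\partial t = D\,\partial^2\phi/\partial x^2$, can be explored numerically. The sketch below uses an explicit forward-time, centred-space finite-difference scheme, which is one common choice and not something prescribed by the text; the function name, grid and parameter values are assumptions. With both end concentrations held fixed, the profile relaxes to the linear steady state described above.

```python
def diffuse_ftcs(phi, D, dx, dt, steps):
    """Advance dphi/dt = D * d2phi/dx2 in time, keeping the two end values fixed."""
    r = D * dt / dx**2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dx^2 > 0.5")
    phi = list(phi)
    for _ in range(steps):
        new = phi[:]
        for i in range(1, len(phi) - 1):
            # Discrete second derivative drives the local change in concentration.
            new[i] = phi[i] + r * (phi[i + 1] - 2.0 * phi[i] + phi[i - 1])
        phi = new
    return phi

# Hypothetical set-up in arbitrary units: concentration 1 at the left wall, 0 at the right.
n = 11
start = [1.0] + [0.0] * (n - 1)
result = diffuse_ftcs(start, D=1.0, dx=0.1, dt=0.004, steps=2000)
steady = [1.0 - i / (n - 1) for i in range(n)]   # linear steady-state profile
```

After enough steps `result` is numerically indistinguishable from `steady`. The stability check reflects a known limitation of this explicit scheme rather than a property of Fick's law itself.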

Example solution in one dimension: diffusion length


A simple case of diffusion with time $t$ in one dimension (taken as the x-axis) from a boundary located at position $x = 0$, where the concentration is maintained at a value $n(0)$, is

$$n(x,t) = n(0)\,\mathrm{erfc}\left(\frac{x}{2\sqrt{Dt}}\right),$$

where erfc is the complementary error function. The length $2\sqrt{Dt}$ is called the diffusion length and provides a measure of how far the concentration has propagated in the x-direction by diffusion in time $t$ (Bird, 1976).

As a quick approximation of the error function, the first 2 terms of the Taylor series can be used:

$$n(x,t) = n(0)\left[1 - 2\left(\frac{x}{2\sqrt{Dt\pi}}\right)\right]$$

If $D$ is time-dependent, the diffusion length becomes

$$2\sqrt{\int_0^t D(t')\,dt'}.$$

This idea is useful for estimating a diffusion length over a heating and cooling cycle, where $D$ varies with temperature.
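The quantities in this example are easy to evaluate directly. The sketch below assumes the standard forms $n(x,t) = n(0)\,\mathrm{erfc}\big(x/(2\sqrt{Dt})\big)$ and diffusion length $2\sqrt{Dt}$, and uses `math.erfc` from the Python standard library. The function names and numerical values are hypothetical, and the time-dependent case is approximated with a simple rectangle-rule sum.

```python
import math

def diffusion_length(D, t):
    """Diffusion length 2*sqrt(D*t) for constant D."""
    return 2.0 * math.sqrt(D * t)

def concentration(x, t, D, n0):
    """Profile n(x, t) = n0 * erfc(x / (2*sqrt(D*t))) for a fixed boundary value n0."""
    return n0 * math.erfc(x / (2.0 * math.sqrt(D * t)))

def concentration_linearised(x, t, D, n0):
    """First two Taylor terms of erfc: n0 * (1 - x / sqrt(pi*D*t)); valid for small x."""
    return n0 * (1.0 - x / math.sqrt(math.pi * D * t))

def diffusion_length_variable(D_values, dt):
    """Diffusion length when D is sampled every dt seconds (rectangle-rule integral)."""
    return 2.0 * math.sqrt(sum(D_values) * dt)

# Hypothetical numbers: D = 1e-9 m^2/s and t = 1000 s.
L = diffusion_length(1.0e-9, 1000.0)                  # metres
at_L = concentration(L, 1000.0, 1.0e-9, 1.0)          # equals erfc(1) times n0
```

At one diffusion length from the boundary the concentration has dropped to erfc(1), roughly 16% of the boundary value, which is why the length serves as a measure of penetration depth.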

Generalizations
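1. In inhomogeneous media, the diffusion coefficient varies in space, $D = D(x)$. This dependence does not affect Fick's first law, but the second law changes:

$$\frac{\partial \phi(x,t)}{\partial t} = \nabla \cdot \big(D(x) \nabla \phi(x,t)\big) = D(x)\,\Delta \phi(x,t) + \sum_{i=1}^{3} \frac{\partial D(x)}{\partial x_i} \frac{\partial \phi(x,t)}{\partial x_i}$$

2. In anisotropic media, the diffusion coefficient depends on the direction. It is a symmetric tensor $D = D_{ij}$. Fick's first law changes to

$$J = -D \nabla \phi, \qquad J_i = -\sum_{j=1}^{3} D_{ij} \frac{\partial \phi}{\partial x_j}.$$

For the diffusion equation this formula gives

$$\frac{\partial \phi(x,t)}{\partial t} = \nabla \cdot \big(D \nabla \phi(x,t)\big) = \sum_{i=1}^{3} \sum_{j=1}^{3} D_{ij} \frac{\partial^2 \phi(x,t)}{\partial x_i \partial x_j}.$$

The symmetric matrix of diffusion coefficients $D_{ij}$ should be positive definite. This is needed to make the right-hand side operator elliptic.

3. For inhomogeneous anisotropic media these two forms of the diffusion equation should be combined in

$$\frac{\partial \phi(x,t)}{\partial t} = \nabla \cdot \big(D(x) \nabla \phi(x,t)\big) = \sum_{i,j=1}^{3} \left( D_{ij}(x) \frac{\partial^2 \phi(x,t)}{\partial x_i \partial x_j} + \frac{\partial D_{ij}(x)}{\partial x_i} \frac{\partial \phi(x,t)}{\partial x_j} \right).$$

4. The approach based on Einstein's mobility and the Teorell formula gives the following generalization of Fick's equation for the multicomponent diffusion of the perfect components:

$$\frac{\partial \phi_i}{\partial t} = \sum_j \nabla \cdot \left( D_{ij} \frac{\phi_i}{\phi_j} \nabla \phi_j \right),$$

where $\phi_i$ are concentrations of the components and $D_{ij}$ is the matrix of coefficients. Here, the indexes $i, j$ are related to the various components and not to the space coordinates.

The Chapman-Enskog formulas for diffusion in gases include exactly the same terms. It should be stressed that these physical models of diffusion are different from the toy-models $\partial_t \phi_i = \sum_j D_{ij} \Delta \phi_j$, which are valid for very small deviations from the uniform equilibrium. Earlier, such terms were introduced in the Maxwell–Stefan diffusion equation.

For anisotropic multicomponent diffusion coefficients one needs 4-index quantities, for example $D_{ij,\alpha\beta}$, where $i, j$ are related to the components and $\alpha, \beta = 1, 2, 3$ correspond to the space coordinates.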
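To make the anisotropic case concrete, the sketch below evaluates $J_i = -\sum_j D_{ij}\,\partial\phi/\partial x_j$ for a two-dimensional example and checks positive definiteness with Sylvester's criterion. The tensor entries, the gradient and the function names are hypothetical values chosen only for illustration.

```python
def anisotropic_flux(D, grad):
    """Return J with components J_i = -sum_j D[i][j] * grad[j]."""
    return [-sum(D[i][j] * grad[j] for j in range(len(grad))) for i in range(len(D))]

def is_positive_definite_2x2(D):
    """Sylvester's criterion for a symmetric 2x2 matrix: D00 > 0 and det(D) > 0."""
    symmetric = D[0][1] == D[1][0]
    determinant = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    return symmetric and D[0][0] > 0 and determinant > 0

# Hypothetical symmetric diffusion tensor (arbitrary units) and a gradient along x only.
D = [[2.0, 0.5],
     [0.5, 1.0]]
grad_phi = [1.0, 0.0]
J = anisotropic_flux(D, grad_phi)
```

Here `J` has a nonzero y-component although the gradient points purely along x: in an anisotropic medium the flux need not be parallel to the concentration gradient.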

History
In 1855, physiologist Adolf Fick first reported[1][2] his now well-known laws governing the transport of mass through diffusive means. Fick's work was inspired by the earlier experiments of Thomas Graham, which fell short of proposing the fundamental laws for which Fick would become famous. Fick's law is analogous to the relationships discovered in the same era by other eminent scientists: Darcy's law (hydraulic flow), Ohm's law (charge transport), and Fourier's law (heat transport). Fick's experiments (modeled on Graham's) dealt with measuring the concentrations and fluxes of salt diffusing between two reservoirs through tubes of water. It is notable that Fick's work primarily concerned diffusion in fluids, because at the time diffusion in solids was not considered generally possible.[3] Today, Fick's laws form the core of our understanding of diffusion in solids, liquids, and gases (in the absence of bulk fluid motion in the latter two cases). Diffusion processes that do not follow Fick's laws do occur,[4][5] and are referred to as non-Fickian; they are exceptions that "prove" the importance of the general rules that Fick outlined in 1855.

Applications
Equations based on Fick's law have been commonly used to model transport processes in foods, neurons, biopolymers, pharmaceuticals, porous soils, population dynamics, nuclear materials, semiconductor doping processes, etc. The theory of all voltammetric methods is based on solutions of Fick's equation. A large amount of experimental research in polymer science and food science has shown that a more general approach is required to describe the transport of components in materials undergoing glass transition. In the vicinity of the glass transition the flow behavior becomes "non-Fickian". It can be shown that Fick's law can be obtained from the Maxwell-Stefan equations[6] of multi-component mass transfer. Fick's law is a limiting case of the Maxwell-Stefan equations, when the mixture is extremely dilute and every chemical species interacts only with the bulk mixture and not with other species. To account for the presence of multiple species in a non-dilute mixture, several variations of the Maxwell-Stefan equations are used. See also non-diagonal coupled transport processes (Onsager relationship).

Biological perspective
The first law gives rise to the following formula:[7]

$$\text{Flux} = -P\,(c_2 - c_1)$$

in which
$P$ is the permeability, an experimentally determined membrane "conductance" for a given gas at a given temperature.
$c_2 - c_1$ is the difference in concentration of the gas across the membrane for the direction of flow (from $c_1$ to $c_2$).

Fick's first law is also important in radiation transfer equations. However, in this context it becomes inaccurate when the diffusion constant is low and the radiation becomes limited by the speed of light rather than by the resistance of the material the radiation is flowing through. In this situation, one can use a flux limiter. The exchange rate of a gas across a fluid membrane can be determined by using this law together with Graham's law.

Fick's flow in liquids


When two miscible liquids are brought into contact and diffusion takes place, the macroscopic (or average) concentration evolves following Fick's law. On a mesoscopic scale, that is, between the macroscopic scale described by Fick's law and the molecular scale, where molecular random walks take place, fluctuations cannot be neglected. Such situations can be successfully modeled with Landau-Lifshitz fluctuating hydrodynamics. In this theoretical framework, diffusion is due to fluctuations whose dimensions range from the molecular scale to the macroscopic scale.[8] In particular, fluctuating hydrodynamic equations include a Fick's flow term, with a given diffusion coefficient, along with hydrodynamics equations and stochastic terms describing fluctuations. When calculating the fluctuations with a perturbative approach, the zero-order approximation is Fick's law. The first order gives the fluctuations, and it turns out that fluctuations contribute to diffusion. This is something of a tautology, since the phenomenon described by a lower-order approximation is the result of a higher-order approximation; the problem is solved only by renormalizing the fluctuating hydrodynamics equations.

Semiconductor fabrication applications


Integrated circuit (IC) fabrication technologies model processes such as CVD, thermal oxidation, wet oxidation, and doping using diffusion equations obtained from Fick's law. In certain cases, the solutions are obtained for boundary conditions such as constant source concentration diffusion, limited source concentration, or moving boundary diffusion (where the junction depth keeps moving into the substrate).

Derivation of Fick's 1st law in 1 dimension


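The following derivation is based on a similar argument made in Berg 1977 (see references). Consider a collection of particles performing a random walk in one dimension with length scale $\Delta x$ and time scale $\Delta t$. Let $N(x,t)$ be the number of particles at position $x$ at time $t$.

At a given time step, half of the particles would move left and half would move right. Since half of the particles at point $x$ move right and half of the particles at point $x + \Delta x$ move left, the net movement to the right is:

$$-\frac{1}{2}\left[N(x + \Delta x, t) - N(x, t)\right]$$

The flux, $J$, is this net movement of particles across some area element of area $a$, normal to the random walk, during a time interval $\Delta t$. Hence we may write:

$$J = -\frac{1}{2}\left[\frac{N(x + \Delta x, t)}{a\,\Delta t} - \frac{N(x, t)}{a\,\Delta t}\right]$$

Multiplying the top and bottom of the right-hand side by $(\Delta x)^2$ and rewriting, we obtain:

$$J = -\frac{(\Delta x)^2}{2\,\Delta t}\left[\frac{N(x + \Delta x, t)}{a\,(\Delta x)^2} - \frac{N(x, t)}{a\,(\Delta x)^2}\right]$$

We note that concentration is defined as particles per unit volume, and hence $\phi(x,t) = N(x,t) / (a\,\Delta x)$. In addition, $(\Delta x)^2 / (2\,\Delta t)$ is the definition of the diffusion constant in one dimension, $D$. Thus our expression simplifies to:

$$J = -D\left[\frac{\phi(x + \Delta x, t)}{\Delta x} - \frac{\phi(x, t)}{\Delta x}\right]$$

In the limit where $\Delta x$ is infinitesimal, the right-hand side becomes a space derivative:

$$J = -D \frac{\partial \phi}{\partial x}$$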
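The random-walk picture behind this derivation can be checked numerically. The sketch below evolves the expected occupancy of a one-dimensional lattice, moving half of the particles at each site left and half right at every time step, and compares the mean squared displacement with $2Dt$ for $D = (\Delta x)^2 / (2\,\Delta t)$. The lattice size, step sizes and function names are assumptions made for the example, and the expected (deterministic) occupancy is used instead of random sampling so that the result is reproducible.

```python
def step(N):
    """One time step: half of the particles at each site move left, half move right."""
    new = [0.0] * len(N)
    for i, n in enumerate(N):
        if n:
            new[i - 1] += 0.5 * n
            new[i + 1] += 0.5 * n
    return new

def net_move_right(N, i):
    """Net number of particles crossing from site i to site i+1 in the next step."""
    return -0.5 * (N[i + 1] - N[i])

dx, dt = 1.0e-6, 1.0e-3          # hypothetical length and time scales (m, s)
D = dx**2 / (2.0 * dt)           # diffusion constant defined in the derivation
steps = 50
centre = steps + 1
N = [0.0] * (2 * steps + 3)      # wide enough that no particle reaches the edges
N[centre] = 1.0                  # all particles (normalised to 1) start at the centre
first_crossing = net_move_right(N, centre)   # half the particles step to the right

for _ in range(steps):
    N = step(N)

# Mean squared displacement of the distribution after `steps` time steps.
msd = sum(n * ((i - centre) * dx) ** 2 for i, n in enumerate(N))
```

The particle number is conserved by construction, and `msd` agrees with `2 * D * steps * dt`, which is the usual link between the microscopic step sizes and the macroscopic diffusion constant.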

Notes
[1] A. Fick, Ann. der. Physik (1855), 94, 59, doi:10.1002/andp.18551700105 (in German).
[2] A. Fick, Phil. Mag. (1855), 10, 30. (in English)
[3] Jean Philibert, One and a Half Century of Diffusion: Fick, Einstein, before and beyond, Diffusion Fundamentals 2, 2005 1.1–1.10 (http://www.uni-leipzig.de/diffusion/journal/pdf/volume2/diff_fund_2(2005)1.pdf)
[4] J. L. Vázquez (2006), The Porous Medium Equation. Mathematical Theory, Oxford Univ. Press.
[5] A.N. Gorban, H.P. Sargsyan and H.A. Wahab (2011), Quasichemical Models of Multicomponent Nonlinear Diffusion (http://arxiv.org/pdf/1012.2908v4.pdf), Mathematical Modelling of Natural Phenomena (http://journals.cambridge.org/action/displayJournal?jid=MNP), Volume 6 / Issue 05, 184–262.
[6] Taylor, Ross; R Krishna (1993). Multicomponent mass transfer. Wiley.
[7] Physiology at MCG 3/3ch9/s3ch9_2 (http://web.archive.org/web/20080401093403/http://www.lib.mcg.edu/edu/eshuphysio/program/section3/3ch9/s3ch9_2.htm)
[8] D. Brogioli and A. Vailati, Diffusive mass transfer by nonequilibrium fluctuations: Fick's law revisited, Phys. Rev. E 63, 012105/1-4 (2001) (http://arxiv.org/abs/cond-mat/0006163)

References
W.F. Smith, Foundations of Materials Science and Engineering, 3rd ed., McGraw-Hill (2004)
H.C. Berg, Random Walks in Biology, Princeton (1977)
R.B. Bird, W.E. Stewart, E.N. Lightfoot, Transport Phenomena, John Wiley & Sons (1976)

External links
Diffusion fundamentals (http://www.timedomaincvd.com/CVD_Fundamentals/xprt/intro_diffusion.html)
Diffusion in Polymer based Materials (http://www.composite-agency.com/messages/3875.html)
Fick's equations, Boltzmann's transformation, etc. (with figures and animations) (http://dragon.unideb.hu/~zerdelyi/Diffusion-on-the-nanoscale/node2.html)
Wilson, Bill. Fick's Second Law. Connexions. 21 Aug. 2007 (http://cnx.org/content/m1036/2.11/)
(http://webserver.dmt.upm.es/~isidoro/bk3/c11/Mass Transfer.htm)

Conservation of mass

The law of conservation of mass, also known as the principle of mass/matter conservation, states that the mass of an isolated system (closed to all transfers of matter and energy) will remain constant over time. This principle is equivalent to the conservation of energy: when energy or mass is enclosed in a system and none is allowed in or out, its quantity cannot otherwise change over time (hence, its quantity is "conserved" over time). The mass of an isolated system cannot be changed as a result of processes acting inside the system. The law implies that mass can neither be created nor destroyed, although it may be rearranged in space and changed into different types of particles; and that for any chemical process in an isolated system, the mass of the reactants must equal the mass of the products. The concepts of both matter and mass conservation are widely used in many fields such as chemistry, mechanics, and fluid dynamics. Historically, the principle of mass conservation, discovered in chemical reactions by Antoine Lavoisier in the late 18th century, was of crucial importance in progressing from alchemy to the modern natural science of chemistry.

In a thermodynamically closed system (i.e. one which is closed to exchanges of matter, but open to small exchanges of non-material energy (such as heat and work) with the surroundings) mass is only approximately conserved. In this case the input or output of energy changes the mass of the system, according to special relativity, although the change is usually small since relatively large amounts of energy are equivalent to only a small amount of mass. Mass is absolutely conserved in so-called isolated systems, i.e. those completely isolated from all exchanges with the environment. In special relativity, the mass-energy equivalence theorem states that mass conservation is equivalent to total energy conservation, which is the first law of thermodynamics. In special relativity the difference between closed and isolated systems becomes important, since conservation of mass is strictly and perfectly upheld only for isolated systems. In special relativity, mass is not converted to energy, as such, since energy always retains its equivalent amount of mass within any isolated system. However, certain types of matter may be converted to energy, so long as the mass of the system is unchanged in the process. When this energy is removed from systems, they lose mass. In general relativity, mass (and energy) conservation in expanding volumes of space is a complex concept, subject to different definitions, and neither mass nor energy is as strictly and simply conserved as is the case in special relativity and in Minkowski space. For a discussion, see mass in general relativity.

Antoine Lavoisier's discovery of the Law of Conservation of Mass led to many new findings in the 19th century. Joseph Proust's Law of Definite Proportions and John Dalton's Atomic Theory branched from the discoveries of Antoine Lavoisier. Lavoisier's quantitative experiments revealed that combustion involved oxygen rather than what was previously thought to be phlogiston.

History
An important idea in ancient Greek philosophy was that "Nothing comes from nothing", so that what exists now has always existed: no new matter can come into existence where there was none before. An explicit statement of this, along with the further principle that nothing can pass away into nothing, is found in Empedocles (approx. 490–430 BCE): "For it is impossible for anything to come to be from what is not, and it cannot be brought about or heard of that what is should be utterly destroyed."[1]

A further principle of conservation was stated by Epicurus (341–270 BCE) who, describing the nature of the universe, wrote that "the totality of things was always such as it is now, and always will be".[2]

Jain philosophy, a non-creationist philosophy based on the teachings of Mahavira (6th century BCE),[3] states that the universe and its constituents such as matter cannot be destroyed or created. The Jain text Tattvarthasutra (2nd century) states that a substance is permanent, but its modes are characterised by creation and destruction.[4]

A principle of the conservation of matter was also stated by Nasīr al-Dīn al-Tūsī (1201–1274). He wrote that "A body of matter cannot disappear completely. It only changes its form, condition, composition, color and other properties and turns into a different complex or elementary matter".[5]

The principle of conservation of mass was first outlined by Mikhail Lomonosov (1711–1765) in 1748. He is said to have proved it by experiments, though this is sometimes challenged.[6] Antoine Lavoisier (1743–1794) later expressed these ideas more clearly. Others who anticipated the work of Lavoisier include Joseph Black (1728–1799), Henry Cavendish (1731–1810), and Jean Rey (1583–1645).[7]

The conservation of mass was obscure for millennia because of the buoyancy effect of the Earth's atmosphere on the weight of gases. For example, a piece of wood weighs less after burning; this seemed to suggest that some of its mass disappears, or is transformed or lost. This was not disproved until careful experiments were performed in which chemical reactions such as rusting were allowed to take place in sealed glass ampoules; it was found that the chemical reaction did not change the weight of the sealed container and its contents. The vacuum pump also enabled the weighing of gases using scales.

Once understood, the conservation of mass was of great importance in progressing from alchemy to modern chemistry.
Once early chemists realized that chemical substances never disappeared but were only transformed into other substances with the same weight, these scientists could for the first time embark on quantitative studies of the transformations of substances. The idea of mass conservation plus a surmise that certain "elemental substances" also could not be transformed into others by chemical reactions, in turn led to an understanding of chemical elements, as well as the idea that all chemical processes and transformations (such as burning and metabolic reactions) are reactions between invariant amounts or weights of these chemical elements.

Generalization
In special relativity, the conservation of mass does not apply if the system is open and energy escapes. However, it does continue to apply to totally closed (isolated) systems. If energy cannot escape a system, its mass cannot decrease. In relativity theory, so long as any type of energy is retained within a system, this energy exhibits mass. Also, mass must be differentiated from matter (see below), since matter may not be perfectly conserved in isolated systems, even though mass is always conserved in such systems. However, matter is so nearly conserved in chemistry that violations of matter conservation were not measured until the nuclear age, and the assumption of matter conservation remains an important practical concept in most systems in chemistry and other studies that do not involve the high energies typical of radioactivity and nuclear reactions.

The mass associated with chemical amounts of energy is too small to measure
The change in mass of certain kinds of open systems where atoms or massive particles are not allowed to escape, but other types of energy (such as light or heat) are allowed to enter or escape, went unnoticed during the 19th century, because the change in mass associated with addition or loss of small quantities of thermal or radiant energy in chemical reactions is very small. (In theory, mass would not change at all for experiments conducted in isolated systems where heat and work were not allowed in or out.) The theoretical association of all energy with mass was made by Albert Einstein in 1905. However, Max Planck pointed out that the change in mass of systems as a result of extraction or addition of chemical energy, as predicted by Einstein's theory, is so small that it could not be measured with available instruments, for example as a test of Einstein's theory. Einstein in turn speculated that the energies associated with newly discovered radioactivity were significant enough, compared with the mass of systems producing them, to enable their mass-change to be measured, once the energy of the reaction had been removed from the system. This later indeed proved to be possible, although it was eventually to be the first artificial nuclear transmutation reactions in the 1930s, using cyclotrons, that proved the first successful test of Einstein's theory regarding mass-loss with energy-loss.
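A short calculation shows why such mass changes escaped notice. The sketch below applies the mass-energy relation in the form $\Delta m = \Delta E / c^2$; the amount of heat released is a hypothetical figure chosen only to represent a chemical-scale energy, and the function name is an assumption.

```python
C = 299_792_458.0   # speed of light in m/s (exact by definition of the metre)

def mass_equivalent(delta_E):
    """Mass change in kg associated with an energy change delta_E in joules."""
    return delta_E / C**2

heat_released = 1.0e6                       # J, hypothetical chemical-scale energy
delta_m = mass_equivalent(heat_released)    # on the order of 1e-11 kg
```

Even a megajoule of released heat corresponds to a mass change of only about ten nanograms, far too small to detect by weighing the reactants and products of an ordinary chemical reaction.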

Mass conservation remains correct if energy is not lost


The conservation of relativistic mass implies the viewpoint of a single observer (or the view from a single inertial frame) since changing inertial frames may result in a change of the total energy (relativistic energy) for systems, and this quantity determines the relativistic mass. The principle that the mass of a system of particles must be equal to the sum of their rest masses, even though true in classical physics, may be false in special relativity. The reason that rest masses cannot be simply added is that this does not take into account other forms of energy, such as kinetic and potential energy, and massless particles such as photons, all of which may (or may not) affect the mass of systems. For moving massive particles in a system, examining the rest masses of the various particles also amounts to introducing many different inertial observation frames (which is prohibited if total system energy and momentum are to be conserved), and also when in the rest frame of one particle, this procedure ignores the momenta of other particles, which affect the system mass if the other particles are in motion in this frame. For the special type of mass called invariant mass, changing the inertial frame of observation for a whole closed system has no effect on the measure of invariant mass of the system, which remains both conserved and invariant even for different observers who view the entire system. Invariant mass is a system combination of energy and momentum, which is invariant for any observer, because in any inertial frame, the energies and momenta of the various particles always add to the same quantity (the momentum may be negative, so the addition amounts to a subtraction). The invariant mass is the relativistic mass of the system when viewed in the center of momentum frame. It is the minimum mass which a system may exhibit in all possible inertial frames. 
The conservation of both relativistic and invariant mass applies even to systems of particles created by pair production, where energy for new particles may come from kinetic energy of other particles, or from a photon as part of a system. Again, neither the relativistic nor the invariant mass of totally closed (that is, isolated) systems changes when new particles are created. However, different inertial observers will disagree on the value of this conserved mass, if it is the relativistic mass (i.e., relativistic mass is conserved but not invariant). However, all observers agree on the value of the conserved mass, if the mass being measured is the invariant mass (i.e., invariant mass is both conserved and invariant). The mass-energy equivalence formula gives a different prediction in non-isolated systems, since if energy is allowed to escape a system, both relativistic mass and invariant mass will escape also. In this case, the mass-energy equivalence formula predicts that the change in mass of a system is associated with the change in its energy due to energy being added or subtracted: $\Delta m = \Delta E / c^2$. This form involving changes was the form in which this famous equation was originally presented by Einstein. In this sense, mass changes in any system are explained simply if the mass of the energy added to or removed from the system is taken into account. The formula implies that bound systems have an invariant mass (rest mass for the system) less than the sum of their parts, if the binding energy has been allowed to escape the system after the system has been bound. This may happen by converting system potential energy into some other kind of active energy, such as kinetic energy or photons, which easily escape a bound system. The difference in system masses, called a mass defect, is a measure of the binding energy in bound systems; in other words, the energy needed to break the system apart. The greater the mass defect, the larger the binding energy.
The binding energy (which itself has mass) must be released (as light or heat) when the parts combine to form the bound system, and this is the reason the mass of the bound system decreases when the energy leaves the system.[8] The total invariant mass is actually conserved when the mass of the binding energy that has escaped is taken into account.
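The statement that invariant mass combines the energy and momentum of a whole system can be illustrated with the standard relation $m^2 c^4 = E_{\text{tot}}^2 - |\mathbf{p}_{\text{tot}}|^2 c^2$. The sketch below is a minimal illustration under that assumption; the function name and the photon energy are hypothetical. It shows that two photons moving in opposite directions form a system with nonzero invariant mass, although each photon alone has none.

```python
import math

C = 299_792_458.0   # speed of light in m/s

def invariant_mass(particles):
    """Invariant mass in kg of a system given as a list of (E, (px, py, pz)) in SI units."""
    E_tot = sum(E for E, _ in particles)
    p_tot = [sum(p[k] for _, p in particles) for k in range(3)]
    p_squared = sum(component**2 for component in p_tot)
    m2c4 = E_tot**2 - p_squared * C**2
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2c4, 0.0)) / C**2

E = 1.0e-13                               # J, hypothetical photon energy
photon_a = (E, (E / C, 0.0, 0.0))         # photon moving in +x, with |p| = E/c
photon_b = (E, (-E / C, 0.0, 0.0))        # identical photon moving in -x

single = invariant_mass([photon_a])             # zero up to rounding error
pair = invariant_mass([photon_a, photon_b])     # equals 2E/c^2
```

For the pair, the momenta cancel and the invariant mass equals $2E/c^2$, which is how "massless" particles can still contribute mass to an isolated system.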

Exceptions or caveats to mass/matter conservation


Matter is not perfectly conserved
The principle of matter conservation may be considered as an approximate physical law that is true only in the classical sense, without consideration of special relativity and quantum mechanics. It is approximately true except in certain high energy applications. A particular difficulty with the idea of conservation of "matter" is that "matter" is not a well-defined word scientifically, and when particles that are considered to be "matter" (such as electrons and positrons) are annihilated to make photons (which are often not considered matter) then conservation of matter does not take place over time, even within isolated systems. However, matter is conserved to such an extent that matter conservation may be safely assumed in chemical reactions and all situations in which radioactivity and nuclear reactions are not involved.

Open systems and thermodynamically closed systems


Mass is also not generally conserved in open systems, nor in thermodynamically "closed" systems, which are closed to matter but still open to heat and work. Such is the case when various forms of energy are allowed into, or out of, the system (see, for example, binding energy). However, unless radioactivity or nuclear reactions are involved, the amount of energy escaping systems as heat, work, or electromagnetic radiation is usually too small to be measured as a decrease in system mass. The law of mass conservation for isolated systems (totally closed to all mass and energy), as viewed over time from any single inertial frame, continues to be true in modern physics. The reason for this is that relativistic equations show that even "massless" particles such as photons still add mass and energy to isolated systems, allowing mass (though not matter) to be strictly conserved in all processes where energy does not escape the system. In relativity, different observers may disagree as to the particular value of the conserved mass of a given system, but each observer will agree that this value does not change over time as long as the system is isolated (totally closed to everything).

General relativity
In general relativity, the total invariant mass of photons in an expanding volume of space will decrease, due to the red shift of such an expansion (see Mass in general relativity). The conservation of both mass and energy therefore depends on various corrections made to energy in the theory, due to the changing gravitational potential energy of such systems.

References
[1] Fr. 12; see pp.2912 of Kirk, G. S.; J. E. Raven, Malcolm Schofield (1983). The Presocratic Philosophers (2 ed.). Cambridge: Cambridge University Press. ISBN978-0-521-27455-5. [2] Long, A. A.; D. N. Sedley (1987). "Epicureanism: The principals of conservation". The Hellenistic Philosophers. Vol 1: Translations of the principal sources with philosophical commentary. Cambridge: Cambridge University Press. pp.2526. ISBN0-521-27556-3. [3] Mahavira is dated 599 BCE - 527 BCE. See. Dundas, Paul; John Hinnels ed. (2002). The Jains. London: Routledge. ISBN0-415-26606-8. p. 24 [4] Devendra (Muni.), T. G. Kalghatgi, T. S. Devadoss (1983) A source-book in Jaina philosophy Udaipur:Sri Tarak Guru Jain Gran. p.57. Also see Tattvarthasutra verses 5.29 and 5.37 [5] Farid Alakbarov (Summer 2001). A 13th-Century Darwin? Tusi's Views on Evolution (http:/ / azer. com/ aiweb/ categories/ magazine/ 92_folder/ 92_articles/ 92_tusi. html), Azerbaijan International 9 (2). [6] *Pomper, Philip (October 1962). "Lomonosov and the Discovery of the Law of the Conservation of Matter in Chemical Transformations". Ambix 10 (3): 119127. Lomonosov, Mikhail Vasilevich (1970). Mikhail Vasilevich Lomonosov on the Corpuscular Theory. Henry M. Leicester (trans.). Cambridge,

Mass.: Harvard University Press. Introduction, p.25. [7] An Historical Note on the Conservation of Mass (http:/ / www. eric. ed. gov/ ERICWebPortal/ Home. portal?_nfpb=true& ERICExtSearch_SearchValue_0=EJ128341& ERICExtSearch_SearchType_0=kw& _pageLabel=ERICSearchResult& newSearch=true& rnd=1194465579133& searchtype=keyword), Robert D. Whitaker, Journal of Chemical Education, 52, 10, 658-659, Oct 75 [8] Kenneth R. Lang, Astrophysical Formulae, Springer (1999), ISBN 3-540-29692-1


Atom-transfer radical-polymerization
Atom transfer radical polymerization (ATRP) is an example of a living polymerization or a controlled/living radical polymerization (CRP). Like its counterpart, atom transfer radical addition (ATRA), it is a means of forming a carbon-carbon bond using a transition metal catalyst. As the name implies, the atom transfer step is the key step in the reaction responsible for uniform polymer chain growth. ATRP (or transition metal-mediated living radical polymerization) was independently discovered by Mitsuo Sawamoto et al.[1] and by Jin-Shan Wang and Krzysztof Matyjaszewski in 1995.[2]

ATRP
The uniform polymer chain growth, which leads to low dispersity, stems from the transition metal based catalyst. This catalyst provides an equilibrium between the active, and therefore propagating, polymer and an inactive form of the polymer, known as the dormant form. Since the dormant state of the polymer is vastly preferred in this equilibrium, side reactions are suppressed. This equilibrium in turn lowers the concentration of propagating radicals, therefore suppressing unintentional termination and controlling molecular weights. ATRP reactions are very robust in that they are tolerant of many functional groups like allyl, amino, epoxy, hydroxy and vinyl groups present in either the monomer or the initiator.[3] ATRP methods are also advantageous due to the ease of preparation, commercially available and inexpensive catalysts (copper complexes), pyridine based ligands and initiators (alkyl halides).[4]

General ATRP reaction. A. Initiation. B. Equilibrium with dormant species. C. Propagation.

The ATRP with styrene. If all the styrene is reacted (the conversion is 100%) the polymer will have 100 units of styrene built into it. PMDETA stands for N,N,N',N'',N''-pentamethyldiethylenetriamine.
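The chain length quoted in the caption follows from the ratio of monomer to initiator. A minimal sketch of that arithmetic is given below, assuming that every initiator molecule starts exactly one chain, that no chains are lost to termination or transfer, and that end groups are ignored. The 100:1 ratio is inferred from the caption, and the function names are hypothetical.

```python
STYRENE_MOLAR_MASS = 104.15  # g/mol for C8H8


def degree_of_polymerization(monomer_to_initiator: float, conversion: float) -> float:
    """Average units per chain, assuming one chain per initiator molecule."""
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must lie between 0 and 1")
    return monomer_to_initiator * conversion


def theoretical_mn(monomer_to_initiator: float, conversion: float,
                   monomer_molar_mass: float) -> float:
    """Theoretical number-average molar mass (g/mol), end groups ignored."""
    return degree_of_polymerization(monomer_to_initiator, conversion) * monomer_molar_mass


# Assumed feed ratio of 100 styrene molecules per initiator molecule.
for conversion in (0.25, 0.50, 1.00):
    dp = degree_of_polymerization(100, conversion)
    mn = theoretical_mn(100, conversion, STYRENE_MOLAR_MASS)
    print(f"conversion {conversion:.0%}: DP = {dp:.0f}, Mn = {mn:.0f} g/mol")
```

Under these assumptions the chain length grows linearly with conversion, which is the behaviour expected of a controlled polymerization.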


Components of ATRP
There are five important variable components of Atom Transfer Radical Polymerizations. They are the monomer, initiator, catalyst, solvent and temperature. The following section breaks down the contributions of each component to the overall polymerization.

Monomer
Monomers that are typically used in ATRP are molecules with substituents that can stabilize the propagating radicals; for example, styrenes, (meth)acrylates, (meth)acrylamides, and acrylonitrile.[5] ATRP is successful at producing polymers of high number average molecular weight and narrow polydispersity index when the concentration of the propagating radical balances the rate of radical termination. Yet the propagation rate is unique to each individual monomer. Therefore, it is important that the other components of the polymerization (initiator, catalysts, ligands and solvents) are optimized so that the concentration of the dormant species is greater than the concentration of the propagating radical, yet not so great that it slows down or halts the reaction.[6][7]

Initiator
The number of growing polymer chains is determined by the initiator. The faster the initiation and the fewer the terminations and transfers, the more consistent the number of propagating chains, leading to narrow molecular weight distributions.[7] Organic halides whose organic framework is similar to that of the propagating radical are often chosen as initiators.[6] Most initiators for ATRP are alkyl halides.[8] Alkyl bromides are more reactive than alkyl chlorides, and both give good molecular weight control.[6][7] The shape or structure of the initiator can determine the architecture of the polymer. For example, initiators with multiple alkyl halide groups on a single core can lead to a star-like polymer shape.[9]

Illustration of a star initiator for ATRP

Catalyst
The catalyst is the most important component of ATRP because it determines the equilibrium constant between the active and dormant species. This equilibrium determines the polymerization rate: an equilibrium constant that is too small may inhibit or slow the polymerization, while one that is too large leads to a broad distribution of chain lengths.[7] There are several requirements for the metal catalyst:
1. there need to be two accessible oxidation states that are separated by one electron
2. the metal center needs to have a reasonable affinity for halogens
3. the coordination sphere of the metal needs to be expandable when it is oxidized, so as to accommodate the halogen
4. strong ligand complexation is required.[6]
The most studied catalysts are those involving copper, which has shown the most versatility, giving successful polymerizations regardless of the monomer.
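The role of the equilibrium constant can be made concrete with a small calculation. The sketch below assumes the commonly used textbook form K = [P•][Cu(II)X] / ([P-X][Cu(I)]) for a copper catalyst, which is not spelled out in the text above. The function name and all concentrations are hypothetical values chosen only for illustration.

```python
def radical_concentration(k_eq: float, dormant: float,
                          activator: float, deactivator: float) -> float:
    """Steady-state concentration of propagating radicals (mol/L).

    Assumes K = [P*][Cu(II)X] / ([P-X][Cu(I)]), rearranged for [P*].
    """
    if deactivator <= 0.0:
        raise ValueError("deactivator concentration must be positive")
    return k_eq * dormant * activator / deactivator


# Hypothetical conditions (mol/L): dormant chains, Cu(I) activator, Cu(II) deactivator.
dormant, activator, deactivator = 0.05, 0.01, 0.001

for k_eq in (1e-9, 1e-8, 1e-7):
    radicals = radical_concentration(k_eq, dormant, activator, deactivator)
    print(f"K = {k_eq:.0e}: [P*] = {radicals:.1e} mol/L, "
          f"active fraction = {radicals / dormant:.1e}")
```

In this picture a larger equilibrium constant raises the radical concentration, which speeds up propagation but also makes termination more likely. That is consistent with the trade-off described above.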


Solvent
Solvents used include toluene, 1,4-dioxane, xylene, anisole, DMF, DMSO, water, methanol, acetonitrile and chloroform, as well as bulk monomer.

Reverse ATRP
In reverse ATRP, the catalyst is added in its higher oxidation state. Chains are activated by conventional radical initiators (e.g. AIBN) and deactivated by the transition metal. The source of transferrable halogen is the copper salt, so this must be present in concentrations comparable to the transition metal. A mixture of radical initiator and active (lower oxidation state) catalyst allows for the creation of block copolymers (contaminated with homopolymer) which is impossible using standard reverse ATRP. This is called SR&NI (simultaneous reverse and normal initiation ATRP).

AGET ATRP
Activators generated by electron transfer (AGET) ATRP uses a reducing agent that is unable to initiate new chains (instead of organic radicals) as the regenerator for the low-valent metal. Examples are metallic Cu, tin(II), ascorbic acid, or triethylamine. It allows for lower concentrations of transition metals, and may also be possible in aqueous or dispersed media.

Hybrid and bimetallic systems


This technique uses a variety of different metals and oxidation states, possibly on solid supports, to act as activators and deactivators, possibly with reduced toxicity or sensitivity. Iron salts can, for example, efficiently activate alkyl halides but require an efficient Cu(II) deactivator, which can be present in much lower concentrations (3–5 mol%).

ICAR ATRP
Initiators for continuous activator regeneration (ICAR) is a technique that uses large excesses of initiator to continuously regenerate the activator, lowering its required concentration from thousands of ppm to around 1 ppm, which makes it an industrially relevant technique. Styrene is especially interesting because it generates radicals when sufficiently heated.

ARGET ATRP
Activators regenerated by electron transfer (ARGET) can be used to make block copolymers using a method similar to AGET but requiring strongly reduced amounts of metal, since the activator is regenerated from the deactivator by a large excess of reducing agent (e.g. hydrazine, phenols, sugars or ascorbic acid). It differs from AGET ATRP in that AGET uses reducing agents to generate the active catalyst (in quasi-stoichiometric amounts), while in ARGET a large excess is used to continuously regenerate the activator, allowing transition metal concentrations to drop to ~1 ppm without loss of control.


Polymers Made by ATRP


Polystyrene
Poly(methyl methacrylate)
Polyacrylamide

References
[1] Kato, M; Kamigaito, M; Sawamoto, M; Higashimura, T (1995). "Polymerization of Methyl Methacrylate with the Carbon Tetrachloride/Dichlorotris-(triphenylphosphine)ruthenium(II)/Methylaluminum Bis(2,6-di-tert-butylphenoxide) Initiating System: Possibility of Living Radical Polymerization". Macromolecules 28: 1721–1723. Bibcode 1995MaMol..28.1721K. doi:10.1021/ma00109a056.
[2] Wang, J; Matyjaszewski, K (1995). "Controlled/"living" radical polymerization. Atom transfer radical polymerization in the presence of transition-metal complexes". J. Am. Chem. Soc. 117: 5614–5615. doi:10.1021/ja00125a035.
[3] Cowie, J. M. G.; Arrighi, V. In Polymers: Chemistry and Physics of Modern Materials; CRC Press Taylor and Francis Group: Boca Raton, Fl, 2008; 3rd Ed., pp. 82–84. ISBN 0849398134.
[4] Matyjaszewski, K. Fundamentals of ATRP Research (http://www.chem.cmu.edu/groups/maty/about/research/03.html) (accessed 01/07, 2009).
[5] Patten, T. E; Matyjaszewski, K (1998). "Atom Transfer Radical Polymerization and the Synthesis of Polymeric Materials". Adv. Mater. 10: 901. doi:10.1002/(SICI)1521-4095(199808)10:12<901::AID-ADMA901>3.0.CO;2-B.
[6] Odian, G. In Radical Chain Polymerization; Principles of Polymerization; Wiley-Interscience: Staten Island, New York, 2004; pp. 316–321.
[7] Matyjaszewski, K; Xia, J (2001). "Atom Transfer Radical Polymerization". Chem. Rev. 101 (9): 2921–2990. doi:10.1021/cr940534g. ISSN 0009-2665. PMID 11749397.
[8] Matyjaszewski, Krzysztof; Nicolay V. Tsarevsky (2009). "Nanostructured functional materials prepared by atom transfer radical polymerization". Nature Chemistry 1 (4): 276–288. Bibcode 2009NatCh...1..276M. doi:10.1038/NCHEM.257.
[9] Jakubowski, Wojciech. "Complete Tools for the Synthesis of Well-Defined Functionalized Polymers via ATRP" (http://www.sigmaaldrich.com/materials-science/polymer-science/atrp.html). Sigma-Aldrich. Retrieved 21 July 2010.

Living polymerization
In polymer chemistry, living polymerization is a form of addition polymerization where the ability of a growing polymer chain to terminate has been removed.[1][2] This can be accomplished in a variety of ways. Chain termination and chain transfer reactions are absent and the rate of chain initiation is also much larger than the rate of chain propagation. The result is that the polymer chains grow at a more constant rate than seen in traditional chain polymerization and their lengths remain very similar (i.e. they have a very low polydispersity index). Living polymerization is a popular method for synthesizing block copolymers since the polymer can be synthesized in stages, each stage containing a different monomer. Additional advantages are predetermined molar mass and control over end-groups. Living polymerization in the literature is often called "living" polymerization or controlled polymerization.

Living polymerization was demonstrated by Michael Szwarc in 1956 in the anionic polymerization of styrene with an alkali metal / naphthalene system in tetrahydrofuran (THF). He found that after addition of monomer to the initiator system the increase in viscosity would eventually cease, but that after addition of a new amount of monomer some time later the viscosity would start to increase again.[3]

The main living polymerization techniques are:
Living anionic polymerization
Living cationic polymerization
Ring opening metathesis polymerization
Living free radical polymerization
Group transfer polymerization
Living Ziegler-Natta polymerization


Living anionic polymerization


As early as 1936, Karl Ziegler proposed that anionic polymerization of styrene and butadiene by consecutive addition of monomer to an alkyl lithium initiator occurred without chain transfer or termination. Twenty years later, living polymerization was demonstrated by Szwarc through the anionic polymerization of styrene in THF using sodium naphthalenide as initiator.[4][5][6]

Living cationic polymerization


Monomers for living cationic polymerization are electron-rich alkenes such as vinyl ethers, isobutylene, styrene, and N-vinylcarbazole. The initiators are binary systems consisting of an electrophile and a Lewis acid. The method was developed around 1980 with contributions from Higashimura, Sawamoto and Kennedy.

Living ring-opening metathesis polymerization


Given the right reaction conditions, ring-opening metathesis polymerization (ROMP) can be rendered living. The first such systems were described by Robert H. Grubbs in 1986, based on norbornene and Tebbe's reagent, and in 1987 Grubbs, together with Richard R. Schrock, described living polymerization with a tungsten carbene complex.[7]

Living free radical polymerization


Starting in the 1970s several new methods were discovered which allowed the development of living polymerization using free radical chemistry. These techniques involved catalytic chain transfer polymerization, iniferter mediated polymerization, stable free radical mediated polymerization (SFRP), atom transfer radical polymerization (ATRP), reversible addition-fragmentation chain transfer (RAFT) polymerization, and iodine-transfer polymerization.

Living group-transfer polymerization


Group-transfer polymerization also has characteristics of living polymerization.[8] It is applied to alkylated methacrylate monomers and the initiator is a silyl ketene acetal. New monomer adds to the initiator and to the active growing chain in a Michael reaction. With each addition of a monomer group the trimethylsilyl group is transferred to the end of the chain. The active chain-end is not ionic as in anionic or cationic polymerization but is covalent. The reaction can be catalysed by bifluorides and bioxyanions such as tris(dialkylamino)sulfonium bifluoride or tetrabutyl ammonium bibenzoate. The method was discovered in 1983 by O.W. Webster[9] and the name was first suggested by Barry Trost.

Living Ziegler-Natta polymerization


Several reported methods exist that introduce livingness in Ziegler-Natta polymerization.[10] The monomer in this type of polymerization (a subset of coordination polymerization) is an alpha-olefin and the active site contains an alkyl to metal bond. Chain growth is based on the Cossee-Arlman mechanism. An early method (Doi, 1979) describes propene polymerization in toluene at 50C using diethylaluminium chloride and a vanadium catalyst for example V(acac)3 to syndiotactic polypropylene with a polydispersity index of 1.05 to 1.4.[11][12] Another living system as described by McConville in 1996 is based on titanium using 1-hexene, [RN(CH2)3NR]TiMe2 and tris(pentafluorophenyl)boron[13]


External links
IUPAC Gold Book Definition [14]
Precise definitions from the American Chemical Society [15]
Living Ziegler-Natta Polymerization Article [16]
Living polymers: 50 years of evolution Article [17]

References
[1] Halasa, A. F. Rubber Chem. Technol., 1981, 54, 627.
[2] (2006) The Chemistry of Radical Polymerization - Second fully revised edition (Graeme Moad & David H. Solomon). Elsevier. ISBN 0-08-044286-2
[3] Webster, O. W. Science, 1991, 251, 8877.
[4] M. Szwarc, Nature 1956, 178, 1168.
[5] Szwarc, M.; Levy, M.; Milkovich, R. J. Am. Chem. Soc. 1956, 78, 2656.
[6] US 4 158 678 (priority date 30 June 1976).
[7] "Ring-opening polymerization of norbornene by a living tungsten alkylidene complex" R. R. Schrock, J. Feldman, L. F. Cannizzo, R. H. Grubbs Macromolecules; 1987; 20(5); 1169–1172. doi:10.1021/ma00171a053
[8] Polymer chemistry: a practical approach 2004 Fred J. Davis
[9] "Group-transfer polymerization. 1. A new concept for addition polymerization with organosilicon initiators" O. W. Webster, W. R. Hertler, D. Y. Sogah, W. B. Farnham, T. V. RajanBabu J. Am. Chem. Soc., 1983, 105 (17), pp. 5706–5708 doi:10.1021/ja00355a039
[10] organicdivision.org Essay: Living Ziegler-Natta Polymerization 2002 Richard J. Keaton PDF (http://www.organicdivision.org/ama/orig/Fellowship/2002_2003_Awardees/Essays/keaton.pdf)
[11] "'Living' Coordination Polymerization of Propene Initiated by the Soluble V(acac)3-Al(C2H5)2Cl System" Yoshiharu Doi, Satoshi Ueki, Tominaga Keii Macromolecules, 1979, 12 (5), pp. 814–819 doi:10.1021/ma60071a004
[12] "Living coordination polymerization of propene with a highly active vanadium-based catalyst" Yoshiharu Doi, Shigeo Suzuki, Kazuo Soga Macromolecules, 1986, 19 (12), pp. 2896–2900 doi:10.1021/ma00166a002
[13] "Living Polymerization of α-Olefins by Chelating Diamide Complexes of Titanium" John D. Scollard and David H. McConville J. Am. Chem. Soc., 1996, 118 (41), pp. 10008–10009 doi:10.1021/ja9618964
[14] http://www.iupac.org/goldbook/L03597.pdf
[15] http://www.polyacs.org/nomcl/mnn12.html
[16] http://organicdivision.org/essays_2002/keaton.pdf
[17] http://www.weizmann.ac.il/ICS/booklet/18/pdf/levy.pdf

Dispersity

In physical and organic chemistry, the dispersity is a measure of the heterogeneity of sizes of molecules or particles in a mixture. A collection of objects is called monodisperse if the objects have the same size, shape, or mass. A sample of objects that have an inconsistent size, shape and mass distribution is called polydisperse. The objects can be in any form of chemical dispersion, such as particles in a colloid, droplets in a cloud,[1] crystals in a rock,[2] or polymer molecules in a solvent.[3] Polymers can possess a distribution of molecular mass; particles often possess a wide distribution of size, surface area and mass; and thin films can possess a varied distribution of film thickness.

IUPAC has deprecated the use of the term polydispersity index, having replaced it with the term dispersity, represented by the symbol Đ and calculated using the equation Đ = Mm/Mn, where Mm is the mass-average molar mass and Mn is the number-average molar mass. IUPAC has also deprecated the terms monodisperse, which is considered to be self-contradictory, and polydisperse, which is considered redundant, preferring the terms uniform and non-uniform instead.[4]

A monodisperse collection

A polydisperse collection

Overview
A monodisperse, or uniform, polymer is composed of molecules of the same mass.[5] Natural polymers are typically monodisperse.[6] Synthetic monodisperse polymer chains can be made by processes such as anionic polymerization, a method using an anionic catalyst to produce chains that are similar in length. This technique is also known as living polymerization. It is used commercially for the production of block copolymers. Monodisperse collections can be easily created through the use of template-based synthesis, a common method of synthesis in nanotechnology.

A polymer material is denoted by the term polydisperse, or non-uniform, if its chain lengths vary over a wide range of molecular masses. This is characteristic of man-made polymers.[7] Natural organic matter produced by the decomposition of plants and wood debris in soils (humic substances) also has a pronounced polydisperse character. This is the case for humic acids and fulvic acids, natural polyelectrolyte substances having respectively higher and lower molecular weights.

Another interpretation of polydispersity index is explained in the article Dynamic light scattering (cumulant method subheading). In this sense, the PDI values are in the range from 0 to 1.


The polydispersity index (PDI) or heterogeneity index, or simply dispersity (Đ), is a measure of the distribution of molecular mass in a given polymer sample. The PDI is calculated as the weight average molecular weight ($M_w$) divided by the number average molecular weight ($M_n$). It indicates the distribution of individual molecular masses in a batch of polymers. The PDI has a value equal to or greater than 1, but as the polymer chains approach uniform chain length, the PDI approaches unity (1).[8] For some natural polymers PDI is almost taken as unity. The PDI from polymerization is often denoted as:

$\mathrm{PDI} = M_w / M_n$

where $M_w$ is the weight average molecular weight and $M_n$ is the number average molecular weight. $M_n$ is more sensitive to molecules of low molecular mass, while $M_w$ is more sensitive to molecules of high molecular mass.
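The different sensitivities of the two averages can be checked numerically. The sketch below is illustrative only: the chain masses are invented and the helper name is hypothetical. It starts from a uniform sample, then adds either a few short chains or a few long chains and compares how far each average moves.

```python
def number_and_weight_average(masses):
    """Return (Mn, Mw) for a list of individual chain masses."""
    total = sum(masses)
    mn = total / len(masses)                  # arithmetic mean over chains
    mw = sum(m * m for m in masses) / total   # mean weighted by mass
    return mn, mw


# Invented sample: 100 identical chains, then 10 short or 10 long chains added.
base = [10_000.0] * 100
with_short = base + [1_000.0] * 10
with_long = base + [100_000.0] * 10

for label, sample in (("uniform", base), ("plus short", with_short), ("plus long", with_long)):
    mn, mw = number_and_weight_average(sample)
    print(f"{label:10s}  Mn = {mn:9.1f}  Mw = {mw:9.1f}  PDI = {mw / mn:.3f}")
```

In this example the short chains pull the number average down much further than the weight average, while the long chains raise the weight average much more than the number average.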

Effect of polymerization mechanism


Typical dispersities vary based on the mechanism of polymerization and can be affected by a variety of reaction conditions. In synthetic polymers, dispersity can vary greatly due to reactant ratio, how close the polymerization went to completion, etc. For typical addition polymerization, values of the PDI can range around 10 to 20. For typical step polymerization, most probable values of the PDI are around 2; Carothers' equation limits PDI to values of 2 and below. Living polymerization, a special case of addition polymerization, leads to values very close to 1. Such is the case also in biological polymers, where the dispersity can be very close or equal to 1, indicating that only one length of polymer is present.
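The limits quoted for step and living polymerization can be illustrated with two idealized textbook results that are not stated in the text above and are assumed here. For ideal linear step growth (the most probable distribution) the dispersity is 1 + p, where p is the fractional conversion. For an ideal living polymerization with a Poisson distribution of chain lengths it is 1 + ν/(ν + 1)², where ν is the average number of monomer units added per chain. The function names are hypothetical.

```python
def step_growth_dispersity(p: float) -> float:
    """Dispersity of an ideal linear step-growth polymer at conversion p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("conversion p must satisfy 0 <= p < 1")
    return 1.0 + p


def living_dispersity(nu: float) -> float:
    """Dispersity for a Poisson chain-length distribution (ideal living case)."""
    if nu < 0.0:
        raise ValueError("nu must be non-negative")
    return 1.0 + nu / (nu + 1.0) ** 2


for p in (0.90, 0.99, 0.999):
    print(f"step growth, p = {p}: dispersity = {step_growth_dispersity(p):.3f}")

for nu in (10, 100, 1000):
    print(f"living, nu = {nu}: dispersity = {living_dispersity(nu):.4f}")
```

Under these idealizations the step-growth value approaches but never exceeds 2, while the living value tends to 1 as the chains grow longer.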

Determination methods
Gel permeation chromatography (also known as size exclusion chromatography)
Light scattering measurements such as dynamic light scattering
Direct measurement via mass spectrometry using MALDI or ESI-MS

References
[1] Martins, J. A.; Silva Dias, M. A. F. (2009). "The impact of smoke from forest fires on the spectral dispersion of cloud droplet size distributions in the Amazonian region". Environmental Research Letters 4: 015002. doi:10.1088/1748-9326/4/1/015002.
[2] Higgins, Michael D. (2000). "Measurement of crystal size distributions" (http://wwwdsa.uqac.ca/~mhiggins/am_min_2000.pdf). American Mineralogist 85: 1105–1116.
[3] Okita, K.; Teramoto, A.; Kawahara, K.; Fujita, H. (1968). "Light scattering and refractometry of a monodisperse polymer in binary mixed solvents". The Journal of Physical Chemistry 72: 278. doi:10.1021/j100847a053.
[4] Stepto, R. F. T.; Gilbert, R. G.; Hess, M.; Jenkins, A. D.; Jones, R. G.; Kratochvíl, P. (2009). "Dispersity in Polymer Science" (http://media.iupac.org/publications/pac/2009/pdf/8102x0351.pdf). Pure Appl. Chem. 81 (2): 351–353. doi:10.1351/PAC-REC-08-05-02.
[5] "monodisperse polymer (See: uniform polymer)" (http://goldbook.iupac.org/M04012.html). IUPAC Gold Book. International Union of Pure and Applied Chemistry. Retrieved 25 January 2012.
[6] Brown, William H.; Foote, Christopher S.; Iverson, Brent L.; Anslyn, Eric V. (2012). Organic chemistry (http://books.google.ca/books?id=rxRHzOS-3xoC&pg=PT1193) (6 ed.). Cengage Learning. p. 1161. ISBN 978-0-8400-5498-2.
[7] http://www.chemicool.com/definition/polydisperse.html
[8] Peter Atkins and Julio De Paula, Atkins' Physical Chemistry, 9th edition (Oxford University Press, 2010, ISBN 978-0-19-954337-3)


External links
Polymer structure (http://openlearn.open.ac.uk/mod/resource/view.php?id=196629)

Molar mass distribution


In linear polymers the individual polymer chains rarely have exactly the same degree of polymerization and molar mass, and there is always a distribution around an average value. The molar mass distribution (or molecular weight distribution) in a polymer describes the relationship between the number of moles of each polymer species (Ni) and the molar mass (Mi) of that species.[1] The molar mass distribution of a polymer may be modified by polymer fractionation.

Definition of molar mass averages


Different average values can be defined depending on the statistical method that is applied. The weighted mean can be taken with the weight fraction, the mole fraction or the volume fraction:

Number average molar mass or Mn: $M_n = \frac{\sum M_i N_i}{\sum N_i}$
Weight average molar mass or Mw: $M_w = \frac{\sum M_i^2 N_i}{\sum M_i N_i}$
Viscosity average molar mass or Mv: $M_v = \left[ \frac{\sum M_i^{1+a} N_i}{\sum M_i N_i} \right]^{1/a}$
Z average molar mass or Mz: $M_z = \frac{\sum M_i^3 N_i}{\sum M_i^2 N_i}$ [2]

Here a is the exponent in the Mark-Houwink equation that relates the intrinsic viscosity to molar mass.
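As a worked illustration of these averages, the following sketch computes all four from a list of (Ni, Mi) pairs using the standard moment-ratio definitions. The sample composition, the default exponent a = 0.7 and the function name are assumptions made for the example only.

```python
def molar_mass_averages(species, a=0.7):
    """Compute Mn, Mv, Mw and Mz from (N_i, M_i) pairs.

    species: iterable of (number of moles N_i, molar mass M_i).
    a: Mark-Houwink exponent used for the viscosity average (assumed value).
    """
    species = list(species)
    s0 = sum(n for n, m in species)                  # sum of N_i
    s1 = sum(n * m for n, m in species)              # sum of N_i * M_i
    s2 = sum(n * m ** 2 for n, m in species)         # sum of N_i * M_i^2
    s3 = sum(n * m ** 3 for n, m in species)         # sum of N_i * M_i^3
    sa = sum(n * m ** (1.0 + a) for n, m in species)
    return {
        "Mn": s1 / s0,
        "Mv": (sa / s1) ** (1.0 / a),
        "Mw": s2 / s1,
        "Mz": s3 / s2,
    }


# Invented three-component sample: (moles, molar mass in g/mol).
sample = [(10, 1.0e4), (5, 5.0e4), (1, 2.0e5)]
averages = molar_mass_averages(sample)
for name, value in averages.items():
    print(f"{name} = {value:,.0f} g/mol")
print(f"Mw/Mn = {averages['Mw'] / averages['Mn']:.2f}")
```

For any non-uniform sample and an exponent between 0 and 1, the results should come out in the order Mn < Mv < Mw < Mz. For a uniform sample all four coincide.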

Measurement
These different definitions have true physical meaning because different techniques in physical polymer chemistry often measure just one of them. For instance, osmometry measures number average molar mass and small-angle laser light scattering measures weight average molar mass. Mv is obtained from viscosimetry and Mz by sedimentation in an analytical ultracentrifuge. The quantity a in the expression for the viscosity average molar mass varies from 0.5 to 0.8 and depends on the interaction between solvent and polymer in a dilute solution. In a typical distribution curve, the average values are related to each other as follows: Mn < Mv < Mw < Mz. Polydispersity of a sample is defined as Mw divided by Mn and gives an indication of just how narrow a distribution is.[2]

The most common technique for measuring molecular weight used in modern times is a variant of high-pressure liquid chromatography (HPLC) known by the interchangeable terms of size exclusion chromatography (SEC) and gel permeation chromatography (GPC). These techniques involve forcing a polymer solution through a matrix of cross-linked polymer particles at a pressure of up to several thousand psi. The limited accessibility of stationary phase pore volume for the polymer molecules results in shorter elution times for high-molecular-weight species. The use of low polydispersity standards allows the user to correlate retention time with molecular weight, although the actual correlation is with the hydrodynamic volume. If the relationship between molar mass and the hydrodynamic volume changes (i.e., the polymer is not exactly the same shape as the standard) then the calibration for mass is in error.

The most common detectors used for size exclusion chromatography include online methods similar to the bench methods used above. By far the most common is the differential refractive index detector, which measures the change in refractive index of the solvent. This detector is concentration-sensitive and very molecular-weight-insensitive, so it is ideal for a single-detector GPC system, as it allows the generation of mass versus molecular weight curves. Less common but more accurate and reliable is a molecular-weight-sensitive detector using multi-angle laser-light scattering (see Static Light Scattering). These detectors directly measure the molecular weight of the polymer and are most often used in conjunction with differential refractive index detectors. A further alternative is either low-angle light scattering, which uses a single low angle to determine the molar mass, or right-angle laser light scattering in combination with a viscometer, although this latter technique does not give an absolute measure of molar mass but one relative to the structural model used.

The molar mass distribution of a polymer sample depends on factors such as chemical kinetics and work-up procedure. Ideal step-growth polymerization gives a polymer with polydispersity of 2. Ideal living polymerization results in a polydispersity of 1. By dissolving a polymer, an insoluble high molar mass fraction may be filtered off, resulting in a large reduction in Mw and a small reduction in Mn, thus reducing polydispersity.
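The use of narrow standards to relate retention time to molecular weight can be sketched as a simple calibration fit. The example below assumes a straight-line relationship between log10(M) and retention time, which is a common simplification rather than anything stated above. The standards data are invented and the function names are hypothetical.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(M) = slope * t + intercept.

    standards: list of (retention time in minutes, molar mass in g/mol).
    """
    n = len(standards)
    times = [t for t, _ in standards]
    logs = [math.log10(m) for _, m in standards]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept


def molar_mass_from_retention(t, slope, intercept):
    """Estimate molar mass (relative to the standards) at retention time t."""
    return 10.0 ** (slope * t + intercept)


# Invented narrow standards: larger molecules elute earlier.
standards = [(12.0, 1.0e6), (14.0, 1.0e5), (16.0, 1.0e4), (18.0, 1.0e3)]
slope, intercept = fit_calibration(standards)
estimate = molar_mass_from_retention(15.0, slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"estimated M at 15.0 min = {estimate:,.0f} g/mol")
```

The negative slope reflects the shorter elution times of high-molecular-weight species. As the text notes, any estimate obtained this way is only as good as the match in hydrodynamic volume between the sample and the standards.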


Number average molecular weight


The number average molecular weight is a way of determining the molecular weight of a polymer. Polymer molecules, even ones of the same type, come in different sizes (chain lengths, for linear polymers), so the average molecular weight will depend on the method of averaging. The number average molecular weight is the ordinary arithmetic mean or average of the molecular weights of the individual macromolecules. It is determined by measuring the molecular weight of n polymer molecules, summing the weights, and dividing by n:

$M_n = \frac{\sum_i N_i M_i}{\sum_i N_i}$

The number average molecular weight of a polymer can be determined by gel permeation chromatography, viscometry (via the Mark-Houwink equation), colligative methods such as vapor pressure osmometry, end-group determination or proton NMR.[3] An alternative measure of the molecular weight of a polymer is the weight average molecular weight. The ratio of the weight average to the number average is called the polydispersity index. Polymers of high number average molecular weight may be obtained only with a high fractional monomer conversion in the case of step-growth polymerization, as per Carothers' equation.
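The conversion requirement mentioned for step-growth polymerization can be made concrete with the simplest form of Carothers' equation, Xn = 1/(1 − p), which relates the number-average degree of polymerization Xn to the fractional conversion p. This form assumes a linear polymer with exactly balanced stoichiometry, and the function name is hypothetical.

```python
def carothers_degree_of_polymerization(p: float) -> float:
    """Number-average degree of polymerization at fractional conversion p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("conversion p must satisfy 0 <= p < 1")
    return 1.0 / (1.0 - p)


for p in (0.50, 0.90, 0.95, 0.99, 0.999):
    xn = carothers_degree_of_polymerization(p)
    print(f"p = {p:<5}: Xn = {xn:.0f}")
```

Even at 95% conversion the chains contain only about 20 repeat units on average, which is why very high conversions are needed to reach high molecular weight.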

Weight average molecular weight


The weight average molecular weight is a way of describing the molecular weight of a polymer. Polymer molecules, even if of the same type, come in different sizes (chain lengths, for linear polymers), so we have to take an average of some kind. The weight average molecular weight is calculated by

$M_w = \frac{\sum_i N_i M_i^2}{\sum_i N_i M_i}$

where $N_i$ is the number of molecules of molecular weight $M_i$.

If the weight average molecular weight is w, and one chooses a random monomer, then the polymer it belongs to will have a weight of w on average (for a homopolymer). The weight average molecular weight can be determined by light scattering, small angle neutron scattering (SANS), X-ray scattering, and sedimentation velocity. An alternative measure of molecular weight for a polymer is the number average molecular weight; the ratio of the weight average to the number average is called the polydispersity index. The weight-average molecular weight, Mw, is also related to the fractional monomer conversion, p, in step-growth polymerization as per Carothers' equation: $M_w = M_o \frac{1+p}{1-p}$, where Mo is the molecular weight of the repeating unit.


References
[1] I. Katime "Química Física Macromolecular". Servicio Editorial de la Universidad del País Vasco. Bilbao
[2] R.J. Young and P.A. Lovell, Introduction to Polymers, 1991
[3] Polymer Molecular Weight Analysis by 1H NMR Spectroscopy Josephat U. Izunobi and Clement L. Higginbotham J. Chem. Educ., 2011, 88 (8), pp 1098–1104 doi:10.1021/ed100461v

Biotechnology
Biotechnology (sometimes shortened to "biotech") is generally accepted as the use of living systems and organisms to develop or make useful products, or "any technological application that uses biological systems, living organisms or derivatives thereof, to make or modify products or processes for specific use" (UN Convention on Biological Diversity).[1] For thousands of years, humankind has used biotechnology in agriculture, food production and medicine.[2] The term itself is largely believed to have been coined in 1919 by Hungarian engineer Karl Ereky. In the late 20th and early 21st century, biotechnology has expanded to include new and diverse sciences such as genomics, recombinant gene technologies, applied immunology, and development of pharmaceutical therapies and diagnostic tests.[3]

Insulin crystals.

Various definitions of 'biotechnology'


The concept of 'biotech' or 'biotechnology' encompasses a wide range of procedures (and history) for modifying living organisms according to human purposes, going back to domestication of animals, cultivation of plants, and "improvements" to these through breeding programs that employ artificial selection and hybridization. Modern usage also includes genetic engineering as well as cell and tissue culture technologies. The United Nations Convention on Biological Diversity defines 'biotechnology' as: "Any technological application that uses biological systems, living organisms, or derivatives thereof, to make or modify products or processes for specific use."[4] In other words, biotechnology can be defined as the mere application of technical advances in life science to develop commercial products. Biotechnology also draws on the pure biological sciences (genetics, microbiology, animal cell culture, molecular biology, biochemistry, embryology, cell biology). In many instances it is also dependent on knowledge and methods from outside the sphere of biology, including chemical engineering, bioprocess engineering, bioinformatics (a new brand of information technology), and biorobotics.

Conversely, modern biological sciences (including even concepts such as molecular ecology) are intimately entwined with and dependent on the methods developed through biotechnology and what is commonly thought of as the life sciences industry. Biotechnology is research and development in the laboratory, using bioinformatics, for the exploration, extraction, exploitation and production of products from any living organism and any source of biomass by means of biochemical engineering. High value-added products can be planned (reproduced by biosynthesis, for example), forecast, formulated, developed, manufactured and marketed for the purpose of sustainable operations (for the return on the bottomless initial investment in R&D) and of gaining durable patent rights (for exclusive rights of sale, and prior to this to receive national and international approval based on the results of animal and human experiments, especially in the pharmaceutical branch of biotechnology, to prevent any undetected side-effects or safety concerns when using the products). For more about the biotechnology industry, see [5][6][7][8][9][10]. By contrast, bioengineering is generally thought of as a related field with its emphasis more on higher systems approaches (not necessarily altering or using biological materials directly) for interfacing with and utilizing living things.


History
Although not normally what first comes to mind, many forms of human-derived agriculture clearly fit the broad definition of "using a biotechnological system to make products". Indeed, the cultivation of plants may be viewed as the earliest biotechnological enterprise. Agriculture has been theorized to have become the dominant way of producing food since the Neolithic Revolution. Through early biotechnology, the earliest farmers selected and bred the best-suited crops, those having the highest yields, to produce enough food to support a growing population. As crops and fields became increasingly large and difficult to maintain, it was discovered that specific organisms and their by-products could effectively fertilize, restore nitrogen, and control pests. Throughout the history of agriculture, farmers have inadvertently altered the genetics of their crops by introducing them to new environments and breeding them with other plants, one of the first forms of biotechnology.

These processes were also involved in the early fermentation of beer.[11] They were introduced in early Mesopotamia, Egypt, and India, and still use the same basic biological methods. In brewing, malted grains (containing enzymes) convert starch from grains into sugar, and specific yeasts are then added to produce beer. In this process, carbohydrates in the grains are broken down into alcohols such as ethanol. Later, other cultures developed the process of lactic acid fermentation, which allowed the fermentation and preservation of other forms of food, such as soy sauce. Fermentation was also used in this time period to produce leavened bread. Although the process of fermentation was not fully understood until Louis Pasteur's work in 1857, it is still the first use of biotechnology to convert a food source into another form.

For thousands of years, humans have used selective breeding to improve production of crops and livestock to use them for food. In selective breeding, organisms with desirable characteristics are mated to produce offspring with the same characteristics. For example, this technique was used with corn to produce the largest and sweetest crops.[12]

In the early twentieth century scientists gained a greater understanding of microbiology and explored ways of manufacturing specific products. In 1917, Chaim Weizmann first used a pure microbiological culture in an industrial process, fermenting corn starch with Clostridium acetobutylicum to produce acetone, which the United Kingdom desperately needed to manufacture explosives during World War I.[13]

Biotechnology has also led to the development of antibiotics. In 1928, Alexander Fleming discovered the mold Penicillium. His work led to the purification of the antibiotic penicillin by Howard Florey, Ernst Boris Chain and Norman Heatley.
In 1940, penicillin became available for medicinal use to treat bacterial infections in humans.[12]

Brewing was an early application of biotechnology

The field of modern biotechnology is generally thought of as having been born in 1971, when Paul Berg's (Stanford) experiments in gene splicing had early success. Herbert W. Boyer (Univ. Calif. at San Francisco) and Stanley N. Cohen (Stanford) significantly advanced the new technology in 1972 by transferring genetic material into a bacterium, such that the imported material would be reproduced. The commercial viability of a biotechnology industry was significantly expanded on June 16, 1980, when the United States Supreme Court ruled that a genetically modified microorganism could be patented in the case of Diamond v. Chakrabarty.[14] Indian-born Ananda Chakrabarty, working for General Electric, had modified a bacterium (of the Pseudomonas genus) capable of breaking down crude oil, which he proposed to use in treating oil spills. (Chakrabarty's work did not involve gene manipulation but rather the transfer of entire organelles between strains of the Pseudomonas bacterium.)

Revenue in the industry is expected to grow by 12.9% in 2008. Another factor influencing the biotechnology sector's success is improved intellectual property rights legislation, and enforcement, worldwide, as well as strengthened demand for medical and pharmaceutical products to cope with an ageing, and ailing, U.S. population.[15]

Rising demand for biofuels is expected to be good news for the biotechnology sector, with the Department of Energy estimating ethanol usage could reduce U.S. petroleum-derived fuel consumption by up to 30% by 2030. The biotechnology sector has allowed the U.S. farming industry to rapidly increase its supply of corn and soybeans, the main inputs into biofuels, by developing genetically modified seeds which are resistant to pests and drought. By boosting farm productivity, biotechnology plays a crucial role in ensuring that biofuel production targets are met.[16]


Applications
Biotechnology has applications in four major industrial areas: health care (medical), crop production and agriculture, non-food (industrial) uses of crops and other products (e.g. biodegradable plastics, vegetable oil, biofuels), and environmental uses. For example, one application of biotechnology is the directed use of organisms for the manufacture of organic products (examples include beer and milk products). Another example is the use of naturally present bacteria by the mining industry in bioleaching. Biotechnology is also used to recycle and treat waste, to clean up sites contaminated by industrial activities (bioremediation), and also to produce biological weapons.

A series of derived terms have been coined to identify several branches of biotechnology; for example:

Bioinformatics is an interdisciplinary field which addresses biological problems using computational techniques, and makes the rapid organization and analysis of biological data possible. The field may also be referred to as computational biology, and can be defined as, "conceptualizing biology in terms of molecules and then applying informatics techniques to understand and organize the information associated with these molecules, on a large scale."[17] Bioinformatics plays a key role in various areas, such as functional genomics, structural genomics, and proteomics, and forms a key component in the biotechnology and pharmaceutical sector.

Blue biotechnology is a term that has been used to describe the marine and aquatic applications of biotechnology, but its use is relatively rare.

Green biotechnology is biotechnology applied to agricultural processes. An example would be the selection and domestication of plants via micropropagation. Another example is the designing of transgenic plants to grow under specific environments in the presence (or absence) of chemicals. One hope is that green biotechnology might produce more environmentally friendly solutions than traditional industrial agriculture. An example of this is the engineering of a plant to express a pesticide, thereby ending the need for external application of pesticides; Bt corn is one such plant. Whether or not green biotechnology products such as this are ultimately more environmentally friendly is a topic of considerable debate.

A rose plant that began as cells grown in a tissue culture

Red biotechnology is applied to medical processes. Some examples are the designing of organisms to produce antibiotics, and the engineering of genetic cures through genetic manipulation.

White biotechnology, also known as industrial biotechnology, is biotechnology applied to industrial processes. An example is the designing of an organism to produce a useful chemical. Another example is the use of enzymes as industrial catalysts to either produce valuable chemicals or destroy hazardous/polluting chemicals. White biotechnology tends to consume fewer resources than traditional processes used to produce industrial goods. (http://www.bio-entrepreneur.net/Advance-definition-biotech.pdf)

The investment and economic output of all of these types of applied biotechnologies is termed the bioeconomy.


Medicine
In medicine, modern biotechnology finds promising applications in such areas as drug production, pharmacogenomics, gene therapy, and genetic testing (or genetic screening), in which techniques in molecular biology detect genetic diseases. To test the developing fetus for Down syndrome, amniocentesis and chorionic villus sampling can be used.[12]

Pharmacogenomics
Pharmacogenomics is the study of how the genetic inheritance of an individual affects his/her body's response to drugs. It is a portmanteau derived from the words "pharmacology" and "genomics". It is hence the study of the relationship between pharmaceuticals and genetics. The vision of pharmacogenomics is to be able to design and produce drugs that are adapted to each person's genetic makeup.[18]

Pharmacogenomics results in the following benefits:[18]

1. Development of tailor-made medicines. Using pharmacogenomics, pharmaceutical companies can create drugs based on the proteins, enzymes and RNA molecules that are associated with specific genes and diseases. These tailor-made drugs promise not only to maximize therapeutic effects but also to decrease damage to nearby healthy cells.
2. More accurate methods of determining appropriate drug dosages. Knowing a patient's genetics will enable doctors to determine how well his/her body can process and metabolize a medicine. This will maximize the value of the medicine and decrease the likelihood of overdose.
3. Improvements in the drug discovery and approval process. The discovery of potential therapies will be made easier using genome targets. Genes have been associated with numerous diseases and disorders. With modern biotechnology, these genes can be used as targets for the development of effective new therapies, which could significantly shorten the drug discovery process.
4. Better vaccines. Safer vaccines can be designed and produced by organisms transformed by means of genetic engineering. These vaccines will elicit the immune response without the attendant risks of infection. They will be inexpensive, stable, easy to store, and capable of being engineered to carry several strains of pathogen at once.

DNA microarray chip; some can do as many as a million blood tests at once

Pharmaceutical products
Most traditional pharmaceutical drugs are relatively small molecules that bind to particular molecular targets and either activate or deactivate biological processes. Small molecules are typically manufactured through traditional organic synthesis, and many can be taken orally. In contrast, biopharmaceuticals are large biological molecules such as proteins that are developed to address targets that cannot easily be addressed by small molecules. Some examples of biopharmaceutical drugs include Infliximab, a monoclonal antibody used in the treatment of autoimmune diseases, Etanercept, a fusion protein used in the treatment of autoimmune diseases, and Rituximab, a chimeric monoclonal antibody used in the treatment of cancer. Due to their larger size, and the corresponding difficulty of surviving the stomach, colon and liver, biopharmaceuticals are typically injected.


Modern biotechnology is often associated with the use of genetically altered microorganisms such as E. coli or yeast for the production of substances like synthetic insulin or antibiotics. It can also refer to transgenic animals or transgenic plants, such as Bt corn. Genetically altered mammalian cells, such as Chinese Hamster Ovary cells (CHO), are also used to manufacture certain pharmaceuticals. Another promising new biotechnology application is the development of plant-made pharmaceuticals. Biotechnology is also commonly associated with landmark breakthroughs in new medical therapies to treat hepatitis B, hepatitis C, cancers, arthritis, haemophilia, bone fractures, multiple sclerosis, and cardiovascular disorders. The biotechnology industry has also been instrumental in developing molecular diagnostic devices that can be used to define the target patient population for a given biopharmaceutical. Herceptin, for example, was the first drug approved for use with a matching diagnostic test and is used to treat breast cancer in women whose cancer cells express the protein HER2. Modern biotechnology can be used to manufacture existing medicines relatively easily and cheaply. The first genetically engineered products were medicines designed to treat human diseases. To cite one example, in 1978 Genentech developed synthetic humanized insulin by joining its gene with a plasmid vector inserted into the bacterium Escherichia coli. Insulin, widely used for the treatment of diabetes, was previously extracted from the pancreas of abattoir animals (cattle and/or pigs). 
The resulting genetically engineered bacterium enabled the production of vast quantities of synthetic human insulin at relatively low cost.[19] According to a 2003 study undertaken by the International Diabetes Federation (IDF) on the access to and availability of insulin in its member countries, synthetic 'human' insulin is considerably more expensive in most countries where both synthetic 'human' and animal insulin are commercially available: e.g. within European countries the average price of synthetic 'human' insulin was twice as high as the price of pork insulin.[20] Yet in its position statement, the IDF writes that "there is no overwhelming evidence to prefer one species of insulin over another" and "[modern, highly purified] animal insulins remain a perfectly acceptable alternative."[21]

Modern biotechnology has evolved, making it possible to produce human growth hormone, clotting factors for hemophiliacs, fertility drugs, erythropoietin and other drugs more easily and relatively cheaply.[22] Most drugs today are based on about 500 molecular targets. Genomic knowledge of the genes involved in diseases, disease pathways, and drug-response sites is expected to lead to the discovery of thousands more new targets.[22]

Computer-generated image of insulin hexamers highlighting the threefold symmetry, the zinc ions holding it together, and the histidine residues involved in zinc binding.

Genetic testing
Genetic testing involves the direct examination of the DNA molecule itself. A scientist scans a patient's DNA sample for mutated sequences. There are two major types of gene tests. In the first type, a researcher may design short pieces of DNA ("probes") whose sequences are complementary to the mutated sequences. These probes will seek their complement among the base pairs of an individual's genome. If the mutated sequence is present in the patient's genome, the probe will bind to it and flag the mutation. In the second type, a researcher may conduct the gene test by comparing the sequence of DNA bases in a patient's gene to the sequence of the same gene in healthy individuals or their progeny.

Genetic testing is now used for: carrier screening, or the identification of unaffected individuals who carry one copy of a gene for a disease that requires two copies for the disease to manifest; confirmational diagnosis of symptomatic individuals; determining sex; forensic/identity testing; newborn screening; prenatal diagnostic screening; presymptomatic testing for estimating the risk of developing adult-onset cancers; and presymptomatic testing for predicting adult-onset disorders.

Some genetic tests are already available, although most of them are used in developed countries. The tests currently available can detect mutations associated with rare genetic disorders like cystic fibrosis, sickle cell anemia, and Huntington's disease. Recently, tests have been developed to detect mutations for a handful of more complex conditions such as breast, ovarian, and colon cancers. However, gene tests may not detect every mutation associated with a particular condition because many are as yet undiscovered.

The bacterium Escherichia coli is routinely genetically engineered.

Controversial questions
The absence of privacy and anti-discrimination legal protections in most countries can lead to discrimination in employment or insurance or other use of personal genetic information. This raises questions such as whether genetic privacy is different from medical privacy.[23]

1. Reproductive issues. These include the use of genetic information in reproductive decision-making and the possibility of genetically altering reproductive cells that may be passed on to future generations. For example, germline therapy changes the genetic make-up of an individual's descendants. Thus, any error in technology or judgment may have far-reaching consequences (though the same can also happen through natural reproduction). Ethical issues like designer babies and human cloning have also given rise to controversies between and among scientists and bioethicists, especially in the light of past abuses with eugenics (see reductio ad Hitlerum).
2. Clinical issues. These center on the capabilities and limitations of doctors and other health-service providers, people identified with genetic conditions, and the general public in dealing with genetic information.
3. Effects on social institutions. Genetic tests reveal information about individuals and their families. Thus, test results can affect the dynamics within social institutions, particularly the family.


Gel electrophoresis

4. Conceptual and philosophical implications regarding human responsibility, free will vis-à-vis genetic determinism, and the concepts of health and disease.

Gene therapy
Gene therapy may be used for treating, or even curing, genetic and acquired diseases like cancer and AIDS by using normal genes to supplement or replace defective genes or to bolster a normal function such as immunity. It can be used to target somatic cells (i.e., those of the body) or gametes (i.e., egg and sperm cells). In somatic gene therapy, the genome of the recipient is changed, but this change is not passed along to the next generation. In contrast, in germline gene therapy, the egg and sperm cells of the parents are changed for the purpose of passing on the changes to their offspring. There are basically two ways of implementing a gene therapy treatment:
Gene therapy using an Adenovirus vector. A new gene is inserted into an adenovirus vector, which is used to introduce the modified DNA into a human cell. If the treatment is successful, the new gene will make a functional protein.


1. Ex vivo, which means "outside the body": cells from the patient's blood or bone marrow are removed and grown in the laboratory. They are then exposed to a virus carrying the desired gene. The virus enters the cells, and the desired gene becomes part of the DNA of the cells. The cells are allowed to grow in the laboratory before being returned to the patient by injection into a vein.
2. In vivo, which means "inside the body": no cells are removed from the patient's body. Instead, vectors are used to deliver the desired gene to cells in the patient's body.

As of June 2001, more than 500 clinical gene-therapy trials involving about 3,500 patients had been identified worldwide. Around 78% of these are in the United States, with Europe having 18%. These trials focus on various types of cancer, although other multigenic diseases are being studied as well. Recently, two children born with severe combined immunodeficiency disorder ("SCID") were reported to have been cured after being given genetically engineered cells.

Gene therapy faces many obstacles before it can become a practical approach for treating disease.[23] At least four of these obstacles are as follows:

1. Gene delivery tools. Genes are inserted into the body using gene carriers called vectors. The most common vectors now are viruses, which have evolved a way of encapsulating and delivering their genes to human cells in a pathogenic manner. Scientists manipulate the genome of the virus by removing the disease-causing genes and inserting the therapeutic genes. However, while viruses are effective, they can introduce problems like toxicity, immune and inflammatory responses, and gene control and targeting issues. In addition, in order for gene therapy to provide permanent therapeutic effects, the introduced gene needs to be integrated within the host cell's genome. Some viral vectors effect this in a random fashion, which can introduce other problems such as disruption of an endogenous host gene.
2. High costs. Since gene therapy is relatively new and at an experimental stage, it is an expensive treatment to undertake. This explains why current studies are focused on illnesses commonly found in developed countries, where more people can afford to pay for treatment. It may take decades before developing countries can take advantage of this technology.
3. Limited knowledge of the functions of genes. Scientists currently know the functions of only a few genes. Hence, gene therapy can address only some genes that cause a particular disease. Worse, it is not known exactly whether genes have more than one function, which creates uncertainty as to whether replacing such genes is indeed desirable.

4. Multigene disorders and effect of environment. Most genetic disorders involve more than one gene. Moreover, most diseases involve the interaction of several genes and the environment. For example, many people with cancer not only inherit the disease gene for the disorder, but may have also failed to inherit specific tumor suppressor genes. Diet, exercise, smoking and other environmental factors may have also contributed to their disease.

Human Genome Project
The Human Genome Project is an initiative of the U.S. Department of Energy ("DOE") and the National Institutes of Health ("NIH") that aims to generate a high-quality reference sequence for the entire human genome and identify all the human genes. The DOE and its predecessor agencies were assigned by the U.S. Congress to develop new energy resources and technologies and to pursue a deeper understanding of potential health and environmental risks posed by their production and use. In 1986, the DOE announced its Human Genome Initiative. Shortly thereafter, the DOE and National Institutes of Health developed a plan for a joint Human Genome Project ("HGP"), which officially began in 1990. The HGP was originally planned to last 15 years. However, rapid technological advances and worldwide participation accelerated the completion date to 2003 (making it a 13-year project). Already it has enabled gene hunters to pinpoint genes associated with more than 30 disorders.[24]

Cloning
Cloning involves the removal of the nucleus from one cell and its placement in an unfertilized egg cell whose nucleus has either been deactivated or removed. There are two types of cloning:
DNA Replication image from the Human Genome Project (HGP)


1. Reproductive cloning. After a few divisions, the egg cell is placed into a uterus where it is allowed to develop into a fetus that is genetically identical to the donor of the original nucleus.
2. Therapeutic cloning.[25] The egg is placed into a Petri dish where it develops into embryonic stem cells, which have shown potential for treating several ailments.[26]

In February 1997, cloning became the focus of media attention when Ian Wilmut and his colleagues at the Roslin Institute announced the successful cloning of a sheep, named Dolly, from a mammary gland cell of an adult female. The cloning of Dolly made it apparent to many that the techniques used to produce her could someday be used to clone human beings.[27] This stirred a lot of controversy because of its ethical implications.


Agriculture
Crop yield
Using the techniques of modern biotechnology, one or two genes (Smartstax from Monsanto in collaboration with Dow AgroSciences will use 8, starting in 2010) may be transferred to a highly developed crop variety to impart a new character that would increase its yield.[28] However, while increases in crop yield are the most obvious applications of modern biotechnology in agriculture, they are also the most difficult to achieve. Current genetic engineering techniques work best for effects that are controlled by a single gene. Many of the genetic characteristics associated with yield (e.g., enhanced growth) are controlled by a large number of genes, each of which has a minimal effect on the overall yield.[29] There is, therefore, much scientific work to be done in this area.

Reduced vulnerability of crops to environmental stresses
Crops containing genes that will enable them to withstand biotic and abiotic stresses may be developed. For example, drought and excessively salty soil are two important limiting factors in crop productivity. Biotechnologists are studying plants that can cope with these extreme conditions in the hope of finding the genes that enable them to do so and eventually transferring these genes to the more desirable crops. One of the latest developments is the identification of a plant gene, At-DBF2, from Arabidopsis thaliana, a tiny weed that is often used for plant research because it is very easy to grow and its genetic code is well mapped out. When this gene was inserted into tomato and tobacco cells (see RNA interference), the cells were able to withstand environmental stresses like salt, drought, cold and heat far better than ordinary cells. If these preliminary results prove successful in larger trials, then At-DBF2 genes can help in engineering crops that can better withstand harsh environments.[30] Researchers have also created transgenic rice plants that are resistant to rice yellow mottle virus (RYMV). In Africa, this virus destroys the majority of the rice crops and makes the surviving plants more susceptible to fungal infections.[31]

Increased nutritional qualities
Proteins in foods may be modified to increase their nutritional qualities. Proteins in legumes and cereals may be transformed to provide the amino acids needed by human beings for a balanced diet.[29] A good example is the work of Professors Ingo Potrykus and Peter Beyer in creating Golden rice (discussed below).

Improved taste, texture or appearance of food
Modern biotechnology can be used to slow down the process of spoilage so that fruit can ripen longer on the plant and then be transported to the consumer with a still reasonable shelf life. This alters the taste, texture and appearance of the fruit. More importantly, it could expand the market for farmers in developing countries due to the reduction in spoilage. However, there is sometimes a lack of understanding by researchers in developed countries about the actual needs of prospective beneficiaries in developing countries. For example, engineering soybeans to resist spoilage makes them less suitable for producing tempeh, which is a significant source of protein that depends on fermentation. The use of modified soybeans results in a lumpy texture that is less palatable and less convenient when cooking.

The first genetically modified food product was a tomato which was transformed to delay its ripening.[32] Researchers in Indonesia, Malaysia, Thailand, Philippines and Vietnam are currently working on delayed-ripening papaya in collaboration with the University of Nottingham and Zeneca.[33]

Biotechnology in cheese production:[34] enzymes produced by micro-organisms provide an alternative to animal rennet, a cheese coagulant, and an alternative supply for cheese makers. This also eliminates possible public concerns with animal-derived material, although there are currently no plans to develop synthetic milk, thus making this argument less compelling.
Enzymes offer an animal-friendly alternative to animal rennet. While providing comparable quality, they are theoretically also less expensive.

About 85 million tons of wheat flour is used every year to bake bread.[35] By adding an enzyme called maltogenic amylase to the flour, bread stays fresher longer. Assuming that 10 to 15% of bread is thrown away as stale, if it could be made to stay fresh another 5 to 7 days then perhaps 2 million tons of flour per year would be saved. Other enzymes can cause bread to expand to make a lighter loaf, or alter the loaf in a range of ways.

Reduced dependence on fertilizers, pesticides and other agrochemicals
Most of the current commercial applications of modern biotechnology in agriculture focus on reducing the dependence of farmers on agrochemicals. For example, Bacillus thuringiensis (Bt) is a soil bacterium that produces a protein with insecticidal qualities. Traditionally, a fermentation process has been used to produce an insecticidal spray from these bacteria. In this form, the Bt toxin occurs as an inactive protoxin, which requires digestion by an insect to be effective. There are several Bt toxins and each one is specific to certain target insects. Crop plants have now been engineered to contain and express the genes for Bt toxin, which they produce in its active form. When a susceptible insect ingests the transgenic crop cultivar expressing the Bt protein, it stops feeding and soon thereafter dies as a result of the Bt toxin binding to its gut wall. Bt corn is now commercially available in a number of countries to control corn borer (a lepidopteran insect), which is otherwise controlled by spraying (a more difficult process).

Crops have also been genetically engineered to acquire tolerance to broad-spectrum herbicides. The lack of herbicides with broad-spectrum activity and no crop injury was a consistent limitation in crop weed management. Multiple applications of numerous herbicides were routinely used to control a wide range of weed species detrimental to agronomic crops. Weed management tended to rely on preemergence; that is, herbicide applications were sprayed in response to expected weed infestations rather than in response to actual weeds present. Mechanical cultivation and hand weeding were often necessary to control weeds not controlled by herbicide applications.
The introduction of herbicide-tolerant crops has the potential of reducing the number of herbicide active ingredients used for weed management, reducing the number of herbicide applications made during a season, and increasing yield due to improved weed management and less crop injury. Transgenic crops that express tolerance to glyphosate, glufosinate and bromoxynil have been developed. These herbicides can now be sprayed on transgenic crops without inflicting damage on the crops while killing nearby weeds.[36] From 1996 to 2001, herbicide tolerance was the most dominant trait introduced to commercially available transgenic crops, followed by insect resistance. In 2001, herbicide tolerance deployed in soybean, corn and cotton accounted for 77% of the 626,000 square kilometres planted to transgenic crops; Bt crops accounted for 15%; and "stacked genes" for herbicide tolerance and insect resistance used in both cotton and corn accounted for 8%.[37] Production of novel substances in crop plants Biotechnology is being applied for novel uses other than food. For example, oilseed can be modified to produce fatty acids for detergents, substitute fuels and petrochemicals. Potatoes, tomatoes, rice tobacco, lettuce, safflowers, and other plants have been genetically engineered to produce insulin and certain vaccines. If future clinical trials prove successful, the advantages of edible vaccines would be enormous, especially for developing countries. The transgenic plants may be grown locally and cheaply. Homegrown vaccines would also avoid logistical and economic problems posed by having to transport traditional preparations over long distances and keeping them cold while in transit. 
And since they are edible, they will not need syringes, which are not only an additional expense in the traditional vaccine preparations but also a source of infections if contaminated.[38] In the case of insulin grown in transgenic plants, it is well established that the gastrointestinal system breaks the protein down, so it could not currently be administered as an edible protein. However, it might be produced at significantly lower cost than insulin produced in costly bioreactors. For example, Calgary, Canada-based SemBioSys Genetics, Inc. reports that its safflower-produced insulin will reduce unit costs by more than 25% and estimates a reduction of over $100 million in the capital costs associated with building a commercial-scale insulin manufacturing facility, compared to traditional biomanufacturing facilities.[39]

Animal biotechnology
In animals, biotechnology techniques are being used to improve genetics and for pharmaceutical or industrial applications. Molecular biology techniques can help drive breeding programs by directing selection of superior animals. Animal cloning, through somatic cell nuclear transfer (SCNT), allows for genetic replication of selected animals. Genetic engineering, using recombinant DNA, alters the genetic makeup of the animal for selected purposes, including producing therapeutic proteins in cows and goats.[40] A genetically altered salmon with an increased growth rate is being considered for FDA approval.[41]

Criticism
There is another side to the agricultural biotechnology issue. Concerns include increased herbicide usage and resultant herbicide resistance, "super weeds", residues on and in food crops, and genetic contamination of non-GM crops, which hurts organic and conventional farmers.[42][43]

Biological engineering
Biotechnological engineering or biological engineering is a branch of engineering that focuses on biotechnologies and biological science. It includes different disciplines such as biochemical engineering, biomedical engineering, bio-process engineering and biosystem engineering. Because of the novelty of the field, the role of the bioengineer is still not clearly defined. In general, however, it is an approach that integrates the fundamental biological sciences with traditional engineering principles. Biotechnologists are often employed to scale up bioprocesses from the laboratory scale to the manufacturing scale. Moreover, as with most engineers, they often deal with management, economic and legal issues. Since patents and regulation (e.g., U.S. Food and Drug Administration regulation in the U.S.) are very important issues for biotech enterprises, bioengineers are often required to have knowledge related to these issues. The increasing number of biotech enterprises is likely to create a need for bioengineers in the years to come. Many universities throughout the world now provide programs in bioengineering and biotechnology (as independent programs or specialty programs within more established engineering fields).

Bioremediation and biodegradation


Biotechnology is being used to engineer and adapt organisms, especially microorganisms, in an effort to find sustainable ways to clean up contaminated environments. The elimination of a wide range of pollutants and wastes from the environment is an absolute requirement to promote a sustainable development of our society with low environmental impact. Biological processes play a major role in the removal of contaminants, and biotechnology is taking advantage of the astonishing catabolic versatility of microorganisms to degrade or convert such compounds. New methodological breakthroughs in sequencing, genomics, proteomics, bioinformatics and imaging are producing vast amounts of information. In the field of environmental microbiology, genome-based global studies open a new era providing unprecedented in silico views of metabolic and regulatory networks, as well as clues to the evolution of degradation pathways and to the molecular adaptation strategies to changing environmental conditions. Functional genomic and metagenomic approaches are increasing our understanding of the relative importance of different pathways and regulatory networks to carbon flux in particular environments and for particular compounds, and they will certainly accelerate the development of bioremediation technologies and biotransformation processes.[44]
Marine environments are especially vulnerable since oil spills in coastal regions and the open sea are poorly containable and mitigation is difficult. In addition to pollution through human activities, millions of tons of petroleum enter the marine environment every year from natural seepages. Despite its toxicity, a considerable fraction of petroleum oil entering marine systems is eliminated by the hydrocarbon-degrading activities of microbial communities, in particular by a remarkable recently discovered group of specialists, the so-called hydrocarbonoclastic bacteria (HCCB).[45]

Biotechnology regulations
The National Institutes of Health (NIH) was the first federal agency to assume regulatory responsibility in the United States. The Recombinant DNA Advisory Committee of the NIH published guidelines for working with recombinant DNA and recombinant organisms in the laboratory. Today, the agencies responsible for biotechnology regulation are the US Department of Agriculture (USDA), which regulates plant pests and medical preparations from living organisms; the Environmental Protection Agency (EPA), which regulates pesticides and herbicides; and the Food and Drug Administration (FDA), which ensures that food and drug products are safe and effective.[12]

Education
In 1988, after prompting from the United States Congress, the National Institute of General Medical Sciences (National Institutes of Health) (NIGMS) instituted a funding mechanism for biotechnology training. Universities nationwide compete for these funds to establish Biotechnology Training Programs (BTPs). Each successful application is generally funded for five years then must be competitively renewed. Graduate students in turn compete for acceptance into a BTP; if accepted then stipend, tuition and health insurance support is provided for two or three years during the course of their Ph.D. thesis work. Nineteen institutions offer NIGMS supported BTPs.[46] Biotechnology training is also offered at the undergraduate level and in community colleges.

References and notes


[1] http:/ / www. cbd. int/ convention/ text/ [2] "Incorporating Biotechnology into the Classroom - What is Biotechnology?", from the curricula of the 'Incorporating Biotechnology into the High School Classroom through Arizona State University's BioREACH PROGRAM', accessed on October 16, 2012) (http:/ / www. public. asu. edu/ ~langland/ biotech-intro. html) [3] Incorporating Biotechnology into the Classroom - What is Biotechnology?, from Incorporating Biotechnology into the High School Classroom through Arizona State University's BioREACH PROGRAM, Arizona State University, Microbiology Department, retrieved October 16, 2012 (http:/ / www. public. asu. edu/ ~langland/ biotech-intro. html) [4] " The Convention on Biological Diversity (http:/ / www. biodiv. org/ convention/ convention. shtml) (Article 2. Use of Terms)." United Nations. 1992. Retrieved on February 6, 2008. [5] http:/ / www. europabio. org/ what-biotechnology [6] http:/ / www. oecd. org/ science/ innovationinsciencetechnologyandindustry/ 49303992. pdf [7] http:/ / www. oecd. org/ sti/ biotechnologypolicies/ keybiotechnologyindicators. htm [8] http:/ / www. bio. org/ [9] http:/ / en. wikipedia. org/ wiki/ Biotechnology_company [10] http:/ / www. strategyr. com/ showgsbr. asp?ind=BIOT& Pageview=Execute [11] See Arnold, John P. (2005) [1911]. Origin and History of Beer and Brewing: From Prehistoric Times to the Beginning of Brewing Science and Technology. Cleveland, Ohio: BeerBooks. p. 34. ISBN 978-0-9662084-1-2. OCLC 71834130. [12] Thieman, W.J.; Palladino, M.A. (2008). Introduction to Biotechnology. Pearson/Benjamin Cummings. ISBN0-321-49145-9. [13] Springham, D.; Springham, G.; Moses, V.; Cape, R.E. (24 August 1999). Biotechnology: The Science and the Business (http:/ / books. google. com/ books?id=9GY5DCr6LD4C). CRC Press. p.1. ISBN978-90-5702-407-8. . [14] " Diamond v. Chakrabarty, 447 U.S. 303 (1980). No. 79-139 (http:/ / caselaw. lp. findlaw. com/ scripts/ getcase. pl?court=us& vol=447& invol=303)." 
United States Supreme Court. June 16, 1980. Retrieved on May 4, 2007. [15] VoIP Providers And Corn Farmers Can Expect To Have Bumper Years In 2008 And Beyond, According To The Latest Research Released By Business Information Analysts At IBISWorld (http:/ / web. archive. org/ web/ 20080402034432/ http:/ / www. ibisworld. com/ pressrelease/ pressrelease. aspx?prid=115). Los Angeles (March 19, 2008) [16] The Recession List Top 10 Industries to Fly and Fl... (ith anincreasing share accounted for by ...) (http:/ / www. bio-medicine. org/ biology-technology-1/ The-Recession-List---Top-10-Industries-to-Fly-and-Flop-in-2008-4076-3/ ), bio-medicine.org [17] Gerstein, M. " Bioinformatics Introduction (http:/ / www. primate. or. kr/ bioinformatics/ Course/ Yale/ intro. pdf)." Yale University. Retrieved on May 8, 2007. [18] U.S. Department of Energy Human Genome Program, supra note 6. [19] Bains, W. (1987). Genetic Engineering For Almost Everybody: What Does It Do? What Will It Do?. Penguin. p.99. ISBN0-14-013501-4. [20] IDF 2003; "Diabetes Atlas,: 2nd ed."; International Diabetes Federation, Brussels (http:/ / www. eatlas. idf. org/ ), eatlas.idf.org [21] IDF March 2005; "Position Statement." International Diabetes Federation, Brussels. (http:/ / www. idf. org/ home/ index. cfm?node=1385) idf.org

[22] U.S. Department of State International Information Programs, "Frequently Asked Questions About Biotechnology", USIS Online; available from USinfo.state.gov (http:/ / usinfo. state. gov/ ei/ economic_issues/ biotechnology/ biotech_faq. html), accessed 13 September 2007. Cf. Feldbaum, C. (February 2002). "Some History Should Be Repeated". Science 295 (5557): 975. doi:10.1126/science.1069614. PMID11834802. [23] The National Action Plan on Breast Cancer and U.S. National Institutes of Health-Department of Energy Working Group on the Ethical, Legal and Social Implications (ELSI) have issued several recommendations to prevent workplace and insurance discrimination. The highlights of these recommendations, which may be taken into account in developing legislation to prevent genetic discrimination, may be found at elsi/legislat.html ORNL.org (http:/ / www. ornl. gov/ hgmis/ ). [24] U.S. Department of Energy Human Genome Program, supra note 6 [25] A number of scientists have called for the use the term "nuclear transplantation," instead of "therapeutic cloning," to help reduce public confusion. The term "cloning" has become synonymous with "somatic cell nuclear transfer," a procedure that can be used for a variety of purposes, only one of which involves an intention to create a clone of an organism. They believe that the term "cloning" is best associated with the ultimate outcome or objective of the research and not the mechanism or technique used to achieve that objective. They argue that the goal of creating a nearly identical genetic copy of a human being is consistent with the term "human reproductive cloning," but the goal of creating stem cells for regenerative medicine is not consistent with the term "therapeutic cloning." The objective of the latter is to make tissue that is genetically compatible with that of the recipient, not to create a copy of the potential tissue recipient. Hence, "therapeutic cloning" is conceptually inaccurate. 
Vogelstein B., Alberts B., Shine K. (February 2002). "Please Don't Call It Cloning!". Science 295 (5558): 1237. doi:10.1126/science.1070247. PMID11847324. [26] Cameron D. (23 May 2002). "Stop the Cloning". Technology Review. Also available from Techreview.com (http:/ / www. techreview. com), [hereafter "Cameron"] [27] Nussbaum, M.C.; Sunstein, C.R. (1998). Clones And Clones: Facts And Fantasies About Human Cloning. New York: W.W. Norton. p.11. However, there is wide disagreement within scientific circles whether human cloning can be successfully carried out. For instance, Dr. Rudolf Jaenisch of Whitehead Institute for Biomedical Research believes that reproductive cloning shortcuts basic biological processes, thus making normal offspring impossible to produce. In normal fertilization, the egg and sperm go through a long process of maturation. Cloning shortcuts this process by trying to reprogram the nucleus of one whole genome in minutes or hours. This results in gross physical malformations to subtle neurological disturbances. Cameron, supra note 30 [28] Asian Development Bank, Agricultural Biotechnology, Poverty Reduction and Food Security (Manila: Asian Development Bank, 2001). Also available from ADB.org (http:/ / www. adb. org) [29] D. Bruce and A. Bruce, Engineering Genesis: The Ethics of Genetic Engineering, London: Earthscan Publications, 1999 ISBN 1-85383-570-6 [30] Sara Abdulla (27 May 1999). "Drought stress". Nature News. doi:10.1038/news990527-9. [31] National Academy of Sciences (2001). Transgenic Plants and World Agriculture. Washington: National Academy Press. [32] For an account of the research and development of Flavr Savr tomato, see Martineau, B. (2001). First Fruit: The Creation of the Flavr Savr Tomato and the Birth of Biotech Food. New York: McGraw-Hill. [33] A.F. Krattiger, An Overview of ISAAA from 1992 to 2000, ISAAA Brief No. 19-2000, 9 [34] EuropaBio An animal friendly alternative for cheeze makers (http:/ / www. europabio. 
org/ documents/ cheese. pdf), Europabio.org [35] EuropaBio Biologically better bread (http:/ / www. europabio. org/ documents/ painbread. pdf), Europabio.org [36] L. P. Gianessi, C. S. Silvers, S. Sankula and J. E. Carpenter. Plant Biotechnology: Current and Potential Impact for Improving Pest management in US Agriculture, An Analysis of 40 Case Studies (http:/ / croplife. intraspin. com/ Biotech/ plant-biotechnology-current-and-potential-impact-for-improving-pest-management-in-u-s-agriculture-an-analysis-of-40-case-studies/ ) (Washington, D.C.: National Center for Food and Agricultural Policy, 2002), 56 [37] C. James, "Global Review of Commercialized Transgenic Crops: 2002", ISAAA Brief No. 27-2002, at 1112. Also available from ISAAA.org (http:/ / www. isaaa. org/ resources/ publications/ briefs/ default. html) [38] Pascual DW (2007). "Vaccines are for dinner" (http:/ / www. pnas. org/ cgi/ content/ full/ 104/ 26/ 10757). Proc Natl Acad Sci USA 104 (26): 1075710758. doi:10.1073/pnas.0704516104. PMC1904143. PMID17581867. . [39] SemBioSys.ca (http:/ / www. sembiosys. ca/ ). SemBioSys.ca. Retrieved on 2011-09-05. [40] Van Eenennaam, AL (2006). "What is the Future of Animal Biotechnology?" (http:/ / animalscience. ucdavis. edu/ animalbiotech/ My_Laboratory/ Publications/ futureanimalbiotech. pdf). California Agriculture 60 (3): 132139. doi:10.3733/ca.v060n03p132. . [41] Dove, AW (2005). "Clone on the Range:What Animal Biotech is Bringing to the Table". Nature Biotechnology 23 (3): 283285. doi:10.1038/nbt0305-283. PMID15765075. [42] Monsanto and the Roundup Ready Controversy (http:/ / www. sourcewatch. org/ index. php?title=Monsanto_and_the_Roundup_Ready_Controversy), SourceWatch.org [43] Monsanto (http:/ / www. sourcewatch. org/ index. php?title=Monsanto), SourceWatch.org [44] Diaz E (editor). (2008). Microbial Biodegradation: Genomics and Molecular Biology (http:/ / www. horizonpress. com/ biod) (1st ed.). Caister Academic Press. ISBN1-904455-17-4. . 
[45] Martins VAP (2008). "Genomic Insights into Oil Biodegradation in Marine Systems" (http:/ / www. horizonpress. com/ biod). Microbial Biodegradation: Genomics and Molecular Biology (http:/ / www. lefitummidi. webs. com/ biod). Caister Academic Press. ISBN978-1-904455-17-2. .

[46] Nigms.Nih.Gov (http:/ / www. nigms. nih. gov/ Training/ InstPredoc/ PredocInst-Biotechnology. htm). Nigms.Nih.Gov. Retrieved on 2011-09-05.

Further reading
Friedman, Yali (2008). Building Biotechnology: Starting, Managing, and Understanding Biotechnology Companies (http://www.buildingbiotechnology.com). Washington, DC: Logos Press. ISBN978-0-9734676-3-5. Oliver, Richard W.. The Coming Biotech Age. ISBN0-07-135020-9. Powell, Walter W.; White, Douglas R.; Koput, Kenneth W.; Owen-Smith, Jason (2005). "Network Dynamics and Field Evolution: The Growth of Interorganizational Collaboration in the Life Sciences". American Journal of Sociology 110 (4): 11321205. doi:10.1086/421508. Viviana Zelizer Best Paper in Economic Sociology Award (20052006), American Sociological Association. Zaid, A; H.G. Hughes, E. Porceddu, F. Nicholas (2001). Glossary of Biotechnology for Food and Agriculture A Revised and Augmented Edition of the Glossary of Biotechnology and Genetic Engineering. Available in English, French, Spanish, Chinese, Arabic, Russian, Polish, Serbian, Vietnamese and Kazakh (http://www.fao. org/biotech/biotech-glossary/en/). Rome: FAO. ISBN92-5-104683-2. Agricultural Biotechnology: An Economic Perspective (http://naldr.nal.usda.gov/Exe/ZyNET.exe/ E6870001.XML?ZyActionD=ZyDocument&Client=National Agricultural Library Digital Repository& Index=AH|AH2|AIB|BIC|Books|ERS|FVMNR|JAR|MP|ROS|Rural|TB|USDA_Div_Bulletin|WPC|YOA1|YOA2& Docs=&Query=biotechnology&Time=&EndTime=&SearchMethod=1&TocRestrict=n&Toc=&TocEntry=& QField=&QFieldYear=&QFieldMonth=&QFieldDay=&UseQField=&IntQFieldOp=1&ExtQFieldOp=1& XmlQuery=&Doc=<document name="E6870001.XML" path="\\NALDR\DIGITAL\ZYFILES\INDEXDATA\ERS\XML\2008\00000002\" index="ERS"/>& File=\\NALDR\DIGITAL\ZYFILES\INDEXDATA\ERS\XML\2008\00000002\E6870001.XML& User=ANONYMOUS&Password=&SortMethod=h|-&MaximumDocuments=20&FuzzyDegree=0& ImageQuality=r85g16/r85g16/x150y150g16/i500&Display=hpfrw&DefSeekPage=f& SearchBack=ZyActionL&Back=ZyActionS&BackDesc=Results page) by the USDA Economic Research Service. A 1994 publication from the Agricultural Economic Report.

External links
The International Forum on Biotechnology (http://www.biot.tk)
Foundation for Biotechnology Awareness and Education (http://www.fbae.org/)
A report on Agricultural Biotechnology (http://www.fao.org/docrep/006/y5160e/y5160e00.HTM) focusing on the impacts of "Green" Biotechnology with a special emphasis on economic aspects. fao.org.
US Economic Benefits of Biotechnology to Business and Society (http://www.economics.noaa.gov/?goal=ecosystems&file=users/business/biotech) NOAA Economics, economics.noaa.gov
Database of the Safety and Benefits of Biotechnology (http://croplife.intraspin.com/Biotech/) a database of peer-reviewed scientific papers and the safety and benefits of biotechnology.

Biosensor
A biosensor is an analytical device, used for the detection of an analyte, that combines a biological component with a physicochemical detector. It consists of the following parts:
the sensitive biological element: a biological material (e.g. tissue, microorganisms, organelles, cell receptors, enzymes, antibodies, nucleic acids, etc.), a biologically derived material or a biomimetic component that interacts with (binds or recognises) the analyte under study. The biologically sensitive elements can also be created by biological engineering;
the transducer or detector element, which works in a physicochemical way (optical, piezoelectric, electrochemical, etc.) and transforms the signal resulting from the interaction of the analyte with the biological element into another signal (i.e., transduces it) that can be more easily measured and quantified;
the biosensor reader device, with the associated electronics or signal processors that are primarily responsible for displaying the results in a user-friendly way.[1] This sometimes accounts for the most expensive part of the sensor device; however, it is possible to generate a user-friendly display that includes the transducer and sensitive element (see Holographic Sensor). The readers are usually custom-designed and manufactured to suit the different working principles of biosensors. Known manufacturers of biosensor electronic readers include PalmSens, Gwent Biotechnology Systems and Rapid Labs.
A common example of a commercial biosensor is the blood glucose biosensor, which uses the enzyme glucose oxidase to break blood glucose down. In doing so it first oxidizes glucose and uses two electrons to reduce the FAD (a component of the enzyme) to FADH2. This in turn is oxidized by the electrode (which accepts the two electrons) in a number of steps. The resulting current is a measure of the concentration of glucose. In this case, the electrode is the transducer and the enzyme is the biologically active component.
Recently, arrays of many different detector molecules have been applied in so-called electronic nose devices, where the pattern of response from the detectors is used to fingerprint a substance. In the Wasp Hound odor-detector, the mechanical element is a video camera and the biological element is five parasitic wasps that have been conditioned to swarm in response to the presence of a specific chemical.[2] Current commercial electronic noses, however, do not use biological elements.
A canary in a cage, as used by miners to warn of gas, could be considered a biosensor. Many of today's biosensor applications are similar, in that they use organisms which respond to toxic substances at much lower concentrations than humans can detect, to warn of their presence. Such devices can be used in environmental monitoring, trace gas detection and in water treatment facilities.
Many optical biosensors are based on the phenomenon of surface plasmon resonance (SPR). This utilises a property of gold and other materials; specifically, that a thin layer of gold on a high refractive index glass surface can absorb laser light, producing electron waves (surface plasmons) on the gold surface. This occurs only at a specific angle and wavelength of incident light and is highly dependent on the surface of the gold, such that binding of a target analyte to a receptor on the gold surface produces a measurable signal.
Surface plasmon resonance sensors operate using a sensor chip consisting of a plastic cassette supporting a glass plate, one side of which is coated with a microscopic layer of gold. This side contacts the optical detection apparatus of the instrument. The opposite side is then contacted with a microfluidic flow system. The contact with the flow system creates channels across which reagents can be passed in solution. This side of the glass sensor chip can be modified in a number of ways, to allow easy attachment of molecules of interest.
Normally it is coated in carboxymethyl dextran or a similar compound. Light of a fixed wavelength is reflected off the gold side of the chip at the angle of total internal reflection, and detected inside the instrument. The angle of incident light is varied in order to match the evanescent wave propagation rate with the propagation rate of the surface plasmon polaritons.[3] This induces the evanescent wave to penetrate through the glass plate and some distance into the liquid flowing over the surface.
The refractive index at the flow side of the chip surface has a direct influence on the behaviour of the light reflected off the gold side. Binding to the flow side of the chip has an effect on the refractive index, and in this way biological interactions can be measured to a high degree of sensitivity. The refractive index of the medium near the surface changes when biomolecules attach to the surface, and the SPR angle varies as a function of this change.
Other evanescent wave biosensors have been commercialised using waveguides where the propagation constant through the waveguide is changed by the adsorption of molecules to the waveguide surface. One such example, dual polarisation interferometry, uses a buried waveguide as a reference against which the change in propagation constant is measured. Other configurations such as the Mach-Zehnder interferometer have reference arms lithographically defined on a substrate. Higher levels of integration can be achieved using resonator geometries where the resonant frequency of a ring resonator changes when molecules are adsorbed.[4][5]
Other optical biosensors are mainly based on changes in absorbance or fluorescence of an appropriate indicator compound and do not need a total internal reflection geometry. For example, a fully operational prototype device detecting casein in milk has been fabricated. The device is based on detecting changes in absorption of a gold layer.[6] A widely used research tool, the micro-array, can also be considered a biosensor.
Nanobiosensors use an immobilized bioreceptor probe that is selective for target analyte molecules. Nanomaterials are exquisitely sensitive chemical and biological sensors. Nanoscale materials demonstrate unique properties.
Their large surface area to volume ratio can achieve rapid and low cost reactions, using a variety of designs.[7] Biological biosensors often incorporate a genetically modified form of a native protein or enzyme. The protein is configured to detect a specific analyte and the ensuing signal is read by a detection instrument such as a fluorometer or luminometer. An example of a recently developed biosensor is one for detecting cytosolic concentration of the analyte cAMP (cyclic adenosine monophosphate), a second messenger involved in cellular signaling triggered by ligands interacting with receptors on the cell membrane.[8] Similar systems have been created to study cellular responses to native ligands or xenobiotics (toxins or small molecule inhibitors). Such "assays" are commonly used in drug discovery development by pharmaceutical and biotechnology companies. Most cAMP assays in current use require lysis of the cells prior to measurement of cAMP. A live-cell biosensor for cAMP can be used in non-lysed cells with the additional advantage of multiple reads to study the kinetics of receptor response.
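The dependence of the SPR angle on the refractive index near the gold surface, described earlier in this section, can be illustrated with a short calculation. The sketch below assumes the standard prism-coupling matching condition, n_prism sin(theta) = sqrt(eps_m eps_d / (eps_m + eps_d)), and uses only the real part of the metal permittivity. The prism index, the gold permittivity and the two medium indices are illustrative assumptions, not values taken from the text.

```python
import math

def spr_angle_deg(n_prism, eps_metal_real, n_medium):
    """Approximate SPR angle (degrees) for a prism-coupled sensor.

    Matching condition (assumed, lossless approximation):
        n_prism * sin(theta) = sqrt(eps_m * eps_d / (eps_m + eps_d))
    where eps_d = n_medium**2 is the permittivity of the liquid side.
    """
    eps_d = n_medium ** 2
    if eps_metal_real + eps_d >= 0:
        raise ValueError("metal permittivity must be more negative than -eps_d")
    n_eff = math.sqrt(eps_metal_real * eps_d / (eps_metal_real + eps_d))
    s = n_eff / n_prism
    if s >= 1.0:
        raise ValueError("no resonance: prism index too low for this medium")
    return math.degrees(math.asin(s))

# Illustrative, assumed values: glass prism, gold near 633 nm, aqueous buffer.
N_PRISM = 1.515
EPS_GOLD_REAL = -11.6

baseline = spr_angle_deg(N_PRISM, EPS_GOLD_REAL, 1.330)  # bare surface in buffer
bound = spr_angle_deg(N_PRISM, EPS_GOLD_REAL, 1.335)     # index raised by bound molecules
shift = bound - baseline

print(f"baseline angle: {baseline:.2f} deg")
print(f"angle after binding: {bound:.2f} deg")
print(f"shift: {shift:.2f} deg")
```

A small increase in the refractive index on the flow side moves the resonance to a larger angle, which is the quantity an SPR instrument tracks. A real instrument model would also need the gold film thickness and the imaginary part of the permittivity.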

Electrochemical
Electrochemical biosensors are normally based on enzymatic catalysis of a reaction that produces or consumes electrons (such enzymes are rightly called redox enzymes). The sensor substrate usually contains three electrodes: a reference electrode, a working electrode and a counter electrode. The target analyte is involved in the reaction that takes place on the active electrode surface, and the reaction may either cause electron transfer across the double layer (producing a current) or contribute to the double layer potential (producing a voltage). Either the current can be measured at a fixed potential (the rate of flow of electrons is then proportional to the analyte concentration), or the potential can be measured at zero current (this gives a logarithmic response). Note that the potential of the working or active electrode is space charge sensitive and this is often used. Further, the label-free and direct electrical detection of small peptides and proteins is possible by their intrinsic charges using biofunctionalized ion-sensitive field-effect transistors.[9]
Another example, the potentiometric biosensor (potential produced at zero current), gives a logarithmic response with a high dynamic range. Such biosensors are often made by screen printing the electrode patterns on a plastic substrate, which is coated with a conducting polymer, after which some protein (enzyme or antibody) is attached. They have only two electrodes and are extremely sensitive and robust. They enable the detection of analytes at levels previously only achievable by HPLC and LC/MS and without rigorous sample preparation. All biosensors usually involve minimal sample preparation as the biological sensing component is highly selective for the analyte concerned. The signal is produced by electrochemical and physical changes in the conducting polymer layer due to changes occurring at the surface of the sensor. Such changes can be attributed to ionic strength, pH, hydration and redox reactions, the latter due to the enzyme label turning over a substrate.[10] Field effect transistors, in which the gate region has been modified with an enzyme or antibody, can also detect very low concentrations of various analytes, as the binding of the analyte to the gate region of the FET causes a change in the drain-source current.
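The difference between the two read-out modes described in this section (a current proportional to concentration versus a potential that responds logarithmically) can be made concrete with a small sketch. The logarithmic response is modelled here with a Nernst-type expression, which is an assumption introduced for illustration; the sensitivity, the offset potential and the function names are hypothetical.

```python
import math

R = 8.314462618   # molar gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def amperometric_current(conc, sensitivity):
    """Current at fixed potential: assumed strictly proportional to concentration."""
    return sensitivity * conc

def potentiometric_potential(conc, e0, n_electrons=1, temp_k=298.15):
    """Potential at zero current: assumed Nernst-type logarithmic response (volts)."""
    return e0 + (R * temp_k / (n_electrons * F)) * math.log(conc)

# Hypothetical sensor parameters: 5.0 (current units per mol/L) and E0 = 0.2 V.
for conc in (1e-6, 1e-5, 1e-4, 1e-3):
    i = amperometric_current(conc, sensitivity=5.0)
    e = potentiometric_potential(conc, e0=0.2)
    print(f"c = {conc:.0e} mol/L  current = {i:.2e}  potential = {e * 1000:.1f} mV")
```

Under these assumptions each tenfold change in concentration multiplies the current by ten but shifts the potential by a fixed step of roughly 59 mV at 25 °C for a one-electron process. This is why the potentiometric mode covers a wide dynamic range.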

Ion Channel Switch


The use of ion channels has been shown to offer highly sensitive detection of target biological molecules.[11] By embedding the ion channels in supported or tethered bilayer membranes (t-BLM) attached to a gold electrode, an electrical circuit is created. Capture molecules such as antibodies can be bound to the ion channel so that the binding of the target molecule controls the ion flow through the channel. This results in a measurable change in the electrical conduction which is proportional to the concentration of the target. An Ion Channel Switch (ICS) biosensor can be created using gramicidin, a dimeric peptide channel, in a tethered bilayer membrane.[12] One peptide of gramicidin, with attached antibody, is mobile and one is fixed. Breaking the dimer stops the ionic current through the membrane. The magnitude of the change in electrical signal is greatly increased by separating the membrane from the metal surface using a hydrophilic spacer. Quantitative detection of an extensive class of target species, including proteins, bacteria, drugs and toxins, has been demonstrated using different membrane and capture configurations.[13][14]

ICS - channel open

ICS - channel closed

Others
Piezoelectric sensors utilise crystals which undergo an elastic deformation when an electrical potential is applied to them. An alternating potential (A.C.) produces a standing wave in the crystal at a characteristic frequency. This frequency is highly dependent on the elastic properties of the crystal, such that if a crystal is coated with a biological recognition element the binding of a (large) target analyte to a receptor will produce a change in the resonance frequency, which gives a binding signal. In a mode that uses surface acoustic waves (SAW), the sensitivity is greatly increased. This is a specialised application of the Quartz crystal microbalance as a biosensor. Thermometric and magnetic based biosensors are rare.
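The link between bound mass and resonance frequency in a quartz crystal microbalance is commonly estimated with the Sauerbrey relation, which the text above does not name and which is brought in here as an assumption. It applies to thin, rigid, evenly distributed films. The sketch uses commonly quoted constants for AT-cut quartz; the 5 MHz crystal and the mass loading are illustrative values.

```python
import math

RHO_QUARTZ = 2.648     # density of quartz, g/cm^3
MU_QUARTZ = 2.947e11   # shear modulus of AT-cut quartz, g/(cm*s^2)

def sauerbrey_shift_hz(f0_hz, mass_per_area_g_cm2):
    """Frequency shift (Hz) for a thin rigid film, per the Sauerbrey relation:
        delta_f = -2 * f0**2 * (delta_m / A) / sqrt(rho_q * mu_q)
    """
    return -2.0 * f0_hz ** 2 * mass_per_area_g_cm2 / math.sqrt(RHO_QUARTZ * MU_QUARTZ)

# Illustrative case: 5 MHz crystal, 1 microgram per cm^2 of bound analyte.
shift = sauerbrey_shift_hz(5e6, 1e-6)
print(f"frequency shift: {shift:.1f} Hz")
```

The negative sign means that added mass lowers the resonance frequency. Large, soft or hydrated biological layers can deviate noticeably from this rigid-film estimate.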

Applications
There are many potential applications of biosensors of various types. The main requirements for a biosensor approach to be valuable in terms of research and commercial applications are the identification of a target molecule, availability of a suitable biological recognition element, and the potential for disposable portable detection systems to be preferred to sensitive laboratory-based techniques in some situations. Some examples are given below:
Glucose monitoring in diabetes patients (the historical market driver)
Other medical health related targets
Environmental applications, e.g. the detection of pesticides and river water contaminants such as heavy metal ions[15]
Remote sensing of airborne bacteria, e.g. in counter-bioterrorist activities
Detection of pathogens[16]
Determining levels of toxic substances before and after bioremediation
Detection and determination of organophosphates
Routine analytical measurement of folic acid, biotin, vitamin B12 and pantothenic acid as an alternative to microbiological assay
Determination of drug residues in food, such as antibiotics and growth promoters, particularly in meat and honey
Drug discovery and evaluation of biological activity of new compounds
Protein engineering in biosensors[17]
Detection of toxic metabolites such as mycotoxins[18]

Glucose monitoring
Commercially available glucose monitors rely on amperometric sensing of glucose by means of glucose oxidase, which oxidises glucose, producing hydrogen peroxide that is detected by the electrode. To overcome the limitations of amperometric sensors, there is a flurry of research into novel sensing methods, such as fluorescent glucose biosensors.
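In practice, an amperometric reading is converted to a glucose concentration through a calibration against known standards. The following minimal sketch fits a straight line to calibration points and inverts it for an unknown sample. The standards, the currents and the function names are hypothetical, and a linear response over the whole range is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def current_to_concentration(current, slope, intercept):
    """Invert the calibration line to estimate concentration from a current."""
    return (current - intercept) / slope

# Hypothetical calibration standards (glucose in mM) and measured currents (uA).
standards_mM = [0.0, 2.0, 5.0, 10.0, 20.0]
currents_uA = [0.10, 0.50, 1.10, 2.10, 4.10]

slope, intercept = fit_line(standards_mM, currents_uA)
sample_mM = current_to_concentration(1.5, slope, intercept)
print(f"sensitivity: {slope:.3f} uA/mM, background: {intercept:.3f} uA")
print(f"sample reading of 1.5 uA corresponds to {sample_mM:.2f} mM")
```

The intercept captures the background current measured with no glucose present. Real sensors saturate at high concentrations, so the linear fit should only be used inside the calibrated range.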

Interferometric Reflectance Imaging Sensor


The Interferometric Reflectance Imaging Sensor (IRIS) was developed by the Unlu research group at Boston University for the purpose of label-free biosensing. Using simple lenses and low-powered, coherent LEDs, the device offers exquisite sensitivity and reproducibility and is able to image with remarkable resolution beyond the classical diffraction limit. This relatively cheap solution also presents minimal hazards when compared to a laser illumination source. The IRIS operates solely on optical reflection. Its ability to image with extremely high spatial resolution stems from the integration of a diffuser into the design of the microscope. The diffuser randomizes the directionality of the light from a single LED source (called Köhler illumination), which allows for sharp focusing of incident light without back-imaging the source in the image projection.

Practical uses of this device include the detection of bacterial and viral infections in underdeveloped countries. When pathogen-specific growth factors are introduced into a microarray, only spots with the targeted pathogens will grow and increase in concentration. In turn, this produces a change in the reflected intensity compared to pre-growth. Thus, by measuring how reflectance changes over time, unknown pathogens and their growth rates can be characterized and identified.

One specific form of photometric biosensing technique developed by researchers at Boston University is interferometric reflectance imaging. Using optical interference techniques, imaging of antibodies was successfully performed. This was achieved without altering the antibody structure or using bio-markers such as fluorescent proteins. The basis of this technique stems solely from optical interference. By using a reflective substrate such as silicon, light reflected from proteins will interfere with light reflected from the substrate. As a result, interference patterns are generated that alter the intensity of the reflected light. This phenomenon is measurable by a camera. Proteins have indices of refraction based on their concentration. When light is shone on the proteins, a portion of the light is transmitted through the molecules and reflected off the silicon's surface. The light initially reflected off the proteins and the light reflected off the surface of the silicon (after being transmitted back through the protein) have a relative phase difference, which contributes a wavelength-dependent sinusoidal variation to the total amount of reflected light captured by the imaging device.
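The wavelength-dependent sinusoidal variation described above can be illustrated with a two-beam interference model. This is a minimal sketch, not the IRIS signal model: it assumes normal incidence, a single transparent layer of uniform refractive index on the reflective substrate, and no phase change on reflection. The beam intensities, layer thicknesses and refractive index are hypothetical illustration values.

```python
import math


def reflected_intensity(wavelength_nm, thickness_nm, n_film, i_top=0.04, i_bottom=0.30):
    """Two-beam interference of light reflected from the top of a thin
    transparent layer and from the reflective substrate beneath it.

    i_top and i_bottom are assumed relative intensities of the two beams.
    """
    # Phase accumulated on the round trip through the layer (normal incidence).
    phase = 4.0 * math.pi * n_film * thickness_nm / wavelength_nm
    return i_top + i_bottom + 2.0 * math.sqrt(i_top * i_bottom) * math.cos(phase)


if __name__ == "__main__":
    # Compare a bare layer with one that is 5 nm thicker (e.g. bound protein).
    for wavelength in (455.0, 525.0, 595.0, 625.0):
        bare = reflected_intensity(wavelength, 500.0, 1.45)
        coated = reflected_intensity(wavelength, 505.0, 1.45)
        print(f"{wavelength:.0f} nm: bare={bare:.4f} coated={coated:.4f}")
```

In this model a few nanometres of added material shift the phase slightly, so the reflected intensity changes by a different amount at each wavelength.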


Biosensors in food analysis


There are several applications of biosensors in food analysis. In the food industry, optics coated with antibodies are commonly used to detect pathogens and food toxins. The optical readout in these biosensors is usually fluorescence, since this type of optical measurement can greatly amplify the signal. A range of immuno- and ligand-binding assays for the detection and measurement of small molecules such as water-soluble vitamins and chemical contaminants (drug residues) such as sulfonamides and beta-agonists have been developed for use on SPR-based sensor systems, often adapted from existing ELISA or other immunological assays. These are in widespread use across the food industry.

Surface attachment of the biological elements


An important step in building a biosensor is attaching the biological elements (small molecules, proteins or cells) to the surface of the sensor (be it metal, polymer or glass). The simplest way is to functionalize the surface in order to coat it with the biological elements. This can be done with polylysine, aminosilane, epoxysilane or nitrocellulose in the case of silicon chips or silica glass. Subsequently the bound biological agent may be fixed, for example, by layer-by-layer deposition of alternately charged polymer coatings.[19]

Alternatively, three-dimensional lattices (hydrogel/xerogel) can be used to chemically or physically entrap the biological elements. Chemically entrapped means that the biological element is kept in place by a strong bond; physically entrapped means that it is kept in place because it cannot pass through the pores of the gel matrix. The most commonly used hydrogel is sol-gel, a glassy silica generated by polymerization of silicate monomers (added as tetra alkyl orthosilicates, such as TMOS or TEOS) in the presence of the biological elements (along with other stabilizing polymers, such as PEG) in the case of physical entrapment.[20]

Another group of hydrogels, which set under conditions suitable for cells or proteins, are acrylate hydrogels, which polymerize upon radical initiation. One type of radical initiator is a peroxide radical, typically generated by combining a persulfate with TEMED (polyacrylamide gels are also commonly used for protein electrophoresis).[21] Alternatively, light can be used in combination with a photoinitiator, such as DMPA (2,2-dimethoxy-2-phenylacetophenone).[22] Smart materials that mimic the biological components of a sensor can also be classified as biosensors, using only the active or catalytic site or analogous configurations of a biomolecule.[23]

DNA Biosensors
In the future, DNA will find use as a versatile material from which scientists can craft biosensors. DNA biosensors can theoretically be used for medical diagnostics, forensic science, agriculture, or even environmental clean-up efforts. No external monitoring is needed for DNA-based sensing devices, which is a significant advantage. DNA biosensors are complicated mini-machines consisting of sensing elements, micro lasers, and a signal generator. At the heart of DNA biosensor function is the fact that two strands of DNA stick to each other by virtue of chemical attractive forces. On such a sensor, only an exact fit (that is, two strands that match up at every nucleotide position) gives rise to a fluorescent signal (a glow) that is then transmitted to a signal generator.
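The "exact fit" criterion can be illustrated with a short check of Watson-Crick complementarity. This is only a sketch of the matching rule: real hybridization depends on temperature, salt and strand length, and the function names and sequences below are hypothetical.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the strand (written 5'->3') that pairs perfectly with `strand`."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def is_exact_fit(probe, target):
    """True only if `target` pairs with `probe` at every nucleotide position."""
    return target.upper() == reverse_complement(probe)


def sensor_signal(probe, target):
    """Idealized readout: a signal is produced only for an exact fit."""
    return "fluorescent signal" if is_exact_fit(probe, target) else "no signal"


if __name__ == "__main__":
    probe = "ATGCGT"                        # hypothetical immobilized probe
    print(sensor_signal(probe, "ACGCAT"))   # perfectly complementary target
    print(sensor_signal(probe, "ACGCAA"))   # one mismatched position
```

A single mismatched base is enough to switch the idealized readout off, which is the selectivity the paragraph describes.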


References
[1] Cavalcanti A, Shirinzadeh B, Zhang M, Kretly LC (2008). "Nanorobot Hardware Architecture for Medical Defense" (http://www.mdpi.org/sensors/papers/s8052932.pdf). Sensors 8 (5): 2932-2958. doi:10.3390/s8052932.
[2] "Wasp Hound" (http://www.sciencentral.com/articles/view.php3?article_id=218392717). Science Central. Retrieved 23 February 2011.
[3] Homola J (2003). Present and future of surface plasmon resonance biosensors.
[4] M. Iqbal, M. A. Gleeson, B. Spaugh, F. Tybor, W. G. Gunn, M. Hochberg, T. Baehr-Jones, R. C. Bailey, L. C. Gunn, "Label-Free Biosensor Arrays Based on Silicon Ring Resonators and High-Speed Optical Scanning Instrumentation", IEEE J. Sel. Top. Quant. Elec. 16, 654-661 (2010)
[5] J. Witzens, M. Hochberg (2011). "Optical detection of target molecule induced aggregation of nanoparticles by means of high-Q resonators" (http://www.opticsinfobase.org/oe/fulltext.cfm?uri=oe-19-8-7034&id=211400). Opt. Express 19: 7034-7061.
[6] H. M. Hiep et al. "A localized surface plasmon resonance based immunosensor for the detection of casein in milk" Sci. Technol. Adv. Mater. 8 (2007) 331 free download (http://dx.doi.org/10.1016/j.stam.2006.12.010)
[7] Gerald A Urban 2009 Meas. Sci. Technol. 20 012001 doi:10.1088/0957-0233/20/1/012001
[8] Fan, F. et al. (2008) Novel Genetically Encoded Biosensors Using Firefly Luciferase. ACS Chem. Biol. 3, 346-51. free download (http://pubs.acs.org/doi/abs/10.1021/cb8000414)
[9] S.Q. Lud, M.G. Nikolaides, I. Haase, M. Fischer and A.R. Bausch (2006). "Field Effect of Screened Charges: Electrical Detection of Peptides and Proteins by a Thin Film Resistor" ChemPhysChem 7(2), 379-384 doi:10.1002/cphc.200500484
[10] http://www.universalsensors.co.uk
[11] Vockenroth I, Atanasova P, Knoll W, Jenkins A, Köper I (2005). "Functional tethered bilayer membranes as a biosensor platform". IEEE Sensors 2005 - the 4-th IEEE Conference on Sensors: 608-610.
[12] Cornell BA, Braach-Maksvytis VLB, King LG et al. (1997). "A biosensor that uses ion-channel switches". Nature 387 (6633): 580-583. Bibcode:1997Natur.387..580C. doi:10.1038/42432. PMID 9177344.
[13] Oh S, Cornell B, Smith D et al. (2008). "Rapid detection of influenza A virus in clinical samples using an ion channel switch biosensor". Biosensors & Bioelectronics 23 (7): 1161-1165. doi:10.1016/j.bios.2007.10.011.
[14] Krishnamurthy V, Monfared S, Cornell B (2010). "Ion Channel Biosensors Part I Construction Operation and Clinical Studies". IEEE Transactions on Nanotechnology 9 (3): 313-322. Bibcode:2010ITNan...9..313K. doi:10.1109/TNANO.2010.2041466.
[15] Saharudin Haron and Asim K. Ray (2006) Optical biodetection of cadmium and lead ions in water. (http://www.cheme.utm.my/staff/saharudin/index.php?option=com_content&task=view&id=24&Itemid=42) Medical Engineering and Physics, 28 (10). pp. 978-981. ISSN 1350-4533.
[16] Pohanka M, Skladal P, Kroca M (2007). "Biosensors for biological warfare agent detection". Def. Sci. J. 57(3):185-93.
[17] http://www.springerlink.com/content/672p4l4l45xk02j2
[18] Pohanka M, Jun D, Kuca K (2007). "Mycotoxin assay using biosensor technology: a review". Drug Chem. Toxicol. 30(3):253-61.
[19] Nanomedicine and its potential in diabetes research and practice. Pickup JC, Zhi ZL, Khan F, Saxl T, Birch DJ. Diabetes Metab Res Rev. 2008 Nov-Dec;24(8):604-10.
[20] Entrapment of biomolecules in sol-gel matrix for applications in biosensors: problems and future prospects. Gupta R, Chaudhury NK. Biosens Bioelectron. 2007 May 15;22(11):2387-99.
[21] Clark HA, Kopelman R, Tjalkens R, Philbert MA. Optical nanosensors for chemical analysis inside single living cells. 2. Sensors for pH and calcium and the intracellular application of PEBBLE sensors. Anal Chem. 1999 Nov 1;71(21):4837-43.
[22] Percutaneous fiber-optic sensor for chronic glucose monitoring in vivo. Liao KC, Hogen-Esch T, Richmond FJ, Marcu L, Clifton W, Loeb GE. Biosens Bioelectron. 2008 May 15;23(10):1458-65.
[23] http://www.technologyreview.com/biomedicine/21603/?a=f

"The Chemistry of Health." The Chemistry of Health (2006): 42-43. National Institutes of Health and National Institute of General Medical Sciences. Web. <http://www.nigms.nih.gov>

External links
• What are biosensors (http://www.lsbu.ac.uk/biology/enztech/biosensors.html)
• Biosensor Applications - Drug and explosives detection products from Sweden (http://www.biosensor.se)
• Scratching at the surface of biosensors (http://www.rsc.org/Publishing/ChemTech/Volume/2009/02/biosensors.asp) - an Instant Insight (http://www.rsc.org/Publishing/ChemTech/Instant_insights.asp) discussing how surface chemistry lets porous silicon biosensors fulfil their promise, from the Royal Society of Chemistry
• Biosensor Forum - Social network for researchers and organizations (http://www.biosensorforum.com)


Biochemical cascade
A biochemical cascade is a series of chemical reactions in which the products of one reaction are consumed in the next reaction [1]. These cascades facilitate the transformation or generation of complex molecules in small steps. At each step, various controlling factors regulate the cellular reactions, allowing cells to respond effectively to cues about their changing internal and external environments. Chemical reactions are orchestrated by complex molecular networks, which consist of proteins/enzymes or RNAs (second messengers), connected by activation or synthesis in biological processes [2].
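The idea that the product of one reaction is consumed by the next can be illustrated with the simplest possible cascade, A -> B -> C, where both steps follow first-order kinetics. This is a generic textbook model rather than a description of any particular biological cascade. The rate constants are arbitrary illustration values, and the closed-form solution used here requires k1 and k2 to differ.

```python
import math


def cascade_concentrations(t, a0=1.0, k1=0.5, k2=0.2):
    """Concentrations of A, B and C at time t for the chain A -> B -> C.

    Both steps are first order; k1 and k2 (assumed values) must not be equal.
    """
    a = a0 * math.exp(-k1 * t)
    # B is produced from A and consumed by the second reaction.
    b = a0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    # Mass balance: everything that is neither A nor B has become C.
    c = a0 - a - b
    return a, b, c


if __name__ == "__main__":
    for t in (0.0, 1.0, 3.0, 10.0, 30.0):
        a, b, c = cascade_concentrations(t)
        print(f"t={t:5.1f}  A={a:.3f}  B={b:.3f}  C={c:.3f}")
```

The intermediate B rises while A is plentiful and falls once the second step outpaces its production, the behaviour expected of a species that is both a product and a reactant.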

Introduction
In biochemistry, several important enzymatic cascades and signal transduction cascades participate in metabolic pathways or signalling networks, in which enzymes are usually involved to catalyze the reactions. For example, the tissue factor pathway in the coagulation cascade of secondary hemostasis is the primary pathway leading to fibrin formation and therefore to the initiation of blood coagulation. The pathways are a series of reactions in which a zymogen (inactive enzyme precursor) of a serine protease and its glycoprotein co-factors are activated to become active components that then catalyze the next reaction in the cascade, ultimately resulting in cross-linked fibrin [3].

As another example, the sonic hedgehog signaling pathway is one of the key regulators of embryonic development and is present in all bilaterians [4]. Different parts of the embryo have different concentrations of hedgehog signaling proteins, which give cells the information needed for the embryo to develop correctly into a head or a tail. When the pathway malfunctions, it can result in diseases like basal cell carcinoma [5]. Recent studies point to the role of hedgehog signaling in regulating adult stem cells involved in the maintenance and regeneration of adult tissues. The pathway has also been implicated in the development of some cancers. Drugs that specifically target hedgehog signaling to fight diseases are being actively developed by a number of pharmaceutical companies [6].

Most biochemical cascades are series of events in which one event triggers the next, in a linear fashion. Negative cascades, however, include events that occur in a circular fashion, or that can cause or be caused by multiple events [7].

Biochemical cascades include:
• The Complement system
• The Insulin Signaling Pathway
• The Sonic hedgehog Signaling Pathway
• The Wnt signaling pathway
• The JAK-STAT signaling pathway
• The Adrenergic receptor Pathways
• The Acetylcholine receptor Pathways

Negative cascades include:
• Ischemic cascade

Pathway construction
Pathway building has been performed by individual groups studying a network of interest (e.g., immune signaling pathway) as well as by large bioinformatics consortia (e.g., the Reactome Project) and commercial entities (e.g., Ingenuity Systems). Pathway building is the process of identifying and integrating the entities, interactions, and associated annotations, and populating the knowledge base. Pathway construction can have either a data-driven objective (DDO) or a knowledge-driven objective (KDO). Data-driven pathway construction is used to generate relationship information of genes or proteins identified in a specific experiment such as a microarray study [8]. Knowledge-driven pathway construction entails development of a detailed pathway knowledge base for particular domains of interest, such as a cell type, disease, or system. The curation process of a biological pathway entails identifying and structuring content, mining information manually and/or computationally, and assembling a knowledgebase using appropriate software tools [9]. A schematic illustrating the major steps involved in the data-driven and knowledge-driven construction processes [8].

Schematic Illustrating the Biological Pathway Building Process.

For either DDO or KDO pathway construction, the first step is to mine pertinent information from relevant information sources about the entities and interactions. The information retrieved is assembled using appropriate formats, information standards, and pathway building tools to obtain a pathway prototype. The pathway is further refined to include context-specific annotations such as species, cell/tissue type, or disease type. The pathway can then be verified by the domain experts and updated by the curators based on appropriate feedback [10]. Recent attempts to improve knowledge integration have led to refined classifications of cellular entities, such as GO, and to the assembly of structured knowledge repositories [11]. Data repositories, which contain information regarding sequence data, metabolism, signaling, reactions, and interactions are a major source of information for pathway building [12]. A few useful databases are described in the following table [8].
Database | Curation Type | GO Annotation (Y/N) | Description

1. Protein-protein interactions databases
BIND [13] | Manual Curation | N | 200,000 documented biomolecular interactions and complexes
MINT [14] | Manual Curation | N | Experimentally verified interactions
HPRD [15] | Manual Curation | N | Elegant and comprehensive presentation of the interactions, entities and evidences
MPact [16] | Manual and Automated Curation | | Yeast interactions. A part of MIPS
DIP [17] | Manual and Automated Curation | | Experimentally determined interactions
IntAct [18] | Manual Curation | Y | Database and analysis system of binary and multi-protein interactions
PDZBase [19] | Manual Curation | N | PDZ Domain containing proteins
GNPV [20] | Manual and Automated Curation | Y | Based on specific experiments and literature
BioGrid [21] | Manual Curation | Y | Physical and genetic interactions
UniHi [22] | Manual and Automated Curation | Y | Comprehensive human protein interactions
OPHID [23] | Manual Curation | | Combines PPI from BIND, HPRD, and MINT

2. Metabolic Pathway databases
EcoCyc [24] | Manual and Automated Curation | Y | Entire genome and biochemical machinery of E. Coli
MetaCyc [25] | Manual Curation | N | Pathways of over 165 species
HumanCyc [26] | Manual and Automated Curation | N | Human metabolic pathways and the human genome
BioCyc [27] | Manual and Automated Curation | | Collection of databases for several organisms

3. Signaling Pathway databases
KEGG [28] | Manual Curation | Y | Comprehensive collection of pathways such as human disease, signaling, genetic information processing pathways. Links to several useful databases
PANTHER [29] | Manual Curation | | Compendium of metabolic and signaling pathways built using CellDesigner. Pathways can be downloaded in SBML format
Reactome [30] | Manual Curation | | Hierarchical layout. Extensive links to relevant databases such as NCBI, ENSEMBL, UNIPROT, HAPMAP, KEGG, CHEBI, PubMed, GO. Follows PSI-MI standards
Biomodels [31] | Manual Curation | | Domain experts curated biological connection maps and associated mathematical models
STKE [32] | Manual Curation | N | Repository of canonical pathways
Ingenuity Systems [33] | Manual Curation | Y | Commercial mammalian biological knowledgebase about genes, drugs, chemical, cellular and disease processes, and signaling and metabolic pathways
PID [34] | Manual Curation | Y | Compendium of several highly structured, assembled signaling pathways
BioPP | Manual and Automated Curation | Y | Repository of biological pathways built using CellDesigner

Legend: Y = Yes, N = No; BIND = Biomolecular Interaction Network Database, DIP = Database of Interacting Proteins, GNPV = Genome Network Platform Viewer, HPRD = Human Protein Reference Database, MINT = Molecular INTeraction database, MIPS = Munich Information center for Protein Sequences, UNIHI = Unified Human Interactome, OPHID = Online Predicted Human Interaction Database, EcoCyc = Encyclopaedia of E. Coli Genes and Metabolism, MetaCyc = a Metabolic Pathway database, KEGG = Kyoto Encyclopedia of Genes and Genomes, PANTHER = Protein Analysis Through Evolutionary Relationship database, STKE = Signal Transduction Knowledge Environment, PID = The Pathway Interaction Database, BioPP = Biological Pathway Publisher. A comprehensive list of resources can be found at http://www.pathguide.org.
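The construction steps outlined in this section (mine entities and interactions, assemble a prototype, add context-specific annotations, then verify) can be sketched as a small data structure. This is a hypothetical illustration only: the class names, fields and the list of required annotations are assumptions made for the example and do not correspond to any of the databases or tools named here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interaction:
    source: str
    target: str
    kind: str  # free-text label, e.g. "binds" or "activates"


@dataclass
class Pathway:
    name: str
    entities: set = field(default_factory=set)
    interactions: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)

    def add_interaction(self, source, target, kind):
        """Assembly step: record an interaction and the entities it involves."""
        self.entities.update({source, target})
        self.interactions.append(Interaction(source, target, kind))

    def annotate(self, **context):
        """Refinement step: attach context such as species or cell type."""
        self.annotations.update(context)

    def missing_annotations(self, required=("species", "cell_type")):
        """Verification aid: list required annotations that are still absent."""
        return [key for key in required if key not in self.annotations]


if __name__ == "__main__":
    pathway = Pathway("demo pathway")                 # hypothetical prototype
    pathway.add_interaction("protein A", "protein B", "binds")
    pathway.add_interaction("protein B", "protein C", "activates")
    pathway.annotate(species="Homo sapiens")
    print(sorted(pathway.entities))
    print(pathway.missing_annotations())
```

Keeping annotations separate from entities and interactions mirrors the order of the steps in the text: a prototype can exist before its context has been filled in, and a curator can see what is still missing.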

Pathway-Related Databases and Tools


KEGG
The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource (http://www.genome.jp/kegg/) provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES), the chemical space (KEGG LIGAND), wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY), and ontologies for pathway reconstruction (BRITE database) [35]. The KEGG PATHWAY database is a collection of manually drawn pathway maps for metabolism, genetic information processing, environmental information processing such as signal transduction, ligand-receptor interaction and cell communication, various other cellular processes and human diseases, all based on an extensive survey of published literature [36].

The overall architecture of the KEGG database is made up of four main components.

GenMAPP
Gene Map Annotator and Pathway Profiler (GenMAPP) (http://www.genmapp.org/), a free, open-source, stand-alone computer program, is designed for organizing, analyzing, and sharing genome-scale data in the context of biological pathways. The GenMAPP database supports multiple gene annotations and species, as well as custom species database creation for a potentially unlimited number of species [37]. Pathway resources are expanded by utilizing homology information to translate pathway content between species and by extending existing pathways with data derived from conserved protein interactions and coexpression. A new mode of data visualization, including time-course, single nucleotide polymorphism (SNP), and splicing data, has been implemented with the GenMAPP database to support analysis of complex data. GenMAPP also offers innovative ways to display and share data by incorporating HTML export of analyses for entire sets of pathways as organized web pages (http://www.genmapp.org/tutorials/Converting-MAPPs-between-species.pdf). In short, GenMAPP provides a means to rapidly interrogate complex experimental data for pathway-level changes in a diverse range of organisms.


Reactome
Given the genetic makeup of an organism, the complete set of possible reactions constitutes its reactome. Reactome, located at http://www.reactome.org, is a curated, peer-reviewed resource of human biological process and pathway data. The basic unit of the Reactome database is a reaction; reactions are then grouped into causal chains to form pathways [38]. The Reactome data model allows the representation of many diverse processes in the human system, including the pathways of intermediary metabolism, regulatory pathways, and signal transduction, and high-level processes, such as the cell cycle [39]. Reactome provides a qualitative framework on which quantitative data can be superimposed. Tools have been developed to facilitate custom data entry and annotation by expert biologists, and to allow visualization and exploration of the finished dataset as an interactive process map [40]. Although the primary curational domain is pathways from Homo sapiens, electronic projections of human pathways onto other organisms are regularly created via putative orthologs, thus making Reactome relevant to model organism research communities. The database is publicly available under open source terms, which allows both its content and its software infrastructure to be freely used and redistributed.

Studying whole transcriptional profiles and cataloging protein-protein interactions has yielded much valuable biological information, from the genome or proteome to the physiology of an organism, an organ, a tissue or even a single cell. The Reactome database contains a framework of possible reactions which, when combined with expression and enzyme kinetic data, provides the infrastructure for quantitative models and therefore an integrated view of biological processes, which links such gene products and can be systematically mined using bioinformatics applications [41]. Reactome data are available in a variety of standard formats, including BioPAX, SBML and PSI-MI, which also enables data exchange with other pathway databases, such as the Cycs, KEGG and aMAZE, and with molecular interaction databases, such as BIND and HPRD. The next data release will cover apoptosis, including the death receptor signaling pathways and the Bcl2 pathways, as well as pathways involved in hemostasis. Other topics currently under development include several signaling pathways, mitosis, visual phototransduction and hematopoiesis [42]. In summary, Reactome provides high-quality curated summaries of fundamental biological processes in humans in the form of biologist-friendly visualizations of pathway data, and is an open-source project.
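The statement that reactions are the basic unit and are grouped into causal chains can be illustrated with a toy ordering routine. This sketch is not Reactome's data model or API. It assumes each reaction is described only by its sets of input and output molecules, and it orders reactions so that each one follows the reactions producing its inputs. All names are hypothetical.

```python
def order_reactions(reactions):
    """Order reactions so each comes after the reactions producing its inputs.

    `reactions` maps a reaction name to a pair (inputs, outputs) of sets.
    Raises ValueError if the reactions depend on each other in a cycle.
    """
    # Which reaction produces each molecule (last producer wins in this toy model).
    produced_by = {}
    for name, (_, outputs) in reactions.items():
        for molecule in outputs:
            produced_by[molecule] = name

    ordered, in_progress = [], set()

    def visit(name):
        if name in ordered:
            return
        if name in in_progress:
            raise ValueError(f"cycle involving {name}")
        in_progress.add(name)
        inputs, _ = reactions[name]
        for molecule in inputs:
            upstream = produced_by.get(molecule)
            if upstream is not None and upstream != name:
                visit(upstream)
        in_progress.discard(name)
        ordered.append(name)

    for name in reactions:
        visit(name)
    return ordered


if __name__ == "__main__":
    toy = {
        "step3": ({"C"}, {"D"}),
        "step1": ({"A"}, {"B"}),
        "step2": ({"B"}, {"C"}),
    }
    print(order_reactions(toy))
```

The explicit cycle check matters because, as noted earlier for negative cascades, not every set of biological events forms a linear chain.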

Pathway-Oriented Approaches
In the post-genomic age, high-throughput sequencing and gene/protein profiling techniques have transformed biological research by enabling comprehensive monitoring of a biological system, yielding a list of differentially expressed genes or proteins, which is useful in identifying genes that may have roles in a given phenomenon or phenotype [43]. With DNA microarrays and genome-wide gene engineering, it is possible to screen global gene expression profiles and so contribute a wealth of genomic data to the public domain. With RNA interference, it is possible to distill the inferences contained in the experimental literature and primary databases into knowledge bases that consist of annotated representations of biological pathways. In this case, individual genes and proteins are known to be involved in biological processes, components, or structures, and it is known how and where gene products interact with each other [44] [45].

Pathway-oriented approaches to analyzing microarray data group long lists of individual genes, proteins, and/or other biological molecules into smaller sets of related genes or proteins according to the pathways they are involved in. This reduces complexity and has proven useful for connecting genomic data to specific biological processes and systems. Identifying active pathways that differ between two conditions can have more explanatory power than a simple list of different genes or proteins. In addition, a large number of pathway analytic methods exploit pathway knowledge in public repositories such as Gene Ontology (GO) or Kyoto Encyclopedia of Genes and Genomes (KEGG), rather than inferring pathways from molecular measurements [46] [47].

Furthermore, different research focuses have given the word "pathway" different meanings. For example, "pathway" can denote a metabolic pathway involving a sequence of enzyme-catalyzed reactions of small molecules, or a signaling pathway involving a set of protein phosphorylation reactions and gene regulation events. Therefore, the term "pathway analysis" has a very broad application. For instance, it can refer to the analysis of physical interaction networks (e.g., protein-protein interactions), kinetic simulation of pathways, and steady-state pathway analysis (e.g., flux-balance analysis), as well as to the inference of pathways from expression and sequence data. Several functional enrichment analysis tools [48] [49] [50] [51] and algorithms [52] have been developed to enhance data interpretation. The existing knowledge base-driven pathway analysis methods in each generation have been summarized in recent literature [53].
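One common form of functional enrichment analysis is over-representation analysis, which asks whether a list of differentially expressed genes contains more members of a pathway than chance would predict. The text does not say which statistic the cited tools use, so the hypergeometric test below is an assumed, generic example rather than a description of any of them. The gene counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, pathway_size, selected, overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    Probability of seeing at least `overlap` pathway genes when `selected`
    genes are drawn without replacement from `population` genes, of which
    `pathway_size` belong to the pathway.
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical study: 20,000 genes measured, 100 in the pathway,
    # 500 differentially expressed, 12 of which are pathway members.
    p = enrichment_p_value(20000, 100, 500, 12)
    print(f"p-value: {p:.3g}")
```

In practice such p-values are computed for many pathways at once and then corrected for multiple testing, which this sketch leaves out.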

Applications of Pathway Analysis in Disease


Colorectal cancer (CRC)
The program package MatchMiner was used to look up HUGO names for cloned genes of interest, which were then input into GoMiner (online at http://genomebiology.com/2003/4/4/R28), which leveraged the GO to identify the biological processes, functions and components represented in the gene profile. The Database for Annotation, Visualization, and Integrated Discovery (DAVID) (http://genomebiology.com/2003/4/9/R60) and the KEGG database (http://www.genome.ad.jp/kegg/) can also be used for the analysis of microarray expression data and for the analysis of each GO biological process (P), cellular component (C), and molecular function (F) ontology. In addition, DAVID tools can be used to analyze the roles of genes in metabolic pathways and to show the biological relationships between genes or gene products, which may represent metabolic pathways. These two databases also provide online bioinformatics tools to combine specific biochemical information on a certain organism and to facilitate the interpretation of the biological meaning of experimental data. By using a combined approach of microarray and bioinformatic technologies, a potential metabolic mechanism contributing to colorectal cancer (CRC) has been demonstrated [54]. Several environmental factors may be involved at a series of points along the genetic pathway to CRC. These include genes associated with bile acid metabolism, glycolysis metabolism and fatty acid metabolism pathways, supporting the hypothesis that some metabolic alterations observed in colon carcinoma may occur in the development of CRC [54].

Parkinson's disease (PD)

Cellular models are instrumental in dissecting a complex pathological process into simpler molecular events. Parkinson's disease (PD) is multifactorial and clinically heterogeneous; the aetiology of the sporadic (and most common) form is still unclear, and only a few molecular mechanisms have been clarified so far in the neurodegenerative cascade. In such a multifaceted picture, it is particularly important to identify experimental models that simplify the study of the different networks of proteins and genes involved. Cellular models that reproduce some of the features of the neurons that degenerate in PD have contributed to many advances in our comprehension of the pathogenic flow of the disease. In particular, the pivotal biochemical pathways (i.e. apoptosis and oxidative stress, mitochondrial impairment and dysfunctional mitophagy, unfolded protein stress and improper removal of misfolded proteins) have been widely explored in cell lines, challenged with toxic insults or genetically modified. The central role of α-synuclein has generated many models aiming to elucidate its contribution to the dysregulation of various cellular processes. Classical cellular models appear to be the correct choice for preliminary studies on the molecular action of new drugs or potential toxins and for understanding the role of single genetic factors. Moreover, the availability of novel cellular systems, such as cybrids or induced pluripotent stem cells, offers the chance to exploit the advantages of an in vitro investigation while mirroring more closely the cell population being affected.[55]


Alzheimer's disease (AD)

Synaptic degeneration and death of nerve cells are defining features of Alzheimer's disease (AD), the most prevalent age-related neurodegenerative disorder. In AD, neurons in the hippocampus and basal forebrain (brain regions that subserve learning and memory functions) are selectively vulnerable. Studies of postmortem brain tissue from people with AD have provided evidence for increased levels of oxidative stress, mitochondrial dysfunction and impaired glucose uptake in vulnerable neuronal populations. Studies of animal and cell culture models of AD suggest that increased levels of oxidative stress (membrane lipid peroxidation, in particular) may disrupt neuronal energy metabolism and ion homeostasis by impairing the function of membrane ion-motive ATPases and of glucose and glutamate transporters. Such oxidative and metabolic compromise may thereby render neurons vulnerable to excitotoxicity and apoptosis. Recent studies suggest that AD can manifest systemic alterations in energy metabolism (e.g., increased insulin resistance and dysregulation of glucose metabolism). Emerging evidence that dietary restriction can forestall the development of AD is consistent with a major metabolic component to these disorders, and provides optimism that these devastating brain disorders of aging may be largely preventable.[56]

References
[1] Nic, M.; Jirat, J.; Kosata, B., eds. (2006) Chemical Reaction. IUPAC Compendium of Chemical Terminology (Online ed.).
[2] March, J. (1985) Advanced Organic Chemistry: Reactions, Mechanisms, and Structure (3rd ed.), New York: Wiley.
[3] Mishra, B. (2002) A symbolic approach to modelling cellular behaviour. In Prasanna, V., Sahni, S. and Shukla, U. (eds), High Performance Computing – HiPC 2002. LNCS 2552. Springer-Verlag, pp. 725–732.
[4] Ingham, P.W., Nakano, Y., Seger, C. (2011) Mechanisms and functions of Hedgehog signalling across the metazoa. Nature Reviews Genetics, 12 (6), 393–406.
[5] Antoniotti, M., Park, F., Policriti, A., Ugel, N., Mishra, B. (2003) Foundations of a query and simulation system for the modeling of biochemical and biological processes. In Pacific Symposium on Biocomputing 2003 (PSB 2003), pp. 116–127.
[6] de Jong, H. (2002) Modeling and simulation of genetic regulatory systems: a literature review. J. Comput. Biol., 9(1), 67–103.
[7] Hinkle JL, Bowman L (2003) Neuroprotection for ischemic stroke. J Neurosci Nurs 35 (2): 114–8.
[8] Viswanathan G. A., Seto J., Patil S., Nudelman G., Sealfon S. C. (2008) Getting Started in Biological Pathway Construction and Analysis. PLoS Comput Biol 4(2): e16.
[9] Stromback L., Jakoniene V., Tan H., Lambrix P. (2006) Representing, storing and accessing. The MIT Press.
[10] Brazma A., Krestyaninova M., Sarkans U. (2006) Standards for systems biology. Nat Rev Genet 7: 593–605.
[11] Baclawski K., Niu T. (2006) Ontologies for bioinformatics. Cambridge (Massachusetts): Boca Raton (Florida): Chapman & Hall/CRC.
[12] Kashtan N., Itzkovitz S., Milo R., Alon U. (2004) Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20: 1746–1758.
[13] http://www.bind.ca/
[14] http://mint.bio.uniroma2.it/mint/
[15] http://www.hprd.org/
[16] http://mips.gsf.de/genre/proj/mpact/
[17] http://dip.doe-mci.ucla.edu/
[18] http://www.ebi.ac.uk/intact
[19] http://icb.med.cornell.edu/services/pdz/start/
[20] http://genomenetwork/nig.ac.jp/
[21] http://thebiogrid.org/
[22] http://theoderich.fb3.mdc-berlin.de:8080/unihi/
[23] http://ophid.utoronto.ca/ophid/
[24] http://ecocyc.org
[25] http://metacyc.org
[26] http://humancyc.org
[27] http://biocyc.org
[28] http://www.genome.ad.jp/kegg/%20
[29] http://panther.appliedbiosystems.com/
[30] http://www.reactome.org/
[31] http://www.ebi.ac.uk/biomodels/
[32] http://stke.sciencemag.org/cm/
[33] http://ingenuity.com/
[34] http://pid.nic.nih.gov/PID/

[35] Kanehisa, M., Goto, S., Hattori, M., Aoki-Kinoshita, K.F., Itoh, M., Kawashima, S. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 34, D354–D357.
[36] Minoru K., Susumu G., Miho F., Mao T., Mika H. (2010) KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucl. Acids Res. 38(1): D355-D360.
[37] Dahlquist K. D., Salomonis N., Vranizan K., Lawlor S. C., Conklin B. R. (2002) GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nat. Genet. 31(1): 19-20.
[38] Vastrik I., D'Eustachio P., Schmidt E., Joshi-Tope G., Gopinath G., Croft D., de Bono B., Gillespie M., Jassal B., Lewis S., Matthews L., Wu G., Birney E., Stein L. (2007) Reactome: a knowledgebase of biological pathways and processes. Genome Biol. 8: R39.
[39] Joshi-Tope G., Gillespie M., Vastrik I., D'Eustachio P., Schmidt E., de Bono B., Jassal B., Gopinath G. R., Wu G. R., Matthews L., Lewis S., Birney E., Stein L. (2005) Reactome: a knowledgebase of biological pathways. Nucleic Acids Res. 33: D428-32.
[40] Matthews, L., Gopinath, G., Gillespie, M., Caudy, M. (2009) Reactome knowledge base of human biological pathways and processes. Nucleic Acids Res. 37, D619–D622.
[41] Croft, D., O'Kelly, G., Wu, G., Haw, R. (2011) Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 39, D691–D697.
[42] Haw, R., Hermjakob, H., D'Eustachio, P. and Stein, L. (2011) Reactome pathway analysis to enrich biological discovery in proteomics data sets. Proteomics, 11: 3598–3613.
[43] Priami, C. (ed.) (2003) Computational Methods in Systems Biology. LNCS 2602. Springer Verlag.
[44] Karp, P. D., Riley, M., Saier, M., Paulsen, I. T., Paley, S. M., Pellegrini-Toole, A. (2000) The EcoCyc and MetaCyc databases. Nucleic Acids Res., 28, 56–59.
[45] Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., Kanehisa, M. (1999) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res., 27(1), 29–34.
[46] Ashburner, M. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 25, 25–29.
[47] Kanehisa, M. (2002) The KEGG databases at GenomeNet. Nucleic Acids Res., 30, 42–46.
[48] Boyle, E. I. (2004) GO::TermFinder – open source software for accessing Gene Ontology information and finding significantly enriched gene ontology terms associated with a list of genes. Bioinformatics, 20, 3710–3715.
[49] Huang, D. W. (2007) The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol., 8, R183.
[50] Maere, S. (2005) BiNGO: a Cytoscape plugin to assess overrepresentation of Gene Ontology categories in biological networks. Bioinformatics, 21, 3448–3449.
[51] Ramos, H. (2008) The protein information and property explorer: an easy-to-use, rich-client web application for the management and functional analysis of proteomic data. Bioinformatics, 24, 2110–2111.
[52] Li, Y. (2008) A global pathway crosstalk network. Bioinformatics, 24, 1442–1447.
[53] Khatri P., Sirota M., Butte A. J. (2012) Ten Years of Pathway Analysis: Current Approaches and Outstanding Challenges. PLoS Comput. Biol. 8(2): e1002375.
[54] Yeh C. S., Wang J. Y., Cheng T. L., Juan C. H., Wu C. H., Lin S. R. (2006) Fatty acid metabolism pathway play an important role in carcinogenesis of human colorectal cancers by Microarray-Bioinformatics analysis. Cancer Letters 233 (2): 297-308.
[55] Alberio, T., Lopiano, L. and Fasano, M. (2012) Cellular models to investigate biochemical pathways in Parkinson's disease. FEBS Journal, 279: 1146–1155.
[56] Mattson, M. P., Pedersen, W. A., Duan, W., Culmsee, C., Camandola, S. (1999) Cellular and Molecular Mechanisms Underlying Perturbed Energy Metabolism and Neuronal Degeneration in Alzheimer's and Parkinson's Diseases. Annals of the New York Academy of Sciences, 893: 154–175.


External links
KEGG resource (http://www.genome.jp/kegg/)
DAVID tools (http://genomebiology.com/2003/4/9/R60)
GenMAPP (http://www.genmapp.org/)
GoMiner (http://genomebiology.com/2003/4/4/R28)
Pathguide (http://www.pathguide.org)
(http://www.genmapp.org/tutorials/Converting-MAPPs-between-species.pdf)

Biocatalysis

Biocatalysis is the use of natural catalysts, such as protein enzymes, to perform chemical transformations on organic compounds. Both enzymes that have been more or less isolated and enzymes still residing inside living cells are employed for this task.[1][2][3]

History
Biocatalysis underpins some of the oldest chemical transformations known to humans, for brewing predates recorded history. The oldest records of brewing are about 6000 years old and refer to the Sumerians. The use of enzymes and whole cells has been important to many industries for centuries. The most obvious uses have been in the food and drink businesses, where the production of wine, beer, cheese etc. depends on the action of microorganisms. More than one hundred years ago, biocatalysis was employed to carry out chemical transformations on non-natural, man-made organic compounds, and the last 30 years have seen a substantial increase in the application of biocatalysis to produce fine chemicals, especially for the pharmaceutical industry.[4]

Since biocatalysis deals with enzymes and microorganisms, it is historically classified separately from "homogeneous catalysis" and "heterogeneous catalysis". However, mechanistically speaking, biocatalysis is simply a special case of heterogeneous catalysis.[5]

Advantages
The key word for organic synthesis is selectivity, which is necessary to obtain a high yield of a specific product. There is a large range of selective organic reactions available for most synthetic needs. However, there is still one area where organic chemists are struggling, and that is when chirality is involved, although considerable progress in chiral synthesis has been achieved in recent years.

Enzymes display three major types of selectivity:
Chemoselectivity: Since the purpose of an enzyme is to act on a single type of functional group, other sensitive functionalities, which would normally react to a certain extent under chemical catalysis, survive. As a result, biocatalytic reactions tend to be "cleaner", and laborious purification of product(s) from impurities emerging through side-reactions can largely be omitted.
Regioselectivity and diastereoselectivity: Due to their complex three-dimensional structure, enzymes may distinguish between functional groups which are chemically situated in different regions of the substrate molecule.
Enantioselectivity: Since almost all enzymes are made from L-amino acids, enzymes are chiral catalysts. As a consequence, any type of chirality present in the substrate molecule is "recognized" upon the formation of the enzyme-substrate complex. Thus a prochiral substrate may be transformed into an optically active product, and both enantiomers of a racemic substrate may react at different rates.

These selectivities, and especially the last, are the major reasons why synthetic chemists have become interested in biocatalysis. This interest in turn is mainly due to the need to synthesise enantiopure compounds as chiral building blocks for drugs and agrochemicals. Another important advantage of biocatalysts is that they are environmentally acceptable, being completely degraded in the environment. Furthermore, enzymes act under mild conditions, which minimizes problems of undesired side-reactions such as decomposition, isomerization, racemization and rearrangement, which often plague traditional methodology.


Asymmetric biocatalysis
The use of biocatalysis to obtain enantiopure compounds can be divided into two different methods:
1. Kinetic resolution of a racemic mixture
2. Biocatalysed asymmetric synthesis

In kinetic resolution of a racemic mixture, the presence of a chiral object (the enzyme) converts one of the enantiomers into product at a greater reaction rate than the other enantiomer.

The racemic mixture has now been transformed into a mixture of two different compounds, making them separable by normal methodology. The maximum yield in such kinetic resolutions is 50%, since a yield of more than 50% means that some of the wrong isomer has also reacted, giving a lower enantiomeric excess. Such reactions must therefore be terminated before equilibrium is reached. If it is possible to perform such resolutions under conditions where the two substrate enantiomers are racemizing continuously, all substrate may in theory be converted into enantiopure product. This is called dynamic resolution.

In biocatalysed asymmetric synthesis, a non-chiral unit becomes chiral in such a way that the different possible stereoisomers are formed in different quantities. The chirality is introduced into the substrate by the influence of the enzyme, which is chiral. Yeast is a biocatalyst for the enantioselective reduction of ketones.
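The 50% yield ceiling of a kinetic resolution, and the loss of enantiomeric excess once the reaction runs past it, can be illustrated numerically. The following is a minimal sketch, not a model of any particular enzyme: it assumes a racemic starting mixture and independent pseudo-first-order consumption of each enantiomer, and the function name and rate constants are hypothetical.

```python
import math


def kinetic_resolution(k_fast: float, k_slow: float, t: float):
    """Return (conversion, ee_product, ee_substrate) at time t.

    Assumption: racemic start (0.5 of each enantiomer) and independent
    pseudo-first-order decay of each enantiomer.
    """
    fast_left = 0.5 * math.exp(-k_fast * t)  # remaining fast-reacting enantiomer
    slow_left = 0.5 * math.exp(-k_slow * t)  # remaining slow-reacting enantiomer
    p_fast = 0.5 - fast_left                 # product from the fast enantiomer
    p_slow = 0.5 - slow_left                 # product from the "wrong" enantiomer
    conversion = p_fast + p_slow
    substrate_left = fast_left + slow_left
    ee_product = (p_fast - p_slow) / conversion if conversion > 0 else 0.0
    ee_substrate = (slow_left - fast_left) / substrate_left if substrate_left > 0 else 0.0
    return conversion, ee_product, ee_substrate


if __name__ == "__main__":
    # Hypothetical rate constants with a 100-fold preference for one enantiomer.
    for t in (0.01, 0.05, 0.5, 5.0):
        c, ee_p, ee_s = kinetic_resolution(k_fast=100.0, k_slow=1.0, t=t)
        print(f"t={t:<5} conversion={c:.3f} ee_product={ee_p:.3f} ee_substrate={ee_s:.3f}")
```

Under these assumptions the product stays highly enriched while conversion is near one half, and its enantiomeric excess falls towards zero as conversion approaches completion, which is why the reaction has to be stopped early.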

The biocatalytic Baeyer-Villiger oxidation is another example of a biocatalytic reaction. In one study, a specially designed mutant of a Candida antarctica lipase was found to be an effective catalyst for the Michael addition of acrolein with acetylacetone at 20 °C in the absence of additional solvent.[6] Another study demonstrates how racemic nicotine (a mixture of S- and R-enantiomers, 1 in scheme 3) can be deracemized in a one-pot procedure involving a monoamine oxidase isolated from Aspergillus niger, which oxidizes only the S-enantiomer of the amine to the imine 2, and an ammonia–borane reducing couple, which reduces the imine 2 back to the amine 1.[7] In this way the S-enantiomer is continuously consumed by the enzyme while the R-enantiomer accumulates. It is even possible to stereoinvert pure S to pure R.
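The accumulation of the R-enantiomer in such a deracemization can be sketched as a simple bookkeeping loop. This is an idealised illustration only: it assumes that each cycle oxidizes all remaining S-amine to the imine and that the reduction returns the imine as an equal mixture of R and S, whereas the real one-pot process is continuous. The function name is hypothetical.

```python
def deracemize(cycles: int, s_fraction: float = 0.5):
    """Track (R, S) fractions over idealised oxidation/reduction cycles.

    Assumption: in every cycle the enzyme oxidizes all S-amine to the
    achiral imine, and the reductant converts the imine back to a 50:50
    mixture of R- and S-amine.
    """
    r_fraction = 1.0 - s_fraction
    history = [(r_fraction, s_fraction)]
    for _ in range(cycles):
        imine = s_fraction         # S-enantiomer selectively oxidized
        s_fraction = imine / 2.0   # half of the imine returns as S
        r_fraction += imine / 2.0  # the other half joins the R pool
        history.append((r_fraction, s_fraction))
    return history


if __name__ == "__main__":
    for cycle, (r, s) in enumerate(deracemize(cycles=5)):
        print(f"cycle {cycle}: R={r:.4f} S={s:.4f}")
```

Calling `deracemize(10, s_fraction=1.0)` shows the same bookkeeping applied to the stereoinversion case, where the starting material is pure S.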


References
[1] Anthonsen, T. Reactions Catalyzed by Enzymes. In Applied Biocatalysis, 2nd ed.; Adlercreutz, P.; Straathof, A. J. J., Eds.; Harwood Academic Publishers: UK, 1999; pp 18-53.
[2] Faber, K. Biotransformations in Organic Chemistry, 4th ed., Springer-Verlag, Berlin 2000.
[3] Jayasinghe L. Y., Smallridge A. J., and Trewhella M. A. (1993). "The yeast mediated reduction of ethyl acetoacetate in petroleum ether". Tetrahedron Letters 34 (24): 3949. doi:10.1016/S0040-4039(00)79272-0.
[4] Andreas Liese, Karsten Seelbach, Christian Wandrey, Industrial Biotransformations, 2nd edition, Wiley-VCH, Weinheim, 2006, 556 pp., ISBN 3-527-31001-0.
[5] Gadi Rothenberg, Catalysis: Concepts and green applications, Wiley-VCH: Weinheim, Feb. 2008, ISBN 978-3-527-31824-7.
[6] Maria Svedendahl, Karl Hult, and Per Berglund (2005). "Fast Carbon-Carbon Bond Formation by a Promiscuous Lipase". J. Am. Chem. Soc. 127 (51): 17988–17989. doi:10.1021/ja056660r. PMID 16366534.
[7] Colin J. Dunsmore, Reuben Carr, Toni Fleming, and Nicholas J. Turner (2006). "A Chemo-Enzymatic Route to Enantiomerically Pure Cyclic Tertiary Amines". J. Am. Chem. Soc. 128 (7): 2224–2225. doi:10.1021/ja058536d. PMID 16478171.

External links
The Centre of Excellence for Biocatalysis - CoEBio3 (http://www.coebio3.org)
The University of Exeter - Biocatalysis Centre (http://centres.exeter.ac.uk/biocatalysis/)
Applied Biocatalysis centre - Graz (http://www.a-b.tugraz.at/index_en.htm)
Center for Biocatalysis and Bioprocessing - The University of Iowa (http://www.uiowa.edu/~biocat/)
TU Delft - Biocatalysis & Organic Chemistry (BOC) (http://www.bt.tudelft.nl/boc)
KTH Stockholm - Biocatalysis Research Group (http://www.biotech.kth.se/biochem/biocatalysis/)
Institute of Technical Biocatalysis at the Hamburg University of Technology (TUHH) (http://www.technical-biocatalysis.com)
MIT Short Course - Principles and Applications of Biocatalysis (http://web.mit.edu/professional/short-programs/courses/principles_applications_biocatalysis.html)
Web Resource for biocatalysis and enzymes (http://www.bio-catalyst.com)

Enzyme

Enzymes (/ˈɛnzaɪmz/) are large biological molecules responsible for the thousands of chemical interconversions that sustain life.[1][2] They are highly selective catalysts, greatly increasing both the rate and the specificity of metabolic reactions, from the digestion of food to the synthesis of DNA. Most enzymes are proteins, although some catalytic RNA molecules have been identified. Enzymes adopt a specific three-dimensional structure, and may employ organic (e.g. biotin) and inorganic (e.g. magnesium ion) cofactors to assist in catalysis.

Human glyoxalase I. Two zinc ions that are needed for the enzyme to catalyze its reaction are shown as purple spheres, and an enzyme inhibitor called S-hexylglutathione is shown as a space-filling model, filling the two active sites.

In enzymatic reactions, the molecules at the beginning of the process, called substrates, are converted into different molecules, called products. Almost all chemical reactions in a biological cell need enzymes in order to occur at rates sufficient for life. Since enzymes are selective for their substrates and speed up only a few reactions from among many possibilities, the set of enzymes made in a cell determines which metabolic pathways occur in that cell.

Like all catalysts, enzymes work by lowering the activation energy (Ea) for a reaction, thus dramatically increasing the rate of the reaction. As a result, products are formed faster and reactions reach their equilibrium state more rapidly. Most enzyme reaction rates are millions of times faster than those of comparable uncatalyzed reactions. As with all catalysts, enzymes are not consumed by the reactions they catalyze, nor do they alter the equilibrium of these reactions. However, enzymes do differ from most other catalysts in that they are highly specific for their substrates. Enzymes are known to catalyze about 4,000 biochemical reactions.[3] A few RNA molecules called ribozymes also catalyze reactions, an important example being some parts of the ribosome.[4][5] Synthetic molecules called artificial enzymes also display enzyme-like catalysis.[6]

Enzyme activity can be affected by other molecules. Inhibitors are molecules that decrease enzyme activity; activators are molecules that increase activity. Many drugs and poisons are enzyme inhibitors. Activity is also affected by temperature, pressure, chemical environment (e.g., pH), and the concentration of substrate. Some enzymes are used commercially, for example, in the synthesis of antibiotics. In addition, some household products use enzymes to speed up biochemical reactions (e.g., enzymes in biological washing powders break down protein or fat stains on clothes; enzymes in meat tenderizers break down proteins into smaller molecules, making the meat easier to chew).
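The link between a lower activation energy and a faster reaction can be made concrete with the Arrhenius relation. The sketch below is an illustration under stated assumptions rather than a description of any specific enzyme: it assumes the pre-exponential factor is unchanged by the catalyst, and the function name and the example value are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def rate_enhancement(delta_ea_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Factor by which the rate constant grows when Ea drops by delta_ea.

    Assumption: Arrhenius behaviour, k = A * exp(-Ea / (R*T)), with the
    same pre-exponential factor A for the catalyzed and uncatalyzed paths.
    """
    return math.exp(delta_ea_kj_per_mol * 1000.0 / (R * temperature_k))


if __name__ == "__main__":
    # Hypothetical lowering of the barrier by 34 kJ/mol at 25 degrees C.
    print(f"{rate_enhancement(34.0):.2e}")
```

Under these assumptions, lowering the barrier by roughly 34 kJ/mol at 25 °C already corresponds to a rate increase on the order of a million-fold, because the dependence on Ea is exponential.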


Etymology and history


As early as the late 17th and early 18th centuries, the digestion of meat by stomach secretions[7] and the conversion of starch to sugars by plant extracts and saliva were known. However, the mechanism by which this occurred had not been identified.[8]

In the 19th century, when studying the fermentation of sugar to alcohol by yeast, Louis Pasteur came to the conclusion that this fermentation was catalyzed by a vital force contained within the yeast cells called "ferments", which were thought to function only within living organisms. He wrote that "alcoholic fermentation is an act correlated with the life and organization of the yeast cells, not with the death or putrefaction of the cells."[9]

In 1877, German physiologist Wilhelm Kühne (1837–1900) first used the term enzyme, which comes from Greek ἔνζυμον, "in leaven", to describe this process.[10] The word enzyme was used later to refer to nonliving substances such as pepsin, and the word ferment was used to refer to chemical activity produced by living organisms.

In 1897, Eduard Buchner submitted his first paper on the ability of yeast extracts that lacked any living yeast cells to ferment sugar. In a series of experiments at the University of Berlin, he found that the sugar was fermented even when there were no living yeast cells in the mixture.[11] He named the enzyme that brought about the fermentation of sucrose "zymase".[12] In 1907, he received the Nobel Prize in Chemistry "for his biochemical research and his discovery of cell-free fermentation". Following Buchner's example, enzymes are usually named according to the reaction they carry out. Typically, to generate the name of an enzyme, the suffix -ase is added to the name of its substrate (e.g., lactase is the enzyme that cleaves lactose) or the type of reaction (e.g., DNA polymerase forms DNA polymers).[13]

Once it had been shown that enzymes could function outside a living cell, the next step was to determine their biochemical nature. Many early workers noted that enzymatic activity was associated with proteins, but several scientists (such as Nobel laureate Richard Willstätter) argued that proteins were merely carriers for the true enzymes and that proteins per se were incapable of catalysis.[14] However, in 1926, James B. Sumner crystallized the enzyme urease and showed that it was a pure protein; Sumner did likewise for the enzyme catalase in 1937. The conclusion that pure proteins can be enzymes was definitively proved by Northrop and Stanley, who worked on the digestive enzymes pepsin (1930), trypsin and chymotrypsin. These three scientists were awarded the 1946 Nobel Prize in Chemistry.[15]

The discovery that enzymes could be crystallized eventually allowed their structures to be solved by X-ray crystallography. This was first done for lysozyme, an enzyme found in tears, saliva and egg whites that digests the coating of some bacteria; the structure was solved by a group led by David Chilton Phillips and published in 1965.[16] This high-resolution structure of lysozyme marked the beginning of the field of structural biology and the effort to understand how enzymes work at an atomic level of detail.


Structures and mechanisms


Enzymes are in general globular proteins and range from just 62 amino acid residues in size, for the monomer of 4-oxalocrotonate tautomerase,[18] to over 2,500 residues in the animal fatty acid synthase.[19] A small number of RNA-based biological catalysts exist, with the most common being the ribosome; these are referred to as either RNA-enzymes or ribozymes. The activities of enzymes are determined by their three-dimensional structure.[20] However, although structure does determine function, predicting a novel enzyme's activity just from its structure is a very difficult problem that has not yet been solved.[21]

Ribbon diagram showing human carbonic anhydrase II. The grey sphere is the zinc cofactor in the active site. Diagram drawn from PDB 1MOO.[17]

Most enzymes are much larger than the substrates they act on, and only a small portion of the enzyme (around 2–4 amino acids) is directly involved in catalysis.[22] The region that contains these catalytic residues, binds the substrate, and then carries out the reaction is known as the active site. Enzymes can also contain sites that bind cofactors, which are needed for catalysis. Some enzymes also have binding sites for small molecules, which are often direct or indirect products or substrates of the reaction catalyzed. This binding can serve to increase or decrease the enzyme's activity, providing a means for feedback regulation.

Like all proteins, enzymes are long, linear chains of amino acids that fold to produce a three-dimensional product. Each unique amino acid sequence produces a specific structure, which has unique properties. Individual protein chains may sometimes group together to form a protein complex. Most enzymes can be denatured (that is, unfolded and inactivated) by heating or chemical denaturants, which disrupt the three-dimensional structure of the protein. Depending on the enzyme, denaturation may be reversible or irreversible. Structures of enzymes with substrates or substrate analogs during a reaction may be obtained using time-resolved crystallography methods.

Specificity
Enzymes are usually very specific as to which reactions they catalyze and the substrates that are involved in these reactions. Complementary shape, charge and hydrophilic/hydrophobic characteristics of enzymes and substrates are responsible for this specificity. Enzymes can also show impressive levels of stereospecificity, regioselectivity and chemoselectivity.[23]

Some of the enzymes showing the highest specificity and accuracy are involved in the copying and expression of the genome. These enzymes have "proof-reading" mechanisms. Here, an enzyme such as DNA polymerase catalyzes a reaction in a first step and then checks that the product is correct in a second step.[24] This two-step process results in average error rates of less than 1 error in 100 million reactions in high-fidelity mammalian polymerases.[25] Similar proofreading mechanisms are also found in RNA polymerase,[26] aminoacyl tRNA synthetases[27] and ribosomes.[28]

Some enzymes that produce secondary metabolites are described as promiscuous, as they can act on a relatively broad range of different substrates. It has been suggested that this broad substrate specificity is important for the evolution of new biosynthetic pathways.[29]

"Lock and key" model
Enzymes are very specific, and it was suggested by the Nobel laureate organic chemist Emil Fischer in 1894 that this was because both the enzyme and the substrate possess specific complementary geometric shapes that fit exactly into one another.[30] This is often referred to as "the lock and key" model. However, while this model explains enzyme specificity, it fails to explain the stabilization of the transition state that enzymes achieve.

Diagrams to show the induced fit hypothesis of enzyme action

In 1958, Daniel Koshland suggested a modification to the lock and key model: since enzymes are rather flexible structures, the active site is continuously reshaped by interactions with the substrate as the substrate interacts with the enzyme.[31] As a result, the substrate does not simply bind to a rigid active site; the amino acid side-chains that make up the active site are molded into the precise positions that enable the enzyme to perform its catalytic function. In some cases, such as glycosidases, the substrate molecule also changes shape slightly as it enters the active site.[32] The active site continues to change until the substrate is completely bound, at which point the final shape and charge are determined.[33] Induced fit may enhance the fidelity of molecular recognition in the presence of competition and noise via the conformational proofreading mechanism.[34]


Mechanisms
Enzymes can act in several ways, all of which lower ΔG‡ (the Gibbs energy of activation):[35]
Lowering the activation energy by creating an environment in which the transition state is stabilized (e.g. straining the shape of a substrate: by binding the transition-state conformation of the substrate/product molecules, the enzyme distorts the bound substrate(s) into their transition state form, thereby reducing the amount of energy required to complete the transition).
Lowering the energy of the transition state, but without distorting the substrate, by creating an environment with the opposite charge distribution to that of the transition state.
Providing an alternative pathway. For example, temporarily reacting with the substrate to form an intermediate ES complex, which would be impossible in the absence of the enzyme.
Reducing the reaction entropy change by bringing substrates together in the correct orientation to react. Considering ΔH‡ alone overlooks this effect.
Increases in temperature speed up reactions. Thus, temperature increases help the enzyme function and form the end product even faster. However, if heated too much, the enzyme's shape deteriorates and the enzyme becomes denatured. Some enzymes, such as thermolabile enzymes, work best at low temperatures.

It is interesting that the entropic effect involves destabilization of the ground state,[36] and its contribution to catalysis is relatively small.[37]
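The activation quantities referred to above (written here as ΔG‡ and ΔH‡) are related by standard transition-state theory, which also makes clear why the enthalpy term alone does not capture the entropic contribution. The expressions below are the textbook Eyring relations, not taken from the source; they assume a transmission coefficient of one, and the subscripts "enz" and "uncat" are labels introduced here for the catalyzed and uncatalyzed reactions.

```latex
\begin{align}
\Delta G^{\ddagger} &= \Delta H^{\ddagger} - T\,\Delta S^{\ddagger} \\
k &= \frac{k_{\mathrm{B}} T}{h}\,\exp\!\left(-\frac{\Delta G^{\ddagger}}{RT}\right) \\
\frac{k_{\mathrm{enz}}}{k_{\mathrm{uncat}}} &=
  \exp\!\left(\frac{\Delta G^{\ddagger}_{\mathrm{uncat}} - \Delta G^{\ddagger}_{\mathrm{enz}}}{RT}\right)
\end{align}
```

In this form, any of the listed strategies that makes ΔH‡ smaller or ΔS‡ less negative lowers ΔG‡ and therefore raises the rate constant exponentially.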

Transition state stabilization
Understanding the origin of the reduction of ΔG‡ requires one to find out how enzymes can stabilize the transition state of the catalyzed reaction more than the transition state of the uncatalyzed reaction. It seems that the most effective way of reaching large stabilization is the use of electrostatic effects, in particular by having a relatively fixed polar environment that is oriented toward the charge distribution of the transition state.[38] Such an environment does not exist in the uncatalyzed reaction in water.

Dynamics and function
The internal dynamics of enzymes have been suggested to be linked with their mechanism of catalysis.[39][40][41] Internal dynamics are the movement of parts of the enzyme's structure, such as individual amino acid residues, a group of amino acids, or even an entire protein domain. These movements occur at various time-scales ranging from femtoseconds to seconds. Networks of protein residues throughout an enzyme's structure can contribute to catalysis through dynamic motions.[42][43][44][45] This can be seen in the kinetic scheme of the combined process of enzymatic activity and dynamics; this scheme can have several independent Michaelis-Menten-like reaction pathways that are connected through fluctuation rates.[46][47][48] Protein motions are vital to many enzymes, but whether small and fast vibrations or larger and slower conformational movements are more important depends on the type of reaction involved. However, although these movements are important in binding and releasing substrates and products, it is not clear if protein movements help to accelerate the chemical steps in enzymatic reactions.[49] These new insights also have implications for understanding allosteric effects and developing new medicines.


Allosteric modulation
Allosteric sites are sites on the enzyme that bind to molecules in the cellular environment. The sites form weak, noncovalent bonds with these molecules, causing a change in the conformation of the enzyme. This change in conformation translates to the active site, which then affects the reaction rate of the enzyme.[50] Allosteric interactions can both inhibit and activate enzymes and are a common way that enzymes are controlled in the body.[51]
Allosteric transition of an enzyme between R and T states, stabilized by an agonist, an inhibitor and a substrate (the MWC model)

Cofactors and coenzymes


Cofactors
Some enzymes do not need any additional components to show full activity. However, others require non-protein molecules called cofactors to be bound for activity.[52] Cofactors can be either inorganic (e.g., metal ions and iron-sulfur clusters) or organic compounds (e.g., flavin and heme). Organic cofactors can be either prosthetic groups, which are tightly bound to an enzyme, or coenzymes, which are released from the enzyme's active site during the reaction. Coenzymes include NADH, NADPH and adenosine triphosphate. These molecules transfer chemical groups between enzymes.[53] An example of an enzyme that contains a cofactor is carbonic anhydrase, which is shown in the ribbon diagram above with a zinc cofactor bound as part of its active site.[54] These tightly bound molecules are usually found in the active site and are involved in catalysis. For example, flavin and heme cofactors are often involved in redox reactions.

Enzymes that require a cofactor but do not have one bound are called apoenzymes or apoproteins. An apoenzyme together with its cofactor(s) is called a holoenzyme (this is the active form). Most cofactors are not covalently attached to an enzyme, but are very tightly bound. However, organic prosthetic groups can be covalently bound (e.g., biotin in the enzyme pyruvate carboxylase). The term "holoenzyme" can also be applied to enzymes that contain multiple protein subunits, such as the DNA polymerases; here the holoenzyme is the complete complex containing all the subunits needed for activity.


Coenzymes
Coenzymes are small organic molecules that can be loosely or tightly bound to an enzyme. Tightly bound coenzymes can be called prosthetic groups. Coenzymes transport chemical groups from one enzyme to another.[55] Some of these chemicals, such as riboflavin, thiamine and folic acid, are vitamins (compounds that cannot be synthesized by the body and must be acquired from the diet). The chemical groups carried include the hydride ion (H⁻) carried by NAD+ or NADP+, the phosphate group carried by adenosine triphosphate, the acetyl group carried by coenzyme A, formyl, methenyl or methyl groups carried by folic acid and the methyl group carried by S-adenosylmethionine. Since coenzymes are chemically changed as a consequence of enzyme action, it is useful to consider coenzymes to be a special class of substrates, or second substrates, which are common to many different enzymes. For example, about 700 enzymes are known to use the coenzyme NADH.[56]
Space-filling model of the coenzyme NADH

Coenzymes are usually continuously regenerated and their concentrations maintained at a steady level inside the cell: for example, NADPH is regenerated through the pentose phosphate pathway and S-adenosylmethionine by methionine adenosyltransferase. This continuous regeneration means that even small amounts of coenzymes are used very intensively. For example, the human body turns over its own weight in ATP each day.[57]


Thermodynamics
Like all catalysts, enzymes do not alter the position of the chemical equilibrium of the reaction. Usually, in the presence of an enzyme, the reaction runs in the same direction as it would without the enzyme, just more quickly. However, in the absence of the enzyme, other possible uncatalyzed, "spontaneous" reactions might lead to different products, because under those conditions a different product is formed faster. Furthermore, enzymes can couple two or more reactions, so that a thermodynamically favorable reaction can be used to "drive" a thermodynamically unfavorable one. For example, the hydrolysis of ATP is often used to drive other chemical reactions.[58]
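The idea of one reaction "driving" another can be illustrated by adding standard free-energy changes. The sketch below uses approximate textbook values for the phosphorylation of glucose and the hydrolysis of ATP, which are assumptions for illustration and do not come from the source; the function and variable names are hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def equilibrium_constant(delta_g_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Equilibrium constant from a standard free-energy change: K = exp(-dG / (R*T))."""
    return math.exp(-delta_g_kj_per_mol / (R * temperature_k))


# Approximate textbook standard free-energy changes (assumed values, kJ/mol).
DG_GLUCOSE_PHOSPHORYLATION = 13.8  # glucose + Pi -> glucose 6-phosphate + H2O
DG_ATP_HYDROLYSIS = -30.5          # ATP + H2O -> ADP + Pi

# Free-energy changes of coupled reactions add.
dg_coupled = DG_GLUCOSE_PHOSPHORYLATION + DG_ATP_HYDROLYSIS

if __name__ == "__main__":
    print(f"uncoupled: dG = {DG_GLUCOSE_PHOSPHORYLATION:+.1f} kJ/mol, "
          f"K = {equilibrium_constant(DG_GLUCOSE_PHOSPHORYLATION):.2e}")
    print(f"coupled:   dG = {dg_coupled:+.1f} kJ/mol, "
          f"K = {equilibrium_constant(dg_coupled):.2e}")
```

With these assumed values the unfavorable step on its own has an equilibrium constant below one, while the coupled overall reaction has a negative free-energy change and an equilibrium constant well above one. The enzyme makes the coupling possible but does not change either equilibrium constant.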

The energies of the stages of a chemical reaction. Substrates need a lot of potential energy to reach a transition state, which then decays into products. The enzyme stabilizes the transition state, reducing the energy needed to form products.

Enzymes catalyze the forward and backward reactions equally. They do not alter the equilibrium itself, but only the speed at which it is reached. For example, carbonic anhydrase catalyzes its reaction in either direction depending on the concentration of its reactants:

$\mathrm{CO_2 + H_2O \rightarrow H_2CO_3}$ (in tissues; high CO2 concentration)
$\mathrm{H_2CO_3 \rightarrow CO_2 + H_2O}$ (in lungs; low CO2 concentration)

Nevertheless, if the equilibrium is greatly displaced in one direction, that is, in a very exergonic reaction, the reaction is in effect irreversible. Under these conditions, the enzyme will, in fact, catalyze the reaction only in the thermodynamically allowed direction.

Kinetics
Enzyme kinetics is the investigation of how enzymes bind substrates and turn them into products. The rate data used in kinetic analyses are commonly obtained from enzyme assays; since the 1990s, the dynamics of many enzymes have also been studied at the level of individual molecules. In 1902 Victor Henri proposed a quantitative theory of enzyme kinetics,[59] but his experimental data were not useful because the significance of the hydrogen ion concentration was not yet appreciated. After Peter Lauritz Sørensen had defined the logarithmic pH scale and introduced the concept of buffering in 1909,[60] the German chemist Leonor Michaelis and his Canadian postdoc Maud Leonora Menten repeated Henri's experiments and confirmed his equation, which is referred to as Henri–Michaelis–Menten kinetics (also termed Michaelis–Menten kinetics).[61] Their work was further developed by G. E. Briggs and J. B. S. Haldane, who derived kinetic equations that are still widely considered a starting point for modelling enzymatic activity.[62]
Mechanism for a single substrate enzyme catalyzed reaction. The enzyme (E) binds a substrate (S) and produces a product (P).

The major contribution of Henri was to think of enzyme reactions in two stages. In the first, the substrate binds reversibly to the enzyme, forming the enzyme-substrate complex. This is sometimes called the Michaelis complex. The enzyme then catalyzes the chemical step in the reaction and releases the product.

The simple Michaelis-Menten mechanism is today regarded as a basic model, because many examples show that enzymatic activity also involves structural dynamics. Such dynamics are incorporated into the mechanism by introducing several Michaelis-Menten pathways that are connected by fluctuating rates.[46][47][48] Nevertheless, there is a mathematical relation connecting the behavior obtained from the basic Michaelis-Menten mechanism (which has been confirmed in many experiments) with the generalized Michaelis-Menten mechanisms involving dynamics and activity;[63] this means that the activity measured for a large population of enzyme molecules may be explained with the simple Michaelis-Menten equation, even though the activity of individual enzymes is richer and involves structural dynamics.

Saturation curve for an enzyme reaction showing the relation between the substrate concentration (S) and rate (v).

Enzymes can catalyze up to several million reactions per second. For example, the uncatalyzed decarboxylation of orotidine 5'-monophosphate has a half-life of 78 million years. However, when the enzyme orotidine 5'-phosphate decarboxylase is added, the same process takes just 25 milliseconds.[64] Enzyme rates depend on solution conditions and substrate concentration. Conditions that denature the protein, such as high temperatures, extremes of pH or high salt concentrations, abolish enzyme activity, while raising substrate concentration tends to increase activity when [S] is low. To find the maximum speed of an enzymatic reaction, the substrate concentration is increased until a constant rate of product formation is seen.
This is shown in the saturation curve. Saturation happens because, as substrate concentration increases, more and more of the free enzyme is converted into the substrate-bound ES form. At the maximum reaction rate (Vmax) of the enzyme, all the enzyme active sites are bound to substrate, and the amount of ES complex is the same as the total amount of enzyme.

However, Vmax is only one kinetic constant of enzymes. The amount of substrate needed to achieve a given rate of reaction is also important. This is given by the Michaelis-Menten constant (Km), which is the substrate concentration required for an enzyme to reach one-half its maximum reaction rate. Each enzyme has a characteristic Km for a given substrate, and this can show how tight the binding of the substrate is to the enzyme. Another useful constant is kcat, which is the number of substrate molecules handled by one active site per second.

The efficiency of an enzyme can be expressed in terms of kcat/Km. This is also called the specificity constant and incorporates the rate constants for all steps in the reaction. Because the specificity constant reflects both affinity and catalytic ability, it is useful for comparing different enzymes against each other, or the same enzyme with different substrates. The theoretical maximum for the specificity constant is called the diffusion limit and is about 10^8 to 10^9 M^-1 s^-1. At this point every collision of the enzyme with its substrate will result in catalysis, and the rate of product formation is not limited by the reaction rate but by the diffusion rate. Enzymes with this property are called catalytically perfect or kinetically perfect. Examples of such enzymes are triose-phosphate isomerase, carbonic anhydrase, acetylcholinesterase, catalase, fumarase, β-lactamase, and superoxide dismutase.

Michaelis-Menten kinetics relies on the law of mass action, which is derived from the assumptions of free diffusion and thermodynamically driven random collision.
However, many biochemical or cellular processes deviate significantly from these conditions, because of macromolecular crowding, phase separation of the enzyme/substrate/product, or one- or two-dimensional molecular movement.[65] In these situations, a fractal Michaelis-Menten kinetics may be applied.[66][67][68][69]

Some enzymes operate with kinetics that are faster than diffusion rates, which would seem to be impossible. Several mechanisms have been invoked to explain this phenomenon. Some proteins are believed to accelerate catalysis by drawing their substrate in and pre-orienting it using dipolar electric fields. Other models invoke a quantum-mechanical tunneling explanation, whereby a proton or an electron can tunnel through activation barriers, although for proton tunneling this model remains somewhat controversial.[70][71] Quantum tunneling for protons has been observed in tryptamine.[72] This suggests that enzyme catalysis may be more accurately characterized as "through the barrier" rather than the traditional model, which requires substrates to go "over" a lowered energy barrier.
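The saturation behavior and the constants Vmax, Km and kcat/Km described in this section can be made concrete with the standard Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]), with Vmax = kcat [E]total. The sketch below assumes that rate law; the enzyme parameters are hypothetical and chosen only for illustration.

```python
import math

def mm_rate(s, vmax, km):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)

def specificity_constant(kcat, km):
    """kcat/Km in M^-1 s^-1 when kcat is given in s^-1 and Km in M."""
    return kcat / km

# Hypothetical enzyme parameters (not measured values for any real enzyme).
KCAT = 1.0e3       # turnovers per active site per second
KM = 5.0e-5        # M
E_TOTAL = 1.0e-9   # total enzyme concentration, M
VMAX = KCAT * E_TOTAL

# The rate rises almost linearly at low [S] and levels off toward Vmax.
for multiple in (0.1, 1.0, 10.0, 1000.0):
    v = mm_rate(multiple * KM, VMAX, KM)
    print(f"[S] = {multiple:g} x Km -> v/Vmax = {v / VMAX:.3f}")

print(f"kcat/Km = {specificity_constant(KCAT, KM):.1e} M^-1 s^-1")
```

At [S] = Km the rate is exactly half of Vmax, which is the definition of Km given above, and the specificity constant of this hypothetical enzyme lies below the diffusion limit of about 10^8 to 10^9 M^-1 s^-1.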


Inhibition
Enzyme reaction rates can be decreased by various types of enzyme inhibitors.

Competitive inhibition

In competitive inhibition, the inhibitor and substrate compete for the enzyme (i.e., they cannot bind at the same time).[74] Often competitive inhibitors strongly resemble the real substrate of the enzyme. For example, methotrexate is a competitive inhibitor of the enzyme dihydrofolate reductase, which catalyzes the reduction of dihydrofolate to tetrahydrofolate. The similarity between the structures of folic acid and this drug is shown in the accompanying figure.

Competitive inhibitors bind reversibly to the enzyme, preventing the binding of substrate. On the other hand, binding of substrate prevents binding of the inhibitor. Substrate and inhibitor compete for the enzyme.

In some cases, the inhibitor can bind to a site other than the binding site of the usual substrate and exert an allosteric effect to change the shape of the usual binding site. For example, strychnine acts as an allosteric inhibitor of the glycine receptor in the mammalian spinal cord and brain stem. Glycine is a major post-synaptic inhibitory neurotransmitter with a specific receptor site. Strychnine binds to an alternate site that reduces the affinity of the glycine receptor for glycine, resulting in convulsions due to lessened inhibition by the glycine.[75] In competitive inhibition the maximal rate of the reaction is not changed, but higher substrate concentrations are required to reach a given rate, increasing the apparent Km.

Uncompetitive inhibition

In uncompetitive inhibition, the inhibitor cannot bind to the free enzyme, only to the ES-complex. The EIS-complex thus formed is enzymatically inactive. This type of inhibition is rare, but may occur in multimeric enzymes.

Non-competitive inhibition

Non-competitive inhibitors can bind to the enzyme at the same time as the substrate, at a site other than the active site. Both the EI and EIS complexes are enzymatically inactive. Because the inhibitor cannot be driven from the enzyme by higher substrate concentration (in contrast to competitive inhibition), the apparent Vmax changes. But because the substrate can still bind to the enzyme, the Km stays the same.
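The effects on the apparent Vmax and Km described for the three inhibition types above can be summarized in a few lines of code. This is a sketch based on the usual textbook expressions, in which the relevant constant is scaled by the factor (1 + [I]/Ki); those expressions, the function name and the numbers are assumptions for illustration and are not given in the text.

```python
import math

def apparent_parameters(mode, vmax, km, inhibitor, ki):
    """Return (apparent Vmax, apparent Km) for a reversible inhibitor.

    Uses the common textbook forms with factor = 1 + [I]/Ki:
      competitive:     Vmax unchanged,     Km multiplied by factor
      uncompetitive:   Vmax / factor,      Km / factor
      noncompetitive:  Vmax / factor,      Km unchanged
    """
    factor = 1.0 + inhibitor / ki
    if mode == "competitive":
        return vmax, km * factor
    if mode == "uncompetitive":
        return vmax / factor, km / factor
    if mode == "noncompetitive":
        return vmax / factor, km
    raise ValueError(f"unknown inhibition mode: {mode!r}")

# Hypothetical values: Vmax in arbitrary rate units, concentrations in M.
VMAX, KM = 100.0, 2.0e-4
INHIBITOR, KI = 1.0e-5, 1.0e-5   # [I] equal to Ki, so factor = 2

for mode in ("competitive", "uncompetitive", "noncompetitive"):
    v_app, km_app = apparent_parameters(mode, VMAX, KM, INHIBITOR, KI)
    print(f"{mode:15s} Vmax_app = {v_app:6.1f}   Km_app = {km_app:.1e} M")
```

With [I] = Ki, the competitive case doubles the apparent Km at unchanged Vmax, whereas the non-competitive case halves the apparent Vmax at unchanged Km, matching the qualitative description above.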


Types of inhibition. This classification was introduced by W.W. Cleland.[73]

Mixed inhibition

This type of inhibition resembles non-competitive inhibition, except that the EIS-complex has residual enzymatic activity. This type of inhibitor does not follow the Michaelis-Menten equation.

In many organisms, inhibitors may act as part of a feedback mechanism. If an enzyme produces too much of one substance in the organism, that substance may act as an inhibitor for the enzyme at the beginning of the pathway that produces it, causing production of the substance to slow down or stop when there is a sufficient amount. This is a form of negative feedback. Enzymes that are subject to this form of regulation are often multimeric and have allosteric binding sites for regulatory substances. Their substrate/velocity plots are not hyperbolic, but sigmoidal (S-shaped).

The coenzyme folic acid (left) and the anti-cancer drug methotrexate (right) are very similar in structure. As a result, methotrexate is a competitive inhibitor of many enzymes that use folates.

Irreversible inhibitors react with the enzyme and form a covalent adduct with the protein. The inactivation is irreversible. These compounds include eflornithine, a drug used to treat the parasitic disease sleeping sickness.[76] Penicillin and aspirin also act in this manner. With these drugs, the compound is bound in the active site and the enzyme then converts the inhibitor into an activated form that reacts irreversibly with one or more amino acid residues.

Uses of inhibitors

Since inhibitors modulate the function of enzymes, they are often used as drugs. A common example of an inhibitor that is used as a drug is aspirin, which inhibits the COX-1 and COX-2 enzymes that produce the inflammation messenger prostaglandin, thus suppressing pain and inflammation. However, other enzyme inhibitors are poisons. For example, the poison cyanide is an irreversible enzyme inhibitor that combines with the copper and iron in the active site of the enzyme cytochrome c oxidase and blocks cellular respiration.[77]


Biological function
Enzymes serve a wide variety of functions inside living organisms. They are indispensable for signal transduction and cell regulation, often via kinases and phosphatases.[78] They also generate movement, with myosin hydrolyzing ATP to generate muscle contraction and also moving cargo around the cell as part of the cytoskeleton.[79] Other ATPases in the cell membrane are ion pumps involved in active transport. Enzymes are also involved in more exotic functions, such as luciferase generating light in fireflies.[80] Viruses can also contain enzymes for infecting cells, such as the HIV integrase and reverse transcriptase, or for viral release from cells, like the influenza virus neuraminidase.

An important function of enzymes is in the digestive systems of animals. Enzymes such as amylases and proteases break down large molecules (starch or proteins, respectively) into smaller ones, so they can be absorbed by the intestines. Starch molecules, for example, are too large to be absorbed from the intestine, but enzymes hydrolyze the starch chains into smaller molecules such as maltose and eventually glucose, which can then be absorbed. Different enzymes digest different food substances. In ruminants, which have herbivorous diets, microorganisms in the gut produce another enzyme, cellulase, to break down the cellulose cell walls of plant fiber.[81]

Several enzymes can work together in a specific order, creating metabolic pathways. In a metabolic pathway, one enzyme takes the product of another enzyme as a substrate. After the catalytic reaction, the product is then passed on to another enzyme. Sometimes more than one enzyme can catalyze the same reaction in parallel; this can allow more complex regulation, with, for example, a low constant activity provided by one enzyme but an inducible high activity from a second enzyme.

Glycolytic enzymes and their functions in the metabolic pathway of glycolysis

Enzymes determine what steps occur in these pathways. Without enzymes, metabolism would neither progress through the same steps nor be fast enough to serve the needs of the cell. Indeed, a metabolic pathway such as glycolysis could not exist independently of enzymes. Glucose, for example, can react directly with ATP to become phosphorylated at one or more of its carbons. In the absence of enzymes, this occurs so slowly as to be insignificant. However, if hexokinase is added, these slow reactions continue to take place except that phosphorylation at carbon 6 occurs so rapidly that, if the mixture is tested a short time later, glucose-6-phosphate is found to be the only significant product. As a consequence, the network of metabolic pathways within each cell depends on the set of functional enzymes that are present.


Control of activity
There are five main ways that enzyme activity is controlled in the cell.

1. Enzyme production (transcription and translation of enzyme genes) can be enhanced or diminished by a cell in response to changes in the cell's environment. This form of gene regulation is called enzyme induction and inhibition. For example, bacteria may become resistant to antibiotics such as penicillin because enzymes called beta-lactamases are induced that hydrolyze the crucial beta-lactam ring within the penicillin molecule. Another example is the group of liver enzymes called cytochrome P450 oxidases, which are important in drug metabolism. Induction or inhibition of these enzymes can cause drug interactions.

2. Enzymes can be compartmentalized, with different metabolic pathways occurring in different cellular compartments. For example, fatty acids are synthesized by one set of enzymes in the cytosol, endoplasmic reticulum and the Golgi apparatus and used by a different set of enzymes as a source of energy in the mitochondrion, through β-oxidation.[82]

3. Enzymes can be regulated by inhibitors and activators. For example, the end product(s) of a metabolic pathway are often inhibitors for one of the first enzymes of the pathway (usually the first irreversible step, called the committed step), thus regulating the amount of end product made by the pathway. Such a regulatory mechanism is called a negative feedback mechanism, because the amount of the end product produced is regulated by its own concentration. Negative feedback can effectively adjust the rate of synthesis of intermediate metabolites according to the demands of the cell. This helps allocate materials and energy economically, and prevents the manufacture of excess end products. The control of enzymatic action helps to maintain a stable internal environment in living organisms.

4. Enzymes can be regulated through post-translational modification. This can include phosphorylation, myristoylation and glycosylation. For example, in the response to insulin, the phosphorylation of multiple enzymes, including glycogen synthase, helps control the synthesis or degradation of glycogen and allows the cell to respond to changes in blood sugar.[83] Another example of post-translational modification is the cleavage of the polypeptide chain. Chymotrypsin, a digestive protease, is produced in inactive form as chymotrypsinogen in the pancreas and transported in this form to the small intestine, where it is activated. This stops the enzyme from digesting the pancreas or other tissues before it enters the gut. This type of inactive precursor to an enzyme is known as a zymogen.

5. Some enzymes may become activated when localized to a different environment (e.g., from a reducing (cytoplasm) to an oxidizing (periplasm) environment, or from high pH to low pH). For example, hemagglutinin in the influenza virus is activated by a conformational change caused by the acidic conditions that occur when it is taken up inside its host cell and enters the lysosome.[84]
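The negative feedback described in item 3 above can be illustrated with a toy model. The sketch below assumes a single end product P that is consumed at a rate proportional to its concentration and that inhibits its own synthesis through the simple factor 1 / (1 + P/Ki); the model, the function name and all numbers are hypothetical and are not taken from the text.

```python
def steady_state_product(v_max, k_use, ki=None, dt=0.01, steps=100_000):
    """Integrate dP/dt = synthesis - k_use * P with the forward Euler method.

    synthesis = v_max                   without feedback (ki is None)
    synthesis = v_max / (1 + P / ki)    with end-product inhibition
    Returns the product level after steps * dt time units.
    """
    p = 0.0
    for _ in range(steps):
        synthesis = v_max if ki is None else v_max / (1.0 + p / ki)
        p += dt * (synthesis - k_use * p)
    return p

# Hypothetical parameters in arbitrary units.
V_MAX = 10.0   # maximal synthesis rate of the first (committed) step
K_USE = 0.1    # first-order consumption of the end product
KI = 5.0       # product level at which synthesis is halved

without_feedback = steady_state_product(V_MAX, K_USE)
with_feedback = steady_state_product(V_MAX, K_USE, ki=KI)
print(f"steady-state product without feedback: {without_feedback:.2f}")
print(f"steady-state product with feedback:    {with_feedback:.2f}")
```

In this toy model the inhibited pathway settles at a much lower product level than the uninhibited one, because synthesis slows as the product accumulates; this is the economy of materials and energy that the text attributes to negative feedback.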


Involvement in disease
Since the tight control of enzyme activity is essential for homeostasis, any malfunction (mutation, overproduction, underproduction or deletion) of a single critical enzyme can lead to a genetic disease. The importance of enzymes is shown by the fact that a lethal illness can be caused by the malfunction of just one type of enzyme out of the thousands of types present in our bodies. One example is the most common type of phenylketonuria. A mutation of a single amino acid in the enzyme phenylalanine hydroxylase, which catalyzes the first step in the degradation of phenylalanine, results in a build-up of phenylalanine and related products. This can lead to intellectual disability if the disease is untreated.[86] Another example of enzyme deficiency is pseudocholinesterase deficiency, in which the metabolic degradation of exogenous choline esters is slowed.
Phenylalanine hydroxylase. Created from PDB 1KW0 [85]

Another example is when germline mutations in genes coding for DNA repair enzymes cause hereditary cancer syndromes such as xeroderma pigmentosum. Defects in these enzymes cause cancer since the body is less able to repair mutations in the genome. This causes a slow accumulation of mutations and results in the development of many types of cancer in the sufferer.

Oral administration of enzymes can be used to treat several diseases (e.g. pancreatic insufficiency and lactose intolerance). Since enzymes are proteins themselves, they are potentially subject to inactivation and digestion in the gastrointestinal environment. Therefore, a non-invasive imaging assay was developed to monitor the gastrointestinal activity of exogenous enzymes (prolyl endopeptidase as a potential adjuvant therapy for celiac disease) in vivo.[87]

Naming conventions
An enzyme's name is often derived from its substrate or the chemical reaction it catalyzes, with the word ending in -ase. Examples are lactase, alcohol dehydrogenase and DNA polymerase. This may result in different enzymes, called isozymes, with the same function having the same basic name. Isoenzymes have different amino acid sequences and may be distinguished by their optimal pH, kinetic properties or immunologically. The terms isoenzyme and isozyme are used interchangeably for such homologous proteins. Furthermore, the normal physiological reaction an enzyme catalyzes may not be the same as under artificial conditions. This can result in the same enzyme being identified with two different names. For example, glucose isomerase, which is used industrially to convert glucose into the sweetener fructose, is a xylose isomerase in vivo (within the body).

The International Union of Biochemistry and Molecular Biology has developed a nomenclature for enzymes, the EC numbers; each enzyme is described by a sequence of four numbers preceded by "EC". The first number broadly classifies the enzyme based on its mechanism. The top-level classification is[88]

EC 1 Oxidoreductases: catalyze oxidation/reduction reactions
EC 2 Transferases: transfer a functional group (e.g. a methyl or phosphate group)
EC 3 Hydrolases: catalyze the hydrolysis of various bonds
EC 4 Lyases: cleave various bonds by means other than hydrolysis and oxidation
EC 5 Isomerases: catalyze isomerization changes within a single molecule
EC 6 Ligases: join two molecules with covalent bonds.
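The structure of an EC number (four numbers preceded by "EC", with the first giving the top-level class) lends itself to a small parsing example. The following is a minimal sketch covering only the six classes listed above; the function names are hypothetical, and partially specified numbers (for example with a dash in place of a field) are deliberately not handled.

```python
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}

def parse_ec_number(text):
    """Parse a string such as 'EC 2.7.1.1' into a tuple of four integers."""
    body = text.strip()
    if body.upper().startswith("EC"):
        body = body[2:].strip()
    fields = body.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {text!r}")
    numbers = tuple(int(field) for field in fields)
    if numbers[0] not in EC_CLASSES:
        raise ValueError(f"unknown top-level class in {text!r}")
    return numbers

def top_level_class(text):
    """Return the name of the top-level class for an EC number string."""
    return EC_CLASSES[parse_ec_number(text)[0]]

# Hexokinase (EC 2.7.1.1) and chymotrypsin (EC 3.4.21.1) as examples.
for ec in ("EC 2.7.1.1", "EC 3.4.21.1"):
    print(ec, "->", top_level_class(ec))
```

A dictionary keyed by the first number keeps the lookup independent of the remaining three fields, which only refine the classification within a class.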

According to the naming conventions, enzymes are generally classified into six main family classes and many sub-family classes. Web servers such as EzyPred[89][90] and other bioinformatics tools have been developed to predict which main family class[91] and sub-family class[92][93] an enzyme molecule belongs to from its sequence information alone, via the pseudo amino acid composition.


Industrial applications
Enzymes are used in the chemical industry and other industrial applications when extremely specific catalysts are required. However, enzymes in general are limited in the number of reactions they have evolved to catalyze and also by their lack of stability in organic solvents and at high temperatures. As a consequence, protein engineering is an active area of research and involves attempts to create new enzymes with novel properties, either through rational design or in vitro evolution.[94][95] These efforts have begun to be successful, and a few enzymes have now been designed "from scratch" to catalyze reactions that do not occur in nature.[96]
Application | Enzymes used | Uses
Food processing | Amylases from fungi and plants | Production of sugars from starch, such as in making high-fructose corn syrup.[97] In baking, catalyze breakdown of starch in the flour to sugar. Yeast fermentation of sugar produces the carbon dioxide that raises the dough.
Food processing | Proteases | Biscuit manufacturers use them to lower the protein level of flour.
Baby foods | Trypsin | To predigest baby foods.
Brewing industry | Enzymes from barley are released during the mashing stage of beer production. | They degrade starch and proteins to produce simple sugar, amino acids and peptides that are used by yeast for fermentation.
Brewing industry | Industrially produced barley enzymes | Widely used in the brewing process to substitute for the natural enzymes found in barley.
Brewing industry | Amylase, glucanases, proteases | Split polysaccharides and proteins in the malt.
Brewing industry | Betaglucanases and arabinoxylanases | Improve the wort and beer filtration characteristics.
Brewing industry | Amyloglucosidase and pullulanases | Low-calorie beer and adjustment of fermentability.
Brewing industry | Proteases | Remove cloudiness produced during storage of beers.
Brewing industry | Acetolactatedecarboxylase (ALDC) | Increases fermentation efficiency by reducing diacetyl formation.[98]
Fruit juices | Cellulases, pectinases | Clarify fruit juices.
Dairy industry | Rennin, derived from the stomachs of young ruminant animals (like calves and lambs) | Manufacture of cheese, used to hydrolyze protein.
Dairy industry | Microbially produced enzyme | Now finding increasing use in the dairy industry.
Dairy industry | Lipases | Used during the production of Roquefort cheese to enhance the ripening of the blue-mold cheese.
Dairy industry | Lactases | Break down lactose to glucose and galactose.
Meat tenderizers | Papain | To soften meat for cooking.
Starch industry | Amylases, amyloglucosidases and glucoamylases | Convert starch into glucose and various syrups.
Starch industry | Glucose isomerase | Converts glucose into fructose in production of high-fructose syrups from starchy materials. These syrups have enhanced sweetening properties and lower calorific values than sucrose for the same level of sweetness.
Paper industry | Amylases, xylanases, cellulases and ligninases | Degrade starch to lower viscosity, aiding sizing and coating paper. Xylanases reduce bleach required for decolorizing; cellulases smooth fibers, enhance water drainage, and promote ink removal; lipases reduce pitch and lignin-degrading enzymes remove lignin to soften paper.
Biofuel industry | Cellulases | Used to break down cellulose into sugars that can be fermented (see cellulosic ethanol).
Biofuel industry | Ligninases | Use of lignin waste.
Biological detergent | Primarily proteases, produced in an extracellular form from bacteria | Used for presoak conditions and direct liquid applications helping with removal of protein stains from clothes.
Biological detergent | Amylases | Detergents for machine dish washing to remove resistant starch residues.
Biological detergent | Lipases | Used to assist in the removal of fatty and oily stains.
Biological detergent | Cellulases | Used in biological fabric conditioners.
Contact lens cleaners | Proteases | To remove proteins on contact lenses to prevent infections.
Rubber industry | Catalase | To generate oxygen from peroxide to convert latex into foam rubber.
Photographic industry | Protease (ficin) | Dissolve gelatin off scrap film, allowing recovery of its silver content.
Molecular biology | Restriction enzymes, DNA ligase and polymerases | Used to manipulate DNA in genetic engineering, important in pharmacology, agriculture and medicine. Essential for restriction digestion and the polymerase chain reaction. Molecular biology is also important in forensic science.

Figure captions from the table: Amylases catalyze the release of simple sugars from starch. Germinating barley used for malt. Roquefort cheese. Glucose and fructose. A paper mill in South Carolina. Cellulose in 3D. Part of the DNA double helix.

References
[1] Smith AL (Ed) (1997). Oxford dictionary of biochemistry and molecular biology. Oxford [Oxfordshire]: Oxford University Press. ISBN0-19-854768-4. [2] Grisham, Charles M.; Reginald H. Garrett (1999). Biochemistry. Philadelphia: Saunders College Pub. pp.4267. ISBN0-03-022318-0. [3] Bairoch A. (2000). "The ENZYME database in 2000" (http:/ / www. expasy. org/ NAR/ enz00. pdf) (PDF). Nucleic Acids Res 28 (1): 3045. doi:10.1093/nar/28.1.304. PMC102465. PMID10592255. . [4] Lilley D (2005). "Structure, folding and mechanisms of ribozymes". Curr Opin Struct Biol 15 (3): 31323. doi:10.1016/j.sbi.2005.05.002. PMID15919196. [5] Cech T (2000). "Structural biology. The ribosome is a ribozyme". Science 289 (5481): 8789. doi:10.1126/science.289.5481.878. PMID10960319. [6] Groves JT (1997). "Artificial enzymes. The importance of being selective". Nature 389 (6649): 32930. doi:10.1038/38602. PMID9311771. [7] de Raumur, RAF (1752). "Observations sur la digestion des oiseaux". Histoire de l'academie royale des sciences 1752: 266, 461. [8] Williams, H. S. (1904) A History of Science: in Five Volumes. Volume IV: Modern Development of the Chemical and Biological Sciences (http:/ / etext. lib. virginia. edu/ toc/ modeng/ public/ Wil4Sci. html) Harper and Brothers (New York) Accessed 4 April 2007 [9] Dubos J. (1951). "Louis Pasteur: Free Lance of Science, Gollancz. Quoted in Manchester K. L. (1995) Louis Pasteur (18221895)chance and the prepared mind". Trends Biotechnol 13 (12): 5115. doi:10.1016/S0167-7799(00)89014-9. PMID8595136. [10] Khne coined the word "enzyme" in: W. Khne (1877) " ber das Verhalten verschiedener organisirter und sog. ungeformter Fermente (http:/ / books. google. com/ books?id=jzdMAAAAYAAJ& pg=PA190& ie=ISO-8859-1& output=html)" (On the behavior of various organized and so-called unformed ferments), Verhandlungen des naturhistorisch-medicinischen Vereins zu Heidelberg, new series, vol. 1, no. 3, pages 190193. 
The relevant passage occurs on page 190: "Um Missverstndnissen vorzubeugen und lstige Umschreibungen zu vermeiden schlgt Vortragender vor, die ungeformten oder nicht organisirten Fermente, deren Wirkung ohne Anwesenheit von Organismen und ausserhalb derselben erfolgen kann, als Enzyme zu bezeichnen." (Translation: In order to obviate misunderstandings and avoid cumbersome periphrases, [the author, a university lecturer] suggests designating as "enzymes" the unformed or not organized ferments, whose action can occur without the presence of organisms and outside of the same.) [11] Nobel Laureate Biography of Eduard Buchner at http:/ / nobelprize. org (http:/ / nobelprize. org/ nobel_prizes/ chemistry/ laureates/ 1907/ buchner-bio. html). Retrieved 4 April 2007. [12] Text of Eduard Buchner's 1907 Nobel lecture at http:/ / nobelprize. org (http:/ / nobelprize. org/ nobel_prizes/ chemistry/ laureates/ 1907/ buchner-lecture. html). Retrieved 4 April 2007. [13] The naming of enzymes by adding the suffix "-ase" to the substrate on which the enzyme acts, has been traced to French scientist mile Duclaux (18401904), who intended to honor the discoverers of diastase the first enzyme to be isolated by introducing this practice in his book Trait de Microbiologie (http:/ / books. google. com/ books?id=Kp9EAAAAQAAJ& printsec=frontcover), vol. 2 (Paris, France: Masson and Co., 1899). See Chapter 1, especially page 9. [14] Willsttter, R. (1927). Problems and Methods in Enzyme Research. Cornell University Press, Ithaca. quoted in Blow, David (2000). "So do we understand how enzymes work?" (http:/ / cmgm3. stanford. edu/ biochem/ sb241/ Herschlag_lectures/ papers/ Blow. pdf) (pdf). Structure 8 (4): R77R81. doi:10.1016/S0969-2126(00)00125-8. PMID10801479. . [15] 1946 Nobel prize for Chemistry laureates at http:/ / nobelprize. org (http:/ / nobelprize. org/ nobel_prizes/ chemistry/ laureates/ 1946/ ). Retrieved 4 April 2007.

Enzyme
[16] Blake CC, Koenig DF, Mair GA, North AC, Phillips DC, Sarma VR. (1965). "Structure of hen egg-white lysozyme. A three-dimensional Fourier synthesis at 2 Angstrom resolution". Nature 206 (4986): 75761. doi:10.1038/206757a0. PMID5891407. [17] http:/ / www. rcsb. org/ pdb/ explore. do?structureId=1MOO [18] Chen LH, Kenyon GL, Curtin F, Harayama S, Bembenek ME, Hajipour G, Whitman CP (1992). "4-Oxalocrotonate tautomerase, an enzyme composed of 62 amino acid residues per monomer". J. Biol. Chem. 267 (25): 1771621. PMID1339435. [19] Smith S (1 December 1994). "The animal fatty acid synthase: one gene, one polypeptide, seven enzymes". FASEB J. 8 (15): 124859. PMID8001737. [20] Anfinsen C.B. (1973). "Principles that Govern the Folding of Protein Chains". Science 181 (4096): 22330. doi:10.1126/science.181.4096.223. PMID4124164. [21] Dunaway-Mariano D (2008). "Enzyme function discovery". Structure 16 (11): 1599600. doi:10.1016/j.str.2008.10.001. PMID19000810. [22] The Catalytic Site Atlas at The European Bioinformatics Institute (http:/ / www. ebi. ac. uk/ thornton-srv/ databases/ CSA/ ). Retrieved 4 April 2007. [23] Jaeger KE, Eggert T. (2004). "Enantioselective biocatalysis optimized by directed evolution". Curr Opin Biotechnol. 15 (4): 30513. doi:10.1016/j.copbio.2004.06.007. PMID15358000. [24] Shevelev IV, Hubscher U. (2002). "The 3' 5' exonucleases". Nat Rev Mol Cell Biol. 3 (5): 36476. doi:10.1038/nrm804. PMID11988770. [25] Tymoczko, John L.; Stryer Berg Tymoczko; Stryer, Lubert; Berg, Jeremy Mark (2002). Biochemistry. San Francisco: W.H. Freeman. ISBN0-7167-4955-6. [26] Zenkin N, Yuzenkova Y, Severinov K. (2006). "Transcript-assisted transcriptional proofreading". Science. 313 (5786): 51820. doi:10.1126/science.1127422. PMID16873663. [27] Ibba M, Soll D. (2000). "Aminoacyl-tRNA synthesis". Annu Rev Biochem. 69: 61750. doi:10.1146/annurev.biochem.69.1.617. PMID10966471. [28] Rodnina MV, Wintermeyer W. (2001). 
"Fidelity of aminoacyl-tRNA selection on the ribosome: kinetic and structural mechanisms". Annu Rev Biochem. 70: 41535. doi:10.1146/annurev.biochem.70.1.415. PMID11395413. [29] Firn, Richard. "The Screening Hypothesis a new explanation of secondary product diversity and function" (http:/ / web. archive. org/ web/ 20060516035537/ http:/ / www-users. york. ac. uk/ ~drf1/ rdf_sp1. htm). Archived from the original (http:/ / www. york. ac. uk/ res/ firn/ web/ rdf_sp1. htm) on 16 May 2006. . Retrieved 11 October 2006. [30] Fischer E. (1894). "Einfluss der Configuration auf die Wirkung der Enzyme" (http:/ / gallica. bnf. fr/ ark:/ 12148/ bpt6k90736r/ f364. chemindefer). Ber. Dt. Chem. Ges. 27 (3): 298593. doi:10.1002/cber.18940270364. . [31] Koshland D. E. (1958). "Application of a Theory of Enzyme Specificity to Protein Synthesis". Proc. Natl. Acad. Sci. 44 (2): 98104. doi:10.1073/pnas.44.2.98. PMC335371. PMID16590179. [32] Vasella A, Davies GJ, Bohm M. (2002). "Glycosidase mechanisms". Curr Opin Chem Biol. 6 (5): 61929. doi:10.1016/S1367-5931(02)00380-0. PMID12413546. [33] Boyer, Rodney (2002) [2002]. "6". Concepts in Biochemistry (2nd ed.). New York, Chichester, Weinheim, Brisbane, Singapore, Toronto.: John Wiley & Sons, Inc.. pp.1378. ISBN0-470-00379-0. OCLC51720783. [34] Savir Y & Tlusty T (2007). Scalas, Enrico. ed. "Conformational proofreading: the impact of conformational changes on the specificity of molecular recognition" (http:/ / www. weizmann. ac. il/ complex/ tlusty/ papers/ PLoSONE2007. pdf). PLoS ONE 2 (5): e468. doi:10.1371/journal.pone.0000468. PMC1868595. PMID17520027. . [35] Fersht, Alan (1985). Enzyme structure and mechanism. San Francisco: W.H. Freeman. pp.502. ISBN0-7167-1615-1. [36] Jencks, William P. (1987). Catalysis in chemistry and enzymology. Mineola, N.Y: Dover. ISBN0-486-65460-5. [37] Villa J, Strajbl M, Glennon TM, Sham YY, Chu ZT, Warshel A (2000). "How important are entropic contributions to enzyme catalysis?". Proc. Natl. Acad. Sci. 
U.S.A. 97 (22): 11899904. doi:10.1073/pnas.97.22.11899. PMC17266. PMID11050223. [38] Warshel A, Sharma PK, Kato M, Xiang Y, Liu H, Olsson MH (2006). "Electrostatic basis for enzyme catalysis". Chem. Rev. 106 (8): 321035. doi:10.1021/cr0503106. PMID16895325. [39] Eisenmesser EZ, Bosco DA, Akke M, Kern D (2002). "Enzyme dynamics during catalysis". Science 295 (5559): 15203. doi:10.1126/science.1066176. PMID11859194. [40] Agarwal PK (2005). "Role of protein dynamics in reaction rate enhancement by enzymes". J. Am. Chem. Soc. 127 (43): 1524856. doi:10.1021/ja055251s. PMID16248667. [41] Eisenmesser EZ, Millet O, Labeikovsky W (2005). "Intrinsic dynamics of an enzyme underlies catalysis". Nature 438 (7064): 11721. doi:10.1038/nature04105. PMID16267559. [42] Yang LW, Bahar I (5 June 2005). "Coupling between catalytic site and collective dynamics: A requirement for mechanochemical activity of enzymes" (http:/ / www. cell. com/ structure/ abstract/ S0969-2126(05)00167-X). Structure 13 (6): 893904. doi:10.1016/j.str.2005.03.015. PMC1489920. PMID15939021. . [43] Agarwal PK, Billeter SR, Rajagopalan PT, Benkovic SJ, Hammes-Schiffer S. (5 March 2002). "Network of coupled promoting motions in enzyme catalysis". Proc Natl Acad Sci USA. 99 (5): 27949. doi:10.1073/pnas.052005999. PMC122427. PMID11867722. [44] Agarwal PK, Geist A, Gorin A (2004). "Protein dynamics and enzymatic catalysis: investigating the peptidyl-prolyl cis-trans isomerization activity of cyclophilin A". Biochemistry 43 (33): 1060518. doi:10.1021/bi0495228. PMID15311922. [45] Tousignant A, Pelletier JN. (2004). "Protein motions promote catalysis". Chem Biol. 11 (8): 103742. doi:10.1016/j.chembiol.2004.06.007. PMID15324804.

[46] Flomenbom O, Velonia K, Loos D et al. (2005). "Stretched exponential decay and correlations in the catalytic activity of fluctuating single lipase molecules". Proc. Natl. Acad. Sci. USA 102 (7): 23682372. doi:10.1073/pnas.0409039102. PMC548972. PMID15695587. [47] English BP, Min W, van Oijen AM et al. (2006). "Ever-fluctuating single enzyme molecules: Michaelis-Menten equation revisited". Nature Chem. Biol. 2 (2): 8794. doi:10.1038/nchembio759. PMID16415859. [48] Lu H, Xun L, Xie X S (1998). "Single-molecule enzymatic dynamics". Science 282 (5395): 18771882. doi:10.1126/science.282.5395.1877. PMID9836635. [49] Olsson, MH; Parson, WW; Warshel, A (2006). "Dynamical Contributions to Enzyme Catalysis: Critical Tests of A Popular Hypothesis". Chem. Rev. 106 (5): 173756. doi:10.1021/cr040427e. PMID16683752. [50] Neet KE (1995). "Cooperativity in enzyme function: equilibrium and kinetic aspects". Meth. Enzymol.. Methods in Enzymology 249: 51967. doi:10.1016/0076-6879(95)49048-5. ISBN978-0-12-182150-0. PMID7791626. [51] Changeux JP, Edelstein SJ (2005). "Allosteric mechanisms of signal transduction". Science 308 (5727): 14248. doi:10.1126/science.1108595. PMID15933191. [52] de Bolster, M.W.G. (1997). "Glossary of Terms Used in Bioinorganic Chemistry: Cofactor" (http:/ / www. chem. qmul. ac. uk/ iupac/ bioinorg/ CD. html#34). International Union of Pure and Applied Chemistry. . Retrieved 30 October 2007. [53] de Bolster, M.W.G. (1997). "Glossary of Terms Used in Bioinorganic Chemistry: Coenzyme" (http:/ / www. chem. qmul. ac. uk/ iupac/ bioinorg/ CD. html#33). International Union of Pure and Applied Chemistry. . Retrieved 30 October 2007. [54] Fisher Z, Hernandez Prada JA, Tu C, Duda D, Yoshioka C, An H, Govindasamy L, Silverman DN and McKenna R. (2005). "Structural and kinetic characterization of active-site histidine as a proton shuttle in catalysis by human carbonic anhydrase II". Biochemistry. 44 (4): 1097115. doi:10.1021/bi0480279. PMID15667203. 
[55] Wagner, Arthur L. (1975). Vitamins and Coenzymes. Krieger Pub Co. ISBN0-88275-258-8. [56] BRENDA The Comprehensive Enzyme Information System (http:/ / www. brenda. uni-koeln. de/ ). Retrieved 4 April 2007. [57] Trnroth-Horsefield S, Neutze R (2008). "Opening and closing the metabolite gate". Proc. Natl. Acad. Sci. U.S.A. 105 (50): 195656. doi:10.1073/pnas.0810654106. PMC2604989. PMID19073922. [58] Ferguson, S. J.; Nicholls, David; Ferguson, Stuart (2002). Bioenergetics 3 (3rd ed.). San Diego: Academic. ISBN0-12-518121-3. [59] Henri, V. (1902). "Theorie generale de l'action de quelques diastases". Compt. Rend. Hebd. Acad. Sci. Paris 135: 9169. [60] Srensen,P.L. (1909). "Enzymstudien {II}. ber die Messung und Bedeutung der Wasserstoffionenkonzentration bei enzymatischen Prozessen". Biochem. Z. 21: 131304. [61] Michaelis L., Menten M. (1913). "Die Kinetik der Invertinwirkung". Biochem. Z. 49: 333369. English translation (http:/ / web. lemoyne. edu/ ~giunta/ menten. html). Retrieved 6 April 2007. [62] Briggs G. E., Haldane J. B. S. (1925). "A note on the kinetics of enzyme action". Biochem. J. 19 (2): 339339. PMC1259181. PMID16743508. [63] Xue X, Liu F, Ou-Yang ZC (2006). "Single molecule Michaelis-Menten equation beyond quasistatic disorder". Phys. Rev. E 74 (3): 030902. doi:10.1103/PhysRevE.74.030902. PMID17025584. [64] Radzicka A, Wolfenden R. (1995). "A proficient enzyme". Science 267 (5194): 90931. doi:10.1126/science.7809611. PMID7809611. [65] Ellis RJ (2001). "Macromolecular crowding: obvious but underappreciated". Trends Biochem. Sci. 26 (10): 597604. doi:10.1016/S0968-0004(01)01938-7. PMID11590012. [66] Kopelman R (1988). "Fractal Reaction Kinetics". Science 241 (4873): 162026. doi:10.1126/science.241.4873.1620. PMID17820893. [67] Savageau MA (1995). "Michaelis-Menten mechanism reconsidered: implications of fractal kinetics". J. Theor. Biol. 176 (1): 11524. doi:10.1006/jtbi.1995.0181. PMID7475096. [68] Schnell S, Turner TE (2004). 
"Reaction kinetics in intracellular environments with macromolecular crowding: simulations and rate laws". Prog. Biophys. Mol. Biol. 85 (23): 23560. doi:10.1016/j.pbiomolbio.2004.01.012. PMID15142746. [69] Xu F, Ding H (2007). "A new kinetic model for heterogeneous (or spatially confined) enzymatic catalysis: Contributions from the fractal and jamming (overcrowding) effects". Appl. Catal. A: Gen. 317 (1): 7081. doi:10.1016/j.apcata.2006.10.014. [70] Garcia-Viloca M., Gao J., Karplus M., Truhlar D. G. (2004). "How enzymes work: analysis by modern rate theory and computer simulations". Science 303 (5655): 18695. doi:10.1126/science.1088172. PMID14716003. [71] Olsson M. H., Siegbahn P. E., Warshel A. (2004). "Simulations of the large kinetic isotope effect and the temperature dependence of the hydrogen atom transfer in lipoxygenase". J. Am. Chem. Soc. 126 (9): 28208. doi:10.1021/ja037233l. PMID14995199. [72] Masgrau L., Roujeinikova A., Johannissen L. O., Hothi P., Basran J., Ranaghan K. E., Mulholland A. J., Sutcliffe M. J., Scrutton N. S., Leys D. (2006). "Atomic Description of an Enzyme Reaction Dominated by Proton Tunneling". Science 312 (5771): 23741. doi:10.1126/science.1126002. PMID16614214. [73] Cleland, W.W. (1963). "The Kinetics of Enzyme-catalyzed Reactions with two or more Substrates or Products 2. {I}nhibition: Nomenclature and Theory". Biochim. Biophys. Acta 67: 17387. [74] Price, NC. (1979). "What is meant by 'competitive inhibition'?". Trends in Biochemical Sciences 4 (11): pN272. doi:10.1016/0968-0004(79)90205-6. [75] Dick, Ronald M. (2011). "Chapter 2. Pharmacodynamics: The Study of Drug Action". In Ouellette, Richard G.; Joyce, Joseph A.. Pharmacology for Nurse Anesthesiology. Jones & Bartlett Learning. ISBN978-0-7637-8607-6. [76] R Poulin; Lu, L; Ackermann, B; Bey, P; Pegg, AE (5 January 1992). "Mechanism of the irreversible inactivation of mouse ornithine decarboxylase by alpha-difluoromethylornithine. 
Characterization of sequences at the inhibitor and coenzyme binding sites". Journal of Biological Chemistry 267 (1): 1508. PMID1730582.

[77] Yoshikawa S and Caughey WS. (15 May 1990). "Infrared evidence of cyanide binding to iron and copper sites in bovine heart cytochrome c oxidase. Implications regarding oxygen reduction". J Biol Chem. 265 (14): 794558. PMID2159465. [78] Hunter T. (1995). "Protein kinases and phosphatases: the yin and yang of protein phosphorylation and signaling". Cell. 80 (2): 22536. doi:10.1016/0092-8674(95)90405-0. PMID7834742. [79] Berg JS, Powell BC, Cheney RE (1 April 2001). "A millennial myosin census". Mol. Biol. Cell 12 (4): 78094. PMC32266. PMID11294886. [80] Meighen EA (1 March 1991). "Molecular biology of bacterial bioluminescence". Microbiol. Rev. 55 (1): 12342. PMC372803. PMID2030669. [81] Mackie RI, White BA (1 October 1990). "Recent advances in rumen microbial ecology and metabolism: potential impact on nutrient output". J. Dairy Sci. 73 (10): 297195. doi:10.3168/jds.S0022-0302(90)78986-2. PMID2178174. [82] Faergeman NJ, Knudsen J (1997). "Role of long-chain fatty acyl-CoA esters in the regulation of metabolism and in cell signalling". Biochem. J. 323 (Pt 1): 112. PMC1218279. PMID9173866. [83] Doble B. W., Woodgett J. R. (2003). "GSK-3: tricks of the trade for a multi-tasking kinase". J. Cell. Sci. 116 (Pt 7): 117586. doi:10.1242/jcs.00384. PMC3006448. PMID12615961. [84] Carr C. M., Kim P. S. (2003). "A spring-loaded mechanism for the conformational change of influenza hemagglutinin". Cell 73 (4): 82332. doi:10.1016/0092-8674(93)90260-W. PMID8500173. [85] http:/ / www. rcsb. org/ pdb/ explore. do?structureId=1KW0 [86] Phenylketonuria: NCBI Genes and Disease (http:/ / www. ncbi. nlm. nih. gov/ books/ bv. fcgi?call=bv. View. . ShowSection& rid=gnd. section. 234). Retrieved 4 April 2007. [87] Fuhrmann G, Leroux JC (2011). "In vivo fluorescence imaging of exogenous enzyme activity in the gastrointestinal tract". Proceedings of the National Academy of Sciences 108 (22): 90329037. doi:10.1073/pnas.1100285108. 
[88] The complete nomenclature can be browsed at Enzyme Nomenclature (http:/ / www. chem. qmul. ac. uk/ iubmb/ enzyme/ ). Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes by the Reactions they Catalyse. Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) [89] http:/ / www. csbio. sjtu. edu. cn/ bioinf/ EzyPred/ [90] Shen, HB; Chou, KC (2007). "EzyPred: A top-down approach for predicting enzyme functional classes and subclasses". Biochemical and Biophysical Research Communications 364 (1): 539. doi:10.1016/j.bbrc.2007.09.098. PMID17931599. [91] Qiu, JD; Huang, JH; Shi, SP; Liang, RP (2010). "Using the concept of Chou's pseudo amino acid composition to predict enzyme family classes: An approach with support vector machine based on discrete wavelet transform". Protein and peptide letters 17 (6): 71522. doi:10.2174/092986610791190372. PMID19961429. [92] Zhou, X. B., Chen, C., Li, Z. C. & Zou, X. Y. (2007). "Using Chou's amphiphilic pseudo-amino acid composition and support vector machine for prediction of enzyme subfamily classes". Journal of Theoretical Biology 248 (3): 546551. doi:10.1016/j.jtbi.2007.06.001. PMID17628605. [93] Chou, K. C. (2005). "Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes". Bioinformatics 21 (1): 1019. doi:10.1093/bioinformatics/bth466. PMID15308540. [94] Renugopalakrishnan V, Garduno-Juarez R, Narasimhan G, Verma CS, Wei X, Li P. (2005). "Rational design of thermally stable proteins: relevance to bionanotechnology". J Nanosci Nanotechnol. 5 (11): 17591767. doi:10.1166/jnn.2005.441. PMID16433409. [95] Hult K, Berglund P. (2003). "Engineered enzymes for improved organic synthesis". Curr Opin Biotechnol. 14 (4): 395400. doi:10.1016/S0958-1669(03)00095-8. PMID12943848. [96] Jiang L, Althoff EA, Clemente FR (2008). 
"De novo computational design of retro-aldol enzymes". Science 319 (5868): 138791. doi:10.1126/science.1152692. PMID18323453. [97] Guzmn-Maldonado H, Paredes-Lpez O (1995). "Amylolytic enzymes and products derived from starch: a review". Critical reviews in food science and nutrition 35 (5): 373403. doi:10.1080/10408399509527706. PMID8573280. [98] Dulieu C, Moll M, Boudrant J, Poncelet D (2000). "Improved performances and control of beer fermentation using encapsulated alpha-acetolactate decarboxylase and modeling". Biotechnology progress 16 (6): 95865. doi:10.1021/bp000128k. PMID11101321.


Further reading
Etymology and history
"New Beer in an Old Bottle: Eduard Buchner and the Growth of Biochemical Knowledge, edited by Athel Cornish-Bowden and published by Universitat de València (1997): ISBN 84-370-3328-4" (http://web.archive.org/web/20080207101706/http://bip.cnrs-mrs.fr/bip10/buchner.htm). Archived from the original (http://bip.cnrs-mrs.fr/bip10/buchner.htm) on 7 February 2008. A history of early enzymology.
Williams, Henry Smith, 1863–1943. A History of Science: in Five Volumes. Volume IV: Modern Development of the Chemical and Biological Sciences (http://etext.lib.virginia.edu/toc/modeng/public/Wil4Sci.html). A textbook from the 19th century.
Kleyn J, Hough J (1971). "The microbiology of brewing". Annu. Rev. Microbiol. 25: 583–608. doi:10.1146/annurev.mi.25.100171.003055. PMID 4949040.

Kinetics and inhibition
Cornish-Bowden, Athel. Fundamentals of Enzyme Kinetics. (3rd edition), Portland Press, 2004. ISBN 1-85578-158-1.
Segel, Irwin H. Enzyme Kinetics: Behavior and Analysis of Rapid Equilibrium and Steady-State Enzyme Systems. (New Ed edition), Wiley-Interscience, 1993. ISBN 0-471-30309-7.
Baynes, John W. Medical Biochemistry. (2nd edition), Elsevier-Mosby, 2005. ISBN 0-7234-3341-0, p. 57.

Enzyme structure and mechanism
Fersht, Alan (1999). Structure and mechanism in protein science: a guide to enzyme catalysis and protein folding. San Francisco: W.H. Freeman. ISBN 0-7167-3268-8.
Walsh C (1979). Enzymatic reaction mechanisms. San Francisco: W. H. Freeman. ISBN 0-7167-0070-0.
Page, M. I., and Williams, A. (Eds.). Enzyme Mechanisms. Royal Society of Chemistry, 1987. ISBN 0-85186-947-5.
Bugg, T. Introduction to Enzyme and Coenzyme Chemistry. (2nd edition), Blackwell Publishing Limited, 2004. ISBN 1-4051-1452-5.
Warshel, A. Computer Modeling of Chemical Reactions in Enzymes and Solutions. John Wiley & Sons Inc., 1991. ISBN 0-471-18440-3.

Function and control of enzymes in the cell
Price, N. and Stevens, L. Fundamentals of Enzymology: Cell and Molecular Biology of Catalytic Proteins. Oxford University Press, 1999. ISBN 0-19-850229-X.
"Nutritional and Metabolic Diseases" (http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=gnd.chapter.86). Chapter of the on-line textbook Introduction to Genes and Disease from the NCBI.

Thermodynamics
"Reactions and Enzymes" (http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookEnzym.html). Chapter 10 of the on-line biology book at Estrella Mountain Community College.

Enzyme-naming conventions
Enzyme Nomenclature (http://www.chem.qmul.ac.uk/iubmb/enzyme/). Recommendations for enzyme names from the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology.
Koshland, D. The Enzymes, v. I, ch. 7. Acad. Press, New York, 1959.

Industrial applications
"History of industrial enzymes" (http://www.mapsenzymes.com/History_of_Enzymes.asp). Article about the history of industrial enzymes from the late 1900s to the present times.

External links
Structure/Function of Enzymes (http://mcdb-webarchive.mcdb.ucsb.edu/sears/biochemistry/), Web tutorial on enzyme structure and function.
Enzymes in diagnosis (http://www.science2day.info/2008/02/enzyme-test-or-cpk-test-what-is-it.html), Role of enzymes in diagnosis of diseases.
Enzyme spotlight (http://www.ebi.ac.uk/intenz/spotlight.jsp), Monthly feature at the European Bioinformatics Institute on a selected enzyme.
AMFEP (http://www.amfep.org/), Association of Manufacturers and Formulators of Enzyme Products.
BRENDA (http://www.brenda-enzymes.org/) database, a comprehensive compilation of information and literature references about all known enzymes; requires payment by commercial users.
Enzyme Structures (http://pdbe.org/ec), Explore 3-D structure data of enzymes in the Protein Data Bank.
Enzyme Structures Database (http://www.ebi.ac.uk/thornton-srv/databases/enzymes/), links to the known 3-D structure data of enzymes in the Protein Data Bank.
ExPASy enzyme (http://us.expasy.org/enzyme/) database, links to Swiss-Prot sequence data, entries in other databases and to related literature searches.
KEGG: Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/), Graphical and hypertext-based information on biochemical pathways and enzymes.
(http://www.enzyme-database.org/) enzyme database.
MACiE (http://www.ebi.ac.uk/thornton-srv/databases/MACiE/), database of enzyme reaction mechanisms.
MetaCyc database of enzymes and metabolic pathways.
Face-to-Face Interview with Sir John Cornforth, who was awarded a Nobel Prize for work on stereochemistry of enzyme-catalyzed reactions (http://www.vega.org.uk/video/programme/19), Freeview video by the Vega Science Trust.
Sigma Aldrich Enzyme Assays by Enzyme Name (http://www.sigmaaldrich.com/life-science/metabolomics/enzyme-explorer.html), Hundreds of assays sorted by enzyme name.
Bugg TD (2001). "The development of mechanistic enzymology in the 20th century". Nat Prod Rep 18 (5): 465–93. doi:10.1039/b009205n. PMID 11699881.


Active site
In biology, the active site is the part of an enzyme where substrates bind and undergo a chemical reaction.[1] Most enzymes are proteins, but RNA enzymes called ribozymes also exist. The active site of an enzyme is usually found in a cleft or pocket lined by amino acid residues (or nucleotides in ribozymes) that participate in recognition of the substrate. Residues that directly participate in the catalytic reaction mechanism are called active site residues.

Binding mechanism
There are two proposed models of how enzymes bind their substrates: the lock and key model and the induced fit model. The lock and key model assumes that the active site is a perfect fit for a specific substrate and that once the substrate binds to the enzyme no further modification is necessary; this is simplistic. The induced fit model is a development of the lock and key model and instead assumes that the active site is more flexible: certain residues (amino acids) in the active site help the enzyme locate the correct substrate, after which conformational changes may occur as the substrate is bound.

Figure: Induced fit hypothesis of enzyme action.

Chemistry
Substrates bind to the active site of the enzyme or a specificity pocket through hydrogen bonds, hydrophobic interactions, van der Waals interactions, temporary covalent bonds, or a combination of these to form the enzyme-substrate complex. Residues of the active site act as donors or acceptors of protons or other groups on the substrate to facilitate the reaction. In other words, the active site modifies the reaction mechanism in order to lower the activation energy of the reaction. The product is usually unstable in the active site due to steric hindrances that force it to be released, returning the enzyme to its initial unbound state.


External links
Catalytic Site Atlas (CSA) [2] hosted by EMBL-EBI

References
[1] IUPAC. Compendium of Chemical Terminology, 2nd ed. (the "Gold Book"). Compiled by A. D. McNaught and A. Wilkinson. Blackwell Scientific Publications, Oxford (1997). XML on-line corrected version: http://goldbook.iupac.org (2006-) created by M. Nic, J. Jirat, B. Kosata; updates compiled by A. Jenkins. ISBN 0-9678550-9-8. doi:10.1351/goldbook.
[2] http://www.ebi.ac.uk/thornton-srv/databases/CSA/

Activation energy
In chemistry, activation energy is a term introduced in 1889 by the Swedish scientist Svante Arrhenius that is defined as the energy that must be overcome in order for a chemical reaction to occur. Activation energy may also be defined as the minimum energy required to start a chemical reaction. The activation energy of a reaction is usually denoted by Ea and given in units of kilojoules per mole.

Activation energy can be thought of as the height of the potential barrier (sometimes called the energy barrier) separating two minima of potential energy (of the reactants and products of a reaction). For a chemical reaction to proceed at a reasonable rate, there should exist an appreciable number of molecules with energy equal to or greater than the activation energy.

At a more advanced level, the Arrhenius activation energy term from the Arrhenius equation is best regarded as an experimentally determined parameter that indicates the sensitivity of the reaction rate to temperature. There are two objections to associating this activation energy with the threshold barrier for an elementary reaction. First, it is often unclear whether or not the reaction proceeds in one step; threshold barriers that are averaged out over all elementary steps have little theoretical value. Second, even if the reaction being studied is elementary, a spectrum of individual collisions contributes to rate constants obtained from bulk ('bulb') experiments involving billions of molecules, with many different reactant collision geometries and angles, and different translational and (possibly) vibrational energies, all of which may lead to different microscopic reaction rates.

Figure: The sparks generated by striking steel against a flint provide the activation energy to initiate combustion in this Bunsen burner. The blue flame will sustain itself after the sparks are extinguished because the continued combustion of the flame is now energetically favorable.

Negative activation energy


In some cases, rates of reaction decrease with increasing temperature. When the rate constant still follows an approximately exponential relationship and can be fit to an Arrhenius expression, the fit yields a negative value of Ea. Elementary reactions exhibiting these negative activation energies are typically barrierless reactions, in which the reaction relies on the capture of the molecules in a potential well. Increasing the temperature reduces the probability of the colliding molecules capturing one another (more glancing collisions fail to lead to reaction because the higher momentum carries the colliding particles out of the potential well), which is expressed as a reaction cross section that decreases with increasing temperature. Such a situation no longer lends itself to direct interpretation as the height of a potential barrier.
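As an illustration of how a negative Ea can come out of an Arrhenius fit, the following minimal sketch fits ln k against 1/T by least squares. The data are synthetic, generated from an assumed Ea of -5 kJ/mol and an assumed frequency factor; the function name `arrhenius_fit` is hypothetical and not taken from any library.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)

def arrhenius_fit(temps_k, rate_constants):
    """Least-squares fit of ln k = ln A - Ea/(R*T).

    Returns (Ea in J/mol, A in the units of k).
    """
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in rate_constants]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope equals -Ea/R
    intercept = y_mean - slope * x_mean    # intercept equals ln A
    return -slope * R, math.exp(intercept)

# Synthetic rate constants that DECREASE with temperature
# (assumed Ea = -5 kJ/mol, assumed A = 1e-11; illustrative values only).
temps = [200.0, 250.0, 300.0, 350.0]
ks = [1e-11 * math.exp(5000.0 / (R * t)) for t in temps]

ea, a = arrhenius_fit(temps, ks)
print(f"Fitted Ea = {ea / 1000:.2f} kJ/mol")
```

Because the rate constants fall as the temperature rises, the slope of ln k against 1/T is positive and the fitted Ea is negative.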


Temperature independence and the relation to the Arrhenius equation


The Arrhenius equation gives the quantitative basis of the relationship between the activation energy and the rate at which a reaction proceeds. From the Arrhenius equation, the activation energy can be expressed as

$$E_a = -RT \ln\left(\frac{k}{A}\right)$$

where A is the frequency factor for the reaction, R is the universal gas constant, T is the temperature (in kelvin), and k is the reaction rate coefficient. While this equation suggests that the activation energy is dependent on temperature, in regimes in which the Arrhenius equation is valid this is cancelled by the temperature dependence of k. Thus, Ea can be evaluated from the reaction rate coefficient at any temperature (within the validity of the Arrhenius equation).
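The temperature independence described above can be checked numerically. The sketch below is illustrative only: the values of A and Ea are assumed, and the function names are hypothetical. It generates k at two temperatures from the Arrhenius equation, then recovers the same Ea from each temperature, and also from the pair of rate coefficients without knowing A.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)

def rate_coefficient(a, ea, temp_k):
    """Arrhenius equation: k = A * exp(-Ea / (R*T))."""
    return a * math.exp(-ea / (R * temp_k))

def activation_energy(k, a, temp_k):
    """Ea = -R*T*ln(k/A), the rearranged Arrhenius equation."""
    return -R * temp_k * math.log(k / a)

def activation_energy_two_point(k1, t1, k2, t2):
    """Ea from two rate coefficients, eliminating A:
    ln(k2/k1) = -(Ea/R) * (1/T2 - 1/T1)."""
    return -R * math.log(k2 / k1) / (1.0 / t2 - 1.0 / t1)

# Assumed illustrative parameters (not from the text).
A = 1.0e13       # frequency factor, 1/s
EA = 50_000.0    # activation energy, J/mol

k300 = rate_coefficient(A, EA, 300.0)
k350 = rate_coefficient(A, EA, 350.0)

ea_at_300 = activation_energy(k300, A, 300.0)
ea_at_350 = activation_energy(k350, A, 350.0)
ea_pair = activation_energy_two_point(k300, 300.0, k350, 350.0)
print(ea_at_300, ea_at_350, ea_pair)
```

The two-point form is often the more practical one, since the frequency factor A is rarely known in advance.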

Catalysis
A substance that modifies the transition state to lower the activation energy is termed a catalyst; a biological catalyst is termed an enzyme. It is important to note that a catalyst increases the rate of reaction without being consumed by it. In addition, while the catalyst lowers the activation energy, it does not change the energies of the original reactants or products. Rather, the reactant energy and the product energy remain the same and only the activation energy is altered (lowered).
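To give a sense of scale for the effect of lowering the activation energy, the sketch below estimates the ratio of catalyzed to uncatalyzed rate coefficients from the Arrhenius equation. It assumes that the frequency factor A is the same for both pathways, which is a simplification, and the two activation energies are hypothetical values chosen only for illustration.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)

def rate_enhancement(ea_uncatalyzed, ea_catalyzed, temp_k):
    """Ratio k_catalyzed / k_uncatalyzed, assuming both pathways share
    the same frequency factor A (a simplifying assumption)."""
    return math.exp((ea_uncatalyzed - ea_catalyzed) / (R * temp_k))

# Hypothetical barriers: 75 kJ/mol without and 50 kJ/mol with a catalyst.
factor = rate_enhancement(75_000.0, 50_000.0, 298.15)
print(f"Rate enhancement at 298.15 K: {factor:.3g}")
```

Because the activation energy sits in an exponent, a reduction of a few tens of kJ/mol changes the rate by several orders of magnitude at room temperature, while the energies of reactants and products are unaffected.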

Relationship with Gibbs free energy

Figure: The relationship between activation energy (Ea) and enthalpy of formation (ΔH) with and without a catalyst, plotted against the reaction coordinate. The highest energy position (peak position) represents the transition state. With the catalyst, the energy required to enter the transition state decreases, thereby decreasing the energy required to initiate the reaction.

In the Arrhenius equation, the term activation energy (Ea) is used to describe the energy required to reach the transition state. Likewise, the Eyring equation is a similar equation which also describes the rate of a reaction. Instead of using Ea, however, the Eyring equation uses the concept of Gibbs free energy and the symbol ΔG‡ to denote the Gibbs free energy of activation needed to reach the transition state.
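For comparison with the Arrhenius form, the Eyring equation can be written as follows. This is the standard textbook statement with the transmission coefficient taken as 1 (an assumption); the symbols k_B (Boltzmann constant), h (Planck constant), ΔH‡ and ΔS‡ (enthalpy and entropy of activation) do not appear in the text above and are introduced here only for the comparison.

```latex
% Arrhenius form, using the symbols defined earlier
k = A \exp\left(-\frac{E_a}{RT}\right)

% Eyring form (transmission coefficient assumed to be 1)
k = \frac{k_\mathrm{B} T}{h} \exp\left(-\frac{\Delta G^{\ddagger}}{RT}\right),
\qquad
\Delta G^{\ddagger} = \Delta H^{\ddagger} - T\,\Delta S^{\ddagger}
```

In both forms the barrier appears in the exponent, so a catalyst that lowers Ea (or ΔG‡) raises k without changing the energies of the reactants or products.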

External links
The Activation Energy of Chemical Reactions [1]


References
[1] http://chemed.chem.purdue.edu/genchem/topicreview/bp/ch22/activate.html

Oxidoreductase
In biochemistry, an oxidoreductase is an enzyme that catalyzes the transfer of electrons from one molecule, the reductant (also called the electron donor), to another, the oxidant (also called the electron acceptor). This group of enzymes usually utilizes NADP or NAD+ as cofactors.

Reactions
For example, an enzyme that catalyzed this reaction would be an oxidoreductase:

A− + B → A + B−

In this example, A is the reductant (electron donor) and B is the oxidant (electron acceptor). In biochemical reactions, the redox reactions are sometimes more difficult to see, such as this reaction from glycolysis:

Pi + glyceraldehyde-3-phosphate + NAD+ → NADH + H+ + 1,3-bisphosphoglycerate

In this reaction, NAD+ is the oxidant (electron acceptor), and glyceraldehyde-3-phosphate is the reductant (electron donor).

Nomenclature
Proper names of oxidoreductases are formed as "donor:acceptor oxidoreductase"; however, other names are much more common. The common name is "donor dehydrogenase" when possible, such as glyceraldehyde-3-phosphate dehydrogenase for the second reaction above. Common names are also sometimes formed as "acceptor reductase", such as NAD+ reductase. "Donor oxidase" is a special case where O2 is the acceptor.

Classification
Oxidoreductases are classified as EC 1 in the EC number classification of enzymes. Oxidoreductases can be further classified into 22 subclasses:
EC 1.1 includes oxidoreductases that act on the CH-OH group of donors (alcohol oxidoreductases)
EC 1.2 includes oxidoreductases that act on the aldehyde or oxo group of donors
EC 1.3 includes oxidoreductases that act on the CH-CH group of donors (CH-CH oxidoreductases)
EC 1.4 includes oxidoreductases that act on the CH-NH2 group of donors (amino acid oxidoreductases, monoamine oxidase)
EC 1.5 includes oxidoreductases that act on the CH-NH group of donors
EC 1.6 includes oxidoreductases that act on NADH or NADPH
EC 1.7 includes oxidoreductases that act on other nitrogenous compounds as donors
EC 1.8 includes oxidoreductases that act on a sulfur group of donors
EC 1.9 includes oxidoreductases that act on a heme group of donors
EC 1.10 includes oxidoreductases that act on diphenols and related substances as donors
EC 1.11 includes oxidoreductases that act on peroxide as an acceptor (peroxidases)
EC 1.12 includes oxidoreductases that act on hydrogen as donors
EC 1.13 includes oxidoreductases that act on single donors with incorporation of molecular oxygen (oxygenases)
EC 1.14 includes oxidoreductases that act on paired donors with incorporation of molecular oxygen
EC 1.15 includes oxidoreductases that act on superoxide radicals as acceptors
EC 1.16 includes oxidoreductases that oxidize metal ions
EC 1.17 includes oxidoreductases that act on CH or CH2 groups
EC 1.18 includes oxidoreductases that act on iron-sulfur proteins as donors
EC 1.19 includes oxidoreductases that act on reduced flavodoxin as a donor
EC 1.20 includes oxidoreductases that act on phosphorus or arsenic in donors
EC 1.21 includes oxidoreductases that act on X-H and Y-H to form an X-Y bond
EC 1.97 includes other oxidoreductases
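The subclass list above can be treated as a simple lookup table keyed on the second field of an EC number. The following is a minimal sketch; the dictionary and function names are hypothetical, and the descriptions are abbreviated from the list above.

```python
# Second field of an EC 1.x number -> what the subclass acts on
# (descriptions abbreviated from the subclass list above).
EC1_SUBCLASSES = {
    1: "CH-OH group of donors",
    2: "aldehyde or oxo group of donors",
    3: "CH-CH group of donors",
    4: "CH-NH2 group of donors",
    5: "CH-NH group of donors",
    6: "NADH or NADPH",
    7: "other nitrogenous compounds as donors",
    8: "sulfur group of donors",
    9: "heme group of donors",
    10: "diphenols and related substances as donors",
    11: "peroxide as an acceptor (peroxidases)",
    12: "hydrogen as donor",
    13: "single donors with incorporation of molecular oxygen (oxygenases)",
    14: "paired donors with incorporation of molecular oxygen",
    15: "superoxide radicals as acceptors",
    16: "metal ions (oxidized)",
    17: "CH or CH2 groups",
    18: "iron-sulfur proteins as donors",
    19: "reduced flavodoxin as a donor",
    20: "phosphorus or arsenic in donors",
    21: "X-H and Y-H to form an X-Y bond",
    97: "other oxidoreductases",
}

def ec1_subclass(ec_number):
    """Return the subclass description for an EC 1.x.y.z number.

    Only the first two fields are inspected, so partial numbers such as
    '1.11.1.x' are accepted. Raises ValueError for anything outside EC 1.
    """
    parts = ec_number.strip().split(".")
    if len(parts) < 2 or parts[0] != "1":
        raise ValueError(f"not an EC 1 (oxidoreductase) number: {ec_number!r}")
    subclass = int(parts[1])  # raises ValueError if not numeric
    if subclass not in EC1_SUBCLASSES:
        raise ValueError(f"unknown EC 1 subclass: 1.{subclass}")
    return EC1_SUBCLASSES[subclass]

print(ec1_subclass("1.1.3.4"))   # glucose oxidase
print(ec1_subclass("1.11.1.x"))  # peroxidases
```

For example, EC 1.1.3.4 (glucose oxidase) falls in EC 1.1 because it acts on a CH-OH group of the donor, and the "donor oxidase" naming pattern reflects O2 as the acceptor.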


External links
EC 1 Introduction [1] from the Department of Chemistry at Queen Mary, University of London

References
[1] http://www.chem.qmul.ac.uk/iubmb/enzyme/EC1/intro.html

Glucose oxidase


PDB Molecule of the Month: pdb77_1 [1]

Identifiers
EC number: 1.1.3.4 [2]
CAS number: 9001-37-0 [3]

Databases
IntEnz: IntEnz view [4]
BRENDA: BRENDA entry [5]
ExPASy: NiceZyme view [6]
KEGG: KEGG entry [7]
MetaCyc: metabolic pathway [8]
PRIAM: profile [9]
PDB structures: RCSB PDB [10], PDBe [11], PDBsum [12]
Gene Ontology: AmiGO [13] / EGO [14]

Search
PMC: articles [15]
PubMed: articles [16]
NCBI: Protein search [17]

The glucose oxidase enzyme (GOx) (EC 1.1.3.4 [18]) is an oxidoreductase that catalyses the oxidation of glucose to hydrogen peroxide and D-glucono-δ-lactone. In cells, it aids in breaking the sugar down into its metabolites. Glucose oxidase is widely used for the determination of free glucose in body fluids (diagnostics), in plant raw material, and in the food industry. It also has many applications in biotechnology, typically enzyme assays for biochemistry, including biosensors in nanotechnology.[19] It is often extracted from Aspergillus niger.


Structure
GOx is a dimeric protein, the 3D structure of which has been elucidated. The active site where glucose binds is in a deep pocket. The enzyme, like many proteins that act outside of cells, is covered with carbohydrate chains.

Activity
At pH 7, glucose exists in solution in cyclic hemiacetal form as 63.6% β-D-glucopyranose and 36.4% α-D-glucopyranose, the proportion of linear and furanose forms being negligible. Glucose oxidase binds specifically to β-D-glucopyranose and does not act on α-D-glucose. It is able to oxidise all of the glucose in solution because the equilibrium between the α and β anomers is driven towards the β side as it is consumed in the reaction.[19]

Glucose oxidase catalyzes the oxidation of β-D-glucose into D-glucono-1,5-lactone, which then hydrolyzes to gluconic acid. In order to work as a catalyst, GOx requires a cofactor, flavin adenine dinucleotide (FAD). FAD is a common component in biological oxidation-reduction (redox) reactions. Redox reactions involve a gain or loss of electrons from a molecule. In the GOx-catalyzed redox reaction, FAD works as the initial electron acceptor and is reduced to FADH2. Then FADH2 is oxidized by the final electron acceptor, molecular oxygen (O2), which can do so because it has a higher reduction potential. O2 is then reduced to hydrogen peroxide (H2O2).
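The sequence of steps described above can be summarized as a reaction scheme. This is a sketch that writes out only the species named in the text; the enzyme-bound state of the FAD cofactor and any intermediates are omitted.

```latex
\begin{aligned}
\beta\text{-D-glucose} + \mathrm{FAD} &\longrightarrow \text{D-glucono-1,5-lactone} + \mathrm{FADH_2} \\
\mathrm{FADH_2} + \mathrm{O_2} &\longrightarrow \mathrm{FAD} + \mathrm{H_2O_2} \\
\text{D-glucono-1,5-lactone} + \mathrm{H_2O} &\longrightarrow \text{gluconic acid}
\end{aligned}
```

Adding the first two steps gives the net enzymatic reaction, β-D-glucose + O2 → D-glucono-1,5-lactone + H2O2, in which FAD is regenerated and two electrons are transferred per glucose molecule.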

Applications
Glucose oxidase is widely used, coupled to a peroxidase reaction that visualizes the H2O2 formed colorimetrically, for the determination of free glucose in sera or blood plasma for diagnostics, using spectrometric assays performed manually or with automated procedures, and even point-of-use rapid assays.[19][20] Similar assays make it possible to monitor glucose levels in fermentation and bioreactors, and to control glucose in plant raw material and food products.

Enzyme electrode biosensors detect levels of glucose by keeping track of the number of electrons passed through the enzyme, by connecting it to an electrode and measuring the resulting charge. This has a possible application in nanotechnology when used in conjunction with tiny electrodes as glucose sensors for diabetics.

In manufacturing, GOx is used as an additive thanks to its oxidising effects: it strengthens dough in baking, replacing oxidants such as bromate. It also helps remove oxygen from food packaging, or D-glucose from egg white to prevent browning.

Glucose oxidase is found in honey and acts as a natural preservative. GOx at the surface of the honey reduces atmospheric O2 to hydrogen peroxide (H2O2), which acts as an antimicrobial barrier. GOx similarly acts as a bactericide in many cells (fungi, immune cells).
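The enzyme electrode measurement mentioned above, in which charge is used to count electrons, can be related to an amount of glucose through Faraday's law. The sketch below is a simplified illustration, not a description of any particular sensor: it assumes two electrons per oxidized glucose molecule (the FAD to FADH2 reduction), and the `efficiency` parameter and the function name are hypothetical additions for the example.

```python
FARADAY = 96485.33212        # Faraday constant, C/mol of electrons
ELECTRONS_PER_GLUCOSE = 2    # FAD -> FADH2 accepts two electrons per glucose

def glucose_moles_from_charge(charge_coulombs, efficiency=1.0):
    """Estimate moles of glucose oxidized from the charge collected.

    efficiency is the assumed fraction of transferred electrons that
    actually reach the electrode (1.0 means every electron is counted).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return charge_coulombs / (ELECTRONS_PER_GLUCOSE * FARADAY * efficiency)

# Hypothetical reading: 1.93 mC of charge collected at the electrode.
moles = glucose_moles_from_charge(1.93e-3)
print(f"{moles * 1e9:.2f} nmol of glucose oxidized")
```

In practice a sensor would be calibrated against known glucose standards rather than relying on ideal electron collection, which is why the efficiency is left as an explicit parameter here.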

Related enzymes: Notatin and other names


Notatin, extracted from antibacterial cultures of Penicillium notatum, was originally named Penicillin A, but was renamed to avoid confusion with penicillin.[21] Notatin was shown to be identical to Penicillin B and glucose oxidase, enzymes extracted from molds other than P. notatum;[22] it is now generally known as glucose oxidase.[20] Early experiments showed that notatin exhibits in vitro antibacterial activity (in the presence of glucose) due to hydrogen peroxide formation.[21][23] In vivo tests showed that notatin was not effective in protecting rodents from Streptococcus haemolyticus, Staphylococcus aureus, or Salmonella, and caused severe tissue damage at some doses.[23]


References
[1] http://www.rcsb.org/pdb/static.do?p=education_discussion/molecule_of_the_month/pdb77_1.html
[2] http://www.chem.qmul.ac.uk/iubmb/enzyme/EC1/1/3/4.html
[3] http://toolserver.org/~magnus/cas.php?language=en&cas=9001-37-0&title=
[4] http://www.ebi.ac.uk/intenz/query?cmd=SearchEC&ec=1.1.3.4
[5] http://www.brenda-enzymes.org/php/result_flat.php4?ecno=1.1.3.4
[6] http://www.expasy.org/enzyme/1.1.3.4
[7] http://www.genome.ad.jp/dbget-bin/www_bget?enzyme+1.1.3.4
[8] http://biocyc.org/META/substring-search?type=NIL&object=1.1.3.4
[9] http://priam.prabi.fr/cgi-bin/PRIAM_profiles_CurrentRelease.pl?EC=1.1.3.4
[10] http://www.rcsb.org/pdb/search/smartSubquery.do?smartSearchSubtype=EnzymeClassificationQuery&Enzyme_Classification=1.1.3.4
[11] http://www.ebi.ac.uk/pdbe-srv/PDBeXplore/enzyme/?ec=1.1.3.4
[12] http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/enzymes/GetPage.pl?ec_number=1.1.3.4
[13] http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?query=GO:0046562&view=details
[14] http://www.ebi.ac.uk/ego/DisplayGoTerm?id=GO:0046562&format=normal
[15] http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&term=1.1.3.4%5BEC/RN%20Number%5D%20AND%20pubmed%20pmc%20local%5Bsb%5D
[16] http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&term=1.1.3.4%5BEC/RN%20Number%5D
[17] http://www.ncbi.nlm.nih.gov/protein?term=1.1.3.4%5BEC/RN%20Number%5D
[18] http://enzyme.expasy.org/EC/1.1.3.4
[19] Technical sheet of Glucose Oxidase (http://www.interchim.fr/ft/1/12718A.pdf), Interchim.
[20] Julio Raba and Horacio A. Mottola (1995). "Glucose Oxidase as an Analytical Reagent" (http://www.biosensing.net/EBLA/Corso/Lezione 01/GOD.PDF). Critical Reviews in Analytical Chemistry 25 (1): 1–42. doi:10.1080/10408349508050556.
[21] Coulthard CE, Michaelis R, Short WF, Sykes G (1945). "Notatin: an anti-bacterial glucose-aerodehydrogenase from Penicillium notatum Westling and Penicillium resticulosum sp. nov". Biochem. J. 39 (1): 24–36. PMC 1258144. PMID 16747849.
[22] Keilin D, Hartree EF (January 1952). "Specificity of glucose oxidase (notatin)". Biochem. J. 50 (3): 331–41. PMC 1197657. PMID 14915954.
[23] Broom WA, Coulthard CE, Gurd MR, Sharpe ME (December 1946). "Some pharmacological and chemotherapeutic properties of notatin". Br J Pharmacol Chemother 1 (4): 225–233. PMC 1509745. PMID 19108091.

External links
"Glucose Oxidase: A much used and much loved enzyme in biosensors" (http://www-biol.paisley.ac.uk/ marco/enzyme_electrode/chapter3/chapter3_page1.htm) at University of Paisley Glucose+Oxidase (http://www.nlm.nih.gov/cgi/mesh/2011/MB_cgi?mode=&term=Glucose+Oxidase) at the US National Library of Medicine Medical Subject Headings (MeSH)

Peroxidase
Glutathione Peroxidase 1 (figure)
Identifiers
Symbol: peroxidase
Pfam: PF00141 [1]
InterPro: IPR002016 [2]
PROSITE: PDOC00394 [3]
SCOP: 1hsr [4]
SUPERFAMILY: 1hsr [5]
Available protein structures
Pfam: structures [6]
PDB: RCSB PDB [7]; PDBe [8]
PDBsum: structure summary [9]

Peroxidases (EC number 1.11.1.x [10]) are a large family of enzymes that typically catalyze a reaction of the form:

ROOR' + electron donor (2 e-) + 2 H+ → ROH + R'OH

For many of these enzymes the optimal substrate is hydrogen peroxide, but others are more active with organic hydroperoxides such as lipid peroxides. Peroxidases can contain a heme cofactor in their active sites, or alternatively redox-active cysteine or selenocysteine residues. The nature of the electron donor depends strongly on the structure of the enzyme. For example, horseradish peroxidase can use a variety of organic compounds as electron donors and acceptors: it has an accessible active site, and many compounds can reach the site of the reaction. In contrast, an enzyme such as cytochrome c peroxidase has a very closed active site, so the compounds that donate electrons to it are very specific.
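
As a worked instance of the general reaction scheme, setting R = R' = H gives the reduction of hydrogen peroxide, which the text names as the optimal substrate for many of these enzymes. The LaTeX below is only a sketch of that substitution; the symbol AH2 for a generic two-electron, two-proton donor is an illustrative assumption and does not come from the source.

```latex
% General peroxidase reaction, as given in the text
\[ \mathrm{ROOR'} + 2\,e^- + 2\,\mathrm{H^+} \longrightarrow \mathrm{ROH} + \mathrm{R'OH} \]

% Special case R = R' = H: hydrogen peroxide is reduced to two molecules of water
\[ \mathrm{H_2O_2} + 2\,e^- + 2\,\mathrm{H^+} \longrightarrow 2\,\mathrm{H_2O} \]

% Same case with a hypothetical donor AH_2 supplying both electrons and both protons
\[ \mathrm{H_2O_2} + \mathrm{AH_2} \longrightarrow 2\,\mathrm{H_2O} + \mathrm{A} \]
```

Each line balances in atoms and charge, which is a quick check that the arrow sits between the protons and the alcohol products.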

While the exact mechanisms have yet to be elucidated, peroxidases are known to play a part in increasing a plant's defenses against pathogens.[11] Peroxidases are sometimes used as histological markers. Cytochrome c peroxidase is used as a soluble, easily purified model for cytochrome c oxidase.

The glutathione peroxidase family consists of 8 known human isoforms. Glutathione peroxidases use glutathione as an electron donor and are active with both hydrogen peroxide and organic hydroperoxide substrates. Gpx1, Gpx2, Gpx3, and Gpx4 have been shown to be selenium-containing enzymes, whereas Gpx6 is a selenoprotein in humans with cysteine-containing homologues in rodents.

Amyloid beta, when bound to heme, has been shown to have peroxidase activity.[12]

A typical group of peroxidases are the haloperoxidases. This group is able to form reactive halogen species and, as a result, natural organohalogen substances.

A majority of peroxidase protein sequences can be found in the PeroxiBase database.

Applications
Peroxidase can be used for the treatment of industrial waste waters. For example, phenols, which are important pollutants, can be removed by enzyme-catalyzed polymerization using horseradish peroxidase: the phenols are oxidized to phenoxy radicals, which then react to form polymers and oligomers that are less toxic than the phenols themselves. Furthermore, peroxidases can be an alternative to a number of harsh chemicals, eliminating harsh reaction conditions. The use of peroxidase has been investigated in many manufacturing processes, including those for adhesives, computer chips, car parts, and linings of drums and cans.

References
[1] http://pfam.sanger.ac.uk/family?acc=PF00141
[2] http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=IPR002016
[3] http://www.expasy.org/cgi-bin/prosite-search-ac?PDOC00394
[4] http://scop.mrc-lmb.cam.ac.uk/scop/search.cgi?tlev=fa;&amp;pdb=1hsr
[5] http://supfam.org/SUPERFAMILY/cgi-bin/search.cgi?search_field=1hsr
[6] http://pfam.sanger.ac.uk/family/PF00141?tab=pdbBlock
[7] http://www.rcsb.org/pdb/search/smartSubquery.do?smartSearchSubtype=PfamIdQuery&pfamID=PF00141
[8] http://www.ebi.ac.uk/pdbe-srv/PDBeXplore/pfam/?pfam=PF00141
[9] http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPfamStr.pl?pfam_id=PF00141
[10] http://www.chem.qmul.ac.uk/iubmb/enzyme/EC1/11/1/
[11] Karthikeyan M et al. (December 2005). "Induction of resistance in host against the infection of leaf blight pathogen (Alternaria palandui) in onion (Allium cepa var aggregatum)". Indian J Biochem Biophys 42 (6): 371-7. PMID 16955738.
[12] Hani Atamna, Kathleen Boyle (21 February 2006). "Amyloid-{beta} peptide binds with heme to form a peroxidase: Relationship to the cytopathologies of Alzheimer's disease". Proceedings of the National Academy of Science 103 (9): 3381-3386. doi:10.1073/pnas.0600134103. PMID 16492752.

Horseradish peroxidase
Horseradish peroxidase C1 [1] (figure)
Identifiers
Organism: Armoracia rusticana [2]
Symbol: Peroxidase C1A
Alt. symbols: PRXC1A
PDB: 1GWU [3] (More structures [4])
UniProt: P00433 [5]
Other data
EC number: 1.11.1.7 [6]

The enzyme horseradish peroxidase (HRP), found in horseradish, is used extensively in biochemistry applications primarily for its ability to amplify a weak signal and increase detectability of a target molecule.

Structure
The structure of the enzyme was first solved by X-ray crystallography in 1997[7] and has since been solved several times with various substrates.[8] It is an all alpha-helical protein which binds heme as a cofactor.

Substrates
Alone, the HRP enzyme, or conjugates thereof, is of little value; its presence must be made visible using a substrate that, when oxidized by HRP using hydrogen peroxide as the oxidizing agent, yields a characteristic change that is detectable by spectrophotometric methods.[9] Numerous substrates for the horseradish peroxidase enzyme have been described and commercialized to exploit the desirable features of HRP. These substrates fall into several distinct categories. HRP catalyzes the conversion of chromogenic substrates (e.g., TMB, DAB, ABTS) into coloured products, and produces light when acting on chemiluminescent substrates (e.g. ECL).

Applications
Horseradish peroxidase is a 44,173.9-dalton glycoprotein with 6 lysine residues which can be conjugated to a labeled molecule. It produces a coloured, fluorimetric, or luminescent derivative of the labeled molecule when incubated with a proper substrate, allowing it to be detected and quantified.

HRP is often used in conjugates (molecules that have been joined genetically or chemically) to determine the presence of a molecular target. For example, an antibody conjugated to HRP may be used to detect a small amount of a specific protein in a western blot. Here, the antibody provides the specificity to locate the protein of interest, and the HRP enzyme, in the presence of a substrate, produces a detectable signal.[10] Horseradish peroxidase is also commonly used in techniques such as ELISA and immunohistochemistry due to its monomeric nature and the ease with which it produces coloured products. Peroxidase, a heme-containing oxidoreductase, is a commercially important enzyme which catalyses the reductive cleavage of hydrogen peroxide by an electron donor.

Horseradish peroxidase is ideal in many respects for these applications because it is smaller, more stable, and less expensive than other popular alternatives such as alkaline phosphatase. It also has a high turnover rate that allows generation of strong signals in a relatively short time span.

Moreover, "In recent years the technique of marking neurons with the enzyme horseradish peroxidase has become a major tool. In its brief history, this method has probably been used by more neurobiologists than have used the Golgi stain since its discovery in 1870."[11]

Enhanced chemiluminescence (ECL)


Horseradish peroxidase catalyses the oxidation of luminol to 3-aminophthalate via several intermediates. The reaction is accompanied by emission of low-intensity light at 428 nm. However, in the presence of certain chemicals, the light emitted is enhanced up to 1000-fold, making it easier to detect and increasing the sensitivity of the reaction. This enhancement of light emission is called enhanced chemiluminescence (ECL). Several enhancers can be used, but the most effective are modified phenols, especially p-iodophenol. The intensity of light is a measure of the number of enzyme molecules reacting and thus of the amount of hybrid. ECL is simple to set up and is sensitive, detecting about 0.5 pg of nucleic acid in Southern blots and in northern blots.

Detection by chemiluminescent substrates has several advantages over chromogenic substrates. The sensitivity is 10- to 100-fold greater, and light emission can be quantified over a wide dynamic range, whereas the range for coloured precipitates is much more limited, about one order of magnitude less. Stripping filters is also much easier when chemiluminescent substrates are used.

References
[1] PDB 1w4y (http://www.rcsb.org/pdb/explore/explore.do?structureId=1w4y); Carlsson GH, Nicholls P, Svistunenko D, Berglund GI, Hajdu J (January 2005). "Complexes of horseradish peroxidase with formate, acetate, and carbon monoxide". Biochemistry 44 (2): 635-42. doi:10.1021/bi0483211. PMID 15641789.
[2] http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=3704&rn=1
[3] http://www.pdb.org/pdb/cgi/explore.cgi?pdbId=1GWU
[4] http://www.ebi.ac.uk/pdbe/searchResults.html?display=both&term=P00433
[5] http://www.uniprot.org/uniprot/P00433
[6] http://www.genome.jp/dbget-bin/www_bget?enzyme+1.11.1.7
[7] PDB 1GWU (http://www.rcsb.org/pdb/explore/explore.do?structureId=1GWU); Gajhede M, Schuller DJ, Henriksen A, Smith AT, Poulos TL (December 1997). "Crystal structure of horseradish peroxidase C at 2.15 Å resolution". Nat. Struct. Biol. 4 (12): 1032-8. PMID 9406554.
[8] "Peroxidase C1A Related PDB sequences" (http://www.ebi.ac.uk/pdbe-apps/widgets/unipdb?uniprot=P00433). UniPDB. European Bioinformatics Institute.
[9] Veitch NC (February 2004). "Horseradish peroxidase: a modern view of a classic enzyme". Phytochemistry 65 (3): 249-59. doi:10.1016/j.phytochem.2003.10.022. PMID 14751298.
[10] Chau YP, Lu KS (1995). "Investigation of the blood-ganglion barrier properties in rat sympathetic ganglia by using lanthanum ion and horseradish peroxidase as tracers". Acta Anat (Basel) 153 (2): 135-44. doi:10.1159/000313647. PMID 8560966.
[11] Lichtman JW, Purves D (1985). "Cell marking with horseradish peroxidase" (http://books.google.com/books?id=t9JqAAAAMAAJ&q=In+recent+years+the+technique+of+marking+neurons+with+the+enzyme+horseradish+peroxidase+has+become+a+major+tool&dq=In+recent+years+the+technique+of+marking+neurons+with+the+enzyme+horseradish+peroxidase+has+become+a+major+tool&hl=en&sa=X&ei=-rL_T67zC6ni4QSKndWrCA&redir_esc=y). Principles of neural development. Sunderland, Mass: Sinauer Associates. p. 114. ISBN 0-87893-744-7.

External links
Horseradish peroxidase (http://www.nlm.nih.gov/cgi/mesh/2011/MB_cgi?mode=&term=Horseradish+peroxidase) at the US National Library of Medicine Medical Subject Headings (MeSH)

Inclusion body
Inclusion bodies are nuclear or cytoplasmic aggregates of stainable substances, usually proteins. They typically represent sites of viral multiplication in a bacterium or a eukaryotic cell and usually consist of viral capsid proteins. Inclusion bodies can also be hallmarks of genetic diseases, as in the case of neuronal inclusion bodies in disorders like frontotemporal dementia and Parkinson's disease.[1]

Composition
Inclusion bodies have a non-unit lipid membrane. Protein inclusion bodies are classically thought to contain misfolded protein. However, this has recently been contested: green fluorescent protein will sometimes fluoresce in inclusion bodies, which indicates some semblance of the native structure, and researchers have recovered folded protein from inclusion bodies.[2][3][4]

Mechanism of formation
When genes from one organism are expressed in another the resulting protein sometimes forms inclusion bodies. This is often true when large evolutionary distances are crossed: a cDNA isolated from Eukarya for example, and expressed as a recombinant gene in a prokaryote risks the formation of the inactive aggregates of protein known as inclusion bodies. While the cDNA may properly code for a translatable mRNA, the protein that results will emerge in a foreign microenvironment. This often has fatal effects, especially if the intent of cloning is to produce a biologically active protein. For example, eukaryotic systems for carbohydrate modification and membrane transport are not found in prokaryotes. The internal microenvironment of a prokaryotic cell (pH, osmolarity) may differ from that of the original source of the gene. Mechanisms for folding a protein may also be absent, and hydrophobic residues that normally would remain buried may be exposed and available for interaction with similar exposed sites on other ectopic proteins. Processing systems for the cleavage and removal of internal peptides would also be absent in bacteria. The initial attempts to clone insulin in a bacterium suffered all of these deficits. In addition, the fine controls that may keep the concentration of a protein low will also be missing in a prokaryotic cell, and overexpression can result in filling a cell with ectopic protein that, even if it were properly folded, would precipitate by saturating its environment.

Viral inclusion bodies


Examples of viral inclusion bodies in animals are:

Intracytoplasmic eosinophilic
• Negri bodies in rabies
• Guarnieri bodies in smallpox
• Henderson-Peterson bodies in molluscum contagiosum

Intranuclear acidophilic
• Cowdry type A in herpes simplex virus and varicella zoster virus
• Torres bodies in yellow fever
• Cowdry type B in polio

Intranuclear basophilic
• Cowdry type B in adenovirus
• "Owl eyes" in cytomegalovirus

Both intranuclear and intracytoplasmic
• Warthin-Finkeldey bodies in measles

Examples of viral inclusion bodies in plants [5] include aggregations of virus particles (like those for Cucumber mosaic virus [6]) and aggregations of viral proteins (like the cylindrical inclusions of potyviruses [7]). Depending on the plant and the plant virus family, these inclusions can be found in epidermal cells, mesophyll cells, and stomatal cells when plant tissue is properly stained [8].

Inclusion bodies in Erythrocytes


Normally a red blood cell does not contain inclusions in the cytoplasm. However, inclusions may be seen in certain hematologic disorders. There are three kinds of erythrocyte inclusions:

1. Developmental organelles
   1. Howell-Jolly bodies: small, round fragments of the nucleus that result from karyorrhexis or nuclear disintegration of the late reticulocyte and stain reddish-blue with Wright stain.
   2. Basophilic stippling: fine or coarse, deep blue to purple staining inclusions that appear in erythrocytes on a dried Wright stain.
   3. Pappenheimer bodies: siderotic granules, which are small, irregular, dark-staining granules that appear near the periphery of a young erythrocyte in a Wright stain.
   4. Polychromatophilic red cells: young red cells that no longer have a nucleus but still contain some RNA.
   5. Cabot rings: ring-like structures that may appear in erythrocytes in megaloblastic anemia or in severe anemias, lead poisoning, and dyserythropoiesis, in which erythrocytes are destroyed before being released from the bone marrow.
2. Abnormal hemoglobin precipitation
   1. Heinz bodies: round, refractile inclusions not visible on a Wright stain film. They are best identified by supravital staining with basic dyes.
   2. Hemoglobin H inclusions: seen in alpha thalassemia; greenish-blue inclusion bodies appear in many erythrocytes after four drops of blood are incubated with 0.5 mL of brilliant cresyl blue for 20 minutes at 37 °C.
3. Protozoan inclusions
   1. Malaria
   2. Babesia

Current problems with the isolation of proteins from bacterial inclusion bodies
70-80% of recombinant proteins expressed in E. coli are contained in inclusion bodies (i.e., protein aggregates). The purification of the expressed proteins from inclusion bodies usually requires two main steps: extraction of inclusion bodies from the bacteria, followed by solubilisation of the purified inclusion bodies. This is considered labour-intensive, time-consuming and not cost-effective.

Pseudo-inclusions
Pseudo-inclusions are invaginations of the cytoplasm into the cell nuclei, which may give the appearance of intranuclear inclusions. They may appear in papillary thyroid carcinoma.[9]

See Also
JUNQ and IPOD

References
[1] Cruts, M; Gijselinck, I, van der Zee, J, Engelborghs, S, Wils, H, Pirici, D, Rademakers, R, Vandenberghe, R, Dermaut, B, Martin, JJ, van Duijn, C, Peeters, K, Sciot, R, Santens, P, De Pooter, T, Mattheijssens, M, Van den Broeck, M, Cuijt, I, Vennekens, K, De Deyn, PP, Kumar-Singh, S, Van Broeckhoven, C (2006-08-24). "Null mutations in progranulin cause ubiquitin-positive frontotemporal dementia linked to chromosome 17q21". Nature 442 (7105): 920-4. doi:10.1038/nature05017. PMID 16862115.
[2] Biochem Biophys Res Comm 328 (2005) 189-197
[3] Protein Eng 7 (1994) 131-136
[4] Biochem Biophys Res Comm 312 (2003) 1383-1386
[5] http://plantpath.ifas.ufl.edu/pdc/Inclusionpage/Florvirus.html
[6] http://plantpath.ifas.ufl.edu/pdc/Inclusionpage/CMV/CucMoInc.html
[7] http://plantpath.ifas.ufl.edu/pdc/Inclusionpage/Poty/poty.html
[8] http://plantpath.ifas.ufl.edu/pdc/Inclusionpage/Howto.html
[9] Chapter 20 in: Mitchell, Richard Sheppard; Kumar, Vinay; Abbas, Abul K.; Fausto, Nelson. Robbins Basic Pathology. Philadelphia: Saunders. ISBN 1-4160-2973-7. 8th edition.

Protein folding
Protein folding is the process by which a protein structure assumes its functional shape or conformation. It is the physical process by which a polypeptide folds into its characteristic and functional three-dimensional structure from random coil.[1] Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of mRNA to a linear chain of amino acids. This polypeptide lacks any developed three-dimensional structure (the left hand side of the neighboring figure). Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein (the right hand side of the figure), known as the native state. The resulting three-dimensional structure is determined by the amino acid sequence (Anfinsen's dogma).[2]

Protein before and after folding.

The correct three-dimensional structure is essential to function, although some parts of functional proteins may remain unfolded.[3] Failure to fold into native structure generally produces inactive proteins, but in some instances misfolded proteins have modified or toxic functionality. Several neurodegenerative and other diseases are believed to result from the accumulation of amyloid fibrils formed by misfolded proteins.[4] Many allergies are caused by the folding of the proteins, for the immune system does not produce antibodies for certain protein structures.[5]

Results of protein folding.

Known facts
Relationship between folding and amino acid sequence
The amino-acid sequence of a protein determines its native conformation.[6] A protein molecule folds spontaneously during or after biosynthesis. While these macromolecules may be regarded as "folding themselves", the process also depends on the solvent (water or lipid bilayer),[7] the concentration of salts, the temperature, and the presence of molecular chaperones.

Folded proteins usually have a hydrophobic core in which side-chain packing stabilizes the folded state, and charged or polar side chains occupy the solvent-exposed surface where they interact with surrounding water. Minimizing the number of hydrophobic side-chains exposed to water is an important driving force behind the folding process.[8] Formation of intramolecular hydrogen bonds provides another important contribution to protein stability.[9] The strength of hydrogen bonds depends on their environment; thus, H-bonds enveloped in a hydrophobic core contribute more than H-bonds exposed to the aqueous environment to the stability of the native state.[10]

Illustration of the main driving force behind protein structure formation. In the compact fold (to the right), the hydrophobic amino acids (shown as black spheres) are in general shielded from the solvent.

The process of folding often begins co-translationally, so that the N-terminus of the protein begins to fold while the C-terminal portion of the protein is still being synthesized by the ribosome. Specialized proteins called chaperones assist in the folding of other proteins.[11] A well studied example is the bacterial GroEL system, which assists in the folding of globular proteins. In eukaryotic organisms chaperones are known as heat shock proteins.
Although most globular proteins are able to assume their native state unassisted, chaperone-assisted folding is often necessary in the crowded intracellular environment to prevent aggregation; chaperones are also used to prevent misfolding and aggregation that may occur as a consequence of exposure to heat or other changes in the cellular environment.

There are two models of protein folding that are currently being confirmed:
1. The diffusion-collision model, in which a nucleus is formed, then the secondary structure is formed, and finally these secondary structures collide and pack tightly together.
2. The nucleation-condensation model, in which the secondary and tertiary structures of the protein are made at the same time.
Recent studies have shown that some proteins show characteristics of both of these folding models.

For the most part, scientists have been able to study many identical molecules folding together en masse. At the coarsest level, it appears that in transitioning to the native state, a given amino acid sequence takes roughly the same route and proceeds through roughly the same intermediates and transition states. Often folding involves first the establishment of regular secondary and supersecondary structures, in particular alpha helices and beta sheets, and afterward tertiary structure. Formation of quaternary structure usually involves the "assembly" or "coassembly" of subunits that have already folded. The regular alpha helix and beta sheet structures fold rapidly because they are stabilized by intramolecular hydrogen bonds, as was first characterized by Linus Pauling. Protein folding may involve covalent bonding in the form of disulfide bridges formed between two cysteine residues or the formation of metal clusters. Shortly before settling into their more energetically favourable native conformation, molecules may pass through an intermediate "molten globule" state.

The essential fact of folding, however, remains that the amino acid sequence of each protein contains the information that specifies both the native structure and the pathway to attain that state. This is not to say that nearly identical amino acid sequences always fold similarly.[12] Conformations differ based on environmental factors as well; similar proteins fold differently based on where they are found. Folding is a spontaneous process independent of energy inputs from nucleoside triphosphates. The passage to the folded state is mainly guided by hydrophobic interactions, formation of intramolecular hydrogen bonds, and van der Waals forces, and it is opposed by conformational entropy.
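
The balance between the interactions that guide folding and the conformational entropy that opposes it can be written with the standard Gibbs relation. The LaTeX below is a sketch based on textbook thermodynamics rather than on anything stated in the source; in particular, splitting the entropy change into a chain term and a solvent term is an illustrative assumption.

```latex
% Folding is spontaneous when the free energy change is negative
\[ \Delta G_{\mathrm{fold}} = \Delta H_{\mathrm{fold}} - T\,\Delta S_{\mathrm{fold}} < 0 \]

% Assumed decomposition of the entropy change into chain and solvent parts
\[ \Delta S_{\mathrm{fold}} = \Delta S_{\mathrm{chain}} + \Delta S_{\mathrm{solvent}},
   \qquad \Delta S_{\mathrm{chain}} < 0 \]
```

In this simplified picture, the negative chain term is the conformational entropy cost of restricting the polypeptide to one structure, and folding proceeds only when the enthalpy and solvent terms together outweigh it.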

Disruption of the native state


Under some conditions proteins will not fold into their biochemically functional forms.[13] Temperatures above or below the range that cells tend to live in will cause thermally unstable proteins to unfold or "denature" (this is why boiling makes an egg white turn opaque). High concentrations of solutes, extremes of pH, mechanical forces, and the presence of chemical denaturants can do the same. Protein thermal stability is far from constant, however. For example, hyperthermophilic bacteria have been found that grow at temperatures as high as 122 °C,[14] which of course requires that their full complement of vital proteins and protein assemblies be stable at that temperature or above.

A fully denatured protein lacks both tertiary and secondary structure, and exists as a so-called random coil. Under certain conditions some proteins can refold; however, in many cases, denaturation is irreversible.[15] Cells sometimes protect their proteins against the denaturing influence of heat with enzymes known as chaperones or heat shock proteins, which assist other proteins both in folding and in remaining folded. Some proteins never fold in cells at all except with the assistance of chaperone molecules, which either isolate individual proteins so that their folding is not interrupted by interactions with other proteins or help to unfold misfolded proteins, giving them a second chance to refold properly. This function is crucial to prevent the risk of precipitation into insoluble amorphous aggregates.

Incorrect protein folding and neurodegenerative disease


Aggregated proteins are associated with prion-related illnesses such as Creutzfeldt-Jakob disease and bovine spongiform encephalopathy (mad cow disease), amyloid-related illnesses such as Alzheimer's disease and familial amyloid cardiomyopathy or polyneuropathy,[16] as well as intracytoplasmic aggregation diseases such as Huntington's and Parkinson's disease.[4][17] These age-onset degenerative diseases are associated with the aggregation of misfolded proteins into insoluble, extracellular aggregates and/or intracellular inclusions including cross-beta sheet amyloid fibrils. It is not completely clear whether the aggregates are the cause or merely a reflection of the loss of protein homeostasis (the balance between synthesis, folding, aggregation and protein turnover). However, the recent European Medicines Agency approval of Tafamidis or Vyndaqel (a kinetic stabilizer of tetrameric transthyretin) for the treatment of the transthyretin amyloid diseases suggests that it is the process of amyloid fibril formation and not the fibrils themselves that causes the degeneration of post-mitotic tissue in human amyloid diseases.[18]

Misfolding and excessive degradation instead of folding and function lead to a number of proteopathy diseases such as antitrypsin-associated emphysema, cystic fibrosis and the lysosomal storage diseases, where loss of function is the origin of the disorder. While protein replacement therapy has historically been used to correct the latter disorders, an emerging approach is to use pharmaceutical chaperones to fold mutated proteins to render them functional.

Effect of external factors on the folding of proteins


Several external factors, such as temperature, external fields (electric, magnetic),[19] molecular crowding,[20] and limitation of space, can have a large influence on the folding of proteins.[21] Modification of the local minima by external factors can also induce modifications of the folding trajectory. Protein folding is a very finely tuned process. Hydrogen bonding between different atoms provides the force required. Hydrophobic interactions between hydrophobic amino acids pack the hydrophobic residues.

The Levinthal paradox and kinetics


Levinthal's paradox is a thought experiment, also constituting a self-reference in the theory of protein folding. In 1969, Cyrus Levinthal noted that, because of the very large number of degrees of freedom in an unfolded polypeptide chain, the molecule has an astronomical number of possible conformations. An estimate of 3^300 or 10^143 was made in one of his papers. The Levinthal paradox[22] observes that if a protein were folded by sequential sampling of all possible conformations, it would take an astronomical amount of time to do so, even if the conformations were sampled at a rapid rate (on the nanosecond or picosecond scale). Based upon the observation that proteins fold much faster than this, Levinthal then proposed that a random conformational search does not occur, and the protein must, therefore, fold through a series of meta-stable intermediate states.

The duration of the folding process varies dramatically depending on the protein of interest. When studied outside the cell, the slowest folding proteins require many minutes or hours to fold, primarily due to proline isomerization, and must pass through a number of intermediate states, like checkpoints, before the process is complete.[23] On the other hand, very small single-domain proteins with lengths of up to a hundred amino acids typically fold in a single step.[24] Time scales of milliseconds are the norm, and the very fastest known protein folding reactions are complete within a few microseconds.[25]
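
The scale of the paradox is easy to check numerically. The Python sketch below works in base-10 logarithms and reads the estimate of 3^300 conformations as 300 independent degrees of freedom with 3 states each; that reading, the picosecond sampling rate, and the function names are assumptions made for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.16e7 s


def log10_conformations(n_angles: int, states_per_angle: int) -> float:
    """Return log10 of states_per_angle ** n_angles without forming the huge number."""
    return n_angles * math.log10(states_per_angle)


def log10_search_years(n_angles: int, states_per_angle: int,
                       samples_per_second: float) -> float:
    """Return log10 of the years needed to visit every conformation exactly once."""
    log10_seconds = (log10_conformations(n_angles, states_per_angle)
                     - math.log10(samples_per_second))
    return log10_seconds - math.log10(SECONDS_PER_YEAR)


if __name__ == "__main__":
    # Assumed inputs: 300 degrees of freedom, 3 states each,
    # one conformation sampled per picosecond (1e12 per second).
    n_conf = log10_conformations(300, 3)
    years = log10_search_years(300, 3, 1e12)
    print(f"conformations: about 10^{n_conf:.1f}")
    print(f"exhaustive search: about 10^{years:.1f} years")
```

Even at picosecond sampling the exponent of the search time stays above 100, whereas the age of the universe is on the order of 10^10 years, which is why a purely random search cannot account for folding on millisecond time scales.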

Experimental techniques for studying protein folding


While inferences about protein folding can be made through mutation studies, experimental techniques for studying protein folding typically rely on the gradual unfolding or folding of a solution of proteins and observing conformational changes using standard non-crystallographic techniques for observing protein structure.

Protein nuclear magnetic resonance spectroscopy


Protein folding is routinely studied using NMR spectroscopy, for example by monitoring hydrogen-deuterium exchange of partially folded intermediates.

Circular dichroism
Circular dichroism is one of the most general and basic tools to study protein folding. Circular dichroism spectroscopy measures the absorption of circularly polarized light. In proteins, structures such as alpha helices and beta sheets are chiral, and thus absorb such light. The absorption of this light acts as a marker of the degree of foldedness of the protein ensemble. This technique has been used to measure equilibrium unfolding of the protein by measuring the change in this absorption as a function of denaturant concentration or temperature. A denaturant melt measures the free energy of unfolding as well as the protein's m value, or denaturant dependence. A temperature melt measures the melting temperature (Tm) of the protein. This type of spectroscopy can also be combined with fast-mixing devices, such as stopped flow, to measure protein folding kinetics and to generate chevron plots.
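
A denaturant melt of this kind is commonly analysed with a two-state model in which the free energy of unfolding falls linearly with denaturant concentration. The Python sketch below assumes that model (often called linear extrapolation), which the source does not name; the function names and the example values for the free energy in water and the m value are hypothetical.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)


def unfolding_free_energy(dg_water: float, m_value: float, denaturant: float) -> float:
    """Free energy of unfolding (kcal/mol) at a denaturant concentration (mol/L).

    Assumes the linear model dG = dG(water) - m * [denaturant].
    """
    return dg_water - m_value * denaturant


def fraction_unfolded(dg_water: float, m_value: float, denaturant: float,
                      temperature: float = 298.15) -> float:
    """Fraction of molecules unfolded at equilibrium in a two-state model."""
    dg = unfolding_free_energy(dg_water, m_value, denaturant)
    k_unfold = math.exp(-dg / (GAS_CONSTANT * temperature))  # K = [U]/[N]
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_water: float, m_value: float) -> float:
    """Denaturant concentration at which half the molecules are unfolded."""
    return dg_water / m_value


if __name__ == "__main__":
    # Hypothetical protein: dG(water) = 5 kcal/mol, m = 2 kcal/(mol*M)
    for conc in (0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0):
        print(f"{conc:.1f} M denaturant: fraction unfolded = "
              f"{fraction_unfolded(5.0, 2.0, conc):.4f}")
```

In practice the two parameters would be obtained by fitting the measured signal against denaturant concentration; the sketch only runs the model forward to show how the m value sets the steepness of the transition and how the midpoint follows from the ratio of the two parameters.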

Dual polarisation interferometry


Dual polarisation interferometry is a surface-based technique for measuring the optical properties of molecular layers. When used to characterise protein folding, it measures the conformation by determining the overall size of a monolayer of the protein and its density in real time at sub-Angstrom resolution. Although the technique works in real time, measurement of the kinetics of protein folding is limited to processes that occur slower than ~10 Hz. As with circular dichroism, the stimulus for folding can be a denaturant or temperature.

Vibrational circular dichroism of proteins


The more recent developments of vibrational circular dichroism (VCD) techniques for proteins, currently involving Fourier transform (FFT) instruments, provide powerful means for determining protein conformations in solution even for very large protein molecules. Such VCD studies of proteins are often combined with X-ray diffraction of protein crystals, FT-IR data for protein solutions in heavy water (D2O), or ab initio quantum computations to provide unambiguous structural assignments that are unobtainable from CD.

Studies of folding with high time resolution


The study of protein folding has been greatly advanced in recent years by the development of fast, time-resolved techniques. These are experimental methods for rapidly triggering the folding of a sample of unfolded protein and then observing the resulting dynamics. Fast techniques in widespread use include neutron scattering,[26] ultrafast mixing of solutions, photochemical methods, and laser temperature jump spectroscopy. Among the many scientists who have contributed to the development of these techniques are Jeremy Cook, Heinrich Roder, Harry Gray, Martin Gruebele, Brian Dyer, William Eaton, Sheena Radford, Chris Dobson, Alan Fersht, Bengt Nölting and Lars Konermann.

Proteolysis
Proteolysis is routinely used to probe the fraction unfolded under a wide range of solution conditions (e.g. fast parallel proteolysis, FASTpp).[27][28]

Optical tweezers
Optical tweezers have been used to stretch single protein molecules from their C- and N-termini, unfold them, and study the subsequent refolding. The technique allows folding rates to be measured at the single-molecule level. For example, optical tweezers have recently been applied to study the folding and unfolding of proteins involved in blood coagulation. von Willebrand factor (vWF) is a protein with an essential role in blood clot formation. Single-molecule optical tweezers measurements showed that calcium-bound vWF acts as a shear force sensor in the blood. Shear force leads to unfolding of the A2 domain of vWF, whose refolding rate is dramatically enhanced in the presence of calcium.[29]

Computational methods for studying protein folding


Energy landscape of protein folding
The protein folding phenomenon was largely an experimental endeavor until the formulation of an energy landscape theory of proteins by Joseph Bryngelson and Peter Wolynes in the late 1980s and early 1990s. This approach introduced the principle of minimal frustration.[30] This principle says that nature has chosen amino acid sequences so that the folded state of the protein is very stable. In addition, the undesired interactions between amino acids along the folding pathway are reduced, making the acquisition of the folded state a very fast process. Even though nature has reduced the level of frustration in proteins, some degree of it remains, as can be observed in the presence of local minima in the energy landscape of proteins. A consequence of these evolutionarily selected sequences is that proteins are generally thought to have globally "funneled energy landscapes" (coined by José Onuchic)[31] that are largely directed toward the native state. This "folding funnel" landscape allows the protein to fold to the native state through any of a large number of pathways and intermediates, rather than being restricted to a single mechanism. The theory is supported by both computational simulations of model proteins and experimental studies,[30] and it has been used to improve methods for protein structure prediction and design.[30] The description of protein folding by the leveling free-energy landscape is also consistent with the 2nd law of thermodynamics.[32]

Physically, thinking of landscapes in terms of visualizable potential or total energy surfaces simply with maxima, saddle points, minima, and funnels, rather like geographic landscapes, is perhaps a little misleading. The relevant description is really a high-dimensional phase space in which manifolds might take a variety of more complicated topological forms.[33]


Modeling of protein folding


De novo or ab initio techniques for computational protein structure prediction are related to, but strictly distinct from, experimental studies of protein folding. Molecular dynamics (MD) is an important tool for studying protein folding and dynamics in silico. The first equilibrium folding simulations were done using an implicit solvent model and umbrella sampling.[34] Because of the computational cost, ab initio MD folding simulations with explicit water are limited to peptides and very small proteins.[35][36] MD simulations of larger proteins remain restricted to the dynamics of the experimental structure or its high-temperature unfolding. To simulate long-time folding processes (beyond about 1 microsecond), such as the folding of small proteins (about 50 residues) or larger, some approximations or simplifications in protein models may be introduced to speed up the calculation.[37] The 5-petaFLOP distributed computing project Folding@home simulates protein folding using the idle processing time of PlayStation 3s and the CPUs and GPUs of volunteers' personal computers. The project aims to understand protein misfolding and accelerate drug design for disease research.

Folding@home uses Markov state models, like the one diagrammed here, to model the possible shapes and folding pathways a protein can take as it condenses from its initial randomly-coiled state (left) into its native 3D structure (right).
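The idea behind the Markov state models mentioned above can be illustrated with a toy example. The sketch below is not Folding@home's model: it assumes just three conformational states and a hypothetical matrix of per-step transition probabilities, and it propagates state populations forward in time to show how an ensemble that starts fully unfolded drifts toward the native state.

```python
# Toy three-state Markov state model. The states and the transition
# probabilities are hypothetical, chosen only to illustrate the technique.
STATES = ("unfolded", "intermediate", "native")

# TRANSITIONS[i][j] is the probability of moving from state i to state j
# during one time step; each row sums to 1.
TRANSITIONS = [
    [0.90, 0.10, 0.00],
    [0.05, 0.80, 0.15],
    [0.00, 0.01, 0.99],
]


def step(populations, transitions):
    """Advance the state populations by one time step."""
    n = len(populations)
    return [
        sum(populations[i] * transitions[i][j] for i in range(n))
        for j in range(n)
    ]


def propagate(populations, transitions, n_steps):
    """Advance the state populations by n_steps time steps."""
    for _ in range(n_steps):
        populations = step(populations, transitions)
    return populations


if __name__ == "__main__":
    start = [1.0, 0.0, 0.0]  # everything unfolded at time zero
    for n_steps in (0, 10, 50, 500):
        pops = propagate(start, TRANSITIONS, n_steps)
        summary = ", ".join(f"{s}={p:.3f}" for s, p in zip(STATES, pops))
        print(f"after {n_steps:3d} steps: {summary}")
```

In a real Markov state model the states and transition probabilities are estimated from many short simulations rather than written down by hand.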


References
[1] Alberts, Bruce; Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walters (2002). "The Shape and Structure of Proteins" (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=books&doptcmdl=GenBookHL&term=mboc4[book]+AND+372270[uid]&rid=mboc4.section.388). Molecular Biology of the Cell; Fourth Edition. New York and London: Garland Science. ISBN 0-8153-3218-1.
[2] Anfinsen, C. (1972). "The formation and stabilization of protein structure". Biochem. J. 128 (4): 737–49. PMC 1173893. PMID 4565129.
[3] Jeremy M. Berg, John L. Tymoczko, Lubert Stryer; Web content by Neil D. Clarke (2002). "3. Protein Structure and Function" (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=books&doptcmdl=GenBookHL&term=stryer[book]+AND+215168[uid]&rid=stryer.chapter.280). Biochemistry. San Francisco: W. H. Freeman. ISBN 0-7167-4684-0.
[4] Dennis J. Selkoe (2003). "Folding proteins in fatal ways" (http://www.nature.com/nature/journal/v426/n6968/full/nature02264.html). Nature 426 (6968): 900–904. doi:10.1038/nature02264. PMID 14685251.
[5] Alberts, Bruce, Dennis Bray, Karen Hopkin, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter. "Protein Structure and Function." Essential Cell Biology. Edition 3. New York: Garland Science, Taylor and Francis Group, LLC, 2010. Pg 120-170.
[6] Anfinsen CB. (20 July 1973). "Principles that Govern the Folding of Protein Chains" (http://www.sciencemag.org/cgi/pdf_extract/181/4096/223). Science 181 (4096): 223–230. Bibcode 1973Sci...181..223A. doi:10.1126/science.181.4096.223. PMID 4124164.
[7] van den Berg, B., Wain, R., Dobson, C. M., Ellis R. J. (August 2000). "Macromolecular crowding perturbs protein refolding kinetics: implications for folding inside the cell". EMBO J. 19 (15): 3870–5. doi:10.1093/emboj/19.15.3870. PMC 306593. PMID 10921869.
[8] Pace, C., Shirley, B., McNutt, M., Gajiwala, K. (1 January 1996). "Forces contributing to the conformational stability of proteins" (http://www.fasebj.org/cgi/reprint/10/1/75). FASEB J. 10 (1): 75–83. PMID 8566551.
[9] Rose, G., Fleming, P., Banavar, J., Maritan, A. (2006). "A backbone-based theory of protein folding". Proc. Natl. Acad. Sci. U.S.A. 103 (45): 16623–33. Bibcode 2006PNAS..10316623R. doi:10.1073/pnas.0606843103. PMC 1636505. PMID 17075053.
[10] Deechongkit, S., Nguyen, H., Dawson, P. E., Gruebele, M., Kelly, J. W. (2004). "Context Dependent Contributions of Backbone H-Bonding to β-Sheet Folding Energetics". Nature 403 (45): 101–5. Bibcode 2004Natur.430..101D. doi:10.1038/nature02611. PMID 15229605.
[11] Lee, S., Tsai, F. (2005). "Molecular chaperones in protein quality control" (http://www.jbmb.or.kr/fulltext/jbmb/view.php?vol=38&page=259). J. Biochem. Mol. Biol. 38 (3): 259–65. doi:10.5483/BMBRep.2005.38.3.259. PMID 15943899.
[12] Alexander, P. A., He Y., Chen, Y., Orban, J., Bryan, P. N. (2007). "The design and characterization of two proteins with 88% sequence identity but different structure and function". Proc Natl Acad Sci U S A. 104 (29): 11963–8. Bibcode 2007PNAS..10411963A. doi:10.1073/pnas.0700922104. PMC 1906725. PMID 17609385.
[13] http://www.plosone.org/article/fetchObjectAttachment.action;jsessionid=CE17B6912F77B4069E4969431710B8A7?uri=info%3Adoi%2F10.1371%2Fjournal.pone.0046147&representation=PDF.
[14] Takai, K., Nakamura, K., Toki, T., Tsunogai, U., Miyazaki, M., Miyazaki, J., Hirayama, H., Nakagawa, S., Nunoura, T., Horikoshi, K. (2008). "Cell proliferation at 122 °C and isotopically heavy CH4 production by a hyperthermophilic methanogen under high-pressure cultivation". Proc Natl Acad Sci USA 105 (31): 10949–54. Bibcode 2008PNAS..10510949T. doi:10.1073/pnas.0712334105. PMC 2490668. PMID 18664583.
[15] Shortle, D. (1 January 1996). "The denatured state (the other half of the folding equation) and its role in protein stability" (http://www.fasebj.org/cgi/reprint/10/1/27). FASEB J. 10 (1): 27–34. PMID 8566543.
[16] Hammarstrom, P., et al., Prevention of Transthyretin Amyloid Disease by Changing Protein Misfolding Energetics. Science, 2003. 299(5607): p. 713-716.
[17] Chiti, F.; Dobson, C. (2006). "Protein misfolding, functional amyloid, and human disease". Annual review of biochemistry 75: 333–366. doi:10.1146/annurev.biochem.75.101304.123901. PMID 16756495.
[18] Johnson, S.M., et al., Native State Kinetic Stabilization as a Strategy To Ameliorate Protein Misfolding Diseases: A Focus on the Transthyretin Amyloidoses. Acc. Chem. Res., 2005. 38(12): p. 911-921.
[19] Ojeda, P., Garcia, M. (2010). "Electric Field-Driven Disruption of a Native β-Sheet Protein Conformation and Generation of a Helix-Structure". Biophysical Journal 99 (2): 595–599. Bibcode 2010BpJ....99..595O. doi:10.1016/j.bpj.2010.04.040. PMC 2905109. PMID 20643079.
[20] Berg, B., Ellis, J., Dobson, C. (1999). "Effects of macromolecular crowding on protein folding and aggregation". The EMBO Journal 18 (24): 6927–6933. doi:10.1093/emboj/18.24.6927. PMC 1171756. PMID 10601015.
[21] Ellis RJ (July 2006). "Molecular chaperones: assisting assembly in addition to folding". Trends in Biochemical Sciences 31 (7): 395–401. doi:10.1016/j.tibs.2006.05.001. PMID 16716593.
[22] C. Levinthal (1968). "Are there pathways for protein folding?" (http://www.biochem.wisc.edu/courses/biochem704/Reading/Levinthal1968.pdf). J. Chim. Phys. 65: 44–5.
[23] Kim, P. S., Baldwin, R. L. (1990). "Intermediates in the folding reactions of small proteins". Annu. Rev. Biochem. 59: 631–60. doi:10.1146/annurev.bi.59.070190.003215. PMID 2197986.
[24] Jackson S. E. (August 1998). "How do small single-domain proteins fold?" (http://biomednet.com/elecref/13590278003R0081). Fold Des 3 (4): R81–91. doi:10.1016/S1359-0278(98)00033-9. PMID 9710577.
[25] Kubelka, J., Hofrichter, J., Eaton, W. A. (February 2004). "The protein folding 'speed limit'". Curr. Opin. Struct. Biol. 14 (1): 76–88. doi:10.1016/j.sbi.2004.01.013. PMID 15102453.
[26] Bu, Z; Cook, J; Callaway, D. J. E. (2001). "Dynamic regimes and correlated structural dynamics in native and denatured alpha-lactalbuminC". J Mol Biol 312 (4): 865–873. doi:10.1006/jmbi.2001.5006. PMID 11575938.
[27] http://www.plosone.org/article/fetchObjectAttachment.action;jsessionid=CE17B6912F77B4069E4969431710B8A7?uri=info%3Adoi%2F10.1371%2Fjournal.pone.0046147&representation=PDF.
[28] PMID 15782190.
[29] Jakobi AJ, Mashaghi A, Tans SJ, Huizinga EG. Calcium modulates force sensing by the von Willebrand factor A2 domain. Nature Commun. 2011 Jul 12;2:385. (http://www.nature.com/ncomms/journal/v2/n7/full/ncomms1385.html)
[30] Bryngelson, J. D., Onuchic, J. N., Socci, N. D. and Wolynes, P.G. (1995). "Funnels, Pathways, and the Energy Landscape of Protein Folding: A Synthesis" (http://wolynes.ucsd.edu/Wolynes Papers/Funnels Pathways 135.pdf). Proteins: Struct. Funct. Genet. 21 (3): 167–195. doi:10.1002/prot.340210302. PMID 7784423.
[31] Leopold, P. E., Montal, M. and Onuchic, J. N. (1992). "Protein folding funnels: a kinetic approach to the sequence-structure relationship" (http://www.pnas.org/content/89/18/8721.full.pdf+html). Proc. Natl. Acad. Sci. USA 89 (18): 8721–8725. Bibcode 1992PNAS...89.8721L. doi:10.1073/pnas.89.18.8721. PMC 49992. PMID 1528885.
[32] Sharma, V., Kaila, V. R. I., and Annila, A. (2009). "Protein folding as an evolutionary process". Physica A 388 (6): 851–862. Bibcode 2009PhyA..388..851S. doi:10.1016/j.physa.2008.12.004.
[33] Robson, B, Vaithilingham A. (2008). Protein Folding Revisited. Progress in Molecular Biology and Translational Science, Molecular Biology of Protein Folding. 84:161-202, Elsevier Press/Academic Press
[34] Schaefer, Michael; Bartels, Christian, Karplus, Martin (3 December 1998). "Solution conformations and thermodynamics of structured peptides: molecular dynamics simulation with an implicit solvation model1". Journal of Molecular Biology 284 (3): 835–848. doi:10.1006/jmbi.1998.2172. PMID 9826519.
[35] "Fragment-based Protein Folding Simulations" (http://www.cs.ucl.ac.uk/staff/d.jones/t42morph.html).
[36] "Protein folding" (http://www.biomolecular-modeling.com/Abalone/Protein-folding.html) (by Molecular Dynamics).
[37] Kmiecik, S., and Kolinski, A. (2007). "Characterization of protein-folding pathways by reduced-space modeling". Proc. Natl. Acad. Sci. U.S.A. 104 (30): 12330–5. Bibcode 2007PNAS..10412330K. doi:10.1073/pnas.0702265104. PMC 1941469. PMID 17636132.


External links
FoldIt - Folding Protein Game (http://fold.it/portal/info/science)
Folding@Home (http://www.stanford.edu/group/pandegroup/folding/about.html)
Rosetta@Home (http://boinc.bakerlab.org/rosetta)
Human Proteome Folding Project (http://www.worldcommunitygrid.org/research/proteome/overview.do)
BHAGEERATH-H: Protein tertiary structure prediction server (http://www.scfbio-iitd.res.in/bhageerath/bhageerath_h.jsp)

http://www.englandlab.com/protein-folding.html


Protein purification
Protein purification is a series of processes intended to isolate a single type of protein from a complex mixture. Protein purification is vital for the characterization of the function, structure and interactions of the protein of interest. The starting material is usually a biological tissue or a microbial culture. The various steps in the purification process may free the protein from a matrix that confines it, separate the protein and non-protein parts of the mixture, and finally separate the desired protein from all other proteins. Separation of one protein from all others is typically the most laborious aspect of protein purification. Separation steps may exploit differences in (for example) protein size, physico-chemical properties, binding affinity and biological activity.

Purpose
Purification may be preparative or analytical. Preparative purifications aim to produce a relatively large quantity of purified proteins for subsequent use. Examples include the preparation of commercial products such as enzymes (e.g. lactase), nutritional proteins (e.g. soy protein isolate), and certain biopharmaceuticals (e.g. insulin). Analytical purification produces a relatively small amount of a protein for a variety of research or analytical purposes, including identification, quantification, and studies of the protein's structure, post-translational modifications and function. Pepsin and urease were the first proteins purified to the point that they could be crystallized.[1]

Strategies
Choice of a starting material is key to the design of a purification process. In a plant or animal, a particular protein usually is not distributed homogeneously throughout the body; different organs or tissues have higher or lower concentrations of the protein. Using only the tissues or organs with the highest concentration decreases the volumes needed to produce a given amount of purified protein. If the protein is present in low abundance, or if it has a high value, scientists may use recombinant DNA technology to develop cells that will produce large quantities of the desired protein (this is known as an expression system). Recombinant expression allows the protein to be tagged, e.g. by a His-tag, to facilitate purification, which means that the purification can be done in fewer steps. In addition, recombinant expression usually starts with a higher fraction of the desired protein than is present in a natural source.

An analytical purification generally utilizes three properties to separate proteins. First, proteins may be purified according to their isoelectric points by running them through a pH graded gel or an ion exchange column. Second, proteins can be separated according to their size or molecular weight via size exclusion chromatography or by SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) analysis. Proteins are often purified by using 2D-PAGE and are then analysed by peptide mass fingerprinting to establish the protein identity. This is very useful for scientific purposes; detection limits are now very low, and nanogram amounts of protein are sufficient for analysis. Third, proteins may be separated by polarity/hydrophobicity via high performance liquid chromatography or reversed-phase chromatography.

Recombinant bacteria can be grown in a flask containing growth media.


Evaluating purification yield


The most general method to monitor the purification process is to run an SDS-PAGE of the different steps. This method gives only a rough measure of the amounts of different proteins in the mixture, and it cannot distinguish between proteins with similar apparent molecular weight. If the protein has a distinguishing spectroscopic feature or an enzymatic activity, this property can be used to detect and quantify the specific protein, and thus to select the fractions of the separation that contain the protein. If antibodies against the protein are available, then western blotting and ELISA can specifically detect and quantify the amount of the desired protein. Some proteins function as receptors and can be detected during purification steps by a ligand binding assay, often using a radioactive ligand.

To evaluate a multistep purification, the amount of the specific protein has to be compared to the amount of total protein. The latter can be determined by the Bradford total protein assay or by absorbance of light at 280 nm; however, some reagents used during the purification process may interfere with the quantification. For example, imidazole (commonly used for purification of polyhistidine-tagged recombinant proteins) is an amino acid analogue and at low concentrations will interfere with the bicinchoninic acid (BCA) assay for total protein quantification. Impurities in low-grade imidazole will also absorb at 280 nm, resulting in an inaccurate reading of protein concentration from UV absorbance.

Another method to be considered is surface plasmon resonance (SPR). SPR can detect binding of label-free molecules on the surface of a chip. If the desired protein is an antibody, binding can be translated directly to the activity of the protein. One can express the active concentration of the protein as a percentage of the total protein. SPR can be a powerful method for quickly determining protein activity and overall yield, although it requires a dedicated instrument.
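The comparison of specific protein to total protein across a multistep purification is conventionally summarised in a purification table. The sketch below assumes that the target protein is quantified by an activity assay (in arbitrary units) and the total protein by a total protein assay (in mg); the step names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    total_protein_mg: float      # e.g. from a Bradford assay or A280
    total_activity_units: float  # e.g. from an enzymatic activity assay


def purification_table(steps):
    """Return one row per step with specific activity, yield and fold purification.

    Yield and fold purification are expressed relative to the first step.
    """
    first = steps[0]
    first_specific = first.total_activity_units / first.total_protein_mg
    rows = []
    for current in steps:
        specific = current.total_activity_units / current.total_protein_mg
        rows.append({
            "step": current.name,
            "specific_activity": specific,  # units per mg of total protein
            "yield_percent": 100.0 * current.total_activity_units
                             / first.total_activity_units,
            "fold_purification": specific / first_specific,
        })
    return rows


# Hypothetical numbers, for illustration only.
EXAMPLE_STEPS = [
    Step("Crude extract", 1000.0, 10000.0),
    Step("Ammonium sulfate precipitation", 300.0, 8000.0),
    Step("Ion exchange chromatography", 30.0, 6000.0),
    Step("Affinity chromatography", 3.0, 4500.0),
]

if __name__ == "__main__":
    for row in purification_table(EXAMPLE_STEPS):
        print(f"{row['step']:32s} "
              f"specific activity = {row['specific_activity']:8.1f} U/mg  "
              f"yield = {row['yield_percent']:5.1f} %  "
              f"fold = {row['fold_purification']:6.1f}")
```

A step that raises the fold purification only slightly while costing a large share of the yield is a candidate for removal from the protocol.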

Methods of protein purification


The methods used in protein purification can roughly be divided into analytical and preparative methods. The distinction is not exact, but the deciding factor is the amount of protein that can practically be purified with that method. Analytical methods aim to detect and identify a protein in a mixture, whereas preparative methods aim to produce large quantities of the protein for other purposes, such as structural biology or industrial use. In general, the preparative methods can be used in analytical applications, but not the other way around.

Extraction
Depending on the source, the protein has to be brought into solution by breaking the tissue or cells containing it. There are several methods to achieve this: repeated freezing and thawing, sonication, homogenization by high pressure, filtration, or permeabilization by organic solvents. The method of choice depends on how fragile the protein is and how sturdy the cells are. After this extraction process, soluble proteins will be in the solvent and can be separated from cell membranes, DNA, etc. by centrifugation. The extraction process also extracts proteases, which will start digesting the proteins in the solution. If the protein is sensitive to proteolysis, it is usually desirable to proceed quickly and to keep the extract cooled, to slow down proteolysis.

Precipitation and differential solubilization


In bulk protein purification, a common first step to isolate proteins is precipitation with ammonium sulfate, (NH4)2SO4. This is performed by adding increasing amounts of ammonium sulfate and collecting the different fractions of precipitated protein. The ammonium sulfate can afterwards be removed by dialysis. As the salt is added, hydrophobic groups on the proteins become exposed and attract hydrophobic groups on other proteins, so the proteins aggregate. The precipitated protein aggregates are large enough to be visible. One advantage of this method is that it can be performed inexpensively with very large volumes.

The first proteins to be purified are water-soluble proteins. Purification of integral membrane proteins requires disruption of the cell membrane in order to isolate any one particular protein from others that are in the same membrane compartment. Sometimes a particular membrane fraction can be isolated first, such as isolating mitochondria from cells before purifying a protein located in a mitochondrial membrane. A detergent such as sodium dodecyl sulfate (SDS) can be used to dissolve cell membranes and keep membrane proteins in solution during purification; however, because SDS causes denaturation, milder detergents such as Triton X-100 or CHAPS can be used to retain the protein's native conformation during complete purification.


Ultracentrifugation
Centrifugation is a process that uses centrifugal force to separate mixtures of particles of varying masses or densities suspended in a liquid. When a vessel (typically a tube or bottle) containing a mixture of proteins or other particulate matter, such as bacterial cells, is rotated at high speed, the rotation yields an outward force on each particle that is proportional to its mass. The tendency of a given particle to move through the liquid because of this force is offset by the resistance the liquid exerts on the particle. The net effect of "spinning" the sample in a centrifuge is that massive, small, and dense particles move outward faster than less massive particles or particles with more "drag" in the liquid. When suspensions of particles are "spun" in a centrifuge, a "pellet" may form at the bottom of the vessel that is enriched for the most massive particles with low drag in the liquid. The liquid above the pellet, together with the non-compacted particles still remaining in it, is called the "supernatant" and can be removed from the vessel to separate it from the pellet. The rate of centrifugation is specified by the acceleration applied to the sample, typically expressed as a multiple of the gravitational acceleration g. If samples are centrifuged long enough, the particles in the vessel will reach equilibrium, accumulating at the point in the vessel where their buoyant density is balanced with the centrifugal force. Such an "equilibrium" centrifugation can allow extensive purification of a given particle.

In sucrose gradient centrifugation, a linear concentration gradient of sugar (typically sucrose, glycerol, or a silica-based density gradient medium such as Percoll) is generated in a tube such that the highest concentration is at the bottom and the lowest at the top. Percoll is a trademark owned by GE Healthcare companies. A protein sample is then layered on top of the gradient and spun at high speed in an ultracentrifuge. This causes heavy macromolecules to migrate towards the bottom of the tube faster than lighter material. During centrifugation in the absence of sucrose, as particles move farther and farther from the center of rotation, they experience more and more centrifugal force (the further they move, the faster they move). The problem with this is that the useful separation range within the vessel is restricted to a small observable window. Spinning a sample twice as long does not mean the particle of interest will go twice as far; in fact, it will go significantly further. However, when the proteins are moving through a sucrose gradient, they encounter liquid of increasing density and viscosity. A properly designed sucrose gradient will counteract the increasing centrifugal force so that the particles move in close proportion to the time they have been in the centrifugal field. Separations on these gradients are referred to as "rate zonal" centrifugations. After the proteins or particles have been separated, the gradient is fractionated and collected.
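Expressing the rate of centrifugation as a multiple of g, and the dependence of the centrifugal force on the distance from the center of rotation, both follow from the relation a = ω²r. The sketch below converts between rotor speed and relative centrifugal force (RCF); the function names are illustrative, and the radius must be taken from the rotor in use.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given rotor speed.

    radius_cm is the distance of the sample from the axis of rotation.
    """
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega ** 2 * (radius_cm / 100.0) / STANDARD_GRAVITY


def rpm_for_rcf(target_rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a target RCF at a given radius."""
    omega = math.sqrt(target_rcf * STANDARD_GRAVITY / (radius_cm / 100.0))
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    # The same rotor speed gives a larger force farther from the axis.
    for radius_cm in (5.0, 7.5, 10.0):
        print(f"10000 rpm at r = {radius_cm:4.1f} cm: {rcf(10000, radius_cm):8.0f} x g")
    print(f"rpm for 100000 x g at 8 cm: {rpm_for_rcf(100000, 8.0):.0f}")
```

Because the force scales with the square of the rotor speed, doubling the speed quadruples the RCF, which is why protocols are best specified in multiples of g rather than in rpm.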


Chromatographic methods
Usually a protein purification protocol contains one or more chromatographic steps. The basic procedure in chromatography is to flow the solution containing the protein through a column packed with various materials. Different proteins interact differently with the column material, and can thus be separated by the time required to pass through the column, or by the conditions required to elute the protein from the column. Usually proteins are detected as they come off the column by their absorbance at 280 nm. Many different chromatographic methods exist:

Size exclusion chromatography


Chromatography on porous gels can be used to separate proteins in solution or under denaturing conditions. This technique is known as size exclusion chromatography. The principle is that smaller molecules can enter more of the pores and so have to traverse a larger volume in the porous matrix. Consequently, proteins within a certain size range will require different volumes of eluent (solvent) before being collected at the other end of the column of gel.

In the context of protein purification, the eluent is usually pooled in different test tubes. All test tubes containing no measurable trace of the protein to purify are discarded. The remaining solution is thus made of the protein to purify and any other similarly-sized proteins.
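Because elution volume depends on size, a size exclusion column is often calibrated with proteins of known molecular weight. The sketch below assumes the common empirical approach in which the partition coefficient Kav = (Ve - V0) / (Vt - V0) is treated as linear in log10 of the molecular weight; the void volume V0, total volume Vt, and the standards are hypothetical values.

```python
import math


def kav(elution_ml, void_ml, total_ml):
    """Partition coefficient: fraction of the gel volume accessible to the solute."""
    return (elution_ml - void_ml) / (total_ml - void_ml)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_estimator(standards, void_ml, total_ml):
    """Build a function mapping elution volume (mL) to molecular weight (Da).

    standards is a list of (molecular_weight_da, elution_ml) pairs.
    """
    xs = [math.log10(mw) for mw, _ in standards]
    ys = [kav(ve, void_ml, total_ml) for _, ve in standards]
    slope, intercept = fit_line(xs, ys)

    def estimate(elution_ml):
        return 10 ** ((kav(elution_ml, void_ml, total_ml) - intercept) / slope)

    return estimate


# Hypothetical calibration standards: (molecular weight in Da, elution volume in mL).
STANDARDS = [(1e4, 18.0), (1e5, 14.0), (1e6, 10.0)]
estimate_mw = make_estimator(STANDARDS, void_ml=8.0, total_ml=24.0)

if __name__ == "__main__":
    for ve in (12.0, 16.0, 18.0):
        print(f"elution at {ve:.1f} mL -> about {estimate_mw(ve):,.0f} Da")
```

The estimate is only meaningful inside the range spanned by the standards, and it assumes the unknown protein has a shape comparable to theirs.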

Chromatographic equipment. Here set up for a size exclusion chromatography. The buffer is pumped through the column (right) by a computer controlled device.

Separation based on charge or hydrophobicity


Hydrophobic interaction chromatography: the resins used in the column are amphiphiles with both hydrophobic and hydrophilic regions. The hydrophobic part of the resin attracts hydrophobic regions on the proteins. The larger the hydrophobic region on a protein, the stronger the attraction between the gel and that protein.

Ion exchange chromatography


Ion exchange chromatography separates compounds according to the nature and degree of their ionic charge. The column to be used is selected according to its type and strength of charge. Anion exchange resins have a positive charge and are used to retain and separate negatively charged compounds, while cation exchange resins have a negative charge and are used to separate positively charged molecules. Before the separation begins a buffer is pumped through the column to equilibrate the opposing charged ions. Upon injection of the sample, solute molecules will exchange with the buffer ions as each competes for the binding sites on the resin. The length of retention for each solute depends upon the strength of its charge. The most weakly charged compounds will elute first, followed by those with successively stronger charges. Because of the nature of the separating mechanism, pH, buffer type, buffer concentration, and temperature all play important roles in controlling the separation. Ion exchange chromatography is a very powerful tool for use in protein purification and is frequently used in both analytical and preparative separations.
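The choice between an anion and a cation exchanger follows from the sign of the protein's net charge, which is negative when the buffer pH is above the protein's isoelectric point (pI) and positive when it is below. The sketch below encodes that rule together with the elution order described above; the `margin` of 0.5 pH units is a hypothetical rule of thumb rather than a fixed standard, and the charges in the usage example are invented.

```python
def choose_exchanger(isoelectric_point, buffer_ph, margin=0.5):
    """Suggest a resin type from the protein's pI and the buffer pH.

    margin is an assumed safety distance (in pH units) between pI and buffer pH.
    """
    if buffer_ph >= isoelectric_point + margin:
        # Protein carries a net negative charge: a positively charged resin retains it.
        return "anion exchange"
    if buffer_ph <= isoelectric_point - margin:
        # Protein carries a net positive charge: a negatively charged resin retains it.
        return "cation exchange"
    return "weak binding expected"


def elution_order(net_charges):
    """Order retained solutes from first to last eluted.

    net_charges maps a solute name to its net charge at the buffer pH;
    the most weakly charged solute elutes first.
    """
    return sorted(net_charges, key=lambda name: abs(net_charges[name]))


if __name__ == "__main__":
    print(choose_exchanger(isoelectric_point=5.0, buffer_ph=8.0))
    print(choose_exchanger(isoelectric_point=9.5, buffer_ph=6.0))
    print(elution_order({"A": -2.0, "B": -8.0, "C": -5.0}))
```

Real elution behaviour also depends on how the charge is distributed over the protein surface, so the ordering is a first approximation.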


Affinity chromatography
Affinity chromatography is a separation technique based upon molecular conformation, which frequently utilizes application-specific resins. These resins have ligands attached to their surfaces which are specific for the compounds to be separated. Most frequently, these ligands function in a fashion similar to that of antibody-antigen interactions. This "lock and key" fit between the ligand and its target compound makes it highly specific, frequently generating a single peak, while all else in the sample is unretained.

Nickel-affinity column. The resin is blue since it has bound nickel.

Many membrane proteins are glycoproteins and can be purified by lectin affinity chromatography. Detergent-solubilized proteins can be allowed to bind to a chromatography resin that has been modified to have a covalently attached lectin. Proteins that do not bind to the lectin are washed away, and then specifically bound glycoproteins can be eluted by adding a high concentration of a sugar that competes with the bound glycoproteins at the lectin binding site. Some lectins bind the oligosaccharides of glycoproteins with such high affinity that the binding is hard to compete with sugars, and the bound glycoproteins need to be released by denaturing the lectin.

Metal binding
A common technique involves engineering a sequence of 6 to 8 histidines into the N- or C-terminus of the protein. The polyhistidine binds strongly to divalent metal ions such as nickel and cobalt. The protein can be passed through a column containing immobilized nickel ions, which bind the polyhistidine tag. All untagged proteins pass through the column. The protein can be eluted with imidazole, which competes with the polyhistidine tag for binding to the column, or by a decrease in pH (typically to 4.5), which decreases the affinity of the tag for the resin. While this procedure is generally used for the purification of recombinant proteins with an engineered affinity tag (such as a 6xHis tag or Clontech's HAT tag), it can also be used for natural proteins with an inherent affinity for divalent cations.

Immunoaffinity chromatography
Immunoaffinity chromatography uses the specific binding of an antibody to the target protein to selectively purify the protein. The procedure involves immobilizing an antibody to a column material, which then selectively binds the protein, while everything else flows through. The protein can be eluted by changing the pH or the salinity. Because this method does not involve engineering in a tag, it can be used for proteins from natural sources.[2]

Purification of a tagged protein

Another way to tag proteins is to engineer an antigen peptide tag onto the protein, and then purify the protein on a column or by incubating with a loose resin that is coated with an immobilized antibody. This particular procedure is known as immunoprecipitation. Immunoprecipitation is quite capable of generating an extremely specific interaction which usually results in binding only the desired protein. The purified tagged proteins can then easily be separated from the other proteins in solution and later eluted back into clean solution.

An HPLC system. From left to right: a pumping device generating a gradient of two different solvents, a steel-reinforced column, and an apparatus for measuring the absorbance.

When the tags are not needed anymore, they can be cleaved off by a protease. This often involves engineering a protease cleavage site between the tag and the protein.


HPLC
High-performance liquid chromatography, or high-pressure liquid chromatography, is a form of chromatography that applies high pressure to drive the solutes through the column faster. This limits diffusion and improves the resolution. The most common form is "reversed phase" HPLC, where the column material is hydrophobic. The proteins are eluted by a gradient of increasing amounts of an organic solvent, such as acetonitrile, and elute according to their hydrophobicity. After purification by HPLC the protein is in a solution that contains only volatile compounds, and can easily be lyophilized.[3] HPLC purification frequently results in denaturation of the purified proteins and is thus not applicable to proteins that do not spontaneously refold.

Concentration of the purified protein


At the end of a protein purification, the protein often has to be concentrated. Different methods exist.

Lyophilization
If the solution doesn't contain any other soluble component than the protein in question the protein can be lyophilized (dried). This is commonly done after an HPLC run. This simply removes all volatile components, leaving the proteins behind.

Ultrafiltration
Ultrafiltration concentrates a protein solution using selectively permeable membranes. The function of the membrane is to let the water and small molecules pass through while retaining the protein. The solution is forced against the membrane by a mechanical pump, gas pressure, or centrifugation.

A selectively permeable membrane can be mounted in a centrifuge tube. The buffer is forced through the membrane by centrifugation, leaving the protein in the upper chamber.

Analytical
Denaturing-Condition Electrophoresis
Gel electrophoresis is a common laboratory technique that can be used both as a preparative and as an analytical method. The principle of electrophoresis relies on the movement of a charged ion in an electric field. In practice, the proteins are denatured in a solution containing a detergent (SDS). Under these conditions, the proteins are unfolded and coated with negatively charged detergent molecules. The proteins in SDS-PAGE are separated on the sole basis of their size. In analytical methods, the proteins migrate as bands based on size. Each band can be detected using stains such as Coomassie blue dye or silver stain. Preparative methods, used to purify large amounts of protein, require the extraction of the protein from the electrophoretic gel. This extraction may involve excising the piece of gel containing a band, or eluting the band directly as it runs off the end of the gel.

In the context of a purification strategy, denaturing-condition electrophoresis provides improved resolution over size exclusion chromatography, but does not scale to large quantities of protein in a sample as well as the late-stage chromatography columns.


Non-Denaturing-Condition Electrophoresis
An important non-denaturing electrophoretic procedure for isolating bioactive metalloproteins in complex protein mixtures is termed "quantitative preparative native continuous polyacrylamide gel electrophoresis" (QPNC-PAGE).

References
[1] "The Nobel Prize in Chemistry 1946" (http://www.nobelprize.org/nobel_prizes/chemistry/laureates/1946/). Retrieved 2011-09-19.
[2] Ehle H, Horn A (1990). "Immunoaffinity chromatography of enzymes". Bioseparation 1 (2): 97–110. PMID 1368167.
[3] Regnier FE (October 1983). "High-performance liquid chromatography of biopolymers" (http://www.sciencemag.org/cgi/pmidlookup?view=long&pmid=6353575). Science 222 (4621): 245–52. doi:10.1126/science.6353575. PMID 6353575.

External links
Protein purification in one day (http://proteincrystallography.org/protein-purification/)
Protein purification facility (http://wolfson.huji.ac.il/purification/)
Slope Spectroscopy (http://www.solovpe.com)

Chromatography
Chromatography (from Greek chroma "color" and graphein "to write") is the collective term for a set of laboratory techniques for the separation of mixtures. The mixture is dissolved in a fluid called the mobile phase, which carries it through a structure holding another material called the stationary phase. The various constituents of the mixture travel at different speeds, causing them to separate. The separation is based on differential partitioning between the mobile and stationary phases. Subtle differences in a compound's partition coefficient result in differential retention on the stationary phase and thus change the separation.

Chromatography may be preparative or analytical. The purpose of preparative chromatography is to separate the components of a mixture for more advanced use (and is thus a form of purification). Analytical chromatography is done normally with smaller amounts of material and is for measuring the relative proportions of analytes in a mixture. The two are not mutually exclusive.

Pictured is a sophisticated gas chromatography system. This instrument records concentrations of acrylonitrile in the air at various points throughout the chemical laboratory.


History
Chromatography, literally "color writing", was first employed by Russian scientist Mikhail Tsvet in 1900. He continued to work with chromatography in the first decade of the 20th century, primarily for the separation of plant pigments such as chlorophyll, carotenes, and xanthophylls. Since these components have different colors (green, orange, and yellow, respectively) they gave the technique its name. New types of chromatography developed during the 1930s and 1940s made the technique useful for many separation processes. Chromatography technique developed substantially as a result of the work of Archer John Porter Martin and Richard Laurence Millington Synge during the 1940s and 1950s. They established the principles and basic techniques of partition chromatography, and their work encouraged the rapid development of several chromatographic methods: paper chromatography, gas chromatography, and what would become known as high performance liquid chromatography. Since then, the technology has advanced rapidly. Researchers found that the main principles of Tsvet's chromatography could be applied in many different ways, resulting in the different varieties of chromatography described below. Advances are continually improving the technical performance of chromatography, allowing the separation of increasingly similar molecules.

Chromatography terms
The analyte is the substance to be separated during chromatography. Analytical chromatography is used to determine the existence and possibly also the concentration of analyte(s) in a sample.

Thin layer chromatography is used to separate components of a plant extract, illustrating the experiment with plant pigments that gave chromatography its name

A bonded phase is a stationary phase that is covalently bonded to the support particles or to the inside wall of the column tubing.
A chromatogram is the visual output of the chromatograph. In the case of an optimal separation, different peaks or patterns on the chromatogram correspond to different components of the separated mixture. Plotted on the x-axis is the retention time and plotted on the y-axis is a signal (for example obtained by a spectrophotometer, mass spectrometer or a variety of other detectors) corresponding to the response created by the analytes exiting the system. In the case of an optimal system the signal is proportional to the concentration of the specific analyte separated.
A chromatograph is equipment that enables a sophisticated separation, e.g. gas chromatographic or liquid chromatographic separation.
Chromatography is a physical method of separation that distributes the components to be separated between two phases, one stationary (the stationary phase), while the other (the mobile phase) moves in a definite direction.
The eluate is the mobile phase leaving the column.
The eluent is the solvent that carries the analyte.
An eluotropic series is a list of solvents ranked according to their eluting power.
An immobilized phase is a stationary phase that is immobilized on the support particles, or on the inner wall of the column tubing.
The mobile phase is the phase that moves in a definite direction. It may be a liquid (LC and capillary electrochromatography, CEC), a gas (GC), or a supercritical fluid (supercritical-fluid chromatography, SFC). The mobile phase consists of the sample being separated or analyzed and the solvent that moves the sample through the column. In the case of HPLC the mobile phase consists of the sample being separated together with non-polar solvent(s) such as hexane in normal phase chromatography, or polar solvents in reverse phase chromatography. The mobile phase moves through the chromatography column (the stationary phase), where the sample interacts with the stationary phase and is separated.
Preparative chromatography is used to purify sufficient quantities of a substance for further use, rather than analysis.
The retention time is the characteristic time it takes for a particular analyte to pass through the system (from the column inlet to the detector) under set conditions. See also: Kovats' retention index.
The sample is the matter analyzed in chromatography. It may consist of a single component or it may be a mixture of components. When the sample is treated in the course of an analysis, the phase or phases containing the analytes of interest are referred to as the sample, whereas everything not of interest that is separated from the sample before or during the analysis is referred to as waste.
The solute refers to the sample components in partition chromatography.
The solvent refers to any substance capable of solubilizing another substance, and especially the liquid mobile phase in liquid chromatography.
The stationary phase is the substance fixed in place for the chromatography procedure. Examples include the silica layer in thin layer chromatography.

Chromatography is based on the concept of the partition coefficient. Any solute partitions between two immiscible solvents. Making one solvent immobile (by adsorption on a solid support matrix) and the other mobile gives the most common applications of chromatography. If the matrix support is polar (e.g. paper or silica) it is forward phase chromatography, and if it is non-polar (e.g. C-18) it is reverse phase.


Techniques by chromatographic bed shape


Column chromatography
Column chromatography is a separation technique in which the stationary bed is within a tube. The particles of the solid stationary phase, or the support coated with a liquid stationary phase, may fill the whole inside volume of the tube (packed column) or be concentrated on or along the inside tube wall, leaving an open, unrestricted path for the mobile phase in the middle part of the tube (open tubular column). Differences in rates of movement through the medium translate into different retention times of the sample components.[1]

In 1978, W. C. Still introduced a modified version of column chromatography called flash column chromatography (flash).[2][3] The technique is very similar to traditional column chromatography, except that the solvent is driven through the column by applying positive pressure. This allowed most separations to be performed in less than 20 minutes, with improved separations compared to the old method. Modern flash chromatography systems are sold as pre-packed plastic cartridges, and the solvent is pumped through the cartridge. Systems may also be linked with detectors and fraction collectors, providing automation. The introduction of gradient pumps resulted in quicker separations and less solvent usage.

In expanded bed adsorption, a fluidized bed is used, rather than a solid phase made by a packed bed. This allows omission of initial clearing steps such as centrifugation and filtration for culture broths or slurries of broken cells.

Phosphocellulose chromatography utilizes the binding affinity of many DNA-binding proteins for phosphocellulose. The stronger a protein's interaction with DNA, the higher the salt concentration needed to elute that protein.[4]


Planar chromatography
Planar chromatography is a separation technique in which the stationary phase is present as or on a plane. The plane can be a paper, serving as such or impregnated by a substance as the stationary bed (paper chromatography), or a layer of solid particles spread on a support such as a glass plate (thin layer chromatography). Different compounds in the sample mixture travel different distances according to how strongly they interact with the stationary phase as compared to the mobile phase. The specific retention factor (Rf) of each chemical can be used to aid in the identification of an unknown substance.

Paper chromatography
Paper chromatography is a technique that involves placing a small dot or line of sample solution onto a strip of chromatography paper. The paper is placed in a jar containing a shallow layer of solvent and sealed. As the solvent rises through the paper, it meets the sample mixture, which starts to travel up the paper with the solvent. This paper is made of cellulose, a polar substance, and the compounds within the mixture travel farther if they are non-polar. More polar substances bond with the cellulose paper more quickly, and therefore do not travel as far.

Thin layer chromatography
Thin layer chromatography (TLC) is a widely employed laboratory technique and is similar to paper chromatography. However, instead of using a stationary phase of paper, it involves a stationary phase of a thin layer of adsorbent like silica gel, alumina, or cellulose on a flat, inert substrate. Compared to paper, it has the advantage of faster runs, better separations, and the choice between different adsorbents. For even better resolution and to allow for quantification, high-performance TLC can be used.
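The retention factor mentioned for planar chromatography is conventionally computed as the distance travelled by a spot divided by the distance travelled by the solvent front, both measured from the origin line. The text does not give this formula, so the following is a minimal sketch under that standard definition; the function name, the pigment distances and the solvent-front distance are hypothetical illustration values, not measured data.

```python
def retention_factor(spot_distance_mm: float, solvent_front_mm: float) -> float:
    """Return Rf = (distance moved by the spot) / (distance moved by the solvent front).

    Both distances are measured from the origin line where the sample was applied.
    """
    if solvent_front_mm <= 0:
        raise ValueError("solvent front distance must be positive")
    if not 0 <= spot_distance_mm <= solvent_front_mm:
        raise ValueError("a spot cannot travel farther than the solvent front")
    return spot_distance_mm / solvent_front_mm


# Hypothetical distances (mm) read off a developed plate with a polar stationary phase.
solvent_front_mm = 60.0
spot_distances_mm = {"chlorophyll": 18.0, "xanthophyll": 32.0, "carotene": 57.0}

rf_values = {
    name: round(retention_factor(distance, solvent_front_mm), 2)
    for name, distance in spot_distances_mm.items()
}

for name, rf in rf_values.items():
    print(f"{name}: Rf = {rf}")
```

On a polar stationary phase the least polar compound has the largest Rf, which matches the description above of non-polar compounds travelling farther on cellulose paper.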

Displacement chromatography
The basic principle of displacement chromatography is that a molecule with a high affinity for the chromatography matrix (the displacer) competes effectively for binding sites, and thus displaces all molecules with lesser affinities.[5] There are distinct differences between displacement and elution chromatography. In elution mode, substances typically emerge from a column in narrow, Gaussian peaks. Wide separation of peaks, preferably to baseline, is desired for maximum purification. The speed at which any component of a mixture travels down the column in elution mode depends on many factors. But for two substances to travel at different speeds, and thereby be resolved, there must be substantial differences in some interaction between the biomolecules and the chromatography matrix. Operating parameters are adjusted to maximize the effect of this difference. In many cases, baseline separation of the peaks can be achieved only with gradient elution and low column loadings. Thus, two drawbacks to elution mode chromatography, especially at the preparative scale, are operational complexity, due to gradient solvent pumping, and low throughput, due to low column loadings.

Displacement chromatography has advantages over elution chromatography in that components are resolved into consecutive zones of pure substances rather than peaks. Because the process takes advantage of the nonlinearity of the isotherms, a larger column feed can be separated on a given column, with the purified components recovered at significantly higher concentrations.


Techniques by physical state of mobile phase


Gas chromatography
Gas chromatography (GC), also sometimes known as gas-liquid chromatography (GLC), is a separation technique in which the mobile phase is a gas. Gas chromatography is always carried out in a column, which is typically "packed" or "capillary" (see below). Gas chromatography is based on a partition equilibrium of analyte between a stationary phase (often a liquid silicone-based material held on a solid surface) and a mobile gas (most often helium). The stationary phase is adhered to the inside of a small-diameter glass tube (a capillary column) or a solid matrix inside a larger metal tube (a packed column). It is widely used in analytical chemistry. Although the high temperatures used in GC make it unsuitable for the high molecular weight biopolymers and proteins frequently encountered in biochemistry (heat denatures them), it is well suited for use in the petrochemical, environmental monitoring and remediation, and industrial chemical fields. It is also used extensively in chemistry research.

Liquid chromatography
Liquid chromatography (LC) is a separation technique in which the mobile phase is a liquid. Liquid chromatography can be carried out either in a column or on a plane. Present-day liquid chromatography that generally utilizes very small packing particles and a relatively high pressure is referred to as high performance liquid chromatography (HPLC). In HPLC the sample is forced by a liquid at high pressure (the mobile phase) through a column that is packed with a stationary phase composed of irregularly or spherically shaped particles, a porous monolithic layer, or a porous membrane. HPLC is historically divided into two different sub-classes based on the polarity of the mobile and stationary phases. Methods in which the stationary phase is more polar than the mobile phase (e.g., toluene as the mobile phase, silica as the stationary phase) are termed normal phase liquid chromatography (NPLC), and the opposite (e.g., a water-methanol mixture as the mobile phase and C18 = octadecylsilyl as the stationary phase) is termed reversed phase liquid chromatography (RPLC). Ironically, the "normal phase" has fewer applications, and RPLC is therefore used considerably more. Specific techniques under this broad heading are listed below.

Preparative HPLC apparatus


Affinity chromatography
Affinity chromatography[6] is based on selective non-covalent interaction between an analyte and specific molecules. It is very specific, but not very robust. It is often used in biochemistry in the purification of proteins bound to tags. These fusion proteins are labeled with compounds such as His-tags, biotin or antigens, which bind to the stationary phase specifically. After purification, some of these tags are usually removed and the pure protein is obtained. Affinity chromatography often utilizes a biomolecule's affinity for a metal (Zn, Cu, Fe, etc.). Columns are often manually prepared. Traditional affinity columns are used as a preparative step to flush out unwanted biomolecules. However, HPLC techniques exist that utilize affinity chromatography properties. Immobilized Metal Affinity Chromatography (IMAC) is useful for separating the aforementioned molecules based on their relative affinity for the metal (e.g. Dionex IMAC). Often these columns can be loaded with different metals to create a column with a targeted affinity.

Supercritical fluid chromatography


Supercritical fluid chromatography is a separation technique in which the mobile phase is a fluid above and relatively close to its critical temperature and pressure.

Techniques by separation mechanism


Ion exchange chromatography
Ion exchange chromatography (usually referred to as ion chromatography) uses an ion exchange mechanism to separate analytes based on their respective charges. It is usually performed in columns but can also be useful in planar mode. Ion exchange chromatography uses a charged stationary phase to separate charged compounds including anions, cations, amino acids, peptides, and proteins. In conventional methods the stationary phase is an ion exchange resin that carries charged functional groups, which interact with oppositely charged groups of the compound to be retained. Ion exchange chromatography is commonly used to purify proteins using FPLC.

Size-exclusion chromatography
Size-exclusion chromatography (SEC), also known as gel permeation chromatography (GPC) or gel filtration chromatography, separates molecules according to their size (or more accurately according to their hydrodynamic diameter or hydrodynamic volume). Smaller molecules are able to enter the pores of the media and are therefore trapped and removed from the flow of the mobile phase. The average residence time in the pores depends upon the effective size of the analyte molecules. However, molecules that are larger than the average pore size of the packing are excluded and thus suffer essentially no retention; such species are the first to be eluted. It is generally a low-resolution chromatography technique and thus it is often reserved for the final, "polishing" step of a purification. It is also useful for determining the tertiary structure and quaternary structure of purified proteins, especially since it can be carried out under native solution conditions.


Special techniques
Reversed-phase chromatography
Reversed-phase chromatography is an elution procedure used in liquid chromatography in which the mobile phase is significantly more polar than the stationary phase.

Two-dimensional chromatography
In some cases, the chemistry within a given column can be insufficient to separate some analytes. It is possible to direct a series of unresolved peaks onto a second column with different physico-chemical (chemical classification) properties. Since the mechanism of retention on this new solid support is different from the first dimensional separation, it can be possible to separate compounds that are indistinguishable by one-dimensional chromatography. In the planar form of the technique, the sample is spotted at one corner of a square plate, developed, air-dried, then rotated by 90° and usually redeveloped in a second solvent system.

Pyrolysis gas chromatography


Pyrolysis gas chromatography mass spectrometry is a method of chemical analysis in which the sample is heated to decomposition to produce smaller molecules that are separated by gas chromatography and detected using mass spectrometry. Pyrolysis is the thermal decomposition of materials in an inert atmosphere or a vacuum. The sample is put into direct contact with a platinum wire, or placed in a quartz sample tube, and rapidly heated to 600–1000 °C. Depending on the application, even higher temperatures are used. Three different heating techniques are used in actual pyrolyzers: isothermal furnace, inductive heating (Curie Point filament), and resistive heating using platinum filaments.

Large molecules cleave at their weakest points and produce smaller, more volatile fragments. These fragments can be separated by gas chromatography. Pyrolysis GC chromatograms are typically complex because a wide range of different decomposition products is formed. The data can either be used as a fingerprint to prove material identity, or the GC/MS data can be used to identify individual fragments to obtain structural information. To increase the volatility of polar fragments, various methylating reagents can be added to a sample before pyrolysis.

Besides the usage of dedicated pyrolyzers, pyrolysis GC of solid and liquid samples can be performed directly inside Programmable Temperature Vaporizer (PTV) injectors that provide quick heating (up to 30 °C/s) and high maximum temperatures of 600–650 °C. This is sufficient for some pyrolysis applications. The main advantage is that no dedicated instrument has to be purchased and pyrolysis can be performed as part of routine GC analysis. In this case quartz GC inlet liners have to be used. Quantitative data can be acquired, and good results of derivatization inside the PTV injector have been published as well.


Fast protein liquid chromatography


Fast protein liquid chromatography (FPLC) is a term applied to several chromatography techniques which are used to purify proteins. Many of these techniques are identical to those carried out under high performance liquid chromatography; however, FPLC techniques are typically used for preparing large-scale batches of a purified product.

Countercurrent chromatography
Countercurrent chromatography (CCC) is a type of liquid-liquid chromatography, where both the stationary and mobile phases are liquids. The operating principle of CCC equipment requires a column consisting of an open tube coiled around a bobbin. The bobbin is rotated in a double-axis gyratory motion (a cardioid), which causes a variable gravity (G) field to act on the column during each rotation. This motion causes the column to see one partitioning step per revolution, and components of the sample separate in the column according to their partitioning coefficient between the two immiscible liquid phases used. There are many types of CCC available today. These include HSCCC (High Speed CCC) and HPCCC (High Performance CCC). HPCCC is the latest and best performing version of the instrumentation available currently.

An example of an HPCCC system

Chiral chromatography
Chiral chromatography involves the separation of stereoisomers. In the case of enantiomers, these have no chemical or physical differences apart from being three-dimensional mirror images. Conventional chromatography or other separation processes are incapable of separating them. To enable chiral separations to take place, either the mobile phase or the stationary phase must themselves be made chiral, giving differing affinities between the analytes. Chiral chromatography HPLC columns (with a chiral stationary phase) in both normal and reversed phase are commercially available.

References
[1] IUPAC Nomenclature for Chromatography (http://www.iupac.org/publications/pac/1993/pdf/6504x0819.pdf), IUPAC Recommendations 1993, Pure & Appl. Chem., Vol. 65, No. 4, pp. 819–872, 1993.
[2] Still, W. C.; Kahn, M.; Mitra, A. J. Org. Chem. 1978, 43(14), 2923–2925. doi:10.1021/jo00408a041
[3] Laurence M. Harwood, Christopher J. Moody (13 June 1989). Experimental organic chemistry: Principles and Practice (Illustrated ed.). Wiley–Blackwell. pp. 180–185. ISBN 978-0-632-02017-1.
[4] Christian B. Anfinsen, John Tileston Edsall, Frederic Middlebrook Richards, Advances in Protein Chemistry. Science 1976, 6-7.
[5] Displacement Chromatography 101 (http://www.sacheminc.com/industries/biotechnology/teaching-tools.html). Sachem, Inc. Austin, TX 78737
[6] Pascal Bailon, George K. Ehrlich, Wen-Jian Fung and Wolfgang Berthold, An Overview of Affinity Chromatography, Humana Press, 2000. ISBN 978-0-89603-694-9, ISBN 978-1-60327-261-2.


External links
IUPAC Nomenclature for Chromatography (http://www.iupac.org/publications/pac/1993/pdf/6504x0819.pdf)
Chromedia (http://www.chromedia.org): online database and community for chromatography practitioners (paid subscription required)
Library 4 Science: Chrom-Ed Series (http://www.chromatography-online.org/)
Overlapping Peaks Program, Learning by Simulations (http://www.vias.org/simulations/simusoft_peakoverlap.html)
Chromatography Videos, MIT OCW Digital Lab Techniques Manual (http://ocw.mit.edu/ans7870/resources/chemvideo/index.htm)
Chromatography Equations Calculators, MicroSolv Technology Corporation (http://www.mtc-usa.com/calculators_chrom.asp)

Gel permeation chromatography


Gel permeation chromatography (GPC) is a type of size exclusion chromatography (SEC) that separates analytes on the basis of size. The technique is often used for the analysis of polymers. As a technique, SEC was first developed in 1955 by Lathe and Ruthven.[1] The term gel permeation chromatography can be traced back to J.C. Moore of the Dow Chemical Company, who investigated the technique in 1964. The proprietary column technology was licensed to Waters, who subsequently commercialized it in 1964.[2] It is often necessary to separate polymers, both to analyze them and to purify the desired product. When characterizing polymers, it is important to consider the polydispersity index (PDI) as well as the molecular weight. Polymers can be characterized by a variety of definitions for molecular weight, including the number average molecular weight (Mn), the weight average molecular weight (Mw) (see molar mass distribution), the size average molecular weight (Mz), or the viscosity molecular weight (Mv). GPC allows for the determination of PDI as well as Mv and, based on other data, of Mn, Mw, and Mz.
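The averages named above have standard moment definitions: Mn = ΣNiMi / ΣNi, Mw = ΣNiMi² / ΣNiMi, Mz = ΣNiMi³ / ΣNiMi², and PDI = Mw / Mn, where Ni is the number of chains of molar mass Mi. The text does not spell these out, so the following is a minimal sketch under those definitions; the function name and the two-component sample are hypothetical.

```python
from typing import List, Tuple


def molecular_weight_averages(fractions: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Return (Mn, Mw, Mz, PDI) for a list of (molar_mass, number_of_chains) pairs."""
    sum_n = sum(n for _, n in fractions)               # sum of N_i
    sum_nm = sum(n * m for m, n in fractions)          # sum of N_i * M_i
    sum_nm2 = sum(n * m ** 2 for m, n in fractions)    # sum of N_i * M_i^2
    sum_nm3 = sum(n * m ** 3 for m, n in fractions)    # sum of N_i * M_i^3
    if sum_n <= 0 or sum_nm <= 0:
        raise ValueError("need at least one chain with positive molar mass")
    mn = sum_nm / sum_n
    mw = sum_nm2 / sum_nm
    mz = sum_nm3 / sum_nm2
    return mn, mw, mz, mw / mn


# Hypothetical sample: two chains of 10,000 g/mol and one chain of 20,000 g/mol.
sample = [(10_000.0, 2.0), (20_000.0, 1.0)]
mn, mw, mz, pdi = molecular_weight_averages(sample)
print(f"Mn = {mn:.0f}, Mw = {mw:.0f}, Mz = {mz:.0f}, PDI = {pdi:.3f}")
```

For any sample containing more than one chain length, Mn < Mw < Mz, so the PDI exceeds 1; a perfectly uniform sample gives a PDI of exactly 1.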

How GPC Works


GPC separates based on the size or hydrodynamic volume (radius of gyration) of the analytes. This differs from other separation techniques which depend upon chemical or physical interactions to separate analytes.[3] Separation occurs via the use of porous beads packed in a column (see stationary phase (chemistry)). The smaller analytes can enter the pores more easily and therefore spend more time in these pores, increasing their retention time. Conversely, larger analytes spend little if any time in the pores and are eluted quickly. All columns have a range of molecular weights that can be separated.

Schematic of pore vs. analyte size


If an analyte is either too large or too small, it will be either not retained or completely retained, respectively. Analytes that are not retained are eluted with the free volume outside of the particles (Vo), while analytes that are completely retained are eluted with the volume of solvent held in the pores (Vi). The total volume can be considered by the following equation, where Vg is the volume of the polymer gel and Vt is the total volume:[3]

$$V_t = V_g + V_i + V_o$$

As can be inferred, there is a limited range of molecular weights that can be separated by each column, and therefore the size of the pores for the packing should be chosen according to the range of molecular weights of the analytes to be separated. For polymer separations the pore sizes should be on the order of the size of the polymers being analyzed. If a sample has a broad molecular weight range, it may be necessary to use several GPC columns in tandem with one another to fully resolve the sample.

Range of molecular weights that can be separated for each packing material
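One common textbook way to connect the void volume Vo and the pore volume Vi to where an analyte elutes is Ve = Vo + K·Vi, where K is a distribution coefficient between 0 (fully excluded from the pores) and 1 (free access to all pore volume). This relation is not stated in the text above and is introduced here as an assumption; the function name and the column volumes are hypothetical.

```python
def elution_volume(void_volume_ml: float, pore_volume_ml: float, k: float) -> float:
    """Return Ve = Vo + K * Vi for a distribution coefficient K in [0, 1]."""
    if void_volume_ml < 0 or pore_volume_ml < 0:
        raise ValueError("volumes must be non-negative")
    if not 0.0 <= k <= 1.0:
        raise ValueError("in ideal size exclusion K lies between 0 and 1")
    return void_volume_ml + k * pore_volume_ml


# Hypothetical column: 8 mL of solvent outside the particles, 10 mL held in the pores.
v_o, v_i = 8.0, 10.0
excluded = elution_volume(v_o, v_i, 0.0)   # too large for the pores: elutes first
partial = elution_volume(v_o, v_i, 0.5)    # samples half of the pore volume
permeating = elution_volume(v_o, v_i, 1.0) # small enough to enter every pore: elutes last

print(excluded, partial, permeating)
```

Under this assumption every analyte elutes between Vo and Vo + Vi, which is why a given packing can only resolve a limited range of molecular weights.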

Application
GPC is often used to determine the relative molecular weight of polymer samples as well as the distribution of molecular weights. What GPC truly measures is the molecular volume and shape function as defined by the intrinsic viscosity. If comparable standards are used, this relative data can be used to determine molecular weights to within ±5%. Polystyrene standards with a PDI of less than 1.2 are typically used to calibrate the GPC.[4] Unfortunately, polystyrene tends to be a very linear polymer, and therefore as a standard it is only useful for comparison with other polymers that are known to be linear and of roughly the same size.
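Calibration against narrow standards is often done by fitting log10(M) against retention volume and reading unknowns off the fitted line. The following is a minimal sketch of that idea using an ordinary least-squares straight line; the linear form, the function names and the standard values are assumptions for illustration (real calibration curves are frequently fitted with higher-order polynomials), not data from the text.

```python
import math
from typing import List, Tuple


def fit_calibration(standards: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fit log10(M) = intercept + slope * V by least squares.

    standards: (retention_volume_ml, molar_mass) pairs for narrow standards.
    Returns (slope, intercept).
    """
    volumes = [v for v, _ in standards]
    log_masses = [math.log10(m) for _, m in standards]
    n = len(standards)
    mean_v = sum(volumes) / n
    mean_log_m = sum(log_masses) / n
    s_vv = sum((v - mean_v) ** 2 for v in volumes)
    s_vm = sum((v - mean_v) * (y - mean_log_m) for v, y in zip(volumes, log_masses))
    slope = s_vm / s_vv
    intercept = mean_log_m - slope * mean_v
    return slope, intercept


def estimate_molar_mass(retention_volume_ml: float, slope: float, intercept: float) -> float:
    """Convert a retention volume to a standard-equivalent molar mass."""
    return 10 ** (intercept + slope * retention_volume_ml)


# Hypothetical polystyrene standards: larger chains elute earlier.
standards = [(12.0, 1.0e6), (14.0, 1.0e5), (16.0, 1.0e4), (18.0, 1.0e3)]
slope, intercept = fit_calibration(standards)
unknown_mass = estimate_molar_mass(15.0, slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, M(15 mL) = {unknown_mass:.0f}")
```

The result is a polystyrene-equivalent molar mass, so it carries the limitation described above: it is only meaningful for polymers whose shape in solution resembles that of the standard.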


Material and Methods


Instrumentation

A typical Waters GPC instrument including A. sample holder, B. column, C. pump, D. refractive index detector, E. UV-vis detector

The inside of sample holder of Waters GPC instrument

Gel permeation chromatography is conducted almost exclusively in chromatography columns. The experimental design is not much different from other techniques of liquid chromatography. Samples are dissolved in an appropriate solvent, which in the case of GPC tends to be an organic solvent, and after filtering, the solution is injected onto a column. A Waters GPC instrument is shown in the accompanying figure. The separation of a multi-component mixture takes place in the column. The constant supply of fresh eluent to the column is accomplished by the use of a pump. Since most analytes are not visible to the naked eye, a detector is needed. Often multiple detectors are used to gain additional information about the polymer sample. The availability of a detector makes the fractionation convenient and accurate.


Gel
Gels are used as stationary phase for GPC. The pore size of a gel must be carefully controlled in order to be able to apply the gel to a given separation. Other desirable properties of the gel forming agent are the absence of ionizing groups and, in a given solvent, low affinity for the substances to be separated. Commercial gels like Sephadex, Bio-Gel (cross-linked polyacrylamide), agarose gel and Styragel are often used based on different separation requirements.[5]

Eluent
The eluent (mobile phase) should be a good solvent for the polymer, should permit high detector response from the polymer and should wet the packing surface. The most common GPC eluents are tetrahydrofuran (THF) for polymers that dissolve at room temperature, o-dichlorobenzene and trichlorobenzene at 130–150 °C for crystalline polyalkynes, and m-cresol and o-chlorophenol at 90 °C for crystalline condensation polymers such as polyamides and polyesters.

Pump
There are two types of pumps available for uniform delivery of relatively small liquid volumes for GPC: piston or peristaltic pumps.

Detector
In GPC, the concentration by weight of polymer in the eluting solvent may be monitored continuously with a detector. There are many detector types available and they can be divided into two main categories. The first is concentration sensitive detectors, which include UV absorption, differential refractometer (DRI) or refractive index (RI), infrared (IR) absorption and density detectors. The second is molecular weight sensitive detectors, which include low angle laser light scattering (LALLS) and multi angle laser light scattering (MALLS) detectors.[6] The resulting chromatogram is therefore a weight distribution of the polymer as a function of retention volume.

GPC Chromatogram; Vo= no retention, Vt= complete retention, A and B = partial retention

The most sensitive detector is the differential UV photometer and the most common detector is the differential refractometer (DRI). When characterizing a copolymer, it is necessary to have two detectors in series.[4] For accurate determination of copolymer composition, at least two of those detectors should be concentration detectors.[6] Most copolymer compositions are determined using UV and RI detectors, although other combinations can be used.[7]


Data Analysis
Gel permeation chromatography (GPC) has become the most widely used technique for analyzing polymer samples in order to determine their molecular weights and weight distributions. Examples of GPC chromatograms of polystyrene samples with their molecular weights and PDIs are shown on the left.

GPC Separation of Anionically Synthesized Polystyrene; Mn=3,000 g/mol, PDI=1.32

GPC Separation of Free-Radical Synthesized Polystyrene; Mn=24,000 g/mol, PDI=4.96
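The Mn and PDI values quoted in the captions above are averages taken over the whole chromatogram. As a minimal sketch, assuming the chromatogram has already been converted into slices of molar mass and a detector signal proportional to the weight of polymer in each slice, the averages could be computed as follows. The function name and the slice data are hypothetical and only illustrate the arithmetic.

```python
def molecular_weight_averages(slices):
    """Return (Mn, Mw, PDI) from (molar_mass, weight_signal) slices.

    weight_signal is assumed to be proportional to the mass concentration
    of polymer in the slice, as a concentration detector would report.
    """
    total_weight = sum(w for _, w in slices)
    # The number of chains in a slice is proportional to weight / molar mass.
    total_number = sum(w / m for m, w in slices)
    mn = total_weight / total_number                    # number average
    mw = sum(w * m for m, w in slices) / total_weight   # weight average
    return mn, mw, mw / mn


# Hypothetical slices: (molar mass in g/mol, signal in arbitrary units).
example_slices = [(1_000.0, 1.0), (2_000.0, 2.0), (4_000.0, 1.0)]
mn, mw, pdi = molecular_weight_averages(example_slices)
print(f"Mn = {mn:.0f} g/mol, Mw = {mw:.0f} g/mol, PDI = {pdi:.3f}")
```

Because Mw weights each slice by its molar mass, it is never smaller than Mn, so the PDI is at least 1 and grows as the distribution broadens.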


Benoit and co-workers proposed that the hydrodynamic volume, Vη, which is proportional to the product of [η] and M, where [η] is the intrinsic viscosity of the polymer in the SEC eluent, may be used as the universal calibration parameter. If the Mark-Houwink-Sakurada constants K and α are known (see Mark-Houwink equation), a plot of log [η]M versus elution volume (or elution time) for a particular solvent, column and instrument provides a universal calibration curve which can be used for any polymer in that solvent. By determining the retention volumes (or times) of monodisperse polymer standards (e.g. solutions of monodispersed polystyrene in THF), a calibration curve can be obtained by plotting the logarithm of the molecular weight versus the retention time or volume. Once the calibration curve is obtained, the gel permeation chromatogram of any other polymer can be obtained in the same solvent and the molecular weights (usually Mn and Mw) and the complete molecular weight distribution for the polymer can be determined. A typical calibration curve is shown to the right and the molecular weight from an unknown sample can be obtained from the calibration curve.

Standardization of a size exclusion column.
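The universal calibration idea can be illustrated numerically. The sketch below assumes the Mark-Houwink relation [η] = K·M^α for both the standard and the sample, and assumes that two polymers eluting at the same volume have the same product [η]M. The function name and the K and α values are placeholders chosen for illustration, not literature constants.

```python
def convert_molar_mass(m_standard, k_standard, alpha_standard,
                       k_sample, alpha_sample):
    """Convert a standard-equivalent molar mass to a sample molar mass.

    At equal elution volume the products [eta]*M are assumed equal:
        K_std * M_std**(1 + a_std) = K_smp * M_smp**(1 + a_smp)
    """
    eta_times_m = k_standard * m_standard ** (1.0 + alpha_standard)
    return (eta_times_m / k_sample) ** (1.0 / (1.0 + alpha_sample))


# Placeholder Mark-Houwink constants (illustrative values only).
K_STANDARD, ALPHA_STANDARD = 1.0e-4, 0.70
K_SAMPLE, ALPHA_SAMPLE = 2.0e-4, 0.65

m_sample = convert_molar_mass(100_000.0, K_STANDARD, ALPHA_STANDARD,
                              K_SAMPLE, ALPHA_SAMPLE)
print(f"Sample molar mass at the same elution volume: {m_sample:.0f} g/mol")
```

If both polymers share the same K and α, the function returns the input unchanged, which is the case where a conventional polystyrene calibration would already be exact.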

Advantages of GPC
As a separation technique, GPC has many advantages. First of all, it has a well-defined separation time because there is a final elution volume for all unretained analytes. Additionally, GPC can provide narrow bands, although this is more difficult for polymer samples that contain a broad range of molecular weights. Finally, since the analytes do not interact chemically or physically with the column, there is a lower chance of analyte loss.[3] For investigating the properties of polymer samples in particular, GPC can be very advantageous. GPC provides a more convenient method of determining the molecular weights of polymers. In fact, most samples can be thoroughly analyzed in an hour or less.[8] Other methods used in the past were fractional extraction and fractional precipitation. As these processes were quite labor intensive, molecular weights and mass distributions typically were not analyzed.[9] Therefore, GPC has allowed for the quick and relatively easy estimation of molecular weights and distribution for polymer samples.

Disadvantages of GPC
There are disadvantages to GPC, however. First, there is a limited number of peaks that can be resolved within the short time scale of the GPC run. Also, as a technique GPC requires a difference in molecular weight of at least about 10% for reasonable resolution of peaks to occur.[3] With regard to polymers, the molecular masses of most of the chains will be too close for the GPC separation to show anything more than broad peaks. Another disadvantage of GPC for polymers is that filtrations must be performed before using the instrument to prevent dust and other particulates from ruining the columns and interfering with the detectors. Although useful for protecting the instrument, pre-filtration of the sample can remove higher molecular weight material before it is loaded on the column.


References
[1] Lathe, G.H.; Ruthven, C.R.J. The Separation of Substance and Estimation of their Relative Molecular Sizes by the use of Columns of Starch in Water. Biochem J. 1956, 62, 665–674. PMID 13249976
[2] Moore, J.C. Gel permeation chromatography. I. A new method for molecular weight distribution of high polymers. J. Polym. Sci., 1964, 2, 835–843. (http://www3.interscience.wiley.com/cgi-bin/fulltext/104042350/PDFSTART) doi:10.1002/pol.1964.100020220
[3] Skoog, D.A. Principles of Instrumental Analysis, 6th ed.; Thomson Brooks/Cole: Belmont, CA, 2006, Chapter 28.
[4] Sandler, S.R.; Karo, W.; Bonesteel, J.; Pearce, E.M. Polymer Synthesis and Characterization: A Laboratory Manual; Academic Press: San Diego, 1998.
[5] Helmut, D. Gel Chromatography, Gel Filtration, Gel Permeation, Molecular Sieves: A Laboratory Handbook; Springer-Verlag, 1969.
[6] Trathnigg, B. Determination of MWD and Chemical Composition of Polymers by Chromatographic Techniques. Prog. Polym. Sci. 1995, 20, 615–650. (http://www.sciencedirect.com/science?_ob=MImg&_imagekey=B6TX2-3YCDW2F-C-1&_cdi=5578&_user=99318&_orig=search&_coverDate=12/31/1995&_sk=999799995&view=c&wchp=dGLzVtz-zSkWA&md5=752a5001c8e7a52bd9aadd999673ed57&ie=/sdarticle.pdf) doi:10.1016/0079-6700(95)00005-Z
[7] Pasch, H. Hyphenated Techniques in Liquid Chromatography of Polymers. Adv. Polym. Sci. 2000, 150, 1–66. (http://www.springerlink.com/content/75w211mepm23dxmt/fulltext.pdf) doi:10.1007/3-540-48764-6
[8] Cowie, J.M.G.; Arrighi, V. Polymers: Chemistry and Physics of Modern Materials, 3rd ed.; CRC Press, 2008.
[9] Odian, G. Principles of Polymerization, 3rd ed.; Wiley Interscience Publication, 1991.

Size-exclusion chromatography


Equipment for running size-exclusion chromatography. The buffer is pumped through the column (right) by a computer-controlled device.

Acronym: SEC
Classification: Chromatography
Analytes: macromolecules, synthetic polymers, biomolecules
Other techniques
Related: High performance liquid chromatography, Aqueous Normal Phase Chromatography, Ion exchange chromatography, Micellar liquid chromatography

Size-exclusion chromatography (SEC) is a chromatographic method in which molecules in solution are separated by their size, and in some cases molecular weight.[1] It is usually applied to large molecules or macromolecular complexes such as proteins and industrial polymers. Typically, when an aqueous solution is used to transport the sample through the column, the technique is known as gel-filtration chromatography, whereas the name gel permeation chromatography is used when the mobile phase is an organic solvent. SEC is a widely used polymer characterization method because of its ability to provide good molar mass distribution (Mw) results for polymers.


Applications
The main application of gel-filtration chromatography is the fractionation of proteins and other water-soluble polymers, while gel permeation chromatography is used to analyze the molecular weight distribution of organic-soluble polymers. Neither technique should be confused with gel electrophoresis, where an electric field is used to "pull" or "push" molecules through the gel depending on their electrical charges.

Advantages
The advantages of this method include good separation of large molecules from small molecules with a minimal volume of eluate,[2] and that various solutions can be applied without interfering with the filtration process, all while preserving the biological activity of the particles to be separated. The technique is generally combined with others that further separate molecules by other characteristics, such as acidity, basicity, charge, and affinity for certain compounds. With size exclusion chromatography, there are short and well-defined separation times and narrow bands, which lead to good sensitivity. There is also no sample loss because solutes do not interact with the stationary phase.

Disadvantages are, for example, that only a limited number of bands can be accommodated because the time scale of the chromatogram is short, and that, in general, there has to be a 10% difference in molecular mass to obtain good resolution.[3]

Discovery
The technique was invented by Grant Henry Lathe and Colin R Ruthven, working at Queen Charlotte's Hospital, London.[4][5] They later received the John Scott Award for this invention.[6] While Lathe and Ruthven used starch gels as the matrix, Jerker Porath and Per Flodin later introduced dextran gels;[7] other gels with size fractionation properties include agarose and polyacrylamide. A short review of these developments has appeared.[8] There were also attempts to fractionate synthetic high polymers; however, it was not until 1964, when J. C. Moore of the Dow Chemical Company published his work on the preparation of gel permeation chromatography (GPC) columns based on cross-linked polystyrene with controlled pore size,[9] that a rapid increase of research activity in this field began. It was recognized almost immediately that, with proper calibration, GPC was capable of providing molar mass and molar mass distribution information for synthetic polymers. Because the latter information was difficult to obtain by other methods, GPC came rapidly into extensive use.[10]

Theory and method


One requirement for SEC is that the analyte does not interact with the surface of the stationary phases. Differences in elution time are based solely on the volume the analyte "sees". Thus, a small molecule that can penetrate every corner of the pore system of the stationary phase "sees" the entire pore volume and the interparticle volume, and will elute late (when a volume of mobile phase equal to the pore and interparticle volume, ~80% of the column volume, has passed through the column). At the other extreme, a very large molecule that cannot penetrate the pore system "sees" only the interparticle volume (~35% of the column volume) and will elute earlier, when this volume of mobile phase has passed through the column.

The underlying principle of SEC is that particles of different sizes will elute (filter) through a stationary phase at different rates. This results in the separation of a solution of particles based on size. Provided that all the particles are loaded simultaneously or near-simultaneously, particles of the same size should elute together.

However, as there are various measures of the size of a macromolecule (for instance, the radius of gyration and the hydrodynamic radius), a fundamental problem in the theory of SEC has been the choice of a proper molecular size parameter by which molecules of different kinds are separated. Experimentally, Benoit and co-workers found an excellent correlation between elution volume and a dynamically based molecular size, the hydrodynamic volume, for several different chain architectures and chemical compositions.[11] The observed correlation based on the hydrodynamic volume became accepted as the basis of universal SEC calibration.

Still, the use of the hydrodynamic volume, a size based on dynamical properties, in the interpretation of SEC data is not fully understood.[12] This is because SEC is typically run under low flow rate conditions where hydrodynamic factors should have little effect on the separation. In fact, both theory and computer simulations assume a thermodynamic separation principle: the separation process is determined by the equilibrium distribution (partitioning) of solute macromolecules between two phases, namely a dilute bulk solution phase located in the interstitial space and confined solution phases within the pores of the column packing material. Based on this theory, it has been shown that the relevant size parameter for the partitioning of polymers in pores is the mean span dimension (mean maximal projection onto a line).[13] Although this issue has not been fully resolved, it is likely that the mean span dimension and the hydrodynamic volume are strongly correlated.

Each size exclusion column has a range of molecular weights that can be separated. The exclusion limit defines the molecular weight at the upper end of this range and is where molecules are too large to be trapped in the stationary phase. The permeation limit defines the molecular weight at the lower end of the range of separation and is where molecules are small enough to penetrate the pores of the stationary phase completely; all molecules below this molecular mass are so small that they elute as a single band.[14]

The separation is usually achieved with an apparatus called a column, which consists of a hollow tube tightly packed with extremely small porous polymer beads designed to have pores of different sizes. These pores may be depressions on the surface or channels through the bead. As the solution travels down the column, some particles enter the pores. Larger particles cannot enter as many pores. The larger the particles, the faster the elution. The filtered solution that is collected at the end is known as the eluate. The void volume includes any particles too large to enter the medium, and the solvent volume is known as the column volume.

A size exclusion column.
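The two limiting cases described in this section (elution at about 35% of the column volume for fully excluded molecules and at about 80% for fully permeating ones) can be interpolated with a simple partition model. The sketch below assumes the idealised relation Ve = Vo + K·Vi, where Vo is the interparticle (void) volume, Vi is the pore volume and K is the fraction of the pore volume accessible to the analyte. This relation, the function name and the 100 mL column are assumptions made for illustration and are not taken from the text.

```python
def elution_volume(column_volume_ml, accessible_pore_fraction,
                   interparticle_fraction=0.35, total_liquid_fraction=0.80):
    """Estimate the elution volume as Ve = Vo + K * Vi (idealised model).

    The default fractions are the approximate values quoted in the text:
    ~35% of the column volume is interparticle volume and ~80% is
    interparticle plus pore volume.
    """
    if not 0.0 <= accessible_pore_fraction <= 1.0:
        raise ValueError("accessible_pore_fraction must lie between 0 and 1")
    void_volume = interparticle_fraction * column_volume_ml
    pore_volume = (total_liquid_fraction - interparticle_fraction) * column_volume_ml
    return void_volume + accessible_pore_fraction * pore_volume


# Hypothetical 100 mL column: excluded, intermediate and fully permeating analytes.
for k in (0.0, 0.5, 1.0):
    print(f"K = {k:.1f}: Ve = {elution_volume(100.0, k):.1f} mL")
```

In this picture every analyte elutes between Vo and Vo + Vi, which is why the separation time of an SEC run is bounded.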


Factors affecting filtration


In real-life situations, particles in solution do not have a fixed size, so there is some probability that a particle that would otherwise be hampered by a pore passes right by it. Also, the stationary-phase particles are not ideally defined; both particles and pores may vary in size. Elution curves, therefore, resemble Gaussian distributions. The stationary phase may also interact in undesirable ways with a particle and influence retention times, though great care is taken by column manufacturers to use stationary phases that are inert and minimize this issue.

A cartoon illustrating the theory behind size exclusion chromatography

Like other forms of chromatography, increasing the column length will enhance the resolution, and increasing the column diameter increases the capacity of the column. Proper column packing is important to maximize resolution: An overpacked column can collapse the pores in the beads, resulting in a loss of resolution. An underpacked column can reduce the relative surface area of the stationary phase accessible to smaller species, resulting in those species spending less time trapped in pores. Unlike affinity chromatography techniques, a solvent head at the top of the column can drastically diminish resolution as the sample diffuses prior to loading, broadening the downstream elution.

Analysis
In simple manual columns, the eluent is collected in constant volumes, known as fractions. The more similar the particles are in size, the more likely they are to be in the same fraction and not detected separately. More advanced columns overcome this problem by constantly monitoring the eluent. The collected fractions are often examined by spectroscopic techniques to determine the concentration of the particles eluted. Common spectroscopic detection techniques are refractive index (RI) and ultraviolet (UV). When eluting spectroscopically similar species (such as during biological purification), other techniques may be necessary to identify the contents of each fraction. It is also possible to analyse the eluent flow continuously with RI, LALLS, multi-angle laser light scattering (MALS), UV, and/or viscosity measurements.

Standardization of a size exclusion column.

The elution volume (Ve) decreases roughly linearly with the logarithm of the molecular hydrodynamic volume. Columns are often calibrated using 4-5 standard samples (e.g., folded proteins of known molecular weight), and a sample containing a very large molecule such as thyroglobulin to determine the void volume. (Blue dextran is not recommended for Vo determination because it is heterogeneous and may give variable results.) The elution volumes of the standards are divided by the elution volume of the thyroglobulin (Ve/Vo) and plotted against the log of the standards' molecular weights.
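The calibration procedure described above amounts to fitting a straight line of Ve/Vo against log10 of the molecular weight and then reading an unknown off that line. A minimal sketch with an ordinary least-squares fit is given below. The four standards are invented numbers chosen to lie exactly on a line, and the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_molecular_weight(ve_over_vo, slope, intercept):
    """Invert the calibration line to get a molecular weight estimate."""
    return 10.0 ** ((ve_over_vo - intercept) / slope)


# Invented standards: (molecular weight, measured Ve/Vo).
standards = [(1e3, 2.2), (1e4, 1.9), (1e5, 1.6), (1e6, 1.3)]
log_mw = [math.log10(mw) for mw, _ in standards]
ratios = [ratio for _, ratio in standards]

slope, intercept = fit_line(log_mw, ratios)
unknown_mw = estimate_molecular_weight(1.75, slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"Estimated molecular weight at Ve/Vo = 1.75: {unknown_mw:.0f}")
```

The estimate is only meaningful inside the range spanned by the standards, and only for analytes whose shape resembles that of the standards, since the column responds to hydrodynamic volume rather than mass.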

SEC Chromatogram of a biological sample.

Applications


Biochemical applications
In general, SEC is considered a low resolution chromatography as it does not discern similar species very well, and is therefore often reserved for the final "polishing" step of a purification. The technique can determine the quaternary structure of purified proteins that have slow exchange times, since it can be carried out under native solution conditions, preserving macromolecular interactions. SEC can also assay protein tertiary structure, as it measures the hydrodynamic volume (not molecular weight), allowing folded and unfolded versions of the same protein to be distinguished. For example, the apparent hydrodynamic radius of a typical protein domain might be 14 Å and 36 Å for the folded and unfolded forms, respectively. SEC allows the separation of these two forms, as the folded form will elute much later due to its smaller size.

Polymer synthesis
SEC can be used as a measure of both the size and the polydispersity of a synthesised polymer, that is, to find the distribution of the sizes of polymer molecules. If standards of a known size are run previously, then a calibration curve can be created to determine the sizes of polymer molecules of interest in the solvent chosen for analysis (often THF). Alternatively, techniques such as light scattering and/or viscometry can be used online with SEC to yield absolute molecular weights that do not rely on calibration with standards of known molecular weight. Because two polymers with identical molecular weights can differ in size, the absolute determination methods are, in general, more desirable. A typical SEC system can quickly (in about half an hour) give polymer chemists information on the size and polydispersity of the sample. Preparative SEC can be used for polymer fractionation on an analytical scale.

Drawback
In SEC, mass is not measured so much as the hydrodynamic volume of the polymer molecules, that is, how much space a particular polymer molecule takes up when it is in solution. However, the approximate molecular weight can be calculated from SEC data because the exact relationship between molecular weight and hydrodynamic volume for polystyrene can be found. For this, polystyrene is used as a standard. But the relationship between hydrodynamic volume and molecular weight is not the same for all polymers, so only an approximate measurement can be obtained.[15] Another drawback is the possibility of interaction between the stationary phase and the analyte. Any interaction leads to a later elution time and thus mimics a smaller analyte size.

Absolute size-exclusion chromatography


Absolute size-exclusion chromatography (ASEC) is a technique that couples a dynamic light scattering (DLS) instrument to a size exclusion chromatography system for absolute size measurements of proteins and macromolecules as they elute from the chromatography system. The definition of absolute used here is that it does not require calibration to obtain hydrodynamic size, often referred to as hydrodynamic diameter (DH in units of nm). The sizes of the macromolecules are measured as they elute into the flow cell of the DLS instrument from the size exclusion column set. It should be noted that the hydrodynamic size of the molecules or particles is measured and not their molecular weights. For proteins, a Mark-Houwink type of calculation can be used to estimate the molecular weight from the hydrodynamic size.

A big advantage of DLS coupled with SEC is the ability to obtain enhanced DLS resolution. Batch DLS is quick and simple and provides a direct measure of the average size, but the baseline resolution of DLS is a ratio of 3 to 1 in diameter. Using SEC, the proteins and protein oligomers are separated, allowing oligomeric resolution. Aggregation studies can also be done using ASEC. Although the aggregate concentration may not be calculated, the size of the aggregate will be measured, limited only by the maximum size eluting from the SEC columns.

Limitations of ASEC include flow-rate, concentration, and precision. Because a correlation function requires anywhere from 3 to 7 seconds to properly build, a limited number of data points can be collected across the peak.
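The link between flow rate and the number of DLS readings per peak can be made concrete with a short calculation. The sketch below assumes that a peak occupies a fixed eluent volume and that each reading needs one full correlation-function acquisition. The function name, the 1 mL peak volume and the flow rates are hypothetical values for illustration.

```python
import math


def points_across_peak(peak_volume_ml, flow_rate_ml_per_min, acquisition_time_s):
    """Count the complete DLS acquisitions that fit within one eluting peak."""
    peak_duration_s = peak_volume_ml / flow_rate_ml_per_min * 60.0
    return math.floor(peak_duration_s / acquisition_time_s)


# Hypothetical 1 mL peak at two flow rates, using the 3 s and 7 s limits.
for flow in (1.0, 0.5):
    fast = points_across_peak(1.0, flow, 3.0)
    slow = points_across_peak(1.0, flow, 7.0)
    print(f"{flow} mL/min: between {slow} and {fast} points across the peak")
```

Under these assumptions, halving the flow rate doubles the time the peak spends in the flow cell and therefore roughly doubles the number of readings, at the cost of a longer run.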


References
[1] Paul-Dauphin, Stephanie; Morgan; Millan-Agorio; Herod; Kandiyoti. "Probing Size Exclusion Mechanisms of Complex Hydrocarbon Mixtures: The Effect of Altering Eluent Compositions" (http://pubs.acs.org/doi/abs/10.1021/ef700410e). Energy & Fuels 21 (6): 3484–3489. doi:10.1021/ef700410e. Retrieved October 6, 2007.
[2] Skoog, D. A.; Principles of Instrumental Analysis, 6th ed.; Thomson Brooks/Cole: Belmont, CA, 2006, Chapter 28.
[3] Skoog, D. A.; Principles of Instrumental Analysis, 6th ed.; Thomson Brooks/Cole: Belmont, CA, 2006, Chapter 28.
[4] Lathe, GH and Ruthven, CR (1955) The separation of substances on the basis of their molecular weights, using columns of starch and water. Biochem J. 60(4): xxxiv.
[5] Lathe, GH and Ruthven, CR (1956) The separation of substances and estimation of their relative molecular sizes by the use of columns of starch in water. Biochem. J. 62(4): 665–674. article (http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=13315231)
[6] John Scott Award (http://www.garfield.library.upenn.edu/johnscottaward.html)
[7] Porath, J and Flodin, P (1959) Gel filtration: A method for desalting and group separation. Nature 183(4676): 1657–1659.
[8] Eisenstein, M (2006) A look back, adventures in the matrix. Nature Methods 3(5): 410. article (http://www.nature.com/nmeth/journal/v3/n5/full/nmeth0506-410.html)
[9] Moore, J. C. (1964) Gel Permeation Chromatography. 1. A New Method for Molecular Weight Distribution of High Polymers. Journal of Polymer Science: Part A 2: 835–843 (http://onlinelibrary.wiley.com/doi/10.1002/pol.1964.100020220/abstract)
[10] Striegel, A. M.; Kirkland, J. J.; Yau, W. W.; Bly, D. D.; Modern Size Exclusion Chromatography, Practice of Gel Permeation and Gel Filtration Chromatography, 2nd ed.; Wiley: NY, 2009.
[11] Grubisic, Z.; Rempp, P.; Benoit, H. (1967) A universal calibration for Gel Permeation Chromatography. Journal of Polymer Science: Polym. Lett. 5: 753–759.
[12] Sun, T.; Chance, R. R.; Graessley, W. W.; and Lohse, D. J. (2004) A study of the separation principle in Size Exclusion Chromatography. Macromolecules 37: 4304–4312.
[13] Wang, Y.; Teraoka, I.; Hansen, F. Y.; Peters, G. H.; and Hassager, O. (2010) A theoretical study of the separation principle in Size Exclusion Chromatography. Macromolecules 43: 1651–1659.
[14] Skoog, D. A.; Principles of Instrumental Analysis, 6th ed.; Thomson Brooks/Cole: Belmont, CA, 2006, Chapter 28.
[15] Polymer Science Learning Center (PSLC) - Size Exclusion Chromatography (http://pslc.ws/mactest/sec.htm)

Affinity chromatography

Affinity chromatography is a method of separating biochemical mixtures based on a highly specific interaction such as that between antigen and antibody, enzyme and substrate, or receptor and ligand.

Uses
Affinity chromatography can be used to:
• Purify and concentrate a substance from a mixture into a buffering solution
• Reduce the amount of a substance in a mixture
• Discern what biological compounds bind to a particular substance
• Purify and concentrate an enzyme solution

Principle
The immobile phase is typically a gel matrix, often of agarose, a linear sugar molecule derived from algae.[1] Usually the starting point is an undefined heterogeneous group of molecules in solution, such as a cell lysate, growth medium or blood serum. The molecule of interest will have a well known and defined property which can be exploited during the affinity purification process. The process itself can be thought of as an entrapment, with the target molecule becoming trapped on a solid or stationary phase or medium. The other molecules in solution will not become trapped as they do not possess this property. The solid medium can then be removed from the mixture and washed, and the target molecule released from the entrapment in a process known as elution. Possibly the most common use of affinity chromatography is for the purification of recombinant proteins.

Batch and column setup


Binding to the solid phase may be achieved by column chromatography, whereby the solid medium is packed onto a column, the initial mixture is run through the column to allow setting, a wash buffer is run through the column, and the elution buffer is subsequently applied to the column and collected. These steps are usually done at ambient pressure.

Alternatively, binding may be achieved using a batch treatment: adding the initial mixture to the solid phase in a vessel, mixing, separating the solid phase (for example, by centrifuging), removing the liquid phase, washing, re-centrifuging, adding the elution buffer, re-centrifuging and removing the eluate.

Sometimes a hybrid method is employed: the binding is done by the batch method, then the solid phase with the target molecule bound is packed onto a column, and washing and elution are done on the column.

A third method, expanded bed adsorption, which combines the advantages of the two methods mentioned above, has also been developed. The solid phase particles are placed in a column where the liquid phase is pumped in from the bottom and exits at the top. The weight of the particles ensures that the solid phase does not exit the column with the liquid phase.

Affinity columns can be eluted by changing the ionic strength through a gradient. Salt concentration, pH, pI, charge and ionic strength can all be used to separate or to form the gradient used for separation.

Column chromatography

Specific uses
Affinity chromatography can be used in a number of applications, including nucleic acid purification, protein purification from cell free extracts, and purification from blood.

Immunoaffinity
Another use for the procedure is the affinity purification of antibodies from blood serum. If serum is known to contain antibodies against a specific antigen (for example, if the serum comes from an organism immunized against the antigen concerned), then it can be used for the affinity purification of that antigen. This is also known as immunoaffinity chromatography. For example, if an organism is immunised against a GST-fusion protein, it will produce antibodies against the fusion protein, and possibly antibodies against the GST tag as well. The protein can then be covalently coupled to a solid support such as agarose and used as an affinity ligand in purifications of antibody from immune serum.

For thoroughness, the GST protein and the GST-fusion protein can each be coupled separately. The serum is initially allowed to bind to the GST affinity matrix. This will remove antibodies against the GST part of the fusion protein. The serum is then separated from the solid support and allowed to bind to the GST-fusion protein matrix. This allows any antibodies that recognize the antigen to be captured on the solid support. Elution of the antibodies of interest is most often achieved using a low pH buffer such as glycine pH 2.8. The eluate is collected into a neutral tris or phosphate buffer, to neutralize the low pH elution buffer and halt any degradation of the antibody's activity. This is a nice example, as affinity purification is used to purify the initial GST-fusion protein, to remove the undesirable anti-GST antibodies from the serum and to purify the target antibody.

A simplified strategy is often employed to purify antibodies generated against peptide antigens. When the peptide antigens are produced synthetically, a terminal cysteine residue is added at either the N- or C-terminus of the peptide. This cysteine residue contains a sulfhydryl functional group which allows the peptide to be easily conjugated to a carrier protein (e.g. Keyhole Limpet Hemocyanin (KLH)). The same cysteine-containing peptide is also immobilized onto an agarose resin through the cysteine residue and is then used to purify the antibody.

Batch chromatography

Immobilized metal ion affinity chromatography


Immobilized metal ion affinity chromatography (IMAC) is based on the specific coordinate covalent bond of amino acids, particularly histidine, to metals. This technique works by allowing proteins with an affinity for metal ions to be retained in a column containing immobilized metal ions: cobalt, nickel or copper for the purification of histidine-containing proteins or peptides, and iron, zinc or gallium for the purification of phosphorylated proteins or peptides. Many naturally occurring proteins do not have an affinity for metal ions; therefore, recombinant DNA technology can be used to introduce such a protein tag into the relevant gene. Methods used to elute the protein of interest include changing the pH or adding a competitive molecule, such as imidazole.

Recombinant proteins
Possibly the most common use of affinity chromatography is for the purification of recombinant proteins. Proteins with a known affinity are tagged in order to aid their purification. The protein may have been genetically modified so as to allow it to be selected for affinity binding; this is known as a fusion protein. Tags include glutathione-S-transferase (GST), hexahistidine (His), and maltose binding protein (MBP). Histidine tags have an affinity for nickel or cobalt ions, which are held by coordinate covalent bonds to a chelator for the purposes of solid medium entrapment. For elution, an excess amount of a compound able to act as a metal ion ligand, such as imidazole, is used. GST has an affinity for glutathione, which is commercially available immobilized as glutathione agarose. During elution, excess glutathione is used to displace the tagged protein.

Lectins
Lectin affinity chromatography is a form of affinity chromatography where lectins are used to separate components within the sample. Lectins, such as Concanavalin A,[2] are proteins which can bind specific carbohydrate (sugar) molecules. The most common application is to separate glycoproteins from non-glycosylated proteins, or one glycoform from another glycoform.[3]

A chromatography column containing nickel-agarose beads used for purification of proteins with histidine tags


References
[1] Voet and Voet, Biochemistry. John Wiley and Sons; 1995.
[2] bioWORLD, Concanavalin A - Specifically binds to mannosyl and glucosyl residues of polysaccharides and glycoproteins. See MSDS (http://www.bio-world.com/site/accounts/masterfiles/MSDS/MS-22070010.pdf)
[3] GE Healthcare Life Sciences, Immobilized lectin (http://www.gelifesciences.com/aptrix/upp01077.nsf/Content/protein_purification~affinity~immobilized_lectin)


High-performance liquid chromatography


An HPLC. From left to right: a pumping device generating a gradient of two different solvents, a steel-reinforced column, and an apparatus for measuring the absorbance.

Acronym: HPLC
Classification: Chromatography
Analytes: organic molecules, biomolecules, ions, polymers
Related techniques: Chromatography; Aqueous normal-phase chromatography; Hydrophilic interaction chromatography; Ion exchange chromatography; Size exclusion chromatography; Micellar liquid chromatography
Hyphenated techniques: Liquid chromatography-mass spectrometry

High-performance liquid chromatography (HPLC; sometimes referred to as high-pressure liquid chromatography) is a chromatographic technique used in analytical chemistry and biochemistry to separate a mixture of compounds in order to identify, quantify, and purify its individual components. HPLC is considered an instrumental technique of analytical chemistry, as opposed to a gravimetric technique. It has many uses, including medical (e.g. detecting vitamin D concentrations in blood serum), legal (e.g. detecting performance-enhancing drugs in urine), research (e.g. purifying substances from a complex biological sample, or separating similar synthetic chemicals from each other), and manufacturing (e.g. during the production of pharmaceutical and biologic products).[1] HPLC can alternatively be described as a mass transfer process involving adsorption.

HPLC apparatus.


HPLC relies on the pressure of mechanical pumps acting on a liquid solvent to load a sample mixture onto a separation column, in which the separation occurs. An HPLC separation column is filled with solid particles (e.g. silica, polymers, or sorbents), and the sample mixture is separated into compounds as it interacts with the column particles. The separation is influenced by the condition of the liquid solvent (e.g. pressure, temperature), by chemical interactions between the sample mixture and the liquid solvent (e.g. hydrophobicity, protonation), and by chemical interactions between the sample mixture and the solid particles packed inside the separation column (e.g. ligand affinity, ion exchange).

A modern self contained HPLC.

HPLC is distinguished from ordinary liquid chromatography by its relatively high operating pressure (~150 bar, ~2000 PSI); ordinary liquid chromatography typically relies on gravity to provide pressure. Because of the higher pressure, HPLC columns have a relatively small internal diameter (e.g. 4.6 mm), are short (e.g. 250 mm), and are packed more densely with smaller particles, which helps achieve finer separations of a sample mixture than ordinary liquid chromatography can. This gives HPLC superior resolving power when separating mixtures, which is why it is a popular chromatographic technique.

An HPLC instrument typically includes a sampler by which the sample mixture is injected, one or more mechanical pumps for pushing liquid through a tubing system, a separation column, a digital analyte detector (e.g. a UV/Vis or photodiode array (PDA) detector) for qualitative or quantitative analysis of the separation, and a digital microprocessor with user software for controlling the HPLC components. Many different types of columns are available, varying in size and in the type (i.e. chemistry) of the packed solid particles. Some models of mechanical pump can also mix multiple liquids together, and the recipe or gradient of those liquids can modify the chemical interactions that occur in the HPLC column, and thereby modify the chemical separation of the mixture.

Operation
The sample mixture is introduced into the stream of mobile phase passing through the column. The components of the sample move through the column at different velocities, which are functions of specific physical or chemical interactions with the stationary phase. The velocity of each component depends on its chemical nature, on the nature of the stationary phase (column), and on the composition of the mobile phase. The time at which a specific analyte elutes (emerges from the column) is called the retention time. The retention time measured under particular conditions is considered an identifying characteristic of a given analyte. The use of smaller particle size packing materials requires higher operational pressure ("backpressure") and typically improves chromatographic resolution (i.e. the degree of separation between consecutive analytes emerging from the column).

Common mobile phases include any miscible combination of water with various organic solvents (the most common are acetonitrile and methanol). Some HPLC techniques use water-free mobile phases (see Normal-phase chromatography below). The aqueous component of the mobile phase may contain buffers, acids (such as formic, phosphoric or trifluoroacetic acid) or salts to assist in the separation of the sample components. The composition of the mobile phase may be kept constant ("isocratic elution mode") or varied ("gradient elution mode") during the chromatographic analysis. Isocratic elution is typically effective in the separation of sample components that are not very dissimilar in their affinity for the stationary phase. In gradient elution the composition of the mobile phase is varied, typically from low to high eluting strength. The eluting strength of the mobile phase is reflected by analyte retention times, with high eluting strength producing fast elution (short retention times). A typical gradient profile in reversed-phase chromatography might start at 5% acetonitrile (in water or aqueous buffer) and progress linearly to 95% acetonitrile over 5–25 minutes. Periods of constant mobile phase composition may be part of any gradient profile. For example, the mobile phase composition may be kept constant at 5% acetonitrile for 1–3 min, followed by a linear change up to 95% acetonitrile.

The composition of the mobile phase is chosen according to the intensity of interactions between analytes and stationary phase (e.g. hydrophobic interactions in reversed-phase HPLC). Depending on their affinity for the stationary and mobile phases, analytes partition between the two during the separation process taking place in the column. This partitioning process is similar to that which occurs during a liquid-liquid extraction but is continuous, not step-wise. In this example, using a water/acetonitrile gradient, more hydrophobic components will elute (come off the column) late, once the mobile phase becomes more concentrated in acetonitrile (i.e. a mobile phase of higher eluting strength). The choice of mobile phase components, additives (such as salts or acids) and gradient conditions depends on the nature of the column and sample components. Often a series of trial runs is performed with the sample in order to find the HPLC method that gives the best separation.
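The hold-then-ramp gradient profile described above can be sketched as a small function. This is an illustration only: the name percent_b and the default hold (2 min) and ramp (20 min) durations are assumptions picked from within the ranges quoted in the text, not values prescribed by any particular method.

```python
def percent_b(t_min, start=5.0, end=95.0, hold_min=2.0, ramp_min=20.0):
    """Percentage of strong solvent (e.g. acetonitrile) at time t_min.

    Profile: constant at `start` during the initial hold, then a linear
    ramp to `end` over `ramp_min` minutes, then constant at `end`.
    All default values are illustrative assumptions.
    """
    if t_min <= hold_min:
        return start
    if t_min >= hold_min + ramp_min:
        return end
    fraction_done = (t_min - hold_min) / ramp_min
    return start + fraction_done * (end - start)


if __name__ == "__main__":
    for t in (0, 2, 12, 22, 30):
        print(f"t = {t:>2} min: {percent_b(t):.1f}% acetonitrile")
```

With these assumed defaults the mobile phase stays at 5% for the first 2 minutes, passes 50% at 12 minutes, and reaches 95% at 22 minutes.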


Types
Partition chromatography
Partition chromatography was the first kind of chromatography that chemists developed. The partition coefficient principle has been applied in paper chromatography, thin layer chromatography, gas phase and liquid-liquid applications. The 1952 Nobel Prize in Chemistry was awarded to Archer John Porter Martin and Richard Laurence Millington Synge for their development of the technique, which they used to separate amino acids. Partition chromatography uses a retained solvent, on the surface or within the grains or fibers of an "inert" solid supporting matrix as with paper chromatography; or takes advantage of some coulombic and/or hydrogen donor interaction with the solid support. Molecules equilibrate (partition) between a liquid stationary phase and the eluent.

Known as Hydrophilic Interaction Chromatography (HILIC) in HPLC, this method separates analytes based on polar differences. HILIC most often uses a bonded polar stationary phase and water-miscible mobile phases with a high organic concentration. Partition HPLC has been used historically on unbonded silica or alumina supports. Each works effectively for separating analytes by relative polar differences. HILIC bonded phases have the advantage of separating acidic, basic and neutral solutes in a single chromatogram.[2]

The polar analytes diffuse into a stationary water layer associated with the polar stationary phase and are thus retained. Retention strengths increase with increased analyte polarity, and the interaction between the polar analyte and the polar stationary phase (relative to the mobile phase) increases the elution time. The interaction strength depends on the functional groups in the analyte molecule which promote partitioning, but can also include coulombic (electrostatic) interaction and hydrogen donor capability. Use of more polar solvents in the mobile phase will decrease the retention time of the analytes, whereas more hydrophobic solvents tend to increase retention times.

HILIC Partition Technique Useful Range


Normal-phase chromatography
Normal-phase chromatography was one of the first kinds of HPLC that chemists developed. Also known as normal-phase HPLC (NP-HPLC) or adsorption chromatography, this method separates analytes based on their affinity for a polar stationary surface such as silica; hence it is based on the analyte's ability to engage in polar interactions (such as hydrogen bonding or dipole-dipole interactions) with the sorbent surface. NP-HPLC uses a non-polar, non-aqueous mobile phase, and works effectively for separating analytes readily soluble in non-polar solvents. The analyte associates with and is retained by the polar stationary phase. Adsorption strengths increase with increased analyte polarity. The interaction strength depends not only on the functional groups present in the structure of the analyte molecule, but also on steric factors. The effect of steric hindrance on interaction strength allows this method to resolve (separate) structural isomers.

The use of more polar solvents in the mobile phase will decrease the retention time of analytes, whereas more hydrophobic solvents tend to induce slower elution (increased retention times). Very polar solvents, such as traces of water in the mobile phase, tend to adsorb to the solid surface of the stationary phase, forming a stationary bound (water) layer which is considered to play an active role in retention. This behavior is somewhat peculiar to normal-phase chromatography because it is governed almost exclusively by an adsorptive mechanism (i.e. analytes interact with a solid surface rather than with the solvated layer of a ligand attached to the sorbent surface; see also reversed-phase HPLC below). Adsorption chromatography is still widely used for structural isomer separations in both column and thin-layer chromatography formats on activated (dried) silica or alumina supports.
Partition- and NP-HPLC fell out of favor in the 1970s with the development of reversed-phase HPLC because of poor reproducibility of retention times due to the presence of a water or protic organic solvent layer on the surface of the silica or alumina chromatographic media. This layer changes with any changes in the composition of the mobile phase (e.g. moisture level) causing drifting retention times. Recently, partition chromatography has become popular again with the development of HILIC bonded phases which demonstrate improved reproducibility, and due to a better understanding of the range of usefulness of the technique.

Displacement chromatography
The basic principle of displacement chromatography is that a molecule with a high affinity for the chromatography matrix (the displacer) will compete effectively for binding sites, and thus displace all molecules with lesser affinities.[3] There are distinct differences between displacement and elution chromatography. In elution mode, substances typically emerge from a column in narrow, Gaussian peaks. Wide separation of peaks, preferably to baseline, is desired in order to achieve maximum purification. The speed at which any component of a mixture travels down the column in elution mode depends on many factors. But for two substances to travel at different speeds, and thereby be resolved, there must be substantial differences in some interaction between the biomolecules and the chromatography matrix. Operating parameters are adjusted to maximize the effect of this difference. In many cases, baseline separation of the peaks can be achieved only with gradient elution and low column loadings. Thus, two drawbacks to elution mode chromatography, especially at the preparative scale, are operational complexity, due to gradient solvent pumping, and low throughput, due to low column loadings. Displacement chromatography has advantages over elution chromatography in that components are resolved into consecutive zones of pure substances rather than peaks. Because the process takes advantage of the nonlinearity of the isotherms, a larger column feed can be separated on a given column, with the purified components recovered at significantly higher concentration.


Reversed-phase chromatography (RPC)


Reversed-phase HPLC (RP-HPLC) has a non-polar stationary phase and an aqueous, moderately polar mobile phase. One common stationary phase is a silica which has been surface-modified with RMe2SiCl, where R is a straight-chain alkyl group such as C18H37 or C8H17. With such stationary phases, retention time is longer for molecules which are less polar, while polar molecules elute more readily (early in the analysis). An investigator can increase retention times by adding more water to the mobile phase, thereby making the affinity of the hydrophobic analyte for the hydrophobic stationary phase stronger relative to the now more hydrophilic mobile phase. Similarly, an investigator can decrease retention time by adding more organic solvent to the eluent. RP-HPLC is so commonly used that it is often incorrectly referred to as "HPLC" without further specification. The pharmaceutical industry regularly employs RP-HPLC to qualify drugs before their release.

A chromatogram of a complex mixture (perfume water) obtained by reversed-phase HPLC

RP-HPLC operates on the principle of hydrophobic interactions, which originate from the high symmetry in the dipolar water structure and play an important role in many processes in life science. RP-HPLC allows the measurement of these interactive forces. The binding of the analyte to the stationary phase is proportional to the contact surface area around the non-polar segment of the analyte molecule upon association with the ligand on the stationary phase. This solvophobic effect is dominated by the force of water for "cavity-reduction" around the analyte and the C18-chain versus the complex of both. The energy released in this process is proportional to the surface tension of the eluent (water: 7.3×10⁻⁶ J/cm², methanol: 2.2×10⁻⁶ J/cm²) and to the hydrophobic surface of the analyte and the ligand respectively. The retention can be decreased by adding a less polar solvent (methanol, acetonitrile) into the mobile phase to reduce the surface tension of water. Gradient elution uses this effect by automatically reducing the polarity and the surface tension of the aqueous mobile phase during the course of the analysis.

Structural properties of the analyte molecule play an important role in its retention characteristics. In general, an analyte with a larger hydrophobic surface area (C-H, C-C, and generally non-polar atomic bonds, such as S-S and others) is retained longer because it does not interact with the water structure. On the other hand, analytes with higher polar surface area (conferred by the presence of polar groups, such as -OH, -NH2, -COO⁻ or -NH3+ in their structure) are less retained as they are better integrated into water. Such interactions are subject to steric effects in that very large molecules may have only restricted access to the pores of the stationary phase, where the interactions with surface ligands (alkyl chains) take place. Such surface hindrance typically results in less retention.

Retention time increases with hydrophobic (non-polar) surface area. Branched-chain compounds elute more rapidly than their corresponding linear isomers because the overall surface area is decreased. Similarly, organic compounds with single C-C bonds elute later than those with a C=C double or C≡C triple bond, as the double or triple bond is shorter than a single C-C bond.

Aside from mobile phase surface tension (organizational strength in eluent structure), other mobile phase modifiers can affect analyte retention. For example, the addition of inorganic salts causes a moderate linear increase in the surface tension of aqueous solutions (ca. 1.5×10⁻⁷ J/cm² per mol for NaCl, 2.5×10⁻⁷ J/cm² per mol for (NH4)2SO4), and because the entropy of the analyte-solvent interface is controlled by surface tension, the addition of salts tends to increase the retention time. This technique is used for mild separation and recovery of proteins and protection of their biological activity in protein analysis (hydrophobic interaction chromatography, HIC).

Another important factor is the mobile phase pH, since it can change the hydrophobic character of the analyte. For this reason most methods use a buffering agent, such as sodium phosphate, to control the pH. Buffers serve multiple purposes: they control the pH, neutralize the charge on the silica surface of the stationary phase, and act as ion-pairing agents to neutralize analyte charge. Ammonium formate is commonly added in mass spectrometry to improve detection of certain analytes by the formation of analyte-ammonium adducts. A volatile organic acid such as acetic acid, or most commonly formic acid, is often added to the mobile phase if mass spectrometry is used to analyze the column effluent. Trifluoroacetic acid is used infrequently in mass spectrometry applications because it persists in the detector and solvent delivery system, but it can be effective in improving retention of analytes such as carboxylic acids in applications using other detectors, as it is a fairly strong organic acid. The effects of acids and buffers vary by application but generally improve chromatographic resolution.

Reversed-phase columns are quite difficult to damage compared with normal silica columns; however, many reversed-phase columns consist of alkyl-derivatized silica particles and should never be used with aqueous bases, as these will destroy the underlying silica particle. They can be used with aqueous acid, but the column should not be exposed to the acid for too long, as it can corrode the metal parts of the HPLC equipment. RP-HPLC columns should be flushed with clean solvent after use to remove residual acids or buffers, and stored in an appropriate composition of solvent. The metal content of HPLC columns must be kept low if the best possible ability to separate substances is to be retained. A good test for the metal content of a column is to inject a sample which is a mixture of 2,2'- and 4,4'-bipyridine. Because the 2,2'-bipy can chelate the metal, the shape of the peak for the 2,2'-bipy will be distorted (tailed) when metal ions are present on the surface of the silica.


Size-exclusion chromatography
Size-exclusion chromatography (SEC), also known as gel permeation chromatography or gel filtration chromatography, separates particles on the basis of molecular size (strictly, by a particle's Stokes radius). It is generally a low-resolution chromatography and thus is often reserved for the final, "polishing" step of a purification. It is also useful for determining the tertiary structure and quaternary structure of purified proteins. SEC is used primarily for the analysis of large molecules such as proteins or polymers. SEC works by trapping smaller molecules in the pores of a particle. Larger molecules are too large to enter the pores and simply pass by them. Larger molecules therefore flow through the column more quickly than smaller molecules; that is, the smaller the molecule, the longer the retention time. This technique is widely used for the molecular weight determination of polysaccharides. SEC is the official technique (suggested by the European Pharmacopoeia) for the molecular weight comparison of different commercially available low-molecular-weight heparins.


Ion-exchange chromatography
In ion-exchange chromatography (IC), retention is based on the attraction between solute ions and charged sites bound to the stationary phase. Solute ions of the same charge as the charged sites on the column are excluded from binding, while solute ions of the opposite charge are retained on the column. Retained solute ions can be eluted from the column by changing the solvent conditions (e.g. increasing the ionic strength of the solvent system by increasing the salt concentration of the solution, increasing the column temperature, or changing the pH of the solvent).

Types of ion exchangers include:
Polystyrene resins: these allow cross-linkage, which increases the stability of the chain. Higher cross-linkage reduces swelling, which increases the equilibration time and ultimately improves selectivity.
Cellulose and dextran ion exchangers (gels): these possess larger pore sizes and low charge densities, making them suitable for protein separation.
Controlled-pore glass or porous silica.

In general, ion exchangers favor the binding of ions of higher charge and smaller radius. An increase in counter ion concentration (with respect to the functional groups in resins) reduces the retention time. A decrease in pH reduces the retention time in cation exchange, while an increase in pH reduces the retention time in anion exchange. By lowering the pH of the solvent in a cation exchange column, for instance, more hydrogen ions are available to compete for positions on the anionic stationary phase, thereby eluting weakly bound cations.

This form of chromatography is widely used in the following applications: water purification, preconcentration of trace components, ligand-exchange chromatography, ion-exchange chromatography of proteins, high-pH anion-exchange chromatography of carbohydrates and oligosaccharides, and others.

Bioaffinity chromatography
This chromatographic process relies on the property of biologically active substances to form stable, specific, and reversible complexes. The formation of these complexes involves the participation of common molecular forces such as the Van der Waals interaction, electrostatic interaction, dipole-dipole interaction, hydrophobic interaction, and the hydrogen bond. An efficient, biospecific bond is formed by a simultaneous and concerted action of several of these forces in the complementary binding sites.

Aqueous normal-phase chromatography


Aqueous normal-phase chromatography (ANP) is a chromatographic technique which encompasses the mobile phase region between reversed-phase chromatography (RP) and organic normal phase chromatography (ONP). This technique is used to achieve unique selectivity for hydrophilic compounds, showing normal phase elution using reversed-phase solvents.

Isocratic flow and gradient elution


A separation in which the mobile phase composition remains constant throughout the procedure is termed isocratic (meaning constant composition). The word was coined by Csaba Horvath, who was one of the pioneers of HPLC. The mobile phase composition does not have to remain constant. A separation in which the mobile phase composition is changed during the separation process is described as a gradient elution.[4] One example is a gradient starting at 10% methanol and ending at 90% methanol after 20 minutes. The two components of the mobile phase are typically termed "A" and "B"; A is the "weak" solvent, which allows the solute to elute only slowly, while B is the "strong" solvent, which rapidly elutes the solutes from the column. In reversed-phase chromatography, solvent A is often water or an aqueous buffer, while B is an organic solvent miscible with water, such as acetonitrile, methanol, THF, or isopropanol.

In isocratic elution, peak width increases linearly with retention time, according to the equation for N, the number of theoretical plates. This leads to the disadvantage that late-eluting peaks get very flat and broad. Their shape and width may keep them from being recognized as peaks. Gradient elution decreases the retention of the later-eluting components so that they elute faster, giving narrower (and taller) peaks for most components. This also improves the peak shape for tailed peaks, as the increasing concentration of the organic eluent pushes the tailing part of a peak forward. This also increases the peak height (the peak looks "sharper"), which is important in trace analysis. The gradient program may include sudden "step" increases in the percentage of the organic component, or different slopes at different times, all according to the desire for optimum separation in minimum time.

In isocratic elution, the selectivity does not change if the column dimensions (length and inner diameter) change; that is, the peaks elute in the same order. In gradient elution, the elution order may change as the dimensions or flow rate change. The driving force in reversed-phase chromatography originates in the high order of the water structure. The role of the organic component of the mobile phase is to reduce this high order and thus reduce the retarding strength of the aqueous component.
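The linear growth of peak width with retention time can be made explicit. The sketch below assumes the commonly used baseline-width definition of the plate number, which is not spelled out in the text; t_R denotes the retention time and W the peak width at the baseline, and N is taken as constant for a given column and method.

```latex
N = 16\left(\frac{t_R}{W}\right)^{2}
\quad\Longrightarrow\quad
W = \frac{4\,t_R}{\sqrt{N}}
```

As an illustration with assumed numbers, a column with N = 10,000 gives W = 0.2 min for a peak at t_R = 5 min and W = 0.8 min for a peak at t_R = 20 min: the later peak is four times as wide.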


Parameters
Theoretical
HPLC separations are described by theoretical parameters and equations that characterize the separation of components into signal peaks when detected by instrumentation such as a UV detector or mass spectrometer. The parameters are empirical: they are not predicted but measured from HPLC chromatograms. They are analogous to the retention factor calculated for a paper chromatography separation, but describe how well HPLC separates a mixture into two or more components that are detected as peaks (bands) on a chromatogram. The HPLC parameters are the efficiency factor (N), the retention factor (kappa prime, k′), and the separation factor (alpha). Together these factors are the variables in a resolution equation, which describes how well the peaks of two components are separated from, or overlap, each other. These parameters are mostly used to describe reversed-phase and normal-phase HPLC separations, since those separations tend to be more subtle than other HPLC modes (e.g. ion exchange and size exclusion).

Void volume is the amount of space in a column that is occupied by solvent. It is the space within the column that is outside of the column's internal packing material. Void volume is measured on a chromatogram as the first component peak detected, which is usually the solvent that was present in the sample mixture; ideally the sample solvent flows through the column without interacting with the column, but is still detectable as distinct from the HPLC solvent. The void volume is used as a correction factor for the other factors.

Efficiency factor (N) measures in practice how sharp the component peaks on the chromatogram are, as a ratio of the peak's position (its retention time) to the width of the peak at its widest point (at the baseline). Peaks that are tall, sharp, and relatively narrow indicate that the separation method efficiently removed a component from the mixture (high efficiency). Efficiency is very dependent upon the HPLC column and the HPLC method used. Efficiency factor is synonymous with plate number and with the "number of theoretical plates".

Retention factor (kappa prime, k′) measures how long a component of the mixture stuck to the column, determined from the retention time of its peak in a chromatogram (since HPLC chromatograms are a function of time). Each chromatogram peak has its own retention factor (e.g. k′1 for the retention factor of the first peak). This factor may be corrected for the void volume of the column.

Separation factor (alpha) is a relative comparison of how well two neighboring components of the mixture were separated (i.e. two neighboring bands on a chromatogram). This factor is defined as a ratio of the retention factors of a pair of neighboring chromatogram peaks, and may also be corrected for the void volume of the column. The further the separation factor is above 1.0, the better the separation, up to about 2.0, beyond which an HPLC method is probably not needed for the separation.

Resolution equations relate the three factors such that high efficiency and separation factors improve the resolution of component peaks in an HPLC separation.
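The parameters above can be computed directly from quantities read off a chromatogram. The sketch below is not taken from the text: it assumes the common textbook definitions k′ = (t_R − t_0)/t_0, N = 16 (t_R/W)², alpha = k′2/k′1, and the Purnell form of the resolution equation, with t_0 the void time (the retention time of an unretained peak). All function names and the example retention times are hypothetical.

```python
import math


def retention_factor(t_r, t_0):
    """k' = (t_R - t_0) / t_0, using the void time t_0 as the correction."""
    return (t_r - t_0) / t_0


def plate_number(t_r, w_base):
    """N = 16 * (t_R / W)^2, with W the peak width at the baseline."""
    return 16.0 * (t_r / w_base) ** 2


def separation_factor(k_first, k_second):
    """alpha = k'_2 / k'_1 for two neighboring peaks (second elutes later)."""
    return k_second / k_first


def resolution(n, alpha, k_second):
    """Purnell form: Rs = (sqrt(N)/4) * ((alpha-1)/alpha) * (k'_2/(1+k'_2))."""
    return (math.sqrt(n) / 4.0) * ((alpha - 1.0) / alpha) * (k_second / (1.0 + k_second))


if __name__ == "__main__":
    # Hypothetical chromatogram readings, all in minutes.
    t_0, t_r1, t_r2, w2 = 1.0, 4.0, 4.6, 0.2
    k1 = retention_factor(t_r1, t_0)
    k2 = retention_factor(t_r2, t_0)
    n = plate_number(t_r2, w2)
    alpha = separation_factor(k1, k2)
    print(f"k'1 = {k1:.2f}, k'2 = {k2:.2f}, alpha = {alpha:.2f}, N = {n:.0f}")
    print(f"Rs = {resolution(n, alpha, k2):.2f}")
```

The form of the resolution function shows why both efficiency and separation factor matter: resolution grows only with the square root of N, but is zero whenever alpha equals 1, regardless of how efficient the column is.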


Internal diameter
The internal diameter (ID) of an HPLC column is an important parameter that influences the detection sensitivity and separation selectivity in gradient elution. It also determines the quantity of analyte that can be loaded onto the column. Larger columns are usually seen in industrial applications, such as the purification of a drug product for later use. Low-ID columns have improved sensitivity and lower solvent consumption at the expense of loading capacity.

Larger ID columns (over 10 mm) are used to purify usable amounts of material because of their large loading capacity.
Analytical scale columns (4.6 mm) have been the most common type of columns, though smaller columns are rapidly gaining in popularity. They are used in traditional quantitative analysis of samples and often use a UV-Vis absorbance detector.
Narrow-bore columns (1–2 mm) are used for applications where more sensitivity is desired, either with special UV-Vis detectors, fluorescence detection, or with other detection methods such as liquid chromatography-mass spectrometry.
Capillary columns (under 0.3 mm) are used almost exclusively with alternative detection means such as mass spectrometry. They are usually made from fused silica capillaries, rather than the stainless steel tubing that larger columns employ.

Particle size
Most traditional HPLC is performed with the stationary phase attached to the outside of small spherical silica particles (very small beads). These particles come in a variety of sizes, with 5 µm beads being the most common. Smaller particles generally provide more surface area and better separations, but the pressure required for optimum linear velocity increases with the inverse of the particle diameter squared.[5][6][7] This means that changing to particles that are half as big, keeping the size of the column the same, will double the performance but increase the required pressure by a factor of four. Larger particles are used in preparative HPLC (column diameters 5 cm up to >30 cm) and for non-HPLC applications such as solid-phase extraction.
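The scaling rule stated above can be turned into a quick estimate. This sketch simply encodes the text's rule of thumb for a column of unchanged dimensions (performance proportional to 1/d, pressure proportional to 1/d²); the function name and the example particle sizes are assumptions for illustration, and real columns will deviate from this idealized scaling.

```python
def scale_for_particle_change(d_old_um, d_new_um):
    """Relative change when swapping particle diameter on the same column.

    Encodes the rule of thumb from the text: performance scales with
    1/d and required pressure with 1/d^2. Diameters are in micrometres.
    """
    ratio = d_old_um / d_new_um
    return {"performance_factor": ratio, "pressure_factor": ratio ** 2}


if __name__ == "__main__":
    for d_new in (2.5, 1.7):
        result = scale_for_particle_change(5.0, d_new)
        print(
            f"5 um -> {d_new} um: "
            f"performance x{result['performance_factor']:.2f}, "
            f"pressure x{result['pressure_factor']:.2f}"
        )
```

Halving the particle size from 5 µm to 2.5 µm reproduces the figures in the text (twice the performance at four times the pressure), which is why sub-2 µm particles call for pumps rated for much higher pressures.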

Pore size
Many stationary phases are porous to provide greater surface area. Small pores provide greater surface area while larger pore size has better kinetics, especially for larger analytes. For example, a protein which is only slightly smaller than a pore might enter the pore but does not easily leave once inside.

Pump pressure
Pumps vary in pressure capacity, but their performance is measured on their ability to yield a consistent and reproducible flow rate. Pressure may reach as high as 40 MPa (6000 lbf/in²), or about 400 atmospheres. Modern HPLC systems have been improved to work at much higher pressures, and are therefore able to use much smaller particle sizes in the columns (<2 µm). These "Ultra High Performance Liquid Chromatography" systems, or RSLC/UHPLCs, can work at up to 100 MPa (15,000 lbf/in²), or about 1000 atmospheres. The term "UPLC" is a trademark of the Waters Corporation, but is sometimes used to refer to the more general technique.


References
[1] "Practical aspects of fast reversed-phase high-performance liquid chromatography using 3 µm particle packed columns and monolithic columns in pharmaceutical development and production working under current good manufacturing practice" (http://www.sciencedirect.com/science/article/pii/S0021967304003139). Retrieved 10/23/2012.
[2] Lindsay, S.; Kealey, D. (1987). High performance liquid chromatography (http://www.osti.gov/energycitations/product.biblio.jsp?osti_id=7013902). Wiley. From review: Hung, L. B.; Parcher, J. F.; Shores, J. C.; Ward, E. H. (1988). "Theoretical and experimental foundation for surface-coverage programming in gas-solid chromatography with an adsorbable carrier gas". J. Am. Chem. Soc. 110 (11): 1090. doi:10.1021/ac00162a003.
[3] Displacement Chromatography (http://www.sacheminc.com/industries/biotechnology/teaching-tools.html). Sacheminc.com. Retrieved on 2011-06-07.
[4] A recent book provides a comprehensive treatment of the theory of high-performance gradient chromatography: Lloyd R. Snyder and John W. Dolan (2006). High-Performance Gradient Elution: The Practical Application of the Linear-Solvent-Strength Model. Wiley Interscience. ISBN 0-471-70646-9.
[5] Majors, Ronald E. (2010-09-07). Fast and Ultrafast HPLC on sub-2 µm Porous Particles: Where Do We Go From Here? LC-GC Europe (http://www.lcgceurope.com/lcgceurope/article/articleDetail.jsp?id=333246&pageID=3#). Lcgceurope.com. Retrieved on 2011-06-07.
[6] Xiang, Y.; Liu, Y.; Lee, M.L. (2006). "Ultrahigh pressure liquid chromatography using elevated temperature". Journal of Chromatography A 1104 (1–2): 198–202. doi:10.1016/j.chroma.2005.11.118. PMID 16376355.
[7] Horváth, Cs.; Preiss, B.A.; Lipsky, S.R. (1967). "Fast liquid chromatography. Investigation of operating parameters and the separation of nucleotides on pellicular ion exchangers". Analytical Chemistry 39 (12): 1422–1428. doi:10.1021/ac60256a003. PMID 6073805.

Further reading
L. R. Snyder, J.J. Kirkland, and J. W. Dolan, Introduction to Modern Liquid Chromatography, John Wiley & Sons, New York, 2009.
M.W. Dong, Modern HPLC for practicing scientists. Wiley, 2006.
L. R. Snyder, J.J. Kirkland, and J. L. Glajch, Practical HPLC Method Development, John Wiley & Sons, New York, 1997.
S. Ahuja and H. T. Rasmussen (ed.), HPLC Method Development for Pharmaceuticals, Academic Press, 2007.
S. Ahuja and M.W. Dong (ed.), Handbook of Pharmaceutical Analysis by HPLC, Elsevier/Academic Press, 2005.
Y. V. Kazakevich and R. LoBrutto (ed.), HPLC for Pharmaceutical Scientists, Wiley, 2007.
U. D. Neue, HPLC Columns: Theory, Technology, and Practice, Wiley-VCH, New York, 1997.
M. C. McMaster, HPLC, a practical user's guide, Wiley, 2007.

External links
Liquid Chromatography (http://www.dmoz.org//Science/Chemistry/Analytical/Separations_Science/Liquid_Chromatography/) at the Open Directory Project


Electrophoresis
Electrophoresis is the motion of dispersed particles relative to a fluid under the influence of a spatially uniform electric field.[1][2][3][4][5][6] This electrokinetic phenomenon was observed for the first time in 1807 by Ferdinand Frederic Reuss (Moscow State University),[7] who noticed that the application of a constant electric field caused clay particles dispersed in water to migrate. It is ultimately caused by the presence of a charged interface between the particle surface and the surrounding fluid. Electrophoresis of positively charged particles (cations) is called cataphoresis, while electrophoresis of negatively charged particles (anions) is called anaphoresis.

Illustration of electrophoresis

Theory
Suspended particles have an electric surface charge, strongly affected by surface adsorbed species,[8] on which an external electric field exerts an electrostatic Coulomb force. According to the double layer theory, all surface charges in fluids are screened by a diffuse layer of ions, which has the same absolute charge but opposite sign with respect to that of the surface charge. The electric field also exerts a force on the ions in the diffuse layer, which has a direction opposite to that acting on the surface charge. This latter force is not actually applied to the particle, but to the ions in the diffuse layer located at some distance from the particle surface, and part of it is transferred all the way to the particle surface through viscous stress. This part of the force is also called the electrophoretic retardation force. When the electric field is applied and the charged particle to be analyzed is in steady motion through the diffuse layer, the total resulting force is zero:

$$F_{tot} = 0 = F_{el} + F_{f} + F_{ret}$$

where $F_{el}$ is the electrostatic force, $F_{f}$ is the viscous friction force, and $F_{ret}$ is the electrophoretic retardation force.

Illustration of electrophoresis retardation

Considering the drag on the moving particles due to the viscosity of the dispersant, in the case of low Reynolds number and moderate electric field strength $E$, the velocity of a dispersed particle $v$ is simply proportional to the applied field, which leaves the electrophoretic mobility $\mu_e$ defined as:

$$\mu_e = \frac{v}{E}$$

The best known and most widely used theory of electrophoresis was developed in 1903 by Smoluchowski:[9]

$$\mu_e = \frac{\varepsilon_r \varepsilon_0 \zeta}{\eta}$$

where $\varepsilon_r$ is the dielectric constant of the dispersion medium, $\varepsilon_0$ is the permittivity of free space (C² N⁻¹ m⁻²), $\eta$ is the dynamic viscosity of the dispersion medium (Pa s), and $\zeta$ is the zeta potential (i.e., the electrokinetic potential of the slipping plane in the double layer).

The Smoluchowski theory is very powerful because it works for dispersed particles of any shape at any concentration. Unfortunately, it has limitations on its validity. This follows, for instance, from the fact that it does not include the Debye length $\kappa^{-1}$. However, the Debye length must be important for electrophoresis, as follows immediately from the Figure on the right. Increasing the thickness of the double layer (DL) moves the point of application of the retardation force further from the particle surface. The thicker the DL, the smaller the retardation force must be.

Detailed theoretical analysis proved that the Smoluchowski theory is valid only for a sufficiently thin DL, when the particle radius $a$ is much greater than the Debye length: $a\kappa \gg 1$. This model of a "thin Double Layer" offers tremendous simplifications not only for electrophoresis theory but for many other electrokinetic theories. This model is valid for most aqueous systems because the Debye length is only a few nanometers there. It breaks down only for nano-colloids in solution with ionic strength close to that of water.

The Smoluchowski theory also neglects the contribution of surface conductivity. This is expressed in modern theory as the condition of a small Dukhin number: $Du \ll 1$.
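The Smoluchowski relation, in its commonly quoted form μ_e = ε_r ε_0 ζ / η, can be evaluated directly. The sketch below is a minimal illustration under stated assumptions: the water properties (ε_r ≈ 78.5, η ≈ 0.89 mPa s near 25 °C) are approximate textbook values, and the zeta potential and field strength are hypothetical inputs chosen only for the example.

```python
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, C^2 N^-1 m^-2


def smoluchowski_mobility(zeta_v: float, eps_r: float, eta_pa_s: float) -> float:
    """Electrophoretic mobility in m^2 V^-1 s^-1 for a thin double layer."""
    return eps_r * EPSILON_0 * zeta_v / eta_pa_s


def drift_velocity(mobility: float, field_v_per_m: float) -> float:
    """Particle velocity in m/s from the definition mobility = v / E."""
    return mobility * field_v_per_m


if __name__ == "__main__":
    # Hypothetical particle: zeta potential of -30 mV in water near 25 °C.
    mu = smoluchowski_mobility(zeta_v=-0.030, eps_r=78.5, eta_pa_s=0.89e-3)
    v = drift_velocity(mu, field_v_per_m=1000.0)  # assumed field of 10 V/cm
    print(f"mobility = {mu:.3e} m^2/(V s), velocity = {v:.3e} m/s")
```

The sign of the result follows the sign of the zeta potential, so a negatively charged particle drifts against the field direction.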


In an effort to expand the range of validity of electrophoretic theories, the opposite asymptotic case was considered, when the Debye length is larger than the particle radius: $a\kappa < 1$. Under this condition of a "thick Double Layer", Hückel[10] predicted the following relation for electrophoretic mobility:

$$\mu_e = \frac{2\varepsilon_r \varepsilon_0 \zeta}{3\eta}$$

This model can be useful for some nanoparticles and non-polar fluids, where the Debye length is much larger than in the usual cases. There are several analytical theories that incorporate surface conductivity and eliminate the restriction of a small Dukhin number, pioneered by Overbeek[11] and Booth.[12] Modern, rigorous theories valid for any zeta potential and often any $a\kappa$ stem mostly from Dukhin-Semenikhin theory.[13] In the thin Double Layer limit, these theories confirm the numerical solution to the problem provided by O'Brien and White.[14]
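The two limiting cases can be compared side by side. The following sketch assumes the Hückel relation in its usual form μ_e = 2 ε_r ε_0 ζ / (3η) and classifies the double layer by the ratio of particle radius to Debye length. The `margin` cut-off used to decide what counts as "much greater" is a hypothetical choice for illustration, not a value from the literature.

```python
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, C^2 N^-1 m^-2


def huckel_mobility(zeta_v: float, eps_r: float, eta_pa_s: float) -> float:
    """Mobility in the thick double layer limit (Debye length >> radius)."""
    return 2.0 * eps_r * EPSILON_0 * zeta_v / (3.0 * eta_pa_s)


def double_layer_regime(radius_m: float, debye_length_m: float,
                        margin: float = 10.0) -> str:
    """Classify the double layer from the product a*kappa = radius / Debye length.

    `margin` is an assumed threshold for "much greater / much smaller than 1".
    """
    a_kappa = radius_m / debye_length_m
    if a_kappa >= margin:
        return "thin"          # Smoluchowski limit
    if a_kappa <= 1.0 / margin:
        return "thick"         # Hückel limit
    return "intermediate"      # neither limiting formula is reliable


if __name__ == "__main__":
    # Hypothetical 100 nm colloid with a 1 nm Debye length.
    print(double_layer_regime(radius_m=100e-9, debye_length_m=1e-9))
    # Hypothetical 1 nm particle in a non-polar fluid with a 100 nm Debye length.
    print(double_layer_regime(radius_m=1e-9, debye_length_m=100e-9))
```

For the same zeta potential, the Hückel mobility is two thirds of the Smoluchowski value, so choosing the wrong limit can bias a zeta potential estimate by up to 50%.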


References
[1] Lyklema, J. (1995). Fundamentals of Interface and Colloid Science. vol. 2. p. 3.208.
[2] Hunter, R.J. (1989). Foundations of Colloid Science. Oxford University Press.
[3] Dukhin, S.S.; B.V. Derjaguin (1974). Electrokinetic Phenomena. J. Wiley and Sons.
[4] Russel, W.B.; D.A. Saville and W.R. Schowalter (1989). Colloidal Dispersions. Cambridge University Press.
[5] Kruyt, H.R. (1952). Colloid Science. Volume 1, Irreversible systems. Elsevier.
[6] Dukhin, A.S.; P.J. Goetz (2002). Ultrasound for characterizing colloids. Elsevier.
[7] Reuss, F.F. (1809). Mem. Soc. Imperiale Naturalistes de Moscow 2: 327.
[8] Hanaor, D.A.H.; Michelazzi, M.; Leonelli, C.; Sorrell, C.C. (2012). "The effects of carboxylic acids on the aqueous dispersion and electrophoretic deposition of ZrO2" (http://www.sciencedirect.com/science/article/pii/S0955221911004171). Journal of the European Ceramic Society 32 (1): 235–244. doi:10.1016/j.jeurceramsoc.2011.08.015.
[9] von Smoluchowski, M. (1903). Bull. Int. Acad. Sci. Cracovie 184.
[10] Hückel, E. (1924). Physik.Z. 25: 204.
[11] Overbeek, J.Th.G (1943). Koll.Bith: 287.
[12] Booth, F. (1948). Nature 161: 83. Bibcode 1948Natur.161...83B. doi:10.1038/161083a0. PMID 18898334.
[13] Dukhin, S.S.; N.M. Semenikhin (1970). Koll.Zhur 32: 366.
[14] O'Brien, R.W.; L.R. White (1978). J. Chem. Soc. Faraday Trans. 2 (74): 1607.

Further reading
Voet and Voet (1990). Biochemistry. John Wiley & Sons.
Jahn, G.C.; D.W. Hall and S.G. Zam (1986). "A comparison of the life cycles of two Amblyospora (Microspora: Amblyosporidae) in the mosquitoes Culex salinarius and Culex tarsalis Coquillett". J. Florida Anti-Mosquito Assoc. 57: 24–27.
Khattak, M.N.; R.C. Matthews (1993). "Genetic relatedness of Bordetella species as determined by macrorestriction digests resolved by pulsed-field gel electrophoresis". Int. J. Syst. Bacteriol. 43 (4): 659–64.
Barz, D.P.J.; P. Ehrhard (2005). "Model and verification of electrokinetic flow and transport in a micro-electrophoresis device". Lab Chip 5: 949–958.
Shim, J.; P. Dutta and C.F. Ivory (2007). "Modeling and simulation of IEF in 2-D microgeometries". Electrophoresis 28: 527–586.

External links
List of relative mobilities (http://web.med.unsw.edu.au/phbsoft/mobility_listings.htm)


Gel electrophoresis
Gel electrophoresis apparatus: An agarose gel is placed in this buffer-filled box and an electrical field is applied via the power supply to the rear. The negative terminal is at the far end (black wire), so DNA migrates toward the camera.
Classification: Electrophoresis
Related techniques: Capillary electrophoresis, SDS-PAGE, Two-dimensional gel electrophoresis, Temperature gradient gel electrophoresis


Gel electrophoresis is a method for separation and analysis of macromolecules (DNA, RNA and proteins) and their fragments, based on their size and charge. It is used in clinical chemistry to separate proteins by charge and/or size (IEF agarose, essentially size independent) and in biochemistry and molecular biology to separate a mixed population of DNA and RNA fragments by length, to estimate the size of DNA and RNA fragments, or to separate proteins by charge.[1] Nucleic acid molecules are separated by applying an electric field to move the negatively charged molecules through an agarose matrix. Shorter molecules move faster and migrate farther than longer ones because shorter molecules migrate more easily through the pores of the gel. This phenomenon is called sieving.[2] Proteins are separated by charge in agarose because the pores of the gel are too large to sieve proteins. Gel electrophoresis can also be used for separation of nanoparticles.

Gel electrophoresis uses a gel as an anticonvective medium and/or sieving medium during electrophoresis, the movement of a charged particle in an electrical field. Gels suppress the thermal convection caused by application of the electric field, and can also act as a sieving medium, retarding the passage of molecules; gels can also simply serve to maintain the finished separation, so that a post-electrophoresis stain can be applied.[3] DNA gel electrophoresis is usually performed for analytical purposes, often after amplification of DNA via PCR, but may be used as a preparative technique prior to use of other methods such as mass spectrometry, RFLP, PCR, cloning, DNA sequencing, or Southern blotting for further characterization.

Digital image of 3 plasmid restriction digests run on a 1% w/v agarose gel, 3 volt/cm, stained with ethidium bromide. The DNA size marker is a commercial 1 kbp ladder. The position of the wells and direction of DNA migration is noted.

Separation
In simple terms: electrophoresis is a process which enables the sorting of molecules based on size. Using an electric field, molecules (such as DNA) can be made to move through a gel made of agarose or polyacrylamide. The molecules being sorted are dispensed into a well in the gel material. The gel is placed in an electrophoresis chamber, which is then connected to a power source. When the electric current is applied, the larger molecules move more slowly through the gel while the smaller molecules move faster. The different sized molecules form distinct bands on the gel.

The term "gel" in this instance refers to the matrix used to contain, then separate the target molecules. In most cases, the gel is a crosslinked polymer whose composition and porosity are chosen based on the specific weight and composition of the target to be analyzed. When separating proteins or small nucleic acids (DNA, RNA, or oligonucleotides) the gel is usually composed of different concentrations of acrylamide and a cross-linker, producing different sized mesh networks of polyacrylamide. When separating larger nucleic acids (greater than a few hundred bases), the preferred matrix is purified agarose. In both cases, the gel forms a solid, yet porous matrix. Acrylamide, in contrast to polyacrylamide, is a neurotoxin and must be handled using appropriate safety precautions to avoid poisoning. Agarose is composed of long unbranched chains of uncharged carbohydrate without cross-links, resulting in a gel with large pores allowing for the separation of macromolecules and macromolecular complexes.

"Electrophoresis" refers to the electromotive force (EMF) that is used to move the molecules through the gel matrix. By placing the molecules in wells in the gel and applying an electric field, the molecules will move through the matrix at different rates, determined largely by their mass when the charge to mass ratio (Z) of all species is uniform, toward the (negatively charged) cathode if positively charged or toward the (positively charged) anode if negatively charged.[4]

If several samples have been loaded into adjacent wells in the gel, they will run parallel in individual lanes. Depending on the number of different molecules, each lane shows separation of the components from the original mixture as one or more distinct bands, one band per component. Incomplete separation of the components can lead to overlapping bands, or to indistinguishable smears representing multiple unresolved components. Bands in different lanes that end up at the same distance from the top contain molecules that passed through the gel with the same speed, which usually means they are approximately the same size. There are molecular weight size markers available that contain a mixture of molecules of known sizes. If such a marker is run in one lane of the gel parallel to the unknown samples, its bands can be compared to those of the unknown in order to determine their size. The distance a band travels is approximately inversely proportional to the logarithm of the size of the molecule.

There are limits to electrophoretic techniques. Since passing current through a gel causes heating, gels may melt during electrophoresis. Electrophoresis is performed in buffer solutions to reduce pH changes due to the electric field, which is important because the charge of DNA and RNA depends on pH, but running for too long can exhaust the buffering capacity of the solution. Further, different preparations of genetic material may not migrate consistently with each other, for morphological or other reasons.
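Size estimation against a marker lane can be sketched in a few lines. The example below assumes the common working approximation that log10(size) falls roughly linearly with migration distance over the useful range of the gel, and fits that line to the ladder by least squares. The ladder distances are made-up measurements, and the function names are illustrative.

```python
import math


def fit_ladder(distances_mm, sizes_bp):
    """Least-squares fit of log10(size) = slope * distance + intercept."""
    if len(distances_mm) != len(sizes_bp) or len(distances_mm) < 2:
        raise ValueError("need at least two matching ladder bands")
    logs = [math.log10(s) for s in sizes_bp]
    n = len(distances_mm)
    mean_x = sum(distances_mm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    if sxx == 0:
        raise ValueError("ladder bands must have distinct distances")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size(distance_mm, slope, intercept):
    """Interpolate the size (bp) of an unknown band from its distance."""
    return 10 ** (slope * distance_mm + intercept)


if __name__ == "__main__":
    # Hypothetical ladder: migration distance (mm) for each known size (bp).
    ladder_mm = [12.0, 18.5, 25.0, 31.0, 38.0]
    ladder_bp = [10000, 5000, 2000, 1000, 500]
    slope, intercept = fit_ladder(ladder_mm, ladder_bp)
    print(round(estimate_size(28.0, slope, intercept)), "bp (approximate)")
```

Estimates are only meaningful for bands that lie between the smallest and largest ladder bands, since the semi-log relationship degrades at both ends of the gel.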


Types of gel
Agarose
Agarose gels are easily cast and handled compared to other matrices, because the gel setting is a physical rather than chemical change. Samples are also easily recovered. After the experiment is finished, the resulting gel can be stored in a plastic bag in a refrigerator.

Agarose gel electrophoresis can be used for the separation of DNA fragments ranging from 50 base pairs to several megabases (millions of bases) using specialized apparatus. The distance between DNA bands of a given length is determined by the percent agarose in the gel. The disadvantage of higher concentrations is the long run times (sometimes days). Instead, high percentage agarose gels should be run with pulsed field electrophoresis (PFE) or field inversion electrophoresis.

"Most agarose gels are made with between 0.7% (good separation or resolution of large 5–10 kb DNA fragments) and 2% (good resolution for small 0.2–1 kb fragments) agarose dissolved in electrophoresis buffer. Up to 3% can be used for separating very tiny fragments but a vertical polyacrylamide gel is more appropriate in this case. Low percentage gels are very weak and may break when you try to lift them. High percentage gels are often brittle and do not set evenly. 1% gels are common for many applications."[5]

Agarose gels do not have a uniform pore size, but are optimal for electrophoresis of proteins that are larger than 200 kDa.[6]


Polyacrylamide
Polyacrylamide gel electrophoresis (PAGE) is used for separating proteins ranging in size from 5 to 2,000 kDa due to the uniform pore size provided by the polyacrylamide gel. Pore size is controlled by the concentrations of acrylamide and bis-acrylamide powder used in creating a gel. Care must be used when creating this type of gel, as acrylamide is a potent neurotoxin in its liquid and powdered form.

Traditional DNA sequencing techniques such as Maxam-Gilbert or Sanger methods used polyacrylamide gels to separate DNA fragments differing by a single base pair in length so the sequence could be read. Most modern DNA separation methods now use agarose gels, except for particularly small DNA fragments. PAGE is currently most often used in the field of immunology and protein analysis, often to separate different proteins or isoforms of the same protein into separate bands. These can be transferred onto a nitrocellulose or PVDF membrane to be probed with antibodies and corresponding markers, such as in a western blot.

Typically resolving gels are made in 6%, 8%, 10%, 12% or 15%. Stacking gel (5%) is poured on top of the resolving gel and a gel comb (which forms the wells and defines the lanes where proteins, sample buffer and ladders will be placed) is inserted. The percentage chosen depends on the size of the protein that one wishes to identify or probe in the sample. The smaller the known weight, the higher the percentage that should be used. Changes to the buffer system of the gel can help to further resolve proteins of very small sizes.[7]

Starch
Partially hydrolysed potato starch makes for another non-toxic medium for protein electrophoresis. The gels are slightly more opaque than acrylamide or agarose. Non-denatured proteins can be separated according to charge and size. They are visualised using Naphthol Black or Amido Black staining. Typical starch gel concentrations are 5% to 10%.[8][9][10]

Gel conditions
Denaturing
A denaturing gel is a type of electrophoresis in which the native structure of macromolecules that are run within the gel is not maintained. For instance, gels used in SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) will unfold and denature the native structure of a protein. In contrast to native gel electrophoresis, quaternary structure cannot be investigated using this method.


Native
Native gel electrophoresis is an electrophoretic separation method typically used in proteomics and metallomics.[12] Native PAGE separations are run in non-denaturing conditions. Detergents are used only to the extent that they are necessary to lyse lipid membranes in the cell. Complexes remain, for the most part, associated and folded as they would be in the cell. One downside, however, is that complexes may not separate cleanly or predictably, since they cannot move through the polyacrylamide gel as quickly as individual, denatured proteins.

Unlike denaturing methods, such as SDS-PAGE, native gel electrophoresis does not use a charged denaturing agent. The molecules being separated (usually proteins or nucleic acids) therefore differ not only in molecular mass and intrinsic charge, but also in cross-sectional area, and thus experience different electrophoretic forces dependent on the shape of the overall structure. Since the proteins remain in the native state, they may be visualised not only by general protein staining reagents but also by specific enzyme-linked staining.

Buffers
Buffers in gel electrophoresis are used to provide ions that carry a current and to maintain the pH at a relatively constant value. A number of buffers are used for electrophoresis; the most common for nucleic acids are Tris/Acetate/EDTA (TAE) and Tris/Borate/EDTA (TBE). Many other buffers have been proposed, e.g. lithium borate (LB), which is almost never used based on PubMed citations, isoelectric histidine, pK-matched Good's buffers, etc.; in most cases the purported rationale is lower current (less heat) and/or matched ion mobilities, which leads to longer buffer life. Borate is problematic: it can polymerize and/or interact with cis diols such as those found in RNA. TAE has the lowest buffering capacity but provides the best resolution for larger DNA. This means a lower voltage and more time, but a better product. LB is relatively new and is ineffective in resolving fragments larger than 5 kbp; however, with its low conductivity, a much higher voltage can be used (up to 35 V/cm), which means a shorter analysis time for routine electrophoresis. A size difference as small as one base pair could be resolved in a 3% agarose gel with an extremely low conductivity medium (1 mM lithium borate).[13]

Specific enzyme-linked staining: Glucose-6-Phosphate Dehydrogenase isoenzymes in Plasmodium falciparum infected red blood cells.[11]


Visualization
After the electrophoresis is complete, the molecules in the gel can be stained to make them visible. DNA may be visualized using ethidium bromide which, when intercalated into DNA, fluoresces under ultraviolet light, while protein may be visualised using silver stain or Coomassie Brilliant Blue dye. Other methods may also be used to visualize the separation of the mixture's components on the gel. If the molecules to be separated contain radioactivity, for example in a DNA sequencing gel, an autoradiogram can be recorded of the gel. Photographs can be taken of gels, often using Gel Doc.

The most common dye used to make DNA or RNA bands visible for agarose gel electrophoresis is ethidium bromide, usually abbreviated as EtBr. It fluoresces under UV light when intercalated into the major groove of DNA (or RNA). By running DNA through an EtBr-treated gel and visualizing it with UV light, any band containing more than ~20 ng DNA becomes distinctly visible. EtBr is a known mutagen, and safer alternatives are available, such as GelRed, which binds to the minor groove. SYBR Green I is another dsDNA stain, produced by Invitrogen. It is more expensive, but 25 times more sensitive, and possibly safer than EtBr, though there is no data addressing its mutagenicity or toxicity in humans.[14]

SYBR Safe is a variant of SYBR Green that has been shown to have low enough levels of mutagenicity and toxicity to be deemed nonhazardous waste under U.S. Federal regulations.[15] It has similar sensitivity levels to EtBr,[15] but, like SYBR Green, is significantly more expensive. In countries where safe disposal of hazardous waste is mandatory, however, the costs of EtBr disposal can easily outstrip the initial price difference.

Since EtBr stained DNA is not visible in natural light, scientists mix DNA with negatively charged loading buffers before adding the mixture to the gel.
Loading buffers are useful because they are visible in natural light (as opposed to UV light for EtBr stained DNA), and they co-sediment with DNA (meaning they move at the same speed as DNA of a certain length). Xylene cyanol and Bromophenol blue are common dyes found in loading buffers; they run about the same speed as DNA fragments that are 5000 bp and 300 bp in length respectively, but the precise position varies with percentage of the gel. Other less frequently used progress markers are Cresol Red and Orange G which run at about 125 bp and 50 bp, respectively. Visualization can also be achieved by transferring DNA to a nitrocellulose membrane followed by exposure to a hybridization probe. This process is termed Southern blotting.

Analysis
After electrophoresis the gel is illuminated with an ultraviolet lamp (usually by placing it on a light box, while using protective gear to limit exposure to ultraviolet radiation). The illuminator apparatus usually also contains imaging apparatus that takes an image of the gel after illumination with UV radiation. The ethidium bromide fluoresces reddish-orange in the presence of DNA, since it has intercalated with the DNA. The DNA band can also be cut out of the gel, and can then be dissolved to retrieve the purified DNA. The gel can then be photographed, usually with a digital or Polaroid camera. Although the stained nucleic acid fluoresces reddish-orange, images are usually shown in black and white (see figures).

Even short exposure of nucleic acids to UV light causes significant damage to the sample. UV damage will reduce the efficiency of subsequent manipulation of the sample, such as ligation and cloning. If the DNA is to be used after separation on the agarose gel, it is best to avoid exposure to UV light by using a blue light excitation source such as the XcitaBlue UV to blue light conversion screen from Bio-Rad or Dark Reader from Clare Chemicals. A blue excitable stain is required, such as one of the SYBR Green or GelGreen stains. Blue light is also better for visualization since it is safer than UV (eye protection is not such a critical requirement) and passes through transparent plastic and glass. This means that the staining will be brighter even if the excitation light goes through glass or plastic gel platforms.

Gel electrophoresis research often takes advantage of software-based image analysis tools, such as ImageJ.
1. A 1% agarose 'slab' gel under normal light, behind a perspex UV shield. Only the marker dyes can be seen.
2. The gel with UV illumination; the ethidium bromide stained DNA glows orange.
3. Digital photo of the gel. Lane 1: commercial DNA markers (1kbplus). Lane 2: empty. Lane 3: a PCR product of just over 500 bases. Lane 4: restriction digest showing a similar fragment cut from a 4.5 kb plasmid vector.

Downstream processing
After separation, an additional separation method may then be used, such as isoelectric focusing or SDS-PAGE. The gel will then be physically cut, and the protein complexes extracted from each portion separately. Each extract may then be analysed, such as by peptide mass fingerprinting or de novo sequencing after in-gel digestion. This can provide a great deal of information about the identities of the proteins in a complex.

Applications
Estimation of the size of DNA molecules following restriction enzyme digestion, e.g. in restriction mapping of cloned DNA.
Analysis of PCR products, e.g. in molecular genetic diagnosis or genetic fingerprinting.
Separation of restricted genomic DNA prior to Southern transfer, or of RNA prior to Northern transfer.

Gel electrophoresis is used in forensics, molecular biology, genetics, microbiology and biochemistry. The results can be analyzed quantitatively by visualizing the gel with UV light and a gel imaging device. The image is recorded with a computer-operated camera, and the intensity of the band or spot of interest is measured and compared against standards or markers loaded on the same gel. The measurement and analysis are mostly done with specialized software.
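The comparison of a band's intensity against standards on the same gel can be illustrated with a small sketch. It assumes that intensity rises monotonically with the amount loaded and interpolates linearly between the two bracketing standards; the standard amounts and intensities are hypothetical, and real imaging software may use a different calibration model.

```python
def mass_from_intensity(intensity, standards):
    """Estimate the amount (ng) in a band by linear interpolation.

    `standards` is a list of (mass_ng, intensity) pairs measured on the
    same gel. Values outside the standards' range are rejected rather
    than extrapolated.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (m0, i0), (m1, i1) in zip(points, points[1:]):
        if i0 <= intensity <= i1:
            if i1 == i0:
                return m0  # identical intensities: no slope to interpolate on
            return m0 + (m1 - m0) * (intensity - i0) / (i1 - i0)
    raise ValueError("intensity outside the range of the standards")


if __name__ == "__main__":
    # Hypothetical standards: (ng loaded, integrated band intensity).
    standards = [(10.0, 100.0), (20.0, 200.0), (40.0, 400.0)]
    print(mass_from_intensity(300.0, standards))  # 30.0
```

Refusing to extrapolate is a deliberate choice here: bands fainter or brighter than every standard are often outside the range where the stain and camera respond proportionally.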

Depending on the type of analysis being performed, other techniques are often implemented in conjunction with the results of gel electrophoresis, providing a wide range of field-specific applications.


Nucleic acids
In the case of nucleic acids, the direction of migration, from negative to positive electrodes, is due to the naturally occurring negative charge carried by their sugar-phosphate backbone.[16]

Double-stranded DNA fragments naturally behave as long rods, so their migration through the gel is relative to their size or, for cyclic fragments, their radius of gyration. Circular DNA such as plasmids, however, may show multiple bands, because the speed of migration may depend on whether it is relaxed or supercoiled. Single-stranded DNA or RNA tends to fold up into molecules with complex shapes and migrate through the gel in a complicated manner based on their tertiary structure. Therefore, agents that disrupt the hydrogen bonds, such as sodium hydroxide or formamide, are used to denature the nucleic acids and cause them to behave as long rods again.[17]

Gel electrophoresis of large DNA or RNA is usually done by agarose gel electrophoresis. See the "Chain termination method" page for an example of a polyacrylamide DNA sequencing gel. Characterization through ligand interaction of nucleic acids or fragments may be performed by mobility shift affinity electrophoresis.

Electrophoresis of RNA samples can be used to check for genomic DNA contamination and also for RNA degradation. RNA from eukaryotic organisms shows distinct bands of 28S and 18S rRNA, the 28S band being approximately twice as intense as the 18S band. Degraded RNA has less sharply defined bands, has a smeared appearance, and has an intensity ratio of less than 2:1.

An agarose gel of a PCR product compared to a DNA ladder.
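The 28S:18S intensity check described above can be expressed as a simple ratio test. This is a rough sketch only: the `tolerance` allowed below the expected 2:1 ratio is a hypothetical parameter, and band sharpness and smearing, which also indicate degradation, are not assessed here.

```python
def rrna_ratio(intensity_28s: float, intensity_18s: float) -> float:
    """Ratio of 28S to 18S rRNA band intensities from the same lane."""
    if intensity_18s <= 0:
        raise ValueError("18S band intensity must be positive")
    return intensity_28s / intensity_18s


def looks_degraded(intensity_28s: float, intensity_18s: float,
                   expected_ratio: float = 2.0, tolerance: float = 0.3) -> bool:
    """Flag a lane whose 28S:18S ratio falls clearly below about 2:1.

    `tolerance` is an assumed allowance for measurement noise.
    """
    return rrna_ratio(intensity_28s, intensity_18s) < expected_ratio - tolerance


if __name__ == "__main__":
    # Hypothetical integrated band intensities from two lanes.
    print(looks_degraded(1000.0, 500.0))  # False
    print(looks_degraded(600.0, 500.0))   # True
```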

Proteins
Proteins, unlike nucleic acids, can have varying charges and complex shapes; therefore they may not migrate into the polyacrylamide gel at similar rates, or at all, when a negative to positive EMF is placed on the sample. Proteins are therefore usually denatured in the presence of a detergent such as sodium dodecyl sulfate (SDS) that coats the proteins with a negative charge.[3] Generally, the amount of SDS bound is relative to the size of the protein (usually 1.4 g SDS per gram of protein), so that the resulting denatured proteins have an overall negative charge, and all the proteins have a similar charge to mass ratio. Since denatured proteins act like long rods instead of having a complex tertiary shape, the rate at which the resulting SDS-coated proteins migrate in the gel is relative only to their size and not their charge or shape.[3]

SDS-PAGE autoradiography. The indicated proteins are present in different concentrations in the two samples.

Proteins are usually analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), by native gel electrophoresis, by quantitative preparative native continuous polyacrylamide gel electrophoresis (QPNC-PAGE), or by 2-D electrophoresis. Characterization through ligand interaction may be performed by electroblotting or by affinity electrophoresis in agarose or by capillary electrophoresis, as for estimation of binding constants and determination of structural features like glycan content through lectin binding.

147

History
1930s: first reports of the use of sucrose for gel electrophoresis
1955: introduction of starch gels, mediocre separation
1959: introduction of acrylamide gels; disc electrophoresis (Ornstein and Davis); accurate control of parameters such as pore size and stability; and (Raymond and Weintraub)
1969: introduction of denaturing agents, especially SDS; separation of protein subunits (Weber and Osborn)[18]
1970: Laemmli separated 28 components of T4 phage using a stacking gel and SDS
1975: 2-dimensional gels (O'Farrell); isoelectric focusing, then SDS gel electrophoresis
1977: sequencing gels
late 1970s: agarose gels
1983: pulsed field gel electrophoresis enables separation of large DNA molecules
1983: introduction of capillary electrophoresis

A 1959 book on electrophoresis by Milan Bier cites references from the 1800s.[19] However, Oliver Smithies made significant contributions. Bier states: "The method of Smithies ... is finding wide application because of its unique separatory power." Taken in context, Bier clearly implies that Smithies' method is an improvement.

References
[1] Kryndushkin DS, Alexandrov IM, Ter-Avanesyan MD, Kushnirov VV (2003). "Yeast [PSI+] prion aggregates are formed by small Sup35 polymers fragmented by Hsp104". Journal of Biological Chemistry 278 (49): 49636–43. doi:10.1074/jbc.M307996200. PMID 14507919.
[2] Sambrook J, Russel DW (2001). Molecular Cloning: A Laboratory Manual, 3rd Ed. Cold Spring Harbor Laboratory Press. Cold Spring Harbor, NY.
[3] Berg JM, Tymoczko JL, Stryer L (2002). Biochemistry (http://www.ncbi.nlm.nih.gov/books/bv.fcgi?&rid=stryer.section.438#455) (5th ed.). WH Freeman. ISBN 0-7167-4955-6.
[4] Robyt, John F. & White, Bernard J. (1990). Biochemical Techniques: Theory and Practice. Waveland Press. ISBN 0-88133-556-8.
[5] "Agarose gel electrophoresis (basic method)" (http://www.methodbook.net/dna/agarogel.html). Biological Protocols. Retrieved 23 August 2011.
[6] Smisek, D. L.; Hoagland, D. A. (1989). "Agarose gel electrophoresis of high molecular weight, synthetic polyelectrolytes". Macromolecules 22 (5): 2270. doi:10.1021/ma00195a048.
[7] Schägger, Hermann (2006). "Tricine–SDS-PAGE". Nature Protocols 1 (1): 16–22. doi:10.1038/nprot.2006.4. PMID 17406207.
[8] Gordon, A.H. (1975). Electrophoresis of proteins in polyacrylamide and starch gels. New York: American Elsevier Publishing Company, Inc.
[9] Smithies, O. (1955). "Zone electrophoresis in starch gels: group variations in the serum proteins of normal adults". Biochem. J. 61 (4): 629–641. PMC 1215845. PMID 13276348.
[10] Wraxall, B.G.D.; Culliford, B.J. (1968). "A thin-layer starch gel method for enzyme typing of bloodstains". J. Forensic Sci. Soc. 8 (2): 81–82. doi:10.1016/S0015-7368(68)70449-7. PMID 5738223.
[11] Hempelmann E, Wilson RJ (1981). "Detection of glucose-6-phosphate dehydrogenase in malarial parasites" (http://www.sciencedirect.com/science/article/pii/0166685181901006). Molecular and Biochemical Parasitology 2 (3–4): 197–204. doi:10.1016/0166-6851(81)90100-6. PMID 7012616.
[12] Preparative gel electrophoresis of native metalloproteins (http://www.aesociety.org/areas/preparative_gel.php)
[13] Brody JR, Kern SE (October 2004). "History and principles of conductive media for standard DNA electrophoresis" (http://www.cc.ahs.chula.ac.th/Molmed/DNA electro's conductive media.pdf). Anal. Biochem. 333 (1): 1–13. doi:10.1016/j.ab.2004.05.054. PMID 15351274.
[14] SYBR Green I Nucleic Acid Gel Stain (http://probes.invitrogen.com/media/pis/mp07567.pdf)
[15] SYBR Safe DNA Gel Stain (http://probes.invitrogen.com/media/pis/mp33100.pdf)
[16] Lodish H, Berk A, Matsudaira P (2004). Molecular Cell Biology (http://www.ncbi.nlm.nih.gov/books/bv.fcgi?&rid=mcb.section.1637#1648) (5th ed.). WH Freeman: New York, NY. ISBN 978-0-7167-4366-8.
[17] Troubleshooting DNA agarose gel electrophoresis. Focus 19:3 p.66 (1997).
[18] Weber, K; Osborn, M (1969). "The reliability of molecular weight determinations by dodecyl sulfate-polyacrylamide gel electrophoresis". The Journal of Biological Chemistry 244 (16): 4406–12. PMID 5806584.
[19] Milan Bier (ed.) (1959). Electrophoresis. Theory, Methods and Applications (3rd printing ed.). Academic Press. p. 225. OCLC 1175404. LCC 59-7676.


External links
Biotechniques Laboratory electrophoresis demonstration (http://gslc.genetics.utah.edu/units/biotech/gel), from the University of Utah's Genetic Science Learning Center
Discontinuous native protein gel electrophoresis (http://www3.interscience.wiley.com/cgi-bin/abstract/113343976/ABSTRACT)
Drinking straw electrophoresis (http://maradydd.livejournal.com/417631.html)
How to run a DNA or RNA gel (http://arbl.cvmbs.colostate.edu/hbooks/genetics/biotech/gels/index.html)
Animation of gel analysis of DNA restriction (http://arbl.cvmbs.colostate.edu/hbooks/genetics/biotech/gels/virgel.html)
Step by step photos of running a gel and extracting DNA (http://web.mit.edu/7.02/virtual_lab/RDM/RDM1virtuallab.html)
A typical method from wikiversity (http://en.wikiversity.org/w/index.php?title=Agarose_gel_electrophoresis&action=edit&redlink=1)

Ion chromatography

Ion exchange chromatography
Acronym: IC, IEC
Classification: Chromatography
Other techniques
Related: High performance liquid chromatography; Aqueous Normal Phase Chromatography; Size exclusion chromatography; Micellar liquid chromatography

Ion-exchange chromatography (or ion chromatography) is a process that allows the separation of ions and polar molecules based on their charge. It can be used for almost any kind of charged molecule including large proteins, small nucleotides and amino acids. The solution to be injected is usually called a sample, and the individually separated components are called analytes. It is often used in protein purification, water analysis, and quality control.

Principle
Ion-exchange chromatography retains analyte molecules on the column based on coulombic (ionic) interactions. The stationary phase surface displays ionic functional groups (R-X) that interact with analyte ions of opposite charge. This type of chromatography is further subdivided into cation exchange chromatography and anion exchange chromatography. The ionic compound consisting of the cationic species M+ and the anionic species B- can be retained by the stationary phase.

Ion Chromatography

Cation exchange chromatography retains positively charged cations because the stationary phase displays a negatively charged functional group:

Anion exchange chromatography retains anions using positively charged functional group:

Note that the ion strength of either C+ or A- in the mobile phase can be adjusted to shift the equilibrium position and thus retention time. The ion chromatogram shows a typical chromatogram obtained with an anion exchange column.
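The two equilibria introduced by the colons above are not reproduced in this copy. Using the symbols already defined in this section (R-X for the fixed functional group, M+ and B- for the analyte ions, C+ and A- for the mobile-phase counter-ions), they can plausibly be written as follows. This is a reconstruction from the surrounding definitions, offered as a sketch rather than a quotation of the original equations.

```latex
% Cation exchange: the bound counter-ion C+ is displaced by the analyte cation M+
\mathrm{R\text{-}X^{-}C^{+}} + \mathrm{M^{+}B^{-}}
  \rightleftharpoons
\mathrm{R\text{-}X^{-}M^{+}} + \mathrm{C^{+}} + \mathrm{B^{-}}

% Anion exchange: the bound counter-ion A- is displaced by the analyte anion B-
\mathrm{R\text{-}X^{+}A^{-}} + \mathrm{M^{+}B^{-}}
  \rightleftharpoons
\mathrm{R\text{-}X^{+}B^{-}} + \mathrm{M^{+}} + \mathrm{A^{-}}
```

Written this way, raising the concentration of C+ (or A-) in the mobile phase pushes each equilibrium back to the left, which is why the analyte is retained less strongly and elutes earlier.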


Typical technique
A sample is introduced, either manually or with an autosampler, into a sample loop of known volume. A buffered aqueous solution known as the mobile phase carries the sample from the loop onto a column that contains some form of stationary phase material. This is typically a resin or gel matrix consisting of agarose or cellulose beads with covalently bonded charged functional groups. The target analytes (anions or cations) are retained on the stationary phase but can be eluted by increasing the concentration of a similarly charged species that will displace the analyte ions from the stationary phase. For example, in cation exchange chromatography, the positively charged analyte could be displaced by the addition of positively charged sodium ions. The analytes of interest must then be detected by some means, typically by conductivity or UV/Visible light absorbance. In order to control an IC system, a chromatography data system (CDS) is usually needed. In addition to IC systems, some of these CDSs can also control gas chromatography (GC) and HPLC.

Metrohm 850 Ion chromatography system

Separating proteins
Proteins have numerous functional groups that can have both positive and negative charges. Ion exchange chromatography separates proteins according to their net charge, which is dependent on the composition of the mobile phase. By adjusting the pH or the ionic concentration of the mobile phase, various protein molecules can be separated. For example, if a protein has a net positive charge at pH 7, then it will bind to a column of negatively charged beads, whereas a negatively charged protein would not. If the pH is then changed so that the net charge on the bound protein becomes negative, it too will be eluted. Elution by changing the ionic strength of the mobile phase is a more subtle effect: it works because ions from the mobile phase interact with the immobilized ions on the stationary phase in preference to the charged groups on the protein. This "shields" the stationary phase from the protein (and vice versa) and allows the protein to elute.

Preparative-scale ion exchange column used for protein purification.
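The dependence of net charge on pH described above can be sketched numerically. The following Python example is a minimal illustration, assuming that each ionizable group titrates independently according to the Henderson-Hasselbalch relation. The pKa values are rounded textbook approximations and the protein composition is hypothetical; neither comes from this article.

```python
# Approximate pKa values (illustrative assumptions, not measured data).
# sign = +1: group carries +1 when protonated (basic group)
# sign = -1: group carries -1 when deprotonated (acidic group)
PKA = {
    "N_term": (9.0, +1),
    "C_term": (2.0, -1),
    "K": (10.5, +1),  # lysine
    "R": (12.5, +1),  # arginine
    "H": (6.0, +1),   # histidine
    "D": (3.9, -1),   # aspartate
    "E": (4.1, -1),   # glutamate
}


def net_charge(group_counts, ph):
    """Estimate the net charge of a protein at a given pH."""
    charge = 0.0
    for group, count in group_counts.items():
        pka, sign = PKA[group]
        if sign > 0:
            # Fraction of basic groups that are protonated (charge +1).
            charge += count / (1.0 + 10 ** (ph - pka))
        else:
            # Fraction of acidic groups that are deprotonated (charge -1).
            charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


def binds_cation_exchanger(group_counts, ph):
    """A net-positive protein is retained by negatively charged beads."""
    return net_charge(group_counts, ph) > 0


# Hypothetical lysine- and arginine-rich protein.
basic_protein = {"N_term": 1, "C_term": 1, "K": 12, "R": 6, "H": 2, "D": 5, "E": 6}

for ph in (4.0, 7.0, 10.0, 12.0):
    print(ph, round(net_charge(basic_protein, ph), 2),
          binds_cation_exchanger(basic_protein, ph))
```

In this toy model the protein is net positive at pH 7 and would bind a cation exchanger, while at sufficiently high pH its net charge becomes negative and it would be released. Real proteins deviate from this picture because local environment shifts individual pKa values.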


Clinical utility
Ion chromatography is used in the measurement of HbA1c and porphyrin, and in water purification.

References
Small, Hamish (1989). Ion chromatography. New York: Plenum Press. ISBN 0-306-43290-0.
Tatjana Weiss; Weiss, Joachim (2005). Handbook of Ion Chromatography. Weinheim: Wiley-VCH. ISBN 3-527-28701-9.
Gjerde, Douglas T.; Fritz, James S. (2000). Ion Chromatography. Weinheim: Wiley-VCH. ISBN 3-527-29914-9.
Jackson, Peter; Haddad, Paul R. (1990). Ion chromatography: principles and applications. Amsterdam: Elsevier. ISBN 0-444-88232-4.

External links
LC Instruments [1] at the Open Directory Project

References
[1] http://www.dmoz.org//Science/Chemistry/Analytical/Separations_Science/Liquid_Chromatography/Products_and_Services/Instruments/

Antibody


An antibody (Ab), also known as an immunoglobulin (Ig), is a large Y-shaped protein produced by B-cells that is used by the immune system to identify and neutralize foreign objects such as bacteria and viruses. The antibody recognizes a unique part of the foreign target, called an antigen.[1][2] Each tip of the "Y" of an antibody contains a paratope (a structure analogous to a lock) that is specific for one particular epitope (similarly analogous to a key) on an antigen, allowing these two structures to bind together with precision. Using this binding mechanism, an antibody can tag a microbe or an infected cell for attack by other parts of the immune system, or can neutralize its target directly (for example, by blocking a part of a microbe that is essential for its invasion and survival). The production of antibodies is the main function of the humoral immune system.[3]

Antibodies are secreted by a type of white blood cell called a plasma cell. Antibodies can occur in two physical forms, a soluble form that is secreted from the cell, and a membrane-bound form that is attached to the surface of a B cell and is referred to as the B cell receptor (BCR). The BCR is only found on the surface of B cells and facilitates the activation of these cells and their subsequent differentiation into either antibody factories called plasma cells, or memory B cells that will survive in the body and remember that same antigen so the B cells can respond faster upon future exposure.[4] In most cases, interaction of the B cell with a T helper cell is necessary to produce full activation of the B cell and, therefore, antibody generation following antigen binding.[5] Soluble antibodies are released into the blood and tissue fluids, as well as many secretions, to continue to survey for invading microorganisms.

Each antibody binds to a specific antigen; an interaction similar to a lock and key.
Antibodies are glycoproteins belonging to the immunoglobulin superfamily; the terms antibody and immunoglobulin are often used interchangeably.[6] Antibodies are typically made of basic structural units, each with two large heavy chains and two small light chains. There are several different types of antibody heavy chains, and several different kinds of antibodies, which are grouped into different isotypes based on which heavy chain they possess. Five different antibody isotypes are known in mammals, which perform different roles, and help direct the appropriate immune response for each different type of foreign object they encounter.[7] Though the general structure of all antibodies is very similar, a small region at the tip of the protein is extremely variable, allowing millions of antibodies with slightly different tip structures, or antigen binding sites, to exist. This region is known as the hypervariable region. Each of these variants can bind to a different antigen.[1] This enormous diversity of antibodies allows the immune system to recognize an equally wide variety of antigens.[6] The large and diverse population of antibodies is generated by random combinations of a set of gene segments that encode different antigen binding sites (or paratopes), followed by random mutations in this area of the antibody gene, which create further diversity.[7][8] Antibody genes also re-organize in a process called class switching that changes the base of the heavy chain to another, creating a different isotype of the antibody that retains the antigen specific variable region. This allows a single antibody to be used by several different parts of the immune system.


Forms
Surface immunoglobulin (Ig) is attached to the membrane of the effector B cells by its transmembrane region, while antibodies are the secreted form of Ig and lack the transmembrane region so that antibodies can be secreted into the bloodstream and body cavities. As a result, surface Ig and antibodies are identical except for the transmembrane regions. Therefore, they are considered two forms of antibodies: soluble form or membrane-bound form (Parham 21-22). When you are given an immunoglobulin, your body uses antibodies from other people's blood plasma to help prevent illness. And even though immunoglobulins are obtained from blood, they are purified so that they can't pass on diseases to the person who receives them.

The membrane-bound form of an antibody may be called a surface immunoglobulin (sIg) or a membrane immunoglobulin (mIg). It is part of the B cell receptor (BCR), which allows a B cell to detect when a specific antigen is present in the body and triggers B cell activation.[9] The BCR is composed of surface-bound IgD or IgM antibodies and associated Ig-α and Ig-β heterodimers, which are capable of signal transduction.[10] A typical human B cell will have 50,000 to 100,000 antibodies bound to its surface.[10] Upon antigen binding, they cluster in large patches, which can exceed 1 micrometer in diameter, on lipid rafts that isolate the BCRs from most other cell signaling receptors.[10] These patches may improve the efficiency of the cellular immune response.[11] In humans, the cell surface is bare around the B cell receptors for several hundred nanometers,[12] which further isolates the BCRs from competing influences.

Antibody purification
Antibodies are purified so that they can't pass on diseases such as HIV to the person who receives them. Antibody purification involves selective enrichment or specific isolation of antibodies from serum (polyclonal antibodies), ascites fluid or cell culture supernatant of a hybridoma cell line (monoclonal antibodies). Purification methods range from very crude to highly specific and can be classified as follows:

Physicochemical fractionation: differential precipitation, size-exclusion or solid-phase binding of immunoglobulins based on size, charge or other shared chemical characteristics of antibodies in typical samples. This isolates a subset of sample proteins that includes the immunoglobulins.
Class-specific affinity: solid-phase binding of particular antibody classes (e.g., IgG) by immobilized biological ligands (proteins, lectins, etc.) that have specific affinity to immunoglobulins. This purifies all antibodies of the target class without regard to antigen specificity.
Antigen-specific affinity: affinity purification of only those antibodies in a sample that bind to a particular antigen molecule through their specific antigen-binding domains. This purifies all antibodies that bind the antigen without regard to antibody class or isotype.

Antibodies that were developed as monoclonal antibody hybridoma cell lines and produced as ascites fluid or cell culture supernatant can be fully purified without using an antigen-specific affinity method (third type) because the target antibody is (for most practical purposes) the only immunoglobulin in the production sample. By contrast, for polyclonal antibodies (serum samples), antigen-specific affinity purification is required to prevent co-purification of nonspecific immunoglobulins. For example, generally only 2-5% of total IgG in mouse serum is specific for the antigen used to immunize the animal.
The type(s) and degree of purification that are necessary to obtain usable antibody depend upon the intended application(s) for the antibody.


Isotypes
Antibody isotypes of mammals
Name (types): Description

IgA (2): Found in mucosal areas, such as the gut, respiratory tract and urogenital tract, and prevents colonization by pathogens.[13] Also found in saliva, tears, and breast milk.
IgD: Functions mainly as an antigen receptor on B cells that have not been exposed to antigens.[14] It has been shown to activate basophils and mast cells to produce antimicrobial factors.[15]
IgE: Binds to allergens and triggers histamine release from mast cells and basophils, and is involved in allergy. Also protects against parasitic worms.[3]
IgG: In its four forms, provides the majority of antibody-based immunity against invading pathogens.[3] The only antibody capable of crossing the placenta to give passive immunity to the fetus.
IgM: Expressed on the surface of B cells (monomer) and in a secreted form (pentamer) with very high avidity. Eliminates pathogens in the early stages of B cell mediated (humoral) immunity before there is sufficient IgG.[3][14]

Antibodies can come in different varieties known as isotypes or classes. In placental mammals there are five antibody isotypes known as IgA, IgD, IgE, IgG and IgM. They are each named with an "Ig" prefix that stands for immunoglobulin, another name for antibody, and differ in their biological properties, functional locations and ability to deal with different antigens, as depicted in the table.[16] The antibody isotype of a B cell changes during cell development and activation. Immature B cells, which have never been exposed to an antigen, are known as naive B cells and express only the IgM isotype in a cell surface bound form. B cells begin to express both IgM and IgD when they reach maturity; the co-expression of both these immunoglobulin isotypes renders the B cell 'mature' and ready to respond to antigen.[17] B cell activation follows engagement of the cell bound antibody molecule with an antigen, causing the cell to divide and differentiate into an antibody producing cell called a plasma cell. In this activated form, the B cell starts to produce antibody in a secreted form rather than a membrane-bound form. Some daughter cells of the activated B cells undergo isotype switching, a mechanism that causes the production of antibodies to change from IgM or IgD to the other antibody isotypes, IgE, IgA or IgG, that have defined roles in the immune system.

Structure
Antibodies are heavy (~150kDa) globular plasma proteins. They have sugar chains added to some of their amino acid residues.[18] In other words, antibodies are glycoproteins. The basic functional unit of each antibody is an immunoglobulin (Ig) monomer (containing only one Ig unit); secreted antibodies can also be dimeric with two Ig units as with IgA, tetrameric with four Ig units like teleost fish IgM, or pentameric with five Ig units, like mammalian IgM.[19]


The variable parts of an antibody are its V regions, and the constant part is its C region.

Immunoglobulin domains
The Ig monomer is a "Y"-shaped molecule that consists of four polypeptide chains; two identical heavy chains and two identical light chains connected by disulfide bonds.[16] Each chain is composed of structural domains called immunoglobulin domains. These domains contain about 70-110 amino acids and are classified into different categories (for example, variable or IgV, and constant or IgC) according to their size and function.[20] They have a characteristic immunoglobulin fold in which two beta sheets create a sandwich shape, held together by interactions between conserved cysteines and other charged amino acids.

Several immunoglobulin domains make up the two heavy chains (red and blue) and the two light chains (green and yellow) of an antibody. The immunoglobulin domains are composed of between 7 (for constant domains) and 9 (for variable domains) β-strands.

Heavy chain
There are five types of mammalian Ig heavy chain denoted by the Greek letters: α, δ, ε, γ, and μ.[1] The type of heavy chain present defines the class of antibody; these chains are found in IgA, IgD, IgE, IgG, and IgM antibodies, respectively.[6] Distinct heavy chains differ in size and composition; α and γ contain approximately 450 amino acids, while μ and ε have approximately 550 amino acids.[1]

In birds, the major serum antibody, also found in yolk, is called IgY. It is quite different from mammalian IgG. However, in some older literature and even on some commercial life sciences product websites it is still called "IgG", which is incorrect and can be confusing.

Each heavy chain has two regions, the constant region and the variable region. The constant region is identical in all antibodies of the same isotype, but differs in antibodies of different isotypes. Heavy chains γ, α and δ have a constant region composed of three tandem (in a line) Ig domains, and a hinge region for added flexibility;[16] heavy chains μ and ε have a constant region composed of four immunoglobulin domains.[1] The variable region of the heavy chain differs in antibodies produced by different B cells, but is the same for all antibodies produced by a single B cell or B cell clone. The variable region of each heavy chain is approximately 110 amino acids long and is composed of a single Ig domain.

Light chain

In mammals there are two types of immunoglobulin light chain, which are called lambda (λ) and kappa (κ).[1] A light chain has two successive domains: one constant domain and one variable domain. The approximate length of a light chain is 211 to 217 amino acids.[1] Each antibody contains two light chains that are always identical; only one type of light chain, κ or λ, is present per antibody in mammals. Other types of light chains, such as the iota (ι) chain, are found in lower vertebrates like sharks (Chondrichthyes) and bony fishes (Teleostei).

1. Fab region 2. Fc region 3. Heavy chain (blue) with one variable (VH) domain followed by a constant domain (CH1), a hinge region, and two more constant (CH2 and CH3) domains. 4. Light chain (green) with one variable (VL) and one constant (CL) domain 5. Antigen binding site (paratope) 6. Hinge regions.


CDRs, Fv, Fab and Fc Regions


Some parts of an antibody have the same functions. The arms of the Y, for example, contain the sites that can bind two antigens (in general, identical) and, therefore, recognize specific foreign objects. This region of the antibody is called the Fab (fragment, antigen binding) region. It is composed of one constant and one variable domain from each heavy and light chain of the antibody.[21] The paratope is shaped at the amino terminal end of the antibody monomer by the variable domains from the heavy and light chains. The variable domain is also referred to as the Fv region and is the most important region for binding to antigens. More specifically, variable loops of β-strands, three each on the light (VL) and heavy (VH) chains, are responsible for binding to the antigen. These loops are referred to as the complementarity determining regions (CDRs). The structures of these CDRs have been clustered and classified by Chothia et al.[22] and more recently by North et al.[23] In the framework of the immune network theory, CDRs are also called idiotypes. According to immune network theory, the adaptive immune system is regulated by interactions between idiotypes.

The base of the Y plays a role in modulating immune cell activity. This region is called the Fc (Fragment, crystallizable) region, and is composed of two heavy chains that contribute two or three constant domains depending on the class of the antibody.[1] Thus, the Fc region ensures that each antibody generates an appropriate immune response for a given antigen, by binding to a specific class of Fc receptors, and other immune molecules, such as complement proteins. By doing this, it mediates different physiological effects including recognition of opsonized particles, lysis of cells, and degranulation of mast cells, basophils and eosinophils.[16][24]

Function
Activated B cells differentiate into either antibody-producing cells called plasma cells that secrete soluble antibody or memory cells that survive in the body for years afterward in order to allow the immune system to remember an antigen and respond faster upon future exposures.[25] At the prenatal and neonatal stages of life, the presence of antibodies is provided by passive immunization from the mother. Early endogenous antibody production varies for different kinds of antibodies, and usually appear within the first years of life. Since antibodies exist freely in the bloodstream, they are said to be part of the humoral immune system. Circulating antibodies are produced by clonal B cells that specifically respond to only one antigen (an example is a virus capsid protein fragment). Antibodies contribute to immunity in three ways: they prevent pathogens from entering or damaging cells by binding to them; they stimulate removal of pathogens by macrophages and other cells by coating the pathogen; and they trigger destruction of pathogens by stimulating other immune responses such as the complement pathway.[26]


Activation of complement
Antibodies that bind to surface antigens on, for example, a bacterium attract the first component of the complement cascade with their Fc region and initiate activation of the "classical" complement system.[26] This results in the killing of bacteria in two ways.[3] First, the binding of the antibody and complement molecules marks the microbe for ingestion by phagocytes in a process called opsonization; these phagocytes are attracted by certain complement molecules generated in the complement cascade. Secondly, some complement system components form a membrane attack complex to assist antibodies to kill the bacterium directly.[27]

The secreted mammalian IgM has five Ig units. Each Ig unit (labeled 1) has two epitope binding Fab regions, so IgM is capable of binding up to 10 epitopes.

Activation of effector cells

To combat pathogens that replicate outside cells, antibodies bind to pathogens to link them together, causing them to agglutinate. Since an antibody has at least two paratopes it can bind more than one antigen by binding identical epitopes carried on the surfaces of these antigens. By coating the pathogen, antibodies stimulate effector functions against the pathogen in cells that recognize their Fc region.[3] Those cells which recognize coated pathogens have Fc receptors which, as the name suggests, interacts with the Fc region of IgA, IgG, and IgE antibodies. The engagement of a particular antibody with the Fc receptor on a particular cell triggers an effector function of that cell; phagocytes will phagocytose, mast cells and neutrophils will degranulate, natural killer cells will release cytokines and cytotoxic molecules; that will ultimately result in destruction of the invading microbe. The Fc receptors are isotype-specific, which gives greater flexibility to the immune system, invoking only the appropriate immune mechanisms for distinct pathogens.[1]

Natural antibodies
Humans and higher primates also produce natural antibodies which are present in serum before viral infection. Natural antibodies have been defined as antibodies that are produced without any previous infection, vaccination, other foreign antigen exposure or passive immunization. These antibodies can activate the classical complement pathway leading to lysis of enveloped virus particles long before the adaptive immune response is activated. Many natural antibodies are directed against the disaccharide galactose α(1,3)-galactose (α-Gal), which is found as a terminal sugar on glycosylated cell surface proteins, and generated in response to production of this sugar by bacteria contained in the human gut.[28] Rejection of xenotransplanted organs is thought to be, in part, the result of natural antibodies circulating in the serum of the recipient binding to α-Gal antigens expressed on the donor tissue.[29]

Immunoglobulin diversity
Virtually all microbes can trigger an antibody response. Successful recognition and eradication of many different types of microbes requires diversity among antibodies; their amino acid composition varies, allowing them to interact with many different antigens.[30] It has been estimated that humans generate about 10 billion different antibodies, each capable of binding a distinct epitope of an antigen.[31] Although a huge repertoire of different antibodies is generated in a single individual, the number of genes available to make these proteins is limited by the size of the human genome. Several complex genetic mechanisms have evolved that allow vertebrate B cells to generate a diverse pool of antibodies from a relatively small number of antibody genes.[32]


Domain variability
The region (locus) of a chromosome that encodes an antibody is large and contains several distinct genes for each domain of the antibody; the locus containing heavy chain genes (IGH@) is found on chromosome 14, and the loci containing lambda and kappa light chain genes (IGL@ and IGK@) are found on chromosomes 22 and 2 in humans. One of these domains is called the variable domain, which is present in each heavy and light chain of every antibody, but can differ in different antibodies generated from distinct B cells. Differences between the variable domains are located on three loops known as hypervariable regions (HV-1, HV-2 and HV-3) or complementarity determining regions (CDR1, CDR2 and CDR3). CDRs are supported within the variable domains by conserved framework regions. The heavy chain locus contains about 65 different variable domain genes that all differ in their CDRs. Combining these genes with an array of genes for other domains of the antibody generates a large repertoire of antibodies with a high degree of variability. This combination is called V(D)J recombination, discussed below.[34]

The complementarity determining regions of the heavy chain are shown in red (PDB 1IGT[33])

V(D)J recombination
Somatic recombination of immunoglobulins, also known as V(D)J recombination, involves the generation of a unique immunoglobulin variable region. The variable region of each immunoglobulin heavy or light chain is encoded in several pieces, known as gene segments (subgenes). These segments are called variable (V), diversity (D) and joining (J) segments.[32] V, D and J segments are found in Ig heavy chains, but only V and J segments are found in Ig light chains. Multiple copies of the V, D and J gene segments exist, and are tandemly arranged in the genomes of mammals. In the bone marrow, each developing B cell will assemble an immunoglobulin variable region by randomly selecting and combining one V, one D and one J gene segment (or one V and one J segment in the light chain). As there are multiple copies of each type of gene segment, and different combinations of gene segments can be used to generate each immunoglobulin variable region, this process generates a huge number of antibodies, each with different paratopes, and thus different antigen specificities.[7] Interestingly, the rearrangement of several subgenes (i.e. V2 family) for lambda light chain immunoglobulin is coupled with the activation of microRNA miR-650, which further influences the biology of B-cells.[35]

After a B cell produces a functional immunoglobulin gene during V(D)J recombination, it cannot express any other variable region (a process known as allelic exclusion); thus each B cell can produce antibodies containing only one kind of variable chain.[1][36]

Simplified overview of V(D)J recombination of immunoglobulin heavy chains
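The combinatorial effect of picking one segment of each type can be illustrated with a short calculation. The Python sketch below is only a rough illustration: the figure of 65 heavy-chain V genes is the one quoted in this article, while the D and J counts and all light-chain counts are round numbers assumed for the example. It counts segment combinations only and ignores junctional diversity and somatic hypermutation.

```python
import random

# Gene segment counts per locus.
# 65 heavy-chain V genes: figure quoted in the text.
# All other counts: illustrative assumptions, not taken from the text.
HEAVY = {"V": 65, "D": 27, "J": 6}
KAPPA = {"V": 40, "J": 5}
LAMBDA = {"V": 30, "J": 4}


def combinations(locus):
    """Number of distinct ways to pick one segment of each type."""
    total = 1
    for count in locus.values():
        total *= count
    return total


def rearrange(locus, rng):
    """Pick one segment index of each type at random, as a developing B cell does."""
    return {kind: rng.randrange(count) for kind, count in locus.items()}


heavy = combinations(HEAVY)                          # 65 * 27 * 6
light = combinations(KAPPA) + combinations(LAMBDA)   # kappa or lambda, never both
paired = heavy * light                               # one heavy chain with one light chain

rng = random.Random(0)
print("heavy-chain combinations:", heavy)
print("light-chain combinations:", light)
print("heavy/light pairings:", paired)
print("one random heavy-chain rearrangement:", rearrange(HEAVY, rng))
```

Under these assumed counts, segment choice alone gives a few million heavy/light pairings, which is far short of the roughly 10 billion antibodies mentioned earlier; the remaining diversity has to come from other mechanisms.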

Somatic hypermutation and affinity maturation


For more details on this topic, see Somatic hypermutation and Affinity maturation.

Following activation with antigen, B cells begin to proliferate rapidly. In these rapidly dividing cells, the genes encoding the variable domains of the heavy and light chains undergo a high rate of point mutation, by a process called somatic hypermutation (SHM). SHM results in approximately one nucleotide change per variable gene, per cell division.[8] As a consequence, any daughter B cells will acquire slight amino acid differences in the variable domains of their antibody chains. This serves to increase the diversity of the antibody pool and impacts the antibody's antigen-binding affinity.[37] Some point mutations will result in the production of antibodies that have a weaker interaction (low affinity) with their antigen than the original antibody, and some mutations will generate antibodies with a stronger interaction (high affinity).[38] B cells that express high affinity antibodies on their surface will receive a strong survival signal during interactions with other cells, whereas those with low affinity antibodies will not, and will die by apoptosis.[38] Thus, B cells expressing antibodies with a higher affinity for the antigen will outcompete those with weaker affinities for function and survival. The process of generating antibodies with increased binding affinities is called affinity maturation. Affinity maturation occurs in mature B cells after V(D)J recombination, and is dependent on help from helper T cells.[39]
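The mutate-then-select cycle described above can be illustrated with a toy simulation. The Python sketch below is a deliberately simplified model, not a description of germinal center biology: affinity is reduced to a single number, each daughter cell receives one random change in affinity per division, and only the top-scoring fraction survives each round. The function name, the population size, the survival fraction and the size of each mutation's effect are all assumptions made for illustration.

```python
import random

def simulate_affinity_maturation(rounds=10, pop_size=200, survive_fraction=0.25,
                                 mutation_sd=0.1, seed=0):
    """Toy model: return the mean affinity score of the B cell pool per round."""
    rng = random.Random(seed)
    # Every founder cell starts with the same arbitrary affinity score.
    population = [1.0] * pop_size
    mean_history = [sum(population) / pop_size]
    for _round in range(rounds):
        # Division: each cell yields two daughters, each with one mutation
        # that randomly weakens or strengthens binding (assumed Gaussian).
        daughters = []
        for affinity in population:
            for _daughter in range(2):
                daughters.append(max(0.0, affinity + rng.gauss(0.0, mutation_sd)))
        # Selection: only the highest-affinity daughters get a survival signal;
        # the rest are discarded, standing in for death by apoptosis.
        daughters.sort(reverse=True)
        n_survivors = max(1, int(len(daughters) * survive_fraction))
        survivors = daughters[:n_survivors]
        # Survivors repopulate the pool to its original size.
        population = [survivors[i % n_survivors] for i in range(pop_size)]
        mean_history.append(sum(population) / pop_size)
    return mean_history

history = simulate_affinity_maturation()
print(f"mean affinity before: {history[0]:.2f}, after: {history[-1]:.2f}")
```

Although individual mutations are as likely to weaken binding as to strengthen it in this model, repeated selection of the best binders drives the average affinity of the pool upward, which is the core idea of affinity maturation.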

Class switching
Isotype or class switching is a biological process occurring after activation of the B cell, which allows the cell to produce different classes of antibody (IgA, IgE, or IgG).[7] The different classes of antibody, and thus effector functions, are defined by the constant (C) regions of the immunoglobulin heavy chain. Initially, naive B cells express only cell-surface IgM and IgD with identical antigen binding regions. Each isotype is adapted for a distinct function; therefore, after activation, an antibody with an IgG, IgA, or IgE effector function might be required to effectively eliminate an antigen. Class switching allows different daughter cells from the same activated B cell to produce antibodies of different isotypes. Only the constant region of the antibody heavy chain changes during class switching; the variable regions, and therefore antigen specificity, remain unchanged. Thus the progeny of a single B cell can produce antibodies, all specific for the same antigen, but with the ability to produce the effector function appropriate for each antigenic challenge. Class switching is triggered by cytokines; the isotype generated depends on which cytokines are present in the B cell environment.[40]

Class switching occurs in the heavy chain gene locus by a mechanism called class switch recombination (CSR). This mechanism relies on conserved nucleotide motifs, called switch (S) regions, found in DNA upstream of each constant region gene (except in the δ-chain). The DNA strand is broken by the activity of a series of enzymes at two selected S-regions.[41][42] The variable domain exon is rejoined through a process called non-homologous end joining (NHEJ) to the desired constant region (γ, α or ε). This process results in an immunoglobulin gene that encodes an antibody of a different isotype.[43]

Mechanism of class switch recombination that allows isotype switching in activated B cells

Affinity designations
A group of antibodies can be called monovalent (or specific) if they have affinity for the same epitope,[44] or for the same antigen[45] (but potentially different epitopes on the molecule), or for the same strain of microorganism[45] (but potentially different antigens on or in it). In contrast, a group of antibodies can be called polyvalent (or unspecific) if they have affinity for various antigens[45] or microorganisms.[45] Intravenous immunoglobulin, if not otherwise noted, consists of polyvalent IgG. In contrast, monoclonal antibodies are monovalent for the same epitope.

Medical applications
Disease diagnosis and therapy
Detection of particular antibodies is a very common form of medical diagnostics, and applications such as serology depend on these methods.[46] For example, in biochemical assays for disease diagnosis,[47] a titer of antibodies directed against Epstein-Barr virus or Lyme disease is estimated from the blood. If those antibodies are not present, either the person is not infected, or the infection occurred a very long time ago, and the B cells generating these specific antibodies have naturally decayed. In clinical immunology, levels of individual classes of immunoglobulins are measured by nephelometry (or turbidimetry) to characterize the antibody profile of a patient.[48] Elevations in different classes of immunoglobulins are sometimes useful in determining the cause of liver damage in patients for whom the diagnosis is unclear.[6] For example, elevated IgA indicates alcoholic cirrhosis, elevated IgM indicates viral hepatitis and primary biliary cirrhosis, while IgG is elevated in viral hepatitis, autoimmune hepatitis and cirrhosis. Autoimmune disorders can often be traced to antibodies that bind the body's own epitopes; many can be detected through blood tests. Antibodies directed against red blood cell surface antigens in immune mediated hemolytic anemia are detected with the Coombs test.[49] The Coombs test is also used for antibody screening in blood transfusion preparation and in antenatal women.[49] In practice, several immunodiagnostic methods based on the detection of antigen-antibody complexes are used to diagnose infectious diseases, for example ELISA, immunofluorescence, Western blot, immunodiffusion, immunoelectrophoresis, and magnetic immunoassay. Antibodies raised against human chorionic gonadotropin are used in over-the-counter pregnancy tests.
Targeted monoclonal antibody therapy is employed to treat diseases such as rheumatoid arthritis,[50] multiple sclerosis,[51] psoriasis,[52] and many forms of cancer including non-Hodgkin's lymphoma,[53] colorectal cancer, head and neck cancer and breast cancer.[54] Some immune deficiencies, such as X-linked agammaglobulinemia and hypogammaglobulinemia, result in partial or complete lack of antibodies.[55] These diseases are often treated by inducing a short term form of immunity called passive immunity. Passive immunity is achieved through the transfer of ready-made antibodies in the form of human or animal serum, pooled immunoglobulin or monoclonal antibodies, into the affected individual.[56]

Prenatal therapy
Rhesus factor, also known as Rhesus D (RhD) antigen, is an antigen found on red blood cells; individuals that are Rhesus-positive (Rh+) have this antigen on their red blood cells and individuals that are Rhesus-negative (Rh-) do not. During normal childbirth, delivery trauma or complications during pregnancy, blood from a fetus can enter the mother's system. In the case of an Rh-incompatible mother and child, consequential blood mixing may sensitize an Rh- mother to the Rh antigen on the blood cells of the Rh+ child, putting the remainder of the pregnancy, and any subsequent pregnancies, at risk for hemolytic disease of the newborn.[57] Rho(D) immune globulin antibodies are specific for human Rhesus D (RhD) antigen.[58] Anti-RhD antibodies are administered as part of a prenatal treatment regimen to prevent sensitization that may occur when a Rhesus-negative mother has a Rhesus-positive fetus. Treatment of a mother with anti-RhD antibodies prior to and immediately after trauma and delivery destroys fetal Rh antigen in the mother's system. Importantly, this occurs before the antigen can stimulate maternal B cells to "remember" Rh antigen by generating memory B cells. Therefore, her humoral immune system will not make anti-Rh antibodies, and will not attack the Rhesus antigens of the current or subsequent babies. Rho(D) immune globulin treatment prevents sensitization that can lead to Rh disease, but does not prevent or treat the underlying disease itself.[58]

Research applications
Specific antibodies are produced by injecting an antigen into a mammal, such as a mouse, rat, rabbit, goat, sheep, or horse for large quantities of antibody. Blood isolated from these animals contains polyclonal antibodies (multiple antibodies that bind to the same antigen) in the serum, which can now be called antiserum. Antigens are also injected into chickens for generation of polyclonal antibodies in egg yolk.[59] To obtain antibody that is specific for a single epitope of an antigen, antibody-secreting lymphocytes are isolated from the animal and immortalized by fusing them with a cancer cell line. The fused cells are called hybridomas, and will continually grow and secrete antibody in culture. Single hybridoma cells are isolated by dilution cloning to generate cell clones that all produce the same antibody; these antibodies are called monoclonal antibodies.[60] Polyclonal and monoclonal antibodies are often purified using Protein A/G or antigen-affinity chromatography.[61]

Immunofluorescence image of the eukaryotic cytoskeleton. Actin filaments are shown in red, microtubules in green, and the nuclei in blue.

In research, purified antibodies are used in many applications. They are most commonly used to identify and locate intracellular and extracellular proteins. Antibodies are used in flow cytometry to differentiate cell types by the proteins they express; different types of cell express different combinations of cluster of differentiation molecules on their surface, and produce different intracellular and secretable proteins.[62] They are also used in immunoprecipitation to separate proteins and anything bound to them (co-immunoprecipitation) from other molecules in a cell lysate,[63] in Western blot analyses to identify proteins separated by electrophoresis,[64] and in immunohistochemistry or immunofluorescence to examine protein expression in tissue sections or to locate proteins within cells with the assistance of a microscope.[62][65] Proteins can also be detected and quantified with antibodies, using ELISA and ELISPOT techniques.[66][67]

Structure prediction
The importance of antibodies in health care and the biotechnology industry demands knowledge of their structures at high resolution. This information is used for protein engineering, for modifying antigen binding affinity, and for identifying the epitope of a given antibody. X-ray crystallography is one commonly used method for determining antibody structures. However, crystallizing an antibody is often laborious and time consuming. Computational approaches provide a cheaper and faster alternative to crystallography, but their results are more equivocal since they do not produce empirical structures. Online web servers such as Web Antibody Modeling (WAM)[68] and Prediction of Immunoglobulin Structure (PIGS)[69] enable computational modeling of antibody variable regions. Rosetta Antibody is an antibody Fv region structure prediction server, which incorporates sophisticated techniques to minimize CDR loops and optimize the relative orientation of the light and heavy chains, as well as homology models that predict successful docking of antibodies with their unique antigen.[70]

History
The first use of the term "antibody" occurred in a text by Paul Ehrlich. The term Antikörper (the German word for antibody) appears in the conclusion of his article "Experimental Studies on Immunity", published in October 1891, which states that "if two substances give rise to two different Antikörper, then they themselves must be different".[71] However, the term was not accepted immediately and several other terms for antibody were proposed; these included Immunkörper, Amboceptor, Zwischenkörper, substance sensibilisatrice, copula, Desmon, philocytase, fixateur, and Immunisin.[71] The word antibody has formal analogy to the word antitoxin and a similar concept to Immunkörper.[71]

The study of antibodies began in 1890 when Kitasato Shibasaburō described antibody activity against diphtheria and tetanus toxins. Kitasato put forward the theory of humoral immunity, proposing that a mediator in serum could react with a foreign antigen.[75][76] His idea prompted Paul Ehrlich to propose the side chain theory for antibody and antigen interaction in 1897, when he hypothesized that receptors (described as side chains) on the surface of cells could bind specifically to toxins in a "lock-and-key" interaction and that this binding reaction was the trigger for the production of antibodies.[77] Other researchers believed that antibodies existed freely in the blood and, in 1904, Almroth Wright suggested that soluble antibodies coated bacteria to label them for phagocytosis and killing, a process that he named opsoninization.[78]

Angel of the West (2008) by Julian Voss-Andreae is a sculpture based on the antibody structure published by E. Padlan.[72] Created for the Florida campus of the Scripps Research Institute,[73] the antibody is placed into a ring referencing Leonardo da Vinci's Vitruvian Man, thus highlighting the similar proportions of the antibody and the human body.[74]

In the 1920s, Michael Heidelberger and Oswald Avery observed that antigens could be precipitated by antibodies and went on to show that antibodies were made of protein.[79] The biochemical properties of antigen-antibody binding interactions were examined in more detail in the late 1930s by John Marrack.[80] The next major advance was in the 1940s, when Linus Pauling confirmed the lock-and-key theory proposed by Ehrlich by showing that the interactions between antibodies and antigens depended more on their shape than their chemical composition.[81] In 1948, Astrid Fagraeus discovered that B cells, in the form of plasma cells, were responsible for generating antibodies.[82]

Michael Heidelberger

Further work concentrated on characterizing the structures of the antibody proteins. A major advance in these structural studies was the discovery in the early 1960s by Gerald Edelman and Joseph Gally of the antibody light chain,[83] and their realization that this protein was the same as the Bence-Jones protein described in 1845 by Henry Bence Jones.[84] Edelman went on to discover that antibodies are composed of disulfide bond-linked heavy and light chains. Around the same time, antibody-binding (Fab) and antibody tail (Fc) regions of IgG were characterized by Rodney Porter.[85] Together, these scientists deduced the structure and complete amino acid sequence of IgG, a feat for which they were jointly awarded the 1972 Nobel Prize in Physiology or Medicine.[85] The Fv fragment was prepared and characterized by David Givol.[86] While most of these early studies focused on IgM and IgG, other immunoglobulin isotypes were identified in the 1960s: Thomas Tomasi discovered secretory antibody (IgA);[87] David S. Rowe and John L. Fahey identified IgD;[88] and IgE was identified by Kimishige Ishizaka and Teruko Ishizaka as a class of antibodies involved in allergic reactions.[89] In a landmark series of experiments beginning in 1976, Susumu Tonegawa showed that genetic material can rearrange itself to form the vast array of available antibodies.[90]

References
[1] Charles Janeway (2001). Immunobiology (5th ed.). Garland Publishing. ISBN 0-8153-3642-X. (electronic full text via NCBI Bookshelf) (http://www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowTOC&rid=imm.TOC&depth=10).
[2] Litman GW, Rast JP, Shamblott MJ, Haire RN, Hulst M, Roess W, Litman RT, Hinds-Frey KR, Zilch A, Amemiya CT (January 1993). "Phylogenetic diversification of immunoglobulin genes and the antibody repertoire". Mol. Biol. Evol. 10 (1): 60-72. PMID 8450761.
[3] Pier GB, Lyczak JB, Wetzler LM (2004). Immunology, Infection, and Immunity. ASM Press. ISBN 1-55581-246-5.
[4] Borghesi L, Milcarek C (2006). "From B cell to plasma cell: regulation of V(D)J recombination and antibody secretion". Immunol. Res. 36 (1-3): 27-32. doi:10.1385/IR:36:1:27. PMID 17337763.
[5] Parker D (1993). "T cell-dependent B cell activation". Annu Rev Immunol 11 (1): 331-360. doi:10.1146/annurev.iy.11.040193.001555. PMID 8476565.
[6] Rhoades RA, Pflanzer RG (2002). Human Physiology (4th ed.). Thomson Learning. ISBN 0-534-42174-1.
[7] Market E, Papavasiliou FN (October 2003). "V(D)J recombination and the evolution of the adaptive immune system". PLoS Biol. 1 (1): E16. doi:10.1371/journal.pbio.0000016. PMC 212695. PMID 14551913.
[8] Diaz M, Casali P (2002). "Somatic immunoglobulin hypermutation". Curr Opin Immunol 14 (2): 235-240. doi:10.1016/S0952-7915(02)00327-8. PMID 11869898.
[9] Parker D (1993). "T cell-dependent B cell activation". Annu. Rev. Immunol. 11 (1): 331-360. doi:10.1146/annurev.iy.11.040193.001555. PMID 8476565.
[10] Wintrobe, Maxwell Myer (2004). Wintrobe's clinical hematology. John G. Greer, John Foerster, John N Lukens, George M Rodgers, Frixos Paraskevas (11 ed.). Hagerstown, MD: Lippincott Williams & Wilkins. pp. 453-456. ISBN 978-0-7817-3650-3.
[11] Tolar P, Sohn HW, Pierce SK (February 2008). "Viewing the antigen-induced initiation of B-cell activation in living cells" (http://www.blackwell-synergy.com/openurl?genre=article&sid=nlm:pubmed&issn=0105-2896&date=2008&volume=221&spage=64). Immunol. Rev. 221 (1): 64-76. doi:10.1111/j.1600-065X.2008.00583.x. PMID 18275475.
[12] Wintrobe, Maxwell Myer (2004). Wintrobe's clinical hematology. John G. Greer, John Foerster, John N Lukens, George M Rodgers, Frixos Paraskevas (11 ed.). Hagerstown, MD: Lippincott Williams & Wilkins. pp. 453-456. ISBN 0-7817-3650-1.
[13] Underdown B, Schiff J (1986). "Immunoglobulin A: strategic defense initiative at the mucosal surface". Annu Rev Immunol 4 (1): 389-417. doi:10.1146/annurev.iy.04.040186.002133. PMID 3518747.
[14] Geisberger R, Lamers M, Achatz G (2006). "The riddle of the dual expression of IgM and IgD". Immunology 118 (4): 060526021554006. doi:10.1111/j.1365-2567.2006.02386.x. PMC 1782314. PMID 16895553.
[15] Chen K, Xu W, Wilson M, He B, Miller NW, Bengtén E, Edholm ES, Santini PA, Rath P, Chiu A, Cattalini M, Litzman J, B Bussel J, Huang B, Meini A, Riesbeck K, Cunningham-Rundles C, Plebani A, Cerutti A (2009). "Immunoglobulin D enhances immune surveillance by activating antimicrobial, proinflammatory and B cell-stimulating programs in basophils". Nature Immunology 10 (8): 889-898. doi:10.1038/ni.1748. PMC 2785232. PMID 19561614.
[16] Woof J, Burton D (2004). "Human antibody-Fc receptor interactions illuminated by crystal structures". Nat Rev Immunol 4 (2): 89-99. doi:10.1038/nri1266. PMID 15040582.
[17] Goding J (1978). "Allotypes of IgM and IgD receptors in the mouse: a probe for lymphocyte differentiation". Contemp Top Immunobiol 8: 203-43. PMID 357078.
[18] Mattu T, Pleass R, Willis A, Kilian M, Wormald M, Lellouch A, Rudd P, Woof J, Dwek R (1998). "The glycosylation and structure of human serum IgA1, Fab, and Fc regions and the role of N-glycosylation on Fc alpha receptor interactions". J Biol Chem 273 (4): 2260-2272. doi:10.1074/jbc.273.4.2260. PMID 9442070.
[19] Roux K (1999). "Immunoglobulin structure and function as revealed by electron microscopy". Int Arch Allergy Immunol 120 (2): 85-99. doi:10.1159/000024226. PMID 10545762.
[20] Barclay A (2003). "Membrane proteins with immunoglobulin-like domains - a master superfamily of interaction molecules". Semin Immunol 15 (4): 215-223. doi:10.1016/S1044-5323(03)00047-2. PMID 14690046.
[21] Putnam FW, Liu YS, Low TL (1979). "Primary structure of a human IgA1 immunoglobulin. IV. Streptococcal IgA1 protease, digestion, Fab and Fc fragments, and the complete amino acid sequence of the alpha 1 heavy chain". J Biol Chem 254 (8): 2865-74. PMID 107164.
[22] Al-Lazikani B, Lesk AM, Chothia C (1997). "Standard conformations for the canonical structures of immunoglobulins". J Mol Biol 273 (4): 927-948. doi:10.1006/jmbi.1997.1354. PMID 9367782.
[23] North B, Lehmann A, Dunbrack RL (2010). "A new clustering of antibody CDR loop conformations". J Mol Biol 406 (2): 228-256. doi:10.1016/j.jmb.2010.10.030. PMC 3065967. PMID 21035459.
[24] Heyman B (1996). "Complement and Fc-receptors in regulation of the antibody response". Immunol Lett 54 (2-3): 195-199. doi:10.1016/S0165-2478(96)02672-7. PMID 9052877.
[25] Borghesi L, Milcarek C (2006). "From B cell to plasma cell: regulation of V(D)J recombination and antibody secretion". Immunol Res 36 (1-3): 27-32. doi:10.1385/IR:36:1:27. PMID 17337763.

[26] Ravetch J, Bolland S (2001). "IgG Fc receptors". Annu Rev Immunol 19 (1): 275-290. doi:10.1146/annurev.immunol.19.1.275. PMID 11244038.
[27] Rus H, Cudrici C, Niculescu F (2005). "The role of the complement system in innate immunity". Immunol Res 33 (2): 103-112. doi:10.1385/IR:33:2:103. PMID 16234578.
[28] Racaniello, Vincent (2009-10-06). "Natural antibody protects against viral infection" (http://www.virology.ws/2009/10/06/natural-antibody-protects-against-viral-infection/). Virology Blog. Archived (http://www.webcitation.org/5uJzysytc) from the original on 2010-11-17. Retrieved 2010-01-22.
[29] Milland J, Sandrin MS (December 2006). "ABO blood group and related antigens, natural antibodies and transplantation". Tissue Antigens 68 (6): 459-466. doi:10.1111/j.1399-0039.2006.00721.x. PMID 17176435.
[30] Mian I, Bradwell A, Olson A (1991). "Structure, function and properties of antibody binding sites". J Mol Biol 217 (1): 133-151. doi:10.1016/0022-2836(91)90617-F. PMID 1988675.
[31] Fanning LJ, Connor AM, Wu GE (1996). "Development of the immunoglobulin repertoire". Clin. Immunol. Immunopathol. 79 (1): 1-14. doi:10.1006/clin.1996.0044. PMID 8612345.
[32] Nemazee D (2006). "Receptor editing in lymphocyte development and central tolerance". Nat Rev Immunol 6 (10): 728-740. doi:10.1038/nri1939. PMID 16998507.
[33] http://www.rcsb.org/pdb/explore/explore.do?structureId=1IGT
[34] Peter Parham. The Immune System. 2nd ed. Garland Science: New York, 2005. pp. 47-62.
[35] Mraz, M.; Dolezalova, D.; Plevova, K.; Stano Kozubik, K.; Mayerova, V.; Cerna, K.; Musilova, K.; Tichy, B. et al. (2012). "MicroRNA-650 expression is influenced by immunoglobulin gene rearrangement and affects the biology of chronic lymphocytic leukemia". Blood 119 (9): 2110-2113. doi:10.1182/blood-2011-11-394874. PMID 22234685.
[36] Bergman Y, Cedar H (2004). "A stepwise epigenetic process controls immunoglobulin allelic exclusion". Nat Rev Immunol 4 (10): 753-761. doi:10.1038/nri1458. PMID 15459667.
[37] Honjo T, Habu S (1985). "Origin of immune diversity: genetic variation and selection". Annu Rev Biochem 54 (1): 803-830. doi:10.1146/annurev.bi.54.070185.004103. PMID 3927822.
[38] Or-Guil M, Wittenbrink N, Weiser AA, Schuchhardt J (2007). "Recirculation of germinal center B cells: a multilevel selection strategy for antibody maturation". Immunol. Rev. 216: 130-41. doi:10.1111/j.1600-065X.2007.00507.x. PMID 17367339.
[39] Neuberger M, Ehrenstein M, Rada C, Sale J, Batista F, Williams G, Milstein C (March 2000). "Memory in the B-cell compartment: antibody affinity maturation". Philos Trans R Soc Lond B Biol Sci 355 (1395): 357-360. doi:10.1098/rstb.2000.0573. PMC 1692737. PMID 10794054.
[40] Stavnezer J, Amemiya CT (2004). "Evolution of isotype switching". Semin. Immunol. 16 (4): 257-275. doi:10.1016/j.smim.2004.08.005. PMID 15522624.
[41] Durandy A (2003). "Activation-induced cytidine deaminase: a dual role in class-switch recombination and somatic hypermutation". Eur. J. Immunol. 33 (8): 2069-2073. doi:10.1002/eji.200324133. PMID 12884279.
[42] Casali P, Zan H (2004). "Class switching and Myc translocation: how does DNA break?". Nat. Immunol. 5 (11): 1101-1103. doi:10.1038/ni1104-1101. PMID 15496946.
[43] Lieber MR, Yu K, Raghavan SC (2006). "Roles of nonhomologous DNA end joining, V(D)J recombination, and class switch recombination in chromosomal translocations". DNA Repair (Amst.) 5 (9-10): 1234-1245. doi:10.1016/j.dnarep.2006.05.013. PMID 16793349.
[44] page 22 (http://books.google.se/books?id=TfW5sUfeM5gC&pg=PA22) in: Shoenfeld, Yehuda; Meroni, Pier-Luigi; Gershwin, M. Eric (2007). Autoantibodies. Amsterdam; Boston: Elsevier. ISBN 978-0-444-52763-9.
[45] Farlex dictionary > monovalent (http://www.thefreedictionary.com/monovalent). Citing: The American Heritage Science Dictionary, Copyright 2005.
[46] "Animated depictions of how antibodies are used in ELISA assays" (http://www.immunospot.eu/elisa-animation.html). Cellular Technology Ltd. Europe. Archived (http://www.webcitation.org/5uK00Qems) from the original on 2010-11-17. Retrieved 2007-05-08.
[47] "Animated depictions of how antibodies are used in ELISPOT assays" (http://www.immunospot.eu/elispot-animation.html). Cellular Technology Ltd. Europe. Archived (http://www.webcitation.org/5uK00pHh5) from the original on 2010-11-17. Retrieved 2007-05-08.
[48] Stern P (2006). "Current possibilities of turbidimetry and nephelometry" (http://www.clsjep.cz/odkazy/kbm0603-146.pdf). Klin Biochem Metab 14 (3): 146-151. Archived (http://www.webcitation.org/5uK01zkxp) from the original on 2010-11-17.
[49] Dean, Laura (2005). "Chapter 4: Hemolytic disease of the newborn" (http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=rbcantigen.chapter.ch4). Blood Groups and Red Cell Antigens. NCBI Bethesda (MD): National Library of Medicine (US).
[50] Feldmann M, Maini R (2001). "Anti-TNF alpha therapy of rheumatoid arthritis: what have we learned?". Annu Rev Immunol 19 (1): 163-196. doi:10.1146/annurev.immunol.19.1.163. PMID 11244034.
[51] Doggrell S (2003). "Is natalizumab a breakthrough in the treatment of multiple sclerosis?". Expert Opin Pharmacother 4 (6): 999-1001. doi:10.1517/14656566.4.6.999. PMID 12783595.
[52] Krueger G, Langley R, Leonardi C, Yeilding N, Guzzo C, Wang Y, Dooley L, Lebwohl M (2007). "A human interleukin-12/23 monoclonal antibody for the treatment of psoriasis". N Engl J Med 356 (6): 580-592. doi:10.1056/NEJMoa062382. PMID 17287478.
[53] Plosker G, Figgitt D (2003). "Rituximab: a review of its use in non-Hodgkin's lymphoma and chronic lymphocytic leukaemia". Drugs 63 (8): 803-843. doi:10.2165/00003495-200363080-00005. PMID 12662126.
[54] Vogel C, Cobleigh M, Tripathy D, Gutheil J, Harris L, Fehrenbacher L, Slamon D, Murphy M, Novotny W, Burchmore M, Shak S, Stewart S (2001). "First-line Herceptin monotherapy in metastatic breast cancer". Oncology 61 Suppl 2 (Suppl. 2): 37-42. doi:10.1159/000055400. PMID 11694786.

[55] LeBien TW (1 July 2000). "Fates of human B-cell precursors" (http://bloodjournal.hematologylibrary.org/cgi/content/full/96/1/9). Blood 96 (1): 9-23. PMID 10891425. Archived (http://www.webcitation.org/5uK02zfmd) from the original on 2010-11-17.
[56] Ghaffer A (2006-03-26). "Immunization" (http://pathmicro.med.sc.edu/ghaffar/immunization.htm). Immunology Chapter 14. University of South Carolina School of Medicine. Archived (http://www.webcitation.org/5uK03Lul6) from the original on 2010-11-17. Retrieved 2007-06-06.
[57] Urbaniak S, Greiss M (2000). "RhD haemolytic disease of the fetus and the newborn". Blood Rev 14 (1): 44-61. doi:10.1054/blre.1999.0123. PMID 10805260.
[58] Fung Kee Fung K, Eason E, Crane J, Armson A, De La Ronde S, Farine D, Keenan-Lindsay L, Leduc L, Reid G, Aerde J, Wilson R, Davies G, Désilets V, Summers A, Wyatt P, Young D (2003). "Prevention of Rh alloimmunization". J Obstet Gynaecol Can 25 (9): 765-73. PMID 12970812.
[59] Tini M, Jewell UR, Camenisch G, Chilov D, Gassmann M (2002). "Generation and application of chicken egg-yolk antibodies". Comp. Biochem. Physiol., Part a Mol. Integr. Physiol. 131 (3): 569-574. doi:10.1016/S1095-6433(01)00508-6. PMID 11867282.
[60] Cole SP, Campling BG, Atlaw T, Kozbor D, Roder JC (1984). "Human monoclonal antibodies". Mol. Cell. Biochem. 62 (2): 109-20. doi:10.1007/BF00223301. PMID 6087121.
[61] Kabir S (2002). "Immunoglobulin purification by affinity chromatography using protein A mimetic ligands prepared by combinatorial chemical synthesis". Immunol Invest 31 (3-4): 263-278. doi:10.1081/IMM-120016245. PMID 12472184.
[62] Brehm-Stecher B, Johnson E (2004). "Single-cell microbiology: tools, technologies, and applications" (http://mmbr.asm.org/cgi/content/full/68/3/538?view=long&pmid=15353569). Microbiol Mol Biol Rev 68 (3): 538-559. doi:10.1128/MMBR.68.3.538-559.2004. PMC 515252. PMID 15353569. Archived (http://www.webcitation.org/5uK04DmZC) from the original on 2010-11-17.
[63] Williams N (2000). "Immunoprecipitation procedures". Methods Cell Biol. Methods in Cell Biology 62: 449-453. doi:10.1016/S0091-679X(08)61549-6. ISBN 978-0-12-544164-3. PMID 10503210.
[64] Kurien B, Scofield R (2006). "Western blotting". Methods 38 (4): 283-293. doi:10.1016/j.ymeth.2005.11.007. PMID 16483794.
[65] Scanziani E (1998). "Immunohistochemical staining of fixed tissues". Methods Mol Biol 104: 133-140. doi:10.1385/0-89603-525-5:133. ISBN 978-0-89603-525-6. PMID 9711649.
[66] Reen DJ (1994). "Enzyme-linked immunosorbent assay (ELISA)". Methods Mol Biol. 32: 461-466. doi:10.1385/0-89603-268-X:461. ISBN 0-89603-268-X. PMID 7951745.
[67] Kalyuzhny AE (2005). "Chemistry and biology of the ELISPOT assay". Methods Mol Biol. 302: 015-032. doi:10.1385/1-59259-903-6:015. ISBN 1-59259-903-6. PMID 15937343.
[68] Whitelegg N.R.J., Rees A.R. (2000). "WAM: an improved algorithm for modeling antibodies on the WEB" (http://peds.oxfordjournals.org/cgi/content/abstract/13/12/819). Protein Engineering 13 (12): 819-824. doi:10.1093/protein/13.12.819. PMID 11239080. Archived (http://www.webcitation.org/5uK04d3mA) from the original on 2010-11-17. WAM (http://antibody.bath.ac.uk/abmod.html)
[69] Marcatili P, Rosi A, Tramontano A (2008). "PIGS: automatic prediction of antibody structures" (http://arianna.bio.uniroma1.it/pigs/). Bioinformatics 24 (17): 1953-1954. doi:10.1093/bioinformatics/btn341. PMID 18641403. Archived (http://www.webcitation.org/5uK06XmYO) from the original on 2010-11-17.
[70] Sivasubramanian A, Sircar A, Chaudhury S, Gray J J (2009). "Toward high-resolution homology modeling of antibody Fv regions and application to antibody-antigen docking" (http://arianna.bio.uniroma1.it/pigs/). Proteins 74 (2): 497-514. doi:10.1002/prot.22309. PMC 2909601. PMID 19062174. Archived (http://www.webcitation.org/5uK07ARPr) from the original on 2010-11-17. RosettaAntibody (http://antibody.graylab.jhu.edu)
[71] Lindenmann, Jean (1984). "Origin of the Terms 'Antibody' and 'Antigen'" (http://www3.interscience.wiley.com/cgi-bin/fulltext/119531625/PDFSTART). Scand. J. Immunol. 19 (4): 281-5. doi:10.1111/j.1365-3083.1984.tb00931.x. PMID 6374880. Archived (http://www.webcitation.org/5uK08DeWS) from the original on 2010-11-17. Retrieved 2008-11-01.
[72] Padlan, Eduardo (February 1994). "Anatomy of the antibody molecule". Mol. Immunol. 31 (3): 169-217. doi:10.1016/0161-5890(94)90001-9. PMID 8114766.
[73] "New Sculpture Portraying Human Antibody as Protective Angel Installed on Scripps Florida Campus" (http://www.scripps.edu/newsandviews/e_20081110/sculpture.html). Archived (http://www.webcitation.org/5uK08UfTv) from the original on 2010-11-17. Retrieved 2008-12-12.
[74] "Protein sculpture inspired by Vitruvian Man" (http://www.boingboing.net/2008/10/22/protein-sculpture-in.html). Archived (http://www.webcitation.org/5uK0Ai3D4) from the original on 2010-11-17. Retrieved 2008-12-12.
[75] "Emil von Behring Biography" (http://nobelprize.org/nobel_prizes/medicine/laureates/1901/behring-bio.html). Archived (http://www.webcitation.org/5uK0BRd1D) from the original on 2010-11-17. Retrieved 2007-06-05.
[76] AGN (1931). "The Late Baron Shibasaburo Kitasato". Canadian Medical Association Journal 25 (2): 206. PMC 382621. PMID 20318414.
[77] Winau F, Westphal O, Winau R (2004). "Paul Ehrlich--in search of the magic bullet". Microbes Infect. 6 (8): 786-789. doi:10.1016/j.micinf.2004.04.003. PMID 15207826.
[78] Silverstein AM (2003). "Cellular versus humoral immunology: a century-long dispute". Nat. Immunol. 4 (5): 425-428. doi:10.1038/ni0503-425. PMID 12719732.
[79] Van Epps HL (2006). "Michael Heidelberger and the demystification of antibodies" (http://www.jem.org/cgi/reprint/203/1/5.pdf). J. Exp. Med. 203 (1): 5. doi:10.1084/jem.2031fta. PMC 2118068. PMID 16523537. Archived (http://www.webcitation.org/5uK0EkKLx) from the original on 2010-11-17.

165

Antibody
[80] Marrack, JR (1938). Chemistry of antigens and antibodies (2nd ed.). London: His Majesty's Stationery Office. OCLC3220539. [81] "The Linus Pauling Papers: How Antibodies and Enzymes Work" (http:/ / profiles. nlm. nih. gov/ MM/ Views/ Exhibit/ narrative/ specificity. html). Archived (http:/ / www. webcitation. org/ 5uK0FQBmR) from the original on 2010-11-17. . Retrieved 2007-06-05. [82] Silverstein AM (2004). "Labeled antigens and antibodies: the evolution of magic markers and magic bullets" (http:/ / users. path. ox. ac. uk/ ~seminars/ halelibrary/ Paper 18. pdf). Nat. Immunol. 5 (12): 12111217. doi:10.1038/ni1140. PMID15549122. Archived (http:/ / www. webcitation. org/ 5m6w1MlHG) from the original on 2009-12-18. . [83] Edelman GM, Gally JA (1962). "The nature of Bence-Jones proteins. Chemical similarities to polypetide chains of myeloma globulins and normal gamma-globulins". J. Exp. Med. 116 (2): 207227. doi:10.1084/jem.116.2.207. PMC2137388. PMID13889153. [84] Stevens FJ, Solomon A, Schiffer M (1991). "Bence Jones proteins: a powerful tool for the fundamental study of protein chemistry and pathophysiology". Biochemistry 30 (28): 68036805. doi:10.1021/bi00242a001. PMID2069946. [85] Raju TN (1999). "The Nobel chronicles. 1972: Gerald M Edelman (b 1929) and Rodney R Porter (1917-85)". Lancet 354 (9183): 1040. doi:10.1016/S0140-6736(05)76658-7. PMID10501404. [86] Hochman J, Inbar D, Givol D (1973). "An active antibody fragment (Fv) composed of the variable portions of heavy and light chains". Biochemistry 12 (6): 11301135. doi:10.1021/bi00730a018. PMID4569769. [87] Tomasi TB (1992). "The discovery of secretory IgA and the mucosal immune system". Immunol. Today 13 (10): 416418. doi:10.1016/0167-5699(92)90093-M. PMID1343085. [88] Preud'homme JL, Petit I, Barra A, Morel F, Lecron JC, Lelivre E (2000). "Structural and functional properties of membrane and secreted IgD". Mol. Immunol. 37 (15): 871887. doi:10.1016/S0161-5890(01)00006-2. PMID11282392. 
[89] Johansson SG (2006). "The discovery of immunoglobulin E". Allergy and asthma proceedings : the official journal of regional and state allergy societies 27 (2 Suppl 1): S36. PMID16722325. [90] Hozumi N, Tonegawa S (1976). "Evidence for somatic rearrangement of immunoglobulin genes coding for variable and constant regions". Proc. Natl. Acad. Sci. U.S.A. 73 (10): 36283632. doi:10.1073/pnas.73.10.3628. PMC431171. PMID824647.


External links
Mike's Immunoglobulin Structure/Function Page (http://www.path.cam.ac.uk/~mrc7/mikeimages.html) at University of Cambridge
Antibodies as the PDB molecule of the month (http://www.rcsb.org/pdb/static.do?p=education_discussion/molecule_of_the_month/pdb21_1.html) Discussion of the structure of antibodies at RCSB Protein Data Bank
Microbiology and Immunology On-line Textbook (http://pathmicro.med.sc.edu/mayer/IgStruct2000.htm) at University of South Carolina
A hundred years of antibody therapy (http://users.path.ox.ac.uk/~scobbold/tig/new1/mabth.html) History and applications of antibodies in the treatment of disease at University of Oxford
How Lymphocytes Produce Antibody (http://www.cellsalive.com/antibody.htm) from Cells Alive!
Antibody applications (http://www.ii.bham.ac.uk/clinicalimmunology/CISimagelibrary/) Fluorescent antibody image library, University of Birmingham

Immunoprecipitation

Immunoprecipitation (IP) is the technique of precipitating a protein antigen out of solution using an antibody that specifically binds to that particular protein. This process can be used to isolate and concentrate a particular protein from a sample containing many thousands of different proteins. Immunoprecipitation requires that the antibody be coupled to a solid substrate at some point in the procedure.

Types of immunoprecipitation
Individual protein immunoprecipitation (IP)
Individual protein immunoprecipitation uses an antibody that is specific for a known protein to isolate that protein from a solution containing many different proteins. These solutions are often crude lysates of plant or animal tissue. Other sample types include bodily fluids and other samples of biological origin.

Protein complex immunoprecipitation (Co-IP)


Immunoprecipitation of intact protein complexes (i.e. antigen along with any proteins or ligands that are bound to it) is known as co-immunoprecipitation (Co-IP). Co-IP works by selecting an antibody that targets a known protein that is believed to be a member of a larger complex of proteins. By targeting this known member with an antibody it may become possible to pull the entire protein complex out of solution and thereby identify unknown members of the complex. This works when the proteins involved in the complex bind to each other tightly, making it possible to pull multiple members of the complex out of solution by latching onto one member with an antibody. This concept of pulling protein complexes out of solution is sometimes referred to as a "pull-down". Co-IP is a powerful technique that is used regularly by molecular biologists to analyze protein–protein interactions.

Identifying the members of protein complexes may require several rounds of precipitation with different antibodies for a number of reasons:

A particular antibody often selects for a subpopulation of its target protein that has the epitope exposed, thus failing to identify any proteins in complexes that hide the epitope. This can be seen in that it is rarely possible to precipitate even half of a given protein from a sample with a single antibody, even when a large excess of antibody is used.

The first round of IP will often result in the identification of many new proteins that are putative members of the complex being studied. The researcher will then obtain antibodies that specifically target one of the newly identified proteins and repeat the entire immunoprecipitation experiment. This second round of precipitation may result in the recovery of additional new members of a complex that were not identified in the previous experiment. As successive rounds of targeting and immunoprecipitation take place, the number of identified proteins may continue to grow. The identified proteins may not ever exist in a single complex at a given time, but may instead represent a network of proteins interacting with one another at different times for different purposes.

Repeating the experiment by targeting different members of the protein complex allows the researcher to double-check the result. Each round of pull-downs should result in the recovery of both the original known protein and other previously identified members of the complex (and even new additional members). By repeating the immunoprecipitation in this way, the researcher verifies that each identified member of the protein complex was a valid identification. If a particular protein can only be recovered by targeting one of the known members but not by targeting other known members, then that protein's status as a member of the complex may be subject to question.
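The cross-checking logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protein names, the `pulldowns` mapping and the helper functions `support_counts` and `questionable_members` are hypothetical, and the threshold of two baits is an arbitrary choice rather than an accepted standard.

```python
from collections import Counter


def support_counts(pulldowns):
    """Count, for each recovered protein, how many different baits recovered it.

    `pulldowns` maps a bait (the antibody target) to the set of proteins
    identified in that pull-down. A bait is not counted as support for
    its own membership.
    """
    counts = Counter()
    for bait, recovered in pulldowns.items():
        for protein in recovered:
            if protein != bait:
                counts[protein] += 1
    return counts


def questionable_members(pulldowns, min_baits=2):
    """Return proteins recovered by fewer than `min_baits` different baits."""
    counts = support_counts(pulldowns)
    return sorted(p for p, n in counts.items() if n < min_baits)


# Hypothetical results from three rounds of Co-IP (all names are made up).
pulldowns = {
    "A": {"A", "B", "C", "X"},
    "B": {"A", "B", "C"},
    "C": {"A", "B", "C"},
}
print(questionable_members(pulldowns))
```

In this made-up data set, protein X is recovered only when A is targeted, so it is the one whose membership would be subject to question.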


Chromatin immunoprecipitation (ChIP)


Chromatin immunoprecipitation (ChIP) is a method used to determine the location of DNA binding sites on the genome for a particular protein of interest. This technique gives a picture of the protein–DNA interactions that occur inside the nucleus of living cells or tissues. The in vivo nature of this method is in contrast to other approaches traditionally employed to answer the same questions.

The principle underpinning this assay is that DNA-binding proteins (including transcription factors and histones) in living cells can be cross-linked to the DNA that they are binding. By using an antibody that is specific to a putative DNA-binding protein, one can immunoprecipitate the protein–DNA complex out of cellular lysates. The cross-linking is often accomplished by applying formaldehyde to the cells (or tissue), although it is sometimes advantageous to use a more defined and consistent cross-linker such as DTBP. Following cross-linking, the cells are lysed and the DNA is broken into pieces 0.2–1.0 kb in length by sonication. At this point the immunoprecipitation is performed, resulting in the purification of protein–DNA complexes. The purified protein–DNA complexes are then heated to reverse the formaldehyde cross-linking, allowing the DNA to be separated from the proteins. The identity and quantity of the DNA fragments isolated can then be determined by PCR. The limitation of performing PCR on the isolated fragments is that one must have an idea which genomic region is being targeted in order to generate the correct PCR primers. This limitation is easily circumvented by cloning the isolated genomic DNA into a plasmid vector and then using primers that are specific to the cloning region of that vector. Alternatively, when one wants to find where the protein binds on a genome-wide scale, a DNA microarray can be used (ChIP-on-chip or ChIP-chip), allowing for the characterization of the cistrome. ChIP-sequencing has also emerged as a technology that can localize protein binding sites in a high-throughput, cost-effective fashion.

ChIP-sequencing workflow

RNA immunoprecipitation (RIP)


RNA immunoprecipitation (RIP) is similar to chromatin immunoprecipitation (ChIP) outlined above, but rather than targeting DNA-binding proteins as in ChIP, it targets RNA-binding proteins.[1] RIP is also an in vivo method in that live cells are lysed and the immunoprecipitation is performed with an antibody that targets the protein of interest. Isolating the protein also isolates the RNA that is bound to it. The purified RNA–protein complexes can be separated by performing an RNA extraction, and the identity of the RNA can be determined by cDNA sequencing[2] or RT-PCR. Some variants of RIP, such as PAR-CLIP, include cross-linking steps, which then require less careful lysis conditions.

Tagged proteins
One of the major technical hurdles with immunoprecipitation is the great difficulty in generating an antibody that specifically targets a single known protein. To get around this obstacle, many groups will engineer tags onto either the C- or N-terminal end of the protein of interest. The advantage here is that the same tag can be used time and again on many different proteins and the researcher can use the same antibody each time. The advantages of using tagged proteins are so great that this technique has become commonplace for all types of immunoprecipitation, including all of the types of IP detailed above. Examples of tags in use are the Green Fluorescent Protein (GFP) tag, the glutathione-S-transferase (GST) tag and the FLAG-tag. While the use of a tag to enable pull-downs is convenient, it raises some concerns regarding biological relevance because the tag itself may either obscure native interactions or introduce new and unnatural interactions.


Methods
The two general methods for immunoprecipitation are the direct capture method and the indirect capture method.

Direct
Antibodies that are specific for a particular protein (or group of proteins) are immobilized on a solid-phase substrate such as superparamagnetic microbeads or microscopic agarose (non-magnetic) beads. The beads with bound antibodies are then added to the protein mixture, and the proteins that are targeted by the antibodies are captured onto the beads via the antibodies; in other words, they become immunoprecipitated.

Indirect
Antibodies that are specific for a particular protein, or a group of proteins, are added directly to the mixture of protein. The antibodies have not yet been attached to a solid-phase support, so they are free to float around the protein mixture and bind their targets. After a period of incubation, beads coated in protein A/G are added to the mixture of antibody and protein. At this point, the antibodies, which are now bound to their targets, will stick to the beads. From this point on, the direct and indirect protocols converge because the samples now have the same ingredients. Both methods give the same end result, with the protein or protein complexes bound to the antibodies, which themselves are immobilized onto the beads.

Selection
An indirect approach is sometimes preferred when the concentration of the protein target is low or when the specific affinity of the antibody for the protein is weak. The indirect method is also used when the binding kinetics of the antibody to the protein are slow, which can happen for a variety of reasons. In most situations, the direct method is the default and preferred choice.
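The selection guidance above can be written as a simple rule of thumb. The sketch below is a deliberate simplification: the function name `suggest_capture_method` and its boolean inputs are hypothetical, and the text only says the indirect approach is "sometimes preferred" under these conditions, so the output is a starting point rather than a rule.

```python
def suggest_capture_method(low_target_concentration, weak_affinity, slow_kinetics):
    """Suggest "indirect" if any condition favouring it holds, else "direct".

    The three flags correspond to the three situations named in the text:
    a low concentration of target protein, a weak specific affinity of the
    antibody, or slow antibody binding kinetics.
    """
    if low_target_concentration or weak_affinity or slow_kinetics:
        return "indirect"
    return "direct"


print(suggest_capture_method(False, False, False))
print(suggest_capture_method(True, False, False))
```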

Technological advances
Agarose
Historically the solid-phase support for immunoprecipitation used by the majority of scientists has been highly porous agarose beads (also known as agarose resins or slurries). The advantages of this technology are a very high potential binding capacity, as virtually the entire sponge-like structure of the agarose particle (50 to 150 μm in size) is available for binding antibodies (which will in turn bind the target proteins), and the use of standard laboratory equipment for all aspects of the IP protocol without the need for any specialized equipment. The advantage of an extremely high binding capacity must be carefully balanced with the quantity of antibody that the researcher is prepared to use to coat the agarose beads. Because antibodies can be a cost-limiting factor, it is best to calculate backward from the amount of protein that needs to be captured (depending upon the analysis to be performed downstream), to the amount of antibody that is required to bind that quantity of protein (with a small excess added in order to account for inefficiencies of the system), and back still further to the quantity of agarose that is needed to bind that particular quantity of antibody. In cases where antibody saturation is not required, this technology is unmatched in its ability to capture extremely large quantities of target protein. The caveat here is that the "high capacity advantage" can become a "high capacity disadvantage" that is manifested when the enormous binding capacity of the sepharose/agarose beads is not completely saturated with antibodies. It often happens that the amount of antibody available to the researcher for their immunoprecipitation experiment is less than sufficient to saturate the agarose beads to be used in the immunoprecipitation. In these cases the researcher can end up with agarose particles that are only partially coated with antibodies, and the portion of the binding capacity of the agarose beads that is not coated with antibody is then free to bind anything that will stick, resulting in an elevated background signal due to non-specific binding of lysate components to the beads, which can make data interpretation difficult. While some may argue that for these reasons it is prudent to match the quantity of agarose (in terms of binding capacity) to the quantity of antibody that one wishes to be bound for the immunoprecipitation, a simple way to reduce the issue of non-specific binding to agarose beads and increase specificity is to preclear the lysate, which for any immunoprecipitation is highly recommended.[3][4]

Preclearing
Lysates are complex mixtures of proteins, lipids, carbohydrates and nucleic acids, and one must assume that some amount of non-specific binding to the IP antibody, Protein A/G or the beaded support will occur and negatively affect the detection of the immunoprecipitated target(s). In most cases, preclearing the lysate at the start of each immunoprecipitation experiment (see step 2 in the "protocol" section below)[5] is a way to remove potentially reactive components from the cell lysate prior to the immunoprecipitation to prevent the non-specific binding of these components to the IP beads or antibody. In the basic preclearing procedure, the lysate is incubated with beads alone, which are then removed and discarded prior to the immunoprecipitation.[5] This approach, though, does not account for non-specific binding to the IP antibody, which can be considerable. Therefore, an alternative method of preclearing is to incubate the protein mixture with exactly the same components that will be used in the immunoprecipitation, except that a non-target, irrelevant antibody of the same antibody subclass as the IP antibody is used instead of the IP antibody itself.[4] This approach attempts to match the conditions and components of the actual immunoprecipitation as closely as possible in order to remove any non-specific cell constituent without capturing the target protein (unless, of course, the target protein non-specifically binds to some other IP component, which should be properly controlled for by analyzing the discarded beads used to preclear the lysate). The target protein can then be immunoprecipitated with a reduced risk of non-specific binding interfering with data interpretation.
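The backward calculation recommended in the Agarose section (from target protein, to antibody, to agarose) can be sketched as follows. The function `plan_reagents`, its parameter names and every number in the example are hypothetical placeholders; real values depend on the antibody, the target and the capacity stated by the bead manufacturer.

```python
def plan_reagents(target_ug, antibody_ug_per_target_ug, excess_fraction,
                  bead_capacity_ug_per_ul):
    """Work backward from the protein to capture to the beads required.

    target_ug                 amount of target protein needed downstream (ug)
    antibody_ug_per_target_ug antibody needed per ug of target (assumed ratio)
    excess_fraction           small excess to cover inefficiencies (0.2 = 20%)
    bead_capacity_ug_per_ul   antibody bound per ul of beads (assumed)

    Returns (antibody_ug, bead_ul).
    """
    # Step 1: antibody needed to bind the target, plus a small excess.
    antibody_ug = target_ug * antibody_ug_per_target_ug * (1.0 + excess_fraction)
    # Step 2: beads needed to hold exactly that amount of antibody.
    bead_ul = antibody_ug / bead_capacity_ug_per_ul
    return antibody_ug, bead_ul


# Placeholder numbers for illustration only.
antibody_ug, bead_ul = plan_reagents(
    target_ug=2.0,
    antibody_ug_per_target_ug=2.5,
    excess_fraction=0.2,
    bead_capacity_ug_per_ul=0.5,
)
print(f"antibody: {antibody_ug:.1f} ug, beads: {bead_ul:.1f} ul")
```

Sizing the beads to the antibody in this way is the matching strategy mentioned in the text; it does not replace preclearing, which addresses non-specific binding to the other IP components.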


Superparamagnetic beads
While the vast majority of immunoprecipitations are performed with agarose beads, the use of superparamagnetic beads for immunoprecipitation is a much newer approach that is only recently gaining in popularity as an alternative to agarose beads for IP applications. Unlike agarose, magnetic beads are solid and can be spherical, depending on the type of bead, and antibody binding is limited to the surface of each bead. While these beads do not have the advantage of a porous center to increase the binding capacity, magnetic beads are significantly smaller than agarose beads (1 to 4 μm), and the greater number of magnetic beads per volume than agarose beads collectively gives magnetic beads an effective surface area-to-volume ratio for optimum antibody binding. Commercially available magnetic beads can be classified by size uniformity into monodisperse and polydisperse beads. Monodisperse beads, also called microbeads, exhibit exact uniformity, and therefore all beads exhibit identical physical characteristics, including the binding capacity and the level of attraction to magnets. Polydisperse beads, while in the same size range as monodisperse beads, vary widely in size (1 to 4 μm), which can influence their binding capacity and magnetic capture. Although both types of beads are commercially available for immunoprecipitation applications, the higher-quality monodisperse superparamagnetic beads are better suited to automated protocols because of their consistent size, shape and performance. Monodisperse and polydisperse superparamagnetic beads are offered by many companies, including Invitrogen, Thermo Scientific, and Millipore.
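The surface area-to-volume argument can be made concrete with elementary geometry. The sketch below assumes ideal solid spheres and counts only the outer surface, so it says nothing about the internal, porous binding surface of agarose; the function name `sphere_surface_to_volume` is introduced here for illustration, and the diameters are the size ranges quoted in the text.

```python
import math


def sphere_surface_to_volume(diameter_um):
    """Outer surface area divided by volume for a solid sphere (per um).

    Algebraically this reduces to 6 / diameter, so smaller beads expose
    more outer surface per unit of bead volume.
    """
    radius = diameter_um / 2.0
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return area / volume


# Magnetic beads (1 to 4 um) versus agarose particles (50 to 150 um).
for diameter in (1.0, 4.0, 50.0, 150.0):
    ratio = sphere_surface_to_volume(diameter)
    print(f"{diameter:6.1f} um  ->  {ratio:.3f} per um")
```

Because the ratio scales as 6/d, a given packed volume of small solid beads presents far more outer surface than the same volume of large ones, which is the sense in which the smaller magnetic beads compensate for lacking a porous interior.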


Agarose vs. Magnetic Beads


Proponents of magnetic beads claim that the beads exhibit a faster rate of protein binding[6][7][8] than agarose beads for immunoprecipitation applications, although standard agarose bead-based immunoprecipitations have been performed in 1 hour.[4] Claims have also been made that magnetic beads are better for immunoprecipitating extremely large protein complexes because of the complete lack of an upper size limit for such complexes,[6][7][9] although there is no unbiased evidence supporting this claim. The nature of magnetic bead technology does result in less sample handling[7] due to the reduced physical stress on samples of magnetic separation versus repeated centrifugation when using agarose, which may contribute greatly to increasing the yield of labile (fragile) protein complexes.[7][8][9] Additional factors, though, such as the binding capacity, cost of the reagent, the requirement of extra equipment and the capability to automate IP processes should be considered in the selection of an immunoprecipitation support.

Binding Capacity
Proponents of both agarose and magnetic beads can argue whether the vast difference in the binding capacities of the two beads favors one particular type of bead. In a bead-to-bead comparison, agarose beads have significantly greater surface area and therefore a greater binding capacity than magnetic beads due to the large bead size and sponge-like structure. But the variable pore size of the agarose causes a potential upper size limit that may affect the binding of extremely large proteins or protein complexes to internal binding sites, and therefore magnetic beads may be better suited for immunoprecipitating large proteins or protein complexes than agarose beads, although there is a lack of independent comparative evidence that proves either case. Some argue that the significantly greater binding capacity of agarose beads may be a disadvantage because of the larger capacity for non-specific binding. Others may argue for the use of magnetic beads because of the greater quantity of antibody required to saturate the total binding capacity of agarose beads, which would be an economic disadvantage of using agarose. While these arguments are correct outside the context of their practical use, these lines of reasoning ignore two key aspects of the principle of immunoprecipitation that demonstrate that the decision to use agarose or magnetic beads is not simply determined by binding capacity. First, non-specific binding is not limited to the antibody-binding sites on the immobilized support; any surface of the antibody or component of the immunoprecipitation reaction can bind to non-specific lysate constituents, and therefore non-specific binding will still occur even when completely saturated beads are used. This is why it is important to preclear the sample before the immunoprecipitation is performed. Second, the ability to capture the target protein is directly dependent upon the amount of immobilized antibody used, and therefore, in a side-by-side comparison of agarose and magnetic bead immunoprecipitation, the most protein that either support can capture is limited by the amount of antibody added. So the decision to saturate any type of support depends on the amount of protein required, as described above in the Agarose section of this page.

Cost
The price of using either type of support is a key determining factor in using agarose or magnetic beads for immunoprecipitation applications. A typical first-glance calculation on the cost of magnetic beads compared to sepharose beads[10] may make the sepharose beads appear less expensive. But magnetic beads may be competitively priced compared to agarose for analytical-scale immunoprecipitations depending on the IP method used and the volume of beads required per IP reaction. Using the traditional batch method of immunoprecipitation as listed below, where all components are added to a tube during the IP reaction, the physical handling characteristics of agarose beads necessitate a minimum quantity of beads for each IP experiment (typically in the range of 25 to 50 μl beads per IP). This is because sepharose beads must be concentrated at the bottom of the tube by centrifugation and the supernatant removed after each incubation, wash, etc. This imposes absolute physical limitations on the process, as pellets of agarose beads less than 25 to 50 μl are difficult if not impossible to visually identify at the bottom of the tube. With magnetic beads, there is no minimum quantity of beads required due to magnetic handling, and therefore, depending on the target antigen and IP antibody, it is possible to use considerably fewer magnetic beads. Conversely, spin columns may be employed instead of normal microfuge tubes to significantly reduce the amount of agarose beads required per reaction. Spin columns contain a filter that allows all IP components except the beads to flow through using a brief centrifugation and therefore provide a method to use significantly fewer agarose beads with minimal loss.

Equipment
As mentioned above, only standard laboratory equipment is required for the use of agarose beads in immunoprecipitation applications, while high-power magnets are required for magnetic bead-based IP reactions. While the magnetic capture equipment may be cost-prohibitive, the rapid completion of immunoprecipitations using magnetic beads may be a financially beneficial approach when grants are due, because a 30-minute protocol with magnetic beads compared to overnight incubation at 4 °C with agarose beads may result in more data generated in a shorter length of time.[6][7][8]

Automation
An added benefit of using magnetic beads is that automated immunoprecipitation devices are becoming more readily available. These devices not only reduce the amount of work and time to perform an IP, but they can also be used for high-throughput applications.

Summary
While clear benefits of using magnetic beads include the increased reaction speed, more gentle sample handling and the potential for automation, the choice of using agarose or magnetic beads based on the binding capacity of the support medium and the cost of the product may depend on the protein of interest and the IP method used. As with all assays, empirical testing is required to determine which method is optimal for a given application.


Protocol
Background
Once the solid substrate bead technology has been chosen, antibodies are coupled to the beads and the antibody-coated beads can be added to the heterogeneous protein sample (e.g. homogenized tissue). At this point, antibodies that are immobilized to the beads will bind to the proteins that they specifically recognize. Once this has occurred the immunoprecipitation portion of the protocol is actually complete, as the specific proteins of interest are bound to the antibodies that are themselves immobilized to the beads. Separation of the immunocomplexes from the lysate is an extremely important series of steps, because the protein(s) must remain bound to each other (in the case of co-IP) and bound to the antibody during the wash steps to remove non-bound proteins and reduce background.

When working with agarose beads, the beads must be pelleted out of the sample by briefly spinning in a centrifuge with forces between 600 and 3,000 × g (times the standard gravitational force). This step may be performed in a standard microcentrifuge tube, but for faster separation, greater consistency and higher recoveries, the process is often performed in small spin columns with a pore size that allows liquid, but not agarose beads, to pass through. After centrifugation, the agarose beads will form a very loose, fluffy pellet at the bottom of the tube. The supernatant containing contaminants can be carefully removed so as not to disturb the beads. The wash buffer can then be added to the beads and, after mixing, the beads are again separated by centrifugation.

With superparamagnetic beads, the sample is placed in a magnetic field so that the beads can collect on the side of the tube. This procedure is generally complete in approximately 30 seconds, and the remaining (unwanted) liquid is pipetted away. Washes are accomplished by resuspending the beads (off the magnet) with the washing solution and then concentrating the beads back on the tube wall (by placing the tube back on the magnet). The washing is generally repeated several times to ensure adequate removal of contaminants. If the superparamagnetic beads are homogeneous in size and the magnet has been designed properly, the beads will concentrate uniformly on the side of the tube and the washing solution can be easily and completely removed.

After washing, the precipitated protein(s) are eluted and analyzed by gel electrophoresis, mass spectrometry, western blotting, or any number of other methods for identifying constituents in the complex. Protocol times for immunoprecipitation vary greatly due to a variety of factors, with protocol times increasing with the number of washes necessary or with the slower reaction kinetics of porous agarose beads.
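Centrifuges are often set in revolutions per minute rather than in multiples of g, so the 600 to 3,000 × g range quoted for pelleting agarose beads has to be converted for the rotor in use. The sketch below uses the standard relation RCF = 1.118 × 10^-5 × r × rpm^2, with r the rotor radius in centimetres. The helper names and the 7 cm radius are assumptions for illustration; the actual radius should be taken from the rotor's documentation.

```python
import math

# Standard conversion factor for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a relative centrifugal force (x g)."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) produced at a given rotor speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


# Assumed rotor radius; check the value for the rotor actually used.
radius_cm = 7.0
for rcf in (600, 3000):
    print(f"{rcf} x g  ->  about {rcf_to_rpm(rcf, radius_cm):.0f} rpm")
```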


Steps
1. Lyse cells and prepare sample for immunoprecipitation.
2. Pre-clear the sample by passing the sample over beads alone or bound to an irrelevant antibody to soak up any proteins that non-specifically bind to the IP components.
3. Incubate solution with antibody against the protein of interest. Antibody can be attached to solid support before this step (direct method) or after this step (indirect method). Continue the incubation to allow antibody-antigen complexes to form.
4. Precipitate the complex of interest, removing it from bulk solution.
5. Wash precipitated complex several times. Spin each time between washes when using agarose beads or place tube on magnet when using superparamagnetic beads and then remove the supernatant. After the final wash, remove as much supernatant as possible.
6. Elute proteins from the solid support using low-pH or SDS sample loading buffer.
7. Analyze complexes or antigens of interest. This can be done in a variety of ways:
   1. SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) followed by gel staining.
   2. SDS-PAGE followed by gel staining, cutting out individual stained protein bands, and sequencing the proteins in the bands by MALDI mass spectrometry.
   3. Transfer and Western blot using another antibody for proteins that were interacting with the antigen, followed by chemiluminescent visualization.
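The reason step 5 calls for several washes and for removing as much supernatant as possible can be illustrated with a simple dilution model. This is a sketch under strong assumptions: unbound contaminants are taken to be freely soluble and perfectly mixed at each wash, and the same residual volume is left behind every time. The function name `carryover_fraction` and the volumes are hypothetical.

```python
def carryover_fraction(residual_ul, wash_ul, n_washes):
    """Fraction of unbound contaminant remaining after n_washes washes.

    Each wash dilutes the liquid left on the beads (residual_ul) into the
    added wash buffer (wash_ul); the same residual volume is assumed to
    remain after the supernatant is removed.
    """
    per_wash = residual_ul / (residual_ul + wash_ul)
    return per_wash ** n_washes


# Illustrative volumes only: 10 ul left on the beads, 990 ul of wash buffer.
for n in (1, 2, 3):
    print(n, carryover_fraction(10.0, 990.0, n))
```

Under this model, halving the residual volume or adding one more wash both reduce carry-over multiplicatively, whereas proteins that stick non-specifically to the beads are not removed by dilution at all, which is why preclearing is treated as a separate step.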

References
[1] Keene JD, Komisarow JM, Friedersdorf MB (2006). "RIP-Chip: the isolation and identification of mRNAs, microRNAs and protein components of ribonucleoprotein complexes from cell extracts" (http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=17406249). Nat Protoc 1 (1): 302–7. doi:10.1038/nprot.2006.47. PMID 17406249.
[2] Sanford JR, Wang X, Mort M, et al. (March 2009). "Splicing factor SFRS1 recognizes a functionally diverse landscape of RNA transcripts". Genome Res. 19 (3): 381–94. doi:10.1101/gr.082503.108. PMC 2661799. PMID 19116412.
[3] Bonifacino JS, Dell'Angelica EC, Springer TA (2001). "Immunoprecipitation". Current Protocols in Molecular Biology. 10.16.1–10.16.29.
[4] Rosenberg, Ian (2005). Protein analysis and purification: benchtop techniques (http://books.google.com/?id=gi-UgCF8G6EC&pg=PA43&dq=preclear+irrelevant+antibody+isbn#v=onepage&q&f=false). Springer. pp. 520. ISBN 978-0-8176-4340-9.
[5] Crowell RE, Du Clos TW, Montoya G, Heaphy E, Mold C (November 1991). "C-reactive protein receptors on the human monocytic cell line U-937. Evidence for additional binding to Fc gamma RI" (http://www.jimmunol.org/cgi/pmidlookup?view=long&pmid=1834740). Journal of Immunology 147 (10): 3445–51. PMID 1834740.
[6] Alber F, Dokudovskaya S, Veenhoff LM, et al. (November 2007). "The molecular architecture of the nuclear pore complex". Nature 450 (7170): 695–701. doi:10.1038/nature06405. PMID 18046406.
[7] Alber F, Dokudovskaya S, Veenhoff LM, et al. (November 2007). "Determining the architectures of macromolecular assemblies". Nature 450 (7170): 683–94. doi:10.1038/nature06404. PMID 18046405.
[8] Cristea IM, Williams R, Chait BT, Rout MP (December 2005). "Fluorescent proteins as proteomic probes". Molecular & Cellular Proteomics 4 (12): 1933–41. doi:10.1074/mcp.M500227-MCP200. PMID 16155292.
[9] Niepel M, Strambio-de-Castillia C, Fasolo J, Chait BT, Rout MP (July 2005). "The nuclear pore complex–associated protein, Mlp2p, binds to the yeast spindle pole body and promotes its efficient assembly". The Journal of Cell Biology 170 (2): 225–35. doi:10.1083/jcb.200504140. PMC 2171418. PMID 16027220.
[10] http://www.effectsofglutathione.com/glutathione-sepharose/


External links
Analysis of Proteins Using Immunoprecipitation (http://www.animal.ufl.edu/hansen/protocols/imp98.prt.htm) at ufl.edu
Immunoprecipitation (http://www.nlm.nih.gov/cgi/mesh/2011/MB_cgi?mode=&term=Immunoprecipitation) at the US National Library of Medicine Medical Subject Headings (MeSH)
Chromatin immunoprecipitation (http://www.nlm.nih.gov/cgi/mesh/2011/MB_cgi?mode=&term=Chromatin+immunoprecipitation) at the US National Library of Medicine Medical Subject Headings (MeSH)
Introduction to Immunoprecipitation Methodology (http://www.piercenet.com/browse.cfm?fldID=E5049863-D59D-B7A5-D057-E23F7B22AD44)

Coagulation
Coagulation (thrombogenesis) is the process by which blood forms clots. It is an important part of hemostasis, the cessation of blood loss from a damaged vessel, wherein a damaged blood vessel wall is covered by a platelet and fibrin-containing clot to stop bleeding and begin repair of the damaged vessel. Disorders of coagulation can lead to an increased risk of bleeding (hemorrhage) or obstructive clotting (thrombosis).[1] Coagulation is highly conserved throughout biology; in all mammals, coagulation involves both a cellular (platelet) and a protein (coagulation factor) component.[2] The system in humans has been the most extensively researched and is the best understood.

Blood coagulation pathways in vivo showing the central role played by thrombin

Coagulation begins almost instantly after an injury to the blood vessel has damaged the endothelium lining the vessel. Exposure of the blood to proteins such as tissue factor initiates changes to blood platelets and the plasma protein fibrinogen, a clotting factor. Platelets immediately form a plug at the site of injury; this is called primary hemostasis. Secondary hemostasis occurs simultaneously: Proteins in the blood plasma, called coagulation factors or clotting factors, respond in a complex cascade to form fibrin strands, which strengthen the platelet plug.[3]


Physiology
Platelet activation
When endothelium is damaged, the normally isolated, underlying collagen is exposed to circulating platelets, which bind directly to collagen with collagen-specific glycoprotein Ia/IIa surface receptors. This adhesion is strengthened further by von Willebrand factor (vWF), which is released from the endothelium and from platelets; vWF forms additional links between the platelets' glycoprotein Ib/IX/V and the collagen fibrils. These adhesions also activate the platelets.[4] Activated platelets release the contents of stored granules into the blood plasma. The granules include ADP, serotonin, platelet-activating factor (PAF), vWF, platelet factor 4, and thromboxane A2 (TXA2), which, in turn, activate additional platelets. The granules' contents activate a Gq-linked protein receptor cascade, resulting in increased calcium concentration in the platelets' cytosol. The calcium activates protein kinase C, which, in turn, activates phospholipase A2 (PLA2). PLA2 then modifies the integrin membrane glycoprotein IIb/IIIa, increasing its affinity to bind fibrinogen. The activated platelets change shape from spherical to stellate, and the fibrinogen cross-links with glycoprotein IIb/IIIa aid in aggregation of adjacent platelets (completing primary hemostasis).[4]

The coagulation cascade


The coagulation cascade of secondary hemostasis has two pathways which lead to fibrin formation. These are the contact activation pathway (also known as the intrinsic pathway), and the tissue factor pathway (also known as the extrinsic pathway). It was previously thought that the coagulation cascade consisted of two pathways of equal importance joined to a common pathway. It is now known that the primary pathway for the initiation of blood coagulation is the tissue factor pathway. The pathways are a series of reactions, in which a zymogen (inactive enzyme precursor) of a serine protease and its glycoprotein co-factor are activated to become active components that then catalyze the next reaction in the cascade, ultimately resulting in cross-linked fibrin. Coagulation factors are generally indicated by Roman numerals, with a lowercase a appended to indicate an active form.[5]

The classical blood coagulation pathway[5]

The coagulation factors are generally serine proteases (enzymes), which act by cleaving downstream proteins. There are some exceptions. For example, FVIII and FV are glycoproteins, and Factor XIII is a transglutaminase.[5] The coagulation factors circulate as inactive zymogens. The coagulation cascade is classically divided into three pathways. The tissue factor and contact activation pathways both activate the "final common pathway" of factor X, thrombin and fibrin.[6]

Tissue factor pathway (extrinsic)

The main role of the tissue factor pathway is to generate a "thrombin burst," a process by which thrombin, the most important constituent of the coagulation cascade in terms of its feedback activation roles, is released instantaneously. FVIIa circulates in a higher amount than any other activated coagulation factor. Following damage to the blood vessel, FVII leaves the circulation and comes into contact with tissue factor (TF) expressed on tissue-factor-bearing cells (stromal fibroblasts and leukocytes), forming an activated complex (TF-FVIIa). TF-FVIIa activates FIX and FX. FVII is itself activated by thrombin, FXIa, FXII and FXa. The activation of FX (to form FXa) by TF-FVIIa is almost immediately inhibited by tissue factor pathway inhibitor (TFPI). FXa and its co-factor FVa form the prothrombinase complex, which activates prothrombin to thrombin. Thrombin then activates other components of the coagulation cascade, including FV and FVIII (releasing FVIII from being bound to vWF) and FXI, which, in turn, activates FIX. FVIIIa is the co-factor of FIXa, and together they form the "tenase" complex, which activates FX; and so the cycle continues. ("Tenase" is a contraction of "ten" and the suffix "-ase" used for enzymes.)[5]

Contact activation pathway (intrinsic)

The contact activation pathway begins with formation of the primary complex on collagen by high-molecular-weight kininogen (HMWK), prekallikrein, and FXII (Hageman factor). Prekallikrein is converted to kallikrein and FXII becomes FXIIa. FXIIa converts FXI into FXIa. Factor XIa activates FIX, which with its co-factor FVIIIa forms the tenase complex, which activates FX to FXa. The minor role that the contact activation pathway has in initiating clot formation can be illustrated by the fact that patients with severe deficiencies of FXII, HMWK, and prekallikrein do not have a bleeding disorder. Instead, the contact activation system seems to be more involved in inflammation.[5]

Final common pathway

Thrombin has a large array of functions. Its primary role is the conversion of fibrinogen to fibrin, the building block of a hemostatic plug. In addition, it activates Factors VIII and V and their inhibitor protein C (in the presence of thrombomodulin), and it activates Factor XIII, which forms covalent bonds that crosslink the fibrin polymers that form from activated monomers.[5] Following activation by the contact factor or tissue factor pathways, the coagulation cascade is maintained in a prothrombotic state by the continued activation of FVIII and FIX to form the tenase complex, until it is down-regulated by the anticoagulant pathways.[5]
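The pathways described in this article can be pictured as a directed graph of activation steps. The following Python sketch is only an illustration under simplifying assumptions: complexes and cofactors are folded into single nodes, inhibitors such as TFPI are ignored, and the names `ACTIVATES` and `downstream` are hypothetical helpers introduced here, not part of any library.

```python
from collections import deque

# Activation edges (activator -> products), simplified from the pathway
# descriptions in this article. Cofactor requirements are noted in comments
# only; inhibitors and feedback on FVII are deliberately left out.
ACTIVATES = {
    "TF-FVIIa": ["FIXa", "FXa"],   # tissue factor (extrinsic) pathway
    "FXIIa": ["FXIa"],             # contact activation (intrinsic) pathway
    "FXIa": ["FIXa"],
    "FIXa": ["FXa"],               # needs cofactor FVIIIa (tenase complex)
    "FXa": ["thrombin"],           # needs cofactor FVa (prothrombinase complex)
    "thrombin": ["fibrin", "FVa", "FVIIIa", "FXIa", "FXIIIa"],
}


def downstream(start):
    """Return every species reachable from `start` by breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for product in ACTIVATES.get(node, []):
            if product not in seen:
                seen.add(product)
                queue.append(product)
    return seen


if __name__ == "__main__":
    print(sorted(downstream("TF-FVIIa")))
    print(sorted(downstream("FXIIa")))
```

Both starting points reach fibrin through FXa and thrombin, which is the sense in which the two pathways share a final common pathway; FXIIa is not reachable from TF-FVIIa in this simplified graph.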


Cofactors
Various substances are required for the proper functioning of the coagulation cascade:

Calcium and phospholipid (a platelet membrane constituent) are required for the tenase and prothrombinase complexes to function. Calcium mediates the binding of the complexes via the terminal gamma-carboxy residues on FXa and FIXa to the phospholipid surfaces expressed by platelets, as well as procoagulant microparticles or microvesicles shed from them. Calcium is also required at other points in the coagulation cascade.

Vitamin K is an essential cofactor for a hepatic gamma-glutamyl carboxylase that adds a carboxyl group to glutamic acid residues on factors II, VII, IX and X, as well as Protein S, Protein C and Protein Z. In adding the gamma-carboxyl group to glutamate residues on the immature clotting factors, vitamin K is itself oxidized. Another enzyme, vitamin K epoxide reductase (VKORC), reduces vitamin K back to its active form. Vitamin K epoxide reductase is pharmacologically important as a target of the anticoagulant drug warfarin and related coumarins such as acenocoumarol, phenprocoumon, and dicumarol. These drugs create a deficiency of reduced vitamin K by blocking VKORC, thereby inhibiting maturation of clotting factors. Vitamin K deficiency from other causes (e.g., in malabsorption) or impaired vitamin K metabolism in disease (e.g., in hepatic failure) leads to the formation of PIVKAs (proteins formed in vitamin K absence), which are partially or totally non-gamma-carboxylated, affecting the coagulation factors' ability to bind to phospholipid.

Regulators
Five mechanisms keep platelet activation and the coagulation cascade in check. Abnormalities can lead to an increased tendency toward thrombosis:

Protein C is a major physiological anticoagulant. It is a vitamin K-dependent serine protease enzyme that is activated by thrombin into activated protein C (APC). Protein C is activated in a sequence that starts with Protein C and thrombin binding to a cell surface protein, thrombomodulin. Thrombomodulin binds these proteins in such a way that it activates Protein C. The activated form, along with protein S and a phospholipid as cofactors, degrades FVa and FVIIIa. Quantitative or qualitative deficiency of either protein C or protein S may lead to thrombophilia (a tendency to develop thrombosis). Impaired action of Protein C (activated Protein C resistance), for example by having the "Leiden" variant of Factor V or high levels of FVIII, also may lead to a thrombotic tendency.

Antithrombin is a serine protease inhibitor (serpin) that inhibits the serine proteases thrombin, FIXa, FXa, FXIa, and FXIIa. It is constantly active, but its adhesion to these factors is increased by the presence of heparan sulfate (a glycosaminoglycan) or the administration of heparins (different heparinoids increase affinity to FXa, thrombin, or both). Quantitative or qualitative deficiency of antithrombin (inborn or acquired, e.g., in proteinuria) leads to thrombophilia.

Tissue factor pathway inhibitor (TFPI) limits the action of tissue factor (TF). It also inhibits excessive TF-mediated activation of FVII and FX.

Plasmin is generated by proteolytic cleavage of plasminogen, a plasma protein synthesized in the liver. This cleavage is catalyzed by tissue plasminogen activator (t-PA), which is synthesized and secreted by endothelium. Plasmin proteolytically cleaves fibrin into fibrin degradation products that inhibit excessive fibrin formation.

Prostacyclin (PGI2) is released by endothelium and activates platelet Gs protein-linked receptors. This, in turn, activates adenylyl cyclase, which synthesizes cAMP. cAMP inhibits platelet activation by decreasing cytosolic levels of calcium and, by doing so, inhibits the release of granules that would lead to activation of additional platelets and the coagulation cascade.[7]

Fibrinolysis
Eventually, blood clots are reorganised and resorbed by a process termed fibrinolysis. The main enzyme responsible for this process (plasmin) is regulated by various activators and inhibitors.[7]

Role in immune system


The coagulation system overlaps with the immune system. Coagulation can physically trap invading microbes in blood clots. Also, some products of the coagulation system can contribute to the innate immune system by their ability to increase vascular permeability and act as chemotactic agents for phagocytic cells. In addition, some of the products of the coagulation system are directly antimicrobial. For example, beta-lysine, a protein produced by platelets during coagulation, can cause lysis of many Gram-positive bacteria by acting as a cationic detergent.[8] Many acute-phase proteins of inflammation are involved in the coagulation system. In addition, pathogenic bacteria may secrete agents that alter the coagulation system, e.g. coagulase and streptokinase.


Testing of coagulation
Numerous tests are used to assess the function of the coagulation system:[9]

Common: aPTT, PT (also used to determine INR), fibrinogen testing (often by the Clauss method), platelet count, platelet function testing (often by PFA-100).

Other: TCT, bleeding time, mixing test (whether an abnormality corrects if the patient's plasma is mixed with normal plasma), coagulation factor assays, antiphospholipid antibodies, D-dimer, genetic tests (e.g. factor V Leiden, prothrombin mutation G20210A), dilute Russell's viper venom time (dRVVT), miscellaneous platelet function tests, thromboelastography (TEG or Sonoclot), euglobulin lysis time (ELT).

The contact activation (intrinsic) pathway is initiated by activation of the "contact factors" of plasma, and can be measured by the activated partial thromboplastin time (aPTT) test. The tissue factor (extrinsic) pathway is initiated by release of tissue factor (a specific cellular lipoprotein), and can be measured by the prothrombin time (PT) test. PT results are often reported as a ratio (INR value) to monitor dosing of oral anticoagulants such as warfarin.
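The INR mentioned above normalizes a prothrombin time against a reference value. As a minimal sketch, the standard definition is INR = (patient PT / normal PT) raised to the power ISI, where ISI (International Sensitivity Index) is a reagent-specific value that this article does not discuss; it is included here as an assumption from the usual definition, and the function name `inr` and the example timings are purely illustrative.

```python
def inr(pt_patient_s, pt_normal_s, isi=1.0):
    """Compute the INR from a patient's PT and a normal reference PT.

    pt_patient_s and pt_normal_s are clotting times in seconds.
    isi is the reagent's International Sensitivity Index (assumed 1.0
    here for illustration; real reagents state their own value).
    """
    if pt_patient_s <= 0 or pt_normal_s <= 0:
        raise ValueError("clotting times must be positive")
    return (pt_patient_s / pt_normal_s) ** isi


if __name__ == "__main__":
    # Illustrative numbers only, not clinical reference values.
    print(inr(24.0, 12.0))
```

With an ISI of 1.0 the INR is simply the ratio of the two clotting times, so a PT twice the reference value gives an INR of 2.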

Blood plasma after the addition of Tissue Factor forms a gel-like structure (Test for prothrombin time).

Fibrinogen is screened quantitatively and qualitatively by the thrombin clotting time (TCT). Measurement of the exact amount of fibrinogen present in the blood is generally done using the Clauss method for fibrinogen testing. Many analysers are capable of measuring a "derived fibrinogen" level from the graph of the prothrombin time clot.

If a coagulation factor is part of the contact activation or tissue factor pathway, a deficiency of that factor will affect only one of the tests: thus hemophilia A, a deficiency of factor VIII, which is part of the contact activation pathway, results in an abnormally prolonged aPTT test but a normal PT test. The exceptions are prothrombin, fibrinogen, and some variants of FX that can be detected only by either aPTT or PT. If an abnormal PT or aPTT is present, additional testing will occur to determine which (if any) factor is present at aberrant concentrations. Deficiencies of fibrinogen (quantitative or qualitative) will affect all screening tests.
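The screening logic described above can be sketched as a small decision function. This is an illustration of the reasoning only, not a diagnostic tool: the function name `suspect_pathway` and its return strings are hypothetical, and real interpretation depends on the further testing the text mentions.

```python
def suspect_pathway(pt_prolonged, aptt_prolonged):
    """Map PT/aPTT screening results to the pathway they point toward."""
    if pt_prolonged and aptt_prolonged:
        # Both tests are affected by factors shared by the two pathways,
        # or by more than one deficiency at once.
        return "final common pathway or multiple factors"
    if aptt_prolonged:
        return "contact activation (intrinsic) pathway"
    if pt_prolonged:
        return "tissue factor (extrinsic) pathway"
    return "no abnormality on PT/aPTT screening"


if __name__ == "__main__":
    # Hemophilia A pattern from the text: prolonged aPTT, normal PT.
    print(suspect_pathway(pt_prolonged=False, aptt_prolonged=True))
```

The hemophilia A example from the text (factor VIII deficiency) corresponds to the second branch.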

Laboratory findings in various platelet and coagulation disorders


Condition | Prothrombin time | Partial thromboplastin time | Bleeding time | Platelet count
Vitamin K deficiency or warfarin | Prolonged | Normal or mildly prolonged | Unaffected | Unaffected
Disseminated intravascular coagulation | Prolonged | Prolonged | Prolonged | Decreased
Von Willebrand disease | Unaffected | Prolonged | Prolonged | Unaffected or decreased
Hemophilia | Unaffected | Prolonged | Unaffected | Unaffected
Aspirin | Unaffected | Unaffected | Prolonged | Unaffected
Thrombocytopenia | Unaffected | Unaffected | Prolonged | Decreased
Liver failure, early | Prolonged | Unaffected | Unaffected | Unaffected
Liver failure, end-stage | Prolonged | Prolonged | Prolonged | Decreased
Uremia | Unaffected | Unaffected | Prolonged | Unaffected
Congenital afibrinogenemia | Prolonged | Prolonged | Prolonged | Unaffected
Factor V deficiency | Prolonged | Prolonged | Unaffected | Unaffected
Factor X deficiency as seen in amyloid purpura | Prolonged | Prolonged | Unaffected | Unaffected
Glanzmann's thrombasthenia | Unaffected | Unaffected | Prolonged | Unaffected
Bernard-Soulier syndrome | Unaffected | Unaffected | Prolonged | Decreased or unaffected
Factor XII deficiency | Unaffected | Prolonged | Unaffected | Unaffected

Role in disease
Problems with coagulation may predispose to hemorrhage, thrombosis, and occasionally both, depending on the nature of the pathology.[10]

Platelet disorders
Platelet conditions may be congenital or acquired. Some inborn platelet pathologies are Glanzmann's thrombasthenia, Bernard-Soulier syndrome (abnormal glycoprotein Ib-IX-V complex), gray platelet syndrome (deficient alpha granules), and delta storage pool deficiency (deficient dense granules). Most are rare conditions. Most inborn platelet pathologies predispose to hemorrhage. Von Willebrand disease is due to deficiency or abnormal function of von Willebrand factor, and leads to a similar bleeding pattern; its milder forms are relatively common. Decreased platelet numbers may be due to various causes, including insufficient production (e.g., in myelodysplastic syndrome or other bone marrow disorders), destruction by the immune system (immune thrombocytopenic purpura/ITP), and consumption due to various causes (thrombotic thrombocytopenic purpura/TTP, hemolytic-uremic syndrome/HUS, paroxysmal nocturnal hemoglobinuria/PNH, disseminated intravascular coagulation/DIC, heparin-induced thrombocytopenia/HIT). Most consumptive conditions lead to platelet activation, and some are associated with thrombosis.

Disease and clinical significance of thrombosis


The best-known coagulation factor disorders are the hemophilias. The three main forms are hemophilia A (factor VIII deficiency), hemophilia B (factor IX deficiency or "Christmas disease") and hemophilia C (factor XI deficiency, mild bleeding tendency). Hemophilia A and B are X-linked recessive disorders, whereas hemophilia C is a much rarer autosomal recessive disorder most commonly seen in Ashkenazi Jews.

Von Willebrand disease (which behaves more like a platelet disorder except in severe cases) is the most common hereditary bleeding disorder and may be inherited in an autosomal recessive or dominant pattern. In this disease, there is a defect in von Willebrand factor (vWF), which mediates the binding of glycoprotein Ib (GPIb) to collagen. This binding helps mediate the activation of platelets and formation of primary hemostasis.

Bernard-Soulier syndrome is a defect or deficiency in GPIb. GPIb, the receptor for vWF, can be defective and lead to lack of primary clot formation (primary hemostasis) and increased bleeding tendency. This is an autosomal recessive inherited disorder.

Thrombasthenia of Glanzmann and Naegeli (Glanzmann thrombasthenia) is extremely rare. It is characterized by a defect in the GPIIb/IIIa fibrinogen receptor complex. When the GPIIb/IIIa receptor is dysfunctional, fibrinogen cannot cross-link platelets, which inhibits primary hemostasis. This is an autosomal recessive inherited disorder.

In liver failure (acute and chronic forms), there is insufficient production of coagulation factors by the liver; this may increase bleeding risk. Deficiency of vitamin K may also contribute to bleeding disorders because clotting factor maturation depends on vitamin K.

Thrombosis is the pathological development of blood clots. These clots may break free and become mobile, forming an embolus, or grow to such a size that they occlude the vessel in which they developed. An embolism is said to occur when the thrombus (blood clot) becomes a mobile embolus and migrates to another part of the body, interfering with blood circulation and hence impairing organ function downstream of the occlusion. This causes ischemia and often leads to ischemic necrosis of tissue. Most cases of venous thrombosis are due to acquired states (older age, surgery, cancer, immobility, antiphospholipid syndrome) or inherited thrombophilias (e.g., factor V Leiden and various other genetic deficiencies or variants).

Mutations in factor XII have been associated with an asymptomatic prolongation in the clotting time and possibly a tendency toward thrombophlebitis. Other mutations have been linked with a rare form of hereditary angioedema (type III).


Pharmacology
Procoagulants
Adsorbent chemicals, such as zeolites, and other hemostatic agents are used for sealing severe injuries quickly (such as in traumatic bleeding secondary to gunshot wounds). Thrombin and fibrin glue are used surgically to treat bleeding and to thrombose aneurysms. Desmopressin is used to improve platelet function by activating arginine vasopressin receptor 1A. Coagulation factor concentrates are used to treat hemophilia, to reverse the effects of anticoagulants, and to treat bleeding in patients with impaired coagulation factor synthesis or increased consumption. Prothrombin complex concentrate, cryoprecipitate and fresh frozen plasma are commonly used coagulation factor products. Recombinant activated human factor VII is increasingly popular in the treatment of major bleeding. Tranexamic acid and aminocaproic acid inhibit fibrinolysis, and lead to a de facto reduced bleeding rate. Before its withdrawal, aprotinin was used in some forms of major surgery to decrease bleeding risk and need for blood products.

Anticoagulants
Anticoagulants and anti-platelet agents are amongst the most commonly used medications. Anti-platelet agents include aspirin, dipyridamole, ticlopidine, clopidogrel and prasugrel; the parenteral glycoprotein IIb/IIIa inhibitors are used during angioplasty. Of the anticoagulants, warfarin (and related coumarins) and heparin are the most commonly used. Warfarin affects the vitamin K-dependent clotting factors (II, VII, IX, X), whereas heparin and related compounds increase the action of antithrombin on thrombin and factor Xa. A newer class of drugs, the direct thrombin inhibitors, is under development; some members are already in clinical use (such as lepirudin). Also under development are other small molecular compounds that interfere directly with the enzymatic action of particular coagulation factors (e.g., rivaroxaban, dabigatran, apixaban).[11]

Coagulation factors

Coagulation factors and related substances

Number and/or name | Function | Associated genetic disorders
I (fibrinogen) | Forms clot (fibrin) | Congenital afibrinogenemia, Familial renal amyloidosis
II (prothrombin) | Its active form (IIa) activates I, V, VII, VIII, XI, XIII, protein C, platelets | Prothrombin G20210A, Thrombophilia
III Tissue factor | Co-factor of VIIa (formerly known as factor III) |
IV Calcium | Required for coagulation factors to bind to phospholipid (formerly known as factor IV) |
V (proaccelerin, labile factor) | Co-factor of X with which it forms the prothrombinase complex | Activated protein C resistance
VI | Unassigned; old name of Factor Va |
VII (stable factor, proconvertin) | Activates IX, X | Congenital proconvertin/factor VII deficiency
VIII (Antihemophilic factor A) | Co-factor of IX with which it forms the tenase complex | Haemophilia A
IX (Antihemophilic factor B or Christmas factor) | Activates X: forms tenase complex with factor VIII | Haemophilia B
X (Stuart-Prower factor) | Activates II: forms prothrombinase complex with factor V | Congenital Factor X deficiency
XI (plasma thromboplastin antecedent) | Activates IX | Haemophilia C
XII (Hageman factor) | Activates factor XI, VII and prekallikrein | Hereditary angioedema type III
XIII (fibrin-stabilizing factor) | Crosslinks fibrin | Congenital Factor XIIIa/b deficiency
von Willebrand factor | Binds to VIII, mediates platelet adhesion | von Willebrand disease
prekallikrein (Fletcher factor) | Activates XII and prekallikrein; cleaves HMWK | Prekallikrein/Fletcher Factor deficiency
high-molecular-weight kininogen (HMWK) (Fitzgerald factor) | Supports reciprocal activation of XII, XI, and prekallikrein | Kininogen deficiency
fibronectin | Mediates cell adhesion | Glomerulopathy with fibronectin deposits
antithrombin III | Inhibits IIa, Xa, and other proteases | Antithrombin III deficiency
heparin cofactor II | Inhibits IIa, cofactor for heparin and dermatan sulfate ("minor antithrombin") | Heparin cofactor II deficiency
protein C | Inactivates Va and VIIIa | Protein C deficiency
protein S | Cofactor for activated protein C (APC, inactive when bound to C4b-binding protein) | Protein S deficiency
protein Z | Mediates thrombin adhesion to phospholipids and stimulates degradation of factor X by ZPI | Protein Z deficiency
Protein Z-related protease inhibitor (ZPI) | Degrades factors X (in presence of protein Z) and XI (independently) |
plasminogen | Converts to plasmin, lyses fibrin and other proteins | Plasminogen deficiency, type I (ligneous conjunctivitis)
alpha 2-antiplasmin | Inhibits plasmin | Antiplasmin deficiency
tissue plasminogen activator (tPA) | Activates plasminogen | Familial hyperfibrinolysis and thrombophilia
urokinase | Activates plasminogen | Quebec platelet disorder
plasminogen activator inhibitor-1 (PAI1) | Inactivates tPA & urokinase (endothelial PAI) | Plasminogen activator inhibitor-1 deficiency
plasminogen activator inhibitor-2 (PAI2) | Inactivates tPA & urokinase (placental PAI) |
cancer procoagulant | Pathological factor X activator linked to thrombosis in cancer |

History
Initial discoveries
Theories on the coagulation of blood have existed since antiquity. Physiologist Johannes Müller (1801–1858) described fibrin, the substance of a thrombus. Its soluble precursor, fibrinogen, was thus named by Rudolf Virchow (1821–1902), and isolated chemically by Prosper Sylvain Denis (1799–1863). Alexander Schmidt suggested that the conversion from fibrinogen to fibrin is the result of an enzymatic process, and labeled the hypothetical enzyme "thrombin" and its precursor "prothrombin".[12][13] Arthus discovered in 1890 that calcium was essential in coagulation.[14][15] Platelets were identified in 1865, and their function was elucidated by Giulio Bizzozero in 1882.[16] The theory that thrombin is generated by the presence of tissue factor was consolidated by Paul Morawitz in 1905.[17] At this stage, it was known that thrombokinase/thromboplastin (factor III) is released by damaged tissues, reacting with prothrombin (II), which, together with calcium (IV), forms thrombin, which converts fibrinogen into fibrin (I).[18]

Coagulation factors
The remainder of the biochemical factors in the process of coagulation were largely discovered in the 20th century. A first clue as to the actual complexity of the system of coagulation was the discovery of proaccelerin (later called Factor V) by Paul Owren (1905–1990) in 1947. He also postulated its function to be the generation of accelerin (Factor VI), which later turned out to be the activated form of V (or Va); hence, VI is not now in active use.[18]

Factor VII (also known as serum prothrombin conversion accelerator or proconvertin, precipitated by barium sulfate) was discovered in a young female patient in 1949 and 1951 by different groups. Factor VIII turned out to be deficient in the clinically recognised but etiologically elusive hemophilia A; it was identified in the 1950s and is alternatively called antihemophilic globulin due to its capability to correct hemophilia A.[18]

Factor IX was discovered in 1952 in a young patient with hemophilia B named Stephen Christmas (1947–1993). His deficiency was described by Dr. Rosemary Biggs and Professor R.G. MacFarlane in Oxford, UK. The factor is, hence, called Christmas Factor. Christmas lived in Canada, and campaigned for blood transfusion safety until succumbing to transfusion-related AIDS at age 46. An alternative name for the factor is plasma thromboplastin component, given by an independent group in California.[18]

Hageman factor, now known as factor XII, was identified in 1955 in an asymptomatic patient with a prolonged bleeding time named John Hageman. Factor X, or Stuart-Prower factor, followed in 1956. This protein was identified in a Ms. Audrey Prower of London, who had a lifelong bleeding tendency. In 1957, an American group identified the same factor in a Mr. Rufus Stuart. Factors XI and XIII were identified in 1953 and 1961, respectively.[18]

The view that the coagulation process is a "cascade" or "waterfall" was enunciated almost simultaneously by MacFarlane[19] in the UK and by Davie and Ratnoff[20] in the USA, respectively.


Nomenclature
The usage of Roman numerals rather than eponyms or systematic names was agreed upon during annual conferences (starting in 1955) of hemostasis experts. In 1962, consensus was achieved on the numbering of factors I-XII.[21] This committee evolved into the present-day International Committee on Thrombosis and Hemostasis (ICTH). Assignment of numerals ceased in 1963 after the naming of Factor XIII. The names Fletcher Factor and Fitzgerald Factor were given to further coagulation-related proteins, namely prekallikrein and high-molecular-weight kininogen, respectively.[18] Factors III and VI are unassigned, as thromboplastin was never identified, and actually turned out to consist of ten further factors, and accelerin was found to be activated Factor V.
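The naming convention (an "F" prefix, a Roman numeral, and a lowercase "a" for the active form, as used throughout this article) is regular enough to parse mechanically. The following Python sketch is one illustrative way to do so; `parse_factor` and `roman_to_int` are hypothetical helper names, and the pattern performs no check that the numeral is well formed or that the factor is actually assigned.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral):
    """Convert a Roman numeral made of I, V and X to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        nxt = ROMAN_VALUES.get(numeral[i + 1]) if i + 1 < len(numeral) else None
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        if nxt is not None and nxt > value:
            total -= value
        else:
            total += value
    return total


def parse_factor(name):
    """Split a label such as 'FVIIa' into (factor number, is_active)."""
    match = re.fullmatch(r"F([IVX]+)(a?)", name)
    if match is None:
        raise ValueError(f"not a coagulation factor label: {name!r}")
    return roman_to_int(match.group(1)), match.group(2) == "a"


if __name__ == "__main__":
    for label in ("FVIIa", "FIX", "FXIII"):
        print(label, parse_factor(label))
```

For example, "FVIIa" parses to factor 7 in its active form, while "FIX" is factor 9 in its zymogen form.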

Other species
All mammals have an extremely closely related blood coagulation process, using a combined cellular and serine protease process. In fact, it is possible for any mammalian coagulation factor to "cleave" its equivalent target in any other mammal. The only nonmammalian animal known to use serine proteases for blood coagulation is the horseshoe crab.[22]

References
[1] David Lillicrap; Nigel Key; Michael Makris; Denise O'Shaughnessy (2009). Practical Hemostasis and Thrombosis. Wiley-Blackwell. pp. 1–5. ISBN 1-4051-8460-4.
[2] Alan D. Michelson (26 October 2006). Platelets (http://books.google.com/books?id=GnIQGmiSylkC&pg=PA3). Academic Press. pp. 3–5. ISBN 978-0-12-369367-9. Retrieved 18 October 2012.
[3] Furie B, Furie BC (2005). "Thrombus formation in vivo" (http://www.jci.org/cgi/content/full/115/12/3355). J. Clin. Invest. 115 (12): 3355–62. doi:10.1172/JCI26987. PMC 1297262. PMID 16322780.
[4] Pallister CJ and Watson MS (2010). Haematology. Scion Publishing. pp. 334–336. ISBN 1-904842-39-9.
[5] Pallister CJ and Watson MS (2010). Haematology. Scion Publishing. pp. 336–347. ISBN 1-904842-39-9.
[6] Hoffbrand, A. V. (2002). Essential haematology. Oxford: Blackwell Science. pp. 241–243. ISBN 0-632-05153-1.
[7] Hoffbrand, A. V. (2002). Essential haematology. Oxford: Blackwell Science. pp. 243–245. ISBN 0-632-05153-1.
[8] Immunology Chapter One: Innate or non-specific immunity (http://pathmicro.med.sc.edu/ghaffar/innate.htm). Gene Mayer, Ph.D. Immunology Section of Microbiology and Immunology On-line. University of South Carolina.
[9] David Lillicrap; Nigel Key; Michael Makris; Denise O'Shaughnessy (2009). Practical Hemostasis and Thrombosis. Wiley-Blackwell. pp. 7–16. ISBN 1-4051-8460-4.
[10] Hatton, Chris (2008). Haematology (Lecture Notes). Cambridge, MA: Blackwell Publishers. pp. 145–166. ISBN 1-4051-8050-1.
[11] Soff GA (March 2012). "A new generation of oral direct anticoagulants". Arteriosclerosis, Thrombosis, and Vascular Biology 32 (3): 569–74. doi:10.1161/ATVBAHA.111.242834. PMID 22345595.
[12] Schmidt A (1872). "Neue Untersuchungen ueber die Fasserstoffesgerinnung". Pflüger's Archiv für die gesamte Physiologie 6: 413–538. doi:10.1007/BF01612263.
[13] Schmidt A. Zur Blutlehre. Leipzig: Vogel, 1892.
[14] Arthus M, Pagès C (1890). "Nouvelle theorie chimique de la coagulation du sang". Arch Physiol Norm Pathol 5: 739–46.
[15] Shapiro SS (2003). "Treating thrombosis in the 21st century". N. Engl. J. Med. 349 (18): 1762–4. doi:10.1056/NEJMe038152. PMID 14585945.
[16] Brewer DB (2006). "Max Schultze (1865), G. Bizzozero (1882) and the discovery of the platelet". Br. J. Haematol. 133 (3): 251–8. doi:10.1111/j.1365-2141.2006.06036.x. PMID 16643426.
[17] Morawitz P (1905). "Die Chemie der Blutgerinnung". Ergebn Physiol 4: 307–422.
[18] Giangrande PL (2003). "Six characters in search of an author: the history of the nomenclature of coagulation factors". Br. J. Haematol. 121 (5): 703–12. doi:10.1046/j.1365-2141.2003.04333.x. PMID 12780784.
[19] MacFarlane RG (1964). "An enzyme cascade in the blood clotting mechanism, and its function as a biochemical amplifier". Nature 202 (4931): 498–9. doi:10.1038/202498a0. PMID 14167839.
[20] Davie EW, Ratnoff OD (1964). "Waterfall sequence for intrinsic blood clotting". Science 145 (3638): 1310–2. doi:10.1126/science.145.3638.1310. PMID 14173416.
[21] Wright IS (1962). "The Nomenclature of Blood Clotting Factors". Can Med Assoc J 86 (8): 373–4. PMC 1848865. PMID 14008442.
[22] Osaki T, Kawabata S (June 2004). "Structure and function of coagulogen, a clottable protein in horseshoe crabs". Cellular and Molecular Life Sciences: CMLS 61 (11): 1257–65. doi:10.1007/s00018-004-3396-5. PMID 15170505.


External links
3D structures
UMich Orientation of Proteins in Membranes families/superfamily-97 (http://opm.phar.umich.edu/families.php?superfamily=97): calculated orientations of complexes with GLA domains in membrane
UMich Orientation of Proteins in Membranes families/superfamily-48 (http://opm.phar.umich.edu/families.php?superfamily=48): discoidin domains of blood coagulation factors

Protease
A protease (also termed peptidase or proteinase) is any enzyme that conducts proteolysis, that is, begins protein catabolism by hydrolysis of the peptide bonds that link amino acids together in the polypeptide chain forming the protein.

Classification
Standard
Proteases are currently classified into six broad groups:
Serine proteases
Threonine proteases
Cysteine proteases
Aspartate proteases
Metalloproteases
Glutamic acid proteases

The threonine and glutamic-acid proteases were not described until 1995 and 2004, respectively. The mechanism used to cleave a peptide bond involves making either an amino acid residue (serine, cysteine, and threonine proteases) or a water molecule (aspartic acid, metallo- and glutamic acid proteases) nucleophilic so that it can attack the peptide carbonyl group. One way to make a nucleophile is by a catalytic triad, where a histidine residue is used to activate serine, cysteine, or threonine as a nucleophile.

Within each of the broad groups, proteases have been classified by Rawlings and Barrett into families of related proteases. For example, within the serine proteases, families are labelled Sx, where S denotes the serine catalytic type and x denotes the number of the family, for example S1 (chymotrypsins). An up-to-date classification of proteases into families is found in the MEROPS database.[1][2]

By optimal pH
Alternatively, proteases may be classified by the optimal pH in which they are active:
Acid proteases
Neutral proteases, involved in type 1 hypersensitivity. Here, they are released by mast cells and cause activation of complement and kinins.[3] This group includes the calpains.
Basic proteases (or alkaline proteases)


Occurrence
Proteases occur naturally in all organisms. These enzymes are involved in a multitude of physiological reactions, from simple digestion of food proteins to highly regulated cascades (e.g., the blood-clotting cascade, the complement system, apoptosis pathways, and the invertebrate prophenoloxidase-activating cascade). Proteases can either break specific peptide bonds (limited proteolysis), depending on the amino acid sequence of a protein, or break down a complete peptide to amino acids (unlimited proteolysis). The activity can be a destructive change, abolishing a protein's function or digesting it to its principal components; it can be an activation of a function; or it can be a signal in a signaling pathway.

Bacteria also secrete proteases to hydrolyse (digest) the peptide bonds in proteins and therefore break the proteins down into their constituent monomers. Bacterial and fungal proteases are particularly important to the global carbon and nitrogen cycles in the recycling of proteins, and such activity tends to be regulated by nutritional signals in these organisms.[4] The net impact of nutritional regulation of protease activity among the thousands of species present in soil can be observed at the overall microbial community level as proteins are broken down in response to carbon, nitrogen, or sulfur limitation.[5]

A secreted bacterial protease may also act as an exotoxin, and be an example of a virulence factor in bacterial pathogenesis. Bacterial exotoxic proteases destroy extracellular structures. Protease enzymes are also used extensively in the bread industry in bread improver.

Proteases, also known as proteinases or proteolytic enzymes, are a large group of enzymes. They belong to the class of enzymes known as hydrolases, which catalyse the hydrolysis of various bonds with the participation of a water molecule. Proteases digest long protein chains into short fragments by splitting the peptide bonds that link amino acid residues. Some of them detach the terminal amino acids from the protein chain (exopeptidases, such as aminopeptidases and carboxypeptidase A); the others attack internal peptide bonds of a protein (endopeptidases, such as trypsin, chymotrypsin, pepsin, papain, and elastase).

By the character of their catalytic active site and conditions of action, proteases have traditionally been divided into four major groups: serine proteinases, cysteine (thiol) proteinases, aspartic proteinases, and metalloproteinases. Assignment of a protease to a group depends on the structure of the catalytic site and the amino acid (as one of the constituents) essential for its activity.

Proteases are used throughout an organism for various metabolic processes. Acid proteases secreted into the stomach (such as pepsin) and serine proteases present in the duodenum (trypsin and chymotrypsin) enable us to digest the protein in food; proteases present in blood serum (thrombin, plasmin, Hageman factor, etc.) play an important role in blood clotting, as well as in lysis of the clots and the correct action of the immune system. Other proteases are present in leukocytes (elastase, cathepsin G) and play several different roles in metabolic control. Proteases determine the lifetime of other proteins that play important physiological roles, such as hormones, antibodies, or other enzymes; this is one of the fastest "switching on" and "switching off" regulatory mechanisms in the physiology of an organism. By complex cooperative action, proteases may proceed as cascade reactions, which result in rapid and efficient amplification of an organism's response to a physiological signal.

Proteases are part of many laundry detergents.


Inhibitors
The activity of proteases is inhibited by protease inhibitors. One example of protease inhibitors is the serpin superfamily, which includes alpha 1-antitrypsin, C1-inhibitor, antithrombin, alpha 1-antichymotrypsin, plasminogen activator inhibitor-1, and neuroserpin. Natural protease inhibitors include the family of lipocalin proteins, which play a role in cell regulation and differentiation. Lipophilic ligands, attached to lipocalin proteins, have been found to possess tumor protease inhibiting properties. The natural protease inhibitors are not to be confused with the protease inhibitors used in antiretroviral therapy. Some viruses, HIV among them, depend on proteases in their reproductive cycle. Thus, protease inhibitors are developed as antiviral agents.

Degradation
Proteases, being themselves proteins, are known to be cleaved by other protease molecules, sometimes of the same variety. This may be an important method of regulation of protease activity.

Protease research
The field of protease research is enormous. Barrett and Rawlings estimated that approximately 8001 papers related to this field are published each year. For a look at current activities and interests of protease researchers, see the International Proteolysis Society [6] web page.

Notes
[1] Rawlings ND, Barrett AJ, Bateman A (January 2010). "MEROPS: the peptidase database". Nucleic Acids Res. 38 (Database issue): D227–33. doi:10.1093/nar/gkp971. PMC 2808883. PMID 19892822.
[2] MEROPS (http://merops.sanger.ac.uk/)
[3] Mitchell, Richard Sheppard; Kumar, Vinay; Abbas, Abul K.; Fausto, Nelson (2007). Robbins Basic Pathology. Philadelphia: Saunders. pp. 122. ISBN 1-4160-2973-7. 8th edition.
[4] Sims, G.K. 2006. Nitrogen Starvation Promotes Biodegradation of N-Heterocyclic Compounds in Soil. Soil Biology & Biochemistry 38:2478-2480.
[5] Sims, G. K., and M. M. Wander. 2002. Proteolytic activity under nitrogen or sulfur limitation. Appl. Soil Ecol. 568:1-5.
[6] http://www.protease.org

References
Barrett AJ, Rawlings ND, Woessner JF. The Handbook of Proteolytic Enzymes, 2nd ed. Academic Press, 2003. ISBN 0-12-079610-4.
Hedstrom L. Serine Protease Mechanism and Specificity. Chem Rev 2002;102:4501-4523.
Southan C. A genomic perspective on human proteases as drug targets. Drug Discov Today 2001;6:681-688.
Hooper NM. Proteases in Biology and Medicine. London: Portland Press, 2002. ISBN 1-85578-147-6.
Puente XS, Sanchez LM, Overall CM, Lopez-Otin C. Human and Mouse Proteases: a Comparative Genomic Approach. Nat Rev Genet 2003;4:544-558.
Ross J, Jiang H, Kanost MR, Wang Y. Serine proteases and their homologs in the Drosophila melanogaster genome: an initial analysis of sequence conservation and phylogenetic relationships. Gene 2003;304:117-31.
Puente XS, Lopez-Otin C. A Genomic Analysis of Rat Proteases and Protease Inhibitors. Genome Biol 2004;14:609-622.
Lucía Feijoo-Siota, Tomás G. Villa. Native and Biotechnologically Engineered Plant Proteases with Industrial Applications. Food and Bioprocess Technology 2010.


External links
International Proteolysis Society (http://www.protease.org/)
Merops - the peptidase database (http://merops.sanger.ac.uk/)
List of protease inhibitors (http://www.sciencegateway.org/resources/protease.htm)
Protease cutting predictor (http://www.expasy.org/tools/peptidecutter/)
List of proteases and their specificities (http://www.expasy.org/tools/peptidecutter/peptidecutter_enzymes.html) (see also (http://www.expasy.org/cgi-bin/lists?peptidas.txt))
Proteolysis MAP from Center for Proteolytic Pathways (http://www.proteolysis.org/)
Proteolysis Cut Site database - curated expert annotation from users (http://cutdb.burnham.org/)
Protease cut sites graphical interface (http://substrate.burnham.org/)
TopFIND protease database covering cut sites, substrates and protein termini (http://clipserve.clip.ubc.ca/topfind)
Proteases (http://www.nlm.nih.gov/cgi/mesh/2011/MB_cgi?mode=&term=Proteases) at the US National Library of Medicine Medical Subject Headings (MeSH)

Heat equation
The heat equation is a parabolic partial differential equation which describes the distribution of heat (or variation in temperature) in a given region over time.

Statement of the equation


For a function u(x,y,z,t) of three spatial variables (x,y,z) (see cartesian coordinates) and the time variable t, the heat equation is

$$\frac{\partial u}{\partial t} - \alpha\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) = 0.$$

More generally in any coordinate system:

$$\frac{\partial u}{\partial t} - \alpha \nabla^2 u = 0,$$

where α is a positive constant, and Δ or ∇² denotes the Laplace operator. In the physical problem of temperature variation, u(x,y,z,t) is the temperature and α is the thermal diffusivity. For the mathematical treatment it is sufficient to consider the case α = 1.

Figure: In this example, the heat equation in two dimensions predicts that if one area of an otherwise cool metal plate has been heated, say with a torch, over time the temperature of that area will gradually decrease, starting at the edge and moving inward. Meanwhile the part of the plate outside that region will be getting warmer. Eventually the entire plate will reach a uniform intermediate temperature. In this animation, both height and color are used to show temperature.

The heat equation is of fundamental importance in diverse scientific fields. In mathematics, it is the prototypical parabolic partial differential equation. In probability theory, the heat equation is connected with the study of Brownian motion via the Fokker–Planck equation. In financial mathematics it is used to solve the Black–Scholes partial differential equation. The diffusion equation, a more general version of the heat equation, arises in connection with the study of chemical diffusion and other related processes.


General description
Suppose one has a function u which describes the temperature at a given location (x, y, z). This function will change over time as heat spreads throughout space. The heat equation is used to determine the change in the function u over time. The image to the right is animated and describes the way heat changes in time along a metal bar. One of the interesting properties of the heat equation is the maximum principle which says that the maximum value of u is either earlier in time than the region of concern or on the edge of the region of concern. This is essentially saying that temperature comes either from some source or from earlier in time because heat permeates but is not created from nothingness. This is a property of parabolic partial differential equations and is not difficult to prove mathematically (see below).
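The maximum principle described above can be observed numerically. The following is a minimal sketch, not part of the original article: it advances the one-dimensional equation u_t = α u_xx with an explicit forward-time, centred-space (FTCS) finite-difference scheme, a standard textbook method chosen here for illustration. The function name `ftcs_heat_1d`, the grid, and the initial hot spot are all assumptions made for the example.

```python
import numpy as np

def ftcs_heat_1d(u0, alpha, dx, dt, steps):
    """Advance u_t = alpha * u_xx with fixed end values (explicit FTCS scheme)."""
    r = alpha * dt / dx**2
    if r > 0.5:
        # Classical stability limit of the explicit scheme.
        raise ValueError("FTCS is unstable for alpha*dt/dx**2 > 0.5")
    u = np.array(u0, dtype=float)
    for _ in range(steps):
        # Interior update; the two end values stay fixed (Dirichlet boundary).
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return u

# Hypothetical example: a cold bar (0) with a hot segment (100) in the middle.
x = np.linspace(0.0, 1.0, 51)
u0 = np.zeros_like(x)
u0[20:31] = 100.0
u = ftcs_heat_1d(u0, alpha=1.0, dx=x[1] - x[0], dt=1e-4, steps=200)

# The temperature never exceeds its initial maximum or drops below its
# initial minimum, and the peak has decayed.
print(u0.max(), u.max(), u.min())
```

With r ≤ 1/2 each new value is a weighted average of neighbouring old values, which is the discrete counterpart of the statement that heat "is not created from nothingness".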

Solution of a 1D heat equation PDE. The temperature (u) is initially distributed over a one-dimensional, one-unit-long interval (x=[0,1]) with insulated endpoints. The distribution approaches equilibrium over time.

Another interesting property is that even if u has a discontinuity at an initial time t = t0, the temperature becomes smooth as soon as t > t0. For example, if a bar of metal has temperature 0 and another has temperature 100 and they are stuck together end to end, then very quickly the temperature at the point of connection will become 50 and the graph of the temperature will run smoothly from 0 to 100.

The heat equation is used in probability and describes random walks. It is also applied in financial mathematics for this reason. It is also important in Riemannian geometry and thus topology: it was adapted by Richard Hamilton when he defined the Ricci flow that was later used by Grigori Perelman to solve the topological Poincaré conjecture.

The physical problem and the equation


Derivation in one dimension
The heat equation is derived from Fourier's law and conservation of energy (Cannon 1984). By Fourier's law, the flow rate of heat energy through a surface is proportional to the negative temperature gradient across the surface,

$$q = -k \nabla u,$$

where k is the thermal conductivity and u is the temperature. In one dimension, the gradient is an ordinary spatial derivative, and so Fourier's law is

$$q = -k u_x,$$

where $u_x = \partial u / \partial x$.

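In the absence of work done, a change in internal energy per unit volume in the material, ΔQ, is proportional to the change in temperature, Δu. (In this section only, Δ is the ordinary difference operator, not the Laplacian.) That is,

$$\Delta Q = c_p \rho \, \Delta u,$$

where c_p is the specific heat capacity and ρ is the mass density of the material. Choosing zero energy at absolute zero temperature, this can be rewritten as

$$Q = c_p \rho \, u.$$

The increase in internal energy in a small spatial region of the material

$$x - \Delta x \le \xi \le x + \Delta x$$

over the time period

$$t - \Delta t \le \tau \le t + \Delta t$$

is given by[1]

$$c_p \rho \int_{x-\Delta x}^{x+\Delta x} \left[u(\xi, t+\Delta t) - u(\xi, t-\Delta t)\right] d\xi = c_p \rho \int_{t-\Delta t}^{t+\Delta t} \int_{x-\Delta x}^{x+\Delta x} \frac{\partial u}{\partial \tau} \, d\xi \, d\tau,$$

where the fundamental theorem of calculus was used. If no work is done and there are neither heat sources nor sinks, the change in internal energy in the interval [x−Δx, x+Δx] is accounted for entirely by the flux of heat across the boundaries. By Fourier's law, this is

$$k \int_{t-\Delta t}^{t+\Delta t} \left[\frac{\partial u}{\partial x}(x+\Delta x, \tau) - \frac{\partial u}{\partial x}(x-\Delta x, \tau)\right] d\tau = k \int_{t-\Delta t}^{t+\Delta t} \int_{x-\Delta x}^{x+\Delta x} \frac{\partial^2 u}{\partial \xi^2} \, d\xi \, d\tau,$$

again by the fundamental theorem of calculus.[2] By conservation of energy,

$$\int_{t-\Delta t}^{t+\Delta t} \int_{x-\Delta x}^{x+\Delta x} \left[c_p \rho \, u_\tau - k \, u_{\xi\xi}\right] d\xi \, d\tau = 0.$$

This is true for any rectangle [t−Δt, t+Δt] × [x−Δx, x+Δx]. By the fundamental lemma of the calculus of variations, the integrand must vanish identically:

$$c_p \rho \, u_t - k \, u_{xx} = 0.$$

Which can be rewritten as:

$$u_t = \frac{k}{c_p \rho} u_{xx},$$

or:

$$\frac{\partial u}{\partial t} = \frac{k}{c_p \rho} \frac{\partial^2 u}{\partial x^2},$$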

which is the heat equation, where the coefficient (often denoted α)

$$\alpha = \frac{k}{c_p \rho}$$

is called the thermal diffusivity.

An additional term may be introduced into the equation to account for radiative loss of heat, which depends upon the excess temperature $u = T - T_s$ at a given point compared with the surroundings. At low excess temperatures, the radiative loss is approximately $\mu u$, giving a one-dimensional heat-transfer equation of the form

$$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2} - \mu u.$$

At high excess temperatures, however, the Stefan–Boltzmann law gives a net radiative heat-loss proportional to $T^4 - T_s^4$, and the above equation is inaccurate. For large excess temperatures, $T^4 - T_s^4 \approx u^4$, giving a high-temperature heat-transfer equation of the form

$$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2} - m u^4,$$

where $m = \epsilon \sigma p / (\rho A c_p)$. Here, σ is Stefan's constant, ε is a characteristic constant of the material, p is the sectional perimeter of the bar and A is its cross-sectional area. However, using T instead of u gives a better approximation in this case.


Three-dimensional problem
In the special case of heat propagation in an isotropic and homogeneous medium in 3-dimensional space, this equation is

$$\frac{\partial u}{\partial t} = \alpha \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) = \alpha \, (u_{xx} + u_{yy} + u_{zz}),$$

where:
u = u(x, y, z, t) is temperature as a function of space and time;
∂u/∂t is the rate of change of temperature at a point over time;
u_xx, u_yy, and u_zz are the second spatial derivatives (thermal conductions) of temperature in the x, y, and z directions, respectively;
α = k/(c_p ρ) is the thermal diffusivity, a material-specific quantity depending on the thermal conductivity k, the mass density ρ, and the specific heat capacity c_p.

The heat equation is a consequence of Fourier's law of heat conduction (see heat conduction). If the medium is not the whole space, in order to solve the heat equation uniquely we also need to specify boundary conditions for u. To determine uniqueness of solutions in the whole space it is necessary to assume an exponential bound on the growth of solutions; this assumption is consistent with observed experiments.

Solutions of the heat equation are characterized by a gradual smoothing of the initial temperature distribution by the flow of heat from warmer to colder areas of an object. Generally, many different states and starting conditions will tend toward the same stable equilibrium. As a consequence, to reverse the solution and conclude something about earlier times or initial conditions from the present heat distribution is very inaccurate except over the shortest of time periods.

The heat equation is the prototypical example of a parabolic partial differential equation. Using the Laplace operator, the heat equation can be simplified, and generalized to similar equations over spaces of arbitrary number of dimensions, as

$$u_t = \alpha \nabla^2 u = \alpha \Delta u,$$

where the Laplace operator, Δ or ∇², the divergence of the gradient, is taken in the spatial variables.

The heat equation governs heat diffusion, as well as other diffusive processes, such as particle diffusion or the propagation of action potential in nerve cells. Although they are not diffusive in nature, some quantum mechanics problems are also governed by a mathematical analog of the heat equation (see below). It also can be used to model some phenomena arising in finance, like the Black–Scholes or the Ornstein–Uhlenbeck processes. The equation, and various non-linear analogues, has also been used in image analysis.

The heat equation is, technically, in violation of special relativity, because its solutions involve instantaneous propagation of a disturbance. The part of the disturbance outside the forward light cone can usually be safely neglected, but if it is necessary to develop a reasonable speed for the transmission of heat, a hyperbolic problem should be considered instead, such as a partial differential equation involving a second-order time derivative. Some models of nonlinear heat conduction (which are also parabolic equations) have solutions with finite heat transmission speed.[3][4]


Internal heat generation


The function u above represents temperature of a body. Alternatively, it is sometimes convenient to change units and represent u as the heat density of a medium. Since heat density is proportional to temperature in a homogeneous medium, the heat equation is still obeyed in the new units. Suppose that a body obeys the heat equation and, in addition, generates its own heat per unit volume (e.g., in watts per litre, W/L) at a rate given by a known function q varying in space and time.[5] Then the heat per unit volume u satisfies an equation

$$\frac{1}{\alpha} \frac{\partial u}{\partial t} = \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + \frac{1}{k} q.$$

For example, a tungsten light bulb filament generates heat, so it would have a positive nonzero value for q when turned on. While the light is turned off, the value of q for the tungsten filament would be zero.

Solving the heat equation using Fourier series


The following solution technique for the heat equation was proposed by Joseph Fourier in his treatise Théorie analytique de la chaleur, published in 1822. Let us consider the heat equation for one space variable. This could be used to model heat conduction in a rod. The equation is

$$u_t = \alpha u_{xx} \qquad (1)$$

where u = u(x, t) is a function of two variables x and t. Here x is the space variable, so x ∈ [0, L], where L is the length of the rod. t is the time variable, so t ≥ 0. We assume the initial condition

$$u(x, 0) = f(x) \quad \forall x \in [0, L] \qquad (2)$$

where the function f is given, and the boundary conditions

$$u(0, t) = 0 = u(L, t) \quad \forall t > 0. \qquad (3)$$

Figure: Idealized physical setting for heat conduction in a rod with homogeneous boundary conditions.

Let us attempt to find a solution of (1) which is not identically zero satisfying the boundary conditions (3) but with the following property: u is a product in which the dependence of u on x, t is separated, that is:

$$u(x, t) = X(x) \, T(t). \qquad (4)$$

This solution technique is called separation of variables. Substituting u back into equation (1),

$$\frac{T'(t)}{\alpha T(t)} = \frac{X''(x)}{X(x)}.$$

Since the right hand side depends only on x and the left hand side only on t, both sides are equal to some constant value −λ. Thus:

$$T'(t) = -\lambda \alpha T(t) \qquad (5)$$

and

$$X''(x) = -\lambda X(x). \qquad (6)$$

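We will now show that nontrivial solutions for (6) for values of λ ≤ 0 cannot occur:

1. Suppose that λ < 0. Then there exist real numbers B, C such that

$$X(x) = B e^{\sqrt{-\lambda}\, x} + C e^{-\sqrt{-\lambda}\, x}.$$

From (3) we get

$$X(0) = 0 = X(L)$$

and therefore B = 0 = C, which implies u is identically 0.

2. Suppose that λ = 0. Then there exist real numbers B, C such that

$$X(x) = Bx + C.$$

From equation (3) we conclude in the same manner as in 1 that u is identically 0.

3. Therefore, it must be the case that λ > 0. Then there exist real numbers A, B, C such that

$$T(t) = A e^{-\lambda \alpha t}$$

and

$$X(x) = B \sin(\sqrt{\lambda}\, x) + C \cos(\sqrt{\lambda}\, x).$$

From (3) we get C = 0 and that for some positive integer n,

$$\sqrt{\lambda} = n \frac{\pi}{L}.$$

This solves the heat equation in the special case that the dependence of u has the special form (4). In general, the sum of solutions to (1) which satisfy the boundary conditions (3) also satisfies (1) and (3). We can show that the solution to (1), (2) and (3) is given by

$$u(x, t) = \sum_{n=1}^{\infty} D_n \sin\left(\frac{n \pi x}{L}\right) e^{-\frac{n^2 \pi^2 \alpha t}{L^2}}$$

where

$$D_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n \pi x}{L}\right) dx.$$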
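As an illustration, the series solution of (1), (2) and (3) can be evaluated numerically by truncating the sum and computing each sine coefficient D_n = (2/L) ∫ f(x) sin(nπx/L) dx with the trapezoidal rule. The sketch below is an assumption-laden example rather than part of the original article: the function name `fourier_heat_solution`, the truncation level `n_terms`, the quadrature resolution `n_quad`, and the test data f(x) = sin(πx/L) are all hypothetical choices.

```python
import numpy as np

def fourier_heat_solution(f, L, alpha, x, t, n_terms=50, n_quad=2001):
    """Truncated series u(x,t) = sum_n D_n sin(n pi x / L) exp(-(n pi / L)^2 alpha t)."""
    xi = np.linspace(0.0, L, n_quad)        # quadrature nodes on [0, L]
    h = xi[1] - xi[0]
    fx = f(xi)
    x = np.asarray(x, dtype=float)
    u = np.zeros_like(x)
    for n in range(1, n_terms + 1):
        y = fx * np.sin(n * np.pi * xi / L)
        # Trapezoidal rule for D_n = (2/L) * integral of f(x) sin(n pi x / L).
        D_n = (2.0 / L) * h * (y.sum() - 0.5 * (y[0] + y[-1]))
        u += D_n * np.sin(n * np.pi * x / L) * np.exp(-(n * np.pi / L) ** 2 * alpha * t)
    return u

# Hypothetical check: for f(x) = sin(pi x / L) only the n = 1 mode is present,
# so the exact solution is sin(pi x / L) * exp(-pi^2 alpha t / L^2).
L, alpha, t = 1.0, 0.1, 0.5
x = np.linspace(0.0, L, 11)
u = fourier_heat_solution(lambda s: np.sin(np.pi * s / L), L, alpha, x, t)
exact = np.sin(np.pi * x / L) * np.exp(-np.pi**2 * alpha * t / L**2)
print(np.max(np.abs(u - exact)))
```

Because each mode decays like exp(−n²π²αt/L²), high-frequency components of the initial data vanish fastest, so a modest number of terms is usually enough for t > 0.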

Generalizing the solution technique


The solution technique used above can be greatly extended to many other types of equations. The idea is that the operator u_xx with the zero boundary conditions can be represented in terms of its eigenvectors. This leads naturally to one of the basic ideas of the spectral theory of linear self-adjoint operators. Consider the linear operator Δu = u_xx. The infinite sequence of functions

$$e_n(x) = \sqrt{\frac{2}{L}} \sin\left(\frac{n \pi x}{L}\right)$$

for n ≥ 1 are eigenvectors of Δ. Indeed

$$\Delta e_n = -\frac{n^2 \pi^2}{L^2} e_n.$$

Moreover, any eigenvector f of Δ with the boundary conditions f(0) = f(L) = 0 is, up to a scalar multiple, of the form e_n for some n ≥ 1. The functions e_n for n ≥ 1 form an orthonormal sequence with respect to a certain inner product on the space of real-valued functions on [0, L]. This means

$$\langle e_n, e_m \rangle = \int_0^L e_n(x) \, e_m(x) \, dx = \begin{cases} 0 & n \neq m \\ 1 & n = m \end{cases}$$

Finally, the sequence {e_n}, n ≥ 1, spans a dense linear subspace of L²(0, L). This shows that in effect we have diagonalized the operator Δ.
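The orthonormality and eigenvalue relations for the sine functions e_n(x) = √(2/L) sin(nπx/L) can be checked numerically. The following sketch is illustrative only: the interval length L = 2, the grid size, and the helper names `e` and `inner` are assumptions for the example. It builds the Gram matrix of the first four functions with the trapezoidal rule and compares a central-difference second derivative of e_3 with −(3π/L)² e_3.

```python
import numpy as np

L = 2.0                                  # hypothetical interval length
x = np.linspace(0.0, L, 4001)
h = x[1] - x[0]

def e(n):
    """Sine eigenfunction e_n(x) = sqrt(2/L) sin(n pi x / L) sampled on the grid."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

def inner(f, g):
    """L^2(0, L) inner product approximated by the trapezoidal rule."""
    y = f * g
    return h * (y.sum() - 0.5 * (y[0] + y[-1]))

# Gram matrix of e_1..e_4; orthonormality means it is close to the identity.
gram = np.array([[inner(e(n), e(m)) for m in range(1, 5)] for n in range(1, 5)])

# Central-difference second derivative of e_3 at interior points,
# compared with the eigenvalue relation e_n'' = -(n pi / L)^2 e_n.
n = 3
en = e(n)
d2 = (en[2:] - 2.0 * en[1:-1] + en[:-2]) / h**2
residual = np.max(np.abs(d2 + (n * np.pi / L) ** 2 * en[1:-1]))
print(np.round(gram, 6))
print(residual)
```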

Heat conduction in non-homogeneous anisotropic media


In general, the study of heat conduction is based on several principles. Heat flow is a form of energy flow, and as such it is meaningful to speak of the time rate of flow of heat into a region of space. The time rate of heat flow into a region V is given by a time-dependent quantity q_t(V). We assume q has a density Q, so that

$$q_t(V) = \int_V Q(x, t) \, dx.$$

Heat flow is a time-dependent vector function H(x) characterized as follows: the time rate of heat flowing through an infinitesimal surface element with area dS and with unit normal vector n is

$$\mathbf{H}(x) \cdot \mathbf{n}(x) \, dS.$$

Thus the rate of heat flow into V is also given by the surface integral

$$q_t(V) = -\int_{\partial V} \mathbf{H}(x) \cdot \mathbf{n}(x) \, dS,$$

where n(x) is the outward pointing normal vector at x. The Fourier law states that heat energy flow has the following linear dependence on the temperature gradient

$$\mathbf{H}(x) = -\mathbf{A}(x) \cdot \nabla u(x),$$

where A(x) is a 3 × 3 real matrix that is symmetric and positive definite. By the divergence theorem, the previous surface integral for heat flow into V can be transformed into the volume integral

$$q_t(V) = -\int_{\partial V} \mathbf{H}(x) \cdot \mathbf{n}(x) \, dS = \int_{\partial V} \mathbf{A}(x) \cdot \nabla u(x) \cdot \mathbf{n}(x) \, dS = \int_V \sum_{i,j} \partial_{x_i} \left(a_{ij}(x) \, \partial_{x_j} u(x, t)\right) dx.$$

The time rate of temperature change at x is proportional to the heat flowing into an infinitesimal volume element, where the constant of proportionality is dependent on a constant κ:

$$\partial_t u(x, t) = \kappa(x) \, Q(x, t).$$

Putting these equations together gives the general equation of heat flow:

$$\partial_t u(x, t) = \kappa(x) \sum_{i,j} \partial_{x_i} \left(a_{ij}(x) \, \partial_{x_j} u(x, t)\right).$$

Remarks.
The coefficient κ(x) is the inverse of the specific heat of the substance at x × the density of the substance at x.
In the case of an isotropic medium, the matrix A is a scalar matrix equal to thermal conductivity.

In the anisotropic case where the coefficient matrix A is not scalar (and/or if it depends on x), an explicit formula for the solution of the heat equation can seldom be written down. However, it is usually possible to consider the associated abstract Cauchy problem and show that it is a well-posed problem and/or to show some qualitative properties (like preservation of positive initial data, infinite speed of propagation, convergence toward an equilibrium, smoothing properties). This is usually done by one-parameter semigroups theory: for instance, if A is a symmetric matrix, then the elliptic operator defined by

$$A u(x) := \sum_{i,j} \partial_{x_i} a_{ij}(x) \, \partial_{x_j} u(x)$$

is self-adjoint and dissipative, thus by the spectral theorem it generates a one-parameter semigroup.

Fundamental solutions
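A fundamental solution, also called a heat kernel, is a solution of the heat equation corresponding to the initial condition of an initial point source of heat at a known position. These can be used to find a general solution of the heat equation over certain domains; see, for instance, (Evans 1998) for an introductory treatment. In one variable, the Green's function is a solution of the initial value problem

$$\begin{cases} u_t(x, t) - \alpha u_{xx}(x, t) = 0 & -\infty < x < \infty, \quad 0 < t < \infty \\ u(x, 0) = \delta(x) \end{cases}$$

where δ is the Dirac delta function. The solution to this problem is the fundamental solution

$$\Phi(x, t) = \frac{1}{\sqrt{4 \pi \alpha t}} \exp\left(-\frac{x^2}{4 \alpha t}\right).$$

One can obtain the general solution of the one variable heat equation with initial condition u(x, 0) = g(x) for −∞ < x < ∞ and 0 < t < ∞ by applying a convolution:

$$u(x, t) = \int_{-\infty}^{\infty} \Phi(x - y, t) \, g(y) \, dy.$$

In several spatial variables, the fundamental solution solves the analogous problem

$$\begin{cases} u_t(\mathbf{x}, t) - \alpha \sum_{i=1}^{n} u_{x_i x_i}(\mathbf{x}, t) = 0 \\ u(\mathbf{x}, 0) = \delta(\mathbf{x}) \end{cases}$$

in −∞ < x_i < ∞, i = 1,...,n, and 0 < t < ∞. The n-variable fundamental solution is the product of the fundamental solutions in each variable; i.e.,

$$\Phi(\mathbf{x}, t) = \Phi(x_1, t) \, \Phi(x_2, t) \cdots \Phi(x_n, t) = \frac{1}{(4 \pi \alpha t)^{n/2}} \exp\left(-\frac{\mathbf{x} \cdot \mathbf{x}}{4 \alpha t}\right).$$

The general solution of the heat equation on R^n is then obtained by a convolution, so that to solve the initial value problem with u(x, t = 0) = g(x), one has

$$u(\mathbf{x}, t) = \int_{\mathbb{R}^n} \Phi(\mathbf{x} - \mathbf{y}, t) \, g(\mathbf{y}) \, d\mathbf{y}.$$

The general problem on a domain Ω in R^n is

$$\begin{cases} u_t(\mathbf{x}, t) - \alpha \sum_{i=1}^{n} u_{x_i x_i}(\mathbf{x}, t) = 0 & \mathbf{x} \in \Omega, \quad 0 < t < \infty \\ u(\mathbf{x}, 0) = g(\mathbf{x}) & \mathbf{x} \in \Omega \end{cases}$$

with either Dirichlet or Neumann boundary data. A Green's function always exists, but unless the domain Ω can be readily decomposed into one-variable problems (see below), it may not be possible to write it down explicitly. The method of images provides one additional technique for obtaining Green's functions for non-trivial domains.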
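The convolution formula for the initial value problem on the whole line can be tried out numerically. The sketch below is a hedged illustration, not part of the original text: it writes the one-dimensional heat kernel with diffusivity `alpha` as exp(−x²/(4·alpha·t))/√(4π·alpha·t), approximates the convolution integral by a Riemann sum on a truncated grid, and uses Gaussian initial data because the result is then known in closed form (a Gaussian of variance s² spreads to variance s² + 2·alpha·t). The function names, grid, and parameter values are assumptions.

```python
import numpy as np

def heat_kernel(x, t, alpha):
    """One-dimensional fundamental solution of u_t = alpha * u_xx."""
    return np.exp(-x**2 / (4.0 * alpha * t)) / np.sqrt(4.0 * np.pi * alpha * t)

def solve_by_convolution(g, x, t, alpha, y):
    """Approximate u(x,t) = integral of Phi(x - y, t) g(y) dy on the uniform grid y."""
    dy = y[1] - y[0]
    gy = g(y)
    return np.array([np.sum(heat_kernel(xi - y, t, alpha) * gy) * dy for xi in x])

# Hypothetical data: unit-variance Gaussian initial temperature profile.
s2, alpha, t = 1.0, 0.5, 1.0
g = lambda y: np.exp(-y**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)

y = np.linspace(-20.0, 20.0, 4001)      # truncated integration grid
x = np.linspace(-3.0, 3.0, 7)
u = solve_by_convolution(g, x, t, alpha, y)

# Closed form: Gaussian with variance s2 + 2*alpha*t.
var = s2 + 2.0 * alpha * t
exact = np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
print(np.max(np.abs(u - exact)))
```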


Some Green's function solutions in 1D


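A variety of elementary Green's function solutions in one dimension are recorded here. In some of these, the spatial domain is the entire real line (−∞, ∞). In others, it is the semi-infinite interval (0, ∞) with either Neumann or Dirichlet boundary conditions. One further variation is that some of these solve the inhomogeneous equation

$$u_t = \alpha u_{xx} + f,$$

where f is some given function of x and t.

Homogeneous heat equation

Initial value problem on (−∞, ∞)

$$\begin{cases} u_t = \alpha u_{xx} & (x, t) \in \mathbb{R} \times (0, \infty) \\ u(x, 0) = g(x) \end{cases}$$

$$u(x, t) = \frac{1}{\sqrt{4 \pi \alpha t}} \int_{-\infty}^{\infty} \exp\left(-\frac{(x - y)^2}{4 \alpha t}\right) g(y) \, dy$$

Comment. This solution is the convolution with respect to the variable x of the fundamental solution

$$\Phi(x, t) := \frac{1}{\sqrt{4 \pi \alpha t}} \exp\left(-\frac{x^2}{4 \alpha t}\right)$$

and the function g(x). Therefore, according to the general properties of the convolution with respect to differentiation, u = g ∗ Φ is a solution of the same heat equation, for

$$(\partial_t - \alpha \partial_x^2)(\Phi * g) = \left[(\partial_t - \alpha \partial_x^2) \Phi\right] * g = 0.$$

Moreover,

$$\Phi(x, t) = \frac{1}{\sqrt{t}} \, \Phi\left(\frac{x}{\sqrt{t}}, 1\right) \quad \text{and} \quad \int_{-\infty}^{\infty} \Phi(x, t) \, dx = 1,$$

so that, by general facts about approximation to the identity, Φ(·, t) ∗ g → g as t → 0 in various senses, according to the specific g. For instance, if g is assumed bounded and continuous on R then Φ(·, t) ∗ g converges uniformly to g as t → 0, meaning that u(x, t) is continuous on R × [0, ∞) with u(x, 0) = g(x).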

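Initial value problem on (0, ∞) with homogeneous Dirichlet boundary conditions

$$\begin{cases} u_t = \alpha u_{xx} & (x, t) \in (0, \infty) \times (0, \infty) \\ u(x, 0) = g(x) \\ u(0, t) = 0 \end{cases}$$

$$u(x, t) = \frac{1}{\sqrt{4 \pi \alpha t}} \int_0^{\infty} \left[\exp\left(-\frac{(x - y)^2}{4 \alpha t}\right) - \exp\left(-\frac{(x + y)^2}{4 \alpha t}\right)\right] g(y) \, dy$$

Comment. This solution is obtained from the preceding formula as applied to the data g(x) suitably extended to R, so as to be an odd function, that is, letting g(−x) := −g(x) for all x. Correspondingly, the solution of the initial value problem on (−∞, +∞) is an odd function with respect to the variable x for all values of t, and in particular it satisfies the homogeneous Dirichlet boundary conditions u(0, t) = 0.

Initial value problem on (0, ∞) with homogeneous Neumann boundary conditions

$$\begin{cases} u_t = \alpha u_{xx} & (x, t) \in (0, \infty) \times (0, \infty) \\ u(x, 0) = g(x) \\ u_x(0, t) = 0 \end{cases}$$

$$u(x, t) = \frac{1}{\sqrt{4 \pi \alpha t}} \int_0^{\infty} \left[\exp\left(-\frac{(x - y)^2}{4 \alpha t}\right) + \exp\left(-\frac{(x + y)^2}{4 \alpha t}\right)\right] g(y) \, dy$$

Comment. This solution is obtained from the first solution formula as applied to the data g(x) suitably extended to R so as to be an even function, that is, letting g(−x) := +g(x) for all x. Correspondingly, the solution of the initial value problem on (−∞, +∞) is an even function with respect to the variable x for all values of t > 0, and in particular, being smooth, it satisfies the homogeneous Neumann boundary conditions u_x(0, t) = 0.

Problem on (0, ∞) with homogeneous initial conditions and non-homogeneous Dirichlet boundary conditions

$$\begin{cases} u_t = \alpha u_{xx} & (x, t) \in (0, \infty) \times (0, \infty) \\ u(x, 0) = 0 \\ u(0, t) = h(t) \end{cases}$$

$$u(x, t) = \int_0^t \frac{x}{\sqrt{4 \pi \alpha (t - s)^3}} \exp\left(-\frac{x^2}{4 \alpha (t - s)}\right) h(s) \, ds, \quad \forall x > 0$$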

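Comment. This solution is the convolution with respect to the variable t of

$$\psi(x, t) := -2 \alpha \, \partial_x \Phi(x, t) = \frac{x}{\sqrt{4 \pi \alpha t^3}} \exp\left(-\frac{x^2}{4 \alpha t}\right)$$

and the function h(t). Since Φ(x, t) is the fundamental solution of

$$\partial_t - \alpha \partial_x^2,$$

the function ψ(x, t) is also a solution of the same heat equation, and so is u := ψ ∗ h, thanks to general properties of the convolution with respect to differentiation. Moreover,

$$\psi(x, t) = \frac{1}{x^2} \, \psi\left(1, \frac{t}{x^2}\right) \quad \text{and} \quad \int_0^{\infty} \psi(x, t) \, dt = 1,$$

so that, by general facts about approximation to the identity, ψ(x, ·) ∗ h → h as x → 0 in various senses, according to the specific h. For instance, if h is assumed continuous on R with support in [0, ∞) then ψ(x, ·) ∗ h converges uniformly on compacta to h as x → 0, meaning that u(x, t) is continuous on [0, ∞) × [0, ∞) with u(0, t) = h(t).

Inhomogeneous heat equation

Problem on (−∞, ∞) with homogeneous initial conditions

$$\begin{cases} u_t = \alpha u_{xx} + f(x, t) & (x, t) \in \mathbb{R} \times (0, \infty) \\ u(x, 0) = 0 \end{cases}$$

$$u(x, t) = \int_0^t \int_{-\infty}^{\infty} \frac{1}{\sqrt{4 \pi \alpha (t - s)}} \exp\left(-\frac{(x - y)^2}{4 \alpha (t - s)}\right) f(y, s) \, dy \, ds$$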

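Comment. This solution is the convolution in R², that is with respect to both the variables x and t, of the fundamental solution

$$\Phi(x, t) := \frac{1}{\sqrt{4 \pi \alpha t}} \exp\left(-\frac{x^2}{4 \alpha t}\right)$$

and the function f(x, t), both meant as defined on the whole R² and identically 0 for all t < 0. One verifies that

$$(\partial_t - \alpha \partial_x^2)(\Phi * f) = f,$$

which is expressed in the language of distributions as

$$(\partial_t - \alpha \partial_x^2) \Phi = \delta,$$

where the distribution δ is the Dirac's delta function, that is the evaluation at 0.

Problem on (0, ∞) with homogeneous Dirichlet boundary conditions and initial conditions

$$\begin{cases} u_t = \alpha u_{xx} + f(x, t) & (x, t) \in (0, \infty) \times (0, \infty) \\ u(x, 0) = 0 \\ u(0, t) = 0 \end{cases}$$

$$u(x, t) = \int_0^t \int_0^{\infty} \frac{1}{\sqrt{4 \pi \alpha (t - s)}} \left[\exp\left(-\frac{(x - y)^2}{4 \alpha (t - s)}\right) - \exp\left(-\frac{(x + y)^2}{4 \alpha (t - s)}\right)\right] f(y, s) \, dy \, ds$$

Comment. This solution is obtained from the preceding formula as applied to the data f(x, t) suitably extended to R × [0, ∞), so as to be an odd function of the variable x, that is, letting f(−x, t) := −f(x, t) for all x and t. Correspondingly, the solution of the inhomogeneous problem on (−∞, +∞) is an odd function with respect to the variable x for all values of t, and in particular it satisfies the homogeneous Dirichlet boundary conditions u(0, t) = 0.

Problem on (0, ∞) with homogeneous Neumann boundary conditions and initial conditions

$$\begin{cases} u_t = \alpha u_{xx} + f(x, t) & (x, t) \in (0, \infty) \times (0, \infty) \\ u(x, 0) = 0 \\ u_x(0, t) = 0 \end{cases}$$

$$u(x, t) = \int_0^t \int_0^{\infty} \frac{1}{\sqrt{4 \pi \alpha (t - s)}} \left[\exp\left(-\frac{(x - y)^2}{4 \alpha (t - s)}\right) + \exp\left(-\frac{(x + y)^2}{4 \alpha (t - s)}\right)\right] f(y, s) \, dy \, ds$$

Comment. This solution is obtained from the first formula as applied to the data f(x, t) suitably extended to R × [0, ∞), so as to be an even function of the variable x, that is, letting f(−x, t) := f(x, t) for all x and t. Correspondingly, the solution of the inhomogeneous problem on (−∞, +∞) is an even function with respect to the variable x for all values of t, and in particular, being a smooth function, it satisfies the homogeneous Neumann boundary conditions u_x(0, t) = 0.

Examples

Since the heat equation is linear, solutions of other combinations of boundary conditions, inhomogeneous term, and initial conditions can be found by taking an appropriate linear combination of the above Green's function solutions. For example, to solve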

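$$\begin{cases} u_t = \alpha u_{xx} + f & (x, t) \in \mathbb{R} \times (0, \infty) \\ u(x, 0) = g(x) \end{cases}$$

let

$$u = w + v,$$

where w and v solve the problems

$$\begin{cases} v_t = \alpha v_{xx} + f, \quad w_t = \alpha w_{xx} & (x, t) \in \mathbb{R} \times (0, \infty) \\ v(x, 0) = 0, \quad w(x, 0) = g(x) \end{cases}$$

Similarly, to solve

$$\begin{cases} u_t = \alpha u_{xx} + f & (x, t) \in (0, \infty) \times (0, \infty) \\ u(x, 0) = g(x) \\ u(0, t) = h(t) \end{cases}$$

let

$$u = w + v + r,$$

where w, v, and r solve the problems

$$\begin{cases} v_t = \alpha v_{xx} + f, \quad w_t = \alpha w_{xx}, \quad r_t = \alpha r_{xx} & (x, t) \in (0, \infty) \times (0, \infty) \\ v(x, 0) = 0, \quad w(x, 0) = g(x), \quad r(x, 0) = 0 \\ v(0, t) = 0, \quad w(0, t) = 0, \quad r(0, t) = h(t) \end{cases}$$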

Mean-value property for the heat equation


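Solutions of the heat equation

$$(\partial_t - \Delta) u = 0$$

satisfy a mean-value property analogous to the mean-value properties of harmonic functions, solutions of

$$\Delta u = 0,$$

though a bit more complicated. Precisely, if u solves

$$(\partial_t - \Delta) u = 0$$

and

$$(x, t) + E_\lambda \subset \mathrm{dom}(u),$$

then

$$u(x, t) = \frac{\lambda}{4} \int_{E_\lambda} u(x - y, t - s) \frac{|y|^2}{s^2} \, ds \, dy,$$

where E_λ is a "heat-ball", that is a super-level set of the fundamental solution of the heat equation:

$$E_\lambda := \{(y, s) : \Phi(y, s) > \lambda\}, \qquad \Phi(x, t) := (4 t \pi)^{-\frac{n}{2}} \exp\left(-\frac{|x|^2}{4t}\right).$$

Notice that

$$\mathrm{diam}(E_\lambda) = o(1)$$

as λ → ∞, so the above formula holds for any (x, t) in the (open) set dom(u) for λ large enough. Conversely, any function u satisfying the above mean-value property on an open domain of R^n × R is a solution of the heat equation. This can be shown by an argument similar to the analogous one for harmonic functions.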

Stationary Heat Equation


The (time) stationary heat equation is not dependent on time. In other words, it is assumed that conditions exist such that:

$$\frac{\partial u}{\partial t} = 0$$

This condition depends on the time constant and the amount of time passed since boundary conditions have been imposed. Thus, the condition is fulfilled in situations in which the time constant of equilibration is short enough that the more complex time-dependent heat equation can be approximated by the stationary case. Equivalently, the stationary condition holds in all cases in which enough time has passed that the thermal field u no longer evolves in time.

In the stationary case, a spatial thermal gradient may (or may not) exist, but if it does, it does not change in time. This equation therefore describes the end result in all thermal problems in which a source is switched on (for example, an engine started in an automobile), and enough time has passed for all permanent temperature gradients to establish themselves in space, after which these spatial gradients no longer change in time (as, again, with an automobile in which the engine has been running for long enough). The other (trivial) solution is for all spatial temperature gradients to disappear as well, in which case the temperature becomes uniform in space.

The equation is much simpler and can help to understand better the physics of the materials without focusing on the dynamics of the heat transport process. It is widely used for simple engineering problems, assuming that the temperature fields and the heat transport are in equilibrium with respect to time.

The stationary heat equation for a volume that contains a heat source (the inhomogeneous case) is Poisson's equation:

$$-k \nabla^2 u = q$$

where u is the temperature, k is the thermal conductivity and q the heat source density. In electrostatics, this is equivalent to the case where the space under consideration contains an electrical charge.

The stationary heat equation without a heat source within the volume (the homogeneous case) is the equation in electrostatics for a volume of free space that does not contain a charge. It is described by Laplace's equation:

$$\nabla^2 u = 0$$
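As an illustration of the inhomogeneous stationary case, the following is a minimal sketch that solves the one-dimensional Poisson problem $-k u'' = q$ with $u = 0$ at both ends, using central differences and a tridiagonal (Thomas) solve. The function name, the grid size and the numerical values of k, q and the rod length are assumptions chosen for the example; they do not come from the text. For constant q the exact solution is the parabola $u(x) = q\,x(L-x)/(2k)$, which the difference scheme should reproduce up to rounding.

```python
def solve_stationary_heat_1d(k, q, length, n_interior):
    """Solve -k u'' = q on (0, length) with u(0) = u(length) = 0.

    Central differences give the tridiagonal system
    -u[i-1] + 2 u[i] - u[i+1] = q h^2 / k, solved by the Thomas algorithm.
    """
    h = length / (n_interior + 1)
    a = [-1.0] * n_interior          # sub-diagonal (a[0] is unused)
    b = [2.0] * n_interior           # main diagonal
    c = [-1.0] * n_interior          # super-diagonal (c[-1] is unused)
    d = [q * h * h / k] * n_interior

    # Forward elimination
    for i in range(1, n_interior):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]

    # Back substitution
    u = [0.0] * n_interior
    u[-1] = d[-1] / b[-1]
    for i in range(n_interior - 2, -1, -1):
        u[i] = (d[i] - c[i] * u[i + 1]) / b[i]

    xs = [(i + 1) * h for i in range(n_interior)]
    return xs, u


# Assumed illustrative values (not from the text)
k, q, length = 2.0, 10.0, 1.0
xs, u = solve_stationary_heat_1d(k, q, length, n_interior=9)

# Exact parabola for constant q and zero boundary temperatures
exact = [q * x * (length - x) / (2.0 * k) for x in xs]
max_error = max(abs(ui - ei) for ui, ei in zip(u, exact))
print(max_error)
```

Setting q to zero in the same sketch gives the homogeneous (Laplace) case, whose solution with these boundary values is identically zero.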

Applications
Particle diffusion
One can model particle diffusion by an equation involving either:
the volumetric concentration of particles, denoted c, in the case of collective diffusion of a large number of particles, or
the probability density function associated with the position of a single particle, denoted P.

In either case, one uses the heat equation

$$\frac{\partial c}{\partial t} = D \Delta c$$

or

$$\frac{\partial P}{\partial t} = D \Delta P.$$

Both c and P are functions of position and time. D is the diffusion coefficient that controls the speed of the diffusive process, and is typically expressed in square meters per second. If the diffusion coefficient D is not constant, but depends on the concentration c (or P in the second case), then one gets the nonlinear diffusion equation.

Brownian motion
The random trajectory of a single particle subject to the particle diffusion equation (or heat equation) is a Brownian motion. If a particle is placed at R = 0 at time t = 0, then the probability density function associated with the position vector of the particle R will be the following:

$$P(\mathbf{R},t) = G(\mathbf{R},t) = \frac{1}{(4 \pi D t)^{3/2}} \exp\left(-\frac{\mathbf{R}^2}{4 D t}\right),$$

which is a (multivariate) normal distribution evolving in time.
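The Gaussian density above has variance 2Dt in each Cartesian coordinate, so the mean squared displacement in three dimensions is 6Dt. The sketch below checks this with a simple random walk built from independent Gaussian increments. The function name, the seed and all numerical values are assumptions made for the example, and the agreement is only statistical.

```python
import random


def simulate_msd(D, t_total, n_steps, n_particles, seed=0):
    """Mean squared displacement of 3D Brownian particles started at R = 0."""
    rng = random.Random(seed)
    dt = t_total / n_steps
    sigma = (2.0 * D * dt) ** 0.5  # standard deviation of each Cartesian increment
    total = 0.0
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma)
            y += rng.gauss(0.0, sigma)
            z += rng.gauss(0.0, sigma)
        total += x * x + y * y + z * z
    return total / n_particles


# Assumed illustrative values (arbitrary units)
D, t_total = 0.5, 2.0
msd = simulate_msd(D, t_total, n_steps=20, n_particles=2000)
expected = 6.0 * D * t_total  # from the variance 2 D t per coordinate
print(msd, expected)
```

With a few thousand particles the simulated value should fall within a few percent of 6Dt; more particles tighten the agreement.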

Schrödinger equation for a free particle


With a simple division, the Schrödinger equation for a single particle of mass m in the absence of any applied force field can be rewritten in the following way:

$$\psi_t = \frac{i \hbar}{2m} \Delta \psi,$$

where i is the imaginary unit, $\hbar$ is the reduced Planck's constant, and $\psi$ is the wavefunction of the particle. This equation is formally similar to the particle diffusion equation, which one obtains through the following transformation:

$$c(\mathbf{R},t) \to \psi(\mathbf{R},t), \qquad D \to \frac{i \hbar}{2m}.$$

Applying this transformation to the expressions of the Green functions determined in the case of particle diffusion yields the Green functions of the Schrödinger equation, which in turn can be used to obtain the wavefunction at any time through an integral on the wavefunction at t = 0:

$$\psi(\mathbf{R}, t) = \int \psi(\mathbf{R}^0, t=0)\, G(\mathbf{R} - \mathbf{R}^0, t)\, dR_x^0\, dR_y^0\, dR_z^0,$$

with

$$G(\mathbf{R},t) = \left(\frac{m}{2 \pi i \hbar t}\right)^{3/2} \exp\left(-\frac{\mathbf{R}^2 m}{2 i \hbar t}\right).$$

Remark: this analogy between quantum mechanics and diffusion is a purely formal one. Physically, the evolution of the wavefunction satisfying Schrödinger's equation might have an origin other than diffusion.

Thermal diffusivity in polymers


A direct practical application of the heat equation, in conjunction with Fourier theory, in spherical coordinates, is the measurement of the thermal diffusivity in polymers (Unsworth and Duarte). The dual theoretical-experimental method demonstrated by these authors is applicable to rubber and various other materials of practical interest.

Further applications
The heat equation arises in the modeling of a number of phenomena and is often used in financial mathematics in the modeling of options. The differential equation of the famous Black–Scholes option pricing model can be transformed into the heat equation, allowing relatively easy solutions from a familiar body of mathematics. Many of the extensions to the simple option models do not have closed form solutions and thus must be solved numerically to obtain a modeled option price.

The equation describing pressure diffusion in a porous medium is identical in form with the heat equation. Diffusion problems dealing with Dirichlet, Neumann and Robin boundary conditions have closed form analytic solutions (Thambynayagam 2011). The heat equation is also widely used in image analysis (Perona & Malik 1990) and in machine learning as the driving theory behind scale-space or graph Laplacian methods.

The heat equation can be efficiently solved numerically using the Crank–Nicolson method (Crank & Nicolson 1947). This method can be extended to many of the models with no closed form solution; see for instance (Wilmott, Howison & Dewynne 1995).

An abstract form of the heat equation on manifolds provides a major approach to the Atiyah–Singer index theorem, and has led to much further work on heat equations in Riemannian geometry.
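To make the numerical remark concrete, here is a minimal sketch of a Crank–Nicolson time stepper for the one-dimensional heat equation $u_t = D u_{xx}$ with zero (Dirichlet) boundary values. The function names `thomas` and `crank_nicolson`, the grid and the test problem are assumptions chosen for illustration, not taken from the cited references. The scheme averages the explicit and implicit second differences, which leads to one tridiagonal solve per step.

```python
import math


def thomas(a, b, c, d):
    """Solve a tridiagonal system; a, b, c are the sub-, main and super-diagonals."""
    n = len(d)
    b = b[:]
    d = d[:]
    for i in range(1, n):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    x = [0.0] * n
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x


def crank_nicolson(u0, D, h, dt, n_steps):
    """Advance interior values u0 by n_steps; boundary values are held at zero."""
    n = len(u0)
    r = D * dt / (h * h)
    a = [-r / 2.0] * n
    b = [1.0 + r] * n
    c = [-r / 2.0] * n
    u = u0[:]
    for _ in range(n_steps):
        rhs = []
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            rhs.append((r / 2.0) * left + (1.0 - r) * u[i] + (r / 2.0) * right)
        u = thomas(a, b, c, rhs)
    return u


# Assumed test problem: u(x, 0) = sin(pi x) on (0, 1), which decays as exp(-pi^2 D t)
n_interior = 49
h = 1.0 / (n_interior + 1)
x = [(i + 1) * h for i in range(n_interior)]
u0 = [math.sin(math.pi * xi) for xi in x]
D, dt, n_steps = 1.0, 1e-3, 100
u = crank_nicolson(u0, D, h, dt, n_steps)
decay = math.exp(-math.pi ** 2 * D * dt * n_steps)
exact = [decay * math.sin(math.pi * xi) for xi in x]
max_error = max(abs(ui - ei) for ui, ei in zip(u, exact))
print(max_error)
```

In this example $D\,\Delta t / h^2 = 2.5$, well above the limit of 1/2 at which a fully explicit scheme becomes unstable, yet the Crank–Nicolson solution stays close to the exact decaying sine mode.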

Notes
[1] Here we are assuming that the material has constant mass density and heat capacity through space as well as time, although generalizations are given below.
[2] In higher dimensions, the divergence theorem is used instead.
[3] The Mathworld: Porous Medium Equation (http://mathworld.wolfram.com/PorousMediumEquation.html) and the other related models have solutions with finite wave propagation speed.
[4] Juan Luis Vazquez (2006-12-28), The Porous Medium Equation: Mathematical Theory, Oxford University Press, USA, ISBN 0-19-856903-3
[5] Note that the units of u must be selected in a manner compatible with those of q. Thus instead of being for thermodynamic temperature (Kelvin - K), units of u should be J/L.

References
Cannon, John Rozier (1984), The One-Dimensional Heat Equation (http://books.google.com/?id=XWSnBZxbz2oC&printsec=frontcover#v=onepage&q=), Encyclopedia of Mathematics and Its Applications, 23 (1st ed.), Reading–Menlo Park–London–Don Mills–Sidney–Tokyo / Cambridge–New York–New Rochelle–Melbourne–Sidney: Addison-Wesley Publishing Company / Cambridge University Press, pp. XXV+483, ISBN 978-0-521-30243-2, MR0747979, Zbl 0567.35001.
Carslaw, H. S.; Jaeger, J. C. (1959), Conduction of Heat in Solids (2nd ed.), Oxford University Press, ISBN 978-0-19-853368-9
Crank, J.; Nicolson, P.; Hartree, D. R. (1947), "A Practical Method for Numerical Evaluation of Solutions of Partial Differential Equations of the Heat-Conduction Type", Proceedings of the Cambridge Philosophical Society 43: 50–67, Bibcode 1947PCPS...43...50C, doi:10.1017/S0305004100023197
Einstein, Albert (1905), "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen", Ann. Phys. Leipzig 17 322 (8): 549–560, Bibcode 1905AnP...322..549E, doi:10.1002/andp.19053220806
Evans, L.C. (1998), Partial Differential Equations, American Mathematical Society, ISBN 0-8218-0772-2
John, Fritz (1991), Partial Differential Equations (4th ed.), Springer, ISBN 978-0-387-90609-6
Perona, P.; Malik, J. (1990), "Scale-Space and Edge Detection Using Anisotropic Diffusion", IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7): 629–639
Thambynayagam, R. K. M. (2011), The Diffusion Handbook: Applied Solutions for Engineers, McGraw-Hill Professional, ISBN 978-0-07-175184-1
Unsworth, J.; Duarte, F. J. (1979), "Heat diffusion in a solid sphere and Fourier Theory", Am. J. Phys. 47 (11): 891–893, Bibcode 1979AmJPh..47..981U, doi:10.1119/1.11601
Wilmott, P.; Howison, S.; Dewynne, J. (1995), The Mathematics of Financial Derivatives: A Student Introduction, Cambridge University Press

External links
Derivation of the heat equation (http://www.mathphysics.com/pde/HEderiv.html)
Linear heat equations (http://eqworld.ipmnet.ru/en/solutions/lpde/heat-toc.pdf): Particular solutions and boundary value problems - from EqWorld

Diffusion
Diffusion is one of several transport phenomena that occur in nature. A distinguishing feature of diffusion is that it results in mixing or mass transport without requiring bulk motion. Thus, diffusion should not be confused with convection or advection, which are other transport mechanisms that use bulk motion to move particles from one place to another. The Latin word "diffundere" means "to spread out".

There are two ways to introduce the notion of diffusion: either a phenomenological approach starting with Fick's laws and their mathematical consequences, or a physical and atomistic one, by considering the random walk of the diffusing particles.[1]

In the phenomenological approach, according to Fick's laws, the diffusion flux is proportional to the negative gradient of concentrations. It goes from regions of higher concentration to regions of lower concentration. Later on, various generalizations of Fick's laws were developed in the frame of thermodynamics and non-equilibrium thermodynamics.[2]

From the atomistic point of view, diffusion is considered as a result of the random walk of the diffusing particles. In molecular diffusion, the moving molecules are self-propelled by thermal energy. The random walk of small particles in suspension in a fluid was discovered in 1827 by Robert Brown. The theory of Brownian motion and the atomistic background of diffusion were developed by Albert Einstein.[3]

Figure: a diffusion process. Some particles are dissolved in a glass of water. Initially, the particles are all near one corner of the glass. If the particles all randomly move around ("diffuse") in the water, then the particles will eventually become distributed randomly and uniformly (diffusion will still continue to occur, but there will be no net flux).

Time lapse video of diffusion of a dye dissolved in water into a gel.

Now, the concept of diffusion is widely used in science: in physics (particle diffusion), chemistry and biology, in sociology, economics and finance (diffusion of people, ideas and of price values). It appears every time the concept of random walk in ensembles of individuals is applicable.

History of diffusion in physics


In technology, diffusion in solids was used long before the theory of diffusion was created. For example, the cementation process, which produces steel from iron, involves carbon diffusion and was already described by Pliny the Elder. The diffusion of colours in stained glass, earthenware and china was likewise well known for many centuries.

In modern science, the first systematic experimental study of diffusion was performed by Thomas Graham. He studied diffusion in gases, and described the main phenomenon in 1831-1833:[4] "...gases of different nature, when brought into contact, do not arrange themselves according to their density, the heaviest undermost, and the lighter uppermost, but they spontaneously diffuse, mutually and equally, through each other, and so remain in the intimate state of mixture for any length of time." The measurements of Graham allowed James Clerk Maxwell to derive in 1867 the coefficient of diffusion of CO2 in air. The error is less than 5%.

In 1855, Adolf Fick, the 26-year-old anatomy demonstrator from Zürich, proposed his law of diffusion. He used Graham's research, and his goal was "the development of a fundamental law, for the operation of diffusion in a single element of space". He declared a deep analogy between diffusion and conduction of heat or electricity and created a formalism that is similar to Fourier's law for heat conduction (1822) and Ohm's law for electrical current (1827).

Robert Boyle demonstrated diffusion in solids in the 17th century[5] by penetration of zinc into a copper coin. Nevertheless, diffusion in solids was not systematically studied until the second half of the 19th century. William Chandler Roberts-Austen, the well-known British metallurgist, systematically studied solid state diffusion using the example of gold in lead in 1896. He was a former assistant of Thomas Graham, and this connection inspired him:[6] "... My long connection with Graham's researches made it almost a duty to attempt to extend his work on liquid diffusion to metals."

In 1858, Rudolf Clausius introduced the concept of the mean free path. In the same year, James Clerk Maxwell developed the first atomistic theory of transport processes in gases. The modern atomistic theory of diffusion and Brownian motion was developed by Albert Einstein, Marian Smoluchowski and Jean-Baptiste Perrin. Ludwig Boltzmann played a great role in the development of the atomistic background of the macroscopic transport processes. His Boltzmann equation has served for more than 140 years as a source of ideas and problems in the mathematics and physics of transport processes.[7]

In 1920-1921 George de Hevesy measured self-diffusion using radioisotopes. He studied self-diffusion of radioactive isotopes of lead in liquid and solid lead.

Yakov Frenkel (sometimes Jakov or Jacov) proposed in 1926, and then elaborated, the idea of diffusion in crystals through local defects (vacancies and interstitial atoms). He introduced several mechanisms of diffusion and found rate constants from experimental data. Later, this idea was developed further by Carl Wagner and Walter H. Schottky. Nowadays, it is universally recognized that atomic defects are necessary to mediate diffusion in crystals.[6]

The ideas of Frenkel represent the diffusion process in condensed matter as an ensemble of elementary jumps and quasichemical interactions of particles and defects. Henry Eyring with co-authors applied his theory of absolute reaction rates to this quasichemical representation of diffusion.[8] The analogy between reaction kinetics and diffusion leads to various nonlinear versions of Fick's law.[9]

Basic models of diffusion


Diffusion flux
Each model of diffusion expresses the diffusion flux through concentrations, densities and their derivatives. Flux is a vector $\mathbf{J}$. The transfer of a physical quantity $N$ through a small area $\Delta S$ with normal $\nu$ per time $\Delta t$ is

$$\Delta N = (\mathbf{J}, \nu)\, \Delta S\, \Delta t + o(\Delta S\, \Delta t),$$

where $(\mathbf{J}, \nu)$ is the inner product and $o(\cdots)$ is the little-o notation. If we use the notation of vector area $\Delta \mathbf{S} = \nu\, \Delta S$ then

$$\Delta N = (\mathbf{J}, \Delta \mathbf{S})\, \Delta t + o(\Delta \mathbf{S}\, \Delta t).$$

The dimension of the diffusion flux is [flux]=[quantity]/([time][area]). The diffusing physical quantity $N$ may be the number of particles, mass, energy, electric charge, or any other scalar extensive quantity. For its density, $n$, the diffusion equation has the form

$$\frac{\partial n}{\partial t} = -\nabla \cdot \mathbf{J} + W,$$

where $W$ is the intensity of any local source of this quantity (the rate of a chemical reaction, for example). For the diffusion equation, the no-flux boundary conditions can be formulated as $(\mathbf{J}(x), \nu(x)) = 0$ on the boundary, where $\nu(x)$ is the normal to the boundary at point $x$.

Fick's law and equations


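Fick's first law: the diffusion flux is proportional to the negative of the concentration gradient:

$$\mathbf{J} = -D \nabla n, \qquad J_i = -D \frac{\partial n}{\partial x_i}.$$

The corresponding diffusion equation (Fick's second law) is

$$\frac{\partial n(x,t)}{\partial t} = \nabla \cdot (D \nabla n(x,t)) = D \Delta n(x,t),$$

where $\Delta$ is the Laplace operator,

$$\Delta n(x,t) = \sum_i \frac{\partial^2 n(x,t)}{\partial x_i^2}.$$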
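Fick's two laws, together with the no-flux boundary condition from the previous section, can be illustrated with a short explicit scheme in one dimension. The sketch below computes the flux $J = -D\,\partial n/\partial x$ on cell faces, sets it to zero on the two walls, and updates the concentration with $\partial n/\partial t = -\partial J/\partial x$. The function name, grid and parameter values are assumptions made for the example; forward Euler time stepping is used only for simplicity and requires $D\,\Delta t/h^2 \le 1/2$.

```python
def fick_step(n, D, h, dt):
    """One forward Euler step of dn/dt = -dJ/dx with J = -D dn/dx (Fick's first law)."""
    # Fluxes on the cell faces; the two outer faces carry zero flux (no-flux walls).
    flux = [0.0] + [-D * (n[i + 1] - n[i]) / h for i in range(len(n) - 1)] + [0.0]
    return [n[i] - dt / h * (flux[i + 1] - flux[i]) for i in range(len(n))]


# Assumed illustrative values (arbitrary units)
D, h = 1.0, 0.1
dt = 0.4 * h * h / D  # below the explicit stability limit h^2 / (2 D)
n = [1.0 if i < 5 else 0.0 for i in range(20)]  # all particles start on the left

mass_before = sum(n) * h
for _ in range(5000):
    n = fick_step(n, D, h, dt)
mass_after = sum(n) * h

print(mass_before, mass_after, min(n), max(n))
```

Because every interior face flux is subtracted from one cell and added to its neighbour, the total amount is conserved, while the profile relaxes towards a uniform concentration: diffusion mixes without creating or destroying the diffusing quantity.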

Onsager's equations for multicomponent diffusion and thermodiffusion


Fick's law describes diffusion of an admixture in a medium. The concentration of this admixture should be small and the gradient of this concentration should also be small. The driving force of diffusion in Fick's law is the antigradient of concentration, $-\nabla n$.

In 1931, Lars Onsager[10] included the multicomponent transport processes in the general context of linear non-equilibrium thermodynamics. For multi-component transport,

$$\mathbf{J}_i = \sum_j L_{ij} X_j,$$

where $\mathbf{J}_i$ is the flux of the ith physical quantity (component) and $X_j$ is the jth thermodynamic force.

The thermodynamic forces for the transport processes were introduced by Onsager as the space gradients of the derivatives of the entropy density s (he used the term "force" in quotation marks or "driving force"):

$$X_i = \operatorname{grad} \frac{\partial s(n)}{\partial n_i},$$

where $n_i$ are the "thermodynamic coordinates". For the heat and mass transfer one can take $n_0 = u$ (the density of internal energy) and $n_i$ is the concentration of the ith component. The corresponding driving forces are the space vectors

$$X_0 = \operatorname{grad} \frac{1}{T}, \qquad X_i = -\operatorname{grad} \frac{\mu_i}{T} \quad (i > 0),$$

because

$$ds = \frac{1}{T}\, du - \sum_{i \geq 1} \frac{\mu_i}{T}\, dn_i,$$

where T is the absolute temperature and $\mu_i$ is the chemical potential of the ith component. It should be stressed that the separate diffusion equations describe the mixing or mass transport without bulk motion. Therefore, the terms with variation of the total pressure are neglected. This is possible for diffusion of small admixtures and for small gradients.

For the linear Onsager equations, we must take the thermodynamic forces in the linear approximation near equilibrium:

$$X_i = \sum_{k \geq 0} \left. \frac{\partial^2 s(n)}{\partial n_i\, \partial n_k} \right|_{n=n^*} \operatorname{grad} n_k,$$

where the derivatives of s are calculated at equilibrium n*. The matrix of the kinetic coefficients $L_{ij}$ should be symmetric (Onsager reciprocal relations) and positive definite (for the entropy growth).

The transport equations are

$$\frac{\partial n_i}{\partial t} = -\operatorname{div} \mathbf{J}_i = -\sum_{j \geq 0} L_{ij} \operatorname{div} X_j = \sum_{k \geq 0} \left[ -\sum_{j \geq 0} L_{ij} \left. \frac{\partial^2 s(n)}{\partial n_j\, \partial n_k} \right|_{n=n^*} \right] \Delta n_k.$$

Here, all the indexes i, j, k=0,1,2,... are related to the internal energy (0) and various components. The expression in the square brackets is the matrix $D_{ik}$ of the diffusion (i,k>0), thermodiffusion (i>0, k=0 or k>0, i=0) and thermal conductivity (i=k=0) coefficients.

Under isothermal conditions T=const. The relevant thermodynamic potential is the free energy (or the free entropy). The thermodynamic driving forces for the isothermal diffusion are antigradients of chemical potentials, $-(1/T) \nabla \mu_j$, and the matrix of diffusion coefficients is

$$D_{ik} = \frac{1}{T} \sum_{j \geq 1} L_{ij} \left. \frac{\partial \mu_j(n,T)}{\partial n_k} \right|_{n=n^*}$$

(i,k>0).

There is intrinsic arbitrariness in the definition of the thermodynamic forces and kinetic coefficients because they are not measurable separately and only their combinations can be measured. For example, in the original work of Onsager[10] the thermodynamic forces include an additional multiplier T, whereas in the Course of Theoretical Physics[11] this multiplier is omitted but the sign of the thermodynamic forces is opposite. All these changes are supplemented by the corresponding changes in the coefficients and do not affect the measurable quantities.

Nondiagonal diffusion must be nonlinear


The formalism of linear irreversible thermodynamics (Onsager) generates the systems of linear diffusion equations in the form

$$\frac{\partial c_i}{\partial t} = \sum_j \operatorname{div}(D_{ij} \operatorname{grad} c_j).$$

If the matrix of diffusion coefficients is diagonal then this system of equations is just a collection of decoupled Fick's equations for various components. Assume that diffusion is non-diagonal, for example, $D_{21} \neq 0$, and consider the state with $c_2 = \ldots = c_n = 0$. At this state, $\partial c_2 / \partial t = D_{21} \Delta c_1$. If $D_{21} \Delta c_1(x) < 0$ at some points then $c_2(x)$ becomes negative at these points in a short time. Therefore, linear non-diagonal diffusion does not preserve positivity of concentrations. Non-diagonal equations of multicomponent diffusion must be non-linear.[9]
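The loss of positivity can be seen in a single time step of a discretised two-component system. The sketch below is only an illustration: the coefficients, the periodic grid and the Gaussian bump for the first component are assumed values, and the cross coefficient is written `D21` because it multiplies the Laplacian of the first component in the equation for the second.

```python
import math


def laplacian(c, h):
    """Second difference on a periodic 1D grid."""
    n = len(c)
    return [(c[(i + 1) % n] - 2.0 * c[i] + c[i - 1]) / (h * h) for i in range(n)]


# Assumed illustrative coefficients: D21 couples c1 into the equation for c2
D21, D22 = 0.5, 1.0
n_cells, h, dt = 50, 0.1, 1e-3

x = [i * h for i in range(n_cells)]
c1 = [math.exp(-((xi - 2.5) ** 2)) for xi in x]  # positive bump of component 1
c2 = [0.0] * n_cells                              # component 2 is absent

lap1 = laplacian(c1, h)
lap2 = laplacian(c2, h)

# One forward Euler step of dc2/dt = D21 * Lap(c1) + D22 * Lap(c2)
c2_new = [c2[i] + dt * (D21 * lap1[i] + D22 * lap2[i]) for i in range(n_cells)]
min_c2 = min(c2_new)
print(min_c2)
```

Near the maximum of the bump the Laplacian of the first component is negative, so the second concentration, which started at zero, is pushed below zero after an arbitrarily short time. This is the behaviour the linear non-diagonal model cannot avoid.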

Einstein's mobility and Teorell formula


The Einstein relation (kinetic theory) connects the diffusion coefficient and the mobility (the ratio of the particle's terminal drift velocity to an applied force)[12]

$$D = \mu\, k_B T,$$

where D is the diffusion constant; $\mu$ is the "mobility"; kB is Boltzmann's constant; T is the absolute temperature.

Below, to combine in the same formula the chemical potential $\mu$ and the mobility, we use for mobility the notation $\mathfrak{m}$.

The mobility-based approach was further applied by T. Teorell.[13] In 1935, he studied the diffusion of ions through a membrane. He formulated the essence of his approach in the formula: the flux is equal to mobility × concentration × force per gram-ion. This is the so-called Teorell formula.

The force under isothermal conditions consists of two parts:
1. Diffusion force caused by concentration gradient: $-RT \frac{1}{n} \nabla n = -RT\, \nabla \ln(n/n^{\mathrm{eq}})$.
2. Electrostatic force caused by electric potential gradient: $-q \nabla \varphi$.

Here R is the gas constant, T is the absolute temperature, n is the concentration, the equilibrium concentration is marked by a superscript "eq", q is the charge and $\varphi$ is the electric potential.

The simple but crucial difference between the Teorell formula and the Onsager laws is the concentration factor in the Teorell expression for the flux. In the Einstein-Teorell approach, if for a finite force the concentration tends to zero then the flux also tends to zero, whereas the Onsager equations violate this simple and physically obvious rule.

The general formulation of the Teorell formula for non-perfect systems under isothermal conditions is[9]

$$\mathbf{J} = \mathfrak{m} \exp\left(\frac{\mu - \mu_0}{RT}\right) \left(-\nabla \mu + \text{(external force per gram-ion)}\right),$$

where $\mu$ is the chemical potential and $\mu_0$ is the standard value of the chemical potential. The expression $a = \exp\left(\frac{\mu - \mu_0}{RT}\right)$ is the so-called activity. It measures the "effective concentration" of a species in a non-ideal mixture. In this notation, the Teorell formula for the flux has a very simple form[9]

$$\mathbf{J} = \mathfrak{m}\, a \left(-\nabla \mu + \text{(external force per gram-ion)}\right).$$

The standard derivation of the activity includes a normalization factor and for small concentrations $a = n/n^{\ominus} + o(n/n^{\ominus})$, where $n^{\ominus}$ is the standard concentration. Therefore this formula for the flux describes the flux of the normalized dimensionless quantity, $n/n^{\ominus}$:

$$\frac{\partial (n/n^{\ominus})}{\partial t} = \nabla \cdot \left[\mathfrak{m}\, a \left(\nabla \mu - \text{(external force per gram-ion)}\right)\right].$$

Teorell formula for multicomponent diffusion

The Teorell formula combined with Onsager's definition of the diffusion force gives

$$\mathbf{J}_i = \mathfrak{m}_i\, a_i \sum_j L_{ij} X_j,$$

where $\mathfrak{m}_i$ is the mobility of the ith component, $a_i$ is its activity, $L_{ij}$ is the matrix of the coefficients, and $X_j$ is the thermodynamic diffusion force, $X_j = -\nabla \frac{\mu_j}{T}$. For the isothermal perfect systems, $X_j = -R \frac{\nabla n_j}{n_j}$. Therefore, the Einstein-Teorell approach gives the following multicomponent generalization of Fick's law for multicomponent diffusion, written in the vector calculus div-grad form:

$$\frac{\partial n_i}{\partial t} = \sum_j \nabla \cdot \left(D_{ij} \frac{n_i}{n_j} \nabla n_j\right),$$

where $D_{ij}$ is the matrix of coefficients. The Chapman-Enskog formulas for diffusion in gases include exactly the same terms. Earlier, such terms were introduced in the Maxwell–Stefan diffusion equation.

Jumps on the surface and in solids


Diffusion of reagents on the surface of a catalyst may play an important role in heterogeneous catalysis. The model of diffusion in the ideal monolayer is based on the jumps of the reagents to the nearest free places. This model was used for CO on Pt oxidation under low gas pressure.

Figure: Diffusion in the monolayer: oscillations near temporary equilibrium positions and jumps to the nearest free places.

The system includes several reagents $A_1, A_2, \ldots, A_m$ on the surface. Their surface concentrations are $c_1, c_2, \ldots, c_m$. The surface is a lattice of the adsorption places. Each reagent molecule fills a place on the surface. Some of the places are free. The concentration of the free places is $z = c_0$. The sum of all $c_i$ (including free places) is constant, the density of adsorption places b.

The jump model gives for the diffusion flux of $A_i$ (i=1,...,m):

$$\mathbf{J}_i = -D_i \left[z \nabla c_i - c_i \nabla z\right].$$

The corresponding diffusion equation is:[9]

$$\frac{\partial c_i}{\partial t} = -\operatorname{div} \mathbf{J}_i = D_i \left[z \Delta c_i - c_i \Delta z\right].$$

Due to the conservation law, $z = b - \sum_{i=1}^{m} c_i$, and we have the system of m diffusion equations. For one component we get Fick's law and linear equations because $(b - c) \nabla c + c \nabla c = b \nabla c$. For two and more components the equations are nonlinear.

If all particles can exchange their positions with their closest neighbours then a simple generalization gives

$$\mathbf{J}_i = -\sum_j D_{ij} \left[c_j \nabla c_i - c_i \nabla c_j\right],$$

$$\frac{\partial c_i}{\partial t} = \sum_j D_{ij} \left[c_j \Delta c_i - c_i \Delta c_j\right],$$

where $D_{ij} = D_{ji} \geq 0$ is a symmetric matrix of coefficients which characterize the intensities of jumps. The free places (vacancies) should be considered as special "particles" with concentration $c_0$.

Various versions of these jump models are also suitable for simple diffusion mechanisms in solids.

Diffusion in porous media


For diffusion in porous media the basic equations are:[14]

$$\mathbf{J} = -D \nabla n^m, \qquad \frac{\partial n}{\partial t} = D \Delta n^m,$$

where D is the diffusion coefficient, n is the concentration, m>0 (usually m>1, the case m=1 corresponds to Fick's law).

For diffusion of gases in porous media this equation is the formalisation of Darcy's law: the velocity of a gas in the porous media is

$$v = -\frac{k}{\mu} \nabla p,$$

where k is the permeability of the medium, $\mu$ is the viscosity and p is the pressure. The flux J=nv and for $p \sim n^{\gamma}$ Darcy's law gives the equation of diffusion in porous media with $m = \gamma + 1$.

For underground water infiltration the Boussinesq approximation gives the same equation with m=2.

For plasma with a high level of radiation the Zeldovich–Raizer equation gives m>4 for the heat transfer.
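The step from Darcy's law to the porous medium equation with exponent $\gamma + 1$ can be written out explicitly. The following is a sketch under stated assumptions: a power-law equation of state $p = A n^{\gamma}$ with a constant $A$ (a symbol introduced here only for illustration), and constant permeability $k$ and viscosity $\mu$.

```latex
\begin{aligned}
\mathbf{J} &= n v = -\frac{k}{\mu}\, n \nabla p
   = -\frac{k A \gamma}{\mu}\, n^{\gamma} \nabla n
   = -\frac{k A \gamma}{\mu (\gamma + 1)}\, \nabla n^{\gamma + 1}, \\
\frac{\partial n}{\partial t} &= -\nabla \cdot \mathbf{J}
   = \frac{k A \gamma}{\mu (\gamma + 1)}\, \Delta n^{\gamma + 1}.
\end{aligned}
```

Comparing with $\partial n / \partial t = D \Delta n^m$ gives $m = \gamma + 1$ and, under these assumptions, $D = k A \gamma / (\mu (\gamma + 1))$.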

Diffusion in physics
Elementary theory of diffusion coefficient in gases
The diffusion coefficient $D$ is the coefficient in Fick's first law $J = -D\, \partial n / \partial x$, where J is the diffusion flux (amount of substance) per unit area per unit time, n (for ideal mixtures) is the concentration, x is the position [length].

Let us consider two gases with molecules of the same diameter d and mass m (self-diffusion). In this case, the elementary mean free path theory of diffusion gives for the diffusion coefficient

$$D = \frac{1}{3} \ell v_T = \frac{2}{3} \sqrt{\frac{k_B^3}{\pi^3 m}}\, \frac{T^{3/2}}{P d^2},$$

where kB is the Boltzmann constant, T is the temperature, P is the pressure, $\ell$ is the mean free path, and vT is the mean thermal speed:

$$\ell = \frac{k_B T}{\sqrt{2}\, \pi d^2 P}, \qquad v_T = \sqrt{\frac{8 k_B T}{\pi m}}.$$

We can see that the diffusion coefficient in the mean free path approximation grows with T as $T^{3/2}$ and decreases with P as 1/P. If we use for P the ideal gas law P=RnT with the total concentration n, then we can see that for given concentration n the diffusion coefficient grows with T as $T^{1/2}$ and for given temperature it decreases with the total concentration as 1/n.

For two different gases, A and B, with molecular masses mA, mB and molecular diameters dA, dB, the mean free path estimate of the diffusion coefficient of A in B and B in A is:

$$D_{AB} = \frac{2}{3} \sqrt{\frac{k_B^3}{\pi^3}}\, \sqrt{\frac{1}{2 m_A} + \frac{1}{2 m_B}}\; \frac{4\, T^{3/2}}{P (d_A + d_B)^2}.$$
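The mean free path estimate is easy to evaluate numerically. The sketch below computes the self-diffusion coefficient from the mean free path and the mean thermal speed and checks the $T^{3/2}$ scaling at fixed pressure. The molecular diameter and mass are assumed round values of roughly the size and mass of a nitrogen molecule, chosen only for illustration; the function names are likewise hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def mean_free_path(T, P, d):
    """Mean free path k_B T / (sqrt(2) pi d^2 P), in metres."""
    return K_B * T / (math.sqrt(2.0) * math.pi * d * d * P)


def mean_thermal_speed(T, m):
    """Mean thermal speed sqrt(8 k_B T / (pi m)), in m/s."""
    return math.sqrt(8.0 * K_B * T / (math.pi * m))


def self_diffusion_coefficient(T, P, d, m):
    """Elementary estimate D = (1/3) * mean free path * mean thermal speed."""
    return mean_free_path(T, P, d) * mean_thermal_speed(T, m) / 3.0


# Assumed illustrative values, roughly representative of nitrogen
d = 3.7e-10       # molecular diameter, m
m = 4.65e-26      # molecular mass, kg
P = 101325.0      # pressure, Pa

D_300 = self_diffusion_coefficient(300.0, P, d, m)
D_600 = self_diffusion_coefficient(600.0, P, d, m)

# Closed form (2/3) sqrt(k_B^3 / (pi^3 m)) T^(3/2) / (P d^2) for comparison
D_closed = (2.0 / 3.0) * math.sqrt(K_B ** 3 / (math.pi ** 3 * m)) * 300.0 ** 1.5 / (P * d * d)

print(D_300, D_600 / D_300, D_closed)
```

For these inputs the estimate is of the order of 10^-5 m²/s, which is the typical magnitude for gases at ambient conditions, and doubling the temperature at fixed pressure multiplies D by $2^{3/2}$.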

The theory of diffusion in gases based on Boltzmann's equation


In Boltzmann's kinetics of the mixture of gases, each gas has its own distribution function, $f_i(x,c,t)$, where t is the time moment, x is position and c is velocity of a molecule of the ith component of the mixture. Each component has its mean velocity $C_i(x,t) = \frac{1}{n_i} \int_c c\, f_i(x,c,t)\, dc$. If the velocities $C_i(x,t)$ do not coincide then there exists diffusion.

In the Chapman-Enskog approximation, all the distribution functions are expressed through the densities of the conserved quantities:[7]
individual concentrations of particles, $n_i(x,t) = \int_c f_i(x,c,t)\, dc$ (particles per volume),
density of momentum $\sum_i m_i n_i C_i(x,t)$ (mi is the ith particle mass),
density of kinetic energy $\sum_i \left( n_i \frac{m_i C_i^2(x,t)}{2} + \int_c \frac{m_i (c - C_i(x,t))^2}{2} f_i(x,c,t)\, dc \right)$.

The kinetic temperature T and pressure P are defined in 3D space as

$$\frac{3}{2} k_B T = \frac{1}{n} \sum_i \int_c \frac{m_i (c - C_i(x,t))^2}{2} f_i(x,c,t)\, dc; \qquad P = k_B n T,$$

where $n = \sum_i n_i$ is the total density.

For two gases, the difference between velocities, $C_1 - C_2$, is given by the expression:[7]

$$C_1 - C_2 = -\frac{n^2}{n_1 n_2} D_{12} \left\{ \nabla \left(\frac{n_1}{n}\right) + \frac{n_1 n_2 (m_2 - m_1)}{n (m_1 n_1 + m_2 n_2)} \nabla \ln P - \frac{m_1 n_1 m_2 n_2}{P (m_1 n_1 + m_2 n_2)} (F_1 - F_2) + k_T \frac{1}{T} \nabla T \right\},$$

where $F_i$ is the force applied to the molecules of the ith component and $k_T$ is the thermodiffusion ratio.

The coefficient D12 is positive. This is the diffusion coefficient. Four terms in the formula for C1-C2 describe four main effects in the diffusion of gases:
1. $\nabla (n_1/n)$ describes the flux of the first component from the areas with the high ratio n1/n to the areas with lower values of this ratio (and, analogously, the flux of the second component from high n2/n to low n2/n because n2/n=1-n1/n);
2. $\frac{n_1 n_2 (m_2 - m_1)}{n (m_1 n_1 + m_2 n_2)} \nabla \ln P$ describes the flux of the heavier molecules to the areas with higher pressure and the lighter molecules to the areas with lower pressure; this is barodiffusion;
3. $\frac{m_1 n_1 m_2 n_2}{P (m_1 n_1 + m_2 n_2)} (F_1 - F_2)$ describes diffusion caused by the difference of the forces applied to molecules of different types. For example, in the Earth's gravitational field, the heavier molecules should go down, or in an electric field the charged molecules should move, until this effect is equilibrated by the sum of the other terms. This effect should not be confused with barodiffusion caused by the pressure gradient;
4. $k_T \frac{1}{T} \nabla T$ describes thermodiffusion, the diffusion flux caused by the temperature gradient.

All these effects are called diffusion because they describe the differences between velocities of different components in the mixture. Therefore, these effects cannot be described as a bulk transport and differ from advection or convection.

In the first approximation,[7]

$$D_{12} = \frac{3}{2 n (d_1 + d_2)^2} \left[\frac{k_B T (m_1 + m_2)}{2 \pi m_1 m_2}\right]^{1/2}$$

for rigid spheres;

$$D_{12} = \frac{3}{8 n A_1(\nu)\, \Gamma\!\left(3 - \frac{2}{\nu - 1}\right)} \left[\frac{k_B T (m_1 + m_2)}{2 \pi m_1 m_2}\right]^{1/2} \left(\frac{2 k_B T}{\kappa_{12}}\right)^{\frac{2}{\nu - 1}}$$

for repulsing force $\kappa_{12} r^{-\nu}$.

The number $A_1(\nu)$ is defined by quadratures (formulas (3.7), (3.9), Ch. 10 of the classical Chapman and Cowling book[7]).

We can see that the dependence on T for the rigid spheres is the same as for the simple mean free path theory, but for the power repulsion laws the exponent is different. The dependence on the total concentration n for a given temperature always has the same character, 1/n.

In applications to gas dynamics, the diffusion flux and the bulk flow should be joined in one system of transport equations. The bulk flow describes the mass transfer. Its velocity V is the mass average velocity. It is defined through the momentum density and the mass concentrations:

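$$V = \frac{\sum_i \rho_i C_i}{\rho},$$

where $\rho_i = m_i n_i$ is the mass concentration of the ith species and $\rho = \sum_i \rho_i$ is the mass density.

By definition, the diffusion velocity of the ith component is $v_i = C_i - V$, $\sum_i \rho_i v_i = 0$. The mass transfer of the ith component is described by the continuity equation

$$\frac{\partial \rho_i}{\partial t} + \nabla (\rho_i V) + \nabla (\rho_i v_i) = W_i,$$

where $W_i$ is the net mass production rate in chemical reactions, $\sum_i W_i = 0$.

In these equations, the term $\nabla (\rho_i V)$ describes advection of the ith component and the term $\nabla (\rho_i v_i)$ represents diffusion of this component.

In 1948, Wendell H. Furry proposed to use the form of the diffusion rates found in kinetic theory as a framework for the new phenomenological approach to diffusion in gases. This approach was developed further by F.A. Williams and S.H. Lam.[15] For the diffusion velocities in multicomponent gases (N components) they used

$$v_i = -\left( \sum_{j=1}^{N} D_{ij}\, \mathbf{d}_j + D_i^{(T)}\, \nabla (\ln T) \right);$$

$$\mathbf{d}_j = \nabla X_j + (X_j - Y_j)\, \nabla (\ln P) + \mathbf{g}_j;$$

$$\mathbf{g}_j = \frac{\rho}{P} \left( Y_j \sum_{k=1}^{N} Y_k (f_k - f_j) \right).$$

Here, $D_{ij}$ is the diffusion coefficient matrix, $D_i^{(T)}$ is the thermal diffusion coefficient, $f_i$ is the body force per unit mass acting on the ith species, $X_i = P_i / P$ is the partial pressure fraction of the ith species (and $P_i$ is the partial pressure), $Y_i = \rho_i / \rho$ is the mass fraction of the ith species, and $\sum_i X_i = \sum_i Y_i = 1$.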
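The definitions of the mass average velocity and of the diffusion velocities can be checked on a small example. The sketch below uses two species with assumed (hypothetical) partial densities and mean velocities, computes the mass average velocity and the diffusion velocities relative to it, and confirms that the mass-weighted diffusion velocities sum to zero, as the definition requires.

```python
def mass_average_velocity(rho, C):
    """Mass average velocity; rho holds partial mass densities, C the mean velocities."""
    total = sum(rho)
    dim = len(C[0])
    return tuple(sum(r * c[k] for r, c in zip(rho, C)) / total for k in range(dim))


def diffusion_velocities(rho, C):
    """Diffusion velocity of each species: its mean velocity minus the mass average."""
    V = mass_average_velocity(rho, C)
    return [tuple(c[k] - V[k] for k in range(len(V))) for c in C]


# Assumed illustrative values
rho = [1.0, 3.0]                           # partial mass densities, kg/m^3
C = [(2.0, 0.0, 0.0), (0.0, 0.0, 0.0)]     # mean velocities of the species, m/s

V = mass_average_velocity(rho, C)
v = diffusion_velocities(rho, C)

# The mass-weighted sum of diffusion velocities vanishes by construction
residual = [sum(r * vi[k] for r, vi in zip(rho, v)) for k in range(3)]
print(V, v, residual)
```

The lighter-loaded species moves faster than the mixture and the other moves backwards relative to it, so diffusion redistributes the components without contributing to the net mass flow.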

Separation of diffusion from convection in gases


While Brownian motion of multi-molecular mesoscopic particles (like pollen grains studied by Brown) is observable under an optical microscope, molecular diffusion can only be probed in carefully controlled experimental conditions. Since Graham's experiments, it has been well known that avoiding convection is necessary, and this may be a non-trivial task.

Under normal conditions, molecular diffusion dominates only on length scales between nanometer and millimeter. On larger length scales, transport in liquids and gases is normally due to another transport phenomenon, convection, and to study diffusion on the larger scale, special efforts are needed.

Figure: The above palette shows the change in excess carriers being generated (green: electrons and purple: holes) with increasing light intensity (generation rate /cm3) at the center of an intrinsic semiconductor bar. Electrons have a higher diffusion constant than holes, leading to fewer excess electrons at the center as compared to holes.

Therefore, some often cited examples of diffusion are wrong: if cologne is sprayed in one place, it will soon be smelled in the entire room, but a simple calculation shows that this can't be due to diffusion. Convective motion persists in the room because of the temperature inhomogeneity. If ink is dropped in water, one usually observes an inhomogeneous evolution of the spatial distribution, which clearly indicates convection (caused, in particular, by this dropping).

In contrast, heat conduction through solid media is an everyday occurrence (e.g. a metal spoon partly immersed in a hot liquid). This explains why the diffusion of heat was explained mathematically before the diffusion of mass.
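The "simple calculation" for the cologne example can be sketched as follows. For pure diffusion in three dimensions the root mean square displacement after time t is $\sqrt{6 D t}$, so the time needed to spread over a distance L is about $L^2 / (6D)$. The diffusion coefficient used below is the value of 16 mm²/s quoted for carbon dioxide in air, taken here as a stand-in for a fragrance molecule; that substitution and the room size of 3 m are assumptions made only for illustration.

```python
def diffusion_time(distance, D, dimensions=3):
    """Time for the r.m.s. displacement sqrt(2 * dimensions * D * t) to reach `distance`."""
    return distance ** 2 / (2.0 * dimensions * D)


D_air = 1.6e-5   # m^2/s, i.e. 16 mm^2/s (stand-in value, see lead-in)
room = 3.0       # m, assumed distance across a room

t_seconds = diffusion_time(room, D_air)
t_hours = t_seconds / 3600.0
print(t_seconds, t_hours)
```

Under these assumptions diffusion alone would need roughly a day to carry the scent across the room, whereas in practice it is noticed within minutes, which points to convection as the dominant transport mechanism.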

Other types of diffusion


Anisotropic diffusion, also known as the Perona–Malik equation, enhances high gradients
Anomalous diffusion,[16] in porous medium
Atomic diffusion, in solids
Eddy diffusion, in coarse-grained description of turbulent flow
Effusion of a gas through small holes
Electronic diffusion, resulting in an electric current called the diffusion current
Facilitated diffusion, present in some organisms
Gaseous diffusion, used for isotope separation
Heat equation, diffusion of thermal energy
Itō diffusion, mathematisation of Brownian motion, continuous stochastic process
Knudsen diffusion of gas in long pores with frequent wall collisions
Momentum diffusion, e.g. the diffusion of the hydrodynamic velocity field
Photon diffusion
Random walk,[17] model for diffusion
Reverse diffusion, against the concentration gradient, in phase separation
Rotational diffusion, random reorientations of molecules
Surface diffusion, diffusion of adparticles on a surface
Turbulent diffusion, transport of mass, heat, or momentum within a turbulent fluid

References
[1] J. Philibert (2005). One and a half century of diffusion: Fick, Einstein, before and beyond. (http://www.rz.uni-leipzig.de/diffusion/pdf/volume2/diff_fund_2(2005)1.pdf) Diffusion Fundamentals, 2, 1.1-1.10.
[2] S.R. De Groot, P. Mazur (1962). Non-equilibrium Thermodynamics. North-Holland, Amsterdam.
[3] A. Einstein (1905), Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen (http://www.zbp.univie.ac.at/dokumente/einstein2.pdf). Ann. Phys., 17, 549-560.
[4] Diffusion Processes, Thomas Graham Symposium, ed. J.N. Sherwood, A.V. Chadwick, W.M. Muir, F.L. Swinton, Gordon and Breach, London, 1971.
[5] L.W. Barr (1997), In: Diffusion in Materials, DIMAT 96, ed. H. Mehrer, Chr. Herzig, N.A. Stolwijk, H. Bracht, Scitec Publications, Vol. 1, pp. 1-9.
[6] H. Mehrer, N.A. Stolwijk (2009). Heroes and Highlights in the History of Diffusion (http://www.uni-leipzig.de/diffusion/pdf/volume11/diff_fund_11(2009)1.pdf), Diffusion Fundamentals, 11, 1, 1-32.
[7] S. Chapman, T. G. Cowling (1970), The Mathematical Theory of Non-uniform Gases: An Account of the Kinetic Theory of Viscosity, Thermal Conduction and Diffusion in Gases, Cambridge University Press (3rd edition).
[8] J.F. Kincaid, H. Eyring, A.E. Stearn (1941), The theory of absolute reaction rates and its application to viscosity and diffusion in the liquid State. Chem. Rev., 28, 301-365.
[9] A.N. Gorban, H.P. Sargsyan and H.A. Wahab (2011), Quasichemical Models of Multicomponent Nonlinear Diffusion (http://arxiv.org/pdf/1012.2908v4.pdf), Mathematical Modelling of Natural Phenomena (http://journals.cambridge.org/action/displayJournal?jid=MNP), Volume 6 / Issue 05, 184-262.
[10] Onsager, L. (1931), Reciprocal relations in irreversible processes. (http://prola.aps.org/abstract/PR/v37/i4/p405_1) I, Phys. Rev. 37, 405-426; II 38, 2265-2279
[11] L.D. Landau, E.M. Lifshitz (1980). Statistical Physics. Vol. 5 (3rd ed.). Butterworth-Heinemann. ISBN 978-0-7506-3372-7.
[12] S. Bromberg, K.A. Dill (2002), Molecular Driving Forces: Statistical Thermodynamics in Chemistry and Biology (http://books.google.com/books?id=hdeODhjp1bUC&pg=PA327), Garland Science.
[13] T. Teorell (1935). Studies on the "diffusion effect" upon ionic distribution-I Some theoretical considerations. Proc. N. A. S. USA, 21, 152-161.
[14] J. L. Vázquez (2006), The Porous Medium Equation. Mathematical Theory, Oxford Univ. Press.
[15] S. H. Lam (2006), Multicomponent diffusion revisited, Physics of Fluids 18, 073101.
[16] D. Ben-Avraham and S. Havlin (2000). Diffusion and Reactions in Fractals and Disordered Systems (http://havlin.biu.ac.il/Shlomo%20Havlin%20books_d_r.php). Cambridge University Press.
[17] Weiss, G. (1994). Aspects and Applications of the Random Walk. North-Holland.

External links
Diffusion in a Bipolar Junction Transistor Demo (http://codingzebra.com/DiffusionDemo.htm)
Diffusion Furnace (http://www.crystec.com/klldiffe.htm) for doping of semiconductor wafers. POCl3 doping of silicon.
A Java applet implementing Diffusion (http://oscar.iitb.ac.in/availableProposalsAction1.do?type=av&id=683&language=english)

Mass diffusivity

Diffusivity or diffusion coefficient is a proportionality constant between the molar flux due to molecular diffusion and the gradient in the concentration of the species (or the driving force for diffusion). Diffusivity is encountered in Fick's law and numerous other equations of physical chemistry. It is generally prescribed for a given pair of species. For a multi-component system, it is prescribed for each pair of species in the system. The higher the diffusivity (of one substance with respect to another), the faster they diffuse into each other. This coefficient has an SI unit of m²/s (length²/time). In CGS units it is given in cm²/s.

Temperature dependence of the diffusion coefficient


Typically, a compound's diffusion coefficient is about 10,000 times as great in air as in water. Carbon dioxide in air has a diffusion coefficient of 16 mm²/s, and in water its diffusion coefficient is 0.0016 mm²/s.[1][2] The diffusion coefficient in solids at different temperatures is often found to be well predicted by

$$D = D_0 \, e^{-E_A/(RT)}$$

where
$D$ is the diffusion coefficient
$D_0$ is the maximum diffusion coefficient (at infinite temperature)
$E_A$ is the activation energy for diffusion, in dimensions of [energy (amount of substance)⁻¹]
$T$ is the temperature in units of [absolute temperature] (kelvins or degrees Rankine)
$R$ is the gas constant, in dimensions of [energy temperature⁻¹ (amount of substance)⁻¹]

An equation of this form is known as the Arrhenius equation. An approximate dependence of the diffusion coefficient on temperature in liquids can often be found using the Stokes–Einstein equation, which predicts that

$$\frac{D_{T_1}}{D_{T_2}} = \frac{T_1}{T_2} \, \frac{\mu_{T_2}}{\mu_{T_1}}$$

where
$T_1$ and $T_2$ denote temperatures 1 and 2, respectively
$D$ is the diffusion coefficient (cm²/s)
$T$ is the absolute temperature (K)
$\mu$ is the dynamic viscosity of the solvent (Pa·s)

The dependence of the diffusion coefficient on temperature for gases can be expressed using the Chapman–Enskog theory (predictions accurate on average to about 8%):[3]

$$D = \frac{1.858 \times 10^{-3} \, T^{3/2} \sqrt{1/M_1 + 1/M_2}}{p \, \sigma_{12}^2 \, \Omega}$$

where
1 and 2 index the two kinds of molecules present in the gaseous mixture
$T$ is the temperature (K)
$M$ is the molar mass (g/mol)
$p$ is the pressure (atm)
$\sigma_{12} = \frac{1}{2}(\sigma_1 + \sigma_2)$ is the average collision diameter (the values are tabulated[4]) (Å)
$\Omega$ is a temperature-dependent collision integral (the values are tabulated[4] but usually of order 1) (dimensionless)
$D$ is the diffusion coefficient (which is expressed in cm²/s when the other magnitudes are expressed in the units given above[3][5])
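The three temperature dependences described in this section (Arrhenius for solids, Stokes–Einstein for liquids, Chapman–Enskog for gases) can be sketched numerically as follows. This is a minimal illustration, not a reference implementation: the function names are made up for this example, every input value is a hypothetical placeholder, and the Chapman–Enskog prefactor 1.858×10⁻³ is the value commonly quoted for the unit system listed above (K, g/mol, atm, Å, giving cm²/s).

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def arrhenius(d0, e_a, temperature):
    """Solids: D = D0 * exp(-E_A / (R*T)), with E_A in J/mol and T in K."""
    return d0 * math.exp(-e_a / (R * temperature))


def stokes_einstein_ratio(t1, t2, mu1, mu2):
    """Liquids: D(T1)/D(T2) = (T1/T2) * (mu(T2)/mu(T1))."""
    return (t1 / t2) * (mu2 / mu1)


def chapman_enskog(temperature, m1, m2, pressure, sigma12, omega):
    """Gases: binary diffusion coefficient in cm^2/s.

    Units: temperature in K, molar masses in g/mol, pressure in atm,
    sigma12 (average collision diameter) in angstroms, omega dimensionless.
    """
    return (1.858e-3 * temperature ** 1.5 * math.sqrt(1.0 / m1 + 1.0 / m2)
            / (pressure * sigma12 ** 2 * omega))


# All numbers below are hypothetical inputs chosen for illustration only;
# sigma12 and omega would normally be looked up in tables.
d_solid_800 = arrhenius(d0=1.0e-5, e_a=100.0e3, temperature=800.0)
d_solid_1000 = arrhenius(d0=1.0e-5, e_a=100.0e3, temperature=1000.0)
liquid_ratio = stokes_einstein_ratio(t1=310.0, t2=298.0, mu1=0.69e-3, mu2=0.89e-3)
d_gas = chapman_enskog(temperature=300.0, m1=28.0, m2=32.0,
                       pressure=1.0, sigma12=3.6, omega=1.0)

print(d_solid_800, d_solid_1000)
print(liquid_ratio)
print(d_gas)
```

With the collision integral held fixed, the gas expression scales as T^(3/2) and inversely with pressure, whereas the solid-state expression grows exponentially with temperature.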

Pressure dependence of the diffusion coefficient


For self-diffusion in gases at two different pressures (but the same temperature), the following empirical equation has been suggested:[3]

$$\frac{D_{P_1}}{D_{P_2}} = \frac{\rho_{P_2}}{\rho_{P_1}}$$

where
$P_1$ and $P_2$ denote pressures 1 and 2, respectively
$D$ is the diffusion coefficient (m²/s)
$\rho$ is the gas mass density (kg/m³)

Effective diffusivity in porous media


The effective diffusion coefficient describes diffusion through the pore space of porous media.[6] It is macroscopic in nature, because it is not individual pores but the entire pore space that needs to be considered. The effective diffusion coefficient for transport through the pores, $D_e$, is estimated as follows:

$$D_e = \frac{D \, \varepsilon_t \, \delta}{\tau}$$

where
$D$ is the diffusion coefficient in the gas or liquid filling the pores (m²·s⁻¹)
$\varepsilon_t$ is the porosity available for the transport (dimensionless)
$\delta$ is the constrictivity (dimensionless)
$\tau$ is the tortuosity (dimensionless)

The transport-available porosity equals the total porosity less the pores which, due to their size, are not accessible to the diffusing particles, and less dead-end and blind pores (i.e., pores that are not connected to the rest of the pore system). The constrictivity describes the slowing down of diffusion by increased viscosity in narrow pores, a result of the greater average proximity to the pore wall. It is a function of pore diameter and the size of the diffusing particles.
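As a small worked illustration of the estimate above (effective diffusivity equals the free diffusivity times transport-available porosity times constrictivity, divided by tortuosity), the sketch below scales a free-solution diffusivity down to an effective one. The function name and the pore-structure parameters are hypothetical; only the free diffusivity is taken from the example values listed in this article (oxygen in water, 2.1×10⁻⁵ cm²/s).

```python
def effective_diffusivity(d, porosity, constrictivity, tortuosity):
    """Return D_e = D * eps_t * delta / tau (same units as d)."""
    if not 0.0 <= porosity <= 1.0:
        raise ValueError("porosity must lie between 0 and 1")
    if tortuosity <= 0.0:
        raise ValueError("tortuosity must be positive")
    return d * porosity * constrictivity / tortuosity


# Free diffusivity of dissolved oxygen in water at 25 C, converted to m^2/s.
d_free = 2.1e-5 * 1.0e-4  # cm^2/s -> m^2/s

# Hypothetical pore-structure parameters, for illustration only.
d_eff = effective_diffusivity(d_free, porosity=0.3, constrictivity=0.8, tortuosity=2.0)
print(d_eff)
```

With these assumed parameters the effective value is roughly an order of magnitude below the free-solution value.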

Example values
Gases at 1 atm., solutes in liquid at infinite dilution. Legend: (s) solid; (l) liquid; (g) gas; (dis) dissolved.

Values of diffusion coefficients

Species pair (solute - solvent)   Temperature (°C)   D (cm²/s)     Reference
air(g) - H2O(g)                   25                 0.282         [3]
air(g) - oxygen(g)                25                 0.176         [3]
oxygen(dis) - water(l)            25                 2.1×10⁻⁵      [3]
hydrogen(dis) - water(l)          25                 4.5×10⁻⁵      [3]
hydrogen - iron(s)                10                 1.66×10⁻⁹     [3]
hydrogen - iron(s)                100                124×10⁻⁹      [3]
aluminium - copper(s)             20                 1.3×10⁻³⁰     [3]

References
[1] CRC Press Online: CRC Handbook of Chemistry and Physics, Section 6, 91st Edition (http://www.crcpress.com/product/isbn/9781439820773)
[2] Diffusion (http://www.cco.caltech.edu/~brokawc/Bi145/Diffusion.html)
[3] Cussler, E. L. (1997). Diffusion: Mass Transfer in Fluid Systems (2nd ed.). New York: Cambridge University Press. ISBN 0-521-45078-0.
[4] Hirschfelder, J.; Curtiss, C. F.; Bird, R. B. (1954). Molecular Theory of Gases and Liquids. New York: Wiley. ISBN 0-471-40065-3.
[5] Welty, James R.; Wicks, Charles E.; Wilson, Robert E.; Rorrer, Gregory (2001). Fundamentals of Momentum, Heat, and Mass Transfer. Wiley. ISBN 978-0-470-12868-8.
[6] Grathwohl, P. (1998). Diffusion in natural porous media: Contaminant transport, sorption/desorption and dissolution kinetics. Kluwer Academic. ISBN 0-7923-8102-5.

Chemical potential
In thermodynamics, chemical potential, also known as partial molar free energy, is a form of potential energy that can be absorbed or released during a chemical reaction. It may also change during a phase transition. The chemical potential of a species in a mixture can be defined as the slope of the free energy of the system with respect to a change in the number of moles of just that species. Thus, it is the partial derivative of the free energy with respect to the amount of the species, with the amounts of all other species in the mixture and the temperature held constant. When pressure is also constant, chemical potential is the partial molar Gibbs free energy. At chemical equilibrium or in phase equilibrium, the sum of the chemical potentials weighted by the stoichiometric coefficients is zero, as the free energy is at a minimum.[1][2][3] In semiconductor physics, the chemical potential of a system of electrons is known as the Fermi level.[4]


Overview
Particles tend to move from higher chemical potential to lower chemical potential. In this way, chemical potential is a generalization of "potentials" in physics such as gravitational potential. When a ball rolls down a hill, it is moving from a higher gravitational potential (higher elevation) to a lower gravitational potential (lower elevation). In the same way, as molecules move, react, dissolve, melt, etc., they will always tend naturally to go from a higher chemical potential to a lower one, changing the particle number, which is the conjugate variable to chemical potential. A simple example is a system of dilute molecules diffusing in a homogeneous environment (see figure). In this system, the molecules tend to move from areas with high concentration to low concentration, until eventually the concentration is the same everywhere.

The microscopic explanation for this is based in kinetic theory and the random motion of molecules. However, it is simpler to describe the process in terms of chemical potentials: a molecule has a higher chemical potential in a higher-concentration area, and a lower chemical potential in a low-concentration area. Movement of molecules from higher chemical potential to lower chemical potential is accompanied by a release of free energy. Therefore it is a spontaneous process.

Another example is a glass of liquid water with ice cubes in it. Above 0 °C, an H2O molecule in the liquid phase has a lower chemical potential than a water molecule in an ice cube (solid phase). When some ice melts, H2O molecules migrate from solid to liquid, where their chemical potential is lower. Below 0 °C, the molecules in ice have a lower chemical potential, so the ice cubes grow. At the temperature of the melting point, 0 °C, the chemical potentials in water and ice are the same; the ice cubes neither grow nor shrink, and the system is in equilibrium.

A third example is illustrated by the chemical reaction of dissociation of a weak acid, such as acetic acid, HA, where A = CH3COO-.

HA ⇌ H+ + A-

Vinegar contains acetic acid. When acid molecules dissociate, the concentration of the undissociated acid molecules (HA) decreases and the concentrations of the product ions (H+ and A-) increase. Thus the chemical potential of HA decreases and the sum of the chemical potentials of H+ and A- increases. When the sums of the chemical potentials of reactants and products are equal, the system is at equilibrium and there is no tendency for the reaction to proceed in either the forward or backward direction. This explains why vinegar is acidic: acetic acid dissociates to some extent, releasing hydrogen ions into the solution.
Chemical potentials are important in many aspects of equilibrium chemistry, including melting, boiling, evaporation, solubility, osmosis, partition coefficient, liquid-liquid extraction and chromatography. In each case there is a characteristic constant which is a function of the chemical potentials of the species at equilibrium. In electrochemistry, ions do not always tend to go from higher to lower chemical potential, but they do always go from higher to lower electrochemical potential. The electrochemical potential completely characterizes all of the influences on an ion's motion, while the chemical potential includes everything except the electric force. (See below for more on this terminology.)

Figure: Initially, there are solute molecules on the left side of a barrier (purple line) and none on the right. The barrier is removed, and the solute diffuses to fill the whole container. Top: A single molecule moves around randomly. Middle: With more molecules, the solute fills the container somewhat more uniformly. Bottom: With an enormous number of solute molecules, all apparent randomness is gone: the solute appears to move smoothly and systematically from areas of high concentration to those of low, in accordance with Fick's laws.

Thermodynamic definitions
The fundamental equation of chemical thermodynamics for a system containing n constituent species, with the i-th species having $N_i$ particles, is, in terms of Gibbs energy,

$$dG = -S\,dT + V\,dP + \sum_{i=1}^{n} \mu_i\,dN_i$$

At constant temperature and pressure this simplifies to

$$dG = \sum_{i=1}^{n} \mu_i\,dN_i$$

The definition of chemical potential of the i-th species, $\mu_i$, follows by setting all the numbers N, apart from one, to be constant.

$$\mu_i = \left(\frac{\partial G}{\partial N_i}\right)_{T,P,N_{j\neq i}}$$

When temperature and volume are taken to be constant, chemical potential relates to the Helmholtz free energy, A.

$$\mu_i = \left(\frac{\partial A}{\partial N_i}\right)_{T,V,N_{j\neq i}}$$

The chemical potential of a species is the slope of the free energy with respect to the number of particles of that species. It reflects the change in free energy when the number of particles of one species changes. Each chemical species, be it an atom, ion or molecule, has its own chemical potential. At equilibrium free energy is at its minimum for the system, that is, dG = 0. It follows that the sum of the chemical potentials, each weighted by the corresponding change in particle number, is also zero.

$$\sum_{i=1}^{n} \mu_i\,dN_i = 0$$

Use of this equality provides the means to establish the equilibrium constant for a chemical reaction. Other definitions are sometimes used.

$$\mu_i = \left(\frac{\partial U}{\partial N_i}\right)_{S,V,N_{j\neq i}} \qquad \mu_i = \left(\frac{\partial H}{\partial N_i}\right)_{S,P,N_{j\neq i}}$$

Here U is internal energy, H is enthalpy and the entropy, S, is taken to be constant (see History). Keeping the entropy fixed requires perfect thermal insulation, so these definitions have limited practical applications.
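The statement that chemical potential is the slope of the free energy with respect to the amount of one species can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the cited sources: it assumes an ideal mixture, works in moles rather than particle numbers, and uses hypothetical standard potentials. It compares a finite-difference slope of G with the known ideal-mixture result, standard potential plus RT ln(mole fraction).

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def gibbs_ideal_mixture(amounts, mu_std, temperature):
    """G = sum_i n_i * (mu_std_i + R*T*ln(x_i)) for an ideal mixture (amounts in mol)."""
    total = sum(amounts)
    return sum(n_i * (mu_i + R * temperature * math.log(n_i / total))
               for n_i, mu_i in zip(amounts, mu_std))


def chemical_potential(amounts, mu_std, temperature, i, step=1e-6):
    """Central-difference estimate of dG/dn_i at constant T, P and other amounts."""
    up = list(amounts)
    up[i] += step
    down = list(amounts)
    down[i] -= step
    return (gibbs_ideal_mixture(up, mu_std, temperature)
            - gibbs_ideal_mixture(down, mu_std, temperature)) / (2 * step)


# Hypothetical amounts (mol) and standard potentials (J/mol).
amounts = [1.0, 3.0]
mu_std = [-10_000.0, -20_000.0]
T = 298.15

mu_numeric = chemical_potential(amounts, mu_std, T, i=0)
mu_analytic = mu_std[0] + R * T * math.log(amounts[0] / sum(amounts))
print(mu_numeric, mu_analytic)
```

The two printed numbers should agree to within the finite-difference error.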

Dependence on concentration (particle number)


The variation of chemical potential with particle number is most easily expressed in terms of quantities related to concentration. The concentration of a species in a given volume, v, is simply $N_i / v$, so chemical potential can be defined in terms of concentration rather than particle number. For species in solution,

$$\mu_i = \mu_i^{std} + RT \ln a_i$$

where $\mu_i^{std}$ is the potential in a given standard state and $a_i$ is the activity of the species in solution. Activity can be expressed as a product of concentration and an activity coefficient, so if activity coefficients are ignored, chemical potential varies linearly with the logarithm of the concentration of the species. For species in the gaseous state the variation is expressed as

$$\mu_i = \mu_i^{std} + RT \ln f_i$$

where $f_i$ is the fugacity. If the ideal gas law applies, fugacity is equal to partial pressure and chemical potential varies linearly with the logarithm of the partial pressure.
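To give a feel for the size of the logarithmic term, the following sketch evaluates the difference in chemical potential between two concentrations of the same solute, RT ln(c_high/c_low). It assumes activity coefficients are ignored, so that activity can be replaced by concentration; the function name and the tenfold concentration ratio are arbitrary choices for the example.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def delta_mu(c_high, c_low, temperature):
    """mu(c_high) - mu(c_low) = R*T*ln(c_high/c_low), ignoring activity coefficients."""
    return R * temperature * math.log(c_high / c_low)


# Tenfold concentration difference at 25 C (298.15 K).
dmu = delta_mu(1.0, 0.1, 298.15)
print(round(dmu))  # prints 5708 (J/mol)
```

A tenfold concentration difference at room temperature therefore corresponds to roughly 5.7 kJ/mol, and the standard-state term cancels in the difference.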


Dependence on temperature and pressure


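The Maxwell relation

$$\left(\frac{\partial \mu_i}{\partial T}\right)_{P,N} = -\left(\frac{\partial S}{\partial N_i}\right)_{T,P,N_{j\neq i}}$$

shows that the temperature variation of chemical potential depends on entropy. If entropy increases as the particle number increases, then chemical potential will decrease with increasing temperature. Similarly

$$\left(\frac{\partial \mu_i}{\partial P}\right)_{T,N} = \left(\frac{\partial V}{\partial N_i}\right)_{T,P,N_{j\neq i}}$$

shows that the pressure coefficient depends on volume. If volume increases with particle number, chemical potential also increases with increasing pressure.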
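Both relations follow from the fundamental equation for dG by equating mixed second derivatives of G. The derivation below is a short sketch using the symbols of the thermodynamic definitions; the subscript N is shorthand, introduced here, for holding all particle numbers fixed.

```latex
% From dG = -S\,dT + V\,dP + \sum_i \mu_i\,dN_i :
S = -\left(\frac{\partial G}{\partial T}\right)_{P,N}, \qquad
V = \left(\frac{\partial G}{\partial P}\right)_{T,N}, \qquad
\mu_i = \left(\frac{\partial G}{\partial N_i}\right)_{T,P,N_{j\neq i}}

% Mixed partial derivatives commute:
\left(\frac{\partial \mu_i}{\partial T}\right)_{P,N}
  = \frac{\partial^2 G}{\partial T\,\partial N_i}
  = \frac{\partial}{\partial N_i}\left(\frac{\partial G}{\partial T}\right)_{P,N}
  = -\left(\frac{\partial S}{\partial N_i}\right)_{T,P,N_{j\neq i}}

\left(\frac{\partial \mu_i}{\partial P}\right)_{T,N}
  = \frac{\partial^2 G}{\partial P\,\partial N_i}
  = \frac{\partial}{\partial N_i}\left(\frac{\partial G}{\partial P}\right)_{T,N}
  = \left(\frac{\partial V}{\partial N_i}\right)_{T,P,N_{j\neq i}}
```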

Applications
The Gibbs–Duhem equation is useful because it relates individual chemical potentials. For example, in a binary mixture of components A and B, at constant temperature and pressure, the chemical potentials of the two participants are related by

$$d\mu_B = -\frac{n_A}{n_B}\,d\mu_A$$

where $n_A$ and $n_B$ are the amounts of the two components.

Every instance of phase or chemical equilibrium is characterized by a constant. For instance, the melting of ice is characterized by a temperature, known as the melting point, at which solid and liquid phases are in equilibrium with each other. Chemical potentials can be used to explain the slopes of lines on a phase diagram by using the Clapeyron equation, which in turn can be derived from the Gibbs–Duhem equation.[5] They are used to explain colligative properties such as melting point depression by the application of pressure.[6] Both Raoult's law and Henry's law can be derived in a simple manner using chemical potentials.[7]

History
Chemical potential was first described by the American engineer, chemist and mathematical physicist Josiah Willard Gibbs. He defined it as follows:

"If to any homogeneous mass in a state of hydrostatic stress we suppose an infinitesimal quantity of any substance to be added, the mass remaining homogeneous and its entropy and volume remaining unchanged, the increase of the energy of the mass divided by the quantity of the substance added is the potential for that substance in the mass considered."

Gibbs later noted also that for the purposes of this definition, any chemical element or combination of elements in given proportions may be considered a substance, whether capable or not of existing by itself as a homogeneous body. This freedom to choose the boundary of the system allows chemical potential to be applied to a huge range of systems. The term can be used in thermodynamics and physics for any system undergoing change. Chemical potential is also referred to as partial molar Gibbs energy (see also partial molar property). Chemical potential is measured in units of energy/particle or, equivalently, energy/mole.

In his 1873 paper A Method of Geometrical Representation of the Thermodynamic Properties of Substances by Means of Surfaces, Gibbs introduced the preliminary outline of the principles of his new equation, able to predict or estimate the tendencies of various natural processes to ensue when bodies or systems are brought into contact. By studying the interactions of homogeneous substances in contact, i.e. bodies that are in composition part solid, part liquid, and part vapor, and by using a three-dimensional volume–entropy–internal energy graph, Gibbs was able to determine three states of equilibrium, i.e. "necessarily stable", "neutral", and "unstable", and whether or not changes will ensue. In 1876, Gibbs built on this framework by introducing the concept of chemical potential so as to take into account chemical reactions and states of bodies that are chemically different from each other. In his own words, to summarize his results in 1873, Gibbs states:

"If we wish to express in a single equation the necessary and sufficient condition of thermodynamic equilibrium for a substance when surrounded by a medium of constant pressure P and temperature T, this equation may be written:

$$\delta(\varepsilon - T\eta + P\nu) = 0$$

where $\delta$ refers to the variation produced by any variations in the state of the parts of the body, and (when different parts of the body are in different states) in the proportion in which the body is divided between the different states. The condition of stable equilibrium is that the value of the expression in the parenthesis shall be a minimum."

In this description, as used by Gibbs, $\varepsilon$ refers to the internal energy of the body, $\eta$ refers to the entropy of the body, and $\nu$ is the volume of the body.

Electrochemical, Internal, external, and total chemical potential


The abstract definition of chemical potential given above (total change in free energy per extra mole of substance) is more specifically called total chemical potential.[8][9] If two locations have different total chemical potentials for a species, some of the difference may be due to potentials associated with "external" force fields (electric potential energy differences, gravitational potential energy differences, etc.), while the rest would be due to "internal" factors (density, temperature, etc.).[8] Therefore the total chemical potential can be split into internal chemical potential and external chemical potential:

$$\mu_{tot} = \mu_{int} + \mu_{ext}$$

where

$$\mu_{ext} = qV + mgh + \cdots$$

i.e., the external potential is the sum of electric potential, gravitational potential, etc. (q and m are the charge and mass of the species, respectively, V and h are the voltage and height of the container, respectively, and g is the acceleration due to gravity.) The internal chemical potential includes everything else besides the external potentials, such as density, temperature, and enthalpy.

The phrase "chemical potential" sometimes means "total chemical potential", but that is not universal.[8] In some fields, in particular electrochemistry, semiconductor physics, and solid-state physics, the term "chemical potential" means internal chemical potential, while the term electrochemical potential is used to mean total chemical potential.[10][11][12][13][14]
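The split into internal and external parts can be illustrated for an ion at two locations that differ in concentration and voltage. The sketch below is written under explicit assumptions that are not in the text above: it works per mole (so the electrical term is zFV, with z the charge number and F the Faraday constant, instead of qV per particle), takes the internal part as a standard potential plus RT ln(concentration), and neglects gravity. Setting the two total potentials equal gives the voltage difference at which the ion has no net tendency to move.

```python
import math

R = 8.314462618   # molar gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def total_potential(mu_std, conc, z, voltage, temperature):
    """Molar total (electrochemical) potential = internal part + electrical part."""
    internal = mu_std + R * temperature * math.log(conc)
    external = z * F * voltage
    return internal + external


def equilibrium_voltage_difference(c1, c2, z, temperature):
    """V2 - V1 at which total potentials at locations 1 and 2 are equal."""
    return R * temperature / (z * F) * math.log(c1 / c2)


# Hypothetical case: a singly charged cation, tenfold concentration ratio, 25 C.
dv = equilibrium_voltage_difference(c1=1.0, c2=0.1, z=1, temperature=298.15)
print(dv)  # volts

mu_1 = total_potential(0.0, 1.0, 1, 0.0, 298.15)
mu_2 = total_potential(0.0, 0.1, 1, dv, 298.15)
print(mu_1, mu_2)
```

The result, about 59 mV per tenfold concentration ratio for a singly charged ion, shows how a modest electrical term can balance a sizeable concentration difference.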

Chemical potential of electrons in solids


Electrons in solids have a chemical potential, defined the same way as the chemical potential of a chemical species: the change in free energy when electrons are added to or removed from the system. In the case of electrons, the chemical potential is usually expressed in energy per particle rather than energy per mole, and the energy per particle is conventionally given in units of electron-volts (eV).

Chemical potential plays an especially important role in semiconductor physics. For example, n-type silicon has a higher chemical potential of electrons than p-type silicon. Therefore, when p-type and n-type silicon are put into contact, forming a p–n junction, electrons will spontaneously flow from the n-type to the p-type. This transfer of charge causes a "built-in" electric field, which is central to how p–n diodes and photovoltaics work.

Chemical potential of electrons in solids is closely related to the concepts of work function, Fermi level, electronegativity, and ionization potential. In fact, the chemical potential of an atom is sometimes said to be the negative of the atom's electronegativity. Likewise, the process of chemical potential equalization is sometimes referred to as the process of electronegativity equalization. This connection comes from the Mulliken definition of electronegativity. By inserting the energetic definitions of the ionization potential and electron affinity into the Mulliken electronegativity, it is possible to show that the Mulliken chemical potential is a finite difference approximation of the derivative of the electronic energy with respect to the number of electrons, i.e.,

$$\mu(\mathrm{Mulliken}) = -\chi(\mathrm{Mulliken}) = -\frac{IP + EA}{2} = \left[\frac{\delta E[N]}{\delta N}\right]_{N=N_0}$$

where IP and EA are the ionization potential and electron affinity of the atom, respectively.

As described above, when describing chemical potential, one has to say "relative to what". In the case of electrons in solids, chemical potential is often specified "relative to vacuum", i.e. relative to an electron sitting isolated in empty space. In practice, the electrochemical potential of electrons is even more important than the chemical potential. The electrochemical potential of electrons in a solid is called the Fermi level.
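The finite-difference reading of the Mulliken expression can be made concrete. Removing one electron raises the energy by IP and adding one lowers it by EA, so a central difference of the energy over the neighbouring electron counts gives −(IP + EA)/2. The sketch below uses approximate literature values for the hydrogen atom purely as illustrative inputs; the function names are invented for this example.

```python
def mulliken_chemical_potential(ip, ea):
    """mu = -(IP + EA) / 2, in the same energy units as the inputs."""
    return -(ip + ea) / 2.0


def central_difference_mu(e_minus, e_plus):
    """(E(N0 + 1) - E(N0 - 1)) / 2, a finite-difference estimate of dE/dN at N0."""
    return (e_plus - e_minus) / 2.0


# Hydrogen atom, approximate values in eV (illustrative inputs).
IP, EA = 13.598, 0.754
E0 = 0.0  # energy of the neutral atom taken as the zero of energy

mu = mulliken_chemical_potential(IP, EA)
mu_fd = central_difference_mu(e_minus=E0 + IP, e_plus=E0 - EA)
print(mu, mu_fd)
```

Both routes give the same number, about −7.18 eV with these inputs, which is the negative of the Mulliken electronegativity.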

In particle physics
In recent years, thermal physics has applied the definition of chemical potential to systems in particle physics and its associated processes. For example, in a quark-gluon plasma or other QCD matter, at every point in space there is a chemical potential for photons, a chemical potential for electrons, a chemical potential for baryon number, electric charge, and so forth. In the case of photons, photons are bosons and can very easily and rapidly appear or disappear. Therefore the chemical potential of photons is always and everywhere zero. The reason is, if the chemical potential somewhere was higher than zero, photons would spontaneously disappear from that area until the chemical potential went back to zero; likewise if the chemical potential somewhere was less than zero, photons would spontaneously appear until the chemical potential went back to zero. Since this process occurs extremely rapidly (at least, it occurs rapidly in the presence of dense charged matter), it is safe to assume that the photon chemical potential is never different from zero. Electric charge is different, because it is conserved, i.e. it can be neither created nor destroyed. It can, however, diffuse. The "chemical potential of electric charge" controls this diffusion: Electric charge, like anything else, will tend to diffuse from areas of higher chemical potential to areas of lower chemical potential.[15] Other conserved quantities like baryon number are the same. In fact, each conserved quantity is associated with a chemical potential and a corresponding tendency to diffuse to equalize it out.[16] In the case of electrons, the behavior depends on temperature and context. At low temperatures, with no positrons present, electrons cannot be created or destroyed. Therefore there is an electron chemical potential that might vary in space, causing diffusion. 
At very high temperatures, however, electrons and positrons can spontaneously appear out of the vacuum (pair production), so the chemical potential of electrons by themselves becomes a less useful quantity than the chemical potential of conserved quantities such as (electrons minus positrons). The chemical potentials of bosons and fermions are related to the number of particles and the temperature by Bose–Einstein statistics and Fermi–Dirac statistics, respectively.
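The link between chemical potential, temperature and particle number can be made concrete with the mean occupation of a single-particle state of energy E, which is 1/(exp((E − μ)/kT) + 1) for fermions and 1/(exp((E − μ)/kT) − 1) for bosons. The sketch below evaluates both; energies are in eV and the sample values are hypothetical. The zero chemical potential in the boson example mirrors the photon case discussed above.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def fermi_dirac(energy, mu, temperature):
    """Mean occupation of a fermion state: 1 / (exp((E - mu)/kT) + 1)."""
    x = (energy - mu) / (K_B_EV * temperature)
    return 1.0 / (math.exp(x) + 1.0)


def bose_einstein(energy, mu, temperature):
    """Mean occupation of a boson state: 1 / (exp((E - mu)/kT) - 1), needs E > mu."""
    x = (energy - mu) / (K_B_EV * temperature)
    if x <= 0.0:
        raise ValueError("Bose-Einstein occupation requires energy > mu")
    return 1.0 / math.expm1(x)


# Hypothetical sample values (eV, K).
print(fermi_dirac(0.5, 0.5, 300.0))    # a state exactly at mu is half filled
print(fermi_dirac(0.6, 0.5, 300.0))
print(bose_einstein(0.1, 0.0, 300.0))  # mu = 0, as for photons
```

Summing such occupations over all states gives the total particle number, which is how the chemical potential is fixed for a given temperature and particle number.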


References
[1] Atkins, Peter; de Paula, Julio (2006). Atkins' Physical Chemistry (8th ed.). Oxford University Press. ISBN 978-0-19-870072-2. Page references in this article refer specifically to the 8th edition of this book.
[2] Baierlein, Ralph (April 2001). "The elusive chemical potential" (http://www.physics.rutgers.edu/ugrad/351/chemical_potential.pdf). American Journal of Physics 69 (4): 423–434. Bibcode 2001AmJPh..69..423B. doi:10.1119/1.1336839.
[3] Job, G.; Herrmann, F. (February 2006). "Chemical potential – a quantity in search of recognition" (http://www.physikdidaktik.uni-karlsruhe.de/publication/ejp/chem_pot_ejp.pdf) (PDF). European Journal of Physics 27 (2): 353–371. Bibcode 2006EJPh...27..353J. doi:10.1088/0143-0807/27/2/018.
[4] Kittel, Charles; Herbert Kroemer (1980-01-15). Thermal Physics (2nd Edition) (http://books.google.com/?id=c0R79nyOoNMC&pg=PA357). W. H. Freeman. p. 357. ISBN 978-0-7167-1088-2.
[5] Atkins, Section 4.1, p 126
[6] Atkins, Section 5.5, pp 150-155
[7] Atkins, Section 5.3, pp 143-145
[8] Thermal Physics (http://books.google.com/books?id=c0R79nyOoNMC&pg=PA124) by Kittel and Kroemer, second edition, page 124.
[9] Thermodynamics in Earth and Planetary Sciences by Jibamitra Ganguly, google books link (http://books.google.com/books?id=aD6TJAuCTVsC&pg=PA240). This text uses "internal", "external", and "total chemical potential" as in this article.
[10] Electrochemical Methods by Bard and Faulkner, 2nd edition, Section 2.2.4(a),4-5.
[11] Electrochemistry at Metal and Semiconductor Electrodes, by Norio Sato, pages 4-5, google books link (http://books.google.com/books?id=olQzaXNgM74C&pg=PA4)
[12] Physics Of Transition Metal Oxides, by Sadamichi Maekawa, p323, google books link (http://books.google.com/books?id=iyNzfufnkBgC&pg=PA323)
[13] The Physics of Solids: Essentials and Beyond, by Eleftherios N. Economou, page 140, google books link (http://books.google.com/books?id=SyGsHxH071MC&pg=PA140). In this text, total chemical potential is usually called "electrochemical potential", but sometimes just "chemical potential". The internal chemical potential is referred to by the unwieldy phrase "chemical potential in the absence of the [electric] field".
[14] Solid State Physics by Ashcroft and Mermin, page 257 note 36. Page 593 of the same book uses, instead, an unusual "flipped" definition where "chemical potential" is the total chemical potential which is constant in equilibrium, and "electrochemical potential" is the internal chemical potential; presumably this unusual terminology was an unintentional mistake.
[15] Baierlein, Ralph (2003). Thermal Physics. Cambridge University Press. ISBN 0-521-65838-1. OCLC 39633743.
[16] Hadrons and Quark-Gluon Plasma (http://books.google.com/books?id=SAlbKkdor1gC&pg=PA91&lpg=PA91), by Jean Letessier, Johann Rafelski, p91

External links
Chemical Potential (http://www.tf.uni-kiel.de/matwis/amat/def_en/kap_2/advanced/t2_4_1.html)
Chemical Potentials (http://www.phasediagram.dk/chemical_potentials.htm)
Values of the chemical potential of 1300 substances (http://www.job-stiftung.de/index.php?id=54,0,0,1,0,0)

Conservation law

In physics, a conservation law states that a particular measurable property of an isolated physical system does not change as the system evolves. One particularly important physical result concerning conservation laws is Noether's Theorem, which states that there is a one-to-one correspondence between conservation laws and differentiable symmetries of physical systems. For example, the conservation of energy follows from the time-invariance of physical systems, and the fact that physical systems behave the same regardless of how they are oriented in space gives rise to the conservation of angular momentum.

Exact laws
A partial listing of conservation laws that are said to be exact laws, or more precisely have never been shown to be violated:
Conservation of mass-energy
Conservation of linear momentum
Conservation of angular momentum
Conservation of electric charge
Conservation of color charge
Conservation of weak isospin
Conservation of probability
CPT symmetry (combining charge, parity and time conjugation)
Lorentz symmetry

Approximate laws
There are also approximate conservation laws. These are approximately true in particular situations, such as low speeds, short time scales, or certain interactions.
Conservation of mass (applies for non-relativistic speeds and when there are no nuclear reactions)
Conservation of baryon number (see chiral anomaly)
Conservation of lepton number (in the Standard Model)
Conservation of flavor (violated by the weak interaction)
Conservation of parity
Invariance under charge conjugation
Invariance under time reversal
CP symmetry, the combination of charge and parity conjugation (equivalent to time reversal if CPT holds)


References
Victor J. Stenger, 2000. Timeless Reality: Symmetry, Simplicity, and Multiple Universes. Buffalo NY: Prometheus Books. Chpt. 12 is a gentle introduction to symmetry, invariance, and conservation laws.

External links
Conservation Laws (http://www.lightandmatter.com/area1book2.html), an online textbook

Mass–energy equivalence
In physics, in particular special and general relativity, mass–energy equivalence is the concept that the mass of a body is a measure of its energy content. In this concept, mass is a property of all energy, and energy is a property of all mass, and the two properties are connected by a constant. This means (for example) that the total internal energy E of a body at rest is equal to the product of its rest mass m and a suitable conversion factor to transform from units of mass to units of energy. Albert Einstein proposed mass–energy equivalence in 1905 in one of his Annus Mirabilis papers, entitled "Does the inertia of a body depend upon its energy-content?"[1] The equivalence is described by the famous equation:

$$E = mc^2$$

where E is energy, m is mass, and c is the speed of light. The formula is dimensionally consistent and does not depend on any specific system of measurement units. The equation $E = mc^2$ indicates that energy always exhibits relativistic mass in whatever form the energy takes.[2] Mass–energy equivalence does not imply that mass may be "converted" to energy, but it allows for matter to be converted to energy. Through all such conversions, however, mass remains conserved (i.e., the quantity of mass remains constant), since mass is a property of matter and also of any type of energy. Energy is also conserved.

In physics, mass must be differentiated from matter. Matter, when seen as certain types of particles, can be created and destroyed (as in particle annihilation or creation), but a closed system of precursors and products of such reactions, as a whole, retains both the original mass and energy throughout the reaction.

Figure: 4-meter-tall sculpture of Einstein's 1905 $E = mc^2$ formula at the 2006 Walk of Ideas, Berlin, Germany.

Massenergy equivalence When the system is not closed, and energy is removed from a system (for example in nuclear fission or nuclear fusion), some mass is always removed along with the energy, according to their equivalence where one always accompanies the other. This energy thus is associated with the missing mass, and this mass will be added to any other system which absorbs the removed energy. In this situation E=mc2 can be used to calculate how much mass goes along with the removed energy. It also tells how much mass will be added to any system which later absorbs this energy. This was the original use of the equation when derived by Einstein. E=mc2 has sometimes been used as an explanation for the origin of energy in nuclear processes, but massenergy equivalence does not explain the origin of such energies. Instead, this relationship merely indicates that the large amounts of energy released in such reactions may exhibit enough mass that the mass-loss may be measured, when the released energy (and its mass) have been removed from the system. For example, the loss of mass to an atom and a neutron as a result of the capture of the neutron, and the production of a gamma ray, has been used to test mass-energy equivalence to high precision, as the energy of the gamma ray may be compared with the mass defect after capture. In 2005, these were found to agree to 0.0004%, the most precise test of the equivalence of mass and energy to date. This test was performed in the World Year of Physics 2005, a centennial celebration of Einstein's achievements in 1905.[3] Einstein was not the first to propose a massenergy relationship (see the History section). However, Einstein was the first scientist to propose the E=mc2 formula and the first to interpret massenergy equivalence as a fundamental principle that follows from the relativistic symmetries of space and time.


Nomenclature
In "Does the inertia of a body depend upon its energy-content?" Einstein used V to mean the speed of light in a vacuum and L to mean the energy lost by a body in the form of radiation. Consequently, the equation E = mc² was not originally written as a formula but as a sentence in German meaning that if a body gives off the energy L in the form of radiation, its mass diminishes by L/V².[4] A remark placed above it noted that the equation was approximate, because the conclusion was only justified if one neglected "magnitudes of fourth and higher orders" of a series expansion.[5] In 1907, Einstein's mass–energy relationship was written as M₀ = E₀/c² by Max Planck[6] and, subsequently, was given a quantum interpretation[7] by Johannes Stark, who assumed its validity and correctness (Gültigkeit). However, Stark wrote the equation as e₀ = m₀c², which meant the energy bound in the mass of an electron at rest, and this still was not the present popular version of the equation. In 1924, Louis de Broglie assumed the correctness of the relationship "énergie = masse c²" on page 31 of his Research on the Theory of the Quanta (published in 1925), but he did not write E = mc². However, Einstein returned to the topic once again after World War II, and this time he wrote E = mc² in the title of his article,[8] intended as an explanation for a general reader by analogy.[9]

Conservation of mass and energy


The concept of mass–energy equivalence connects the concepts of conservation of mass and conservation of energy, which continue to hold separately in any isolated system (one that is closed to loss of any type of energy, including energy associated with loss of matter). The theory of relativity allows particles which have rest mass to be converted to other forms of mass which require motion, such as kinetic energy, heat, or light. However, the system mass remains. Kinetic energy or light can also be converted to new kinds of particles which have rest mass, but again the energy remains. Both the total mass and the total energy inside an isolated system remain constant over time, as seen by any single observer in a given inertial frame.

In other words, energy can neither be created nor destroyed, and energy, in all of its forms, has mass. Mass also can neither be created nor destroyed, and in all of its forms, has energy. According to the theory of relativity, mass and energy, as commonly understood, are two names for the same thing, and neither one is changed nor transformed into the other. Rather, neither one exists without the other existing also, as a property of a system. Rather than mass being changed into energy, the view of special relativity is that rest mass has been changed to a more mobile form of mass, but remains mass. In the transformation process, neither the amount of mass nor the amount of energy changes, since both are properties which are connected to each other via a simple constant.[10] Thus, if energy leaves a system by changing its form, it simply takes its system mass with it. This view requires that if either mass or energy disappears from a system, it will always be found that both have simply moved off to another place, where they may both be measured as an increase of both mass and energy corresponding to the loss in the first system.


Fast-moving objects and systems of objects


When an object is pulled in the direction of motion, it gains momentum and energy, but when the object is already traveling near the speed of light, it cannot move much faster, no matter how much energy it absorbs. Its momentum and energy continue to increase without bound, whereas its speed approaches a constant value, the speed of light. This implies that in relativity the momentum of an object cannot be a constant times the velocity, nor can the kinetic energy be a constant times the square of the velocity.

A property called the relativistic mass is defined as the ratio of the momentum of an object to its velocity.[11] Relativistic mass depends on the motion of the object, so that different observers in relative motion see different values for it. If the object is moving slowly, the relativistic mass is nearly equal to the rest mass and both are nearly equal to the usual Newtonian mass. If the object is moving quickly, the relativistic mass is greater than the rest mass by an amount equal to the mass associated with the kinetic energy of the object. As the object approaches the speed of light, the relativistic mass grows without bound, because the kinetic energy grows without bound and this energy is associated with mass.

The relativistic mass is always equal to the total energy (rest energy plus kinetic energy) divided by c².[2] Because the relativistic mass is exactly proportional to the energy, relativistic mass and relativistic energy are nearly synonyms; the only difference between them is the units. If length and time are measured in natural units, the speed of light is equal to 1, and even this difference disappears. Then mass and energy have the same units and are always equal, so it is redundant to speak about relativistic mass, because it is just another name for the energy. This is why physicists usually reserve the useful short word "mass" to mean rest mass, or invariant mass, and not relativistic mass.
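The relationship between speed, relativistic mass, and kinetic energy described above can be sketched numerically. The snippet below is a minimal illustration, assuming the standard special-relativity expression m_rel = γ·m₀ with γ = 1/√(1 − v²/c²), which the text does not write out explicitly; the function names are hypothetical and chosen only for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def lorentz_factor(v):
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed v in m/s with |v| < c."""
    beta = v / C
    return 1.0 / math.sqrt(1.0 - beta * beta)


def relativistic_mass(rest_mass, v):
    """Relativistic mass (momentum / velocity), assumed here to be gamma * rest mass."""
    return lorentz_factor(v) * rest_mass


def kinetic_energy(rest_mass, v):
    """Relativistic kinetic energy in joules: total energy minus rest energy."""
    return (lorentz_factor(v) - 1.0) * rest_mass * C**2


if __name__ == "__main__":
    rest_mass = 1.0  # kg, illustrative value
    for fraction in (0.01, 0.5, 0.9, 0.99, 0.9999):
        v = fraction * C
        m_rel = relativistic_mass(rest_mass, v)
        # The excess over the rest mass equals the kinetic energy divided by c^2.
        excess = kinetic_energy(rest_mass, v) / C**2
        print(f"v = {fraction:6.4f} c   m_rel = {m_rel:9.4f} kg   KE/c^2 = {excess:9.4f} kg")
```

At small speeds the printed relativistic mass is nearly the rest mass, while near c it grows without bound even though the speed barely changes.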
The relativistic mass of a moving object is larger than the relativistic mass of an object that is not moving, because a moving object has extra kinetic energy. The rest mass of an object is defined as the mass of an object when it is at rest, so that the rest mass is always the same, independent of the motion of the observer: it is the same in all inertial frames.

For things and systems made up of many parts, like an atomic nucleus, planet, or star, the relativistic mass is the sum of the relativistic masses (or energies) of the parts, because energies are additive in closed systems. This is not true in systems which are open, however, if energy is subtracted. For example, if a system is bound by attractive forces, and the energy gained due to the forces of attraction in excess of the work done is removed from the system, then mass will be lost with this removed energy. For example, the mass of an atomic nucleus is less than the total mass of the protons and neutrons that make it up, but this is only true after this energy from binding has been removed in the form of a gamma ray (which, in this system, carries away the mass of the energy of binding). This mass decrease is also equivalent to the energy required to break up the nucleus into individual protons and neutrons (in this case, work and mass would need to be supplied). Similarly, the mass of the solar system is slightly less than the sum of the individual masses of the Sun and planets.

For a system of particles going off in different directions, the invariant mass of the system is the analog of the rest mass, and is the same for all observers, even those in relative motion. It is defined as the total energy (divided by c²) in the center of mass frame (where, by definition, the system total momentum is zero). A simple example of an object with moving parts but zero total momentum is a container of gas. In this case, the mass of the container is given by its total energy (including the kinetic energy of the gas molecules), since the system total energy and invariant mass are the same in any reference frame where the momentum is zero, and such a reference frame is also the only frame in which the object can be weighed. In a similar way, the theory of special relativity posits that the thermal energy in all objects (including solids) contributes to their total masses and weights, even though this energy is present as the kinetic and potential energies of the atoms in the object, and it (in a similar way to the gas) is not seen in the rest masses of the atoms that make up the object.

In a similar manner, even photons (light quanta), if trapped in a container space (as a photon gas or thermal radiation), would contribute a mass associated with their energy to the container. Such an extra mass, in theory, could be weighed in the same way as any other type of rest mass. This is true in special relativity theory, even though individually, photons have no rest mass. The property that trapped energy in any form adds weighable mass to systems that have no net momentum is one of the characteristic and notable consequences of relativity. It has no counterpart in classical Newtonian physics, in which radiation, light, heat, and kinetic energy never exhibit weighable mass under any circumstances.

Just as the relativistic mass of a closed system is conserved through time, so also is its invariant mass. It is this property which allows the conservation of all types of mass in systems, and also conservation of all types of mass in reactions where matter is converted to energy, and vice versa.


Applicability of the strict mass–energy equivalence formula, E = mc²


As is noted above, two different definitions of mass have been used in special relativity, and also two different definitions of energy. The simple equation E = mc² is not generally applicable to all these types of mass and energy, except in the special case that the total additive momentum is zero for the system under consideration. In such a case, which is always guaranteed when observing the system from either its center of mass frame or its center of momentum frame, E = mc² is always true for any type of mass and energy that are chosen. Thus, for example, in the center of mass frame, the total energy of an object or system is equal to its rest mass times c², a useful equality. This is the relationship used for the container of gas in the previous example. It is not true in other reference frames where the center of mass is in motion. In these systems or for such an object, its total energy will depend on both its rest (or invariant) mass, and also its (total) momentum.[12]

In inertial reference frames other than the rest frame or center of mass frame, the equation E = mc² remains true if the energy is the relativistic energy and the mass is the relativistic mass. It is also correct if the energy is the rest or invariant energy (also the minimum energy), and the mass is the rest mass, or the invariant mass. However, connection of the total or relativistic energy (Er) with the rest or invariant mass (m0) requires consideration of the system total momentum, in systems and reference frames where the total momentum has a non-zero value. The formula then required to connect the two different kinds of mass and energy is the extended version of Einstein's equation, called the relativistic energy–momentum relationship:[13]

\[ E_r^2 - |\vec{p}\,|^2 c^2 = m_0^2 c^4 \]

or

\[ E_r = \sqrt{(m_0 c^2)^2 + (pc)^2} \]

Here the (pc)² term represents the square of the Euclidean norm (total vector length) of the various momentum vectors in the system, which reduces to the square of the simple momentum magnitude if only a single particle is considered. This equation reduces to E = mc² when the momentum term is zero. For photons, where m0 = 0, the equation reduces to Er = pc.
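The two limiting cases named above (zero momentum and zero rest mass) can be checked with a short numerical sketch of the energy–momentum relationship. The function names are hypothetical, and SI units (kg, kg·m/s, J) are assumed throughout.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def total_energy(rest_mass, momentum):
    """E_r = sqrt((m0 c^2)^2 + (p c)^2), with m0 in kg and p in kg*m/s; returns joules."""
    return math.hypot(rest_mass * C**2, momentum * C)


def invariant_mass(energy, momentum):
    """Invert the relation: m0 = sqrt(E_r^2 - (p c)^2) / c^2, in kg."""
    return math.sqrt(energy**2 - (momentum * C) ** 2) / C**2


if __name__ == "__main__":
    # Zero momentum: the relation reduces to E = m c^2.
    print(total_energy(1.0, 0.0), 1.0 * C**2)
    # Zero rest mass (photon): the relation reduces to E_r = p c.
    print(total_energy(0.0, 1.0e-27), 1.0e-27 * C)
    # Round trip for a moving massive object (illustrative values).
    energy = total_energy(3.0, 1.0e8)
    print(invariant_mass(energy, 1.0e8))
```

Using `math.hypot` rather than squaring and adding by hand is only a numerical convenience; it computes the same square root of a sum of squares.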


Meanings of the strict mass–energy equivalence formula, E = mc²


Mass–energy equivalence states that any object has a certain energy, even when it is stationary. In Newtonian mechanics, a motionless body has no kinetic energy, and it may or may not have other amounts of internal stored energy, like chemical energy or thermal energy, in addition to any potential energy it may have from its position in a field of force. In Newtonian mechanics, all of these energies are much smaller than the mass of the object times the speed of light squared. In relativity, all of the energy that moves along with an object (that is, all the energy which is present in the object's rest frame) contributes to the total mass of the body, which measures how much it resists acceleration. Each potential and kinetic energy makes a proportional contribution to the mass. As noted above, even if a box of ideal mirrors "contains" light, then the individually massless photons still contribute to the total mass of the box, by the amount of their energy divided by c².[14]

In relativity, removing energy is removing mass, and for an observer in the center of mass frame, the formula m = E/c² indicates how much mass is lost when energy is removed. In a nuclear reaction, the mass of the atoms that come out is less than the mass of the atoms that go in, and the difference in mass shows up as heat and light which has the same relativistic mass as the difference (and also the same invariant mass in the center of mass frame of the system). In this case, the E in the formula is the energy released and removed, and the mass m is how much the mass decreases. In the same way, when any sort of energy is added to an isolated system, the increase in the mass is equal to the added energy divided by c². For example, when water is heated it gains about 1.11 × 10⁻¹⁷ kg of mass for every joule of heat added to the water.

An object moves with different speed in different frames, depending on the motion of the observer, so the kinetic energy in both Newtonian mechanics and relativity is frame dependent. This means that the amount of relativistic energy, and therefore the amount of relativistic mass, that an object is measured to have depends on the observer. The rest mass is defined as the mass that an object has when it is not moving (or when an inertial frame is chosen such that it is not moving). The term also applies to the invariant mass of systems when the system as a whole is not "moving" (has no net momentum). The rest and invariant masses are the smallest possible value of the mass of the object or system. They also are conserved quantities, so long as the system is closed. Because of the way they are calculated, the effects of moving observers are subtracted, so these quantities do not change with the motion of the observer. The rest mass is almost never additive: the rest mass of an object is not the sum of the rest masses of its parts.
The rest mass of an object is the total energy of all the parts (divided by c²), including kinetic energy, as measured by an observer that sees the center of mass of the object to be standing still. The rest mass adds up only if the parts are standing still and do not attract or repel, so that they do not have any extra kinetic or potential energy. The other possibility is that they have a positive kinetic energy and a negative potential energy that exactly cancel.
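As a worked check of the heated-water figure quoted in this section, the mass gained per joule follows directly from dividing the added energy by c² (here with c rounded to 2.998 × 10⁸ m/s, a rounding chosen for this example):

```latex
\[
\Delta m = \frac{\Delta E}{c^{2}}
         = \frac{1\ \mathrm{J}}{\left(2.998\times10^{8}\ \mathrm{m/s}\right)^{2}}
         \approx 1.11\times10^{-17}\ \mathrm{kg}
\]
```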

The mass–energy equivalence formula was displayed on Taipei 101 during the event of the World Year of Physics 2005.


Binding energy and the "mass defect"


Whenever any type of energy is removed from a system, the mass associated with the energy is also removed, and the system therefore loses mass. This mass defect in the system may be simply calculated as Δm = ΔE/c², and this was the form of the equation historically first presented by Einstein in 1905. However, use of this formula in such circumstances has led to the false idea that mass has been "converted" to energy. This may be particularly the case when the energy (and mass) removed from the system is associated with the binding energy of the system. In such cases, the binding energy is observed as a "mass defect" or deficit in the new system. The fact that the released energy is not easily weighed in many such cases may cause its mass to be neglected as though it no longer existed. This circumstance, along with the real conversion of matter (not mass) to energy in some high-energy particle reactions, has caused the conversion of matter to energy (which occurs) to be conflated with the false idea of conversion of mass to energy, which does not occur.

The difference between the rest mass of a bound system and of the unbound parts is the binding energy of the system, if this energy has been removed after binding. For example, a water molecule weighs a little less than two free hydrogen atoms and an oxygen atom; the minuscule mass difference is the energy that is needed to split the molecule into three individual atoms (divided by c²), and which was given off as heat when the molecule formed (this heat had mass). Likewise, a stick of dynamite in theory weighs a little bit more than the fragments after the explosion, but this is true only so long as the fragments are cooled and the heat removed. In this case the mass difference is the energy/heat that is released when the dynamite explodes, and when this heat escapes, the mass associated with it escapes, only to be deposited in the surroundings which absorb the heat (so that total mass is conserved).
Such a change in mass may only happen when the system is open, and the energy and mass escape. Thus, if a stick of dynamite is blown up in a hermetically sealed chamber, the mass of the chamber and fragments, the heat, sound, and light would still be equal to the original mass of the chamber and dynamite. If sitting on a scale, the weight and mass would not change. This would in theory also happen even with a nuclear bomb, if it could be kept in an ideal box of infinite strength, which did not rupture or pass radiation.[15] Thus, a 21.5 kiloton (9 × 10¹³ joule) nuclear bomb produces about one gram of heat and electromagnetic radiation, but the mass of this energy would not be detectable in an exploded bomb in an ideal box sitting on a scale; instead, the contents of the box would be heated to millions of degrees without changing total mass and weight. If then, however, a transparent window (passing only electromagnetic radiation) were opened in such an ideal box after the explosion, and a beam of X-rays and other lower-energy light allowed to escape the box, it would eventually be found to weigh one gram less than it had before the explosion. This weight loss and mass loss would happen as the box was cooled by this process to room temperature. However, any surrounding mass which had absorbed the X-rays (and other "heat") would gain this gram of mass from the resulting heating, so the mass "loss" would represent merely its relocation. Thus, no mass (or, in the case of a nuclear bomb, no matter) would be "converted" to energy in such a process. Mass and energy, as always, would both be separately conserved.

Massless particles
Massless particles have zero rest mass. Their relativistic mass is simply their relativistic energy divided by c², or m(relativistic) = E/c².[16][17] The energy for photons is E = hf, where h is Planck's constant and f is the photon frequency. This frequency, and thus the relativistic energy, is frame-dependent. If an observer runs away from a photon in the direction it travels from a source, having it catch up with the observer, then when the photon catches up it will be seen as having less energy than it had at the source. The faster the observer is traveling with regard to the source when the photon catches up, the less energy the photon will have. As an observer approaches the speed of light with regard to the source, the photon looks redder and redder, by the relativistic Doppler effect, and the energy of a very long-wavelength photon approaches zero. This is why a photon is said to be massless: its rest mass is zero.
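The red-shifting of a photon seen by a receding observer can be illustrated with a small calculation. This sketch assumes the standard longitudinal relativistic Doppler factor √((1 − β)/(1 + β)), which the text refers to but does not state; the function names and the sample frequency are illustrative only.

```python
import math

H = 6.62607015e-34  # Planck constant in J*s (exact SI value)


def photon_energy(frequency_hz):
    """Photon energy E = h f, in joules."""
    return H * frequency_hz


def receding_doppler_factor(beta):
    """Ratio f_observed / f_source for an observer moving directly away from
    the source at speed beta * c (0 <= beta < 1)."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


if __name__ == "__main__":
    f_source = 5.0e14  # Hz, roughly visible light (illustrative value)
    for beta in (0.0, 0.5, 0.9, 0.99, 0.9999):
        f_observed = f_source * receding_doppler_factor(beta)
        print(f"beta = {beta:6.4f}   E_observed = {photon_energy(f_observed):.3e} J")
```

The observed energy shrinks toward zero as beta approaches 1, but no observer speed below c makes it exactly zero.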


Massless particles contribute rest mass and invariant mass to systems


Two photons moving in different directions cannot both be made to have arbitrarily small total energy by changing frames, or by moving toward or away from them. The reason is that in a two-photon system, the energy of one photon is decreased by chasing after it, but the energy of the other will increase with the same shift in observer motion. Two photons not moving in the same direction will exhibit an inertial frame where the combined energy is smallest, but not zero. This is called the center of mass frame or the center of momentum frame; these terms are almost synonyms (the center of mass frame is the special case of a center of momentum frame where the center of mass is put at the origin). The most that chasing a pair of photons can accomplish to decrease their energy is to put the observer in a frame where the photons have equal energy and are moving directly away from each other. In this frame, the observer is now moving in the same direction and at the same speed as the center of mass of the two photons. The total momentum of the photons is now zero, since their momenta are equal and opposite. In this frame the two photons, as a system, have a mass equal to their total energy divided by c². This mass is called the invariant mass of the pair of photons together. It is the smallest mass and energy the system may be seen to have, by any observer. It is only the invariant mass of a two-photon system that can be used to make a single particle with the same rest mass.

If the photons are formed by the collision of a particle and an antiparticle, the invariant mass is the same as the total energy of the particle and antiparticle (their rest energy plus the kinetic energy), in the center of mass frame, where they will automatically be moving in equal and opposite directions (since they have equal momentum in this frame).
If the photons are formed by the disintegration of a single particle with a well-defined rest mass, like the neutral pion, the invariant mass of the photons is equal to the rest mass of the pion. In this case, the center of mass frame for the pion is just the frame where the pion is at rest, and the center of mass does not change after it disintegrates into two photons. After the two photons are formed, their center of mass is still moving the same way the pion did, and their total energy in this frame adds up to the mass energy of the pion. Thus, by calculating the invariant mass of pairs of photons in a particle detector, pairs can be identified that were probably produced by pion disintegration.

A similar calculation illustrates that the invariant mass of systems is conserved, even when massive particles (particles with rest mass) within the system are converted to massless particles (such as photons). In such cases, the photons contribute invariant mass to the system, even though they individually have no invariant mass or rest mass. Thus, an electron and positron (each of which has rest mass) may undergo annihilation with each other to produce two photons, each of which is massless (has no rest mass). However, in such circumstances, no system mass is lost. Instead, the system of both photons moving away from each other has an invariant mass, which acts like a rest mass for any system in which the photons are trapped, or that can be weighed. Thus, not only the quantity of relativistic mass, but also the quantity of invariant mass does not change in transformations between "matter" (electrons and positrons) and energy (photons).
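The invariant mass of a photon pair, as used to identify pion decays, can be sketched as follows. For two massless particles the energy–momentum relationship gives M²c⁴ = (E₁ + E₂)² − |p₁c + p₂c|² = 2·E₁·E₂·(1 − cos θ), where θ is the angle between the photon directions. The function name is hypothetical, and the pion rest energy of roughly 135 MeV is an approximate value assumed for illustration; it does not appear in the text.

```python
import math


def two_photon_invariant_mass(e1_mev, e2_mev, opening_angle_rad):
    """Invariant mass (in MeV/c^2) of two photons with energies e1, e2 (MeV)
    whose directions are separated by opening_angle_rad.

    Uses M^2 c^4 = 2 * E1 * E2 * (1 - cos(theta)), valid for massless particles.
    """
    return math.sqrt(2.0 * e1_mev * e2_mev * (1.0 - math.cos(opening_angle_rad)))


if __name__ == "__main__":
    # Pion at rest (assumed ~135 MeV): two back-to-back photons of equal energy.
    print(two_photon_invariant_mass(67.5, 67.5, math.pi))

    # Same pair seen by an observer moving along the photon axis: one photon is
    # blue-shifted by a factor k, the other red-shifted by 1/k (k = 2 here).
    print(two_photon_invariant_mass(67.5 * 2.0, 67.5 / 2.0, math.pi))

    # Two photons travelling in exactly the same direction have no invariant mass.
    print(two_photon_invariant_mass(67.5, 67.5, 0.0))
```

The first two calls give the same result even though the individual photon energies differ, which is the sense in which the mass of the pair is invariant.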

Relation to gravity
In physics, there are two distinct concepts of mass: the gravitational mass and the inertial mass. The gravitational mass is the quantity that determines the strength of the gravitational field generated by an object, as well as the gravitational force acting on the object when it is immersed in a gravitational field produced by other bodies. The inertial mass, on the other hand, quantifies how much an object accelerates if a given force is applied to it. The mass–energy equivalence in special relativity refers to the inertial mass. However, already in the context of Newtonian gravity, the Weak Equivalence Principle is postulated: the gravitational and the inertial mass of every object are the same. Thus, the mass–energy equivalence, combined with the Weak Equivalence Principle, results in the prediction that all forms of energy contribute to the gravitational field generated by an object. This observation is one of the pillars of the general theory of relativity.

The above prediction, that all forms of energy interact gravitationally, has been subject to experimental tests. The first observation testing this prediction was made in 1919.[18] During a solar eclipse, Arthur Eddington observed that the light from stars passing close to the Sun was bent. The effect is due to the gravitational attraction of light by the Sun. The observation confirmed that the energy carried by light indeed is equivalent to a gravitational mass. Another seminal experiment, the Pound–Rebka experiment, was performed in 1960.[19] In this test a beam of light was emitted from the top of a tower and detected at the bottom. The frequency of the light detected was higher than that of the light emitted. This result confirms that the energy of photons increases when they fall in the gravitational field of the Earth. The energy, and therefore the gravitational mass, of photons is proportional to their frequency, as stated by Planck's relation.


Consequences for nuclear physics


Max Planck pointed out that the mass–energy equivalence formula implied that bound systems would have a mass less than the sum of their constituents, once the binding energy had been allowed to escape. However, Planck was thinking about chemical reactions, where the binding energy is too small to measure. Einstein suggested that radioactive materials such as radium would provide a test of the theory, but even though a large amount of energy is released per atom in radium, due to the half-life of the substance (1602 years), only a small fraction of radium atoms decay over an experimentally measurable period of time.

Once the nucleus was discovered, experimenters realized that the very high binding energies of the atomic nuclei should allow calculation of their binding energies, simply from mass differences. But it was not until the discovery of the neutron in 1932, and the measurement of the neutron mass, that this calculation could actually be performed (see nuclear binding energy for an example calculation). A little while later, the first transmutation reactions (such as[20] the Cockcroft–Walton experiment: ⁷Li + p → 2 ⁴He) verified Einstein's formula to an accuracy of 0.5%. In 2005, Rainville et al. published a direct test of the energy equivalence of mass lost in the binding energy of a neutron to atoms of particular isotopes of silicon and sulfur, by comparing the mass lost to the energy of the emitted gamma ray associated with the neutron capture. The binding mass loss agreed with the gamma ray energy to a precision of 0.00004%, the most accurate test of E = mc² to date.[3]

The mass–energy equivalence formula was used in the understanding of nuclear fission reactions, and implies the great amount of energy that can be released by a nuclear fission chain reaction, used in both nuclear weapons and nuclear power. By measuring the mass of different atomic nuclei and subtracting from that number the total mass of the protons and neutrons as they would weigh separately, one gets the exact binding energy available in an atomic nucleus. This is used to calculate the energy released in any nuclear reaction, as the difference in the total mass of the nuclei that enter and exit the reaction.
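The procedure described above, taking the difference in total mass between the nuclei that enter and exit a reaction, can be sketched for the Cockcroft–Walton reaction ⁷Li + p → 2 ⁴He. The atomic masses and the conversion factor below are approximate tabulated values supplied here as assumptions; they are not given in the text, and the function name is hypothetical.

```python
# Approximate atomic masses in unified atomic mass units (u); assumed values.
MASS_LI7_U = 7.016003
MASS_H1_U = 1.007825
MASS_HE4_U = 4.002603

# Energy equivalent of 1 u in MeV (approximate; assumed value).
MEV_PER_U = 931.494


def q_value_mev(reactant_masses_u, product_masses_u):
    """Energy released (MeV) = (mass in - mass out) * c^2, with masses in u."""
    mass_defect_u = sum(reactant_masses_u) - sum(product_masses_u)
    return mass_defect_u * MEV_PER_U


if __name__ == "__main__":
    # Atomic masses are used on both sides; the electron counts balance
    # (3 + 1 electrons in, 2 + 2 electrons out), so they cancel.
    q = q_value_mev([MASS_LI7_U, MASS_H1_U], [MASS_HE4_U, MASS_HE4_U])
    print(f"Energy released: {q:.2f} MeV")
```

With these assumed masses the mass defect is about 0.0186 u, which corresponds to roughly 17 MeV carried off as kinetic energy of the two helium nuclei.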

Task Force One, the world's first nuclear-powered task force. Enterprise, Long Beach and Bainbridge in formation in the Mediterranean, 18 June 1964. Enterprise crew members are spelling out Einstein's mass–energy equivalence formula E = mc² on the flight deck.


Practical examples
Einstein used the CGS system of units (centimeters, grams, seconds, dynes, and ergs), but the formula is independent of the system of units. In natural units, the speed of light is defined to equal 1, and the formula expresses an identity: E = m. In the SI system (expressing the ratio E/m in joules per kilogram using the value of c in meters per second):

E/m = c² = (299,792,458 m/s)² = 89,875,517,873,681,764 J/kg (≈ 9.0 × 10¹⁶ joules per kilogram).

So the energy equivalent of one gram (1/1000 of a kilogram) of mass is equivalent to:
- 89.9 terajoules
- 25.0 million kilowatt-hours (≈ 25 GWh)
- 21.5 billion kilocalories (≈ 21 Tcal)[21]
- 85.2 billion BTUs[21]

or to the energy released by combustion of the following:
- 21.5 kilotons of TNT-equivalent energy (≈ 21 kt)[21]
- 568,000 US gallons of automotive gasoline

Any time energy is generated, the process can be evaluated from an E = mc² perspective. For instance, the "Gadget"-style bomb used in the Trinity test and the bombing of Nagasaki had an explosive yield equivalent to 21 kt of TNT. About 1 kg of the approximately 6.15 kg of plutonium in each of these bombs fissioned into lighter elements totaling almost exactly one gram less, after cooling. The electromagnetic radiation and kinetic energy (thermal and blast energy) released in this explosion carried the missing one gram of mass.[22] This occurs because nuclear binding energy is released whenever elements with more than 62 nucleons fission.

Another example is hydroelectric generation. The electrical energy produced by Grand Coulee Dam's turbines every 3.7 hours represents one gram of mass. This mass passes to the electrical devices (such as lights in cities) which are powered by the generators, where it appears as a gram of heat and light.[23] Turbine designers look at their equations in terms of pressure, torque, and RPM. However, Einstein's equations show that all energy has mass, and thus the electrical energy produced by a dam's generators, and the heat and light which result from it, all retain their mass, which is equivalent to the energy. The potential energy (and equivalent mass) represented by the waters of the Columbia River as it descends to the Pacific Ocean would be converted to heat due to viscous friction and the turbulence of white water rapids and waterfalls were it not for the dam and its generators. This heat would remain as mass on site at the water, were it not for the equipment which converted some of this potential and kinetic energy into electrical energy, which can be moved from place to place (taking mass with it).

Whenever energy is added to a system, the system gains mass:
- A spring's mass increases whenever it is put into compression or tension. Its added mass arises from the added potential energy stored within it, which is bound in the stretched chemical (electron) bonds linking the atoms within the spring.
- Raising the temperature of an object (increasing its heat energy) increases its mass. For example, consider the world's primary mass standard for the kilogram, made of platinum/iridium. If its temperature is allowed to change by 1 °C, its mass will change by 1.5 picograms (1 pg = 1 × 10⁻¹² g).[24]
- A spinning ball will weigh more than a ball that is not spinning. Its increase of mass is exactly the equivalent of the mass of energy of rotation, which is itself the sum of the kinetic energies of all the moving parts of the ball. For example, the Earth itself is more massive due to its daily rotation than it would be with no rotation. This rotational energy (2.14 × 10²⁹ J) represents 2.38 billion metric tons of added mass.[25]

Note that no net mass or energy is really created or lost in any of these examples and scenarios. Mass/energy simply moves from one place to another.
These are some examples of the transfer of energy and mass in accordance with the principle of mass–energy conservation.
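Several of the figures in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the unit conversion factors are conventional values assumed here, the specific heat used for the platinum/iridium kilogram (about 133 J per kg per kelvin) is an assumed approximate value not taken from the text, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s (exact SI value)

# Conventional conversion factors (assumed for this sketch).
JOULES_PER_KWH = 3.6e6
JOULES_PER_KCAL = 4184.0         # thermochemical kilocalorie
JOULES_PER_BTU = 1055.06         # approximate
JOULES_PER_KILOTON_TNT = 4.184e12

# Assumed approximate specific heat of a platinum/iridium alloy, J/(kg*K).
SPECIFIC_HEAT_PT_IR = 133.0


def energy_from_mass(mass_kg):
    """Energy equivalent E = m c^2, in joules."""
    return mass_kg * C**2


def mass_from_energy(energy_j):
    """Mass equivalent m = E / c^2, in kilograms."""
    return energy_j / C**2


if __name__ == "__main__":
    e_gram = energy_from_mass(1.0e-3)  # one gram
    print(f"{e_gram / 1e12:.1f} TJ")
    print(f"{e_gram / JOULES_PER_KWH / 1e6:.1f} million kWh")
    print(f"{e_gram / JOULES_PER_KCAL / 1e9:.1f} billion kcal")
    print(f"{e_gram / JOULES_PER_BTU / 1e9:.1f} billion BTU")
    print(f"{e_gram / JOULES_PER_KILOTON_TNT:.1f} kt of TNT")

    # Earth's rotational energy as quoted in the text, expressed as mass.
    earth_kg = mass_from_energy(2.14e29)
    print(f"{earth_kg / 1e3 / 1e9:.2f} billion metric tons")

    # Warming a 1 kg platinum/iridium cylinder by 1 K (assumed specific heat).
    added_kg = mass_from_energy(1.0 * SPECIFIC_HEAT_PT_IR * 1.0)
    print(f"{added_kg * 1e15:.2f} pg")
```

Both directions use the same constant c²; the only difference between the examples is which quantity is treated as the input.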


Efficiency
Although mass cannot be converted to energy, matter particles can be. Also, a certain amount of the ill-defined "matter" in ordinary objects can be converted to active energy (light and heat), even though no identifiable real particles are destroyed. Such conversions happen in nuclear weapons, in which the protons and neutrons in atomic nuclei lose a small fraction of their average mass, but this mass loss is not due to the destruction of any protons or neutrons (or even, in general, lighter particles like electrons). Also the mass is not destroyed, but simply removed from the system in the form of heat and light from the reaction.

In nuclear reactions, typically only a small fraction of the total mass–energy of the bomb is converted into heat, light, radiation and motion, which are "active" forms which can be used. When an atom fissions, it loses only about 0.1% of its mass (which escapes from the system and does not disappear), and in a bomb or reactor not all the atoms can fission. In a fission-based atomic bomb, the efficiency is only 40%, so only 40% of the fissionable atoms actually fission, and only 0.04% of the total mass appears as energy in the end. In nuclear fusion, more of the mass is released as usable energy, roughly 0.3%. But in a fusion bomb (see nuclear weapon yield), the bomb mass is partly casing and non-reacting components, so that in practice, no more than about 0.03% of the total mass of the entire weapon is released as usable energy (which, again, retains the "missing" mass).

In theory, it should be possible to convert all of the mass in matter into heat and light (which would of course have the same mass), but none of the theoretically known methods are practical. One way to convert all matter into usable energy is to annihilate matter with antimatter. But antimatter is rare in our universe, and must be made first. Due to inefficient mechanisms of production, making antimatter always requires far more energy than would be released when it was annihilated.

Since most of the mass of ordinary objects resides in protons and neutrons, in order to convert all ordinary matter to useful energy, the protons and neutrons must be converted to lighter particles. In the standard model of particle physics, the number of protons plus neutrons is nearly exactly conserved. Still, Gerard 't Hooft showed that there is a process which will convert protons and neutrons to antielectrons and neutrinos.[26] This is the weak SU(2) instanton proposed by Belavin, Polyakov, Schwarz, and Tyupkin.[27] This process can in principle convert all the mass of matter into neutrinos and usable energy, but it is normally extraordinarily slow. Later it became clear that this process will happen at a fast rate at very high temperatures,[28] since then instanton-like configurations will be copiously produced from thermal fluctuations. The temperature required is so high that it would only have been reached shortly after the big bang.

Many extensions of the standard model contain magnetic monopoles, and in some models of grand unification, these monopoles catalyze proton decay, a process known as the Callan–Rubakov effect.[29] This process would be an efficient mass–energy conversion at ordinary temperatures, but it requires making monopoles and anti-monopoles first. The energy required to produce monopoles is believed to be enormous, but magnetic charge is conserved, so that the lightest monopole is stable. All these properties are deduced in theoretical models; magnetic monopoles have never been observed, nor have they been produced in any experiment so far.

A third known method of total matter–energy conversion is using gravity, specifically black holes. Stephen Hawking theorized[30] that black holes radiate thermally with no regard to how they are formed.
So it is theoretically possible to throw matter into a black hole and use the emitted heat to generate power. According to the theory of Hawking radiation, however, the black hole used will radiate at a higher rate the smaller it is, producing usable powers at only small black hole masses, where usable may for example be something greater than the local background radiation. It is also worth noting that the ambient irradiated power would change with the mass of the black hole, increasing as the mass of the black hole decreases, or decreasing as the mass increases, at a rate where power is proportional to the inverse square of the mass. In a "practical" scenario, mass and energy could be dumped into the black hole to regulate this growth, or keep its size, and thus power output, near constant. This could result from the fact that mass and energy are lost from the hole with its thermal radiation.
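The inverse-square dependence of the radiated power on the black hole mass can be illustrated numerically. The sketch below is not taken from the article's sources. It assumes the standard idealized formula for a non-rotating, uncharged black hole, P = ħc⁶ / (15360 π G² M²), with rounded SI constants, and the function name `hawking_power` is chosen here for illustration only.

```python
import math

# Physical constants in SI units (rounded reference values; assumptions of this sketch).
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # Newtonian gravitational constant, m^3 kg^-1 s^-2


def hawking_power(mass_kg: float) -> float:
    """Idealized Hawking power in watts for a black hole of the given mass.

    Uses P = hbar * c^6 / (15360 * pi * G^2 * M^2), so the power scales
    as the inverse square of the mass.
    """
    return HBAR * C**6 / (15360 * math.pi * G**2 * mass_kg**2)


if __name__ == "__main__":
    # Illustrative masses only; halving the mass quadruples the power.
    for mass in (1e12, 1e9, 1e6):
        print(f"M = {mass:.0e} kg -> P = {hawking_power(mass):.3e} W")
```

Because the power grows as the hole shrinks, a hole that is not fed new mass radiates ever faster, which is why the text speaks of dumping in mass and energy to hold the output near constant.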

Background
Mass–velocity relationship
In developing special relativity, Einstein found that the kinetic energy of a moving body is

$$E_k = m_0 c^2 (\gamma - 1) = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}} - m_0 c^2,$$

with $v$ the velocity, $m_0$ the rest mass, and $\gamma$ the Lorentz factor.

He included the second term on the right to make sure that for small velocities, the energy would be the same as in classical mechanics:

$$E_k = \frac{1}{2} m_0 v^2 + \cdots$$

Without this second term, there would be an additional contribution in the energy when the particle is not moving. Einstein found that the total momentum of a moving particle is:

$$P = \frac{m_0 v}{\sqrt{1 - v^2/c^2}},$$

and it is this quantity which is conserved in collisions. The ratio of the momentum to the velocity is the relativistic mass, $m$:

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}.$$

The relativistic mass and the relativistic kinetic energy are related by the formula:

$$E_k = m c^2 - m_0 c^2.$$

Einstein wanted to omit the unnatural second term on the right-hand side, whose only purpose is to make the energy at rest zero, and to declare that the particle has a total energy which obeys:

$$E = m c^2,$$

which is a sum of the rest energy $m_0 c^2$ and the kinetic energy. This total energy is mathematically more elegant, and fits better with the momentum in relativity. But to come to this conclusion, Einstein needed to think carefully about collisions. This expression for the energy implied that matter at rest has a huge amount of energy, and it is not clear whether this energy is physically real, or just a mathematical artifact with no physical meaning.

In a collision process where all the rest masses are the same at the beginning as at the end, either expression for the energy is conserved. The two expressions differ only by a constant which is the same at the beginning and at the end of the collision. Still, by analyzing the situation where particles are thrown off a heavy central particle, it is easy to see that the inertia of the central particle is reduced by the total energy emitted. This allowed Einstein to conclude that the inertia of a heavy particle is increased or diminished according to the energy it absorbs or emits.
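The relations in this subsection can be checked against one another numerically. The following sketch assumes the standard special-relativistic expressions (Lorentz factor γ = 1/√(1 − v²/c²), momentum γm₀v, relativistic mass γm₀, kinetic energy m₀c²(γ − 1), total energy mc²). The function names and the sample values are illustrative choices, not part of the article.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def lorentz_factor(v: float) -> float:
    """gamma = 1 / sqrt(1 - v^2/c^2), valid for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def momentum(m0: float, v: float) -> float:
    """Relativistic momentum P = gamma * m0 * v."""
    return lorentz_factor(v) * m0 * v


def relativistic_mass(m0: float, v: float) -> float:
    """Ratio of momentum to velocity, m = gamma * m0."""
    return lorentz_factor(v) * m0


def kinetic_energy(m0: float, v: float) -> float:
    """E_k = m0 c^2 (gamma - 1); vanishes for a body at rest."""
    return m0 * C**2 * (lorentz_factor(v) - 1.0)


def total_energy(m0: float, v: float) -> float:
    """E = m c^2, with m the relativistic mass."""
    return relativistic_mass(m0, v) * C**2


if __name__ == "__main__":
    m0, v = 1.0, 0.6 * C  # illustrative: 1 kg moving at 60% of light speed
    rest_energy = m0 * C**2
    print("gamma                 :", lorentz_factor(v))
    print("E_k / (m0 c^2)        :", kinetic_energy(m0, v) / rest_energy)
    print("(E - m0 c^2) / E_k    :", (total_energy(m0, v) - rest_energy) / kinetic_energy(m0, v))
```

The last printed ratio compares the total energy minus the rest energy with the kinetic energy, which is the relation E_k = mc² − m₀c² stated above.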

Relativistic mass
After Einstein first made his proposal, it became clear that the word mass can have two different meanings. The rest mass is what Einstein called m, but others defined the relativistic mass with an explicit index:

$$m_{\mathrm{rel}} = \frac{m_0}{\sqrt{1 - v^2/c^2}}.$$

This mass is the ratio of momentum to velocity, and it is also the relativistic energy divided by c² (it is not Lorentz-invariant, in contrast to $m_0$). The equation E = m_rel c² holds for moving objects. When the velocity is small, the relativistic mass and the rest mass are almost exactly the same. E = mc² either means E = m₀c² for an object at rest, or E = m_rel c² when the object is moving.

Einstein (following Hendrik Lorentz and Max Abraham) also used velocity- and direction-dependent mass concepts (longitudinal and transverse mass) in his 1905 electrodynamics paper and in another paper in 1906.[31][32] However, in his first paper on E = mc² (1905), he treated m as what would now be called the rest mass.[1] Some claim that (in later years) he did not like the idea of "relativistic mass."[33] When modern physicists say "mass", they are usually talking about rest mass, since if they meant "relativistic mass", they would just say "energy".

Considerable debate has ensued over the use of the concept "relativistic mass" and the connection of "mass" in relativity to "mass" in Newtonian dynamics. For example, one view is that only rest mass is a viable concept and is a property of the particle, while relativistic mass is a conglomeration of particle properties and properties of spacetime. A perspective that avoids this debate, due to Kjell Vøyenli, is that the Newtonian concept of mass as a particle property and the relativistic concept of mass have to be viewed as embedded in their own theories and as having no precise connection.[34][35]

Low-speed expansion
We can rewrite the expression $E = \gamma m_0 c^2$ as a Taylor series:

$$E = m_0 c^2 \left[ 1 + \frac{1}{2} \left( \frac{v}{c} \right)^2 + \frac{3}{8} \left( \frac{v}{c} \right)^4 + \frac{5}{16} \left( \frac{v}{c} \right)^6 + \cdots \right].$$

For speeds much smaller than the speed of light, higher-order terms in this expression get smaller and smaller because v/c is small. For low speeds we can ignore all but the first two terms:

$$E \approx m_0 c^2 + \frac{1}{2} m_0 v^2.$$

The total energy is a sum of the rest energy and the Newtonian kinetic energy. The classical energy equation ignores both the m₀c² part and the high-speed corrections. This is appropriate, because all the high-order corrections are small. Since only changes in energy affect the behavior of objects, whether we include the m₀c² part makes no difference, since it is constant. For the same reason, it is possible to subtract the rest energy from the total energy in relativity. By considering the emission of energy in different frames, Einstein could show that the rest energy has a real physical meaning.

The higher-order terms are extra corrections to Newtonian mechanics which become important at higher speeds. The Newtonian equation is only a low-speed approximation, but an extraordinarily good one. All of the calculations used in putting astronauts on the moon, for example, could have been done using Newton's equations without any of the higher-order corrections.
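How quickly the series converges, and how small the first relativistic correction is at spaceflight speeds, can be seen with a short numerical sketch. It assumes the binomial expansion of γ = (1 − β²)^(−1/2) with β = v/c, whose coefficients 1, 1/2, 3/8, 5/16, ... follow the ratio (2n + 1)/(2n + 2). The speed of 11 km/s is an assumed round figure for a lunar mission, not a value from the article.

```python
import math


def gamma_exact(beta: float) -> float:
    """Exact Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def gamma_series(beta: float, terms: int) -> float:
    """Partial sum 1 + (1/2) beta^2 + (3/8) beta^4 + (5/16) beta^6 + ..."""
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff * beta ** (2 * n)
        coeff *= (2 * n + 1) / (2 * n + 2)  # ratio of successive coefficients
    return total


def leading_correction_ratio(beta: float) -> float:
    """Size of the (3/8) beta^4 term relative to the Newtonian (1/2) beta^2 term."""
    return 0.75 * beta**2


if __name__ == "__main__":
    c = 299_792_458.0
    beta = 11_000.0 / c  # assumed speed of roughly 11 km/s
    print("beta                         :", beta)
    print("correction / Newtonian term  :", leading_correction_ratio(beta))
    print("series (4 terms) at beta=0.3 :", gamma_series(0.3, 4))
    print("exact gamma at beta=0.3      :", gamma_exact(0.3))
```

At the assumed speed the first correction is of order one part in a billion of the Newtonian kinetic energy, which is the sense in which the Newtonian equation is an extraordinarily good approximation.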

History
While Einstein was the first to have correctly deduced the mass–energy equivalence formula, he was not the first to have related energy with mass. But nearly all previous authors thought that the energy which contributes to mass comes only from electromagnetic fields.[36][37][38][39]

Newton: matter and light


In 1717 Isaac Newton speculated that light particles and matter particles were inter-convertible in "Query 30" of the Opticks, where he asks: "Are not the gross bodies and light convertible into one another, and may not bodies receive much of their activity from the particles of light which enter their composition?"

Swedenborg: matter composed of "pure and total motion"


In 1734 Emanuel Swedenborg in his Principia theorized that all matter is ultimately composed of dimensionless points of "pure and total motion." He described this motion as being without force, direction or speed, but having the potential for force, direction and speed everywhere within it.[40][41]

Electromagnetic mass
There were many attempts in the 19th and the beginning of the 20th century, like those of J. J. Thomson (1881), Oliver Heaviside (1888), George Frederick Charles Searle (1897), Wilhelm Wien (1900), Max Abraham (1902), and Hendrik Antoon Lorentz (1904), to understand how the mass of a charged object depends on the electrostatic field.[36][37] This concept was called electromagnetic mass, and was considered as being dependent on velocity and direction as well. Lorentz (1904) gave the following expressions for longitudinal and transverse electromagnetic mass:

$$m_L = \frac{m_0}{\left( \sqrt{1 - v^2/c^2} \right)^3}, \qquad m_T = \frac{m_0}{\sqrt{1 - v^2/c^2}},$$

where $m_0 = \frac{4}{3} \frac{E_{em}}{c^2}$.

Radiation pressure and inertia


Another way of deriving some sort of electromagnetic mass was based on the concept of radiation pressure. In 1900, Henri Poincaré associated electromagnetic radiation energy with a "fictitious fluid" having momentum and mass $m_{em} = E_{em}/c^2$. By that, Poincaré tried to save the center of mass theorem in Lorentz's theory, though his treatment led to radiation paradoxes.[39] Friedrich Hasenöhrl showed in 1904 that electromagnetic cavity radiation contributes the "apparent mass" $m_0 = \frac{4}{3} \frac{E_{em}}{c^2}$ to the cavity's mass. He argued that this implies mass dependence on temperature as well.[42]

Einstein: mass–energy equivalence


Albert Einstein did not formulate exactly the formula E = mc² in his 1905 Annus Mirabilis paper "Does the Inertia of a Body Depend Upon Its Energy Content?";[1] rather, the paper states that if a body gives off the energy L in the form of radiation, its mass diminishes by L/c². (Here, "radiation" means electromagnetic radiation, or light, and mass means the ordinary Newtonian mass of a slow-moving object.) This formulation relates only a change Δm in mass to a change L in energy without requiring the absolute relationship.

Objects with zero mass presumably have zero energy, so the extension that all mass is proportional to energy is obvious from this result. In 1905, even the hypothesis that changes in energy are accompanied by changes in mass was untested. Not until the discovery of the first type of antimatter (the positron in 1932) was it found that all of the mass of pairs of resting particles could be converted to radiation.

First derivation (1905)
Already in his relativity paper "On the electrodynamics of moving bodies", Einstein derived the correct expression for the kinetic energy of particles:

$$E_k = m_0 c^2 \left( \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \right).$$

Now the question remained open as to which formulation applies to bodies at rest. This was tackled by Einstein in his paper "Does the inertia of a body depend upon its energy content?". Einstein used a body emitting two light pulses in opposite directions, having energies of $E_0$ before and $E_1$ after the emission as seen in its rest frame. As seen from a moving frame, this becomes $H_0$ and $H_1$. With $E$ the total energy of the emitted light, Einstein obtained:

$$(H_0 - E_0) - (H_1 - E_1) = E \left( \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \right),$$

then he argued that $H - E$ can only differ from the kinetic energy $K$ by an additive constant, which gives

$$K_0 - K_1 = E \left( \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \right).$$

Neglecting effects higher than third order in $v/c$ this gives:

$$K_0 - K_1 = \frac{E}{c^2} \frac{v^2}{2}.$$

Thus Einstein concluded that the emission reduces the body's mass by $E/c^2$, and that the mass of a body is a measure of its energy content.

The correctness of Einstein's 1905 derivation of E = mc² was criticized by Max Planck (1907), who argued that it is only valid to first approximation. Another criticism was formulated by Herbert Ives (1952) and Max Jammer (1961), asserting that Einstein's derivation is based on begging the question.[43][44] On the other hand, John Stachel and Roberto Torretti (1982) argued that Ives' criticism was wrong, and that Einstein's derivation was correct.[45] Hans Ohanian (2008) agreed with Stachel/Torretti's criticism of Ives, though he argued that Einstein's derivation was wrong for other reasons.[46] For a recent review, see Hecht (2011).[47]

Alternative version
An alternative version of Einstein's thought experiment was proposed by Fritz Rohrlich (1990), who based his reasoning on the Doppler effect.[48] Like Einstein, he considered a body at rest with mass M. If the body is examined in a frame moving with nonrelativistic velocity v, it is no longer at rest and in the moving frame it has momentum P = Mv. Then he supposed the body emits two pulses of light to the left and to the right, each carrying an equal amount of energy E/2. In its rest frame, the object remains at rest after the emission since the two beams are equal in strength and carry opposite momentum. But if the same process is considered in a frame moving with velocity v to the left, the pulse moving to the left will be redshifted while the pulse moving to the right will be blueshifted. The blue light carries more momentum than the red light, so that the momentum of the light in the moving frame is not balanced: the light is carrying some net momentum to the right.

The object has not changed its velocity before or after the emission. Yet in this frame it has lost some right-momentum to the light. The only way it could have lost momentum is by losing mass. This also solves Poincaré's radiation paradox, discussed above.

The velocity is small, so the right-moving light is blueshifted by an amount equal to the nonrelativistic Doppler shift factor 1 + v/c. The momentum of the light is its energy divided by c, and it is increased by a factor of v/c. So the right-moving light is carrying an extra momentum $\Delta P$ given by:

$$\Delta P = \frac{v}{c} \frac{E}{2c}.$$

The left-moving light carries a little less momentum, by the same amount $\Delta P$. So the total right-momentum in the light is twice $\Delta P$. This is the right-momentum that the object lost:

$$2 \Delta P = v \frac{E}{c^2}.$$

The momentum of the object in the moving frame after the emission is reduced by this amount:

$$P' = M v - 2 \Delta P = \left( M - \frac{E}{c^2} \right) v.$$

So the change in the object's mass is equal to the total energy lost divided by $c^2$. Since any emission of energy can be carried out by a two-step process, where first the energy is emitted as light and then the light is converted to some other form of energy, any emission of energy is accompanied by a loss of mass. Similarly, by considering absorption, a gain in energy is accompanied by a gain in mass.

Relativistic center-of-mass theorem (1906)
Like Poincaré, Einstein concluded in 1906 that the inertia of electromagnetic energy is a necessary condition for the center-of-mass theorem to hold. On this occasion, Einstein referred to Poincaré's 1900 paper and wrote:[49] "Although the merely formal considerations, which we will need for the proof, are already mostly contained in a work by H. Poincaré², for the sake of clarity I will not rely on that work."[50]

In Einstein's more physical, as opposed to formal or mathematical, point of view, there was no need for fictitious masses. He could avoid the perpetuum mobile problem, because on the basis of the mass–energy equivalence he could show that the transport of inertia which accompanies the emission and absorption of radiation solves the problem. Poincaré's rejection of the principle of action–reaction can be avoided through Einstein's E = mc², because mass conservation appears as a special case of the energy conservation law.
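The momentum bookkeeping in Rohrlich's Doppler-shift version of the thought experiment can be followed step by step in a few lines. This is only a first-order sketch in v/c under the assumptions stated in that argument (equal pulses of energy E/2, unchanged velocity of the body). The function name and the sample numbers are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s (exact SI value)


def mass_after_emission(mass: float, energy: float, v: float) -> float:
    """Mass of the body after emitting two light pulses of total energy `energy`.

    Works in the frame where the body moves to the right with small speed v,
    keeping only terms of first order in v/c.
    """
    p_before = mass * v                       # P = M v
    delta_p = (v / C) * (energy / (2.0 * C))  # extra momentum of the blueshifted pulse
    p_light_net = 2.0 * delta_p               # redshifted pulse lacks the same amount
    p_after = p_before - p_light_net          # momentum conservation
    return p_after / v                        # velocity is unchanged, so M' = P'/v


if __name__ == "__main__":
    m, e, v = 1.0, 9.0e13, 30.0  # illustrative: 1 kg body, 9e13 J emitted, 30 m/s
    print("mass lost :", m - mass_after_emission(m, e, v), "kg")
    print("E / c^2   :", e / C**2, "kg")
```

The two printed values agree, which is the statement that the lost mass equals the emitted energy divided by c², independent of the small speed v chosen for the frame.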

Others
During the nineteenth century there were several speculative attempts to show that mass and energy were proportional in various ether theories.[51] In 1873 Nikolay Umov pointed out a relation between mass and energy for ether in the form of E = kmc², where 0.5 ≤ k ≤ 1.[52] The writings of Samuel Tolver Preston,[53][54] and a 1903 paper by Olinto De Pretto,[55][56] presented a mass–energy relation. De Pretto's paper received recent press coverage when Umberto Bartocci discovered that there were only three degrees of separation linking De Pretto to Einstein, leading Bartocci to conclude that Einstein was probably aware of De Pretto's work.[57]

Preston and De Pretto, following Le Sage, imagined that the universe was filled with an ether of tiny particles which are always moving at speed c. Each of these particles has a kinetic energy of mc² up to a small numerical factor. The nonrelativistic kinetic energy formula did not always include the traditional factor of 1/2, since Leibniz introduced kinetic energy without it, and the 1/2 is largely conventional in prerelativistic physics.[58] By assuming that every particle has a mass which is the sum of the masses of the ether particles, the authors would conclude that all matter contains an amount of kinetic energy either given by E = mc² or 2E = mc² depending on the convention. A particle ether was usually considered unacceptably speculative science at the time,[59] and since these authors did not formulate relativity, their reasoning is completely different from that of Einstein, who used relativity to change frames.

Independently, Gustave Le Bon in 1905 speculated that atoms could release large amounts of latent energy, reasoning from an all-encompassing qualitative philosophy of physics.[60][61]

Radioactivity and nuclear energy


It was quickly noted after the discovery of radioactivity in 1896 that the total energy due to radioactive processes is about one million times greater than that involved in any known molecular change. However, it raised the question of where this energy was coming from. After eliminating the idea of absorption and emission of some sort of Lesagian ether particles, the existence of a huge amount of latent energy, stored within matter, was proposed by Ernest Rutherford and Frederick Soddy in 1903. Rutherford also suggested that this internal energy is stored within normal matter as well. He went on to speculate in 1904:[62][63] "If it were ever found possible to control at will the rate of disintegration of the radio-elements, an enormous amount of energy could be obtained from a small quantity of matter."

Einstein's equation is in no way an explanation of the large energies released in radioactive decay (this comes from the powerful nuclear forces involved; forces that were still unknown in 1905). In any case, the enormous energy released from radioactive decay (which had been measured by Rutherford) was much more easily measured than the (still small) resulting change in the gross mass of materials. Einstein's equation, by theory, can give these energies by measuring mass differences before and after reactions, but in practice, these mass differences in 1905 were still too small to be measured in bulk. Prior to this, it was thought that the ease of measuring radioactive decay energies with a calorimeter might allow measurement of changes in mass difference, as a check on Einstein's equation itself. Einstein mentions in his 1905 paper that mass–energy equivalence might perhaps be tested with radioactive decay, which releases enough energy (the quantitative amount known roughly by 1905) to possibly be "weighed" when missing from the system (having been given off as heat).

However, radioactivity seemed to proceed at its own unalterable (and quite slow, for radioactives known then) pace, and even when simple nuclear reactions became possible using proton bombardment, the idea that these great amounts of usable energy could be liberated at will with any practicality proved difficult to substantiate. It had been used as the basis of much speculation, causing Rutherford himself to later reject his ideas of 1904; he was reported in 1933 to have said: "Anyone who expects a source of power from the transformation of the atom is talking moonshine."[64]

This situation changed dramatically in 1932 with the discovery of the neutron and its mass, allowing mass differences for single nuclides and their reactions to be calculated directly, and compared with the sum of masses for the particles that made up their composition. In 1933, the energy released from the reaction of lithium-7 plus protons giving rise to two alpha particles (as noted above by Rutherford) allowed Einstein's equation to be tested to an error of 0.5%. However, scientists still did not see such reactions as a source of power.

After the very public demonstration of huge energies released from nuclear fission after the atomic bombings of Hiroshima and Nagasaki in 1945, the equation E = mc² became directly linked in the public eye with the power and peril of nuclear weapons. The equation was featured as early as page 2 of the Smyth Report, the official 1945 release by the US government on the development of the atomic bomb, and by 1946 the equation was linked closely enough with Einstein's work that the cover of Time magazine prominently featured a picture of Einstein next to an image of a mushroom cloud emblazoned with the equation.[65] Einstein himself had only a minor role in the Manhattan Project: he had cosigned a letter to the U.S. President in 1939 urging funding for research into atomic energy, warning that an atomic bomb was theoretically possible. The letter persuaded Roosevelt to devote a significant portion of the wartime budget to atomic research. Without a security clearance, Einstein's only scientific contribution was an analysis of an isotope separation method in theoretical terms. It was inconsequential, on account of Einstein not being given sufficient information (for security reasons) to fully work on the problem.[66]

While E = mc² is useful for understanding the amount of energy potentially released in a fission reaction, it was not strictly necessary to develop the weapon once the fission process was known and its energy measured at 200 MeV (which was directly possible, using a quantitative Geiger counter, at that time). As the physicist and Manhattan Project participant Robert Serber put it: "Somehow the popular notion took hold long ago that Einstein's theory of relativity, in particular his famous equation E = mc², plays some essential role in the theory of fission. Albert Einstein had a part in alerting the United States government to the possibility of building an atomic bomb, but his theory of relativity is not required in discussing fission. The theory of fission is what physicists call a non-relativistic theory, meaning that relativistic effects are too small to affect the dynamics of the fission process significantly."[67] However, the association between E = mc² and nuclear energy has since stuck, and because of this association, and its simple expression of the ideas of Albert Einstein himself, it has become "the world's most famous equation".[68]

While Serber's view of the strict lack of need to use mass–energy equivalence in designing the atomic bomb is correct, it does not take into account the pivotal role which this relationship played in making the fundamental leap to the initial hypothesis that large atoms were energetically allowed to split into approximately equal parts (before this energy was in fact measured). In late 1938, while on the winter walk on which they solved the meaning of Hahn's experimental results and introduced the idea that would be called atomic fission, Lise Meitner and Otto Robert Frisch made direct use of Einstein's equation to help them understand the quantitative energetics of the reaction which overcame the "surface tension-like" forces holding the nucleus together, and allowed the fission fragments to separate to a configuration from which their charges could force them into an energetic "fission". To do this, they made use of "packing fraction", or nuclear binding energy values for elements, which Meitner had memorized. These, together with use of E = mc², allowed them to realize on the spot that the basic fission process was energetically possible:

"...We walked up and down in the snow, I on skis and she on foot. ...and gradually the idea took shape... explained by Bohr's idea that the nucleus is like a liquid drop; such a drop might elongate and divide itself... We knew there were strong forces that would resist, ...just as surface tension. But nuclei differed from ordinary drops. At this point we both sat down on a tree trunk and started to calculate on scraps of paper. ...the Uranium nucleus might indeed be a very wobbly, unstable drop, ready to divide itself... But, ...when the two drops separated they would be driven apart by electrical repulsion, about 200 MeV in all. Fortunately Lise Meitner remembered how to compute the masses of nuclei... and worked out that the two nuclei formed... would be lighter by about one-fifth the mass of a proton. Now whenever mass disappears energy is created, according to Einstein's formula E = mc², and... the mass was just equivalent to 200 MeV; it all fitted!"[69][70]
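The estimate in Frisch's recollection can be reproduced with one multiplication. The sketch below assumes a rounded proton rest energy of 938.272 MeV and the exact SI conversion from MeV to joules. The function name is illustrative, and "one-fifth the mass of a proton" is taken directly from the quotation.

```python
PROTON_REST_ENERGY_MEV = 938.272     # m_p c^2 in MeV (rounded reference value)
JOULES_PER_MEV = 1.602176634e-13     # exact SI conversion factor


def mass_defect_energy_mev(defect_in_proton_masses: float) -> float:
    """Energy equivalent (MeV) of a mass defect given in units of the proton mass."""
    return defect_in_proton_masses * PROTON_REST_ENERGY_MEV


if __name__ == "__main__":
    energy_mev = mass_defect_energy_mev(1.0 / 5.0)  # "about one-fifth the mass of a proton"
    print(f"{energy_mev:.1f} MeV per fission")
    print(f"{energy_mev * JOULES_PER_MEV:.2e} J per fission")
```

The result is a little under 190 MeV, which matches the "about 200 MeV" of electrical repulsion energy to the precision of the mental arithmetic described in the quotation.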

References
[1] Einstein, A. (1905), "Ist die Trgheit eines Krpers von seinem Energieinhalt abhngig?", Annalen der Physik 18: 639643, Bibcode1905AnP...323..639E, doi:10.1002/andp.19053231314. See also the English translation. (http:/ / www. fourmilab. ch/ etexts/ einstein/ E_mc2/ www/ ) [2] Paul Allen Tipler, Ralph A. Llewellyn (2003-01), Modern Physics (http:/ / books. google. com/ ?id=tpU18JqcSNkC& lpg=PP1& pg=PA87#v=onepage& q=), W. H. Freeman and Company, pp.8788, ISBN0-7167-4345-0, [3] Rainville, S. et al. World Year of Physics: A direct test of E=mc2. Nature 438, 1096-1097 (22 December 2005) | doi:10.1038/4381096a; Published online 21 December 2005. [4] See the sentence on the last page (p.641) of the original edition of Ist die Trgheit eines Krpers von seinem Energieinhalt abhngig? in Annalen der Physik, 1905, below the equation K0 - K1= L/V2 v2/2. See also the sentence under the last equation in the English translation, K0 - K1= 1/2 L/c2 v2, and the comment on the symbols used in About this edition that follows the translation (http:/ / www. fourmilab. ch/ etexts/ einstein/ E_mc2/ www/ ) [5] See the sentence on the last page (p.641) of the original edition of Ist die Trgheit eines Krpers von seinem Energieinhalt abhngig? in Annalen der Physik, 1905, above the equation K0 - K1= L/V2 v2/2. See also the sentence above the last equation in the English translation, K0 - K1= 1/2 L/c2 v2, and the comment on the symbols used in About this edition that follows the translation (http:/ / www. fourmilab. ch/ etexts/ einstein/ E_mc2/ www/ ) [6] Planck, M. (1907). Ber.d.Berl.Akad. 29: 542. [7] Stark, J. (1907). "Elementarquantum der Energie, Modell der negativen und der positiven Elekrizitat". Physikalische Zeitschrift 24 (8): 881. [8] A.Einstein 'E=mc2': the most urgent problem of our time Science illustrated, vol. 1 no. 1, April issue, pp. 16-17, 1946 (item 417 in the "Bibliography"

Massenergy equivalence
[9] M.C.Shields Bibliography of the Writings of Albert Einstein to May 1951 in Albert Einstein: Philosopher-Scientist by Paul Arthur Schilpp (Editor) Albert Einstein Philospher - Scientist (http:/ / www. questia. com/ PM. qst?a=o& d=84000079) [10] In F. Fernflores. The Equivalence of Mass and Energy. Stanford Encyclopedia of Philosophy. (http:/ / plato. stanford. edu/ entries/ equivME/ #2. 1) [11] Note that the relativistic mass, in contrast to the rest mass m0, is not a relativistic invariant, and that the velocity a Minkowski four-vector, in contrast to the quantity proper time. However, the energy-momentum four-vector , where is not is the differential of the is a genuine Minkowski four-vector, and the

239

intrinsic origin of the square-root in the definition of the relativistic mass is the distinction between d and dt. [12] Relativity DeMystified, D. McMahon, Mc Graw Hill (USA), 2006, ISBN 0-07-145545-0 [13] Dynamics and Relativity, J.R. Forshaw, A.G. Smith, Wiley, 2009, ISBN 978-0-470-01460-8 [14] Hans, H. S.; Puri, S. P. (2003). Mechanics (http:/ / books. google. com/ books?id=hrBe52GPHrYC) (2 ed.). Tata McGraw-Hill. p.433. ISBN0-07-047360-9. ., Chapter 12 page 433 (http:/ / books. google. com/ books?id=hrBe52GPHrYC& pg=PA433) [15] E. F. Taylor and J. A. Wheeler, Spacetime Physics, W.H. Freeman and Co., NY. 1992. ISBN 0-7167-2327-1, see pp. 248-9 for discussion of mass remaining constant after detonation of nuclear bombs, until heat is allowed to escape. [16] Mould, Richard A. (2002). Basic relativity (http:/ / books. google. com/ books?id=lfGE-wyJYIUC) (2 ed.). Springer. p.126. ISBN0-387-95210-1. ., Chapter 5 page 126 (http:/ / books. google. com/ books?id=lfGE-wyJYIUC& pg=PA126) [17] Chow, Tail L. (2006). Introduction to electromagnetic theory: a modern perspective (http:/ / books. google. com/ books?id=dpnpMhw1zo8C). Jones & Bartlett Learning. p.392. ISBN0-7637-3827-1. ., Chapter 10 page 392 (http:/ / books. google. com/ books?id=dpnpMhw1zo8C& pg=PA392) [18] Dyson, F.W.; Eddington, A.S., & Davidson, C.R. (1920). "A Determination of the Deflection of Light by the Sun's Gravitational Field, from Observations Made at the Solar eclipse of May 29, 1919". Phil. Trans. Roy. Soc. A 220 (571-581): 291333. Bibcode1920RSPTA.220..291D. doi:10.1098/rsta.1920.0009. [19] Pound, R. V.; Rebka Jr. G. A. (April 1, 1960). "Apparent weight of photons". Physical Review Letters 4 (7): 337341. Bibcode1960PhRvL...4..337P. doi:10.1103/PhysRevLett.4.337. [20] (http:/ / homepage. eircom. net/ ~louiseboylan/ Pages/ Cockroft_walton. htm) Cockcroft-Walton experiment [21] Conversions used: 1956 International (Steam) Table (IT) values where one calorie 4.1868J and one BTU 1055.05585262J. 
Weapons designers' conversion value of onegram TNT 1000calories used. [22] The 6.2kg core comprised 0.8% gallium by weight. Also, about 20% of the Gadget's yield was due to fast fissioning in its natural uranium tamper. This resulted in 4.1moles of Pu fissioning with 180MeV per atom actually contributing prompt kinetic energy to the explosion. Note too that the term "Gadget"-style is used here instead of "Fat Man" because this general design of bomb was very rapidly upgraded to a more efficient one requiring only 5kg of the Pu/gallium alloy. [23] Assuming the dam is generating at its peak capacity of 6,809MW. [24] Assuming a 90/10 alloy of Pt/Ir by weight, a Cp of 25.9 for Pt and 25.1 for Ir, a Pt-dominated average Cp of 25.8, 5.134moles of metal, and 132J.K-1 for the prototype. A variation of 1.5picograms is of course, much smaller than the actual uncertainty in the mass of the international prototype, which is 2micrograms. [25] InfraNet Lab (2008-12-07). Harnessing the Energy from the Earth's Rotation. Article on Earth rotation energy. Divided by c^2. InfraNet Lab, 7 December 2008. Retrieved from http:/ / infranetlab. org/ blog/ 2008/ 12/ harnessing-the-energy-from-the-earth%E2%80%99s-rotation/ . [26] G. 't Hooft, "Computation of the quantum effects due to a four-dimensional pseudoparticle", Physical Review D14:34323450 (1976). [27] A. Belavin, A. M. Polyakov, A. Schwarz, Yu. Tyupkin, "Pseudoparticle Solutions to Yang Mills Equations", Physics Letters 59B:85 (1975). [28] F. Klinkhammer, N. Manton, "A Saddle Point Solution in the Weinberg Salam Theory", Physical Review D 30:2212. [29] Rubakov V. A. "Monopole Catalysis of Proton Decay", Reports on Progress in Physics 51:189241 (1988). [30] S.W. Hawking "Black Holes Explosions?" Nature 248:30 (1974). [31] Einstein, A. (1905), "Zur Elektrodynamik bewegter Krper." (http:/ / www. physik. uni-augsburg. de/ annalen/ history/ papers/ 1905_17_891-921. 
pdf) (PDF), Annalen der Physik 17: 891921, Bibcode1905AnP...322..891E, doi:10.1002/andp.19053221004, . English translation. (http:/ / www. fourmilab. ch/ etexts/ einstein/ specrel/ www/ ) [32] Einstein, A. (1906), "ber eine Methode zur Bestimmung des Verhltnisses der transversalen und longitudinalen Masse des Elektrons." (http:/ / www. physik. uni-augsburg. de/ annalen/ history/ papers/ 1906_21_583-586. pdf) (PDF), Annalen der Physik 21: 583586, Bibcode1906AnP...326..583E, doi:10.1002/andp.19063261310, [33] See e.g. Lev B.Okun, The concept of Mass, Physics Today 42 (6), June 1969, p. 3136, http:/ / www. physicstoday. org/ vol-42/ iss-6/ vol42no6p31_36. pdf [34] Max Jammer (1999), Concepts of mass in contemporary physics and philosophy (http:/ / books. google. com/ ?id=jujK1bn4QUQC& pg=PA51), Princeton University Press, p.51, ISBN0-691-01017-X, [35] Eriksen, Erik; Vyenli, Kjell (1976), "The classical and relativistic concepts of mass", Foundations of Physics (Springer) 6: 115124, Bibcode1976FoPh....6..115E, doi:10.1007/BF00708670 [36] Jannsen, M., Mecklenburg, M. (2007), From classical to relativistic mechanics: Electromagnetic models of the electron. (http:/ / www. tc. umn. edu/ ~janss011/ ), in V. F. Hendricks, et al., , Interactions: Mathematics, Physics and Philosophy (Dordrecht: Springer): 65134, [37] Whittaker, E.T. (19511953), 2. Edition: A History of the theories of aether and electricity, vol. 1: The classical theories / vol. 2: The modern theories 19001926, London: Nelson

[38] Miller, Arthur I. (1981), Albert Einstein's special theory of relativity. Emergence (1905) and early interpretation (19051911), Reading: AddisonWesley, ISBN0-201-04679-2 [39] Darrigol, O. (2005), "The Genesis of the theory of relativity." (http:/ / www. bourbaphy. fr/ darrigol2. pdf) (PDF), Sminaire Poincar 1: 122, [40] Swedenborg, Emanuel (1734). "De Simplici Mundi vel Puncto naturali" (in Latin). Principia Rerum Naturalia. Leipzig. p.32. [41] Swedenborg, Emanuel (1845) (in English). The Principia; or The First Principles of Natural Things. London: W. Newbery. p.55-57. [42] Philip Ball (Aug 23, 2011). "Did Einstein discover E = mc2?" (http:/ / physicsworld. com/ cws/ article/ news/ 46941). Physics World. . [43] Ives, Herbert E. (1952), "Derivation of the mass-energy relation", Journal of the Optical Society of America 42 (8): 540543, doi:10.1364/JOSA.42.000540 [44] Jammer, Max (1961/1997). Concepts of Mass in Classical and Modern Physics. New York: Dover. ISBN0-486-29998-8. [45] Stachel, John; Torretti, Roberto (1982), "Einstein's first derivation of mass-energy equivalence", American Journal of Physics 50 (8): 760763, Bibcode1982AmJPh..50..760S, doi:10.1119/1.12764 [46] Ohanian, Hans (2008), "Did Einstein prove E=mc2?", Studies In History and Philosophy of Science Part B 40 (2): 167173, arXiv:0805.1400, doi:10.1016/j.shpsb.2009.03.002 [47] Hecht, Eugene (2011), "How Einstein confirmed E0=mc2", American Journal of Physics 79 (6): 591600, Bibcode2011AmJPh..79..591H, doi:10.1119/1.3549223 [48] Rohrlich, Fritz (1990), "An elementary derivation of E=mc2", American Journal of Physics 58 (4): 348349, Bibcode1990AmJPh..58..348R, doi:10.1119/1.16168 [49] Einstein, A. (1906), "Das Prinzip von der Erhaltung der Schwerpunktsbewegung und die Trgheit der Energie" (http:/ / www. physik. uni-augsburg. de/ annalen/ history/ papers/ 1906_20_627-633. 
pdf) (PDF), Annalen der Physik 20: 627633, Bibcode1906AnP...325..627E, doi:10.1002/andp.19063250814, [50] Einstein 1906: Trotzdem die einfachen formalen Betrachtungen, die zum Nachweis dieser Behauptung durchgefhrt werden mssen, in der Hauptsache bereits in einer Arbeit von H. Poincar enthalten sind2, werde ich mich doch der bersichtlichkeit halber nicht auf jene Arbeit sttzen. [51] Helge Kragh, "Fin-de-Sicle Physics: A World Picture in Flux" in Quantum Generations: A History of Physics in the Twentieth Century (Princeton, NJ: Princeton University Press, 1999. [52] . . . . ., 1950. (Russian) [53] Preston, S. T., Physics of the Ether, E. & F. N. Spon, London, (1875). [54] Bjerknes: S. Tolver Preston's Explosive Idea E = mc2. (http:/ / itis. volta. alessandria. it/ episteme/ ep6/ ep6-bjerk1. htm) [55] MathPages: Who Invented Relativity? (http:/ / www. mathpages. com/ rr/ s8-08/ 8-08. htm) [56] De Pretto, O. Reale Instituto Veneto Di Scienze, Lettere Ed Arti, LXIII, II, 439500, reprinted in Bartocci. [57] Umberto Bartocci, Albert Einstein e Olinto De PrettoLa vera storia della formula pi famosa del mondo, editore Andromeda, Bologna, 1999. [58] Prentiss, J.J. (August 2005), "Why is the energy of motion proportional to the square of the velocity?", American Journal of Physics 73 (8): 705 [59] John Worrall, review of the book Conceptions of Ether. Studies in the History of Ether Theories by Cantor and Hodges, The British Journal of the Philosophy of Science vol 36, no 1, March 1985, p. 84. The article contrasts a particle ether with a wave-carrying ether, the latter was acceptable. [60] Le Bon: The Evolution of Forces. (http:/ / www. rexresearch. com/ lebonfor/ evforp1. htm#p1b3ch2) [61] Bizouard: Poincar E = mc2 l'quation de Poincar, Einstein et Planck. (http:/ / www. annales. org/ archives/ x/ poincaBizouard. pdf) [62] Rutherford, Ernest (1904), Radioactivity (http:/ / www. archive. 
org/ details/ radioactivity00ruthrich), Cambridge: University Press, pp.336338, [63] Heisenberg, Werner (1958), Physics And Philosophy: The Revolution In Modern Science (http:/ / www. archive. org/ details/ physicsandphilos010613mbp), New York: Harper & Brothers, pp.118119, [64] "We might in these processes obtain very much more energy than the proton supplied, but on the average we could not expect to obtain energy in this way. It was a very poor and inefficient way of producing energy, and anyone who looked for a source of power in the transformation of the atoms was talking moonshine. But the subject was scientifically interesting because it gave insight into the atoms." The Times archives (http:/ / archive. timesonline. co. uk/ ), September 12, 1933, "The British associationbreaking down the atom" [65] Cover. (http:/ / www. time. com/ time/ covers/ 0,16641,19460701,00. html) Time magazine, July 1, 1946. [66] Isaacson, Einstein: His Life and Universe. [67] Robert Serber, The Los Alamos Primer: The First Lectures on How to Build an Atomic Bomb (University of California Press, 1992), page 7. Note that the quotation is taken from Serber's 1992 version, and is not in the original 1943 Los Alamos Primer of the same name. [68] David Bodanis, E=mc2: A Biography of the World's Most Famous Equation (New York: Walker, 2000). [69] A quote from Frisch about the discovery day. Accessed April 4, 2009. (http:/ / homepage. mac. com/ dtrapp/ people/ Meitnerium. html) [70] Sime, Ruth (1996). Lise Meitner: A Life in Physics. California Studies in the History of Science. 13. Berkeley: University of California Press. pp.236237. ISBN0-520-20860-9. 2


Bodanis, David (2001), E=mc²: A Biography of the World's Most Famous Equation, Berkley Trade, ISBN 0-425-18164-2

Tipler, Paul; Llewellyn, Ralph (2002), Modern Physics (4th ed.), W. H. Freeman, ISBN 0-7167-4345-0
Lasky, Ronald C. (April 23, 2007), "What is the significance of E = mc²? And what does it mean?" (http://www.sciam.com/article.cfm?id=significance-e-mc-2-means), Scientific American


External links
The Equivalence of Mass and Energy (http://plato.stanford.edu/entries/equivME): Entry in the Stanford Encyclopedia of Philosophy
Living Reviews in Relativity (http://relativity.livingreviews.org/): An open access, peer-refereed, solely online physics journal publishing invited reviews covering all areas of relativity research.
A shortcut to E=mc² (http://fotonowy.pl/index.php?main_page=page&id=6): An easy to understand, high-school level derivation of the E=mc² formula.
Einstein on the Inertia of Energy (http://www.mathpages.com/home/kmath600/kmath600.htm): MathPages

Momentum


In a game of pool, momentum is conserved; that is, if one ball stops dead after the collision, the other ball will continue away with all the momentum. If the moving ball continues or is deflected then both balls will carry a portion of the momentum from the collision.

Common symbol(s): p
SI unit: kg⋅m/s or N⋅s

In classical mechanics, linear momentum or translational momentum (pl. momenta; SI unit kg⋅m/s, or, equivalently, N⋅s) is the product of the mass and velocity of an object. For example, a heavy truck moving fast has a large momentum: it takes a large and prolonged force to get the truck up to this speed, and it takes a large and prolonged force to bring it to a stop afterwards. If the truck were lighter, or moving slower, then it would have less momentum. Like velocity, linear momentum is a vector quantity, possessing a direction as well as a magnitude:

$$\mathbf{p} = m\mathbf{v}$$

Linear momentum is also a conserved quantity, meaning that if a closed system is not affected by external forces, its total linear momentum cannot change. In classical mechanics, conservation of linear momentum is implied by Newton's laws; but it also holds in special relativity (with a modified formula) and, with appropriate definitions, a (generalized) linear momentum conservation law holds in electrodynamics, quantum mechanics, quantum field theory, and general relativity.

Newtonian mechanics
Momentum has a direction as well as magnitude. Quantities that have both a magnitude and a direction are known as vector quantities. Because momentum has a direction, it can be used to predict the resulting direction of objects after they collide, as well as their speeds. Below, the basic properties of momentum are described in one dimension. The vector equations are almost identical to the scalar equations (see multiple dimensions).

Single particle
The momentum of a particle is traditionally represented by the letter p. It is the product of two quantities, the mass (represented by the letter m) and velocity (v):[1]

$$p = mv$$

The units of momentum are the product of the units of mass and velocity. In SI units, if the mass is in kilograms and the velocity in meters per second, then the momentum is in kilogram meters per second (kg⋅m/s). Being a vector, momentum has magnitude and direction. For example, a model airplane of 1 kg, traveling due north at 1 m/s in straight and level flight, has a momentum of 1 kg⋅m/s due north measured from the ground.


Many particles
The momentum of a system of particles is the sum of their momenta. If two particles have masses m1 and m2, and velocities v1 and v2, the total momentum is

$$p = p_1 + p_2 = m_1 v_1 + m_2 v_2$$

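The momenta of more than two particles can be added in the same way. A system of particles has a center of mass, a point determined by the weighted sum of their positions:

$$r_\text{cm} = \frac{m_1 r_1 + m_2 r_2 + \cdots}{m_1 + m_2 + \cdots} = \frac{\sum_i m_i r_i}{\sum_i m_i}$$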

If all the particles are moving, the center of mass will generally be moving as well. If the center of mass is moving at velocity vcm, the momentum is:

$$p = m v_\text{cm}$$

where m is the total mass of the system. This is known as Euler's first law.[2][3]
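As a minimal sketch of the sums described above, the following Python snippet adds up the momenta of a few particles and checks that the total equals the total mass times the center-of-mass velocity. The masses, positions and velocities are made-up illustrative values, and the function names are hypothetical.

```python
# Illustrative values only: three particles moving along one axis.
masses = [2.0, 3.0, 5.0]        # kg
positions = [0.0, 1.0, 4.0]     # m
velocities = [1.0, -2.0, 0.5]   # m/s


def total_momentum(masses, velocities):
    """Sum of the individual momenta m_i * v_i."""
    return sum(m * v for m, v in zip(masses, velocities))


def center_of_mass(masses, positions):
    """Mass-weighted average of the particle positions."""
    return sum(m * r for m, r in zip(masses, positions)) / sum(masses)


def center_of_mass_velocity(masses, velocities):
    """Mass-weighted average of the particle velocities."""
    return sum(m * v for m, v in zip(masses, velocities)) / sum(masses)


p_total = total_momentum(masses, velocities)
r_cm = center_of_mass(masses, positions)
v_cm = center_of_mass_velocity(masses, velocities)

# Euler's first law: the two printed numbers should agree.
print(p_total, sum(masses) * v_cm)
```

The same functions work for any number of particles, since they only rely on pairing each mass with its position or velocity.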

Relation to force
If a force F is applied to a particle for a time interval Δt, the momentum of the particle changes by an amount

$$\Delta p = F \Delta t$$

In differential form, this gives Newton's second law: the rate of change of the momentum of a particle is equal to the force F acting on it:[1]

$$F = \frac{dp}{dt}$$

If the force depends on time, the change in momentum (or impulse) between times t1 and t2 is

$$\Delta p = \int_{t_1}^{t_2} F(t)\, dt$$

The second law only applies to a particle that does not exchange matter with its surroundings,[4] so its mass m is constant and it is equivalent to write

$$F = m \frac{dv}{dt} = ma$$

so the force is equal to mass times acceleration.[1] Example: a model airplane of 1 kg accelerates from rest to a velocity of 1 m/s due north in 1 s. The thrust required to produce this acceleration is 1 newton. The change in momentum is 1 kg⋅m/s.
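The impulse integral can be approximated numerically. The sketch below uses a simple trapezoidal rule (one possible choice, not taken from the source) and reproduces the model-airplane example; the ramped thrust F(t) = 2t is a hypothetical second case added for illustration.

```python
def impulse(force, t1, t2, steps=1000):
    """Approximate the integral of force(t) from t1 to t2 (trapezoidal rule)."""
    dt = (t2 - t1) / steps
    total = 0.5 * (force(t1) + force(t2))
    for k in range(1, steps):
        total += force(t1 + k * dt)
    return total * dt


# Constant 1 N thrust for 1 s on a 1 kg model airplane starting from rest.
mass = 1.0
dp_constant = impulse(lambda t: 1.0, 0.0, 1.0)
v_final = dp_constant / mass

# Hypothetical time-varying thrust F(t) = 2 t newtons over the same interval.
dp_ramp = impulse(lambda t: 2.0 * t, 0.0, 1.0)

print(dp_constant, v_final, dp_ramp)
```

Both cases deliver the same impulse, so the airplane would reach the same final velocity even though the force histories differ.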

Conservation


In a closed system (one that does not exchange any matter with the outside and is not acted on by outside forces) the total momentum is constant. This fact, known as the law of conservation of momentum, is implied by Newton's laws of motion.[5] Suppose, for example, that two particles interact. Because of the third law, the forces between them are equal and opposite. If the particles are numbered 1 and 2, the second law states that F1 = dp1/dt and F2 = dp2/dt. Therefore

$$\frac{dp_1}{dt} = -\frac{dp_2}{dt}$$

or

$$\frac{d}{dt}(p_1 + p_2) = 0$$

A Newton's cradle demonstrates conservation of momentum.

If the velocities of the particles are u1 and u2 before the interaction, and afterwards they are v1 and v2, then

$$m_1 u_1 + m_2 u_2 = m_1 v_1 + m_2 v_2$$

This law holds no matter how complicated the force is between particles. Similarly, if there are several particles, the momentum exchanged between each pair of particles adds up to zero, so the total change in momentum is zero. This conservation law applies to all interactions, including collisions and separations caused by explosive forces.[5] It can also be generalized to situations where Newton's laws do not hold, for example in the theory of relativity and in electrodynamics.[6]

Dependence on reference frame


Momentum is a measurable quantity, and the measurement depends on the motion of the observer. For example, if an apple is sitting in a glass elevator that is descending, an outside observer looking into the elevator sees the apple moving, so to that observer the apple has a nonzero momentum. To someone inside the elevator, the apple does not move, so it has zero momentum. The two observers each have a frame of reference in which they observe motions, and if the elevator is descending steadily they will see behavior that is consistent with the same physical laws.

Newton's apple in Einstein's elevator. In person A's frame of reference, the apple has non-zero velocity and momentum. In the elevator's and person B's frames of reference, it has zero velocity and momentum.

Suppose a particle has position x in a stationary frame of reference. From the point of view of another frame of reference moving at a uniform speed u, the position (represented by a primed coordinate) changes with time as

$$x' = x - ut$$

This is called a Galilean transformation. If the particle is moving at speed dx/dt = v in the first frame of reference, in the second it is moving at speed

$$v' = \frac{dx'}{dt} = v - u$$

Since u does not change, the accelerations are the same:

$$a' = \frac{dv'}{dt} = a$$

Thus, momentum is conserved in both reference frames. Moreover, as long as the force has the same form in both frames, Newton's second law is unchanged. Forces such as Newtonian gravity, which depend only on the scalar distance between objects, satisfy this criterion. This independence of reference frame is called Newtonian relativity or Galilean invariance.[7]

A change of reference frame can often simplify calculations of motion. For example, in a collision of two particles a reference frame can be chosen where one particle begins at rest. Another commonly used reference frame is the center of mass frame, one that is moving with the center of mass. In this frame, the total momentum is zero.
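To illustrate the frame independence described above, the following sketch transforms the velocities of a two-particle collision into a moving frame and compares the total momentum before and after. The masses and velocities are assumed illustrative values (chosen to be consistent with a one-dimensional elastic collision), and the helper names are hypothetical.

```python
def total_momentum(masses, velocities):
    """Sum of m_i * v_i for a one-dimensional system."""
    return sum(m * v for m, v in zip(masses, velocities))


def galilean_boost(velocities, u):
    """Velocities seen from a frame moving at constant speed u: v' = v - u."""
    return [v - u for v in velocities]


masses = [1.0, 3.0]    # kg (illustrative)
before = [4.0, 0.0]    # m/s, velocities before the collision
after = [-2.0, 2.0]    # m/s, velocities after an elastic collision

# The numerical value of the momentum differs between frames,
# but in each frame it is the same before and after the collision.
for u in (0.0, 1.5, -7.0):
    p_before = total_momentum(masses, galilean_boost(before, u))
    p_after = total_momentum(masses, galilean_boost(after, u))
    print(u, p_before, p_after)

# In the center of mass frame the total momentum is zero.
v_cm = total_momentum(masses, before) / sum(masses)
p_cm_frame = total_momentum(masses, galilean_boost(before, v_cm))
```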

Application to collisions
By itself, the law of conservation of momentum is not enough to determine the motion of particles after a collision. Another property of the motion, kinetic energy, must be known. This is not necessarily conserved. If it is conserved, the collision is called an elastic collision; if not, it is an inelastic collision.

Elastic collisions
An elastic collision is one in which no kinetic energy is lost. Perfectly elastic "collisions" can occur when the objects do not touch each other, as for example in atomic or nuclear scattering where electric repulsion keeps them apart. A slingshot maneuver of a satellite around a planet can also be viewed as a perfectly elastic collision from a distance. A collision between two pool balls is a good example of an almost totally elastic collision, due to their high rigidity; but when bodies come in contact there is always some dissipation.[8]

Elastic collision of equal masses

Elastic collision of unequal masses

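A head-on collision between two bodies can be represented by velocities in one dimension, along a line passing through the bodies. If the velocities are u1 and u2 before the collision and v1 and v2 after, the equations expressing conservation of momentum and kinetic energy are:

$$m_1 u_1 + m_2 u_2 = m_1 v_1 + m_2 v_2$$

$$\tfrac{1}{2} m_1 u_1^2 + \tfrac{1}{2} m_2 u_2^2 = \tfrac{1}{2} m_1 v_1^2 + \tfrac{1}{2} m_2 v_2^2$$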

A change of reference frame can often simplify the analysis of a collision. For example, suppose there are two bodies of equal mass m, one stationary and one approaching the other at a speed v (as in the figure). The center of mass is moving at speed v/2 and both bodies are moving towards it at speed v/2. Because of the symmetry, after the collision both must be moving away from the center of mass at the same speed. Adding the speed of the center of mass to both, we find that the body that was moving is now stopped and the other is moving away at speed v. The bodies have exchanged their velocities. Regardless of the velocities of the bodies, a switch to the center of mass frame leads us to the same conclusion. Therefore, for equal masses the final velocities are given by[5]

$$v_1 = u_2, \qquad v_2 = u_1$$

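In general, when the initial velocities are known, the final velocities are given by[9]

$$v_1 = \frac{m_1 - m_2}{m_1 + m_2} u_1 + \frac{2 m_2}{m_1 + m_2} u_2$$

$$v_2 = \frac{m_2 - m_1}{m_1 + m_2} u_2 + \frac{2 m_1}{m_1 + m_2} u_1$$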

If one body has much greater mass than the other, its velocity will be little affected by a collision while the other body will experience a large change.
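The standard one-dimensional elastic collision result can be checked numerically. The sketch below implements the usual closed-form final velocities and verifies that momentum and kinetic energy are both conserved; the function name and the sample masses and speeds are illustrative assumptions.

```python
def elastic_collision(m1, u1, m2, u2):
    """Final velocities (v1, v2) of a head-on elastic collision in one dimension."""
    v1 = ((m1 - m2) * u1 + 2.0 * m2 * u2) / (m1 + m2)
    v2 = ((m2 - m1) * u2 + 2.0 * m1 * u1) / (m1 + m2)
    return v1, v2


def momentum(m1, v1, m2, v2):
    return m1 * v1 + m2 * v2


def kinetic_energy(m1, v1, m2, v2):
    return 0.5 * m1 * v1 ** 2 + 0.5 * m2 * v2 ** 2


# Equal masses exchange velocities.
equal = elastic_collision(1.0, 3.0, 1.0, 0.0)

# A heavy body is barely slowed by a light one, which leaves at nearly twice the speed.
heavy_v, light_v = elastic_collision(1000.0, 1.0, 1.0, 0.0)

print(equal, heavy_v, light_v)
```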

Inelastic collisions
In an inelastic collision, some of the kinetic energy of the colliding bodies is converted into other forms of energy such as heat or sound. Examples include traffic collisions,[10] in which the effect of lost kinetic energy can be seen in the damage to the vehicles; electrons losing some of their energy to atoms (as in the Franck–Hertz experiment);[11] and particle accelerators in which the kinetic energy is converted into mass in the form of new particles.

A perfectly inelastic collision between equal masses

In a perfectly inelastic collision (such as a bug hitting a windshield), both bodies have the same motion afterwards. If one body is motionless to begin with, the equation for conservation of momentum is

$$m_1 u_1 = (m_1 + m_2) v$$

so

$$v = \frac{m_1}{m_1 + m_2} u_1$$

In a frame of reference moving at the speed v, the objects are brought to rest by the collision and 100% of the kinetic energy is converted.

One measure of the inelasticity of the collision is the coefficient of restitution CR, defined as the ratio of relative velocity of separation to relative velocity of approach. In ball sports, it can be measured easily by dropping the ball and using the following formula:[12]

$$C_R = \sqrt{\frac{\text{bounce height}}{\text{drop height}}}$$
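The perfectly inelastic case and the bounce-height estimate of the coefficient of restitution can be sketched as follows. The function names and the numbers (a 1 kg body hitting an equal stationary mass, a ball dropped from 1.0 m that rebounds to 0.64 m) are illustrative assumptions.

```python
import math


def perfectly_inelastic(m1, u1, m2):
    """Common final velocity when body 2 starts at rest and the bodies stick together."""
    return m1 * u1 / (m1 + m2)


def kinetic_energy(m, v):
    return 0.5 * m * v * v


def restitution_from_bounce(drop_height, bounce_height):
    """Coefficient of restitution estimated from a drop test."""
    return math.sqrt(bounce_height / drop_height)


m1, u1, m2 = 1.0, 2.0, 1.0
v = perfectly_inelastic(m1, u1, m2)
ke_before = kinetic_energy(m1, u1)
ke_after = kinetic_energy(m1 + m2, v)
fraction_lost = (ke_before - ke_after) / ke_before

c_r = restitution_from_bounce(1.0, 0.64)

print(v, fraction_lost, c_r)
```

For equal masses, half of the kinetic energy is converted in the original frame, while momentum is unchanged.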

The momentum and energy equations also apply to the motions of objects that begin together and then move apart. For example, an explosion is the result of a chain reaction that transforms potential energy stored in chemical, mechanical, or nuclear form into kinetic energy, acoustic energy, and electromagnetic radiation. Rockets also make use of conservation of momentum: propellant is thrust outward, gaining momentum, and an equal and opposite momentum is imparted to the rocket.[13]

Multiple dimensions
Real motion has both direction and magnitude and must be represented by a vector. In a coordinate system with x, y, z axes, velocity has components vx in the x direction, vy in the y direction, vz in the z direction. The vector is represented by a boldface symbol:[14]

$$\mathbf{v} = (v_x, v_y, v_z)$$

Similarly, the momentum is a vector quantity and is represented by a boldface symbol:

$$\mathbf{p} = (p_x, p_y, p_z)$$

The equations in the previous sections work in vector form if the scalars p and v are replaced by vectors p and v. Each vector equation represents three scalar equations. For example,

$$\mathbf{p} = m\mathbf{v}$$

Two-dimensional elastic collision. There is no motion perpendicular to the image, so only two components are needed to represent the velocities and momenta. The two blue vectors represent velocities after the collision and add vectorially to get the initial (red) velocity.

represents three equations:[14]

$$p_x = m v_x, \qquad p_y = m v_y, \qquad p_z = m v_z$$

The kinetic energy equations are exceptions to the above replacement rule. The equations are still one-dimensional, but each scalar represents the magnitude of the vector, for example,

$$v^2 = v_x^2 + v_y^2 + v_z^2$$

Each vector equation represents three scalar equations. Often coordinates can be chosen so that only two components are needed, as in the figure. Each component can be obtained separately and the results combined to produce a vector result.[14] A simple construction involving the center of mass frame can be used to show that if a stationary elastic sphere is struck by a moving sphere, the two will head off at right angles after the collision (as in the figure).[15]

Objects of variable mass


The concept of momentum plays a fundamental role in explaining the behavior of variable-mass objects such as a rocket ejecting fuel or a star accreting gas. In analyzing such an object, one treats the object's mass as a function that varies with time: m(t). The momentum of the object at time t is therefore p(t) = m(t)v(t). One might then try to invoke Newton's second law of motion by saying that the external force F on the object is related to its momentum p(t) by F = dp/dt, but this is incorrect, as is the related expression found by applying the product rule to d(mv)/dt:[16]

$$\mathbf{F} = m(t) \frac{d\mathbf{v}}{dt} + \mathbf{v}(t) \frac{dm}{dt} \quad \text{(incorrect)}$$

This equation does not correctly describe the motion of variable-mass objects. The correct equation is

$$\mathbf{F} + \mathbf{u} \frac{dm}{dt} = m \frac{d\mathbf{v}}{dt}$$

where u is the velocity of the ejected/accreted mass as seen in the object's rest frame.[16] This is distinct from v, which is the velocity of the object itself as seen in an inertial frame. This equation is derived by keeping track of both the momentum of the object as well as the momentum of the ejected/accreted mass. When considered together, the object and the mass constitute a closed system in which total momentum is conserved.
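As a rough numerical sketch of the variable-mass equation, consider a rocket in one dimension with no external force (F = 0), so that m dv = u dm, where u is the velocity of the ejected mass relative to the rocket. Integrating step by step should approach the closed-form result Δv = v_e ln(m0/m_final), the well-known rocket equation. The masses, exhaust speed and step count below are assumed values, and the midpoint mass is just one convenient discretization.

```python
import math


def rocket_delta_v(m0, m_final, exhaust_speed, steps=10000):
    """Integrate m dv = u dm for a rocket with no external force."""
    u = -exhaust_speed            # ejected mass moves backward relative to the rocket
    dm = (m_final - m0) / steps   # negative: the rocket loses mass each step
    m, v = m0, 0.0
    for _ in range(steps):
        v += u * dm / (m + 0.5 * dm)   # mass evaluated at the middle of the step
        m += dm
    return v


m0, m_final, v_e = 1000.0, 400.0, 2500.0   # kg, kg, m/s (illustrative)
numeric = rocket_delta_v(m0, m_final, v_e)
analytic = v_e * math.log(m0 / m_final)

print(numeric, analytic)
```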

Generalized coordinates
Newton's laws can be difficult to apply to many kinds of motion because the motion is limited by constraints. For example, a bead on an abacus is constrained to move along its wire and a pendulum bob is constrained to swing at a fixed distance from the pivot. Many such constraints can be incorporated by changing the normal Cartesian coordinates to a set of generalized coordinates that may be fewer in number.[17] Refined mathematical methods have been developed for solving mechanics problems in generalized coordinates. They introduce a generalized momentum, also known as the canonical or conjugate momentum, that extends the concepts of both linear momentum and angular momentum. To distinguish it from generalized momentum, the product of mass and velocity is also referred to as mechanical, kinetic or kinematic momentum.[6][18][19] The two main methods are described below.


Lagrangian mechanics
In Lagrangian mechanics, a Lagrangian is defined as the difference between the kinetic energy T and the potential energy V:

$$L = T - V$$

If the generalized coordinates are represented as a vector q = (q1, q2, ... , qN) and time differentiation is represented by a dot over the variable, then the equations of motion (known as the Lagrange or Euler–Lagrange equations) are a set of N equations:[20]

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} = 0$$

If a coordinate qi is not a Cartesian coordinate, the associated generalized momentum component pi does not necessarily have the dimensions of linear momentum. Even if qi is a Cartesian coordinate, pi will not be the same as the mechanical momentum if the potential depends on velocity.[6] Some sources represent the kinematic momentum by the symbol Π.[21] In this mathematical framework, a generalized momentum is associated with the generalized coordinates. Its components are defined as

$$p_j = \frac{\partial L}{\partial \dot{q}_j}$$

Each component pj is said to be the conjugate momentum for the coordinate qj. Now if a given coordinate qi does not appear in the Lagrangian (although its time derivative might appear), then

$$p_i = \text{constant}$$

This is the generalization of the conservation of momentum.[6]

Even if the generalized coordinates are just the ordinary spatial coordinates, the conjugate momenta are not necessarily the ordinary momentum coordinates. An example is found in the section on electromagnetism.

Hamiltonian mechanics
In Hamiltonian mechanics, the Lagrangian (a function of generalized coordinates and their derivatives) is replaced by a Hamiltonian that is a function of generalized coordinates and momentum. The Hamiltonian is defined as

$$H(\mathbf{q}, \mathbf{p}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t)$$

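where the momentum is obtained by differentiating the Lagrangian as above. The Hamiltonian equations of motion are[22]

$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}$$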

As in Lagrangian mechanics, if a generalized coordinate does not appear in the Hamiltonian, its conjugate momentum component is conserved.[23]
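A small numerical experiment can make this conservation statement concrete. The sketch below advances the standard Hamilton equations (dq/dt = ∂H/∂p, dp/dt = −∂H/∂q) for one coordinate, using finite-difference derivatives and a symplectic Euler step; both the discretization and the two example Hamiltonians (a free particle and a harmonic oscillator with assumed m and k) are illustrative choices, not taken from the source.

```python
def hamilton_step(H, q, p, dt, h=1e-6):
    """One symplectic-Euler step of Hamilton's equations with numerical derivatives."""
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2 * h)
    p_new = p - dt * dH_dq                      # dp/dt = -dH/dq
    dH_dp = (H(q, p_new + h) - H(q, p_new - h)) / (2 * h)
    q_new = q + dt * dH_dp                      # dq/dt = +dH/dp
    return q_new, p_new


def free_particle(q, p, m=2.0):
    """H does not depend on q, so p should be conserved."""
    return p * p / (2 * m)


def oscillator(q, p, m=2.0, k=8.0):
    """H depends on q, so p changes with time."""
    return p * p / (2 * m) + 0.5 * k * q * q


def evolve(H, q, p, dt=0.01, steps=100):
    for _ in range(steps):
        q, p = hamilton_step(H, q, p, dt)
    return q, p


q_free, p_free = evolve(free_particle, 0.0, 3.0)
q_osc, p_osc = evolve(oscillator, 1.0, 0.0)

print(p_free, p_osc)
```

For the free particle the momentum stays at its initial value, while for the oscillator, whose Hamiltonian contains the coordinate, it does not.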


Symmetry and conservation


Conservation of momentum is a mathematical consequence of the homogeneity (shift symmetry) of space (position in space is the canonical conjugate quantity to momentum). That is, conservation of momentum is a consequence of the fact that the laws of physics do not depend on position; this is a special case of Noether's theorem.[24]

Relativistic mechanics
Lorentz invariance
Newtonian physics assumes that absolute time and space exist outside of any observer; this gives rise to the Galilean invariance described earlier. It also results in a prediction that the speed of light can vary from one reference frame to another. This is contrary to observation. In the special theory of relativity, Einstein keeps the postulate that the equations of motion do not depend on the reference frame, but assumes that the speed of light c is invariant. As a result, position and time in two reference frames are related by the Lorentz transformation instead of the Galilean transformation.[25]

Consider, for example, a reference frame moving relative to another at velocity v in the x direction. The Galilean transformation gives the coordinates of the moving frame as

$$t' = t, \qquad x' = x - vt$$

while the Lorentz transformation gives[26]

$$t' = \gamma\left(t - \frac{vx}{c^2}\right), \qquad x' = \gamma(x - vt)$$

where γ is the Lorentz factor:

$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$$

Newton's second law, with mass fixed, is not invariant under a Lorentz transformation. However, it can be made invariant by making the inertial mass m of an object a function of velocity:

$$m = \gamma m_0$$

where m0 is the object's invariant mass.[27] The modified momentum,

$$\mathbf{p} = \gamma m_0 \mathbf{v}$$

obeys Newton's second law:

$$\mathbf{F} = \frac{d\mathbf{p}}{dt}$$

Within the domain of classical mechanics, relativistic momentum closely approximates Newtonian momentum: at low velocity, γm0v is approximately equal to m0v, the Newtonian expression for momentum.
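The low-velocity agreement can be checked with a few lines of Python. This is a minimal sketch using the standard expressions γ = 1/√(1 − v²/c²) and p = γ m0 v; the function names and the sample speeds are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2 / c^2) for a speed v below c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def relativistic_momentum(m0, v):
    """p = gamma * m0 * v for invariant mass m0 (kg) and speed v (m/s)."""
    return lorentz_factor(v) * m0 * v


m0 = 1.0
v_slow = 300.0       # roughly the speed of an airliner
v_fast = 0.6 * C

ratio_slow = relativistic_momentum(m0, v_slow) / (m0 * v_slow)
ratio_fast = relativistic_momentum(m0, v_fast) / (m0 * v_fast)

print(ratio_slow, ratio_fast)
```

At everyday speeds the ratio of relativistic to Newtonian momentum is indistinguishable from 1, whereas at 0.6c it is 1.25.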

Four-vector formulation
In the theory of relativity, physical quantities are expressed in terms of four-vectors that include time as a fourth coordinate along with the three space coordinates. These vectors are generally represented by capital letters, for example R for position. The expression for the four-momentum depends on how the coordinates are expressed. Time may be given in its normal units or multiplied by the speed of light so that all the components of the four-vector have dimensions of length. If the latter scaling is used, an interval of proper time, τ, defined by[28]

$$c^2 d\tau^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2$$

is invariant under Lorentz transformations (in this expression and in what follows the (+ − − −) metric signature has been used). Mathematically this invariance can be ensured in one of two ways: by treating the four-vectors as Euclidean vectors and multiplying time by the square root of -1; or by keeping time a real quantity and embedding the vectors in a Minkowski space.[29] In a Minkowski space, the scalar product of two four-vectors U = (U0,U1,U2,U3) and V = (V0,V1,V2,V3) is defined as

$$\langle \mathbf{U}, \mathbf{V} \rangle = U_0 V_0 - U_1 V_1 - U_2 V_2 - U_3 V_3$$

In all the coordinate systems, the (contravariant) relativistic four-velocity is defined by

$$\mathbf{U} \equiv \frac{d\mathbf{R}}{d\tau} = \gamma \frac{d\mathbf{R}}{dt}$$

and the (contravariant) four-momentum is

$$\mathbf{P} = m_0 \mathbf{U}$$

where m0 is the invariant mass. If R = (ct,x,y,z) (in Minkowski space), then[30]

$$\mathbf{P} = \gamma m_0 (c, \mathbf{v}) = (mc, \mathbf{p})$$

Using Einstein's mass-energy equivalence, E = mc², this can be rewritten as

$$\mathbf{P} = \left(\frac{E}{c}, \mathbf{p}\right)$$

Thus, conservation of four-momentum is Lorentz-invariant and implies conservation of both mass and energy. The magnitude of the momentum four-vector is equal to m0c:

$$\|\mathbf{P}\|^2 = \langle \mathbf{P}, \mathbf{P} \rangle = \frac{E^2}{c^2} - p^2 = \gamma^2 m_0^2 (c^2 - v^2) = (m_0 c)^2$$

and is invariant across all reference frames. The relativistic energy–momentum relationship holds even for massless particles such as photons; by setting m0 = 0 it follows that

$$E = pc$$

In a game of relativistic "billiards", if a stationary particle is hit by a moving particle in an elastic collision, the paths formed by the two afterwards will form an acute angle. This is unlike the non-relativistic case where they travel at right angles.[31]
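The invariance of the four-momentum magnitude can be illustrated numerically in one space dimension. The sketch below assumes natural units (c = 1), writes the four-momentum as (E/c, p) = γ m0 (c, v), applies the standard Lorentz boost to its components, and evaluates (E/c)² − p² in each frame. The function names and the numbers are hypothetical.

```python
import math

C = 1.0  # natural units, an assumption made to keep the numbers simple


def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def four_momentum(m0, v):
    """Return (E/c, p_x) for a particle of invariant mass m0 moving at speed v."""
    g = gamma(v)
    return (g * m0 * C, g * m0 * v)


def boost(P, w):
    """Components of P as seen from a frame moving at speed w along x."""
    g = gamma(w)
    e_over_c, p = P
    return (g * (e_over_c - w * p / C), g * (p - w * e_over_c / C))


def minkowski_norm_sq(P):
    """(E/c)^2 - p^2 with the (+ -) signature."""
    return P[0] ** 2 - P[1] ** 2


P = four_momentum(2.0, 0.6)    # components (2.5, 1.5)
P_rest = boost(P, 0.6)         # frame moving with the particle
P_other = boost(P, -0.8)       # some other frame

# A photon has m0 = 0, so E = p c and the norm vanishes in every frame.
photon = (1.0, 1.0)
photon_boosted = boost(photon, 0.5)

print(minkowski_norm_sq(P), minkowski_norm_sq(P_rest), minkowski_norm_sq(P_other))
```

The components differ from frame to frame, but the norm stays equal to (m0 c)², here 4.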

Classical electromagnetism
In Newtonian mechanics, the law of conservation of momentum can be derived from the law of action and reaction, which states that the forces between two particles are equal and opposite. Electromagnetic forces violate this law. Under some circumstances one moving charged particle can exert a force on another without any return force.[32] Moreover, Maxwell's equations, the foundation of classical electrodynamics, are Lorentz-invariant. However, momentum is still conserved.


Vacuum
In Maxwell's equations, the forces between particles are mediated by electric and magnetic fields. The electromagnetic force (Lorentz force) on a particle with charge q and velocity v due to a combination of electric field E and magnetic field (as given by the "B-field" B) is

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$$

This force imparts a momentum to the particle, so by Newton's second law the particle must impart a momentum to the electromagnetic fields.[33] In a vacuum, the momentum per unit volume is

$$\mathbf{g} = \frac{1}{\mu_0 c^2} \mathbf{E} \times \mathbf{B}$$

where μ0 is the vacuum permeability and c is the speed of light. The momentum density is proportional to the Poynting vector S which gives the directional rate of energy transfer per unit area:[33][34]

$$\mathbf{g} = \frac{\mathbf{S}}{c^2}$$

If momentum is to be conserved in a volume V, changes in the momentum of matter through the Lorentz force must be balanced by changes in the momentum of the electromagnetic field and outflow of momentum. If Pmech is the momentum of all the particles in a volume V, and the particles are treated as a continuum with charge density ρ and current density J, then Newton's second law gives

$$\frac{d\mathbf{P}_\text{mech}}{dt} = \int_V (\rho \mathbf{E} + \mathbf{J} \times \mathbf{B})\, dV$$

The electromagnetic momentum is

$$\mathbf{P}_\text{field} = \frac{1}{\mu_0 c^2} \int_V \mathbf{E} \times \mathbf{B}\, dV$$

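and the equation for conservation of each component i of the momentum is

$$\frac{d}{dt} \left(\mathbf{P}_\text{mech} + \mathbf{P}_\text{field}\right)_i = \oint_S \sum_j T_{ij} n_j \, dA$$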

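The term on the right is an integral over the surface S representing momentum flow into and out of the volume, and nj is a component of the surface normal of S. The quantity Tij is called the Maxwell stress tensor, defined as[33]

$$T_{ij} \equiv \epsilon_0 \left(E_i E_j - \tfrac{1}{2} \delta_{ij} E^2\right) + \frac{1}{\mu_0} \left(B_i B_j - \tfrac{1}{2} \delta_{ij} B^2\right)$$

where ε0 is the vacuum permittivity.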

Media
The above results are for the microscopic Maxwell equations, applicable to electromagnetic forces in a vacuum (or on a very small scale in media). It is more difficult to define momentum density in media because the division into electromagnetic and mechanical is arbitrary. The definition of electromagnetic momentum density is modified to

$$\mathbf{g} = \frac{1}{c^2} \mathbf{E} \times \mathbf{H} = \frac{\mathbf{S}}{c^2}$$

where the H-field H is related to the B-field and the magnetization M by

$$\mathbf{B} = \mu_0 (\mathbf{H} + \mathbf{M})$$

The electromagnetic stress tensor depends on the properties of the media.[33]


Particle in field
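If a charged particle q moves in an electromagnetic field, its kinematic momentum m v is not conserved. However, it has a canonical momentum that is conserved.

Lagrangian and Hamiltonian formulation
The kinetic momentum p is different to the canonical momentum P (synonymous with the generalized momentum) conjugate to the ordinary position coordinates r, because P includes a contribution from the electric potential φ(r, t) and vector potential A(r, t):[21]

Lagrangian
Classical mechanics: $$L = \frac{m}{2} \dot{\mathbf{r}} \cdot \dot{\mathbf{r}} + e \mathbf{A} \cdot \dot{\mathbf{r}} - e\varphi$$
Relativistic mechanics: $$L = -m_0 c^2 \sqrt{1 - \dot{\mathbf{r}}^2 / c^2} + e \mathbf{A} \cdot \dot{\mathbf{r}} - e\varphi$$

Canonical momentum
Classical mechanics: $$\mathbf{P} = m \dot{\mathbf{r}} + e\mathbf{A}$$
Relativistic mechanics: $$\mathbf{P} = \gamma m_0 \dot{\mathbf{r}} + e\mathbf{A}$$

Kinetic momentum
Classical mechanics: $$\mathbf{p} = m \dot{\mathbf{r}} = \mathbf{P} - e\mathbf{A}$$
Relativistic mechanics: $$\mathbf{p} = \gamma m_0 \dot{\mathbf{r}} = \mathbf{P} - e\mathbf{A}$$

Hamiltonian
Classical mechanics: $$H = \frac{(\mathbf{P} - e\mathbf{A})^2}{2m} + e\varphi$$
Relativistic mechanics: $$H = \sqrt{m_0^2 c^4 + c^2 (\mathbf{P} - e\mathbf{A})^2} + e\varphi$$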

where ṙ = v is the velocity (see time derivative) and e is the electric charge of the particle. See also Electromagnetism (momentum). If neither φ nor A depends on position, P is conserved.[6]

The classical Hamiltonian H for a particle in any field equals the total energy of the system: the kinetic energy T = p²/2m (where p² = p·p, see dot product) plus the potential energy V. For a particle in an electromagnetic field, the potential energy is V = eφ, and since the kinetic energy T always corresponds to the kinetic momentum p, replacing the kinetic momentum by the above equation (p = P − eA) leads to the Hamiltonian in the table. The Lorentz force can be derived from these Lagrangian and Hamiltonian expressions.

Canonical commutation relations
The kinetic momentum (p above) satisfies the commutation relation:[21]

$$[p_j, p_k] = i\hbar e\, \epsilon_{jk\ell} B_\ell$$

where j, k, ℓ are indices labelling vector components, Bℓ is a component of the magnetic field, and εjkℓ is the Levi-Civita symbol, here in three dimensions.

Quantum mechanics
In quantum mechanics, momentum is defined as an operator on the wave function. The Heisenberg uncertainty principle defines limits on how accurately the momentum and position of a single observable system can be known at once. In quantum mechanics, position and momentum are conjugate variables. For a single particle described in the position basis the momentum operator can be written as

$$\hat{\mathbf{p}} = \frac{\hbar}{i} \nabla = -i\hbar \nabla$$

where ∇ is the gradient operator, ħ is the reduced Planck constant, and i is the imaginary unit. This is a commonly encountered form of the momentum operator, though the momentum operator in other bases can take other forms. For example, in the momentum basis the momentum operator is represented as

$$\hat{\mathbf{p}}\, \psi(p) = p\, \psi(p)$$

where the operator p acting on a wave function ψ(p) yields that wave function multiplied by the value p, in an analogous fashion to the way that the position operator acting on a wave function ψ(x) yields that wave function multiplied by the value x. For both massive and massless objects, relativistic momentum is related to the de Broglie wavelength λ by

$$p = \frac{h}{\lambda}$$

where h is the Planck constant.

Electromagnetic radiation (including visible light, ultraviolet light, and radio waves) is carried by photons. Even though photons (the particle aspect of light) have no mass, they still carry momentum. This leads to applications such as the solar sail. The calculation of the momentum of light within dielectric media is somewhat controversial (see Abraham–Minkowski controversy).[35]
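As a small worked example of photon momentum, the sketch below evaluates the de Broglie relation p = h/λ for green light and compares it with E/c, using the photon energy E = hc/λ. The 500 nm wavelength is an assumed illustrative value; the constants are the exact SI defining values of h and c.

```python
PLANCK_H = 6.62607015e-34      # Planck constant, J s
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def photon_momentum(wavelength):
    """Momentum p = h / lambda of a photon, in kg m/s."""
    return PLANCK_H / wavelength


def photon_energy(wavelength):
    """Energy E = h c / lambda of a photon, in joules."""
    return PLANCK_H * SPEED_OF_LIGHT / wavelength


wavelength = 500e-9  # 500 nm, green light (illustrative)
p = photon_momentum(wavelength)
E = photon_energy(wavelength)

# For a massless particle E = p c, so the two printed values should agree.
print(p, E / SPEED_OF_LIGHT)
```

The result is of order 10^-27 kg⋅m/s per photon, which is why a solar sail needs a large area and a long time to build up appreciable speed.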

Deformable bodies and fluids


Conservation
In fields such as fluid dynamics and solid mechanics, it is not feasible to follow the motion of individual atoms or molecules. Instead, the materials must be approximated by a continuum in which there is a particle or fluid parcel at each point that is assigned the average of the properties of atoms in a small region nearby. In particular, it has a density ρ and velocity v that depend on time t and position r. The momentum per unit volume is ρv.[36]

Motion of a material body

Consider a column of water in hydrostatic equilibrium. All the forces on the water are in balance and the water is motionless. On any given drop of water, two forces are balanced. The first is gravity, which acts directly on each atom and molecule inside. The gravitational force per unit volume is ρg, where g is the gravitational acceleration. The second force is the sum of all the forces exerted on its surface by the surrounding water. The force from below is greater than the force from above by just the amount needed to balance gravity. The normal force per unit area is the pressure p. The average force per unit volume inside the droplet is the negative of the pressure gradient, so the force balance equation is[37]

$$-\nabla p + \rho \mathbf{g} = 0$$

If the forces are not balanced, the droplet accelerates. This acceleration is not simply the partial derivative ∂v/∂t because the fluid in a given volume changes with time. Instead, the material derivative is needed:[38]

$$\frac{D}{Dt} \equiv \frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla.$$

Applied to any physical quantity, the material derivative includes the rate of change at a point and the changes due to advection as fluid is carried past the point. Per unit volume, the rate of change in momentum is equal to ρ Dv/Dt. This is equal to the net force on the droplet. Forces that can change the momentum of a droplet include the gradient of the pressure and gravity, as above. In addition, surface forces can deform the droplet. In the simplest case, a shear stress τ, exerted by a force parallel to the surface of the droplet, is proportional to the rate of deformation or strain rate. Such a shear stress occurs if the fluid has a velocity gradient because the fluid is moving faster on one side than another. If the speed in the x direction varies with z, the tangential force in direction x per unit area normal to the z direction is

$$\sigma_{zx} = -\mu\frac{\partial v_x}{\partial z},$$

where μ is the viscosity. This is also a flux, or flow per unit area, of x-momentum through the surface.[39] Including the effect of viscosity, the momentum balance equations for the incompressible flow of a Newtonian fluid are

$$\rho\frac{D\mathbf{v}}{Dt} = -\nabla p + \mu\nabla^2\mathbf{v} + \rho\mathbf{g}.$$

These are known as the Navier–Stokes equations.[40] The momentum balance equations can be extended to more general materials, including solids. For each surface with normal in direction i and force in direction j, there is a stress component σ_ij. The nine components make up the Cauchy stress tensor σ, which includes both pressure and shear. The local conservation of momentum is expressed by the Cauchy momentum equation:

$$\rho\frac{D\mathbf{v}}{Dt} = \nabla\cdot\boldsymbol{\sigma} + \mathbf{f},$$

where f is the body force.[41] The Cauchy momentum equation is broadly applicable to deformations of solids and liquids. The relationship between the stresses and the strain rate depends on the properties of the material (see Types of viscosity).

Acoustic waves
A disturbance in a medium gives rise to oscillations, or waves, that propagate away from their source. In a fluid, small changes in pressure p can often be described by the acoustic wave equation:

$$\frac{\partial^2 p}{\partial t^2} = c^2\nabla^2 p,$$

where c is the speed of sound. In a solid, similar equations can be obtained for propagation of pressure (P-waves) and shear (S-waves).[42] The flux, or transport per unit area, of a momentum component ρv_j by a velocity v_i is equal to ρv_j v_i. In the linear approximation that leads to the above acoustic equation, the time average of this flux is zero. However, nonlinear effects can give rise to a nonzero average.[43] It is possible for momentum flux to occur even though the wave itself does not have a mean momentum.[44]

History of the concept


In about 530 A.D., working in Alexandria, Byzantine philosopher John Philoponus developed a concept of momentum in his commentary to Aristotle's Physics. Aristotle claimed that everything that is moving must be kept moving by something. For example, a thrown ball must be kept moving by motions of the air. Most writers continued to accept Aristotle's theory until the time of Galileo, but a few were skeptical. Philoponus pointed out the absurdity in Aristotle's claim that motion of an object is promoted by the same air that is resisting its passage. He proposed instead that an impetus was imparted to the object in the act of throwing it.[45]

Ibn Sīnā (also known by his Latinized name Avicenna) read Philoponus and published his own theory of motion in The Book of Healing in 1020. He agreed that an impetus is imparted to a projectile by the thrower; but unlike Philoponus, who believed that it was a temporary virtue that would decline even in a vacuum, he viewed it as persistent, requiring external forces such as air resistance to dissipate it.[46][47][48] The work of Philoponus, and possibly that of Ibn Sīnā,[48] was read and refined by the European philosophers Peter Olivi and Jean Buridan. Buridan, who in about 1350 was made rector of the University of Paris, referred to impetus being proportional to the weight times the speed. Moreover, Buridan's theory differed from his predecessor's in that he did not consider impetus to be self-dissipating, asserting that a body would be arrested by the forces of air resistance and gravity which might be opposing its impetus.[49][50]

René Descartes believed that the total "quantity of motion" in the universe is conserved, where the quantity of motion is understood as the product of size and speed. This should not be read as a statement of the modern law of momentum, since he had no concept of mass as distinct from weight and size, and more importantly he believed that it is speed rather than velocity that is conserved. So for Descartes if a moving object were to bounce off a surface, changing its direction but not its speed, there would be no change in its quantity of motion.[51][52] Galileo, later, in his Two New Sciences, used the Italian word impeto.

The first correct statement of the law of conservation of momentum was by English mathematician John Wallis in his 1670 work, Mechanica sive De Motu, Tractatus Geometricus: "the initial state of the body, either of rest or of motion, will persist" and "If the force is greater than the resistance, motion will result".[53] Wallis uses momentum and vis for force. Newton's Philosophiæ Naturalis Principia Mathematica, when it was first published in 1687, showed a similar casting around for words to use for the mathematical momentum. His Definition II[54] defines quantitas motus, "quantity of motion", as "arising from the velocity and quantity of matter conjointly", which identifies it as momentum.[55] Thus when in Law II he refers to mutatio motus, "change of motion", being proportional to the force impressed, he is generally taken to mean momentum and not motion.[56] It remained only to assign a standard term to the quantity of motion.

The first use of "momentum" in its proper mathematical sense is not clear, but by the time of Jennings's Miscellanea in 1721, four years before the final edition of Newton's Principia Mathematica, momentum M or "quantity of motion" was being defined for students as "a rectangle", the product of Q and V, where Q is "quantity of material" and V is "velocity", s/t.[57]


Notes
[1] Feynman Vol. 1, Chapter 9
[2] "Euler's Laws of Motion" (http://www.bookrags.com/research/eulers-laws-of-motion-wom/). Retrieved 2009-03-30.
[3] McGill and King (1995). Engineering Mechanics, An Introduction to Dynamics (3rd ed.). PWS Publishing Company. ISBN 0-534-93399-8.
[4] Plastino, Angel R.; Muzzio, Juan C. (1992). "On the use and abuse of Newton's second law for variable mass problems". Celestial Mechanics and Dynamical Astronomy (Netherlands: Kluwer Academic Publishers) 53 (3): 227–232. Bibcode 1992CeMDA..53..227P. doi:10.1007/BF00052611. ISSN 0923-2958. "We may conclude emphasizing that Newton's second law is valid for constant mass only. When the mass varies due to accretion or ablation, [an alternate equation explicitly accounting for the changing mass] should be used."
[5] Feynman Vol. 1, Chapter 10
[6] Goldstein 1980, pp. 54–56
[7] Goldstein 1980, p. 276
[8] Carl Nave (2010). "Elastic and inelastic collisions" (http://hyperphysics.phy-astr.gsu.edu/hbase/elacol.html). Hyperphysics. Retrieved 2 August 2012.
[9] Serway, Raymond A.; John W. Jewett, Jr (2012). Principles of physics: a calculus-based text (5th ed.). Boston, MA: Brooks/Cole, Cengage Learning. p. 245. ISBN 9781133104261.
[10] Carl Nave (2010). "Forces in car crashes" (http://hyperphysics.phy-astr.gsu.edu/hbase/carcr.html#cc1). Hyperphysics. Retrieved 2 August 2012.
[11] Carl Nave (2010). "The Franck-Hertz Experiment" (http://hyperphysics.phy-astr.gsu.edu/hbase/FrHz.html). Hyperphysics. Retrieved 2 August 2012.
[12] McGinnis, Peter M. (2005). Biomechanics of sport and exercise (http://books.google.com/books?id=PrOKEcZXJ58C&pg=PA85&lpg=PA85&dq=coefficient+of+restitution+bounciness) (2nd ed.). Champaign, IL [u.a.]: Human Kinetics. p. 85. ISBN 9780736051019.
[13] Sutton, George (2001), "1" (http://books.google.com/?id=LQbDOxg3XZcC&printsec=frontcover), Rocket Propulsion Elements (7th ed.), Chichester: John Wiley & Sons, ISBN 978-0-471-32642-7
[14] Feynman Vol. 1, Chapter 11
[15] Rindler 1986, pp. 26–27
[16] Kleppner; Kolenkow. An Introduction to Mechanics. pp. 135–39.
[17] Goldstein 1980, pp. 11–13
[18] Jackson 1975, p. 574
[19] Feynman Vol. 3, Chapter 21-3
[20] Goldstein 1980, pp. 20–21
[21] Lerner, Rita G., ed. (2005). Encyclopedia of physics (3rd ed.). Weinheim: Wiley-VCH-Verl. ISBN 978-3527405541.
[22] Goldstein 1980, pp. 341–342
[23] Goldstein 1980, p. 348
[24] Hand, Louis N.; Finch, Janet D. (1998). Analytical mechanics (7th print ed.). Cambridge, England: Cambridge University Press. Chapter 4. ISBN 9780521575720.
[25] Rindler 1986, Chapter 2
[26] Feynman Vol. 1, Chapter 15-2
[27] Rindler 1986, pp. 77–81
[28] Rindler 1986, p. 66
[29] Misner, Charles W.; Kip S. Thorne, John Archibald Wheeler (1973). Gravitation. 24th printing. New York: W. H. Freeman. p. 51. ISBN 9780716703440.
[30] Here the time coordinate comes first. Several sources put the time coordinate at the end of the vector.
[31] Rindler 1986, pp. 86–87
[32] Goldstein 1980, pp. 7–8
[33] Jackson 1975, pp. 238–241. Expressions, given in Gaussian units in the text, were converted to SI units using Table 3 in the Appendix.
[34] Feynman Vol. 1, Chapter 27-6
[35] Barnett, Stephen M. (2010). "Resolution of the Abraham-Minkowski Dilemma". Physical Review Letters 104 (7). Bibcode 2010PhRvL.104g0401B. doi:10.1103/PhysRevLett.104.070401.
[36] Tritton 2006, pp. 48–51
[37] Feynman Vol. 2, Chapter 40
[38] Tritton 2006, p. 54
[39] Bird, R. Byron; Warren Stewart; Edwin N. Lightfoot (2007). Transport phenomena (2nd revised ed.). New York: Wiley. p. 13. ISBN 9780470115398.
[40] Tritton 2006, p. 58
[41] Acheson, D. J. (1990). Elementary Fluid Dynamics. Oxford University Press. p. 205. ISBN 0-19-859679-0.
[42] Gubbins, David (1992). Seismology and plate tectonics (Repr. (with corr.) ed.). Cambridge [England]: Cambridge University Press. p. 59. ISBN 0521379954.
[43] LeBlond, Paul H.; Mysak, Lawrence A. (1980). Waves in the ocean (2. impr. ed.). Amsterdam [u.a.]: Elsevier. p. 258. ISBN 9780444419262.
[44] McIntyre, M. E. (1981). "On the 'wave momentum' myth". J. Fluid. Mech 106: 331–347.
[45] "John Philoponus" (http://plato.stanford.edu/entries/philoponus/#2.1). Stanford Encyclopedia of Philosophy. 8 June 2007. Retrieved 26 July 2012.
[46] Fernando Espinoza (2005). "An analysis of the historical development of ideas about motion and its implications for teaching", Physics Education 40 (2), p. 141.
[47] Seyyed Hossein Nasr & Mehdi Amin Razavi (1996). The Islamic intellectual tradition in Persia. Routledge. p. 72. ISBN 0-7007-0314-4.
[48] Aydin Sayili (1987). "Ibn Sīnā and Buridan on the Motion of the Projectile". Annals of the New York Academy of Sciences 500 (1): 477–482. Bibcode 1987NYASA.500..477S. doi:10.1111/j.1749-6632.1987.tb37219.x.
[49] T.F. Glick, S.J. Livesay, F. Wallis. "Buridian, John". Medieval Science, Technology and Medicine: an Encyclopedia. p. 107.
[50] Park, David (1990). The how and the why: an essay on the origins and development of physical theory. With drawings by Robin Brickman (3rd print ed.). Princeton, N.J.: Princeton University Press. pp. 139–141. ISBN 9780691025087.
[51] Daniel Garber (1992). "Descartes' Physics". In John Cottingham. The Cambridge Companion to Descartes. Cambridge: Cambridge University Press. pp. 310–319. ISBN 0-521-36696-8.
[52] Rothman, Milton A. (1989). Discovering the natural laws: the experimental basis of physics ([2ème édition, revue et augmentée] ed.). New York: Dover Publications. pp. 83–88. ISBN 9780486261782.
[53] Scott, J.F. (1981). The Mathematical Work of John Wallis, D.D., F.R.S. Chelsea Publishing Company. p. 111. ISBN 0-8284-0314-7.
[54] Newton placed his definitions up front as did Wallis, with whom Newton can hardly fail to have been familiar.
[55] Grimsehl, Ernst; Leonard Ary Woodward, Translator (1932). A Textbook of Physics. London, Glasgow: Blackie & Son limited. p. 78.
[56] Rescigno, Aldo (2003). Foundation of Pharmacokinetics. New York: Kluwer Academic/Plenum Publishers. p. 19. ISBN 0306477041.
[57] Jennings, John (1721). Miscellanea in Usum Juventutis Academicae. Northampton: R. Aikes & G. Dicey. p. 67.


References Further reading


Halliday, David; Robert Resnick (1960-2007). Fundamentals of Physics. John Wiley & Sons. Chapter 9.
Dugas, René (1988). A history of mechanics. Translated into English by J.R. Maddox (Dover ed.). New York: Dover Publications. ISBN 9780486656328.
Feynman, Richard P.; Leighton, Robert B.; Sands, Matthew (2005). The Feynman lectures on physics, Volume 1: Mainly Mechanics, Radiation, and Heat (Definitive ed.). San Francisco, Calif.: Pearson Addison-Wesley. ISBN 978-0805390469.
Feynman, Richard P.; Leighton, Robert B.; Sands, Matthew (2005). The Feynman lectures on physics, Volume III: Quantum Mechanics (Definitive ed.). New York: BasicBooks. ISBN 978-0805390490.
Goldstein, Herbert (1980). Classical mechanics (2d ed.). Reading, Mass.: Addison-Wesley Pub. Co. ISBN 0201029189.
Hand, Louis N.; Finch, Janet D. Analytical Mechanics. Cambridge University Press. Chapter 4.
Jackson, John David (1975). Classical electrodynamics (2d ed.). New York: Wiley. ISBN 047143132X.
Landau, L.D.; E.M. Lifshitz (2000). The classical theory of fields. 4th rev. English edition, reprinted with corrections; translated from the Russian by Morton Hamermesh. Oxford: Butterworth Heinemann. ISBN 9780750627689.
Rindler, Wolfgang (1986). Essential Relativity: Special, general and cosmological (Rev. 2. ed.). New York u.a.: Springer. ISBN 0387100903.
Serway, Raymond; Jewett, John (2003). Physics for Scientists and Engineers (6 ed.). Brooks Cole. ISBN 0-534-40842-7.
Stenger, Victor J. (2000). Timeless Reality: Symmetry, Simplicity, and Multiple Universes. Prometheus Books. Chpt. 12 in particular.
Tipler, Paul (1998). Physics for Scientists and Engineers: Vol. 1: Mechanics, Oscillations and Waves, Thermodynamics (4th ed.). W. H. Freeman. ISBN 1-57259-492-6.
Tritton, D.J. (2006). Physical fluid dynamics (2nd ed.). Oxford: Clarendon Press. p. 58. ISBN 0198544936.

External links
Conservation of momentum (http://www.lightandmatter.com/html_books/lm/ch14/ch14.html) A chapter from an online textbook

Angular momentum

In physics, angular momentum, moment of momentum, or rotational momentum[1][2] is a vector quantity that represents the product of a body's rotational inertia and rotational velocity about a particular axis. The angular momentum of a system of particles (e.g. a rigid body) is the sum of angular momenta of the individual particles. For a rigid body rotating around an axis of symmetry (e.g. the blades of a ceiling fan), the angular momentum can be expressed as the product of the body's moment of inertia, I (i.e., a measure of an object's resistance to changes in its rotation rate), and its angular velocity ω:

$$\mathbf{L} = I\boldsymbol{\omega}.$$

In this way, angular momentum is sometimes described as the rotational analog of linear momentum. For the case of an object that is small compared with the radial distance to its axis of rotation, such as a tin can swinging from a long string or a planet orbiting in a circle around the Sun, the angular momentum can be expressed as the cross product of its position from the origin, r, and its linear momentum, mv. Thus, the angular momentum L of a particle with respect to some point of origin is

$$\mathbf{L} = \mathbf{r}\times m\mathbf{v}.$$

This gyroscope remains upright while spinning due to its angular momentum.

Angular momentum is conserved in a system where there is no net external torque, and its conservation helps explain many diverse phenomena. For example, the increase in rotational speed of a spinning figure skater as the skater's arms are contracted is a consequence of conservation of angular momentum. The very high rotational rates of neutron stars can also be explained in terms of angular momentum conservation. Moreover, angular momentum conservation has numerous applications in physics and engineering (e.g., the gyrocompass).

Angular momentum in classical mechanics


Definition
The angular momentum L of a particle about a given origin is defined as:

$$\mathbf{L} = \mathbf{r}\times\mathbf{p},$$

where r is the position vector of the particle relative to the origin, p is the linear momentum of the particle, and × denotes the cross product. As seen from the definition, the derived SI units of angular momentum are newton meter seconds (N·m·s or kg·m²/s) or joule seconds (J·s). Because of the cross product, L is a pseudovector perpendicular to both the radial vector r and the momentum vector p, and it is assigned a sign by the right-hand rule.
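The definition can be tried out numerically. The short sketch below computes L = r × p for a single particle and checks that the result is perpendicular to both r and p. The mass, position, and velocity are arbitrary illustrative values, and the helper names `cross` and `dot` are assumptions introduced for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


mass = 2.0                  # kg (illustrative)
r = (3.0, 0.0, 0.0)         # position relative to the origin, m
v = (0.0, 4.0, 0.0)         # velocity, m/s
p = tuple(mass * c for c in v)  # linear momentum p = m v, kg*m/s

L = cross(r, p)             # angular momentum about the origin, kg*m^2/s
print("L =", L)
print("L . r =", dot(L, r), " L . p =", dot(L, p))
```

With r along x and p along y, the right-hand rule puts L along +z, which is what the cross product returns.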

Relationship between force (F), torque (τ), momentum (p), and angular momentum (L) vectors in a rotating system

For an object with a fixed mass that is rotating about a fixed symmetry axis, the angular momentum is expressed as the product of the moment of inertia of the object and its angular velocity vector:

$$\mathbf{L} = I\boldsymbol{\omega},$$

where I is the moment of inertia of the object (in general, a tensor quantity), and ω is the angular velocity. The angular momentum of a particle or rigid body in rectilinear motion (pure translation) is a vector with constant magnitude and direction. If the path of the particle or center of mass of the rigid body passes through the given origin, its angular momentum is zero. Angular momentum is also known as moment of momentum.

Angular momentum of a collection of particles


If a system consists of several particles, the total angular momentum about a point can be obtained by adding (or integrating) all the angular momenta of the constituent particles, i.e.,

$$\mathbf{L} = \sum_i \mathbf{L}_i.$$

Angular momentum simplified using the center of mass


It is very often convenient to consider the angular momentum of a collection of particles about their center of mass, since this simplifies the mathematics considerably. The angular momentum of a collection of particles is the sum of the angular momentum of each particle:

$$\mathbf{L} = \sum_i \mathbf{R}_i\times m_i\mathbf{V}_i,$$

where R_i is the position vector of particle i from the reference point, m_i is its mass, and V_i is its velocity. The center of mass is defined by:

$$\mathbf{R} = \frac{1}{M}\sum_i m_i\mathbf{R}_i,$$

where the total mass of all particles is given by

$$M = \sum_i m_i.$$

It follows that the velocity of the center of mass is

$$\mathbf{V} = \frac{1}{M}\sum_i m_i\mathbf{V}_i.$$

If we define r_i as the displacement of particle i from the center of mass, and v_i as the velocity of particle i with respect to the center of mass, then we have

$$\mathbf{R}_i = \mathbf{R} + \mathbf{r}_i \quad\text{and}\quad \mathbf{V}_i = \mathbf{V} + \mathbf{v}_i,$$

and also

$$\sum_i m_i\mathbf{r}_i = 0 \quad\text{and}\quad \sum_i m_i\mathbf{v}_i = 0,$$

so that the total angular momentum about the reference point is

$$\mathbf{L} = \sum_i (\mathbf{R} + \mathbf{r}_i)\times m_i(\mathbf{V} + \mathbf{v}_i) = \mathbf{R}\times M\mathbf{V} + \sum_i \mathbf{r}_i\times m_i\mathbf{v}_i.$$

The first term is just the angular momentum of the center of mass. It is the same angular momentum one would obtain if there were just one particle of mass M moving at velocity V located at the center of mass. The second term is the angular momentum that is the result of the particles moving relative to their center of mass. This second term can be even further simplified if the particles form a rigid body, in which case it is the product of moment of inertia and angular velocity of the spinning motion (as above). The same result is true if the discrete point masses discussed above are replaced by a continuous distribution of matter.
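The split of the total angular momentum into a center-of-mass term and a relative-motion term can be checked numerically. The sketch below uses three point masses with arbitrary, purely illustrative positions and velocities; the helper functions (`cross`, `add`, `sub`, `scale`, `vsum`) are names assumed for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(s, a):
    return tuple(s * x for x in a)


def vsum(vectors):
    """Sum an iterable of 3-vectors."""
    total = (0.0, 0.0, 0.0)
    for vec in vectors:
        total = add(total, vec)
    return total


# Illustrative system: masses (kg), positions (m) and velocities (m/s)
# measured from an arbitrary reference point.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
velocities = [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]

total_mass = sum(masses)
com_position = scale(1.0 / total_mass,
                     vsum(scale(m, R) for m, R in zip(masses, positions)))
com_velocity = scale(1.0 / total_mass,
                     vsum(scale(m, V) for m, V in zip(masses, velocities)))

# Direct sum over particles, about the reference point.
L_total = vsum(cross(R, scale(m, V))
               for m, R, V in zip(masses, positions, velocities))

# Term 1: a single particle of the total mass moving with the center of mass.
L_com = cross(com_position, scale(total_mass, com_velocity))

# Term 2: motion of the particles relative to the center of mass.
L_rel = vsum(cross(sub(R, com_position), scale(m, sub(V, com_velocity)))
             for m, R, V in zip(masses, positions, velocities))

print("direct sum      :", L_total)
print("center-of-mass  :", L_com)
print("relative motion :", L_rel)
print("sum of two terms:", add(L_com, L_rel))
```

The cross terms drop out because the mass-weighted displacements and velocities relative to the center of mass sum to zero, so the last line matches the direct sum up to rounding.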


Fixed axis of rotation


For many applications where one is only concerned about rotation around one axis, it is sufficient to discard the pseudovector nature of angular momentum, and treat it like a scalar where it is positive when it corresponds to a counter-clockwise rotation, and negative clockwise. To do this, just take the definition of the cross product and discard the unit vector, so that angular momentum becomes:

$$L = |\mathbf{r}||\mathbf{p}|\sin\theta_{r,p},$$

Figure: Angular momentum in terms of scalar and vector components.

where θ_{r,p} is the angle between r and p measured from r to p; an important distinction because without it, the sign of the cross product would be meaningless. From the above, it is possible to reformulate the definition to either of the following:

$$L = \pm|\mathbf{p}||\mathbf{r}_\perp|,$$

where r_⊥ is called the lever arm distance to p.

The easiest way to conceptualize this is to consider the lever arm distance to be the distance from the origin to the line that p travels along. With this definition, it is necessary to consider the direction of p (pointed clockwise or counter-clockwise) to figure out the sign of L. Equivalently:

$$L = \pm|\mathbf{r}||\mathbf{p}_\perp|,$$

where p_⊥ is the component of p that is perpendicular to r. As above, the sign is decided based on the sense of rotation.
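The three scalar forms (the sine formula, the lever-arm form, and the perpendicular-momentum form) can be compared for motion in a plane. In the sketch below the vectors r and p are arbitrary illustrative values, the lever arm is computed geometrically as the distance from the origin to the line along p, and all variable names are assumptions made for the example.

```python
import math

# Illustrative planar position (m) and momentum (kg*m/s).
r = (3.0, 4.0)
p = (-2.0, 1.0)

r_len = math.hypot(*r)
p_len = math.hypot(*p)

# Signed scalar angular momentum (z-component of r x p);
# positive means counter-clockwise.
L_cross = r[0] * p[1] - r[1] * p[0]

# Sine form: theta is the angle measured from r to p.
theta = math.atan2(p[1], p[0]) - math.atan2(r[1], r[0])
L_sine = r_len * p_len * math.sin(theta)

# Lever arm: distance from the origin to the line through r along p,
# found by removing from r its component parallel to p.
r_dot_p = r[0] * p[0] + r[1] * p[1]
r_perp = math.hypot(r[0] - r_dot_p / p_len**2 * p[0],
                    r[1] - r_dot_p / p_len**2 * p[1])

# Component of p perpendicular to r, found the same way.
p_perp = math.hypot(p[0] - r_dot_p / r_len**2 * r[0],
                    p[1] - r_dot_p / r_len**2 * r[1])

sign = 1.0 if L_cross >= 0 else -1.0  # sense of rotation
L_lever = sign * p_len * r_perp
L_pperp = sign * r_len * p_perp

print(L_cross, L_sine, L_lever, L_pperp)
```

All four numbers agree up to rounding; the sign in the last two forms has to be supplied separately from the sense of rotation, as the text notes.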

For an object with a fixed mass that is rotating about a fixed symmetry axis, the angular momentum is expressed as the product of the moment of inertia of the object and its angular velocity vector:

$$\mathbf{L} = I\boldsymbol{\omega},$$

where I is the moment of inertia of the object (in general, a tensor quantity) and ω is the angular velocity. It is a misconception that angular momentum is always about the same axis as angular velocity. Sometimes this is not the case; then the angular momentum component along the axis of rotation is the product of the angular velocity and the moment of inertia about the given axis of rotation. As the kinetic energy K of a massive rotating body is given by

$$K = \frac{1}{2}I\omega^2,$$

it is proportional to the square of the angular velocity.


Conservation of angular momentum


The law of conservation of angular momentum states that when no external torque acts on an object or a closed system of objects, no change of angular momentum can occur. Hence, the angular momentum before an event involving only internal torques or no torques is equal to the angular momentum after the event. This conservation law mathematically follows from continuous directional symmetry of space (no direction in space is any different from any other direction). See Noether's theorem.[3] The time derivative of angular momentum is called torque:

$$\boldsymbol{\tau} = \frac{d\mathbf{L}}{dt} = \frac{d\mathbf{r}}{dt}\times\mathbf{p} + \mathbf{r}\times\frac{d\mathbf{p}}{dt} = 0 + \mathbf{r}\times\mathbf{F} = \mathbf{r}\times\mathbf{F}.$$

An example of angular momentum conservation. A spinning figure skater reduces her moment of inertia by pulling in her arms, causing her rotation rate to increase.

(The cross-product of velocity and momentum is zero, because these vectors are parallel.) So requiring the system to be "closed" here is mathematically equivalent to zero external torque acting on the system:

$$\mathbf{L}_{\mathrm{system}} = \text{constant} \iff \sum\boldsymbol{\tau}_{\mathrm{ext}} = 0,$$

where τ_ext is any torque applied to the system of particles. It is assumed that internal interaction forces obey Newton's third law of motion in its strong form, that is, that the forces between particles are equal and opposite and act along the line between the particles. In orbits, the angular momentum is distributed between the spin of the planet itself and the angular momentum of its orbit:

$$\mathbf{L}_{\mathrm{total}} = \mathbf{L}_{\mathrm{spin}} + \mathbf{L}_{\mathrm{orbit}}.$$

If a planet is found to rotate slower than expected, then astronomers suspect that the planet is accompanied by a satellite, because the total angular momentum is shared between the planet and its satellite in order to be conserved. The conservation of angular momentum is used extensively in analyzing what is called central force motion. If the net force on some body is directed always toward some fixed point, the center, then there is no torque on the body with respect to the center, and so the angular momentum of the body about the center is constant. Constant angular momentum is extremely useful when dealing with the orbits of planets and satellites, and also when analyzing the Bohr model of the atom.
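The statement about central force motion can be illustrated with a small simulation. The sketch below follows a unit-mass body under an inverse-square attraction toward the origin and records the scalar angular momentum x·v_y − y·v_x at every step. The force law, the units (GM = 1), the initial conditions, and the choice of a semi-implicit Euler integrator are all assumptions made for illustration.

```python
GM = 1.0  # strength of the central attraction (illustrative units)


def simulate(steps=10000, dt=0.001):
    """Integrate a planar orbit and return the angular momentum at each step."""
    x, y = 1.0, 0.0      # initial position
    vx, vy = 0.0, 0.8    # initial velocity (gives a bound, elliptical orbit)
    history = []
    for _ in range(steps):
        r_cubed = (x * x + y * y) ** 1.5
        # Kick: the acceleration points along -r, so it exerts no torque.
        vx -= GM * x / r_cubed * dt
        vy -= GM * y / r_cubed * dt
        # Drift: move with the updated velocity.
        x += vx * dt
        y += vy * dt
        history.append(x * vy - y * vx)  # angular momentum per unit mass
    return history


L_values = simulate()
print("min L:", min(L_values), "max L:", max(L_values))
```

With this integrator the velocity change is always parallel to the position vector and the position change is always parallel to the velocity, so neither sub-step alters x·v_y − y·v_x; the recorded values differ only by rounding error even though the speed and distance vary along the orbit.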


The conservation of angular momentum explains the angular acceleration of an ice skater as she brings her arms and legs close to the vertical axis of rotation. By bringing part of the mass of her body closer to the axis she decreases her body's moment of inertia. Because angular momentum is constant in the absence of external torques, the angular velocity (rotational speed) of the skater has to increase. The same phenomenon results in the extremely fast spin of compact stars (like white dwarfs, neutron stars and black holes) when they are formed out of much larger and slower rotating stars (indeed, decreasing the size of an object 10⁴ times results in an increase of its angular velocity by a factor of 10⁸). The conservation of angular momentum in the Earth–Moon system results in the transfer of angular momentum from Earth to Moon (due to the tidal torque the Moon exerts on the Earth). This in turn results in the slowing down of the rotation rate of Earth (at about 42 ns/day), and in a gradual increase of the radius of the Moon's orbit (at a rate of ~4.5 cm/year).
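Both examples follow from I₁ω₁ = I₂ω₂ when no external torque acts. The sketch below applies this to a skater and to a collapsing star. The skater's moments of inertia and the star's mass, radius, and initial spin are made-up illustrative numbers, and modelling the star as a uniform sphere (I = 2/5 MR²) at fixed mass is an assumption of this example, not something stated in the article.

```python
def spin_after(i_before, omega_before, i_after):
    """Angular velocity after the moment of inertia changes, with L conserved."""
    return i_before * omega_before / i_after


def sphere_moment_of_inertia(mass, radius):
    """Moment of inertia of a uniform solid sphere about a diameter."""
    return 0.4 * mass * radius ** 2


# Skater: arms out -> arms in (illustrative values, kg*m^2 and rad/s).
omega_skater = spin_after(i_before=4.0, omega_before=2.0, i_after=1.6)
print("skater spin after pulling arms in:", omega_skater, "rad/s")

# Star: radius shrinks by a factor of 10^4 at constant mass (assumed).
mass = 2.0e30          # kg (illustrative)
radius_before = 1.0e9  # m (illustrative)
radius_after = radius_before / 1.0e4
omega_before = 1.0e-6  # rad/s (illustrative)

omega_after = spin_after(sphere_moment_of_inertia(mass, radius_before),
                         omega_before,
                         sphere_moment_of_inertia(mass, radius_after))
speedup = omega_after / omega_before
print(f"spin-up factor for the star: {speedup:.3e}")
```

Because the moment of inertia scales with the square of the radius, a reduction in size by 10⁴ raises the angular velocity by (10⁴)² = 10⁸, matching the factor quoted above.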

The torque caused by the two opposing forces Fg and -Fg causes a change in the angular momentum L in the direction of that torque (since torque is the time derivative of angular momentum). This causes the top to precess.

Angular momentum in relativistic mechanics


In modern (late 20th century) theoretical physics, angular momentum is described using a different formalism. Under this formalism, angular momentum is the 2-form Noether charge associated with rotational invariance. (As a result, angular momentum is not conserved for general curved spacetimes, unless the spacetime happens to be asymptotically rotationally invariant.) For a system of point particles without any intrinsic angular momentum (see below), it turns out to be

$$\mathbf{L} = \sum_i \mathbf{r}_i\wedge\mathbf{p}_i.$$

(Here, the wedge product is used.) In the language of four-vectors and tensors the angular momentum of a particle in relativistic mechanics is expressed as an antisymmetric tensor of second order

$$L_{\alpha\beta} = X_\alpha P_\beta - X_\beta P_\alpha.$$


Angular momentum in quantum mechanics


Angular momentum in quantum mechanics differs in many profound respects from angular momentum in classical mechanics.

Spin, orbital, and total angular momentum


The classical definition of angular momentum as L = r × p can be carried over to quantum mechanics, by reinterpreting r as the quantum position operator and p as the quantum momentum operator. L is then an operator, specifically called the orbital angular momentum operator. However, in quantum physics, there is another type of angular momentum, called spin angular momentum, represented by the spin operator S. Almost all elementary particles have spin. Spin is often depicted as a particle literally spinning around an axis, but this is a misleading and inaccurate picture: spin is an intrinsic property of a particle, fundamentally different from orbital angular momentum. All elementary particles have a characteristic spin; for example, electrons always have "spin 1/2" while photons always have "spin 1".

Angular momenta of a classical object. Left: intrinsic "spin" angular momentum S is really orbital angular momentum of the object at every point. Right: extrinsic orbital angular momentum L about an axis. Top: the moment of inertia tensor I and angular velocity ω (L is not always parallel to ω).[4] Bottom: momentum p and its radial position r from the axis. The total angular momentum (spin + orbital) is J.

Finally, there is total angular momentum J, which combines both the spin and orbital angular momentum of all particles and fields. (For one particle, J = L + S.) Conservation of angular momentum applies to J, but not to L or S; for example, the spin–orbit interaction allows angular momentum to transfer back and forth between L and S, with the total remaining constant.

Quantization
In quantum mechanics, angular momentum is quantized; that is, it cannot vary continuously, but only in "quantum leaps" between certain allowed values. For any system, the following restrictions on measurement results apply, where ħ is the reduced Planck constant and n̂ is any direction vector such as x, y, or z:

If you measure...      The result can be...
L_n̂                    ..., −2ħ, −ħ, 0, ħ, 2ħ, ...
S_n̂ or J_n̂             ..., −3ħ/2, −ħ, −ħ/2, 0, ħ/2, ħ, 3ħ/2, ...
L²                     ħ²n(n+1), where n = 0, 1, 2, ...
(S² or J²)             ħ²n(n+1), where n = 0, 1/2, 1, 3/2, ...

(There are additional restrictions as well, see angular momentum operator for details.)

The reduced Planck constant ħ is tiny by everyday standards, about 10⁻³⁴ J·s, and therefore this quantization does not noticeably affect the angular momentum of macroscopic objects. However, it is very important in the microscopic world. For example, the structure of electron shells and subshells in chemistry is significantly affected by the quantization of angular momentum. Quantization of angular momentum was first postulated by Niels Bohr in his Bohr model of the atom.
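To make the allowed values concrete, the sketch below lists, for a given quantum number n, the possible results of measuring one component of the angular momentum and the corresponding magnitude ħ√(n(n+1)). It relies on the standard rule that the component ranges from −nħ to +nħ in steps of ħ, which is one of the additional restrictions the text alludes to rather than something spelled out above. The function names are assumptions for this example, and the value of ħ is the usual SI value.

```python
import math
from fractions import Fraction

HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def allowed_projections(n):
    """Allowed component values, in units of hbar, for quantum number n.

    n must be a non-negative integer or half-integer.
    """
    n = Fraction(n)
    if n < 0 or (2 * n).denominator != 1:
        raise ValueError("n must be a non-negative integer or half-integer")
    count = int(2 * n) + 1
    return [-n + k for k in range(count)]


def magnitude(n):
    """Magnitude of the angular momentum, hbar * sqrt(n(n+1)), in J*s."""
    n = Fraction(n)
    return HBAR * math.sqrt(float(n * (n + 1)))


for n in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
    values = ", ".join(str(m) for m in allowed_projections(n))
    print(f"n = {n}: components (units of hbar) = [{values}], "
          f"magnitude = {magnitude(n):.3e} J*s")
```

Integer n corresponds to the orbital rows of the table, while half-integer n can occur only for spin or total angular momentum.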

Uncertainty
In the definition L = r × p, six operators are involved: the position operators r_x, r_y, r_z, and the momentum operators p_x, p_y, p_z. However, the Heisenberg uncertainty principle tells us that it is not possible for all six of these quantities to be known simultaneously with arbitrary precision. Therefore, there are limits to what can be known or measured about a particle's angular momentum. It turns out that the best that one can do is to simultaneously measure both the angular momentum vector's magnitude and its component along one axis. The uncertainty is closely related to the fact that different components of an angular momentum operator do not commute, for example L_x L_y ≠ L_y L_x. (For the precise commutation relations, see angular momentum operator.)

In this standing wave on a circular string, the circle is broken into exactly 8 wavelengths. A standing wave like this can have 0,1,2, or any integer number of wavelengths around the circle, but it cannot have a non-integer number of wavelengths like 8.3. In quantum mechanics, angular momentum is quantized for a similar reason.

Total angular momentum as generator of rotations


As mentioned above, orbital angular momentum L is defined as in classical mechanics: L = r × p, but total angular momentum J is defined in a different, more basic way: J is defined as the "generator of rotations".[5] More specifically, J is defined so that the operator

$$R(\hat{n},\phi) \equiv \exp\left(-\frac{i}{\hbar}\,\phi\,\mathbf{J}\cdot\hat{\mathbf{n}}\right)$$

is the rotation operator that takes any system and rotates it by angle φ about the axis n̂.

The relationship between the angular momentum operator and the rotation operators is the same as the relationship between Lie algebras and Lie groups in mathematics. The close relationship between angular momentum and rotations is reflected in Noether's theorem, which proves that angular momentum is conserved whenever the laws of physics are rotationally invariant.


Angular momentum in electrodynamics


When describing the motion of a charged particle in an electromagnetic field, the canonical momentum P (derived from the Lagrangian for this system) is not gauge invariant. As a consequence, the canonical angular momentum L = r × P is not gauge invariant either. Instead, the momentum that is physical, the so-called kinetic momentum (used throughout this article), is (in SI units)

$$\mathbf{p} = m\mathbf{v} = \mathbf{P} - e\mathbf{A},$$

where e is the electric charge of the particle and A the magnetic vector potential of the electromagnetic field. The gauge-invariant angular momentum, that is kinetic angular momentum, is given by

$$\mathbf{K} = \mathbf{r}\times(\mathbf{P} - e\mathbf{A}).$$

The interplay with quantum mechanics is discussed further in the article on canonical commutation relations.

Footnotes
[1] Truesdell, Clifford (1991). A First Course in Rational Continuum Mechanics: General concepts (http://books.google.com/books?id=l5J3oQ6V5RsC&lpg=PA37&dq=rotational momentum&pg=PA37#v=onepage&q=rotational momentum&f=false). Academic Press. ISBN 0-12-701300-8.
[2] Smith, Donald Ray; Truesdell, Clifford (1993). An introduction to continuum mechanics - after Truesdell and Noll (http://books.google.com/books?id=ZcWC7YVdb4wC&lpg=PP1&pg=PA100#v=onepage&q&f=false). Springer. ISBN 0-7923-2454-4.
[3] Landau, L. D.; Lifshitz, E. M. (1995). The classical theory of fields. Course of Theoretical Physics. Oxford, Butterworth-Heinemann. ISBN 0-7506-2768-9.
[4] R.P. Feynman, R.B. Leighton, M. Sands (1964). Feynman's Lectures on Physics (volume 2). Addison-Wesley. pp. 31–7. ISBN 9-780-201-021172.
[5] Littlejohn, Robert (2011). "Lecture notes on rotations in quantum mechanics" (http://bohr.physics.berkeley.edu/classes/221/1011/notes/spinrot.pdf). Physics 221B Spring 2011 (http://bohr.physics.berkeley.edu/classes/221/1011/221.html). Retrieved 13 Jan 2012.

References
Cohen-Tannoudji, Claude; Diu, Bernard; Laloë, Franck (2006). Quantum Mechanics (2 volume set ed.). John Wiley & Sons. ISBN 978-0-471-56952-7.
Condon, E. U.; Shortley, G. H. (1935). "Especially Chapter 3". The Theory of Atomic Spectra. Cambridge University Press. ISBN 0-521-09209-4.
Edmonds, A. R. (1957). Angular Momentum in Quantum Mechanics. Princeton University Press. ISBN 0-691-07912-9.
Jackson, John David (1998). Classical Electrodynamics (3rd ed.). John Wiley & Sons. ISBN 978-0-471-30932-1.
Serway, Raymond A.; Jewett, John W. (2004). Physics for Scientists and Engineers (6th ed.). Brooks/Cole. ISBN 0-534-40842-7.
Thompson, William J. (1994). Angular Momentum: An Illustrated Guide to Rotational Symmetries for Physical Systems. Wiley. ISBN 0-471-55264-X.
Tipler, Paul (2004). Physics for Scientists and Engineers: Mechanics, Oscillations and Waves, Thermodynamics (5th ed.). W. H. Freeman. ISBN 0-7167-0809-4.


External links
Conservation of Angular Momentum (http://www.lightandmatter.com/html_books/lm/ch15/ch15.html) - a chapter from an online textbook Angular Momentum in a Collision Process (http://www.hakenberg.de/diffgeo/collision_resolution.htm) derivation of the three dimensional case

Charge conservation
In physics, charge conservation is the principle that electric charge can neither be created nor destroyed. The net quantity of electric charge, the amount of positive charge minus the amount of negative charge in the universe, is always conserved. The first written statement of the principle was by American scientist and statesman Benjamin Franklin in 1747.[1]

"it is now discovered and demonstrated, both here and in Europe, that the Electrical Fire is a real Element, or Species of Matter, not created by the Friction, but collected only."
Benjamin Franklin, Letter to Cadwallader Colden, 5 June 1747[2]

Charge conservation is a physical law that states that the change in the amount of electric charge in any volume of space is exactly equal to the amount of charge flowing into the volume minus the amount of charge flowing out of the volume. In essence, charge conservation is an accounting relationship between the amount of charge in a region and the flow of charge into and out of that region. Mathematically, we can state the law as a continuity equation:

$$Q(t_2) = Q(t_1) + Q_{IN} - Q_{OUT}$$

Here $Q(t)$ is the quantity of electric charge in a specific volume at time $t$, $Q_{IN}$ is the amount of charge flowing into the volume between time $t_1$ and $t_2$, and $Q_{OUT}$ is the amount of charge flowing out of the volume during the same time period.

This does not mean that individual positive and negative charges cannot be created or destroyed. Electric charge is carried by subatomic particles such as electrons and protons, which can be created and destroyed. In particle physics, charge conservation means that in elementary particle reactions that create charged particles, equal numbers of positive and negative particles are always created, keeping the net amount of charge unchanged. Similarly, when particles are destroyed, equal numbers of positive and negative charges are destroyed.

Although conservation of charge requires that the total quantity of charge in the universe is constant, it leaves open the question of what that quantity is. Most evidence indicates that the net charge in the universe is zero;[3][4] that is, there are equal quantities of positive and negative charge.

Formal statement of the law


Vector calculus can be used to express the law in terms of charge density $\rho$ (in coulombs per cubic meter) and electric current density $\mathbf{J}$ (in amperes per square meter):

$$\frac{\partial \rho}{\partial t} = -\nabla \cdot \mathbf{J}$$

The term on the left is the rate of change of the charge density at a point. The term on the right is the negative of the divergence of the current density $\mathbf{J}$. The equation equates these two terms, which says that the only way for the charge density at a point to change is for a current of charge to flow into or out of the point. This statement is equivalent to a conservation of four-current.
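
The accounting character of the continuity equation can be checked numerically. The following is a minimal sketch, assuming a one-dimensional grid of cells and a simple upwind current; the function name `step_charge`, the grid size, the drift velocity and the initial charge profile are all illustrative assumptions, not part of the law itself. It updates the charge density from the difference of the currents at the cell faces and then compares the final charge with the initial charge plus inflow minus outflow.

```python
import math

def step_charge(rho, face_current, dx, dt):
    """One explicit time step of d(rho)/dt = -dJ/dx on a 1-D grid of cells.

    face_current[i] is the current through the left face of cell i, so the
    list has len(rho) + 1 entries (the last one is the right boundary).
    """
    return [
        r - dt * (face_current[i + 1] - face_current[i]) / dx
        for i, r in enumerate(rho)
    ]

# Illustrative setup (all values are assumptions chosen for the demo).
n_cells, dx, dt, velocity = 50, 0.1, 0.01, 1.0
rho = [math.exp(-(((i + 0.5) * dx - 2.5) ** 2)) for i in range(n_cells)]

q_start = sum(rho) * dx
q_in = q_out = 0.0
for _ in range(100):
    # Upwind current J = rho * v; nothing enters through the left boundary.
    face_current = [0.0] + [velocity * r for r in rho]
    q_in += face_current[0] * dt
    q_out += face_current[-1] * dt
    rho = step_charge(rho, face_current, dx, dt)
q_end = sum(rho) * dx

# Integral form of the law: Q(t2) = Q(t1) + Q_IN - Q_OUT
print(q_end, q_start + q_in - q_out)
```

Because the current leaving one cell is exactly the current entering its neighbour, the interior contributions cancel and only the boundary currents change the total, which is the discrete counterpart of the divergence theorem argument.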


Mathematical derivation
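The net current into a volume is

$$I = -\iint_S \mathbf{J} \cdot d\mathbf{S}$$

where $S = \partial V$ is the boundary of $V$ oriented by outward-pointing normals, and $d\mathbf{S}$ is shorthand for $\mathbf{N}\,dS$, with $\mathbf{N}$ the outward-pointing normal of the boundary $\partial V$. Here $\mathbf{J}$ is the current density (charge per unit area per unit time) at the surface of the volume. The vector points in the direction of the current. From the divergence theorem this can be written

$$I = -\iiint_V \left(\nabla \cdot \mathbf{J}\right) dV$$

Charge conservation requires that the net current into a volume must necessarily equal the net change in charge within the volume:

$$\frac{dq}{dt} = -\iiint_V \left(\nabla \cdot \mathbf{J}\right) dV$$

Charge is related to charge density by the relation

$$q = \iiint_V \rho \, dV$$

This yields

$$0 = \iiint_V \left(\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J}\right) dV$$

Since this is true for every volume, we have in general

$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0$$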

Connection to gauge invariance


Charge conservation can also be understood as a consequence of symmetry through Noether's theorem, a central result in theoretical physics that asserts that each conservation law is associated with a symmetry of the underlying physics. The symmetry that is associated with charge conservation is the global gauge invariance of the electromagnetic field.[5] This is related to the fact that the electric and magnetic fields are not changed by different choices of the value representing the zero point of electrostatic potential $\phi$. However, the full symmetry is more complicated, and also involves the vector potential $\mathbf{A}$. The full statement of gauge invariance is that the physics of an electromagnetic field is unchanged when the scalar and vector potentials are shifted by the time derivative and the gradient, respectively, of an arbitrary scalar field $\chi$:

$$\phi' = \phi - \frac{\partial \chi}{\partial t}, \qquad \mathbf{A}' = \mathbf{A} + \nabla \chi$$

In quantum mechanics the scalar field is equivalent to a phase shift in the wavefunction of the charged particle (written here in units with $\hbar = 1$, where $q$ is the charge of the particle):

$$\psi' = e^{iq\chi}\psi$$

so gauge invariance is equivalent to the well-known fact that changes in the phase of a wavefunction are unobservable, and only changes in the magnitude of the wavefunction result in changes to the probability function $|\psi|^2$. This is the ultimate theoretical origin of charge conservation.

Gauge invariance is a very important, well established property of the electromagnetic field and has many testable consequences. The theoretical justification for charge conservation is greatly strengthened by being linked to this symmetry. For example, local gauge invariance also requires that the photon be massless, so the good experimental evidence that the photon has zero mass is also strong evidence that charge is conserved.[6] Even if gauge symmetry is exact, however, there might be apparent electric charge non-conservation if charge could leak from our normal 3-dimensional space into hidden extra dimensions.[7][8]

Experimental Evidence
The best experimental tests of electric charge conservation are searches for particle decays that would be allowed if electric charge is not always conserved. No such decays have ever been seen.[9] The best experimental test comes from searches for the energetic photon from an electron decaying into a neutrino and a single photon:

$e \to \nu\gamma$: mean lifetime is greater than $4.6 \times 10^{26}$ years (90% Confidence Level),[10]

but there are theoretical arguments that such single-photon decays will never occur even if charge is not conserved.[11] Charge disappearance tests are sensitive to decays without energetic photons, to other unusual charge-violating processes such as an electron spontaneously changing into a positron,[12] and to electric charge moving into other dimensions. The best experimental bounds on charge disappearance are:

$e \to$ anything: mean lifetime is greater than $6.4 \times 10^{24}$ years (68% CL)[13]
$n \to p\,\nu\,\bar{\nu}$: charge non-conserving decays are less than $8 \times 10^{-27}$ (68% CL) of all neutron decays[14]

Notes
[1] Heilbron, J.L. (1979). Electricity in the 17th and 18th centuries: a study of early Modern physics (http://books.google.ca/books?id=UlTLRUn1sy8C&pg=PA330). University of California Press. p. 330. ISBN 0-520-03478-3.
[2] The Papers of Benjamin Franklin (http://www.franklinpapers.org/franklin/framedVolumes.jsp?vol=3&page=141b). 3. Yale University Press. 1961. p. 142.
[3] S. Orito, M. Yoshimura (1985). "Can the Universe be Charged?" (http://ccdb4fs.kek.jp/cgi-bin/img/allpdf?198505168). Physical Review Letters 54 (22): 2457–2460. Bibcode 1985PhRvL..54.2457O. doi:10.1103/PhysRevLett.54.2457.
[4] E. Masso, F. Rota (2002). "Primordial helium production in a charged universe". Physics Letters B 545 (3–4): 221–225. arXiv:astro-ph/0201248. Bibcode 2002PhLB..545..221M. doi:10.1016/S0370-2693(02)02636-9.
[5] Bettini, Alessandro (2008). Introduction to Elementary Particle Physics (http://books.google.com/books?id=HNcQ_EiuTxcC&pg=PA164&lpg=PA164&sig=luNaWBSntSRav1k9W7_ZhwsDe54). UK: Cambridge University Press. pp. 164–165. ISBN 0-521-88021-1.
[6] A.S. Goldhaber, M.M. Nieto (2010). "Photon and Graviton Mass Limits". Reviews of Modern Physics 82 (1): 939–979. arXiv:0809.1003. Bibcode 2010RvMP...82..939G. doi:10.1103/RevModPhys.82.939.; see Section II.C Conservation of Electric Charge
[7] S.Y. Chu (1996). "Gauge-Invariant Charge Nonconserving Processes and the Solar Neutrino Puzzle" (http://www.worldscinet.com/mpla/11/1128/S0217732396002241.html). Modern Physics Letters A 11 (28): 2251–2257. Bibcode 1996MPLA...11.2251C. doi:10.1142/S0217732396002241.
[8] S.L. Dubovsky, V.A. Rubakov, P.G. Tinyakov (2000). "Is the electric charge conserved in brane world?". Journal of High Energy Physics August (8): 315–318. arXiv:hep-ph/0007179. Bibcode 1979PhLB...84..315I. doi:10.1016/0370-2693(79)90048-0.
[9] Particle Data Group (May 2010). "Tests of Conservation Laws" (http://pdg.lbl.gov/2010/tables/rpp2010-conservation-laws.pdf). Journal of Physics G 37 (7A): 89–98. Bibcode 2010JPhG...37g5021N. doi:10.1088/0954-3899/37/7A/075021.
[10] H.O. Back et al. (2002). "Search for electron decay mode e → γ + ν with prototype of Borexino detector" (http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TVN-44P6XXC-6&_user=994540&_coverDate=01/17/2002&_rdoc=1&_fmt=high&_orig=search&_origin=search&_sort=d&_docanchor=&view=c&_acct=C000050024&_version=1&_urlVersion=0&_userid=994540&md5=72e0cd4ee57ca676b6fd8b8e2354e99b&searchtype=a). Physics Letters B 525 (1–2): 29–40. Bibcode 2002PhLB..525...29B. doi:10.1016/S0370-2693(01)01440-X.
[11] L.B. Okun (1989). "Comments on Testing Charge Conservation and Pauli Exclusion Principle" (http://ccdb4fs.kek.jp/cgi-bin/img/allpdf?198905149). Comments on Nuclear and Particle Physics 19 (3): 99–116.
[12] R.N. Mohapatra (1987). "Possible Nonconservation of Electric Charge" (http://ccdb4fs.kek.jp/cgi-bin/img/allpdf?198709236). Physical Review Letters 59 (14): 1510–1512. Bibcode 1987PhRvL..59.1510M. doi:10.1103/PhysRevLett.59.1510.
[13] P. Belli et al. (1999). "Charge non-conservation restrictions from the nuclear levels excitation of 129Xe induced by the electron's decay on the atomic shell" (http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TVN-3Y8N3C6-1W&_user=994540&_coverDate=10/21/1999&_rdoc=1&_fmt=high&_orig=search&_origin=search&_sort=d&_docanchor=&view=c&_acct=C000050024&_version=1&_urlVersion=0&_userid=994540&md5=bafd2d9b4bbb26a6b871b4e73413f4ec&searchtype=a). Physics Letters B 465 (1–4): 315–322. Bibcode 1999PhLB..465..315B. doi:10.1016/S0370-2693(99)01091-6. This is the most stringent of several limits given in Table 1 of this paper.
[14] Norman, E.B.; Bahcall, J.N.; Goldhaber, M. (1996). "Improved limit on charge conservation derived from 71Ga solar neutrino experiments" (http://ccdb4fs.kek.jp/cgi-bin/img/allpdf?200037774). Physical Review D 53 (7): 4086–4088. Bibcode 1996PhRvD..53.4086N. doi:10.1103/PhysRevD.53.4086.

Further reading
Lemay, J.A. Leo (2008). "Chapter 2: Electricity" (http://books.google.com/books?id=NL5bcRP5aRAC&pg=PA58). The Life of Benjamin Franklin, Volume 3: Soldier, Scientist, and Politician. University of Pennsylvania Press. ISBN 978-0-8122-4121-1.

Conservation of energy
The law of conservation of energy, first formulated in the nineteenth century, is a law of physics. It states that the total amount of energy in an isolated system remains constant over time. The total energy is said to be conserved over time. For an isolated system, this law means that energy can change its location within the system, and that it can change form within the system (for instance, chemical energy can become kinetic energy), but that energy can be neither created nor destroyed.

In the twentieth century, the definition of energy was broadened. It was found that particles that have rest mass are equivalent to amounts of energy (see mass-energy equivalence). These particles were found to be subject to annihilation, in which matter particles (such as electrons) can be converted to non-matter (such as photons of electromagnetic radiation), or even into potential energy or kinetic energy. Matter could also be created out of kinetic or other types of energy, in the process of matter creation. Thus, matter (defined as ponderable matter particles) was found not to be conserved. In such a transformation process within an isolated system, neither the mass nor the energy changes over time, although the matter content may change. Therefore, conservation of energy and conservation of mass each still hold as a law in its own right (indeed they are restatements of the same law, when mass and energy are recognized to be equivalent). When stated alternatively, in terms of mass and of energy, they appear as the apparently distinct laws of the nineteenth century.

A consequence of the law of conservation of energy is that no intended "perpetual motion machine" can perpetually deliver energy to its surroundings.[1] Any delivery of energy by such a device would result in delivery of mass also, and the machine would lose mass continually until it eventually disappeared.


History
Ancient philosophers as far back as Thales of Miletus (c. 550 BCE) had inklings of the conservation of some underlying substance of which everything is made. However, there is no particular reason to identify this with what we know today as "mass-energy" (for example, Thales thought it was water). In 1638, Galileo published his analysis of several situations, including the celebrated "interrupted pendulum", which can be described (in modern language) as conservatively converting potential energy to kinetic energy and back again. However, Galileo did not state the process in modern terms and again cannot be credited with the crucial insight.

It was Gottfried Wilhelm Leibniz during 1676–1689 who first attempted a mathematical formulation of the kind of energy which is connected with motion (kinetic energy). Leibniz noticed that in many mechanical systems (of several masses $m_i$, each with velocity $v_i$),

$$\sum_i m_i v_i^2$$

was conserved so long as the masses did not interact. He called this quantity the vis viva or living force of the system. The principle represents an accurate statement of the approximate conservation of kinetic energy in situations where there is no friction. Many physicists at that time held that the momentum, which is conserved even in systems with friction and is defined by

$$\sum_i m_i \mathbf{v}_i$$

was the conserved vis viva. It was later shown that, under the proper conditions, both quantities are conserved simultaneously, such as in elastic collisions.

Gottfried Leibniz

It was largely engineers such as John Smeaton, Peter Ewart, Carl Holtzmann, Gustave-Adolphe Hirn and Marc Seguin who objected that conservation of momentum alone was not adequate for practical calculation and made use of Leibniz's principle. The principle was also championed by some chemists such as William Hyde Wollaston. Academics such as John Playfair were quick to point out that kinetic energy is clearly not conserved. This is obvious to a modern analysis based on the second law of thermodynamics, but in the 18th and 19th centuries the fate of the lost energy was still unknown. Gradually it came to be suspected that the heat inevitably generated by motion under friction was another form of vis viva. In 1783, Antoine Lavoisier and Pierre-Simon Laplace reviewed the two competing theories of vis viva and caloric theory.[2] Count Rumford's 1798 observations of heat generation during the boring of cannons added more weight to the view that mechanical motion could be converted into heat, and (as importantly) that the conversion was quantitative and could be predicted (allowing for a universal conversion constant between kinetic energy and heat). Vis viva now started to be known as energy, after the term was first used in that sense by Thomas Young in 1807.


The recalibration of vis viva to

$$\frac{1}{2}\sum_i m_i v_i^2$$

which can be understood as finding the exact value for the kinetic energy to work conversion constant, was largely the result of the work of Gaspard-Gustave Coriolis and Jean-Victor Poncelet over the period 1819–1839. The former called the quantity quantité de travail (quantity of work) and the latter, travail mécanique (mechanical work), and both championed its use in engineering calculation.

In a paper Über die Natur der Wärme, published in the Zeitschrift für Physik in 1837, Karl Friedrich Mohr gave one of the earliest general statements of the doctrine of the conservation of energy in the words: "besides the 54 known chemical elements there is in the physical world one agent only, and this is called Kraft [energy or work]. It may appear, according to circumstances, as motion, chemical affinity, cohesion, electricity, light and magnetism; and from any one of these forms it can be transformed into any of the others."

Gaspard-Gustave Coriolis

Mechanical equivalent of heat


A key stage in the development of the modern conservation principle was the demonstration of the mechanical equivalent of heat. The caloric theory maintained that heat could neither be created nor destroyed, but conservation of energy entails the contrary principle that heat and mechanical work are interchangeable.

In 1798 Count Rumford (Benjamin Thompson) performed measurements of the frictional heat generated in boring cannons and developed the idea that heat is a form of kinetic energy; his measurements refuted caloric theory, but were imprecise enough to leave room for doubt.

The mechanical equivalence principle was first stated in its modern form by the German surgeon Julius Robert von Mayer in 1842.[3] Mayer reached his conclusion on a voyage to the Dutch East Indies, where he found that his patients' blood was a deeper red because they were consuming less oxygen, and therefore less energy, to maintain their body temperature in the hotter climate. He had discovered that heat and mechanical work were both forms of energy, and later, after improving his knowledge of physics, he calculated a quantitative relationship between them (published in 1845).

James Prescott Joule


Meanwhile, in 1843 James Prescott Joule independently discovered the mechanical equivalent in a series of experiments. In the most famous, now called the "Joule apparatus", a descending weight attached to a string caused a paddle immersed in water to rotate. He showed that the gravitational potential energy lost by the weight in descending was equal to the thermal energy (heat) gained by the water by friction with the paddle. Over the period 1840–1843, similar work was carried out by engineer Ludwig A. Colding, though it was little known outside his native Denmark. Both Joule's and Mayer's work suffered from resistance and neglect, but it was Joule's that, perhaps unjustly, eventually drew the wider recognition.

Joule's apparatus for measuring the mechanical equivalent of heat. A descending weight attached to a string causes a paddle immersed in water to rotate.

For the dispute between Joule and Mayer over priority, see Mechanical equivalent of heat: Priority.

In 1844, William Robert Grove postulated a relationship between mechanics, heat, light, electricity and magnetism by treating them all as manifestations of a single "force" (energy in modern terms). In 1874 Grove published his theories in his book The Correlation of Physical Forces.[4] In 1847, drawing on the earlier work of Joule, Sadi Carnot and Émile Clapeyron, Hermann von Helmholtz arrived at conclusions similar to Grove's and published his theories in his book Über die Erhaltung der Kraft (On the Conservation of Force, 1847). The general modern acceptance of the principle stems from this publication.

In 1877, Peter Guthrie Tait claimed that the principle originated with Sir Isaac Newton, based on a creative reading of propositions 40 and 41 of the Philosophiae Naturalis Principia Mathematica. This is now regarded as an example of Whig history.[5]

Mass–energy equivalence
In the nineteenth century, mass and energy were considered to be of quite different natures. Then Albert Einstein's theory of special relativity showed that mass and energy are related by an equivalence. Energy has an equivalent mass, and mass has an equivalent energy. Physicists now speak of a unified law of conservation of mass-energy. This is a recognition that the two nineteenth century conservation laws are restricted versions of one and the same more general law. While matter can actually be converted into non-matter, the relation between mass and energy is simply a theoretical equivalence, so that it makes no sense to think of their "actual interconversion". Thus, the modern view is that conservation of energy and conservation of mass are simply the same conservation law, stated differently, in different units. Einstein's $E = mc^2$ and other equations serve to convert one unit to the other.

First law of thermodynamics


For a closed thermodynamic system, the first law of thermodynamics may be stated as:

$$\delta Q = \mathrm{d}U + \delta W$$

where $\delta Q$ is the amount of energy added to the system by a heating process, $\delta W$ is the amount of energy lost by the system due to work done by the system on its surroundings and $\mathrm{d}U$ is the change in the internal energy of the system.

The $\delta$'s before the heat and work terms are used to indicate that they describe an increment of energy which is to be interpreted somewhat differently than the $\mathrm{d}U$ increment of internal energy (see Inexact differential). Work and heat are processes which add or subtract energy, while the internal energy $U$ is a particular form of energy associated with the system. Thus the term "heat energy" for $\delta Q$ means "that amount of energy added as the result of heating" rather than referring to a particular form of energy. Likewise, the term "work energy" for $\delta W$ means "that amount of energy lost as the result of work". The most significant result of this distinction is the fact that one can clearly state the amount of internal energy possessed by a thermodynamic system, but one cannot tell how much energy has flowed into or out of the system as a result of its being heated or cooled, nor as the result of work being performed on or by the system. In simple terms, this means that energy cannot be created or destroyed, only converted from one form to another.

Entropy is a function of the state of a system which tells of the possibility of conversion of heat into work.

For a simple compressible system, the work performed by the system may be written

$$\delta W = P\,\mathrm{d}V$$

where $P$ is the pressure and $\mathrm{d}V$ is a small change in the volume of the system, each of which are system variables.

The heat energy may be written, for a reversible process,

$$\delta Q = T\,\mathrm{d}S$$

where $T$ is the temperature and $\mathrm{d}S$ is a small change in the entropy of the system. Temperature and entropy are variables of state of a system.
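
As a worked illustration of these relations, the sketch below integrates the work of a gas along a reversible isothermal expansion and then applies the first law. It assumes an ideal gas (so that the pressure follows the ideal-gas law and the internal energy depends on temperature only); that assumption, the function name `isothermal_work` and the numerical values are illustrative choices, not something stated above.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)

def isothermal_work(n_mol, temperature, v_start, v_end, steps=100_000):
    """Numerically integrate delta W = P dV along an isothermal path (midpoint rule)."""
    dv = (v_end - v_start) / steps
    work = 0.0
    for i in range(steps):
        v_mid = v_start + (i + 0.5) * dv
        pressure = n_mol * R * temperature / v_mid  # ideal-gas law (assumption)
        work += pressure * dv
    return work

# Illustrative values: 1 mol at 300 K doubling its volume.
n_mol, temperature = 1.0, 300.0
v_start, v_end = 0.010, 0.020  # cubic meters

work_by_gas = isothermal_work(n_mol, temperature, v_start, v_end)
delta_u = 0.0                    # ideal gas at constant temperature: dU = 0
heat_in = delta_u + work_by_gas  # first law: delta Q = dU + delta W
delta_s = heat_in / temperature  # delta Q = T dS at constant T (reversible path)

print(work_by_gas, heat_in, delta_s)
```

Under these assumptions the numerical work agrees with the closed form $nRT\ln(V_2/V_1)$, and all of the heat supplied leaves the system again as work, so the internal energy is unchanged while the entropy rises by $nR\ln(V_2/V_1)$.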

Mechanics
In mechanics, conservation of energy is usually stated as

$$E = T + V$$

where $T$ is kinetic and $V$ potential energy. For this particular form to be valid, the following must be true:
The system is scleronomous (neither kinetic nor potential energy are explicit functions of time).
The potential energy doesn't depend on velocities.
The kinetic energy is a quadratic form with regard to velocities.
The total energy $E$ depends on the motion of the frame of reference (and it turns out that it is minimum for the center of mass frame).
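
A simple system that satisfies these conditions is a mass on an ideal spring. The sketch below, which is only an illustration, integrates its motion with the velocity Verlet scheme and records the sum of kinetic and potential energy at every step. The function name `simulate_oscillator`, the spring constant, the time step and the choice of integrator are assumptions made for the example.

```python
def simulate_oscillator(mass=1.0, k=4.0, x0=1.0, v0=0.0, dt=1e-3, steps=10_000):
    """Integrate m x'' = -k x with velocity Verlet and record E = T + V."""
    x, v = x0, v0
    energies = []
    for _ in range(steps):
        a = -k * x / mass
        x_new = x + v * dt + 0.5 * a * dt * dt
        a_new = -k * x_new / mass
        v = v + 0.5 * (a + a_new) * dt
        x = x_new
        kinetic = 0.5 * mass * v * v    # T: quadratic in the velocity
        potential = 0.5 * k * x * x     # V: depends on position only
        energies.append(kinetic + potential)
    return energies

energies = simulate_oscillator()
initial_energy = 0.5 * 4.0 * 1.0 ** 2   # all potential energy at the start
max_drift = max(abs(e - initial_energy) for e in energies) / initial_energy
print(max_drift)
```

The recorded total stays very close to its initial value while kinetic and potential energy are exchanged back and forth; the small residual variation comes from the finite time step of the integrator, not from the physics.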

Noether's theorem
The conservation of energy is a common feature in many physical theories. From a mathematical point of view it is understood as a consequence of Noether's theorem, which states that every continuous symmetry of a physical theory has an associated conserved quantity; if the theory's symmetry is time invariance then the conserved quantity is called "energy". The energy conservation law is a consequence of the shift symmetry of time; energy conservation is implied by the empirical fact that the laws of physics do not change with time itself. Philosophically this can be stated as "nothing depends on time per se". In other words, if the physical system is invariant under the continuous symmetry of time translation then its energy (which is the quantity canonically conjugate to time) is conserved.

Conversely, systems which are not invariant under shifts in time (for example, systems with time-dependent potential energy) do not exhibit conservation of energy unless we consider them to exchange energy with another, external system so that the theory of the enlarged system becomes time invariant again. Since any time-varying system can be embedded within a larger time-invariant system, conservation can always be recovered by a suitable re-definition of what energy is. Conservation of energy for finite systems is valid in such physical theories as special relativity and quantum theory (including QED) in flat space-time.


Relativity
With the discovery of special relativity by Albert Einstein, energy was proposed to be one component of an energy-momentum 4-vector. Each of the four components (one of energy and three of momentum) of this vector is separately conserved across time, in any closed system, as seen from any given inertial reference frame. Also conserved is the vector length (Minkowski norm), which is the rest mass for single particles, and the invariant mass for systems of particles (where momenta and energy are separately summed before the length is calculated; see the article on invariant mass).

The relativistic energy of a single massive particle contains a term related to its rest mass in addition to its kinetic energy of motion. In the limit of zero kinetic energy (or equivalently in the rest frame) of a massive particle, or else in the center of momentum frame for objects or systems which retain kinetic energy, the total energy of the particle or object (including internal kinetic energy in systems) is related to its rest mass or its invariant mass via the famous equation $E = mc^2$.

Thus, the rule of conservation of energy over time in special relativity continues to hold, so long as the reference frame of the observer is unchanged. This applies to the total energy of systems, although different observers disagree as to the energy value. Also conserved, and invariant to all observers, is the invariant mass, which is the minimal system mass and energy that can be seen by any observer, and which is defined by the energy–momentum relation.

In general relativity, conservation of energy-momentum is expressed with the aid of a stress-energy-momentum pseudotensor. The theory of general relativity leaves open the question of whether there is a conservation of energy for the entire universe.
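
The difference between the frame-dependent total energy and the frame-independent invariant mass can be made concrete with a small calculation. The following sketch uses units with c = 1 and a system of two photons; the helper names `boost_x` and `invariant_mass`, the photon energies and the boost velocity are illustrative assumptions.

```python
import math

def boost_x(four_momentum, beta):
    """Lorentz-boost (E, px, py, pz) along x with velocity beta (units with c = 1)."""
    energy, px, py, pz = four_momentum
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (energy - beta * px), gamma * (px - beta * energy), py, pz)

def invariant_mass(four_momenta):
    """Sum energies and momenta first, then take the Minkowski norm."""
    energy = sum(p[0] for p in four_momenta)
    px = sum(p[1] for p in four_momenta)
    py = sum(p[2] for p in four_momenta)
    pz = sum(p[3] for p in four_momenta)
    return math.sqrt(energy ** 2 - px ** 2 - py ** 2 - pz ** 2)

# Two photons of equal energy moving in opposite directions along x.
photons = [(1.0, 1.0, 0.0, 0.0), (1.0, -1.0, 0.0, 0.0)]
boosted = [boost_x(p, 0.6) for p in photons]

energy_cm_frame = sum(p[0] for p in photons)
energy_boosted = sum(p[0] for p in boosted)
mass_cm_frame = invariant_mass(photons)
mass_boosted = invariant_mass(boosted)

print(energy_cm_frame, energy_boosted, mass_cm_frame, mass_boosted)
```

The two observers assign different total energies to the same system, yet both compute the same invariant mass, and the smaller energy is the one seen in the center of momentum frame. Each photon is individually massless; the nonzero invariant mass belongs to the system as a whole.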

Quantum theory
In quantum mechanics, the energy of a quantum system is described by a self-adjoint (Hermitian) operator called the Hamiltonian, which acts on the Hilbert space (or a space of wave functions) of the system. If the Hamiltonian is a time-independent operator, the probability of each possible result of an energy measurement does not change in time over the evolution of the system. Thus the expectation value of energy is also time independent. Local energy conservation in quantum field theory is ensured by the quantum Noether's theorem for the energy-momentum tensor operator.

Note that due to the lack of a (universal) time operator in quantum theory, the uncertainty relations for time and energy are not fundamental, in contrast to the position-momentum uncertainty principle, and hold only in specific cases (see Uncertainty principle). Energy at each fixed time can be precisely measured in principle without any problem caused by the time-energy uncertainty relations. Thus the conservation of energy in time is a well defined concept even in quantum mechanics.
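
The statement that a time-independent Hamiltonian gives a time-independent energy expectation value can be checked on a small example. The sketch below is an illustration only: it assumes a two-level system in units with ħ = 1, and the matrix entries, the initial state and the helper names `evolve` and `expectation` are arbitrary choices. It relies on NumPy's `numpy.linalg.eigh` to diagonalize the Hermitian matrix.

```python
import numpy as np

def evolve(hamiltonian, state, time):
    """Apply exp(-i H t) to a state via the eigen-decomposition of H (hbar = 1)."""
    eigenvalues, eigenvectors = np.linalg.eigh(hamiltonian)
    phases = np.exp(-1j * eigenvalues * time)
    return eigenvectors @ (phases * (eigenvectors.conj().T @ state))

def expectation(hamiltonian, state):
    """Return the real expectation value <state| H |state>."""
    return float(np.real(np.vdot(state, hamiltonian @ state)))

# An arbitrary 2 x 2 Hermitian matrix and a normalized initial state.
hamiltonian = np.array([[1.0, 0.5 - 0.2j],
                        [0.5 + 0.2j, -1.0]])
state0 = np.array([1.0, 1.0j]) / np.sqrt(2.0)

times = np.linspace(0.0, 10.0, 51)
energies = [expectation(hamiltonian, evolve(hamiltonian, state0, t)) for t in times]
spread = max(energies) - min(energies)
print(energies[0], spread)
```

The state itself changes with time, but each energy eigencomponent only acquires a phase, so the weights of the energy eigenvalues, and with them the expectation value, stay fixed up to rounding error.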

Notes
[1] Planck, M. (1923/1927). Treatise on Thermodynamics, third English edition translated by A. Ogg from the seventh German edition, Longmans, Green & Co., London, page 40.
[2] Lavoisier, A.L. & Laplace, P.S. (1780) "Memoir on Heat", Académie Royale des Sciences pp.4355
[3] von Mayer, J.R. (1842) "Remarks on the forces of inorganic nature" in Annalen der Chemie und Pharmacie, 43, 233
[4] Grove, W. R. (1874). The Correlation of Physical Forces (6th ed.). London: Longmans, Green.
[5] Hadden, Richard W. (1994). On the shoulders of merchants: exchange and the mathematical conception of nature in early modern Europe (http://books.google.com/books?id=7IxtC4Jw1YoC). SUNY Press. p. 13. ISBN 0-7914-2011-6. Chapter 1, p. 13 (http://books.google.com/books?id=7IxtC4Jw1YoC&pg=PA13)


References
Modern accounts
Goldstein, Martin, and Inge F., 1993. The Refrigerator and the Universe. Harvard Univ. Press. A gentle introduction. Kroemer, Herbert; Kittel, Charles (1980). Thermal Physics (2nd ed.). W. H. Freeman Company. ISBN0-7167-1088-9. Nolan, Peter J. (1996). Fundamentals of College Physics, 2nd ed.. William C. Brown Publishers. Oxtoby & Nachtrieb (1996). Principles of Modern Chemistry, 3rd ed.. Saunders College Publishing. Papineau, D. (2002). Thinking about Consciousness. Oxford: Oxford University Press. Serway, Raymond A.; Jewett, John W. (2004). Physics for Scientists and Engineers (6th ed.). Brooks/Cole. ISBN0-534-40842-7. Stenger, Victor J. (2000). Timeless Reality. Prometheus Books. Especially chpt. 12. Nontechnical. Tipler, Paul (2004). Physics for Scientists and Engineers: Mechanics, Oscillations and Waves, Thermodynamics (5th ed.). W. H. Freeman. ISBN0-7167-0809-4. Lanczos, Cornelius (1970). The Variational Principles of Mechanics. Toronto: University of Toronto Press. ISBN0-8020-1743-6.

History of ideas
Brown, T.M. (1965). "Resource letter EEC-1 on the evolution of energy concepts from Galileo to Helmholtz". American Journal of Physics 33 (10): 759–765. Bibcode 1965AmJPh..33..759B. doi:10.1119/1.1970980.
Cardwell, D.S.L. (1971). From Watt to Clausius: The Rise of Thermodynamics in the Early Industrial Age. London: Heinemann. ISBN 0-435-54150-1.
Guillen, M. (1999). Five Equations That Changed the World. New York: Abacus. ISBN 0-349-11064-6.
Hiebert, E.N. (1981). Historical Roots of the Principle of Conservation of Energy. Madison, Wis.: Ayer Co Pub. ISBN 0-405-13880-6.
Kuhn, T.S. (1957) Energy conservation as an example of simultaneous discovery, in M. Clagett (ed.) Critical Problems in the History of Science pp. 321–56
Sarton, G.; Joule, J. P.; Carnot, Sadi (1929). "The discovery of the law of conservation of energy". Isis 13: 18–49. doi:10.1086/346430.
Smith, C. (1998). The Science of Energy: Cultural History of Energy Physics in Victorian Britain. London: Heinemann. ISBN 0-485-11431-3.
Mach, E. (1872). History and Root of the Principles of the Conservation of Energy. Open Court Pub. Co., IL.
Poincaré, H. (1905). Science and Hypothesis. Walter Scott Publishing Co. Ltd; Dover reprint, 1952. ISBN 0-486-60221-4. Chapter 8, "Energy and Thermo-dynamics"

External links
The First Law of Thermodynamics (http://35.9.69.219/home/modules/pdf_modules/m158.pdf) (PDF file) by Jerzy Borysowicz for Project PHYSNET (http://www.physnet.org).
MISN-0-158


First law of thermodynamics


The first law of thermodynamics is a version of the law of conservation of energy, specialized for thermodynamical systems. It is usually formulated by stating that the change in the internal energy of a closed system is equal to the amount of heat supplied to the system, minus the amount of work done by the system on its surroundings. The law of conservation of energy can be stated: The energy of an isolated system is constant.

Original statements
The first explicit statement of the first law of thermodynamics, by Rudolf Clausius in 1850, referred to cyclic thermodynamic processes. "In all cases in which work is produced by the agency of heat, a quantity of heat is consumed which is proportional to the work done; and conversely, by the expenditure of an equal quantity of work an equal quantity of heat is produced."[1] Clausius stated the law also in another form, this time referring to the existence of a function of state of the system called the internal energy, and expressing himself in terms of a differential equation for the increments of a thermodynamic process. This equation may be translated into words as follows: In a thermodynamic process of a closed system, the increment in the internal energy is equal to the difference between the increment of heat accumulated by the system and the increment of work done by it.[2]

Description
The first law of thermodynamics was expressed in two ways by Clausius. One way referred to cyclic processes and the inputs and outputs of the system, but did not refer to increments in the internal state of the system. The other way referred to any incremental change in the internal state of the system, and did not expect the process to be cyclic.

A cyclic process is one which can be repeated indefinitely often and still eventually leave the system in its original state. In each repetition of a cyclic process, the work done by the system is proportional to the heat consumed by the system. In a cyclic process in which the system does work on its surroundings, it is necessary that some heat be taken in by the system and some be put out, and the difference is the heat consumed by the system in the process. The constant of proportionality is universal and independent of the system and was measured by James Joule in 1845 and 1847, who described it as the mechanical equivalent of heat.

In any incremental process, the change in the internal energy is considered due to a combination of heat added to the system and work done by the system. Taking $\mathrm{d}U$ as an infinitesimal (differential) change in internal energy, one writes

$\mathrm{d}U = \delta Q - \delta W$

where $\delta Q$ and $\delta W$ are infinitesimal amounts of heat supplied to the system by its surroundings and work done by the system on its surroundings, respectively. This sign convention is implicit in Clausius' statement of the law given above, and is consistent with the use of thermodynamics to study heat engines which provide useful work, which is regarded as positive. In chemistry, however, it is conventional to use the IUPAC convention where the first law is formulated in terms of the work done on the system. With this alternate sign convention for work, the first law for a closed system may be written:

$\mathrm{d}U = \delta Q + \delta W$ [3]

This convention follows physicists such as Max Planck,[4] and considers all net energy transfers to the system as positive and all net energy transfers from the system as negative, independently of any use for the system as an engine or otherwise. When a system expands in a quasistatic process, the work done by the system on the environment is the product, $P\,\mathrm{d}V$, of pressure, $P$, and volume change, $\mathrm{d}V$, whereas the work done on the system is $-P\,\mathrm{d}V$. Using either sign convention for work, the change in internal energy of the system is:

$\mathrm{d}U = \delta Q - P\,\mathrm{d}V$ (quasistatic process)

Work and heat are expressions of actual physical processes which supply or remove energy, while $U$ is a mathematical abstraction that keeps account of the exchanges of energy that befall the system. Thus the term heat for $Q$ means that amount of energy added or removed by conduction of heat or by thermal radiation, rather than referring to a form of energy within the system. Likewise, work energy for $W$ means "that amount of energy gained or lost as the result of work". Internal energy is a property of the system whereas work done and heat supplied are not. A significant result of this distinction is that a given internal energy change can be achieved by, in principle, many combinations of heat and work.

The internal energy of a system is not uniquely defined. It is defined only up to an arbitrary additive constant of integration, which can be adjusted to give arbitrary reference zero levels. This non-uniqueness is in keeping with the abstract mathematical nature of the internal energy. The internal energy is stated relative to a conventionally chosen standard reference state of the system.
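The two sign conventions for work can be compared with a short Python sketch. The function names and the numbers (500 J of heat supplied, 200 J of work done by the system) are illustrative assumptions, not taken from the text. The sketch only shows that both conventions give the same change in internal energy once the sign of the work term is handled consistently.

```python
# Hypothetical helper names; the numerical values are assumed for illustration.

def delta_u_clausius(heat_in, work_by_system):
    """Clausius convention: delta U = Q - W, with W the work done BY the system."""
    return heat_in - work_by_system


def delta_u_iupac(heat_in, work_on_system):
    """IUPAC convention: delta U = Q + W, with W the work done ON the system."""
    return heat_in + work_on_system


heat_in = 500.0                    # J, heat supplied to the system
work_by_system = 200.0             # J, work done by the system on its surroundings
work_on_system = -work_by_system   # the same transfer, seen from the other side

du_clausius = delta_u_clausius(heat_in, work_by_system)
du_iupac = delta_u_iupac(heat_in, work_on_system)

print("Clausius convention:", du_clausius, "J")
print("IUPAC convention:   ", du_iupac, "J")
```

Only the bookkeeping of the work term differs between the two functions; the physical energy balance is the same.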

Various statements of the law for closed systems


The law is of very great importance and generality and is consequently thought of from several points of view. Most careful textbook statements of the law express it for closed systems. It is stated in several ways, sometimes even by the same author.[5][6]

For the thermodynamics of closed systems, the distinction between transfers of energy as work and as heat is central and is within the scope of the present article. For the thermodynamics of open systems, such a distinction is beyond the scope of the present article, but some limited comments are made on it in the section below headed 'First law of thermodynamics for open systems'.

There are two main ways of stating a law of thermodynamics, physically or mathematically. They should be logically coherent and consistent with one another.[7]

An example of a physical statement is that of Planck (1897/1903): "It is in no way possible, either by mechanical, thermal, chemical, or other devices, to obtain perpetual motion, i.e. it is impossible to construct an engine which will work in a cycle and produce continuous work, or kinetic energy, from nothing."[8] This physical statement is restricted neither to closed systems nor to systems with states that are strictly defined only for thermodynamic equilibrium; it has meaning also for open systems and for systems with states that are not in thermodynamic equilibrium.

An example of a mathematical statement is that of Crawford (1963): For a given system we let $E^{\mathrm{kin}}$ = large-scale mechanical energy, $E^{\mathrm{pot}}$ = large-scale potential energy, and $E^{\mathrm{tot}}$ = total energy. The first two quantities are specifiable in terms of appropriate mechanical variables, and by definition

$E^{\mathrm{tot}} = E^{\mathrm{kin}} + E^{\mathrm{pot}} + U$.

For any finite process, whether reversible or irreversible,

$\Delta E^{\mathrm{tot}} = \Delta E^{\mathrm{kin}} + \Delta E^{\mathrm{pot}} + \Delta U$.

The first law in a form that involves the principle of conservation of energy more generally is

$\Delta E^{\mathrm{tot}} = Q + W$.

Here $Q$ and $W$ are heat and work added, with no restrictions as to whether the process is reversible, quasistatic, or irreversible.[Warner, Am. J. Phys., 29, 124 (1961)][9]

This statement by Crawford, for $W$, uses the sign convention of IUPAC, not that of Clausius. Though it does not explicitly say so, this statement refers to closed systems, and to internal energy $U$ defined for bodies in states of thermodynamic equilibrium, which possess well-defined temperatures.

The history of statements of the law for closed systems has two main periods, before and after the work of Bryan (1907),[10] of Carathéodory (1909),[11] and the approval of Carathéodory's work given by Born (1921).[12] The earlier traditional versions of the law for closed systems are nowadays often considered to be out of date.

Carathéodory's celebrated presentation of equilibrium thermodynamics[11] refers to closed systems, which are allowed to contain several phases connected by internal walls of various kinds of impermeability and permeability, explicitly including walls that are permeable only to heat. Carathéodory's version of the first law of thermodynamics was stated in an axiom which refrained from defining or mentioning temperature or quantity of heat transferred. That axiom stated that the internal energy of a phase in equilibrium is a function of state, that the sum of the internal energies of the phases is the total internal energy of the system, and that the value of the total internal energy of the system is changed by the amount of work done adiabatically on it, considering work as a form of energy. That article considered this statement to be an expression of the law of conservation of energy for such systems. This version is nowadays widely accepted as authoritative, but is stated in slightly varied ways by different authors.
The Carathéodory statement of the law in axiomatic form does not mention heat or temperature, but the equilibrium states to which it refers are explicitly defined by variable sets that necessarily include "non-deformation variables", such as pressures, which, within reasonable restrictions, can be rightly interpreted as empirical temperatures, and the walls connecting the phases of the system are explicitly defined as possibly impermeable to heat or permeable only to heat.

According to Münster (1970), "A somewhat unsatisfactory aspect of Carathéodory's theory is that a consequence of the Second Law must be considered at this point [in the statement of the first law], i.e. that it is not always possible to reach any state 2 from any other state 1 by means of an adiabatic process." Münster instances that no adiabatic process can reduce the internal energy of a system at constant volume.[13]

Carathéodory's paper asserts that its statement of the first law corresponds exactly to Joule's experimental arrangement, regarded as an instance of adiabatic work. It does not point out that Joule's experimental arrangement performed essentially irreversible work, through friction of paddles in a liquid, or passage of electric current through a resistance inside the system, driven by motion of a coil and inductive heating, or by an external current source, which can access the system only by the passage of electrons, and so is not strictly adiabatic, because electrons are a form of matter, which cannot penetrate adiabatic walls. The paper goes on to base its main argument on the possibility of quasi-static adiabatic work, which is essentially reversible. The paper asserts that it will avoid reference to Carnot cycles, and then proceeds to base its argument on cycles of forward and backward quasi-static adiabatic stages, with isothermal stages of zero magnitude.
Some respected modern statements of the first law for closed systems assert the existence of internal energy as a function of state defined in terms of adiabatic work and accept the Carathéodory idea that heat is not defined in its own right, that is to say calorimetrically or as due to temperature difference; they define heat as a residual difference between change of internal energy and work done on the system, when that work does not account for the whole of the change of internal energy and the system is not adiabatically isolated.[14][13][15]

Sometimes the concept of internal energy is not made explicit in the statement.[16][17][18] Sometimes the existence of the internal energy is made explicit but work is not explicitly mentioned in the statement of the first postulate of thermodynamics. Heat supplied is then defined as the residual change in internal energy after work has been taken into account, in a non-adiabatic process.[19]

A respected modern author states the first law of thermodynamics as "Heat is a form of energy", which explicitly mentions neither internal energy nor adiabatic work. Heat is defined as energy transferred by thermal contact with a reservoir, which has a temperature, and is generally so large that addition and removal of heat do not alter its temperature.[20] A current student text on chemistry defines heat thus: "heat is the exchange of thermal energy between a system and its surroundings caused by a temperature difference." The author then explains how heat is defined or measured by calorimetry, in terms of heat capacity, specific heat capacity, molar heat capacity, and temperature.[21]

A respected text disregards Carathéodory's exclusion of mention of heat from the statement of the first law for closed systems, and admits heat calorimetrically defined along with work and internal energy.[22] Another respected text defines heat exchange as determined by temperature difference, but also mentions that the Born (1921) version is "completely rigorous".[23] These versions follow the traditional approach that is now considered out of date, exemplified by that of Planck (1897/1903).[24]

Evidence for the first law of thermodynamics for closed systems


The first law of thermodynamics for closed systems was originally induced from empirically observed evidence. It is now, however, taken to be the definition of heat via the law of conservation of energy and the definition of work in terms of changes in the external parameters of a system. The original discovery of the law was gradual over a period of perhaps half a century or more, and some early studies were in terms of cyclic processes.[25]

The following is an account in terms of changes of state through compound processes that are not necessarily cyclic. This account first considers processes for which the first law is easily verified because of their simplicity, namely adiabatic processes (in which no heat is transferred) and adynamic processes (in which no work is transferred).

Adiabatic processes
Given a closed system in an initial state, if work is done on the system in an adiabatic (i.e. no heat transfer) way, and given the final state after a process, the amount of work required to be transferred to the system is the same, irrespective of how this work is performed. The work done on the system is defined and measured by changes in mechanical or quasi-mechanical variables external to the system. Physically, adiabatic transfer of energy as work requires the existence of adiabatic enclosures. For instance, in Joule's experiment, the initial system is a tank of water with a paddle wheel inside. If we isolate thermally the tank and move the paddle wheel with a pulley and a weight we can relate the increase in temperature with the height descended by the mass. Now the system is returned to its initial state, isolated again, and the same amount of work is done on the tank using different devices (an electric motor, a chemical battery, a spring,...). In every case, the amount of work can be measured independently. The return to the initial state is not conducted by doing adiabatic work on the system. The evidence shows that the final state of the water (in particular, its temperature and volume) is the same in every case. It is irrelevant if the work is electrical, mechanical, chemical,... or if done suddenly or slowly, as long as it is performed in an adiabatic way, that is to say, without heat transfer into or out of the system. Evidence of this kind shows that to increase the temperature of the water in the tank, the qualitative kind of adiabatically performed work does not matter. No qualitative kind of adiabatic work has ever been observed to decrease the temperature of the water in the tank. 
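The bookkeeping behind a Joule-type experiment can be sketched in Python. This is a simplified illustration, not a reconstruction of Joule's actual apparatus or data. It assumes a constant specific heat of water of about 4186 J/(kg K), standard gravity of 9.81 m/s^2, and a tank whose own heat capacity is negligible; the function names and numbers are hypothetical. It shows that equal amounts of adiabatic work, whether mechanical or electrical, correspond to the same temperature rise.

```python
# Assumed constants (approximate textbook values, not from the source text).
WATER_SPECIFIC_HEAT = 4186.0  # J/(kg K)
G = 9.81                      # m/s^2


def falling_weight_work(mass_kg, drop_m):
    """Work delivered to the paddle wheel by a weight descending drop_m metres."""
    return mass_kg * G * drop_m


def electrical_work(voltage_v, current_a, seconds):
    """Work delivered by a steady current through a resistor inside the tank."""
    return voltage_v * current_a * seconds


def adiabatic_temperature_rise(work_j, water_mass_kg):
    """Temperature rise of the water if all the work stays in the water."""
    return work_j / (water_mass_kg * WATER_SPECIFIC_HEAT)


# Hypothetical runs delivering the same work by two different devices.
w_mech = falling_weight_work(mass_kg=10.0, drop_m=2.0)
w_elec = electrical_work(voltage_v=3.27, current_a=1.0, seconds=60.0)

rise_mech = adiabatic_temperature_rise(w_mech, water_mass_kg=1.0)
rise_elec = adiabatic_temperature_rise(w_elec, water_mass_kg=1.0)

print(f"mechanical: {w_mech:.1f} J -> {rise_mech:.4f} K")
print(f"electrical: {w_elec:.1f} J -> {rise_elec:.4f} K")
```

The temperature rise is small (a few hundredths of a kelvin for roughly 200 J in 1 kg of water), which suggests why measurements of this kind demand careful thermometry and good thermal isolation.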
A change from one state to another, for example an increase of both temperature and volume, may be conducted in several stages, for example by externally supplied electrical work on a resistor in the body, and adiabatic expansion allowing the body to do work on the surroundings. It needs to be shown that the time order of the stages, and their relative magnitudes, does not affect the amount of adiabatic work that needs to be done for the change of state. According to one respected scholar: "Unfortunately, it does not seem that experiments of this kind have ever been carried out carefully. ... We must therefore admit that the statement which we have enunciated here, and which is equivalent to the first law of thermodynamics, is not well founded on direct experimental evidence."[26]

This kind of evidence, of independence of sequence of stages, combined with the above-mentioned evidence, of independence of qualitative kind of work, would show the existence of a very important state variable that corresponds with adiabatic work, but not that such a state variable represented a conserved quantity. For the latter, another step of evidence is needed, which may be related to the concept of reversibility, as mentioned below.

That very important state variable was first recognized and denoted $U$ by Clausius in 1850, but he did not then name it, and he defined it in terms not only of work but also of heat transfer in the same process. It was also independently recognized in 1850 by Rankine, who also denoted it $U$; and in 1851 by Kelvin who then called it "mechanical energy", and later "intrinsic energy". In 1865, after some hesitation, Clausius began calling his state function "energy". In 1882 it was named as the internal energy by Helmholtz.[27]

If only adiabatic processes were of interest, and heat could be ignored, the concept of internal energy would hardly arise or be needed. The relevant physics would be largely covered by the concept of potential energy, as was intended in the 1847 paper of Helmholtz on the principle of conservation of energy, though that did not deal with forces that cannot be described by a potential, and thus did not fully justify the principle; moreover that paper was very critical of the early work of Joule which had by then been performed.[28] A great merit of the internal energy concept is that it frees thermodynamics from a restriction to cyclic processes, and allows a treatment in terms of thermodynamic states.

In an adiabatic process, adiabatic work takes the system either from a reference state $O$ with internal energy $U(O)$ to an arbitrary one $A$ with internal energy $U(A)$, or from the state $A$ to the state $O$. Writing $W$ for the adiabatic work done on the system, as in the account above,

$U(A) = U(O) + W^{\mathrm{adiabatic}}_{O \to A}$ or $U(O) = U(A) + W^{\mathrm{adiabatic}}_{A \to O}$.

Except under the special, and strictly speaking, fictional, condition of reversibility, only one of the processes $O \to A$ or $A \to O$ is empirically feasible by a simple application of externally supplied work. The reason for this is given as the second law of thermodynamics and is not considered in the present article.

The fact of such irreversibility may be dealt with in two main ways, according to different points of view. Since the work of Bryan (1907), the most accepted way to deal with it nowadays, followed by Carathéodory,[11][29][15] is to rely on the previously established concept of quasi-static processes,[30][31][32] as follows. Actual physical processes of transfer of energy as work are always at least to some degree irreversible. The irreversibility is often due to mechanisms known as dissipative, that transform bulk kinetic energy into internal energy. Examples are friction and viscosity. If the process is performed more slowly, the frictional or viscous dissipation is less. In the limit of infinitely slow performance, the dissipation tends to zero and then the limiting process, though fictional rather than actual, is notionally reversible, and is called quasi-static. Throughout the course of the fictional limiting quasi-static process, the internal intensive variables of the system are equal to the external intensive variables, those which describe the reactive forces exerted by the surroundings.[33] This can be taken to justify the formula

$W^{\mathrm{adiabatic,\ quasi\text{-}static}}_{A \to O} = -W^{\mathrm{adiabatic,\ quasi\text{-}static}}_{O \to A}$.

Another way to deal with it is to allow that experiments with processes of heat transfer to or from the system may be used to justify the formula just above. Moreover, it deals to some extent with the problem of lack of direct experimental evidence that the time order of stages of a process does not matter in the determination of internal energy. This way does not provide theoretical purity in terms of adiabatic work processes, but is empirically feasible, and is in accord with experiments actually done, such as the Joule experiments mentioned just above, and with older traditions.

The formula above allows that to go by processes of quasi-static adiabatic work from the state $A$ to the state $B$ we can take a path that goes through the reference state $O$, since the quasi-static adiabatic work is independent of the path. Writing $W$ for quasi-static adiabatic work done on the system,

$W_{A \to B} = W_{A \to O} + W_{O \to B} = -W_{O \to A} + W_{O \to B} = -U(A) + U(B) = \Delta U$.

This kind of empirical evidence, coupled with theory of this kind, largely justifies the following statement:

For all adiabatic processes between two specified states of a closed system of any nature, the net work done is the same regardless of the details of the process, and determines a state function called internal energy, $U$.

Adynamic processes
A complementary observable aspect of the first law is about heat transfer. Adynamic transfer of energy as heat can be measured empirically by changes in the surroundings of the system of interest by calorimetry. This again requires the existence of adiabatic enclosure of the entire process, system and surroundings, though the separating wall between the surroundings and the system is thermally conductive or radiatively permeable, not adiabatic. A calorimeter can rely on measurement of sensible heat, which requires the existence of thermometers and measurement of temperature change in bodies of known sensible heat capacity under specified conditions; or it can rely on the measurement of latent heat, through measurement of masses of material which change phase, at temperatures fixed by the occurrence of phase changes under specified conditions in bodies of known latent heat of phase change.

The calorimeter can be calibrated by adiabatically doing externally determined work on it. The most accurate method is by passing an electric current from outside through a resistance inside the calorimeter. The calibration allows comparison of calorimetric measurement of quantity of heat transferred with quantity of energy transferred as work. According to one textbook, "The most common device for measuring $\Delta U$ is an adiabatic bomb calorimeter."[34] According to another textbook, "Calorimetry is widely used in present day laboratories."[35] According to one opinion, "Most thermodynamic data come from calorimetry..."[36] According to another opinion, "The most common method of measuring heat is with a calorimeter."[37]

When the system evolves with transfer of energy as heat, without energy being transferred as work, in an adynamic process, the heat transferred to the system is equal to the increase in its internal energy:

$Q^{\mathrm{adynamic}}_{A \to B} = \Delta U$.

General case for reversible processes


Heat transfer is practically reversible when it is driven by practically negligibly small temperature gradients. Work transfer is practically reversible when it occurs so slowly that there are no frictional effects within the system; frictional effects outside the system should also be zero if the process is to be globally reversible. For a particular reversible process in general, the work done reversibly on the system, $W^{\mathrm{path}\ P_0,\ \mathrm{reversible}}_{A \to B}$, and the heat transferred reversibly to the system, $Q^{\mathrm{path}\ P_0,\ \mathrm{reversible}}_{A \to B}$, are not required to occur respectively adiabatically or adynamically, but they must belong to the same particular process defined by its particular reversible path, $P_0$, through the space of thermodynamic states. Then the work and heat transfers can occur and be calculated simultaneously.

Putting the two complementary aspects together, the first law for a particular reversible process can be written

$W^{\mathrm{path}\ P_0,\ \mathrm{reversible}}_{A \to B} + Q^{\mathrm{path}\ P_0,\ \mathrm{reversible}}_{A \to B} = \Delta U$.

This combined statement is the expression of the first law of thermodynamics for reversible processes for closed systems. In particular, if no work is done on a thermally isolated closed system we have $\Delta U = 0$. This is one aspect of the law of conservation of energy and can be stated: The internal energy of an isolated system remains constant.

General case for irreversible processes


If, in a process of change of state of a closed system, the energy transfer is not under a practically zero temperature gradient and practically frictionless, then the process is irreversible. Then the heat and work transfers may be difficult to calculate, and irreversible thermodynamics is called for. Nevertheless, the first law still holds and provides a check on the measurements and calculations of the work done irreversibly on the system, $W^{\mathrm{path}\ P_1,\ \mathrm{irreversible}}_{A \to B}$, and the heat transferred irreversibly to the system, $Q^{\mathrm{path}\ P_1,\ \mathrm{irreversible}}_{A \to B}$, which belong to the same particular process defined by its particular irreversible path, $P_1$, through the space of thermodynamic states:

$W^{\mathrm{path}\ P_1,\ \mathrm{irreversible}}_{A \to B} + Q^{\mathrm{path}\ P_1,\ \mathrm{irreversible}}_{A \to B} = \Delta U$.

This means that the internal energy $U$ is a function of state and that the internal energy change $\Delta U$ between two states is a function only of the two states.
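The statement that the internal energy change depends only on the end states, while heat and work individually depend on the path, can be illustrated with a short Python sketch. It assumes a monatomic ideal gas (so that $U = \tfrac{3}{2}PV$, with heat capacities $C_V = \tfrac{3}{2}R$ and $C_P = \tfrac{5}{2}R$ per mole) taken quasi-statically between the same two states along two different paths. The gas model, the function names and the numbers are assumptions made for illustration only. Work is counted as work done on the system.

```python
# Assumed model: monatomic ideal gas, quasi-static steps.

def isobaric_step(p, v_start, v_end):
    """Return (heat_in, work_on) for a step at constant pressure p."""
    heat_in = 2.5 * p * (v_end - v_start)   # n * Cp * dT, with Cp = (5/2) R
    work_on = -p * (v_end - v_start)        # work done ON the gas
    return heat_in, work_on


def isochoric_step(v, p_start, p_end):
    """Return (heat_in, work_on) for a step at constant volume v."""
    heat_in = 1.5 * v * (p_end - p_start)   # n * Cv * dT, with Cv = (3/2) R
    return heat_in, 0.0                     # no volume change, so no P dV work


def totals(steps):
    """Sum heat and work over a list of (heat_in, work_on) steps."""
    heat = sum(q for q, _ in steps)
    work = sum(w for _, w in steps)
    return heat, work


# Hypothetical end states A = (P1, V1) and B = (P2, V2), in Pa and m^3.
P1, V1 = 100_000.0, 0.01
P2, V2 = 200_000.0, 0.02

# Path a: expand at P1, then raise the pressure at V2.
q_a, w_a = totals([isobaric_step(P1, V1, V2), isochoric_step(V2, P1, P2)])
# Path b: raise the pressure at V1, then expand at P2.
q_b, w_b = totals([isochoric_step(V1, P1, P2), isobaric_step(P2, V1, V2)])

delta_u_state = 1.5 * (P2 * V2 - P1 * V1)   # from the state function alone

print("path a: Q =", q_a, "W =", w_a, "Q + W =", q_a + w_a)
print("path b: Q =", q_b, "W =", w_b, "Q + W =", q_b + w_b)
print("delta U from end states:", delta_u_state)
```

The heat and work differ between the two paths (about 5500 J and -1000 J on the first, about 6500 J and -2000 J on the second), but their sum is about 4500 J in both cases, equal to the change in the state function.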

Overview of the weight of evidence for the law


The first law of thermodynamics is very general and makes so many predictions that they can hardly all be directly tested by experiment. Nevertheless, very many of its predictions have been found empirically accurate. And very importantly, no accurately and properly conducted experiment has ever detected a violation of the law. Consequently, within its scope of applicability, the law is so reliably established that, nowadays, rather than experiment being considered as testing the accuracy of the law, it is far more practical and realistic to think of the law as testing the accuracy of experiment. An experimental result that seems to violate the law may be assumed to be inaccurate or wrongly conceived, for example due to failure to consider an important physical factor.

State functional formulation for infinitesimal processes


When the heat and work transfers in the equations above are infinitesimal in magnitude, they are often denoted by $\delta$, rather than exact differentials denoted by "d", as a reminder that heat and work do not describe the state of any system. The integral of an inexact differential depends upon the particular path taken through the space of thermodynamic parameters while the integral of an exact differential depends only upon the initial and final states. If the initial and final states are the same, then the integral of an inexact differential may or may not be zero, but the integral of an exact differential will always be zero. The path taken by a thermodynamic system through a chemical or physical change is known as a thermodynamic process.

For a homogeneous system, with a well-defined temperature and pressure, the expression for $\mathrm{d}U$ can be written in terms of exact differentials, if the work that the system does is equal to its pressure times the infinitesimal increase in its volume. Here one assumes that the changes are quasistatic, so slow that there is at each instant negligible departure from thermodynamic equilibrium within the system. In other words, the work done on the system is $\delta W = -P\,\mathrm{d}V$, where $P$ is pressure and $V$ is volume. As such a quasistatic process in a homogeneous system is reversible, the total amount of heat added to a closed system can be expressed as $\delta Q = T\,\mathrm{d}S$, where $T$ is the temperature and $S$ the entropy of the system. Therefore, for closed, homogeneous systems:

$\mathrm{d}U = T\,\mathrm{d}S - P\,\mathrm{d}V$

The above equation is known as the fundamental thermodynamic relation, for which the independent variables are taken as $S$ and $V$, with respect to which $T$ and $-P$ are the partial derivatives of $U$. While this has been derived for quasistatic changes, it is valid in general, as $U$ can be considered as a thermodynamic state function of the independent variables $S$ and $V$. For example, suppose that the system is initially in a state of thermal equilibrium defined by $S$ and $V$, and then the system is suddenly perturbed so that thermal equilibrium breaks down and no temperature and pressure can be defined. Then the system settles down again to a state of thermal equilibrium, defined by an entropy and a volume which differ infinitesimally from the initial values. The infinitesimal difference in internal energy between the initial and final state will then satisfy the above equation. The heat added to the system and the work done will then not satisfy the above expressions; they will instead satisfy the inequalities $\delta Q < T\,\mathrm{d}S$ and $-\delta W < P\,\mathrm{d}V$, where $-\delta W$ is the work done by the system.

In the case of a closed system in which the particles of the system are of different types and, because chemical reactions may occur, their respective numbers are not necessarily constant, the expression for $\mathrm{d}U$ becomes (with $\delta W$ the work done on the system):

$\mathrm{d}U = \delta Q + \delta W + \sum_i \mu_i\,\mathrm{d}N_i$

where $\mathrm{d}N_i$ is the (small) increase in amount of type-$i$ particles in the reaction, and $\mu_i$ is known as the chemical potential of the type-$i$ particles in the system. If $\mathrm{d}N_i$ is expressed in kg then $\mu_i$ is expressed in J/kg. The statement of the first law, using exact differentials, is now:

$\mathrm{d}U = T\,\mathrm{d}S - P\,\mathrm{d}V + \sum_i \mu_i\,\mathrm{d}N_i$

If the system has more external mechanical variables than just the volume that can change, the fundamental thermodynamic relation generalizes to:

$\mathrm{d}U = T\,\mathrm{d}S - \sum_i X_i\,\mathrm{d}x_i + \sum_j \mu_j\,\mathrm{d}N_j$

Here the $X_i$ are the generalized forces corresponding to the external variables $x_i$. The parameters $X_i$ are independent of the size of the system and are called intensive parameters and the $x_i$ are proportional to the size and called extensive parameters.

For an open system, there can be transfers of particles as well as energy into or out of the system during a process. For this case, the first law of thermodynamics still holds, in the form that the internal energy is a function of state and the change of internal energy in a process is a function only of its initial and final states, as noted in the section below headed 'First law of thermodynamics for open systems'.

A useful idea from mechanics is that the energy gained by a particle is equal to the force applied to the particle multiplied by the displacement of the particle while that force is applied. Now consider the first law without the heating term: $\mathrm{d}U = -P\,\mathrm{d}V$. The pressure $P$ can be viewed as a force (and in fact has units of force per unit area) while $\mathrm{d}V$ is the displacement (with units of distance times area). We may say, with respect to this work term, that a pressure difference forces a transfer of volume, and that the product of the two (work) is the amount of energy transferred out of the system as a result of the process. If one were to make this term negative then this would be the work done on the system.

It is useful to view the $T\,\mathrm{d}S$ term in the same light: here the temperature is known as a "generalized" force (rather than an actual mechanical force) and the entropy is a generalized displacement. Similarly, a difference in chemical potential between groups of particles in the system drives a chemical reaction that changes the numbers of particles, and the corresponding product is the amount of chemical potential energy transformed in the process. For example, consider a system consisting of two phases: liquid water and water vapor.
There is a generalized "force" of evaporation which drives water molecules out of the liquid. There is a generalized "force" of condensation which drives vapor molecules out of the vapor. Only when these two "forces" (or chemical potentials) are equal will there be equilibrium, and the net rate of transfer will be zero. The two thermodynamic parameters which form a generalized force-displacement pair are termed "conjugate variables". The two most familiar pairs are, of course, pressure-volume, and temperature-entropy.
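The reading of $P\,\mathrm{d}V$ as a force times a displacement can be checked numerically for the first law without the heating term, $\mathrm{d}U = -P\,\mathrm{d}V$. The Python sketch below assumes a monatomic ideal gas in a quasi-static adiabatic expansion, for which $PV^{\gamma}$ is constant with $\gamma = 5/3$ and $U = \tfrac{3}{2}PV$. The gas model, function names and numbers are illustrative assumptions. It sums $P\,\mathrm{d}V$ over many small volume steps and compares the result with the change in internal energy computed from the end states.

```python
GAMMA = 5.0 / 3.0  # assumed: monatomic ideal gas


def adiabatic_pressure(p1, v1, v):
    """Pressure along the quasi-static adiabat through (p1, v1)."""
    return p1 * (v1 / v) ** GAMMA


def work_by_system(p1, v1, v2, steps=10_000):
    """Midpoint-rule sum of P dV along the adiabat from v1 to v2."""
    dv = (v2 - v1) / steps
    total = 0.0
    for i in range(steps):
        v_mid = v1 + (i + 0.5) * dv
        total += adiabatic_pressure(p1, v1, v_mid) * dv
    return total


# Hypothetical initial state and final volume, in Pa and m^3.
p1, v1, v2 = 100_000.0, 0.01, 0.02
p2 = adiabatic_pressure(p1, v1, v2)

w_by = work_by_system(p1, v1, v2)          # energy transferred out as work
delta_u = 1.5 * (p2 * v2 - p1 * v1)        # change in U from the end states

print(f"sum of P dV : {w_by:.3f} J")
print(f"delta U     : {delta_u:.3f} J")
```

With no heat exchanged, the internal energy falls by the amount of work the gas does, so the two printed numbers agree in magnitude and differ in sign, up to the small error of the numerical sum.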

Spatially inhomogeneous systems


Classical thermodynamics is initially focused on closed homogeneous systems (e.g. Planck 1897/1903[24]), which might be regarded as 'zero-dimensional' in the sense that they have no spatial variation. But it is desired to study also systems with distinct internal motion and spatial inhomogeneity. For such systems, the principle of conservation of energy is expressed in terms not only of internal energy as defined for homogeneous systems, but also in terms of kinetic energy and potential energies of parts of the inhomogeneous system with respect to each other and with respect to long-range external forces.[38] How the total energy of a system is allocated between these three more specific kinds of energy varies according to the purposes of different writers; this is because these components of energy are to some extent mathematical artefacts rather than actually measured physical quantities. For any closed homogeneous component of an inhomogeneous closed system, if $E$ denotes the total energy of that component system, one may write

$E = E^{\mathrm{kin}} + E^{\mathrm{pot}} + U$

where $E^{\mathrm{kin}}$ and $E^{\mathrm{pot}}$ denote respectively the total kinetic energy and the total potential energy of the component closed homogeneous system, and $U$ denotes its internal energy.[9][39]

Potential energy can be exchanged with the surroundings of the system when the surroundings impose a force field, such as gravitational or electromagnetic, on the system. A compound system consisting of two interacting closed homogeneous component subsystems has a potential energy of interaction $E^{\mathrm{pot}}_{12}$ between the subsystems. Thus, in an obvious notation, one may write

$E = E^{\mathrm{kin}}_1 + E^{\mathrm{pot}}_1 + U_1 + E^{\mathrm{kin}}_2 + E^{\mathrm{pot}}_2 + U_2 + E^{\mathrm{pot}}_{12}$

The distinction between internal and kinetic energy is hard to make in the presence of turbulent motion within the system, as friction gradually dissipates macroscopic kinetic energy of localised bulk flow into molecular random motion of molecules that is classified as internal energy. The rate of dissipation by friction of kinetic energy of localised bulk flow into internal energy,[40][41][42] whether in turbulent or in streamlined flow, is an important quantity in non-equilibrium thermodynamics. This is a serious difficulty for attempts to define entropy for time-varying spatially inhomogeneous systems.

First law of thermodynamics for open systems


For the first law of thermodynamics, there is no trivial passage of physical conception from the closed system view to an open system view.[43] For closed systems, the concepts of an adiabatic enclosure and of an adiabatic wall are fundamental. Matter and internal energy cannot permeate or penetrate such a wall. For an open system, there is a wall that allows penetration by matter. In general, matter in motion will carry with it some internal energy, and some potential energy changes will accompany the motion. An open system is not adiabatically enclosed. By definition therefore, adiabatic work cannot be done on an open system.[44]

In contrast to the case of closed systems, for open systems, in the presence of diffusion, there is no unconstrained and unconditional physical distinction between convective transfer of internal energy by bulk flow of matter, the transfer of internal energy without transfer of matter (usually called heat conduction), and change of various potential energies. The older traditional way and the Carathéodory way agree that there is no physically unique definition of heat and work transfer processes between open systems.[45][46][47][48] The ideas of heat and work transfer for closed systems are superseded for open systems by the ideas of transfer of kinetic energy of bulk flow, of bulk potential and of internal energies, and of entropy.

An example is evaporation. One may consider an open system consisting of a collection of vapour in a controlled volume, enclosed except where it is allowed to receive more vapour from or to condense into its parent liquid, which may be considered as another open system in open contact with the vapour system. The process might be a mechanical increase in the controlled volume of the vapour. Some work will be done by the vapour in the mechanical part of the process, but also some of the parent liquid will evaporate and enter the vapour collection which is the system. Some internal energy will accompany the vapour that enters the system, but it will not make sense to try to uniquely identify part of that internal energy as heat and part of it as work. Consequently, the energy transfer of the process as a whole, though having a component of mechanical work, cannot be uniquely split into heat and work transfers to or from the open system. The component of total energy transfer that accompanies the transfer of vapour into the system is customarily called 'latent heat of evaporation', but this is a quirk of historical language usage, not in strict compliance with the thermodynamic definition of transfer of energy as heat. In this example, kinetic energy of bulk flow and potential energy with respect to external long range forces such as gravity are both considered to be zero. The first law of thermodynamics refers to the change of internal energy of the open system.

History
The discovery of the first law of thermodynamics came about through much trial and error of investigation, over a period of about half a century. The first full statements of the law were made by Clausius in 1850 as noted above, and by Rankine also in 1850; Rankine's statement was perhaps not quite as clear and distinct as was Clausius'.[25] A main aspect of the struggle was to deal with the previously proposed caloric theory of heat. Germain Hess in 1840 stated a conservation law for the so-called 'heat of reaction' for chemical reactions.[49] His law was later recognized as a consequence of the first law of thermodynamics, but Hess's statement was not explicitly concerned with the relation between energy exchanges by heat and work.

Julius Robert von Mayer

According to Truesdell (1980), Julius Robert von Mayer in 1841 made a statement that meant that "in a process at constant pressure, the heat used to produce expansion is universally interconvertible with work", but this is not a general statement of the first law.[50][51]

References
[1] Clausius, R. (1850). Ueber die bewegende Kraft der Wärme und die Gesetze, welche sich daraus für die Wärmelehre selbst ableiten lassen, Annalen der Physik und Chemie (Poggendorff, Leipzig), 155 (3): 368–394, particularly on page 373 (http://gallica.bnf.fr/ark:/12148/bpt6k15164w/f389.image), translation here taken from Truesdell, C.A. (1980), pp. 188–189.
[2] Clausius, R. (1850). Ueber die bewegende Kraft der Wärme und die Gesetze, welche sich daraus für die Wärmelehre selbst ableiten lassen, Annalen der Physik und Chemie (Poggendorff, Leipzig), 155 (3): 368–394, page 384 (http://gallica.bnf.fr/ark:/12148/bpt6k15164w/f400.image).
[3] Quantities, Units and Symbols in Physical Chemistry (IUPAC Green Book) (http://media.iupac.org/publications/books/gbook/IUPAC-GB3-2ndPrinting-Online-22apr2011.pdf). See Sec. 2.11 Chemical Thermodynamics.
[4] Planck, M. (1897/1903). Treatise on Thermodynamics, translated by A. Ogg, Longmans, Green & Co., London (https://ia700200.us.archive.org/15/items/treatiseonthermo00planrich/treatiseonthermo00planrich.pdf), p. 43.
[5] Münster, A. (1970).
[6] Bailyn, M. (1994), p. 79.
[7] Kirkwood, J.G., Oppenheim, I. (1961), pp. 31–33.
[8] Planck, M. (1897/1903), p. 40.
[9] Crawford, F.H. (1963), pp. 106–107.
[10] Bryan, G.H. (1907), p. 47.
[11] C. Carathéodory (1909). "Untersuchungen über die Grundlagen der Thermodynamik". Mathematische Annalen 67: 355–386. A partly reliable translation is to be found at Kestin, J. (1976). The Second Law of Thermodynamics, Dowden, Hutchinson & Ross, Stroudsburg PA. doi:10.1007/BF01450409.
[12] Born, M. (1921). Kritische Betrachtungen zur traditionellen Darstellung der Thermodynamik, Physik. Zeitschr. 22: 218–224.
[13] Münster, A. (1970), pp. 23–24.
[14] Reif, F. (1965), p. 122.
[15] Haase, R. (1971), pp. 24–25.
[16] Pippard, A.B. (1957/1966), p. 14.
[17] Reif, F. (1965), p. 82.
[18] Adkins, C.J. (1968/1983), p. 31.
[19] Callen, H.B. (1960/1985), pp. 13, 17.
[20] Kittel, C. Kroemer, H. (1980). Thermal Physics, (first edition by Kittel alone 1969), second edition, W.H. Freeman, San Francisco, ISBN 0-7167-1088-9, pp. 49, 227.
[21] Tro, N.J. (2008). Chemistry. A Molecular Approach, Pearson/Prentice Hall, Upper Saddle River NJ, ISBN 0-13-100065-9, p. 246.
[22] Kirkwood, J.G., Oppenheim, I. (1961), pp. 17–18. Kirkwood & Oppenheim 1961 is recommended by Münster, A. (1970), p. 376. It is also cited by Eu, B.C. (2002), Generalized Thermodynamics, the Thermodynamics of Irreversible Processes and Generalized Hydrodynamics, Kluwer Academic Publishers, Dordrecht, ISBN 1-4020-0788-4, pp. 18, 29, 66.
[23] Guggenheim, E.A. (1949/1967). Thermodynamics. An Advanced Treatment for Chemists and Physicists, (first edition 1949), fifth edition 1967, North-Holland, Amsterdam, pp. 9–10. Guggenheim 1949/1965 is recommended by Buchdahl, H.A. (1966), p. 218. It is also recommended by Münster, A. (1970), p. 376.
[24] Planck, M. (1897/1903).
[25] Truesdell, C.A. (1980).
[26] Pippard, A.B. (1957/1966), p. 15. According to Herbert Callen, in his most widely cited text, Pippard's text gives a "scholarly and rigorous treatment"; see Callen, H.B. (1960/1985), p. 485. It is also recommended by Münster, A. (1970), p. 376.
[27] Cropper, W.H. (1986). Rudolf Clausius and the road to entropy, Am. J. Phys., 54: 1068–1074.
[28] Truesdell, C.A. (1980), pp. 161–162.
[29] Buchdahl, H.A. (1966), p. 43.
[30] Maxwell, J. C. (1871). Theory of Heat, Longmans, Green, and Co., London, p. 150.
[31] Planck, M. (1897/1903), Section 71, p. 52.
[32] Bailyn, M. (1994), p. 95.
[33] Adkins, C.J. (1968/1983), p. 35.
[34] Atkins, P., de Paula, J. (1978/2010). Physical Chemistry, (first edition 1978), ninth edition 2010, Oxford University Press, Oxford UK, ISBN 978-0-19-954337-3, p. 54.
[35] Kondepudi, D. (2008). Introduction to Modern Thermodynamics, Wiley, Chichester, ISBN 978-0-470-01598-8, p. 63.
[36] Gislason, E.A., Craig, N.C. (2005). Cementing the foundations of thermodynamics: comparison of system-based and surroundings-based definitions of work and heat, J. Chem. Thermodynamics 37: 954–966.
[37] Rosenberg, R.M. (2010). From Joule to Caratheodory and Born: A conceptual evolution of the first law of thermodynamics, J. Chem. Edu., 87: 691–693.
[38] Bailyn, M. (1994), pp. 254–256.
[39] Glansdorff, P., Prigogine, I. (1971), page 8.
[40] Thomson, William (1852 a). "On a Universal Tendency in Nature to the Dissipation of Mechanical Energy" (http://zapatopi.net/kelvin/papers/on_a_universal_tendency.html), Proceedings of the Royal Society of Edinburgh for April 19, 1852 [This version from Mathematical and Physical Papers, vol. i, art. 59, pp. 511.]
[41] Thomson, W. (1852 b). On a universal tendency in nature to the dissipation of mechanical energy, Philosophical Magazine 4: 304–306.
[42] Helmholtz, H. (1869/1871). Zur Theorie der stationären Ströme in reibenden Flüssigkeiten, Verhandlungen des naturhistorisch-medizinischen Vereins zu Heidelberg, Band V: 1–7. Reprinted in Helmholtz, H. (1882), Wissenschaftliche Abhandlungen, volume 1, Johann Ambrosius Barth, Leipzig, pages 223–230 (http://echo.mpiwg-berlin.mpg.de/ECHOdocuViewfull?url=/mpiwg/online/permanent/einstein_exhibition/sources/QWH2FNX8/index.meta&start=231&viewMode=images&pn=237&mode=texttool).
[43] Münster, A. (1970), Sections 14, 15, pp. 45–51.
[44] Münster, A. (1970), p. 46.
[45] Münster, A. (1970), p. 50.
[46] Haase, R. (1963/1969), p. 15.
[47] Haase, R. (1971), p. 20.
[48] Smith, D.A. (1980). Definition of heat in open systems, Aust. J. Phys., 33: 95–105 (http://www.publish.csiro.au/paper/PH800095.htm).
[49] Hess, H. (1840). Thermochemische Untersuchungen, Annalen der Physik und Chemie (Poggendorff, Leipzig) 126(6): 385–404 (http://gallica.bnf.fr/ark:/12148/bpt6k151359/f397.image.r=Annalen der Physik (Leipzig) 125.langEN).
[50] Truesdell, C.A. (1980), pp. 157–158.
[51] Mayer, Robert (1841). Paper: "Remarks on the Forces of Nature"; as quoted in: Lehninger, A. (1971). Bioenergetics - the Molecular Basis of Biological Energy Transformations, 2nd. Ed. London: The Benjamin/Cummings Publishing Company.

Cited sources
Adkins, C.J. (1968/1983). Equilibrium Thermodynamics, (first edition 1968), third edition 1983, Cambridge University Press, ISBN 0-521-25445-0.
Bailyn, M. (1994). A Survey of Thermodynamics, American Institute of Physics Press, New York, ISBN 0-88318-797-3.
Bryan, G.H. (1907). Thermodynamics. An Introductory Treatise dealing mainly with First Principles and their Direct Applications, B.G. Teubner, Leipzig (https://ia700208.us.archive.org/6/items/Thermodynamics/Thermodynamics.tif).
Buchdahl, H.A. (1966), The Concepts of Classical Thermodynamics, Cambridge University Press, London.
Callen, H.B. (1960/1985), Thermodynamics and an Introduction to Thermostatistics, (first edition 1960), second edition 1985, John Wiley & Sons, New York, ISBN 0471862568.
Crawford, F.H. (1963). Heat, Thermodynamics, and Statistical Physics, Rupert Hart-Davis, London, Harcourt, Brace & World, Inc.
Glansdorff, P., Prigogine, I. (1971). Thermodynamic Theory of Structure, Stability and Fluctuations, Wiley, London, ISBN 0-471-30280-5.
Haase, R. (1963/1969). Thermodynamics of Irreversible Processes, English translation, Addison-Wesley Publishing, Reading MA.
Haase, R. (1971). Survey of Fundamental Laws, chapter 1 of Thermodynamics, pages 1–97 of volume 1, ed. W. Jost, of Physical Chemistry. An Advanced Treatise, ed. H. Eyring, D. Henderson, W. Jost, Academic Press, New York, lcn 73–117081.
Kirkwood, J.G., Oppenheim, I. (1961). Chemical Thermodynamics, McGraw-Hill Book Company, New York.
Münster, A. (1970), Classical Thermodynamics, translated by E.S. Halberstadt, Wiley–Interscience, London, ISBN 0-471-62430-6.
Pippard, A.B. (1957/1966). Elements of Classical Thermodynamics for Advanced Students of Physics, original publication 1957, reprint 1966, Cambridge University Press, Cambridge UK.
Planck, M. (1897/1903). Treatise on Thermodynamics, translated by A. Ogg, Longmans, Green & Co., London (https://ia700200.us.archive.org/15/items/treatiseonthermo00planrich/treatiseonthermo00planrich.pdf).
Reif, F. (1965). Fundamentals of Statistical and Thermal Physics, McGraw-Hill Book Company, New York.
Truesdell, C.A. (1980). The Tragicomical History of Thermodynamics, 1822–1854, Springer, New York, ISBN 0-387-90403-4.

Further reading
Goldstein, Martin, and Inge F. (1993). The Refrigerator and the Universe. Harvard University Press. ISBN 0-674-75325-9. OCLC 32826343. Chpts. 2 and 3 contain a nontechnical treatment of the first law.
Çengel Y.A. and Boles M. (2007). Thermodynamics: an engineering approach. McGraw-Hill Higher Education. ISBN 0-07-125771-3. Chapter 2.
Atkins P. (2007). Four Laws that drive the Universe. OUP Oxford. ISBN 0-19-923236-9.

External links
MISN-0-158, The First Law of Thermodynamics (http://35.9.69.219/home/modules/pdf_modules/m158.pdf) (PDF file) by Jerzy Borysowicz for Project PHYSNET (http://www.physnet.org).
First law of thermodynamics (http://web.mit.edu/16.unified/www/FALL/thermodynamics/notes/node8.html) in the MIT Course Unified Thermodynamics and Propulsion (http://web.mit.edu/16.unified/www/FALL/thermodynamics/notes/notes.html) from Prof. Z. S. Spakovszky


Laws of thermodynamics
The four laws of thermodynamics define fundamental physical quantities (temperature, energy, and entropy) that characterize thermodynamic systems. The laws describe how these quantities behave under various circumstances, and forbid certain phenomena (such as perpetual motion). The four laws of thermodynamics are:[1][2][3][4][5][6]

Zeroth law of thermodynamics: If two systems are in thermal equilibrium with a third system, they must be in thermal equilibrium with each other. This law helps define the notion of temperature.
First law of thermodynamics: Heat and work are forms of energy transfer. Energy is invariably conserved but the internal energy of a closed system changes as heat and work are transferred in or out of it. Equivalently, perpetual motion machines of the first kind are impossible.
Second law of thermodynamics: The entropy of any isolated system not in thermal equilibrium almost always increases. Isolated systems spontaneously evolve towards thermal equilibrium (the state of maximum entropy of the system) in a process known as "thermalization". Equivalently, perpetual motion machines of the second kind are impossible.
Third law of thermodynamics: The entropy of a system approaches a constant value as the temperature approaches zero. The entropy of a system at absolute zero is typically zero, and in all cases is determined only by the number of different ground states it has. Specifically, the entropy of a pure crystalline substance at absolute zero temperature is zero.

Classical thermodynamics describes the exchange of work and heat between systems. It has a special interest in systems that are individually in states of thermodynamic equilibrium. Thermodynamic equilibrium is a condition of systems which are adequately described by only macroscopic variables. Every physical system, however, when microscopically examined, shows apparently random microscopic statistical fluctuations in its thermodynamic variables of state (entropy, temperature, pressure, etc.). These microscopic fluctuations are negligible for systems which are nearly in thermodynamic equilibrium and which are only macroscopically examined. They become important, however, for systems which are nearly in thermodynamic equilibrium when they are microscopically examined, and, exceptionally, for macroscopically examined systems that are in critical states,[7] and for macroscopically examined systems that are far from thermodynamic equilibrium.

There have been suggestions of additional laws, but none of them achieve the generality of the four accepted laws, and they are not mentioned in standard textbooks.[1][2][3][4][5][8][9] The laws of thermodynamics are important fundamental laws in physics and they are applicable in other natural sciences.

Zeroth law
The zeroth law of thermodynamics may be stated as follows:

If system A and system B are individually in thermal equilibrium with system C, then system A is in thermal equilibrium with system B.

The zeroth law implies that thermal equilibrium, viewed as a binary relation, is a Euclidean relation. If we assume that the binary relationship is also reflexive, then it follows that thermal equilibrium is an equivalence relation. Equivalence relations are also transitive and symmetric. The symmetric relationship allows one to speak of two systems being "in thermal equilibrium with each other", which gives rise to a simpler statement of the zeroth law:

If two systems are in thermal equilibrium with a third, they are in thermal equilibrium with each other.

However, this statement requires the implicit assumption of both symmetry and reflexivity, rather than reflexivity alone.
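
The claim that a reflexive Euclidean relation is an equivalence relation can be checked by brute force on a small set. The sketch below is only an illustration, not part of the standard presentation: it assumes three hypothetical systems labelled 0, 1 and 2, and all function names are invented for the example. It enumerates every binary relation on that set and tests the implication.

```python
from itertools import product

ELEMENTS = (0, 1, 2)  # three hypothetical systems
PAIRS = [(a, b) for a in ELEMENTS for b in ELEMENTS]


def is_euclidean(rel):
    """Zeroth-law form: a~c and b~c together imply a~b."""
    return all((a, b) in rel
               for (a, c1) in rel for (b, c2) in rel if c1 == c2)


def is_reflexive(rel):
    return all((a, a) in rel for a in ELEMENTS)


def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)


def is_transitive(rel):
    return all((a, d) in rel
               for (a, b) in rel for (c, d) in rel if b == c)


def all_relations():
    """Yield every binary relation on ELEMENTS as a set of ordered pairs."""
    for mask in product((False, True), repeat=len(PAIRS)):
        yield {pair for pair, keep in zip(PAIRS, mask) if keep}


def euclidean_and_reflexive_implies_equivalence():
    """True if every reflexive Euclidean relation is symmetric and transitive."""
    return all(is_symmetric(r) and is_transitive(r)
               for r in all_relations()
               if is_euclidean(r) and is_reflexive(r))


# Euclidean but not reflexive, and as a result not symmetric.
COUNTEREXAMPLE = {(0, 0), (0, 1)}

print(euclidean_and_reflexive_implies_equivalence())
print(is_euclidean(COUNTEREXAMPLE), is_symmetric(COUNTEREXAMPLE))
```

The counterexample shows why reflexivity has to be assumed separately: without it, the Euclidean property alone does not give symmetry.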

The law is also a statement about measurability. To this effect the law allows the establishment of an empirical parameter, the temperature, as a property of a system such that systems in equilibrium with each other have the same temperature. The notion of transitivity permits a system, for example a gas thermometer, to be used as a device to measure the temperature of another system. Although the concept of thermodynamic equilibrium is fundamental to thermodynamics and was clearly stated in the nineteenth century, the desire to label its statement explicitly as a law was not widely felt until Fowler and Planck stated it in the 1930s, long after the first, second, and third law were already widely understood and recognized. Hence it was numbered the zeroth law. The importance of the law as a foundation to the earlier laws is that it allows the definition of temperature in a non-circular way without reference to entropy, its conjugate variable.

First law
The first law of thermodynamics may be stated thus:

Increase in internal energy of a body = heat supplied to the body - work done by the body.

$$\Delta U = Q - W$$

For a thermodynamic cycle, the net heat supplied to the system equals the net work done by the system.

More specifically, the First Law encompasses several principles:

The law of conservation of energy. This states that energy can be neither created nor destroyed. However, energy can change forms, and energy can flow from one place to another. The total energy of an isolated system remains the same.
The concept of internal energy and its relationship to temperature. If a system, for example a rock, has a definite temperature, then its total energy has three distinguishable components. If the rock is flying through the air, it has kinetic energy. If it is high above the ground, it has gravitational potential energy. In addition to these, it has internal energy which is the sum of the kinetic energy of vibrations of the atoms in the rock, and other sorts of microscopic motion, and of the potential energy of interactions between the atoms within the rock. Other things being equal, the internal energy increases as the rock's temperature increases. The concept of internal energy is the characteristic distinguishing feature of the first law of thermodynamics.
The flow of heat is a form of energy transfer. In other words, a quantity of heat that flows from a hot body to a cold one can be expressed as an amount of energy being transferred from the hot body to the cold one.
Performing work is a form of energy transfer. For example, when a machine lifts a heavy object upwards, some energy is transferred from the machine to the object. The object acquires its energy in the form of gravitational potential energy in this example.

Combining these principles leads to one traditional statement of the first law of thermodynamics: it is not possible to construct a perpetual motion machine which will continuously do work without consuming energy.
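
The sign convention in the statement above (heat supplied to the body counts as positive, work done by the body counts as positive) can be made concrete with a short bookkeeping sketch. The three-step cycle and its numbers are hypothetical, chosen only so that the system returns to its starting state; the function name is likewise an assumption of this example.

```python
def internal_energy_change(heat_in, work_by_system):
    """First law with the sign convention of the text: delta U = Q - W."""
    return heat_in - work_by_system


# Hypothetical cycle: (heat supplied, work done by the system) per step, in joules.
CYCLE = [(500.0, 200.0), (-150.0, 250.0), (-100.0, -200.0)]

delta_u_total = sum(internal_energy_change(q, w) for q, w in CYCLE)
net_heat = sum(q for q, _ in CYCLE)
net_work = sum(w for _, w in CYCLE)

print(delta_u_total, net_heat, net_work)
```

Because the internal energy returns to its initial value over the cycle, the net heat supplied equals the net work done, as stated for a thermodynamic cycle.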

Second law
The second law of thermodynamics asserts the existence of a quantity called the entropy of a system and further states that

When two isolated systems in separate but nearby regions of space, each in thermodynamic equilibrium in itself (but not necessarily in equilibrium with each other at first) are at some time allowed to interact, breaking the isolation that separates the two systems, allowing them to exchange matter or energy, they will eventually reach a mutual thermodynamic equilibrium. The sum of the entropies of the initial, isolated systems is less than or equal to the entropy of the final combination of exchanging systems. In the process of reaching a new thermodynamic equilibrium, total entropy has increased, or at least has not decreased.

It follows that the entropy of an isolated macroscopic system never decreases. The second law states that spontaneous natural processes increase entropy overall, or in another formulation that heat can spontaneously be conducted or radiated only from a higher-temperature region to a lower-temperature region, but not the other way around.

The second law refers to a wide variety of processes, reversible and irreversible. Its main import is to tell about irreversibility. The prime example of irreversibility is in the transfer of heat by conduction or radiation. It was known long before the discovery of the notion of entropy that when two bodies of different temperatures are connected with each other by purely thermal connection, conductive or radiative, then heat always flows from the hotter body to the colder one. This fact is part of the basic idea of heat, and is related also to the so-called zeroth law, though the textbooks' statements of the zeroth law are usually reticent about that, because they have been influenced by Carathéodory's basing his axiomatics on the law of conservation of energy and trying to make heat seem a theoretically derivative concept instead of an axiomatically accepted one. Šilhavý (1997) notes that Carathéodory's approach does not work for the description of irreversible processes that involve both heat conduction and conversion of kinetic energy into internal energy by viscosity (which is another prime example of irreversibility), because "the mechanical power and the rate of heating are not expressible as differential forms in the 'external parameters'".[10]

The second law tells also about kinds of irreversibility other than heat transfer, and the notion of entropy is needed to provide that wider scope of the law.
According to the second law of thermodynamics, in a reversible heat transfer, an element of heat transferred, $\delta Q$, is the product of the temperature ($T$), both of the system and of the sources or destination of the heat, with the increment ($dS$) of the system's conjugate variable, its entropy ($S$):[1]

$$\delta Q = T\,dS$$
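
As a rough numerical illustration of why heat flows spontaneously only from hot to cold, one can apply the relation between heat, temperature and entropy to two bodies. The sketch below assumes each body is a reservoir large enough that its temperature stays constant while a fixed amount of heat passes between them; the function names and values are hypothetical.

```python
def entropy_change(heat, temperature):
    """Entropy change of a reservoir receiving `heat` at constant T: Q / T."""
    if temperature <= 0:
        raise ValueError("absolute temperature must be positive")
    return heat / temperature


def total_entropy_change(heat, t_hot, t_cold):
    """Heat `heat` leaves a reservoir at t_hot and enters one at t_cold (kelvin)."""
    return entropy_change(-heat, t_hot) + entropy_change(heat, t_cold)


# 1000 J passing from a 400 K reservoir to a 300 K reservoir (assumed values).
print(total_entropy_change(1000.0, 400.0, 300.0))
```

Under these assumptions the total is positive when heat goes from the hotter to the colder reservoir, zero when the temperatures are equal, and negative for the reverse direction, which the second law excludes for a spontaneous process.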


The second law defines entropy, which may be viewed not only as a macroscopic variable of classical thermodynamics, but may also be viewed as a measure of deficiency of physical information about the microscopic details of the motion and configuration of the system, given only predictable experimental reproducibility of bulk or macroscopic behavior as specified by macroscopic variables that allow the distinction to be made between heat and work. More exactly, the law asserts that for two given macroscopically specified states of a system, there is a quantity called the difference of entropy between them. The entropy difference tells how much additional microscopic physical information is needed to specify one of the macroscopically specified states, given the macroscopic specification of the other, which is often a conveniently chosen reference state. It is often convenient to presuppose the reference state and not to explicitly state it. A final condition of a natural process always contains microscopically specifiable effects which are not fully and exactly predictable from the macroscopic specification of the initial condition of the process. This is why entropy increases in natural processes. The entropy increase tells how much extra microscopic information is needed to tell the final macroscopically specified state from the initial macroscopically specified state.[11]

Third law
The third law of thermodynamics is sometimes stated as follows:

The entropy of a perfect crystal at absolute zero is exactly equal to zero.

At zero temperature the system must be in a state with the minimum thermal energy. This statement holds true if the perfect crystal has only one state with minimum energy. Entropy is related to the number of possible microstates according to $S = k_B \ln \Omega$, where $S$ is the entropy of the system, $k_B$ Boltzmann's constant, and $\Omega$ the number of microstates (e.g. possible configurations of atoms). At absolute zero there is only 1 microstate possible ($\Omega = 1$) and $\ln(1) = 0$.

A more general form of the third law that applies to systems such as glasses that may have more than one minimum energy state:

The entropy of a system approaches a constant value as the temperature approaches zero.

The constant value (not necessarily zero) is called the residual entropy of the system.
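
The difference between zero entropy and a residual entropy can be illustrated with the relation between entropy and the number of microstates. The sketch below is a toy model, not a description of any particular substance: it assumes a hypothetical crystal in which each molecule can independently freeze into one of a fixed number of equally likely orientations. The helper names are invented for the example.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact in the 2019 SI)
N_A = 6.02214076e23  # Avogadro constant in 1/mol (exact in the 2019 SI)


def boltzmann_entropy(ln_omega):
    """S = k_B ln(Omega); takes ln(Omega) directly so huge Omega cannot overflow."""
    return K_B * ln_omega


def residual_molar_entropy(states_per_molecule):
    """Entropy per mole when each molecule independently has this many states.

    Omega = states_per_molecule ** N_A, so ln(Omega) = N_A * ln(states_per_molecule).
    """
    return boltzmann_entropy(N_A * math.log(states_per_molecule))


print(boltzmann_entropy(math.log(1)))  # a single ground state gives zero entropy
print(residual_molar_entropy(2))       # about 5.76 J/(mol K), i.e. R ln 2
```

With one ground state the entropy vanishes, as in the perfect-crystal statement; with two frozen-in orientations per molecule the model gives a non-zero constant, which is the kind of residual entropy the more general form allows for.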


History
Count Rumford (born Benjamin Thompson) showed, about 1797, that mechanical action can generate indefinitely large amounts of heat, so challenging the caloric theory. The historically first established thermodynamic principle which eventually became the second law of thermodynamics was formulated by Sadi Carnot during 1824. By 1860, as formalized in the works of those such as Rudolf Clausius and William Thomson, two established principles of thermodynamics had evolved, the first principle and the second principle, later restated as thermodynamic laws. By 1873, for example, thermodynamicist Josiah Willard Gibbs, in his memoir Graphical Methods in the Thermodynamics of Fluids, clearly stated the first two absolute laws of thermodynamics. Some textbooks throughout the 20th century have numbered the laws differently. In some fields removed from chemistry, the second law was considered to deal with the efficiency of heat engines only, whereas what was called the third law dealt with entropy increases. Directly defining zero points for entropy calculations was not considered to be a law. Gradually, this separation was combined into the second law and the modern third law was widely adopted.

References
[1] Guggenheim, E.A. (1985). Thermodynamics. An Advanced Treatment for Chemists and Physicists, seventh edition, North Holland, Amsterdam, ISBN 0-444-86951-4.
[2] Kittel, C. Kroemer, H. (1980). Thermal Physics, second edition, W.H. Freeman, San Francisco, ISBN 0-7167-1088-9.
[3] Adkins, C.J. (1968). Equilibrium Thermodynamics, McGraw-Hill, London, ISBN 0-07-084057-1.
[4] Kondepudi D. (2008). Introduction to Modern Thermodynamics, Wiley, Chichester, ISBN 978-0-470-01598-8.
[5] Lebon, G., Jou, D., Casas-Vázquez, J. (2008). Understanding Non-equilibrium Thermodynamics. Foundations, Applications, Frontiers, Springer, Berlin, ISBN 978-3-540-74252-4.
[6] Chris Vuille; Serway, Raymond A.; Faughn, Jerry S. (2009). College physics (http://books.google.ca/books?id=CX0u0mIOZ44C&pg=PT355). Belmont, CA: Brooks/Cole, Cengage Learning. p. 355. ISBN 0-495-38693-6.
[7] Balescu, R. (1975). Equilibrium and Nonequilibrium Statistical Mechanics, Wiley, New York, ISBN 0-471-04600-0.
[8] De Groot, S.R., Mazur, P. (1962). Non-equilibrium Thermodynamics, North Holland, Amsterdam.
[9] Glansdorff, P., Prigogine, I. (1971). Thermodynamic Theory of Structure, Stability and Fluctuations, Wiley-Interscience, London, ISBN 0-471-30280-5.
[10] Šilhavý, M. (1997). The Mechanics and Thermodynamics of Continuous Media, Springer, Berlin, ISBN 3-540-58378-5, page 137.
[11] Ben-Naim, A. (2008). A Farewell to Entropy: Statistical Thermodynamics Based on Information, World Scientific, New Jersey, ISBN 978-981-270-706-2.

Further reading
Atkins, Peter, 2007. Four Laws That Drive the Universe. OUP Oxford.
Goldstein, Martin, and Inge F., 1993. The Refrigerator and the Universe. Harvard Univ. Press. A gentle introduction.


Continuity equation
A continuity equation in physics is an equation that describes the transport of a conserved quantity. Since mass, energy, momentum, electric charge and other natural quantities are conserved under their respective appropriate conditions, a variety of physical phenomena may be described using continuity equations.

Continuity equations are a stronger, local form of conservation laws. For example, it is true that "the total energy in the universe is conserved". But this statement does not immediately rule out the possibility that a lot of energy could disappear from Earth while simultaneously appearing in another galaxy. A stronger statement is that energy is locally conserved: energy can neither be created nor destroyed, nor can it "teleport" from one place to another; it can only move by a continuous flow. A continuity equation is the mathematical way to express this kind of statement.

Continuity equations more generally can include "source" and "sink" terms, which allow them to describe quantities which are often but not always conserved, such as the density of a molecular species which can be created or destroyed by chemical reactions. In an everyday example, there is a continuity equation for the number of living humans; it has a "source term" to account for people giving birth, and a "sink term" to account for people dying.

Any continuity equation can be expressed in an "integral form" (in terms of a flux integral), which applies to any finite region, or in a "differential form" (in terms of the divergence operator) which applies at a point. Continuity equations underlie more specific transport equations such as the convection–diffusion equation, Boltzmann transport equation, and Navier–Stokes equations.
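
The idea of local conservation with optional source terms can be sketched numerically. The example below is a minimal illustration under stated assumptions, not a general solver: it uses a one-dimensional periodic grid, a constant non-negative velocity, and a simple upwind flux; the function names, grid and values are all hypothetical. Each cell changes only by what flows across its two faces plus whatever a source term adds.

```python
def step(density, velocity, dx, dt, source=None):
    """Advance cell densities by one time step on a periodic 1D grid.

    Each cell changes by (flux in - flux out) / dx plus an optional source
    term, a discrete analogue of d(rho)/dt + d(j)/dx = sigma.
    Upwind fluxes j = rho * velocity are used, assuming velocity >= 0.
    """
    n = len(density)
    flux = [velocity * rho for rho in density]  # flux leaving each cell to the right
    new_density = []
    for i in range(n):
        inflow = flux[i - 1]  # index -1 wraps around: periodic boundary
        outflow = flux[i]
        sigma = source[i] if source is not None else 0.0
        new_density.append(density[i] + dt * ((inflow - outflow) / dx + sigma))
    return new_density


def total(density, dx):
    """Total amount of the quantity on the grid."""
    return sum(density) * dx


rho0 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
rho = rho0
for _ in range(10):
    rho = step(rho, velocity=1.0, dx=1.0, dt=0.5)

print(total(rho0, 1.0), total(rho, 1.0))
```

Without a source the profile moves and spreads but the total stays the same, because whatever leaves one cell enters its neighbour. Passing a non-zero `source` list changes the total by exactly the amount the source supplies, which is the role of the "source" and "sink" terms described above.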

General equation
Preliminary description
As stated above, the idea behind the continuity equation is the flow of some property, such as mass, energy, electric charge, momentum, and even probability, through surfaces from one region of space to another. The surfaces, in general, may either be open or closed, real or imaginary, and have an arbitrary shape, but are fixed for the calculation (i.e. not time-varying, since time-varying surfaces would complicate the maths for no advantage). Let this property be represented by just one scalar variable, q, and let the volume density of this property (the amount of q per unit volume V) be $\rho$, and the union of all surfaces be denoted by S. Mathematically, $\rho$ is a ratio of two infinitesimal quantities:

$$\rho = \frac{dq}{dV}$$

which has the dimension $[\text{quantity}][L]^{-3}$ (where L is length).

Illustration of how flux j passes through open curved surfaces S (dS is differential vector area).


There are different ways to conceive the continuity equation:
1. either the flow of particles carrying the quantity q, described by a velocity field v, which is also equivalent to a flux j of q (a vector function describing the flow per unit area per unit time of q), or
2. in the cases where a velocity field is not useful or applicable, the flux j of the quantity q only (no association with velocity).
In each of these cases, the transfer of q occurs as it passes through two surfaces, the first S1 and the second S2.

Illustration of how flux j passes through closed surfaces S1 and S2. The surface area elements shown are dS1 and dS2, and the flux is integrated over the whole surface. Yellow dots are sources, red dots are sinks, the blue lines are the flux lines of q.

The flux j should represent some flow or transport, which has dimensions $[\text{quantity}][T]^{-1}[L]^{-2}$. In cases where particles/carriers of quantity q are moving with velocity v, such as particles of mass in a fluid or charge carriers in a conductor, j can be related to v by:

$$\mathbf{j} = \rho \mathbf{v}$$

This relation is only true in situations where there are particles moving and carrying q; it can't always be applied. To illustrate this: if j is electric current density (electric current per unit area) and $\rho$ is the charge density (charge per unit volume), then the velocity of the charge carriers is v. However, if j is heat flux density (heat energy per unit time per unit area), then even if we let $\rho$ be the heat energy density (heat energy per unit volume) it does not imply the "velocity of heat" is v (this makes no sense, and is not practically applicable). In the latter case only j (with $\rho$) may be used in the continuity equation.

Illustration of q, $\rho$, and j, and the effective flux due to carriers of q. $\rho$ is the amount of q per unit volume (in the box), j represents the flux (blue flux lines) and q is carried by the particles (yellow).
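
For the electric-current case, the relation between flux, density and carrier velocity can be turned around to estimate how fast the carriers move. The sketch below is only an illustration: the wire cross-section, the current, and the carrier density (a typical textbook figure for copper) are assumed values, and the function names are invented for the example.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (exact SI value)


def current_density(charge_density, velocity):
    """j = rho * v, applied component-wise to a velocity vector."""
    return tuple(charge_density * v_i for v_i in velocity)


def drift_speed(current, area, carrier_density, carrier_charge=E_CHARGE):
    """Invert j = rho * v for carriers moving along a wire of given cross-section."""
    j = current / area                      # current density in A/m^2
    rho = carrier_density * carrier_charge  # charge density in C/m^3
    return j / rho                          # carrier speed in m/s


# Assumed values: 1 A through 1 mm^2, with 8.5e28 carriers per cubic metre.
print(drift_speed(current=1.0, area=1e-6, carrier_density=8.5e28))
```

With these assumed numbers the carrier speed comes out at roughly 7e-5 m/s. No comparable calculation is meaningful for heat flux, since there is no carrier velocity to solve for.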


Elementary vector form


Consider the case when the surfaces are flat and planar cross-sections. For the case where a velocity field can be applied, dimensional analysis leads to this form of the continuity equation:

$$\rho_1 \mathbf{v}_1 \cdot \mathbf{S}_1 = \rho_2 \mathbf{v}_2 \cdot \mathbf{S}_2$$

where the left hand side is the initial amount of q flowing per unit time through surface S1, the right hand side is the final amount through surface S2, S1 and S2 are the vector areas for the surfaces S1 and S2 respectively. Notice the dot pr