
History of computing hardware

Computing hardware has been an important component of the process of calculation and data storage since it became useful for numerical values to be processed and shared.
The earliest computing hardware was probably some form of tally stick; later record-keeping aids included Phoenician clay shapes, which represented counts of items, probably
livestock or grains, in containers. Something similar is found in early Minoan
excavations. These seem to have been used by the merchants, accountants, and
government officials of the time.

Devices to aid computation have changed from simple recording and counting devices to
the abacus, the slide rule, analog computers, and more recent electronic computers. Even
today, an experienced abacus user using a device hundreds of years old can sometimes
complete basic calculations more quickly than an unskilled person using an electronic
calculator — though for more complex calculations, computers out-perform even the
most skilled human.

This article covers major developments in the history of computing hardware, and
attempts to put them in context. For a detailed timeline of events, see the computing
timeline article. The history of computing article is a related overview and treats methods
intended for pen and paper, with or without the aid of tables.

Earliest devices
Humanity has used devices to aid in computation for millennia. One example is the checkered cloth of the counting house, which served as a simple aid for enumerating stacks of coins by height. A more arithmetic-oriented machine is the
abacus. The earliest form of abacus, the dust abacus, is thought to have been invented in
Babylonia.[citation needed] The Egyptian bead and wire abacus dates from 500 BC.[citation needed]

In 1623 Wilhelm Schickard built the first mechanical calculator[citation needed] and thus
became the father of the computing era. Since his machine used techniques such as cogs
and gears first developed for clocks, it was also called a 'calculating clock'. It was put to
practical use by his friend Johannes Kepler, who revolutionized astronomy.

Machines by Blaise Pascal (the Pascaline, 1642) and Gottfried Wilhelm von Leibniz (1671) followed; an original calculator by Pascal is preserved in the Zwinger Museum. Around 1820, Charles Xavier Thomas created the first successful, mass-produced mechanical calculator, the Thomas Arithmometer, which could add, subtract, multiply, and divide. It was based mainly on Leibniz's work. Mechanical calculators, like the base-ten addiator, the comptometer, the Monroe, the Curta and the Addo-X, remained in use until the 1970s.

Leibniz also described the binary numeral system, a central ingredient of all modern
computers. However, up to the 1940s, many subsequent designs (including Charles
Babbage's machines of the 1800s and even ENIAC of 1945) were based on the harder-to-
implement decimal system.

John Napier noted that multiplication and division of numbers can be performed by
addition and subtraction, respectively, of logarithms of those numbers. Since these real
numbers can be represented as distances or intervals on a line, the slide rule allowed
multiplication and division operations to be carried out significantly faster than was
previously possible. Slide rules were used by generations of engineers and other mathematically inclined professionals until the invention of the pocket calculator. The engineers of the Apollo program, which sent men to the moon, made many of their calculations on slide rules, which were accurate to three or four significant figures.

While producing the first logarithmic tables, Napier needed to perform many multiplications, and it was at this point that he designed Napier's bones.
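
The principle can be illustrated with a short, purely illustrative calculation (the numbers below are arbitrary examples, not from the text): two values are multiplied by adding their base-10 logarithms, which is exactly what a slide rule does mechanically.

    import math

    a, b = 3.7, 42.0
    log_sum = math.log10(a) + math.log10(b)   # adding "lengths", as a slide rule does mechanically
    product = 10 ** log_sum                   # reading the result back off the logarithmic scale
    print(round(product, 4), a * b)           # both are 155.4, to slide-rule accuracy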

1801: punched card technology

Punched card system of a music machine. Also referred to as Book music, a one-stop
European medium for organs

Punched card system of a 19th Century loom

As early as 1725 Basile Bouchon used a perforated paper loop in a loom to establish the
pattern to be reproduced on cloth, and in 1726 his co-worker Jean-Baptiste Falcon
improved on his design by using perforated paper cards attached to one another for
efficiency in adapting and changing the program. The Bouchon-Falcon loom was semi-
automatic and required manual feed of the program.

In 1801, Joseph-Marie Jacquard developed a loom in which the pattern being woven was
controlled by punched cards. The series of cards could be changed without changing the
mechanical design of the loom. This was a landmark point in programmability.

Herman Hollerith invented a tabulating machine using punched cards in the 1880s.

In 1833, Charles Babbage moved on from developing his difference engine to developing
a more complete design, the analytical engine, which would draw directly on Jacquard's
punched cards for its programming.[1]

In 1890, the United States Census Bureau used punched cards and sorting machines
designed by Herman Hollerith, to handle the flood of data from the decennial census
mandated by the Constitution. Hollerith's company eventually became the core of IBM.
IBM developed punched card technology into a powerful tool for business data-
processing and produced an extensive line of specialized unit record equipment. By 1950,
the IBM card had become ubiquitous in industry and government. The warning printed
on most cards intended for circulation as documents (checks, for example), "Do not fold,
spindle or mutilate," became a motto for the post-World War II era.[1]

Leslie Comrie's articles on punched card methods and W.J. Eckert's publication of Punched Card Methods in Scientific Computation in 1940 described techniques that were sufficiently advanced to solve differential equations and to perform multiplication and division using floating-point representations, all on punched cards and unit record machines. The Thomas J. Watson Astronomical Computing Bureau at Columbia University performed astronomical calculations representing the state of the art in computing.

In many computer installations, punched cards were used until (and after) the end of the
1970s. For example, science and engineering students at many universities around the
world would submit their programming assignments to the local computer centre in the
form of a stack of cards, one card per program line, and then had to wait for the program
to be queued for processing, compiled, and executed. In due course a printout of any
results, marked with the submitter's identification, would be placed in an output tray
outside the computer center. In many cases these results would comprise solely a printout
of error messages regarding program syntax etc., necessitating another edit-compile-run
cycle.[2]

Punched cards are still used and manufactured in the current century, and their distinctive
dimensions (and 80-column capacity) can still be recognized in forms, records, and
programs around the world.

In 1835 Charles Babbage described his analytical engine. It was the plan for a general-purpose programmable computer, employing punched cards for input and a steam engine
for power. One crucial invention was to use gears for the function served by the beads of
an abacus. In a real sense, computers all contain automatic abacuses (technically called
the arithmetic logic unit or floating-point unit).

His initial idea was to use punch-cards to control a machine that could calculate and print
logarithmic tables with huge precision (a specific purpose machine). Babbage's idea soon
developed into a general-purpose programmable computer, his analytical engine.

While his design was sound and the plans were probably correct, or at least debuggable,
the project was slowed by various problems. Babbage was a difficult man to work with
and argued with anyone who didn't respect his ideas. All the parts for his machine had to
be made by hand. Small errors in each part can accumulate into large discrepancies in a machine with thousands of parts, so the parts had to be made to much tighter tolerances than were usual at the time. The project dissolved in disputes with the artisan who built the parts and ended when government funding was exhausted.

Ada Lovelace, Lord Byron's daughter, translated and added notes to the "Sketch of the Analytical Engine" by Luigi Federico, Conte Menabrea. She has become closely associated with Babbage. Some claim she was the world's first computer programmer; however, this claim and the value of her other contributions are disputed by many.

A reconstruction of the Difference Engine II, an earlier, more limited design, has been
operational since 1991 at the London Science Museum. With a few trivial changes, it
works as Babbage designed it and shows that Babbage was right in theory.

The museum used computer-operated machine tools to construct the necessary parts,
following tolerances which a machinist of the period would have been able to achieve.
Some feel that the technology of the time was unable to produce parts of sufficient
precision, though this appears to be false. The failure of Babbage to complete the engine
can be chiefly attributed to difficulties not only related to politics and financing, but also
to his desire to develop an increasingly sophisticated computer. Today, many in the
computer field term this sort of obsession creeping featuritis.


Following in the footsteps of Babbage, although unaware of his earlier work, was Percy
Ludgate, an accountant from Dublin, Ireland. He independently designed a
programmable mechanical computer, which he described in a work that was published in
1909.

1930s–1960s: desktop calculators

Curta calculator

By the 1900s earlier mechanical calculators, cash registers, accounting machines, and so
on were redesigned to use electric motors, with gear position as the representation for the
state of a variable. Companies like Friden, Marchant Calculator and Monroe made
desktop mechanical calculators from the 1930s that could add, subtract, multiply and
divide. The word "computer" was a job title assigned to people who used these
calculators to perform mathematical calculations. During the Manhattan Project, future Nobel laureate Richard Feynman was the supervisor of a roomful of human computers,
many of them women mathematicians, who understood the differential equations which
were being solved for the war effort. Even the renowned Stanisław Ulam was pressed
into service to translate the mathematics into computable approximations for the
hydrogen bomb, after the war.

In 1948, the Curta was introduced. This was a small, portable, mechanical calculator that
was about the size of a pepper grinder. Over time, during the 1950s and 1960s a variety
of different brands of mechanical calculator appeared on the market.

The first all-electronic desktop calculator was the British ANITA Mk.VII, which used a
Nixie tube display and 177 subminiature thyratron tubes. In June 1963, Friden introduced
the four-function EC-130. It had an all-transistor design, 13-digit capacity on a 5-inch
CRT, and introduced reverse Polish notation (RPN) to the calculator market at a price of
$2200. The model EC-132 added square root and reciprocal functions. In 1965, Wang
Laboratories produced the LOCI-2, a 10-digit transistorized desktop calculator that used a
Nixie tube display and could compute logarithms.

With the development of integrated circuits and microprocessors, the expensive, large calculators were replaced with smaller electronic devices.

Pre-1940 analog computers

Cambridge differential analyzer, 1938

Before World War II, mechanical and electrical analog computers were considered the
'state of the art', and many thought they were the future of computing. Analog computers
use continuously varying physical quantities, such as voltages or currents, or
the rotational speed of shafts, to represent the quantities being processed. An ingenious
example of such a machine was the Water integrator built in 1928; an electrical example
is the Mallock machine built in 1941. Unlike modern digital computers, analog
computers are not very flexible, and need to be reconfigured (i.e., reprogrammed)
manually to switch them from working on one problem to another. Analog computers had
an advantage over early digital computers in that they could be used to solve complex
problems while the earliest attempts at digital computers were quite limited. But as digital
computers have become faster and used larger memory (e.g., RAM or internal store),
they have almost entirely displaced analog computers, and computer programming, or coding, has arisen as another profession.

Since computers were rare in this era, the solutions were often hard-coded into paper
forms such as graphs and nomograms, which could then allow analog solutions to
problems, such as the distribution of pressures and temperatures in a heating system.

Some of the most widely deployed analog computers included devices for aiming
weapons, such as the Norden bombsight and Fire-control systems for naval vessels. Some
of these stayed in use for decades after WWII. One example is the Mark I Fire Control
Computer, deployed by the United States Navy on a variety of ships from destroyers to
battleships.

The art of analog computing reached its zenith with the differential analyzer, invented by
Vannevar Bush in 1930. Fewer than a dozen of these devices were ever built; the most
powerful was constructed at the University of Pennsylvania's Moore School of Electrical
Engineering, where the ENIAC was built. Digital electronic computers like the ENIAC
spelled the end for most analog computing machines, but hybrid analog computers,
controlled by digital electronics, remained in substantial use into the 1950s and 1960s,
and later in some specialized applications.

Early digital computers


The era of modern computing began with a flurry of development before and during
World War II, as electronic circuits, relays, capacitors, and vacuum tubes replaced
mechanical equivalents and digital calculations replaced analog calculations. Machines
such as the Atanasoff–Berry Computer, the Z3, the Colossus, and ENIAC were built by
hand using circuits containing relays or valves (vacuum tubes), and often used punched
cards or punched paper tape for input and as the main (non-volatile) storage medium.

In later systems, temporary or working storage was provided by acoustic delay lines
(which use the propagation time of sound through a medium such as liquid mercury or
wire to briefly store data) or by Williams tubes (which use the ability of a television
picture tube to store and retrieve data). By 1954, magnetic core memory was rapidly
displacing most other forms of temporary storage, and dominated the field through the
mid-1970s.

In this era, a number of different machines were produced with steadily advancing
capabilities. At the beginning of this period, nothing remotely resembling a modern
computer existed, except in the long-lost plans of Charles Babbage and the mathematical
musings of Alan Turing and others. At the end of the era, devices like the EDSAC had
been built, and are universally agreed to be digital computers. Defining a single point in
the series as the "first computer" misses many subtleties.

Alan Turing's 1936 paper proved enormously influential in computing and computer
science in two ways. Its main purpose was to prove that there were problems (namely the
halting problem) that could not be solved by any sequential process. In doing so, Turing
provided a definition of a universal computer, a construct that came to be called a Turing
machine, a purely theoretical device that formalizes the concept of algorithm execution,
replacing Kurt Gödel's more cumbersome universal language based on arithmetics.
Except for the limitations imposed by their finite memory stores, modern computers are
said to be Turing-complete, which is to say, they have algorithm execution capability
equivalent to a universal Turing machine. This limited type of Turing completeness is
sometimes viewed as a threshold capability separating general-purpose computers from
their special-purpose predecessors.
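
As a purely illustrative sketch of the formalism described above (the state names and the bit-flipping task are invented for this example, not taken from the text), a Turing machine can be simulated in a few lines: a transition table maps (state, symbol) to (new state, symbol to write, head movement).

    def run(tape, transitions, state="start", head=0):
        # The tape is conceptually unbounded; a dict indexed by position keeps it sparse.
        cells = dict(enumerate(tape))
        while state != "halt":
            symbol = cells.get(head, "_")                 # "_" stands for a blank cell
            state, write, move = transitions[(state, symbol)]
            cells[head] = write
            head += 1 if move == "R" else -1
        return "".join(cells[i] for i in sorted(cells))

    # Transition table: in state "start", flip each bit and move right; stop on a blank.
    transitions = {
        ("start", "0"): ("start", "1", "R"),
        ("start", "1"): ("start", "0", "R"),
        ("start", "_"): ("halt", "_", "R"),
    }
    print(run("1011", transitions))                       # prints 0100_ (every bit flipped, then a blank)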

For a computing machine to be a practical general-purpose computer, there must be some convenient read-write mechanism, punched tape, for example. For full versatility, the
Von Neumann architecture uses the same memory both to store programs and data;
virtually all contemporary computers use this architecture (or some variant). While it is
theoretically possible to implement a full computer entirely mechanically (as Babbage's
design showed), electronics made possible the speed and later the miniaturization that
characterize modern computers.

There were three parallel streams of computer development in the World War II era, and
two were either largely ignored or were deliberately kept secret. The first was the
German work of Konrad Zuse. The second was the secret development of the Colossus
computer in the UK. Neither of these had much influence on the various computing
projects in the United States. After the war, British and American computing researchers
cooperated on some of the most important steps towards a practical computing device.

Konrad Zuse's Z-series: the first program-controlled computers

A reproduction of Zuse's Z1 computer.

Working in isolation in Germany, Konrad Zuse started construction in 1936 of his first Z-
series calculators featuring memory and (initially limited) programmability. Zuse's purely
mechanical, but already binary Z1, finished in 1938, never worked reliably due to
problems with the precision of parts.

Zuse's subsequent machine, the Z3, was finished in 1941. It was based on telephone
relays and did work satisfactorily. The Z3 thus became the first functional program-
controlled, all-purpose, digital computer. In many ways it was quite similar to modern
machines, pioneering numerous advances, such as floating point numbers. Replacement
of the hard-to-implement decimal system (used in Charles Babbage's earlier design) by
the simpler binary system meant that Zuse's machines were easier to build and potentially
more reliable, given the technologies available at that time. This is sometimes viewed as
the main reason why Zuse succeeded where Babbage failed.

Programs were fed into the Z3 on punched film. Conditional jumps were missing, but in the 1990s it was shown theoretically that the Z3 was nonetheless a universal computer (ignoring its physical storage size limitations). In two 1936 patent applications, Konrad Zuse also anticipated that machine instructions could be stored in the same storage used for data, the key insight of what became known as the von Neumann architecture, first implemented in Britain in the late 1940s (the Manchester "Baby" of 1948 and the EDSAC of 1949). Zuse also claimed to have designed the first higher-level programming language, Plankalkül, in 1945 (published in 1948), although it was implemented for the first time in 2000 by the Free University of Berlin, five years after Zuse died.

Zuse suffered setbacks during World War II when some of his machines were destroyed
in the course of Allied bombing campaigns. Apparently his work remained largely
unknown to engineers in the UK and US until much later, although at least IBM was
aware of it as it financed his post-war startup company in 1946 in return for an option on
Zuse's patents.

American developments
In 1937, Claude Shannon produced his master's thesis at MIT that implemented Boolean
algebra using electronic relays and switches for the first time in history. Entitled A
Symbolic Analysis of Relay and Switching Circuits, Shannon's thesis essentially founded
practical digital circuit design.

In November 1937, George Stibitz, then working at Bell Labs, completed a relay-
based computer he dubbed the "Model K" (for "kitchen", where he had assembled it),
which calculated using binary addition. Bell Labs authorized a full research program in
late 1938 with Stibitz at the helm. Their Complex Number Calculator, completed January
8, 1940, was able to calculate complex numbers. In a demonstration to the American
Mathematical Society conference at Dartmouth College on September 11, 1940, Stibitz
was able to send the Complex Number Calculator remote commands over telephone lines
by a teletype. It was the first computing machine ever used remotely, in this case over a
phone line. Some participants in the conference who witnessed the demonstration were
John Von Neumann, John Mauchly, and Norbert Wiener, who wrote about it in his
memoirs.

In 1939, John Vincent Atanasoff and Clifford E. Berry of Iowa State University
developed the Atanasoff–Berry Computer (ABC), a special purpose digital electronic
calculator for solving systems of linear equations. (The original goal was to solve 29
simultaneous equations of 29 unknowns each, but due to errors in the card puncher
mechanism the completed machine could only solve a few equations.) The design used
over 300 vacuum tubes for high speed and employed capacitors fixed in a mechanically
rotating drum for memory. Though the ABC machine was not programmable, it was the
first to use electronic circuits. ENIAC co-inventor John Mauchly examined the ABC in
June 1941, and its influence on the design of the later ENIAC machine is a matter of
contention among computer historians. The ABC was largely forgotten until it became
the focus of the lawsuit Honeywell v. Sperry Rand, the ruling of which invalidated the
ENIAC patent (and several others) as, among many reasons, having been anticipated by
Atanasoff's work.

In 1939, development began at IBM's Endicott laboratories on the Harvard Mark I.


Known officially as the Automatic Sequence Controlled Calculator, the Mark I was a
general purpose electro-mechanical computer built with IBM financing and with
assistance from IBM personnel, under the direction of Harvard mathematician Howard
Aiken. Its design was influenced by Babbage's Analytical Engine, using decimal
arithmetic and storage wheels and rotary switches in addition to electromagnetic relays. It
was programmable via punched paper tape, and contained several calculation units
working in parallel. Later versions contained several paper tape readers and the machine
could switch between readers based on a condition. Nevertheless, the machine was not
quite Turing-complete. The Mark I was moved to Harvard University and began
operation in May 1944.

Colossus
Colossus was used to break German ciphers during World War II.

During World War II, the British at Bletchley Park achieved a number of successes at
breaking encrypted German military communications. The German encryption machine,
Enigma, was attacked with the help of electro-mechanical machines called bombes. The
bombe, designed by Alan Turing and Gordon Welchman, after the Polish cryptographic
bomba (1938), ruled out possible Enigma settings by performing chains of logical
deductions implemented electrically. Most possibilities led to a contradiction, and the few
remaining could be tested by hand.

The Germans also developed a series of teleprinter encryption systems, quite different
from Enigma. The Lorenz SZ 40/42 machine was used for high-level Army
communications, termed "Tunny" by the British. The first intercepts of Lorenz messages
began in 1941. As part of an attack on Tunny, Professor Max Newman and his colleagues
helped specify the Colossus. The Mk I Colossus was built between March and December
1943 by Tommy Flowers and his colleagues at the Post Office Research Station at Dollis
Hill in London and then shipped to Bletchley Park.

Colossus was the first totally electronic computing device. The Colossus used a large
number of valves (vacuum tubes). It had paper-tape input and was capable of being
configured to perform a variety of boolean logical operations on its data, but it was not
Turing-complete. Nine Mk II Colossi were built (The Mk I was converted to a Mk II
making ten machines in total). Details of their existence, design, and use were kept secret
well into the 1970s. Winston Churchill personally issued an order for their destruction
into pieces no larger than a man's hand. Due to this secrecy the Colossi were not included
in many histories of computing. A reconstructed copy of one of the Colossus machines is
now on display at Bletchley Park.

ENIAC

ENIAC performed ballistics trajectory calculations with 160 kW of power.

The US-built ENIAC (Electronic Numerical Integrator and Computer), often called the
first electronic general-purpose computer because Konrad Zuse's earlier Z3 was electric
but not electronic, publicly validated the use of electronics for large-scale computing.
This was crucial for the development of modern computing, initially because of the
enormous speed advantage, but ultimately because of the potential for miniaturization.
Built under the direction of John Mauchly and J. Presper Eckert, it was 1,000 times faster
than its contemporaries. ENIAC's development and construction lasted from 1943 to full
operation at the end of 1945. When its design was proposed, many researchers believed
that the thousands of delicate valves (i.e. vacuum tubes) would burn out often enough that
the ENIAC would be so frequently down for repairs as to be useless. It was, however,
capable of up to thousands of operations per second for hours at a time between valve
failures.

ENIAC was unambiguously a Turing-complete device. A "program" on the ENIAC, however, was defined by the states of its patch cables and switches, a far cry from the
stored program electronic machines that evolved from it. To program it meant to rewire
it. At the time, however, unaided calculation was seen as enough of a triumph to view the
solution of a single problem as the object of a program. (Improvements completed in
1948 made it possible to execute stored programs set in function table memory, which
made programming less a "one-off" effort, and more systematic.)

Adapting ideas developed by Eckert and Mauchly after recognizing the limitations of
ENIAC, John von Neumann wrote a widely-circulated report describing a computer
design (the EDVAC design) in which the programs and working data were both stored in
a single, unified store. This basic design, which became known as the von Neumann
architecture, would serve as the basis for the development of the first really flexible,
general-purpose digital computers.

First-generation von Neumann machines and other works

"Baby" at the Museum of Science and Industry in Manchester (MSIM), England

The first working von Neumann machine was the Manchester "Baby" or Small-Scale
Experimental Machine, built at the University of Manchester in 1948; it was followed in
1949 by the Manchester Mark I computer which functioned as a complete system using
the Williams tube and magnetic drum for memory, and also introduced index registers.
The other contender for the title "first digital stored program computer" had been
EDSAC, designed and constructed at the University of Cambridge. Operational less than
one year after the Manchester "Baby", it was also capable of tackling real problems.
EDSAC was actually inspired by plans for EDVAC (Electronic Discrete Variable
Automatic Computer), the successor to ENIAC; these plans were already in place by the
time ENIAC was successfully operational. Unlike ENIAC, which used parallel processing, EDVAC used a single processing unit. This design was simpler and was the first to be implemented in each succeeding wave of miniaturization, with increased reliability. Some view the Manchester Mark I, EDSAC, and EDVAC as the "Eves" from which nearly all current computers derive their architecture.

The first universal programmable computer in the Soviet Union was created by a team of scientists under the direction of Sergei Alekseyevich Lebedev at the Kiev Institute of Electrotechnology, Soviet Union (now in Ukraine). The computer MESM (МЭСМ, Small
Electronic Calculating Machine) became operational in 1950. It had about 6,000 vacuum
tubes and consumed 25 kW of power. It could perform approximately 3,000 operations
per second. Another early machine was CSIRAC, an Australian design that ran its first
test program in 1949.

In October 1947, the directors of J. Lyons & Company, a British catering company
famous for its teashops but with strong interests in new office management techniques,
decided to take an active role in promoting the commercial development of computers.
By 1951 the LEO I computer was operational and ran the world's first regular routine
office computer job.

Manchester University's machine became the prototype for the Ferranti Mark I. The first
Ferranti Mark I machine was delivered to the University in February 1951, and at least
nine others were sold between 1951 and 1957.

UNIVAC I, the first commercial electronic computer in the United States (third in the world), achieved 1,900 operations per second in a smaller and more efficient package
than ENIAC.

In June 1951, the UNIVAC I (Universal Automatic Computer) was delivered to the U.S.
Census Bureau. Although manufactured by Remington Rand, the machine often was
mistakenly referred to as the "IBM UNIVAC". Remington Rand eventually sold 46
machines at more than $1 million each. UNIVAC was the first 'mass produced' computer;
all predecessors had been 'one-off' units. It used 5,200 vacuum tubes and consumed 125
kW of power. It used a mercury delay line capable of storing 1,000 words of 11 decimal
digits plus sign (72-bit words) for memory. Unlike IBM machines, it was not equipped with a punched card reader but instead used 1930s-style metal magnetic tape for input, making it incompatible with some existing commercial data stores. High-speed punched paper tape
and modern-style magnetic tapes were used for input/output by other computers of the
era.

In November 1951, the J. Lyons company began weekly operation of a bakery valuations
job on the LEO (Lyons Electronic Office). This was the first business application to go
live on a stored program computer.

In 1952, IBM publicly announced the IBM 701 Electronic Data Processing Machine, the
first in its successful 700/7000 series and its first mainframe computer. The IBM
704, introduced in 1954, used magnetic core memory, which became the standard for
large machines. The first implemented high-level general purpose programming
language, Fortran, was also being developed at IBM for the 704 during 1955 and 1956
and released in early 1957. (Konrad Zuse's 1945 design of the high-level language
Plankalkül was not implemented at that time.)

IBM introduced a smaller, more affordable computer in 1954 that proved very popular.
The IBM 650 weighed over 900 kg, the attached power supply weighed around 1350 kg
and both were held in separate cabinets of roughly 1.5 meters by 0.9 meters by 1.8
meters. It cost $500,000 or could be leased for $3,500 a month. Its drum memory was
originally only 2000 ten-digit words, and required arcane programming for efficient
computing. Memory limitations such as this were to dominate programming for decades
afterward, until the evolution of a programming model which was more sympathetic to
software development.

In 1955, Maurice Wilkes invented microprogramming, which was later widely used in
the CPUs and floating-point units of mainframe and other computers, such as the IBM
360 series. Microprogramming allows the base instruction set to be defined or extended
by built-in programs (now sometimes called firmware, microcode, or millicode).

In 1956, IBM sold its first magnetic disk system, RAMAC (Random Access Method of
Accounting and Control). It used 50 24-inch metal disks, with 100 tracks per side. It
could store 5 megabytes of data and cost $10,000 per megabyte. (As of 2006, magnetic
storage, in the form of hard disks, costs less than one tenth of a cent per megabyte).

Post-1960: third generation and beyond


Main article: History of computing hardware (1960s-present)

The explosion in the use of computers began with 'Third Generation' computers. These
relied on Jack St. Clair Kilby's and Robert Noyce's independent invention of the
integrated circuit (or microchip), which later led to the invention of the microprocessor,
by Ted Hoff and Federico Faggin at Intel.

During the 1960s there was considerable overlap between second and third generation
technologies. As late as 1975, Sperry Univac continued the manufacture of second-
generation machines such as the UNIVAC 494.

The microprocessor led to the development of the microcomputer: small, low-cost computers that could be owned by individuals and small businesses. Microcomputers, the
first of which appeared in the 1970s, became ubiquitous in the 1980s and beyond. Steve
Wozniak, co-founder of Apple Computer, is credited with developing the first mass-
market home computers. However, his first computer, the Apple I, came out some time
after the KIM-1 and Altair 8800, and the first Apple computer with graphic and sound
capabilities came out well after the Commodore PET. Computing has since evolved around microcomputer architectures, which, with features added from their larger brethren, are now dominant in most market segments.

An indication of the rapidity of development in this field can be inferred from the seminal article by Burks, Goldstine, and von Neumann, reprinted in the September-October 1962 issue of Datamation, having been written as a preliminary report some 15 years earlier. (See the references below.) By the time anyone had time to write anything down, it was obsolete.

Microprocessors
A microprocessor is a programmable digital electronic component that incorporates the
functions of a central processing unit (CPU) on a single semiconducting integrated circuit
(IC). The microprocessor was born by reducing the word size of the CPU from 32 bits to
4 bits, so that the transistors of its logic circuits would fit onto a single part. One or more
microprocessors typically serve as the CPU in a computer system, embedded system, or
handheld device.

Microprocessors made possible the advent of the microcomputer in the mid-1970s. Before this period, electronic CPUs were typically made from bulky discrete switching
devices (and later small-scale integrated circuits) containing the equivalent of only a few
transistors. By integrating the processor onto one or a very few large-scale integrated
circuit packages (containing the equivalent of thousands or millions of discrete
transistors), the cost of processor power was greatly reduced. Since the advent of the microprocessor in the mid-1970s, it has become the most prevalent implementation of the CPU, nearly completely replacing all other forms. See History of computing hardware
for pre-electronic and early electronic computers.

The evolution of microprocessors has largely followed Moore's Law in its steadily increasing performance over the years. This law suggests that the complexity
of an integrated circuit, with respect to minimum component cost, doubles every 24
months. This dictum has generally proven true since the early 1970s. From their humble
beginnings as the drivers for calculators, the continued increase in power has led to the
dominance of microprocessors over every other form of computer; every system from the
largest mainframes to the smallest handheld computers now uses a microprocessor at its
core.
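
As a rough, illustrative calculation of that doubling rule (taking the Intel 4004's roughly 2,300 transistors as a 1971 baseline, a figure not given in the text above):

    def transistor_estimate(year, base_year=1971, base_count=2300):
        # One doubling every 24 months, per the statement of the law quoted above.
        doublings = (year - base_year) / 2.0
        return base_count * 2 ** doublings

    print(round(transistor_estimate(2003)))   # about 150 million, the right order of magnitude for CPUs of that era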

History

The 4004 with cover removed (left) and as actually used (right).

Three projects arguably delivered a complete microprocessor at about the same time,
namely Intel's 4004, Texas Instruments' TMS 1000, and Garrett AiResearch's Central Air
Data Computer.

In 1968, Garrett AiResearch, with designers Ray Holt and Steve Geller, was invited to
produce a digital computer to compete with electromechanical systems then under
development for the main flight control computer in the US Navy's new F-14 Tomcat
fighter. The design was complete by 1970, and used a MOS-based chipset as the core
CPU. The design was significantly (approx 20 times) smaller and much more reliable
than the mechanical systems it competed against, and was used in all of the early Tomcat
models. This system contained "a 20-bit, pipelined, parallel multi-microprocessor".
However, the system was considered so advanced that the Navy refused to allow
publication of the design until 1997. For this reason the CADC, and the MP944 chipset it used, are fairly unknown even today (see First Microprocessor Chip Set).

TI developed the 4-bit TMS 1000 and stressed pre-programmed embedded applications,
introducing a version called the TMS1802NC on September 17, 1971, which
implemented a calculator on a chip. The Intel chip was the 4-bit 4004, released on
November 15, 1971, developed by Federico Faggin and Marcian Hoff.

TI filed for the patent on the microprocessor. Gary Boone was awarded U.S. Patent
3,757,306 for the single-chip microprocessor architecture on September 4, 1973. It may
never be known which company actually had the first working microprocessor running
on the lab bench. In both 1971 and 1976, Intel and TI entered into broad patent cross-
licensing agreements, with Intel paying royalties to TI for the microprocessor patent. A
nice history of these events is contained in court documentation from a legal dispute
between Cyrix and Intel, with TI as intervenor and owner of the microprocessor patent.

Interestingly, a third party (Gilbert Hyatt) was awarded a patent that might cover the "microprocessor", claiming an invention pre-dating both TI and Intel and describing a "microcontroller". According to a rebuttal and a commentary, the patent was later invalidated, but not before substantial royalties were paid out.

A computer-on-a-chip is a variation of a microprocessor which combines the microprocessor core (CPU), some memory, and I/O (input/output) lines, all on one chip. The computer-on-a-chip patent, called the "microcomputer patent" at the time, U.S. Patent 4,074,351, was awarded to Gary Boone and Michael J. Cochran of TI. Aside from
this patent, the standard meaning of microcomputer is a computer using one or more
microprocessors as its CPU(s), while the concept defined in the patent is perhaps more
akin to a microcontroller.

According to A History of Modern Computing, (MIT Press), pp. 220–21, Intel entered
into a contract with Computer Terminals Corporation, later called Datapoint, of San
Antonio, Texas, for a chip for a terminal they were designing. Datapoint later decided not to use the chip, and Intel marketed it as the 8008 in April 1972. This was the world's first 8-
bit microprocessor. It was the basis for the famous "Mark-8" computer kit advertised in
the magazine Radio-Electronics in 1974. The 8008 and its successor, the world-famous
8080, opened up the microprocessor component marketplace.

Notable 8-bit designs

The 4004 was later followed in 1972 by the 8008, the world's first 8-bit microprocessor.
These processors are the precursors to the very successful Intel 8080 (1974), Zilog Z80
(1976), and derivative Intel 8-bit processors. The competing Motorola 6800 was released
August 1974. Its architecture was cloned and improved in the MOS Technology 6502 in
1975, rivaling the Z80 in popularity during the 1980s.
Both the Z80 and 6502 concentrated on low overall cost, through a combination of small
packaging, simple computer bus requirements, and the inclusion of circuitry that would
normally have to be provided in a separate chip (for instance, the Z80 included a memory
controller). It was these features that allowed the home computer "revolution" to take off
in the early 1980s, eventually delivering machines that sold for US$99.

The Western Design Center, Inc. (WDC) introduced the CMOS 65C02 in 1982 and licensed the design to several companies; it became the core of the Apple IIc and IIe personal computers, implantable-grade medical pacemakers and defibrillators, and automotive, industrial and consumer devices. WDC pioneered the licensing of microprocessor technology, which was later followed by ARM and other microprocessor Intellectual Property (IP) providers in the 1990s.

Motorola trumped the entire 8-bit world by introducing the MC6809 in 1978, arguably
one of the most powerful, orthogonal, and clean 8-bit microprocessor designs ever fielded
– and also one of the most complex hardwired logic designs that ever made it into
production for any microprocessor. Microcoding replaced hardwired logic at about this
point in time for all designs more powerful than the MC6809 – specifically because the
design requirements were getting too complex for hardwired logic.

Another early 8-bit microprocessor was the Signetics 2650, which enjoyed a brief flurry
of interest due to its innovative and powerful instruction set architecture.

A seminal microprocessor in the world of spaceflight was RCA's RCA 1802 (aka
CDP1802, RCA COSMAC) (introduced in 1976) which was used in NASA's Voyager
and Viking spaceprobes of the 1970s, and onboard the Galileo probe to Jupiter (launched
1989, arrived 1995). The RCA COSMAC was the first to implement CMOS technology. The CDP1802 was used because it could be run at very low power, and because its
production process (Silicon on Sapphire) ensured much better protection against cosmic
radiation and electrostatic discharges than that of any other processor of the era. Thus, the
1802 is said to be the first radiation-hardened microprocessor.

16-bit designs

The first multi-chip 16-bit microprocessor was the National Semiconductor IMP-16,
introduced in early 1973. An 8-bit version of the chipset was introduced in 1974 as the
IMP-8. During the same year, National introduced the first 16-bit single-chip
microprocessor, the National Semiconductor PACE, which was later followed by an
NMOS version, the INS8900.

Other early multi-chip 16-bit microprocessors include one used by Digital Equipment
Corporation (DEC) in the LSI-11 OEM board set and the packaged PDP 11/03
minicomputer, and the Fairchild Semiconductor MicroFlame 9440, both of which were
introduced in the 1975 to 1976 timeframe.
The first single-chip 16-bit microprocessor was TI's TMS 9900, which was also
compatible with their TI-990 line of minicomputers. The 9900 was used in the TI 990/4
minicomputer, the TI-99/4A home computer, and the TM990 line of OEM
microcomputer boards. The chip was packaged in a large ceramic 64-pin DIP package, while most 8-bit microprocessors such as the Intel 8080 used the more common,
smaller, and less expensive plastic 40-pin DIP. A follow-on chip, the TMS 9980, was
designed to compete with the Intel 8080, had the full TI 990 16-bit instruction set, used a
plastic 40-pin package, moved data 8 bits at a time, but could only address 16 KiB. A
third chip, the TMS 9995, was a new design. The family later expanded to include the
99105 and 99110.

The Western Design Center, Inc. (WDC) introduced the CMOS 65816 16-bit upgrade of
the WDC CMOS 65C02 in 1984. The 65816 16-bit microprocessor was the core of the
Apple IIgs and later the Super Nintendo Entertainment System, making it one of the most
popular 16-bit designs of all time.

Intel followed a different path, having no minicomputers to emulate, and instead "upsized" their 8080 design into the 16-bit Intel 8086, the first member of the x86 family
which powers most modern PC type computers. Intel introduced the 8086 as a cost
effective way of porting software from the 8080 lines, and succeeded in winning much
business on that premise. The 8088, a version of the 8086 that used an external 8-bit data
bus, was the microprocessor in the first IBM PC, the model 5150. Following up their
8086 and 8088, Intel released the 80186, 80286 and, in 1985, the 32-bit 80386,
cementing their PC market dominance with the processor family's backwards
compatibility.

The integrated microprocessor memory management unit (MMU) was developed by Childs et al. of Intel, and awarded US patent number 4,442,484.

32-bit designs

Upper interconnect layers on an Intel 80486DX2 die.

16-bit designs were in the market only briefly when full 32-bit implementations started to
appear.

The most famous of the 32-bit designs is the MC68000, introduced in 1979. The 68K, as
it was widely known, had 32-bit registers but used 16-bit internal data paths, and a 16-bit
external data bus to reduce pin count, and supported only 24-bit addresses. Motorola generally described it as a 16-bit processor, though it clearly had a 32-bit architecture. The
combination of high speed, large (16 mebibytes) memory space and fairly low costs made
it the most popular CPU design of its class. The Apple Lisa and Macintosh designs made
use of the 68000, as did a host of other designs in the mid-1980s, including the Atari ST
and Commodore Amiga.

The world's first single-chip fully 32-bit microprocessor, with 32-bit data paths, 32-bit buses, and 32-bit addresses, was the AT&T Bell Labs BELLMAC-32A, with first samples in 1980 and general production in 1982. After the divestiture of AT&T in 1984, it was renamed the WE
32000 (WE for Western Electric), and had two follow-on generations, the WE 32100 and
WE 32200. These microprocessors were used in the AT&T 3B5 and 3B15
minicomputers; in the 3B2, the world's first desktop supermicrocomputer; in the
"Companion", the world's first 32-bit laptop computer; and in "Alexander", the world's
first book-sized supermicrocomputer, featuring ROM-pack memory cartridges similar to
today's gaming consoles. All these systems ran the UNIX System V operating system.

Intel's first 32-bit microprocessor was the iAPX 432, which was introduced in 1981 but
was not a commercial success. It had an advanced capability-based object-oriented
architecture, but poor performance compared to other competing architectures such as the
Motorola 68000.

Motorola's success with the 68000 led to the MC68010, which added virtual memory
support. The MC68020, introduced in 1985, added full 32-bit data and address buses.
The 68020 became hugely popular in the Unix supermicrocomputer market, and many
small companies (e.g., Altos, Charles River Data Systems) produced desktop-size
systems. This was followed by the MC68030, which added the MMU to the chip; the 68K family became the processor for everything that wasn't running DOS. The continued
success led to the MC68040, which included an FPU for better math performance. A
68050 failed to achieve its performance goals and was not released, and the follow-up
MC68060 was released into a market saturated by much faster RISC designs. The 68K
family faded from the desktop in the early 1990s.

Other large companies designed the 68020 and follow-ons into embedded equipment. At
one point, there were more 68020s in embedded equipment than there were Intel
Pentiums in PCs. The ColdFire
processor cores are derivatives of the venerable 68020.

During this time (early to mid-1980s), National Semiconductor introduced a very similar microprocessor, with a 16-bit pinout and 32-bit internals, called the NS 16032 (later renamed 32016),
the full 32-bit version named the NS 32032, and a line of 32-bit industrial OEM
microcomputers. By the mid-1980s, Sequent introduced the first symmetric
multiprocessor (SMP) server-class computer using the NS 32032. This was one of the
design's few wins, and it disappeared in the late 1980s.

The MIPS R2000 (1984) and R3000 (1989) were highly successful 32-bit RISC
microprocessors. They were used in high-end workstations and servers by SGI, among
others.

Other designs included the interesting Zilog Z8000, which arrived too late to market to
stand a chance and disappeared quickly.
In the late 1980s, "microprocessor wars" started killing off some of the microprocessors.
Apparently, with only one major design win, Sequent, the NS 32032 just faded out of
existence, and Sequent switched to Intel microprocessors.

From 1985 to 2003, the 32-bit x86 architectures became increasingly dominant in
desktop, laptop, and server markets, and these microprocessors became faster and more
capable. Intel had licensed early versions of the architecture to other companies, but
declined to license the Pentium, so AMD and Cyrix built later versions of the architecture
based on their own designs. During this span, these processors increased in complexity
(transistor count) and capability (instructions/second) by at least a factor of 1000.

64-bit designs in personal computers

While 64-bit microprocessor designs have been in use in several markets since the early
1990s, the early 2000s have seen the introduction of 64-bit microchips targeted at the PC
market.

With AMD's introduction of the first 64-bit IA-32 backwards-compatible architecture, AMD64, in September 2003, followed by Intel's own x86-64 chips, the 64-bit desktop era
began. Both processors can run 32-bit legacy apps as well as the new 64-bit software.
With 64-bit Windows XP, Linux and, to a certain extent, Mac OS X running natively in 64 bits, the software too is geared to utilise the full power of such processors. The move to
64 bits is more than just an increase in register size from the IA-32 as it also doubles the
number of general-purpose registers for the aging CISC designs.

The move to 64 bits by PowerPC processors had been intended since the processors'
design in the early 90s and was not a major cause of incompatibility. Existing integer
registers are extended as are all related data pathways, but, as was the case with IA-32,
both floating point and vector units had been operating at or above 64 bits for several
years. Unlike what happened when IA-32 was extended to x86-64, no new general-purpose registers were added in 64-bit PowerPC, so any performance gained when using the 64-bit mode for applications making no use of the larger address space is minimal.

Multicore designs

AMD Athlon 64 X2 3600 Dual core processor


Main article: Multi-core (computing)

A different approach to improving a computer's performance is to add extra processors, as in symmetric multiprocessing designs, which have been popular in servers and
workstations since the early 1990s. Keeping up with Moore's Law is becoming
increasingly challenging as chip-making technologies approach the physical limits of the
technology.
In response, the microprocessor manufacturers look for other ways to improve
performance, in order to hold on to the momentum of constant upgrades in the market.

A multi-core processor is simply a single chip containing more than one microprocessor core, effectively multiplying the potential performance by the number of cores (as long as the operating system and software are designed to take advantage of more than one processor). Some components, such as the bus interface and second-level cache, may be shared between cores. Because the cores are physically very close, they can communicate at much higher clock speeds than discrete multiprocessor systems, improving overall system performance.
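
A minimal, illustrative sketch of what it means for software to be "designed to take advantage of more than one processor": the example below (not from the text) splits a sum across four worker processes, which the operating system may schedule onto separate cores.

    from multiprocessing import Pool

    def partial_sum(chunk):
        # Each worker process handles one slice of the data.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        chunks = [data[i::4] for i in range(4)]       # split the work four ways
        with Pool(processes=4) as pool:               # assumes a machine with at least four cores
            total = sum(pool.map(partial_sum, chunks))
        print(total)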

In 2005, the first mass-market dual-core processors were announced and as of 2007 dual-
core processors are widely used in servers, workstations and PCs while quad-core
processors are now available for high-end applications in both the home and professional
environments.

RISC

In the mid-1980s to early 1990s, a crop of new high-performance RISC (reduced instruction set computer) microprocessors appeared, which were initially used in special
purpose machines and Unix workstations, but have since become almost universal in all
roles except the Intel-standard desktop.

The first commercial design was released by MIPS Technologies, the 32-bit R2000 (the
R1000 was not released). The R3000 made the design truly practical, and the R4000
introduced the world's first 64-bit design. Competing projects would result in the IBM POWER and Sun SPARC systems. Soon every major vendor was releasing
a RISC design, including the AT&T CRISP, AMD 29000, Intel i860 and Intel i960,
Motorola 88000, DEC Alpha and the HP-PA.

Market forces have "weeded out" many of these designs, leaving the PowerPC as the
main desktop RISC processor, with the SPARC being used in Sun designs only. MIPS
continues to supply some SGI systems, but is primarily used as an embedded design,
notably in Cisco routers. The rest of the original crop of designs have either disappeared,
or are about to. Other companies have attacked niches in the market, notably ARM,
originally intended for home computer use but since focused on the embedded processor
market. Today RISC designs based on the MIPS, ARM or PowerPC core power the vast
majority of computing devices.

As of 2006, several 64-bit architectures are still produced. These include x86-64, MIPS,
SPARC, Power Architecture, and Itanium.

Special-purpose designs
A 4-bit, 2-register computer with six assembly-language instructions, made entirely of 74-series chips.

Though the term "microprocessor" has traditionally referred to a single- or multi-chip


CPU or system-on-a-chip (SoC), several types of specialized processing devices have
followed from the technology. The most common examples are microcontrollers, digital
signal processors (DSP) and graphics processing units (GPU). Many examples of these
are either not programmable, or have limited programming facilities. For example, in
general GPUs through the 1990s were mostly non-programmable and have only recently
gained limited facilities like programmable vertex shaders. There is no universal
consensus on what defines a "microprocessor", but it is usually safe to assume that the
term refers to a general-purpose CPU of some sort and not a special-purpose processor
unless specifically noted.

The RCA 1802 had what is called a static design, meaning that the clock frequency could
be made arbitrarily low, even to 0 Hz, a total stop condition. This let the
Voyager/Viking/Galileo spacecraft use minimum electric power for long uneventful
stretches of a voyage. Timers and/or sensors would awaken/speed up the processor in
time for important tasks, such as navigation updates, attitude control, data acquisition,
and radio communication.

Market statistics
In 2003, about $44 billion (USD) worth of microprocessors were manufactured and sold.[1] Although about half of that money was spent on CPUs used in desktop or laptop personal computers, those account for only about 0.2% of all CPUs sold.

Binary Number System


The binary numeral system, or base-2 number system, is a numeral system that
represents numeric values using two symbols, usually 0 and 1. More specifically, the
usual base-2 system is a positional notation with a radix of 2. Owing to its
straightforward implementation in electronic circuitry, the binary system is used
internally by virtually all modern computers.
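
As a small, illustrative example of base-2 positional notation (not part of the text above), the binary string 100101 denotes 1*32 + 0*16 + 0*8 + 1*4 + 0*2 + 1*1 = 37 in decimal:

    def binary_to_decimal(bits):
        value = 0
        for bit in bits:
            value = value * 2 + int(bit)   # shift the accumulated value up one binary place, then add the new digit
        return value

    print(binary_to_decimal("100101"))     # 37
    print(format(37, "b"))                 # '100101', converting back with Python's built-in formatter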

History
The ancient Indian mathematician Pingala presented the first known description of a binary numeral system around 800 BC, written in Hindu numerals. The numeration system was based on the Eye of Horus numeration system of the Old Kingdom.[1]

A full set of 8 trigrams and 64 hexagrams, analogous to 3-bit and 6-bit binary numerals, was known to the ancient Chinese through the classic text I Ching. Similar sets of
binary combinations have also been used in traditional African divination systems such as
Ifá as well as in medieval Western geomancy.

An ordered binary arrangement of the hexagrams of the I Ching, representing the decimal
sequence from 0 to 63, and a method for generating the same, was developed by the
Chinese scholar and philosopher Shao Yong in the 11th century. However, there is no
evidence that Shao understood binary computation.

In 1605 Francis Bacon discussed a system by which letters of the alphabet could be
reduced to sequences of binary digits, which could then be encoded as scarcely visible
variations in the font in any random text. Importantly for the general theory of binary
encoding, he added that this method could be used with any objects at all: "provided
those objects be capable of a twofold difference onely; as by Bells, by Trumpets, by
Lights and Torches, by the report of Muskets, and any instruments of like nature."[2] (See
Bacon's cipher.)
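
A brief, illustrative sketch of the idea (using the modern 26-letter alphabet with five-symbol groups, an assumption for simplicity; Bacon's own scheme merged I/J and U/V):

    def bacon_encode(text):
        # Each letter becomes a five-symbol group of 'a'/'b', i.e. a 5-bit binary code:
        # A -> aaaaa, B -> aaaab, C -> aaaba, and so on.
        return " ".join(
            format(ord(c) - ord("A"), "05b").replace("0", "a").replace("1", "b")
            for c in text.upper() if c.isalpha()
        )

    print(bacon_encode("BACON"))   # baaab aaaaa aaaba abbba abbab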

The modern binary number system was fully documented by Gottfried Leibniz in the
17th century in his article Explication de l'Arithmétique Binaire. Leibniz's system used 0
and 1, like the modern binary numeral system.

In 1854, British mathematician George Boole published a landmark paper detailing a system of logic that would become known as Boolean algebra. His logical system proved
instrumental in the development of the binary system, particularly in its implementation
in electronic circuitry.

In 1937, Claude Shannon produced his master's thesis at MIT that implemented Boolean
algebra and binary arithmetic using electronic relays and switches for the first time in
history. Entitled A Symbolic Analysis of Relay and Switching Circuits, Shannon's thesis
essentially founded practical digital circuit design.

In November of 1937, George Stibitz, then working at Bell Labs, completed a relay-
based computer he dubbed the "Model K" (for "Kitchen", where he had assembled it),
which calculated using binary addition. Bell Labs thus authorized a full research program
in late 1938 with Stibitz at the helm. Their Complex Number Computer, completed
January 8, 1940, was able to calculate complex numbers. In a demonstration to the
American Mathematical Society conference at Dartmouth College on September 11,
1940, Stibitz was able to send the Complex Number Calculator remote commands over
telephone lines by a teletype. It was the first computing machine ever used remotely over
a phone line. Some participants of the conference who witnessed the demonstration were
John von Neumann, John Mauchly, and Norbert Wiener, who wrote about it in his
memoirs.

[] Representation
A binary number can be represented by any sequence of bits (binary digits), which in turn
may be represented by any mechanism capable of being in two mutually exclusive states.
The following sequences of symbols could all be interpreted as the same binary numeric
value (a bit pattern that, read as 8-bit ASCII, spells "love"):
0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 0 1 1 1 0 1 1 0 0 1 1 0 0 1 0 1
| - - | - - | | | - - | - - - - | - - - | - - | | - - | | - | -
x o o x o o x x x o o x o o o o x o o o x o o x x o o x x o x o
a z z a z z a a a z z a z z z z a z z z a z z a a z z a a z z z

A binary clock might use LEDs to express binary values; in such a clock, each column of
LEDs shows a binary-coded decimal numeral of the traditional sexagesimal time.

The numeric value represented in each case is dependent upon the value assigned to each
symbol. In a computer, the numeric values may be represented by two different voltages;
on a magnetic disk, magnetic polarities may be used. A "positive", "yes", or "on" state is
not necessarily equivalent to the numerical value of one; it depends on the architecture in
use.

In keeping with customary representation of numerals using Arabic numerals, binary


numbers are commonly written using the symbols 0 and 1. When written, binary
numerals are often subscripted, prefixed or suffixed in order to indicate their base, or
radix. The following notations are equivalent:

100101 binary (explicit statement of format)


100101b (a suffix indicating binary format)
100101B (a suffix indicating binary format)
bin 100101 (a prefix indicating binary format)
100101₂ (a subscript indicating base-2 (binary) notation)
%100101 (a prefix indicating binary format)
0b100101 (a prefix indicating binary format, common in programming languages)
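As a concrete illustration of the 0b prefix (using Python here purely as an example; the
choice of language is an assumption, not part of the notation itself), the built-in bin()
and int() functions move between the notations:

n = 0b100101             # binary literal with the 0b prefix; equal to decimal 37
print(bin(n))            # -> '0b100101'
print(int('100101', 2))  # -> 37, parsing a string as base-2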

When spoken, binary numerals are usually pronounced by pronouncing each individual
digit, in order to distinguish them from decimal numbers. For example, the binary
numeral "100" is pronounced "one zero zero", rather than "one hundred", to make its
binary nature explicit, and for purposes of correctness. Since the binary numeral "100" is
equal to the decimal value four, it would be confusing, and numerically incorrect, to refer
to the numeral as "one hundred".

[] Counting in binary
Counting in binary is similar to counting in any other number system. Beginning with a
single digit, counting proceeds through each symbol, in increasing order. Decimal
counting uses the symbols 0 through 9, while binary only uses the symbols 0 and 1.

When the symbols for the first digit are exhausted, the next-higher digit (to the left) is
incremented, and counting starts over at 0. In decimal, counting proceeds like so:

000, 001, 002, ... 007, 008, 009, (rightmost digit starts over, and next digit is
incremented)
010, 011, 012, ...
...
090, 091, 092, ... 097, 098, 099, (rightmost two digits start over, and next digit is
incremented)
100, 101, 102, ...

After a digit reaches 9, an increment resets it to 0 but also causes an increment of the next
digit to the left. In binary, counting is the same except that only the two symbols 0 and 1
are used. Thus after a digit reaches 1 in binary, an increment resets it to 0 but also causes
an increment of the next digit to the left:

000, 001, (rightmost digit starts over, and next digit is incremented)
010, 011, (rightmost two digits start over, and next digit is incremented)
100, 101, ...
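The same counting sequence can be generated mechanically; the short Python sketch below
(an illustrative aid, not part of the definition) prints the first eight binary numerals:

for i in range(8):
    print(format(i, '03b'))   # 000, 001, 010, 011, 100, 101, 110, 111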

[] Binary simplified
One can think about binary by comparing it with our usual numbers. We use a base ten
system. This means that the value of each position in a numerical value can be
represented by one of ten possible symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. We are all
familiar with these and how the decimal system works using these ten symbols. When we
begin counting values, we should start with the symbol 0, and proceed to 9 when
counting. We call this the "ones", or "units" place.

The "ones" place, with those digits, might be thought of as a multiplication problem. 5
can be thought of as 5 × 100 (10 to the zeroeth power, which equals 5 × 1, since any
number to the zero power is one). As we move to the left of the ones place, we increase
the power of 10 by one. Thus, to represent 50 in this same manner, it can be thought of as
5 × 101, or 5 × 10.

When we run out of symbols in the decimal numeral system, we "move to the left" one
place and use a "1" to represent the "tens" place. Then we reset the symbol in the "ones"
place back to the first symbol, zero.

Binary is a base two system which works just like our decimal system, however with only
two symbols which can be used to represent numerical values: 0 and 1. We begin in the
"ones" place with 0, then go up to 1. Now we are out of symbols, so to represent a higher
value, we must place a "1" in the "twos" place, since we don't have a symbol we can use
in the binary system for 2, like we do in the decimal system.

In the binary numeral system, the value represented as 10 is (1 × 2¹) + (0 × 2⁰). Thus, it
equals "2" in our decimal system.

For binary-to-decimal equivalences and the actual algorithm used in computing the
conversion, see the conversion section below.
Here is another way of thinking about it: When you run out of symbols, for example
11111, add a "1" on the left end and reset all the numerals on the right to "0", producing
100000. This also works for symbols in the middle. Say the number is 100111. If you add
one to it, you move the leftmost repeating "1" one space to the left (from the "fours"
place to the "eights" place) and reset all the numerals on the right to "0", producing
101000.

[] Binary arithmetic
Arithmetic in binary is much like arithmetic in other numeral systems. Addition,
subtraction, multiplication, and division can be performed on binary numerals.

[] Addition

The circuit diagram for a binary half adder, which adds two bits together, producing sum
and carry bits.

The simplest arithmetic operation in binary is addition. Adding two single-digit binary
numbers is relatively simple:

0+0=0
0+1=1
1+0=1
1 + 1 = 10 (carry:1)

Adding two "1" values produces the value "10" (spoken as "one-zero"), equivalent to the
decimal value 2. This is similar to what happens in decimal when certain single-digit
numbers are added together; if the result equals or exceeds the value of the radix (10), the
digit to the left is incremented:

5 + 5 = 10
7 + 9 = 16

This is known as carrying in most numeral systems. When the result of an addition
exceeds the value of the radix, the procedure is to "carry the one" to the left, adding it to
the next positional value. Carrying works the same way in binary:

  1 1 1 1 1   (carried digits)
    0 1 1 0 1
+   1 0 1 1 1
-------------
= 1 0 0 1 0 0

In this example, two numerals are being added together: 01101₂ (13 decimal) and 10111₂
(23 decimal). The top row shows the carry bits used. Starting in the rightmost column, 1
+ 1 = 10₂. The 1 is carried to the left, and the 0 is written at the bottom of the rightmost
column. The second column from the right is added: 1 + 0 + 1 = 10₂ again; the 1 is
carried, and 0 is written at the bottom. The third column: 1 + 1 + 1 = 11₂. This time, a 1 is
carried, and a 1 is written in the bottom row. Proceeding like this gives the final answer
100100₂ (36 decimal).

When computers must add two numbers, the rule that x XOR y = (x + y) mod 2 for any two
bits x and y allows for very fast calculation, as well.
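As a hedged sketch of that rule in practice, the Python function below (add_binary is an
illustrative name) adds two non-negative integers using only XOR for the per-bit sum and
AND plus a left shift for the carries, mirroring the half adder described above:

def add_binary(x, y):
    # assumes non-negative integers
    while y != 0:
        carry = x & y        # positions where a carry is generated
        x = x ^ y            # per-bit sum without carries: x XOR y == (x + y) mod 2
        y = carry << 1       # carries move one position to the left
    return x

print(bin(add_binary(0b01101, 0b10111)))   # -> '0b100100' (13 + 23 = 36)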

[] Subtraction

Subtraction works in much the same way:

0−0=0
0 − 1 = 1 (with borrow)
1−0=1
1−1=0

One binary numeral can be subtracted from another as follows:

    *   * * *   (starred columns are borrowed from)
  1 1 0 1 1 1 0
−     1 0 1 1 1
----------------
= 1 0 1 0 1 1 1

Subtracting a positive number is equivalent to adding a negative number of equal


absolute value; computers typically use two's complement notation to represent negative
values. This notation eliminates the need for a separate "subtract" operation. The
subtraction can be summarized with this formula:

A - B = A + not B + 1

For further details, see two's complement.
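The formula can be sketched directly in code. The following Python fragment is an
illustration that assumes an 8-bit word, not a definitive implementation; it computes
A − B as A + NOT B + 1 and truncates the result to the word width:

BITS = 8
MASK = (1 << BITS) - 1                     # 0b11111111

def subtract(a, b):
    return (a + ((~b) & MASK) + 1) & MASK  # A + not B + 1, kept to 8 bits

print(bin(subtract(0b1101110, 0b10111)))   # -> '0b1010111' (110 - 23 = 87)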

[] Multiplication

Multiplication in binary is similar to its decimal counterpart. Two numbers A and B can
be multiplied by partial products: for each digit in B, the product of that digit in A is
calculated and written on a new line, shifted leftward so that its rightmost digit lines up
with the digit in B that was used. The sum of all these partial products gives the final
result.

Since there are only two digits in binary, there are only two possible outcomes of each
partial multiplication:

• If the digit in B is 0, the partial product is also 0


• If the digit in B is 1, the partial product is equal to A

For example, the binary numbers 1011 and 1010 are multiplied as follows:

        1 0 1 1   (A)
      × 1 0 1 0   (B)
      ---------
        0 0 0 0   ← Corresponds to a zero in B
+     1 0 1 1     ← Corresponds to a one in B
+   0 0 0 0
+ 1 0 1 1
---------------
= 1 1 0 1 1 1 0

See also Booth's multiplication algorithm.
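A brief sketch of the partial-product method (plain shift-and-add, not Booth's algorithm)
might look like this in Python; the function name multiply is illustrative only:

def multiply(a, b):
    result = 0
    shift = 0
    while b:
        if b & 1:                  # this digit of B is 1: the partial product is A
            result += a << shift   # shifted so it lines up with that digit of B
        b >>= 1
        shift += 1
    return result

print(bin(multiply(0b1011, 0b1010)))   # -> '0b1101110' (11 × 10 = 110)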

[] Division

Binary division is again similar to its decimal counterpart:

__________
1 0 1 | 1 1 0 1 1

Here, the divisor is 101₂, or 5 decimal, while the dividend is 11011₂, or 27 decimal. The
procedure is the same as that of decimal long division; here, the divisor 101₂ goes into the
first three digits 110₂ of the dividend one time, so a "1" is written on the top line. This
result is multiplied by the divisor, and subtracted from the first three digits of the
dividend; the next digit (a "1") is included to obtain a new three-digit sequence:

1
__________
1 0 1 | 1 1 0 1 1
− 1 0 1
-----
0 1 1

The procedure is then repeated with the new sequence, continuing until the digits in the
dividend have been exhausted:

1 0 1
__________
1 0 1 | 1 1 0 1 1
− 1 0 1
-----
0 1 1
− 0 0 0
-----
1 1 1
− 1 0 1
-----
1 0
Thus, the quotient of 11011₂ divided by 101₂ is 101₂, as shown on the top line, while the
remainder, shown on the bottom line, is 10₂. In decimal, 27 divided by 5 is 5, with a
remainder of 2.
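The long-division procedure can be sketched as follows (a Python illustration assuming
non-negative integers and a positive divisor; divide is a hypothetical helper name):

def divide(dividend, divisor):
    quotient, remainder = 0, 0
    for bit in bin(dividend)[2:]:                # most significant digit first
        remainder = (remainder << 1) | int(bit)  # bring down the next digit
        quotient <<= 1
        if remainder >= divisor:                 # the divisor "goes into" the sequence
            remainder -= divisor
            quotient |= 1                        # write a 1 on the top line
    return quotient, remainder

q, r = divide(0b11011, 0b101)                    # 27 ÷ 5
print(bin(q), bin(r))                            # -> 0b101 0b10 (quotient 5, remainder 2)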

[] Bitwise operations
Main article: bitwise operation

Though not directly related to the numerical interpretation of binary symbols, sequences
of bits may be manipulated using Boolean logical operators. When a string of binary
symbols is manipulated in this way, it is called a bitwise operation; the logical operators
AND, OR, and XOR may be performed on corresponding bits in two binary numerals
provided as input. The logical NOT operation may be performed on individual bits in a
single binary numeral provided as input. Sometimes, such operations may be used as
arithmetic short-cuts, and may have other computational benefits as well. For example,
an arithmetic shift left of a binary number is the equivalent of multiplication by a
(positive, integral) power of 2.
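A small illustrative example (Python operators assumed) of these operations, and of
shifting as multiplication by a power of 2:

a, b = 0b0110, 0b0011
print(bin(a & b))    # AND -> 0b10
print(bin(a | b))    # OR  -> 0b111
print(bin(a ^ b))    # XOR -> 0b101
print(bin(a << 2))   # shift left by 2 = multiply by 2**2 -> 0b11000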

[] Conversion to and from other numeral systems


[] Decimal

To convert from a base-10 integer numeral to its base-2 (binary) equivalent, the number
is divided by two, and the remainder is the least-significant bit. The (integer) result is
again divided by two, its remainder is the next most significant bit. This process repeats
until the result of further division becomes zero.

For example, 118₁₀, in binary, is:

Operation        Remainder

118 ÷ 2 = 59     0
59 ÷ 2 = 29      1
29 ÷ 2 = 14      1
14 ÷ 2 = 7       0
7 ÷ 2 = 3        1
3 ÷ 2 = 1        1
1 ÷ 2 = 0        1

Reading the sequence of remainders from the bottom up gives the binary numeral
1110110₂.

This method works for conversion from any base, but there are better methods for bases
which are powers of two, such as octal and hexadecimal given below.
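The repeated-division method can be written out as a short sketch; the Python function
below (to_binary is an illustrative name) collects the remainders and reverses them:

def to_binary(n):
    # assumes a non-negative integer
    if n == 0:
        return '0'
    digits = []
    while n > 0:
        digits.append(str(n % 2))   # the remainder is the next least-significant bit
        n //= 2
    return ''.join(reversed(digits))

print(to_binary(118))   # -> '1110110'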

To convert from base-2 to base-10 is the reverse algorithm. Starting from the left, double
the result and add the next digit until there are no more. For example, to convert
110010101101₂ to decimal:

Result                Remaining digits

0                     110010101101
0 × 2 + 1 = 1         10010101101
1 × 2 + 1 = 3         0010101101
3 × 2 + 0 = 6         010101101
6 × 2 + 0 = 12        10101101
12 × 2 + 1 = 25       0101101
25 × 2 + 0 = 50       101101
50 × 2 + 1 = 101      01101
101 × 2 + 0 = 202     1101
202 × 2 + 1 = 405     101
405 × 2 + 1 = 811     01
811 × 2 + 0 = 1622    1
1622 × 2 + 1 = 3245

The result is 3245₁₀.
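The doubling scheme is equally short to sketch (again in Python, with an illustrative
function name):

def to_decimal(bits):
    result = 0
    for digit in bits:                # starting from the left
        result = result * 2 + int(digit)
    return result

print(to_decimal('110010101101'))     # -> 3245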

The fractional parts of a number are converted with similar methods. They are again
based on the equivalence of shifting with doubling or halving.

In a fractional binary number such as 0.11010110101₂, the first digit after the radix point
is worth ½, the second ¼, and so on. So if there is a 1 in the first place after the point,
then the number is at least ½, and vice versa. Doubling such a number gives a result that
is at least 1. This suggests the algorithm: repeatedly double the number to be converted,
record whether the result is at least 1, and then throw away the integer part.
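A sketch of that doubling algorithm, assuming exact fractions via Python's fractions
module and stopping after a fixed number of bits (since the expansion may repeat forever):

from fractions import Fraction

def fraction_to_binary(x, bits=10):
    digits = []
    for _ in range(bits):
        x *= 2
        digits.append('1' if x >= 1 else '0')  # record whether the doubled value reached 1
        x -= int(x)                            # throw away the integer part
    return '0.' + ''.join(digits)

print(fraction_to_binary(Fraction(1, 3)))      # -> '0.0101010101'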

For example, 1/3₁₀, in binary, is:

Converting                 Result

1/3                        0.
1/3 × 2 = 2/3 < 1          0.0
2/3 × 2 = 4/3 ≥ 1          0.01
1/3 × 2 = 2/3 < 1          0.010
2/3 × 2 = 4/3 ≥ 1          0.0101

Thus the repeating decimal fraction 0.3... is equivalent to the repeating binary fraction
0.01... .

Or for example, 0.1₁₀, in binary, is:

Converting Result

0.1 0.

0.1 × 2 = 0.2 < 1 0.0

0.2 × 2 = 0.4 < 1 0.00

0.4 × 2 = 0.8 < 1 0.000

0.8 × 2 = 1.6 ≥ 1 0.0001

0.6 × 2 = 1.2 ≥ 1 0.00011

0.2 × 2 = 0.4 < 1 0.000110

0.4 × 2 = 0.8 < 1 0.0001100

0.8 × 2 = 1.6 ≥ 1 0.00011001

0.6 × 2 = 1.2 ≥ 1 0.000110011


0.2 × 2 = 0.4 < 1 0.0001100110

This is also a repeating binary fraction 0.000110011... . It may come as a surprise that
terminating decimal fractions can have repeating expansions in binary. It is for this
reason that many are surprised to discover that 0.1 + ... + 0.1, (10 additions) differs from
1 in floating point arithmetic. In fact, the only binary fractions with terminating
expansions are of the form of an integer divided by a power of 2, which 1/10 is not.
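This is easy to check on any machine with IEEE-754 binary floating point; the following
Python lines (illustrative, and dependent on the platform's double-precision arithmetic)
show the effect:

total = sum([0.1] * 10)
print(total)           # -> 0.9999999999999999 on a typical double-precision system
print(total == 1.0)    # -> False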

The final conversion is from binary to decimal fractions. The only difficulty arises with
repeating fractions, but otherwise the method is to shift the fraction to an integer, convert
it as above, and then divide by the appropriate power of two in the decimal base. For
example:

x              = 1100.101110011100...
x × 2⁶         = 1100101110.0111001110...
x × 2          = 11001.0111001110...
x × (2⁶ − 2)   = 1100010101
x              = 1100010101₂ / 111110₂ = (789/62)₁₀

Here the two shifted copies of x have identical fractional parts, so subtracting them
leaves the integer 1100010101₂ = 789, and dividing by 2⁶ − 2 = 62₁₀ (111110₂) gives x.

Another way of converting from binary to decimal, often quicker for a person familiar
with hexadecimal, is to do so indirectly—first converting (x in binary) into (x in
hexadecimal) and then converting (x in hexadecimal) into (x in decimal).

[] Hexadecimal

Binary may be converted to and from hexadecimal somewhat more easily. This is due to
the fact that the radix of the hexadecimal system (16) is a power of the radix of the binary
system (2). More specifically, 16 = 2⁴, so it takes four digits of binary to represent one
digit of hexadecimal.

The following table shows each hexadecimal digit along with the equivalent decimal
value and four-digit binary sequence:

Hex Dec Binary Hex Dec Binary Hex Dec Binary Hex Dec Binary

0 0 0000 4 4 0100 8 8 1000 C 12 1100

1 1 0001 5 5 0101 9 9 1001 D 13 1101


2 2 0010 6 6 0110 A 10 1010 E 14 1110

3 3 0011 7 7 0111 B 11 1011 F 15 1111

To convert a hexadecimal number into its binary equivalent, simply substitute the
corresponding binary digits:

3A₁₆ = 0011 1010₂
E7₁₆ = 1110 0111₂

To convert a binary number into its hexadecimal equivalent, divide it into groups of four
bits. If the number of bits isn't a multiple of four, simply insert extra 0 bits at the left
(called padding). For example:

01010010₂ = 0101 0010 grouped with padding = 52₁₆
11011101₂ = 1101 1101 grouped = DD₁₆

To convert a hexadecimal number into its decimal equivalent, multiply the decimal
equivalent of each hexadecimal digit by the corresponding power of 16 and add the
resulting values:

C0E7₁₆ = (12 × 16³) + (0 × 16²) + (14 × 16¹) + (7 × 16⁰) = (12 × 4096) + (0 × 256)
+ (14 × 16) + (7 × 1) = 49,383₁₀
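For illustration, the same conversions can be carried out with Python's base-aware
helpers (the use of int() and format() here is just one convenient way to do it):

n = int('C0E7', 16)
print(n)                          # -> 49383
print(format(n, '016b'))          # -> 1100000011100111 (C, 0, E, 7 as four-bit groups)
print(format(0b01010010, 'X'))    # -> 52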

[] Octal

Binary is also easily converted to the octal numeral system, since octal uses a radix of 8,
which is a power of two (namely, 2³, so it takes exactly three binary digits to represent an
octal digit). The correspondence between octal and binary numerals is the same as for the
first eight digits of hexadecimal in the table above. Binary 000 is equivalent to the octal
digit 0, binary 111 is equivalent to octal 7, and so on.

Octal Binary

0 000

1 001
2 010

3 011

4 100

5 101

6 110

7 111

Converting from octal to binary proceeds in the same fashion as it does for hexadecimal:

65₈ = 110 101₂
17₈ = 001 111₂

And from binary to octal:

101100₂ = 101 100₂ grouped = 54₈
10011₂ = 010 011₂ grouped with padding = 23₈

And from octal to decimal:

65₈ = (6 × 8¹) + (5 × 8⁰) = (6 × 8) + (5 × 1) = 53₁₀
127₈ = (1 × 8²) + (2 × 8¹) + (7 × 8⁰) = (1 × 64) + (2 × 8) + (7 × 1) = 87₁₀

[] Representing real numbers


Non-integers can be represented by using negative powers, which are set off from the
other digits by means of a radix point (called a decimal point in the decimal system). For
example, the binary number 11.01₂ thus means:

1 × 2¹  (1 × 2 = 2)    plus
1 × 2⁰  (1 × 1 = 1)    plus
0 × 2⁻¹ (0 × ½ = 0)    plus
1 × 2⁻² (1 × ¼ = 0.25)

For a total of 3.25 decimal.
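A short sketch that evaluates such a numeral digit by digit (Python; binary_to_value is a
hypothetical helper that sums digit × 2 to the appropriate power on each side of the radix
point):

def binary_to_value(s):
    integer, _, fraction = s.partition('.')
    value = sum(int(d) * 2**i for i, d in enumerate(reversed(integer)))
    value += sum(int(d) * 2**-(i + 1) for i, d in enumerate(fraction))
    return value

print(binary_to_value('11.01'))   # -> 3.25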

All dyadic rational numbers have a terminating binary numeral—the binary


representation has a finite number of terms after the radix point. Other rational numbers
have binary representation, but instead of terminating, they recur, with a finite sequence
of digits repeating indefinitely. For instance

1/3₁₀ = 1/11₂ = 0.0101010101...₂

12/17₁₀ = 1100/10001₂ = 0.10110100 10110100 10110100...₂

The phenomenon that the binary representation of any rational is either terminating or
recurring also occurs in other radix-based numeral systems. See, for instance, the
explanation in decimal. Another similarity is the existence of alternative representations
for any terminating representation, relying on the fact that 0.111111... is the sum of the
geometric series 2⁻¹ + 2⁻² + 2⁻³ + ... which is 1.

Binary numerals which neither terminate nor recur represent irrational numbers. For
instance,

• 0.10100100010000100000100.... does have a pattern, but it is not a fixed-length
recurring pattern, so the number is irrational
• 1.0110101000001001111001100110011111110... is the binary representation of √2,
the square root of 2, another irrational number. It has no discernible pattern, although a
proof that √2 is irrational requires more than this. See irrational number.

History of programming languages


This article discusses the major developments in the history of programming languages.
For a detailed timeline of events, see the timeline of programming languages.

[] Prehistory
The first programming languages predate the modern computer. From the first, the
languages were codes. Herman Hollerith realized that he could encode information on
punch cards when he observed that railroad train conductors would encode the
appearance of the ticket holders on the train tickets using the position of punched holes
on the tickets. Hollerith then proceeded to encode the 1890 census data on punch cards
which he made the same size as the boxes for holding US currency. (The dollar bill was
later downsized.)

The first computer codes were specialized for the applications. In the first decades of the
twentieth century, numerical calculations were based on decimal numbers. Eventually it
was realized that logic could be represented with numbers, as well as with words. For
example, Alonzo Church was able to express the lambda calculus in a formulaic way.
The Turing machine was an abstraction of the operation of a tape-marking machine, for
example, in use at the telephone companies. However, unlike the lambda calculus,
Turing's code does not serve well as a basis for higher-level languages — its principal use
is in rigorous analyses of algorithmic complexity.

Like many "firsts" in history, the first modern programming language is hard to identify.
From the start, the restrictions of the hardware defined the language. Punch cards allowed
80 columns, but some of the columns had to be used for a sorting number on each card.
Fortran included some keywords which were the same as English words, such as "IF",
"GOTO" (go to) and "CONTINUE". The use of a magnetic drum for memory meant that
computer programs also had to be interleaved with the rotations of the drum. Thus the
programs were more hardware dependent than today.

To some people the answer depends on how much power and human-readability is
required before the status of "programming language" is granted. Jacquard looms and
Charles Babbage's Difference Engine both had simple, extremely limited languages for
describing the actions that these machines should perform. One can even regard the
punch holes on a player piano scroll as a limited domain-specific programming language,
albeit not designed for human consumption.

[] The 1940s
In the 1940s the first recognizably modern, electrically powered computers were created.
The limited speed and memory capacity forced programmers to write hand tuned
assembly language programs. It was soon discovered that programming in assembly
language required a great deal of intellectual effort and was error-prone.

In 1948, Konrad Zuse [1] published a paper about his programming language Plankalkül.
However, it was not implemented in his time and his original contributions were isolated
from other developments.

Some important languages that were developed in this time period include:

• 1943 - Plankalkül (Konrad Zuse)


• 1943 - ENIAC coding system
• 1949 - C-10

[] The 1950s and 1960s


In the 1950s the first three modern programming languages whose descendants are still in
widespread use today were designed:

• FORTRAN, the "FORmula TRANslator, invented by John W. Backus et al.;


• LISP, the "LISt Processor", invented by John McCarthy et al.;
• COBOL, the COmmon Business Oriented Language, created by the Short Range
Committee, heavily influenced by Grace Hopper.

Another milestone in the late 1950s was the publication, by a committee of American and
European computer scientists, of "a new language for algorithms"; the Algol 60 Report
(the "ALGOrithmic Language"). This report consolidated many ideas circulating at the
time and featured two key innovations:

• The use of Backus-Naur Form (BNF) for describing the language's syntax. Nearly
all subsequent programming languages have used a variant of BNF to describe the
context-free portion of their syntax.
• The introduction of lexical scoping for names in arbitrarily nested scopes.

Algol 60 was particularly influential in the design of later languages, some of which soon
became more popular. The Burroughs large systems were designed to be programmed in
an extended subset of Algol.

Some important languages that were developed in this time period include:

• 1951 - Regional Assembly Language


• 1952 - Autocode
• 1954 - FORTRAN
• 1958 - LISP
• 1958 - ALGOL 58
• 1959 - COBOL
• 1962 - APL
• 1962 - Simula
• 1964 - BASIC
• 1964 - PL/I

[] 1967-1978: establishing fundamental paradigms


The period from the late 1960s to the late 1970s brought a major flowering of
programming languages. Most of the major language paradigms now in use were
invented in this period:

• Simula, invented in the late 1960s by Nygaard and Dahl as a superset of Algol
60, was the first language designed to support object-oriented programming.
• Smalltalk (mid 1970s) provided a complete ground-up design of an object-
oriented language.
• C, an early systems programming language, was developed by Dennis Ritchie and
Ken Thompson at Bell Labs between 1969 and 1973.
• Prolog, designed in 1972 by Colmerauer, Roussel, and Kowalski, was the first
logic programming language.
• ML built a polymorphic type system (invented by Robin Milner in 1978) on top
of Lisp, pioneering statically typed functional programming languages.

Each of these languages spawned an entire family of descendants, and most modern
languages count at least one of them in their ancestry.

The 1960s and 1970s also saw considerable debate over the merits of "structured
programming", which essentially meant programming without the use of GOTO. This
debate was closely related to language design: some languages did not include GOTO,
which forced structured programming on the programmer. Although the debate raged
hotly at the time, nearly all programmers now agree that, even in languages that provide
GOTO, it is bad style to use it except in rare circumstances. As a result, later generations
of language designers have found the structured programming debate tedious and even
bewildering.

Some important languages that were developed in this time period include:

• 1970 - Pascal
• 1970 - Forth
• 1972 - C
• 1972 - Smalltalk
• 1972 - Prolog
• 1973 - ML
• 1978 - SQL

[] The 1980s: consolidation, modules, performance


The 1980s were years of relative consolidation. C++ combined object-oriented and
systems programming. The United States government standardized Ada, a systems
programming language intended for use by defense contractors. In Japan and elsewhere,
vast sums were spent investigating so-called "fifth generation" languages that
incorporated logic programming constructs. The functional languages community moved
to standardize ML and Lisp. Rather than inventing new paradigms, all of these
movements elaborated upon the ideas invented in the previous decade.

However, one important new trend in language design was an increased focus on
programming for large-scale systems through the use of modules, or large-scale
organizational units of code. Modula, Ada, and ML all developed notable module
systems in the 1980s. Module systems were often wedded to generic programming
constructs; generics are, in essence, parameterized modules (see also parametric
polymorphism).

Although major new paradigms for programming languages did not appear, many
researchers expanded on the ideas of prior languages and adapted them to new contexts.
For example, the languages of the Argus and Emerald systems adapted object-oriented
programming to distributed systems.
The 1980s also brought advances in programming language implementation. The RISC
movement in computer architecture postulated that hardware should be designed for
compilers rather than for human assembly programmers. Aided by processor speed
improvements that enabled increasingly aggressive compilation techniques, the RISC
movement sparked greater interest in compilation technology for high-level languages.

Language technology continued along these lines well into the 1990s.

Some important languages that were developed in this time period include:

• 1983 - Ada
• 1983 - C++
• 1985 - Eiffel
• 1987 - Perl
• 1989 - FL (Backus)

[] The 1990s: the Internet age


The rapid growth of the Internet in the mid-1990s was the next major historic event in
programming languages. By opening up a radically new platform for computer systems,
the Internet created an opportunity for new languages to be adopted. In particular, the
Java programming language rose to popularity because of its early integration with the
Netscape Navigator web browser, and various scripting languages achieved widespread
use in developing customized applications for web servers. Neither of these
developments represented much fundamental novelty in language design; for example,
the design of Java was a more conservative version of ideas explored many years earlier
in the Smalltalk community, but the widespread adoption of languages that supported
features like garbage collection and strong static typing was a major change in
programming practice.

Some important languages that were developed in this time period include:

• 1990 - Haskell
• 1990 - Python
• 1991 - Java
• 1993 - Ruby
• 1995 - PHP
• 2000 - C#

[] Current trends
Programming language evolution continues, in both industry and research. Some current
directions:
• Mechanisms for adding security and reliability verification to the language:
extended static checking, information flow control, static thread safety.
• Alternative mechanisms for modularity: mixins, delegates, aspects.
• Component-oriented software development.
• Metaprogramming, reflection or access to the abstract syntax tree
• Increased emphasis on distribution and mobility.
• Integration with databases, including XML and relational databases.
• Open Source as a developmental philosophy for languages, including the GNU
compiler collection and recent languages such as Python, Ruby, and Squeak.
• Support for Unicode so that source code (program text) is not restricted to those
characters contained in the ASCII character set; allowing, for example, use of
non-Latin-based scripts or extended punctuation.

Algorithm

In mathematics, computing, linguistics, and related disciplines, an algorithm is a finite


list of well-defined instructions for accomplishing some task that, given an initial state,
will terminate in a defined end-state.

The concept of an algorithm originated as a means of recording procedures for solving


mathematical problems such as finding the common divisor of two numbers or
multiplying two numbers. A partial formalization of the concept began with attempts to
solve the Entscheidungsproblem (the "decision problem") that David Hilbert posed in
1928. Subsequent formalizations were framed as attempts to define "effective
calculability" (cf Kleene 1943:274) or "effective method" (cf Rosser 1939:225); those
formalizations included the Gödel-Herbrand-Kleene recursive functions of 1930, 1934
and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's "Formulation I" of
1936, and Alan Turing's Turing machines of 1936-7 and 1939.

[] Etymology
Al-Khwārizmī, Persian astronomer and mathematician, wrote a treatise in Arabic in 825
AD, On Calculation with Hindu Numerals. (See algorism). It was translated into Latin in
the 12th century as Algoritmi de numero Indorum,[1] which title was likely intended to
mean "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's
rendition of the author's name; but people misunderstanding the title treated Algoritmi as
a Latin plural and this led to the word "algorithm" (Latin algorismus) coming to mean
"calculation method". The intrusive "th" is most likely due to a false cognate with the
Greek αριθμος (arithmos) meaning "number".
Flowcharts are often used to graphically represent algorithms.

[] Why algorithms are necessary: an informal definition


No generally accepted formal definition of "algorithm" exists yet. We can, however,
derive clues to the issues involved and an informal meaning of the word from the
following quotation from Boolos and Jeffrey (1974, 1999):

"No human being can write fast enough, or long enough, or small enough to list
all members of an enumerably infinite set by writing out their names, one after
another, in some notation. But humans can do something equally useful, in the
case of certain enumerably infinite sets: They can give explicit instructions for
determining the nth member of the set, for arbitrary finite n. Such instructions
are to be given quite explicitly, in a form in which they could be followed by a
computing machine, or by a human who is capable of carrying out only very
elementary operations on symbols" (boldface added, p. 19).

The words "enumerably infinite" mean "countable using integers perhaps extending to
infinity". Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a
process that "creates" output integers from an arbitrary "input" integer or integers that, in
theory, can be chosen from 0 to infinity. Thus we might expect an algorithm to be an
algebraic equation such as y = m + n — two arbitrary "input variables" m and n that
produce an output y. Unfortunately — as we see in Algorithm characterizations — the
word algorithm implies much more than this, something on the order of (for our addition
example):

Precise instructions (in language understood by "the computer") for a "fast,


efficient, good" process that specifies the "moves" of "the computer" (machine or
human, equipped with the necessary internally-contained information and
capabilities) to find, decode, and then munch arbitrary input integers/symbols m
and n, symbols + and = ... and (reliably, correctly, "effectively") produce, in a
"reasonable" time, output-integer y at a specified place and in a specified format.

The concept of algorithm is also used to define the notion of decidability (logic). That
notion is central for explaining how formal systems come into being starting from a small
set of axioms and rules. In logic, the time that an algorithm requires to complete cannot
be measured, as it is not apparently related with our customary physical dimension. From
such uncertainties, that characterize ongoing work, stems the unavailability of a
definition of algorithm that suits both concrete (in some sense) and abstract usage of the
term.

For a detailed presentation of the various points of view around the definition of
"algorithm" see Algorithm characterizations. For examples of simple addition
algorithms specified in the detailed manner described in Algorithm
characterizations, see Algorithm examples.
[] Formalization of algorithms
Algorithms are essential to the way computers process information, because a computer
program is essentially an algorithm that tells the computer what specific steps to perform
(in what specific order) in order to carry out a specified task, such as calculating
employees’ paychecks or printing students’ report cards. Thus, an algorithm can be
considered to be any sequence of operations that can be performed by a Turing-complete
system. Authors who assert this thesis include Savage (1987) and Gurevich (2000):

"...Turing's informal argument in favor of his thesis justifies a stronger thesis:


every algorithm can be simulated by a Turing machine" (Gurevich 2000:1)
...according to Savage [1987], "an algorithm is a computational process defined
by a Turing machine."(Gurevich 2000:3)

Typically, when an algorithm is associated with processing information, data are read
from an input source or device, written to an output sink or device, and/or stored for
further processing. Stored data are regarded as part of the internal state of the entity
performing the algorithm. In practice, the state is stored in a data structure, but an
algorithm requires the internal data only for specific operation sets called abstract data
types.

For any such computational process, the algorithm must be rigorously defined: specified
in the way it applies in all possible circumstances that could arise. That is, any
conditional steps must be systematically dealt with, case-by-case; the criteria for each
case must be clear (and computable).

Because an algorithm is a precise list of precise steps, the order of computation will
almost always be critical to the functioning of the algorithm. Instructions are usually
assumed to be listed explicitly, and are described as starting 'from the top' and going
'down to the bottom', an idea that is described more formally by flow of control.

So far, this discussion of the formalization of an algorithm has assumed the premises of
imperative programming. This is the most common conception, and it attempts to
describe a task in discrete, 'mechanical' means. Unique to this conception of formalized
algorithms is the assignment operation, setting the value of a variable. It derives from the
intuition of 'memory' as a scratchpad. There is an example below of such an assignment.

For some alternate conceptions of what constitutes an algorithm see functional


programming and logic programming .

[] Termination

Some writers restrict the definition of algorithm to procedures that eventually finish. In
such a category Kleene places the "decision procedure or decision method or algorithm
for the question" (Kleene 1952:136). Others, including Kleene, include procedures that
could run forever without stopping; such a procedure has been called a "computational
method" (Knuth 1997:5) or "calculation procedure or algorithm" (Kleene 1952:137);
however, Kleene notes that such a method must eventually exhibit "some object" (Kleene
1952:137).

Minsky makes the pertinent observation that if an algorithm hasn't terminated then we
cannot answer the question "Will it terminate with the correct answer?":

"But if the length of the process is not known in advance, then 'trying' it may not
be decisive, because if the process does go on forever — then at no time will we
ever be sure of the answer" (Minsky 1967:105)

Thus the answer is: undecidable. We can never know, nor can we do an analysis
beforehand to find out. The analysis of algorithms for their likelihood of termination is
called Termination analysis. See Halting problem for more about this knotty issue.

In the case of non-halting computation method (calculation procedure) success can no


longer be defined in terms of halting with a meaningful output. Instead, terms of success
that allow for unbounded output sequences must be defined. For example, an algorithm
that verifies if there are more zeros than ones in an infinite random binary sequence must
run forever to be effective. If it is implemented correctly, however, the algorithm's output
will be useful: for as long as it examines the sequence, the algorithm will give a positive
response while the number of examined zeros outnumber the ones, and a negative
response otherwise. Success for this algorithm could then be defined as eventually
outputting only positive responses if there are actually more zeros than ones in the
sequence, and in any other case outputting any mixture of positive and negative
responses.

See the examples of (im-)"proper" subtraction at partial function for more about what can
happen when an algorithm fails for certain of its input numbers — e.g. (i) non-
termination, (ii) production of "junk" (output in the wrong format to be considered a
number) or no number(s) at all (halt ends the computation with no output), (iii) wrong
number(s), or (iv) a combination of these. Kleene proposed that the production of "junk"
or failure to produce a number is solved by having the algorithm detect these instances
and produce e.g. an error message (he suggested "0"), or preferably, force the algorithm
into an endless loop (Kleene 1952:322). Davis does this to his subtraction algorithm —
he fixes his algorithm in a second example so that it is proper subtraction (Davis
1958:12-15). Along with the logical outcomes "true" and "false" Kleene also proposes the
use of a third logical symbol "u" — undecided (Kleene 1952:326) — thus an algorithm
will always produce something when confronted with a "proposition". The problem of
wrong answers must be solved with an independent "proof" of the algorithm e.g. using
induction:

"We normally require auxiliary evidence for this (that the algorithm correctly
defines a mu recursive function), e.g. in the form of an inductive proof that, for
each argument value, the computation terminates with a unique value" (Minsky
1967:186)
[] Expressing algorithms

Algorithms can be expressed in many kinds of notation, including natural languages,


pseudocode, flowcharts, and programming languages. Natural language expressions of
algorithms tend to be verbose and ambiguous, and are rarely used for complex or
technical algorithms. Pseudocode and flowcharts are structured ways to express
algorithms that avoid many of the ambiguities common in natural language statements,
while remaining independent of a particular implementation language. Programming
languages are primarily intended for expressing algorithms in a form that can be executed
by a computer, but are often used as a way to define or document algorithms.

There is a wide variety of representations possible and one can express a given Turing
machine program as a sequence of machine tables (see more at finite state machine and
state transition table), as flowcharts (see more at state diagram), or as a form of
rudimentary machine code or assembly code called "sets of quadruples" (see more at
Turing machine).

Sometimes it is helpful in the description of an algorithm to supplement small "flow


charts" (state diagrams) with natural-language and/or arithmetic expressions written
inside "block diagrams" to summarize what the "flow charts" are accomplishing.

Representations of algorithms are generally classed into three accepted levels of Turing
machine description (Sipser 2006:157):

• 1 High-level description:

"...prose to describe an algorithm, ignoring the implementation details. At this


level we do not need to mention how the machine manages its tape or head"

• 2 Implementation description:

"...prose used to define the way the Turing machine uses its head and the way that
it stores data on its tape. At this level we do not give details of states or transition
function"

• 3 Formal description:

Most detailed, "lowest level", gives the Turing machine's "state table".
For an example of the simple algorithm "Add m+n" described in all three levels
see Algorithm examples.

[] Implementation

Most algorithms are intended to be implemented as computer programs. However,


algorithms are also implemented by other means, such as in a biological neural network
(for example, the human brain implementing arithmetic or an insect looking for food), in
an electrical circuit, or in a mechanical device

[] Algorithm analysis

As it happens, it is important to know how much of a particular resource (such as time or


storage) is required for a given algorithm. Methods have been developed for the analysis
of algorithms to obtain such quantitative answers; for example, an algorithm that finds
the largest number in an unsorted list has a time requirement of O(n), using the big O
notation with n as the length of the list. At all
times the algorithm only needs to remember two values: the largest number found so far,
and its current position in the input list. Therefore it is said to have a space requirement of
O(1).[2] (Note that the size of the inputs is not counted as space used by the algorithm.)
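As an illustrative sketch of that kind of analysis (assuming Python and a non-empty
list), the largest-element scan below runs in O(n) time and uses O(1) extra space:

def find_largest(numbers):
    largest = numbers[0]
    for value in numbers[1:]:    # a single pass over the list: O(n) time
        if value > largest:
            largest = value      # only one extra value is kept: O(1) space
    return largest

print(find_largest([31, 41, 59, 26, 53]))   # -> 59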

Different algorithms may complete the same task with a different set of instructions in
less or more time, space, or effort than others. For example, given two different recipes
for making potato salad, one may require peeling the potatoes before boiling them, while
the other presents the steps in the reverse order; yet both call for these steps to be
repeated for all potatoes and end when the potato salad is ready to be eaten.

The analysis and study of algorithms is a discipline of computer science, and is often
practiced abstractly without the use of a specific programming language or
implementation. In this sense, algorithm analysis resembles other mathematical
disciplines in that it focuses on the underlying properties of the algorithm and not on the
specifics of any particular implementation. Usually pseudocode is used for analysis as it
is the simplest and most general representation.

[] Classes
There are various ways to classify algorithms, each with its own merits.

[] Classification by implementation

One way to classify algorithms is by implementation means.

• Recursion or iteration: A recursive algorithm is one that invokes (makes


reference to) itself repeatedly until a certain condition matches, which is a method
common to functional programming. Iterative algorithms use repetitive constructs
like loops and sometimes additional data structures like stacks to solve the given
problems. Some problems are naturally suited for one implementation or the
other. For example, the Towers of Hanoi puzzle is well understood in its recursive
implementation (see the sketch after this list). Every recursive version has an
equivalent (but possibly more or less complex) iterative version, and vice versa.

• Logical: An algorithm may be viewed as controlled logical deduction. This notion


may be expressed as:
Algorithm = logic + control.[3]

The logic component expresses the axioms that may be used in the computation
and the control component determines the way in which deduction is applied to
the axioms. This is the basis for the logic programming paradigm. In pure logic
programming languages the control component is fixed and algorithms are
specified by supplying only the logic component. The appeal of this approach is
the elegant semantics: a change in the axioms has a well defined change in the
algorithm.

• Serial or parallel or distributed: Algorithms are usually discussed with the


assumption that computers execute one instruction of an algorithm at a time.
Those computers are sometimes called serial computers. An algorithm designed
for such an environment is called a serial algorithm, as opposed to parallel
algorithms or distributed algorithms. Parallel algorithms take advantage of
computer architectures where several processors can work on a problem at the
same time, whereas distributed algorithms utilise multiple machines connected
with a network. Parallel or distributed algorithms divide the problem into more
symmetrical or asymmetrical subproblems and collect the results back together.
The resource consumption in such algorithms is not only processor cycles on each
processor but also the communication overhead between the processors. Sorting
algorithms can be parallelized efficiently, but their communication overhead is
expensive. Iterative algorithms are generally parallelizable. Some problems have
no parallel algorithms, and are called inherently serial problems.

• Deterministic or non-deterministic: Deterministic algorithms solve the problem


with an exact decision at every step of the algorithm, whereas non-deterministic
algorithms solve problems by guessing, although typical guesses are made more
accurate through the use of heuristics.

• Exact or approximate: While many algorithms reach an exact solution,


approximation algorithms seek an approximation that is close to the true solution.
Approximation may use either a deterministic or a random strategy. Such
algorithms have practical value for many hard problems.
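As a hedged sketch of the recursion-versus-iteration contrast mentioned in the list above,
the Towers of Hanoi can be solved recursively, or iteratively with an explicit stack
standing in for the call stack (Python; the function names are illustrative):

def hanoi_recursive(n, source, target, spare):
    if n == 0:
        return
    hanoi_recursive(n - 1, source, spare, target)
    print(f"move disk {n} from {source} to {target}")
    hanoi_recursive(n - 1, spare, target, source)

def hanoi_iterative(n, source, target, spare):
    stack = [("solve", n, source, target, spare)]     # pending work items
    while stack:
        item = stack.pop()
        if item[0] == "move":
            _, disk, src, dst = item
            print(f"move disk {disk} from {src} to {dst}")
        else:
            _, k, src, dst, spr = item
            if k == 0:
                continue
            # pushed in reverse so the work runs in the same order as the recursion
            stack.append(("solve", k - 1, spr, dst, src))
            stack.append(("move", k, src, dst))
            stack.append(("solve", k - 1, src, spr, dst))

hanoi_recursive(3, "A", "C", "B")
hanoi_iterative(3, "A", "C", "B")    # prints the same sequence of moves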

[] Classification by design paradigm

Another way of classifying algorithms is by their design methodology or paradigm. There


are a number of such paradigms, each different from the others. Furthermore, each of
these categories will include many different types of algorithms. Some commonly found
paradigms include:

• Divide and conquer. A divide and conquer algorithm repeatedly reduces an


instance of a problem to one or more smaller instances of the same problem
(usually recursively), until the instances are small enough to solve easily. One
such example of divide and conquer is merge sort: the data is divided into segments,
each segment is sorted, and the sorted segments are merged in the conquer phase. A
simpler variant of divide and conquer is the decrease and conquer algorithm, which
solves one identical subproblem and uses its solution to solve the bigger problem.
Because divide and conquer divides the problem into multiple subproblems, its conquer
stage is more complex than that of decrease and conquer algorithms. An example of a
decrease and conquer algorithm is the binary search algorithm (see the sketch after
this list).
• Dynamic programming. When a problem shows optimal substructure, meaning
the optimal solution to a problem can be constructed from optimal solutions to
subproblems, and overlapping subproblems, meaning the same subproblems are
used to solve many different problem instances, a quicker approach called
dynamic programming avoids recomputing solutions that have already been
computed. For example, the shortest path to a goal from a vertex in a weighted
graph can be found by using the shortest path to the goal from all adjacent
vertices. Dynamic programming and memoization go together. The main
difference between dynamic programming and divide and conquer is that
subproblems are more or less independent in divide and conquer, whereas
subproblems overlap in dynamic programming. The difference between dynamic
programming and straightforward recursion is in caching or memoization of
recursive calls. When subproblems are independent and there is no repetition,
memoization does not help; hence dynamic programming is not a solution for all
complex problems. By using memoization or maintaining a table of subproblems
already solved, dynamic programming reduces the exponential nature of many
problems to polynomial complexity.
• The greedy method. A greedy algorithm is similar to a dynamic programming
algorithm, but the difference is that solutions to the subproblems do not have to be
known at each stage; instead a "greedy" choice can be made of what looks best for
the moment. The greedy method extends the solution with the best possible
decision (not all feasible decisions) at an algorithmic stage based on the current
local optimum and the best decision (not all possible decisions) made in the previous
stage. It is not exhaustive, and does not give an accurate answer for many problems.
But when it works, it will be the fastest method. A well-known greedy
algorithm is Kruskal's algorithm for finding a minimal spanning tree.
• Linear programming. When solving a problem using linear programming,
specific inequalities involving the inputs are found and then an attempt is made to
maximize (or minimize) some linear function of the inputs. Many problems (such
as the maximum flow for directed graphs) can be stated in a linear programming
way, and then be solved by a 'generic' algorithm such as the simplex algorithm. A
more complex variant of linear programming is called integer programming,
where the solution space is restricted to the integers.
• Reduction. This technique involves solving a difficult problem by transforming it
into a better known problem for which we have (hopefully) asymptotically
optimal algorithms. The goal is to find a reducing algorithm whose complexity is
not dominated by the resulting reduced algorithm's. For example, one selection
algorithm for finding the median in an unsorted list involves first sorting the list
(the expensive portion) and then pulling out the middle element in the sorted list
(the cheap portion). This technique is also known as transform and conquer.
• Search and enumeration. Many problems (such as playing chess) can be
modeled as problems on graphs. A graph exploration algorithm specifies rules for
moving around a graph and is useful for such problems. This category also
includes search algorithms, branch and bound enumeration and backtracking.
• The probabilistic and heuristic paradigm. Algorithms belonging to this class fit
the definition of an algorithm more loosely.

1. Probabilistic algorithms are those that make some choices randomly (or pseudo-
randomly); for some problems, it can in fact be proven that the fastest solutions
must involve some randomness.
2. Genetic algorithms attempt to find solutions to problems by mimicking biological
evolutionary processes, with a cycle of random mutations yielding successive
generations of "solutions". Thus, they emulate reproduction and "survival of the
fittest". In genetic programming, this approach is extended to algorithms, by
regarding the algorithm itself as a "solution" to a problem.
3. Heuristic algorithms, whose general purpose is not to find an optimal solution, but
an approximate solution where the time or resources needed to find a perfect
solution are not practical. An example of this would be local search, tabu
search, or simulated annealing algorithms, a class of heuristic probabilistic
algorithms that vary the solution of a problem by a random amount. The name
"simulated annealing" alludes to the metallurgic term meaning the heating and
cooling of metal to achieve freedom from defects. The purpose of the random
variance is to find close to globally optimal solutions rather than simply locally
optimal ones, the idea being that the random element will be decreased as the
algorithm settles down to a solution.
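For the decrease and conquer example mentioned above, a minimal binary search sketch
(Python, assuming a sorted list; binary_search is an illustrative name) looks like this:

def binary_search(sorted_list, target):
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_list[mid] == target:
            return mid                  # found: return the index
        if sorted_list[mid] < target:
            low = mid + 1               # discard the lower half
        else:
            high = mid - 1              # discard the upper half
    return -1                           # target not present

print(binary_search([2, 3, 5, 7, 11, 13], 11))   # -> 4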

[] Classification by field of study

See also: List of algorithms

Every field of science has its own problems and needs efficient algorithms. Related
problems in one field are often studied together. Some example classes are search
algorithms, sorting algorithms, merge algorithms, numerical algorithms, graph
algorithms, string algorithms, computational geometric algorithms, combinatorial
algorithms, machine learning, cryptography, data compression algorithms and parsing
techniques.

Fields tend to overlap with each other, and algorithm advances in one field may improve
those of other, sometimes completely unrelated, fields. For example, dynamic
programming was originally invented for optimization of resource consumption in
industry, but is now used in solving a broad range of problems in many fields.

[] Classification by complexity
See also: Complexity class

Algorithms can be classified by the amount of time they need to complete compared to
their input size. There is a wide variety: some algorithms complete in linear time relative
to input size, some do so in an exponential amount of time or even worse, and some
never halt. Additionally, some problems may have multiple algorithms of differing
complexity, while other problems might have no algorithms or no known efficient
algorithms. There are also mappings from some problems to other problems. Owing to
this, it was found to be more suitable to classify the problems themselves instead of the
algorithms into equivalence classes based on the complexity of the best possible
algorithms for them.

[] Legal issues
See also: Software patents for a general overview of the patentability of software,
including computer-implemented algorithms.

Algorithms, by themselves, are not usually patentable. In the United States, a claim
consisting solely of simple manipulations of abstract concepts, numbers, or signals do not
constitute "processes" (USPTO 2006) and hence algorithms are not patentable (as in
Gottschalk v. Benson). However, practical applications of algorithms are sometimes
patentable. For example, in Diamond v. Diehr, the application of a simple feedback
algorithm to aid in the curing of synthetic rubber was deemed patentable. The patenting
of software is highly controversial, and there are highly criticized patents involving
algorithms, especially data compression algorithms, such as Unisys' LZW patent.

Additionally, some cryptographic algorithms have export restrictions (see export of


cryptography).
This short section requires expansion.

Flowchart

A flowchart (also spelled flow-chart and flow chart) is a schematic representation of an


algorithm or a process. A flowchart is one of the seven basic tools of quality control,
which also include the histogram, Pareto chart, check sheet, control chart, cause-and-
effect diagram, and scatter diagram (see Quality Management Glossary). They are
commonly used in business/economic presentations to help the audience visualize the
content better, or to find flaws in the process. Alternatively, one can use Nassi-
Shneiderman diagrams.
A flowchart is described as "cross-functional" when the page is divided into different
"lanes" describing the control of different organizational units. A symbol appearing in a
particular "lane" is within the control of that organizational unit. This technique allows
the analyst to locate the responsibility for performing an action or making a decision
correctly, and to show the relationship between the different organizational units that
share responsibility for a single process.

Computer architecture
A typical vision of a computer architecture as a series of abstraction layers: hardware,
firmware, assembler, kernel, operating system and applications (see also Tanenbaum 79).

In computer engineering, computer architecture is the conceptual design and
fundamental operational structure of a computer system. It is a blueprint and functional
description of requirements (especially speeds and interconnections) and design
implementations for the various parts of a computer, focusing largely on the way by
which the central processing unit (CPU) performs internally and accesses addresses in
memory.

It may also be defined as the science and art of selecting and interconnecting hardware
components to create computers that meet functional, performance and cost goals.

Computer architecture comprises at least three main subcategories:[1]

• Instruction set architecture, or ISA, is the abstract image of a computing system
that is seen by a machine language (or assembly language) programmer, including
the instruction set, memory address modes, processor registers, and address and
data formats.

• Microarchitecture, also known as computer organization, is a lower-level, more
concrete description of the system that involves how the constituent parts of the
system are interconnected and how they interoperate in order to implement the
ISA.[2] The size of a computer's cache, for instance, is an organizational issue that
generally has nothing to do with the ISA.

• System Design, which includes all of the other hardware components within a
computing system, such as:

1. system interconnects such as computer buses and switches
2. memory controllers and hierarchies
3. CPU off-load mechanisms such as direct memory access
4. issues like multi-processing.
Once both the ISA and the microarchitecture have been specified, the actual device needs
to be designed into hardware. This design process is often called implementation.
Implementation is usually not considered architectural definition, but rather hardware
design engineering.

Implementation can be further broken down into three pieces:

• Logic Implementation/Design - where the blocks that were defined in the
microarchitecture are implemented as logic equations.
• Circuit Implementation/Design - where speed-critical blocks, logic equations, or
logic gates are implemented at the transistor level.
• Physical Implementation/Design - where the circuits are drawn out, the different
circuit components are placed in a chip floor-plan or on a board, and the wires
connecting them are routed.

For CPUs, the entire implementation process is often called CPU design.

The term is also used for larger-scale hardware architectures, such as cluster computing
and Non-Uniform Memory Access (NUMA) architectures.

[] Design goals
The exact form of a computer system depends on the constraints and goals for which it
was optimized. Computer architectures usually trade off standards, cost, memory
capacity, latency and throughput. Sometimes other considerations, such as features, size,
weight, reliability, expandability and power consumption are factors as well.

The most common scheme carefully chooses the bottleneck that most reduces the
computer's speed. Ideally, the cost is allocated proportionally to assure that the data rate
is nearly the same for all parts of the computer, with the most costly part being the
slowest. This is how skillful commercial integrators optimize personal computers.

[] Cost

Generally cost is held constant, determined by either system or commercial requirements.

[] Performance

Computer performance is often described in terms of clock speed (usually in MHz or
GHz). This refers to the cycles per second of the main clock of the CPU. However, this
metric is somewhat misleading, as a machine with a higher clock rate may not necessarily
have higher performance. As a result, manufacturers have moved away from clock speed
as a sole measure of performance. Computer performance is also affected by the amount
of cache a processor contains: a fast clock is of little use if the processor is constantly
stalled waiting for data, much as a fast car gains little if it is repeatedly stopped at red
lights. In general, more clock speed combined with more cache makes for a faster
processor.

Modern CPUs can execute multiple instructions per clock cycle, which dramatically
speeds up a program. Other factors influence speed, such as the mix of functional units,
bus speeds, available memory, and the type and order of instructions in the programs
being run.
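
A hedged back-of-the-envelope calculation illustrates why clock speed alone is
misleading; the clock rate and instructions-per-cycle figures below are invented purely
for illustration.

    # Peak throughput is roughly instructions-per-cycle times clock rate.
    clock_hz = 3.0e9            # an assumed 3 GHz clock
    instructions_per_cycle = 4  # an assumed superscalar width of 4
    peak = clock_hz * instructions_per_cycle
    print(f"{peak:.1e} instructions per second at peak")  # 1.2e+10

A slower clock paired with a wider pipeline or a better cache can therefore outrun a
faster clock in practice.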

There are two main types of speed, latency and throughput. Latency is the time between
the start of a process and its completion. Throughput is the amount of work done per unit
time. Interrupt latency is the guaranteed maximum response time of the system to an
electronic event (e.g. when the disk drive finishes moving some data). Performance is
affected by a very wide range of design choices — for example, adding cache usually
makes latency worse (slower) but makes throughput better. Computers that control
machinery usually need low interrupt latencies. These computers operate in a real-time
environment and fail if an operation is not completed in a specified amount of time. For
example, computer-controlled anti-lock brakes must begin braking almost immediately
after they have been instructed to brake.

The performance of a computer can be measured using other metrics, depending upon its
application domain. A system may be CPU bound (as in numerical calculation), I/O
bound (as in a webserving application) or memory bound (as in video editing). Power
consumption has become important in servers and portable devices like laptops.

Benchmarking tries to take all these factors into account by measuring the time a
computer takes to run through a series of test programs. Although benchmarking shows
strengths, it may not help one to choose a computer. Often the measured machines split
on different measures. For example, one system might handle scientific applications
quickly, while another might play popular video games more smoothly. Furthermore,
designers have been known to add special features to their products, whether in hardware
or software, which permit a specific benchmark to execute quickly but which do not offer
similar advantages to other, more general tasks.
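
The core mechanism behind benchmarking can be sketched in a few lines: time a fixed
workload and report the elapsed wall-clock time. Real suites such as SPEC or TPC are far
more elaborate; the workload below is only a hypothetical placeholder.

    import time

    def workload():
        # stand-in workload; real benchmarks run representative programs
        return sum(i * i for i in range(1_000_000))

    start = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start
    print(f"elapsed: {elapsed:.3f} seconds")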

[] Power consumption

Power consumption is another design criterion that factors into the design of modern
computers. Power efficiency can often be traded for performance or cost benefits. With
the increasing power density of modern circuits as the number of transistors per chip
scales (Moore's Law), power efficiency has increased in importance. Recent processor
designs such as the Intel Core 2 put more emphasis on increasing power efficiency. Also,
in the world of embedded computing, power efficiency has long been, and remains, a
primary design goal next to performance.

[] Historical perspective
Early usage in computer context

The term “architecture” in computer literature can be traced to the work of Lyle R.
Johnson and Frederick P. Brooks, Jr., members in 1959 of the Machine Organization
department in IBM’s main research center. Johnson had occasion to write a proprietary
research communication about Stretch, an IBM-developed supercomputer for Los
Alamos Scientific Laboratory; in attempting to characterize his chosen level of detail for
discussing the luxuriously embellished computer, he noted that his description of formats,
instruction types, hardware parameters, and speed enhancements aimed at the level of
“system architecture” – a term that seemed more useful than “machine organization.”
Subsequently Brooks, one of the Stretch designers, started Chapter 2 of a book (Planning
a Computer System: Project Stretch, ed. W. Buchholz, 1962) by writing, “Computer
architecture, like other architecture, is the art of determining the needs of the user of a
structure and then designing to meet those needs as effectively as possible within
economic and technological constraints.” Brooks went on to play a major role in the
development of the IBM System/360 line of computers, where “architecture” gained
currency as a noun with the definition “what the user needs to know.” Later the computer
world would employ the term in many less-explicit ways.

The first mention of the term architecture in the refereed computer literature is in a 1964
article describing the IBM System/360. [3] The article defines architecture as the set of
“attributes of a system as seen by the programmer, i.e., the conceptual structure and
functional behavior, as distinct from the organization of the data flow and controls, the
logical design, and the physical implementation.” In the definition, the programmer
perspective of the computer’s functional behavior is key. The conceptual structure part of
an architecture description makes the functional behavior comprehensible, and
extrapolatable to a range of use cases. Only later on did ‘internals’ such as “the way by
which the CPU performs internally and accesses addresses in memory,” mentioned
above, slip into the definition of computer architecture.

A typical layered view of computer architecture, from top to bottom:
• OS and applications
• Kernel
• Assembler
• Firmware
• Hardware

Operating system
An operating system (OS) is the software that manages the sharing of the resources of a
computer. An operating system processes raw system data and user input, and responds
by allocating and managing tasks and internal system resources as a service to users and
programs of the system. At the foundation of all system software, an operating system
performs basic tasks such as controlling and allocating memory, prioritizing system
requests, controlling input and output devices, facilitating networking and managing file
systems. Most operating systems come with an application that provides a user interface
for managing the operating system, such as a command line interpreter or graphical user
interface. The operating system forms a platform for other system software and for
application software. Windows, Mac OS X, and Linux are three of the most popular
operating systems for personal computers.

[] Services
Main article: Kernel (computer science)

[] Process management

Every program running on a computer, be it a service or an application, is a process. As
long as a von Neumann architecture is used to build computers, only one process per
CPU can be run at a time. Older microcomputer OSes such as MS-DOS did not attempt
to bypass this limit, with the exception of interrupt processing, and only one process
could be run under them (although DOS itself featured TSRs, terminate-and-stay-resident
programs, as a very partial and not especially easy-to-use solution). Mainframe operating
systems have had multitasking capabilities
since the early 1960s. Modern operating systems enable concurrent execution of many
processes at once via multitasking even with one CPU. Process management is an
operating system's way of dealing with running multiple processes. Since most computers
contain one processor with one core, multitasking is done by simply switching processes
quickly. Depending on the operating system, as more processes run, either each time slice
will become smaller or there will be a longer delay before each process is given a chance
to run. Process management involves computing and distributing CPU time as well as
other resources. Most operating systems allow a process to be assigned a priority which
affects its allocation of CPU time. Interactive operating systems also employ some level
of feedback in which the task with which the user is working receives higher priority.
Interrupt driven processes will normally run at a very high priority. In many systems
there is a background process, such as the System Idle Process in Windows, which will
run when no other process is waiting for the CPU.
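
The time-slicing described above can be illustrated with a toy round-robin simulation;
this is not how any particular kernel is implemented, and the process names and CPU
demands are invented.

    from collections import deque

    def round_robin(burst_times, time_slice):
        ready = deque(burst_times.items())    # (process name, remaining CPU time)
        schedule = []
        while ready:
            name, remaining = ready.popleft()
            schedule.append(name)             # this process gets the CPU for one slice
            remaining -= time_slice
            if remaining > 0:
                ready.append((name, remaining))  # unfinished: back of the queue
        return schedule

    print(round_robin({"A": 3, "B": 5, "C": 2}, time_slice=2))
    # ['A', 'B', 'C', 'A', 'B', 'B']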

[] Memory management

Current computer architectures arrange the computer's memory in a hierarchical manner,
starting from the fastest registers, CPU cache, random access memory and disk storage.
An operating system's memory manager coordinates the use of these various types of
memory by tracking which one is available, which is to be allocated or deallocated and
how to move data between them. This activity, usually referred to as virtual memory
management, increases the amount of memory available for each process by making the
disk storage seem like main memory. There is a speed penalty associated with using disks
or other slower storage as memory – if running processes require significantly more
RAM than is available, the system may start thrashing. This can happen either because
one process requires a large amount of RAM or because two or more processes compete
for a larger amount of memory than is available. This then leads to constant transfer of
each process's data to slower storage.

Another important part of memory management is managing virtual addresses. If
multiple processes are in memory at once, they must be prevented from interfering with
each other's memory (unless there is an explicit request to utilise shared memory). This is
achieved by having separate address spaces. Each process sees the whole virtual address
space, typically from address 0 up to the maximum size of virtual memory, as uniquely
assigned to it. The operating system maintains a page table that maps virtual addresses to
physical addresses. These memory allocations are tracked so that when a process
terminates, all memory used by that process can be made available for other processes.
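
A simplified sketch of the translation a page table performs follows; the page size, table
contents and addresses are invented, and real hardware does this in the memory-management
unit, usually with multi-level tables.

    PAGE_SIZE = 4096  # bytes; a typical value, assumed here for illustration

    # Hypothetical per-process page table: virtual page number -> physical frame number
    page_table = {0: 7, 1: 3, 2: 11}

    def translate(virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in page_table:
            # a real OS would handle the page fault, e.g. by loading the page from disk
            raise MemoryError("page fault: page not resident")
        return page_table[page] * PAGE_SIZE + offset

    print(hex(translate(0x1ABC)))  # virtual page 1, offset 0xABC -> 0x3abc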

The operating system can also write inactive memory pages to secondary storage. This
process is called "paging" or "swapping" – the terminology varies between operating
systems.

It is also typical for operating systems to employ otherwise unused physical memory as a
page cache; requests for data from a slower device can be retained in memory to improve
performance. The operating system can also pre-load the in-memory cache with data that
may be requested by the user in the near future; SuperFetch is an example of this.

[] Disk and file systems

All operating systems include support for a variety of file systems.

Modern file systems comprise a hierarchy of directories. While the idea is conceptually
similar across all general-purpose file systems, some differences in implementation exist.
Two noticeable examples of this are the character used to separate directories, and case
sensitivity.

Unix demarcates its path components with a slash (/), a convention followed by operating
systems that emulated it or at least its concept of hierarchical directories, such as Linux,
Amiga OS and Mac OS X. MS-DOS also emulated this feature, but had already also
adopted the CP/M convention of using slashes for additional options to commands, so
instead used the backslash (\) as its component separator. Microsoft Windows continues
with this convention; Japanese editions of Windows use ¥, and Korean editions use
₩.[citation needed] Versions of Mac OS prior to OS X use a colon (:) for a path separator. RISC
OS uses a period (.).
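
The separator differences can be seen directly with Python's standard path modules; this
is only a small illustration of the conventions described above.

    import ntpath
    import posixpath

    # The same logical path joined under each convention.
    print(posixpath.join("usr", "local", "bin"))   # usr/local/bin  (Unix-style '/')
    print(ntpath.join("Program Files", "App"))     # Program Files\App  (Windows-style '\')
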
Unix and Unix-like operating systems allow for any character in file names other than the
slash (including line feed (LF) and other control characters). Unix file names are case
sensitive, which allows multiple files to be created with names that differ only in case. By
contrast, Microsoft Windows file names are not case sensitive by default. Windows also
has a larger set of punctuation characters that are not allowed in file names.

File systems may provide journaling, which provides safe recovery in the event of a
system crash. A journaled file system writes information twice: first to the journal, which
is a log of file system operations, then to its proper place in the ordinary file system. In
the event of a crash, the system can recover to a consistent state by replaying a portion of
the journal. In contrast, non-journaled file systems typically need to be examined in their
entirety by a utility such as fsck or chkdsk. Soft updates is an alternative to journalling
that avoids the redundant writes by carefully ordering the update operations. Log-
structured file systems and ZFS also differ from traditional journaled file systems in that
they avoid inconsistencies by always writing new copies of the data, eschewing in-place
updates.
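
The write-ahead idea behind journaling can be sketched as follows; this is a toy model,
not the code of any real file system, and the in-memory dictionaries merely stand in for
on-disk structures.

    # Toy write-ahead journaling sketch.
    journal = []        # stand-in for the durable log of intended operations
    filesystem = {}     # stand-in for the ordinary file system state

    def journaled_write(name, data):
        journal.append(("write", name, data))  # 1. record the intent in the journal
        filesystem[name] = data                # 2. apply it to the file system proper
        journal.clear()                        # 3. done; retire the journal entry

    def recover_after_crash():
        # Replay operations that were journaled but may not have been applied.
        for op, name, data in journal:
            if op == "write":
                filesystem[name] = data
        journal.clear()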

Many Linux distributions support some or all of ext2, ext3, ReiserFS, Reiser4, GFS,
GFS2, OCFS, OCFS2, and NILFS. Linux also has full support for XFS and JFS, along
with the FAT file systems, and NTFS.

Microsoft Windows includes support for FAT12, FAT16, FAT32, and NTFS. The NTFS
file system is the most efficient and reliable of the four Windows file systems and, as of
Windows Vista, is the only file system on which the operating system can be installed.
Windows Embedded CE 6.0 introduced ExFAT, a file system suitable for flash drives.

Mac OS X supports HFS+ as its primary file system, and it supports several other file
systems as well, including FAT16, FAT32, NTFS and ZFS.

Common to all these (and other) operating systems is support for file systems typically
found on removable media. FAT12 is the file system most commonly found on floppy
discs. ISO 9660 and Universal Disk Format are two common formats that target Compact
Discs and DVDs, respectively. Mount Rainier is a newer extension to UDF, supported by
Linux 2.6 kernels and Windows Vista, that facilitates rewriting to DVDs in the same
fashion as has long been possible with floppy disks.

[] Networking

Most current operating systems are capable of using the TCP/IP networking protocols.
This means that one system can appear on a network with another and share resources
such as files, printers, and scanners using either wired or wireless connections.

Many operating systems also support one or more vendor-specific legacy networking
protocols as well, for example, SNA on IBM systems, DECnet on systems from Digital
Equipment Corporation, and Microsoft-specific protocols on Windows. Specific
protocols for specific tasks may also be supported such as NFS for file access.
[] Security

Many operating systems include some level of security. Security is based on the two
ideas that:

• The operating system provides access to a number of resources, directly or
indirectly, such as files on a local disk, privileged system calls, personal
information about users, and the services offered by the programs running on the
system;
• The operating system is capable of distinguishing between some requesters of
these resources who are authorized (allowed) to access the resource, and others
who are not authorized (forbidden). While some systems may simply distinguish
between "privileged" and "non-privileged", systems commonly have a form of
requester identity, such as a user name. Requesters, in turn, divide into two
categories:
o Internal security: an already running program. On some systems, a
program once it is running has no limitations, but commonly the program
has an identity which it keeps and is used to check all of its requests for
resources.
o External security: a new request from outside the computer, such as a
login at a connected console or some kind of network connection. To
establish identity there may be a process of authentication. Often a
username must be quoted, and each username may have a password. Other
methods of authentication, such as magnetic cards or biometric data, might
be used instead. In some cases, especially connections from the network,
resources may be accessed with no authentication at all.

In addition to the allow/disallow model of security, a system with a high level of security
will also offer auditing options. These would allow tracking of requests for access to
resources (such as, "who has been reading this file?").

Security of operating systems has long been a concern because of highly sensitive data
held on computers, both of a commercial and military nature. The United States
Government Department of Defense (DoD) created the Trusted Computer System
Evaluation Criteria (TCSEC) which is a standard that sets basic requirements for
assessing the effectiveness of security. This became of vital importance to operating
system makers, because the TCSEC was used to evaluate, classify and select computer
systems being considered for the processing, storage and retrieval of sensitive or
classified information.

[] Internal security

Internal security can be thought of as protecting the computer's resources from the
programs concurrently running on the system. Most operating systems set programs
running natively on the computer's processor, so the problem arises of how to stop these
programs doing the same task and having the same privileges as the operating system
(which is after all just a program too). Processors used for general purpose operating
systems generally have a hardware concept of privilege. Generally less privileged
programs are automatically blocked from using certain hardware instructions, such as
those to read or write from external devices like disks. Instead, they have to ask the
privileged program (operating system kernel) to read or write. The operating system
therefore gets the chance to check the program's identity and allow or refuse the request.

An alternative strategy, and the only sandbox strategy available in systems that do not
meet the Popek and Goldberg virtualization requirements, is for the operating system not
to run user programs as native code, but instead to either emulate a processor or provide
a host for a p-code based system such as Java.

Internal security is especially relevant for multi-user systems; it allows each user of the
system to have private files that the other users cannot tamper with or read. Internal
security is also vital if auditing is to be of any use, since a program can potentially bypass
the operating system, inclusive of bypassing auditing.

[] External security

Typically an operating system offers (or hosts) various services to other network
computers and users. These services are usually provided through ports or numbered
access points beyond the operating system's network address. Services include offerings
such as file sharing, print services, email, web sites, and file transfer protocols (FTP),
most of which can have their security compromised.

At the front line of security are hardware devices known as firewalls or intrusion
detection/prevention systems. At the operating system level, there are a number of
software firewalls available, as well as intrusion detection/prevention systems. Most
modern operating systems include a software firewall, which is enabled by default. A
software firewall can be configured to allow or deny network traffic to or from a service
or application running on the operating system. Therefore, one can install and be running
an insecure service, such as Telnet or FTP, and not have to be threatened by a security
breach because the firewall would deny all traffic trying to connect to the service on that
port.

[] Graphical user interfaces

Today, most modern operating systems contain Graphical User Interfaces (GUIs,
pronounced "goo-eez"). A few older operating systems tightly integrated the GUI into the
kernel; in the original implementations of Microsoft Windows and Mac OS, for example,
the graphical subsystem was actually part of the operating system. More modern
operating systems are modular, separating the graphics subsystem from the kernel (as is
now done in Linux and Mac OS X) so that the graphics subsystem is not part of the OS
at all.
Many operating systems allow the user to install or create any user interface they desire.
The X Window System in conjunction with GNOME or KDE is a commonly found setup
on most Unix and Unix derivative (BSD, Linux, Minix) systems.

Graphical user interfaces evolve over time. For example, Windows has modified its user
interface almost every time a new major version of Windows is released, and the Mac OS
GUI changed dramatically with the introduction of Mac OS X in 2001.

[] Device drivers

A device driver is a specific type of computer software developed to allow interaction
with hardware devices. Typically this constitutes an interface for communicating with the
device, through the specific computer bus or communications subsystem that the
hardware is connected to, providing commands to and/or receiving data from the device,
and, on the other end, the requisite interfaces to the operating system and software
applications. It is a specialized, hardware-dependent and operating-system-specific
program that enables another program (typically the operating system, an application
package, or a program running under the operating system kernel) to interact
transparently with a hardware device, and it usually provides the interrupt handling
required for asynchronous, time-dependent hardware interfacing.

The key design goal of device drivers is abstraction. Every model of hardware (even
within the same class of device) is different. Manufacturers also release newer models
that provide more reliable or better performance, and these newer models are often
controlled differently. Computers and their operating systems cannot be expected to
know how to control every device, both now and in the future. To solve this problem,
OSes essentially dictate how every type of device should be controlled. The function of
the device driver is then to translate these OS-mandated function calls into device-
specific calls. In theory a new device, which is controlled in a new manner, should
function correctly if a suitable driver is available. This new driver ensures that the device
appears to operate as usual from the operating system's point of view.
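
The abstraction described above can be sketched as a generic interface that each driver
implements; the device class and driver below are entirely hypothetical.

    from abc import ABC, abstractmethod

    class BlockDevice(ABC):
        """Generic interface the OS expects every block-device driver to provide."""
        @abstractmethod
        def read_block(self, block_number): ...
        @abstractmethod
        def write_block(self, block_number, data): ...

    class FictionalDiskDriver(BlockDevice):
        """Hypothetical driver translating generic calls into device-specific commands."""
        def read_block(self, block_number):
            return self._send_command(f"READ {block_number}")
        def write_block(self, block_number, data):
            return self._send_command(f"WRITE {block_number} {data!r}")
        def _send_command(self, command):
            # A real driver would talk to the hardware over a bus here.
            return f"device executed: {command}"

    print(FictionalDiskDriver().read_block(42))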

[] History
Main article: History of operating systems

The first computers did not have operating systems. By the early 1960s, commercial
computer vendors were supplying quite extensive tools for streamlining the development,
scheduling, and execution of jobs on batch processing systems. Examples were produced
by UNIVAC and Control Data Corporation, amongst others.

[] Mainframes
Through the 1960s, several major concepts were developed, driving the development of
operating systems. The development of the IBM System/360 produced a family of
mainframe computers available in widely differing capacities and price points, for which
a single operating system OS/360 was planned (rather than developing ad-hoc programs
for every individual model). This concept of a single OS spanning an entire product line
was crucial for the success of System/360 and, in fact, IBM's current mainframe
operating systems are distant descendants of this original system; applications written for
the OS/360 can still be run on modern machines. OS/360 also contained another
important advance: the development of the hard disk permanent storage device (which
IBM called DASD). Another key development was the concept of time-sharing: the idea
of sharing the resources of expensive computers amongst multiple computer users
interacting in real time with the system. Time sharing allowed all of the users to have the
illusion of having exclusive access to the machine; the Multics timesharing system was
the most famous of a number of new operating systems developed to take advantage of
the concept.

[] Midrange systems

Multics, particularly, was an inspiration to a number of operating systems developed in
the 1970s, notably Unix by Dennis Ritchie and Ken Thompson. Another commercially
popular minicomputer operating system was VMS.

[] Microcomputer era

The first microcomputers did not have the capacity or need for the elaborate operating
systems that had been developed for mainframes and minis; minimalistic operating
systems were developed, often loaded from ROM and known as Monitors. One notable
early disk-based operating system was CP/M, which was supported on many early
microcomputers and was largely cloned in creating MS-DOS, which became wildly
popular as the operating system chosen for the IBM PC (IBM's version of it was called
IBM-DOS or PC-DOS), its successors making Microsoft one of the world's most
profitable companies. The major alternative throughout the 1980s in the microcomputer
market was Mac OS, tied intimately to the Apple Macintosh computer.

By the 1990s, the microcomputer had evolved to the point where, as well as extensive
GUI facilities, the robustness and flexibility of operating systems of larger computers
became increasingly desirable. Microsoft's response to this change was the development
of Windows NT, which served as the basis for Microsoft's desktop operating system line
starting in 2001. Apple rebuilt their operating system on top of a Unix core as Mac OS X,
also released in 2001. Hobbyist-developed reimplementations of Unix, assembled with
the tools from the GNU Project, also became popular; versions based on the Linux kernel
are by far the most popular, with the BSD derived UNIXes holding a small portion of the
server market.

The growing complexity of embedded devices has led to increasing use of embedded
operating systems.
[] Today
Modern operating systems usually feature a graphical user interface (GUI) which uses a
pointing device such as a mouse or stylus for input, in addition to the keyboard. Older
models, and operating systems not designed for direct human interaction (such as web
servers), generally use a command line interface (CLI), typically with only the keyboard
for input. Both models are centered around a "shell" which accepts and processes
commands from the user (e.g. clicking on a button, or a typed command at a prompt).

The choice of OS may be dependent on the hardware architecture, specifically the CPU,
with only Linux and BSD running on almost any CPU. Windows NT 3.1, which is no
longer supported, was ported to the DEC Alpha and MIPS Magnum. Since the mid-
1990s, the most commonly used operating systems have been the Microsoft Windows
family, Linux, and other Unix-like operating systems, most notably Mac OS X.
Mainframe computers and embedded systems use a variety of different operating
systems, many with no direct connection to Windows or Unix. QNX and VxWorks are
two common embedded operating systems, the latter being used in network infrastructure
hardware equipment.

[] Personal computers

• IBM PC compatible - Microsoft Windows, Unix variants, and Linux variants.
• Apple Macintosh - Mac OS X (a Unix variant), Windows (on x86 Macintosh
machines only), Linux and BSD

[] Mainframe computers

The earliest operating systems were developed for mainframe computer architectures in
the 1960s. The enormous investment in software for these systems caused most of the
original computer manufacturers to continue to develop hardware and operating systems
that are compatible with those early operating systems. Those early systems pioneered
many of the features of modern operating systems. Mainframe operating systems that are
still supported include:

• Burroughs MCP -- B5000, 1961 to Unisys Clearpath/MCP, present.
• IBM OS/360 -- IBM System/360, 1964 to IBM zSeries, present.
• UNIVAC EXEC 8 -- UNIVAC 1108, 1964, to Unisys Clearpath IX, present.

Modern mainframes typically also run Linux or Unix variants. A "Datacenter" variant of
Windows Server 2003 is also available for some mainframe systems.

[] Embedded systems
Embedded systems use a variety of dedicated operating systems. In some cases, the
"operating system" software is directly linked to the application to produce a monolithic
special-purpose program. In the simplest embedded systems, there is no distinction
between the OS and the application. Embedded systems that have strict timing
requirements use real-time operating systems.

[] Unix-like operating systems

A customized KDE desktop running under Linux.

The Unix-like family is a diverse group of operating systems, with several major sub-
categories including System V, BSD, and Linux. The name "UNIX" is a trademark of
The Open Group which licenses it for use with any operating system that has been shown
to conform to their definitions. "Unix-like" is commonly used to refer to the large set of
operating systems which resemble the original Unix.

Unix systems run on a wide variety of machine architectures. They are used heavily as
server systems in business, as well as workstations in academic and engineering
environments. Free software Unix variants, such as Linux and BSD, are popular in these
areas. Unix and Unix-like systems have not reached significant market share in the
consumer and corporate desktop market, although there is some growth in this area,
notably by the Ubuntu Linux distribution. Linux on the desktop is also popular in the
developer and hobbyist operating system development communities. (see below)

Market share statistics for freely available operating systems are usually inaccurate since
most free operating systems are not purchased, making usage under-represented. On the
other hand, market share statistics based on total downloads of free operating systems are
often inflated, as there is no economic disincentive to acquiring multiple operating systems;
users can download several, test them, and decide which they like best.

Some Unix variants like HP's HP-UX and IBM's AIX are designed to run only on that
vendor's hardware. Others, such as Solaris, can run on multiple types of hardware,
including x86 servers and PCs. Apple's Mac OS X, a hybrid kernel-based BSD variant
derived from NeXTSTEP, Mach, and FreeBSD, has replaced Apple's earlier (non-Unix)
Mac OS.

[] Open source

Over the past several years, the trend in the Unix and Unix-like space has been to open
source operating systems. Many areas previously dominated by UNIX have seen
significant inroads by Linux; Solaris source code is now the basis of the OpenSolaris
project.
The team at Bell Labs that designed and developed Unix went on to develop Plan 9 and
Inferno, which were designed for modern distributed environments. They had graphics
built-in, unlike Unix counterparts that added it to the design later. Plan 9 did not become
popular because, unlike many Unix distributions, it was not originally free. It has since
been released under the free software and open source Lucent Public License, and has an
expanding community of developers. Inferno was sold to Vita Nuova and has been
released under a GPL/MIT license.

[] Microsoft Windows

A Vista desktop launched for the first time.

The Microsoft Windows family of operating systems originated as a graphical layer on
top of the older MS-DOS environment for the IBM PC. Modern versions are based on the
newer Windows NT core that first took shape in OS/2 and borrowed from VMS.
Windows runs on 32-bit and 64-bit Intel and AMD processors, although earlier versions
also ran on the DEC Alpha, MIPS, Fairchild (later Intergraph) Clipper and PowerPC
architectures (some work was done to port it to the SPARC architecture).

As of July 2007, Microsoft Windows held a large share of the worldwide desktop
market, although some predict this to dwindle because Microsoft's restrictive licensing,
(CD-Key) registration, and customer practices are causing increased interest in open
source operating systems. Windows is also used on low-end and mid-
range servers, supporting applications such as web servers and database servers. In recent
years, Microsoft has spent significant marketing and research & development money to
demonstrate that Windows is capable of running any enterprise application, which has
resulted in consistent price/performance records (see the TPC) and significant acceptance
in the enterprise market.

The most widely used version of the Microsoft Windows family is Microsoft Windows
XP, released on October 25, 2001. The latest release of Windows XP is Windows XP
Service Pack 2, released on August 6, 2004.

In November 2006, after more than five years of development work, Microsoft released
Windows Vista, a major new version of Microsoft Windows which contains a large
number of new features and architectural changes. Chief amongst these are a new user
interface and visual style called Windows Aero, a number of new security features such
as User Account Control, and new multimedia applications such as Windows DVD
Maker.

[] Mac OS X
Apple's Upcoming Mac OS X v10.5 ("Leopard")

Mac OS X is a line of proprietary, graphical operating systems developed, marketed, and
sold by Apple Inc., the latest of which is pre-loaded on all currently shipping Macintosh
computers. Mac OS X is the successor to the original Mac OS, which had been Apple's
primary operating system since 1984. Unlike its predecessor, Mac OS X is a UNIX
operating system built on technology that had been developed at NeXT through the
second half of the 1980s and up until Apple purchased the company in early 1997.

The operating system was first released in 1999 as Mac OS X Server 1.0, with a desktop-
oriented version (Mac OS X v10.0) following in March 2001. Since then, four more
distinct "end-user" and "server" editions of Mac OS X have been released, the most
recent being Mac OS X v10.4, which was first made available in April 2005. Releases of
Mac OS X are named after big cats; Mac OS X v10.4 is usually referred to by Apple and
users as "Tiger". In October 2007, Apple will release Mac OS X 10.5, nicknamed
"Leopard".

The server edition, Mac OS X Server, is architecturally identical to its desktop
counterpart but usually runs on Apple's line of Macintosh server hardware. Mac OS X
Server includes workgroup management and administration software tools that provide
simplified access to key network services, including a mail transfer agent, a Samba
server, an LDAP server, a domain name server, and others.

[] Hobby operating system development

Operating system development, or OSDev for short, has a large cult following as a
hobby. Some operating systems, such as Linux, began as hobby operating system
projects. The design and implementation of an operating system requires skill and
determination, and the term can cover anything from a basic "Hello World" boot loader to
a fully featured kernel.

[] Other

Mainframe operating systems, such as IBM's z/OS, and embedded operating systems
such as VxWorks, eCos, and Palm OS, are usually unrelated to Unix and Windows.
Exceptions include Windows CE, Windows NT Embedded 4.0 and Windows XP
Embedded, which are descendants of Windows, and the several *BSDs and Linux
distributions tailored for embedded systems. OpenVMS from Hewlett-Packard (formerly
DEC) is still under active development.

Older operating systems which are still used in niche markets include OS/2 from IBM;
Mac OS, the non-Unix precursor to Apple's Mac OS X; BeOS; XTS-300.

Operating systems that were popular prior to the dot-com era, such as AmigaOS and
RISC OS, continue to be developed as minority platforms for enthusiast communities and
specialist applications.
Research and development of new operating systems continues. GNU Hurd is designed
to be backwards compatible with Unix, but with enhanced functionality and a
microkernel architecture. Singularity is a project at Microsoft Research to develop an
operating system with better memory protection based on the .NET managed code
model.

Unix

Unix (officially trademarked as UNIX®) is a computer operating system originally
developed in 1969 by a group of AT&T employees at Bell Labs, including Ken
Thompson, Dennis Ritchie and Douglas McIlroy. Today's Unix systems are split into
various branches, developed over time by AT&T as well as various commercial vendors
and non-profit organizations.

As of 2007, the owner of the trademark UNIX® is The Open Group, an industry
standards consortium. Only systems fully compliant with and certified to the Single
UNIX Specification qualify as "UNIX®" (others are called "Unix system-like" or "Unix-
like").

During the late 1970s and early 1980s, Unix's influence in academic circles led to large-
scale adoption of Unix (particularly of the BSD variant, originating from the University
of California, Berkeley) by commercial startups, the most notable of which is Sun
Microsystems. Today, in addition to certified Unix systems, Unix-like operating systems
such as Linux and BSD derivatives are commonly encountered.

Sometimes, "traditional Unix" may be used to describe a Unix or an operating system


that has the characteristics of either Version 7 Unix or UNIX System V.

[] Overview
Unix operating systems are widely used in both servers and workstations. The Unix
environment and the client-server program model were essential elements in the
development of the Internet and the reshaping of computing as centered in networks
rather than in individual computers.

Both Unix and the C programming language were developed by AT&T and distributed to
government and academic institutions, causing both to be ported to a wider variety of
machine families than any other operating system. As a result, Unix became synonymous
with "open systems".

Unix was designed to be portable, multi-tasking and multi-user in a time-sharing
configuration. Unix systems are characterized by various concepts: the use of plain text
for storing data; a hierarchical file system; treating devices and certain types of inter-
process communication (IPC) as files; and the use of a large number of small programs
that can be strung together through a command line interpreter using pipes, as opposed to
using a single monolithic program that includes all of the same functionality. These
concepts are known as the Unix philosophy.
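
A small illustration of this philosophy, chaining two ordinary programs with a pipe from
Python, follows; the commands assume a typical Unix userland with ls and sort installed.

    import subprocess

    # Equivalent of the shell pipeline:  ls | sort -r
    ls = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
    sort = subprocess.Popen(["sort", "-r"], stdin=ls.stdout, stdout=subprocess.PIPE)
    ls.stdout.close()  # allow ls to receive SIGPIPE if sort exits first
    print(sort.communicate()[0].decode())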

Under Unix, the "operating system" consists of many of these utilities along with the
master control program, the kernel. The kernel provides services to start and stop
programs, handle the file system and other common "low level" tasks that most programs
share, and, perhaps most importantly, schedules access to hardware to avoid conflicts if
two programs try to access the same resource or device simultaneously. To mediate such
access, the kernel was given special rights on the system, leading to the division between
user-space and kernel-space.

The microkernel approach tried to reverse the growing size of kernels and return to a system in
which most tasks were completed by smaller utilities. In an era when a "normal"
computer consisted of a hard disk for storage and a data terminal for input and output
(I/O), the Unix file model worked quite well as most I/O was "linear". However, modern
systems include networking and other new devices. Describing a graphical user interface
driven by mouse control in an "event driven" fashion didn't work well under the old
model. Work on systems supporting these new devices in the 1980s led to facilities for
non-blocking I/O, forms of inter-process communications other than just pipes, as well as
moving functionality such as network protocols out of the kernel.

[] History

A partial list of simultaneously running processes on a Unix system.

In the 1960s, the Massachusetts Institute of Technology, AT&T Bell Labs, and General
Electric worked on an experimental operating system called Multics (Multiplexed
Information and Computing Service), which was designed to run on the GE-645
mainframe computer. The aim was the creation of a commercial product, although this
was never a great success. Multics was an interactive operating system with many novel
capabilities, including enhanced security. The project did develop production releases,
but initially these releases performed poorly.

AT&T Bell Labs pulled out and deployed its resources elsewhere. One of the developers
on the Bell Labs team, Ken Thompson, continued to develop for the GE-645 mainframe,
and wrote a game for that computer called Space Travel.[1] However, he found that the
game was too slow on the GE machine and was expensive, costing $75 per execution in
scarce computing time.[2]

Thompson thus re-wrote the game in assembly language for Digital Equipment
Corporation's PDP-7 with help from Dennis Ritchie. This experience, combined with his
work on the Multics project, led Thompson to start a new operating system for the PDP-
7. Thompson and Ritchie led a team of developers, including Rudd Canaday, at Bell Labs
developing a file system as well as the new multi-tasking operating system itself. They
included a command line interpreter and some small utility programs.[3]

Editing a shell script using the ed editor. The dollar-sign at the top of the screen is the
prompt printed by the shell. 'ed' is typed to start the editor, which takes over from that
point on the screen downwards.

[] 1970s

In 1970 the project was named Unics, and could - eventually - support two simultaneous
users. Brian Kernighan invented this name as a contrast to Multics; the spelling was later
changed to Unix.

Up until this point there had been no financial support from Bell Labs. When the
Computer Science Research Group wanted to use Unix on a much larger machine than
the PDP-7, Thompson and Ritchie managed to trade the promise of adding text
processing capabilities to Unix for a PDP-11/20 machine. This led to some financial
support from Bell. For the first time in 1970, the Unix operating system was officially
named and ran on the PDP-11/20. It added a text formatting program called roff and a
text editor. All three were written in PDP-11/20 assembly language. Bell Labs used this
initial "text processing system", made up of Unix, roff, and the editor, for text processing
of patent applications. Roff soon evolved into troff, the first electronic publishing
program with a full typesetting capability. The UNIX Programmer's Manual was
published on November 3, 1971.

In 1973, Unix was rewritten in the C programming language, contrary to the general
notion at the time "that something as complex as an operating system, which must deal
with time-critical events, had to be written exclusively in assembly language" [4]. The
migration from assembly language to the higher-level language C resulted in much more
portable software, requiring only a relatively small amount of machine-dependent code to
be replaced when porting Unix to other computing platforms.

AT&T made Unix available to universities and commercial firms, as well as the United
States government under licenses. The licenses included all source code including the
machine-dependent parts of the kernel, which were written in PDP-11 assembly code.
Copies of the annotated Unix kernel sources circulated widely in the late 1970s in the
form of a much-copied book by John Lions of the University of New South Wales, the
Lions' Commentary on UNIX 6th Edition, with Source Code, which led to considerable
use of Unix as an educational example.

Versions of the Unix system were determined by editions of its user manuals, so that (for
example) "Fifth Edition UNIX" and "UNIX Version 5" have both been used to designate
the same thing. Development expanded, with Versions 4, 5, and 6 being released by
1975. These versions added the concept of pipes, leading to the development of a more
modular code-base, increasing development speed still further. Version 5 and especially
Version 6 led to a plethora of different Unix versions both inside and outside Bell Labs,
including PWB/UNIX, IS/1 (the first commercial Unix), and the University of
Wollongong's port to the Interdata 7/32 (the first non-PDP Unix).

In 1978, UNIX/32V, for the VAX system, was released. By this time, over 600 machines
were running Unix in some form. Version 7 Unix, the last version of Research Unix to be
released widely, was released in 1979. Versions 8, 9 and 10 were developed through the
1980s but were only released to a few universities, though they did generate papers
describing the new work. This research led to the development of Plan 9 from Bell Labs,
a new portable distributed system.

[] 1980s

A late-80s style Unix desktop running the X Window System graphical user interface.
Shown are a number of client applications common to the MIT X Consortium's
distribution, including Tom's Window Manager, an X Terminal, Xbiff, xload, and a
graphical manual page browser.

AT&T now licensed UNIX System III, based largely on Version 7, for commercial use,
the first version launching in 1982. This also included support for the VAX. AT&T
continued to issue licenses for older Unix versions. To end the confusion between all its
differing internal versions, AT&T combined them into UNIX System V Release 1. This
introduced a few features such as the vi editor and curses from the Berkeley Software
Distribution of Unix developed at the University of California, Berkeley. This also
included support for the Western Electric 3B series of machines.

Since the newer commercial UNIX licensing terms were not as favorable for academic
use as the older versions of Unix, the Berkeley researchers continued to develop BSD
Unix as an alternative to UNIX System III and V, originally on the PDP-11 architecture
(the 2.xBSD releases, ending with 2.11BSD) and later for the VAX-11 (the 4.x BSD
releases). Many contributions to Unix first appeared on BSD systems, notably the C shell
with job control (modelled on ITS). Perhaps the most important aspect of the BSD
development effort was the addition of TCP/IP network code to the mainstream Unix
kernel. The BSD effort produced several significant releases that contained network code:
4.1cBSD, 4.2BSD, 4.3BSD, 4.3BSD-Tahoe ("Tahoe" being the nickname of the
Computer Consoles Inc. Power 6/32 architecture that was the first non-DEC release of
the BSD kernel), Net/1, 4.3BSD-Reno (named to match the "Tahoe" naming and to suggest
that the release was something of a gamble), Net/2, 4.4BSD, and 4.4BSD-lite. The network code found
in these releases is the ancestor of much TCP/IP network code in use today, including
code that was later released in AT&T System V UNIX and early versions of Microsoft
Windows. The accompanying Berkeley Sockets API is a de facto standard for networking
APIs and has been copied on many platforms.
Other companies began to offer commercial versions of the UNIX System for their own
mini-computers and workstations. Most of these new Unix flavors were developed from
the System V base under a license from AT&T; however, others were based on BSD
instead. One of the leading developers of BSD, Bill Joy, went on to co-found Sun
Microsystems in 1982 and create SunOS (now Solaris) for their workstation computers.
In 1980, Microsoft announced its first Unix for 16-bit microcomputers called Xenix,
which the Santa Cruz Operation (SCO) ported to the Intel 8086 processor in 1983, and
eventually branched Xenix into SCO UNIX in 1989.

For a few years during this period (before PC compatible computers with MS-DOS
became dominant), industry observers expected that UNIX, with its portability and rich
capabilities, was likely to become the industry standard operating system for
microcomputers.[5] In 1984 several companies established the X/Open consortium with
the goal of creating an open system specification based on UNIX. Despite early progress,
the standardization effort collapsed into the "Unix wars," with various companies
forming rival standardization groups. The most successful Unix-related standard turned
out to be the IEEE's POSIX specification, designed as a compromise API readily
implemented on both BSD and System V platforms, published in 1988 and soon
mandated by the United States government for many of its own systems.

AT&T added various features into UNIX System V, such as file locking, system
administration, streams, new forms of IPC, the Remote File System and TLI. AT&T
cooperated with Sun Microsystems and between 1987 and 1989 merged features from
Xenix, BSD, SunOS, and System V into System V Release 4 (SVR4), independently of
X/Open. This new release consolidated all the previous features into one package, and
heralded the end of competing versions. It also increased licensing fees.

During this time a number of vendors including Digital Equipment, Sun, Addamax and
others began building trusted versions of UNIX for high security applications, mostly
designed for military and law enforcement applications.

The Common Desktop Environment or CDE, a graphical desktop for Unix co-developed
in the 1990s by HP, IBM, and Sun as part of the COSE initiative.

[] 1990s

In 1990, the Open Software Foundation released OSF/1, their standard Unix
implementation, based on Mach and BSD. The Foundation was started in 1988 and was
funded by several Unix-related companies that wished to counteract the collaboration of
AT&T and Sun on SVR4. Subsequently, AT&T and another group of licensees formed
the group "UNIX International" in order to counteract OSF. This escalation of conflict
between competing vendors gave rise again to the phrase "Unix wars".
In 1991, a group of BSD developers (Donn Seeley, Mike Karels, Bill Jolitz, and Trent
Hein) left the University of California to found Berkeley Software Design, Inc (BSDI).
BSDI produced a fully functional commercial version of BSD Unix for the inexpensive
and ubiquitous Intel platform, which started a wave of interest in the use of inexpensive
hardware for production computing. Shortly after it was founded, Bill Jolitz left BSDI to
pursue distribution of 386BSD, the free software ancestor of FreeBSD, OpenBSD, and
NetBSD.

By 1993 most commercial vendors had changed their variants of Unix to be based on
System V with many BSD features added on top. The creation of the COSE initiative that
year by the major players in Unix marked the end of the most notorious phase of the Unix
wars, and was followed by the merger of UI and OSF in 1994. The new combined entity,
which retained the OSF name, stopped work on OSF/1 that year. By that time the only
vendor using it was Digital, which continued its own development, rebranding their
product Digital UNIX in early 1995.

Shortly after UNIX System V Release 4 was produced, AT&T sold all its rights to
UNIX® to Novell. (Dennis Ritchie likened this to the Biblical story of Esau selling his
birthright for the proverbial "mess of pottage".[6]) Novell developed its own version,
UnixWare, merging its NetWare with UNIX System V Release 4. Novell tried to use this
to battle against Windows NT, but their core markets suffered considerably.

In 1993, Novell decided to transfer the UNIX® trademark and certification rights to the
X/Open Consortium.[7] In 1996, X/Open merged with OSF, creating the Open Group.
Various standards by the Open Group now define what is and what is not a "UNIX"
operating system, notably the post-1998 Single UNIX Specification.

In 1995, the business of administering and supporting the existing UNIX licenses, plus
rights to further develop the System V code base, was sold by Novell to the Santa Cruz
Operation.[1] Whether Novell also sold the copyrights is currently the subject of
litigation (see below).

In 1997, Apple Computer sought out a new foundation for its Macintosh operating
system and chose NEXTSTEP, an operating system developed by NeXT. The core
operating system was renamed Darwin after Apple acquired it. It was based on the BSD
family and the Mach kernel. The deployment of Darwin BSD Unix in Mac OS X makes
it, according to a statement made by an Apple employee at a USENIX conference, the
most widely used Unix-based system in the desktop computer market.

[] 2000 to present

A modern Unix desktop environment (Solaris 10)

In 2000, SCO sold its entire UNIX business and assets to Caldera Systems, which later
on changed its name to The SCO Group. This new player then started legal action against
various users and vendors of Linux. SCO has alleged that Linux contains copyrighted
Unix code now owned by The SCO Group. Other allegations include trade-secret
violations by IBM, or contract violations by former Santa Cruz customers who have since
converted to Linux. However, Novell disputed the SCO group's claim to hold copyright
on the UNIX source base. According to Novell, SCO (and hence the SCO Group) are
effectively franchise operators for Novell, which also retained the core copyrights, veto
rights over future licensing activities of SCO, and 95% of the licensing revenue. The
SCO Group disagreed with this, and the dispute resulted in the SCO v. Novell
lawsuit.

In 2005, Sun Microsystems released the bulk of its Solaris system code (based on UNIX
System V Release 4) into an open source project called OpenSolaris. New Sun OS
technologies such as the ZFS file system are now first released as open source code via
the OpenSolaris project; as of 2006 it has spawned several non-Sun distributions such as
SchilliX, Belenix, Nexenta and MarTux.

The Dot-com crash has led to significant consolidation of Unix users as well. Of the
many commercial flavors of Unix that were born in the 1980s, only Solaris, HP-UX, and
AIX are still doing relatively well in the market, though SGI's IRIX persisted for quite
some time. Of these, Solaris has the most market share, and may be gaining popularity
due to its feature set and also since it now has an Open Source version.[8]

[] Standards
Beginning in the late 1980s, an open operating system standardization effort now known
as POSIX provided a common baseline for all operating systems; IEEE based POSIX
around the common structure of the major competing variants of the Unix system,
publishing the first POSIX standard in 1988. In the early 1990s a separate but very
similar effort was started by an industry consortium, the Common Open Software
Environment (COSE) initiative, which eventually became the Single UNIX Specification,
now administered by The Open Group. In 1998 the Open Group and the IEEE formed the
Austin Group to provide a common definition of POSIX and the Single UNIX
Specification.

In an effort towards compatibility, in 1999 several Unix system vendors agreed on
SVR4's Executable and Linkable Format (ELF) as the standard for binary and object code
files. The common format allows substantial binary compatibility among Unix systems
operating on the same CPU architecture.
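
As a small illustration of the format, every ELF object or executable begins with a fixed
identification block whose first four bytes are 0x7F followed by the letters "ELF". The
following C sketch (illustrative only, not part of any standard tool) checks a file for that
signature:

    #include <stdio.h>
    #include <string.h>

    /* Report whether the named file starts with the ELF magic number. */
    int main(int argc, char **argv)
    {
        unsigned char ident[4];
        FILE *f;

        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 2;
        }
        f = fopen(argv[1], "rb");
        if (f == NULL || fread(ident, 1, 4, f) != 4) {
            fprintf(stderr, "cannot read %s\n", argv[1]);
            return 2;
        }
        fclose(f);

        /* An ELF file always begins with 0x7F 'E' 'L' 'F'. */
        if (memcmp(ident, "\177ELF", 4) == 0)
            printf("%s looks like an ELF file\n", argv[1]);
        else
            printf("%s is not an ELF file\n", argv[1]);
        return 0;
    }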

The Filesystem Hierarchy Standard was created to provide a reference directory layout
for Unix-like operating systems, particularly Linux. This type of standard, however, is
controversial, and even within the Linux community its adoption is far from universal.

[] Components
See also: list of Unix programs

The Unix system is composed of several components that are normally packaged
together. By including — in addition to the kernel of an operating system — the
development environment, libraries, documents, and the portable, modifiable source-code
for all of these components, Unix was a self-contained software system. This was one of
the key reasons it emerged into an important teaching and learning tool and had such a
broad influence.

Inclusion of these components did not make the system large — the original V7 UNIX
distribution, consisting of copies of all of the compiled binaries plus all of the source
code and documentation, occupied less than 10 MB and arrived on a single 9-track
magnetic tape. The printed documentation, typeset from the on-line sources, was contained in
two volumes.

The names and filesystem locations of the Unix components have changed substantially
across the history of the system. Nonetheless, the V7 implementation is considered by
many to have the canonical early structure:

• Kernel — source code in /usr/sys, composed of several sub-components:
o conf — configuration and machine-dependent parts, including boot code
o dev — device drivers for control of hardware (and some pseudo-hardware)
o sys — operating system "kernel", handling memory management, process
scheduling, system calls, etc.
o h — header files, defining key structures within the system and important
system-specific invariables
• Development Environment — Early versions of Unix contained a development
environment sufficient to recreate the entire system from source code (a minimal usage
example follows this list):
o cc — C language compiler (first appeared in V3 Unix)
o as — machine-language assembler for the machine
o ld — linker, for combining object files
o lib — object-code libraries (installed in /lib or /usr/lib). libc, the system
library with C run-time support, was the primary library, but there have
always been additional libraries for such things as mathematical functions
(libm) or database access. V7 Unix introduced the first version of the
modern "Standard I/O" library stdio as part of the system library. Later
implementations increased the number of libraries significantly.
o make - build manager (introduced in PWB/UNIX), for effectively
automating the build process
o include — header files for software development, defining standard
interfaces and system invariants
o Other languages — V7 Unix contained a Fortran-77 compiler, a
programmable arbitrary-precision calculator (bc, dc), and the awk
"scripting" language, and later versions and implementations contain
many other language compilers and toolsets. Early BSD releases included
Pascal tools, and many modern Unix systems also include the GNU
Compiler Collection as well as or instead of a proprietary compiler
system.
o Other tools — including an object-code archive manager (ar), symbol-
table lister (nm), compiler-development tools (e.g. lex & yacc), and
debugging tools.
• Commands — Unix makes little distinction between commands (user-level
programs) for system operation and maintenance (e.g. cron), commands of
general utility (e.g. grep), and more general-purpose applications such as the text
formatting and typesetting package. Nonetheless, some major categories are:
o sh — The "shell" programmable command-line interpreter, the primary
user interface on Unix before window systems appeared, and even
afterward (within a "command window").
o Utilities — the core tool kit of the Unix command set, including cp, ls,
grep, find and many others. Subcategories include:
 System utilities — administrative tools such as mkfs, fsck, and
many others
 User utilities — environment management tools such as passwd,
kill, and others.
o Document formatting — Unix systems were used from the outset for
document preparation and typesetting systems, and included many related
programs such as nroff, troff, tbl, eqn, refer, and pic. Some modern Unix
systems also include packages such as TeX and GhostScript.
o Graphics — The plot subsystem provided facilities for producing simple
vector plots in a device-independent format, with device-specific
interpreters to display such files. Modern Unix systems also generally
include X11 as a standard windowing system and GUI, and many support
OpenGL.
o Communications — Early Unix systems contained no inter-system
communication, but did include the inter-user communication programs
mail and write. V7 introduced the early inter-system communication
system UUCP, and systems beginning with BSD release 4.1c included
TCP/IP utilities.

The 'man' command can display a 'man page' for every command on the system,
including itself.

• Documentation — Unix was the first operating system to include all of its
documentation online in machine-readable form. The documentation included:
o man — manual pages for each command, library component, system call,
header file, etc.
o doc — longer documents detailing major subsystems, such as the C
language and troff
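
As a minimal illustration of that development environment, the classic "hello, world"
program below is standard C; the build command in the comment assumes a traditional cc
driver, which invokes the assembler and linker behind the scenes and links against libc:

    /* hello.c - compile with:  cc hello.c -o hello
     * A traditional Unix cc driver runs the compiler, then the
     * assembler (as) and linker (ld), and links against libc. */
    #include <stdio.h>

    int main(void)
    {
        printf("hello, world\n");
        return 0;
    }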

[] Impact
The Unix system had significant impact on other operating systems.

It was written in a high-level language as opposed to assembly language (which had been
thought necessary for systems implementation on early computers). Although this
followed the lead of Multics and Burroughs, it was Unix that popularized the idea.

Unix had a drastically simplified file model compared to many contemporary operating
systems, treating all kinds of files as simple byte arrays. The file system hierarchy
contained machine services and devices (such as printers, terminals, or disk drives),
providing a uniform interface, but at the expense of occasionally requiring additional
mechanisms such as ioctl and mode flags to access features of the hardware that did not
fit the simple "stream of bytes" model. The Plan 9 operating system pushed this model
even further and eliminated the need for additional mechanisms.
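
The contrast between the uniform byte-stream interface and the extra mechanisms needed
for device-specific features can be sketched in a few lines of C. This is an illustrative
example only; the TIOCGWINSZ request and struct winsize are common on modern Unix-like
systems but are not part of the original interface:

    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>

    int main(void)
    {
        char buf[128];
        ssize_t n;

        /* Any file or device can be read as a plain stream of bytes. */
        int fd = open("/etc/motd", O_RDONLY);
        if (fd >= 0) {
            n = read(fd, buf, sizeof buf);
            if (n > 0)
                write(STDOUT_FILENO, buf, (size_t)n);
            close(fd);
        }

        /* Features that do not fit the byte-stream model need ioctl:
           here we ask the terminal driver for the window size. */
        struct winsize ws;
        if (ioctl(STDIN_FILENO, TIOCGWINSZ, &ws) == 0)
            printf("terminal is %d rows by %d columns\n", ws.ws_row, ws.ws_col);
        return 0;
    }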

Unix also popularized the hierarchical file system with arbitrarily nested subdirectories,
originally introduced by Multics. Other common operating systems of the era had ways to
divide a storage device into multiple directories or sections, but they had a fixed number
of levels, often only one level. Several major proprietary operating systems eventually
added recursive subdirectory capabilities also patterned after Multics. DEC's RSX-11M's
"group, user" hierarchy evolved into VMS directories, CP/M's volumes evolved into MS-
DOS 2.0+ subdirectories, and HP's MPE group.account hierarchy and IBM's SSP and
OS/400 library systems were folded into broader POSIX file systems.

Making the command interpreter an ordinary user-level program, with additional
commands provided as separate programs, was another Multics innovation popularized
by Unix. The Unix shell used the same language for interactive commands as for
scripting (shell scripts — there was no separate job control language like IBM's JCL).
Since the shell and OS commands were "just another program", the user could choose (or
even write) his own shell. New commands could be added without changing the shell
itself. Unix's innovative command-line syntax for creating chains of producer-consumer
processes (pipelines) made a powerful programming paradigm (coroutines) widely
available. Many later command-line interpreters have been inspired by the Unix shell.
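
The pipeline mechanism rests on a small set of system calls. The following C sketch
(illustrative, not taken from any shell's source) shows in outline what a shell does to run
the equivalent of "ls | wc -l": create a pipe, fork two children, wire their standard output
and input to the pipe ends, and exec the two programs.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    /* Roughly what a shell does for:  ls | wc -l  */
    int main(void)
    {
        int fds[2];

        if (pipe(fds) == -1) {
            perror("pipe");
            return 1;
        }

        if (fork() == 0) {               /* producer: ls */
            dup2(fds[1], STDOUT_FILENO); /* stdout -> write end of the pipe */
            close(fds[0]);
            close(fds[1]);
            execlp("ls", "ls", (char *)NULL);
            _exit(127);                  /* only reached if exec fails */
        }

        if (fork() == 0) {               /* consumer: wc -l */
            dup2(fds[0], STDIN_FILENO);  /* stdin <- read end of the pipe */
            close(fds[0]);
            close(fds[1]);
            execlp("wc", "wc", "-l", (char *)NULL);
            _exit(127);
        }

        close(fds[0]);                   /* parent keeps no pipe ends open */
        close(fds[1]);
        while (wait(NULL) > 0)           /* reap both children */
            ;
        return 0;
    }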

A fundamental simplifying assumption of Unix was its focus on ASCII text for nearly all
file formats. There were no "binary" editors in the original version of Unix — the entire
system was configured using textual shell command scripts. The common denominator in
the I/O system is the byte — unlike "record-based" file systems in other computers. The
focus on text for representing nearly everything made Unix pipes especially useful, and
encouraged the development of simple, general tools that could be easily combined to
perform more complicated ad hoc tasks. The focus on text and bytes made the system far
more scalable and portable than other systems. Over time, text-based applications have
also proven popular in application areas, such as printing languages (PostScript), and at
the application layer of the Internet Protocols, e.g. Telnet, FTP, SSH, SMTP, HTTP and
SIP.

Unix popularized a syntax for regular expressions that found widespread use. The Unix
programming interface became the basis for a widely implemented operating system
interface standard (POSIX, see above).
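
That regular-expression facility was later codified by POSIX as a C library interface,
<regex.h>. A small illustrative sketch, not tied to any particular vendor's implementation:

    #include <regex.h>
    #include <stdio.h>

    int main(void)
    {
        regex_t re;
        const char *pattern = "^[A-Za-z_][A-Za-z0-9_]*$"; /* a C identifier */
        const char *subject = "time_t";

        /* Compile the pattern as a POSIX extended regular expression. */
        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
            fprintf(stderr, "bad pattern\n");
            return 1;
        }

        if (regexec(&re, subject, 0, NULL, 0) == 0)
            printf("\"%s\" matches\n", subject);
        else
            printf("\"%s\" does not match\n", subject);

        regfree(&re);
        return 0;
    }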

The C programming language soon spread beyond Unix, and is now ubiquitous in
systems and applications programming.

Early Unix developers were important in bringing the theory of modularity and
reusability into software engineering practice, spawning a "Software Tools" movement.

Unix provided the TCP/IP networking protocol on relatively inexpensive computers,


which contributed to the Internet explosion of world-wide real-time connectivity, and
which formed the basis for implementations on many other platforms. (This also exposed
numerous security holes in the networking implementations.)

The Unix policy of extensive on-line documentation and (for many years) ready access to
all system source code raised programmer expectations, contributing to the Open Source
movement.

Over time, the leading developers of Unix (and programs that ran on it) evolved a set of
cultural norms for developing software, norms which became as important and influential
as the technology of Unix itself; this has been termed the Unix philosophy.

[] 2038

Main article: Year 2038 problem

Unix stores system time values as the number of seconds from midnight January 1, 1970
(the "Unix Epoch") in variables of type time_t, historically defined as "signed 32-bit
integer". On January 19, 2038, the current time will roll over from a zero followed by 31
ones (01111111111111111111111111111111) to a one followed by 31 zeros
(10000000000000000000000000000000), setting the sign bit; depending on the
implementation, this will be interpreted as a date in 1901 or in 1970. As many applications use
OS library routines for date calculations, the impact of this could be felt much earlier than
2038; for instance, 30-year mortgages may be calculated incorrectly beginning in the year
2008.
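
The rollover can be demonstrated with a short C sketch; it is purely illustrative and forces
a 32-bit width rather than using the platform's actual time_t:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    int main(void)
    {
        /* The last second representable in a signed 32-bit time_t:
           03:14:07 UTC on 19 January 2038. */
        int32_t t = INT32_MAX;

        /* Adding one second wraps the value around.  The detour through an
           unsigned type avoids undefined signed overflow; the conversion
           back is implementation-defined but wraps on common systems. */
        int32_t next = (int32_t)((uint32_t)t + 1u);

        printf("last valid value: %ld\n", (long)t);     /*  2147483647 */
        printf("after rollover:   %ld\n", (long)next);  /* -2147483648 */

        /* Interpreted as seconds since the Epoch, the wrapped value lands
           in December 1901 on systems that treat time_t as signed. */
        time_t bad = (time_t)next;
        printf("which ctime() renders as: %s", ctime(&bad));
        return 0;
    }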

Since times before 1970 are rarely represented in Unix time, one possible solution that is
compatible with existing binary formats would be to redefine time_t as "unsigned 32-bit
integer". However, such a kludge merely postpones the problem to February 7, 2106, and
could introduce bugs in software that computes or compares differences between time values.

Some Unix versions have already addressed this. For example, in Solaris on 64-bit
systems, time_t is 64 bits long, meaning that the OS itself and 64-bit applications will
correctly handle dates for some 292 billion years (several times greater than the age of
the universe). Existing 32-bit applications using a 32-bit time_t continue to work on 64-
bit Solaris systems but are still prone to the 2038 problem.

[] Free Unix-like operating systems

Linux is a modern Unix-like system

In 1983, Richard Stallman announced the GNU project, an ambitious effort to create a
free software Unix-like system; "free" in that everyone who received a copy would be
free to use, study, modify, and redistribute it. GNU's goal was essentially achieved in 1992:
although its own kernel development project, GNU Hurd, had not produced a working
kernel, a compatible kernel called Linux was released as free software in 1992 under the GNU
General Public License. The combination of the two is frequently referred to simply as
"Linux", although the Free Software Foundation and some Linux distributions, such as
Debian GNU/Linux, use the combined term GNU/Linux. Work on GNU Hurd continues,
although very slowly.

In addition to their use in the Linux operating system, many GNU packages — such as
the GNU Compiler Collection (and the rest of the GNU toolchain), the GNU C library
and the GNU core utilities — have gone on to play central roles in other free Unix
systems as well.

Linux distributions, comprising Linux and large collections of compatible software, have
become popular both with hobbyists and in business. Popular distributions include Red
Hat Enterprise Linux, SUSE Linux, Mandriva Linux, Fedora, Ubuntu, Debian
GNU/Linux, Slackware Linux and Gentoo.

A free derivative of BSD Unix, 386BSD, was also released in 1992 and led to the
NetBSD and FreeBSD projects. With the 1994 settlement of a lawsuit that UNIX
Systems Laboratories brought against the University of California and Berkeley Software
Design Inc. (USL v. BSDi), it was clarified that Berkeley had the right to distribute BSD
Unix — for free, if it so desired. Since then, BSD Unix has been developed in several
different directions, including the OpenBSD and DragonFly BSD variants.

Linux and its BSD kin are now rapidly occupying the market traditionally held by
proprietary Unix operating systems, as well as expanding into new markets such as the
consumer desktop and mobile and embedded devices. One measure of this success is
Apple Computer's incorporation of BSD into its Macintosh operating system by
way of NEXTSTEP. Due to the modularity of the Unix design, sharing bits and pieces is
relatively common; consequently, most or all Unix and Unix-like systems include at least
some BSD code, and modern BSDs also typically include some GNU utilities in their
distribution, so Apple's combination of parts from NeXT and FreeBSD with Mach and
some GNU utilities has precedent.

In 2005, Sun Microsystems released the bulk of the source code to the Solaris operating
system, a System V variant, under the name OpenSolaris, making it the first actively
developed commercial Unix system to be open sourced (several years earlier, Caldera
had released many of the older Unix systems under an educational and later BSD
license). As a result, a great deal of formerly proprietary AT&T/USL code is now freely
available.

[] Branding
See also: list of Unix systems

In October 1993, Novell, the company that owned the rights to the Unix System V source
at the time, transferred the trademarks of Unix to the X/Open Company (now The Open
Group),[9] and in 1995 sold the related business operations to Santa Cruz Operation.[10]
Whether Novell also sold the copyrights to the actual software is currently the subject of
litigation in a federal lawsuit, SCO v. Novell. Unix vendor SCO Group Inc. accused
Novell of slander of title.

The present owner of the trademark UNIX® is The Open Group, an industry standards
consortium. Only systems fully compliant with and certified to the Single UNIX
Specification qualify as "UNIX®" (others are called "Unix system-like" or "Unix-like").
The term UNIX is not an acronym, but follows the early convention of naming computer
systems in capital letters, such as ENIAC and MISTIC.

By decree of The Open Group, the term "UNIX®" refers more to a class of operating
systems than to a specific implementation of an operating system; those operating
systems which meet The Open Group's Single UNIX Specification should be able to bear
the UNIX® 98 or UNIX® 03 trademarks today, after the operating system's vendor pays
a fee to The Open Group. Systems licensed to use the UNIX® trademark include AIX,
HP-UX, IRIX, Solaris, Tru64, A/UX, Mac OS X 10.5 on Intel platforms[11], and a part of
z/OS.

Sometimes a representation like "Un*x", "*NIX", or "*N?X" is used to indicate all
operating systems similar to Unix. This comes from the use of the "*" and "?" characters
as "wildcard" characters in many utilities. This notation is also used to describe other
Unix-like systems, e.g. Linux, FreeBSD, etc., that have not met the requirements for
UNIX® branding from the Open Group.

The Open Group requests that "UNIX®" always be used as an adjective followed by a
generic term such as "system" to help avoid the creation of a genericized trademark.

The term "Unix" is also used, and was in fact the original capitalisation, but the name
UNIX stuck because, in the words of Dennis Ritchie, "when presenting the original Unix
paper to the third Operating Systems Symposium of the American Association for
Computing Machinery, we had just acquired a new typesetter and were intoxicated by
being able to produce small caps" (quoted from the Jargon File, version 4.3.3, 20
September 2002). Additionally, many of the operating system's predecessors and
contemporaries used all-uppercase lettering, because many computer terminals of the
time could not produce lower-case letters, so many people wrote the name in upper case
out of force of habit.

Several plural forms of Unix are used to refer to multiple brands of Unix and Unix-like
systems. Most common is the conventional "Unixes", but the hacker culture which
created Unix has a penchant for playful use of language, and "Unices" (treating Unix as a
Latin noun of the third declension) is also popular. The Anglo-Saxon plural form
"Unixen" is not common, although occasionally seen.

Trademark names can be registered by different entities in different countries, and
trademark laws in some countries allow the same trademark name to be controlled by two
different entities if each entity uses the trademark in easily distinguishable categories.
The result is that Unix has been used as a brand name for various products including
book shelves, ink pens, bottled glue, diapers, hair dryers and food containers.[2]

Microsoft Windows
Microsoft Windows is the name of several families of software operating systems by
Microsoft. Microsoft first introduced an operating environment named Windows in
November 1985 as an add-on to MS-DOS in response to the growing interest in graphical
user interfaces (GUIs).[1] Microsoft Windows eventually came to dominate the world's
personal computer market, overtaking OS/2 and Mac OS which had been introduced
previously. At the 2004 IDC Directions conference, IDC Vice President Avneesh Saxena
stated that Windows had approximately 90% of the client operating system market.[2] The
current client versions of Windows are the editions of Windows Vista. The current server
versions of Windows are the editions of Windows Server 2003, and Windows Server
2008 is already in beta.

[] Versions
See also: List of Microsoft Windows versions

The term Windows collectively describes any or all of several generations of Microsoft
(MS) operating system (OS) products. These products are generally categorized as
follows:

[] 16-bit operating environments

The box art of Windows 1.0, the first version Microsoft released to the public.
The early versions of Windows were often thought of as just graphical user interfaces,
mostly because they ran on top of MS-DOS and used it for file system services.[citation needed]
However, even the earliest 16-bit Windows versions already took on many typical
operating system functions, notably having their own executable file format and
providing their own device drivers (timer, graphics, printer, mouse, keyboard and sound)
for applications. Unlike MS-DOS, Windows allowed users to execute multiple graphical
applications at the same time, through cooperative multitasking. Finally, Windows
implemented an elaborate, segment-based, software virtual memory scheme which
allowed it to run applications larger than available memory: code segments and resources
were swapped in and thrown away when memory became scarce, and data segments
moved in memory when a given application had relinquished processor control, typically
waiting for user input.[citation needed] 16-bit Windows versions include Windows 1.0 (1985),
Windows 2.0 (1987) and its close relative Windows/286.
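
The moveable-memory style of early Windows programming can be sketched as follows.
This is a hypothetical fragment using the GlobalAlloc and GlobalLock calls, which survive
in Win32 for compatibility; in real 16-bit Windows code the block was kept unlocked
whenever possible so that the system could move or discard it.

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* Ask for a moveable block; the system may relocate it while it is
           unlocked, which is how 16-bit Windows packed scarce memory. */
        HGLOBAL h = GlobalAlloc(GMEM_MOVEABLE, 1024);
        if (h == NULL)
            return 1;

        char *p = (char *)GlobalLock(h);   /* pin the block, get a pointer */
        if (p != NULL) {
            lstrcpyA(p, "scratch data");   /* use the memory while locked */
            printf("stored: %s\n", p);
            GlobalUnlock(h);               /* let the system move it again */
        }
        GlobalFree(h);
        return 0;
    }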

[] Hybrid 16/32-bit operating environments

A classic Windows logo, used from the early 1990s to 1999.

Windows/386 introduced a 32-bit protected mode kernel and virtual machine monitor.
For the duration of a Windows session, it created one or more virtual 8086 environments
and provided device virtualization for the video card, keyboard, mouse, timer and
interrupt controller inside each of them. The user-visible consequence was that it became
possible to preemptively multitask multiple MS-DOS environments in separate windows
(graphical applications required switching the window to full-screen mode). Windows
applications were still multitasked cooperatively inside one such real-mode environment.

Windows 3.0 (1990) and Windows 3.1 (1992) improved the design, mostly because of
virtual memory and loadable virtual device drivers (VxDs) which allowed them to share
arbitrary devices between multitasked DOS windows.[citation needed] Because of this,
Windows applications could now run in 16-bit protected mode (when Windows was
running in Standard or 386 Enhanced Mode), which gave them access to several
megabytes of memory and removed the obligation to participate in the software virtual
memory scheme. They still ran inside the same address space, where the segmented
memory provided a degree of protection, and multi-tasked cooperatively. For Windows
3.0, Microsoft also rewrote critical operations from C into assembly, making this release
faster and less memory-hungry than its predecessors.[citation needed]

[] Hybrid 16/32-bit operating systems

The Windows logo that was used from late 1999 to 2001.
With the introduction of 32-bit Windows for Workgroups 3.11, Windows could finally
stop relying on DOS for file management.[citation needed] Leveraging this, Windows 95
introduced Long File Names and reduced MS-DOS to the role of a boot loader. MS-DOS
was now bundled with Windows; this notably made it (partially) aware of long file
names when its utilities were run from within Windows. The most important novelty was
the possibility of running 32-bit multi-threaded preemptively multitasked graphical
programs. However, the necessity of keeping compatibility with 16-bit programs meant
the GUI components were still 16-bit only and not fully reentrant, which resulted in
reduced performance and stability.

There were three releases of Windows 95 (the first in 1995, then subsequent bug-fix
versions in 1996 and 1997, only released to OEMs, which added extra features such as
FAT32 support). Microsoft's next OS was Windows 98; there were two versions of this
(the first in 1998 and the second, named "Windows 98 Second Edition", in 1999).[citation
needed] In 2000, Microsoft released Windows Me (Me standing for Millennium Edition),
which used the same core as Windows 98 but adopted the visual appearance of Windows
2000, as well as a new feature called System Restore, allowing the user to set the
computer's settings back to an earlier date.[citation needed] It was not a very well-received
implementation, and many user problems occurred.[citation needed] Windows Me was
considered a stopgap to the day both product lines would be seamlessly merged.[citation
needed] Microsoft left little time for Windows Me to become popular before announcing
their next version of Windows which would be called Windows XP.

[] 32-bit operating systems

The Windows logo that was used from 2001 to November 2006.

This family of Windows systems was fashioned and marketed for higher reliability
business use, and was unencumbered by any Microsoft DOS patrimony.[citation needed] The
first release was Windows NT 3.1 (1993, numbered "3.1" to match the Windows version
and to one-up OS/2 2.1, IBM's flagship OS, which had been co-developed with Microsoft
and was Windows NT's main competitor at the time), which was followed by NT 3.5 (1994), NT 3.51
(1995), and NT 4.0 (1996); NT 4.0 was the first in this line to implement the Windows 95
user interface. Microsoft then moved to combine their consumer and business operating
systems. Their first attempt, Windows 2000, failed to meet their goals,[citation needed] and was
released as a business system. The home consumer edition of Windows 2000, codenamed
"Windows Neptune," ceased development and Microsoft released Windows Me in its
place. MS-DOS nonetheless lived on: its last version, 8.0, was released embedded in
Windows Me, and the retirement of Windows Me will mark the end of MS-DOS.
Eventually "Neptune" was merged into their new project, Whistler, which later became
Windows XP. Since then, a new business system, Windows Server 2003, has expanded the
top end of the range, and the newly released Windows Vista completes it. Windows CE,
Microsoft's offering in the mobile and embedded markets, is also a true 32-bit operating
system.

[] 64-bit operating systems

The current Windows logo

Windows NT included support for several different platforms before the x86-based
personal computer became dominant in the professional world. Versions of NT from 3.1
to 4.0 supported DEC Alpha and MIPS R4000, which were 64-bit processors, although
the operating system treated them as 32-bit processors.

With the introduction of the Intel Itanium architecture, which is referred to as IA-64,
Microsoft released new versions of Windows 2000 to support it. Itanium versions of
Windows XP and Windows Server 2003 were released at the same time as their
mainstream x86 (32-bit) counterparts. On April 25, 2005, Microsoft released Windows
XP Professional x64 Edition and x64 versions of Windows Server 2003 to support the
AMD64/Intel64 (or x64 in Microsoft terminology) architecture. Microsoft dropped
support for the Itanium version of Windows XP in 2005. The modern 64-bit Windows
family comprises Windows XP Professional x64 Edition for AMD64/Intel64 systems,
and Windows Server 2003, in both Itanium and x64 editions. Windows Vista is the first
end-user version of Windows that Microsoft has released simultaneously in 32-bit and
x64 editions. Windows Vista does not support the Itanium architecture.

[] History
Main article: History of Microsoft Windows

Microsoft has taken two parallel routes in operating systems. One route has been the
home user and the other has been the professional IT user. The dual route has generally
led to home versions having greater multimedia support but less functionality in
networking and security, and professional versions having inferior multimedia support
but better networking and security.

The first independent version of Microsoft Windows, version 1.0, released in November
1985, lacked a degree of functionality, achieved little popularity, and was intended to
compete with Apple's own operating system.[citation needed] Windows 1.0 did not provide a
complete operating system; rather, it extended MS-DOS. Microsoft Windows version 2.0
was released in November 1987 and was slightly more popular than its predecessor.
Windows 2.03 (released January 1988) changed the OS from tiled windows to
overlapping windows. This change led Apple Computer to file a suit against Microsoft
alleging infringement of Apple's copyrights.[citation needed]

A Windows for Workgroups 3.11 desktop.

Microsoft Windows version 3.0, released in 1990, was the first Microsoft Windows
version to achieve broad commercial success, selling 2 million copies in the first six
months.[citation needed] It featured improvements to the user interface and to multitasking
capabilities. It received a facelift in Windows 3.1, made generally available on March 1,
1992. Windows 3.1 support ended on December 31, 2001.[4]

In July 1993, Microsoft released Windows NT based on a new kernel. NT was considered
to be the professional OS and was the first Windows version to utilize preemptive
multitasking.[citation needed] Windows NT and the Windows DOS/9x based line would later
be fused together to create Windows XP.

In August 1995, Microsoft released Windows 95, which made further changes to the user
interface, and also used preemptive multitasking. Mainstream support for Windows 95
ended on December 31, 2000 and extended support for Windows 95 ended on December
31, 2001.[5]

The next in line was Microsoft Windows 98 released in June 1998. It was substantially
criticized for its slowness and for its unreliability compared with Windows 95, but many
of its basic problems were later rectified with the release of Windows 98 Second Edition
in 1999.[citation needed] Mainstream support for Windows 98 ended on June 30, 2002 and
extended support for Windows 98 ended on July 11, 2006.[6]

As part of its "professional" line, Microsoft released Windows 2000 in February 2000.
The consumer version following Windows 98 was Windows Me (Windows Millennium
Edition). Released in September 2000, Windows Me attempted to implement a number of
new technologies for Microsoft: most notably publicized was "Universal Plug and Play."
However, the OS was heavily criticized for its lack of compatibility and stability and it
was even rated by PC World as the fourth worst product of all time.[7]

In October 2001, Microsoft released Windows XP, a version built on the Windows NT
kernel that also retained the consumer-oriented usability of Windows 95 and its
successors. This new version was widely praised in computer magazines.[8] It shipped in
two distinct editions, "Home" and "Professional", the former lacking many of the
superior security and networking features of the Professional edition. Additionally, the
first "Media Center" edition was released in 2002[9], with an emphasis on support for
DVD and TV functionality including program recording and a remote control.
Mainstream support for Windows XP will continue until April 14, 2009 and extended
support will continue until April 8, 2014.[10]

In April 2003, Windows Server 2003 was introduced, replacing the Windows 2000 line
of server products with a number of new features and a strong focus on security; this was
followed in December 2005 by Windows Server 2003 R2.

On January 30, 2007, Microsoft released Windows Vista. It contains a number of new
features, from a redesigned shell and user interface to significant technical changes, with
a particular focus on security features. It is available in a number of different editions,
more than any previous version of Windows. It has been subject to several criticisms.

[] Security

The Windows Security Center was introduced with Windows XP Service Pack 2.

Security has been a hot topic with Windows for many years, and even Microsoft itself has
been the victim of security breaches. Consumer versions of Windows were originally
designed for ease-of-use on a single-user PC without a network connection, and did not
have security features built in from the outset. Windows NT and its successors are
designed for security (including on a network) and multi-user PCs, but were not designed
as much with Internet security in mind since, when Windows NT was first developed in
the early 1990s, Internet use was less prevalent. These design issues, combined with
flawed code (such as buffer overflows) and the popularity of Windows, mean that it is a frequent
target of worm and virus writers. In June 2005, Bruce Schneier's Counterpane Internet
Security reported that it had seen over 1,000 new viruses and worms in the previous six
months.[11]

Microsoft releases security patches through its Windows Update service approximately
once a month (usually the second Tuesday of the month), although critical updates are
made available at shorter intervals when necessary.[12] In Windows 2000 (SP3 and later),
Windows XP and Windows Server 2003, updates can be automatically downloaded and
installed if the user selects to do so. As a result, Service Pack 2 for Windows XP, as well
as Service Pack 1 for Windows Server 2003, were installed by users more quickly than
they otherwise might have been.[13]

[] Windows Defender

Windows Defender

On 6 January 2005, Microsoft released a beta version of Microsoft AntiSpyware, based
upon the previously released Giant AntiSpyware. On 14 February 2006, Microsoft
AntiSpyware became Windows Defender with the release of beta 2. Windows Defender
is a freeware program designed to protect against spyware and other unwanted software.
Windows XP and Windows Server 2003 users who have genuine copies of Microsoft
Windows can freely download the program from Microsoft's web site, and Windows
Defender ships as part of Windows Vista.[14]

[] Third-party analysis
In an article based on a report by Symantec,[15] internetnews.com has described Microsoft
Windows as having the "fewest number of patches and the shortest average patch
development time of the five operating systems it monitored in the last six months of
2006."[16] However, although the overall number of vulnerabilities found in MS Windows
was lower than in the other operating systems, the number of vulnerabilities of high
severity found in Windows was significantly greater—Windows: 12, Red Hat + Fedora:
2, Apple OS X: 1, HP-UX: 2, Solaris: 1.

A study conducted by Kevin Mitnick and marketing communications firm Avantgarde in
2004 found that an unprotected and unpatched Windows XP system with Service Pack 1
lasted only 4 minutes on the Internet before it was compromised, and an unprotected and
also unpatched Windows Server 2003 system was compromised after being connected to
the internet for 8 hours.[17] However, this study does not apply
to Windows XP systems running the Service Pack 2 update (released in late 2004), which
vastly improved the security of Windows XP. The computer that was running Windows
XP Service Pack 2 was not compromised. The AOL National Cyber Security Alliance
Online Safety Study of October 2004 determined that 80% of Windows users were
infected by at least one spyware/adware product.[18] Much documentation is available
describing how to increase the security of Microsoft Windows products. Typical
suggestions include deploying Microsoft Windows behind a hardware or software
firewall, running anti-virus and anti-spyware software, and installing patches as they
become available through Windows Update.[citation needed]

[] Windows lifecycle policy

Microsoft has stopped releasing updates and hotfixes for many old Windows operating
systems, including all versions of Windows 9x and earlier versions of Windows NT.
Support for Windows 98, Windows 98 Second Edition and Windows Me ended on July
11, 2006, and Extended Support for Windows NT 4.0 ended on December 31, 2004.
Security updates were also discontinued for Windows XP 64-bit Edition after the release
of the more recent Windows XP Professional x64 Edition.[citation needed] But most of the
updates that Microsoft has released in the past can still be downloaded using Windows
Update Catalog.[citation needed]

Windows 2000 is currently in the Extended Support Period, and this period will not end
until July 13, 2010. Only security updates will be provided during Extended Support,
meaning that no new service packs will be released for Windows 2000.

[] Emulation software
Emulation allows the use of some Windows applications without using Microsoft
Windows. These include:

• Wine - (Wine Is Not an Emulator) an almost-complete free software/open-source
software implementation of the Windows API, allowing one to run most
Windows applications on x86-based platforms, including GNU/Linux. Wine is
technically not an emulator; an emulator effectively 'pretends' to be a different
CPU, while Wine implements the Windows-style APIs directly to provide the
Windows environment. (A minimal Win32 program of the kind Wine can run is
sketched after this list.)
• CrossOver - A Wine package with licensed fonts. Its developers are regular
contributors to Wine, and focus on Wine running officially supported
applications.
• Cedega - TransGaming Technologies' proprietary fork of Wine, which is designed
specifically for running games written for Microsoft Windows under GNU/Linux.
• ReactOS - An open-source OS that intends to run the same software as Windows,
at an early alpha stage.
• Darwine - This project intends to port and develop Wine as well as other
supporting tools that will allow Darwin and Mac OS X users to run Microsoft
Windows Applications, and to provide Win32 API compatibility at application
source code level.
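
For illustration, the kind of program Wine runs is an ordinary Win32 executable. The
hedged sketch below uses only the MessageBox call from the Windows API; it can
typically be built with a Windows-targeting compiler such as MinGW (or with Wine's
winelib tooling) and then run under Wine on Linux.

    #include <windows.h>

    /* A minimal Win32 program: under Wine, the implementation of
       MessageBoxA is supplied by Wine rather than Microsoft's user32.dll. */
    int WINAPI WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR cmdLine, int nShow)
    {
        MessageBoxA(NULL, "Hello from a Win32 program", "Wine demo", MB_OK);
        return 0;
    }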

DOS

DOS (from Disk Operating System) commonly refers to the family of closely related
operating systems which dominated the IBM PC compatible market between 1981 and
1995 (or until about 2000, if Windows 9x systems are included): DR-DOS, FreeDOS,
MS-DOS, Novell-DOS, OpenDOS, PC-DOS, PTS-DOS, ROM-DOS and several others.
They are single-user, single-task systems. MS-DOS from Microsoft was the most widely
used. These operating systems ran on IBM PC-type hardware using the Intel x86 CPUs or
their compatible cousins from other makers. MS-DOS, inspired by CP/M, is still common
today and served as the foundation for many of Microsoft's operating systems, from
Windows 1.0 through Windows Me.

[] History
MS-DOS (and the IBM PC-DOS which was licensed therefrom), like its predecessor 86-
DOS, was inspired by CP/M (Control Program / (for) Microcomputers) — which was the
dominant disk operating system for 8-bit Intel 8080 and Zilog Z80 based
microcomputers. It was first developed at Seattle Computer Products by Tim Paterson as
a variant of CP/M-80 from Digital Research, but intended as an internal product for
testing SCP's new 8086 CPU card for the S-100 bus. It did not run on the 8080 (or
compatible) CPU needed for CP/M-80. Microsoft bought it from SCP, allegedly for
$50,000, made changes, and licensed the result to IBM (sold as PC-DOS) for its new 'PC'
using the 8088 CPU (internally the same as the 8086), and to many other hardware
manufacturers. In the latter case it was sold as MS-DOS.

Digital Research produced a compatible product known as "DR-DOS", which was
eventually taken over (after a buyout of Digital Research) by Novell. This became
"OpenDOS" for a while after the relevant division of Novell was sold to Caldera
International, now called SCO. Later, the embedded division of Caldera was "spun off"
as Lineo (later renamed Embedix), which in turn sold DR-DOS to a start-up called
Device Logics, which now calls itself DRDOS, Inc.

Only IBM-PCs were distributed with PC-DOS, whereas PC compatible computers from
nearly all other manufacturers were distributed with MS-DOS. For the early years of this
operating system family, PC-DOS was almost identical to MS-DOS.

Early versions of Microsoft Windows were little more than a graphical shell for DOS,
and later versions of Windows were tightly integrated with MS-DOS. It is also possible
to run DOS programs under OS/2 and Linux using virtual-machine emulators. Because of
the long existence and ubiquity of DOS in the world of the PC-compatible platform, DOS
was often considered to be the native operating system of the PC compatible platform.

There are alternative versions of DOS, such as FreeDOS and OpenDOS. FreeDOS
appeared in 1994 in reaction to Microsoft Windows 95, which, unlike Windows 3.11, was
not merely a shell over MS-DOS and dispensed with it as a separate product.[1]

[] Timeline

Microsoft bought non-exclusive rights for marketing 86-DOS in October 1980. In July
1981, Microsoft bought exclusive rights for 86-DOS (by now up to version 1.14) and
renamed the operating system MS-DOS.

The first IBM branded version, PC-DOS 1.0, was released in August, 1981. It supported
up to 640 kB of RAM[2] and four 160 kB 5.25" single sided floppy disks.

In May 1982, PC-DOS 1.1 added support for 320 kB double-sided floppy disks.

PC-DOS 2.0 and MS-DOS 2.0, released in March 1983, were the first versions to support
the PC/XT and fixed disk drives (commonly referred to as hard disk drives). Floppy disk
capacity was increased to 180 kB (single sided) and 360 kB (double sided) by using nine
sectors per track instead of eight.

At the same time, Microsoft announced its intention to create a GUI for DOS. Its first
version, Windows 1.0, was announced in November 1983, but was unfinished and did
not interest IBM. By November 1985, the first finished version, Microsoft Windows
1.01, was released.

MS-DOS 3.0, released in September 1984, first supported 1.2 MB floppy disks and 32 MB
hard disks. MS-DOS 3.1, released in November that year, introduced network support.

MS-DOS 3.2, released in April 1986, was the first retail release of MS-DOS. It added
support of 720 kB 3.5" floppy disks. Previous versions had been sold only to computer
manufacturers who pre-loaded them on their computers, because operating systems were
considered part of a computer, not an independent product.

MS-DOS 3.3, released in April 1987, featured logical disks. A physical disk could be
divided into several partitions, considered as independent disks by the operating system.
Support was also added for 1.44 MB 3.5" floppy disks.

MS-DOS 4.0, released in July 1988, supported disks up to 2 GB (disk sizes were
typically 40-60 MB in 1988), and added a full-screen shell called DOSSHELL. Other
shells, like Norton Commander and PCShell, already existed in the market. In November
1988, Microsoft addressed many bugs in a service release, MS-DOS 4.01.

MS-DOS 5.0, released in April 1991, included the full-screen BASIC interpreter QBasic,
which also provided a full-screen text editor (previously, MS-DOS had only a line-based
text editor, edlin). A disk cache utility SmartDrive, undelete capabilities, and other
improvements were also included. It had severe problems with some disk utilities, fixed
later in MS-DOS 5.01, released later in the same year.

In March 1992, Microsoft released Windows 3.1, which became the first popular version
of Microsoft Windows, with more than 1,000,000 copies of the graphical user interface sold.

In March 1993, MS-DOS 6.0 was released. Following competition from Digital
Research, Microsoft added a disk compression utility called DoubleSpace. At the time,
typical hard disk sizes were about 200-400 MB, and many users badly needed more disk
space. MS-DOS 6.0 also featured the disk defragmenter DEFRAG, backup program
MSBACKUP, memory optimization with MEMMAKER, and rudimentary virus
protection via MSAV.

As with versions 4.0 and 5.0, MS-DOS 6.0 turned out to be buggy. Due to complaints
about loss of data, Microsoft released an updated version, MS-DOS 6.2, with an
improved DoubleSpace utility, a new disk check utility, SCANDISK (similar to fsck
from Unix), and other improvements.

The next version of MS-DOS, 6.21 (released March 1994), appeared due to legal
problems. Stac Electronics sued Microsoft and forced it to remove DoubleSpace from
their operating system.

In May 1994, Microsoft released MS-DOS 6.22, with another disk compression package,
DriveSpace, licensed from VertiSoft Systems.

MS-DOS 6.22 was the last stand-alone version of MS-DOS available to the general
public. MS-DOS was removed from marketing by Microsoft on November 30, 2001. See
the Microsoft Licensing Roadmap.

Microsoft also released versions 6.23 to 6.25 for banks and American military
organizations. These versions introduced FAT32 support. Since then, MS-DOS exists
only as a part of Microsoft Windows versions based on Windows 95 (Windows 98,
Windows Me). The original release of Microsoft Windows 95 incorporates MS-DOS
version 7.0.

IBM released its last commercial version of a DOS, IBM PC-DOS 7.0, in early 1995. It
incorporated many new utilities such as anti-virus software, comprehensive backup
programs, PCMCIA support, and DOS Pen extensions. Also added were new features to
enhance available memory and disk space.

[] Accessing hardware under DOS


The operating system offers a hardware abstraction layer that allows development of
character-based applications, but not one for accessing most other hardware, such as graphics
cards, printers, or mice. This required programmers to access the hardware directly,
resulting in each application having its own set of device drivers for each hardware
peripheral. Hardware manufacturers would release specifications to ensure device drivers
for popular applications were available.
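
A typical example of such direct access was writing straight into the video adapter's text
buffer. The fragment below is a hypothetical sketch in the style of 16-bit real-mode DOS
compilers such as Turbo C; the far keyword and the address of the colour text screen at
segment 0xB800 are compiler and PC-hardware conventions, not standard C.

    /* Real-mode DOS sketch: put a white-on-black 'A' in the top-left
       corner of an 80x25 colour text screen by writing to video memory. */
    void poke_video(void)
    {
        /* Segment 0xB800 holds the colour text-mode frame buffer on
           CGA/EGA/VGA adapters.  Each character cell is two bytes:
           the character code, then an attribute byte. */
        unsigned char far *video = (unsigned char far *)0xB8000000L;

        video[0] = 'A';    /* character code */
        video[1] = 0x07;   /* attribute: light grey on black */
    }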

[] DOS and other PC operating systems


Early versions of Microsoft Windows were shell programs that ran in DOS. Windows
3.11 extended the shell by going into protected mode and added 32-bit support. These
were 16-bit/32-bit hybrids. Microsoft Windows 95 further reduced DOS to the role of the
bootloader. Windows 98 and Windows Me were the last Microsoft OSes to run on DOS.
The DOS-based branch was eventually abandoned in favor of Windows NT, the first true
32-bit system that was the foundation for Windows XP and Windows Vista.

Windows NT, initially NT OS/2 3.0, was the result of a collaboration between Microsoft
and IBM to develop a 32-bit operating system that had high hardware and software
portability. Because of the success of Windows 3.0, Microsoft changed the application
programming interface to the extended Windows API, which caused a split between the
two companies and a branch in the operating system. IBM would continue to work on
OS/2 and the OS/2 API, while Microsoft renamed its operating system Windows NT.

[] Reserved device names under DOS


There are reserved device names in DOS that cannot be used as filenames regardless of
extension; these restrictions also affect several Windows versions, in some cases causing
crashes and security vulnerabilities.

A partial list of these reserved names is: NUL:, COM1: or AUX:, COM2:, COM3:, COM4:,
CON:, LPT1: or PRN:, LPT2:, LPT3:, and CLOCK$.
More recent versions of both MS-DOS and IBM-DOS allow reserved device names
without the trailing colon; e.g., PRN refers to PRN:.

The NUL filename refers to the null device, similar in function to the UNIX device /dev/null.
It is best suited for being used in batch command files to discard unneeded output. If NUL
is copied to a file that already exists, it will truncate the target file; otherwise, a zero byte
file will be created. (Thus, copy NUL foo is functionally similar to the UNIX commands
cat </dev/null >foo and cp /dev/null foo.) Naming a file as NUL, regardless of
extension, could cause unpredictable behavior in most applications. Well-designed
applications will generate an error stating that NUL is a DOS reserved filename; others
generate the file but whatever the program saves is lost; finally, some applications may
hang or leave the computer in an inconsistent state, requiring a reboot.
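
The null device can also be opened from a program like any other file, which is how many
DOS-era tools silently discarded output. A brief illustrative C sketch (the "NUL" name is
handled by DOS and Windows; a Unix-like system would use /dev/null instead):

    #include <stdio.h>

    int main(void)
    {
        /* Open the DOS/Windows null device; anything written is discarded.
           On a Unix-like system the equivalent name would be "/dev/null". */
        FILE *sink = fopen("NUL", "w");
        if (sink == NULL) {
            perror("NUL");
            return 1;
        }

        fprintf(sink, "this text goes nowhere\n");
        fclose(sink);
        return 0;
    }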

[] Drive naming scheme


Main article: Drive letter assignment

Under Microsoft's DOS operating system and its derivatives drives are referred to by
identifying letters. Standard practice is to reserve "A" and "B" for floppy drives. On
systems with only one floppy drive DOS permits the use of both letters for one drive, and
DOS will ask to swap disks. This permits copying from floppy to floppy or having a
program run from one floppy while having its data on another. Hard drives were
originally assigned the letters "C" and "D". DOS could only support one active partition
per drive. As support for more hard drives became available, this developed into a scheme
of first assigning a letter to the active primary partition on each drive, then making a
second pass over the drives to allocate letters to logical drives in the extended partitions,
and then making a third pass to give the remaining non-active primary partitions their
names (assuming they exist and contain a DOS-readable file system). Lastly, DOS
allocates letters for CD-ROMs, RAM disks and other hardware. Letter assignments
usually occur in the order the drivers are loaded, but the drivers can instruct DOS to
assign a different letter; drivers for network drives, for example, typically assign letters
nearer the end of the alphabet.

Because DOS applications use these drive letters directly (unlike the /dev folder in Unix-
like systems), they can be disrupted by adding new hardware that needs a drive letter. An
example is the addition of a new hard drive with a primary partition to an original hard
drive that contains logical drives in extended partitions. As primary partitions have higher
priority than the logical drives, it will change drive letters in the configuration. Moreover,
attempts to add a new hard drive with only logical drives in an extended partition would
still disrupt the letters of RAM disks and CD-ROM drives. This problem persisted
through the 9x versions of Windows until NT, which preserves the letters of existing
drives until the user changes them.

[] DOS emulators
Under Linux it is possible to run copies of DOS and many of its clones under DOSEMU,
a Linux-native virtual machine for running real mode programs. There are a number of
other emulators for running DOS under various versions of UNIX, even on non-x86
platforms, such as DOSBox.

DOS emulators are gaining popularity among Windows XP users because Windows XP
is incompatible with pure DOS. They are often used to play 'abandoned' games made
for DOS. One of the most famous emulators is DOSBox, designed for game-playing on
modern operating systems. Another emulator, ExDOS, is designed for business use.
VDMSound is also popular on Windows XP for its GUI and sound support.

Microsoft Word

Microsoft Word is Microsoft's flagship word processing software. It was first released in
1983 under the name Multi-Tool Word for Xenix systems.[1] Versions were later written
for several other platforms including IBM PCs running DOS (1983), the Apple
Macintosh (1984), SCO UNIX, OS/2 and Microsoft Windows (1989). It is a component
of the Microsoft Office system; however, it is also sold as a standalone product and
included in Microsoft Works Suite. Beginning with the 2003 version, the branding was
revised to emphasize Word's identity as a component within the Office suite: Microsoft
began calling it Microsoft Office Word instead of merely Microsoft Word. The latest
release is Word 2007.

[] History
[] Word 1981 to 1989

Many concepts and ideas of Word were brought from Bravo, the original GUI word
processor developed at Xerox PARC. Bravo's creator Charles Simonyi left PARC to
work for Microsoft in 1981. Simonyi hired Richard Brodie, who had worked with him on
Bravo, away from PARC that summer.[2][3] On February 1, 1983, development on what
was originally named Multi-Tool Word began.

Having renamed it Microsoft Word, Microsoft released the program October 25, 1983,
for the IBM PC. Free demonstration copies of the application were bundled with the
November 1983 issue of PC World, making it the first program to be distributed on-disk
with a magazine.[1] However, it was not well received, and sales lagged behind those of
rival products such as WordPerfect. [citation needed]

Word featured a concept of "What You See Is What You Get", or WYSIWYG, and was
the first application with such features as the ability to display bold and italic text on an
IBM PC.[1] Word made full use of the mouse, which was so unusual at the time that
Microsoft offered a bundled Word-with-Mouse package. Although MS-DOS was a
character-based system, Microsoft Word was the first word processor for the IBM PC
that showed actual line breaks and typeface markups such as bold and italics directly on
the screen while editing, although this was not a true WYSIWYG system because
available displays did not have the resolution to show actual typefaces. Other DOS word
processors, such as WordStar and WordPerfect, used simple text-only display with
markup codes on the screen or sometimes, at the most, alternative colors.[4]

As with most DOS software, each program had its own, often complicated, set of
commands and nomenclature for performing functions that had to be learned. For
example, in Word for MS-DOS, a file would be saved with the sequence Escape-T-S:
pressing Escape called up the menu box, T accessed the set of options for Transfer and S
was for Save (the only similar interface belonged to Microsoft's own Multiplan
spreadsheet). As most secretaries had learned how to use WordPerfect, companies were
reluctant to switch to a rival product that offered few advantages. Features desired for
production typing, such as indentation before typing (emulating the F4 feature in
WordPerfect), the ability to block text for copying before typing rather than picking up
the mouse or blocking after typing, and a reliable way to have macros and other functions
replicate the same behavior time after time, were among the areas where Word fell short.

Word for Macintosh, despite the major differences in look and feel from the DOS
version, was ported by Ken Shapiro with only minor changes from the DOS source
code,[citation needed] which had been written with high-resolution displays and laser printers
in mind, although none were yet available to the general public. Following the introduction of
LisaWrite and MacWrite, Word for Macintosh attempted to add closer WYSIWYG
features into its package. After Word for Mac was released in 1985, it gained wide
acceptance. There was no Word 2.0 for Macintosh; this was the first attempt to
synchronize version numbers across platforms.

The second release of Word for Macintosh, named Word 3.0, was shipped in 1987. It
included numerous internal enhancements and new features but was plagued with bugs.
Within a few months Word 3.0 was superseded by Word 3.01, which was much more
stable. All registered users of 3.0 were mailed free copies of 3.01, making this one of
Microsoft's most expensive mistakes up to that time. Word 4.0 was released in 1989.

[] Word 1990 to 1995

Microsoft Word 5.1a (Macintosh)

The first version of Word for Windows was released in 1989 at a price of 500 US dollars.
With the release of Windows 3.0 the following year, sales began to pick up (Word for
Windows 1.0 was designed for use with Windows 3.0, and its performance was poorer
with the versions of Windows available when it was first released). The failure of
WordPerfect to produce a Windows version proved a fatal mistake. It was version 2.0 of
Word, however, that firmly established Microsoft Word as the market leader.[citation needed]
After MacWrite, Word for Macintosh never had any serious rivals, although programs
such as Nisus Writer provided features such as non-contiguous selection which were not
added until Word 2002 in Office XP. In addition, many users complained that major
updates reliably came more than two years apart, too long for most business users at that
time.

Word 5.1 for the Macintosh, released in 1992, was a popular word processor due to its
elegance, relative ease of use, and feature set. However, version 6.0 for the Macintosh,
released in 1994, was widely derided, unlike the Windows version. It was the first
version of Word based on a common codebase between the Windows and Mac versions;
many accused it of being slow, clumsy and memory intensive. The equivalent Windows
version was also numbered 6.0 to coordinate product naming across platforms, despite
the fact that the previous version was Word for Windows 2.0.

When Microsoft became aware of the Year 2000 problem, it made the DOS port of
Microsoft Word 5.5 available as a free download instead of requiring people to pay for the update. As of
March 2007, it is still available for download from Microsoft's web site.[5]

Microsoft Word 6.0 (Windows 98)

Word 6.0 was the second attempt to develop a common codebase version of Word. The
first, code-named Pyramid, had been an attempt to completely rewrite the existing Word
product. It was abandoned when it was determined that it would take the development
team too long to rewrite and then catch up with all the new capabilities that could have
been added in the same time without a rewrite. Proponents of Pyramid claimed it would
have been faster, smaller, and more stable than the product that was eventually released
for the Macintosh, which was compiled using a beta version of Visual C++ 2.0 targeting
the Macintosh, meaning that many optimizations had to be turned off (version 4.2.1 of
Office was compiled using the final release) and that the included Windows API
simulation library was sometimes used.[1] Pyramid would have been truly cross-platform, with machine-
independent application code and a small mediation layer between the application and the
operating system.

More recent versions of Word for Macintosh are no longer ported versions of Word for
Windows although some code is often appropriated from the Windows version for the
Macintosh version.[citation needed]

Later versions of Word have more capabilities than just word processing. The Drawing
tool allows simple desktop publishing operations such as adding graphics to documents.
Collaboration, document comparison, multilingual support, translation and many other
capabilities have been added over the years.[citation needed]

[] Word 97
Word 97 icon

Word 97 had the same general operating performance as later versions such as Word
2000. This was the first version of Word to feature the "Office Assistant", an animated
helper used in all Office programs.

[] Word 2007

Main article: Microsoft Office 2007

Word 2007 is the most recent version of Word. This release includes numerous changes,
including a new XML-based file format, a redesigned interface, an integrated equation
editor, bibliographic management, and support for structured documents. It also has
contextual tabs, which expose functionality specific to the object that currently has focus,
and many other features such as Live Preview (which shows the effect of a formatting
choice in the document before it is applied), the Mini Toolbar, Super-tooltips, the Quick
Access toolbar, SmartArt, and so on.

[] File formats
Although the familiar ".doc" extension has been used in many different versions of Word,
it actually encompasses four distinct file formats:

1. Word for DOS


2. Word for Windows 1 and 2; Word 4 and 5 for Mac
3. Word 6 and Word 95; Word 6 for Mac
4. Word 97, 2000, 2002, and 2003; Word 98, 2001, and X for Mac

The newer ".docx" extension signifies Office Open XML and is used by Word 2007.

[] Binary formats and handling

Word document formats (.DOC) as of the early 2000s were a de facto standard of
document file formats due to their popularity. Though usually just referred to as "Word
document format", this term refers primarily to the range of formats used by default in
Word versions 2 through 2003. In addition to the default Word binary formats, there are a
number of optional alternate file formats that Microsoft has used over the years. Rich
Text Format (RTF) was an early effort to create a format for interchanging formatted text
between applications. RTF remains an optional format for Word that retains most
formatting and all content of the original document. Later, after HTML appeared, Word
supported an HTML derivative as an additional full-fidelity roundtrip format similar to
RTF, with the additional capability that the file could be viewed in a web browser. Word
2007 uses the new Microsoft Office Open XML format as its default format, but retains
the older Word 97–2003 format as an option. It also supports (for output only) PDF and
XPS format.
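
Because RTF is plain-text markup, it is easy to see what such an interchange file looks
like. The following is a minimal sketch in Python; the control words and the file name are
illustrative only, not a complete description of the RTF specification, but the resulting file
can be opened by Word and most other word processors.

# A minimal sketch: writing a tiny RTF file by hand to illustrate that RTF is
# plain-text markup that Word (and many other word processors) can open.
# The filename and the exact control words used here are illustrative only.
rtf_body = (
    r"{\rtf1\ansi\deff0"                    # RTF version 1, ANSI character set, default font 0
    r"{\fonttbl{\f0 Times New Roman;}}"     # a one-entry font table
    r"\f0\fs24 Hello from an "              # select font 0, 12 pt (\fs is in half-points)
    r"\b RTF\b0  interchange file."         # bold on, bold off
    r"}"
)

with open("sample.rtf", "w", encoding="ascii") as f:
    f.write(rtf_body)

print("Wrote", len(rtf_body), "bytes of RTF markup")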

The document formats of the various versions change in subtle and not-so-subtle ways;
formatting created in newer versions does not always survive when viewed in older
versions of the program, nearly always because that capability does not exist in the
previous version. WordArt also changed drastically in a recent version, causing problems
with documents that used it when moving in either direction. The DOC format's
specifications are not available for public download but can be obtained by writing to
Microsoft directly and signing an agreement.[6]

Microsoft Word 95-2003 implemented OLE (Object Linking and Embedding) structured
storage to manage its file format, easily identifiable by the .doc extension. The container
behaves rather like a conventional hard-drive filesystem and is made up of several key
components. Each Word document is composed of so-called "big blocks", which are
almost always (but do not have to be) 512-byte chunks; hence a Word document's file size
is typically a multiple of 512 bytes. "Storages" are analogues of directories on a disk
drive, and point to other storages or to "streams", which are similar to files on a disk. The
text in a Word document is always contained in the "WordDocument" stream. The first
big block in a Word document, known as the "header" block, provides important
information about the location of the major data structures in the document. "Property
storages" provide metadata about the storages and streams in a .doc file, such as where
each begins, its name, and so forth. The "File information block" records where the text
in a Word document starts and ends, which version of Word created the document, and so
forth. Word documents are therefore far more complex than might initially be expected,
perhaps necessarily, or, as some have argued, in part to discourage third parties from
designing interoperable applications.
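
As a rough illustration of the structure just described (not code taken from Word itself),
the following Python sketch reads the first 512-byte "header" block of a .doc file and
checks the compound-file signature and big-block size. The file name is a placeholder,
and a full parser (or a third-party library such as olefile) would be needed to walk the
storages and streams.

# A minimal sketch (not a full parser): peek at the first 512-byte header block of a
# .doc file and check the OLE compound-file signature and sector size described above.
# "report.doc" is a placeholder filename; offsets follow the published compound-file layout.
import struct

OLE_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # magic bytes of an OLE compound file

with open("report.doc", "rb") as f:
    header = f.read(512)                  # the "header" big block

if header[:8] == OLE_SIGNATURE:
    sector_shift = struct.unpack_from("<H", header, 30)[0]  # usually 9, i.e. 2**9 = 512
    print("OLE compound file, big-block size:", 2 ** sector_shift, "bytes")
else:
    print("Not an OLE compound file (perhaps a .docx or plain text?)")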

People who do not use Microsoft Office sometimes find it difficult to work with Word
documents, and various solutions have been created. Since the format is a de facto
standard, many word processors such as AbiWord or OpenOffice.org Writer need file
import and export filters for Microsoft Word's document file format in order to compete.
Furthermore, there is Apache Jakarta POI, an open-source Java library that aims to read
and write Word's binary file format. Most of this interoperability is achieved through
reverse engineering, since documentation of the Word 1.0-2003 file format, while
available to partners, is not publicly released. The Word 2007 file format, however, is
publicly documented.

For the last ten years Microsoft has also made available freeware viewer programs for
Windows that can read Word documents without a full version of the MS Word software.
[2] Microsoft has also provided converters that enable different versions of Word to
import and export older Word versions and other formats, as well as converters that let
older Word versions read documents created in newer Word formats.[7] Since the release
of Office 2007, the whole Office product range has been covered by the Office Converter
Pack (for Office 97–2003) and the Office Compatibility Pack (for Office 2000–2007).[8]

[] Microsoft Office Open XML


The aforementioned Word format is a binary format. Microsoft has moved towards an
XML-based file format for their office applications with Office 2007: Microsoft Office
Open XML. This format does not conform fully to standard XML. It is, however,
publicly documented as Ecma standard 376. Public documentation of the default file
format is a first for Word, and makes it considerably easier, though not trivial, for
competitors to interoperate. Efforts to establish it as an ISO standard are also underway.
Another XML-based, public file format supported by Word 2003 is WordprocessingML.

It is possible to write plugins permitting Word to read and write formats it does not
natively support.
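
Since the Office Open XML container is an ordinary ZIP archive of XML parts, even the
standard library of a general-purpose language can extract the document text. The
following Python sketch assumes a file named report.docx and reads only the main
word/document.xml part; a robust reader would also handle headers, footers, tables and
other parts.

# A minimal sketch, assuming a Word 2007 file named "report.docx": because the format
# is a ZIP archive of XML parts, the document text can be pulled out with nothing more
# than the standard library. The part name and namespace below are the usual ones for
# WordprocessingML.
import zipfile
import xml.etree.ElementTree as ET

W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

with zipfile.ZipFile("report.docx") as docx:
    xml_bytes = docx.read("word/document.xml")   # the main document part

root = ET.fromstring(xml_bytes)
text = "".join(node.text or "" for node in root.iter(W_NS + "t"))  # w:t elements hold runs of text
print(text[:200])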

[] Features and flaws


[] Normal.dot

Normal.dot is the master template from which all Word documents are created. It is one
of the most important files in Microsoft Word. It determines the margin defaults as well
as the layout of the text and font defaults. Although normal.dot is already set with certain
defaults, the user can change normal.dot to new defaults. This will change other
documents which were created using the template, usually in unexpected ways.

[] Macros

Like other Microsoft Office documents, Word files can include advanced macros and
even embedded programs. The language was originally WordBasic, but changed to
Visual Basic for Applications as of Word 97. Recently .NET has become the preferred
platform for Word programming.
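
The same object model that WordBasic and VBA macros script from inside Word is also
exposed through COM automation and can be driven from outside the application. The
following hedged sketch uses the third-party pywin32 package on Windows with Word
installed; the file path is a placeholder, and the exact behaviour depends on the installed
Word version.

# A hedged sketch of driving Word's automation interface from outside the application,
# using the third-party pywin32 package (win32com). It assumes Windows with Word
# installed; the file path is a placeholder. The same object model (Documents, Content,
# SaveAs) is what WordBasic/VBA macros and .NET add-ins script from inside Word.
import win32com.client

word = win32com.client.Dispatch("Word.Application")  # start (or attach to) Word
word.Visible = False

doc = word.Documents.Add()                 # new blank document based on Normal.dot
doc.Content.Text = "Generated by an automation client, not typed by hand."
doc.SaveAs(r"C:\temp\automation_demo.doc")  # placeholder path
doc.Close()
word.Quit()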

This extensive functionality can also be used to run and propagate viruses in documents.
The tendency for people to exchange Word documents via email, USB key, and floppy
makes this an especially attractive vector. A prominent example is the Melissa worm, but
countless others have existed in the wild. Some anti-virus software can detect and clean
common macro viruses, and firewalls may prevent worms from transmitting themselves
to other systems.

The first virus known to affect Microsoft Word documents was called the Concept virus,
a relatively harmless virus created to demonstrate the possibility of macro virus creation.
[citation needed]

[] Layout issues

As of Word 2007 for Windows (and Word 2004 for Macintosh), the program has been
unable to handle ligatures defined in TrueType fonts: those ligature glyphs with Unicode
codepoints may be inserted manually, but are not recognized by Word for what they are,
breaking spellchecking, while custom ligatures present in the font are not accessible at
all. Other layout deficiencies of Word include the inability to set crop marks or thin
spaces. Various third-party workaround utilities have been developed.[9] Similarly,
combining diacritics are handled poorly: Word 2003 has "improved support", but many
diacritics are still misplaced, even if a precomposed glyph is present in the font.
Additionally, as of Word 2002, Word does automatic font substitution when it finds a
character in a document that does not exist in the font specified. It is impossible to
deactivate this, making it very difficult to spot when a glyph used is missing from the
font in use.
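
The ligature code points mentioned above can be examined with ordinary Unicode
tooling. The short Python sketch below (illustrative only) shows why a manually inserted
ligature such as U+FB01 defeats a spellchecker that does not normalize it: the string no
longer matches the plain-letter spelling until a compatibility normalization such as NFKC
is applied.

# A small sketch using the standard unicodedata module: U+FB01 is the precomposed
# "fi" ligature mentioned above. Unless an application normalizes it (NFKC turns it
# back into the two letters "f" + "i"), a word containing the ligature no longer
# matches its dictionary spelling, which is why manual ligatures break spellchecking.
import unicodedata

word_with_ligature = "de\ufb01ne"                       # "define" typed with the fi ligature
print(unicodedata.name("\ufb01"))                       # LATIN SMALL LIGATURE FI
print(word_with_ligature == "define")                   # False: the strings differ
print(unicodedata.normalize("NFKC", word_with_ligature) == "define")  # True after NFKC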

In Word 2004 for Macintosh, support for complex scripts was inferior even to that of
Word 97, and Word did not support Apple Advanced Typography features such as
ligatures or glyph variants.[3]

[] Bullets and numbering

Users report that Word's bulleting and numbering system is highly problematic.
Particularly troublesome is Word's system for restarting numbering.[10] However, the
Bullets and Numbering system has been significantly overhauled for Office 2007, which
should reduce the severity of these problems.

[] Creating Tables

Users can also create tables in Word. Depending on the version, formulas can also be
computed within table cells.

[] Versions

Microsoft Word 5.5 for DOS

Versions for MS-DOS include:

• 1983 November — Word 1


• 1985 — Word 2
• 1986 — Word 3
• 1987 — Word 4 aka Microsoft Word 4.0 for the PC
• 1989 — Word 5
• 1991 — Word 5.1
• 1991 — Word 5.5
• 1993 — Word 6.0

Versions for the Macintosh (Mac OS and Mac OS X) include:

• 1985 January — Word 1 for the Macintosh


• 1987 — Word 3
• 1989 — Word 4
• 1991 — Word 5
• 1993 — Word 6
• 1998 — Word 98
• 2000 — Word 2001, the last version compatible with Mac OS 9
• 2001 — Word v.X, the first version for Mac OS X only
• 2004 — Word 2004, part of Office 2004 for Mac
• 2008 — Word 2008, part of Office 2008 for Mac

Microsoft Word 1.0 for Windows 3.x

Versions for Microsoft Windows include:

• 1989 November — Word for Windows 1.0 for Windows 2.x, code-named "Opus"
• 1990 March — Word for Windows 1.1 for Windows 3.0, code-named "Bill the
Cat"
• 1990 June — Word for Windows 1.1a for Windows 3.1
• 1991 — Word for Windows 2.0, code-named "Spaceman Spiff"
• 1993 — Word for Windows 6.0, code named "T3" (renumbered "6" to bring
Windows version numbering in line with that of DOS version, Macintosh version
and also WordPerfect, the main competing word processor at the time; also a 32-
bit version for Windows NT only)
• 1995 — Word for Windows 95 (version 7.0) - included in Office 95
• 1997 — Word 97 (version 8.0) included in Office 97
• 1999 — Word 2000 (version 9.0) included in Office 2000
• 2001 — Word 2002 (version 10) included in Office XP

Word 2003 icon

• 2003 — Word 2003 (officially "Microsoft Office Word 2003") - (ver. 11)
included in Office 2003
• 2006 — Word 2007 (officially "Microsoft Office Word 2007") - (ver. 12)
included in Office 2007; released to businesses on November 30, 2006, and
worldwide to consumers on January 30, 2007

Versions for SCO UNIX include:

• Microsoft Word for UNIX Systems Release 5.1

Versions for OS/2 include:

• 1992 Microsoft Word for OS/2 version 1.1B


Microsoft Excel
Microsoft Excel (full name Microsoft Office Excel) is a spreadsheet application written
and distributed by Microsoft for Microsoft Windows and Mac OS. It features calculation
and graphing tools which, along with aggressive marketing, have made Excel one of the
most popular microcomputer applications to date. It is overwhelmingly the dominant
spreadsheet application available for these platforms and has been so since version 5 in
1993 and its bundling as part of Microsoft Office.

[] History
Microsoft originally marketed a spreadsheet program called Multiplan in 1982, which
was very popular on CP/M systems, but on MS-DOS systems it lost popularity to Lotus
1-2-3. This prompted development of a new spreadsheet called Excel, which started with
the intention to, in the words of Doug Klunder, 'do everything 1-2-3 does and do it better'.
The first version of Excel was released for the Mac in 1985, and the first Windows
version (numbered 2.0 to line up with the Mac version and bundled with a run-time
Windows environment) was released in November 1987. Lotus was slow to bring 1-2-3 to
Windows, and by 1988 Excel had started to outsell 1-2-3, helping Microsoft achieve its
position as the leading PC software developer. Dethroning the long-standing market
leader solidified Microsoft as a serious competitor and pointed toward its future in
graphical software. Microsoft pushed its advantage with regular new releases,
every two years or so. The current version for the Windows platform is Excel 12, also
called Microsoft Office Excel 2007. The current version for the Mac OS X platform is
Microsoft Excel 2004.

Microsoft Excel 2.1 included a run-time version of Windows 2.1

Early in its life Excel became the target of a trademark lawsuit by another company
already selling a software package named "Excel" in the finance industry. As a result of
the dispute, Microsoft was required to refer to the program as "Microsoft Excel" in all of
its formal press releases and legal documents. However, over time this practice has been
ignored, and Microsoft cleared up the issue permanently when they purchased the
trademark to the other program. Microsoft also encouraged the use of the letters XL as
shorthand for the program; while this is no longer common, the program's icon on
Windows still consists of a stylized combination of the two letters, and the file extension
of the default Excel format is .xls.

Excel 3.0 logo


Excel offers many user interface tweaks over the earliest electronic spreadsheets;
however, the essence remains the same as in the original spreadsheet, VisiCalc: the cells
are organized in rows and columns, and contain data or formulas with relative or absolute
references to other cells.

Excel was the first spreadsheet that allowed the user to define the appearance of
spreadsheets (fonts, character attributes and cell appearance). It also introduced
intelligent cell recomputation, where only cells dependent on the cell being modified are
updated (previous spreadsheet programs recomputed everything all the time or waited for
a specific user command). Excel has extensive graphing capabilities.

When first bundled into Microsoft Office in 1993, Microsoft Word and Microsoft
PowerPoint had their GUIs redesigned for consistency with Excel, the killer app on the
PC at the time.

Excel 97 logo

Since 1993, Excel has included Visual Basic for Applications (VBA), a programming
language based on Visual Basic which adds the ability to automate tasks in Excel and to
provide user defined functions (UDF) for use in worksheets. VBA is a powerful addition
to the application which, in later versions, includes a fully featured integrated
development environment (IDE). Macro recording can produce VBA code replicating
user actions, thus allowing simple automation of regular tasks. VBA allows the creation
of forms and in-worksheet controls to communicate with the user. The language supports
use (but not creation) of ActiveX (COM) DLLs; later versions added support for class
modules, allowing the use of basic object-oriented programming techniques.

The automation functionality provided by VBA has caused Excel to become a target for
macro viruses. This was a serious problem in the corporate world until antivirus products
began to detect these viruses. Microsoft belatedly took steps to prevent the misuse by
adding the ability to disable macros completely, to enable macros when opening a
workbook or to trust all macros signed using a trusted certificate.

Versions 5.0 to 9.0 of Excel contain various Easter eggs, although since version 10
Microsoft has taken measures to eliminate such undocumented features from their
products.

[] Versions

'Excel 97' (8.0) being run on Windows XP


Microsoft Excel 2003 running under Windows XP Home Edition

Excel 2003 icon

Versions for Microsoft Windows include:

• 1987 Excel 2.0 for Windows


• 1990 Excel 3.0
• 1992 Excel 4.0
• 1993 Excel 5.0 (Office 4.2 & 4.3, also a 32-bit version for Windows NT only on
the PowerPC, DEC Alpha, and MIPS)
• 1995 Excel for Windows 95 (version 7.0) - included in Office 95
• 1997 Excel 97 - (version 8.0) included in Office 97 (x86 and also a DEC Alpha
version)
• 1999 Excel 2000 (version 9.0) included in Office 2000
• 2001 Excel 2002 (version 10) included in Office XP
• 2003 Excel 2003 (version 11) included in Office 2003
• 2007 Excel 2007 (version 12) included in Office 2007
• Notice: There is no Excel 1.0 for Windows, in order to avoid confusion with the
Macintosh versions.
• Notice: There is no Excel 6.0, because the Windows 95 version was launched as
version 7.0 alongside Word 7. All the Office 95 and Office 4.x products have OLE 2
capability - moving data automatically between programs - and the number 7
signals that Excel was contemporary with Word 7.

Versions for the Apple Macintosh include:

• 1985 Excel 1.0


• 1988 Excel 1.5
• 1989 Excel 2.2
• 1990 Excel 3.0
• 1992 Excel 4.0
• 1993 Excel 5.0 (Office 4.X -- Motorola 68000 version and first PowerPC version)
• 1998 Excel 8.0 (Office '98)
• 2000 Excel 9.0 (Office 2001)
• 2001 Excel 10.0 (Office v. X)
• 2004 Excel 11.0 (part of Office 2004 for Mac)
• 2008 Excel 12.0 (part of Office 2008 for Mac)

Versions for OS/2 include:

• 1989 Excel 2.2


• 1991 Excel 3.0

[] File formats
Microsoft Excel used a proprietary binary file format called the Binary Interchange File
Format (BIFF) as its primary format up until the 2007 version[1]. Excel 2007 uses Office
Open XML as its primary file format, an XML-based format that follows an earlier
XML-based format called "XML Spreadsheet" ("XMLSS"), first introduced in Excel
2002[2]. The latter format is not able to encode VBA macros.

Although it supports and encourages the use of the new XML-based formats as
replacements, Excel 2007 remains backward compatible with the traditional binary
formats. In addition, most versions of Microsoft Excel are able to read CSV, DBF,
SYLK, DIF, and other legacy formats.

[] Microsoft Excel 2007 Office Open XML formats

Main article: Office Open XML

Microsoft Excel 2007, along with the other products in the Microsoft Office 2007 suite,
introduces a host of new file formats. These are part of the Office Open XML (OOXML)
specification.

The new Excel 2007 formats are:

Excel Workbook (.xlsx)


The default Excel 2007 workbook format. In reality a ZIP compressed archive
with a directory structure of XML text documents. Functions as the primary
replacement for the former binary .xls format, although it does not support Excel
macros for security reasons.
Excel Macro-enabled Workbook (.xlsm)
As Excel Workbook, but with macro support.
Excel Binary Workbook (.xlsb)
As Excel Macro-enabled Workbook, but storing information in binary form rather
than XML documents for opening and saving documents more quickly and
efficiently. Intended especially for very large documents with tens of thousands of
rows, and/or several hundreds of columns.
Excel Macro-enabled Template (.xltm)
A template document that forms a basis for actual workbooks, with macro
support. The replacement for the old .xlt format.
Excel Add-in (.xlam)
Excel add-in to add extra functionality and tools. Inherent macro support due to
the file purpose.
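
The claim that an .xlsx workbook is really a ZIP archive of XML parts is easy to verify.
The following Python sketch (the file name is a placeholder) simply lists the members of
the container; the output typically includes [Content_Types].xml, xl/workbook.xml and
one xl/worksheets/sheetN.xml part per worksheet.

# A minimal sketch, assuming a workbook named "book.xlsx": listing the archive members
# shows the directory structure of XML parts described above (workbook, worksheets,
# shared strings, content types, and so on).
import zipfile

with zipfile.ZipFile("book.xlsx") as xlsx:
    for name in xlsx.namelist():
        print(name)
# Typical output includes [Content_Types].xml, xl/workbook.xml, xl/worksheets/sheet1.xml, ...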

[] Exporting and Migration of spreadsheets

APIs are also provided to open Excel spreadsheets in a variety of applications and
environments other than Microsoft Excel. These include opening Excel documents on the
web using either ActiveX controls or plugins such as the Adobe Flash Player. Attempts
have also been made to copy Excel spreadsheets to web applications using comma-
separated values (CSV).
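
As a simple illustration of the comma-separated-values route, the sketch below writes and
re-reads a small CSV file with Python's standard csv module; the file name and data are
invented for the example, but the resulting file can be imported by Excel, web
applications and most other spreadsheets.

# A minimal sketch of the comma-separated-values route mentioned above: writing rows
# with the standard csv module produces a file that Excel, web applications, and most
# other spreadsheets can import. The filename and data are illustrative.
import csv

rows = [["Region", "Units", "Revenue"],
        ["North", 120, 3450.00],
        ["South", 95, 2710.50]]

with open("export.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open("export.csv", newline="") as f:
    for record in csv.reader(f):
        print(record)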

[] Criticism
Due to Excel's foundation on floating point calculations, the statistical accuracy of Excel
has been criticized[3][4][5][6], as has the lack of certain statistical tools. Excel proponents
have responded that some of these errors represent edge cases and that the relatively few
users who would be affected by these know of them and have workarounds and
alternatives.[citation needed]

Excel incorrectly assumes that 1900 is a leap year[7][8]. The bug originated in Lotus 1-2-3
and was implemented in Excel for the purpose of backward compatibility[9]. This legacy
has since been carried over into the Office Open XML file format. Excel also supports a
second date system based on a 1904 epoch.
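
The practical effect of the two date systems is easiest to see with a short conversion
sketch (an illustration of the documented behaviour, not Microsoft's own code): in the
1900 system, serial number 60 corresponds to the fictitious 29 February 1900, so serials
from 61 onward must be shifted by one day, while the 1904 system simply counts days
from 1 January 1904.

# A worked sketch of the two date systems described above. In the 1900 system, serial 60
# is the fictitious 29 February 1900 inherited from Lotus 1-2-3, so serials from 61 onward
# are shifted by one day; the 1904 system counts days from 1 January 1904.
from datetime import date, timedelta

def excel_1900_to_date(serial: int) -> date:
    if serial == 60:
        raise ValueError("Serial 60 is 1900-02-29, a day that never existed")
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)

def excel_1904_to_date(serial: int) -> date:
    return date(1904, 1, 1) + timedelta(days=serial)

print(excel_1900_to_date(59))   # 1900-02-28
print(excel_1900_to_date(61))   # 1900-03-01 (the phantom leap day is skipped)
print(excel_1904_to_date(0))    # 1904-01-01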

Microsoft Access

Microsoft Office Access, previously known as Microsoft Access, is a relational database
management system from Microsoft which combines the relational Microsoft Jet
Database Engine with a graphical user interface. It is a member of the 2007 Microsoft
Office system.

Access can use data stored in Access/Jet, Microsoft SQL Server, Oracle, or any ODBC-
compliant data container. Skilled software developers and data architects use it to
develop application software. Relatively unskilled programmers and non-programmer
"power users" can use it to build simple applications. It supports some object-oriented
techniques but falls short of being a fully object-oriented development tool.

Access was also the name of a communications program from Microsoft, meant to
compete with ProComm and other programs. This Access proved a failure and was
dropped.[1] Years later Microsoft reused the name for its database software.

[] History

Access 1.1 manual.

Access version 1.0 was released in November 1992.


Microsoft specified the minimum operating system for Version 2.0 as Microsoft
Windows v3.0 with 4 MB of RAM. 6 MB RAM was recommended along with a
minimum of 8 MB of available hard disk space (14 MB hard disk space recommended).
The product was shipped on seven 1.44 MB diskettes. The manual shows a 1993
copyright date.

The software worked well with very large record sets, but testing showed that some
circumstances caused data corruption. For example, file sizes over 700 MB were
problematic. (Note that most hard disks were smaller than 700 MB at the time this was in
wide use.) The Getting Started manual warns about a number of circumstances where
obsolete device drivers or incorrect configurations can cause data loss.

Access 2.0, running under Windows 95

Access's initial codename was Cirrus, and its forms engine was called Ruby; both were
developed before Visual Basic. Bill Gates saw the prototypes and decided that the BASIC
language component should be co-developed as a separate expandable application, a
project called Thunder. The two projects were developed separately because the
underlying forms engines were incompatible with each other; however, they were merged
together again after VBA.

[] Uses
Access is used by small businesses, within departments of large corporations, and hobby
programmers to create ad hoc customized desktop systems for handling the creation and
manipulation of data. Access can be used as a database for basic web-based applications
hosted on Microsoft's Internet Information Services and utilizing Microsoft Active Server
Pages (ASP). More demanding web applications are usually better served by tools such as
ASP with Microsoft SQL Server or the LAMP stack.

Some professional application developers use Access for rapid application development,
especially for the creation of prototypes and standalone applications that serve as tools
for on-the-road salesmen. Access does not scale well if data access is via a network, so
applications that are used by more than a handful of people tend to rely on Client-Server
based solutions. However, an Access "front end" (the forms, reports, queries and VB
code) can be used against a host of database backends, including JET (file-based database
engine, used in Access by default), Microsoft SQL Server, Oracle, and any other ODBC-
compliant product.

[] Features
One of the benefits of Access from a programmer's perspective is its relative
compatibility with SQL (Structured Query Language): queries may be viewed and edited
as SQL statements, and SQL statements can be used directly in macros and VBA modules
to manipulate Access tables. In this case, "relatively compatible" means that SQL for
Access contains many quirks, and as a result it has been dubbed "Bill's SQL" by industry
insiders. Users may mix VBA and "Macros" when programming forms and logic, and the
combination offers some object-oriented possibilities.
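
Because Access databases are reachable through ODBC, the same SQL can also be
issued from outside Access. The following hedged sketch uses the third-party pyodbc
package on Windows with the Microsoft Access ODBC driver installed; the driver name,
file path and table are placeholders for whatever the local installation and database
actually provide.

# A hedged sketch of running SQL against an Access/Jet database through ODBC, using
# the third-party pyodbc package on Windows. The driver name is the one shipped with
# recent Access installations; the file path and table name are placeholders.
import pyodbc

conn_str = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\contacts.mdb;"
)

conn = pyodbc.connect(conn_str)
cursor = conn.cursor()
cursor.execute("SELECT TOP 5 FirstName, LastName FROM Contacts ORDER BY LastName")
for first, last in cursor.fetchall():
    print(first, last)
conn.close()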

MSDE (Microsoft SQL Server Desktop Engine) 2000, a mini-version of MS SQL Server
2000, is included with the developer edition of Office XP and may be used with Access
as an alternative to the Jet Database Engine.

Unlike a complete RDBMS, the Jet Engine lacks database triggers and stored procedures.
Starting in MS Access 2000 (Jet 4.0), there is a syntax that allows creating queries with
parameters, in a way that looks like creating stored procedures, but these procedures are
limited to one statement per procedure.[1] Microsoft Access does allow forms to contain
code that is triggered as changes are made to the underlying table (as long as the
modifications are done only with that form), and it is common to use pass-through
queries and other techniques in Access to run stored procedures in RDBMSs that support
these.

In ADP files (supported in MS Access 2000 and later), the database-related features are
entirely different, because this type of file connects to a MSDE or Microsoft SQL Server,
instead of using the Jet Engine. Thus, it supports the creation of nearly all objects in the
underlying server (tables with constraints and triggers, views, stored procedures and
UDFs). However, only forms, reports, macros and modules are stored in the ADP file
(the other objects are stored in the back-end database).

[] Development
Access allows relatively quick development because all database tables, queries, forms,
and reports are stored in the database. For query development, Access utilizes the Query
Design Grid, a graphical user interface that allows users to create queries without
knowledge of the SQL programming language. In the Query Design Grid, users can
"show" the source tables of the query and select the fields they want returned by clicking
and dragging them into the grid. Joins can be created by clicking and dragging fields in
tables to fields in other tables. Access allows users to view and manipulate the SQL code
if desired.

Access 97 icon

The programming language available in Access is, as in other products of the Microsoft
Office suite, Microsoft Visual Basic for Applications. Two database access libraries of
COM components are provided: the legacy Data Access Objects (DAO), which was
superseded for a time (but remained accessible) by ActiveX Data Objects (ADO); DAO
has since been reinstated in the latest version, MS Access 2007.

Many developers who use Access use the Leszynski naming convention, though this is
not universal; it is a programming convention, not a DBMS-enforced rule.[2] It is also
made somewhat redundant by the fact that Access categorises each object automatically
and always shows the object type, prefixing Table: or Query: to the object name when
referencing a list of different database objects.

MS Access can be applied to small projects but scales poorly to larger projects involving
multiple concurrent users because it is a desktop application, not a true client-server
database. When a Microsoft Access database is shared by multiple concurrent users,
processing speed suffers. The effect is dramatic when there are more than a few users or
if the processing demands of any of the users are high. Access includes an Upsizing
Wizard that allows users to upsize their database to Microsoft SQL Server if they want to
move to a true client-server database. It is recommended to use Access Data Projects for
most situations.

Since all database queries, forms, and reports are stored in the database, and in keeping
with the ideals of the relational model, there is no possibility of making a physically
structured hierarchy with them.

One recommended technique is to migrate to SQL Server and utilize Access Data
Projects. This allows stored procedures, views, and constraints - which are greatly
superior to anything found in Jet. Additionally this full client-server design significantly
reduces corruption, maintenance and many performance problems.

Access 2003 icon

Access allows no relative paths when linking, so the development environment should
have the same path as the production environment (though it is possible to write a
"dynamic-linker" routine in VBA that can search out a certain back-end file by searching
through the directory tree, if it can't find it in the current path). This technique also allows
the developer to divide the application among different files, so some structure is
possible.

[] Protection
If the database design needs to be secured to prevent changes, Access databases can be
locked/protected (and the source code compiled) by converting the database to an .MDE
file. All changes to the database structure (tables, forms, macros, etc.) need to be made to
the original MDB and then reconverted to MDE.

Some tools are available for unlocking and 'decompiling', although certain elements
including original VBA comments and formatting are normally irretrievable.
[] File extensions
Microsoft Access saves information under the following file extensions:

.mdb - Access Database (2003 and earlier)


.mde - Protected Access Database, with compiled macros (2003 and earlier)
.accdb - Access Database (2007)
.mam - Access Macro
.maq - Access Query
.mar - Access Report
.mat - Access Table
.maf - Access Form
.adp - Access Project
.adn - Access Blank Project Template

Microsoft PowerPoint

Microsoft PowerPoint 2003 running under Windows XP Home Edition

Microsoft PowerPoint is a presentation program developed by Microsoft for its
Microsoft Office system. Microsoft PowerPoint runs on Microsoft Windows and the Mac
OS computer operating systems, although it originally ran under Xenix systems.

It is widely used by business people, educators, students, and trainers and is among the
most prevalent forms of persuasion technology. Beginning with Microsoft Office 2003,
Microsoft revised branding to emphasize PowerPoint's identity as a component within the
Office suite: Microsoft began calling it Microsoft Office PowerPoint instead of merely
Microsoft PowerPoint. The current version of Microsoft Office PowerPoint is Microsoft
Office PowerPoint 2007. As a part of Microsoft Office, Microsoft Office PowerPoint has
become the world's most widely used presentation program.

[] History

The about box for PowerPoint 1.0, with an empty document in the background.
The original Microsoft Office PowerPoint was developed by Bob Gaskins and software
developer Dennis Austin as Presenter for Forethought, Inc; the program was later renamed
PowerPoint[1].

PowerPoint 1.0 was released in 1987 for the Apple Macintosh. It ran in black and white,
generating text-and-graphics pages for overhead transparencies. A new full color version
of PowerPoint shipped a year later after the first color Macintosh came to market.

Microsoft Corporation purchased Forethought and its PowerPoint software product for
$14 million on July 31, 1987.[2] In 1990 the first Windows versions were produced. Since
1990, PowerPoint has been a standard part of the Microsoft Office suite of applications
(except for the Basic Edition).

The 2002 version, part of the Microsoft Office XP Professional suite and also available as
a stand-alone product, provided features such as comparing and merging changes in
presentations, the ability to define animation paths for individual shapes,
pyramid/radial/target and Venn diagrams, multiple slide masters, a "task pane" to view
and select text and objects on the clipboard, password protection for presentations,
automatic "photo album" generation, and the use of "smart tags" allowing people to
quickly select the format of text copied into the presentation.

Microsoft Office PowerPoint 2003 did not differ much from the 2002/XP version. It
enhanced collaboration between co-workers and featured "Package for CD", which
makes it easy to burn presentations with multimedia content and the viewer on CD-ROM
for distribution. It also improved support for graphics and multimedia.

The current version, Microsoft Office PowerPoint 2007, released in November 2006,
brought major changes of the user interface and enhanced graphic capabilities. [3]

[] Operation
In PowerPoint, as in most other presentation software, text, graphics, movies, and other
objects are positioned on individual pages or "slides". The "slide" analogy is a reference
to the slide projector, a device which has become somewhat obsolete due to the use of
PowerPoint and other presentation software. Slides can be printed, or (more often)
displayed on-screen and navigated through at the command of the presenter. Slides can
also form the basis of webcasts.

PowerPoint provides two types of movements. Entrance, emphasis, and exit of elements
on a slide itself are controlled by what PowerPoint calls Custom Animations. Transitions,
on the other hand are movements between slides. These can be animated in a variety of
ways. The overall design of a presentation can be controlled with a master slide; and the
overall structure, extending to the text on each slide, can be edited using a primitive
outliner. Presentations can be saved and run in any of the file formats: the default .ppt
(presentation), .pps (PowerPoint Show) or .pot (template). PowerPoint 2007 introduced the
XML-based file formats .pptx, .ppsx and .potx.
[] Compatibility
As Microsoft Office files are often sent from one computer user to another, arguably the
most important feature of any presentation software—such as Apple's Keynote, or
OpenOffice.org Impress—has become the ability to open Microsoft Office PowerPoint
files. However, because of PowerPoint's ability to embed content from other applications
through OLE, some kinds of presentations become highly tied to the Windows platform,
meaning that even PowerPoint on Mac OS X cannot always successfully open its own
files originating in the Windows version. This has led to a movement towards open
standards, such as PDF and OASIS OpenDocument.

[] Cultural effects
Supporters & critics generally agree[4][5][6] that the ease of use of presentation software can
save a lot of time for people who otherwise would have used other types of visual aid—
hand-drawn or mechanically typeset slides, blackboards or whiteboards, or overhead
projections. Ease of use also encourages those who otherwise would not have used visual
aids, or would not have given a presentation at all, to make presentations. As
PowerPoint's style, animation, and multimedia abilities have become more sophisticated,
and as PowerPoint has become generally easier to produce presentations with (even to the
point of having an "AutoContent Wizard" suggesting a structure for a presentation—
initially started as a joke by the Microsoft engineers but later included as a serious feature
in the 1990s), the difference in needs and desires of presenters and audiences has become
more noticeable.

[] Criticism
One major source of criticism of PowerPoint comes from Yale professor of statistics and
graphic design Edward Tufte, who criticizes many emergent properties of the software:[7]

• It is used to guide and reassure a presenter, rather than to enlighten the audience;
• Unhelpfully simplistic tables and charts, resulting from the low resolution of
computer displays;
• The outliner causing ideas to be arranged in an unnecessarily deep hierarchy,
itself subverted by the need to restate the hierarchy on each slide;
• Enforcement of the audience's linear progression through that hierarchy (whereas
with handouts, readers could browse and relate items at their leisure);
• Poor typography and chart layout, from presenters who are poor designers and
who use poorly designed templates and default settings;
• Simplistic thinking, from ideas being squashed into bulleted lists, and stories with
beginning, middle, and end being turned into a collection of disparate, loosely
disguised points. This may present a kind of image of objectivity and neutrality
that people associate with science, technology, and "bullet points".
Tufte's criticism of the use of PowerPoint has extended to its use by NASA engineers in
the events leading to the Columbia disaster. Tufte's analysis of a representative NASA
PowerPoint slide is included in a full-page sidebar entitled "Engineering by
Viewgraphs"[8] in Volume 1 of the Columbia Accident Investigation Board's report.

[] Versions
Versions for the Mac OS include:

• 1987 PowerPoint 1.0 for Mac OS classic


• 1988 PowerPoint 2.0 for Mac OS classic
• 1992 PowerPoint 3.0 for Mac OS classic
• 1994 PowerPoint 4.0 for Mac OS classic
• 1998 PowerPoint 98 (8.0) for Mac OS classic (Office 1998 for mac)
• 2000 PowerPoint 2001 (9.0) for Mac OS X (Office 2001 for mac)
• 2002 PowerPoint v. X (10.0) for Mac OS X (Office:mac v. X)
• 2004 PowerPoint 2004 (11.0) for Mac OS X (Office:mac 2004)
• 2008 PowerPoint 2008 (12.0) for Mac OS X (Office:mac 2008)

Note: There is no PowerPoint 5.0, 6.0 or 7.0 for the Mac. There is no version 5.0 or 6.0
because the Windows 95 version was launched as version 7.0 alongside Word 7. All of the
Office 95 products have OLE 2 capability - moving data automatically between programs -
and the number 7 shows that PowerPoint was contemporary with Word 7. No version 7.0
was made for the Mac to coincide with either version 7.0 for Windows or PowerPoint 97.[9][10]

Microsoft PowerPoint 4.0 - 2007 Icons (Windows versions)

Versions for Microsoft Windows include:

• 1990 PowerPoint 2.0 for Windows 3.0


• 1992 PowerPoint 3.0 for Windows 3.1
• 1993 PowerPoint 4.0 (Office 4.x)
• 1995 PowerPoint for Windows 95 (version 7.0) — (Office 95)
• 1997 PowerPoint 97 — (Office '97)
• 1999 PowerPoint 2000 (version 9.0) — (Office 2000)
• 2001 PowerPoint 2002 (version 10) — (Office XP)
• 2003 PowerPoint 2003 (version 11) — (Office 2003)
• 2006-2007 PowerPoint 2007 (version 12) — (Office 2007)

Computer software
Computer software, consisting of programs, enables a computer to perform specific
tasks, as opposed to its physical components (hardware) which can only do the tasks they
are mechanically designed for. The term includes application software such as word
processors which perform productive tasks for users, system software such as operating
systems, which interface with hardware to run the necessary services for user-interfaces
and applications, and middleware which controls and co-ordinates distributed systems.

[] Terminology
The term "software" is sometimes used in a broader context to describe any electronic
media content which embodies expressions of ideas such as film, tapes, records, etc.[1]

A screenshot of computer software - AbiWord.

[] Relationship to computer hardware


Main article: Computer hardware

Computer software is so called in contrast to computer hardware, which encompasses the
physical interconnections and devices required to store and execute (or run) the software.
In computers, software is loaded into RAM and executed in the central processing unit.
At the lowest level, software consists of a machine language specific to an individual
processor. A machine language consists of groups of binary values signifying processor
instructions (object code), which change the state of the computer from its preceding
state. Software is an ordered sequence of such instructions for changing the state of the
computer hardware. It is usually written in high-level
programming languages that are easier and more efficient for humans to use (closer to
natural language) than machine language. High-level languages are compiled or
interpreted into machine language object code. Software may also be written in an
assembly language, essentially, a mnemonic representation of a machine language using
a natural language alphabet. Assembly language must be assembled into object code via
an assembler.
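
As an illustration of high-level code being translated into lower-level instructions (an
analogy only: CPython compiles to bytecode for a virtual machine rather than to native
machine code), the standard dis module can list the instructions generated for a small
function.

# An illustrative sketch (an analogy, not native machine code): CPython compiles this
# high-level function into bytecode instructions for its virtual machine, and the
# standard dis module lists them, showing how source statements become low-level
# operations on values.
import dis

def add_tax(price, rate=0.2):
    return price * (1 + rate)

dis.dis(add_tax)   # prints instructions such as LOAD_FAST, a multiply opcode, RETURN_VALUE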

The term "software" was first used in this sense by John W. Tukey in 1958.[2] In computer
science and software engineering, computer software is all computer programs. The
concept of reading different sequences of instructions into the memory of a device to
control computations was invented by Charles Babbage as part of his difference engine.
The theory that is the basis for most modern software was first proposed by Alan Turing
in his 1935 essay Computable numbers with an application to the Entscheidungsproblem.
[3]

[] Types
Practical computer systems divide software systems into three major classes: system
software, programming software and application software, although the distinction is
arbitrary, and often blurred.

• System software helps run the computer hardware and computer system. It
includes operating systems, device drivers, diagnostic tools, servers, windowing
systems, utilities and more. The purpose of systems software is to insulate the
applications programmer as much as possible from the details of the particular
computer complex being used, especially memory and other hardware features,
and such accessory devices as communications, printers, readers, displays,
keyboards, etc.

System software is a generic term referring to any computer software which manages
and controls the hardware so that application software can perform a task. It is an
essential part of the computer system. An operating system is an obvious example, while
an OpenGL or database library are less obvious examples. System software contrasts
with application software, which are programs that help the end-user to perform specific,
productive tasks, such as word processing or image manipulation.

If system software is stored on non-volatile storage such as integrated circuits, it is
usually termed firmware.

Systems software – a set of programs that organise, utilise and control hardware in a
computer system

• Programming software usually provides tools to assist a programmer in writing
computer programs and software using different programming languages in a
more convenient way. The tools include text editors, compilers, interpreters,
linkers, debuggers, and so on. An integrated development environment (IDE)
merges those tools into a software bundle, so a programmer may not need to
type multiple commands for compiling, interpreting, debugging, and tracing,
because the IDE usually has an advanced graphical user interface (GUI).

A programming tool or software tool is a program or application that software
developers use to create, debug, or maintain other programs and applications. The term
usually refers to relatively simple programs that can be combined together to accomplish
a task, much as one might use multiple hand tools to fix a physical object.

[] History
The history of software tools began with the first computers in the early 1950s that used
linkers, loaders, and control programs. Tools became famous with Unix in the early
1970s with tools like grep, awk and make that were meant to be combined flexibly with
pipes. The term "software tools" came from the book of the same name by Brian
Kernighan and P. J. Plauger.

Tools were originally simple and light weight. As some tools have been maintained, they
have been integrated into more powerful integrated development environments (IDEs).
These environments consolidate functionality into one place, sometimes increasing
simplicity and productivity, other times sacrificing flexibility and extensibility. The
workflow of IDEs is routinely contrasted with alternative approaches, such as the use of
Unix shell tools with text editors like Vim and Emacs.

The distinction between tools and applications is murky. For example, developers use
simple databases (such as a file containing list of important values) all the time as tools.
However a full-blown database is usually thought of as an application in its own right.

For many years, computer-assisted software engineering (CASE) tools were sought after.
Successful tools have proven elusive. In one sense, CASE tools emphasized design and
architecture support, such as for UML. But the most successful of these tools are IDEs.

The ability to use a variety of tools productively is one hallmark of a skilled software
engineer.

[] List of tools
Software tools come in many forms:

• Revision control: Bazaar, Bitkeeper, Bonsai, ClearCase, CVS, Git, GNU arch,
Mercurial, Monotone, PVCS, RCS, SCM, SCCS, SourceSafe, SVN, LibreSource
Synchronizer
• Interface generators: Swig
• Build Tools: Make, automake, Apache Ant, SCons, Rake
• Compilation and linking tools: GNU toolchain, gcc, Microsoft Visual Studio,
CodeWarrior, Xcode, ICC
• Static code analysis: lint, Splint
• Search: grep, find
• Text editors: emacs, vi
• Scripting languages: Awk, Perl, Python, REXX, Ruby, Shell, Tcl
• Parser generators: Lex, Yacc, Parsec
• Bug Databases: gnats, Bugzilla, Trac, Atlassian Jira, LibreSource
• Debuggers: gdb, GNU Binutils, valgrind
• Memory Leaks/Corruptions Detection: dmalloc, Electric Fence, duma, Insure++
• Memory use: Aard
• Code coverage: GCT, CCover
• Source-Code Clones/Duplications Finding: CCFinderX
• Refactoring Browser
• Code Sharing Sites: Freshmeat, Krugle, Sourceforge, ByteMyCode, UCodit
• Source code generation tools
• Documentation generators: Doxygen, help2man, POD, Javadoc, Pydoc/Epydoc

Debugging tools are used in the process of debugging code, and can also help produce
code that is more standards-compliant and portable than it would otherwise be.

Memory leak detection: in the C programming language, for instance, memory leaks are
not easily detected, so software tools called memory debuggers are often used to find
them, enabling the programmer to locate these problems much more efficiently than by
inspection alone.
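
The memory debuggers listed above target C and C++, but the idea can be sketched in
Python with the standard tracemalloc module, which records where allocations were
made so that memory that keeps growing can be traced back to a source line; the leaky
list here is deliberately contrived.

# A rough Python analogue of the memory debuggers mentioned above (which target C):
# the standard tracemalloc module records where allocations were made, so growth that
# is never released can be traced back to a source line.
import tracemalloc

tracemalloc.start()

leaky_cache = []
for i in range(100_000):
    leaky_cache.append("record-%d" % i)   # grows forever; nothing ever removes entries

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)                            # top allocation sites by size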

[] IDEs
Integrated development environments (IDEs) combine the features of many tools into one
complete package. They are usually simpler and make it easier to do simple tasks, such as
searching for content only in files in a particular project.

IDEs are often used for development of enterprise-level applications.

Some examples of IDEs are:

• Delphi
• C++ Builder
• Microsoft Visual Studio
• Xcode
• Eclipse
• NetBeans
• IntelliJ IDEA
• WinDev

• Application software allows end users to accomplish one or more specific (non-
computer related) tasks. Typical applications include industrial automation,
business software, educational software, medical software, databases, and
computer games. Businesses are probably the biggest users of application
software, but almost every field of human activity now uses some form of
application software. It is used to automate all sorts of functions.

Application software is a subclass of computer software that applies the capabilities of a
computer directly and thoroughly to a task that the user wishes to perform. This should
be contrasted with system software, which is involved in integrating a computer's various
capabilities but typically does not directly apply them in the performance of tasks that
benefit the user. In this context the term application refers to both the application
software and its implementation.

A simple, if imperfect, analogy in the world of hardware would be the relationship of an
electric light (an application) to an electric power generation plant (a system). The power
plant merely generates electricity, which is not itself of any real use until harnessed to an
application like the electric light that performs a service that the user desires.

The exact delineation between the operating system and application software is not
precise, however, and is occasionally subject to controversy. For example, one of the key
questions in the United States v. Microsoft antitrust trial was whether Microsoft's Internet
Explorer web browser was part of its Windows operating system or a separable piece of
application software. As another example, the GNU/Linux naming controversy is, in part,
due to disagreement about the relationship between the Linux kernel and the Linux
operating system.

Typical examples of software applications are word processors, spreadsheets, and media
players.

Multiple applications bundled together as a package are sometimes referred to as an
application suite. Microsoft Office and OpenOffice.org, which bundle together a word
processor, a spreadsheet, and several other discrete applications, are typical examples.
The separate applications in a suite usually have a user interface that has some
commonality making it easier for the user to learn and use each application. And often
they may have some capability to interact with each other in ways beneficial to the user.
For example, a spreadsheet might be able to be embedded in a word processor document
even though it had been created in the separate spreadsheet application.

User-written software tailors systems to meet the user's specific needs. User-written
software includes spreadsheet templates, word processor macros, scientific simulations,
and graphics and animation scripts. Even email filters are a kind of user software. Users
create this software themselves and often overlook how important it is.

In some types of embedded systems, the application software and the operating system
software may be indistinguishable to the user, as in the case of software used to control a
VCR, DVD player or microwave oven.

OpenOffice.org is a well-known example of application software


[] Application software classification


There are many subtypes of application software:
• Enterprise software addresses the needs of organization processes and data flow,
often in a large distributed ecosystem. (Examples include Financial, Customer
Relationship Management, and Supply Chain Management). Note that
Departmental Software is a sub-type of Enterprise Software with a focus on
smaller organizations or groups within a large organization. (Examples include
Travel Expense Management, and IT Helpdesk)
• Enterprise infrastructure software provides common capabilities needed to create
Enterprise Software systems. (Examples include Databases, Email servers, and
Network and Security Management)
• Information worker software addresses the needs of individuals to create and
manage information, often for individual projects within a department, in contrast
to enterprise management. Examples include time management, resource
management, documentation, analytical, and collaborative tools. Word
processors, spreadsheets, email and blog clients, personal information system, and
individual media editors may aid in multiple information worker tasks.
• Media and entertainment software addresses the needs of individuals and groups
to consume digital entertainment and published digital content. (Examples
include Media Players, Web Browsers, Help browsers, and Games)
• Educational software is related to Media and Entertainment Software, but has
distinct requirements for delivering evaluations (tests) and tracking progress
through material. It is also related to collaboration software in that many
Educational Software systems include collaborative capabilities.
• Media development software addresses the needs of individuals who generate
print and electronic media for others to consume, most often in a commercial or
educational setting. This includes Graphic Art software, Desktop Publishing
software, Multimedia Development software, HTML editors, Digital Animation
editors, Digital Audio and Video composition, and many others.
• Product engineering software is used in developing hardware and software
products. This includes computer aided design (CAD), computer aided
engineering (CAE), computer language editing and compiling tools, Integrated
Development Environments, and Application Programmer Interfaces.

[] Program and library


A program may not be sufficiently complete for execution by a computer. In particular, it
may require additional software from a software library in order to be complete. Such a
library may include software components used by stand-alone programs, but which
cannot work on their own. Thus, programs may include standard routines that are
common to many programs, extracted from these libraries. Libraries may also include
'stand-alone' programs which are activated by some computer event and/or perform some
function (e.g., of computer 'housekeeping') but do not return data to their calling program.
Programs may be called by one to many other programs; programs may call zero to many
other programs.

[] Three layers

Starting in the 1980s, application software has been sold in mass-produced packages
through retailers.
See also: Software architecture

Users often see things differently than programmers. People who use modern general
purpose computers (as opposed to embedded systems, analog computers,
supercomputers, etc.) usually see three layers of software performing a variety of tasks:
platform, application, and user software.

Platform software
Platform includes the firmware, device drivers, an operating system, and typically
a graphical user interface which, in total, allow a user to interact with the
computer and its peripherals (associated equipment). Platform software often
comes bundled with the computer. On a PC you will usually have the ability to
change the platform software.
Application software
Application software or Applications are what most people think of when they
think of software. Typical examples include office suites and video games.
Application software is often purchased separately from computer hardware.
Sometimes applications are bundled with the computer, but that does not change
the fact that they run as independent applications. Applications are almost always
independent programs from the operating system, though they are often tailored
for specific platforms. Most users think of compilers, databases, and other
"system software" as applications.
User-written software
User software tailors systems to meet the user's specific needs. User software
includes spreadsheet templates, word processor macros, scientific simulations, and
scripts for graphics and animations. Even email filters are a kind of user software.
Users create this software themselves and often overlook how important it is.
Depending on how competently the user-written software has been integrated into
purchased application packages, many users may not be aware of the distinction
between the purchased packages, and what has been added by fellow co-workers.

[] Creation
Main article: Computer programming

[] Operation
Computer software has to be "loaded" into the computer's storage (such as a hard drive,
memory, or RAM). Once the software is loaded, the computer is able to execute the
software. Computers operate by executing the computer program. This involves passing
instructions from the application software, through the system software, to the hardware
which ultimately receives the instruction as machine code. Each instruction causes the
computer to carry out an operation -- moving data, carrying out a computation, or altering
the control flow of instructions.

Data movement is typically from one place in memory to another. Sometimes it involves
moving data between memory and registers which enable high-speed data access in the
CPU. Moving data, especially large amounts of it, can be costly. So, this is sometimes
avoided by using "pointers" to data instead. Computations include simple operations such
as incrementing the value of a variable data element. More complex computations may
involve many operations and data elements together.

Instructions may be performed sequentially, conditionally, or iteratively. Sequential
instructions are those operations that are performed one after another. Conditional
instructions are performed such that different sets of instructions execute depending on
the value(s) of some data. In some languages this is known as an "if" statement. Iterative
instructions are performed repetitively and may depend on some data value. This is
sometimes called a "loop." Often, one instruction may "call" another set of instructions
that are defined in some other program or module. When more than one computer
processor is used, instructions may be executed simultaneously.
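
The kinds of instructions just described can be seen together in a few lines of Python:
sequential statements, a conditional ("if"), an iterative loop, a simple computation that
increments a variable, and a call into another routine.

# A tiny sketch showing the kinds of instructions just described: sequential statements,
# a conditional ("if"), an iterative loop, and a call into another routine.
def describe(n):                    # a routine that other code can "call"
    return "even" if n % 2 == 0 else "odd"

total = 0                           # sequential: executed one after another
values = [3, 7, 10]

for v in values:                    # iterative: repeats once per data element
    if v > 5:                       # conditional: different instructions depending on data
        total += v                  # a simple computation: incrementing a variable
    print(v, "is", describe(v))     # calling another set of instructions

print("total of values over 5:", total)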

A simple example of the way software operates is what happens when a user selects an
entry such as "Copy" from a menu. In this case, a conditional instruction is executed to
copy text from data in a 'document' area residing in memory, perhaps to an intermediate
storage area known as a 'clipboard' data area. If a different menu entry such as "Paste" is
chosen, the software may execute the instructions to copy the text from the clipboard data
area to a specific location in the same or another document in memory.

Depending on the application, even the example above could become complicated. The
field of software engineering endeavors to manage the complexity of how software
operates. This is especially true for software that operates in the context of a large or
powerful computer system.

Currently, almost the only limitation on the use of computer software in applications is
the ingenuity of the designer/programmer. Consequently, large areas of activity (such as
playing grand master level chess) formerly assumed to be incapable of software
simulation are now routinely programmed. The only area that has so far proved
reasonably secure from software simulation is the realm of human art— especially,
pleasing music and literature.[citation needed]

Kinds of software by operation: computer program as executable, source code or script,
configuration.
[] Quality and reliability
Software reliability considers the errors, faults, and failures related to the creation and
operation of software.

See Software auditing, Software quality, Software testing, and Software reliability.

[] License
A software license gives the user the right to use the software in the licensed
environment. Some software comes with the license when purchased off the shelf, or
with an OEM license when bundled with hardware. Other software comes with a free
software licence, granting the recipient the rights to modify and redistribute the software.
Software can also be in the form of freeware or shareware. See also license management.

[] Patents
The issue of software patents is controversial. Some believe that they hinder software
development, while others argue that software patents provide an important incentive to
spur software innovation. See software patent debate.

[] Ethics and rights for software users


Because software is a relatively new part of society, ideas about what rights software
users should have are not very developed. Some, such as the free software community,
believe that software users
should be free to modify and redistribute the software they use. They argue that these
rights are necessary so that each individual can control their computer, and so that
everyone can cooperate, if they choose, to work together as a community and control the
direction that software progresses in. Others believe that software authors should have the
power to say what rights the user will get.

The former philosophy is somewhat derived from the "hacker ethic" that was common in
the 1960s and 1970s.
