1. Fundamentals and Introduction to XHTML - 6 (15)

Introduction to Internet

Origin:

The history of the Internet begins with the development of electronic computers in
the 1950s. Initial concepts of wide area networking originated in several computer science
laboratories in the US, UK and France independently.

The main requirement of networking was robustness: even if one of the computers in
the network fails, the network should continue to work with no data loss.

The Department of Defense (DoD) established ARPA, the Advanced Research Projects
Agency (later renamed DARPA, the Defense Advanced Research Projects Agency), in February
1958. In the 1960s it planned and developed a large-scale computer network in order to
accelerate information transfer (i.e. access to remote computers and file sharing) and to
avoid duplication of existing information. This network was called ARPANET. The ARPANET
project began in 1966, and the early network used NCP (Network Control Protocol); its
limitations were later overcome by the development of TCP (Transmission Control Protocol),
which added verification of file transfers.

However, ARPANET was limited to the laboratories funded by ARPA, and hence other
networking protocols were developed by different universities and organizations for their
own use.

In 1986 the National Science Foundation (NSF) created a national network, NSFNET.
Initially it connected the NSF-funded supercomputer centers at five universities. By 1990
it had replaced ARPANET for most nonmilitary uses.

In 1995 a small part of NSFNET returned to being a research network, and the rest
became known as the Internet, although this term had been used much earlier for both
ARPANET and NSFNET.

The Internet protocol suite (TCP/IP) was developed (by Robert E. Kahn and Vint Cerf)
in the 1970s and became the standard networking protocol on the ARPANET, incorporating
concepts from the French CYCLADES project directed by Louis Pouzin. In the early 1980s the
NSF funded the establishment of national supercomputing centers at several universities,
and provided interconnectivity in 1986 with the NSFNET project, which also created network
access to the supercomputer sites in the United States from research and education
organizations.

Commercial Internet service providers (ISPs) began to emerge in the very late 1980s.
The ARPANET was decommissioned in 1990. Limited private connections to parts of the
Internet by officially commercial entities emerged in several American cities by late 1989 and
1990, and the NSFNET was decommissioned in 1995, removing the last restrictions on the use
of the Internet to carry commercial traffic.

Page 1 of 14 R.N.S Rural Polytechnic, Murudeshwar

In the 1980s, research at CERN in Switzerland by British computer scientist Tim
Berners-Lee resulted in the World Wide Web, linking hypertext documents into an
information system, accessible from any node on the network. Since the mid-1990s, the
Internet has had a revolutionary impact on culture, commerce, and technology, including the
rise of near-instant communication by electronic mail, instant messaging, voice over Internet
Protocol (VoIP) telephone calls, two-way interactive video calls, and the World Wide Web
with its discussion forums, blogs, social networking, and online shopping sites.

The Internet is a global computer network providing a variety of information and
communication facilities, consisting of interconnected networks using standardized
communication protocols. It is a huge collection of computer networks around the world
and is also referred to as a “network of networks”. For a computer to have Internet service,
it must be connected to an Internet service provider (ISP).

Internet Protocol (IP) Address: Every device on the Internet is assigned a unique IP
address for identification and location definition. IPv4, the version most widely used so far,
uses 32-bit addresses, which may be represented in any notation expressing a 32-bit integer
value. They are most often written in dot-decimal notation, which consists of the four octets
of the address expressed individually in decimal numbers and separated by periods. With the
rapid growth of the Internet, the Internet Engineering Task Force (IETF) had formalized the
successor protocol, IPv6, by 1998. The main advantage of IPv6 over IPv4 is its larger address
space: an IPv6 address is 128 bits long, compared with 32 bits in IPv4, theoretically allowing
2^128, or approximately 3.4×10^38, addresses.

Organizations are assigned blocks of IP addresses, which they in turn assign to their
machines that need Internet access. For example, a small organization may be assigned a
block of 256 addresses, such as 191.57.126.0 to 191.57.126.255.
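
The addressing ideas above can be tried out with a short Python sketch. It is only an illustration: the helper name dotted_to_int and the /24 prefix notation for the example block are assumptions introduced here, not part of the notes.

```python
import ipaddress


def dotted_to_int(dotted):
    """Convert dot-decimal IPv4 text to its 32-bit integer value."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)  # shift in one octet (8 bits) at a time
    return value


# The standard library agrees with the manual conversion.
address = ipaddress.IPv4Address("191.57.126.1")
print(int(address) == dotted_to_int("191.57.126.1"))  # True

# The example block from the text, written here as a /24 network (assumed notation).
block = ipaddress.ip_network("191.57.126.0/24")
print(block.num_addresses)   # 256
print(block[0], block[-1])   # 191.57.126.0 191.57.126.255

# Size of the IPv6 address space, roughly 3.4 x 10^38.
print(2 ** 128)
```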

Domain Names

Every computer or device in a network is assigned a unique numerical address called
an IP address, which is difficult for humans to remember. To avoid this difficulty, machines
are also given unique names that are easier to recognize and memorize. Hence the concept
of domain names came up. Every website has a domain name as well as an IP address, and
the main purpose of the DNS (Domain Name System) is to translate domain names to IP
addresses.


The Domain Name System (DNS) is a hierarchical decentralized naming system for
computers, services, or other resources connected to the Internet or a private network. It
associates various information with domain names assigned to each of the participating
entities.

The DNS is a worldwide network of servers that collectively forms a global database
of domain names and IP addresses. The hierarchy is implemented by DNS servers, each
responsible for part of the name space.

There are several top-level domains, such as com, org, gov, edu, net and so on. Each
domain is further divided into subdomains, and so on.

[Figure: domain name space]

root
   gov
   com
      yahoo
      google
      amazon
   org
   edu
   in
      ac
         tec
            cse
            ece
            civil
         pub

For example, the complete path of http://www.cse.tec.ac.in can be uniquely traced
out with the help of the domain name space.
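
A small Python sketch can show how a host name is traced through the domain name space, from the top-level domain down to the host. The function name domain_path is hypothetical, and the sketch only splits the name; it does not contact any DNS server.

```python
from urllib.parse import urlparse


def domain_path(url):
    """List the domains visited when walking from the top-level domain down to the host."""
    host = urlparse(url).hostname      # "www.cse.tec.ac.in"
    labels = host.split(".")           # ["www", "cse", "tec", "ac", "in"]
    # Build successively longer suffixes: in, ac.in, tec.ac.in, ...
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


print(domain_path("http://www.cse.tec.ac.in"))
# ['in', 'ac.in', 'tec.ac.in', 'cse.tec.ac.in', 'www.cse.tec.ac.in']
```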

World Wide Web (WWW): The World Wide Web, or simply the "Web", is a global
information medium/system which users can read and write via computers connected to the
Internet. The term is often mistakenly used as a synonym for the Internet itself, but the Web
is a service that operates over the Internet. The WWW consists of a collection of software
and corresponding protocols used to access resources over the network. It is an information
system in which documents containing information are interlinked.

Origin: The concept of the WWW was invented by English scientist Tim Berners-Lee in
1989 at the European Organization for Nuclear Research (CERN). He built a personal
database of people and software models and used hypertext so that each new page of
information was linked to an existing page. From this work he introduced the Hyper Text
Transfer Protocol (HTTP), the Hyper Text Markup Language (HTML) and the web browser. He
wrote the first web browser in 1990 while employed at CERN in Switzerland.

Sir Timothy John Berners-Lee, also known as TimBL, is an English engineer and computer scientist. He has held
professorships in computer science at the University of Oxford and at the Massachusetts Institute of Technology (MIT).

Overview of TCP/IP protocol suite (for reference only)

OSI Model Layers      TCP/IP Protocol Architecture Layers   TCP/IP Protocol Suite
--------------------  ------------------------------------  ---------------------------------------
Application Layer     Application Layer                     HTTP, FTP, SMTP, TELNET, DNS
Presentation Layer
Session Layer
Transport Layer       Host-to-Host Transport Layer          TCP, UDP
Network Layer         Internet Layer                        IP, IGMP, ICMP, ARP
Data-link Layer       Network Interface Layer               Ethernet, Token Ring, Frame Relay, ATM
Physical Layer

Internet: is the global system of interconnected computer networks that use the
Internet protocol suite (TCP/IP) to link devices worldwide. It is a network of networks that
consists of private, public, academic, business, and government networks of local to global
scope, linked by a broad array of electronic, wireless, and optical networking technologies.

Web Browsers: A web browser, commonly referred to as a browser, is a software
application running on the client machine for accessing information on the World Wide Web.
Each individual web page, image, and video is identified by a distinct URL, enabling browsers
to retrieve and display them on the user's device.


The main function of a browser is to present the web resource you choose, by
requesting it from the server with the help of HTTP and displaying it in the browser window.
The resource is usually an HTML document, but may also be a PDF, image, or some other type
of content. The location of the resource is specified by the user using a URI (Uniform
Resource Identifier).

Most commonly used browsers are Google Chrome, Mozilla Firefox, Opera, Safari etc.

Web Server: A web server is a program running on a remote computer that uses HTTP
(Hypertext Transfer Protocol) to serve web pages in response to clients' requests. Web
servers act only when requests are made to them by browsers running on other computers
on the Internet.

Web Server Operations: web server and client communicate with HTTP.

The web client–server model is a distributed application structure that partitions tasks
or workloads between the providers of a resource or service, called servers, and service
requesters, called clients. Often clients and servers communicate over a computer network
on separate hardware, but both client and server may reside in the same system. A server
host runs one or more server programs which share their resources with clients. A client does
not share any of its resources, but requests a server's content or service function.

General Operation Steps involved in web server and client

1. When the web server starts its execution, it informs the operating system that it
is ready to accept requests on a specific port.
2. The web client opens a network connection to the web server and sends a request.
The web server listens for client requests on that port.
3. The primary task of the web server is to monitor communication on the specific
port (depending on the type of service: HTTP on 80 or 8000, FTP on 20 and 21, and
so on), accept HTTP commands through that port and perform the operations
specified by these commands.
4. When the server receives a request from the client, it translates it into either a
file name or a program name. If it is a file name, the requested file is transferred
to the client. If it is a program name, the program is executed and its output is
sent to the client.
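
The steps above can be sketched with Python's standard http.server and http.client modules. This is a toy illustration under stated assumptions, not how a production server such as Apache works: the class DemoHandler, the function fetch_once and the /run/ prefix that marks a "program name" are all hypothetical, and a canned page stands in for a file on disk.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/run/"):
            # "Program name": run some code and send its output.
            body = b"output generated by a program\n"
        else:
            # "File name": a canned document stands in for a file on disk.
            body = b"<html><body>static page</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(path):
    """Start a server on a free local port, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)   # step 1: ready on a port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)  # step 2
        client.request("GET", path)
        response = client.getresponse()                   # steps 3 and 4 happen in the server
        result = (response.status, response.read())
        client.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


print(fetch_once("/index.html"))
print(fetch_once("/run/report"))
```

Port 0 asks the operating system for any free port, which avoids needing permission to bind port 80 in a classroom setting.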


General Server Characteristics

Web servers generally share common characteristics. The file structure of a web
server has two separate directories. The root of one of these is called the document root. The
file hierarchy that grows from the document root stores the web documents to which the
server has direct access. The root of the other directory is called the server root; it stores
the server and its support software.

Example

Suppose the site name is www.mysite.com and the document root is named mystuffs
and is stored in the /admin/web directory. The document root is then /admin/web/mystuffs.

If the request URL is

http://www.mysite.com/ebooks/web-prog.html

the server will search for the file with the path

/admin/web/mystuffs/ebooks/web-prog.html
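
The mapping in this example can be written as a few lines of Python. The function name map_url_to_file is hypothetical, and real servers apply further configuration (aliases, access rules) that this sketch leaves out.

```python
import posixpath
from urllib.parse import urlparse


def map_url_to_file(document_root, url):
    """Translate the path part of a URL into a file path under the document root."""
    url_path = urlparse(url).path                          # "/ebooks/web-prog.html"
    relative = posixpath.normpath(url_path).lstrip("/")    # drop the leading slash
    return posixpath.join(document_root, relative)


print(map_url_to_file("/admin/web/mystuffs",
                      "http://www.mysite.com/ebooks/web-prog.html"))
# /admin/web/mystuffs/ebooks/web-prog.html
```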

• The files stored directly in the document root directory are available to clients
through the top-level URL. Clients normally do not access the document root directly.
• Many servers can also store documents that are readable by clients outside the
document root. The secondary areas from which the server can serve documents to
its clients are called virtual document trees.
• If documents are stored in subdirectories, the client refers to them using a URL that
includes the file path to that directory from the document root.
• Some servers can serve web documents that are in the document root of another
machine; such servers are called proxy servers.
• Web servers may support protocols other than HTTP, such as FTP, Gopher, News and
mail.
• Web servers can interact with database systems through Common Gateway Interface
(CGI) programs.

Apache: The Apache HTTP Server, simply called Apache, is a free and open-source
cross-platform web server, developed by the Apache Software Foundation and released
under the terms of the Apache License 2.0. It is a widely used web server because it is fast,
reliable and efficient. It has been reported to run on about 67% of all web servers, although
the share varies by survey and year.

The Apache HTTP Server Project is an effort to develop and maintain an open-source HTTP server
for modern operating systems including UNIX and Windows. The goal of this project is to provide a
secure, efficient and extensible server that provides HTTP services in sync with the current HTTP
standards. The Apache HTTP Server ("httpd") was launched in 1995 and it has been the most popular
web server on the Internet since April 1996. Find more details @ https://httpd.apache.org/.

Internet Information Services (IIS): IIS is an extensible web server created by
Microsoft for use with the Windows NT family; it runs only on Microsoft Windows. IIS
supports HTTP, HTTP/2, HTTPS, FTP, FTPS, SMTP and NNTP. Web sites programmed in classic
ASP.NET are normally hosted on IIS. IIS is a proprietary server.

Comparison between Apache and IIS

The Apache Software Foundation develops and provides open-source software,
including software for running web servers. Its primary product is the Apache HTTP Server,
which is the most popular HTTP server in use today. IIS, or Internet Information Services, is
the software package developed by Microsoft to give its Windows operating system the
ability to host Internet services. IIS is second only to Apache as the most used HTTP server
in the world.

1. Apache is free, while IIS is packaged with Windows.
2. IIS runs only on Windows, while Apache can run on almost any OS, including UNIX, Apple’s
OS X, and most Linux distributions.
3. ASPX (ASP.NET) pages are normally served by IIS.
4. IIS is backed by dedicated vendor support staff, while support for Apache comes mainly
from the community itself.
5. IIS is optimized for Windows because both come from the same company.
6. Because IIS is tied to Windows, it inherits the security risks of the Windows OS.

Uniform Resource Locator (URL): A URL is used to specify addresses on the World
Wide Web. It is the fundamental network identification for any resource connected to the
web (e.g., hypertext pages, images, and sound files). Different types of resources are
identified by different protocols. The protocol specifies how information from the link is
transferred.

The general format of a URL is

scheme:object-address

The scheme is often a communication protocol. Common schemes include http, ftp,
gopher, telnet, file, mailto, news, etc. Different schemes use object addresses that have
different forms. A common form is

protocol://user@hostname/path/filename


The most commonly used protocol for web browser and web server communication is
HTTP. This protocol is based on request-response mechanism. In case of http the form of
object address of a URL is http://fully-qualified-domain-name/path-to-document/filename.

File is another commonly used scheme in URLs. It refers to a document that resides on
the client’s machine, that is, the machine on which the browser making the request is
running. With the file scheme, the address part of the URL is file://path-to-document

URL paths: The path to a web document is similar to the path to a file in a folder. In
this path the directory names and the file name are separated by slashes. On UNIX servers
the file path is written with forward slashes and on Windows servers with backward slashes;
in the URL itself, forward slashes are used.

http://www.myweb.com/mydocs/index.html

A path that includes all the directories along the way is called a complete path.
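
As a quick illustration of the scheme:object-address format, Python's standard urllib.parse module can split a URL into the parts named above. The wrapper function split_url is a hypothetical helper added for this sketch.

```python
from urllib.parse import urlparse


def split_url(url):
    """Return the scheme, user, host and path of a URL as a dictionary."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,    # the protocol, e.g. "http"
        "user": parts.username,    # None when the URL has no user@ part
        "host": parts.hostname,    # fully qualified domain name
        "path": parts.path,        # path to the document, including the file name
    }


print(split_url("http://www.myweb.com/mydocs/index.html"))
print(split_url("ftp://user@hostname/path/filename"))
```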

Multipurpose Internet Mail Extensions (MIME): A browser needs some way of
determining the format of a document it receives from a web server. The MIME type is a
standardized way to indicate the nature and format of a document. MIME is a very flexible
format, permitting virtually any type of file to be included in an HTML document or in an
email.

When the browser receives the document from the web server it uses the included
MIME type (and not the file extension) to determine what to do with the document / how to
process a document. It is therefore important that servers are set up correctly to attach the
correct MIME type to the header of the response object.

MIME was designed mainly for email (SMTP), but the content types defined by the
MIME standards are also important in communication protocols outside of email, such as
HTTP for the World Wide Web. Servers insert the MIME header at the beginning of any Web
transmission. Clients use this content type or media type header to select an appropriate
viewer application for the type of data the header indicates. Some of these viewers are built
into the Web client or browser.

Type specifications: A web server attaches a MIME format specification to the
beginning of the document that it is about to provide to a browser.

MIME messages can contain text, images, audio, video or other application data. The
content specification consists of two levels, type and subtype, written as type/subtype. The
type denotes the general category of the content, such as text, image or video; the subtype
denotes the specific format, such as html, gif, jpeg or mpeg.


Types     Subtypes
-----     ---------------
text      plain, html
video     mpeg, quicktime
image     gif, jpeg

Servers determine the type of a document by using the filename extension as the key
into a table of types. For example, the extension .html tells the server that it should attach
text/html to the document before sending it to the requesting browser.
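
The extension lookup described above can be sketched as a small table in Python. The table MIME_TABLE, the function content_type_for and the fallback type application/octet-stream are illustrative assumptions; real servers use much larger, configurable tables.

```python
import os

# A tiny, hypothetical extension-to-MIME-type table.
MIME_TABLE = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpeg": "image/jpeg",
    ".mpeg": "video/mpeg",
    ".mov": "video/quicktime",
}


def content_type_for(filename, default="application/octet-stream"):
    """Use the filename extension as the key into the table of types."""
    _, extension = os.path.splitext(filename)
    return MIME_TABLE.get(extension.lower(), default)


print(content_type_for("web-prog.html"))   # text/html
print(content_type_for("logo.GIF"))        # image/gif
print(content_type_for("notes.xyz"))       # application/octet-stream
```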

Experimental document types: The name of an experimental subtype begins with x-,
as in video/x-msvideo. Any web provider can add an experimental subtype by having its
name added to the list of MIME specifications stored in the web provider’s server. In that
case the web provider must supply a program that the browser can call when it needs to
display a document of that type. These programs are either external to the browser, in which
case they are called helper applications, or code modules that are inserted into the browser,
in which case they are called plug-ins. Sometimes a particular browser cannot handle a
specific document type. In such cases the browser shows an error message.

Hyper Text Transfer Protocol (HTTP): HTTP is an application protocol for distributed,
collaborative, hypermedia information systems. HTTP is the foundation of data
communication for the World Wide Web.

HTTP functions as a request–response protocol in the client–server computing model.
A client requests the desired web page by giving its URL through a web browser. This HTTP
request is submitted to the web server. The server provides resources such as HTML files and
other content, or performs other functions on behalf of the client, and returns a response
message to the client. The response contains completion status information about the
request and may also contain the requested content in its message body.


The Request Phase: the general form of an HTTP request is as follows.
1. Request line: HTTP method, domain part of the URL, HTTP version
2. Header fields
3. Blank line
4. Message body

Sample HTTP Request packet (for reference)

HTTP Request Methods

Method    Description
------    -----------------------------------------------------------
GET       Returns the contents of the specified document
HEAD      Returns the header information for the specified document
POST      Executes the specified document, using the enclosed data
PUT       Replaces the specified document with the enclosed data
DELETE    Deletes the specified document

The format of a header field is the field name followed by a colon and the value of the
field. There are four categories of header fields.

1. General: for general information, such as the date
2. Request: included in request headers
3. Response: for response headers
4. Entity: used in both request and response headers

A common request field is the Accept field, which specifies a preference of the
browser for the MIME type of the requested document. More than one Accept field can be
specified if the browser is willing to accept documents in more than one format. For example

Accept: text/plain
Accept: text/html
Accept: image/gif


Requests that use the GET, HEAD and DELETE methods do not have bodies. In these
cases the blank line signals the end of the request.
If the request has a body, the length of that body must be given with a Content-Length
field, which gives the length of the request body in bytes. POST method requests require
this field because they send data to the server.
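
The four-part request layout can be made concrete with a short Python sketch that assembles the raw bytes of a request. The function build_request is hypothetical, and the Host header and the HTTP/1.1 version string are assumptions added here to make the example a valid modern request; they are not taken from the notes.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble request line, header fields, blank line and optional body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or []):
        lines.append(f"{name}: {value}")          # field name, colon, value
    if body:
        # A request with a body must state the body's length in bytes.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"         # the blank line ends the header
    return head.encode("ascii") + body


get_request = build_request(
    "GET", "/mydocs/index.html", "www.myweb.com",
    headers=[("Accept", "text/plain"), ("Accept", "text/html")],
)
print(get_request.decode("ascii"))

post_request = build_request("POST", "/form", "www.myweb.com", body=b"name=abc")
print(post_request.decode("ascii"))
```

Headers are passed as a list of pairs rather than a dictionary so that a field such as Accept can appear more than once, as in the example above.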

HTTP Response phase: the general form of an HTTP response is as follows:

1. Status line
2. Response header fields
3. Blank line
4. Response body

The status line includes the HTTP version used, a three digit status code for the
response and short textual explanation. The first digits of HTTP status codes are as follows.

1 => Informational Response
2 => Success
3 => Redirection
4 => Client Error
5 => Server Error

One of the most common status codes is 404 Not Found, which means the requested
file was not found on the server.

After the status line, the server sends the response header, which can contain several
lines of information about the response, each in the form of a field. The only essential field
of the header is Content-Type, which tells the client how to process the contents.
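
The response layout can be illustrated by parsing a hand-written sample response in Python. The function parse_response and the sample bytes are made up for this sketch; a real client would use a library such as http.client instead of parsing by hand.

```python
STATUS_CLASSES = {
    "1": "Informational Response",
    "2": "Success",
    "3": "Redirection",
    "4": "Client Error",
    "5": "Server Error",
}


def parse_response(raw):
    """Split a raw HTTP response into status line, header fields and body."""
    head, _, body = raw.partition(b"\r\n\r\n")        # the blank line separates them
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)    # status line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "code": int(code),
        "reason": reason,
        "class": STATUS_CLASSES[code[0]],             # first digit gives the category
        "headers": headers,
        "body": body,
    }


sample = (b"HTTP/1.1 404 Not Found\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 9\r\n"
          b"\r\n"
          b"Not here.")

result = parse_response(sample)
print(result["code"], result["reason"], "->", result["class"])
print(result["headers"]["content-type"])
```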

Sample HTTP Response packet (for reference)


Security: web security is the protection of information assets that can be accessed
between web server and web client. During web client-server communication there are
chances that the web browser can illegally share or use the software present on the web
server. Similarly web server can make an unauthorized use of the information present on the
web client.

Consider an example of transmitting a credit card number to a company from which a
purchase is being made. The security issues for this transaction are as follows.

1. Privacy – it must not be possible for the credit card number to be stolen while on
its way to the company server.
2. Integrity – it must not be possible for the credit card number to be modified on its
way to the company’s server.
3. Authentication – it must be possible for both the purchaser and the seller to be
certain of each other’s identity.
4. Nonrepudiation – it must be possible to legally prove that the message was
actually sent and received.
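
To make the integrity requirement concrete, the Python sketch below attaches a keyed hash (HMAC) to a message so that any modification on the way can be detected. This is a swapped-in illustration, not the method used by any particular web site: it shows integrity checking only, not encryption, and the key, message and function names are made up for the example.

```python
import hashlib
import hmac


def sign(key, message):
    """Compute a keyed hash (HMAC-SHA256) of the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, message, tag):
    """Return True only if the message still matches the tag sent with it."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"demo-key-known-to-both-sides"     # hypothetical shared secret
original = b"card=1111222233334444"
tag = sign(shared_key, original)

print(verify(shared_key, original, tag))                    # True
print(verify(shared_key, b"card=9999222233334444", tag))    # False: message was modified
```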

Various tools and techniques are used to overcome these security issues; the most
commonly used technique is data encryption and decryption.

Protection against viruses and worms is provided by antivirus software, which must
be regularly updated with new virus definitions.

Introduction to XHTML (eXtensible Hyper Text Markup Language)

HTML and XHTML are both languages in which web pages are written. HTML is
SGML-based (Standard Generalized Markup Language) while XHTML is XML-based (Extensible
Markup Language). They are like two sides of the same coin. XHTML was derived from HTML
to conform to XML standards. Hence XHTML is strict compared to HTML and does not allow
the author to get away with lapses in coding and structure.

One reason XHTML was developed was the spread of convoluted, browser-specific tags:
pages coded in HTML appeared different in different browsers.

• Hypertext is text that works as a link.
• A markup language is a language for writing layout information within a document.
• An XHTML document is a plain text file and is very similar to an HTML document. It
contains tags written in angle brackets.
• Any XHTML document can be written in a simple text editor such as Notepad or
WordPad.
• The file can then be opened in a web browser and the corresponding web page can
be viewed.
• Every XHTML document should begin with an XML declaration. In this line the XML
version and the encoding method are specified.
• The XHTML document should be written within <html> </html> tags. Here <html> is
the opening tag and </html> is the closing tag of the element.
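
A minimal XHTML document following these points is shown below as a Python string, together with a check that an XML parser accepts it. The DOCTYPE line and the xmlns attribute are assumptions based on the XHTML 1.0 Strict conventions and are not taken from the notes; the page content is made up.

```python
import xml.etree.ElementTree as ET

# The XML declaration must be the very first thing in the document.
XHTML_PAGE = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>First page</title></head>
  <body><p>Hello, XHTML.</p></body>
</html>
"""

XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(XHTML_PAGE)          # raises ParseError if not well formed
title = root.find("x:head/x:title", XHTML_NS)
print(root.tag)       # {http://www.w3.org/1999/xhtml}html
print(title.text)     # First page
```

Saving the same text in a file with an .html extension and opening it in a browser displays the page.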

Syntactic differences between HTML and XHTML

Case sensitivity
   HTML:  Tag and attribute names are not case sensitive, so <FORM>, <Form> and
          <form> are all equivalent.
   XHTML: Tag and attribute names are case sensitive, and all tags in an XHTML
          document must be written in lowercase.

Closing tags
   HTML:  A closing tag may be omitted if the processing agent (browser) can infer
          its presence.
          E.g.
          <p>this is test message and …..
          <p>One more text here with no ending tag…
   XHTML: Every element must have a closing tag, in proper order. Some browsers get
          confused if closing tags are not inserted. An element with no content can be
          closed in two ways:
          <input type="text"></input>
          and
          <input type="text" />

Quoted attribute values
   HTML:  Attribute values must be quoted only if they contain embedded special
          characters or white space characters. Numeric attribute values are rarely
          quoted.
   XHTML: All attribute values must be double quoted, regardless of what characters
          are in the values.

Explicit attribute values
   HTML:  Some attribute values need not be explicitly stated. For example, a border
          attribute in a <table> tag without a value specifies a default-width border
          on the table.
          E.g. <table border>
   XHTML: Every attribute value must be specified explicitly, and the assigned value
          must be written in double quotes.
          E.g. <table border="1">

Element nesting
   HTML:  Even if the nesting rules are not followed strictly, it does not make much
          difference.
   XHTML: The nesting rules must be strictly followed. These nesting rules are:
          1. An anchor element cannot contain another anchor element, and a form
             element cannot contain another form element.
          2. If an element appears inside another element, the closing tag of the
             inner element must appear before the closing tag of the outer element.
          3. Block elements cannot be nested in inline elements.
          4. Text cannot be directly nested in body or form elements.
          5. List elements cannot be directly nested in list elements.
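
Several of these differences can be observed with an XML parser, because XHTML must be well-formed XML. The Python sketch below is an illustration under that assumption: the helper is_well_formed and the sample fragments are made up, and the parser checks well-formedness only. Rules such as "tags must be lowercase" or "block elements cannot be nested in inline elements" need validation against the XHTML DTD, which this sketch does not do.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


examples = {
    "closed and quoted correctly": '<p>text <input type="text" /></p>',
    "missing closing tag":         "<div><p>first<p>second</p></div>",
    "unquoted attribute value":    "<table border=1></table>",
    "attribute without a value":   "<table border></table>",
    "wrong closing order":         "<p><b>bold</p></b>",
    "mismatched tag case":         "<P>text</p>",
}

for label, fragment in examples.items():
    print(f"{label}: {is_well_formed(fragment)}")
```

Only the first fragment is accepted; an HTML browser would typically display all six without complaint.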
