
Developing Web Applications
Lecture 1: Web Basics and HTML
Dr. Ralph Moseley
14/03/2012 Developing Web Applications
(C) 2007 John Wiley & Sons Ltd. www.wileyeurope.com/college/moseley

WWW
The World Wide Web (WWW) was developed by Tim Berners-Lee and other research scientists at CERN, the European center for nuclear research, in the late 1980s and early 1990s.
The WWW follows a client-server model and uses TCP connections to transfer information (web pages) from server to client.
The WWW uses a hypertext model.
  Hypertext allows interactive access to a collection of documents.
  Documents can hold text (hypertext), graphics, sound, animations and video.
  Documents are linked together.
  Non-distributed: all documents are stored locally (e.g. on CD-ROM).
  Distributed: documents are stored at remote servers on the Internet.

WWW - Hyperlinks (or links)


Each document contains links (pointers) to other documents.
A link is represented by an "active area" on screen:
  Graphic: a button
  Text: highlighted
By selecting a particular link, the client fetches the referenced document from a server for display.
Links may become invalid:
  A link is simply a text name for a remote document.
  The remote document may be moved to a new location while the name in the link remains in place.


WWW Document Representation


Each WWW document is called a page.
The initial page for an individual or organization is called a home page.
A page can contain many different types of information; the page must specify:
  Content: the actual information
  Type of content: the type of information, e.g. text, pictures etc.
  Links to other documents

Rather than having a fixed representation for every browser, pages are formatted with a markup language. This allows the browser to format the page to fit the display, so different browsers can display pages in different ways. It also allows a text-only browser to discard graphics, for example. The standard is called HyperText Markup Language (HTML).

WWW HTML
HTML specifies:
  Major structure of the document
  Formatting instructions for browsers to execute
  Hypertext links: links to other documents
  Additional information about document contents
Two parts to a document:
  Head contains details about the document.
  Body contains the information/content of the document.
Each web page is represented in ASCII text with embedded HTML tags that give formatting instructions to the browser.
  A formatted section begins with a tag, <TAGNAME>
  The end of a formatted section is indicated by </TAGNAME>


WWW HTML Example


<HTML>
<HEAD>
<TITLE> Example Page for lecture</TITLE>
</HEAD>
<BODY>
Lecture notes for today go here!
<CENTER>
<TABLE BORDER=3>
<TR>
<TD><A HREF="./lecture10.html">Previous Lecture</A>
<TD><A HREF="./lecture12.html">Next Lecture</A>
<TD><A HREF="./Contents.html">Table of contents</A>
<TD><A HREF="./solutions.html">Solutions to Assignments</A>
<TD><A HREF="./index.html">Index of terms</A>
</TABLE>
</CENTER>
</BODY>
</HTML>

WWW Other HTML Tags


Headings - <H1>, <H2>
Lists
  <OL> - Ordered (numbered) list
  <UL> - Unordered (bulleted) list
  <LI> - List item
Tables
  <TABLE>, </TABLE> - Define table
  <TR> - Begin row
  <TD> - Begin item in row
Parameters
  Keyword-value pairs in HTML tags, e.g. <TABLE BORDER=3>

WWW Embedding Graphics


The IMG tag specifies insertion of a graphic.
Parameters:
  SRC="filename"
  ALIGN= - alignment relative to text

<img SRC="GCD.gif" height=35 width=30>

The above line would insert the image in the file GCD.gif into any web page. The image must be in a format known to the browser, e.g. Graphics Interchange Format (GIF), Joint Photographic Experts Group (JPEG), Bitmap etc.


WWW Style

The layout and format of an HTML document can be simplified by using CSS (Cascading Style Sheets).

<html>
<head>
<style type="text/css">
body {background-color: yellow}
h1 {background-color: #00ff00}
h2 {background-color: transparent}
p {background-color: rgb(250,0,255)}
</style>
</head>

<body>
<h1>This is header 1</h1>
<h2>This is header 2</h2>
<p>This is a paragraph</p>
</body>
</html>


WWW Identifying a web page


A web page is identified by:
  The protocol used to access the web page.
  The computer on which the web page is stored.
  The TCP port that the server is listening on to allow a client to access the web page.
  The directory pathname of the web page on the server.

Specific syntax for a Uniform Resource Locator (URL):
  protocol://computer_name:port/document_name

Protocol can be http, ftp, file, mailto.
Computer name can be a DNS name or an IP address.
TCP port is optional (http uses port 80 as its default port).
document_name is the path on the server to the web page (file).


WWW Identifying a web page


E.g. http://www.yahoo.com/Recreation/Sports/Soccer/index.html
  Protocol is http
  Computer name or DNS name is www.yahoo.com
  Port number is the default port for http, i.e. port 80.
  Document name is /Recreation/Sports/Soccer/index.html
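
The breakdown above can be reproduced with a few lines of Python. This is only a sketch: it relies on the standard library's urllib.parse module, and the helper name describe_url and the DEFAULT_PORTS table are made up for this illustration.

```python
from urllib.parse import urlsplit

# Default ports used when the URL does not name one.
# (http -> 80 and ftp -> 21 are the defaults quoted in these notes.)
DEFAULT_PORTS = {"http": 80, "ftp": 21}

def describe_url(url):
    """Split a URL into protocol, computer name, port and document name."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        "port": port,
        "document_name": parts.path,
    }

info = describe_url("http://www.yahoo.com/Recreation/Sports/Soccer/index.html")
for key, value in info.items():
    print(key, "=", value)
```

A URL that names its port explicitly, such as the hypothetical http://localhost:8080/a.html, would report 8080 instead of the default.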


WWW Hyperlinks between web pages


Each hyperlink is specified in HTML by using a special tag.
An item on a page is associated with another HTML document.
Each link is passive: no action is taken until the link is selected.
The HTML tags for a hyperlink are <A> and </A>.
The linked document is specified by a parameter to the tag: HREF="document URL"
  <A HREF="http://www.gcd.ie">Click here to go to GCD web site.</A>
Whatever is between the HTML tags <A> and </A> is the highlighted hyperlink.
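
To make the link structure concrete, here is a rough sketch of how a program could pull the HREF parameter and the highlighted text out of a page. It uses Python's standard html.parser module; the class name LinkCollector and the sample page string are assumptions made for this example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the HREF value and the link text of every <A> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # list of (href, text) pairs
        self._href = None    # HREF of the <A> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

page = '<P>See <A HREF="http://www.gcd.ie">Click here to go to GCD web site.</A></P>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)
```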

WWW Client Server Model


The browser is the client; the WWW (or web) server is the server.
Browser:
  The browser makes a TCP connection to the web server.
  The browser sends a request for the particular web page that it wishes to display.
  The browser reads the contents of the web page from the TCP connection and displays it in the browser's window.
  The browser closes the TCP connection used to transfer the web page.

Under HTTP/1.0, each separate item in a web page (e.g. pictures, audio) requires a separate TCP connection. The Hypertext Transfer Protocol (HTTP) specifies the commands that the client (browser) issues to the server (web server) and the responses that the server sends back to the client.

WWW Client Server Model

Figure 1-1: Web client/server architecture



Web Server Basics


Duties
  Listen to a port
  When a client is connected, read the HTTP request
  Perform some lookup function
  Send HTTP response and the requested data
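
The last three duties can be sketched as a single function that turns request bytes into response bytes. This is a simplified illustration, not how any particular server is written: the DOCUMENTS table and the handle_request name are invented here, and listening on a port is left out (in Python that part would normally be done with the socket or http.server modules).

```python
# Hypothetical in-memory document store standing in for files on disk.
DOCUMENTS = {"/index.html": b"<HTML><BODY>Hello</BODY></HTML>"}

def handle_request(raw_request, documents=DOCUMENTS):
    """Read an HTTP request, look up the document, build the response."""
    # Duty 2: read the request line, e.g. "GET /index.html HTTP/1.0"
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ")
    if method != "GET":
        return b"HTTP/1.0 501 Not Implemented\r\n\r\n"

    # Duty 3: perform the lookup
    body = documents.get(path)
    if body is None:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"

    # Duty 4: send the response headers followed by the requested data
    headers = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body

print(handle_request(b"GET /index.html HTTP/1.0\r\n\r\n").decode("ascii"))
```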


Serving a Page
User of client machine types in a URL

client (Netscape) http://www.smallco.com/index.html

server (Apache)


Serving a Page
Server name is translated to an IP address via DNS

client (Netscape) http://www.smallco.com/index.html

server (Apache) 192.22.107.5


Serving a Page
Client connects to server using IP address and port number
client (Netscape) http://www.smallco.com/index.html

192.22.107.5 port 80

server (Apache)

192.22.107.5


Serving a Page
Client determines path and file to request

client (Netscape) http://www.smallco.com/index.html

server (Apache)


Serving a Page
Client sends HTTP request to server

client (Netscape) http://www.smallco.com/index.html

GET /index.html HTTP/1.1

server (Apache)


Serving a Page
Server determines which file to send

client (Netscape)

server (Apache)

http://www.smallco.com/index.html

"index.html" is really /etc/httpd/htdocs/index.html


Serving a Page
Server sends response code and the document
client (Netscape) http://www.smallco.com/index.html

HTTP/1.1 200 OK
Content-type: text/html

[contents of index.html]

server (Apache)
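
On the client side, the response has to be taken apart again. The following is a minimal sketch of that step, assuming the whole response has already been read into memory; the function name parse_response and the sample bytes are made up for the illustration.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into (status_code, reason, headers, body)."""
    # The first blank line (\r\n\r\n) separates the headers from the document.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-type: text/html\r\n"
       b"\r\n"
       b"<HTML>...</HTML>")
status, reason, headers, body = parse_response(raw)
print(status, reason, headers["content-type"], body)
```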


Serving a Page
Connection is broken

client (Netscape)

server (Apache)


HTTP
HTTP is
  Designed for document transfer
  Generic
    not tied to web browsers exclusively
    can serve any data type
  Stateless
    no persistent client/server connection


HTTP Protocol Definitions


MIME
  Multipurpose Internet Mail Extensions
  Standards for encoding different media types in a message
  Originally developed for emailing files and messages in different languages


HTTP Protocol Definitions


"MIME types" are used to identify the type of information that a file contains. While the file extension .html is informally understood to mean that the file is an HTML page, there is no requirement that it mean this, and many HTML pages have different file extensions. In the HTTP protocol used by web browsers to talk to web servers, the "file extension" of the URL is not used to determine the type of information that the server will return. Indeed, there may be no file extension at all at the end of the URL. Instead, the web server specifies the correct MIME type using a Content-type: header when it responds to the web browser's HTTP request.

HTTP Protocol Definitions


Examples of common MIME types:

Type                      Common File Extension  Purpose
text/html                 .html                  Web page
image/png                 .png                   PNG-format image
image/jpeg                .jpeg                  JPEG-format image
audio/mpeg                .mp3                   MPEG audio file
application/octet-stream  .exe                   Best for downloads that should just be saved to disk
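
As an illustration of how a server might pick the value for its Content-type: header, the sketch below uses Python's standard mimetypes module, which keeps a table much like the one above. The helper name content_type_for is hypothetical, and falling back to application/octet-stream for unknown extensions is one possible choice rather than a rule.

```python
import mimetypes

def content_type_for(filename):
    """Guess a MIME type from the file name, with a save-to-disk fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"

for name in ["index.html", "logo.png", "photo.jpeg", "data.zzzunknown"]:
    print(name, "->", content_type_for(name))
```

The exact table can vary slightly between systems, because the module may also read the operating system's own MIME configuration.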


HTTP Protocol Definitions


In addition to e-mail applications, Web browsers also support various MIME types. This enables the browser to display or output files that are not in HTML format. A new version, called S/MIME, supports encrypted messages.


WWW HTTP Protocol


When a user types in http://www.yahoo.com/Recreation/Sports/Soccer/index.html, the browser creates an HTTP GET Request message and sends it over a TCP connection to the web server. In the above case, the HTTP GET Request message would be

GET /Recreation/Sports/Soccer/index.html HTTP/1.0
User-Agent: InternetExplorer/5.0
Accept: text/html, text/plain, image/gif, audio/au
\r\n
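
A short sketch of how such a message could be assembled from a URL is given below. The function name build_get_request is invented for this example, and the header values are simply copied from the message above; a real browser sends a longer and different set of headers.

```python
from urllib.parse import urlsplit

def build_get_request(url, user_agent="InternetExplorer/5.0"):
    """Return the bytes of an HTTP/1.0 GET request for the given URL."""
    path = urlsplit(url).path or "/"
    lines = [
        f"GET {path} HTTP/1.0",
        f"User-Agent: {user_agent}",
        "Accept: text/html, text/plain, image/gif, audio/au",
        "",  # empty line: the \r\n delimiter that ends the headers
        "",
    ]
    # Every line of the message is terminated by carriage return + line feed.
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("http://www.yahoo.com/Recreation/Sports/Soccer/index.html")
print(request.decode("ascii"))
```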


WWW HTTP Request messages


HTTP Request messages are sent from client to server.

Request Line          Type of request (e.g. GET)
Optional HTTP Header  Additional information such as browser being used, media types accepted
\r\n                  Delimiter: carriage return, line feed
Optional Data         User data, e.g. contents of completed form

There are a number of valid HTTP Request messages:
  GET - Used to request a web page from a web server
  HEAD - Return the header of a web page; used by search engines to test the validity of hyperlinks
  POST - Used to send data (e.g. results of registration form) to a web server
  PUT / DELETE - Not typically implemented by browsers.
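
The four parts of a request message can be seen by splitting one apart. The sketch below does this for a made-up POST request; the path /register, the form fields and the function name parse_request are all assumptions for the sake of the example.

```python
def parse_request(raw):
    """Split raw request bytes into (method, target, version, headers, data)."""
    # Everything before the \r\n\r\n delimiter is request line + headers.
    head, _, data = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, data

form_data = b"name=Ann&course=web"   # hypothetical completed form
raw = (b"POST /register HTTP/1.0\r\n"
       b"Content-Type: application/x-www-form-urlencoded\r\n"
       b"Content-Length: " + str(len(form_data)).encode("ascii") + b"\r\n"
       b"\r\n" + form_data)

method, target, version, headers, data = parse_request(raw)
print(method, target, version)
print(headers)
print(data)
```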

WWW HTTP Response messages


HTTP Response messages are sent from server to client.

Status Line           Success/failure indication: number between 200 and 599
Optional HTTP Header  Type of content returned, e.g. text/html or image/gif
\r\n                  Delimiter
Optional Data         Requested data, e.g. web page

The Status Line gives information about the success of the previous HTTP Request:

200 - 299  Success
300 - 399  Redirection: document has been moved
400 - 499  Client Error: Bad Request, Unauthorised, Not Found
500 - 599  Server Error: Internal Error, Service Overloaded
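
The status ranges translate directly into a small lookup. The sketch below is illustrative only: status_class is a made-up helper, while HTTPStatus comes from Python's standard http module and supplies the usual phrase for each code.

```python
from http import HTTPStatus

def status_class(code):
    """Map a status code onto the four ranges listed above."""
    if 200 <= code <= 299:
        return "Success"
    if 300 <= code <= 399:
        return "Redirection"
    if 400 <= code <= 499:
        return "Client Error"
    if 500 <= code <= 599:
        return "Server Error"
    raise ValueError(f"status code {code} is outside the ranges covered here")

for code in (200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```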

WWW Caching Web pages


Downloading HTML documents from servers can be slow due to a number of conditions:
  Parts of the Internet can be congested.
  A dial-up connection is typically very slow, 33 Kbps or 56 Kbps.
  A web server can have a lot of clients connecting to it at the same time, causing it to be overloaded.

If a user returns to a previous HTML document, this could require downloading the document from the server again. A browser can hold copies of recently visited pages, which avoids having to download them again. An organisation can use an HTTP proxy that caches documents for multiple users, improving the speed at which pages can be displayed on each user's computer.
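
The idea behind a browser cache (or a caching proxy) can be sketched in a few lines. This is a deliberately simplified model: PageCache and fake_fetch are invented names, the download is simulated, and real caches also have to decide when a stored copy is too old to reuse.

```python
class PageCache:
    """Keep copies of fetched pages so repeat visits skip the download."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that really downloads a URL
        self._pages = {}      # URL -> cached page contents

    def get(self, url):
        if url not in self._pages:
            self._pages[url] = self._fetch(url)   # cache miss: download once
        return self._pages[url]                   # cache hit: reuse the copy

downloads = []

def fake_fetch(url):
    """Stand-in for a slow network download; records each call."""
    downloads.append(url)
    return f"<HTML>page at {url}</HTML>"

cache = PageCache(fake_fetch)
first = cache.get("http://www.smallco.com/index.html")
second = cache.get("http://www.smallco.com/index.html")
print("downloads performed:", len(downloads))
```

A proxy cache works the same way, except that one shared cache sits between many users and the web server.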

Proxy server:


WWW Browser Architecture

[Figure: browser architecture. Components shown: input from keyboard and mouse; controller; HTML interpreter; optional plugins; display driver (output sent to display); HTTP client; other client; network interface (communication with remote server).]



WWW Browser Architecture


Browser has more components than a server:
  Display driver for painting the screen.
  HTML interpreter for formatting HTML documents.
  Plugins to display different content (e.g. Shockwave or Real Audio content).
  HTTP client to fetch HTML documents from a WWW server.
  Other clients for other protocols (e.g. ftp, mail).
  Controller, which must also accept input from the computer user through the mouse or keyboard.


Other Protocols
FTP - File Transfer Protocol
The Internet began development in the 1960s.
Moving a file from one computer to another required some form of removable medium (floppy disk or tape).
People required a protocol to reliably transfer files between any two computers connected to the Internet.
Why not use HTTP?
  HTTP was developed in the late 1980s and early 1990s, more than 10 years after FTP, which was developed in 1971.
  HTTP provides a poor authentication mechanism for users of the protocol.
  HTTP doesn't easily allow files to be sent in both directions.
  HTTP doesn't allow files to be downloaded in separate stages.

Two major differences between FTP and HTTP:


1) When connecting to an FTP server you are using a file server: all you see there is files. When you connect to an HTTP server you access a web server, which means you can load web pages into a browser.
2) Using an FTP connection you can both download and upload files to the server. An HTTP connection is typically used only to download content from the Internet for viewing; in that sense it is a "read only" method.


FTP - Functions
The main function of FTP was to allow the sharing of files across the Internet. It supports CHMOD permissions for read, write and execute.
Other functions included:
  Allowing computer users to use computers remotely.
  Hiding file storage differences from the user.
    The format in which files are stored on a Macintosh differs from that on a PC, which in turn differs from that on a Unix workstation.
    Different length filenames also have to be accommodated.
  Transfer of file data between computers has to be done reliably and efficiently.
  FTP should also allow transfer of very large files to be done in stages.

FTP
FTP is a client/server program.
An FTP client program enables the user to interact with an ftp server in order to access files on the ftp server computer.
Client programs can be:
  Simple command line interfaces, e.g. MS-DOS Prompt
    C:\> ftp ftp.maths.tcd.ie
  Integrated with web browsers, e.g. Netscape Navigator, Internet Explorer.

FTP provides similar services to those available on most filesystems: list directories, create new files, download files, delete files.
FTP uses TCP connections and the default server port for FTP is 21.

FTP - Transfer modes

Batch transfer
  User creates a list of files to be transferred by the ftp program.
  User's request is dropped into a queue of similar requests.
  FTP program reads requests and performs transfers of files.
  Transfer program can retry until successful.
  Good for slow or unreliable transfers.

Interactive transfer
  User starts ftp program.
  User can interactively list contents of directories, transfer files, delete files etc.
  User can find and transfer files immediately.
  Quick feedback in case of mistakes, e.g. spelling errors.
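
The batch mode described above is essentially a queue with retries. The sketch below models it in plain Python without any real network transfer: run_batch and flaky_transfer are hypothetical names, the failure is simulated, and the max_attempts limit is an added assumption so that the loop always ends.

```python
from collections import deque

def run_batch(requests, transfer, max_attempts=5):
    """Work through a queue of file names, retrying failed transfers."""
    queue = deque((name, 1) for name in requests)
    done, failed = [], []
    while queue:
        name, attempt = queue.popleft()
        if transfer(name):
            done.append(name)
        elif attempt < max_attempts:
            queue.append((name, attempt + 1))   # put it back for a retry
        else:
            failed.append(name)
    return done, failed

attempts = {}

def flaky_transfer(name):
    """Simulated transfer: 'notes.txt' only succeeds on its second attempt."""
    attempts[name] = attempts.get(name, 0) + 1
    return not (name == "notes.txt" and attempts[name] < 2)

done, failed = run_batch(["index.html", "notes.txt"], flaky_transfer)
print("transferred:", done)
print("gave up on:", failed)
```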

FTP - Sample Commands


Command  Description
open     Open connection to computer
ls       List directory contents
cd       Change to another directory
bin      Change to binary transfer, used for downloading executables
get      Download a file from remote computer
put      Upload a file to the remote computer
mget     Start download of multiple files
mput     Start upload of multiple files


FTP - Checkpointing
A data transfer may be aborted after only transferring part of a file.
  This could be due to the client or the server crashing, the TCP connection being broken due to congestion, or the phone hanging up during a dial-up connection.

FTP allows a file transfer to resume from where it was stopped, so there is no need to re-transfer the part of the file already received. FTP achieves this by sending restart markers between the server and the client. Restart markers are saved in a restart file by the client. The client sends a restart marker when it wants to continue a previously stopped transfer.
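
The following sketch shows the idea of resuming from a restart marker. It is a simplification under stated assumptions: the marker is modelled as a plain byte offset, the "file" is a bytes object in memory, the broken connection is simulated with the fail_at parameter, and the function name transfer is made up for this example.

```python
def transfer(source, dest, restart_marker=0, chunk_size=4, fail_at=None):
    """Append source[restart_marker:] to dest in chunks.

    Returns the new restart marker (number of bytes safely received).
    fail_at simulates the connection breaking once that many bytes
    have been transferred.
    """
    offset = restart_marker
    while offset < len(source):
        if fail_at is not None and offset >= fail_at:
            return offset            # connection lost: remember this marker
        chunk = source[offset:offset + chunk_size]
        dest.extend(chunk)
        offset += len(chunk)
    return offset

source = b"contents of a large file"
dest = bytearray()

marker = transfer(source, dest, fail_at=8)               # breaks part-way
print("stopped after", marker, "bytes")
marker = transfer(source, dest, restart_marker=marker)   # resume, no re-send
print("complete:", bytes(dest) == source)
```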
