
PROJECT

REPORT
ON

A DISSERTATION
REPORT SUBMITTED TO VCE, Rohtak.

UNDER THE GUIDANCE OF:
Mr. PANKAJ GUPTA
H.O.D.
Deptt. of Computer Science,
V.C.E.

Submitted By:
Gagan Chugh (110/CS/2k1)
Vikram Kalra (115/CS/2k1)
Arush Babbar (135/CS/2k1)

ACKNOWLEDGEMENT
Acknowledgment is not only a ritual, but also an expression of
indebtedness to all those who have helped in the completion of
the project. One of the most pleasant aspects of collecting the necessary
and vital information and compiling it is the opportunity to thank all
those who actively contributed to it.
We owe our deepest gratitude and profound indebtedness to Mr. Pankaj
Gupta for imparting the right training, showing us the right direction,
offering guidance and giving us an opportunity to prove our ability in this
challenging arena. We would like to express our deeply felt gratitude to
him for permitting us to complete the project work, which is an
important part of our curriculum.
We are really fortunate to have been placed under the able guidance of Mr.
Pankaj Gupta, who despite his busy schedule helped us upgrade our
knowledge base and troubleshoot problems while doing the
assignments. His encouraging remarks from time to time greatly helped
us in improving our designing skills.
Mr. Pankaj Gupta was always there to encourage us and to help us in
practice. Without him, we would not have been able to complete our
project.
Many thanks to him for his efficiency, cheerfulness and, most of all, his
excellent teaching ability.
Table of Contents

INTRODUCTION
Objective of the System

BACKGROUND
What is Internet
Web based Technology

PLATFORM USED

SOFTWARE AND HARDWARE REQUIREMENTS
Software and hardware specifications
Client Server Model

SYSTEM ANALYSIS
Identification of the Need
Preliminary Investigation
Information Gathering
Feasibility Study
Technical Feasibility
Economic Feasibility
Operational Feasibility
Cost/Benefit Analysis

SYSTEM DESIGN

INTRODUCTION TO JAVA
Socket Programming

INTRODUCTION TO PROXY SERVER
Definition
How Proxy Server works?
Advantages
Need
Uses in Depth

IMPLEMENTATION DETAILS
A caching http proxy server

SNAPSHOTS

LIMITATIONS

BIBLIOGRAPHY
INTRODUCTION
OBJECTIVE

A server that sits between a client application, such as a Web browser, and a
real server is popularly known as a PROXY SERVER. It intercepts all requests
to the real server to see if it can fulfill the requests itself. If not, it forwards the
request to the real server.

The main objective of a proxy server is to dramatically improve
performance for groups of users. It does so by saving the results of all
requests for a certain amount of time, so that a repeated request can be
answered from the saved copy instead of contacting the real server again.

Proxy servers can also be used to filter requests. For example, a company
might use a proxy server to prevent its employees from accessing a specific
set of Web sites.

The advantage of using a common caching proxy server depends on the
probability of finding a page in the local cache. This probability is in general
expressed by the hit rate. A cache several GB in size with a lot of users can
reach a hit rate of 30 to 40 percent. Frequently requested pages, for instance
the help pages of your browser, might be in the cache almost every time. If
the page is not in the local cache, you should not see any noticeable difference
between the elapsed time of a direct request and that of a request handled by a
proxy server.

BACKGROUND
WHAT IS INTERNET :-

Some time in the mid-1960s, during the Cold War, it became apparent
that there was a need for a bombproof communications system. A concept
was devised to link computers together throughout the country. With such a
system in place, large sections of the country could be destroyed and messages
could still get through. In the beginning, only government "think tanks" and a
few universities were linked.

Basically, the Internet began as an emergency military communications
system operated by the Department of Defense's Advanced Research Projects
Agency (ARPA). The whole operation was referred to as ARPANET. The
Internet, sometimes called simply "the Net", is a worldwide system of
computer networks - a network of networks in which users at any one
computer can, if they have permission, get information from any other
computer (and sometimes talk directly to users at other computers).

In time, ARPANET computers were installed at every university in the
United States that had defense-related funding. Gradually, the Internet
went from a military pipeline to a communications tool for scientists. As more
scholars came online, the administration of the system was transferred from ARPA
to the National Science Foundation.

Years later, businesses began using the Internet and the administrative
responsibilities were once again transferred.

At this time no one party "operates" the Internet; there are several
entities that "oversee" the system and the protocols that are involved.
Now the Internet is a huge collection of computer networks that can
communicate with each other - a network of networks that connects worldwide
through cable and satellite links.

A network, further, is a collection of interconnected, individually
controlled computers. Through networks, each computer user can communicate
and share common resources, such as printers and storage space, with other
users. When one connects to the Internet from office or home, the computer
becomes a small part of this giant network.

The speed of the Internet has changed the way people receive
information. It combines the immediacy of broadcast with the in-depth coverage of
newspapers, making it a perfect source for news and weather
information.

Internet usage is at an all-time high. Almost 100 million U.S. adults are
now going online every month, according to New York-based Mediamark
Research. That's half of American adults and a 30 percent increase over 2000
in the number who surf the Web. There also appears to be a continuing
gender shift in the number of American adults going online. In early 2000,
Mediamark reported the milestone that women for the first time ever
accounted for half of the online adult population. Now 51 percent of U.S.
adult Web surfers - some 50.6 million - are women.

Today, the Internet is a public, cooperative and self-sustaining facility
accessible to hundreds of millions of people worldwide. Physically, the
Internet uses a portion of the total resources of the currently existing public
telecommunication networks. For many Internet users, electronic mail (e-mail)
has practically replaced the Postal Service for short written transactions.
Electronic mail is the most widely used application on the Net. You can also
carry on live "conversations" with other computer users, using IRC (Internet
Relay Chat). More recently, Internet telephony hardware and software have made
real-time voice conversations possible.

The most widely used part of the Internet is the World Wide Web
(often abbreviated "WWW" or called "the Web"). Its outstanding feature is
hypertext, a method of instant cross-referencing. In most Web sites, certain
words or phrases appear in text of a different color than the rest; often this
text is also underlined. When you select one of these words or phrases, you
will be transferred to the site or page that is relevant to this word or phrase.
Sometimes there are buttons, images or portions of images that are
"clickable". If you move the pointer over a spot on a Web site and the pointer
changes into a hand, this indicates that you can click and be transferred to
another site.

Using the Web, you have access to millions of pages of information.
Web "surfing" is done with a Web browser, the most popular of which are
Netscape Navigator and Microsoft Internet Explorer. The appearance of a
particular Web site may vary slightly depending on the browser you use. Also,
later versions of a particular browser are able to render more "bells and
whistles" such as animation, virtual reality, sound and music files than earlier
versions.

WEB BASED TECHNOLOGY: -

Borderless, barrierless, boundaryless, round the clock, around the
world: this is the specialty of the web.

The web (also known as WWW or World Wide Web) was invented in
the early 1990s by Tim Berners-Lee while working at CERN, the European laboratory
for particle physics in Geneva, Switzerland.

It has grown very rapidly. Four years ago only around 1250 Web
servers were online. Today there are over 1,000,000 Web servers. The idea
behind the development of the web was to provide easy access to information
and to provide the capability to move freely on the Internet.

The schematic diagram illustrates the essential components
of the World Wide Web. The user's tool is the browser, or user agent: the
program that understands and displays HTML documents. The browser can
interpret URLs (Uniform Resource Locators) to determine where a resource is,
and can use the protocol specified in the URL to retrieve the resource. One of the
most important protocols is HTTP (Hypertext Transfer Protocol); most WWW
servers use this protocol and are therefore called HTTP or web servers. Using a web
server's CGI (Common Gateway Interface) or other, similar mechanisms, users can
access other resources on the web server.

A web portal is a location on a computer network that makes
information in the form of pages or documents available to visitors
who reach the site with browser software. The computer network can be
the worldwide Internet or an intranet, a local network linking all the computers
in an office. The information can be published in the form of HTML pages.
Web sites of this type are called static Web sites. It is also possible to
add more interaction with the clients of a company by means of chat or even
E-commerce. Web sites of this type are called dynamic Web
sites.

The Web site has changed the strategy of companies and of the market too. It
has numerous applications, such as advertising/publishing, E-commerce and
collaborative computing, which allow a company to reach all over the world.
Domain Name System (DNS): -

The words that make up a domain name roughly map to a parallel system of
addresses called Internet Protocol (IP) addresses. A computer on the Internet can
have both a domain name and an IP address, and when you use a domain name, the
computers translate that name to the corresponding IP address.

The names of the domains describe organizational or geographic
realities. They indicate what country the network connection is in and what
kind of organization owns it.
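
The translation from a domain name to an IP address can be tried from Java with the standard java.net.InetAddress class. The following is only a minimal sketch: the class name NameLookupSketch and the helper resolve are illustrative assumptions, and "localhost" is used as the default so the example does not depend on an Internet connection.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper that asks the resolver for the address of a host name. */
final class NameLookupSketch {
    /** Returns the textual IP address for a host name, or null if it cannot be resolved. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Pass a domain name as the first argument, or fall back to localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host));
    }
}
```

When the argument is already a literal IP address, InetAddress.getByName only checks its format and returns it unchanged, so no lookup takes place.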

Hypertext Transfer Protocol: -

The Hypertext Transfer Protocol (HTTP) is the protocol used between a
web server and a web browser over the Internet. When a browser requests a
page from a server, it opens a connection to the server and sends a GET
command with arguments that specify the requested URL. Additional parameters
may also be sent as a series of HTTP headers. The server responds to this
request with a 3-digit response code (which is similar to the NNTP response
codes) followed by a set of HTTP headers and the requested data (which
would normally be in HTML format). In its basic form (HTTP/1.0), a separate
connection is made for each requested URL; connections are not kept open and
reused. HTTP is a stateless protocol and no session data is maintained over
subsequent HTTP connections. An HTTP header is a simple tag-value pair. For example

Nose-Color: Red

would set the 'Nose-Color' option to 'Red'. Common headers are described in
the table below.
Table: Common HTTP headers.

Header          Description

Date            Date and time of request/response
Content-Type    Type of data being sent
Accept          List of content types that a browser understands
Server          Name and version of HTTP server software
User-Agent      Name and version of client software
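
Because a header is just a tag-value pair, reading one comes down to splitting the line at its first colon. The sketch below illustrates this; the class name HeaderLineSketch and its parse method are hypothetical names introduced for the example, not part of any library.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

/** Hypothetical parser for a single "Name: value" header line. */
final class HeaderLineSketch {
    /** Splits the line at the first colon and trims both parts. */
    static Map.Entry<String, String> parse(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("not a header line: " + line);
        }
        String name = line.substring(0, colon).trim();
        String value = line.substring(colon + 1).trim();
        return new SimpleImmutableEntry<>(name, value);
    }
}
```

Splitting at the first colon only matters for headers such as Date, whose value itself contains colons in the time of day.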

HTTP defines a number of commands that may be sent by the client to
the server. The most commonly used is GET, which requests a certain URL or
file from the server.

The figure shows an example using the GET command. In this
example a client (which identifies itself as "Super Browse 2.5") requests "/"
(the index page) from a server. The client also notifies the server that it can
only understand HTML and GIF files. The server sends a successful response
code followed by a number of headers, a blank line and the file itself. The
Content-Type header tells the browser that the returned file is an
HTML document.
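
The exchange described above can be sketched in a few lines of Java. This is an illustration under assumptions rather than the code of this project: the class HttpExchangeSketch and its methods are hypothetical, the request is written in HTTP/1.0 form, and the response is handled as one in-memory string instead of a socket stream.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helpers for building a GET request and reading a response. */
final class HttpExchangeSketch {
    /** Builds a GET request like the one sent by "Super Browse 2.5". */
    static String buildGet(String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "User-Agent: Super Browse 2.5\r\n"
             + "Accept: text/html, image/gif\r\n"
             + "\r\n"; // a blank line ends the header section
    }

    /** Extracts the 3-digit code from a status line such as "HTTP/1.0 200 OK". */
    static int statusCode(String response) {
        int eol = response.indexOf("\r\n");
        String statusLine = eol < 0 ? response : response.substring(0, eol);
        return Integer.parseInt(statusLine.split(" ")[1]);
    }

    /** Collects the headers between the status line and the blank line. */
    static Map<String, String> headers(String response) {
        Map<String, String> result = new LinkedHashMap<>();
        int end = response.indexOf("\r\n\r\n");
        String head = end < 0 ? response : response.substring(0, end);
        String[] lines = head.split("\r\n");
        for (int i = 1; i < lines.length; i++) { // line 0 is the status line
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                result.put(lines[i].substring(0, colon).trim(),
                           lines[i].substring(colon + 1).trim());
            }
        }
        return result;
    }

    /** Returns everything after the blank line that ends the headers. */
    static String body(String response) {
        int end = response.indexOf("\r\n\r\n");
        return end < 0 ? "" : response.substring(end + 4);
    }
}
```

A caching proxy would typically inspect the status code and headers such as Content-Type in this way before deciding whether a response is worth saving.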

TCP/IP (Transmission Control Protocol/Internet Protocol) is the basic
communication language or protocol of the Internet. It can also be used as a
communications protocol in private networks called intranets and in
extranets. When you are set up with direct access to the Internet, your
computer is provided with a copy of the TCP/IP program, just as every other
computer that you may send messages to or get information from also has a
copy of TCP/IP.

TCP/IP is a two-layered program. The higher layer,
Transmission Control Protocol, manages the assembling of a message or file
into smaller packets that are transmitted over the Internet and received by a
TCP layer that reassembles the packets into the original message. The lower
layer, Internet Protocol, handles the address part of each packet so that
it gets to the right destination. Each gateway computer on the network checks
this address to see where to forward the message. Even though some
packets from the same message are routed differently than others, they'll be
reassembled at the destination.
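
The idea of splitting a message into packets and reassembling it at the destination can be illustrated with a toy model. The sketch below is not real TCP: the Packet class, the sequence numbers counted per packet and the fixed piece size are simplifying assumptions made only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Toy model of packetizing and reassembly; all names here are hypothetical. */
final class PacketSketch {
    /** A packet: a sequence number plus a slice of the message. */
    static final class Packet {
        final int sequence;
        final String payload;

        Packet(int sequence, String payload) {
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    /** Sender side: cut the message into numbered pieces of at most size characters. */
    static List<Packet> split(String message, int size) {
        List<Packet> packets = new ArrayList<>();
        for (int start = 0, seq = 0; start < message.length(); start += size, seq++) {
            int end = Math.min(start + size, message.length());
            packets.add(new Packet(seq, message.substring(start, end)));
        }
        return packets;
    }

    /** Receiver side: order by sequence number, then join the payloads. */
    static String reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort(Comparator.comparingInt((Packet p) -> p.sequence));
        StringBuilder message = new StringBuilder();
        for (Packet p : ordered) {
            message.append(p.payload);
        }
        return message.toString();
    }

    public static void main(String[] args) {
        List<Packet> packets = split("GET / HTTP/1.0", 4);
        Collections.shuffle(packets, new Random(7)); // simulate different routes
        System.out.println(reassemble(packets));     // prints the original message
    }
}
```

Real TCP numbers bytes rather than packets and also retransmits lost data, but the ordering step at the receiver follows the same principle.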

TCP/IP uses the client/server model of communication in which
a computer user (a client) requests and is provided a service (such as
sending a Web page) by another computer (a server) in the network. TCP/IP
communication is primarily point-to-point, meaning each communication is
from one point (or host computer) in the network to another point or host
computer. TCP/IP and the higher-level applications that use it are collectively
said to be "stateless" because each client request is considered a new
request unrelated to any previous one (unlike ordinary phone conversations
that require a dedicated connection for the call duration). Being stateless frees
network paths so that everyone can use them continuously. (Note that the
TCP layer itself is not stateless as far as any one message is concerned. Its
connection remains in place until all packets in a message have been
received.)

Many Internet users are familiar with the even higher-layer
application protocols that use TCP/IP to get to the Internet. These include the
World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer
Protocol (FTP), Telnet, which lets you log on to remote computers, and
the Simple Mail Transfer Protocol (SMTP). These and other protocols are often
packaged together with TCP/IP as a "suite". Personal computer users usually
get to the Internet through the Serial Line Internet Protocol (SLIP) or the
Point-to-Point Protocol (PPP). These protocols encapsulate the IP packets so
that they can be sent over a dial-up phone connection to an access provider's
modem. Protocols related to TCP/IP include the User Datagram Protocol
(UDP), which is used instead of TCP for special purposes. Other protocols are
used by network host computers for exchanging router information. These
include the Internet Control Message Protocol (ICMP), the Interior Gateway
Protocol (IGP), the Exterior Gateway Protocol (EGP), and the Border
Gateway Protocol (BGP).
Platform Used
JAVA

Java was conceived by James Gosling, Patrick Naughton, Chris Warth, Ed
Frank and Mike Sheridan at Sun Microsystems in 1991.
The original impetus for JAVA was not the Internet; instead, the primary
motivation was the need for a platform-independent language that could be
used to create software to be embedded in various consumer electronic
devices.

Why JAVA?
Java is based on object-oriented principles, it is secure and robust, and
programs written in Java are easily portable. These are a few of the reasons why we
opted for JAVA.

Moreover, another useful aspect of JAVA is socket programming, the
ability to communicate between two computers through sockets (ports).
Java supports socket programming efficiently through its standard library. Most
client-server architectures existing nowadays are based on socket
programming.
Sockets in Java use the TCP/IP protocols.
Internet Protocol (IP) is a low-level routing protocol that breaks data into
small packets and sends them to an address across a network, which does not
guarantee to deliver said packets to the destination.

Transmission Control Protocol (TCP) is a higher-level protocol that manages
to robustly string together these packets, sorting and re-transmitting them as
necessary to reliably transmit your data.
As socket programming is the heart of the proxy server, we found JAVA
to be the best choice for implementing an HTTP caching proxy server.
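
To show what socket programming looks like in practice, here is a minimal sketch using the standard java.net.ServerSocket and java.net.Socket classes. It is not taken from the proxy server itself: the class SocketEchoSketch, the "echo: " reply and the use of the loopback address are assumptions chosen so that the example runs on a single machine. The lambda syntax also assumes Java 8 or later, which is newer than the JDK listed in the requirements of this report.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical one-shot echo exchange over a TCP socket on the local machine. */
final class SocketEchoSketch {
    /** Starts a server on a free loopback port, sends it one line, returns the reply. */
    static String roundTrip(String message) {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> serveOnce(listener));
            server.setDaemon(true);
            server.start();

            // Client side: connect, write one line, read one line back.
            try (Socket socket = new Socket(listener.getInetAddress(), listener.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Server side: accept a single client and answer its first line. */
    private static void serveOnce(ServerSocket listener) {
        try (Socket client = listener.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            // listener closed or client vanished; nothing more to do in this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "echo: hello"
    }
}
```

A proxy server uses both halves of this pattern at once: it accepts browser connections like the server side and opens connections to the real server like the client side.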
SOFTWARE & HARDWARE REQUIREMENTS
SOFTWARE AND HARDWARE SPECIFICATION
A proxy server does not have many hardware or software requirements.
There will obviously need to be a server. It can be the same server
that the firewall is on or it can be a separate server inside the firewall. The
software that is required is easily accessible. There are many free versions of
proxy server software available for the Linux operating system. The
server will not need to be extremely powerful, but it may require quite a bit of
disk space depending on how the caching is set up. If caching is enabled, the
server will require more disk space than if it were disabled. One major advantage of
proxy servers is that only one connection to the Internet is needed. The most
important part of the setup of a proxy server is that the client computers must
specify the IP address or the domain name of the proxy server in their Internet
browser configuration. Without this setting, users will not be able to access the
Internet.
Server Side Requirements
The Java Proxy Server requires the following hardware for hosting and
running this application.
* P-III 800 MHz Processor: A P-III 800 MHz processor is required because
of its high processing power, which keeps the processing speed high.
* True Color Display Monitor, 32 bit: This color depth is required because
the application involves a lot of graphics and pictures. The application can
be best viewed at this setting.
* 64 MB RAM (at least): As the speed of the computer increases with
RAM, it should be as high as possible.

Besides this hardware, the software required by the Java Proxy Server is:
* Java Development Kit: The Java Development Kit (1.2 or above) by
Sun Microsystems is required to run the Java Proxy Server.
* JCreator: An interactive IDE for developing Java applications, to support
the easy development of this project.
Client Side Requirements
The hardware requirements for the client accessing the web pages through
this application are:
* P-III 233 MHz Processor (recommended): A P-III 233 MHz processor is
recommended because of its high processing power, which lets the
application run faster.
* True Color Display Monitor, 32 bit (800 x 600): This display setting is
required because the application involves a lot of graphics and pictures. The
application can be best viewed at this setting.
* 64 MB RAM (at least): As the speed of the computer increases with
RAM, it should be as high as possible.

Besides these hardware requirements, the software required on the client
side is:

Any cascade-enabled 4th-generation Internet browser, such as:
* Microsoft Internet Explorer 5.0
* Netscape Navigator 4.0
Client-Server Model

The standard model for network applications is the client-server model. A
server is a process that is waiting to be contacted by a client process so that
the server can do something for the client.
The server process is started on some computer system. It initializes itself,
then goes to sleep waiting for a client process to contact it requesting some
service.

The client process is started, either on the same system or on another system
that is connected to the server system with a network. Client processes are
often initiated by an interactive user entering a command to a time-sharing
system. The client process sends a request across the network to the server
requesting service of some form. Some examples of the type of service that a
server can provide are:

* Return the time-of-day to the client,
* Print a file on a printer for the client,
* Read or write a file on the server's system for the client,
* Allow the client to log in to the server's system,
* Execute a command for the client on the server's system.

When the server process has finished providing its service to the client, the
server goes back to sleep, waiting for the next client request to arrive.

We can further divide server processes into two types:

1. When the server can handle a client's request in a known, short amount
of time, the server process handles the request itself. We call these iterative
servers.
2. When the amount of time to service a request depends on the request itself,
the server typically handles it in a concurrent fashion. These are called
concurrent servers.
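
The difference between the two server types can be sketched with Java sockets. The code below is an illustration under assumptions, not the design of this project: the class ServerStylesSketch, the upper-casing "service", the pool of four worker threads and the ask helper are all hypothetical. The iterative version finishes one client before accepting the next, while the concurrent version hands each accepted client to a worker thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical comparison of an iterative and a concurrent server loop. */
final class ServerStylesSketch {
    /** The service itself: reply with the request line in upper case. */
    static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String request = in.readLine();
            if (request != null) {
                out.println(request.toUpperCase(Locale.ROOT));
            }
        } catch (IOException e) {
            // a failed client must not stop the server in this sketch
        }
    }

    /** Iterative server: finish one client before accepting the next. */
    static void serveIteratively(ServerSocket listener, int clients) throws IOException {
        for (int i = 0; i < clients; i++) {
            handle(listener.accept());
        }
    }

    /** Concurrent server: hand each accepted client to a worker thread. */
    static void serveConcurrently(ServerSocket listener, int clients) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < clients; i++) {
                Socket client = listener.accept();
                workers.submit(() -> handle(client));
            }
        } finally {
            workers.shutdown(); // clients already submitted are still served
        }
    }

    /** Demo helper: run a server for one client on loopback and return its reply. */
    static String ask(boolean concurrent, String request) {
        try (ServerSocket listener = new ServerSocket(0, 5, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> {
                try {
                    if (concurrent) {
                        serveConcurrently(listener, 1);
                    } else {
                        serveIteratively(listener, 1);
                    }
                } catch (IOException e) {
                    // listener closed early
                }
            });
            server.setDaemon(true);
            server.start();
            try (Socket socket = new Socket(listener.getInetAddress(), listener.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(request);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Fetching a Web page takes an amount of time that depends on the page and on the remote server, so a proxy server would normally fall into the concurrent category.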
SYSTEM ANALYSIS

System analysis refers to the process of examining a situation with the
intent of improving it through better procedures and methods. System design
is the process of planning a new system to either replace or complement an
existing system. But before any planning is done, the existing system must be
thoroughly understood and the requirements determined. System analysis is,
therefore, the process of gathering and interpreting facts, diagnosing
problems and using the information to recommend improvements to the
system. In other words, system analysis means a detailed explanation or
description. Before computerizing a system under consideration, it has to be
analyzed. We need to study how it functions currently, what the problems are
and what requirements the proposed system should meet.

The main components of making software are:

1. System and software requirement analysis.
2. Design and implementation of software.
3. Ensuring, verifying and maintaining software integrity.

System analysis is an activity that encompasses most of the tasks that are
collectively called Computer System Engineering. Confusion sometimes
occurs because the term is often used in a context that refers only to
software requirement analysis activities, but system analysis focuses on all
system elements, not just software.

System analysis is conducted with the following objectives in mind:


* Identify the customer’s need.
* Evaluate the system concept for feasibility.
* Perform economic and technical analysis.
* Allocate functions to hardware, software, people, database and other
system elements.
* Establish cost and schedule constraints.
* Create a system definition that forms the foundation for all subsequent
engineering work.

The four processes involved are

Identification of the need:
The first step of the system analysis process involves the identification of need.
The analyst meets with the customer and the end user. Identification of need
is the starting point in the evaluation of a computer-based system. The analyst
assists the customer in defining the goals of the system:
* What information will be produced?
* What information is to be provided?
* What functions and performance are required?
The analyst makes sure to distinguish between customer needs and
customer wants. Information gathered during the need identification step is
specified in a System Concept Document. The customer sometimes prepares the
original concept document before meeting with the analyst.

Feasibility study:
A feasibility study is done so that an ill-conceived system
is recognized early in the definition phase. During system engineering we
concentrate our attention on four primary areas of interest: technical
feasibility, economic feasibility, legal feasibility and alternatives.

INFORMATION GATHERING

Strategy to gather information:

Gathering information in a large organization is difficult and takes time.
All relevant personnel should be consulted and no information should be
overlooked. The strategy consists of:
* Identifying information sources.
* Evolving a method of obtaining information from the identified sources.
* Using an information flow model of the organization.
Information sources: -
The main sources of information for the system customization are: -
* Users of the system.
* Forms and documents used in the organization.
* Procedure manuals and rulebooks, which specify how various
activities are carried out in the organization.
* Various reports used in the organization.

Method of searching for information

Information gathering started with conversations with top-level
management. An overview of the organization, the available information and the
objectives to be met by the proposed system were gathered from the top
management. A gross system model was then worked out and verified. For
collecting quantitative data from a number of persons in the organization,
questionnaires are useful. The primary purpose of an interview is to obtain both
quantitative and qualitative data. While interviewing, keep the following points in
mind:
* Make a prior appointment with the person to be interviewed and state how
much time is required.
* Read the background material and prepare a checklist.
* State the purpose of the interview again at the beginning of the interview.
* Obtain permission to take notes.
* Do not use computer jargon.
* Try to obtain both qualitative and quantitative information.
* Summarize the information gathered during the interview and have it verified
by the user.
Performance requirements: The following performance characteristics were
taken care of while developing the system.
User friendliness: The system is easy to learn and understand. Even a naive user
can use the system effectively, without any difficulty.
User satisfaction: The system is such that it stands up to the user's
expectations.
Response time: The response of all operations is good. This has been
made possible by careful programming and fine tuning.
Error handling: Response to user errors and undesired situations has been
taken care of to ensure that the system operates without halting.
Safety and Robustness: The system is able to avoid or tackle disastrous
actions. In other words, it should be foolproof. The system safeguards against
undesired events without human intervention.

Acceptance Criteria:
The following acceptance criteria were established for evaluation of the new
system:
1. The system should be accurate and hence reliable.
2. The software should provide all the functions. Further, the execution time
should be very low and response should be good.
3. The system should have scope for foreseeable modifications and enhancements.
4. The system must satisfy the standards of good software.
User Friendliness: The system should satisfy the user's needs. It should be
easy to learn and operate.
Modularity: The system should have relatively independent, single-function
parts that can be put together to make the complete system.
Maintainability: The developed system should be such that the time and
effort for program maintenance and enhancement are reduced.
Timeliness: The system should operate well under normal, peak and
recovery conditions.

Other methods of information searching:

* Systems used in other similar organizations.
* Trade journals and reports of conferences describing similar systems.
* We gathered the information from the various forms, documents and rules
that are used in manual work.
On Site Observation:
It is the process of recognizing and noting people, objects and occurrences to
obtain information. The major objective of on-site observation is to get as
close as possible to the real system being studied.

Interview and Questionnaires:

The interview is a face-to-face interpersonal role situation in which a
person called the interviewer asks a person being interviewed questions
designed to gather information about a problem area. It can be used for two
main purposes: -
1. As an exploratory device to identify relations or verify information.
2. To capture information as it exists.

There are some primary advantages of the interview: -

* Its flexibility makes the interview a superior technique for exploring areas
where not much is known about what questions to ask or how to
formulate questions.
* It offers a better opportunity than questionnaires to evaluate the validity
of the information gathered.
* It is an effective technique for eliciting information about complex
subjects and for probing the sentiments underlying expressed
opinions.
* Many people enjoy being interviewed, regardless of the subject. The
percentage of returns to questionnaires is relatively low.

So when we interviewed people about project matters, they provided
better information about the existing system, how they work, what types of
problems they are facing and what their requirements are.
Exception Handling:
To ensure that the system does not halt in case of undesired situations
or events, the following exception conditions were taken care of by providing
the corresponding exception responses while developing the system.
While selecting an alternative from the menu, the user enters his/her
choice and goes ahead only if the selected choice is valid.
While executing the screen, if the user tries to skip a field which cannot
have a null value, an appropriate message is displayed, informing the
user that data has to be entered into that field.
Once a value has been entered into a field, the cursor moves to the
next field. If a user enters a date in an invalid format, the system displays a
message showing the valid format he/she should enter.
Security: The system protects information by requiring a
password for access to the database. Therefore, only an authorized user can
access the database.
Flexibility: The system is such that likely changes/modifications can be
easily incorporated.

Feasibility Study

Technical feasibility
A study of function, performance and constraints that may affect the ability to
achieve an acceptable system.

Economic feasibility
An evaluation of development cost weighed against the ultimate income or
benefit derived from the developed system.

Legal feasibility
A determination of any infringement, violation or liability that
could result from the development of the system.

Alternatives
An evaluation of alternative approaches to the development of the
system.

Economic Analysis:
Among the most important information contained in a feasibility study is
cost-benefit analysis, an assessment of the economic justification for a
computer-based system project. Cost-benefit analysis delineates costs for
project development and weighs them against the tangible and intangible
benefits of a system. Cost-benefit analysis is complicated by criteria that vary
with the characteristics of the system to be developed, the relative size of the
project and the expected return on investment desired as part of the
company's strategic plan. In addition, many benefits derived from
computer-based systems are intangible, so direct quantitative comparisons
may be difficult to achieve.

Technical Analysis:
During technical analysis, the analyst evaluates the technical merits of the
system concept, while at the same time collecting additional information about
performance, reliability, maintainability and predictability. Technical analysis
begins with an assessment of the technical viability of the proposed system.
* What technologies are required to accomplish system function and
performance?
* What new materials, methods, algorithms or processes are required and
what is their development risk?
* How will these technology issues affect the cost?
The results obtained from the technical analysis form the basis for another
go/no-go decision on the system. If technical risk is severe, or if models
indicate that the desired function cannot be achieved, it is back to the drawing
board!
SYSTEM DESIGN
DESIGN PHASE
The design phase of software development deals with transforming the customer
requirements, as described in the SRS document, into a form implementable
in a programming language. To be easily implementable in a
conventional programming language, the following items must be designed
during the design phase.
♦ Different modules required to implement the design solution.
♦ Control relationship among the identified modules, i.e. the call relationship
(also known as the invocation relationship) among modules.
♦ Interface among different modules, i.e. details of the data items exchanged
among different modules.
♦ Data structures of the individual modules.
♦ Algorithms required to implement the individual modules.
Thus the goal of the design phase is to take the SRS document as the input
and to produce the above-mentioned items by the end of the
design phase. A good software design is seldom arrived at through a single-step
procedure; rather, it goes through a series of steps. However, we can broadly
classify the various design activities into two important parts:
♦ Preliminary (or high-level) design.
♦ Detailed design
This part of the report presents the design of the project in draft
form. In the design phase, the whole system is planned in outline
so that we can follow the steps and make changes where needed.
First, the database is designed so that every process
can be thought of in terms of input and output. The output of
one module can be entered into the next module as its input.
System Flow Designing
System flow design describes how data flows through the whole system: how we
manipulate the data from the database, how the results are passed on after
manipulation, and where that data goes so that we can communicate with the user of our site.
DESIGN OBJECTIVES
The design of a system is correct if a system built precisely according
to the design satisfies the requirements of that system. Clearly, the goal
during the design phase is to produce correct designs. There can be many
correct designs possible. The goal of the design process is not simply to
produce a design for the system. Instead the goal is to find the best possible
design, within the limitations imposed by the requirements.

In order to evaluate a design, we have to specify some properties and
criteria that can be used for evaluation. Criteria for the quality of software design
are often subjective or non-quantifiable. Some desirable properties for a
software system design are:
* Verifiability
* Completeness
* Consistency
* Efficiency
* Traceability
* Simplicity/Understandability
The property of verifiability of a design is concerned with how easily the
correctness of the design can be argued. Traceability is an important property
that can aid design verification. It requires that all design elements must be
traceable to the requirements. Completeness requires that all the different
components of the design should be specified. That is, all the relevant data
structures, modules, external interfaces and module interconnections are
specified. Consistency requires that there are no inherent inconsistencies in
the design.
Efficiency of any system is concerned with the proper use of scarce
resources by the system. The need for efficiency arises due to cost
considerations. If some resources are scarce and expensive then it is
desirable that those resources be used efficiently.
Simplicity and Understandability are perhaps the most important quality
criteria for software systems. Maintenance of software is usually quite
expensive. Maintainability of software is one of the goals that we have
established. The design of a system is one of the most important factors
affecting the maintainability of system. During maintenance, the first
necessary step that a maintainer has to undertake is to understand the
system to be maintained. Only after a maintainer has a thorough
understanding of the different modules of the system should the modifications
be undertaken. A simple and understandable design will go a long way in
making the job of the maintainer easier.
INTRODUCTION TO JAVA
Java’s Lineage
Java is related to C++, which is a direct descendent of C. Much of the
character of Java is inherited from these two languages. From C, Java derives
its syntax. Many of Java’s object oriented features were influenced by C++. In
fact, several of Java’s defining characteristics come from—or are responses
to—its predecessors. Moreover, the creation of Java was deeply rooted in the
process of refinement and adaptation that has been occurring in computer
programming languages for the past several decades. For these reasons, this
section reviews the sequence of events and forces that led up to Java. As you
will see, each innovation in language design was driven by the need to solve
a fundamental problem that the preceding languages could not solve. Java is
no exception.

The Creation of Java


James Gosling, Patrick Naughton, Chris Warth, Ed Frank, and Mike Sheridan
conceived Java at Sun Microsystems, Inc. in 1991. It took 18 months to
develop the first working version. This language was initially called “Oak,” but
was renamed “Java” in 1995. Between the initial implementation of Oak in the
fall of 1992 and the public announcement of Java in the spring of 1995, many
more people contributed to the design and evolution of the language. Bill Joy,
Arthur van Hoff, Jonathan Payne, Frank Yellin, and Tim Lindholm were key
contributors to the maturing of the original prototype. Somewhat surprisingly,
the original impetus for Java was not the Internet! Instead, the primary
motivation was the need for a platform-independent (that is, architecture-
neutral) language that could be used to create software to be embedded in
various consumer electronic devices, such as microwave ovens and remote
controls. As you can probably guess, many different types of CPUs are used
as controllers.
The trouble with C and C++ (and most other languages) is that they are
designed to be compiled for a specific target. Although it is possible to
compile a C++ program for just about any type of CPU, to do so requires a full
C++ compiler targeted for that CPU. The problem is that compilers are
expensive and time-consuming to create. An easier—and more cost-efficient
—solution was needed. In an attempt to find such a solution, Gosling and
others began work on a portable, platform-independent language that could
be used to produce code that would run on a variety of CPUs under differing
environments. This effort ultimately led to the creation of Java.
About the time that the details of Java were being worked out, a second, and
ultimately more important, factor was emerging that would play a crucial role
in the future of Java. This second force was, of course, the World Wide Web.
Had the Web not taken shape at about the same time that Java was being
implemented, Java might have remained a useful but obscure language for
programming consumer electronics.

However, with the emergence of the World Wide Web, Java was propelled to
the forefront of computer language design, because the Web, too, demanded
portable programs. Most programmers learn early in their careers that
portable programs are as elusive as they are desirable, and the Web gave new
urgency to the quest for a way to create efficient, portable (platform-independent) programs.

Why Java Is Important to the Internet


The Internet helped catapult Java to the forefront of programming, and Java,
in turn, has had a profound effect on the Internet. The reason for this is quite
simple: Java expanded the universe of objects that can move about freely in
cyberspace. In a network, two very broad categories of objects are
transmitted between the server and your personal computer: passive
information and dynamic, active programs. For example, when you read your
e-mail, you are viewing passive data.
Even when you download a program, the program’s code is still only passive
data until you execute it. However, a second type of object can be transmitted
to your computer: a dynamic, self-executing program. Such a program is an
active agent on the client computer, yet is initiated by the server. For example,
a program might be provided by the server to display properly the data that
the server is sending. As desirable as dynamic, networked programs are, they
also present serious problems in the areas of security and portability. Prior to
Java, cyberspace was effectively closed to half the entities that now live there.
Java addressed those concerns and, by doing so, opened the door to a new
form of program: the applet.

Java Applets
An applet is a special kind of Java program that is designed to be transmitted
over the Internet and automatically executed by a Java-compatible web
browser. Furthermore, an applet is downloaded on demand, just like an
image, sound file, or video clip. The important difference is that an applet is an
intelligent program, not just an animation or media file. In other words, an
applet is a program that can react to user input and dynamically change, not
just run the same animation or sound over and over. As exciting as applets are, they
would be nothing more than wishful thinking if Java were not able to address
the two fundamental problems associated with them: security and portability.
Before continuing, let’s define what these two terms mean relative to the
Internet.

Security
As you are likely aware, every time that you download a “normal” program,
you are risking a viral infection. Prior to Java, most users did not download
executable programs frequently, and those who did scanned them for viruses
prior to execution. Even so, most users still worried about the possibility of
infecting their systems with a virus. In addition to viruses, another type of
malicious program exists that must be guarded against. This type of program
can gather private information, such as credit card numbers, bank account
balances, and passwords, by searching the contents of your computer’s local
file system.

Java answers both of these concerns by providing a “firewall” between a
networked application and your computer. When you use a Java-compatible
web browser, you can safely download Java applets without fear of viral
infection or malicious intent. Java achieves this protection by confining a Java
program to the Java execution environment and not allowing it access to other
parts of the computer. (You will see how this is accomplished shortly.) The
ability to download applets with confidence that no harm will be done and that
no security will be breached is considered by many to be the single most
innovative aspect of Java.

Portability
As discussed earlier, many types of computers and operating systems are in
use throughout the world—and many are connected to the Internet. For
programs to be dynamically downloaded to all the various types of platforms
connected to the Internet, some means of generating portable executable
code is needed. As you will soon see, the same mechanism that helps ensure
security also helps create portability. Indeed, Java’s solution to these two
problems is both elegant and efficient.

Java’s Magic: The Bytecode


The key that allows Java to solve both the security and the portability
problems just described is that the output of a Java compiler is not executable
code. Rather, it is bytecode. Bytecode is a highly optimized set of instructions
designed to be executed by the Java run-time system, which is called the
Java Virtual Machine (JVM). In essence, the JVM is an interpreter for
bytecode. This may come as a bit of a surprise since most modern languages
are designed to be compiled into executable code, not interpreted, because of
performance concerns. However, the fact that a Java program is interpreted
by the JVM helps solve the major problems associated with downloading
programs over the Internet. Here is why. Translating a Java program into
bytecode makes it much easier to run a program in a wide variety of
environments. The reason is straightforward: only the JVM needs to be
implemented for each platform. Once the run-time package exists for a given
system, any Java program can run on it. Remember, although the details of
the JVM will differ from platform to platform, all understand the same Java
bytecode. If a Java program were compiled to native code, then different
versions of the same program would have to exist for each type of CPU
connected to the Internet. This is, of course, not a feasible solution. Thus, the
execution of bytecode by the JVM is the easiest way to create truly portable
programs. The fact that a Java program is executed by the JVM also helps to
make it secure. Because the JVM is in control, it can contain the program and
prevent it from generating side effects outside of the system. As you will see,
safety is also enhanced by certain restrictions that exist in the Java language.
In general, when a program is compiled to an intermediate form and then
interpreted by a virtual machine, it runs slower than it would run if compiled to
executable code. However, with Java, the differential between the two is not
so great. Because bytecode has been highly optimized, the use of bytecode
enables the JVM to execute programs much faster than you might expect.
Although Java was initially designed as an interpreted language, there is
technically nothing about Java that prevents on-the-fly compilation of
bytecode into native code in order to boost performance. For this reason, Sun
began supplying its HotSpot technology not long after Java’s initial release.
HotSpot provides a Just-In-Time (JIT) compiler for bytecode. When a JIT
compiler is part of the JVM, selected portions of bytecode are compiled into
executable code in real time, on a piece-by-piece, demand basis. It is
important to understand that it is not practical to compile an entire Java
program into executable code all at once, because Java performs various
run-time checks that can be done only at run time. Instead, a JIT compiler
compiles code as it is needed, during execution. Furthermore, not all
sequences of bytecode are compiled—only those that will benefit from
compilation. The remaining code is simply interpreted. However, the just-in-
time approach still yields a significant performance boost. Even when dynamic
compilation is applied to bytecode, the portability and safety features still
apply, because the JVM is still in charge of the execution environment.
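
The compile-then-execute cycle described above can be tried with a very small program. The following is only a sketch: the class name BytecodeDemo and its add method are our own illustrative names, and the commands in the comments assume a standard JDK with javac, javap and java on the path.

```java
// Save as BytecodeDemo.java (hypothetical file name for this sketch).
// Compile:     javac BytecodeDemo.java   -> produces BytecodeDemo.class (bytecode)
// Disassemble: javap -c BytecodeDemo     -> lists the bytecode instructions
// Run:         java BytecodeDemo         -> the JVM executes the bytecode
final class BytecodeDemo {

    // A trivial method; javac turns its body into a few bytecode instructions.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // The same .class file runs unchanged on any platform that has a JVM;
        // only the platform details printed below differ.
        System.out.println("2 + 3 = " + add(2, 3));
        System.out.println("JVM: " + System.getProperty("java.vm.name"));
        System.out.println("OS/arch: " + System.getProperty("os.name")
                + " / " + System.getProperty("os.arch"));
    }
}
```

Copying the compiled .class file to a machine with a different operating system and running it there without recompiling is a simple way to observe the portability discussed in this section.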

The Java Buzzwords


No discussion of Java’s history is complete without a look at the Java
buzzwords. Although the fundamental forces that necessitated the invention
of Java are portability and security, other factors also played an important role
in molding the final form of the language. The Java team summed up the key
considerations in the following list of buzzwords:

• Simple
• Secure
• Portable
• Object-oriented
• Robust
• High Performance
• Multithreaded
• Architecture-neutral
• Interpreted
• Distributed
• Dynamic

Simple
Java was designed to be easy for the professional programmer to learn and
use effectively. Assuming that you have some programming experience, you
will not find Java hard to master. If you already understand the basic concepts
of object-oriented programming, learning Java will be even easier. Best of all,
if you are an experienced C++ programmer, moving to Java will require very
little effort. Because Java inherits the C/C++ syntax and many of the object-
oriented features of C++, most programmers have little trouble learning Java.

Secure
Security is an important concern because Java is meant to be used in
networked environments. Java implements several security mechanisms to
protect against code that might create a virus or invade the file system. All
these security mechanisms are based on the premise that nothing is to be
trusted. Java's memory allocation model and the absence of pointers are a step
towards security. The Java compiler does not make memory layout decisions,
so a programmer cannot guess the actual memory layout of a class by looking
at its declarations. Java anticipates and defends against most of the
techniques that have historically been used to trick software into misbehaving.

Portable
Being architecture-neutral is one big part of being portable. But Java provides
further portability by making sure that there are no implementation-dependent
aspects of the language specification. For example, Java explicitly defines the size
of each primitive data type as well as its arithmetic behavior.
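
The fixed sizes mentioned above can be checked directly. The sketch below is illustrative only; the class name PrimitiveSizes and the helper increment are our own names. It prints the bit widths that the language defines and shows that int arithmetic wraps around in the same way on every platform.

```java
final class PrimitiveSizes {

    // int arithmetic is defined as 32-bit two's complement, so adding 1 to
    // Integer.MAX_VALUE wraps around to Integer.MIN_VALUE on every platform.
    static int increment(int value) {
        return value + 1;
    }

    public static void main(String[] args) {
        // The SIZE constants give the width of each primitive type in bits.
        System.out.println("byte:   " + Byte.SIZE);
        System.out.println("short:  " + Short.SIZE);
        System.out.println("char:   " + Character.SIZE);
        System.out.println("int:    " + Integer.SIZE);
        System.out.println("long:   " + Long.SIZE);
        System.out.println("float:  " + Float.SIZE);
        System.out.println("double: " + Double.SIZE);

        System.out.println(increment(Integer.MAX_VALUE) == Integer.MIN_VALUE);
    }
}
```

In C or C++ the width of int depends on the compiler and the target machine, which is the kind of implementation dependence that Java removes.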

Object-Oriented
Although influenced by its predecessors, Java was not designed to be source-
code compatible with any other language. This allowed the Java team the
freedom to design with a blank slate. One outcome of this was a clean,
usable, pragmatic approach to objects. Borrowing liberally from many seminal
object-software environments of the last few decades, Java manages to strike
a balance between the purist’s “everything is an object” paradigm and the
pragmatist’s “stay out of my way” model. The object model in Java is simple
and easy to extend, while primitive types, such as integers, are kept as high-
performance nonobjects.
Robust
The multi-platformed environment of the Web places extraordinary demands
on a program, because the program must execute reliably in a variety of
systems. Thus, the ability to create robust programs was given a high priority
in the design of Java. To gain reliability, Java restricts you in a few key areas,
to force you to find your mistakes early in program development. At the same
time, Java frees you from having to worry about many of the most common
causes of programming errors.

Because Java is a strictly typed language, it checks your code at compile
time. However, it also checks your code at run time. In fact, many
hard-to-track-down bugs that often turn up in hard-to-reproduce run-time situations
are simply impossible to create in Java. Knowing that what you have written
will behave in a predictable way under diverse conditions is a key feature of
Java.
To better understand how Java is robust, consider two of the main reasons for
program failure: memory management mistakes and mishandled exceptional
conditions (that is, runtime errors). Memory management can be a difficult,
tedious task in traditional programming environments. For example, in C/C++,
the programmer must manually allocate and free all dynamic memory.
This sometimes leads to problems, because programmers will either forget to
free memory that has been previously allocated or, worse, try to free some
memory that another part of their code is still using. Java virtually eliminates
these problems by managing memory allocation and deallocation for you. (In
fact, deallocation is completely automatic, because Java provides garbage
collection for unused objects.) Exceptional conditions in traditional
environments often arise in situations such as division by zero or “file not
found,” and they must be managed with clumsy and hard-to-read constructs.
Java helps in this area by providing object-oriented exception handling. In a
well-written Java program, all run-time errors can—and should—be managed
by your program.
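
The two exceptional conditions named above, division by zero and “file not found”, can be handled as in the following sketch. The class and method names (ExceptionDemo, safeDivide, canOpen) are our own and are used only for illustration; the try-with-resources form assumes Java 7 or later.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

final class ExceptionDemo {

    // Returns a / b, or the fallback value when b is zero.
    static int safeDivide(int a, int b, int fallback) {
        try {
            return a / b;                 // throws ArithmeticException when b == 0
        } catch (ArithmeticException e) {
            return fallback;
        }
    }

    // Returns true if the file at the given path can be opened for reading.
    static boolean canOpen(String path) {
        try (FileReader reader = new FileReader(path)) {   // closed automatically
            return true;
        } catch (FileNotFoundException e) {
            return false;                 // the "file not found" case
        } catch (IOException e) {
            return false;                 // failure while closing the reader
        }
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(10, 2, -1));
        System.out.println(safeDivide(10, 0, -1));
        System.out.println(canOpen("no-such-file-for-this-demo.txt"));
    }
}
```

The error handling sits next to the code that can fail, and the compiler insists that the checked exceptions (here FileNotFoundException and IOException) are either caught or declared.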
High performance
Java bytecode is interpreted, so purely interpreted Java code generally runs
slower than compiled C code. This speed is nevertheless adequate for interactive
GUI and network-based applications, which often sit idle waiting for data or user input.
For performance-critical situations, just-in-time compilers
translate Java bytecode into machine code for the particular CPU at
run time. The process of generating code is fairly simple, and it produces
reasonably good code.

Multithreaded
Java was designed to meet the real-world requirement of creating interactive,
networked programs. To accomplish this, Java supports multithreaded
programming, which allows you to write programs that do many things
simultaneously. The Java run-time system comes with an elegant yet
sophisticated solution for multi-process synchronization that enables you to
construct smoothly running interactive systems. Java’s easy-to-use approach
to multithreading allows you to think about the specific behavior of your
program, not the multitasking subsystem.
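
A minimal sketch of the synchronization support described above follows. The names CounterDemo and runThreads are our own; the example only illustrates the synchronized keyword and is not part of the project code.

```java
final class CounterDemo {
    private int count = 0;

    // synchronized allows only one thread at a time to update the counter.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts the given number of threads, each of which increments one shared
    // counter, waits for all of them, and returns the final count.
    static int runThreads(int threads, int incrementsPerThread) {
        CounterDemo counter = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();            // wait until this worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 1000));
    }
}
```

Without the synchronized keyword on increment, two threads could read the same old value of count and one of the updates would be lost.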

Architecture-Neutral
A central issue for the Java designers was that of code longevity and
portability. One of the main problems facing programmers is that no
guarantee exists that if you write a program today, it will run tomorrow—even
on the same machine. Operating system upgrades, processor upgrades, and
changes in core system resources can all combine to make a program
malfunction. The Java designers made several hard decisions in the Java
language and the Java Virtual Machine in an attempt to alter this situation.
Their goal was “write once; run anywhere, any time, forever.” To a great
extent, this goal was accomplished.
Interpreted and High Performance
As described earlier, Java enables the creation of cross-platform programs by
compiling into an intermediate representation called Java bytecode. This code
can be executed on any system that implements the Java Virtual Machine.
Most previous attempts at cross-platform solutions have done so at the
expense of performance. As explained earlier, the Java bytecode was
carefully designed so that it would be easy to translate directly into native
machine code for very high performance by using a just-in-time compiler.
Java run-time systems that provide this feature lose none of the benefits of
the platform-independent code.

Distributed
Java is designed for the distributed environment of the Internet, because it
handles TCP/IP protocols. In fact, accessing a resource using a URL is not
much different from accessing a file. Java also supports Remote Method
Invocation (RMI). This feature enables a program to invoke methods across a
network.
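
The remark that reading from a URL is much like reading from a file can be illustrated as follows. To keep the sketch runnable without a network connection, it writes a temporary file and reads it back through a file: URL; the same firstLine method would accept an http: URL. All class and method names here are our own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class UrlReadDemo {

    // Reads the first line of whatever the URL points to (file:, http:, ...).
    static String firstLine(URL url) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    // Writes the text to a temporary file and reads it back through a URL.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("url-demo", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return firstLine(tmp.toUri().toURL());
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello from a URL"));
    }
}
```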

Dynamic
Java programs carry with them substantial amounts of run-time type
information that is used to verify and resolve accesses to objects at run time.
This makes it possible to dynamically link code in a safe and expedient
manner. This is crucial to the robustness of the applet environment, in which
small fragments of bytecode may be dynamically updated on a running
system.
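
As a small illustration of run-time type information, the sketch below loads a class by its name while the program is running and creates an instance of it. The class name DynamicLoadDemo and the method create are our own; Class.forName and getDeclaredConstructor are standard library calls.

```java
import java.util.List;

final class DynamicLoadDemo {

    // Loads a class by name at run time and creates an instance of it using
    // its no-argument constructor.
    static Object create(String className) {
        try {
            Class<?> type = Class.forName(className);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name is an ordinary string, so it could come from a
        // configuration file or from the network.
        Object list = create("java.util.ArrayList");
        System.out.println(list.getClass().getName());
        System.out.println(list instanceof List);
    }
}
```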

Socket programming
The communication that occurs between the client and the server must be
reliable. The data must not be lost and must be available in the same
sequence in which the server sent it.
Transmission Control Protocol (TCP) provides a reliable, point-to-point
communication channel. To communicate over TCP, client and server
programs establish a connection and bind a socket. Sockets are used to
handle communication links between applications over the network. Further
communication between the client and the server is through the socket.
Java was designed as a networking language. It makes network programming
easier by encapsulating connection functionality in the socket classes, that is,
the Socket class to create a client socket, and the ServerSocket class to
create a server socket.

• Socket is the basic class that supports the TCP protocol. TCP is a
reliable, stream-oriented network connection protocol. The Socket class provides
methods for stream I/O, which make reading from and writing to a
socket easy. This class is indispensable to programs written to
communicate on the Internet.

• ServerSocket is a class used by Internet server programs for listening
to client requests. ServerSocket does not actually perform the service;
instead, it creates a Socket object for each client connection. The
communication is performed through the object created.
Creating a Socket

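import java.io.IOException;
import java.net.Socket;

Socket socketConnection = null;
try
{
    // Connect to the server by host name and port number.
    socketConnection = new Socket("www.vcerohtak.com", 1001);
}
catch (IOException e)
{
    // The connection failed; report it rather than ignoring it silently.
    System.err.println("Could not connect: " + e.getMessage());
}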

The constructor for the Socket class requires a host to connect to, in this case
www.vcerohtak.com, and a port number, in this case 1001, which is the port on
which the server listens. If the server is up and
running, the code creates a new Socket instance and continues running. If the
code encounters a problem while connecting, it throws an exception.
To disconnect from the server, use the close() method.
socketConnection.close();

Creating a SERVER Socket

To create a server, we need to create a ServerSocket object that listens at a
particular port for client requests. When a client connects, the
server socket's accept() method returns a Socket object representing that
connection. The communication
between the server and the client occurs using this socket.

The ServerSocket class represents the server in a client/server application.


The ServerSocket class provides constructors to create a socket on a
specified port.
The class provides methods which
• Listen for a connection.
• Return the address and local port.
• Return the string representation of the Socket.

The code for the constructor is as follows: -

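public Server()
{
    try
    {
        // Listen for client connections on port 1001.
        serverSocket = new ServerSocket(1001);
    }
    catch (IOException e)
    {
        // fail() is a helper method that is not shown in this listing.
        fail(e, "Could not start server");
    }
    System.out.println("Server started");
    // Start the server thread (this call implies that Server extends Thread).
    this.start();
}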
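
The client and server fragments above can be combined into one small runnable sketch. This is not the project's server: the class EchoDemo and its methods are our own illustrative names, the server handles a single client and simply echoes one line back, and it binds to a free port on the local machine instead of port 1001 so that it can be run anywhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class EchoDemo {

    // Starts a one-shot echo server on a free local port, connects a client
    // socket to it, sends one line, and returns the line that comes back.
    static String echoOnce(String message) {
        InetAddress local = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, local)) {
            Thread serverThread = new Thread(() -> serveOneClient(server));
            serverThread.start();

            // Client side: connect, write one line, read the reply.
            try (Socket socket = new Socket(local, server.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Server side: accept() blocks until a client connects and returns a
    // Socket for that connection; the line received is written straight back.
    private static void serveOneClient(ServerSocket server) {
        try (Socket client = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            out.println(in.readLine());
        } catch (IOException e) {
            System.err.println("Server error: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello"));
    }
}
```

A real server would call accept() in a loop and hand each returned Socket to its own thread, so that several clients can be served at the same time.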
Introduction To Proxy Server
Definition

A server that sits between a client application, such as a Web browser, and a
real server. It intercepts all requests to the real server to see if it can fulfill the
requests itself. If not, it forwards the request to the real server.

Proxy servers have two main purposes:

• Improve Performance: Proxy servers can dramatically improve
performance for groups of users. This is because a proxy server saves the results
of all requests for a certain amount of time. Consider the case where
both user X and user Y access the World Wide Web through a proxy
server. First user X requests a certain Web page, which we'll call Page
1. Sometime later, user Y requests the same page. Instead of
forwarding the request to the Web server where Page 1 resides, which
can be a time-consuming operation, the proxy server simply returns
the Page 1 that it already fetched for user X. Since the proxy server is
often on the same network as the user, this is a much faster operation.
Real proxy servers support hundreds or thousands of users. The major
online services such as CompuServe and America Online, for
example, employ an array of proxy servers.

• Filter Requests: Proxy servers can also be used to filter requests. For
example, a company might use a proxy server to prevent its
employees from accessing a specific set of Web sites.

The advantage of using a common caching proxy server depends on the
probability of finding a page in the local cache. This probability is generally
expressed as the hit rate. A cache several GB in size with a lot of users can
reach a hit rate of 30 to 40 percent. Frequently requested pages, for instance
the help pages of your browser, might be in the cache almost every time. If
the page is not in the local cache, you should not see any difference in
the elapsed time between a direct request and a request handled by a proxy server.
How does a proxy server work?

A proxy server receives a request for an Internet service (such as a Web page
request) from a user. If it passes filtering requirements, the proxy server,
assuming it is also a cache server, looks in its local cache of previously
downloaded Web pages. If it finds the page, it returns it to the user without
needing to forward the request to the Internet. If the page is not in the cache,
the proxy server, acting as a client on behalf of the user, uses one of its own
IP addresses to request the page from the server out on the Internet. When
the page is returned, the proxy server relates it to the original request and
forwards it on to the user.
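
The cache lookup described above can be sketched in a few lines. This is a simplified illustration and not the project's implementation: the class CachingProxySketch is our own name, the origin function is a hypothetical stand-in for the real Internet server, and filtering and cache expiry are left out. The sketch also counts hits so that the hit rate (hits divided by total requests) can be computed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class CachingProxySketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> origin;  // stand-in for the real server
    private int requests = 0;
    private int hits = 0;

    CachingProxySketch(Function<String, String> origin) {
        this.origin = origin;
    }

    // Returns the page for the URL, from the cache when possible.
    String fetch(String url) {
        requests++;
        String page = cache.get(url);
        if (page != null) {
            hits++;                   // served locally, no request is forwarded
            return page;
        }
        page = origin.apply(url);     // forward the request on behalf of the user
        cache.put(url, page);         // keep the response for later requests
        return page;
    }

    // Fraction of requests that were answered from the cache.
    double hitRate() {
        return requests == 0 ? 0.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        CachingProxySketch proxy =
                new CachingProxySketch(url -> "content of " + url);
        proxy.fetch("http://example.com/page1");   // user X: fetched from the origin
        proxy.fetch("http://example.com/page1");   // user Y: served from the cache
        System.out.println(proxy.hitRate());
    }
}
```

In this run one of the two requests is answered from the cache, which corresponds to the example of user X and user Y requesting the same page.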

To the user, the proxy server is invisible; all Internet requests and returned
responses appear to be exchanged directly with the addressed Internet server. (The
proxy is not quite invisible; its IP address has to be specified as a
configuration option to the browser or other protocol program.)

What are the advantages of using a proxy server?

• An advantage of using a proxy server is that its cache can serve all
users. If one or more Internet sites are frequently requested, these are
likely to be in the proxy's cache, which will improve user response time.
In fact, there are special servers called cache servers.
• The functions of proxy, firewall, and caching can be in separate server
programs or combined in a single package. Different server programs
can be in different computers. For example, a proxy server may be on the
same machine as a firewall server, or it may be on a separate server
and forward requests through the firewall.
• There are different types of proxy servers with different features; some
are anonymous proxies, which are used to hide your real IP address,
and some are used to filter sites that contain material that may be
unsuitable for people to view.
• When you connect to a web site, your true IP address will not be
shown, but the proxy server's IP address will. This does not mean that you are
completely anonymous. The proxy server will have logs of the IP addresses that
used the proxy server and the times.

Need of a Proxy Server

So why should you use a proxy?

You can use a proxy server if you have a child and wish to restrict the sites
they view. You will need to make sure you get the correct type of proxy,
because not all proxies filter sites. You can also use a proxy to protect yourself: it can
hide your IP address, which is useful because hackers cannot get
information about you while you use it; they will only get the proxy server's IP address.
Using a proxy server is not hard: no extra hardware or software is needed on your
computer; you just need to configure your browser to connect through it.

Some ISPs (Internet Service Providers) make all their users use a proxy
server; for example, in the United Arab Emirates, the main ISP makes all users
use a proxy server that blocks sites with unsuitable material. It does this
using the meta tags in the HTML code used to make the web page. Some
ISPs may give you a choice, so you can use one or not. If you want to use a
proxy server, there are many around with different functions; you just have to
get the one that suits your needs best.
Uses in Depth

Filter Requests and Control Access

Proxy servers were developed to filter requests going to and coming from the
Internet. As the Internet became an essential part of many companies, it also
became the easiest way to attack companies. So it became necessary to
have a secure connection to the Internet from a private network without
compromising any confidential data. Since proxy servers filter all requests,
there are no unauthorized requests being transferred between the Internet
and the LAN. Proxy servers filter and control access in a couple of different
ways. They can filter requests by the IP address of the computer they
came from, as well as by controlling the access of the user who made the
request. User authentication is available on most proxy servers, and is
usually integrated with the authentication that takes place to connect to the
LAN. However, users can usually still connect to the proxy server using their
LAN credentials, even if they are not logged in to the LAN. Since there is user
authentication, the proxy server can keep a log of all the requests each user
makes. Another advantage of having user authentication integrated with the
LAN is that policies and groups can be set up to allow only certain users
access to certain sites. This is a big advantage for companies because they
are able to restrict what their employees have access to. By filtering the
request by the IP address of the computer that sent it as well as where it is
going, the proxy server can determine if the request is legitimate. An inbound
message will not be forwarded to a computer unless that computer has
requested it. Proxy servers have another feature that filters requests:
access control lists. Proxy servers use an access control list to filter out
unacceptable requests. This list contains the addresses of computers or sites
that are not to be accessed by anyone behind the firewall. These can be sites
with inappropriate content, or frequently used sites that serve no business
function, such as eBay. The proxy server can also search through a request
or site for inappropriate words. Maintaining these lists is the most difficult part
of operating proxy servers. There are too many sites out there to block all
that are unnecessary. And there are thousands of new sites every day. In
response to this, there are some vendors that offer a subscription service that
gives you updated access control lists. This makes the administering of the
proxy server much easier, but it does cost more money. In order to control
access and filter websites, companies must have clear Internet usage policies
in place. They cannot block employees from viewing things without having
documented rules to back it up. Where to draw the line on what employees
should have access to is a very sensitive subject.
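
The two filtering checks described above (who is asking, and what is being asked for) can be illustrated with a minimal sketch. It is not part of the httpd code presented later in this report; the class name AccessControlSketch, its methods, and the sample addresses are hypothetical, and a real proxy would normally combine such checks with user authentication.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of proxy-side request filtering (assumed names throughout).
class AccessControlSketch {
    // Client address prefixes that belong to the LAN, e.g. "192.168.1."
    private final Set<String> allowedClientPrefixes = new HashSet<>();
    // Host names that nobody behind the proxy may reach.
    private final Set<String> blockedHosts = new HashSet<>();

    void allowClientPrefix(String prefix) {
        allowedClientPrefixes.add(prefix);
    }

    void blockHost(String host) {
        blockedHosts.add(host.toLowerCase(Locale.ROOT));
    }

    // A request passes only if it comes from a known client address
    // and its target host is not on the block list.
    boolean isAllowed(String clientIp, String targetHost) {
        boolean clientOk = false;
        for (String prefix : allowedClientPrefixes) {
            if (clientIp.startsWith(prefix)) {
                clientOk = true;
                break;
            }
        }
        return clientOk && !blockedHosts.contains(targetHost.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        AccessControlSketch acl = new AccessControlSketch();
        acl.allowClientPrefix("192.168.1.");
        acl.blockHost("blocked.example.com");
        System.out.println(acl.isAllowed("192.168.1.20", "www.example.org"));
        System.out.println(acl.isAllowed("192.168.1.20", "blocked.example.com"));
        System.out.println(acl.isAllowed("10.0.0.5", "www.example.org"));
    }
}
```

Keeping the block list in a set makes each lookup cheap, but it does nothing to ease the maintenance burden discussed above: the list still has to be kept up to date by hand or through a subscription service.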

Internet Access behind a Firewall


Another main function of a proxy server is to provide Internet access to users
that are behind a firewall. Firewalls were designed to block access into and
out of LANs. As mentioned before, proxy servers are able to filter and control
access to and from the Internet. This allows a company to share Internet access
with employees who have been placed behind a firewall to ensure the
security of the network. The proxy server is able to allow users to access the
Internet without compromising security because it uses its own IP addresses
to make the requests on the Internet. When a response is returned, the proxy
remembers which computer originally made the request, and forwards the
response to it. This allows the computers on the network to remain
invisible to the outside world.

Improving Performance
Proxy servers are able to improve the performance and efficiency of a
network by caching websites. By caching websites, proxy servers are storing
them locally on the server’s hard drive. When caching is enabled, proxy
servers cache sites that are requested frequently such as Yahoo. When a
user requests Yahoo, the proxy server checks the Internet to see if there is a
more recent version. If there is, it will place it in the cache and forward it to
the user. If there is not and the version on the proxy server is current, it will
forward that one to the user. This means the server does not have to
download any new content. Another way to configure caching is to only
update the cache periodically. This improves performance even more
because the proxy server would not have to connect to the Internet if the site
was in the cache. However, it means that the user may not be getting the
most up-to-date version of the page they requested. By caching this way, the
administrator must determine which sites should be cached and how often the
cache should be updated. This is a very difficult task to figure out. Here are
some overall advantages and disadvantages of caching:

Advantages
· Improved user response time
· No need to cache on local user machines

Disadvantages
· Requires more disk space
· Difficult to know when to update or delete cache
· Possibility of providing users with non-current sites
and information.
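
The periodic-update configuration described above can be sketched as a cache whose entries expire after a fixed age. This is only an illustration under assumptions: the class name TtlCacheSketch, the single maximum age for all sites, and passing the current time in as a parameter are choices made here, not features of any particular proxy product.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a cache that is refreshed only periodically.
class TtlCacheSketch {
    private static class Entry {
        final String body;
        final long fetchedAtMillis;

        Entry(String body, long fetchedAtMillis) {
            this.body = body;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxAgeMillis;

    TtlCacheSketch(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    // Store a page together with the time it was fetched.
    void put(String url, String body, long nowMillis) {
        entries.put(url, new Entry(body, nowMillis));
    }

    // Returns the cached body, or null when the entry is missing or too old.
    // A null result means the caller must fetch a fresh copy from the Internet.
    String get(String url, long nowMillis) {
        Entry e = entries.get(url);
        if (e == null || nowMillis - e.fetchedAtMillis > maxAgeMillis) {
            return null;
        }
        return e.body;
    }

    public static void main(String[] args) {
        TtlCacheSketch cache = new TtlCacheSketch(60_000); // one minute
        long now = System.currentTimeMillis();
        cache.put("/index.html", "<html>...</html>", now);
        System.out.println(cache.get("/index.html", now + 1_000) != null);
        System.out.println(cache.get("/index.html", now + 120_000) != null);
    }
}
```

A larger maximum age saves more Internet traffic but raises the chance of serving a stale page, which is exactly the trade-off listed under the disadvantages.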

Sharing Internet Connections

Another feature of proxy servers is that they allow an Internet connection to
be shared. The users need to be connected to the proxy server only. The
proxy server is what actually uses the Internet connection and routes the
responses back to the users. This means that each computer on the network does
not need to have access to the Internet. This increases security and saves a
lot of money. With a properly configured proxy server, users will not notice
much of a delay in response times.
Passive and Active Caching

Proxy Server performs two types of caching: passive caching and active
caching. The difference between the two types lies in when Proxy Server
caches content.

Passive caching

Passive caching occurs on behalf of every Web Proxy service request for
content (i.e., objects). As browsers request content from the Web Proxy
service, the service consults the cache to see whether a current copy of the
object exists. If no copy exists, the service downloads a fresh copy from the
Web server and serves it to the client. Subsequently, the service caches the
object on the proxy server's local drives. This newly cached object is now
ready for the proxy server to serve when other browser requests for the same
object occur.

Serving cached copies of Web pages is a benefit to the local user; however,
for Web sites tracking page hits, the result is a lost hit. Lost hits can potentially
result in lost revenues. In addition, not every type of content is cacheable.
(Examples of non-cacheable content include Active Server Pages—ASP—
and Common Gateway Interface—CGI—objects.) If the content provider used
the <META> tag HTTP-Expires to assign an expiration date and time, Proxy
Server uses this value.

Active caching

Unlike passive caching, active caching is caching that the proxy server
performs during its idle periods. This type of caching is called active because
it proactively downloads the pages that your local proxy server has learned
are requested most frequently. If an entertainment Web site is one of the most
requested Web sites on your proxy server, active caching will have a fresh
copy on hand in anticipation of browser requests. This active caching process
occurs only during idle periods—for example, overnight. You can disable this
feature for those proxy servers that have time or bandwidth restrictions.
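
One way to picture how a proxy could decide what to refresh during an idle period is to count requests per URL and pick the most popular ones. The sketch below assumes exactly that policy; the class ActiveCacheSketch and its methods are hypothetical and do not describe how Proxy Server itself selects pages.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: choose the most requested URLs for idle-time refresh.
class ActiveCacheSketch {
    private final Map<String, Integer> hitCounts = new HashMap<>();

    // Called once for every browser request the proxy handles.
    void recordRequest(String url) {
        hitCounts.merge(url, 1, Integer::sum);
    }

    // Returns up to k URLs, most requested first; ties are ordered alphabetically.
    List<String> urlsToPrefetch(int k) {
        List<String> urls = new ArrayList<>(hitCounts.keySet());
        urls.sort((a, b) -> {
            int byCount = Integer.compare(hitCounts.get(b), hitCounts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(urls.subList(0, Math.min(k, urls.size())));
    }

    public static void main(String[] args) {
        ActiveCacheSketch stats = new ActiveCacheSketch();
        stats.recordRequest("/news");
        stats.recordRequest("/news");
        stats.recordRequest("/mail");
        System.out.println(stats.urlsToPrefetch(1));
    }
}
```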
IMPLEMENTATION DETAILS
Caching Proxy HTTP Server

This section presents a simple caching proxy HTTP server, called httpd, that
demonstrates client and server sockets. httpd supports only GET operations and
a very limited range of hard-coded MIME types. (MIME types are the type
descriptors for multimedia content.) The proxy server is single-threaded, in
that each request is handled in turn while others wait. It has fairly naive
strategies for caching: it keeps everything in RAM forever. When it is acting
as a proxy server, httpd also copies every file it gets to a local cache for
which it has no strategy for refreshing or garbage collecting. All of these
caveats aside, httpd represents a productive example of client and server
sockets, and it is fun to explore and easy to extend.

The implementation of the HTTP proxy server is presented in five classes
and one interface. A more complete implementation would likely split many of
the methods out of the main class, httpd, in order to abstract more of the
components. To save space, the support classes act only as data structures.
We will take a close look at each class and method to examine how this
server works, starting with the support classes and ending with the main
program.

MimeHeader.java

MIME is an Internet standard for communicating multimedia content over e-mail
systems. Nat Borenstein created this standard in 1992. The HTTP
protocol uses and extends the notion of MIME headers to pass general
attribute/value pairs between the HTTP client and server.
CONSTRUCTORS

This class is a subclass of Hashtable so that it can conveniently store and
retrieve the key/value pairs associated with a MIME header. It has two
constructors. One creates a blank MimeHeader with no keys. The other takes
a string formatted as a MIME header and parses it for the initial contents of
the object.

Parse()

The parse() method is used to take a raw MIME-formatted string and
enter its key/value pairs into a given instance of MimeHeader. It uses a
StringTokenizer to split the input data into individual lines, marked by the
CRLF (\r\n) sequence. It then iterates through each line using the canonical
while ... hasMoreTokens() ... nextToken() sequence.

For each line of the MIME header, the parse() method splits the line into two
strings separated by a colon (:). The two variables key and val are set by the
substring() method to extract the characters before the colon and those after
the colon and its following space character. Once these two strings have been
extracted, the put() method is used to store this association between the key
and value in the Hashtable.

ToString()

The toString() method (used by the string concatenation operator, +) is
simply the reverse of parse(). It takes the current key/value pairs stored in the
MimeHeader and returns a string representation of them in the MIME format,
where keys are printed followed by a colon and a space, and then the value
followed by a CRLF.

put(), get(), AND fix()

The put() and get() methods in Hashtable would work fine for this
application if not for one rather odd thing. The MIME specification defines
several important keys, such as Content-Type and Content-Length. Some
early implementations of MIME systems, notably web browsers, took liberties
with the capitalization of these fields. Some use Content-Type, others
content-type. To avoid mishaps, our HTTP server converts the keys of all
incoming and outgoing MimeHeader entries to a canonical capitalization, using
the method fix(), before entering them into the Hashtable and before looking
up a given key.
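
The parsing and key-normalizing steps described above can be tried out in isolation with the following sketch. It is a stand-alone illustration, not the MimeHeader class itself: the class name HeaderParseSketch is hypothetical, it stores pairs in a LinkedHashMap instead of extending Hashtable, and it trims whitespace around keys and values, which is an assumption made here for robustness.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-alone sketch of MIME header parsing with canonical keys.
class HeaderParseSketch {
    // Upper-cases the first letter and every letter that follows a '-',
    // and lower-cases everything else: "CONTENT-type" becomes "Content-Type".
    static String canonical(String name) {
        char[] chars = name.toLowerCase(Locale.ROOT).toCharArray();
        boolean upcaseNext = true;
        for (int i = 0; i < chars.length; i++) {
            char ch = chars[i];
            if (upcaseNext) {
                chars[i] = Character.toUpperCase(ch);
            }
            upcaseNext = (ch == '-');
        }
        return new String(chars);
    }

    // Splits the raw header into lines and stores each "key: value" pair.
    static Map<String, String> parse(String raw) {
        Map<String, String> headers = new LinkedHashMap<>();
        StringTokenizer lines = new StringTokenizer(raw, "\r\n");
        while (lines.hasMoreTokens()) {
            String line = lines.nextToken();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "key: value" line
            }
            String key = canonical(line.substring(0, colon).trim());
            String val = line.substring(colon + 1).trim();
            headers.put(key, val);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> h = parse("content-type: text/html\r\nCONTENT-LENGTH: 42\r\n");
        System.out.println(h.get("Content-Type"));
        System.out.println(h.get("Content-Length"));
    }
}
```

Because every key passes through canonical() before it is stored, a later lookup for Content-Type succeeds no matter how the sender capitalized the field.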

CODE

import java.util.*;

class MimeHeader extends Hashtable {

    // Split a raw MIME header into lines and store each "Key: value" pair.
    void parse(String data) {
        StringTokenizer st = new StringTokenizer(data, "\r\n");
        while (st.hasMoreTokens()) {
            String s = st.nextToken();
            int colon = s.indexOf(':');
            if (colon < 0)
                continue; // not a "Key: value" line, so skip it
            String key = s.substring(0, colon);
            String val = s.substring(colon + 1).trim(); // drop the space after ':'
            put(key, val);
        }
    }

    MimeHeader() {}

    MimeHeader(String d) {
        parse(d);
    }

    public String toString() {
        String ret = "";
        Enumeration e = keys();
        while (e.hasMoreElements()) {
            String key = (String) e.nextElement();
            String val = (String) get(key);
            ret += key + ": " + val + "\r\n";
        }
        return ret;
    }

    // This simple function converts a mime string from
    // any variant of capitalization to a canonical form.
    // For example: CONTENT-TYPE or content-type to Content-Type,
    // or Content-length or CoNTeNT-LENgth to Content-Length.
    private String fix(String ms) {
        char chars[] = ms.toLowerCase().toCharArray();
        boolean upcaseNext = true;
        // Visit every character, including the last one.
        for (int i = 0; i < chars.length; i++) {
            char ch = chars[i];
            if (upcaseNext && 'a' <= ch && ch <= 'z') {
                chars[i] = (char) (ch - ('a' - 'A'));
            }
            upcaseNext = ch == '-';
        }
        return new String(chars);
    }

    public String get(String key) {
        return (String) super.get(fix(key));
    }

    public void put(String key, String val) {
        super.put(fix(key), val);
    }
}

HttpResponse.java

The HttpResponse class is a wrapper around everything associated with a
reply from an HTTP server. This is used by the proxy part of our httpd class.
When you send a request to an HTTP server, it responds with an integer
status code, which we store in statusCode, and a textual equivalent, which we
store in reasonPhrase. (These variable names are taken from the wording in
the official HTTP specification). This single line response is followed by a
MIME header, which contains further information about the reply. We use the
previously explained MimeHeader object to parse this string. The
MimeHeader object is stored inside the HttpResponse class in the mh
variable. These variables are not made private so that the httpd class can use
them directly.

CONSTRUCTORS

If you construct an HttpResponse with a string argument, this is taken to be a
raw response from an HTTP server and is passed to parse(), described next,
to initialize the object. Alternatively, you can pass in a precomputed status
code, reason phrase, and MIME header.

Parse()

The parse() method takes the raw data that was read from the HTTP server,
parses the statusCode and reasonPhrase from the first line, then constructs a
MimeHeader out of the remaining lines.
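
The first-line parsing described above can be sketched on its own as follows. The class name StatusLineSketch is hypothetical; unlike the listing in this report, the sketch rejects a line that does not contain two spaces and trims a trailing carriage return, both of which are assumptions added here.

```java
// Hypothetical sketch: split an HTTP status line into its three parts.
class StatusLineSketch {
    final String protocol;
    final int statusCode;
    final String reasonPhrase;

    StatusLineSketch(String line) {
        int fsp = line.indexOf(' ');          // first space: end of the protocol
        int nsp = line.indexOf(' ', fsp + 1); // next space: end of the status code
        if (fsp < 0 || nsp < 0) {
            throw new IllegalArgumentException("Malformed status line: " + line);
        }
        protocol = line.substring(0, fsp);
        statusCode = Integer.parseInt(line.substring(fsp + 1, nsp));
        reasonPhrase = line.substring(nsp + 1).trim();
    }

    public static void main(String[] args) {
        StatusLineSketch s = new StatusLineSketch("HTTP/1.0 302 Found");
        System.out.println(s.protocol + " | " + s.statusCode + " | " + s.reasonPhrase);
    }
}
```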

ToString()

The toString() method is the inverse of parse(). It takes the current values of
the HttpResponse object and returns a string that an HTTP client would
expect to read back from server.

CODE

import java.io.*;

/*
 * HttpResponse
 * Parse a return message and MIME header from a server.
 * HTTP/1.0 302 Found = redirection, check Location for where.
 * HTTP/1.0 200 OK = file data comes after mime header.
 */
class HttpResponse {
    int statusCode;       // Status-Code in spec
    String reasonPhrase;  // Reason-Phrase in spec
    MimeHeader mh;
    static String CRLF = "\r\n";

    // The first line holds protocol, status code, and reason phrase;
    // everything after the first newline is the MIME header.
    void parse(String request) {
        int fsp = request.indexOf(' ');
        int nsp = request.indexOf(' ', fsp + 1);
        int eol = request.indexOf('\n');
        String protocol = request.substring(0, fsp);
        statusCode = Integer.parseInt(request.substring(fsp + 1, nsp));
        reasonPhrase = request.substring(nsp + 1, eol);
        String raw_mime_header = request.substring(eol + 1);
        mh = new MimeHeader(raw_mime_header);
    }

    HttpResponse(String request) {
        parse(request);
    }

    HttpResponse(int code, String reason, MimeHeader m) {
        statusCode = code;
        reasonPhrase = reason;
        mh = m;
    }

    public String toString() {
        return "HTTP/1.0 " + statusCode + " " + reasonPhrase + CRLF +
            mh + CRLF;
    }
}

UrlCacheEntry.java

To cache the contents of a document on a server, we need to make an
association between the URL that was used to retrieve the document and the
description of the document itself. A document is described by its
MimeHeader and the raw data. For example, a MimeHeader containing
Content-Type: image/gif describes an image, and the raw image data is just an
array of bytes. Similarly, a web page will likely have a Content-Type: text/html
key/value pair in its MimeHeader, while the raw data is the contents of the
HTML page. Again, the instance variables are not marked private so that
httpd can have free access to them.

CONSTRUCTOR

The constructor for a UrlCacheEntry object requires the URL to use as the
key and a MimeHeader to associate with it. If the MimeHeader has a field in it
called Content-Length (most do), the data area is preallocated to be large
enough to hold such content.

Append()

The append() method is used to add data to a UrlCacheEntry object. The
reason this isn’t simply a setData() method is that the data might be streaming
in over a network and need to be stored a chunk at a time. The append()
method deals with three cases. In the first case, the data buffer has not been
allocated at all. In the second, the data buffer is too small to accommodate the
incoming data, so it is reallocated. In the last case, the incoming data fits just
fine and is inserted into the buffer. At any time, the length member variable
holds the current valid size of the data buffer.
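
The three cases handled by append() can be exercised with the stand-alone sketch below. The class name GrowableBufferSketch and its constructor argument are hypothetical stand-ins for UrlCacheEntry and its Content-Length lookup; the point is only to show that new bytes always land at offset length and that length grows in every case.

```java
import java.util.Arrays;

// Hypothetical sketch of a byte buffer that grows a chunk at a time.
class GrowableBufferSketch {
    byte[] data;      // null until allocated, or preallocated to an expected size
    int length = 0;   // number of valid bytes in data

    GrowableBufferSketch(int expectedSize) {
        if (expectedSize > 0) {
            data = new byte[expectedSize];
        }
    }

    void append(byte[] chunk, int n) {
        if (data == null) {
            // Case 1: no buffer yet, so allocate exactly what is needed.
            data = new byte[n];
        } else if (length + n > data.length) {
            // Case 2: buffer too small, so grow it and keep the valid bytes.
            byte[] old = data;
            data = new byte[length + n];
            System.arraycopy(old, 0, data, 0, length);
        }
        // Case 3 (and the tail of cases 1 and 2): copy in the new chunk.
        System.arraycopy(chunk, 0, data, length, n);
        length += n;
    }

    public static void main(String[] args) {
        GrowableBufferSketch b = new GrowableBufferSketch(0);
        b.append(new byte[] {1, 2, 3}, 3);
        b.append(new byte[] {4, 5}, 2);
        System.out.println(Arrays.toString(Arrays.copyOf(b.data, b.length)));
    }
}
```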
CODE

class UrlCacheEntry {
    String url;
    MimeHeader mh;
    byte data[];
    int length = 0;

    public UrlCacheEntry(String u, MimeHeader m) {
        url = u;
        mh = m;
        // Preallocate the buffer when the header announces the size.
        String cl = mh.get("Content-Length");
        if (cl != null) {
            data = new byte[Integer.parseInt(cl)];
        }
    }

    void append(byte d[], int n) {
        if (data == null) {
            // Case 1: no buffer has been allocated yet.
            data = new byte[n];
            System.arraycopy(d, 0, data, 0, n);
            length = n;
        } else if (length + n > data.length) {
            // Case 2: the buffer is too small, so reallocate it. Only the
            // first 'length' bytes are valid; the new chunk goes after them.
            byte old[] = data;
            data = new byte[old.length + n];
            System.arraycopy(old, 0, data, 0, length);
            System.arraycopy(d, 0, data, length, n);
            length += n;
        } else {
            // Case 3: the incoming data fits.
            System.arraycopy(d, 0, data, length, n);
            length += n;
        }
    }
}

LogMessage.java

LogMessage is a simple interface that declares one method, log(), which
takes a single String parameter. This is used to abstract the output of
messages from the httpd. In the application case, this method is implemented
to print to the standard output of the console in which the application was
started. In the applet case, the data is appended to a windowed text buffer.

CODE

interface LogMessage {
public void log(String msg);
}

httpd.java
CONSTRUCTOR

There are five main instance variables: port, docRoot, log, cache, and stopFlag,
and all of them are private.
httpd’s lone constructor, shown here, can set three of these:
httpd(int p, String dr, LogMessage lm)

It initializes the port to listen on, the directory to retrieve files from, and the
interface to send messages to.
The fourth instance variable, cache, is the Hashtable where all of the files are
cached in RAM, and is initialized when the object is created. stopFlag controls
the execution of the program.

STATIC SECTION

There are several important static variables in this class. The version reported
in the “Server” field of the MIME header is found in the variable version. A few
constants are defined next: the MIME type for HTML files, mime_text_html;
the MIME end-of-line sequence, CRLF; the name of the HTML file to return in
place of raw directory requests, indexfile; and the size of the data buffer used in
I/O, buffer_size.

Then mt defines a list of filename extensions and the corresponding MIME
types for those files. The types Hashtable is statically initialized in the next
block to contain the array mt as alternating keys and values. Then the
fnameToMimeType() method can be used to return the proper MIME type for
each filename passed in. If the filename does not have one of the extensions
from the mt table, the method returns the type registered for defaultExt, which
is “text/plain”.
STATISTICAL COUNTERS

Next are five more instance variables. These are left without the private
modifier so that an external monitor can inspect these values to display them
graphically. (We will show this in action later.) These variables represent the
usage statistics of our web server. The raw number of hits and bytes served is
stored in hits_served and bytes_served. The number of files and bytes
currently stored in the cache is stored in files_in_cache and bytes_in_cache.
Finally we store the number of hits that were successfully served out of the
cache in hits_to_cache.

ToBytes()

Next we have a convenience routine, toBytes(), which converts its string
argument to an array of bytes. This is necessary, because Java String objects
are stored as Unicode characters, while the lingua franca of Internet protocols
such as HTTP is good old 8-bit ASCII.
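
To make the Unicode-to-ASCII conversion explicit, the sketch below names the character set instead of relying on the platform default. This is an alternative to the toBytes() routine in the listing, not a copy of it; the class name AsciiBytesSketch is hypothetical, and the choice of US-ASCII is an assumption that suits protocol lines such as the request line and headers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: convert protocol text to bytes with an explicit charset.
class AsciiBytesSketch {
    // Characters outside US-ASCII are replaced by '?' during encoding.
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] b = toBytes("GET / HTTP/1.0\r\n");
        System.out.println(b.length + " bytes, first byte = " + b[0]);
    }
}
```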

MakeMimeHeader()

The makeMimeHeader() method is another convenience routine that is used
to create a MimeHeader object with a few key values filled in. The
MimeHeader that is returned from this method has the current time and date
in the Date field, the name and version of our server in the Server field, the
type parameter in the Content-Type field, and the length parameter in the
Content-Length field.
Error()

The error() method is used to format an HTML page to send back to web
clients who make requests that cannot be completed. The first parameter,
code, is the error code to return. Typically this will be between 400 and 499.
Our server sends back 404 and 405 errors. It uses the HttpResponse class
to encapsulate the return code with the appropriate MimeHeader. The method
returns the string representation of that response concatenated with the
HTML page to show the user. The page includes a human-readable version of
the error code, msg, and the url request that caused the error.

GetRawRequest()

The getRawRequest() method is very simple. It reads data from a stream until
it gets two consecutive newline characters. It ignores carriage returns and just
looks for newlines. Once it has found the second newline, it turns the array
of bytes into a String object and returns it. It will return null if the input stream
does not produce two consecutive newlines before it ends. This is how
messages from HTTP servers and clients are formatted. They begin with one
line of status and then are immediately followed by a MIME header. The end
of the MIME header is separated from the rest of the content by two newlines.
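
The read-until-blank-line rule described above can be tested without a network connection by feeding the reader an in-memory stream. The sketch below is an independent illustration: the class HeaderReaderSketch and the helper readHeaderFromString() are hypothetical, and it uses a ByteArrayOutputStream instead of a fixed-size array so that long headers cannot overflow the buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: read an HTTP header block up to the first blank line.
class HeaderReaderSketch {
    // Returns everything before the second of two consecutive newlines,
    // with carriage returns dropped, or null if the stream ends first.
    static String readHeader(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int prev = -1;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\r') {
                continue; // ignore carriage returns entirely
            }
            if (c == '\n' && prev == '\n') {
                return new String(buf.toByteArray(), StandardCharsets.US_ASCII);
            }
            buf.write(c);
            prev = c;
        }
        return null;
    }

    // Convenience wrapper for experiments with literal strings.
    static String readHeaderFromString(String raw) {
        try {
            return readHeader(new ByteArrayInputStream(raw.getBytes(StandardCharsets.US_ASCII)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String h = readHeaderFromString("GET / HTTP/1.0\r\nHost: example.org\r\n\r\nBODY");
        System.out.print(h);
    }
}
```

Anything after the blank line (the word BODY in the example) is left unread in the stream, which is what allows the caller to continue reading the document content from the same stream.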

LogEntry()

The logEntry() method is used to report each hit to the HTTP server in a standard
format. The format this method produces may seem odd, but it matches the
current standard for HTTP log files. This method has several helper variables
and methods that are used to format the date stamp on each log entry. The
months array is used to convert the month to a string representation. The
host variable is set by the main HTTP loop when it accepts a connection from
a given host. The fmt02d() method formats integers between 0 and 9 as 2-
digit, leading-zero numbers. The resulting string is then passed through the
LogMessage interface variable log.
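
For comparison, a log line in the widely used Common Log Format can also be produced with SimpleDateFormat rather than by hand-assembling Calendar fields. The sketch below is an alternative written for illustration; the class LogLineSketch and its format() method are hypothetical, and it assumes the two unused identity fields are always written as dashes, as in the listing.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch: build one Common Log Format line.
class LogLineSketch {
    static String format(String host, Date when, TimeZone tz,
                         String cmd, String url, int code, int size) {
        // dd/MMM/yyyy:HH:mm:ss with a numeric zone offset such as +0530
        SimpleDateFormat sdf =
            new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss Z", Locale.US);
        sdf.setTimeZone(tz);
        return host + " - - [" + sdf.format(when) + "] \"" +
            cmd + " " + url + " HTTP/1.0\" " + code + " " + size;
    }

    public static void main(String[] args) {
        System.out.println(format("client.example", new Date(),
            TimeZone.getDefault(), "GET", "/index.html", 200, 1024));
    }
}
```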

WriteString()

Another convenience method, writeString(), is used to hide the conversion of
a String to an array of bytes so that it can be written out to a stream.

WriteUCE()

The writeUCE() method takes an OutputStream and a UrlCacheEntry. It
extracts the information out of the cache in order to send a message to a web
client containing the appropriate response code, MIME header and content.

ServeFromCache()

This Boolean method attempts to find a particular URL in the cache. If it is
successful, then the contents of that cache entry are written to the client, the
hits_to_cache variable is incremented, and true is returned to the caller.
Otherwise, it simply returns false.

LoadFile()

This method takes an InputStream, the URL that corresponds to it, and the
MimeHeader for that URL. A new UrlCacheEntry is created with the
information stored in the MimeHeader. The input stream is read in chunks of
buffer_size bytes and appended to the UrlCacheEntry, which is then stored in
the cache. The files_in_cache and bytes_in_cache variables are
updated, and the UrlCacheEntry is returned to the caller.

ReadFile()

The readFile() method might seem redundant with the loadFile() method. It
isn’t. This method is strictly for reading files out of a local file system, where
loadFile() is used to talk to streams of any sort. If the file object f exists then
an InputStream is created for it. The size of the file is determined and the
MIME type is derived from the filename. These two variables are used to
create the appropriate MimeHeader, then loadFile() is called to do the actual
reading and caching.

WriteDiskCache()

The writeDiskCache() method takes a UrlCacheEntry object and writes it
persistently to the local disk. It constructs a directory name out of the URL,
making sure to replace the slash (/) characters with the system-dependent
separatorChar. Then it calls mkdirs() to make sure that the local disk path
exists for this URL. Lastly, it opens a FileOutputStream, writes all the data into
it, and closes it.
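
The path-mapping step is easy to get wrong because String.replace() returns a new string instead of changing the original. The sketch below isolates that step; the class DiskCachePathSketch and its two methods are hypothetical, and the separator is passed in as a parameter so the behaviour can be checked for any platform.

```java
import java.io.File;

// Hypothetical sketch: map a cache URL to a local file path.
class DiskCachePathSketch {
    // Joins the document root and the URL, then swaps '/' for the separator.
    // The result of replace() must be used; the original string is unchanged.
    static String toLocalPath(String docRoot, String url, char separator) {
        return (docRoot + url).replace('/', separator);
    }

    // Directory part of a local path: everything before the last separator.
    static String parentDir(String localPath, char separator) {
        int last = localPath.lastIndexOf(separator);
        return last < 0 ? "" : localPath.substring(0, last);
    }

    public static void main(String[] args) {
        String path = toLocalPath("docs", "/cache/example.org/a/b.html", File.separatorChar);
        System.out.println(path);
        System.out.println(parentDir(path, File.separatorChar));
        // A caller would now run new File(parentDir(...)).mkdirs() before writing the file.
    }
}
```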

HandleProxy()

The handleProxy() routine is one of the two major modes of this server. The
basic idea is this: if you set your browser to use this server as a proxy server,
then the requests that will be sent to it will include the complete URL, where
normal GETs remove the “http://” and host name part. We simply pick apart
the complete URL, looking for that “://” sequence, the next slash (/), and
optionally another colon (:) for servers using nonstandard port numbers. Once
we have found these characters, we know the intended host and port number as
well as the URL we need to fetch from there. We can then attempt to load a
previously saved version of this document out of our RAM cache. If this fails,
we can attempt to load it from the file system into the RAM cache and reattempt
loading it from the cache. If this fails, then it gets interesting, because we must
read the document from the remote site.

To do this, we open a socket to the remote site and port. We send a GET
request asking for the URL that was passed to us. Whatever response header
we get back from the remote site, we send on to the client. If that code was
200, for successful file transfer, we also read the ensuing data stream into a
new UrlCacheEntry and write it to the client socket. After that we call
writeDiskCache() to save the results to the local disk. We log the transaction,
close the sockets, and return.
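
The URL-splitting step that precedes the socket connection can be sketched separately. The class ProxyUrlSketch below is hypothetical; unlike the listing, it treats a URL without any path (for example http://example.org) as a request for "/", which is an assumption added here to avoid a failure on such input.

```java
import java.util.Locale;

// Hypothetical sketch: split a full proxy URL into host, port, and path.
class ProxyUrlSketch {
    final String host;
    final int port;
    final String path;

    ProxyUrlSketch(String url) {
        int scheme = url.indexOf("://");
        if (scheme < 0) {
            throw new IllegalArgumentException("Not a full URL: " + url);
        }
        int start = scheme + 3;
        int slash = url.indexOf('/', start);
        String site = (slash < 0) ? url.substring(start) : url.substring(start, slash);
        String p = (slash < 0) ? "/" : url.substring(slash);
        int prt = 80; // default HTTP port
        int colon = site.indexOf(':');
        if (colon > 0) {
            prt = Integer.parseInt(site.substring(colon + 1));
            site = site.substring(0, colon);
        }
        host = site.toLowerCase(Locale.ROOT);
        port = prt;
        path = p;
    }

    public static void main(String[] args) {
        ProxyUrlSketch u = new ProxyUrlSketch("http://Example.org:8080/a/b.html");
        System.out.println(u.host + " " + u.port + " " + u.path);
    }
}
```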

HandleGet()

The handleGet() method is called when the http daemon is acting like a
normal web server. It has a local disk document root out of which it is serving
files. The parameters to handleGet() tell it where to write the results, the URL
to look up, and the MimeHeader from the requesting web browser. This MIME
header will include the User-Agent string and other useful attributes. First we
attempt to serve the URL out of the RAM cache. If this fails, we look in the
file system for the URL. If the file does not exist or is unreadable, we report an
error back to the web client. Otherwise we just use readFile() to get the
contents of the file and put them in the cache. Then writeUCE() is used to
send the contents of the file down the client socket.

DoRequest()
The doRequest() method is called once per connection to the server. It parses
the request string and incoming MIME header. It decides to call either
handleProxy() or handleGet(), based on whether there is a “://” in the request
string. If any method other than GET is used, such as HEAD or POST, this
routine returns a 405 error to the client. Note that the HTTP request is ignored if
stopFlag is true.
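
The dispatch decision can be summarized in a few lines. The sketch below only classifies a request line and does not serve anything; the class RequestDispatchSketch, its route() method, and the "bad-request" label for lines without two spaces are all hypothetical.

```java
// Hypothetical sketch: decide how a request line would be handled.
class RequestDispatchSketch {
    // Returns "proxy", "local", "405", or "bad-request".
    static String route(String requestLine) {
        int fsp = requestLine.indexOf(' ');
        int nsp = requestLine.indexOf(' ', fsp + 1);
        if (fsp < 0 || nsp < 0) {
            return "bad-request"; // assumption: the listing does not handle this case
        }
        String method = requestLine.substring(0, fsp);
        String url = requestLine.substring(fsp + 1, nsp);
        if (!method.equalsIgnoreCase("GET")) {
            return "405"; // Method Not Allowed
        }
        // A complete URL means the browser is using us as a proxy.
        return url.indexOf("://") >= 0 ? "proxy" : "local";
    }

    public static void main(String[] args) {
        System.out.println(route("GET http://example.org/ HTTP/1.0"));
        System.out.println(route("GET /index.html HTTP/1.0"));
        System.out.println(route("POST /form HTTP/1.0"));
    }
}
```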

Run()

The run() method is called when the server thread is started. It creates a new
ServerSocket on the given port, goes into an infinite loop calling accept() on
the ServerSocket, and passes the resulting Socket off to doRequest() for
inspection.

start() AND stop()

These are two methods used to start and stop the server process. These
methods set the value of stopFlag.

CODE

import java.net.*;
import java.io.*;
import java.text.*;
import java.util.*;

class httpd implements Runnable, LogMessage {
    private int port;
    private String docRoot;
    private LogMessage log;
    private Hashtable cache = new Hashtable();
    private boolean stopFlag;

    private static String version = "1.0";
    private static String mime_text_html = "text/html";
    private static String CRLF = "\r\n";
    private static String indexfile = "index.html";
    private static int buffer_size = 8192;

    static String mt[] = { // mapping from file ext to Mime-Type
        "txt", "text/plain",
        "html", mime_text_html,
        "htm", "text/html",
        "gif", "image/gif",
        "jpg", "image/jpeg",   // the registered type is image/jpeg
        "jpeg", "image/jpeg",
        "class", "application/octet-stream"
    };
    static String defaultExt = "txt";
    static Hashtable types = new Hashtable();
    static {
        for (int i = 0; i < mt.length; i += 2)
            types.put(mt[i], mt[i+1]);
    }

    static String fnameToMimeType(String filename) {
        if (filename.endsWith("/")) // special for index files.
            return mime_text_html;
        int dot = filename.lastIndexOf('.');
        String ext = (dot > 0) ? filename.substring(dot + 1) : defaultExt;
        String ret = (String) types.get(ext);
        return ret != null ? ret : (String) types.get(defaultExt);
    }

    // Usage statistics; not private so an external monitor can read them.
    int hits_served = 0;
    int bytes_served = 0;
    int files_in_cache = 0;
    int bytes_in_cache = 0;
    int hits_to_cache = 0;

    // Convert a String to bytes so it can be written to a stream.
    private final byte[] toBytes(String s) {
        byte b[] = s.getBytes();
        return b;
    }

    private MimeHeader makeMimeHeader(String type, int length) {
        MimeHeader mh = new MimeHeader();
        Date curDate = new Date();
        TimeZone gmtTz = TimeZone.getTimeZone("GMT");
        SimpleDateFormat sdf =
            new SimpleDateFormat("dd MMM yyyy HH:mm:ss zzz"); // HH = 24-hour clock
        sdf.setTimeZone(gmtTz);
        mh.put("Date", sdf.format(curDate));
        mh.put("Server", "JavaCompleteReference/" + version);
        mh.put("Content-Type", type);
        if (length >= 0)
            mh.put("Content-Length", String.valueOf(length));
        return mh;
    }

    private String error(int code, String msg, String url) {
        String html_page = "<body>" + CRLF +
            "<h1>" + code + " " + msg + "</h1>" + CRLF;
        if (url != null)
            html_page += "Error when fetching URL: " + url + CRLF;
        html_page += "</body>" + CRLF;
        MimeHeader mh = makeMimeHeader(mime_text_html,
            html_page.length());
        HttpResponse hr = new HttpResponse(code, msg, mh);
        logEntry("GET", url, code, 0);
        return hr + html_page;
    }

    // Read 'in' until you get two \n's in a row.
    // Return up to that point as a String.
    // Discard all \r's.
    private String getRawRequest(InputStream in)
            throws IOException {
        byte buf[] = new byte[buffer_size];
        int pos = 0;
        int c;
        while ((c = in.read()) != -1) {
            switch (c) {
            case '\r':
                break;
            case '\n':
                // pos > 0 guards against a newline arriving as the first byte
                if (pos > 0 && buf[pos-1] == c) {
                    return new String(buf, 0, pos);
                }
                // first newline of a pair: fall through and store it
            default:
                buf[pos++] = (byte) c;
            }
        }
        return null;
    }

    static String months[] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    private String host;

    // fmt02d is the same as C's printf("%02d", i)
    private final String fmt02d(int i) {
        if (i < 0) {
            i = -i;
            return ((i < 10) ? "-0" : "-") + i;
        } else {
            return ((i < 10) ? "0" : "") + i;
        }
    }

    private void logEntry(String cmd, String url, int code, int size) {
        Calendar calendar = Calendar.getInstance();
        int tzmin = calendar.get(Calendar.ZONE_OFFSET) / (60*1000);
        // Keep the sign separate so the offset prints as +hhmm or -hhmm.
        String tzsign = (tzmin < 0) ? "-" : "+";
        tzmin = Math.abs(tzmin);
        int tzhour = tzmin / 60;
        tzmin -= tzhour * 60;
        log.log(host + " - - [" +
            fmt02d(calendar.get(Calendar.DATE)) + "/" +
            months[calendar.get(Calendar.MONTH)] + "/" +
            calendar.get(Calendar.YEAR) + ":" +
            fmt02d(calendar.get(Calendar.HOUR_OF_DAY)) + ":" + // 24-hour clock
            fmt02d(calendar.get(Calendar.MINUTE)) + ":" +
            fmt02d(calendar.get(Calendar.SECOND)) + " " +
            tzsign + fmt02d(tzhour) + fmt02d(tzmin) +
            "] \"" +
            cmd + " " +
            url + " HTTP/1.0\" " +
            code + " " +
            size + "\n");
        hits_served++;
        bytes_served += size;
    }

    private void writeString(OutputStream out, String s)
            throws IOException {
        out.write(toBytes(s));
    }

    // Send a cached entry to the client: status line, header, then data.
    private void writeUCE(OutputStream out, UrlCacheEntry uce)
            throws IOException {
        HttpResponse hr = new HttpResponse(200, "OK", uce.mh);
        writeString(out, hr.toString());
        out.write(uce.data, 0, uce.length);
        logEntry("GET", uce.url, 200, uce.length);
    }

    private boolean serveFromCache(OutputStream out, String url)
            throws IOException {
        UrlCacheEntry uce;
        if ((uce = (UrlCacheEntry) cache.get(url)) != null) {
            writeUCE(out, uce);
            hits_to_cache++;
            return true;
        }
        return false;
    }

    // Read the whole stream into a new cache entry and register it.
    private UrlCacheEntry loadFile(InputStream in, String url,
                                   MimeHeader mh)
            throws IOException {
        UrlCacheEntry uce;
        byte file_buf[] = new byte[buffer_size];
        uce = new UrlCacheEntry(url, mh);
        int n;
        while ((n = in.read(file_buf)) >= 0) {
            uce.append(file_buf, n);
        }
        in.close();
        cache.put(url, uce);
        files_in_cache++;
        bytes_in_cache += uce.length;
        return uce;
    }

    private UrlCacheEntry readFile(File f, String url)
            throws IOException {
        if (!f.exists())
            return null;
        InputStream in = new FileInputStream(f);
        int file_length = (int) f.length(); // size of the file on disk
        String mime_type = fnameToMimeType(url);
        MimeHeader mh = makeMimeHeader(mime_type, file_length);
        UrlCacheEntry uce = loadFile(in, url, mh);
        return uce;
    }

    private void writeDiskCache(UrlCacheEntry uce)
            throws IOException {
        // String.replace() returns a new String, so keep the result.
        String path = (docRoot + uce.url).replace('/', File.separatorChar);
        String dir = path.substring(0, path.lastIndexOf(File.separatorChar));
        new File(dir).mkdirs();
        FileOutputStream out = new FileOutputStream(path);
        out.write(uce.data, 0, uce.length);
        out.close();
    }

    // A client asks us for a url that looks like this:
    // http://the.internet.site/the/url
    // we go get it from the site and return it...
    private void handleProxy(OutputStream out, String url,
                             MimeHeader inmh) {
        try {
            int start = url.indexOf("://") + 3;
            int path = url.indexOf('/', start);
            if (path < 0) { // no path given, as in http://the.internet.site
                url += "/";
                path = url.length() - 1;
            }
            String site = url.substring(start, path).toLowerCase();
            int port = 80;
            String server_url = url.substring(path);
            int colon = site.indexOf(':');
            if (colon > 0) {
                port = Integer.parseInt(site.substring(colon + 1));
                site = site.substring(0, colon);
            }
            url = "/cache/" + site + ((port != 80) ? (":" + port) : "") +
                server_url;
            if (url.endsWith("/"))
                url += indexfile;

            if (!serveFromCache(out, url)) {
                if (readFile(new File(docRoot + url), url) != null) {
                    serveFromCache(out, url);
                    return;
                }

                // If we haven't already cached this page, open a socket
                // to the site's port and send a GET command to it.
                // We modify the user-agent to add ourselves... "via".
                Socket server = new Socket(site, port);
                InputStream server_in = server.getInputStream();
                OutputStream server_out = server.getOutputStream();
                inmh.put("User-Agent", inmh.get("User-Agent") +
                    " via JavaCompleteReferenceProxy/" + version);
                String req = "GET " + server_url + " HTTP/1.0" + CRLF +
                    inmh + CRLF;
                writeString(server_out, req);
                String raw_request = getRawRequest(server_in);
                HttpResponse server_response =
                    new HttpResponse(raw_request);
                writeString(out, server_response.toString());
                if (server_response.statusCode == 200) {
                    UrlCacheEntry uce = loadFile(server_in, url,
                        server_response.mh);
                    out.write(uce.data, 0, uce.length);
                    writeDiskCache(uce);
                    logEntry("GET", site + server_url, 200, uce.length);
                }
                server_out.close();
                server.close();
            }
        } catch (IOException e) {
            log.log("Exception: " + e);
        }
    }

private void handleGet(OutputStream out, String url,
MimeHeader inmh) {
byte file_buf[] = new byte[buffer_size];
String filename = docRoot + url +
(url.endsWith("/") ? indexfile : "");
try {
if (!serveFromCache(out, url)) {
File f = new File(filename);
if (! f.exists()) {
writeString(out, error(404, "Not Found", filename));
return;
}
if (! f.canRead()) {
// An unreadable file is a permission problem: report 403, not 404.
writeString(out, error(403, "Permission Denied", filename));
return;
}
UrlCacheEntry uce = readFile(f, url);
writeUCE(out, uce);
}
} catch (IOException e) {
log.log("Exception: " + e);
}
}

private void doRequest(Socket s) throws IOException {
if(stopFlag)
return;
InputStream in = s.getInputStream();
OutputStream out = s.getOutputStream();

String request = getRawRequest(in);
int fsp = request.indexOf(' ');
int nsp = request.indexOf(' ', fsp+1);
int eol = request.indexOf('\n');
String method = request.substring(0, fsp);
String url = request.substring(fsp+1, nsp);
String raw_mime_header = request.substring(eol + 1);

MimeHeader inmh = new MimeHeader(raw_mime_header);

request = request.substring(0, eol);

if (method.equalsIgnoreCase("get")) {
if (url.indexOf("://") >= 0) {
handleProxy(out, url, inmh);
} else {
handleGet(out, url, inmh);
}
} else {
writeString(out, error(405, "Method Not Allowed", method));
}
in.close();
out.close();
}

public void run() {
try {
ServerSocket acceptSocket;
acceptSocket = new ServerSocket(port);
while (true) {
Socket s = acceptSocket.accept();
host = s.getInetAddress().getHostName();
doRequest(s);
s.close();
}
} catch (IOException e) {
log.log("accept loop IOException: " + e + "\n");
} catch (Exception e) {
log.log("Exception: " + e);
}
}

private Thread t;
public synchronized void start() {
stopFlag = false;
if (t == null) {
t = new Thread(this);
t.start();
}
}

public synchronized void stop() {
stopFlag = true;
log.log("Stopped at " + new Date() + "\n");
}

public httpd(int p, String dr, LogMessage lm) {
port = p;
docRoot = dr;
log = lm;
}

// This main and log method allow httpd to be run from the console.
public static void main(String args[]) {
httpd h = new httpd(80, "c:\\www", null);
h.log = h;
h.start();
try {
Thread.currentThread().join();
} catch (InterruptedException e) {};
}

public void log(String m) {
System.out.print(m);
}
}
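
The URL rewriting done at the start of handleProxy() above can be tried in isolation. The following is a minimal sketch and not part of the project code: the class name ProxyCachePath, the method toCachePath, and the index file name index.html are assumptions made for illustration. It follows the same steps as handleProxy(): strip the scheme, split host and port, lowercase the host, and build the path under /cache/.

```java
public class ProxyCachePath {
    // Hypothetical helper (not in the original httpd class).
    // Maps "http://site[:port]/path" to "/cache/site[:port]/path".
    static String toCachePath(String url, String indexFile) {
        int start = url.indexOf("://") + 3;
        int path = url.indexOf('/', start);
        // Assumption: a URL with no path (e.g. "http://host") means "/".
        String site = (path < 0 ? url.substring(start)
                                : url.substring(start, path)).toLowerCase();
        String serverUrl = (path < 0) ? "/" : url.substring(path);
        int port = 80;
        int colon = site.indexOf(':');
        if (colon > 0) {
            port = Integer.parseInt(site.substring(colon + 1));
            site = site.substring(0, colon);
        }
        // The default port 80 is left out of the cache path.
        String result = "/cache/" + site
                + ((port != 80) ? (":" + port) : "") + serverUrl;
        if (result.endsWith("/")) {
            result += indexFile; // directory request: serve the index file
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(
            toCachePath("http://the.internet.site/the/url", "index.html"));
        System.out.println(
            toCachePath("http://Example.com:8080/", "index.html"));
    }
}
```

Run from the console, this prints /cache/the.internet.site/the/url and /cache/example.com:8080/index.html, which are the cache keys the proxy would look up before contacting the remote site.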

main()

The main() method lets us run this application from the command line. It sets
the LogMessage parameter to the server itself, which provides a simple
console output implementation of log().

HTTP.java

HTTP.java is an additional applet class that gives the HTTP server a functional
“front panel”. This applet has two parameters that can be used to configure
the server: port and docroot. It is a very simple applet. It creates an instance
of httpd, passing itself in as the LogMessage interface. Then it creates a
panel that has a simple label at the top, a TextArea in the middle for
displaying the log messages, and a panel at the bottom that has two buttons
and another label in it. The start() and stop() methods of the applet call the
corresponding methods on the httpd. The buttons labeled “Start” and “Stop”
invoke these same methods. Any time a message is logged,
the bottom-right Label object is updated to contain the latest statistics from the
httpd.

CODE
import java.util.*;
import java.applet.*;
import java.awt.*;
import java.awt.event.*;

public class HTTP extends Applet implements LogMessage,
ActionListener
{
private int m_port = 80;
private String m_docroot = "c:\\www";
private httpd m_httpd;
private TextArea m_log;
private Label status;

private final String PARAM_port = "port";
private final String PARAM_docroot = "docroot";

public HTTP() {
}

public void init() {
setBackground(Color.white);
String param;

// port: Port number to listen on
param = getParameter(PARAM_port);
if (param != null)
m_port = Integer.parseInt(param);

// docroot: web document root
param = getParameter(PARAM_docroot);
if (param != null)
m_docroot = param;

setLayout(new BorderLayout());

Label lab = new Label("Java HTTPD");

lab.setFont(new Font("SansSerif", Font.BOLD, 18));


add("North", lab);
m_log = new TextArea("", 24, 80);
add("Center", m_log);
Panel p = new Panel();
p.setLayout(new FlowLayout(FlowLayout.CENTER,1,1));
add("South", p);
Button bstart = new Button("Start");
bstart.addActionListener(this);
p.add(bstart);
Button bstop = new Button("Stop");
bstop.addActionListener(this);
p.add(bstop);
status = new Label("raw");
status.setForeground(Color.green);
status.setFont(new Font("SansSerif", Font.BOLD, 10));
p.add(status);
m_httpd = new httpd(m_port, m_docroot, this);
}

public void destroy() {
stop();
}

public void paint(Graphics g) {
}

public void start() {
m_httpd.start();
status.setText("Running ");
clear_log("Log started on " + new Date() + "\n");
}

public void stop() {
m_httpd.stop();
status.setText("Stopped ");
}

public void actionPerformed(ActionEvent ae) {
String label = ae.getActionCommand();
if(label.equals("Start")) {
start();
}
else {
stop();
}
}

public void clear_log(String msg) {
m_log.setText(msg + "\n");
}

public void log(String msg) {
m_log.append(msg);
status.setText(m_httpd.hits_served + " hits (" +
(m_httpd.bytes_served / 1024) + "K), " +
m_httpd.files_in_cache + " cached files (" +
(m_httpd.bytes_in_cache / 1024) + "K), " +
m_httpd.hits_to_cache + " cached hits");
status.setSize(status.getPreferredSize());
}
}
SNAPSHOTS
Title Page (index.htm)
Main Page (main.htm)
Working Page (working.htm)
LIMITATIONS
Limitations of the project

Every project has some limitations, and ours too lacks some features.
These are a few areas in which this project, JAVA PROXY SERVER, needs
further improvement.
 There is no provision in the project to clear the cache automatically.
The cache has to be cleared manually from time to time by the server
administrator.
 Files that could not be saved properly in the cache can never be
displayed correctly when they are accessed through the cache. There
is no provision in the project to request such files again from the web
and save them in the cache, unless the cache is cleared manually.
 This project is not as efficient as other proxy servers available in the
market, as the client has to spend more time accessing web pages
through this Proxy Server.
 Another limitation of this project is that it can handle requests from only
one client at a time. Other clients wishing to set up a connection with the
server have to wait for the current client to complete its requests and
close its connection with the server.
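
As an illustration of how the first limitation might be addressed, the following is a minimal sketch of a size-bounded cache that discards the least recently used page automatically. It is not part of the project: the class name BoundedPageCache and its methods are assumptions, and it only covers pages held in memory. Applying the same idea to the files saved under the document root would also require deleting the evicted file from disk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedPageCache {
    private final int maxEntries;
    private final LinkedHashMap<String, byte[]> pages;

    public BoundedPageCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder = true orders entries from least to most recently used.
        this.pages = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                // Called after every put: drop the eldest entry when over the limit.
                return size() > BoundedPageCache.this.maxEntries;
            }
        };
    }

    // Methods are synchronized so the cache can be shared between threads.
    public synchronized void put(String url, byte[] data) {
        pages.put(url, data);
    }

    public synchronized byte[] get(String url) {
        return pages.get(url);
    }

    public synchronized int size() {
        return pages.size();
    }

    public static void main(String[] args) {
        BoundedPageCache cache = new BoundedPageCache(2);
        cache.put("/a", new byte[] {1});
        cache.put("/b", new byte[] {2});
        cache.get("/a");                  // "/a" becomes the most recently used
        cache.put("/c", new byte[] {3});  // over the limit: "/b" is evicted
        System.out.println(cache.get("/b") == null); // true
        System.out.println(cache.size());            // 2
    }
}
```

The limit of two entries is chosen only to keep the demonstration short; a real server would pick a limit based on available memory or disk space.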
BIBLIOGRAPHY
Bibliography

For the development of this project we took help from a lot of sources,
some of which are listed here.

 THE COMPLETE REFERENCE – JAVA2
 JAVA PROGRAMMING – NIIT
 www.osborne.com
 www.windowsitpro.com
 www.ncl.ac.uk
 www.webopedia.com
