REPORT
ON
A DISSERTATION
REPORT SUBMITTED TO VCE, Rohtak.
INTRODUCTION
Objective of the System
BACKGROUND
What is Internet
Web based Technology
PLATFORM USED
SYSTEM ANALYSIS
Identification of the Need
Preliminary Investigation
Information Gathering
Feasibility Study
Technical Feasibility
Economic Feasibility
Operational Feasibility
Cost/Benefit Analysis
SYSTEM DESIGN
Table of Contents
INTRODUCTION TO JAVA
Socket Programming
IMPLEMENTATION DETAILS
A caching http proxy server
SNAPSHOTS
LIMITATIONS
BIBLIOGRAPHY
INTRODUCTION
OBJECTIVE
A server that sits between a client application, such as a Web browser, and a
real server. It intercepts all requests to the real server to see if it can fulfill the
requests itself. If not, it forwards the request to the real server.
performance for groups of users. This is because it saves the results of all
Proxy servers can also be used to filter requests. For example, a company
might use a proxy server to prevent its employees from accessing a specific
set of Web sites.
expressed by the hit rate. A cache with several Gb size and a lot of users can
the help pages of your browser might be almost every time in the cache. In
case that the page is not in the local cache you shouldn't see any difference in
Some time in the mid-1960s, during the Cold War, it became apparent
that there was a need for a bombproof communications system. A concept
was devised to link computers together throughout the country. With such a
system in place, large sections of the country could be destroyed in a nuclear
attack and messages could still get through. In the beginning, only government
"think tanks" and a few universities were linked.
Years later, businesses began using the Internet and the administrative
responsibilities were once again transferred.
At this time no one party "operates" the Internet; there are several
entities that "oversee" the system and the protocols that are involved.
Now the Internet is a huge collection of computer networks that can
communicate with each other - a network of networks that connects worldwide
through satellite links.
The speed of the Internet has changed the way people receive
information. It combines the immediacy of broadcast with the in-depth coverage of
newspapers, making it a perfect source for news and weather
information.
Internet usage is at an all-time high. Almost 100 million U.S. adults are
now going online every month, according to New York-based Mediamark
Research. That is half of American adults and a 30 percent increase over 2000
in the number who surf the Web. There also appears to be a continuing
gender shift in the number of American adults going online. In early 2000,
Mediamark reported the milestone that women for the first time
accounted for half of the online adult population. Now 51 percent of U.S.
adult Web surfers - some 50.6 million - are women.
The Web (also known as WWW or the World Wide Web) was invented
around 1990 by Tim Berners-Lee while working at CERN, the European laboratory
for particle physics in Geneva, Switzerland.
It has grown very rapidly. Four years ago only around 1250 Web
servers were online. Today there are over 1,000,000 Web servers. The idea
behind the development of the Web was to provide easy access to information
and to provide the capability to move freely on the Internet.
Web sites have changed the strategy of companies and of the market too. The Web
has numerous applications, such as advertising/publishing, e-commerce and collaborative
computing, which make it possible to reach people all over the world.
Domain Name System (DNS): -
would set the 'Nose-Color' option to 'red'. Common headers are described in
Table on the next page.
Table: Common HTTP headers.
Header Description
Many Internet users are familiar with the even higher layer
application protocols that use TCP/IP to get to the Internet. These include the
World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer
Protocol (FTP), Telnet, which lets you log on to remote computers, and
the Simple Mail Transfer Protocol (SMTP). These and other protocols are often
packaged together with TCP/IP as a "suite". Personal computer users usually
get to the Internet through the Serial Line Internet Protocol (SLIP) or the
Point-to-Point Protocol (PPP). These protocols encapsulate the IP packets so
that they can be sent over a dial-up phone connection to an access provider's
modem. Protocols related to TCP/IP include the User Datagram Protocol
(UDP), which is used instead of TCP for special purposes. Other protocols are
used by network host computers for exchanging router information. These
include the Internet Control Message Protocol (ICMP), the Interior Gateway
Protocol (IGP), the Exterior Gateway Protocol (EGP), and the Border
Gateway Protocol (BGP).
Platform Used
JAVA
Why JAVA?
Java is based on object-oriented principles, it is secure and robust, and
programs written in Java are easily portable. These are a few of the reasons why we
opted for Java.
Besides this hardware, the software required by the Java Proxy Server is:
Java Development Kit: The Java Development Kit (1.2 or above) by
Sun Microsystems is required to run the Java Proxy Server.
JCreator: An interactive IDE for developing Java applications, used to support
the easy development of this project.
Client Side Requirements
The hardware requirements for the client accessing the web pages through
this application are:
P-III 233 MHz Processor (recommended): A P-III 233 MHz processor is
recommended because its higher processing power lets the application
run faster.
True Color Display Monitor, 32 bit (800 x 600): This resolution is
required because the application involves a lot of graphics and pictures. The
application can be best viewed using this resolution.
64 MB RAM (at least): The speed of the computer increases with
the amount of RAM, so it should be as high as possible.
Besides these hardware requirements, the software required for the client
side are:
The client process is started, either on the same system or on another system
that is connected to the server system with a network. Client processes are
often initiated by an interactive user entering a command to a time-sharing
system. The client process sends a request across the network to the server
requesting service of some form. Some examples of the types of service that a
server can provide:
When the server process has finished providing its services to the client, the
server goes back to sleep, waiting for the next client request to arrive.
1. Whenever the server can handle a client’s request in a known, short amount
of time, the server process handles the request itself. We call these iterative
servers.
2. When the amount of time to service a request depends on the request itself,
the server typically handles it in a concurrent fashion. These are called
concurrent servers.
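The difference between the two kinds of server can be sketched in Java. This is only an illustration under stated assumptions: the class name ServerStyles, the helper handle, and the trivial "reply in upper case" service are hypothetical and are not taken from the project. The sketch assumes Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Locale;

// Hypothetical example class; the names are not taken from the project.
class ServerStyles {

    // The "service": read one line from the client and reply in upper case.
    static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line = in.readLine();
            if (line != null) {
                out.println(line.toUpperCase(Locale.ROOT));
            }
        } catch (IOException e) {
            System.err.println("Client error: " + e.getMessage());
        }
    }

    // Iterative server: the accepting thread serves each client itself,
    // so the next client waits until the current one is finished.
    static void serveIteratively(ServerSocket server, int clients) throws IOException {
        for (int i = 0; i < clients; i++) {
            handle(server.accept());
        }
    }

    // Concurrent server: each accepted client is handed to its own thread,
    // so a slow request does not hold up the others.
    static void serveConcurrently(ServerSocket server, int clients) throws IOException {
        for (int i = 0; i < clients; i++) {
            Socket client = server.accept();
            new Thread(() -> handle(client)).start();
        }
    }
}
```

The only difference between the two loops is who runs handle: the accepting thread itself, or a new thread per client.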
SYSTEM
ANALYSIS
SYSTEM ANALYSIS
System analysis refers to the process of examining a situation with the
intent of improving it through better procedures and methods. System design
is the process of planning a new system to either replace or complement an
existing system. But before any planning is done, the system must be
thoroughly understood and the requirements determined. System analysis is,
therefore, the process of gathering and interpreting facts, diagnosing
problems and using the information to recommend improvements in the
system. In other words, system analysis means a detailed explanation or
description. Before computerizing a system under consideration, it has to be
analyzed. We need to study how it functions currently, what the problems are,
and what requirements the proposed system should meet.
INFORMATION GATHERING
Acceptance Criteria:
The following acceptance criteria were established for evaluation of the new
System:
1. The system should be accurate and hence reliable.
2. The software should provide all the functions. Further, the execution time
should be very low and the response should be good.
3. The system should have scope for foreseeable modifications and enhancements.
4. The system must satisfy the standards of good software.
User Friendliness: The system should satisfy the user's needs. It should be
easy to learn and operate.
Modularity: The system should have relatively independent and single
function parts that can be put together to make complete system.
Maintainability: The developed system should be such that the time and
effort for program maintenance, enhancement are reduced.
Timeliness: The system should operate well under normal, peak and
recovery conditions.
When I interviewed people about the project, they provided
useful information about the existing system, how they work, what types of
problems they are facing, and what their requirements are.
Exception Handling:
To ensure that the system does not halt in case of undesired situations
or events, the following exception conditions were taken care of by providing
the corresponding exception responses while developing the system.
While selecting an alternative from the menu, the user enters his/her
choice. The user proceeds only if the selected choice is valid.
While filling in a screen, if the user tries to skip a field that cannot
have a null value, an appropriate message is displayed, informing the
user that data has to be entered into that field.
Once a value has been entered into a field, the cursor moves to the
next field. If a user enters a date in an invalid format, the system displays a
message showing the valid format that should be entered.
Security: The system protects information by requiring a
password for access to the database. Therefore, only an authorized user can
access that database.
Flexibility: The system is such that likely changes/modifications can be
easily incorporated.
Feasibility Study
Technical feasibility
A study of function, performance and constraints that may affect the ability to
achieve an acceptable system.
Economic Feasibility
An evaluation of development cost weighed against the ultimate income or
benefit derived from the developed system.
* Legal feasibility: A determination of any infringement/violation/liability that
could result from the development of system.
* Alternatives: An evaluation of alternative approaches to development of
system.
Economic Analysis:
Among the most important information contained in a feasibility study is
cost-benefit analysis, an assessment of the economic justification of a
computer-based system project. Cost-benefit analysis delineates costs for
project development and weighs them against the tangible and intangible
benefits of a system. Cost-benefit analysis is complicated by criteria that vary
with the characteristics of the system to be developed, the relative size of the
project, and the expected return on investment desired as part of the
company's strategic plan. In addition, many benefits derived from
computer-based systems are intangible. Direct quantitative comparisons may be difficult
to achieve.
Technical Analysis:
During technical analysis, the analyst evaluates the technical merits of the
system concept, while at the same time collecting additional information about
performance, reliability, maintainability and predictability. Technical analysis
begins with an assessment of the technical viability of the proposed system.
* What technologies are required to accomplish system function and
performance?
* What new materials, methods, algorithms or processes are required and
what is their development risk?
* How will these technology issues affect the cost?
* The results obtained from the technical analysis form the basis for another
go/no-go decision on the system. If the technical risk is severe, or if models
indicate that the desired function cannot be achieved, it is back to the drawing
board!
SYSTEM
DESIGN
DESIGN PHASE
The design phase of software development deals with transforming the customer
requirements as described in the SRS document into a form implementable
using a programming language. In order to be easily implementable in a
conventional programming language, the following items must be designed
during the design phase.
♦ Different modules required to implement the design solution.
♦ Control relationship among the identified modules, i.e. the call relationship
(also known as the invocation relationship) among modules.
♦ Interface among different modules, i.e. details of the data items exchanged
among different modules.
♦ Data structures of the individual modules.
♦ Algorithms required to implement the individual modules.
Thus the goal of the design phase is to take the SRS document as the input
and to produce the above-mentioned items at the completion stage of the
design phase. A good software design is seldom arrived at through a single-step
procedure but goes through a series of steps. However, we can broadly
classify various design activities into two important parts:
♦ Preliminary (or high-level) design.
♦ Detailed design
This part of the report contains the design of the project in draft
form. In the design phase, the whole system is laid out in a rough
plan so that we can follow the steps and make changes where needed.
First of all, the design of the database is made so that every process
can be thought of in terms of input and output. The output of
one module can be entered into the next module as the input.
System Flow Designing
This describes how data flows through the whole system: how we manipulate the
data from the database, how we communicate after manipulating it, and where
that data goes so that we can communicate with the user of our site.
DESIGN OBJECTIVES
The design of a system is correct if a system built precisely according
to the design satisfies the requirements of that system. Clearly, the goal
during the design phase is to produce correct designs. There can be many
correct designs possible. The goal of the design process is not simply to
produce a design for the system. Instead the goal is to find the best possible
design, within the limitations imposed by the requirements.
JAVA
Java’s Lineage
Java is related to C++, which is a direct descendent of C. Much of the
character of Java is inherited from these two languages. From C, Java derives
its syntax. Many of Java’s object oriented features were influenced by C++. In
fact, several of Java’s defining characteristics come from—or are responses
to—its predecessors. Moreover, the creation of Java was deeply rooted in the
process of refinement and adaptation that has been occurring in computer
programming languages for the past several decades. For these reasons, this
section reviews the sequence of events and forces that led up to Java. As you
will see, each innovation in language design was driven by the need to solve
a fundamental problem that the preceding languages could not solve. Java is
no exception.
However, with the emergence of the World Wide Web, Java was propelled to
the forefront of computer language design, because the Web, too, demanded
portable programs. Most programmers learn early in their careers that
portable programs are as elusive as they are desirable. While the quest for a
way to create efficient, portable (platform-independent).
Java Applets
An applet is a special kind of Java program that is designed to be transmitted
over the Internet and automatically executed by a Java-compatible web
browser. Furthermore, an applet is downloaded on demand, just like an
image, sound file, or video clip. The important difference is that an applet is an
intelligent program, not just an animation or media file. In other words, an
applet is a program that can react to user input and dynamically change—not
just run the same animation or sound over and over. As exciting as applets are, they
would be nothing more than wishful thinking if Java were not able to address
the two fundamental problems associated with them: security and portability.
Before continuing, let’s define what these two terms mean relative to the
Internet.
Security
As you are likely aware, every time that you download a “normal” program,
you are risking a viral infection. Prior to Java, most users did not download
executable programs frequently, and those who did scanned them for viruses
prior to execution. Even so, most users still worried about the possibility of
infecting their systems with a virus. In addition to viruses, another type of
malicious program exists that must be guarded against. This type of program
can gather private information, such as credit card numbers, bank account
balances, and passwords, by searching the contents of your computer’s local
file system.
Portability
As discussed earlier, many types of computers and operating systems are in
use throughout the world—and many are connected to the Internet. For
programs to be dynamically downloaded to all the various types of platforms
connected to the Internet, some means of generating portable executable
code is needed. As you will soon see, the same mechanism that helps ensure
security also helps create portability. Indeed, Java’s solution to these two
problems is both elegant and efficient.
• Simple
• Secure
• Portable
• Object-oriented
• Robust
• Multithreaded
• Architecture-neutral
• Interpreted
• High performance
• Distributed
• Dynamic
Simple
Java was designed to be easy for the professional programmer to learn and
use effectively. Assuming that you have some programming experience, you
will not find Java hard to master. If you already understand the basic concepts
of object-oriented programming, learning Java will be even easier. Best of all,
if you are an experienced C++ programmer, moving to Java will require very
little effort. Because Java inherits the C/C++ syntax and many of the object-
oriented features of C++, most programmers have little trouble learning Java.
Secure
Security is an important concern as Java is meant to be used in
networked environments. Java implements several security mechanisms to
protect against code that might create a virus or invade the file system. All
these security mechanisms are based on the premise that nothing is to be
trusted. Java's memory allocation model and the removal of pointers are a step
towards security. The Java compiler does not make the memory layout decisions,
so a programmer cannot guess the actual memory layout of a class by looking
at the declarations. Java anticipates and defends against most of the
techniques that have historically been used to trick software into misbehaving.
Portable
Being architecture neutral is one big part of being portable. But Java provides
further portability by making sure that there are no implementation-dependent
aspects of the language specification. For example, Java explicitly defines the size
of each of the primitive data types as well as their arithmetic behavior.
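The fixed sizes and arithmetic rules can be checked directly. The following small sketch uses only constants from the standard wrapper classes; the class name PrimitiveSizes and its method names are chosen for illustration.

```java
// Hypothetical example class for illustration.
class PrimitiveSizes {

    // Bit widths of byte, short, int and long, as fixed by the language.
    static int[] integerWidths() {
        return new int[] { Byte.SIZE, Short.SIZE, Integer.SIZE, Long.SIZE };
    }

    // int arithmetic is defined to wrap around on overflow (two's complement).
    static boolean intOverflowWraps() {
        int max = Integer.MAX_VALUE;
        return max + 1 == Integer.MIN_VALUE;
    }

    // Integer division is defined to truncate toward zero.
    static int divideTowardZero() {
        return -7 / 2;
    }

    public static void main(String[] args) {
        for (int width : integerWidths()) {
            System.out.println(width + " bits");
        }
        System.out.println("overflow wraps: " + intOverflowWraps());
        System.out.println("-7 / 2 = " + divideTowardZero());
    }
}
```

These values are the same on every platform that runs Java, which is the point the paragraph above makes.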
Object-Oriented
Although influenced by its predecessors, Java was not designed to be source-
code compatible with any other language. This allowed the Java team the
freedom to design with a blank slate. One outcome of this was a clean,
usable, pragmatic approach to objects. Borrowing liberally from many seminal
object-software environments of the last few decades, Java manages to strike
a balance between the purist’s “everything is an object” paradigm and the
pragmatist’s “stay out of my way” model. The object model in Java is simple
and easy to extend, while primitive types, such as integers, are kept as high-
performance nonobjects.
Robust
The multi-platformed environment of the Web places extraordinary demands
on a program, because the program must execute reliably in a variety of
systems. Thus, the ability to create robust programs was given a high priority
in the design of Java. To gain reliability, Java restricts you in a few key areas,
to force you to find your mistakes early in program development. At the same
time, Java frees you from having to worry about many of the most common
causes of programming errors.
Multithreaded
Java was designed to meet the real-world requirement of creating interactive,
networked programs. To accomplish this, Java supports multithreaded
programming, which allows you to write programs that do many things
simultaneously. The Java run-time system comes with an elegant yet
sophisticated solution for multi-process synchronization that enables you to
construct smoothly running interactive systems. Java’s easy-to-use approach
to multithreading allows you to think about the specific behavior of your
program, not the multitasking subsystem.
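As a minimal sketch of what this looks like in practice, the example below starts several threads that update one shared counter. The class name SharedCounter and the method runDemo are assumptions made for the illustration, and Java 8 or later is assumed. The synchronized keyword is the language's built-in way of letting only one thread at a time run a method on a given object.

```java
// Hypothetical example class for illustration.
class SharedCounter {
    private int count = 0;

    // synchronized: only one thread at a time may run this on a given object.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts `threads` threads that each increment one counter `perThread` times,
    // waits for all of them, and returns the final count.
    static int runDemo(int threads, int perThread) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter.get();
    }
}
```

Without synchronized on increment, two threads could read the same old value and one update would be lost, so the final count could come out lower than threads times perThread.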
Architecture-Neutral
A central issue for the Java designers was that of code longevity and
portability. One of the main problems facing programmers is that no
guarantee exists that if you write a program today, it will run tomorrow—even
on the same machine. Operating system upgrades, processor upgrades, and
changes in core system resources can all combine to make a program
malfunction. The Java designers made several hard decisions in the Java
language and the Java Virtual Machine in an attempt to alter this situation.
Their goal was “write once; run anywhere, any time, forever.” To a great
extent, this goal was accomplished.
Interpreted and High Performance
As described earlier, Java enables the creation of cross-platform programs by
compiling into an intermediate representation called Java bytecode. This code
can be executed on any system that implements the Java Virtual Machine.
Most previous attempts at cross-platform solutions have done so at the
expense of performance. As explained earlier, the Java bytecode was
carefully designed so that it would be easy to translate directly into native
machine code for very high performance by using a just-in-time compiler.
Java run-time systems that provide this feature lose none of the benefits of
the platform-independent code.
Distributed
Java is designed for the distributed environment of the Internet, because it
handles TCP/IP protocols. In fact, accessing a resource using a URL is not
much different from accessing a file. Java also supports Remote Method
Invocation (RMI). This feature enables a program to invoke methods across a
network.
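The similarity between reading a file and reading a URL can be sketched as follows. The class and method names are hypothetical. Both paths end in an ordinary InputStream, so the same reading code serves both.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class for illustration.
class UrlVersusFile {

    // Reads a whole stream as UTF-8 text; works for any InputStream.
    static String readAll(InputStream in) throws IOException {
        try (InputStream s = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = s.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Accessing a local file.
    static String viaFile(Path path) throws IOException {
        return readAll(Files.newInputStream(path));
    }

    // Accessing a resource by URL (http:, file:, and so on).
    static String viaUrl(URL url) throws IOException {
        return readAll(url.openStream());
    }
}
```

For example, viaUrl(path.toUri().toURL()) returns the same text as viaFile(path), and the same viaUrl call works with an http: URL when a network is available.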
Dynamic
Java programs carry with them substantial amounts of run-time type
information that is used to verify and resolve accesses to objects at run time.
This makes it possible to dynamically link code in a safe and expedient
manner. This is crucial to the robustness of the applet environment, in which
small fragments of bytecode may be dynamically updated on a running
system.
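A small sketch of this run-time type information in use: the class to load is named only by a string, is resolved while the program is running, and is then checked with instanceof. The class name DynamicLoading and the method create are assumptions made for the example.

```java
import java.util.List;

// Hypothetical example class for illustration.
class DynamicLoading {

    // Loads a class by name at run time and creates an instance of it
    // using its public no-argument constructor.
    static Object create(String className) throws ReflectiveOperationException {
        Class<?> type = Class.forName(className);
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Object o = create("java.util.ArrayList");
        // The run-time type is carried with the object and can be inspected.
        System.out.println(o.getClass().getName());
        System.out.println(o instanceof List);
    }
}
```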
Socket programming
The communication that occurs between the client and the server must be
reliable. The data must not be lost and must be available in the same
sequence in which the server sent it.
Transmission Control Protocol(TCP) provides a reliable, point-to-point
communication channel. To communicate over TCP, client and server
programs establish a connection and bind a socket. Sockets are used to
handle communication links between applications over the network. Further
communication between the client and the server is through the socket.
Java was designed as a networking language. It makes network programming
easier by encapsulating connection functionality in the socket classes, that is,
the Socket class to create a client socket, and the ServerSocket class to
create a server socket.
• Socket is the basic class, which supports the TCP protocol. TCP is a
reliable stream network connection protocol. The Socket class provides
methods for Stream I/O, which makes reading from and writing to a
socket easy. This class is indispensable to the programs written to
communicate on the Internet.
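// Requires: import java.io.IOException; import java.net.Socket;
Socket socketConnection = null;
try
{
    // Connect to the host on port 1001.
    socketConnection = new Socket("www.vcerohtak.com", 1001);
}
catch (IOException e)
{
    // Connection failed; report the problem rather than ignoring it.
    System.err.println("Could not connect: " + e.getMessage());
}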
The constructor for the Socket class requires a host to connect to, in this case
www.vcerohtak.com, and the port of a server, here 1001. If the server is up and
running, the code creates a new Socket instance and continues running. If the
code encounters a problem while connecting, it throws an exception.
To disconnect from the server, use the close() method.
socketConnection.close();
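public Server()
{
    try
    {
        // Listen for incoming client connections on port 1001.
        serverSocket = new ServerSocket(1001);
    }
    catch (IOException e)
    {
        fail(e, "Could not start server");
    }
    System.out.println("Server started");
    // Server is assumed to extend Thread; start() runs its accept loop.
    this.start();
}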
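To see the Socket and ServerSocket classes working together, the following self-contained sketch runs a tiny server and a client in one program. It is an illustration only: the class name EchoOnce, the "echo: " reply format, and the use of port 0 (which asks the operating system for any free port) are assumptions made for the example and are not part of the project. Java 8 or later is assumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical example class for illustration.
class EchoOnce {

    // Server side: accept a single connection and echo one line back.
    static void serveOne(ServerSocket serverSocket) {
        try (Socket client = serverSocket.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            System.err.println("Server error: " + e.getMessage());
        }
    }

    // Client side: connect, send one line, and return the server's reply.
    static String ask(String host, int port, String message) throws IOException {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
            out.println(message);
            return in.readLine();
        }
    }

    public static void main(String[] args) throws Exception {
        // Port 0 lets the operating system choose a free port.
        try (ServerSocket serverSocket = new ServerSocket(0)) {
            Thread server = new Thread(() -> serveOne(serverSocket));
            server.start();
            System.out.println(ask("localhost", serverSocket.getLocalPort(), "hello"));
            server.join();
        }
    }
}
```

The try-with-resources blocks close the sockets and streams automatically, which replaces the explicit close() call shown earlier.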
Introduction
To
Proxy Server
Definition
A server that sits between a client application, such as a Web browser, and a
real server. It intercepts all requests to the real server to see if it can fulfill the
requests itself. If not, it forwards the request to the real server.
• Filter Requests: Proxy servers can also be used to filter requests. For
example, a company might use a proxy server to prevent its
employees from accessing a specific set of Web sites.
A proxy server receives a request for an Internet service (such as a Web page
request) from a user. If it passes filtering requirements, the proxy server,
assuming it is also a cache server, looks in its local cache of previously
downloaded Web pages. If it finds the page, it returns it to the user without
needing to forward the request to the Internet. If the page is not in the cache,
the proxy server, acting as a client on behalf of the user, uses one of its own
IP addresses to request the page from the server out on the Internet. When
the page is returned, the proxy server relates it to the original request and
forwards it on to the user.
To the user, the proxy server is invisible; all Internet requests and returned
responses appear to be directly with the addressed Internet server. (The
proxy is not quite invisible; its IP address has to be specified as a
configuration option to the browser or other protocol program.)
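The request flow described above (filter, then cache lookup, then forwarding to the real server) can be sketched as follows. This is not the project's implementation: the class name ProxyFlow, the "403 Forbidden" reply, and the use of a Function to stand in for the real server are all assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of the request flow; not the project's implementation.
class ProxyFlow {
    private final Map<String, String> cache = new HashMap<>();
    private final Predicate<String> allowed;        // filtering rule (assumed)
    private final Function<String, String> origin;  // stands in for the real server
    int originFetches = 0;                          // how many requests were forwarded

    ProxyFlow(Predicate<String> allowed, Function<String, String> origin) {
        this.allowed = allowed;
        this.origin = origin;
    }

    String handle(String url) {
        // 1. Apply the filtering requirements.
        if (!allowed.test(url)) {
            return "403 Forbidden";
        }
        // 2. Serve the page from the local cache if it is there.
        String page = cache.get(url);
        if (page != null) {
            return page;
        }
        // 3. Otherwise act as a client on the user's behalf and fetch the page.
        originFetches++;
        page = origin.apply(url);
        // 4. Keep a copy for later requests and pass the page back to the user.
        cache.put(url, page);
        return page;
    }
}
```

Requesting the same allowed URL twice calls the origin function only once; the second reply comes from the cache.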
• An advantage of using a proxy server is that its cache can serve all
users. If one or more Internet sites are frequently requested, these are
likely to be in the proxy's cache, which will improve user response time.
In fact, there are special servers called cache servers.
• The functions of proxy, firewall, and caching can be in separate server
programs or combined in a single package. Different server programs
can be in different computers. For example, a proxy server may be on the
same machine as a firewall server or it may be on a separate server
and forward requests through the firewall.
• There are different types of proxy servers with different features; some
are anonymous proxies, which are used to hide your real IP address
and some are used to filter sites, which contain material that may be
unsuitable for people to view.
• When you connect to a web site, your true IP address will not be
shown, but the proxy server's IP will. This does not mean that you're
completely anonymous. The proxy server will have logs of the IP addresses that
used the proxy server and the times.
You can use a proxy server if you have a child and wish to restrict the sites
they are viewing. You will need to make sure you get the correct type of proxy
because not all proxies filter sites. You can also use a proxy to protect yourself: it can
hide your IP address, which is useful because it means hackers cannot get
information about you while you are using it. They will only get the proxy server's IP address.
Using an existing proxy server is not hard; no extra hardware or software is needed, you just
need to configure your browser to connect through it.
Some ISPs (Internet Service Providers) make all their users use a proxy
server; for example in the United Arab Emirates, the main ISP makes all users
use a proxy server which blocks sites with unsuitable material. It does this
using the meta tags in the HTML code used to make the web page. Some
ISPs may give you a choice so you can use one or not. If you want to use a
proxy server, there are many around with different functions; you just have to
get the one that suits your needs best.
Uses in Depth
Proxy servers were developed to filter requests going to and coming from the
Internet. As the Internet became an essential part of many companies, it also
became the easiest way to attack companies. So it became necessary to
have a secure connection to the Internet from a private network without
compromising any confidential data. Since proxy servers filter all requests,
there are no unauthorized requests being transferred between the Internet
and the LAN. Proxy servers filter and control access in a couple of different
ways. They are able to filter requests by the IP address of the computer that a request
came from, as well as by controlling the access of the user that made the
request. User authentication is available on most proxy servers, and is
usually integrated with the authentication that takes place to connect to the
LAN. However, users can usually still connect to the proxy server using their
LAN credentials, even if they are not logged in to the LAN. Since there is user
authentication, the proxy server can keep a log of all the requests each user
makes. Another advantage of having user authentication integrated with the
LAN is that policies and groups can be setup to only allow certain users
access to certain sites. This is a big advantage for companies because they
are able to restrict what their employees have access to. By filtering the
request by the IP address of the computer that sent it as well as where it is
going, the proxy server can determine if the request is legitimate. An inbound
message will not be forwarded to a computer unless that computer has
requested it. Another feature of proxy servers that filters requests is the
access control list. Proxy servers use an access control list to filter out
unacceptable requests. This list contains the addresses of computers or sites
that are not to be accessed by anyone behind the firewall. These can be sites
with inappropriate content, or frequently used sites that serve no business
function such as eBay. The proxy server can also search through a request
or site for inappropriate words. Maintaining these lists is the most difficult part
of operating proxy servers. There are too many sites out there to block all
that are unnecessary. And there are thousands of new sites every day. In
response to this, there are some vendors that offer a subscription service that
gives you updated access control lists. This makes the administering of the
proxy server much easier, but it does cost more money. In order to control
access and filter websites, companies must have clear Internet usage policies
in place. They cannot block employees from viewing things without having
documented rules to back it up. Where to draw the line for what
employees should have access to is a very touchy subject.
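The three filtering checks described in this section (by client IP address, by destination, and by inappropriate words) can be sketched as a single access control list class. The class name AccessControlList, the method permits, and the idea of passing host and path separately are assumptions made for the illustration; real proxy products organise their lists differently.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of an access control list; not a real product's API.
class AccessControlList {
    private final Set<String> allowedClients; // IP addresses allowed to use the proxy
    private final Set<String> blockedHosts;   // destinations nobody may reach (lower case)
    private final Set<String> blockedWords;   // words that make a request unacceptable (lower case)

    AccessControlList(Set<String> allowedClients, Set<String> blockedHosts, Set<String> blockedWords) {
        this.allowedClients = allowedClients;
        this.blockedHosts = blockedHosts;
        this.blockedWords = blockedWords;
    }

    // Returns true only if the request passes all three checks.
    boolean permits(String clientIp, String host, String path) {
        // Check 1: where the request came from.
        if (!allowedClients.contains(clientIp)) {
            return false;
        }
        // Check 2: where the request is going.
        if (blockedHosts.contains(host.toLowerCase(Locale.ROOT))) {
            return false;
        }
        // Check 3: search the request itself for blocked words.
        String lowerPath = path.toLowerCase(Locale.ROOT);
        for (String word : blockedWords) {
            if (lowerPath.contains(word)) {
                return false;
            }
        }
        return true;
    }
}
```

The sets are plain in-memory collections here; keeping them up to date is the maintenance burden the text describes.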
Improving Performance
Proxy servers are able to improve the performance and efficiency of a
network by caching websites. By caching websites, proxy servers are storing
them locally on the server’s hard drive. When caching is enabled, proxy
servers cache sites that are requested frequently such as Yahoo. When a
user requests Yahoo, the proxy server checks the Internet to see if there is a
more recent version. If there is, it will place it in the cache and forward it to
the user. If there is not and the version on the proxy server is current, it will
forward that one to the user. This means the server does not have to
download any new content. Another way to configure caching is to only
update the cache periodically. This improves performance even more
because the proxy server would not have to connect to the Internet if the site
was in the cache. However, it means that the user may not be getting the
most up-to-date version of the page they requested. By caching this way, the
administrator must determine which sites should be cached and how often the
cache should be updated. This is a very difficult task to figure out. Here are
some overall advantages and disadvantages of caching:
Advantages
· Improved user response time
· No need to cache on local user machines
Disadvantages
· Requires more disk space
· Difficult to know when to update or delete cache
· Possibility of providing users with non-current sites and information
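The periodic-update strategy described above can be sketched as a cache whose
entries expire after a fixed interval. This is only an illustration and is not
part of the project's code: the class name TtlCache, the maxAgeMillis
parameter, and passing the current time in explicitly are all assumptions made
to keep the example small.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: entries are considered current for maxAgeMillis
// after they were stored; older entries must be fetched again.
class TtlCache {
    private static class Entry {
        final String body;
        final long storedAt;

        Entry(String body, long storedAt) {
            this.body = body;
            this.storedAt = storedAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxAgeMillis;

    TtlCache(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    // Store a page together with the time at which it was fetched.
    void put(String url, String body, long now) {
        entries.put(url, new Entry(body, now));
    }

    // Return the cached page, or null if it is missing or too old.
    String get(String url, long now) {
        Entry e = entries.get(url);
        if (e == null || now - e.storedAt > maxAgeMillis) {
            return null;
        }
        return e.body;
    }
}
```

A short maxAgeMillis keeps pages current at the cost of more downloads; a long
one does the opposite, which is the trade-off the administrator has to decide.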
Passive caching
Passive caching occurs on behalf of every Web Proxy service request for
content (i.e., objects). As browsers request content from the Web Proxy
service, the service consults the cache to see whether a current copy of the
object exists. If no copy exists, the service downloads a fresh copy from the
Web server and serves it to the client. Subsequently, the service caches the
object on the proxy server's local drives. This newly cached object is now
ready for the proxy server to serve when other browser requests for the same
object occur.
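The passive caching steps above can be sketched roughly as follows. The class
name PassiveCache, the origin function that stands in for the Web server, and
the hit and miss counters are assumptions made for illustration; they are not
taken from any particular proxy product.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of passive caching: consult the cache first and
// contact the origin server only when no copy exists.
class PassiveCache {
    private final Map<String, byte[]> objects = new HashMap<>();
    private final Function<String, byte[]> origin; // stands in for the Web server
    int hits = 0;
    int misses = 0;

    PassiveCache(Function<String, byte[]> origin) {
        this.origin = origin;
    }

    byte[] request(String url) {
        byte[] cached = objects.get(url);
        if (cached != null) {
            hits++;
            return cached;              // served from the cache
        }
        misses++;
        byte[] fresh = origin.apply(url); // download a fresh copy
        objects.put(url, fresh);          // keep it for later requests
        return fresh;
    }
}
```

The ratio of hits to total requests in such a sketch corresponds to the hit
rate of the cache.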
Serving cached copies of Web pages is a benefit to the local user; however,
for Web sites tracking page hits, the result is a lost hit. Lost hits can potentially
result in lost revenues. In addition, not every type of content is cacheable.
(Examples of non-cacheable content include Active Server Pages—ASP—
and Common Gateway Interface—CGI—objects.) If the content provider used
the <META> tag HTTP-Expires to assign an expiration date and time, Proxy
Server uses this value.
Active caching
Unlike passive caching, active caching is caching that the proxy server
performs during its idle periods. This type of caching is called active because
the proxy server proactively downloads the pages that its cache has learned
are requested most frequently. If an entertainment Web site is one of the most
requested Web sites on your proxy server, active caching will have a fresh
copy on hand in anticipation of browser requests. This active caching process
occurs only during idle periods, for example overnight. You can disable this
feature for those proxy servers that have time or bandwidth restrictions.
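One way a proxy could learn which pages to refresh during idle periods is to
count requests per URL. The sketch below is an assumption about how this might
be done, not a description of any real product; the names RequestStats, record
and mostRequested are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: count requests per URL so that an idle-time job
// can refresh the most popular pages first.
class RequestStats {
    private final Map<String, Integer> counts = new HashMap<>();

    // Call once for every request the proxy receives.
    void record(String url) {
        counts.merge(url, 1, Integer::sum);
    }

    // Return up to n URLs, most frequently requested first.
    List<String> mostRequested(int n) {
        List<String> urls = new ArrayList<>(counts.keySet());
        urls.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return new ArrayList<>(urls.subList(0, Math.min(n, urls.size())));
    }
}
```

An idle-time task would then fetch each URL returned by mostRequested() and
replace the cached copy.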
IMPLEMENTATION DETAILS
Caching Proxy HTTP Server
This section presents a simple caching proxy HTTP server, called http, that
demonstrates client and server sockets. http supports only GET operations and
a very limited range of hard-coded MIME types. (MIME types are the type
descriptors for multimedia content.) The proxy server is single threaded, in
that each request is handled in turn while others wait. It has fairly naïve
strategies for caching: it keeps everything in RAM forever. When it is acting
as a proxy server, http also copies every file it gets to a local cache for
which it has no strategy for refreshing or garbage collecting. All of these
caveats aside, http represents a productive example of client and server
sockets, and it is fun to explore and easy to extend.
MimeHeader.java
Parse()
The parse() method takes a raw MIME-formatted string and enters its
key/value pairs into a given instance of MimeHeader. It uses a
StringTokenizer to split the input data into individual lines, marked by the
CRLF (\r\n) sequence. It then iterates through each line using the canonical
while ... hasMoreTokens() ... nextToken() sequence.
For each line of the MIME header, the parse() method splits the line into two
strings separated by a colon (:). The two variables key and val are set by the
substring() method to extract the characters before the colon and those after
the colon and its following space character. Once these two strings have been
extracted, the put() method is used to store this association between the key
and value in the Hashtable.
ToString()
The put() and get() methods of the Hashtable would work fine for this
application if not for one rather odd thing. The MIME specification defined
several important keys, such as Content-Type and Content-Length. Some
early implementations of MIME systems, notably web browsers, took liberties
with the capitalization of these fields. Some use Content-Type, others
content-type. To avoid mishaps, our HTTP server tries to convert all incoming
and outgoing MimeHeader keys to one capitalization, using the method
fix(), before entering them into the Hashtable and before looking up a given
key.
CODE
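import java.util.*;

// MimeHeader stores the header fields as key/value pairs in a Hashtable.
class MimeHeader extends Hashtable {
    // Split the raw header into lines and store each "key: value" pair.
    void parse(String data) {
        StringTokenizer st = new StringTokenizer(data, "\r\n");
        while (st.hasMoreTokens()) {
            String s = st.nextToken();
            int colon = s.indexOf(':');
            String key = s.substring(0, colon);
            String val = s.substring(colon + 2); // skip ": "
            put(key, val);
        }
    }

    MimeHeader() {}

    MimeHeader(String d) {
        parse(d);
    }
}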
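The fix() method is not shown in the listing above. A minimal sketch of the
kind of capitalization fix described earlier might look like the following;
the class name HeaderNames and the method name canonical are hypothetical, and
the rule used here (upper-case the first letter and each letter after a
hyphen) is an assumption based on the Content-Type example.

```java
// Hypothetical sketch of key normalisation: lower-case the whole key, then
// upper-case the first letter and every letter that follows a hyphen.
class HeaderNames {
    static String canonical(String key) {
        StringBuilder sb = new StringBuilder(key.length());
        boolean upcaseNext = true;
        for (char ch : key.toCharArray()) {
            sb.append(upcaseNext ? Character.toUpperCase(ch)
                                 : Character.toLowerCase(ch));
            upcaseNext = (ch == '-');
        }
        return sb.toString();
    }
}
```

With such a helper, content-type, CONTENT-TYPE and Content-Type would all be
stored and looked up under the same key.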
HttpResponse.java
CONSTRUCTORS
Parse()
The parse() method takes the raw data that was read from the HTTP server,
parses the statusCode and reasonPhrase from the first line, then constructs a
MimeHeader out of the remaining lines.
ToString()
The toString() method is the inverse of parse(). It takes the current values of
the HttpResponse object and returns a string that an HTTP client would
expect to read back from the server.
CODE
import java.io.*;

/*
 * HttpResponse
 * Parse a return message and MIME header from a server.
 * HTTP/1.0 302 Found = redirection, check Location for where.
 * HTTP/1.0 200 OK = file data comes after mime header.
 */
class HttpResponse
{
    int statusCode;      // Status-Code in spec
    String reasonPhrase; // Reason-Phrase in spec
    MimeHeader mh;
    static String CRLF = "\r\n";

    HttpResponse(String request) {
        parse(request);
    }
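The listing above does not include parse() or toString(). The following is a
rough, self-contained sketch of the behaviour described for them, not the
project's actual code. The class name ResponseSketch and the rawHeader field
are hypothetical (the project wraps that part in a MimeHeader), the protocol
version is fixed at HTTP/1.0, and the input is assumed to be well formed.

```java
// Hypothetical sketch: split a raw response into status code, reason
// phrase and the remaining header text, and format it back again.
class ResponseSketch {
    static final String CRLF = "\r\n";
    int statusCode;
    String reasonPhrase;
    String rawHeader; // everything after the status line

    ResponseSketch(String raw) {
        int firstSpace = raw.indexOf(' ');
        int secondSpace = raw.indexOf(' ', firstSpace + 1);
        int eol = raw.indexOf(CRLF);
        statusCode = Integer.parseInt(raw.substring(firstSpace + 1, secondSpace));
        reasonPhrase = raw.substring(secondSpace + 1, eol);
        rawHeader = raw.substring(eol + CRLF.length());
    }

    // Inverse of the constructor: rebuild the text a client expects.
    @Override
    public String toString() {
        return "HTTP/1.0 " + statusCode + " " + reasonPhrase + CRLF + rawHeader;
    }
}
```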
UrlCacheEntry.java
CONSTRUCTOR
The constructor for a UrlCacheEntry object requires the URL to use as the
key and a MimeHeader to associate with it. If the MimeHeader has a field in it
called Content-Length (most do), the data area is preallocated to be large
enough to hold such content.
Append()
class UrlCacheEntry
{
    String url;
    MimeHeader mh;
    byte data[];
    int length = 0;
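The body of append() is not shown in this report. One plausible way to
preallocate the data area and grow it when more bytes arrive is sketched
below; the class name CacheBuffer, the default size of 1024 bytes and the
doubling policy are assumptions for illustration only.

```java
// Hypothetical sketch of a growable byte buffer for one cached object.
class CacheBuffer {
    byte[] data;
    int length = 0;

    // contentLength <= 0 means "unknown": start small and grow on demand.
    CacheBuffer(int contentLength) {
        data = new byte[contentLength > 0 ? contentLength : 1024];
    }

    // Copy the first n bytes of chunk onto the end of the buffer.
    void append(byte[] chunk, int n) {
        if (length + n > data.length) {
            byte[] bigger = new byte[Math.max(data.length * 2, length + n)];
            System.arraycopy(data, 0, bigger, 0, length);
            data = bigger;
        }
        System.arraycopy(chunk, 0, data, length, n);
        length += n;
    }
}
```

When Content-Length is known and correct, the buffer never has to grow.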
LogMessage.java
CODE
interface LogMessage {
public void log(String msg);
}
httpd.java
CONSTRUCTOR
There are five main instance variables: port, docRoot, log, cache, and
stopFlag, and all of them are private.
The sole constructor of httpd, shown here, can set three of these:
httpd(int p, String dr, LogMessage lm)
It initializes the port to listen on, the directory to retrieve files from, and
the interface to send messages to.
The fourth instance variable, cache, is the Hashtable where all of the files
are cached in RAM, and is initialized when the object is created. stopFlag
controls the execution of the program.
STATIC SECTION
There are several important static variables in this class. The version reported
in the “Server” field of the MIME header is found in the variable version. A few
constants are defined next: the MIME type for HTML files, mime_text_html;
the MIME end-of-line sequence, CRLF; the name of the HTML file to return in
place of raw directory requests, indexfile; and the size of the data buffer used
in I/O, buffer_size.
Next are five more instance variables. These are left without the private
modifier so that an external monitor can inspect these values to display them
graphically. (We will show this in action later.) These variables represent the
usage statistics of our web server. The raw numbers of hits and bytes served
are stored in hits_served and bytes_served. The numbers of files and bytes
currently stored in the cache are stored in files_in_cache and bytes_in_cache.
Finally, we store the number of hits that were successfully served out of the
cache in hits_to_cache.
ToBytes()
MakeMimeHeader()
Error()
The error() method is used to format an HTML page to send back to web
clients who make requests that cannot be completed. The first parameter,
code, is the error code to return. Typically this will be between 400 and 499.
Our server sends back 404 and 405 errors. It uses the HttpResponse class
to encapsulate the return code with the appropriate MimeHeader. The method
returns the string representation of that response concatenated with the
HTML page to show the user. The page includes a human-readable version of
the error code, msg, and the URL request that caused the error.
GetRawRequest()
The getRawRequest() method is very simple. It reads data from a stream until
it gets two consecutive newline characters. It ignores carriage returns and just
looks for newlines. Once it has found the second newline, it converts the array
of bytes into a String object and returns it. It will return null if the input
stream does not produce two consecutive newlines before it ends. This is how
messages from HTTP servers and clients are formatted. They begin with one
line of status and then are immediately followed by a MIME header. The end
of the MIME header is separated from the rest of the content by two newlines.
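The reading loop described above can be sketched as follows. This is an
illustration written for this report rather than the project's own listing;
the class name RawRequestReader and the choice of ISO-8859-1 for converting
bytes to characters are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: read a request or response header up to and
// including the blank line that ends it.
class RawRequestReader {
    // Returns null if the stream ends before two consecutive newlines.
    static String read(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean lastWasNewline = false;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\r') {
                continue; // carriage returns are ignored
            }
            buf.write(c);
            if (c == '\n') {
                if (lastWasNewline) {
                    return buf.toString("ISO-8859-1");
                }
                lastWasNewline = true;
            } else {
                lastWasNewline = false;
            }
        }
        return null;
    }
}
```

Because the method stops at the blank line, any content that follows the
header stays in the stream for the caller to read.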
LogEntry()
WriteString()
WriteUCE()
ServeFromCache()
LoadFile()
This method takes an InputStream, the URL that corresponds to it, and the
MimeHeader for that URL. A new UrlCacheEntry is created with the
information stored in the MimeHeader. The input stream is read in chunks of
buffer_size bytes, which are added to the cache entry. The files_in_cache and
bytes_in_cache variables are updated, and the UrlCacheEntry is returned to
the caller.
ReadFile()
The readFile() method might seem redundant with the loadFile() method. It
isn’t. This method is strictly for reading files out of a local file system,
whereas loadFile() is used to talk to streams of any sort. If the file object f
exists, then an InputStream is created for it. The size of the file is
determined and the MIME type is derived from the filename. These two values
are used to create the appropriate MimeHeader, then loadFile() is called to do
the actual reading and caching.
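The body of fnameToMimeType() is not shown in this report. A minimal sketch
of deriving a MIME type from a filename extension is given below. The class
name MimeTypes, the method name fromFilename and the particular set of
extensions are assumptions; the server's actual hard-coded list may differ.

```java
// Hypothetical sketch: map a filename extension to a MIME type.
class MimeTypes {
    static String fromFilename(String name) {
        String lower = name.toLowerCase();
        if (lower.endsWith(".html") || lower.endsWith(".htm")) return "text/html";
        if (lower.endsWith(".txt")) return "text/plain";
        if (lower.endsWith(".gif")) return "image/gif";
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "image/jpeg";
        return "application/octet-stream"; // generic fallback for unknown types
    }
}
```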
WriteDiskCache()
HandleProxy()
The handleProxy() routine is one of the two major modes of this server. The
basic idea is this: if you set your browser to use this server as a proxy server,
then the requests that will be sent to it will include the complete URL, whereas
normal GETs omit the “http://” and host name part. We simply pick apart
the complete URL, looking for the “://” sequence, the next slash (/), and
optionally another colon (:) for servers using nonstandard port numbers. Once
we have found these characters, we know the intended host and port number as
well as the URL we need to fetch from there. We can then attempt to load a
previously saved version of this document out of our RAM cache. If this fails,
we can attempt to load it from the file system into the RAM cache and reattempt
loading it from the cache. If this fails, then it gets interesting, because we
must read the document from the remote site.
To do this we open a socket to the remote site and port. We send a GET
request asking for the URL that was passed to us. Whatever response header
we get back from the remote site, we send on to the client. If the status code
was 200, for successful file transfer, we also read the ensuing data stream into
a new UrlCacheEntry and write it into the client socket. After that we call
writeDiskCache() to save the results to the local disk. We log the transaction,
close the sockets, and return.
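The URL-splitting step described above can be sketched as follows. This is a
simplified illustration, not the project's code: the class name ProxyTarget is
hypothetical, port 80 is assumed as the default, and the caller is assumed to
have already checked that the URL contains “://”.

```java
// Hypothetical sketch: split an absolute URL into host, port and path.
class ProxyTarget {
    final String host;
    final int port;
    final String path;

    ProxyTarget(String url) {
        int start = url.indexOf("://") + 3;  // first character of the host
        int slash = url.indexOf('/', start); // start of the path
        if (slash < 0) {
            slash = url.length();            // no path given, e.g. http://host
        }
        String hostPort = url.substring(start, slash);
        int colon = hostPort.indexOf(':');
        if (colon < 0) {
            host = hostPort;
            port = 80;                       // default HTTP port
        } else {
            host = hostPort.substring(0, colon);
            port = Integer.parseInt(hostPort.substring(colon + 1));
        }
        path = slash < url.length() ? url.substring(slash) : "/";
    }
}
```

The host and port would then be used to open the socket to the remote site,
and the path would be sent in the GET request.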
HandleGet()
The handleGet() method is called when the http daemon is acting like a
normal web server. It has a local disk document root out of which it is serving
files. The parameters to handleGet() tell it where to write the results, the URL
to look up, and the MimeHeader from the requesting web browser. This
MimeHeader will include the User-Agent string and other useful attributes.
First we attempt to serve the URL out of the RAM cache. If this fails, we look
in the file system for the URL. If the file does not exist or is unreadable, we
report an error back to the web client. Otherwise we just use readFile() to get
the contents of the file and put them in the cache. Then writeUCE() is used to
send the contents of the file down the client socket.
DoRequest()
The doRequest() method is called once per connection to the server. It parses
the request string and incoming MIME header. It decides to call either
handleProxy() or handleGet(), based on whether there is a “://” in the request
string. If any method other than GET is used, such as HEAD or POST, this
routine returns a 405 error to the client. Note that the HTTP request is ignored
if stopFlag is true.
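The dispatch decision made by doRequest() can be illustrated with a small
sketch. The class name RequestRouter, returning the decision as a string, and
the 400 answer for a malformed request line are assumptions added for this
example; only the GET check and the “://” test come from the description
above.

```java
import java.util.StringTokenizer;

// Hypothetical sketch: decide how a request line should be handled.
class RequestRouter {
    static String route(String requestLine) {
        StringTokenizer st = new StringTokenizer(requestLine);
        if (st.countTokens() < 2) {
            return "400 Bad Request"; // assumption: not described in the report
        }
        String method = st.nextToken();
        String url = st.nextToken();
        if (!method.equalsIgnoreCase("GET")) {
            return "405 Method Not Allowed";
        }
        // A complete URL means the browser is using us as a proxy.
        return url.indexOf("://") >= 0 ? "handleProxy" : "handleGet";
    }
}
```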
Run()
The run() method is called when the server thread is started. It creates a new
ServerSocket on the given port, goes into an infinite loop calling accept() on
the ServerSocket, and passes the resulting Socket off to doRequest() for
inspection.
The start() and stop() methods are used to start and stop the server process.
These methods set the value of stopFlag.
CODE
import java.net.*;
import java.io.*;
import java.text.*;
import java.util.*;

class httpd implements Runnable, LogMessage {
    private int port;
    private String docRoot;
    private LogMessage log;
    private Hashtable cache = new Hashtable();
    private boolean stopFlag;

    int hits_served = 0;
    int bytes_served = 0;
    int files_in_cache = 0;
    int bytes_in_cache = 0;
    int hits_to_cache = 0;

        if (!f.exists())
            return null;
        InputStream in = new FileInputStream(f);
        int file_length = in.available();
        String mime_type = fnameToMimeType(url);
        MimeHeader mh = makeMimeHeader(mime_type, file_length);
        UrlCacheEntry uce = loadFile(in, url, mh);
        return uce;
    }

        if (!serveFromCache(out, url)) {
            if (readFile(new File(docRoot + url), url) != null) {
                serveFromCache(out, url);
                return;
            }

        if (method.equalsIgnoreCase("get")) {
            if (url.indexOf("://") >= 0) {
                handleProxy(out, url, inmh);
            } else {
                handleGet(out, url, inmh);
            }
        } else {
            writeString(out, error(405, "Method Not Allowed", method));
        }
        in.close();
        out.close();
    }

    private Thread t;

    public synchronized void start() {
        stopFlag = false;
        if (t == null) {
            t = new Thread(this);
            t.start();
        }
    }

    // This main and log method allow httpd to be run from the console.
    public static void main(String args[]) {
        httpd h = new httpd(80, "c:\\www", null);
        h.log = h;
        h.start();
        try {
            Thread.currentThread().join();
        } catch (InterruptedException e) {
        }
    }
Main()
We can use the main() method to run this application from a command line. It
sets the LogMessage parameter to be the server itself, and then provides a
simple console output implementation of log().
HTTP.java
HTTP.java is an added applet class that gives the HTTP server a functional
“front panel”. This applet has two parameters that can be used to configure
the server: port and docroot. This is a very simple applet. It makes an instance
of the httpd, passing itself in as the LogMessage interface. Then it creates a
panel that has a simple label at the top, a TextArea in the middle for
displaying the LogMessages, and a panel at the bottom that has two buttons
and another label in it. The start() and stop() methods of the applet call the
corresponding methods on the httpd. The buttons labeled “Start” and “Stop”
call their corresponding methods in the httpd. Any time a message is logged,
the bottom-right Label object is updated to contain the latest statistics from the
httpd.
CODE
import java.util.*;
import java.applet.*;
import java.awt.*;
import java.awt.event.*;
setLayout(new BorderLayout());
LIMITATIONS
As every project has some limitations, ours too lacks some features.
These are a few areas in which this project, JAVA PROXY SERVER, needs
further improvement.
· There is no provision in the project to clear the cache automatically.
The cache has to be cleared manually from time to time by the server
administrator.
· Files that could not be saved properly in the cache can never be
displayed when accessed through the cache. There is no provision in the
project to request such files again from the web and save them in the
cache, unless the cache is cleared manually.
· This project is not as efficient as other proxy servers available in the
market, as the client has to spend more time to access web pages
through this proxy server.
· Another limitation of this project is that it can handle requests from only
one client at a time. Other clients wishing to set up a connection with the
server have to wait for the current client to complete its requests and
end its connection with the server.
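As a pointer for further work on the first limitation, automatic clearing of
the cache could be approached with a size-bounded cache that discards the
least recently used entry. The sketch below is a suggestion only and is not
implemented in the project; the class name LruCache and the maxEntries
parameter are hypothetical. It relies on the access-order mode of the standard
java.util.LinkedHashMap class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a cache that discards its least recently used
// entry once it holds more than maxEntries entries.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = keep entries in access order
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```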
BIBLIOGRAPHY
For the development of this project we took help from a lot of sources,
some of which are listed here.