V An exciting recent movement in the database area is knowledge discovery in databases (KDD)

V KDD is an umbrella term used to describe all activities involved in making sense of data stored in large and complex databases

V KDD encompasses a number of terms that are currently receiving attention, namely data warehousing, data mart and data mining
V Data warehousing: a database consists of data stored on a computer in a way that facilitates retrieval

V Data warehousing is a refinement of the database concept that makes an improved data resource available to the users

V It enables the users to manipulate and use data in intuitive ways

V The key concept is that it encompasses a very wide range of computer-based data
V The data resource here is called a data warehouse; it is typically very large, of very high quality and highly retrievable

V The large size of the data does not come at the cost of poor quality

V This is because of extensive data cleaning, i.e. removing incorrect and inconsistent data and converting the remainder into higher-quality data
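The cleaning step described above can be sketched in Python. This is only an illustration: the record layout, the field names and the validity rules (an age between 0 and 120, one row per id) are assumptions and do not come from the source.

```python
# Hypothetical customer records; field names and rules are illustrative only.
raw_records = [
    {"id": 1, "name": "  alice ", "age": 34, "country": "us"},
    {"id": 2, "name": "Bob", "age": -5, "country": "US"},     # incorrect: negative age
    {"id": 1, "name": "Alice", "age": 34, "country": "US"},   # inconsistent duplicate of id 1
    {"id": 3, "name": "Carol", "age": 51, "country": "Us"},
]


def clean(records):
    """Drop incorrect rows, keep the first row per id, normalise formats."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        if not 0 <= rec["age"] <= 120:      # remove incorrect data
            continue
        if rec["id"] in seen_ids:           # remove inconsistent duplicates
            continue
        seen_ids.add(rec["id"])
        cleaned.append({
            "id": rec["id"],
            "name": rec["name"].strip().title(),   # convert to a consistent form
            "age": rec["age"],
            "country": rec["country"].upper(),
        })
    return cleaned


cleaned_records = clean(raw_records)
```

Only the rows for ids 1 and 3 survive, each with a tidied name and an upper-case country code. A real warehouse load would apply many more rules than these two.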
V One statistical technique is clustering, which arranges the data in the ways users want to view it

V This is similar to the way like goods are arranged together in a supermarket

V Data warehousing is typically performed on mainframe computers because of the extremely large amount of stored data
V The data is stored in a relational database

V DBMS vendors such as Oracle, Sybase and Informix are promoting the use of their products as data warehouse platforms

V IBM has actively positioned itself as the builder of computer hardware that supports data warehousing activity
V The data mart:

V Achieving a data warehouse sounds like a big challenge, so experts recommend taking a more modest approach

V A data mart is a database that contains data describing only a segment of the firm's operations

V A firm may have a marketing data mart, a human resources data mart and so on
V Data mining: a term often used in conjunction with data warehousing and data mart is data mining

V It is the process of finding relationships in data that are unknown to the user

V Data mining helps the user by discovering the relationships and presenting them in an understandable way
V The relationships may provide the basis for decision making

V Data mining enables the user to discover knowledge in databases that the user may not know exists

V It does not simply present the same data in a different format; it shows relationships that were not previously recognised
V Take the example of a bank that has decided to offer mutual funds to its customers

V Bank management wants to aim promotional materials at the customers

V They want to target the customer segment that offers the greatest potential for business

V For this, data mining is applied to the database of customers and prospects
V Verification-driven data mining: one approach is for the managers to identify characteristics they believe the members of the target segment will have

V Assume the managers want to target young, married, two-income and high-net-worth customers

V The query could be entered into the DBMS and the appropriate records would be retrieved
V Such an approach, which begins with the user's hypothesis of how the data is related, is called verification-driven data mining

V The shortcoming of this approach is that the retrieval process is guided entirely by the user

V The selected information can be no better than the user's view of the data
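The verification-driven approach can be sketched as a plain database query. The example uses Python's built-in sqlite3 module with an in-memory database. The table layout, the sample rows and the thresholds (under 35 for "young", at least 500,000 for "high net worth") are assumptions made for illustration and are not from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (name TEXT, age INTEGER, married INTEGER, "
    "household_incomes INTEGER, net_worth REAL)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        ("Asha", 29, 1, 2, 750000.0),
        ("Ben", 31, 1, 1, 900000.0),
        ("Chen", 67, 1, 2, 820000.0),
        ("Dana", 27, 0, 1, 40000.0),
    ],
)

# The managers' hypothesis written as a query: young, married,
# two-income, high-net-worth customers. The thresholds are assumed.
rows = conn.execute(
    "SELECT name FROM customers "
    "WHERE age < 35 AND married = 1 AND household_incomes = 2 "
    "AND net_worth >= 500000 ORDER BY name"
).fetchall()
targets = [name for (name,) in rows]
conn.close()
```

Only Asha is retrieved. Chen, an older married customer with two incomes and a high net worth, is never considered because the query encodes only what the managers already believe, which is the shortcoming noted above.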
V Discovery-driven data mining: another approach enables a data mining system to identify the best customers for the promotion

V This approach enables the system to analyse the database and look for groups with common characteristics

V In the previous bank example, the mining system will target not only the young married group but also retired married couples with incomes, thereby recommending a promotional campaign for both groups
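As a rough sketch of the discovery-driven idea, the code below states no hypothesis up front. It groups every customer by shared characteristics and ranks the groups by how often their members bought an investment product. The data, the field names and the simple ranking rule are all assumptions for illustration; real data mining tools use far richer statistical techniques than this count.

```python
from collections import defaultdict

# Hypothetical records: (age_band, marital_status, bought_investment_product)
customers = [
    ("young", "married", True),
    ("young", "married", True),
    ("young", "single", False),
    ("middle", "married", False),
    ("middle", "single", False),
    ("retired", "married", True),
    ("retired", "married", True),
    ("retired", "single", False),
]


def rank_segments(records):
    """Return (segment, purchase_rate) pairs, highest rate first."""
    counts = defaultdict(lambda: [0, 0])      # segment -> [buyers, members]
    for age_band, marital, bought in records:
        segment = (age_band, marital)
        counts[segment][0] += int(bought)
        counts[segment][1] += 1
    rates = {seg: buyers / members for seg, (buyers, members) in counts.items()}
    # Sort by rate descending, then by segment name for a stable order.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_segments(customers)
top_rate = ranking[0][1]
best_segments = [seg for seg, rate in ranking if rate == top_rate]
```

With this sample data, both the young married and the retired married segments come out on top, mirroring the bank example in which the system surfaces a group the managers had not thought to ask about.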
V Combined discovery and verification data mining: this concept enables the user and computer to work together to solve a problem

V The user applies expertise in the problem domain and the computer performs the data analysis

V This combination selects the appropriate data and puts it in the right form for decision making
V The speed of data transmission is slower in telephone systems than between two computers connected by a telephone wire

V Computers need extremely reliable connections, but the humans who use the telephone can understand communication even when there is static on the line

V Protocols for the public telephone system were established to meet the minimum criteria of voice transmission

V The quality of the telephone system is significantly below the needs of computer data transmission
V Networks are differentiated by the size of the audience that is served

V Technology plays a role because there are physical limits to the distance between computers, depending on the communications medium used

V The distinction between different types of networks has blurred as communication technologies and the quality of data transmission improve
V To be included on a network, each device (each computer, printer or similar device) must be attached to the communications medium

V This is done using a network interface card

V The network interface card (NIC) acts as an intermediary between the network and the data moving to and from the computer or other device
V The NIC is more than just a buffer to allow data storage

V It deciphers information from the packets to determine if the data is meant to be captured

V It also decides if the data should be allowed to pass down the communications medium
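The capture decision described above can be sketched in Python. This is a simplified model, not how any particular card is built: the class and field names are hypothetical, and only the check "is this frame addressed to me, or to everyone?" is shown. The decision about passing data down the medium is left out.

```python
from dataclasses import dataclass

# Ethernet's all-ones broadcast address: every card accepts it.
BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class Frame:
    destination: str   # hardware address the frame is meant for
    payload: bytes


class NetworkInterfaceCard:
    """Hypothetical model of the NIC's capture decision."""

    def __init__(self, address: str):
        self.address = address.lower()
        self.buffer = []               # frames captured for this device

    def receive(self, frame: Frame) -> bool:
        """Capture the frame if it is addressed to this card or broadcast."""
        if frame.destination.lower() in (self.address, BROADCAST):
            self.buffer.append(frame)
            return True
        return False                   # not meant for this device: ignore it


nic = NetworkInterfaceCard("00:1A:2B:3C:4D:5E")
captured = nic.receive(Frame("00:1a:2b:3c:4d:5e", b"hello"))
ignored = nic.receive(Frame("00:1a:2b:3c:4d:5f", b"other"))
```

The first frame matches the card's own address and lands in its buffer; the second is addressed to a different device and is dropped.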
V Local area networks: a LAN is a group of computers and other devices (such as printers) that are connected together by a common medium

V LANs typically join computers that are physically close together, such as in the same room or building

V Only a limited number of computers and other devices can be connected on a single LAN
V The limitations vary based on the medium connecting the computers and devices as well as the LAN software being used

V As a general rule, a LAN will cover a total distance of only half a mile

V The distance between computers linked by the communication medium is typically at least 2 feet and not more than 60 feet
V The distances are only guidelines, since the specifications imposed by the type of communication medium, the network interface card used and the LAN software dictate the actual distances

V The current transmission speed of data along a LAN generally runs from 10 million bits per second (10 Mbps) to 100 Mbps
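To give a feel for what the two speeds quoted above mean in practice, here is a small worked calculation in Python. The 5 MB file size is an assumed example, and the formula is idealised: it ignores protocol overhead and other traffic sharing the LAN.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Idealised transfer time; ignores protocol overhead and contention."""
    return size_bytes * 8 / bits_per_second


FILE_SIZE = 5_000_000                              # hypothetical 5 MB file
slow = transfer_seconds(FILE_SIZE, 10_000_000)     # at 10 Mbps
fast = transfer_seconds(FILE_SIZE, 100_000_000)    # at 100 Mbps
```

Under these assumptions the file takes 4 seconds at 10 Mbps and 0.4 seconds at 100 Mbps.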
V LANs use only private network media and do not transfer data to the public telephone system

V Only a single network protocol, such as Ethernet or token ring, can be used on a single LAN
V LAN topology and implementation: LANs utilise three separate configurations for connecting the computers and other devices

V The network configuration is called the topology, and three major topologies are used

V The three are the ring, bus and hub (star) topologies, which are named after the form of their arrangement in the network
V The importance of stars and hubs to most professionals has less to do with the technology and more to do with the communication

V Managers and professional staff became more dependent on computer resources

V They realised that passing information from one person to another was difficult and time consuming
V Advantages of LANs:

V LANs allowed work groups to share computer-based data and to utilise computer resources (like a laser printer) located not on the worker's desk but on the network

V It was possible to send electronic messages to coworkers

V The ability to share costly hardware like a laser printer proved to be a cost-saving strategy
V Sharing electronic messages allowed individual users to act as a group

V Benefits from group decisions became apparent to firms

V They started to take advantage of other network technologies to link local groups to other local groups and then to the entire company
V Wireless LAN: an extension of ordinary LANs that features a wireless interface permitting the inclusion of small portable terminals

V The wireless LAN can be connected to a fixed LAN in which the users' positions are fixed

V A WLAN consists of services provided by vendors who offer nationwide email service and access to fixed hosts on a fee basis
V Internet: an internet is a collection of networks that can be joined together

V If you have a LAN in one office and a LAN in a different office, you can join them, and that will create an internet

V Using roads as the medium, you can travel two blocks to meet a friend, which is an example of an internet; with an interconnected set of roads and plane routes, however, you can travel virtually anywhere in the world
V The Internet is public, and anyone who has a computer and access to the communication medium can travel the Internet

V An organisation seeking new customers can reach a wide range of them through the Internet

V However, a person using the Internet may retrieve data that the company wished to keep private
V Organisations can limit access to their networks to members of their own organisation by using an intranet

V An intranet uses the same network protocols as the Internet but limits accessibility to computer resources to a select group

V A LAN has no physical connection to another network; an intranet does have a connection to another network but protects it with software, hardware or a combination of both, called a firewall

V A firewall prevents communication from devices other than those authorised to use the intranet
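The core firewall idea, letting through only authorised devices, can be sketched with Python's standard ipaddress module. The address ranges are assumptions chosen for illustration (a private range and a range reserved for documentation). A real firewall inspects much more than the source address.

```python
import ipaddress

# Hypothetical address ranges authorised to reach the protected resources.
AUTHORISED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),        # assumed internal company range
    ipaddress.ip_network("203.0.113.0/24"),    # documentation range, stands in for a second authorised site
]


def is_allowed(source_ip: str) -> bool:
    """Return True if the source address lies inside an authorised network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in AUTHORISED_NETWORKS)


inside = is_allowed("10.20.30.40")      # within the internal range
outside = is_allowed("198.51.100.7")    # not in any authorised range
```

The first address is accepted and the second is refused, which is the allow-list behaviour the bullets describe.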
V Some authorised users may be outside the boundaries of the organisation

V A supplier might need access to the computer-based records of inventory levels

V When an intranet is expanded to include users beyond the organisation, it is called an extranet

V Only trusted customers and business partners are afforded extranet access, and firewalls keep out unauthorised users
V World Wide Web: an information space on the Internet where documents are stored and retrieved by means of a unique addressing scheme

V Rather than handling only textual material, it can also store and retrieve hypermedia, that is, multimedia consisting of text, graphics, audio and video

V The World Wide Web is also called the Web, WWW and W3


V The Internet provides the network architecture and the Web provides the method for storing and retrieving its documents

V The Internet is the global communication network that connects millions of computers

V The WWW is the collection of computers acting as Internet servers that host documents formatted to allow viewing of text, graphics and audio as well as links to other documents on the Web
WWW terminologies:

V Website: a computer linked to the Internet containing hypermedia that can be accessed from any other computer in the network by means of hypertext links

V Hypertext link: a pointer, consisting of text or a graphic, that is used to access hypertext stored at any website; the text is typically underlined and displayed in blue
V Web page: a hypermedia file stored at a website, which is identified by a unique address

V Home page: the first page of a website; other pages at the site can be reached from the home page

V URL (Uniform Resource Locator): the address of a web page
V A protocol is a set of standards that govern the communication of data

V http is the protocol for hypertext; the letters stand for Hypertext Transfer Protocol

V The protocol name is followed by a colon and two slashes

V A domain name is the address of the website where the web page is stored
V The last three letters of the domain name identify the type of website: edu (education), com (commercial), org (non-profit organisation) and gov (government)

V The domain name is followed by a single slash

V The path can identify a certain directory/subdirectory and file at the website

V html or htm is the suffix that designates hypertext files, which are written in HTML code
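The parts of a URL described above can be pulled apart with Python's standard library. The address used here is a made-up example, not one from the source.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "http://www.example.com/courses/mis/index.html"   # hypothetical address

parts = urlsplit(url)
protocol = parts.scheme                    # the text before "://"
domain = parts.netloc                      # address of the website
domain_type = domain.rsplit(".", 1)[-1]    # trailing label such as edu, com, org, gov

path = PurePosixPath(parts.path)
directory = str(path.parent)               # directory/subdirectory part of the path
file_name = path.name                      # the file at the website
suffix = path.suffix                       # ".html" or ".htm" for hypertext files
```

For this address the protocol is `http`, the domain is `www.example.com` with type `com`, the directory is `/courses/mis`, and the file is `index.html` with suffix `.html`.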
V Browser: a software system that enables you to retrieve hypermedia by typing in search parameters or clicking on a graphic

V This capability relieves you of having to know the URL of the web page that contains the information needed

V A browser is sometimes loosely called a search engine, although strictly a search engine is a separate service, reached through the browser, that finds pages matching the search parameters


V File Transfer Protocol (FTP) refers to software that enables you to copy files onto your computer from any website

V To do this, the URL of the website must be known

V Many FTP sites offer transfer of data in one direction only

V Firms have used Internet sites off their premises to provide files to users, such as product information, news releases, etc.
