
Open Search Server documentation

PRELIMINARY DRAFT
Emmanuel Keller, Author
Emmanuel Gosse, Author
Sebastien Andrivet, Translator and proofreader
InfoPro Digital
12-14, rue Mederic Paris France
http://www.open-search-server.com
© InfoPro Digital 2009
2 | Open Search Server | Introduction
Quick start

Installing the JDK Software


Open Search Server (OSS) requires a Java™ runtime environment (JRE) version 5 or newer.
1. Download the JDK software from either Sun Microsystems or IBM.
• Sun Microsystems provides its JRE (or JDK) for the Windows™, Linux and Solaris™ operating systems: Sun download page.
• IBM® provides its JRE for the AIX™ and Linux operating systems: IBM developer kit.
2. Select an appropriate JRE/JDK version and download it.
3. Install the JRE/JDK using the installation instructions.
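
The version requirement above can be checked with a short script. This is only a sketch: the helper names `java_major_version` and `meets_requirement` are hypothetical, and the version string is assumed to be the one reported by the installed JRE (for example 1.6.0_14).

```python
import re


def java_major_version(version: str) -> int:
    """Return the major Java version from a string such as '1.6.0_14'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version: {version!r}")
    first = int(match.group(1))
    second = match.group(2)
    # Releases before Java 9 are numbered 1.x (1.5 is Java 5, 1.6 is Java 6).
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_requirement(version: str, minimum: int = 5) -> bool:
    """True if the given version satisfies the 'version 5 or newer' rule."""
    return java_major_version(version) >= minimum
```

For instance, `meets_requirement("1.6.0_14")` accepts the JDK used in the examples of this guide, while a 1.4 runtime would be rejected.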

Setting up the environment variables on a Windows™ System


On Windows, the only thing to do is to add an environment variable named JAVA_HOME.
1. Right click My Computer
2. Select Properties
3. Select the Advanced tab
4. Select Environment Variables
5. Edit or create a new entry named JAVA_HOME.
6. JAVA_HOME must point to the JDK software, for example: C:\Program Files\Java\jdk1.6.0_14

Setting up environment variables on a UNIX System


You have to define the JAVA_HOME environment variable.
1. Set JAVA_HOME
Replace [jdk-path] with the location of your JDK. For example: /usr/jdk/jdk1.6.0_14
• Korn or bash shells: export JAVA_HOME=[jdk-path]
• If you are using a Bourne shell:
JAVA_HOME=[jdk-path]
export JAVA_HOME
• If you are using a C shell: setenv JAVA_HOME [jdk-path]
2. Set PATH
• Korn or bash shells: export PATH=$JAVA_HOME/bin:$PATH
• Bourne shell:
PATH=$JAVA_HOME/bin:$PATH
export PATH
• C shell: setenv PATH ${JAVA_HOME}/bin:${PATH}
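
Once the variables are set, a quick check can confirm that JAVA_HOME really points to a Java installation. The sketch below is not part of Open Search Server; the function name `find_java` is hypothetical, and it only assumes the usual layout where the java executable sits in the bin folder of the JDK.

```python
import os
from pathlib import Path


def find_java(env=None):
    """Return the path of the java executable found under JAVA_HOME."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        raise RuntimeError("JAVA_HOME is not set")
    bin_dir = Path(java_home) / "bin"
    # 'java' on Unix systems, 'java.exe' on Windows.
    for name in ("java", "java.exe"):
        candidate = bin_dir / name
        if candidate.is_file():
            return candidate
    raise RuntimeError(f"no java executable found in {bin_dir}")
```

Calling `find_java()` without arguments inspects the environment of the current process, so it should be run from the same shell in which the variables were exported.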

Downloading Open Search Server


Download the appropriate package file for your environment.
1. Go to the download pages: http://sourceforge.net/project/showfiles.php?group_id=260863
2. Select the release version you need. Usually you will be offered the following options:
• Beta: the beta version, the latest stage of the development cycle.
• Stable / Release: stable releases, intended for production use.
• Unstable / Alpha: usually the latest trunk version.
3. Choose the appropriate file / archive:
Option                          Description
documentation.pdf               The documentation you are reading now, in PDF format.
open-search-server-XXX.zip      Open Search Server archive in ZIP format.
open-search-server-XXX.tar.gz   Open Search Server archive in tar.gz format.
open-search-server-XXX.war      You can use the war file if you want to deploy it manually on an application server.
4. The download process should start immediately after you click on the name of the file.

Extracting the open-search-server folder from the archive


Use your favorite tool to uncompress the archive and extract the open-search-server folder.

• Windows / Mac: double-clicking the archive will usually decompress it and extract the folder.
• ZIP archive on a Unix system: you can use the unzip command-line utility, for example: unzip open-search-server-XXX.zip
• TAR.GZ archive on a Unix system: you can use the tar command-line utility, for example: tar -zxvf open-search-server-XXX.tar.gz
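
The same extraction can be scripted, which may be convenient for automated installations. The following sketch uses only the Python standard library; `extract_archive` is a hypothetical helper name, and the archive is assumed to be a trusted download.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive, dest):
    """Extract a .zip or .tar.gz archive into dest and return dest."""
    archive = Path(archive)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    if archive.name.endswith(".zip"):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    elif archive.name.endswith((".tar.gz", ".tgz")):
        # Only extract archives obtained from a trusted source.
        with tarfile.open(archive, "r:gz") as tf:
            tf.extractall(dest)
    else:
        raise ValueError(f"unsupported archive type: {archive.name}")
    return dest
```

For example, `extract_archive("open-search-server-XXX.zip", ".")` plays the role of the unzip command shown above.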

Launching Open Search Server


Start the server by executing the start script.

• On Windows, run the file start.bat from a command prompt.
• On Unix/Linux/Mac OS, open a shell and execute start.sh.

The server is now running and listens on TCP port 8080.
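
To verify that the server accepts connections, a simple TCP probe is enough. This is a minimal sketch: the helper name `is_listening` is hypothetical, and the default host and port are taken from this quick start (localhost, 8080).

```python
import socket


def is_listening(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host, and so on.
        return False
```

A result of False shortly after launching the start script may only mean that the server has not finished starting yet.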

Displaying the web interface


Open a compatible web browser (Internet Explorer, Firefox/Mozilla, Safari), then enter a URL matching your server.

1. Open your favorite web browser.
2. Enter a URL matching your server:
• If the server runs on your desktop machine, you can use: http://localhost:8080
• If OSS runs on a remote server, build the appropriate URL, like this: http://[server-hostname]:8080

Setting up the index directory


You must provide a path to the directory where you want to store the index data. We recommend that you start with
the web_crawler folder provided in the examples folder.

Enter the absolute path of the index directory.


• On Unix/Linux/Mac systems, enter the absolute path, for example: /home/me/open-search-server/examples/web_crawler
• On Windows systems, enter a Windows UNC pathname, for example: \\ComputerName\SomeFolder\open-search-server\examples\web_crawler

Entering the URL of the web site to be crawled


The pattern list lets you decide which URLs will be crawled. Only URLs that match these patterns will be indexed.

1. Select the Crawler panel.


2. Then, select the Web sub-panel.
3. Finally, select the Pattern list sub-panel.
4. Enter, for example, http://www.open-search-server.com*
5. Click on the Add button.
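
To get a feel for what a pattern such as http://www.open-search-server.com* selects, the sketch below treats the asterisk as a wildcard for any sequence of characters. This matching rule is an assumption made for illustration; the exact rules applied by the OSS crawler are not described here, and the function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Compile a pattern where '*' stands for any sequence of characters."""
    # Escape the literal parts and join them with '.*' in place of each '*'.
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))


def url_matches(url, patterns):
    """Return True if the URL matches at least one pattern of the list."""
    return any(pattern_to_regex(p).fullmatch(url) for p in patterns)
```

Under this assumption, http://www.open-search-server.com/download matches the example pattern, whereas a URL on another host does not.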

Starting the crawl process


The crawl process will download and index the URLs that match the patterns you added to the pattern list.

1. Select the Crawl process sub-panel.


2. Click on the Not running - Click to start button.
3. Later, you can click on the same button to stop the crawl.

Querying the index


You can use the web interface to query the data in your index.

1. Select the Query panel.


2. Load the predefined search query template.
3. Enter a word in the field named Enter the query, for example: open
4. Click on the Search button

Testing the XML API


Try the same request through the XML API to get the result in XML format.

1. Open a new window in your web browser.
2. Enter the following URL: http://localhost:8080/select?qt=search&q=open
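
The same request can be sent from a script instead of a browser. The sketch below uses only the Python standard library. The function names are hypothetical, and the base URL is left as a parameter because it depends on how the server is deployed. No assumption is made about the structure of the XML answer: the second function only returns the name of the root element.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(base_url, **params):
    """Build a /select URL from a base URL and query parameters."""
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


def fetch_root_tag(url, timeout=10):
    """Send the request and return the tag of the XML root element."""
    with urlopen(url, timeout=timeout) as response:
        # fromstring raises ParseError if the answer is not well-formed XML.
        return ET.fromstring(response.read()).tag
```

With a running server, `fetch_root_tag(select_url("http://localhost:8080", qt="search", q="open"))` sends the request shown in step 2.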
API Search / Select
API Search/Select is the interface to query the OSS search engine. The call is sent as an HTTP request; both POST and
GET are supported. The engine answers with an XML result.
Url call
The basic relative URL is: /select
Example
http://localhost:8080/OpenSearchServer/select?q=test&qt=search
Parameters
Note: Parameters have to be encoded in UTF-8.

Name            Description                                     Type                  Default value  Needed?
q               Keywords to search for. Ex.: q=try              Text                                 yes (or query)
query           Same as parameter q. Ex.: query=try             Text                                 yes (or q)
qt              Pre-loads a query defined in the index          Text                                 no
                configuration file config.xml.
                Ex.: qt=requestName
start           Rank of the first result shown; this allows     Number                0              no
                pagination. Ex.: start=10
rows            Number of records to return. Combined with      Number                10             no
                the start parameter, it allows pagination.
                Ex.: rows=5
lang            Language of the keywords passed in parameter    Text                                 no
                q. The engine uses the matching analyzer.
                Ex.: lang=fr
collapse.mode   Chooses the collapsing method.                  [off|optimized|full]                 no
                Ex.: collapse.mode=optimized
collapse.field  Activates collapsing on the field passed as     Field name                           no
                a parameter. Ex.: collapse.field=hostname
collapse.max    Number of documents to send before collapsing   Number                2              no
                is activated. Ex.: collapse.max=2
delete          If this parameter is passed, the documents                                           no
                returned by the query are removed.
                Ex.: &delete
noCache         Disables the cache (for the current call                                             no
                only). Ex.: &noCache
debug           Enables the debug information in the result.                                         no
                Ex.: &debug
fq              Adds a filter to the current call. The          Text                                 no
                parameter can be used several times in the
                same call for successive filters.
                Ex.: fq=date:20101201&fq=color:red
rf              Adds one or more fields to return.              Text (field name)                    no
                Ex.: &rf=date&rf=color
fl              Same as parameter rf.                           Text                                 no
sort            Controls the order of results. Prefix the       Text                                 no
                field name with + or - to sort in ascending
                or descending order.
                Ex.: &sort=-date&sort=color
facet           Enables faceting on the field passed as a       Text(Number)                         no
                parameter. A number in parentheses specifies
                the minimum count.
                Ex.: &facet=color or &facet=color(2)
facet.multi     Same as parameter facet, for use with fields    Text(Number)                         no
                containing multiple values (multi-valued
                fields).
                Ex.: &facet.multi=color or
                &facet.multi=color(2)
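
As an illustration of how these parameters combine, the sketch below builds the query string for one page of results with repeated filters. The function name and the zero-based `page` argument are hypothetical conveniences, not part of the API. Values are percent-encoded as UTF-8, as required by the note above, and parameters without a value (such as noCache or debug) are appended as bare names.

```python
from urllib.parse import urlencode


def build_select_query(keywords, page=0, rows=10, filters=(), fields=(), flags=()):
    """Return the relative URL for one page of results."""
    # start is the rank of the first result, so page N starts at N * rows.
    params = [("q", keywords), ("start", page * rows), ("rows", rows)]
    # fq and rf may be repeated, one occurrence per filter or field.
    params += [("fq", f) for f in filters]
    params += [("rf", f) for f in fields]
    query = urlencode(params, encoding="utf-8")
    # Parameters without a value, for example noCache or debug.
    for flag in flags:
        query += "&" + flag
    return "/select?" + query
```

For example, the second page of five results for the keyword open, restricted by two filters, is obtained with `build_select_query("open", page=1, rows=5, filters=["date:20101201", "color:red"])`.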

XML result
Note: The answer is in XML format encoded in UTF-8.
War deployment guide
This first version of the installation guide shows that it takes only a few minutes to get an OSS server running and
ready to be used.
1. Install Apache Tomcat or another Java application server. This guide assumes that it is already installed; if not,
refer to the standard installation procedure on the corresponding website: http://tomcat.apache.org/index.html
Tomcat version 5 or newer is suitable.
2. Deploy the OSS war file: put the war file in the 'tomcat/webapps' directory. You can rename it as you like, but keep
the 'war' extension. Example: oss.war
3. Configure the war in Tomcat: in the 'tomcat/conf/Catalina/localhost/' directory, create an XML file with the same
name as the war file from step 2, but with the 'xml' extension.
Example: oss.xml

<Context docBase="oss.war" debug="0" crossContext="true">
<Environment name="JaeksoftSearchServer/configfile"
type="java.lang.String"
value="/mnt/all_oss/oss1/config.xml" override="true" />
</Context>

4. Configure the physical index: choose any folder (there are no special requirements), for example '/mnt/all_oss/',
and create in it the directory that will hold your physical index, for instance oss1 (to match the previous steps).
a) Put the file config.xml in this directory (do not change its name). Note that oss.xml refers to it.
b) Create a single folder named 'index' in oss1. At server start, empty index files will automatically be added
inside it.
Example of a basic config.xml:

<configuration>
<indices>
<index name="index" searchCache="100" filterCache="100" fieldCache="500" />
</indices>
<schema>
<analyzers>
<analyzer name="StandardAnalyzer"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
</analyzer>
<analyzer name="TextAnalyzer" tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="en"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="SnowballEnglishFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="fr"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="FrenchStemFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="de"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballGermanFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="nl"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="DutchStemFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="es"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballSpanishFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="it"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballItalianFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="pt"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballPortugueseFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="no"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballNorwegianFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="se"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballSwedishFilter" />
</analyzer>
<analyzer name="TextAnalyzer" lang="fi"
tokenizer="LetterOrDigitTokenizerFactory">
<filter class="LowerCaseFilter" />
<filter class="ISOLatin1AccentFilter" />
<filter class="SnowballFinnishFilter" />
</analyzer>
</analyzers>
<fields default="content" unique="url">
<field name="lang" indexed="yes" stored="yes" />
<field name="title" analyzer="TextAnalyzer" indexed="yes"
stored="compress" termVector="positions_offsets" />
<field name="titleExact" analyzer="StandardAnalyzer" indexed="yes"
stored="compress" termVector="positions_offsets" />
<field name="content" analyzer="TextAnalyzer" indexed="yes"
stored="compress" termVector="positions_offsets" />
<field name="contentExact" analyzer="StandardAnalyzer" indexed="yes"
stored="compress" termVector="positions_offsets" />
<field name="contentBaseType" indexed="yes" stored="yes" />
<field name="url" indexed="yes" stored="yes" />
<field name="urlSplit" indexed="yes" stored="no" analyzer="TextAnalyzer"
termVector="positions_offsets" />
<field name="urlExact" indexed="yes" stored="no"
analyzer="StandardAnalyzer"
termVector="positions_offsets" />
<field name="metaDescription" indexed="no" stored="compress" />
<field name="metaKeywords" indexed="no" stored="compress" />
<field name="host" indexed="yes" stored="yes" />
</fields>
</schema>
<parsers>
<parser class="com.jaeksoft.searchlib.parser.HtmlParser"
sizeLimit="8388608">
<contentType>text/html</contentType>
</parser>
<parser class="com.jaeksoft.searchlib.parser.PdfParser"
sizeLimit="8388608">
<contentType>application/pdf</contentType>
</parser>
<parser class="com.jaeksoft.searchlib.parser.DocParser"
sizeLimit="8388608">
<contentType>application/msword</contentType>
</parser>
<parser class="com.jaeksoft.searchlib.parser.PptParser"
sizeLimit="8388608">
<contentType>application/vnd.ms-powerpoint</contentType>
</parser>
</parsers>
</configuration>
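
Before starting the server, it can be useful to check that a hand-edited config.xml is well-formed and internally consistent. The sketch below is not an OSS tool. The function name is hypothetical, and the two checks it performs (the default and unique attributes must name declared fields, and every analyzer referenced by a field must be declared) are assumptions based on the example above, not documented validation rules.

```python
import xml.etree.ElementTree as ET


def check_schema(config_xml):
    """Return a list of inconsistencies found in a config.xml document."""
    root = ET.fromstring(config_xml)  # raises ParseError if not well-formed
    fields_el = root.find("schema/fields")
    if fields_el is None:
        return ["no <fields> element found under <schema>"]
    analyzers = {a.get("name") for a in root.iter("analyzer")}
    fields = fields_el.findall("field")
    field_names = {f.get("name") for f in fields}
    problems = []
    # The default and unique attributes should refer to declared fields.
    for attr in ("default", "unique"):
        ref = fields_el.get(attr)
        if ref is not None and ref not in field_names:
            problems.append(f"{attr} field {ref!r} is not declared")
    # Every analyzer used by a field should be declared in <analyzers>.
    for field in fields:
        analyzer = field.get("analyzer")
        if analyzer is not None and analyzer not in analyzers:
            problems.append(
                f"field {field.get('name')!r} uses unknown analyzer {analyzer!r}"
            )
    return problems
```

The function takes the content of the file as a string, for example `check_schema(open("/mnt/all_oss/oss1/config.xml", encoding="utf-8").read())`, and returns an empty list when nothing suspicious is found.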