While the design of the modules that compose the engine (crawler, indexer, and query
processor) will be discussed in class, the students will be responsible for implementing them
in C++ and validating their implementations during the course. Additionally, the students will
explore machine learning techniques to improve the ranking of the results.
Evaluation will consist of three implementation assignments (crawler, indexer, and query
processor) and two exams.
This is a rigorous and intensive course with a non-trivial load of theoretical material and practical
tasks in the form of coding in C++. By the end of the course, the students should have gained a
technical understanding of how a search engine works and of the challenges at the frontier of the
technology; they should understand why search is hard.
Required Textbook
Ricardo Baeza-Yates & Berthier Ribeiro-Neto. Modern Information Retrieval. Pearson, 2011, 2nd
Edition.
Slides can be found here.
Additional References
1. Ian Witten, Alistair Moffat, Timothy Bell. Managing Gigabytes. Morgan Kaufmann, 1999, 2nd
Edition.
2. Christopher Manning, Prabhakar Raghavan, Hinrich Schütze. Introduction to Information
Retrieval. Cambridge University Press, 2008.
3. Bruce Croft, Donald Metzler, Trevor Strohman. Search Engines: Information Retrieval in
Practice. Pearson, 2009.
4. Charles Clarke, Gordon Cormack, Stefan Büttcher. Information Retrieval: Implementing and
Evaluating Search Engines. MIT Press, 2016.
5. Gerard Salton. Introduction to Modern Information Retrieval. McGraw Hill, 1983.