
ir-course-uoi

This project for the Information Retrieval course @cse.uoi.gr implements a search engine for Wikipedia articles using Apache Lucene.

In ir-course-uoi-data, you can find the implementation of a custom crawler and an HTML preprocessor that extracts text from the scraped HTML pages.
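To illustrate what an HTML preprocessing step of this kind might look like, here is a minimal sketch in plain Java. It is not the code from ir-course-uoi-data: the class name `HtmlTextExtractor` and the method `extractText` are hypothetical, and regex-based tag stripping is a simplification that a real HTML parser would handle more robustly.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not the project's actual preprocessor.
public class HtmlTextExtractor {
    // <script> and <style> elements, including their contents.
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
            "<(script|style)\\b[^>]*>.*?</\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // HTML comments.
    private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    // Runs of whitespace, collapsed to one space at the end.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    public static String extractText(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = COMMENT.matcher(text).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Decode a few common entities; "&amp;" goes last to avoid double decoding.
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><h1>Airbus A380</h1>"
                + "<p>A wide-body &amp; double-deck airliner.</p></body></html>";
        System.out.println(extractText(html));
    }
}
```

The resulting plain text is the kind of input that could then be passed to an indexer such as Lucene. Only a handful of entities are decoded here, so pages with numeric or less common entities would need a fuller decoding step.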

This search engine supports multiple features. For example:

Screenshots

0-main-window.png 1-airplane.png 2-advanced-search.png 3-Ellada.png 4-a380.png 5-nokia-5g.png 6-no-results.png 7-indexing.png

License

GNU GENERAL PUBLIC LICENSE Version 2, June 1991

For the license statements of third-party software, please refer to lucene-8.5.1 and javafx-sdk-11.0.2.