C++ Search Engine

Why?

  • Learn C++ (this was my first C++ project)
  • Learn how search engines work

Setup

Prerequisites

  • CMake
  • A C++ compiler (MSVC, GCC, MinGW, etc.)

Building

git clone https://github.com/bensengupta/search
cd search
# Generate the build files, then compile
cmake .
cmake --build .
# Run the executable: ./search <title file> <query>
./search titles_100k.txt France

titles_100k.txt contains the first 100K Wikipedia page titles, extracted from enwiki-latest-all-titles-in-ns0.gz.

TODO

  • Indexing a document should also remove any previously indexed document with the same ID from the index. Two options:
    • Search through the entire index for the ID and remove every match
    • Or pop the old document with the same ID from storage, find which words it contains, and search and remove only in those words' index entries
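
The second option could look roughly like the sketch below. It is not the code in this repository: `InvertedIndex`, `DocId`, `index_document`, `remove_document`, and `search` are hypothetical names, and it assumes the index maps each word to a set of document IDs while storage remembers which words each document contributed.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical types and names, chosen for illustration only.
using DocId = int;

class InvertedIndex {
public:
    // Index `text` under `id`, first removing any document stored with that ID.
    void index_document(DocId id, const std::string& text) {
        remove_document(id);
        std::unordered_set<std::string> words = tokenize(text);
        for (const std::string& w : words) {
            postings_[w].insert(id);
        }
        storage_[id] = std::move(words);
    }

    // Remove a document by visiting only the posting sets of the words it
    // contained, rather than scanning the whole index.
    // Returns false if no document with this ID is stored.
    bool remove_document(DocId id) {
        auto doc = storage_.find(id);
        if (doc == storage_.end()) return false;
        for (const std::string& w : doc->second) {
            auto p = postings_.find(w);
            if (p == postings_.end()) continue;
            p->second.erase(id);
            if (p->second.empty()) postings_.erase(p);  // drop empty entries
        }
        storage_.erase(doc);
        return true;
    }

    // Return the sorted IDs of documents containing `word` (case-insensitive).
    std::vector<DocId> search(const std::string& word) const {
        std::string key;
        for (unsigned char c : word) key += static_cast<char>(std::tolower(c));
        auto p = postings_.find(key);
        if (p == postings_.end()) return {};
        std::vector<DocId> result(p->second.begin(), p->second.end());
        std::sort(result.begin(), result.end());
        return result;
    }

private:
    // Split on non-alphanumeric characters and lowercase each word, so a
    // title such as "History_of_France" yields {"history", "of", "france"}.
    static std::unordered_set<std::string> tokenize(const std::string& text) {
        std::unordered_set<std::string> words;
        std::string current;
        for (unsigned char c : text) {
            if (std::isalnum(c)) {
                current += static_cast<char>(std::tolower(c));
            } else if (!current.empty()) {
                words.insert(current);
                current.clear();
            }
        }
        if (!current.empty()) words.insert(current);
        return words;
    }

    std::unordered_map<std::string, std::unordered_set<DocId>> postings_;  // word -> IDs
    std::unordered_map<DocId, std::unordered_set<std::string>> storage_;   // ID -> words
};
```

With this layout, re-indexing ID 1 with new text means a search for a word that appeared only in the old text no longer returns 1. The cost of a removal is proportional to the number of distinct words in the old document instead of the size of the whole index, at the price of keeping each document's word set in storage.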
