I decided that the website had grown to the point where it needed a search engine. I didn’t want to use a Google search or an embedded Yahoo search; they look disgusting. I also didn’t want to use any of the third-party search scripts, since most of them were costly and all of them had commercial licenses. I like free software. So I set out to write my own.
General Considerations
Let me start off by saying that what I have below is not a magic, easy solution to writing a search engine. If you are planning to write the world’s “next Google”, I have a recommendation: go to http://bing.com, Microsoft’s “next Google”. Notice how “copycatted” it looks. Then search around (on Google, please) to find out exactly how popular it is. Hint: not very. Microsoft tried and failed. Don’t waste your time.

My problem is much smaller: an internal search engine that only has to cover a small number of pages on a low-traffic site, so it doesn’t have to be super fast. When I told my friends at work about the project, the responses ranged from “maniac” to “shoot yourself now rather than afterwards; save some time”. And that’s with a highly simplified version of the problem.