PhpDig excels at small Web site indexing

Webmasters looking to provide search capabilities for their site would do well to try out PhpDig, a Web spider and search engine written in PHP with a MySQL backend. There are other open source search engines, all of which have their own advantages. PhpDig just happens to suit the needs of my Information Technology for Greenhouses and Horticulture site. Here's how I got it working.

Webmasters with small sites know the problem of providing useful site search capabilities. Typically, visitors enter keywords in a search box and the search engine returns a ranked list of pages related to the query. This is a useful service -- provided the visitor can tune the search and the results returned are reliable and relevant.

Some Webmasters rely on Google for this service. A listing in Google or another mainstream engine is a practical must-have, so it is easy enough to piggyback on that engine with a site-specific search. That works only if Google understands your site and keeps coming back for updates, which isn't always the case.

Large search engines boast of indexing billions of pages, but we are only interested in digesting a hundred pages or so. We need them indexed on a regular basis: daily, or at least more often than Google might do it.

It is also important to know whether our site is responding correctly: serving public pages, hiding private pages, and letting links be followed as intended. Since Google uses algorithms that it doesn't share, we have no way of predicting the indexing results or testing in advance. Advance testing is useful if, for example, you have private files that must not be indexed and you are relying on your robots.txt file to deny access to bots. If we make a spelling mistake in robots.txt, our private pages could end up in Google's cache for the world to read. We also need to control which words are indexed and to customize our own search and result pages.
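To make the robots.txt risk concrete, here is a minimal sketch of an advance check. It is written in Python and uses only the standard library's urllib.robotparser. It is not part of PhpDig, and the sample rules, the paths, and the allowed_paths helper are hypothetical names chosen for illustration. It shows only how one parser reads the file, which may differ from what Google or any other crawler does.

```python
from urllib.robotparser import RobotFileParser


def allowed_paths(robots_txt, paths, agent="*"):
    """Return {path: True/False} saying whether `agent` may fetch each path.

    Hypothetical helper: it parses robots.txt text supplied as a string,
    so nothing is fetched over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


# Assumed example rules; a real site would read its own robots.txt instead.
GOOD = """User-agent: *
Disallow: /private/
"""

# Same rules with one spelling mistake: the parser silently ignores the
# unknown "Dissallow" directive, so nothing ends up blocked.
BAD = """User-agent: *
Dissallow: /private/
"""

paths = ["/index.html", "/private/notes.html"]
good_result = allowed_paths(GOOD, paths)
bad_result = allowed_paths(BAD, paths)

for label, result in (("correct file", good_result), ("misspelled file", bad_result)):
    for path, allowed in result.items():
        print(f"{label}: {path} -> {'allowed' if allowed else 'blocked'}")
```

With the correct file the private path is reported as blocked; with the misspelled file it is reported as allowed, and no error is raised. A check like this can run before a changed robots.txt goes live.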

Enter PhpDig.
