PhpDig excels at small Web site indexing

Webmasters looking to provide search capabilities for their site would do well to try out PhpDig, a Web spider and search engine written in PHP with a MySQL backend. There are other open source search engines, all of which have their own advantages. PhpDig just happens to suit the needs of my Information Technology for Greenhouses and Horticulture site. Here's how I got it working.

Webmasters with small sites know the problem of providing useful site search capabilities. Typically, visitors enter keywords in a search box and the search engine returns a ranked list of pages related to the query. This is a useful service -- provided the visitor can tune the search and the results returned are reliable and relevant.
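To make the keyword-in, ranked-list-out idea concrete, here is a minimal sketch in Python of a tiny inverted index that ranks pages by how often the query words occur in them. It is only an illustration of the general technique, not PhpDig's actual indexing or ranking method; the page paths, the page text, and the function names `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to a Counter of {page: occurrences of that word}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Return pages ranked by the total occurrences of the query words."""
    scores = Counter()
    for word in tokenize(query):
        # .get() avoids creating empty entries for unknown words.
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by page path.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked]


# Hypothetical pages standing in for a small site.
PAGES = {
    "/greenhouse.html": "Greenhouse climate control and greenhouse ventilation",
    "/irrigation.html": "Drip irrigation for greenhouse crops",
    "/contact.html": "Contact the site editor",
}

INDEX = build_index(PAGES)
print(search(INDEX, "greenhouse ventilation"))
```

A real engine would also drop common words, weight titles differently from body text, and store the index in a database, but the basic shape of the problem is the same.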

Some Webmasters rely on Google for this service. A listing in Google or another mainstream engine is a must-have in practical terms, so it is easy enough to piggyback on the main engine with a site-specific search, provided Google understands your site and keeps coming back for updates -- but this isn't always the case.

Large search engines boast of indexing billions of pages, but we are only interested in digesting a hundred pages or so. We need them indexed on a regular basis, daily or at least more often than Google might do it.

It is also important to know if our site is responding correctly by providing public pages, hiding private pages, and following links correctly. Since Google uses algorithms that it doesn't share, we have no way of predicting the indexing results or doing any testing in advance. Advance testing is useful if, for example, you rely on your robots.txt file to keep bots away from private files and want to be sure those files will not be indexed. If we make a spelling mistake in robots.txt, our private pages could end up in Google's cache for the world to read. We also need to control what words are indexed and customize our own search and result pages.
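The robots.txt risk can be checked before any crawler visits. The following is a minimal sketch, assuming Python's standard `urllib.robotparser` module is an acceptable stand-in for how a well-behaved bot reads the file; real crawlers may interpret edge cases differently. The file contents, the paths, the `ExampleBot` user agent, and the helper name `blocked_paths` are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="ExampleBot"):
    """Return {path: True if robots.txt denies the path to user_agent}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: not parser.can_fetch(user_agent, path) for path in paths}


# Hypothetical robots.txt that correctly hides the private directory.
GOOD = """User-agent: *
Disallow: /private/
"""

# The same file with the directive misspelled. The parser ignores the
# unknown line, so nothing is actually disallowed.
BAD = """User-agent: *
Dissallow: /private/
"""

PATHS = ["/private/report.html", "/index.html"]

for label, text in (("good", GOOD), ("bad", BAD)):
    print(label, blocked_paths(text, PATHS))
```

Running a check like this against the live robots.txt after every edit is a cheap way to catch a typo before a crawler does.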

Enter PhpDig.
