PhpDig excels at small Web site indexing

Webmasters looking to provide search capabilities for their site would do well to try out PhpDig, a Web spider and search engine written in PHP with a MySQL backend. There are other open source search engines, all of which have their own advantages. PhpDig just happens to suit the needs of my Information Technology for Greenhouses and Horticulture site. Here's how I got it working.

Webmasters with small sites know the problem of providing useful site search capabilities. Typically, visitors enter keywords in a search box and the search engine returns a ranked list of pages related to the query. This is a useful service -- provided the visitor can tune the search and the results returned are reliable and relevant.

Some Webmasters rely on Google for this service. A listing in Google or another mainstream engine is a practical necessity anyway, so it is easy enough to piggyback on that engine with a site-specific search. This only works if Google understands your site and keeps coming back for updates, which isn't always the case.

Large search engines boast of indexing billions of pages, but we are only interested in digesting a hundred pages or so. We need them indexed on a regular basis: daily, or at least more often than Google might do it.

It is also important to know whether our site is responding correctly: serving public pages, hiding private pages, and exposing links that a spider can follow. Since Google uses algorithms that it doesn't share, we have no way of predicting the indexing results or testing anything in advance. Advance testing matters if, for example, you have private files that must not be indexed and you are relying on your robots.txt file to keep bots away from them. If we make a spelling mistake in robots.txt, our private pages could end up in Google's cache for the world to read. We also need to control which words are indexed and to customize our own search and result pages.
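The robots.txt risk described above can be checked before any spider visits the site. The following is a minimal sketch using Python's standard `urllib.robotparser` module, not anything that ships with PhpDig. The helper name `blocked_paths`, the two sample robots.txt files, and the paths are hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the paths that this robots.txt text denies to user_agent."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # access is needed to test a draft robots.txt.
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical robots.txt with a misspelled directive ("Dissallow").
# Parsers skip directives they do not recognize, so nothing is blocked.
ROBOTS_WITH_TYPO = """\
User-agent: *
Dissallow: /private/
"""

# The same file with the directive spelled correctly.
ROBOTS_FIXED = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical site paths: one public page, one private page.
PATHS = ["/index.html", "/private/notes.html"]

if __name__ == "__main__":
    print(blocked_paths(ROBOTS_WITH_TYPO, PATHS))  # prints []
    print(blocked_paths(ROBOTS_FIXED, PATHS))      # prints ['/private/notes.html']
```

With the typo in place, the private page is reported as fetchable, which is exactly the failure that would otherwise only show up once a public engine had cached the page.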

Enter PhpDig.
