
Cygwin: The Answer to Indexing Windows 7

How to Install Cygwin on Windows

Required: Obtain Cygwin

Follow the provided link to the installation binary appropriate to your version of Windows, and pay attention to any details provided regarding setup.exe (to the best of this author’s knowledge, the file to download will likely be named setup.exe). Good luck!

Before proceeding, please assume you are in search of some means of indexing all files currently accessible to your PC configuration.

Much distraught with the Microsoft Windows 7{*1} Operating System, I sought a means of indexing the system that would include in the index those files on my NAS device(s). It’s important to note that my issues with indexing on Windows 7 Home Premium are NOT relevant to Windows XP fortified with the Windows Search 4 desktop search add-on, available through MS Downloads for that Operating System. Likewise, my displeasure with the limited features of the Home Premium edition of Windows 7 is NOT relevant to more expensive{*2} versions of the Operating System.

For all intents and purposes, let us suppose the reader is likewise limited by the default indexing capacity of Windows 7 Home Premium, and righteously displeased, for the Search facility limits its index to local disks (i.e. those hard drives which the O/S sees as hardware connected to the main circuit board, or motherboard, itself). I present the following as a viable solution: install an up-to-date instance of Cygwin on the system.

Illustrated: Indexing the Windows System via Cygwin

It’s really quite simple once you’ve got Cygwin installed, as many *NIX operations are known to be. Running as an administrator account, just copy and paste the commands from the steps below into your Cygwin Bash terminal command line.

Anyone familiar with Linux will likely be familiar with the ease of use afforded through “locate”, “mlocate”, or “slocate”. Cygwin offers the same functionality! Since discovering this, I’ve been using it with great success, and I’ve never looked back (i.e. gone looking for another, different, better means of finding stuff on the Windows system). This simply does what I need. It’s fast and accurate, and it accepts Regular Expressions. What more could the User possibly want, or need!?
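Before going further, it may help to confirm the tools are actually present. What follows is only a minimal sketch: the function name check_tools is made up for illustration, and I’m assuming locate and updatedb arrive with Cygwin’s findutils package (worth checking in setup.exe if either turns up missing).

```shell
#!/bin/bash
# check_tools (hypothetical helper): report whether each named command
# is on the PATH. Returns non-zero if any one of them is missing.
check_tools() {
    local missing tool
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: the two commands this article relies on.
# Assumption: both are provided by the findutils package.
check_tools locate updatedb || echo "Select the missing package in setup.exe."
```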

Step One: Index the System

Get started by creating the initial database, and performing the initial index:

updatedb --output=/usr/local/var/locatedb/locate02.db --localpaths=/cygdrive --netpaths=//PATHTO/NASDEVICE

Depending on the size of the disks and the amount of data to index, this command might take anywhere from a few minutes to several hours.
For me, it takes about 45 minutes, with about 1TB in local disks and 1TB in NAS storage.
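If you’d rather keep the paths in one place, the indexing step can be wrapped in a small script. This is a sketch under a few assumptions: the function name build_index is hypothetical, //PATHTO/NASDEVICE remains the placeholder from above (substitute your own share), and the mkdir line is there only because the database directory may not exist yet on a fresh install.

```shell
#!/bin/bash
# Paths from Step One, kept in variables so they are set in one place.
DB=/usr/local/var/locatedb/locate02.db   # database file used in this article
LOCAL_PATHS=/cygdrive                    # local drives as Cygwin sees them
NET_PATHS=//PATHTO/NASDEVICE             # placeholder: substitute your NAS share

# build_index (hypothetical helper): make sure the database directory
# exists, then run the same updatedb command shown in Step One.
build_index() {
    mkdir -p "$(dirname "$DB")" || return 1
    updatedb --output="$DB" \
             --localpaths="$LOCAL_PATHS" \
             --netpaths="$NET_PATHS"
}

# Usage: uncomment to index (this can take a long time).
# build_index
```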

Step Two: Locate What is Desired

After the initial indexing has finished, you’re essentially finished, aside from any locate commands you might wish to execute.

How do I submit a locate query? Actually, there are a couple of options here. If you prefer to use regular expressions, try your hand at the following:

The -i option tells locate not to concern itself with the case of the object (i.e. perform the query case-insensitively), while -r lets locate know that what follows should be evaluated as a regular expression. The pattern is wrapped in single quotes so the shell passes it through untouched, and it takes no surrounding slashes; WRITE-HERE is a placeholder for your own expression.

locate -d /usr/local/var/locatedb/locate02.db --regextype=egrep -i -r '^/cygdrive/[a-z]/.*WRITE-HERE$'
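A pattern can be tried out safely before pointing it at the database. The sketch below uses grep -E as a stand-in for locate’s egrep dialect, on the assumption that the two behave much the same for simple patterns like this one. The sample paths and the function name try_pattern are made up for illustration.

```shell
#!/bin/bash
# Any drive letter, any directory, file name ending in .psd or .jpg.
# Single quotes stop the shell from touching [a-z], *, | and $.
PATTERN='^/cygdrive/[a-z]/.*\.(psd|jpg)$'

# try_pattern (hypothetical helper): filter paths read from stdin through
# the pattern, case-insensitively, with grep -E standing in for locate -i -r.
try_pattern() {
    grep -E -i -- "$PATTERN"
}

# Sample paths, invented for this example.
printf '%s\n' \
    '/cygdrive/c/Users/me/Pictures/Logo.PSD' \
    '/cygdrive/d/scans/page1.jpg' \
    '/cygdrive/c/notes/psd-howto.txt' \
    '/home/me/banner.psd' | try_pattern

# The real query against the database would then be:
#   locate -d /usr/local/var/locatedb/locate02.db --regextype=egrep -i -r "$PATTERN"
```

Only the first two sample paths should come back: the third ends in .txt, and the fourth does not start with /cygdrive.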

If you don’t want to fool around with RegEx in your locate query, I recommend submitting something like the following. In my experience, I’ve found what I wanted much faster using this method. I realize it’s satisfying when a regular expression you composed yourself returns the desired results. I’ve been there, done that, and I can’t claim it’s the wrong way, but I’ve found it’s simply not worth the effort, since the locate database has everything indexed. Perhaps you’re better at composing regular expressions than I, and you’ll prefer that option. To each his own: either method has its pros and cons, and neither is necessarily better than the other.

locate -d /usr/local/var/locatedb/locate02.db -i '*something*i*wish*to*locate*.extension'

For example…

locate -d /usr/local/var/locatedb/locate02.db -i '*photoshop*.psd'
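Typing the -d path every time gets old, so a small wrapper function might be worth dropping into your ~/.bashrc. This is just one way to do it: the name loc and the word-joining behaviour are my own invention, not anything locate provides.

```shell
#!/bin/bash
# Database written by updatedb in Step One.
LOCATE_DB=/usr/local/var/locatedb/locate02.db

# loc (hypothetical helper): case-insensitive lookup. The words given are
# joined into a single glob, so `loc photoshop .psd` searches for
# '*photoshop*.psd*'. Quoting "$pattern" keeps the shell from expanding it.
loc() {
    local pattern word
    pattern='*'
    for word in "$@"; do
        pattern="${pattern}${word}*"
    done
    locate -d "$LOCATE_DB" -i "$pattern"
}

# Usage:
#   loc photoshop .psd
```

Note the trailing star: unlike the hand-typed query, this also matches names that continue past the last word.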

If the reader is working with a functional Cygwin installation and has never tried this technique before, I hope he or she finds as much satisfaction in using it as I have! I’m pleased to share it with you.

On Occasion: Re-Index

If Cygwin is operating properly (assuming the software functions as it did when this article was first authored), any locate query executed while the database is more than a week old will display a reminder indicating the locate db is out-of-date.
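You don’t have to wait for the reminder. The age of the database file can be checked directly, as in this rough sketch. The function name db_is_stale and the 7-day default are my own choices for illustration.

```shell
#!/bin/bash
# db_is_stale (hypothetical helper): succeed (exit 0) if the database file
# is missing or old. find -mtime +N ignores fractions of a day, so the
# default +7 means the file was last written 8 or more days ago.
db_is_stale() {
    local db days
    db=$1
    days=${2:-7}
    [ -f "$db" ] || return 0
    [ -n "$(find "$db" -mtime +"$days" -print)" ]
}

# Usage:
#   if db_is_stale /usr/local/var/locatedb/locate02.db; then
#       echo "Time to run updatedb again."
#   fi
```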

Because files may be moved and new files may be created, the locate database gradually becomes less accurate in relation to the actual contents of the file system as it grows older. Why? New files, deleted files, and moved files will not be reflected in the existing database. New files are created more often than we realize: for example, a new file appears whenever anything is downloaded from the Internet and stored on the local file system of the PC (or whatever device is running the relevant software; not every device will meet the prerequisites indicated above, such as Cygwin installed under Windows).

Whether the result of new files, existing files moved to new locations, or deleted files, all file system activity happens unbeknownst to the locate index database. That is, the locate db cannot know about such changes until updatedb is executed again to scan the file system and record its current state to the database path given in the updatedb command line options. While some system indexes update in real time, locate has no such option for automatically updating itself. The locate database must therefore be updated explicitly to accommodate file system changes.

How do we deal with this problem? It’s simple: execute updatedb, the same as used for the initial indexing! Advanced users might handle the process using a cron job.
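For the cron route, something like the following might serve as a starting point. It is a sketch, not a tested recipe: it assumes the Cygwin cron package is installed and its service is running, and the script name reindex.sh, its install path, and the log location are all hypothetical.

```shell
#!/bin/bash
# reindex.sh (hypothetical name): re-run the Step One command and
# append a timestamped result line to a log file.
DB=/usr/local/var/locatedb/locate02.db
LOG=${LOG:-$HOME/reindex.log}            # hypothetical log location

reindex() {
    if updatedb --output="$DB" --localpaths=/cygdrive --netpaths=//PATHTO/NASDEVICE
    then
        echo "$(date '+%F %T') updatedb ok" >> "$LOG"
    else
        echo "$(date '+%F %T') updatedb FAILED" >> "$LOG"
        return 1
    fi
}

# Example crontab entry (edit with `crontab -e`): every Sunday at 03:00.
#   0 3 * * 0  /bin/bash /usr/local/bin/reindex.sh
# When the script is run by cron, call the function on the last line:
# reindex
```

A weekly schedule keeps the database inside the one-week window mentioned above; adjust to taste.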

Further Study: Cygwin/ Linux Operations

Question: When updatedb is executed again after the initial indexing has completed (e.g. as described in Step One, at regular intervals of approximately 7-10 days, or whenever the locate database is too out-of-date to accurately reflect the current filesystem contents), does it act upon the existing data created by the initial indexing, or does it re-write the database from scratch? For the locate query, it doesn’t matter, as the end result will be the same. The answer to the question most likely resides in the man pages, which should be installed along with Cygwin. In my experience, the man pages that come with Cygwin are written specifically for the Cygwin configuration.

I recommend the reader perform the following, from the Cygwin command line:

info locate

I prefer the info interface over the man pages, which would be:

man locate

Again, this is a matter of to each his own.

Now, go forth and LOCATE something! Enjoy.
– @ajaxStardust

  1. Displeasure RE Windows 7 Operating System: The particulars of my opinion regarding Microsoft Windows 7 are beyond the scope of this article, and as well, irrelevant to the tutorial contained herein.
  2. More Expensive Versions [of Windows 7]: For lack of a better term, Windows 7 Ultimate Professional Supreme with Extra Cheese (whatever the name is; clearly, I couldn’t care much less) supposedly does its indexing regardless of whether the index should include a “local disk”, a NAS device, or any other source of files the User might access, for example, via Windows Explorer.
  3. Updated by @ajaxStardust, 2013-03-31 20.07.43
