Tuesday, January 22, 2013

Comparing Image File Names Using Irfanview, Excel

A sample photo image directory.

Introduction

Consider this common scenario for people working on commercial website projects. You have a bunch of product photos but not photos of every product. You need to create two lists: one of the images you already have and another with other images, some of which you might not have.
This “cheat sheet” uses Irfanview ® and Microsoft Excel ® to create these lists and compare the data.

Preliminary Steps

Your first steps are gathering your information. You should have a list of image file names, possibly already in Excel. If not, duplicate the following procedure on the folder or directory where your existing images are found.
You will also want to locate the directory where the new images—including the missing shots—can be found.
If you already have a list of image file names, copy and paste the list into a column in Excel under the header “existing” and save the worksheet.

Irfanview Steps

These steps work on a folder of images where you are using Irfanview’s thumbnail function to view them.
Start by opening Irfanview and the folder where the images are found. Hit “T” to view them as thumbnails. Next hit “A” to select all images.
Now right click on the images and choose “Save selected file names as text” from the options presented.
You now have your list of file names without the associated images.

Initial Excel Steps

The Long Way
Use this method at least once so you understand how to do it.
Open Excel then, under File Open, have it open All Files and select your text file. Open the images as Fixed width and follow the Screen import wizard.
Next, see how the image file names are presented. Assuming parts of the file name are spread across multiple columns, delete every column except the final one with the actual image name. An example is saving only names such as Test1.JPG instead of C:\My_Documents\project_image_files\Test1.jpg when all of the other text is stored in additional columns.
An excel list with original file names spread across several columns.
Now you should have file names like Shoes\redmens10m.JPG and Shoes\redwomens6.JPG.
Shrink those names down so they resemble those in the list of images you already have. One way of doing this is counting the number of spaces from the right—including the period for your “dot JPG”—for your longest length name.
Now use Excel’s RIGHT Function to gather just the text characters you want to keep.
Assuming your original filename with all of the data is in column A and you want to save the first 8 characters from the right, your formula (in cell B1) will look like this:
=Right(A1,8).
Delete or modify any file names where extra characters still exist. It’s far easier to delete unneeded text than add critical information back in.
Cell B1 should just have the characters you need for comparison purposes. Copy and paste special values (Alt+ESV) to replace your formula with the actual file names and that step is finished.
Use the same procedure on your existing images if their file names are not already stored in an Excel worksheet.
Use this method when  the text you want to delete is identical with every image file name.
The Short Way
Assuming you have the same set of characters at the beginning of every file name, use search and replace (Crtl+H) to search for them and replace them with nothing. Hit Replace All and the extra characters are gone leaving just the rest of your original file name. It’s a heck of a lot quicker than the long way.

Comparing Lists

Now you’ve got a list of new file names that looks identical in format to what you already had. Do a vlookup (hit the F1 key, type in Vlookup function and read the help article if you are unfamiliar with it) between your two spreadsheets and sort the results.
Assuming your new images are stored in Column A and your old images in column B on a different worksheet, your formula should look something like:
·         =VLOOKUP(A1,'test 2.txt'!$B:$B,1,FALSE)
·         A1 is your starting cell
·         ‘test 2.txt’ is the spreadsheet where you are looking for the comparison data
·         $B:$B is the column where the data can be found
·         1 is the number of cells you are comparing
·         False indicates you want an exact match. True would give you an inexact match, which is generally avoided.
Assuming your results are in Column C, do a sort and look for the #N/A’s. An N/A indicates your worksheet of existing file names lacks that particular image. The name will appear if you already have it.
All you have to do now is harvest the images listed as N/A and you’re done.

Editor’s Note: A detailed look at using Irfanview can be found in my archives. Email me directly at davereyn83@gmail.com for a copy.

No comments:

Post a Comment