# Find corrupted images

So I accidentally deleted all my pictures, and restoring them left me with thousands of corrupted images. To sort them out, I wrote a script in MATLAB using the Image Processing Toolbox.


Using MATLAB, I first tried to work out what a corrupted image actually looks like. Opening one with the Image Processing Toolbox produced warnings like these:

g = imread(s);
Warning: JPEG library error (8 bit), "Corrupt JPEG data: premature end of data segment".
Warning: JPEG library error (8 bit), "Invalid JPEG file structure: two SOI markers".
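Both warnings are about JPEG framing: a file should open with an SOI (start of image) marker, the bytes FF D8, and close with an EOI (end of image) marker, FF D9. Here is a rough Ruby sketch of a cheap pre-check based on that. The jpeg_framing_ok? name is made up for illustration, and this is only a heuristic: a file can pass and still decode badly, and some valid files carry extra bytes after EOI and would fail.

```ruby
SOI = "\xFF\xD8".b  # Start Of Image marker
EOI = "\xFF\xD9".b  # End Of Image marker

# Cheap structural check on raw JPEG bytes: begins with SOI and ends with EOI.
# A missing EOI is a hint that the file was truncated; it is not proof either way.
def jpeg_framing_ok?(bytes)
  data = bytes.b  # work on a binary copy so encodings never clash
  data.start_with?(SOI) && data.end_with?(EOI)
end

# Usage on a file (hypothetical path):
# puts jpeg_framing_ok?(File.binread('example.jpg'))
```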

Also, a histogram of such a file looked like this:

So the only challenge is to find that spike, or more simply an unusually high fraction of pixels equal to 128, the middle of the 8-bit range. Simple enough.
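To make the heuristic concrete outside of MATLAB, here is a minimal Ruby sketch that works on a plain array of pixel values. The gray_fraction and too_much_gray? names and the sample data are made up for illustration; the 0.07 threshold is the one used in the MATLAB script in this post.

```ruby
# Fraction of samples equal to `value` (128 is the middle of the 8-bit range).
# `pixels` is a hypothetical flat array of integers in 0..255.
def gray_fraction(pixels, value = 128)
  return 0.0 if pixels.empty?
  pixels.count(value).to_f / pixels.size
end

# Flag an image when the fraction of mid-gray samples exceeds the threshold.
def too_much_gray?(pixels, threshold = 0.07)
  gray_fraction(pixels) > threshold
end

# Illustrative data: a flat histogram, then the same data padded with gray.
healthy = (0..255).to_a * 4
damaged = healthy + [128] * 512

puts gray_fraction(healthy)
puts too_much_gray?(damaged)
```

The threshold is a judgment call: a photo with a large flat gray area can trip it, so it is worth eyeballing whatever gets flagged.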

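cd('/media/95543211-fd8f-4fc9-9b24-3a787113e4c2/+JPEG');

% List the directory and drop subfolders (including '.' and '..').
jpegs = dir('.');
jpegs = jpegs(~[jpegs.isdir]);

% Only process the first 100 files per run.
num_files = min(100, length(jpegs));

% Make sure the destination folders exist.
if ~exist('too_much_gray', 'dir'), mkdir('too_much_gray'); end
if ~exist('noerr', 'dir'), mkdir('noerr'); end

% Fraction of mid-gray samples per file (NaN if the file could not be read).
G = nan(1, num_files);

for i = 1:num_files
    name = jpegs(i).name;
    disp(['working: ' name]);
    try
        I = imread(name);
        % Share of samples equal to 128, counted over all channels.
        gray_percent = nnz(I == 128) / numel(I);
        G(i) = gray_percent;
        if gray_percent > 0.07
            disp(['moving . . . ' name]);
            movefile(name, ['too_much_gray/' name]);
        else
            disp(['good: ' name]);
            movefile(name, ['noerr/' name]);
        end
    catch err
        % Unreadable files are left where they are.
        disp(['skipped: ' name ' (' err.message ')']);
    end
end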

Then a shell script to list the images that might be corrupt, and a Ruby script to move them (yeah, really inefficient, I know). That way, files that might crash MATLAB are at least removed. The shell script prints every file that ImageMagick's identify cannot read:

#!/bin/bash

# Print the name of every file that ImageMagick's identify cannot read.
for f in *
do
    if ! identify "$f" &> /dev/null; then
        echo "$f"
    fi
done


And the Ruby script to move them, with the file list pasted in by hand (yes, this is silly):

#!/home/bonhoeffer/.rvm/rubies/ruby-1.9.3-p286/bin/ruby
filez = <<EOF
__003999
__026328
__029322
__032335
__035823
__035842
__036090
__038688
__039670
__048554
__048561
__048634
19991215_22_43_43_033877
19991215_22_43_43_034820
19991215_22_43_43_049844
19991215_22_43_56_038011
19991215_22_44_16_010202
20070729_14_42_57_048540
EOF

names = filez.split(' ')
puts names.size

# Move each listed file into matlab_bad/ (the directory must already exist).
names.each do |f|
  system('mv', f, "matlab_bad/#{f}")
end
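The pasted file list could be avoided by letting one script do both the check and the move. Here is a minimal sketch of that idea, assuming ImageMagick's identify is installed and on the PATH. The readable_image? and quarantine_unreadable names are hypothetical, and the check is passed in as a block so it can be swapped for something else.

```ruby
require 'fileutils'

# True when ImageMagick's `identify` exits successfully for the file.
# Assumes `identify` is installed and on the PATH.
def readable_image?(path)
  system('identify', path, out: File::NULL, err: File::NULL) == true
end

# Move every regular file in `dir` that fails the block's check into
# `dir/bad_dir`, creating that folder if needed. Returns the moved file names.
def quarantine_unreadable(dir, bad_dir = 'matlab_bad')
  target = File.join(dir, bad_dir)
  FileUtils.mkdir_p(target)
  Dir.children(dir).sort.select do |name|
    path = File.join(dir, name)
    next false unless File.file?(path)  # skip folders, including bad_dir
    next false if yield(path)           # the check passed, leave it alone
    FileUtils.mv(path, File.join(target, name))
    true
  end
end

# Usage, run from the picture folder:
# moved = quarantine_unreadable('.') { |path| readable_image?(path) }
# puts "moved #{moved.size} file(s)"
```

Since a missing identify binary would make every file look unreadable, it is worth trying this on a copy of a few files first.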