Wednesday, September 12, 2012

Act 11. Life is not black and white: color segmentation.

My nieces are wriggly creatures... especially when we go out. One is liable to take off without sparing me a second glance, and one likes to hide behind bushes and yell "Boo!". My sister tends to dress them up in bright colors to make them stand out even in a crowd. Now, we'll try to let the machine find them for us.
Adorable nieces Bea (left) and Annika (right) (filename test.png)
To teach the machine what colors to look for, we crop out a representative sample, which I'm calling spot. For humans, it's easy to tell which colors are nearly the same regardless of their shade. For the computer, colors are simply RGB values, and the same cloth under different shading can have quite different RGB values, so we need a representation in which those shades still end up close together.
sample cloth (spot.png)



Colors that are close together appear as a connected area in the normalized chromaticity space (NCS). Since it's easier to interpret data in two dimensions, instead of keeping the raw RGB values we normalize each pixel by its total brightness: r = R/(R+G+B) is mapped on the x axis, g = G/(R+G+B) on the y axis, and the blue coordinate is then fixed by
b = 1 - r - g
To model the connected area in the NCS, we take a spread along the r axis, a separate spread along the g axis, and multiply the two spreads together to make a symmetrical blob. Each pixel of the test picture is then evaluated: if its chromaticity has a high enough probability under this blob, it is kept. Here's the code that does that, using a Gaussian function as the probability spread.

SIP code
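Since that code appears above only as a screenshot, here's a minimal sketch of the same idea, assuming the SIP toolbox and the filenames from the captions (test.png and spot.png); the variable names and the 0.1 cutoff at the end are my own choices, not the original values.

// Parametric (Gaussian) segmentation in normalized chromaticity space.
patch = double(imread("spot.png"));    // representative sample
img   = double(imread("test.png"));    // test picture

// Convert both images to normalized chromaticity coordinates r and g.
Ip = patch(:,:,1) + patch(:,:,2) + patch(:,:,3);  Ip(Ip == 0) = 1;
rp = patch(:,:,1) ./ Ip;   gp = patch(:,:,2) ./ Ip;
It = img(:,:,1) + img(:,:,2) + img(:,:,3);  It(It == 0) = 1;
rt = img(:,:,1) ./ It;     gt = img(:,:,2) ./ It;

// Fit independent Gaussian spreads along r and g from the sample.
mr = mean(rp);  sr = stdev(rp);
mg = mean(gp);  sg = stdev(gp);

// Joint probability = product of the two spreads, evaluated per test pixel.
p = exp(-(rt - mr).^2 / (2*sr^2)) .* exp(-(gt - mg).^2 / (2*sg^2));
imshow(bool2s(p > 0.1));    // arbitrary cutoff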

Extracted picture using a gaussian distribution

It was able to successfully extract the jackets, but it also detected the lips and part of the chin (presumably because some color from the jacket reflected off their skin, giving it a similar color). Adding the following lines to the code:
A little cleaning

The connected areas which were less than 300 pixels were ignored, giving us a cleaner image.
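The cleaning code above is also a screenshot, so here's a rough sketch of the area filtering it describes; the 300-pixel cutoff is from the text, while the variable names (seg for the binary segmentation result) are mine.

// Drop connected regions smaller than 300 pixels (bwlabel from SIP assumed).
[L, n] = bwlabel(seg);          // seg: binary result of the segmentation above
clean = seg;
for i = 1:n
    if length(find(L == i)) < 300 then
        clean(L == i) = 0;      // erase the small blob
    end
end
imshow(clean);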
Cleaner picture

Histogram backprojection

Histogram backprojection works by taking the colors of your sample, and checking how often they appear. The test image is then evaluated - if a pixel color appears often in the sample, then it should be extracted, otherwise it is ignored.

I used the code given by Ma'am Jing to create a 2 dimensional histogram, then added the following lines at the end.


Line 20 basically says that as long as a color appeared at least once in the sample, it should be extracted in the test image. x and y map the red and green channels of the test image to the range 1-255 and bin them into integers. The for loops get the RG color of each pixel, look it up in the histogram, and place the histogram value in a blank image.
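Ma'am Jing's histogram code isn't reproduced here, so the sketch below builds its own 2D histogram; the 32-bin choice, the variable names, and the reuse of the chromaticity arrays (rp, gp, rt, gt) from the earlier sketch are all my own assumptions.

// Histogram backprojection in the r-g chromaticity plane.
nbins = 32;
rpi = int(rp*(nbins-1)) + 1;   gpi = int(gp*(nbins-1)) + 1;   // binned sample patch
hist2d = zeros(nbins, nbins);
for k = 1:length(rpi)
    hist2d(rpi(k), gpi(k)) = hist2d(rpi(k), gpi(k)) + 1;
end
hist2d = hist2d / max(hist2d);                                 // normalize to [0,1]

rti = int(rt*(nbins-1)) + 1;   gti = int(gt*(nbins-1)) + 1;    // binned test image
backproj = zeros(rt);
for i = 1:size(rt, 1)
    for j = 1:size(rt, 2)
        backproj(i, j) = hist2d(rti(i, j), gti(i, j));         // look up each pixel
    end
end
imshow(bool2s(backproj > 0));   // keep anything that appeared at least once in the sample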

It was somewhat able to get the rough area of both jackets, but the extraction using the parametric method was much better. It can be noticed that Bea's jacket (from which spot was cropped) was detected much more densely than her sister's jacket. The detection can be improved by cropping a larger sample area next time.
Comparison

In terms of speed, histogram backprojection should run faster, but it was actually slower in my code because of the indexing. In terms of segmentation success, backprojection is very dependent on the representative sample you get: if a nearby shade does not appear in the sample, it will not be detected. On the other hand, overdetection was the initial problem with the parametric (probability) method.

Histogram backprojection would work better if the selected representative cropped image is non-uniform or has two colors, since the parametric method just takes the average and basically restricts the colors to a symmetrical blob in the NCS.

Fun stuff to do

I'm looking forward to using a similar code to keep track of my nieces in a video!

-----
The code used for the 2D histogram was created by Ma'am Jing. I thank Rommel Bartolome for a discussion that reaffirmed my understanding of what to do.

I'm giving myself a grade of 9/10 for being able to do both methods, the -1 point would be because the histogram backprojection was not entirely successful.

**Figure 3 comes from the Activity 11 manual prepared by Dr. Maricor Soriano.

Monday, September 3, 2012

Act 10. Feeling like a robot? Morphological operations 3/3: Looping through images

Our objective for this week is to be able to perform a simple task - specifically tagging objects with irregular features - and automating it for multiple pictures.

The given simple task is to isolate the bigger circles from this picture. This is similar to the detection of possible cancer cells in a sample of normal cells.

Punched out circles with a few larger circles mixed in.
The first step to figuring out which ones are irregular is to create a standard for regular cells using this picture which has no "cancer cells".

Calibration picture

To make it easier to process, we first transform it to a black and white image. Looking at its histogram, it's easy to choose 0.85 to separate the blacks and whites.
(left) histogram             (right) black and white image
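A minimal sketch of that thresholding step, assuming the SIP toolbox and a hypothetical filename for the calibration picture:

// Binarize the calibration picture using the threshold read off its histogram.
img = gray_imread("circles_calibration.png");   // hypothetical filename
histplot(256, img);                             // the gap between the two peaks sits near 0.85
bw = im2bw(img, 0.85);                          // pixels above 0.85 become white (1)
imshow(bw);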
Using the same command as the ones used in the singing scilab activity,
[L, n] = bwlabel(image); 
and by imshow-ing L, we can visually see which dots are connected (they have exactly the same color).
Original picture with each blob tagged with a different integer
Connected blobs present a problem, as the only difference between "cancer cells" and "normal cells" is their area. Looking at the histogram of blob sizes, 

(code for blob size)

Blob sizes histogram
A lot of particles have blob sizes of roughly 500 pixels, so (taking area = pi*r^2, 500 pixels gives r of about 12.6) we can hypothesize that the circles have radii of around 12 pixels.
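The blob-size code above is only shown as an image; here's a rough sketch of how the areas can be tallied, reusing bw and the labels from the snippet above (the variable names are mine).

// Measure the area of each labeled blob and look at the distribution.
[L, n] = bwlabel(bw);
areas = zeros(1, n);
for i = 1:n
    areas(i) = length(find(L == i));   // pixel count of blob i
end
histplot(20, areas);                   // most blobs cluster near ~500 pixels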

Going back to the cancerous image, I eroded it with a circle of radius 10... and the cancerous cells were successfully identified!

(left) eroded image (right) tagged cells

Yeah, so I'd consider that a success. :D The advantage of the method I employed is that it would be able to identify the big circles even if they touch the smaller circles.
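Here's a sketch of how that erosion step can be set up; the disk-shaped structuring element is built from a meshgrid, and bwCancer stands for the binarized picture with the large circles (both names are my own).

// Build a disk of radius 10 and erode: only blobs larger than the disk survive.
rad = 10;
[xs, ys] = meshgrid(-rad:rad, -rad:rad);
se = bool2s(sqrt(xs.^2 + ys.^2) <= rad);   // disk-shaped structuring element

eroded = erode(bwCancer, se);              // bwCancer: binarized "cancerous" image
[L, n] = bwlabel(eroded);                  // n = number of big circles found
imshow(L/n);                               // tagged cells, brighter = higher label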

The only bad thing was that it took a couple of seconds, with my laptop fan whirring, to produce this result. The load on the machine can be reduced by processing smaller segments at a time.

I automated the cutting of the original image.

Then, looping through the second element of dir(filepath of pictures), the identification was done automatically. :)
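A sketch of that loop, assuming the cut-up pictures sit in a folder of their own (the path and file pattern are hypothetical); dir() returns a structure whose name field lists the matching files.

// Run the same identification on every cropped sub-image in a folder.
folder = "Documents/AP186/act 10/cuts/";
files = dir(folder + "*.png");
for k = 1:size(files.name, "*")
    bw = im2bw(gray_imread(folder + files.name(k)), 0.85);
    eroded = erode(bw, se);               // se: the radius-10 disk from above
    [L, n] = bwlabel(eroded);
    disp(files.name(k) + ": " + string(n) + " large circles");
end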

I'd give myself a grade of 9 for not working entirely on the cut-up images.



Wednesday, August 22, 2012

Act 8. Morphological operations application 1: Writing extraction

 Given the photo of a receipt shown below, we were tasked to crop an area and isolate the text.

Raw image
 After learning erode and dilate in the previous lesson, of course I was excited to put it to use!

After binarizing the cropped area, we erode it with a horizontal structuring element. This must be longer than one letter is wide, but short enough that the slight tilt of the lines does not pose a problem.

From this erosion, we get an image containing only the horizontal lines. Inverting it gives us a filter that is zero only on those lines. Multiplying this filter with the binarized image isolates the text. As a final step, we thin the image to make the letters one pixel thick simply by applying thin(Im).
(left) cropped area (middle) binarized image (right) isolated text
This code is limited in the sense that one would have to do additional masking (erosion, inversion, multiplication) to remove the vertical lines. This would not be a problem if we just edited the image in the frequency domain. (Remember this?)

Plus, since this code deletes all horizontal lines, it would erase part of the text if the text happened to intersect a horizontal line. This can be remedied a little by dilating with a short vertical structuring element.

Here's the code that takes the cropped picture and isolates the text. :)
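The actual code is again a screenshot, so below is a minimal sketch of the sequence described above; the filename, the binarization threshold, the SE length, and the assumption that ink ends up white (1) after binarization are all mine.

// Binarize, erase the horizontal lines, and thin the remaining text (SIP assumed).
crop = gray_imread("receipt_crop.png");     // hypothetical cropped area
bw   = 1 - im2bw(crop, 0.5);                // ink and lines -> 1, paper -> 0

seH   = ones(1, 51);                        // horizontal SE, longer than any letter
lines = erode(bw, seH);                     // keeps only the long horizontal lines
mask  = 1 - lines;                          // filter that is zero only on the lines
text  = bw .* mask;                         // text with the lines removed
text  = thin(text);                         // letters one pixel thick
imshow(text);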
Since I was successfully able to do the required task, I give myself a grade of 10/10.

Thursday, August 2, 2012

Act 7: Erosion and dilation: Morphological Operations

(To follow: video tutorial of how to get the erosion/dilation of an image given a structuring element)

In this activity, we were asked to predict the dilation and erosion of the following images:
  • 5x5 square
  • 4x3 triangle
  • 10x10 hollow square; 2 pixels thick
  • 5x5 cross; 1 pixel thick
using the following structuring elements (SE):
  • 2x2 square
  • 2x1 rectangle
  • 1x2 rectangle
  • 3x3 cross
  • 2x2 (lower left to upper right) diagonal
Here are my predictions:



 

 
The predicted dilation (top) and erosion (middle) of the four images (in four pages)
with the structuring elements at the bottom row of each page.


As we can see, erode looks for the SE in the image. Dilate makes the images thicker, with the "pressure" being related to the SE. It must be noted that these processes are not inverses of each other; taking the dilation of the erosion of an image (or vice-versa) does not return the original image. In fact, that process is called opening, and the reverse is called closing.

By virtue of changing the structuring element, one can choose which characteristic one wants to highlight.

Next, we check the predictions using Scilab. Since SIP has morphological operations, getting the dilation and erosion is easy and intuitive.
dilate(Image, SE);
erode(Image, SE);
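As a quick check of one of the predictions (this snippet is my own, not part of the activity code), here is the 5x5 square with the 2x2 square SE:

// 5x5 white square on a black background, 2x2 square structuring element.
img = zeros(11, 11);
img(4:8, 4:8) = 1;
se  = ones(2, 2);

imshow(erode(img, se));     // prediction: a 4x4 square
imshow(dilate(img, se));    // prediction: a 6x6 square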





Images(top) with various structuring elements(left) 
and the corresponding erosion(middle) and dilation(right)

Almost all of the predictions were confirmed, except for the erosion of the hollow 10x10 and 5x5 cross using the diagonal structuring element. Rechecking my work assures me that it was not simply a careless mistake. The result from the code does not seem to fit the pattern - perhaps an anomaly? It would be of interest to inspect the code for the dilate and erode commands.

 I was able to do the necessary work, so 10/10. :)

Wednesday, July 25, 2012

Act 6. Repetitive? Image Manipulation in Frequency Space

In this activity, we will edit images in ... dundundun... frequency space!

Instead of
normal picture > edit > finished picture
we will do
normal picture > (Fourier Transform - transfer to frequency space) >  edit > (Inverse Fourier Transform - leave frequency space) > finished picture

The Fourier Transform looks for patterns in the image. Each point in frequency space represents a sinusoidal pattern, with the persistence of that pattern as the amplitude at that point. Smooth, slowly varying areas of an image are represented by the low frequencies near the center of the (shifted) FT, while edges - rapid changes in intensity - and fine-grained noise need the high frequencies farther out. These types of filters are pretty easy to implement: we just have to keep m(find(r>0.5)) for a high pass filter, replace the > with a < for a low pass filter, and combine the two conditions for a band pass filter.
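Here's a rough sketch of that grab-and-remove workflow, written the way the lunar code further down does it; img (a grayscale image), the 0.3 cutoff, and the variable names are my own assumptions.

// Grab one band of frequencies and also build the image with that band removed.
x = [-0.5:1.0/(size(img,1)-1):0.5];
y = [-0.5:1.0/(size(img,2)-1):0.5];
[X, Y] = meshgrid(y, x);
r = sqrt(X.^2 + Y.^2);                      // distance from the center of the shifted FT

imgF = fftshift(mfft(img, -1, [size(img,1), size(img,2)]));

grabbed = zeros(imgF);                      // high pass: keep only the high frequencies
grabbed(find(r > 0.3)) = imgF(find(r > 0.3));
removed = imgF;                             // complement: the image with them removed
removed(find(r > 0.3)) = 0;
// replace > with < for a low pass grab; combine two conditions for a band pass

components = abs(mfft(grabbed, 1, [size(img,1), size(img,2)]));   // filtered components
result     = abs(mfft(removed, 1, [size(img,1), size(img,2)]));   // resulting (smoothed) image
imshow(result / max(result));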


The following images are arranged according to (top) image, (bottom) frequency space, (left) original image, (middle) filtered components, (right) resulting image.
High pass filter. The middle panel grabs the edges and fine detail; with those removed, the resulting image is smoothened. Try using a highpass filter on your picture - your pores will disappear!
Low pass filter. The middle panel grabs the smooth, slowly varying parts; with those removed, the resulting photo emphasizes the changes in the original image.

Band pass filter. I'm not really sure how this is normally used, but it was able to (coincidentally) grab the weave of the painting (middle).

The task for this activity is to remove repeating patterns from images.

Specifically, the horizontal lines from

Lunar orbiter image with lines
and the weave from this painting.

Painting by ___ with a noticeable weave pattern
The methodology used is

  1. Go to the frequency space in Scilab.
  2. Save the FT
  3. Open the FT in an image editing program (Gimp).
  4. Create a white layer on top.
  5. Add black/gray dots on top of the high amplitude points in the FT (you have to lower the opacity of the blank layer).
  6. Save.
  7. Open it up in Scilab
  8. Multiply the filter with the FT of the original image.
  9. Take the inverse FT.
  10. You have your finished image. :D
But since my Gimp kept on crashing :( I decided to do it purely in Scilab instead.

I took a look at the histogram of the natural logarithm of the FT amplitudes.

(histplot of log(FT))

I took an arbitrary cutoff that I estimated would include the bright spots in the FT. I then added a criterion that isolates the general area where I expected the FT of the undesired repeating pattern to lie. A circle around the center was left as is because changing it would remove important details from the image. I replaced these areas with 10, and not with 0, because 10 was just below the median value.

Deletion of lines from the lunar orbit picture (left) original (middle) extracted (right) resulting image

Code for the lunar picture


moon = gray_imread("Documents/AP186/act 6/crater.jpg");
x = [-0.5:1.0 / (size(moon,1)-1):0.5];
y = [-0.5:1.0 / (size(moon,2)-1):0.5];
[X,Y] = meshgrid(y,x);
r = sqrt(X.^2 + Y.^2);
moonF = fftshift(mfft(moon,-1,[size(moon,1),size(moon,2)]));
a = scf(1);
subplot(231); //original image
imshow(moon);
subplot(234); //FT original image
imshow(log(abs(moonF)),bonecolormap(8));
subplot(235); // filtered out (frequency space)
erased = ones(size(moonF,1), size(moonF,2));
erased(find(log(abs(moonF)) > 4  & abs(X)<0.01 & r > 0.015)) = moonF(find(log(abs(moonF)) > 4 & abs(X)<0.01 & r > 0.015));
imshow(log(abs(erased)),bonecolormap(8));
subplot(232); // filtered out (real space)
imshow(abs(mfft(erased,1,[size(moon,1),size(moon,2)])));
subplot(236); // edited image FT
moonF(find(log(abs(moonF)) > 4 & abs(X)<0.01 & r > 0.015)) = 10.0;
imshow(log(abs(moonF)),bonecolormap(8));
subplot(233); //edited image
moonFix = mfft(moonF,1,[size(moonF,1),size(moonF,2)]);
imshow(abs(moonFix),bonecolormap(8));
a.figure_size = [1000,800];
xs2png(a, "Documents/AP186/act 6/moonsir cool.png");

A similar code was used for the painting, but the criterion for masking in the frequency space was instead
find(log(abs(snipF))>4 & r > 0.1)
Extraction of weave from painting  (left) original (middle) extracted (right) resulting image
By blowing up the top left and top middle pictures, we can see that a general approximation of the triangularly shaped weave is obtained.
(Left) Original texture and (right) extracted texture

Part of the preparation for this activity was looking at the convolution (or multiplication in frequency space) of two images of dissimilar sizes. It is supposed to be used as familiarization with applying filters.
(Left) Ten random points (middle) convolved with a 5x5 image (right) points in a periodic fashion

I would give myself a grade of 11 for being able to get the desired output (lunar picture without lines and painting without weave) without using external applications.

Act 6 Preparation

Back in activity 2, we used Scilab to look at the Fourier Transform (FT) of simple geometric shapes. In this preparatory activity, we're going to do that again, albeit with the addition of dimensionality, rotation, and translation.


Coding part step-by-step (REALLY!)  (code given by Dr. Soriano, might be edited a little)
The grid is set up
x = [-2:0.05:2];
y = [-2:0.05:2];
[X,Y] = meshgrid (x,y);
r = sqrt(X.^2 + Y.^2);
 x and y become arrays from -2.0 to 2.0 in steps of 0.05, i.e. [-2.,-1.95,-1.90, ... -0.05,0,0.05,... 1.90,1.95,2.00]
[X,Y] is the 2D array which we can think of as one similar to the xy coordinate system. X is x repeated 81 times, Y is basically X transposed.
r is the distance from the center to a point in [X,Y]

A 2D array filled with zeros represents a black 81x81 image.
m = zeros(size(x), size(y))
Some parts of it are selected to become nonzero; 1.0 is chosen since that value corresponds to white in scilab bw images.
m(find(r<0.5)) = 1.0;
From the inside-out, in baby steps:
  • if we declare a variable j = 0.25 and we type j < 0.5, Scilab returns "T" since j is less than 0.5
  • if j is instead an array, Scilab evaluates each element and returns an array of the same size, i.e. if j = [0.25,0.5,0.75], j<0.5 returns [T,F,F]
  • r<0.5 therefore gives us all the points in XY within a circle of radius 0.5
  • find(r<0.5) gives the indices of r where the condition is true
  • m(find(r<0.5)) selects all the elements of m whose corresponding entry in r<0.5 is true
  • m(find(r<0.5)) = 1.0 sets all aforementioned elements to 1.0
m can now be displayed in 3D
imesh(m);
or topview
imshow(m);


Matrix m displayed in (left)3D and (right)topview


A few coding notes:
In the code given to us,
grayplot(x,y,(FTmc)) was used to show the topview of the FT. According to the documentation, 
grayplot makes a 2D plot of the surface given by z on a grid defined by x and y. Each rectangle on the grid is filled with a gray or color level depending on the average value of z on the corners of the rectangle.
Apparently, grayplot smoothens the image, leading to a loss of detail. That's why I chose to use imshow for the real space images. However, the regular imshow gets cranky with the images in frequency space; you need to specify the colormap to make it presentable. imshow(FTimage, hotcolormap(8)) works for my FT-ed images. :) HOWEVER, all the pictures here are in grayplot, since I always mess things up when doing repetitive actions.


mfft(m,-1,[81 81]) was used to take the FT of the image. It's a multidimensional FT; you just have to specify the matrix, the flag (-1 denotes the FT, 1 denotes the inverse FT), and the dimensions (our matrix is 81 by 81). An alternative is using fft2(m) for the 2D FT and ifft(m) for the inverse FT.


Necessary pics


The pics are ordered with the original shape at the left, the FT in the middle, and the topview of the FT at the right.




Dirac deltas symmetric with respect to the y axis. The distance from the y axis is increased from 5 to 30.




Cylinders with (top to mid) distance from the y axis increased from 5 to 20 and (mid to bot) radius increased from 2 to 10




Gaussian functions of various variances along the x and y.
 


Squares (top to middle) displaced and (middle to bottom) increased in size

entire code:
mtlb_close all;
// CREATE A MATRIX OF X AND Y COORDINATES WITH
// THE ORIGIN AT THE CENTER

x = [-2:0.05:2];
y = [-2:0.05:2];
[X,Y] = meshgrid (x,y);
R = sqrt(X.^2 + Y.^2);

// INITIALIZE A CLEAN MATRIX
m = zeros(size(x), size(y));

//point sources at distance d from the y axis
//d = 5;
//m(41+d,41)= 1;
//m(41-d,41) = 1;

//circles of radius r at distance d from the y axis
//r = 10
//d = 20
//m(find(R<0.2)) = 1.0;
//m = [m(:,1+d:40+d), m(:,41-d:81-d)];

//squares of length l at distance d from the y axis
//l = 0.5
//d = 20
//m(find((abs(X) <=l)& (abs(Y)<=l)))=1;
//m = [m(:,1+d:40+d), m(:,41-d:81-d)];

//gaussian of variance var
varx = 0.1;
vary = 0.05;
m = exp(-(X.^2)/varx-(Y.^2)/vary);

FT = mfft(m,-1,[81 81]);
FTmc = fftshift(abs(FT));

a = scf(1);
xset("colormap", hotcolormap(128));
subplot(131);
mesh(m);
subplot(132);
mesh(FTmc);
subplot(133);
grayplot(x,y,(FTmc));
a.figure_size = [1000,500];
xs2png(a, "Documents/AP186/act 6/CMap gauss varx0.1 vary0.05.png");

Wednesday, July 11, 2012

Act 5: Too dark? Image enhancement by histogram manipulation

"Don't trust your eyes." - Dr. Soriano

People say that seeing is believing. As humans, we are used to trusting our senses. However, our perception is not perfect; for example, upon exposure to pain, it takes an intensity change of more than 6% before we notice any change. (I have to find a proper source for this; it just stuck in my mind while sleep-listening to RadioLab.) In the ranges of greatest sensitivity, small changes in intensity are noticeable, but when we get far away from that range, large changes go unnoticed. It's like comparing a bite of food when you're famished to a bite in the middle of a long meal.

Most modern electronics have a wider range of sensitivity than we do. They allow us to observe differences that are not registered by our senses. Image enhancement by histogram manipulation simply magnifies the differences so as to be detectable by humans.

What's a histogram again? A histogram is simply a frequency distribution of color intensities in an image. For a grayscale image, the x-axis ranges from black to shades of gray to white (0 to 1 OR 0 to 255), while a truecolor image will have three histograms: intensities of red, green, and blue. On a side note, that's how a digital camera decides the correct exposure: its ideal exposure occurs when the histogram is centrally balanced and not truncated at the sides. I'll add examples here when I have some free time.

Foraging for underexposed images in Flickr, I found a dark picture with only a few discernible details: a tree branch, some shrubs, and a pool ladder.
Nulkku's "The Darkness" (accessible here)

We open it in Scilab by typing:
Img = gray_imread('image3.jpg');
imsize = size(Img); //gets the dimensions of the image
 A histogram with 256 bins can be called by typing histplot(256, Img), but we need the values of the histogram if we're going to fix it manually. To get an array of values and frequency of value occurrence,

[freqs, val] = imhist(Img, 256);
 Displaying the histogram gives us the following picture.
Histogram of original image
From the histogram alone, we can immediately tell that this image is too dark, since the values are crowded at 0. Since it looks truncated, we can assume that some details will not be recoverable - the blacker details were "bunched up" at 0. Increasing the brightness in image editing programs simply shifts the histogram to the right, so there isn't an increase in detail. Shifting it too much will bunch up the white portions, losing even more detail.

First, we were instructed to see what happens if we try to make the histogram approximate a uniform distribution. We take its cumulative distribution function to be able to remap it.
CDF = cumsum(freqs)/sum(freqs);              //normalized CDF of image histogram
CDFline = cumsum(zeros(256,1)+1./256); //linear CDF
The instructions were to get the value of each pixel, (say 0.3 gray) look at where it falls in the corresponding CDF (i.e. 40%), trace the value (40%) to the desired CDF and exchange the original pixel value with the obtained color value in the new CDF. However, that involves a lot of finds and indexing, so I decided to map it instead. I figured out the replacement value for all possible values in the original image and saved the pairs in an array.
trans = [];                                    // lookup table: original value -> new value
for i = 1:256
    // largest gray level whose target (linear) CDF is still <= the original CDF value
    new = val(max(find(CDF(i) >= CDFline)));
    if new then
        trans = [trans, new];
    else
        trans = [trans, 0];                    // nothing below it yet: map to black
    end
end
I'm sure there's a more efficient way, but I'm still feeling my way with Scilab. Any advice would be always welcome. :)

Replacing the values only needs one line:
newImg = trans(Img*255+1);
Getting the new histogram is a bit disappointing.
Remapped to a linear CDF

 It's obvious that the new image will be blocky and pixelized. However, that's to be expected, as only the colour values that actually occur in the original can be remapped - no new intermediate shades are created. We have not added any code that integrates the surrounding colors to smoothen the remapping.

Looking at the new image,
Enhanced image
It doesn't look very nice, but we've successfully brought out more details.

For the second part, we make a Gaussian histogram
 gauss = exp(-0.0005*(128-(1:256)).^2);
and set its CDF, shown in the next figure, as the desired CDF.
CDF of a Gaussian distribution
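For completeness, the Gaussian CDF can be built the same way CDFline was (this line is my own, following the description):
CDFgauss = cumsum(gauss)/sum(gauss);   //normalized CDF of the Gaussian histogram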
Simply replacing CDFline with CDFgauss in the computation for trans gives us the following histogram and enhanced image.

Histogram of enhanced image #2
Enhanced image using a Gaussian CDF
 It can be seen that the branches at the right are not distractingly white anymore. This type of CDF allows us to highlight more details while looking less synthetic.


I would give myself a grade of 10 for being able to complete the activities detailed in the histogram manipulation manual.