[math-fun] Image understanding
Is there something like "elementary introduction to FFT's for digital photography"? I'm trying to develop (!) some heuristics for automatically processing my digital photos, and I'm doing FFT's on them to try to get a sense of the orientation. Few of my photos are of fences or zebras, so I need some experience in understanding how to interpret FFTs of general scenes.
Hilarie
Wears Full Bark Jacket for tree hugging
Help Stomp Out Box Elder Beetles
There are a (small) number of good books in the AI and graphics communities on the effect of Fourier transforms on images. The best sources I have seen are a chapter or two in a larger textbook. If you get to MIT, CMU, Stanford, etc., their bookstores will usually have the right materials.

Basically, FT's (an FFT is merely a "fast" algorithm for computing an FT) allow you to do convolutional filtering more efficiently in the "frequency" domain instead of in the "time" domain (this terminology leaks over from the 1-D usage of FT's; for an image the "time" domain is really the spatial domain). Convolutional filtering is "position-independent" filtering, where the filter to be applied is the same at every image point.

A 1-D DFT (D = discrete) takes a number of points (usually 2^n) and produces another 2^n complex numbers, which give the amplitude and phase of each of the frequencies found in the signal. The points are interpreted as an infinite signal which repeats every 2^n points, so each point index is interpreted mod 2^n. A 2-D DFT therefore treats the image as the surface of a torus: the left & right edges of your image are connected together, as are the top & bottom. Changing the phase of a frequency component shifts that component around the torus in one of the two cyclic dimensions. You can fix the wrap-around by "padding" the image, but this has other effects, e.g., "ringing".

A lowly glass lens does (an approximation to) a 2-D FT: under coherent light, the transform appears in the lens's focal plane rather than in the plane where the image is in focus.

FT filtering is an integral part of image compression a la JPEG and MPEG, in the form of the DCT (discrete cosine transform). DCT's are close relatives of the DFT which don't require complex numbers, i.e., they stay in the real domain.

There are a large number of free image processing tools around; unfortunately, a number of them run under MSDOS. Check with the GNU/FSF sites.
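To make the wrap-around and the "filter in the frequency domain" ideas concrete, here is a minimal 1-D sketch in plain Python. It is only an illustration under my own assumptions: the names dft, idft and circular_convolve and the sample values are made up for this example, and the transform is the slow O(n^2) textbook definition rather than a real FFT. For actual photos you would use a library FFT (e.g., numpy.fft.fft2) on the 2-D pixel array.

```python
import cmath

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (hypothetical helper)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft() above."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def circular_convolve(signal, kernel):
    """Direct filtering with indices taken mod n (the signal lives on a circle)."""
    n = len(signal)
    return [sum(signal[(t - s) % n] * kernel[s] for s in range(n))
            for t in range(n)]

# Made-up data: 2^3 sample points and a small smoothing filter centred on index 0.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

# Frequency-domain filtering: multiply the two spectra point by point, then invert.
product = [a * b for a, b in zip(dft(signal), dft(kernel))]
via_dft = [x.real for x in idft(product)]

# The same filter applied directly, point by point.
direct = circular_convolve(signal, kernel)
conv_matches = all(abs(a - b) < 1e-9 for a, b in zip(via_dft, direct))

# Wrap-around: the first output sample mixes in the LAST input sample,
# direct[0] = 0.5*signal[0] + 0.25*signal[7] + 0.25*signal[1].

# Rotating the signal around the circle changes only the phases of the
# spectrum; the amplitude of every frequency stays the same.
shifted = signal[-3:] + signal[:-3]
mags_equal = all(abs(abs(a) - abs(b)) < 1e-9
                 for a, b in zip(dft(signal), dft(shifted)))

print(conv_matches, mags_equal)
```

The 2-D case works the same way along each axis, which is where the torus picture comes from: a blur applied via the transform near the left edge of a photo will pick up pixels from the right edge unless the image is padded first.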
Both a modern Pentium 4 and an Apple G4 have excellent FFT performance, if properly programmed, so you should be able to play around with this on any recent PC/Mac.

At 04:36 PM 1/19/2004, The Purple Streak, Hilarie Orman wrote:
Is there something like "elementary introduction to FFT's for digital photography"? I'm trying to develop (!) some heuristics for automatically processing my digital photos, and I'm doing FFT's on them to try to get a sense of the orientation. Few of my photos are of fences or zebras, so I need some experience in understanding how to interpret FFTs of general scenes.
Hilarie
Wears Full Bark Jacket for tree hugging
Help Stomp Out Box Elder Beetles
participants (2)
- Henry Baker
- The Purple Streak, Hilarie Orman