CMOS/CCD Sensors and Camera Systems, Second Edition
The image sensor of a digital camera produces values, proportional to radiance, that approximate red, green, and blue (RGB) tristimulus values. I call these values linear-light. However, in most imaging systems, RGB tristimulus values are coded using a nonlinear transfer function – gamma correction – that mimics the perceptual response of human vision. Most image coding systems use R'G'B' values that are not proportional to intensity; the primes in the notation denote imposition of a perceptually motivated nonlinearity. Caputo, in, 2010 – Digital Image Sensors: CCD Versus CMOS. The image sensor of the camera is responsible for converting the light and color spectrum into electrical signals for the camera to convert into zeroes and ones. All commercially available digital cameras (still, movie, or security) use one of two possible technologies for the camera's image sensor: CCD or CMOS. CCD sensor technology is specifically developed for the camera industry, whereas CMOS sensors are based on the same technology used in many electronic devices for memory and/or firmware. CCD sensors have been used in photographic equipment for 20 years and, until recently, offered better light sensitivity than CMOS sensors.
This higher light sensitivity translates into better low-light images, which is important for security cameras. A CCD sensor is more expensive to manufacture and incorporate into a camera than a CMOS chip. Thus, CMOS technology had to improve dramatically to meet the demand for lower-cost digital image products. CMOS sensors are more cost-effective to manufacture and assemble, making smaller cameras with larger sensors possible, but they still fall short in low-light sensitivity. CMOS and CCD sensors are typically measured in either millimeters or inches. The majority of security cameras use anywhere from a 1/4″ to a 2/3″ sensor, which, as you can see from Figure 3-4, is a fraction of the size of the traditional 35 mm sensor.
However, that's why the “normal” lens on any digital camera is smaller than 50 mm. Even digital SLR cameras, although capable of using many of their 35 mm counterpart lenses, are considered 1.5–1.6 times their original 35 mm size (the 1.5× or 1.6× crop factor) because they're using the APS-C-sized sensor, which is smaller than the original 35 mm sensor. Mohammad Wajih Alam, Wahid, in, 2019 – 1.2.1 Image Sensor. Conventional endoscopy systems use an image sensor to capture images of the GI tract. The image sensor (camera), along with illumination, a processor, and wireless communication, is miniaturized into a capsule. The most popular image sensor is the complementary metal oxide semiconductor (CMOS) image sensor, as it is cheaper, smaller, and consumes less power than a charge-coupled device (CCD) image sensor.
On the other hand, CCD image sensors offer a high-quality, low-noise image [9]. Among the few small-bowel capsule endoscope models available on the market, PillCam, MiroCam, OMOM, and CapsoCam use CMOS imagers, whereas the EndoCapsule uses a CCD imager [10]. In, 2012 – Sampling aperture. In a practical image sensor, each element acquires information from a finite region of the image plane; the value of each pixel is a function of the distribution of intensity over that region. The distribution of sensitivity across a pixel of an image capture device is referred to as its sampling aperture, sort of a PSF in reverse – you could call it a point “collection” function. The sampling aperture influences the nature of the image signal originated by a sensor. Sampling apertures used in continuous-tone imaging systems usually peak at the center of each pixel, fall off over a small distance, and overlap neighboring pixels to some extent. In 1928, Harry Nyquist published a landmark paper stating that a sampled analog signal cannot be reconstructed accurately unless all of its frequency components are contained strictly within half the sampling frequency. This condition subsequently became known as the Nyquist criterion; half the sampling rate became known as the Nyquist rate.
Nyquist developed his theorem for one-dimensional signals, but it has been extended to two dimensions. In a digital system, it takes at least two elements – two pixels or two scanning lines – to represent a cycle. A cycle is equivalent to a line pair of film, or two “TV lines” (TVL). In Figure 7.6, the black square punctured by a regular array of holes represents a grid of small sampling apertures. Behind the sampling grid is a set of a dozen black bars, tilted 14° off the vertical, representing image information. In the region where the image is sampled, you can see three wide dark bars tilted at 45°. Those bars represent spatial aliases that arise because the number of bars per inch (or mm) in the image is greater than half the number of apertures per inch (or mm) in the sampling lattice.
Aliasing can be prevented – or at least minimized – by imposing a spatial filter in front of the sampling process, as I will describe for one-dimensional signals in Filtering and sampling, on page 191, and for two dimensions in Image presampling filters, on page 242. Point sampling refers to capture with an infinitesimal sampling aperture. This is undesirable in continuous-tone imaging. Figure 7.7 shows what would happen if a physical scene like that in Figure 7.1 were rotated 14°, captured with a point-sampled camera, and displayed with a box distribution. The alternating on-off elements are rendered with aliasing in both the checkerboard portion and the title bar.
(Aliasing would be evident even if this image were to be reconstructed with a Gaussian.) This example emphasizes that in digital imaging, we must represent arbitrary scenes, not just scenes whose elements have an intimate relationship with the sampling grid. Spatial phenomena at an image sensor or at a display device may limit both vertical and horizontal resolution. Analog processing, recording, and transmission historically limit bandwidth, and thereby affect only horizontal resolution. Resolution in video historically refers to horizontal resolution: resolution in TVL/PH – colloquially, “TV lines” – is twice the number of vertical black and white line pairs (cycles) that can be visually discerned across a horizontal distance equal to the picture height. Vertical resampling has become common in consumer equipment; resampling potentially affects vertical resolution. In transform-based compression (such as JPEG, DV, and MPEG), dispersion comparable to overlap between pixels occurs; this affects horizontal and vertical resolution.
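To make the Nyquist criterion concrete, here is a minimal numerical sketch (an illustrative aside, not part of the original text): a sinusoid whose frequency exceeds half the sampling rate produces exactly the same samples as a lower-frequency alias.

```python
import numpy as np

# Illustrative sketch: a 7-cycle/unit sinusoid sampled at 10 samples/unit
# (Nyquist rate = 5 cycles/unit) aliases to 10 - 7 = 3 cycles/unit.
fs = 10.0          # sampling frequency, samples per unit length
f_signal = 7.0     # signal frequency, above the Nyquist rate fs/2
f_alias = fs - f_signal

n = np.arange(20)                  # sample indices
x = n / fs                         # sample positions
original = np.cos(2 * np.pi * f_signal * x)
alias = np.cos(2 * np.pi * f_alias * x)

# The two sets of samples are identical, so the 7-cycle signal cannot be
# distinguished from its 3-cycle alias after sampling.
print(np.allclose(original, alias))   # True
```

Once sampled, the 7-cycle signal and its 3-cycle alias are indistinguishable, which is precisely the mechanism behind the spurious tilted bars in Figure 7.6.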
Caputo, in, 2014 – Digital Image Sensors: CCD vs. CMOS. The camera’s image sensor is responsible for converting the light and color spectrum into electrical signals for the camera to convert into zeros and ones. All commercially available digital cameras (still, movie, or security) use one of two possible technologies for the camera’s image sensor: charge-coupled device (CCD) or complementary metal oxide semiconductor (CMOS). CCD sensor technology is specifically developed for the camera industry, whereas CMOS sensors are based on the same technology used in many electronic devices for memory and/or firmware. CCD sensors have been used in photographic equipment for 20 years and, until recently, offered better light sensitivity than CMOS sensors. This higher light sensitivity translates into better low-light images, which is important for security cameras.
A CCD sensor is more expensive to manufacture and incorporate into a camera than a CMOS chip. Thus, CMOS technology had to improve dramatically in order to meet the demand for lower-cost digital image products. CMOS sensors are more cost-effective to manufacture and assemble, making smaller cameras with larger sensors possible, but they still fall short in low-light sensitivity. CMOS and CCD sensors are typically measured in either millimeters or inches. The majority of security cameras use anywhere from a ¼- to a 2/3-inch sensor, which, as you can see from Figure 3.4, is a fraction of the size of the traditional 35mm sensor. However, that’s why the “normal” lens on any digital camera is smaller than 50mm.
Even digital SLR cameras, although capable of using many of their 35mm counterpart lenses, are considered 1.5 to 1.6 times their original 35mm size (the 1.5× or 1.6× crop factor) because they’re using the APS-C size sensor, which is smaller than the original 35mm sensor. A digital image is produced using image sensors, resulting in a 2D or 3D data representation of one image or a sequence of images. The data comprise spatial characteristics such as light intensity, depth, and absorption.
In this work, we assume an input image needs to be compared with a large number of images archived in a database. The input image is preprocessed by reducing noise and enhancing contrast to enable extraction of relevant attributes and suppression of false information. To study the scalability and performance of the image-search (or matching), we use Scale-Invariant Feature Transform (SIFT) as an algorithm to detect and describe local features in images. The SIFT method is invariant to image scaling and rotation, and partially invariant to illumination changes and affine distortions even in the presence of occlusion, clutter, or noise. A large number of features can be extracted from a typical image that are highly distinctive and can be matched accurately to a large database of features for object, facial, or scene recognition. The runtime proportion of SIFT processing is illustrated in Figure 11.2.
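As a concrete illustration of this kind of feature-based matching, the sketch below uses OpenCV's SIFT implementation with Lowe's ratio test. This is an illustrative example, not the code described in the chapter; it assumes OpenCV 4.4 or later (where SIFT ships in the main module) and two hypothetical image files.

```python
import cv2

# Minimal sketch of SIFT keypoint extraction and matching (illustrative only;
# query.png and archive.png are hypothetical file names).
query = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)
archive = cv2.imread("archive.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_q, desc_q = sift.detectAndCompute(query, None)     # keypoints + 128-D descriptors
kp_a, desc_a = sift.detectAndCompute(archive, None)

# Match descriptors with a ratio test (Lowe's criterion) to reject ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(desc_q, desc_a, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative matches")
```

The ratio test discards matches whose best and second-best descriptor distances are too similar, which is how ambiguous correspondences are typically filtered before object, facial, or scene recognition.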
The SIFT library is an open-source library, referenced in the section “For more information,” with which one can extract invariant feature points from any given database of images. The various stages of SIFT can be summarized as follows.
High-level view of visual pattern matching using the Scale-Invariant Feature Transform (SIFT). Scale-space extrema detection. This is the first stage of the SIFT framework; it searches over all scales and image locations for candidate keypoints. The image undergoes Gaussian convolution at different scales to produce images separated by a constant factor. This convolution is followed by subtracting adjacent image scales to produce the difference-of-Gaussian (DoG) images. These images identify potential interest points that are invariant to scale and orientation: keypoints are the local minima/maxima of the DoG images across scales.
Each pixel in the DoG image is compared to its eight neighbors at the same scale and the nine corresponding neighboring pixels in each of the neighboring scales. A candidate keypoint is a pixel whose value is the maximum or minimum among all compared pixels. Keypoint localization. Once the keypoint candidates are selected, a detailed fit is performed to the nearby data for location, scale, and ratio of principal curvatures. This information allows rejection of keypoints that have low contrast (and are therefore sensitive to noise) or are poorly localized along an edge. Keypoints with a strong edge response in a single direction are rejected.
Orientation assignment. In this step of SIFT, each keypoint is assigned one or more orientations based on local image gradient directions. The keypoint can be referenced relative to this orientation and thereby achieves invariance with respect to image rotation. Keypoint descriptor. In the final stage, the local image gradients around each keypoint are accumulated into a histogram-based descriptor that is robust to local shape distortion and illumination changes. Joe Stam, James Fung, in, 2011 – 36.1 Introduction, Problem Statement, and Context. Digital imaging systems include a lens and an image sensor. The lens forms an image on the image sensor plane. The image sensor contains an array of light-sensitive pixels, which produce a digital value indicative of the light photons accumulated on the pixel over the exposure time. Conventional image sensor arrays are sensitive to a broad range of light wavelengths, typically from about 350 to 1100 nm, and thus do not produce color images directly.
Most color image sensors contain a patterned mosaic color filter array over the pixels such that each pixel is sensitive to light only in the red, blue, or green regions of the visible spectrum, and an IR cut filter is typically positioned in the optical path to reflect or absorb any light beyond about 780 nm, the limit of the human visible spectrum. The typical mosaic or Bayer [1] pattern color filter array (CFA) is shown in Figure 36.1. This figure shows a common RG-GB configuration used throughout this example. Other arrangements may be used, including horizontal or vertical stripes rather than a mosaic.
Different optical methods also exist to create color images, such as the common “three-chip” configuration used in some professional cameras, where dichroic beam splitters direct three different spectral bands of the image onto three different image sensor arrays. Color-imaging techniques all have advantages with respect to image quality, cost, complexity, sensitivity, size, and weight. Although the merits of different color-imaging techniques may be hotly debated, the Bayer filter pattern has clearly emerged as the most popular technique for virtually all consumer and most professional video and still cameras. Bayer pattern color filter array (CFA). Because each pixel of a Bayer pattern filter responds to only one spectral band (red, green, or blue), the other two color components must be somehow computed from the neighboring pixels to create a full-color image. This process, commonly called demosaicing, or de-Bayering, is the subject of this chapter.
Perhaps the simplest method is to reduce the effective resolution of the image by three quarters and treat each RG-GB 2 × 2 quad as a superpixel. Although simple, this approach loses resolution, which is undesirable, and artifacts remain because each pixel is not perfectly coincident with the other pixels in the quad. Many improved methods have been developed with various degrees of computational complexity. Simple linear interpolation between neighbors is probably the most commonly used, especially for video applications. Much more complex and higher-quality methods are preferred when postprocessing images from high-resolution digital still cameras on a PC — a tedious process typically referred to as RAW file conversion. De-mosaicing occurs either in an embedded processor within a camera (or even on the image sensor chip) or as part of a postprocessing step on a workstation after acquisition. Simple on-camera implementations result in a substantial loss of quality and information. Using a high-power processor would consume too much of the camera's battery life and produce heat and electrical noise that could degrade the image quality.
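The simple linear interpolation mentioned above can be sketched in a few lines. The following is a minimal NumPy illustration for an RGGB layout (an assumption for this example), not the chapter's CUDA implementation, and it omits white balance, edge-aware weighting, and the other refinements a real pipeline needs.

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear_rggb(raw):
    """Bilinear demosaic of an RGGB Bayer mosaic (illustrative sketch only)."""
    h, w = raw.shape
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1.0 - r_mask - b_mask

    # Interpolate each plane by normalized convolution: sum of known same-color
    # neighbors divided by the number of known neighbors under the kernel.
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float)
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], float)

    def interp(mask, kernel):
        num = convolve(raw * mask, kernel, mode="mirror")
        den = convolve(mask, kernel, mode="mirror")
        return num / den

    return np.dstack([interp(r_mask, k_rb), interp(g_mask, k_g), interp(b_mask, k_rb)])
```

Normalized convolution (dividing the filtered mosaic by the filtered mask) is simply a compact way of averaging whichever same-color neighbors are available at each pixel location.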
Although on-camera methods are acceptable for consumer applications, professional photographers and videographers benefit from storing the raw unprocessed images and reserving the color interpolation for a later stage. This allows the use of much higher-quality algorithms and prevents the loss of any original recorded information. Additionally, the user may freely adjust parameters or use different methods to achieve their preferred results. Unfortunately, the raw postprocessing step consumes significant computational resources and thus proves cumbersome and lacks fluid interactivity. To date, full high-performance raw conversion pipelines are largely limited to still photography. The same techniques could be used for video, but the huge pixel rates involved make such applications impractical. Graphics processing units (GPUs) have become highly programmable and may be used for general-purpose massively parallel programming.
De-mosaicing is an embarrassingly parallel process, ideally suited to implementation on the GPU. This chapter discusses implementation of a few de-mosaicing algorithms on GPUs using NVIDIA's CUDA GPU computing framework. The de-mosaicing methods presented are generally known and reasonably simple, in order to present a basic parallel structure for implementing such algorithms on the GPU. Programmers interested in more sophisticated techniques may use these examples as starting points for their specific applications. Analog closed-circuit TV camera block diagram. The camera converts the optical image produced by the lens into a time-varying electric signal that changes (modulates) in accordance with the light-intensity distribution throughout the scene. Other camera electronic circuits produce synchronizing pulses so that the time-varying video signal can later be displayed on a monitor or recorder, or printed out as hard copy on a video printer.
Although cameras may differ in size and shape depending on specific type and capability, the scanning process used by most cameras is essentially the same. Almost all cameras must scan the scene, point by point, as a function of time (an exception is the image intensifier). Solid-state CCD or CMOS color and monochrome cameras are used in most applications.
In scenes with low illumination, sensitive CCD cameras with IR illuminators are used. In scenes with very low illumination and where no active illumination is permitted (i.e., covert), low-light-level (LLL) intensified CCD (ICCD) cameras are used. These cameras are complex and expensive. In the early 1990s, the nonbroadcast, tube-type color cameras available for security applications lacked long-term stability, sensitivity, and high resolution. Color cameras were not used much in security applications until solid-state color CCTV cameras became available through the development of solid-state color sensor technology and the widespread use of consumer color CCD cameras in camcorders. Color cameras have now become standard in security systems, and most CCTV security cameras in use today are color.
Figure 20.10 shows representative CCTV cameras including monochrome and color solid-state CCD and CMOS cameras, a small single-board camera, and a miniature remote-head camera. Figure 20.10.
Representative video cameras. Transmission Function. Once the camera has generated an electrical video signal representing the scene image, the signal is transmitted to a remote security monitoring site via some transmission means: coaxial cable, two-wire twisted-pair, LAN, WAN, intranet, Internet, fiber-optic, or wireless techniques. The choice of transmission medium depends on factors such as distance, environment, and facility layout. If the distance between the camera and the monitor is short (10–500 ft), coaxial cable, UTP, fiber optic, or wireless links are used. For longer distances (500 ft to several thousand feet) or where there are electrical disturbances, fiber-optic cable and UTP are preferred.
For very long distances and in harsh environments (frequent lightning storms), or between separated buildings where no electrical grounding between buildings is in place, fiber optics is the choice. In applications where the camera and monitor are separated by roadways or where there is no right of way, wireless systems using RF, microwave, or optical transmission are used. For transmission over many miles or from city to city, the only choice is the digital or Internet IP camera using compression techniques and transmitting over the Internet. Images from these Internet systems are not real time but sometimes come close to real time. Monitor Function. At the monitoring site, a CRT, LCD, or plasma monitor converts the video signal back into a visual image on the monitor face via electronic circuitry similar, but inverse, to that in the camera. The final scene is produced by a scanning electron beam in the CRT in the video monitor. This beam activates the phosphor on the CRT, producing a representation of the original image on the faceplate of the monitor. Alternatively, the video image is displayed point by point on an LCD or plasma screen.
A permanent record of the monitor video image is made using a VCR tape or DVR hard-disk magnetic recorder, and a permanent hard copy is printed with a video printer. Recording Function. For decades the VCR has been used to record monochrome and color video images. The real-time and TL VCR magnetic tape systems have been a reliable and efficient means for recording security scenes. Beginning in the mid-1990s, the DVR was developed using a computer hard disk drive and digital electronics to provide video image recording.
The availability of large memory disks (hundreds of megabytes) made these machines available for long-duration security recording. Significant advantages of the DVR over the VCR are the high reliability of the disk as compared with the cassette tape, its ability to perform high-speed searches (retrieval of images) anywhere on the disk, and absence of image deterioration after many copies are made.
(6.1) λc = hc/Eg = 1.24/Eg(eV) μm, where h is Planck's constant. Photons of wavelength shorter than the cutoff wavelength are absorbed and create electron–hole pairs. An important metric for a photodetector is the quantum efficiency η, which measures the number of carriers generated per incident photon. As with LEDs, we use a junction to promote the capture of photons and the generation of conduction-band electrons. The quantum efficiency of a photojunction is the ratio of electron–hole pairs generated to incident photons. The p-i-n (p-type, then intrinsic, then n-type) junction is often used for photodetection. Light absorbed by a reverse-biased junction creates electron–hole pairs that result in current flow. Photodetectors take two common forms.
A photodiode is a diode optimized for use as a photodetector. A phototransistor uses one of the p-n junctions of the transistor as the photodetector; the transistor effect then amplifies the resulting current.
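As a quick numerical check of equation (6.1), here is an illustrative aside using the commonly quoted silicon bandgap of about 1.12 eV:

```python
# Cutoff wavelength from Eq. (6.1): lambda_c [um] ~= 1.24 / Eg [eV]
# (illustrative check; 1.12 eV is the usual room-temperature silicon bandgap).
def cutoff_wavelength_um(bandgap_ev):
    return 1.24 / bandgap_ev

print(cutoff_wavelength_um(1.12))   # ~1.11 um
```

A cutoff near 1.1 μm is why conventional silicon sensors respond out to roughly 1100 nm and are typically placed behind an IR cut filter, as noted earlier.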
As with LEDs, the composition of the material and its resulting bandgap determines the frequencies to which the photodetector is sensitive. However, photodetectors are typically used as panchromatic detectors that capture all frequencies of light. The fact that they are less sensitive to some frequencies is taken into account in other ways. As shown in Fig. 6.8, a pixel in an image sensor contains several elements in addition to the photodetector. The pixel circuitry provides access to the pixel value.
As a result, not all of the surface of the image sensor can be used to detect photons. The fill factor is the ratio of the photodetector area to the total pixel area. One way to compensate for the limited fill factor of a pixel is to use a microlens to concentrate as much light as possible onto the photodetector. The microlenses must be fabricated with material of good optical quality, and their optical properties must be evenly matched across the array. Although different photodiode materials can be used to sense light of different wavelengths, building an array of red, green, and blue photodiodes of different materials on the same chip is impractical given the small sizes required for the pixel.
Instead, color filters are placed over each pixel as shown in Fig. 6.8. The filter material itself is relatively simple to handle compared with other microelectronic materials, but each pixel in the sensor array must have its own color filter. The most common pattern for filters is known as the Bayer pattern [Bay75], shown in Fig. 6.9.
It is a 2 × 2 pattern with two green, one blue, and one red pixel. Two greens were chosen because the human visual system is most sensitive to green; the pair of green pixels can be used as a simple form of luminance signal. We use a string of MOS capacitors to build the CCD array. Charge is moved from one capacitor to the next to form an analog shift register that is often called a bucket brigade. The standard way to visualize the operation of a CCD is to show each capacitor's potential well with the charge sitting at the bottom of the well. Although the charge is actually at the surface of the MOS capacitor, the potential well imagery helps us to visualize the bucket brigade behavior.
MOS capacitors can be arranged in several different ways to form bucket brigades; Fig. 6.10 shows the operation of a three-phase CCD. Charge is successively transferred from one device to the next by applying voltages to each MOS capacitor.
Charge will flow from one device to the adjacent device if that device's potential well is lower. A cell consists of three devices. The three phases of a clock are applied to the devices to move charge from one device to the next by manipulating their potential wells. At the end of three phases, each sample has moved by one cell. Operation of a three-phase CCD. CCDs are extremely efficient at transferring charge, which means that they introduce very little noise into the image. CCDs are still used, particularly for applications that require operation in low light, such as astronomy. But CCDs require specialized manufacturing processes. The CMOS imager, also known as an active pixel sensor (APS) [Fos95; Men97], is widely used because it provides good image quality while being compatible with standard CMOS fabrication technologies.
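Before turning to the details of the CMOS imager, the bucket-brigade behavior described above can be caricatured in a few lines of code (an illustrative sketch only, assuming ideal, lossless transfer): each full three-phase clock cycle moves every charge packet one cell toward the readout node.

```python
import numpy as np

# Illustrative sketch of an ideal analog shift register ("bucket brigade"):
# one entry per CCD cell, holding the accumulated charge of one pixel.
charge = np.array([5.0, 3.0, 8.0, 1.0, 0.0, 0.0, 0.0])

def shift_one_cell(cells):
    """One full three-phase clock cycle: every packet advances by one cell;
    the last cell's packet is delivered to the output (readout) node."""
    output = cells[-1]
    shifted = np.empty_like(cells)
    shifted[1:] = cells[:-1]
    shifted[0] = 0.0          # an empty well enters at the input end
    return shifted, output

for _ in range(len(charge)):
    charge, out = shift_one_cell(charge)
    print(out)                # packets emerge in order: 0, 0, 0, 1, 8, 3, 5
```

A real CCD's transfer efficiency is slightly less than unity, but as the text notes it is high enough that very little noise is introduced by the readout process.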
CMOS imager manufacturing is often adjusted somewhat to provide better characteristics for the photosensor, but the basic manufacturing process is shared with CMOS. The schematic for one form of the APS cell is shown in Fig. 6.11 [ElG05]. A photogate is used as the photosensor. This form uses a structure known as a pinned photodiode due to the additional layer of doping on top to control the pinned states at the surface. A transfer gate controls access to the charge on the photogate.
A pair of transistors are used to amplify the pixel value to the bit line: the bottom transistor is on when the row select line is high, allowing the top transistor to amplify the photogate output onto the bit line. The charge produced by the photogate is accumulated on the gate of the output transistor; the value of the pixel is determined by the integral of the illumination of the photodiode during the image exposure. When the reset line is high, the reset transistor reverses the bias of the photogate through the transfer gate and resets the photogate's value.
The performance of an infrared search and track (IRST) sensor depends on a large number of variables that are important for determining system performance. One of the variables is the pulse visibility factor (PVF). The PVF is linearly related to IRST performance metrics, such as signal-to-noise ratio (SNR) or signal-to-clutter ratio (SCR). Maximizing the performance of an IRST through a smart design of the sensor requires understanding and optimizing the PVF. The resulting peak, average, or worst-case PVF may cause large variations in the sensor SNR or SCR as the target position varies in the sensor field of view (FOV) and corresponding position on the focal plane.
As a result, the characteristics of the PVF are not straightforward. The definitions and characteristics of the PVF, including ensquared energy (best-case PVF), worst-case PVF, and average PVF, are provided as a function of Fλ/dcc (where dcc is the center-to-center distance between pixels, i.e., the pixel pitch).
Fλ/dcc is a generalized figure of merit that permits broad analysis of the PVF. We show the PVF trends when the target has a finite size but is still unresolved on the focal plane (smaller than an instantaneous field of view, IFOV). The target size was constrained to be no less than 2% of the IFOV but also no greater than 100% to study the effects on the PVF as a function of target size. Finally, we describe the characteristics of the PVF when optical degradations, such as aberrations, are inherent in the sensor transfer function. The results illustrate that a small Fλ/dcc with a large fill factor maximizes the PVF at the expense of greater variability.
A larger Fλ/dcc can reduce the PVF variations but results in a decreased PVF. Finite target sizes and additional optical degradation decrease the PVF compared to diffraction-limited systems.
Infrared image quality can be degraded by atmospheric aerosol scattering. Aerosol interactions depend upon the atmospheric conditions and wavelength. Measurements of an edge target at range in the LWIR under hot, humid weather provided a blur on the image plane, which is characterized by an MTF. In this experiment, the edge spread function, measured at range, was differentiated to obtain the line spread function and transformed into an MTF. By dividing the total measured MTF by the imager and turbulence MTFs, the aerosol MTF was obtained. Numerical analysis performed using MODTRAN and existing known scattering theory was compared with the experimental results. The measured and numerical results demonstrated a significant aerosol MTF, suggesting that the aerosol MTF should be included in sensor performance analysis.
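The measurement chain just described (edge spread function, differentiated to a line spread function, then transformed to an MTF) can be sketched numerically as follows. This is an illustrative outline under simplifying assumptions (uniform 1-D sampling, a synthetic Gaussian-blurred edge), not the authors' processing code.

```python
import numpy as np
from scipy.special import erf

def mtf_from_esf(esf, dx):
    """Illustrative sketch: ESF -> LSF (derivative) -> MTF (normalized |FFT|).
    esf: 1-D edge spread function samples; dx: sample spacing."""
    lsf = np.gradient(esf, dx)               # line spread function
    lsf = lsf / lsf.sum()                    # normalize so that MTF(0) = 1
    mtf = np.abs(np.fft.rfft(lsf))
    freqs = np.fft.rfftfreq(lsf.size, d=dx)  # spatial frequency axis
    return freqs, mtf

# Synthetic error-function edge (Gaussian blur, sigma = 2 samples) as a stand-in
# for the measured edge target; a real analysis would use the field data instead.
x = np.arange(-64, 64)
esf = 0.5 * (1.0 + erf(x / (2.0 * np.sqrt(2.0))))
freqs, mtf = mtf_from_esf(esf, dx=1.0)
# The aerosol MTF is then total_mtf / (imager_mtf * turbulence_mtf), as in the text.
```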
Target acquisition range predictions are based upon system MTF analysis. Johnson linked cycles on target (sounds like an MTF), N50, to detection, recognition, and identification (DRI). Over time, models changed with V50 replacing N50. Not surprisingly, the DRI V50 values changed as the models evolved.
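The N50/V50 values discussed here are normally used inside a target transfer probability function. A minimal sketch is shown below; the particular exponent form is the one commonly quoted for NVESD-style target-acquisition models and is an assumption here, not taken from the text.

```python
def p_task(v, v50):
    """Probability of accomplishing a DRI task given a resolvable-cycle/TTP value v
    and the v50 required for 50% probability. The exponent form below is the one
    commonly quoted for NVESD-style models (assumed here, not from the text)."""
    e = 1.51 + 0.24 * (v / v50)
    return (v / v50) ** e / (1.0 + (v / v50) ** e)

# Example: delivering twice the required V50 gives roughly an 80% task probability.
print(round(p_task(2.0, 1.0), 2))
```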
Current DRI V50 values appear to be target specific. What is changing is the detail size needed for DRI – not V50. Rather than have a V50 for each task, use V50 = 2 for detecting specific target features. As the feature size decreases, target detail becomes more prominent leading to recognition and identification. Finding the optimum design is an iterative decision process. Every step in the design process that has conflicting needs requires a trade study.
Trade studies indicate which component(s) affect acquisition range the most and the least. The variables include sensor parameters (focal length, aperture diameter, detector size, noise, and viewing distance) and scenario (target size, target/background contrast, line-of-sight jitter, atmospheric transmittance, and atmospheric turbulence). With an almost infinite number of trades possible, selecting the most important requires a priori knowledge of each parameter, the relationship to others, and its effect on acquisition range.
In previous studies, maximum acquisition range was achieved when Fλ/d approached 2. There was no constraint on magnification or field of view. This suggested that the detector size should approach λ/2 when F = 1. Night vision goggles typically have a fixed FOV of 40 deg with unity magnification. Digital night vision goggle (DNVG) acquisition range is limited by the human visual system resolution of 0.291 mrad (20/20 vision).
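The detector count and pixel size quoted next follow from simple geometry; here is a back-of-the-envelope check (illustrative only), using the 40 deg field of view and 0.291 mrad eye limit stated above.

```python
import math

# Back-of-the-envelope check of the DNVG numbers quoted in the text.
fov_rad = math.radians(40.0)              # 40 deg field of view, unity magnification
eye_limit_rad = 0.291e-3                  # human visual acuity, 20/20 vision

n_detectors = fov_rad / eye_limit_rad     # horizontal detectors needed to match the eye
focal_length_m = 0.0254                   # F = 1 with a 1-inch aperture -> 25.4 mm focal length
pixel_m = eye_limit_rad * focal_length_m  # detector size matched to the eye limit

print(round(n_detectors))                 # ~2399, i.e., "about 2500"
print(round(pixel_m * 1e6, 1))            # ~7.4 um, i.e., "about 8 um"
```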
This suggests the maximum number of horizontal detectors should be about 2500 with a minimum pixel size of about 8 μm when F = 1 and aperture = 1 inch. Values change somewhat depending upon f-number and noise level. Ranges are provided for GaAs and InGaAs detectors under starlight conditions. The different spectral responses create minimum resolvable contrast (MRC) test issues. Existing FLIR detection models such as NVThermIP and NV-IPM, from the U.S.
Army Night Vision and Electronic Sensors Directorate (NVESD), use only basic inputs to describe the target and background (area of the target, average and RMS temperatures of both the target and background). The objective of this work is to bridge the gap between more sophisticated FLIR detection models (of the sensor) and high-fidelity signature models, such as the NATO-Standard ShipIR model. A custom API is developed to load an existing ShipIR scenario model and perform the analysis from any user-specified range, altitude, and attack angle. The analysis consists of computing the total area of the target (m2), the average and RMS variation in target source temperature, and the average and RMS variation in the apparent temperature of the background. These results are then fed into the associated sensor model in NV-IPM to determine its probability of detection (versus range).
Since ShipIR computes and attenuates the spectral source radiance at every pixel, the black body source and apparent temperatures are easily obtained for each point using numerical iteration (on temperature), using the spectral attenuation and path emissions from MODTRAN (already used by ShipIR to predict the apparent target and background radiance). In addition to performing the above calculations on the whole target area, a variable threshold and clustering algorithm is used to analyse whether a sub-area of the target, with a higher contrast signature but smaller size, is more likely to be detected. The methods and results from this analysis should provide the basis for a more formal interface between the two models. Human visual system (HVS) “resolution” (a.k.a.
visual acuity) varies with illumination level, target characteristics, and target contrast. For signage, computer displays, cell phones, and TVs, a viewing distance and display size are selected. Then the number of display pixels is chosen such that each pixel subtends 1 arcmin.
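A short worked example of this one-arcminute rule follows; the viewing distance and display width below are illustrative assumptions, not values from the text.

```python
import math

# Worked example of the "one arcminute per pixel" display rule described above.
one_arcmin_rad = math.radians(1.0 / 60.0)      # ~0.291 mrad

viewing_distance_m = 2.5                        # assumed living-room TV viewing distance
display_width_m = 1.2                           # assumed display width (~55-inch class)

pixel_pitch_m = viewing_distance_m * one_arcmin_rad
n_pixels = display_width_m / pixel_pitch_m
print(round(pixel_pitch_m * 1e3, 2), "mm per pixel")   # ~0.73 mm
print(round(n_pixels), "pixels across the width")      # ~1650
```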
Resolution of low-contrast targets is quite different. It is best described by Barten’s contrast sensitivity function. Target acquisition models predict maximum range when the display pixel subtends 3.3 arcmin. The optimum viewing distance is nearly independent of magnification. Noise increases the optimum viewing distance. Panoramic imagers are becoming more commonplace in the visible part of the spectrum. These imagers are often used in the real estate market, extreme sports, teleconferencing, and security applications.
Infrared panoramic imagers, on the other hand, are not as common and only a few have been demonstrated. A panoramic image can be formed in several ways, using pan and stitch, distributed aperture, or omnidirectional optics. When omnidirectional optics are used, the detected image is a warped view of the world that is mapped on the focal plane array in a donut shape.
The final image on the display is the mapping of the omnidirectional donut-shaped image back to the panoramic world view. In this paper we analyze the performance of uncooled thermal panoramic imagers that use omnidirectional optics, focusing on range performance. Recent progress in small infrared detector fabrication has raised interest in determining the minimum useful detector size. We approach detector size analysis from an imaging system point of view, with reasonable assumptions for future sensor design. The analysis is a simplified version of the target task performance model using the parameter Fλ/d for generalization. Our figure of merit is a system characteristic. The results are easy to use and yield a minimum useful detector size of 2 μm for the mid-wave infrared region (MWIR) and 5 μm for the long-wave infrared region (LWIR) when coupled with an F/1 optical system under high signal-to-noise ratio conditions.
Final size depends upon optical design difficulty, manufacturing constraints, noise equivalent differential temperature, and the operational scenario. For challenging signal-to-noise ratio conditions and more reasonable F/1.2 optics, a 3 μm MWIR detector and a 6 μm LWIR detector are recommended. There are many benefits to approaching these detector sizes with low F-number optics. They include lower-cost detectors, no need for dual FOV or continuous zoom optics, and no need for dual F-number optics. Our approach provides the smallest volume and lowest weight sensor with maximum range performance. While this paper focuses on infrared design, our approach applies to all imaging sensors.
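These minimum detector sizes line up with the no-aliasing condition Fλ/d = 2, i.e., d = Fλ/2; the quick check below is an illustrative aside under that assumption.

```python
def min_detector_size_um(f_number, wavelength_um):
    """Detector size at which F*lambda/d = 2 (the no-aliasing condition assumed here)."""
    return f_number * wavelength_um / 2.0

# MWIR (~4 um) and LWIR (~10 um) with F/1 optics reproduce the 2 um and 5 um figures:
print(min_detector_size_um(1.0, 4.0), min_detector_size_um(1.0, 10.0))    # 2.0, 5.0
# With F/1.2 optics the geometric values are 2.4 um and 6.0 um; the text's 3 um MWIR
# recommendation is slightly larger to preserve signal-to-noise ratio.
print(min_detector_size_um(1.2, 4.0), min_detector_size_um(1.2, 10.0))    # 2.4, 6.0
```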
In the past five years, significant progress has been accomplished in the reduction of infrared detector pitch and detector size. Recently, longwave infrared (LWIR) detectors in limited quantities have been fabricated with a detector pitch of 5 μm.
Detectors with 12-μm pitch are now becoming standard in both midwave infrared (MWIR) and LWIR sensors. Persistent surveillance systems are pursuing 10-μm detector pitch in large format arrays. The fundamental question that most system designers and detector developers desire an answer to is: 'How small can you produce an infrared detector and still provide value in performance?' If a system is mostly diffraction-limited, then developing a smaller detector is of limited benefit. If a detector is so small that it does not collect enough photons to produce a good image, then a smaller detector is not much benefit. Resolution and signal-to-noise are the primary characteristics of an imaging system that contribute to targeting, pilotage, search, and other human warfighting task performance.
We investigate the task of target discrimination range performance as a function of detector size/pitch. Results for LWIR and MWIR detectors are provided and depend on a large number of assumptions that are reasonable. Point-and-shoot, TV studio broadcast, and thermal infrared imaging cameras have significantly different applications. A parameter that applies to all imaging systems is Fλ/d, where F is the focal ratio, λ is the wavelength, and d is the detector size. Fλ/d uniquely defines the shape of the camera modulation transfer function. The fully updated edition of this bestseller addresses CMOS/CCD differences, similarities, and applications, including architecture concepts and operation, such as full-frame, interline transfer, progressive scan, color filter arrays, rolling shutters, 3T, 4T, 5T, and 6T.
The authors discuss novel designs, illustrate sampling theory and aliasing with numerous examples, and describe the advantages and limitations of small pixels. This monograph provides the very latest information for specifying cameras using radiometric or photometric concepts to consider the entire system – from scene to observer. Numerous new references have also been added. Successful fusion combines salient features of each image to produce a new fused image with more 'information'. Although different sensors have different spatial resolutions, they tend to be detector-limited and this usually does not significantly affect fusion. The biggest problem with sensor fusion is that the number of detectors on each array is different. If the number of pixels on target is made equal (the desired design), then the fields-of-view are different.
This may affect operational effectiveness. If the fields-of-view are equal, then pixels on target are different. This accentuates phasing effects, increases target edge ambiguity, and overall makes fusion more difficult. Sampling artifacts are most noticeable with man-made objects and are pronounced with periodic targets (bar targets).
The sampling process creates an infinite number of new frequencies that were not present in the original scene. For an under-sampled system (which is characteristic of nearly all imaging systems), the replicated frequencies can overlap scene frequencies. An optical anti-alias filter can eliminate the overlapping but does not prevent frequency replication.
Frequencies above the Nyquist frequency are eliminated by an ideal reconstruction filter. As overlapping increases or with less than ideal reconstruction, the resultant image is distorted. The phases associated with the replicated frequencies violate linear-shift-invariant system requirements. As a result, movement of the scene with respect to the detector array creates ambiguity in edge locations further distorting imagery. While sampling theory suggests that sharp cutoff filters are required, these filters will create ringing (Gibbs phenomenon) in the image. Replicated spectra that appear in the image are called the spurious response.
Out-of-band spurious response (above Nyquist frequency) looks very similar to the input but with phase variation. The phase errors interfere with target recognition and identification. In-band spurious response (frequencies less than the Nyquist frequency) appears somewhat like noise. At this juncture it is not clear how this 'noise' interferes with recognition and identification tasks.
It may only affect detection. Since the sampling process replicates frequencies, it is possible to extract information with reconstruction band-pass filters whose center frequency is above the Nyquist frequency. Imaging system performance can be described in the spatial domain, where the optical blur diameter is compared to the detector size, or in the frequency domain modulation transfer function (MTF) approach, where the optics cutoff is compared to the detector cutoff. Both comparisons provide a metric that is a function of Fλ/d, where F is the focal ratio, λ is the wavelength, and d is the detector size.
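The relationship between the two cutoffs can be written out explicitly; the short sketch below (an illustrative aside) computes the diffraction cutoff 1/(λF), the detector cutoff 1/d, and their ratio Fλ/d for two hypothetical sensors.

```python
def fld(f_number, wavelength_um, detector_um):
    """Fλ/d: ratio of the detector cutoff (1/d) to the optics cutoff (1/(λF))."""
    optics_cutoff = 1.0 / (wavelength_um * f_number)   # cycles per micrometer
    detector_cutoff = 1.0 / detector_um                # cycles per micrometer
    return detector_cutoff / optics_cutoff             # = F * wavelength / d

# Example: an F/2 MWIR sensor (4 um) with 12 um pixels is detector-limited (Fλ/d < 1),
# while the same optics with 3 um pixels exceeds the Fλ/d = 2 no-aliasing point.
print(fld(2.0, 4.0, 12.0))   # ~0.67
print(fld(2.0, 4.0, 3.0))    # ~2.67
```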
Fλ/d is applied to three models: Schade's equivalent resolution (visible systems), Snyder's MTFA (MTF area; visible systems), and target acquisition (NVThermIP – an infrared system model). All models produced curves that exhibit a transition in the region 0.41 < Fλ/d < 2; when Fλ/d ≥ 2, no aliasing occurs. This may be important for medical imaging, where sampling artifacts may be interpreted as a medical abnormality, or for space probes, where it is impossible to obtain ground truth. Imaging system resolution depends upon Fλ/d, where F is the focal ratio, λ is the wavelength, and d is the detector size. Assuming a 100% fill factor, no aliasing occurs when Fλ/d ≥ 2. However, sampling artifacts are quite acceptable and most systems have Fλ/d well below 2. Super-resolution reconstruction (SRR) improves resolution by increasing the effective sampling frequency.
Target acquisition range increases, but the amount of increase depends upon the relationship between the optical blur diameter and the detector size. Range improvement of up to 52% is possible. Modern systems digitize the scene into 12 or more bits, but the display typically presents only 8 bits. Gray-scale compression forces scene detail to fall into a single gray level and thereby 'disappear.' Local area processing (LAP) readjusts the gray scale so that scene detail becomes discernible. Without LAP, the target signature is small compared to the global scene dynamic range, and this results in poor range performance. With LAP, the target contrast is large compared to the local background.
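One simple way to picture LAP is as a local mean-and-contrast normalization; the sketch below is only an illustrative stand-in for the idea (LAP algorithms in fielded systems are more sophisticated and are not described in the text).

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_area_contrast(image, window=31, gain=0.2):
    """Minimal sketch of local-area processing: remove the local mean and rescale
    by the local standard deviation so that small target-to-background differences
    survive 12-bit to 8-bit display compression. Illustrative only."""
    img = image.astype(np.float64)
    local_mean = uniform_filter(img, window)
    local_var = uniform_filter(img * img, window) - local_mean ** 2
    local_std = np.sqrt(np.maximum(local_var, 1e-6))
    normalized = (img - local_mean) / local_std           # local contrast
    out = 128.0 + 255.0 * gain * normalized               # map to the 8-bit display range
    return np.clip(out, 0, 255).astype(np.uint8)
```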
The combination of SRR and LAP significantly increases range performance. A simplified approach to target acquisition is presented which combines the optics performance (often specified by the Airy disk size) with the detector size. The variable is Fλ/d where F is the focal ratio, λ is the wavelength, and d is the detector size. The simplified approach allows plotting range as a function of aperture diameter, focal length, wavelength, detector size, field-of-view, or noise.
Assuming a 100% fill factor, no aliasing occurs when Fλ/d ⩾ 2. This suggests that the sampling theorem plays an important role in target detection. However, sampling artifacts are quite acceptable. Since real targets are aperiodic, relating the number of detectors to the sampling theorem should be avoided.
Likewise, the Airy disk size can be related to the detector size (Fλ/d), but trying to decide the required number of samples across the Airy disk as a design criterion should be avoided. Shannon's sampling theorem (also called the Shannon-Whittaker-Kotel'nikov theorem) was developed for the digitization and reconstruction of sinusoids. Strict adherence is required when frequency preservation is important. Three conditions must be met to satisfy the sampling theorem: (1) the signal must be band-limited, (2) the digitizer must sample the signal at an adequate rate, and (3) a low-pass reconstruction filter must be present. In an imaging system, the signal is band-limited by the optics. For most imaging systems, the signal is not adequately sampled, resulting in aliasing. While the aliasing seems excessive mathematically, it does not significantly affect the perceived image. The human visual system detects intensity differences, spatial differences (shapes), and color differences.
The eye is less sensitive to frequency effects and therefore sampling artifacts have become quite acceptable. Indeed, we love our television even though it is significantly undersampled. The reconstruction filter, although absolutely essential, is rarely discussed. It converts digital data (which we cannot see) into a viewable analog signal. There are several reconstruction filters: electronic low-pass filters, the display media (monitor, laser printer), and your eye. These are often used in combination to create a perceived continuous image.
Each filter modifies the MTF in a unique manner. Therefore, image quality and system performance depend upon the reconstruction filter(s) used. The selection depends upon the application. There have been numerous applications of super-resolution reconstruction algorithms to improve the range performance of infrared imagers.
These studies show there can be a dramatic improvement in range performance when super-resolution algorithms are applied to under-sampled imager outputs. Such improvements occur when the imager is moving relative to the target, which creates different spatial samplings of the field of view for each frame. The degree of performance benefit is dependent on the relative sizes of the detector/spacing and the optical blur spot in focal plane space. The blur spot size on the focal plane is dependent on the system F-number. Hence, in this paper we provide a range of these sensor characteristics for which there is a benefit from super-resolution reconstruction algorithms. Additionally, we quantify the potential performance improvements associated with these algorithms.
We also provide three infrared sensor examples to show the range of improvements associated with the provided guidelines. Automated test methods have been developed and implemented which provide a high degree of correlation with average manual test results using human observers. The results of this effort are given in this paper using the presently implemented MRT models. New data using the FLIR92 3-D noise model are also presented, which offer a more detailed description of sensor noise over previous implementations. This improves the accuracy of automated MRT and will facilitate testing of scanning time delay and integration (TDI) and staring array thermal imaging sensors in the future. Sampling is present in all electronic imaging systems. For scanning systems, the scene is sampled in the cross-scan direction by the discrete location of the detectors and by the A/D converter in the scan direction.
For staring arrays, the discrete location of the detectors samples the scene in both directions. Sampling creates both phasing effects and aliasing. Since the aliasing occurs at the detector, it cannot be avoided. After a signal has been aliased, it cannot be reconstructed. Aliasing and phasing effects are obvious when viewing periodic targets such as those used for system characterization. Aliasing and phasing effects become pronounced as the target frequency approaches the electronic imaging system's Nyquist frequency. Aliasing is not very obvious when viewing complex scenery and, as such, is rarely reported during actual system usage, although it is always present.
We have become accustomed to phasing effects and aliasing at the movies, on TV, and on computer monitors. These effects become bothersome when trying to perform scientific measurements. What you see visually is not what you get with an EO imaging system. Holst and Pickard experimentally determined that MRT responses tend to follow a log-normal distribution. The log-normal distribution appeared reasonable because nearly all visual psychological data are plotted on a logarithmic scale.
It has the additional advantage that it is bounded to positive values, an important consideration since probability of detection is often plotted in linear coordinates. Review of published data suggests that the log-normal distribution may have universal applicability. Specifically, the log-normal distribution obtained from MRT tests appears to fit the target transfer function and the probability of detection of rectangular targets. Thermal imaging system manufacturers have an extensive product line with many options available for each system. It is impossible to list all the systems or all the options.
Instead, the systems described in this paper are those presented at this conference by the following manufacturers: Agema, Amber Engineering, Bales Scientific, Cincinnati Electronics, David Sarnoff Research Center, Eastman Kodak, FLIR Systems, Inframetrics, ISI Group, Mitsubishi Electronics of America, and Santa Barbara Focal Plane. The systems are described in functional form to illustrate the similarities and differences. No attempt is made to rate them. Only the user can rate them when applied to his specific application. Minimum Resolvable Temperature Difference (MRTD) has long been the universally accepted standard measure of a thermal imaging sensor's performance.
This test is a complete evaluation of man and machine and is the best predictor, short of actual field evaluation, for determining the performance of the man/machine combination. Variables associated with the observer have generally been taken for granted. With the development of more sophisticated sensors for different applications, it is now time to analyze more closely the link between the sensor and observer and how it relates to the MRTD. This paper investigates the impact on MRTD results of observer variables such as monocular versus binocular viewing, a small amount of head movement, and varying viewing distance from the display. The high-frequency MRTD appears to be limited by the system's MTF and the amount of noise present. The low-frequency MRTD appears to be affected by viewing distance and the amount of low-frequency noise (non-uniformity) present.
This paper describes the work completed by Martin Marietta in support of the U.S. Army's standoff minefield detection system, advanced technology transition demonstration. This paper discusses the high priority and urgent need for the standoff mine detection system within the Army Combat Engineers; it presents the results of the successful application of non-developmental technology/hardware in an airborne mine/minefield detection system; and it discusses the significant payoff of applying advanced ATR and high-speed parallel processing. The technologies discussed include the IR imager as the source of mine imagery, advanced image processing algorithms including neural nets, and a high-speed parallel processor unique to Martin Marietta called GAPP (geometric arithmetic parallel processor). MRT target visibility is affected by the relative phase between the target location and the sampling lattice of thermal imaging systems.
Undersampled systems such as staring arrays are particularly susceptible to phasing effects. For input spatial frequencies which range from 0.6 to 0.9 times the Nyquist frequency, bar fidelity is lost and the MRT is increased at these spatial frequencies.
In apparent contradiction to Nyquist theory, it is possible to perceive MRT targets whose spatial frequencies are between Nyquist and 1.1 times the Nyquist frequency. In its current form, the NVL model does not adequately predict the laboratory-measured minimum resolvable temperature (MRT) values at low or high spatial frequencies. The differences between the measured and the predicted values are caused by inappropriate modeling of the eye, tremendous variability in observers, and ill-defined data analysis methodology. In the usual laboratory procedure, the observer is allowed to move his head. This, in effect, removes the eye's response from the model because, by adjusting his viewing distance, the observer appears to achieve equal detection capability at all spatial frequencies.
Recent studies provide two new eye models: one allowing head movement and one in which the head is stationary. An NVL-type model was modified to accept five different eye models: the original NVL eye model, the Sendall-Rosell model, the two new eye models, and the Campbell-Robson eye model. All the models provided the same shaped MRT curve at high spatial frequencies to within a constant. Large discrepancies exist at low spatial frequencies, presumably due to the inability to model the eye's inhibitory response. Imaging system design and performance depend upon a myriad of radiometric, spectral, and spatial parameters. The “bare bones” sensor consists of optics, detector, display, and an observer.
Range degrading parameters include 3D noise, optical blur, and pixel interpolation. Scenario parameters include detection, recognition, and identification probability, target contrast, target size, line-of-sight motion, and atmospheric conditions. Generally, the customer provides the scenario and the analyst optimizes sensor parameters to achieve maximum acquisition range. A wide variety of programs have been available in the past (e.g., SSCamIP, NVThermIP etc.). These programs have been consolidated into the Night Vision Integrated Performance Model (NVIPM).
For convenience, the calculations are performed in the frequency domain (MTF analysis). This is often called image chain modeling.
Although the math is sometimes complex, the equations are graphed for easy interpretation. NVIPM can easily perform trade studies and provides a gradient (sensitivity) analysis. Gradient analysis lists those parameters (in decreasing order) that affect acquisition range. This course consists of 6 sections: (1) the history of imaging system design and the transition from scanning arrays to staring arrays, (2) imaging system chain analysis covering MTF theory, “bare bones” system design, environmental effects (atmospheric attenuation, turbulence, and line-of-sight motion, a.k.a. jitter), sampling artifacts, and image processing, (3) detector responsivity, radiometry, various noise sources (photon, dark current, read) and the resulting SNR, (4) targets, backgrounds, and target signatures, (5) various image quality metrics, which include NVIPM, and (6) acquisition range and trade studies. By far, the most important section is the trade study graphical representations. Three optimization examples are provided (case study examples): long-range imaging, short-range imaging, and IRST systems. While the course emphasizes infrared system design, it applies to visible, NIR, and short-wave infrared (SWIR) systems.
From an optimization viewpoint, the only difference across the spectral bands is the target signature nomenclature. When considering hardware design, the spectral region limits lens material and detector choices. The test concepts presented apply to CCD/CMOS cameras, intensified CCD cameras, night vision goggles, SWIR cameras, and infrared cameras.
Using a systems approach, this course describes all the quantitative and qualitative metrics that are used to characterize imaging system performance. Laboratory performance parameters discussed include resolution, responsivity, random noise, uniformity, fixed pattern noise, modulation transfer function (MTF), contrast transfer function (CTF), minimum resolvable temperature (MRT), and minimum resolvable contrast (MRC). The eye’s spatial and temporal integration allows perception of images whose signal-to-noise ratio (SNR) is less than unity. Since most imaging systems spatially sample the scene, sampling artifacts affect all measurements and significantly affect MRT and MTF test results. Phasing effects are illustrated. Data analysis techniques are independent of the sensor selected (i.e., wavelength independent).
The difference lies in the input variable name (watts, lumens, or delta-T) and the output variable name (volts, lumens, or observer response). Field tests are extremely difficult. Differences between lab and field test approaches are provided with an estimate of anticipated field results. Real-world targets are significantly different from laboratory targets, and the illumination is quite different. This course describes the most common laboratory test techniques. Equally important is identifying those parameters that adversely affect results. Believable test results depend upon specifications that are testable, unambiguous, and provide a true measure of performance.
The imaging system analyst must be conversant in numerous diverse technologies. Each has a unique effect on system evaluation. This course highlights these technologies in 10 sections and is filled with numerous practical and useful examples. While the equations are provided, the concepts are presented graphically and with imagery. An engineering approach is taken: the 'bare bones' imaging system consists of illumination, optics, detector, and display.
The radiometry and photometry section compares calibration sources to real sources (sun, fading twilight, artificial sources). The optics/detector combination performance can be described in the frequency domain (MTF analysis) by the parameter Fλ/d, which is the ratio of the detector cutoff to the optics cutoff. Equally important, but often neglected, is sampling, an inherent feature of all electronic imaging systems. Sampling artifacts, which create blocky images, are particularly bothersome with periodic targets such as test targets and bar codes. Sampling cannot be studied in isolation but requires a reconstruction filter.
Sampling artifacts are illustrated through numerous images. For man-in-the-loop operation, the display and the eye are of concern and, in many situations, these limit the overall system performance.
The impact of viewing distance on the image quality of displays, TVs, computers, cell phones, and halftones is discussed. A point-and-shoot camera appears simplistic from the outside. However, as shown in an example, modeling can be quite complex. The math, statistics, and data analysis section covers the validity of approximations, central limit theorem, Gaussian statistics, decision theory, and the receiver operating curve (ROC). Included are different ways to graph data and, as an example, the margin of error reported in political polls. System resolution (a.k.a.
image quality) can be inferred from Schade's equivalent resolution, which is a function of Fλ/d. Atmospheric transmittance and glare (via the sky-to-ground ratio) are discussed. Target acquisition is presented with a simplified, back-of-the-envelope approach using Fλ/d. Early systems had 'large' detectors (Fλ/d ≪ 1).
Thermal imaging system applications range from construction applications to electrical and mechanical inspections. They include search and rescue, endangered species monitoring, border patrol, law enforcement, military applications, and surveillance of people and objects. This course explains how nondestructive testing can locate flaws on commercial aircraft. A thermal imaging system that evaluates the condition of power lines, transformers, circuit breakers, motors, printed circuit boards, and other electronic components is described. Heat transfer, radiation theory, and emissivity are the parameters that define the target signature. The environment (sun, wind, or other hot targets) may further modify the target signature.
This course presents these characteristics and applications.