One important consideration when designing any vision system is the relationship between pixels in a camera system and angles and distances in the external world being imaged. This is important if, say, you want to know how many pixels wide an object would be at a certain distance, or how to convert an optical flow odometry measurement into a linear odometry measurement. I'll try to cover these matters in this post.

Before we begin, one assumption: I'm going to assume that all angles are "relatively small," so that a first-order approximation can be used for trigonometric functions. For small angles theta, we can use the approximations sin(theta) ≈ theta and tan(theta) ≈ theta.

For angles, I am going to use radians as a unit of measure rather than degrees. Together with the small angle assumption, this simplifies the arithmetic: angles and distances can be related with a single multiply or divide. For example, a 10m tall tree located 100m away from you will have an angular size of about 10/100 = 0.1 radians from your perspective. For someone else 1km away, that tree is just 0.01 radians, or 10 milliradians, in size.
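The tree example can be checked directly in a few lines of Python (the function name here is my own, purely for illustration):

```python
# Angular size under the small-angle approximation:
# linear size divided by distance, giving radians.
def angular_size_rad(size_m, distance_m):
    return size_m / distance_m

print(angular_size_rad(10, 100))   # the 10 m tree at 100 m: 0.1 rad
print(angular_size_rad(10, 1000))  # the same tree at 1 km: 0.01 rad
```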

The figure above shows a geometric model of a camera system. Two rays of light, from opposite ends of an object, travel through a lens and strike two adjacent pixels of an image sensor. For most practical lens systems, we can model the lens as a "pinhole" from a classical pinhole camera. Let p be the "pitch" between pixels on the image sensor chip. On a Tam2 image sensor chip, this is 84 microns. Next let f denote the "focal length" of the lens. In our model this is the distance from the image sensor chip to the pinhole that models the lens. On a conventional camera lens, which is generally constructed with multiple lens elements, the focal length of that lens refers to the distance of such a "virtual pinhole" from the image sensor. For the small lens mounted onto the Tam2 sensors for some of the ArduEye boards, the effective value of f is about 0.9mm or 900 microns.

Let us now consider dimensions on the outside of the camera. Let r denote the "range" or the distance from the camera lens to an object being viewed. Let d denote the size of the object. Finally, let angle alpha denote the angular size of the object as viewed from the camera.

Using our first-order approximation, and taking the object in the figure to span exactly one pixel pitch on the sensor, the five variables above follow this relationship:

alpha = p/f = d/r

In the context of a simple camera system using a Tam2 ArduEye shield, we can now relate linear distances to angular sizes and angular rates by applying the above equation. For a Tam2 ArduEye, the value alpha = (84 microns)/(900 microns) = about 0.093 radians, or 93 milliradians.
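That per-pixel angle can be computed directly from the numbers given above (a small Python sketch; variable names are mine):

```python
pixel_pitch_um = 84.0    # Tam2 pixel pitch, in microns
focal_length_um = 900.0  # effective focal length of the ArduEye lens, in microns

# Angle subtended by one pixel pitch: alpha = p / f, in radians
alpha = pixel_pitch_um / focal_length_um
print(alpha)  # about 0.093 rad, i.e. 93 milliradians per pixel
```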

Suppose this camera system is imaging an object about 3cm away. Perhaps the Tam2 ArduEye is mounted on a ground robot to look at the ground from a height of 3cm. Two adjacent pixels in the ArduEye will respond to two points about 0.093 * (3cm) = 2.8mm apart. You can similarly think of the pixel array on the Tam2 chip as being projected onto the ground in a grid with 2.8mm spacing.

This is useful to know: if two adjacent pixels project to points about 2.8mm apart on the ground, the texture on the ground must be at least that coarse for the pixel array to detect meaningful information. Typical asphalt with, say, 1cm chunks should be readily imaged. However, if the ground consists of fine-grained sand, you probably will not be able to pick up any meaningful texture.

Now let's say the ground robot travels forward a distance of 1 meter. Since one pixel projects to about 2.8mm on the ground, a distance of 1 meter corresponds to 1 / 0.0028 ≈ 357 pixels of motion. If the robot is traveling at 1 meter per second, then the camera will experience optical flow of about 357 pixels per second! Now you know why optical mouse sensors typically run at several thousand frames per second!
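The ground-projection and optical-flow numbers above follow from the same relationship (again a hedged sketch, with variable names of my own choosing):

```python
alpha = 84.0 / 900.0        # rad per pixel for the Tam2 ArduEye
height_m = 0.03             # camera height above the ground, 3 cm

# Spacing between adjacent pixel footprints projected onto the ground
ground_spacing_m = alpha * height_m
print(ground_spacing_m)     # about 0.0028 m, i.e. 2.8 mm

# Optical flow rate for a robot moving at 1 m/s over that ground
speed_mps = 1.0
flow_px_per_s = speed_mps / ground_spacing_m
print(flow_px_per_s)        # about 357 pixels per second
```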

Similarly, suppose a person 2m tall is standing 10m away from the sensor. This corresponds to an angular size of 0.2 radians. Since one pixel is about 0.093 radians, the person will appear to be about 0.2 / 0.093 ≈ 2.15 pixels, or just over two pixels, tall.
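And the person example, worked in the same style:

```python
alpha = 84.0 / 900.0       # rad per pixel for the Tam2 ArduEye
person_height_m = 2.0
range_m = 10.0

angular_size = person_height_m / range_m  # 0.2 rad
height_px = angular_size / alpha
print(height_px)           # just over two pixels tall
```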

It is obviously very useful to consider the pixel pitch p and the focal length f when using a camera system, including a simple ArduEye or other optical flow sensor, for an application. First you need to make sure that the pixels are close enough together (in angle) to pick out meaningful texture. Second you need to make sure that for any visual motion you may experience, the resulting "pixels per second" of motion can be adequately handled.
