Camera Lenses Part 5: Perspective

Capturing still and video imagery requires that the shooter understand how a three-dimensional object will appear on a two-dimensional display.

In everyday speech, perspective is another word for having a point of view. In the image forming sense, the same is literally true, because perspective depends on the position of the point from which the image is captured and nothing else.

Human vision differs from that of a conventional camera in a number of ways. We have two eyes whereas most cameras have but one and display what we call cyclopean vision, after the mythical Cyclops that had one eye. The focal length of the human eye is fixed, whereas cameras have interchangeable or zoom lenses.

Having grown up with stereoscopic vision and eyes of fixed focal length, and in that process having learned the relationship between what we see and attributes such as size, mass, orientation, shape and distance, it is inevitable that when faced with cyclopean images obtained with a different focal length we will be in error when we try to assess such attributes.

The cyclopean camera cannot reproduce the disparities between the images seen by two eyes, from which a great deal of depth information is derived. A camera is a mapping device that converts the three-dimensional world to a two-dimensional representation. Information must be lost in shedding a dimension and in a conventional image the loss is mostly depth or distance information. The loss is not total, because some depth clues can survive a two-dimensional representation. Nevertheless the representation of depth must be diminished to an extent that varies on a case by case basis. A good cameraman takes steps to put depth clues back again or at least minimise the loss.

For example, if the atmosphere is damp or dusty, objects further away will be less distinct. If one object obscures another, we can be certain the former is closer than the latter, even if we cannot say by how much. Objects diminish in size with distance, and if we know how big they are, we can sometimes judge distance. However, we have all experienced how a jumbo jet in flight, seen from the ground, appears hardly to be moving. Because it is surrounded by nothing but air, we cannot judge the size of the airplane and we wrongly think it is much smaller, nearer and slower than it really is.

Looking out of a moving train, we have all seen how the distant background seems to be moving with us, whereas the closer objects are, the faster they pass through our field of view. This is a depth clue that is available to television and cinema but clearly not to photography. By moving the camera in the plane of the image, even quite slowly, the relative motion of objects reveals their distances.
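The train window effect can be put into rough numbers. The Python sketch below assumes the object is directly abeam of a camera moving sideways, in which case its angular speed across the field of view is simply the camera speed divided by the distance. The function name and the example speeds and distances are illustrative assumptions, not figures from any measurement.

```python
import math


def angular_speed_deg_per_s(camera_speed_m_s, distance_m):
    """Angular speed of an object directly abeam of a sideways-moving camera.

    Assumes the object is at its closest point of approach, where the
    angular speed in radians per second is speed / distance.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(camera_speed_m_s / distance_m)


if __name__ == "__main__":
    train_speed = 30.0  # metres per second, an assumed example value
    # Assumed example distances: trackside fence, nearby tree, distant hill.
    for label, distance in [("fence", 5.0), ("tree", 50.0), ("hill", 3000.0)]:
        rate = angular_speed_deg_per_s(train_speed, distance)
        print(f"{label:5s} at {distance:7.1f} m sweeps past at {rate:8.3f} deg/s")
```

The nearer the object, the faster it crosses the frame, which is why even a slow tracking move separates foreground from background.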

One of the errors made in assessing artificial images is known as perspective distortion. In the physical world this does not exist. Correctly tracing rays from solid object to flat image through an imaginary pinhole cannot introduce distortion. It exists in the world of image reproduction simply because cameras having lenses of different focal lengths allow us to see objects from unnatural distances.

Despite acres of text trying to intellectualise it, understanding perspective is desperately simple and consists of nothing more than grasping that the closer something is, the bigger the angle it subtends to eye and camera alike. Figure 1 shows a closed door that we see from a point aligned with its centre so the image we perceive is a rectangle. If the door is now partially opened towards us, the free edge of the door moves closer to us and subtends a larger angle, whereas the hinged side clearly does not. The image of the door now has the shape of a trapezium, but we know from experience that doors do not actually change shape and that this trapezium indicates instead that the door is no longer square to us but presents an angle.

Figure 1. At a) a closed door seen on axis is rectangular. If the door is opened and shot from very close by, the perspective is strong b), reduced at a “normal distance” c) and flattened at an extreme distance d).

Figure 1 also shows that the extent to which the shape of the door changes is determined only by viewpoint. If we are very close, the near edge becomes very near and subtends a much larger angle. If we are far away, the distances to the two sides of the door, and therefore the angles they subtend, are almost the same. Having a greater field of view, a wide angle lens simply allows us to see the whole door from a shorter distance than our eyes would allow. A telephoto lens allows us to see the door from a distance in better detail than our own eyes would allow. The differences in perspective in comparison with what we would see with our own eyes are called perspective distortion. It is not the focal length of the lens that causes the distortion, but the location of the camera that the lens permits.
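The door of Figure 1 can be checked with a few lines of arithmetic. The Python sketch below places the viewer on the axis of the closed door at mid-height, swings the door towards the viewer about its hinge, and compares the angle subtended by the free edge with the angle subtended by the hinged edge. The door dimensions, opening angle and function names are assumptions chosen for illustration. Note that no focal length appears anywhere in the calculation: only the camera position matters.

```python
import math


def edge_angle(height, x, z):
    """Vertical angle (radians) subtended by an upright edge of given height.

    The viewer is at the origin, level with the middle of the edge, and
    (x, z) is the horizontal position of the edge.
    """
    horizontal_range = math.hypot(x, z)
    return 2.0 * math.atan((height / 2.0) / horizontal_range)


def door_edge_ratio(distance, width=0.8, height=2.0, opening_deg=45.0):
    """Ratio of free-edge angle to hinged-edge angle for a partly open door.

    distance is measured from the viewer to the plane of the closed door.
    The default width, height and opening angle are assumed example values.
    """
    theta = math.radians(opening_deg)
    if distance <= width * math.sin(theta):
        raise ValueError("viewer is too close: the door would swing past them")
    hinge_x, hinge_z = -width / 2.0, distance
    free_x = hinge_x + width * math.cos(theta)
    free_z = distance - width * math.sin(theta)
    return edge_angle(height, free_x, free_z) / edge_angle(height, hinge_x, hinge_z)


if __name__ == "__main__":
    # Very close, a conversational sort of distance, and far away.
    for d in (1.0, 3.0, 30.0):
        print(f"viewer at {d:5.1f} m: near edge looks {door_edge_ratio(d):.3f}x taller")
```

A ratio well above one corresponds to the strong trapezium of Figure 1b, while a ratio close to one corresponds to the flattened door of Figure 1d.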

The moon appears to be a flat disc because the difference in distance between the middle and the edge is negligible. The glass paperweight on my desk looks spherical from arm's length, and it continues to look spherical even if I impersonate a Cyclops and close one eye, because I have learned the significance of the shape of reflections of the window. Figure 2 shows that incorporating a clue of that kind is one way of conveying depth in two-dimensional images.

Figure 2. My spherical paperweight would look like a flat disc in a photographic image, but the reflections of the window convey a non-stereoscopic depth clue.

Television, cinema and photography alike have in common the frequency with which human subjects are found in the images. It is therefore surprising that so little seems to be understood about the effects that imaging has on the human subject. Humans prefer to converse at a closely defined distance. Not small enough to invade personal space, and not so large as to prevent expressions being recognised or to diminish the disparities between the right and left eyes from which depth information is obtained. We remember what people look like at that distance.

People checking themselves in a mirror will typically stand at half that distance so the distance to the virtual image is the same. If we try to shoot a human from a different distance, the results may not be successful. Shooting from too close results in exaggerated noses and invisible ears; shooting from too far exaggerates the ears and diminishes the nose. There is no substitute for having the camera in the right place.
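The effect of shooting distance on a face can be estimated in the same way as the door. The Python sketch below assumes that apparent size is inversely proportional to distance, which holds for small angles, and that the tip of the nose is 0.12 m closer to the camera than the ears. That depth is a round figure chosen for illustration, not a measurement, and the function name is hypothetical.

```python
def nose_to_ear_scale(camera_to_nose_m, nose_to_ear_depth_m=0.12):
    """How much larger the nose is rendered relative to the ears.

    Assumes apparent size is inversely proportional to distance.
    The default 0.12 m nose-to-ear depth is an assumed round figure.
    """
    if camera_to_nose_m <= 0:
        raise ValueError("camera distance must be positive")
    camera_to_ear_m = camera_to_nose_m + nose_to_ear_depth_m
    return camera_to_ear_m / camera_to_nose_m


if __name__ == "__main__":
    # Assumed example distances: too close, conversational, too far.
    for d in (0.3, 1.2, 5.0):
        excess = (nose_to_ear_scale(d) - 1.0) * 100.0
        print(f"camera {d:3.1f} m from the nose: nose rendered {excess:4.1f}% larger than the ears")
```

Whatever the true depth of a given face, the trend is the same: the closer the camera, the more the nose is enlarged relative to the ears, and no choice of lens can undo that.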

In these days of feminism and equality between the sexes, it gives me no pleasure to inform the reader that the distortion of the conventional cyclopean camera does greater harm to the image of the female than it does to the male to the extent that it can be accused of chauvinism.

We are all aware that of the two sexes women are the more reluctant to have photographs taken, and when they see such photographs they are less likely to express any satisfaction. They are not making it up and there is science behind the difference. They truly do not see in the photograph what they saw in the mirror, and it’s not the mirror that is at fault.

There are a number of mechanisms at work here. One of these is that the loss of depth information in the conventional camera is not at all beneficial to the female form, flattening instead of flattering, whereas the male, having less modulation of the z-axis, is unaffected. The flattening effect of cyclopean vision is present at normal subject distances. The last thing a girl needs is further loss of depth information by shooting from excessive distances with long lenses, with the crime completed by lighting that leaves no shadows.

Figure 3. Seen with both eyes, only area A of the background is invisible, but with the left eye alone, area B is invisible. The size of an object is partially judged by the way it obscures the background.

One way the human visual system assesses mass is by the extent to which an object obscures the background. Figure 3 shows that with stereoscopic vision, more of the background is visible because one eye can see what the other cannot. The effect becomes stronger as an object gets smaller; indeed, with very small objects one eye or the other can see the whole background. Thus when objects have a waist, the waist appears slimmer when viewed stereoscopically than it does in the cyclopean view.
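The geometry of Figure 3 can be sketched numerically. The Python example below treats the object as a flat strip centred between the eyes, with a flat background behind it, and compares the width of background hidden from a single eye with the width hidden from both eyes at once. The 0.065 m eye separation is a commonly quoted typical figure used here as an assumption, and the function names and example dimensions are illustrative only.

```python
def hidden_width_one_eye(obj_width, obj_dist, bg_dist):
    """Width of background hidden from a single eye (the cyclopean view)."""
    return obj_width * bg_dist / obj_dist


def hidden_width_two_eyes(obj_width, obj_dist, bg_dist, eye_sep=0.065):
    """Width of background hidden from both eyes simultaneously.

    Each eye casts its own shadow of the object on the background; only
    the overlap of the two shadows is invisible to a two-eyed viewer.
    The default eye separation is an assumed typical value in metres.
    """
    if not 0 < obj_dist < bg_dist:
        raise ValueError("object must lie between the viewer and the background")
    k = bg_dist / obj_dist
    return max(0.0, obj_width * k - eye_sep * (k - 1.0))


if __name__ == "__main__":
    # Assumed examples: a 0.30 m waist and a 0.02 m pencil, background 6 m away.
    for label, width, dist in [("waist", 0.30, 2.0), ("pencil", 0.02, 1.0)]:
        one = hidden_width_one_eye(width, dist, 6.0)
        two = hidden_width_two_eyes(width, dist, 6.0)
        print(f"{label:6s}: one eye hides {one:.3f} m, two eyes hide {two:.3f} m")
```

The two-eyed figure is always the smaller, and for a sufficiently narrow object it falls to zero, which is the case where one eye or the other can see the whole background.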

Once again the results are sexist. People’s necks look wider on the screen than they do in real life, making guys look more muscular and making girls look overweight. Girls’ waists don’t look as slim on the screen as they do in real life. This distortion of body mass index is measurable. In tests where subjects were asked to estimate body weight from photographs of people shot stereoscopically and conventionally, the estimates were always heavier for the cyclopean shots and by a larger margin for women. This time it’s fattening instead of flattering.

The camera never lies? – Tell that to the Marines. Half of the world’s population knows that TV, film and photographic images generally don’t do them justice. Unfortunately the equipment is typically designed and used by the other half of the population.

There is no shortage of evidence: the success of pneumatic actresses, the foray of Howard Hughes into bra design, the rise of the silicone implant and excess dieting in film and television actresses, with consequent health risks, all because the imaging technology we use causes body mass distortion that is sexually discriminating.

Figure 4. How not to do it. Facing the camera squarely takes away depth clues and the result is not flattering.

What’s a girl to do? Most importantly, get into the habit of looking in the mirror with one eye closed, because that’s how the camera sees. Seen with one eye, that black dress is a disaster, absorbing all depth clues, as are tops with fussy patterns that act like camouflage. On the other hand a stripy top provides depth clues even to a Cyclops. Never ever stand facing the camera squarely like a rabbit in headlights (Figure 4). Instead try to show the camera a silhouette as in Figure 5. It can’t get that wrong. Finally, if someone starts shooting from too far away, simply take on the persona of Miss Piggy.

Figure 5. Same camera, same girl, same clothes, same light, but the incorporation of silhouette puts back some depth clues.
