When we sit back to watch a movie, we rarely think of the mechanics behind it. We settle down, sip a soft drink, munch some popcorn and keep our eyes forward. That might seem like all that’s going on, but the brain, and particularly the eyes, are getting a workout of epic proportions. Our eyes are constantly darting around, exploring the frame and trying to process what they’re seeing and how it moves.
When an audience watches a film, their eyes tend to move in unison. This is a characteristic of film viewing; it is not typical of real-world vision. Rather, filmmakers use editing, framing, and other techniques to tightly control where we look. Over 125 years, the global filmmaking community has been engaged in an informal science of vision, conducting a large number of trial-and-error experiments on human perception. The results are not to be found in any neuroscience or psychology textbook, though you can find some in books on cinematography and film editing, and in academic papers analyzing individual films. Other insights are there in the films themselves, waiting to be described. In recent years, professional scientists have started to mine this rich, informal database, and some of what we have learned is startling.
To understand how the eyes are affected by movies, you need to know a bit about how they work outside the theatre. When we are just living our lives, our eyes jump from one location to another two or three times per second, taking in some things and skipping over others. Those jumps are called saccades. (Our eyes also make smooth tracking movements, say when we are following a bird in the sky or a car on the road, but those are somewhat rare.) Why do we do this? Because our brains are trying to build a reasonably complete representation of what is happening using a camera – the eye – that has high resolution only in a narrow window. If any visual detail is important for our understanding of the scene, we need to point our eyes at it to encode it.
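For readers curious how saccades are picked out of an eye-tracking record, here is a minimal sketch of one common approach, a velocity threshold: any stretch where the gaze moves faster than some cutoff is treated as a jump, and everything else as a fixation. The function name, the input format (gaze positions in degrees of visual angle, sampled at a fixed rate) and the cutoff of 30 degrees per second are illustrative assumptions, not details taken from any study described here.

```python
import math


def detect_saccades(samples, sample_rate_hz, velocity_threshold=30.0):
    """Return (start_index, end_index) pairs for stretches classified as saccades.

    samples: list of (x, y) gaze positions in degrees of visual angle,
             recorded at a fixed sampling rate (an assumed input format).
    velocity_threshold: speed in degrees per second above which movement
             counts as a saccade (30.0 is an illustrative choice).
    """
    dt = 1.0 / sample_rate_hz
    saccades = []
    start = None  # index of the sample where the current saccade began

    for i in range(1, len(samples)):
        (x0, y0), (x1, y1) = samples[i - 1], samples[i]
        speed = math.hypot(x1 - x0, y1 - y0) / dt  # degrees per second

        if speed > velocity_threshold:
            if start is None:
                start = i - 1  # the jump begins at the previous sample
        elif start is not None:
            saccades.append((start, i - 1))  # the jump ended at the previous sample
            start = None

    if start is not None:  # the recording ended mid-saccade
        saccades.append((start, len(samples) - 1))
    return saccades


# A short made-up recording at 100 samples per second: the gaze rests,
# jumps four degrees to the right, then rests again.
recording = [(0, 0), (0, 0), (0, 0), (2, 0), (4, 0), (4, 0), (4, 0)]
jumps = detect_saccades(recording, sample_rate_hz=100)
```

In this made-up recording the only jump runs from the third sample to the fifth. Real recordings are noisier, so practical tools typically smooth the data and handle blinks before applying a rule like this.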
The way people use eye movements to explore a scene has a consistent rhythm that involves switching between a rapid exploratory mode and a slower information-extraction mode. Suppose you check into a resort, open a window, and look out on a gorgeous beach. First, your eyes will rapidly scan the scene, making large movements to fix on objects throughout the field of what you can see. Your brain is building up a representation of what is there in the scene – establishing the major axes of the environment, localizing landmarks within that space, categorizing the objects. Then, you will transition to a slower, more deliberate mode of seeing. In this mode, your eyes will linger on each object for longer, and your eye movements will be smaller and more deliberate. Now, your brain is filling in details about each object. Given enough time, this phase will peter out. At this point, you might turn to another window and start all over again, or engage in a completely different activity – writing a postcard or unpacking.
The fact that our eyes transition from an exploratory mode to an information-extraction mode, without anything new happening in the scene, tells us that visual behavior is guided by powerful internal control mechanisms.
The first efforts to understand these mechanisms took place in the 1970s, when the psychologists Julian Hochberg and Virginia Brooks at Columbia University studied the transition from rapid exploration to slower information-extraction. To do that work, they created simple movies: slideshows made of still shots of simple abstract patterns, natural scenes and line drawings that told little stories. By getting rid of the motion within each shot, they could carefully quantify which features drove the eyes. Then, the experimenters asked people to watch the slideshows while their eyes were tracked with a special infrared camera setup.
In these initial studies, Hochberg and Brooks documented the switch from an exploratory phase lasting a few seconds at most to an information-extraction phase. The duration of each exploratory phase depended on the complexity of the slide being viewed. When presented with more complex pictures, people explored for longer before they settled down. And when they were given a choice between looking at more complex and less complex images, they spent more time looking at the more complex images. Later researchers investigated which visual features specifically draw the eyes, finding that viewers tend to look at parts of an image with edges, with a lot of contrast between light and dark, with a lot of texture, and with junctures such as corners.
In natural scenes, there are usually several locations with features that draw the eyes. Your beach scene might have an umbrella or two with high contrast, a few palm trees with a distinctive texture, and a few chairs with lots of edges and corners. During the exploratory phase, each person viewing the scene will tend to land on most of these focal points, but exactly when and for how long they hit each focal point will vary from person to person depending on differences in their internal control systems. For example, a surfer might be drawn to a surfboard leaning against a tree, while a sailor’s eye is captured by a boat on the horizon.
One difference between real-world scenes and film is that movies move. How does this change what people look at? In a recent experiment, Parag Mital, Tim Smith, Robin Hill and John Henderson from the University of Edinburgh recorded eye movements from a few dozen people while they watched a grab-bag of videos, including ads, documentaries, trailers, news, and music videos. A number of effects carried over from looking at still pictures. People still look at places with a lot of contrast, and at corners. However, with moving pictures, new effects dominate: viewers look at things that are moving, and at things that are going from light to dark or from dark to light. This makes good ecological sense: things that are changing are more likely to be relevant for guiding your actions than things that are just sitting there. In particular, the eyes follow new motion that could reveal something that you need to deal with in a hurry – an object falling or an animal on the move.
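The idea of "things going from light to dark or from dark to light" can be made concrete with a toy calculation: compare two consecutive frames pixel by pixel and see where the brightness changed most. The sketch below is only an illustration of that idea, not the analysis used in the Edinburgh experiment; the function names and the representation of a frame as a grid of brightness values from 0 to 255 are assumptions made for the example.

```python
def flicker_map(prev_frame, next_frame):
    """Absolute brightness change at each pixel between two grayscale frames.

    Each frame is a list of rows, and each row is a list of brightness
    values (an assumed representation chosen for simplicity).
    """
    return [
        [abs(after - before) for before, after in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]


def most_changed_pixel(prev_frame, next_frame):
    """Return the (row, column) of the pixel whose brightness changed the most."""
    changes = flicker_map(prev_frame, next_frame)
    positions = (
        (r, c) for r, row in enumerate(changes) for c in range(len(row))
    )
    return max(positions, key=lambda rc: changes[rc[0]][rc[1]])


# Two tiny made-up frames: a uniformly dim scene in which one pixel
# suddenly becomes bright.
frame_a = [[10, 10, 10],
           [10, 10, 10]]
frame_b = [[10, 10, 10],
           [10, 200, 10]]

hotspot = most_changed_pixel(frame_a, frame_b)
```

Here the hotspot is the middle pixel of the second row, the only place where anything changed. A serious model would combine a signal like this with contrast, edges and motion direction rather than rely on brightness change alone.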
Motion onsets are known to powerfully capture attention, even more quickly than the eyes can move. For example, when we first see Edward Scissorhands in the 1990 Tim Burton film of the same name, he is attempting to hide in shadow in a complex scene. It is the involuntary movement of his scissors that gives him away, attracting the viewer’s eye at the same moment it attracts the eye of Peg the Avon Lady.
In short, film editing alters pretty much everything about how we control our eyes: when they move, where they move, and when they blink. Thus, watching the film is a dance between the filmmakers – especially the editor – and your visual system. If the filmmakers lead well, your eyes will follow effortlessly, smoothly coordinating the ideas presented by the filmmaker with the perceptual machinery you bring to bear. If not, the dance might be awkward, comic, or even anxiety-provoking. Filmmakers sometimes do this on purpose: if I am trying to make you laugh or scare you, I might try to lead your eyes in a way that makes it harder to figure out what’s going on; I might give you nothing to move to or too many things. The film has the option of playing against our perceptual systems rather than playing nice with them.
Controlling where and when viewers look is important from a practical point of view: filmmakers can save money on the parts of the frame we don’t look at, just as home builders can save money by not painting the attic. Over time, filmmakers develop informal intuitions and rules of thumb that work. By measuring viewers’ behaviors and physiology in real time while they watch, perhaps my field can help accelerate this process and bring it more into the light.