The Organ and the Vacuum Cleaner (Bresson, the Devil, the voice-over and other things)

By Serge Daney

An attempt to assemble critical pieces approaching cinema “from the standpoint of sound”. This article was first published in Cahiers du cinéma, issue 278-280, Aug-Sept 1977 and reprinted in La Rampe: Cahiers critique 1970-82 (Cahiers du cinéma/Gallimard, 1983). Published in English in Literary Debate: texts and contexts, volume 2, edited by Denis Hollier and Jeffrey Mehlman, The New Press, 1999. Translated by Arthur Goldhammer.

“… given that air being a heavy body, and therefore (according to the system of Epicurus) continually descending, it will descend even more so, when loaded and pressed down by words; which are also bodies of much weight and gravity, as it is manifest from those deep impressions they make and leave upon us; and therefore must be delivered from a due altitude, or else they will neither carry a good aim, nor fall down with a sufficient force.” (Swift)

I would like to describe the sound setup (dispositif sonore) in a scene close to the beginning of Robert Bresson’s 1977 film The Devil Probably. The scene in question is the one in which Charles and his friends enter a church (we have already seen them hooted out of a political rally) and immediately find themselves involved in a rather lugubrious debate, whose subject, we quickly discover, is postconciliar Catholicism. How can I describe this scene (or fragment of a scene: Bresson’s films have long since dispensed with scenes) from the standpoint of sound?

– Nothing prepares for it. At no point does the viewer anticipate where things are going. We are quickly (all too quickly for those who did not like the film) plunged into the middle of a debate, which, because it is reduced to this middle, is immediately denaturalized.

– There are no sides in the debate; everyone is against everyone else. It might be better to call it a round of speeches rather than a debate, and those speeches are delivered in an irritatingly toneless, zombielike manner. Or perhaps one should call it a series of questions without pause for response or reply.

– Everyone speaks, but each person utters only a single sentence. Each sentence is punctuated by a loud, sustained organ note. The vehemence lacking in the words seems to have been shifted to these impromptu interruptions. In a previous quick shot, we caught a glimpse of the organist as he sat down at his instrument and lifted the cover of the keyboard.

– In addition to these two sounds, that of the organ above and the discussion below, each oblivious of the other, there is a third sound, of a vacuum cleaner being run over a red carpet.

What holds this fragment together? Where is the thread, the logic? Not in the presumed psychology of the characters (Charles has supposedly decided to join this debate) or in the dramatization of the scene (in which he is supposed to have had a hand). It lies elsewhere, namely, in the fact that from the moment Charles and his friends enter the church, they are caught up in a random, heterogeneous system of sounds, a montage consisting of the debate, the organ, and the vacuum cleaner, which literally disposes of them. This Bressonian heterology consists of three terms: the high (organ), the low (debate), and a third term that destroys the opposition between high and low, namely, the trivial (vacuum cleaner). All the newcomers can do is add their own sounds to the ambient sound configuration, or, rather, chaos, which is the true “subject” of the film. Or perhaps, as we are told in Godard’s 1976 Here and Elsewhere, sound is always too loud.

There is something paradoxical about The Devil Probably. Never before has Bresson seemed so concerned with being topical, yet at the same time he has never been more vehemently, radically insistent on his contempt for all discourse. Not only because talk and speechifying inevitably lead to theatre (bombast, pathos), thereby transforming “models” into actors, histrionic performers, but also because all discourse, insofar as it aims at triviality (or worse, edification) presupposes an emitter, and for Bresson the human emitter is ridiculously inadequate as a sonic system (dispositif sonore).

There is a sonic hierarchy, in which speech and speechifiers occupy the bottom rung. Charles encounters several of the latter in the course of his (elegant) Calvary, from the bookseller who “preaches destruction” in a crypt to the ineffable Dr. Mime, the great psychoanalyst. If the speechifiers are (irrevocably) condemned, it is because they are reasoners without resonance. Their talk is dull, colorless, and stilted. Their attitude toward money is similar: think of the checkbook with which the bookseller wants to buy the prostitute, or the stack of banknotes and checks that we glimpse in Dr. Mime’s half-open drawer. In paper money there is something solidified, turdlike, and soundless, something we can grasp more fully if we look again at the “inspired” scene in the film, the one that shows the second visit to the church. When Valentin follows Charles into the church “conditionally” (he is under the influence of drugs), it is under the sign of sacrilege (Valentin breaks open the poorbox) and simulacrum (Charles plays Monteverdi on a record player) that the simultaneous clinking of coins and tinkle of music set up the metaphor voice/gold.

Yet another reasoner is Michel, the ecologist. He is a well-known Bressonian figure: the “best friend” who makes edifying speeches and is usually sexually desired (but not loved) by the heroine, a situation of which he takes advantage (think of Jacques, the friend in Pickpocket). In The Devil Probably, Michel is working on a militant film about ecology which we see him showing to (or perhaps in collaboration with) a group of friends. Some critics have poked fun at this scene, which they see as an indication that a senile Bresson is willing to do anything to make his supposed portrait of youth credible. But the scene is anything but simple. The film within a film is silent, and Michel and his friends “speak” the commentary as it is projected. As to the commentary (commentaire), there is no better example of what Pascal Bonitzer has termed the comment-taire (how to silence): the young people read it, mouth the words, mumble their way through the text. What we see is nothing less than the fabrication of a voice-over narrative.

There is a disturbing quality in the alternation, in the film within the film, between oftentimes violent images (the washing of the oil tanker, the red slime, the slaughter of a baby seal) and the moving hands of the commentators, who hold electric lamps that pick out the words on paper that are to be read or recited – posted over the images, as it were. The fabrication of a voice-over: it is quite bold of Bresson to film these well-dressed youths, who, as they watch images that illustrate their own cause, can respond only with words that have already ceased to resonate and begun to stagnate. Enough has already been said in the Cahiers about the dubious facility of the voice “over” (the rationale for the quotation marks will become clear in a moment) that one cannot fail to be struck by what is going on before our very eyes: the separation of the silent violence of the images from the blasé commentary, the distancing of the silent visual cry from the voice keeping “out of sight” in obscurity.

Here we confront the inability of human discourse (and of the human voice) to bear the violence of the world. Bresson’s pessimism is hardly new: in The Devil, it is simply more naked. Clearly, the problem is not what Charles is looking for (his quest) or what he thinks (his convictions). It does not matter whether or not he opposes the ecological crusade or the macrobiotic diet. Indeed, the debate over ideas invariably takes off against a deafening background of sounds (the shouts in the crypt, the trees being cut down); the decibel level is high. The sound is always too loud. And if Michel is discredited in Charles’s eyes, it is not because the cutting of the trees (to which he consents) contradicts his ecological convictions but because the horrible sound that the falling trees make makes all debate pointless in advance, because it is inaudible.

So much for Bresson’s “materialism”: so far as discourse is concerned, it is the ear, not the brain, that decides. The voice is only a noise, one of the softest kinds of noise. And what Charles wants is not to be convinced (for he is certain of his superiority) or to convince (since he is prepared to say virtually anything in order to have the last word) but to be vanquished. And in the Bressonian logic of sonic bodies (corps sonores) he can be vanquished only by a noise louder than all the rest: a gunshot into the water and then into the back of the neck.

At the risk of disappointing, therefore, the question of whether Charles is a symbol of present-day youth as seen by Bresson has to be set aside. Bresson is not at last, in old age, turning his attention to young people; youth has always been the only subject that interests him. The Bressonian “model” is never more than thirty years old. It is better to study Charles as one sonic body among others – the chosen one.

At bottom, Charles cares only about one thing: having the last word – not, however, in the manner of the glib talker who wins arguments intellectually but more in the manner of a parrot. Amid the bedlam produced by machines out of control, he has the last word only because he never has the first. He is best compared to the nymph Echo, who, according to Ovid, “cannot speak first, / cannot keep silent when spoken to, / and repeats the last words of the last voice she hears.”

A poor transmitter-receiver, the Bressonian hero-nymph can stand up to the overwhelming volume of sound only by being, like the twice-empty church, a mere conduit, a resonance chamber. In Charles’s sham tirade against Dr. Mime, all he can do is read, in the thick voice of an exasperated snob, a list of the “horrors” of modern civilization that he has torn from a magazine. He can only repeat. Living – up to the moment when he buys the right to die from the silent Valentin – is merely a matter of allowing the world, whose sound is too loud for him, to resonate within, without speaking for himself or even opening his mouth; he is an accompanist of the world’s din. His is an old-fashioned form of resistance, known to schoolchildren the world over: he hums with his mouth closed. In this there is, of course, religious nostalgia: in the Middle Ages, the term neume was applied to musical phrases emitted in a single breath (uno pneumate) – without opening the mouth, because if one did open it, who knows what might enter in? Probably the devil.

“Vocal cords can vibrate in the absence of any airflow and under the sole effect of nervous stimulations.” (Moulonguet and Portmann)

Thus, we must digress a moment to consider the voice. In Lacanian terms, it is a question of an object “a”, and one of its partial objects is the mouth. But the voice is not produced exclusively in the mouth. It always originates deeper down. The voice involves the entire body.

What distinguishes the cinematic voice is that it can have a visual double, a shadow that seems to prey on it. It never seems easier to grasp or more tangible than at the moment when it is emitted, when it leaves the body through purposefully twisted lips. This metonymy is crucial: what is seen (the moving lips, the open mouth, the tongue and teeth) justifies the belief in the reality of what is heard at the same time.

There is no other way of assigning a body to a voice than by way of such a visual stand-in: it is the image that ascribes reality to what remains invisible by definition. Silent film lived off this metonymy (no smoke without fire, no moving lips without voice) resolved into metaphor (the interpolated title took the place of the voice). As Anne-Marie Miéville says in Godard’s 1978 Comment ça va, the eyes are in command. We blame the discrepancy between image and sound when a film is poorly synchronised or dubbed. But to appreciate the full import of such complaints, one has to ask if we are capable of recognizing a poorly synchronised foot or back. Obviously, this question comes from Bresson, who was one of the first to take the fragmentary bodies of his “models” as the ghost of the voice, its visual stand-in, as it were.

“Dubbing is crude and naïve,” he writes in Notes sur le cinématographe. “Unreal voices, inconsistent with the movement of the lips. Out of sync with the lungs and the heart. Coming ‘from the wrong mouths.’” Bresson is one filmmaker (Jacques Tati is another) who has always insisted on a certain realism of sound. In this respect, he was deeply influential on the most innovative New Wave filmmakers. Note, however, that he mentions not only the mouth and lips but also the lungs and heart. Although he insisted on realism, he never made a fetish of directly recorded sound; rather, he stubbornly insisted on meticulous postsynchronization of carefully mixed and orchestrated tracks. Why? Precisely because he drew a distinction between the voice and the mouth. If one looks at the mouth, it is easy (and takes no effort) to see that something is being said. But the voice involves the whole body, including the heart and lungs, which cannot be seen.

In order to pursue this theme further, one needs to be wary of such terms as “voice-over” and the like, which are altogether too dependent on the visual and, as such, surreptitiously extend the hegemony of the eye, with the inevitable consequence that the ear is mutilated: film, we are told, is primarily images, which “strike the eye” and “orient vision.” The advent of direct sound recording in televised news reports, ethnographic documentaries, and propaganda films, together with the wild enthusiasm for the essential immediacy of the audiovisual (Jean Rouch and Jean-Marie Straub, quickly copied but poorly understood), led people to pattern sonic space after visual space, which served to guarantee its veracity, to authenticate it. In fact, however, the two spaces are heterogeneous. A more precise description of each is required, along with terminology for specifying their interactions.

To begin with, there is always a danger of importing what is primarily a vocabulary of technical terms. One saw this in the phrase “images and sounds,” which became so overused after Godard introduced it that it lost all specific meaning. For whom does a film consist of images and sounds? For the person who makes it and the person who deconstructs it, the technician and the semiologist, but not for the person who watches it. Just when talking about “images and sounds” became the last word in materialism (although for Godard it was already the “and” that was interesting), people began to notice that this terminology made it impossible to discuss the place of the spectator, the system of which he was a part, or his desire. The problem had to be approached from a different angle – in terms of the gaze (which is neither the eye nor the image) and the voice (which is neither the mouth nor the ear nor the sound). And also in terms of drives (the “scopic” drive: to look is not the same as to see, to listen is not the same as to hear).

In terms of images, the distinction between on-screen and off-screen occurrences, while no doubt useful for writing a screenplay or critically analyzing a film, is not subtle enough for a theory of missing objects because there are different types of off-screen events. Some objects are permanently missing (either because they are unrepresentable – for instance, to take the standard example, the camera that cannot film itself filming the scene – or taboo, such as the prophet Muhammad), while others are temporarily out of sight, hence subject to the familiar alternation of presence and absence, of Fort Da, to use the Freudian metaphor. The possibility of eternal return is greeted by the spectator with either horror or relief. These are not the same, even if they happen off-screen.

The same on-screen/off-screen distinction that is already of dubious value in discussing the visual is altogether too crude for analyzing voices. Broadly speaking, the term voice-over refers to the voices of off-screen speakers. But this really depends on a distinction between sound that is synchronized and sound that is not: the voice is reduced to its visual stand-in, which is itself reduced to the configuration and shape of the lips. The voice-over is then identified with an absence in the image. I favour the opposite approach: voices should be related to their effects in or on the image.

I will use the term voice-over narrowly to describe an off-screen voice that always runs parallel to the sequence of images and never intersects with it. For example, in a documentary about sardines, the voice-over can say whatever it likes (whether it describes sardines or slanders them makes no difference); it remains without measurable impact on the fish. This voice, superimposed on the image after the fact and linked to it by editing, is a purely metalinguistic phenomenon. It is addressed (both as statement and delivery) solely to the viewer, with whom it enters into an alliance or contract that ignores the image. Because the image serves only as the pretext for the wedding of commentary and viewer, the image is left in an enigmatic state of abandonment, of frantic disinheritance, which gives it a certain form of presence, of obtuse significance (Barthes’ third meaning), which (with a certain element of perversity) can be enjoyed incognito, as it were. To see this, mute the sound on your television and look at the images left to themselves. Voice-over of this kind can be coercive. If, speaking of sardines, I say that “these grotesque animals, driven by a suicidal compulsion, hasten toward the fisherman’s nets and end their lives in the most ridiculous way imaginable,” the statement will contaminate not the sardines but the gaze of the spectator, who is obliged to make what sense he can of it despite the obvious disparity between what he sees and what he hears. The voice-over narrative, which coerces the image, intimidates the gaze, and creates a double-bind, is one of the primary modes of propaganda in film.

This is the level at which a director like Godard operates: one might call it the “voice-over degree zero.” In his 1976 Leçons de choses (the second part of Six fois deux), the sudden intrusion of a shot of a marketplace (an intrusion that is as violent as it is sudden, since like all of Godard’s images it is totally unpredictable) is immediately baptized “fire” by the soundtrack. This is justified in part by a play on words (flambée des prix is French for “skyrocketing prices,” hence the connection to the image of the marketplace, but flambée also means “blaze,” hence the connection to the soundtrack), in part as a response to the intrusiveness of the image and the enunciation of the word, retroactively re-marking the violence. One sees the same thing in Here and Elsewhere with the sequence on “how to organize an assembly line.” With each new image, Godard’s voice hollowly repeats the words: “Well, this way… like this… but also like that.” In relation to the “one-by-one” sequence of images, the voice plays the same role as quotation marks in a text: it highlights but also distances.

The voice-over is the focal point of all power, all arbitrariness, all omission. In this respect, there is little difference between Marguerite Duras’s 1975 India Song, the documentary about sardines, a Situationist film, and the Chinese propaganda film on which it is based: the contract with the viewer (seduction, pedagogy, demagogy) depends on coercion of the image. The potential here for the exercise of power is unlimited. The only way to escape from this vicious circle is for the voice-over to take a risk, and to do so as voice: either by multiplication (not one voice but many voices, not one certitude but many enigmas) or, even more, by singularization. And the way to escape from the politics of the auteurs is through a “politics of voices,” inimitable voices (Godard, Duras and, for some time now, Bresson). Radio takes its revenge on film, Dziga Vertov on Sergei Eisenstein, the simple voice on the constructed dialogue, and the feminine on the masculine.

By contrast, I will use the term “in voice” to refer to a voice that participates in the image, merges with it, and has material impact on it by way of a visual stand-in. If my commentary on sardines has the effect of leaving the poor fish stranded in their mere presence as sardines, my voice has a totally different effect if, in the course of a live report, I ask someone a question. Even if that question is spoken off-camera, my voice intrudes in the image, affecting my interlocutor’s face and body and triggering a furtive or perhaps overt reaction, a response. The viewer can measure the violence of my statement by the disturbance it causes in the person who receives it, as one might catch a bullet or a ball (or other small “a” objects), to one side or head on. This is the technique used by Joris Ivens and Marceline Loridan in their 1976 How Yukong Moved the Mountain. It is also the technique of horror films and of the “subjective” films of Robert Montgomery. One also sees it in the now somewhat outmoded technique of having a voice put familiar questions to the characters in a film, who halt their action long enough to respond. Think, for example, of Sacha Guitry’s paternalistic attitude toward his “creations,” or the complicity between the narrator and characters in films from Salah Abou Seif’s Entre ciel et terre to Luis Berlanga’s Welcome, Mr. Marshall.

The “in” voice is the focal point of a different but just as redoubtable form of power. What is presented as the emergence of truth may well be merely the production of discomfort in the guinea pig forced to answer questions as the viewer looks on. There are at least two other kinds of voices: those spoken “within” the image, either through a mouth (“out voice”) or through an entire body (“through voice”).

The “out” voice is basically the voice as it emerges from a mouth. It is projected, dropped, thrown away: one of various objects expelled from the body (along with the gaze, blood, vomit, sperm, and so on). With the out voice we touch on the nature of the cinematographic image itself: though flat, it gives the illusion of depth. Both the voice-over and the in voice emanate from an imaginary space (whose position varies with the type of projection equipment, configuration of the theatre, placement of loudspeakers, and the location of the spectator). By contrast, the “out” voice emanates from an illusory space, a decoy. It emerges from the filmed body, which is a body of a problematic sort, a false surface and a false depth. It is a container with a false bottom, with no bottom at all, which expels (and therefore makes visible) objects as generously as Buster Keaton’s taxis can disgorge regiments. This filmed body is made in the image of the barracks in Cops or of the church in Seven Chances.

The out voice is a form of pornography in the sense that it fetishizes the moment of emergence from the lips (stars’ lips, or, in X.27, Marlene foregoing lipstick before the firing squad). Similarly, porno films are entirely centered on the spectacle of the orgasm seen from the male side, that is, the most visible side. The out voice gives rise to a “material theatre” since it is central to every religious metaphor (passage from inside to outside with metamorphosis). To grasp the moment of emission of the voice is to grasp the moment when the object a separates from the partial object. Pornographic cinema is a denial of this separation, which threatens to reduce the object a to unproductive expenditure (waste) and the partial object to its status as orgasm (meat). It attempts to sustain as long as possible the fetish of an orgasm that can only be followed by another orgasm and so on, ad infinitum – the constant obligation of the visible, “the transparent sphere of seminal emission,” as Pascal Bruckner and Alain Finkielkraut nicely phrase it. There is a pornography of the voice comparable in every way to the pornography of sex (abusive use of interviews, mouths of political leaders, and so on). Clever writers have woven stories around this theme (such as Daniel Schmid’s Angels’ Shadow, in which a prostitute is paid to listen, and Le Sexe qui parle, in which a woman’s vagina expresses its insatiable appetite).

Finally, a “through” voice is a voice that originates within the image but does not emanate from the mouth. Certain types of shot, involving characters filmed from behind, from the side, or in three-quarter view or from behind a piece of furniture, screen, another person, or an obstacle of some sort, cause the voice to be separated from the mouth. The status of the through voice is ambiguous and enigmatic, because its visual stand-in is the body in all its opacity, the expressive body, in whole or in part. It is well known that for reasons of economy, poor filmmakers often film speaking characters from behind rather than in front. Of course, the backs in question are not “real.” For Bresson (and Straub) the whole problem is to shift the effect of frontal filming to some other part of the body, to something round and smooth. Modern filmmaking (since Bresson, in fact) has featured a large number of bodies filmed from behind (sometimes in seductive and provocative ways). Direct and indirect, here and elsewhere. The latest (and not the least mysterious) of these back shots is of Anne-Marie Miéville in Comment ça va.

“The devil jumps in his mouth.” Do not make the devil jump in a mouth. “All husbands are ugly.” Do not show a multitude of ugly husbands. (Bresson)

I will conclude with a word on the famous “Bressonian voice,” which both exasperated and enchanted a generation or two of filmgoers. The timbre of the voice has been attributed to Bresson’s outspoken hatred of the theatre. A small number of critics have seen it as Bresson’s unavowed homage to a class (the grande bourgeoisie) whose children he fetishizes but at the cost of transforming them into young, déclassé aristocrats caught up in Dostoyevskian plots. Both these views are correct. But one can also say that the Bressonian voice is a voice that requires the minimum possible opening of the mouth, that limits, or reserves, the spectacle of emission as much as possible.

In The Devil Probably there is indeed a radical disjunction of voice and mouth. On the one hand, the voice involves the entire body, instruments, and machines (the organ blows, the vacuum cleaner breathes). Bresson’s slogan might be: Don’t look to see where the voice is coming from, don’t look for the visible origin of what you hear. To that end, after showing how voices are reduced to noise, he shows how noises begin to constitute voices (all of which Charles hears, except that he is not Joan of Arc, and to him the voices say nothing). On the other hand, he sees the mouth in terms of its function as orifice, or hole, and of the pleasure of its possessor – the mouth as an instrument of the devil’s pleasure.