“I’ll never forget going to see this film, the first one shot in CinemaScope, on its initial release. I sat there, and the curtains kept opening wider and wider and wider. None of us, not me or anyone else in the audience, was prepared for the experience and it changed the movies forever.” -- Martin Scorsese
The initial release of Henry Koster’s “The ROBE” in 1953 heralded a serendipitous convergence of two parallel innovations in cinema technology—Twentieth Century Fox’s ultra-wide CinemaScope picture format, synchronized with the first four-channel “surround-sound” recording medium for film. Fox’s bold investment in these previously unproven technologies did not go unrecognized. “The ROBE” garnered Academy Awards for Best Art Direction-Set Decoration (Color) and Best Costume Design (Color), Academy Award nominations for Best Picture, Best Actor in a Leading Role (Richard Burton) and Best Cinematography (Color), and the Golden Globe Award for Best Motion Picture (Drama).
Fifty-five years later, a complementary convergence of new technologies assures “The ROBE’s” preservation for future audiences. Lowry Digital of Burbank, California just completed a labor-intensive restoration of “The ROBE’s” picture using new temporal image processing algorithms it has pioneered for granular noise reduction, detail enhancement and visual artifact removal. In concert, Audio Mechanics, a premier audio restoration house (also of Burbank), carried out a comprehensive reconstruction of “The ROBE’s” original four-channel soundtrack. Because of unusual problems plaguing “The ROBE’s” existing audio source materials, this restoration became possible only through a new knowledge-directed signal processing technology developed by Signal Inference.
“Wow, the original master is missing…”
“The ROBE’s” existing audio source materials presented several obstacles that, until now, have prevented restoration of its soundtrack. The original four-channel audio master of “The ROBE” was lost, and the only surviving copies were made over twenty years ago. Beyond the toll of time and the physical limitations of analog recording media, production errors made during transfer of the preferred master introduced severe time-base distortions throughout. These time-base distortions (most often characterized as “wow” and “flutter”) seriously impaired intelligibility of the film’s existing audio source materials in 94 of the film’s 135 minutes of running time.
"Wow" manifests as large, objectionable pitch variations that are most audible in longer, sustained tones of soundtrack music. Severe wow is readily perceivable as a "drunken" warping of otherwise stable musical pitches, not unlike what might be encountered in a bad American Idol audition. More subtle wow lends an uncomfortable "seasick" quality to audio program material, as if a soundtrack might have been recorded on board a Canadian ocean liner (to minimize production costs).
“Flutter” is a repetitive time-base distortion that surfaces most frequently in two modes. “Sprocket modulation” flutter can occur any time sound is recorded on film. It can introduce the unwanted sensation of a “projector running in the background” of an otherwise quiet soundtrack. “Scrape flutter” is a more subtle effect that imposes a distinct roughness on soundtrack dialog and music. This roughness can be easily mistaken for dynamic range distortion caused by analog media saturation.
Wow and/or flutter may be introduced in any phase of soundtrack production where analog media are employed. This includes any stage of sound recording and mixing, mastering, reformatting, transfer or playback. Wow and flutter introduced in successive generations of soundtrack production may compound one another, producing large, unphysical variations in musical pitch. Digital recording has all but eliminated wow and flutter from modern motion picture soundtracks, but all forms of analog recording suffer from some degree of inherent wow and flutter.
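The mechanics of wow can be illustrated with a short sketch: a time-base distortion is simply a signal read back along a non-uniform clock. The parameters below (a 0.5 Hz wobble at 2% depth) are illustrative choices, not measurements from “The ROBE”:

```python
import numpy as np

def apply_wow(signal, sample_rate, wow_rate_hz=0.5, depth=0.02):
    """Simulate wow by reading a signal back along a slowly
    oscillating time base (depth = peak speed deviation, here 2%)."""
    n = len(signal)
    t = np.arange(n) / sample_rate
    # Instantaneous playback speed drifts sinusoidally around 1.0.
    speed = 1.0 + depth * np.sin(2 * np.pi * wow_rate_hz * t)
    # Warped read positions = running integral of playback speed.
    warped_t = np.cumsum(speed) / sample_rate
    return np.interp(warped_t, t, signal)

# A steady 440 Hz tone acquires a +/-2% pitch wobble at 0.5 Hz.
sr = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(2 * sr) / sr)
wowed = apply_wow(tone, sr)
```

The same warped-resampling picture explains why compounded generations of analog copying multiply the distortion: each pass integrates a fresh speed error on top of the last.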
“How does one typically correct wow and flutter?”
In the analog world, typically one can’t. Wow and flutter are introduced by non-uniformly rotating mechanical components of analog magnetic tape and optical media transports. Throughout the history of audio recording, elaborate mechanical systems were developed in an attempt to stabilize the rotational speed of these machines. In some cases these measures were able to reduce wow and flutter, but not eliminate them entirely. More recently, three techniques have been employed to minimize wow and flutter at the time of transfer:
• chemically treat deteriorating audio reels to minimize friction;
• modify media transports to stabilize their rotational speed; and
• capture bias frequency or other reference signals in an effort to compensate for unwanted variations in playback speed.
Condition of “The ROBE”
Unfortunately, the severe wow and flutter confounding “The ROBE’s” audio source materials was introduced at an earlier stage of the soundtrack’s preservation. Because the preferred audio source material for “The ROBE” was itself in good physical condition, steps taken to stabilize playback speed during its final transfer produced only minimal reductions in wow and flutter.
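The third transfer-time technique, capturing a reference signal, amounts to estimating instantaneous playback speed from the frequency drift of a known tone. A minimal sketch of the idea, using a hypothetical 1 kHz pilot tone and a simple spectral-peak frequency estimator:

```python
import numpy as np

def speed_from_pilot(pilot, sample_rate, nominal_hz, frame=4096, hop=1024):
    """Estimate playback speed per frame as (measured pilot frequency /
    nominal pilot frequency), via a Hann-windowed spectral peak."""
    win = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, 1.0 / sample_rate)
    speeds = []
    for start in range(0, len(pilot) - frame, hop):
        spec = np.abs(np.fft.rfft(pilot[start:start + frame] * win))
        speeds.append(freqs[spec.argmax()] / nominal_hz)
    return np.array(speeds)

# A 1 kHz pilot tone transferred 2% fast reads as speed close to 1.02.
sr = 48000
t = np.arange(sr) / sr  # one second
fast_pilot = np.sin(2 * np.pi * 1000 * 1.02 * t)
est = speed_from_pilot(fast_pilot, sr, nominal_hz=1000)
```

Such an estimator can only see speed errors introduced at the playback it monitors; wow already baked into an earlier copy generation, as in “The ROBE,” is invisible to it.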
Audio Mechanics’ past experience with these problems led its founder and president, John Polito, to undertake a strategic collaboration with Signal Inference. Signal Inference specializes in developing custom solutions to unusual scientific and technical problems. The company was founded and is directed by John Amuedo, a scientist formerly with the research staff of M.I.T.’s Artificial Intelligence Laboratory, and has particular expertise in scientific computing, knowledge-directed signal processing, and software engineering.
The original audio source material of “The ROBE” consisted of two four-channel (LCRS) masters and one monophonic optical track. Fox distributed the mono track to theatres not yet equipped with four-channel sound systems at the time of “The ROBE’s” initial release. From a restoration perspective, these sources constituted:
• a badly wowed master with minimal sprocket modulation flutter but otherwise acceptable audio reproduction fidelity;
• a noisy, lower-fidelity master having minimal rotational wow but significant sprocket modulation flutter; and
• a mono mix of the soundtrack with poor fidelity but acceptable time-base stability.
An initial review of this labyrinth of partially-usable source material convinced us that any successful reconstruction strategy would require a judicious combination of multiple signal features extracted from different sources, in order to produce an accurate composite wow and flutter profile of “The ROBE” soundtrack.
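In code, that triage amounts to bookkeeping: record the condition of each attribute of each source, then query by feature. The names and ratings below are a simplified paraphrase of the inventory above, not production metadata:

```python
# Simplified catalog of "The ROBE's" three surviving sources; ratings
# paraphrase the bulleted inventory (names are illustrative).
SOURCES = {
    "preferred_master": {"fidelity": "good", "wow": "severe", "flutter": "minimal"},
    "second_master":    {"fidelity": "poor", "wow": "minimal", "flutter": "severe"},
    "mono_optical":     {"fidelity": "poor", "wow": "minimal", "flutter": "minimal"},
}

def trustworthy_for(feature):
    """Sources whose given attribute is not severely degraded."""
    return [name for name, attrs in SOURCES.items()
            if attrs[feature] != "severe"]

# Only the non-wowed sources can anchor a time-base (wow) reference,
# while the preferred master remains the fidelity reference.
wow_refs = trustworthy_for("wow")
```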
Knowledge-Directed Signal Processing (KDSP)
Knowledge-Directed Signal Processing integrates conventional digital signal processing algorithms with symbolic reasoning methods to build highly adaptive signal understanding systems. KDSP systems typically employ an initial stage of signal analysis to generate a coarse symbolic description of an expected class of signals. The symbolic description functions as a kind of bird's eye "sketch" or "roadmap" of relevant signal features. It captures features with detail sufficient only to identify signal regions of potential interest.
A rule-based reasoning component then examines accumulating symbolic descriptions for features that are representative of signals the system has encountered before. If the system finds features typical of previously encountered signals, it will initiate a secondary stage of signal processing optimized to confirm (or falsify) presence of a likely signal of interest. If a KDSP system encounters a signal that it cannot describe in terms of previously known signals, it will automatically enrich its database of known signals with a description of the new signal.
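In skeleton form, that two-stage loop might look like the following. The feature names, thresholds and rule base are invented placeholders, not Signal Inference’s actual system:

```python
# Skeleton of a knowledge-directed pipeline: a coarse symbolic "sketch"
# is matched against known signatures; unmatched sketches enrich the
# knowledge base. All features and rules are illustrative placeholders.
KNOWN_SIGNATURES = {"sprocket_flutter": {"periodic", "narrowband"}}

def coarse_describe(stats):
    """Stage 1: reduce raw signal statistics to a symbolic feature set."""
    features = set()
    if stats.get("periodicity", 0.0) > 0.8:
        features.add("periodic")
    if stats.get("bandwidth_hz", float("inf")) < 50:
        features.add("narrowband")
    return features

def classify(features):
    """Stage 2: match against known signatures. A match would trigger
    optimized secondary analysis; a miss is learned as a new class."""
    for name, signature in KNOWN_SIGNATURES.items():
        if signature <= features:  # all required features present
            return name
    new_name = "unknown_%d" % len(KNOWN_SIGNATURES)
    KNOWN_SIGNATURES[new_name] = set(features)
    return new_name

label = classify(coarse_describe({"periodicity": 0.95, "bandwidth_hz": 12}))
```

The coarse stage stays cheap because its sketch only needs enough detail to route a signal region toward the right secondary analysis, or toward the learning path.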
KDSP systems are especially well suited to combining multiple signal features extracted from different signal sources. Individual features may then be used to assemble a more accurate composite “picture” (or model) of the signal production environment. For example, a mono signal collected from a single hydrophone in the vicinity of several ships can be used to infer the type and speed of the vessels present. The ensemble of signals recorded from a small collection of nearby hydrophones may then be combined to estimate the position and direction of travel of each vessel.
“Master, the original wow is missing…”
The ability to combine wow and flutter estimates from all three audio sources enabled Signal Inference to construct a robust characterization of wow and flutter in the original “ROBE” soundtrack. The KDSP approach made this possible despite frequent signal dropouts and interference. In some cases, the software was able to pinpoint the precise copy generation in which a particular wow and flutter signature was imposed. Accuracy of composite wow and flutter profiles was validated by re-processing the time-base corrected soundtrack and comparing its residuals with those of the individual sources.
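One simplified way to picture the combination step (a sketch of the general idea, not the actual Signal Inference algorithm): weight each source’s instantaneous-speed estimate by its reliability, then resample the audio against the composite profile to undo the warp. All figures below are synthetic:

```python
import numpy as np

def composite_speed_profile(estimates, weights):
    """Reliability-weighted average of per-source instantaneous
    playback-speed estimates (arrays; nominal speed = 1.0)."""
    est = np.asarray(estimates, dtype=float)   # shape: (sources, samples)
    w = np.asarray(weights, dtype=float)
    return est.T @ (w / w.sum())

def correct_timebase(signal, speed, sample_rate):
    """Undo a known speed profile: each sample really belongs at the
    integrated (warped) time, so resample back onto a uniform clock."""
    t = np.arange(len(signal)) / sample_rate
    warped_t = np.cumsum(speed) / sample_rate
    return np.interp(t, warped_t, signal)

# Three noisy observations of the same underlying 0.3 Hz wow.
sr, n = 1000, 4000
t = np.arange(n) / sr
true_speed = 1.0 + 0.01 * np.sin(2 * np.pi * 0.3 * t)
ests = [true_speed + np.random.default_rng(i).normal(0, 0.002, n)
        for i in range(3)]
profile = composite_speed_profile(ests, weights=[0.5, 0.3, 0.2])
```

Validation then mirrors the check described above: re-measure residual wow in the corrected output and compare it against the residuals of the individual sources.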
In correcting major wow and flutter artifacts, we discovered that this process in some cases also yielded an additional 6 to 10 dB of noise reduction. When wow and flutter are removed, coherent energy across the audio spectrum is re-concentrated into narrower, sharper spectral lines. Following correction, the energy of some signal components rose 6 to 10 decibels above the soundtrack’s original noise floor.
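A toy spectrum illustrates the mechanism (synthetic figures, not measurements from “The ROBE”): frequency-modulating a steady tone smears its energy across many FFT bins, so removing the modulation raises the spectral peak by several dB:

```python
import numpy as np

sr = 8000
n = 4 * sr  # four seconds
t = np.arange(n) / sr

# A stable 440 Hz tone vs. the same tone under +/-1% sinusoidal wow.
stable = np.sin(2 * np.pi * 440 * t)
speed = 1.0 + 0.01 * np.sin(2 * np.pi * 0.7 * t)
wowed = np.sin(2 * np.pi * 440 * np.cumsum(speed) / sr)

def peak_db(x):
    """Peak spectral magnitude (dB) of a Hann-windowed signal."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return 20 * np.log10(spec.max())

# Energy re-concentration: the stable tone's spectral line stands
# several dB higher than the wowed tone's smeared peak.
gain_db = peak_db(stable) - peak_db(wowed)
```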
“So what does the future hold?”
In response to the unusual technical requirements of “The ROBE,” Signal Inference has developed a toolbox of new signal processing methods for solving outstanding problems of media soundtrack restoration. As a strategic partner, Audio Mechanics is uniquely equipped to evaluate the production requirements of particular projects, and coordinate a comprehensive approach to audio restoration.
We believe Knowledge-Directed Signal Processing offers tremendous promise as a technology for automating many routine aspects of motion picture soundtrack preservation and restoration. As increasingly sophisticated “virtual listeners,” KDSP systems are likely to find future application in automatically annotating multimedia content with descriptive metadata. Such metadata may then be used for efficient indexing, maintenance and retrieval of picture, sound and story content.