Internal Logic

This experiment uses Python code adapted from a number of sources (see below) to create a film ‘trailer’ based on the sound energy present in the film’s soundtrack.

In her chapter on ‘Digital Humanities’ in The Craft of Criticism (Kackman & Kearney, 2018), Miriam Posner discusses the layers of a digital humanities project: source, processing, and presentation. With this experiment/video I’m stuck on the last one. I get the sense that there is a way to present this work that might expose the “possibilities of meaning” (Samuels & McGann, 1999) in a way which is more sympathetic to the internal logic of the deformance itself: a presentation mode (videographic or other?) which is self-contained, rather than relying on any accompanying explanation. So, more to be done with this one. Very much a WIP.

A brief note on the process

The code for this experiment is adapted from a number of Python scripts which were written to automatically create highlight reels from sports broadcasts (the first one I found used a cricket match as an example). Links to the various sources I’ve used are below.

Become a video analysis expert using python

Video and Audio Highlight Extraction Using Python

Creating Video Clips in Python

The idea is that the highlights in any sporting event will be accompanied by a rise in the ‘energy’ in the soundtrack (read as volume here for simplicity) as the crowd and commentator get louder. The Python script analyses the soundtrack in small chunks, calculating the short-term energy for each. The result is a plot like this, which shows a calculation of the sound energy present in the center channel of Dune’s sound mix.
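To make the ‘short-term energy’ idea concrete, here is a minimal sketch rather than the code I actually used. It assumes the soundtrack is already loaded as a NumPy array of samples, and it takes ‘energy’ to mean the average of the squared sample values in each chunk (the blogs linked above may define it slightly differently). The function name and the synthetic test signal are made up for illustration.

```python
import numpy as np

def short_term_energy(samples, sample_rate, chunk_seconds=2.0):
    """Return one energy value per non-overlapping chunk of audio."""
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = len(samples) // chunk_len  # a trailing partial chunk is dropped
    trimmed = np.asarray(samples[: n_chunks * chunk_len], dtype=np.float64)
    chunks = trimmed.reshape(n_chunks, chunk_len)
    # Assumed definition: energy is the mean of the squared samples in a chunk.
    return np.mean(chunks ** 2, axis=1)

# Synthetic stand-in for a soundtrack: 2 s quiet, 2 s loud, 2 s quiet.
rate = 1000
quiet = 0.1 * np.ones(2 * rate)
loud = 0.8 * np.ones(2 * rate)
signal = np.concatenate([quiet, loud, quiet])

energies = short_term_energy(signal, rate)
print(energies)
```

The loud middle chunk gets a much higher number than the quiet chunks either side of it, which is all the highlight-finding idea relies on.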

I’m not entirely clear what scale the X axis is using here for the energy (none of the blogs get into any sort of detail on this), but as the numbers increase, so does the sound energy. The Y axis is the number of sound chunks in the film’s soundtrack with that energy (in this case I set the size of the chunks to 2 seconds). To create the ‘trailer’ I picked a threshold number based on the plot (the red circle) and the code extracted any chunks from the film that had a sound energy above this figure. Choosing the threshold is not an exact science, so I tried to pick a figure which gave me a manageable amount of video to work with. A higher threshold would mean less content; a lower threshold would result in more. Note – The video above features the film’s full soundtrack, but the clip selections were made on energy calculations from the center channel alone.
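The thresholding step can be sketched like this. Again, this is an illustration under my own assumptions, not the script itself: the energy values are invented, and joining neighbouring chunks into one longer clip is a choice I’ve made for the example.

```python
import numpy as np

def chunks_above_threshold(energies, threshold, chunk_seconds=2.0):
    """Return (start, end) times in seconds for chunks louder than the threshold.

    Chunks that sit next to each other are joined into a single clip
    (an assumption made for this sketch).
    """
    clips = []
    for i, energy in enumerate(energies):
        if energy <= threshold:
            continue
        start = i * chunk_seconds
        end = start + chunk_seconds
        if clips and clips[-1][1] == start:
            clips[-1] = (clips[-1][0], end)  # extend the previous clip
        else:
            clips.append((start, end))
    return clips

# Invented energy values, one per 2-second chunk.
energies = np.array([0.1, 0.9, 0.8, 0.2, 0.7, 0.1])

high = chunks_above_threshold(energies, threshold=0.6)
low = chunks_above_threshold(energies, threshold=0.15)
print(high)  # two short clips
print(low)   # one longer clip
```

With the higher threshold the example keeps 6 seconds of material, and with the lower one it keeps 8, which is the trade-off described above.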

I am not a coder!

I’m not going to share the code here. It took an age to get working (largely as a result of my coding ignorance) and I can’t guarantee that it will work for anyone else the way I’ve cobbled it together. If anyone is interested in trying it out though, please get in touch, I’m more than happy to run through it on a Zoom or something.

Kackman, M. and Kearney, M.C. eds., 2018. The craft of criticism: Critical media studies in practice. Routledge.

Samuels, L. and McGann, J., 1999. Deformance and interpretation. New Literary History, 30(1), pp.25-56.

Dodge This

This experiment goes back to some of my earliest thinking about sound and the (still) image. This was only made possible thanks to a new version of the audio software PaulXStretch which turned up recently.

This frame scan from The Matrix I’m using here was posted on Twitter some time ago (many thanks to John Allegretti for the scan). I couldn’t resist comparing this full Super 35 frame with the 2.39:1 aspect ratio that was extracted for the final film. (I also love seeing the waveform of the optical soundtrack on the left of the frame.)

The smallest frame is a screenshot from a YouTube clip; the larger is a screenshot from the Blu-ray release. The center extraction is nice and clear, but it really highlights to me the significant amount of the frame which is unused. (See this post for some excellent input from David Mullen ASC on Super 35 and center extraction.) This frame was still on my mind when I spotted that a new version of PaulXStretch was available.

PaulXStretch is designed for radical transformation of sounds. It is NOT suitable for subtle time or pitch correction. Ambient music and sound design are probably the most suitable use cases. It can turn any audio into hours or days of ambient soundscape, in an amazingly smooth and beautiful way.

This got me thinking again about the still image; those that I’d been looking at for years in magazines, books, posters. Those that I’d fetishised over, collected, and archived. The images that meant so much to me and were central to my love of film, but that were also entirely silent. So with the help of PaulXStretch, I have taken the opportunity to bring sound back to this particular still image. The soundtrack to this video is the software iterating on the 41.66667 milliseconds of audio that accompanies this single frame from the film.
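For anyone wondering where 41.66667 milliseconds comes from, it is simply one second divided by 24 frames. The short sketch below works that out, along with how many audio samples sit alongside a single frame and how much stretching a given running time would need. The 48 kHz sample rate and the one-minute target are assumptions for illustration, not figures from this project.

```python
FRAME_RATE = 24        # frames per second, implied by the 41.66667 ms figure
SAMPLE_RATE = 48_000   # assumed audio sample rate in Hz

# Duration of one frame in milliseconds.
frame_ms = 1000 / FRAME_RATE

# Number of audio samples that accompany a single frame.
samples_per_frame = SAMPLE_RATE // FRAME_RATE

def stretch_factor_for(target_seconds):
    """How many times longer the audio must become to last target_seconds."""
    return target_seconds * FRAME_RATE

print(round(frame_ms, 5))
print(samples_per_frame)
print(stretch_factor_for(60))
```

So a single frame’s worth of sound is only a couple of thousand samples at that rate, and it has to be stretched well over a thousand times to fill even a minute.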

But wait, there’s more…

When I first had this idea, about 12 months ago, I wanted to try and accomplish it with an actual film frame. I found a piece of software (AEO-Light) which could extract the optical soundtrack information from a film frame scan, and render it to audio. So I went and bought myself some strips of film.

These are quite easy to come by on eBay, but there was a fatal flaw in my plan (which I didn’t realise until some time later). On a release print like the frames I have here, the physical layout of the projector, and specifically the projector gate, means that there is no space for the sound reader to exist in synchronous proximity to the frame as it is being projected. The optical sound reader actually lives below the projector gate, which means that the optical soundtrack is printed in advance of the picture by 21 frames (this is the SMPTE standard for the creation of release prints, but how that translates to the actual threading of a projector is a little more up in the air according to this thread). So in this material sense, where the optical soundtrack is concerned, sound and image only come into synchronisation at the very instant of projection.
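The 21-frame offset can be put into numbers with a small sketch. It assumes projection at 24 frames per second and counts frames from the head of the reel, so that the soundtrack printed beside a given frame belongs to the picture 21 frames further on. That direction of counting is my reading of ‘in advance’, and the function names are made up for the example.

```python
SOUND_ADVANCE_FRAMES = 21  # offset between sound and picture on a release print
FRAME_RATE = 24            # assumed projection speed in frames per second

def advance_in_seconds():
    """Time between a frame's sound and picture positions on the print."""
    return SOUND_ADVANCE_FRAMES / FRAME_RATE

def picture_frame_for_sound(scanned_frame_index):
    """Index of the picture frame that the sound beside a scanned frame belongs to.

    Assumes frames are counted from the head of the reel.
    """
    return scanned_frame_index + SOUND_ADVANCE_FRAMES

print(advance_in_seconds())
print(picture_frame_for_sound(0))
```

In other words, the sound printed beside any one of my frames is a little under a second adrift from the picture it sits next to, so a short strip of film never contains the sound for its own images.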

If you’ve made it this far and want to know more about projector architecture then I highly recommend this video (I’ve embedded it to start at the ‘tour’ of the film path). Enjoy.

Se7en Payne’s Constraint

This experiment was inspired by this post by Alan O’Leary for The Video Essay Podcast where he writes about Matt Payne’s video essay ‘Who Ever Heard…?’.

For me, ‘Who Ever Heard…?’ is an example of ‘potential videography’. It offers a form that can be put to many other uses even as its formal character—its use of repetition and its ‘Cubist’ faceting of space and time—will tend to influence the thrust of the analysis performed with it (but when is that not true of a methodology?). 

Alan O’Leary On Matt Payne’s ‘Who Ever Heard…?’ 2020

I wanted to see how this form might respond if I made my clip choices purely on sonic grounds. This was also another opportunity to explore the ‘malleability’ of the multi-channel soundtrack. Most video editing software will allow you to pull apart the 6 channels of a 5.1 mix and edit them separately. To me this is a fundamental sonic deformation, where the editing software allows access to the component parts of the film’s sonic archive (riffing extensively on Jason Mittell here).
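Outside of an editor, the same ‘pulling apart’ can be sketched in a few lines of Python. This is only an illustration: it assumes the 5.1 mix is a NumPy array with one column per channel, and the channel order below is an assumption that will vary between files and tools. The function name and the tiny fake ‘mix’ are made up.

```python
import numpy as np

# Assumed channel order for a 5.1 file; check the actual file before relying on it.
CHANNEL_ORDER = ["L", "R", "C", "LFE", "Ls", "Rs"]

def select_channels(audio, keep):
    """Return a copy of the mix with every channel not listed in keep silenced."""
    out = np.zeros_like(audio)
    for name in keep:
        idx = CHANNEL_ORDER.index(name)
        out[:, idx] = audio[:, idx]
    return out

# A tiny fake 'mix': 2 samples across 6 channels.
mix = np.arange(12, dtype=float).reshape(2, 6)

surrounds_only = select_channels(mix, ["Ls", "Rs"])
fronts_only = select_channels(mix, ["L", "R"])
print(surrounds_only)
```

Each selection leaves the chosen channels untouched and mutes the rest, which is roughly what soloing channels in an editor does.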

For my ‘Payne’s Constraint’ I decided to top and tail my sequence with the same image of Somerset in bed, accompanied by a different channel selection from the soundtrack each time. The opening shot uses just the surround channels, which are carrying rain at this point, but on the way back round I used the front left and right channels from the mix, where the sound of the city invades Somerset’s bedroom (and consciousness). The other sound selections are less intentional than this. I picked sounds that were interesting, or odd, but that also contributed to the sonic whole that was coming together. It’s worth pointing out that after I added the 4th or 5th clip, Resolve refused to play back any video so, much like with the Surround Grid, this ended up being a piece which was fashioned without really understanding how it would work visually until I could render it out.

A few sonic takeaways from this:

  • There is a lot of thunder in this film, but no lightning
  • There is very little sky (excepting the final desert scene) but there are helicopter and plane sounds throughout
  • Low oscillating sounds are prevalent in the soundtrack. Lorries/trucks and air conditioning contribute to this, but room tones also move within the soundtrack, rather than being static ambiences.
  • There is music in here that I was never consciously aware of before. Here’s ‘Love Plus One’ by Haircut 100, which I found burbling in the background of the cafe scene (clip 14 in my video).

Singin’ Will Wall

These experiments are inspired by Hollis Frampton’s 1971 film Critical Mass and were made possible using the software HF Critical Mass created by Barbara Lattanzi

I think I only watched Critical Mass because it auto-played on YT after I’d finished (nostalgia), another Hollis Frampton film, also from 1971. When I tried to find out more about Frampton’s process making Critical Mass I came across Barbara Lattanzi’s site, and the HF Critical Mass software she created “…as an interface for improvising digital video playback.” These 3 videos were made with Version 2 of the software.

I originally thought I might pick one of the musical numbers from Singin’ in the Rain (the film seemed like an obvious choice given its centrality to deformative videographic practice!) for this first experiment, but as I scrubbed through the film I hit on this scene, which not only has its own ‘built in’ loopability, but also appeals to my sonic self. The HF Critical Mass software gives you control over the length of the loop it will make, and the speed with which the loop will advance through the video (amongst many other controls), and I set these specifically for each video. In this case the loop length was defined by the door slam and the clapperboard, essentially bookending the loop. I’m not sure if this is the first time I noticed the sound engineer’s exaggerated movements, but the looping did highlight the synchronicity between Lina’s head turns and his sympathetic manipulation of the recording controls.
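I don’t know how HF Critical Mass works internally, so the following is only my own rough sketch of the two controls described above: a loop of fixed length that creeps forward through the clip by a set amount each time it repeats. The function name and the example timings are invented.

```python
def loop_windows(total_seconds, loop_seconds, advance_seconds):
    """Return (start, end) times for a loop that steps forward through a clip.

    This is an illustrative model, not the software's actual behaviour.
    """
    if loop_seconds <= 0 or advance_seconds <= 0:
        raise ValueError("loop and advance lengths must be positive")
    windows = []
    start = 0.0
    while start + loop_seconds <= total_seconds:
        windows.append((start, start + loop_seconds))
        start += advance_seconds
    return windows

# A 10-second clip, a 4-second loop, moving on by 2 seconds each repeat.
windows = loop_windows(total_seconds=10, loop_seconds=4, advance_seconds=2)
print(windows)
```

Because the advance is shorter than the loop, each repeat overlaps the one before it, so every moment of the clip is heard more than once.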

I wanted to see how this would work on some quick-fire dialogue, and I always share this scene with my research students, so it was an easy pick. Finding the loop length here was harder, and I’m a little surprised how consistent some of the rhythms in the delivery are, and how many lines actually get a complete delivery ‘in loop’ (Should I be surprised? Or is a rhythmic line delivery, with consistent pauses, inherent to this kind of monologue?). The looping again highlights the movement within the scene, and the camera moves also get ‘revealed’ in the looping. Favourite moment is definitely the total accident at the end where ‘unoriginal’ becomes ‘original’.

This is a scene which I’ve always loved listening to but I think I’m listening to it differently with this video. The looping, in combination with the staggered progress of the video, seems to hold the sounds in my memory just long enough that I feel I can grasp them a little more clearly. Each loop seems to exist as its own short sonic motif, almost self-contained, even as it contributes to the whole.

The Double Sound Stack

This work is deeply indebted to the Film Visualizations of Kevin L. Ferguson and the process he describes here (well worth a read).

Two videos this week, but neither of them is the actual ‘output’ from the experiment. The one above is an annotation where I’ve labelled each sound I can hear in this clip from The Double (2013). (I could be much more forensic with the labelling but I’m happy with this for now.) This video is interesting on its own, and the annotation deserves some more investigation, but for this experiment it just feeds the next bit, so on to the process video.

And the result of all this is…..

There is so much more to be done with this, and so many questions to consider. How would a whole film look? Where would I fit all the annotations? Will my computer cope? How else can I view this output? Definitely more on this to come.

PS Thanks to Alan O’Leary for suggesting the name ‘Sound Stack’.

Se7en Sin Subtitle Edit

I wanted to explore a text-based deformation that would allow me to dig into the soundtrack, in particular the dialogue, but do so in a way that removed as much of my own agency from the process as possible. This involved jumping through some hoops but here’s what I ended up with. I grabbed a subtitle (.srt) file for Se7en online (there are plenty of sources out there). Opening it in Notepad (Windows), I searched for all the lines that featured a particular word (in this case ‘sin’). The result was 11 lines, which I saved as a new .srt file.
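I did this step by hand in Notepad, but the same search could be sketched in Python as below. This is an illustration, not what I actually ran. It matches ‘sin’ as a whole word and ignores case, which is an assumption on my part, and the three subtitle lines are invented placeholders rather than dialogue from the film.

```python
import re

def filter_srt(srt_text, word):
    """Keep only the subtitle cues whose text contains the given whole word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    kept = []
    # Cues in an .srt file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip anything that is not number / timing / text
        timing, text = lines[1], "\n".join(lines[2:])
        if pattern.search(text):
            kept.append((timing, text))
    # Renumber the surviving cues from 1.
    cues = [f"{i}\n{timing}\n{text}" for i, (timing, text) in enumerate(kept, 1)]
    return "\n\n".join(cues) + "\n"

# Invented placeholder subtitles, not lines from the film.
sample = """1
00:00:01,000 --> 00:00:03,000
This is a placeholder line.

2
00:00:04,000 --> 00:00:06,500
A line about sin goes here.

3
00:00:07,000 --> 00:00:09,000
Nothing since then.
"""

sin_srt = filter_srt(sample, "sin")
print(sin_srt)
```

Matching whole words means that ‘since’ in the third cue is skipped, so only the second cue survives and is renumbered as cue 1.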

The ‘Sin’ .srt file

The next bit of the process wouldn’t have been possible without this handy Subtitle converter from It took my new .srt and converted it to a Resolve compatible EDL (Edit Decision List). In Resolve I had a session with the full film which I applied the EDL to. Resolve made all the cuts for me and all I had to do was line up the clips and export.
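I can’t speak for how the converter works internally, but the core of the job is turning subtitle times (which count in milliseconds) into editing timecodes (which count in frames). Here is a rough sketch of that arithmetic. It assumes a flat 24 frames per second and starts the new timeline at zero, both of which are simplifications, and the function names are made up. It does not produce a full EDL file.

```python
def srt_time_to_frames(stamp, fps=24):
    """Convert an SRT timestamp such as '00:00:06,500' to a whole frame count."""
    clock, millis = stamp.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)
    return (total_ms * fps + 500) // 1000  # round to the nearest frame

def frames_to_timecode(frames, fps=24):
    """Format a frame count as HH:MM:SS:FF."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

def cue_to_event(timing_line, record_start_frames=0, fps=24):
    """Return (source in, source out, record in, record out) for one cue."""
    start, end = (srt_time_to_frames(t, fps) for t in timing_line.split("-->"))
    record_end = record_start_frames + (end - start)
    points = (start, end, record_start_frames, record_end)
    return tuple(frames_to_timecode(f, fps) for f in points)

event = cue_to_event("00:00:04,000 --> 00:00:06,500")
print(event)
```

The first two values say where to cut in the full film, and the last two say where that clip lands in the new sequence.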

In this version then I have very little agency (aside from choosing the word to use). The cut locations and the lengths of the clips are dictated by the subtitle file, and it’s interesting to see how/where these occur when they’re driven by a focus on readability rather than absolute sync.

Postscript. Not long after I finished this I became aware of Videogrep by Sam Lavigne (via Zach Whalen and Jason Mittell). It uses Python to create ‘automatic supercuts’. I am still trying to get it working on my system!

Se7en Surround Sound Grid

I was lucky enough to catch a talk by Jenny Oyallon-Koloski at the Symposium on Interrogating the Modes of Videographic Criticism in February. Jenny talked about her Musical Grid Deformations (you can see more of them here) which showed “Every number from a musical film, presented simultaneously”. I had already been doing some work with Se7en, isolating all the sequences in the film that had surround sound content, so, inspired by Jenny’s work, I decided to create a surround grid.

This is my first effort. The video features every sequence from Se7en that has any surround sound content, and the audio you can hear is only from the surround channels. I worked in sequences (as opposed to scenes), so in this version I’ve not made any cuts on scene boundaries. The longest sequence here covers 9 scenes, but there is consistent audio content in the surround channels for the duration (I have defined ‘consistent’ here as not dropping below -50 dB on Resolve’s meters).
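I found these sequences by eye on the meters, but the -50 dB rule could be sketched in code along these lines. This is an illustration under my own assumptions: it measures a simple RMS level for each one-second block of audio, which is not necessarily how Resolve’s meters behave, and the test blocks are invented.

```python
import numpy as np

def rms_dbfs(block):
    """RMS level of a block of samples in dB relative to full scale (1.0)."""
    rms = np.sqrt(np.mean(np.square(block)))
    return -np.inf if rms == 0 else 20 * np.log10(rms)

def spans_above(levels_db, floor_db=-50.0, window_seconds=1.0):
    """Return (start, end) times where the level never drops below floor_db."""
    spans, start = [], None
    for i, level in enumerate(levels_db):
        if level >= floor_db and start is None:
            start = i  # a new sequence begins
        elif level < floor_db and start is not None:
            spans.append((start * window_seconds, i * window_seconds))
            start = None  # the sequence has ended
    if start is not None:
        spans.append((start * window_seconds, len(levels_db) * window_seconds))
    return spans

# Invented one-second blocks: silence, two audible blocks, a near-silent one, audible.
blocks = [np.full(100, amp) for amp in (0.0, 0.1, 0.05, 0.001, 0.2)]
levels = [rms_dbfs(block) for block in blocks]

spans = spans_above(levels)
print(spans)
```

The near-silent fourth block sits below the floor, so it splits the audible material into two separate sequences.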

Initial thoughts – there is a lot of music, and also (strangely?) a lot of helicopter and plane sounds, even though we only actually see helicopters towards the end of the film. It does make me think that we also don’t really see sky until the end of the film. Also, (and Jenny did warn about this in her talk) my computer refused to render this in anything less than 20 hours. So in the end I had to optimise the footage (a DaVinci Resolve feature), and then render out that optimised footage.