Generative Design Programming

Week 10: Audio-reactive visuals

4h

Topic: What is sound, and what are its qualities: amplitude (loudness) and frequency (pitch)? Basics of computer audition: what it means for a computer to hear, and how we measure these qualities. FFT as a mathematical tool for decomposing the signal and measuring bass, mids, and highs.

Creative constraints generator
Generate one or two creative constraints for yourself and stick to them throughout the work.

Learning materials

Sound is a wave signal (think of a sine wave) with measurable qualities. The most basic ones are amplitude (associated with loudness) and frequency (associated with pitch).
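To make the two qualities concrete, here is a minimal sketch in plain JavaScript. The function names `sineWave` and `peak` and the 44100 Hz sample rate are illustrative assumptions, not part of the course material. It generates two tones that differ in both amplitude and frequency.

```javascript
// Generate `count` samples of a sine wave.
// amplitude:  peak value of the wave (heard as loudness)
// frequency:  cycles per second in Hz (heard as pitch)
// sampleRate: samples per second; 44100 is a common value, assumed here
function sineWave(amplitude, frequency, count, sampleRate = 44100) {
  const samples = new Array(count);
  for (let n = 0; n < count; n++) {
    const t = n / sampleRate; // time of sample n in seconds
    samples[n] = amplitude * Math.sin(2 * Math.PI * frequency * t);
  }
  return samples;
}

// Largest absolute sample value: a very simple loudness measure.
function peak(samples) {
  return samples.reduce((max, s) => Math.max(max, Math.abs(s)), 0);
}

const quietLow = sineWave(0.2, 110, 44100); // one second, quiet and low-pitched
const loudHigh = sineWave(0.8, 880, 44100); // one second, loud and high-pitched
```

Changing only the first argument changes how loud the tone is; changing only the second changes how high it sounds.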

Low frequency means low pitch. Look at . Notice that frequencies are usually plotted from the lowest on the left to the highest on the right.

A real sound is a mixture of waves of different frequencies and amplitudes. To get meaningful information out of a recording, we need to decompose the signal into the waves that make it up. We can do this with a mathematical tool called the Fourier transform, in a process called spectrum analysis. See 3Blue1Brown's introduction to the Fourier transform (see 0:30 -> 2:30): . An efficient algorithm for computing it, the fast Fourier transform (FFT for short), lets computers do this magical decomposition in real time.
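The decomposition can be tried out on a tiny signal. The sketch below is a naive discrete Fourier transform in plain JavaScript, not an FFT: it is much slower, but it computes the same spectrum and is easier to read. The function name `dftMagnitudes` and the test signal are made up for illustration.

```javascript
// Naive discrete Fourier transform of a real-valued signal of length N.
// Returns the magnitude of each frequency bin k = 0 .. N/2.
// This takes O(N^2) steps; an FFT gets the same result in O(N log N).
function dftMagnitudes(signal) {
  const N = signal.length;
  const magnitudes = [];
  for (let k = 0; k <= N / 2; k++) {
    let re = 0;
    let im = 0;
    for (let n = 0; n < N; n++) {
      const angle = (2 * Math.PI * k * n) / N;
      re += signal[n] * Math.cos(angle);
      im -= signal[n] * Math.sin(angle);
    }
    // Scaled so that a sine of amplitude A shows up as about A in its bin
    // (this holds for bins strictly between 0 and N/2).
    magnitudes.push((2 * Math.sqrt(re * re + im * im)) / N);
  }
  return magnitudes;
}

// Hypothetical test signal: 64 samples containing two sines,
// 3 cycles with amplitude 1.0 plus 10 cycles with amplitude 0.5.
const sampleCount = 64;
const mixedSignal = [];
for (let n = 0; n < sampleCount; n++) {
  mixedSignal.push(
    1.0 * Math.sin((2 * Math.PI * 3 * n) / sampleCount) +
    0.5 * Math.sin((2 * Math.PI * 10 * n) / sampleCount)
  );
}
const mixedSpectrum = dftMagnitudes(mixedSignal);
```

In `mixedSpectrum`, bins 3 and 10 hold roughly 1.0 and 0.5, and all other bins are close to zero: the two waves that made up the signal have been recovered.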

We can then look at the spectrum generated by the FFT and try to extract higher-level information about the sound: for example, the moment the bass drops. This allows us to create artwork that reacts to sound, such as transcribing music into visuals. Because frequency is continuous, we simplify by summing similar frequencies into bands. For example, the lowest band would be the bass, containing all the sound waves with frequencies between 20 and 60 Hz. Each band then has a single value, the sum of the amplitudes of its waves, which represents the band's loudness. See https://www.youtube.com/watch?t=209&v=4Av788P9stk&feature=youtu.be for further explanation.
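Summing a band can be sketched as follows, assuming the spectrum is an array with one amplitude per bin and the bins are evenly spaced from 0 Hz up to half the sample rate (the Nyquist frequency). The function name `bandLoudness`, the 44100 Hz sample rate, and the sample values are assumptions for illustration.

```javascript
// Sum the amplitudes of all spectrum bins whose frequency range starts
// between lowHz and highHz. `spectrum` holds one amplitude per bin; bins
// are evenly spaced from 0 Hz to the Nyquist frequency (sampleRate / 2).
function bandLoudness(spectrum, lowHz, highHz, sampleRate = 44100) {
  const nyquist = sampleRate / 2;
  const hzPerBin = nyquist / spectrum.length;
  const first = Math.max(0, Math.floor(lowHz / hzPerBin));
  const last = Math.min(spectrum.length - 1, Math.floor(highHz / hzPerBin));
  let sum = 0;
  for (let i = first; i <= last; i++) {
    sum += spectrum[i];
  }
  return sum;
}

// Hypothetical spectrum with 1024 bins (about 21.5 Hz per bin at 44100 Hz).
const exampleSpectrum = new Array(1024).fill(0);
exampleSpectrum[1] = 200; // energy around 21-43 Hz
exampleSpectrum[2] = 50;  // energy around 43-65 Hz
exampleSpectrum[100] = 255; // energy around 2150 Hz

const bass = bandLoudness(exampleSpectrum, 20, 60);      // bins 0..2
const highs = bandLoudness(exampleSpectrum, 2000, 6000); // bins 92..278
```

With 1024 bins, the 20-60 Hz band covers only three bins, so in practice a wider bass range may react more reliably.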

The sketch below creates an array in which each element represents a single band. We then use the FFT to assign the loudness of each band to the array. You must click on the canvas first to start the audio (browsers block sound until the user interacts with the page).
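One possible version of such a sketch is shown here, assuming p5.js with the p5.sound addon (1.x API) and the microphone as input. The band count, the canvas size, and the helper `toBands` are illustrative choices. The helper averages the bins in each band rather than summing them, which is the sum divided by a constant and keeps every band in the 0-255 range that `fft.analyze()` returns.

```javascript
const BAND_COUNT = 16; // number of bands; an arbitrary choice
let mic;
let fft;
let bands = new Array(BAND_COUNT).fill(0); // one loudness value per band

// Average equal-width groups of spectrum bins into `count` bands.
function toBands(spectrum, count) {
  const binsPerBand = Math.floor(spectrum.length / count);
  const result = [];
  for (let b = 0; b < count; b++) {
    let sum = 0;
    for (let i = 0; i < binsPerBand; i++) {
      sum += spectrum[b * binsPerBand + i];
    }
    result.push(sum / binsPerBand);
  }
  return result;
}

function setup() {
  createCanvas(640, 360);
  mic = new p5.AudioIn();      // microphone input (assumed sound source)
  fft = new p5.FFT(0.8, 1024); // smoothing 0.8, 1024 bins
  fft.setInput(mic);
}

function mousePressed() {
  userStartAudio(); // audio may only start after a user gesture
  mic.start();
}

function draw() {
  background(0);
  bands = toBands(fft.analyze(), BAND_COUNT); // measured once per frame
  const w = width / BAND_COUNT;
  noStroke();
  fill(255);
  for (let b = 0; b < BAND_COUNT; b++) {
    const h = map(bands[b], 0, 255, 0, height);
    rect(b * w, height - h, w - 2, h); // one bar per band, lowest on the left
  }
}
```

Because these bands are equally wide in Hz, most of the energy of typical music lands in the first few bars.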

Again, because the sound is continuous and we measure it only 30 times per second (fft.analyze() is called once per frame in the draw function), the measurements can look very random. It then helps to average the measurements over time to get smoother values; you need to program this yourself. I suggest averaging the last 10-20 values.
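A minimal sketch of such smoothing in plain JavaScript is a moving average over the most recent measurements. The class name `MovingAverage` is hypothetical, and this is only one of several ways to smooth a signal.

```javascript
// Keeps the last `size` measurements and returns their mean.
class MovingAverage {
  constructor(size) {
    this.size = size;
    this.values = [];
  }

  // Add a new measurement and return the smoothed value.
  add(value) {
    this.values.push(value);
    if (this.values.length > this.size) {
      this.values.shift(); // drop the oldest measurement
    }
    const sum = this.values.reduce((a, b) => a + b, 0);
    return sum / this.values.length;
  }
}

// Usage: one smoother per band, fed once per frame.
const bassSmoother = new MovingAverage(15); // 15 is within the suggested 10-20
const smoothedBass = bassSmoother.add(180); // pass in the raw band value
```

A larger window gives calmer visuals but makes them react later to sudden changes such as a bass drop.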

Where to look for inspiration

Max Cooper
A very popular contemporary ambient musician and VJ performer. He uses generative techniques and makes visuals together with generative artists and programmers.
You can search for the album "Yearning for the Infinite" and read about it here: https://www.yearningfortheinfinite.net/. The concept of the album is emergent behavior.
https://www.youtube.com/watch?v=j8SNmGHhfks
https://www.youtube.com/watch?v=_7wKjTf_RlI

Ryoji Ikeda
A very famous video artist of the older generation. Ikeda creates immersive audiovisual installations and is also known for his experiments with sound illusions.
See Transfinite
https://www.youtube.com/watch?v=omDK2Cm2mwo

David Mrugala / Thedotisblack YouTube channel
Transcribing nature sound onto paper: 
It was also exhibited at school: https://www.youtube.com/watch?v=5sY76MRS_XQ
Sound language, print: 
https://www.youtube.com/watch?v=5sY76MRS_XQ

Others

This is an excellent example of the visuals VJs use for EDM, mostly techno: you can see nine different geometric shapes in space being manipulated by the music: https://vimeo.com/68161863.

Audio artist Joelle as an example of advanced techniques: https://vimeo.com/116097721.

You can use the Spotify API to get AI-processed information about the music: https://developer.spotify.com/console/get-audio-analysis-track/ (example here: https://spotify-audio-analysis.glitch.me/analysis.html).

A simple example of audio-reactive artwork on the web: https://therewasaguy.github.io/p5-music-viz/demos/08_echonestPitchSegment/.