The overall aim of the investigation is to create current abstract visual music that is informed by, but not a re-making of, previous visual music. Visual music can be perceived as overly repetitive, cold and alienating if it seems to embody a purely mechanical alignment of music to image, or if it seems disengaged from both human emotions and natural imagery. One key objective, for me, is to create work that is non-narrative, abstract, and yet suffused with human presence and emotion.

An affordance but also a key issue with creating abstract visual music is that the possibilities appear limitless. As the animator Hans Richter elucidates:

The overwhelming freedom which the "abstract", "pure", "absolute", "non-objective", "concrete" and "universal" form offered (which, indeed, was thrust upon us) carried responsibilities. The "heap of fragments" left to us by the cubists did not offer us an over-all principle. Such a principle was needed to save us from the limitless horizons of possible form-combinations (1952: 78).

In the hundred years since Richter started to create abstract animations, visual music has become, as the art and media theorist Cornelia Lund asserts (2016), an 'umbrella-term'. Given this breadth, an article focusing on the development of one approach is timely. It is underpinned with references to seminal artists whose works have helped to form the canon of visual music, composers, scientists, sound theorists and visual music theorists, and artists who directly inform my work. As my investigations into creating visual music are practice-based, this article traces these investigations through the development of some of my works: Horizon (2014), Reservoir (2014), Shadow Sounds (2015) and Ambience (2016).

What is Visual Music?

The impulse to find correspondences between music and visuals and use these to build a new mode of creative activity - indeed a new art form - has a long history. Visual music encompasses many types of output: abstract paintings, time-based performance art such as colour organs, abstract film, projected light shows, art installations of film and expanded cinema (digital media). Though the types of outputs continue to proliferate, the unifying impetus behind visual music is still the idea of synaesthesia.

Synaesthesia is an involuntary response in one sense, such as sight, triggered by the stimulation of another sense, such as hearing. The prevalence of synaesthetes is estimated at one percent of the population. However, as the experimental psychologists Shinsuke Shimojo and Ladan Shams assert: 'Cross-modal integration [of the senses] is performed on a vast level in the brain and contributes significantly to adaptive behavior in our daily life' (2001: 505). Therefore, it is not surprising that there has been a quest to link and unify the senses through the arts. Ways of linking colour and music have been sought for centuries, but the exact linkage has never been agreed upon. Many have thought about this relationship as one colour to one tone in a musical scale that could be mapped, as Fred Collopy, designer of musical instruments and visual music historian, illustrated (2006: 64). The consummate physicist Isaac Newton produced a mapping that equates G to green. Father Louis Bertrand Castel, who built a candle-lit ocular harpsichord in 1730, produced a mapping that equates G to red. There have been many mappings over three hundred years and no consensus has been reached. My own investigation excludes one-to-one mapping of note to colour as it is too prescriptive and too mechanically repetitive for my aspiration of engaging the viewer.

Historically, the idea of synaesthesia has also involved a mystical aspect. As Bruce Elder, the filmmaker and critic, states: 'The experiences of prayer, meditation, contemplation, trance and dream are not incorporated into modernity's model of normative cognition' (2010: xxvi). Artists sought ways of affording transcendental experiences through their art. The painter Wassily Kandinsky engaged with the idea of synaesthesia as a mystical way to achieve a higher, visionary state. In 1807 Thomas Young published his discovery that light travels in waves. It had long been known that sound travels in waves. These scientific discoveries supported a mystical view of physics in the late nineteenth century: the idea that light waves and sound waves were both types of cosmic vibrations and that there was a physical basis for a synaesthetic art. Synaesthesia was seen as a way of unifying the arts, especially through colour and music. Kandinsky wrote his pioneering work 'On the Spiritual in Art' in 1911, setting out his belief that art could afford spiritual awakening. He added his knowledge of Russian Symbolism and Theosophy to his research into his synaesthetic multisensory responses and developed his own methodology of capturing his immediate artistic response, improvising on it and then developing his final composition. Kandinsky stated:

Colour is a power, which directly influences the soul. Colour is the keyboard. The eye is the hammer. The soul is the piano, with its many strings. The artist is the hand that purposefully sets the soul vibrating by means of this or that key. Thus it is clear that the harmony of colours can only be based upon the principle of purposefully touching the human soul (1977: 25).

Other contemporaneous abstract artists, such as Robert Delaunay and František Kupka, created works that expressed rhythm and formal structures like those used in musical composition, but were more sceptical about a colour-musical analogy. Kupka describes the difference between music and the visual arts: 'Now the fact is that listening to a musical work evokes different images in everyone, an accompaniment that each draws from his own visual memory. That is, chromatism in music and musicality of colors has validity only in metaphor' (quoted in Brougher & Mattis, 2005: 41). My work investigates chromatism in music and the musicality of colors metaphorically.

However, many artists continued to search for a scientific, physical basis for synaesthetic art. The influential Dadaist Raoul Hausmann spread the idea that there was a unifying identity that applied to light and mechanical waves, based on the nineteenth-century theory of ether. The recent invention of the photo-cell, which transforms light into electrical signals, appeared to him to be positive proof, and on this basis he conceived his Optofonetik. Writing in 1925, the innovator László Moholy-Nagy was convinced this 'scientific' approach would afford new forms of art (1987: 22). The desire for unification of the senses, the interweaving of the scientific with the mystical and the affordance of transcendental experiences continued to underpin visual music. This is evidenced by the work of the seminal practitioner James Whitney, who made films from 1942 to 1982 and was celebrated by the renowned visual music historian William Moritz (1985). Seminal practitioners such as John Whitney have speculated that Pythagorean musical harmony could apply to images:

This hypothesis assumes the existence of a new foundation for a new art. It assumes a broader context in which Pythagorean laws of harmony operate [...] In other words, the hypothesis assumes that the attractive and repulsive forces of harmony's consonant/dissonant patterns function outside the dominion of music (1980: 5).

However, though he created tremendously innovative computer-animated mathematical patterns, he did not find harmonic interrelationships that could be applied to both sound and image. Nor could the animators that followed.

The works of Brakhage, the Whitneys and Belson were underpinned by their fascination with creating abstract motion that was analogous to music and the aspiration to make music and image so seamless and unified that it would have a transformative effect on the viewer. In this way they hoped to bring the viewer into an egoless state linked to an ideal world: 'Their films do not refer to the actual world but instead use optical effects to pluck at the musical inner mind and allow each viewer to become a synaesthete' (Brougher et al., 2005: 145).

My work does not investigate the mystical; however, it does aspire to make vision and music more than the sum of the two parts by integrating them very closely. Experientially, I find this enables a meditative state in me as a viewer. This may relate to the actual cross-modal integration of the senses referred to above.


Horizon

In 2014 I created Horizon. The sky darkens from grey, to blue, to black. The horizon line cuts across: a vivid slash. Mapping a liminal space between abstract and representational landscape on a 2D screen, Horizon investigates how to abstract filmed natural imagery, creating a moving painting and creating metaphorical space.

How Best to Abstract Filmed Natural Imagery?

Abstracted natural imagery seems to afford a link to music. The photographer and art promoter Alfred Stieglitz revealed the musical inspiration underpinning the hundreds of quasi-abstract photographs he took of the sky: 'I wanted a series of photographs which when seen by Ernest Bloch (the great composer) he would exclaim: Music! Music! Man, why that is music! How did you ever do that? And he would point to violins, and flutes, and oboes, and brass, full of enthusiasm, and would say he'd have to write a symphony called Clouds' (1925: 255).

When reflecting on how best to abstract filmed natural imagery to accompany music, I looked to a painter whose work is transcendental. He observed nature, captured tumultuous motion in colour and created almost abstract work. J.M.W. Turner concentrated on his experience of the visible world, however much it was dissolved by dazzle, mist and light. He discarded what he knew of the structure of objects and matched what he saw using his painterly process. Indeed, the innovator László Moholy-Nagy made a similar connection: 'Abstract painting can be understood as an arrested, frozen phase of kinetic light display leading back to the original emotional, sensuous meaning of color of which William Turner (1775-1851), the great English painter, was an admirable predecessor' (1947: 150). Turner has been named 'the father of modern art'. There is a clear link between modern art and early abstract film, as Hans Richter asserts: 'Problems in modern art lead directly into the film. Organization and orchestration of form, color, the dynamics of motion, simultaneity, were a problem with which Cezanne, the cubists, the futurists had to deal' (1951: 160). For me, Turner provided a strong starting point with clear links to modern art, without treading over the same ground as the early abstract animators.

To abstract a natural scene I started with close observation of my experience of time passing as the sky gradually darkened over Margate Beach, where Turner painted. I observed the light and colours gradually changing as the sun set, whilst I captured the material, visual and audio data of the scene.

Figure 1. Margate Sky 001 Copyright 2014 by Julie Watkins

Figure 2. Margate Sky 002 Copyright 2014 by Julie Watkins

Figure 3. Margate Sky 003 Copyright 2014 by Julie Watkins

Figure 4. From Horizon Copyright 2014 by Julie Watkins

Stieglitz, especially his photographs Songs of the Sky (1923-5) and Equivalents (1925-9), also influenced Horizon. Stieglitz states:

I know I have done something that has never been done. Maybe an approach occasionally found in music. I also know that there is more of the really abstract in some "representation" than in most of the dead representation of the so-called abstraction so fashionable now (in a letter to Hart Crane, 10 December 1923, in Greenough & Hamilton, 1983).

Horizon explores the possibilities for creating abstracted images that transcend literal representations, with a similar aim of engaging viewers by affording a less associative and more meditative experience. In this way, I believe that music can evoke a meditative state.

Creating a Moving Painting

Paintings and photographs are static and do not have the musical aspect of unfolding over time. In 1927 Moholy-Nagy debated whether static painting was still necessary:

The essence of the reflected light play is the production of light-space-time tensions in colour or chiaroscuro harmonies and (or) in various forms by kinetic means, in a continuity of motion: as an optical passage of time in a state of equilibrium. The newly emergent impulse of time and its ever expanding articulation here produce a state of increased activity in the observer, who - instead of meditating upon a static image and instead of immersing himself in it and only then becoming active - is forced almost to double his efforts immediately in order to be able simultaneously to comprehend and to participate in the optical events. Kinetic composition so to speak enables the observer's desire to seize instantly upon new moments of vital insight, whereas the static image generates these reactions slowly. This indicates that there can be no doubt about the justification of both forms of creation (1987: 23).

Horizon is a time-collage; the continuous colour change of the sunset is sped up, whilst the sound of the waves remains in real time. The camera is static. The image is almost still. Changes in colour, tone and texture, together with movement without resolution, are used to create a moving painting. This is an attempt to allow the viewer to meditate on the image as a static image can be viewed, whilst the image gradually changes to keep the viewer's gaze active.

Creating Metaphorical Space

The methodology underpinning my work is to create field recordings of audio and capture video of stills and textures to layer, enhance and re-animate. This affords creating abstracted images by re-timing video, by blending layers of video both visibly and almost imperceptibly, and by re-animating to change focus and detail. I developed a mixed media process (both digital and painterly) that included creating brushstrokes and particle atmospheres. The overlapping and interwoven textures of different sizes give a sense of collaged, textural depth. The use of formal drawn perspective is purposefully avoided. Brushstrokes and sky-scape are sampled out of scale. The non-realistic perspective, with its multiple viewpoints and fragmentation of the screen (picture plane), is influenced by cubism. Moholy-Nagy explains how cubist collage can affect emotions:

In the early cubist collages an astonishing skill was apparent in manipulating these planes. They were employed in a hide-and-seek combination, woven each behind and about the other. When one does not worry about what each element means in its naturalistic connotation, then one can enjoy the pictorial and graphic wealth of these interpenetrating planes, shadings and textures […] a rhythmical and emotional exultation […] arriving at a new visual microcosmos of primordial emotional values (1947: 128).

Generally, sight is more associated with perceiving space and sound is more associated with perceiving time. However, as the composer Denis Smalley (2007) identifies, sound is also associated with perceiving space and perspective. My work explores perception of space through interventions such as creating metaphorical space and perspective by juxtaposing close audio with distant image. The nebulous, almost infinite distance of sky and horizon line and the close-up audio of (unseen) waves confound the viewer's expectation of space. This non-realistic use of scale and perspective is crucial to the balance of abstraction and representation in Horizon. See the moving image piece Horizon Copyright 2014 by Julie Watkins at


Reservoir

In 2014 I created Reservoir. The view changes like an Impressionist's slideshow as we circle the ever-distant reservoir, vibrant with life heard but unseen, off screen. Reservoir investigates treating footage as data, illuminating the materiality of the moving image and its temporalisation.

Figure 5. From Reservoir Copyright 2014 by Julie Watkins

Treating Footage as Data

I used vision and audio as data to animate Reservoir in order to fashion new aesthetic experiences from the world around me. As the film and video artist Malcolm Le Grice asserts of Len Lye's work:

It also embodies in the meaning of the work a philosophical concept that information subject to abstraction as component "data" becomes a new form of "raw material" available for "retrieval" in ways which construct a new experiential model of the world (2009: 317).

Reservoir is created from footage that I had filmed in 2000. I walked around the reservoir in Central Park in New York, capturing video and audio. The distant abstract landscape is doubly framed: glimpsed through the wire mesh and purposefully enclosed again by the presence of the edge of the frame. This translates the motion of walking around the reservoir into a graphic space. The motion is seen in the rapid movement of the close-up fence and the slower changes of the distant landscape.

Illuminating the Materiality of the Moving Image

Len Lye's films highlight the nature and materiality of celluloid film as the superimposed effects flicker and are imperfectly registered with live-action images. As Brougher et al. state: '[Lye's films] are precursors not only to the work of filmmakers such as Harry Smith and Pat O'Neill but also to structuralist films of the 1960s and 70s with their insistence on deconstructing the medium' (2005: 112). In Reservoir the footage is deconstructed until nearly abstract by digital layering and colour effects that foreground the grain and materiality of the medium. The resulting imagery is massively sped up and then step-framed, i.e. each frame is printed seven times. This breaks the flow of live action and makes the viewer aware of the individual frames: the base material of the moving image.
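The speeding up and step-framing described above can be sketched in a few lines of Python. This is a minimal illustration, not a record of the software or settings used for Reservoir: it assumes the footage is available as an ordered sequence of frames, and the function names `speed_up` and `step_frame` and the speed-up factor are hypothetical. Only the hold of seven repeats per frame comes from the text.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")  # any frame representation: an image array, a filename, etc.


def speed_up(frames: Sequence[Frame], factor: int) -> List[Frame]:
    """Speed footage up by keeping only every `factor`-th frame (assumed approach)."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return list(frames[::factor])


def step_frame(frames: Sequence[Frame], hold: int = 7) -> List[Frame]:
    """Repeat ('print') each frame `hold` times so motion advances in visible steps."""
    if hold < 1:
        raise ValueError("hold must be at least 1")
    return [frame for frame in frames for _ in range(hold)]


if __name__ == "__main__":
    # Stand-in footage: frame numbers instead of images.
    footage = list(range(70))
    # Hypothetical speed-up factor of 10, then each surviving frame held seven times.
    result = step_frame(speed_up(footage, factor=10), hold=7)
    print(result[:14])
```

Because each surviving frame is held for seven consecutive frames, the image changes at regular intervals, which gives evenly spaced visual events that audio can be placed against.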


Step-framing also affords vertical 'chords' with audio. Michel Chion, the composer and theoretician of audio-visual relationships, analyses how sound temporalises images: 'Temporalization actually depends more on the regularity or irregularity of the aural flow than on tempo' (1994: 14). This extends to how synch points, like chords in music, vertically converge and phrase the flow. I abstracted and re-timed the imagery and created a layered time montage through re-synching visual and audio components. The audio includes fragments of speech, bells, birdsong, chant, breath and steps. It was recorded at the same time as the imagery, is played in real time and is relayed in its original chronological order. Only one minute remains from the original forty-five minutes, which has been re-composed in relation to the imagery. See the moving image piece Reservoir Copyright 2014 by Julie Watkins

Shadow Sounds

In 2015 I created Shadow Sounds. Vibrant colours are animated with strident motion in contrast with the soft, shadowy, continuously moving, grey, dappled background. The animations are fused to playful non-verbal vocalisations. Some softly drift whilst others erupt with hard attack like a cartoon punch. Shadow Sounds playfully investigates three useful strategies for my work: animating to non-verbal vocalisations, fusing sound and image to create audio-image units and building a composition from audio-image units using musical phrasing.

Figure 6. From Shadow Sounds Copyright 2015 by Julie Watkins

Animating to Non-verbal Vocalisations

To create Shadow Sounds I started by defining a set of non-verbal vocalisations such as 'ooh', 'ah', 'eeh' and 'pah', which I recorded and then set animation to. The importance of vocalisation in both signalling and effecting emotions has been studied ever since the founding father of evolutionary theory Charles Darwin highlighted the importance of the voice as the transmitter of emotion in animals and humans (1890). As psychologists Rainer Banse and Klaus Scherer state, acoustic profiling via digital acoustic analysis has provided detailed data to differentiate vocal parameters for different emotions (1996). The aim of Shadow Sounds is to create a pilot for a new, interactive and affective experience, to move away from measuring the voice as audio data, or physiological data, or scientific neural data, and place shared, embodied experiences at the heart of the work.

To find a starting point for animation that was not based on some mapping of volume or frequency to image, I looked for inspiration to the light projections that influenced the first abstract animations. At the Bauhaus in 1922 Kurt Schwerdtfeger created Reflecting Light Games, lighting cardboard shapes so that they threw abstract shadow patterns onto a projection surface. Together with Ludwig Hirschfeld-Mack, he discovered warm and cold shadows and further developed this into Reflecting Plays of Colours. There is a distinct similarity between the projected shadows and the abstract films of this time. The dappled background of Shadow Sounds is influenced by the shadows created in these Bauhaus experiments. The polymorphous 3D light shapes of Thomas Wilfred's lumia influence the sound-shapes. Wilfred invented the Clavilux in 1919 in order to control the form, colour and motion of light through the Clavilux's keys. He projected strong white light through lenses and created polymorphous 3D forms, creating rhythmic motion, and then finally added colour.

Fusing Sound and Image to Create Audio-image Units

Fusing sound and image is influenced by Synthetic Sound, which is a way of transcribing sound as graphic elements on a film stock's optical track, as seen for example in Norman McLaren's Synchromy (1971). The history and development of Synthetic Sound are fascinating but lie beyond the scope of this article. In Shadow Sounds there is a consistent mapping of one sound to one animation as they are fused into audio-image units. This creates, by repetition, a consistent association of sound and image as one entity and reinforces synchronisation.

Non-verbal vocalisations are a form of vocal gesture, both in the sense that they are actions performed to convey feelings and in the sense of the composer Denis Smalley's sound-making gesture:

Sound-making gesture is concerned with human, physical activity which has spectromorphological consequences: a chain of activity links a cause to a source. A human agent produces spectromorphologies via the motion of gesture, using the sense of touch or an implement to apply energy to a sounding body. A gesture is therefore an energy-motion trajectory which excites the sounding body, creating spectromorphological life (1997: 111).

Animating to sound-making gesture was influenced by a discovery of the animator Len Lye. Lye describes composing figures of motion:

One morning it had been raining all night, and there were these marvelous fast little skuddy clouds in the blue sky. As I was looking at those clouds I was thinking, wasn't it Constable, the English landscape painter who sketched clouds to try to convey their motions? Well, I thought, why clouds, why not just motion? Why pretend they are moving, why not just move something? All of a sudden it hit me - if there was such a thing as composing music, there could be such a thing as composing motion. After all, there are melodic figures, why can't there be figures of motion? (1984: 31).

The strong gestural impetus of the sounds afforded composing figures of motion. The aim was to closely fit the character of each temporal phase (the onset, continuant and termination) of each non-verbal utterance and so build fused units of audio-image.

Building a Composition from Audio-image Units Using Musical Phrasing

My strategy was to compose the piece from these unique units of audio-image. Composer and colleague Andrew Hill has created a typology of visual music, dividing it into four main subcategories:

A. A purely visual approach to Visual Music, for example Thomas Wilfred's Lumia, or the works of Kandinsky or Klee.

B. Visual composition to pre-existing musics such as in some of the early works of Oskar Fischinger.

C. The composition of sound and image informed by traditions of music in which materials are structured within time. The sound and image are regarded as equal components joined in the context of a work and are both structured musically.

D. The synthesis of visual materials from sound and the representation of sound visually (2013: 8).

This composition falls into category C; the sound and image are not only regarded as equal but also as fused into units that are then structured musically. Smalley identifies gesture as 'propelling time forwards', associating it with linear narrative (1997: 113). Shadow Sounds is built from individual vocal gestures that are analogous to notes in tonal music and this gives the piece a strong forward impetus.

However, I was not satisfied with Shadow Sounds for two reasons. Firstly, the sound-shapes do not live up to their influences. The forms in Shadow Sounds should live more distinctly in 3D space or make much better and more graphic use of positive and negative space and of deep, nuanced shadow, as found in the photograms of the photographer Man Ray and the innovator László Moholy-Nagy. Secondly, the mapping of animation to sound is not accurate enough; each animated shape needs to be more nuanced in relation to the onset, continuant and termination of each non-verbal utterance. Nevertheless, Shadow Sounds afforded the creation and testing of useful strategies that I developed further when creating Ambience.


Ambience

In 2016 I created Ambience. The melody of a traditional song unfolds. The song has no words and is not literal in any sense. The singer remains unseen, off-screen. There is no face, no mouth to gaze upon. The emotion of the melody and performance is embodied in soft circles and light plays that move in sync with the music. Delicate colours softly change and particles sway and flow as if performing. Ambience investigates animating to the human voice wordlessly singing traditional songs, building a composition using musical phrasing, and abstracting visual performance, especially the singer's face and motion.

Figure 7. From Ambience Copyright 2016 by Julie Watkins

Animating to the Human Voice Wordlessly Singing Traditional Songs

I compose visual music to the human voice to underpin the abstract movement of light and colour with human motivation and emotion. As Simon Frith persuasively argues: 'Because of its qualities of abstractness, music is, by nature, an individualizing form. We absorb songs into our own lives and rhythm into our own bodies; they have a looseness of reference that makes them immediately accessible' (2011: 121). Furthermore, we all do something like singing when we make sounds and so when we hear singing, consciously or unconsciously, we are aware that we embody the same type of instrument and in some sense sing along with the singer, in a way that is quite distinct from a non-violinist hearing a violinist play. There is a strong sense of identification with the singer.

Each singer sounds unique as their sounds are shaped by their anatomy. The lungs push air through the vocal folds, which then resonates in the mouth and throat cavities (the vocal tract). The singers John Potter and Neil Sorrell (2012) liken this to a power supply, moving an oscillator and a resonator. Unlike other instruments, the lungs, vocal folds and vocal tract were not designed specifically to make music. Singers stretch the functionality of their vocal tract (a piece of anatomy that was primarily designed to prevent choking) in order to reach for the sounds they want to make. The particular qualities and shapes of the human flesh and bone of the singer play a large part in determining the final sound. This uniqueness is a further humanising quality. The only visual representation in Ambience is of the sound of the singer's voice, created using a digital version of the oscilloscope, inspired by the first works to create sound imagery by filming oscilloscope patterns: Hy Hirsh's Divertissement Rococo (1951) and Mary Ellen Bute's Abstronic (1957). This is layered into the other imagery.

Figure 8. From Ambience Copyright 2016 by Julie Watkins

The choice of traditional music was inspired by Mary Ellen Bute: 'The two abstronic films I have made are based on the music of 'Hoe Down' by Aaron Copland and 'Ranch House Party' by Don Gillis. Because this music is simple rhythmically, clear and sharp, I thought it suitable for my first experiment in this new art medium' (1954). Additionally, traditional songs are formed from, and constitute part of, a shared pool of experience. As Will Hodgkinson found: 'Communities sing to tell stories, to mark events, to replay the news and to bring poetry to the most mundane aspects of daily life' (2009: 67). The songs are sung wordlessly to emphasise the emotion and avoid the associations that cling to words. Newly sharing traditional songs follows the impetus of filmmakers such as Oskar Fischinger, Hy Hirsh and Harry Smith, who, as Kerry Brougher asserts: '[It] Began [by] merging painting's spiritual dimensions with popular culture; for them art was not so much about the self but rather a commonly shared, egoless experience' (2005: 120).

Building a Composition Using Musical Phrasing

Smalley identifies structural hierarchies in tonal music:

'the note is regarded as the lowest structural level, and all tonal music is made up of note-groupings of increasing dimensions as one moves outwards through the form - from note to motive to phrase, and so on. In addition, metrical structure gives the lowest-level note a pulse, which defines the minimum possible density of movement' (1997: 114).

Sustained notes display a graduated continuant. Sustained, sung notes are prolonged by breath; this elongates the gesture and makes the listener aware of every underlying nuanced change. These notes afford the greatest development of variants. To create Ambience, I asked the singer, Clare McCaldin, to wordlessly sing musical iterations of a traditional song, focusing first on smooth singing and then on articulated, aspirated singing. She sang different versions with the same melody but at different speeds, with different articulation and timbre and with different prosody. This afforded editing phrases of the song together to make a new song structure.

Abstracting Visual Performance, the Singer's Face and Motion

Whitney (1994) advocated that animation should not directly represent music but demonstrate 'complementarity', to be more expressive and respond in a more aesthetic and less mechanical way to the tension and resolution, indeed to the emotions, within the music. Ambience follows Whitney's intentions in that the images respond on a higher level to the music, using input, i.e. data, from the singer's performance. From Disney onwards, character animation has had a tradition of using human actors to closely model animation on dramatic performance and so engage the audience in the story. As Stone et al. assert: 'Engaging dramatic performances convincingly depict people with genuine emotions and personality' (2004: 1). This has been further developed with motion capture and is widely used today. Abstract animation can draw on the audience's understanding of character animation and has its own history of conveying emotion through animating motion. Continuing this thread, Freedberg and Gallese state that: 'Recent studies in macaques and humans demonstrated that mirror neurons not only underpin action understanding, but they are also involved in understanding the intentions that underlie action' (2007: 200). I tracked Clare's head gestures and her singing motion, and animated particle effects to this motion. This showed her movement in the abstract, without the individualising associations of her face or physical form. As Jordan Belson suggests:

I don't want there to be any ideas connected to my images, and if there are any there, if anybody sees any, those are entirely in the eyes of the beholder […] Actually, the films are not meant to be explained, analysed, or understood. They are more experiential, more like listening to music (quoted in Brougher et al., 2005: 148).

My work explores abstracting the face, for example, by mapping the singer's facial movement to changes in colour and showing only these changes, not the face itself. The aim is to afford the spectator a less associative and more meditative experience. In other variations simplified forms may evoke a face, through pareidolia [2]. As Leonardo da Vinci wrote:

If you look at any walls spotted with various stains or with a mixture of different kinds of stones, if you are about to invent some scene you will be able to see in it a resemblance to various different landscapes adorned with mountains, rivers, rocks, trees, plains, wide valleys, and various groups of hills. You will also be able to see divers combats and figures in quick movement, and strange expressions of faces, and outlandish costumes, and an infinite number of things, which you can then reduce into separate and well conceived forms. With such walls and blends of different stones it comes about as it does with the sound of bells, in whose clanging you may discover every name and word that you can imagine (quoted in MacCurdy, 1958: 873).

According to the neurologist Nouchine Hadjikhani, typical people are born with facial perception and develop it further, with the social brain, to become excellent at detecting faces: they become 'face experts'. That is why we can experience pareidolia. Hadjikhani et al. have found that pareidolia happens early in the cognitive process, in the subcortical system; it is not a post-recognition cognitive re-interpretation process (2009). My aim is to create images that intrigue and engage the viewer in the way that pareidolia does and that, also like pareidolia, evoke forms only in passing.

Conclusion

I have highlighted my main concerns and findings: different ways of unifying sound and image, and avoiding obvious associations by abstracting both natural imagery and visual performance, whilst at the same time creating visual music that is clearly full of emotion. I have a growing understanding of affordances within this area of visual music. My practice has developed from creating a moving painting to an impressionistic slideshow, to animated non-verbal vocalisations, to animating an abstracted performance of an emotive melody. I have historicised my methodologies for creating abstract visual music that has no narrative but is suffused with human presence and emotion and might afford a meditative experience. This investigation affirms that visual music has a long, rich, multi-faceted history that can underpin new forms of visual music today.

[1] Abstracted Animation is my own term; it refers to animation that has texture, depth and expressive movement, without overtly representing concrete reality.

[2] Pareidolia is the human propensity to discern meaningful images in random patterns, for example seeing the moon as having a face.

Special thanks to Clare McCaldin for her singing.

Shorter versions of parts of this paper and the visual music pieces have been presented at DRHA 2016 and Sound/Image 2016.

Visual Music Pieces

Horizon Copyright 2014 by Julie Watkins

Reservoir Copyright 2014 by Julie Watkins

Shadow Sounds Copyright 2015 by Julie Watkins

Ambience Copyright 2016 by Julie Watkins

References

Brougher, K., Mattis, O., Museum of Contemporary Art (Los Angeles, Calif.), & Hirshhorn Museum and Sculpture Garden (Eds.). (2005). Visual Music: Synaesthesia in art and music since 1900. [London]; Washington, D.C.; Los Angeles: Thames & Hudson.

Bute, M. E. (1954). ABSTRONICS: An Experimental Filmmaker Photographs the Esthetics of the Oscillograph. Films in Review, 5(6). Accessed 18/12/2016

Chion, M., Gorbman, C., & Murch, W. (1994). Audio-vision: Sound on screen. New York: Columbia University Press.

Collopy, F. (2006). Playing (with) Color. Glimpse: The Art and Science of Seeing, 2(3): 62-67.

Darwin, C. (1890). The Expression of the Emotions in Man and Animals (2nd ed.). London: John Murray. Accessed 13/01/2016

Elder, R. B. (2010). Harmony and Dissent: Film and Avant-garde Art Movements in the Early Twentieth Century. Waterloo, ON: Wilfrid Laurier University Press.

Freedberg, D., & Gallese, V. (2007). Motion, Emotion and Empathy in Esthetic Experience. Trends in Cognitive Sciences, 11(5): 197-203. Accessed 05/09/2013

Frith, S. (2011). Music and Identity. In Questions of Cultural Identity (pp. 108-127). London: SAGE Publications Ltd. Accessed 03/09/2016

Hadjikhani, N., Kveraga, K., Naik, P., & Ahlfors, S. P. (2009). Early (M170) Activation of Face-Specific Cortex by Face-Like Objects. NeuroReport, 20(4): 403-407. Accessed 09/08/2016

Hill, A. (2013). Interpreting Electroacoustic Audio-visual Music. Leicester, UK: De Montfort University.

Hodgkinson, W. (2009). The Ballad of Britain: How music captured the soul of a nation. London: Portico.

Kandinsky, W. (1977). Concerning the Spiritual in Art. New York: Dover Publications.

Le Grice, M. (2009). Experimental Cinema in the Digital Age (1. publ., reprinted). London: BFI Publications.

Lund, C. (2016). How to Talk About Visual Music Without Having your Fingers Burnt. Presented at Sound/Image, University of Greenwich.

Lye, L., Curnow, W., & Horrocks, R. (1984). Figures of Motion: Len Lye, selected writings. Auckland: Auckland University Press.

MacCurdy, E. (1958). The Notebooks of Leonardo da Vinci: Arranged, Rendered into English and Introduced by Edward MacCurdy (Vol. 2). New York: George Braziller.

Moholy-Nagy, L. (1947). Vision in Motion. Chicago: Paul Theobald & Co.

Moholy-Nagy, L. (1987). Painting, Photography, Film. Cambridge, Mass: MIT Press.

Moritz, W. (1985). Who's Who in Filmmaking: James Whitney. Sightlines, 19(2). Accessed 18/12/2016

Potter, J., & Sorrell, N. (2012). A History of Singing. Cambridge: Cambridge University Press.

Richter, H. (1951). The Film as an Original Art Form. College Art Journal, 10(2): 157. Accessed 22/07/2016

Richter, H. (1952, February). Easel-Scroll-Film. Magazine of Art, 78-86.

Scherer, K., & Banse, R. (1996). Acoustic Profiles in Vocal Emotion Expression. Journal of Personality and Social Psychology, 70(3): 614-636.

Shimojo, S. (2001). Sensory Modalities are not Separate Modalities: Plasticity and interactions. Current Opinion in Neurobiology, 11(4): 505-509. Accessed 13/01/2016

Smalley, D. (1997). Spectromorphology: Explaining sound-shapes. Organised Sound, 2(2): 107-126. Accessed 05/12/2015

Smalley, D. (2007). Space-form and the Acousmatic Image. Organised Sound, 12(1): 35. Accessed 05/12/2015

Stieglitz, A. (1925). How I came to Photograph Clouds. The Amateur Photographer and Photography, (56): 255.

Stieglitz, A., Greenough, S., & Hamilton, J. (1983). Alfred Stieglitz, photographs & writings (1st ed). Washington: National Gallery of Art.

Stone, M., DeCarlo, D., Oh, I., Rodriguez, C., Stere, A., Lees, A., & Bregler, C. (2004). Speaking with Hands: Creating animated conversational characters from recordings of human performance (p. 506). ACM Press. Accessed 11/09/2013

Whitney, J. (1980). Digital harmony: On the complementarity of music and visual art. Peterborough, N.H: Byte Books.

Whitney, J. (1994). To Paint on Water: The Audiovisual Duet of Complementarity. Computer Music Journal, 18(3): 45-52.


Julie Watkins is a senior lecturer in Film and Television at the University of Greenwich. She worked as lead creative in prestigious Post-Production facilities in Soho and Manhattan. She designed concepts and led Technical Direction, Animation, Motion Graphic and Visual Effects Teams for Commercials, Broadcast Graphics and Films. She taught at New York University. She joined the University of Greenwich in 2006, where she initiated a Film and Television degree and a partnership with the BBC. She has an MA (distinction) in Graphic Design from University of the Arts London. She has presented papers and shown work at DRHA 2014, 2015 and 2016 and Sound and Image 2015 and 2016, and is completing a Ph.D.