The ideology of linear perspective as realism assumes that ‘transparent immediacy’ (Bolter and Grusin 1999: 103) is offered by successive technologies, from painting to photography and Victorian stereoscopy, to film and television, games and Virtual Reality. The linear-perspectival methods used in Renaissance painting suggested to art historians such as Ernst Gombrich (2011) that reality was shown with immediacy, that it was possible to regard paintings as a view on to a world, albeit from a fixed point-of-view. However, this disregards surface materiality: for example, brushstrokes must be smoothed away in order for a painting to look realistic and not like daubs of paint. Modernists were intensely interested in the material surface of images. They treated photographs as objects, cut them up into photomontages and collages, remediated them, and made the viewer aware of photographs as objects. This artistic arranging and layering of elements foreshadowed digital layering today. Media on a computer, even when it is not consciously remediated (for example, a museum’s archive of a painting), is necessarily changed. A luminous computer screen is different from paint or print: colours are rendered differently on a screen and surfaces are displayed differently. In the 1980s PAL and NTSC were felt to be sharp because we supplied the missing information autonomically through our visual perception systems. Forty years later we are used to higher resolution images, which make images in PAL and NTSC look very soft and lossy. Older media technology in television and film is mimicked (for example film weave and scratch or VHS glitch) to create information that purports to be from a different age. It is assumed that current broadcast resolutions are transparent, just as photographs were assumed to be transparent. However, as the spatial video artist Pipilotti Rist (2021) suggests, video, even at 4K resolution, is much rougher and flatter than our vision in real life.
When computer programmers created windows as a user interface the windows were meant to be transparent. However, neither the computer as a medium nor the interface is transparent. It is necessary to click to access what you want to see; you cannot just look and immediately access what you want to see. As Rist (2021: online) states: ‘Today we put all our knowledge, our feelings, our history behind flat screens’. By showing the physicality of the screen she makes us aware of our own bodies and allows us to alternate between feeling how our bodies are as we view her work and feeling absorbed, bodiless, in the images she is projecting.

The breaking of the flat plane into three-dimensional depth was achieved by the Victorian stereoscope. It allowed the viewer to transform two flat images taken in parallel into seeing a scene in relief, with a real sensation of depth. The spectacle of this form of viewing is similar to Virtual Reality in that it promised to free the viewer into a real three-dimensional space and required a special viewing apparatus. The exact relationship of the viewer with the stereoscope was vital. The space that opened up in the stereoscope was created from parallax, without the curvature of our visual system. It seemed to consist of receding planes, somewhat akin to stage-flats. The viewer’s eyes had to be static in order to unify the image optically through convergence. Once their eyes moved the illusion was shattered until they fixed on another area. This experience broke with linear perspective and established the embodied viewer by integrating the viewer as an essential part of producing the spectacle. However, in doing so it limited the body in relation to the scene, was limited by human anatomy and required continuous human effort of fusing and re-fusing the images. As the art critic Jonathan Crary (2005: 133) suggests:

The prehistory of the spectacle and the ‘pure perception’ of modernism are lodged in the newly discovered territory of a fully embodied viewer, but the eventual triumph of both depends on the denial of the body, its pulsings and phantasms, as the ground of vision.

Virtual Reality aims to be immersive by offering interaction in three dimensions: instead of looking into a frame to a world beyond, one enters the world. Virtual Reality is less planar than the Victorian stereoscope. However, as Rist (2021) posits: ‘Virtual Reality tries to make us believe it is not flat; but it is’, even though the experiencer can turn their head. The goal of Virtual Reality is to be transparent so that the experiencer has a completely immersive experience, but transparent immediacy is lost as interacting with Virtual Reality is made possible by donning a head-mounted display. Additionally, a wealth of research points to some experiencers’ discomfort and even health issues; as Chattha et al (2020: 130486) suggest: ‘motion sickness limits the VR community in the full adaptation of this immersive technology.’ The manipulative effect of technologies from painting to film to television and Virtual Reality to evoke emotions and measurable affective and autonomic changes has been widely researched. The audio-visual theorist Michel Chion posits that the power of film affects our perceptions and our autonomic responses such as breathing (Chion, Gorbman and Murch, 1994). This points to experiencers having authentic, embodied experiences, but, as Bolter and Grusin (1999: 165) suggest, through the remediation of generations of point-of-view technologies these technologies are providing their own ‘self-authenticating experience[s]’. In order to differentiate between these self-authenticating experiences and richer embodied experiences we will turn to visual perception.

Our visual brains use a simplified version of the physics of the real world. Our simplification is efficient and allows for rapid processing, a necessity for survival. Our visual brain is tolerant of visual elements not conforming to the physics of the real world, for example: reflections, shadows and colours that would not be found in the real world pass unnoticed in an image and do not hinder the viewer’s understanding. Flat representation is meaningful in all cultures and for infants; i.e. it is not dependent on learning flat equivalences for three-dimensional objects (Gombrich 2011). This gives artists the freedom to change the physics of their created worlds to suit the message they want to convey. As Patrick Cavanagh writes:

Our ability to interpret representations that are less than 3D indicates that we do not experience the visual world as truly 3D, and has allowed flat pictures (and movies) to dominate our visual environment as an economical and convenient substitute for 3D representations.

(Cavanagh 2005: 304)

This tolerance allows us to accept images with linear perspective as representing the three-dimensional world and flat photographs as true representations of the world. David Hockney has long criticised linear and one-point perspective and the perspective imposed by photography, which he suggests reduces our vision to a static experience as seen through one frozen eye (Hockney and Gayford 2016). In this paper I posit that our visual experience of being in 3-D space is not captured when we take a photograph and by extension it is not captured by later technologies based on linear perspective. Elevating photographic visual reality diminishes the understanding of the power of individual visual sensibility and perception imbricated in lived experience. Accepting point-of-view technologies, such as photographs, as the ultimate visual veracity conforms to the computer-biased norm of elevating what is most easily measured, computed and communicated. Artists can expand our sense of space. Some artists, including Rist (2021), aim to free images from being behind screens, to bring images into the room with us so that we have fully embodied experiences. This paper establishes key aspects of our visual processing in relation to sensing depth, examines seminal artworks that expand our sense of space, excavates a phenomenological viewing of a painting and explorations from my own practice and ends with a call to more awareness of our richness of vision.

Vision Processing and Experiencing 3-D Space

The key aspects of our vision processing creating a sense of depth addressed here are: building representations of the world, field of vision, multiple perspectives, tactile vision, perceiving depth through transparency, rhythm and movement.

We build our representation of the world by focusing reflected light on to our retinas and we process this data in order to make sense of what we see (Livingstone 2002). The artificial intelligence and neuroscience researcher Dileep George suggests that processing vision takes up almost one third of neuronal computations that deal with perception, attention, thought and episodic memory. This processing seems effortless to us, but is immensely complex and computers cannot match us, for example, in our ability to recognise invariant visual patterns. We can recognise that images are the same ‘despite changes in location, size, lighting conditions and in the presence of deformations and large amounts of noise’ (George 2008: 8). George goes on to explain how we achieve this. We are not supervised as we learn to do this, but train ourselves through learning to generalise what we see into ‘manifolds’ of images. This depends on our using temporal information (if the images are close in time they are more likely to be of the same object) and the relative motion between the objects and ourselves. Thus, seeing images over time and having multiple view-points is built into our vision and affects how we interact with the world. This informs all the aspects of our vision that are delineated below.

Our field of vision is key to our visual experience of being in 3-D space. This is not captured when we take a photograph. In part this is due to differences in how we and cameras capture three-dimensional space. A photograph is made by light reflecting from objects being focused by a lens onto a sensitive area: the lens and resolution of the sensor are key factors in the final image. Our field of vision is determined by the focal length of our eyes, which is about 22 mm, but this is complicated to measure in practice because, unlike a photographic sensor, our retinas are curved. Additionally we combine data from our left eye and right eye giving an overlap of about 130 degrees. As can be seen through a camera lens, a narrow angle of view loses sense of depth, and a very wide angle of view exaggerates the relative sizes of objects, stretches objects near the edge of frame and introduces curvature. Our preference for curvature over linear perspective images with straight lines is well-documented (Leder et al. 2004; Palumbo, Ruta and Bertamini 2015; Gómez-Puerto, Munar and Nadal 2016; Penacchio and Wilkins 2015). This cannot be due to acclimatization as linear perspective has depicted three-dimensional space in images for centuries and is seen ubiquitously in digital imagery both static and moving today. Alistair Burleigh et al suggest that: ‘curved forms are easier for the human visual system to process and therefore more comfortable to view’ (Burleigh, Pepperell and Ruta 2018: 16). Generally we fuse the data from both our eyes into a single image. Therefore, to see a linear perspective image depicting true three-dimensional space we need to be in exactly the right position in front of the image. This still does not achieve our level of visual perception as our binocular vision allows us to judge depth and space through the retinal disparities between our two eyes (Howard and Rogers 1996). 
Currently ‘natural perspective principles’ (Burleigh, Pepperell and Ruta 2018: 18) that include curvature and, it is suggested by Burleigh et al, more closely mimic our visual system are being further explored.
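The relationship between focal length, angle of view and edge stretching described above for camera lenses can be illustrated numerically. The following is a minimal sketch for an idealised rectilinear (straight-line-preserving) lens only; the 36 mm sensor width and the function names are assumptions chosen for illustration, and the calculation is not meant to model the eye, whose retina is curved.

```python
import math


def angle_of_view_deg(sensor_width_mm: float, focal_length_mm: float) -> float:
    """Horizontal angle of view of an ideal rectilinear lens focused at infinity."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


def edge_stretch(off_axis_deg: float) -> float:
    """How much a small object is elongated away from the image centre,
    relative to its width, when it sits off_axis_deg from the optical axis
    in a rectilinear projection. The ratio is 1 / cos(theta)."""
    return 1.0 / math.cos(math.radians(off_axis_deg))


# Assumption: a 36 mm wide sensor (the common "full-frame" width).
SENSOR_WIDTH_MM = 36.0

for focal_length in (18.0, 50.0, 200.0):
    aov = angle_of_view_deg(SENSOR_WIDTH_MM, focal_length)
    # Stretch at the very edge of the frame, i.e. at half the angle of view.
    stretch = edge_stretch(aov / 2.0)
    print(f"{focal_length:6.1f} mm lens: angle of view {aov:5.1f} deg, "
          f"edge stretch x{stretch:.2f}")
```

Under these assumptions a long lens gives a narrow angle of view with almost no edge stretching, whereas a short lens gives a wide angle of view in which objects near the edge of the frame are visibly elongated.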

We see most detail and texture in the centre of our gaze, via the tiny area of our retina, the fovea. In contrast to this central vision, our peripheral vision organises the wider scene spatially, so we know where to direct the attention of our acute central vision and how to categorise the scene (Trouilloud et al. 2022). The artist Paul Klee uses our lack of all-over detail as we gather visual data. He invites his work to be explored without the constraint of perspective leading the eye. There is no one point of focus but a myriad of possible points and innumerable paths to explore across his images. As he states:

The limitation of the eye is its inability to see even a small surface equally sharp at all points. The eye must ‘graze’ over the surface, grasping sharply portion after portion, to convey them to the brain which collects and stores the impressions

(Klee and Moholy-Nagy 2000: 33)

Our central angle of view, what we see if we do not move our head or our eyes, is about 40 to 60 degrees. However, we are always moving our eyes and refresh our vision several times a second. Our peripheral vision has spatial imprecision which enables us to create illusions and to complete incomplete renderings of objects, for example in Impressionist paintings where detail, when viewed closely, falls away into brushstrokes, such as Claude Monet’s Shadows on the Sea (1882). At first glance we see the sea; then, looking more closely, we become aware of individual brushstrokes. We find these incomplete renderings lively because they make us actively complete them and each time we glance at the image we complete it slightly differently (Gombrich 1960). In addition to spatial limitations there are temporal limitations, as we and the world are in movement. Often we capture the world in a glance. We do not freeze a moment in time and examine all the details: we form an impression of the scene, with very little seen in detail. And this is how we remember a scene in movement. Henri Matisse posits that our perception of movement is not caught in photography: ‘when we capture [movement] by surprise in a snapshot, the resulting image reminds us of nothing that we have seen’ (Matisse 1978: 77). This reduction to an impression, as seen in Monet’s Shadows on the Sea, comes much closer to rendering what we see in a glance than a photograph could. Monet’s sea appears to be in motion and thus comes much closer to our memory of sea than waves frozen in a photograph.

We gather visual data from multiple perspectives and use these to build our understanding of the world. We are able to recognise someone or an object seen from completely new angles because in our visual memory we build a manifold recognition for a person or object that is view invariant. Livingstone (2002) suggests that Cubism is related to our processing of imprecision via our peripheral vision and how we form visual memories and that Cubism might evoke this aspect of our memory and therefore be pleasing to us. This is one example of how art movements that use multiple dimensions in non-linear perspective appeal to us. Whereas Renaissance paintings’ linear perspective assumes that there is a single viewpoint, Avant-garde art deconstructs the frame, mirroring new concepts of artists and the world that arrived with the great social and political upheavals of the first two decades of the twentieth century. In the 1920s Avant-garde art widely promoted the individuality of the artist, transcendental experiences and the concept of a multi-dimensional universe (Tarasov 2017). This extended to new, radical views of traditional art. The Russian philosopher Pavel Florensky suggested that: ‘The composition [of the icon] is constructed as if the eye were looking at different parts of it, while changing its position’ (Florenskiĭ 2006: 204). Raoul Hausmann posited that photographs run counter to creative vision in his 1921 manifesto Wir sind nicht die Photographen (We are not Photographers). He suggested that the linear perspective that the camera captures distances the photographer from the scene and places the photographer at the peak of a visual pyramid as master of the scene. This static position negates opportunities for a living, transformational vision: ‘a radically relational mode of seeing and understanding the world that embraces the complexity, vitality, and even the agency held by the objects of one’s gaze’ (Hackbarth 2020: 63).
However, he found reassembling and collaging photographs and reproductions of photographs useful for his photomontages. In Hausmann’s ABCD (1924) multiple objects rendered at different scales and from different single view-points are positioned in a way that allows no single perspective point. There is no cohesive lighting source: each object is rendered as if lit individually and modulated grey tones tumble across the scene. Additionally, textures vary with the sources, which include printed globe and hand, photographic face, graphic letters and numbers. Nevertheless there is a strong frontal composition based on dynamic diagonals that is emphasised by the otherwise chaotic lack of cohesion. The physical overlapping of the cut images draws the eye into the composition and the central image of the open mouth. The whole is not meant to be read as a sequence of symbols, but reacted to on first confrontation on a visceral affective level. The way the photograph is used here shatters the concept of the photograph as an illusion of reality.

Tactile vision: sensual and tactile qualities were seen as central to vision by many painters. The painter Mikhail Vrubel rejected the single vanishing point and emphasised the importance of the surface of the painting, foregrounding its tactile and material qualities. He was as seminal in his influence on the Russian avant-garde as Cézanne had been to Western art (Rivkin and Ryan 2004). Pavel Florensky suggested, in his lectures on the Reverse Perspective (1919), that visual scenes could best be interpreted through attention to the tactile and sensual and by emphasising the whole physiology of the perceiver (Florenskiĭ 2006). This had a great influence on the Constructivists. These influences continued via artists such as Mark Rothko, who positioned sensuality and touch as the central sense to the plastic arts (Rothko 2004). This sensuality overrides form and ground, creating a space in which our visual sense perceives depth through texture and mark making without the presence of any elements of formal perspectival space. Gilles Deleuze, writing on the work of Francis Bacon, foregrounds touch in close viewing: ‘grasped in a close view, a tactile or “haptic” view … there is no relation of depth or distance, no incertitude of light and shadow’ (Deleuze, Smith & Conley 2005: 5). In this haptic vision the form and the ground are perceived to be in the same place and the artist creates form or depth through their painterly gestures. Deleuze suggests that this sensual, tactile mode of seeing allows the experiencer to perceive afresh.

The words Leiris uses to describe Bacon – hand, touch, seizure, capture — evoke this direct manual activity that traces the possibility of fact: we will capture the fact, just as we will ‘seize hold of life.’ But the fact itself, this pictorial fact that has come from the hand, is the formation of a third eye, a haptic eye, a haptic vision of the eye, this new clarity

(Deleuze, Smith & Conley 2005: 161)

He aligns this tactile vision to colour: ‘There is indeed a creative taste in color, in the different regimes of color, which constitute a properly visual sense of touch, or a haptic sense of sight’ (Deleuze, Smith & Conley 2005: 153). He contrasts this with tonality: ‘a haptic sight of color-space, as opposed to the optical sight of light-time.’ (Deleuze, Smith & Conley 2005: 139).

In order to clarify tactile vision and responding to sensuality and colour over form and ground this paper will describe phenomenological viewing of a painting. I feel drawn by an image. I am responding on a sensory level that evokes affective responses. As Roland Barthes states: ‘What I can name cannot really prick me! The incapacity to name is a good symptom of disturbance’ (Barthes 2000: 53). Looking can cause an affective shock as I allow myself to be ‘pricked’ by the image that I do not name. When my gaze feels drawn by a painting, I walk closer to it and an area of ‘nameless appearance’ (Benjamin 1972: 379) draws me in closer. I weave in and out of naming and not naming, experiencing and reflecting on my experience (Kozel 2007). I slip between what the artist is depicting, the physicality of the real world, and also the physicality and formal pictorial aspects of the painting, and sensations I feel looking at it. I react and I question what I am reacting to. For example, my gaze was drawn by the beautiful delicate surface of Whistler’s Nocturne Blue and Silver: Cremorne Lights (1872). Drawing very close to the image I saw distance, distance that became more tangible as I identified distant misty buildings. Their brown-grey was pricked with orange-yellow light. The light then became luminous distant dots and long flickering lines. Taking a step back and widening my view these lines resolved into long reflections of light on water. Stepping back further I saw the sky and sea are almost merged together by layers of silvery mist. I became aware of the formal element of composition. Three-quarters of the image is water. The half-dissolved buildings seem to float in layers of mist, weightless above their long reflections on the water. The brushstrokes describing the waves on the water and clouds in the sky were more tangible than the distant buildings. 
The visible rhythmic brushstrokes revealed gesture, the layering of time in the painting and a sense of the artist’s intention. The painting had highly sensuous, subtly filtered light, depicted with subtle tones and hues that were muted and nearly equiluminant. I reflected that there is a tension between J. M. Whistler’s sensuous view and his aesthetic minimalist approach and that this very tension between reduction and sensuality creates an endless sense of depth.

Perceiving depth through overlapping objects and transparency. As can be seen in Nocturne Blue and Silver: Cremorne Lights, transparency can radically change the understanding of fundamental perceptual elements such as depth and colour, in the real world and in abstract objects. An object is seen as transparent when it causes a difference in lightness or luminance of the background, the shape of the object is perceivable because there is some kind of boundary, and the object is perceived as being in front of the background. We see the transparent object as the source of a depth effect. As Ken Nakayama et al found:

[T]ransparency is not coupled strongly to real-world chromatic constraints since combinations of luminance and color which would be unlikely to arise in real-world scenes still give rise to the perception of transparency. Rather than seeing transparency as a perceptual end-point, determined by seemingly more primitive processes, we interpret perceived transparency as much a ‘cause’, as an ‘effect’.

(Nakayama, Shimojo & Ramachandran 1990: 501)

Transparency shows that there is depth: that something is in front of something else, but because the something is not opaque and does not fully occlude what is behind there can be many interpretations of the scene (Figure 1). We tend to interpret a scene with a transparent object in the foreground, but in an image it may well be that a full layer is transparent. Additionally, the transparent layer may have a hole in it. Playing with and inverting object-layer or foreground-background expectations creates surprises and causes the viewer to re-evaluate depth, colour and form, whilst still showing that there is depth, just not what was expected.

Figure 1

A no transparency half black and half grey, B transparent circle on top of grey and black, C transparent dark grey & lighter grey field, on top of background with a circle, D white circle over grey and black with no transparency, E dark transparent circle on top of grey and black, F the top layer is transparent with a hole, the background is half dark and half light grey, G no transparency: a hole is cut in the grey and black revealing white underneath, H transparency is expected, I transparency is not expected. (Watkins, 2021) Copyright 2021 Julie Watkins.

Transparency is expected in 1-H, because there is a top and bottom X-junction (based on Adelson-Anandan-Anderson’s X-junction contrast-polarity model (Anderson 1997)) where the contour of the transparent surface and the underlying contour cross. Transparency is not expected in 1-I because the contours are displaced and the X-junctions are eliminated. This makes the grey hemispheres look opaque. The ambiguous nature of transparency, as demonstrated in the examples above, makes it a key element in creating non-fixed depth; i.e. depth that can appear to change as our gaze sweeps over the image. Our reading of the cause of transparency can change, which reverses foreground and background. This reversal has similarities with the Rubin face-vase illusion, where it is impossible to see face and vase at the same time (de Graaf et al 2011). The very ambiguity of ground and form expands our concept of visual space as we become aware of limitations of our visual system.
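One way to make the contrast-polarity idea concrete is to compare the four tones that meet at an X-junction and count how many of the two crossing contours keep the same light-dark direction on both sides of the crossing. The sketch below is a simplified ordinal check in the spirit of the X-junction models cited above, not a reproduction of Anderson’s (1997) model; the function names, the three labels and the example tone values (0 for black, 1 for white) are assumptions made for illustration.

```python
def sign(value: float) -> int:
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def classify_x_junction(top_left: float, top_right: float,
                        bottom_left: float, bottom_right: float) -> str:
    """Classify an X-junction from the tones of its four quadrants.

    The vertical contour separates left from right; the horizontal contour
    separates top from bottom. A contour "keeps polarity" when its
    light-dark direction is the same on both sides of the crossing.
    """
    vertical_top = sign(top_left - top_right)
    vertical_bottom = sign(bottom_left - bottom_right)
    horizontal_left = sign(top_left - bottom_left)
    horizontal_right = sign(top_right - bottom_right)

    vertical_kept = vertical_top != 0 and vertical_top == vertical_bottom
    horizontal_kept = horizontal_left != 0 and horizontal_left == horizontal_right

    reversals = [vertical_kept, horizontal_kept].count(False)
    return ("non-reversing", "single-reversing", "double-reversing")[reversals]


# A darkening filter laid over the lower half of a dark/light background:
print(classify_x_junction(0.2, 0.6, 0.1, 0.3))
# A mid-grey veil over the lower half: one contour flips its polarity.
print(classify_x_junction(0.2, 0.6, 0.3, 0.5))
# A checkerboard-like crossing: both contours flip.
print(classify_x_junction(0.2, 0.6, 0.6, 0.2))
```

In this simplified reading, a non-reversing junction leaves open which surface is the transparent one, a single-reversing junction supports only one reading, and a double-reversing junction does not support transparency at all, so the regions tend to look opaque.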

Paul Klee’s Two Passages (1932) is a tonal watercolour with complex, layered, textured transparency. Klee thought about tone as density. He did not use tone as light and shade to describe form. He does not light a scene. He stated that the challenge of tonality needs the ‘whole scale of graduations’ (Klee 1992: 306). Klee mixed black to white in measured proportions and charted the effects, in terms of both the absolute and relative tones. To create Two Passages Klee used two overlapping rectangular spirals that he overlaid with measured, graduated tones, creating a puzzle, with arrows from the left and right edges indicating how to solve it (Figure 2). The overlapping translucent shapes create a spiralling depth but the forms remain indeterminate as figure and ground are inextricably fused together and we cannot decide if we are seeing L-shapes or overlapping rectangles. The resulting image is a paradox. A simple form, a rectangle, has produced a myriad of possibilities of occlusion and therefore a myriad of possible readings of depth. Klee’s Two Passages is redolent with painterly qualities, which evoke a phenomenological response. The rectangles are individually rendered; their wavering lines and the tones within tones of the watercolour soften the geometric shapes, as if we are seeing a geometrical cityscape through a fog or submerged under water.

Figure 2

Diagram of the two overlapping rectangular spirals (Watkins, 2021) Copyright 2021 Julie Watkins.

Perceiving depth through rhythm and movement. Mondrian’s Pier and Ocean (1915) demonstrates how the movement of the eye over pattern can create a sense of depth through a sense of rhythm and movement even when the pictorial plane has been conceived of as flat and the pictorial elements are vastly reduced (Figure 3).

Figure 3

Pencil drawing (Watkins, 2021) inspired by Mondrian’s Pier and Ocean (1915) Copyright 2021 Julie Watkins.

As a modernist Mondrian tried to quantify and code scientifically defined optical processes in relation to nature on a vertical canvas plane. His interest lay in composition on flat planes. He suggested that the retinal field produced rules that governed the pictorial field:

The two planes—that of the retinal field and that of the picture—were understood now to be isomorphic with one another, the laws of the first generating both the logic and the harmonic of the order of the second; and both of these fields—the retinal and the pictorial—unquestionably organized as flat.

(Krauss 1996: 11)

Mondrian restricted the plastic expression, beelding, i.e. the forming and making of expression, to the immutable relationship of two perpendicular straight lines. He used oppositions in his composition and rhythm to order the whole, but, nevertheless, was aware that there was a challenge of variation (Mondrian, Holtzman & James 1993). Standing back from Pier and Ocean the oval forms a complete sphere or world within which the lines are either horizontal or vertical. At first glance the image is white or black. The impression is of a binary pattern, a woven mesh or code that contrasts positive and negative. Krauss writes that Mondrian structured his visual field with ‘fragments of an abstract grid that would intend to throw its net over the whole of the external world in order to enter into consciousness’ (Krauss 1996: 12).

Mondrian starts from a vast seascape and reduces it not to colour but to tonal rhythmic lines on a background made of degrees of off-white. The rhythm of the sky is different to that of the sea, which is parted by a small pier projecting into it. The horizontal and vertical lines are strongly directional but, generally, if adjacent lines were joined they would not meet to form a straight line, as there are small variations in their horizontal or vertical placement. This creates a lack of continuity and therefore a lack of visual redundancy, which surprises the viewer. As the viewer’s gaze moves across the surface there is a sense of movement. This is no imitation of the natural appearance of sky, ocean and pier; these elements become rhythm and movement. The end result lies between representation and abstraction.


It is hoped that artists and practitioners will find my articulation of how I have been inspired by the research above useful. In my practice I explored spatial depth through three approaches: augmenting spatial depth in photography, rhythm and movement in fixed-screen animation, and immersive, interactive animated projected light in three dimensions.

Augmenting spatial depth in photography by capturing transparent mist. To explore softening linear perspective whilst accentuating depth in a 2D image I took a photograph from the Sky Garden over London when the landscape was enveloped in fog. View from Sky Gardens (Figure 4) demonstrates using natural lighting and weather effects to soften linear perspective and simultaneously increase our visual sense of space. It captures the sweeping panorama in a linear perspective with strong depth cues in the changing scale of the buildings as they recede into the distance. The occlusion of buildings in front of more buildings strengthens the one-point perspective. As our eyes move over the photograph we see detail and then realise the amount of detail we can see is constrained by the translucent fog. The fog speaks to our field of vision and emphasises our lack of all-over detail as we gather visual data. There is one point of focus but it is subsumed in the fog. This frees up viewing, just as Paul Klee invites his work to be explored, grazed over, without the constraint of perspective and offers myriad possible paths to explore. The translucent mist softens and desaturates the far distant landscape. The mist captures the light and gives it volume, echoing Whistler’s use of light in Nocturne Blue and Silver: Cremorne Lights. The mist makes the rendering of the view incomplete, i.e. we do not see all the details in the distance. The lack of detail adds life to the view as each time we glance at the photograph we complete it slightly differently, and this increases the sense of space that is evoked.

Figure 4

View from Sky Gardens (Watkins, 2021) Copyright 2021 Julie Watkins.

Creating depth in fixed-screen animation through rhythm and movement. In Tones Turbulence (2021) I explored creating depth with abstract shapes using transparency and occlusion. I created a simplified impression of Two Passages by Paul Klee using only transparent rectangles, and changed the transparency by the same amount each time (Figure 5). This gives a sense of depth or layering through tonal contrast and X-junctions.

Figure 5

An impression of Two Passages (Watkins, 2021) Copyright 2021 Julie Watkins.
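
The tonal layering described above can be illustrated with a minimal sketch. It is not a record of how Tones Turbulence was made: it assumes that every rectangle has the same opacity, that the rectangles are dark on a light ground, and that they are combined with the standard ‘over’ compositing rule. The function name composite, the rectangle coordinates and the opacity value of 0.25 are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A rectangle is (left, top, right, bottom); right and bottom are exclusive.
Rect = Tuple[int, int, int, int]


def composite(width: int, height: int, rects: List[Rect],
              alpha: float = 0.25, paper: float = 1.0,
              ink: float = 0.0) -> List[List[float]]:
    """Layer equally transparent rectangles onto a light ground.

    Assumption: each rectangle uses the same opacity (alpha) and the
    standard 'over' rule: result = alpha * ink + (1 - alpha) * below.
    Tones run from 0.0 (black) to 1.0 (white).
    """
    canvas = [[paper] * width for _ in range(height)]
    for left, top, right, bottom in rects:
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                canvas[y][x] = alpha * ink + (1.0 - alpha) * canvas[y][x]
    return canvas


if __name__ == "__main__":
    # Three hypothetical overlapping rectangles on a 10 x 10 grid.
    rects = [(0, 0, 6, 6), (3, 3, 9, 9), (5, 1, 8, 7)]
    canvas = composite(10, 10, rects)
    # Each distinct tone corresponds to a different number of overlaps.
    tones = sorted({round(v, 4) for row in canvas for v in row}, reverse=True)
    print(tones)
```

Under these assumptions each additional overlap darkens the tone by the same proportion, so a small number of identical rectangles yields a stepped scale of tones. Where the edges of two rectangles cross, four regions meet (no layer, one layer, the other layer, both layers), which is the X-junction configuration that can be read as one transparent surface lying over another.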

To soften the geometry of the piece, I introduced a sense of the organic patina of watercolour by adding continuously moving painterly textures. I carefully controlled the opacity to allow the creation of movement in the darkest areas whilst not over-brightening the lighter areas, and thereby left the tonal variations of the rectangles intact (Figure 6).

Figure 6

A still from Tones Turbulence (Watkins, 2021) Copyright 2021 Julie Watkins.

To give a sense of the movement the spirals create in Two Passages I animated the rectangles, building the scene up and then deconstructing it, with each rectangle setting off the movement of the next. This gives impetus and rhythm to the movement and means there is no redundancy and the eye remains engaged. The depths of the rectangles remain ambiguous: they could be passing in front or behind or between. The evolving texture acts as a setting and integrates the whole by alluding to scale, space, and passing time. Play the sequence at

I created an installation of animated light, Tactile Vision and Voice (2022), at Bathway Theatre, in the real three-dimensional world, to further explore building multiple perspectives, tactile vision, perceiving depth through overlap and transparency, representations of the world and field of vision. Funke Oyebanjo wrote and performed the audio track, which I edited. I created two animations to the impetus of the audio, Horizon Line and Circles. These animations were shown simultaneously. They were given volume by haze and became three-dimensional. Experiencers (Nelson 2016) of the work were invited to step into the installation, to walk through it, explore it and touch the animated light (Figure 7). This form shattered the static Cyclops effect of linear perspective as multiple views and perspectives were experienced. As with photomontage, the experiencer was not at the apex of a viewing pyramid, mastering the scene from one point, but interacted with the changing scene. Experiencers were free to photograph and video the work, but Tactile Vision and Voice can only be experienced through participation: photographs and films cannot fully capture this immersive and interactive event. The event itself highlights that our vision is much freer than the static and frozen nature of images with linear perspective.

Figure 7

Touching the light from inside the tunnel of light (Watkins, 2022) Copyright 2022 Julie Watkins.

The installation on a darkened stage was big enough (Griffiths 2008) to encourage the experiencers to become immersed. The affect of Tactile Vision and Voice was increased as experiencers had a series of visceral surprises. They entered the dark theatre and were invited on to the stage. A voice, speaking unintelligibly (haikus spoken backwards), came slowly out of the darkness and a single projected line appeared across hanging translucent flags.

Then two-dimensional circles of light were animated through the haze and, incredibly, formed three-dimensional tunnels of light. The experiencers had the surprise of not only seeing this but of being inside the work, stepping in and out of the tunnels and being able to touch and play with the animations that draped over their bodies (Figure 8). The animations were changed by the experiencers who cast shadows into the light tunnels and rippled the haze as they dragged their fingers through the rays of light. Swapping the usual two-dimensional experience of fixed screen animations for interacting with a three-dimensional experience is affecting and brings the experiencer into a tactile sense of vision, in which vision seems to be refreshed by visceral surprise. The tactile quality of depth through volume of light expanded the space, just as light through mist expands space in Nocturne Blue and Silver: Cremorne Lights and View from Sky Gardens. Seeing volume expressed in light expands our sense of space (Turrell 1993).

Figure 8

Interacting with Tactile Vision and Voice (Watkins, 2022) Copyright 2022 Julie Watkins.

Standing back from Tactile Vision and Voice, the tunnel of light formed a complete cone, just as Pier and Ocean appears as an oval woven mesh of lines from a distance. Closer to Pier and Ocean the individual lines become apparent, with their lack of continuity and lack of redundancy, which creates a sense of rhythm and movement. Closer to Tactile Vision and Voice the tunnel of light is seen as an ever-changing haze wrapped around animated three-dimensional forms, fluidly defining and redefining it, with no visual redundancy. Movement creates the form: the rhythms of the animations and of the haze elide to become a rhythm in three dimensions.

The projection was on to translucent hanging flags and into ever-changing translucent and transparent haze, giving multiple depth cues through their multiple overlaps. I took a post-digital approach to the animation to increase its textural qualities and touch-ability, by projecting onto imperfect, non-flat surfaces: the ever-changing haze, swaying flags of translucent tracing paper and the folds of black curtains at the back of the stage. Tactile Vision and Voice shattered the illusion of perspective within the projected image through this post-digital approach, evoking photomontage. This was enhanced by the lack of a flat, unbroken white screen for the projection to fall on to: there was no film to passively look at from a fixed point of view (Figure 9). Two projections were combined (Horizon Line and Circles) across the scene, so there was no one cohesive lighting source and the animations constantly changed the strength, colour, and position of the light. Like ABCD, Tactile Vision and Voice encouraged building representations of an animated world through multiple perspectives in an exploratory way.

Figure 9

Animated circles creating infinite tunnels of light (Watkins, 2022) Copyright 2022 Julie Watkins.


This paper has examined limitations of linear perspective and proposed approaches to widening our understanding and expression of the richness of our vision. It suggests developing an awareness of: how we build our view of the world over time and through multiple perspectives; how we are affected by tactile vision; how artists can expand our sense of space through reduction in their work that plays with our visual perception; and how phenomenological viewing can inform our vision, and how practice can expand it still further. As Dileep George posits: ‘Vision is the primary sensory modality for humans and most mammals to interact with this world’ (2008: 8). Therefore, expanding our expectations of vision beyond a static, frozen view of the world expands the possibilities of our primary sensory modality for interacting with the world.

Author Information

Dr Julie Watkins is a senior lecturer in Animation at the University of Greenwich. She worked as lead creative in prestigious Post-Production facilities in Soho and Manhattan leading Concept Design and Technical Direction for Animation, Motion Graphic and VFX teams. She taught VFX at New York University. She joined the University of Greenwich in 2006 and initiated a Film and Television degree and partnership with the BBC. Supporting her animation practice she has published numerous papers and presented work internationally.

School of Design, University of Greenwich, London, U.K.

Orcid 0000-0001-8872-7041

Competing Interests

The author has no competing interests to declare.


Anderson, Barton L. (1997) ‘A Theory of Illusory Lightness and Transparency in Monocular and Binocular Images: The Role of Contour Junctions’, Perception, 26(4), pp. 419–453. DOI:

Barthes, Roland (2000) Camera lucida: reflections on photography. London: Vintage (Vintage classics). DOI:

Benjamin, Walter (1972) ‘A Short History of Photography’, Screen, 13(1), pp. 5–26. DOI:

Burleigh, Alistair, Pepperell, Robert and Ruta, Nicole (2018) ‘Natural Perspective: Mapping Visual Space with Art and Science’, Vision, 2(2), p. 21. DOI:

Cavanagh, Patrick (2005) ‘The artist as neuroscientist’, Nature, 434(7031), pp. 301–307. DOI:

Crary, Jonathan (2005) Techniques of the observer: on vision and modernity in the nineteenth century. Nachdr. Cambridge, Mass.: MIT Press (October books).

Deleuze, Gilles, Smith, Daniel W. and Conley, Tom (2005) Francis Bacon: the logic of sensation. Minneapolis: University of Minnesota Press.

Florenskiĭ, Pavel A. (2006) Beyond vision: essays on the perception of art. Place of publication not identified: Reaktion Books.

George, Dileep (2008) How the brain might work: a hierarchical and temporal model for learning and recognition. Stanford.

Gombrich, Ernst H. (1960) Art and illusion: a study in the psychology of pictorial representation. New York: Pantheon Books (Bollingen series, 35. The A. W. Mellon lectures in the fine arts, 5).

Gombrich, Ernst H. (2011) The story of art. Repr. London: Phaidon Press.

Gómez-Puerto, Gerardo, Munar, Enric and Nadal, Marcos (2016) ‘Preference for Curvature: A Historical and Conceptual Framework’, Frontiers in Human Neuroscience, 9. DOI:

Graaf, Tom A. et al. (2011) ‘On the Functional Relevance of Frontal Cortex for Passive and Voluntarily Controlled Bistable Vision’, Cerebral Cortex, 21(10), pp. 2322–2331. DOI:

Griffiths, Alison (2008) Shivers down your spine: cinema, museums, and the immersive view. New York: Columbia University Press (Film and culture).

Hackbarth, Dirk (2020) ‘Raoul Hausmann’s Infrared Photography: Energy and Perceptual Education after Dada’, Art Journal, 79(1), pp. 56–73. DOI:

Hausmann, Raoul (1924) ABCD, montage. Negative date: 1924 or later. The original montage is reproduced in: Raoul Hausmann, “Je ne suis pas un photographe” (I am not a photographer), Sté Nlle des Editions du Chêne, Paris, 1975: 55.

Hockney, David and Gayford, Martin (2016) A history of pictures: from the cave to the computer screen. London: Thames & Hudson.

Howard, Ian P. and Rogers, Brian J. (1996) Binocular Vision and Stereopsis. Oxford University Press. DOI:

Klee, Paul (1932) Two Passages, watercolour.

Klee, Paul (1992) Paul Klee Notebooks Volume 2. London: Lund Humphries.

Klee, Paul and Moholy-Nagy, Sibyl (2000) Pedagogical sketchbook. Nachdr. London: Faber and Faber.

Kozel, Susan (2007) Closer: performance, technologies, phenomenology. Cambridge, Massachusetts: MIT Press (Leonardo). DOI:

Krauss, Rosalind E. (1996) The optical unconscious. Cambridge, Mass: MIT Press.

Leder, Helmut et al. (2004) ‘A model of aesthetic appreciation and aesthetic judgments’, British Journal of Psychology, 95(4), pp. 489–508. DOI:

Livingstone, Margaret (2002) Vision and art: the biology of seeing. New York, N.Y: Harry N. Abrams.

Matisse, Henri (1978) Matisse on Art. New York, N.Y: Phaidon Press.

Mondrian, Piet, Holtzman, Harry and James, Martin S. (1993) The new art--the new life: the collected writings of Piet Mondrian. 1st Da Capo Press ed. New York: Da Capo.

Monet, Claude (1882) Shadows on the Sea, painting. Public domain, via Wikimedia Commons.

Nakayama, Ken, Shimojo, Shinsuke and Ramachandran, Vilayanur S. (1990) ‘Transparency: Relation to Depth, Subjective Contours, Luminance, and Neon Color Spreading’, Perception, 19(4), pp. 497–513. DOI:

Nelson, Robin (2016) ‘The Emergence of “Affect” in Contemporary TV Fictions’, in Alberto N. García (ed.) Emotions in Contemporary TV Series. London: Palgrave Macmillan UK, pp. 26–51. DOI:

Palumbo, Letizia, Ruta, Nicole and Bertamini, Marco (2015) ‘Comparing Angular and Curved Shapes in Terms of Implicit Associations and Approach/Avoidance Responses’, PLOS ONE. Edited by Wael El-Deredy, 10(10), p. e0140043. DOI:

Penacchio, Olivier and Wilkins, Arnold J. (2015) ‘Visual discomfort and the spatial distribution of Fourier energy’, Vision Research, 108, pp. 1–7. DOI:

Rivkin, Julie and Ryan, Michael (eds) (2004) Literary theory: an anthology. 2nd ed. Malden, MA: Blackwell Pub.

Rothko, Mark (2004) The artist’s reality: philosophies of art. Edited by C. Rothko. New Haven, Conn.: Yale University Press.

Tarasov, Oleg (2017) ‘5. Spirituality and the Semiotics of Russian Culture: From the Icon to Avant-Garde Art’, in Louise Hardiman and Nicola Kozicharow (eds) Modernism and the Spiritual in Russian Art: New Perspectives. Open Book Publishers, pp. 115–128. DOI:

Trouilloud, Audrey et al. (2022) ‘Influence of physical features from peripheral vision on scene categorization in central vision’, Visual Cognition, 30(6), pp. 425–442. DOI:

Turrell, James (1993) Air Mass. London: South Bank Centre.

Whistler, James Abbott McNeill (1872) Nocturne Blue and Silver: Cremorne Lights.