Syntactic Structures in Graphics
Author: Yuri Engelhardt
Building upon the existing literature, we suggest regarding the building blocks of all graphics as falling into three main categories: a) the graphic objects that are shown (e.g. a dot, a pictogram, an arrow), b) the meaningful graphic spaces into which these objects are arranged (e.g. a geographic coordinate system, a timeline), and c) the graphic properties of these objects (e.g. their colors, their sizes). We suggest that graphic objects come in different syntactic categories, such as nodes, labels, frames, links, etc. Such syntactic categories of graphic objects can explain the permissible spatial relationships between objects in a graphic representation. In addition, syntactic categories provide a criterion for distinguishing meaningful basic constituents of graphics. Based on the above, we discuss how the concept of syntactics can be applied to graphics. Finally, we distinguish different types of meaningful graphic spaces that can be used to construct graphics. Throughout the paper we relate our proposals to the relevant existing literature.
Visual displays of information are playing an increasing role in modern society. Think of anything from simple subway maps on the wall, to infographics in the newspaper, to interactive 3D data visualizations on the computer. The focus of this paper is such diagrams, maps, charts, graphs, tables, and information visualizations. In other words, this paper is not primarily about pictures in the sense of images of physical scenes and objects. Nor is it about art. It is about images that can be regarded as 'visualizing the non-visual' in an attempt to clarify information of some sort. Such images are often collectively referred to as ”graphics”.
Various scholars have tried to approach graphic representations with concepts from linguistics. Is there such a thing as a ”grammar of graphics”? Which level of visual detail is useful for distinguishing basic constituents of graphics? Do constituents of graphics – like constituents of speech – come in different grammatical categories? Building upon the existing literature on these topics, we are trying to answer these questions.
2 Earlier ”grammatical approaches” to graphics
Various authors have attempted to approach graphics with the linguistic concept of grammar. Let us briefly review a few examples. In 1914, Willard Brinton writes in his book Graphic methods for presenting facts that ”The principles for a grammar of graphic presentation are so simple that a remarkably small number of rules would be sufficient to give a universal language”. In 1967, Jacques Bertin publishes his classic Sémiologie graphique, in which he analyses the ”language” of graphic representations and the ”visual variables of the image”. In 1976, linguist Ann Harleman Stewart examines the properties of diagrams and claims that ”Like any language, graphic representation has a vocabulary and a grammar”. In 1984, Clive Richards proposes a ”grammatically-based analysis” of diagrams in his Ph.D. thesis Diagrammatics. In 1986, Jock Mackinlay suggests that ”graphical presentations are actually sentences of graphical languages that have precise syntactic and semantic definitions”. In Mackinlay's approach, ”the syntax of a graphical language is defined to be a set of well-formed graphical sentences”. In 1987, Fred Lakin publishes his paper ”Visual grammars for visual languages”, in which he describes his approach to the ”spatial parsing” of graphics, which he defines as ”the process of recovering the underlying syntactic structure of a visual communication object from its spatial arrangement”.
Since the mid-nineties the literature on grammatical aspects of graphics has expanded further. Kress and van Leeuwen publish their book Reading images: the grammar of visual design (1996). Unfortunately, it is difficult to extract a systematic approach to a syntactic analysis of graphics from their book. A paper titled ”The visual grammar of information graphics” (1996) by Engelhardt et al. suggests ”syntactic categories of visual components”. Robert Horn, in his book Visual Language (1998), proposes a morphology and a syntax of visual language based partly on the work of Jacques Bertin and on the Gestalt principles of perception. In his book The grammar of graphics (1999), Leland Wilkinson describes an approach to graphics that is related to object-oriented design in computer science. However, he uses grammatical terminology ”metaphorically”, and not in a linguistic sense. Colin Ware (2000) writes about the ”perceptual syntax of diagrams”, describing ”the grammar of node-link diagrams” and ”the grammar of maps”. Engelhardt, in his Ph.D. thesis The language of graphics (2002), provides a detailed proposal for the analysis of syntactic structure, which he applies to a broad spectrum of graphic representations. Based on all this previous work, what can we say about the structure of graphics?
3 Building blocks: objects, spaces, properties
To be able to talk about the building blocks of graphics, let us introduce some terminology. We propose a notion of graphic objects that will allow for recursive structures: Any graphic representation – and any meaningful visible component of a graphic representation – may be referred to as a graphic object. This means that graphic objects can be distinguished at various levels of a graphic representation. For example, a map or a chart in its entirety is a graphic object. In addition, the various symbols or components that are positioned within that map or chart are graphic objects as well.
A set of graphic objects can be combined into a meaningful arrangement, together forming a single graphic object at a higher level. As Winn (1991) writes: ”One property of the symbol system of maps and diagrams is that their components can form clusters, which in turn can form other clusters in a hierarchical fashion. Each cluster can then act as a discrete component.” Let us give a top-down description of this principle: A graphic object (e.g. a map, a time chart) can contain a meaningful graphic space (e.g. a cartographic space, or the space defined by a time line, see section 7 for more about meaningful graphic spaces). In turn, that meaningful graphic space can contain graphic objects (e.g. symbols or small ”sub-graphics”). This can be applied recursively, resulting in objects inside spaces inside objects etc. A bottom-up description of this principle was given above: a set of graphic objects can be arranged into a meaningful graphic space, together forming a single graphic object at a higher level. This ”nesting” or ”embedding” (Engelhardt 2002) of graphic structures can be referred to as ”recursive composition” (Card 2003). In section 5 of this paper we will come to the question of which level of visual detail is useful for distinguishing basic graphic objects.
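The recursive principle described above (objects inside spaces inside objects) can be illustrated with a small data structure. The following Python sketch is only one possible rendering; the class names `GraphicObject` and `GraphicSpace` and the helper `nesting_depth` are illustrative assumptions and do not come from the cited literature.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GraphicSpace:
    """A meaningful graphic space that holds graphic objects (hypothetical type)."""
    description: str
    objects: List["GraphicObject"] = field(default_factory=list)


@dataclass
class GraphicObject:
    """Any meaningful visible component; it may itself contain a graphic space."""
    name: str
    space: Optional[GraphicSpace] = None


def nesting_depth(obj: GraphicObject) -> int:
    """Count the levels of nested objects, including obj itself."""
    if obj.space is None or not obj.space.objects:
        return 1
    return 1 + max(nesting_depth(child) for child in obj.space.objects)


# A map whose cartographic space contains a city symbol and a small bar chart;
# the bar chart in turn contains its own space with two bars.
bar_chart = GraphicObject(
    "bar chart",
    GraphicSpace("quantity axis", [GraphicObject("bar 1"), GraphicObject("bar 2")]),
)
city_map = GraphicObject(
    "map",
    GraphicSpace("cartographic space", [GraphicObject("city symbol"), bar_chart]),
)
```

Read top-down, `city_map` contains a space that contains objects; read bottom-up, the two bars are arranged into a space and together form the single object `bar_chart`.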
In contrast to the general notion of ”space”, the notion of meaningful graphic space (Engelhardt 1998, 1999, 2002) involves signification: a spatial position stands for something. In many graphics, for example in maps and in time charts, a change of position of an object will correspond to a change of meaning. In technical terms, a meaningful graphic space could be defined as a graphic space that involves an interpretation function from spatial positions to one or more domains of information values. For example, moving to the left in a graphic space may mean moving towards the West (in case of a map), or moving back in time (in case of a time chart). In his paper ”Giving meaning to place: Semantic spaces”, Wexelblat (1991) explains that visualizations ”give representational significance to arrangement and location”, and that ”location may have precise meaning even without the presence of an object at that location”. Card (2003), referring to Engelhardt et al. (1996), explains that in a visualization, ”Empty space itself, as a container, can be treated as if it had metric structure”. Card presents spatial axes as ”an important building block” of graphics.
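The idea of an interpretation function from spatial positions to information values can be made concrete for the simplest case, a single metric axis. The Python sketch below assumes a linear axis fixed by two reference points; the function name `linear_axis` and the example numbers are hypothetical and serve only as illustration.

```python
from typing import Callable


def linear_axis(pos_a: float, value_a: float,
                pos_b: float, value_b: float) -> Callable[[float], float]:
    """Return an interpretation function for one axis, fixed by two reference points."""
    if pos_a == pos_b:
        raise ValueError("reference positions must differ")
    scale = (value_b - value_a) / (pos_b - pos_a)
    return lambda pos: value_a + (pos - pos_a) * scale


# Hypothetical time chart: x = 0 stands for the year 1900, x = 500 for 2000.
year_at = linear_axis(0.0, 1900.0, 500.0, 2000.0)

# Every position has a meaning, whether or not an object is drawn there.
midpoint_year = year_at(250.0)
```

Moving to the left (a smaller x) yields an earlier year, which corresponds to the time chart example in the text.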
Before we continue, let us first try to say more about the different categories of ”building blocks” of graphics. In graphics, not only the possible constituents themselves (graphic objects), and the diverse possible ways of arranging these constituents (in meaningful graphic spaces), but also the possible visual appearances of these constituents (graphic properties such as size, color), could be considered as being part of the graphic ”vocabulary”. In this sense we can say that the building blocks of graphics fall into three main categories: graphic objects, meaningful graphic spaces, and graphic properties. Consider a drawing of a family tree for example. In a family tree, the names and the lines between the names are graphic objects. The meaningful graphic space into which these graphic objects are arranged involves a vertical ordering of generations (e.g. grandparents on top, grandchildren at the bottom). And if names are written in different colors or sizes, then these are graphic properties of those names.
The three categories of the building blocks of graphics – objects, spaces, and properties – can be traced back in the literature, although various different terms have been used to refer to them (see the table below). In his classic Sémiologie graphique, Jacques Bertin (1967) elaborates on the uses of ”marks”, ”positional variables”, and ”retinal variables”. Twyman's ”schema for the study of graphic language” (1979) sets out ”mode of symbolization” against ”method of configuration”. Wexelblat (1991) describes visualizations as ”represented objects” that are positioned in ”semantic spaces”. Winn (1991) dissects maps and diagrams into ”components” and their ”configuration”. Engelhardt et al. (1996) distinguish ”visual components”, ”basic operations of spatial syntax”, and ”visual appearance”. Card, Mackinlay and Shneiderman (1999) and Card (2003), both referring to Engelhardt et al. (1996), introduce the term ”spatial substrate”.
Table 1: The building blocks of graphics.
Meaningful graphic spaces are elaborated on in section 7. In the next two sections we will examine graphic objects.
4 Syntactic categories of graphic objects
Every known natural language is based on the possibility of combining language constituents of different syntactic categories. Examples of such syntactic categories are ”noun phrase” and ”verb phrase” (sometimes referred to as phrasal categories), or ”noun”, ”verb” and ”adjective” (usually referred to as lexical categories, or 'parts of speech'). In natural languages, such syntactic categories usually differ from each other not only with regard to syntactic aspects but also with regard to semantic aspects.
Graphics can be approached in a similar way. Richards (1984) provides a very simple example figure in which a letter ”A” and a letter ”B” are connected by a line. This figure represents visually that ”A is connected to B”. Richards suggests that in this case ”the line serves a verb-like function for the nouns A and B”. A different way of describing this is to say that this simple figure contains nodes (the letters ”A” and ”B”) and a connector (the line connects ”A” and ”B”). Mackinlay (1986) uses the term ”connection languages” and writes that ”Sentences of connection languages consist of two sets of marks: the set of nodes [...] and the set of links [...]” (again, in our terminology: a set of nodes and a set of connectors). As Mackinlay points out, it is also syntactically relevant here that ”The nodes constrain the position of links”.
To make a more general statement, we claim that all graphics are based on the possibility of combining graphic constituents (graphic objects) of different syntactic categories (Engelhardt et al. 1996, Engelhardt 2002, 2006). Let us take a subway map as an example. On a subway map, each subway station is indicated by a graphic object (e.g. a dot, or a small circle, or a tick). In terms of our analysis, that graphic mark functions syntactically as a node. Next to that graphic mark we read the name of that particular subway station. That station name functions syntactically as a label. The paths taken by different subway lines are represented as lines of different color. These colored line segments between the subway stations function syntactically as connectors.
These three syntactic categories reflect the existence of discrete entities (nodes), their specification (labels), and their connection (connectors). While nodes may make sense by themselves (icons for example), labels and connectors only make sense in the presence of the nodes that they are labeling or connecting (Engelhardt et al. 1996).
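The dependency just described, that labels and connectors presuppose the nodes they refer to, can be expressed as a simple well-formedness check. The following Python sketch is an illustration under our own assumptions: the types `Node`, `Label`, and `Connector`, the function `dangling`, and the station names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    ident: str        # e.g. a dot that marks a subway station


@dataclass(frozen=True)
class Label:
    text: str
    target: str       # ident of the node that this label names


@dataclass(frozen=True)
class Connector:
    source: str       # ident of the node at one end
    target: str       # ident of the node at the other end
    color: str        # graphic property, e.g. the color of a subway line


def dangling(nodes: List[Node], labels: List[Label],
             connectors: List[Connector]) -> List[str]:
    """Describe every label or connector end that refers to a missing node."""
    known = {n.ident for n in nodes}
    problems = []
    for lab in labels:
        if lab.target not in known:
            problems.append(f"label '{lab.text}' has no node")
    for con in connectors:
        for end in (con.source, con.target):
            if end not in known:
                problems.append(f"connector end '{end}' has no node")
    return problems


nodes = [Node("central"), Node("harbour")]
labels = [Label("Central Station", "central"), Label("Airport", "airport")]
connectors = [Connector("central", "harbour", "red")]
problems = dangling(nodes, labels, connectors)
```

In this made-up example the label "Airport" is reported because no node exists for it, whereas the nodes themselves need no further support.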
These syntactic categories may apply to a subway map, but how about a topographic map? Well, a topographic map may for example contain red dots that function as nodes indicating cities. In addition, the topographic map may contain blue lines that function as line locators indicating rivers. And, for example, small blue areas that function as surface locators indicating lakes. And the map may contain words that function as labels naming all these cities, rivers and lakes.
Graphic objects of different syntactic categories ”behave” differently in a graphic representation. The constraints that govern their spatial positioning are different. Let us look at three examples. Example 1: What makes a connector different from a line locator? Consider a map that shows airline services between cities. Such a map will usually use connectors to show which cities are connected by flights. A connector is attached to the two graphic objects that are connected by it, and can easily be drawn with a slightly different curve, possibly making it bend a little more in order to prevent it from running through the middle of a third city in between, for example. A line locator on the other hand, such as a blue line that indicates a river on that same map, is attached to every point along the line that is described by the course of that river. The mapmaker cannot (and should not), for example, bend the line a little in order to prevent it from running through a certain city. The reason for this is that this line is not simply a connector that links spring to ocean, but a line locator that traces a specific line in space.
Example 2: What makes a label different from a node? Consider a small black square on a map that indicates the location of a city, with a word indicating the name of the city (e.g. ”Amsterdam”). That word is a label, which is attached to the black square. If more convenient for some reason, the mapmaker can move the label to the other side of the black square, as long as the label remains close to the black square. The black square however is a node, which is attached to a point in graphic space. This means that, while the mapmaker can move the label to the other side of the black square, he cannot move the black square to the other side of the label. (In the latter case he would be moving the city.)
Example 3: What makes a node different from a surface locator? Consider two colored shapes on a map. One of the colored shapes is a pictogram of some sort that indicates a particular location (e.g. ”you are here”). The other colored shape indicates a lake. The first colored shape (”you are here”) is a node, which is attached to a point in graphic space. This colored shape can be made somewhat bigger or smaller by the mapmaker, without a change in meaning. The second colored shape (lake) is a surface locator, which is attached to a specific surface in graphic space. Consequently, this colored shape cannot, for example, be made somewhat bigger or smaller by the mapmaker without a change in meaning. (The lake would grow or shrink.)
Nodes, labels, connectors, line locators, and surface locators are examples of frequently used syntactic categories of graphic objects. Proportional segments are an example of a syntactic category that appears specifically in pie charts (the pie segments), in stacked bar charts, and more recently, in ”treemaps”. See table 2 for a few more syntactic categories. Corresponding to ”parts of speech” in natural languages, one could refer to these syntactic categories in graphics as ”graphic parts”.
Table 2: Syntactic categories of graphic objects and rules for their combination.
All syntactic categories of graphic objects can be divided into two main groups: 1) objects that are attached to locations in graphic space (e.g. node, line locator, surface locator, grid marker), and 2) objects that are attached to other objects (label, connector, proportional segment, frame).
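This division into two groups can be written down as a simple lookup. The Python sketch below merely restates the grouping given in the text; the names `Attachment`, `ATTACHMENT`, and `categories_attached_to` are illustrative assumptions.

```python
from enum import Enum
from typing import Dict, List


class Attachment(Enum):
    LOCATION = "attached to a location in graphic space"
    OBJECT = "attached to another graphic object"


# Grouping of the syntactic categories named in the text.
ATTACHMENT: Dict[str, Attachment] = {
    "node": Attachment.LOCATION,
    "line locator": Attachment.LOCATION,
    "surface locator": Attachment.LOCATION,
    "grid marker": Attachment.LOCATION,
    "label": Attachment.OBJECT,
    "connector": Attachment.OBJECT,
    "proportional segment": Attachment.OBJECT,
    "frame": Attachment.OBJECT,
}


def categories_attached_to(kind: Attachment) -> List[str]:
    """List, in alphabetical order, the categories with the given kind of attachment."""
    return sorted(c for c, a in ATTACHMENT.items() if a is kind)
```

For instance, `categories_attached_to(Attachment.OBJECT)` lists the categories whose position depends on other objects rather than on a point, line, or surface in graphic space.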
Several of the examples that we used above are taken from maps. However, we claim that all types of graphic representation of information can be analyzed in terms of their composition from graphic objects of different syntactic categories. For a more complete list of syntactic categories see ”The language of graphics” (Engelhardt 2002) and ”Objects and spaces: The visual language of graphics” (Engelhardt 2006).
5 At which level of detail do we define basic graphic objects?
If we wish to regard graphics as sign systems, at which level of visual detail should we look for the 'basic signs' that graphics are composed of? What is the lowest level at which it is useful to talk about graphic objects in the sense of the approach that is proposed here? Richards (1984) believes that ”there seems to be little profit in using such items as an individual dot or line as a unit of analysis. If we are going to use linguistics as a model, then what is needed for present purposes is not the pictorial equivalent of a phoneme or morpheme but something closer to a noun phrase”. A little further on, Richards puts it even more strongly: ”If any analysis is going to be possible at all it seems that it must start at a 'noun phrase' level, otherwise we are forced down to the meaningless level of dots and lines or else up to the level where all we can say is 'here is a diagram'”.
How would ”a 'noun phrase' level” generalize to the approach that is proposed here? Well, it points us to the (lowest) levels at which syntactic categories of graphic objects can be observed. This leads us to the following proposal:
The basic graphic objects in a particular graphic representation are those that can be regarded as functioning in some syntactic category within that particular graphic representation (e.g. as a label, as a node, as a connector, as a proportional segment, etc.).
In other words, we use the term basic graphic object to mean the smallest visual entities that play some syntactic role in the sense that we have been discussing in the previous section. Having explored syntactic aspects of graphic objects, we will now take a look at how the concept of syntactics can be applied to graphics.
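One way to read the proposal above operationally is as a search for the smallest parts of a graphic that carry a syntactic category. The Python sketch below follows that reading, which is our own interpretation; the type `Part`, the function `basic_objects`, and the example decomposition of a station marker are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Part:
    name: str
    category: Optional[str] = None   # syntactic category, if the part plays one
    parts: List["Part"] = field(default_factory=list)


def basic_objects(part: Part) -> List[Part]:
    """Collect the smallest parts that function in some syntactic category."""
    found: List[Part] = []
    for child in part.parts:
        found.extend(basic_objects(child))
    # A part counts as basic only if nothing below it plays a syntactic role.
    if not found and part.category is not None:
        found.append(part)
    return found


# A station marker drawn as a ring with a fill: the strokes play no syntactic
# role by themselves, so the marker as a whole is the basic graphic object.
marker = Part("station marker", "node", [Part("ring"), Part("fill")])
name = Part("station name", "label", [Part("letter shapes")])
subway_map = Part("subway map", None, [marker, name])
basic_names = [p.name for p in basic_objects(subway_map)]
```

Here the search stops at the marker and the station name, rather than descending to the individual lines and dots from which they are drawn.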
The distinction between syntactics, semantics, and pragmatics was introduced by Charles Morris (1938, 1946). Morris conceives of syntactics as the investigation of the relationships between signs, of the ways in which complex signs can be constructed from simple ones, as well as the ways in which complex signs can be analyzed into more simple ones (Morris 1946/1971). MacEachren (1995) notes:
All of the descriptions of syntactics given above fit perfectly with the approach to graphic structure that is proposed in this paper. The syntactics of graphics investigates the relationships between graphic objects of different syntactic categories. It investigates the rule- and constraint-based relationships between graphic objects (of different syntactic categories) and graphic spaces. And syntactics investigates how graphic objects can be combined into composite graphic objects, and how composite graphic objects can be analyzed into more simple ones.
So far, we have concentrated on the discussion of graphic objects. The uses of graphic properties (e.g. color, size) have been thoroughly investigated by Bertin (1967), and later, among others, by Mackinlay (1986) and by MacEachren (1995). We will now take a closer look at the third main category of the building blocks of graphics: meaningful graphic spaces.
7 Meaningful graphic spaces
Imagine sitting in a bar and using the arrangement of empty beer glasses on the bar table to explain, say, the location of Berlin with respect to London and Paris. The positioning of two beer glasses, standing for London and Paris, creates a meaningful space (Engelhardt 1998, 1999, 2002) – every position on the bar table has been assigned a geographical meaning. The meaningful space can even be regarded as extending beyond the bar table – a person on the other side of the bar may now happen to be ”sitting in Africa”. Similarly, when starting to draw a financial chart, by drawing two labeled axes (e.g. one for the months of the year, and the other for expenses in dollars), a meaningful graphic space has been created: every position in the yet-empty chart has been assigned a meaning, even before we have any data. The face of a clock also constitutes a meaningful graphic space – it assigns meaning (time of day) to every spatial position along a circle. While the ”London-Paris-Berlin space” represents a physical space, the empty financial chart and the clock face represent a conceptual space. This is a straightforward but important distinction.
Looking at the broad spectrum of graphics we can say that images of physical scenes and objects, such as pictures and maps, represent physical spaces, while many abstract graphics, such as family trees and statistical charts, represent conceptual spaces (Engelhardt 1999, 2002). In other words, pictures and maps use spatial arrangement in the image to represent spatial arrangement in the world, while family trees and pie charts use spatial arrangement in the image to represent non-spatial information.
Let us take a brief look at the relevant terminology in the literature. Regarding the frequently used term ”iconic”, we can assert that ”iconic” graphics (such as pictures and maps) display physical spaces, while ”abstract” graphics (such as family trees and statistical charts) display conceptual spaces. The former represent ”concrete objects”, while the latter represent ”intangible concepts” (Winn 1991). This distinction has also been referred to as portraying ”visible things” versus portraying ”things that are inherently not visible” (Tversky 2001). One could argue, however, that some representations of physical spaces such as a drawing of a molecule, a floor plan, or a world map, are – strictly speaking – not portraying ”visible things”. Therefore, instead of ”visible” versus ”non-visible”, the distinction between physical and conceptual seems more appropriate here. Accordingly, we can observe that some (aspects of) graphics are ”meant to reflect physical reality” while other (aspects of) graphics are ”meant to reflect conceptual reality” (Tversky 2002).
Representations of physical spaces do not, by the way, always have to express the true co-ordinate proportions of the represented objects. Think of a world map, the London Underground map, or an ”exploded view” of a machine. All three of these images greatly distort the physical spaces that they show, but nevertheless they are still representations of physical spaces.
Many graphics combine physical and conceptual spaces. As an example, think of little pictures of people or things (showing physical spaces) that are arranged on a time line (representing a conceptual space). Richards (1984) points out that while the ”perspective landscape is homogeneous in that it portrays a single unbroken space at a single moment in time [...] it seems that more than one space and more than one time can be portrayed in a single diagram”. As another example, think of little bar charts (showing conceptual spaces) that are arranged on a map (showing a physical space). Both of these examples make use of what can be referred to as ”nesting”, ”embedding” (Engelhardt 2002), or ”recursive composition” (Card 2003).
As an example of a true hybrid space (Engelhardt 1999, 2002), think of a three-dimensional landscape drawing of a country in which the drawn ”mountains” do not represent physical mountains, but – for example – population density, peaking in the cities and flat in the countryside. In this case, the horizontal plane represents the physical space of the country's geography, while the vertical dimension represents the conceptual space of population density. This example makes use of what can be referred to as ”orthogonal placement of axes” (Card et al. 1999) or ”simultaneous combination” (Engelhardt 2002).
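The combination of a physical plane with a conceptual vertical dimension can be sketched as an operation on spaces. The following Python fragment is a minimal illustration under our own assumptions; the type `Space`, the function `orthogonal`, and the axis names are hypothetical and are not taken from the cited works.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Space:
    axes: Tuple[str, ...]   # what each spatial dimension stands for
    kind: str               # "physical", "conceptual", or "hybrid"


def orthogonal(a: Space, b: Space) -> Space:
    """Combine two spaces by placing their axes orthogonally to each other."""
    kind = a.kind if a.kind == b.kind else "hybrid"
    return Space(a.axes + b.axes, kind)


# The landscape example: a geographic plane plus a population-density axis.
geography = Space(("west-east", "south-north"), "physical")
density = Space(("population density",), "conceptual")
landscape = orthogonal(geography, density)
```

In this sketch, combining a physical with a conceptual space yields a space marked as hybrid, while combining two spaces of the same kind preserves that kind, as when two conceptual axes form a financial chart.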
In table 3, the two main operations for combining basic graphic spaces into composite graphic spaces are marked as ”a” (embedding) and ”b” (orthogonal placement of axes). These two techniques could be regarded as ”composition operators that can generate composite designs” (Mackinlay 1986).
Table 3: A typology of meaningful graphic space.
Not all graphic spaces are meaningful graphic spaces, by the way. A set of graphic objects can also be shown in a random arrangement (Engelhardt et al. 1996), forming a more or less arbitrary spatial structure (Engelhardt 2002). In this case the involved graphic space is ”unstructured” (Card et al. 1999, Card 2003).
We claim that all types of graphic representation of information can be analyzed in terms of their composition from graphic spaces of different sorts. For a more complete discussion of meaningful graphic spaces see Engelhardt 2002 and Engelhardt 2006.
Graphics can be regarded as expressions in visual languages. We have tried to show that specifying such a visual language means a) specifying the syntactic categories of its graphic objects, plus b) specifying the meaningful graphic space in which these graphic objects are positioned, plus c) specifying the visual coding rules that determine the graphic properties of these graphic objects (see table 1). The syntactic structure of a graphic representation is determined by the rules of attachment for each of the involved syntactic categories (see table 2) and by the structure of the meaningful graphic space that is involved (see table 3). With this analysis we have attempted to demonstrate that Morris' original notion of syntactics applies well to the structure of graphics.