Conclusive Notes on Computational Picture Morphology


Author: Jörg R. J. Schirra
[published in: Computational Visualistics and Picture Morphology (thematic issue of IMAGE 5)]


Disciplines: computational visualistics


As the thematic issue of IMAGE on computational image morphology attempts in particular to mediate between computational visualistics and other disciplines investigating pictures and their uses, the following remarks broaden the perspective again and relate the computational argumentations of the preceding papers to the more general discussion of image science. The two fundamental categories of picture syntax, the geometric base structure and the marker value dimension, are described. They are applied to the questions whether pictures with ill-formed syntax may exist at all, and if so whether they can be dealt with by computers, as well. The overview finally extends the discussion to the limits of pictorial syntax studies.

As the thematic issue of IMAGE on computational image morphology attempts in particular to mediate between computational visualistics and other disciplines investigating pictures and their uses, the following remarks broaden the perspective again and relate the computational argumentations of the preceding papers to the more general discussion of image science. The overview also extends the discussion to the limits of pictorial syntax or morphology.

1 The Two Sides Of Picture Morphology

In her influential book on picture syntax (1990), Fernande Saint-Martin distinguishes two kinds of properties of the syntacto-morphological elements of pictorial signs, which are often interpreted in the following manner (cf., e.g., [Dölling 1999]): plastic properties belong to the “material” of the picture vehicle, while the other properties are of a perceptual-visual nature, which means they are essentially “in the beholder’s eye”, constructed following the principles of visual perception and particularly Gestalt theory (cf., e.g., [Metzger 1966]). Geometric forms and their topological relations are given as typical examples of the latter, whereas color and texture are considered to be properties of the material as such. A combination of plastic and visual-perceptive attributes forms Saint-Martin’s version of a pixeme, called “coloreme”.(1)

In the papers presented in this volume, a related yet different distinction can be found at the basis of picture morphology in computational visualistics: the distinction between geometric base structure and visual marker values. In essence, they form two different (classes of) abstract data types that have to be coordinated in order to form the logical manifold of the pixeme structure on which the algorithms described work.

Let us recall that – from the application point of view on computer science – an abstract data type is a formal version of one of the fields of concepts that structure the argumentation in the application domain in question: a calculus that covers the essence of the way people in the application domain speak about a certain phenomenon (cf. the introduction of this volume). In our case, the way picture researchers speak about color, texture, geometric forms, and their spatial relations is the reference against which solutions in computer science are rated.

The distinction between geometry and color is not – at least not primarily – one between something belonging (objectively) to the picture vehicle’s material versus something constructed (subjectively) by the mechanisms of (visual) perception.(2) However, we can indeed talk about colors and the dependencies between them on the one hand, and about spatial entities and their geometrical or topological relations on the other hand, without necessarily mixing the two threads of argumentation. They can be treated as independent, and can thus be considered as being governed by – prima facie – autonomous fields of concepts. Thus, when dealing with picture morphology, computational visualists ought to consider, first, one set of abstract data types covering the logic of color, and another set of types describing the logic of space. They then have to combine one data type from each of the two groups in order to gain a calculus (more or less) equivalent to the argumentations concerning pixemes (cf. Figure 1).



Figure 1: Combination of fields of concepts

The combination of one field B (e.g., the field of geometric concepts) with another one C (for instance the one of color concepts) explains the structure of a more complex field of concepts A regulating instances with coordinated properties from the two constituting fields (for example concepts governing colored geometric entities, i.e., pixemes)
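
The coordination of two independent fields of concepts into a third can be sketched as three small data types. This is only an illustration under strong simplifying assumptions: the names `Region`, `Color` and `Pixeme` are hypothetical, the geometric base structure is reduced to sets of raster cells, and the marker dimension to RGB triples with a crude lightness comparison.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]  # (column, row) in an assumed raster base structure


@dataclass(frozen=True)
class Region:
    """Field B: purely geometric concepts, no reference to color."""
    cells: FrozenSet[Cell]

    def touches(self, other: "Region") -> bool:
        # Two regions touch if some cell of one is 4-adjacent to a cell of the other.
        return any(
            (x + dx, y + dy) in other.cells
            for (x, y) in self.cells
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        )


@dataclass(frozen=True)
class Color:
    """Field C: purely chromatic concepts, no reference to space."""
    r: int
    g: int
    b: int

    def is_lighter_than(self, other: "Color") -> bool:
        # Crude assumption: lightness as the plain sum of the three channels.
        return self.r + self.g + self.b > other.r + other.g + other.b


@dataclass(frozen=True)
class Pixeme:
    """Field A: a geometric entity coordinated with exactly one marker value."""
    region: Region
    color: Color


left = Pixeme(Region(frozenset({(0, 0), (0, 1)})), Color(255, 0, 0))
right = Pixeme(Region(frozenset({(1, 0), (1, 1)})), Color(40, 40, 40))

print(left.region.touches(right.region))        # a purely spatial statement
print(left.color.is_lighter_than(right.color))  # a purely chromatic statement
```

Neither `Region` nor `Color` refers to the other; only `Pixeme` coordinates them, which mirrors the claim that the two threads of argumentation can be kept apart.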


1.1 The Geometric Base Structures: The Logic of Locational Gestalts

Pictorial syntax deals, roughly speaking, with the limited spatial arrangement in two dimensions of visual distinctions (or, for short, of colors). The logic of spatial arrangement in general is covered in essence by the various calculi of geometry (cf., e.g., [Aiello et al. 2007]). Abstract as those calculi are, they formalize central aspects of our concrete interactions with any kind of purely spatial configuration, and can be translated more or less immediately into abstract data types. Correspondingly, all the papers in this volume refer to one or another such geometric base structure in two dimensions.(3)

The locational organization of pixemes is in fact not perceivable as such.(4) Like the temporal base structure of music, which can only be perceived as organizing a sequence of distinct auditory markers – difference of pitch or harmonic progression, change of volume or variation of timbre – the spatial base structure of pictures can only be perceived through visible differences: visual markers usually subsumed under the expression “color”. Indeed, color in this general sense includes hue, saturation and intensity as well as texture or even homogeneous temporal variations thereof. It is exactly the change of any one of those values that induces the border of a pixeme, and thus determines the spatial “rhythms” of the picture. Although this aspect underlies most of the papers in this volume, only a few of the authors have elaborated it in any detail; most rely on everyday knowledge about this dimension. However, color theory is often not that simple, and some additional remarks about the visual marker system should be added in this conclusion.
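
How a change of marker value induces the borders of pixemes can be illustrated with a very small labeling routine. The sketch assumes a raster base structure, 4-neighborhood, and exact equality of marker values as the only criterion; the function name `segment` and the single-letter color codes are hypothetical, and the segmentation algorithms discussed in this volume are considerably more elaborate.

```python
from typing import List


def segment(grid: List[List[str]]) -> List[List[int]]:
    """Label maximal 4-connected areas of identical marker value."""
    height, width = len(grid), len(grid[0])
    labels = [[-1] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if labels[y][x] != -1:
                continue  # cell already belongs to a labeled area
            # Flood-fill from (x, y) across cells carrying the same marker value.
            labels[y][x] = next_label
            stack = [(x, y)]
            while stack:
                cx, cy = stack.pop()
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and labels[ny][nx] == -1
                            and grid[ny][nx] == grid[cy][cx]):
                        labels[ny][nx] = next_label
                        stack.append((nx, ny))
            next_label += 1
    return labels


# Toy picture: R, G and B stand for three distinguishable marker values.
picture = [
    list("RRGG"),
    list("RRGG"),
    list("BBBB"),
]
labels = segment(picture)
for row in labels:
    print(row)
```

Without any difference in marker value the whole grid receives a single label: the base structure alone yields no perceivable articulation.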

1.2 The Visual Marker Values: Color And Texture, Reflection And Transparency

The various systems for formally covering color (in the narrower sense) in computer science are, we assume, well known: every painting program or system for picture manipulation offers at least RGB or HSV. Those “color models” are essentially equivalent to each other and do not need a detailed description here. They basically implement the system of color concepts we normally apply when speaking about colors and the dependencies between them (cf. Fig. 2).



Figure 2: Graphical-geometrical presentation of two color models: neighborhood and transpositions with respect to the axes or symmetry centers are equivalent to relevant relations between the corresponding color concepts
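
The equivalence of the two color models, and the way a geometric transposition in the model corresponds to a relation between color concepts, can be tried out with Python's standard `colorsys` module. The helper names `rgb8_to_hsv` and `hsv_to_rgb8` are hypothetical, and treating the complement as a half-turn of the hue angle is a common convention assumed here, not a claim taken from the figure.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB channels to HSV components in the range 0..1."""
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)


def hsv_to_rgb8(h, s, v):
    """Convert HSV components in the range 0..1 back to 8-bit RGB channels."""
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


orange = (255, 128, 0)
h, s, v = rgb8_to_hsv(*orange)

# Round trip: the same color value is reachable in both models.
round_trip = hsv_to_rgb8(h, s, v)

# Transposition across the symmetry center of the hue circle:
# the hue is rotated by half a turn, saturation and value are kept.
complement = hsv_to_rgb8((h + 0.5) % 1.0, s, v)

print(round_trip)
print(complement)
```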


We should note that with color we once more meet the problem of formalizing a seemingly dense dimension: between any two colors there appear to be more colors. And again, we depend on a perceptual system with a limited resolution in color distinction.(5) In contrast to locative resolution, however, there is no such thing in “color space” as a natural “zooming operation”: the members of some pairs of colors are only distinguishable by means of a complicated technical device like spectral analysis that has no equivalent in non-technical human behavior.(6) We may therefore take color without real simplification as a syntactically discrete dimension with a resolution just below the threshold of human perception. Correspondingly, contemporary computer systems offer a data type for homogeneous colors with more than 16.7 million values (256³ = 16,777,216 with 8 bits per RGB channel), together with methods to select and manipulate them easily along the dimensions of our color concepts. Two immediately neighboring color values of that system are indistinguishable for most humans (cf. Fig. 3).(7)



Figure 3: Different Colors?

Starting from RGB (255,0,0) in the left box, the color in each box is changed by (0,5,0) up to (255,75,0): that is, “between” two adjacent boxes, four more color values are possible in RGB, although it is already impossible for most humans to distinguish the two colors in adjacent boxes.

(this picture may not work well on all devices)
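
The arithmetic behind Figure 3 can be checked with a few lines. The sketch assumes 8 bits per RGB channel; the function name `figure3_boxes` is hypothetical, and the number of boxes is derived from the caption alone, not from the figure itself.

```python
# Size of the discrete color data type with 8 bits per channel.
TOTAL_RGB_VALUES = 256 ** 3


def figure3_boxes(start=(255, 0, 0), step=(0, 5, 0), last_green=75):
    """Enumerate the box colors described in the caption of Figure 3."""
    boxes = []
    r, g, b = start
    while g <= last_green:
        boxes.append((r, g, b))
        r, g, b = r + step[0], g + step[1], b + step[2]
    return boxes


boxes = figure3_boxes()

# Color values lying strictly "between" the first two adjacent boxes.
in_between = [(255, g, 0) for g in range(boxes[0][1] + 1, boxes[1][1])]

print(TOTAL_RGB_VALUES)
print(len(boxes), boxes[0], boxes[-1])
print(in_between)
```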


Homogeneous color as covered by the color models mentioned above is the central aspect of the visual marker dimension, but not its only aspect. More often, the visual markers are given as fine-grained textures that only appear as more or less homogeneous if the spatial resolution is not too high. In these cases, zooming reveals that a local distribution of homogeneous colors is in fact relevant (or even fields with textures on a still finer level). Apart from zooming, however, textures are perceived, remembered, and even imagined not as a particular spatial distribution of individual (homogeneous) color values but as a different kind of visual marker value, more or less analogous to chords in music (with tones as the analogue of colors). The system of visual markers in fact consists of – at least – two levels. Although the two levels are not completely independent of each other, they follow quite different internal rules.(8)
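
The dependence of apparent homogeneity on spatial resolution can be mimicked by block averaging. This is a deliberately crude stand-in for "zooming out", assuming a raster whose side lengths are multiples of the averaging factor; the function name `downsample` is hypothetical, and nothing here models how textures are actually perceived as marker values of their own.

```python
def downsample(pixels, factor):
    """Average factor x factor blocks of RGB triples (a crude 'zoom out')."""
    height, width = len(pixels), len(pixels[0])
    result = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [pixels[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            # Integer mean per channel over all cells of the block.
            row.append(tuple(sum(p[c] for p in block) // len(block)
                             for c in range(3)))
        result.append(row)
    return result


RED, YELLOW = (255, 0, 0), (255, 255, 0)

# Fine level: a checkerboard texture made of two homogeneous colors.
checker = [[RED if (x + y) % 2 == 0 else YELLOW for x in range(4)]
           for y in range(4)]

# Coarse level: the same area at half the spatial resolution.
coarse = downsample(checker, 2)

fine_values = {p for row in checker for p in row}
coarse_values = {p for row in coarse for p in row}
print(len(fine_values), len(coarse_values))
```

At the coarse level only one value remains, whereas the fine level shows the local distribution of two colors.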

1.3 Contextual Aspects of Picture Morphology

Finally, by means of transparency and reflectivity, something distantly related to deictic elements in verbal signs is included in the visual marker dimension as well. Those two phenomena of color in the broad sense are seldom dealt with in computational picture morphology. Recall, as examples of corresponding traditional pictures, stained-glass church windows or Mexican or Turkish folk art with built-in pieces of mirror, or see Figure 4.



Figure 4: A Rather Extreme Example for Reflectivity and Transparency as Aspects of the Pictorial Marker Dimension

Toby Mason: Forming of the World (1997)


Note that the effects of reflectivity and transparency in the examples cannot be ascribed to the picture vehicle as such: it has to be considered in (and in contrast to) changing situational contexts. In every single context (i.e., arrangement of objects and lights around the image), the transparent and reflective regions of the picture have a fixed appearance indistinguishable in that respect from other regions – they might just as well have been marked by homogeneous colors or textures (as is in fact the case in Figure 4). Only if changes in the context do indeed modify the distribution of marker values, and hence the arrangement of pixemes, does an observer perceive regions as being transparent or reflective. The phenomenon is also directly important for computational visualists when combining pictures in layout (mostly transparency) or 3D graphics (transparency and reflection). Of course, an adequate conceptualization in the data type »image« must explicitly include such “indexical marker values”; in general, we cannot replace them by one arbitrarily induced distribution of homogeneous colors or textures.(9)
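
Why a transparent marker value cannot be replaced by one fixed color is easy to see with the usual "over" blending of a partially transparent value onto an opaque background. The sketch assumes a single alpha value between 0 and 1 as the transparency dimension; the function name `composite` and the sample colors are hypothetical, and reflectivity is not modeled at all.

```python
def composite(foreground, alpha, background):
    """Blend one RGB value 'over' an opaque background with opacity alpha."""
    return tuple(round(alpha * f + (1 - alpha) * b)
                 for f, b in zip(foreground, background))


glass_blue = (0, 0, 255)
WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# The same indexical marker value in two situational contexts.
against_light = composite(glass_blue, 0.6, WHITE)
against_dark = composite(glass_blue, 0.6, BLACK)

# A fully opaque marker value is unaffected by the context.
opaque_light = composite(glass_blue, 1.0, WHITE)
opaque_dark = composite(glass_blue, 1.0, BLACK)

print(against_light, against_dark)
print(opaque_light == opaque_dark)
```

In each single context the transparent region simply has some fixed color; only the comparison across contexts reveals that the marker value is indexical.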

There also exists a contextual factor that influences the geometric base structure: While the calculi of, for example, pure mereogeometries only provide symmetric spatial dimensions, gravity – or the up-down polarity induced by it in the perceptual system of the observer – introduces an asymmetry in the spatial arrangement of the pixemes. However, like the quasi-indexical elements of reflectivity and transparency, the influence of gravity may be taken not as a syntactical aspect of pictures at all, but as an element of pictorial pragmatics.

In conclusion: the formal treatment of pictures in computational visualistics covering the syntactic aspects rests essentially on two basic data types and their interaction: first, the base structure of position and form, for which the calculi of mereogeometry are quite promising general candidates at present; second, the field of marker values based on a discretized range of homogeneous colors and an additional dimension for transparency (and perhaps reflectivity), offering further structural principles for the secondary level of textures.

2 Do Syntactically Ill-Formed Pictures Exist?

Let us consider as a final aspect of pictorial syntax a thesis that is often discussed in general visualistics: in contrast to verbal expressions that have syntactically ill-formed counterparts, there seems to be no such thing as a syntactically ill-formed picture (cf. [Plümacher 1999]). Whereas, for example, the syntactic structure of a verbal language may be described by just one Chomsky grammar, so that expressions not described by that grammar are considered ill-formed (with respect to that grammar), any expression in any two-dimensional visual L-system, any mereogeometric configuration associated accordingly with marker values, any flat surface makes, it seems, a picture vehicle. The reason appears to be essentially that the geometric basis of pixemes is dense, and any potential combination of pixemes can be used as a picture.

2.1 What Are Morphologically Ill-Formed Pictures

The distinction between the dimension of the geometric base structure and the dimension of the marker values is indeed quite helpful for understanding the difference between verbal sign systems and pictures, also with respect to ill-formedness. Indeed, those discussing this issue usually do not mention damaged screens: cuts, holes, and burned regions disrupt the homogeneous topology that is part of the pictorial base structure. Cuts, for example, separate neighboring pixels: whether they are still neighbors or not, we cannot really say (cf. Fig. 5). Suddenly, there is non-space in picture space – which is certainly not equivalent to fully transparent regions. After all, a cut in a “Rembrandt” results not just in another picture but in a destroyed picture. So, our counter-thesis is that pictures may well be counted as syntactically ill-formed if the underlying geometric structure is disrupted.



Figure 5: An ill-formed Picture?

(A. E. Arkhipov: Peasant Girl (1920s), with a tear)


As with syntactically ill-formed verbal expressions, which may nevertheless be used efficiently for communication, syntactic well-formedness is no necessary criterion for a picture to be employed: a certain art form in the middle of the 20th century, particularly exemplified by L. Fontana (1899–1968), plays exactly with this syntactic deviation from well-formed images: Fontana’s “cut pictures” are reflective pictures that focus our attention on the “materiality” – or in our terminology: on the geometric base structure – of pictures exactly by means of the violation of that very basis (cf. Fig. 6, [Whitfield 1999] and [Sachs-Hombach 2002, 164f.]).



Figure 6: Lucio Fontana: Concetto Spaziale (1965)
Intentionally cut screen said to refer to the “materiality” (i.e., the syntax) of pictures


Syntactic disorders – much like reflection or transparency – have a deictic quality: viewed from merely one perspective, a cut may not be noticed for what it actually is, but taken as another (probably semantically strange) pixeme. Only the movement of the beholder makes clear that the spatial tissue of the syntactic base structure itself has been broken.(10)

Before turning to the question of how such intentionally ill-formed pictures might be dealt with in computational visualistics, let us briefly look at the purpose of such pictures. Employing syntactically ill-formed picture vehicles on purpose is mostly restricted to art. More precisely, these pictures are associated with a special mode of use, as the communicative act they are used for deals with the pictorial sign act itself, and hence, among other aspects, with its syntactic structure.

2.2 The Reflexive Use of Pictures

Pictures that are used not in the primary sense of showing their content but instead to demonstrate aspects of pictorial communication are usually called reflective pictures. Many pictures of art are indeed reflective pictures. They are called ‘reflective’ because they are used to communicate pictorially about the conditions of picture uses and picture productions, or for short: about picture communication and its constituents themselves.(11)

Reflective pictures differ from other pictures by a different attitude of the beholders.(12) In this attitude, we show ourselves a picture as an example of one or the other of the many aspects of pictorial communication. Indeed, this is what we usually do when visiting an art museum, and pictures of art can generally be interpreted as pictures that are made specially for being received in reflective mode. In consequence, distinguishing reflective pictures from others is an aspect of pragmatics rather than syntax. However, as any aspect of picture use may be focused on when using a picture reflectively, syntactic features also play a role occasionally.(13)

Reflective picture uses are not restricted to artistic contexts: quoting a picture is basically showing a picture vehicle in the reflective mode, too. As with verbal quotation, for instance of an example sentence in a linguistics textbook, which must not be confused with the normal (direct) use of that sentence, the application conditions of a quoted picture are quite different from those of its direct use: e.g., showing a Renaissance sacral picture in a university seminar on art history vs. using it for prayer in a chapel.




Figure 7: Exemplifications of the Reflections of Depicted Objects Reached by the Computer Graphics Algorithms ‘Environment Mapping’ (left) and ‘Ray Tracing’ (right) Using the Notorious “Utah Teapot”


2.3 Reflexively-Used and Ill-Formed Pictures in Computational Visualistics

Although reflective pictures of the kinds used and invented in art are seldom relevant in computational visualistics, at least the particular use conditions of example images employed in texts on pictures may be considered important. We may quote pictures in order to exemplify a certain algorithm of computer graphics (cf. Fig. 7) or image processing (cf. Fig. 8). In such cases, an aspect of picture production (hence use) is communicated by means of the presentation of the picture; what is to be seen (as those pictures are usually of the representational kind) is more or less contingent. The frequency of teapots in pictures presented in computer graphics books by no means communicates a particular addiction to the beverage or the receptacle, nor is the fact that a horse is presented pictorially important for the original use of Figure 8.(14) How the object chosen is depicted, how the visual Gestalt relates to the object, and in particular how that relation is in turn linked with some aspects of the algorithm exemplified: that is what the sender of such a pictorial message normally intends to convey – and what the receivers expect to be told in those communicative circumstances. Those pictures are therefore clear cases of reflective pictures as well. In particular, pictures that, like Figure 8, exemplify segmentation algorithms are quite important for the discussion of pictorial syntax in computational visualistics: those algorithms operationalize the concepts of pixeme formation.



Figure 8: Exemplification of a Particular Segmentation Algorithm (Segmentation by Aggregated Weighting): Input Image and Depictions of Results for Three Parameter Settings


As the reflective use of a picture determines in fact a pragmatic category, reflective pictures are usually not distinguished syntactically from other pictures: they correspondingly are not dealt with in a special way in the computer as long as their syntactic characterization alone is considered. In more complex applications, like interactive systems that have to consider semantic and pragmatic aspects at least to a certain degree, the reflexive use has not been employed so far.(15)

Ill-formed pictures, on the other hand, are syntactically special, and hence, it seems, should be dealt with accordingly in computational picture morphology. First, however, it is important to distinguish digital pictures of ill-formed real pictures from ill-formed digital pictures. Certainly, the digital photos of a Russian painting with a large tear or of a work of Fontana – see again Figures 5 and 6 – are not themselves syntactically ill-formed. They are quite regular pictures with an undisturbed spatial base structure, with some of their pixemes marking the regions of the tear or cuts, just as other pixemes in other pictures mark the regions of open doors or other holes of an object depicted.(16)

If the base structure for pictorial syntax is given by some calculus of geometry, any inconsistent geometrical description can be counted as the computational analogue of a damaged screen. Thus, picture files that have been incompletely transferred from a digital camera or from some Internet server could indeed be regarded as such. However, when they are presented in visual form – projected on a screen or printed on paper – the missing spatial coherence is not realized but substituted, as in the case of the photos of an ill-formed picture. Thus, although syntactically ill-formed pictures theoretically exist in computational visualistics as well, they are, at least up to now, not practically accessible. That is: up to now, it is not possible to adequately “computerize” one of the “Spatial Concept” pictures of Fontana with their characteristic cuts.
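
The case of the incompletely transferred file can be made concrete with a toy raster description. The sketch assumes the simplest possible base structure, a declared width and height plus a row-major list of marker values; the names `RasterDescription` and `substitute_for_display` and the grey filler are hypothetical and do not describe any particular file format or viewer.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[int, int, int]


@dataclass
class RasterDescription:
    width: int
    height: int
    pixels: List[RGB]  # marker values in row-major order

    def is_consistent(self) -> bool:
        # The geometric description is coherent only if every declared
        # position actually carries a marker value.
        return len(self.pixels) == self.width * self.height


def substitute_for_display(desc: RasterDescription,
                           filler: RGB = (128, 128, 128)) -> RasterDescription:
    """Pad missing positions so that the description can be shown at all."""
    missing = desc.width * desc.height - len(desc.pixels)
    padded = desc.pixels + [filler] * max(missing, 0)
    return RasterDescription(desc.width, desc.height, padded)


complete = RasterDescription(4, 3, [(255, 0, 0)] * 12)
truncated = RasterDescription(4, 3, complete.pixels[:7])  # transfer broke off
shown = substitute_for_display(truncated)

print(complete.is_consistent(), truncated.is_consistent(), shown.is_consistent())
```

What reaches the display is `shown`, a perfectly regular picture in which some pixemes merely mark the missing part; the inconsistency of `truncated` itself never becomes visible.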

3 The Limitations Of Picture Morphology

There still remain many aspects of pictorial morphology not covered in this volume. One particular question is that of the identity of pictures – a question that in Western culture has mainly been answered, in close connection with the evolving focus on artists of genius, by referring to the identity of the corresponding picture vehicle, the “original”.(17) Other cultures have developed different conceptions that are more closely akin to the relation between a piece of music and its individual performances.(18) The generic concept of pictures encompasses a sub-concept where the picture is fixedly bound to a certain individual picture vehicle, as well as a sub-concept with an elaborate two-level conception.(19)

Since instances of the data types for picture vehicles cannot per se be seen but have to be made visible by means of a computer screen, projector or printer, the computational treatment of pictures favors the second type. The concrete instantiations of the picture vehicle may thus differ more or less slightly. This fundamental morphological “slackness” is even used for certain syntactic solutions to pragmatic problems: by means of “watermarking” a picture vehicle, i.e., subtly modifying the morphology, the authenticity of a picture can be ensured, and uses that infringe copyright can be detected (cf. Fig. 9).(20)



Figure 9: Example of Watermarking
From Left to Right: Original, Watermarked Original, and the difference between them, i.e., the Watermark Image Used
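
The idea of a subtle morphological modification can be sketched with least-significant-bit embedding on grey values. This is a textbook stand-in chosen for brevity, not the method behind Figure 9; practical watermarking schemes are designed to survive compression and editing, which this sketch does not. The function names and sample values are hypothetical.

```python
def embed(pixels, mark_bits):
    """Overwrite the least significant bit of each grey value with a mark bit."""
    return [(p & ~1) | bit for p, bit in zip(pixels, mark_bits)]


def extract(pixels):
    """Read the mark back from the least significant bits."""
    return [p & 1 for p in pixels]


def difference(original, marked):
    """Per-pixel difference, the analogue of the third panel of Figure 9."""
    return [m - o for o, m in zip(original, marked)]


original = [200, 201, 57, 58, 120, 121, 14, 255]  # 8-bit grey values
mark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed(original, mark)
delta = difference(original, marked)

print(marked)
print(delta)
print(extract(marked) == mark)
```

No grey value changes by more than one step, which is far below what Figure 3 suggests a viewer could distinguish, yet the mark can be read back exactly.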


Providing structures isomorphic to the morphological characteristics of images may indeed be sufficient for handling pictures by means of a computer – after all, that structure is (or rather would be) exactly equivalent to all the relevant aspects of the picture vehicles. However, computational visualists should not be satisfied with that, as pictures are not merely picture vehicles but much more complicated entities. Not everything flat and covered with regions of textures is already a picture. With its pictorial metaphor of a theatrical spotlight, Figure 10, the basis of which was originally used in [Schirra 2005] as a coarse overview of a version of the complete data type around »image«, illustrates how small the syntactic range of questions indeed is compared with the other conceptual facets of pictures.



Figure 10: The Spot is on Pictorial Syntax – but there is a lot more of (computational) picture theory


Many of the papers collected in this thematic issue of IMAGE refer to semantic and pragmatic aspects, in the strict structural considerations as well as in the practical applications, since the syntactic problems they investigate only make sense in the context of those features and cannot be solved in isolation. Even the syntactic grouping of pixemes into entities of higher order takes into account not only the morphological attributes of the corresponding elements but also more or less every other pixeme present in the picture: the grouping is highly context-sensitive. Indeed, the identification of the pixemes particularly in a figurative picture depends to a high degree on the picture’s content, i.e., what is depicted.

If we – the computational visualists – do not also consider the particular contexts of use that make us take a flat object for a picture, there is no way to, for example, rationally select from a given set of pictures the one best presented to a certain computer user under the specific conditions at hand. An overview of computational picture morphology cannot deal with the multitude of other questions associated with the concept (or data type) »image«. But – apart from explaining our insights into the syntactic aspects of pictures and the options and restrictions of the computational approaches to those not too familiar with computer science – it may help us to see the limitations of syntax, and to better understand the demands that the other image sciences make of computational visualists.


Literature

  • Aiello, Marco & Pratt-Hartmann, Ian & van Benthem, Johan (2007): What is Spatial Logic? In: Aiello, M. & Pratt-Hartmann, I. & van Benthem, J. (eds.): Handbook of Spatial Logics. Berlin: Springer, 1–12.

  • Buchholz, Kai (1999): Zum Verhältnis von Bildsyntax und Darstellungswert am Beispiel künstlerischer Grafik. In: [Sachs-Hombach & Rehkämper 1999, 255–270].

  • Dölling, Evelyn (1999): Kategorialstruktur ikonischer Sprachen und Syntax der visuellen Sprache. In: [Sachs-Hombach & Rehkämper 1999, 123–134].

  • Held, Jutta (1975): Visualisierter Agnostizismus. Kritische Berichte 3(5/6):1–7.

  • Huber, Hans-Dieter (1997): Internet. In: Hirner, R. (ed., 1997): Vom Holzschnitt zum Internet. Die Kunst und die Geschichte der Bildmedien von 1450 bis heute. Ostfildern-Ruit: Cantz, 186–189.

  • Long, Hui & Leow, K. W. & Chua, F.K. & Kee, F. (2000): Perceptual Texture Space for Content-Based Image Retrieval. In: Proc. MMM, 167–180.

  • Morphy, Howard & Smith B. M. (eds., 1999): Art from the Land: Dialogues with the Kluge-Ruhe Collection of Australian Aboriginal Art. Charlottesville: University of Virginia.

  • Plümacher, Martina (1999): Wohlgeformtheitsbedingungen für Bilder? In: [Sachs-Hombach & Rehkämper 1999, 47–56].

  • Sachs-Hombach, Klaus (2002): Elemente einer allgemeinen Bildwissenschaft. Habilitationsschrift, FGSE, Univ. Magdeburg (revised version: Köln: Herbert von Halem, 2003).

  • Sachs-Hombach, K. & Rehkämper, K. (eds., 1999): Bildgrammatik: Interdisziplinäre Forschungen zur Syntax bildlicher Darstellungsformen. Magdeburg: Scriptum.

  • Saint-Martin, Fernande (1990): Semantics of Visual Language. Bloomington: Indiana Univ. Press.

  • Schirra, Jörg R.J. (2005): Foundation of Computational Visualistics. Wiesbaden: DUV.

  • Sung, C.-K. (1988): Extraktion von typischen und komplexen Vorgängen aus einer langen Bildfolge einer Verkehrsszene. In: Bunke, H. & Kübler, O. & Stucki, P. (eds., 1988): Mustererkennung 88. Berlin: Springer, 90–96.

  • Whitfield, S. (1999): Lucio Fontana. London: Hayward Gallery Publishing.

  • Wu, C.-M. & Chen, Y.-C. (1992): Statistical Feature Matrix for Texture Analysis. CVGIP: Graphical Models and Image Processing, 54(5): 407–419.




Sources of Pictures

Fig. 1 quoted from [Schirra 2005] p. 123.

Fig. 2 quoted from Apple Developer Connection: “About Color Spaces”
--> http://developer.apple.com/documentation/Cocoa/Conceptual/DrawColor/Concepts/AboutColorSpaces.html (as of June 2007).

Fig. 4 quoted from Toby Mason: Reflective Glass Mosaics -- Forming the World, 1997
--> http://users.erols.com/masont/artpages/forming.html (as of June 2007).

Fig. 5 quoted from Olga Nikolic-Litwin: Patina Studio Conservation of Paintings and Icons
--> http://www.patinapal.com (as of June 2007).

Fig. 6 quoted from Maria Hynes: Rethinking Reductionism.
--> http://culturemachine.tees.ac.uk/Cmach/Backissues/j007/art_res.htm (as of June 2007).

Fig. 7 quoted from Watt, Alan & Policarpo, F. (2001): 3D Games: Real-Time Rendering and Software Technology. New York: Addison Wesley, Fig. 7.25.

Fig. 8 quoted from Brown University, Amir Tamrakar: Project 5: Comparison of various region-based segmentation algorithms
--> http://www.lems.brown.edu/vision/courses/computer-vision/ (as of June 2007).

Fig. 9 quoted from Peter Meerwald: Digital Watermarking
--> http://www.cosy.sbg.ac.at/~pmeerw/Watermarking/ (as of June 2007)

Fig. 10 modified from [Schirra 2005] p. 265.