Computational Visualistics and Picture Morphology – An Introduction

Author: Jörg R. J. Schirra
[published in: Computational Visualistics and Picture Morphology (special issue of IMAGE 5)]

Keywords: pictures, syntax, morphology, pixeme

Disciplines: computer science, visualistics, linguistics

Pictures have to be formalized digitally in an adequate manner when computer scientists are to work with them. It is mainly the relevant physical properties of the corresponding picture vehicle that have to be considered in that formalization: that is, the picture syntax. The present special issue of IMAGE deals in particular with morphological questions taking the specific, formalizing perspective of computational visualistics. It is also intended as the attempt to offer a clear and easily understandable summary of the state of the art of research on picture morphology in computational visualistics for picture scientists of the other disciplines. As an introduction, the relations between computer science, general visualistics, syntax studies, and morphology are examined.

Together with language, pictures have been connected to human culture from the very beginning (cf. [Schirra & Sachs-Hombach 2006a]). In Western societies they have gained a rather prominent place. However, steps toward a general science of images, which we may call ‘general visualistics’ in analogy to general linguistics, have only been taken recently (cf. [Sachs-Hombach & Schirra 2002], and [Schirra & Sachs-Hombach 2006b]). In computer science, too, the study of pictures originally evolved along several more or less independent questions, which led to separate sub-disciplines: computer graphics is certainly the most “visible” among them, but there are image processing, information visualization, and computer vision, as well. Only recently have efforts increased to form a unified and partially autonomous branch of computer science specifically dedicated to images in general. In analogy to computational linguistics, the artificial expression ‘computational visualistics’ is used to address the whole range of scientific investigations of pictures “in” the computer (cf. [Schirra 2005]).

Pictures have to be formalized digitally in an adequate manner when computer scientists are to work with them. It is mainly the relevant physical properties of the corresponding picture vehicle that have to be considered in that formalization: that is, the picture syntax. The present special issue of IMAGE deals with exactly that theme taking the specific, formalizing perspective of computational visualistics. It is also intended as the attempt to offer a clear and easily understandable summary of the state of the art of research on picture morphology in computational visualistics for picture scientists of the other disciplines.

1 Computational Visualistics

Computational visualistics takes its name from its two parent disciplines: “computational” refers to the rather young discipline of computer science. “Visualistics” brings to mind the even younger unified science of pictures: general visualistics. Computer science, the endeavor of scientifically studying computers and information processing, has two different roots determining its methodology. In some respects, computer science is a typical structural science like mathematics and logic: its subjects are purely abstract entities together with the relations between them. Such entities, far removed from everyday practice, are at best linked to everyday life by means of an interpretation relation that is arbitrary with respect to the structures as such. In other respects, computer scientists are, like electrical engineers, interested in engineering problems, an interest resulting in concrete artifacts that have already changed our lives dramatically during the past few decades and continue to do so with growing acceleration.

Correspondingly, the topics of computer science are, on the one hand, certain forms of purely abstract structures underlying data processing,(1) and on the other hand, certain kinds of purpose-bound artifacts we usually call “computers”. The concept “implementation” relates those two poles.(2)

1.1 The Relation between Computer Science and General Visualistics

Quite obviously, pictures are not mentioned so far as a genuine topic of computer science. So, how are they linked with abstract data structures and their implementations on computers? That question is indeed a particular version of the more general problem of the relation between computer science and any domain of application; a relation that can be explained by means of the philosophical theory of rational argumentation (cf., e.g., [Ros 1999]) because the function of abstract data structures is equivalent to the function of the concepts structuring the rational argumentations in the domain of application. Data structures determine how formal expressions can correctly be constructed and transformed. The interrelated concepts that form a whole field of concepts(3) – computer scientists sometimes use the expression ‘ontology’ in this context, as well – determine how we ought to speak in a rational manner about a certain thematic domain, e.g., about pictures, and how we may draw correct conclusions from corresponding assertions (in general visualistics, in our example).

The relation between computer science and any domain of application employs that equivalence. Applications of computer science to a certain subject are mediated essentially by means of a formal translation of the field of concepts that structures the rational argumentations in the application domain under investigation into a corresponding abstract data structure. Computational visualistics can thus be characterized by means of its central topic: the data structure(s) »image« that can be conceived of as the formalized equivalent(s) of the field(s) of concepts that form(s) the subject of general visualistics; or in other words: the former govern formal expressions that are correctly constructed and transformed if and only if they correspond to the latter, which determine how we ought to speak in a rational manner about pictures. Algorithms on those data structures exemplify potential argumentations in picture theory in a formalized manner. Therefore, computational visualistics is indeed able to contribute to general visualistics in return: once its algorithms are implemented, the results of applying a theoretically proposed argumentation in a formalized and automated manner to concrete examples can be demonstrated and examined in great numbers with dramatically reduced effort. This is particularly evident in a range of picture phenomena that would not even exist without the help of computers: interactive images.

1.2 Components of Computational Visualistics

Most of the pre-existing picture-related subjects in computer science focus on only certain aspects of the data structure »image«. In the area called image processing, the focus of attention is formed by the operations that take (at least) one picture (and potentially several other parameters that are not images) and relate it to another picture. With these operations, we can define algorithms for improving the quality of images (e.g., contrast enhancement), and procedures for extracting certain parts of an image (e.g., edge finding) or for cutting out pictorial patterns following a particular Gestalt criterion (e.g., blue screen technique). Compression algorithms for the efficient storing or transmitting of pictorial data also belong to this field.

Two disciplines share the operations transforming images into non-pictorial data types. The field of pattern recognition is actually not restricted to pictures, but it has performed important precursory work for computational visualistics since the early 1950s in those areas that essentially classify information in given images: the identification of simple geometric Gestalts (e.g., “circular region”), the classification of letters (recognition of handwriting), the “seeing” of spatial objects in the images or even the association of stylistic attributes of the representation. That is, the images are to be associated with a non-pictorial data type forming a kind of description. The neighboring subject of computer vision is the part of AI (Artificial Intelligence) in which computer scientists try to teach – loosely speaking – computers the ability of visual perception. A problem therefore belongs to computer vision to the degree to which its goal is “semantic”, i.e., the result approximates the human seeing of objects and their behavior in a picture.

The investigation of possibilities gained by the operations that result in instances of the data type »image« but take as starting point instances of non-pictorial data types is performed in particular in computer graphics and information visualization. The former deals with images in the closer sense, i.e., those pictures showing spatial configurations of objects (in the colloquial meaning of ‘object’) in a more or less naturalistic representation like, e.g., in a computer game. The starting point of the picture-generating algorithms in computer graphics is usually a data type that allows us to describe the geometry in three dimensions and the lighting of the scene to be depicted together with the important optical properties of the surfaces considered. Information visualizers are interested in presenting pictorially any other data type, in particular those that consist of non-visual components in a “space” of states: in order to do so, a convention of visual presentation has firstly to be determined – e.g., a code of colors or certain icons.

1.3 The Concept »Image«

The central issue of computational visualistics depends, in conclusion, on the core topic of general visualistics, i.e., the concept »image«. Correspondingly, determinations of that concept in image science are highly relevant for structuring the investigation of the data structure »image«, its algorithms, and the implementations thereof. It may therefore be rather helpful to end this section about computational visualistics with a short note on the concept »picture« in general visualistics.

Unfortunately, picture science has not yet come to final conclusions concerning the complete “ontology”(4) of pictures, which might be taken as the ultimate reference point for computational visualistics. Nevertheless, a sufficiently comprehensive determination to guide computer scientists dealing with pictures is available with Sachs-Hombach’s [2003] proposition of a general conceptual framework, namely to determine the concept »picture« as »perceptoid signs«.(5) In the form of an Aristotelian definition with genus proximum (»sign«) and differentia specifica (»perceptoid«), this determination not only refers to two core aspects of pictures but also opens up in the first place, as we shall see below, the way to speak about pictorial syntax and picture morphology.

The superimposed concept »sign« implies that something – the picture vehicle – can be a picture if and only if it is in a certain way part of a special kind of situation that is characterized by a particular action: the sign act. That context also includes acting subjects called “sender” and “receiver”. The sign (e.g., a picture) is used by the sender as a means to direct the focus of attention of the receiver onto something that is usually not present in that situation.(6)

Furthermore, in order to function properly each picture has to apply our abilities of visual perception in a specific manner, which we call its »perceptoid« character. More precisely, in using – i.e., adequately using – pictures we do not only perceive visually the sign in its physical appearance, that is, the picture vehicle. We have also to invoke – at least to some degree – our abilities to visually perceive spatial objects and configurations that are closely related with what the picture is employed to symbolize (the picture content).

2 Pictures and Syntactic Investigations

Taking pictures to be a kind of sign allows the visualists – and that is, the computational visualists, too – to apply semiotic distinctions in order to guide their investigations. Since a picture like any sign depends on being part of a sign act, the broadest range of investigations (enclosing and determining all other questions) is the one that examines any relations between the other acts of sender and receiver with the signing activity – i.e., the presentation of a picture by a sender to a receiver in a certain context. That is the field of pragmatics. Examinations considering only the relations holding between the picture vehicles and what they are used to symbolize for sender and receiver determine the field of semantics.

Syntax is the third semiotic range of questions; and it is also the most restricted one since it deals with the sign vehicles (or in our case: the picture vehicles) alone. More precisely, the classifications of and relations between sign vehicles with respect to their physical properties are examined. This also includes the question of the range of variability of sign vehicles that may be used as the same sign, but also potential compositions of sign vehicles to more complex sign vehicles.

2.1 Syntactical Density

Syntactical considerations have belonged to the repertoire of picture theories since Nelson Goodman’s publications at the latest (cf. [Goodman 1976], and also [Sachs-Hombach & Rehkämper 1999]). Although Goodman does indeed consider more than syntax, it is an important syntactic characterization of pictures that has had the most influence in general visualistics so far. Syntactically, pictures are, he proposes, dense – in contrast to verbal signs, which are syntactically distinct. A sign system is called syntactically dense if the dimension of values for at least one of the syntactically relevant properties of the sign vehicles corresponds to the rational numbers: between any two values there are always more values. Sign vehicles with different values in that property are taken as different signs in that sign system. So, two of the infinitely many signs of such a system can be “infinitely similar” to each other, as there are always more sign vehicles “in between”.
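The defining property of density can be illustrated with exact rational arithmetic. The following is only a minimal sketch: it assumes that one syntactically relevant property (say, a position along one axis) is modelled as a `Fraction`, and the function name `between` is purely illustrative.

```python
from fractions import Fraction


def between(a: Fraction, b: Fraction) -> Fraction:
    """Return a value strictly between two different rational values."""
    if a == b:
        raise ValueError("density concerns two different values")
    return (a + b) / 2


# Two property values that are already very close to each other ...
lower = Fraction(1, 3)
upper = Fraction(334, 1000)

# ... still admit further values in between, again and again.
steps = []
current = upper
for _ in range(5):
    current = between(lower, current)
    steps.append(current)
```

Each element of `steps` would count as a different sign in a dense system, although all of them lie between the two original values.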

Syntactic characteristics of pictures are obviously defined by the visual properties of a marked surface of the picture vehicle. There are at least two different relevant dimensions that are apparently dense: (i) the positions of a point of color or a border between colors, and (ii) the perceived color (in a broad meaning). Between two different positions of a point of color, there is always – at least in theory – a (multitude of) position(s) in between. And similarly, in the theories of color two different color values are always connected by means of a sequence of intermediate color values, even if the human eye may not be able to distinguish those without the help of an artificial instrument.

The syntactically characteristic property of density is of high significance for the possibility of encoding, presenting, storing, and transferring pictures in/with a computer. Is it decidable whether two pictures are syntactically equal? Can we, in other words, determine by means of an effective, finite algorithm whether the transmission of a picture vehicle through the Internet, for example, has been correct, or whether a stored image still corresponds exactly to the original? Goodman has denied that possibility, which means that computational visualistics has a problem if he is right. Any computer system would only be able to differentiate picture vehicles up to a certain degree of resolution (in location or color).

2.2 Resolution in Computational Visualistics: Pixels

Indeed, the combination of images and computers originally cost the former a property that scientists of many of the disciplines involved regard as characteristic of pictures: pictures had to become digital in order to join that liaison. Essentially, ‘being digital’ means that the resolution of pictures has a definite (and often quite small) value. In contrast, the common view holds that picture vehicles have to be (at least in principle) analog, i.e., without any limitation of resolution.

The simplest and best-known data type for making picture vehicles available to a digital computer is the bitmap – a matrix of pixels (‘picture elements’). This data type allows us to define a pixel value for any pair of coordinates taken from two finite sets of successive indices (i.e., natural numbers). The pixel values encode a visual property, like color or intensity. Bitmaps therefore have a finite and fixed spatial resolution that depends on the size a pixel is given: bitmap pictures are rasterized. The number of different bitmaps of a given matrix size is finite, while the number of different matrix sizes is infinite but enumerable.(7)
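The bitmap data type described here can be sketched in a few lines. This is a minimal illustration only: the class name `Bitmap`, the default of 256 pixel values, and the helper `count_bitmaps` are assumptions made for the example, not part of any particular graphics library.

```python
from dataclasses import dataclass, field


@dataclass
class Bitmap:
    """A matrix of pixel values indexed by two finite ranges of natural numbers."""
    width: int
    height: int
    levels: int = 256  # assumed number of distinct pixel values
    pixels: list = field(default_factory=list)

    def __post_init__(self):
        # Start with an all-zero matrix unless pixel rows were supplied.
        if not self.pixels:
            self.pixels = [[0] * self.width for _ in range(self.height)]

    def put(self, x: int, y: int, value: int) -> None:
        if not 0 <= value < self.levels:
            raise ValueError("pixel value outside the encoded range")
        self.pixels[y][x] = value

    def get(self, x: int, y: int) -> int:
        return self.pixels[y][x]


def count_bitmaps(width: int, height: int, levels: int) -> int:
    """Number of different bitmaps of one matrix size: levels ** (width * height)."""
    return levels ** (width * height)


original = Bitmap(3, 2)
original.put(1, 0, 255)
received = Bitmap(3, 2, pixels=[row[:] for row in original.pixels])
same = original == received  # pixel-by-pixel comparison, always terminates
```

Because the matrix is finite, the comparison in the last line is an effective procedure: for bitmaps, syntactic equality is decidable, which is exactly what the density of pictures puts into question.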

The presentation of pictures on a computer screen usually employs this data type in just one matrix size. Although obviously only a finite number of different picture vehicles is discriminated in that manner, an underlying data structure »image« still can be designed in order to fit the criterion of syntactic density imposed by general visualistics: the dense structure of a picture has to be projected (potentially only in parts) onto the syntactically distinct pixel matrix with the option of zooming in and out. In contrast to the visual approximations shown on the screen, a picture encoded by an instance of a data structure incorporating such a zoomable projection function need not have a finite level of resolution (at least in theory: recall, for example, the small program systems fashionable a few years ago that were used to visually inspect certain fractal functions, e.g., the Mandelbrot set).
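A zoomable projection of this kind might be sketched as follows. The example assumes that the dense picture is given as a function from continuous coordinates to a value, and it uses the Mandelbrot escape-time function mentioned above as that picture; the names `project`, `center`, and `extent` are illustrative. Note that floating-point numbers are themselves only finitely precise, so a faithful implementation of unlimited zoom would need exact or arbitrary-precision arithmetic.

```python
def mandelbrot(x: float, y: float, max_iter: int = 50) -> int:
    """Escape-time value of the point x + iy; defined for any coordinates."""
    c = complex(x, y)
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter


def project(picture, center, extent, width, height):
    """Sample a dense picture function onto a width x height pixel matrix.

    center is the midpoint of the viewport, extent its side length along x.
    """
    cx, cy = center
    step = extent / width            # size of one pixel in picture coordinates
    x0 = cx - extent / 2
    y0 = cy - step * height / 2
    return [[picture(x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
             for i in range(width)]
            for j in range(height)]


# The same picture projected twice onto the same matrix size:
overview = project(mandelbrot, (-0.5, 0.0), 3.0, 16, 12)
detail = project(mandelbrot, (-0.75, 0.1), 0.01, 16, 12)  # zoomed in 300 times
```

The pixel matrix stays the same size in both calls; only the viewport changes, so the resolution limit belongs to the presentation and not to the encoded picture.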

Resolution is only one aspect of computational pictorial syntax: it corresponds roughly to the level of linguistics dealing merely with the range of letters; the notorious pixel usually comes into the beholder’s (or creator’s) focus of attention only when the presentation quality of a picture is low. There are other parts of which a picture vehicle can be viewed as composed and which could be rearranged to form another picture vehicle: when discussing syntactic design elements, M. Scholz (1999), for example, refers to Paul Klee’s pedagogic sketch book (1925, republished 1997) as an overview. Klee proposes several kinds of points, spots, lines, and areas (including typical geometric Gestalts like circle or square).(8) We shall later come back to such entities from geometry. Sometimes, candidates for syntactic elements can also be defined based on the production process: each stroke of a pen, a brush or an engraving tool may lead to an individually visible mark usable as a syntactic element.

Of course, confronted with the questions of pictorial syntax and its combination rules, the first impulse of computer scientists is usually to think of formal grammars.

2.3 Picture Grammars

Every computer scientist knows by heart the abstract structures called formal grammars – also called Chomsky grammars or compositional grammars or transformation grammars – since those are the major instrument for defining and classifying linear structures like programming languages. They are actually a tool from linguistics and have been applied to verbal syntax with great success.

A compositional grammar provides (i) a finite set of grammatical categories like ‘article’, ‘prepositional phrase’ or ‘sentence’, (ii) a lexicon (i.e., a collection of basic signs (words) each associated with a grammatical category), and (iii) a finite set of composition (transformation) rules. Essentially, each rule associates a grammatical category with a sequence of such categories, as in the following examples:(9)

PP --> Prep + NP
NP --> Art + Noun
NP --> Art + Adj + Noun

Those three sets determine all sentences, i.e., sequences of words, belonging to the language considered. Note that each word listed in the lexicon always has clear semantic and grammatical functions of its own.
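How the three sets jointly determine a language can be shown with a small generator. The rules are the three examples given above; the lexicon entries are hypothetical and chosen only for illustration. The sketch assumes a grammar without recursive rules, so that the enumeration terminates.

```python
from itertools import product

# Composition rules: each category maps to its alternative category sequences.
RULES = {
    "PP": [["Prep", "NP"]],
    "NP": [["Art", "Noun"], ["Art", "Adj", "Noun"]],
}

# Hypothetical lexicon: basic signs (words) with their grammatical category.
LEXICON = {
    "Prep": ["on"],
    "Art": ["the"],
    "Adj": ["red"],
    "Noun": ["canvas", "frame"],
}


def expand(category):
    """Return every word sequence derivable from the given category."""
    if category in LEXICON:
        return [[word] for word in LEXICON[category]]
    results = []
    for sequence in RULES[category]:
        # Combine every derivation of each category in the sequence.
        for parts in product(*(expand(c) for c in sequence)):
            results.append([word for part in parts for word in part])
    return results


phrases = [" ".join(words) for words in expand("PP")]
```

With this lexicon, `phrases` contains exactly four prepositional phrases, for example "on the red canvas"; anything else is ill-formed with respect to this toy grammar.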

Assuming that all pictures form just one “language”,(10) a formal grammar for picture syntax thus would also have to provide corresponding sets of syntactic categories, elementary pictures with associated syntactic categories, and composition rules. Those sets should accordingly be applicable for analyzing in a mechanical manner given objects in order to decide whether they are pictures,(11) or to generate from the starting category any picture vehicle. Such a formal grammar for pictures would indeed enable us to distinguish between well-formed and ill-formed picture vehicles.

Unfortunately, all proposals so far to provide such a combinational grammar system for all pictorial signs (or even large subsets) have failed: only very special pictorial media – that apparently are also used in a way similar to language anyway, like pictograms – could be formalized in that manner.(12) In general, there does not even seem to be anything like an ill-formed picture vehicle at all (cf. [Plümacher 1999]). Any more or less flat surface that can be visually perceived can apparently serve as a picture vehicle.

Already the question “what are the syntactical elements in the ‘lexicon’ – as we do not have a better expression, so far – of copper engravings (for example)” is not easily answered. Can the engraving lines carry that function? Are pixels – as used in computational visualistics – better candidates? However, neither engraving lines nor digital pixels bear a proper pictorial meaning of their own – one of the characteristics in the linguistic case, i.e., for the words in the lexicon.

Furthermore: What corresponds to the grammatical categories? Are perhaps “Circle” or “Spot” pictorial analogies of “Noun” and “Art”? And if so, what would actually be the difference between the ‘lexical’ basic elements and the grammatical categories in that system?

In conclusion: Being rather fertile in linguistic syntax studies, the idea of generative syntax has often been proposed for pictorial syntax, as well – though, with little success: compositional syntax is mainly interested in the syntactically correct composition of words (as elementary verbal signs) into sentences (i.e., compound verbal signs). A pictorial analogy of words so that pictures could be conceived of as corresponding sentences has not been suggested in a convincing manner. However, another important building block of syntax studies – at least in linguistics – is given by morphology.

3 Morphology

3.1 Morphology in Linguistics

In linguistic morphology, the rules of building words, and hence the inner structure of words, are examined instead of sentences.(13) Words are partitioned into segments called ‘morphemes’(14) that contribute to the word’s meaning or grammatical function. The postfix ‘-ed’ in English, the prefix ‘pré-’ in French, or the root ‘-wend-’ in German are typical examples of morphemes. Mostly, morphological elements are identified and arranged into classes by means of a rule of mutual exchange: some words beginning with ‘pré-’ can be transformed into other words of French by just changing the prefix to ‘re-’, ‘con-’, ‘de-’ etc.

More generally, morphological modifications can be differentiated into internal modifications mainly by means of vowel permutations (e.g., ‘come’ to ‘came’), and external modifications by means of affixes – besides prefix and postfix, some languages also use infix and circumfix modifications. While internal modification alters the “color” of a word, so to speak, external modification changes its shape and size. Thus, the combination of morphological elements also plays a major role in the invention of completely new words.

Morphemes do not have to be – and are usually not – words by themselves.(15) Even the semantic or grammatical function of one morpheme can be ambiguous and may change in different compositions (e.g., “s” as inflection postfix and plural postfix in English). Morphemes may best be viewed as the vehicles of unsaturated partial sign acts without an independent pragmatic function(16) that modify in a more or less specific way the meaning of the whole.

There are arguments that syntax in the form of a formal grammar, and syntax as morphology are not categorically opposed but form the two ends of a more or less continuous scale of various language structures: from the analytic language structure (also: isolating languages) to the various types of synthetic language structures (with the subsets of agglutinating, flexing, and fusing languages), and finally the polysynthetic language structure (also: incorporating languages).(17)(18)

The morphological structure of the word matgībulhahumš in Egyptian Arabic,(19) for example, could be literally translated to approximately “not-you-all-ought-bring-her-them-thing” (i.e., “do not bring them to her, all of you”). It consists of the two circumfixes ma...š (“not … thing”) and t(i)...u (marker for 2nd person plural imperfect in jussive mode: approx.: “you ought to”), the two morphemes l(ī)ha (3rd person singular feminine dative), hum (3rd person plural accusative), and, as the root, an internally modified lexeme gīb (the imperfect form of gāb: “to bring”), as is indicated in the following schema:

Schema 1

All the morphological elements are fused into a single word that is used as a sentence. The schema of such complicated combinations by means of the fusion of morphemes with partial phoneme elisions – together with the use of enclosing or inserting affixes – can indeed evoke the idea of a syntactic structure of pictures much more strongly than the schema of formal grammars.

3.2 Transfer to Visualistics

Intuitively, the system of pictures and most of its subsystems are similar to extremely polysynthetic languages. Of course, picture vehicles do have parts that modify the pictorial meaning and use of that vehicle. But for each picture, those parts are closely fused together – comparable to an enormously complex one-word sentence. They form a single entity that usually does not allow us to isolate in a clear manner the semantic contribution of any part, as would be expected in the case of a formal grammar. Nevertheless it is clear that any morphological element of the picture vehicle – or pixeme for short – does contribute in some way to the meaning, and hence modifies the use of the picture. Therefore, any tiny change in the spatial distribution of pigments may very well be seen primarily as a modification in the sense of morphology.

There are several characteristic differences from verbal morphology: In contrast to the essentially temporal and hence linear composition of verbal morphology, pictorial morphology extends in (flat) space and thus in (usually) two co-ordinated dimensions, which increases the complexity quite heavily. Instead of the pair of possible directions for morphological extensions – “before” (as prefix) and “after” (as postfix)(20) – an infinite and actually dense multitude of directions can be used to position pixemes.

Of course, the specific difference of resolution already mentioned above has to be taken into account, as well: there is a distinct lower limit to resolution in linguistics since morphemes cannot be smaller than the difference between two letters or phonemes. For picture vehicles, no such quantization is evident. The criterion of density also implies that any pixeme can – at least in principle – be considered as composed of even smaller pixemes.

Brush strokes, pencil lines, etc. are rather good candidates for simple pixemes, as was already mentioned above.(21) They are composed into more complex configurations that nevertheless still are pixemes. In general, we may view any geometrical entity of two-dimensional geometry of the picture vehicle as a pixeme. Then, even a picture is a pixeme, as well – which makes sense as its surface can be seamlessly incorporated in another picture vehicle. Still, pictures may very well have a morphological structure without a list of given elements that are pictures themselves. Although there appears to be no (natural) verbal language that employs bound morphemes only, morphology does not necessarily depend on the existence of free morphemes (lexemes). On the other hand: there always exists a maximal pixeme to which all the other pixemes are infixes. It is the frame that externally binds and thus determines the maximal pixeme. Indeed, maximal pixemes might act as free morphemes for picture vehicles.(22) While verbal structures grow morphologically outward by adding elements mostly externally, pictorial structures grow morphologically inward by adding details internally.

Since morphemes essentially change the color of vowels in the course of an internal modification, a literal change of color of a pixeme is a very plausible candidate for the corresponding derivation. Again, the bandwidth of alternatives is characteristically different: a finite set of phonemes vs. the colors from a dense range of options.

Evidently, the rules of visual perception are constitutive for the “segmentation” of pictures into pixemes.(23) The empirical findings from psychophysics and the concepts of Gestalt theory in particular help to determine the laws of pixeme formation. The former indicate general principles of indiscernibility of optical properties while the latter formulate grouping principles that bind compound pixemes to the constituting simpler pixemes. That decomposition runs down to optically uniform regions, which we find on any level of resolution since we deal with dense fields both in color and in location. An optically uniform region is not only given by a single color, but also by a color gradient (in particular a saturation or intensity gradient), and even by homogeneous textures.

As an extremely simplified example in analogy to the verbal example above, the following schema exemplifies a morphological (de)composition for a picture. In accord with the assumption mentioned above that pictorial morphology grows inward, the frame defines the root in the decomposition, or more precisely: bound by the frame, the empty “canvas” acts as the maximal pixeme. As an infix, the face marks modify the maximal pixeme. The face mark itself consists of a simple circular pixeme with several infixes and one circumfix (the ear marks).

Schema 2
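The inward-growing decomposition described for this example can also be written down as a small tree data structure. This is only a sketch under stated assumptions: the class `Pixeme` and its two child lists are illustrative, and since the text speaks only of “several infixes”, the eye and mouth marks named below are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Pixeme:
    """A morphological element of a picture vehicle with its modifiers."""
    name: str
    infixes: list = field(default_factory=list)      # pixemes placed inside
    circumfixes: list = field(default_factory=list)  # pixemes attached around

    def children(self):
        return self.infixes + self.circumfixes

    def count(self) -> int:
        """Total number of pixemes in this (sub)structure."""
        return 1 + sum(child.count() for child in self.children())

    def depth(self) -> int:
        """Number of nesting levels, counting this pixeme as level 1."""
        return 1 + max((child.depth() for child in self.children()), default=0)


# The face mark: a circular pixeme with infixes and one circumfix.
face = Pixeme(
    "circle",
    infixes=[Pixeme("left eye mark"), Pixeme("right eye mark"), Pixeme("mouth mark")],
    circumfixes=[Pixeme("ear marks")],
)

# The maximal pixeme: the empty canvas bound by the frame, modified by the face.
canvas = Pixeme("empty canvas bound by the frame", infixes=[face])
```

The root of the tree is the maximal pixeme, and every further element is added inside it, which mirrors the claim that pictorial structures grow morphologically inward.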

Although there is still much more to be said about pictorial morphology in general, it is now the time to come back to the particular perspective of computational visualistics.

4 Aspects of Picture Morphology in Computational Visualistics

Morphological considerations in the particular context of computational visualistics are at the focus of this thematic part of IMAGE V. We are interested in questions like the following: What alternative formalizations for pixemes apart from pixels can be offered by computational visualistics? Where and in which form do such formal pixeme systems play an important role? And what is the influence these formalizations in computational visualistics have on picture morphology in general?

4.1 Some Specific Approaches

Let us concentrate for the moment on lines or strokes. A stroke may be defined pragmatically by the painter’s movement or semantically as the contour line of an object. Besides the potential graphical meaning of a line or the stylistic indications associated with its particular make (not to mention any other expressive or appellative function of dynamism associated with it on the level of pragmatics), there are several dimensions in which a line – just being taken as a line – can vary: most prominently in the course or path it takes. But there are other ranges: is it a continuous line, or dashed, or dotted? Does it consist of strokes of one kind or another? How thick is it? Does its thickness change over its course or not? Is there an internal fine structure to the strokes?

An extensive treatment of such a data type and its possible implementations has been performed in the context of non-photorealistic rendering (NPR), a subfield of computer graphics.(24) While Figure 1 exemplifies several types of digital “hairy brush strokes” that have been generated – quite expensively in computational resources – by simulating a brush with several individual bristles applied with changing pressure to a certain kind of surface, Figure 2 shows examples of lines resulting from the application of a “style function” to the “skeletal path” of the stroke.(25) Both constituents of the latter case are defined by means of parametric curves: the style describes how a given path (as the core of the line) is to be perturbed in order to result in a corresponding pixeme. Style and path can be viewed as independent ranges determined in each particular picture by semantic and / or pragmatic aspects.

Figure 1: Enlarged Fine Structure of Computer-Generated Stroke Types

Figure 2: Examples with Style-Parameterized Stroke Functions
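
The separation of style and path can be illustrated with a minimal Python sketch. It is not the implementation used in the NPR literature cited above; the functions `straight_path`, `wavy_style`, and `render_stroke`, as well as all numeric constants, are hypothetical choices made for illustration. The path is a parametric curve over t in [0, 1], the style returns a sideways offset and a thickness for each t, and the stroke results from perturbing the path along its normal.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Path = Callable[[float], Point]                  # t in [0, 1] -> point on the skeletal path
Style = Callable[[float], Tuple[float, float]]   # t -> (sideways offset, thickness)


def straight_path(t: float) -> Point:
    """Hypothetical skeletal path: a horizontal line of unit length."""
    return (t, 0.0)


def wavy_style(t: float) -> Tuple[float, float]:
    """Hypothetical style: four waves along the stroke, tapering towards the end."""
    offset = 0.05 * math.sin(2.0 * math.pi * 4.0 * t)
    thickness = 0.02 * (1.0 - t)
    return offset, thickness


def render_stroke(path: Path, style: Style, samples: int = 50) -> List[Tuple[float, float, float]]:
    """Sample the path and displace every sample along the local path normal.

    Returns a list of (x, y, thickness) triples describing the perturbed stroke.
    """
    eps = 1e-4
    stroke = []
    for i in range(samples + 1):
        t = i / samples
        x, y = path(t)
        # Numerical tangent by a finite difference, clamped to the interval [0, 1].
        x0, y0 = path(max(0.0, t - eps))
        x1, y1 = path(min(1.0, t + eps))
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy) or 1.0
        nx, ny = -dy / length, dx / length       # unit normal (tangent rotated by 90 degrees)
        offset, thickness = style(t)
        stroke.append((x + offset * nx, y + offset * ny, thickness))
    return stroke


if __name__ == "__main__":
    for x, y, w in render_stroke(straight_path, wavy_style, samples=8):
        print(f"x={x:.3f}  y={y:+.3f}  thickness={w:.4f}")
```

Because `render_stroke` only relies on the two function arguments, the same style can be applied to any other path and vice versa, which mirrors the independence of the two ranges described above.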

To some degree, the rules of composition of strokes or other pixemes into a picture can be investigated by means of the tools of formal languages. Formal grammars based on replacement rules that lead to two-dimensional “pictorial” structures have been investigated essentially under the name of L-systems. The expressions generated by an L-system can be interpreted as orders to place substructures, and to move or turn in-between. A fairly simple example is defined by the following replacement rule:

P --> P [ – P ] P [ + P ] P

Interpret “P” as “place a pixeme and move a bit forward”, “+” as “turn right”, “–” as “turn left”, and the square brackets as stack operations that allow us to return to that point after the bracketed sub-expression has been dealt with. The plant-like structures in Figure 3 have been generated by this rule. Obviously the pixemes themselves are not really relevant for L-systems and their relatives, since these grammars basically deal with arrangements and groupings of abstract entities that may or may not be interpreted in a pictorial sense.(26)

Figure 3: Two Complex Example Morphemes Generated by (Bracketed) L-Systems, and the Graphical Interpretation for the Rule for the Left Example
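
The replacement rule and its interpretation given above can be sketched in a few lines of Python. The rule itself is taken from the text; the function names `expand` and `interpret`, the step length, the turning angle of 25 degrees, and the upward start direction are assumptions made for illustration only, and the ASCII character "-" stands in for the minus sign of the rule.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# The single replacement rule from the text.
RULES: Dict[str, str] = {"P": "P[-P]P[+P]P"}


def expand(axiom: str, rules: Dict[str, str], iterations: int) -> str:
    """Apply the replacement rules to every symbol in parallel, `iterations` times."""
    word = axiom
    for _ in range(iterations):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


def interpret(word: str, step: float = 1.0, angle_deg: float = 25.0) -> List[Tuple[Point, Point]]:
    """Turn a generated word into line segments.

    "P" places a segment (standing in for a pixeme) and moves forward,
    "+" turns right, "-" turns left, "[" saves the current position and
    direction on a stack, and "]" returns to the most recently saved state.
    """
    x, y, heading = 0.0, 0.0, 90.0      # assumed start: origin, pointing upwards
    stack: List[Tuple[float, float, float]] = []
    segments: List[Tuple[Point, Point]] = []
    for symbol in word:
        if symbol == "P":
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif symbol == "+":
            heading -= angle_deg        # turn right (clockwise)
        elif symbol == "-":
            heading += angle_deg        # turn left (counter-clockwise)
        elif symbol == "[":
            stack.append((x, y, heading))
        elif symbol == "]":
            x, y, heading = stack.pop()
    return segments


if __name__ == "__main__":
    word = expand("P", RULES, 2)
    print(word)
    print(len(interpret(word)), "segments")
```

Note that `interpret` never inspects what is placed at each "P": any pixeme could be substituted for the line segment, which reflects the remark that the grammar deals only with the arrangement of abstract entities.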

For a more extensive approach to pictorial morphology, a data type for pixemes can best be derived from a calculus for geometry. That any pixeme must be a geometric entity seems almost too trivial to be mentioned. That inversely any entity in flat geometry – apart from non-extended points – may also be a candidate for a pixeme is at least a good guess. Taking the common Euclidean formalization of geometry leads however to the “unpleasant” consequence that the most basic pixemes must be non-extended points – a concept highly abstracted from experience, that is.(27)

Fortunately, some non-standard approaches to geometry offer an interesting way out. The traditional calculus of geometry develops around the fundamental concept of a zero-dimensional point. In contrast, the family of mereogeometries(28) is based on extended regions as the most elementary entities, which may or may not have (distinguishable) proper parts. The regions are often called “individuals”. Individuals do not have immediate attributes of form or position: only the relations to other individuals, in particular parts, determine form and (relative) location.

An individual may quite well be thought of as a visual Gestalt – thus following the principle of perception psychology of the Gestalt school: one has to consider the perceived whole first and introduce the concepts for perceptual atoms as instruments of the explanations of the former, not the other way round. We do not see sets of zero-dimensional points but regional Gestalts. The abstract notion of a spatial entity without extension is secondarily constructed in order to explain some aspects of experienced space, but leads on the other side to severe difficulties as the discussion on infinite resolution has shown. Therefore, the constructs of an individual calculus for the two-dimensional mereogeometry are excellent candidates for a general and exhaustive discussion of pixemes.

In fact, the concept of a minimal region can be introduced in mereogeometry: such a region is usually called a “point”, but we may well use “pixel” instead. A point in this sense is a region that has no proper parts (or rather, a region where no proper parts are considered). When the concept »point« is introduced in the data structure in that manner, there is no need in any concrete instance for using infinitely many point instances: only the “relevant” points must be instantiated. This also means that there is always a finite resolution. However, N. Asher & L. Vieu [1995] propose a formal mechanism called “microscopization” covering a kind of zooming operation by means of a modal extension to their calculus. What is a “point” on one level may be a compound of regions with several points on a microscopized level. While Euclidean geometry first introduces the continuous range of infinitely many coordinates determining potential points, some of which are then chosen to be relevant (still an infinite number in any practically relevant instance), mereogeometry starts with a (usually finite) number of relevant individuals (regions) that we can think of as being given in perception. That is, we may indeed assume that the principles governing visual perception determine the regions that are syntactically relevant, hence leading only to the essential “points” determined by the given individuals.
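
The idea of points as regions without considered proper parts, and of refining them on demand, can be made concrete with a small Python sketch. It is a toy data structure under strong simplifying assumptions and not the modal calculus of Asher & Vieu: the class `RegionModel`, its method names, and the region names in the example are all hypothetical, and parthood is stored as an explicit finite relation.

```python
from typing import Dict, Iterable, Optional, Set


class RegionModel:
    """A finite collection of named regions with an explicit proper-part relation."""

    def __init__(self) -> None:
        # Maps every region to the set of its direct proper parts.
        self.parts: Dict[str, Set[str]] = {}

    def add_region(self, name: str, part_of: Optional[str] = None) -> None:
        """Register a region, optionally as a direct proper part of another region."""
        self.parts.setdefault(name, set())
        if part_of is not None:
            self.parts.setdefault(part_of, set()).add(name)

    def proper_parts(self, name: str) -> Set[str]:
        """Return all proper parts of a region, following parthood transitively."""
        found: Set[str] = set()
        todo = list(self.parts[name])
        while todo:
            region = todo.pop()
            if region not in found:
                found.add(region)
                todo.extend(self.parts[region])
        return found

    def is_point(self, name: str) -> bool:
        """A point is a region for which no proper parts are considered."""
        return not self.parts[name]

    def points(self) -> Set[str]:
        """Only the instantiated, relevant points; always a finite set."""
        return {name for name in self.parts if self.is_point(name)}

    def microscopize(self, name: str, sub_regions: Iterable[str]) -> None:
        """Zoom in: what was a point becomes a compound of at least two sub-regions."""
        subs = list(sub_regions)
        if not self.is_point(name):
            raise ValueError(f"{name!r} already has proper parts")
        if len(subs) < 2:
            raise ValueError("a refined point needs at least two sub-regions")
        for sub in subs:
            self.add_region(sub, part_of=name)


if __name__ == "__main__":
    model = RegionModel()
    model.add_region("plane")
    model.add_region("left", part_of="plane")
    model.add_region("right", part_of="plane")
    print(sorted(model.points()))
    model.microscopize("left", ["left_top", "left_bottom"])
    print(sorted(model.points()))
```

At every level of refinement the set returned by `points()` stays finite, while `microscopize` allows any point to be resolved further whenever a finer distinction becomes relevant.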

The empty picture plane – as the simplest maximal pixeme – is particularly characterized in its most usual rectangular form by the four corner points. The “energetic field” often associated with such a maximal pixeme (cf. Fig. 4) cannot easily be derived as it depends essentially on features of the perceptual mechanism not covered by the Euclidean calculus as such.(29) Additional explanations have to be added that often employ rather mystical metaphors borrowed from physics.(30) The mereogeometrical conception of points and limits may offer a better access to the problem of the “energetic aspects” of pixemes, and especially of the empty picture plane: as those points are only conceivable as the result of operations on extended regions, the four corner points implicitly refer to defining individuals (virtual pixemes). It is a promising hypothesis for future research to derive within the calculus of mereogeometry any “energetic effects” from those implicit pixemes.

Figure 4: Rectangular maximal pixeme with “energetic phenomena” as sketched by Saint-Martin [1990, 97]

Mereogeometries are a formal way to deal with geometry in a manner more closely related to visual perception than traditional point geometry. If we accept the view that the central data type of a two-dimensional mereogeometry determines what is a pixeme – namely any connected subsystem of individuals – then there is indeed no finite number of possible pixemes: a clear difference to verbal sign systems with their strictly limited number of morphemes. However, any pixeme can be described and dealt with in a unique and generatable manner in the calculus in a finite number of steps: pixemes can be combined to form pixemes of a higher order – until every visually separable Gestalt of a picture is covered.
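
The characterization of a pixeme as a connected subsystem of individuals can be sketched as follows in Python. This is an illustration under assumptions, not part of any mereogeometrical calculus: the connection relation is given as an explicit finite set of pairs, and the names `CONNECTED`, `is_pixeme`, `combine`, and `all_pixemes` are hypothetical. For a fixed finite stock of individuals the enumeration is of course finite; the unbounded number of pixemes mentioned above arises because the stock of individuals itself is open-ended.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Optional, Set

# Assumed example: four individuals connected in a chain a - b - c - d.
INDIVIDUALS = ["a", "b", "c", "d"]
CONNECTED: Set[FrozenSet[str]] = {
    frozenset(pair) for pair in [("a", "b"), ("b", "c"), ("c", "d")]
}


def is_pixeme(individuals: Iterable[str], connected: Set[FrozenSet[str]]) -> bool:
    """A non-empty set of individuals counts as a pixeme if it is connected."""
    members = set(individuals)
    if not members:
        return False
    start = next(iter(members))
    seen = {start}
    todo = [start]
    while todo:
        current = todo.pop()
        for other in members - seen:
            if frozenset((current, other)) in connected:
                seen.add(other)
                todo.append(other)
    return seen == members


def combine(first: Iterable[str], second: Iterable[str],
            connected: Set[FrozenSet[str]]) -> Optional[FrozenSet[str]]:
    """Combine two pixemes into a higher-order pixeme, or return None if the
    union is not connected."""
    union = frozenset(first) | frozenset(second)
    return union if is_pixeme(union, connected) else None


def all_pixemes(individuals: List[str],
                connected: Set[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Enumerate every connected subsystem of the given finite set of individuals."""
    result = []
    for size in range(1, len(individuals) + 1):
        for subset in combinations(individuals, size):
            if is_pixeme(subset, connected):
                result.append(frozenset(subset))
    return result


if __name__ == "__main__":
    print(len(all_pixemes(INDIVIDUALS, CONNECTED)), "pixemes in the chain example")
    print(combine({"a"}, {"b"}, CONNECTED))
    print(combine({"a"}, {"c"}, CONNECTED))
```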

4.2 The Contributions of the Volume

Since the fluctuation of the focus of attention between structural science and engineering is characteristic of all investigations in computer science, it also holds for the treatment of pictorial data. On the one hand, particular abstract data types for pictorial representations are investigated and designed from a purely structural point of view. For example, efficiency properties are examined, or minimal sub-structures for particular tasks determined. On the other hand, concrete algorithms for, e.g., picture processing are “software-engineered” and used in diagnosis. Correspondingly, the papers collected in this issue exhibit a wide range between analytic investigations and constructive engineering.

The call for papers for this thematic issue of IMAGE listed in particular the following five ‘crystallization cores’ for a discussion of picture morphology from the perspective of computational visualistics:

  • Picture Morphology as Grammar: L-Systems and Similar Formalizations

  • Mereo-Geometrical Approaches to Picture Morphology

  • Pixemes in Non-Photorealistic Computer Graphics

  • Image Processing: Pixeme-Based Approaches to Picture Manipulation and Computer Vision

  • Glyphs and Icons: Pixemes in Information Visualization

With the exception of the fifth theme, each item has been covered by at least one contribution.

    The first two texts deal with the general question of the systems of pictorial syntax or morphology and their constituents. The contribution of Engelhardt (Netherlands) takes a perspective rather related to design and design theory, and proposes a set of building blocks, derived from the relevant literature, for formally describing all graphics. Three types of building blocks are distinguished: graphic objects, meaningful graphic spaces, and graphic properties. Although this system does not yet reach the formal stringency of the logical calculi employed, for instance, in the formal ontology of space, it provides a good entry point for the discussion of computational picture morphology.

    An overview of the formal representations of space in the field of formal ontology, a sub-domain of AI and cognitive science, is given by the contribution of Borgo et al. (Italy). Without putting too much stress on the (actually rather demanding) underlying logical and mathematical formalizations, these authors explain the advantages of mereogeometrical approaches in the cognitive dimension fitting the qualitative categorizations of the human access to space. From that perspective, Borgo and his colleagues discuss the application of mereogeometrical calculi to the description of pictorial morphology. While Engelhardt starts from more or less informal notions as used in design theories and proposes a systematic categorization of graphic objects, rules for their combination, and a typology of meaningful graphic space, Borgo moves from highly formalized concepts (which are elaborated in formal calculi) toward the more informal notions employed in pictorial syntax.

    With his survey on morphological models with L-Systems and relational growth grammars, Kurth (Germany) brings into the debate another meaning of the expression ‘morphological’ – a meaning more closely related to the word’s original, i.e., biological context: the knowledge about the bodily forms of living beings, and the rules of the arrangements of body parts and organs (especially in the temporal development). The special grammatical formalisms described by Kurth do not originally refer to pictures but to objects that are conceived of as being constructed by formally arranging parts in space by means of a quasi-biological manner of “growing”, and that are often depicted in order to be further studied or used. Therefore, this meaning of ‘morphological’ actually exceeds the borders of strict syntax – after all, the structures of the things depicted are actually in the range of semantics. Nevertheless, the formal options given by means of quasi-grammatical mechanisms for “growing” spatial arrangements of “body parts”, and the geometrical arrangement of pixemes are close enough to the discussion on pictorial morphology to shed further light on the latter.

    While Kurth is more interested in the arrangement, i.e., the spatial configuration of any kind of parts, the contribution of Isenberg (Canada) turns our focus of interest to the potential parts to be arranged by giving an overview of the techniques used to generate computer graphics apart from naturalistic – say: “photo-realistic” – representation styles. Contrasting the resulting images with the photorealistic case, Isenberg describes a wide range of morphological modifications possible with those techniques. Different rules for calculating shading, for example, lead to a picture that is internally modified; applying strokes or graftals corresponds to external modifications. Unlike the pixel, the morphological primitives used in NPR often carry a “meaning” beyond the syntactical level; saving the morphological structure with the picture is therefore, according to Isenberg, often quite helpful for subsequent processing.

    The paper of du Buf and Rodrigues (Portugal) also aims at non-photorealistic rendering, as the authors explain how computational models of neuro-physiological explanations of visual perception can be employed in order to generate painterly pictures. After giving us an extended overview of the relevant state of the art of neuro-physiological analyses, they consider the relation between bottom-up processing (pixels to higher pixemes) and top-down projections (from semantic entities to pixemes), and sketch a computational model of the visual system that can re-create a visual input in the form of a painting.

    A strictly engineering perspective is finally taken in the text of Hermes and the SVP Group (Germany), which also broadens the view to moving pictures: how can an acceptable movie trailer – as a kind of cinematic summary – be generated more or less automatically from the movie on basically syntactic principles? From the practical point of view, the theoretical discussion about the elements of picture morphology retreats behind the complicated concrete problems at hand. Due to that complexity, the task even has to be restricted to a certain genre (and is certainly bound by the current “taste” for trailer esthetics). The focus is mainly on “shots” and the transitions between them. The group presents a program system, the outcome of which has been empirically compared, with satisfying results, to commercial trailers produced in the ordinary way. In contrast to the neuro-physiologically inspired analysis of an input picture in the contribution of du Buf and Rodrigues, the input movie is analyzed with several standard techniques from computer vision like motion-based segmentation, and face detection and recognition (supplemented by a range of classification/recognition methods for acoustic input or even text) – techniques that are not necessarily cognitively adequate but basically optimized for the tasks they have to solve. Unlike the system of du Buf and Rodrigues, the final (re)creation of a (moving) picture depends on a separate set of templates following semantic and pragmatic aspects.

    As the thematic issue of IMAGE on computational image morphology attempts in particular to mediate between computational visualistics and other disciplines investigating pictures and their uses, a final chapter broadens the perspective again and relates the computational arguments of the preceding papers to the more general discussion of image science. There, I also extend the discussion to the question of syntactically ill-formed pictures and the limits of pictorial syntax or morphology.


    • [Asher & Vieu 1995] Asher, Nicholas & Vieu, L.: Toward a Geometry of Common Sense: A Semantics and a Complete Axiomatization of Mereotopology. In: IJCAI-95, 846–852.
    • [Clarke 1981] Clarke, B.: A calculus of individuals based on ‘connection’. Notre Dame Journal of Formal Logic 22(3):204–218, 1981.
    • [Ehrig & Mahr 1985] Ehrig, Hartmut, & Mahr, B.: Fundamentals of Algebraic Specification. Berlin: Springer, 1985.
    • [Goodman 1976] Goodman, Nelson: Languages of Art. An Approach to a Theory of Symbols. Indianapolis: Hackett, 1st ed. 1968, 2nd ed. 1976.
    • [Klee 1925] Klee, Paul: Pädagogisches Skizzenbuch. 1925. (Reprint: ed. by Hans M. Wingler. Berlin: Gebr. Mann, 4th ed. 1997).
    • [Metzger 1966] Metzger, Wolfgang: Figurale Wahrnehmung. In: Metzger W. (Ed.): Handbuch der Psychologie, Vol. 1, Göttingen: Hogrefe, 1966.
    • [Plümacher 1999] Plümacher, Martina: Wohlgeformtheitsbedingungen für Bilder? In: Sachs-Hombach, K. & Rehkämper, K.: Bildgrammatik: Interdisziplinäre Forschungen zur Syntax bildlicher Darstellungsformen. Magdeburg: Scriptum, 1999, 47–56.
    • [Pratt-Hartmann 2000] Pratt-Hartmann, Ian: Empiricism and Rationalism in Region-based Theories of Space. Fundamenta Informaticae 34:1–31, 2000.
    • [Ros 1999] Ros, Arno: Was ist Philosophie? In: Richard Raatzsch (ed.): Philosophieren über Philosophie. Leipzig: Leipziger Universitätsverlag, 1999, 36–58.
    • [Sachs-Hombach 2003] Sachs-Hombach, Klaus: Das Bild als kommunikatives Medium. Köln: Herbert von Halem, 2003.
    • [Sachs-Hombach & Schirra 2002] Sachs-Hombach, Klaus & Schirra, Jörg R.J.: Von der interdisziplinären Grundlagenforschung zur computervisualistischen Anwendung. Die Magdeburger Bemühungen um eine allgemeine Wissenschaft vom Bild. Magdeburger Wissenschaftsjournal 1/2002, 27–38.
    • [Sachs-Hombach & Rehkämper 1999] Sachs-Hombach, K. & Rehkämper, K.: Bildgrammatik: Interdisziplinäre Forschungen zur Syntax bildlicher Darstellungsformen. Magdeburg: Scriptum, 1999.
    • [Saint-Martin 1990] Saint-Martin, Fernande: Semantics of Visual Language. Bloomington: Indiana Univ. Press, 1990.
    • [Schirra 2005] Schirra, Jörg R.J.: Foundation of Computational Visualistics. Wiesbaden: DUV, 2005.
    • [Schirra & Sachs-Hombach 2006a] Schirra, Jörg R.J. & Sachs-Hombach, Klaus: Fähigkeiten zum Bild- und Sprachgebrauch. In: Deutsche Zeitschrift für Philosophie. Berlin, Band 54(2006)/6, pp. 887–905.
    • [Schirra & Sachs-Hombach 2006b] Schirra, Jörg R.J. & Sachs-Hombach, Klaus: Bild und Wort. Ein Vergleich aus bildwissenschaftlicher Sicht. In: Essener Linguistische Skripte – elektronisch (ELiSe), 6(2006)/1:51–72, 2006.
    • [Schlechtweg & Raab 1998] Schlechtweg, Stefan & Raab, A.: Rendering Line Drawings for Illustrative Purposes. In: Strothotte, Th. (ed.): Computer Visualization – Graphics, Abstraction, and Interactivity. Springer, Heidelberg, 1998, Chapter 4, 63–89.
    • [Strassmann 1986] Strassmann, S.: Hairy Brushes. In: Proceedings of SIGGRAPH'86, 225–232, 1986.
    • [Scholz 1999] Scholz, Martin: Gestaltungsregeln in der pictorialen Kommunikation. In: Sachs-Hombach, K. & Rehkämper, K.: Bildgrammatik: Interdisziplinäre Forschungen zur Syntax bildlicher Darstellungsformen. Magdeburg: Scriptum, 1999, 273–286.
    • [Vieu 1991] Vieu, Laure: Sémantique des relations spatiales et inférences spatio-temporelles : Une contribution à l’étude des structures formelles de l’espace en Langage Naturel. Toulouse: IRIT, Université Paul Sabatier, 1991.
    • [Wittgenstein 1953] Wittgenstein, Ludwig: Philosophical Investigations. New York: MacMillan, 1953.