
Cybergeo : Revue européenne de géographie, N°66, 08/12/1998

A Fuzzy Set Approach to Using Linguistic Hedges in Geographical Information Systems


Allan Brimicombe School of Surveying, University of East London, Longbridge Road, Dagenham, Essex RM8 2AS tel : 0181-849-3514 - fax : 0181-849-3618

Abstract : Spatial data quality has been attracting much interest. Much of the problem lies in the degree to which current data structures are unable to model the real world and the way imperfections in the data may propagate during analyses and cast doubt on the validity of the outcomes. Much of the research has concentrated on the quantitative accuracy of spatial data, the derivation of indices and their propagation through analyses. Geographical data invariably includes an element of interpretation for which linguistic hedges of uncertainty may be generated. The paper presents a new technique for handling such expressions in a GIS through fuzzy expectation, that is, intuitive probabilities linked to stylized fuzzy sets. By using fuzzy expectation as linguistic building blocks, many of the difficulties in using fuzzy set descriptors in GIS have been overcome. The stylized fuzzy sets can be propagated using Boolean operators to give a resultant fuzzy set which can be translated back into a linguistic quality statement. For the first time, linguistic criteria of fitness-for-use can be derived for GIS outputs regardless of the language being used.

Introduction
The problem of spatial data quality has been attracting much interest within the Geographical Information Systems (GIS) community. The general treatment to date, with its emphasis on accuracy and error, reflects the continuing conceptual closeness of digital map layers to their analog roots. Error is the deviation from the truth of observations and computations. This assumes that the truth can be known and that a measure of conformance with that truth, data accuracy, can be obtained. When dealing with digital spatial data, this focus is too narrow and only considers the reliability of the raw data. From a user's perspective, uncertainty is a notion (rarely quantified) of doubt or distrust in the results or outputs of GIS and is thus concerned with the fitness-for-use of information in the decision-making process.

Strategies for Reducing Uncertainty


A number of strategies for reducing uncertainty have been investigated and reported in the literature. These have most recently been reviewed by Unwin (1995). Veregin (1989), Chrisman (1991), Goodchild (1991), Lanter & Veregin (1992), Hunter & Goodchild (1993) and Hunter et al. (1995) provide more detailed treatments of the issues although, for the most part, with a narrower emphasis on error in spatial data. Much of the research has concentrated on the quantitative accuracy of spatial data, the derivation of indices and their propagation through analyses. Map accuracy testing invariably relies on well-defined, unambiguous points. Apart from such testing being logically flawed (Dobson, 1994) and lacking randomness, a high proportion of data used in a GIS does not exhibit the exactness that it requires. Error matrices (Congalton & Mead, 1983) and the indices derived from them are global measures for a dataset and fail to address the spatial distribution of errors.

Geographical data invariably includes an element of interpretation. Objects and their relationships often have to be described intuitively in the first place. Accuracy should properly be viewed as accordance of data to reality in the context of a fixed interpretation of reality. Some types of data rely on a very large measure of interpretation in their compilation, for example data from aerial photographic interpretation or from expert opinion. Likewise, for information output and cartographic communication, any quality measures available need to be interpreted as to their implication for the application in hand. It is these highly subjective elements that cause much of the difficulty with uncertainty and fitness-for-use in GIS.

Fuzzy Sets
Fuzzy sets (Zadeh, 1965) are used to handle the imprecision that characterises much of human reasoning. A fuzzy set assigns a level of membership in the range [0, 1] to each element x of a set A in a universe U :

Hence for intervals of x of 0.1 in the range [0, 1] :

The traditional binary (0, 1) can be viewed as crisp sets in the form :

(where ¬ is "not") and can hence be viewed as a special case of fuzzy sets. The concepts of crisp and fuzzy sets are illustrated in Figure 1.
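The notation can be sketched in code. The following is a minimal illustration, assuming the `x | membership` pairing used in equations (6) to (8) and a grid of 11 values of x at an interval of 0.1. The names `GRID`, `make_fuzzy_set` and `is_crisp` are hypothetical and do not come from the source.

```python
# Eleven values of x at an interval of 0.1 in the range [0, 1].
GRID = [i / 10 for i in range(11)]


def make_fuzzy_set(pairs):
    """Build a fuzzy set on GRID from {x: membership}; unlisted x get 0."""
    members = [0.0] * len(GRID)
    for x, m in pairs.items():
        i = round(x * 10)
        if not (0 <= i <= 10 and abs(x * 10 - i) < 1e-9):
            raise ValueError(f"x={x} is not on the 0.1 grid")
        if not 0.0 <= m <= 1.0:
            raise ValueError(f"membership {m} is outside [0, 1]")
        members[i] = float(m)
    return members


def is_crisp(fuzzy_set):
    """A crisp set is the special case where every membership is 0 or 1."""
    return all(m in (0.0, 1.0) for m in fuzzy_set)


# A fuzzy set with graded membership near x = 1 ...
fuzzy = make_fuzzy_set({0.8: 0.5, 0.9: 0.9, 1.0: 1.0})
# ... and a crisp set in which only x = 1 belongs.
crisp = make_fuzzy_set({1.0: 1.0})
```

Storing memberships as a fixed-length list indexed by grid position keeps later element-wise operations simple; it is one possible encoding, not the only one.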

Figure 1 : Graphic illustration of crisp and fuzzy sets

Fuzzy sets can be combined in Boolean operations and, in general, can be handled in much the same way as probabilities :
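A minimal sketch of such Boolean combination follows. It assumes Zadeh's standard operators (minimum for "and", maximum for "or", 1 minus membership for "not"); the source's own operator definitions are not reproduced here, so this choice is an assumption, as are the function names and the two example sets.

```python
def fuzzy_and(a, b):
    """Element-wise minimum (assumed operator for Boolean 'and')."""
    return [min(x, y) for x, y in zip(a, b)]


def fuzzy_or(a, b):
    """Element-wise maximum (assumed operator for Boolean 'or')."""
    return [max(x, y) for x, y in zip(a, b)]


def fuzzy_not(a):
    """Complement of each membership value."""
    return [1.0 - x for x in a]


# Two illustrative fuzzy sets on the grid x = 0, 0.1, ..., 1 (made-up values).
a = [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.6, 1.0, 0.6, 0.2, 0.0]
b = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0]

both = fuzzy_and(a, b)    # non-zero only where a and b overlap
either = fuzzy_or(a, b)   # covers the support of both sets
```

With crisp inputs (memberships of 0 or 1 only) these operators reduce to ordinary Boolean logic, which is consistent with crisp sets being a special case of fuzzy sets.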

Fuzzy set operations and the use of fuzzy sets in geography have recently been reviewed by Macmillan (1995). The term fuzzy has been introduced for handling uncertainty in GIS but for the most part the term has been loosely applied to any non-binary treatment of data, particularly probabilities. Although probabilities allow greater discrimination in the range [0, 1] they are crisp numbers and should not be confused with fuzzy sets. The use of fuzzy set theory proper has thus far been quite restricted (Unwin, 1995) and is reviewed in Altman (1994). One area of application has been to quantify verbal assessments of data quality from image interpreters and as a consequence of expert evaluations (Hadipriono et al., 1991 ; Sui, 1992 ; Gopal & Woodcock, 1994).

Table 1 provides a set of linguistic hedges for features and boundaries defined and encouraged for use in aerial photographic interpretation in terrain evaluation by the Geological Society Working Party on Land Surface Evaluation for Engineering Practice (Edwards et al., 1982). These are intended as qualitative indicators of data accuracy and reliability. The problem that arises, however, is that such standard linguistic hedges are themselves defined in terms of yet other hedges which, to each individual, may have nuances and different interpretations. When more than one language is considered, the problem of the meaning of linguistic hedges is compounded, apparently to the point of impossibility.

Table 1 : Linguistic hedges suggested for use in aerial photographic interpretation (from Edwards et al., 1982).

One of the earliest and continuing applications of fuzzy sets has been to represent linguistic hedges such as tall or short and to facilitate computing with words (Zadeh, 1996). Empirical studies of fuzzy set equivalents of linguistic hedges (Zadeh, 1972 ; Lakoff, 1973) have shown a general pattern of reduced spread in the fuzzy sets as they tend towards the more definite boundaries of 0 and 1. Examples of such fuzzy set representation of linguistic hedges (from Ayyub & McCuen, 1987) are :

Small, low, short or poor : A = { 0|1, 0.1|0.9, 0.2|0.5 } (6)
Medium or fair : A = { 0.3|0.2, 0.4|0.8, 0.5|1, 0.6|0.8, 0.7|0.2 } (7)
Large, high, long or good : A = { 0.8|0.5, 0.9|0.9, 1|1 } (8)

These are illustrated graphically in Figure 2 and reflect the empirically verified general pattern of reduced spread in the fuzzy sets as the linguistic hedge verges on the binary 0 or 1.
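The pattern of reduced spread can be checked directly on equations (6) to (8). The sketch below reads each pair as `x | membership` and measures spread as the number of listed grid values with non-zero membership; that measure, and the names `HEDGES`, `spread` and `peak`, are illustrative assumptions rather than definitions from the source.

```python
# Equations (6) to (8), read as x | membership pairs.
HEDGES = {
    "small, low, short or poor": {0.0: 1.0, 0.1: 0.9, 0.2: 0.5},
    "medium or fair": {0.3: 0.2, 0.4: 0.8, 0.5: 1.0, 0.6: 0.8, 0.7: 0.2},
    "large, high, long or good": {0.8: 0.5, 0.9: 0.9, 1.0: 1.0},
}


def spread(pairs):
    """Number of grid values of x with non-zero membership."""
    return sum(1 for m in pairs.values() if m > 0)


def peak(pairs):
    """Value of x at which membership is highest."""
    return max(pairs, key=pairs.get)


for label, pairs in HEDGES.items():
    print(f"{label}: peak at x={peak(pairs)}, spread={spread(pairs)} grid values")
```

On this measure the two hedges that peak at 0 and 1 cover three grid values each, while the mid-range hedge covers five.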

Figure 2 : Illustrative linguistic hedges as fuzzy sets.

Whilst the definition of a linguistic hedge as a fuzzy set can be a fairly straightforward process, the same cannot be said for the reverse. Faced with a fuzzy set which does not fit any pre-defined terms, it can be very difficult to associate it with an appropriate linguistic hedge. Furthermore, fuzzy sets are cumbersome to store in a database. Not only is the notation difficult to encode but there are 39,916,789 useful combinations of fuzzy sets in the range [0, 1] for an interval of xi=0.1. These are serious problems in the use of fuzzy sets and, whilst their use may at first appear an attractive solution (commented on by many authors), they have yet to find widespread use in GIS research or practice.
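The source does not say how the figure of 39,916,789 was derived. As a purely arithmetical observation, and not a claim about the author's method, the figure equals 11! - 11, where 11 is the number of values of x at an interval of 0.1 in the range [0, 1]. The short check below confirms the arithmetic; the variable names are hypothetical.

```python
import math

GRID_POINTS = 11  # x = 0, 0.1, ..., 1

# Arithmetic check only: 11! - 11 matches the count quoted in the text.
count = math.factorial(GRID_POINTS) - GRID_POINTS
print(f"{count:,}")
```

Whatever its derivation, a count of this size shows why storing arbitrary fuzzy sets against every spatial object is impractical.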

The Principles of Fuzzy Expectation


To overcome the problems associated with the practical use of fuzzy sets, a universal translator has been devised to assist the two-way translation of linguistic hedges and fuzzy sets (Figure 3). This comprises a series of 11 stylized fuzzy sets in the range [0, 1]. These 11 stylized fuzzy sets have a number of attributes that make them particularly useful as linguistic building blocks :
- they spread towards the mid-values (i.e. become more fuzzy as one is less certain about the data and thus conform with the linguistic use of fuzzy sets) ;
- they are a constant orthogonal distance from each other and thus unambiguously partition the fuzzy set space ;
- they normalize (i.e. where membership A(xi)=1) in a series from 0 to 1 at 0.1 intervals.
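To make these attributes concrete, here is a hypothetical construction of 11 stylized fuzzy sets. The actual shapes are defined in Figure 3 and are not reproduced here, so the triangular form and the widths below are assumptions chosen only to show the first and third attributes (wider spread towards the mid-values, and membership of 1 at x = 0, 0.1, ..., 1). The sketch does not attempt to reproduce the constant orthogonal distance property.

```python
def stylized_set(k):
    """Hypothetical stylized fuzzy set with membership 1 at x = k/10.

    Triangular shape (an assumption). Its half-width, in grid steps,
    grows from 1 at the extremes (k = 0 or 10) to 6 in the middle (k = 5).
    """
    half_width = 1 + min(k, 10 - k)
    return [max(0.0, 1 - abs(i - k) / half_width) for i in range(11)]


# One stylized set per grid value; list index k corresponds to x = k/10.
STYLIZED = [stylized_set(k) for k in range(11)]


def support_size(fuzzy_set):
    """Number of grid values with non-zero membership."""
    return sum(1 for m in fuzzy_set if m > 0)


supports = [support_size(s) for s in STYLIZED]
peaks = [s.index(1.0) for s in STYLIZED]
```

In this illustration the support widens steadily from the two extreme sets towards the middle set, mirroring the way linguistic hedges become fuzzier as certainty decreases.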

Figure 3 : A set of 11 stylized fuzzy sets

Users may, however, still be confused by the use of this metric. A common form of hedge, other than purely linguistic, is the intuitive (subjective) probability which individuals use in making judgments under uncertainty (Tversky & Kahneman, 1974). Thus an individual may say "I'm 90% sure". Given that each individual can have their own subjective equivalence between both types of hedge, linguistic and intuitive probability, each stylized fuzzy set can be 'labeled' or identified by an intuitive probability equivalent to the value of xi where A(xi)=1. This intuitive probability-like metric is named here fuzzy expectation, E (Figure 4).

Figure 4 : Fuzzy expectation (E) and underlying fuzzy sets.

Values of E are both the building blocks for fuzzy set representations of linguistic hedges and the means for translating fuzzy sets into qualifying statements of fitness-for-use. For example, as illustrated in Figure 5, during a session of interpreting aerial photographs, an interpreter marks or codes the class attribute (label) of a polygon with a linguistic hedge that expresses the interpreter's uncertainty. When recording data in the GIS, the interpreter defines an equivalence using E intuitive probabilities. Underlying stylized fuzzy set(s) are automatically substituted. A number of points can be noted about this process :

- a linguistic hedge may be equivalent to one or more values of E ; these values would normally be adjacent in the series (logically) but need not necessarily be so ;
- linguistic hedges can overlap in their E equivalence, showing that two linguistic hedges may be close in meaning ;
- where two or more values of E are used, they are not used singly but are combined using a Boolean or prior to propagation through analysis ;
- the linguistic hedges can be in any language where the person using that language can define E equivalence ;
- a table giving the linguistic hedges and their E equivalence used by an interpreter should be stored as metadata for future reference ;
- to overcome the problem of encoding and storing numerous cumbersome fuzzy sets, a linguistic hedge is simply stored within the data structure as one or more pointers (reflecting the values of E) to a lookup table containing the 11 stylized fuzzy sets.
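The pointer-and-lookup scheme described in these points might be sketched as follows. Everything specific here is hypothetical: the triangular shapes stand in for the stylized sets of Figure 3, the hedge labels and their E equivalences are invented for illustration, and maximum is assumed as the Boolean or.

```python
def stylized_set(k):
    """Hypothetical stylized set with membership 1 at x = k/10 (shape assumed)."""
    half_width = 1 + min(k, 10 - k)
    return [max(0.0, 1 - abs(i - k) / half_width) for i in range(11)]


# Lookup table of the 11 stylized sets; pointer k stands for E = k/10.
LOOKUP = [stylized_set(k) for k in range(11)]

# Invented example of an interpreter's equivalence table (kept as metadata).
# Each hedge is stored only as pointers into LOOKUP, never as a full fuzzy set.
HEDGE_TABLE = {
    "definite": (10,),
    "probable": (7, 8),
    "possible": (5, 6),
}


def resolve(hedge):
    """Substitute the stylized set(s) for a hedge, combined with Boolean or."""
    combined = [0.0] * 11
    for k in HEDGE_TABLE[hedge]:
        combined = [max(c, s) for c, s in zip(combined, LOOKUP[k])]
    return combined


probable = resolve("probable")  # union of the sets for E = 0.7 and E = 0.8
```

Because the table maps words to pointers, the same lookup table can serve hedges written in any language; only the equivalence table changes.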

It must be stressed that the process illustrated in Figure 5 is not proposing a fuzzy logic alternative to standard GIS analyses using Boolean logic. E is a means of encoding, propagating and decoding linguistically expressed uncertainty associated with spatial objects. E is designed for use with standard GIS.

Naturally, a user's linguistic expression of fitness-for-use will be different from the interpreter's linguistic hedges of uncertainty. But fitness-for-use statements such as "highly applicable" or "useless" are still qualifying statements to the same degree that linguistic hedges are. The problems of dealing with linguistic hedges, as discussed above, also apply to expressions of fitness-for-use. In the example given in Figure 5, linguistic hedges encoded as fuzzy sets through E have been propagated through a union overlay to give a resultant fuzzy set whose meaning may not be known. To match this propagated uncertainty with the relevant expression of fitness-for-use (in terms of E), the resultant fuzzy set needs to be equated with one of the stylized fuzzy sets and thus with one of the user's equivalent quality statements. This is achieved by calculating the relative Hamming distance between the resultant fuzzy set and all the stylized fuzzy sets in E, which in its standard form is :

\delta(A, B) = \frac{1}{n} \sum_{i=1}^{n} \left| A(x_i) - B(x_i) \right|    (10)

where A is the resultant fuzzy set, B is a stylized fuzzy set and n is the number of values xi (11 at an interval of 0.1). The shortest distance provides the match and hence the resultant fuzzy set can be translated into one of the user's linguistic criteria.
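The distance calculation can be sketched as below, assuming the standard definition of relative Hamming distance: the mean absolute difference in membership over the n grid values. The function name and the three example sets are illustrative and not taken from the source.

```python
def relative_hamming(a, b):
    """Mean absolute difference in membership between two fuzzy sets."""
    if len(a) != len(b):
        raise ValueError("fuzzy sets must share the same grid")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# A made-up resultant set and two made-up candidate sets on x = 0, 0.1, ..., 1.
resultant = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.6, 0.2, 0.0]
candidate_a = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0]
candidate_b = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

d_a = relative_hamming(resultant, candidate_a)
d_b = relative_hamming(resultant, candidate_b)

# The candidate at the shorter distance is the better match.
best = "candidate_a" if d_a < d_b else "candidate_b"
```

Dividing by n keeps the distance in the range [0, 1] regardless of the grid interval, so distances remain comparable if a different interval were used.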

Figure 5 : The process of encoding linguistic hedges of uncertainty as E, propagating them and then translating them into fitness-for-use statements.

The process is not without its problems. In propagating fuzzy sets through a union overlay (Boolean and), it is possible in some instances to arrive at a null fuzzy set (i.e. for all xi, A(xi)=0) which then cannot be resolved further. In such instances the fitness-for-use defaults to E=0. The use of relative Hamming distance can result in a tie between two adjacent values of E. In such cases the lowest value of E is taken. Finally, further research is required for a broader range of functionality.
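The two fallback rules (a null set defaults to E=0, and a tie goes to the lowest E) can be folded into the matching step. The sketch below is one way to do so under stated assumptions: the stylized sets are hypothetical triangular stand-ins for those in Figure 3, the distance is the standard relative Hamming distance, and a small tolerance is used to detect ties in floating-point arithmetic.

```python
def stylized_set(k):
    """Hypothetical stylized set with membership 1 at x = k/10 (shape assumed)."""
    half_width = 1 + min(k, 10 - k)
    return [max(0.0, 1 - abs(i - k) / half_width) for i in range(11)]


LOOKUP = [stylized_set(k) for k in range(11)]  # index k stands for E = k/10


def relative_hamming(a, b):
    """Mean absolute difference in membership between two fuzzy sets."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def match_expectation(resultant, tolerance=1e-12):
    """Return the value of E whose stylized set is nearest to the resultant set."""
    # Null fuzzy set: nothing to resolve, so default to E = 0.
    if all(m == 0 for m in resultant):
        return 0.0
    distances = [relative_hamming(resultant, s) for s in LOOKUP]
    shortest = min(distances)
    # Scanning from k = 0 upwards means a tie resolves to the lowest E.
    k = next(i for i, d in enumerate(distances) if d - shortest <= tolerance)
    return k / 10


# A set lying exactly halfway between the sets for E = 0.7 and E = 0.8.
halfway = [(a + b) / 2 for a, b in zip(LOOKUP[7], LOOKUP[8])]
```

Taking the lower E on a tie is the conservative choice described in the text: when the evidence does not separate two adjacent levels, the weaker statement of fitness-for-use is reported.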

Conclusions
A technique has been presented here that allows objective handling of linguistic hedges of uncertainty within GIS. By using fuzzy expectation as building blocks for translating linguistic hedges into stylized fuzzy sets, many of the difficulties of recording linguistic descriptors in GIS have been overcome. The stylized fuzzy sets can be propagated using Boolean operators to give a resultant fuzzy set which can be translated back into a linguistic quality statement. Thus linguistic criteria of fitness-for-use can be derived for GIS outputs regardless of the language being used.

Bibliography

Altman, D. (1994) Fuzzy set theoretic approaches for handling imprecision in spatial analysis, International Journal of Geographical Information Systems 8 : 271-289
Ayyub, B. and McCuen, R. (1987) Quality and uncertainty assessment of wildlife habitat with fuzzy sets, Journal of Water Resources Planning and Management 113 : 95-109
Chrisman, N. (1991) The error component of spatial data, in Geographical Information Systems Principles and Applications (eds Maguire et al.), Longman, Harlow, Vol 1 : 165-174
Congalton, R.G. and Mead, R.A. (1983) A quantitative method to test the consistency and correctness of photo-interpretation, Photogrammetric Engineering and Remote Sensing 49 : 69-74
Dobson, J. (1994) Face the ground truth about accuracy assessment, GIS World, November : 32-33
Edwards, R. ; Brunsden, D. ; Burton, A. ; Dowling, J. ; Greenwood, J. ; Kelly, J. ; King, R. ; Mitchell, C. and Sherwood, D. (1982) Land surface evaluation for engineering practice, Quarterly Journal of Engineering Geology 15 : 265-316
Goodchild, M. (1991) Issues of quality and uncertainty, in Advances in Cartography (ed. Muller), Elsevier, London : 113-140
Gopal, S. and Woodcock, C. (1994) Theory and methods for accuracy assessment of thematic maps using fuzzy sets, Photogrammetric Engineering and Remote Sensing 60 : 181-188
Hadipriono, F. ; Lyon, J. and Thomas, L. (1991) Expert opinion in satellite data interpretation, Photogrammetric Engineering and Remote Sensing 57 : 75-78
Hunter, G. ; Caetano, M. and Goodchild, M. (1995) A methodology for reporting uncertainty in spatial database products, URISA Journal 7(2) : 11-21
Hunter, G. and Goodchild, M. (1993) Managing uncertainty in spatial databases : putting theory into practice, URISA Journal 5(2) : 55-62
Lakoff, G. (1973) Hedges : a study in meaning criteria and the logic of fuzzy concepts, Journal of Philosophical Logic 2 : 458-508
Lanter, D. and Veregin, H. (1992) A research paradigm for propagating error in a layer-based GIS, Photogrammetric Engineering and Remote Sensing 58 : 825-833
Macmillan, W. (1995) Modelling : fuzziness revisited, Progress in Human Geography 19 : 404-413
Sui, D. (1992) A fuzzy GIS modeling approach for urban land evaluation, Computers, Environment and Urban Systems 16 : 101-115
Tversky, A. and Kahneman, D. (1974) Judgement under uncertainty : heuristics and biases, Science 185 : 1124-1131
Unwin, D. (1995) Geographical information systems and the problem of error and uncertainty, Progress in Human Geography 19 : 549-558
Veregin, H. (1989) A taxonomy of error in spatial databases, NCGIA Technical Paper 89-12, NCGIA.
Zadeh, L. (1965) Fuzzy sets, Information and Control 8 : 338-353
Zadeh, L. (1972) A fuzzy-set-theoretical interpretation of linguistic hedges, Journal of Cybernetics 2(3) : 4-34
Zadeh, L. (1996) Fuzzy logic = computing with words, IEEE Transactions on Fuzzy Systems 4(2) : 103-111

© CYBERGEO 1998
