Erlang

Budapest University of Technology and Economics
Object-oriented extension to Erlang

Paper for the Conference of Students' Scientific Associations 2008
Written by: Gábor Fehér, 5th year in Technical Informatics, Budapest University of Technology and Economics, feherga@gmail.com
Advisors: András György Békés, software developer, Ericsson Hungary, Andras.Gyorgy.Bekes@Ericsson.com; Dr. Péter Szeredi, Department of Computer Science and Information Theory, Budapest University of Technology and Economics, szeredi@cs.bme.hu
Abstract

Erlang is a functional programming language developed at Ericsson in the 1980s for telecommunication applications. It provides language-level elements to support distributed systems and fault tolerance. To structure the code of an Erlang program, one can split it into modules. Some basic data structures are also available: lists, records and tuples. However, there are no fully functional classes that encapsulate data and functionality and can be extended through inheritance. We think these features could promote code reuse in certain cases, so we decided to extend the language with object-oriented capabilities. Our goals were to preserve the single-assignment nature of Erlang and to keep method-call and value-access times constant. It was also a priority to make the extension easy to install, so that it can reach as many developers as possible. For this, I had to avoid changes to the Erlang compiler itself. Instead, I used the compiler's interface for language extensions, called parse transformation. It makes it possible to manipulate the syntax tree of a program after parsing but before compilation. I built the extension on this interface and implemented classes as the unification of modules and records, with method and field inheritance added. Strong evidence of its usability is that part of the program itself has been rewritten using the newly created language elements: the new version is simpler and cleaner than the original Erlang one. I also examined the other currently available object-oriented extensions for Erlang and compared them with ours. It turned out that while ours has strong advantages, it also lacks some features. Compatibility with records and speed are the main advantages. In this essay, besides describing and comparing our extension, I also show possible ways of adding the missing features.
Contents

1 Introduction
  1.1 The concepts of Erlang
  1.2 Object-oriented programming
  1.3 The outline of the paper
2 Related work
  2.1 Erlang behaviours
  2.2 Programming objects with funs
  2.3 Programming objects with modules
  2.4 eXAT Objects
  2.5 WOOPER: Wrapper for OOP in Erlang
3 The Erlang Class Transformation Extension Specification
  3.1 Motivation and design goals
  3.2 Overview
  3.3 Defining a class
  3.4 Method calls and -inheritance
  3.5 Defining a subclass
  3.6 Method inheritance
    3.6.1 Overriding a single clause
  3.7 Field manipulation
  3.8 Type checking
  3.9 Query expressions
  3.10 Update expressions
    3.10.1 The issue of side-effect variables
  3.11 Pattern-matches
  3.12 Patterns in clause heads
    3.12.1 An unbound variable in the pattern
    3.12.2 Atomic constants in the pattern
    3.12.3 Bound variables in the pattern
    3.12.4 Bound and unbound variables in embedded clauses
    3.12.5 Variables present in a pattern and also in the guard
    3.12.6 Variables appearing multiple times in patterns
    3.12.7 Composite subpatterns for field extraction
    3.12.8 Composite terms
    3.12.9 Summary of transformation
  3.13 Remote objects
4 Implementation
  4.1 Transformation modules
  4.2 Helper modules
  4.3 Base modules
  4.4 Runtime support modules
  4.5 Path of a processed module
  4.6 Client transformation
5 Performance analysis
  5.1 Measured properties
  5.2 Methods of measurement
  5.3 Results
    5.3.1 Dependence on the depth of class hierarchy
    5.3.2 ECT versus Erlang
    5.3.3 ECT versus other OOP
6 Conclusions and future work
  6.1 Future work
    6.1.1 Unique variable generation
    6.1.2 Integration with Erlang behaviours
    6.1.3 Use of upcoming Erlang features
    6.1.4 Better integration with the compiler
    6.1.5 New syntax for method-heads
    6.1.6 Restricted uses
References
A Overview of Erlang
  A.1 Functions
  A.2 Function clauses and guards
    A.2.1 Built-in functions
  A.3 Variables
  A.4 Tail-recursion optimisation
  A.5 Data types
    A.5.1 Primitive data types
    A.5.2 Compound data types
    A.5.3 Syntactic sugar data types
  A.6 Expressions
  A.7 Patterns
  A.8 Tuple manipulation
    A.8.1 Element extraction
    A.8.2 Updating elements
    A.8.3 Using BIFs
  A.9 Record manipulation
    A.9.1 Field extraction
    A.9.2 Updating fields
    A.9.3 Summary
  A.10 List manipulation
  A.11 Expression sequences in begin-end blocks
  A.12 Clause-based constructs
    A.12.1 Functions
    A.12.2 funs
    A.12.3 case construct
    A.12.4 Guard sequences
  A.13 Module attributes
B Detailed measurement results
  B.1 Erlang static function call
  B.2 Erlang dynamic function call
  B.3 Erlang fun call
  B.4 Erlang records
  B.5 ECT classes
  B.6 ECT remote classes
  B.7 eXAT classes
  B.8 Parameterized modules
  B.9 Wooper classes
1 Introduction
Computer programs can be large and complex. However, the amount of complexity a programmer can handle is limited. To bridge this gap, several techniques have emerged to structure programs into isolated segments, so that programmers can deal with them one by one. The level of isolation is a key point: the less the segments depend on each other, the easier it is to understand, develop or modify one of them. (On the other hand, within one program there must be some interdependence between the segments, otherwise it would not be one program.) Segmenting should therefore minimise the interdependences between pieces. This can be achieved by identifying the logically cohesive parts of the program and putting them into segments.

Structuring has two related aspects: structuring the data the program works on, and structuring the instructions of the program. The two are related because operations and the data they work on are interdependent.

The lifespan of a program may also be long, and maintenance is required throughout it. As mentioned above, structuring helps maintenance by making the program more understandable and its segments relatively easy to modify. The cost of software development is also an important factor. Reusing already written segments can decrease it, and segments that have been used before are less likely to contain errors.

The possibilities of segmenting a program are determined by the programming language. There are several examples: arrays and records (structs) for data, and functions and modules (units, source files) for operations. The structures for operations are also involved in the arrangement of data: functions have local variables and modules usually have global variables. It must be noted that, besides language elements, there are tools on higher abstraction levels to improve the structure of programs: design patterns on the level of a few objects, and design methodologies on the level of complex software systems.
This essay does not discuss these. It describes the implementation questions of a well-known structuring system, object-oriented programming, in the framework of a less well-known functional programming language: Erlang. Erlang has strong language elements for structuring programs, but it has no objects. In the following sections I show why objects can improve Erlang and how they can be implemented. Before that, I briefly introduce Erlang in the next subsection, and object-oriented programming and its relation to Erlang in the one after it.
1.1 The concepts of Erlang
In telecommunication there are large systems that need to run reliably and must scale well. Erlang [2] addresses these needs by segmenting programs into modules and processes. It also increases the isolation of program segments by forbidding global variables and destructive variable updates. The latter means that a variable can be assigned only once in its lifetime.

Modules are units of program code. They can be loaded and replaced at run time, which helps maintenance.

Processes are isolated in many ways: they do not share state or variables, and a fault in one process does not cause a fault in other processes. Processes communicate with asynchronous messages. Fault detection is also a form of communication: one process can detect that another has failed, and the fault can then possibly be corrected. This helps in building a reliable application from possibly unreliable components, as discussed in [1]. Erlang processes are lightweight: creation and message sending are relatively cheap compared to usual operating system processes or threads. [2] describes ways to design programs using these processes, and calls this Concurrency-Oriented Programming.

A brief introduction to Erlang can be found in Appendix A. More can be read in section 3 of [1] or in [2].
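The process model described in this subsection can be illustrated with a small sketch. The module name proc_demo, the message shapes and the exit reason simulated_fault are assumptions made only for this illustration; they do not come from the systems discussed in this paper. The sketch spawns an isolated worker, exchanges one asynchronous message with it, and then detects the worker's failure through a monitor.

```erlang
-module(proc_demo).
-export([run/0]).

%% Spawns a worker, exchanges one message with it, then observes its failure.
%% All names here are hypothetical and serve only as an illustration.
run() ->
    Parent = self(),
    %% spawn_monitor/1 creates the process and a monitor on it in one step
    {Pid, Ref} = spawn_monitor(fun() -> worker(Parent) end),
    %% asynchronous message: the send returns immediately
    Pid ! {double, 21},
    Reply = receive
                {result, R} -> R
            after 1000 -> timeout
            end,
    %% make the worker fail; the caller itself is not affected
    Pid ! crash,
    Reason = receive
                 {'DOWN', Ref, process, Pid, Why} -> Why
             after 1000 -> timeout
             end,
    {Reply, Reason}.

%% The worker holds no shared state; it only reacts to messages.
worker(Parent) ->
    receive
        {double, N} ->
            Parent ! {result, 2 * N},
            worker(Parent);
        crash ->
            exit(simulated_fault)
    end.
```

Calling proc_demo:run() returns the worker's reply together with the reason of its termination, which shows that the failure was observed by the caller without terminating it.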
1.2 Object-oriented programming
Object-oriented programming [7] (OOP) is another way to structure a program. In an object-oriented system, the segments of the program are objects. An object encapsulates data and functionality. If related data and functionality can be distinguished in the system, using objects can improve segmenting. Isolation is possible by defining certain data elements or functions to be accessible only from the functions of the object¹.
¹ For data elements this is called data hiding.
Objects are grouped into classes: instances of a class have the same data structures and functionality, but may hold different values in their data structures. Classes can be extended with data structures or functionality to create new classes. An important aspect of OOP languages is polymorphism: objects of an extended class can be used anywhere objects of the original class can be used. This makes it possible to reuse code written for objects. I highlight some costs and benefits of OOP based on [7].

Benefits:
- If a suitable algorithm works with objects, it can also be used with extended objects, and in this way even its behaviour can be altered.
- If a program contains suitable classes, new functionality can be added to the program without modifying these classes, just by extending them.

Costs:
- Programmers need to learn to apply the concepts of classes, inheritance and polymorphism.
- Programmers need to learn when to use classes and when not to use them.
- Inefficiency: run-time overhead and storage overhead.

Erlang has several features that provide the above benefits, but they require more work from the programmer. By introducing objects, my aim is to generalise and simplify these solutions. It is important to note what is not my intention: I am not claiming that the use of objects will always improve any program written in Erlang. A parallel can be drawn with the processes of Concurrency-Oriented Programming: applied with care, they are useful for building a concurrent program, but used carelessly they provide no benefit; for example, evaluating each function call in a new process is too resource-hungry.

In this essay I will not deal with the questions of object-oriented design. Most of the examples will be so short that they cannot show the benefits of using objects. The only exception is an example in 4.6, where objects made a program simpler and more extensible.
1.3 The outline of the paper
The paper is structured as follows. In Section 2 I describe the available object-oriented implementations and similar techniques offered by Erlang, and highlight for each why it is insufficient. In Section 3 I specify the exact objectives for our object-oriented system and describe the new language elements in detail. In Section 4 I show the architecture of the implemented system. I measured the speed of the implemented system and of other object-oriented systems for Erlang; in Section 5 I describe the measurement process and present the results. In Section 6 I summarise the achieved results and show possible directions for development. In Appendix A I introduce the subset of Erlang that I use in this essay. In Appendix B I show the detailed results of the measurements described in Section 5. The sheets presented there provide information about the accuracy of the measurements.
2 Related work
I collected and tested the following object-oriented and semi-object-oriented approaches for Erlang. I found that there are basically two approaches. In the first, an object is represented by a data structure containing field values and type information. In the second, not only a data structure but also one or more Erlang processes represent the instance of a class. In the general pattern, when the client creates such an object, a process is created and its pid² is handed to the client. This newly created process holds the field values of the object and runs in a loop, waiting for messages from clients. A client can then send a message to query or update a field, or to call a method of the object; another message can terminate the loop and thus delete the object. In response, the server sends the value back, updates the field, calls the method or terminates the loop. The method is called from the loop of the server, therefore it runs asynchronously to the client. The message passing causes a time overhead, which I measure in 5.3. However, there are cases where this structure is desired, for example for simulating multi-agent systems [11] and for distributed component systems.
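The general object-as-process pattern described above can be sketched in a few lines. This is only a minimal illustration with a single field; the module name field_obj, the function names and the message formats are hypothetical and are not taken from eXAT or WOOPER.

```erlang
-module(field_obj).
-export([new/1, read/1, write/2, delete/1]).

%% Creates the "object": a process whose loop argument holds the field value.
%% The returned pid is the handle the client uses from then on.
new(Initial) ->
    spawn(fun() -> loop(Initial) end).

%% Queries the field: send a request, then wait for the reply.
read(Pid) ->
    Pid ! {read, self()},
    receive
        {value, Pid, Value} -> Value
    after 1000 -> exit(timeout)
    end.

%% Updates the field; no reply is awaited.
write(Pid, Value) ->
    Pid ! {write, Value},
    ok.

%% Terminates the loop and thus deletes the object.
delete(Pid) ->
    Pid ! stop,
    ok.

%% The server loop: the "destructive update" is a recursive call
%% with a new argument.
loop(Value) ->
    receive
        {read, From} ->
            From ! {value, self(), Value},
            loop(Value);
        {write, New} ->
            loop(New);
        stop ->
            ok
    end.
```

Every access goes through a message round trip, which is the source of the time overhead mentioned above. Note also that read/1 can return different values for the same pid at different times, so from the client's point of view the field is no longer single-assignment.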
2.1 Erlang behaviours
Erlang has a feature similar to interfaces in object-oriented programming: behaviours. A behaviour defines a set of functions. A module can state that it implements a behaviour with the attribute -behaviour(behaviour_name). When this is the case, the compiler checks whether the module really implements the functions listed in the behaviour. This is generally used in Erlang to define interfaces for callback modules. For example, the name of a module that implements the gen_server behaviour can be passed to the gen_server:start_link function, which starts a server process and calls back³ the given module at certain events of the process. As an example, here is the list of callback routines of gen_server: init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2, and code_change/3. Behaviours do not provide means for handling inheritance or for structuring data.
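As an illustration of a callback module, the following minimal sketch implements the six gen_server callbacks listed above. The module name kv_store and its store/3 and lookup/2 functions are hypothetical names chosen for this example; only the gen_server functions and callback names come from Erlang/OTP.

```erlang
-module(kv_store).
-behaviour(gen_server).

%% Client API (hypothetical names chosen for this sketch)
-export([start_link/0, store/3, lookup/2]).
%% Callbacks required by the gen_server behaviour
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Passes the name of this module to gen_server, which calls it back later.
start_link() ->
    gen_server:start_link(?MODULE, [], []).

store(Pid, Key, Value) ->
    gen_server:cast(Pid, {store, Key, Value}).

lookup(Pid, Key) ->
    gen_server:call(Pid, {lookup, Key}).

%% The state of the server is a list of {Key, Value} pairs.
init([]) ->
    {ok, []}.

handle_call({lookup, Key}, _From, State) ->
    {reply, proplists:get_value(Key, State), State}.

handle_cast({store, Key, Value}, State) ->
    {noreply, [{Key, Value} | proplists:delete(Key, State)]}.

handle_info(_Msg, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

The behaviour fixes which functions the module must export, but the layout of the state and any reuse between similar callback modules are left entirely to the programmer.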
2.2 Programming objects with funs
As in all languages that support late binding of calls, it is possible to write programs in Erlang that simulate the behaviour of objects. In Erlang, the methods of a class can be funs⁴. The data structure of each object contains funs that point to its method functions. Users of an object need to call the methods through these funs. This makes inheritance and overriding possible: an object of an inherited class can have a fun at the same place in its data structure as an object of its superclass, with either the same or overridden functionality.
² Process identifier, see A.5.1.
³ In Erlang it is possible to call a function from a module whose name is not known at compile time, but constructed only at runtime, see section 2.3.
⁴ See section A.5.1.
[Figure: Implementing objects using funs. Panels: methods of class A (function1, function2); methods of class B, which extends A (function3, function2); an object of class A (method1, method2); an object of class B (method1, method2, method3).]

The issue here is that the programmer needs to make sure that the funs of each new object point to the appropriate functions. This can be done by writing constructors. Another issue is the data structure, which the programmer must choose, implementing its inheritance when needed.
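The technique can be sketched as follows. The sketch assumes that an object is a plain tuple of funs and that a method is identified by its position in the tuple; the module name fun_objects, the constructors new_a/0 and new_b/0 and the helper call/3 are hypothetical names used only for this illustration.

```erlang
-module(fun_objects).
-export([new_a/0, new_b/0, call/3]).

%% Constructor of class A: the object is a tuple of funs,
%% {Method1, Method2}.
new_a() ->
    {fun a_function1/1, fun a_function2/1}.

%% Constructor of class B (extends A): method1 is inherited,
%% method2 is overridden, method3 is new. The constructor is
%% responsible for putting each fun at the right position.
new_b() ->
    {fun a_function1/1, fun b_function2/1, fun b_function3/1}.

%% Late-bound call: fetch the fun stored at the given position
%% of the object and apply it.
call(Object, Index, Arg) ->
    Method = element(Index, Object),
    Method(Arg).

%% Method implementations; they only tag their argument so that
%% the caller can see which implementation ran.
a_function1(X) -> {a1, X}.
a_function2(X) -> {a2, X}.
b_function2(X) -> {b2, X}.
b_function3(X) -> {b3, X}.
```

A client that calls call(Object, 2, Arg) works with objects of both classes and reaches the overridden implementation for objects created with new_b/0. The sketch also makes the issues named above visible: nothing checks that the constructors fill the tuple consistently, and field data would need a layout of its own.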
2.3 Programming objects with modules
Erlang modules can be used to program polymorphism. The following syntax makes it possible:

    ModuleName = math,
    X = ModuleName:sqrt(16.0).

This way the module of the called function can be determined by a variable. However, this does not support inheriting functions, and no data structure is introduced.

Parameterized modules [4] address these issues. They are already built into the Erlang distribution, but they are not yet officially documented, and programmers are not recommended to use them. It is therefore possible that their behaviour changes from release to release. Because of this, I describe the behaviour I experienced in Erlang/OTP R12B-4, released on 3rd September, 2008. Parameterized modules provide a way to define parameters for a module. Functions of such a module cannot be called directly. Instead, an instance of the module must be initialised with parameter values, and the functions of that instance can be called. For example:
    -module(pmodule, [Par1, Par2]).   % Par1 and Par2 are the parameters
    -export([pfunction/1]).

    pfunction(A) ->
        A*Par1+Par2.

and this can be called like this:

    M = pmodule:new(10, 2),
    X = M:pfunction(4).   % X will be 10*4+2 = 42
As you can see, the parameter variables of the module are implicitly available in the functions. Erlang modules also support virtual function inheritance [5], but this is also an unofficial new feature. Exported functions of a base module that are not implemented in the extension module are inherited by the extension module. Field inheritance is also available, but not fully supported. Consider the following example of three modules: beta extends alpha with an additional field Z, and gamma extends beta with U.
    -module(alpha, [X, Y]).
    -export([getthis/0, getxy/0]).

    getthis() ->
        THIS.   % the module instance (for parameterized modules)

    %% returns all fields in a tuple
    getxy() ->
        {X, Y}.

    -module(beta, [Z]).
    -extends(alpha).
    -export([new/3, getthis_beta/0, getxyz/0]).

    new(X, Y, Z) ->
        A = ?BASE_MODULE:new(X, Y),
        instance(A, Z).

    getthis_beta() ->
        THIS.

    %% returns all fields in a tuple
    getxyz() ->
        %{X, Y, Z}.   % not possible, causes compile error: X and Y are unbound
        {beta, {alpha, X, Y}, _} = THIS,   % (*)
        {X, Y, Z}.

    -module(gamma, [U]).
    -extends(beta).
    -export([new/4]).

    new(X, Y, Z, U) ->
        B = ?BASE_MODULE:new(X, Y, Z),
        instance(B, U).

The programmer needs to create such new/N constructors to initialise the fields of base modules. Enter the following expressions into the Erlang shell:

    1> A = alpha:new(1, 2).
    {alpha,1,2}
    2> B = beta:new(1, 2, 3).
    {beta,{alpha,1,2},3}
    3> C = gamma:new(1, 2, 3, 4).
    {gamma,{beta,{alpha,1,2},3},4}
    4>

The value of A shows that an instance of a simple (non-extension) parameterized module is represented the same way as a record. But according to B and C, instances of extended modules differ: the fields of the base modules are carried in recursively nested tuples of base modules.
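[Figure: representation of a gamma instance. Parameterized modules: nested structures gamma, beta and alpha, holding U, Z and X, Y respectively. Records: a single gamma structure holding X, Y, Z, U.]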
This implies that fields can only be accessed or updated if we know which base module defines them. This explains the line marked with (*) in beta:getxyz(). I further examined the semantics of inherited functions:

    1> A = alpha:new(a, b).
    {alpha,a,b}
    2> A:getxy().
    {a,b}
    3> A:getthis().
    {alpha,a,b}
    4> B = beta:new(aa, bb, cc).
    {beta,{alpha,aa,bb},cc}
    5> B:getxy().
    {aa,bb}
    6> B:getxyz().
    {aa,bb,cc}
    7> B:getthis().
    {alpha,aa,bb}
    8> B:getthis_beta().
    {beta,{alpha,aa,bb},cc}
    9>

The lines of the above shell session:

1> Instantiate an alpha as variable A.
2> Query both fields of A.
3> Get the instance variable of A. This is A itself.
4> Instantiate a beta as variable B.
5> Get the fields inherited from alpha from B, with the function getxy, which is also inherited.
6> Get the inherited and own fields of B, with the function getxyz. The commented-out line in this function shows that the inherited fields are not implicitly visible in the functions of extended modules. Instead, the programmer must extract them manually.
7> Get the module instance variable of B, with the inherited function getthis. This is not B, but the alpha structure nested in B. The problem here is that the inherited function only sees this nested instance, therefore this function should be overridden in descendant modules. For this example to work, I did not override it, but created another function under a new name:
8> Returns the instance variable of B. This is B itself.
The form of the module data structures suggests that Erlang has an undocumented calling convention. The following test succeeds:

    -module(mctest).
    -export([t/0, t/1]).

    t() ->
        {mctest, bla, bla}:t().

    t(X) ->
        io:format("t/1: my parameter was: ~p~n", [X]).

Calling t/0 will output: t/1: my parameter was: {mctest,bla,bla}.

The fact experienced in 7> means that functions in base modules can only see the stripped-down module instance. This has troublesome consequences. First, the base module cannot invoke the overridden implementation of its functions, although this is the desired behaviour in object-oriented languages. Another consequence is a problem with updating fields in base modules. By update, I mean creating a copy of the module instance in which some of the fields are changed. The problem is that a field update in the base module must be implemented in all extension modules, because the function inherited from the base module cannot see and copy those parts of the instance that belong to an extension module. The following diagram shows the case when a function in a base class updates some variables and returns the module/object instance.
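[Figure: a base-class function updates X and Y and returns the instance. Parameterized modules: the input is the nested structure gamma, beta, alpha holding X, Y, Z, U, but the result is only alpha holding X', Y'. Desired behaviour: a gamma (possibly with more metainformation) holding X, Y, Z, U becomes a gamma holding X', Y', Z, U.]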
Apart from implementing the functions that involve field updates in all extension modules, I could not find any other way to work around this problem.
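The update problem can be reproduced with plain tuples that have the same shape as the instances printed in the shell sessions above. The following sketch does not use parameterized modules at all; the module name update_demo and all of its function names are hypothetical and serve only to contrast the nested layout with a flat, record-like one.

```erlang
-module(update_demo).
-export([alpha_set_x/2, beta_set_x/2, gamma_set_x/2, flat_set_x/2]).

%% What a base-module function can do in the nested layout: it only
%% receives the inner {alpha, X, Y} tuple, so the copy it returns
%% carries no beta or gamma data.
alpha_set_x({alpha, _X, Y}, NewX) ->
    {alpha, NewX, Y}.

%% Every extension has to repeat the update by hand:
%% unwrap its base part, delegate, and wrap the result again.
beta_set_x({beta, Alpha, Z}, NewX) ->
    {beta, alpha_set_x(Alpha, NewX), Z}.

gamma_set_x({gamma, Beta, U}, NewX) ->
    {gamma, beta_set_x(Beta, NewX), U}.

%% Flat, record-like layout {Class, X, Y, ...}: the field X is at the
%% same position in every class, so one function serves all of them.
flat_set_x(Object, NewX) ->
    setelement(2, Object, NewX).
```

With the flat layout a single update function written for the base class also preserves the fields of every extension, which corresponds to the desired behaviour in the diagram above.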
2.4 eXAT Objects
eXAT [11] [10] is an experimental agent-programming platform written in Erlang. It provides object-oriented programming features for the programming of agents. Objects run in separate processes. It includes virtual method inheritance, in such a way that not only whole methods but also single clauses can be overridden. Fields are declared per instance, the first time they are given a value. They behave much like a dictionary. eXAT classes can be defined as modules. According to [14], every instance of an object has two associated Erlang processes: one of them serves the method call requests, and the other serves the field manipulation requests. This implies that every function call and field manipulation involves inter-process message sending. This is hidden from the programmer: the methods and fields of an object can be accessed through the functions of the object module. The definition of a simple class:
    -module(alpha).
    -export([extends/0, class1/1, fact/2]).

    extends() -> nil.

    class1(Self) ->
        io:format("alpha constructor~n", []),
        ok.

    fact(Self, 0) -> 1;
    fact(Self, N) -> N * object:call(Self, fact, [N-1]).

We can try out this class by entering the following commands into the Erlang shell:

    1> A = object:new(alpha).
    {object,alpha,alpha,<0.33.0>,<0.34.0>}
    2> object:call(A, fact, [7]).
    5040
    4> object:set(A, field1, 11).
    11
    5> object:set(A, field2, 22).
    22
    6> object:get(A, field1).
    11
    7> object:call(A, fact, [-2]).

In this shell session, we first instantiate an alpha. After this, we call the fact method, set the fields field1 and field2, and query the value of field1. The last expression causes an endless recursion. The object instance can be deleted with:

    8> object:delete(A).

To avoid the endless recursion, we can extend the previous class in the following way.
    -module(beta).
    -export([extends/0, class2/1, fact/2]).

    extends() -> alpha.

    class2(Self) ->
        io:format("beta constructor~n", []),
        object:super(Self, class1).

    fact(Self, X) when X < 0 -> 'invalid number'.

Calling the fact method of a beta object behaves as inherited from alpha unless X < 0 is true. In that case, the overriding clause is executed.
2.5 WOOPER: Wrapper for OOP in Erlang
WOOPER [3] is also a process-per-object class implementation for Erlang. It supports virtual methods, and single clauses can be overridden separately. For fields, it uses a dictionary approach similar to that of eXAT. An advantage over eXAT is the possibility of multiple inheritance. A possible disadvantage is that the programmer needs to send the messages for method calls by hand. However, this can be useful in certain cases: for example, more than one method call can be initiated at a time, and the response messages can be processed asynchronously in a batch. To create a class, its main parameters, such as the exported methods and the constructor parameters, should be defined as Erlang macros. In methods, the state of the object and the return values should also be controlled with macros:
    -module(class_Alpha).

    % list of superclasses: empty
    -define(wooper_superclasses, []).

    % parameters of constructor
    -define(wooper_construct_parameters,Field1, Field2).

    % exported constructors
    -define(wooper_construct_export,new/2,new_link/2,construct/3).

    % exported methods
    -define(wooper_method_export,getField1/1,setField1/2,op/4).

    -include("wooper.hrl").

    construct(State, ?wooper_construct_parameters) ->
        % sets fields of the object, here they are called attributes
        ?setAttributes(State, [ {field1,Field1}, {field2, Field2} ] ).

    getField1(State) ->
        % returns the value of ?getAttr(field1), which is the value of field1
        ?wooper_return_state_result(State,?getAttr(field1)).

    setField1(State,NewF1) ->
        % returns the state updated by ?setAttribute(State,field1,NewF1)
        ?wooper_return_state_only(?setAttribute(State,field1,NewF1)).

    op(State, add, X, Y) ->
        ?wooper_return_state_result(State, X-Y);
    op(State, mul, X, Y) ->
        ?wooper_return_state_result(State, X*Y).

The use of the class:
    1> A = class_Alpha:new(10, 20).
    <0.33.0>
    2> A ! {getField1, [], self()}.
    {getField1,[],<0.31.0>}
    3> receive {wooper_result, Res1} -> Res1 end.
    10
    4> A ! {setField1, [20]}.
    {setField1,[20]}
    5> A ! {getField1, [], self()}.
    {getField1,[],<0.31.0>}
    6> receive {wooper_result, Res2} -> Res2 end.
    20
    7> A ! {op, [add, 3, 2], self()}.
    {op,[add,3,2],<0.31.0>}
    8> receive {wooper_result, Res3} -> Res3 end.
    1

class_Beta corrects the add clause of the op method, and adds a new clause which divides its two arguments. It also defines a new field named field3.
    -module(class_Beta).

    -define(wooper_superclasses, [class_Alpha]).
    -define(wooper_construct_parameters,Field1, Field2, Field3).
    -define(wooper_construct_export,new/3,new_link/3, construct/4, op/4).

    -include("wooper.hrl").

    construct(State, ?wooper_construct_parameters) ->
        % initialise fields of class_Alpha:
        AState = class_Alpha:construct(State, Field1, Field2),
        ?setAttributes(AState, [ {field3, Field3} ] ).

    op(State, add, X, Y) ->
        ?wooper_return_state_result(State, X+Y);
    op(State, div, X, Y) ->
        ?wooper_return_state_result(State, X / Y).
The Erlang Class Transformation Extension Specification
In this section I describe the object-oriented extension I created for Erlang. Its name is ECT, which stands for Erlang Class Transformation. In the first subsection I state the goals for ECT. In the subsequent subsections I describe the design decisions required to satisfy these goals, and their results: the new syntax elements and their semantics.
3.1
Motivation and design goals
In the following list, I summarise the necessary features for our object-oriented system. I also highlight some goals that have been motivated by the related works (section 2).
- Field inheritance (on the term field, see footnote 5).
- Method inheritance and virtual methods that can be overridden.
- Polymorphism of instances: if class A is an extension of B, then it should be possible to use A in any place where B can be used.
- Uniform access to fields: inherited parameterised modules showed that it is inconvenient when access to a field depends on the class where it was defined.
- Single-assignment behaviour of fields: one problem with the object-as-process paradigm is that it breaks the single-assignment nature of Erlang, because a field of an object can be updated destructively an arbitrary number of times. Single-assignment variables are important in Erlang for better support of concurrency, see pages 31-32 of [2].
- Provide objects-as-processes optionally. For special uses (e.g. in section 2.4), supporting this might be useful.
- Only O(1)-time runtime overhead. The performance measurement results presented in section 5 show that the method-call times of certain OOP systems increase with the depth of inheritance. We should keep this constant.
All this should be achieved without modifying the current Erlang distribution. It should behave like a plugin, to make it easily usable. I leave out the following possible features:
- Single-clause overriding. I will show that it can be simulated easily by the use of full-method overriding.
- Multiple inheritance. According to section 8.6 of [7], multiple inheritance has several issues, and can be left out from an object-oriented language.
3.2
Overview
To create a system satisfying the above requirements, we have chosen an Erlang technology named Parse Transformation (see page erl_id_trans of [13]). It can be used to preprocess Erlang code at module level. A parse transformation is defined by a module. It can be applied to an arbitrary Erlang module, by a compiler switch or by specifying it in the file of the module. When applied, the transformation module gets the parsed source code of the transformed module before compilation, transforms it, and then the result is compiled. There are two options to extend a language with such a parse transformation. First, we can use certain syntactically correct elements, and give them new semantics by the transformation. Second, we can use elements that are invalid, but accepted by the parser, and give new semantics to them. An example of the first will be pattern-matching with objects, and an example of the second will be method calls.
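The mechanism can be illustrated with a very small parse transformation. The following is only a sketch under stated assumptions, not the classtrans module of ECT: the module name class_attr_sketch is hypothetical, and the only thing it does is rewrite a -class(Name) attribute form into a -module(Name) form, leaving every other form untouched.

```erlang
-module(class_attr_sketch).
-export([parse_transform/2]).

%% Entry point called by the compiler: receives the list of abstract
%% forms of the transformed module and must return a list of forms.
parse_transform(Forms, _Options) ->
    [rewrite(Form) || Form <- Forms].

%% Hypothetical rule: a -class(Name) attribute becomes -module(Name).
rewrite({attribute, Anno, class, Name}) when is_atom(Name) ->
    {attribute, Anno, module, Name};
%% Every other form is passed through unchanged.
rewrite(Other) ->
    Other.
```

A module can request such a transformation in its own file with -compile({parse_transform, class_attr_sketch}). A real transformation such as the one described here has to walk into function bodies as well, which this sketch does not attempt.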
5 In other terminologies these are called attributes, or the state of the object. I will use the term field, because I build objects on records that have fields. The other reason is that Erlang modules can have attributes, but those are more similar to constants.
Our extension distinguishes two types of modules. Class modules begin with -class(classname). instead of the original -module(modulename)., and each such module defines exactly one class. Client modules begin with the usual -module(modulename).. They differ from class modules in that they do not define classes, only use them, so they are more similar to normal Erlang modules. Class-aware pattern matching, expressions and method calls are available in both module types. Modules of both types use the same parse transformation named classtrans, which detects the -class attribute and performs the appropriate class module or client module transformations. Software developers using this system do not need to know about this parse transformation: the preferred way of using it is the inclusion of ect.hrl in each module, which instructs the compiler to run the parse transformation, and defines two necessary macros for class definitions. In fact, this inclusion can also be omitted in certain cases, which will be described later.
3.3
Dening a class
A class is a bundle of a module and a record (see footnote 6) having the same name. Some of the functions of the module are the methods of the class, and all of the fields of the record are the fields of the class. This means that a module defines exactly one class. When defining a class, the -module(modulename) attribute at the beginning of the file should be replaced with -class(classname). This can be followed by three possible types of optional class-specific attributes:
- -superclass(superclass_name): defines the superclass by its name.
- ?FIELDS(field1 = value1, field2, ...): fields of the class in the same format as in record definitions. After transformation, they will be added to a record which has the same name as the class. Its definition will also contain the fields of the superclasses and an extra field for administration.
- ?METHODS([methodname/arity, ...]): list of methods in the usual Erlang export syntax. (Methods are always automatically exported.)
The transformer generates a new record type, and users of the class need a copy of its definition to use it. For them, the transformer creates an includable header file with the definition and with some other administrative attributes that are necessary for the compilation of client modules. It has the same name as the module, but with a .class.hrl extension (see footnote 7) instead of .erl. It also contains the inclusion of ect.hrl, so that if a module uses a class, our parse transformation will be automatically activated. Note that the transformer inserts the same source code sequence at the beginning of the class module as into the hrl file. Let us see an example of a simple class definition:
    -include_lib("ect/include/ect.hrl").

    -class(class1).

    ?FIELDS({field1,field2 = 2,field3}).
    ?METHODS([method1/2, method2/2]).

    -export([notmethod/1]).

    method1(This, X) ->
        {This, X+1}.

    method2(This, a) -> 1;
    method2(This, b) -> 2.

    notmethod(A) -> A+1.
6 See section A.5.3 on details of records.
7 hrl is the standard extension for files to be included: header files. class is added to distinguish these generated headers.
In the above code, we defined a class with three fields. Field field2 has a default value. There are also two methods: they are almost the same as ordinary functions. The only difference at transformation time is that non-method functions like notmethod/1, which is not in the ?METHODS list, will not be inherited by subclasses, while methods will be. This class has no superclass, therefore it will not inherit any methods. When calling a method of an object, the object is automatically passed as the first parameter. It is analogous to the this pointer in imperative languages. The name of the first parameter is not checked and can be any valid variable name. However, I will refer to it by the name This, and I recommend the use of this name. Variables are immutable in Erlang, therefore if a method changes the value of a field, it must return the updated This object. Method method1/2 shows the preferred way to return the modified object along with a return value. Note that the object is not modified here, nor is the content of This used in any of the above example methods. We will see in the next section that there is no need to transform functions to make them accessible by method calls. This also means that even non-method functions in the module of the class can be called with method-call syntax, thus passing an object as the first parameter. The transformer does the following with the file of class1:
- omits the class-specific attributes from the source code,
- generates class1.class.hrl,
- replaces attribute -class with -module at the beginning.
The resulting include file will be:
    -include_lib("/ect/include/ect.hrl").
    -record(class1,{_types = {class1},field1,field2 = 2,field3}).
    -classlevel({class1,1}).
    -vmt({class1,[{method1,2,class1},{method2,2,class1}]}).

All attributes contain the class name, so included headers of different classes in the same module will not interfere with each other. Attribute -classlevel and the record field _types are for O(1)-time runtime type checking (see section 3.8). Attribute -vmt stores, for each method, the most specific class that implements it. This is for efficient method overriding in subclasses. For now, both methods are implemented in class1, and the source code of the generated module is (see footnote 8):
    -module(class1).
    -export([notmethod/1]).
    -export([method1/2,method2/2]).
    -include("class1.class.hrl").

    method1(This, X) ->
        {This,X + 1}.

    method2(_, a) -> 1;
    method2(_, b) -> 2.

    notmethod(A) -> A+1.
3.4
Method calls and -inheritance
To understand method overriding, we must first discuss our method calling mechanism. When an object (an instance of a class) has a method, it can be called from any module using our transformation (see footnote 9), with the following syntax:
8 This, and the following generated codes are shortened, but otherwise identical forms of the real transformation output.
9 Class or client modules. In general, what can be done in client modules can also be done in class modules.
    -include("class1.class.hrl").
    ...
    %% create a class instance:
    Instance = #classname{initialisers, exactly like for records},
    %% method call:
    {Instance}:methodname(Parameter2, Parameter3, ...)

Class instances can be created like record instances. This is not surprising, since we represent classes with records. The method call is expanded into the following expression (see footnote 10):
    (element(1, Instance)):methodname(Instance, Parameter2, Parameter3, ...)

This shows that the module name of the called method is determined by the first element (see footnote 11) of the tuple of the object, which is in fact the name of its class. It is extracted at runtime to ensure that method calls on an object are always directed to the module of its most specific class. The methods implemented in the module of that class are therefore called directly. But this alone does not make it possible to call a method implemented in one of the superclasses, that is, an inherited method. To achieve this, a so-called stub function is generated in the module of the class for each inherited method. The name and arity of this function are the same as those of the inherited method, but it has only one clause with only one expression: a call to the inherited method. Consider the following example: let us suppose that area/2 is a method in a class named geometry:
    area(This, {rect, W, H}) when This#geometry.rectAllowed =:= true ->
        W*H;
    area(This, {square, A}) ->
        A*A.

Let us also suppose that another class named xgeometry extends this class, and does not define a method with the name and arity of area/2. However, users of this class should be able to call {Obj}:area/2 on its instances, therefore the following stub is generated:
    area(Arg1, Arg2) ->
        geometry:area(Arg1, Arg2).

This stub calls the implementation. Creating stubs is optimised in the sense that the direct implementation is always called. For example, if a third class extended xgeometry and did not implement area/2, its stub method would call geometry:area/2 directly and not xgeometry:area/2. Thus we limit the number of extra function calls per method call to one. I used guards and patterns in geometry:area/2 to emphasise that no guards and patterns are copied to the stub. Choosing the clause remains the responsibility of the implementing class. This mechanism is called late binding, or dynamic binding, because the actually called function is decided only at runtime. In contrast, early binding or static binding is when the called function is fixed at compile time. This is the case for ordinary function calls. Methods can also be called in this static way: module:method(Instance, Arg2, Arg3, ...). I recommend avoiding this, because it makes future changes to the class hierarchy harder. As the above example shows, the object is always passed as the first parameter of the method. This is called the this pointer in imperative languages. Ordinary functions can play the role of static methods. The background implementation of method inheritance is discussed in the next section, using the example given there.
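The call expansion described in this section can also be tried in plain Erlang, without the transformation. The following is a minimal sketch under stated assumptions: the module name dispatch_sketch, the function call_area/2 and the hand-built object tuple are hypothetical, and the object layout {ClassName, Types, Field} merely mimics the record representation described above. The expression in call_area/2 is the hand-written form of {Obj}:area(Shape).

```erlang
-module(dispatch_sketch).
-export([new/0, area/2, call_area/2]).

%% An object is a tuple whose first element names the module of its class.
%% The third element plays the role of a boolean field.
new() ->
    {dispatch_sketch, {dispatch_sketch}, true}.

%% A method: the object arrives as the first argument.
area(_This, {square, A}) ->
    A * A;
area(This, {rect, W, H}) when element(3, This) =:= true ->
    W * H.

%% Hand-written equivalent of the expanded method call:
%% the module is looked up in the object at runtime (late binding).
call_area(Obj, Shape) ->
    (element(1, Obj)):area(Obj, Shape).
```

Because the module name is read from the object at runtime, the same call site would reach a different module for an object whose first element names a different class.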
3.5
Dening a subclass
When we extend a class (define a subclass of it), the -superclass(superclass_name) attribute should be used in the subclass to denote the superclass. In the following example we extend class1 to class2.
10 In case Instance is not a variable but an expression, the generated code will first evaluate that expression, and then invoke the method with the evaluated instance variable.
11 See section A.8.3 for the definition of BIF element/2.
    -include("class1.class.hrl").

    -class(class2).
    -superclass(class1).
    ?FIELDS({x, y, z}).
    ?METHODS([method2/2]).

    method2(_, _) -> ok.

It is essential to include the header file of the superclass, because that contains all necessary information about it. First, let us see the generated source code:
    -module(class2).
    -export([method1/2,method2/2]).
    -include("class1.class.hrl").
    -include("class2.class.hrl").

    method2(_, _) -> ok.

    method1(Var2, Var1) -> class1:method1(Var2, Var1).

As the code shows, a stub function is generated for method1/2, because it is inherited and must therefore be available through the call of class2:method1/2. The generated header file is:
    -include_lib("/ect/include/ect.hrl").
    -record(class2,{_types = {class1,class2}, field1, field2 = 2, field3, x, y, z}).
    -classlevel({class2,2}).
    -vmt({class2,[{method1,2,class1},{method2,2,class2}]}).

Two differences from class1.class.hrl are important here:
- Class record: the new class record is the record of the superclass with the fields of the class appended to the end. This is a general rule in our system. The new fields are always appended at the end, in the same order as defined in the ?FIELDS directive of the erl file. This is an unambiguous representation, because we do not support multiple inheritance.
- Attribute vmt: the vmt attribute also differs, because it shows that the most specific implementation of method2/2 is in class2, while method1/2 has it only in class1. In general, attribute vmt stores, for each method, the most specific class that implements it. (Methods implemented in the current class are also added: they will have the current class in their entries.) Methods are represented by their names and arities. This information makes it possible to create stubs that redirect in one step to the implementation functions, instead of calling through a chain of stubs. More precisely, it is used to decide which inherited methods must have a stub function generated, and to which module each stub should point: stubs are needed for those methods whose entry does not point to the current class, and the destination of the call is the class in the vmt entry. In fact, it is the vmt entry that makes a function a method. The vmt for a class can be generated from its methods and the vmt of the superclass.
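The last statement can be made concrete with a short sketch. It is not the code of the transformer: the module name vmt_sketch and the functions merge/3 and stubs/2 are hypothetical, and sorting the result is an assumption made only to obtain a deterministic order. Entries have the same {Name, Arity, ImplementingClass} shape as in the -vmt attribute above.

```erlang
-module(vmt_sketch).
-export([merge/3, stubs/2]).

%% merge(Class, OwnMethods, SuperVmt) -> Vmt
%%   OwnMethods: [{Name, Arity}] declared as methods in Class
%%   SuperVmt:   [{Name, Arity, ImplClass}] taken from the superclass
merge(Class, OwnMethods, SuperVmt) ->
    %% Inherited entries survive unless the class overrides them.
    Inherited = [E || {N, A, _} = E <- SuperVmt,
                      not lists:member({N, A}, OwnMethods)],
    %% Methods implemented here point to the current class.
    Own = [{N, A, Class} || {N, A} <- OwnMethods],
    lists:sort(Inherited ++ Own).

%% stubs(Class, Vmt) -> the entries that need a stub in Class,
%% i.e. those whose implementing class is not Class itself.
stubs(Class, Vmt) ->
    [E || {_, _, Impl} = E <- Vmt, Impl =/= Class].
```

With the vmt of class1 and the single method method2/2 of class2 as input, merge/3 yields the vmt shown in the generated header, and stubs/2 selects method1/2 with class1 as the destination. Since every entry already names the implementing class, a stub never has to go through an intermediate class.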
3.6
Method inheritance
An important principle of OOP is the principle of substitutability. It says that if A is a subclass of B, then instances of A can be used in any place where instances of B can be used. By usage, we mean method calls and field manipulation. Field manipulation is a more complex matter and will be discussed in the next section. As for method calls, the principle is followed, because a subclass always inherits all the methods of the superclass. The following example demonstrates possible method-call scenarios. It also introduces a new syntax element for method calls: calling a method of the superclass. Note that in this simplified source code, all functions are methods. Extract of class1.erl:
    method1(This) -> 2.
    method2(This) -> {This}:method1().
    method3(This) -> method1(This).

Extract of class2.erl:

    -superclass(class1).

    method1(This) -> {{This}}:method1() + 40.

This example introduces the following new syntax:

    {{Instance}}:methodname(Arg2, Arg3, ...)
It cannot be used in client modules, only in class modules that have a superclass. The meaning is: call the method named methodname in the superclass of the current class (and not in the superclass of Instance). This means that class2:method1/1 will always call class1:method1/1, even if the object has a descendant class of class2 (see footnote 12). This is in fact a more flexible way of early binding. Flexibility comes from the fact that such a call is independent of the name of the superclass, therefore no refactoring is needed when that name is changed. I present some arguments about the choice of this syntax at the end of this subsection. The key point in this example is method1. I demonstrate the issues of method calls on it. For this, let us suppose that we have an instance of class1 named O1, and an instance of class2 named O2. There are six possible calls:
- {O1}:method1() is directed to class1:method1(O1).
- {O1}:method2() is directed to class1:method2(O1). The call in its body is directed to class1:method1(O1).
- {O1}:method3() is directed to class1:method3(O1). The call in its body is directed to class1:method1(O1).
- {O2}:method1() is directed to class2:method1(O2), because it overrides class1:method1/1. It calls class1:method1(O2), and adds 40 to its return value.
- {O2}:method2() is directed to class1:method2(O2), because it is an inherited method. The call inside it is directed to class2:method1(O2), because method1 is overridden. The latter is a desired behaviour, and can be utilised for example to create easily extendable algorithms. I used this in the implementation of this transformation itself.
- {O2}:method3() is directed to class1:method3(O2), because it is an inherited method. However, the call inside it is always directed to class1:method1(O2), because the ordinary (static) function-call syntax is used. This way overriding of methods has no effect. In spite of this, the reason for defining a function to be a method is to enable inheritance. Therefore I recommend not defining the called function as a method, if such behaviour is desired.
And now, let us turn back to the syntax of superclass calls. The implemented one is:

    {{Instance}}:methodname(Arg2, Arg3, ...)
12 Of course this construct also allows the call of a superclass method by a different name than the caller method.
And an alternative one is:

    ?SUPERCLASS:methodname(Instance, Arg2, Arg3, ...)

The alternative one stresses better that the destination of the call is static: it is the superclass. However, the implemented one emphasises better that this is a method call of an object.
3.6.1 Overriding a single clause
Overriding a single clause of a function is possible in eXAT (section 2.4). There is no such feature in ECT; however, it is easy to program such behaviour:

    -include_lib("ect/include/ect.hrl").
    -class(alpha).
    ?METHODS([fact/2]).

    fact(Self, 0) -> 1;
    fact(Self, N) -> N * object:call(Self, fact, [N-1]).
    -include_lib("ect/include/ect.hrl").
    -class(beta).
    -superclass(alpha).
    ?METHODS([fact/2]).

    fact(Self, X) when X < 0 -> invalid_number;
    fact(Self, N) -> {{Self}}:fact(N). % call implementation of superclass.

The newly inserted clauses will be matched first, even when a more specific pattern or guard exists in the superclass.
3.7
Field manipulation
The ways Erlang provides for manipulating records are described in section A.9. I repeat here a simplified example:
    1  %definition (must be at module level):
    2  -record(example, {a, b = 5, c = 6}).
    3  demo() ->
    4      X = #example{b = 6},            % instance creation
    5      #example{a = A, b = B} = X,     % field extraction
    6      Y = X#example{a = 1, b = 2},    % field update
    7      C = Y#example.c,                % single field extraction
    8      % nesting:
    9      #example{a = A, b = {_, B}, c = #example{a = C}} = X,

The important thing here is that all operations explicitly state the name of the record. Before an operation is executed, it is checked whether its input variable is a valid record with the stated name. (The operations in lines 5 and 6 check X to be an example, the operation in line 7 checks Y to be an example.) This type check passes if two conditions are met:
1. The input variable is a tuple, and its first element is the same as the stated name.
2. The size of the input tuple is the same as the defined size of the stated record, i.e. the number of the defined fields + 1 for the name element.
However, similar operations on objects should behave in such a way that if classa is a superclass of classb, and B is an instance of classb, then B can still be matched and updated with the stated name classa:
    1  % ...definition of classa, classb...
    2  demo() ->
    3      X = #classb{b = 6},             % instance creation
    4      #classb{a = A1, b = B1} = X,    % field extraction
    5      #classa{a = A2, b = B2} = X,    % field extraction
    6      Y1 = X#classb{a = 1, b = 2},    % field update
    7      Y2 = X#classa{a = 1, b = 2},    % field update
    8      C1 = Y1#classb.c,               % single field extraction
    9      C2 = Y1#classa.c,               % single field extraction
Lines 4, 6 and 8 show the conventional semantics, which can be achieved with record operations. Lines 5, 7 and 9 show the use of the name of the base class as the stated name. This would fail with record operations, but is desired with class operations. The easy way to work around this would be to turn off type-checking, and thus enable these operations on any tuples. This can be done in theory, because a class name and a field name determine a position in the tuple. However, this would have an undesired side-effect: it would become possible to access a field of a class by stating an unrelated field of an unrelated class. This question is only theoretical, though, because the type-check can only be turned off for single-field extractions (see footnote 13). For a correct behaviour, we need a type check that answers the following question: is X an instance of yclass (directly or through a superclass)? In other words, we need to check two things: (O1) The size of the tuple of the object must be greater than or equal to the size of the tuple of the stated class. (O2) The class of the object must be a subclass of, or equal to, the stated class. To see an example in practice, let us suppose that we have the following two objects:
    1  A = #classa{a = 1, b = 2},                 % an instance of classa
    2  % A = {classa, {classa}, 1, 2}             % the same with tuple-syntax
    3  B = #classb{a = 1, b = 2, c = 3},          % an instance of classb
    4  % B = {classb, {classa, classb}, 1, 2, 3}  % the same with tuple-syntax
    5  #classa{a = X} = B.
In line 5, where B is checked to be classa, the following two checks should happen:
- tuple_size(B) >= 4, where 4 is the size of classa.
- The first element of B is a subclass of classa. (The second element, which is the _types field, is also used, to achieve O(1) time. This will be discussed in section 3.8.)
The previous example showed class operations with the same syntax as record operations. Objects are records with an extra field, so the only difference is the method of type-checking. However, it is not obvious that using the same syntax is the best choice. If we do not introduce a fundamentally different concept of handling records, there remain two options:
- Use a new syntax: a new syntax that the parser accepts, but that did not exist in Erlang (see footnote 14), can be introduced with the above-discussed semantics. Because it did not exist, the parse transformer can easily filter it and change its occurrences to suitable Erlang code.
- Extend record syntax: the semantics of the current record syntax can be extended, so that record manipulations behave ordinarily when the stated name is the name of a record, but behave according to class semantics when the stated name is the name of a class. The parse transformer should collect the list of class definitions, locate all record-manipulating constructs and transform those that work on classes.
This is the general question that I mentioned in section 3.2, that is, whether to introduce new syntax or not.
13 With compiler option no_strict_record_tests.
14 An example for this is obj#classname{field1 = Pattern1, field2 = Pattern2, ...} for patterns, and {Instance}#classname{field1 = Value1, field2 = Value2, ...} for field-updating expressions.
Apart from modifying the compiler, there is no way to turn off Erlang's strict record-type checking; however, this would be essential to implement (O1). Therefore we have chosen to transform each pattern or expression involving a class type into code that behaves according to (O1) and (O2). Up to a certain point in the development, we planned to create a syntax for these behaviours that differs from record manipulation, but the following arguments, which I collected, made us change the design. Possible arguments about having the same syntax for classes as for records:
Cons
- It can be misunderstood, because the same expression has a different meaning.
Pros
- The meaning is different, but strongly related to the original: check the type of an entity and take or update a field of it. This makes the usage intuitive.
- Simpler source code: easier to learn and understand.
- In common imperative languages like C++ and C#, record/struct fields and object fields are accessed the same way.
- There are examples where only one construct exists for the role of records and classes: classes in Java and records in Oberon.
Therefore I have chosen to implement field manipulations for objects with the same syntax as for records. To emphasise the differences of object field-handling in the following text, I will distinguish the patterns and expressions that deal with classes (having #classname) from any other patterns or expressions by the terms class-patterns and class-expressions. The current implementation transforms a pattern or an expression if and only if the stated record type is the type of a known class. Known classes in a module are the included classes, and the current class if the module is a class module. All the field extractions and updates will be decomposed into uses of the following Erlang BIFs: element/2, setelement/3, tuple_size/1 (see section A.8.3). The begin-end construct defined in section A.11 will also be used.
3.8
Type checking
Before any object-field manipulation, the class of the object will be checked, just as the type is checked for record instances. This can be done in O(1) time, with the following formula from page 254 of [7]: the class of object O is a subclass of, or equal to, the class C if and only if O.types[C.classlevel] = C, where O.types is an array containing the classes of O from the most general to the most specific. The most general superclass is at index 1, and the last element is the most specific class of O. C.classlevel is the depth of C in the class hierarchy, in other words the number of its superclasses + 1. It is easy to see that C can only appear at position C.classlevel in the array of any object. Our objects carry the types array as a tuple in their _types field. This is always their first field, so it is located at the second position when looking at them as tuples. The formula can be rewritten as a guard expression (see footnote 15):

    ClassName =:= element(ClassLevel, element(2, Object))

or as a pattern match (see footnote 16):
15 See section A.12.4.
16 See section A.7.
    ClassName = element(ClassLevel, element(2, Object))

As I mentioned above, the _types field contains the names of the superclasses and the class of Object. This information is in fact a property of the class of Object; we store it per object because the class is only revealed at runtime. Note that in this example, ClassSize, ClassLevel and ClassName are written as Erlang variables, but in fact all of them are known at compile time, and are written into the generated code. The value of ClassName is the stated name: the name after the hash mark (#classname). The level and size of the stated class can be found at compile time in the attributes of the include file of the class. Generated source code examples after this point will be written as if there were a BIF for the previous guards, named is_object(Object, ClassName). The choice of name is based on the similarity to the Erlang BIF is_record(Record, Name). When it appears in a guard, it stands for ClassName =:= element(ClassLevel, element(2, Object)). When it appears as an expression, it stands for ClassName = element(ClassLevel, element(2, Object)).
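The check can be written as an ordinary function for experimentation. This is a sketch under assumptions, not generated ECT code: the module name is_object_sketch is hypothetical, the class level is passed as a third argument because no class header is available here, and the extra size tests are added only so that the function returns false instead of failing on malformed input.

```erlang
-module(is_object_sketch).
-export([is_object/3]).

%% is_object(Object, ClassName, ClassLevel) -> boolean()
%% True if ClassName is found at index ClassLevel of the types tuple
%% stored in the second element of Object.
is_object(Object, ClassName, ClassLevel)
  when is_tuple(Object), tuple_size(Object) >= 2 ->
    Types = element(2, Object),
    is_tuple(Types)
        andalso tuple_size(Types) >= ClassLevel
        andalso element(ClassLevel, Types) =:= ClassName;
is_object(_, _, _) ->
    false.
```

Using the tuple forms of the objects A and B from section 3.7, an instance of classb passes the test for classa at level 1, while an instance of classa fails the test for classb at level 2. Each test is a constant number of element/2 calls, independent of the depth of the hierarchy.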
3.9
Query expressions
A query expression extracts one field from an object; the object can also be the result of an expression:

    Expression#classname.field

is converted to

    begin X1 = Expression, true = is_object(X1, classname), element(NNN, X1) end

where NNN is a constant number, representing the position of field in class classname. First, the expression is evaluated and stored in the variable X1, to avoid evaluating it multiple times. After that, a type check is done, which causes a runtime error if it fails (see footnote 17). The result of the third expression is the field in question. Since the object can be the result of an expression, these expressions can be nested, for example:

    (Obj#class1.field1)#class2.field2

is converted to
    begin
        X1 = begin
            X2 = Obj,
            true = is_object(X2, class1),
            element(NNN1, X2)
        end,
        true = is_object(X1, class2),
        element(NNN2, X1)
    end

The variables X1, X2, ... are generated by the transformation. In the current implementation, to minimise the risk of name clashes, these variables are in fact named _CCTRANS_1, _CCTRANS_2, .... For better readability, I use the names X1, X2, ... in this text.
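The nested conversion can be written out by hand and executed. The sketch below rests on assumptions: the module name query_sketch is hypothetical, class1 and class2 are both taken to be root classes (level 1) whose only own field sits at tuple position 3, and the type check is spelled out in its pattern-match form instead of the is_object shorthand.

```erlang
-module(query_sketch).
-export([nested/1]).

%% Hand expansion of (Obj#class1.field1)#class2.field2 for objects laid
%% out as {ClassName, Types, Field}.
nested(Obj) ->
    begin
        X1 = begin
                 X2 = Obj,
                 %% type check for class1 (class level 1)
                 class1 = element(1, element(2, X2)),
                 %% field1 is assumed to be at position 3
                 element(3, X2)
             end,
        %% type check for class2 (class level 1)
        class2 = element(1, element(2, X1)),
        %% field2 is assumed to be at position 3
        element(3, X1)
    end.
```

A failed type check shows up as a badmatch error at runtime, which corresponds to the second cause of error described in footnote 17.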
3.10
Update expressions
Update expressions create a new object from an existing one, changing some fields of it. The existing object and the new field values can of course be results of expressions. So
17 There can be two causes of runtime error. The first possibility is that is_object/2 causes it directly, because X1 is not an object. The second is that the result of is_object/2 is false and cannot be matched to true.
Expression#classname{field1 = ValueExpr1, field2 = ValueExpr2} is converted to

    begin
        X1 = Expression,
        true = is_object(X1, classname),
        {X2 = ValueExpr1, X3 = ValueExpr2},
        setelement(NNN1, setelement(NNN2, X1, X3), X2)
    end

NNN1 and NNN2 are the positions of field1 and field2, respectively. After the type check, all the field values are evaluated. According to section 3.5 of [12], the Erlang compiler optimises setelement calls when there are no other function calls or accesses to the tuple between them, and the indices are integer literals in descending order. With this optimisation, only one copy of X1 is made during the setelement chain, instead of one copy per setelement call. The values are evaluated before the setelement chain to trigger this optimisation. The order of the calls is also rearranged to update fields in descending order. Due to the use of begin-end blocks, nesting is also possible here, so the expression of a field can contain further class-expressions.
3.10.1 The issue of side-effect variables
For the sake of optimisation, the expressions are evaluated into variables. As the above example showed, these evaluations are done in a tuple. This is a temporary solution, for reasons I describe here. An Erlang expression can be a pattern match. If the pattern contains unbound variables, then the expression has the side effect of binding them. For example, the following two function calls pass the same parameter value, but the expression in the second one also binds the variable V:
    sqrt(16*16)
    sqrt(V = 16*16)

This is also true for update expressions, so the following is valid:

    Obj#class{field1 = A = 5, field2 = B = 6}

But these variables only become accessible after the whole expression is evaluated, so the following is invalid in Erlang:

    Obj#class{field1 = A = 5, field2 = B = 6, field3 = A + B}

Evaluating the field expressions sequentially would lift this limitation. The quickest solution to keep it is to evaluate them in a tuple. The question might arise whether this is really necessary: should the use of variables be limited within the same expression?

Cons
  No, because code relying on the lifted limitation was never valid in Erlang, so allowing it does not interfere with existing Erlang code.
  No, because it causes runtime overhead.
  No, because the generated tuples cause compiler warnings: a term is constructed but never used.

Pros
  Yes, to avoid discrepancies from the original Erlang language.
The last two counter-arguments highlight defects of the current tuple-based implementation. I plan a better implementation in a later version. In theory it is possible to keep track of variables at transformation time, and to issue error messages when such anomalies are found. To avoid a change of semantics later, I am temporarily keeping the current sub-optimal implementation. However, for use in a real-world environment, the tuple-based solution in its current form should be removed.
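The update scheme and the scoping effect of the tuple can be reproduced in a few lines. The sketch below is hand-written, omits the type check, and assumes that field1 and field2 sit at positions 3 and 4 of a hypothetical object tuple. The `_ =` prefix in front of the tuples is an addition of this sketch and only silences the "term is constructed but never used" warning mentioned above.

```erlang
%% Hand-written sketch of an update expression (type check omitted).
%% ASSUMPTION: field1 is at position 3 and field2 at position 4.
-module(update_sketch).
-export([update/3, scope_demo/0, demo/0]).

%% Equivalent of Obj#class1{field1 = ValueExpr1, field2 = ValueExpr2}.
update(Obj, ValueExpr1, ValueExpr2) ->
    X1 = Obj,
    %% Both values are bound inside one tuple, as in the generated code.
    _ = {X2 = ValueExpr1, X3 = ValueExpr2},
    %% Literal indices, higher position first, nothing in between.
    setelement(3, setelement(4, X1, X3), X2).

%% A and B are bound inside the tuple and become visible only after it.
scope_demo() ->
    _ = {A = 5, B = 6},
    A + B.

demo() ->
    Old = {class1, [], a, b},
    New = update(Old, 1, 2),
    {class1, [], 1, 2} = New,
    %% The original object is unchanged.
    {class1, [], a, b} = Old,
    11 = scope_demo(),
    ok.
```

Writing `{A = 5, B = 6, A + B}` instead would be rejected by the compiler because A and B are unbound inside the tuple, which is exactly the restriction the tuple-based evaluation preserves.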
3.11 Pattern-matches
A pattern-match has the following form (see section A.7 for details):

    Pattern = Expression

In normal pattern-matches, a tuple must match a tuple of the same size. By contrast, in class-pattern matches this is not always true: when matching against a class-pattern, the size of the right-hand tuple must be greater than or equal to the size of the class-pattern. That is why pattern-matches are also converted, using element/2 BIF calls:

    #class1{field1 = 5, field2 = B} = Obj

is transformed to
    1  begin
    2      X1 = X0 = Obj,
    3      true = is_object(X0, class1),
    4      5 = element(3, X0),
    5      B = element(4, X0),
    6      X1
    7  end

Line 2 binds two variables to the matched object. The matched object is used several times in the following lines; if it were an expression, it would be evaluated each time, but this way it is evaluated only once, on this line. Line 3 is the type check. Line 4 tests the value of field1 with a pattern-match. Line 5 extracts the value of field2 with a pattern-match. Line 6 sets the result of the begin-end block to the value that the original pattern-match would have.

The following example demonstrates the case when several class-patterns are embedded into another pattern; in this case, a list pattern:

    [_,#class1{field1 = 5, field2 = A},#class2{z = B}|_] = ObjList

The first class-pattern is matched against the second element of the list, and the second class-pattern against the third element. The resulting code first extracts the elements from the list, and then does the class-matches on them separately:
    begin
        X2 = [_,X0,X1|_] = ObjList,
        true = is_object(X1, class2),
        B = element(8, X1),
        true = is_object(X0, class1),
        5 = element(3, X0),
        A = element(4, X0),
        X2
    end
First, the two elements are extracted from the list (into X0 and X1), and the list is saved into X2. X2 is used at the end to restore the correct evaluation result of this pattern-match. After this, the types are checked, and the fields are matched independently against their patterns.

The following example demonstrates what happens when there is a compound pattern and two variables are the same:

    #class1{field1 = {A, B}, field2 = A} = Obj.

This is transformed to:
    1  begin
    2      X1 = X0 = Obj,
    3      true = is_object(X0, class1),
    4      {A,B} = element(3, X0),
    5      A = element(4, X0),
    6      X1
    7  end

The composite pattern is matched against the extracted element of the tuple. When the pattern match should fail because the two subpatterns would bind A to different values, a runtime error occurs at the second attempt, at line 5. Note that this is only true when A was unbound before this match. If A was already bound, the match can also fail at line 4.

Due to the begin-end construct, class-patterns can be nested into other class-patterns. For example:

    #class1{field1 = C = #class2{x = {5, B}}, field2 = A} = Obj.

This example also shows that multiple patterns can be matched against a field. Thus, field1 is not only matched against #class2{x = {5, B}}, but also extracted into the variable C. The result of the transformation is:
    begin
        X1 = X0 = Obj,
        true = is_object(X0, class1),
        C = begin
                X3 = X2 = element(3, X0),
                true = is_object(X2, class2),
                {5,B} = element(6, X2),
                X3
            end,
        A = element(4, X0),
        X1
    end.
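The difference between a class-pattern match and an ordinary tuple match can be seen in the following hand-written sketch. As before, the object layout {ClassName, AncestorList, Field1, ...} and the body of is_object/2 are assumptions made only for this illustration.

```erlang
%% Hand-written sketch of a converted class-pattern match.
%% ASSUMPTION: objects look like {ClassName, AncestorList, Field1, ...}.
-module(match_sketch).
-export([match/1, plain_match/1, demo/0]).

%% Hypothetical stand-in for the is_object/2 type check.
is_object(Obj, Class) when is_tuple(Obj), tuple_size(Obj) >= 2 ->
    element(1, Obj) =:= Class orelse lists:member(Class, element(2, Obj));
is_object(_, _) ->
    false.

%% Equivalent of: #class1{field1 = 5, field2 = B} = Obj
match(Obj) ->
    Result = begin
                 X1 = X0 = Obj,
                 true = is_object(X0, class1),
                 5 = element(3, X0),
                 B = element(4, X0),
                 X1
             end,
    %% B was bound inside the block and is still visible here.
    {Result, B}.

%% An ordinary tuple pattern, which insists on the exact size.
plain_match({class1, _, 5, B}) -> {ok, B};
plain_match(_) -> nomatch.

demo() ->
    %% Instance of a hypothetical subclass with one extra field.
    Sub = {class2, [class1], 5, b_value, extra},
    {Sub, b_value} = match(Sub),
    nomatch = plain_match(Sub),
    %% A wrong field value raises a runtime error.
    {'EXIT', {{badmatch, 7}, _}} = (catch match({class1, [], 7, x})),
    ok.
```

The subclass instance is accepted by match/1 although it is longer than the class-pattern, while the fixed-size tuple pattern in plain_match/1 rejects it.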
3.12 Patterns in clause heads
Functions, funs, if expressions, case expressions, receive expressions and try-catch blocks have a common property in Erlang: they all consist of clauses (see section A.12). Patterns can appear not only in pattern-matching expressions, but also in place of arguments in clause heads. These patterns need to be handled when they are class-patterns. In the following subsections, I describe the issues common to all clause types using function clauses as examples. The exception is 3.12.4, where I describe the specific issues of the other clause constructs. By the nature of Erlang, we must consider the following issues:

1. Any value tests derived from the class-pattern must be placed in the guard to maintain correct program behaviour, which is to skip to the next clause instead of raising a runtime error on match failure.
2. It is not possible to assign values to variables in the guard (see section A.12.4).

This implies that class-patterns must be split into two parts: value tests, which go into the guard, and value assignments, which go at the beginning of the body. The following subsections describe each step necessary to deal with these restrictions. We demonstrate these steps on simple subpatterns (variables and atomic constants), and show in the last subsection how to generalise them to composite subpatterns.

3.12.1 An unbound variable in the pattern
An unbound variable means that its value is to be determined by the pattern-match. If it is in a class-pattern, its value will be extracted from the object against which the pattern is matched. (If it is not in a class-pattern, the transformation leaves it untouched.) The transformation does the following: class-patterns in the head are substituted with variables, a type check is added to the guard, and the values are extracted at the beginning of the body:
    clausedemo1(A = {_, #class1{field2 = B}}, C) when C > 5 ->
        A+B+C;

is transformed to
    clausedemo1(A = {_,X0}, C)
      when is_object(X0, class1), C > 5 ->
        B = element(4, X0),
        A+B+C;

3.12.2 Atomic constants in the pattern
When the pattern contains constant expressions, they cannot be checked in the body, because a match failure there would cause a runtime error instead of the desired behaviour, which is to step to the next clause of the function. The solution is to do the check in the guard:
    clausedemo1(A = {_, #class1{field2 = B, field3 = xyz}}, C) when C < 5 ->
        A + B + C;

is transformed to
    clausedemo1(A = {_,X1}, C)
      when
        is_object(X1, class1),
        xyz =:= element(5, X1),
        C < 5 ->
        B = element(4, X1),
        A + B + C;

3.12.3 Bound variables in the pattern
A similar case is when the pattern contains a variable which already has a value, because it appears in another (non-class) pattern:
    clausedemo1(A = {_, #class1{field2 = B, field3 = C}}, C) ->
        A + B + C;

is transformed to
    clausedemo1(A = {_,X2}, C)
      when
        is_object(X2, class1),
        C =:= element(5, X2) ->
        B = element(4, X2),
        A + B + C;

so the variable is treated as a constant, and checked in the guard.

3.12.4 Bound and unbound variables in embedded clauses
In fact, variables in the pattern can also be bound for other reasons. This is the case when the clauses are embedded in a sequence of expressions, and a previous expression creates a variable. This embedding is not possible for function clauses, but it is possible for case, receive, try-catch and fun clauses. Consider the following example:
    A = 5,                                  %% (*)
    case Z of
        #class1{field1 = A, field2 = B} ->
            {A, B};
        _ ->
            nothing
    end

Removing the line marked with (*) has far-reaching consequences. If it is removed, the following code should be generated:
    case Z of
        X0 when is_object(X0, class1) ->
            A = element(3, X0),
            B = element(4, X0),
            {A, B};
        _ ->
            nothing
    end

By contrast, if it is present, the following should be generated:
    A = 5,
    case Z of
        X0 when is_object(X0, class1), element(3, X0) =:= A ->
            B = element(4, X0),
            {A, B};
        _ ->
            nothing
    end

In other words, whether a variable is bound decides whether its value extraction goes into the guard as a test, or into the body as a variable binding. What needs to be done is to maintain a list of bound variables. This can be done by the following steps:

1. The list starts empty at the beginning of every function clause.

2. Any variable appearing in a pattern is added to the list. When a variable appears in a pattern, it is either already bound, or gets bound by the pattern.
3. In Erlang, if a variable is defined in an expression that is part of an expression sequence, it is also accessible in the following expressions of the sequence. For example:
    A = 5,
    begin
        B = 6
    end,
    C = A + B       %% correct, B is accessible
The previous point respects this, but conditional constructs with more than one branch raise some problems:
    if
        something1 -> A = 5;
        something2 ->
            %% B = A + 1, this would be incorrect, A is not yet visible
            A = 5,
            B = A + 1
    end
    %% now A is visible, but B is not, because it is only
    %% defined in one branch (unsafe)
4. In clause-based constructs, the list at the beginning of each clause starts from the list as it was before the whole construct. In other words, after each clause the list is reset to what it was before. The last clause is an exception. For funs it is reset, because no variables are exported from them. For the other construct types it is not reset, because those variables are visible outside the clauses.

This is in fact a bit tricky, since at first it seems that there is no guarantee that the list will be correct after the last clause. Fortunately, the variables that are created in all the clauses will enter the list, because they are also created in the last clause. Variables created only in some of the clauses are considered unsafe by the compiler, and cannot be accessed after the clause construct. Therefore we do not need them in the list of bound variables. Some variables that are created in the last clause, but not in all clauses, might enter the list, but this does not change anything, since these variables are never used after the clause construct in a correct (successfully compiled) program.

The previous subsection was in fact a special case of this method, because the list maintained here contains all the variables that are considered bound there.

3.12.5 Variables present in a pattern and also in the guard
Another special case is when a variable appearing in a class-pattern also appears in the guard. This does not affect the transformation of the pattern in any way, but since the variable will only be bound in the body, the guard cannot use it. The solution is to transform the guard, and substitute the variable with its corresponding extraction sequence.
    clausedemo1(A = {_, #class1{field2 = B}}, C) when B =:= 42 ->
        A + B + C;

is transformed to
    1  clausedemo1(A = {_,X3}, C)
    2    when
    3      is_object(X3, class1),
    4      element(4, X3) =:= 42 ->
    5      B = element(4, X3),
    6      A + B + C;

In line 4, element(4, X3) stands in for B in the guard; line 5 still extracts the value of B for the body.

3.12.6 Variables appearing multiple times in patterns
One more case remains: when a variable appears more than once in a class-pattern. In this case, extracting the variable in the function body is not enough; the guard must also test that all occurrences get the same value:
    clausedemo2([_,#class1{field1 = A, field2 = A}], #class2{field3 = A}) ->
        A.

is transformed to
    clausedemo2([_,X0], X1)
      when
        is_object(X0, class1),
        is_object(X1, class2),
        element(5, X1) =:= element(4, X0),
        element(4, X0) =:= element(3, X0) ->
        A = element(3, X0),
        A = element(4, X0),
        A = element(5, X1),
        A.

so that equality is checked before the body. Extracting A three times is of course not necessary, but it will never cause a runtime error, because the guard guarantees that all three elements are the same. This will be optimised in later versions.

3.12.7 Composite subpatterns for field extraction
Until this point, we have only discussed single value extractions from objects into variables, and single value tests. This section covers the case of more complex field patterns. An unsubstituted variable is a special case of a general pattern. When more complex subpatterns appear in the class-patterns, the above techniques are used, and only the value extractions are extended. This is done with the additional help of the following two BIFs:

  hd(List)  The head of the list.
  tl(List)  The tail of the list.

All variables appearing in class-patterns are extracted with a combination of these and the previous BIFs, and these combinations are used in the same way as the previous element(NNN, Xi) schemas. For example, the resolution of a clause with complex patterns and special cases:
    clausedemo3([_,#class1{field1 = A, field2 = #class2{z = #rec{b = Z}}},
                 #class1{field3 = [_, {_,_,Z}]}], Z) ->
        A+Z.

is transformed to
    clausedemo3([_,X0,X1], Z)
      when
        is_object(X0, class1),
        is_object(element(4, X0), class2),
        is_record(element(8, element(4, X0)), rec, 4),   %% test record type
        Z =:= element(3, hd(tl(element(5, X1)))),
        3 =:= tuple_size(hd(tl(element(5, X1)))),         %% test tuple size
        is_object(X1, class1),
        Z =:= element(3, element(8, element(4, X0))),
        [] =:= tl(tl(element(5, X1))) ->                  %% test list closure
        A = element(3, X0),
        A + Z.
There is no need to compare the other occurrences of Z in the guard with each other, because Z gets its value in the head, and behaves like a constant in the guard. The example shows that the more complex field-extracting patterns are also relaxed into variable extractions and tests, but the extraction takes more than one step, using nested BIF calls. In simpler cases we found that the assembly code compiled from our generated class-matches is very similar to, or the same as, the code for Erlang record-matches. However, we have not examined this in detail, and we are aware that there are cases when our assembly code is more complex than it could be, like the one in the following subsection.

3.12.8 Composite terms
Consider the following behaviour of the transformation:

    composite(#class1{field1 = {bre, ke, ke}}) -> frog.

is transformed to
    composite(X0)
      when
        is_object(X0, class1),
        size(element(3, X0)) =:= 3,
        element(1, element(3, X0)) =:= bre,
        element(2, element(3, X0)) =:= ke,
        element(3, element(3, X0)) =:= ke ->
        frog.

while it would also be possible to transform it to the simpler code:
    composite(X0)
      when
        is_object(X0, class1),
        element(3, X0) =:= {bre,ke,ke} ->
        frog.

This is possible because {bre,ke,ke} is a literal. The generated beam assembly code (footnote 18) is also much shorter.

3.12.9 Summary of transformation
When processing a pattern, the following steps are done in the following order:

1. The class-patterns are substituted with variables in the head.

2. These class-patterns are converted into sequences of value extractions, and stored.

3. Based on the list of bound variables, these extractions are sorted into two groups: (1) real extractions, and (2) extractions that are in fact value tests, which access constants or bound variables.

4. Those variables in the guard which appear in class-patterns are substituted with their corresponding value extractions.

5. Members of group (1) are added to the beginning of the function body.

6. Members of group (2), along with the necessary type checks, are appended to each guard sequence (footnote 19) of the current guard.

7. Equality tests for variables that are the result of multiple extractions in group (1) are appended to each guard sequence of the current guard.

Footnote 18: As shown by the -S compiler switch.
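The effect of these steps can be checked with a small hand-transformed function. The sketch below is not output of the transformation: the function name classify/2, the object layout {ClassName, AncestorList, Field1, Field2, ...} and the simplified type check (exact class only, written with plain guard BIFs) are all assumptions made for illustration.

```erlang
%% Hand transformation of the hypothetical source clauses
%%   classify(#class1{field1 = A, field2 = xyz}, C) when C > 5 -> {hit, A};
%%   classify(_, _) -> miss.
%% ASSUMPTION: objects look like {ClassName, AncestorList, Field1, Field2, ...}
%% and the type check accepts only the exact class (no subclasses).
-module(clause_sketch).
-export([classify/2, demo/0]).

classify(X0, C)
  when is_tuple(X0), tuple_size(X0) >= 4,
       element(1, X0) =:= class1,     %% simplified type check
       xyz =:= element(4, X0),        %% group (2): test against a constant
       C > 5 ->                       %% the original guard
    A = element(3, X0),               %% group (1): a real extraction
    {hit, A};
classify(_, _) ->
    miss.

demo() ->
    {hit, 1} = classify({class1, [], 1, xyz}, 6),
    %% A failed value test selects the next clause instead of crashing.
    miss = classify({class1, [], 1, abc}, 6),
    %% So does a failed type check or a failed original guard.
    miss = classify(not_an_object, 6),
    miss = classify({class1, [], 1, xyz}, 5),
    ok.
```

Because every test sits in the guard, each mismatch falls through to the second clause, which is the behaviour required by issue 1 at the beginning of this section.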
3.13 Remote objects
It is possible to start an object in a process. The basic concept is that the process acts as a server. Its loop contains an object instance as a loop variable, and it waits for requests from the client. When one arrives, it executes the requested operations on the object, and the loop variable is changed to the resulting object. To achieve this I added some extra generated code to class modules (by remotetrans.erl), and created a new module, ectremote, which makes remote handling of objects simpler for programmers: it encapsulates message passing into functions on the client's side.
[Figure: the client's process sends a request message to the server's process, which holds the object; the server's process sends back an (optional) response message.]
To make it efficient, the server loop is generated at compile time (footnote 20). Thus the processing of incoming messages can be done entirely by pattern matching. This is most notable in the case of field manipulations, where a separate clause is generated in the receive construct for each field's query and value-setting operation.

There is a non-trivial issue with method calls, because there are in fact two kinds of methods: those which modify the fields of the object, and those which do not. This is important because the server process needs to know what to do with the return value of a method. With methods which do not modify fields, it must pass the whole value to the client. With methods which do modify fields, the return value should be split into a new-fields part and a part containing the return values. The first part should be used to update the object in the loop variable, and the second part should be sent back to the client. To make this work, I needed the following assumption about the possible return values: a method which modified a field should return (1) either the updated object, (2) or optionally a 2-sized tuple, where the first part is the updated object and the second part is a custom return value.

For this reason I defined two kinds of method calls: query-calls and update-calls. I also implemented them as two separate calling mechanisms, from which the client can choose for each call. A query-call does not affect the object state in the server; the whole return value is sent back to the client, unmodified. In the case of an update-call the server examines the return value, decides whether (1) or (2) is the case (footnote 21), updates the object state, and in the case of (2) sends the result back. The need for two separate calling mechanisms (and the need for the programmer to choose one for each call) might be eliminated, but perhaps should not be, because it makes the programmer aware of state changes. And since Erlang is a functional language, it might be good practice to mark state changes.

Footnote 19: See A.12.4.

Footnote 20: By remotetrans.erl, into the class module.

Footnote 21: This decision can be made because an object cannot be a 2-sized tuple. More precisely, a 2-sized tuple can only be an object that has no fields, in which case field updates are not relevant. This comes from the fact that the first two elements of an object tuple are always used: they hold type information.

However, I think the elimination can be done in basically two ways. First, it is possible to make the assumption stricter, and say that all methods must return a 2-sized tuple, even when the state is not changed. This might however make certain algorithms look less clear or readable:
    method(This, 0) ->
        1;
    method(This, N) ->
        N*{This}:method(N-1).

If the first clause followed this assumption, it would have to return {This, 1}, and therefore the second clause would have to use a pattern-match to extract the value. There are perhaps ways to resolve this issue with automatically generated code.

The other possible elimination could be to somehow decide whether or not the return value contains the new state of the object. The problem here is that a method can return an object of the same class, but the server cannot determine whether it is the new state of the loop object, or another object which should be sent back to the client. This can be resolved by saying that field-updating methods must return a 2-sized tuple, even when they do not intend to return a value. (In this case the second element might be ok, or anything else.) Of course such a 2-sized tuple can also be returned by accident, so more work needs to be done on this issue.

Currently, the following operations are available for a remote object.

  get          Queries a field by its name.
  set          Sets the value of a field.
  query_call   Invokes a method of the object. The return value is sent back to the client unchanged, and the fields of the object are not changed.
  update_call  Invokes a method of the object. The new state of the object is separated from the return value, and only the rest is sent back to the client. The state of the object is updated.
  delete       Terminates the server process.
  copy         Sends (a copy of) the internal object to the client.

They can be invoked either by sending messages to the server, or through the functions of ectremote (footnote 22). The advantage of the latter is that the programmer does not need to know about message sending. The disadvantage is that it decreases flexibility: for example, this way the client cannot send several requests in a batch. However, the server processes the requests sequentially in the current implementation.
Footnote 22: Remote object creation is also possible from here.
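The server concept described in this section can be sketched in plain Erlang as follows. This is a hand-written illustration, not the generated server loop and not the ectremote API: the message formats, the helper call/2, the use of funs in place of method names, the acknowledgement sent in case (1) and the object layout {ClassName, AncestorList, Field1, Field2} are all assumptions of the sketch.

```erlang
%% Hand-written sketch of a server process holding one object.
%% ASSUMPTION: message formats and function names are hypothetical;
%% methods are passed as funs instead of being looked up by name.
-module(remote_sketch).
-export([start/1, call/2, demo/0]).

start(Obj) ->
    spawn(fun() -> loop(Obj) end).

%% Synchronous request helper on the client's side.
call(Pid, Request) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Request},
    receive {Ref, Reply} -> Reply after 1000 -> timeout end.

%% One receive clause per field operation, as the text describes.
loop(Obj) ->
    receive
        {From, Ref, {get, field1}} ->
            From ! {Ref, element(3, Obj)}, loop(Obj);
        {From, Ref, {get, field2}} ->
            From ! {Ref, element(4, Obj)}, loop(Obj);
        {From, Ref, {set, field1, V}} ->
            From ! {Ref, ok}, loop(setelement(3, Obj, V));
        {From, Ref, {set, field2, V}} ->
            From ! {Ref, ok}, loop(setelement(4, Obj, V));
        {From, Ref, {query_call, Method}} ->
            %% The whole return value goes back; the state is kept.
            From ! {Ref, Method(Obj)}, loop(Obj);
        {From, Ref, {update_call, Method}} ->
            case Method(Obj) of
                {NewObj, Result} ->          %% case (2): state and result
                    From ! {Ref, Result}, loop(NewObj);
                NewObj ->                    %% case (1): state only
                    From ! {Ref, ok}, loop(NewObj)
            end;
        {From, Ref, copy} ->
            From ! {Ref, Obj}, loop(Obj);
        {From, Ref, delete} ->
            From ! {Ref, ok}                 %% no recursion: process ends
    end.

demo() ->
    Pid = start({counter, [], 0, unused}),
    0 = call(Pid, {get, field1}),
    ok = call(Pid, {set, field1, 10}),
    %% Returns the old value and stores the incremented one.
    Inc = fun(O) -> {setelement(3, O, element(3, O) + 1), element(3, O)} end,
    10 = call(Pid, {update_call, Inc}),
    11 = call(Pid, {query_call, fun(O) -> element(3, O) end}),
    {counter, [], 11, unused} = call(Pid, copy),
    ok = call(Pid, delete),
    ok.
```

The distinction between cases (1) and (2) relies on the observation of footnote 21: an object with fields is never a 2-sized tuple, so a 2-sized return value can be split safely.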
Implementation
This section gives only an overview of the implementation. I implemented the client transformation part of the system (cctrans.erl and idtrans.erl) using the newly created OO language elements. To be able to compile this, I also wrote a simplified client transformation in Erlang (cctrans0.erl), which implements the subset of OO functionality that cctrans.erl uses. I grouped the modules of the transformation into the following three layers.
4.1 Transformation modules
These modules work directly with the parsed source code of modules. They use the services of the following two layers.

  classtrans.erl   Transformation intended for both class and client modules. It contains the entry point of the transformation, which ect.hrl calls.
  idtrans.erl      Identity transformation, which can be extended by inheritance.
  cctrans.erl      Extending idtrans, performs the client module transformation. Invoked by classtrans.
  cctrans0.erl     Non-OO version of the client transformation; only implements the subset necessary to compile the above two.
  remotetrans.erl  Generates functions into class modules to support remote objects. Invoked by classtrans, before cctrans.
4.2 Helper modules
These modules provide services specific to class-patterns and class-expressions for the above layer. They do relatively complex tasks with AST subtrees, and use the services of the following layer.

  objpats.erl   Processing of class-patterns. Relaxes AST patterns into single value assignments and single equality tests; in other words, into guards and expressions.
  objexprs.erl  Processing of class-expressions.
4.3 Base modules
The following modules provide simple services to manipulate abstract syntax trees.

  classes.erl  Retrieves information from the attributes of the hrl files of the included classes.
  records.erl  Retrieves information from record definition ASTs.
  ast.erl      Creates simple AST expressions. By simple I mean that they can be described with few parameters.
  my_pp.erl    Pretty-prints ASTs into files as source code.
4.4 Runtime support modules
This fourth group contains the only module that does not belong to the transformation. It supports the runtime usage of remote objects.

  ectremote.erl  Provides functions to communicate with a remotely running object.
4.5 Path of a processed module
[Figure: path of a processed module. Nodes: class module, classtrans, client module, client transformation (cctrans0 or cctrans), normal Erlang module.]
The chart above shows the path of processed modules. ect.hrl always invokes classtrans, which passes the module immediately to cctrans or cctrans0 if it is a client module. If the module is a class module, then classtrans first deals with inheritance (it creates the method stubs and the record definition), and creates the .hrl file. After this, the transformed module has the structure of a normal Erlang module, but uses classes in its functions. In other words it is a client module, so it is then transformed by either cctrans or cctrans0.
4.6 Client transformation
The client transformation has two equally important requirements. First, it must reach every expression in the AST of the transformed module, even in complex places such as a case construct in a fun in an if construct in a function. Second, the transformation is not context-free, so there must be state variables that are carried through the transformation. (Simple examples: the class name, the record definitions found so far, a counter guaranteeing that no two generated variables have the same name.)

The first requirement can be satisfied by taking erl_id_trans.erl from the Erlang distribution and replacing the necessary function clauses with our custom processing. (erl_id_trans.erl is a dummy parse transformation, which traverses the tree and returns it unchanged.) This is how I created cctrans0.erl, because its subset of functionality does not require state variables.

For the second requirement, a new parameter called State must be maintained through the traversal; it can be a record with the state variables as fields. To maintain its value, it must be appended to the argument list of all clauses, and every return value must contain its updated instance. For this, return values are turned into 2-sized tuples, which contain the State in the first place and the original return value in the second place. This is how I first implemented cctrans.erl. The main problem with this is the lack of flexibility and readability, because clauses responsible for traversing the normal Erlang AST and clauses responsible for transforming our modified-semantics AST are mixed in one module. Also, this module must be edited whenever the Erlang AST specification changes, since that implies changes in erl_id_trans.erl.

My idea to improve the situation was to use the Template method pattern [6] with our new OO syntax. erl_id_trans.erl is turned into a class in idtrans.erl, with no fields.
Now the new version of cctrans.erl can be a subclass of idtrans.erl, and I can define new fields in it, which are passed through in the first argument of every function and clause. To change the behaviour of a certain clause in the identity transformation there is no need to rewrite it: it can be overridden as a virtual method in cctrans.erl. When something in the official erl_id_trans.erl changes, there is a good chance that only idtrans.erl needs to be updated, and cctrans.erl can be kept unmodified. The transformed source code for this will be very similar to the original cctrans.erl. This way I separated the aspects of the transformation that are specific to the transformation logic from those specific to the Erlang tree traversal. This can also be a good general pattern for writing parse transformations. It could also be extended with a third class between idtrans and cctrans, which provides general, domain-independent
services for parse transformations, for example unique variable name generation, error handling, etc. I did not implement this, because the bootstrap compilation with cctrans0 would then be more difficult.
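The state-threading style described above, with the State in the first place of every return value, can be illustrated on a toy traversal. The sketch below is not taken from cctrans.erl: the record definition, the function names and the rule that replaces the atom hole with a fresh variable are hypothetical, and only three node shapes of the abstract format are handled.

```erlang
%% Toy traversal that threads a State record through an AST fragment.
%% ASSUMPTION: the record, the function names and the rewriting rule
%% (atom 'hole' becomes a fresh variable) are made up for illustration.
-module(state_sketch).
-export([expr/2, demo/0]).

-record(state, {counter = 0}).

%% Every clause returns {State, NewNode}: State first, result second.
expr({atom, Line, hole}, State = #state{counter = N}) ->
    Name = list_to_atom("_CCTRANS_" ++ integer_to_list(N + 1)),
    {State#state{counter = N + 1}, {var, Line, Name}};
expr({tuple, Line, Elements}, State0) ->
    {State1, NewElements} = exprs(Elements, State0),
    {State1, {tuple, Line, NewElements}};
expr(Other, State) ->
    %% Identity behaviour for every other node.
    {State, Other}.

%% Transforms a list of nodes left to right, passing the state along.
exprs(Nodes, State0) ->
    {State1, Reversed} =
        lists:foldl(fun(Node, {S, Acc}) ->
                            {S1, NewNode} = expr(Node, S),
                            {S1, [NewNode | Acc]}
                    end, {State0, []}, Nodes),
    {State1, lists:reverse(Reversed)}.

demo() ->
    Tree = {tuple, 1, [{atom, 1, hole},
                       {integer, 1, 42},
                       {tuple, 1, [{atom, 1, hole}]}]},
    {#state{counter = 2},
     {tuple, 1, [{var, 1, '_CCTRANS_1'},
                 {integer, 1, 42},
                 {tuple, 1, [{var, 1, '_CCTRANS_2'}]}]}} = expr(Tree, #state{}),
    ok.
```

Even in this small example the bookkeeping for State appears in every clause, which is the readability problem that the class-based version avoids by keeping the state in the fields of the object.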
Performance analysis
In this section I describe the performance tests I made on our OO system for Erlang and on those of others. In the first subsection, I describe what properties I measured. In the second subsection, I describe how I did the measurements. In the third subsection, I present the results.
5.1 Measured properties
I measured and compared the speed of object operations in the following systems: ECT, Remote ECT, eXAT, Parameterized Modules, Wooper. I also measured the corresponding non-OO operations of Erlang: records, static function calls module:function(), dynamic function calls Module:function() and funs. The measured operations were method calls, field queries and field modifications. The term field modification means different things for different systems. For ECT and Erlang records it means making a copy of an object with some fields changed. For Remote ECT, eXAT and Wooper, it means destructively updating a field. I examined how these speeds depend on the depth of the inheritance chain. I created the following class hierarchy for each system, and measured method call times on an instance of alpha, beta, gamma, etc. (The body of each method is just the atom ok, which it returns as a result.)
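[Figure: the class hierarchy used for the measurements. Class alpha has the fields field1, field2, field3, field4 and the methods method1(), method2(), method3(), method4(); the further classes are beta, gamma, delta, epsilon and zeta.]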
This way I could examine how the distance between the method implementation and the object instance in the inheritance chain affects the time of a call. I set the following parameters to be constant:

  The number of defined fields: 4.
  The number of defined methods: 4.
  The number of accessed fields per operation: 1.

Records and ECT support pattern matching, and more than one field extraction can appear in a pattern-match. An update expression for records can also update more than one field. I chose to measure one-field operations for two reasons. First, for comparison with systems that support only one. Second, because in theory this is the worst-case scenario, since each field operation might trigger a type check. With the use of multiple fields, the time of the type check would be spread among them.
5.2 Methods of measurement
A method call or a eld read is a rather short operation. Their times are usually in the range of nanoseconds or microseconds. This is such a short time that operating-system interrupts can signicantly inuence it. The timer function for Erlang also works in the range of microseconds. To measure such a short time, the operation should be run several times, and the total time can be measured. There are two approaches for this. Calculate the exact time of a single operation, by running it repeatedly. The compiler might recognise this repeated operation and optimise it. This should be avoided, because the goal is to measure complete operations. The results show the developers of these object-oriented systems where can they improve performance further. They also help software developers who use these systems for time-critical applications. Use complex benchmark-algorithms, which use similar patterns to real-world applications. Real world applications benet from compiler optimisations, so they should not be disabled. The results can guide software developers to decide which system to use. To implement the second option, more research is needed to nd suitable algorithms. My choice for now is the rst option: I ran the measured operations N times, and took the average by dividing the total time by N . When measuring something, the accuracy of the results is a question. To get some picture about it, I divide the N operations into k groups of the size Ops = N k . Let the result of the ith group is Mi , so the k 1 M . I take the following value to characterise the error: overall average will be Avg = k i i=1 Err = 100 max1ik |A Mi | % Avg
In words, this is the maximum absolute difference from the average, expressed as a percentage of the average. I chose k = 10 and used this metric as an indicator to see if something went wrong.

The next question was how to run N operations with the least overhead from the benchmark system, while also avoiding compiler optimisations (footnote 23). I found three possibilities:

1. Create a function that contains N operations sequentially.
2. Create a loop that runs k iterations and contains N/k operations.
3. Create a loop that runs N iterations and contains 1 operation.

The first choice is tempting, because it eliminates the overhead of the loop. However, the source and binary files of the test program became unmanageable. For example, the call time for an Erlang function was 100 ns, so to run the test for a second I needed 10,000,000 calls. I used a parse transformation (test3/mhelper.erl) to multiply operations in the source code. During compilation the Erlang virtual machine exited with an insufficient-memory error while trying to allocate 3 GB. Perhaps more efficient solutions than a parse transform exist.

The second choice is a compromise: it uses a loop, but reduces the impact of the loop administration by multiplying the operation. The problem here was that the Erlang compiler optimised repeated uses of record syntax on the same variable (footnote 24).

The third choice is the one I implemented, with an addition: I also ran the loop without the measured operation in it, and subtracted this time from the measured one. I implemented both loops as tail-recursive functions. To ensure that exactly the overhead is subtracted, I checked that their assembly code is the same except for the operation. I found that the jump instruction at the end differs depending on whether the function is otherwise empty: call_only for empty and call_last for non-empty. Their performance also differed. To rectify this, I put the operation into a separate function and created an empty dummy pair for it. I ran the tail recursion both with calling the operation function and with calling the dummy function, and subtracted the results.

To further reduce uncertainties, I took the following steps: I disabled the second core of the processor (footnote 25) and switched Linux to single-user mode (runlevel 1). The latter means that no graphical user interface is loaded, and that network support and most of the daemons are turned off.

I created a simple framework to run the same measurement for all systems. It is a header file (footnote 26) that contains the functions common to all systems. These include the average calculation, the tail-recursive loops mentioned above, and exporting the results to LaTeX code. The computer I used for measuring had an Athlon64 X2 4200+ CPU running at 2210 MHz, with the second core turned off.

Footnote 23: I used the beam compiler of Erlang.
Footnote 24: This can be inspected with the use of the S compiler switch: it causes the compiler to create human-readable assembly code instead of binary.
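The measurement procedure described above can be sketched roughly as follows. This is a simplified illustration, not the framework in test3/measure.hrl: the module name bench_sketch and all function names are hypothetical, and the operation is passed as a fun, whereas the measurements in this paper used separate named operation and dummy functions so that the loops compile to identical code.

```erlang
-module(bench_sketch).
-export([measure/4, error_percent/1, loop/2]).

%% Tail-recursive loop that calls the zero-argument fun F exactly N times.
loop(_F, 0) -> ok;
loop(F, N) when N > 0 ->
    F(),
    loop(F, N - 1).

%% Net time of one operation in microseconds for one group of Ops calls:
%% time of the loop with the operation minus time of the loop with the dummy.
group_time(Op, Dummy, Ops) ->
    {TOp, ok} = timer:tc(?MODULE, loop, [Op, Ops]),
    {TDummy, ok} = timer:tc(?MODULE, loop, [Dummy, Ops]),
    (TOp - TDummy) / Ops.

%% Run K groups of Ops calls each and return the group results M_1..M_K.
measure(Op, Dummy, K, Ops) ->
    [group_time(Op, Dummy, Ops) || _ <- lists:seq(1, K)].

%% Returns {Avg, Err}: the mean of the group results and the maximum
%% absolute deviation from the mean, as a percentage of the mean.
%% Assumes a non-zero mean.
error_percent(Ms) ->
    Avg = lists:sum(Ms) / length(Ms),
    Err = 100 * lists:max([abs(Avg - M) || M <- Ms]) / Avg,
    {Avg, Err}.
```

A call such as bench_sketch:error_percent(bench_sketch:measure(Op, Dummy, 10, 100000)) would then correspond to the choice k = 10, with Op and Dummy being the operation and its empty counterpart.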
5.3 Results
The following table summarises the results of the performance tests. Each unindented row names a system and begins its section. Each indented row represents a measured operation of the system it belongs to. Columns denote the class whose instance was tested. All times are in microseconds (µs).

System / operation               alpha     beta    gamma    delta  epsilon     zeta
Erlang static function call
  Call time (µs)                 0.010     n.a.     n.a.     n.a.     n.a.     n.a.
Erlang dynamic function call
  Call time (µs)                 0.088     n.a.     n.a.     n.a.     n.a.     n.a.
Erlang fun call
  Call time (µs)                 0.042     n.a.     n.a.     n.a.     n.a.     n.a.
Erlang records
  Read time (µs)                 0.026     n.a.     n.a.     n.a.     n.a.     n.a.
  Write time (µs)                0.102     n.a.     n.a.     n.a.     n.a.     n.a.
Parameterized modules
  Call time (µs)                 0.090    4.283    8.451   12.626   16.773   20.848
ECT classes
  Call time (µs)                 0.094    0.105    0.104    0.104    0.104    0.104
  Read time (µs)                 0.047    0.047    0.047    0.047    0.047    0.047
  Write time (µs)                0.107    0.107    0.107    0.107    0.107    0.109
ECT remote classes
  Call time (µs)                 0.839    0.885    0.884    0.889    0.891    0.879
  Read time (µs)                 0.715    0.702    0.702    0.702    0.701    0.698
  Write time (µs)                0.436    0.436    0.437    0.439    0.444    0.440
eXAT classes
  Call time (µs)                 1.394    7.301   13.254   19.635   25.453   31.533
  Read time (µs)                 1.759    1.761    1.768    1.767    1.766    1.769
  Write time (µs)                2.736    2.750    2.759    2.747    2.740    2.739
Wooper classes
  Call time (µs)                 1.915    1.977    1.968    1.968    1.971    1.979
  Read time (µs)                 2.342    2.354    2.337    2.337    2.335    2.342
  Write time (µs)                2.012    2.017    2.015    2.014    2.016    2.012
In the following subsections I highlight some facts from the above table.

5.3.1 Dependence on the depth of class hierarchy
Field manipulation times are near-constant functions of the depth in all systems. Method calls for ECT, Remote ECT and Wooper are near-constant functions of the depth. Method calls for eXAT and Parameterized Modules are near-linear functions of the depth.
Footnote 25: I added the nosmp kernel parameter at boot time.
Footnote 26: test3/measure.hrl
5.3.2 ECT versus Erlang
Method calls of ECT are about 10 times slower than statically bound, inter-module Erlang function calls. However, compared to the dynamically bound Erlang calling conventions, the picture is better: they are only 2.5 times slower than funs and 1.2 times slower than dynamic calls. Field read is about 2 times slower than record field read. Field update is less than 1.1 times slower than record field update.

5.3.3 ECT versus other OOP
To compare ECT with other OOP systems, I consider object-as-process systems separately: Remote ECT outperforms both eXAT and Wooper. The narrowest difference is between Remote ECT and eXAT, when comparing the times of method calls with no inheritance: Remote ECT is 1.6 times faster. The only non-process OOP system I measured was Parameterized Modules. I did not measure field access times, because in the section on Parameterized Modules I did not find field handling fully functional. The results for method calls are the following. When the called function is not inherited, Parameterized Modules are faster by a factor of about 1.04. When the called function is inherited, the speed of Parameterized Modules decreases linearly with the depth. In this case ECT is at least 40 times faster.
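The depth dependence and the ratios quoted in the subsections above can be re-derived from the results table with a few lines of arithmetic. The sketch below is only an illustration: the module and function names are hypothetical, and the choice of table cells for each ratio (for example the beta column for inherited ECT calls) is my reading of the text.

```erlang
-module(results_sketch).
-export([increments/1, ratio/2, demo/0]).

%% Differences between consecutive measurements; a near-constant list of
%% increments indicates near-linear growth with hierarchy depth.
increments([A, B | Rest]) -> [B - A | increments([B | Rest])];
increments(_) -> [].

%% How many times larger Time is than Reference.
ratio(Time, Reference) -> Time / Reference.

demo() ->
    %% Call times (us) for classes alpha..zeta, copied from the table.
    PM   = [0.090, 4.283, 8.451, 12.626, 16.773, 20.848],
    EXAT = [1.394, 7.301, 13.254, 19.635, 25.453, 31.533],
    [{pm_increments, increments(PM)},
     {exat_increments, increments(EXAT)},
     {ect_call_vs_static_call, ratio(0.105, 0.010)},
     {ect_call_vs_fun_call, ratio(0.105, 0.042)},
     {ect_call_vs_dynamic_call, ratio(0.105, 0.088)},
     {ect_read_vs_record_read, ratio(0.047, 0.026)},
     {ect_write_vs_record_write, ratio(0.107, 0.102)},
     {exat_call_vs_remote_ect_call, ratio(1.394, 0.839)},
     {ect_call_vs_pm_call_not_inherited, ratio(0.094, 0.090)},
     {pm_call_vs_ect_call_inherited, ratio(4.283, 0.105)}].
```

Calling results_sketch:demo() in the shell lists the per-level increments and the ratios side by side, which makes it easy to compare them with the rounded figures in the text.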
6 Conclusions and future work
In my project, I designed, implemented and measured a working object-oriented extension for Erlang. For the new syntax and conceptual elements I added to the language, the advice of András György Békés guided me. I also relied on the similar work of others and on my general knowledge of object-oriented languages such as Java, C++ and Oberon. ECT, the extension I created, satisfies the goals I stated in section 3.1. The new language elements provide extensible objects that encapsulate data and functionality. Fields and methods can be inherited. The newly created objects are compatible with records, which implies that they can be used in pattern matches and that they follow the single-assignment nature of Erlang. ECT can actually be used: I used the new language elements to implement the client-transformation part of the system itself. I also studied the similar work of others and measured the performance differences. To compare ECT with OOP systems that store objects in processes, I added a similar feature to ECT, which I named Remote ECT. When ECT is compared to similar non-OOP Erlang constructs, its slowdown is at most a factor of 2.5. However, when compared to the other OOP systems for Erlang, it outperforms them in all situations by a factor of at least 1.6, with one exception: in a very special case ECT and Parameterized Modules perform virtually equally. The source code of ECT can be downloaded from http://code.google.com/p/ect/.
6.1 Future work
The next step is to present ECT to the Erlang community. Before that, I plan to complete the following tasks:

- Generating proper error messages when invalid source code is parsed.
- Sorting out the issue of side-effect variables, described in section 3.10.1.
- Implementing is_object(Object, Class).
- Improving the currently available ad-hoc automated tests.

In the rest of my essay, I show some more possibilities for improvement.

6.1.1 Unique variable generation
I showed that my transformation sometimes needs to create new variables in the generated source code. For now, these are named _CCTRANS_NNN, where NNN is a number that increases within each module. Currently, ECT does not detect whether the programmer uses such variables. An improvement would be to do this check, and to choose variable names that are guaranteed to be unused.

6.1.2 Integration with Erlang behaviours
The idea of using behaviours as interfaces in conjunction with ECT comes naturally. It is indeed possible, thanks to the following properties of the transformation: (1) all methods are transformed into exported functions with the same arity as the programmer wrote them, and (2) unknown attributes of a class module are passed through the transformation. Even inherited methods can be functions of the behaviour. However, there is currently a serious limitation: subclasses do not inherit the behaviours of their base classes. To fix this, the transformation should copy the -behaviour attributes of the superclass to the current class. One way to achieve this would be to store behaviour information in the generated header files of classes. This should be in a suitable format that does not interfere when included into other modules. Something like this:

    -classbehaviours(classname, [behaviour1, behaviour2, ...]).
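The copying step itself can be illustrated on abstract forms, the representation a parse transformation works with. The following is a minimal sketch under a strong assumption: it supposes that the forms of the superclass are somehow available to the transformation, which is exactly what the header-file idea above is meant to provide. The module and function names are hypothetical and not part of ECT, and no attempt is made to remove duplicates.

```erlang
-module(behaviour_sketch).
-export([inherit_behaviours/2]).

%% SuperForms and Forms are lists of abstract forms. Every -behaviour
%% (or -behavior) attribute of the superclass is copied into Forms,
%% directly after the -module attribute.
inherit_behaviours(SuperForms, Forms) ->
    Inherited = [F || {attribute, _, Name, _} = F <- SuperForms,
                      Name =:= behaviour orelse Name =:= behavior],
    insert_after_module(Forms, Inherited).

insert_after_module([{attribute, _, module, _} = M | Rest], Extra) ->
    [M | Extra ++ Rest];
insert_after_module([F | Rest], Extra) ->
    [F | insert_after_module(Rest, Extra)];
insert_after_module([], Extra) ->
    Extra.
```

Working on forms keeps the sketch independent of how the behaviour list is stored; with the header-based format suggested above, the list of inherited attributes would be built from the -classbehaviours attribute instead.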
6.1.3 Use of upcoming Erlang features
There is a proposal [8] for the Erlang language to allow binding variables in guards. If it is implemented, the transformation can be simplified: no separate value extraction will be needed at the beginning of clauses, and there will be no need for equality tests of these values.

Currently two of the class attributes are implemented by macros: ?METHODS and ?FIELDS. This is because what can be written in an attribute of the form -attribute(args). is restricted. Name/arity pairs are not allowed, which makes -methods([method1/3, method2/4, ...]). impossible. Equal signs between terms are not allowed either, which makes it impossible to implement -fields({field1, field2, field3 = DefaultValue}). There is a proposal [9] for Erlang that addresses the case of methods, and it will be included in the next release of Erlang/OTP.

6.1.4 Better integration with the compiler
One possibility is to modify the Erlang compiler to generate code directly from our syntax. However, records, which are similar to our classes, are also implemented by a preprocessor transformation. Another possibility is to propose new elements for Erlang that make code generation for classes easier:

Support Object:method()-style calls
    The language already supports Object:method()-style calls for parameterised modules. This syntax is similar, and possibly ECT should be switched to it. However, the semantics differ, because there the instance variable is passed as the last argument and not as the first, as in ECT. A possible option would be to extend these calls to detect whether Object is an ECT object, and to call its methods suitably. The impact on speed is a concern and should be analysed.

Better support for class-patterns and class-expressions
    Implement the is_object(Object, Class) BIF natively. The following concept would make class matching very simple. (I have not examined its feasibility.) Let us consider the following tuple pattern:

        {SubPattern1, SubPattern2, ..., SubPatternN, *}

    It matches any tuple that has at least N elements, provided that each SubPattern matches the corresponding element. For example:

        {A, _, B, ok, *} = {1, 2, 3, ok, 4, 5, 6, 7, 8, 9},   % this match succeeds
        A = 1, B = 3                                           % the results of the match
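The `*` element is only a proposal and is not valid Erlang. Its intended meaning, matching on the first N elements of a longer tuple, can be emulated today with BIFs, as in the following sketch. The module and function names are hypothetical, and the emulation is a function call rather than a pattern, so it cannot be used in function heads.

```erlang
-module(prefix_match_sketch).
-export([prefix/2]).

%% Returns {ok, Elements} holding the first N elements of Tuple when the
%% tuple has at least N elements, and nomatch otherwise.
prefix(N, Tuple) when is_tuple(Tuple) andalso tuple_size(Tuple) >= N ->
    {ok, [element(I, Tuple) || I <- lists:seq(1, N)]};
prefix(_, _) ->
    nomatch.
```

With this helper, {ok, [A, _, B, ok]} = prefix_match_sketch:prefix(4, {1, 2, 3, ok, 4, 5, 6, 7, 8, 9}) binds A and B like the example above. The proposed `*` syntax would express the same match directly as a pattern.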
This would allow the following transformation of class-patterns (footnote 27):

    #class2{b = 4, field2 = Z} = Obj

to

    {class2, {_, class2, *}, _, 4, _, _, Z, _, *}

6.1.5 New syntax for method-heads
First, it is possible to insert the first argument, which is the This object, automatically for methods, as is done for Parameterized Modules. This might seem convenient, but the problem is that pattern matching on This in the head would be impossible. In Oberon-2 [7], there is a separate syntax for functions that are methods:

    PROCEDURE (VAR obj: Object) MethodName(param1: Type1, ...)

The this-pointer obj is defined separately from the other arguments. In ECT, this would look like the following:
Footnote 27: class1 and class2 are defined in sections 3.3 and 3.5.
    (This)method(Arg1, Arg2, Arg3) -> ...
    % or
    method(This)(Arg1, Arg2, Arg3) -> ...

The parser would have to be modified to allow these. The intention behind this concept is to separate the This instance from the other arguments. It would also make it possible to mark methods directly, instead of the currently used marking:

    ?METHODS([method1/N, method2/N, ...]).

6.1.6 Restricted uses
There are possible restricted uses of ECT that might be useful by themselves.

Method inheritance only
    There are no fields: the macro attribute ?FIELDS({field1, field2, ...}) is empty for all classes. If this is the case, ECT can be simplified: instances do not need to carry fields, so the form {classname, {superclass1, superclass2, ... classname}} is sufficient. If there is no need for type checks, classname on its own is sufficient. In this case, there is no need to transform method calls: they can be invoked either directly,

        modulename:methodname(Args)

    or indirectly:

        M = modulename, M:methodname(Args)

Field inheritance only
    There are no methods: the macro attribute ?METHODS([...]) is empty for all classes. In this case, classes become extensible records. The syntax could be further simplified by not binding classes to modules, and allowing more than one class in a module:

        ?CLASS(name, superclass, {fields}).                % in general
        ?CLASS(person, nil, {name, birth, favcolour}).     % an example with no superclass
        ?CLASS(employee, person, {salary, status}).        % an example with superclass

    Only record definitions and class-level attributes would be generated.

Use of ECT without transformation
    A client module can be written without the use of this transformation. This might be important when a programmer has the beam and header files of classes created by ECT, but does not have ECT installed. In this case the programmer should be aware of the semantics described in section 3, and the use of classes is restricted:

    - Methods should be called in the form (element(1, Object)):method(Object, Args). (Possibly using a macro.)
    - Fields of an object cannot be extracted with a pattern match unless the exact class of that object is known. The same is true for updating fields. However, the alternative way of field extraction, Object#class.fieldname, can be used. For this, type checking for records must be turned off with the compiler switch no_strict_record_tests.
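The call form (element(1, Object)):method(Object, Args) can be wrapped in a macro, as hinted above. The sketch below shows the idea on a deliberately simplified object: the module circle, the macro CALL and the tuple layout {circle, Radius} are all hypothetical, and real ECT objects carry more information than this. The only property relied on is that the first element of the object names the module that implements its class.

```erlang
-module(circle).
-export([new/1, area/1, demo/0]).

%% Dispatch a one-argument method call on the module named by the first
%% element of the object (hypothetical helper, not part of ECT).
-define(CALL(Object, Method), (element(1, Object)):Method(Object)).

%% Simplified stand-in for an object: {ClassModule, Radius}.
new(Radius) -> {circle, Radius}.

area({circle, Radius}) -> math:pi() * Radius * Radius.

demo() ->
    Obj = new(2.0),
    %% Expands to (element(1, Obj)):area(Obj), i.e. circle:area(Obj).
    ?CALL(Obj, area).
```

Because the module is looked up at run time, the same client code keeps working when Obj belongs to another class module that exports area/1.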
Acknowledgements
I would like to thank András György Békés and Dr. Péter Szeredi for their ideas, advice and corrections. Without their work and support I would not have been able to complete this paper.
References
[1] J. Armstrong. Making reliable distributed systems in the presence of software errors. PhD thesis, Royal Institute of Technology, Swedish Institute of Computer Science (SICS), Stockholm, December 2003. http://www.erlang.org/download/armstrong_thesis_2003.pdf.
[2] J. Armstrong. Programming Erlang: Software for a Concurrent World. The Pragmatic Bookshelf, 2007.
[3] O. Boudeville. WOOPER: Wrapper for OOP in Erlang. http://ceylan.sourceforge.net/main/documentation/wooper/.
[4] R. Carlsson. Parameterized modules in Erlang. 2nd ACM SIGPLAN Erlang Workshop, August 2003. http://www.erlang.se/workshop/2003/paper/p29-carlsson.pdf.
[5] R. Carlsson. Inheritance in Erlang. Erlang/OTP User Conference, November 2007. http://www.erlang.se/euc/07/papers/1700Carlsson.pdf (slides only).
[6] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns. Addison-Wesley, 1995.
[7] H. Mössenböck. Object-Oriented Programming in Oberon-2, chapter Appendix A: Oberon-2 Language Definition. Springer-Verlag, 1993.
[8] R. A. O'Keefe. EEP14: Guard clarification and extension. http://www.erlang.org/eeps/eep-0014.html.
[9] R. A. O'Keefe. EEP24: Functions may be named using F/N in all module attributes. http://www.erlang.org/eeps/eep-0024.html.
[10] C. Santoro. Erlang eXperimental Agent Tool. Reference Manual. http://www.diit.unict.it/users/csanto/exat/exat.ps.
[11] A. D. Stefano and C. Santoro. Designing collaborative agents with eXAT. 2nd International Workshop on Agent-based Computing for Enterprise Collaboration (ACEC 2004) at WETICE 2004, Modena, Italy, June 14-16, 2004. http://www.diit.unict.it/users/csanto/exat/ACEC-Santoro-eXAT.ps.
[12] Erlang Efficiency Guide. http://www.erlang.org/doc/efficiency_guide/part_frame.html.
[13] Erlang/OTP R12B online manual. http://www.erlang.org/doc/man/.
[14] eXAT download site. http://www.diit.unict.it/users/csanto/exat/download.html.
A Overview of Erlang
This appendix gives a brief overview of the subset of Erlang needed for this paper. More can be found in section 3 of [1] or in [2]. Erlang is a functional language with strict dynamic type checking: type checks are done at run time and no automatic data type conversions are performed (footnote 28).
A.1 Functions
An Erlang program is divided into modules. Each module resides in its own file, which has the extension erl. For example erldemo1.erl:
     1  % erlang demonstration program 1
     2  -module(erldemo1).
     3  -export([add/2, quadratic_equation1/3, quadratic_equation2/3]).
     4
     5  add(X, Y) ->
     6      X + Y.
     7
     8  discriminant(A, B, C) ->
     9      B*B - 4*A*C.
    10
    11  quadratic_equation1(A, B, C) ->
    12      D = discriminant(A, B, C),
    13      (-B - math:sqrt(D)) / (2 * A).
    14
    15  quadratic_equation2(A, B, C) ->
    16      D = discriminant(A, B, C),
    17      (-B + math:sqrt(D)) / (2 * A).

1. Line 1 is a comment.
2. Line 2 states the name of the module. It must be the same as the name of the file without the erl extension.
3. Line 3 enumerates the exported functions: only these can be called from other modules. The functions appear in the form Name/Arity, where arity is the number of arguments. Functions with the same name but different arity are considered different functions.
4. Line 5 is the head of function add/2; it lists its arguments.
5. Line 6 is a single expression representing the body of add/2. This expression gives the return value of the function.
6. Lines 8-9 contain a similar function definition.
7. Lines 11-13 are a function that calculates the first solution of a quadratic equation. Its body consists of two expressions. When the body of a function contains more than one expression, they are evaluated sequentially in order. The return value of the function is the value of the last expression.
8. Line 12 shows an example of a function call within the same module.
9. Line 13 shows an example of a remote function call into another module (the sqrt function of the math module). The module name is always separated from the function name by a colon (:).

To compile and run Erlang programs, we use the Erlang shell.
Footnote 28: Except for int to float.
     1  user@linux:~/erlang> erl
     2  Erlang (BEAM) emulator version 5.6.4 [source] [64-bit] [smp:2]
     3  [async-threads:0] [hipe] [kernel-poll:false]
     4
     5  Eshell V5.6.4 (abort with ^G)
     6  1> c(erldemo1).
     7  {ok,erldemo1}
     8  2> A = 3.
     9  3
    10  3> B = 4.
    11  4
    12  4> C = erldemo1:add(A, B).
    13  7
    14  5> C.
    15  7
    16  6> erldemo1:discriminant(1, 2, 1).
    17  ** exception error: undefined function erldemo1:discriminant/3
    18  7> erldemo1:quadratic_equation1(2, -22, 60).
    19  5.0
    20  8> erldemo1:quadratic_equation1(1, 1, 1).
    21  ** exception error: bad argument in an arithmetic expression
    22       in function math:sqrt/1
    23          called as math:sqrt(-3)
    24       in call from erldemo1:quadratic_equation1/3
    25  9>

1. Line 1 starts the Erlang shell from a Linux shell.
2. Line 6 compiles our module.
3. Lines 8-11 show how to assign variables in this shell.
4. Line 12 invokes function add; the next line is the return value.
5. Line 14 shows how to query the value of a variable.
6. Line 16 tries to invoke function discriminant, but the error message in the next line shows that this is not possible. (This function was not exported.)
7. Line 18 invokes function quadratic_equation1 and the next line shows the result.
8. Lines 21-24 show an exception generated in math:sqrt/1 because of a negative argument.
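The statement that functions with the same name but different arity are different functions can be tried out with a small module. The module name arity_demo and the function area are illustrative only.

```erlang
-module(arity_demo).
-export([area/1, area/2]).

%% area/1 and area/2 share a name, but they are two separate functions
%% and are exported separately.
area(Side) ->
    area(Side, Side).

area(Width, Height) ->
    Width * Height.
```

After c(arity_demo). in the shell, arity_demo:area(3) calls the one-argument function, which in turn calls the two-argument one.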
A.2 Function clauses and guards
Functions can have guards in their heads. As a first approximation, a guard is a logical expression on the arguments of the function, which must be satisfied before the body is evaluated. (For the following example, this definition is sufficient. For more complex examples, guards are defined properly in section A.12.4.) If it is not satisfied, an exception is thrown, and the process is terminated unless the exception is caught. However, this is only the case when the function has a single clause. A function can have several clauses, which have the same name and arity but different guards. In this case, the first clause with a satisfied guard is always the one executed. For example:
    1  quadratic_equation1(A, B, C) when B*B - 4*A*C >= 0 andalso A =/= 0 ->
    2      D = discriminant(A, B, C),
    3      (-B - math:sqrt(D)) / (2 * A);
    4  quadratic_equation1(A, B, C) when A =:= 0 andalso B =/= 0 ->
    5      -C / B;
    6  quadratic_equation1(_, _, _) ->
    7      {error, no_solution}.    % a custom error message, or similar

1. Lines 1-3 represent the first clause, which is executed when the discriminant is non-negative and A is nonzero. Guards can contain not only arithmetic expressions and literals, but also calls to a subset of the built-in Erlang functions (BIFs). Arbitrary function calls are forbidden; this is the reason why discriminant/3 is not used in the guard.
2. Lines 4-5 represent the second clause: it is executed when the guard of the first clause is not satisfied, A is zero, and B is nonzero. (This is the case of a linear equation with exactly one solution.)
3. Lines 6-7 represent the last clause: it has no guard, therefore it is executed whenever the guards of the previous two are not satisfied. If a variable is defined but not used later, the compiler issues a warning. The underscores in place of argument names indicate that the function does not use these arguments, and thus avoid the warnings.

A.2.1 Built-in functions
Built-in functions (BIFs) are functions that are built into Erlang. They usually do tasks that are impossible to implement in Erlang. Most of them are not compiled into a function call; instead, inline code is generated in place of the call.
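As a small illustration of BIFs inside guards, the following sketch uses the type-test BIFs is_integer/1 and is_tuple/1 and the BIF tuple_size/1, all of which are allowed in guards. The module and function names are made up for this example.

```erlang
-module(guard_demo).
-export([describe/1]).

%% Clauses are tried from top to bottom; the first clause whose guard is
%% satisfied is the one executed.
describe(X) when is_integer(X) andalso X >= 0 ->
    non_negative_integer;
describe(X) when is_integer(X) ->
    negative_integer;
describe(X) when is_tuple(X) andalso tuple_size(X) =:= 2 ->
    pair;
describe(_) ->
    something_else.
```

The second clause does not need to test X < 0 explicitly: it is only reached when the guard of the first clause has already failed.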
A.3 Variables
Variable names in Erlang by definition begin with an uppercase letter (footnote 29). There are no global variables: the scope of a variable is the lexical unit in which it appears. Even variables with the same name but in different clauses of a function are considered different (footnote 30). Variables are immutable. In other words, once a value is assigned to a variable, it cannot be changed any more. The term used for assigned is bound, and for unassigned it is unbound. To simulate the destructive value assignments of imperative languages, one has to create a new variable each time the value is updated:

    in Java              in Erlang
    int a = 1;           A = 1,
    int b = 1;           B = 1,
    a = a + 1;           A2 = A + 1,
    b = b + 1;           B2 = B + 1,
    b = b * a;           B3 = A2 * B2,
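What happens when a bound variable is matched against a different value can be observed with a short sketch. The module and function names are illustrative; the helper try_rebind/2 only exists to catch the resulting error so that it can be returned as a value.

```erlang
-module(single_assignment).
-export([demo/0]).

demo() ->
    A = 1,
    A2 = A + 1,                    % an "update" binds a fresh variable
    Result = try_rebind(A, A2),    % A is already bound to 1, A2 is 2
    {A, A2, Result}.

%% Bound is already bound here, so Bound = Other is a comparison rather
%% than an assignment; it raises a badmatch error when the values differ.
try_rebind(Bound, Other) ->
    try Bound = Other of
        _ -> matched
    catch
        error:{badmatch, _} -> badmatch
    end.
```

The value of A is still 1 after the failed match, which is the immutability described above.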
A.4 Tail-recursion optimisation
A similar issue is that it is impossible to write loops. There is no loop construct in Erlang. Even if there were one, loop variables could not change their values once they are bound. In Erlang, as in other functional languages, recursive functions play the role of loops. See the first two columns of the following example:
Footnote 29: Or an underscore, which turns off unused variable warnings.
Footnote 30: I discuss the scope of variables in more detail in section 3.12.4.
    in Java                              in Erlang                  in Erlang, tail recursive
    public static int fact(int N) {      fact(N) when N =< 0 ->     fact0(N, R) when N =< 0 ->
        int res = 1;                         1;                         R;
        while (N > 0) {                  fact(N) when N > 0 ->      fact0(N, R) when N > 0 ->
            res = res * N;                   N * fact(N-1).             fact0(N-1, N * R).
            N = N - 1;
        }                                                           fact(N) ->
        return res;                                                     fact0(N, 1).
    }
This recursive function has two clauses. The second clause is evaluated when N is positive, and calls the function recursively with the reduced problem. When zero is reached, the recursion terminates, and the multiplication is calculated backwards from 1 to N. If a function call is the last expression of a function, more optimised machine code can be generated. Before such a call the local variables do not have to be saved on the stack, because they are not needed after returning from the call. This implies that a simple jump instruction is sufficient instead of a subroutine call. This is called tail-call optimisation, and the compiled program will be as efficient as loops in imperative languages. When the last call is a recursive call, it is called tail recursion. For interest's sake, the third column shows the tail-recursive version of fact.
A.5 Data types
So far in the above examples, we have only seen integer and real-valued variables. In this section I describe the three groups of data types: primitive data types, compound data types and syntactic-sugar data types.

A.5.1 Primitive data types
Erlang has eight primitive data types. A detailed description of the following five is not needed for this paper.

Integers: arbitrary-sized, exact integer numbers.
Floats: IEEE 754 floating-point numbers.
References: globally unique symbols.
Binaries: efficiently stored binary data.
Ports: ports are used to communicate with the outside world.

The following three are more important for us:

Atoms: atoms are unique, non-numeric constants represented by a string of characters. The first character has to be a lower-case letter, and it can be followed by alphanumeric characters, underscores (_) or at (@) signs. Optionally, if the string is surrounded by single quotes, it can contain any characters. In use, atoms are similar to enumerated types, but they differ from, for example, the Java implementation in that there is one single namespace for them. Examples of atoms: monday, tuesday, wednesday, 'January', '_types', etc.

Funs: funs are anonymous functions that can be assigned to variables. The simplest way to explain them is an example run of the Erlang shell:
     1  1> X = fun(A, B) -> A*A-B*B end.
     2  #Fun<erl_eval.12.113037538>
     3  2> X(5, 4).
     4  9

After being assigned a fun expression, X behaves like a function. The following example demonstrates that a fun can have several clauses, even with guards, like ordinary functions.

     5  3> Y = fun(Op, A, B) when Op =:= add ->
     6  3>             A+B;
     7  3>        (Op, A, B) when Op =:= mul ->
     8  3>             A*B
     9  3>     end.
    10  #Fun<erl_eval.18.105910772>
    11  4> Y(add, 3, 4).
    12  7
    13  5> Y(mul, 3, 4).
    14  12

Note that this fun expression spans several lines (5-9), but this is not a requirement, just a possibility.
Pids: pids are process identifiers, which point to lightweight Erlang processes. Messages can be sent to a process using its identifier. Processes do not have shared memory; they can only communicate via messages.

A.5.2 Compound data types
These data types are structures that can contain elements. The elements can have arbitrary (valid) data types. Manipulation of values of these types is detailed in the following sections.

Tuples: tuples are containers for a fixed number of elements. The elements of the tuple are separated by commas and enclosed in curly brackets. To create a tuple containing the elements E1, E2, ..., En, one should write {E1, E2, ..., En}. This is an expression, which can for example be assigned to a variable: Var = {E1, E2, ..., En}. The order of the elements is fixed. In other words, an element can be referenced by its position. Elements can be accessed in constant time, which makes tuples similar to arrays in imperative languages. But do not forget that tuples and their elements are immutable, like all values in Erlang. Concrete examples of tuples: {1, 2, 3} contains three integers; {example, 3.0, 4} contains an atom, a float and an integer, in this order. We call the number of elements of a tuple its size.

Lists: lists can contain a variable number of elements. Lists are also immutable, so having a variable number of elements may seem to be a paradox. The resolution is that variable length means there is an efficient operation for adding elements: it takes an element and a list, and results in a new list that has the element in the first place, followed by the elements of the original list. The tradeoff compared to tuples is that access to an arbitrary element costs time proportional to the position of the element. We create a list by enclosing its elements in square brackets and separating them with commas, e.g. List1 = [element1, 2, 3.0].

A.5.3 Syntactic sugar data types
These data types are not new, in the sense that they are just different syntactical representations of the above.

Records: records are similar to Pascal records or C structs. They can be defined in the following way:

    -record(person, {name, height, favcolour}).

This defines the record name person with three fields: name, height and favcolour. Records named person will have these fields. One of them can, for example, be created this way:

    #person{name = bob, height = 2.00, favcolour = green}

This shows that record names act like types, more precisely like subtypes of the type record. Records are represented internally by tuples that have one more element than the record has fields: the first element is the record name, and the following elements are the fields of the record, in the order of their definition. For example, the previous record looks like this: {person, bob, 2.00, green}.
A pre-compiler takes the record definitions and replaces name references with position references. This is why records are considered just syntactic sugar on tuples. To use record syntax on instances of a record type, its definition is required in the module. When more than one module uses the same record type, the definition is usually put into a separate file and included into each module.

Strings: strings are written as double-quoted sequences of characters, and are represented by lists of the ASCII codes of the characters.
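Both kinds of syntactic sugar can be checked directly by comparing the sugared and the plain notation. The following sketch reuses the person record from above; the module name sugar_demo is illustrative.

```erlang
-module(sugar_demo).
-export([demo/0]).

-record(person, {name, height, favcolour}).

demo() ->
    P = #person{name = bob, height = 2.0, favcolour = green},
    RecordIsTuple = (P =:= {person, bob, 2.0, green}),  % a record is a tagged tuple
    StringIsList = ("abc" =:= [97, 98, 99]),            % a string is a list of codes
    {RecordIsTuple, StringIsList, element(1, P)}.
```

The third element of the result shows that the record name is stored as the first element of the underlying tuple.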
A.6 Expressions
In Erlang, everything that can be evaluated to a value is an expression. Expressions cannot have unbound variables in them. Examples of expressions:

- hello: an atom.
- 5: an integer.
- 5+6: an operation on two integers.
- {a, b, {5, 5}}: a compound data structure.
- A+B: it cannot be decided from the syntax whether this is an expression or not. It depends on whether the variables A and B are bound. This is an expression if and only if A and B are bound.
- erldemo1:add(4, 5): a function call.
- A = 5: a value assignment. (The result is the assigned value.)

We have not yet defined the content of function clause bodies: they are comma-separated sequences of expressions, ending with a period. The expressions are evaluated sequentially, and the result of the last expression is the return value of the function. An argument passed to a function, or a line entered in the shell, is also an expression.
A.7 Patterns
A pattern can be defined recursively: a pattern is

1. either a value of a primitive data type,
2. or a (not necessarily bound) variable,
3. or a list of patterns,
4. or a tuple of patterns,
5. or a record of patterns.

Examples of patterns:

- hello: an atom, a primitive data type.
- 5: an integer.
- A: a variable.
- {person, bob, Height, FavColour}: a tuple of patterns, which are two atoms and two variables.
- #person{name = bob, height = Height, favcolour = FavColour}: a record of patterns. This is transformed to a tuple of patterns, since records are tuples.

Value assignments in Erlang can be generalised into pattern matches of the form
    Pattern = Expression

When a pattern match is evaluated, the unbound variables of the pattern (if any) are bound to the values that make the equation satisfied (footnote 31). In the simple case of

    UnboundVariable = Expression

where the pattern is an unbound variable, this assigns the value of the expression to UnboundVariable. When no suitable values exist for the unbound variables of the pattern, the pattern match fails, and an exception is thrown. Of course, when the pattern is a single unbound variable, the match never fails: the variable is bound to the value of the expression. But when, for example, the pattern is an integer and the result of the expression differs, the match always fails:

    5 = 6+7

A pattern match is in fact an expression, whose result is the result of its right-hand expression:

    Pattern = Expression

This makes it possible, for example, to simultaneously pass a value as an argument to a function and match it against a pattern:

    Sqrt = math:sqrt(X = 16.0)

When the right-hand side is also a pattern instead of an expression, the construct is called a layered pattern:

    Pattern1 = Pattern2

This unit behaves like a pattern and can be written wherever a pattern can be placed. For example:

    Pattern1 = Pattern2 = ... = PatternN = Expression

In this case all patterns are matched against the value of Expression. This becomes more interesting when the value of the expression is more complex, since several different patterns can then be matched against it. Pattern matches provide an efficient tool for extracting elements from values of compound data types. The following three sections discuss this (and more features) for tuples, records and lists.
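The three situations described above (a chained match, a match used as a function argument, and a failing match) can be combined into one small sketch. The module name pattern_demo and the helper try_match/2 are illustrative; the helper only turns the badmatch error of a failing match into a return value.

```erlang
-module(pattern_demo).
-export([demo/0]).

demo() ->
    %% Chained match: Whole and {Name, Height} are both matched against
    %% the same value.
    Whole = {Name, Height} = {bob, 2.0},
    %% A match is an expression, so it can appear as a function argument.
    Sqrt = math:sqrt(X = 16.0),
    %% The pattern 5 cannot match the value of 6 + 7.
    Failed = try_match(5, 6 + 7),
    {Whole, Name, Height, X, Sqrt, Failed}.

try_match(Bound, Value) ->
    try Bound = Value of
        _ -> matched
    catch
        error:{badmatch, _} -> badmatch
    end.
```

In the shell, entering 5 = 6+7. directly reports the same badmatch error instead of returning an atom.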
A.8 Tuple manipulation
I describe the tools for tuple manipulation in this section. The first subsection introduces the syntax for value extraction, the second introduces a syntax for updating elements. The third shows the available BIFs for tuple manipulation.

A.8.1 Element extraction
A tuple match succeeds if and only if the matched tuple has the same size as the pattern tuple, and the sub-patterns of the pattern tuple match the corresponding elements of the tuple.

    A = {3, 4, 5},
    {3, X, 5} = A.
Footnote 31: This act is called unification, see section 3.3.3 of [1].
This, for example, extracts the value 4 into X. (Each sub-pattern matches the corresponding element of A: the integer 3 matches 3, the unbound variable X matches 4, and 5 matches 5.) A more frequent use is in conjunction with anonymous variables:

    A = {3, 4, bob},
    {_, X, _} = A.

The anonymous variable matches any value, therefore the first and last elements of A are ignored. In the previous case, they were checked to be 3 and 5. It is also possible for a variable to appear more than once in a pattern:

    A = {bill, 4, bob},
    {Z, X, Z} = A.

This pattern match fails. It would only succeed if the first and last elements of A were the same, for example if the first element were the atom bob instead of bill.

A.8.2 Updating elements
To update elds of a tuple, i.e. make a copy of the tuple with some elements changed, the best way is to reconstruct a new tuple from the elements of the current one: {Element1, Element2, _, Element4, _} = Tuple, NewTuple = {Element1, Element2, NewElement3, Element4, NewElement5} The above code copies Tuple into NewTuple, with the exception that the third element is changed to NewElement3 and the fth element is changed to NewElement5. A.8.3 Using BIFs
There exist the following BIFs built in functions that gives us greater freedom in manipulating tuples: element(N, Tuple) returns the Nth element of Tuple. >element(3, {2, 3, 5, 7}). 5 setelement(N, Tuple, E) returns a new tuple which is a copy of Tuple with the Nth element is changed to E. >setelement(4, {2, 3, 5, seven}, 7). {2, 3, 5, 7} To update more than one elds, these calls can be nested. This logically works the following way: each setelement call creates a copy of its tuple, updates its appropriate eld and passes this result one nesting level outer. This is inecient because one update would cause more than one copying. But fortunately, if certain conditions are met, an optimised code is generated that creates only one copy. I describe these conditions in section 3.10. tuple_size(Tuple) returns the size of Tuple. >tuple_size({two, {1, 1, 1}, 5, 7}). 4 These functions are not compiled into function calls, inline code is generated instead.
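As an illustration of the two update styles from A.8.2 and A.8.3, the sketch below changes the third and fifth elements of a five-element tuple, once with nested setelement/3 calls and once by matching and rebuilding. The module and function names are hypothetical; whether the nested form is compiled into a single copy depends on the conditions mentioned above and is not asserted here.

```erlang
-module(tuple_demo).
-export([update_two/3, rebuild_two/3]).

%% Nested setelement/3 calls: the inner call replaces element 3,
%% the outer call replaces element 5 of the intermediate result.
update_two(Tuple, New3, New5) when tuple_size(Tuple) =:= 5 ->
    setelement(5, setelement(3, Tuple, New3), New5).

%% The same update expressed as a pattern match followed by the
%% construction of a new tuple.
rebuild_two({E1, E2, _, E4, _}, New3, New5) ->
    {E1, E2, New3, E4, New5}.
```

Both functions return the same result for any five-element tuple; the original tuple is left unchanged in either case.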
A.9 Record manipulation

A.9.1 Field extraction
As I mentioned in section A.5.3, records are just a syntactically different notation for tuples. For example, field extraction can be rewritten trivially from record syntax to tuple syntax.

with record syntax:

    A = #person{ name = bob, height = 2.0, favcolour = green },
    #person{favcolour = FC, height = H} = A.

with tuple syntax:

    A = {person, bob, 2.00, green},
    {person, _, H, FC} = A.

This example extracts the fields height and favcolour from A into H and FC. Note that it is not necessary to match against all fields of a record. The tuple syntax shows that unused fields are matched against the anonymous variable. The order of the fields in the pattern is also arbitrary. A record match succeeds if and only if its corresponding tuple match succeeds. There is an alternative syntax for extracting a single value, but it is an expression rather than a pattern.

with record syntax:

    A = #person{ name = bob, height = 2.0, favcolour = green },
    N = A#person.name.

with tuple syntax:

    A = {person, bob, 2.00, green},
    % ...type check...
    N = element(2, A).

Here A#person.name is an expression whose result is the value of the corresponding field, name. The tuple-syntax rewrite is not entirely precise here, because the type check of A is missing (see footnote 32).

A.9.2 Updating fields
For records, a syntax for updating the value of a record field also exists; more precisely, for creating a copy of a record with some fields changed. The following expression

    Record#name{field1 = NewField1, field2 = NewField2, ...}

first checks that Record is a record of type name. If it is not, an exception is thrown. Otherwise, the value of the expression is a copy of Record with the value of field1 changed to NewField1, the value of field2 changed to NewField2, etc. To create a copy of bob with the favourite colour brown, one should write:

with record syntax:

    A = #person{ name = bob, height = 2.00, favcolour = green },
    B = A#person{favcolour = brown}.

with tuple syntax:

    A = {person, bob, 2.00, green},
    % ...type check...
    B = setelement(4, A, brown).

The type check is omitted again.

Footnote 32: Such a type check can be added with the use of begin-end blocks from A.11, and I actually do this for classes in section 3.8. The complete generated code can be inspected with the E compiler switch.
A.9.3 Summary

The following example summarises the manipulation of records. The comments show the results of the expressions and the tuple representations of the records.
    % definition (must be at module level):
    -record(example, {a, b = 5, c = 6}).

    demo() ->
        % instance creation:
        X = #example{b = 6},           % fields: a = undefined, b = 6, c = 6
                                       % tuple rep.: X = {example, undefined, 6, 6}
        % field value extraction, with a pattern match:
        #example{a = A, b = B} = X,    % A = undefined, B = 6
        % field update:
        Y = X#example{a = 1, b = 2},   % fields of X are unchanged
                                       % fields of Y: a = 1, b = 2, c = 6
                                       % tuple rep.: Y = {example, 1, 2, 6}
        % single field extraction:
        C = Y#example.c,               % C = 6
        {A, B, C}.                     % return the extracted values

Nesting is possible with all these constructs:

    #example{a = A, b = {_, B}, c = #example{a = C}} = X,

What these constructs have in common is that each contains the name of the record type to use, and that a type check is always performed on the record instances involved. Two things are checked: the name of the record must be equal to the name stated in the construct, and the size of the tuple of the record must be equal to the number of defined fields + 1.
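The tuple representation and the type check described above can be observed directly. The following is a minimal sketch with hypothetical module and function names; it reuses the example record from the summary. Because the exact term carried by the badrecord error is not specified in this text, the sketch matches it with the anonymous variable.

```erlang
-module(record_demo).
-export([as_tuple/0, read_c/1]).

-record(example, {a, b = 5, c = 6}).

%% A record instance is equal to the tuple that represents it.
as_tuple() ->
    X = #example{b = 6},
    X =:= {example, undefined, 6, 6}.

%% Field access performs the type check: both a wrong record name
%% and a wrong tuple size raise a badrecord error.
read_c(Term) ->
    try
        Term#example.c
    catch
        error:{badrecord, _} -> not_an_example
    end.
```

Here read_c/1 succeeds for {example, undefined, 6, 6}, but fails the check both for {other, 1, 2, 3} (wrong name) and for {example, 1} (wrong size).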
A.10 List manipulation

I describe the tools for list manipulation in this section. When the size of the list is known at compile time, the syntax is similar to that of tuples. Element extraction is done with a pattern match:

    List = [3, 4, 5],
    [_, X, _] = List.    % matches a list of length 3

The difference from tuples is that the first elements can be extracted with a pattern even if the length of the list is unknown:

    List = [1, 4, 9, 16, 25, 36, 49],
    [_, X1, X2, _|Rest] = List    % matches a list of length >= 4

This match binds X1 to the second element and X2 to the third. The sublist starting with the fifth element is bound to Rest. Updating an element is done by creating a new list:

    [Element1, _, Element3] = List,
    List2 = [Element1, NewElement2, Element3].

Prepending elements to the beginning of a list is also possible:

    List1 = [25, 36, 49],
    List2 = [1, 4, 9, 16|List1],    % List2 = [1, 4, 9, 16, 25, 36, 49]

The following BIFs can be used on lists:

hd(List) returns the first element of List. The first element is called the head.

    > hd([2, 3, 5, 7]).
    2

tl(List) returns the sublist of List starting at the second element. This is called the tail of the list.

    > tl([2, 3, 5, 7]).
    [3, 5, 7]

The background concept behind lists is not discussed in this section. See for example section 3.3.1 of [1] for details.
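A short sketch of how head and tail are typically used together: a list is processed by handling its first element and recursing on the rest. The module and function names below are hypothetical; sum/1 uses the [Head | Tail] pattern, while sum_bif/1 does the same with the hd/1 and tl/1 BIFs.

```erlang
-module(list_demo).
-export([sum/1, sum_bif/1, second_and_third/1]).

%% Summing a list with a head/tail pattern in the clause head.
sum([]) -> 0;
sum([Head | Tail]) -> Head + sum(Tail).

%% The same computation using the hd/1 and tl/1 BIFs.
sum_bif([]) -> 0;
sum_bif(List) -> hd(List) + sum_bif(tl(List)).

%% Extracting elements from a list of length >= 4.
second_and_third([_, X1, X2, _ | Rest]) ->
    {X1, X2, Rest}.
```

For the list [1, 4, 9, 16, 25, 36, 49] used above, second_and_third/1 returns {4, 9, [25, 36, 49]}.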
A.11 Expression sequences in begin-end blocks

An expression sequence is a comma-separated list of expressions. Function bodies are expression sequences as well. The result of such a sequence is always the result of its last expression. An expression sequence can be enclosed between the keywords begin and end to form a begin-end block:

    begin
        Expression1,
        Expression2,
        ...,
        ExpressionN
    end

This can be placed anywhere an expression can, because it is an expression itself. The sub-expressions are evaluated sequentially, and the result is the result of ExpressionN. It is important to note that variables created in the sub-expressions are available outside the block, for example:

    X = begin A = 2, 4 end,
    Z = {X, A},    %% this line is valid and Z = {4, 2}
    A = 5          %% this match fails, since A is already bound to 2
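The visibility of variables bound inside a begin-end block can be checked with a small runnable sketch (module and function names are hypothetical):

```erlang
-module(block_demo).
-export([demo/0]).

%% A is bound inside the block but is still visible after it.
%% The value of the block is the value of its last expression.
demo() ->
    X = begin
            A = 2,
            A * 2
        end,
    {X, A}.
```

The call block_demo:demo() returns {4, 2}: X holds the value of the block and A the value bound inside it.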
A.12 Clause-based constructs

Some constructs in Erlang share a common property: they consist of identically structured units named clauses. In this section I introduce these constructs in general, and in the subsections I show three of the six clause-based constructs of Erlang: functions, funs and case constructs. The remaining ones are if constructs, receive constructs and try-catch constructs. Clauses are always separated by semicolons. The last clause is ended with a period (.) for functions and with the keyword end for the rest. A clause has three parts:

1. head, which is a pattern
2. guard, which is a logic condition. It is optional except for if constructs.
3. body, which is an expression sequence

The general structure of the clause-based constructs is the following:

    Head1 [when Guard1] -> Body1;
    Head2 [when Guard2] -> Body2;
    Head3 [when Guard3] -> Body3;
    Head4 [when Guard4] -> Body4.
A.12.1 Functions

We have already defined functions in A.1. However, it was left out there that arguments can be not only variables, but patterns as well:

    function1(ArgPatterns1) when Guard1 ->
        ExprList1;
    function1(ArgPatterns2) when Guard2 ->
        ExprList2;
    function1(ArgPatterns3) when Guard3 ->
        ExprList3;
    function1(ArgPatterns4) when Guard4 ->
        ExprList4.
Consider a simple function with one clause, no guards and only variables in its head. When it is called, the variables in its head are bound to the elements of the passed argument list:

    > simplefunct(1, 2, 3).
    ...
    simplefunct(A, B, C) ->
        % A is bound to 1
        % B is bound to 2
        % C is bound to 3
        ok.    % an atom to have a return value

The generalisation of this is that the passed arguments are matched against the patterns in the head. In the previous example the patterns were unbound variables. This is easiest to see on a more complicated function with both variables and patterns in its head:

with patterns in head:

    > area(3, {rectangle, 4.0, 5.0}).
    ...
    area(N, {rectangle, Width, Height}) ->
        % N is bound to 3
        % Width is bound to 4.0
        % Height is bound to 5.0
        Width * Height.

almost equivalent version, with only variables in head:

    area(N, Argument2) ->
        {rectangle, Width, Height} = Argument2,
        % N is bound to 3
        % Width is bound to 4.0
        % Height is bound to 5.0
        Width * Height.
The pattern matching happens between the second argument and the pattern at the second position in the head. The second version shows that this is equivalent to performing the pattern match as the first expression of the body. The equivalence breaks when the function has more than one clause:

with patterns in head:

    area1(N, {rectangle, Width, Height}) ->
        Width * Height;
    area1(N, {circle, Radius}) ->
        Radius * Radius * math:pi().

almost equivalent version, with only variables in head:

    area2(N, Argument2) ->
        {rectangle, Width, Height} = Argument2,
        Width * Height;
    area2(N, Argument2) ->
        {circle, Radius} = Argument2,
        Radius * Radius * math:pi().

The difference is that the evaluated clause of area1/2 is chosen with respect to the patterns, while the evaluated clause of area2/2 is always the first one. In other words, the body has no effect on which clause is chosen. Note that the pattern match in the head of area1/2 makes that function a conditional construct, similar to if. If no matching clause is found, an exception is thrown. Guards can be used to extend clause heads with checks for non-structural properties of the arguments, i.e. properties that cannot be expressed by patterns. For example, we can distinguish between clauses using the calculated area of the second argument:

    area3(_, {rectangle, Width, Height}) when Width * Height >= 5.0 -> big;
    area3(_, {square, Side}) when Side * Side >= 5.0 -> big;
    area3(N, _) -> N.

When the area is less than 5.0 units (or the shape is not recognised), the value of N is returned.

A.12.2 funs

Funs were introduced in section A.5.1. A fun is a value which can be assigned to a variable, and that variable can be used to call the represented function.

A.12.3 case construct
A case construct is a conditional expression, which is very similar in semantics to functions. The difference is that it is used as an expression:

    case Argument of
        Pattern1 when Guard1 ->
            ExprList1;
        Pattern2 when Guard2 ->
            ExprList2;
        ...
        PatternN when GuardN ->
            ExprListN
    end

Argument is an expression. Each pattern with its guard, together with the expression list that follows it, forms a clause. PatternI is the head, which always contains exactly one pattern. When a case is evaluated, the first clause is chosen whose pattern matches the value of Argument and whose guard is satisfied. The expression list of this clause is then evaluated. The case construct is analogous to a function that is called from exactly one place in the program, with the argument Argument. If no matching clause is found, an exception is thrown. As an example, I show area1 written using a case:

    area1(N, Shape) ->
        case Shape of
            {rectangle, Width, Height} ->
                Width * Height;
            {circle, Radius} ->
                Radius * Radius * math:pi()
        end.

A.12.4 Guard sequences
The part of a clause named guard is a guard sequence. Guard sequences look like expressions that return true or false and are built up from BIFs and certain operators. In fact, however, they are just expression-like constructs that extend the functionality of pattern matching. The terminology also differs: a guard sequence either succeeds or fails.

A guard sequence is either a guard, or a semicolon-separated list of guards: G1; G2; ...; Gn. The guard sequence succeeds when at least one of its guards succeeds. A guard is either a guard expression or a comma-separated list of guard expressions: GuardExpr1, GuardExpr2, ..., GuardExprN. The guard succeeds when all of its guard expressions succeed. I present here the necessary subset of the features available in guard expressions:

- They can use binary operators like =:= for equality tests, and >= and =< for comparison.
- They can use short-circuit evaluated logic operators: andalso, orelse.
- They can use a subset of the Erlang BIFs. The most important ones are those used in section A.8.3.
- Guards cannot create new variables. There is a proposal [8] for the Erlang language to change this, but it is not yet implemented.
- When a BIF in a guard expression is called with invalid arguments, no exception is thrown, but the guard expression fails.

    a(X, Y) when element(1, X) =:= yes orelse Y =:= 1 ->
        yes;
    a(_, _) ->
        no.

Calling a(nontuple, 1) returns no, because element/2 is called with invalid arguments and this makes the whole guard expression fail, even though its second operand, Y =:= 1, is true and the expression should logically succeed. However, this effect does not extend to the other guards of the guard sequence:

    b(X, Y) when element(1, X) =:= yes ; Y =:= 1 ->
        yes;
    b(_, _) ->
        no.

Calling b(nontuple, 1) returns yes, because the second guard of the sequence, Y =:= 1, succeeds.
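The two functions above can be compiled and compared side by side. The sketch below wraps them in a hypothetical module guard_demo and adds a third variant, c/2, which is not part of the original text: it protects the element/2 call with the is_tuple/1 type test, so that the orelse form also accepts a non-tuple first argument.

```erlang
-module(guard_demo).
-export([a/2, b/2, c/2]).

%% One guard expression: a failing BIF call makes the whole
%% expression fail, so the right operand is never considered.
a(X, Y) when element(1, X) =:= yes orelse Y =:= 1 -> yes;
a(_, _) -> no.

%% Two guards separated by a semicolon: the failure of the first
%% guard does not affect the second one.
b(X, Y) when element(1, X) =:= yes; Y =:= 1 -> yes;
b(_, _) -> no.

%% Assumed variant: is_tuple/1 prevents the invalid element/2 call,
%% so the orelse operand is evaluated normally.
c(X, Y) when (is_tuple(X) andalso element(1, X) =:= yes) orelse Y =:= 1 -> yes;
c(_, _) -> no.
```

With the arguments (nontuple, 1), a/2 returns no while b/2 and c/2 return yes; with ({yes}, 0) all three return yes.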
A.13 Module attributes

Attributes hold information about the module, and they are processed at compile time. They start with a hyphen (-), continue with an atom, an opening parenthesis, an arbitrary constant, a closing parenthesis, and end with a period (.). The most common attribute is module, at the beginning of each module. The list of common attributes:

module defines the name of the module.

    -module(ModuleName).

record defines a record type.

    -record(RecordName, {FieldName1, FieldName2, ...}).

include instructs the compiler to include a file at the position where it appears. The standard extension for include files is hrl. The given filename can be either an absolute or a relative path. If it is relative, the include path is searched for the file.

    -include("my_record_definitions.hrl").

include_lib works like include, but the given file cannot be an absolute path. Instead, the first component of the path is considered the name of an application, and the remaining components are relative to the base directory of the application.

    -include_lib("ect/include/ect.hrl").
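To show the attributes in context, here is a minimal sketch of a complete module that uses the module and record attributes together with an export attribute. The module name attr_demo, the record person and the function names are hypothetical; the include attributes are left out so that the example compiles without any additional files.

```erlang
-module(attr_demo).
-export([new/2, name/1]).

%% Record definition at module level; in a larger project this
%% line would typically live in an .hrl file pulled in by -include.
-record(person, {name, height}).

%% Creates a person record.
new(Name, Height) ->
    #person{name = Name, height = Height}.

%% Extracts the name field with a pattern in the clause head.
name(#person{name = Name}) ->
    Name.
```

The module name given in the module attribute can be read back at run time with attr_demo:module_info(module).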
B Detailed measurement results

B.1 Erlang static function call

                  Ops      Avg (us)  Err %   M1     M2     M3     M4     M5     M6     M7     M8     M9     M10
Call time (us)
  alpha           1000000  0.010     0.743   10033  10046  10051  10053  10061  10062  10068  10071  10108  10373
B.2 Erlang dynamic function call

                  Avg (us)  Err %
                  0.088     0.407

B.3 Erlang fun call

                  Avg (us)  Err %
                  0.042     0.298
B.4 Erlang records

                  Ops      Avg (us)  Err %   M1      M2      M3      M4      M5      M6      M7      M8      M9      M10
Read time (us)
  alpha           1000000  0.026     0.317   26763   26771   26786   26795   26804   26804   26810   26832   26837   26889
Write time (us)
  alpha           1000000  0.102     0.855   102245  102351  102401  102455  102681  102750  102816  102988  103026  103123
B.5 ECT classes

                  Ops      Avg (us)  Err %   M1      M2      M3      M4      M5      M6      M7      M8      M9      M10
Call time (us)
  alpha           1000000  0.094     0.559   93988   94001   94002   94022   94050   94463   94463   94506   94529   94546
  beta            1000000  0.105     0.001   104954  104983  104985  104986  104987  104988  104993  105001  105062  105401
  gamma           1000000  0.104     0.401   104055  104098  104102  104131  104445  104451  104517  104519  104547  104551
  delta           1000000  0.104     0.357   103663  104034  104035  104037  104041  104061  104063  104065  104073  104099
  epsilon         1000000  0.104     0.000   104036  104061  104063  104068  104069  104073  104105  104106  104111  104114
  zeta            1000000  0.104     0.346   104351  104497  104528  104528  104586  104949  104989  104993  105008  105205
Read time (us)
  alpha           1000000  0.047     0.102   47638   47654   47657   47663   47671   47673   47676   47688   47697   47706
  beta            1000000  0.047     0.031   47555   47654   47669   47669   47673   47681   47686   47690   47728   47745
  gamma           1000000  0.047     0.060   47656   47669   47672   47677   47680   47686   47701   47702   47715   47755
  delta           1000000  0.047     0.002   47666   47666   47673   47684   47692   47693   47696   47697   47716   47738
  epsilon         1000000  0.047     0.065   47661   47662   47663   47666   47671   47680   47682   47697   47703   47726
  zeta            1000000  0.047     0.081   47634   47657   47669   47670   47685   47685   47688   47703   47724   47724
Write time (us)
  alpha           1000000  0.107     0.475   107615  107677  107729  107756  107921  107944  108036  108079  108095  108128
  beta            1000000  0.107     0.466   107543  107619  107626  107672  107710  107764  107835  107849  107938  108175
  gamma           1000000  0.107     0.034   107342  107611  107648  107710  107729  107801  107810  107895  108117  108167
  delta           1000000  0.107     0.308   107569  107609  107651  107693  107743  107743  107800  107832  107901  108178
  epsilon         1000000  0.107     0.000   107401  107480  107493  107741  107784  107809  107811  107864  107865  107955
  zeta            1000000  0.109     0.812   108940  109578  109833  109951  110001  110097  110195  110229  110312  110367
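B.6 ECT remote classes

                  Avg (us)  Err %
Call time (us)
  alpha           0.839     2.710
  beta            0.885     5.733
  gamma           0.884     3.722
  delta           0.889     1.189
  epsilon         0.891     0.065
  zeta            0.879     0.080
Read time (us)
  alpha           0.715     0.972
  beta            0.702     0.461
  gamma           0.702     0.187
  delta           0.702     0.282
  epsilon         0.701     0.482
  zeta            0.698     0.310
Write time (us)
  alpha           0.436     0.567
  beta            0.436     0.826
  gamma           0.437     0.259
  delta           0.439     0.279
  epsilon         0.444     4.650
  zeta            0.440     0.193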
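B.7 eXAT classes

                  Avg (us)  Err %
Call time (us)
  alpha           1.394     3.518
  beta            7.301     0.598
  gamma           13.254    0.784
  delta           19.635    6.870
  epsilon         25.453    0.986
  zeta            31.533    0.475
Read time (us)
  alpha           1.759     0.859
  beta            1.761     1.404
  gamma           1.768     0.082
  delta           1.767     1.025
  epsilon         1.766     1.319
  zeta            1.769     0.567
Write time (us)
  alpha           2.736     0.243
  beta            2.750     0.000
  gamma           2.759     0.849
  delta           2.747     0.587
  epsilon         2.740     0.159
  zeta            2.739     0.022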
B.8 Parameterized modules

                  Avg (us)  Err %   M1        M2        M3        M4        M5        M6        M7        M8        M9        M10
Call time (us)
  alpha           0.090     0.080   90863     90865     90872     90875     90884     90888     90896     90898     90923     90996
  beta            4.283     0.512   4225359   4267204   4276810   4283320   4289755   4291151   4298756   4298800   4302990   4304999
  gamma           8.451     0.443   8317401   8384380   8408677   8416681   8429000   8451106   8458614   8474969   8496117   8678430
  delta           12.626    0.206   12427134  12555800  12582165  12601570  12615244  12649059  12685108  12694566  12713205  12739218
  epsilon         16.773    1.773   16506503  16667196  16691205  16714897  16769582  16781526  16795047  16803972  16868811  17134690
  zeta            20.848    2.058   20507463  20621167  20744044  20752868  20864384  20911440  20940458  20940972  21050256  21147207
B.9 Wooper classes

                  Ops      Avg (us)  Err %
Call time (us)
  class_Alpha     1000000  1.915     0.499
  class_Beta      1000000  1.977     0.816
  class_Gamma     1000000  1.968     0.165
  class_Delta     1000000  1.968     0.295
  class_Epsilon   1000000  1.971     0.016
  class_Zeta      1000000  1.979     0.660
Read time (us)
  class_Alpha     1000000  2.342     0.016
  class_Beta      1000000  2.354     1.066
  class_Gamma     1000000  2.337     0.126
  class_Delta     1000000  2.337     0.016
  class_Epsilon   1000000  2.335     0.035
  class_Zeta      1000000  2.342     0.216
Write time (us)
  class_Alpha     1000000  2.012     3.513
  class_Beta      1000000  2.017     3.642
  class_Gamma     1000000  2.015     3.713
  class_Delta     1000000  2.014     3.590
  class_Epsilon   1000000  2.016     3.550
  class_Zeta      1000000  2.012     3.125