Sie sind auf Seite 1von 117

Mango

A General Purpose Programming


Language

My Background
Early experience on Apple II
University of Illinois Champaign
Urbana.
Bachelor's degree in computer
engineering
Three years at Neoglyphics: software
Three years at Alpha: hardware

How Mango Got Started


Frustrated with C/C++/Java
There had to be a better way
Foolishly began designing my own
language
Foolishness can be a virtue
Its been a seven year voyage

The Problem
Common computing infrastructure is
written in C/C++
C/C++ is inadequate

Lack of higher level abstractions


Programmers must use low level constructs

Results of C/C++ use

Unsafe/unstable software
Slower development times
Higher development costs

A lack of alternatives
Java, Python, Perl, Pascal, Ada, Modula,
C#

Not viable replacements


Lack Cs virtues in performance and flexibility
Dependent on C for core tasks
Lack of widespread appeal (clumsy, i.e. Ada?)
Not sufficiently different to switch

The Solution: Core Goals


Provide higher level abstractions
Avoid low level constructs when not needed
Make programming easier, more enjoyable

Retain performance and flexibility


Allow unrestricted operations as necessary
Avoid overhead
Match machine execution model

Overall: make a better experience for


programmers

Overview
High level design goals and decisions
Feature walk through
Future directions

Design Goals

Design goals
Syntax
Static Typing vs. Dynamic Typing
How Vs. What
Large Languages Vs. Small Languages
Object Orientation: Yes or No

Goal #1: A Good Syntax


Syntax is key

Its underrated (focus on semantics)


Makes the language easier to learn
Makes it accessible to non-programmers
Makes the language self-documenting

Marketing versus engineering


Bad marketing of a good product will fail
Bad syntax around good semantics will have
a harder time gaining acceptance

Static versus Dynamic Typing


Dynamic languages are very popular
Due to poor implementations of static
languages

Advantages of static typing

Critical for performance


Types act as documentation
They catch many errors at compile time
Types allows overloading of names

How versus What


CS fantasy to forget how and focus on what
How is a hard problem

Distinguish features that are theoretically


equivalent but practically different

Everything can be a list, but itll be slow

Rich set of primitive and aggregate types


Side effects to use memory more effectively
Reduce copying

Manual memory management


GC is not possible for some applications
Done right beats a garbage collector

Large Vs. Small Languages


Small language
Secondary features are in a standard library
Advantages: easier to learn core features, make a compiler

Large language
First class treatment of secondary features
Allows specialized operations, makes programs more readable
Advantage: a smoother user experience

Ease of learning is dependent on more than language


size

You still have to learn library APIs


Intangible quality: how the language corresponds to human
cognition

Open source makes compilers easier to write


Open front end acts as the spec

Object-Orientation: Yes or No?


Stepanovs criticism
programming = data structures + algorithms
multi-sorted algebras

Object orientation has shown itself useful in


certain circumstances
Offer OO as an option
Leave inheritance behind

Inheritance hierarchies are difficult to follow


Fragile base class problem requires reanalysis of
class behavior

Mango walk through

Influences
Syntax plus some basic examples
Module system + incremental compilation (Include requires clause, parameter and options)
(platform and foundation files)
Naming and overloading
Literals (include string format)
Primitive Types
Memory model (include pointer/reference syntax)
Records and Abstracts
Procedures, functions, constants (pure functions?)
Statements: Control flow
Statements: I/O
Abstract Data Type (objects)
Iterators and iteration operators
Mutexes
Exception Handling
Strings, arrays, buffers
Collection Types
Global variables/External symbols/Module Constructors

Calling convention part of the type system

Packet/Device types
Development aids
Genericity

SETL - set operations


ALGOL - imperative block structure and syntax
C - low level ops, low overhead
ML - type inference, type syntax
ADA - fine grained control over primitives
PYTHON - indentation based syntax
JAVA - interfaces
C++ - STL, operator overloading, IO syntax
CLU - iterators
PASCAL - sets (bit masks)
PERL - richness of expressibility
COBOL - readable syntax
SIMULA - objects

Syntax

Mangos Syntax
Mango looks like pseudo code
Indentation based syntax

Reduces clutter and typing


Allows more code to be shown on the screen
Blocks can be enclosed with begin/end delimiters if desired

All directives, definitions, statements begin with a


keyword

Simple style that is easy to remember


Users symbolic names cannot conflict with reserved words
This makes the language easy to extend

There are no exclusively reserved keywords


Legacy of lex?

Modules and Naming

Modules

Module is a set of symbols


Each file is a module
Module name must correspond to pathname
A modules symbols can be public or private
Public: symbols are visible to other modules
Private: symbols are invisible to other modules

Modules may import and include other modules


Import: foreign symbols are localized (private)
Include: foreign symbols are exported (public)

Incremental Compilation
Mango supports incremental compilation
Module has changed
A dependency has changed

Major or minor revision?


Compares syntax trees
Major revision: public change
Minor revision: private change

Comparing syntax

Advantage: Eases implementation


Disadvantage: Set of major revisions is obviously
larger

Naming
Naming is not hierarchical, but geometric
Symbol names exist within a four-d grid
Namespace, keyword, extension, auxiliary
Only namespace and keyword are mandatory

Each module is part of a namespace


Public symbols use declared namespace
Private symbols use special namespace local

Format:
namespace::keyword@extension$auxiliary

Shallow Overloading
Overloading occurs for every imported or included

module
4d namespace is collapsed into 1d namespace
Utilizing keyword or extension
Other partial namespaces as well
Symbol can be access using proper name or alias
Ensures all overloaded symbols have a unique name
As a result, all overloading is superficial or shallow
Operator overloading is also supported

Memory Model

Mangos Memory Model


Value semantics

Put stuff on the stack, particularly primitives


Key for performance
Offers more flexibility

Three types of memory

Static:
Compiled data, global variables
Heap items that are never deleted
Arbitrary:
Heap items that are eventually deleted
Local:
Items that live on the stack

Safe manual memory management


Static datums
Always safe to use

Arbitrary datums
Need to be guarded to avoid dangling pointer
references

Local datums
Compiler must enforce restrictions to avoid
dangling pointer references

Sentries
Pointer guards are called sentries
Pointers to arbitrary datums are fat
One address to the datum on the heap
One address to the datums sentry

Sentries live on the heap too


Have a static lifetime (i.e. never deallocated)
They are very small ~= 5 bytes

Sentry Performance
When # sentries is small

Good performance on modern hardware


Sentries stay in the cache
Half of the processors time is spent waiting
on memory

As # sentries increases
cache starts to overflow

We need to reduce the number of sentries

Arenas and Recycling


Method #1: Allocate in pools
A group of datums share a sentry
Allocated arbitrarily, but deallocated at once

Method #2: Recycling


Use static datums instead
When static datums are deleted
Initialized to zero
Stay on the heap until datum of the same type is requested
Incorrect results are possible, catastrophic failures
are not
The program cannot break the type system

Literals

Literals
Definition
A value which is known at compile time

Types

Immediate values
Numeric primitives and text
Literal expressions
Interpreted during compilation
Parameters
Values used to configure the compiler
Options
User supplied values to customize a build

Literals can be named

Numeric Literals
Six types
Integer, Decimal, Hex, Address, Binary, Signal
Integers and decimals also include complex plane signifiers

Can be anonymous
Anonymous literals are stored as untyped strings
Converted to a real value when type is known
There are no constraints on range

Can be typed
Type is specified with value
Converted immediately to the desire type value
There are no constraints on range

Text Literals
Two types
Characters and Strings
Stored as untyped strings until desired type is known

Characters enclosed with the back tick


Text string enclosed with the double quote
Literals can be inserted into characters and text strings

Named literals
Parameters
Options
Character codes
Character aliases

Literal Expressions
Two forms of literal expressions
Normal literals: immediate evaluation
Macro literals: deferred evaluation

Macros are evaluated over arguments


Result value is optionally typed
Expressions can include
Condition construct
Conversion construct
Local aliasing

Parameters and Options


Changes cause recompilation of module
Part of the public interface of the module
Checked when comparing syntax
Only options that are used included in
dependency analysis
Parameters included in dependency
analysis
Specified by user
Have a major impact on compilation

Core Types

Core Types
Primitives
Tuples and Unions
Addresses, Pointers, References
Polymorphic Types
Strings, Arrays and Buffers
Collections
Records and Abstracts

Type Qualifiers
Pliancy: Immutable, Mutable
Reactivity: Volatile, Inert
Duration: Local, Static, Arbitrary
Memory: IO, Virtual, Physical?
Useful for embedded system with multiple
memories
Undecided: means to access hardware
registers directly

Primitives
Logical
Bit (2 State), Boolean (3 State), Signal (4 state)

Ordinal
Range from 0 to N, where N is user specified

Character
ASCII, UTF 8, UTF 16, UTF 32, 8 Bit Data

Register
Binary register, user specified dimensions

Signal
Signal bus, user specified dimensions

Primitives (contd)
Cardinal
Unsigned integer, 1/2/4/8/16/32/64 bits

Subrange

Signed range, upper/lower bound, default value

Integer

Signed integer, 8/16/32/64 bits

Rational

Signed rational, fixed or floating denominator

Decimal

Fixed point decimal number


Whole and fractional component

Number

Floating point number, specified size

Primitives (contd)
Complex numbers
Primitives qualified with units
Enumerations
Matrices
Coordinates

Primitive Modifiers
Conversion
Automatic, manual, none, universal

Evaluation
None, Fixed, Fluid

Approximation
Round, Truncate, Conserve

Overflow
Check, Limit, Wrap

Byte Order
Big, Little, Host, Network

Tuples and Unions


Anonymous type products
Fields can be labeled
Unions
Each term of product is overlapped
Unsafe

Elaborate types decompose into tuples

Addresses, Pointers, References


Addresses have bitwise resolution
Address is 2 product tuple

Upper value: byte count


Lower value: bit count

Addresses still optimal


types with byte-wise alignment will drop bit count

Pointers
Address of type where address is significant

References
Address of type where type is significant

Polymorphic Types
Sums

Superposition of multiple types


Qualified with type tag to ensure safety

Handle
Anonymous pointer (points to anything)

Anything
Stores any type value

Strings, Arrays, Buffers


Arrays
Dimension type can be customized
Slices of arrays preserve range information

Strings

Array of items from 1 to x


Dimension type can be customized
Slices of strings begin at 1

Buffers
Fixed length string that wraps around itself
Varying start and end positions
Dimension type can be customized

Collections
Entry

a node of a linked list

Segment

List

a combination of string and pointer

appends new data at the end


elements allocated in pages

Stack

FIFO
Can be prioritized

Sequence

inserts new data


elements allocated in pages

Queue

LIFO
Can be prioritized

Collections (contd)
Mask
A bit mask

Range

A numeric range with upper and lower bounds

Set

A hashed set
Doubles as a one-to-one map

Table

A hashed one-to-many mapping

Group

A special collected used for comparisons

Graph

Used for frequency tables

Records
Three visibility states
Public: accessible anywhere
Protected: accessible by modules that declare
access
Private: accessible within the module

Two layout types


Fixed: in order of declaration
Packed: ordered to reduce record size

Fields can be qualified to remove them


from the build

Abstracts
Interface to multiple record types
Records mapped to abstracts using link directive
Gives more flexibility in mapping abstracts
Mappings can occur after declaration
i.e. library types can still be mapped to abstracts
Simplifies remapping of fields

Fields can be qualified to remove them from the


build
Medley: combines abstract with a group of
records

Abstract Data Types

Mango can bind any type to a class


Objects: Classes + Record
Properties
getter/setter methods

Parameters
Properties that can only be set at instantiation

Delegates
Multiple dispatch for aspects and aggregates

Normal classes do not support inheritence

Object Construction
Constructor (prelude)
Destructor (finale)
New style constructor
Parameters and properties can be set before
the constructor is called
Only one constructor per class no
overloading
Constructor spans entire object.
Variable initialization at declaration

Object Interfaces
Interfaces (mixins)
Separates sub typing concerns from
implementation

Interfaces are purely abstract. No


associated code
Uses link directive like record abstracts
Bind operator links types to interfaces
automatically

Aggregates
Composition instead of hierarchical
inheritance
Flattens the hierarchy: easier to
understand and maintain
Overriding of methods is explicit
Compiler can statically check for problems

All publicly accessible methods must


explicitly exported

Built in I/O Devices


Mangos built in I/O devices are typed
Six types of I/O devices

file: an operating system file


text: a line oriented text file
stream: an I/O stream
port: a connection listener
database: a database
document: a document (XML)

Procedures and Functions

Procedures
Multiple inputs and output
Call by localized reference
Compiler makes a duplicate of a value on the
stack if necessary

Nested procedures
Can reference variables within parents scope

Return values are held in a named


variable

Possible to return values with assignment

Functions and Constants


Real functions
Essentially a single expression
Extended with two case constructs
Condition and conversion

Constants

Similar format to functions


Evaluated only once, result is cached
AKA Eiffel once function
Clean way to instantiate global variables

Procedures and Functions (contd)


Mango supports function/procedure
pointers
Both global and local procedures

Invokation through expressions or


statements
Statement invokations automatically infer
return types
Labeled arguments possible

Method Pointers
Method pointers
combine class state with class method

When a method is selected from a class


Result is a method pointer

Special attribute and receptor pointers


Automatically evaluated

Method pointers are a supertype of


function pointers

Will absorb function pointers

Expressions

Arithmetic
Addition, Subtraction, Negation, Identity
Multiplication, Division (Truncate)
Remainder, Division (Conserve)
Apply (Exponent), Absolute Value
Shift Left, Shift Right

Comparison/Logical
Equal, Not Equal
Less, Less Equal, Greater, Greater Equal
Test, Not, And, Or, Exclusive Or, Reduce

Casting
Static casting
Type is specified

Dynamic casting
Type is infered
Cast is specified with dynamic cast operator

Implicit casting

From subtype to supertype


Class to interface, Function to method, etc
Polymorphic types

Membership
in, not_in

String, array, buffer, entry, segment


List, stack, queue, sequence
Mask, range, set, table, graph

is, is_not
Evaluates left value with right function
Returns boolean

like, not_like
Tests left value is equivalent to right hand type

Selections
Selection of record fields/class methods

By name
By number (access to anonymous fields)

Class fields
Special attribute operator

Dynamic method operator


Looks for class that can operate on target
type, with named method

Array
A[x] to index an array
Slice operator
Before and after some position
Within a range

~=, /= for string comparisons


Size operator returns size (# elements)
Domain operator returns dimensions

Production
Create and initialize data types
Local stack, static heap, dynamic heap

Production arguments
Where clause
Initialize class properties/record fields
With clause
Class constructor arguments
Aggregate members
Within/upto
Dynamic string/array dimensions
From
Target arena

Sequence Operators
Generate sequence

Pick: select one value


Every: select group values

Test sequence
Some: one term in set matches
All: all terms in set match

Compute sequence
Product: product of terms
Sum: sum of terms

General Iteration Form


[a of t] in x at h using u over p where z
while q
a in x at h using u over p where z while q
<a,b> in x at h using u over p where z while q

Iterations
Iterator targets

String, array, buffer, entry, segment


List, stack, queue, sequence
Mask, range, set, table, graph
File, text, stream, database, document
Iterator construct

Statements

Statements Overview
Sequence of statements
Nested block delimitation
Begin/end pair
Indentation

Each statement begins with keyword


Possible to group statements under a single
keyword
Delimited with indentation or begin/end pair

Overview (contd)
Variables can be named anywhere
Renaming can occur
In current block
In nested blocks

No gotos
Blocks can be labeled
Blocks can be exited
Block labels can be used to exit multiple levels

Overview: Contd
Conditional predication

For many statements


Often with alternate evaluation

Basic Blocks
Execute: nested block
Exit: exit blocks
Return: exit procedure and set results
Repeat: repeating block
Continue: repeat loop

Production
Declare: declare local variable
Allocate: make temporary
Deallocated when leaving block

Get: declare static type


New: declare dynamic type
Delete: recycle or delete heap type
Trash: delete static heap type

Placement
Overlay: place a type over another type
Create: create a type from string
Place: place a type at an address
Map: map a device type
Open: open an I/O type
Close: close an I/O type

Assignment
Assign: copy value
Optional declared result with inferred type

Swap: swap two values


Let: reference alias of value
Refine: update multiple fields of a record
Touch: Mark variable as used
Nothing: Do nothing

Arithmetic
Reset: reset value to default
Set: set value to 1
Incr: Increment by target, else 1
Decr: Decrement by target, else 1
Magnify: Multiply by target
Reduce: Reduce by target

Invokations
Call: call function with arguments
Optional declared result with inferred type

Eval: evaluate arguments with function


Optional declared result with inferred type

Initialize: initialize object with arguments


Finalize: finalize object
Function: local function
Procedure: local procedure

Conditional Blocks
If: If-then-else
Select: Multiple conditions
When: Defers compilation
During: Conditional debug block
Compiler option removes them

While: Loop on pre-condition


Until: Loop on post-condition

Case Constructs
Where: case of values

Handles multiple matched terms


Case values can include ranges and groups
Character strings have contents compared
Wild cards: Case value can be ignored

Convert: covert one value into another


Case values can include ranges and groups
Character strings have contents compared

Specialized Blocks
Given: Reduces polymorphic type

Handles multiple matched terms\


Wild cards: Case value can be ignored

Cast: Cast value, Execute Block


Foreach: Iterate and execute block
Acquire: Acquire mutex and execute block

Collection Operators
Add: add item into set
Replace: exchange item with another
Remove: remove items from set
Take: take (arbitrary) item from set
Trim: reduce set y by set x
Filter: intersect set y by set x

Collection Iterators
Find: Return item
Option to insert and return new item

Extract: Remove and return item


Fill: Populate collection with iterator
Compute: Arithmetic sum over iteration
Copy: Copy from one set to another
Modify: Update items within collection

I/O
Write: write terms to I/O device
Read: read term from I/O device
Strings, Buffers, Lists read series of terms
Continue/Stop masks for character devices

Print: write to standard out


Log: Atomic write to I/O device
Accept: accept stream from listener
Monitor: monitor group of listeners

Error Handling
Activate: active exception handler
Try: trap exceptions with block
Throw: throw exception
Require: assertion statement
Quit: quit program
Raise: rethrow exception

Iterators

Iterators
Procedures to iterate over type
Consists of operand and result type
Iterator activation:
By name
By type (operand and/or result)

Iteration deactivation
Result is <stop>
Result type is tagged

Iterators (contd)
Iterator sections

State: local state while iterator active


Prelude: initialization segment
Query: compute result or signal stop
Finale: finalization segment

Local state is pushed on stack

Exceptions

Activated Handler Style


Exception event code contained in global
handler object
Handler object pushed onto activation
stack
On exception, handlers are processed
until exception is handled
Exception handling is immediate, before
stack is unwound

Exceptions (contd)
Four exception flavors:

Fatal: Quit program without unwinding stack


Severe: Quit program but unwind stack
Normal: Unwind stack to last trap
Soft: Program can continue

Advantages
Exception is processed before unwinding
No side effects before exception handling

Exception code can be pushed into


libraries

Still possible to localize with arguments


passed during activation

Reduces clutter
No need for try-catch blocks

No need for throws clauses

Global variables

Global Variables
Variable: global value
Dataset: string of values
Symbol: external (linked) value
External: external (linked) function

Global Constructors
Global Initialization/Finalization code
Executed before and after program
Contents:
Local state
Prelude procedure
Finale procedure

Priority value sorts contructors

Misc

Mutexes
Formal means of acquiring locks
Mutex object + Acquire statement
Object selected by name or by type

Mutex Components
State: local state
Prelude: initialization
Acquire: acquire lock
Retry: reacquire flag
Release: release lock
Finale: finalization

Packets
Specialized record

Header
Payload of variable size
Footer

Packet type is polymorphic pointer


Point to packet of any size

Devices
Used to define set of hardware registers
Record of records
Register placement:
In sequence
At specified offset

Package
Used to define group of related functions
Elements of packages pushed into
extended namespace
Particularly useful for operators
Means of reducing namespace clutter
Means of grouping related generic
operators

Genericity

Genericity

Builds

Builds
Assemblies, Platforms, and Prologues

Module Suffixes
Runtime type identification

Requests types that are used


Therefore not all types need to be instantiated

Future Directions

Future Directions
Finish the compiler
Mango is the first in a family of languages
Sharing type system and syntax
Targeting a virtual machine

Application Language
In the spirit of C# or Java
Reference semantics
i.e. no user defined pointers

Automated memory management


With user intervention

Less complicated type system


More accessible language

Functional Language
Functional/Dataflow semantics
Strong focus on doing distributed
computations
Gateway to bring functional programming
to the masses
Accessible because of shared syntax and
types

Record Processing Language


Basically a replacement for COBOL

The use of COBOL cripples the mind


working within the limitations of the original
COBOL programming model can make for
robust and maintainable code

Overlooked area by language designers

Command Language
Cross between TCL and LISP
Can be used as OS shell
Eliza MIT LL4 conference
Classification of Functions into 4 Types:
<G> Generator
<C> Concentrator
<T> Transformer
<F> Filter
Connected with pipes (aka unix shells)

Miscellaneous
Standard Development Environment
Text Editor, Debugger, Static Analysis, Etc.

Programming is about community


Programmers Organization

Builds community
Standards body
Funds open source projects
Earns revenue by supplying directory services

The End
Thanks for listening

Das könnte Ihnen auch gefallen