MoLa - an interpreted (via VM) programming language featuring modules, garbage collection, and other stuff
Modules are MoLa programs (files) which can be imported into other programs. Any .mola file can be imported using its location.
When a module gets imported, it is assigned a name under which it will be accessible to the importer program. When we import module A into program B, the following happens:
- A gets executed normally, without access to B's objects.
- Objects exported from A are combined into a special module object. Each exported object gets assigned an identifier under which it can be accessed in that module object.
- The module structure gets assigned the identifier provided in the import statement. From now on, exported objects from A are accessible to B.
A module structure is always stored as a global variable in the importer's environment.
Each program (file) is a module. Such a module can be imported using its path in the filesystem. There are 3 types of modules:
- Modules included in the MOLA_MODULES_PATH directory. A module at location MOLA_MODULES_PATH/a/b/c.mola is imported using the import mola.a.b.c as c statement.
- Modules that are considered part of the application. Such modules are usually part of the complete project, and are placed somewhere close to the importer program. Such a module with a relative path of ~/a/b/c.mola (where directory ~ contains the importer program) can be imported using the import a.b.c as c statement. To go up a directory, the '^' token is used: import ^^tools
- Modules that are somewhere else in the system are imported using the absolute path to the program. A module at location a/b/c.mola can be imported using the import "a/b/c.mola" as c statement.
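The three import forms above can be sketched as a path resolver. This is only an illustration in Python, not the MoLa implementation: the function name resolve_import and its parameters are hypothetical, and it assumes that each leading '^' moves one directory up from the importer's directory.

```python
from pathlib import PurePosixPath


def resolve_import(spec, importer_dir, modules_path):
    """Map an import specifier to the path of a .mola file (illustrative only)."""
    # A quoted specifier is taken as a literal path to the program.
    if len(spec) >= 2 and spec[0] == '"' and spec[-1] == '"':
        return PurePosixPath(spec[1:-1])

    # Assumption: each leading '^' moves one directory up from the importer.
    levels_up = len(spec) - len(spec.lstrip("^"))
    parts = spec[levels_up:].split(".")

    if levels_up == 0 and len(parts) > 1 and parts[0] == "mola":
        # The "mola." prefix selects the MOLA_MODULES_PATH directory.
        base = PurePosixPath(modules_path)
        parts = parts[1:]
    else:
        base = PurePosixPath(importer_dir)
        for _ in range(levels_up):
            base = base.parent

    return base.joinpath(*parts[:-1], parts[-1] + ".mola")


library_path = resolve_import("mola.a.b.c", "/app", "/opt/mola")
local_path = resolve_import("a.b.c", "/app", "/opt/mola")
parent_path = resolve_import("^^tools", "/app/x/y", "/opt/mola")
literal_path = resolve_import('"a/b/c.mola"', "/app", "/opt/mola")
```

The directories /app and /opt/mola are placeholders chosen for the example.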
When module code is executed, it has access to everything it would have access to if it were executed normally as the main program. This means that globals, functions, and types are directly accessible only within their module. Objects that are not exported are inaccessible to the importer; however, they stay accessible to the imported module.
In an application, a module can only be imported once. This means that subsequent import statements will return a new reference to the already created module structure.
Each module has its own environment. The module structure has a pointer to that environment.
A module structure holds the absolute offset of the first instruction of its code.
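The "imported once" rule can be pictured as a cache keyed by the module path. The sketch below is a Python model under assumptions: ModuleRegistry, import_module, and the execute callback are hypothetical names, and a plain dict stands in for the module structure.

```python
class ModuleRegistry:
    """Keeps one module structure per path (illustrative model only)."""

    def __init__(self):
        self._modules = {}

    def import_module(self, path, execute):
        module = self._modules.get(path)
        if module is None:
            # First import: run the module code and keep its environment.
            module = {"path": path, "environment": execute(path)}
            self._modules[path] = module
        # Later imports return a reference to the same structure.
        return module


executed = []


def run_module(path):
    # Stand-in for executing the module; returns an empty environment.
    executed.append(path)
    return {}


registry = ModuleRegistry()
first = registry.import_module("/app/a.mola", run_module)
second = registry.import_module("/app/a.mola", run_module)
```

Here the module code runs only for the first import; the second call gets the structure created earlier.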
Objects are references to values: Object -> Value.
- Objects always have a pointer to their type.
- The value of object X is denoted as val(X).
- A type object is an object that references a type value, which can be used to create new instances of that type.
- An object O -> V can be copied in 2 ways:
  - O` -> V (by reference)
  - O` -> copy(V) (by value)
- When we copy y using the rules defined by its type, we say that it is a copy by auto. For the resulting object x we write x -> copyauto(y).
- All builtin types are copied by reference.
- Builtin types are:
  - immutable, if it is bool, int, float, null, error, or a type value
  - mutable, if it is string, array, or a custom type
- When we have objects A -> X and B -> Y, then we can make A reference B's value (we say A references B) if we make A -> Y.
- When a reference/object x is removed, we simply make x point to null, and let GC deal with x.
- When a reference/object x is destroyed, we destroy val(x) and let GC deal with x.
- Assignment x = y is executed in steps:
  - if x is an identifier and a variable with such a name doesn't exist, create x as an object with value null
  - create a copy by auto of y: z
  - make x a reference of z: x -> val(z)
  - remove reference z
  - result: x -> copyauto(y)
- Assignment by value can be forced using: x copies y
  - if x is an identifier and a variable with such a name doesn't exist, create x as an object with value null
  - create a copy of y by value: z
  - make x a reference of z: x -> val(z)
  - remove reference z
  - result: x -> copy(val(y))
- Assignment by reference can be forced using: x refs y
  - if x is an identifier and a variable with such a name doesn't exist, create x as an object with value null
  - create a copy of y by reference: z
  - make x a reference of z: x -> val(z)
  - remove reference z
  - result: x -> val(y)
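The three assignment forms can be modelled in a few lines of Python. This is a sketch, not the VM's implementation: the class Obj and the helper names are hypothetical, copy(v) is assumed to be a deep copy, and copy by auto is assumed to fall back to copy by reference, as stated for builtin types.

```python
import copy


class Obj:
    """An object is a reference to a value: Obj -> value."""

    def __init__(self, value=None):
        self.value = value


def copy_by_reference(y):
    return Obj(y.value)                  # z -> val(y)


def copy_by_value(y):
    return Obj(copy.deepcopy(y.value))   # z -> copy(val(y)), assumed deep


def copy_by_auto(y):
    # Assumption: builtin types are copied by reference.
    return copy_by_reference(y)


def assign(x, y, make_copy=copy_by_auto):
    z = make_copy(y)     # create the temporary copy z
    x.value = z.value    # make x a reference of z: x -> val(z)
    z.value = None       # remove reference z
    return x


y = Obj([1, 2, 3])
a = assign(Obj(), y)                     # a = y
b = assign(Obj(), y, copy_by_value)      # b copies y
c = assign(Obj(), y, copy_by_reference)  # c refs y
y.value.append(4)                        # visible through a and c, not b
```

After the append, a and c still share the list with y, while b keeps its own three-element copy.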
- Functions are special objects that can only be created in the global scope (top level) of the module.
- A function can see itself, global variables, types, and other functions created in that module.
- Functions accept a fixed number of arguments.
- Functions always return a value.
- Function arguments are passed to the function in 4 ways:
  - by value, when the parameter is defined as: copy x. Such a parameter becomes a copy by value of the passed argument: arg -> copy(val(x))
  - by reference, when the parameter is defined as: ref x. Such a parameter becomes a copy by reference of the passed argument: arg -> val(x)
  - by original, when the parameter is defined as: orig x. Such a parameter becomes the passed argument itself: arg = x
  - by auto, when the parameter is defined as: x. Such a parameter becomes a copy by auto of the passed argument: arg -> copyauto(x)
- A function may return a value in 3 ways:
  - by value using: return copy x. A copy by value gets returned: ret -> copy(val(x))
  - by reference using: return ref x. A copy by reference gets returned: ret -> val(x)
  - by auto using: return x. A copy by auto gets returned: ret -> copyauto(x)
- Custom types are containers with fields and methods.
- Fields are fixed and new fields cannot be added to the created type object nor the type instance.
- Methods are functions (and have all their features and nuances) that can only be called with the instance object: x.method()
- Methods have access to the instance via "this" keyword.
- Methods have to be defined at the same level where the type is defined.
- Methods are stored in the type, therefore an instance has to access its type and then the method in order to run it.
Errors in MoLa happen when something goes wrong: division by zero, negative index access in array, and others. An error is an object which contains 4 public fields:
- code. An integer uniquely corresponding to the error type.
- reason. A string describing the reason why the error has happened.
- module. A string containing the path to the module (file) where the error has happened.
- line. A line number in the module which caused the error.
However, when an error is uncaught, it prints all locations which are relevant to the error.
The error stack contains pairs (checkpoints): (module path, line number) which are currently executing. When an error is printed, the entire error stack gets printed as well (beginning with the youngest checkpoint, ending with the oldest one). Therefore, the programmer can see what code caused the error, as well as the entire execution path to that code. The error stack receives new checkpoints when any MoLa instruction is executed. Checkpoints are removed when the respective instructions finish execution.
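A minimal Python model of the error stack is shown below. The class name ErrorStack, its methods, and the "path:line" output format are assumptions made for the example; only the push/pop behaviour and the youngest-first printing order come from the description above.

```python
class ErrorStack:
    """Stack of (module path, line number) checkpoints (illustrative model)."""

    def __init__(self):
        self._checkpoints = []

    def push(self, module_path, line):
        # Called when an instruction starts executing.
        self._checkpoints.append((module_path, line))

    def pop(self):
        # Called when the respective instruction finishes.
        self._checkpoints.pop()

    def format_trace(self):
        # Youngest checkpoint first, oldest last.
        return [f"{path}:{line}" for path, line in reversed(self._checkpoints)]


stack = ErrorStack()
stack.push("main.mola", 3)
stack.push("lib.mola", 10)
trace_during_call = stack.format_trace()
stack.pop()
trace_after_call = stack.format_trace()
```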
Basic errors are registered in the std/errors module and are available as integer codes: ZeroDivisionError, NameError, etc. New errors can be registered using the registerError(defaultReason) function from the same module. This function returns the integer code of the newly registered error. Note that defaultReason must be provided when registering the error; it will be included in the error when the signal statement doesn't provide the because part.
An error can be unregistered using the unregisterError(code) function.
To catch an error, the try-catch statement is needed:
try {...}
catch CODE1 as NAME {...} // optional
catch CODE2 as NAME {...} // optional
...
finally catch as NAME {...} // optional
When an error happens, the youngest try-catch statement (that covers the code that caused the error) tries to handle it:
- If there is a catch clause with the same error code, then the error object is created and becomes available under name NAME, and the code of that clause is executed. If an error happens in that block, it gets raised outside of the current try-catch statement.
- Otherwise, if there is a finally catch clause, then the error object is created and becomes available under name NAME, and the code of that clause is executed. If an error happens in that block, it gets raised outside of the current try-catch statement.
- Otherwise, the error is raised (transferred) to the next youngest try-catch statement that covers the code that caused the error.
If the error ends up outside of a try-catch statement, it gets printed to stderr, and the program ends with an error.
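The handler selection described above can be sketched as a search from the youngest enclosing try-catch statement outwards. The Python below is a simplified model: find_handler and the tuple layout (set of caught codes, whether a finally catch clause exists) are hypothetical.

```python
def find_handler(try_statements, code):
    """Pick the clause that handles an error (illustrative model).

    try_statements lists the enclosing try-catch statements, oldest first.
    Each entry is (set of caught codes, has_finally_catch).
    Returns (index of the statement, clause kind) or None if uncaught.
    """
    for index in range(len(try_statements) - 1, -1, -1):
        catch_codes, has_finally_catch = try_statements[index]
        if code in catch_codes:
            return index, "catch"
        if has_finally_catch:
            return index, "finally catch"
        # Otherwise the error is transferred to the next youngest statement.
    return None


enclosing = [({1}, True), ({2}, False)]
inner_match = find_handler(enclosing, 2)
outer_match = find_handler(enclosing, 1)
fallback_match = find_handler(enclosing, 7)
uncaught = find_handler([({2}, False)], 7)
```

A result of None corresponds to the error leaving all try-catch statements, in which case it is printed and the program ends.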
A scope holds variables which can be accessed inside that scope.
- Scopes have a pointer to the parent scope.
- A scope may or may not be allowed to access the parent scope.
- Local variables are created in the current scope.
- When an identifier is searched in a scope, the following happens:
  1. If a function with that name exists in the module, it gets returned.
  2. If not 1, and a type with that name exists in the module, it gets returned.
  3. If not 2, and a global variable with that name exists in the module, it gets returned.
  4. If not 3, and a variable with that name exists in the scope, it gets returned.
  5. If not 4, and the scope is allowed to access its parent scope, the search happens in the parent scope, starting with step 4.
  6. If not 5, and the scope is not allowed to access its parent scope, a NameError is signalled.
- A new scope is created in the following situations:
  1. Function call. Parent access is forbidden.
  2. while/for loop body. Parent access is allowed.
  3. if/else body. Parent access is allowed.
  4. Normal block statement (not in 1., 2., 3.). Parent access is allowed.
- A scope is destroyed when its respective code structure ends.
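The lookup order can be sketched in Python as follows. Module, Scope, and MolaNameError are hypothetical names for this example, and step 4 is read here as a search among the local variables of the current scope.

```python
class MolaNameError(Exception):
    """Stand-in for MoLa's NameError."""


class Module:
    def __init__(self):
        self.functions = {}
        self.types = {}
        self.globals = {}


class Scope:
    def __init__(self, module, parent=None, parent_access=True):
        self.module = module
        self.parent = parent
        self.parent_access = parent_access and parent is not None
        self.locals = {}

    def lookup(self, name):
        # Steps 1-3: functions, then types, then globals of the module.
        for table in (self.module.functions, self.module.types, self.module.globals):
            if name in table:
                return table[name]
        # Steps 4-5: locals of this scope, then of accessible parents.
        scope = self
        while scope is not None:
            if name in scope.locals:
                return scope.locals[name]
            scope = scope.parent if scope.parent_access else None
        # Step 6: nothing found and no parent left to search.
        raise MolaNameError(name)


module = Module()
module.globals["g"] = 1
outer = Scope(module)
outer.locals["x"] = 10
block = Scope(module, outer, parent_access=True)    # e.g. a loop body
call = Scope(module, outer, parent_access=False)    # e.g. a function call

seen_from_block = block.lookup("x")
global_from_call = call.lookup("g")
try:
    call.lookup("x")
    hidden_from_call = False
except MolaNameError:
    hidden_from_call = True
```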
The GC in MoLa does reference counting, assuming that the number of references to an object fits into a 32-bit unsigned integer. To handle cycles, a simple mark-and-sweep cycle is run when there is too much memory allocated.
A checkphase is an event when the GC updates its information about memory usage. A checkphase happens when the Allocator allocates CHECKPHASE_THRESHOLD objects. CHECKPHASE_RATIO is the ratio between the number of bytes allocated (and not deallocated) since the last checkphase and the number of bytes allocated (and not deallocated) between the second-to-last and last checkphases.
When an object's reference counter reaches 0, it becomes garbage, and therefore has to be destroyed. The GC pushes this object onto the garbage stack.
Each reference update causes destruction of DESTRUCTION_COUNT * CHECKPHASE_RATIO objects from the garbage stack.
During the checkphase, the GC decides that it needs to run a collection cycle if:
- the new CHECKPHASE_RATIO is greater than GC_CYCLE_THRESHOLD,
- or the collection cycle hasn't been run for GC_CYCLE_CHECKSPHASES_LIMIT checkphases.
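The decision made during a checkphase can be written as two small functions. This is a Python sketch with hypothetical function and parameter names; the handling of an empty previous interval (division by zero) is an assumption, since the text does not cover it.

```python
def checkphase_ratio(bytes_since_last, bytes_previous_interval):
    """Ratio of live bytes allocated in the last two checkphase intervals."""
    if bytes_previous_interval == 0:
        # Assumption: treat any growth from an empty interval as unbounded.
        return float("inf") if bytes_since_last > 0 else 0.0
    return bytes_since_last / bytes_previous_interval


def should_run_cycle(ratio, checkphases_since_cycle, cycle_threshold, checkphases_limit):
    """True if a collection cycle should run at this checkphase."""
    return ratio > cycle_threshold or checkphases_since_cycle >= checkphases_limit


ratio = checkphase_ratio(300, 100)
grows_fast = should_run_cycle(ratio, 1, cycle_threshold=2.0, checkphases_limit=10)
steady = should_run_cycle(1.0, 1, cycle_threshold=2.0, checkphases_limit=10)
overdue = should_run_cycle(1.0, 10, cycle_threshold=2.0, checkphases_limit=10)
```

The threshold 2.0 and the limit 10 are example values, not the actual GC_CYCLE_THRESHOLD and GC_CYCLE_CHECKSPHASES_LIMIT settings.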
If a decision to run a collection cycle is taken, the GC does the following:
- destroys all objects on the garbage stack.
- runs a simple mark & sweep cycle to break reference cycles.
To implement this, the GC maintains a set of objects that have been created. When an object is destroyed, it is removed from this set. When a GC collection cycle is run:
- The roots are globals and locals (objects reachable from any active scope, the objects stack, or any other similar source). Functions and types are always alive and do not reference any objects other than themselves, and therefore they are not processed during garbage collection. The roots are taken from all loaded modules.
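A mark & sweep pass over the set of created objects can be sketched as below. This is a generic Python illustration rather than MoLa's collector: objects are represented by plain identifiers, and the references mapping (object to the objects it points to) is a hypothetical stand-in for walking real object fields.

```python
def mark_and_sweep(all_objects, roots, references):
    """Remove unreachable objects from all_objects and return them."""
    marked = set()
    pending = list(roots)
    # Mark phase: everything reachable from the roots stays alive.
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue
        marked.add(obj)
        pending.extend(references.get(obj, ()))
    # Sweep phase: whatever was created but not marked is garbage.
    garbage = all_objects - marked
    all_objects -= garbage
    return garbage


created = {"a", "b", "c", "d"}
points_to = {"a": ["b"], "c": ["d"], "d": ["c"]}   # c and d form a cycle
collected = mark_and_sweep(created, ["a"], points_to)
```

The cycle between c and d keeps both reference counters above zero, which is why reference counting alone cannot reclaim them.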
An environment is a data structure assigned to each module. It holds all the variables created in that module, and implements a lookup function.
An environment also contains all exported objects of the module, and lets other modules use those objects.
Global variables are stored in the module's environment.
To execute a program, it first has to be converted to a list of simple instructions, which will be then executed by the VM.
An instruction is a structure that contains details of how some action has to be executed. A MoLaVM instruction is made of:
- Instruction code. Denoted as inscode(ins).
- The name of the file which caused the instruction to be generated. Denoted as insfile(ins).
- The line number in the file which caused the instruction to be generated. Denoted as insline(ins).
- Arguments of the instruction, packed into it during its creation. Denoted as insarg_i(ins), where i is the index of the argument.
Instructions of a module are stored in a single list. Instructions from new modules are appended to the end of the list. A module structure stores the absolute offset of the first instruction of that module in the list.
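The instruction layout and the shared instruction list can be modelled as follows. This is a Python sketch with hypothetical names (Instruction, Program, append_module); the fields mirror inscode, insfile, insline, and insarg_i.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    code: str          # inscode(ins)
    file: str          # insfile(ins)
    line: int          # insline(ins)
    args: tuple = ()   # insarg_i(ins) is args[i]


class Program:
    """Single list of instructions shared by all modules."""

    def __init__(self):
        self.instructions = []

    def append_module(self, module_instructions):
        # The module's absolute offset is where its first instruction lands.
        offset = len(self.instructions)
        self.instructions.extend(module_instructions)
        return offset


program = Program()
main_offset = program.append_module([
    Instruction("LOAD_INT", "main.mola", 1, (5,)),
    Instruction("POP", "main.mola", 1),
])
lib_offset = program.append_module([
    Instruction("LOAD_INT", "lib.mola", 1, (7,)),
])
first_lib_instruction = program.instructions[lib_offset]
```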
An instruction may access the arguments stack. By convention:
- "Runtime" arguments of the instruction have to be pushed on the stack.
- Any instruction that requires arguments takes them and removes them from the stack before executing actual operation. This way memory leaks are prevented when returning / catching an error.
- The execution result of the instruction has to be pushed on the stack, unless it is absent.
Structures:
- Objects stack (arguments stack). Contains objects that are intended to be used as arguments in the instructions.
- A stack of registered error handlers. Each handler (a catch block) is a tuple of:
  - error code.
  - absolute offset (of the catch block) to where the program has to jump.
  - the number of elements on the stack before the try block was executed. The stack is rolled back to this state.
  - the current scope pointer.
The following is the list of MoLaVM instructions with their descriptions.
- The stack must contain an object. Pops an object from the stack. Signals an InternalError if the stack is empty.
- The stack must contain 2 objects.
Swaps two top elements on the stack. Signals an InternalError if there are less than 2 objects on the stack.
Creates a new environment. The environment gets assigned an id, which is later used to switch to that environment. This id is assigned during runtime. The root program gets assigned the id 0.
Switches the current environment to the environment of the module where this instruction is located.
- The stack must contain an object which is a function. Switches the current environment to the environment of the function located on the stack.
- module_path is the path to the module we want to import.
- identifier is the name under which we want to import the module.
Imports a module.
- Checks whether module_path is correct. Otherwise signals ImportError.
- Checks whether there is no global variable with the name identifier in the current environment. If there is, signals NameCollisionError.
- Checks whether the module at module_path has already been imported (directly or indirectly). If so, identifier becomes a reference to that module structure, and the instruction terminates.
- Generates instructions and appends them to the end of the instructions list. All generated instructions receive a new env_id.
- Executes the instructions of the module in the new environment.
- Creates a module structure containing the environment under the name of identifier, as well as the absolute offset to the instructions in the imported module.
- identifier is the name under which we want to export the object on the stack.
- the stack must contain an object.
Exports the object on the stack under the name identifier by adding it to the current environment. Throws an InternalError if the stack is empty. If there's already an object exported with the given name, throws NameCollisionError.
- identifier is the name of the global variable we want to create.
Creates a global variable in the current environment, initially with value null. Throws NameCollisionError if there's already an accessible object with the same name in the environment.
- identifier is the name of the function.
- mode_i is the mode of the i-th argument (copy, ref, pass, or auto).
- arg_i is the name of the i-th argument.
Creates a new function object under the given name. The object will have a list of modifiers and argument names, which will be used when calling the function. The function object contains 2 values:
- (Absolute) position of the second next instruction (after this one) in its module.
- A pointer to the module structure (which contains the absolute module offset).
Signals a NameCollisionError if an accessible object with the given name already exists. Signals a DuplicateNameError if there are duplicate argument names.
- identifier is the name of the type.
- f is the number of fields.
- m is the number of methods.
- name_i is the name of the i-th member of the type. These are listed in the order: fields, methods.
Creates a new type under the given name. The type object contains the information about what fields and methods it has.
Signals a NameCollisionError if an accessible object with the given name already exists, or a DuplicateNameError if there are duplicate field/method names.
- identifier is the name of the method.
- mode_i is the mode of the i-th argument (copy, ref, or pass).
- arg_i is the name of the i-th argument.
- the stack must contain an object.
Creates a new method of the type object on the stack.
Note: the 'this' special argument has to be passed as a normal argument; it is not automatically added to the arguments list by this instruction.
The method object will have a list of modifiers and argument names, which will be used when calling the function. The object will also have a pointer to the next instruction (after the current one), which is where the method body will be generated. The type will receive a pointer to this method.
Signals an InternalError if the stack is empty. Signals a ValueError if the object on the stack is not a type object. Signals a NameCollisionError if the method has already been defined. Signals a NameError if the type object didn't declare a method with such a name. Signals a DuplicateNameError if there are duplicate argument names.
Note: this is used to create the constructor / destructor as well. In those cases, identifier becomes constructor / destructor respectively.
- access_mode is WITH_PARENT_ACCESS if the scope should have access to its parent scope, and WITHOUT_PARENT_ACCESS otherwise.
Creates a new scope. The access to the parent scope is determined by the access_mode argument. If there is no parent scope, access_mode is disregarded.
Destroys the current scope. Signals an InternalError if there is no scope to destroy.
Removes records from error handlers stack that were bound to this scope. (is it gonna work?)
- offset is the relative offset of the instruction where the VM has to jump if the value on the stack is false.
- the stack must contain an object.
If the object on the stack is false, jumps to the instruction at the position offset relative to the current instruction.
Signals a ValueError if the object on the stack is not boolean, InternalError if the stack is empty.
- offset is the relative offset of the instruction where the VM has to jump.
Jumps to the instruction at the position offset relative to the current instruction.
- the stack must contain an object.
Jumps to the instruction at the absolute offset given in the integer object on the stack.
Signals a ValueError if the object on the stack is not integer, InternalError if the stack is empty.
- the stack must contain an object.
Registers a new catch statement with the error code provided on the stack. If the value on the stack is -1, a universal handler is registered.
- Checks whether the object on the stack is an integer, and takes its value.
- Inserts a new tuple of (integer error code, absolute offset of the second next instruction after the current one, number of elements on the stack, current scope pointer) on the error handlers stack.
Signals InternalError if there is no object on the stack, ValueError if the object on the stack is not an integer.
Removes n error handler tuples from the error handlers stack. Signals an InternalError if the error handlers stack contains fewer than n tuples.
- the stack must contain 2 objects: bottom->top: ... s x
Signals an error with code x and message s.
- Takes the object from the stack, extracting the error code.
- Goes through the error handlers stack. If it doesn't locate a handler with the given error code, the program terminates.
- Otherwise it destroys the current scope until it finds the right one. Then it jumps to the instruction at the absolute offset.
- An error object is pushed on the stack.
Signals an InternalError if the stack is empty, or ValueError if the object x on the stack is not an integer, or s is not a string.
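The unwinding performed when an error is signalled can be sketched with the handler tuples described earlier. The Python below is a simplified model under assumptions: the scope pointer is replaced by a scope depth, the error object is a plain dict, a handler with code -1 is treated as matching any error, and handlers are left on the stack because their removal is done by separate instructions.

```python
from collections import namedtuple

# (error code, absolute offset, stack size before try, scope depth)
Handler = namedtuple("Handler", "code offset stack_size scope_depth")
UNIVERSAL = -1


def signal_error(handlers, stack, scope_depth, code, reason):
    """Return (jump offset, scopes to destroy), or None if uncaught."""
    for handler in reversed(handlers):          # youngest handler first
        if handler.code == code or handler.code == UNIVERSAL:
            del stack[handler.stack_size:]      # roll the stack back
            stack.append({"code": code, "reason": reason})
            return handler.offset, scope_depth - handler.scope_depth
    return None                                 # the program terminates


handlers = [Handler(UNIVERSAL, 100, 0, 0), Handler(7, 200, 1, 1)]

stack_a = ["a", "b", "c"]
specific = signal_error(handlers, stack_a, 3, 7, "boom")

stack_b = ["a", "b"]
universal = signal_error(handlers, stack_b, 2, 9, "other")

unhandled = signal_error([], [], 0, 1, "nobody listens")
```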
- the stack must contain an object.
Creates a new variable with name identifier
, pointing to the object on the stack.
Signals a NameCollisionError if an accessible object with the given name already exists, InternalError if the stack is empty.
- the stack must contain an object.
Copies by value the object from the stack, and pushes that created copy on the stack.
Signals an InternalError if there is no object on the stack.
- the stack must contain an object.
Copies by reference the object from the stack, and pushes that created copy on the stack.
Signals an InternalError if there is no object on the stack.
- the stack must contain an object.
Copies by auto the object from the stack, and pushes that created copy on the stack.
Signals an InternalError if there is no object on the stack.
- the stack must contain 2 objects: bottom->top : ... object2 object1
Replaces the pointer Object1->Value1 with Object1->Value2. Object2 gets removed from the stack, Object1 stays.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x or y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x and y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an integer object with the value of (x | y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an integer object with the value of (x ^ y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an integer object with the value of (x & y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x == y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x != y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x < y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x <= y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x > y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates a boolean object with the value of (x >= y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an object with the value of (x + y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an object with the value of (x - y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an integer object with the value of (x << y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an integer object with the value of (x >> y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an object with the value of (x * y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an object with the value of (x / y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 2 objects: bottom->top: ... x y
Creates an object with the value of (x % y). The object is pushed on the stack.
Signals an InternalError if there are less than 2 objects on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 1 object: bottom->top: ... x
Creates an object with the value of -x. The object is pushed on the stack.
Signals an InternalError if there are less than 1 object on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 1 object: bottom->top: ... x
Creates an object with the value of ~x. The object is pushed on the stack.
Signals an InternalError if there are less than 1 object on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain 1 object: bottom->top: ... x
Creates an object with the value of not x. The object is pushed on the stack.
Signals an InternalError if there are less than 1 object on the stack. Signals a ValueError if the objects have invalid values.
- the stack must contain n + 1 objects: bottom->top: ... f arg1 arg2 ... argn
Calls function f.
- Checks whether the number of arguments matches the number expected by f. Raises a WrongNumberOfArgumentsError if it doesn't match.
- Adds new local variables to the scope:
- for i-th argument, a copy is created based on the i-th mode specified in f.
- that copy is added to the scope under the i-th name stored in f.
- The address of the next instruction is pushed on the stack as the return address.
- Jumps to the address stored in the function object.
- the stack must contain 2 objects: bottom->top: ... x y
Returns x[y].
Signals an InternalError if there are less than 2 objects on the stack, ValueError if y is not integer, OutOfBoundsError if y is out of bounds.
Creates and pushes a boolean object with value = value on the stack.
Creates and pushes a character object with value = value on the stack.
Creates and pushes an integer object with value = value on the stack.
Creates and pushes a float object with value = value on the stack.
Creates and pushes a string object with value = value on the stack.
Creates and pushes a null object on the stack.
- identifier is the name of the object we want to locate.
- The stack must contain an object.
Finds the field with name identifier of the object on the stack, and pushes it on the stack. Signals InternalError if the stack is empty, NameError if the object on the stack doesn't have a field with the given name.
- The stack must contain an object.
The object on the stack is put in the special "caller" register.
Finds the method with name identifier of the object on the stack, and pushes it on the stack. Signals InternalError if the stack is empty, NameError if the object on the stack doesn't have a method with the given name.
- The stack must contain n+1 objects: bottom->top: t a_1 ... a_n
Creates an instance of the given type t, passing arguments a_1 ... a_n into the constructor.
Raises an InternalError if there are less than n+1 objects on the stack, ValueError if t is not a type object, and other errors that could happen when calling the constructor (which is a function).
Some notations:
gen(x)
means taking x and converting it into a list of instructions. The instructions are returned.
Instructions have a number preceding them. This is used as a marker for the position of certain instructions. It is neither relative nor absolute; its only purpose is to mark the instructions in order.
If we want to specify that a jump instruction has to jump at a specific place, we can use the number of the line, preceding it with REL or ABS: JUMP REL@3 or JUMP ABS@3
1 IMPORT_MODULE module_path identifier
# gen(expr_1)
1 EXPORT_OBJECT name_1
# gen(expr_2)
2 EXPORT_OBJECT name_2
...
# gen(expr_n)
n EXPORT_OBJECT name_n
1 CREATE_GLOBAL identifier
1 CREATE_FUNCTION identifier mode_1 arg_1 ... mode_n arg_n
2 JUMP REL@4
# gen(block_stmt)
3 RETURN
4
1 CREATE_TYPE identifier n k f_1 ... f_n m_1 ... m_k
1 LOAD type_name
2 CREATE_METHOD name 'ref' 'this' mode_1 arg_1 ... mode_n arg_n
3 JUMP REL@5
# gen(block_stmt)
4 RETURN
5
1 LOAD type_name
2 CREATE_METHOD 'constructor' 'ref' 'this' mode_1 arg_1 ... mode_n arg_n
3 JUMP REL@5
# gen(block_stmt)
4 RETURN
5
1 LOAD type_name
2 CREATE_METHOD 'destructor' 'ref' 'this'
3 JUMP REL@5
# gen(block_stmt)
4 RETURN
5
1 CREATE_SCOPE WITH_PARENT_ACCESS
# gen(block_stmt)
3 DESTROY_SCOPE
1 gen(expr)
2 JUMP_IF_FALSE REL@4
# gen(stmt)
3 JUMP REL@1
4
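The loop template above can be turned into concrete relative offsets once the lengths of gen(expr) and gen(stmt) are known. The Python sketch below uses hypothetical names and tuple-shaped instructions, and assumes that a relative offset is added to the index of the jump instruction itself.

```python
def gen_while(cond_code, body_code):
    """Emit instructions for a while loop with relative jump offsets."""
    out = list(cond_code)                                  # 1 gen(expr)
    # 2 JUMP_IF_FALSE REL@4: skip the body and the trailing JUMP.
    out.append(("JUMP_IF_FALSE", len(body_code) + 2))
    out.extend(body_code)                                  # gen(stmt)
    # 3 JUMP REL@1: go back to the first instruction of the condition.
    out.append(("JUMP", -(len(cond_code) + 1 + len(body_code))))
    return out                                             # 4 is the next index


loop = gen_while([("LOAD", "x")], [("LOAD", "y"), ("POP",)])
exit_target = 1 + loop[1][1]     # index of JUMP_IF_FALSE plus its offset
back_target = 4 + loop[4][1]     # index of JUMP plus its offset
```

In this example the exit target equals the length of the generated list, which is the first instruction after the loop.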
if expr2 is present:
1 CREATE_SCOPE WITH_PARENT_ACCESS
# gen(expr1?)
2 gen(expr2)
3 JUMP_IF_FALSE REL@4
# gen(stmt)
# gen(expr3?)
# JUMP REL@2
4 DESTROY_SCOPE
if expr2 is not present:
1 CREATE_SCOPE WITH_PARENT_ACCESS
# gen(expr1?)
2 gen(stmt)
# gen(expr3?)
# JUMP REL@2
3 DESTROY_SCOPE
If else part is present:
# gen(expr)
1 JUMP_IF_FALSE REL@3
# gen(stmt1)
2 JUMP REL@4
3 gen(stmt2)
4
If else part is absent:
# gen(expr)
1 JUMP_IF_FALSE REL@2
# gen(stmt1)
2
1 DESTROY_SCOPE
# ...
k DESTROY_SCOPE
# JUMP REL@X
Continue is meaningful only in 2 cases: (innermost loops that contain continue)
while ... // X is after the condition expression
...
{... {...
continue;
...} ...}
...
for ... // X is after the step expression
...
{... {...
continue;
...} ...}
...
In these cases, continue must destroy all the scopes where it is contained.
k is the number of such open scopes.
Otherwise, the instruction is ignored.
1 DESTROY_SCOPE
# ...
k DESTROY_SCOPE
# JUMP REL@X
Break is meaningful only in 2 cases: (innermost loops that contain break)
while ...
...
{... {...
break;
...} ...}
...
// X is here
for ...
...
{... {...
break;
...} ...}
...
// X is here
In these cases, break must destroy all the scopes where it is contained.
k is the number of such open scopes.
Otherwise, the instruction is ignored.
The stack must contain the return address.
The return statement only makes sense in one case:
function f(...) {
...
{... {...
return ...
...} ...}
...
}
K is the number of open scopes that contain the return statement. In all other situations return is ignored.
If mode is copy:
# DESTROY_SCOPE
...
K DESTROY_SCOPE
# gen(expr)
1 COPY_BY_VALUE
2 SWAP
3 RETURN
If mode is ref:
# DESTROY_SCOPE
...
K DESTROY_SCOPE
# gen(expr)
1 COPY_BY_REFERENCE
2 SWAP
3 RETURN
If mode is pass:
# DESTROY_SCOPE
...
K DESTROY_SCOPE
# gen(expr)
1 SWAP
2 RETURN
If mode is absent:
# DESTROY_SCOPE
...
K DESTROY_SCOPE
# gen(expr)
# COPY_BY_AUTO
1 SWAP
2 RETURN
If mode is copies:
# gen(expr2)
1 COPY_BY_VALUE
# gen(expr1)
2 ASSIGNMENT
3 POP
If mode is refs:
# gen(expr2)
1 COPY_BY_REFERENCE
# gen(expr1)
2 ASSIGNMENT
3 POP
'try' block_stmt ('catch' expr 'as' identifier block_stmt)* ('catch' '*' 'as' identifier block_stmt)?
Error handlers are generated in reversed order. For a specific error handler (catch expr as identifier block_stmt), the following is generated:
# gen(expr)
0 REGISTER_CATCH
1 JUMP REL@6
2 DESTROY_CATCH 1
3 SWITCH_ENV
4 CREATE_VAR identifier
# gen(block_stmt)
# DESTROY_CATCH *total handlers - pos of this one*
5 JUMP REL@*first instruction after the try-catch statement*
6
For a universal error handler (catch * as identifier block_stmt), the following is generated:
# LOAD_INT -1
0 REGISTER_CATCH
1 JUMP REL@6
2 DESTROY_CATCH 1
3 SWITCH_ENV
4 CREATE_VAR identifier
# gen(block_stmt)
5 JUMP REL@*first instruction after the try-catch statement*
6
After the catch statements, the try block is generated:
# gen(try_block)
# DESTROY_CATCH *number of catch parts*
For each catch clause, a DESTROY_CATCH instruction is generated.
For example, code:
try try_stmt
catch expr1 as name1 catch_stmt1
catch expr2 as name2 catch_stmt2
catch * as name3 catch_stmt3
will be compiled into:
# LOAD_INT -1
# REGISTER_CATCH
# JUMP REL@1
# DESTROY_CATCH 1
# SWITCH_ENV
# CREATE_VAR name3
# gen(catch_stmt3)
# JUMP REL@4
1 gen(expr2)
# REGISTER_CATCH
# JUMP REL@2
# DESTROY_CATCH 1
# SWITCH_ENV
# CREATE_VAR name2
# gen(catch_stmt2)
# DESTROY_CATCH 1
# JUMP REL@4
2 gen(expr1)
# REGISTER_CATCH
# JUMP REL@3
# DESTROY_CATCH 1
# SWITCH_ENV
# CREATE_VAR name1
# gen(catch_stmt1)
# DESTROY_CATCH 2
# JUMP REL@4
3 gen(try_stmt)
# DESTROY_CATCH 3
4
# gen(expr2)
# gen(expr1)
# SIGNAL_ERROR
# gen(expr)
# COPY_BY_AUTO
# CREATE_VAR identifier
# gen(B)
# COPY_BY_AUTO
# gen(A)
# ASSIGNMENT
# gen(A)
# JUMP_IF_FALSE REL@1
# gen(B)
# JUMP REL@2
1 gen(C)
2
# gen(A)
# gen(B_1)
...
# gen(B_n)
# NEW n
# gen(A)
# gen(B)
# INS
where INS is the instruction that executes operation op
# gen(A)
# gen(B_1)
...
# gen(B_n)
# SWITCH_ENV *env of the function A*
# CREATE_SCOPE WITHOUT_PARENT_ACCESS
# CALL n
# DESTROY_SCOPE
# SWITCH_ENV
# gen(A)
# LOAD_FIELD name
# gen(A)
# LOAD_METHOD name
Builtin functions (like print, cppload) are function objects that point to a C++ function. Such a function, when executed, runs the C++ function, and returns the result.
Builtin functions are stored in a separate module, which is implicitly imported into any other program.
MoLa programs can load C++ functions, binding them to MoLa functions. To load such a function, the cppload builtin function is used.
When a C++ function is loaded (dynamically), a new function object is created that points to that function (just like with the builtin functions). Loaded C++ functions cannot be unloaded.