A memory-polymorphic, polyglot implementation of SplDoublyLinkedList (as transpiled by Pholyglot 0.2-betachicken)
Pholyglot is a transpiler that compiles a subset of PHP into code that is valid as both PHP and C, so-called polyglot code.
This blog post describes the new features added in version 0.2 of the Pholyglot transpiler:
- Two different memory allocation strategies
- A memory-polymorphic linked list
Memory allocation
One of the reasons I started this project was to experiment with an opt-out-of-GC kind of system. In Pholyglot, the Boehm GC is used by default, but you can now also choose to use an arena. Mixing these two memory systems in the same program is not safe yet, but the idea is to add alias control and escape analysis to enforce a clear separation.
Memory-polymorphism
I'm not sure if this is an established term, but in Pholyglot it means that you can tell an object to use the same allocation strategy as another object, without knowing exactly which strategy that is.
This example adds a new Point to a list of points, using the same memory strategy as the list, via a new type of annotation, @alloc:
/**
 * @param SplDoublyLinkedList<Point> $list
 */
function addPointToList(SplDoublyLinkedList $list): void
{
    // Use same memory allocation strategy as $list
    $p = /** @alloc $list */ new Point();
    $list->push($p);
}
Obviously, at a later stage, $list->push($p) must be type-checked so that two different memory strategies aren’t being used in the same collection.
The above snippet compiles to this [1] (and yes, this is valid vanilla PHP):
#define function void
function addPointToList(SplDoublyLinkedList $list)
#undef function
{
#__C__ Point
    $p = new(Point
#__C__, $list->mem
    );
    $list->push(
#__C__ $list,
        $p
    );
}
where new is a macro taking two arguments, the type of the object and a memory allocation strategy struct:
#define new(x, m) x ## __constructor((x) m.alloc(m.arena, sizeof(struct x)), m)
The second argument, m, is a struct mem, defined as:
struct mem {
    uintptr_t* (*alloc) (void* a, size_t size);
    void* arena;
};
That is, it contains a pointer to an allocation function (currently either the Boehm GC allocator or the arena allocator) and a pointer to the arena (not used for Boehm).
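To make the struct a bit more concrete, here is a minimal, self-contained C sketch of two strategies plugged into it. Only struct mem and the new macro are taken from this post. Everything else is an assumption for illustration: heap_alloc uses plain malloc as a stand-in for the Boehm allocator, struct Arena and arena_alloc are a hypothetical bump allocator, and Point is assumed to be a typedef for a pointer to struct Point that remembers its own mem. It is not the code Pholyglot generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same layout as the struct in the post. */
struct mem {
    uintptr_t* (*alloc) (void* a, size_t size);
    void* arena;
};

/* Hypothetical stand-in for the Boehm allocator: ignores the arena pointer. */
static uintptr_t* heap_alloc(void* a, size_t size)
{
    (void) a;
    return malloc(size);
}

/* Hypothetical bump arena: hands out slices of one pre-allocated buffer. */
struct Arena {
    unsigned char* base;
    size_t used;
    size_t capacity;
};

static uintptr_t* arena_alloc(void* a, size_t size)
{
    struct Arena* arena = a;
    /* Round the size up so the next allocation stays aligned. */
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > arena->capacity - arena->used) {
        return NULL; /* arena is full */
    }
    void* p = arena->base + arena->used;
    arena->used += rounded;
    return p;
}

/* Assumed object layout: each object stores the strategy it was made with. */
typedef struct Point* Point;
struct Point {
    int x;
    int y;
    struct mem mem;
};

static Point Point__constructor(Point self, struct mem m)
{
    if (self == NULL) {
        return NULL; /* allocation failed */
    }
    self->x = 0;
    self->y = 0;
    self->mem = m;
    return self;
}

/* The macro from the post, unchanged. */
#define new(x, m) x ## __constructor((x) m.alloc(m.arena, sizeof(struct x)), m)

/* Allocates a point with whatever strategy `other` was created with. */
static Point point_like(Point other)
{
    return new(Point, other->mem);
}
```

The caller of point_like never names a strategy: a point created from an arena-backed point lands in the same arena, and one created from a heap-backed point goes through heap_alloc.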
I hope this makes sense. :)
Other possible memory strategies could be: unsafe, which just mallocs and never frees (possibly useful for global variables); malloc, which mallocs and is not allowed to escape scope (because it’s freed at end of scope); or stack, which allocates on the stack instead of the heap, and is also not allowed to escape. I’ve written more about my thoughts here.
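As a rough idea of how two of these could fit the same struct mem, here is a hedged C sketch. None of this exists in Pholyglot: unsafe_alloc, struct Scope, scope_alloc and scope_end are hypothetical names, and the scope-bound variant simply records each allocation so that everything can be freed in one call at the end of the scope. The stack strategy is left out on purpose, since memory obtained with alloca inside an alloc function would be released as soon as that function returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same layout as the struct in the post. */
struct mem {
    uintptr_t* (*alloc) (void* a, size_t size);
    void* arena;
};

/* Hypothetical "unsafe" strategy: malloc that is never freed. */
static uintptr_t* unsafe_alloc(void* a, size_t size)
{
    (void) a;
    return malloc(size);
}

/* Header placed in front of each scoped allocation. The max_align_t
 * member keeps the payload that follows it suitably aligned. */
union ScopeHeader {
    union ScopeHeader* next;
    max_align_t align;
};

/* Hypothetical scope record, passed through the `arena` pointer. */
struct Scope {
    union ScopeHeader* entries;
    size_t live;
};

/* Hypothetical "malloc" strategy: malloc plus bookkeeping for scope_end. */
static uintptr_t* scope_alloc(void* a, size_t size)
{
    struct Scope* scope = a;
    union ScopeHeader* h = malloc(sizeof(union ScopeHeader) + size);
    if (h == NULL) {
        return NULL;
    }
    h->next = scope->entries;
    scope->entries = h;
    scope->live++;
    return (uintptr_t*) (h + 1); /* payload starts right after the header */
}

/* Frees everything allocated in the scope; call when the scope ends. */
static void scope_end(struct Scope* scope)
{
    while (scope->entries != NULL) {
        union ScopeHeader* next = scope->entries->next;
        free(scope->entries);
        scope->entries = next;
        scope->live--;
    }
}
```

The transpiler would have to emit the scope_end call itself and reject any pointer that escapes the scope, which is where the escape analysis mentioned earlier comes in.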
A full example with points and lists can be found in this gist.
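Separately from the gist, here is a small C sketch of the list side: push allocates its node through the list's own strategy and refuses items that were allocated with a different one. This is only a runtime stand-in for the type check on $list->push($p) mentioned earlier, not how Pholyglot does it. struct Object, struct List, same_strategy, list_push and the two allocators are all hypothetical names, and the sketch assumes every object carries its struct mem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same layout as the struct in the post. */
struct mem {
    uintptr_t* (*alloc) (void* a, size_t size);
    void* arena;
};

/* Two hypothetical strategies, only here so the check has something to compare. */
static uintptr_t* malloc_alloc(void* a, size_t size)
{
    (void) a;
    return malloc(size);
}

static uintptr_t* calloc_alloc(void* a, size_t size)
{
    (void) a;
    return calloc(1, size);
}

/* Assumption: every object remembers the strategy it was created with. */
struct Object {
    struct mem mem;
};

struct Node {
    struct Object* item;
    struct Node* prev;
    struct Node* next;
};

struct List {
    struct mem mem;
    struct Node* head;
    struct Node* tail;
    size_t count;
};

/* Two strategies match when both the function and the arena are the same. */
static bool same_strategy(struct mem a, struct mem b)
{
    return a.alloc == b.alloc && a.arena == b.arena;
}

/* Appends item to the list. Returns false and leaves the list untouched
 * if the item uses another strategy or the node allocation fails. */
static bool list_push(struct List* list, struct Object* item)
{
    if (!same_strategy(list->mem, item->mem)) {
        return false;
    }
    struct Node* node =
        (struct Node*) list->mem.alloc(list->mem.arena, sizeof(struct Node));
    if (node == NULL) {
        return false;
    }
    node->item = item;
    node->next = NULL;
    node->prev = list->tail;
    if (list->tail != NULL) {
        list->tail->next = node;
    } else {
        list->head = node;
    }
    list->tail = node;
    list->count++;
    return true;
}
```

Comparing the arena pointer as well as the function matters once there is more than one arena: two objects from different arenas share arena alloc code but must not end up in the same collection.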
Notes
1. The #__C__ word is removed using sed before compiling with gcc.