| Authors: | frank, kena |
|---|---|
| Date: | May 2012 |
| Abstract: | This note proposes to extend the field syntax with a simple scheme to support integer and floating point scalars, with no visible change for existing code. |
The type system for records currently knows about tags, which are simple integers, and fields, which are manipulated by reference.
The machinery that supports object fields, while suitable for “large” data like arrays, is heavyweight for communicating “simple” values like long integers or floating-point scalars.
We therefore propose a simple syntax to annotate those fields in a box interface that should be communicated as scalars instead of object references.
The box syntax is of the form “(a,b,c,...) -> (d,e,f,...)”.
As long as each of the names “a”, “b”, “c”, etc. contains only alphanumeric characters, it refers to an object manipulated by reference. This default remains.
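In addition to this, we introduce the following 3 formats, written as a suffix on the field name (these are the suffixes used by the examples in this note):
| Suffix | Meaning |
|---|---|
| .i | Integer scalar (intval_t), e.g. “a.i”. |
| .f | Single-precision floating-point scalar (float), e.g. “v.f”. |
| .d | Double-precision floating-point scalar (double), e.g. “x.d”. |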
From the perspective of box typing and combination, these should be considered part of the field name, i.e. they should match exactly when connecting the output of one box to the input of another. We do not propose automatic conversion of any kind.
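The exact-match rule can be sketched in C as follows. This is only an illustration: the names `field_kind_t`, `field_kind` and `fields_connect` are hypothetical and not part of any runtime interface, and the suffix set (`.i`, `.f`, `.d`) is taken from the examples in this note. The sketch does not check that the part before the suffix is alphanumeric.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of a field name by its annotation suffix. */
typedef enum {
    FIELD_REF,     /* no suffix: object manipulated by reference */
    FIELD_INT,     /* ".i": integer scalar */
    FIELD_FLOAT,   /* ".f": single-precision scalar */
    FIELD_DOUBLE,  /* ".d": double-precision scalar */
    FIELD_INVALID  /* anything else */
} field_kind_t;

/* Classify a name such as "a", "a.i", "v.f" or "x.d". */
field_kind_t field_kind(const char *name)
{
    assert(name != NULL);
    const char *dot = strchr(name, '.');
    if (dot == NULL)
        return FIELD_REF;
    /* Require a non-empty base name and exactly one character after the dot. */
    if (dot == name || dot[1] == '\0' || dot[2] != '\0')
        return FIELD_INVALID;
    switch (dot[1]) {
    case 'i': return FIELD_INT;
    case 'f': return FIELD_FLOAT;
    case 'd': return FIELD_DOUBLE;
    default:  return FIELD_INVALID;
    }
}

/* An output connects to an input only if the full names, suffix included,
   are identical. No conversion between kinds is attempted. */
bool fields_connect(const char *out_name, const char *in_name)
{
    return field_kind(out_name) != FIELD_INVALID
        && strcmp(out_name, in_name) == 0;
}
```

Under this sketch, “a.i” connects to “a.i” but not to “a.d” or to the plain reference field “a”, which mirrors the absence of automatic conversion.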
We introduce the following implementation types, listed in the table of types further below (entries marked “new”):
From this point a box can use the scalar types in its interface, for example:
Using the “external bind” interface:

    // sig: (a.i,b.i) -> (c.i)
    int add(dispatch_t *cb,
            intval_t a, intval_t b)
    {
        return svp_out(cb, a + b);
    }

    // sig: (x.d) -> (a.d,b.i)
    int box_frexp(dispatch_t *cb, double x)
    {
        double mantissa;
        int exponent;
        mantissa = frexp(x, &exponent);
        // NB: cast to intval_t necessary
        return svp_out(cb, mantissa,
                       (intval_t)exponent);
    }

Using the “self bind” interface:

    // sig: (a.i,b.i) -> (c.i)
    int add(dispatch_t *cb)
    {
        intval_t a, b;
        svp_bind(cb, &a, &b);
        return svp_out(cb, a + b);
    }

    // sig: (x.d) -> (a.d,b.i)
    int box_frexp(dispatch_t *cb)
    {
        double x;
        svp_bind(cb, &x);
        double mantissa;
        int exponent;
        mantissa = frexp(x, &exponent);
        // NB: cast to intval_t necessary
        return svp_out(cb, mantissa,
                       (intval_t)exponent);
    }
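The cast flagged by the “NB” comment matters if svp_out receives its scalars through C variadic arguments, which is an assumption here since this note does not show its prototype. Variadic arguments only undergo the default argument promotions, so an int is passed as an int, and reading it back as a wider intval_t is undefined behavior. The sketch below illustrates this with a hypothetical variadic function `sum_intvals` standing in for svp_out; `intval_t` is assumed to be `int64_t`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Assumption: a 64-bit stand-in for the runtime's intval_t. */
typedef int64_t intval_t;

/* Hypothetical variadic consumer: reads `count` arguments, each as intval_t. */
intval_t sum_intvals(int count, ...)
{
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    intval_t total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, intval_t);  /* caller must have passed intval_t */
    va_end(ap);
    return total;
}

/* Forward a plain int the way box_frexp forwards its exponent. */
intval_t forward_exponent(int exponent)
{
    /* Without the cast, an int would be pushed and the va_arg above
       would read it with the wrong type. */
    return sum_intvals(1, (intval_t)exponent);
}
```

The double-typed mantissa needs no cast because it already has the type the callee reads back.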
Each scalar value occupies a slot in a record. All slots are equally sized. Therefore, a slot must be wide enough to accommodate every scalar type as well as field references.
With the proposed design, a slot is large enough to hold any of fieldref_t, double and intval_t. The latter is in turn guaranteed to be wide enough for 64 bits or a pointer. Therefore, any smaller integer type is implicitly supported, which includes most integer types in contemporary systems.
Introducing support for larger integers or floats (e.g. __int128 or long double) would in turn require growing the slot size, which increases the storage used by all records. We might consider this if the need arises, but for now the corresponding storage overhead was not deemed justified.
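The slot sizing argument can be sketched as a C union with compile-time checks (C11). The definitions below are assumptions for illustration only: `fieldref_t` is modelled as a plain pointer, `intval_t` is read as “at least as wide as the larger of 64 bits and a pointer”, and `slot_t` and `fits_in_slot` are hypothetical names, not the runtime's actual layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins; the real definitions belong to the runtime. */
typedef void *fieldref_t;
#if INTPTR_MAX > INT64_MAX
typedef intptr_t intval_t;  /* pointers wider than 64 bits */
#else
typedef int64_t intval_t;   /* 64 bits, which also covers the pointer width */
#endif

/* One record slot: all members share the same storage, so the slot is
   as wide as its widest member. */
typedef union {
    fieldref_t ref;
    intval_t   i;
    float      f;
    double     d;
} slot_t;

/* Compile-time checks of the guarantees stated in the text. */
static_assert(sizeof(intval_t) * 8 >= 64, "intval_t holds 64 bits");
static_assert(sizeof(intval_t) >= sizeof(void *), "intval_t holds a pointer");
static_assert(sizeof(slot_t) >= sizeof(double), "slot holds a double");
static_assert(sizeof(slot_t) >= sizeof(fieldref_t), "slot holds a reference");

/* Report whether a value of the given size could be stored in one slot. */
int fits_in_slot(size_t size)
{
    return size <= sizeof(slot_t);
}
```

On many platforms `fits_in_slot(sizeof(long double))` is false, which is the case where supporting the type would force every slot, and hence every record, to grow.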
Arrays are already supported by the default data language of the EMA object interface for regular object fields.
For example:

    // sig: (<sz>, v.f) -> (a)
    // function: produce an array of sz floats with value v
    int box(dispatch_t *cb, tagval_t sz, float v)
    {
        // allocate the array
        fieldref_t f = svp_new(cb, FLOATS, sz);
        // get access to the float storage
        float *p;
        svp_access(cb, f, &p);
        // fill in the values
        for (int i = 0; i < sz; ++i)
            p[i] = v;
        // produce the output record
        return svp_out(cb, svp_demit(f));
    }

| Type | Description |
|---|---|
| tagval_t | Tag value: integer of non-guaranteed width. |
| fieldref_t | Object reference field, use the field manager to access. |
| intval_t | (new) Integer scalar field, passed by value; min size 64 bits or pointer. |
| float | (new) 32-bit (single-precision) FP scalar field, passed by value. |
| double | (new) 64-bit (double-precision) FP scalar field, passed by value. |
A discussion remains on how and where to specify the type annotations. The proposal above proposes that:
There are several areas for discussion:
- whether these concrete type annotations should be part of the snet syntax at all. Indeed, they do not participate in the snet semantics; they are useful only to an external observer of a running application who inspects the flow of data between components.
- whether to annotate at the point of box declaration, or whether to use a separate syntax, for example a new typemap syntax:

      box foo : (a) -> (b);
      typemap a -> .i;
      typemap b -> .d;

  This syntax form could be listed next to the box declaration, or in a separate file that would be loaded when instantiating the network.
- what the syntax should be. Here are the alternatives previously examined: