NN-512

Activation
FromTensor=from
ToTensor=to
Kind=ReLU
Param=0

Generate code to apply an elementwise activation function.

FromTensor= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. Must be a letter
followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

Kind= The kind of activation function to apply. ReLU computes F(X)=X when X
is positive and F(X)=X*C otherwise, where C is the constant negative slope
parameter given by Param.

Param= A parameter for the activation function. For ReLU this is the
negative slope parameter (0 gives standard ReLU, 0.1 gives a typical leaky
ReLU, -1 gives absolute value, 1 gives the identity function, etc.). Must be
a simple float: ^-?(0|[1-9][0-9]*)(\.[0-9]+)?$
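
As a point of reference, the scalar semantics can be sketched in C99. This
is illustrative only (the names here are invented); the generated code is
vectorized AVX-512 and organized very differently:

    /* Scalar ReLU reference: param is the negative slope
       (0 gives standard ReLU, 0.1 gives a typical leaky ReLU). */
    static float reluRef(float x, float param)
    {
        return x > 0.0f ? x : x * param;
    }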

----

Add
FromTensor1=from1
FromTensor2=from2
ToTensor=to

Generate code for the elementwise addition of two data tensors. FromTensor1,
FromTensor2, and ToTensor are all structurally identical (same number of
channels, same height, same width).

FromTensor1= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

FromTensor2= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. Must be a letter
followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$
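
A minimal scalar sketch of the semantics, assuming fully packed tensors of
count = channels*height*width floats (illustrative names, not the generated
API):

    #include <stddef.h>

    /* Elementwise addition over fully packed CHW tensors. */
    static void addRef(float *to, const float *from1,
                       const float *from2, size_t count)
    {
        for (size_t i = 0; i < count; i++) {
            to[i] = from1[i] + from2[i];
        }
    }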

----

BatchNorm
FromTensor=from
ToTensor=to
Epsilon=0.001

Generate code to apply batch normalization with per-channel mean, variance,
scale, and shift parameters. Let X be an element of FromTensor and let Y be the
corresponding element of ToTensor that will be computed. X and Y are at the same
CHW coordinate in their respective tensors and the channel part of that
coordinate selects a mean M, a variance V, a scale S, and a shift H. Then
Y=S*(X-M)/SQRT(V+E)+H where E is the constant epsilon parameter (to avoid
division by zero).

FromTensor= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. The user passes the
mean, variance, scale, and shift parameter tensors into the generated
initialization code through struct fields that have this same name but with
"Means", "Variances", "Scales", and "Shifts" appended (each of these
parameter tensors is an array of 32-bit floats, one float per data tensor
channel). Must be a letter followed by zero or more letters/digits:
^[a-zA-Z][a-zA-Z0-9]*$

Epsilon= A small positive number added to the variance to avoid division by
zero. Should match the value that was used for this purpose during training.
Must be a simple float: ^-?(0|[1-9][0-9]*)(\.[0-9]+)?$
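
A scalar sketch of the formula above, assuming per-channel parameter arrays
as described (illustrative names, not the generated API):

    #include <math.h>
    #include <stddef.h>

    /* Y = S*(X-M)/SQRT(V+E)+H for one element at channel c. */
    static float batchNormRef(float x, size_t c, float epsilon,
                              const float *means, const float *variances,
                              const float *scales, const float *shifts)
    {
        return scales[c] * (x - means[c])
               / sqrtf(variances[c] + epsilon) + shifts[c];
    }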

----

Concat
FromTensor1=from1
FromTensor2=from2
ToTensor=to

Generate code to concatenate two tensors along the channel dimension.
FromTensor1 and FromTensor2 must have matching spatial extents (the same height
H and the same width W). If FromTensor1 has C1 channels and FromTensor2 has C2
channels then ToTensor has C1+C2 channels, height H, and width W. The feature
maps of FromTensor1 go first (they are assigned channel numbers starting with
zero) and the feature maps of FromTensor2 go next (they are assigned channel
numbers starting with C1).

FromTensor1= Read from a pre-existing data tensor with this name. The
feature maps of this tensor get the low channel numbers in ToTensor. Must be
a letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

FromTensor2= Read from a pre-existing data tensor with this name. The
feature maps of this tensor get the high channel numbers in ToTensor. Must
be a letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. Must be a letter
followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$
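
Because the tensors are fully packed in CHW order, concatenation along the
channel dimension amounts to two contiguous copies. A sketch, assuming C1
and C2 channels of H*W floats each (illustrative names, not the generated
API):

    #include <stddef.h>
    #include <string.h>

    /* from1's feature maps become channels 0..c1-1 of to;
       from2's feature maps become channels c1..c1+c2-1. */
    static void concatRef(float *to, const float *from1, size_t c1,
                          const float *from2, size_t c2,
                          size_t h, size_t w)
    {
        memcpy(to, from1, c1 * h * w * sizeof(float));
        memcpy(to + c1 * h * w, from2, c2 * h * w * sizeof(float));
    }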

----

Config
Prefix=NN512
Platform=AVX512Float32
L1DataCachePerThread=32KiB
L2CachePerThreadExL1=960KiB
L3CachePerThreadExL1L2=1408KiB

Settings for the code generator.

Prefix= A string used for filenames, function names, etc. Must be a letter
followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

Platform= The kind of C99 code to generate. AVX512Float32 denotes x86-64
AVX-512 Foundation (AVX512F) and 32-bit floating point.

L1DataCachePerThread= Size in bytes of each L1D cache divided by the number
of threads that share each L1D cache. A positive integer with an optional
suffix like k, K, KB, KiB, m, M, MB, MiB. The K suffixes multiply by 1024.
The M suffixes multiply by the square of 1024. After conversion to
lowercase: ^([1-9][0-9]*)([km](i?b)?)?$

L2CachePerThreadExL1= Size in bytes of each L2 cache divided by the number
of threads that share each L2 cache. This size must exclude the L1 overlap
if L2 is inclusive. A positive integer with an optional suffix like k, K,
KB, KiB, m, M, MB, MiB. The K suffixes multiply by 1024. The M suffixes
multiply by the square of 1024. After conversion to lowercase:
^([1-9][0-9]*)([km](i?b)?)?$

L3CachePerThreadExL1L2= Size in bytes of the L3 cache divided by the number
of threads that share the L3 cache. This size must exclude the L1/L2 overlap
if L3 is inclusive. A positive integer with an optional suffix like k, K,
KB, KiB, m, M, MB, MiB. The K suffixes multiply by 1024. The M suffixes
multiply by the square of 1024. After conversion to lowercase:
^([1-9][0-9]*)([km](i?b)?)?$
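
For example, the settings above resolve to 32*1024 = 32768 bytes of L1D
cache per thread, 960*1024 = 983040 bytes of L2 cache per thread, and
1408*1024 = 1441792 bytes of L3 cache per thread. 32KiB, 32K, 32k, and
32768 are equivalent spellings of the same size.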

----

Conv
FromTensor=from
ToTensor=to
ToChannels=64
FilterH=3
FilterW=3
StrideH=1
StrideW=1
PaddingH=1
PaddingW=1
DilationH=1
DilationW=1
Groups=1

Generate code to perform cross-correlation. Suppose FromTensor has C channels,
height H, and width W. ToTensor has K (= ToChannels) channels. A formula for the
height of ToTensor is ((H+2*PaddingH)-(1+(FilterH-1)*DilationH))/StrideH+1 in
which the division truncates toward zero and the dividend must not be negative.
The width of ToTensor is calculated analogously. There are K filters in the
weight parameter tensor and each of them has C/Groups channels, a height of
FilterH, and a width of FilterW. The weight parameter tensor is in KCHW format,
32-bit floating point, fully packed (filter number is the outermost/slowest
dimension and otherwise the layout is just like an input data tensor). The bias
parameter tensor is an array of K 32-bit floats (one float for each filter),
fully packed.
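
For example, the defaults above (FilterH=3, StrideH=1, PaddingH=1,
DilationH=1) preserve the spatial extent: with H=56 the height of ToTensor
is ((56+2*1)-(1+(3-1)*1))/1+1 = (58-3)/1+1 = 56. The same calculation can
be written as a small C99 helper (illustrative only):

    /* Output extent along one axis; integer division truncates toward
       zero and the dividend must not be negative. */
    static int convOutExtent(int in, int pad, int filter,
                             int stride, int dilation)
    {
        return ((in + 2 * pad) - (1 + (filter - 1) * dilation))
               / stride + 1;
    }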

FromTensor= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. The user passes the
weight parameter tensor into the generated initialization code through a
struct field that has this same name but with "Weights" appended. Similarly
the bias parameter tensor ("Biases" is appended). Must be a letter followed
by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToChannels= The number of feature maps in ToTensor. This is also the number
of filters in the weight parameter tensor (the K in KCHW) and the number of
biases in the bias parameter tensor. Must be a positive integer:
^[1-9][0-9]*$

FilterH= The undilated spatial height of each filter in the weight parameter
tensor (the H in KCHW). Must be a positive integer: ^[1-9][0-9]*$

FilterW= The undilated spatial width of each filter in the weight parameter
tensor (the W in KCHW). Must be a positive integer: ^[1-9][0-9]*$

StrideH= The heightwise step between adjacent rows of the filtering position
grid (the heightwise subsampling ratio). Must be a positive integer:
^[1-9][0-9]*$

StrideW= The widthwise step between adjacent columns of the filtering
position grid (the widthwise subsampling ratio). Must be a positive integer:
^[1-9][0-9]*$

PaddingH= Implicit heightwise padding of FromTensor. This is the number of
all-zero rows to implicitly concatenate at the top of each feature map,
before the first explicit row. The same number of all-zero rows is
implicitly concatenated at the bottom of each feature map, after the last
explicit row. Must be a non-negative integer: ^(0|[1-9][0-9]*)$

PaddingW= Implicit widthwise padding of FromTensor. This is the number of
all-zero columns to implicitly concatenate on the left side of each feature
map, before the first explicit column. The same number of all-zero columns
is implicitly concatenated on the right side of each feature map, after the
last explicit column. Must be a non-negative integer: ^(0|[1-9][0-9]*)$

DilationH= The heightwise filter dilation factor. 1 means no dilation
(ordinary cross-correlation). 2 means the filter is multiplied against
FromTensor in a spatially sparse (spread out) way just as if one all-zero
row had been inserted between each pair of adjacent rows in the filter. 3 is
like if two all-zero rows had been inserted. And so on. Must be a positive
integer: ^[1-9][0-9]*$

DilationW= The widthwise filter dilation factor. 1 means no dilation
(ordinary cross-correlation). 2 means the filter is multiplied against
FromTensor in a spatially sparse (spread out) way just as if one all-zero
column had been inserted between each pair of adjacent columns in the
filter. 3 is like if two all-zero columns had been inserted. And so on. Must
be a positive integer: ^[1-9][0-9]*$

Groups= The number of disjoint cross-correlation operations to perform (no
shared data, no shared filters). Suppose FromTensor has C channels and
ToTensor has K channels (ToChannels is K). Let G be the number of groups
(both C and K must be divisible by G). Then there are K filters in the
weight parameter tensor and each of them has C/G channels. The first
operation applies the first K/G filters to the first C/G FromTensor channels
to produce the first K/G ToTensor channels. The second operation applies the
second K/G filters to the second C/G FromTensor channels to produce the
second K/G ToTensor channels. And so on. Must be a positive integer:
^[1-9][0-9]*$
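
For example, a depthwise convolution (one filter channel per data channel)
sets Groups equal to the channel count. In this hypothetical stanza the
tensor names are invented for illustration and the data tensor has 64
channels:

    Conv
    FromTensor=dw
    ToTensor=dwOut
    ToChannels=64
    FilterH=3
    FilterW=3
    StrideH=1
    StrideW=1
    PaddingH=1
    PaddingW=1
    DilationH=1
    DilationW=1
    Groups=64

Here C=K=G=64, so each of the 64 filters has C/G = 1 channel and the Nth
filter is applied to the Nth FromTensor channel alone to produce the Nth
ToTensor channel.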

----

FullyConnected
FromTensor=from
ToTensor=to
ToChannels=1000

Generate code to implement a fully connected layer. Suppose FromTensor has C
channels, height H, and width W. The weight parameter tensor consists of K
filters (where K is the ToChannels parameter) and each filter is structurally
identical to FromTensor (C channels, height H, width W). The weight parameter
tensor is in KCHW format, 32-bit floating point, fully packed (filter number is
the outermost/slowest dimension; the rest is like an input data tensor). Each
filter element is multiplied by the FromTensor element that has the same CHW
coordinate. The bias parameter tensor is an array of K 32-bit floats (one float
for each filter), fully packed. ToTensor has K channels, height 1, and width 1.

FromTensor= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. The user passes the
weight parameter tensor into the generated initialization code through a
struct field that has this same name but with "Weights" appended. Similarly
the bias parameter tensor ("Biases" is appended). Must be a letter followed
by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToChannels= The number of feature maps in ToTensor (each feature map has
height 1 and width 1). This is also the number of filters in the weight
parameter tensor (the K in KCHW) and the number of biases in the bias
parameter tensor. Must be a positive integer: ^[1-9][0-9]*$
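
A scalar sketch of the semantics, assuming fully packed tensors with
n = C*H*W elements per filter (illustrative names, not the generated API):

    #include <stddef.h>

    /* to receives k floats: one dot product plus bias per filter.
       weights holds k filters of n floats each, in KCHW order. */
    static void fullyConnectedRef(float *to, const float *from,
                                  const float *weights,
                                  const float *biases,
                                  size_t k, size_t n)
    {
        for (size_t i = 0; i < k; i++) {
            float sum = biases[i];
            for (size_t j = 0; j < n; j++) {
                sum += weights[i * n + j] * from[j];
            }
            to[i] = sum;
        }
    }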

----

Input
ToTensor=image
Channels=3
Height=224
Width=224

Declare an input data tensor parameter for the generated inference function.
Input data must be in CHW format, 32-bit floating point, fully packed. The
inference code reads the input tensor memory but never writes to it.

ToTensor= A name for this input data tensor. The corresponding inference
function parameter in the generated code has the same name but with "Data"
appended. Must be a letter followed by zero or more letters/digits:
^[a-zA-Z][a-zA-Z0-9]*$

Channels= The number of feature maps for this input data tensor. This is the
C in CHW (the outermost/slowest dimension) and has a stride of
H*W*sizeof(float) bytes. Must be a positive integer: ^[1-9][0-9]*$

Height= The spatial height dimension of this input data tensor. For an image
tensor the height is usually the number of pixel rows. This is the H in CHW
(the outermost/slowest spatial dimension) and has a stride of
W*sizeof(float) bytes. Must be a positive integer: ^[1-9][0-9]*$

Width= The spatial width dimension of this input data tensor. For an image
tensor the width is usually the number of pixels per row. This is the W in
CHW (the innermost/fastest dimension) and has a stride of sizeof(float)
bytes. Must be a positive integer: ^[1-9][0-9]*$
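
Under the stanza above the corresponding inference function parameter is
named imageData, and the strides just described put the element at CHW
coordinate (c, h, w) at the following offset (a sketch, not the generated
API):

    #include <stddef.h>

    /* Element offset within a fully packed CHW tensor. */
    static size_t chwOffset(size_t c, size_t h, size_t w,
                            size_t height, size_t width)
    {
        return c * height * width + h * width + w;
    }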

----

Output
FromTensor=prob

Declare an output data tensor parameter for the generated inference function.
The user allocates memory for the output tensor and passes a pointer to it
into the inference function, which writes the output data in CHW format,
32-bit floating point, fully packed.

FromTensor= The name of a data tensor that will be written back to the user
as output. Must not be the name of an input tensor and must not be the same
as another output. The corresponding inference function parameter in the
generated code has a matching name but with "Data" appended. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

----

Pooling
FromTensor=from
ToTensor=to
Kind=Max2x2Stride2
PaddingH=0
PaddingW=0

Generate code to apply a standard window pooling or global pooling operation.
Padding affects window placement but padding values never participate in max/avg
calculations. Therefore the padding must be small enough that every window will
contain at least one non-padding value. Each (H+2*PaddingH)x(W+2*PaddingW)
feature map in FromTensor yields a corresponding feature map in ToTensor. For
RxR window pooling with a stride of S the height of every feature map in
ToTensor is ((H+2*PaddingH)-R)/S+1 where the division by S truncates toward
zero; the dividend must not be negative. The width formula is analogous. For
global pooling there is no padding and every feature map in ToTensor is 1x1.

FromTensor= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. Must be a letter
followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

Kind= The kind of pooling operation to apply. Max2x2Stride2 and
Avg2x2Stride2 produce a single value for each 2x2 window and there is no
overlap between adjacent windows. Max3x3Stride2 and Avg3x3Stride2 produce a
single value for each 3x3 window and adjacent windows overlap. MaxGlobal and
AvgGlobal produce a single value for each feature map.

PaddingH= Implicit heightwise padding of FromTensor. This is the number of
all-zero rows to implicitly concatenate at the top of each feature map,
before the first explicit row. The same number of all-zero rows is
implicitly concatenated at the bottom of each feature map, after the last
explicit row. Must be a non-negative integer: ^(0|[1-9][0-9]*)$

PaddingW= Implicit widthwise padding of FromTensor. This is the number of
all-zero columns to implicitly concatenate on the left side of each feature
map, before the first explicit column. The same number of all-zero columns
is implicitly concatenated on the right side of each feature map, after the
last explicit column. Must be a non-negative integer: ^(0|[1-9][0-9]*)$
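
For example, Max2x2Stride2 with zero padding maps a 224x224 feature map to
((224+2*0)-2)/2+1 = 112 rows and columns, and Max3x3Stride2 with
PaddingH=PaddingW=1 maps a 112x112 feature map to ((112+2*1)-3)/2+1 = 56
(111/2 truncates to 55).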

----

Softmax
FromTensor=from
ToTensor=to

Generate code to compute softmax along the channel dimension independently for
each spatial (height, width) location. FromTensor and ToTensor have the same
number of channels, the same height, and the same width.

FromTensor= Read from a pre-existing data tensor with this name. Must be a
letter followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$

ToTensor= Write to a new data tensor with this name. Must be a letter
followed by zero or more letters/digits: ^[a-zA-Z][a-zA-Z0-9]*$
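
A scalar sketch of softmax at one (h, w) location of a fully packed CHW
tensor. The max subtraction is the standard guard against overflow in expf;
it is assumed here, not quoted from NN-512 (illustrative names, not the
generated API):

    #include <math.h>
    #include <stddef.h>

    static void softmaxRef(float *to, const float *from, size_t channels,
                           size_t height, size_t width, size_t h, size_t w)
    {
        size_t stride = height * width; /* channel stride in floats */
        size_t at = h * width + w;      /* offset of this location */
        float max = from[at];
        for (size_t c = 1; c < channels; c++) {
            float x = from[c * stride + at];
            if (x > max) max = x;
        }
        float sum = 0.0f;
        for (size_t c = 0; c < channels; c++) {
            float e = expf(from[c * stride + at] - max);
            to[c * stride + at] = e;
            sum += e;
        }
        for (size_t c = 0; c < channels; c++) {
            to[c * stride + at] /= sum;
        }
    }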
