haste_tf.IndRNN

Class IndRNN

Independently Recurrent Neural Network layer.

This layer offers a fused, GPU-accelerated TensorFlow op for inference and training. It also supports Zoneout regularization.
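To make the recurrence concrete, here is a minimal pure-Python sketch of one IndRNN layer processing a single sequence. It assumes the usual IndRNN update, h_t = act(x_t · W + u * h_{t-1} + b), where the recurrent weight u is a vector applied element-wise rather than a full matrix. The function name `indrnn_forward`, the `[C, H]` kernel layout, and the `tanh` default are assumptions for illustration. This is not the fused GPU op, whose internals this page does not describe.

```python
import math


def indrnn_forward(xs, kernel, recurrent_scale, bias, h0=None, activation=math.tanh):
    """Reference IndRNN recurrence for one sequence (illustrative only).

    xs:              list of T input vectors, each of length C
    kernel:          C x H nested list (input matrix weights; layout is assumed)
    recurrent_scale: list of H per-unit recurrent weights
    bias:            list of H biases
    activation:      scalar function; tanh is an assumption, not confirmed here
    """
    num_units = len(bias)
    h = list(h0) if h0 is not None else [0.0] * num_units
    outputs = []
    for x in xs:
        # Each unit j only sees its own previous state h[j]: the recurrent
        # connection is an element-wise scale, not a matrix product.
        h = [
            activation(
                sum(x[c] * kernel[c][j] for c in range(len(x)))
                + recurrent_scale[j] * h[j]
                + bias[j]
            )
            for j in range(num_units)
        ]
        outputs.append(h)
    return outputs, h
```

Because every unit recurs only on itself, the per-step cost of the recurrent term is linear in `num_units`, which is part of what makes a fused kernel attractive.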

__init__(
    num_units,
    direction='unidirectional',
    **kwargs
)

Initialize the parameters of the IndRNN layer.

Arguments:

  • num_units: int, the number of units in the IndRNN cell.
  • direction: string, 'unidirectional' or 'bidirectional'.
  • **kwargs: Dict, keyword arguments (see below).

Keyword Arguments:

  • kernel_initializer: (optional) the initializer to use for the input matrix weights. Defaults to glorot_uniform.
  • recurrent_initializer: (optional) the initializer to use for the recurrent scale weights. Defaults to uniform random in [-0.5, 0.5]. Note that this initialization scheme differs from the one in the original authors' implementation. See #7 for details.
  • bias_initializer: (optional) the initializer to use for the bias vector. Defaults to zeros.
  • kernel_transform: (optional) a function with signature (kernel: Tensor) -> Tensor that transforms the kernel before it is used. Defaults to the identity function.
  • recurrent_transform: (optional) a function with signature (recurrent_scale: Tensor) -> Tensor that transforms the recurrent scale vector before it is used. Defaults to the identity function.
  • bias_transform: (optional) a function with signature (bias: Tensor) -> Tensor that transforms the bias before it is used. Defaults to the identity function.
  • zoneout: (optional) float, sets the zoneout rate for Zoneout regularization. Defaults to 0.
  • dtype: (optional) the data type for this layer. Defaults to tf.float32.
  • name: (optional) string, the name for this layer.
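Two of these options are easier to see in code. The sketch below uses plain Python lists as stand-ins for tensors. `clip_recurrent_scale` shows the kind of function one might pass as `recurrent_transform` (with real tensors, something like `tf.clip_by_value` would play this role), and `zoneout_step` shows the general Zoneout technique: during training each unit keeps its previous value with probability `rate`, and at inference the two states are blended by their expected weights. Both function names are hypothetical, and the fused op may implement Zoneout differently in detail.

```python
import random


def clip_recurrent_scale(recurrent_scale, limit=1.0):
    """Hypothetical recurrent_transform: bound each per-unit weight to [-limit, limit]."""
    return [max(-limit, min(limit, u)) for u in recurrent_scale]


def zoneout_step(h_prev, h_new, rate, training, rng=random):
    """Sketch of Zoneout applied to one state update (not the library's kernel).

    training=True:  each unit keeps its previous value with probability `rate`.
    training=False: deterministic blend using the expected keep probability.
    """
    if training:
        return [p if rng.random() < rate else n for p, n in zip(h_prev, h_new)]
    return [rate * p + (1.0 - rate) * n for p, n in zip(h_prev, h_new)]
```

With `rate=0` the update reduces to the plain recurrence, which matches the documented default of `zoneout=0`.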

Properties

bidirectional

True if this is a bidirectional RNN, False otherwise.

name

Returns the name of this module as passed to, or determined in, the constructor.

NOTE: This is not the same as self.name_scope.name, which includes parent module names.

name_scope

Returns a tf.name_scope instance for this class.

output_size

state_size

submodules

Sequence of all sub-modules.

Submodules are modules which are properties of this module, or found as properties of modules which are properties of this module (and so on).

a = tf.Module()
b = tf.Module()
c = tf.Module()
a.b = b
b.c = c
assert list(a.submodules) == [b, c]
assert list(b.submodules) == [c]
assert list(c.submodules) == []

Returns:

A sequence of all submodules.

trainable_variables

Sequence of variables owned by this module and its submodules.

Note: this method uses reflection to find variables on the current instance and submodules. For performance reasons you may wish to cache the result of calling this method if you don't expect the return value to change.

Returns:

A sequence of variables for the current module (sorted by attribute name) followed by variables from all submodules recursively (breadth first).

variables

Sequence of variables owned by this module and its submodules.

Note: this method uses reflection to find variables on the current instance and submodules. For performance reasons you may wish to cache the result of calling this method if you don't expect the return value to change.

Returns:

A sequence of variables for the current module (sorted by attribute name) followed by variables from all submodules recursively (breadth first).

Methods

__call__(
    inputs,
    training,
    sequence_length=None,
    time_major=False
)

Runs the RNN layer.

Arguments:

  • inputs: Tensor, a rank 3 input tensor with shape [N,T,C] if time_major is False, or with shape [T,N,C] if time_major is True.
  • training: bool, True if running in training mode, False if running in inference mode.
  • sequence_length: (optional) Tensor, a rank 1 tensor with shape [N] and dtype of tf.int32 or tf.int64. This tensor specifies the unpadded length of each example in the input minibatch.
  • time_major: (optional) bool, specifies whether input has shape [N,T,C] (time_major=False) or shape [T,N,C] (time_major=True).

Returns:

A pair, (output, state) for unidirectional layers, or a pair ([output_fw, output_bw], [state_fw, state_bw]) for bidirectional layers.

build(shape)

Creates the variables of the layer.

Calling this method is optional for users of this class. It is called internally with the correct shape when __call__ is invoked.

Arguments:

  • shape: instance of TensorShape.

@classmethod
with_name_scope(
    cls,
    method
)

Decorator to automatically enter the module name scope.

class MyModule(tf.Module):
  @tf.Module.with_name_scope
  def __call__(self, x):
    if not hasattr(self, 'w'):
      self.w = tf.Variable(tf.random.normal([x.shape[1], 64]))
    return tf.matmul(x, self.w)

Using the above module would produce tf.Variables and tf.Tensors whose names included the module name:

mod = MyModule()
mod(tf.ones([8, 32]))
# ==> <tf.Tensor: ...>
mod.w
# ==> <tf.Variable ...'my_module/w:0'>

Args:

  • method: The method to wrap.

Returns:

The original method wrapped such that it enters the module's name scope.