hidet.graph

Classes:

Operator(inputs, attributes, task)

An operator that takes tensors as input and produces tensors as output.

FlowGraph(outputs[, inputs, nodes])

The computation graph representation.

PassContext()

Graph-level pass context.

GraphPassInstrument()

Graph pass instrument.

Functions:

asarray(obj, /, *[, dtype, device])

Convert a list, tuple, or numpy ndarray to a hidet tensor.

randn(shape[, dtype, mean, stddev, device])

Create a tensor with normally distributed values.

empty(shape[, dtype, device, layout])

Create an uninitialized tensor.

zeros(shape[, dtype, device])

Create a tensor initialized with zero.

ones(shape[, dtype, device])

Create a tensor initialized with one.

symbol(shape[, dtype, device, layout])

Create a symbolic tensor.

randn_like(data[, mean, stddev, shape, ...])

Create a randomly initialized tensor with the same shape, dtype, and device as the given tensor.

empty_like(data[, shape, dtype, device, layout])

Create an uninitialized tensor with the same shape, dtype, and device as the given tensor.

zeros_like(data[, shape, dtype, device])

Create a tensor initialized with zero with the same shape, dtype, and device as the given tensor.

ones_like(data[, shape, dtype, device])

Create a tensor initialized with one with the same shape, dtype, and device as the given tensor.

symbol_like(data[, shape, dtype, device, layout])

Create a symbol tensor like an existing tensor.

full(shape, fill_value[, dtype, device])

Create a tensor initialized with the given constant.

full_like(data, fill_value[, shape, dtype, ...])

Create a tensor initialized with fill_value with the same shape, dtype, and device as the given tensor.

from_numpy(nparray)

Create a tensor from a numpy array, sharing the memory with the numpy array when possible.

from_dlpack(dltensor)

Create a hidet tensor from an object that implements the __dlpack__ protocol.

from_torch(torch_tensor)

Create a hidet tensor from a PyTorch tensor.

trace_from(tensor[, inputs])

Trace the flow graph given the output tensor(s).

optimize(graph)

Optimize a flow graph.

class hidet.graph.Operator(inputs, attributes, task)[source]

An operator that takes tensors as input and produces tensors as output.

class hidet.graph.FlowGraph(outputs, inputs=None, nodes=None)[source]

The computation graph representation.

Methods:

__call__(*inputs)

Run the computation graph.

forward(*inputs)

Run the computation graph.

save(model_file)

Save the flow graph to a file.

load(model_file)

Load a flow graph from a file.

cuda_graph()

Create a CudaGraph from FlowGraph.

latency([warmup, number, repeat, median, ...])

Measure the latency of the flow graph.

Attributes:

nodes

The list of operators in the computation graph.

usage_count

The usage count of each tensor in the computation graph.

Parameters
  • outputs (Sequence[Tensor]) –

  • inputs (Optional[Sequence[Tensor]]) –

__call__(*inputs)[source]

Run the computation graph. See also FlowGraph.forward().

Parameters

inputs (hidet.Tensor) –

Return type

Union[List[hidet.Tensor], hidet.Tensor]

property nodes: List[hidet.graph.operator.Operator]

The list of operators in the computation graph.

property usage_count: Dict[hidet.Tensor, int]

The usage count of each tensor in the computation graph.

forward(*inputs)[source]

Run the computation graph.

Parameters
  • *inputs (hidet.Tensor) – The input tensors. They should be consistent with the symbolic inputs of the computation graph.

Returns

output – If there is only one output, it is returned directly. Otherwise, a list of output tensors is returned.

Return type

Union[List[Tensor], Tensor]

save(model_file)[source]

Save the flow graph to a file.

Parameters

model_file (str) – The model file to store the flow graph.

static load(model_file)[source]

Load a flow graph from a file.

Parameters

model_file (str) – The path to the flow graph.

Returns

ret – The loaded flow graph.

Return type

FlowGraph

cuda_graph()[source]

Create a CudaGraph from FlowGraph.

Returns

ret – The created cuda graph.

Return type

hidet.cuda.graph.CudaGraph

latency(warmup=1, number=3, repeat=3, median=True, dummy_inputs=None)[source]

Measure the latency of the flow graph.

Parameters
  • warmup (int) – The number of warmup runs.

  • number (int) – The number of runs to measure the latency.

  • repeat (int) – The number of times to repeat the measurement.

  • median (bool) – Whether to return the median latency.

  • dummy_inputs (Optional[Sequence[Tensor]]) – The dummy inputs used to run the flow graph. If not given, automatically generated dummy inputs are used.

Returns

ret – The measured latency in milliseconds.

Return type

Union[float, List[float]]
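
The interplay of warmup, number, repeat, and median can be pictured with a plain-Python timing loop. This is a minimal sketch under assumptions, not hidet's implementation: the helper name measure_latency is hypothetical, and it assumes that each of the repeat measurements is the mean time of number runs, which the description above does not spell out.

```python
import statistics
import time
from typing import Callable, List, Union


def measure_latency(
    fn: Callable[[], object],
    warmup: int = 1,
    number: int = 3,
    repeat: int = 3,
    median: bool = True,
) -> Union[float, List[float]]:
    """Hypothetical helper: time fn and return latency in milliseconds."""
    # Warmup runs are executed but not timed.
    for _ in range(warmup):
        fn()

    results: List[float] = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            fn()
        elapsed = time.perf_counter() - start
        # Assumption: each measurement is the mean time of `number` runs.
        results.append(elapsed * 1000.0 / number)

    # median=True collapses the measurements to one value; otherwise return all.
    return statistics.median(results) if median else results


calls = []
single = measure_latency(lambda: calls.append(1), warmup=2, number=4, repeat=5)
every = measure_latency(lambda: calls.append(1), median=False)
```

Under these assumptions, single is one float and every is a list with one entry per repeat, which matches the Union[float, List[float]] return type documented above.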

class hidet.graph.PassContext[source]

Graph-level pass context.

Use the pass context to control the behavior of optimization passes. Normally, we can optimize a flow graph by directly calling hidet.graph.optimize():

graph_opt = hidet.graph.optimize(graph)

This will optimize the given flow graph in a default context.

To customize the optimizations, run the optimize() function within a custom hidet.graph.PassContext:

with hidet.graph.PassContext() as ctx:
    # configure the context
    ctx.profile_pass_instrument(print_stdout=True)  # print elapsed time for each pass
    ctx.save_graph_instrument(out_dir='./outs')  # save the output of each pass as text
    ctx.set_precision(dtype='float16')  # use float16 as the data type
    ctx.set_reduce_precision(dtype='float32')  # use float32 for reduction accumulation
    ctx.set_mma('mma')  # use TensorCore in NVIDIA GPUs to accelerate matmul and conv2d
    ...   # other configs

    # call optimize function
    graph_opt = hidet.graph.optimize(graph)

Please refer to the member functions of this class for the available configs and their usage.

instruments

The graph pass instruments that will be applied before and after each pass. The instruments are applied in order. See hidet.graph.GraphPassInstrument for how to add a custom instrument.

Type

List[GraphPassInstrument]

configs

The current configs of the pass context.

Type

Dict[str, Any]

Methods:

current()

Get the current pass context.

set_precision([dtype])

Set the target precision to use as the output of most operators.

set_reduce_precision([dtype])

Set the target precision used for accumulation results.

set_use_attention([flag])

Set whether to use the fused attention schedule.

set_verbose()

Allow each graph-level pass to print detailed information about its lowering and optimization.

set_mma(mma)

Specify the matrix-multiply-accumulate (mma) computation primitives used in matrix multiplication and convolution.

set_parallel_k([disabled, default, search, ...])

Set the strategy for parallelizing along the reduction dimension of matrix multiplication and convolution.

save_graph_instrument(out_dir)

Save the computation graph after each pass to the given output directory.

profile_pass_instrument([log_file, print_stdout])

Profile the time of each pass.

classmethod current()[source]

Get the current pass context.

Returns

ret – The current pass context.

Return type

PassContext
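
One way to picture how current() relates to the with block is a stack of active contexts. The sketch below is an assumption for illustration only, not hidet's implementation: SketchPassContext, sketch_optimize, and the class-level stack are hypothetical names, and the fallback to a default context is inferred from the statement above that optimize() runs "in a default context" when no custom context is active.

```python
from typing import Any, Dict, List


class SketchPassContext:
    """Hypothetical stand-in for a pass context; not hidet's class."""

    # Assumption: a class-level stack tracks the active contexts.
    _stack: List["SketchPassContext"] = []

    def __init__(self) -> None:
        self.configs: Dict[str, Any] = {"precision": None}

    def __enter__(self) -> "SketchPassContext":
        SketchPassContext._stack.append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        SketchPassContext._stack.pop()

    @classmethod
    def current(cls) -> "SketchPassContext":
        # Fall back to a default context when none is active.
        if not cls._stack:
            cls._stack.append(cls())
        return cls._stack[-1]

    def set_precision(self, dtype=None) -> "SketchPassContext":
        self.configs["precision"] = dtype
        return self  # returning self allows chained calls


def sketch_optimize() -> Dict[str, Any]:
    # An optimizer would read its settings from the innermost active context.
    return dict(SketchPassContext.current().configs)


default_configs = sketch_optimize()
with SketchPassContext() as ctx:
    ctx.set_precision("float16")
    inner_configs = sketch_optimize()
outer_configs = sketch_optimize()
```

In this sketch the setting made inside the with block is visible only while the block is active; once it exits, the default configuration applies again.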

set_precision(dtype=None)[source]

Set the target precision to use as the output of most operators. To retain the accuracy, some operators will still use the original data type.

Parameters

dtype (Optional[str]) –

The target dtype to mix the precision of the model. Candidates:

  • None Do not mix the precision.

  • 'float16' Convert the model into float16 data type.

  • 'bfloat16' Convert the model into bfloat16 data type.

  • 'float32' Convert the model into float32 data type.

Return type

hidet.graph.transforms.base.PassContext

set_reduce_precision(dtype=None)[source]

Set the target precision used for accumulation results. Operators like reduce_mean, reduce_avg, matrix multiplication and convolution will reduce along some dimensions. We might want to use a data type with more precision to accumulate the results for more accuracy.

Parameters
dtype (Optional[str]) –

The target dtype to use for accumulation. Candidates:

  • None Use the same as inputs of operators.

  • 'float16' Use 'float16' to accumulate. Only valid when set_precision('float16') has been used.

  • 'float32' Use 'float32' to accumulate.

Return type

hidet.graph.transforms.base.PassContext
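
Why a wider accumulator helps can be seen without any GPU. The following is a small illustration using only the Python standard library, not hidet code: the helpers round_to and accumulate are hypothetical, and they emulate a float16 or float32 accumulator by rounding the running total after every addition.

```python
import struct


def round_to(fmt: str, value: float) -> float:
    """Round a Python float to the precision of a struct format.

    'e' is IEEE 754 half precision (float16); 'f' is single precision (float32).
    """
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def accumulate(values, fmt: str) -> float:
    # Round after every addition to mimic an accumulator of that precision.
    total = 0.0
    for v in values:
        total = round_to(fmt, total + v)
    return total


ones = [1.0] * 4096
fp16_sum = accumulate(ones, "e")  # stalls once the float16 spacing exceeds 1
fp32_sum = accumulate(ones, "f")
```

Here fp32_sum is 4096.0, while fp16_sum stops growing at 2048.0: above 2048 adjacent float16 values are 2 apart, so adding 1 rounds back to the same number. Long reductions in low precision can lose contributions in the same way.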

set_use_attention(flag=False)[source]

Set whether to use the fused attention schedule.

Return type

hidet.graph.transforms.base.PassContext

set_verbose()[source]

Allow each graph-level pass to print detailed information about its lowering and optimization.

Return type

hidet.graph.transforms.base.PassContext

set_mma(mma)[source]

Specify the matrix-multiply-accumulate (mma) computation primitives used in matrix multiplication and convolution.

Parameters

mma (str) –

The mma computation primitive to use. Candidates:

  • 'simt'

    Use cuda cores.

  • 'wmma'

    Use wmma instructions.

  • 'mma'

    Use mma instructions.

Return type

hidet.graph.transforms.base.PassContext

set_parallel_k(disabled=False, default=False, search=False, nparts=None)[source]

Set the strategy for parallelizing along the reduction dimension of matrix multiplication and convolution.

Only one of the parameters should be specified.

Parameters
  • disabled (bool) – Disable the parallelization on reduction dimension.

  • default (bool) – Allow hidet to figure out the parallel factor.

  • search (bool) – Whether to search for the parallel factor.

  • nparts (Optional[int]) – Use a fixed factor.
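
To make "parallelizing along the reduction dimension" concrete, the sketch below splits the k dimension of a matrix multiplication into nparts chunks, computes an independent partial product per chunk, and sums the partials. It is a pure-Python illustration of the general split-k idea, not hidet's schedule; the function names are hypothetical, and nparts here only mirrors the meaning of the parameter above.

```python
from typing import List

Matrix = List[List[float]]


def matmul_k_range(a: Matrix, b: Matrix, k_begin: int, k_end: int) -> Matrix:
    """Partial product that reduces only over k in [k_begin, k_end)."""
    m, n = len(a), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(k_begin, k_end)) for j in range(n)]
        for i in range(m)
    ]


def matmul_parallel_k(a: Matrix, b: Matrix, nparts: int) -> Matrix:
    """Split the reduction dimension into nparts chunks, then sum the partials."""
    k_size = len(b)
    chunk = (k_size + nparts - 1) // nparts  # ceiling division
    # Each partial product is independent, so the chunks could run in parallel.
    partials = [
        matmul_k_range(a, b, start, min(start + chunk, k_size))
        for start in range(0, k_size, chunk)
    ]
    m, n = len(a), len(b[0])
    return [[sum(p[i][j] for p in partials) for j in range(n)] for i in range(m)]


a = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
full = matmul_k_range(a, b, 0, 4)
split = matmul_parallel_k(a, b, nparts=2)
```

Both full and split evaluate to [[12.0, 5.0], [28.0, 13.0]]. Splitting pays off when the output is small and the reduction dimension is large, because the extra chunks give the hardware more independent work at the cost of a final summation.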

save_graph_instrument(out_dir)[source]

Save the computation graph after each pass to the given output directory.

Parameters

out_dir (str) – The directory to save graph.

Return type

hidet.graph.transforms.base.PassContext

profile_pass_instrument(log_file=None, print_stdout=False)[source]

Profile the time of each pass.

Parameters
  • log_file (Optional[str]) – When given, write the elapsed time for each pass to this file.

  • print_stdout (bool) – Whether to print the elapsed time for each pass to standard output.

Return type

hidet.graph.transforms.base.PassContext

class hidet.graph.GraphPassInstrument[source]

Graph pass instrument.

This class defines the interface for graph pass instruments. An instrument defines the functions that will be called before and after each pass. This can be used to collect information about graph passes. Currently, an instrument cannot modify the flow graph passed to it (such functionality should be implemented as a graph pass).

To define a custom graph pass instrument and use it:

import hidet
from hidet.graph import FlowGraph  # needed for the type annotations below

# define custom instrument and implement instrument functions
class MyInstrument(hidet.graph.GraphPassInstrument):
    def before_all_passes(self, graph: FlowGraph) -> None:
        print('before all passes')

    def before_pass(self, pass_name: str, graph: FlowGraph) -> None:
        print('before pass', pass_name)

    def after_pass(self, pass_name: str, graph: FlowGraph) -> None:
        print('after pass', pass_name)

    def after_all_passes(self, graph: FlowGraph) -> None:
        print('after all passes')

graph = hidet.graph.FlowGraph(outputs=[])   # empty flow graph
with hidet.graph.PassContext() as ctx:
    # add custom instrument to pass context
    ctx.instruments.append(MyInstrument())
    # optimize flow graph
    hidet.graph.optimize(graph)

This produces output like:

before all passes
before pass FoldConstantPass
after pass FoldConstantPass
before pass PatternTransformPass
after pass PatternTransformPass
...
after all passes
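
The order of the hook calls in the output above can be reproduced with a small driver loop. This is a sketch under assumptions, not hidet's pass manager: SketchInstrument, run_passes, and the two pass names are hypothetical, and a list of numbers stands in for the flow graph.

```python
from typing import Callable, List, Tuple


class SketchInstrument:
    """Hypothetical instrument that records each hook call."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def before_all_passes(self, graph) -> None:
        self.events.append("before all passes")

    def before_pass(self, pass_name: str, graph) -> None:
        self.events.append(f"before pass {pass_name}")

    def after_pass(self, pass_name: str, graph) -> None:
        self.events.append(f"after pass {pass_name}")

    def after_all_passes(self, graph) -> None:
        self.events.append("after all passes")


def run_passes(graph, passes: List[Tuple[str, Callable]], instruments):
    """Apply each pass in order, invoking every instrument around it."""
    for inst in instruments:
        inst.before_all_passes(graph)
    for name, apply_pass in passes:
        for inst in instruments:
            inst.before_pass(name, graph)
        graph = apply_pass(graph)
        for inst in instruments:
            inst.after_pass(name, graph)
    for inst in instruments:
        inst.after_all_passes(graph)
    return graph


recorder = SketchInstrument()
# Two stand-in passes over a list of numbers; the pass names are made up.
passes = [("DoublePass", lambda g: [x * 2 for x in g]),
          ("DropZeroPass", lambda g: [x for x in g if x != 0])]
result = run_passes([0, 1, 2], passes, [recorder])
```

The instrument only observes the graph handed to each hook; the transformation itself happens in the passes, which is consistent with the restriction that instruments do not modify the flow graph.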

Methods:

before_all_passes(graph)

Called before all passes are applied.

before_pass(pass_name, graph)

Called before each pass.

after_pass(pass_name, graph)

Called after each pass.

after_all_passes(graph)

Called after applying all passes.

before_all_passes(graph)[source]

Called before all passes are applied.

Parameters

graph (FlowGraph) – The flow graph before applying all passes.

Return type

None

before_pass(pass_name, graph)[source]

Called before each pass.

Parameters
  • pass_name (str) – The name of the pass that is going to be applied.

  • graph (FlowGraph) – The flow graph before applying the pass.

Return type

None

after_pass(pass_name, graph)[source]

Called after each pass.

Parameters
  • pass_name (str) – The name of the pass that has been applied.

  • graph (FlowGraph) – The flow graph after applying the pass.

Return type

None

after_all_passes(graph)[source]

Called after applying all passes.

Parameters

graph (FlowGraph) – The flow graph after applying all passes.

Return type

None

hidet.graph.asarray(obj, /, *, dtype=None, device=None)[source]

Convert a list, tuple, or numpy ndarray to a hidet tensor.

Parameters
  • obj (Union[bool, int, float, List, Tuple, Tensor, np.ndarray]) – The object to be converted.

  • dtype (DataType, optional) – The data type of the output tensor.

  • device (Device or str) – The device of the output tensor.

Returns

ret – The hidet tensor converted from given object.

Return type

Tensor

hidet.graph.randn(shape, dtype='float32', mean=0.0, stddev=1.0, device='cpu')[source]

Create a tensor with normally distributed values.

Parameters
  • shape (Sequence[int]) – The shape of new tensor.

  • dtype (DataType or str, default 'float32') – The data type of element of the tensor.

  • mean (float, default 0.0) – The mean of the normal distribution.

  • stddev (float, default 1.0) – The standard deviation of the normal distribution.

  • device (Device or str, default 'cpu') – The device on which the new tensor is created.

Returns

ret – The created tensor.

Return type

Tensor

Examples

>>> randn([2, 3], device='cuda')
Tensor(shape=[2, 3], dtype='float32', device='cuda')
[[ 0.10720467 -1.6906018   0.06347568]
 [-0.37061226  0.562728    1.857547  ]]

hidet.graph.empty(shape, dtype='float32', device='cpu', layout=None)[source]

Create an uninitialized tensor.

Parameters
  • shape (Sequence[int]) – The shape of new tensor.

  • dtype (str or DataType) – The data type of element of the tensor.

  • device (Device or str, default 'cpu') – The device on which the new tensor is created.

  • layout (DataLayout, optional) – The layout of the new tensor. None indicates the default layout (row-major layout).

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.zeros(shape, dtype='float32', device='cpu')[source]

Create a tensor initialized with zero.

Parameters
  • shape (Sequence[int]) – The shape of new tensor.

  • dtype (str or DataType) – The data type of element of the tensor.

  • device (Device or str, default 'cpu') – The device on which the new tensor is created.

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.ones(shape, dtype='float32', device='cpu')[source]

Create a tensor initialized with one.

Parameters
  • shape (Sequence[int]) – The shape of new tensor.

  • dtype (DataType or str, default 'float32') – The data type of element of the tensor.

  • device (Device or str, default 'cpu') – The device on which the new tensor is created.

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.symbol(shape, dtype='float32', device='cpu', layout=None)[source]

Create a symbolic tensor.

Parameters
  • shape (Sequence[int]) – The shape of new tensor.

  • dtype (str) – The data type of element of the tensor.

  • device (Device or str, default 'cpu') – The device on which the new tensor is created.

  • layout (DataLayout, optional) – The layout of the new tensor. None indicates the default layout (row-major layout).

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.randn_like(data, mean=0.0, stddev=1.0, shape=None, dtype=None, device=None)[source]

Create a randomly initialized tensor with the same shape, dtype, and device as the given tensor.

Parameters
  • data (Tensor) – The tensor to copy shape, dtype, and device from.

  • mean (float, optional) – The mean of the normal distribution.

  • stddev (float, optional) – The standard deviation of the normal distribution.

  • shape (Sequence[int], optional) – The shape of new tensor. If None, the shape of data is used.

  • dtype (DataType or str, optional) – The data type of element of the tensor. If None, the dtype of data is used.

  • device (Device or str, optional) – The device on which the new tensor is created. If None, the device of data is used.

Returns

ret – The created tensor with random values sampled from a normal distribution.

Return type

Tensor

hidet.graph.empty_like(data, shape=None, dtype=None, device=None, layout=None)[source]

Create an uninitialized tensor with the same shape, dtype, and device as the given tensor.

Parameters
  • data (Tensor) – The tensor to copy shape, dtype, and device from.

  • shape (Sequence[int], optional) – The shape of new tensor. If None, the shape of data is used.

  • dtype (DataType or str, optional) – The data type of element of the tensor. If None, the dtype of data is used.

  • device (Device or str, optional) – The device on which the new tensor is created. If None, the device of data is used.

  • layout (DataLayout, optional) – The layout of the new tensor. If None, the layout of data is used.

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.zeros_like(data, shape=None, dtype=None, device=None)[source]

Create a tensor initialized with zero with the same shape, dtype, and device as the given tensor.

Parameters
  • data (Tensor) – The tensor to copy shape, dtype, and device from.

  • shape (Sequence[int], optional) – The shape of new tensor. If None, the shape of data is used.

  • dtype (DataType or str, optional) – The data type of element of the tensor. If None, the dtype of data is used.

  • device (Device or str, optional) – The device on which the new tensor is created. If None, the device of data is used.

Returns

ret – The created tensor with all elements as zero.

Return type

Tensor

hidet.graph.ones_like(data, shape=None, dtype=None, device=None)[source]

Create a tensor initialized with one with the same shape, dtype, and device as the given tensor.

Parameters
  • data (Tensor) – The tensor to copy shape, dtype, and device from.

  • shape (Sequence[int], optional) – The shape of new tensor. If None, the shape of data is used.

  • dtype (DataType or str, optional) – The data type of element of the tensor. If None, the dtype of data is used.

  • device (Device or str, optional) – The device on which the new tensor is created. If None, the device of data is used.

Returns

ret – The created tensor with all elements as one.

Return type

Tensor

hidet.graph.symbol_like(data, shape=None, dtype=None, device=None, layout=None)[source]

Create a symbol tensor like an existing tensor.

Parameters
  • data (Tensor) – The tensor to copy shape, dtype, and device from.

  • shape (Sequence[int], optional) – The shape of new tensor. If None, the shape of data is used.

  • dtype (DataType or str, optional) – The data type of element of the tensor. If None, the dtype of data is used.

  • device (Device or str, optional) – The device on which the new tensor is created. If None, the device of data is used.

  • layout (DataLayout, optional) – The layout of the new tensor. If None, the layout of data is used.

Returns

ret – The created symbol tensor.

Return type

Tensor

hidet.graph.full(shape, fill_value, dtype='float32', device='cpu')[source]

Create a tensor initialized with the given constant.

Parameters
  • shape (Sequence[int]) – The shape of new tensor.

  • fill_value (float or int or hidet.ir.Constant) – The constant to initialize the new tensor.

  • dtype (DataType or str, default 'float32') – The data type of element of the tensor.

  • device (Device or str, default 'cpu') – The device on which the new tensor is created.

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.full_like(data, fill_value, shape=None, dtype=None, device=None)[source]

Create a tensor initialized with fill_value with the same shape, dtype, and device as the given tensor.

Parameters
  • data (Tensor) – The tensor to copy shape, dtype, and device from.

  • fill_value (int, float, or bool) – The value to fill the tensor with.

  • shape (Sequence[int], optional) – The shape of new tensor. If None, the shape of data is used.

  • dtype (DataType or str, optional) – The data type of element of the tensor. If None, the dtype of data is used.

  • device (Device or str, optional) – The device on which the new tensor is created. If None, the device of data is used.

Returns

ret – The created tensor with all elements as fill_value.

Return type

Tensor

hidet.graph.from_numpy(nparray)[source]

Create a tensor from a numpy array, sharing the memory with the numpy array when possible.

Parameters

nparray (numpy.ndarray) – The numpy array to create the tensor from.

Returns

ret – The created tensor.

Return type

Tensor

hidet.graph.from_dlpack(dltensor)[source]

Create a hidet tensor from an object that implements the __dlpack__ protocol.

Parameters

dltensor (an object that implements the DLPack protocol.) – The object must have the method __dlpack__ that returns a PyCapsule object with name dltensor.

Returns

ret – The hidet tensor that shares the same storage with the DLPack tensor.

Return type

Tensor

hidet.graph.from_torch(torch_tensor)[source]

Create a hidet tensor from a PyTorch tensor.

The created tensor shares the same memory as the given PyTorch tensor, so any modification to the contents of one tensor is reflected in the other.

Parameters

torch_tensor (torch.Tensor) – The pytorch tensor.

Returns

ret – The created hidet tensor.

Return type

Tensor

hidet.graph.trace_from(tensor, inputs=None)[source]

Trace the flow graph given the output tensor(s).

Each hidet.graph.Tensor has an attribute hidet.graph.Tensor.trace which indicates how the tensor is generated. If the tensor is generated by an operator with symbolic input(s), the tensor itself is also symbolic. And the tensor will have a reference to the operator that generates it. The reference is stored in this attribute.

This function walks through the trace of the given tensor(s) and constructs a flow graph.

When there are multiple symbolic inputs, it is mandatory to specify the “inputs” argument explicitly to avoid ambiguity.

Parameters
  • tensor (Tensor or List[Tensor]) – The output tensor(s) that we trace from.

  • inputs (Optional, Tensor or List[Tensor]) – The inputs of the flow graph. When there is only a single symbol tensor in the flow graph, it is optional. When there are multiple inputs, this is required to specify the input order.

Returns

ret – The flow graph whose outputs are the given tensor(s).

Return type

FlowGraph
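
The walk over the trace described above can be sketched as a depth-first traversal that collects each producing operator after its inputs. This is an illustration under assumptions, not hidet's implementation: SketchOp, SketchTensor, and sketch_trace_from are hypothetical, and the trace attribute here holds only the producing operator (or None for a symbolic input).

```python
from typing import List, Optional


class SketchOp:
    """Hypothetical operator: a name plus its input tensors."""

    def __init__(self, name: str, inputs: List["SketchTensor"]) -> None:
        self.name = name
        self.inputs = inputs
        self.output = SketchTensor(trace=self)


class SketchTensor:
    """Hypothetical tensor; trace is the producing operator, or None for a symbol."""

    def __init__(self, trace: Optional[SketchOp] = None) -> None:
        self.trace = trace


def sketch_trace_from(outputs: List[SketchTensor]) -> List[SketchOp]:
    """Collect the operators reachable from outputs in execution order."""
    ordered: List[SketchOp] = []
    visited = set()

    def visit(tensor: SketchTensor) -> None:
        op = tensor.trace
        if op is None or id(op) in visited:
            return  # a symbolic input, or an operator already collected
        visited.add(id(op))
        for inp in op.inputs:
            visit(inp)  # producers must come before their consumers
        ordered.append(op)

    for out in outputs:
        visit(out)
    return ordered


x = SketchTensor()  # symbolic input
a = SketchOp("relu", [x]).output
b = SketchOp("exp", [x]).output
c = SketchOp("add", [a, b]).output
d = SketchOp("mul", [c, a]).output
op_names = [op.name for op in sketch_trace_from([d])]
```

For this small graph op_names is ['relu', 'exp', 'add', 'mul']: every operator appears once, after the operators that produce its inputs, even though the relu output is consumed twice.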

hidet.graph.optimize(graph)[source]

Optimize a flow graph.

This function applies a sequence of predefined graph-level passes to a FlowGraph to conduct optimizations and graph transformations.

Tip

Some graph passes provide configuration options. Refer to hidet.graph.PassContext for more information on graph pass configuration.

Parameters

graph (FlowGraph) – The flow graph to be optimized.

Returns

ret – The optimized flow graph.

Return type

FlowGraph