
Cheatsheet

Default Configuration

When the global configuration is not loaded from a file, torchmeter initializes it with the following defaults.
You can view them hierarchically via:

Python
from torchmeter import get_config

cfg = get_config()
print(cfg)
YAML
# time interval in displaying profile
render_interval: 0.15         # unit: second

# Whether to fold the repeated parts when rendering the model structure tree
tree_fold_repeat: True 

# Display settings for repeat blocks in the model structure tree
# It actually is a rich.panel.Panel object, refer to https://rich.readthedocs.io/en/latest/reference/panel.html#rich.panel.Panel 
tree_repeat_block_args:
  title: '[i]Repeat [[b]<repeat_time>[/b]] Times[/]' # Title of the repeat block, accept rich styling
  title_align: center         # Title alignment, left, center, right
  subtitle: null              # Subtitle of the repeat block, accept rich styling
  subtitle_align: center      # Subtitle alignment, left, center, right

  style: dark_goldenrod   # Style of the repeat block, execute `python -m rich.theme` to get more
  highlight: true             # Whether to highlight the value (number, string...)
  box: HEAVY_EDGE             # Box type, use its name directly like here!!! execute `python -m rich.box` to get more
  border_style: dim           # Border style, execute `python -m rich.theme` to get more

  width: null                 # Width of the repeat block, null means auto
  height: null                # Height of the repeat block, null means auto
  padding:                    # Padding of the repeat block
    - 0                       # top/bottom padding
    - 1                       # left/right padding
  expand: false               # Whether to expand the repeat block to full screen size


# Fine-grained display settings for each level in the model structure tree
# It actually is a rich.tree.Tree object, refer to https://rich.readthedocs.io/en/latest/reference/tree.html#rich.tree.Tree
# the level key is necessary!!!! It indicates the level to which the following settings will be applied.
# level 0 indicates the root node (i.e. the model itself), level 1 the first layer of model children, and so on.
tree_levels_args:
  default:   # Necessary!!!! Alternatively, you can use 'all' to apply the settings below to all levels
    label: '[b gray35](<node_id>) [green]<name>[/green] [cyan]<type>[/]' # node representation string, accepts rich styling
    style: tree               # Style of the node, execute `python -m rich.theme` to get more
    guide_style: light_coral  # Guide style of the node, execute `python -m rich.theme` to get more
    highlight: true           # Whether to highlight the value (number, string...)
    hide_root: false          # Whether to hide the node at this level
    expanded: true            # Whether to display the node's children

  '0':   # Necessary!!!! The number indicates the level to which the following settings will be applied
    label: '[b light_coral]<name>[/]'
    guide_style: light_coral
    # settings not specified here fall back to those defined under the `default` level


# Display settings for each column in the profile table
# It actually is a rich.table.Column object, refer to https://rich.readthedocs.io/en/latest/reference/table.html#rich.table.Column
table_column_args:
  style: none       # Style of the column, execute `python -m rich.theme` to get more
  justify: center   # Justify of the column, left, center, right
  vertical: middle  # Vertical align of the column, top, middle, bottom
  overflow: fold    # Overflow of the column, fold, crop, ellipsis, see https://rich.readthedocs.io/en/latest/console.html?highlight=overflow#overflow
  no_wrap: false    # Prevent wrapping of text within the column.


# Display settings for the profile table
# It actually is a rich.table.Table object, refer to https://rich.readthedocs.io/en/latest/reference/table.html#rich.table.Table
table_display_args:
  style: spring_green4        # Style of the table, execute `python -m rich.theme` to get more
  highlight: true             # Whether to highlight the value (number, string...)

  width: null                 # The width in characters of the table, or `null` to automatically fit
  min_width: null             # The minimum width of the table, or `null` for no minimum
  expand: false               # Whether to expand the table to full screen size
  padding:                    # Padding for cells 
    - 0                       # top/bottom padding
    - 1                       # left/right padding
  collapse_padding: false     # Whether to enable collapsing of padding around cells
  pad_edge: true              # Whether to enable padding of edge cells
  leading: 0                  # Number of blank lines between rows (precludes `show_lines` below)

  title: null                 # Title of the table, accept rich styling
  title_style: bold           # Style of the title, execute `python -m rich.theme` to get more
  title_justify: center       # Justify of the title, left, center, right
  caption: null               # The table caption rendered below, accept rich styling
  caption_style: null         # Style of the caption, execute `python -m rich.theme` to get more
  caption_justify: center     # Justify of the caption, left, center, right

  show_header: true           # Whether to show the header row
  header_style: bold          # Style of the header, execute `python -m rich.theme` to get more

  show_footer: false          # Whether to show the footer row
  footer_style: italic        # Style of the footer, execute `python -m rich.theme` to get more

  show_lines: false           # Whether to show lines between rows
  row_styles: null            # Optional list of row styles, if more than one style is given then the styles will alternate

  show_edge: true             # Whether to show the edge of the table
  box: ROUNDED                # Box type, use its name directly like here!!! execute `python -m rich.box` to get more
  safe_box: true              # Whether to disable box characters that don't display on windows legacy terminal with *raster* fonts
  border_style: null          # Style of the border, execute `python -m rich.theme` to get more

# Display settings about how to combine the tree and table in the profile
combine:
  horizon_gap: 2  # horizontal gap in pixels between the tree and the table
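
The same defaults can be tweaked at runtime through the object returned by get_config(). Below is a minimal sketch, assuming the config fields are attribute-accessible (mirroring the way Meter attributes are assigned later on this page):

Python
from torchmeter import get_config

cfg = get_config()

# assumption: fields are exposed as attributes on the config object
cfg.render_interval = 0.1     # speed up the rendering animation (unit: second)
cfg.tree_fold_repeat = False  # render repeated blocks without folding them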

Tree Level Index

What is the tree level index?

As the name implies, it is the hierarchical index of an operation tree.

Figuratively, within the model's operation tree, each guide line represents a level. The level index starts at 0 and increases from left to right.

Additionally, the level index at which a tree node is mounted is equal to len(tree_node.node_id.split('.')). For instance, the node (1.1.6.2) 1 BatchNorm2d below is mounted at level 4.

AnyNet
├── (1) layers Sequential
│   ├── (1.1) 0 BasicBlock
│   │   ├── (1.1.1) conv1 Conv2d
│   │   ├── (1.1.2) bn1 BatchNorm2d
│   │   ├── (1.1.3) relu ReLU
│   │   ├── (1.1.4) conv2 Conv2d
│   │   ├── (1.1.5) bn2 BatchNorm2d
│   │   └── (1.1.6) downsample Sequential
│   │       ├── (1.1.6.1) 0 Conv2d
│   │       └── (1.1.6.2) 1 BatchNorm2d
│   └── (1.2) 1 BasicBlock
│       ├── (1.2.1) conv1 Conv2d
│       ├── (1.2.2) bn1 BatchNorm2d
│       ├── (1.2.3) relu ReLU
│       ├── (1.2.4) conv2 Conv2d
│       └── (1.2.5) bn2 BatchNorm2d
├── (2) avgpool AdaptiveAvgPool2d
└── (3) fc Linear

↑   ↑   ↑   ↑
0   1   2   3   (level index, each pointing to a guide line)
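
The relationship between node_id and level index described above can be checked with one line of plain Python:

Python
# level index of a mounted node, as described above
node_id = "1.1.6.2"             # the BatchNorm2d node in the tree above
print(len(node_id.split(".")))  # -> 4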

How to use the tree level index?

A valid level index lets you customize the operation tree with fine-grained precision. torchmeter regards the following values as valid tree level indexes:

  1. A non-negative integer (e.g. 0, 1, 2, ...): The configurations under a specific index apply only to the corresponding level.
  2. default: The configurations under this index will be applied to all undefined levels.
  3. all: The configurations under this index will override those at any other level, and will be applied with the highest priority across all levels.

Please refer to Customize the Hierarchical Display for specific usage scenarios.
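
As a quick illustration of these three index forms, the sketch below writes them in the same structure as the tree_levels_args section of the default configuration above. Treat the assignment onto the config object as an assumption rather than the definitive API; the linked guide documents the official way to customize levels.

Python
from torchmeter import get_config

cfg = get_config()

# The three kinds of valid level index, mirroring the YAML structure above.
# NOTE (assumption): whether the mapping can be assigned back like this is not
# covered here -- see "Customize the Hierarchical Display" for the exact usage.
cfg.tree_levels_args = {
    "default": {"guide_style": "light_coral"},   # fallback for levels not listed
    "0": {"label": "[b light_coral]<name>[/]"},  # applies only to the root level
    # "all": {...}                               # would override every level
}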


Tree Node Attributes

What is a tree node attribute?

  • When a torchmeter.Meter is instantiated with a PyTorch model, the model's architecture is scanned automatically and a tree structure is built to depict the model.

  • This tree structure is realized via torchmeter.engine.OperationTree. In this tree, each node is an instance of torchmeter.engine.OperationNode, which represents a layer or operation (such as nn.Conv2d, nn.ReLU, etc.) within the model.

  • Therefore, the attributes of a tree node are the attributes / properties of an instance of OperationNode.

What can tree node attributes help me with?

All available attributes, as defined below, are intended to:

  • help you obtain supplementary information about a tree node;
  • customize how the tree structure is displayed during rendering.

What are the available attributes of a tree node?

Illustrative Example
Python
from collections import OrderedDict
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()

        self.single_1 = nn.Linear(1, 10)

        self.repeat_1x2 = nn.Sequential(OrderedDict({
            "A": nn.Linear(10, 10),
            "B": nn.Linear(10, 10),
        }))

        self.single_2 = nn.ReLU()

        self.repeat_2x3 = nn.Sequential(OrderedDict({
            "C": nn.Linear(10, 5),
            "D": nn.ReLU(),
            "E": nn.Linear(5, 10),

            "F": nn.Linear(10, 5),
            "G": nn.ReLU(),
            "H": nn.Linear(5, 10),

            "I": nn.Linear(10, 5),
            "J": nn.ReLU(),
            "K": nn.Linear(5, 10),
        }))

        self.single_3 = nn.Linear(10, 1)
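
To see how torchmeter turns this model into an operation tree, wrap it in a Meter and render the structure, following the same pattern as the usage example later on this page (this assumes the SimpleModel class defined above is in scope):

Python
from rich import print
from torchmeter import Meter

model = Meter(SimpleModel())   # scans the architecture and builds the operation tree
print(model.structure)         # renders the tree whose node attributes are tabulated below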

Taking the above model as an example, the values of each attribute in each layer are as follows:

name, type, node_id, is_leaf, operation, module_repr

| node_id | name | type | is_leaf | operation | module_repr |
| --- | --- | --- | --- | --- | --- |
| 0 | SimpleModel | SimpleModel | False | instance created via SimpleModel() | SimpleModel |
| 1 | single_1 | Linear | True | instance created via nn.Linear(1, 10) in line 8 | Linear(in_features=1, out_features=10, bias=True) |
| 2 | repeat_1x2 | Sequential | False | instance created via nn.Sequential in line 10 | Sequential |
| 2.1 | A | Linear | True | instance created via nn.Linear(10, 10) in line 11 | Linear(in_features=10, out_features=10, bias=True) |
| 2.2 | B | Linear | True | instance created via nn.Linear(10, 10) in line 12 | Linear(in_features=10, out_features=10, bias=True) |
| 3 | single_2 | ReLU | True | instance created via nn.ReLU() in line 15 | ReLU() |
| 4 | repeat_2x3 | Sequential | False | instance created via nn.Sequential in line 17 | Sequential |
| 4.1 | C | Linear | True | instance created via nn.Linear(10, 5) in line 18 | Linear(in_features=10, out_features=5, bias=True) |
| 4.2 | D | ReLU | True | instance created via nn.ReLU() in line 19 | ReLU() |
| 4.3 | E | Linear | True | instance created via nn.Linear(5, 10) in line 20 | Linear(in_features=5, out_features=10, bias=True) |
| 4.4 | F | Linear | True | instance created via nn.Linear(10, 5) in line 22 | Linear(in_features=10, out_features=5, bias=True) |
| 4.5 | G | ReLU | True | instance created via nn.ReLU() in line 23 | ReLU() |
| 4.6 | H | Linear | True | instance created via nn.Linear(5, 10) in line 24 | Linear(in_features=5, out_features=10, bias=True) |
| 4.7 | I | Linear | True | instance created via nn.Linear(10, 5) in line 26 | Linear(in_features=10, out_features=5, bias=True) |
| 4.8 | J | ReLU | True | instance created via nn.ReLU() in line 27 | ReLU() |
| 4.9 | K | Linear | True | instance created via nn.Linear(5, 10) in line 28 | Linear(in_features=5, out_features=10, bias=True) |
| 5 | single_3 | Linear | True | instance created via nn.Linear(10, 1) in line 31 | Linear(in_features=10, out_features=1, bias=True) |
parent & childs

Here we use the node_id of each node's parent and childs to simplify the display.
In actual usage, a node's parent is None or an instance of torchmeter.engine.OperationNode,
while childs is an OrderedDict with node_id as key and the node instance as value.

| node_id | name | type | parent | childs |
| --- | --- | --- | --- | --- |
| 0 | SimpleModel | SimpleModel | None | 1 ~ 5 |
| 1 | single_1 | Linear | 0 | |
| 2 | repeat_1x2 | Sequential | 0 | 2.1, 2.2 |
| 2.1 | A | Linear | 2 | |
| 2.2 | B | Linear | 2 | |
| 3 | single_2 | ReLU | 0 | |
| 4 | repeat_2x3 | Sequential | 0 | 4.1 ~ 4.9 |
| 4.1 | C | Linear | 4 | |
| 4.2 | D | ReLU | 4 | |
| 4.3 | E | Linear | 4 | |
| 4.4 | F | Linear | 4 | |
| 4.5 | G | ReLU | 4 | |
| 4.6 | H | Linear | 4 | |
| 4.7 | I | Linear | 4 | |
| 4.8 | J | ReLU | 4 | |
| 4.9 | K | Linear | 4 | |
| 5 | single_3 | Linear | 0 | |
repeat_winsz & repeat_time
| node_id | name | type | repeat_winsz | repeat_time | explanation |
| --- | --- | --- | --- | --- | --- |
| 0 | SimpleModel | SimpleModel | 1 | 1 | no repetition |
| 1 | single_1 | Linear | 1 | 1 | no repetition |
| 2 | repeat_1x2 | Sequential | 1 | 1 | no repetition |
| 2.1 | A | Linear | 1 | 2 | The repeating window covers 2.1 and 2.2. The two layers have the same definition, so one module can be considered to be repeated twice |
| 2.2 | B | Linear | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 3 | single_2 | ReLU | 1 | 1 | no repetition |
| 4 | repeat_2x3 | Sequential | 1 | 1 | no repetition |
| 4.1 | C | Linear | 3 | 3 | The repeating window takes 4.1 ~ 4.3 as a whole and covers 4.1 ~ 4.9 |
| 4.2 | D | ReLU | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.3 | E | Linear | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.4 | F | Linear | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.5 | G | ReLU | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.6 | H | Linear | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.7 | I | Linear | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.8 | J | ReLU | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 4.9 | K | Linear | 1 | 1 | Already included in a repeating window, so repetitiveness analysis is skipped and default values are used |
| 5 | single_3 | Linear | 1 | 1 | no repetition |
| Attribute | Type | Explanation |
| --- | --- | --- |
| operation | torch.nn.Module | The underlying PyTorch module |
| type | str | The operation type. If the operation is a PyTorch module, the name of its class is used |
| name | str | The module name defined in the underlying PyTorch model |
| node_id | str | A globally unique module identifier, formatted as <parent-node-id>.<node-number-in-current-level>. The index starts from 1, because the root is denoted as 0 |
| is_leaf | bool | Whether the node is a leaf node (no child nodes) |
| module_repr | str | The text representation of the current operation. For non-leaf nodes, it is the node type; otherwise, it is the return value of the __repr__() method |
| parent | torchmeter.engine.OperationNode | The parent node of this node. Each node has only one parent |
| childs | OrderedDict[str, OperationNode] | An OrderedDict storing the children of this node in feed-forward order. The key is the child's node_id, the value is the child node itself |
| repeat_winsz | int | The size of the repeating window for the current node. Default is 1, meaning no repetition (the window contains only the node itself) |
| repeat_time | int | The number of repetitions of the window in which the current module is located. Default is 1, meaning no repetition |
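
Since childs is an ordered mapping from node_id to OperationNode and parent points back to the enclosing node, the whole tree can be walked with a few lines of plain Python. The sketch below relies only on the attributes listed above; how you obtain the root OperationNode (e.g. from a torchmeter.engine.OperationTree) is not covered by this table, so the root is left as a hypothetical placeholder:

Python
from torchmeter.engine import OperationNode  # class named in the bullets above

def walk(node: OperationNode, indent: int = 0) -> None:
    """Recursively print every node using the documented attributes."""
    print(" " * indent + f"({node.node_id}) {node.name} {node.type}")
    if not node.is_leaf:                    # leaf nodes have no children
        for child in node.childs.values():  # childs: OrderedDict[node_id, OperationNode]
            walk(child, indent + 2)

# walk(root_node)  # `root_node`: the level-0 OperationNode, obtained elsewhere (hypothetical)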

How to use the attributes of a tree node?

An attribute of a tree node can be employed as a placeholder within the value of certain configurations. This allows for the dynamic retrieval of the attribute value during the tree-rendering procedure.

The configurations/scenarios that support tree node attributes as placeholders are listed below.

| configuration/scenario | Default Value |
| --- | --- |
| tree_levels_args.[level-index].label [1] | '[b gray35](<node_id>) [green]<name>[/green] [cyan]<type>[/]' [2] |
| tree_repeat_block_args.title | '[i]Repeat [[b]<repeat_time>[/b]] Times[/]' |
| tree_renderer.repeat_footer | Supports text and functions, see Customize the footer |
Usage Example

For example, if you want to unify the titles of all repeated blocks into a bold My Repeat Title, you can do this:

Python
from rich import print
from torchmeter import Meter
from torchvision import models

resnet18 = models.resnet18()
model = Meter(resnet18)

model.tree_repeat_block_args.title = '[b]My Repeat Title[/b]' #(1)

print(model.structure) 
  1. 🙋‍♂️ That's all. You can now see that the titles of all repeat blocks have been changed.

Unit Explanation

There are four types of units in torchmeter, listed as follows:

The raw-data tag in the following tables indicates that the unit marked with it is the one used in raw data mode.

Used by param, cal

| unit | explanation | tag | example |
| --- | --- | --- | --- |
| null | Number of subjects | raw-data | 5: There are 5 semantic subjects |
| K | \(10^3\) | | 5 K: There are 5,000 ... |
| M | \(10^6\) | | 5 M: There are 5,000,000 ... |
| G | \(10^9\) | | 5 G: There are 5,000,000,000 ... |
| T | \(10^{12}\) | | 5 T: There are 5,000,000,000,000 ... |

Used by mem

| unit | explanation | tag | example |
| --- | --- | --- | --- |
| B | \(2^0=1\) byte | raw-data | 5 B: \(5 \times 1 = 5\) bytes |
| KiB | \(2^{10}=1024\) bytes | | 5 KiB: \(5 \times 2^{10} = 5120\) bytes |
| MiB | \(2^{20}\) bytes | | 5 MiB: \(5 \times 2^{20} = 5242880\) bytes |
| GiB | \(2^{30}\) bytes | | 5 GiB: \(5 \times 2^{30} = 5368709120\) bytes |
| TiB | \(2^{40}\) bytes | | 5 TiB: \(5 \times 2^{40} = 5497558138880\) bytes |
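
These binary prefixes scale by \(2^{10}=1024\) per step, unlike the decimal K/M/G/T units used for param and cal. The figures in the table can be reproduced with a few lines of plain Python:

Python
# reproduce the byte conversions listed in the table above
for unit, power in [("B", 0), ("KiB", 10), ("MiB", 20), ("GiB", 30), ("TiB", 40)]:
    print(f"5 {unit} = {5 * 2**power} bytes")
# -> 5 B = 5 bytes, 5 KiB = 5120 bytes, 5 MiB = 5242880 bytes, ...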

Used by ittp - inference time

Why do I obtain different results with different attempts or inputs?

Don't worry, it's a normal phenomenon.

Different results with different attempts

Inference latency is measured in real-time because it is related to dynamic factors such as the real-time load of the machine and the device where the model is located. Therefore, each time the ittp attribute (i.e., Meter(your_model).ittp) is accessed, the inference latency and throughput will be re-measured to reflect the model performance under the current working conditions.

Different results with different inputs

Inference latency is the time it takes for the model to complete one forward propagation with the given input. Therefore, different inputs bring different workloads to the model, resulting in differences in inference latency.

In torchmeter, the measurement of inference latency and throughput is based on the input received by the model in the most recent forward propagation. Hence, different input batches or sample shapes, combined with differences in machine load at different times, will lead to changes in inference latency.

Additionally, due to automatic device synchronization, the input is moved to the device where the model is located before forward propagation is executed, so two inputs with the same content on different devices will yield very similar results.

| unit | explanation | tag | example |
| --- | --- | --- | --- |
| ns | nanosecond | | 5 ns: \(5 \times 10^{-9}\) seconds |
| us | microsecond | | 5 us: \(5 \times 10^{-6}\) seconds |
| ms | millisecond | | 5 ms: \(5 \times 10^{-3}\) seconds |
| s | second | raw-data | 5 s: \(5 \times 10^{0}\) seconds |
| min | minute | | 5 min: \(5 \times 60^{1}\) seconds |
| h | hour | | 5 h: \(5 \times 60^{2}\) seconds |

Used by ittp - throughput

What is the meaning of Input in the table below?

Input refers to all the inputs received by the model in your last execution of forward propagation. Torchmeter will treat these inputs as a standard unit to calculate the inference latency and throughput.

To facilitate comparisons between models, we recommend using the same input (such as a single sample with batch_size=1) for different models when measuring all statistics, in order to obtain more universally comparable results.

In the following example, Input in Case 1 refers to x=torch.randn(1, 3, 224, 224); y=0.1, while in Case 2 it refers to x=torch.randn(100, 3, 224, 224); y=0.1. You can see the difference between the two cases from the results: the inference latency when batch_size=100 is significantly higher than when batch_size=1.

Python
import torch
import torch.nn as nn
from rich import print
from torchmeter import Meter
from torchvision import models

class ExampleModel(nn.Module):
    def __init__(self):
        super(ExampleModel, self).__init__()
        self.backbone = models.resnet18()

    def forward(self, x: torch.Tensor, y: int):
        return self.backbone(x) + y

model = Meter(ExampleModel(), device="cuda")

# case1: batch size = 1 ------------------------------
ipt = torch.randn(1, 3, 224, 224)
model(ipt, 0.1)
print(model.ittp)

# case2: batch size = 100 ------------------------------
ipt = torch.randn(100, 3, 224, 224)
model(ipt, 0.1)
print(model.ittp)
InferTime_Throughput_INFO
•   Operation_Id = 0
• Operation_Name = ExampleModel
• Operation_Type = ExampleModel
•     Infer_Time = 2.20 ms ± 19.53 us
•     Throughput = 454.31 IPS ± 4.03 IPS
InferTime_Throughput_INFO
•   Operation_Id = 0
• Operation_Name = ExampleModel
• Operation_Type = ExampleModel
•     Infer_Time = 11.38 ms ± 8.30 us
•     Throughput = 87.86 IPS ± 0.06
| unit | explanation | tag | example |
| --- | --- | --- | --- |
| IPS | Input Per Second | raw-data | 5 IPS: process 5 inputs per second |
| KIPS | \(10^3\) IPS | | 5 KIPS: process 5,000 inputs per second |
| MIPS | \(10^6\) IPS | | 5 MIPS: process 5,000,000 inputs per second |
| GIPS | \(10^9\) IPS | | 5 GIPS: process 5,000,000,000 inputs per second |
| TIPS | \(10^{12}\) IPS | | 5 TIPS: process 5,000,000,000,000 inputs per second |

  1. As for the value of [level-index], please refer to Tree Level Index ↩

  2. Rich style markup and its abbreviations are supported when writing the value content. ↩