# Cheatsheet

## Default Configuration

When a global configuration is not loaded from a file, torchmeter initializes it with the following defaults, shown below in a hierarchical view:
```yaml
# Time interval between refreshes when displaying the profile
render_interval: 0.15 # unit: second

# Whether to fold repeated parts when rendering the model structure tree
tree_fold_repeat: true

# Display settings for repeat blocks in the model structure tree.
# This is actually a rich.panel.Panel object, refer to https://rich.readthedocs.io/en/latest/reference/panel.html#rich.panel.Panel
tree_repeat_block_args:
  title: '[i]Repeat [[b]<repeat_time>[/b]] Times[/]' # Title of the repeat block, accepts rich styling
  title_align: center # Title alignment: left, center, right
  subtitle: null # Subtitle of the repeat block, accepts rich styling
  subtitle_align: center # Subtitle alignment: left, center, right
  style: dark_goldenrod # Style of the repeat block, execute `python -m rich.theme` to see more
  highlight: true # Whether to highlight values (numbers, strings, ...)
  box: HEAVY_EDGE # Box type, use its name directly as here! Execute `python -m rich.box` to see more
  border_style: dim # Border style, execute `python -m rich.theme` to see more
  width: null # Width of the repeat block, null means auto
  height: null # Height of the repeat block, null means auto
  padding: # Padding of the repeat block
    - 0 # top/bottom padding
    - 1 # left/right padding
  expand: false # Whether to expand the repeat block to the full screen width

# Fine-grained display settings for each level of the model structure tree.
# This is actually a rich.tree.Tree object, refer to https://rich.readthedocs.io/en/latest/reference/tree.html#rich.tree.Tree
# The level key is required! It indicates which level the settings beneath it apply to.
# Level 0 is the root node (i.e. the model itself), level 1 is the first layer of the model's children, and so on.
tree_levels_args:
  default: # Required! Alternatively, use 'all' to apply the settings below to all levels
    label: '[b gray35](<node_id>) [green]<name>[/green] [cyan]<type>[/]' # Node display string, accepts rich styling
    style: tree # Style of the node, execute `python -m rich.theme` to see more
    guide_style: light_coral # Guide style of the node, execute `python -m rich.theme` to see more
    highlight: true # Whether to highlight values (numbers, strings, ...)
    hide_root: false # Whether to hide the node at this level
    expanded: true # Whether to display the node's children
  '0': # The number indicates which level the settings beneath it apply to
    label: '[b light_coral]<name>[/]'
    guide_style: light_coral
    # Any setting not specified here falls back to the settings defined under `default`

# Display settings for each column of the profile table.
# This is actually a rich.table.Column object, refer to https://rich.readthedocs.io/en/latest/reference/table.html#rich.table.Column
table_column_args:
  style: none # Style of the column, execute `python -m rich.theme` to see more
  justify: center # Justification of the column: left, center, right
  vertical: middle # Vertical alignment of the column: top, middle, bottom
  overflow: fold # Overflow behavior of the column: fold, crop, ellipsis; see https://rich.readthedocs.io/en/latest/console.html?highlight=overflow#overflow
  no_wrap: false # Prevent wrapping of text within the column

# Display settings for the profile table.
# This is actually a rich.table.Table object, refer to https://rich.readthedocs.io/en/latest/reference/table.html#rich.table.Table
table_display_args:
  style: spring_green4 # Style of the table, execute `python -m rich.theme` to see more
  highlight: true # Whether to highlight values (numbers, strings, ...)
  width: null # Width in characters of the table, or null to fit automatically
  min_width: null # Minimum width of the table, or null for no minimum
  expand: false # Whether to expand the table to the full screen width
  padding: # Padding for cells
    - 0 # top/bottom padding
    - 1 # left/right padding
  collapse_padding: false # Whether to collapse padding around cells
  pad_edge: true # Whether to pad edge cells
  leading: 0 # Number of blank lines between rows (precludes `show_lines` below)
  title: null # Title of the table, accepts rich styling
  title_style: bold # Style of the title, execute `python -m rich.theme` to see more
  title_justify: center # Justification of the title: left, center, right
  caption: null # Table caption rendered below the table, accepts rich styling
  caption_style: null # Style of the caption, execute `python -m rich.theme` to see more
  caption_justify: center # Justification of the caption: left, center, right
  show_header: true # Whether to show the header row
  header_style: bold # Style of the header, execute `python -m rich.theme` to see more
  show_footer: false # Whether to show the footer row
  footer_style: italic # Style of the footer, execute `python -m rich.theme` to see more
  show_lines: false # Whether to show lines between rows
  row_styles: null # Optional list of row styles; if more than one style is given, the styles alternate
  show_edge: true # Whether to show the edge of the table
  box: ROUNDED # Box type, use its name directly as here! Execute `python -m rich.box` to see more
  safe_box: true # Whether to disable box characters that don't display on Windows legacy terminals with raster fonts
  border_style: null # Style of the border, execute `python -m rich.theme` to see more

# Display settings for how the tree and table are combined in the profile
combine:
  horizon_gap: 2 # horizontal gap in pixels between the tree and the table
```
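If you keep these defaults in a standalone YAML file, you can inspect or tweak them outside torchmeter with PyYAML. This is only a generic sketch; the file name is hypothetical and it does not use any torchmeter API:

```python
import yaml  # PyYAML

# Hypothetical file containing the defaults shown above
with open("torchmeter_config.yaml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# Inspect or override individual fields before using the file
print(cfg["render_interval"])  # 0.15
cfg["tree_levels_args"]["default"]["guide_style"] = "cyan"
```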
## Tree Level Index

### What is the tree level index?

As the name implies, it is the hierarchical index of a node in the operation tree. For any non-root node, the level index equals `len(node_id.split('.'))`; the root (node id `0`) sits at level 0.
```
AnyNet
├── (1) layers Sequential
│   ├── (1.1) 0 BasicBlock
│   │   ├── (1.1.1) conv1 Conv2d
│   │   ├── (1.1.2) bn1 BatchNorm2d
│   │   ├── (1.1.3) relu ReLU
│   │   ├── (1.1.4) conv2 Conv2d
│   │   ├── (1.1.5) bn2 BatchNorm2d
│   │   └── (1.1.6) downsample Sequential
│   │       ├── (1.1.6.1) 0 Conv2d
│   │       └── (1.1.6.2) 1 BatchNorm2d
│   └── (1.2) 1 BasicBlock
│       ├── (1.2.1) conv1 Conv2d
│       ├── (1.2.2) bn1 BatchNorm2d
│       ├── (1.2.3) relu ReLU
│       ├── (1.2.4) conv2 Conv2d
│       └── (1.2.5) bn2 BatchNorm2d
├── (2) avgpool AdaptiveAvgPool2d
└── (3) fc Linear
↑   ↑   ↑   ↑
0   1   2   3
```
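To see why these indices line up, here is the rule from above written out as plain Python (an illustrative sketch, not a torchmeter API):

```python
def level_of(node_id: str) -> int:
    """Level of a node given its dot-separated node_id; the root is '0'."""
    return 0 if node_id == "0" else len(node_id.split("."))

assert level_of("0") == 0        # AnyNet (root)
assert level_of("1") == 1        # layers
assert level_of("1.1") == 2      # BasicBlock
assert level_of("1.1.6.1") == 4  # Conv2d inside downsample
```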
### How to use the tree level index?

A valid level index lets you customize the operation tree level by level. torchmeter treats the following values as valid tree level indexes (see the sketch below):

- A non-negative integer (e.g. `0`, `1`, `2`, ...): the settings under this index apply only to the corresponding level.
- `default`: the settings under this index apply to every level that is not explicitly configured.
- `all`: the settings under this index apply to all levels.

Please refer to Customize the Hierarchical Display for specific usage scenarios.
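For instance, the following `tree_levels_args` block (a sketch that only reuses keys from the default configuration above) restyles the root, gives level 2 its own label, and leaves every other level on the defaults:

```yaml
tree_levels_args:
  default:        # applies to every level not listed below
    label: '[b gray35](<node_id>) [green]<name>[/green] [cyan]<type>[/]'
    guide_style: light_coral
  '0':            # root node only
    label: '[b light_coral]<name>[/]'
  '2':            # level-2 nodes only
    label: '(<node_id>) <name>'
    guide_style: cyan
```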
## Tree Node Attributes

### What is a tree node attribute?

- Upon instantiation of a `torchmeter.Meter` with a PyTorch model, the model's architecture is scanned automatically and a tree structure depicting the model is produced (see the sketch after this list).
- This tree structure is realized via `torchmeter.engine.OperationTree`. Each node of the tree is an instance of `torchmeter.engine.OperationNode`, which represents a layer or operation (such as `nn.Conv2d`, `nn.ReLU`, etc.) within the model.
- Therefore, the attributes of a tree node are simply the attributes / properties of an `OperationNode` instance.
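A minimal sketch of that instantiation, assuming nothing beyond what is stated above (the constructor may accept additional options):

```python
import torch.nn as nn
from torchmeter import Meter

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Instantiating Meter with a PyTorch model triggers the architecture scan
# that builds the OperationTree / OperationNode structure described above.
metered = Meter(model)
```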
### How can a tree node attribute help me?

All of the available attributes, defined below, are intended to:

- give you supplementary information about a tree node;
- let you customize the display of the tree structure during rendering.
### What are the available attributes of a tree node?

**Illustrative Example**
```python
from collections import OrderedDict

import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.single_1 = nn.Linear(1, 10)

        self.repeat_1x2 = nn.Sequential(OrderedDict({
            "A": nn.Linear(10, 10),
            "B": nn.Linear(10, 10),
        }))

        self.single_2 = nn.ReLU()

        self.repeat_2x3 = nn.Sequential(OrderedDict({
            "C": nn.Linear(10, 5),
            "D": nn.ReLU(),
            "E": nn.Linear(5, 10),

            "F": nn.Linear(10, 5),
            "G": nn.ReLU(),
            "H": nn.Linear(5, 10),

            "I": nn.Linear(10, 5),
            "J": nn.ReLU(),
            "K": nn.Linear(5, 10),
        }))

        self.single_3 = nn.Linear(10, 1)
```
Taking the above model as an example, the values of each attribute at each layer are as follows:

**`name`, `type`, `node_id`, `is_leaf`, `operation`, `module_repr`**

| node_id | name | type | is_leaf | operation | module_repr |
|---|---|---|---|---|---|
| `0` | `SimpleModel` | `SimpleModel` | `False` | instance created via `SimpleModel()` | `SimpleModel` |
| `1` | `single_1` | `Linear` | `True` | instance created via `nn.Linear(1, 10)` in line 8 | `Linear(in_features=1, out_features=10, bias=True)` |
| `2` | `repeat_1x2` | `Sequential` | `False` | instance created via `nn.Sequential` in line 10 | `Sequential` |
| `2.1` | `A` | `Linear` | `True` | instance created via `nn.Linear(10, 10)` in line 11 | `Linear(in_features=10, out_features=10, bias=True)` |
| `2.2` | `B` | `Linear` | `True` | instance created via `nn.Linear(10, 10)` in line 12 | `Linear(in_features=10, out_features=10, bias=True)` |
| `3` | `single_2` | `ReLU` | `True` | instance created via `nn.ReLU()` in line 15 | `ReLU()` |
| `4` | `repeat_2x3` | `Sequential` | `False` | instance created via `nn.Sequential` in line 17 | `Sequential` |
| `4.1` | `C` | `Linear` | `True` | instance created via `nn.Linear(10, 5)` in line 18 | `Linear(in_features=10, out_features=5, bias=True)` |
| `4.2` | `D` | `ReLU` | `True` | instance created via `nn.ReLU()` in line 19 | `ReLU()` |
| `4.3` | `E` | `Linear` | `True` | instance created via `nn.Linear(5, 10)` in line 20 | `Linear(in_features=5, out_features=10, bias=True)` |
| `4.4` | `F` | `Linear` | `True` | instance created via `nn.Linear(10, 5)` in line 22 | `Linear(in_features=10, out_features=5, bias=True)` |
| `4.5` | `G` | `ReLU` | `True` | instance created via `nn.ReLU()` in line 23 | `ReLU()` |
| `4.6` | `H` | `Linear` | `True` | instance created via `nn.Linear(5, 10)` in line 24 | `Linear(in_features=5, out_features=10, bias=True)` |
| `4.7` | `I` | `Linear` | `True` | instance created via `nn.Linear(10, 5)` in line 26 | `Linear(in_features=10, out_features=5, bias=True)` |
| `4.8` | `J` | `ReLU` | `True` | instance created via `nn.ReLU()` in line 27 | `ReLU()` |
| `4.9` | `K` | `Linear` | `True` | instance created via `nn.Linear(5, 10)` in line 28 | `Linear(in_features=5, out_features=10, bias=True)` |
| `5` | `single_3` | `Linear` | `True` | instance created via `nn.Linear(10, 1)` in line 31 | `Linear(in_features=10, out_features=1, bias=True)` |
**`parent` & `childs`**

To simplify the display, the table below shows only the node ids of each node's `parent` and `childs`. In actual usage, a node's `parent` is `None` or an instance of `torchmeter.engine.OperationNode`, while `childs` is an `OrderedDict` with node ids as keys and the node instances as values.
| node_id | name | type | parent | childs |
|---|---|---|---|---|
| `0` | `SimpleModel` | `SimpleModel` | `None` | `1` ~ `5` |
| `1` | `single_1` | `Linear` | `0` | |
| `2` | `repeat_1x2` | `Sequential` | `0` | `2.1`, `2.2` |
| `2.1` | `A` | `Linear` | `2` | |
| `2.2` | `B` | `Linear` | `2` | |
| `3` | `single_2` | `ReLU` | `0` | |
| `4` | `repeat_2x3` | `Sequential` | `0` | `4.1` ~ `4.9` |
| `4.1` | `C` | `Linear` | `4` | |
| `4.2` | `D` | `ReLU` | `4` | |
| `4.3` | `E` | `Linear` | `4` | |
| `4.4` | `F` | `Linear` | `4` | |
| `4.5` | `G` | `ReLU` | `4` | |
| `4.6` | `H` | `Linear` | `4` | |
| `4.7` | `I` | `Linear` | `4` | |
| `4.8` | `J` | `ReLU` | `4` | |
| `4.9` | `K` | `Linear` | `4` | |
| `5` | `single_3` | `Linear` | `0` | |
**`repeat_winsz` & `repeat_time`**

| node_id | name | type | repeat_winsz | repeat_time | explanation |
|---|---|---|---|---|---|
| `0` | `SimpleModel` | `SimpleModel` | 1 | 1 | no repetition |
| `1` | `single_1` | `Linear` | 1 | 1 | no repetition |
| `2` | `repeat_1x2` | `Sequential` | 1 | 1 | no repetition |
| `2.1` | `A` | `Linear` | 1 | 2 | The repeating window covers `2.1` and `2.2`. The two layers have the same definition, so one module can be regarded as being repeated twice. |
| `2.2` | `B` | `Linear` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `3` | `single_2` | `ReLU` | 1 | 1 | no repetition |
| `4` | `repeat_2x3` | `Sequential` | 1 | 1 | no repetition |
| `4.1` | `C` | `Linear` | 3 | 3 | The repeating window takes `4.1` ~ `4.3` as a whole and covers `4.1` ~ `4.9`. |
| `4.2` | `D` | `ReLU` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.3` | `E` | `Linear` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.4` | `F` | `Linear` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.5` | `G` | `ReLU` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.6` | `H` | `Linear` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.7` | `I` | `Linear` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.8` | `J` | `ReLU` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `4.9` | `K` | `Linear` | 1 | 1 | Already included in a repeating window, so the repetitiveness analysis is skipped and the default values are used. |
| `5` | `single_3` | `Linear` | 1 | 1 | no repetition |
In summary, the available attributes of a tree node are:

| Attribute | Type | Explanation |
|---|---|---|
| `operation` | `torch.nn.Module` | The underlying PyTorch module |
| `type` | `str` | The operation type. If the operation is a PyTorch module, this is the name of its class |
| `name` | `str` | The module name defined in the underlying PyTorch model |
| `node_id` | `str` | A globally unique module identifier, formatted as `<parent-node-identifier>.<current-level-index>`. The index starts from `1`, since the root is denoted as `0` |
| `is_leaf` | `bool` | Whether the node is a leaf node (i.e. has no child nodes) |
| `module_repr` | `str` | The text representation of the current operation. For non-leaf nodes it is the node type; for leaf nodes it is the return value of the module's `__repr__()` method |
| `parent` | `torchmeter.engine.OperationNode` | The parent node of this node. Each node has only one parent |
| `childs` | `OrderedDict[str, OperationNode]` | An `OrderedDict` storing the children of this node in feed-forward order. Keys are the children's `node_id`s, values are the child nodes themselves |
| `repeat_winsz` | `int` | The size of the repeating window for the current node. Default is `1`, meaning no repetition (the window contains only the node itself) |
| `repeat_time` | `int` | The number of repetitions of the window the current module belongs to. Default is `1`, meaning no repetition |
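Putting these attributes together, a depth-first walk over the tree needs nothing beyond the fields listed above. This is only a sketch; how you obtain the root `OperationNode` depends on your entry point, so it is assumed here that you already hold it as `root`:

```python
from torchmeter.engine import OperationNode

def walk(node: OperationNode, indent: int = 0) -> None:
    # Print "(node_id) name type", mirroring the default tree label,
    # then recurse into the children in feed-forward order.
    print(" " * indent + f"({node.node_id}) {node.name} {node.type}")
    for child in node.childs.values():
        walk(child, indent + 4)

# `root` is assumed to be the root OperationNode of an already-scanned model:
# walk(root)
```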
### How to use the attributes of a tree node?

In the scenarios below, a tree node attribute can be used as a placeholder whose value is retrieved dynamically during the tree-rendering process.

**Global Configuration**

For the `<level-index>` notation used below, please refer to Tree Level Index.
| configuration | default value |
|---|---|
| `tree_repeat_block_args.title` | `'[i]Repeat [[b]<repeat_time>[/b]] Times[/]'` |
| `tree_levels_args.default.label` | `'[b gray35](<node_id>) [green]<name>[/green] [cyan]<type>[/]'` |
| `tree_levels_args.0.label` | `'[b light_coral]<name>[/]'` |
| `tree_levels_args.<level-index>.label` | same as `tree_levels_args.default.label` if not specified |
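For example, a custom label can combine several placeholders. The sketch below only reuses attributes from the table above; whether a given attribute renders usefully as part of a label is up to you:

```yaml
tree_levels_args:
  default:
    # <node_id>, <name> and <module_repr> are replaced with each node's
    # attribute values while the tree is rendered
    label: '[b](<node_id>)[/] <name>: <module_repr>'
```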
**Repeat Block Footer**
Please refer to Customize the footer for more details.
## Unit Explanation

There are four types of units in `torchmeter`, listed as follows. The `raw-data` tag in the tables below marks the unit that is used in raw-data mode.

**Used by `param`, `cal`**
| unit | explanation | tag | example |
|---|---|---|---|
| `null` | Number of subjects | `raw-data` | `5`: there are 5 semantic subjects |
| `K` | \(10^3\) | | `5 K`: there are 5,000 ... |
| `M` | \(10^6\) | | `5 M`: there are 5,000,000 ... |
| `G` | \(10^9\) | | `5 G`: there are 5,000,000,000 ... |
| `T` | \(10^{12}\) | | `5 T`: there are 5,000,000,000,000 ... |
**Used by `mem`**

| unit | explanation | tag | example |
|---|---|---|---|
| `B` | \(2^0 = 1\) byte | `raw-data` | `5 B`: \(5 \times 1 = 5\) bytes |
| `KiB` | \(2^{10} = 1024\) bytes | | `5 KiB`: \(5 \times 2^{10} = 5120\) bytes |
| `MiB` | \(2^{20}\) bytes | | `5 MiB`: \(5 \times 2^{20} = 5242880\) bytes |
| `GiB` | \(2^{30}\) bytes | | `5 GiB`: \(5 \times 2^{30} = 5368709120\) bytes |
| `TiB` | \(2^{40}\) bytes | | `5 TiB`: \(5 \times 2^{40} = 5497558138880\) bytes |
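The scaling above is plain powers-of-1024 arithmetic. A small illustrative helper (not torchmeter's implementation) that maps a raw byte count onto these binary units:

```python
def format_bytes(n_bytes: float) -> str:
    # Divide by 1024 until the value drops below 1024,
    # matching the binary units in the table above.
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n_bytes < 1024:
            return f"{n_bytes:.2f} {unit}"
        n_bytes /= 1024
    return f"{n_bytes:.2f} TiB"

print(format_bytes(5 * 2**20))  # "5.00 MiB"
```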
**Used by `ittp` (inference time)**

| unit | explanation | tag | example |
|---|---|---|---|
| `ns` | nanosecond | | `5 ns`: \(5 \times 10^{-9}\) seconds |
| `us` | microsecond | | `5 us`: \(5 \times 10^{-6}\) seconds |
| `ms` | millisecond | | `5 ms`: \(5 \times 10^{-3}\) seconds |
| `s` | second | `raw-data` | `5 s`: \(5 \times 10^{0}\) seconds |
| `min` | minute | | `5 min`: \(5 \times 60^{1}\) seconds |
| `h` | hour | | `5 h`: \(5 \times 60^{2}\) seconds |
**Used by `ittp` (throughput)**

| unit | explanation | tag | example |
|---|---|---|---|
| `IPS` | Inputs Per Second | `raw-data` | `5 IPS`: process 5 inputs per second |
| `KIPS` | \(10^3\) IPS | | `5 KIPS`: process 5,000 inputs per second |
| `MIPS` | \(10^6\) IPS | | `5 MIPS`: process 5,000,000 inputs per second |
| `GIPS` | \(10^9\) IPS | | `5 GIPS`: process 5,000,000,000 inputs per second |
| `TIPS` | \(10^{12}\) IPS | | `5 TIPS`: process 5,000,000,000,000 inputs per second |