Tutorials
# Installation

    pip install torchmeter
A. Wrap your model with Meter¶
from torchvision import models
from torchmeter import Meter
underlying_model = models.vgg19_bn()
model = Meter(underlying_model)
Finish Scanning model in 0.0146 seconds
B. Zero-Intrusion Proxy¶
Use the `Meter` instance just like the underlying model.
B.a Access Attrs/Methods of Underlying Model¶
# Context
# --------------------------------------------------------------------------------
# underlying_model: Your pytorch model
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
import random
underlying_items = random.sample(underlying_model.__dir__(), 3)
print(f"These attributes/methods are accessible in the underlying model: \n{underlying_items}")
for i in underlying_items:
    print(f"If `{i}` can be accessed through `Meter` instance ——", hasattr(model, i))
These attributes/methods are accessible in the underlying model: ['_forward_hooks_always_called', 'bfloat16', '__repr__']
If `_forward_hooks_always_called` can be accessed through `Meter` instance —— True
If `bfloat16` can be accessed through `Meter` instance —— True
If `__repr__` can be accessed through `Meter` instance —— True
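The zero-intrusion behavior above is what Python's `__getattr__` fallback makes possible. Below is a minimal illustrative sketch of the general delegation technique, using our own toy classes (`Proxy`, `Engine`) — not torchmeter's actual implementation:

```python
class Proxy:
    """Forward attribute lookups that fail on the proxy to a wrapped object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Invoked only when normal lookup on the proxy itself fails,
        # so the proxy's own attributes always shadow the wrapped object's.
        return getattr(self._wrapped, name)


class Engine:
    horsepower = 300


proxy = Proxy(Engine())
print(proxy.horsepower)              # delegated to the wrapped Engine -> 300
print(hasattr(proxy, "horsepower"))  # True
```

Because `__getattr__` fires only after normal lookup fails, attributes defined on the proxy itself always win — which is why name collisions (section B.c) need an escape prefix to reach the underlying object.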
B.b Access Attrs/Methods of Meter class¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(f"Model now on: {model.device}")
print(f"The number of benchmark iterations per operation in measuring `ittp` is {model.ittp_benchmark_time}")
Model now on: cpu
The number of benchmark iterations per operation in measuring `ittp` is 100
B.c Access Attrs/Methods Sharing Same Names¶
In this case, you can directly access the attrs/methods of the `Meter` instance by name.
To access those of the underlying model, add the prefix `ORIGIN_` to the name.
from torchvision import models
from torchmeter import Meter
underlying_model = models.vgg19_bn()
# Suppose your model happens to have an attribute named `ittp_warmup`,
# which conflicts with an attribute of the Meter class in terms of name.
underlying_model.ittp_warmup = 55
# Access the `ittp_warmup` attribute of the Meter class
model = Meter(underlying_model)
model.ittp_warmup = 66
print(f"The `ittp_warmup` attribute of the Meter class is {model.ittp_warmup}")
# Access the `ittp_warmup` attribute of the underlying model through `ORIGIN_` prefix
print(f"The `ittp_warmup` attribute of the underlying model is {model.ORIGIN_ittp_warmup}")
Finish Scanning model in 0.0159 seconds
The `ittp_warmup` attribute of the Meter class is 66
The `ittp_warmup` attribute of the underlying model is 55
C. Automatic Device Synchronization¶
- No need to worry about a device mismatch between the model and the input.
- Always get ready to perform a feed forward 🚀
import torch
from torchmeter import Meter
from torchvision import models
model = Meter(models.vgg19_bn())
input = torch.randn(1, 3, 224, 224)
# move to GPU if available
if torch.cuda.is_available():
    model.device = "cuda:0"
    print(f"The model now on: {model.device}, The input now on {input.device}")
    output = model(input)
else:
    print(f"The model now on: {model.device}, The input now on {input.device}")
    output = model(input)

print("Inference done !")
Finish Scanning model in 0.0121 seconds
The model now on: cuda:0, The input now on cpu
Inference done !
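For reference, the manual bookkeeping this feature removes looks roughly like the plain-PyTorch snippet below (the model and tensor names are our own):

```python
import torch
from torch import nn

net = nn.Linear(8, 4)
x = torch.randn(1, 8)

# Without automatic synchronization, you must move BOTH the model and
# every input tensor to the same device before each forward pass.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
net = net.to(device)
x = x.to(device)

out = net(x)
print(out.shape)  # torch.Size([1, 4])
```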
D. Model Structure Analysis¶
This feature will help you quickly understand the model architecture,
especially when there are a large number of repetitive structures.
D.a Enable Repeat Block Folding¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
from rich import print
# default value is True
model.tree_fold_repeat = True
print(model.structure)
VGG ├── (1) features Sequential │ ├── (1.1) 0 Conv2d │ ├── (1.2) 1 BatchNorm2d │ ├── (1.3) 2 ReLU │ ├── (1.4) 3 Conv2d │ ├── (1.5) 4 BatchNorm2d │ ├── (1.6) 5 ReLU │ ├── (1.7) 6 MaxPool2d │ ├── (1.8) 7 Conv2d │ ├── (1.9) 8 BatchNorm2d │ ├── (1.10) 9 ReLU │ ├── (1.11) 10 Conv2d │ ├── (1.12) 11 BatchNorm2d │ ├── (1.13) 12 ReLU │ ├── (1.14) 13 MaxPool2d │ ├── (1.15) 14 Conv2d │ ├── ┏━━━━ Repeat [3] Times ━━━━┓ │ │ ┃ (1.x) 15 BatchNorm2d ┃ │ │ ┃ (1.(x+1)) 16 ReLU ┃ │ │ ┃ (1.(x+2)) 17 Conv2d ┃ │ │ ┃ ------------------------ ┃ │ │ ┃ Where x = 16, 19, 22 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ ├── (1.25) 24 BatchNorm2d │ ├── (1.26) 25 ReLU │ ├── (1.27) 26 MaxPool2d │ ├── (1.28) 27 Conv2d │ ├── ┏━━━━ Repeat [3] Times ━━━━┓ │ │ ┃ (1.y) 28 BatchNorm2d ┃ │ │ ┃ (1.(y+1)) 29 ReLU ┃ │ │ ┃ (1.(y+2)) 30 Conv2d ┃ │ │ ┃ ------------------------ ┃ │ │ ┃ Where y = 29, 32, 35 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ ├── (1.38) 37 BatchNorm2d │ ├── (1.39) 38 ReLU │ ├── (1.40) 39 MaxPool2d │ ├── ┏━━━━━━ Repeat [2] Times ━━━━━━┓ │ │ ┃ (1.i) 40 Conv2d ┃ │ │ ┃ (1.(i+1)) 41 BatchNorm2d ┃ │ │ ┃ (1.(i+2)) 42 ReLU ┃ │ │ ┃ (1.(i+3)) 43 Conv2d ┃ │ │ ┃ (1.(i+4)) 44 BatchNorm2d ┃ │ │ ┃ (1.(i+5)) 45 ReLU ┃ │ │ ┃ ---------------------------- ┃ │ │ ┃ Where i = 41, 47 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ └── (1.53) 52 MaxPool2d ├── (2) avgpool AdaptiveAvgPool2d └── (3) classifier Sequential ├── (3.1) 0 Linear ├── (3.2) 1 ReLU ├── (3.3) 2 Dropout ├── (3.4) 3 Linear ├── (3.5) 4 ReLU ├── (3.6) 5 Dropout └── (3.7) 6 Linear
D.b Disable Repeat Block Folding¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
from rich import print
# If disabled, the output will directly reflect the model structure,
# which may be verbose and unclear when repetitive structures exist.
# If there are no repetitive structures in your model, the output is identical whether folding is enabled or not.
model.tree_fold_repeat = False
print(model.structure)
VGG ├── (1) features Sequential │ ├── (1.1) 0 Conv2d │ ├── (1.2) 1 BatchNorm2d │ ├── (1.3) 2 ReLU │ ├── (1.4) 3 Conv2d │ ├── (1.5) 4 BatchNorm2d │ ├── (1.6) 5 ReLU │ ├── (1.7) 6 MaxPool2d │ ├── (1.8) 7 Conv2d │ ├── (1.9) 8 BatchNorm2d │ ├── (1.10) 9 ReLU │ ├── (1.11) 10 Conv2d │ ├── (1.12) 11 BatchNorm2d │ ├── (1.13) 12 ReLU │ ├── (1.14) 13 MaxPool2d │ ├── (1.15) 14 Conv2d │ ├── (1.16) 15 BatchNorm2d │ ├── (1.17) 16 ReLU │ ├── (1.18) 17 Conv2d │ ├── (1.19) 18 BatchNorm2d │ ├── (1.20) 19 ReLU │ ├── (1.21) 20 Conv2d │ ├── (1.22) 21 BatchNorm2d │ ├── (1.23) 22 ReLU │ ├── (1.24) 23 Conv2d │ ├── (1.25) 24 BatchNorm2d │ ├── (1.26) 25 ReLU │ ├── (1.27) 26 MaxPool2d │ ├── (1.28) 27 Conv2d │ ├── (1.29) 28 BatchNorm2d │ ├── (1.30) 29 ReLU │ ├── (1.31) 30 Conv2d │ ├── (1.32) 31 BatchNorm2d │ ├── (1.33) 32 ReLU │ ├── (1.34) 33 Conv2d │ ├── (1.35) 34 BatchNorm2d │ ├── (1.36) 35 ReLU │ ├── (1.37) 36 Conv2d │ ├── (1.38) 37 BatchNorm2d │ ├── (1.39) 38 ReLU │ ├── (1.40) 39 MaxPool2d │ ├── (1.41) 40 Conv2d │ ├── (1.42) 41 BatchNorm2d │ ├── (1.43) 42 ReLU │ ├── (1.44) 43 Conv2d │ ├── (1.45) 44 BatchNorm2d │ ├── (1.46) 45 ReLU │ ├── (1.47) 46 Conv2d │ ├── (1.48) 47 BatchNorm2d │ ├── (1.49) 48 ReLU │ ├── (1.50) 49 Conv2d │ ├── (1.51) 50 BatchNorm2d │ ├── (1.52) 51 ReLU │ └── (1.53) 52 MaxPool2d ├── (2) avgpool AdaptiveAvgPool2d └── (3) classifier Sequential ├── (3.1) 0 Linear ├── (3.2) 1 ReLU ├── (3.3) 2 Dropout ├── (3.4) 3 Linear ├── (3.5) 4 ReLU ├── (3.6) 5 Dropout └── (3.7) 6 Linear
E. Full-Stack Model Analytics¶
TorchMeter gives you two ways to quantify your model's performance:
- Overall Report: A quick summary of specific statistics.
- Layer-wise Profile: A detailed operation-wise tabular report of specific statistics.
# To better show the feature of measuring total/learnable parameter numbers,
# we assume that the `features` part of the underlying model is frozen.
_ = model.features.requires_grad_(False)
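Freezing a block with `requires_grad_(False)` is what separates "Total" from "Learnable" in the reports below. A small plain-PyTorch sketch of the distinction, using a toy two-layer stand-in model of our own:

```python
from torch import nn

# Freeze the first block, keep the head trainable,
# mirroring the frozen-`features` setup above.
body = nn.Linear(10, 20)   # 10*20 + 20 = 220 params
head = nn.Linear(20, 5)    # 20*5  + 5  = 105 params
model = nn.Sequential(body, head)
_ = body.requires_grad_(False)

total = sum(p.numel() for p in model.parameters())
learnable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, learnable)  # 325 105
```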
E.a Model State¶
Provide an inspection of your model's basic information, including:
- Model type
- Device the model is on
- Feed-forward input
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(model.model_info)
• Model    : VGG
• Device   : cuda:0
• Signature: forward(self, x)
• Input    :
    x = Shape([1, 3, 224, 224]) <Tensor>
E.b Overall Report¶
Provide a comprehensive report on the overall performance of the model, including all the statistics:
- Model state
- Parameter volume
- Calculation burden
- Memory usage
- Inference time
- Throughput
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(model.overview())
Warming Up: 100%|██████████| 50/50 [00:00<00:00, 330.15it/s] Benchmark Inference Time & Throughput: 100%|██████████| 6400/6400 [00:00<00:00, 6582.64module/s]
────────────── Model INFO ─────────────── ───────────── Param INFO ───────────── • Model : VGG • Statistics: param • Device : cuda:0 • Learnable Parameters Num: 123.64 M • Signature: forward(self, x) • Total Parameters Num : 143.68 M • Input : ────────────────────────────────────── x = Shape([1, 3, 224, 224]) <Tensor> ───────────────────────────────────────── ────────────────────────── Cal INFO ────────────────────────── ────────────────── Mem INFO ─────────────────── • Statistics: cal • Statistics: mem • FLOPs : 39.34 G • Parameters Memory Cost: 548.09 MiB, 82.12 % • MACs(aka MACC, MADD): 19.68 G • Buffers Memory Cost : 43.12 KiB, 0.01 % ⚠ Warning: the result may be inaccurate, cause: • FeatureMap Memory Cost: 119.31 MiB, 17.88 % ▶ Some modules don't support calculation measurement yet. • Total Memory Cost : 667.44 MiB ☑ use `Meter(your_model).profile('cal')` to see more. ─────────────────────────────────────────────── ────────────────────────────────────────────────────────────── ──────────────── Ittp INFO ──────────────── • Statistics: ittp • Benchmark Times: 100 • Inference Elapse: 2.05 ms ± 19.06 us • Throughput : 488.77 IPS ± 4.55 IPS ───────────────────────────────────────────
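The `ittp` figures above come from repeated timed inferences after a warm-up phase. A generic sketch of that benchmarking pattern (our own code with a toy model, not torchmeter's internals):

```python
import statistics
import time

import torch
from torch import nn

net = nn.Linear(64, 64).eval()
x = torch.randn(1, 64)

with torch.no_grad():
    for _ in range(10):          # warm-up: exclude one-time costs (allocations, caching)
        net(x)

    elapsed = []
    for _ in range(100):         # benchmark iterations
        t0 = time.perf_counter()
        net(x)
        elapsed.append(time.perf_counter() - t0)

mean_s = statistics.mean(elapsed)
throughput_ips = 1.0 / mean_s    # inferences per second
print(f"{mean_s * 1e6:.2f} us ± {statistics.stdev(elapsed) * 1e6:.2f} us, {throughput_ips:.0f} IPS")
```

On CUDA you would additionally synchronize (`torch.cuda.synchronize()`) around each timing, since GPU kernels launch asynchronously.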
E.c Layer-wise Profile¶
Provide a layer-wise, rich-text, detailed tabular report for each statistic.
# This block disables the interval output to adapt to Jupyter Notebook output limits.
# In daily use, you do not need to do this unless you want to disable the output animation.
# This section uses the global configuration, which will be discussed in section H of this tutorial.
from torchmeter import get_config
cfg = get_config()
cfg.restore()
## Disable interval output to adapt to Jupyter Notebook
cfg.render_interval = 0
E.c.1 Parameter Analysis¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Overall Report ", "="*10)
# Total/trainable parameter quantification
print(model.param)
========== Overall Report ==========
Params_INFO
• Operation_Id     = 0
• Operation_Name   = VGG
• Operation_Type   = VGG
• Total_Params     = 143678248.00 = 143.68 M
• Learnable_Params = 123642856.00 = 123.64 M
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Layer-wise Profile ", "="*10)
# Layer-wise parameter distribution analysis
# Note that the data for each layer covers only that layer itself and does not include its sub-layers ❗❗❗
tb, data = model.profile('param', no_tree=True)
========== Layer-wise Profile ==========
╭──────────────┬────────────────┬───────────────────┬────────────┬───────────────┬─────────────╮ │ Operation_Id │ Operation_Name │ Operation_Type │ Param_Name │ Requires_Grad │ Numeric_Num │ ├──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼─────────────┤ │ 1 │ features │ Sequential │ - │ - │ 0.00 │ │ 1.1 │ 0 │ Conv2d │ weight │ False │ 1.73 K │ │ 1.1 │ 0 │ Conv2d │ bias │ False │ 64.00 │ │ 1.2 │ 1 │ BatchNorm2d │ weight │ False │ 64.00 │ │ 1.2 │ 1 │ BatchNorm2d │ bias │ False │ 64.00 │ │ 1.3 │ 2 │ ReLU │ - │ - │ 0.00 │ │ 1.4 │ 3 │ Conv2d │ weight │ False │ 36.86 K │ │ 1.4 │ 3 │ Conv2d │ bias │ False │ 64.00 │ │ 1.5 │ 4 │ BatchNorm2d │ weight │ False │ 64.00 │ │ 1.5 │ 4 │ BatchNorm2d │ bias │ False │ 64.00 │ │ 1.6 │ 5 │ ReLU │ - │ - │ 0.00 │ │ 1.7 │ 6 │ MaxPool2d │ - │ - │ 0.00 │ │ 1.8 │ 7 │ Conv2d │ weight │ False │ 73.73 K │ │ 1.8 │ 7 │ Conv2d │ bias │ False │ 128.00 │ │ 1.9 │ 8 │ BatchNorm2d │ weight │ False │ 128.00 │ │ 1.9 │ 8 │ BatchNorm2d │ bias │ False │ 128.00 │ │ 1.10 │ 9 │ ReLU │ - │ - │ 0.00 │ │ 1.11 │ 10 │ Conv2d │ weight │ False │ 147.46 K │ │ 1.11 │ 10 │ Conv2d │ bias │ False │ 128.00 │ │ 1.12 │ 11 │ BatchNorm2d │ weight │ False │ 128.00 │ │ 1.12 │ 11 │ BatchNorm2d │ bias │ False │ 128.00 │ │ 1.13 │ 12 │ ReLU │ - │ - │ 0.00 │ │ 1.14 │ 13 │ MaxPool2d │ - │ - │ 0.00 │ │ 1.15 │ 14 │ Conv2d │ weight │ False │ 294.91 K │ │ 1.15 │ 14 │ Conv2d │ bias │ False │ 256.00 │ │ 1.16 │ 15 │ BatchNorm2d │ weight │ False │ 256.00 │ │ 1.16 │ 15 │ BatchNorm2d │ bias │ False │ 256.00 │ │ 1.17 │ 16 │ ReLU │ - │ - │ 0.00 │ │ 1.18 │ 17 │ Conv2d │ weight │ False │ 589.82 K │ │ 1.18 │ 17 │ Conv2d │ bias │ False │ 256.00 │ │ 1.19 │ 18 │ BatchNorm2d │ weight │ False │ 256.00 │ │ 1.19 │ 18 │ BatchNorm2d │ bias │ False │ 256.00 │ │ 1.20 │ 19 │ ReLU │ - │ - │ 0.00 │ │ 1.21 │ 20 │ Conv2d │ weight │ False │ 589.82 K │ │ 1.21 │ 20 │ Conv2d │ bias │ False │ 256.00 │ │ 1.22 │ 21 │ BatchNorm2d │ weight │ False │ 256.00 │ │ 1.22 │ 21 │ BatchNorm2d │ bias 
│ False │ 256.00 │ │ 1.23 │ 22 │ ReLU │ - │ - │ 0.00 │ │ 1.24 │ 23 │ Conv2d │ weight │ False │ 589.82 K │ │ 1.24 │ 23 │ Conv2d │ bias │ False │ 256.00 │ │ 1.25 │ 24 │ BatchNorm2d │ weight │ False │ 256.00 │ │ 1.25 │ 24 │ BatchNorm2d │ bias │ False │ 256.00 │ │ 1.26 │ 25 │ ReLU │ - │ - │ 0.00 │ │ 1.27 │ 26 │ MaxPool2d │ - │ - │ 0.00 │ │ 1.28 │ 27 │ Conv2d │ weight │ False │ 1.18 M │ │ 1.28 │ 27 │ Conv2d │ bias │ False │ 512.00 │ │ 1.29 │ 28 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.29 │ 28 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.30 │ 29 │ ReLU │ - │ - │ 0.00 │ │ 1.31 │ 30 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.31 │ 30 │ Conv2d │ bias │ False │ 512.00 │ │ 1.32 │ 31 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.32 │ 31 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.33 │ 32 │ ReLU │ - │ - │ 0.00 │ │ 1.34 │ 33 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.34 │ 33 │ Conv2d │ bias │ False │ 512.00 │ │ 1.35 │ 34 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.35 │ 34 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.36 │ 35 │ ReLU │ - │ - │ 0.00 │ │ 1.37 │ 36 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.37 │ 36 │ Conv2d │ bias │ False │ 512.00 │ │ 1.38 │ 37 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.38 │ 37 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.39 │ 38 │ ReLU │ - │ - │ 0.00 │ │ 1.40 │ 39 │ MaxPool2d │ - │ - │ 0.00 │ │ 1.41 │ 40 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.41 │ 40 │ Conv2d │ bias │ False │ 512.00 │ │ 1.42 │ 41 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.42 │ 41 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.43 │ 42 │ ReLU │ - │ - │ 0.00 │ │ 1.44 │ 43 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.44 │ 43 │ Conv2d │ bias │ False │ 512.00 │ │ 1.45 │ 44 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.45 │ 44 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.46 │ 45 │ ReLU │ - │ - │ 0.00 │ │ 1.47 │ 46 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.47 │ 46 │ Conv2d │ bias │ False │ 512.00 │ │ 1.48 │ 47 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.48 │ 47 │ BatchNorm2d │ bias │ False │ 
512.00 │ │ 1.49 │ 48 │ ReLU │ - │ - │ 0.00 │ │ 1.50 │ 49 │ Conv2d │ weight │ False │ 2.36 M │ │ 1.50 │ 49 │ Conv2d │ bias │ False │ 512.00 │ │ 1.51 │ 50 │ BatchNorm2d │ weight │ False │ 512.00 │ │ 1.51 │ 50 │ BatchNorm2d │ bias │ False │ 512.00 │ │ 1.52 │ 51 │ ReLU │ - │ - │ 0.00 │ │ 1.53 │ 52 │ MaxPool2d │ - │ - │ 0.00 │ │ 2 │ avgpool │ AdaptiveAvgPool2d │ - │ - │ 0.00 │ │ 3 │ classifier │ Sequential │ - │ - │ 0.00 │ │ 3.1 │ 0 │ Linear │ weight │ True │ 102.76 M │ │ 3.1 │ 0 │ Linear │ bias │ True │ 4.10 K │ │ 3.2 │ 1 │ ReLU │ - │ - │ 0.00 │ │ 3.3 │ 2 │ Dropout │ - │ - │ 0.00 │ │ 3.4 │ 3 │ Linear │ weight │ True │ 16.78 M │ │ 3.4 │ 3 │ Linear │ bias │ True │ 4.10 K │ │ 3.5 │ 4 │ ReLU │ - │ - │ 0.00 │ │ 3.6 │ 5 │ Dropout │ - │ - │ 0.00 │ │ 3.7 │ 6 │ Linear │ weight │ True │ 4.10 M │ │ 3.7 │ 6 │ Linear │ bias │ True │ 1 K │ ╰──────────────┴────────────────┴───────────────────┴────────────┴───────────────┴─────────────╯ ---------------------------------------- s u m m a r y ----------------------------------------- • Model : VGG • Statistics: param • Device : cuda:0 • Learnable Parameters Num: 123.64 M • Signature: forward(self, x) • Total Parameters Num : 143.68 M • Input : x = Shape([1, 3, 224, 224]) <Tensor>
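You can sanity-check rows of this table by hand. For instance, the first classifier `Linear` (row 3.1) maps 25088 features to 4096:

```python
# nn.Linear(in_features, out_features) stores an (out, in) weight matrix plus a bias vector.
in_features, out_features = 25088, 4096

weight_params = out_features * in_features  # 102,760,448 ≈ 102.76 M (row 3.1, weight)
bias_params = out_features                  # 4,096       ≈ 4.10 K   (row 3.1, bias)
print(weight_params, bias_params)
```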
E.c.2 Computational Profiling¶
❗❗❗ You need to run at least one feed-forward pass before measuring the computational cost ❗❗❗
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# run a feed-forward pass
# using a batch size of 1 is highly recommended, so the result is comparable across models
import torch
input = torch.randn(1, 3, 224, 224)
output = model(input)
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Overall Report ", "="*10)
# FLOPs/MACs measurement
print(model.cal)
========== Overall Report ==========
Calculation_INFO
• Operation_Id   = 0
• Operation_Type = VGG
• Operation_Name = VGG
• MACs           = 19681218048.00 = 19.68 G
• FLOPs          = 39342984704.00 = 39.34 G
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Layer-wise Profile ", "="*10)
# Operation-wise calculation distribution analysis
# Unlike `param`, the value of each operation includes its sub-operations.
tb, data = model.profile('cal', no_tree=True)
========== Layer-wise Profile ==========
╭────────────┬────────────┬────────────┬────────────┬──────┬────────────┬────────────┬─────────────┬────────────╮ │ Operation_ │ Operation_ │ Operation_ │ Kernel_Siz │ │ │ │ │ │ │ Id │ Name │ Type │ e │ Bias │ Input │ Output │ MACs │ FLOPs │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1 │ features │ Sequential │ - │ - │ [1, 3, │ [1, 512, │ 19.56 G │ 39.10 G │ │ │ │ │ │ │ 224, 224] │ 7, 7] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.1 │ 0 │ Conv2d │ [3, 3] │ True │ [1, 3, │ [1, 64, │ 86.70 M │ 173.41 M │ │ │ │ │ │ │ 224, 224] │ 224, 224] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.2 │ 1 │ BatchNorm2 │ - │ - │ [1, 64, │ [1, 64, │ 6.42 M │ 12.85 M │ │ │ │ d │ │ │ 224, 224] │ 224, 224] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.3 │ 2 │ ReLU │ - │ - │ [1, 64, │ [1, 64, │ 3.21 M │ 3.21 M │ │ │ │ │ │ │ 224, 224] │ 224, 224] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.4 │ 3 │ Conv2d │ [3, 3] │ True │ [1, 64, │ [1, 64, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 224, 224] │ 224, 224] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.5 │ 4 │ BatchNorm2 │ - │ - │ [1, 64, │ [1, 64, │ 6.42 M │ 12.85 M │ │ │ │ d │ │ │ 224, 224] │ 224, 224] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.6 │ 5 │ ReLU │ - │ - │ [1, 64, │ [1, 64, │ 3.21 M │ 3.21 M │ │ │ │ │ │ │ 224, 224] │ 224, 224] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.7 │ 6 │ MaxPool2d │ [2, 2] │ - │ [1, 64, │ [1, 64, │ 2.41 
M │ 2.41 M │ │ │ │ │ │ │ 224, 224] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.8 │ 7 │ Conv2d │ [3, 3] │ True │ [1, 64, │ [1, 128, │ 924.84 M │ 1.85 G │ │ │ │ │ │ │ 112, 112] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.9 │ 8 │ BatchNorm2 │ - │ - │ [1, 128, │ [1, 128, │ 3.21 M │ 6.42 M │ │ │ │ d │ │ │ 112, 112] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.10 │ 9 │ ReLU │ - │ - │ [1, 128, │ [1, 128, │ 1.61 M │ 1.61 M │ │ │ │ │ │ │ 112, 112] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.11 │ 10 │ Conv2d │ [3, 3] │ True │ [1, 128, │ [1, 128, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 112, 112] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.12 │ 11 │ BatchNorm2 │ - │ - │ [1, 128, │ [1, 128, │ 3.21 M │ 6.42 M │ │ │ │ d │ │ │ 112, 112] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.13 │ 12 │ ReLU │ - │ - │ [1, 128, │ [1, 128, │ 1.61 M │ 1.61 M │ │ │ │ │ │ │ 112, 112] │ 112, 112] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.14 │ 13 │ MaxPool2d │ [2, 2] │ - │ [1, 128, │ [1, 128, │ 1.20 M │ 1.20 M │ │ │ │ │ │ │ 112, 112] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.15 │ 14 │ Conv2d │ [3, 3] │ True │ [1, 128, │ [1, 256, │ 924.84 M │ 1.85 G │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ 
│ 1.16 │ 15 │ BatchNorm2 │ - │ - │ [1, 256, │ [1, 256, │ 1.61 M │ 3.21 M │ │ │ │ d │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.17 │ 16 │ ReLU │ - │ - │ [1, 256, │ [1, 256, │ 802.82 K │ 802.82 K │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.18 │ 17 │ Conv2d │ [3, 3] │ True │ [1, 256, │ [1, 256, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.19 │ 18 │ BatchNorm2 │ - │ - │ [1, 256, │ [1, 256, │ 1.61 M │ 3.21 M │ │ │ │ d │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.20 │ 19 │ ReLU │ - │ - │ [1, 256, │ [1, 256, │ 802.82 K │ 802.82 K │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.21 │ 20 │ Conv2d │ [3, 3] │ True │ [1, 256, │ [1, 256, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.22 │ 21 │ BatchNorm2 │ - │ - │ [1, 256, │ [1, 256, │ 1.61 M │ 3.21 M │ │ │ │ d │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.23 │ 22 │ ReLU │ - │ - │ [1, 256, │ [1, 256, │ 802.82 K │ 802.82 K │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.24 │ 23 │ Conv2d │ [3, 3] │ True │ [1, 256, │ [1, 256, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ 
├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.25 │ 24 │ BatchNorm2 │ - │ - │ [1, 256, │ [1, 256, │ 1.61 M │ 3.21 M │ │ │ │ d │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.26 │ 25 │ ReLU │ - │ - │ [1, 256, │ [1, 256, │ 802.82 K │ 802.82 K │ │ │ │ │ │ │ 56, 56] │ 56, 56] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.27 │ 26 │ MaxPool2d │ [2, 2] │ - │ [1, 256, │ [1, 256, │ 602.11 K │ 602.11 K │ │ │ │ │ │ │ 56, 56] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.28 │ 27 │ Conv2d │ [3, 3] │ True │ [1, 256, │ [1, 512, │ 924.84 M │ 1.85 G │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.29 │ 28 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 802.82 K │ 1.61 M │ │ │ │ d │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.30 │ 29 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 401.41 K │ 401.41 K │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.31 │ 30 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.32 │ 31 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 802.82 K │ 1.61 M │ │ │ │ d │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.33 │ 32 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 401.41 K │ 
401.41 K │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.34 │ 33 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.35 │ 34 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 802.82 K │ 1.61 M │ │ │ │ d │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.36 │ 35 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 401.41 K │ 401.41 K │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.37 │ 36 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 1.85 G │ 3.70 G │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.38 │ 37 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 802.82 K │ 1.61 M │ │ │ │ d │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.39 │ 38 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 401.41 K │ 401.41 K │ │ │ │ │ │ │ 28, 28] │ 28, 28] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.40 │ 39 │ MaxPool2d │ [2, 2] │ - │ [1, 512, │ [1, 512, │ 301.06 K │ 301.06 K │ │ │ │ │ │ │ 28, 28] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.41 │ 40 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 462.42 M │ 924.84 M │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.42 │ 
41 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 200.70 K │ 401.41 K │ │ │ │ d │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.43 │ 42 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 100.35 K │ 100.35 K │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.44 │ 43 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 462.42 M │ 924.84 M │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.45 │ 44 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 200.70 K │ 401.41 K │ │ │ │ d │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.46 │ 45 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 100.35 K │ 100.35 K │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.47 │ 46 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 462.42 M │ 924.84 M │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.48 │ 47 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 200.70 K │ 401.41 K │ │ │ │ d │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.49 │ 48 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 100.35 K │ 100.35 K │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.50 │ 49 │ Conv2d │ [3, 3] │ True │ [1, 512, │ [1, 512, │ 462.42 M │ 924.84 M │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ 
├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.51 │ 50 │ BatchNorm2 │ - │ - │ [1, 512, │ [1, 512, │ 200.70 K │ 401.41 K │ │ │ │ d │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.52 │ 51 │ ReLU │ - │ - │ [1, 512, │ [1, 512, │ 100.35 K │ 100.35 K │ │ │ │ │ │ │ 14, 14] │ 14, 14] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 1.53 │ 52 │ MaxPool2d │ [2, 2] │ - │ [1, 512, │ [1, 512, │ 75.26 K │ 75.26 K │ │ │ │ │ │ │ 14, 14] │ 7, 7] │ │ │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 2 │ avgpool │ AdaptiveAv │ - │ - │ [1, 512, │ [1, 512, │ Not │ Not │ │ │ │ gPool2d │ │ │ 7, 7] │ 7, 7] │ Supported │ Supported │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3 │ classifier │ Sequential │ - │ - │ [1, 25088] │ [1, 1000] │ 123.64 M │ 247.28 M │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.1 │ 0 │ Linear │ - │ True │ [1, 25088] │ [1, 4096] │ 102.76 M │ 205.52 M │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.2 │ 1 │ ReLU │ - │ - │ [1, 4096] │ [1, 4096] │ 4.10 K │ 4.10 K │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.3 │ 2 │ Dropout │ - │ - │ [1, 4096] │ [1, 4096] │ Not │ Not │ │ │ │ │ │ │ │ │ Supported │ Supported │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.4 │ 3 │ Linear │ - │ True │ [1, 4096] │ [1, 4096] │ 16.78 M │ 33.55 M │ 
├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.5 │ 4 │ ReLU │ - │ - │ [1, 4096] │ [1, 4096] │ 4.10 K │ 4.10 K │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.6 │ 5 │ Dropout │ - │ - │ [1, 4096] │ [1, 4096] │ Not │ Not │ │ │ │ │ │ │ │ │ Supported │ Supported │ ├────────────┼────────────┼────────────┼────────────┼──────┼────────────┼────────────┼─────────────┼────────────┤ │ 3.7 │ 6 │ Linear │ - │ True │ [1, 4096] │ [1, 1000] │ 4.10 M │ 8.19 M │ ╰────────────┴────────────┴────────────┴────────────┴──────┴────────────┴────────────┴─────────────┴────────────╯ ------------------------------------------------- s u m m a r y ------------------------------------------------- • Model : VGG • Statistics: cal • Device : cuda:0 • FLOPs : 39.34 G • Signature: forward(self, x) • MACs(aka MACC, MADD): 19.68 G • Input : x = Shape([1, 3, 224, 224]) <Tensor>
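The conv rows can likewise be reproduced from first principles. For row 1.1 (`Conv2d`, 3 → 64 channels, 3×3 kernel, 224×224 output), counting one MAC per kernel element per input channel, per output pixel, per output channel:

```python
k_h, k_w = 3, 3        # kernel size
c_in, c_out = 3, 64    # input / output channels
h_out, w_out = 224, 224

macs = h_out * w_out * c_out * (k_h * k_w * c_in)
flops = 2 * macs  # each MAC counts as one multiply plus one add
print(f"{macs / 1e6:.2f} M MACs, {flops / 1e6:.2f} M FLOPs")  # 86.70 M MACs, 173.41 M FLOPs
```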
E.c.3 Memory Diagnostics¶
❗❗❗ You need to run at least one feed-forward pass before measuring the memory usage ❗❗❗
# Run a feed-forward pass here if you have not done so yet.
# We already did one when measuring calculation, so there is no need to do it again
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Overall Report ", "="*10)
print(model.mem)
========== Overall Report ==========
Memory_INFO • Operation_Id = 0 • Operation_Type = VGG • Operation_Name = VGG • Param_Cost = 574712992.00 = 548.09 MiB • Buffer_Cost = 44160.00 = 43.12 KiB • Output_Cost = 125108128.00 = 119.31 MiB • Total = 699865280.00 = 667.44 MiB
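The Total in the overall report is simply the sum of the three components. A quick check in plain Python, using the byte values from the report above:

```python
# Byte values taken from the overall report above
param_cost = 574_712_992    # parameters
buffer_cost = 44_160        # buffers (e.g. BatchNorm running statistics)
output_cost = 125_108_128   # feature maps

total = param_cost + buffer_cost + output_cost
print(total)                     # 699865280 bytes
print(round(total / 2**20, 2))   # 667.44 (MiB)
```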
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Layer-wise Profile ", "="*10)
# Hierarchical memory consumption analysis
# As with `cal`, the value of each operation takes its sub-operations into account
tb, data = model.profile('mem', no_tree=True)
========== Layer-wise Profile ==========
╭──────────────┬────────────────┬───────────────────┬────────────┬─────────────┬─────────────┬────────────╮ │ Operation_Id │ Operation_Name │ Operation_Type │ Param_Cost │ Buffer_Cost │ Output_Cost │ Total │ ├──────────────┼────────────────┼───────────────────┼────────────┼─────────────┼─────────────┼────────────┤ │ 1 │ features │ Sequential │ 76.43 MiB │ 43.12 KiB │ 119.15 MiB │ 195.62 MiB │ │ 1.1 │ 0 │ Conv2d │ 7 KiB │ - │ 12.25 MiB │ 12.26 MiB │ │ 1.2 │ 1 │ BatchNorm2d │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.3 │ 2 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.4 │ 3 │ Conv2d │ 144.25 KiB │ - │ 12.25 MiB │ 12.39 MiB │ │ 1.5 │ 4 │ BatchNorm2d │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.6 │ 5 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.7 │ 6 │ MaxPool2d │ - │ - │ 3.06 MiB │ 3.06 MiB │ │ 1.8 │ 7 │ Conv2d │ 288.50 KiB │ - │ 6.12 MiB │ 6.41 MiB │ │ 1.9 │ 8 │ BatchNorm2d │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.10 │ 9 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.11 │ 10 │ Conv2d │ 576.50 KiB │ - │ 6.12 MiB │ 6.69 MiB │ │ 1.12 │ 11 │ BatchNorm2d │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.13 │ 12 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.14 │ 13 │ MaxPool2d │ - │ - │ 1.53 MiB │ 1.53 MiB │ │ 1.15 │ 14 │ Conv2d │ 1.13 MiB │ - │ 3.06 MiB │ 4.19 MiB │ │ 1.16 │ 15 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.17 │ 16 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.18 │ 17 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.19 │ 18 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.20 │ 19 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.21 │ 20 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.22 │ 21 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.23 │ 22 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.24 │ 23 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.25 │ 24 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.26 │ 25 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.27 │ 26 │ MaxPool2d │ - │ - │ 784 KiB │ 784 KiB │ │ 1.28 │ 27 │ Conv2d │ 4.50 MiB │ - │ 1.53 MiB │ 
6.03 MiB │ │ 1.29 │ 28 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.30 │ 29 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.31 │ 30 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.32 │ 31 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.33 │ 32 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.34 │ 33 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.35 │ 34 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.36 │ 35 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.37 │ 36 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.38 │ 37 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.39 │ 38 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.40 │ 39 │ MaxPool2d │ - │ - │ 392 KiB │ 392 KiB │ │ 1.41 │ 40 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.42 │ 41 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.43 │ 42 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.44 │ 43 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.45 │ 44 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.46 │ 45 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.47 │ 46 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.48 │ 47 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.49 │ 48 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.50 │ 49 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.51 │ 50 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.52 │ 51 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.53 │ 52 │ MaxPool2d │ - │ - │ 98 KiB │ 98 KiB │ │ 2 │ avgpool │ AdaptiveAvgPool2d │ - │ - │ 98 KiB │ 98 KiB │ │ 3 │ classifier │ Sequential │ 471.66 MiB │ 0.00 │ 67.91 KiB │ 471.73 MiB │ │ 3.1 │ 0 │ Linear │ 392.02 MiB │ - │ 16 KiB │ 392.03 MiB │ │ 3.2 │ 1 │ ReLU(inplace) │ - │ - │ - │ - │ │ 3.3 │ 2 │ Dropout │ - │ - │ 16 KiB │ 16 KiB │ │ 3.4 │ 3 │ Linear │ 64.02 MiB │ - │ 16 KiB │ 64.03 MiB │ │ 3.5 │ 4 │ ReLU(inplace) │ - │ - │ - │ - │ │ 3.6 │ 5 │ Dropout │ - │ - │ 16 KiB │ 16 KiB │ │ 3.7 │ 6 │ Linear │ 15.63 MiB │ - │ 3.91 KiB │ 15.63 MiB │ 
╰──────────────┴────────────────┴───────────────────┴────────────┴─────────────┴─────────────┴────────────╯ ---------------------------------------------- s u m m a r y ---------------------------------------------- • Model : VGG • Statistics: mem • Device : cuda:0 • Parameters Memory Cost: 548.09 MiB, 82.12 % • Signature: forward(self, x) • Buffers Memory Cost : 43.12 KiB, 0.01 % • Input : • FeatureMap Memory Cost: 119.31 MiB, 17.88 % x = Shape([1, 3, 224, 224]) <Tensor> • Total Memory Cost : 667.44 MiB
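The parameter memory cost can also be derived by hand. Assuming the default float32 weights (4 bytes per parameter), multiplying vgg19_bn's 143,678,248 parameters by 4 reproduces the 548.09 MiB figure from the summary:

```python
# Figures from the reports above; assumes default float32 weights (4 bytes each)
n_params = 143_678_248                # total parameter count of vgg19_bn
param_bytes = n_params * 4

print(param_bytes)                    # 574712992, the raw Param_Cost in bytes
print(round(param_bytes / 2**20, 2))  # 548.09 (MiB)
```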
E.c.4 Performance Benchmarking¶
❗❗❗ You need to run at least one feed-forward pass before measuring the inference time / throughput ❗❗❗
# Run a feed-forward pass here if you have not done so yet.
# We already did one when measuring calculation, so there is no need to do it again
# There are two hyperparameters that ensure the correctness of the measurement.
# 1. ittp_warmup: Number of warm-up (i.e., feed-forward inference) iterations before the `ittp` measurement.
# Defaults to 50.
model.ittp_warmup = 10
# 2. ittp_benchmark_time: Number of benchmark iterations per operation when measuring `ittp`.
# Defaults to 100.
model.ittp_benchmark_time = 20
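torchmeter's internal measuring loop is not shown in this tutorial; conceptually, `ittp_warmup` and `ittp_benchmark_time` drive a warm-up-then-benchmark loop like this simplified sketch (the `benchmark` helper below is hypothetical, not part of the torchmeter API):

```python
# Hypothetical sketch of a warm-up + benchmark loop; torchmeter's real measuring
# code is internal and more sophisticated (per-operation hooks, progress bars, ...).
import statistics
import time

def benchmark(fn, warmup=10, iters=20):
    for _ in range(warmup):        # warm-up passes are excluded from the statistics
        fn()
    samples = []
    for _ in range(iters):         # timed benchmark iterations
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    # report mean ± stdev, mirroring the `2.06 ms ± 173.78 us` style of the reports
    return statistics.mean(samples), statistics.stdev(samples)

mean_s, std_s = benchmark(lambda: sum(range(10_000)))
print(f"{mean_s * 1e6:.2f} us ± {std_s * 1e6:.2f} us")
```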
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Overall Report ", "="*10)
# Inference latency & Throughput Benchmarking
# As with `param`, the data for each operation is measured for that operation alone and does not include its sub-operations ❗❗❗
# The result unit `IPS` means `Inputs Per Second`, where an input refers to the input given for the earlier feed-forward pass.
# You can check the input via `model.ipt`
print(model.ittp)
========== Overall Report ==========
Warming Up: 100%|██████████| 10/10 [00:00<00:00, 181.68it/s] Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:00<00:00, 5424.65module/s]
InferTime_Throughput_INFO • Operation_Id = 0 • Operation_Name = VGG • Operation_Type = VGG • Infer_Time = 2.06 ms ± 173.78 us • Throughput = 485.85 IPS ± 37.97 IPS
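Since the benchmark input here is a single sample (x has shape [1, 3, 224, 224]), throughput is roughly the reciprocal of the mean inference time:

```python
# Figures from the overall report above; the benchmark input is a single sample
latency_s = 2.06e-3   # mean Infer_Time
batch = 1             # x has shape [1, 3, 224, 224]

print(round(batch / latency_s, 2))  # 485.44, matching the reported 485.85 IPS up to rounding
```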
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print("="*10, " Layer-wise Profile ", "="*10)
# The result unit `IPS` means `Inputs Per Second`, where an input refers to the input given for the earlier feed-forward pass.
# You can check the input via `model.ipt`
tb, data = model.profile('ittp', no_tree=True)
========== Layer-wise Profile ==========
Warming Up: 100%|██████████| 10/10 [00:00<00:00, 556.83it/s] Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:00<00:00, 6193.69module/s]
╭──────────────┬────────────────┬───────────────────┬───────────────────────┬─────────────────────────╮ │ Operation_Id │ Operation_Name │ Operation_Type │ Infer_Time │ Throughput │ ├──────────────┼────────────────┼───────────────────┼───────────────────────┼─────────────────────────┤ │ 1 │ features │ Sequential │ 1.43 ms ± 134.30 us │ 699.11 IPS ± 61.65 IPS │ │ 1.1 │ 0 │ Conv2d │ 66.53 us ± 2.42 us │ 15.03 KIPS ± 542.73 IPS │ │ 1.2 │ 1 │ BatchNorm2d │ 49.68 us ± 1.65 us │ 20.13 KIPS ± 679.42 IPS │ │ 1.3 │ 2 │ ReLU │ 23.22 us ± 968.00 ns │ 43.07 KIPS ± 1.82 KIPS │ │ 1.4 │ 3 │ Conv2d │ 114.64 us ± 4.13 us │ 8.72 KIPS ± 325.37 IPS │ │ 1.5 │ 4 │ BatchNorm2d │ 50.18 us ± 1.83 us │ 19.93 KIPS ± 725.58 IPS │ │ 1.6 │ 5 │ ReLU │ 23.55 us ± 280.00 ns │ 42.46 KIPS ± 509.36 IPS │ │ 1.7 │ 6 │ MaxPool2d │ 30.18 us ± 1.02 us │ 33.14 KIPS ± 1.12 KIPS │ │ 1.8 │ 7 │ Conv2d │ 78.77 us ± 319.99 ns │ 12.70 KIPS ± 51.68 IPS │ │ 1.9 │ 8 │ BatchNorm2d │ 50.18 us ± 744.00 ns │ 19.93 KIPS ± 299.96 IPS │ │ 1.10 │ 9 │ ReLU │ 23.36 us ± 1.02 us │ 42.81 KIPS ± 1.93 KIPS │ │ 1.11 │ 10 │ Conv2d │ 101.30 us ± 664.00 ns │ 9.87 KIPS ± 64.46 IPS │ │ 1.12 │ 11 │ BatchNorm2d │ 45.20 us ± 6.14 us │ 22.13 KIPS ± 2.91 KIPS │ │ 1.13 │ 12 │ ReLU │ 21.26 us ± 3.33 us │ 47.14 KIPS ± 7.50 KIPS │ │ 1.14 │ 13 │ MaxPool2d │ 28.11 us ± 4.41 us │ 35.69 KIPS ± 5.69 KIPS │ │ 1.15 │ 14 │ Conv2d │ 70.66 us ± 160.00 ns │ 14.15 KIPS ± 32.12 IPS │ │ 1.16 │ 15 │ BatchNorm2d │ 45.50 us ± 6.15 us │ 21.98 KIPS ± 2.91 KIPS │ │ 1.17 │ 16 │ ReLU │ 20.98 us ± 2.75 us │ 47.71 KIPS ± 6.03 KIPS │ │ 1.18 │ 17 │ Conv2d │ 103.17 us ± 928.00 ns │ 9.69 KIPS ± 87.55 IPS │ │ 1.19 │ 18 │ BatchNorm2d │ 50.18 us ± 352.00 ns │ 19.93 KIPS ± 141.03 IPS │ │ 1.20 │ 19 │ ReLU │ 22.59 us ± 800.00 ns │ 44.26 KIPS ± 1.53 KIPS │ │ 1.21 │ 20 │ Conv2d │ 102.40 us ± 5.23 us │ 9.77 KIPS ± 515.19 IPS │ │ 1.22 │ 21 │ BatchNorm2d │ 43.71 us ± 2.86 us │ 22.88 KIPS ± 1.43 KIPS │ │ 1.23 │ 22 │ ReLU │ 20.43 us ± 2.82 us │ 48.94 KIPS ± 6.20 KIPS │ │ 1.24 │ 23 │ 
Conv2d │ 98.30 us ± 1.06 us │ 10.17 KIPS ± 109.48 IPS │ │ 1.25 │ 24 │ BatchNorm2d │ 43.01 us ± 584.00 ns │ 23.25 KIPS ± 312.60 IPS │ │ 1.26 │ 25 │ ReLU │ 20.32 us ± 904.00 ns │ 49.21 KIPS ± 2.17 KIPS │ │ 1.27 │ 26 │ MaxPool2d │ 30.45 us ± 1.26 us │ 32.84 KIPS ± 1.37 KIPS │ │ 1.28 │ 27 │ Conv2d │ 72.93 us ± 1.34 us │ 13.71 KIPS ± 247.66 IPS │ │ 1.29 │ 28 │ BatchNorm2d │ 50.18 us ± 2.02 us │ 19.93 KIPS ± 803.87 IPS │ │ 1.30 │ 29 │ ReLU │ 23.55 us ± 1.02 us │ 42.46 KIPS ± 1.77 KIPS │ │ 1.31 │ 30 │ Conv2d │ 105.47 us ± 1.21 us │ 9.48 KIPS ± 109.38 IPS │ │ 1.32 │ 31 │ BatchNorm2d │ 42.96 us ± 488.00 ns │ 23.28 KIPS ± 266.26 IPS │ │ 1.33 │ 32 │ ReLU │ 19.68 us ± 600.00 ns │ 50.81 KIPS ± 1.52 KIPS │ │ 1.34 │ 33 │ Conv2d │ 100.35 us ± 383.99 ns │ 9.96 KIPS ± 37.93 IPS │ │ 1.35 │ 34 │ BatchNorm2d │ 43.01 us ± 304.00 ns │ 23.25 KIPS ± 164.05 IPS │ │ 1.36 │ 35 │ ReLU │ 20.19 us ± 928.00 ns │ 49.52 KIPS ± 2.32 KIPS │ │ 1.37 │ 36 │ Conv2d │ 100.35 us ± 464.00 ns │ 9.96 KIPS ± 45.80 IPS │ │ 1.38 │ 37 │ BatchNorm2d │ 43.07 us ± 248.00 ns │ 23.22 KIPS ± 133.29 IPS │ │ 1.39 │ 38 │ ReLU │ 20.38 us ± 1.15 us │ 49.06 KIPS ± 2.76 KIPS │ │ 1.40 │ 39 │ MaxPool2d │ 30.72 us ± 168.00 ns │ 32.55 KIPS ± 179.00 IPS │ │ 1.41 │ 40 │ Conv2d │ 69.79 us ± 8.31 us │ 14.33 KIPS ± 1.54 KIPS │ │ 1.42 │ 41 │ BatchNorm2d │ 43.74 us ± 1.09 us │ 22.86 KIPS ± 575.54 IPS │ │ 1.43 │ 42 │ ReLU │ 20.19 us ± 760.00 ns │ 49.53 KIPS ± 1.90 KIPS │ │ 1.44 │ 43 │ Conv2d │ 69.49 us ± 1.28 us │ 14.39 KIPS ± 266.38 IPS │ │ 1.45 │ 44 │ BatchNorm2d │ 43.10 us ± 1.02 us │ 23.20 KIPS ± 540.73 IPS │ │ 1.46 │ 45 │ ReLU │ 19.55 us ± 896.00 ns │ 51.15 KIPS ± 2.26 KIPS │ │ 1.47 │ 46 │ Conv2d │ 69.63 us ± 1.25 us │ 14.36 KIPS ± 259.96 IPS │ │ 1.48 │ 47 │ BatchNorm2d │ 43.01 us ± 488.00 ns │ 23.25 KIPS ± 266.81 IPS │ │ 1.49 │ 48 │ ReLU │ 20.05 us ± 968.00 ns │ 49.88 KIPS ± 2.42 KIPS │ │ 1.50 │ 49 │ Conv2d │ 68.82 us ± 904.00 ns │ 14.53 KIPS ± 189.55 IPS │ │ 1.51 │ 50 │ BatchNorm2d │ 43.01 us ± 800.00 ns │ 23.25 KIPS ± 426.38 IPS 
│ │ 1.52 │ 51 │ ReLU │ 19.49 us ± 856.00 ns │ 51.31 KIPS ± 2.16 KIPS │ │ 1.53 │ 52 │ MaxPool2d │ 26.64 us ± 696.00 ns │ 37.54 KIPS ± 958.52 IPS │ │ 2 │ avgpool │ AdaptiveAvgPool2d │ 28.50 us ± 904.00 ns │ 35.09 KIPS ± 1.14 KIPS │ │ 3 │ classifier │ Sequential │ 585.07 us ± 1.28 us │ 1.71 KIPS ± 3.73 IPS │ │ 3.1 │ 0 │ Linear │ 488.45 us ± 2.37 us │ 2.05 KIPS ± 9.92 IPS │ │ 3.2 │ 1 │ ReLU │ 19.62 us ± 824.00 ns │ 50.98 KIPS ± 2.09 KIPS │ │ 3.3 │ 2 │ Dropout │ 15.36 us ± 432.00 ns │ 65.10 KIPS ± 1.89 KIPS │ │ 3.4 │ 3 │ Linear │ 35.84 us ± 224.00 ns │ 27.90 KIPS ± 172.36 IPS │ │ 3.5 │ 4 │ ReLU │ 18.24 us ± 976.00 ns │ 54.82 KIPS ± 3.03 KIPS │ │ 3.6 │ 5 │ Dropout │ 14.30 us ± 632.00 ns │ 69.91 KIPS ± 3.07 KIPS │ │ 3.7 │ 6 │ Linear │ 36.86 us ± 784.00 ns │ 27.13 KIPS ± 581.71 IPS │ ╰──────────────┴────────────────┴───────────────────┴───────────────────────┴─────────────────────────╯ -------------------------------------------- s u m m a r y -------------------------------------------- • Model : VGG • Statistics: ittp • Device : cuda:0 • Benchmark Times: 20 • Signature: forward(self, x) • Inference Elapse: 2.07 ms ± 63.78 us • Input : • Throughput : 483.33 IPS ± 14.49 IPS x = Shape([1, 3, 224, 224]) <Tensor>
F. Fine-Grained Customization¶
TorchMeter
provides many customization options in the following aspects; feel free to customize your style:
- Statistics Overview
- Rich-Text Operation Tree
- Tabular Report
F.a Customization of Statistics Overview¶
F.a.1 Pick and Reorder Statistics¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# Just pass the names of the statistics you want to display to `model.overview()`;
# they will be displayed, after the model information, in the order you pass them in.
print(model.overview("param", "mem"))
────────────── Model INFO ─────────────── ───────────── Param INFO ───────────── • Model : VGG • Statistics: param • Device : cuda:0 • Learnable Parameters Num: 123.64 M • Signature: forward(self, x) • Total Parameters Num : 143.68 M • Input : ────────────────────────────────────── x = Shape([1, 3, 224, 224]) <Tensor> ───────────────────────────────────────── ────────────────── Mem INFO ─────────────────── • Statistics: mem • Parameters Memory Cost: 548.09 MiB, 82.12 % • Buffers Memory Cost : 43.12 KiB, 0.01 % • FeatureMap Memory Cost: 119.31 MiB, 17.88 % • Total Memory Cost : 667.44 MiB ───────────────────────────────────────────────
F.a.2 Pure Output without Warnings¶
TorchMeter is still under development, and support for some operations or layers is not yet complete. Therefore, the current version of TorchMeter may not be able to measure `cal` for some operations (`param`, `mem`, and `ittp` are not affected), and a warning message is displayed in that case. We offer an argument to disable this behavior, as shown below. You can compare the result with that in section E.b.
print(model.overview(show_warning=False))
Warming Up: 100%|██████████| 10/10 [00:00<00:00, 432.77it/s] Benchmark Inference Time & Throughput: 100%|██████████| 1280/1280 [00:00<00:00, 5649.95module/s]
────────────── Model INFO ─────────────── ───────────── Param INFO ───────────── • Model : VGG • Statistics: param • Device : cuda:0 • Learnable Parameters Num: 123.64 M • Signature: forward(self, x) • Total Parameters Num : 143.68 M • Input : ────────────────────────────────────── x = Shape([1, 3, 224, 224]) <Tensor> ───────────────────────────────────────── ─────────── Cal INFO ──────────── ────────────────── Mem INFO ─────────────────── • Statistics: cal • Statistics: mem • FLOPs : 39.34 G • Parameters Memory Cost: 548.09 MiB, 82.12 % • MACs(aka MACC, MADD): 19.68 G • Buffers Memory Cost : 43.12 KiB, 0.01 % ───────────────────────────────── • FeatureMap Memory Cost: 119.31 MiB, 17.88 % • Total Memory Cost : 667.44 MiB ─────────────────────────────────────────────── ──────────────── Ittp INFO ───────────────── • Statistics: ittp • Benchmark Times: 20 • Inference Elapse: 2.26 ms ± 175.01 us • Throughput : 442.99 IPS ± 36.73 IPS ────────────────────────────────────────────
F.b Customization of Rich-Text Operation Tree¶
There are two types of customization for the hierarchical operation tree:
- Hierarchical display customization
- Repeat block customization
    - Customize the title of the repeat block
    - Customize the overall style of the repeat block
    - Customize the footer of the repeat block
F.b.1 Customize the Hierarchical Display¶
All customization fields can be found in the Default Configuration section of the Cheatsheet tab.
You can customize the display of a tree level by designating its configuration through the level index.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# modify related configurations via attribute access (Suitable for small-scale, one-on-one modification)
model.tree_levels_args.default.label = "[b gray35](<node_id>) [green]<name>[/green] [cyan]<module_repr>[/]"
# modify related configurations via dict (Suitable for a large number of modifications)
model.tree_levels_args = {
"default": {"guide_style": "yellow"},
"1": {"guide_style": "cornflower_blue"}
}
print(model.structure)
VGG ├── (1) features Sequential │ ├── (1.1) 0 Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.2) 1 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.3) 2 ReLU(inplace=True) │ ├── (1.4) 3 Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.5) 4 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.6) 5 ReLU(inplace=True) │ ├── (1.7) 6 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.8) 7 Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.9) 8 BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.10) 9 ReLU(inplace=True) │ ├── (1.11) 10 Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.12) 11 BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.13) 12 ReLU(inplace=True) │ ├── (1.14) 13 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.15) 14 Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Repeat [3] Times ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │ │ ┃ (1.j) 15 BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(j+1)) 16 ReLU(inplace=True) ┃ │ │ ┃ (1.(j+2)) 17 Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ --------------------------------------------------------------------------------------------- ┃ │ │ ┃ Where j = 16, 19, 22 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ ├── (1.25) 24 BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.26) 25 ReLU(inplace=True) │ ├── (1.27) 26 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.28) 27 Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), 
padding=(1, 1)) │ ├── ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Repeat [3] Times ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │ │ ┃ (1.k) 28 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(k+1)) 29 ReLU(inplace=True) ┃ │ │ ┃ (1.(k+2)) 30 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ --------------------------------------------------------------------------------------------- ┃ │ │ ┃ Where k = 29, 32, 35 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ ├── (1.38) 37 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.39) 38 ReLU(inplace=True) │ ├── (1.40) 39 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Repeat [2] Times ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │ │ ┃ (1.a) 40 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ (1.(a+1)) 41 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(a+2)) 42 ReLU(inplace=True) ┃ │ │ ┃ (1.(a+3)) 43 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ (1.(a+4)) 44 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(a+5)) 45 ReLU(inplace=True) ┃ │ │ ┃ ------------------------------------------------------------------------------------------------- ┃ │ │ ┃ Where a = 41, 47 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ └── (1.53) 52 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) ├── (2) avgpool AdaptiveAvgPool2d(output_size=(7, 7)) └── (3) classifier Sequential ├── (3.1) 0 Linear(in_features=25088, out_features=4096, bias=True) ├── (3.2) 1 ReLU(inplace=True) ├── (3.3) 2 Dropout(p=0.5, inplace=False) ├── (3.4) 3 Linear(in_features=4096, out_features=4096, bias=True) ├── (3.5) 
4 ReLU(inplace=True) ├── (3.6) 5 Dropout(p=0.5, inplace=False) └── (3.7) 6 Linear(in_features=4096, out_features=1000, bias=True)
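The `label` template above mixes rich-text markup with placeholders such as `<node_id>` and `<name>`. The actual rendering is internal to torchmeter, but the placeholder substitution can be sketched as a simple regex replacement (the `render_label` helper below is hypothetical, for illustration only):

```python
import re

# Hypothetical helper: fill every `<placeholder>` from a node-attribute dict,
# leaving unknown placeholders untouched. torchmeter's real renderer also
# handles the rich markup tags (e.g. [b gray35]...).
def render_label(template: str, attrs: dict) -> str:
    return re.sub(r"<(\w+)>", lambda m: str(attrs.get(m.group(1), m.group(0))), template)

print(render_label("(<node_id>) <name> <module_repr>",
                   {"node_id": "1.1", "name": "0", "module_repr": "Conv2d(...)"}))
```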
F.b.2 Customize the Repeat Block¶
F.b.2.1 Customize the title¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# modify related configurations via attribute access (Suitable for small-scale, one-on-one modification)
model.tree_repeat_block_args.title = "[[b]<repeat_time>[/b]] [i]Times Repeated[/]"
model.tree_repeat_block_args.title_align = "right"
print(model.structure)
VGG ├── (1) features Sequential │ ├── (1.1) 0 Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.2) 1 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.3) 2 ReLU(inplace=True) │ ├── (1.4) 3 Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.5) 4 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.6) 5 ReLU(inplace=True) │ ├── (1.7) 6 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.8) 7 Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.9) 8 BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.10) 9 ReLU(inplace=True) │ ├── (1.11) 10 Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.12) 11 BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.13) 12 ReLU(inplace=True) │ ├── (1.14) 13 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.15) 14 Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ [3] Times Repeated ━┓ │ │ ┃ (1.b) 15 BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(b+1)) 16 ReLU(inplace=True) ┃ │ │ ┃ (1.(b+2)) 17 Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ --------------------------------------------------------------------------------------------- ┃ │ │ ┃ Where b = 16, 19, 22 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ ├── (1.25) 24 BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.26) 25 ReLU(inplace=True) │ ├── (1.27) 26 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.28) 27 Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), 
padding=(1, 1)) │ ├── ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ [3] Times Repeated ━┓ │ │ ┃ (1.c) 28 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(c+1)) 29 ReLU(inplace=True) ┃ │ │ ┃ (1.(c+2)) 30 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ --------------------------------------------------------------------------------------------- ┃ │ │ ┃ Where c = 29, 32, 35 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ ├── (1.38) 37 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.39) 38 ReLU(inplace=True) │ ├── (1.40) 39 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ [2] Times Repeated ━┓ │ │ ┃ (1.d) 40 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ (1.(d+1)) 41 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(d+2)) 42 ReLU(inplace=True) ┃ │ │ ┃ (1.(d+3)) 43 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ┃ │ │ ┃ (1.(d+4)) 44 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ┃ │ │ ┃ (1.(d+5)) 45 ReLU(inplace=True) ┃ │ │ ┃ ------------------------------------------------------------------------------------------------- ┃ │ │ ┃ Where d = 41, 47 ┃ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ └── (1.53) 52 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) ├── (2) avgpool AdaptiveAvgPool2d(output_size=(7, 7)) └── (3) classifier Sequential ├── (3.1) 0 Linear(in_features=25088, out_features=4096, bias=True) ├── (3.2) 1 ReLU(inplace=True) ├── (3.3) 2 Dropout(p=0.5, inplace=False) ├── (3.4) 3 Linear(in_features=4096, out_features=4096, bias=True) ├── (3.5) 
4 ReLU(inplace=True) ├── (3.6) 5 Dropout(p=0.5, inplace=False) └── (3.7) 6 Linear(in_features=4096, out_features=1000, bias=True)
F.b.2.2 Customize the style¶
All customization fields can be found in the
Customization
tab.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
from rich.box import ROUNDED
# modify related configurations via dict (Suitable for a large number of modifications)
model.tree_repeat_block_args = {
"style": "purple",
"box": ROUNDED,
}
print(model.structure)
VGG ├── (1) features Sequential │ ├── (1.1) 0 Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.2) 1 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.3) 2 ReLU(inplace=True) │ ├── (1.4) 3 Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.5) 4 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.6) 5 ReLU(inplace=True) │ ├── (1.7) 6 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.8) 7 Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.9) 8 BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.10) 9 ReLU(inplace=True) │ ├── (1.11) 10 Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── (1.12) 11 BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.13) 12 ReLU(inplace=True) │ ├── (1.14) 13 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.15) 14 Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ ├── ╭────────────────────────────────────────────────────────────────────────── [3] Times Repeated ─╮ │ │ │ (1.e) 15 BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ │ │ │ (1.(e+1)) 16 ReLU(inplace=True) │ │ │ │ (1.(e+2)) 17 Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ │ │ │ --------------------------------------------------------------------------------------------- │ │ │ │ Where e = 16, 19, 22 │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────╯ │ ├── (1.25) 24 BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.26) 25 ReLU(inplace=True) │ ├── (1.27) 26 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── (1.28) 27 Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), 
padding=(1, 1)) │ ├── ╭────────────────────────────────────────────────────────────────────────── [3] Times Repeated ─╮ │ │ │ (1.f) 28 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ │ │ │ (1.(f+1)) 29 ReLU(inplace=True) │ │ │ │ (1.(f+2)) 30 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ │ │ │ --------------------------------------------------------------------------------------------- │ │ │ │ Where f = 29, 32, 35 │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────╯ │ ├── (1.38) 37 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ ├── (1.39) 38 ReLU(inplace=True) │ ├── (1.40) 39 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) │ ├── ╭────────────────────────────────────────────────────────────────────────────── [2] Times Repeated ─╮ │ │ │ (1.g) 40 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ │ │ │ (1.(g+1)) 41 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ │ │ │ (1.(g+2)) 42 ReLU(inplace=True) │ │ │ │ (1.(g+3)) 43 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) │ │ │ │ (1.(g+4)) 44 BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) │ │ │ │ (1.(g+5)) 45 ReLU(inplace=True) │ │ │ │ ------------------------------------------------------------------------------------------------- │ │ │ │ Where g = 41, 47 │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────╯ │ └── (1.53) 52 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False) ├── (2) avgpool AdaptiveAvgPool2d(output_size=(7, 7)) └── (3) classifier Sequential ├── (3.1) 0 Linear(in_features=25088, out_features=4096, bias=True) ├── (3.2) 1 ReLU(inplace=True) ├── (3.3) 2 Dropout(p=0.5, inplace=False) ├── (3.4) 3 Linear(in_features=4096, out_features=4096, bias=True) ├── (3.5) 
4 ReLU(inplace=True) ├── (3.6) 5 Dropout(p=0.5, inplace=False) └── (3.7) 6 Linear(in_features=4096, out_features=1000, bias=True)
F.b.2.3 Customize the footer¶
There are three ways to customize the footer, listed here in order of increasing freedom:
- Fixed text
- Dynamic text based on attributes of the tree node
- Dynamic text based on a function
For ease of demonstration, we use a simpler model, namely the `RepeatModel` defined below.
import torch.nn as nn
from random import sample
from torchmeter import Meter
class RepeatModel(nn.Module):
    def __init__(self, repeat_winsz: int = 1, repeat_time: int = 2):
        super(RepeatModel, self).__init__()
        layer_candidates = [
            nn.Linear(10, 10),
            nn.ReLU(),
            nn.Identity()
        ]
        pick_modules = sample(layer_candidates, repeat_winsz)
        all_modules = pick_modules * repeat_time
        self.layers = nn.ModuleList(all_modules)

footer_model = Meter(
    RepeatModel(repeat_winsz=2, repeat_time=3),
    device="cpu"
)
print("The default footer:")
print(footer_model.structure)
Finish Scanning model in 0.0013 seconds
The default footer:
RepeatModel └── (1) layers ModuleList └── ╭─────────────────────────────────────────────── [3] Times Repeated ─╮ │ (1.x) 0 ReLU() │ │ (1.(x+1)) 1 Linear(in_features=10, out_features=10, bias=True) │ │ ------------------------------------------------------------------ │ │ Where x = 1, 3, 5 │ ╰────────────────────────────────────────────────────────────────────╯
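torchmeter's repeat-block detection logic is internal; as a simplified illustration of the idea, detecting that a list of children is a repeated window can be done like this (the `find_repeat` helper is hypothetical, and the real detection also handles repeats embedded inside a longer sequence):

```python
from typing import List, Tuple

def find_repeat(seq: List[str]) -> Tuple[List[str], int]:
    """Return the smallest window whose repetition reproduces `seq`, plus the repeat count."""
    n = len(seq)
    for w in range(1, n // 2 + 1):
        # a window of width w qualifies if tiling it n//w times rebuilds seq exactly
        if n % w == 0 and seq[:w] * (n // w) == seq:
            return seq[:w], n // w
    return seq, 1  # no repetition found

print(find_repeat(["ReLU", "Linear"] * 3))  # (['ReLU', 'Linear'], 3)
```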
F.b.2.3.1 Fixed Text¶
# Context
# --------------------------------------------------------------------------------
# footer_model: Instance of `torchmeter.Meter` created from RepeatModel in F.b.2.3
footer_model.tree_renderer.repeat_footer = "My custom footer"
print(footer_model.structure)
RepeatModel └── (1) layers ModuleList └── ╭─────────────────────────────────────────────── [3] Times Repeated ─╮ │ (1.y) 0 ReLU() │ │ (1.(y+1)) 1 Linear(in_features=10, out_features=10, bias=True) │ │ ------------------------------------------------------------------ │ │ My custom footer │ ╰────────────────────────────────────────────────────────────────────╯
F.b.2.3.2 Dynamic Text based on Tree Node Attributes¶
A tree node represents an operation in the model. Its available attributes are listed in the Tree Node Attributes
section of the Cheatsheet
tab.
# Context
# --------------------------------------------------------------------------------
# footer_model: Instance of `torchmeter.Meter` created from RepeatModel in F.b.2.3
footer_model.tree_renderer.repeat_footer = "The type of first module is <type>"
print(footer_model.structure)
RepeatModel └── (1) layers ModuleList └── ╭─────────────────────────────────────────────── [3] Times Repeated ─╮ │ (1.i) 0 ReLU() │ │ (1.(i+1)) 1 Linear(in_features=10, out_features=10, bias=True) │ │ ------------------------------------------------------------------ │ │ The type of first module is ReLU │ ╰────────────────────────────────────────────────────────────────────╯
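Placeholder substitution of this kind can be sketched in a few lines of plain Python. The `resolve_placeholders` helper below is a hypothetical illustration, not torchmeter's actual implementation:

```python
import re
from typing import Any, Dict

def resolve_placeholders(template: str, attr_dict: Dict[str, Any]) -> str:
    """Replace each `<name>` with str(attr_dict[name]); unknown names are left intact."""
    return re.sub(
        r"<(\w+)>",
        lambda m: str(attr_dict.get(m.group(1), m.group(0))),
        template,
    )

print(resolve_placeholders("The type of first module is <type>", {"type": "ReLU"}))
```

Leaving unknown names untouched keeps a typo in a placeholder visible in the rendered footer rather than silently dropping it.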
F.b.2.3.3 Dynamic Text based on Function¶
# Context
# --------------------------------------------------------------------------------
# footer_model: Instance of `torchmeter.Meter` created from RepeatModel in F.b.2.3
from typing import Dict, Any

def my_footer(attr_dict: Dict[str, Any]) -> str:
    """Footer function requirements:
    1. must take exactly one argument (name irrelevant) that receives a dictionary
       of attributes (key: attribute name | value: attribute value)
    2. must return a string; the string may still contain a placeholder like
       `<repeat_winsz>`, which is replaced with the corresponding attribute
       value before rendering.
    """
    repeat_win_size = attr_dict["repeat_winsz"]
    if repeat_win_size > 1:
        return f"There are {repeat_win_size} modules in a repeat window"
    else:
        return "The repeat window only contains one module"
footer_model.tree_renderer.repeat_footer = my_footer
print(footer_model.structure)
RepeatModel └── (1) layers ModuleList └── ╭─────────────────────────────────────────────── [3] Times Repeated ─╮ │ (1.j) 0 ReLU() │ │ (1.(j+1)) 1 Linear(in_features=10, out_features=10, bias=True) │ │ ------------------------------------------------------------------ │ │ There are 2 modules in a repeat window │ ╰────────────────────────────────────────────────────────────────────╯
F.c Customization of Tabular Report¶
The customization of the tabular report focuses on 3 aspects:
- Customize the column/overall style.
- Display the operation tree alongside the table (or not).
- Customize the tabular report structure.
F.c.1 Customize the Column/Overall Style¶
All customization fields can be found in the
Default Configuration
section in the
Cheatsheet
tab.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# customize column display settings
# modify related configurations via attribute access (suited to small, one-by-one changes)
model.table_column_args.justify = "left"
# customize the display for the whole table
# modify related configurations via a dict (suited to many changes at once)
model.table_display_args = {
"style": "#af8700", # or rgb(175,135,0)
"show_lines": True,
"show_edge": False
}
tb, data = model.profile("param", no_tree=True)
Operation_Id │ Operation_Name │ Operation_Type │ Param_Name │ Requires_Grad │ Numeric_Num ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1 │ features │ Sequential │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.1 │ 0 │ Conv2d │ weight │ False │ 1.73 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.1 │ 0 │ Conv2d │ bias │ False │ 64.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.2 │ 1 │ BatchNorm2d │ weight │ False │ 64.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.2 │ 1 │ BatchNorm2d │ bias │ False │ 64.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.3 │ 2 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.4 │ 3 │ Conv2d │ weight │ False │ 36.86 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.4 │ 3 │ Conv2d │ bias │ False │ 64.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.5 │ 4 │ BatchNorm2d │ weight │ False │ 64.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.5 │ 4 │ BatchNorm2d │ bias │ False │ 64.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.6 │ 5 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.7 │ 6 │ MaxPool2d │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.8 │ 7 │ Conv2d │ weight │ False │ 73.73 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.8 │ 7 │ Conv2d │ bias │ False │ 128.00 
──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.9 │ 8 │ BatchNorm2d │ weight │ False │ 128.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.9 │ 8 │ BatchNorm2d │ bias │ False │ 128.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.10 │ 9 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.11 │ 10 │ Conv2d │ weight │ False │ 147.46 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.11 │ 10 │ Conv2d │ bias │ False │ 128.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.12 │ 11 │ BatchNorm2d │ weight │ False │ 128.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.12 │ 11 │ BatchNorm2d │ bias │ False │ 128.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.13 │ 12 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.14 │ 13 │ MaxPool2d │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.15 │ 14 │ Conv2d │ weight │ False │ 294.91 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.15 │ 14 │ Conv2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.16 │ 15 │ BatchNorm2d │ weight │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.16 │ 15 │ BatchNorm2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.17 │ 16 │ ReLU │ - │ - │ 0.00 
──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.18 │ 17 │ Conv2d │ weight │ False │ 589.82 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.18 │ 17 │ Conv2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.19 │ 18 │ BatchNorm2d │ weight │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.19 │ 18 │ BatchNorm2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.20 │ 19 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.21 │ 20 │ Conv2d │ weight │ False │ 589.82 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.21 │ 20 │ Conv2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.22 │ 21 │ BatchNorm2d │ weight │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.22 │ 21 │ BatchNorm2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.23 │ 22 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.24 │ 23 │ Conv2d │ weight │ False │ 589.82 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.24 │ 23 │ Conv2d │ bias │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.25 │ 24 │ BatchNorm2d │ weight │ False │ 256.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.25 │ 24 │ BatchNorm2d │ bias │ False │ 256.00 
──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.26 │ 25 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.27 │ 26 │ MaxPool2d │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.28 │ 27 │ Conv2d │ weight │ False │ 1.18 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.28 │ 27 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.29 │ 28 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.29 │ 28 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.30 │ 29 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.31 │ 30 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.31 │ 30 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.32 │ 31 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.32 │ 31 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.33 │ 32 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.34 │ 33 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.34 │ 33 │ Conv2d │ bias │ False │ 512.00 
──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.35 │ 34 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.35 │ 34 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.36 │ 35 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.37 │ 36 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.37 │ 36 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.38 │ 37 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.38 │ 37 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.39 │ 38 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.40 │ 39 │ MaxPool2d │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.41 │ 40 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.41 │ 40 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.42 │ 41 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.42 │ 41 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.43 │ 42 │ ReLU │ - │ - │ 0.00 
──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.44 │ 43 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.44 │ 43 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.45 │ 44 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.45 │ 44 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.46 │ 45 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.47 │ 46 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.47 │ 46 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.48 │ 47 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.48 │ 47 │ BatchNorm2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.49 │ 48 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.50 │ 49 │ Conv2d │ weight │ False │ 2.36 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.50 │ 49 │ Conv2d │ bias │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.51 │ 50 │ BatchNorm2d │ weight │ False │ 512.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.51 │ 50 │ BatchNorm2d │ bias │ False │ 512.00 
──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.52 │ 51 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 1.53 │ 52 │ MaxPool2d │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 2 │ avgpool │ AdaptiveAvgPool2d │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3 │ classifier │ Sequential │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.1 │ 0 │ Linear │ weight │ True │ 102.76 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.1 │ 0 │ Linear │ bias │ True │ 4.10 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.2 │ 1 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.3 │ 2 │ Dropout │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.4 │ 3 │ Linear │ weight │ True │ 16.78 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.4 │ 3 │ Linear │ bias │ True │ 4.10 K ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.5 │ 4 │ ReLU │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.6 │ 5 │ Dropout │ - │ - │ 0.00 ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.7 │ 6 │ Linear │ weight │ True │ 4.10 M ──────────────┼────────────────┼───────────────────┼────────────┼───────────────┼───────────── 3.7 │ 6 │ Linear │ bias │ True │ 1 K --------------------------------------- s u m m a r y ---------------------------------------- • Model : VGG • Statistics: param • Device 
: cuda:0 • Learnable Parameters Num: 123.64 M • Signature: forward(self, x) • Total Parameters Num : 143.68 M • Input : x = Shape([1, 3, 224, 224]) <Tensor>
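As an aside, the hex string and the `rgb()` form mentioned in the style comment above denote the same color; each hex byte pair is one channel. A quick plain-Python check:

```python
# "#af8700" and rgb(175,135,0) name the same color:
# each pair of hex digits after "#" is one 8-bit channel.
hex_color = "#af8700"
r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
assert (r, g, b) == (175, 135, 0)
```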
F.c.2 Enable the Operation Tree Beside¶
❗️❗️❗️ When the terminal is too narrow or the tree is too wide,
❗️❗️❗️ the table will be squeezed, reducing readability.
# Discard the customization settings above
cfg.restore()
# Disable interval rendering to suit Jupyter Notebook
cfg.render_interval = 0
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# here we use the param report instead of the mem report, since it requires less width
tb, data = model.profile("param", no_tree=False)
VGG ╭───────────┬───────────┬───────────┬───────────┬───────────┬───────────╮ ├── (1) features Sequential │ Operation │ Operation │ Operation │ Param_Nam │ Requires_ │ Numeric_N │ │ ├── (1.1) 0 Conv2d │ _Id │ _Name │ _Type │ e │ Grad │ um │ │ ├── (1.2) 1 BatchNorm2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.3) 2 ReLU │ 1 │ features │ Sequentia │ - │ - │ 0.00 │ │ ├── (1.4) 3 Conv2d │ │ │ l │ │ │ │ │ ├── (1.5) 4 BatchNorm2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.6) 5 ReLU │ 1.1 │ 0 │ Conv2d │ weight │ False │ 1.73 K │ │ ├── (1.7) 6 MaxPool2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.8) 7 Conv2d │ 1.1 │ 0 │ Conv2d │ bias │ False │ 64.00 │ │ ├── (1.9) 8 BatchNorm2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.10) 9 ReLU │ 1.2 │ 1 │ BatchNorm │ weight │ False │ 64.00 │ │ ├── (1.11) 10 Conv2d │ │ │ 2d │ │ │ │ │ ├── (1.12) 11 BatchNorm2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.13) 12 ReLU │ 1.2 │ 1 │ BatchNorm │ bias │ False │ 64.00 │ │ ├── (1.14) 13 MaxPool2d │ │ │ 2d │ │ │ │ │ ├── (1.15) 14 Conv2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── ┏━━━━ Repeat [3] Times ━━━━┓ │ 1.3 │ 2 │ ReLU │ - │ - │ 0.00 │ │ │ ┃ (1.h) 15 BatchNorm2d ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ (1.(h+1)) 16 ReLU ┃ │ 1.4 │ 3 │ Conv2d │ weight │ False │ 36.86 K │ │ │ ┃ (1.(h+2)) 17 Conv2d ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ ------------------------ ┃ │ 1.4 │ 3 │ Conv2d │ bias │ False │ 64.00 │ │ │ ┃ Where h = 16, 19, 22 ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ 1.5 │ 4 │ BatchNorm │ weight │ False │ 64.00 │ │ ├── (1.25) 24 BatchNorm2d │ │ │ 2d │ │ │ │ │ ├── (1.26) 25 ReLU 
├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.27) 26 MaxPool2d │ 1.5 │ 4 │ BatchNorm │ bias │ False │ 64.00 │ │ ├── (1.28) 27 Conv2d │ │ │ 2d │ │ │ │ │ ├── ┏━━━━ Repeat [3] Times ━━━━┓ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ (1.l) 28 BatchNorm2d ┃ │ 1.6 │ 5 │ ReLU │ - │ - │ 0.00 │ │ │ ┃ (1.(l+1)) 29 ReLU ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ (1.(l+2)) 30 Conv2d ┃ │ 1.7 │ 6 │ MaxPool2d │ - │ - │ 0.00 │ │ │ ┃ ------------------------ ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ Where l = 29, 32, 35 ┃ │ 1.8 │ 7 │ Conv2d │ weight │ False │ 73.73 K │ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.38) 37 BatchNorm2d │ 1.8 │ 7 │ Conv2d │ bias │ False │ 128.00 │ │ ├── (1.39) 38 ReLU ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ ├── (1.40) 39 MaxPool2d │ 1.9 │ 8 │ BatchNorm │ weight │ False │ 128.00 │ │ ├── ┏━━━━━━ Repeat [2] Times ━━━━━━┓ │ │ │ 2d │ │ │ │ │ │ ┃ (1.m) 40 Conv2d ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ (1.(m+1)) 41 BatchNorm2d ┃ │ 1.9 │ 8 │ BatchNorm │ bias │ False │ 128.00 │ │ │ ┃ (1.(m+2)) 42 ReLU ┃ │ │ │ 2d │ │ │ │ │ │ ┃ (1.(m+3)) 43 Conv2d ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ (1.(m+4)) 44 BatchNorm2d ┃ │ 1.10 │ 9 │ ReLU │ - │ - │ 0.00 │ │ │ ┃ (1.(m+5)) 45 ReLU ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┃ ---------------------------- ┃ │ 1.11 │ 10 │ Conv2d │ weight │ False │ 147.46 K │ │ │ ┃ Where m = 41, 47 ┃ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ │ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ │ 1.11 │ 10 │ Conv2d │ bias │ False │ 128.00 │ │ └── (1.53) 52 MaxPool2d ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ ├── (2) avgpool 
AdaptiveAvgPool2d │ 1.12 │ 11 │ BatchNorm │ weight │ False │ 128.00 │ └── (3) classifier Sequential │ │ │ 2d │ │ │ │ ├── (3.1) 0 Linear ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ ├── (3.2) 1 ReLU │ 1.12 │ 11 │ BatchNorm │ bias │ False │ 128.00 │ ├── (3.3) 2 Dropout │ │ │ 2d │ │ │ │ ├── (3.4) 3 Linear ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ ├── (3.5) 4 ReLU │ 1.13 │ 12 │ ReLU │ - │ - │ 0.00 │ ├── (3.6) 5 Dropout ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ └── (3.7) 6 Linear │ 1.14 │ 13 │ MaxPool2d │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.15 │ 14 │ Conv2d │ weight │ False │ 294.91 K │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.15 │ 14 │ Conv2d │ bias │ False │ 256.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.16 │ 15 │ BatchNorm │ weight │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.16 │ 15 │ BatchNorm │ bias │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.17 │ 16 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.18 │ 17 │ Conv2d │ weight │ False │ 589.82 K │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.18 │ 17 │ Conv2d │ bias │ False │ 256.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.19 │ 18 │ BatchNorm │ weight │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.19 │ 18 │ BatchNorm │ bias │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.20 │ 19 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.21 │ 20 │ Conv2d │ weight │ 
False │ 589.82 K │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.21 │ 20 │ Conv2d │ bias │ False │ 256.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.22 │ 21 │ BatchNorm │ weight │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.22 │ 21 │ BatchNorm │ bias │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.23 │ 22 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.24 │ 23 │ Conv2d │ weight │ False │ 589.82 K │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.24 │ 23 │ Conv2d │ bias │ False │ 256.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.25 │ 24 │ BatchNorm │ weight │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.25 │ 24 │ BatchNorm │ bias │ False │ 256.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.26 │ 25 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.27 │ 26 │ MaxPool2d │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.28 │ 27 │ Conv2d │ weight │ False │ 1.18 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.28 │ 27 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.29 │ 28 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.29 │ 28 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.30 │ 29 │ ReLU │ - │ - │ 0.00 │ 
├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.31 │ 30 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.31 │ 30 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.32 │ 31 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.32 │ 31 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.33 │ 32 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.34 │ 33 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.34 │ 33 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.35 │ 34 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.35 │ 34 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.36 │ 35 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.37 │ 36 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.37 │ 36 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.38 │ 37 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.38 │ 37 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.39 │ 38 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.40 │ 39 
│ MaxPool2d │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.41 │ 40 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.41 │ 40 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.42 │ 41 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.42 │ 41 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.43 │ 42 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.44 │ 43 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.44 │ 43 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.45 │ 44 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.45 │ 44 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.46 │ 45 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.47 │ 46 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.47 │ 46 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.48 │ 47 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.48 │ 47 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.49 │ 48 │ ReLU │ - │ - │ 0.00 │ 
├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.50 │ 49 │ Conv2d │ weight │ False │ 2.36 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.50 │ 49 │ Conv2d │ bias │ False │ 512.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.51 │ 50 │ BatchNorm │ weight │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.51 │ 50 │ BatchNorm │ bias │ False │ 512.00 │ │ │ │ 2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.52 │ 51 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 1.53 │ 52 │ MaxPool2d │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 2 │ avgpool │ AdaptiveA │ - │ - │ 0.00 │ │ │ │ vgPool2d │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3 │ classifie │ Sequentia │ - │ - │ 0.00 │ │ │ r │ l │ │ │ │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.1 │ 0 │ Linear │ weight │ True │ 102.76 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.1 │ 0 │ Linear │ bias │ True │ 4.10 K │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.2 │ 1 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.3 │ 2 │ Dropout │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.4 │ 3 │ Linear │ weight │ True │ 16.78 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.4 │ 3 │ Linear │ bias │ True │ 4.10 K │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.5 │ 4 │ ReLU │ - │ - │ 0.00 │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.6 │ 5 │ Dropout │ - │ - │ 0.00 │ 
├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.7 │ 6 │ Linear │ weight │ True │ 4.10 M │ ├───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤ │ 3.7 │ 6 │ Linear │ bias │ True │ 1 K │ ╰───────────┴───────────┴───────────┴───────────┴───────────┴───────────╯ -------------------------------------------------- s u m m a r y -------------------------------------------------- • Model : VGG • Statistics: param • Device : cuda:0 • Learnable Parameters Num: 123.64 M • Signature: forward(self, x) • Total Parameters Num : 143.68 M • Input : x = Shape([1, 3, 224, 224]) <Tensor>
F.c.3 Customize Tabular Report Structure¶
F.c.3.1 Rename Columns¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(f"original column names of mem report are: {model.table_cols('mem')}")
tb, data = model.profile(
    "mem",
    no_tree=True,
    custom_cols={
        "Operation_Id": "ID",
        "Param_Cost": "Param Cost",
    },
    keep_custom_name=True,  # whether to keep the custom column names from now on
)
# If keep_custom_name=False, the following line would print
# the same column names as before.
print(f"after customization, column names of mem report are: {model.table_cols('mem')}")
original column names of mem report are: ('Operation_Id', 'Operation_Name', 'Operation_Type', 'Param_Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
╭──────┬────────────────┬───────────────────┬────────────┬─────────────┬─────────────┬────────────╮ │ ID │ Operation_Name │ Operation_Type │ Param Cost │ Buffer_Cost │ Output_Cost │ Total │ ├──────┼────────────────┼───────────────────┼────────────┼─────────────┼─────────────┼────────────┤ │ 1 │ features │ Sequential │ 76.43 MiB │ 43.12 KiB │ 119.15 MiB │ 195.62 MiB │ │ 1.1 │ 0 │ Conv2d │ 7 KiB │ - │ 12.25 MiB │ 12.26 MiB │ │ 1.2 │ 1 │ BatchNorm2d │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.3 │ 2 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.4 │ 3 │ Conv2d │ 144.25 KiB │ - │ 12.25 MiB │ 12.39 MiB │ │ 1.5 │ 4 │ BatchNorm2d │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.6 │ 5 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.7 │ 6 │ MaxPool2d │ - │ - │ 3.06 MiB │ 3.06 MiB │ │ 1.8 │ 7 │ Conv2d │ 288.50 KiB │ - │ 6.12 MiB │ 6.41 MiB │ │ 1.9 │ 8 │ BatchNorm2d │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.10 │ 9 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.11 │ 10 │ Conv2d │ 576.50 KiB │ - │ 6.12 MiB │ 6.69 MiB │ │ 1.12 │ 11 │ BatchNorm2d │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.13 │ 12 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.14 │ 13 │ MaxPool2d │ - │ - │ 1.53 MiB │ 1.53 MiB │ │ 1.15 │ 14 │ Conv2d │ 1.13 MiB │ - │ 3.06 MiB │ 4.19 MiB │ │ 1.16 │ 15 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.17 │ 16 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.18 │ 17 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.19 │ 18 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.20 │ 19 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.21 │ 20 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.22 │ 21 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.23 │ 22 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.24 │ 23 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.25 │ 24 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.26 │ 25 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.27 │ 26 │ MaxPool2d │ - │ - │ 784 KiB │ 784 KiB │ │ 1.28 │ 27 │ Conv2d │ 4.50 MiB │ - │ 1.53 MiB │ 6.03 MiB │ │ 1.29 │ 28 │ 
BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.30 │ 29 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.31 │ 30 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.32 │ 31 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.33 │ 32 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.34 │ 33 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.35 │ 34 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.36 │ 35 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.37 │ 36 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.38 │ 37 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.39 │ 38 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.40 │ 39 │ MaxPool2d │ - │ - │ 392 KiB │ 392 KiB │ │ 1.41 │ 40 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.42 │ 41 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.43 │ 42 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.44 │ 43 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.45 │ 44 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.46 │ 45 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.47 │ 46 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.48 │ 47 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.49 │ 48 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.50 │ 49 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.51 │ 50 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.52 │ 51 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.53 │ 52 │ MaxPool2d │ - │ - │ 98 KiB │ 98 KiB │ │ 2 │ avgpool │ AdaptiveAvgPool2d │ - │ - │ 98 KiB │ 98 KiB │ │ 3 │ classifier │ Sequential │ 471.66 MiB │ 0.00 │ 67.91 KiB │ 471.73 MiB │ │ 3.1 │ 0 │ Linear │ 392.02 MiB │ - │ 16 KiB │ 392.03 MiB │ │ 3.2 │ 1 │ ReLU(inplace) │ - │ - │ - │ - │ │ 3.3 │ 2 │ Dropout │ - │ - │ 16 KiB │ 16 KiB │ │ 3.4 │ 3 │ Linear │ 64.02 MiB │ - │ 16 KiB │ 64.03 MiB │ │ 3.5 │ 4 │ ReLU(inplace) │ - │ - │ - │ - │ │ 3.6 │ 5 │ Dropout │ - │ - │ 16 KiB │ 16 KiB │ │ 3.7 │ 6 │ Linear │ 15.63 MiB │ - │ 3.91 KiB │ 15.63 MiB │ 
╰──────┴────────────────┴───────────────────┴────────────┴─────────────┴─────────────┴────────────╯ ------------------------------------------ s u m m a r y ------------------------------------------ • Model : VGG • Statistics: mem • Device : cuda:0 • Parameters Memory Cost: 548.09 MiB, 82.12 % • Signature: forward(self, x) • Buffers Memory Cost : 43.12 KiB, 0.01 % • Input : • FeatureMap Memory Cost: 119.31 MiB, 17.88 % x = Shape([1, 3, 224, 224]) <Tensor> • Total Memory Cost : 667.44 MiB
after customization, column names of mem report are: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
F.c.3.2 Rearrange Columns¶
❗️❗️❗️ The order of columns is only changed in rendering, not in the underlying datasheet.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(f"original column order of mem report is: {model.table_cols('mem')}")
# Method 1
tb, data = model.profile(
    "mem",
    no_tree=True,
    pick_cols=[
        "Operation_Type",
        "Operation_Name",
        "ID",
        "Param Cost",
        "Buffer_Cost",
        "Output_Cost",
        "Total",
    ],
)
print(f"after customization, column order of mem report is: {model.table_cols('mem')}")
original column order of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
╭───────────────────┬────────────────┬──────┬────────────┬─────────────┬─────────────┬────────────╮ │ Operation_Type │ Operation_Name │ ID │ Param Cost │ Buffer_Cost │ Output_Cost │ Total │ ├───────────────────┼────────────────┼──────┼────────────┼─────────────┼─────────────┼────────────┤ │ Sequential │ features │ 1 │ 76.43 MiB │ 43.12 KiB │ 119.15 MiB │ 195.62 MiB │ │ Conv2d │ 0 │ 1.1 │ 7 KiB │ - │ 12.25 MiB │ 12.26 MiB │ │ BatchNorm2d │ 1 │ 1.2 │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ ReLU(inplace) │ 2 │ 1.3 │ - │ - │ - │ - │ │ Conv2d │ 3 │ 1.4 │ 144.25 KiB │ - │ 12.25 MiB │ 12.39 MiB │ │ BatchNorm2d │ 4 │ 1.5 │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ ReLU(inplace) │ 5 │ 1.6 │ - │ - │ - │ - │ │ MaxPool2d │ 6 │ 1.7 │ - │ - │ 3.06 MiB │ 3.06 MiB │ │ Conv2d │ 7 │ 1.8 │ 288.50 KiB │ - │ 6.12 MiB │ 6.41 MiB │ │ BatchNorm2d │ 8 │ 1.9 │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ ReLU(inplace) │ 9 │ 1.10 │ - │ - │ - │ - │ │ Conv2d │ 10 │ 1.11 │ 576.50 KiB │ - │ 6.12 MiB │ 6.69 MiB │ │ BatchNorm2d │ 11 │ 1.12 │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ ReLU(inplace) │ 12 │ 1.13 │ - │ - │ - │ - │ │ MaxPool2d │ 13 │ 1.14 │ - │ - │ 1.53 MiB │ 1.53 MiB │ │ Conv2d │ 14 │ 1.15 │ 1.13 MiB │ - │ 3.06 MiB │ 4.19 MiB │ │ BatchNorm2d │ 15 │ 1.16 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ ReLU(inplace) │ 16 │ 1.17 │ - │ - │ - │ - │ │ Conv2d │ 17 │ 1.18 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ BatchNorm2d │ 18 │ 1.19 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ ReLU(inplace) │ 19 │ 1.20 │ - │ - │ - │ - │ │ Conv2d │ 20 │ 1.21 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ BatchNorm2d │ 21 │ 1.22 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ ReLU(inplace) │ 22 │ 1.23 │ - │ - │ - │ - │ │ Conv2d │ 23 │ 1.24 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ BatchNorm2d │ 24 │ 1.25 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ ReLU(inplace) │ 25 │ 1.26 │ - │ - │ - │ - │ │ MaxPool2d │ 26 │ 1.27 │ - │ - │ 784 KiB │ 784 KiB │ │ Conv2d │ 27 │ 1.28 │ 4.50 MiB │ - │ 1.53 MiB │ 6.03 MiB │ │ BatchNorm2d │ 
28 │ 1.29 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ ReLU(inplace) │ 29 │ 1.30 │ - │ - │ - │ - │ │ Conv2d │ 30 │ 1.31 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ BatchNorm2d │ 31 │ 1.32 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ ReLU(inplace) │ 32 │ 1.33 │ - │ - │ - │ - │ │ Conv2d │ 33 │ 1.34 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ BatchNorm2d │ 34 │ 1.35 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ ReLU(inplace) │ 35 │ 1.36 │ - │ - │ - │ - │ │ Conv2d │ 36 │ 1.37 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ BatchNorm2d │ 37 │ 1.38 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ ReLU(inplace) │ 38 │ 1.39 │ - │ - │ - │ - │ │ MaxPool2d │ 39 │ 1.40 │ - │ - │ 392 KiB │ 392 KiB │ │ Conv2d │ 40 │ 1.41 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ BatchNorm2d │ 41 │ 1.42 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ ReLU(inplace) │ 42 │ 1.43 │ - │ - │ - │ - │ │ Conv2d │ 43 │ 1.44 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ BatchNorm2d │ 44 │ 1.45 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ ReLU(inplace) │ 45 │ 1.46 │ - │ - │ - │ - │ │ Conv2d │ 46 │ 1.47 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ BatchNorm2d │ 47 │ 1.48 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ ReLU(inplace) │ 48 │ 1.49 │ - │ - │ - │ - │ │ Conv2d │ 49 │ 1.50 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ BatchNorm2d │ 50 │ 1.51 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ ReLU(inplace) │ 51 │ 1.52 │ - │ - │ - │ - │ │ MaxPool2d │ 52 │ 1.53 │ - │ - │ 98 KiB │ 98 KiB │ │ AdaptiveAvgPool2d │ avgpool │ 2 │ - │ - │ 98 KiB │ 98 KiB │ │ Sequential │ classifier │ 3 │ 471.66 MiB │ 0.00 │ 67.91 KiB │ 471.73 MiB │ │ Linear │ 0 │ 3.1 │ 392.02 MiB │ - │ 16 KiB │ 392.03 MiB │ │ ReLU(inplace) │ 1 │ 3.2 │ - │ - │ - │ - │ │ Dropout │ 2 │ 3.3 │ - │ - │ 16 KiB │ 16 KiB │ │ Linear │ 3 │ 3.4 │ 64.02 MiB │ - │ 16 KiB │ 64.03 MiB │ │ ReLU(inplace) │ 4 │ 3.5 │ - │ - │ - │ - │ │ Dropout │ 5 │ 3.6 │ - │ - │ 16 KiB │ 16 KiB │ │ Linear │ 6 │ 3.7 │ 15.63 MiB │ - │ 3.91 KiB │ 15.63 MiB │ 
╰───────────────────┴────────────────┴──────┴────────────┴─────────────┴─────────────┴────────────╯ ------------------------------------------ s u m m a r y ------------------------------------------ • Model : VGG • Statistics: mem • Device : cuda:0 • Parameters Memory Cost: 548.09 MiB, 82.12 % • Signature: forward(self, x) • Buffers Memory Cost : 43.12 KiB, 0.01 % • Input : • FeatureMap Memory Cost: 119.31 MiB, 17.88 % x = Shape([1, 3, 224, 224]) <Tensor> • Total Memory Cost : 667.44 MiB
after customization, column order of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
F.c.3.3 Delete Columns¶
TorchMeter offers two arguments in the torchmeter.Meter.profile() method to achieve this:
- exclude_cols: specify the columns to drop, convenient for removing just a few columns
- pick_cols: specify the columns to keep, convenient for dropping many columns at once
❗️❗️❗️ Note that this feature only adjusts the table display and does not actually delete columns,
❗️❗️❗️ since deleted data could not be restored.
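The two arguments are complementary: excluding a few columns displays the same table as picking all the rest. A toy sketch of the relationship in plain Python (an illustration, not TorchMeter code):

```python
# The full column set of the `mem` report, as printed by `model.table_cols('mem')`
cols = ("ID", "Operation_Name", "Operation_Type",
        "Param Cost", "Buffer_Cost", "Output_Cost", "Total")

# Dropping one column via `exclude_cols` ...
exclude_cols = ["Operation_Type"]
shown_by_exclude = [c for c in cols if c not in exclude_cols]

# ... displays the same columns as keeping all the others via `pick_cols`
pick_cols = ["ID", "Operation_Name", "Param Cost",
             "Buffer_Cost", "Output_Cost", "Total"]

assert shown_by_exclude == pick_cols
```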
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(f"original column set of mem report is: {model.table_cols('mem')}")
# Method 1
tb, data = model.profile(
    "mem",
    no_tree=True,
    exclude_cols=["Operation_Type"],
)
print(f"after customization, column set of mem report is: {model.table_cols('mem')}")
original column set of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
╭──────┬────────────────┬────────────┬─────────────┬─────────────┬────────────╮ │ ID │ Operation_Name │ Param Cost │ Buffer_Cost │ Output_Cost │ Total │ ├──────┼────────────────┼────────────┼─────────────┼─────────────┼────────────┤ │ 1 │ features │ 76.43 MiB │ 43.12 KiB │ 119.15 MiB │ 195.62 MiB │ │ 1.1 │ 0 │ 7 KiB │ - │ 12.25 MiB │ 12.26 MiB │ │ 1.2 │ 1 │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.3 │ 2 │ - │ - │ - │ - │ │ 1.4 │ 3 │ 144.25 KiB │ - │ 12.25 MiB │ 12.39 MiB │ │ 1.5 │ 4 │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.6 │ 5 │ - │ - │ - │ - │ │ 1.7 │ 6 │ - │ - │ 3.06 MiB │ 3.06 MiB │ │ 1.8 │ 7 │ 288.50 KiB │ - │ 6.12 MiB │ 6.41 MiB │ │ 1.9 │ 8 │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.10 │ 9 │ - │ - │ - │ - │ │ 1.11 │ 10 │ 576.50 KiB │ - │ 6.12 MiB │ 6.69 MiB │ │ 1.12 │ 11 │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.13 │ 12 │ - │ - │ - │ - │ │ 1.14 │ 13 │ - │ - │ 1.53 MiB │ 1.53 MiB │ │ 1.15 │ 14 │ 1.13 MiB │ - │ 3.06 MiB │ 4.19 MiB │ │ 1.16 │ 15 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.17 │ 16 │ - │ - │ - │ - │ │ 1.18 │ 17 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.19 │ 18 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.20 │ 19 │ - │ - │ - │ - │ │ 1.21 │ 20 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.22 │ 21 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.23 │ 22 │ - │ - │ - │ - │ │ 1.24 │ 23 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.25 │ 24 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.26 │ 25 │ - │ - │ - │ - │ │ 1.27 │ 26 │ - │ - │ 784 KiB │ 784 KiB │ │ 1.28 │ 27 │ 4.50 MiB │ - │ 1.53 MiB │ 6.03 MiB │ │ 1.29 │ 28 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.30 │ 29 │ - │ - │ - │ - │ │ 1.31 │ 30 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.32 │ 31 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.33 │ 32 │ - │ - │ - │ - │ │ 1.34 │ 33 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.35 │ 34 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.36 │ 35 │ - │ - │ - │ - │ │ 1.37 │ 36 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.38 │ 37 │ 4 KiB │ 4.01 KiB │ 
1.53 MiB │ 1.54 MiB │ │ 1.39 │ 38 │ - │ - │ - │ - │ │ 1.40 │ 39 │ - │ - │ 392 KiB │ 392 KiB │ │ 1.41 │ 40 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.42 │ 41 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.43 │ 42 │ - │ - │ - │ - │ │ 1.44 │ 43 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.45 │ 44 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.46 │ 45 │ - │ - │ - │ - │ │ 1.47 │ 46 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.48 │ 47 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.49 │ 48 │ - │ - │ - │ - │ │ 1.50 │ 49 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.51 │ 50 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.52 │ 51 │ - │ - │ - │ - │ │ 1.53 │ 52 │ - │ - │ 98 KiB │ 98 KiB │ │ 2 │ avgpool │ - │ - │ 98 KiB │ 98 KiB │ │ 3 │ classifier │ 471.66 MiB │ 0.00 │ 67.91 KiB │ 471.73 MiB │ │ 3.1 │ 0 │ 392.02 MiB │ - │ 16 KiB │ 392.03 MiB │ │ 3.2 │ 1 │ - │ - │ - │ - │ │ 3.3 │ 2 │ - │ - │ 16 KiB │ 16 KiB │ │ 3.4 │ 3 │ 64.02 MiB │ - │ 16 KiB │ 64.03 MiB │ │ 3.5 │ 4 │ - │ - │ - │ - │ │ 3.6 │ 5 │ - │ - │ 16 KiB │ 16 KiB │ │ 3.7 │ 6 │ 15.63 MiB │ - │ 3.91 KiB │ 15.63 MiB │ ╰──────┴────────────────┴────────────┴─────────────┴─────────────┴────────────╯ -------------------------------- s u m m a r y -------------------------------- • Model : VGG • Device : cuda:0 • Signature: forward(self, x) • Input : x = Shape([1, 3, 224, 224]) <Tensor> • Statistics: mem • Parameters Memory Cost: 548.09 MiB, 82.12 % • Buffers Memory Cost : 43.12 KiB, 0.01 % • FeatureMap Memory Cost: 119.31 MiB, 17.88 % • Total Memory Cost : 667.44 MiB
after customization, column set of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(f"original column set of mem report is: {model.table_cols('mem')}")
# Method 2
tb, data = model.profile(
    "mem",
    no_tree=True,
    pick_cols=[
        "ID",
        "Param Cost",
        "Buffer_Cost",
        "Output_Cost",
        "Total",
    ],
)
print(f"after customization, column set of mem report is: {model.table_cols('mem')}")
original column set of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
╭──────┬────────────┬─────────────┬─────────────┬────────────╮ │ ID │ Param Cost │ Buffer_Cost │ Output_Cost │ Total │ ├──────┼────────────┼─────────────┼─────────────┼────────────┤ │ 1 │ 76.43 MiB │ 43.12 KiB │ 119.15 MiB │ 195.62 MiB │ │ 1.1 │ 7 KiB │ - │ 12.25 MiB │ 12.26 MiB │ │ 1.2 │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.3 │ - │ - │ - │ - │ │ 1.4 │ 144.25 KiB │ - │ 12.25 MiB │ 12.39 MiB │ │ 1.5 │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ 1.6 │ - │ - │ - │ - │ │ 1.7 │ - │ - │ 3.06 MiB │ 3.06 MiB │ │ 1.8 │ 288.50 KiB │ - │ 6.12 MiB │ 6.41 MiB │ │ 1.9 │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.10 │ - │ - │ - │ - │ │ 1.11 │ 576.50 KiB │ - │ 6.12 MiB │ 6.69 MiB │ │ 1.12 │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ 1.13 │ - │ - │ - │ - │ │ 1.14 │ - │ - │ 1.53 MiB │ 1.53 MiB │ │ 1.15 │ 1.13 MiB │ - │ 3.06 MiB │ 4.19 MiB │ │ 1.16 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.17 │ - │ - │ - │ - │ │ 1.18 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.19 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.20 │ - │ - │ - │ - │ │ 1.21 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.22 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.23 │ - │ - │ - │ - │ │ 1.24 │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 1.25 │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ 1.26 │ - │ - │ - │ - │ │ 1.27 │ - │ - │ 784 KiB │ 784 KiB │ │ 1.28 │ 4.50 MiB │ - │ 1.53 MiB │ 6.03 MiB │ │ 1.29 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.30 │ - │ - │ - │ - │ │ 1.31 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.32 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.33 │ - │ - │ - │ - │ │ 1.34 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.35 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.36 │ - │ - │ - │ - │ │ 1.37 │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 1.38 │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ 1.39 │ - │ - │ - │ - │ │ 1.40 │ - │ - │ 392 KiB │ 392 KiB │ │ 1.41 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.42 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.43 │ - │ - │ - │ - │ │ 1.44 │ 9.00 MiB │ - │ 392 KiB │ 
9.38 MiB │ │ 1.45 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.46 │ - │ - │ - │ - │ │ 1.47 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.48 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.49 │ - │ - │ - │ - │ │ 1.50 │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 1.51 │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ 1.52 │ - │ - │ - │ - │ │ 1.53 │ - │ - │ 98 KiB │ 98 KiB │ │ 2 │ - │ - │ 98 KiB │ 98 KiB │ │ 3 │ 471.66 MiB │ 0.00 │ 67.91 KiB │ 471.73 MiB │ │ 3.1 │ 392.02 MiB │ - │ 16 KiB │ 392.03 MiB │ │ 3.2 │ - │ - │ - │ - │ │ 3.3 │ - │ - │ 16 KiB │ 16 KiB │ │ 3.4 │ 64.02 MiB │ - │ 16 KiB │ 64.03 MiB │ │ 3.5 │ - │ - │ - │ - │ │ 3.6 │ - │ - │ 16 KiB │ 16 KiB │ │ 3.7 │ 15.63 MiB │ - │ 3.91 KiB │ 15.63 MiB │ ╰──────┴────────────┴─────────────┴─────────────┴────────────╯ ----------------------- s u m m a r y ------------------------ • Model : VGG • Device : cuda:0 • Signature: forward(self, x) • Input : x = Shape([1, 3, 224, 224]) <Tensor> • Statistics: mem • Parameters Memory Cost: 548.09 MiB, 82.12 % • Buffers Memory Cost : 43.12 KiB, 0.01 % • FeatureMap Memory Cost: 119.31 MiB, 17.88 % • Total Memory Cost : 667.44 MiB
after customization, column set of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
F.c.3.4 Add a New Column¶
By defining the calculation logic for new column values, you can achieve online, real-time data analysis.
In other words, the table report is programmable.
❗️❗️❗️ You can control whether the new column is actually added to the underlying table with the keep_new_col argument.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
import polars as pl
from polars.series.series import ArrayLike
def newcol_logic(df: pl.DataFrame) -> ArrayLike:
    """Requirements for the function that generates a new column:
    1. It must take exactly one argument (name irrelevant) that receives a `polars.DataFrame`,
       which is the underlying datasheet of the report for the corresponding statistic.
       For safety reasons, the passed-in value is a copy of the original dataframe.
    2. It must return 1D array-like data such as a polars.Series, list, tuple, ndarray, etc.
    3. The length of the return value must match the row count of the passed-in dataframe,
       i.e. it should equal `len(df.rows())` in this example.

    Tips: For each entry in the table, you can obtain its raw data (see I.1 below) through the `val` attribute.
    - For `param`, `cal`, and `mem` data, this returns the value in the statistic's unit
      (see `Customization` tab → `Units in Raw Data Mode`).
    - For `ittp` data, it returns a tuple of (benchmark median, benchmark interquartile range).
    """
    col = df["Total"]
    return col.map_elements(
        lambda x: f"{100 * x / model.mem.TotalCost:.4f} %",
        return_dtype=str,
    )
print(f"original column set of mem report is: {model.table_cols('mem')}")
# Add a new column at the leftmost position to show the percentage of
# memory each operation uses in the model's total memory.
tb, data = model.profile(
    "mem",
    no_tree=True,
    newcol_name="Percentage",
    newcol_func=newcol_logic,  # a function that generates 1D array-like data
    newcol_type=str,  # data type of the new column
    # Position index of the new column. Negative indexing is allowed.
    # An out-of-range negative index places it at the far left;
    # an out-of-range positive index places it at the far right.
    newcol_idx=0,
    keep_new_col=True,  # whether to keep the new column from now on
)
# If keep_new_col=False, the following line would print
# the same column names as before.
print(f"after customization, column set of mem report is: {model.table_cols('mem')}")
original column set of mem report is: ('ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
╭────────────┬──────┬────────────────┬───────────────────┬────────────┬─────────────┬─────────────┬────────────╮ │ Percentage │ ID │ Operation_Name │ Operation_Type │ Param Cost │ Buffer_Cost │ Output_Cost │ Total │ ├────────────┼──────┼────────────────┼───────────────────┼────────────┼─────────────┼─────────────┼────────────┤ │ 29.3091 % │ 1 │ features │ Sequential │ 76.43 MiB │ 43.12 KiB │ 119.15 MiB │ 195.62 MiB │ │ 1.8364 % │ 1.1 │ 0 │ Conv2d │ 7 KiB │ - │ 12.25 MiB │ 12.26 MiB │ │ 1.8355 % │ 1.2 │ 1 │ BatchNorm2d │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ - │ 1.3 │ 2 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.8565 % │ 1.4 │ 3 │ Conv2d │ 144.25 KiB │ - │ 12.25 MiB │ 12.39 MiB │ │ 1.8355 % │ 1.5 │ 4 │ BatchNorm2d │ 512 B │ 520 B │ 12.25 MiB │ 12.25 MiB │ │ - │ 1.6 │ 5 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.4588 % │ 1.7 │ 6 │ MaxPool2d │ - │ - │ 3.06 MiB │ 3.06 MiB │ │ 0.9599 % │ 1.8 │ 7 │ Conv2d │ 288.50 KiB │ - │ 6.12 MiB │ 6.41 MiB │ │ 0.9180 % │ 1.9 │ 8 │ BatchNorm2d │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ - │ 1.10 │ 9 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.0020 % │ 1.11 │ 10 │ Conv2d │ 576.50 KiB │ - │ 6.12 MiB │ 6.69 MiB │ │ 0.9180 % │ 1.12 │ 11 │ BatchNorm2d │ 1 KiB │ 1.01 KiB │ 6.12 MiB │ 6.13 MiB │ │ - │ 1.13 │ 12 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.2294 % │ 1.14 │ 13 │ MaxPool2d │ - │ - │ 1.53 MiB │ 1.53 MiB │ │ 0.6275 % │ 1.15 │ 14 │ Conv2d │ 1.13 MiB │ - │ 3.06 MiB │ 4.19 MiB │ │ 0.4594 % │ 1.16 │ 15 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ - │ 1.17 │ 16 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.7961 % │ 1.18 │ 17 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 0.4594 % │ 1.19 │ 18 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ - │ 1.20 │ 19 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.7961 % │ 1.21 │ 20 │ Conv2d │ 2.25 MiB │ - │ 3.06 MiB │ 5.31 MiB │ │ 0.4594 % │ 1.22 │ 21 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ - │ 1.23 │ 22 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.7961 % │ 1.24 │ 23 │ Conv2d │ 2.25 MiB │ - │ 
3.06 MiB │ 5.31 MiB │ │ 0.4594 % │ 1.25 │ 24 │ BatchNorm2d │ 2 KiB │ 2.01 KiB │ 3.06 MiB │ 3.07 MiB │ │ - │ 1.26 │ 25 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.1147 % │ 1.27 │ 26 │ MaxPool2d │ - │ - │ 784 KiB │ 784 KiB │ │ 0.9039 % │ 1.28 │ 27 │ Conv2d │ 4.50 MiB │ - │ 1.53 MiB │ 6.03 MiB │ │ 0.2306 % │ 1.29 │ 28 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ - │ 1.30 │ 29 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.5781 % │ 1.31 │ 30 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 0.2306 % │ 1.32 │ 31 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ - │ 1.33 │ 32 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.5781 % │ 1.34 │ 33 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 0.2306 % │ 1.35 │ 34 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ - │ 1.36 │ 35 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.5781 % │ 1.37 │ 36 │ Conv2d │ 9.00 MiB │ - │ 1.53 MiB │ 10.53 MiB │ │ 0.2306 % │ 1.38 │ 37 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 1.53 MiB │ 1.54 MiB │ │ - │ 1.39 │ 38 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.0574 % │ 1.40 │ 39 │ MaxPool2d │ - │ - │ 392 KiB │ 392 KiB │ │ 1.4061 % │ 1.41 │ 40 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 0.0585 % │ 1.42 │ 41 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ - │ 1.43 │ 42 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.4061 % │ 1.44 │ 43 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 0.0585 % │ 1.45 │ 44 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ - │ 1.46 │ 45 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.4061 % │ 1.47 │ 46 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 0.0585 % │ 1.48 │ 47 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ - │ 1.49 │ 48 │ ReLU(inplace) │ - │ - │ - │ - │ │ 1.4061 % │ 1.50 │ 49 │ Conv2d │ 9.00 MiB │ - │ 392 KiB │ 9.38 MiB │ │ 0.0585 % │ 1.51 │ 50 │ BatchNorm2d │ 4 KiB │ 4.01 KiB │ 392 KiB │ 400.01 KiB │ │ - │ 1.52 │ 51 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.0143 % │ 1.53 │ 52 │ MaxPool2d │ - │ - │ 98 KiB │ 98 KiB │ │ 0.0143 % │ 2 │ avgpool │ 
AdaptiveAvgPool2d │ - │ - │ 98 KiB │ 98 KiB │ │ 70.6766 % │ 3 │ classifier │ Sequential │ 471.66 MiB │ 0.00 │ 67.91 KiB │ 471.73 MiB │ │ 58.7362 % │ 3.1 │ 0 │ Linear │ 392.02 MiB │ - │ 16 KiB │ 392.03 MiB │ │ - │ 3.2 │ 1 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.0023 % │ 3.3 │ 2 │ Dropout │ - │ - │ 16 KiB │ 16 KiB │ │ 9.5935 % │ 3.4 │ 3 │ Linear │ 64.02 MiB │ - │ 16 KiB │ 64.03 MiB │ │ - │ 3.5 │ 4 │ ReLU(inplace) │ - │ - │ - │ - │ │ 0.0023 % │ 3.6 │ 5 │ Dropout │ - │ - │ 16 KiB │ 16 KiB │ │ 2.3422 % │ 3.7 │ 6 │ Linear │ 15.63 MiB │ - │ 3.91 KiB │ 15.63 MiB │ ╰────────────┴──────┴────────────────┴───────────────────┴────────────┴─────────────┴─────────────┴────────────╯ ------------------------------------------------ s u m m a r y ------------------------------------------------- • Model : VGG • Statistics: mem • Device : cuda:0 • Parameters Memory Cost: 548.09 MiB, 82.12 % • Signature: forward(self, x) • Buffers Memory Cost : 43.12 KiB, 0.01 % • Input : • FeatureMap Memory Cost: 119.31 MiB, 17.88 % x = Shape([1, 3, 224, 224]) <Tensor> • Total Memory Cost : 667.44 MiB
after customization, column set of mem report is: ('Percentage', 'ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
G. Tabular Report Export¶
G.a Instant Export¶
Export the tabular report right after instant rendering.
This is very useful for the following immediate, non-permanent operations:
- renaming columns with keep_custom_name = False
- changing the order of columns
- deleting columns
- adding columns with keep_new_col = False
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
"""
About the arguments `save_to` and `save_format`:
1. `save_to`: If a directory path is given, `save_format` is needed to build the file path. The file name defaults to
   `<model-name>_<statistic-name>`. Note that the path doesn't need to exist in advance; `TorchMeter` will
   automatically create any missing intermediate folders for you.
2. `save_format`: should be a valid file extension. If `save_to` is a file path and `save_format` is given, the
   extension of the given file will be replaced by `save_format`. Currently, `TorchMeter` supports exporting the
   tabular report as a `.xlsx` or `.csv` file.
"""
tb, data = model.profile(
    "param",
    # If you just want to export the report instead of displaying it,
    # set `show=False` to avoid additional overhead.
    show=False,
    save_to="./param_report.xlsx",  # or csv
    save_format="xlsx",
)
Param data saved to /home/hzy/project/TorchMeter/param_report.xlsx
G.b Postponed Export¶
If you've measured a statistic, you can export its underlying datasheet whenever you want.
However, with this approach you can't customize the datasheet, e.g. reorder or rename columns.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# You can access the underlying dataframe of a statistic by querying the `table_renderer.stats_data` attribute
# with its name. Actually, `torchmeter.Meter.table_renderer.stats_data` maintains a dictionary to map the name of a
# statistic to its dataframe.
import polars as pl  # for the type annotation below

param_dataframe: pl.DataFrame = model.table_renderer.stats_data["param"]
# Then you can export the dataframe via the `torchmeter.Meter.table_renderer.export` method.
# You can specify the suffix of the file name with the `file_suffix` argument. Of course, you can save
# the raw data (see I.1) instead of the readable values by setting the `raw_data` argument to `True`.
model.table_renderer.export(
    df=param_dataframe,
    save_path=".",
    ext="csv",
    file_suffix="custom_suffix",
    raw_data=True,
)
Custom_suffix data saved to /home/hzy/project/TorchMeter/VGG_custom_suffix.csv
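If you do need a customized column layout from a postponed export, one workaround is to post-process the exported CSV with the standard library. A minimal sketch (`reorder_csv` is a hypothetical helper, not part of TorchMeter):

```python
import csv

def reorder_csv(src: str, dst: str, wanted: list) -> None:
    """Copy `src` to `dst`, keeping only the `wanted` columns, in that order."""
    with open(src, newline="") as f_in, open(dst, "w", newline="") as f_out:
        reader = csv.DictReader(f_in)
        # columns in each row that are not in `wanted` are silently dropped
        writer = csv.DictWriter(f_out, fieldnames=wanted, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow({k: row.get(k, "") for k in wanted})

# e.g. reorder_csv("VGG_custom_suffix.csv", "VGG_reordered.csv",
#                  ["Operation_Id", "Total", "Param_Cost"])
```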
H. Centralized Configuration Management¶
H.a List Current Configurations¶
from torchmeter import get_config
cfg = get_config()
# Just print it; the output is hierarchically organized.
print(cfg)
• Config file: None(default setting below) • render_interval: 0 | <int> • tree_fold_repeat: True | <bool> • tree_repeat_block_args: namespace{ │ title = [i]Repeat [[b]<repeat_time>[/b]] Times[/] | <str> │ title_align = center | <str> │ subtitle = None | <NoneType> │ subtitle_align = center | <str> │ style = dark_goldenrod | <str> │ highlight = True | <bool> │ box = HEAVY_EDGE | <str> │ border_style = dim | <str> │ width = None | <NoneType> │ height = None | <NoneType> │ padding = list( │ │ - 0 | <int> │ │ - 1 | <int> │ └─ ) │ expand = False | <bool> └─ } • tree_levels_args: namespace{ │ default = namespace{ │ │ label = [b gray35](<node_id>) [green]<name>[/green] [cyan]<type>[/] | <str> │ │ style = tree | <str> │ │ guide_style = light_coral | <str> │ │ highlight = True | <bool> │ │ hide_root = False | <bool> │ │ expanded = True | <bool> │ └─ } │ 0 = namespace{ │ │ label = [b light_coral]<name>[/] | <str> │ │ guide_style = light_coral | <str> │ └─ } └─ } • table_column_args: namespace{ │ style = none | <str> │ justify = center | <str> │ vertical = middle | <str> │ overflow = fold | <str> │ no_wrap = False | <bool> └─ } • table_display_args: namespace{ │ style = spring_green4 | <str> │ highlight = True | <bool> │ width = None | <NoneType> │ min_width = None | <NoneType> │ expand = False | <bool> │ padding = list( │ │ - 0 | <int> │ │ - 1 | <int> │ └─ ) │ collapse_padding = False | <bool> │ pad_edge = True | <bool> │ leading = 0 | <int> │ title = None | <NoneType> │ title_style = bold | <str> │ title_justify = center | <str> │ caption = None | <NoneType> │ caption_style = None | <NoneType> │ caption_justify = center | <str> │ show_header = True | <bool> │ header_style = bold | <str> │ show_footer = False | <bool> │ footer_style = italic | <str> │ show_lines = False | <bool> │ row_styles = None | <NoneType> │ show_edge = True | <bool> │ box = ROUNDED | <str> │ safe_box = True | <bool> │ border_style = None | <NoneType> └─ } • combine: namespace{ │ horizon_gap = 2 | <int> 
└─ }
H.b Retrieve Specific Settings¶
# Context
# ------------------------------
# cfg: the global config object
# access a setting the same way you would access an attribute
print(
f"config_file: {cfg.config_file}",
f"render time interval: {cfg.render_interval}",
f"tree default guide line style: {cfg.tree_levels_args.default.guide_style}",
f"table col justify: {cfg.table_column_args.justify}",
f"gap between tree and table in profiling: {cfg.combine.horizon_gap}",
sep="\n"
)
config_file: None
render time interval: 0
tree default guide line style: light_coral
table col justify: center
gap between tree and table in profiling: 2
H.c Change Specific Settings¶
# Context
# ------------------------------
# cfg: the global config object
origin_val = {
"render_interval": cfg.render_interval,
"tree_levels_args.default.guide_style": cfg.tree_levels_args.default.guide_style,
"table_display_args.highlight": cfg.table_display_args.highlight,
"table_display_args.show_edge": cfg.table_display_args.show_edge
}
# You can modify configuration items one by one in this way (i.e. attribute access)
cfg.render_interval = 0.1
cfg.tree_levels_args.default.guide_style = "red"
# For configuration items that have sub-configurations, you can apply batch modifications with a dictionary.
# At the top level, the items with sub-configurations are (see the structure in H.a):
# `tree_repeat_block_args`, `tree_levels_args`, `table_column_args`, `table_display_args` and `combine`
cfg.table_display_args = {
"highlight": False,
"show_edge": False
}
from operator import attrgetter
for i, v in origin_val.items():
print(f"{i}: {v} -> {attrgetter(i)(cfg)}")
render_interval: 0 -> 0.1
tree_levels_args.default.guide_style: light_coral -> red
table_display_args.highlight: True -> False
table_display_args.show_edge: True -> False
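As a side note, the `attrgetter` trick used above is all you need to read any nested setting from a dotted path. A minimal, self-contained sketch of how it resolves such a path (`cfg_mock` is a `SimpleNamespace` stand-in for the real config object, so this runs without torchmeter):

```python
from functools import reduce
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical stand-in for the real config object, for illustration only
cfg_mock = SimpleNamespace(
    render_interval=0,
    tree_levels_args=SimpleNamespace(default=SimpleNamespace(guide_style="light_coral")),
)

# attrgetter resolves the dotted path in one call...
assert attrgetter("tree_levels_args.default.guide_style")(cfg_mock) == "light_coral"

# ...which is equivalent to folding getattr over the path segments
def get_dotted(obj, path):
    return reduce(getattr, path.split("."), obj)

print(get_dotted(cfg_mock, "tree_levels_args.default.guide_style"))  # → light_coral
```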
H.d Dump to Disk¶
You can dump all the configurations as a yaml file for sharing or reloading in a new session.
# Context
# ------------------------------
# cfg: the global config object
des = "./my_config.yaml"
cfg.dump(save_path=des)
import os
abs_des = os.path.abspath(des)
if os.path.exists(abs_des):
print(f"config dumped successfully to {abs_des}")
config dumped successfully to /home/hzy/project/TorchMeter/my_config.yaml
H.e Restore Configuration¶
Messed up the configurations? Don't worry, you can restore them to the values in the loaded file.
If the config object was not created by loading a yaml file, the default values in Default Configuration we've prepared for you will be used.
# Context
# ------------------------------
# cfg: the global config object
origin_val = {
"render_interval": cfg.render_interval,
"tree_levels_args.default.guide_style": cfg.tree_levels_args.default.guide_style,
"table_display_args.highlight": cfg.table_display_args.highlight,
"table_display_args.show_edge": cfg.table_display_args.show_edge
}
# Since the config object was not created from a yaml file,
# `restore()` will reset all configurations to the default values we provide.
cfg.restore()
from operator import attrgetter
for i, v in origin_val.items():
print(f"{i}: {v} -> {attrgetter(i)(cfg)}")
render_interval: 0.1 -> 0.15
tree_levels_args.default.guide_style: red -> light_coral
table_display_args.highlight: False -> True
table_display_args.show_edge: False -> True
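Conceptually, `restore()` behaves like re-applying a snapshot of the values the config object started with. A toy sketch of that idea (not torchmeter internals; `RestorableConfig`, `snapshot` and the mock values are illustrative only):

```python
import copy
from types import SimpleNamespace

class RestorableConfig(SimpleNamespace):
    """Toy config that can roll back to a previously taken snapshot."""

    def snapshot(self):
        # record the current values as the ones to restore later
        self._defaults = copy.deepcopy({k: v for k, v in vars(self).items() if k != "_defaults"})

    def restore(self):
        # re-apply the recorded values, discarding later modifications
        vars(self).update(copy.deepcopy(self._defaults))

cfg_mock = RestorableConfig(render_interval=0, highlight=True)
cfg_mock.snapshot()

cfg_mock.render_interval = 0.1   # mess things up...
cfg_mock.restore()               # ...and roll back
print(cfg_mock.render_interval)  # → 0
```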
H.f Reload and Overwrite¶
With the yaml file exported in H.d, you can easily mirror a config in another session. Of course, you can also overwrite the config in the current session with the settings in the yaml file.
# Context
# ------------------------------
# cfg: the global config object
from torchmeter import get_config
origin_val = {
"render_interval": cfg.render_interval,
"tree_levels_args.default.guide_style": cfg.tree_levels_args.default.guide_style,
"table_display_args.highlight": cfg.table_display_args.highlight,
"table_display_args.show_edge": cfg.table_display_args.show_edge
}
reload_cfg = get_config(config_path=abs_des)
from operator import attrgetter
for i, v in origin_val.items():
print(f"{i}: {v} -> {attrgetter(i)(cfg)}")
render_interval: 0.15 -> 0.1
tree_levels_args.default.guide_style: light_coral -> red
table_display_args.highlight: True -> False
table_display_args.show_edge: True -> False
I. Others¶
# Before proceeding, discard the previous customization settings
cfg.restore()
# Disable interval output to adapt to Jupyter Notebook
cfg.render_interval = 0
I.1 Raw Data Mode¶
When this mode is enabled, the statistics are reported in the raw units in which they were measured.
For details, see Unit Explanation.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
tb, data = model.profile("param", no_tree=True, raw_data=True)
Operation_Id │ Operation_Name │ Operation_Type │ Param_Name │ Requires_Grad │ Numeric_Num
─────────────┼────────────────┼───────────────────┼────────────┼───────────────┼────────────
1    │ features   │ Sequential        │ -      │ -     │ 0.0
1.1  │ 0          │ Conv2d            │ weight │ False │ 1728.0
1.1  │ 0          │ Conv2d            │ bias   │ False │ 64.0
1.2  │ 1          │ BatchNorm2d       │ weight │ False │ 64.0
1.2  │ 1          │ BatchNorm2d       │ bias   │ False │ 64.0
1.3  │ 2          │ ReLU              │ -      │ -     │ 0.0
1.4  │ 3          │ Conv2d            │ weight │ False │ 36864.0
1.4  │ 3          │ Conv2d            │ bias   │ False │ 64.0
1.5  │ 4          │ BatchNorm2d       │ weight │ False │ 64.0
1.5  │ 4          │ BatchNorm2d       │ bias   │ False │ 64.0
1.6  │ 5          │ ReLU              │ -      │ -     │ 0.0
1.7  │ 6          │ MaxPool2d         │ -      │ -     │ 0.0
1.8  │ 7          │ Conv2d            │ weight │ False │ 73728.0
1.8  │ 7          │ Conv2d            │ bias   │ False │ 128.0
1.9  │ 8          │ BatchNorm2d       │ weight │ False │ 128.0
1.9  │ 8          │ BatchNorm2d       │ bias   │ False │ 128.0
1.10 │ 9          │ ReLU              │ -      │ -     │ 0.0
1.11 │ 10         │ Conv2d            │ weight │ False │ 147456.0
1.11 │ 10         │ Conv2d            │ bias   │ False │ 128.0
1.12 │ 11         │ BatchNorm2d       │ weight │ False │ 128.0
1.12 │ 11         │ BatchNorm2d       │ bias   │ False │ 128.0
1.13 │ 12         │ ReLU              │ -      │ -     │ 0.0
1.14 │ 13         │ MaxPool2d         │ -      │ -     │ 0.0
1.15 │ 14         │ Conv2d            │ weight │ False │ 294912.0
1.15 │ 14         │ Conv2d            │ bias   │ False │ 256.0
1.16 │ 15         │ BatchNorm2d       │ weight │ False │ 256.0
1.16 │ 15         │ BatchNorm2d       │ bias   │ False │ 256.0
1.17 │ 16         │ ReLU              │ -      │ -     │ 0.0
1.18 │ 17         │ Conv2d            │ weight │ False │ 589824.0
1.18 │ 17         │ Conv2d            │ bias   │ False │ 256.0
1.19 │ 18         │ BatchNorm2d       │ weight │ False │ 256.0
1.19 │ 18         │ BatchNorm2d       │ bias   │ False │ 256.0
1.20 │ 19         │ ReLU              │ -      │ -     │ 0.0
1.21 │ 20         │ Conv2d            │ weight │ False │ 589824.0
1.21 │ 20         │ Conv2d            │ bias   │ False │ 256.0
1.22 │ 21         │ BatchNorm2d       │ weight │ False │ 256.0
1.22 │ 21         │ BatchNorm2d       │ bias   │ False │ 256.0
1.23 │ 22         │ ReLU              │ -      │ -     │ 0.0
1.24 │ 23         │ Conv2d            │ weight │ False │ 589824.0
1.24 │ 23         │ Conv2d            │ bias   │ False │ 256.0
1.25 │ 24         │ BatchNorm2d       │ weight │ False │ 256.0
1.25 │ 24         │ BatchNorm2d       │ bias   │ False │ 256.0
1.26 │ 25         │ ReLU              │ -      │ -     │ 0.0
1.27 │ 26         │ MaxPool2d         │ -      │ -     │ 0.0
1.28 │ 27         │ Conv2d            │ weight │ False │ 1179648.0
1.28 │ 27         │ Conv2d            │ bias   │ False │ 512.0
1.29 │ 28         │ BatchNorm2d       │ weight │ False │ 512.0
1.29 │ 28         │ BatchNorm2d       │ bias   │ False │ 512.0
1.30 │ 29         │ ReLU              │ -      │ -     │ 0.0
1.31 │ 30         │ Conv2d            │ weight │ False │ 2359296.0
1.31 │ 30         │ Conv2d            │ bias   │ False │ 512.0
1.32 │ 31         │ BatchNorm2d       │ weight │ False │ 512.0
1.32 │ 31         │ BatchNorm2d       │ bias   │ False │ 512.0
1.33 │ 32         │ ReLU              │ -      │ -     │ 0.0
1.34 │ 33         │ Conv2d            │ weight │ False │ 2359296.0
1.34 │ 33         │ Conv2d            │ bias   │ False │ 512.0
1.35 │ 34         │ BatchNorm2d       │ weight │ False │ 512.0
1.35 │ 34         │ BatchNorm2d       │ bias   │ False │ 512.0
1.36 │ 35         │ ReLU              │ -      │ -     │ 0.0
1.37 │ 36         │ Conv2d            │ weight │ False │ 2359296.0
1.37 │ 36         │ Conv2d            │ bias   │ False │ 512.0
1.38 │ 37         │ BatchNorm2d       │ weight │ False │ 512.0
1.38 │ 37         │ BatchNorm2d       │ bias   │ False │ 512.0
1.39 │ 38         │ ReLU              │ -      │ -     │ 0.0
1.40 │ 39         │ MaxPool2d         │ -      │ -     │ 0.0
1.41 │ 40         │ Conv2d            │ weight │ False │ 2359296.0
1.41 │ 40         │ Conv2d            │ bias   │ False │ 512.0
1.42 │ 41         │ BatchNorm2d       │ weight │ False │ 512.0
1.42 │ 41         │ BatchNorm2d       │ bias   │ False │ 512.0
1.43 │ 42         │ ReLU              │ -      │ -     │ 0.0
1.44 │ 43         │ Conv2d            │ weight │ False │ 2359296.0
1.44 │ 43         │ Conv2d            │ bias   │ False │ 512.0
1.45 │ 44         │ BatchNorm2d       │ weight │ False │ 512.0
1.45 │ 44         │ BatchNorm2d       │ bias   │ False │ 512.0
1.46 │ 45         │ ReLU              │ -      │ -     │ 0.0
1.47 │ 46         │ Conv2d            │ weight │ False │ 2359296.0
1.47 │ 46         │ Conv2d            │ bias   │ False │ 512.0
1.48 │ 47         │ BatchNorm2d       │ weight │ False │ 512.0
1.48 │ 47         │ BatchNorm2d       │ bias   │ False │ 512.0
1.49 │ 48         │ ReLU              │ -      │ -     │ 0.0
1.50 │ 49         │ Conv2d            │ weight │ False │ 2359296.0
1.50 │ 49         │ Conv2d            │ bias   │ False │ 512.0
1.51 │ 50         │ BatchNorm2d       │ weight │ False │ 512.0
1.51 │ 50         │ BatchNorm2d       │ bias   │ False │ 512.0
1.52 │ 51         │ ReLU              │ -      │ -     │ 0.0
1.53 │ 52         │ MaxPool2d         │ -      │ -     │ 0.0
2    │ avgpool    │ AdaptiveAvgPool2d │ -      │ -     │ 0.0
3    │ classifier │ Sequential        │ -      │ -     │ 0.0
3.1  │ 0          │ Linear            │ weight │ True  │ 102760448.0
3.1  │ 0          │ Linear            │ bias   │ True  │ 4096.0
3.2  │ 1          │ ReLU              │ -      │ -     │ 0.0
3.3  │ 2          │ Dropout           │ -      │ -     │ 0.0
3.4  │ 3          │ Linear            │ weight │ True  │ 16777216.0
3.4  │ 3          │ Linear            │ bias   │ True  │ 4096.0
3.5  │ 4          │ ReLU              │ -      │ -     │ 0.0
3.6  │ 5          │ Dropout           │ -      │ -     │ 0.0
3.7  │ 6          │ Linear            │ weight │ True  │ 4096000.0
3.7  │ 6          │ Linear            │ bias   │ True  │ 1000.0
--------------------------------------- s u m m a r y ----------------------------------------
• Model : VGG
• Statistics: param
• Device : cuda:0
• Learnable Parameters Num: 123.64 M
• Total Parameters Num : 143.68 M
• Signature: forward(self, x)
• Input : x = Shape([1, 3, 224, 224]) <Tensor>
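For reference, the relation between these raw counts and the human-readable figures shown when `raw_data=False` (e.g. the `143.68 M` in the summary) is a plain factor-of-1000 scaling. A minimal sketch of such a conversion (`humanize` is a hypothetical helper, not part of the torchmeter API):

```python
# Hypothetical helper, for illustration only: convert a raw count into
# the kind of human-readable figure shown when raw_data=False
def humanize(n: float) -> str:
    for unit in ("", " K", " M", " G"):
        if n < 1000:
            return f"{n:.2f}{unit}"
        n /= 1000
    return f"{n:.2f} T"

print(humanize(1728.0))       # → 1.73 K   (weight count of the first Conv2d)
print(humanize(143_680_000))  # → 143.68 M (≈ total parameter count in the summary)
```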
I.2 Quickview of Column Names¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
for s in ("param", "cal", "mem", "ittp"):
print(f"Default column set of {s} report is: \n{model.table_cols(s)}")
Default column set of param report is: ('Operation_Id', 'Operation_Name', 'Operation_Type', 'Param_Name', 'Requires_Grad', 'Numeric_Num')
Default column set of cal report is: ('Operation_Id', 'Operation_Name', 'Operation_Type', 'Kernel_Size', 'Bias', 'Input', 'Output', 'MACs', 'FLOPs')
Default column set of mem report is: ('Percentage', 'ID', 'Operation_Name', 'Operation_Type', 'Param Cost', 'Buffer_Cost', 'Output_Cost', 'Total')
Default column set of ittp report is: ('Operation_Id', 'Operation_Name', 'Operation_Type', 'Infer_Time', 'Throughput')
I.3 Model Migration¶
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
print(f"model now on {model.device}")
model.to("cpu")
print(f"model now on {model.device}")
model.device = "cuda:0"
print(f"model now on {model.device}")
model now on cuda:0
model now on cpu
model now on cuda:0
I.4 Submodule Explore¶
Sometimes, we want to explore a specific submodule of a model to evaluate its performance.
In this case, we can use the rebase
method in conjunction with the subnodes
property to narrow down the model analysis scope to any submodule.
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# outputs a list of strings, each representing a node in the operation tree with format "(node-id) layer-name"
print(model.subnodes)
[ '(0) VGG', '(1) features', '(2) avgpool', '(3) classifier', '(1.1) 0', '(1.2) 1', '(1.3) 2', '(1.4) 3', '(1.5) 4', '(1.6) 5', '(1.7) 6', '(1.8) 7', '(1.9) 8', '(1.10) 9', '(1.11) 10', '(1.12) 11', '(1.13) 12', '(1.14) 13', '(1.15) 14', '(1.16) 15', '(1.17) 16', '(1.18) 17', '(1.19) 18', '(1.20) 19', '(1.21) 20', '(1.22) 21', '(1.23) 22', '(1.24) 23', '(1.25) 24', '(1.26) 25', '(1.27) 26', '(1.28) 27', '(1.29) 28', '(1.30) 29', '(1.31) 30', '(1.32) 31', '(1.33) 32', '(1.34) 33', '(1.35) 34', '(1.36) 35', '(1.37) 36', '(1.38) 37', '(1.39) 38', '(1.40) 39', '(1.41) 40', '(1.42) 41', '(1.43) 42', '(1.44) 43', '(1.45) 44', '(1.46) 45', '(1.47) 46', '(1.48) 47', '(1.49) 48', '(1.50) 49', '(1.51) 50', '(1.52) 51', '(1.53) 52', '(3.1) 0', '(3.2) 1', '(3.3) 2', '(3.4) 3', '(3.5) 4', '(3.6) 5', '(3.7) 6' ]
# Context
# --------------------------------------------------------------------------------
# model: Instance of `torchmeter.Meter` which acts like a decorator of your model
# Pass a node ID, i.e. the parenthesized part of an item in the output of `torchmeter.Meter.subnodes` shown above.
classify_head = model.rebase("3")
# now the model analysis scope changes to its submodule —— the `classifier`
print(classify_head.structure)
Finish Scanning model in 0.0017 seconds
Sequential
├── (1) 0 Linear
├── (2) 1 ReLU
├── (3) 2 Dropout
├── (4) 3 Linear
├── (5) 4 ReLU
├── (6) 5 Dropout
└── (7) 6 Linear