# TorchMeter: Your All-in-One Tool for Pytorch Model Analysis
- Repo: https://github.com/TorchMeter/torchmeter
- Intro: Provides comprehensive measurement of Pytorch models' `Parameters`, `FLOPs`/`MACs`, `Memory-Cost`, `Inference-Time` and `Throughput`, with a highly customizable result display ✨
## 1. Highlights
### Zero-Intrusion Proxy

- Acts as a drop-in decorator, requiring no changes to the underlying model
- Seamlessly integrates with Pytorch modules while preserving full compatibility (attributes and methods)
### Full-Stack Model Analytics

Holistic performance analytics across 5 dimensions:

1. **Parameter Analysis**
    - Total/trainable parameter quantification
    - Layer-wise parameter distribution analysis
    - Gradient state tracking (`requires_grad` flags)
2. **Computational Profiling**
    - Precise FLOPs/MACs calculation
    - Operation-wise calculation distribution analysis
    - Dynamic input/output detection (number, type, shape, ...)
3. **Memory Diagnostics**
    - Input/output tensor memory awareness
    - Hierarchical memory consumption analysis
4. **Inference Latency** & 5. **Throughput Benchmarking**
    - Automatic warm-up phase execution (eliminates cold-start bias)
    - Device-specific high-precision timing
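For intuition about the computational-profiling numbers, the MACs of a single `Conv2d` layer can be estimated by hand: each output element costs `C_in × K_h × K_w` multiply-accumulates, and FLOPs are commonly counted as twice the MACs. A minimal pure-Python sketch (a conventional back-of-the-envelope formula, independent of TorchMeter's actual counting rules):

```python
def conv2d_macs(c_in, c_out, k_h, k_w, h_out, w_out):
    """Estimate multiply-accumulate operations of a Conv2d layer.

    Each of the c_out * h_out * w_out output elements needs
    c_in * k_h * k_w multiply-accumulates (bias ignored).
    """
    return c_out * h_out * w_out * c_in * k_h * k_w

# Conv2d(3, 10, 3, stride=1, padding=1) on a 1x3x32x32 input
# keeps the 32x32 spatial size, so:
macs = conv2d_macs(c_in=3, c_out=10, k_h=3, k_w=3, h_out=32, w_out=32)
flops = 2 * macs  # the common "FLOPs = 2 x MACs" convention

print(macs)   # 276480
print(flops)  # 552960
```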
### Rich Visualization

- **Programmable tabular report**
    - Dynamic table structure adjustment
    - Style customization and real-time rendering
    - Real-time, programmable data analysis
- **Rich-text hierarchical operation tree**
    - Style customization and real-time rendering
    - Smart module folding based on structural-equivalence detection, for intuitive insight into the model structure
### Fine-Grained Customization

- **Real-time hot-reload rendering**: dynamic adjustment of the rendering configuration for operation trees, report tables and their nested components
- **Progressive update**: namespace assignment + dictionary batch update
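The two update styles above can be illustrated with a generic config object (hypothetical, not TorchMeter's actual API): single options are tweaked by namespace assignment, while several options change at once through a dictionary batch update.

```python
from types import SimpleNamespace

class Config(SimpleNamespace):
    """A toy config namespace supporting dict-based batch updates."""

    def update(self, options: dict) -> None:
        # Batch update: apply many option changes in one call
        for key, value in options.items():
            setattr(self, key, value)

# Hypothetical rendering options, for illustration only
cfg = Config(tree_fold=True, table_style="plain", precision=2)

# Progressive update via namespace assignment
cfg.precision = 4

# Batch update via a dictionary
cfg.update({"tree_fold": False, "table_style": "rounded"})

print(cfg.precision, cfg.tree_fold, cfg.table_style)
```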
### Config-Driven Runtime Management

- **Centralized control**: singleton-managed global configuration for dynamic behavior adjustment
- **Portable presets**: export/import YAML profiles for runtime behaviors, eliminating repetitive setup
### Portability and Practicality

- **Decoupled pipeline**: separation of data collection and visualization
- **Automatic device synchronization**: maintains production-ready status by keeping model and data co-located
- **Dual-mode reporting with export flexibility**:
    - Measurement-units mode vs. raw-data mode
    - Multi-format export (`CSV`/`Excel`) for analysis integration
## 2. Installation
### Compatibility

- OS: Windows / Linux / macOS
- Python: >= 3.8
- Pytorch: >= 1.7.0
### Install via Python Package Manager

The most convenient way; suitable for installing the latest released stable version.
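Assuming the package is published on PyPI under the name `torchmeter` (as the repository name suggests), the install is a single command:

```shell
pip install torchmeter
```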
### Install from Binary Distribution

Suitable for installing released historical versions.

1. Download the `.whl` file from PyPI or GitHub Releases.
2. Install the downloaded wheel locally with pip.
    - 🙋‍♂️ Replace `x.x.x` in the wheel filename with the actual version.
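For example (the wheel filename here is illustrative; the actual name depends on the version and build tags of the file you downloaded):

```shell
# 🙋‍♂️ Replace x.x.x with the actual version you downloaded
pip install torchmeter-x.x.x-py3-none-any.whl
```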
### Install from Source Code

Suitable for those who want to try out upcoming features (may contain unknown bugs).

```bash
git clone https://github.com/TorchMeter/torchmeter.git
cd torchmeter

# If you want to install the released stable version, use this:
git checkout vx.x.x  # stable version

# If you want to try the latest development version (alpha/beta), use this:
git checkout master  # development version

pip install .
```

- 🙋‍♂️ Don't forget to replace `x.x.x` with the actual version. You can check all available versions with `git tag -l`.
## 3. Getting Started
### Decorate your model with TorchMeter

Implementation of `ExampleNet`:

```python
import torch.nn as nn

class ExampleNet(nn.Module):
    def __init__(self):
        super(ExampleNet, self).__init__()
        self.backbone = nn.Sequential(
            self._nested_repeat_block(2),
            self._nested_repeat_block(2)
        )
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(3, 2)

    def _inner_net(self):
        return nn.Sequential(
            nn.Conv2d(10, 10, 1),
            nn.BatchNorm2d(10),
            nn.ReLU(),
        )

    def _nested_repeat_block(self, repeat: int = 1):
        inners = [self._inner_net() for _ in range(repeat)]
        return nn.Sequential(
            nn.Conv2d(3, 10, 3, stride=1, padding=1),
            nn.BatchNorm2d(10),
            nn.ReLU(),
            *inners,
            nn.Conv2d(10, 3, 1),
            nn.BatchNorm2d(3),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.backbone(x)
        x = self.gap(x)
        x = x.squeeze(dim=(2, 3))
        return self.classifier(x)
```
```python
import torch.nn as nn
from torchmeter import Meter
from torch.cuda import is_available as is_cuda

# 1️⃣ Prepare your Pytorch model; here is a simple example
underlying_model = ExampleNet()  # (1)

# Set an extra attribute on the model to show
# how torchmeter acts as a zero-intrusion proxy later
underlying_model.example_attr = "ABC"

# 2️⃣ Wrap your model with torchmeter
model = Meter(underlying_model)

# 3️⃣ Validate the zero-intrusion proxy

# Access the model's attribute
print(model.example_attr)

# Access the model's method
# `_inner_net` is a method defined in ExampleNet
print(hasattr(model, "_inner_net"))

# Move the model to another device (now on cpu)
print(model)
if is_cuda():
    model.to("cuda")
    print(model)  # now on cuda
```

- 🙋‍♂️ (1) See above for the implementation of `ExampleNet`.
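The zero-intrusion behaviour demonstrated above resembles Python's standard attribute-delegation pattern. A simplified, self-contained sketch (not TorchMeter's actual implementation) shows how a wrapper can forward unknown attributes to the wrapped object:

```python
class Proxy:
    """Forward attribute access to a wrapped object (illustrative only)."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Called only when `name` is not found on the proxy itself,
        # so the wrapped object's attributes and methods shine through
        return getattr(self._wrapped, name)

class Model:
    example_attr = "ABC"

    def forward(self):
        return "output"

proxy = Proxy(Model())
print(proxy.example_attr)  # ABC
print(proxy.forward())     # output
```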
### Get feedback from the model instantly

#### Directly fetch measurements with concise interfaces
```python
# Parameter Analysis
# Suppose that the `backbone` part of ExampleNet is frozen
_ = model.backbone.requires_grad_(False)
print(model.param)
tb, data = model.profile('param', no_tree=True)

# Before measuring calculation, you should first execute a feed-forward pass
import torch
input = torch.randn(1, 3, 32, 32)
output = model(input)  # (1)

# Computational Profiling
print(model.cal)  # (2)
tb, data = model.profile('cal', no_tree=True)

# Memory Diagnostics
print(model.mem)  # (3)
tb, data = model.profile('mem', no_tree=True)

# Performance Benchmarking
print(model.ittp)  # (4)
tb, data = model.profile('ittp', no_tree=True)

# Overall Analytics
print(model.overview())
```

- 🙋‍♂️ (1) You do not need to worry about device mismatch; just feed the model with the input.
- 🙋‍♂️ (2) `cal` stands for calculation.
- 🙋‍♂️ (3) `mem` stands for memory.
- 🙋‍♂️ (4) `ittp` stands for inference time & throughput.
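As a quick cross-check for the parameter report, the count for the `classifier` head can be derived by hand: an `nn.Linear(3, 2)` layer holds a 2×3 weight matrix plus 2 biases. In pure Python:

```python
def linear_param_count(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameter count of a fully connected layer."""
    weights = in_features * out_features
    return weights + (out_features if bias else 0)

# ExampleNet's classifier is nn.Linear(3, 2)
print(linear_param_count(3, 2))  # 8  (6 weights + 2 biases)
```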
### Explore further for deeper analysis

#### Advanced usage

- Attribute/method access of the underlying model
- Automatic device synchronization
- Smart module folding
- Performance gallery
- Customized visualization
- Best practices for the programmable tabular report
- Instant export and postponed export
- Centralized configuration management
- Submodule exploration
## 4. Contribute

Thank you for wanting to make TorchMeter even better!

There are several ways to make a contribution. Before jumping in, please review our contribution guidelines first to ensure smooth collaboration.

Thanks again!
## 5. Code of Conduct

Refer to the official code-of-conduct file for more details.

- TorchMeter is an open-source project built by developers worldwide. We're committed to fostering a friendly, safe, and inclusive environment for all participants.
- This code applies to all community spaces, including but not limited to GitHub repositories, community forums, etc.
## 6. License

- TorchMeter is released under the AGPL-3.0 License; see the LICENSE file for the full text.
- Please carefully review the terms in the LICENSE file before using or distributing TorchMeter.
- Ensure compliance with the licensing conditions, especially when integrating this project into larger systems or proprietary software.