
Tensorboard
Wire TensorBoard logging into PyTorch, TensorFlow, Lightning, HuggingFace, and other training stacks so experiment metrics and graphs are visible while you train.
Install
npx skills add https://github.com/orchestra-research/ai-research-skills --skill tensorboard
What is this skill?
- End-to-end PyTorch training loop with batch and epoch scalar logging plus parameter histograms
- Table-of-contents coverage for 7+ stacks: PyTorch, TensorFlow/Keras, PyTorch Lightning, HuggingFace Transformers, Fast.a
- Model graph logging via add_graph with dummy inputs for architecture visualization
- torchvision and framework-specific patterns for consistent run directories under runs/
- Operational reuse: reopen TensorBoard against saved runs to compare experiments after training
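Comparing experiments after training works best when every run writes to its own subdirectory, because TensorBoard treats each subdirectory of the log root that contains event files as a separate run. A minimal sketch of one possible naming scheme follows. The helper name `make_run_dir` and the `runs/<experiment>/<timestamp>` layout are illustrative assumptions, not part of the skill itself.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_run_dir(experiment: str, root: str = "runs",
                 now: Optional[datetime] = None) -> Path:
    """Build a sortable log path: <root>/<experiment>/<YYYYmmdd-HHMMSS>.

    Hypothetical helper: it only computes the path and does not create it.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return Path(root) / experiment / stamp


# A fixed timestamp is passed here only to make the example reproducible.
run_dir = make_run_dir("pytorch_experiment", now=datetime(2024, 1, 2, 3, 4, 5))
print(run_dir.as_posix())  # runs/pytorch_experiment/20240102-030405
```

The resulting path can be passed as a string to `SummaryWriter(str(run_dir))`, and pointing TensorBoard at the root with `tensorboard --logdir runs` should then list each timestamped run side by side.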
Adoption & trust: 1 install on skills.sh; 9.4k GitHub stars; 3/3 security scanners passed (skills.sh audits).
Journey fit
Training loops and SummaryWriter setup happen while you are building ML features and pipelines, which is the canonical Build shelf for framework hookups. The skill is explicitly a multi-framework integration guide (writers, scalars, histograms, graphs), not a standalone model architecture tutorial.
Common Questions / FAQ
Is Tensorboard safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md - Tensorboard
# Framework Integration Guide

Complete guide to integrating TensorBoard with popular ML frameworks.

## Table of Contents

- PyTorch
- TensorFlow/Keras
- PyTorch Lightning
- HuggingFace Transformers
- Fast.ai
- JAX
- scikit-learn

## PyTorch

### Basic Integration

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter
from torchvision.models import resnet50

# Create writer
writer = SummaryWriter('runs/pytorch_experiment')

# Model and optimizer
model = resnet50()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

# Log model graph
dummy_input = torch.randn(1, 3, 224, 224)
writer.add_graph(model, dummy_input)

# Training loop (train_loader is assumed to be an existing DataLoader)
for epoch in range(100):
    model.train()
    train_loss = 0.0

    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

        train_loss += loss.item()

        # Log batch metrics
        if batch_idx % 100 == 0:
            global_step = epoch * len(train_loader) + batch_idx
            writer.add_scalar('Loss/train_batch', loss.item(), global_step)

    # Epoch metrics
    train_loss /= len(train_loader)
    writer.add_scalar('Loss/train_epoch', train_loss, epoch)

    # Log histograms
    for name, param in model.named_parameters():
        writer.add_histogram(name, param, epoch)

writer.close()
```

### torchvision Integration

```python
from torchvision.utils import make_grid

# Log image batch (run inside the epoch loop; reuses writer, train_loader, and epoch)
for batch_idx, (images, labels) in enumerate(train_loader):
    if batch_idx == 0:  # First batch
        img_grid = make_grid(images[:64], nrow=8)
        writer.add_image('Training_batch', img_grid, epoch)
        break
```

### Distributed Training

```python
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.tensorboard import SummaryWriter

# Setup
dist.init_process_group(backend='nccl')
rank = dist.get_rank()

# Only log from rank 0
if rank == 0:
    writer = SummaryWriter('runs/distributed_experiment')

# Assumes a single node, so the global rank matches the local GPU index
model = DDP(model, device_ids=[rank])

for epoch in range(100):
    train_loss = train_epoch()  # train_epoch is assumed to be defined elsewhere

    # Log only from rank 0
    if rank == 0:
        writer.add_scalar('Loss/train', train_loss, epoch)

if rank == 0:
    writer.close()
```

## TensorFlow/Keras

### Keras Callback

```python
import tensorflow as tf

# TensorBoard callback
tensorboard_callback = tf.keras.callbacks.TensorBoard(
    log_dir='logs/keras_experiment',
    histogram_freq=1,        # Log histograms every epoch
    write_graph=True,        # Visualize model graph
    write_images=True,       # Visualize layer weights as images
    update_freq='epoch',     # Log metrics per epoch (or 'batch', or integer)
    profile_batch='10,20',   # Profile batches 10-20
    embeddings_freq=1        # Log embeddings every epoch
)

# Compile model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train with callback
history = model.fit(
    x_train, y_train,
    epochs=10,
    validation_data=(x_val, y_val),
    callbacks=[tensorboard_callback]
)
```

### Custom Training Loop

```python
import tensorflow as tf

# Create summary writers
train_summary_writer = tf.summary.create_file_writer('logs/train')
val_summary_writer = tf.summary.create_file_writer('logs/val')

# Training loop
for epoch in range(100):
    # Training
    for step, (x_batch, y_batch) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            predictions = model(x_batch, training=True)
            loss = loss_fn(y_batch, predictions)

        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        # Log training metrics
        with train_summary_writer.as_default():
            tf.summary.scalar('loss', loss, step=epoch * len(train_dataset) + step)

    # Validation
    for x_batch, y_batch in val_dataset:
        predictio