Important Change: Starting October 8, 2025, all new Thunder Compute instances use ephemeral storage. This means instances can only be created and deleted, not started and stopped. All data is lost when an instance is deleted.

What is Ephemeral Storage?

Ephemeral storage means that your instance’s disk is temporary and exists only for the lifetime of that instance. When you delete an instance, all data on it is permanently removed. This change enables several important benefits:
  • Better GPU availability and lower pricing
  • Access to new GPU types, including H100s and 8-GPU nodes
  • Faster instance creation and deletion
While we plan to reintroduce persistent storage in a more scalable form in the future, we recommend using external backup solutions to preserve your important data. With ephemeral storage, you’ll save money and avoid surprise bills from forgotten instances. Here’s how to manage your data effectively:

1. Use GitHub for Code and Configuration

GitHub should be your primary backup solution for:
  • Code and scripts
  • Configuration files
  • Requirements and dependencies
  • Jupyter notebooks
  • Documentation
This keeps your project versioned and easily recoverable on any new instance.

2. For Large Files, Choose What Works Best for You

For datasets, models, and checkpoints, you have two good options:

Option A: Download to Your Local Computer

The simplest approach - just download large files to your local machine when you’re done with them. This is:
  • Free - no storage costs
  • Fast - direct download/upload when you need it
  • Simple - no additional services to set up

Option B: Use Cloud Object Storage

If local storage isn’t practical, cloud services are much cheaper than our legacy persistent storage:
  • Cloudflare R2 - S3-compatible storage with zero egress fees (10GB free)
  • Google Drive - Simple and familiar interface (15GB free)
These cloud storage options cost significantly less than Thunder Compute’s legacy persistent storage and prevent you from accidentally leaving instances running and getting surprise bills.
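
If you use R2, its S3-compatible API means standard tools work out of the box. As a minimal sketch, assuming you’ve already created a bucket and an R2 API token (the account ID and bucket name below are placeholders):
# Configure an AWS CLI profile with your R2 access key and secret
aws configure --profile r2

# Sync a local directory to your R2 bucket (placeholder bucket/account)
aws s3 sync ./training_outputs s3://your-bucket/training_outputs \
  --profile r2 --endpoint-url https://your-account.r2.cloudflarestorage.com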

Setting Up Backups

Using GitHub

For your code and configuration:
# Initialize a git repository
git init

# Add your project files
git add .

# Commit your changes
git commit -m "Initial commit"

# Push to GitHub (create a repo on GitHub first)
git remote add origin https://github.com/yourusername/your-project.git
git push -u origin main
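
To keep pushes fast, leave large artifacts out of version control and back those up separately (see the options above). A .gitignore along these lines works; the paths are just examples:
# .gitignore - exclude large files from the repository
data/
checkpoints/
*.pth
*.tar.gz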

Downloading Files to Your Local Computer

The simplest way to preserve your data is to download it directly to your local machine using scp, or by dragging and dropping in VS Code. With scp:
# Download a single file
scp tnr-0:~/model_checkpoint.pth ./local_backups/

# Download an entire directory
scp -r tnr-0:~/training_outputs ./local_backups/

# Upload files when creating a new instance
scp ./dataset.tar.gz tnr-0:~/
Make sure you’ve connected to your instance with tnr connect first. This sets up the tnr-0 SSH alias.
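
For directories with many small files, archiving before transfer is usually much faster than copying files one by one. A quick sketch using the same tnr-0 alias:
# Bundle the directory into one compressed archive on the instance
ssh tnr-0 "tar czf training_outputs.tar.gz training_outputs/"

# Download the archive and extract it locally
scp tnr-0:~/training_outputs.tar.gz ./local_backups/
tar xzf ./local_backups/training_outputs.tar.gz -C ./local_backups/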

Best Practices

  1. Commit frequently - Push your code changes to GitHub regularly, especially before deleting an instance.
  2. Download important results - When you complete a training run or generate important outputs, download them to your local machine or upload to cloud storage right away.
  3. Separate data from code - Keep your code in GitHub and large datasets either on your local machine or in cloud storage (R2/Drive).
  4. Save checkpoints during long runs - For multi-day training jobs, periodically download checkpoints or upload them to cloud storage.
  5. Use automation - Create scripts that automatically save your outputs:
    # Example: Auto-upload checkpoints after each epoch
    import boto3
    import torch
    
    def save_checkpoint(epoch, model):
        # Save locally
        torch.save(model.state_dict(), f'checkpoint_{epoch}.pth')
        
        # Upload to R2 (endpoint and bucket are placeholders for your own)
        s3 = boto3.client('s3', endpoint_url='https://your-account.r2.cloudflarestorage.com')
        s3.upload_file(f'checkpoint_{epoch}.pth', 'your-bucket', f'checkpoints/checkpoint_{epoch}.pth')
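
    Note: boto3 picks up credentials from the standard AWS environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY); set these to your R2 API token values before running this.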
    
  6. Create setup scripts - Document your environment setup in a script that can quickly recreate your environment on a new instance:
    #!/bin/bash
    # setup.sh
    
    # Clone your code repository
    git clone https://github.com/yourusername/your-project.git
    cd your-project
    
    # Install dependencies
    pip install -r requirements.txt
    
    # Upload your dataset from local machine using scp before running this
    # Or pull data from cloud storage if needed:
    # aws s3 sync s3://your-bucket/datasets ./data --profile r2 --endpoint-url https://your-account.r2.cloudflarestorage.com
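
    You can copy setup.sh to each new instance with scp, then recreate your environment with a single bash setup.sh.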
    

Accessing Data from Old Instances

If you have existing instances with data you need to retrieve:
  1. Change your instance type to a T4, the lowest-cost option, to reduce costs
  2. Download your data using one of the backup methods above
  3. You have 30 days from October 8, 2025 to retrieve your data
After 30 days, data on old instances will be permanently deleted. Make sure to back up anything important before the deadline.
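
Before downloading, it can help to see what’s actually worth saving. Assuming the tnr-0 alias from tnr connect, something like this lists the largest items in your home directory:
# Show the size of each top-level item, largest first
ssh tnr-0 "du -sh ~/* | sort -rh"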

Need Help?

If you run into any issues setting up your backup workflow or have questions about ephemeral storage, reach out to us. We’re here to help you transition smoothly to ephemeral storage!