Cheapest Way to Run DeepSeek R1 on Thunder Compute

Looking for the cheapest way to run DeepSeek R1 or just want to try DeepSeek R1 without buying hardware? Thunder Compute lets you spin up pay‑per‑minute A100 GPUs so you only pay for the time you use. Follow the steps below to get the model running in minutes.

Quick reminder: Make sure your Thunder Compute account is set up. If not, start with our Quickstart Guide.

Step 1: Create a Cost‑Effective GPU Instance

From your terminal, launch an instance with an 80 GB A100 GPU (enough VRAM for the 70B variant):

tnr create --gpu "a100xl" --template "ollama"

For details on instance templates, see our templates guide.

Step 2: Check Status and Connect

Verify the instance is running:

tnr status

Connect with its ID:

tnr connect <instance-id>

Step 3: Start the Ollama Server

Inside the instance, start Ollama:

start ollama

If you hit any hiccups, check our troubleshooting guide.

Wait about 30 seconds for the web UI to load.
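If you'd rather not refresh the browser blindly, you can poll the Ollama API from inside the instance until it answers. This is a minimal sketch, assuming Ollama is listening on its default port, 11434:

```shell
# Poll the Ollama API until it responds (assumes the default port 11434).
for i in $(seq 1 30); do
  if curl -sf http://localhost:11434/api/version > /dev/null; then
    echo "Ollama is up"
    break
  fi
  sleep 2
done
```

Once the loop prints "Ollama is up", the web UI should be ready shortly after.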

Step 4: Access the Web UI and Load DeepSeek R1

  1. Visit http://localhost:8080 in your browser.
  2. Choose DeepSeek R1 from the dropdown. On an 80 GB A100, the 70B variant is the largest that fits in VRAM and gives the best output quality.

Step 5: Run DeepSeek R1

Type a prompt in the web interface. For example:

“If the concepts of rCUDA were applied at scale, overcoming latency, what would it mean for the cost of GPUs on cloud providers?”

The model will think through the answer and respond. A full reply can take up to 200 seconds.
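The web UI isn't the only way to send prompts. Ollama also exposes an HTTP API, so you can query the model with curl from inside the instance. The model tag and default port 11434 here are assumptions based on standard Ollama conventions:

```shell
# Send a single non-streaming prompt to Ollama's generate endpoint.
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:70b",
  "prompt": "Explain rCUDA in one paragraph.",
  "stream": false
}'
```

With `"stream": false`, the full response arrives as one JSON object once generation finishes, which for a 70B model can take a few minutes.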

Conclusion

That’s the cheapest way to run DeepSeek R1 on Thunder Compute, and a quick way to try the model without buying hardware. Explore more guides:
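One last cost-saving tip: since billing is pay-per-minute, shut the instance down when you're finished. Based on the `tnr` commands used above, cleanup should look roughly like this (the exact subcommand is an assumption; check `tnr --help` if it differs):

```shell
# Stop paying for the GPU once you're done experimenting.
tnr delete <instance-id>
```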

Happy building!