We've upgraded `run.py` into an intelligent AI agent that dynamically configures nanoGPT training jobs from natural language prompts, eliminating the need for manual script editing.
> Impact
Previously, our `mpp-nanogpt-modal` example used a hardcoded bash script within `run.py` to train a nanoGPT model exclusively on the Shakespeare dataset. This implementation was static, requiring developers to fork the repository and manually modify the Python source code to train on a different dataset or adjust hyperparameters. We have now replaced this static configuration with a dynamic AI agent.
The core limitation was the tool's inflexibility. It served as a great proof-of-concept for running nanoGPT on Modal, but it wasn't a practical utility for custom model training. Any developer wanting to train a model on their own data—be it philosophical texts, project documentation, or a specific author's corpus—faced the friction of editing and redeploying the script. Our goal was to remove this barrier and empower users to launch custom training jobs with zero code changes.
This update transforms the script from a simple demo into a powerful, interactive utility. Now, `run.py` accepts a natural language prompt from the user describing their training goal. This prompt is sent to an LLM, which intelligently generates the necessary bash commands to download the data, preprocess it, and initiate `train.py` with appropriate parameters. Developers can now train a nanoGPT model on virtually any text-based dataset by simply describing what they want, making custom character-level model training on Modal more accessible than ever.
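The prompt-to-script handoff can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the agent asks the model to reply with a single fenced bash block, and a small parser pulls the script out of the reply (`SYSTEM_PROMPT` and `extract_bash` are hypothetical names introduced here for illustration):

```python
import re

# Hypothetical instruction given to the LLM: constraining the reply to one
# fenced bash block makes the response trivially machine-parseable.
SYSTEM_PROMPT = (
    "You generate bash scripts that download and prepare a text dataset, "
    "then launch nanoGPT's train.py with sensible hyperparameters. "
    "Reply with a single ```bash code block and nothing else."
)

def extract_bash(reply: str) -> str:
    """Pull the bash script out of the LLM's fenced code block."""
    match = re.search(r"```(?:bash|sh)?\n(.*?)```", reply, re.DOTALL)
    if match is None:
        raise ValueError("LLM reply contained no bash code block")
    return match.group(1).strip()
```

Forcing the model into a single fenced block means the agent never has to guess which part of a chatty reply is executable.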
> Try this now
```bash
# To train a custom nanoGPT model, use the `modal run` command and provide
# your goal as a string argument. The AI agent will interpret your request
# and generate the training script.

# --- 1. Define your training goal in natural language. ---
# Here we train a model on Stoic philosophy using Marcus Aurelius's
# "Meditations" from Project Gutenberg.

# --- 2. Execute the run script from your terminal. ---
# The script passes your prompt to an LLM, which generates the shell commands
# to download the data, prepare it, and kick off the training process.
modal run run.py --prompt "Train a nanoGPT model on the philosophy of Marcus Aurelius. Use the text of Meditations from Project Gutenberg's public domain copy at https://www.gutenberg.org/files/2680/2680-0.txt"
```
Behind the scenes, the LLM agent would generate a script similar to this, which the Modal function then executes:

```bash
#!/bin/bash
# Download the dataset into nanoGPT's character-level data directory,
# where prepare.py expects to find input.txt
wget -O data/shakespeare_char/input.txt https://www.gutenberg.org/files/2680/2680-0.txt

# Tokenize the text and write the train.bin / val.bin files
python data/shakespeare_char/prepare.py

# Run the training script with appropriate hyperparameters
python train.py config/train_shakespeare_char.py --device=cuda --compile=False --max_iters=1000 --eval_interval=100
```
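One plausible way for the Modal function to execute the generated script is sketched below: write it to a temporary file and run it with `bash`, so training output streams straight into the Modal logs. The `run_generated_script` helper is an assumption for illustration, not the repo's actual implementation:

```python
import os
import subprocess
import tempfile

def run_generated_script(script: str) -> int:
    """Write an LLM-generated bash script to a temp file, execute it,
    and return its exit code. Output streams to the parent's stdout,
    which Modal surfaces in the run logs."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
        f.write(script)
        path = f.name
    try:
        # check=False: let the caller inspect the exit code rather than raise
        result = subprocess.run(["bash", path], check=False)
        return result.returncode
    finally:
        os.unlink(path)
```

Returning the exit code (instead of raising on failure) lets the agent decide whether to retry with a corrected script when a download or training step fails.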