This repository contains the official code for the paper: "Test-Time Compute Games".
We model the LLM-as-a-service market as a game where providers compete by strategically selecting test-time compute levels (e.g., Best-of-N, Majority Voting) to maximize profit. We also introduce a Reverse Second-Price Auction mechanism that aligns provider incentives with social welfare.
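To build intuition for the compute strategies named above, here is a toy simulation (a sketch for illustration only, not the paper's implementation or the package's API): under majority voting, accuracy improves as the number of sampled answers grows, which is exactly the accuracy-versus-token-cost trade-off that providers play with.

```python
import random
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among the sampled answers."""
    return Counter(answers).most_common(1)[0][0]

def simulate_majority_voting(p_correct, n_samples, n_trials=2000, seed=0):
    """Estimate the accuracy of majority voting over n_samples i.i.d. answers.

    Each sample is 'correct' with probability p_correct; otherwise it is one
    of several distinct wrong answers, so wrong votes rarely agree.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        answers = [
            "correct" if rng.random() < p_correct else f"wrong-{rng.randrange(10)}"
            for _ in range(n_samples)
        ]
        if majority_vote(answers) == "correct":
            wins += 1
    return wins / n_trials

# More samples (higher test-time compute) raise accuracy, at a higher token cost.
for n in (1, 5, 25):
    print(n, simulate_majority_voting(0.4, n))
```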
The repository is organized as follows:
```
.
├── configs/          # Experiment configuration files (YAML) for Llama, Qwen, etc.
├── notebooks/        # Jupyter notebooks for recreating paper figures and analysis
├── scripts/          # Helper scripts for loading results and SLURM job submission
├── strategic_ttc/    # Main package source code
│   ├── benchmarks/   # Benchmark datasets (GSM8K, AIME, GPQA)
│   ├── core/         # Core logic: game dynamics, auction mechanism, generation
│   ├── models/       # Hugging Face model wrappers
│   └── verifiers/    # Answer extraction and verification logic
└── pyproject.toml    # Project metadata and build configuration
```
- Python 3.11+ (Tested on Python 3.11.2)
- [Optional] CUDA-enabled GPU for running inference locally.
1. Clone the repository:

   ```
   git clone git@github.com:Networks-Learning/strategic-ttc.git
   cd strategic-ttc
   ```

2. Create a virtual environment:

   ```
   python -m venv venv
   source venv/bin/activate
   ```

3. Install dependencies:

   ```
   pip install -r requirements.txt
   # OR install in editable mode if you plan to modify the code:
   pip install -e .
   ```
**Run Inference (Data Generation)**

To generate the raw performance data (accuracy vs. tokens) for different models and compute budgets, run the CLI with a configuration file:

```
python -m strategic_ttc.cli --config configs/Llama-3-8B--temp-0.6--samples-128--max-512.yaml
```
Note: The raw run data is hosted on Hugging Face Datasets. To run the analysis notebooks, you must download the data into the `final_runs/` folder:

```
# Clone the data directly into the expected folder
git clone https://huggingface.co/datasets/Human-Centric-Machine-Learning/strategic-ttc-data final_runs
```
**Analyze Game Dynamics**

Use the provided notebooks to simulate the market game, compute Nash equilibria, and compare them with the auction mechanism:

- GSM8K analysis: `notebooks/GSM8K-demo.ipynb`
- AIME analysis: `notebooks/AIME-demo.ipynb`
- GPQA analysis: `notebooks/GPQA-demo.ipynb`

These notebooks generate the plots and tables found in the paper, saving them to the `figures/` directory.
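As a self-contained toy of the kind of equilibrium analysis the notebooks perform (the payoff numbers below are invented for illustration, not taken from the paper's data), one can brute-force the pure-strategy Nash equilibria of a two-provider game over compute levels:

```python
import itertools

# Hypothetical profit matrix: payoffs[(a, b)] = (profit of provider 1, profit of
# provider 2) when provider 1 plays compute level a and provider 2 plays level b.
# Levels: 0 = low compute, 1 = high compute. Numbers are illustrative only.
payoffs = {
    (0, 0): (3, 3),   # both save on compute
    (0, 1): (1, 4),   # provider 2 wins users by answering better
    (1, 0): (4, 1),
    (1, 1): (2, 2),   # both pay for extra compute
}
levels = (0, 1)

def pure_nash_equilibria(payoffs, levels):
    """Return profiles where no provider gains by unilaterally deviating."""
    equilibria = []
    for a, b in itertools.product(levels, repeat=2):
        u1, u2 = payoffs[(a, b)]
        best1 = all(payoffs[(a2, b)][0] <= u1 for a2 in levels)
        best2 = all(payoffs[(a, b2)][1] <= u2 for b2 in levels)
        if best1 and best2:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(payoffs, levels))  # prisoner's-dilemma-style: [(1, 1)]
```

With these made-up payoffs, the unique equilibrium (both providers choosing high compute) yields lower profits than the cooperative outcome, illustrating the kind of incentive misalignment the auction mechanism is designed to address.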
The following directories are excluded from version control (via .gitignore) to keep the repo lightweight. You must create them locally.
datasets/: Stores downloaded benchmark datasets.final_runs/: Stores raw inference logs and generation outputs.figures/: Stores the output plots generated by the notebooks.
The `configs/` directory contains standard setups for all models used in the paper. Config files follow the naming convention `{Benchmark}-{Model}-{Size}--temp-{T}--samples-{N}--max-{Tokens}.yaml`.
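As an illustration of that convention, a small parser like the one below (hypothetical, not part of the package) can recover the settings from a config filename; since model names themselves contain hyphens (e.g. `Llama-3-8B`), the leading part is kept as a single string:

```python
import re

def parse_config_name(filename):
    """Parse '{...}--temp-{T}--samples-{N}--max-{Tokens}.yaml' into a dict."""
    pattern = (
        r"^(?P<prefix>.+)--temp-(?P<temp>[\d.]+)"
        r"--samples-(?P<samples>\d+)--max-(?P<max_tokens>\d+)\.yaml$"
    )
    match = re.match(pattern, filename)
    if match is None:
        raise ValueError(f"unrecognized config name: {filename}")
    return {
        "model": match["prefix"],
        "temperature": float(match["temp"]),
        "samples": int(match["samples"]),
        "max_tokens": int(match["max_tokens"]),
    }

print(parse_config_name("Llama-3-8B--temp-0.6--samples-128--max-512.yaml"))
# {'model': 'Llama-3-8B', 'temperature': 0.6, 'samples': 128, 'max_tokens': 512}
```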
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this code useful, please cite our paper:
```bibtex
@misc{velasco2026testtimecomputegames,
      title={Test-Time Compute Games},
      author={Ander Artola Velasco and Dimitrios Rontogiannis and Stratis Tsirtsis and Manuel Gomez-Rodriguez},
      year={2026},
      eprint={2601.21839},
      archivePrefix={arXiv},
      primaryClass={cs.CY},
      url={https://arxiv.org/abs/2601.21839},
}
```