Overview

Twinkle provides a complete HTTP Server/Client architecture: models are deployed as services, and clients call them remotely to run training, inference, and other tasks. The architecture decouples model hosting (Server side) from training logic (Client side), so multiple users can share the same base model for training.

Core Concepts

  • Server side: Deployed on Ray Serve; hosts the model weights and runs the training and inference computation. The Server manages model loading, forward/backward propagation, weight saving, sampling inference, and related operations. A single Server supports Twinkle Client and Tinker Client connections at the same time.

  • Client side: Runs locally, responsible for data preparation, training loop orchestration, hyperparameter configuration, etc. The Client communicates with the Server via HTTP, sending data and commands.
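To make the Server/Client split concrete, here is a toy sketch using only the Python standard library. It is not Twinkle's API and does not use Ray Serve: the "/train_step" endpoint, the STATE dictionary, and the function names are all hypothetical and chosen for illustration. It only shows the general pattern of a server that owns the state while a client sends data and commands over HTTP.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical server-side state; stands in for hosted model weights.
STATE = {"steps": 0}


class ToyModelHandler(BaseHTTPRequestHandler):
    """Toy handler: one hypothetical command, POST /train_step."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/train_step":  # hypothetical endpoint name
            STATE["steps"] += 1  # the server, not the client, owns the state
            status = 200
            body = {"step": STATE["steps"],
                    "num_samples": len(payload.get("batch", []))}
        else:
            status = 404
            body = {"error": "unknown command"}
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_toy_server():
    """Start the toy server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), ToyModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(base_url, command, payload):
    """Client side: POST a JSON payload and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/{command}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any system proxy for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())
```

A client would call start_toy_server() once, build the base URL from server.server_address, and then call send_command(base_url, "train_step", {"batch": [...]}) inside its training loop. The step counter lives on the server, so several clients calling the same server would see a shared count.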

Two Model Backends

Model loading supports two backends:

Backend       use_megatron  Description
Transformers  false         Based on HuggingFace Transformers; suitable for most scenarios
Megatron      true          Based on Megatron-LM; suitable for ultra-large-scale model training, supports more efficient parallelization strategies

Two Client Modes

Client          Initialization Method  Description
Twinkle Client  init_twinkle_client    Native client; to migrate local training code to remote calls, change "from twinkle import" to "from twinkle_client import"
Tinker Client   init_tinker_client     Patches the Tinker SDK so that existing Tinker training code can be reused directly
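The import change described for the Twinkle Client can be applied mechanically. The helper below is a minimal sketch of that rewrite. The function name to_remote_imports is hypothetical, and treating submodule imports (for example "from twinkle.sub import X") the same way assumes that twinkle_client mirrors the twinkle package layout, which the text above does not confirm.

```python
import re

# Matches "from twinkle import ..." and "from twinkle.<sub> import ...",
# but not "from twinkle_client ..." or other packages starting with "twinkle".
# Rewriting submodule imports is an assumption about the package layout.
_TWINKLE_IMPORT = re.compile(r"^([ \t]*from[ \t]+)twinkle(?=[.\s])", re.MULTILINE)


def to_remote_imports(source: str) -> str:
    """Return source with 'from twinkle' imports pointed at 'twinkle_client'."""
    return _TWINKLE_IMPORT.sub(r"\1twinkle_client", source)
```

The function works on source text only. A migrated script still has to initialize the client (the table above names init_twinkle_client for this), and the arguments that call takes are not covered here.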

How to Choose

Client Mode Selection

Scenario                                                        Recommendation
Existing Twinkle local training code, want to switch to remote  Twinkle Client (only the import paths need to change)
Existing Tinker training code, want to reuse it                 Tinker Client (only the patch needs to be initialized)
New project                                                     Twinkle Client (simpler API)

Model Backend Selection

Scenario                                                                Recommendation
7B/14B and other small-to-medium-scale models                           Transformers backend (use_megatron: false)
Ultra-large-scale models requiring advanced parallelization strategies  Megatron backend (use_megatron: true)
Rapid experimentation and prototype verification                        Transformers backend (use_megatron: false)
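The two selection tables can be read as a small decision procedure. The sketch below encodes them as plain functions. The function names and the string labels for the inputs are hypothetical; only the use_megatron key and the recommendations themselves come from the tables. Where use_megatron sits inside the server configuration file is not shown here.

```python
def recommend_client(existing_code: str) -> str:
    """Map the kind of existing training code to a client mode.

    existing_code is one of "twinkle", "tinker", or "none" (new project).
    """
    recommendations = {
        "twinkle": "Twinkle Client",  # only the import paths change
        "tinker": "Tinker Client",    # only the patch is initialized
        "none": "Twinkle Client",     # simpler API for new projects
    }
    if existing_code not in recommendations:
        raise ValueError(f"unknown existing_code: {existing_code!r}")
    return recommendations[existing_code]


def recommend_backend(ultra_large_scale: bool) -> dict:
    """Return the backend switch for a model.

    ultra_large_scale=True stands for ultra-large-scale models that need
    advanced parallelization strategies; everything else (7B/14B models,
    rapid experiments, prototypes) stays on the Transformers backend.
    """
    return {"use_megatron": bool(ultra_large_scale)}
```

For example, a new project that fine-tunes a 7B model would get recommend_client("none") and recommend_backend(False), that is, the Twinkle Client with the Transformers backend.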

Cookbook Reference

Complete runnable examples are located in the cookbook/client/ directory:

cookbook/client/
├── server/                         # Server startup configuration
│   ├── transformer/                # Transformers backend
│   │   ├── run.sh                  # Startup script
│   │   ├── server.py               # Server entry point
│   │   └── server_config.yaml      # Configuration file
│   └── megatron/                   # Megatron backend
│       ├── run.sh
│       ├── server.py
│       ├── server_config.yaml
│       └── server_config_4b.yaml
├── twinkle/                        # Twinkle Client examples
│   ├── self_host/                  # Self-hosted Server
│   │   ├── dpo.py                  # DPO training client
│   │   ├── multi_modal.py          # Multi-modal training client
│   │   ├── sample.py               # Inference sampling client
│   │   ├── self_congnition.py      # Self-cognition training client
│   │   └── short_math_grpo.py      # GRPO math training client
│   └── modelscope/                 # ModelScope managed service
│       ├── dpo.py
│       ├── multi_modal.py
│       └── self_congnition.py
└── tinker/                         # Tinker Client examples
    ├── self_host/                  # Self-hosted Server
    │   ├── dpo.py                  # DPO training client
    │   ├── lora.py                 # LoRA training client
    │   ├── multi_modal.py          # Multi-modal training client
    │   ├── sample.py               # Inference sampling client
    │   ├── self_cognition.py       # Self-cognition training client
    │   └── short_math_grpo.py      # GRPO math training client
    └── modelscope/                 # ModelScope managed service
        ├── dpo.py
        ├── sample.py
        ├── self_cognition.py
        └── short_math_grpo.py

Running steps:

# 1. Start Server first
python cookbook/client/server/megatron/server.py

# 2. Run Client in another terminal (Tinker Client example)
python cookbook/client/tinker/self_host/self_cognition.py

# Or use Twinkle Client
python cookbook/client/twinkle/self_host/self_cognition.py
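The same two steps can be scripted. The sketch below only builds the command lines from the cookbook layout shown above; the function names are hypothetical, and it assumes the commands are run from the repository root. It does not wait for the Server to become ready, since how readiness is signalled is not described here.

```python
from pathlib import PurePosixPath

COOKBOOK = PurePosixPath("cookbook/client")

# Backend name -> directory under cookbook/client/server/
# (the Transformers backend lives in a directory named "transformer").
SERVER_DIRS = {"transformers": "transformer", "megatron": "megatron"}
CLIENT_DIRS = ("twinkle", "tinker")


def server_command(backend: str) -> list:
    """Command that starts the self-hosted Server for the given backend."""
    if backend not in SERVER_DIRS:
        raise ValueError(f"unknown backend: {backend!r}")
    script = COOKBOOK / "server" / SERVER_DIRS[backend] / "server.py"
    return ["python", str(script)]


def client_command(client: str, example: str) -> list:
    """Command that runs a self_host client example, such as "sample.py"."""
    if client not in CLIENT_DIRS:
        raise ValueError(f"unknown client: {client!r}")
    script = COOKBOOK / client / "self_host" / example
    return ["python", str(script)]
```

Each returned list can be passed to subprocess.Popen (for the Server, which keeps running) or subprocess.run (for the Client, which exits when the example finishes).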