# Twinkle Training Service on ModelScope

Alongside the open-source release of the Twinkle framework, we also offer a hosted model training service (Training as a Service), backed by ModelScope's backend infrastructure. Developers can use this service to try Twinkle's training API for free.

The model currently running on the cluster is [Qwen/Qwen3.6-27B](https://www.modelscope.cn/models/Qwen/Qwen3.6-27B). The steps below describe how to use the service.

## Step 1. Register as a ModelScope User and Obtain an API Key

Developers first need to register as a ModelScope user. Alternatively, you can use Twinkle✨ by deploying the service locally.

Registration page: https://www.modelscope.cn/

After registering, obtain your API key (the access token for the ModelScope platform) on this page: https://www.modelscope.cn/my/access/token

Endpoint: `base_url="https://www.modelscope.cn/twinkle"`

## Step 2. Review the Cookbook and Customize It

We strongly recommend reviewing our [cookbook](https://github.com/modelscope/twinkle/tree/main/cookbook/client/tinker) and building your own training code on top of the examples there.

Sample code:

```python
import os

from tqdm import tqdm
from tinker import types
from twinkle_client import init_tinker_client
from twinkle.dataloader import DataLoader
from twinkle.dataset import Dataset, DatasetMeta
from twinkle.preprocessor import SelfCognitionProcessor
from twinkle.server.common import input_feature_to_datum

base_model = 'ms://Qwen/Qwen3.6-27B'
base_url = 'https://www.modelscope.cn/twinkle'
api_key = os.environ.get('MODELSCOPE_TOKEN')

# Use twinkle dataset to load the data
dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(500)))
dataset.set_template('Qwen3_5Template', model_id=base_model, max_length=256)
dataset.map(SelfCognitionProcessor('Twinkle Model', 'ModelScope Team'), load_from_cache_file=False)
dataset.encode(batched=True, load_from_cache_file=False)
dataloader = DataLoader(dataset=dataset, batch_size=8)

# Initialize Tinker client before importing ServiceClient
init_tinker_client()

from tinker import ServiceClient

service_client = ServiceClient(base_url=base_url, api_key=api_key)
# The server expects the model ID without the 'ms://' prefix
training_client = service_client.create_lora_training_client(base_model=base_model[len('ms://'):], rank=16)

# Training loop: use input_feature_to_datum to convert the input format
for epoch in range(2):
    for step, batch in tqdm(enumerate(dataloader)):
        input_datum = [input_feature_to_datum(input_feature) for input_feature in batch]

        # Submit both requests first, then wait for their results
        fwdbwd_future = training_client.forward_backward(input_datum, "cross_entropy")
        optim_future = training_client.optim_step(types.AdamParams(learning_rate=1e-4))

        fwdbwd_result = fwdbwd_future.result()
        optim_result = optim_future.result()
        print(f'Training Metrics: {optim_result}')

    # Save one checkpoint per epoch
    result = training_client.save_state(f"twinkle-lora-{epoch}").result()
    print(f'Saved checkpoint for epoch {epoch} to {result.path}')
```

The code above trains a self-cognition LoRA on top of `Qwen/Qwen3.6-27B`. The LoRA changes the model's self-reported name and creator to the values specified during training (`Twinkle Model` and `ModelScope Team` in this example). To run inference with this LoRA:

```python
import os

from tinker import types
from twinkle.data_format import Message, Trajectory
from twinkle.template import Template
from twinkle import init_tinker_client

# Step 1: Initialize Tinker client
init_tinker_client()

from tinker import ServiceClient

# Step 2: Define the base model and connect to the server
base_model = 'Qwen/Qwen3.6-27B'
base_url = 'https://www.modelscope.cn/twinkle'

service_client = ServiceClient(
    base_url=base_url,
    api_key=os.environ.get('MODELSCOPE_TOKEN')
)

# Step 3: Create a sampling client by loading weights from a saved checkpoint.
# The model_path is a twinkle:// URI pointing to a previously saved LoRA checkpoint.
# The value below is a placeholder: replace it with the path printed by the training script (result.path).
# The server will load the base model and apply the LoRA adapter weights.
sampling_client = service_client.create_sampling_client(
    model_path='twinkle://xxx-Qwen_Qwen3.6-35B-A3B-xxx/weights/twinkle-lora-1',
    base_model=base_model
)

# Step 4: Load the tokenizer locally to encode the prompt and decode the results
print(f'Using model {base_model}')

template = Template(model_id=f'ms://{base_model}')

trajectory = Trajectory(
    messages=[
        Message(role='system', content='You are a helpful assistant'),
        Message(role='user', content='Who are you?'),
    ]
)

input_feature = template.batch_encode([trajectory], add_generation_prompt=True)[0]

input_ids = input_feature['input_ids'].tolist()

# Step 5: Prepare the prompt and sampling parameters
prompt = types.ModelInput.from_ints(input_ids)
params = types.SamplingParams(
    max_tokens=128,   # Maximum number of tokens to generate
    temperature=0.7,
    stop=['\n']       # Stop generation when a newline character is produced
)

# Step 6: Send the sampling request to the server.
# num_samples=1 generates a single completion for the prompt.
print('Sampling...')
future = sampling_client.sample(prompt=prompt, sampling_params=params, num_samples=1)
result = future.result()

# Step 7: Decode and print the generated responses
print('Responses:')
for i, seq in enumerate(result.sequences):
    print(f'{i}: {repr(template.decode(seq.tokens))}')
```

Developers can also merge this LoRA into the base model, deploy the result on their own service, and call it through the standard OpenAI-compatible API.

> The current service is compatible with the Tinker Client, so please use the Tinker cookbook for training. We plan to support both Twinkle and Tinker clients on a single server in a later release.

Developers can customize datasets, advantage functions, rewards, templates, and so on. The loss, however, is executed on the server side, so custom losses are not currently supported for security reasons.
If you need an additional loss, upload its implementation to [ModelHub](https://modelscope.cn) and contact us in the Q&A group or through an [issue](https://github.com/modelscope/twinkle/issues). Once we add the component to the allowlist, it can be used.

## Appendix: Supported Training Methods

This model is text-only, so multimodal tasks are not yet supported. For text-only tasks, you can train with:

1. Standard PT/SFT training, including agentic training
2. Self-sampling RL algorithms such as GRPO and RLOO
3. Distillation methods such as GKD and on-policy distillation. Because the official ModelScope endpoint serves only a single model, developers need to provide the other teacher/student model themselves

The official environment currently supports LoRA training only, with the following requirements:

1. Maximum rank of 32
2. `modules_to_save` is not supported
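
Since the hosted environment enforces these LoRA limits, it can help to check a configuration locally before a training client is created. The following is a minimal sketch of such a check. `check_lora_config` and `MAX_LORA_RANK` are hypothetical names introduced here for illustration; they are not part of Twinkle or Tinker, and the service may report violations in its own way.

```python
from typing import Optional, Sequence

# Assumption: mirrors the "maximum rank of 32" limit listed above.
MAX_LORA_RANK = 32


def check_lora_config(rank: int, modules_to_save: Optional[Sequence[str]] = None) -> None:
    """Hypothetical client-side check; raises ValueError on a documented limit violation."""
    if not 1 <= rank <= MAX_LORA_RANK:
        raise ValueError(f'LoRA rank must be between 1 and {MAX_LORA_RANK}, got {rank}')
    if modules_to_save:
        raise ValueError('modules_to_save is not supported by the hosted service')


# rank=16 is the value used in the training sample; this call passes silently.
check_lora_config(rank=16)
```

Running a check like this before `create_lora_training_client` would surface an unsupported setting without a round trip to the server.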