Model Input

Twinkle represents model input with the InputFeature class, which is compatible with model structures such as transformers and megatron.

from typing import Any, List, TypedDict, Union

import numpy as np

InputType = Union[List[List[int]], List[int], np.ndarray, Any]

class InputFeature(TypedDict, total=False):
    # Text-related fields
    input_ids: InputType
    attention_mask: InputType
    position_ids: InputType
    labels: InputType

InputFeature is essentially a dict. Its contents come from the output of the Template component.
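As a minimal sketch of what "essentially a dict" means in practice, the snippet below re-declares a local stand-in for InputFeature (np.ndarray is left out of InputType only to avoid the NumPy dependency; this is a simplification, not Twinkle's definition) and shows that an instance is an ordinary dict in which any field may be omitted. The token IDs are made-up values.

```python
from typing import Any, List, TypedDict, Union

# Local stand-in for illustration; np.ndarray is omitted on purpose.
InputType = Union[List[List[int]], List[int], Any]


class InputFeature(TypedDict, total=False):
    input_ids: InputType
    attention_mask: InputType
    position_ids: InputType
    labels: InputType


# total=False makes every key optional, so a partially filled feature is valid.
feature: InputFeature = {"input_ids": [101, 7592, 102]}
feature["attention_mask"] = [1, 1, 1]

# At runtime a TypedDict instance is a plain dict.
is_plain_dict = isinstance(feature, dict)
present_keys = sorted(feature.keys())
```

Because the type is only a static annotation, nothing is validated at runtime; a type checker is what flags unknown keys or mismatched value types.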

  • input_ids: Token ID list produced by applying the template to List[Messages]

  • attention_mask: Attention mask

  • position_ids: Position indices, which are also used to distinguish samples

  • labels: Training labels, which have already undergone a one-token left shift
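The four fields above can be illustrated with a small hand-built example. This is a sketch under stated assumptions, not Twinkle's Template code: shift_labels_left is a hypothetical helper, the token IDs are made up, and -100 is assumed as the ignore index (the usual PyTorch and transformers convention) for masked prompt positions and for the final position that has no next token.

```python
from typing import List

IGNORE_INDEX = -100  # assumption: the common PyTorch/transformers ignore index


def shift_labels_left(labels: List[int], ignore_index: int = IGNORE_INDEX) -> List[int]:
    """Shift labels one position left so labels[i] is the target for input_ids[i]."""
    return labels[1:] + [ignore_index]


input_ids = [11, 12, 13, 14, 15]
# Unshifted labels: the two prompt tokens are masked, the response tokens are kept.
unshifted = [IGNORE_INDEX, IGNORE_INDEX, 13, 14, 15]

feature = {
    "input_ids": input_ids,
    "attention_mask": [1] * len(input_ids),
    "position_ids": list(range(len(input_ids))),
    "labels": shift_labels_left(unshifted),
}
```

After the shift, the label at index i is the token the model should predict after reading input_ids[i], so the loss can be computed without shifting the logits again.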

With packing or padding_free, fields such as input_ids are formed by concatenating the lists of multiple samples. In multimodal scenarios, InputFeature also contains additional multimodal fields.
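The concatenation used for packing can be sketched as follows. pack_features is a hypothetical helper written for illustration, not a Twinkle API, and it assumes that position_ids restart at 0 for each sample so that sample boundaries stay recoverable from the packed sequence. Labels are assumed to be already shifted per sample, with -100 as the assumed ignore index.

```python
from typing import Dict, List


def pack_features(samples: List[Dict[str, List[int]]]) -> Dict[str, List[int]]:
    """Concatenate per-sample fields into one flat sequence (illustrative only)."""
    packed: Dict[str, List[int]] = {"input_ids": [], "position_ids": [], "labels": []}
    for sample in samples:
        packed["input_ids"].extend(sample["input_ids"])
        packed["labels"].extend(sample["labels"])
        # Assumption: positions restart at 0 at the start of every sample.
        packed["position_ids"].extend(range(len(sample["input_ids"])))
    return packed


sample_a = {"input_ids": [11, 12, 13], "labels": [12, 13, -100]}
sample_b = {"input_ids": [21, 22], "labels": [22, -100]}
packed = pack_features([sample_a, sample_b])
```

In this sketch each 0 in position_ids marks the start of a new sample, which is how the position information can serve to tell samples apart.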

InputFeature is the standard interface for all template outputs and model inputs in Twinkle.