Model Input
Twinkle represents model input with the InputFeature class, which is compatible with model structures such as transformers and megatron.
from typing import Any, List, TypedDict, Union

import numpy as np

InputType = Union[List[List[int]], List[int], np.ndarray, Any]


class InputFeature(TypedDict, total=False):
    # Text-related fields
    input_ids: InputType
    attention_mask: InputType
    position_ids: InputType
    labels: InputType
InputFeature is essentially a dict. Its contents are produced by the Template component.
input_ids: Token IDs obtained by applying a template to List[Messages]
attention_mask: Attention mask
position_ids: Position IDs, which also serve to distinguish one sample from another
labels: Training labels, already shifted left by one token
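As an illustration of the fields above, here is a minimal sketch of what a single-sample InputFeature might look like. It is not Twinkle's Template implementation: the local InputFeature stand-in, the build_feature helper, and the loss_mask argument are hypothetical, and the use of -100 as the ignored label value is an assumption borrowed from the common PyTorch/transformers convention.

```python
from typing import List, TypedDict

# Assumption: -100 marks positions excluded from the loss, as is
# conventional in PyTorch/transformers. Twinkle may use another value.
IGNORE_INDEX = -100


class InputFeature(TypedDict, total=False):
    # Local stand-in with the same text fields as the class described above.
    input_ids: List[int]
    attention_mask: List[int]
    position_ids: List[int]
    labels: List[int]


def build_feature(token_ids: List[int], loss_mask: List[int]) -> InputFeature:
    """Hypothetical helper: build one sample with left-shifted labels."""
    # Unshifted labels: keep the token where loss applies, ignore elsewhere.
    raw = [t if m else IGNORE_INDEX for t, m in zip(token_ids, loss_mask)]
    # Shift left by one token so that labels[i] is the target for position i,
    # i.e. the token at position i + 1. The last position has no target.
    shifted = raw[1:] + [IGNORE_INDEX]
    return InputFeature(
        input_ids=list(token_ids),
        attention_mask=[1] * len(token_ids),
        position_ids=list(range(len(token_ids))),
        labels=shifted,
    )


feature = build_feature([101, 7, 8, 9, 102], loss_mask=[0, 0, 1, 1, 1])
print(feature["labels"])
```

Because the shift has already been applied, a consumer of this structure would compare the prediction at position i directly with labels[i], without shifting again.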
With packing or padding_free, fields such as input_ids hold the concatenated lists of multiple samples. In multimodal scenarios, InputFeature also contains additional multimodal fields.
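The concatenation described for packing can be sketched as follows. This is only an illustration under stated assumptions: pack_features and sample_lengths are hypothetical helpers, not Twinkle functions, and the sketch assumes that position_ids restart at 0 for each sample, which is one common way to keep packed samples distinguishable.

```python
from typing import Dict, List


def pack_features(features: List[Dict[str, List[int]]]) -> Dict[str, List[int]]:
    """Hypothetical helper: concatenate several samples into one feature."""
    packed: Dict[str, List[int]] = {"input_ids": [], "position_ids": [], "labels": []}
    for f in features:
        n = len(f["input_ids"])
        packed["input_ids"].extend(f["input_ids"])
        packed["labels"].extend(f["labels"])
        # Assumption: positions restart at 0 at every sample boundary.
        packed["position_ids"].extend(range(n))
    return packed


def sample_lengths(position_ids: List[int]) -> List[int]:
    """Recover per-sample lengths from the points where position_ids reset to 0."""
    starts = [i for i, p in enumerate(position_ids) if p == 0]
    ends = starts[1:] + [len(position_ids)]
    return [e - s for s, e in zip(starts, ends)]


a = {"input_ids": [1, 2, 3], "labels": [2, 3, -100]}
b = {"input_ids": [4, 5], "labels": [5, -100]}
packed = pack_features([a, b])
print(packed["position_ids"], sample_lengths(packed["position_ids"]))
```

In this sketch the sample boundaries are not stored separately. They are recovered from the resets in position_ids, which is why that field matters once several samples share one sequence.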
InputFeature is the standard interface for all template outputs and model inputs in Twinkle.