[Model] Why self.params.max_seq_len is multiplied by 2? #23

@NANWOOD

Description

The original explanation is below:

# Note that self.params.max_seq_len is multiplied by 2 because the token limit for the
# Llama 2 generation of models is 4096. Adding this multiplier instead of using 4096
# directly allows for dynamism of token lengths while training or fine-tuning.

This explanation seems to warrant further discussion. The relevant line:
https://github.com/meta-llama/llama/blob/main/llama/model.py#L450
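For context, the linked line precomputes the rotary-embedding (RoPE) table with `precompute_freqs_cis(dim // n_heads, max_seq_len * 2)`, so the table covers twice as many positions as the configured sequence length. Below is a minimal pure-Python sketch of what that function computes; the `head_dim` and `max_seq_len` values are hypothetical and chosen only for illustration, not taken from any specific Llama config:

```python
import cmath

def precompute_freqs_cis(dim, end, theta=10000.0):
    # RoPE rotates dim // 2 coordinate pairs; one base frequency per pair.
    freqs = [theta ** (-(2 * i) / dim) for i in range(dim // 2)]
    # One complex rotation e^{i * t * freq} per (position, pair).
    return [[cmath.exp(1j * t * f) for f in freqs] for t in range(end)]

max_seq_len = 2048   # hypothetical config value
head_dim = 128       # hypothetical: params.dim // params.n_heads
# The multiplier makes the table cover 2 * max_seq_len positions, so any
# position index the model uses stays strictly below the table length.
table = precompute_freqs_cis(head_dim, max_seq_len * 2)
print(len(table), len(table[0]))  # 4096 64
```

Since the table is only indexed, not trained, precomputing extra rows costs a little memory but avoids recomputing it if the effective sequence length changes.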
