# Note that self.params.max_seq_len is multiplied by 2 because the token limit for the
# Llama 2 generation of models is 4096. Adding this multiplier instead of using 4096
# directly allows for dynamism of token lengths while training or fine-tuning.
The comment quoted above is the original explanation from the Llama source, linked below. Whether doubling `max_seq_len` this way is the best approach seems to warrant further discussion.
https://github.com/meta-llama/llama/blob/main/llama/model.py#L450
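The linked line sits where the model precomputes rotary-embedding frequencies for `self.params.max_seq_len * 2` positions. Below is a minimal plain-Python sketch of that idea (the real implementation uses PyTorch tensors; the `dim` and `max_seq_len` values here are illustrative, not taken from the source):

```python
import math

def precompute_freqs_cis(dim, end, theta=10000.0):
    """Precompute complex rotary-embedding rotations for `end` positions.

    A plain-Python illustration of the precomputation; the actual Llama
    code builds the equivalent table with PyTorch (torch.outer + torch.polar).
    """
    # One base frequency per pair of dimensions.
    freqs = [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]
    # For each position t, the rotation e^{i * t * f} for every frequency f.
    return [
        [complex(math.cos(t * f), math.sin(t * f)) for f in freqs]
        for t in range(end)
    ]

max_seq_len = 2048  # illustrative training context length
# Precompute for 2 * max_seq_len: this covers the 4096-token limit of the
# Llama 2 family even when max_seq_len is configured lower, which is the
# "dynamism of token lengths" the original comment refers to.
freqs_cis = precompute_freqs_cis(8, max_seq_len * 2)
```

After this call, `freqs_cis` holds rotations for 4096 positions, so any sequence length up to the family's token limit can index into the table without recomputation.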