# Note that self.params.max_seq_len is multiplied by 2 because the token limit for the
# Llama 2 generation of models is 4096. Adding this multiplier instead of using 4096
# directly allows for dynamism of token lengths while training or fine-tuning.
The comment quoted above is the original explanation from the Llama source, linked below. Whether doubling `max_seq_len` this way is the best approach seems to warrant further discussion.
https://github.com/meta-llama/llama/blob/main/llama/model.py#L450
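The linked line sits where the model precomputes rotary-embedding frequencies for `self.params.max_seq_len * 2` positions. Below is a minimal plain-Python sketch of that idea (the real implementation uses PyTorch tensors; the `dim` and `max_seq_len` values here are illustrative, not taken from the source):

```python
import math

def precompute_freqs_cis(dim, end, theta=10000.0):
    """Precompute complex rotary-embedding rotations for `end` positions.

    A plain-Python illustration of the precomputation; the actual Llama
    code builds the equivalent table with PyTorch (torch.outer + torch.polar).
    """
    # One base frequency per pair of dimensions.
    freqs = [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]
    # For each position t, the rotation e^{i * t * f} for every frequency f.
    return [
        [complex(math.cos(t * f), math.sin(t * f)) for f in freqs]
        for t in range(end)
    ]

max_seq_len = 2048  # illustrative training context length
# Precompute for 2 * max_seq_len: this covers the 4096-token limit of the
# Llama 2 family even when max_seq_len is configured lower, which is the
# "dynamism of token lengths" the original comment refers to.
freqs_cis = precompute_freqs_cis(8, max_seq_len * 2)
```

After this call, `freqs_cis` holds rotations for 4096 positions, so any sequence length up to the family's token limit can index into the table without recomputation.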