Tokens are a big reason today's generative AI falls short | TechCrunch

techcrunch.com/2024/07/06/tokens-are-a-big-reason-todays-generative-ai-falls-short

Digits are rarely tokenized consistently. Because tokenizers don’t really know what numbers are, they might treat “380” as one token but represent “381” as a pair (“38” and “1”), effectively destroying the relationships between digits and the results of equations and formulas. The result is transformer confusion; a recent paper showed that models struggle to understand repetitive numerical patterns and context, particularly temporal data. (See: GPT-4 thinks 7,735 is greater than 7,926).
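
You can see this inconsistency for yourself by inspecting how a byte-pair-encoding tokenizer splits numbers. The following is a minimal Python sketch using the open-source tiktoken library; it is an illustration, not the cited paper’s method, and the exact splits depend on which vocabulary you load (assumes tiktoken is installed).

import tiktoken

# Load a BPE vocabulary; older encodings such as "gpt2" split digits less
# uniformly than newer ones, so the splits vary with the encoding chosen.
enc = tiktoken.get_encoding("gpt2")

for number in ["380", "381", "7735", "7926"]:
    token_ids = enc.encode(number)
    pieces = [enc.decode([tid]) for tid in token_ids]
    # Show how each number is carved into sub-word pieces
    print(f"{number!r} -> {pieces}")

A sketch like this makes the article’s point concrete: two numbers that differ by one can end up with different token boundaries, so the model never sees digits as a uniform positional system.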