According to reports, Anthropic has officially launched Claude Mythos, an AI model hailed as "the strongest in history." Its astonishing pricing, however, has caused an upheaval among developers: output costs 125 dollars per million tokens, more than eight times the price of the current flagship, Claude Sonnet 4.6.


Key Focus: The "Unspeakably Expensive" Mythos

Claude Mythos's pricing marks a new phase in the premium placed on AI compute:

Exorbitant Costs: Input and output are priced at 25 and 125 dollars per million tokens respectively; Claude Sonnet 4.6, by comparison, charges only 3 and 15 dollars.

High Threshold: Given its overwhelming power and cost, the model is not yet open to ordinary users. Reddit users have joked that, with even simple Skills loaded, sending a single "Hello" could consume 13% of a monthly token allowance.
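To put the gap in concrete terms, the per-request cost at these rates can be computed directly (the prices are the article's figures; the example request sizes are illustrative):

```python
# Prices in USD per million tokens, as quoted in the article.
MYTHOS = {"input": 25.0, "output": 125.0}
SONNET = {"input": 3.0, "output": 15.0}

def request_cost(prices, input_tokens, output_tokens):
    """Cost in USD of one request at the given per-million-token rates."""
    return (input_tokens * prices["input"]
            + output_tokens * prices["output"]) / 1_000_000

# Example: a 2,000-token prompt producing a 1,000-token answer.
mythos = request_cost(MYTHOS, 2_000, 1_000)   # 0.05 + 0.125 = 0.175 USD
sonnet = request_cost(SONNET, 2_000, 1_000)   # 0.006 + 0.015 = 0.021 USD
print(f"Mythos: ${mythos:.3f}, Sonnet: ${sonnet:.3f}, "
      f"ratio: {mythos / sonnet:.1f}x")
```

At these sizes a single Mythos call costs over eight times the same call on Sonnet, which is exactly why the thrift techniques below took off.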

Geek Self-Rescue: The Popular "Caveman" Saving Method

Facing token costs as high as gold, developers began exploring extreme cost-saving techniques. Among them, a project named caveman quickly became popular on GitHub:

Core Logic: It forces the AI to drop all pleasantries (such as "It's a pleasure to serve you"), omit articles, avoid vague chit-chat, and keep only core technical terms.

Amazing Effect: Tests show this "caveman language" mode saves about 65% of tokens without hurting the accuracy of the output.

Scientific Basis: Research has found that forcing the model to answer briefly not only saves money but also removes the interference caused by overthinking, lifting accuracy on certain benchmarks by 26%.

Practical Tips: Token-Saving Hacks

Aside from technical methods, regular users can also avoid the "Token Assassin" by changing their interaction habits:

Editing in Place: When the result doesn't meet expectations, click the "Edit" button to modify the original prompt, avoiding repeated billing caused by long conversations.

Timely "Cutting Off": Start a new conversation every 15-20 messages to prevent context stacking from becoming a Token black hole.

Combining Questions: Concentrate multiple instructions into one message to reduce the number of system loads.

Smart Use of Project Space: Upload long documents to Projects and use the caching function to avoid repeatedly scanning the document.

Downgrading as Needed: Assign basic tasks like grammar checking to low-cost models like Haiku, saving expensive quotas for Claude Mythos.

Using Off-Peak Times: Avoid peak hours between 5 AM and 11 AM Pacific Time to take advantage of off-peak benefits.
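The "downgrading as needed" habit amounts to a simple routing rule. A minimal sketch, in which the task categories and tier names are illustrative assumptions rather than any real API:

```python
# Route routine work to a cheap tier, reserve the flagship for hard tasks.
# The category set and tier labels are illustrative, not a real API.
CHEAP_TASKS = {"grammar-check", "summarize", "translate", "reformat"}

def pick_model(task_type: str) -> str:
    """Return the model tier to use for a given task category."""
    return "haiku" if task_type in CHEAP_TASKS else "mythos"

print(pick_model("grammar-check"))       # → haiku
print(pick_model("architecture-review")) # → mythos
```

Since the price gap between tiers spans an order of magnitude, even a coarse rule like this concentrates the expensive quota on the tasks that actually need it.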

Conclusion: From "Wastefulness" Back to "Precision"

From the per-word billing of text messages in the 2000s to today's large models billed by the token, the human pursuit of communication efficiency has come full circle. In the "High-Cost AI Era" opened by Claude Mythos, learning to spend every token precisely may become a required skill for every user.