2022-10-09

I finally delved into an old question I posted on GitHub and implemented the "right" way to do probability checking. It's about 5-10x faster than before.

Your GPU will no longer sound like "tick tick tick" when checking probabilities; it'll be more like "brrrr", like the story text generation but much shorter.

Since story text generation itself is unchanged, the overall improvement in text generation speed may only be about 10-30%, depending on your settings. This affects all text generation (both local and cloud) because cloud generation still uses local text generation for probability checking.
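
To see why a 5-10x faster probability check only yields a modest overall gain, here is a rough back-of-the-envelope sketch using Amdahl's law. The function name `overall_speedup` and the time fractions (10% and 25% of total time spent on probability checking) are assumptions picked for illustration, not measurements from the actual code.

```python
def overall_speedup(check_fraction: float, check_speedup: float) -> float:
    """Estimate end-to-end speedup when only probability checking gets faster.

    check_fraction: share of total generation time previously spent on
                    probability checking (hypothetical, not measured).
    check_speedup:  how many times faster probability checking became.
    """
    if not 0.0 <= check_fraction <= 1.0:
        raise ValueError("check_fraction must be between 0 and 1")
    if check_speedup <= 0:
        raise ValueError("check_speedup must be positive")
    # Story text generation time is unchanged; only the checking part shrinks.
    new_time = (1.0 - check_fraction) + check_fraction / check_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    # Assumed scenarios: (fraction of time in checks, speedup of checks).
    for fraction, speedup in [(0.10, 5.0), (0.25, 10.0)]:
        gain = (overall_speedup(fraction, speedup) - 1.0) * 100.0
        print(f"{fraction:.0%} of time in checks, {speedup:g}x faster checks "
              f"-> about {gain:.0f}% faster overall")
```

Under these assumed fractions the estimate lands at roughly 9% and 29%, in line with the 10-30% range above. Your actual numbers depend on how much of your generation time goes to probability checking with your settings.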