All About AI Model Settings
Model Settings is a feature in Questsmith that lets you customize and control how AI responses are generated during gameplay. These options may vary across different models.
Advanced Model Architecture and Generation Parameters
Questsmith's engine settings let authors and players calibrate how the AI generates text. These settings adjust how each next token is selected, balancing creative variety against narrative coherence.
Every turn, the AI builds a probability distribution over potential next tokens. Tokens are the text fragments (whole words, parts of words, or punctuation) from which output is assembled. The engine samples from this distribution, and the settings in the advanced control panel adjust how that sampling works.
Context Length
Context Length defines the maximum number of tokens sent to the AI engine in a single generation cycle. To maintain deep narrative continuity, maximizing this value is recommended.
The context payload is compiled hierarchically from the following active elements:
- Core Plot Essentials configuration
- Dynamically triggered Story Cards
- Active Author Notes and environmental memory matrices
- Global system instructions
- The immediate player input and recent historical logs
Any space remaining within the allocated Context Length is backfilled with earlier adventure history. A larger window helps the AI stay consistent and logically stable across extended campaigns. The total context available depends on your premium account tier.
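The backfill step described above can be sketched roughly as follows. This is a minimal illustration, not Questsmith's actual implementation: the function names `count_tokens` and `backfill_history`, the whitespace-based token count, and the sample strings are all assumptions made for the example.

```python
def count_tokens(text: str) -> int:
    # Assumption: approximate tokens as whitespace-separated words.
    # A real tokenizer would produce different counts.
    return len(text.split())


def backfill_history(required: list[str], history: list[str], context_length: int) -> list[str]:
    """Return the most recent history entries that fit after the required elements."""
    remaining = context_length - sum(count_tokens(part) for part in required)

    kept = []
    # Walk from newest to oldest so the most recent events are kept first.
    for entry in reversed(history):
        cost = count_tokens(entry)
        if cost > remaining:
            break
        kept.append(entry)
        remaining -= cost

    # Restore chronological order before returning.
    return kept[::-1]


required = ["plot essentials text here", "author note"]  # hypothetical required elements
history = ["a b c d", "e f g", "h i"]                     # oldest to newest
print(backfill_history(required, history, context_length=12))
```

In this sketch the required elements always take priority, and older history is the first thing dropped when the budget runs out, which matches the behavior described above.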
Response Length
Response Length sets the maximum number of tokens in any single AI output. This parameter can be adjusted to suit different play styles. Authors seeking rapid tactical interactions may prefer brief outputs, while players pursuing deep narrative immersion can raise this limit to allow for expansive cinematic descriptions.
Temperature
Temperature governs the randomness of the generation engine. Increasing the value flattens the probability distribution, making the model more likely to select lower-probability tokens. This produces more divergent, creative, and unpredictable plot directions, which suit speculative worldbuilding.
Lowering the value sharpens the distribution, so the AI increasingly favors the highest-probability tokens. This keeps output closer to established logic and the existing plot. Standard configurations use a baseline of 0.8. Values below 0.6 make output markedly more conservative and predictable, while values above 1.2 can produce incoherent or fragmented outputs.
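As an illustration of the effect, temperature is commonly applied by dividing the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below assumes that standard formulation; the function name `softmax_with_temperature` and the sample scores are hypothetical, and Questsmith's internals may differ.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
for temperature in (0.6, 0.8, 1.2):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Running this shows the top candidate's share of probability shrinking as the temperature rises, while the least likely candidate gains ground.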
Top K
Top K places a hard limit on the token selection pool, restricting the engine's choices to the specified number of most probable next tokens. By eliminating the long tail of low-probability choices, Top K helps keep the output relevant and within the logical boundaries of the ongoing narrative.
For example, a Top K setting of 20 means the AI considers only the 20 most probable choices, suppressing erratic creative leaps.
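A minimal sketch of Top K filtering, assuming the usual approach of keeping the K most probable tokens and renormalizing what remains. The function name `top_k_filter` and the sample vocabulary are hypothetical.

```python
def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k most probable tokens and renormalize their probabilities."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Hypothetical next-token probabilities.
probs = {"sword": 0.5, "shield": 0.3, "door": 0.15, "banana": 0.05}
print(top_k_filter(probs, k=2))
```

With `k=2`, only "sword" and "shield" remain eligible, and their probabilities are rescaled so they still sum to 1.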
Top P
Top P applies a dynamic cumulative probability filter. Rather than keeping a fixed number of tokens, Top P adds up the highest-ranking tokens until their combined probability reaches the target threshold, such as 90%.
This works as a complement to Top K. While Top K caps the pool at a fixed size, Top P trims highly improbable choices from that pool when the model is confident, and keeps more candidates when the situation is ambiguous. The result is a model that is decisive when certain and more exploratory when not. Standard thresholds range between 0.90 and 0.95.
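The difference between a confident and an ambiguous situation can be seen in a short sketch. It assumes the common nucleus-sampling formulation, where tokens are added in descending order of probability until the running total reaches the threshold; the function name `top_p_filter` and the sample distributions are hypothetical.

```python
def top_p_filter(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the range (0, 1]")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break

    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


confident = {"the": 0.92, "a": 0.05, "an": 0.03}                    # one clear favorite
ambiguous = {"north": 0.3, "south": 0.3, "east": 0.2, "west": 0.2}  # no clear favorite

print(list(top_p_filter(confident, 0.9)))
print(list(top_p_filter(ambiguous, 0.9)))
```

With the same threshold of 0.9, the confident distribution is cut down to a single token, while the ambiguous one keeps all four candidates.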
Presence Penalty
Presence Penalty applies a flat reduction to the selection probability of any token that has already appeared within the current generation window. The reduction is the same whether the token has appeared once or many times. This parameter suppresses repetitive loops and discourages the model from echoing its immediate outputs. Once a word or phrase has been introduced, the engine favors alternative wording unless the base probability of the original word is exceptionally high.
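A minimal sketch of a flat presence penalty. It assumes the penalty is subtracted from a token's raw score (logit) before probabilities are computed, which is a common formulation but not confirmed as Questsmith's; the function name `apply_presence_penalty` and the sample values are hypothetical.

```python
def apply_presence_penalty(
    logits: dict[str, float], generated: list[str], penalty: float
) -> dict[str, float]:
    """Subtract a fixed penalty from every token that has already appeared."""
    seen = set(generated)  # only presence matters, not how many times
    return {
        token: (score - penalty) if token in seen else score
        for token, score in logits.items()
    }


logits = {"dragon": 2.0, "knight": 1.5}   # hypothetical raw scores
generated = ["dragon", "dragon"]           # "dragon" already appeared twice
print(apply_presence_penalty(logits, generated, penalty=0.5))
```

Here "dragon" loses 0.5 regardless of having appeared twice, which puts it level with "knight". A token with a much higher starting score would still win despite the penalty.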
Frequency Penalty
Frequency Penalty operates similarly to Presence Penalty but scales with use: the penalty grows each time a specific token reappears in the generation history. While highly effective at forcing diverse vocabulary, excessive values will also penalize essential structural tokens such as pronouns and common articles, causing broken syntax and ungrammatical output. The default value is zero, which disables this parameter.
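To show why common words suffer under high values, the sketch below uses one common formulation in which the penalty is subtracted from a token's raw score once per prior occurrence. This is an assumption for illustration, not Questsmith's documented formula; the function name `apply_frequency_penalty` and the sample values are hypothetical.

```python
from collections import Counter


def apply_frequency_penalty(
    logits: dict[str, float], generated: list[str], penalty: float
) -> dict[str, float]:
    """Subtract the penalty once for every prior occurrence of each token."""
    counts = Counter(generated)  # missing tokens count as zero
    return {token: score - penalty * counts[token] for token, score in logits.items()}


logits = {"the": 3.0, "dragon": 2.0, "cave": 1.0}   # hypothetical raw scores
generated = ["the", "the", "the", "dragon"]
print(apply_frequency_penalty(logits, generated, penalty=0.5))
```

In this example "the" starts with the highest score but has already been used three times, so it drops to the same level as "dragon". With a larger penalty it would fall below "cave", which is how everyday words can get squeezed out. A penalty of zero leaves every score unchanged.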
Operational Paradigms
Zero Lasting Risk: Changing these parameters has no permanent consequences. The entire Questsmith environment remains fully editable. If an experimental configuration generates anomalous text, the output can be deleted, the values readjusted, and the generation retried instantly.
Absence of Universal Norms: While factory defaults are optimized for balanced narrative prose, they do not represent an absolute standard. The interface is explicitly designed to support custom deployment styles. Authors are encouraged to engineer bespoke profiles that match their specific narrative rhythms and world mechanics.