Default Model:
System Instruction:
Context Limit:
Temperature:
Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
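Temperature is conventionally applied by dividing the logits before the softmax; the sketch below (function names are ours, not part of this app) shows how a low temperature concentrates probability on the top token while a high temperature spreads it out.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before softmax: low temperature
    sharpens the distribution, high temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, 0.2)  # near-deterministic
flat = softmax_with_temperature(logits, 0.8)   # more random
```

With temperature 0.2 the first token captures almost all of the probability mass; at 0.8 the same logits yield a noticeably flatter distribution.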
Top K:
Only sample from the top K most likely options for each subsequent token. Used to cut off the "long tail" of low-probability responses. Min: 0
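One common way to implement top-K filtering (a sketch, not this app's actual code) is to zero out everything outside the K most probable tokens and renormalize:

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize,
    dropping the long tail."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

probs = [0.5, 0.3, 0.1, 0.06, 0.04]
filtered = top_k_filter(probs, 2)  # only the two best tokens survive
```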
Top P:
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
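Nucleus sampling can be sketched as follows (illustrative code, assuming the standard formulation): walk the tokens in order of decreasing probability and keep the smallest set whose cumulative mass reaches top_p.

```python
def top_p_filter(probs, p):
    """Nucleus sampling: keep the smallest set of tokens whose
    cumulative probability reaches p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in ranked:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    filtered = [pr if i in keep else 0.0 for i, pr in enumerate(probs)]
    total = sum(filtered)
    return [pr / total for pr in filtered]

probs = [0.5, 0.3, 0.1, 0.06, 0.04]
nucleus = top_p_filter(probs, 0.1)  # the single top token already covers 10%
```

Note that a small top_p can still keep a dominant token: here 0.1 leaves only the most probable token, because it alone exceeds the 10% mass threshold.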
Presence Penalty:
How much to penalize new tokens based on whether they appear in the text so far. Increases the model's likelihood to talk about new topics.
Frequency Penalty:
How much to penalize new tokens based on their existing frequency in the text so far. Decreases the model's likelihood to repeat the same line verbatim.
Max Tokens:
The maximum number of tokens to generate before stopping.
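In a decode loop, this limit is typically a hard cap alongside the model's own end-of-sequence token, as in this toy sketch (the stand-in "model" below is purely illustrative):

```python
def generate(next_token_fn, max_tokens, eos_token=None):
    """Toy decode loop: stop when max_tokens tokens have been
    generated or the model emits an end-of-sequence token."""
    tokens = []
    while len(tokens) < max_tokens:
        token = next_token_fn(tokens)
        if token == eos_token:
            break
        tokens.append(token)
    return tokens

# A stand-in "model" that just emits the current position index.
result = generate(lambda toks: len(toks), max_tokens=5)
```

Generation therefore stops at max_tokens even if the model would have continued, which is why responses can be cut off mid-sentence when the limit is set too low.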