Quick Start 🚀
To start using SynthGenAI, first install the package. You can install it with the following command:
After installation, simply run the following command in your terminal:
This will launch the Gradio UI for generating synthetic datasets.
Figure: Gradio UI for generating synthetic datasets.
To create a dataset, fill in the following fields in the UI:
- LLM Model: The LLM model to use (e.g., model_provider/model_name).
- Temperature: The sampling temperature for the LLM.
- Top P: The top_p (nucleus sampling) value for the LLM.
- Max Tokens: The maximum number of tokens for the LLM.
- API Base: The API base URL (optional).
- API Key: The API key (optional).
- Dataset Type: The type of dataset to generate (e.g., Raw, Instruction, Preference, Sentiment Analysis, Summarization, Text Classification).
- Topic: The topic of the dataset.
- Domains: The domains for the dataset (comma-separated).
- Language: The language of the dataset.
- Additional Description: Additional description for the dataset (optional).
- Number of Entries: The number of entries in the dataset.
- Hugging Face Token: The Hugging Face token.
- Hugging Face Repo Name: The Hugging Face repository name.
- LLM Environment Variables: Comma-separated environment variables for the LLM (e.g., KEY1=VALUE1, KEY2=VALUE2).
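
Several of the fields above take structured text: a provider/model string, comma-separated domains, and comma-separated KEY=VALUE pairs. The following is a minimal sketch of how such input could be split up, written only to illustrate the expected formats. The helper names (`split_model`, `parse_domains`, `parse_env_vars`) are hypothetical and are not part of SynthGenAI; the UI may handle these values differently.

```python
def split_model(model: str) -> tuple[str, str]:
    """Split 'model_provider/model_name' at the first slash (hypothetical helper)."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected model_provider/model_name, got {model!r}")
    return provider, name


def parse_domains(raw: str) -> list[str]:
    """Turn 'a, b, c' into ['a', 'b', 'c'], dropping empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def parse_env_vars(raw: str) -> dict[str, str]:
    """Turn 'KEY1=VALUE1, KEY2=VALUE2' into a dict (assumed format)."""
    env: dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input and trailing commas
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        env[key.strip()] = value.strip()
    return env


if __name__ == "__main__":
    print(split_model("model_provider/model_name"))
    print(parse_domains("finance, healthcare"))
    print(parse_env_vars("KEY1=VALUE1, KEY2=VALUE2"))
```

Because this sketch splits on commas, a value that itself contains a comma would be cut in two; whether the UI supports such values is not stated here.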