AI Janitor LLM Integration
AI Janitor uses large language models (LLMs) to analyze flag usage patterns and generate code removal suggestions. Choose between OpenAI, Anthropic, or bring your own self-hosted model — each with different cost, accuracy, and data residency characteristics.
Provider Comparison
| Feature | OpenAI | Anthropic | Self-Hosted |
|---|---|---|---|
| Recommended Model | gpt-4o-mini | claude-3-5-sonnet-latest | Any OpenAI-compatible |
| Data Residency | US/EU (via API) | US (via API) | Your infrastructure |
| Cost per Scan (est.) | $0.02–$0.15 | $0.05–$0.30 | Infra cost only |
| Setup Time | 5 min (API key) | 5 min (API key) | 1–4 hours (deploy model) |
| Prompt Customization | Full | Full | Full |
| Rate Limit Handling | Built-in retry | Built-in retry | Configurable |
OpenAI Setup
OpenAI is the default LLM provider. It offers the best speed/cost/accuracy balance for flag cleanup analysis and requires only an API key to get started.
1. Get an API key
Create an API key at platform.openai.com/api-keys. You'll need a funded OpenAI account with API access.
2. Configure AI Janitor
In the AI Janitor settings page, select OpenAI as your provider and enter your API key:
{
"llm_provider": "openai",
"llm_model": "gpt-4o-mini",
"openai_api_key": "sk-..."
}
3. Choose a model
- gpt-4o-mini — Recommended. Fast, cost-effective, sufficient for 95% of flag cleanup tasks.
- gpt-4o — Higher accuracy for complex refactors involving many flags or intricate logic.
Anthropic Setup
Anthropic's Claude models offer strong code understanding and are a good alternative if you prefer Anthropic's API or have existing Anthropic contracts.
1. Get an API key
Create an API key at console.anthropic.com/settings/keys.
2. Configure AI Janitor
{
"llm_provider": "anthropic",
"llm_model": "claude-3-5-sonnet-latest",
"anthropic_api_key": "sk-ant-..."
}
3. Choose a model
- claude-3-5-sonnet-latest — Recommended. Excellent code understanding and generation.
- claude-3-haiku-latest — Faster and cheaper, suitable for straightforward flag removals.
Self-Hosted Setup
For organizations with strict data residency requirements or existing GPU infrastructure, AI Janitor supports any OpenAI-compatible API endpoint. This includes popular model serving frameworks like vLLM, Ollama, and LM Studio.
Compatibility requirements
/v1/chat/completionsendpoint) and accessible from FeatureSignals' infrastructure. Models should support at least 8K tokens of context for effective code analysis.1. Deploy a compatible model
# Deploy an OpenAI-compatible model with vLLM
python -m vllm.entrypoints.openai.api_server \
--model meta-llama/Llama-3.1-8B-Instruct \
--host 0.0.0.0 \
--port 8000
2. Configure AI Janitor
{
"llm_provider": "self_hosted",
"llm_model": "meta-llama/Llama-3.1-8B-Instruct",
"self_hosted_endpoint": "https://llm.internal.example.com/v1",
"self_hosted_api_key": "optional-auth-token"
}
Rate Limits & Token Usage
AI Janitor makes multiple LLM calls per scan — one per file that references a stale flag. Understanding rate limits and token consumption helps you estimate costs and avoid throttling.
Token Estimation
- ~500 tokens per file for context (system prompt + flag metadata)
- ~1K–3K tokens per file for the code being analyzed
- ~500–2K tokens per file for the generated response (diff)
- Total per file: ~2K–5.5K tokens
- Total per 50-flag scan: ~10K–50K tokens
Rate Limit Handling
- AI Janitor automatically respects rate limit headers
- Exponential backoff with jitter on 429 responses
- Maximum 3 retries per LLM call before failing
- Scans are paused (not failed) when rate limits are hit
- Concurrent LLM calls limited to 5 by default
Prompt Customization
Advanced users can customize the system prompt AI Janitor sends to the LLM. This is useful for enforcing code style conventions, adding organization-specific instructions, or improving accuracy for your codebase.
{
"custom_system_prompt": "You are an expert code reviewer specializing in feature flag cleanup. Follow these rules:\n1. Always preserve the active code path (the branch that was always taken)\n2. Remove any now-unused imports\n3. If removing a flag eliminates the last reference to an imported module, remove that import\n4. Keep JSDoc/comment headers intact\n5. Do not modify any code outside the flag conditional block\n6. Format output as a unified diff"
}
Prompt changes affect accuracy