Evaluation Metrics
Understand the performance and health of your feature flag evaluations. Monitor request volume, latency percentiles, cache hit rates, and error rates to ensure flags are delivered fast and reliably.
Metrics Dashboard
The evaluation metrics dashboard provides a real-time view of how your flags are performing. Access it from the sidebar under Metrics. The dashboard is scoped to the current project and environment, with support for custom time ranges.
Key Metrics
Request Volume
The total number of flag evaluation requests over the selected time period. This metric helps you understand the load on your evaluation infrastructure and identify usage patterns.
- Measured in requests per second (RPS) or total requests per period
- Break down by flag, environment, or SDK to identify top consumers
- Spikes may indicate a deployment or incident — correlate with your release timeline
- Use to right-size your infrastructure and set rate-limit thresholds
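As an illustration of the arithmetic involved, the Go sketch below converts per-minute request counts into RPS and flags a spike against a trailing baseline. The window size and 3x threshold are illustrative choices, not FeatureSignals internals.

```go
package main

import "fmt"

// rps converts a request count over a window (in seconds) to requests/second.
func rps(count int, windowSeconds float64) float64 {
	return float64(count) / windowSeconds
}

// isSpike reports whether the latest per-minute count exceeds the trailing
// average by the given factor (e.g. 3x), a simple heuristic for spotting
// volume spikes worth correlating with your release timeline.
func isSpike(history []int, latest int, factor float64) bool {
	if len(history) == 0 {
		return false
	}
	sum := 0
	for _, c := range history {
		sum += c
	}
	avg := float64(sum) / float64(len(history))
	return float64(latest) > avg*factor
}

func main() {
	perMinute := []int{1180, 1210, 1195, 1225} // trailing baseline
	latest := 4100                             // current minute

	fmt.Printf("current load: %.1f RPS\n", rps(latest, 60))
	if isSpike(perMinute, latest, 3.0) {
		fmt.Println("volume spike: check your release timeline")
	}
}
```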
Latency Percentiles (p50, p95, p99)
Evaluation latency measures how long the server takes to process a flag evaluation request and return a result. It is measured on the server, from request receipt to response, so network transit time is not included.
- p50 (median) — Half of all evaluations complete faster than this value. Represents the typical user experience.
- p95 — 95% of evaluations are faster than this. A good indicator of real-world performance that excludes extreme outliers.
- p99 — 99% of evaluations are faster than this. Tracks tail latency; important for high-throughput applications where even 1% slow requests matter.
- FeatureSignals targets <1ms p99 evaluation latency (excluding network). The evaluation engine is optimized in Go for the hot path.
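To make the percentile definitions concrete, here is a minimal Go sketch that computes p50/p95/p99 from a slice of observed latencies using the nearest-rank method. Real monitoring systems typically use histograms or sketches rather than sorting raw samples.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// percentile returns the nearest-rank percentile (0 < p <= 100) of the
// samples. It sorts a copy so the caller's slice is left untouched.
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := make([]time.Duration, len(samples))
	copy(sorted, samples)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	// Nearest-rank: ceil(p/100 * N), converted to a zero-based index.
	rank := int(math.Ceil(p/100*float64(len(sorted)))) - 1
	return sorted[rank]
}

func main() {
	latencies := []time.Duration{
		400 * time.Microsecond, 450 * time.Microsecond, 500 * time.Microsecond,
		520 * time.Microsecond, 600 * time.Microsecond, 700 * time.Microsecond,
		800 * time.Microsecond, 900 * time.Microsecond, 950 * time.Microsecond,
		3 * time.Millisecond, // one slow outlier dominates p99 but not p50
	}
	for _, p := range []float64{50, 95, 99} {
		fmt.Printf("p%.0f = %v\n", p, percentile(latencies, p))
	}
}
```

Note how the single 3ms outlier leaves the median untouched but shows up immediately in the tail percentiles; this is why p99 is the metric to alert on in high-throughput services.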
Cache Hit Rate
The percentage of evaluations served from the in-memory ruleset cache versus those that required a database lookup. A high cache hit rate is essential for low-latency evaluations.
- Target: >99% cache hit rate under normal operation
- Cache misses occur after rule updates, flag toggles, or cache invalidation events
- Low cache hit rate may indicate frequent configuration changes or cache configuration issues
- The cache uses PG LISTEN/NOTIFY for cross-instance invalidation, ensuring all server instances stay synchronized
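The cross-instance invalidation pattern described above can be sketched with the jackc/pgx driver. The channel name (`flag_rules_changed`), the payload format, and the cache type here are hypothetical; the real wiring is internal to FeatureSignals.

```go
package main

import (
	"context"
	"log"
	"sync"

	"github.com/jackc/pgx/v5"
)

// rulesetCache is a trivial stand-in for the in-memory ruleset cache.
type rulesetCache struct {
	mu    sync.Mutex
	rules map[string][]byte
}

func (c *rulesetCache) invalidate(flagKey string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.rules, flagKey) // next evaluation misses and reloads from the DB
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://localhost/featuresignals")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	// Subscribe to the (hypothetical) channel that rule writers NOTIFY on.
	if _, err := conn.Exec(ctx, "LISTEN flag_rules_changed"); err != nil {
		log.Fatal(err)
	}

	cache := &rulesetCache{rules: map[string][]byte{}}
	for {
		// Blocks until another instance runs:
		//   NOTIFY flag_rules_changed, '<flag-key>'
		n, err := conn.WaitForNotification(ctx)
		if err != nil {
			log.Fatal(err)
		}
		cache.invalidate(n.Payload)
		log.Printf("invalidated ruleset for flag %q", n.Payload)
	}
}
```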
Error Rate
The percentage of evaluation requests that result in an error. Errors include timeouts, invalid API keys, missing flags, and internal server errors.
- Target: <0.01% error rate (one error per 10,000 requests)
- Client errors (4xx) — invalid API keys, missing flags, malformed requests. These count against your error budget but don't indicate a server problem
- Server errors (5xx) — internal failures. Any 5xx rate above 0% warrants immediate investigation
- The error rate dashboard breaks down errors by type, flag, and SDK version for quick root-cause analysis
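A minimal sketch of the budget math: classify responses by status class and compare the observed error rate against the 0.01% target. The counter values are illustrative.

```go
package main

import "fmt"

type counts struct {
	total        int
	clientErrors int // 4xx: invalid API keys, missing flags, malformed requests
	serverErrors int // 5xx: internal failures, always worth investigating
}

func main() {
	c := counts{total: 2_000_000, clientErrors: 150, serverErrors: 2}

	errRate := float64(c.clientErrors+c.serverErrors) / float64(c.total) * 100
	fmt.Printf("error rate: %.4f%% (target < 0.01%%)\n", errRate)

	if c.serverErrors > 0 {
		fmt.Println("5xx errors present: investigate immediately")
	}
	if errRate >= 0.01 {
		fmt.Println("error budget exceeded")
	}
}
```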
Time Range & Granularity
The metrics dashboard supports multiple time ranges to help you analyze both real-time and historical performance:
- Last 1 hour — Minute-level granularity for real-time monitoring
- Last 24 hours — Hourly granularity for daily trends
- Last 7 days — Hourly granularity for weekly patterns
- Last 30 days — Daily granularity for long-term trends
- Custom range — Select any date range; granularity adjusts automatically
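For custom ranges, the automatic granularity selection follows the same pattern as the presets. A sketch of the kind of mapping involved (the exact cutoffs used by the dashboard may differ):

```go
package main

import (
	"fmt"
	"time"
)

// bucketFor picks a chart bucket size for a given range length, mirroring
// the preset ranges above. Cutoffs are illustrative, not the dashboard's.
func bucketFor(rangeLen time.Duration) time.Duration {
	switch {
	case rangeLen <= time.Hour:
		return time.Minute // last 1 hour: minute-level
	case rangeLen <= 7*24*time.Hour:
		return time.Hour // last 24 hours / 7 days: hourly
	default:
		return 24 * time.Hour // 30 days and beyond: daily
	}
}

func main() {
	for _, r := range []time.Duration{time.Hour, 24 * time.Hour, 14 * 24 * time.Hour} {
		fmt.Printf("%v range -> %v buckets\n", r, bucketFor(r))
	}
}
```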
Filtering & Grouping
Slice and dice metrics to focus on what matters:
- By flag — View metrics for a single flag to identify performance issues specific to one feature.
- By environment — Compare dev, staging, and production metrics to catch problems before they reach users.
- By SDK — See which SDK versions are generating traffic. Identify outdated SDKs that should be upgraded.
- By SDK type — Compare server-side vs client-side evaluation patterns.
Performance Best Practices
Monitor p99 latency
Set alerts on p99 latency exceeding 5ms to catch performance regressions early. While FeatureSignals targets <1ms p99, network latency adds overhead — measure from your application, not just the server.
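To measure from the application side, wrap your SDK evaluation calls with a timer and feed the samples into whatever metrics library you already use. The `evaluate` function below is a hypothetical stand-in for a FeatureSignals SDK call, not its real API.

```go
package main

import (
	"fmt"
	"time"
)

// evaluate is a hypothetical stand-in for an SDK evaluation call; swap in
// your real client. The point is timing the call from the application so
// network overhead is included in what you alert on.
func evaluate(flagKey, userID string) bool {
	time.Sleep(300 * time.Microsecond) // simulated evaluation
	return true
}

// timedEvaluate records client-observed latency alongside the result.
func timedEvaluate(flagKey, userID string, record func(time.Duration)) bool {
	start := time.Now()
	result := evaluate(flagKey, userID)
	record(time.Since(start))
	return result
}

func main() {
	timedEvaluate("new-checkout", "user-123", func(d time.Duration) {
		// In production, push d into a histogram and alert when p99 over a
		// rolling window exceeds your 5ms threshold.
		fmt.Printf("client-observed latency: %v\n", d)
	})
}
```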
Keep SDKs updated
Newer SDK versions include performance improvements and bug fixes. Check the SDK adoption chart in Usage Insights to see which versions your team is running.
Cache strategically
Server SDKs maintain an in-memory cache of flag rules. Configure the polling interval to balance rule freshness against load on your infrastructure. The default 30-second interval works for most use cases.
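In a server SDK, the polling configuration typically looks something like the sketch below. The config type and field names are hypothetical; check your SDK's reference for the actual API.

```go
package main

import (
	"log"
	"time"
)

// Config is a hypothetical SDK client configuration; the field names are
// illustrative, not the real FeatureSignals SDK API.
type Config struct {
	APIKey       string
	PollInterval time.Duration // how often cached flag rules are refreshed
}

func main() {
	cfg := Config{
		APIKey: "sdk-server-key",
		// The 30s default balances rule freshness against server load;
		// shorten it only if serving stale rules for a few seconds is
		// unacceptable for your application.
		PollInterval: 30 * time.Second,
	}
	log.Printf("polling every %v", cfg.PollInterval)
}
```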
Next Steps
- Flag Health — monitor stale flags and flag-level health indicators
- Usage Insights — track flag evaluation trends and SDK adoption
- Evaluation Engine Architecture — understand how the evaluation engine works under the hood