Evaluation Metrics
Understand the performance and health of your feature flag evaluations. Monitor request volume, latency percentiles, cache hit rates, and error rates to ensure flags are delivered fast and reliably.
Metrics Dashboard
The evaluation metrics dashboard provides a real-time view of how your flags are performing. Access it from the sidebar under Metrics. The dashboard is scoped to the current project and environment, with support for custom time ranges.
Key Metrics
Request Volume
The total number of flag evaluation requests over the selected time period. This metric helps you understand the load on your evaluation infrastructure and identify usage patterns.
- Measured in requests per second (RPS) or total requests per period
- Break down by flag, environment, or SDK to identify top consumers
- Spikes may indicate a deployment or incident — correlate with your release timeline
- Use to right-size your infrastructure and set rate-limit thresholds
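As an illustration of the arithmetic involved, the Go sketch below converts per-minute request counts into RPS and flags a spike against a trailing baseline. The window size and 3x threshold are illustrative choices, not FeatureSignals internals.

```go
package main

import "fmt"

// rps converts a request count over a window (in seconds) to requests/second.
func rps(count int, windowSeconds float64) float64 {
	return float64(count) / windowSeconds
}

// isSpike reports whether the latest per-minute count exceeds the trailing
// average by the given factor (e.g. 3x), a simple heuristic for spotting
// volume spikes worth correlating with your release timeline.
func isSpike(history []int, latest int, factor float64) bool {
	if len(history) == 0 {
		return false
	}
	sum := 0
	for _, c := range history {
		sum += c
	}
	avg := float64(sum) / float64(len(history))
	return float64(latest) > avg*factor
}

func main() {
	perMinute := []int{1180, 1210, 1195, 1225} // trailing baseline
	latest := 4100                             // current minute

	fmt.Printf("current load: %.1f RPS\n", rps(latest, 60))
	if isSpike(perMinute, latest, 3.0) {
		fmt.Println("volume spike: check your release timeline")
	}
}
```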
Latency Percentiles (p50, p95, p99)
Evaluation latency measures how long the server takes to process a flag evaluation request and return a result. It is measured on the server, from request receipt to response, so network transit time is not included.
- p50 (median) — Half of all evaluations complete faster than this value. Represents the typical user experience.
- p95 — 95% of evaluations are faster than this. A good indicator of real-world performance that excludes extreme outliers.
- p99 — 99% of evaluations are faster than this. Tracks tail latency; important for high-throughput applications where even 1% slow requests matter.
- FeatureSignals targets <1ms p99 evaluation latency (excluding network). The evaluation engine is optimized in Go for the hot path.
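To make the percentile definitions concrete, here is a minimal Go sketch that computes p50/p95/p99 from a slice of observed latencies using the nearest-rank method. Real monitoring systems typically use histograms or sketches rather than sorting raw samples.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// percentile returns the nearest-rank percentile (0 < p <= 100) of the
// samples. It sorts a copy so the caller's slice is left untouched.
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := make([]time.Duration, len(samples))
	copy(sorted, samples)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	// Nearest-rank: ceil(p/100 * N), converted to a zero-based index.
	rank := int(math.Ceil(p/100*float64(len(sorted)))) - 1
	return sorted[rank]
}

func main() {
	latencies := []time.Duration{
		400 * time.Microsecond, 450 * time.Microsecond, 500 * time.Microsecond,
		520 * time.Microsecond, 600 * time.Microsecond, 700 * time.Microsecond,
		800 * time.Microsecond, 900 * time.Microsecond, 950 * time.Microsecond,
		3 * time.Millisecond, // one slow outlier dominates p99 but not p50
	}
	for _, p := range []float64{50, 95, 99} {
		fmt.Printf("p%.0f = %v\n", p, percentile(latencies, p))
	}
}
```

Note how the single 3ms outlier leaves the median untouched but shows up immediately in the tail percentiles; this is why p99 is the metric to alert on in high-throughput services.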
Cache Hit Rate
The percentage of evaluations served from the in-memory ruleset cache versus those that required a database lookup. A high cache hit rate is essential for low-latency evaluations.
- Target: >99% cache hit rate under normal operation
- Cache misses occur after rule updates, flag toggles, or cache invalidation events
- Low cache hit rate may indicate frequent configuration changes or cache configuration issues
- The cache uses PG LISTEN/NOTIFY for cross-instance invalidation, ensuring all server instances stay synchronized
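The cross-instance invalidation pattern described above can be sketched with the jackc/pgx driver. The channel name (`flag_rules_changed`), the payload format, and the cache type here are hypothetical; the real wiring is internal to FeatureSignals.

```go
package main

import (
	"context"
	"log"
	"sync"

	"github.com/jackc/pgx/v5"
)

// rulesetCache is a trivial stand-in for the in-memory ruleset cache.
type rulesetCache struct {
	mu    sync.Mutex
	rules map[string][]byte
}

func (c *rulesetCache) invalidate(flagKey string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.rules, flagKey) // next evaluation misses and reloads from the DB
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://localhost/featuresignals")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	// Subscribe to the (hypothetical) channel that rule writers NOTIFY on.
	if _, err := conn.Exec(ctx, "LISTEN flag_rules_changed"); err != nil {
		log.Fatal(err)
	}

	cache := &rulesetCache{rules: map[string][]byte{}}
	for {
		// Blocks until another instance runs:
		//   NOTIFY flag_rules_changed, '<flag-key>'
		n, err := conn.WaitForNotification(ctx)
		if err != nil {
			log.Fatal(err)
		}
		cache.invalidate(n.Payload)
		log.Printf("invalidated ruleset for flag %q", n.Payload)
	}
}
```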
Error Rate
The percentage of evaluation requests that result in an error. Errors include timeouts, invalid API keys, missing flags, and internal server errors.
- Target: <0.01% error rate (one error per 10,000 requests)
- Client errors (4xx) — invalid API keys, missing flags, malformed requests. These count against your error budget but don't indicate a server problem
- Server errors (5xx) — internal failures. Any 5xx rate above 0% warrants immediate investigation
- The error rate dashboard breaks down errors by type, flag, and SDK version for quick root-cause analysis
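A minimal sketch of the budget math: classify responses by status class and compare the observed error rate against the 0.01% target. The counter values are illustrative.

```go
package main

import "fmt"

type counts struct {
	total        int
	clientErrors int // 4xx: invalid API keys, missing flags, malformed requests
	serverErrors int // 5xx: internal failures, always worth investigating
}

func main() {
	c := counts{total: 2_000_000, clientErrors: 150, serverErrors: 2}

	errRate := float64(c.clientErrors+c.serverErrors) / float64(c.total) * 100
	fmt.Printf("error rate: %.4f%% (target < 0.01%%)\n", errRate)

	if c.serverErrors > 0 {
		fmt.Println("5xx errors present: investigate immediately")
	}
	if errRate >= 0.01 {
		fmt.Println("error budget exceeded")
	}
}
```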
Time Range & Granularity
The metrics dashboard supports multiple time ranges to help you analyze both real-time and historical performance:
- Last 1 hour — Minute-level granularity for real-time monitoring
- Last 24 hours — Hourly granularity for daily trends
- Last 7 days — Hourly granularity for weekly patterns
- Last 30 days — Daily granularity for long-term trends
- Custom range — Select any date range; granularity adjusts automatically
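For custom ranges, the automatic granularity selection follows the same pattern as the presets. A sketch of the kind of mapping involved (the exact cutoffs used by the dashboard may differ):

```go
package main

import (
	"fmt"
	"time"
)

// bucketFor picks a chart bucket size for a given range length, mirroring
// the preset ranges above. Cutoffs are illustrative, not the dashboard's.
func bucketFor(rangeLen time.Duration) time.Duration {
	switch {
	case rangeLen <= time.Hour:
		return time.Minute // last 1 hour: minute-level
	case rangeLen <= 7*24*time.Hour:
		return time.Hour // last 24 hours / 7 days: hourly
	default:
		return 24 * time.Hour // 30 days and beyond: daily
	}
}

func main() {
	for _, r := range []time.Duration{time.Hour, 24 * time.Hour, 14 * 24 * time.Hour} {
		fmt.Printf("%v range -> %v buckets\n", r, bucketFor(r))
	}
}
```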
Filtering & Grouping
Slice and dice metrics to focus on what matters:
- By flag — View metrics for a single flag to identify performance issues specific to one feature.
- By environment — Compare dev, staging, and production metrics to catch problems before they reach users.
- By SDK — See which SDK versions are generating traffic. Identify outdated SDKs that should be upgraded.
- By SDK type — Compare server-side vs client-side evaluation patterns.
Performance Best Practices
Monitor p99 latency
Set alerts on p99 latency exceeding 5ms to catch performance regressions early. While FeatureSignals targets <1ms p99, network latency adds overhead — measure from your application, not just the server.
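To measure from the application side, wrap your SDK evaluation calls with a timer and feed the samples into whatever metrics library you already use. The `evaluate` function below is a hypothetical stand-in for a FeatureSignals SDK call, not its real API.

```go
package main

import (
	"fmt"
	"time"
)

// evaluate is a hypothetical stand-in for an SDK evaluation call; swap in
// your real client. The point is timing the call from the application so
// network overhead is included in what you alert on.
func evaluate(flagKey, userID string) bool {
	time.Sleep(300 * time.Microsecond) // simulated evaluation
	return true
}

// timedEvaluate records client-observed latency alongside the result.
func timedEvaluate(flagKey, userID string, record func(time.Duration)) bool {
	start := time.Now()
	result := evaluate(flagKey, userID)
	record(time.Since(start))
	return result
}

func main() {
	timedEvaluate("new-checkout", "user-123", func(d time.Duration) {
		// In production, push d into a histogram and alert when p99 over a
		// rolling window exceeds your 5ms threshold.
		fmt.Printf("client-observed latency: %v\n", d)
	})
}
```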
Keep SDKs updated
Newer SDK versions include performance improvements and bug fixes. Check the SDK adoption chart in Usage Insights to see which versions your team is running.
Cache strategically
Server SDKs maintain an in-memory cache of flag rules. Configure the polling interval to balance rule freshness against load on your infrastructure. The default 30-second interval works for most use cases.
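In a server SDK, the polling configuration typically looks something like the sketch below. The config type and field names are hypothetical; check your SDK's reference for the actual API.

```go
package main

import (
	"log"
	"time"
)

// Config is a hypothetical SDK client configuration; the field names are
// illustrative, not the real FeatureSignals SDK API.
type Config struct {
	APIKey       string
	PollInterval time.Duration // how often cached flag rules are refreshed
}

func main() {
	cfg := Config{
		APIKey: "sdk-server-key",
		// The 30s default balances rule freshness against server load;
		// shorten it only if serving stale rules for a few seconds is
		// unacceptable for your application.
		PollInterval: 30 * time.Second,
	}
	log.Printf("polling every %v", cfg.PollInterval)
}
```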
Next Steps
- Flag Health — monitor stale flags and flag-level health indicators
- Usage Insights — track flag evaluation trends and SDK adoption
- Evaluation Engine Architecture — understand how the evaluation engine works under the hood