# Dynamic Routing Connector
The Dynamic Routing Connector is an OpenTelemetry Collector connector that routes telemetry data (traces, logs, and metrics) to different pipelines based on the estimated cardinality of unique combinations of configured metadata keys. It uses the HyperLogLog algorithm to efficiently estimate cardinality without storing all unique identifiers, making it memory-efficient even at scale. The metadata keys you configure determine what type of cardinality is being measured—whether that's unique connections, unique services, unique pods, or any other combination of metadata attributes.
## Supported Pipeline Types
| Exporter Pipeline Type | Receiver Pipeline Type | Stability Level |
|---|---|---|
| logs | logs | development |
| metrics | metrics | development |
| traces | traces | development |
## Overview
The Dynamic Routing Connector enables intelligent, data-driven routing of telemetry signals to different processing pipelines based on the observed cardinality of unique combinations of metadata keys. The connector estimates how many unique combinations of the configured metadata keys exist for each value of the partition key (e.g., tenant ID). This is particularly useful in multi-tenant environments or scenarios where different workloads require different processing strategies based on their cardinality characteristics.
### The Problem It Solves
Traditional OpenTelemetry Collector configurations use static routing rules that are defined at configuration time. This approach has limitations:
- Static Configuration: Routing decisions are fixed and cannot adapt to changing traffic patterns
- One-Size-Fits-All: All data flows through the same pipeline regardless of volume or connection patterns
- Inefficient Resource Usage: High-cardinality tenants may need different batching, sampling, or processing strategies than low-cardinality ones
- Manual Tuning Required: Operators must manually configure routing rules based on assumptions about traffic patterns
The Dynamic Routing Connector fills this gap by providing adaptive, cardinality-based routing that automatically adjusts pipeline selection based on observed cardinality patterns derived from configured metadata keys.
### How It Works

The connector uses the following approach:

1. **Metadata Extraction**: For each incoming telemetry signal, the connector extracts metadata from the client context using the configured `routing_keys.partition_by` and `routing_keys.measure_by`. The `partition_by` keys are used to create a composite key that partitions cardinality estimates (e.g., per tenant, per tenant+type, per region+environment). Multiple keys can be specified to create composite partitions. The `measure_by` keys define what unique combinations are counted for each composite partition key value.

2. **Cardinality Estimation**: The connector uses the HyperLogLog algorithm to estimate the number of unique combinations of the configured `measure_by` keys for each composite value of the `partition_by` keys. Composite keys are constructed by concatenating values from all specified partition keys, separated by colons (`:`) for multiple values of the same key and semicolons (`;`) for different keys.

3. **Threshold-Based Routing**: Based on the estimated cardinality for each partition key value, the connector routes data to different pipelines defined by threshold boundaries. For example:
   - Low cardinality (0-10 unique combinations) → Pipeline A
   - Medium cardinality (11-100 unique combinations) → Pipeline B
   - High cardinality (101+ unique combinations) → Pipeline C

4. **Periodic Re-evaluation**: At configurable intervals, the connector re-evaluates routing decisions based on the most recent cardinality estimates, allowing it to adapt to changing patterns.

5. **Memory Efficiency**: The HyperLogLog algorithm provides accurate cardinality estimates with minimal memory overhead, making it suitable for high-throughput scenarios.
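The first two steps can be sketched as follows. This is an illustrative Go sketch, not the connector's actual code: it uses an exact set in place of a HyperLogLog sketch so the example stays self-contained, whereas the real connector trades exactness for constant per-partition memory.

```go
package main

import (
	"fmt"
	"strings"
)

// cardinalityTracker is an illustrative stand-in for the connector's
// per-partition HyperLogLog sketches. It counts unique measure_by
// combinations exactly with a set, demonstrating the same flow
// (observe, then estimate) at the cost of memory efficiency.
type cardinalityTracker struct {
	seen map[string]map[string]struct{} // partition key -> set of combinations
}

func newTracker() *cardinalityTracker {
	return &cardinalityTracker{seen: make(map[string]map[string]struct{})}
}

// observe records one signal's measure_by values for a partition.
func (t *cardinalityTracker) observe(partition string, measureValues []string) {
	if t.seen[partition] == nil {
		t.seen[partition] = make(map[string]struct{})
	}
	t.seen[partition][strings.Join(measureValues, ";")] = struct{}{}
}

// estimate returns the number of unique combinations seen for a partition.
func (t *cardinalityTracker) estimate(partition string) int {
	return len(t.seen[partition])
}

func main() {
	tr := newTracker()
	// Two distinct connections (IP + user agent) for tenant-a, one repeated.
	tr.observe("tenant-a", []string{"10.0.0.1", "curl/8.0"})
	tr.observe("tenant-a", []string{"10.0.0.2", "curl/8.0"})
	tr.observe("tenant-a", []string{"10.0.0.1", "curl/8.0"})
	fmt.Println(tr.estimate("tenant-a")) // 2
}
```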
## Configuration

### Basic Configuration
```yaml
connectors:
  dynamicrouting:
    routing_keys:
      partition_by: ["x-tenant-id"]
      measure_by:
        - "x-forwarded-for"
        - "user-agent"
    routing_pipelines:
      - pipelines: ["traces/low_cardinality"]
        max_cardinality: 10
      - pipelines: ["traces/medium_cardinality"]
        max_cardinality: 100
      - pipelines: ["traces/high_cardinality"]
        max_cardinality: 500
      - pipelines: ["traces/very_high_cardinality"]
        max_cardinality: .inf
    default_pipelines: ["traces/default"]
    evaluation_interval: 30s
```
### Composite Partition Keys

You can specify multiple keys in `routing_keys.partition_by` to create composite partitions. This is useful when you want to track cardinality per combination of multiple dimensions (e.g., per tenant AND tenant type, or per region AND environment).
```yaml
connectors:
  dynamicrouting:
    routing_keys:
      # Composite key: partitions by both tenant and tenant type
      partition_by: ["x-tenant-id", "x-tenant-type"]
      measure_by:
        - "x-forwarded-for"
        - "user-agent"
    routing_pipelines:
      - pipelines: ["traces/low_cardinality"]
        max_cardinality: 10
      - pipelines: ["traces/high_cardinality"]
        max_cardinality: .inf
    default_pipelines: ["traces/default"]
    evaluation_interval: 30s
```
In this example, the connector will:

- Create separate cardinality estimates for each unique combination of `x-tenant-id` and `x-tenant-type` (for example: `tenant-a;premium`, `tenant-a;standard`, `tenant-b;premium`, etc.)
- Give each composite key its own routing decision based on its cardinality

**Composite Key Construction**: Values from multiple keys are concatenated with colons (`:`) separating multiple values of the same key, and semicolons (`;`) separating different keys. If a key is missing from the metadata, it is skipped in the composite key construction.
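The construction rule can be sketched in Go. `buildCompositeKey` and its signature are illustrative, not the connector's actual internals; the behavior follows the separator and missing-key rules just described.

```go
package main

import (
	"fmt"
	"strings"
)

// buildCompositeKey joins metadata values into a composite partition key:
// multiple values of the same key are joined with ":", the per-key groups
// are joined with ";", and keys absent from the metadata are skipped.
// Names here are illustrative, not the connector's actual internals.
func buildCompositeKey(partitionBy []string, metadata map[string][]string) string {
	var parts []string
	for _, key := range partitionBy {
		values, ok := metadata[key]
		if !ok {
			continue // missing keys are skipped
		}
		parts = append(parts, strings.Join(values, ":"))
	}
	return strings.Join(parts, ";")
}

func main() {
	md := map[string][]string{
		"x-tenant-id":   {"tenant-a"},
		"x-tenant-type": {"premium"},
	}
	fmt.Println(buildCompositeKey([]string{"x-tenant-id", "x-tenant-type"}, md)) // tenant-a;premium
}
```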
### Configuration Fields

| Field | Type | Description | Required |
|---|---|---|---|
| `routing_keys` | `RoutingKeys` | Configuration object for routing keys. Contains the `partition_by` and `measure_by` fields. | Yes |
| `routing_keys.partition_by` | `[]string` | Metadata keys used to create a composite key for partitioning cardinality estimates. Multiple keys can be specified to create composite partitions (e.g., `["x-tenant-id"]` for per-tenant, or `["x-tenant-id", "x-tenant-type"]` for per-tenant+type). Composite keys are constructed by concatenating values from all specified keys, and each unique composite key value gets its own cardinality estimate. At least one key must be specified. | Yes |
| `routing_keys.measure_by` | `[]string` | Metadata keys that define the unique combinations counted for cardinality estimation. The connector counts how many unique combinations of these keys exist for each composite value of `partition_by`. The choice of keys determines what type of cardinality is measured (e.g., unique connections, unique pods, unique deployments). | No |
| `routing_pipelines` | `[]RoutingPipeline` | Array of pipeline configurations, each containing `pipelines` (array of pipeline IDs) and `max_cardinality` (`float64`). Entries must be defined in ascending order of `max_cardinality`, and the last entry must have `max_cardinality` set to `.inf` (positive infinity). The connector routes to the first entry where the estimated cardinality is less than or equal to `max_cardinality`. | Yes |
| `default_pipelines` | `[]pipeline.ID` | Pipelines to use when all partition keys are missing from the client context. | Yes |
| `evaluation_interval` | `duration` | How often to re-evaluate routing decisions based on new cardinality estimates. Default: `30s`. | No |
### Configuration Rules

- `routing_keys.partition_by` must contain at least one key
- `routing_pipelines` must contain at least one pipeline configuration
- `routing_pipelines` must be defined in ascending order of `max_cardinality` values
- The last entry in `routing_pipelines` must have `max_cardinality` set to `.inf` (positive infinity)
- Each pipeline configuration must specify at least one pipeline ID in its `pipelines` array
## Routing Logic

The connector routes data based on the estimated cardinality for the composite partition key:

- If all `routing_keys.partition_by` keys are missing from the client context, it routes to `default_pipelines`
- Otherwise, it constructs a composite key from the `partition_by` keys and routes to the first entry in `routing_pipelines` where `estimated_cardinality ≤ max_cardinality`
  - The connector iterates through `routing_pipelines` in order and selects the first entry where the condition is met
  - Since the last entry must have `max_cardinality: .inf`, every cardinality value matches at least one entry
  - Composite keys are created by concatenating values from all `partition_by` keys (values separated by `:`, keys separated by `;`)
## Use Cases

### Dynamic Batching Based on Cardinality

One of the most powerful use cases for the Dynamic Routing Connector is implementing dynamic batching strategies based on cardinality. By configuring `routing_keys.measure_by` to represent unique connections (e.g., source IP and user agent), you can route tenants with different connection volumes to different batching pipelines.
Scenario: You're operating a multi-tenant observability platform where different tenants have vastly different cardinality patterns. Some tenants have a few unique combinations (e.g., few connections, few pods, few services), while others have many.
Problem: Using a single batching configuration for all tenants leads to:
- Low-cardinality tenants: Small batches that are inefficient and increase overhead
- High-cardinality tenants: Large batches that may cause memory pressure and latency spikes
Solution: Use the Dynamic Routing Connector to route tenants to different pipelines with optimized batching configurations based on their cardinality:
```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

connectors:
  dynamicrouting:
    routing_keys:
      partition_by: ["x-tenant-id"]
      measure_by:
        - "x-forwarded-for"
        - "user-agent"
    routing_pipelines:
      # ≤10 unique connections: small batches, frequent flush
      - pipelines: ["traces/small_batch"]
        max_cardinality: 10
      # ≤50 unique connections: medium batches
      - pipelines: ["traces/medium_batch"]
        max_cardinality: 50
      # ≤200 unique connections: large batches
      - pipelines: ["traces/large_batch"]
        max_cardinality: 200
      # >200 unique connections: very large batches, aggressive batching
      - pipelines: ["traces/xlarge_batch"]
        max_cardinality: .inf
    default_pipelines: ["traces/default"]
    evaluation_interval: 30s

processors:
  batch/small:
    timeout: 1s
    send_batch_size: 100
    send_batch_max_size: 200
  batch/medium:
    timeout: 5s
    send_batch_size: 500
    send_batch_max_size: 1000
  batch/large:
    timeout: 10s
    send_batch_size: 2000
    send_batch_max_size: 5000
  batch/xlarge:
    timeout: 30s
    send_batch_size: 5000
    send_batch_max_size: 10000

exporters:
  otlp/elastic:
    endpoint: https://elastic-cloud-endpoint:443
    headers:
      Authorization: "Bearer ${ELASTIC_API_KEY}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [dynamicrouting]
    traces/small_batch:
      receivers: [dynamicrouting]
      processors: [batch/small]
      exporters: [otlp/elastic]
    traces/medium_batch:
      receivers: [dynamicrouting]
      processors: [batch/medium]
      exporters: [otlp/elastic]
    traces/large_batch:
      receivers: [dynamicrouting]
      processors: [batch/large]
      exporters: [otlp/elastic]
    traces/xlarge_batch:
      receivers: [dynamicrouting]
      processors: [batch/xlarge]
      exporters: [otlp/elastic]
    traces/default:
      receivers: [dynamicrouting]
      processors: [batch/medium]
      exporters: [otlp/elastic]
```

Note that, as with any Collector connector, `dynamicrouting` is listed as an exporter of the upstream `traces` pipeline and as a receiver of each downstream pipeline.
**How It Works**:

1. **Cardinality Tracking**: For each tenant (identified by `x-tenant-id`), the connector tracks unique combinations of the configured `routing_keys.measure_by` keys (`x-forwarded-for` and `user-agent` in this example). This measures the cardinality of unique connection combinations per tenant.

2. **Cardinality Estimation**: Using HyperLogLog, the connector estimates how many unique combinations each tenant has without storing all identifiers. In this case, it estimates unique connection combinations.

3. **Dynamic Routing**: Based on the estimated cardinality:
   - Tenant A (5 unique connection combinations) → `traces/small_batch` pipeline with 1s timeout and 100-item batches
   - Tenant B (25 unique connection combinations) → `traces/medium_batch` pipeline with 5s timeout and 500-item batches
   - Tenant C (150 unique connection combinations) → `traces/large_batch` pipeline with 10s timeout and 2000-item batches
   - Tenant D (500 unique connection combinations) → `traces/xlarge_batch` pipeline with 30s timeout and 5000-item batches

4. **Adaptive Behavior**: Every 30 seconds, the connector re-evaluates routing decisions. If Tenant A's cardinality grows to 15, it automatically switches to the `traces/medium_batch` pipeline.
**Benefits**:
- Optimized Throughput: High-cardinality tenants benefit from larger batches, reducing overhead
- Lower Latency: Low-cardinality tenants get faster processing with smaller batches
- Resource Efficiency: Memory and CPU usage are optimized per tenant workload
- Automatic Adaptation: No manual intervention needed as traffic patterns change
## Implementation Details

### HyperLogLog Algorithm
The connector uses the HyperLogLog probabilistic data structure to estimate cardinality. This provides:
- Memory Efficiency: Constant memory usage regardless of the number of unique combinations being tracked
- Accuracy: Typical error rate of ~1% for cardinality estimation
- Performance: O(1) insertions; estimation cost depends only on the fixed number of registers, not on the number of items observed
### Evaluation Interval

The `evaluation_interval` determines how frequently routing decisions are updated:
- Shorter intervals: More responsive to changes but higher CPU usage
- Longer intervals: More stable routing but slower adaptation to traffic changes
- Recommended: 30-60 seconds for most use cases
## Warnings

### Statefulness
This connector maintains state (HyperLogLog sketches) in memory. Important considerations:
- Memory Usage: Memory usage scales with the number of unique partition key values, not the total number of unique combinations being tracked
- State Loss: State is lost on collector restart. Routing decisions will rebuild over the evaluation interval
- High-Cardinality Partition Keys: If you have many unique composite values for `routing_keys.partition_by`, memory usage will increase proportionally. Using multiple keys in `partition_by` creates more partitions (one per unique combination), which increases memory usage.
### Metadata Requirements

- The connector requires client metadata to be set in the context. Ensure your receivers/proxies propagate metadata appropriately
- If all `routing_keys.partition_by` keys are missing, data routes to `default_pipelines`. If some (but not all) keys are missing, the composite key is constructed from the available keys
- Missing `routing_keys.measure_by` keys will still work, but the cardinality estimate is then based on the partition key alone (which may not provide meaningful cardinality measurements)
- The choice of `measure_by` keys determines what type of cardinality is measured; choose keys that represent the unique combinations you want to track
## Troubleshooting

### All Data Routes to the Default Pipeline

- Check: Verify that at least one of the `routing_keys.partition_by` keys is present in the client metadata
- Solution: Ensure your receiver or proxy is setting the metadata in the context

### Routing Not Updating

- Check: Verify that `evaluation_interval` is not too long
- Solution: Reduce the interval; evaluation cannot be triggered manually short of restarting the collector

### High Memory Usage

- Check: The number of unique composite values for `routing_keys.partition_by`
- Solution: Consider using a more selective partition key, or increase the evaluation interval to reduce state accumulation
## Contributing
Contributions are welcome! Please see the main repository contributing guidelines for details.
## Documentation

Package `dynamicroutingconnector` provides a connector for dynamically routing requests to different pipelines depending on the configuration.

### type Config
```go
type Config struct {
	RoutingKeys        RoutingKeys       `mapstructure:"routing_keys"`
	DefaultPipelines   []pipeline.ID     `mapstructure:"default_pipelines"`
	EvaluationInterval time.Duration     `mapstructure:"evaluation_interval"`
	RoutingPipelines   []RoutingPipeline `mapstructure:"routing_pipelines"`
}
```