Handling sensitive data
When implementing OpenTelemetry, it’s crucial to be mindful of sensitive data handling. The collection of telemetry data always carries the risk of inadvertently capturing sensitive or personal information that may be subject to various privacy regulations and compliance requirements.
Your responsibility
OpenTelemetry collects telemetry data, but it can’t determine what data is sensitive in your specific context on its own. As the implementer, you are responsible for:
- Ensuring compliance with applicable privacy laws and regulations.
- Protecting sensitive information in your telemetry data.
- Obtaining necessary consents for data collection.
- Implementing appropriate data handling and storage practices.
Additionally, you are responsible for understanding and reviewing the telemetry data emitted by any instrumentation libraries you use, as these libraries may collect and expose sensitive information as well.
Sensitive data considerations
What data is sensitive varies from situation to situation. Examples include:
- Personally Identifiable Information (PII)
- Authentication credentials
- Session tokens
- Financial information
- Health-related data
- User behavior data
Data minimization
When collecting potentially sensitive data through telemetry, follow the principle of data minimization. This means:
- Only collect data that serves an observability purpose.
- Avoid collecting personal information unless absolutely necessary.
- Consider whether aggregated or anonymized data could serve the same purpose.
- Regularly review collected attributes to ensure they remain necessary.
Protecting sensitive data
As outlined in the previous section, the best way to avoid collecting sensitive data is not to collect data that might be sensitive in the first place. However, you might need to collect such data under certain circumstances, or you might not have full control over the data being collected, and then need ways to scrub it in post-processing. The following suggestions can help you with that.
The OpenTelemetry Collector provides several processors that can help manage sensitive data:
- attributesprocessor: Remove or modify specific attributes.
- filterprocessor: Filter out entire spans or metrics containing sensitive data.
- redactionprocessor: Delete span, log, and metric datapoint attributes that don’t match a list of allowed attributes.
- transformprocessor: Transform data using OpenTelemetry Transformation Language (OTTL) statements, including regular-expression replacements.
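As an illustration of the filter processor from the list above, the following is a minimal sketch that drops every span carrying a given attribute. The instance name `filter/drop_sensitive` and the attribute key `http.request.header.authorization` are assumptions chosen for this example; substitute the attributes that are sensitive in your own context.

```yaml
processors:
  filter/drop_sensitive: # hypothetical instance name
    # Log and skip a condition that fails to evaluate instead of failing the batch.
    error_mode: ignore
    traces:
      span:
        # A span is dropped when any listed condition evaluates to true.
        # The attribute key below is an illustrative assumption.
        - 'attributes["http.request.header.authorization"] != nil'
```

Dropping whole spans is a blunt tool: it also removes the non-sensitive information those spans carry, so removing or modifying individual attributes is often preferable when the rest of the span is still useful.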
Deleting and hashing user information
The following configuration for the attributes processor hashes the
user.email attribute and deletes the user.full_name attribute:
processors:
  attributes/example:
    actions:
      - key: user.email
        action: hash
      - key: user.full_name
        action: delete
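A processor only takes effect once it is listed in a pipeline. The following sketch shows one way to wire the attributes/example processor into a traces pipeline. The otlp receiver and debug exporter are assumptions made to keep the example complete; use the receivers and exporters of your own deployment.

```yaml
receivers:
  otlp: # assumed receiver for this sketch
    protocols:
      grpc:

processors:
  attributes/example:
    actions:
      - key: user.email
        action: hash
      - key: user.full_name
        action: delete

exporters:
  debug: # assumed exporter, prints telemetry to the Collector's own output

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Processors run in the order listed here.
      processors: [attributes/example]
      exporters: [debug]
```

The same processor would need to be added to the logs and metrics pipelines as well if those signals can carry the same attributes.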
Replacing user.id with user.hash
The following configuration for the transform processor can be used to remove
the user.id and replace it with a user.hash:
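processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # Store a SHA-256 hash of user.id, but only when the attribute is set.
          - set(attributes["user.hash"], SHA256(attributes["user.id"])) where attributes["user.id"] != nil
          # Remove the original identifier.
          - delete_key(attributes, "user.id")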
Hashing the ID or name of a user may not provide the level of anonymization you need: when the input space is small and predictable (for example, numeric user IDs), a hash can be reversed in practice by hashing every candidate value and comparing the results.
Truncating IP addresses
As an alternative to hashing, you can truncate data or group it by a common prefix or suffix. For example, this applies to:
- dates, where you keep only the year or the year and the month, but drop the day.
- email addresses, where you drop the local part and only keep the domain.
- IP addresses, where you drop the last octet of IPv4 or the last 80 bits of IPv6.
The following configuration for the transform processor drops the last octet
of a client.address attribute:
processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # Replace the final ".<digits>" of an IPv4 address with ".0".
          - replace_pattern(attributes["client.address"], "\\.\\d+$", ".0")
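The same approach can be applied to the email case mentioned above. The following is a minimal sketch, assuming the address is stored in a user.email attribute (the key used in the earlier hashing example) and using a hypothetical instance name `transform/email_domain`. It strips the local part and the `@` sign so that only the domain remains.

```yaml
processors:
  transform/email_domain: # hypothetical instance name
    trace_statements:
      - context: span
        statements:
          # Remove everything up to and including the first "@",
          # keeping only the domain part of the address.
          - replace_pattern(attributes["user.email"], "^[^@]+@", "")
```

Keep in mind that a domain can still identify a person when it belongs to an individual or a very small organization, so review whether this level of truncation is enough for your use case.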
Deleting attributes with the redaction processor
Finally, an example of using the redaction processor to delete certain attributes can be found in the “Scrub sensitive data” section of the security best practices page for Collector configurations.
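For orientation, an allowlist-based redaction configuration might look like the following sketch. It is not the example from the linked page: the allowed keys and the blocked value pattern are assumptions chosen for illustration, and you should check the redaction processor's own documentation for the options supported by your Collector version.

```yaml
processors:
  redaction:
    # Delete every attribute whose key is not listed under allowed_keys.
    allow_all_keys: false
    allowed_keys: # illustrative keys, adjust to your own telemetry
      - http.request.method
      - http.response.status_code
    # Mask values matching these regular expressions, even on allowed keys.
    blocked_values:
      - "4[0-9]{12}(?:[0-9]{3})?" # pattern resembling a Visa card number
    # Attach a summary of what was redacted, which helps while tuning the lists.
    summary: debug
```

An allowlist fails closed: attributes introduced later by new instrumentation are deleted until you explicitly allow them, which fits the data minimization principle described earlier.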