Skip to content

Conversation

lzchen
Copy link
Member

@lzchen lzchen commented Aug 27, 2025

This change refactors statsbeat from using globals to a centrally managed StatsbeatManager. This makes it easier to enforce singleton as well as sets up statsbeat to be able to make configuration changes easier during runtime (for OneSettings).

Also fixes some bugs found in customer sdkstats:

  1. Circular import error found in both regular and customer sdkStats, so moved to using lazy imports in some places.
  2. Setting _is_customer_sdkstats as an attribute directly on AzureMonitorMetricExporter does not mark it properly as a stats exporter, fixed to pass in a parameter to mark it instead.
  3. We do not want to set credential field to None for customer sdkstats, we want to use the same credential for AAD.
  4. We do not want to collect regular sdkStats data on a customer sdk stats exporter, the current logic does not handle this.
  5. Customer sdkstats export interval was set to 900 ms, not 90000 (15m) by default.
  6. Moved all of customer sdkstats tests out of test_base_exporter.py and improved testing coverage.

The bulk of the net new changes are in _manager.py. It is designed so that making config changes dynamically during runtime with OneSettings in the future will be easier

@rads-1996

@Copilot Copilot AI review requested due to automatic review settings August 27, 2025 00:45
@github-actions github-actions bot added the Monitor - Exporter Monitor OpenTelemetry Exporter label Aug 27, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the statsbeat module to use a centralized StatsbeatManager class instead of global variables. The refactoring introduces a singleton-based, thread-safe manager that enforces proper initialization and shutdown while supporting dynamic reconfiguration for OneSettings.

Key changes:

  • Introduction of StatsbeatManager singleton class for centralized statsbeat management
  • Replacement of global _STATSBEAT_METRICS variable with managed instance
  • Addition of StatsbeatConfig class for configuration management
  • Enhanced thread safety with proper locking mechanisms

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
_manager.py New file implementing StatsbeatManager singleton and StatsbeatConfig classes
_statsbeat.py Simplified to delegate to StatsbeatManager instead of managing globals
_exporter.py Refactored to use composition with delayed import to avoid circular dependencies
_statsbeat_metrics.py Updated to use delayed imports and fixed circular import issues
_utils.py Added delayed import for StorageExportResult to prevent circular imports
_customer_sdkstats.py Added delayed imports to avoid circular dependencies
_state.py Added set_statsbeat_shutdown helper function
export/_base.py Updated statsbeat detection logic and added error handling
_utils.py Fixed thread safety issue in Singleton metaclass
__init__.py New module initialization file exposing public API
test_manager.py New comprehensive test file for StatsbeatManager functionality
test_metrics.py Removed obsolete tests and updated remaining tests
test_exporter.py Enhanced with new tests for recursive prevention and thread safety

Copy link

github-actions bot commented Aug 27, 2025

API Change Check

APIView identified API level changes in this PR and created the following API reviews

azure-monitor-opentelemetry-exporter

@@ -380,10 +380,13 @@ def _get_scope(aad_audience=None):

class Singleton(type):
_instance = None
_lock = threading.Lock()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will be leveraging the Singleton class in utils going forward in the future where we need to. For example, live metrics uses the same pattern.

return (
is_customer_sdkstats_enabled
and not get_customer_sdkstats_shutdown()
and not self._is_stats_exporter()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an important change, we do not want to collect cx stats on regular stats exporter.

@@ -477,7 +481,7 @@ def _is_statsbeat_initializing_state(self):
return self._is_stats_exporter() and not get_statsbeat_shutdown() and not get_statsbeat_initial_success()

def _is_stats_exporter(self):
return self.__class__.__name__ == "_StatsBeatExporter"
return getattr(self, "_is_sdkstats", False)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed _StatsBeatExporter class and use regular metrics exporter instead so we use a field on the exporter instead to determine whether it is a stats exporter.

@@ -75,13 +76,15 @@ class AzureMonitorMetricExporter(BaseExporter, MetricExporter):
"""Azure Monitor Metric exporter for OpenTelemetry."""

def __init__(self, **kwargs: Any) -> None:
self._is_sdkstats = kwargs.get("is_sdkstats", False)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We move these attribute settings BEFORE the parent __init__ calls because BaseExporter needs to know if these fields are set.

envelope = _convert_point_to_envelope(point, name, resource, scope)

# Apply statsbeat metric name mapping if this is a statsbeat exporter
final_metric_name = name
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is added since we got rid of _StatsBeatExporter

metric_reader_options = {
"exporter": self._customer_sdkstats_exporter,
"export_interval_millis": _get_customer_sdkstats_export_interval()
"export_interval_millis": _get_customer_sdkstats_export_interval() * 1000 # Default 15m
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a bug before since we want milliseconds and the env vars are defined in seconds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I actually realized this and fixed it in - https://github.com/Azure/azure-sdk-for-python/pull/42655, thanks for catching this. This refactor PR should be merged first, right? The other PRs that are up for race condition and exception categorization will have to be modified now a bit to accommodate these changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this should be merged first as it is quite intrusive. I've moved all tests to a different file as well. Apologies for the extra work.

if self._config and self._config == config:
return True
# If config is different, reconfigure
return self._reconfigure(config)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used currently but will be relevant once we have OneSettings change the config. Then we will be able to reconfigure and reinitialize stats during runtime if config changes.

def collect_statsbeat_metrics(exporter) -> None:
global _STATSBEAT_METRICS
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We remove all this logic into _manager.py, We leave the functions here for now to be backwards compatible.



def shutdown_statsbeat_metrics() -> None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@@ -117,8 +118,8 @@
# pylint: disable=line-too-long
_DEFAULT_NON_EU_STATS_CONNECTION_STRING = "InstrumentationKey=c4a29126-a7cb-47e5-b348-11414998b11e;IngestionEndpoint=https://westus-0.in.applicationinsights.azure.com/"
_DEFAULT_EU_STATS_CONNECTION_STRING = "InstrumentationKey=7dc56bab-3c0c-4e9f-9ebb-d1acadee8d0f;IngestionEndpoint=https://westeurope-5.in.applicationinsights.azure.com/"
_DEFAULT_STATS_SHORT_EXPORT_INTERVAL = 900 # 15 minutes
_DEFAULT_STATS_LONG_EXPORT_INTERVAL = 86400 # 24 hours
_DEFAULT_STATS_SHORT_EXPORT_INTERVAL = 15 * 60 # 15 minutes in s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change just to show how the conversion works?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Monitor - Exporter Monitor OpenTelemetry Exporter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants