aiohttp client tracing #791

Open
dazza-codes opened this issue Apr 6, 2020 · 4 comments

@dazza-codes
Contributor

It would help to expose aiohttp client tracing in some way.

@dazza-codes
Contributor Author

dazza-codes commented Apr 11, 2020

The only clue about how to add tracing or monitoring is in create_client, which can register a 'monitor' component that listens to client events.

# snippet from botocore.session.Session.create_client

        client_creator = botocore.client.ClientCreator(
            loader, endpoint_resolver, self.user_agent(), event_emitter,
            retryhandler, translate, response_parser_factory,
            exceptions_factory, config_store)
        client = client_creator.create_client(
            service_name=service_name, region_name=region_name,
            is_secure=use_ssl, endpoint_url=endpoint_url, verify=verify,
            credentials=credentials, scoped_config=self.get_scoped_config(),
            client_config=config, api_version=api_version)
        monitor = self._get_internal_component('monitor')
        if monitor is not None:
            monitor.register(client.meta.events)
        return client

The monitor seems to be created with

    def _create_csm_monitor(self):
        if self.get_config_variable('csm_enabled'):
            client_id = self.get_config_variable('csm_client_id')
            host = self.get_config_variable('csm_host')
            port = self.get_config_variable('csm_port')
            handler = monitoring.Monitor(
                adapter=monitoring.MonitorEventAdapter(),
                publisher=monitoring.SocketPublisher(
                    socket=socket.socket(socket.AF_INET, socket.SOCK_DGRAM),
                    host=host,
                    port=port,
                    serializer=monitoring.CSMSerializer(
                        csm_client_id=client_id)
                )
            )
            return handler
        return None

@thehesiod
Collaborator

actually there's an event you can trace:

        await self._event_emitter.emit(
            'creating-client-class.%s' % service_id,
            class_attributes=class_attributes,
            base_classes=bases)

could you state the goal you want with tracing the underlying aiohttp client?

@dazza-codes
Contributor Author

dazza-codes commented Apr 14, 2020

Essentially, it would be ideal to better understand and monitor how the clients use a connection pool, so that concurrent processes can be optimized for repeatable workloads or debugged for performance issues. Some relevant notes on the connection pool for botocore/urllib3 are in https://gitlab.com/dazza-codes/aio-aws/-/wikis/botocore-clients and similar notes for aiobotocore would be great, with additional notes on how to monitor and tune the connection pools. Hooking into the 'creating-client-class' event is not enough to trace the number of connections used or whether they are part of a connection pool.
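To make the goal concrete, here is a minimal sketch of the kind of connection-pool tracing meant here, written against a plain aiohttp.ClientSession. `aiohttp.TraceConfig` and its `on_connection_*` signals are part of aiohttp; the `PoolStats` class, `build_trace_config` and `fetch_with_tracing` are hypothetical names used only for illustration. How to attach such a trace config to the session that aiobotocore creates internally is exactly the open question in this issue, so this is not an aiobotocore API.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class PoolStats:
    """Counters updated by aiohttp trace signals (hypothetical helper)."""

    created: int = 0  # new connections opened by the connector
    reused: int = 0   # requests served by an idle pooled connection
    queued: int = 0   # requests that had to wait for a free connection

    # aiohttp invokes each handler as handler(session, trace_config_ctx, params).
    async def on_create(self, session, ctx, params):
        self.created += 1

    async def on_reuse(self, session, ctx, params):
        self.reused += 1

    async def on_queued(self, session, ctx, params):
        self.queued += 1


def build_trace_config(stats):
    """Wire the counters into an aiohttp.TraceConfig."""
    import aiohttp  # imported lazily so PoolStats works without aiohttp installed

    trace_config = aiohttp.TraceConfig()
    trace_config.on_connection_create_end.append(stats.on_create)
    trace_config.on_connection_reuseconn.append(stats.on_reuse)
    trace_config.on_connection_queued_start.append(stats.on_queued)
    return trace_config


async def fetch_with_tracing(url, stats):
    """Issue one GET through a traced session and return the HTTP status."""
    import aiohttp

    trace_configs = [build_trace_config(stats)]
    async with aiohttp.ClientSession(trace_configs=trace_configs) as session:
        async with session.get(url) as response:
            await response.read()
            return response.status
```

Comparing `created` against `reused` after a batch of requests gives a rough picture of whether connections are being pooled, and a non-zero `queued` suggests the pool limit is a bottleneck for that workload.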

Some of the terms in botocore (hence aiobotocore) are confusing. For example, a botocore.session.Session or aiobotocore.session.AioSession may not be equivalent to an aiohttp.ClientSession, and the former does not manage the latter at all. It helps to understand that it is a botocore.client.BaseClient (or the similar aiobotocore.client.AioBaseClient) that holds an endpoint with a client session that manages a connection pool. The confusion can arise from the use of Session in classes that don't actually manage a session connection pool. So, surfacing some documentation and APIs to monitor or trace connections and connection pools may help to clarify the system architecture.

The aiobotocore.config.AioConfig seems to be the thing that captures generic or default client configuration details. It could be the class that gains an option to add a trace to the clients created. Default tracing options could then be set on the AioSession so that they apply to new clients created by the session. (Of course, the inherent library defaults would not enable tracing on AioConfig.)

It would help to know whether MAX_POOL_CONNECTIONS for botocore also applies to aiobotocore and how it's implemented. By default, it seems that aiohttp does not limit the connection pool.
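For reference on the aiohttp side: `aiohttp.TCPConnector` takes a `limit` argument whose documented default is 100 simultaneous connections, with `limit=0` meaning no limit, and botocore defines `MAX_POOL_CONNECTIONS = 10`. The sketch below caps a plain aiohttp session at botocore's default. The helper names `connector_kwargs` and `make_session` are hypothetical, and whether aiobotocore forwards `max_pool_connections` to the connector in this way is an assumption that should be checked against the aiobotocore.endpoint source.

```python
# botocore's default pool size (botocore.endpoint.MAX_POOL_CONNECTIONS).
BOTOCORE_MAX_POOL_CONNECTIONS = 10


def connector_kwargs(max_pool_connections=BOTOCORE_MAX_POOL_CONNECTIONS):
    """Translate a botocore-style pool size into TCPConnector keyword arguments."""
    return {"limit": max_pool_connections}


async def make_session(max_pool_connections=BOTOCORE_MAX_POOL_CONNECTIONS):
    """Create an aiohttp session whose connector is capped at the given size.

    Must be awaited inside a running event loop; the caller closes the session.
    """
    import aiohttp  # imported lazily so connector_kwargs works without aiohttp

    connector = aiohttp.TCPConnector(**connector_kwargs(max_pool_connections))
    return aiohttp.ClientSession(connector=connector)
```

With a cap in place, requests beyond the limit wait for a free connection instead of opening new ones, which is the behavior the queued-connection trace signals in aiohttp can report.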

@dazza-codes
Contributor Author

Most of the answers to these details are in the aiobotocore.endpoint module.

Looking at AioEndpointCreator, there seems to be no way to pass any tracing through from AioConfig, because the signature of create_endpoint does not allow it and it should remain consistent with botocore. Could it be added to connector_args instead?

There is also an interesting botocore.endpoint.history_recorder that is used to track all the request/response details. However, it doesn't provide any details on how the connection pool is used or optimized.
