Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚀 Feature: Make Instrumentations Robust to How Users Call Functions (args/kwargs) #539

Open
1 task done
paolorechia opened this issue Feb 27, 2024 · 0 comments
Open
1 task done
Labels
enhancement New feature or request

Comments

@paolorechia
Copy link
Contributor

Which component is this feature for?

All Packages

🔖 Feature description

I've noticed some instrumentations only check for the kwargs passed by the method. For instance, in ChromaDB:

def _set_add_attributes(span, kwargs):
    _set_span_attribute(span, "db.chroma.add.ids_count", count_or_none(kwargs.get("ids")))
    _set_span_attribute(span, "db.chroma.add.embeddings_count", count_or_none(kwargs.get("embeddings")))
    _set_span_attribute(span, "db.chroma.add.metadatas_count", count_or_none(kwargs.get("metadatas")))
    _set_span_attribute(span, "db.chroma.add.documents_count", count_or_none(kwargs.get("documents")))

This has the implication that depending on the user calls the function, he might miss some attributes from being added to the trace.

For instance:

chromadb.some_function("1", "books")

Would not instrument the arguments, while calling:

chromadb.some_function(id="1", collection="books")

Would successfully include attributes id and collection in the trace.

🎤 Why is this feature needed ?

Make instrumentations more robust / reliable / predictable. We don't want the behavior to change depending on the user calls a method.

✌️ How do you aim to achieve this?

We should make this is baked-in in the instrumentation tooling, so that this always handled correctly. We need some code design proposal done and experimented with. This is what I've implemented in weaviate instrumentation to avoid the issue and could serve as a starting point:

class ArgsGetter:
    """Helper to make sure we get arguments regardless
    of whether they were passed as args or as kwargs.
    Additionally, cast serializes dicts to JSON string.
    """

    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs

    def __call__(self, index, name):
        try:
            obj = self.args[index]
        except IndexError:
            obj = self.kwargs.get(name)

        if obj:
            try:
                return json.dumps(obj)
            except json.decoder.JSONDecodeError:
                logger.warning(
                    "Failed to decode argument (%s) (%s) to JSON", index, name
                )


class _Instrumentor:
    def map_attributes(self, span, method_name, attributes, args, kwargs):
        getter = ArgsGetter(args, kwargs)
        for idx, attribute in enumerate(attributes):
            _set_span_attribute(
                span,
                f"{self.namespace}.{method_name}.{attribute}",
                getter(idx, attribute),
            )

    def instrument(self, method_name, span, args, kwargs):
        attributes = self.mapped_attributes.get(method_name)
        if attributes:
            self.map_attributes(span, method_name, attributes, args, kwargs)

🔄️ Additional Information

Note, this could also be labeled as a bug, but I think the goal here is not to simple fix cases where it might misbehave. It's to prevent it from even happening again in the future.

👀 Have you spent some time to check if this feature request has been raised before?

  • I checked and didn't find similar issue

Are you willing to submit PR?

Yes I am willing to submit a PR!

@paolorechia paolorechia added the enhancement New feature or request label Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant