Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Oracle OpenTelemetry support #4078

Open
wants to merge 38 commits into
base: main
Choose a base branch
from
Open

Oracle OpenTelemetry support #4078

wants to merge 38 commits into from

Conversation

cmdkeen
Copy link
Contributor

@cmdkeen cmdkeen commented May 3, 2024

Closes #4072 by updating to the new ODP.Net version and adding OTel support via Oracle.ManagedDataAccess.OpenTelemetry. As mentioned in the issue this will need a new package and bumped version adding to the NuGet feeds.

Used the tracing settings taken from the Oracle sample to ensure that EF Core SQL text shows is provided to the traces.

Added an EndToEnd test using the Npgsql one as a base.

Microsoft Reviewers: Open in CodeFlow

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-components Issues pertaining to Aspire Component packages label May 3, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 3, 2024
@cmdkeen
Copy link
Contributor Author

cmdkeen commented May 7, 2024

@mitchdenny or @danmoseley are you able to take a look at this?

@mitchdenny
Copy link
Member

Adding @eerhardt for the Otel side of things.

@mitchdenny
Copy link
Member

Importing the Otel package to internal feed so we can build this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update the manifest.

Comment on lines 173 to 176
o.EnableConnectionLevelAttributes = true;
o.RecordException = true;
o.InstrumentOracleDataReaderRead = true;
o.SetDbStatementForText = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a silly question, but why aren't these set to true by default in AddOracleDataProviderInstrumentation? Are there other properties that should be set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One for the package authors as to why they're not the defaults. I did try originally with just an empty AddOracleDataProviderInstrumentation() constructor call but then didn't see any database calls being traced from the E2E playground. This is the same set of properties as used in the Oracle OTel samples.

Potentially it is because the Oracle OTel library isn't specifically aimed at EF so it might be the way that the Oracle EF package is setting the SQL statement, but hard to tell as it isn't OSS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know any of the authors? Can you tag them here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't see Alex Keh on here, have raised oracle/dotnet-db-samples#378 to ask

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eerhardt @cmdkeen Oracle ODP.NET sets these properties to false by default as a security precaution. We didn't want developers to incorporate the ODP.NET OpenTelemetry and then inadvertently publish sensitive DB info into the trace. We want to make sure there was an explicit intent to do so.

Are other DB OpenTelemetry providers enabling these settings by default? I forget whether SqlClient has the options class members automatically enabled (and I couldn't locate the API doc).

Copy link
Contributor Author

@cmdkeen cmdkeen May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - I'm no expert, from looking at the playground output the Postgres and MySQL ones do emit at least some of this data (SQL command, passwordless connection string and db info) by default whereas it looks like the SqlServer one does not.
From my perspective having that level of detail is incredibly helpful. I can see at least four options:

  1. Turning on this detail by default, mentioning in the docs
  2. Switch back to the library defaults, allow fine grained control of the instrumentation options via adding a OracleDataProviderInstrumentationOptions property to OracleEntityFrameworkCoreSettings
  3. Switch back to the library defaults, add a "DetailedTracing" property to OracleEntityFrameworkCoreSettings that sets these options with a suitable comment
  4. Switch back to the library defaults - not sure if you can change the options easily, otherwise users might need to disable tracing and manually add

I can't see any other examples of tracing configuration being passed through the constructors, and appreciate the extensions are trying to set up sensible defaults. My preference is for either 1 or 3 from above, but happy to take a steer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of .NET Aspire components is to codify best practices. So developers will fall into a "pit of success" with using the defaults.

I don't think we should be enabling any setting by default if there are security concerns with that setting being on.

@@ -66,7 +66,7 @@ protected override void SetHealthCheck(OracleEntityFrameworkCoreSettings options
=> options.DisableHealthChecks = !enabled;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests for the tracing. They need to be declared at each component level. For example:

[RequiresDockerFact]
public void TracingEnablesTheRightActivitySource()
=> RemoteExecutor.Invoke(static connectionStringToUse => RunWithConnectionString(connectionStringToUse, obj => obj.ActivitySourceTest(key: null)),
ConnectionString).Dispose();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see this will require adding the TestContainers.Oracle nuget package to the internal package feed as #3063 hasn't covered Oracle yet. Happy to give that a go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also seems that the current .Net Testcontainers is using the XE image, there isn't the equivalent of the Java one having the newer Oracle free module. It also locks some of the config behind private methods that would swapping out the image to support the same image as Hosting uses. A bit of pondering required on the best approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added and also enabled Testcontainers for Oracle as part of it. Turns out the base Testcontainers instance works fine.

@cmdkeen
Copy link
Contributor Author

cmdkeen commented May 8, 2024

Importing the Otel package to internal feed so we can build this.

@mitchdenny or @eerhardt are you able to add Testcontainers.Oracle to the internal feed as well to get this compiling again.

@mitchdenny
Copy link
Member

@mitchdenny or @eerharor @eerhardt are you able to add Testcontainers.Oracle to the internal feed as well to get this compiling again.

Importing that package now. Should be there in 30 mins.

@cmdkeen
Copy link
Contributor Author

cmdkeen commented May 10, 2024

@eerhardt @sebastienros would you be able to take a look with the failing tests? I've got them working on my machine, and made some progress on working out what's needed on the them running on the build agent (adding the Helix reference helped) but it looks like the ones that use Docker are still all failing.
I did adjust the TriggerActivity to try and be obvious about connection issues to the database, but it feels like this is something else.

@sebastienros
Copy link
Member

@radical The Helix tests are failing on things related to this PR, can you sport something wrong with how the new tests are written?

@radical
Copy link
Member

radical commented May 10, 2024

Looking at the test results the first test fails with:
Screenshot 2024-05-10 at 14 39 40

Oracle might be taking a longer time to start up, or failing at some point? I would try to add some waits to confirm that the db is up, and accessible. This might be helpful too.

Also, there is #3161 but it doesn't use testcontainers, and instead depends on the AppHost starting the container.

@eerhardt
Copy link
Member

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

{
"$schema": "https://json.schemastore.org/aspire-8.0.json",
"resources": {
"pg1": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look correct. Did you just copy-paste this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and you're right definitely not accurate. Do I need to generate one or can I just delete the file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regenerate it.

@eerhardt
Copy link
Member

eerhardt commented Jun 3, 2024

It looks like we aren't getting the database icon to show up in the dashboard:

image

compare that to SqlServerEndToEnd:
image

The reason appears to be because there is no peer.service attribute/tag on the span:

image

vs.

image

@JamesNK - is there a reason why the span has to have peer.service for the DB icon to show up? Can't it having db.system be enough?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-components Issues pertaining to Aspire Component packages community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Oracle OpenTelemetry
6 participants