Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Throwing error messages with field. #3239

Merged
merged 12 commits into from
Jul 11, 2023

Conversation

aimuz
Copy link
Collaborator

@aimuz aimuz commented Jun 29, 2023

Path and field.Error will make the error message update accurately

Signed-off-by: aimuz [email protected]

What type of PR is this?

Uncomment only one /kind <> line, press enter to put that in a new line, and remove leading whitespace from that line:

/kind breaking
/kind bug
/kind cleanup
/kind documentation
/kind feature
/kind hotfix

/kind breaking

What this PR does / Why we need it:

Which issue(s) this PR fixes:

Closes #

Special notes for your reviewer:

This PR uses field.ErrorList to record error messages. This will make the errors thrown more standard and easier to locate

@github-actions github-actions bot added the kind/breaking Breaking change label Jun 29, 2023
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: f1435361-dd62-4200-bc19-27ccc3842d23

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@aimuz aimuz changed the title refactor: Throwing error messages with field. WIP: refactor: Throwing error messages with field. Jun 29, 2023
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: d2a43874-b71c-4bdf-8167-088e1e325644

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@aimuz
Copy link
Collaborator Author

aimuz commented Jun 29, 2023

ref #3103

Related to this, in order to describe more information in Message without having to perform type conversion

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: fe92ab53-a0aa-4693-a268-2dcef8f29cab

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 1735c2d3-1c58-44ca-8bd9-216e9edff77f

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@aimuz
Copy link
Collaborator Author

aimuz commented Jun 29, 2023

@markmandel
Copy link
Member

Unable to see detailed logs

Make sure you are using the correct google account when viewing the page.

@markmandel
Copy link
Member

markmandel commented Jun 29, 2023

Thanks for all this work, but what's not clear to me is why this is an improvement?

What does this actually do?

Is this a consolidation of code? A pure refactor? An improvement on user experience? Something else?

If you could write a few sentences on the goal etc of what the aims are that would be super useful, and also aid in reviewing it as well. Does it link back to an existing issue? Whatever resources you can point to would also help.

@aimuz
Copy link
Collaborator Author

aimuz commented Jun 30, 2023

What does this actually do?

There are several purposes for using the same validation as k8s. It allows for reuse of existing code and reduces development time.

The error messages thrown by this implementation are more detailed compared to the current one. For example, when creating a fleet with incorrect parameters, the current message is simply "Fleet configuration is invalid". This hides critical information that could be useful for debugging.

To address this issue, the proposed changes include implementing the Error() method in the StatusError struct of the errors.go file from the k8s.io/apimachinery/pkg/api/errors package. This method returns the error message stored in the ErrStatus.Message field.

By adopting this approach, it becomes easier to retrieve detailed error information through err.Error(), without requiring additional type assertions.

Purpose of the PR

The motivation behind this pull request is to integrate Agones into a PaaS platform. However, during the integration process, there were errors occurring without clear descriptions, making it difficult to understand the root cause.

Currently, the program checks if the error is empty and then directly outputs the result of Error(). However, this method doesn't provide sufficient details about the error.

Through this PR, the aim is to make error handling more standardized and improve the clarity of error messages. The proposed changes in errors.go enable the creation of an Invalid error, which indicates that the item being processed is invalid. This error includes a list of causes, each containing information such as the cause type, message, and field.

For example, the introduction of this PR enables the following error message:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "Fleet.agones.dev \"AAsimple-game-server\" is invalid: metadata.name: Invalid value: \"AAsimple-game-server\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')",
  "reason": "Invalid",
  "details": {
    "name": "AAsimple-game-server",
    "group": "agones.dev",
    "kind": "Fleet",
    "causes": [
      {
        "reason": "FieldValueInvalid",
        "message": "Invalid value: \"AAsimple-game-server\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')",
        "field": "metadata.name"
      }
    ]
  },
  "code": 422
}

In this example, the error message clearly states that the metadata.name field has an invalid value according to the specified criteria.

Similarly, when the allocationOverflow field is not supported, the error message would be:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "admission webhook \"validations.agones.dev\" denied the request: Fleet configuration is invalid",
  "reason": "Invalid",
  "details": {
    "name": "simple-game-server",
    "group": "agones.dev",
    "kind": "Fleet",
    "causes": [
      {
        "reason": "FieldValueNotSupported",
        "message": "Value cannot be set unless feature flag FleetAllocationOverflow is enabled",
        "field": "allocationOverflow"
      }
    ]
  },
  "code": 400
}

As you can see, the error messages generated by kube-apiserver provide more detailed information. Although additional details are available in the details field, accessing them requires an extra layer of logic and type conversion.

The proposed changes aim to make errors more explicit and easier to understand.

@aimuz
Copy link
Collaborator Author

aimuz commented Jun 30, 2023

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 6, 2023

@markmandel Do you have time to help me with the question of why I can't see the logs?

@markmandel
Copy link
Member

Just to be 100% clear on the logs -- you can see the logs linked from this PR, but I'm guessing not the ones that are the e2e for each version of a gke cluster? (sub-builds)?

@markmandel
Copy link
Member

The error messages thrown by this implementation are more detailed compared to the current one. For example, when creating a fleet with incorrect parameters, the current message is simply "Fleet configuration is invalid". This hides critical information that could be useful for debugging.

Thank you very much for the detailed description -- this makes things much more clear, and this is sounding like a really good change.

For example, the introduction of this PR enables the following error message:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "Fleet.agones.dev \"AAsimple-game-server\" is invalid: metadata.name: Invalid value: \"AAsimple-game-server\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')",
  "reason": "Invalid",
  "details": {
    "name": "AAsimple-game-server",
    "group": "agones.dev",
    "kind": "Fleet",
    "causes": [
      {
        "reason": "FieldValueInvalid",
        "message": "Invalid value: \"AAsimple-game-server\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')",
        "field": "metadata.name"
      }
    ]
  },
  "code": 422
}

In this example, the error message clearly states that the metadata.name field has an invalid value according to the specified criteria.

Just because my memory is no good 😄 what sort of output did you see before this change?

Also, does this change improve things when kubectl fails?

Since this is a bigger change @zmerlynn - would you care to also pass your eyes over this, just for review?

return causes
func validateObjectMeta(objMeta *metav1.ObjectMeta, fldPath *field.Path) field.ErrorList {
var allErrs field.ErrorList
allErrs = append(allErrs, metav1validation.ValidateLabels(objMeta.Labels, fldPath.Child("labels"))...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this now, what you propose makes lots of sense. Where before we would take the errors and convert them back into structs -- when we never had to do that.

Nice find.

if productCauses := apiHooks.ValidateGameServerSpec(gss); len(productCauses) > 0 {
causes = append(causes, productCauses...)
for _, cause := range apiHooks.ValidateGameServerSpec(gss) {
// TODO: Use field.ErrorList
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious - any reason this is a TODO without being implemented yet? Or were you waiting on some review before implementing this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APIHOOK is an interface, and I plan to modify the implementation of the interface in the next phase to avoid making too many changes in this PR.

If you think you can do it together in this PR, I will do the work together.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it - that's fine - just wanted to check 👍🏻

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a workflow nit - I don't see it much in this project, but consider using something like TODO(#xyz) where xyz is an issue number. Then you can search for it easily to ensure you've accounted for all the TODOs when you're working on a long arc of PRs.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 67f161aa-2d1e-451f-b4c8-0a51b8676e0a

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Member

Looking through the e2e tests (while you can't see them), here are some errors:

    gameserver_test.go:930: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/gameserver_test.go:930
        	Error:      	Not equal: 
        	            	expected: "container"
        	            	actual  : "spec.template.spec.containers[0].resources.requests.memory"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-container
        	            	+spec.template.spec.containers[0].resources.requests.memory
        	Test:       	TestGameServerResourceValidation

  gameserver_test.go:939: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/gameserver_test.go:939
        	Error:      	Not equal: 
        	            	expected: "container"
        	            	actual  : "spec.template.spec.containers[0].resources.requests.cpu"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-container
        	            	+spec.template.spec.containers[0].resources.requests.cpu
        	Test:       	TestGameServerResourceValidation

   fleet_test.go:1428: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/fleet_test.go:1428
        	Error:      	Not equal: 
        	            	expected: "container"
        	            	actual  : "spec.template.spec.template.spec.containers[1].resources.requests.memory"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-container
        	            	+spec.template.spec.template.spec.containers[1].resources.requests.memory
        	Test:       	TestFleetResourceValidation

  fleet_test.go:1438: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/fleet_test.go:1438
        	Error:      	Not equal: 
        	            	expected: "container"
        	            	actual  : "spec.template.spec.template.spec.containers[0].resources.requests.cpu"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-container
        	            	+spec.template.spec.template.spec.containers[0].resources.requests.cpu
        	Test:       	TestFleetResourceValidation
    fleet_test.go:1632: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/fleet_test.go:1632
        	            				/go/src/agones.dev/agones/test/e2e/fleet_test.go:1440
        	Error:      	Should be true
        	Test:       	TestFleetResourceValidation
        	Messages:   	Was not able to find 'Request must be less than or equal to cpu limit'
    fleet_test.go:1632: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/fleet_test.go:1632
        	            				/go/src/agones.dev/agones/test/e2e/fleet_test.go:1441
        	Error:      	Should be true
        	Test:       	TestFleetResourceValidation
        	Messages:   	Was not able to find 'Resource cpu limit value must be non negative'
    fleet_test.go:1632: 
        	Error Trace:	/go/src/agones.dev/agones/test/e2e/fleet_test.go:1632
        	            				/go/src/agones.dev/agones/test/e2e/fleet_test.go:1442
        	Error:      	Should be true
        	Test:       	TestFleetResourceValidation
        	Messages:   	Was not able to find 'Request must be less than or equal to memory limit'

That should keep you going 😄

@markmandel
Copy link
Member

markmandel commented Jul 7, 2023

#3239 should fix the access issue for future builds once merged. Thanks for highlighting the issue 👍🏻

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 7, 2023

That should keep you going 😄

Thanks, I will continue to complete this work

Path and field.Error will make the error message update accurately

Signed-off-by: aimuz <[email protected]>
@aimuz aimuz requested a review from zmerlynn July 8, 2023 02:43
Signed-off-by: aimuz <[email protected]>
if productCauses := apiHooks.ValidateGameServerSpec(gss); len(productCauses) > 0 {
causes = append(causes, productCauses...)
for _, cause := range apiHooks.ValidateGameServerSpec(gss) {
// TODO(#3239): Use field.ErrorList
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: bc3cd867-baa7-4892-b928-837d9fee4280

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3239/head:pr_3239 && git checkout pr_3239
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.34.0-dev-02eff83-amd64

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 8, 2023

Just because my memory is no good 😄 what sort of output did you see before this change?

@markmandel I'm sorry I ignored this message

This is the content before modification

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "admission webhook \"validations.agones.dev\" denied the request: GameServer configuration is invalid",
  "reason": "Invalid",
  "details": {
    "name": "simple-game-server-ghnf7",
    "group": "agones.dev",
    "kind": "GameServer",
    "causes": [
      {
        "reason": "FieldValueInvalid",
        "message": "ContainerPort cannot be specified with Passthrough PortPolicy",
        "field": "default.containerPort"
      }
    ]
  },
  "code": 400
}

Changes to this PR

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "admission webhook \"validations.agones.dev\" denied the request: GameServer.agones.dev \"simple-game-server-x8w4m\" is invalid: spec.ports[0].containerPort: Required value: ContainerPort cannot be specified with Passthrough PortPolicy",
  "reason": "Invalid",
  "details": {
    "name": "simple-game-server-x8w4m",
    "group": "agones.dev",
    "kind": "GameServer",
    "causes": [
      {
        "reason": "FieldValueRequired",
        "message": "Required value: ContainerPort cannot be specified with Passthrough PortPolicy",
        "field": "spec.ports[0].containerPort"
      }
    ]
  },
  "code": 422
}

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 10, 2023

/assign @cyriltovena

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 10, 2023

/assign @markmandel

Copy link
Member

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the error message diff - look great!

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aimuz, markmandel, zmerlynn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@markmandel markmandel enabled auto-merge (squash) July 10, 2023 23:16
@google-oss-prow
Copy link

New changes are detected. LGTM label has been removed.

@google-oss-prow google-oss-prow bot removed the lgtm label Jul 10, 2023
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 2c297dc0-a716-40a7-bb73-91cbf6a51a61

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Member

Oops, got caught when we were recreating the e2e clusters. Let's hit retry when complete.

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 11, 2023

Do I need to do something?

@markmandel
Copy link
Member

Nah, we've got it.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: bdd94e98-b4ae-4cc1-af4b-d08cd449d932

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@aimuz
Copy link
Collaborator Author

aimuz commented Jul 11, 2023

https://console.cloud.google.com/cloud-build/builds/bc4ae54d-10b7-4274-b9ee-f91dc15e282d?project=agones-images

It looks like the timeout is causing the error, do I need to resubmit to trigger the retest?

@gongmax
Copy link
Collaborator

gongmax commented Jul 11, 2023

kicked off another retry

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: e4426dfa-2167-4ca0-8e71-2aa1e784d481

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3239/head:pr_3239 && git checkout pr_3239
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.34.0-dev-ed2a38b-amd64

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants