[DRAFT]: Auto Detect Noisy Label #1682

Draft: wants to merge 6 commits into base: main

Yaxhveer
Contributor

@Yaxhveer Yaxhveer commented Mar 13, 2024

Related Issue

  • Info about Issue or bug

Describe the changes you've made

This is a draft for the task of finding noisy labels. We detect noisy fields in a response by simulating the request again in record mode and comparing the two responses. The -a flag enables auto noise detection in record mode, and the same -a flag enables checking against auto noise in replay mode (both default to false). This draft is only a prototype and requires much more refactoring; I just wanted to know whether this approach would work.

@Yaxhveer
Contributor Author

@charankamarapu @PranshuSrivastava Please review the draft and let me know what improvements and features could be added to it.

@charankamarapu
Member

Rather than the code, can you elaborate on how a user would use it and how the design works? That would give more context to me and @PranshuSrivastava.

@Yaxhveer
Contributor Author

Yaxhveer commented Mar 13, 2024

@charankamarapu @PranshuSrivastava The user passes the -a flag to enable auto noise detection. Whenever they make a request in record mode, Keploy takes the same request and simulates it again, then compares the two responses to identify the noisy labels. These labels are added to the corresponding test-set as autoNoise. In replay mode the user can pass -a again to enable the auto noise check, in which case Keploy adds the autoNoise labels to the other configured noise.
For example, given a handler that responds with:

c.JSON(http.StatusOK, gin.H{
	"ts":  time.Now().UnixNano(),
	"url": "http://localhost:8080/" + id,
	"yy": gin.H{
		"hh": rand.Intn(100),
		"kk": rand.Intn(100),
		"jj": gin.H{
			"array": []int{8, rand.Intn(100)},
			"ii": rand.Intn(100),
		},
	},
})

We would get autoNoise as
[Image: Screenshot from 2024-03-13 22-09-10]

For the above approach I had to reuse several methods of the test package, such as testHttp and replaceHostToIP.
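The comparison step described above can be illustrated with a small standalone sketch. This is not the code in this PR and does not use Keploy's testHttp; the names DetectNoise, collectNoise, and join, and the dotted path format, are hypothetical. It assumes both responses are JSON and treats any field whose value differs between the recorded and the simulated response as noisy.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"sort"
)

// join builds a dotted path such as "yy.jj.array" (hypothetical format).
func join(path, key string) string {
	if path == "" {
		return key
	}
	return path + "." + key
}

// collectNoise walks two decoded JSON values in parallel and records the
// dotted path of every field whose value differs between them.
// Objects are compared key by key; arrays and scalars are compared whole.
func collectNoise(path string, a, b interface{}, noisy map[string]bool) {
	am, aIsMap := a.(map[string]interface{})
	bm, bIsMap := b.(map[string]interface{})
	if aIsMap && bIsMap {
		for k, av := range am {
			collectNoise(join(path, k), av, bm[k], noisy)
		}
		// Keys that appear only in the second response are also noisy.
		for k := range bm {
			if _, ok := am[k]; !ok {
				noisy[join(path, k)] = true
			}
		}
		return
	}
	if !reflect.DeepEqual(a, b) {
		noisy[path] = true
	}
}

// DetectNoise returns the sorted paths that differ between two JSON bodies.
func DetectNoise(recorded, simulated []byte) ([]string, error) {
	var a, b interface{}
	if err := json.Unmarshal(recorded, &a); err != nil {
		return nil, fmt.Errorf("decode recorded body: %w", err)
	}
	if err := json.Unmarshal(simulated, &b); err != nil {
		return nil, fmt.Errorf("decode simulated body: %w", err)
	}
	noisy := map[string]bool{}
	collectNoise("", a, b, noisy)
	paths := make([]string, 0, len(noisy))
	for p := range noisy {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths, nil
}

func main() {
	recorded := []byte(`{"ts":1,"url":"http://localhost:8080/abc","yy":{"hh":5,"jj":{"array":[8,3],"ii":7}}}`)
	simulated := []byte(`{"ts":2,"url":"http://localhost:8080/abc","yy":{"hh":9,"jj":{"array":[8,4],"ii":7}}}`)
	paths, err := DetectNoise(recorded, simulated)
	if err != nil {
		panic(err)
	}
	fmt.Println(paths) // prints [ts yy.hh yy.jj.array]
}
```

A single extra request can miss a field that happens to produce the same value twice (here "ii" is 7 in both samples), so a field absent from the result is not guaranteed to be stable.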

@charankamarapu
Member

First, how are you going to simulate the call again in record mode? Are you going to call the real databases or use the mocks? And why do you need -a again in test mode? If you detect the noise params during recording, you can write them into the test file at record time itself, right?

@Yaxhveer
Contributor Author

Yaxhveer commented Mar 14, 2024

Currently the pkg.SimulateHttp() method, which test mode also uses, simulates the request. It hits the real database right now. I set the Keploy-Simulation: true header on the simulated request so that it can easily be differentiated from the actual request. We simulate the request and then do the comparison; I am also reusing the testHttp method of the test package to compare the responses.
Detecting autoNoise takes additional time (one more request plus the response comparison), which increases latency, so it is up to the user whether to detect auto noise in record mode.
AutoNoise is added as a completely separate label. It is up to the user whether to apply it, which is why I provided the -a flag in test mode.
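To make the header idea concrete, here is a minimal sketch using only the standard net/http package. It is not pkg.SimulateHttp(); the helper names MarkSimulated, IsSimulated, and newDemoRequest are hypothetical, and only the header name and value come from the description above.

```go
package main

import (
	"fmt"
	"net/http"
)

// simulationHeader is the marker header named in the comment above.
const simulationHeader = "Keploy-Simulation"

// MarkSimulated returns a copy of req that carries the simulation header.
// Request.Clone deep-copies the headers, so the original stays unmarked.
// The body is not duplicated by Clone; a real replay would need to
// re-create it from the recorded bytes.
func MarkSimulated(req *http.Request) *http.Request {
	clone := req.Clone(req.Context())
	clone.Header.Set(simulationHeader, "true")
	return clone
}

// IsSimulated reports whether a request was marked as a simulated replay.
func IsSimulated(req *http.Request) bool {
	return req.Header.Get(simulationHeader) == "true"
}

// newDemoRequest builds a body-less GET request for the demo below.
func newDemoRequest() *http.Request {
	req, err := http.NewRequest(http.MethodGet, "http://localhost:8080/abc", nil)
	if err != nil {
		panic(err)
	}
	return req
}

func main() {
	original := newDemoRequest()
	simulated := MarkSimulated(original)
	fmt.Println(IsSimulated(original), IsSimulated(simulated)) // prints false true
}
```

Whatever captures incoming traffic could call a check like IsSimulated to avoid recording the simulated call as a second test case.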

@Yaxhveer
Contributor Author

@charankamarapu @PranshuSrivastava Should I work further on this approach or think about a different implementation?

@charankamarapu
Member

When you send the simulation call with Keploy-Simulation: true, are you going to use the real database or the already recorded mocks? And if you detect the noise during recording and add it to the test case, why do you need -a in test mode? Noise ignoring is already present in test mode, right?

@Yaxhveer
Contributor Author

Yaxhveer commented Mar 17, 2024

@charankamarapu I am using the real database for simulation. I thought we were treating these noises as independent, which is why I added another flag. I will remove the -a flag in the main implementation.
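Without a separate flag in test mode, one option is to fold the detected fields into the test case's existing noise at record time, so the existing noise handling picks them up. A rough sketch, assuming noise is just a list of field paths; MergeNoise and the sample paths are hypothetical and do not reflect Keploy's actual test-case schema:

```go
package main

import (
	"fmt"
	"sort"
)

// MergeNoise returns the sorted union of user-configured and auto-detected
// noisy field paths. Neither input slice is modified.
func MergeNoise(manual, auto []string) []string {
	seen := make(map[string]struct{}, len(manual)+len(auto))
	for _, field := range manual {
		seen[field] = struct{}{}
	}
	for _, field := range auto {
		seen[field] = struct{}{}
	}
	merged := make([]string, 0, len(seen))
	for field := range seen {
		merged = append(merged, field)
	}
	sort.Strings(merged)
	return merged
}

func main() {
	manual := []string{"header.Date", "ts"}      // hypothetical user-configured noise
	auto := []string{"ts", "yy.hh", "yy.jj.ii"} // hypothetical detected noise
	fmt.Println(MergeNoise(manual, auto)) // prints [header.Date ts yy.hh yy.jj.ii]
}
```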

@charankamarapu
Member

So in record mode, after recording one test case and its related database calls, if you simulate the test case and make the real database calls again, the test will mostly fail. For example, take an employee manager project: I have recorded an insert of an employee. If I repeat the call against the real database, it will say the employee already exists, so the simulated call gets a completely different response, which will fail the test case, right?
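The employee example can be reproduced with a toy in-memory store standing in for the real database. Everything here (Store, HandleInsert, the status codes and messages) is hypothetical and only illustrates why repeating a non-idempotent call against real state yields a different response:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrExists is returned when an employee ID is already taken.
var ErrExists = errors.New("employee already exists")

// Store is a toy stand-in for the real database.
type Store struct {
	employees map[int]string
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{employees: make(map[int]string)}
}

// Insert adds an employee and fails if the ID is already present.
func (s *Store) Insert(id int, name string) error {
	if _, ok := s.employees[id]; ok {
		return ErrExists
	}
	s.employees[id] = name
	return nil
}

// HandleInsert mimics an HTTP handler: it returns a status code and a body.
func HandleInsert(s *Store, id int, name string) (int, string) {
	if err := s.Insert(id, name); err != nil {
		return 409, err.Error()
	}
	return 201, "created"
}

func main() {
	store := NewStore()
	// Recorded call: the insert succeeds.
	fmt.Println(HandleInsert(store, 1, "Asha")) // prints 201 created
	// Simulated repeat against the same state: the whole response changes.
	fmt.Println(HandleInsert(store, 1, "Asha")) // prints 409 employee already exists
}
```

A field-by-field comparison of these two responses would flag the status and the entire body as noise, which is why the replay needs to run against recorded mocks rather than live state.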

@Yaxhveer
Contributor Author

@charankamarapu Yeah, I didn't think of that case. Thanks for pointing it out; I will try to simulate the calls using the mock database.
