How to reduce noise ? #6

Open
ghost opened this issue Aug 2, 2016 · 15 comments

@ghost

ghost commented Aug 2, 2016

Currently, the landmarks seem noisy and jump around a lot (in a clear environment with sufficient light). The output is much more stable when compiling dlib for OSX. Is there something we can do to improve the noise level on iOS?

Many thanks for your awesome work!

@zweigraf
Owner

zweigraf commented Aug 2, 2016

Thanks for your interest in this project.

I know what you mean about it being noisy and jumpy. Unfortunately, I have not found a solution for this problem yet.

What do you mean by OSX? Did you run the same project on OSX?

I have one piece of advice that could work, but I have not tested it: plot the rectangles that iOS gives you onto the camera image as well. They may be a little too small for dlib to find the face in. You could try padding the face rectangles by 5-50px on each side and test whether that works better.
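For illustration, the padding could look something like this minimal sketch (the `Rect` struct and `padRect` helper are hypothetical stand-ins for `dlib::rectangle`, not code from this project):

```cpp
#include <algorithm>
#include <cassert>

// Simple rect in pixel coordinates (a stand-in for dlib::rectangle).
struct Rect { long left, top, right, bottom; };

// Pad a face rectangle by `pad` pixels on each side, clamped to the
// image bounds so dlib never receives out-of-range coordinates.
Rect padRect(const Rect& r, long pad, long imgW, long imgH) {
    return Rect{
        std::max(0L, r.left - pad),
        std::max(0L, r.top - pad),
        std::min(imgW - 1, r.right + pad),
        std::min(imgH - 1, r.bottom + pad)
    };
}
```

The clamping matters near the frame edges; without it the padded rectangle can extend outside the buffer.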

@ghost
Author

ghost commented Aug 2, 2016

Thanks for the quick update. Good idea! I'll give it a try and let you know.

Regarding OSX, I built from source for OSX per Satya's tutorial.

@ghost
Author

ghost commented Aug 5, 2016

Could you please advise on the best place to do that (increasing the face rectangle)?

I did some digging, and it seems
(dlib::rectangle)convertScaleCGRect:(CGRect)rect toDlibRectacleWithImageSize:(CGSize)size does the conversion from the AVMetadataFaceObject bounds to a dlib rectangle, but increasing the size and origin in that method won't fix it; instead it breaks face recognition.

I even tried another approach with transformedMetadataObjectForMetadataObject, passing in the actual rect in the view instead of the scalar values provided by AVMetadataFaceObject, but that didn't help either.
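For reference, the basic scaling such a conversion performs can be sketched like this (the `NormRect`/`toPixels` names are illustrative, and this ignores the rotation and mirroring that transformedMetadataObjectForMetadataObject would normally account for):

```cpp
#include <cassert>

struct NormRect { double x, y, w, h; };   // AVMetadataObject-style bounds in [0, 1]
struct PixRect  { long left, top, right, bottom; };

// Scale a normalized metadata rect to pixel coordinates for a given
// frame size. AVMetadataFaceObject reports bounds as fractions of the
// frame, so any padding has to be applied after this scaling step.
PixRect toPixels(const NormRect& n, long imgW, long imgH) {
    return PixRect{
        static_cast<long>(n.x * imgW),
        static_cast<long>(n.y * imgH),
        static_cast<long>((n.x + n.w) * imgW),
        static_cast<long>((n.y + n.h) * imgH)
    };
}
```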

At this point I am certain of one thing: the face rect returned by AVMetadataFaceObject is way too small, so your suspicion is correct.

I am also considering switching to CIDetector to see if I can get a better value for the face rect.

Any thoughts?

@ghost
Author

ghost commented Aug 9, 2016

Bump.
Did anyone figure out how to improve the noise level? I don't think I'm the only one with this issue.

Also, I increased the detected rectangle, but that didn't help.

@stanchiang

I'm working with a portrait-orientation version of this app, and I think it's less stable because we initially pass the DlibWrapper function an image with smaller dimensions than the one we output. At a smaller scale the jumpiness wouldn't be so conspicuous, but by ultimately scaling up the image we amplify the noise.

@zweigraf could that be it, at least for my issues?
If so, I guess I should try passing in a properly scaled image before sending it off to dlib.

@faoiseamh

faoiseamh commented Apr 20, 2018

I also see significant jitter/noise on iOS vs. OSX. I've been rendering the face detection box to debug, and the noise and jitter are present even when the face detection remains constant and everything in the frame is motionless. My OSX webcam is lower quality at about the same resolution. Did anyone ever resolve this issue?

I'm also using my own iOS project (I just ended up here from Googling); however, I also built this one and see similar behavior.

@faoiseamh

faoiseamh commented Apr 20, 2018

@mosn / For anyone else who lands here from Googling:

I was able to figure out the issue on my end: it was the result of reading the sample buffer as if it were BGR pixels when in fact it was BGRA. I fixed it by reading the buffer as BGRA into a CV_8UC4 mat and then converting the mat to BGR (because assertions in dlib's methods reject mats with an alpha channel).
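A minimal sketch of that channel fix, assuming a tightly packed BGRA buffer (the `bgraToBgr` helper is illustrative; with OpenCV, `cv::cvtColor` with `cv::COLOR_BGRA2BGR` does the equivalent conversion on a CV_8UC4 mat):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert BGRA pixels to tightly packed BGR, dropping alpha.
// Reading BGRA data as if it were 3-byte BGR shifts every subsequent
// pixel's channels, which shows up as noise in downstream detection.
std::vector<uint8_t> bgraToBgr(const uint8_t* src, size_t pixels) {
    std::vector<uint8_t> dst;
    dst.reserve(pixels * 3);
    for (size_t i = 0; i < pixels; ++i) {
        dst.push_back(src[i * 4 + 0]);  // B
        dst.push_back(src[i * 4 + 1]);  // G
        dst.push_back(src[i * 4 + 2]);  // R
    }
    return dst;
}
```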

@Cloov

Cloov commented Apr 25, 2018

@faoiseamh I think I'm experiencing the same issue. I used two iPads, both stationary, with one playing a video on the other (instead of shakily holding a camera up to my own face), and the amount of jitter was the same.

I'm not using a cv::Mat at the point where I read from the AVCaptureSession sample buffer; I just step through the buffer 4 elements at a time, ignore the fourth value each time, and create a dlib::bgr_pixel from each set of 3 values.

Does that mean I need to introduce a step where I create a mat from this sample buffer, convert it, and then step through it the same way I do with the original buffer described above? I'm not sure how I'd do that yet, but it would be good to know whether this sounds like the exact process you followed for a reliable fix!
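One detail worth checking in that per-pixel approach: CVPixelBuffer rows are often padded, so the walk has to use the real bytes-per-row value (from CVPixelBufferGetBytesPerRow) rather than assuming width * 4. A sketch, with `BgrPixel` as an illustrative stand-in for `dlib::bgr_pixel`:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct BgrPixel { uint8_t b, g, r; };  // stand-in for dlib::bgr_pixel

// Walk a BGRA buffer row by row using its real stride. If bytesPerRow
// is larger than width * 4 and the buffer is iterated as one flat
// array, the padding bytes smear pixels across row boundaries.
std::vector<BgrPixel> readBgra(const uint8_t* base, size_t width,
                               size_t height, size_t bytesPerRow) {
    std::vector<BgrPixel> out;
    out.reserve(width * height);
    for (size_t y = 0; y < height; ++y) {
        const uint8_t* row = base + y * bytesPerRow;
        for (size_t x = 0; x < width; ++x)
            out.push_back(BgrPixel{row[x * 4], row[x * 4 + 1], row[x * 4 + 2]});
    }
    return out;
}
```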

@faoiseamh

@Cloov If you're building the dlib data structure directly, that should be fine. The format of the buffer depends on the pixel format you've set for the AV capture session. If it's BGRA then your logic is right, but it may be one of several more exotic formats. I would first validate the format you set. I'd also try rendering the dlib data structure back to a UIImage or something you can view in the debugger, to validate that it's actually what you expect. I was able to spot my bug easily once I rendered a frame.

I still have more jitter on the iPad I'm testing on vs. the same code on desktop, and I'm tracking down the sources of it. I'll update you if I resolve it. The fix I described above significantly improved the jitter, though.

@faoiseamh

I gave the native iOS face detection a try as a faster alternative to dlib's frontal face detector. The results are significantly faster, but the resulting face bounding box is not as good an input for dlib's landmarking. It seems to be generally smaller, and as a result the landmarking gets confused if the head is at even relatively small angles. This is probably a large source of the "noise" everyone is experiencing.

@Cloov

Cloov commented May 1, 2018

I've been using the native iOS face detection, but it was already in the project I began with. I tried simply expanding those boxes by 5, 20, and then 50 pixels on each side (as a temporary measure), since the position of the face detection was fine; it just seemed that the rectangles tightly highlighted the facial features without encompassing anything more.

However, that didn't improve the jitter for me. If I look at videos online of the same 68-point landmark detection, there is some jitter in similar places; for me, it's mostly around the chin area, on all types of face. Mouth movements don't track very well either.

I know you can retrain these models or create your own, but since the file I'm using is so widely used, I don't think that's the solution.

@faoiseamh, I may still try skipping the native face detection and using dlib's, as I have no other ideas at the moment! I did inspect the camera data as it goes through dlib and the pixel formats look fine; besides, I now ensure I'm using the iOS camera's BGRA format.

@faoiseamh

@Cloov Face detection that is too large is a problem as well. It seems the landmarking relies on a fairly precise detection area: if it's too large, the edges of the face can jump around, so the chin and jaw line are problematic. My current iteration uses the iOS native face detection as a starting point, expands that region by 25% in all directions, downsizes the resulting region, and runs dlib's face detection on the downsized area. That's the best balance of performance and precision I've found. To reduce noise I'm also using a simple moving average of the output. You could enhance this with outlier rejection using statistical noise-reduction techniques, but this was sufficient for my needs.
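A simple moving average like the one described might be sketched as follows (illustrative, not this project's code; it would be applied per landmark coordinate, and the window size is a tuning choice that trades lag for smoothness):

```cpp
#include <cstddef>
#include <deque>

// Fixed-window moving average over one landmark coordinate. Applying
// this to the x and y of each of the 68 points trades a few frames of
// lag for much less frame-to-frame jitter.
class MovingAverage {
    std::deque<double> window_;
    std::size_t size_;
    double sum_ = 0.0;
public:
    explicit MovingAverage(std::size_t size) : size_(size) {}
    double push(double v) {
        window_.push_back(v);
        sum_ += v;
        if (window_.size() > size_) {
            sum_ -= window_.front();
            window_.pop_front();
        }
        return sum_ / window_.size();
    }
};
```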

There are a variety of other facial landmarking algorithms out there, and iOS 11 also has a native landmarking feature in Vision.framework, so if I need to improve this further I'm going to try other libraries. I think I've squeezed as much performance out of dlib as I can. My main additional need is more robust performance at extreme angles (profile faces), which is not intended functionality for dlib.

@Cloov

Cloov commented May 16, 2018

@faoiseamh I think a lot of the more extreme noise I was experiencing was down to reflections! I was often pointing at 3D models on my screen. I think I also need to average out changes in my pose transform, because even when detection is going well, a quite-still target face causes a lot of shaking. Do you have any advice on averaging this motion out (techniques or algorithms)? Are they in OpenCV/dlib?
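For anyone else wondering, one common lightweight smoothing option is an exponential moving average applied to each pose-transform component or landmark coordinate. This sketch is illustrative, not from this project:

```cpp
// Exponential moving average: value = alpha*raw + (1-alpha)*value.
// Lower alpha means stronger smoothing but more lag. Unlike a
// windowed average, it needs no history buffer, which makes it cheap
// to run on all 68 landmarks plus the pose transform per frame.
struct Ema {
    double alpha;
    double value = 0.0;
    bool started = false;
    double push(double raw) {
        value = started ? alpha * raw + (1.0 - alpha) * value : raw;
        started = true;
        return value;
    }
};
```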

@faoiseamh

faoiseamh commented May 16, 2018 via email

@RubenBenBen

RubenBenBen commented Jun 1, 2018

@Cloov @faoiseamh Hi guys, I ended up here from Googling too. @faoiseamh I'm not sure your suggestion about changing BGRA to BGR makes a difference, because if you print the alpha components of your BGRA sampleBuffer you can see that alpha is always \xff, i.e. 100%. So it should basically be the same as just ignoring the alpha component, as far as I understand. I use this exact project, and while testing on my iPhone 4 I too experience constant jumping/noise around the eyes. I wonder if it can be improved somehow?
