
Some doubts after testing Champ #72

Open
ZZfive opened this issue Apr 16, 2024 · 8 comments

ZZfive commented Apr 16, 2024

Without data preprocessing, I used a random picture as ref_image and the provided motion_6 for inference. The result is shown below. The consistency of the character's movements is very good, but the character's face is badly damaged. This is probably because, without preprocessing, the human body in ref_image and the figure in the motion guidance are not aligned.

grid_wguidance.mp4

Because the paper mentions that Champ was evaluated on the UBC Fashion dataset, I selected the following video from that dataset as the guidance motion in order to test the data preprocessing pipeline.

91D23ZVV6NS.mp4

Following the data preprocessing doc, after completing the environment setup, I was able to extract the required depth, normal, semantic_map, and dwpose features from the motion guidance video. However, I ran into one problem: the semantic_map output was missing two frames for some reason. Have you encountered this during data preprocessing? Since the 14-second motion guidance video has 422 frames in total, the difference between adjacent frames is small, so for the two missing semantic_map frames I simply copied the previous frame as a substitute.
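A minimal sketch of the frame-filling workaround described above, assuming frames are numbered files like 0000.png (the naming pattern and directory layout are assumptions, not necessarily the preprocessing script's actual output):

```python
import shutil
from pathlib import Path

def fill_missing_frames(frame_dir, total_frames, pattern="{:04d}.png"):
    """Fill gaps in a numbered frame sequence (e.g. semantic_map frames
    dropped during preprocessing) by copying the nearest previous frame."""
    frame_dir = Path(frame_dir)
    for i in range(total_frames):
        target = frame_dir / pattern.format(i)
        if target.exists():
            continue
        # walk backwards to the closest existing frame and duplicate it
        for j in range(i - 1, -1, -1):
            source = frame_dir / pattern.format(j)
            if source.exists():
                shutil.copy(source, target)
                break
```

Duplicating a neighbouring frame is only acceptable because adjacent frames differ very little; for larger gaps, interpolating between the surrounding frames would be safer.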

In the figure below, the left side is the first frame of the guidance motion video (960×1254), and the right side is the reference image (451×677). The middle is the depth map of the first frame after data preprocessing; you can see that it has been resized to 451×677 and the body parts are much better aligned.
[image: guidance motion frame (left), aligned depth map (middle), reference image (right)]

However, running inference with the data preprocessed from the above reference image and guidance motion video gives very poor results, as shown below. There is a lot of jitter in the video, and the characters' faces and bodies are severely distorted.

animation.mp4

Can somebody tell me the reason for the poor performance, or offer some suggestions for improvement? Thanks.

@zhou-linpeng


Can you show your grid_wguidance.mp4 results? I think flickering in your condition maps is causing results like this.


ZZfive commented Apr 16, 2024

> Can you show your grid_wguidance.mp4 results? I think flickering in your condition maps is causing results like this.

I just discovered that the reference image I used was an RGBA image, which caused a size-mismatch error when saving grid_wguidance.mp4; that is why I could not provide it above. After converting the reference image to RGB, I got the grid_wguidance.mp4 shown below. As you guessed, it has serious flickering. I followed the doc for the data preprocessing steps; what problems could cause the flickering in the condition maps?

grid_wguidance.mp4
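For what it's worth, the RGBA pitfall can be guarded against with a small check before inference. This is an illustrative sketch (the `ensure_rgb` helper and file names are hypothetical, not part of Champ):

```python
from PIL import Image

def ensure_rgb(path_in, path_out):
    """Convert a (possibly RGBA or palette) reference image to 3-channel RGB.
    An RGBA image carries a 4th alpha channel, which can trigger shape
    mismatches when frames are stacked into the output video grid."""
    img = Image.open(path_in)
    if img.mode != "RGB":
        img = img.convert("RGB")
    img.save(path_out)
    return img.mode
```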

zhanghongyong123456 commented Apr 16, 2024


My results also flicker badly. Here is my grid_wguidance.mp4:

grid_wguidance.mp4


faiimea commented Apr 16, 2024

I followed every step of the data_process pipeline, and both the background flicker and the facial distortion appeared in my generated video. I also ran video generation using the transferd_result produced by data_process together with the reference image provided in the source code, and the same problems occurred. I suspect the alignment of the video with the image may be causing the problem.

I would like to know if there is any way to solve the facial distortion and the background flicker. Also, what images are stored under the 'champ/transferd_result/visualized_imgs' path? What I currently observe is a superposition of the normal image and the reference image, but I don't know what that means. Please let me know if I did something wrong that caused the visualized_imgs error.

@zhou-linpeng


grid_wguidance_anyone.mp4

Here is my result. You can apply some deflicker methods to your condition maps.
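One simple deflicker approach, sketched here as an illustration (this is not Champ's code, and plain averaging only suits continuous maps such as depth or normal, not discrete semantic labels), is a centered temporal moving average over the condition-map frames:

```python
import numpy as np

def temporal_smooth(frames, window=3):
    """Reduce frame-to-frame flicker in a condition-map sequence by
    replacing each frame with a centered moving average over `window`
    neighbouring frames (window should be odd)."""
    frames = np.asarray(frames, dtype=np.float32)
    half = window // 2
    smoothed = np.empty_like(frames)
    n = len(frames)
    for i in range(n):
        # clamp the averaging window at the sequence boundaries
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed[i] = frames[lo:hi].mean(axis=0)
    return smoothed.astype(np.uint8)
```

Stronger alternatives include Gaussian temporal filtering or dedicated video-deflicker tools; for semantic maps, a per-pixel temporal median over the label values is a safer choice than averaging.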

ZZfive commented Apr 16, 2024

> Here is my result. You can apply some deflicker methods to your condition maps.

What deflicker methods can I try? Can you tell me?


faiimea commented Apr 16, 2024

The first video uses ref-07.png and motion-02.
The second video uses ref-07.png and my processed video.

322714326-39fcd7dc-9795-462a-9409-9a5f35141ca1.mp4
grid_wguidance.mp4

And the face distortion looks like this:
[screenshot: example of the face distortion]

@subazinga (Contributor)

We will release an SMPL smoothing feature soon, maybe this week, to address the flicker problem.
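Until that feature lands, the general idea of smoothing per-frame SMPL parameters can be sketched roughly as follows (an illustrative Gaussian temporal filter, not the maintainers' implementation; note that axis-angle rotations are not a vector space, so a real implementation would smooth rotations in quaternion space):

```python
import numpy as np

def smooth_smpl_params(params, sigma=2.0):
    """Apply a 1-D Gaussian filter along the time axis of a per-frame
    SMPL parameter array (frames x dims), damping high-frequency jitter."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    # pad with edge values so the ends are not pulled toward zero
    padded = np.pad(params, ((radius, radius), (0, 0)), mode="edge")
    out = np.empty_like(params, dtype=np.float64)
    for d in range(params.shape[1]):
        out[:, d] = np.convolve(padded[:, d], kernel, mode="valid")
    return out
```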
