
Memory leak when processing multiple videos in a loop #148

Open
tingfngy opened this issue Nov 6, 2020 · 9 comments
Labels: bug (Something isn't working), not very important

Comments

tingfngy commented Nov 6, 2020

When processing multiple videos in a loop, memory usage grows abnormally. For example, processing the first video takes about 1 GB; by the time the second video is being processed, usage has grown to 2 GB.

williamfzc added the bug (Something isn't working) label Nov 6, 2020
williamfzc (Owner) commented:

Yes, this has come up before, but not many people use it this way (it can usually be avoided by invoking via the command line).
OpenCV's memory reclamation is also a bit problematic.
Could you share your script? I'd like to see the current situation.

tingfngy (Author) commented Nov 6, 2020

Using a memory profiling tool, I traced it to this line: return cv2.cvtColor(old, cv2.COLOR_RGB2GRAY)
Give me a moment to tidy up the script; it's a bit hard to extract on its own.

tingfngy (Author) commented Nov 6, 2020

Here is a snippet that reproduces the bug. Change the base_dir variable to a local folder containing several video files.

tingfngy (Author) commented Nov 6, 2020

```python
from stagesepx.cutter import VideoCutter
from stagesepx.classifier import SVMClassifier
from stagesepx.video import VideoObject
from loguru import logger

import os


def test(video_path):
    video = VideoObject(video_path)
    video.load_frames()

    # --- cutter ---
    cutter = VideoCutter()
    res = cutter.cut(video, block=2)
    stable, unstable = res.get_range(offset=3)
    picture_path = video_path.split('.')[0]
    if not os.path.exists(picture_path):
        os.mkdir(picture_path)
    data_home = res.pick_and_save(stable, 3, to_dir=picture_path)

    # --- classify ---
    cl = SVMClassifier()
    cl.load(data_home)
    cl.train()
    classify_result = cl.classify(video, stable)

    res_dic = classify_result.to_dict()
    return res_dic


if __name__ == '__main__':
    import warnings
    warnings.filterwarnings("ignore")
    logger.remove(handler_id=None)

    import tracemalloc
    tracemalloc.start()

    base_dir = r'D:\data2\down\launch_video'
    for i, j in enumerate(os.listdir(base_dir)):
        print(f'Processing video #{i}.....')
        print(j)
        test(os.path.join(base_dir, j))

        snapshot = tracemalloc.take_snapshot()
        top_stats = snapshot.statistics('lineno')
        print('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
        for stat in top_stats[:5]:
            print(stat)
```

williamfzc (Owner) commented Nov 6, 2020

In the short term you can work around it by invoking via the command line; OpenCV has quite a few leak issues, e.g. https://www.v2ex.com/t/714960

By "command-line invocation" I mean changing the function call inside the loop into a script invocation inside the loop.
Alternatively, any other way of isolating the Python runtime will work.
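
To make that workaround concrete, here is a minimal sketch, assuming the reproduction script above is saved as a hypothetical process_one.py that wraps test() and reads the video path from sys.argv[1]. Each video is then handled by a fresh interpreter, so whatever memory OpenCV fails to release is reclaimed when the child process exits.

```python
import os
import subprocess
import sys

base_dir = r'D:\data2\down\launch_video'

for name in os.listdir(base_dir):
    video_path = os.path.join(base_dir, name)
    # Spawn a fresh Python process per video (process_one.py is a hypothetical
    # wrapper around test()); its memory is returned to the OS when it exits,
    # sidestepping the in-process leak.
    subprocess.run([sys.executable, 'process_one.py', video_path], check=True)
```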

tingfngy (Author) commented Nov 6, 2020

OK, thanks a lot. I'll use this approach to work around the problem for now.

huzhicheng1993 commented:

hello williamfzc

I've run into this problem recently as well. Using a memory profiling tool, I traced the spike to the line video.load_frames(): memory surges there. I also clear the caches at the end with cl.clean_model() and video.clean_frames(), but it has no effect.

When the same function is called repeatedly inside a for loop (part of the function is shown in the image below), memory keeps growing. Is this the same issue as the OpenCV leak above?

(image: screenshot of the function body)

williamfzc (Owner) commented:

hi @huzhicheng1993

The memory increase in load_frames is expected behavior: the design reads all video frames into memory at once so that subsequent analysis runs faster.
The root cause is almost certainly that these frames are not being released properly, but I haven't had time to track down exactly where. Since #148 (comment) offers a workaround, this issue is not a high priority.

PRs are of course welcome, but I'd also like to ask: is there a scenario where you absolutely have to call it this way?
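
For completeness, a minimal sketch of the frame lifecycle described above, using only calls already shown in this thread (VideoObject, load_frames, clean_frames); the explicit del and gc.collect() at the end are an assumption on my part and, per the report above, may not actually stop the growth.

```python
import gc

from stagesepx.video import VideoObject


def analyse_one(video_path: str) -> None:
    video = VideoObject(video_path)
    video.load_frames()       # reads every frame into memory up front: the expected spike
    try:
        ...                   # cutter / classifier work goes here
    finally:
        video.clean_frames()  # drop the cached frames
        del video             # drop the last reference to the VideoObject
        gc.collect()          # assumption: force a collection between videos; may not help, per the report above
```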

HResponsibility commented:

> (quoting @huzhicheng1993's comment above)

Hello, could you share which memory analysis tool you used? It seems able to show which line of code is taking up a particularly large amount of memory.
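
For what it's worth, the reproduction script earlier in this thread already shows one such tool: Python's built-in tracemalloc module, whose statistics('lineno') view reports which source lines currently hold the most allocated memory. A stripped-down sketch of that usage:

```python
import tracemalloc

tracemalloc.start()

# ... run the video-processing code under inspection here ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:5]:
    # each stat names a source file/line and how much memory it currently holds
    print(stat)
```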
