Results are depending on DC component of the signal #80

Open
meierman1 opened this issue Nov 4, 2021 · 1 comment · May be fixed by #84
Comments

meierman1 commented Nov 4, 2021

Issue
The results of process(), and more specifically of fit_peaks(), depend on the DC component of the signal, which, IMO, does not make sense.
To reproduce, take any data for which the analysis works well and add an offset to the signal. Once the mean of the signal exceeds roughly 15-30 times its standard deviation, the analysis fails outright.
More importantly: the quality of the results is already affected by much smaller changes in offset.

Cause (as far as I can tell)
At least partially, this is caused by detect_peaks(), where different moving averages are tested. The candidates are percentages of the absolute moving average, which means the tested values are coarser for high-mean signals.
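To make the coarseness concrete, here is a minimal sketch of the effect (not HeartPy itself; it uses a toy sine-plus-noise signal standing in for real heart-rate data, and assumes a 5% spacing between adjacent candidate percentages). The step between candidate thresholds scales with the signal mean, while the peak amplitude, measured by the std, does not:

```python
import numpy as np

# Toy signal: unit-amplitude oscillation plus a little noise.
# If candidate thresholds are of the form mean(signal) * (1 + ma_perc / 100),
# the spacing between adjacent candidates grows with the mean.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.1 * rng.standard_normal(2000)

for offset in (0.0, 10.0, 100.0):
    shifted = signal + offset
    mean, std = shifted.mean(), shifted.std()
    step = mean * 0.05  # distance between candidates 5 percentage points apart
    print(f"offset={offset:6.1f}  mean={mean:7.2f}  std={std:5.2f}  "
          f"threshold step={step:6.2f}  step/std={step / std:6.2f}")
```

With no offset the step is essentially zero; at an offset of 100 the step is several times the std, so no candidate threshold can land inside the band where the peaks actually are.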

Proposed fix
I propose keeping the signal mean as the base, but computing the same offset percentages as before from 3*std(signal) instead of from the signal mean itself, and adding those offsets to the mean.
Alternatively, the signal mean could always be set to a fixed value, as is already done for signals with negative baselines (line 280 in heartpy.py).

I should add that I experienced something odd when testing all of this: with a zero-mean signal (which is internally offset to a zero baseline, as mentioned above), I get different results than when I take the very same signal, add the offset abs(np.percentile(hrdata, 0.1)) to it, and then call the process function. I have not had the time to find the root cause of this.

Edit: I did some more testing. I would recommend applying the offset percentages to just std(hrdata). 3*std(hrdata) does not make sense, as peaks usually lie well within plus/minus 2*std(hrdata) and we test up to 300 percent anyway. In fact, I even improved results for my data by adding the following lower/negative percentages to ma_perc_list: [-5, -3, -2, -1, -0.5, 0, 0.5, 1, 2, ...]. At least in my tests, the best percentage was never over 15 and was usually within plus/minus 1 percent.
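The proposed fix can be sketched as follows (a hypothetical helper, not the actual HeartPy API: the function name `candidate_thresholds` and the truncated percentage list filled out with a few illustrative values are assumptions). Because a constant offset shifts the mean but leaves the std unchanged, the candidate thresholds move rigidly with the offset and keep identical spacing, making the search offset-invariant:

```python
import numpy as np

def candidate_thresholds(signal, ma_perc_list):
    """Sketch of the proposed fix: start from the signal mean and step in
    fractions of std(signal), instead of scaling the mean itself."""
    mean, std = np.mean(signal), np.std(signal)
    return [mean + (perc / 100.0) * std for perc in ma_perc_list]

# Percentage list in the spirit suggested above, including small/negative steps.
ma_perc_list = [-5, -3, -2, -1, -0.5, 0, 0.5, 1, 2, 5, 10, 15]

rng = np.random.default_rng(1)
signal = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.1 * rng.standard_normal(2000)

base = np.array(candidate_thresholds(signal, ma_perc_list))
shifted = np.array(candidate_thresholds(signal + 1000.0, ma_perc_list))
# The candidates move with the offset but keep identical spacing:
print(np.allclose(shifted - base, 1000.0))  # True
```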

@meierman1 meierman1 linked a pull request Nov 5, 2021 that will close this issue
@paulvangentcom (Owner)

Thanks for the detective work and effort. I'll go through the PR somewhere today or tomorrow and merge if it makes sense!
