App Security Score inconsistencies #1940

Open
Prehistoic opened this issue Apr 8, 2022 · 12 comments
Labels
investigating (MobSF collaborators are investigating this issue)
static analyzer (Static Analyzer related)

Comments

@Prehistoic commented Apr 8, 2022

ENVIRONMENT

OS and Version: all
Python Version: all
MobSF Version: 3.5.0

EXPLANATION OF THE ISSUE

I'm noticing some inconsistencies in security scores since the release of 3.5.0.

I ran a static analysis on 2 applications and got these results:
[screenshots: scan result summaries for the two applications]

With this new security_score formula (reconstructed from the appsec.py diff quoted later in this thread):

score = 100 - (((high * 1) + (warning * 0.5) - (secure * 0.2)) / total) * 100

the security scores obtained are 39/100 and 36/100. So basically the first one, which has far more findings, receives a better score?

I understand the intention to weight the score based on the ratio of the findings' severities, but this case seems a bit extreme.

EDIT:

Another example that shows the pitfalls of this formula:

  • if I get just 1 HIGH and nothing else, my security score is 100 - ((1 + 0 - 0) / 1) * 100 = 100 - 100 = 0
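A minimal sketch of the 3.5.0 formula (matching the appsec.py diff quoted later in this thread; the function name score_v350 is purely illustrative) reproduces both pathologies:

def score_v350(high, warn, sec):
    # Weighted-average formula introduced in 3.5.0 (see the appsec.py diff below).
    total = high + warn + sec
    if total == 0:
        return 0
    return int(100 - ((high * 1 + warn * 0.5 - sec * 0.2) / total) * 100)

print(score_v350(1, 0, 0))   # 0  -- one lone high finding floors the score
print(score_v350(10, 5, 5))  # 42 -- far more findings, yet a much higher score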
@github-actions bot commented Apr 8, 2022

👋 @mat42290
Issues are only for bug reports and feature requests. For limited support, questions, and discussions, please join the MobSF Slack channel.
Please include all the requested and relevant information when opening a bug report. Improper reports will be closed without any response.

@ajinabraham (Member)

Thanks for bringing this up. Do you have any thoughts on alternative implementations?

@Prehistoic (Author) commented Jun 2, 2022

Honestly, I thought the previous implementation wasn't so bad, that is to say: 100 - X * number_of_highs - Y * number_of_mediums + Z * number_of_secures, with some arbitrary X, Y, and Z values.

You then floored the score at 10 if it went below that. What about lowering that floor to 0 or 1? I think using the whole range between 0 and 100 would be beneficial and would allow better comparison of apps with low scores.

I guess you were not satisfied with this implementation since you changed it, but in my opinion, even if it is pretty basic, that's the way to go. I think any weighting based on severity ratios will always produce inconsistencies between apps with just a few findings and apps with a lot of findings.

Actually, a big selling point of this implementation (imo) is that it makes it very easy to set goals for an app's developers. For example, say their app got 20/100 and they aim for 70/100 before release. If they know that fixing one High finding grants them X points, they can easily plan and prioritize what needs to be done. Following the same logic, it would be great to document how the score is calculated (in the documentation, or even on the scan results page) to make it clear for everyone.

EDIT :

There might also be something to do with CVSS scores here. Maybe you could adapt the X and Y values based on the average CVSS scores of the findings of the corresponding severity.

For example :
Security_score = 100 - X * (average_cvss_for_high_findings/A) * number_of_highs - Y * (average_cvss_for_medium_findings/B) * number_of_mediums + Z * number_of_secures

Where A is the maximum CVSS score possible for High findings (should be 10) and B is the maximum CVSS score possible for Medium findings (this depends on where the boundary between high and medium is placed; I think it's 6.9, but I haven't checked).

But that might make the formula more complex for no real gain.
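A minimal sketch of that CVSS-adjusted variant, assuming the illustrative 15/10/5 weights for X/Y/Z (borrowed from the pre-3.5.0 formula discussed above) and the A = 10, B = 6.9 bounds; the function name is hypothetical:

def cvss_weighted_score(high_cvss, medium_cvss, secure_count,
                        X=15, Y=10, Z=5, A=10.0, B=6.9):
    # high_cvss / medium_cvss hold one CVSS score per finding of that severity.
    avg_high = sum(high_cvss) / len(high_cvss) if high_cvss else 0
    avg_medium = sum(medium_cvss) / len(medium_cvss) if medium_cvss else 0
    # Scale each penalty by how severe that bucket's findings are on average,
    # relative to the maximum CVSS possible for the bucket (A or B).
    score = (100
             - X * (avg_high / A) * len(high_cvss)
             - Y * (avg_medium / B) * len(medium_cvss)
             + Z * secure_count)
    return max(0, min(100, score))

print(cvss_weighted_score([9.8, 7.5], [5.0, 4.3, 6.1], 4))  # ~71.7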

@ajinabraham
Copy link
Member

Let me take a look at this.

ajinabraham added the investigating (MobSF collaborators are investigating this issue) label Jun 13, 2022
@michaelkyawsan

May I know, is this issue solved in the latest version?

@devtty1er commented Oct 18, 2023

@michaelkyawsan no, unfortunately the scoring system is still just a weighted average. E.g. an app with exactly one high finding scores "worse" than an app with any number of high findings plus some additional medium findings.
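Concretely, under the current formula one high finding alone scores 100 - (1/1) * 100 = 0, while five highs plus five mediums score 100 - ((5 + 2.5) / 10) * 100 = 25.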

@ajinabraham please advise. Would you accept a PR that reverts the scoring to pre-#1881?

mobsf/StaticAnalyzer/views/common/appsec.py

     high = len(findings.get('high'))
     warn = len(findings.get('warning'))
     sec = len(findings.get('secure'))
-    total = high + warn + sec
-    score = 0
-    if total > 0:
-        score = int(100 - (
-            ((high * 1) + (warn * .5) - (sec * .2)) / total) * 100)
+    score = 100 - (high * 15) - (warn * 10) + (sec * 5)
     if score > 100:
         score = 100
+    elif score < 0:
+        score = 10
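For comparison, the reverted formula is at least monotonic: one high finding gives 100 - 15 = 85, while ten highs, five warnings, and five secure checks give 100 - 150 - 50 + 25 = -75, clamped to the floor of 10 (the floor Prehistoic suggested lowering to 0 or 1).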

devtty1er added a commit to devtty1er/Mobile-Security-Framework-MobSF that referenced this issue Oct 18, 2023
ajinabraham added the static analyzer (Static Analyzer related) label Dec 10, 2023
@johnxguo

How about this algorithm:

First, calculate a "loss score", e.g. loss_score = high * 10 + warning * 5 - sec * 2. Its value range is (-∞, +∞), but in most cases it falls in [0, +∞).
Then, normalize the loss score to a bounded range using a sigmoid-like squashing function.
Finally, map the result onto the score interval.

Example code:

import math

def sig_like(x):
    # Sigmoid-like squashing: maps a loss of 0 to 1, large losses toward 0,
    # and negative losses (more secure checks than findings) toward 2.
    return 2 / (1 + math.exp(x / 30))

def loss_score(high, warning, sec):
    return high * 10 + warning * 5 - sec * 2

def score(high, warning, sec):
    loss = loss_score(high, warning, sec)
    # Cap at 1 so a negative loss cannot push the score above 100.
    return min(sig_like(loss), 1) * 100

print(score(4, 3, 1))
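For reference, score(4, 3, 1) gives loss = 40 + 15 - 2 = 53 and 2 / (1 + e^(53/30)) ≈ 0.29, i.e. about 29/100; a single high finding gives loss = 10 and a score of about 83, instead of the hard 0 produced by the current formula.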

@johnxguo

> (quoting the algorithm proposal above)

@ajinabraham

@ajinabraham (Member)

Do you want to send a PR?

johnxguo pushed a commit to johnxguo/Mobile-Security-Framework-MobSF that referenced this issue Dec 19, 2023
@johnxguo

> Do you want to send a PR?

#2311 @ajinabraham

@ajinabraham (Member)

Thanks, I will test this out.

@sumit-jha-Pw commented May 16, 2024

I am also facing a similar issue where the score decreases even after the results improve.
[screenshots: scan results from May 15 and May 16, 2024]
