
Can we use database encryption instead of writing everything as plaintext #4778

Open
sharonxd1234 opened this issue May 21, 2024 · 25 comments
Labels
area:deployment (related to how uptime kuma can be deployed), releaseblocker (blocking bugs encountered with a new release), security
Milestone

Comments

@sharonxd1234

sharonxd1234 commented May 21, 2024

πŸ“‘ I have found these related issues/pull requests

None found

🏷️ Feature Request Type

Other

πŸ”– Feature description

Encrypt the SQLite database file so that other users with the same filesystem access cannot read IP logs and other data stored in plaintext (for example, by hashing values the way user passwords are hashed).

βœ”οΈ Solution

I would love it if the plaintext in the database could not be read by any standard DB browser.

❓ Alternatives

No response

πŸ“ Additional Context

No response

@sharonxd1234 sharonxd1234 added the feature-request Request for new features to be added label May 21, 2024
@CommanderStorm CommanderStorm added help area:deployment related to how uptime kuma can be deployed and removed feature-request Request for new features to be added labels May 21, 2024
@CommanderStorm
Collaborator

CommanderStorm commented May 21, 2024

plaintext in the database would not be able to read by any standard db browser.

So you have FS-level access permissions to read the database of the node itself...

How would we write to the database? We would likely have an encryption key lying around somewhere on the same disk
=> you could just read the encryption key and read the db anyway

I don't see this as being possible or adding security.
=> reclassifying as help

What is your actual use case? I think this is an XY problem.

@thielj

thielj commented May 22, 2024

A use case is for example container volumes. When they are detached, plain text passwords remain on the volume or the underlying storage system. The solution in many other projects is to provide an encryption key or seed through env variables or secrets that are not written to disk or into the database.

Or consider the case of an exploit enabling an attacker to download the database. It doesn't even have to be a vulnerability in U-K. This is exactly the scenario I was talking about when I mentioned splitting monitoring duties between an internal instance that isn't exposed to the net, but instead 'contributing' to an external instance that's used for simple reachability checks and displaying status pages.

@CommanderStorm
Collaborator

CommanderStorm commented May 22, 2024

In v1, the performance budget is too tight to include this.
In v2, you can use MariaDB and https://mariadb.com/kb/en/data-at-rest-encryption-overview/ to achieve this.
Please see #4500 for the status of said release and what needs to happen before we can publish a beta.

Enabling the SQLite crypto extension https://www.sqlite.org/see/doc/trunk/www/readme.wiki does not seem worth dev-time as I don't think this would help with actual security that much.

Think this resolves your request.
If you have insights of how SQLITE-SEE works and if this is simple to configure/maintain and performant enough (provide a benchmark ^^), we can reopen the issue.
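For reference, MariaDB's data-at-rest encryption with the simple file-based key management plugin boils down to a server config along these lines (paths are examples; see the linked overview for the key file format and caveats):

```ini
[mysqld]
# Load the simple file-based key management plugin
plugin_load_add = file_key_management
# Key file contains lines of "<key-id>;<hex-encoded key>"; readable only by mysqld
file_key_management_filename = /etc/mysql/encryption/keyfile
# Encrypt InnoDB tablespaces and the redo log
innodb_encrypt_tables = ON
innodb_encrypt_log = ON
```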

@thielj

thielj commented May 23, 2024

It's unnecessary to encrypt heartbeats and events, but connection strings, authentication headers and the like shouldn't be persisted in plaintext, just as you don't store login passwords in plaintext.

There's no performance issue with that either.

@CommanderStorm
Collaborator

CommanderStorm commented May 23, 2024

We are not splitting our database.
This would be exponentially harder to test (at least I can't come up with a design that would not be a major rewrite, having quite subtle bugs or missing features).
Remember that we still need to support both sqlite and mariadb/mysql.
=> would be way too risky given the current test coverage

About the performance aspect:
Yes, I have gotten shit for this yesterday as well while sitting on the couch with @Valentin-Metz already.
Still don't think we should add this extra complexity into the sqlite part.
If people need it, they can reach for https://mariadb.com/kb/en/data-at-rest-encryption-overview/.

@thielj

thielj commented May 23, 2024

I didn't say "split the database"; you actually don't even need to change the database at all. Where you save plaintext strings now, you save encrypted bytes - base64 encoded if you prefer.

And re MariaDB, that option is mainly targeted at cloud databases with keys managed by the provider or some other means. Even with that enabled, you would still encrypt valuable data yourself.

Give me one good reason why my login information to other systems should be stored less securely than my login information to your app.

@thielj

thielj commented May 23, 2024

See second paragraph here: https://learn.microsoft.com/en-us/azure/mariadb/concepts-security

Without a (hardware based) KMS infrastructure in place, MariaDB's encryption at rest features won't be of much help.

@CommanderStorm
Collaborator

re MariaDB, that option is mainly targeted at cloud databases with keys managed by the provider or some other means. Even with that enabled, you would still encrypt valuable data yourself.

As I read the mariadb docs, this is not the case.
Hashicorp and File are both valid KMS plugins.

From the Microsoft article I don't quite get what you are trying to get at.
At some point keys are going to be stored and written to disk. If they are not, they would not survive a reboot...
Cloud providers have developed custom hardware for this.

I don't see why I would trust my environment variables but not the keys managed by my database. Seems like the same level of security to me...

@CommanderStorm
Collaborator

you actually don't even need to change the database at all. Where you save plaintext strings now, you save encrypted bytes - base64 encoded if you prefer.

I think rolling your own crypto falls under the exponentially harder to test category.
Given the current test coverage, I would classify the risk as being too high.
If we can outsource this risk to mariadb, that would be a good choice for the moment.

@CommanderStorm CommanderStorm changed the title Database file encryption (PlainText) Can we use database encryption instead of writing everything as plaintext May 23, 2024
@thielj

thielj commented May 25, 2024

Every backup will be in cleartext (except for file system based snapshots before you start nitpicking).

Every other leak of database connection details will be able to access the data in plaintext.

Explain to me why you protect your admin password by hashing it, please.

Playing devil's advocate and following your logic, I could just say: "there's no need to do that. If someone compromises the app, they can obviously get in without the password, so why bother. And if someone can download the database with the plaintext password, they already have all the data anyway, so why bother."

And, yet you hash your passwords.

Give me one good reason why my login information to other systems should be stored less securely than my login information to your app.

@thielj

thielj commented May 25, 2024

As I said before, I'm no JS programmer, but it's something like this to set you up:

const crypto = require('crypto');
const key = crypto.scryptSync(user_supplied_seed, your_chosen_salt, 24); // 24-byte key for AES-192
const iv = crypto.randomBytes(16); // store the IV alongside the ciphertext
const cipher = crypto.createCipheriv('aes-192-cbc', key, iv);
const encrypted = Buffer.concat([cipher.update(secret, 'utf8'), cipher.final()]);
const decipher = crypto.createDecipheriv('aes-192-cbc', key, iv);
const decrypted = Buffer.concat([decipher.update(encrypted), decipher.final()]).toString('utf8');

@sharonxd1234

This comment has been minimized.

@CommanderStorm
Collaborator

I think the password hashing response is a bad analogy and I am not going to react to that bait.

Let's get this discussion back to being productive:

So your concern has changed in terms of what the attacker has access to:
As I read your last comment, the attacker has dumped the database credentials for an encrypted mariadb instance from the environment variables.
I don't think that for this attacker, having symmetrically encrypted data stored in the DB would make a difference if we store the key material in the environment variables.

@thielj

thielj commented May 26, 2024

We're running in circles here. Just read my comment you marked as spam.

You have two choices:

  1. limit your user's exposure in cases where a partial exploit of U-K or access to the database or backups outside of U-K's control would disclose sensitive information; or
  2. make a "full compromise" (i.e. root access to the running instance or the container host) your one and only concern. "Nach uns die Sintflut", in case my point is lost in translation.

@thielj

thielj commented May 26, 2024

I don't think that for this attacker, having symmetric crypto stored in the DB would make a difference, if we store the key material in the environment variables.

Where do you plan to store the connection string to that secure MariaDB, by the way? Answers on a postcard...

@CommanderStorm
Collaborator

read my comment

The relevant part of #4784 (comment) which this discussion was missing:

But I didn't encrypt them, because I think encrypting them is meaningless, as the encryption key will be in the same data directory. If an attacker can read your SQLite database, they can probably get the encryption key to decrypt the information too.

That's not how you do it, and also not why you do it. Unless you have a hardware key storage, an encryption key or seed is provided through some other means (secrets, environment) and never persisted to disk and ideally stored in memory that isn't swapped out.

The encryption protects plaintext or weakly hashed passwords, connection strings, authorization headers, etc. A docker volume mounted to store the DB, for example, gets detached and lives on until it is destroyed. Even then the data is probably still on it and can be scanned for valuable information.

Or consider the case of an exploit enabling an attacker to download the database. It doesn't even have to be a vulnerability in Uptime Kuma; an attacker might get access to storage volumes through other means, without actually having the key information.

And yes, someone who has compromised your running container will have access to it. At this point, you have a much bigger problem already.

@CommanderStorm CommanderStorm added security releaseblocker blocking bugs encountered with a new release and removed discussion labels May 26, 2024
@CommanderStorm CommanderStorm added this to the 2.0.0 milestone May 26, 2024
@CommanderStorm
Collaborator

CommanderStorm commented May 26, 2024

Where do you plan to store the connection string

Good point.
The user can configure the connection settings in the ui on initial setup or via environment variables.
It is then (=in both cases) written to disk as db-config.json in the format

{"type": "sqlite"}

or

{
  "type": "mariadb",
  "hostname": "localhost",
  "port": "3306",
  "database": "kuma",
  "username": "",
  "password": ""
}

So in the v2-design, the database credentials are written to disk which means that the malicious local admin could also read them.

We have the constraint that we want to offer a simple setup of embedded mariadb. We want configuring Uptime Kuma to be as simple as possible for our users => forcing environment variables is not really something we want.

An option to avoid writing these to disk would be to keep them in the environment variables instead.
=> remove this line

Database.writeDBConfig(dbConfig);

This would then allow users to have an encrypted, external mariadb instance which they connect to via credentials in the environment variables, and those would never be written to disk.
What do you think about this?
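A minimal sketch of that idea, assuming hypothetical `UPTIME_KUMA_DB_*` variable names (these are illustrative, not the actual v2 configuration keys):

```javascript
// Hypothetical sketch: prefer environment variables over db-config.json.
// The UPTIME_KUMA_DB_* names are assumptions for illustration only.
function loadDbConfig() {
    if (process.env.UPTIME_KUMA_DB_TYPE) {
        // Credentials stay in the environment; never call Database.writeDBConfig()
        return {
            type: process.env.UPTIME_KUMA_DB_TYPE,
            hostname: process.env.UPTIME_KUMA_DB_HOSTNAME,
            port: process.env.UPTIME_KUMA_DB_PORT,
            database: process.env.UPTIME_KUMA_DB_NAME,
            username: process.env.UPTIME_KUMA_DB_USERNAME,
            password: process.env.UPTIME_KUMA_DB_PASSWORD,
        };
    }
    // Fall back to the on-disk config / embedded sqlite default
    return { type: "sqlite" };
}
```

With this shape, a rogue local admin reading the data directory sees no credentials; only whoever controls the running process environment does.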

@thielj

This comment was marked as abuse.

@CommanderStorm

This comment was marked as resolved.

@thielj

thielj commented May 27, 2024

You're not just planning to do something which you repeatedly argue against as making no difference - you're even going further and plan to write that information to disk.

I'm out of this discussion, but let me give you some points on the way:

  • Forget about Sqlite SEE.

    • Or read at least the license terms before you go on about it.
    • There's a free alternative to that, but it will just worsen your performance issues
  • some performance notes

    • Judging from projects with similar data, you are not going to solve your performance issues by switching to MariaDB - you will ultimately need to split your data path and separate relational data (config, irregular events) from the time series data
    • SQLite's query latency is normally measured in microseconds. For any external database, you're looking at milliseconds due to the additional marshalling, networking, encryption, authorization and authentication overhead. When comparing "like for like", embedded SQLite is hard to beat.
    • As I said, I'm no JS programmer but Node.js is inherently single threaded while SQLite scales best with multiple threads (2x the number of cores as a starting point). You may need to look into this from a JS perspective. Worker threads? Connection pooling? Caching? WAL?
    • Postgres with the time series extension is the only database that could reasonably handle both relational and TS data. If I wanted to offer hosted "Kuma as a Service" with minimal costs, I would go for that. I would explicitly not recommend this as a default or embedded solution.
    • If I wanted to keep it lightweight, easy on resources, performant, reasonably secure, low on maintenance, etc I would default to SQLite for relational data, allowing advanced users to migrate to an external MariaDB, Postgres or SQLServer instance. Not for performance (it would probably be worse), but for ease of backup or replication.
    • If this wasn't Node.js, I would look into using RRD for time series data, like many other networking monitors do. For Node.js, YMMV.
    • As such, you'll probably have to look for solutions that provide sufficient performance for your Node backend. I believe Influx and Prometheus provide JS bindings. I wouldn't recommend these as an embedded or default option either, but as an option for advanced users to migrate to.
    • Ultimately, a (second, separate) SQLite DB for time series is probably still an acceptable default for small setups or "tiny" hosting plans, with options to migrate the TS data to a real TSDB for larger setups.

about the security aspects of this issue:

  • You can't rely on the database protecting sensitive data.
  • There's no perfect protection. You can just make it incrementally harder to break.
  • There are options to further harden the protection of key material, but probably not with Node.js. Look into how postfix reads secrets as the root user, then changes into a chroot jail before dropping all privileges.
  • Compared to storing data in plaintext, even obfuscating it with a fixed key stored in U-K is much preferable and protects against e.g. siphoning off terabytes from a misconfigured S3 bucket and using a content indexer to search for potentially valuable information "at scale"
  • Connection strings can also leak accidentally, independent of the user's key seed. They get emailed around, noted somewhere, etc.
  • As you might have realized by now, passing in a key seed isn't less secure than passing connection credentials
  • However, passing in secrets or connection strings through the environment or similar is by far more secure than storing them on the disk. An attacker wouldn't just need access to the data, they would need access to the specific running instance that has been passed the secrets or breach the security of the control plane. If the data is stored on disk, an exploit that lifts off that single file or a stopped instance or a volume backup is sufficient - something they might find in the same location as the DB backup.
  • Make security and data protection your default, and your priority. Consider cases that don't involve a full compromise. Assume that your users expect their data to be secure. Assume users have no clue what they're doing.
  • You can make an initial setup both easy and secure. It's not one or the other.

@CommanderStorm
Collaborator

CommanderStorm commented May 27, 2024

Forget about Sqlite SEE

Most parts of their licence are fine. The only problematic part is "customers cannot make additional copies of the software to use for other purposes".
We could not guarantee that our users don't extend the software via FROM uptime-kuma:2 in another docker image.
=> You are correct, SEE is not feasible.

@CommanderStorm
Collaborator

CommanderStorm commented May 27, 2024

Re:some performance notes

performance issues

About the performance problems of v1:
They are mostly because we don't aggregate at all and store nearly everything.
We likely could have improved our utilisation a lot by optimising the db layout.
Anyway, we have now committed to using aggregated tables => solving this issue for all platforms.
Scanning 1-10 gigabytes for queries is somewhat slow, and some things in SQLite's design, such as its query optimisation engine, are not optimised for this. Being concurrent here is better, as we execute a lot of database queries (this could have been another optimisation route, which we did not take).

While we might have been able to solve it in other ways, this is the way we chose. I don't see us changing out that part as the current push is already nearly done. See #4500 for the bugs left.

Postgres [...] is the only database that could [...]

Please see #3748 (comment)

SQLite's query latency is normally measured in microseconds

The relevant query is currently (v1) at ~70ms (dependent on your monitor count and retention; mileage may vary). Please see #2750 (comment)
We may be using it incorrectly, but we are not db experts.

@CommanderStorm
Collaborator

CommanderStorm commented May 27, 2024

syphoning off terrabytes from a misconfigured S3 bucket

We currently don't have a feature where we automatically upload+restore backups to/from a S3 bucket. If we had, that would be encrypted.
The database export feature is getting removed in v2, as it is currently unmaintainable.

Make security and data protection your default, and your priority. Consider cases that don't involve a full compromise. Assume that your users expect their data to be secure. Assume users have no clue what they're doing.

Our current project goals are that we are simple to setup and allow great UX.
If one of these goals comes into conflict with security or data protection, we have in the past chosen to prioritise better UX and allow users to downgrade their security/privacy. (This obviously only covers cases where users might be depending on the old behaviour or need an exception for some reason.)

Note

Lets be clear on this one:
We obviously try to follow best practices where possible.
I agree that we can/have to improve this a lot. Where the line of what is acceptable gets drawn is always super muddy and difficult (this is a job for a good reason ^^).

Note

Because this might cause a misunderstanding, here are examples of what I mean.

About the prioritisation:

  • we CURRENTLY don't have a dedicated, non-tamperable audit log. We currently only log these events to stdout instead.
  • we CURRENTLY don't have proper user management with a fleshed out permission system
  • we CURRENTLY don't have SSO

About the downgrade security aspects of that statement:

  • 2FA-logins can be left disabled
  • Auth can be disabled altogether
  • UPTIME_KUMA_DISABLE_FRAME_SAMEORIGIN allows users to allow iframes (and thus clickjacking)
  • UPTIME_KUMA_WS_ORIGIN_CHECK allows users to not have CORS for the ORIGIN-Header
  • UPTIME_KUMA_ALLOW_ALL_CHROME_EXEC allows users who have custom chromium paths to have them recognized

And the downgrade privacy aspects:

  • we allow users to add google analytics to their status pages
  • we have a check built in (default=enabled) to show a banner to the user if they are running an outdated version. This check does an HTTP check against our server. (We don't track users this way, but for some hardline users, this might be a red line)

You can make an initial setup both easy and secure. It's not one or the other.

Easier said than done (at least concerning this issue).
We don't want to force hundreds of environment variables.
The ones which we have are likely already pretty intimidating for new users: https://github.com/louislam/uptime-kuma/wiki/Environment-Variables

I currently don't see good compromises that:

  • We can CURRENTLY maintain (this can change, if somebody would like to refactor parts of the internals..).
    This part might change in a v3 or v4.
  • Have an acceptable risk to our users (see the test coverage and how often we have been shipping breaking stuff already)
  • Be true to our goals of simple setup for new users and good UX.
  • Achieve the goal and actually be more secure against rogue local admins or exploits that would allow FS access.

=> For this feature, I currently see it as an either or.

Pushing people towards a tool which we don't have to maintain like https://mariadb.com/kb/en/data-at-rest-encryption-overview/ is quite attractive as this gets around the maintenance problem. Assuming we do the change I proposed here #4778 (comment), this should solve the problem of malicious local admins without environment variables.

You are correct that it does not solve the problem of users sharing database credentials via insecure means (while the db is publicly routable), but I don't think that this is an attack vector we actually have to protect against.

As an analogy:
If a user posts their login credentials on a public site and has not enabled 2FA, I would not consider this a security risk on our end.

@thielj

This comment was marked as resolved.

@CommanderStorm
Collaborator

Yeah, that was likely not the best way to write what I meant and could be misunderstood. Sorry, I should not have been this blunt.
I have rephrased the comment above to say what I really mean.
