-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Session ID handling #89
Conversation
This adds the generateID() function from https://github.com/nginxinc/nginx-saml/blob/c3baf97d2c90e8a597803efd819095353881da83/saml_sp.js#L551 without modifications, and uses it to generate the plain nonce in getAuthZArgs(). The default length of 20 was used as https://openid.net/specs/openid-connect-core-1_0.html does not specify a min or max for the plain nonce, and per https://stackoverflow.com/questions/69209677/google-nonce-max-length the limits are device based not protocol based.
…that rotate on refresh. This refactor changes the use of the request_id as the client session access token (auth_token cookie value) and the keyval store key. On keyval store set the new behavior generates a random value for the client session access token. This value is then hashed using the oidc_hmac_key and the hashed value is used as the keyval store ID and as a session ID for the logs. The hashed value is not redeemable for a session, hardening the config against attacks on the logs and keyval stores. The client session access token is rotated each refresh, which was done both for security and because it allowed for code deduplication.
… refresh response does not contain a new token. Per discussions with Liam C. this is the desired behavior.
Hi @ag-TJNII and thank you for your contribution to our project and for paying attention to security-related aspects. However, I must decline this approach for the following reasons:
|
This change is the core of this proposed PR. The main reason why I'm considering While the request_id is random, it is a documented core variable which likely will be used outside the scope of this module. It's reasonable for client config to use the request ID for other purposes, like a request trace ID. Once the variable is used for other purposes it is exposed, which is a risk as the ID is directly redeemable for a client's session. A decent example of this is nginx-openid-connect/openid_connect.js Line 193 in 39334b6
OWASP recommendation against this behavior: https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html#logging-sessions-life-cycle-monitoring-creation-usage-and-destruction-of-session-ids CWE on logging secrets: https://cwe.mitre.org/data/definitions/532.html Before I address the other concerns I think we need agreement on this, as it is core to the proposed change. |
Yes, I asked that question to ensure that we are on the same page regarding the understanding and nuances of security issues related to implementation.
This is correct, and it's worth clarifying that:
Regarding the logging issue, it's somewhat separate and does not directly relate to the method of generation or use of session cookies. Technically, we could log any confidential or potentially vulnerable data, such as the Key points to consider:
|
Technically true, however in practice we shouldn't. This repo is advertised as a reference implementation, so I think the majority of admins expect this to be reasonably secure by default. I think logging valid session tokens which can be redeemed for a user's session with a simple cookie header would take the majority of admins by surprise. I know I was surprised when I discovered this, and my immediate thought wasn't "As an admin I need to secure this log", instead it was "if they're getting the security fundamentals of 'don't log passwords' wrong, what else does this repo get wrong". Not confidence inspiring. This can be mitigated by clearly documenting the logging behavior, but looking over the README I don't believe the session token logging behavior is documented. Correct me if I'm wrong.
This is a good point. I don't think changing from the request ID to a random value alone has a major changes on this topic, but the one-way function does. I'd like to table this and come back to it once we discuss the one-way cookie to keyval id function implementation, which also matters for the concern about using the njs js_set function in each request. Before we dig into that I'd like to wrap up the discussion on the change from request_id to a random value.
So this is a good point, the session rotation behavior will result in increased zone memory usage as each rotation will result in a new key. The implementation will flush out the values, so using the 1h session / 8h refresh default TTLs, I wouldn't expect the refresh token zone memory usage to increase 8x overall due to the flushed values, but the memory usage for the keys by themselves will increase 8x. I was on the fence about rotating the keys, I think you've just convinced me not to. I'll redo that change. For general memory usage I think the simplest answer is just make the new random value the same length as the old key. What is your opinion on making that a configuration option with a reverse compatible default, so that admins are not surprised on upgrade but advanced users can increase their session ID entropy if they desire/require? |
Most old-school admins understand the NGINX security model quite well, which is based on the Unix approach to rights separation. In simple terms, there are two entities - the admin and everyone else. The admin can do anything; everyone else, nothing. NGINX's model lacks RBAC, and as an admin, you should understand this and make appropriate adjustments in your security lifecycle. By default, keyvals are writable and readable only by the NGINX user, and the Let me reiterate:
Yes, I see no harm in this improvement.
This is a good question. We planned to incorporate such a possibility in the SAML implementation but never made it a separate option. Personally, I think it's a good idea because it offers flexibility. However, I'd only include it in the documentation and not expose it in the openid_connect_configuration.conf. |
Yes, and this is the crux of my concern. The session cookie isn't a password, but for as long as the session is valid it is functionally equal to a password. If a user sets up a valid session then the contents of that cookie can be used to access the protected service as easily as if the attacker had their password. Even easier, frankly, as modern username/passwords should also need MFA, but the session cookie has already gotten past that flow. If I have a user's session cookie then impersonating them is as simple as adding a Not only are these cookies logged, they're also available via the Nginx API and on disk. I did not set a umask on the process (as I assume most won't) and I see the file is world-readable:
So if an attacker can read this file with the current master branch code they also have values they can immediately use to impersonate a user, with no safeguards. The current README doesn't document that the keyval store spool directory needs to be only accessible by the Nginx user, and the default config puts it in the conf directory. Should the admin figure this out? Arguably yes, but we should provide a default config that doesn't rely on the deploying engineer being on-the-ball and fully understanding the nuance and security of the underlying implementation. "Secure out of the box" is a reasonable expectation nowadays.
It's common practice now to run application logs into log aggregation systems for analysis. These logs will commonly be accessible by security, engineering, and development teams. This has been my experience as a professional Linux operations engineer for over 10 years now. The days of the error log being a single protected file on the host, honestly was never a thing as syslog based log aggregation has been a standard since before I started my career. It's a reasonable expectation that logs will not contain secrets or credentials, especially for a web server like Nginx. Asserting that the error log must be protected so that we can log valid session cookies into it is, frankly, not reasonable and one that I think would surprise the majority of users. And I'm not just being paranoid, this kind of attack is gaining popularity:
The CWE on this topic flatly says "Do not write secrets into the log files." Not protect the log, not limit access to the log, "Do not." https://cwe.mitre.org/data/definitions/532.html Also, currently the only safeguard available in this implementation against session hijacking sending the logs to a SIEM for analysis to detect usage pattern. This data stream should arguably include the error log, as that's where this implementation logs that it's created new tokens and accepted logins. We shouldn't assume the error log is secure, there are many common and valid cases where it will be centrally stored and available for analysis.
I assert that logging the session cookie in the name of debugging is like cutting a hole in the side of the building in the name of accessibility. Yes, it will be very easy to get in and out, but there are a lot of negatives that come with it that outweigh the positives. It's an option, but there are others that are likely better. And to be clear I'm absolutely not dismissing the troubleshooting concern. That is valid and will need to be addressed. I don't think logging the actual cookie contents is the correct pattern, however. But I'd like to come back to that once we agree on how the session will be handled, as how to troubleshoot is implementation specific. I have some ideas on how to mitigate/resolve this concern. So with that this is actually a good transition into discussing the cookie value hashing I implemented in https://github.com/nginxinc/nginx-openid-connect/pull/89/files#diff-1dfe8897de6fda3d50975bb6eb32ef0b776015cb9ed38f058ed05c3f3a70d90aR281-R293, as it's highly related to these points. The core assertions I made when implementing this is:
That last point is a hard one, for obvious reasons. In the current implementation we store the user's session cookie and use it as a keyval key for the keyval zone stores. This is a clean, fast, and simple solution, but now the user's session tokens are stored to disk and transferred over the wire if
That's the other, and honestly larger, crux of this change. By using a one-way function on the cookie value and using that as the keyval key we mitigate most of these threats by using a key that's not redeemable for a session. This implementation requires the use of |
That is why this implementation path was chosen initially. I don’t know if all the listed improvements can be implemented without njs or additional modules. |
I changed the default paths for the state file to align with the approach we use for SAML. (see PR #90) |
Great, that's the same path I settled on myself so glad to see I picked right. I checked the permissions on that directory yesterday in auditing my config and it was 0o750 from the installer, so that path should be good.
So I haven't benchmarked, I guess that's the next step as we're concerned about the njs performance. I'm currently of the opinion that the security benefits are worth the potential performance issues, but the current app I'm protecting isn't performance sensitive. Since I rolled back the client session cookie rotation the internal store sets are now basically unchanged, and the njs_set is in the openid_connect_configuration.conf file which is config. I'm thinking it would be best to make the new pattern or the old pattern selectable with good documentation. If users prefer the security proposed here they can use the njs_set method. If they prefer the performance and accept the risks, they can use the cookie value directly. Implementation wise I'm thinking this would entail a couple more config maps to select the hash or the bare client session ID, but I haven't tested yet. At this point I see two paths forward on the client session ID handling/hashing proposal:
Let me know how you're leaning so I don't put effort into the wrong direction. Also if there is a config native way to do a one-way function let me know and I'll pivot to that, I'm working under the impression |
This PR hardens the implementation to use random values instead of the request_id for the OIDC Nonce and client session access token. The major changes are:
Rotating the client session token each refresh is the only real behavior change introduced. This will result in a Cookie set header being added to non-OIDC requests when the session token is refreshed. This change was made as a general security improvement, but mostly because it simplifies the implementation allowing for one keyval store update codepath for both new sessions and refreshes.
Other than the rotation this change should be a no-op behavior wise. I have made an effort to verify the major behaviors and it passes my testing.