Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@cached_property on blueprint.session.token causing session leak? #430

Open
kgutwin opened this issue Jan 23, 2024 · 2 comments
Open

@cached_property on blueprint.session.token causing session leak? #430

kgutwin opened this issue Jan 23, 2024 · 2 comments

Comments

@kgutwin
Copy link

kgutwin commented Jan 23, 2024

I noticed a concerning potential race condition today while using flask-dance via the Flask local dev server. Here's the setup:

  • I have a script polling my app's "get session" endpoint, using a session that is not logged in. This endpoint invokes flask_dance.contrib.github.authorized in order to check if the session should be pulled from GitHub.
  • While that script is running, I log in to my app via GitHub on a web browser.
  • 1 in 3 times, the web browser login session will be duplicated to the script's session. This is bad because the script has done nothing to activate the session.

I tried to trace the code paths within flask-dance and I think that the use of @cached_property on either the blueprint's session.token or on the blueprint's session itself is part of the problem. Even though flask_dance.contrib.github is a LocalProxy to g.flask_dance_github (which is equivalent to github_bp.session) that isn't sufficient for separation between threads -- @cached_property's cache is associated with the blueprint, which is global across all threads/requests.

Basically, what I think is happening in my case is:

  • The valid login process begins to handle the authorized() blueprint endpoint
  • The OAuth access token is saved to github_bp.session.token
  • Before the endpoint handler returns, my script calls my "get session" endpoint
    • A new thread begins to handle that parallel request, which calls github.authorized
    • This checks for github_bp.session.token, which is cached by @cached_property
    • It returns successfully, allowing a subsequent call to github.get("/user") to fetch the full session info
    • The script then gets the transferred session
  • Finally, the original authorized() blueprint endpoint concludes, and the session is set as expected to the original request

I've done a little testing of my app that is deployed in a test environment using gunicorn (which separates requests into processes). I haven't been able to reproduce the problem there yet, but because it's stochastic, I haven't fully ruled it out yet. Regardless, though, this seems like a concern as login sessions should never leak across threads.

Lastly, I tried a quick patch to take out the use of @cached_property for session and session.token, and it seemed to fix the issue -- I was unable to trigger the bad behavior even when the logs showed the possibility of a race condition.

Should we consider removing @cached_property or replace it with something that uses a thread-local cache?

@singingwolfboy
Copy link
Owner

That's some impressive detective work! This seems like a real problem. I have two questions:

  1. Do you know of an equivalent to @cached_property that uses a thread-local cache?
  2. Do you know of any kind of automated testing framework that we could use to reliably test for this bug? That way, we can verify that it's fixed, now and in the future.

@kgutwin
Copy link
Author

kgutwin commented Jan 29, 2024

Thanks for looking into it! I spent some time this morning trying to research answers to your questions:

  1. I wasn't able to find any good equivalent, especially not one that uses flask.g as you would probably want to use. The closest I found was cachetools which offers a customizable @cached() decorator, although they don't seem to have a good @cached_property replacement. It may be the most straightforward and transparent to just refactor the code using @property and to store the result in flask.g.
  2. Would it be appropriate to consider this bug tested if we can use the existing test framework to reproduce it? My thought is that I could probably write a test case that, in a deterministic manner, reproduces the issue. It would try to reproduce the case that I observed - pausing in the middle of processing the authorized() endpoint and making a secondary client call to an endpoint that calls the blueprint .authorized property. I can submit a PR for that test case if you would like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants