
Risk of violating the rate limit #31

Open
sio opened this issue Mar 26, 2019 · 8 comments

Comments

@sio

sio commented Mar 26, 2019

From what I've gathered, there is a risk of violating the rate limit when using this module.

As of now, RateLimitDecorator does not use a sliding time frame to track the number of API calls within any given period of the specified length. Instead, it tracks fixed back-to-back time frames of the correct length, whose placement is essentially arbitrary (it depends on the time of the first API call).

Here is how it can bite us.

Let's say we must not make more than 5 calls within a minute. We make a call, wait almost a minute, and then make 9 more API calls consecutively. There are only 5 calls within the first time frame and 5 within the second, so RateLimitDecorator allows this behavior, while the API provider will definitely consider it a violation.

          |  time frame #1  |  time frame #2  |
API hits: -x---------x-x-x-x-x-x-x-x-x---------
                    |   VIOLATION!!!  |

Here is code that demonstrates the violation described above. As in the picture, we have a limit of 5 calls per time frame (10 seconds for a faster demo), but we make 9 calls within the last two seconds. The API provider would not be happy with us :)

from ratelimit import limits
import time


period = 10
calls = 5
count = 0


@limits(calls=calls, period=period)
def demo():
    global count
    count += 1
    print(count, time.time())


demo()                      # 1 call at the start of frame #1
time.sleep(period - 1)
for x in range(calls - 1):  # 4 calls at the very end of frame #1
    demo()
time.sleep(1)
for x in range(calls):      # 5 calls at the start of frame #2
    demo()

To avoid this, the time of each API call needs to be tracked, and the decision-making logic adjusted accordingly.
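A minimal sketch of what such a sliding-window decorator could look like (this is a hypothetical illustration, not the `ratelimit` API; the decorator name and the plain `RuntimeError` are stand-ins):

```python
import time
from collections import deque
from functools import wraps


def sliding_limits(calls, period):
    """Hypothetical sliding-window variant of @limits: remembers the
    timestamp of every recent call instead of counting within fixed frames."""
    def decorator(func):
        history = deque()  # timestamps of calls inside the current window

        @wraps(func)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            # Discard calls that have fallen out of the window.
            while history and now - history[0] >= period:
                history.popleft()
            if len(history) >= calls:
                # Stand-in for raising RateLimitException.
                raise RuntimeError('rate limit exceeded')
            history.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

With this logic, any window of length `period` ending at the current call contains at most `calls` timestamps, so the back-to-back burst from the demo above would be rejected.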

For my personal use I've written an alternative class altogether, but I do not plan on uploading it to PyPI and fragmenting the user base. You have done a great job of maintaining this package for a long time and I think it should remain the go-to place for rate limiter logic. Feel free to reuse my code though.

@stuaxo

stuaxo commented Oct 30, 2019

@sio would it be worth submitting your sliding window work as a PR?

@sio
Author

sio commented Oct 31, 2019

@stuaxo, I made some design decisions differently in my implementation, so adding sliding-window logic to this project would require more work than a simple copy-paste.

If someone is willing to do it, I'm all in favor. My explicit approval is not necessary, though - I've already published the code under the Apache License, which allows it.

@abhi-jha

Did this functionality get merged in the release?

@sio
Author

sio commented Dec 20, 2019

The most recent commit to master is older than this thread. That's definitely a no :-)

@deckar01

deckar01 commented Mar 22, 2020

Another factor that can contribute to violating a rate limit is losing the state of RateLimitDecorator between application restarts. During development this happens to me often, and there is currently no way to safely resume without waiting the full time limit after the last request.

test.py
from datetime import datetime
from functools import wraps
from ratelimit import limits, sleep_and_retry, RateLimitException


def log_and_pass(func):
    @wraps(func)
    def wrapper(*args, **kargs):
        try:
            return func(*args, **kargs)
        except RateLimitException as exception:
            print('{}\tmiss\t0'.format(datetime.now()))
            print('{}\twait\t{}'.format(datetime.now(), exception.period_remaining))
            raise exception
    return wrapper

@sleep_and_retry
@log_and_pass
@limits(calls=1, period=1)
@limits(calls=2, period=10)
def test():
    print('{}\tcall\t0'.format(datetime.now()))

test()
test()
test()
python test.py > log1.txt && python test.py > log2.txt

[screenshot: log output from the two sequential runs]

Running parallel processes illustrates this a little more clearly:

python test.py > log1.txt & python test.py > log2.txt &

[screenshot: log output from the two parallel runs]

If the call log were implemented as a SQLite database, the operations would be relatively simple queries, and the state would not only be safe across restarts but also safe for parallel processes.

master...deckar01:31-db-log

test.py
from datetime import datetime
from functools import wraps
from ratelimit import limits, sleep_and_retry, RateLimitException


def log_and_pass(func):
    @wraps(func)
    def wrapper(*args, **kargs):
        try:
            return func(*args, **kargs)
        except RateLimitException as exception:
            print('{}\tmiss\t0'.format(datetime.now()))
            print('{}\twait\t{}'.format(datetime.now(), exception.period_remaining))
            raise exception
    return wrapper

@sleep_and_retry
@log_and_pass
@limits(calls=1, period=1, storage='test.db', name='limit_1')
@limits(calls=2, period=10, storage='test.db', name='limit_10')
def test():
    print('{}\tcall\t0'.format(datetime.now()))

test()
test()
test()
python test.py > log1.txt & python test.py > log2.txt &

[screenshot: log output from two parallel runs sharing the SQLite call log]

@stuaxo

stuaxo commented Mar 22, 2020

I wish I had time to work on fixing this, but I know the tickets at work mean it won't happen.

Having a callback to store state in the db (or ideally in Redis) would sort this.

@deckar01

No worries. I went ahead and published my fork to make my proposal easier to test drive. If there is anything I can do to help resolve this issue, just let me know.

https://pypi.org/project/deckar01-ratelimit/

@lmmsoft

lmmsoft commented Mar 21, 2024

+1, a sliding window is the right solution.

SQLite is not well suited for multi-processing; using Redis ZADD and ZREMRANGEBYSCORE would work better.
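The sorted-set algorithm behind those two commands can be sketched in plain Python (so it runs without a Redis server); with redis-py, the two marked steps would map to `r.zremrangebyscore(key, 0, now - period)` and `r.zadd(key, {member: now})`. The class and method names are hypothetical:

```python
import time


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis sorted-set sliding-window limiter.
    A plain list of timestamps plays the role of the sorted set."""

    def __init__(self, calls, period):
        self.calls = calls
        self.period = period
        self.log = []  # call timestamps (the "sorted set" scores)

    def try_acquire(self, now=None):
        now = time.time() if now is None else now
        cutoff = now - self.period
        # ZREMRANGEBYSCORE: drop entries older than the window.
        self.log = [t for t in self.log if t > cutoff]
        if len(self.log) >= self.calls:
            return False
        # ZADD: record this call.
        self.log.append(now)
        return True
```

In Redis the expire-check-add sequence would need to run atomically (e.g. inside a MULTI/EXEC transaction or a Lua script) to stay safe across processes.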
