Don't get Ticker.history dataframe at time of market open. #210

Open
quikksilver1 opened this issue Jul 20, 2023 · 7 comments

Comments

@quikksilver1

image

I run my code between 8:30 and 9:00 am CDT.
Using Windows 11
yahooquery==2.3.2
I switched to 2.3.2 after getting an "Invalid Cookie" error
Python 3.11.3

Expected behavior
Here is an example of when I run the code after I get home.
image

image


@quikksilver1
Author

Looking through my logs, I see that the premarket change percent and market change percent are coming through but are blocked after a certain number of tries, most likely because of rate limiting. However, the volume issue persists on every call.

@quikksilver1
Author

Is seriously nobody else having this problem? Nothing in my code has changed except the yahooquery version. I have now also tried rotating proxies and still have the same issue. I either get an empty DataFrame or one row that has a "0" for the volume value.

Ex: 2023-07-25 08:36:07.967493 (currently in Central time zone)
Whole Dataframe
                                  open  high   low  close  volume
symbol date
IMMX   2023-07-25 09:30:01-04:00  2.04  2.04  2.04   2.04       0

Ex: 2023-07-25 08:36:10.354911
Whole Dataframe Empty DataFrame
Columns: [high, low, volume, open, close]
Index: []
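
For reference, a small sketch of how the local timestamp in the first example relates to the market open. It assumes the regular US session opens at 09:30 Eastern and uses fixed July offsets (CDT = UTC-5, EDT = UTC-4) rather than a time zone database. `minutes_since_open` is a hypothetical helper, not part of yahooquery.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets valid during daylight saving time (an assumption for this sketch).
CDT = timezone(timedelta(hours=-5), "CDT")
EDT = timezone(timedelta(hours=-4), "EDT")


def minutes_since_open(local_time: datetime) -> float:
    """Return minutes elapsed since the assumed 09:30 Eastern open on the same day."""
    eastern = local_time.astimezone(EDT)
    market_open = eastern.replace(hour=9, minute=30, second=0, microsecond=0)
    return (eastern - market_open).total_seconds() / 60


# Timestamp from the first example, logged in Central time.
logged = datetime(2023, 7, 25, 8, 36, 7, 967493, tzinfo=CDT)
elapsed = minutes_since_open(logged)
print(f"{elapsed:.1f} minutes after the open")
```

Under these assumptions the call was made about six minutes into the session, so only a handful of 1-minute bars could exist at that point, but an empty frame would still be unexpected.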

This is how I call it:

now_adj is today's date

stock = Ticker(stock_1, retry=40, status_forcelist=[404, 429, 500, 502, 503, 504],
user_agent=user_agent,
proxies=proxy, timeout=0.3)
stock_history = stock.history(start=now_adj, interval="1m", period="1d")

I find it difficult to believe that nobody is having this issue.
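
A minimal sketch of one way to rotate proxies and user agents between attempts, included only to illustrate the setup described above. `rotating_configs` is a hypothetical helper, the addresses are placeholders, and the `{"https": url}` mapping follows the requests convention; whether yahooquery forwards it unchanged is an assumption.

```python
import itertools
import random


def rotating_configs(proxy_urls, user_agents, seed=None):
    """Yield (proxies, user_agent) pairs forever, cycling through shuffled proxies."""
    rng = random.Random(seed)
    urls = list(proxy_urls)
    rng.shuffle(urls)  # randomize the order once, then cycle through it
    for url in itertools.cycle(urls):
        # requests-style mapping: scheme -> proxy URL
        yield {"https": url}, rng.choice(user_agents)


# Placeholder values (documentation-range addresses), not working proxies.
configs = rotating_configs(
    ["http://192.0.2.10:3128", "http://192.0.2.11:8080"],
    ["example-agent/1.0", "example-agent/2.0"],
    seed=0,
)
first_three = [next(configs) for _ in range(3)]
```

Each pair could then be passed along as `Ticker(stock_1, proxies=proxies, user_agent=user_agent)`, using the same keyword arguments as the call above.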

@maread99
Contributor

Hi @quikksilver1, I'm struggling to understand this issue. With respect to Ticker.history, could you offer a minimal example, from imports through to the return?

Ideally enclose the code part in the following way:
```python
from yahooquery import Ticker
# your example here
```
So that it renders as...

from yahooquery import Ticker
# your example here

@quikksilver1
Author

Certainly. The issue occurs when I call the Ticker.history function between 8:30 am and 9:00 am Central time. I either get an empty DataFrame or a DataFrame that contains only one row. "stock_1" is a list of stocks that I iterate through. The list changes daily.

Here is how I call it:
```python
import random
import re
from datetime import date, datetime, timedelta

import numpy as np
import pandas as pd
from pytz import timezone
from yahooquery import Ticker


class yahoo_data:

    def __init__(self, stock_1):
        d = datetime.now()
        self.date = d
        now = datetime.now(timezone('US/Eastern'))
        now = str(now)
        now = now.split(".")[0]
        now_2 = now.split(".")[0]
        now = now[0:(len(now) - 8)]
        today = date.today()
        date_list = []
        num_weekdays = 0
        # collect the most recent weekdays, newest first
        for i in range(10):
            d = today - timedelta(days=i)

            if d.weekday() < 5:
                date_list.append(d)
                date_array = np.array(date_list)
                num_weekdays += 1

            if num_weekdays >= 7:
                break
        date_list = str(date_list)
        num_dick = re.findall(r"\d+", date_list)

        new_list_d = []
        # converts findall items to proper ints
        for item in num_dick:
            new_list_d.append(int(item))
        num_dick = np.array(new_list_d)

        # the first three numbers are the year, month and day of the newest weekday
        now_year = num_dick[0]
        now_month = int(num_dick[1])
        now_day = int(num_dick[2])

        # zero-pad month and day
        if now_month <= 9:
            now_month = "0" + str(now_month) + "-"
        else:
            now_month = str(now_month) + "-"

        if now_day <= 9:
            now_day = "0" + str(now_day)
        else:
            now_day = str(now_day)

        now_year = str(now_year) + "-"
        now_adj = now_year + now_month + now_day
        print("Date used by YahooQuery: {}".format(now_adj))

        user_agent_list = [
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15',
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0',
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36',
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36',
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
            'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15'
        ]

        https_proxy = [{1: 'https://167.172.248.53:3128'}, {1: 'https://136.226.2.85:80'},
                       {1: 'https://104.129.207.20:10605'}, {1: 'https://65.108.87.218:80'},
                       {1: 'https://78.47.219.204:3128'}, {1: 'https://176.31.129.223:8080'}]

        random.shuffle(https_proxy)

        def get_next_random_proxy():
            global selected_proxies
            if 'selected_proxies' not in globals():
                selected_proxies = set()
            for proxy_dict in https_proxy:
                proxy_key = list(proxy_dict.keys())[0]
                if proxy_key not in selected_proxies:
                    selected_proxies.add(proxy_key)
                    return proxy_dict[proxy_key]
            # If all proxies have been used, reset the set and shuffle the list again
            selected_proxies = set()
            random.shuffle(https_proxy)
            return get_next_random_proxy()

        for i in range(len(https_proxy)):
            try:
                user_agent = random.choice(user_agent_list)

                proxy = get_next_random_proxy()
                stock = Ticker(stock_1, retry=40, status_forcelist=[404, 429, 500, 502, 503, 504],
                               user_agent=user_agent,
                               proxies=proxy, timeout=0.3)
                break
            except Exception as e:
                print("Trying different Configuration")
                continue

        self.stock = stock
        self.details = stock.summary_detail
        self.prices = stock.price

        try:
            stock_history = stock.history(start=now_adj, interval="1m", period="1d")
            # print stands in for my own log_print helper
            print("Whole Dataframe from yahoo query {}".format(stock_history))

            pd.set_option('display.max_columns', None)
            pd.set_option('display.width', None)
            pd.set_option('display.max_colwidth', None)
            pd.set_option('display.max_rows', 9999)
            self.stock_history = stock_history
            self.stock_1 = stock_1

        except Exception as e:
            print("Error getting yahoo data{}".format(e))


# stock_1 is the list of symbols described above
value = yahoo_data(stock_1)
```
The output in 99% of cases:
Date used by YahooQuery: 2023-07-27
Whole Dataframe from yahoo query Empty DataFrame
Columns: [high, low, volume, open, close]
Index: []

Or I get 1 row that has a 0 for the volume (showing that it was the last row in the DataFrame):
Date used by YahooQuery: 2023-07-27
Whole Dataframe from yahoo query
                                    open    high     low   close  volume
symbol date
KULR   2023-07-27 09:30:50-04:00  0.9619  0.9619  0.9619  0.9619       0

If I run my code later in the day, I get the correct DataFrame, but I need to run my code in the first 30 minutes after the market opens. Thank you for trying to help.
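
As an aside, the date-building part of the snippet above (most recent weekday, formatted as `YYYY-MM-DD`) can be sketched more compactly with the standard library. `most_recent_weekday` is a hypothetical name, and like the original it does not account for market holidays.

```python
from datetime import date, timedelta


def most_recent_weekday(day: date) -> str:
    """Return `day`, or the preceding Friday if it falls on a weekend, as YYYY-MM-DD."""
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day.isoformat()


thursday = most_recent_weekday(date(2023, 7, 27))  # a Thursday, returned unchanged
sunday = most_recent_weekday(date(2023, 7, 30))    # a Sunday, rolled back to Friday
print(thursday, sunday)
```

Calling `most_recent_weekday(date.today())` would give a value that can be passed as `start` in place of `now_adj`.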

@quikksilver1 quikksilver1 changed the title from "I dont get premarket change data, market change data, volume at time of market open." to "Don't get Ticker.history dataframe at time of market open." on Jul 27, 2023
@maread99
Contributor

maread99 commented Jul 27, 2023

dude, that's not minimal

Could you cut your example down to the bare minimum of code under which the issue reproduces?
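
For illustration, a minimal reproduction might look something like the sketch below. It is not code posted in this thread: `fetch_intraday` is a hypothetical wrapper, and it drops the proxies, user agents, retries, and timeout so that only `Ticker.history` with a `start` date and a 1-minute interval remains.

```python
def fetch_intraday(symbol, start, ticker_cls=None):
    """Request 1-minute bars for `symbol` from `start`, using default Ticker settings."""
    if ticker_cls is None:
        # Imported lazily so the wrapper can be exercised with a stand-in class.
        from yahooquery import Ticker as ticker_cls
    return ticker_cls(symbol).history(start=start, interval="1m")
```

Running `print(fetch_intraday("KULR", "2023-07-27"))` shortly after the open, with the date adjusted to the current session, would show whether the empty or single-row frame still appears without the extra configuration.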

@quikksilver1
Author

I couldn't fix it, so I ended up getting the data through yfinance.

@DevilbissLab

One hack I used is to pass "daily" to _get_daily_index and override with keep_live_indice = False.
I am sure there is a more elegant way to do this, but I've run out of time.

def _get_daily_index(data, index_utc, adj_timezone, daily=False):

line 1482:
    if daily:
        # Override if daily
        keep_live_indice = False

...just a thought
