Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] fetch public trades #9066

Open
wants to merge 265 commits into
base: develop
Choose a base branch
from

Conversation

TheJoeSchr
Copy link
Contributor

@TheJoeSchr TheJoeSchr commented Aug 18, 2023

Orderflow

Summary

This PR enables to use ccxt fetch_public_trades so freqtrade can use trades data for backtesting and trading

Quickstart:

  • use --dl-trades to fetch trades for timerange
  • enable using trades in config.json
"exchange": {
   ...
   "use_public_trades": true,
}
  • set orderflow processing configuration in config.json:
"orderflow": {
   "scale": 0.5,
   "stacked_imbalance_range": 3,
   "imbalance_volume": 1,
   "imbalance_ratio": 300
 },

TODO

  • [docs] write (new?) docs about how to access this new trades data in populate_indicators etc
    • dataframe['trades']
    • dataframe['orderflow']
    • dataframe['delta']
  • [docs] how to use --dl-trades to get trade data
  • [docs] which configuration options are available and how to use them
  • [tests] unit tests exist, but they are using too big for github test files. so I need to shrink them and rewrite tests to still make sense

from previous feedback by @xmatthias :

  • Some diffs are a bit strange at first glance (especially in the exchange class), so expect some questions once i look at it more carefully (might be the webUI doing the diffs oddly if 2 functions are next to each other and are similar enough though) ...

  • and i usually don't like random formatting changes (like - an unnecessary linebreak in a function that's not even touched otherwise) => fixed with commit 137ee07

  • i also assume we'll want to move some things around in the end (converter seems to become quite big now) - but i can eventually help with that.

What's new?

@TheJoeSchr TheJoeSchr marked this pull request as draft August 18, 2023 11:21
@TheJoeSchr TheJoeSchr force-pushed the feature/fetch-public-trades branch 2 times, most recently from a5660c6 to 137ee07 Compare August 18, 2023 11:39
@xmatthias xmatthias added the Enhancement Enhancements to the bot. Get lower priority than bugs by default. label Aug 18, 2023
@biiiipy
Copy link

biiiipy commented Aug 19, 2023

This is awesome! Orderflow has a lot of valuable information, can't wait to start experimenting with this!

@xmatthias xmatthias linked an issue Aug 19, 2023 that may be closed by this pull request
@dijvar

This comment was marked as off-topic.

@xmatthias
Copy link
Member

xmatthias commented Sep 25, 2023

@dijvar it's an open PR. Docs will be written before merging the PR (which the PR description also mentions).
Please don't pollute the PR discussion with pointless requests, thanks.

@TheJoeSchr you may want to look at the conflicts (don't just look at the conflicting lines though ...).
trades datahandling changed right around the time (one PR number before your's 😆 ) you submitted this PR (#9065) - so i suspect some of the code you've written is now already available - or types won't align etc.

I think the diff of #9065 will be helpful to show what changed .... i don't think much else changed ... but these changes WILL impact this PR for sure.

@TheJoeSchr
Copy link
Contributor Author

TheJoeSchr commented Sep 25, 2023

I think the diff of #9065 will be helpful to show what changed .... i don't think much else changed ... but these changes WILL impact this PR for sure.

thanks for the headsup @xmatthias . I think the overlap is minimal, because my "trades" are about L2 data aka orderflow trades and that PR is about trades the user/bot send to the exchange to execute.

but the naming overlap is unfortunate, I'm not sure how to handle it so there won't be any more confusion

@TheJoeSchr
Copy link
Contributor Author

Please, write a short docs to use it. Thank you very much for the work :)

Thanks for showing interest, this helps keeping my motivation up to finally write the docs. did you try the quickstart on top of this PR though? If yes, what steps didn't work?

Copy link
Member

@xmatthias xmatthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a few things that i don't like and make a review very difficult
For example:

  • changes that don't belong here (see explicit comment)
  • docstring (and signature) changes on methods that should not have been changed, causing pointless diffs (refresh_latest_ohlcv() - for example)
  • reintroduction of arrow to exchange (which has been removed in Add datetime helpers, reduce arrow usage to a minimum #8661, and was replaced with internal helpers directly for datetime)
  • duplicate methods in the same class (one will overwrite the other)

Due to the above, doing a proper review is currently near impossible - a the diff is unnecessarily large (especially in the exchange file).
Please do a review yourself (for example vscode's github pull requests extension can assist you with this by showing you the diff locally, allowing you to change what you don't like), comparing the differences - and identify what you actually INTENDED to change. That's something that's VERY difficult for me to identify.

Things you think should be changed but are not directly related to this pr (for example the max_calls point) - should either be removed, or extracted into individual PR's.
this becomes increasingly important with bigger PR's - small pr's combining 2-3 things - not really a problem, usually
but in a big feature like this - very problematic - as it draws focus apart - having high potential to introduce unwanted bugs due to these "non-directly" related changes.

freqtrade/data/dataprovider.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
tests/conftest.py Outdated Show resolved Hide resolved
@TheJoeSchr
Copy link
Contributor Author

@xmatthias do you have an idea how I could mock out get_datahandler in my tests? I need to do this so there are no trade cache files are generated/overwritten when the tests run.

I tried overwriting it directly with a mock function in my test, but that doesn't work:

def test_refresh_latest_trades(mocker, default_conf, caplog, candle_type) -> None:
    # ...
    from freqtrade.data.history import get_datahandler
    get_datahandler = mock_get_datahandler
    # ...
    res = exchange.refresh_latest_trades(pairs, cache=False)

because the module gets loaded directly in refresh_latest_trades

    def refresh_latest_trades(self,
                              pair_list: ListPairsWithTimeframes,
                              *,
                              cache: bool = True,
                              ) -> Dict[PairWithTimeframe, DataFrame]:
        
        from freqtrade.data.history import get_datahandler
        data_handler = get_datahandler(
            self._config['datadir'], data_format=self._config['dataformat_trades']
        )

Maybe you have encountered something like this before or have a good idea?

@xmatthias
Copy link
Member

Why is writing files bad? that's something that can be tested and asserted, too.

You just need to make sure it's not in the user's directory, but in a temporary space.

You can change user_data_dir to tmp_path quite easily (i was of the opinion that this is done globally, but apparently that's not the case) by assigning that in the tests (that's also done elsewhere ...).
For these particular tests - you might wanna update it in some "local" conftest file - so it applies globally to all tests where this applies.

@TheJoeSchr
Copy link
Contributor Author

@xmatthias don't worry about my previous message I think I figured it out with mocker.patch()

return stacked_imbalance(df, "ask", stacked_imbalance_range, should_reverse=True)


def orderflow_to_volume_profile(df: pd.DataFrame):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method seems unused ... intentional ?

can it be removed ?

trades_grouped_df["amount"],
)
)
sell = df.loc[is_between, "ask"].apply(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is .apply() actually necessary here?
.apply() (without additional argument) means it'll run once for each row ... but the row is not actually used, so i'm not entirely sure if it's really necessary to do this per row, or if the np.where() could work by itself - which would greatly speed up this operation.
I also can't really see a place where df.loc[is_between,...] can return more than 1 row, so that kinda reinforces me questioning if apply() is necessary for that reason either.


doing a quick test for this based on the test: - with a breakpoint on the sell line - testing for buy above:

np.all(buy[0] == np.where(
                        trades_grouped_df["side"].str.contains("buy"),
                        0,
                        trades_grouped_df["amount"],
                    ))

Now doing that change (use np.where() directly to assign buy/sell) also has implications below - as the "deltas per trade" loop only returns one value now - but with both changes, all tests do still pass, so i'd assume it's good, unless doing this as loop covers some odd scenario i currently can't imagine.

# calculate orderflow for each candle
df.loc[is_between, "orderflow"] = df.loc[is_between, "orderflow"].apply(
lambda _: trades_to_volumeprofile_with_total_delta_bid_ask(
pd.DataFrame(trades_grouped_df), scale=config_orderflow["scale"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would trades_grouped_df not be a dataframe?

:return: dataframe with bid and ask imbalance
"""
bid = df.bid
ask = df.ask.shift(-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure i understand why we need a shift(-1) here.
It looks like a bug from "birds view" - so if it's correct, i think it'll justify an explaining comment as to why we need it shifted.

Comment on lines +156 to +161
df.loc[is_between, "bid"] = np.where(
trades_grouped_df["side"].str.contains("buy"), 0, trades_grouped_df["amount"]
).sum()
df.loc[is_between, "ask"] = np.where(
trades_grouped_df["side"].str.contains("sell"), 0, trades_grouped_df["amount"]
).sum()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure this condition is correct?
wouldn't you want the "ask" side of the OB to be the same as "sell" (and bid the same as buy)?

np.where's syntax is np.where(condition, value_if_true, value_if_false) ... so the "asks" line would use "0" in case of a sell side trade?


except Exception as e:
logger.exception("Error populating dataframe with trades:", e)
raise e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this should raise a "freqtrade-ish" exception instead? otherwise we catch it here - but still have any random exception bubble "everywhere" ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's a "freqtrade-ish" exception for example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends what we want to accomplish i think
most likely, something around DependencyException (which doesn't completely crash the bot) should work.

ValueError can also be used - the point is though - it should be one error - not "all possible errors python has to offer" 😆

)

else:
raise OperationalException("no new ticks")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You sure "OperationalException" is the right one here?
That exception is assumed to stop the bot - so it completely restarts - though i'd not do that just because there's no ticks ... which could very well happen on random, low-volume pairs (?)

Comment on lines +2652 to +2653
logger.error(f"Refreshing TRADES data for {pair} failed")
logger.error(e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.error(f"Refreshing TRADES data for {pair} failed")
logger.error(e)
logger.exception(f"Refreshing TRADES data for {pair} failed")

)
results_df[(pair, timeframe, candle_type)] = trades_df
data_handler.trades_store(
f"{pair}", trades_df[DEFAULT_TRADES_COLUMNS], self.trading_mode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to re-visit what i've said before (where i actually just questioned why a "-cached" was appened - not "Actually" said to change it).
this should use a -cached in the filename to avoid overriding actual trade files.

These are slow to download (hence hard to come by) - and we shouldn't overwrite them with a trimmed, rolling version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Enhancements to the bot. Get lower priority than bugs by default.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Volume Profile and Cluster Search as freqtrade feature
10 participants