Generating technical indicators for intervals and periods #7

Open
jcfbeardsley opened this issue Feb 16, 2021 · 4 comments

Comments

@jcfbeardsley

Hi,

Once the master BTC_Data.csv file has been created, it needs to be broken down into the respective indicator files for the different intervals (1, 2, 3) and periods (1, 7, 30, 90 days, etc.). There seems to be a loose framework for the interval file generation in the Feature_Selection notebooks, but I just want to confirm the methodology before proceeding.

Do you already have this code in a loop that will generate each file automatically, or do the notebooks require manual editing for each iteration? If the latter, can you please clarify which lines need to be updated in Feature_Collection_reg.ipynb and Feature_Collection_cls.ipynb to generate all the different combinations of technical indicators on each run?

Thanks,
J.

@heliphix
Owner

heliphix commented Feb 16, 2021

Hi,
The features are selected manually by looking at correlation, feature importance, train/test scores, performance metrics, and so on. It is an iterative process. You may try some of the features reported in the manuscript, but other sets of features may yield good performance as well. You can select the features in lines 70-72 of Feature_Collection_reg.ipynb and lines 65-67 of Feature_Collection_cls.ipynb.
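
As a rough illustration of the correlation part of that manual process, the sketch below ranks candidate columns by their absolute Pearson correlation with the target. It is not the notebook's code: the helper `rank_by_correlation`, the synthetic data, and the column names `sma_like` and `noise` are all hypothetical, and only `priceUSD` comes from this thread.

```python
import numpy as np
import pandas as pd


def rank_by_correlation(df: pd.DataFrame, target: str) -> pd.Series:
    """Absolute Pearson correlation of each column with the target, highest first."""
    corr = df.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)


# Synthetic stand-in data; every column except priceUSD is hypothetical.
rng = np.random.default_rng(0)
price = rng.normal(size=200).cumsum()
demo = pd.DataFrame({
    "priceUSD": price,
    "sma_like": price + rng.normal(scale=0.1, size=200),  # closely tracks the price
    "noise": rng.normal(size=200),                        # generated independently of the price
})

ranking = rank_by_correlation(demo, "priceUSD")
top_features = ranking.head(1).index.tolist()  # keep the single best column
print(ranking)
```

Correlation is only one of the signals mentioned above, so a ranking like this would be a starting point for the iteration rather than a final feature set.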

Best,
Mohammed Mudassir

@jcfbeardsley
Author

jcfbeardsley commented Feb 25, 2021

Thanks for clarifying, Mohammed,

Is it correct to assume that the only lines that need to be updated to switch between the different intervals and prediction timeframes are the following:

Feature_Selection_reg
Feature Selection for Interval 1:

df=data.loc[interval1]

#%%

#df['priceUSD']=one.loc[interval3]

This changes to the following for feature selection for Interval 3 at the 7-day prediction:

df=data.loc[interval3]

#%%

df['priceUSD']=seven.loc[interval3]

and swapping:

X_high.to_csv('reg_interval1.csv',sep=',',index=False)

for:

X_high.to_csv('reg_seven.csv',sep=',',index=False)

I'm first looking to reproduce your 7-day price forecast using the different models (hence the interval3/seven combination), but if it's easier to supply a copy of the notebook used to generate the reg_seven.csv files in the paper, I'm happy to work through the differences myself.

Thanks again for all your help.

@heliphix
Owner

heliphix commented Feb 26, 2021

Hi,
The lines of code and the procedure you have highlighted for choosing the interval and forecast period are correct. I have to search for that reg_seven.csv file. Nevertheless, you should be able to generate a slightly different set of features and still get model performance comparable to what I reported in the manuscript.

Edit:
There is a reg_seven.csv file in commit b80f8913e0, as you reported in issue #9. Please try running the code with Python 3.6. Slight discrepancies in the selected features should not massively reduce the performance.
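
For anyone who would rather not edit the notebook by hand for each run, the two manual edits confirmed above (choosing the interval, then swapping in the forecast target) could be wrapped in a loop. The following is a minimal sketch under stated assumptions, not code from the repository: it assumes each interval is a date slice usable with `.loc`, that targets such as `seven` are `priceUSD` shifted forward by the forecast horizon, and it uses synthetic data, a hypothetical `indicator_a` column, a hypothetical `build_frame` helper, and a hypothetical output naming scheme. The notebook's feature-selection step that produces `X_high` is not reproduced.

```python
import itertools
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Synthetic stand-ins for the notebook's objects (all values are made up).
dates = pd.date_range("2020-01-01", periods=120, freq="D")
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {"priceUSD": rng.normal(size=120).cumsum() + 100,
     "indicator_a": rng.normal(size=120)},  # hypothetical indicator column
    index=dates,
)

# Assumption: each interval is a date slice that can be passed to .loc.
intervals = {
    "interval1": slice("2020-01-01", "2020-02-09"),
    "interval3": slice("2020-01-01", "2020-04-29"),
}
# Assumption: each target is priceUSD shifted forward by the horizon in days.
targets = {
    "one": data["priceUSD"].shift(-1),
    "seven": data["priceUSD"].shift(-7),
}


def build_frame(interval_key: str, target_key: str) -> pd.DataFrame:
    """Apply the two manual edits: pick the interval, then swap in the target."""
    window = intervals[interval_key]
    df = data.loc[window].copy()
    df["priceUSD"] = targets[target_key].loc[window]
    return df.dropna()  # rows without a future price cannot be used


out_dir = Path(tempfile.mkdtemp())
written = []
for interval_key, target_key in itertools.product(intervals, targets):
    df = build_frame(interval_key, target_key)
    # The notebook's feature selection would run here and yield X_high;
    # this sketch writes the unfiltered frame instead.
    path = out_dir / f"reg_{interval_key}_{target_key}.csv"  # hypothetical naming
    df.to_csv(path, sep=",", index=False)
    written.append(path.name)
```

Looping over every interval and target pair is a choice made for the sketch; restricting the loop to specific pairs such as interval3 with seven would mirror the manual procedure more closely.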

@heliphix heliphix reopened this Feb 26, 2021
@Leci37

Leci37 commented Jan 25, 2023

I want to offer a new point of view, and my collaboration.

Why this stock prediction project?
Things this project offers that I did not find in other free projects are:

Testing with roughly 30 models: multiple combinations of features and multiple selections of models (TensorFlow, XGBoost and Sklearn)
Threshold and model-quality evaluation
Uses 1k technical indicators
Method for selecting the best features (technical indicators)
Simple and dynamic categorical target (buy, sell, or do nothing) instead of a continuous target variable
Powerful real-time, open-market evaluation system
Versatile integration with Twitter, Telegram and Mail
Trains the machine learning model with fresh, same-day stock data
https://github.com/Leci37/stocks-prediction-Machine-learning-RealTime-telegram/tree/develop
