Add two transformer models via upload #508
Conversation
Add a naive transformer model and an improved transformer model.
@yingtaoluo It looks great! Thanks so much! Please check the errors in the CI. Would you mind adding more docs about your model and including your PyTorch version in requirements.txt like the other models? Thanks.
Have passed black.
I have cleared these errors with Black, and have added the YAML files and requirements.txt. I have also expanded the docs about the models. Please contact me at any time if there is other work that needs to be done. :}
@yingtaoluo I'm testing them with the following commands (you can try them as well).
update `run_all_model` and black format
I have reviewed the code and haven't found any errors.
Thank you! I have merged.
@yingtaoluo If it is OK, please add your paper references and descriptions to the benchmark README after you publish it. Thanks.
The above numbers are the results of the following commands after removing the fixed seed in your YAML files:
python run_all_model.py 20 localformer Alpha158 --qlib_uri "~/repos/libs/qlib/" --wait_when_err True
python run_all_model.py 20 localformer Alpha360 --qlib_uri "~/repos/libs/qlib/" --wait_when_err True
python run_all_model.py 20 transformer Alpha158 --qlib_uri "~/repos/libs/qlib/" --wait_when_err True
python run_all_model.py 20 transformer Alpha360 --qlib_uri "~/repos/libs/qlib/" --wait_when_err True
Add the performance of transformer and localformer.
Add transformer and localformer (SLGT) models for financial time series prediction to the Quant Model Zoo.
I have added the results to the two links. I will add the paper reference after publication. Thank you again.
@yingtaoluo Please merge the main branch to fix the CI error. Thanks.
OK. I'll merge this branch first and then solve the CI problem on the main branch. @yingtaoluo It's really great work! Thanks so much!
Thank you for patiently guiding me through every step! |

Add a naive transformer model and an improved transformer model.
Description
Tested successfully with Python 3.6/3.7/3.8 and PyTorch 1.12/1.2.
The naive transformer implemented here for financial time series prediction follows the paper "Attention Is All You Need": given an input of shape (N, T, F), where N is the batch size, T is the sequence length, and F is the number of features, the model encodes the sequence with stacked self-attention encoder layers and predicts from the resulting representation.
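For illustration, here is a minimal sketch of such a model. This is not the code in this PR; `d_feat`, `d_model`, and the layer counts are placeholder assumptions.

```python
import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Standard sinusoidal positional encoding from 'Attention Is All You Need'."""

    def __init__(self, d_model: int, max_len: int = 1000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, 1, d_model)
        pe[:, 0, 0::2] = torch.sin(position * div_term)
        pe[:, 0, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x):          # x: (T, N, d_model)
        return x + self.pe[: x.size(0)]


class NaiveTransformer(nn.Module):
    """Sketch of a vanilla transformer encoder for (N, T, F) financial series."""

    def __init__(self, d_feat: int = 6, d_model: int = 64, nhead: int = 4, num_layers: int = 2):
        super().__init__()
        self.input_proj = nn.Linear(d_feat, d_model)   # project F features to d_model
        self.pos_enc = PositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)              # one score per sample

    def forward(self, x):                              # x: (N, T, F)
        x = self.input_proj(x).transpose(0, 1)         # (T, N, d_model); seq-first layout
        x = self.encoder(self.pos_enc(x))              # (T, N, d_model)
        return self.head(x[-1]).squeeze(-1)            # last time step -> (N,)
```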
The improved transformer, Localformer, is a simple self-designed transformer (based on the paper 'SLGT: Self-adaptive Local-global aware Transformer for Sequential Recommendation', which has been submitted to a conference and will be available on arXiv soon). Localformer adds 1-dimensional convolutional layers alongside each encoder layer as a locality inductive bias to supplement the long-range self-attention module, updating the sequence representation locally at each time step. Specifically, the representation fed into each encoder (self-attention) layer is the original input plus (+) the output of passing that input through an extra 1-d convolutional layer. For example, if the encoder originally contains three self-attention layers attn-attn-attn, it becomes conv-attn-conv-attn-conv-attn. After the transformer module, a GRU further aggregates the representation with a sequential inductive bias (provided by the RNN layers); see the sketch below.
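A hedged sketch of that conv-attn interleaving plus the trailing GRU. This is illustrative only, not the merged implementation: the dimensions are assumptions, and the input is assumed to be already projected to `d_model`.

```python
import torch
import torch.nn as nn


class LocalformerSketch(nn.Module):
    """conv-attn-...-conv-attn stack followed by a GRU, per the description above."""

    def __init__(self, d_model: int = 64, nhead: int = 4, num_layers: int = 3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(d_model, d_model, kernel_size=3, padding=1) for _ in range(num_layers)
        )
        self.attns = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead) for _ in range(num_layers)
        )
        self.gru = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                                       # x: (N, T, d_model)
        for conv, attn in zip(self.convs, self.attns):
            # Locality bias: add a 1-d convolution over the time axis to the input,
            # then feed the sum through the self-attention encoder layer.
            local = conv(x.transpose(1, 2)).transpose(1, 2)     # (N, T, d_model)
            x = attn((x + local).transpose(0, 1)).transpose(0, 1)  # attn expects (T, N, E)
        out, _ = self.gru(x)                                    # sequential aggregation
        return self.head(out[:, -1]).squeeze(-1)                # (N,)
```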
Motivation and Context
It adds two well-known transformer models for users to choose from, alongside the base models that Qlib already contains. The model performance reaches a 1.47 information ratio, which is fairly high. The improved transformer adds convolution and an RNN to supplement inductive bias, which is simple but effective.
How Has This Been Tested?
Run qrun benchmarks/Transformer/workflow_config_localformer_Alpha158.yaml under the upper directory of qlib, where 'workflow_config_localformer_Alpha158.yaml' only needs one line changed: 'task: model: class: LocalformerModel' or 'task: model: class: TransformerModel'.
Screenshots of Test Results (if appropriate):
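For concreteness, the relevant YAML fragment would look roughly like the following; the module_path value is my assumption, so verify it against the merged files.

```yaml
task:
  model:
    class: LocalformerModel   # or: TransformerModel
    module_path: qlib.contrib.model.pytorch_localformer   # assumed path; check the merged files
```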
Transformer Results on Alpha158:
{'IC': 0.03186587768611013,
'ICIR': 0.2556910881045764,
'Rank IC': 0.04735251936658551,
'Rank ICIR': 0.388378955424602}
'The following are analysis results of the excess return without cost.'
risk
mean 0.000309
std 0.004209
annualized_return 0.077839
information_ratio 1.164993
max_drawdown -0.106215
'The following are analysis results of the excess return with cost.'
risk
mean 0.000126
std 0.004209
annualized_return 0.031707
information_ratio 0.474567
max_drawdown -0.131948
Transformer Results on Alpha360:
{'IC': 0.011659216755690713,
'ICIR': 0.07383408561758713,
'Rank IC': 0.03505118059955821,
'Rank ICIR': 0.2453042675836217}
'The following are analysis results of the excess return without cost.'
risk
mean 0.000026
std 0.005318
annualized_return 0.006658
information_ratio 0.078865
max_drawdown -0.104203
Localformer Results on Alpha158:
{'IC': 0.037426503365732174,
'ICIR': 0.28977883455541603,
'Rank IC': 0.04659889541774283,
'Rank ICIR': 0.373569340092482}
'The following are analysis results of the excess return without cost.'
risk
mean 0.000381
std 0.004109
annualized_return 0.096066
information_ratio 1.472729
max_drawdown -0.094917
'The following are analysis results of the excess return with cost.'
risk
mean 0.000213
std 0.004111
annualized_return 0.053630
information_ratio 0.821711
max_drawdown -0.113694
Localformer Results on Alpha360:
{'IC': 0.03766845905185995,
'ICIR': 0.26793394150788935,
'Rank IC': 0.0530091645633088,
'Rank ICIR': 0.40090294387953357}
'The following are analysis results of the excess return without cost.'
risk
mean 0.000131
std 0.004943
annualized_return 0.033129
information_ratio 0.422228
max_drawdown -0.127502
Types of changes