
xCoFormer

1. Quick Start

# clone the project 
git clone git@github.com:celsofranssa/xCoFormer.git

# change directory to project folder
cd xCoFormer/

# Create a new virtual environment by choosing a Python interpreter 
# and making a ./venv directory to hold it:
virtualenv -p python3 ./venv

# activate the virtual environment using a shell-specific command:
source ./venv/bin/activate

# install dependencies
pip install -r requirements.txt

# set the Python path
export PYTHONPATH=$PYTHONPATH:<path-to-project-dir>/xCoFormer/

# (if needed) exit the virtualenv later:
deactivate
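
A quick sanity check of the environment can save time later. The snippet below is only an illustration (check_env.py is not part of the repository) and assumes that requirements.txt installs PyTorch, which the training log in section 3 suggests:

# check_env.py (illustrative sanity check, not part of the repository)
import sys
import torch

# the project directory should be on sys.path after exporting PYTHONPATH
print("path contains xCoFormer:", any("xCoFormer" in p for p in sys.path))

# the test run in section 3 trains on a CUDA device
print("CUDA available:", torch.cuda.is_available())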

2. Datasets

Download the datasets from Kaggle:

kaggle datasets download celsofranssa/xcoformer-datasets -p resource/dataset/ --unzip

After downloading, the datasets should be placed inside the resources/datasets/ folder as shown below:

xCoFormer/
|-- resources
|   |-- datasets
|   |   |-- java_v01
|   |   |   |-- test.jsonl
|   |   |   |-- train.jsonl
|   |   |   `-- val.jsonl
|   |   `-- python_v01
|   |       |-- test.jsonl
|   |       |-- train.jsonl
|   |       `-- val.jsonl
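
To confirm the splits were unpacked correctly, the records in each file can be counted with a short script. The sketch below assumes the layout shown above and that each split is a JSON Lines file with one example per line; it is only an illustration, not part of the repository:

# check_datasets.py (illustrative, not part of the repository)
from pathlib import Path

root = Path("resources/datasets")
for dataset in ("java_v01", "python_v01"):
    for split in ("train", "val", "test"):
        path = root / dataset / f"{split}.jsonl"
        n = sum(1 for line in path.open() if line.strip())
        print(f"{path}: {n} examples")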

3. Test Run

The following bash command fits the BERT encoder over the Java dataset using batch_size=128 and a single epoch.

python main.py tasks=[fit] model=BERT data=JAVA data.batch_size=128 trainer.max_epochs=1
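
The arguments appear to follow Hydra-style key=value overrides, so other runs can be composed by changing only the values. The variant below is purely illustrative and adjusts just the numeric settings already present in the command above:

python main.py tasks=[fit] model=BERT data=JAVA data.batch_size=64 trainer.max_epochs=3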

If all goes well, the original command above should produce output similar to the following:

GPU available: True, used: True
[2020-12-31 13:44:42,967][lightning][INFO] - GPU available: True, used: True
TPU available: None, using: 0 TPU cores
[2020-12-31 13:44:42,967][lightning][INFO] - TPU available: None, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2020-12-31 13:44:42,967][lightning][INFO] - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name         | Type                          | Params
------------------------------------------------------------
0 | desc_encoder | BERTEncoder                   | 109 M
1 | code_encoder | BERTEncoder                   | 109 M
2 | loss_fn      | NPairLoss                     | 0     
3 | mrr          | MRRMetric                     | 0     
------------------------------------------------------------
91.0 M    Trainable params


Epoch 0: 100%|███████████████████████████████████████████████████████| 5199/5199 [13:06<00:00,  6.61it/s, loss=5.57, v_num=1, val_mrr=0.041, val_loss=5.54]
Testing: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 288/288 [00:17<00:00, 16.83it/s]
--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'m_test_mrr': tensor(0.0410),
 'm_val_mrr': tensor(0.0410),
 'test_mrr': tensor(0.0410),
 'val_loss': tensor(5.5390, device='cuda:0'),
 'val_mrr': tensor(0.0410)}
--------------------------------------------------------------------------------
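
The headline number here is the mean reciprocal rank (MRR), reported as val_mrr and test_mrr. The repository's MRRMetric is not reproduced below; this is only a minimal sketch of how MRR is typically computed for code search, assuming a similarity matrix between description and code embeddings in which the matching pair lies on the diagonal:

# mrr_sketch.py (illustrative only; not the repository's MRRMetric)
import torch

def mean_reciprocal_rank(similarity: torch.Tensor) -> torch.Tensor:
    # similarity[i, j] = score between description i and code snippet j;
    # the correct snippet for description i is assumed to sit at column i
    true_scores = similarity.diag().unsqueeze(1)        # (N, 1)
    ranks = 1 + (similarity > true_scores).sum(dim=1)   # (N,)
    return (1.0 / ranks.float()).mean()

# toy usage with random, L2-normalized embeddings
desc = torch.nn.functional.normalize(torch.randn(8, 16), dim=1)
code = torch.nn.functional.normalize(torch.randn(8, 16), dim=1)
print(mean_reciprocal_rank(desc @ code.T))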
