

Phrasal Chunking

Setup

python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt

Training phase

python3 default.py > default.model

Testing and Evaluation phase

python3 perc.py -m default.model > output
python3 score_chunks.py < output

OR

python3 perc.py -m default.model | python3 score_chunks.py

Options

python3 default.py -h

This shows the different options you can use in your training algorithm implementation. In particular, the -n option lets you run your algorithm for fewer or more iterations, so your code can run faster with less accuracy or slower with more accuracy. You must implement the -n option in your code so that we are able to run it with a different number of iterations.
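One way to wire this up is with Python's standard optparse module. This is a minimal sketch: the option name, default value, and the shape of the training loop are assumptions about your implementation, not requirements of the assignment.

# Sketch of parsing a -n / --numepochs option with optparse (names and defaults
# here are illustrative assumptions).
import optparse

optparser = optparse.OptionParser()
optparser.add_option("-n", "--numepochs", dest="numepochs", default=10, type="int",
                     help="number of training iterations (epochs) over the data")
(opts, _) = optparser.parse_args()

for epoch in range(opts.numepochs):
    pass  # one full pass over the training data goes here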

Baseline

$ time python3 chunk.py -e 10 -m baseline.model
reading data ...
done.
number of mistakes: 5620
number of mistakes: 3962
number of mistakes: 2930
number of mistakes: 2284
number of mistakes: 1768
number of mistakes: 1390
number of mistakes: 1226
number of mistakes: 1031
number of mistakes: 810
number of mistakes: 707

real        23m49.039s
user        23m33.333s
sys 0m5.446s

The time was computed on a 1.4 GHz Intel Core i7 with 16 GB of 1867 MHz LPDDR3 RAM, running on battery power. The baseline achieves an F1 score of 92.37 on this dataset; depending on implementation details, your score may be slightly higher or lower.
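The "number of mistakes" lines above are the per-epoch counts printed by the training loop: each epoch decodes every training sentence with the current weights and updates on errors. Below is a minimal sketch of a structured-perceptron loop of that kind; the decode and features callables are caller-supplied placeholders (e.g. a Viterbi decoder over the chunk tags and a global feature extractor), and this is an assumption about the training procedure, not the starter code's API.

# Minimal structured-perceptron training loop (illustrative sketch).
import sys
from collections import defaultdict

def train(train_data, decode, features, numepochs=10):
    weights = defaultdict(float)            # feature -> weight
    for epoch in range(numepochs):
        mistakes = 0
        for sent, gold_tags in train_data:  # one pass over the training data per epoch
            pred_tags = decode(weights, sent)
            if pred_tags != gold_tags:
                mistakes += 1
                # standard perceptron update: reward gold features, penalize predicted ones
                for feat in features(sent, gold_tags):
                    weights[feat] += 1.0
                for feat in features(sent, pred_tags):
                    weights[feat] -= 1.0
        print("number of mistakes:", mistakes, file=sys.stderr)
    return dict(weights)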

$ python3 perc.py -m baseline.model | python3 score_chunks.py 
reading data ... 
done.
processed 500 sentences with 10375 tokens and 5783 phrases; found phrases: 5747; correct phrases: 5325
             ADJP: precision:  70.83%; recall:  68.69%; F1:  69.74; found:     96; correct:     99  
             ADVP: precision:  74.00%; recall:  73.27%; F1:  73.63; found:    200; correct:    202 
            CONJP: precision:  66.67%; recall:  80.00%; F1:  72.73; found:      6; correct:      5   
             INTJ: precision:   0.00%; recall:   0.00%; F1:   0.00; found:      0; correct:      1   
               NP: precision:  93.66%; recall:  92.80%; F1:  93.23; found:   2998; correct:   3026
               PP: precision:  96.41%; recall:  96.81%; F1:  96.61; found:   1226; correct:   1221
              PRT: precision:  68.42%; recall:  59.09%; F1:  63.41; found:     19; correct:     22  
             SBAR: precision:  76.52%; recall:  82.24%; F1:  79.28; found:    115; correct:    107 
               VP: precision:  93.28%; recall:  92.18%; F1:  92.73; found:   1087; correct:   1100
accuracy:  94.87%; precision:  92.66%; recall:  92.08%; F1:  92.37
Score: 92.37
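The summary scores follow directly from the counts on the first line of the report, assuming the usual conlleval-style convention that "found phrases" is the number of predicted phrases, the 5783 phrases are the gold phrases, and "correct phrases" is the number matching exactly. Precision is correct/found, recall is correct/gold, and F1 is their harmonic mean:

# Reproducing the overall scores from the counts reported above.
found, correct, gold = 5747, 5325, 5783
precision = correct / found                           # 0.9266 -> 92.66%
recall = correct / gold                               # 0.9208 -> 92.08%
f1 = 2 * precision * recall / (precision + recall)    # 0.9237 -> 92.37
print("precision: %.2f%%  recall: %.2f%%  F1: %.2f" % (100 * precision, 100 * recall, 100 * f1))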