truncated_output_suffixes & #32

thistleknot · 2024-04-29T00:04:22Z

import json

# `tokenizer` is assumed to be defined earlier in the notebook.
with open("data/all_truncated_outputs.json") as f:
    output_suffixes = json.load(f)
# Every non-empty proper token-level prefix of each suffix.
truncated_output_suffixes = [
    tokenizer.convert_tokens_to_string(tokens[:i])
    for tokens in (tokenizer.tokenize(s) for s in output_suffixes)
    for i in range(1, len(tokens))
]
# Same, but built from only the first 512 suffixes.
truncated_output_suffixes_512 = [
    tokenizer.convert_tokens_to_string(tokens[:i])
    for tokens in (tokenizer.tokenize(s) for s in output_suffixes[:512])
    for i in range(1, len(tokens))
]

The code above references files that do not exist in the repo for the MVE.

Another example is true_facts.json (I did not find an example in the paper that mentioned facts or a .json file).


thistleknot · 2024-04-29T03:07:09Z

I created a script that I think mimics what you were showcasing:

https://gist.github.com/thistleknot/b936477ee82ce608b3c7f47381f6b15d

vgel · 2024-05-24T23:45:04Z

Make sure you're running the notebook with the cwd set to the notebooks folder; the data folder is notebooks/data. Alternatively, you can just copy the data folder to wherever you need it (you can find the current cwd with import os; print(os.getcwd()) and copy the data folder there). It's pretty small.
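For anyone hitting the same FileNotFoundError, here is a minimal sketch of the check described above. It prints the cwd and then looks for the data folder relative to it. The helper name find_data_dir and its upward search through parent folders are assumptions made for illustration; they are not part of the repo.

```python
import os
from pathlib import Path
from typing import Optional


def find_data_dir(
    start: Path, filename: str = "all_truncated_outputs.json"
) -> Optional[Path]:
    """Hypothetical helper: walk upward from `start` and return the first
    data/ or notebooks/data/ folder that contains `filename`."""
    start = start.resolve()
    for base in (start, *start.parents):
        for candidate in (base / "data", base / "notebooks" / "data"):
            if (candidate / filename).is_file():
                return candidate
    return None


if __name__ == "__main__":
    # Relative paths like "data/all_truncated_outputs.json" resolve against this.
    print("cwd:", os.getcwd())
    data_dir = find_data_dir(Path.cwd())
    if data_dir is None:
        print("data folder not found; copy notebooks/data next to the notebook")
    else:
        print("found data folder:", data_dir)
```

If the folder is found somewhere other than ./data, either change the cwd to its parent with os.chdir or copy the folder, as suggested above.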

vgel closed this as completed May 24, 2024
