
Cannot run make_graphfeat.sh script #8

Open
VincentBt opened this issue Nov 29, 2022 · 2 comments

Comments


VincentBt commented Nov 29, 2022

Hi @linminhtoo, thank you for the work! I'd like to reproduce the results of the paper by following the README file. I'm trying to generate the graph features (they are not in the Google Drive, contrary to the statement "We again provide them in our Drive"), but I cannot execute bash scripts/retrosim/make_graphfeat.sh, as it raises the following exception:

File "trainEBM.py", line 477, in main
    raise ValueError(f"Model {args.model_name} not supported!")

I've looked at the history of the .sh files and of trainEBM.py, and I guess it's simply a problem of trainEBM.py not properly handling the case where model_name = None?
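For illustration, here is a minimal, hypothetical sketch (not the actual trainEBM.py code; the dispatch logic and supported-name check are assumptions) of how an unhandled model_name = None would hit the raise seen in the traceback:

```python
# Hypothetical sketch of a model-name dispatch like the one in trainEBM.py.
# "GraphEBM_1MPN" is the one model name mentioned in this thread; everything
# else here is illustrative only.
def build_model(model_name):
    if model_name == "GraphEBM_1MPN":
        return object()  # stand-in for constructing the real model
    # model_name = None (or any unrecognised name) falls through to here
    raise ValueError(f"Model {model_name} not supported!")
```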

Contributor

linminhtoo commented Dec 15, 2022

Hello @VincentBt, sorry for my late reply. I've since graduated and am working full-time, so I haven't been checking these repos as regularly. Please feel free to message me on LinkedIn if my replies are slow.

Yes, you are right - we decided not to upload the graph feats anymore (we used to) because they take up too much space and it's easier to just generate them from scratch. I've made a PR to remove that incorrect statement in the README.

As for the generation itself, you're also right: the bash script somehow has incorrect arguments (it definitely was working before, hahaha...).

It's been a long time since I last ran it, but the idea is that the PyTorch Dataset class we've defined will always attempt to precompute (or load precomputed files from disk) whenever it's initialised. See the class here:

class ReactionDatasetSMILES(Dataset):
    """Dataset class for SMILES/Graph representation of reactions, should be good for both GNN and Transformer"""
    ...
    # this line calls the precompute function
    self.precompute()

Now, I admit it's a convoluted way of doing it (back when I was still young in college...), but the idea is to run trainEBM.py far enough that the Dataset gets initialised, which then triggers the graph feat precompute function. This should really be a separate script of its own, which I might get around to refactoring some day, haha.

Here, you can see that we first look for precomputed files; if they don't exist at the expected paths, we proceed with the precomputation:

def precompute(self):
    if self.args.representation == 'graph':
        # for graph, we want to cache since the pre-processing is very heavy
        cache_smi = self.root / f"{self.rxn_smis_filename}.{self.args.cache_suffix}.cache_smi.pkl"
        cache_mask = self.root / f"{self.rxn_smis_filename}.{self.args.cache_suffix}.cache_mask.pkl"
        cache_feat = self.root / f"{self.rxn_smis_filename}.{self.args.cache_suffix}.cache_feat.npz"
        cache_feat_index = self.root / f"{self.rxn_smis_filename}.{self.args.cache_suffix}.cache_feat_index.npz"
        if all(os.path.exists(cache) for cache in [cache_smi, cache_mask, cache_feat, cache_feat_index]):
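The cache-or-precompute logic above can be sketched standalone like this (path names and suffixes mirror the excerpt; the load/precompute bodies are stand-ins, not the real implementation):

```python
# Sketch of the "load caches if all four exist, else precompute" pattern.
import os
from pathlib import Path

def get_cache_paths(root: Path, rxn_smis_filename: str, cache_suffix: str):
    # Same four cache files as in the precompute() excerpt above.
    stem = f"{rxn_smis_filename}.{cache_suffix}"
    return [
        root / f"{stem}.cache_smi.pkl",
        root / f"{stem}.cache_mask.pkl",
        root / f"{stem}.cache_feat.npz",
        root / f"{stem}.cache_feat_index.npz",
    ]

def load_or_precompute(root: Path, rxn_smis_filename: str, cache_suffix: str) -> str:
    caches = get_cache_paths(root, rxn_smis_filename, cache_suffix)
    if all(os.path.exists(cache) for cache in caches):
        return "loaded"      # all four cache files exist -> load from disk
    return "precomputed"     # otherwise fall through to precomputation
```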

If we really only want to make the graph feats, then we could set the number of training epochs to 0 so that no training happens. For the model name, you could provide --model_name "GraphEBM_1MPN" together with the correct --representation "graph". Alternatively, this also means that if you run an actual GraphEBM training, the code should do the graph feat precomputation needed before training can happen. Just make sure to give the correct paths for storing the graph feats so you won't end up with multiple copies of those massive files on your storage (or HPC cluster).
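Put together, a hypothetical invocation might look like the following. The --model_name and --representation flags come from the discussion above, while --epochs is an assumed name for the training-epochs argument; check trainEBM.py's argument parser for the real flag and the cache-path arguments.

```shell
# Hypothetical sketch: run trainEBM.py just far enough to trigger the
# graph feat precomputation, with training disabled via 0 epochs.
python trainEBM.py \
    --model_name "GraphEBM_1MPN" \
    --representation "graph" \
    --epochs 0
```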

@linminhtoo

Hello @VincentBt, I wanted to check in to see whether you're still facing any issues with using our work? :)
