[Initiative]: Annotate Ersilia's models following BioModels standards #1059
Comments
Hello @Zainab-ik, as discussed, let's start by conceiving an issue template to prompt discussion about each model individually. I suggest that we start by doing this in the current antimalarial model, then we can replicate the template to other models as we see fit. In my opinion, the template should not be too complex. |
@Zainab-ik here are some questions in preparation for our meeting with Sheriff. Feel free to add more:
|
I'll be working on the issue template. |
|
Yes it is empty for now. Please add the two models that we are currently working on and then we will add more. |
Update
@miquelduranfrigola Am I missing anything? |
Thanks @Zainab-ik - this is very useful. I don't think anything is missing. |
Update!!!
|
GitHub Issue Template: In discussion with @miquelduranfrigola, he suggested I create an issue template, open one for each model I'm annotating, link them to this main issue to keep track of the work, and finally close them after the model is uploaded to the BioModels repository. Using the Ersilia issue template as a sample, I came up with a draft and I'd like a review before incorporating it into each model repository. I'd also like to ask how the issues should be used, considering we'd have to open them in each model repository and not the general repository. |
Hi @Zainab-ik After our meeting today, please:
From my side, I'll prioritize some further models for annotation. And we have decided that, once we have completed the annotation of at least 10 models, we will start thinking about:
|
Following the meeting.
I'll work on completing the annotation. I've sorted out the compact identifiers with the EBI team. I'll also try uploading one model to BioModels with Sheriff to give a sample of what the issue template information would look like. |
Hi @Zainab-ik Thanks! This is looking good. As I stated in the model issues, I suggest we have two issues: one for discussion and one that we will only open once we know which data from BioModels we want to store in Ersilia as well. |
Hi @GemmaTuron I've created the discussion issue around eos80ch and eos7kpb. I've completed the annotation of eos80ch and I'd like your review before uploading. |
Thanks @Zainab-ik ! |
Hi @GemmaTuron I've worked through the suggestions. Do I go ahead and start working on the priority models in the sheet? Also, there's an option of opening an account on BioModels to review submissions.
I think 1 applies to us. I could share my submission for review. |
@GemmaTuron feel free to take the lead here 👍 |
Hi @Zainab-ik ! Thanks, good start! Feedback from today's meeting:
If you are done with all the tasks before our next meeting, I suggest you have a look at the model incorporation that is still midway, but this is lower priority |
Feedback from BioModels (Sheriff) !!
I've incorporated all the feedback into the two models. I believe both models are fully annotated. Based on the feedback, the following are/would be standard metadata in all models:
|
Update!!! DOME annotation completed and both models are up on BioModels. This has been linked in the respective repository. |
eos46ev !!!
More detailed comments and questions are in the issue here |
eos4e40 !!!
A quick question: I realized that the use of the terms active, inactive, hit, and non-hit when describing data binarization depends on the paper. How do we pick a standard, then? They are all mapped to ontology terms except non-hit. The curation/annotation can be accessed here |
eos5xng !!!
The curation/annotation is completed and linked here |
An open-ended Question "How much of the model properties i.e. core model properties (e.g., packages, libraries, open source software) should be curated and annotated?"
|
Hi @Zainab-ik, Good job, thanks for the updates, please find below some comments:
|
Thank you @GemmaTuron
For this, I added the term H3D Proprietary as metadata and annotated it with a suitable ontology term and the ontology link. I didn't mean that I added a link to the proprietary data. Sheriff mentioned the term should be added for transparency.
Noted @GemmaTuron. That was uploaded as a sample to give an idea of how the overview would look and to see if the Ersilia team has any comments or would like any changes. I'd appreciate feedback on that. The upload can always be updated.
I'll inform the BioModels team. Could you please specify which ones, so I can mention them exactly?
These are the lists of tags available. A new one can be proposed if that'd be more suitable for Ersilia models.
They are properties. More like data properties very relevant to the model. |
Antimicrobial models annotation Questions
|
Hi @Zainab-ik ! Good job thanks for keeping it up! I have answered your questions in the respective models and below the general ones:
|
Thank you @GemmaTuron
That's great. That would mean the QSAR metadata should be a constant one, right? Just a thought: can a generative model be classified as QSAR too?
Okay, that's clarified. What if an experimental method (in vivo, specifically) is used to generate the dataset? Should the experimental method and the in vivo model then be added as metadata?
The metadata would be the same except for the organism and output, with an antiviral metadata term added to it. |
@GemmaTuron All models ready for review. |
Hi @GemmaTuron A few clarifications from the meeting:
|
Hi @Zainab-ik !
After redoing the current models to review, let's get back to the old ones before we move onto the new ones. Feel free to reopen the issues and note the changes that should be made |
A clarification regarding in vivo and in vitro: if it's only used for data generation, it's not to be added, right @GemmaTuron? |
Exactly, all data has ultimately been generated experimentally, so it is not that relevant to collect this information |
General fields that do not add information:
|
Hi @Zainab-ik I agree with most of them, but MACCS keys are a different type of descriptor. If the model is using RDKit descriptors we should annotate that, and if it is using MACCS we should annotate that. Maybe we should also think about whether we want to annotate all the different descriptors used |
That's right. The only challenge is that MACCS and RDKit are the only descriptors present in OLS that can be annotated. |
Antimicrobial and COVID models uploaded to BioModels
|
Hey @Zainab-ik Before starting with new models, can you have a look at the existing ones and make sure they all comply with the latest decisions we have made? Note down here any changes that had to be made in the annotations. thanks! |
Yes, working on that. |
Previous Model review
|
Regarding the first 2 models, eos7kpb and eos80ch:
|
Hi @Zainab-ik Good on the corrections, as we discussed let's leave all the biological endpoints on eos7kpb |
Update: BioModels Upload
To-do's
|
Automating Metadata Annotation using Zooma
Steps:
Comments/Observation
|
Models uploaded to BioModels
|
New model annotation - in progress: eos2lqb - issue. Note: I've been working with a lot of regression models recently, which is quite exciting. One of the evaluation metrics is root-mean-square error (RMSE), which, from my reading, is also known as RMSD. On OLS, RMSE doesn't exist but RMSD does, and I've been using that in my annotation. |
Hi @Zainab-ik ! I'm having a look at the models you are annotating. Let me know when the Excel files are ready. RMSE and RMSD are the same ;) |
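For readers less familiar with the metric discussed above, here is a minimal sketch of how RMSE (equivalently RMSD, when used as a regression metric) is computed. The function name `rmse` and the sample values are illustrative assumptions and do not come from any Ersilia model or from the annotation files.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values.

    Hypothetical helper for illustration only; it is not part of the
    Ersilia or BioModels code bases.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values: only the last prediction is off, by 2 units.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
score = rmse(observed, predicted)
```

Because the two names refer to the same quantity in this context, annotating a model's RMSE with the RMSD ontology term is consistent with the formula sketched here.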
Alright, Thanks @GemmaTuron |
Hi @GemmaTuron |
Grover Models
General comments about the Grover model
|
eos7w6n - This is the base model (GROVER) that was fine-tuned for task-specific datasets. Grover Models - Annotation in Progress (metadata extraction and curation done) |
All models ready for review. |
Summary
We have partnered with BioModels at EMBL-EBI (Hinxton) to explore potential ways to incorporate Ersilia's models into the well-established BioModels resource.
Of note, BioModels model annotation is based on ontologies as reported in the Ontology Lookup Service. We expect to reach similar standards thanks to the current project.
Scope
Initiative 🐋
Objective(s)
The objectives of the project are the following:
Team
@Zainab-ik is currently doing an internship at EMBL-EBI in the BioModels team.
Importantly, @Zainab-ik will meet with @miquelduranfrigola twice a week to report progress and decide next steps. Before each meeting, @Zainab-ik will update the corresponding model issues and, after the meeting, action items will be reflected in the issues.
Timeline
The project timeline is still up for discussion. These are some tentative milestones:
Documentation
A backlog of models can be found in the Ersilia BioModels Spreadsheet. This spreadsheet should act as a centralized resource to keep track of progress.
The shared folder in Google Drive can be accessed here.