A curated list of pre-trained language models in scientific domains (e.g., mathematics, physics, chemistry, biology, medicine, materials science, and geoscience), covering different model sizes (from <100M to over 100B parameters) and modalities (e.g., language, vision, graph, molecule, protein, genome, and climate time series). The repository will be continuously updated.
NOTE 1: To avoid ambiguity, when we talk about the number of parameters in a model, "Base" refers to 110M (i.e., BERT-Base), and "Large" refers to 340M (i.e., BERT-Large). Other numbers will be written explicitly.
NOTE 2: In each subsection, papers are sorted chronologically. If a paper has a preprint version (e.g., on arXiv or bioRxiv), we use the date of the preprint. Otherwise, we use the date of the conference proceedings or journal.
NOTE 3: We appreciate contributions. If you have any suggested papers, feel free to reach out to [email protected] or submit a pull request. For format consistency, we will include a paper after (1) it has a version with author names AND (2) its GitHub and/or Hugging Face links are available.
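As a minimal sketch of the size convention in NOTE 1, the snippet below converts the label inside a `[Model (...)]` link into an approximate parameter count. The function name `parse_size_label` and both lookup tables are hypothetical helpers written for illustration only; they are not part of this repository, and labels the sketch does not recognize (for example "8x7B") simply return `None`.

```python
import re
from typing import Optional

# Convention from NOTE 1: "Base" means 110M (BERT-Base), "Large" means 340M (BERT-Large).
NAMED_SIZES = {"base": 110_000_000, "large": 340_000_000}
UNIT_MULTIPLIERS = {"M": 1_000_000, "B": 1_000_000_000}


def parse_size_label(label: str) -> Optional[int]:
    """Return the approximate parameter count for a size label, or None.

    Accepts labels such as "Base", "Large", "125M", "1.3B", or
    "7B, LLaMA-2" (anything after the first comma is ignored).
    """
    size = label.split(",")[0].strip()
    if size.lower() in NAMED_SIZES:
        return NAMED_SIZES[size.lower()]
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([MB])", size)
    if match is None:
        # Labels outside the simple "<number><M|B>" pattern are not handled here.
        return None
    value, unit = match.groups()
    return round(float(value) * UNIT_MULTIPLIERS[unit])


if __name__ == "__main__":
    for label in ["Base", "Large", "125M", "1.3B", "7B, LLaMA-2", "8x7B, Mistral"]:
        print(label, "->", parse_size_label(label))
```

Such a helper could be used, for instance, to filter the entries below by size, keeping in mind that the counts are the rounded figures written in the labels and not exact parameter counts.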
- General
- Mathematics
- Physics
- Chemistry and Materials Science
- Biology and Medicine
- Geography, Geology, and Environmental Science
General
-
(SciBERT) SciBERT: A Pretrained Language Model for Scientific Text
EMNLP 2019
[Paper] [GitHub] [Model (Base)] -
(SciGPT2) Explaining Relationships between Scientific Documents
ACL 2021
[Paper] [GitHub] [Model (117M)] -
(CATTS) TLDR: Extreme Summarization of Scientific Documents
EMNLP 2020 Findings
[Paper] [GitHub] [Model (406M)] -
(SciNewsBERT) SciClops: Detecting and Contextualizing Scientific Claims for Assisting Manual Fact-Checking
CIKM 2021
[Paper] [Model (Base)] -
(ScholarBERT) The Diminishing Returns of Masked Language Models to Science
ACL 2023 Findings
[Paper] [Model (Large)] [Model (770M)] -
(AcademicRoBERTa) A Japanese Masked Language Model for Academic Domain
COLING 2022 Workshop
[Paper] [GitHub] [Model (125M)] -
(Galactica) Galactica: A Large Language Model for Science
arXiv 2022
[Paper] [Model (125M)] [Model (1.3B)] [Model (6.7B)] [Model (30B)] [Model (120B)] -
(DARWIN) DARWIN Series: Domain Specific Large Language Models for Natural Science
arXiv 2023
[Paper] [GitHub] [Model (7B)] -
(FORGE) FORGE: Pre-training Open Foundation Models for Science
SC 2023
[Paper] [GitHub] [Model (1.4B, General)] [Model (1.4B, Biology/Medicine)] [Model (1.4B, Chemistry)] [Model (1.4B, Engineering)] [Model (1.4B, Materials Science)] [Model (1.4B, Physics)] [Model (1.4B, Social Science/Art)] [Model (13B, General)] [Model (22B, General)] -
(SciGLM) SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning
arXiv 2024
[Paper] [GitHub] [Model (6B)]
-
(SPECTER) SPECTER: Document-level Representation Learning using Citation-informed Transformers
ACL 2020
[Paper] [GitHub] [Model (Base)] -
(OAG-BERT) OAG-BERT: Towards a Unified Backbone Language Model for Academic Knowledge Services
KDD 2022
[Paper] [GitHub] -
(ASPIRE) Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity
NAACL 2022
[Paper] [GitHub] [Model (Base)] -
(SciNCL) Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings
EMNLP 2022
[Paper] [GitHub] [Model (Base)] -
(SPECTER 2.0) SciRepEval: A Multi-Format Benchmark for Scientific Document Representations
EMNLP 2023
[Paper] [GitHub] [Model (113M)] -
(SciPatton) Patton: Language Model Pretraining on Text-Rich Networks
ACL 2023
[Paper] [GitHub] -
(SciMult) Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
EMNLP 2023 Findings
[Paper] [GitHub] [Model (138M)]
Mathematics
-
(GenBERT) Injecting Numerical Reasoning Skills into Language Models
ACL 2020
[Paper] [GitHub] -
(MathBERT) MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
arXiv 2021
[Paper] [GitHub] [Model (Base)] -
(MWP-BERT) MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving
NAACL 2022 Findings
[Paper] [GitHub] [Model (Base)] -
(BERT-TD) Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems
ACL 2022 Findings
[Paper] [GitHub] -
(GSM8K-GPT) Training Verifiers to Solve Math Word Problems
arXiv 2021
[Paper] [GitHub] -
(DeductReasoner) Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction
ACL 2022
[Paper] [GitHub] [Model (125M)] -
(NaturalProver) NaturalProver: Grounded Mathematical Proof Generation with Language Models
NeurIPS 2022
[Paper] [GitHub] -
(Minerva) Solving Quantitative Reasoning Problems with Language Models
NeurIPS 2022
[Paper] -
(Bhaskara) Lila: A Unified Benchmark for Mathematical Reasoning
EMNLP 2022
[Paper] [GitHub] [Model (2.7B)] -
(WizardMath) WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
arXiv 2023
[Paper] [GitHub] [Model (7B)] [Model (13B)] [Model (70B)] -
(MAmmoTH) MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
ICLR 2024
[Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (7B, Mistral)] [Model (13B, LLaMA-2)] [Model (70B, LLaMA-2)] -
(MetaMath) MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
ICLR 2024
[Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (7B, Mistral)] [Model (13B, LLaMA-2)] [Model (70B, LLaMA-2)] -
(ToRA) ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
ICLR 2024
[Paper] [GitHub] [Model (7B)] [Model (13B)] [Model (70B)] -
(MathCoder) MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
ICLR 2024
[Paper] [GitHub] [Model (7B)] [Model (13B)] -
(Llemma) Llemma: An Open Language Model For Mathematics
ICLR 2024
[Paper] [GitHub] [Model (7B)] [Model (34B)] -
(OVM) OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning
NAACL 2024 Findings
[Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (7B, Mistral)] -
(DeepSeekMath) DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
arXiv 2024
[Paper] [GitHub] [Model (7B)] -
(InternLM-Math) InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
arXiv 2024
[Paper] [GitHub] [Model (7B)] [Model (20B)] -
(OpenMath) OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
arXiv 2024
[Paper] [Model (7B, Mistral)] [Model (70B, LLaMA-2)] -
(Rho-Math) Rho-1: Not All Tokens Are What You Need
arXiv 2024
[Paper] [GitHub] [Model (1B)] [Model (7B)] -
(MAmmoTH2) MAmmoTH2: Scaling Instructions from the Web
arXiv 2024
[Paper] [GitHub] [Model (7B, Mistral)] [Model (8B, LLaMA-3)] [Model (8x7B, Mistral)]
-
(Inter-GPS) Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning
ACL 2021
[Paper] [GitHub] -
(Geoformer) UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
EMNLP 2022
[Paper] [GitHub] -
(SCA-GPS) A Symbolic Character-Aware Model for Solving Geometry Problems
ACM MM 2023
[Paper] [GitHub] -
(UniMath-Flan-T5) UniMath: A Foundational and Multimodal Mathematical Reasoner
EMNLP 2023
[Paper] [GitHub] -
(G-LLaVA) G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
arXiv 2023
[Paper] [GitHub]
-
(TAPAS) TAPAS: Weakly Supervised Table Parsing via Pre-training
ACL 2020
[Paper] [GitHub] [Model (Base)] [Model (Large)] -
(TaBERT) TaBERT: Learning Contextual Representations for Natural Language Utterances and Structured Tables
ACL 2020
[Paper] [GitHub] [Model (Base)] [Model (Large)] -
(GraPPa) GraPPa: Grammar-Augmented Pre-training for Table Semantic Parsing
ICLR 2021
[Paper] [GitHub] [Model (355M)] -
(TUTA) TUTA: Tree-based Transformers for Generally Structured Table Pre-training
KDD 2021
[Paper] [GitHub] -
(RCI) Capturing Row and Column Semantics in Transformer Based Question Answering over Tables
NAACL 2021
[Paper] [GitHub] [Model (12M)] -
(TABBIE) TABBIE: Pretrained Representations of Tabular Data
NAACL 2021
[Paper] [GitHub] -
(TAPEX) TAPEX: Table Pre-training via Learning a Neural SQL Executor
ICLR 2022
[Paper] [GitHub] [Model (140M)] [Model (406M)] -
(FORTAP) FORTAP: Using Formulas for Numerical-Reasoning-Aware Table Pretraining
ACL 2022
[Paper] [GitHub] -
(OmniTab) OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering
NAACL 2022
[Paper] [GitHub] [Model (406M)] -
(ReasTAP) ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples
EMNLP 2022
[Paper] [GitHub] [Model (406M)] -
(Table-GPT) Table-GPT: Table-tuned GPT for Diverse Table Tasks
arXiv 2023
[Paper] -
(TableLlama) TableLlama: Towards Open Large Generalist Models for Tables
NAACL 2024
[Paper] [GitHub] [Model (7B)] -
(TableLLM) TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios
arXiv 2024
[Paper] [GitHub] [Model (7B)] [Model (13B)]
Physics
-
(astroBERT) Building astroBERT, a Language Model for Astronomy & Astrophysics
arXiv 2021
[Paper] [Model (Base)] -
(AstroLLaMA) AstroLLaMA: Towards Specialized Foundation Models in Astronomy
AACL 2023 Workshop
[Paper] [Model (7B)] -
(AstroLLaMA-Chat) AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Research Notes of the AAS 2024
[Paper] [Model (7B)]
Chemistry and Materials Science
-
(ChemBERT) Automated Chemical Reaction Extraction from Scientific Literature
Journal of Chemical Information and Modeling 2022
[Paper] [GitHub] [Model (Base)] -
(MatSciBERT) MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction
npj Computational Materials 2022
[Paper] [GitHub] [Model (Base)] -
(MatBERT) Quantifying the Advantage of Domain-Specific Pre-training on Named Entity Recognition Tasks in Materials Science
Patterns 2022
[Paper] [GitHub] -
(BatteryBERT) BatteryBERT: A Pretrained Language Model for Battery Database Enhancement
Journal of Chemical Information and Modeling 2022
[Paper] [GitHub] [Model (Base)] -
(MaterialsBERT) A General-Purpose Material Property Data Extraction Pipeline from Large Polymer Corpora using Natural Language Processing
npj Computational Materials 2023
[Paper] [Model (Base)] -
(CatBERTa) Catalyst Property Prediction with CatBERTa: Unveiling Feature Exploration Strategies through Large Language Models
ACS Catalysis 2023
[Paper] [GitHub] -
(LLM-Prop) LLM-Prop: Predicting Physical and Electronic Properties of Crystalline Solids from Their Text Descriptions
arXiv 2023
[Paper] [GitHub] -
(ChemDFM) ChemDFM: Dialogue Foundation Model for Chemistry
arXiv 2024
[Paper] [GitHub] [Model (13B)] -
(CrystalLLM) Fine-Tuned Language Models Generate Stable Inorganic Materials as Text
ICLR 2024
[Paper] [GitHub] -
(ChemLLM) ChemLLM: A Chemical Large Language Model
arXiv 2024
[Paper] [Model (7B)] -
(LlaSMol) LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset
arXiv 2024
[Paper] [GitHub] [Model (6.7B, Galactica)] [Model (7B, LLaMA-2)] [Model (7B, Mistral)]
-
(Text2Mol) Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries
EMNLP 2021
[Paper] [GitHub] -
(KV-PLM) A Deep-learning System Bridging Molecule Structure and Biomedical Text with Comprehension Comparable to Human Professionals
Nature Communications 2022
[Paper] [GitHub] [Model (Base)] -
(MolT5) Translation between Molecules and Natural Language
EMNLP 2022
[Paper] [GitHub] [Model (60M)] [Model (220M)] [Model (770M)] -
(MoMu) A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language
arXiv 2022
[Paper] [GitHub] -
(MoleculeSTM) Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing
Nature Machine Intelligence 2023
[Paper] [GitHub] -
(Text+Chem T5) Unifying Molecular and Textual Representations via Multi-task Language Modelling
ICML 2023
[Paper] [GitHub] [Model (60M)] [Model (220M)] -
(GIMLET) GIMLET: A Unified Graph-Text Model for Instruction-based Molecule Zero-Shot Learning
NeurIPS 2023
[Paper] [GitHub] [Model (60M)] -
(MolFM) MolFM: A Multimodal Molecular Foundation Model
arXiv 2023
[Paper] [GitHub] -
(MolCA) MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter
EMNLP 2023
[Paper] [GitHub] -
(InstructMol) InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery
arXiv 2023
[Paper] [GitHub] -
(3D-MoLM) Towards 3D Molecule-Text Interpretation in Language Models
ICLR 2024
[Paper] [GitHub] -
(GIT-Mol) GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text
Computers in Biology and Medicine 2024
[Paper] [GitHub]
-
(SMILES-BERT) SMILES-BERT: Large Scale Unsupervised Pre-training for Molecular Property Prediction
ACM BCB 2019
[Paper] [GitHub] -
(MAT) Molecule Attention Transformer
arXiv 2020
[Paper] [GitHub] -
(ChemBERTa) ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction
arXiv 2020
[Paper] [GitHub] [Model (125M)] -
(MolBERT) Molecular Representation Learning with Language Models and Domain-Relevant Auxiliary Tasks
arXiv 2020
[Paper] [GitHub] [Model (Base)] -
(rxnfp) Mapping the Space of Chemical Reactions using Attention-based Neural Networks
Nature Machine Intelligence 2021
[Paper] [GitHub] [Model (Base)] -
(RXNMapper) Extraction of Organic Chemistry Grammar from Unsupervised Learning of Chemical Reactions
Science Advances 2021
[Paper] [GitHub] -
(MoLFormer) Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Nature Machine Intelligence 2022
[Paper] [GitHub] [Model (47M)] -
(Chemformer) Chemformer: A Pre-trained Transformer for Computational Chemistry
Machine Learning: Science and Technology 2022
[Paper] [GitHub] [Model (45M)] [Model (230M)] -
(R-MAT) Relative Molecule Self-Attention Transformer
Journal of Cheminformatics 2024
[Paper] [GitHub] -
(MolGPT) MolGPT: Molecular Generation using a Transformer-Decoder Model
Journal of Chemical Information and Modeling 2022
[Paper] [GitHub] -
(T5Chem) Unified Deep Learning Model for Multitask Reaction Predictions with Explanation
Journal of Chemical Information and Modeling 2022
[Paper] [GitHub] -
(ChemGPT) Neural Scaling of Deep Chemical Models
Nature Machine Intelligence 2023
[Paper] [Model (4.7M)] [Model (19M)] [Model (1.2B)] -
(TransPolymer) TransPolymer: A Transformer-based Language Model for Polymer Property Predictions
npj Computational Materials 2023
[Paper] [GitHub] -
(polyBERT) polyBERT: A Chemical Language Model to Enable Fully Machine-Driven Ultrafast Polymer Informatics
Nature Communications 2023
[Paper] [GitHub] [Model (86M)] -
(MFBERT) Large-Scale Distributed Training of Transformers for Chemical Fingerprinting
Journal of Chemical Information and Modeling 2022
[Paper] [GitHub] -
(SPMM) Bidirectional Generation of Structure and Properties Through a Single Molecular Foundation Model
Nature Communications 2024
[Paper] [GitHub] -
(BARTSmiles) BARTSmiles: Generative Masked Language Models for Molecular Representations
arXiv 2022
[Paper] [GitHub] [Model (406M)] -
(MolGen) Domain-Agnostic Molecular Generation with Self-feedback
ICLR 2024
[Paper] [GitHub] [Model (406M, BART)] [Model (7B, LLaMA)] -
(SELFormer) SELFormer: Molecular Representation Learning via SELFIES Language Models
Machine Learning: Science and Technology 2023
[Paper] [GitHub] [Model (58M)] [Model (87M)] -
(PolyNC) PolyNC: A Natural and Chemical Language Model for the Prediction of Unified Polymer Properties
Chemical Science 2024
[Paper] [GitHub] [Model (220M)]
Biology and Medicine
Acknowledgment: We referred to Wang et al.'s survey paper Pre-trained Language Models in Biomedical Domain: A Systematic Survey when writing some parts of this section.
-
(BioBERT) BioBERT: A Pre-trained Biomedical Language Representation Model for Biomedical Text Mining
Bioinformatics 2020
[Paper] [GitHub] [Model (Base)] [Model (Large)] -
(BioELMo) Probing Biomedical Embeddings from Language Models
NAACL 2019 Workshop
[Paper] [GitHub] [Model (93M)] -
(ClinicalBERT, Alsentzer et al.) Publicly Available Clinical BERT Embeddings
NAACL 2019 Workshop
[Paper] [GitHub] [Model (Base)] -
(ClinicalBERT, Huang et al.) ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
arXiv 2019
[Paper] [GitHub] [Model (Base)] -
(BlueBERT, f.k.a. NCBI-BERT) Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets
ACL 2019 Workshop
[Paper] [GitHub] [Model (Base)] [Model (Large)] -
(BEHRT) BEHRT: Transformer for Electronic Health Records
Scientific Reports 2020
[Paper] [GitHub] -
(EhrBERT) Fine-Tuning Bidirectional Encoder Representations from Transformers (BERT)–Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study
JMIR Medical Informatics 2019
[Paper] [GitHub] -
(Clinical XLNet) Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
EMNLP 2020 Workshop
[Paper] [GitHub] -
(ouBioBERT) Pre-training Technique to Localize Medical BERT and Enhance Biomedical BERT
arXiv 2020
[Paper] [GitHub] [Model (Base)] -
(COVID-Twitter-BERT) COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Frontiers in Artificial Intelligence 2023
[Paper] [GitHub] [Model (Large)] -
(Med-BERT) Med-BERT: Pretrained Contextualized Embeddings on Large-Scale Structured Electronic Health Records for Disease Prediction
npj Digital Medicine 2021
[Paper] [GitHub] -
(Bio-ELECTRA) On the Effectiveness of Small, Discriminatively Pre-trained Language Representation Models for Biomedical Text Mining
EMNLP 2020 Workshop
[Paper] [GitHub] [Model (Base)] -
(BiomedBERT, f.k.a. PubMedBERT) Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
ACM Transactions on Computing for Healthcare 2021
[Paper] [Model (Base)] [Model (Large)] -
(MCBERT) Conceptualized Representation Learning for Chinese Biomedical Text Mining
arXiv 2020
[Paper] [GitHub] [Model (Base)] -
(BRLTM) Bidirectional Representation Learning from Transformers using Multimodal Electronic Health Record Data to Predict Depression
JBHI 2021
[Paper] [GitHub] -
(BioRedditBERT) COMETA: A Corpus for Medical Entity Linking in the Social Media
EMNLP 2020
[Paper] [GitHub] [Model (Base)] -
(BioMegatron) BioMegatron: Larger Biomedical Domain Language Model
EMNLP 2020
[Paper] [GitHub] [Model (345M)] -
(SapBERT) Self-Alignment Pretraining for Biomedical Entity Representations
NAACL 2021
[Paper] [GitHub] [Model (Base)] -
(ClinicalTransformer) Clinical Concept Extraction using Transformers
JAMIA 2020
[Paper] [GitHub] [Model (Base, BERT)] [Model (125M, RoBERTa)] [Model (12M, ALBERT)] [Model (Base, ELECTRA)] [Model (117M, XLNet)] [Model (149M, Longformer)] [Model (86M, DeBERTa)] -
(BioRoBERTa) Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art
EMNLP 2020 Workshop
[Paper] [GitHub] [Model (125M)] [Model (355M)] -
(RAD-BERT) Highly Accurate Classification of Chest Radiographic Reports using a Deep Learning Natural Language Model Pre-trained on 3.8 Million Text Reports
Bioinformatics 2020
[Paper] [GitHub] -
(BioMedBERT) BioMedBERT: A Pre-trained Biomedical Language Model for QA and IR
COLING 2020
[Paper] [GitHub] -
(LBERT) LBERT: Lexically Aware Transformer-based Bidirectional Encoder Representation Model for Learning Universal Bio-Entity Relations
Bioinformatics 2021
[Paper] [GitHub] -
(ELECTRAMed) ELECTRAMed: A New Pre-trained Language Representation Model for Biomedical NLP
arXiv 2021
[Paper] [GitHub] [Model (Base)] -
(SciFive) SciFive: A Text-to-Text Transformer Model for Biomedical Literature
arXiv 2021
[Paper] [GitHub] [Model (220M)] [Model (770M)] -
(BioALBERT) Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT
BMC Bioinformatics 2022
[Paper] [GitHub] [Model (12M)] [Model (18M)] -
(Clinical-Longformer) Clinical-Longformer and Clinical-BigBird: Transformers for Long Clinical Sequences
arXiv 2021
[Paper] [GitHub] [Model (149M, Longformer)] [Model (Base, BigBird)] -
(BioBART) BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model
ACL 2022 Workshop
[Paper] [GitHub] [Model (140M)] [Model (406M)] -
(BioGPT) BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Briefings in Bioinformatics 2022
[Paper] [GitHub] [Model (355M)] [Model (1.5B)] -
(Med-PaLM) Large Language Models Encode Clinical Knowledge
Nature 2023
[Paper] -
(ChatDoctor) ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) using Medical Domain Knowledge
Cureus 2023
[Paper] [GitHub] -
(DoctorGLM) DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task
arXiv 2023
[Paper] [GitHub] -
(BenTsao, f.k.a. HuaTuo) HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge
arXiv 2023
[Paper] [GitHub] -
(MedAlpaca) MedAlpaca - An Open-Source Collection of Medical Conversational AI Models and Training Data
arXiv 2023
[Paper] [GitHub] [Model (7B)] [Model (13B)] -
(PMC-LLaMA) PMC-LLaMA: Towards Building Open-source Language Models for Medicine
arXiv 2023
[Paper] [GitHub] [Model (7B)] [Model (13B)] -
(Med-PaLM 2) Towards Expert-Level Medical Question Answering with Large Language Models
arXiv 2023
[Paper] -
(GatorTronGPT) A Study of Generative Large Language Model for Medical Research and Healthcare
arXiv 2023
[Paper] [GitHub] [Model (345M)] -
(HuatuoGPT) HuatuoGPT, towards Taming Language Model to Be a Doctor
EMNLP 2023 Findings
[Paper] [GitHub] [Model (7B)] [Model (13B)] -
(MedCPT) MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval
Bioinformatics 2023
[Paper] [GitHub] [Model (Base)] -
(DISC-MedLLM) DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation
arXiv 2023
[Paper] [GitHub] [Model (13B)] -
(DRG-LLaMA) DRG-LLaMA: Tuning LLaMA Model to Predict Diagnosis-related Group for Hospitalized Patients
npj Digital Medicine 2024
[Paper] [GitHub] -
(BioT5) BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations
EMNLP 2023
[Paper] [GitHub] [Model (220M)] -
(HuatuoGPT-II) HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs
arXiv 2023
[Paper] [GitHub] [Model (7B)] [Model (13B)] [Model (34B)] -
(MEDITRON) MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
arXiv 2023
[Paper] [GitHub] [Model (7B)] [Model (70B)] -
(PLLaMa) PLLaMa: An Open-source Large Language Model for Plant Science
arXiv 2024
[Paper] [GitHub] [Model (7B)] [Model (13B)] -
(BioMistral) BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
arXiv 2024
[Paper] [Model (7B)] -
(BioMedLM, f.k.a. PubMedGPT) BioMedLM: a Domain-Specific Large Language Model for Biomedical Text
arXiv 2024
[Paper] [GitHub] [Model (2.7B)] -
(BMRetriever) BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
arXiv 2024
[Paper] [GitHub] [Model (410M)] [Model (1B)] [Model (2B)] [Model (7B)]
-
(G-BERT) Pre-training of Graph Augmented Transformers for Medication Recommendation
IJCAI 2019
[Paper] [GitHub] -
(CODER) CODER: Knowledge Infused Cross-Lingual Medical Term Embedding for Term Normalization
JBI 2022
[Paper] [GitHub] [Model (Base)] -
(KeBioLM) Improving Biomedical Pretrained Language Models with Knowledge
NAACL 2021 Workshop
[Paper] [GitHub] [Model (155M)] -
(MoP) Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT
EMNLP 2021
[Paper] [GitHub] -
(BioLinkBERT) LinkBERT: Pretraining Language Models with Document Links
ACL 2022
[Paper] [GitHub] [Model (Base)] [Model (Large)] -
(DRAGON) Deep Bidirectional Language-Knowledge Graph Pretraining
NeurIPS 2022
[Paper] [GitHub] [Model (360M)]
-
(ConVIRT) Contrastive Learning of Medical Visual Representations from Paired Images and Text
MLHC 2022
[Paper] [GitHub] -
(MedViLL) Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-training
JBHI 2022
[Paper] [GitHub] -
(GLoRIA) GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition
ICCV 2021
[Paper] [GitHub] -
(LoVT) Joint Learning of Localized Representations from Medical Images and Reports
ECCV 2022
[Paper] [GitHub] -
(CvT2DistilGPT2) Improving Chest X-Ray Report Generation by Leveraging Warm Starting
Artificial Intelligence in Medicine 2023
[Paper] [GitHub] -
(BioViL) Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
ECCV 2022
[Paper] [GitHub] -
(LViT) LViT: Language meets Vision Transformer in Medical Image Segmentation
TMI 2022
[Paper] [GitHub] -
(M3AE) Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-training
MICCAI 2022
[Paper] [GitHub] -
(ARL) Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge
ACM MM 2022
[Paper] [GitHub] -
(CheXzero) Expert-Level Detection of Pathologies from Unannotated Chest X-ray Images via Self-Supervised Learning
Nature Biomedical Engineering 2022
[Paper] [GitHub] -
(MGCA) Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
NeurIPS 2022
[Paper] [GitHub] -
(MedCLIP) MedCLIP: Contrastive Learning from Unpaired Medical Images and Text
EMNLP 2022
[Paper] [GitHub] -
(BioViL-T) Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
CVPR 2023
[Paper] [GitHub] [Model] -
(BiomedCLIP) BiomedCLIP: A Multimodal Biomedical Foundation Model Pretrained from Fifteen Million Scientific Image-Text Pairs
arXiv 2023
[Paper] [Model] -
(RGRG) Interactive and Explainable Region-guided Radiology Report Generation
CVPR 2023
[Paper] [GitHub] -
(LLaVA-Med) LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
NeurIPS 2023
[Paper] [GitHub] -
(MONET) Transparent Medical Image AI via an Image–Text Foundation Model Grounded in Medical Literature
Nature Medicine 2024
[Paper] [GitHub] -
(Med-PaLM M) Towards Generalist Biomedical AI
NEJM AI 2024
[Paper] [GitHub] -
(BioCLIP) BioCLIP: A Vision Foundation Model for the Tree of Life
arXiv 2023
[Paper] [GitHub] [Model]
-
(ProtTrans) ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing
TPAMI 2021
[Paper] [GitHub] [Model (Base, BERT)] [Model (12M, ALBERT)] [Model (117M, XLNet)] [Model (3B, T5)] [Model (11B, T5)] -
(ESM-1b) Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences
PNAS 2021
[Paper] [GitHub] [Model (650M)] -
(ESM-1v) Language Models Enable Zero-Shot Prediction of the Effects of Mutations on Protein Function
NeurIPS 2021
[Paper] [GitHub] [Model (650M)] -
(ProteinBERT) ProteinBERT: A Universal Deep-Learning Model of Protein Sequence and Function
Bioinformatics 2022
[Paper] [GitHub] [Model (16M)] -
(ProtGPT2) ProtGPT2 is a Deep Unsupervised Language Model for Protein Design
Nature Communications 2022
[Paper] [Model (738M)] -
(ESM-IF1) Learning Inverse Folding from Millions of Predicted Structures
ICML 2022
[Paper] [GitHub] [Model (124M)] -
(ProGen) Large Language Models Generate Functional Protein Sequences across Diverse Families
Nature Biotechnology 2023
[Paper] -
(ProGen2) ProGen2: Exploring the Boundaries of Protein Language Models
Cell Systems 2023
[Paper] [GitHub] [Model (151M)] [Model (764M)] [Model (2.7B)] [Model (6.4B)] -
(ESM-2) Evolutionary-Scale Prediction of Atomic-Level Protein Structure with a Language Model
Science 2023
[Paper] [GitHub] [Model (8M)] [Model (35M)] [Model (150M)] [Model (650M)] [Model (3B)] [Model (15B)] -
(Ankh) Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling
arXiv 2023
[Paper] [GitHub] [Model (450M)] [Model (1.1B)] -
(ProtST) ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
ICML 2023
[Paper] [GitHub] -
(LM-Design) Structure-informed Language Models Are Protein Designers
ICML 2023
[Paper] [GitHub] [Model (650M)] -
(Prot2Text) Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers
AAAI 2024
[Paper] [GitHub] [Model (256M)] [Model (283M)] [Model (398M)] [Model (898M)] -
(SaProt) SaProt: Protein Language Modeling with Structure-Aware Vocabulary
ICLR 2024
[Paper] [GitHub] [Model (35M)] [Model (650M)]
-
(DNABERT) DNABERT: Pre-trained Bidirectional Encoder Representations from Transformers Model for DNA-Language in Genome
Bioinformatics 2021
[Paper] [GitHub] [Model (Base)] -
(Enformer) Effective Gene Expression Prediction from Sequence by Integrating Long-Range Interactions
Nature Methods 2021
[Paper] [GitHub] [Model (249M)] -
(GenSLMs) GenSLMs: Genome-Scale Language Models Reveal SARS-CoV-2 Evolutionary Dynamics
The International Journal of High Performance Computing Applications 2023
[Paper] [GitHub] -
(Nucleotide Transformer) The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics
bioRxiv 2023
[Paper] [GitHub] [Model (50M)] [Model (100M)] [Model (250M)] [Model (500M)] -
(GENA-LM) GENA-LM: A Family of Open-Source Foundational DNA Language Models for Long Sequences
bioRxiv 2023
[Paper] [GitHub] [Model (Base, BERT)] [Model (Large, BERT)] [Model (Base, BigBird)] -
(DNABERT-2) DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome
ICLR 2024
[Paper] [GitHub] [Model (Base)] -
(HyenaDNA) HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution
NeurIPS 2023
[Paper] [GitHub] [Model (0.4M)] [Model (3.3M)] [Model (6.6M)]
-
(RNABERT) Informative RNA-base Embedding for Functional RNA Structural Alignment and Clustering by Deep Representation Learning
NAR Genomics and Bioinformatics 2022
[Paper] [GitHub] -
(RNA-FM) Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions
arXiv 2022
[Paper] [GitHub] -
(RNA-MSM) Multiple Sequence-Alignment-based RNA Language Model and its Application to Structural Inference
Nucleic Acids Research 2024
[Paper] [GitHub]
-
(scBERT) scBERT as a Large-scale Pretrained Deep Language Model for Cell Type Annotation of Single-cell RNA-seq Data
Nature Machine Intelligence 2022
[Paper] [GitHub] -
(scGPT) scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI
Nature Methods 2024
[Paper] [GitHub] -
(scFoundation) Large Scale Foundation Model on Single-cell Transcriptomics
bioRxiv 2023
[Paper] [GitHub] [Model (100M)] -
(Geneformer) Transfer Learning Enables Predictions in Network Biology
Nature 2023
[Paper] [Model (10M)] [Model (40M)] -
(CellLM) Large-Scale Cell Representation Learning via Divide-and-Conquer Contrastive Learning
arXiv 2023
[Paper] [GitHub] -
(BioMedGPT) BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine
arXiv 2023
[Paper] [GitHub] [Model (7B)] [Model (10B)] -
(CellPLM) CellPLM: Pre-training of Cell Language Model Beyond Single Cells
ICLR 2024
[Paper] [GitHub] [Model (82M)]
Geography, Geology, and Environmental Science
-
(ClimateBERT) ClimateBERT: A Pretrained Language Model for Climate-Related Text
arXiv 2021
[Paper] [GitHub] [Model (82M)] -
(SpaBERT) SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation
EMNLP 2022 Findings
[Paper] [GitHub] [Model (Base)] [Model (Large)] -
(MGeo) MGeo: Multi-Modal Geographic Pre-training Method
SIGIR 2023
[Paper] [GitHub] -
(K2) K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization
WSDM 2024
[Paper] [GitHub] [Model (7B)] -
(OceanGPT) OceanGPT: A Large Language Model for Ocean Science Tasks
arXiv 2023
[Paper] [GitHub] [Model (7B)] -
(ClimateBERT-NetZero) ClimateBERT-NetZero: Detecting and Assessing Net Zero and Reduction Targets
EMNLP 2023
[Paper] [Model (82M)] -
(GeoLM) GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding
EMNLP 2023
[Paper] [GitHub] -
(GeoGalactica) GeoGalactica: A Scientific Large Language Model in Geoscience
arXiv 2024
[Paper] [GitHub] [Model (30B)]
-
(ERNIE-GeoL) ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps
KDD 2022
[Paper] -
(PK-Chat) PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model
arXiv 2023
[Paper] [GitHub] -
(UrbanCLIP) UrbanCLIP: Learning Text-Enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
WWW 2024
[Paper] [GitHub]
-
(FourCastNet) FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators
arXiv 2022
[Paper] [GitHub] -
(Pangu-Weather) Accurate Medium-Range Global Weather Forecasting with 3D Neural Networks
Nature 2023
[Paper] [GitHub] -
(ClimaX) ClimaX: A Foundation Model for Weather and Climate
ICML 2023
[Paper] [GitHub] -
(FengWu) FengWu: Pushing the Skillful Global Medium-Range Weather Forecast beyond 10 Days Lead
arXiv 2023
[Paper] [GitHub] -
(W-MAE) W-MAE: Pre-trained Weather Model with Masked Autoencoder for Multi-Variable Weather Forecasting
arXiv 2023
[Paper] [GitHub] -
(FuXi) FuXi: A Cascade Machine Learning Forecasting System for 15-day Global Weather Forecast
npj Climate and Atmospheric Science 2023
[Paper] [GitHub]