Literature Review and Roadmaps for Robust Knowledge-Based LLMs
This is a list of important knowledge-based natural language processing (NLP) papers that discuss current research in the field. It is compiled by collaborators from the University of Notre Dame and Deloitte.
This list is far from complete or objective and is evolving, as important papers continue to be published year after year.
A paper doesn't have to be a peer-reviewed conference/journal paper to appear here. We also include tutorial/survey-style papers and blog posts that are often easier to understand than the original papers.
- ICLR 2022 - GreaseLM: Graph Reasoning Enhanced Language Models for Question Answering
- NeurIPS 2022 - Deep Bidirectional Language-Knowledge Graph Pretraining (DRAGON)
- The difference from GreaseLM is that GreaseLM only performs finetuning (hence, it is an LM finetuned with KGs), whereas DRAGON performs self-supervised pretraining (hence, it can be viewed as an LM pretrained and then finetuned with KGs). Both papers are from Stanford.
- Lin et al. RA-DIT: Retrieval-Augmented Dual Instruction Tuning
- Kai Sun et al. Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs?
- An evaluation of how reliably multiple LLMs internalize factual knowledge, especially facts about torso-to-tail entities (less popular or still-emerging knowledge); it concludes that these LLMs are still far from perfect.
- Jiongnan Liu et al. RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit
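  Toolkits like RETA-LLM wire up the standard retrieve-then-generate loop: fetch the passages most relevant to a query, then prepend them to the prompt so the LLM answers from evidence. A minimal sketch of that idea, using toy bag-of-words cosine similarity (the real toolkit uses dense retrievers; `build_prompt` and `retrieve` are illustrative names, not RETA-LLM's API):

  ```python
  from collections import Counter
  from math import sqrt

  def cosine(a: Counter, b: Counter) -> float:
      """Cosine similarity between two bag-of-words term-count vectors."""
      dot = sum(a[t] * b[t] for t in a)
      norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
      return dot / norm if norm else 0.0

  def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
      """Return the k documents most similar to the query."""
      q = Counter(query.lower().split())
      ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
      return ranked[:k]

  def build_prompt(query: str, passages: list[str]) -> str:
      """Prepend retrieved passages so the LLM answers from evidence, not memory."""
      context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
      return f"Answer using only these passages:\n{context}\n\nQuestion: {query}"
  ```

  The resulting prompt would then be sent to any LLM; the point is that grounding the answer in retrieved text is what reduces reliance on (possibly stale or hallucinated) parametric knowledge.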
- Ilia Shumailov et al. The Curse of Recursion: Training on Generated Data Makes Models Forget
- Potsawee Manakul et al. SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
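  SelfCheckGPT's core observation is that when an LLM hallucinates, stochastically sampled responses to the same prompt tend to disagree with each other, whereas grounded facts stay consistent across samples. The paper scores each sentence against N sampled responses (via BERTScore, QA, or n-gram variants); the toy sketch below substitutes simple unigram overlap for those metrics, just to show the consistency-scoring shape:

  ```python
  def overlap(sentence: str, sample: str) -> float:
      """Fraction of the sentence's tokens that also appear in one sampled response."""
      tokens = set(sentence.lower().split())
      return len(tokens & set(sample.lower().split())) / len(tokens) if tokens else 0.0

  def hallucination_score(sentence: str, samples: list[str]) -> float:
      """SelfCheckGPT-style score: 1 minus mean support across samples.
      Higher score = less consistency with the samples = more likely hallucinated."""
      support = sum(overlap(sentence, s) for s in samples) / len(samples)
      return 1.0 - support
  ```

  A sentence well supported by the samples scores near 0; a claim the samples never repeat scores near 1, flagging it for review. This is "zero-resource" because it needs no external knowledge base, only extra samples from the same model.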
- Chengrun Yang et al. Large Language Models as Optimizers
- Shehzaad Dhuliawala et al. Chain-of-Verification Reduces Hallucination in Large Language Models
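  Chain-of-Verification (CoVe) runs four steps: draft an answer, plan verification questions about it, answer those questions independently of the draft, then revise the draft in light of the checks. A sketch of that pipeline, where `llm` is a hypothetical prompt-in/text-out function standing in for any LLM client (the prompt wording here is illustrative, not the paper's exact prompts):

  ```python
  from typing import Callable

  def chain_of_verification(query: str, llm: Callable[[str], str]) -> str:
      """CoVe pipeline sketch: draft -> plan checks -> answer checks -> revise."""
      draft = llm(f"Answer the question: {query}")
      plan = llm(f"List verification questions for this answer:\n{draft}")
      questions = [q.strip() for q in plan.splitlines() if q.strip()]
      # Answer each verification question in isolation, without showing the
      # draft, so the draft cannot bias the check (the paper's factored variant).
      checks = [f"Q: {q}\nA: {llm(q)}" for q in questions]
      return llm(
          f"Question: {query}\nDraft answer: {draft}\n"
          "Verification results:\n" + "\n".join(checks) + "\nRevised answer:"
      )
  ```

  Answering the verification questions in fresh, draft-free contexts is the key design choice: it prevents the model from simply reaffirming its own hallucinations.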
- Sebastian Borgeaud et al. Improving language models by retrieving from trillions of tokens
- Yike Wu et al. KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering
- Yu et al. 2023, Grounding Language Models to Real-World Environments
- Grounding: "language understanding that uses the discriminative ability of LMs instead of their generative ability"
- Swamy et al., 2021, Interpreting Language Models Through Knowledge Graph Extraction
- Generate or drive the creation of KGs from LLMs
- Asai et al., Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
- Huang et al., Large Language Models Cannot Self-Correct Reasoning Yet
- Lukas Berglund et al. The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”
- Encoding clinical knowledge (Nature 2023, Google Research) - Aligning LLMs to new domains
- Certified Reasoning with Language Models
  - Using logical reasoning to guide an LM allows for training on certified self-generated reasoning, which helps avoid hallucinations.
- EMNLP 2023 Workshop - FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
- Shirui Pan et al., 2023, Unifying Large Language Models and Knowledge Graphs: A Roadmap
- Collection of papers and resources about LLMs and KGs (GitHub repo)