EleutherAI / lm-evaluation-harness Public

Fork 1.4k
Star 5.5k

Pull requests: EleutherAI/lm-evaluation-harness

59 Open 995 Closed

Pull requests list

Alghafa benchmark

#1946 opened Jun 11, 2024 by khalil-Hennara

Allow running hugging face models with both data parallelism and model parallelism at once

#1943 opened Jun 10, 2024 by clefourrier

Easier unitxt tasks loading and removal of unitxt library dependency

#1933 opened Jun 6, 2024 by elronbandel

samples is newline delimited

#1930 opened Jun 5, 2024 by baberabb

Prettify lm_eval --tasks list

#1929 opened Jun 5, 2024 by anthony-dipofi • Draft

[New Task] Add Paloma benchmark

#1928 opened Jun 5, 2024 by zafstojano

Multiprompt

#1922 opened Jun 4, 2024 by lintangsutawika • Draft

Confusion matrix metric

#1921 opened Jun 4, 2024 by minaremeli

change openai completions params to fit API documentation

#1919 opened Jun 3, 2024 by artemorloff

mlx Model (loglikelihood & generate_until)

#1902 opened May 29, 2024 by chimezie

add arc_challenge_mt

#1900 opened May 29, 2024 by jonabur

Add LegalBench tasks

#1878 opened May 23, 2024 by zafstojano

Test coverage for optimum_lm.py

#1872 opened May 22, 2024 by zafstojano

Added tests for Anthropic LLMs

#1868 opened May 21, 2024 by zafstojano

Draft - Support ov models via genai

#1862 opened May 20, 2024 by sstrehlk

mmlu-pro for the Italian language

#1860 opened May 19, 2024 by giux78

[WIP] Fix NeuralMagic tests

#1859 opened May 19, 2024 by haileyschoelkopf

Fix m_mmlu target

#1853 opened May 18, 2024 by jordane95

Implement Exams benchmark

#1852 opened May 17, 2024 by snova-zoltanc

Fix self.max_tokens in anthropic_llms.py

#1848 opened May 16, 2024 by lozhn

Adding LLaVa support

#1832 opened May 13, 2024 by ashvinnihalani

Financial PhraseBank (FPB) Eval Metric

#1815 opened May 9, 2024 by bcicc

Fix cost_estimate.py

#1810 opened May 8, 2024 by xksteven

Fix --gen_kwargs and VLLM (temperature not respected) bug

Something isn't working.

#1800 opened May 7, 2024 by haileyschoelkopf

Make scripts.write_out error out when no splits match

#1796 opened May 7, 2024 by haileyschoelkopf
