Add support for `drop_last=False` #96

odelalleau · 2024-02-01T00:24:34Z

Is your feature request related to a problem? Please describe.

The current code generally doesn't support drop_last=False.

Describe the solution you'd like

Proper support for drop_last=False. This could be done in two phases:

1. Support it for validation only
2. Support it for training as well


trias702 · 2024-02-01T00:28:23Z

I see that the current SFT code essentially has drop_last=False for validation. But in a DP setting, how can that work, you won't have equal shards on each DP?
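To make the concern concrete, here is a small illustration. It is not taken from the Aligner code: `shard_sizes` is a hypothetical helper, and the round-robin split is an assumption about how samples might be distributed. It shows that when the dataset size is not a multiple of the number of data-parallel ranks, the ranks end up with different sample counts.

```python
def shard_sizes(num_samples: int, dp_world_size: int) -> list[int]:
    """Number of samples each data-parallel rank receives under a round-robin split.

    Hypothetical helper for illustration only.
    """
    return [
        len(range(rank, num_samples, dp_world_size))
        for rank in range(dp_world_size)
    ]


sizes = shard_sizes(num_samples=10, dp_world_size=4)
print(sizes)  # [3, 3, 2, 2]
```

With uneven counts like these, some ranks would run one more step than others, which is where collective operations can hang or fail.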

odelalleau · 2024-02-01T01:08:19Z

> I see that the current SFT code essentially has drop_last=False for validation. But in a DP setting, how can that work, you won't have equal shards on each DP?

Yeah, you may need to pad with dummy samples whose loss would be masked out.
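A minimal sketch of that idea, in plain Python and under stated assumptions: `pad_and_shard` and `masked_mean_loss` are hypothetical names, the split is round-robin, and the loop over shards stands in for what would be separate ranks followed by an all-reduce in a real data-parallel run. It is not how the Aligner code does (or would do) it.

```python
import math


def pad_and_shard(samples, dp_world_size, dummy):
    """Pad `samples` to a multiple of `dp_world_size`, then split round-robin.

    Returns one list of (sample, is_real) pairs per rank.
    """
    padded_len = math.ceil(len(samples) / dp_world_size) * dp_world_size
    tagged = [(s, True) for s in samples]
    tagged += [(dummy, False)] * (padded_len - len(samples))
    return [tagged[rank::dp_world_size] for rank in range(dp_world_size)]


def masked_mean_loss(shards, loss_fn):
    """Average the loss over real samples only; dummy samples are masked out."""
    total, count = 0.0, 0
    for shard in shards:  # stand-in for per-rank work followed by an all-reduce
        for sample, is_real in shard:
            loss = loss_fn(sample)  # every rank still does the same amount of work
            if is_real:
                total += loss
                count += 1
    return total / count if count else 0.0


shards = pad_and_shard(samples=[1.0, 2.0, 3.0, 4.0, 5.0], dp_world_size=2, dummy=0.0)
print([len(s) for s in shards])               # [3, 3]
print(masked_mean_loss(shards, lambda x: x))  # 3.0
```

Every rank sees the same number of samples, and the reported mean still covers exactly the five real samples.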

trias702 · 2024-02-01T01:11:19Z

> > I see that the current SFT code essentially has drop_last=False for validation. But in a DP setting, how can that work, you won't have equal shards on each DP?
>
> Yeah, you may need to pad with dummy samples whose loss would be masked out.

That sounds like a fairly complex solution that could be cumbersome to maintain. I think the easier and generally acceptable fix here is to make all validation use drop_last=True in Aligner. We already do that for DPO and RM.
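For a sense of what drop_last=True costs, here is a rough sketch of the arithmetic. `kept_and_dropped` is a hypothetical helper, and it assumes the trailing partial global batch is the only thing discarded.

```python
def kept_and_dropped(num_samples: int, global_batch_size: int) -> tuple[int, int]:
    """Samples kept and dropped when the last partial global batch is discarded.

    Hypothetical helper for illustration only.
    """
    kept = (num_samples // global_batch_size) * global_batch_size
    return kept, num_samples - kept


print(kept_and_dropped(num_samples=1000, global_batch_size=64))  # (960, 40)
```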

odelalleau · 2024-02-01T01:18:37Z

But there may be cases where people don't want to drop samples, for instance if you're reporting metrics over an "official" validation set on a common benchmark.

Edit: but I agree the main priority is to avoid crashing by default, so if our current SFT code actually uses drop_last=False by default and crashes because of that, we should instead make it use drop_last=True.
