Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Librilight Preprocess Scripts Revision #125

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

HarryHe11
Copy link
Collaborator

@HarryHe11 HarryHe11 commented Jan 29, 2024

✨ Description

This PR aims to enhance the compatibility and efficiency of the librilight preprocessing scripts with the new NS2 framework to be updated by @HeCheng0625. By updating the librilight preprocessing scripts, we ensure they align with the new NS2 dataloader's expectations, facilitating smoother NS2 training processes. Besides, this PR also improves the usability of the original librilight preprocess scripts.

🚧 Related Issues

#115

πŸ‘¨β€πŸ’» Changes Proposed

  • Updated librilight preprocess scripts for better alignment with the NS2 dataloader, ensuring compatibility and efficiency.
  • Enhanced the original librilight preprocess scripts, improving usability.

πŸ§‘β€πŸ€β€πŸ§‘ Who Can Review?

@HeCheng0625 @lmxue

πŸ›  TODO

  • Conduct a comprehensive test of the entire NS2 training pipeline with the updated preprocessing scripts, as coordinated with @HeCheng0625, to validate the enhancements and ensure seamless integration and functionality within the new NS2 framework to be updated.

βœ… Checklist

  • Code has been reviewed
  • Code complies with the project's code standards and best practices
  • Code has passed all tests
  • Code does not affect the normal use of existing features
  • Code has been commented properly
  • Documentation has been updated (if applicable)
  • Demo/checkpoint has been attached (if applicable)

Copy link
Collaborator

@lmxue lmxue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your efforts. Please check the comments.

or not os.path.exists(mfa_model_path)
or not os.path.exists(mfa_config_path)
):
if not os.path.exists(mfa_dict_path) or not os.path.exists(mfa_model_path):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend modifying this to separately check if different files exist. Otherwise, it's unclear which specific file is missing.

config/tts.json Show resolved Hide resolved
Comment on lines +141 to +142
if "librilight" in cfg.dataset:
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two lines of code imply that the LibriLight dataset does not undergo subsequent feature extraction; does it use online feature extraction? Can we add a condition here to determine whether the feature extraction method is online extraction or pre-extraction? This means our system will support two types of feature extraction methods.
P.S., we need to integrate the online feature extraction process later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants