This new package retrieves metadata from YouTube videos and can combine that metadata with the video's transcript. When using large numbers of videos for RAG, the transcript alone was not enough. Metadata, in particular the tags but also statistics such as hit count, date, and comment count, was extremely useful for quick and effective use with LLMs. It can also easily be combined with an LLM-generated summary of the transcript, which I chose not to include here but will include when I create my example using this tool.
It uses the youtube_transcript_api package and needs a Google API key.
It does not use the existing llama-index-readers-youtube-transcript as using youtube_transcript_api directly was straightforward.
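To illustrate the kind of combination described above, here is a minimal sketch, not the code in this PR. The function name `build_document` is hypothetical. It assumes the metadata arrives as one `items[]` entry from a YouTube Data API v3 `videos.list` response (with `snippet` and `statistics` parts) and that the transcript is a list of dicts with a `text` key, the shape youtube_transcript_api has historically returned. Fetching is left out so the sketch runs without network access or an API key.

```python
from typing import Any


def build_document(item: dict[str, Any], segments: list[dict[str, Any]]) -> str:
    """Merge one video's metadata and transcript into a single text blob.

    `item` is assumed to be one entry of a videos.list response;
    `segments` is assumed to be a list of {"text": ..., ...} dicts.
    Both shapes are assumptions made for this illustration.
    """
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})

    # Metadata header: tags and basic statistics go first so a retriever
    # or LLM sees them before the (possibly long) transcript.
    header = [
        f"Title: {snippet.get('title', '')}",
        f"Published: {snippet.get('publishedAt', '')}",
        f"Tags: {', '.join(snippet.get('tags', []))}",
        f"Views: {stats.get('viewCount', 'n/a')}",
        f"Comments: {stats.get('commentCount', 'n/a')}",
    ]

    # Join the transcript segments into one block of text.
    transcript = " ".join(seg["text"].strip() for seg in segments)
    return "\n".join(header) + "\n\nTranscript:\n" + transcript
```

Putting the metadata in a plain-text header keeps the result usable as a single document for indexing; a real reader would more likely attach these fields as structured document metadata instead.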
This is my first time contributing to an open source project. I tried hard to follow the guidelines, but if anything is not correct I am happy to fix it.
Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?
Version Bump?
Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)
Type of Change
Please delete options that are not relevant.
How Has This Been Tested?
I have used this as part of a larger project to collect and label thousands of YouTube videos from various creators. I also had another programmer and Copilot look it over, and I wrote a separate test app that calls it remotely to check that it works properly.
Suggested Checklist:
make format; make lint to appease the lint gods