🔍 Meta Descriptions - Backfill all old rules with good descriptions #1307

Closed
5 tasks done
Tracked by #1293
bradystroud opened this issue Apr 17, 2024 · 6 comments
Labels
Type: Feature A suggested idea for this project

Comments

@bradystroud

bradystroud commented Apr 17, 2024

CC @bradystroud @JackDevAU @Aibono1225 @KristenHu
Write a script that goes through all the rules and adds a meta description based on the rule content (tip: use AI to help speed this up).

Warning

Using ChatGPT could be expensive (we need a description for 3000+ rules). Consider other options, e.g. a local LLM.
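
A minimal sketch of what such a backfill script could look like, not the script that was actually written for this issue. It assumes each rule lives in a file named `rule.md` with YAML frontmatter between `---` lines, and that the description goes into a frontmatter key called `seoDescription`; both names are assumptions. The `generate` callable is a hypothetical stand-in for whatever model produces the text (ChatGPT, a local LLM, etc.).

```python
from pathlib import Path
from typing import Callable

FIELD = "seoDescription"  # assumed frontmatter key, not confirmed by the issue


def add_description(text: str, generate: Callable[[str], str]) -> str:
    """Return `text` with a description line added to its frontmatter.

    Text without frontmatter, or with the field already present,
    is returned unchanged so the script is safe to re-run.
    """
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        return text
    try:
        # Index of the line that closes the frontmatter block.
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return text  # unterminated frontmatter: leave the file alone
    if any(line.startswith(FIELD + ":") for line in lines[1:end]):
        return text
    body = "\n".join(lines[end + 1:])
    # Collapse whitespace so the description fits on one frontmatter line.
    description = " ".join(generate(body).split())
    escaped = description.replace("\\", "\\\\").replace('"', '\\"')
    lines.insert(end, f'{FIELD}: "{escaped}"')
    return "\n".join(lines)


def backfill(root: Path, generate: Callable[[str], str]) -> int:
    """Add descriptions to every rule.md under `root`; return files changed."""
    changed = 0
    for path in sorted(root.rglob("rule.md")):  # assumed file name
        original = path.read_text(encoding="utf-8")
        updated = add_description(original, generate)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Keeping the model behind a plain callable means the same loop works whether the text comes from a paid API or a local model, which is the cost trade-off raised in the warning.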

Tasks

Even though this will only be run once, store the script in the repo.

As per my conversation with @JackDevAU and @Aibono1225, we considered doing this 10 rules at a time, but this would take too long. Since this is urgent, it is better to ship the rules with unchecked generated descriptions and refine them over time.

@bradystroud changed the title from "Backfill all old rules with good descriptions based off the rule content (tip: use AI to help speed this up)" to "🔍 Meta Descriptions - Backfill all old rules with good descriptions" on Apr 17, 2024
@JackDevAU

As per my conversation with @bradystroud we are going to wait until the Website team completes this first:
SSWConsulting/SSW.Website#2594

@bradystroud self-assigned this on Jun 3, 2024
@bradystroud

Update:

Found a cool solution
https://www.youtube.com/watch?v=e4V-heTEpE8 (11 min)

I'm still working on getting it perfect :)

@bradystroud

Update:

Picked this up again today.

The script is being pushed to this branch:
https://github.com/SSWConsulting/SSW.Rules.Content/tree/seo-descriptions

I added some code to check each description after it is generated, to ensure it's not terrible:

| Issue Description | Explanation |
| --- | --- |
| Exceeds 300 characters | 150 chars is recommended, but that is too hard for an AI to follow because it can't count |
| Contains the phrase 'Here is the ...' | The AI sometimes adds "here is the description I've generated for you" |
| Contains 'I've generated' | Similar to above, catching "I've generated", which is a formality from the AI |
| Contains odd characters `*` or `_` | Odd characters, normally markdown syntax like asterisks (`*`) or underscores (`_`) |

If a rule has issues, it's added to a log file.

All the rules in the log file will need to be dealt with later.
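
The checks described above could be sketched roughly like this. This is an illustration, not the code on the branch: the function names and the log line format are assumptions, and the matching is case-insensitive and also accepts a curly apostrophe, which goes slightly beyond what is stated above.

```python
from typing import List

MAX_LENGTH = 300  # 150 is the recommendation; 300 is the limit that is enforced


def find_issues(description: str) -> List[str]:
    """Return the list of problems found in a generated description."""
    issues = []
    # Normalise case and curly apostrophes before phrase matching.
    lowered = description.lower().replace("\u2019", "'")
    if len(description) > MAX_LENGTH:
        issues.append("Exceeds 300 characters")
    if "here is the" in lowered:
        issues.append("Contains the phrase 'Here is the ...'")
    if "i've generated" in lowered:
        issues.append("Contains 'I've generated'")
    if "*" in description or "_" in description:
        issues.append("Contains odd characters * or _")
    return issues


def format_log_entry(rule_path: str, issues: List[str]) -> str:
    """Build one log line for a rule (hypothetical format)."""
    return f"{rule_path}: {'; '.join(issues)}"


def log_if_needed(rule_path: str, description: str, log_path: str) -> bool:
    """Append the rule to the log file when it has issues; return True if logged."""
    issues = find_issues(description)
    if not issues:
        return False
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(format_log_entry(rule_path, issues) + "\n")
    return True
```

Checking the output with plain string rules is cheap and deterministic, so it can run on every description without another model call.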

@bradystroud

This morning I merged in the changes to 3000+ rules at once.

Rules has a build step that makes a copy of the history, and this started failing because of the 3000+ changed files.
I tried to resolve this by skipping that build step in #1365, but that caused more problems.

I need to undo the commit that added the changes, then submit the changes in chunks (100 rules at a time).
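
Splitting the changed files into batches of 100 could be sketched as below. The helper name `chunked` and the example paths are hypothetical, and how each batch is then committed or merged is left out because it isn't described here.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Example: 250 hypothetical rule paths become batches of 100, 100 and 50.
paths = [f"rules/rule-{i}/rule.md" for i in range(250)]
batch_sizes = [len(batch) for batch in chunked(paths)]
```

Each batch can then go through the build on its own, so a failure only affects at most 100 rules instead of the whole backfill.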

@bradystroud

Update - this is taking longer than expected due to a few new issues
#1368 #1367

@bradystroud

bradystroud commented Jun 11, 2024

🥳 Done! (mostly)

I have shipped 3,150 rules with descriptions - the remaining ones are rules the AI struggled to generate a description for.
I have moved these to a new issue #1378

https://github.com/SSWConsulting/SSW.Rules.Content/blob/main/scripts/generateSeoDescriptions/seo_issues.log

Figure: ChatGPT rule has a generated description

I have also created a new issue to do the same for categories #1377
