Computer-Science-Club-SCSU-University/byteme

Byte Me README


Made By Calvin Schmeichel for "Byte Me" (Team 5) | SCSU Hackathon Fall 2023

Table of Contents


1. Overview / TLDR


  • For our Fall 2023 Hackathon project topic we chose web scraping.
  • All SCSU students have to take General Liberal Education courses. For many, finding courses that make financial sense and are also fun and appealing can be hard. So we set out to fix it.
  • Our solution leverages web scraping to gather a student's Degree Audit and Unofficial Transcript and determine which classes the student has completed. We then feed that data into our in-house points-based algorithm, which ranks the remaining courses by how many unfulfilled goal areas each one satisfies, and generate a custom report to help the student plan future semesters.

2. Introduction


The main problem is a lack of guidance and information about the courses available to students. It's often unclear which classes provide more value to the student or which classes have a less favorable professor.

Liberal_Education_and_Graduation_Requirements

**(Figure 1)** SCSU requires students to take classes from 10 separate goal areas plus Diversity and Cultural requirements. The university conveniently hosts all of this information on its website. This is the foundation for our web scraper.
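As a sketch, the goal areas can be modeled as a simple mapping from goal number to name. The names below are illustrative assumptions for demonstration; the authoritative list is the one the scraper pulls from the university's website.

```python
# Illustrative sketch: modeling goal areas as a dict keyed by goal number.
# The names below are assumptions, not scraped data.
GOAL_AREAS = {
    1: "Communication",
    2: "Critical Thinking",
    3: "Natural Sciences",
    4: "Mathematical/Logical Reasoning",
    5: "History and the Social & Behavioral Sciences",
    6: "Humanities and Fine Arts",
    7: "Human Diversity",
    8: "Global Perspective",
    9: "Ethical and Civic Responsibility",
    10: "People and the Environment",
}

def goals_remaining(completed: set) -> list:
    """Return the goal numbers the student still needs, in ascending order."""
    return sorted(set(GOAL_AREAS) - completed)
```

A structure like this is what the parser steps below fill in from the scraped audit.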

Unofficial Transcript Example

**Student Information**

| Field | Value |
| --- | --- |
| Name | John Doe |
| Student ID | 123456789 |
| Major | Computer Science |
| Academic Advisor | Dr. Jane Smith |

**Academic Record**

| Term | Course Code | Course Title | Credits | Grade |
| --- | --- | --- | --- | --- |
| Fall 2019 | CS 101 | Intro to Computing | 4 | A |
| Fall 2019 | MATH 150 | Calculus I | 4 | B+ |
| Spring 2020 | CS 102 | Data Structures | 4 | A- |
| Spring 2020 | ENG 101 | English Composition | 3 | B |
| Fall 2020 | CS 201 | Algorithms | 4 | B+ |
| Fall 2020 | STAT 200 | Statistics | 3 | A |
| Spring 2021 | CS 301 | Operating Systems | 4 | A- |
| Spring 2021 | PHIL 105 | Ethics in Tech | 3 | B+ |

**Cumulative Information**

| Field | Value |
| --- | --- |
| Total Credits Earned | 29 |
| Cumulative GPA | 3.55 |

Note: This is an unofficial transcript.
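A minimal sketch of turning one academic-record row like those above into a structured record. The regex and field names are our own illustration, not the project's actual parser:

```python
import re

# Hypothetical sketch: parse one transcript row such as
# "Fall 2019 CS 101 Intro to Computing 4 A" into a dict.
ROW_RE = re.compile(
    r"^(?P<term>(?:Fall|Spring|Summer) \d{4}) "  # e.g. "Fall 2019"
    r"(?P<code>[A-Z]+ \d+) "                     # e.g. "CS 101"
    r"(?P<title>.+) "                            # course title
    r"(?P<credits>\d+) "                         # credit count
    r"(?P<grade>[A-F][+-]?)$"                    # letter grade
)

def parse_row(line: str) -> dict:
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized transcript row: {line!r}")
    rec = m.groupdict()
    rec["credits"] = int(rec["credits"])
    return rec
```

Real scraped text is messier than this (see the parser section below), but the idea is the same: recover course codes and grades from flat text.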

Product Data Pipeline

Sketch/Draft

Pipeline

Final

```mermaid
graph TD;
    A["Browser & Username + Password"] --> B["API Request"];
    B --> C["Functions Kickoff"];
    C --> D["Degree Audit Parser"];
    C --> E["Transcript Parser"];
    E --> F["Text Formatter"];
    F --> G["Points Formula"];
    D --> G
    G --> H["API Response"]
    H --> A
```
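The pipeline graph above can be sketched as plain function composition. The function names mirror the graph nodes; the bodies are illustrative stand-ins, not the project's real implementations:

```python
# Stand-in sketch mirroring the pipeline graph; each stage body is a
# placeholder with made-up data, kept only to show how the stages connect.
def degree_audit_parser(session):
    return {"goals_left": ["Goal 3", "Goal 8"]}  # illustrative

def transcript_parser(session):
    return "Fall 2019 CSCI 201 A-"  # illustrative raw scrape text

def text_formatter(raw):
    return raw.split()

def points_formula(audit, tokens):
    # The real step scores candidate courses against audit["goals_left"].
    return {"goals_left": audit["goals_left"], "tokens": tokens}

def handle_api_request(session):
    """Functions kickoff: fan out to both parsers, join at the points formula."""
    audit = degree_audit_parser(session)
    tokens = text_formatter(transcript_parser(session))
    return points_formula(audit, tokens)  # becomes the API response
```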

3. Demo


1 | Start StarID & Login

Sketch/Draft

WebGUIDraft

Final

GUIDEMO

// Populate the accordion with the report data returned by the API
function populateAccordion(data) {
    const accordionContainer = document.getElementById('accordTemplate');

    Object.keys(data).forEach((mainCategory, mainIndex) => {
        let subAccordions = '';
        const mainAccordionId = `collapseMain${mainIndex}`;

        data[mainCategory].forEach((subCategoryObj) => {
            const subCategoryKey = Object.keys(subCategoryObj)[0];
            const subCategoryData = subCategoryObj[subCategoryKey];
            const subAccordionId = `collapseSub${mainIndex}${subCategoryKey.replace(
                /[^a-zA-Z0-9]/g,
                ''
            )}`;

            subAccordions += `
                <div class="accordion-item">
                    <h2 class="accordion-header" id="subHeading${subCategoryKey}${mainIndex}">
                        <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#${subAccordionId}" aria-expanded="false" aria-controls="${subAccordionId}">
                           ${subCategoryKey}
                        </button>
                    </h2>
                    <div id="${subAccordionId}" class="accordion-collapse collapse" aria-labelledby="subHeading${subCategoryKey}${mainIndex}">
                        <div class="accordion-body">
                            Goal: ${subCategoryData.join(', ')}
                        </div>
                    </div>
                </div>
            `;
        });

        const accordionItem = `
            <div class="accordion-item">
                <h2 class="accordion-header" id="heading${mainIndex}">
                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#${mainAccordionId}" aria-expanded="false" aria-controls="${mainAccordionId}">
                        Goal: ${mainCategory}
                    </button>
                </h2>
                <div id="${mainAccordionId}" class="accordion-collapse collapse" aria-labelledby="heading${mainIndex}">
                    <div class="accordion-body">
                        <div class="accordion" id="subAccordion${mainIndex}${mainCategory}">
                            ${subAccordions}
                        </div>
                    </div>
                </div>
            </div>
        `;

        accordionContainer.innerHTML += accordionItem;
    });
}
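`populateAccordion` expects `data` keyed by main goal area, with each key mapping to a list of single-key sub-category objects whose values are lists of goal strings. A hypothetical Python-side sketch of building that shape (category and course names below are made up):

```python
import json

# Hypothetical sketch of the JSON shape populateAccordion consumes:
# { mainCategory: [ { subCategory: [goal strings] }, ... ], ... }
# All names below are illustrative, not real scraped data.
report = {
    "Goal 3 - Natural Sciences": [
        {"BIOL 101": ["Goal 3", "Goal 10"]},
        {"CHEM 105": ["Goal 3"]},
    ],
    "Goal 8 - Global Perspective": [
        {"HIST 140": ["Goal 5", "Goal 8"]},
    ],
}

payload = json.dumps(report)  # what the API response would carry to the browser
```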

WebGUIReportDEMO

2 | Audit and Degree Map Scrape

grab_audit.py

# Excerpt; assumes a Selenium `driver` already on the StarID login page,
# with `username` and `password` collected earlier in the script.
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException, TimeoutException

move_on = input("Move on 0 -->")  # manual pause until the page is ready
try:
    input_field_username = driver.find_element(By.XPATH, '/html/body/div/div[2]/div[1]/div[1]/form/table/tbody/tr[1]/td/input')
    input_field_password = driver.find_element(By.XPATH, '/html/body/div/div[2]/div[1]/div[1]/form/table/tbody/tr[2]/td/input')
    input_field_username.send_keys(username)
    input_field_password.send_keys(password)
    button_on_page = driver.find_element(By.XPATH, '/html/body/div/div[2]/div[1]/div[1]/form/table/tbody/tr[5]/td[2]/input')
    button_on_page.click()
except NoSuchElementException as e:
    print(f"Error: {e}")
except TimeoutException as e:
    print(f"Page load timed out: {e}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

grab_schedule.py

from selenium.webdriver.common.by import By  # excerpt; reuses the logged-in `driver`

move_on = input("Move on 1 -->")
button_on_page = driver.find_element(By.XPATH, '/html/body/div[2]/div[2]/div[1]/div[1]/ul/li[4]/a')
button_on_page.click()
move_on = input("Move on 2 -->")
button_on_page = driver.find_element(By.XPATH, '/html/body/div[2]/div[1]/div[2]/ul/li[3]/a')
button_on_page.click()
move_on = input("Move on 3 -->")
# Execute a script that will change all elements with 'display: none' to 'display: block'
driver.execute_script("""
var elements = document.querySelectorAll('[style*="display: none"]');
for (var i = 0; i < elements.length; i++) {
  elements[i].style.display = 'block';
}
""")

grab_transcript.py

from selenium.webdriver.common.by import By  # excerpt; reuses the logged-in `driver`

move_on = input("Move on 1 -->")
button_on_page = driver.find_element(By.XPATH, '/html/body/div[2]/div[2]/div[1]/div[1]/ul/li[5]/a')
button_on_page.click()
move_on = input("Move on 2 -->")
button_on_page = driver.find_element(By.XPATH, '/html/body/div[2]/div[1]/div/div[2]/div/ul/li[2]/a')
button_on_page.click()
move_on = input("Move on 3 -->")
button_on_page = driver.find_element(By.XPATH, '/html/body/div[2]/div[1]/div/div[2]/div/div[1]/div/div/form/input[2]')
button_on_page.click()
move_on = input("Move on 4 -->")
button_on_page = driver.find_element(By.XPATH, '/html/body/div[2]/div[1]/div/div[2]/div/div[2]/a/img')
button_on_page.click()

Demo GIF

demovideo

3 | Data Parser

DataParserDraft

Degree Audit Parser
import json

def findOptimalCourse(courses: dict, goalLeft: list, userNotCompleted: dict):
    hashMap = {}
    res = []

    for course in courses:
        points = 0
        goalFulfill = courses[course]
        for goals in goalLeft:
            if goals in goalFulfill:
                points += 1
        hashMap[course] = points

    hashMap = dict(sorted(hashMap.items(), key=lambda x: x[1], reverse=True))

    for key, val in hashMap.items():
        if val > 0:
            res.append(key)
    for i, key in enumerate(res):
        if key in courses:
            res[i] = {key: courses[key]}
    courseGoal = {}

    for key in res:
        for val in key.values():
            for item in val:
                if item in courseGoal:
                    courseGoal[item].append(key)
                else:
                    courseGoal[item] = [key]

    newDict = {}

    for key in goalLeft:
        newDict[f"{str(key)} {userNotCompleted[key]}"] = courseGoal[key]
    return json.dumps(newDict)
  • This function builds a list of optimal courses, ranked in descending order by how many of the user's unfulfilled goal areas each course satisfies.
    • :param courses: The JSON of all courses from the scraper
    • :type courses: dict
    • :param goalLeft: Goal areas the user still needs to fulfill
    • :type goalLeft: list
    • :param userNotCompleted: Descriptions of the unfulfilled goal areas, keyed by goal
    • :type userNotCompleted: dict
    • :return: A JSON string of all the optimal courses the user can take
    • :rtype: str
Transcript Parser
def GetDegreeAudit(DegreeAuditFile):
    with open(DegreeAuditFile, 'r') as file:
        file_contents = file.read()   
    CoursesTaken = file_contents.split()
    return(CoursesTaken)
  • The GetDegreeAudit function takes the degree audit file name, opens the file, and reads its contents into a string. It then splits the data on whitespace into a list and returns that list to main for further processing.
def GetTakenCourses(CoursesTaken):
    courses = [["CYB"], ["CSCI"], ["..."]]

    flattened_courses = [item for sublist in courses for item in sublist]
    flattened_courses_lower = set(course.lower() for course in flattened_courses)

    filtered_words_with_next = []

    for i in range(len(CoursesTaken) - 1):
        word = CoursesTaken[i]
        if word.isupper() and word.lower() in flattened_courses_lower:
            filtered_words_with_next.append(word)
            filtered_words_with_next.append(CoursesTaken[i + 1])
    return(filtered_words_with_next)
  • The GetTakenCourses function takes the newly generated list and filters out all the "garbage" data from the scrape that we don't need.
def PrintToScreen(filtered_words_with_next):
    
    FinalList = []

    for n in range (0,len(filtered_words_with_next)-1, 2):
        FinalList.append(str(filtered_words_with_next[n])+str(filtered_words_with_next[n+1]))
    return(FinalList)
  • The PrintToScreen function takes the newly cleaned list and formats the final list in a standardized way for our other Python scripts to process.
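To illustrate the three steps above end to end, here is a self-contained sketch that inlines simplified versions of them. The subject prefixes and sample transcript text are illustrative, not the project's real course list:

```python
# Simplified sketch of the transcript-parsing pipeline above.
# Subject prefixes and the sample text are illustrative assumptions.
PREFIXES = {"cyb", "csci", "math"}

def get_taken_courses(tokens):
    """Keep each recognized uppercase subject prefix plus the token after it."""
    kept = []
    for i in range(len(tokens) - 1):
        word = tokens[i]
        if word.isupper() and word.lower() in PREFIXES:
            kept.append(word)
            kept.append(tokens[i + 1])
    return kept

def format_courses(pairs):
    """Join (prefix, number) pairs into course codes like 'CSCI201'."""
    return [pairs[n] + pairs[n + 1] for n in range(0, len(pairs) - 1, 2)]

tokens = "Fall 2019 CSCI 201 A- Spring 2020 MATH 221 B+".split()
codes = format_courses(get_taken_courses(tokens))
```

With this input, `codes` comes out as `["CSCI201", "MATH221"]`, the standardized form the later scripts consume.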

4 | Points System

import json

def findOptimalCourse(courses: dict, goalLeft: list):
    hashMap = {}
    res = []
    for course in courses:
        points = 0
        goalFulfill = courses[course]
        for goals in goalLeft:
            if goals in goalFulfill:
                points += 1
        hashMap[course] = points

    hashMap = dict(sorted(hashMap.items(), key=lambda x: x[1], reverse=True))

    for key, val in hashMap.items():
        if val > 0:
            res.append(key)

    return json.dumps({"optimalCourses": res})

The essence of the function can be represented by the equation:

$$O = \max\Big(\sum_{c} G_c\Big)$$

Where:

  • $O$ represents the optimal set of courses.
  • $G_c$ denotes the goals fulfilled by course $c$.
  • The summation $\sum_{c} G_c$ sums the goals fulfilled by each course.
  • The $\max$ function selects the combination of courses that maximizes the total number of fulfilled goals.
  • Once the class scores are fully calculated, the data is exported as JSON.
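A small worked example of the points formula: each course earns one point per still-unfulfilled goal area it satisfies, and courses are then ranked by score. The course names and goal labels below are made up for illustration:

```python
# Illustrative worked example of the points system. Each course earns one
# point per still-unfulfilled goal area it satisfies; courses are then
# ranked by score. Names and goals below are made up.
courses = {
    "HIST 140": ["Goal 5", "Goal 8"],
    "BIOL 101": ["Goal 3"],
    "PHIL 105": ["Goal 6", "Goal 9"],
}
goal_left = ["Goal 3", "Goal 5", "Goal 8"]

scores = {
    course: sum(1 for g in goal_left if g in fulfilled)
    for course, fulfilled in courses.items()
}
# HIST 140 scores 2 (Goals 5 and 8), BIOL 101 scores 1, PHIL 105 scores 0.
ranked = [c for c, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True) if s > 0]
```

Here `ranked` is `["HIST 140", "BIOL 101"]`: PHIL 105 is dropped because it fulfills none of the remaining goals, matching the `val > 0` filter in `findOptimalCourse`.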

5 | Display Data

4. Technologies


5. Looking Forward


Business Aspect

This product would add great value to students and SCSU as a whole. It would allow students to find more classes they will enjoy at a better value, which in turn would lead to better student retention at SCSU.

Feasibility

  • This product could be implemented directly into D2L or e-services.
  • Long-term, we would want to grab this data directly from a database that SCSU hosts rather than scraping the public web.
  • In the long run this could be very profitable for the university.

Future Works & Additions

  • Integrate Rate My Professors into our points system to add more data points to our generated report.
  • Connect our product to e-services to gain access to class start times, dates, and semester availability for a more detailed generated report.
  • Expand our product's scope to all majors offered at SCSU, not just General Liberal Education courses.
  • Expand to all courses and implement a questionnaire/chatbot to help students gauge which elective courses they might be interested in.

6. Glossary


  • Web Scraping: The process of using bots to extract content and data from a website.
  • API: Application Programming Interface.
  • Text Parsing/Parser: Task that separates the given series of text into smaller components based on some rules.
  • GUI: Graphical User Interface

7. Contributors