Skip to content

A Python tool to scrape and count U.S. subsidiaries from SEC filings. It utilizes requests and BeautifulSoup to parse subsidiary information, handling multiple tables and providing state-wise counts.

Notifications You must be signed in to change notification settings

Ezaz-Ahmad/Data-scraping

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

Description This project is a Python-based data scraper designed to extract information about U.S. subsidiaries from a specified SEC filing webpage. It uses the requests and BeautifulSoup libraries to fetch and parse HTML content, specifically targeting tables that list subsidiary information under the "Jurisdiction" section. The script is capable of handling multiple tables on a single page and counts the number of branches in each U.S. state.

Setup and Installation: Prerequisites: Python 3.x pip (Python package manager) Libraries Used: requests beautifulsoup4

Clone the repository: git clone https://github.com/your-username/your-repository-name.git

Install the required Python packages: pip install requests beautifulsoup4

Usage To run the script, execute the following command in the terminal: python path/to/scraper.py

When prompted, enter the URL of the SEC filing page you wish to scrape.

Features Extracts subsidiary data from SEC filing webpages. Counts the number of branches in each state in the U.S. Handles multiple tables on a single page. Contributing Contributions to this project are welcome. Please ensure that your code adheres to the project's coding standards and include tests for new features or bug fixes.

License MIT License - Feel free to use, modify, and distribute this software as per the terms of the license.

This script is for educational purposes only. Users are responsible for ensuring that their use of the script complies with the terms of service of the scraped website and applicable laws.I am not responsible for any misuse or legal implications arising from the use of this script.

About

A Python tool to scrape and count U.S. subsidiaries from SEC filings. It utilizes requests and BeautifulSoup to parse subsidiary information, handling multiple tables and providing state-wise counts.

Topics

Resources

Stars

Watchers

Forks

Languages