Instagram_Crawler

Extract Data From Instagram Using Selenium/Python.

Detail Info • Description • Install Libraries • Get Started • Architecture • Stack • Contribute

Description

Instagram Crawler is a Python module for crawling Instagram data.

⚠️ Instagram stops loading posts once a certain number of them have been accessed, so only about 100 to 300 posts can be crawled in one run.
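One way to cope with this limit is to stop as soon as scrolling no longer yields new posts, instead of waiting for a target count that may never be reached. The sketch below is not taken from this project: `collect_posts`, `load_more`, and `max_stalls` are hypothetical names, and `load_more` stands in for whatever Selenium code scrolls the page and returns the post links currently visible.

```python
from typing import Callable, Iterable, List


def collect_posts(load_more: Callable[[], Iterable[str]],
                  extract_num: int,
                  max_stalls: int = 3) -> List[str]:
    """Collect up to extract_num unique post links.

    load_more is a hypothetical callback: each call scrolls the page and
    returns the post links visible afterwards. Collection stops early when
    max_stalls consecutive calls bring nothing new, which is how the
    loading limit described above would appear to the crawler.
    """
    seen: List[str] = []   # links in the order they were first seen
    known = set()          # same links, for fast membership checks
    stalls = 0
    while len(seen) < extract_num and stalls < max_stalls:
        before = len(seen)
        for link in load_more():
            if link not in known and len(seen) < extract_num:
                known.add(link)
                seen.append(link)
        # Reset the stall counter whenever a call produced something new.
        stalls = stalls + 1 if len(seen) == before else 0
    return seen
```

With a loop of this shape, asking for 1000 posts would return whatever could be loaded (a few hundred at most, per the warning above) rather than hanging.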

Install

Simply run:

pip install -r requirements.txt

You can also install additional dependencies (for running examples, generating documentation, etc.) with :

⚠️ Python ≥ 3.6 required

Get Started

To get a taste of the crawler, run main.py with the following command, replacing each bracketed placeholder with your own value. The full source of main.py follows the command.

$ python3 main.py --id=[user_id] \
  --password=['user_password'] \
  --hash_tag=[hash_tag] \
  --display=[0 or 1] \
  --extract_num=[extract_num: int] \
  --login_option=[instagram or facebook] \
  --extract_file=[file name] \
  --extract_tag_file=[tag file name] \
  --driver_path=[chromedriver path]

# -*- coding:utf-8 -*-

import argparse
from instagram_crawler.metadata import EXTRACT_NUM, LOGIN_OPTION, SAVE_FILE_NAME, SAVE_FILE_NAME_TAG
from instagram_crawler.extract_data import crawling_instagram


parser = argparse.ArgumentParser(description='Crawling Instagram Post - Comment',
                                 formatter_class=argparse.RawTextHelpFormatter)


def get_arguments():
    parser.add_argument("--driver_path", 
                        help="selenium chrome driver path", 
                        required=True, type=str)

    parser.add_argument("--id", 
                        help="instagram or facebook id", 
                        required=True, type=str)

    parser.add_argument("--password", 
                        help="instagram or facebook password", 
                        required=True, type=str)

    parser.add_argument("--hash_tag", 
                        help="The hashtag you want to extract.", 
                        required=True, type=str)

    parser.add_argument("--display",
                        help="whether to display the selenium chrome browser: 0 or 1",
                        required=True, type=int)

    parser.add_argument("--extract_num",
                        help="The number of posts to extract.",
                        default=EXTRACT_NUM, type=int)

    parser.add_argument("--login_option", 
                        help="select login account [facebook, instagram]", 
                        default=LOGIN_OPTION, type=str)

    parser.add_argument("--extract_file",
                        help="set extract file name", 
                        default=SAVE_FILE_NAME, type=str)

    parser.add_argument("--extract_tag_file",
                        help="set extract tag file name", 
                        default=SAVE_FILE_NAME_TAG, type=str)

    _args = parser.parse_args()

    return _args


def instagram_main():
    args = get_arguments()
    is_file_save, is_tag_file_save = crawling_instagram(args=args)

    if is_file_save:
        print("file save success - {}".format(args.extract_file))

    if is_tag_file_save:
        print("file save success - {}".format(args.extract_tag_file))


if __name__ == "__main__":
    instagram_main()
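To see how these options resolve without launching a browser, the parser can be exercised on an explicit argument list. The sketch below rebuilds a reduced version of the parser in a hypothetical `build_parser` function. The default values are placeholders, since the real ones live in `instagram_crawler.metadata` and are not shown here, and the `choices` constraints are an addition of this sketch.

```python
import argparse

# Placeholder defaults: the real values come from instagram_crawler.metadata
# and are not shown in this README.
EXTRACT_NUM = 100
LOGIN_OPTION = "instagram"


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with a subset of the main.py options."""
    parser = argparse.ArgumentParser(description="Crawling Instagram Post - Comment")
    parser.add_argument("--hash_tag", required=True, type=str)
    # choices is an addition in this sketch; the original accepts any int.
    parser.add_argument("--display", required=True, type=int, choices=[0, 1])
    parser.add_argument("--extract_num", default=EXTRACT_NUM, type=int)
    parser.add_argument("--login_option", default=LOGIN_OPTION, type=str,
                        choices=["instagram", "facebook"])
    return parser


# Passing a list to parse_args() bypasses sys.argv, which is handy for tests.
args = build_parser().parse_args(["--hash_tag=coffee", "--display", "0"])
print(args.hash_tag, args.display, args.extract_num, args.login_option)
```

Building the parser inside a function also avoids a pitfall of the module-level parser above: calling `get_arguments()` twice would try to register the same options again and fail.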

Stack

Pandas

Library used to create the result CSV file.
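As a rough illustration of this step, the sketch below turns a list of post records into CSV text with pandas. The column names and sample rows are invented for the example; the project's actual output columns are not documented here.

```python
import io

import pandas as pd

# Invented sample records; the real crawler's columns may differ.
posts = [
    {"post_id": "example1", "text": "morning #coffee"},
    {"post_id": "example2", "text": "late #coffee #work"},
]

frame = pd.DataFrame(posts, columns=["post_id", "text"])

# Write to an in-memory buffer here; pass a file path such as the value of
# --extract_file to write to disk instead.
buffer = io.StringIO()
frame.to_csv(buffer, index=False)  # index=False omits the row-number column
csv_text = buffer.getvalue()
print(csv_text)
```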

Selenium

Library used to extract Instagram data through the Chrome browser.
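The sketch below shows one plausible way the `--driver_path` and `--display` options could map onto a Selenium Chrome session. It is an assumption, not the project's code: `chrome_arguments` and `build_driver` are hypothetical helpers, the window size is arbitrary, and the meaning given to `--display` (1 shows the browser, 0 runs it headless) is a guess from the option's help text. The driver call uses the Selenium 4 style with a `Service` object; Selenium 3 passed the driver path differently.

```python
from typing import List


def chrome_arguments(display: int) -> List[str]:
    """Translate the --display flag into Chrome command-line arguments.

    Assumption: display=1 shows the browser window and display=0 hides it
    by running Chrome headless.
    """
    arguments = ["--window-size=1280,1024"]
    if display == 0:
        arguments.append("--headless")
    return arguments


def build_driver(driver_path: str, display: int):
    """Start Chrome through Selenium 4.

    Selenium is imported lazily so that chrome_arguments() can be used
    and tested without Selenium or chromedriver installed.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(display):
        options.add_argument(argument)
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

A caller would then do something like `driver = build_driver(args.driver_path, args.display)` and call `driver.quit()` when the crawl finishes.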

Contribute

To contribute, clone the repository, add your code in a new branch, and open a pull request.

Repository: SOMJANG/Instagram_Crawler
