[Feature request] Make async: True do everything under the hood #51

Open
dimitryzub opened this issue May 23, 2023 · 1 comment

@dimitryzub
Contributor

From a user's perspective, the less setup required, the better. I personally find the second example (example.py) more user-friendly, especially for less technical users.

The user just has to add async: True and doesn't need to spend another ~hour tinkering and figuring out how Queue or anything else works.

@jvmvik @ilyazub @hartator what do you guys think?

@aliayar @marm123 @schaferyan have you guys noticed similar issues among users, or have any users requested something similar?


What if instead of this:

# async batch requests: https://github.com/serpapi/google-search-results-python#batch-asynchronous-searches

from serpapi import YoutubeSearch
from queue import Queue
import re

queries = [
    'burly',
    'creator',
    'doubtful'
]

search_queue = Queue()

for query in queries:
    params = {
        'api_key': '...',                 
        'engine': 'youtube',              
        'device': 'desktop',              
        'search_query': query,          
        'async': True,                   # ❗
        'no_cache': 'true'
    }
    search = YoutubeSearch(params)       
    results = search.get_dict()         
    
    if 'error' in results:
        print(results['error'])
        break

    print(f"Add search to the queue with ID: {results['search_metadata']['id']}")
    search_queue.put(results)

data = []

while not search_queue.empty():
    result = search_queue.get()
    search_id = result['search_metadata']['id']

    print(f'Get search from archive: {search_id}')
    search_archived = search.get_search_archive(search_id)
    
    print(f"Search ID: {search_id}, Status: {search_archived['search_metadata']['status']}")

    if re.search(r'Cached|Success', search_archived['search_metadata']['status']):
        for video_result in search_archived.get('video_results', []):
            data.append({
                'title': video_result.get('title'),
                'link': video_result.get('link'),
                # 'channel' may be missing, so fall back to an empty dict
                'channel': (video_result.get('channel') or {}).get('name'),
            })
    else:
        print(f'Requeue search: {search_id}')
        search_queue.put(result)

Users can do something like this and we handle everything under the hood:

# example.py
# testable example
# example import: from serpapi import async_search

from async_search import async_search
import json

queries = [
    'burly',
    'creator',
    'doubtful',
    'minecraft' 
]

# or pass a params dict, as we typically do
data = async_search(queries=queries, api_key='...', engine='youtube', device='desktop')

print(json.dumps(data, indent=2))
print('All searches completed')
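
If we go with the params dict style mentioned in the comment above, the per-query params could be built from one shared dict. A minimal sketch, assuming a hypothetical helper `build_async_params` (not part of the serpapi package) and assuming the query key is `search_query`, as in the YouTube examples here:

```python
# Hypothetical helper for illustration only; not part of the serpapi package.
def build_async_params(base_params, queries, query_key='search_query'):
    """Return one params dict per query, each with async enabled."""
    params_list = []
    for query in queries:
        params = dict(base_params)   # copy so the caller's dict is not mutated
        params[query_key] = query
        params['async'] = True       # always submit asynchronously
        params_list.append(params)
    return params_list


base = {'api_key': '...', 'engine': 'youtube', 'device': 'desktop'}
for params in build_async_params(base, ['burly', 'creator']):
    print(params['search_query'], params['async'])
```

Keeping this step separate from the network calls would also make it easy to test without an API key.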

Under the hood code example:

# async_search.py
# testable example

from serpapi import YoutubeSearch
from queue import Queue
import re

def async_search(queries, api_key, engine, device):
    # Create the queue per call so repeated calls don't share pending searches.
    search_queue = Queue()
    data = []
    for query in queries:
        params = {
            'api_key': api_key,                 
            'engine': engine,              
            'device': device,              
            'search_query': query,          
            'async': True,                  
            'no_cache': 'true'
        }
        search = YoutubeSearch(params)       
        results = search.get_dict()         
        
        if 'error' in results:
            print(results['error'])
            break

        print(f"Add search to the queue with ID: {results['search_metadata']['id']}")
        search_queue.put(results)

    while not search_queue.empty():
        result = search_queue.get()
        search_id = result['search_metadata']['id']

        print(f'Get search from archive: {search_id}')
        search_archived = search.get_search_archive(search_id)
        
        print(f"Search ID: {search_id}, Status: {search_archived['search_metadata']['status']}")

        if re.search(r'Cached|Success', search_archived['search_metadata']['status']):
            for video_result in search_archived.get('video_results', []):
                data.append({
                    'title': video_result.get('title'),
                    'link': video_result.get('link'),
                    # 'channel' may be missing, so fall back to an empty dict
                    'channel': (video_result.get('channel') or {}).get('name'),
                })
        else:
            print(f'Requeue search: {search_id}')
            search_queue.put(result)
            
    return data
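
One thing the loop above leaves open is how often to poll and when to give up. Here is a minimal sketch of just that part, with the archive lookup replaced by a hypothetical `fetch_status` callable so it runs without an API key. The name `poll_until_done`, the `delay` and `max_wait` parameters, and the `'Processing'` placeholder status are assumptions for illustration, not existing serpapi behavior:

```python
import time
from collections import deque


def poll_until_done(search_ids, fetch_status, delay=1.0, max_wait=60.0):
    """Poll each search ID until it reports Cached/Success or max_wait elapses.

    fetch_status is a hypothetical callable: search_id -> status string.
    Returns (finished_ids, pending_ids).
    """
    pending = deque(search_ids)
    finished = []
    deadline = time.monotonic() + max_wait

    while pending and time.monotonic() < deadline:
        search_id = pending.popleft()
        if fetch_status(search_id) in ('Cached', 'Success'):
            finished.append(search_id)
        else:
            pending.append(search_id)   # requeue and try again later
            time.sleep(delay)           # avoid polling in a tight loop

    return finished, list(pending)


# Fake status source for the demo: each ID succeeds on its second check.
calls = {}

def fake_status(search_id):
    calls[search_id] = calls.get(search_id, 0) + 1
    return 'Success' if calls[search_id] >= 2 else 'Processing'

done, left = poll_until_done(['a', 'b'], fake_status, delay=0.01)
print(done, left)
```

In the real wrapper, `fetch_status` would presumably wrap `get_search_archive` and read `search_metadata.status`. Any IDs still pending at the deadline are returned to the caller rather than retried forever.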

Is there a specific reason we haven't done it before?

@aliayar

aliayar commented May 23, 2023

We have received a few requests asking for an easier and faster way to use it, but nothing specific, Dimitriy.
