Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Duckdb query runner #6356

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ashutosh6500
Copy link

What type of PR is this?

  • New Query Runner (Data Source)

Description

This PR adds support for duckdb query runner. DuckDB is an OLAP database used by data professionals, such as data scientists and analysts, to analyze data in a fast and efficient manner . This supports querying tables from duckdb files.

How is this tested?

  • Manually

This query runner is tested with redash docker based setup and environment.For sample .duckdb files, it fetches tables and sample queries are also checked.
image

Related Tickets & Documents

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

@ashutosh6500 ashutosh6500 changed the title Master Added Duckdb query runner Aug 8, 2023
@ashutosh6500
Copy link
Author

I am facing issue while loading schema in schema dashboard. Getting schema refresh failed. For some query runners that works. So how can I do that for this one?

@konnectr konnectr linked an issue Aug 8, 2023 that may be closed by this pull request
@konnectr
Copy link
Collaborator

konnectr commented Aug 8, 2023

I actually like DuckDB. I think it's worth adding it now as a QR. But maybe you should start saving query results to a local instance of DuckDB, and not to main PG.

@konnectr
Copy link
Collaborator

konnectr commented Aug 8, 2023

@ashutosh6500 hI, think there's a description here. https://redash.io/help/open-source/dev-guide/write-a-query-runner .
You need implement get_schema https://github.com/getredash/redash/blob/master/redash/query_runner/__init__.py#L230

@ashutosh6500
Copy link
Author

I actually like DuckDB. I think it's worth adding it now as a QR. But maybe you should start saving query results to a local instance of DuckDB, and not to main PG.

yes

@ashutosh6500
Copy link
Author

requirements.txt Outdated Show resolved Hide resolved
@konnectr
Copy link
Collaborator

konnectr commented Aug 8, 2023

Then we are waiting for you to implement get_schema and tests.

@ashutosh6500
Copy link
Author

Then we are waiting for you to implement get_schema and tests.

I have implemented get_schema now. What about tests? I didn't get actually.

@konnectr
Copy link
Collaborator

konnectr commented Aug 9, 2023

You can take the test as an example, I think from this folder and do it by analogy
https://github.com/getredash/redash/tree/master/tests/query_runner

@konnectr
Copy link
Collaborator

konnectr commented Aug 9, 2023

It is advisable for you to make such a setting for yourself https://github.com/getredash/redash/wiki/Local-development-setup#configuring-pre-commit . And run make format

@konnectr
Copy link
Collaborator

I could do it for you, but then the lines of code will belong to me, not to you. I would like to avoid this !

@justinclift
Copy link
Member

I've just added a new commit to this PR, which should (in theory) fix the formatting. Hopefully the CI tests all pass ok now. 😄

@codecov
Copy link

codecov bot commented Aug 11, 2023

Codecov Report

Attention: 51 lines in your changes are missing coverage. Please review.

Comparison is base (d333660) 60.78% compared to head (3c7722c) 60.59%.
Report is 126 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6356      +/-   ##
==========================================
- Coverage   60.78%   60.59%   -0.20%     
==========================================
  Files         153      154       +1     
  Lines       12527    12596      +69     
  Branches     1701     1712      +11     
==========================================
+ Hits         7614     7632      +18     
- Misses       4687     4738      +51     
  Partials      226      226              
Files Coverage Δ
redash/settings/__init__.py 98.65% <ø> (ø)
redash/query_runner/duckdb.py 26.08% <26.08%> (ø)

@justinclift
Copy link
Member

Well, my commit fixed the formatting ok. It didn't fix the other stuff though. 😇

@konnectr
Copy link
Collaborator

the problem is in ci/cd because it can't find duckdb

Traceback (most recent call last):
  File "/app/manage.py", line 6, in <module>
    from redash.cli import manager
  File "/app/redash/__init__.py", line 53, in <module>
    import_query_runners(settings.QUERY_RUNNERS)
  File "/app/redash/query_runner/__init__.py", line 436, in import_query_runners
    __import__(runner_import)
  File "/app/redash/query_runner/duckdb.py", line 3, in <module>
    import duckdb
ModuleNotFoundError: No module named 'duckdb'

@ashutosh6500
Copy link
Author

the problem is in ci/cd because it can't find duckdb

Traceback (most recent call last):
  File "/app/manage.py", line 6, in <module>
    from redash.cli import manager
  File "/app/redash/__init__.py", line 53, in <module>
    import_query_runners(settings.QUERY_RUNNERS)
  File "/app/redash/query_runner/__init__.py", line 436, in import_query_runners
    __import__(runner_import)
  File "/app/redash/query_runner/duckdb.py", line 3, in <module>
    import duckdb
ModuleNotFoundError: No module named 'duckdb'

will take a look at this. why its not working

@justinclift
Copy link
Member

@ashutosh6500 Any interest in finishing this off? 😄

@guidopetri
Copy link
Collaborator

Closing for now - @ashutosh6500 feel free to reopen

@wearpants
Copy link

I'm interested in picking this up & finishing it off but a few questions:

But maybe you should start saving query results to a local instance of DuckDB, and not to main PG.

Not sure I understand this comment (at least in the context of the PR as it now stands). In my use case, I need to open the duckdb file in read-only mode (since there maybe multiple processes accessing it). Any problem doing so?

@guidopetri guidopetri reopened this Jan 11, 2024
@guidopetri
Copy link
Collaborator

I'm not sure what @konnectr meant either, I was under the impression all query results were saved to the Redash internal postgres db.

For now, I'd be comfortable moving this forward if you merged master in / fixed conflicts and added tests. Thanks for picking this back up, I really appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for DuckDB
5 participants