[REL-CLT] High level description of changes and key-features #2006

Open
Tracked by #1840
blythed opened this issue Apr 26, 2024 · 6 comments

@blythed
Collaborator

blythed commented Apr 26, 2024

No description provided.

@kartik4949
Collaborator

  • Added graph mode
  • Fixed bugs in vector search
  • Fixed bugs in training of LLMs
  • Optimised LLM training
  • Added ray support for model serving
  • Improved vector search with optimisations
  • Added REST server
  • Fixed miscellaneous bugs
  • Improved testing suite
  • Allowed developers to write Listeners and Graphs in a single formalism
  • Simplified developer contracts around models
  • Enabled lazy loading of artifacts for a low memory footprint
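
To picture what lazy loading of an artifact means in the last item, here is a minimal sketch in plain Python. It is not superduperdb code: the `LazyFile` class and the `demo_lazy_file` helper are hypothetical names made up for illustration, and the only idea shown is that a reference holds a path and reads the bytes on first access.

```python
import os
import tempfile


class LazyFile:
    """Hypothetical file reference; bytes are read only on first access."""

    def __init__(self, path):
        self.path = path
        self._data = None  # nothing is read at construction time

    @property
    def loaded(self):
        # True once the file contents have been pulled into memory.
        return self._data is not None

    def read(self):
        # Load the bytes on first use and cache them for later calls.
        if self._data is None:
            with open(self.path, "rb") as f:
                self._data = f.read()
        return self._data


def demo_lazy_file():
    # Write a small temporary file, then reference it lazily.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"artifact bytes")
    try:
        ref = LazyFile(tmp.name)
        before = ref.loaded   # reference exists, contents not loaded yet
        data = ref.read()     # first access triggers the read
        after = ref.loaded
    finally:
        os.remove(tmp.name)
    return before, data, after
```

Creating many such references costs almost no memory until `read()` is called on one of them, which is presumably the kind of footprint reduction the item refers to.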

@guerra2fernando
Collaborator

We need a complete description of all those items @kartik4949

@kartik4949
Collaborator

Description:

  • Graph Mode Support:

    • Build complex model graphs by connecting superduperdb Model components. Graphs can be saved in the database and used for inference in the same way as Models. This lets users represent and manage the relationships between components explicitly, which makes larger pipelines easier to understand and organize.
  • REST Server Support:

    • Access superduperdb core via REST. Perform database queries, create components, upload artifacts, etc., using the REST interface as an alternative to the Python client. With this feature, users can interact with superduperdb from any programming language or platform that supports RESTful communication, expanding accessibility and integration capabilities.
  • Local Cluster Creation with tmux Utility:

    • Set up a debuggable local cluster using the tmux cluster utility. Users get a tmux session in which each window runs a superduperdb service, so every service can be inspected directly. This streamlines creating, testing, and troubleshooting cluster environments.
  • Custom Usecase Creation:

    • With our updated documentation, users can create their own unique use cases. These can be downloaded as notebooks, enabling a wide range of possibilities with different databases and modalities from a single documentation source. This feature empowers users to tailor superduperdb to their specific needs and workflows, fostering innovation and flexibility in application development.
  • Bulk Query Execution:

    • Create and execute bulk queries in one go. This feature allows users to combine various types of queries (insert, update, delete, etc.) into a single bulk query for efficient backend execution. By minimizing the number of round trips to the database, users can significantly improve performance and optimize resource utilization in data-intensive applications.
  • Lazy File Datatype:

    • Use the lazy file datatype to create file references whose contents are retrieved later. Because loading is deferred until the file is actually needed, memory usage drops and the system stays responsive, especially when dealing with large files or datasets.
  • Auto Decorator for Models:

    • Apply the objectmodel and torchmodel decorators directly to callables to create basic superduperdb model objects. This streamlined interface removes boilerplate from model creation, so users can focus on building and refining their models.
  • Pandas Directory Support:

    • Create a pandas datalayer instance with a directory. This directory contains future model output tables, data inputs, etc. Preexisting CSV files in the directory are treated as data tables and can be referenced after datalayer creation. This feature simplifies data management by integrating pandas functionality directly into superduperdb, allowing users to seamlessly work with structured data and leverage the powerful capabilities of pandas within their workflows.
  • Simplified Model Prediction Interface:

    • Access a simplified model interface with predict_one and predict APIs for single and multi-datapoint prediction tasks. This feature provides a user-friendly and intuitive way to perform model predictions, simplifying the integration of machine learning models into applications and workflows.
  • File Datatype Support:

    • Adds a file datatype to support saving and reading files and folders in the artifact_store. This makes managing files and folders within the system more flexible: superduperdb can handle unstructured data such as documents, images, and multimedia files alongside structured data.
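
As a rough illustration of the decorator and prediction items above, the sketch below wraps a plain callable in an object that exposes `predict_one` and `predict`. Only those two method names come from the description; `SketchModel` and `as_model` are hypothetical stand-ins and are not the signatures of superduperdb's `objectmodel` or `torchmodel`.

```python
from typing import Any, Callable, List


class SketchModel:
    """Hypothetical wrapper; not superduperdb's actual model class."""

    def __init__(self, fn: Callable[[Any], Any], identifier: str):
        self.fn = fn
        self.identifier = identifier

    def predict_one(self, x: Any) -> Any:
        # Single-datapoint prediction: call the wrapped function once.
        return self.fn(x)

    def predict(self, xs: List[Any]) -> List[Any]:
        # Multi-datapoint prediction, implemented here as a simple loop.
        return [self.predict_one(x) for x in xs]


def as_model(fn: Callable[[Any], Any]) -> SketchModel:
    """Hypothetical decorator turning a plain callable into a SketchModel."""
    return SketchModel(fn, identifier=fn.__name__)


@as_model
def double(x):
    # Toy "model": doubles its input.
    return 2 * x
```

With this sketch, `double.predict_one(3)` handles a single datapoint and `double.predict([1, 2, 3])` handles a batch. A real implementation would likely batch inputs rather than loop, but the two-method split is the part the description highlights.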

@blythed blythed assigned blythed and jieguangzhou and unassigned kartik4949, blythed and thgnw May 10, 2024
@blythed
Collaborator Author

blythed commented May 10, 2024

@jieguangzhou to provide comments. @kartik4949 to provide latest version.

@blythed blythed closed this as completed May 20, 2024
@thgnw
Collaborator

thgnw commented May 22, 2024

@blythed don't we want to use this to write a blog post? this is too low level for website and press release. may be good for new readme?

@thgnw thgnw reopened this May 22, 2024
@blythed blythed self-assigned this May 24, 2024