type inference for pt.function #40

Open
orm011 opened this issue Nov 14, 2023 · 1 comment

Comments

@orm011
Collaborator

orm011 commented Nov 14, 2023

Currently, writing a Python function that computes something on the elements of a pixeltable dataframe or table is cumbersome, because the decorator requires explicit type annotations. This is a very basic operation, and for users coming from pandas the extra step is annoying.

@pt.function(param_types=[pt.JsonType()], return_type=pt.JsonType())
def restrict_json_for_pytorch(obj):
    keys = ['id', 'label', 'iscrowd', 'bounding_box']
    return {k: obj[k] for k in keys}

One small thing we can do for the user is to infer the annotation from the Python type hints in the signature, since there is a correspondence between Python types and pixeltable types:

type_dict = {
    np.ndarray: ts.ArrayType,
    int: ts.IntType,
    float: ts.FloatType,
    PIL.Image: ts.ImageType,
    str: ts.StringType,
}

import PIL.Image
import typing
from typing import Dict, Any


@pt.function
def foo(a: int, b: float, c: PIL.Image) -> Dict[str, Any]:
    return {'a': a, 'b': b}

typing.get_type_hints(foo)
{'a': int,
 'b': float,
 'c': <module 'PIL.Image' from '/Users/orm/mambaforge/envs/pixeltable/lib/python3.9/site-packages/PIL/Image.py'>,
 'return': typing.Dict[str, typing.Any]}
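
A minimal sketch of how the inference could work, assuming the decorator falls back to the signature's type hints when `param_types` and `return_type` are not given. The `IntType`, `FloatType`, `StringType` and `JsonType` classes here are stand-ins defined only for this example (not pixeltable's real classes), `infer_types` and `to_column_type` are hypothetical helpers, and mapping `dict` to `JsonType` is an assumption.

```python
import inspect
import typing
from typing import Any, Dict

# Stand-in column types for illustration only; not pixeltable's real classes.
class IntType: ...
class FloatType: ...
class StringType: ...
class JsonType: ...

# Hypothetical mapping from Python types to column types.
TYPE_MAP = {int: IntType, float: FloatType, str: StringType, dict: JsonType}

def to_column_type(hint):
    """Map one Python type hint to an instance of a stand-in column type."""
    # Dict[str, Any] has origin `dict`; plain classes such as `int` have origin None.
    base = typing.get_origin(hint) or hint
    if base not in TYPE_MAP:
        raise TypeError(f'cannot infer a column type from {hint!r}')
    return TYPE_MAP[base]()

def infer_types(fn):
    """Return (param_types, return_type) inferred from fn's type hints."""
    hints = typing.get_type_hints(fn)
    if 'return' not in hints:
        raise TypeError(f'{fn.__name__} needs a return annotation')
    return_type = to_column_type(hints['return'])
    param_types = []
    for name in inspect.signature(fn).parameters:
        if name not in hints:
            raise TypeError(f'parameter {name!r} of {fn.__name__} is not annotated')
        param_types.append(to_column_type(hints[name]))
    return param_types, return_type

def foo(a: int, b: float, c: str) -> Dict[str, Any]:
    return {'a': a, 'b': b, 'c': c}

param_types, return_type = infer_types(foo)
```

Raising on a missing or unmapped hint keeps the explicit `param_types=[...]` form as the fallback for cases the signature cannot express. As the `get_type_hints` output above shows, annotating with `PIL.Image` yields the module; the class that image values are instances of is `PIL.Image.Image`.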

One open issue is how to handle video columns.
One option is to add a pt.Video Python type that wraps the video's location on the filesystem. It could also provide some utilities for working with videos
within functions. Unlike PIL.Image for images, I'm not sure there is a 'standard' Python type for video.
In the case of PIL.Image, we more or less re-use it in our own work.
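
To make the idea concrete, here is a rough sketch of what such a wrapper might look like. It assumes the type only needs to carry a file path plus a few helpers. `Video` and `container_format` are hypothetical names for illustration and do not exist in pixeltable.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Video:
    """Hypothetical wrapper around a video file path (not an existing pixeltable type)."""
    path: Path

    def __post_init__(self):
        # Accept a str as well as a Path and normalise to Path.
        # object.__setattr__ is needed because the dataclass is frozen.
        object.__setattr__(self, 'path', Path(self.path))

    @property
    def suffix(self) -> str:
        """Lower-cased file extension, including the leading dot."""
        return self.path.suffix.lower()

    def exists(self) -> bool:
        """True if the path points at an existing regular file."""
        return self.path.is_file()

# A user function can now name the type in its signature,
# which gives type inference something to map to a video column type.
def container_format(v: Video) -> str:
    return v.suffix.lstrip('.')

fmt = container_format(Video('clips/Demo.MP4'))
```

A dedicated class like this could serve as a key in the type mapping in the same way `int` or `str` does, independent of whichever video library is used to decode frames.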

@mkornacker
Collaborator

Agreed that this is more convenient. It doesn't work for arrays, though, because ArrayType also includes the shape, which ndarray doesn't convey. We'd also need some way to communicate that.
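
One possible way to communicate the shape, sketched here as an assumption rather than a proposal from this thread, is to attach it to the hint with `typing.Annotated` (Python 3.9+). `Shape`, `Array` and `array_shape` are hypothetical names; `Array` stands in for `np.ndarray` so that the example runs without numpy.

```python
import typing
from typing import Annotated

class Shape:
    """Hypothetical marker that carries an array shape inside a type hint."""
    def __init__(self, *dims):
        self.dims = dims

class Array:
    """Stand-in for np.ndarray, used only to keep this example dependency-free."""

def array_shape(fn, param):
    """Return the shape attached to a parameter's hint, or None if there is none."""
    # include_extras=True keeps the Annotated metadata; by default it is stripped.
    hints = typing.get_type_hints(fn, include_extras=True)
    hint = hints[param]
    if typing.get_origin(hint) is not Annotated:
        return None
    for meta in hint.__metadata__:
        if isinstance(meta, Shape):
            return meta.dims
    return None

def embed(img: Annotated[Array, Shape(224, 224, 3)], scale: float) -> float:
    return scale

img_shape = array_shape(embed, 'img')
scale_shape = array_shape(embed, 'scale')
```

With this approach a plain `np.ndarray` hint would still be ambiguous, so inference could either reject it or require the explicit `param_types` form for arrays.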
