Issue opening vaex.example() #2424

Open
FSCoen opened this issue May 16, 2024 · 0 comments

FSCoen commented May 16, 2024

I'm brand new to vaex and am having some difficulty working through the introduction/tutorial in the documentation.

I created a new miniconda environment for vaex. At first I had some pydantic_settings issues, but I was able to resolve them after I found a recommendation on here to downgrade pydantic to 1.10.8. I have also installed Jupyter; there are no other packages in this environment.

I've now run into another error that has me more stumped. Here are the first two lines of code I'm trying to run (in a Jupyter notebook, as recommended):

import vaex
df = vaex.example()  

However, I get the following error message:

ERROR:MainThread:vaex:error opening 'C:\\Users\\FR32957/.vaex\\data\\helmi-dezeeuw-2000-FeH-v2-10percent.hdf5'
Traceback (most recent call last):
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\__init__.py", line 229, in open
    ds = vaex.dataset.open(path, fs_options=fs_options, fs=fs, **kwargs)
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\dataset.py", line 74, in open
    return opener.open(path, fs_options=fs_options, fs=fs, *args, **kwargs)
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\dataset.py", line 1438, in open
    return cls(path, *args, **kwargs)
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py", line 71, in __init__
    self._load()
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py", line 194, in _load
    self._load_columns(self.h5file["/table"])
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py", line 362, in _load_columns
    self.add_column(column_name, self._map_hdf5_array(data))
  File "C:\Users\FR32957\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py", line 253, in _map_hdf5_array
    array = self._map_array(offset, dtype=dtype, length=len(data))
TypeError: _map_array() got an unexpected keyword argument 'length'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 2
      1 import vaex
----> 2 df = vaex.example()  

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\__init__.py:594, in example()
    589 def example():
    590     """Returns an example DataFrame which comes with vaex for testing/learning purposes.
    591 
    592     :rtype: DataFrame
    593     """
--> 594     return vaex.datasets.helmi_de_zeeuw_10percent.fetch()

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\datasets.py:48, in Hdf5Download.fetch(self, force_download)
     46 def fetch(self, force_download=False):
     47     self.download(force=force_download)
---> 48     return vx.open(self.filename)

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\__init__.py:229, in open(path, convert, progress, shuffle, fs_options, fs, *args, **kwargs)
    227     ds = vaex.dataset.open(path_output, fs_options=fs_options, fs=fs, **kwargs)
    228 else:
--> 229     ds = vaex.dataset.open(path, fs_options=fs_options, fs=fs, **kwargs)
    230 df = vaex.from_dataset(ds)
    231 if df is None:

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\dataset.py:74, in open(path, fs_options, fs, *args, **kwargs)
     72     if opener.quick_test(path, fs_options=fs_options, fs=fs):
     73         if opener.can_open(path, fs_options=fs_options, fs=fs, *args, **kwargs):
---> 74             return opener.open(path, fs_options=fs_options, fs=fs, *args, **kwargs)
     76 # otherwise try all openers
     77 for opener in opener_classes:

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\dataset.py:1438, in DatasetFile.open(cls, path, *args, **kwargs)
   1436 @classmethod
   1437 def open(cls, path, *args, **kwargs):
-> 1438     return cls(path, *args, **kwargs)

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py:71, in Hdf5MemoryMapped.__init__(self, path, write, fs_options, fs, nommap)
     69 self.h5table_root_name = None
     70 self._version = 1
---> 71 self._load()
     72 if not write:  # in write mode, call freeze yourself, so the hashes are computed
     73     self._freeze()

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py:194, in Hdf5MemoryMapped._load(self)
    192 if "table" in self.h5file:
    193     self._version = 2
--> 194     self._load_columns(self.h5file["/table"])
    195     self.h5table_root_name = "/table"
    196 root_datasets = [dataset for name, dataset in self.h5file.items() if isinstance(dataset, h5py.Dataset)]

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py:362, in Hdf5MemoryMapped._load_columns(self, h5data, first)
    360         self.add_column(column_name, self._map_hdf5_array(data, column['mask']))
    361     else:
--> 362         self.add_column(column_name, self._map_hdf5_array(data))
    363 else:
    364     transposed = shape[1] < shape[0]

File ~\AppData\Local\miniconda3\envs\mnvx2\lib\site-packages\vaex\hdf5\dataset.py:253, in Hdf5MemoryMapped._map_hdf5_array(self, data, mask)
    251             dtype = np.dtype('U' + str(data.attrs['dlength']))
    252 #self.addColumn(column_name, offset, len(data), dtype=dtype)
--> 253 array = self._map_array(offset, dtype=dtype, length=len(data))
    254 if mask is not None:
    255     mask_array = self._map_hdf5_array(mask)

TypeError: _map_array() got an unexpected keyword argument 'length'

Could this be a result of how I have my environment set up? Would greatly appreciate any help, very eager to try this new tool!
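
In case it helps narrow down the environment question, here is a minimal sketch for listing the versions of the relevant packages installed in the active environment. It uses only the standard-library `importlib.metadata` module. The helper name `installed_versions` and the list of name prefixes are illustrative choices, not part of vaex, and the sketch only reports what is installed; it assumes nothing about which versions are compatible with each other.

```python
from importlib import metadata


def installed_versions(prefixes=("vaex", "pydantic", "h5py", "numpy")):
    """Return {distribution name: version} for installed distributions
    whose name starts with one of the given prefixes (case-insensitive).

    The default prefixes are an assumption about which packages are
    relevant to this traceback; adjust as needed.
    """
    lowered = tuple(p.lower() for p in prefixes)
    found = {}
    for dist in metadata.distributions():
        # Some broken installs have no "Name" field; skip those.
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith(lowered):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    # Print in "name==version" form so the output can be pasted into the issue.
    for name, version in sorted(installed_versions().items()):
        print(f"{name}=={version}")
```

Running this in the same notebook kernel that raises the error shows exactly which vaex sub-packages (for example `vaex-core` and `vaex-hdf5`, if present) that kernel sees, which may differ from what the conda environment's shell reports.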
