I try to read a rds file, but get the following error: #49
Comments
Make sure you are using the latest version of pyreadr. If the problem persists, send a file to reproduce the issue. If I cannot reproduce it, I cannot fix it.
I have the latest pyreadr. However, I am not allowed to share the dataset.
That's unfortunate, because if I can't reproduce it there is nothing I can do for now.
Hello, I have the same issue and I have made a reproducible file for you to check out (however, I cannot find how to upload it here). I tried a lot to get it to work, and probably tried even more during my long internet search. I think my file is not a "good" .RData file and tried to find the reason why, but so far without success. Could you have a look?
Thanks, I need the file to take a look. Zip it and then upload it here; just drag and drop it into this text box. If the file is too big, put it on Dropbox, Google Drive or similar, share it with everyone, and paste the link here. You can also look for other services where you can put your file without having an account. Without the file it is impossible for me to take a look.
Sorry, I uploaded some corrupt files earlier. This one should work.
I finally found a solution! However, I would prefer not to load the file into R, re-save it, and then use it in my code. I would rather just use the original RData files. But I was trying all kinds of things as a proof of concept. I load this file into R and run the following to remove the factors: (rlvnc2 is the name of the dataframe, change accordingly)
And then save it with the standard save() function from R. Then it works fine with pyreadr. But if I save it with
OK, thanks, I can reproduce it. The issue is coming from the C library, so I have submitted a new issue there. I see that every factor in the file has a lot of levels; I wonder if there is a non-UTF-8 character hidden in there somewhere. On the other hand, it seems that you already tried to change the encoding of all factors and that didn't work.
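One way to test the "hidden non-UTF-8 character" idea is sketched below. It assumes the factor levels are already available as raw byte strings (for example, exported from R in some other way); the `levels` list and the helper name `find_invalid_utf8` are hypothetical and not part of pyreadr.

```python
def find_invalid_utf8(labels):
    """Return (label_index, byte_offset, offending_byte) for every
    byte string in `labels` that is not valid UTF-8."""
    problems = []
    for index, raw in enumerate(labels):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte
            problems.append((index, exc.start, raw[exc.start]))
    return problems


# Hypothetical factor levels: the second one is Latin-1 encoded,
# the third one is properly UTF-8 encoded.
levels = [b"Knee", b"M\xfcnchen", "Zürich".encode("utf-8")]

for index, offset, byte in find_invalid_utf8(levels):
    print(f"level {index}: byte {byte:#04x} at position {offset}")
```

With many levels per factor, a scan like this narrows the search down to the few labels that actually contain unexpected bytes.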
To be sure, I tried to change the encoding again and save with Good luck finding the exact problem. If you need any help with trial and error, let me know.
Interesting: when I save the file, it looks completely different in a hex editor. What version of R are you using, and on which platform (Windows, Mac, Linux, ...)?
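For comparing two files without a hex editor, a rough sketch like the following can show how each file is compressed and what its first decompressed bytes are. It relies on the standard gzip, bzip2 and xz magic numbers; the function name `describe_r_file` is made up for this example. As far as I understand R's formats, files written by save() usually begin with a header such as `RDX2` or `RDX3` after decompression, and saveRDS() output in XDR format begins with `X`, but treat that as an assumption to verify.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Magic numbers of the compression formats R can use when saving.
COMPRESSION_MAGIC = {
    b"\x1f\x8b": ("gzip", gzip.open),
    b"BZh": ("bzip2", bz2.open),
    b"\xfd7zXZ\x00": ("xz", lzma.open),
}


def describe_r_file(path, n=16):
    """Return (compression_name, first n decompressed bytes) of a file."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    compression, opener = "none", open
    for magic, (name, candidate) in COMPRESSION_MAGIC.items():
        if head.startswith(magic):
            compression, opener = name, candidate
            break
    with opener(path, "rb") as fh:
        payload = fh.read(n)
    return compression, payload


# Demo on a stand-in file; point `path` at the real files to compare them.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.RData")
    with gzip.open(path, "wb") as fh:
        fh.write(b"RDX3\nX\n" + bytes(16))
    compression, payload = describe_r_file(path)
    print(compression, payload.hex(" "))
```

Running it on both the original file and the re-saved one would show whether they differ in compression or in the header itself.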
I think that the original file (the one that isn't working) was made on a Linux-based computer with an old version of R, or on a Windows computer with an old version of R. I do not know the exact origin, because I only work with this file and it was created before I was involved. The new file (after changing the factors to characters) was made with R version 4.0.2 and RStudio 2021.09.0 Build 351 "Ghost Orchid" Release (077589bc, 2021-09-20) for macOS. EDIT:
OK. Anyway, saving the file again with 4.0.2 gives exactly the same error. I think the C library is somehow reading one of the fields in the binary file from the wrong byte offset.
This is my code:
import pyreadr
result = pyreadr.read_r('data/injuryTimeDataset.rds')
This is the error:
parser.parse(path)
File "pyreadr\librdata.pyx", line 117, in pyreadr.librdata.Parser.parse
File "pyreadr\librdata.pyx", line 139, in pyreadr.librdata.Parser.parse
File "pyreadr\librdata.pyx", line 102, in pyreadr.librdata._handle_value_label
File "pyreadr\librdata.pyx", line 197, in pyreadr.librdata.Parser.__handle_value_label
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 1: invalid start byte
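For context on the traceback above: byte 0xfc can never start a UTF-8 sequence, but in Latin-1 and Windows-1252 it is the letter "ü", so one plausible (unconfirmed) reading is that a value label was written in one of those encodings. The sketch below only illustrates this with plain Python; `decode_label` is a hypothetical helper, not a pyreadr function, and the fallback order is a guess.

```python
def decode_label(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Decode `raw` with the first encoding that accepts it.

    latin-1 maps every byte to a character, so it acts as a last resort
    that never fails (but may produce the wrong characters).
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding could decode the label")


try:
    b"M\xfcller".decode("utf-8")
except UnicodeDecodeError as exc:
    # Same kind of failure as in the traceback: 0xfc at position 1
    print(exc.reason, "at position", exc.start)

print(decode_label(b"M\xfcller"))
```

A fallback like this can only guess the encoding; knowing which system and locale produced the file is more reliable.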
What should I do? I have not looked in the rds file, but it is supposed to be a mixture of strings, ints and floats. Lastly, this works:
pyreadr.object_list