Code Refactor: Remove Pandas as a dependency by using the natively in built csv module instead #1131

dzumii · 2024-05-30T10:24:36Z

Description
Refactors the tracking functionality to remove pandas and reimplement it using built-in Python libraries.

Changes Made

Defined a read_csv function that reads CSV files with the built-in csv module.
Replaced the other pandas data manipulation with built-in equivalents.
Adjusted the stats function to calculate statistics using the statistics module.
Adjusted the get_file_size function.
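
For readers who want a picture of what this kind of replacement looks like, here is a minimal sketch. It is not the code in ersilia/core/tracking.py: the `read_csv(path)` signature, the `column_stats` helper, and the column names in the usage note are assumptions made up for illustration. It shows reading a CSV into a list of dictionaries with `csv.DictReader` and summarising one numeric column with the `statistics` module.

```python
import csv
import statistics


def read_csv(path):
    """Read a CSV file into a list of dicts, one per row (hypothetical helper)."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def column_stats(rows, column):
    """Summarise one numeric column (hypothetical helper, not the PR's stats function)."""
    # The csv module returns every field as a string, so conversion is explicit.
    values = [float(row[column]) for row in rows if row[column] not in ("", None)]
    if not values:
        return {}
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # statistics.stdev is the sample standard deviation (n - 1 denominator).
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
```

Usage would look like `rows = read_csv("output.csv")` followed by `column_stats(rows, "value")`, where the file and column names are placeholders. Unlike pandas, the csv module does not infer column types, so any type handling has to be done by hand, and `statistics.stdev` uses the same n - 1 denominator as the pandas default for `std()`.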

Status

The tracking functionality works with the changes.
Time taken is now a bit shorter.
However, the output still differs in places from the output generated by the pandas-based tracking, particularly in the file_size and stats functions.

To Do

Refine the check_types function.
Refine the get_file_size function.
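
A minimal sketch of a file-size helper built on the os module, for illustration only. The name `get_file_size_kb` and the choice of kilobytes as the unit are assumptions, not the signature used in tracking.py.

```python
import os


def get_file_size_kb(path):
    """Return the on-disk size of a file in kilobytes (hypothetical helper)."""
    # os.path.getsize reports the number of bytes stored on disk.
    return os.path.getsize(path) / 1024
```

Note that bytes on disk and the in-memory size of a parsed table are different quantities, so a helper like this would not necessarily return the same number as a measurement taken on a DataFrame. Whether that explains the differences seen here is an assumption, not something confirmed in this thread.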

This is still a work in progress.
Comments and suggestions would be really appreciated.

Related to #1090

Closes #1054

dzumii · 2024-06-03T21:38:42Z

Fixed all inconsistencies. The tracking functionality now works as it did before the pandas dependency was removed.

ersilia/core/tracking.py

DhanshreeA · 2024-06-04T06:54:26Z

@dzumii could you also share before and after results? That is, results from the master branch and your work?

ersilia/core/tracking.py

miquelduranfrigola · 2024-06-04T15:43:02Z

@DhanshreeA @dzumii anything you need from me here?

DhanshreeA · 2024-06-04T17:35:54Z

> @DhanshreeA @dzumii anything you need from me here?

All good on my end.

dzumii · 2024-06-04T20:28:48Z

@DhanshreeA, I have finally been able to fix the inconsistency in mismatch_type, thanks to @Malikbadmus. It took us some hours to crack.

Here is the comparison between the two metric files when I run
diff -y <txt-file-generated-from-master-before-changes> <txt-file-generated-after-removing-pandas>

I also noticed a warning when I try to close a model after using the track flags.

ersilia/core/tracking.py


DhanshreeA · 2024-06-06T07:02:23Z

> @DhanshreeA, I have finally been able to fix the inconsistency in mismatch_type, thanks to @Malikbadmus. It took us some hours to crack.
>
> Here is the comparison between the two metric files when I run diff -y <txt-file-generated-from-master-before-changes> <txt-file-generated-after-removing-pandas>
>
> I also noticed a warning when I try to close a model after using the track flags.

Hey @dzumii, could you please report the warning in an issue so we can tackle it there?

DhanshreeA · 2024-06-06T07:06:15Z

LGTM on the PR! Ready to merge.

dzumii · 2024-06-06T07:33:17Z

> > @DhanshreeA, I have finally been able to fix the inconsistency in mismatch_type, thanks to @Malikbadmus. It took us some hours to crack.
> > Here is the comparison between the two metric files when I run diff -y <txt-file-generated-from-master-before-changes> <txt-file-generated-after-removing-pandas>
> > I also noticed a warning when I try to close a model after using the track flags.
>
> Hey @dzumii, could you please report the warning in an issue so we can tackle it there?

Ok, will do that.

dzumii force-pushed the code-refactoring-remove-pandas branch from 5a825d5 to b4d649e June 3, 2024 12:42

DhanshreeA self-requested a review June 4, 2024 06:41

DhanshreeA requested changes Jun 4, 2024


DhanshreeA mentioned this pull request Jun 4, 2024

🐛 Bug: Pandas slows down build time for ersiliaos/base image #1054

Closed

DhanshreeA requested changes Jun 4, 2024


DhanshreeA reviewed Jun 5, 2024

ersilia/core/tracking.py Outdated

dzumii added 6 commits June 5, 2024 20:37

code refactoring to remove pandas dependency using the native csv library instead

8e39f0b

used the os module to get file size

087dfea

fixed the discrepancies in the get-file-size function

1b78d29

implemented PR reviews and changes

d978336

fixed the mismatch_type error

8c121ee

Updated documentation

3973044

dzumii force-pushed the code-refactoring-remove-pandas branch from c744232 to 3973044 June 5, 2024 20:38

DhanshreeA approved these changes Jun 6, 2024


DhanshreeA merged commit d0e0f6c into ersilia-os:master Jun 6, 2024
16 checks passed

dzumii mentioned this pull request Jun 6, 2024

Update pyproject.toml file #1153

Merged

dzumii mentioned this pull request Jun 20, 2024

Track without output bug #1170

Merged

dzumii deleted the code-refactoring-remove-pandas branch June 23, 2024 15:02

dzumii restored the code-refactoring-remove-pandas branch June 23, 2024 15:02

dzumii deleted the code-refactoring-remove-pandas branch June 23, 2024 15:16
