Convert investment statements (like mutual fund) for India to interpretable formats

jethar/mutualfund-stmts-etl

ETL for Mutual Fund Statements

This project handles and interprets mutual fund statements in the Indian context to provide useful insights. The user provides PDF statements for CAS (Consolidated Account Statements) and gain statements from fund management services such as CAMS and Karvy.

Features:

  • Interpret PDF statements to produce transactions in CSV format, which can then be used for other purposes such as portfolio management or ITR inputs.
  • Reconcile the CAS with gain statements to flag any missing transactions on "Sell" side.
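
To make the reconciliation idea concrete, here is a minimal sketch of flagging "Sell" transactions that appear in the CAS but not in a gain statement. It is not the project's actual implementation: the `SellTxn` record, its fields, and the `match_key` and `missing_sells` helpers are all hypothetical names chosen for illustration. The two keyword arguments loosely mirror the `--ignore-folio` and `--ignore-nav` options of extract.py.

```python
from collections import Counter
from typing import NamedTuple


class SellTxn(NamedTuple):
    """A hypothetical, simplified sell transaction record."""
    folio: str
    scheme: str
    date: str      # ISO date string, e.g. "2019-03-14"
    units: float
    nav: float


def match_key(txn, ignore_folio=False, ignore_nav=False):
    """Build the tuple used to decide whether two transactions match."""
    key = [txn.scheme, txn.date, round(txn.units, 3)]
    if not ignore_folio:
        key.append(txn.folio)
    if not ignore_nav:
        key.append(round(txn.nav, 4))
    return tuple(key)


def missing_sells(cas_sells, gain_sells, ignore_folio=False, ignore_nav=False):
    """Return CAS sell transactions with no counterpart in the gain statement."""
    available = Counter(match_key(t, ignore_folio, ignore_nav) for t in gain_sells)
    missing = []
    for txn in cas_sells:
        key = match_key(txn, ignore_folio, ignore_nav)
        if available[key] > 0:
            available[key] -= 1   # consume one matching gain entry
        else:
            missing.append(txn)
    return missing


# Illustrative data only; these are not real folios or schemes.
cas = [
    SellTxn("123/45", "Example Equity Fund", "2019-03-14", 100.0, 25.5),
    SellTxn("123/45", "Example Equity Fund", "2019-06-02", 50.0, 27.1),
]
gains = [SellTxn("123/45", "Example Equity Fund", "2019-03-14", 100.0, 25.5)]

for txn in missing_sells(cas, gains):
    print("Missing from gain statement:", txn)
```

Counting matches with a `Counter` means that two identical sells in the CAS need two matching entries in the gain statement, so a duplicate is not silently absorbed.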

Disclaimer: Please treat this tool as an alpha version, as it still needs to be thoroughly tested on different inputs. You are responsible for how you use the data, and are strongly advised to cross-check the outputs manually as well.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

  • Python 3.x environment
  • Java
    • Confirmed working with Java 7, 8
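
A quick way to check both prerequisites before installing is sketched below. It is only an illustration: `check_prerequisites` is a hypothetical helper, and it only tests that a `java` executable is on the PATH, not which Java version it is.

```python
import shutil
import sys


def check_prerequisites():
    """Report whether Python 3 is running and a `java` executable is on PATH."""
    return {
        "python3": sys.version_info.major == 3,
        "java_path": shutil.which("java"),  # None when Java is not found
    }


status = check_prerequisites()
print("Python 3:", "ok" if status["python3"] else "not detected")
print("Java:", status["java_path"] or "not found on PATH")
```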

Installing

From a command line (preferably, from a virtual environment), simply issue the command:

pip install -r requirements.txt

This will download the latest released versions of the program's dependencies from the Python Package Index (PyPI). At this point, you should be ready to start using it.

If this does not work because your default Python is a 2.x version (e.g. 2.7.5 on Ubuntu 14.04), try:

apt-get install python3 python3-pip
pip3 install -r requirements.txt

instead.

Note: We strongly recommend that you don't install the packages globally on your machine (i.e., with root/administrator privileges), as the installed modules may conflict or interfere with other Python applications installed on your system. Where possible, prefer the --user option to pip install.

Usage

tl;dr

usage: extract.py [-h] [-c CONFIG_FILE] [--ignore-folio] [--ignore-nav]
                  [--debug]

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config CONFIG_FILE
                        read pdf files info from csv file
  --ignore-folio        ignore folio while comparing transactions
  --ignore-nav          ignore nav while comparing transactions
  --debug               show verbose info

Go into the mutualfund-stmts-etl folder, which contains the extract.py file.

  1. Create two folders - input and output in the working directory.
  2. Put the PDF files in the input folder. You can also add a CSV transactions file based on the template input/gain-statement-sample.csv.
  3. Make a copy of the file input/job-desc-sample.csv and edit it to give, for each PDF:
    1. filename
    2. whether the given PDF is CAS or GAIN
    3. password
    4. if item 2 is GAIN, whether it is from CAMS- or KARVY-operated funds
  4. Run the command -
    extract.py -c input/job-desc.csv
      # or
    python extract.py -c input/job-desc.csv
    
  5. To get help, type -
    python extract.py -h
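
The job description file from step 3 could also be generated with a short script, as in the sketch below. Note the assumptions: the column names in `FIELDS`, their order, and the presence of a header row are guesses based on the four items listed in step 3, so compare the output with input/job-desc-sample.csv before using it. `build_job_description` and the PDF file names are hypothetical.

```python
import csv
import io

# Assumed column names; the real template may use different headers or order.
FIELDS = ["filename", "type", "password", "source"]


def build_job_description(jobs):
    """Render job entries as CSV text, one row per input PDF."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for job in jobs:
        if job["type"] not in ("CAS", "GAIN"):
            raise ValueError(f"unknown statement type: {job['type']}")
        if job["type"] == "GAIN" and job.get("source") not in ("CAMS", "KARVY"):
            raise ValueError("GAIN statements need a source of CAMS or KARVY")
        # Fields that were not supplied (e.g. source for CAS) are left empty.
        writer.writerow({field: job.get(field, "") for field in FIELDS})
    return buffer.getvalue()


jobs = [
    {"filename": "cas-statement.pdf", "type": "CAS", "password": "secret"},
    {"filename": "cams-gains.pdf", "type": "GAIN", "password": "secret", "source": "CAMS"},
]
print(build_job_description(jobs))
```

Saving the printed text as input/job-desc.csv gives a file that can be passed with the -c option, provided its layout matches the sample template.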
    

The output folder will contain a generated CSV file for each input PDF. The following reconciliation CSVs will also be generated:

  • reconciliation_summary.csv - transaction reconciliation between the CAS and the gain statements.
  • reconciliation_detailed.csv - also includes matched folios, to help check whether anything is amiss.
  • reconciliation_summary_excl_csv_gains.csv - transaction reconciliation between the CAS and the gain PDF statements, excluding the CSV gain statement.

The project also generates a consolidated gains transaction CSV, gain_stmt_consolidated.csv, which can be used for generating final capital gains values.
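
For a quick look at what a run produced, the generated files can be inspected with a few lines of Python. This is a minimal sketch that assumes each CSV starts with a header row, which this README does not confirm; `summarize_csvs` is a hypothetical helper, not part of the project.

```python
import csv
from pathlib import Path


def summarize_csvs(folder):
    """Return {file name: (column names, data row count)} for each CSV in folder."""
    summary = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            rows = list(csv.reader(handle))
        header = rows[0] if rows else []
        # Assumes the first row is a header, so it is not counted as data.
        summary[path.name] = (header, max(len(rows) - 1, 0))
    return summary


for name, (header, count) in summarize_csvs("output").items():
    print(f"{name}: {count} rows, columns = {header}")
```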

TODOs

  1. Handle statements from sources other than CAMS or Karvy, e.g. FTAMIL / SBFS

Troubleshooting

Reporting issues

Before reporting any issue please follow the steps below:

  1. Verify that you are running the latest version of the script and the recommended versions of its dependencies (listed in the file requirements.txt).

  2. If the problem persists, feel free to open an issue in our bug tracker. Please fill in the issue template with as much information as possible.

Filing an issue/Reporting a bug

When reporting bugs against mutualfund-stmts-etl, please include enough information for us to help you:

  • Is the problem happening with the latest version of the script?
  • What operating system are you using?
  • Do you have all the recommended versions of the modules? See them in the file requirements.txt.
  • What is the precise command line that you are using? (Feel free to hide your username and password with asterisks, but leave all other information untouched.)
  • What are the precise messages that you get? Please, use the --debug option before posting the messages as a bug report. Please, copy and paste them. Don't reword/paraphrase the messages.
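
The operating system and Python details asked for above can be collected with a small script such as the following sketch. `environment_report` is a hypothetical helper, not part of the project; paste its output into the issue alongside the --debug messages.

```python
import platform
import sys


def environment_report():
    """Collect basic environment details that are useful in a bug report."""
    return "\n".join([
        f"Operating system: {platform.platform()}",
        f"Python version: {sys.version.split()[0]}",
        f"Python implementation: {platform.python_implementation()}",
    ])


print(environment_report())
```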

Contact

Please post bugs and issues on GitHub.
