Parse Robinhood 1099 Tax Document from PDF into CSV

Overview

Robinhood 1099 Parser

This project converts Robinhood Securities 1099 tax document from PDF to CSV file. This tool will be helpful for those who need every transaction in a spreadsheet format for tax reporting purposes.

Original Work

Copyright (c) 2021 Keun Park ([email protected])

Donate

πŸš€ Running Locally

Make sure you have Python 3 on your computer. If not, download the latest version from here.

Environment Setup

git clone https://github.com/kevinpark1217/Robinhood-1099-Parser.git
cd Robinhood-1099-Parser
python -m pip install -r requirements.txt

Start Parsing!

➜ python main.py 
usage: main.py [-h] --pdf FILE [--csv FILE] [--silent] [--check]

Example and Checking

Enable --check flag to print out total values for some columns. Make sure these values match with the PDF!

Example Screenshot

🐞 Issues and Bugs

If you have any issues with the tool, please open a GitHub Issue with as much as detail as you can provide.

Comments
  • IndexError: list index out of range

    IndexError: list index out of range

    Hey, I'm checking out your script and after getting git, python and visual studio build tools installed I finally got it to work. Now when I run the script, it errors out with index out of range. This is using the February 2021 RH pdf.

    C:\Users\hypno\Documents\Robinhood-1099-Parser>python main.py --pdf 1099.pdf --check Pages: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 9/9 [00:04<00:00, 2.02it/s] Traceback (most recent call last): File "C:\Users\hypno\Documents\Robinhood-1099-Parser\main.py", line 32, in contents = parser.process(not args.silent) File "C:\Users\hypno\Documents\Robinhood-1099-Parser\rh_1099\pdf_parser\parser_2020.py", line 49, in process pdf_contents.add_sales(Sales2020.parse(last_raw_entries)) File "C:\Users\hypno\Documents\Robinhood-1099-Parser\rh_1099\sales_transactions\sales_2020.py", line 38, in parse desc = raw_data[0].strip() IndexError: list index out of range

    bug invalid 
    opened by hypnotizd 10
  • 1200+ transactions missing from csv output

    1200+ transactions missing from csv output

    In total, I have 3235 transactions for a specific stock (listed on my 1099 tax form), but around 1200 is missing on the csv form (this affects my total proceeds calculation by a large amount). Everything else under 1000 transactions works fine though.

    bug 
    opened by twangodev 3
  • Totals are calculating incorrectly.

    Totals are calculating incorrectly.

    Everything ran fine, but the totals are not correct. proceeds, cost, wash and gain all wrong, gain is only off by $0.43, wash is about half the value it should be, and proceeds and cost is off quite a bit. Was really hoping this would work!! Taxes are almost due!

    bug 
    opened by maximumhax 3
  • Help running script

    Help running script

    Hi, I am completely new to python script and have been trying to figure out how to get this to run. I am currently running python on mac. I was able to run python3 -m pip install wheel and -m pip install --upgrade rh_1099 but from here I am entirely confused where to go to execute and import my pdf file. Any help would be greatly appreciated! Thank you! # #

    opened by D-C-1977 2
  • Is separation of short term and long term transactions needed?

    Is separation of short term and long term transactions needed?

    Currently the parser combines short term and long term transactions "for covered tax lots" into a single csv file.

    Should I add the feature of separating them into 2 separate csv files? How useful would this be?

    enhancement question 
    opened by kevinpark1217 2
  • Packaging & Running on Windows

    Packaging & Running on Windows

    1. Combines the tool in to a single Python package
    2. Update README with instructions on running the tool on Windows

    TODO Automatically upload to PyPI Once uploaded to PyPI, user should only need to run pip install rh_1099

    enhancement 
    opened by kevinpark1217 0
  • Command not found after install

    Command not found after install

    I successfully installed this (there were no errors) This is the output if I run the commands again.

    [email protected] site-packages % python3 -m pip install wheel                                      
    Defaulting to user installation because normal site-packages is not writeable
    Requirement already satisfied: wheel in /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/site-packages (0.36.2)
    [email protected] site-packages % python3 -m pip install --upgrade rh_1099                          
    Defaulting to user installation because normal site-packages is not writeable
    Requirement already satisfied: rh_1099 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (1.0.1)
    Requirement already satisfied: tqdm>=4.59.0 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (from rh_1099) (4.64.0)
    Requirement already satisfied: pdfreader>=0.1.9 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (from rh_1099) (0.1.10)
    Requirement already satisfied: bitarray>=1.1.0 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (from pdfreader>=0.1.9->rh_1099) (2.4.1)
    Requirement already satisfied: pycryptodome>=3.9.9 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (from pdfreader>=0.1.9->rh_1099) (3.14.1)
    Requirement already satisfied: python-dateutil>=2.8.1 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (from pdfreader>=0.1.9->rh_1099) (2.8.2)
    Requirement already satisfied: pillow>=7.1.0 in /Users/andrewporzio/Library/Python/3.8/lib/python/site-packages (from pdfreader>=0.1.9->rh_1099) (9.1.0)
    Requirement already satisfied: six>=1.5 in /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/site-packages (from python-dateutil>=2.8.1->pdfreader>=0.1.9->rh_1099) (1.15.0)
    

    Now when I try to run the command I get the error that it can't be found

    [email protected] site-packages % rh_1099 --pdf /Users/andrewporzio/Downloads/9d176fbf-9351-45d2-887c-940ed3cb5af4.pdf --check zsh: command not found: rh_1099

    opened by aporzio1 1
  • PDFContents - ImportError: attempted relative import with no known parent package

    PDFContents - ImportError: attempted relative import with no known parent package

    Code does not function

    ~/Downloads/Robinhood-1099-Parser-1.0.1/Robinhood-1099-Parser-1.0.1/rh_1099 $ python main.py --pdf tax.pdf --check Traceback (most recent call last): File "C:\Users\Dylan\Downloads\Robinhood-1099-Parser-1.0.1\Robinhood-1099-Parser-1.0.1\rh_1099\main.py", line 5, in from .pdf_contents import PDFContents ImportError: attempted relative import with no known parent package

    opened by bluenostromo 1
Owner
Keun Tae (Kevin) Park
I am a Computer Science student at Georgia Institute of Technology with the focus of Intelligence and Systems & Architecture.
Keun Tae (Kevin) Park
Customizing Visual Styles in Plotly

Customizing Visual Styles in Plotly Code for a workshop originally developed for an Unconference session during the Outlier Conference hosted by Data

Data Design Dimension 9 Aug 03, 2022
Python library that makes it easy for data scientists to create charts.

Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l

Spotify 3.2k Jan 01, 2023
BGraph is a tool designed to generate dependencies graphs from Android.bp soong files.

BGraph BGraph is a tool designed to generate dependencies graphs from Android.bp soong files. Overview BGraph (for Build-Graphs) is a project aimed at

Quarkslab 10 Dec 19, 2022
Movie recommendation using RASA, TigerGraph

Demo run: The below video will highlight the runtime of this setup and some sample real-time conversations using the power of RASA + TigerGraph, Steps

Sudha Vijayakumar 3 Sep 10, 2022
Custom ROI in Computer Vision Applications

EasyROI Helper library for drawing ROI in Computer Vision Applications Table of Contents EasyROI Table of Contents About The Project Tech Stack File S

43 Dec 09, 2022
Small binja plugin to import header file to types

binja-import-header (v1.0.0) Author: matteyeux Import header file to Binary Ninja types view Description: Binary Ninja plugin to import types from C h

matteyeux 15 Dec 10, 2022
Python package that generates hardware pinout diagrams as SVG images

PinOut A Python package that generates hardware pinout diagrams as SVG images. The package is designed to be quite flexible and works well for general

336 Dec 20, 2022
Apache Superset is a Data Visualization and Data Exploration Platform

Apache Superset is a Data Visualization and Data Exploration Platform

The Apache Software Foundation 49.9k Jan 02, 2023
Jupyter notebook and datasets from the pandas Q&A video series

Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note

Kevin Markham 2k Jan 05, 2023
Visualization Website by using Dash and Heroku

Visualization Website by using Dash and Heroku You can visit the website https://payroll-expense-analysis.herokuapp.com/ In this project, I am interes

YF Liu 1 Jan 14, 2022
view cool stats related to your discord account.

DiscoStats cool statistics generated using your discord data. How? DiscoStats is not a service that breaks the Discord Terms of Service or Community G

ibrahim hisham 5 Jun 02, 2022
This is a Boids Simulation, written in Python with Pygame.

PyNBoids A Python Boids Simulation This is a Boids simulation, written in Python3, with Pygame2 and NumPy. To use: Save the pynboids_sp.py file (and n

Nik 17 Dec 18, 2022
nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation.

nptsne nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation and HSNE modelling. For more detail s

Biomedical Visual Analytics Unit LUMC - TU Delft 29 Jul 05, 2022
Fast scatter density plots for Matplotlib

About Plotting millions of points can be slow. Real slow... 😴 So why not use density maps? ⚑ The mpl-scatter-density mini-package provides functional

Thomas Robitaille 473 Dec 12, 2022
visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem

visualize_ML visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem. It is build

Ayush Singh 164 Dec 12, 2022
The official colors of the FAU as matplotlib/seaborn colormaps

FAU - Colors The official colors of Friedrich-Alexander-UniversitΓ€t Erlangen-NΓΌrnberg (FAU) as matplotlib / seaborn colormaps. We support the old colo

Machine Learning and Data Analytics Lab FAU 9 Sep 05, 2022
Print matplotlib colors

mplcolors Tired of searching "matplotlib colors" every week/day/hour? This simple script displays them all conveniently right in your terminal emulato

Brandon Barker 32 Dec 13, 2022
VDLdraw - Batch plot the log files exported from VisualDL using Matplotlib

VDLdraw Batch plot the log files exported from VisualDL using Matplotlib. At pre

Yizhou Chen 5 Sep 26, 2022
🎨 Python3 binding for `@AntV/G2Plot` Plotting Library .

PyG2Plot 🎨 Python3 binding for @AntV/G2Plot which an interactive and responsive charting library. Based on the grammar of graphics, you can easily ma

hustcc 990 Jan 05, 2023
Make visual music sheets for thatskygame (graphical representations of the Sky keyboard)

sky-python-music-sheet-maker This program lets you make visual music sheets for Sky: Children of the Light. It will ask you a few questions, and does

21 Aug 26, 2022