A comprehensive TAXSIM emulator using the PolicyEngine US federal and state tax calculator, with advanced comparison tools and an interactive dashboard for analyzing tax calculation accuracy across different scenarios.
- Overview
- Quick Start
- Installation
- Usage
- Dashboard
- Input Variables
- State-Specific Features
- Output Variables
This project provides a high-fidelity emulator for TAXSIM-35, leveraging PolicyEngine's comprehensive US federal and state tax calculator. It enables researchers, analysts, and policymakers to run large-scale tax microsimulations that are fully compatible with TAXSIM-35 input/output formats.
- High-Performance Microsimulation: Process thousands of households simultaneously using PolicyEngine's vectorized calculations
- TAXSIM-35 Compatibility: Full compatibility with TAXSIM input CSV format and output variables
- Advanced State Handling: Correctly handles state-specific tax conformity rules (e.g., states that adopt federal AGI vs federal taxable income)
- Comprehensive Comparison Tools: Side-by-side comparison of PolicyEngine vs TAXSIM results with detailed mismatch analysis
- Interactive Dashboard: React-based dashboard for exploring results across years, states, and household characteristics
- Flexible Output Options: Standard, full, and text description output formats matching TAXSIM specifications
- YAML Test Generation: Generate PolicyEngine test cases for reproducibility and validation
# Clone and install
git clone https://github.com/PolicyEngine/policyengine-taxsim.git
cd policyengine-taxsim
pip install -e .
# Run a comparison analysis on sample data
python policyengine_taxsim/cli.py compare your_data.csv --sample 1000 --year 2021
# Start the interactive dashboard
cd cps-dashboard && npm install && npm start
- Clone the repository:
git clone https://github.com/PolicyEngine/policyengine-taxsim.git
cd policyengine-taxsim
- Create a virtual environment:
# For Windows
python -m venv venv
venv\Scripts\activate
# For macOS/Linux
python3 -m venv venv
source venv/bin/activate
- Install the package:
pip install -e .
- To update the project codebase (for an existing project):
git pull origin main
- To update the dependencies used by the project (for an existing project):
pip install -e . --upgrade
Alternatively, install the package directly from GitHub:
pip install git+https://github.com/PolicyEngine/policyengine-taxsim.git
The CLI provides several commands for different use cases. All commands are accessed through the main CLI interface:
python policyengine_taxsim/cli.py [COMMAND] [OPTIONS]
Calculate taxes using PolicyEngine:
python policyengine_taxsim/cli.py policyengine your_input_file.csv
Options:
Option | Description |
---|---|
--output, -o | Specify the output file path (default: output.txt) |
--logs | Generate PolicyEngine YAML test logs |
--disable-salt | Set the state and local sales or income taxes used for the SALT deduction to 0 |
--sample N | Sample N records from the input for testing |
Example:
python policyengine_taxsim/cli.py policyengine input.csv --output results.csv --logs --sample 1000
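When the command is launched from a script rather than typed by hand, it can help to assemble the arguments programmatically. The sketch below is only an illustration: the helper name `build_policyengine_command` is hypothetical and not part of the project, and it uses only the flags listed in the table above.

```python
def build_policyengine_command(input_path, output=None, logs=False,
                               disable_salt=False, sample=None):
    """Assemble the argument list for the `policyengine` CLI command.

    Hypothetical helper: it only mirrors the documented flags and does not
    import or call anything from the project itself.
    """
    command = ["python", "policyengine_taxsim/cli.py", "policyengine", str(input_path)]
    if output is not None:
        command += ["--output", str(output)]
    if logs:
        command.append("--logs")
    if disable_salt:
        command.append("--disable-salt")
    if sample is not None:
        command += ["--sample", str(sample)]
    return command


command = build_policyengine_command(
    "input.csv", output="results.csv", logs=True, sample=1000
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root.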
Run native TAXSIM-35 calculations:
python policyengine_taxsim/cli.py taxsim your_input_file.csv
Options:
Option | Description |
---|---|
--output, -o | Output file path (default: taxsim_output.csv) |
--sample N | Sample N records from input |
--taxsim-path | Custom path to TAXSIM executable |
Example:
python policyengine_taxsim/cli.py taxsim input.csv --output taxsim_results.csv
Run comprehensive side-by-side comparisons between PolicyEngine and TAXSIM:
python policyengine_taxsim/cli.py compare your_input_file.csv
Options:
Option | Description |
---|---|
--output-dir | Directory for comparison results (default: comparison_output) |
--year | Override tax year for calculations |
--sample N | Sample N records for comparison |
--disable-salt | Disable SALT deduction in PolicyEngine |
--logs | Generate PolicyEngine YAML test logs |
The comparison uses a $15 tolerance for both federal and state tax comparisons, which accounts for reasonable rounding differences.
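As a rough illustration of what a $15 tolerance means in practice, the check can be sketched as below. The function name is hypothetical, and treating a difference of exactly $15 as a match is an assumption; the project's own comparison code may handle the boundary differently.

```python
TOLERANCE = 15.0  # dollars, the tolerance stated above


def within_tolerance(policyengine_value, taxsim_value, tolerance=TOLERANCE):
    """Return True when two tax amounts differ by no more than `tolerance`.

    Assumption: the boundary is inclusive (a difference of exactly $15 matches).
    """
    return abs(policyengine_value - taxsim_value) <= tolerance


print(within_tolerance(1200.0, 1210.0))  # difference of $10
print(within_tolerance(1200.0, 1230.0))  # difference of $30
```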
Examples:
Basic comparison:
python policyengine_taxsim/cli.py compare cps_households.csv --sample 1000
Year-specific analysis with detailed logging:
python policyengine_taxsim/cli.py compare input.csv --year 2023 --logs --sample 5000
Output Files:
- comparison_results_YYYY.csv: Consolidated results with both PolicyEngine and TAXSIM outputs for each household
- Console output with match statistics and summary
The consolidated output includes (by default):
- All input variables for each household
- Complete TAXSIM output variables
- Complete PolicyEngine output variables
- Match/mismatch indicators for federal and state taxes
- State codes for easy filtering
- All mismatches are included automatically, so no separate mismatch files are needed
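Because each row carries a state code and match indicators, a per-state match rate can be computed with a few lines of standard-library Python. The sketch below is illustrative only: the column names `state` and `federal_match` and the `True`/`False` encoding are assumptions, so check the header of the generated CSV before adapting it.

```python
import csv
import io
from collections import defaultdict


def match_rate_by_state(csv_file, state_col="state", match_col="federal_match"):
    """Return {state: share of rows flagged as matching}.

    The column names and the "True"/"False" encoding are assumptions
    made for this example, not the documented file layout.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for row in csv.DictReader(csv_file):
        state = row[state_col]
        totals[state] += 1
        if row[match_col].strip().lower() == "true":
            matches[state] += 1
    return {state: matches[state] / totals[state] for state in totals}


# Tiny made-up sample standing in for a comparison_results_YYYY.csv file.
sample = io.StringIO(
    "taxsimid,state,federal_match\n"
    "1,CA,True\n"
    "2,CA,False\n"
    "3,NY,True\n"
)
rates = match_rate_by_state(sample)
print(rates)
```

For a real file, pass an open file handle (`open(path, newline="")`) instead of the in-memory sample.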
Extract a sample from large datasets:
python policyengine_taxsim/cli.py sample-data input.csv --sample 1000
Options:
Option | Description |
---|---|
--sample N | Number of records to sample |
--output, -o | Output file (auto-generated if not specified) |
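For readers who want to see what sampling N records amounts to, here is a minimal standard-library sketch. How `sample-data` actually selects rows is not described here; uniform sampling without replacement and the `seed` parameter are assumptions made for illustration.

```python
import csv
import io
import random


def sample_records(csv_text, n, seed=0):
    """Return up to `n` rows drawn uniformly without replacement.

    Assumption: uniform sampling with a fixed seed; the CLI may use a
    different strategy.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if n >= len(rows):
        return rows
    rng = random.Random(seed)
    return rng.sample(rows, n)


# Made-up input with five households.
csv_text = "taxsimid,year,pwages\n" + "".join(
    f"{i},2021,{i * 10000}\n" for i in range(1, 6)
)
subset = sample_records(csv_text, 2)
print([row["taxsimid"] for row in subset])
```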
The project includes a comprehensive React-based interactive dashboard for visualizing and exploring tax calculation comparisons across multiple years and states.
- Navigate to the dashboard directory:
cd cps-dashboard
- Install dependencies:
npm install
- Start the development server:
npm start
- Open your browser to http://localhost:3000
- Multi-Year Analysis: Compare results across tax years 2021-2024 with year-over-year trends
- State-by-State Breakdown: Detailed analysis by all 50 US states plus DC
- Interactive Filtering: Advanced filtering by state, match status, and household characteristics
- Variable-Level Comparisons: Drill down to see differences in specific tax variables (v10, v32, etc.)
- Match Rate Analytics: Visualize federal vs state tax calculation accuracy rates
- Household Explorer: Expand individual households to examine all input and output variables
- Mismatch Analysis: Identify patterns in calculation differences
- Export Capabilities: Download filtered comparison data in CSV format
- Smart Tolerance: Uses $15 tolerance accounting for reasonable calculation differences
- Real-Time Statistics: Dynamic summary statistics that update with filtering
- GitHub Integration: Direct links to relevant issues and documentation
The dashboard loads comparison data from public/data/YYYY/comparison_results_YYYY.csv files. To update:
- Generate new comparison data:
python policyengine_taxsim/cli.py compare your_data.csv --year 2024
- Copy results to dashboard:
cp comparison_output/comparison_results_2024.csv cps-dashboard/public/data/2024/
- Restart dashboard to load new data
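The copy step can also be scripted, which is convenient when refreshing several years at once. The sketch below assumes the default `comparison_output` directory and the `cps-dashboard/public/data/YYYY/` layout described above; the function name `publish_results` is hypothetical.

```python
import shutil
from pathlib import Path


def publish_results(year, output_dir="comparison_output", dashboard_dir="cps-dashboard"):
    """Copy comparison_results_<year>.csv into the dashboard's data folder.

    Hypothetical helper; the directory layout follows the paths shown above.
    Returns the path of the copied file.
    """
    source = Path(output_dir) / f"comparison_results_{year}.csv"
    target_dir = Path(dashboard_dir) / "public" / "data" / str(year)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the year folder if needed
    return Path(shutil.copy2(source, target_dir))
```

Calling `publish_results(2024)` from the repository root performs the same copy as the `cp` command above and creates the year folder if it does not exist yet.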
Production Build:
npm run build
The dashboard provides an intuitive interface for researchers and analysts to explore large-scale tax calculation comparisons without requiring technical expertise.
The emulator accepts CSV files with the following variables:
Variable | Description | Notes |
---|---|---|
taxsimid | Unique identifier | |
year | Tax year | |
state | State code | |
mstat | Marital status | Only supports: 1 (single), 2 (joint) |
page | Primary taxpayer age | |
sage | Age of secondary taxpayer | |
depx | Number of dependents | |
age1 | First dependent's age | |
age2 | Second dependent's age | |
ageN | Nth dependent's age | TAXSIM only allows up to 8 dependents |
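The two constraints noted in the table (supported `mstat` values and the dependent limit) are easy to check before running a large batch. The following is a minimal sketch with a hypothetical `validate_record` helper; it is not the emulator's own validation logic.

```python
SUPPORTED_MSTAT = {1, 2}  # 1 = single, 2 = joint, per the table above
MAX_DEPENDENTS = 8        # TAXSIM limit noted above


def validate_record(record):
    """Return a list of problems found in one input record (empty if none).

    Hypothetical pre-flight check covering only the two documented limits.
    """
    problems = []
    mstat = int(record["mstat"])
    if mstat not in SUPPORTED_MSTAT:
        problems.append(f"unsupported mstat {mstat}")
    depx = int(record.get("depx", 0))
    if depx > MAX_DEPENDENTS:
        problems.append(f"depx {depx} exceeds the limit of {MAX_DEPENDENTS}")
    return problems


print(validate_record({"taxsimid": 1, "mstat": 2, "depx": 3}))
print(validate_record({"taxsimid": 2, "mstat": 6, "depx": 9}))
```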
Variable | Description |
---|---|
pwages | Primary taxpayer wages |
swages | Wage and salary income of secondary taxpayer |
intrec | Taxable interest income |
dividends | Qualified dividend income |
ltcg | Long-term capital gains |
stcg | Short-term capital gains |
psemp | Primary taxpayer self-employment income |
ssemp | Spouse self-employment income |
gssi | Social security retirement benefits |
pensions | Taxable private pension income |
scorp | Partnership/S-corp income |
pbusinc | Primary taxpayer business income that qualifies for the QBID |
Variable | Description |
---|---|
rentpaid | Amount of rent paid |
mortgage | Deductible mortgage interest |
proptax | Real estate taxes |
childcare | Childcare expenses |
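To make the input format concrete, the sketch below writes a two-household CSV using a handful of the variables listed above. The values are made up, and the numeric state value is a placeholder, since the expected state encoding is not spelled out here. Whether omitted columns default to zero is also not stated, so the example simply includes a small subset of columns.

```python
import csv
import io

# A small subset of the documented input variables.
FIELDS = ["taxsimid", "year", "state", "mstat", "page", "sage",
          "depx", "age1", "pwages", "swages", "mortgage"]

# Made-up households; the state value 5 is a placeholder code.
households = [
    {"taxsimid": 1, "year": 2021, "state": 5, "mstat": 1, "page": 40, "sage": 0,
     "depx": 0, "age1": 0, "pwages": 50000, "swages": 0, "mortgage": 0},
    {"taxsimid": 2, "year": 2021, "state": 5, "mstat": 2, "page": 45, "sage": 43,
     "depx": 1, "age1": 7, "pwages": 80000, "swages": 30000, "mortgage": 6000},
]


def to_taxsim_csv(rows, fields=FIELDS):
    """Serialize household dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_taxsim_csv(households)
print(csv_text)
```

Writing `csv_text` to a file gives an input that can be passed to the CLI commands described earlier.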
The idtl input value determines which type of output is generated:
idtl | Description |
---|---|
0 | Standard output |
2 | Full output |
5 | Full text description output |
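In scripts that prepare input files, the three documented values can be kept in a small lookup so that unsupported values are caught early. This is a sketch with hypothetical names; how the emulator itself reacts to other idtl values is not covered here.

```python
# Output types keyed by idtl, as listed in the table above.
IDTL_OUTPUT_TYPES = {
    0: "Standard output",
    2: "Full output",
    5: "Full text description output",
}


def describe_idtl(idtl):
    """Return the output type for an idtl value, or raise ValueError.

    Hypothetical helper; only the three documented values are accepted.
    """
    key = int(idtl)
    if key not in IDTL_OUTPUT_TYPES:
        raise ValueError(f"idtl {key} is not one of {sorted(IDTL_OUTPUT_TYPES)}")
    return IDTL_OUTPUT_TYPES[key]


print(describe_idtl(2))
```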
The emulator produces all standard TAXSIM output variables:
Variable | Description |
---|---|
taxsimid | Record identifier |
year | Tax year |
state | State code |
fiitax | Federal income tax liability |
siitax | State income tax liability |
fica | FICA taxes |
Variable | Description |
---|---|
v10 | Federal adjusted gross income |
v11 | Unemployment compensation in AGI |
v12 | Social Security benefits in AGI |
v13 | Zero bracket amount/standard deduction |
v14 | Personal exemptions |
v17 | Itemized deductions |
v18 | Federal taxable income |
v19 | Federal income tax before credits |
v22 | Child tax credit |
v23 | Additional child tax credit (refundable portion) |
v24 | Child and dependent care credit |
v25 | Earned income tax credit |
v26 | Alternative minimum tax income |
v27 | Alternative minimum tax |
v28 | Income tax before credits |
v29 | FICA taxes |
v32 | State AGI (or federal AGI/taxable income for conformity states) |
v34 | State standard deduction |
v35 | State itemized deductions |
v36 | State taxable income |
v37 | Property tax credit |
v38 | State child care credit |
v39 | State earned income credit |
v40 | Total state credits |
qbid | Qualified business income deduction |
niit | Net investment income tax |
cares | COVID-related recovery rebate credit |
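When a household's totals disagree, comparing the detailed variables one by one helps locate where the two calculations diverge. The sketch below lists the variables whose values differ by more than a threshold. Applying the $15 tolerance to individual variables is an assumption made for illustration, and the function name and the sample numbers are made up.

```python
def differing_variables(policyengine_row, taxsim_row, tolerance=15.0):
    """Return {variable: PolicyEngine minus TAXSIM} for differences above `tolerance`.

    Only variables present in both rows are compared. Using the same
    tolerance for every variable is an assumption of this example.
    """
    differences = {}
    for name in sorted(policyengine_row.keys() & taxsim_row.keys()):
        delta = float(policyengine_row[name]) - float(taxsim_row[name])
        if abs(delta) > tolerance:
            differences[name] = delta
    return differences


# Made-up values for a single household.
pe_row = {"v10": 50000, "v18": 37450, "v25": 0}
ts_row = {"v10": 50000, "v18": 37400, "v25": 10}
print(differing_variables(pe_row, ts_row))
```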