# Tax Engine Caching Strategy Design Document

## Overview
The Tax Engine is designed to calculate sales tax for line items while minimizing the frequency of external API calls to TaxJar. This is achieved through a multi-level, line-item granular caching strategy.

## Core Problem
External tax services like TaxJar often charge per API request. In a Point of Sale (POS) system with high transaction volumes, frequent API calls can become prohibitively expensive.

## Solution: Granular Multi-Level Caching

### 1. Line-Item Granularity
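Rather than caching the full response for a transaction, this system caches tax rates at the **Line Item Level**, so a rate fetched for one transaction can be reused by any later line item with the same origin, destination, and product tax code.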

- **Cache Key**: A SHA-256 hash of `from_zip` + `to_zip` + `product_tax_code`.
- **Price Independence**: The `unit_price` is **excluded** from the cache key. This ensures that if the price of an item changes, the *tax rate* remains valid and cached. The final tax amount is recalculated dynamically using the cached rate.
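
The key derivation and the price-independent recalculation can be sketched as follows. This is a minimal illustration rather than the engine's actual code: the function names, the `|` delimiter between fields, the half-up rounding to cents, and the sample zip codes, tax code, and rate are all assumptions made for the example.

```python
import hashlib
from decimal import Decimal, ROUND_HALF_UP


def make_cache_key(from_zip: str, to_zip: str, product_tax_code: str) -> str:
    """Hash the three rate-determining fields; unit_price is deliberately absent."""
    # Assumption: a "|" delimiter keeps field boundaries unambiguous, so that
    # ("12", "345") and ("123", "45") cannot collapse into the same input.
    raw = "|".join([from_zip, to_zip, product_tax_code])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def tax_amount(unit_price: Decimal, quantity: int, tax_rate: Decimal) -> Decimal:
    """Recompute the tax from a cached rate, rounded to whole cents."""
    # Assumption: half-up rounding; the real engine may round differently.
    total = unit_price * quantity * tax_rate
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


key = make_cache_key("94103", "10001", "20010")  # illustrative values
rate = Decimal("0.08875")                        # illustrative cached rate

# The same key (and cached rate) serves both prices; only the amount differs.
print(key)
print(tax_amount(Decimal("19.99"), 2, rate))
print(tax_amount(Decimal("24.99"), 2, rate))
```

Because the price never enters the hash, a price change leaves the cached entry untouched and only the final multiplication is repeated.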

### 2. Multi-Level Storage
The system uses two cache layers:
1. **L1: Redis (Speed)**: An in-memory cache for low-latency lookups during active sessions.
2. **L2: MongoDB (Persistence)**: A persistent database store that keeps tax rates available across application restarts and provides a long-term historical record.
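
The read path across the two layers can be sketched like this. Plain dictionaries stand in for Redis and MongoDB so the example runs without either service; the `TwoLevelCache` class and its method names are hypothetical, not taken from the engine.

```python
from typing import Optional


class TwoLevelCache:
    """Read-through lookup: L1 first, then L2, backfilling L1 on an L2 hit."""

    def __init__(self, l1: dict, l2: dict) -> None:
        self.l1 = l1  # stand-in for Redis (fast, volatile)
        self.l2 = l2  # stand-in for MongoDB (persistent)

    def get(self, cache_key: str) -> Optional[dict]:
        entry = self.l1.get(cache_key)
        if entry is not None:
            return entry                 # L1 hit
        entry = self.l2.get(cache_key)
        if entry is not None:
            self.l1[cache_key] = entry   # L2 hit: backfill L1 for next time
        return entry                     # None means a miss in both layers

    def put(self, cache_key: str, entry: dict) -> None:
        # New rates are written to both layers.
        self.l2[cache_key] = entry
        self.l1[cache_key] = entry


# Simulate a restart: L1 is empty, L2 still holds the rate.
cache = TwoLevelCache(l1={}, l2={"abc123": {"taxRate": "0.0725"}})
print(cache.get("abc123"))      # served from L2, then copied into L1
print("abc123" in cache.l1)     # the backfill happened
print(cache.get("unknown"))     # miss in both layers
```

Writing to L2 before L1 in `put` is a design choice for the sketch: if the persistent write fails, the faster layer never holds an entry the durable layer lacks.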

### 3. Partial API Execution (Nexus Strategy)
When a request with multiple line items arrives:
1. The engine iterates through all line items.
2. It checks L1 (Redis) and then L2 (MongoDB) for each item.
3. It separates items into **Cached** and **Missing**.
4. **Efficiency Gain**: If 8 out of 10 items are cached, the engine makes a **partial API call** to TaxJar containing *only the 2 missing items*.
5. Results are then aggregated: cached data is merged with the new API response to return a complete result to the user.
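
The steps above can be sketched as a single function. Here `fetch_missing` is a hypothetical callable standing in for the TaxJar request (it is not the TaxJar client API), and a plain dictionary stands in for the combined L1/L2 lookup.

```python
from typing import Callable, Dict, List

Rates = Dict[str, str]  # cache key -> tax rate (kept as a string here)


def resolve_rates(
    cache_keys: List[str],
    cache: Rates,
    fetch_missing: Callable[[List[str]], Rates],
) -> Rates:
    """Return a rate for every key, calling the external service only for misses."""
    cached = {k: cache[k] for k in cache_keys if k in cache}
    # dict.fromkeys de-duplicates while preserving order, so a key that appears
    # on several line items is requested once.
    missing = [k for k in dict.fromkeys(cache_keys) if k not in cache]
    fetched = fetch_missing(missing) if missing else {}
    cache.update(fetched)           # store new rates for later requests
    return {**cached, **fetched}    # aggregate cached and fresh results


api_calls: List[List[str]] = []


def fake_fetch(keys: List[str]) -> Rates:
    """Hypothetical stand-in for the external API; records what was requested."""
    api_calls.append(keys)
    return {k: "0.0800" for k in keys}


keys = [f"item-{i}" for i in range(10)]
warm_cache = {k: "0.0725" for k in keys[:8]}  # 8 of the 10 items already cached

rates = resolve_rates(keys, warm_cache, fake_fetch)
print(len(rates), api_calls)
```

Running the same request a second time would make no external call at all, because the two fetched rates are now in the cache.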

## Architecture Diagram

```mermaid
graph TD
    A[POS Request] --> B{Tax Engine}
    B --> C[Iterate Line Items]
    C --> D{Check L1: Redis}
    D -- Hit --> E[Add to Results]
    D -- Miss --> F{Check L2: MongoDB}
    F -- Hit --> G[Backfill L1 & Add to Results]
    F -- Miss --> H[Mark as Missing]
    E --> I{Missing Items > 0?}
    G --> I
    H --> I
    I -- Yes --> J[Call TaxJar API for Missing]
    J --> K[Cache New Rates in L1 & L2]
    K --> L[Aggregate All Results]
    I -- No --> L
    L --> M[Final Tax Response]
```

## Benefits
- **Cost Reduction**: Fewer billable TaxJar API calls, because rates are reused for recurring zip code pairs and product tax codes.
- **Improved Latency**: Cached items are served from Redis or MongoDB, avoiding the network round trip of an external API call.
- **Resilience**: The system can still provide tax rates for previously seen items even if the external tax service is temporarily unreachable.
- **Scalability**: Decoupling the tax calculation from the external service allows the POS system to handle higher bursts of traffic.

## Data Schema (MongoDB)
Cached items are stored with the following structure:
- `cacheKey`: Unique identifier (hash).
- `taxRate`: The combined tax rate.
- `stateTaxRate`, `countryTaxRate`, `shippingTaxRate`: Breakdown for audit purposes.
- `requestPayload`: Metadata about the original request (zips, tax code) for debugging.
- `createdAt`: Timestamp of when the entry was cached, used for TTL-based expiry.
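
As an illustration of the schema, the sketch below builds one such document. The field names come from the list above; the `build_cache_document` helper, the value types (rates as strings), the keys inside `requestPayload`, and the sample values are assumptions.

```python
from datetime import datetime, timezone


def build_cache_document(
    cache_key: str,
    from_zip: str,
    to_zip: str,
    product_tax_code: str,
    tax_rate: str,
    breakdown: dict,
) -> dict:
    """Assemble a cache entry shaped like the schema listed above."""
    return {
        "cacheKey": cache_key,
        "taxRate": tax_rate,
        # Breakdown fields default to None when the source omits them.
        "stateTaxRate": breakdown.get("stateTaxRate"),
        "countryTaxRate": breakdown.get("countryTaxRate"),
        "shippingTaxRate": breakdown.get("shippingTaxRate"),
        # Assumed payload layout: the three fields that feed the cache key.
        "requestPayload": {
            "from_zip": from_zip,
            "to_zip": to_zip,
            "product_tax_code": product_tax_code,
        },
        # A timezone-aware UTC datetime, which MongoDB stores as a Date.
        "createdAt": datetime.now(timezone.utc),
    }


doc = build_cache_document(
    "abc123", "94103", "10001", "20010", "0.08875",
    {"stateTaxRate": "0.04"},
)
print(sorted(doc))
```

For expiry, MongoDB supports TTL indexes on date fields; with PyMongo this would be along the lines of `collection.create_index("createdAt", expireAfterSeconds=...)`, where the retention period is a choice this document does not specify. Note that any TTL also removes entries from the long-term record, so the period would need to balance rate freshness against the audit history.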