Enhance API with data loading functionality and update README.

- Added `/load-data` endpoint to load transaction data from either a database or a CSV file.
- Updated `SalaryAnalyticsPipeline` and `DataLoader` to support loading from CSV.
- Implemented data validation and error handling for loading processes.
- Revised README to include new data loading instructions and workflow steps.
- Added checks to ensure data is loaded before running analysis endpoints.
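
The load-then-analyze contract described in this commit can be illustrated with a minimal, framework-free sketch. It is not the code in `api.py`: `ApiError`, `AppState`, `load_data`, `require_loaded`, the `db_loader` callback, and the error messages are all hypothetical names chosen for illustration. Only the `source` values (`'db'`, `'csv'`), the rule that a file is required for CSV, and the 400 response when no data is loaded come from the change itself.

```python
import csv
import io
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


class ApiError(Exception):
    """Hypothetical stand-in for an HTTP error response (status code + message)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


class AppState:
    """Hypothetical in-memory holder for the loaded transactions."""

    def __init__(self) -> None:
        self.rows: Optional[List[Row]] = None


def load_data(
    state: AppState,
    source: str,
    file_bytes: Optional[bytes] = None,
    db_loader: Optional[Callable[[], List[Row]]] = None,
) -> int:
    """Validate the request, load rows, and return how many were loaded."""
    if source not in ("db", "csv"):
        raise ApiError(400, "source must be 'db' or 'csv'")

    if source == "csv":
        # A file is only required for the CSV path.
        if file_bytes is None:
            raise ApiError(400, "file is required when source is 'csv'")
        text = file_bytes.decode("utf-8")
        rows = list(csv.DictReader(io.StringIO(text)))
    else:
        # Assumption: the database read is injected as a callable.
        if db_loader is None:
            raise ApiError(500, "no database loader configured")
        rows = db_loader()

    if not rows:
        raise ApiError(400, "no transactions found in the data source")

    state.rows = rows
    return len(rows)


def require_loaded(state: AppState) -> List[Row]:
    """Guard for analysis steps: fail with 400 until data has been loaded."""
    if state.rows is None:
        raise ApiError(400, "No data loaded. Call /load-data first.")
    return state.rows
```

In a FastAPI application, the same guard would typically raise `fastapi.HTTPException(status_code=400, detail=...)` at the top of each analysis handler; whether this project does so is an assumption.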
2025-05-01 22:57:55 +01:00
parent 7e7094f0fd
commit 8acfb436f3
12 changed files with 205 additions and 29 deletions
@@ -46,7 +46,6 @@ salary_analytics/
 └── api.py             # FastAPI endpoints
 ```

-
 ## Configuration

 The system can be configured through environment variables or the `config.py` file:
@@ -89,12 +88,26 @@ uvicorn salary_analytics.api:app --reload
   - `GET /`: Welcome message
   - `GET /health`: Health check

-2. **Analysis Endpoints**
+2. **Data Loading**
+   - `POST /load-data`: Load transaction data
+     - Parameters:
+       - `source`: Data source ('db' or 'csv')
+       - `file`: CSV file (required if source is 'csv')
+     - Example:
+       ```bash
+       # Load from database
+       curl -X POST "http://localhost:8000/load-data?source=db"
+       
+       # Load from CSV
+       curl -X POST "http://localhost:8000/load-data?source=csv" -F "file=@path/to/your/file.csv"
+       ```
+
+3. **Analysis Endpoints**
   - `POST /analyze/keyword`: Run keyword analysis
   - `POST /analyze/consistent-amount`: Run consistent amount analysis
   - `POST /analyze/transaction-type`: Run transaction type analysis

-3. **Report Generation**
+4. **Report Generation**
   - `POST /generate/reports`: Generate all reports
   - `GET /download/{report_type}`: Download specific reports
     - Available types:
@@ -105,12 +118,21 @@ uvicorn salary_analytics.api:app --reload
       - `inconsistent_plot`: Inconsistent earners plot
       - `hypothesis_plot`: Hypothesis overlap plot

-4. **Model Training**
+5. **Model Training**
   - `POST /train/models`: Train prediction models

-5. **Pipeline**
+6. **Pipeline**
   - `POST /run/pipeline`: Run complete pipeline

+### Workflow
+
+1. Start the API server
+2. Load data using the `/load-data` endpoint
+3. Run any of the analysis endpoints
+4. Generate and download reports as needed
+
+Note: All analysis endpoints require data to be loaded first. Calling an analysis endpoint before loading data returns a 400 error with a message asking you to load data first.
+
 ## Docker Deployment

 1. Build the Docker image: