11 Commits

Author SHA1 Message Date
Joshua Salako 1a4e539626 Enhance salary analytics API with database operations and performance logging
- Introduced `DatabaseOperations` class for managing batch results in the database.
- Added functionality to create a batch results table and save batch processing results.
- Updated API endpoints to log execution time and handle batch processing errors more effectively.
- Improved response handling in analysis endpoints and added batch metadata to results.
- Suppressed warnings and improved logging throughout the application.
2025-05-10 16:56:23 +01:00
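A minimal sketch of the execution-time logging this commit describes, assuming a decorator-based approach (the commit does not say how timing is implemented). The names `log_execution_time`, `run_analysis`, and the logger name are hypothetical and do not come from the repository.

```python
import functools
import logging
import time

# Hypothetical logger name; the repository's actual logger is not shown.
logger = logging.getLogger("salary_analytics.timing")


def log_execution_time(func):
    """Log how long the wrapped callable takes, even when it raises."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # The finally clause runs on both success and failure.
            elapsed = time.perf_counter() - start
            logger.info("%s finished in %.3f s", func.__name__, elapsed)

    return wrapper


@log_execution_time
def run_analysis(values):
    """Hypothetical stand-in for an analysis endpoint handler."""
    return sum(values) / len(values)
```

This wrapper covers synchronous callables only; an `async def` endpoint would need an awaiting variant.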
Joshua Salako 305e5da4ec Update .gitignore to exclude __pycache__ in salary_analytics directory and enhance run_streaming_pipeline endpoint to enforce file presence for CSV source. 2025-05-10 15:14:38 +01:00
CHIEFSOFT\ameye 5e5459450b File upload issue 2025-05-10 08:34:59 -04:00
CHIEFSOFT\ameye 420ff8e26c Added to .gitignore 2025-05-10 07:31:12 -04:00
Joshua Salako a060fa69c5 Refactor data loading and streaming pipeline endpoints for improved file handling
- Updated `/load-data` endpoint to make the file parameter optional and added validation for CSV uploads.
- Introduced a new dependency function `get_file_if_csv` to streamline file checks when loading data from CSV.
- Enhanced `/run/streaming-pipeline` endpoint to utilize the new file handling logic.
- Improved code readability by restructuring file renaming logic.
2025-05-03 15:40:50 +01:00
Joshua Salako 9c429caa56 Implement streaming pipeline endpoint for batch processing
- Added `/run/streaming-pipeline` endpoint to process data in batches from either a database or CSV file.
- Introduced `BatchResponse` model for structured responses.
- Updated README with new endpoint details, including parameters and example usage.
- Enhanced error handling and logging during batch processing.
- Ensured data preprocessing and NaN handling in analysis functions.
2025-05-02 14:25:31 +01:00
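The batch processing behind `/run/streaming-pipeline` might look roughly like the sketch below, which reads CSV rows in fixed-size batches and records a per-batch error instead of aborting the run. This is an illustration using only the standard library, not the repository's code: `iter_batches`, `process_in_batches`, the `amount` column, and the result dictionary shape are all assumptions.

```python
import csv
import io
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most batch_size items from rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_in_batches(csv_text, batch_size):
    """Sum a hypothetical 'amount' column per batch, isolating bad batches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for number, batch in enumerate(iter_batches(reader, batch_size), start=1):
        try:
            total = sum(float(row["amount"]) for row in batch)
            error = None
        except (KeyError, ValueError) as exc:
            # A malformed row fails only its own batch.
            total, error = None, str(exc)
        results.append(
            {"batch": number, "rows": len(batch), "total": total, "error": error}
        )
    return results
```

Because the reader is consumed lazily, only one batch of rows is held in memory at a time.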
Joshua Salako 5767f55686 Update project structure and enhance model persistence
- Added new model and scaler files to .gitignore and output directory.
- Updated Dockerfile to create output/models directory.
- Revised README to include instructions for using a .env file for configuration.
- Enhanced config.py to load database credentials from environment variables.
- Implemented model saving functionality in salary_predictor.py for consistent and inconsistent earners.
2025-05-02 00:16:46 +01:00
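As a rough illustration of loading database credentials from environment variables, the sketch below reads settings from a mapping and fails early when any are missing. The variable names (`DB_HOST`, `DB_PORT`, and so on), the function `load_db_config`, and the default port are assumptions; the commit does not list the actual keys used in `config.py` or name the database engine.

```python
import os


def load_db_config(env=None):
    """Build a connection settings dict from environment variables."""
    # Accept any mapping so the function can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env

    # Hypothetical key names; the repository's actual keys are not shown.
    required = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing database settings: " + ", ".join(missing))

    return {
        "host": env["DB_HOST"],
        # 5432 is an assumed default, chosen only for illustration.
        "port": int(env.get("DB_PORT", "5432")),
        "name": env["DB_NAME"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

Values from a `.env` file would need to be loaded into the environment before this function is called; that step is not shown here.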
Joshua Salako 8acfb436f3 Enhance API with data loading functionality and update README.
- Added `/load-data` endpoint to load transaction data from either a database or a CSV file.
- Updated `SalaryAnalyticsPipeline` and `DataLoader` to support loading from CSV.
- Implemented data validation and error handling for loading processes.
- Revised README to include new data loading instructions and workflow steps.
- Added checks to ensure data is loaded before running analysis endpoints.
2025-05-01 22:57:55 +01:00
Joshua Salako 7e7094f0fd Remove salary.py file, eliminating all salary transaction analysis and related functions. 2025-04-28 19:45:19 +01:00
Joshua Salako 591d4611b6 Added new salary-related terms and improved image outputs in salary.ipynb 2025-04-28 19:44:40 +01:00
Joshua Salako 8207d8f1ff first commit 2025-04-25 00:01:38 +01:00