# aki_prj23_transparenzregister

[Python](https://www.python.org)
[Ruff CI](https://github.com/astral-sh/ruff/actions)
[Ruff](https://github.com/astral-sh/ruff)
[pre-commit](https://github.com/pre-commit/pre-commit)
[mypy](http://mypy-lang.org/)
[mypy documentation](https://mypy.readthedocs.io/en/stable/?badge=stable)
[Black](https://github.com/psf/black)

## Contributions

See [CONTRIBUTING.md](CONTRIBUTING.md) for how code should be formatted and which rules we set for ourselves.

## Available entrypoints

The project currently provides the following entrypoints:

- **data-transformation** > Transfers all data from the MongoDB into the SQL db to make it available as production data.
- **data-processing** > Processes the data using NLP methods and transfers matched data into the SQL table ready for use.
- **reset-sql** > Resets all SQL tables in the connected db.
- **copy-sql** > Copies the content of one db to another db.
- **webserver** > Starts the webserver showing the analysis results.

## DB Connection settings

To connect to the SQL db see [sql/connector.py](./src/aki_prj23_transparenzregister/utils/sql/connector.py)
To connect to the Mongo db see [connect]

Create a `secrets.json` in the root of this repo with the following structure (values to be replaced with the desired config):

The `sqlite` entry is an alternative to the `postgres` section.
```json
{
  "sqlite": "path-to-sqlite.db",
  "postgres": {
    "username": "username",
    "password": "password",
    "host": "localhost",
    "database": "db-name",
    "port": 5432
  },
  "mongo": {
    "username": "username",
    "password": "password",
    "host": "localhost",
    "database": "transparenzregister",
    "port": 27017
  }
}
```
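
To illustrate how such a file could be consumed, here is a minimal sketch that reads `secrets.json` and derives an SQLAlchemy-style connection URL. The helper `sql_url_from_secrets` is hypothetical and is not the project's `ConfigProvider` or `connector.py`. Preferring the `sqlite` entry when both entries are present is an assumption made for this example only.

```python
import json
from pathlib import Path
from urllib.parse import quote_plus


def sql_url_from_secrets(path: str = "secrets.json") -> str:
    """Build an SQLAlchemy-style URL from a secrets file (hypothetical helper)."""
    secrets = json.loads(Path(path).read_text(encoding="utf-8"))

    # Assumption: the sqlite entry wins when both entries are present.
    if "sqlite" in secrets:
        return f"sqlite:///{secrets['sqlite']}"

    if "postgres" in secrets:
        pg = secrets["postgres"]
        # Percent-encode credentials so special characters survive in the URL.
        user = quote_plus(pg["username"])
        password = quote_plus(pg["password"])
        return (
            f"postgresql://{user}:{password}"
            f"@{pg['host']}:{pg['port']}/{pg['database']}"
        )

    raise KeyError("secrets file needs a 'sqlite' or a 'postgres' entry")
```
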
Alternatively, the secrets can be provided as environment variables. One option to do so is to add a `.env` file with the following layout:

```
PYTHON_POSTGRES_USERNAME=postgres
PYTHON_POSTGRES_PASSWORD=postgres
PYTHON_POSTGRES_HOST=localhost
PYTHON_POSTGRES_DATABASE=postgres
PYTHON_POSTGRES_PORT=5432

PYTHON_MONGO_USERNAME=username
PYTHON_MONGO_HOST=localhost
PYTHON_MONGO_PASSWORD=password
PYTHON_MONGO_PORT=27017
PYTHON_MONGO_DATABASE=transparenzregister

PYTHON_SQLITE_PATH=PathToSQLite3.db # An override path to an SQLite db

PYTHON_DASH_LOGIN_USERNAME=some-login-to-webgui
PYTHON_DASH_LOGIN_PW=some-pw-to-login-to-webgui

CR=ghcr.io/fhswf/aki_prj23_transparenzregister
TAG=latest

HTTP_PORT=80
```
The prefix `PYTHON_` can be customized by setting a different `prefix` when constructing the ConfigProvider.

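
The naming scheme of the variables above (prefix, then section, then key) can be sketched as follows. The function `read_section` is a hypothetical stand-in written for this example; it does not reproduce the actual `ConfigProvider` API, and the optional `env` parameter exists only so the example runs without touching the real environment.

```python
import os
from collections.abc import Mapping
from typing import Optional


def read_section(
    section: str,
    keys: list[str],
    prefix: str = "PYTHON_",
    env: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Collect `<prefix><SECTION>_<KEY>` variables into a dict (hypothetical helper)."""
    source = os.environ if env is None else env
    values: dict[str, str] = {}
    for key in keys:
        name = f"{prefix}{section.upper()}_{key.upper()}"
        # Variables that are not set are skipped instead of raising.
        if name in source:
            values[key] = source[name]
    return values


# Usage with an explicit mapping instead of the process environment.
example_env = {"PYTHON_MONGO_HOST": "localhost", "PYTHON_MONGO_PORT": "27017"}
print(read_section("mongo", ["host", "port", "username"], env=example_env))
```

All values come back as strings, so a port such as `27017` would still need to be converted to an integer by the caller.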