af8a907cf9
Stop table reset for better persistent tables. ( #373 )
2023-11-12 14:27:44 +01:00
d66e4e2b67
Added Hausarbeit (term paper)
2023-11-12 14:22:56 +01:00
19a4460d90
Added a small stability fix. ( #374 )
...
The current code has problems with an empty db.
2023-11-12 14:14:15 +01:00
24c55c68b7
Removed docstring ruins. ( #367 )
2023-11-12 13:58:00 +01:00
3b2f9b98f2
Update pre-commit hooks ( #376 )
...
Update versions of pre-commit hooks to latest version.
Co-authored-by: philipp-horstenkamp <philipp-horstenkamp@users.noreply.github.com>
2023-11-12 10:12:17 +01:00
bbc15bc7a2
Feat/116 scheduling tools ( #358 )
...
Init `ingestion` container with `fetch_news` target to retrieve latest
news articles from Tagesschau and Handelsblatt twice a day.
Integration of the `find_missing_companies.py` will follow once this is
merged.
2023-11-11 14:34:51 +01:00
05ea0fbb33
refactor: Include logger.catch with reraise
2023-11-11 14:30:00 +01:00
5dcf8ecf55
build: Dockerize apps/fetch_news.py as ingestor
2023-11-11 14:30:00 +01:00
170056bf58
test: Cover apps/fetch_news.py with unit tests
2023-11-11 14:30:00 +01:00
ac6ca3547b
test: Add unit test for news api wrapper
2023-11-11 14:30:00 +01:00
ae41cf61bc
checkpoint: Resolve error in handelsblatt text fetch
2023-11-11 14:30:00 +01:00
a428eb4432
checkpoint: Init news extraction components and main app
2023-11-11 14:30:00 +01:00
905021af14
Experimental caching ( #285 )
...
Added some caching decorators to speed up page delivery.
2023-11-11 14:28:25 +01:00
066800123d
Created pipeline to run ner sentiment and sql ingest ( #314 )
...
Created a data processing pipeline that enhances the raw mined data with
organisation extraction and sentiment analysis prior to moving the data
to the SQL db.
The transfer of matched data is done afterwards.
---------
Co-authored-by: SeZett <zeleny.sebastian@fh-swf.de>
2023-11-11 13:28:12 +00:00
a6d486209a
Introduce extended_financial_data code ( #357 )
...
Introducing the previously developed method to fetch the financial data
via table parsing (aka "data lake like solution") in a non-destructive
manner by defaulting to the current RegEx-based behaviour.
2023-11-11 14:10:20 +01:00
e5b61bc19c
Added multi relation dropdowns to dashboard ( #363 )
...
This change allows a more complete set of relation combinations to be
filtered.
2023-11-11 13:47:46 +01:00
ad8f5d0fb1
Added github actions automerge for pre-commit updates. ( #362 )
2023-11-11 13:30:16 +01:00
b0bcdc6fe1
refactor: PR feedback implemented
2023-11-11 11:18:23 +01:00
834f93a26e
Update src/aki_prj23_transparenzregister/utils/data_extraction/bundesanzeiger.py
...
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-11-11 11:03:36 +01:00
e1b8397f9e
feat: Introduce switch for different financial extraction routines
2023-11-11 11:03:36 +01:00
9edf5b1dce
test: Increase coverage for multi-column headers
2023-11-11 11:03:36 +01:00
3ba8c0abea
refactor: Remove debugging statements
2023-11-11 11:03:36 +01:00
3b1f0425cf
deps: Adding html5lib for table parsing via Pandas
2023-11-11 11:03:36 +01:00
801f945c59
temp: Print exception for test debugging
2023-11-11 11:01:17 +01:00
c19697c7f8
Update src/aki_prj23_transparenzregister/utils/data_extraction/bundesanzeiger.py
...
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-11-11 11:01:17 +01:00
fe7690620a
Update src/aki_prj23_transparenzregister/utils/data_extraction/bundesanzeiger.py
...
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-11-11 11:01:17 +01:00
fecf42d75a
test: Unit test new KPI extraction
2023-11-11 11:01:17 +01:00
f8a0d58314
feat(data-extraction): Provide KPI table analysis in bundesanzeiger wrapper
2023-11-11 11:01:17 +01:00
815e08a8f1
checkpoint: Transform values to € and normalize column names
2023-11-11 11:01:17 +01:00
ec11ae13aa
checkpoint: Parse table into dict of financial data
2023-11-11 11:01:17 +01:00
972fcd155e
checkpoint: Normalize HTML tables fetched from Bundesanzeiger
2023-11-11 11:01:17 +01:00
8781d746e7
hotfix: Add missing networkx dependency ( #361 )
...
Deployment on Jupiter is currently broken due to a missing `networkx`
dependency.
Should be fixed by the changes included.
2023-11-10 22:52:18 +01:00
c333ad70c5
hotfix: Add missing networkx dependency
2023-11-10 21:47:34 +01:00
247719c76f
Feature/visualize verflechtungen ( #324 )
2023-11-10 19:33:19 +01:00
da72c3d0a8
lint: Format company_elements.py
2023-11-10 19:21:32 +01:00
a1d8e942a9
test: Adapt home.py to run unit tests
2023-11-10 19:20:49 +01:00
fdbb6b5fd4
Added Graph to Company page again
2023-11-10 18:57:11 +01:00
e5769b3c25
Added Tests
...
Co-authored-by: Tristan Nolde <TrisNol@users.noreply.github.com>
2023-11-10 18:56:51 +01:00
410b690873
Added test
2023-11-10 18:56:51 +01:00
41af7e2d18
Added test behaviour
2023-11-10 18:56:51 +01:00
4d2ca3b3e7
Refactored Session handling for Network analysis
2023-11-10 18:56:51 +01:00
ac46348cc8
Added Dash DAQ
2023-11-10 18:55:13 +01:00
c38460c740
fixed mypy errors
2023-11-10 18:54:30 +01:00
f38728450d
Now ruff conform
2023-11-10 18:53:47 +01:00
f2ac0eda91
Added Relation_count method
2023-11-10 18:53:47 +01:00
76af89ff32
updated poetry lock
2023-11-10 18:53:32 +01:00
5b7f82a983
Bug fixes v2
2023-11-10 18:52:13 +01:00
152743597e
Bug fixes
2023-11-10 18:52:13 +01:00
31d7098d48
Checkpoint commit
2023-11-10 18:52:13 +01:00
c5721362ac
Test Bugs
2023-11-10 18:52:01 +01:00