36 Commits

Author SHA1 Message Date
TrisNol
f8a0d58314 feat(data-extraction): Provide KPI table analysis in bundesanzeiger wrapper 2023-11-11 11:01:17 +01:00
TrisNol
815e08a8f1 checkpoint: Transform values to € and normalize column names 2023-11-11 11:01:17 +01:00
TrisNol
ec11ae13aa checkpoint: Parse table into dict of financial data 2023-11-11 11:01:17 +01:00
TrisNol
972fcd155e checkpoint: Normalize HTML tables fetched from Bundesanzeiger 2023-11-11 11:01:17 +01:00
41f2c9f995
Executing black over all Jupyter notebooks (#190)
Reverting black for the jupyter notebooks gets old. Can we just run
black over all of them?
2023-10-04 20:03:47 +02:00
TrisNol
febcd59e39 test(data-extraction): Include first unit tests 2023-09-17 19:20:28 +02:00
TrisNol
bfe50ac76d checkpoint(data-ingestion): Move Unternehmensregister code to .py 2023-09-15 17:22:54 +02:00
TrisNol
8be192e1de checkpoint(data-ingestion): Include type in company relations, fix issue in capital for KGs 2023-09-15 15:39:42 +02:00
TrisNol
0c7216e105 checkpoint 2023-09-14 18:17:02 +02:00
TrisNol
413b43c615 checkpoint(data-ingestion): Unify date format in data 2023-09-14 16:47:11 +02:00
TrisNol
cf92cb61cc checkpoint(data-ingestion): Extract founding_date and other stats 2023-09-12 19:07:23 +02:00
TrisNol
1e15656028 refactor: Pull Auditor extraction into Bundesanzeiger utils 2023-08-18 16:21:52 +02:00
TrisNol
eb0962e1be refactor: Move Auditor dataclass to models 2023-08-18 14:20:34 +02:00
TrisNol
309755383e Install deutschland package 2023-08-18 14:15:05 +02:00
TrisNol
ed681d7c47 refactor: Implement linter feedback 2023-07-11 14:20:16 +02:00
TrisNol
4c95550dbf feat(data-extraction): MongoWrapper, DataClasses and services for News and Company data 2023-07-10 18:58:31 +02:00
TrisNol
e44385ce3a style: Refactoring imports, adapting MongoConnector to different connection_strings 2023-06-30 20:36:03 +02:00
TrisNol
3cd8860312 adding district court location to export 2023-06-27 19:49:23 +02:00
TrisNol
421b1e8c87 Bundesanzeiger preparation, Handelsblatt RSS feed export 2023-06-27 19:17:54 +02:00
TrisNol
37fb1b1da3 multi-process scraping, transforming unternehmensregister output 2023-06-25 15:58:53 +02:00
TrisNol
c9c7b0cf7a code cleanup, presentation on data extraction 2023-06-19 18:02:34 +02:00
TrisNol
6e31bc62bd mongodb wrapper for managing News objects 2023-06-16 18:50:19 +02:00
TrisNol
5b96bb7e3e adding company ID as well as compatible dataclasses 2023-06-16 18:00:11 +02:00
TrisNol
d3d8adabad dockerized mongodb as staging DB 2023-06-15 20:24:39 +02:00
TrisNol
3e737fbac5 first news article data extraction from tagesschau api 2023-06-15 18:04:23 +02:00
TrisNol
058c16b3ff Bulk process Unternehmensregister .xmls 2023-06-11 13:11:44 +02:00
TrisNol
1010b43a5f Extract first stakeholder information from Unternehmensregister export 2023-06-09 14:23:56 +02:00
TrisNol
e2ad2d475a Traverse all pages 2023-06-09 13:51:36 +02:00
TrisNol
d69368318f Download Unternehmensregister export via Selenium 2023-06-09 13:01:46 +02:00
RonnyFlex
6d509ee6ed News research + abstract for Verflechtungsanalyse (interconnection analysis) V1 2023-05-08 21:01:33 +02:00
TrisNol
0b4d955d26 feat(data extraction): Scraping data from Bundesanzeiger and parsing the results 2023-05-01 11:33:50 +02:00
6da2ba6a4f
Revert "Added OpenRegister DB schema."
This reverts commit d0803397ce936de87c1c429ff6feeee853ecabcf.
2023-04-27 00:05:07 +02:00
d0803397ce
Added OpenRegister DB schema. 2023-04-13 16:26:46 +02:00
TrisNol
a79c5b6560 extended bundesanzeiger API test 2023-04-12 10:58:39 +02:00
TrisNol
1762f41cb1 api tests: Adding previous test results for different apis 2023-04-07 11:51:55 +02:00
262476bd87
Added a first structure 2023-04-06 19:00:15 +02:00