Commit Graph

33 Commits

Author SHA1 Message Date
972fcd155e checkpoint: Normalize HTML tables fetched from Bundesanzeiger 2023-11-11 11:01:17 +01:00
41f2c9f995 Executing black over all jupyter notebook (#190)
Reverting black for the jupyter notebooks gets old. Can we just run
black over all of them?
2023-10-04 20:03:47 +02:00
febcd59e39 test(data-extraction): Include first unit tests 2023-09-17 19:20:28 +02:00
bfe50ac76d checkpoint(data-ingestion): Move Unternehmensregister code to .py 2023-09-15 17:22:54 +02:00
8be192e1de checkpoint(data-ingestion): Include type in company relations, fix issue in capital for KGs 2023-09-15 15:39:42 +02:00
0c7216e105 checkpoint 2023-09-14 18:17:02 +02:00
413b43c615 checkpoint(data-ingestion): Unify date format in data 2023-09-14 16:47:11 +02:00
cf92cb61cc checkpoint(data-ingestion): Extract founding_date and other stats 2023-09-12 19:07:23 +02:00
1e15656028 refactor: Pull Auditor extraction into Bundesanzeiger utils 2023-08-18 16:21:52 +02:00
eb0962e1be refactor: Move Auditor dataclass to models 2023-08-18 14:20:34 +02:00
309755383e Install deutschland package 2023-08-18 14:15:05 +02:00
ed681d7c47 refactor: Implement linter feedback 2023-07-11 14:20:16 +02:00
4c95550dbf feat(data-extraction): MongoWrapper, DataClasses and services for News and Company data 2023-07-10 18:58:31 +02:00
e44385ce3a style: Refactoring imports, adapting MongoConnector to different connection_strings 2023-06-30 20:36:03 +02:00
3cd8860312 adding district court location to export 2023-06-27 19:49:23 +02:00
421b1e8c87 Bundesanzeiger preparation, Handelsblatt RSS feed export 2023-06-27 19:17:54 +02:00
37fb1b1da3 multi-process scraping, transforming unternehmensregister output 2023-06-25 15:58:53 +02:00
c9c7b0cf7a code cleanup, presentation on data extraction 2023-06-19 18:02:34 +02:00
6e31bc62bd mongodb wrapper for managing News objects 2023-06-16 18:50:19 +02:00
5b96bb7e3e adding company ID as well as compatible dataclasses 2023-06-16 18:00:11 +02:00
d3d8adabad dockerized mongodb as staging DB 2023-06-15 20:24:39 +02:00
3e737fbac5 first news article data extraction from tagesschau api 2023-06-15 18:04:23 +02:00
058c16b3ff Bulk process Unternehmensregister .xmls 2023-06-11 13:11:44 +02:00
1010b43a5f Extract first stakeholder information from Unternehmensregister export 2023-06-09 14:23:56 +02:00
e2ad2d475a Traverse all pages 2023-06-09 13:51:36 +02:00
d69368318f Download Unternehmensregister export via Selenium 2023-06-09 13:01:46 +02:00
6d509ee6ed News research + Abstract verflechtungsanalyse V1 2023-05-08 21:01:33 +02:00
0b4d955d26 feat(data extraction): Scraping data from Bundesanzeiger and parsing the results 2023-05-01 11:33:50 +02:00
6da2ba6a4f Revert "Added OpenRegister DB schema."
This reverts commit d0803397ce.
2023-04-27 00:05:07 +02:00
d0803397ce Added OpenRegister DB schema. 2023-04-13 16:26:46 +02:00
a79c5b6560 extended bundesanzeiger API test 2023-04-12 10:58:39 +02:00
1762f41cb1 api tests: Adding previous test results for different apis 2023-04-07 11:51:55 +02:00
262476bd87 Added a first structure 2023-04-06 19:00:15 +02:00