Commit Graph

32 Commits

Author SHA1 Message Date
Philipp 41f2c9f995 Executing black over all jupyter notebooks (#190)
Reverting black for the jupyter notebooks gets old. Can we just run
black over all of them?
2023-10-04 20:03:47 +02:00
TrisNol febcd59e39 test(data-extraction): Include first unit tests 2023-09-17 19:20:28 +02:00
TrisNol bfe50ac76d checkpoint(data-ingestion): Move Unternehmensregister code to .py 2023-09-15 17:22:54 +02:00
TrisNol 8be192e1de checkpoint(data-ingestion): Include type in company relations, fix issue in capital for KGs 2023-09-15 15:39:42 +02:00
TrisNol 0c7216e105 checkpoint 2023-09-14 18:17:02 +02:00
TrisNol 413b43c615 checkpoint(data-ingestion): Unify date format in data 2023-09-14 16:47:11 +02:00
TrisNol cf92cb61cc checkpoint(data-ingestion): Extract founding_date and other stats 2023-09-12 19:07:23 +02:00
TrisNol 1e15656028 refactor: Pull Auditor extraction into Bundesanzeiger utils 2023-08-18 16:21:52 +02:00
TrisNol eb0962e1be refactor: Move Auditor dataclass to models 2023-08-18 14:20:34 +02:00
TrisNol 309755383e Install deutschland package 2023-08-18 14:15:05 +02:00
TrisNol ed681d7c47 refactor: Implement linter feedback 2023-07-11 14:20:16 +02:00
TrisNol 4c95550dbf feat(data-extraction): MongoWrapper, DataClasses and services for News and Company data 2023-07-10 18:58:31 +02:00
TrisNol e44385ce3a style: Refactoring imports, adapting MongoConnector to different connection_strings 2023-06-30 20:36:03 +02:00
TrisNol 3cd8860312 adding district court location to export 2023-06-27 19:49:23 +02:00
TrisNol 421b1e8c87 Bundesanzeiger preparation, Handelsblatt RSS feed export 2023-06-27 19:17:54 +02:00
TrisNol 37fb1b1da3 multi-process scraping, transforming unternehmensregister output 2023-06-25 15:58:53 +02:00
TrisNol c9c7b0cf7a code cleanup, presentation on data extraction 2023-06-19 18:02:34 +02:00
TrisNol 6e31bc62bd mongodb wrapper for managing News objects 2023-06-16 18:50:19 +02:00
TrisNol 5b96bb7e3e adding company ID as well as compatible dataclasses 2023-06-16 18:00:11 +02:00
TrisNol d3d8adabad dockerized mongodb as staging DB 2023-06-15 20:24:39 +02:00
TrisNol 3e737fbac5 first news article data extraction from tagesschau api 2023-06-15 18:04:23 +02:00
TrisNol 058c16b3ff Bulk process Unternehmensregister .xmls 2023-06-11 13:11:44 +02:00
TrisNol 1010b43a5f Extract first stakeholder information from Unternehmensregister export 2023-06-09 14:23:56 +02:00
TrisNol e2ad2d475a Traverse all pages 2023-06-09 13:51:36 +02:00
TrisNol d69368318f Download Unternehmensregister export via Selenium 2023-06-09 13:01:46 +02:00
RonnyFlex 6d509ee6ed News research + Abstract Verflechtungsanalyse (interconnection analysis) V1 2023-05-08 21:01:33 +02:00
TrisNol 0b4d955d26 feat(data extraction): Scraping data from Bundesanzeiger and parsing the results 2023-05-01 11:33:50 +02:00
Philipp 6da2ba6a4f Revert "Added OpenRegister DB schema."
This reverts commit d0803397ce.
2023-04-27 00:05:07 +02:00
Philipp d0803397ce Added OpenRegister DB schema. 2023-04-13 16:26:46 +02:00
TrisNol a79c5b6560 extended bundesanzeiger API test 2023-04-12 10:58:39 +02:00
TrisNol 1762f41cb1 api tests: Adding previous test results for different apis 2023-04-07 11:51:55 +02:00
Philipp 262476bd87 Added a first structure 2023-04-06 19:00:15 +02:00