84 Commits

Author SHA1 Message Date
TrisNol
9edf5b1dce test: Increase coverage for multi-column headers 2023-11-11 11:03:36 +01:00
TrisNol
fecf42d75a test: Unit test new KPI extraction 2023-11-11 11:01:17 +01:00
Tim
e5769b3c25 Added Tests
Co-authored-by: Tristan Nolde <TrisNol@users.noreply.github.com>
2023-11-10 18:56:51 +01:00
Tim
410b690873 Added test 2023-11-10 18:56:51 +01:00
Tim
41af7e2d18 Added test behaviour 2023-11-10 18:56:51 +01:00
Tim
f38728450d now ruff conform 2023-11-10 18:53:47 +01:00
Tim
f2ac0eda91 Added Relation_count Method 2023-11-10 18:53:47 +01:00
Tim
30f9e4506f solved errors 2023-11-10 18:50:38 +01:00
Tim
7e8adfafd5 Test Version 2023-11-10 18:50:11 +01:00
TrisNol
f7ec3eaf24 test: Increase test coverage and refactor v3 2023-11-05 12:55:47 +01:00
TrisNol
e8d1a37cff test: Extend unit tests 2023-11-04 14:19:41 +01:00
TrisNol
61f94fa3b9 test: Unit tests 2023-11-04 11:24:36 +01:00
TrisNol
d6b07431e7 test: Adapt existing unit tests to refactored imports 2023-11-04 11:24:36 +01:00
ad36c68993
Moved the AI tests into the AI folder. (#315) 2023-11-03 13:45:24 +01:00
8d9981d967
Moved AI files in the AI module. (#308) 2023-11-02 20:30:04 +01:00
7953ba9291
Mixed typo fixes (#270) 2023-10-26 19:06:45 +02:00
1eb972b7ff
Adds the transfer of sentiments into the sql db (#253)
Transfers the sentiments from the mongodb into the sql db.
2023-10-24 17:50:40 +02:00
36a0bab6ff
Add relations from financial reports to SQL (#216) 2023-10-19 19:21:33 +02:00
TrisNol
83d313150c test: Update to new functions 2023-10-17 18:47:25 +02:00
TrisNol
600039207d test(data-extraction): Adapt unit tests to new behaviour 2023-10-17 18:16:44 +02:00
Sebastian
c680ac9759
Feature/ner (#103)
NER and sentiment pipeline with services for data extraction.

---------

Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
Co-authored-by: TrisNol <tristan.nolde@yahoo.de>
2023-10-16 19:54:24 +02:00
TrisNol
f1474feaf8 refactor: Adapt to extended unit tests 2023-10-15 13:21:41 +02:00
Tristan Nolde
fd47487367
Update tests/utils/data_extraction/unternehmensregister/transform_test.py
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-10-15 13:07:34 +02:00
TrisNol
8db04177be feat(data-extraction): Extract c/o relation from street in company relation 2023-10-15 13:06:32 +02:00
TrisNol
eba5235dff refactor: Implement PR feedback 2023-10-15 12:05:25 +02:00
Tristan Nolde
39c13ac74a
Update tests/utils/data_extraction/unternehmensregister/transform_test.py
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-10-15 11:51:11 +02:00
TrisNol
b972acee7a fix(data-extraction): Parse date from Gesellschaftsvertrag entry 2023-10-14 18:22:41 +02:00
6365e252b9
Added location to person (#185) 2023-10-14 15:27:19 +00:00
f8c111d7e2
Resolve mismatch between staging and prod db data for financials (#211)
SQL creation is now done dynamically from the definition of the enumeration
type.
2023-10-14 17:16:14 +02:00
TrisNol
84d0139531 fix(data-extraction): Handle malformed date_of_birth fields 2023-10-07 17:01:34 +02:00
Tristan Nolde
7500895982
fix: Add script to fix malformed yearly_result entries (#202) 2023-10-07 12:35:29 +02:00
TrisNol
9cc58ba8be fix: Add script to fix malformed yearly_result entries 2023-10-07 09:11:43 +02:00
63325e7faa
Add constraints to the SQL entities (#186) 2023-10-06 18:48:58 +02:00
b1ca268a62
SQL fixes after new mongo ingest (#199) 2023-10-06 18:22:19 +02:00
09c36960e3
Add a list of missing relation partners to be searched (#171)
- [x] Add a new table
- [x] Add a field to the table that can register if the company was
already queried
- [x] Add a field to the table that counts how many times a relation
partner was missing
- [x] Add a function that resets the counter

Also:
- Reworked the get_company function to use the location dict as kwargs
2023-10-05 19:57:30 +02:00
c6f2c7467c
Rework the transfer of company data to fit the new data in the mongodb (#188)
This adds the additional company data as proposed to the sql db.

- [x] @TrisNol Is everything included or did I miss a feature? Relations
are in another issue.
- [x] @KM-R New DB features for the Dashboard for your review.
2023-10-05 19:47:46 +02:00
TrisNol
259259953e refactor: Move quote removal function to string utils, adapt to requirements 2023-10-03 16:37:54 +02:00
TrisNol
2a446a9937 checkpoint: Remove quotes from company names in relations 2023-10-03 14:33:46 +02:00
TrisNol
49498ad7c0 checkpoint: Remove quotes from company name 2023-10-03 14:33:45 +02:00
Tristan Nolde
7e9cff046a
fix(data-extraction): Parse house-number from street field if possibl… (#179) 2023-10-03 14:26:21 +02:00
01b4ce00c1
Spellchecking with PyCharm (#133)
Co-authored-by: KM-R <129882581+KM-R@users.noreply.github.com>
2023-10-02 20:47:42 +02:00
d2d4a436f8
Add a cli interface to choose a configuration (#163)
- [x] add a cli to the webserver to take env variables into account 
- [x] add a cli to the data processing that takes environment variables
into account as a valid source
- [x] rework the cli for the reset sql command
- [x] rework the cli for the copying of sql data from one db to another
2023-10-02 20:31:42 +02:00
2abe12f027
Add a function to convert DM to EUR (#168)
The function is meant to transform the capital dict into a format that
can be added as a kwarg (**norm_capital(capital_dict)) to the company
entities.
This PR only contains the function itself.
2023-10-02 19:46:17 +02:00
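
The DM to EUR conversion described in #168 could look roughly like the sketch below. Only the name `norm_capital` and the `**norm_capital(capital_dict)` usage come from the commit message; the input keys (`nominal_value`, `currency`), the output keys (`capital_value`, `capital_currency`) and the rounding to cents are assumptions made for illustration and may differ from the actual implementation. The rate of 1.95583 DM per EUR is the officially fixed conversion rate.

```python
# Officially fixed conversion rate: 1 EUR = 1.95583 DM.
DM_PER_EUR = 1.95583


def norm_capital(capital: dict) -> dict:
    """Return the capital in EUR as kwargs for a company entity.

    Hypothetical sketch: the key names on both sides are assumed,
    not taken from the repository.
    """
    value = float(capital["nominal_value"])
    currency = capital["currency"]
    if currency in ("DM", "DEM"):
        # Convert Deutsche Mark to euro and round to cents.
        value = round(value / DM_PER_EUR, 2)
        currency = "EUR"
    return {"capital_value": value, "capital_currency": currency}


# The result can then be unpacked as keyword arguments.
kwargs = norm_capital({"nominal_value": 50000.0, "currency": "DM"})
```

Returning a plain dict keeps the function independent of the entity class, so the caller decides where the normalised values are unpacked.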
05472cc16a
Added longitude/latitude and positional accuracy to the company data (#180) 2023-10-02 17:18:04 +02:00
TrisNol
ab26a7a01e fix(data-extraction): Parse house-number from street field if possible, write Straße in full 2023-10-01 21:24:17 +02:00
TrisNol
2050b49fde fix(data-extraction): Resolve issue in different Bundesanzeiger formats 2023-09-25 18:37:39 +02:00
Tristan Nolde
5c8d20f4c2
Feature/additional stammdaten (#132)
Feature/additional stammdaten
2023-09-24 15:31:17 +02:00
820fb3e52b
Repaired the SQL copy and reduced the log volume a bit (#141)
- Added a cli interface to the SQL copy
- Repaired the SQL copy function
- Added the SQL copy function to the scripts
- Reduced the logging verbosity
2023-09-24 15:11:49 +02:00
TrisNol
282d638c11 refactor: Implement PR feedback 2023-09-24 13:46:19 +02:00
TrisNol
5a7472cd3c checkpoint(data-extraction): Adapt load to update existing entries in order to keep yearly_results 2023-09-23 12:07:07 +02:00