84 Commits

Author SHA1 Message Date
TrisNol
83d313150c test: Update to new functions 2023-10-17 18:47:25 +02:00
TrisNol
600039207d test(data-extraction): Adapt unit tests to new behaviour 2023-10-17 18:16:44 +02:00
Sebastian
c680ac9759
Feature/ner (#103)
NER and sentiment pipeline with services for data extraction.

---------

Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
Co-authored-by: TrisNol <tristan.nolde@yahoo.de>
2023-10-16 19:54:24 +02:00
TrisNol
f1474feaf8 refactor: Adapt to extended unit tests 2023-10-15 13:21:41 +02:00
Tristan Nolde
fd47487367
Update tests/utils/data_extraction/unternehmensregister/transform_test.py
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-10-15 13:07:34 +02:00
TrisNol
8db04177be feat(data-extraction): Extract c/o relation from street in company relation 2023-10-15 13:06:32 +02:00
Tristan Nolde
7e54ab98c5
fix(data-extraction): Parse date from Gesellschaftsvertrag entry (#221) 2023-10-15 13:06:04 +02:00
TrisNol
eba5235dff refactor: Implement PR feedback 2023-10-15 12:05:25 +02:00
Tristan Nolde
39c13ac74a
Update tests/utils/data_extraction/unternehmensregister/transform_test.py
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-10-15 11:51:11 +02:00
TrisNol
b972acee7a fix(data-extraction): Parse date from Gesellschaftsvertrag entry 2023-10-14 18:22:41 +02:00
KM-R
84772a5511
Small mypy fix (#219) 2023-10-14 18:12:01 +02:00
6365e252b9
Added location to person (#185) 2023-10-14 15:27:19 +00:00
f8c111d7e2
Resolve mismatch between staging and prod db data for financials (#211)
SQL creation is now done dynamically from the definition of the enumeration
type.
2023-10-14 17:16:14 +02:00
KM-R
9f7d714403
Visualize financials (#206)
Adds the financial graph to the company page. The graph is only
available for companies with existing financial data.
2023-10-14 17:08:34 +02:00
TrisNol
84d0139531 fix(data-extraction): Handle malformed date_of_birth fields 2023-10-07 17:01:34 +02:00
Tristan Nolde
7500895982
fix: Add script to fix malformed yearly_result entries (#202) 2023-10-07 12:35:29 +02:00
TrisNol
9cc58ba8be fix: Add script to fix malformed yearly_result entries 2023-10-07 09:11:43 +02:00
63325e7faa
Add constraints to the SQL entities (#186) 2023-10-06 18:48:58 +02:00
b1ca268a62
SQL fixes after new mongo ingest (#199) 2023-10-06 18:22:19 +02:00
09c36960e3
Add a list of missing relation partners to be searched (#171)
- [x] Add a new table
- [x] Add a field to the table that can register if the company was
already queried
- [x] Add a field to the table that counts how many times a relation
partner was missing
- [x] Add a function that resets the counter

Also:
- Reworked the get_company function to use the location dict as kwargs
2023-10-05 19:57:30 +02:00
c6f2c7467c
Rework the transfer of company data to fit the new data in the mongodb (#188)
This adds the additional company data as proposed to the sql db.

- [x] @TrisNol Is everything included, or did I miss a feature? Relations
are in another issue.
- [x] @KM-R New DB features for the dashboard for your review.
2023-10-05 19:47:46 +02:00
KM-R
2152704dfc
175 create person page (#178)
Created person page and updated search bar in the header to search for persons
2023-10-05 18:00:31 +02:00
Tristan Nolde
bf7c072e87
Fix/company names with quotes (#187) 2023-10-04 20:07:51 +02:00
030ad00c7d
Testing speedup with in memory SQLite (#189)
If no SQLite file is written and deleted, testing is much faster.
2023-10-04 19:36:57 +02:00
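The speedup described in #189 comes from keeping the test database in RAM instead of creating and deleting a file. A minimal sketch using only the standard-library `sqlite3` module; the `company` table and both helper names are hypothetical, and the project itself may go through an ORM rather than raw `sqlite3`:

```python
import sqlite3


def make_test_connection() -> sqlite3.Connection:
    """Return a fresh in-memory database; nothing is written to disk."""
    connection = sqlite3.connect(":memory:")
    # Hypothetical table, only here so the example has something to query.
    connection.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT)")
    return connection


def count_companies(connection: sqlite3.Connection) -> int:
    """Count the rows in the hypothetical company table."""
    return connection.execute("SELECT COUNT(*) FROM company").fetchone()[0]


connection = make_test_connection()
connection.execute("INSERT INTO company (name) VALUES (?)", ("Example GmbH",))
print(count_companies(connection))
```

Each call to `make_test_connection()` yields an independent, empty database, so tests stay isolated without any cleanup step.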
TrisNol
259259953e refactor: Move quote removal function to string utils, adapt to requirements 2023-10-03 16:37:54 +02:00
TrisNol
2a446a9937 checkpoint: Remove quotes from company names in relations 2023-10-03 14:33:46 +02:00
TrisNol
49498ad7c0 checkpoint: Remove quotes from company name 2023-10-03 14:33:45 +02:00
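The quote removal mentioned in the three commits above could look roughly like the following. This is a sketch under assumptions: the function name `remove_quotes` and the set of characters treated as quotes are hypothetical and not taken from the repository.

```python
# Assumption: only double-quote style characters are stripped, so that
# apostrophes inside names are left alone.
QUOTE_CHARS = '"„“”«»'
_QUOTE_TABLE = str.maketrans("", "", QUOTE_CHARS)


def remove_quotes(name: str) -> str:
    """Drop quote characters and collapse any leftover whitespace."""
    return " ".join(name.translate(_QUOTE_TABLE).split())


print(remove_quotes('"Muster" GmbH'))
```

Using `str.translate` with a deletion table removes every listed character in one pass, and the `split`/`join` pair normalises whitespace that the removed quotes may leave behind.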
Tristan Nolde
7e9cff046a
fix(data-extraction): Parse house-number from street field if possibl… (#179) 2023-10-03 14:26:21 +02:00
01b4ce00c1
Spellchecking with PyCharm (#133)
Co-authored-by: KM-R <129882581+KM-R@users.noreply.github.com>
2023-10-02 20:47:42 +02:00
d2d4a436f8
Add a cli interface to choose a configuration (#163)
- [x] add a cli to the webserver to take env variables into account 
- [x] add a cli to the data processing that takes environment variables
into account as a valid source
- [x] rework the cli for the reset sql command
- [x] rework the cli for the copying of sql data from one db to another
2023-10-02 20:31:42 +02:00
2abe12f027
Add a function to convert DM to EUR (#168)
The function is meant to transform the capital dict into a format that
can be added as kwargs (**norm_capital(capital_dict)) to the company
entities.
This PR only contains the function itself.
2023-10-02 19:46:17 +02:00
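A conversion like the one described in #168 could be sketched as follows. The rate of 1.95583 DM per EUR is the official fixed conversion rate; the input keys (`value`, `currency`) and the output keys (`capital_value`, `capital_currency`) are assumptions made for illustration and may differ from the real `norm_capital`.

```python
DM_PER_EUR = 1.95583  # official fixed conversion rate: 1 EUR = 1.95583 DM


def norm_capital(capital: dict) -> dict:
    """Return the capital as kwargs-ready dict, converting DM amounts to EUR."""
    value = float(capital["value"])
    currency = capital["currency"]
    if currency in ("DM", "DEM"):
        # Convert to EUR and round to cents.
        value = round(value / DM_PER_EUR, 2)
        currency = "EUR"
    # Hypothetical field names of the company entity.
    return {"capital_value": value, "capital_currency": currency}


print(norm_capital({"value": 50000, "currency": "DM"}))
```

Because the result is a flat dict, it can be unpacked directly into an entity constructor with `**norm_capital(capital_dict)`, which is the usage the commit message describes.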
05472cc16a
Added longitude/latitude and positional accuracy to the company data (#180) 2023-10-02 17:18:04 +02:00
TrisNol
ab26a7a01e fix(data-extraction): Parse house-number from street field if possible, write Straße in full 2023-10-01 21:24:17 +02:00
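Splitting a house number off the street field and expanding "Str." could be done roughly like this. The function name `split_street` and the regular expression are hypothetical; the repository's actual parsing rules are not shown in this log and may handle more formats.

```python
import re
from typing import Optional, Tuple

# Assumption: the house number is a trailing group of digits with an
# optional single letter suffix, e.g. "12" or "12a".
HOUSE_NUMBER = re.compile(r"^(?P<street>.+?)\s+(?P<number>\d+\s?[a-zA-Z]?)$")


def split_street(raw: str) -> Tuple[str, Optional[str]]:
    """Return (street, house_number); house_number is None if not found."""
    text = raw.strip()
    match = HOUSE_NUMBER.match(text)
    if match:
        street, number = match["street"], match["number"]
    else:
        street, number = text, None
    # Write the abbreviation "Str." / "str." out in full.
    street = re.sub(r"(S|s)tr\.", r"\1traße", street)
    return street, number


print(split_street("Musterstr. 12a"))
```

Returning `None` for the house number keeps the "if possible" behaviour: entries without a recognisable number pass through with only the street name normalised.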
KM-R
9566276047
Create multi page layout (#147)
Created two pages (home and company). The page reloads after a company is
selected in the dropdown or the home button is clicked.
2023-09-26 18:38:40 +02:00
TrisNol
77711d8a2f feat: Add simple wrapper to update particular financial entries 2023-09-25 19:34:10 +02:00
TrisNol
2050b49fde fix(data-extraction): Resolve issue in different Bundesanzeiger formats 2023-09-25 18:37:39 +02:00
091e67de79
Build first set of Docker containers in the pipeline and place them in the GitHub registry (#142)
- added a Dockerfile for the three containers
- added a workflow step to build the containers and place them in the
registry
- added a docker-compose.yaml to use the built images
- added a docker compose to build the images locally and a script for
prebuild steps
2023-09-24 16:32:52 +00:00
Tristan Nolde
5c8d20f4c2
Feature/additional stammdaten (#132)
2023-09-24 15:31:17 +02:00
820fb3e52b
Repaired the SQL copy and reduced the log volume a bit (#141)
- Added a cli interface to the SQL copy
- Repaired the SQL copy function
- Added the SQL copy function to the scripts
- Reduced the logging verbosity
2023-09-24 15:11:49 +02:00
TrisNol
282d638c11 refactor: Implement PR feedback 2023-09-24 13:46:19 +02:00
TrisNol
5a7472cd3c checkpoint(data-extraction): Adapt load to update existing entries in order to keep yearly_results 2023-09-23 12:07:07 +02:00
TrisNol
1e23a8d5a3 refactor(data-extraction): Move date_to_iso function to string_tools 2023-09-23 10:51:54 +02:00
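A helper such as `date_to_iso` might look like the sketch below. It assumes the input uses the German `DD.MM.YYYY` notation; the actual signature and the formats accepted in `string_tools` are not shown in this log.

```python
from datetime import datetime


def date_to_iso(date: str) -> str:
    """Convert a date such as '17.10.2023' to ISO 8601 ('2023-10-17')."""
    # strptime raises ValueError for input that does not match the format.
    return datetime.strptime(date.strip(), "%d.%m.%Y").date().isoformat()


print(date_to_iso("17.10.2023"))
```

Letting `strptime` raise on malformed input leaves the decision of how to treat bad dates to the caller.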
TrisNol
77f08cd901 Merge branch 'main' into feature/additional-stammdaten 2023-09-23 10:32:09 +02:00
TrisNol
d6223b4192 refactor(data-extraction): Improve variable naming and exception handling 2023-09-23 10:21:26 +02:00
TrisNol
4e25be5466 test(data-extraction): Introduce load.py test and scrape test 2023-09-23 10:07:15 +02:00
TrisNol
d7f167a868 ignore types mypy 2023-09-21 18:08:20 +02:00
TrisNol
e6af96ea6d test(data-extraction): Host temporary_dir in local env 2023-09-21 17:25:41 +02:00
TrisNol
535c31fc9f test(data-extraction): Change use of TemporaryDirectory 2023-09-21 17:16:25 +02:00
TrisNol
56c2ed55ec test(data-extraction): Delay file creation in test_rename_latest_file to avoid same timestamps 2023-09-21 16:54:23 +02:00
KM-R
487b2f42d1
update data based on selected company (#122)
Added UI elements to select a company and update shown data depending on chosen company



---------

Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
2023-09-19 23:45:10 +02:00