Financial reports are now filtered before being added to the SQL
database so that only known keys are added.
Some matching is also done.
The most important missing reports are printed so they can be
implemented later on.
Rapidfuzz could be used for the matching.
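
A minimal sketch of the filtering idea, not the code from this change. It
uses `difflib` from the standard library as a stand-in for Rapidfuzz, and
`KNOWN_KEYS`, `filter_report`, the sample report and the 0.85 cutoff are
all hypothetical names and values chosen for illustration.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical set of report keys that the database schema knows about.
KNOWN_KEYS = {"revenue", "net_income", "total_assets", "equity"}


def filter_report(report: dict, missing: Counter) -> dict:
    """Keep only known keys; fuzzy-match near misses and count the rest."""
    filtered = {}
    for key, value in report.items():
        normalized = key.strip().lower().replace(" ", "_")
        if normalized in KNOWN_KEYS:
            filtered[normalized] = value
            continue
        # Accept a close spelling variant of a known key (cutoff is a guess).
        match = get_close_matches(normalized, sorted(KNOWN_KEYS), n=1, cutoff=0.85)
        if match:
            filtered[match[0]] = value
        else:
            # Remember unknown keys so the most frequent ones can be printed.
            missing[normalized] += 1
    return filtered


missing_keys = Counter()
result = filter_report(
    {"Revenue": 10.0, "net_incom": 2.0, "goodwill": 1.0}, missing_keys
)
print(result)
print(missing_keys.most_common(5))
```

Counting the rejected keys in a `Counter` is one way to produce the list of
most important missing reports mentioned above.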
Created a data processing pipeline that enhances the raw mined data with
organisation extraction and sentiment analysis prior to moving the data
to the SQL DB.
The transfer of matched data is done afterward.
---------
Co-authored-by: SeZett <zeleny.sebastian@fh-swf.de>
Introducing the previously developed method to fetch the financial data
via table parsing (aka "data lake like solution") in a non-destructive
manner by defaulting to the current RegEx-based behaviour.
NER and sentiment pipeline with services for data extraction.
---------
Co-authored-by: Philipp Horstenkamp <philipp@horstenkamp.de>
Co-authored-by: TrisNol <tristan.nolde@yahoo.de>
- [x] Add a new table
- [x] Add a field to the table that can register if the company was
already queried
- [x] Add a field to the table that counts how many times a relation
partner was missing
- [x] Add a function that resets the counter
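
A rough sketch of what the checklist above describes, using `sqlite3` from
the standard library as a stand-in for the project's actual database layer.
The table name `missing_company`, its columns and both helper functions are
assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE missing_company (
        name TEXT PRIMARY KEY,
        queried INTEGER NOT NULL DEFAULT 0,       -- was the company already queried?
        missing_count INTEGER NOT NULL DEFAULT 0  -- how often a relation partner was missing
    )
    """
)


def register_missing(conn: sqlite3.Connection, name: str) -> None:
    """Create the row if needed, then increase its missing counter by one."""
    conn.execute("INSERT OR IGNORE INTO missing_company (name) VALUES (?)", (name,))
    conn.execute(
        "UPDATE missing_company SET missing_count = missing_count + 1 WHERE name = ?",
        (name,),
    )


def reset_counters(conn: sqlite3.Connection) -> None:
    """Set the missing counter of every row back to zero."""
    conn.execute("UPDATE missing_company SET missing_count = 0")


def get_count(conn: sqlite3.Connection, name: str) -> int:
    row = conn.execute(
        "SELECT missing_count FROM missing_company WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


register_missing(conn, "Example GmbH")
register_missing(conn, "Example GmbH")
count_before_reset = get_count(conn, "Example GmbH")
reset_counters(conn)
count_after_reset = get_count(conn, "Example GmbH")
```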
Also:
- Reworked the get_company function to use the location dict as kwargs
This adds the additional company data as proposed to the sql db.
- [x] @TrisNol Is everything included, or did I miss a feature? Relations
are in another issue.
- [x] @KM-R New DB features for the dashboard for your review.
- [x] add a cli to the webserver to take env variables into account
- [x] add a cli to the data processing that takes environment variables
into account as a valid source
- [x] rework the cli for the reset sql command
- [x] rework the cli for the copying of sql data from one db to another
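
One possible shape for a CLI that accepts environment variables as a source,
sketched with `argparse`. The option `--db-url` and the variable name
`APP_DB_URL` are hypothetical and not taken from this change.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical data processing CLI")
    parser.add_argument(
        "--db-url",
        # The environment variable is read when the parser is built and
        # serves as the default if the option is not given explicitly.
        default=os.environ.get("APP_DB_URL"),
        help="Database URL; falls back to the APP_DB_URL environment variable.",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.db_url is None:
        parser.error("provide --db-url or set APP_DB_URL")
    return args


if __name__ == "__main__":
    print(parse_args().db_url)
```

An explicit command line option takes precedence over the environment
variable in this sketch, which keeps local overrides simple.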
The function is meant to transform the capital dict into a format that
can be added as kwargs (**norm_capital(capital_dict)) to the company
entities.
This PR only contains the function itself.
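
To illustrate the intended call pattern only: the sketch below assumes a raw
capital dict with the keys `value`, `currency` and `type`, and a `Company`
entity with matching `capital_*` fields. All of these names are hypothetical;
the real function and entity in the repository may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Company:
    """Hypothetical stand-in for the company entity."""

    name: str
    capital_value: Optional[float] = None
    capital_currency: Optional[str] = None
    capital_type: Optional[str] = None


def norm_capital(capital: Optional[dict]) -> dict:
    """Map a raw capital dict to keyword arguments (assumed field names)."""
    if not capital:
        # No capital information: contribute no kwargs at all.
        return {}
    return {
        "capital_value": float(capital["value"]),
        "capital_currency": capital.get("currency"),
        "capital_type": capital.get("type"),
    }


capital_dict = {"value": "25000", "currency": "EUR", "type": "Stammkapital"}
company = Company(name="Example GmbH", **norm_capital(capital_dict))
company_without_capital = Company(name="Other GmbH", **norm_capital(None))
```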