Data Science Projects

Optimization of production for a large (>15k employees) steel producer in North-West Europe.

Modelled the whole production chain: reception of raw materials at the port, rail transport, storage, all processing steps (coal ovens, melting furnace, smelting, casting, lamination, finished products), warehousing and stock. Identified the parameters that allowed production throughput to be increased by 2-8% (i.e. €50-400M).

Used: heterogeneous data aggregation, exploratory analysis, custom optimization, linear optimizers, Monte Carlo simulations, simple linear models and clustering techniques.

Tech: sklearn, pandas, pyspark, azure, tableau
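
To illustrate how a Monte Carlo simulation can point at the stage that limits throughput in a serial production chain, here is a minimal sketch. It is not the model used in the project: the stage names, the capacity figures and the normal-distribution assumption are all hypothetical, and the chain is simplified so that each day's output equals the capacity of its slowest stage.

```python
import random

# Hypothetical stage capacities in tonnes/day as (mean, standard deviation).
# Illustrative numbers only, not figures from the project.
STAGES = {
    "port": (1200.0, 150.0),
    "rail": (1100.0, 100.0),
    "furnace": (1000.0, 200.0),
    "casting": (1050.0, 80.0),
}


def simulate_day(rng, stages):
    """Draw one day's capacity per stage; the chain delivers the minimum."""
    draws = {
        name: max(0.0, rng.gauss(mean, sd))
        for name, (mean, sd) in stages.items()
    }
    bottleneck = min(draws, key=draws.get)
    return draws[bottleneck], bottleneck


def monte_carlo(stages, n_days=10_000, seed=0):
    """Estimate mean throughput and how often each stage is the bottleneck."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    total = 0.0
    counts = {name: 0 for name in stages}
    for _ in range(n_days):
        throughput, bottleneck = simulate_day(rng, stages)
        total += throughput
        counts[bottleneck] += 1
    shares = {name: count / n_days for name, count in counts.items()}
    return total / n_days, shares


mean_throughput, bottleneck_share = monte_carlo(STAGES)
print(f"mean throughput: {mean_throughput:.0f} t/day")
for name, share in sorted(bottleneck_share.items(), key=lambda kv: -kv[1]):
    print(f"{name}: bottleneck on {share:.1%} of simulated days")
```

Stages that come out as the bottleneck most often are natural candidates for the parameters worth tuning first. A realistic model would also need buffers between stages and measured capacity distributions.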

Classification of company invoices with NLP and a Human-in-the-Loop approach

Implemented a classifier that assigns company invoices to one of 500+ accounting accounts, using NLP on the invoice content.
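
One common way to put a human in the loop is to accept only confident predictions automatically and queue the rest for review. The sketch below assumes this confidence-threshold scheme. The `route` helper, the `Decision` record, the 0.8 threshold, the account names and the probabilities are all hypothetical, and the classifier that would produce the probabilities is not shown.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    invoice_id: str
    account: str        # predicted accounting account
    confidence: float   # probability of the top prediction
    needs_review: bool  # True when a human should confirm the account


def route(invoice_id, probabilities, threshold=0.8):
    """Pick the most likely account; flag the invoice if the model is unsure."""
    account, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return Decision(invoice_id, account, confidence, confidence < threshold)


# Hypothetical classifier outputs: invoice id -> {account: probability}.
predictions = {
    "inv-001": {"6100 Office supplies": 0.93, "6200 Travel": 0.05, "6300 IT": 0.02},
    "inv-002": {"6100 Office supplies": 0.41, "6200 Travel": 0.38, "6300 IT": 0.21},
}

decisions = [route(inv, probs) for inv, probs in predictions.items()]
review_queue = [d for d in decisions if d.needs_review]

for d in decisions:
    status = "human review" if d.needs_review else "auto-assigned"
    print(f"{d.invoice_id}: {d.account} ({d.confidence:.2f}) -> {status}")
```

In such a setup the accounts confirmed or corrected by reviewers can be fed back as new training examples.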

Developed a Directed Acyclic Graph (DAG) scheduler for the tasks, which enabled on-the-fly debugging of the training and inference processes. Deployed at companies with more than 1 million invoices per year.

Tech: NLP / Airflow / Python
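
The idea behind a graph-based task scheduler can be sketched with the standard library alone (Python 3.9+ for `graphlib`). This is not the scheduler built for the project and not the Airflow API: `run_dag`, the task names and the toy "train" step are hypothetical. The sketch runs each task after its dependencies and, on the first failure, returns everything computed so far so that the intermediate results can be inspected.

```python
from graphlib import TopologicalSorter


def run_dag(tasks, dependencies):
    """Run every task after its dependencies; return (results, failed_task)."""
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        # Each task receives only the outputs of its direct dependencies.
        inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
        try:
            results[name] = tasks[name](inputs)
        except Exception as exc:
            # Stop at the first failure and keep the partial results.
            results[name] = exc
            return results, name
    return results, None


# Hypothetical pipeline: task name -> names of the tasks it depends on.
dependencies = {
    "load": set(),
    "tokenize": {"load"},
    "train": {"tokenize"},
    "predict": {"train", "tokenize"},
}

# Toy tasks; "train" only builds a vocabulary, it is a placeholder for a model.
tasks = {
    "load": lambda _: ["Invoice for paper", "Train ticket to Paris"],
    "tokenize": lambda d: [text.lower().split() for text in d["load"]],
    "train": lambda d: {"vocabulary": sorted({w for doc in d["tokenize"] for w in doc})},
    "predict": lambda d: [
        sum(w in d["train"]["vocabulary"] for w in doc) for doc in d["tokenize"]
    ],
}

results, failed = run_dag(tasks, dependencies)
print("failed task:", failed)
print("predict output:", results.get("predict"))
```

Because every intermediate output stays in `results`, a failing step can be examined together with the exact inputs it received.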

Exploration of Glassdoor Comments on a company

Solution developed for HR departments to extract the main trends expressed by potential employees on the Glassdoor profiles of the company (and of its competitors).

Tech: sklearn / NLP

Anomaly Detection

Rule-based system that raises alerts on user transactions breaching certain policies, built for one of the top 10 banks globally. Large data volumes and production-environment constraints. Sensitive-entity matching done with ML.

Alert scoring system – gives a confidence score that an alert will escalate into a breach.

User behavior – predicts future user behavior from past activity.

Outlier detection – various unsupervised and semi-supervised techniques.

Tech: spark, hadoop, sklearn
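
As one simple example of an unsupervised outlier check, the sketch below flags values with a large modified z-score based on the median and the median absolute deviation (MAD). It is an illustration only, not necessarily one of the techniques used in the project. The `mad_outliers` helper and the transaction amounts are hypothetical, and 3.5 is a commonly used cutoff rather than a tuned value.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread: the score is undefined, so flag nothing
    # 0.6745 rescales the MAD so that the score is comparable to a z-score
    # when the data is normally distributed.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical transaction amounts; only the value at index 5 stands out.
amounts = [120.0, 95.5, 130.0, 110.0, 105.0, 9800.0, 115.0, 99.0]
print(mad_outliers(amounts))
```

The median and MAD are barely affected by the extreme values themselves, which is why this score is often preferred over a mean and standard deviation for a first pass.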

Predictive Marketing Solution

AI solution to predict the next best action a company can take for each of its customers.

ToDo – add link to the demo app
