April 2021
Since September 2020 I’ve been working as a Data Scientist at Axians, an IT consultancy specialised in the Government, Finance, and Energy sectors.
As part of the Intelligent Customer Services Unit, I develop models that facilitate the relations between companies or governments and their clients, users, or citizens.
These models include a multi-class, semi-hierarchical Text Classification algorithm with over 600 classes. It reduces errors in responding to and forwarding enquiries from Portuguese citizens to the eBalcao system and speeds up response times, saving thousands of hours of work caused by errors and making the whole process more agile and more pleasant for everyone involved.
For the same project, I’ve developed a semi-agnostic Automatic Keyword Extraction algorithm inspired by Google’s PageRank. It is semi-agnostic because there was no labelled dataset to learn from, so it had to be completely independent of the content it would encounter. As long as the text is in Portuguese, this TextRank algorithm uses graph theory and Part-of-Speech (POS) tagging to identify the most relevant words within any text.
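To give a rough idea of how a TextRank-style extractor works, here is a minimal sketch in plain Python. It is not the production code: it assumes the text has already been tokenised and POS-tagged by some external tagger, and the function name `textrank_keywords`, the tag set `KEEP_TAGS`, and all parameter defaults are hypothetical choices made for illustration. As a simplification, words are filtered by POS tag before the co-occurrence window is applied.

```python
from collections import defaultdict

# Assumed tag set: only nouns, proper nouns and adjectives are keyword candidates.
KEEP_TAGS = {"NOUN", "PROPN", "ADJ"}


def textrank_keywords(tagged_tokens, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank candidate words from a list of (word, pos_tag) pairs."""
    # 1. Keep only the candidate words, in their original order.
    words = [word.lower() for word, tag in tagged_tokens if tag in KEEP_TAGS]

    # 2. Build an undirected co-occurrence graph: two candidates are linked
    #    when they appear within `window` positions of each other.
    neighbours = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    if not neighbours:
        return []

    # 3. PageRank-style power iteration: each word shares its score
    #    equally among its neighbours.
    scores = {word: 1.0 for word in neighbours}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbours[n]) for n in neighbours[word])
            for word in neighbours
        }

    # 4. Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return ranked[:top_k]
```

Because the ranking depends only on which words co-occur, and not on labelled examples, a sketch like this needs nothing beyond a POS tagger for the target language.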
I’ve also picked up a former colleague’s project on Intelligent Document Automation, refactoring the code, adapting the OCR to also recognise handwritten text, and creating the algorithm that calculates the distance between the entities within the documents. The aim of this project was to automatically read digitised cheques, invoices, receipts, and bank transfer slips, and to identify certain entities (such as the issuer of a cheque, or the company that issued a receipt) and their positions on the document.
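As an illustration of what a distance between entities can look like, here is a small Python sketch. The distance measure is an assumption: it supposes that the OCR step returns an axis-aligned bounding box in pixel coordinates for each entity, and it uses the Euclidean distance between box centres. The names `Entity` and `entity_distance` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Entity:
    """A detected entity and its bounding box (assumed to be in pixels)."""
    label: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def centre(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def entity_distance(a, b):
    """Euclidean distance between the centres of two bounding boxes."""
    (ax, ay), (bx, by) = a.centre, b.centre
    return hypot(bx - ax, by - ay)
```

A measure like this could be used, for example, to pair a field label with the nearest value on a cheque or receipt.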
I’ve also been mentoring a Data Scientist trainee, and I’ve developed the syllabi for two internal courses: an Introduction to Machine Learning, to be taught to new hires, and Fundamentals of Natural Language Processing.