COMPLEX SOLUTION

Federated learning


Confidential Computing Method for ML Training


Project Description

Step 1

Analytics

  • Project documentation
    (example: User Guide)
  • Testing VFL/HFL frameworks
  • R&D of open-source solutions:
    FATE, Flower

Step 2

Development

Corporate website
for the company

  • Testing VFL/HFL frameworks
  • Assistance in creating a training pipeline (DS)

Step 3

Promotion

  • Marketing materials
    to promote a product
  • Presentations for conferences
  • Communication strategy
    and branding

Description of technology

General description

Federated learning is an approach to machine learning in which a model is trained on distributed data that stays with its owners and is never transferred to a centralized repository. Participants interact only by exchanging model parameters or aggregated statistics, which keeps the raw data confidential.
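As an illustration of exchanging model parameters instead of data, the aggregation step can be sketched as a weighted average of the participants' parameter vectors (the FedAvg idea). This is a minimal sketch, not the aggregation code of FATE or Flower; the function name `federated_average` and the toy numbers are assumptions made for the example.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical participants: only these parameter vectors and the
# sample counts leave their premises, never the training records.
weights = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
global_weights = federated_average(weights, sizes)
print(global_weights)
```

The participant with more records pulls the global parameters closer to its own, which is the usual motivation for weighting by sample count.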

Horizontal Federated Learning (HFL) is used when participants have the same features but different objects (records). Vertical Federated Learning (VFL) is used when participants have different features for the same objects.
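The difference between the two partitionings can be shown on a toy table. The sketch below is illustrative only: the record IDs, feature names and helper functions are hypothetical and not taken from the project.

```python
# Toy table: each key is an object (record) ID, each inner key a feature.
rows = {
    "u1": {"age": 34, "income": 52, "clicks": 7},
    "u2": {"age": 41, "income": 48, "clicks": 2},
    "u3": {"age": 29, "income": 61, "clicks": 9},
    "u4": {"age": 55, "income": 39, "clicks": 4},
}


def horizontal_split(table, ids_a):
    """HFL: both parties hold the same features but different records."""
    part_a = {k: v for k, v in table.items() if k in ids_a}
    part_b = {k: v for k, v in table.items() if k not in ids_a}
    return part_a, part_b


def vertical_split(table, features_a):
    """VFL: both parties hold the same records but different features."""
    part_a = {k: {f: x for f, x in v.items() if f in features_a}
              for k, v in table.items()}
    part_b = {k: {f: x for f, x in v.items() if f not in features_a}
              for k, v in table.items()}
    return part_a, part_b


hfl_a, hfl_b = horizontal_split(rows, {"u1", "u2"})
vfl_a, vfl_b = vertical_split(rows, {"age", "income"})

print(sorted(hfl_a), sorted(hfl_b))  # disjoint record IDs, same columns
print(vfl_a["u1"], vfl_b["u1"])      # same record, disjoint columns
```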

An important component of such systems is PSI (Private Set Intersection), a cryptographic protocol used to securely determine the intersection of sets of identifiers between parties without disclosing the sets themselves. PSI is used in particular in VFL scenarios.
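One common way to build PSI is a Diffie-Hellman-style construction: each party blinds its hashed identifiers with a secret exponent, and only doubly blinded values are compared. The sketch below runs both parties in one process to show the idea. It is a toy under stated assumptions: the modulus `P`, the helper names and the sample identifiers are chosen for illustration, the parameters are not suitable for real security, and this is not the PSI implementation used by FATE.

```python
import hashlib
import math
import secrets

# Toy modulus (a Mersenne prime). Assumption for illustration only:
# a real deployment would use a vetted group, e.g. an elliptic curve.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def random_exponent() -> int:
    """Secret exponent coprime with P - 1, so blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 2) + 1
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, key):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, key, P) for v in values]


def psi(ids_a, ids_b):
    """Return the IDs of party A that party B also holds."""
    key_a, key_b = random_exponent(), random_exponent()
    # A -> B: H(x)^a, in A's own order.
    a_once = blind([hash_to_group(x) for x in ids_a], key_a)
    # B -> A: H(y)^b (B would shuffle these in practice), plus A's values
    # blinded a second time, H(x)^(ab), returned in the order received.
    b_once = blind([hash_to_group(y) for y in ids_b], key_b)
    a_twice = blind(a_once, key_b)
    # A blinds B's values with its own key and compares H(y)^(ba) to H(x)^(ab).
    b_twice = set(blind(b_once, key_a))
    return {x for x, t in zip(ids_a, a_twice) if t in b_twice}


shared = psi(["alice", "bob", "carol"], ["bob", "carol", "dave"])
print(sorted(shared))
```

Because exponentiation commutes, identical identifiers yield identical doubly blinded values, while neither party ever sees the other's raw or singly hashed identifiers.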

Tag cloud
Horizontal (HFL), Vertical (VFL), Federated Learning, FATE, Flower, Homomorphic Encryption (HE), XGBoost, Paillier, Python, Docker, gRPC, APIs, Machine Learning, Data Privacy, Distributed Computing, Cryptography, ML Security
Current links

Project design

Design of presentations, handouts and websites

FL Case

My role in the project

SA analytics, technology testing, and analysis of how the model works (DS)

Development of SA documentation and a User Guide for a federated learning project

The documentation covered all key aspects of federated learning: vertical (VFL) and horizontal (HFL) learning scenarios, including the use of PSI (Private Set Intersection) to securely match identifiers between participants. The descriptions followed the logic of the FATE and Flower frameworks.

Testing VFL, HFL and PSI in local implementation

The experiments involved deploying local instances of the frameworks to simulate federated learning in a controlled environment. For VFL, the correctness of joint training with features distributed between participants was tested, including the use of PSI (Private Set Intersection) to match objects by identifier without disclosing data. For HFL, training was tested on different sets of objects with the same features.

During testing, model quality metrics (accuracy, AUC, F1), the amount of transferred data, and the impact of network configurations on performance were analyzed.
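For reference, the three quality metrics named above can be computed for a binary classifier as in the following sketch. It is a plain-Python illustration on made-up labels and scores, not the evaluation code or the results of the project; the 0.5 decision threshold is an assumption.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and model scores for six objects.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

Comparing these values between a federated run and a centralized baseline on the same data is one way to check that joint training has not degraded the model.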

Research and evaluation of open-source solutions for R&D, including analysis of architecture and functionality

As part of the work, a comparative analysis of frameworks was carried out. For FATE, the analysis included built-in PSI mechanisms, support for cryptographic protocols, and ready-made components for VFL, which makes the framework more focused on production use. For Flower, the possibilities of flexible custom implementation of PSI and federated protocols were considered, which increases its value for R&D and experimental scenarios, but requires additional improvements for industrial use. As a result, the strengths, limitations and areas of optimal application of each framework were recorded.

Participation in the formation of the model training pipeline, including Feature Selection

I analyzed several options for constructing a model training pipeline in a federated architecture as part of the support of the Data Science team. In particular, I worked on approaches to selecting features both on the side of the active participant and on the side of the data partner without disclosing the names of the features and their domain groups. Various options for data processing were considered, including local assessment of feature importance, the use of aggregated and encrypted statistics, and coordination of Feature Selection stages between parties. The results of the analysis were used to select the optimal model training pipeline, taking into account the requirements for privacy and compatibility with VFL scenarios.
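One of the options mentioned above, local assessment of feature importance without disclosing feature names, could look roughly like the sketch below. Everything in it is an assumption made for illustration: the data partner scores its features locally (variance is used here only as a label-free placeholder for an importance measure), publishes the scores under salted opaque IDs, and maps the active participant's selection back to real names on its own side.

```python
import hashlib
import secrets
from statistics import pvariance


def opaque_id(name: str, salt: bytes) -> str:
    """Salted hash that hides the feature name from the other party."""
    return hashlib.sha256(salt + name.encode("utf-8")).hexdigest()[:12]


def partner_report(columns, salt):
    """Partner side: score features locally, publish scores under opaque IDs."""
    return {opaque_id(name, salt): pvariance(values)
            for name, values in columns.items()}


def select_top_k(report, k):
    """Active side: sees only opaque IDs and scores, picks the k best."""
    return sorted(report, key=report.get, reverse=True)[:k]


def resolve(selected_ids, columns, salt):
    """Partner side: map selected opaque IDs back to local feature names."""
    lookup = {opaque_id(name, salt): name for name in columns}
    return [lookup[i] for i in selected_ids]


# Hypothetical partner data; names and values are invented.
partner_columns = {
    "f_income": [10, 50, 90, 30],
    "f_region": [1, 1, 2, 1],
    "f_tenure": [5, 6, 5, 7],
}
salt = secrets.token_bytes(16)  # stays with the partner

report = partner_report(partner_columns, salt)
chosen_ids = select_top_k(report, k=2)
chosen_names = resolve(chosen_ids, partner_columns, salt)
print(chosen_names)
```

The active participant never learns which business attributes stand behind the selected IDs. In a VFL setting a label-aware importance measure would require additional protected exchange (for example encrypted statistics), which this sketch does not cover.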

Rebranding and development of the company website

I implemented the site from scratch, including developing a design concept and branding elements, creating a visual style and user logic. The site was developed using HTML, CSS and JavaScript, with an emphasis on responsive layout for correct display on various devices and screens.

The work was also aligned with legal requirements (including requirements for user data, mandatory sections and notifications) and with internal corporate standards. Additionally, SEO optimization was carried out: page structure, meta tags, semantics and basic performance indicators were worked through, which increased the site’s visibility in search engines.

The result of the work was a functioning corporate website of the company:

https://digi-track.ru/

Preparation of marketing materials and presentations for conferences

I developed a set of marketing materials to promote the product as part of the new branding. This included the preparation of motion videos for use at conferences, as well as a significant number of presentations for public speaking, demo sessions and negotiations. Additionally, handouts for conferences (brochures, infographics and printed materials) were developed, designed in a single visual style and corresponding to the updated brand concept.
