# Federated PCA with Adaptive Rank Estimation

```bibtex
@article{Grammenos2019FederatedPW,
  title   = {Federated PCA with Adaptive Rank Estimation},
  author  = {Andreas Grammenos and Rodrigo Mendoza-Smith and Cecilia Mascolo and Jon Crowcroft},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1907.08059}
}
```

In many online machine learning and data science tasks such as data summarisation and feature compression, d-dimensional vectors are usually distributed across a large number of clients in a decentralised network and collected in a streaming fashion. This is increasingly common in modern applications due to the sheer volume of data generated and the clients’ constrained resources. In this setting, some clients are required to compute an update to a centralised target model independently using…
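As a rough illustration of the setting, the following numpy sketch shows clients computing local principal subspaces that a coordinator merges by concatenation and re-factorisation. This is illustrative only: `local_pca` and `merge` are hypothetical names, and the paper's actual algorithm additionally estimates the rank adaptively and operates on streams under memory limits.

```python
import numpy as np

def local_pca(X, r):
    """Client-side: top-r left singular subspace of a local d x n data block."""
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :r], S[:r]

def merge(subspaces, r):
    """Server-side: merge client subspaces by stacking the singular-value-weighted
    bases and re-computing a rank-r SVD of the concatenation."""
    stacked = np.hstack([U * S for U, S in subspaces])
    U, S, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :r], S[:r]

rng = np.random.default_rng(0)
d, r = 20, 3
# Shared low-rank structure plus small noise, split across 4 clients.
basis = rng.standard_normal((d, r))
clients = [basis @ rng.standard_normal((r, 50)) + 0.01 * rng.standard_normal((d, 50))
           for _ in range(4)]
U_global, _ = merge([local_pca(X, r) for X in clients], r)
# The merged subspace should capture the shared basis almost exactly.
proj = U_global @ U_global.T
err = np.linalg.norm(basis - proj @ basis) / np.linalg.norm(basis)
```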


#### 7 Citations

Fast and Sample-Efficient Federated Low Rank Matrix Recovery from Column-wise Linear and Quadratic Projections

- Computer Science, Mathematics
- ArXiv
- 2021

A novel gradient-descent-based solution, GD-min, is introduced that needs only Ω((n + q)r log(1/ε)) samples and O(mqnr log(1/ε)) time to obtain an ε-accurate estimate; this is the best achievable sample-complexity guarantee for a non-convex solution to the above problem.

Federated Over-the-Air Subspace Learning from Incomplete Data

- Computer Science, Mathematics
- ArXiv
- 2020

This work develops a federated over-the-air version of the power method (FedPM) and shows that its iterates converge as long as the channel noise is very small compared to the singular values of the matrix, and the ratio between its $(r+1)$-th and $r$-th singular values is smaller than a constant less than one.
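The power-method iteration described above can be sketched as a toy simulation. The function name and the simple additive-noise channel model are assumptions for illustration, not FedPM's exact protocol:

```python
import numpy as np

def fed_power_method(blocks, r, iters=50, noise_std=0.0, rng=None):
    """Distributed power method on row-partitioned data: the server aggregates
    the local products A_k^T (A_k Q) (a sum an analog channel can compute
    "over the air"), adds channel noise, and re-orthonormalizes."""
    rng = rng or np.random.default_rng(0)
    d = blocks[0].shape[1]
    Q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    for _ in range(iters):
        S = sum(A.T @ (A @ Q) for A in blocks)            # aggregated over nodes
        S = S + noise_std * rng.standard_normal(S.shape)  # channel noise
        Q, _ = np.linalg.qr(S)
    return Q

rng = np.random.default_rng(1)
# Plant a spectrum with a clear gap between the 2nd and 3rd singular values.
U0, _ = np.linalg.qr(rng.standard_normal((200, 30)))
V0, _ = np.linalg.qr(rng.standard_normal((30, 30)))
A = U0 @ np.diag([10.0, 8.0] + [1.0] * 28) @ V0.T
blocks = np.array_split(A, 4)   # 4 nodes, each holding 50 rows
Q = fed_power_method(blocks, r=2, noise_std=1e-6, rng=rng)
# Q should span the top-2 right singular subspace despite the small noise.
V = V0[:, :2]
err = np.linalg.norm(V - Q @ (Q.T @ V))
```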

Pronto: Federated Task Scheduling

- Computer Science
- ArXiv
- 2021

This work presents a federated, asynchronous, memory-limited algorithm for online task scheduling across large-scale networks of hundreds of workers. The algorithm predicts changes in system responsiveness ahead of time based on the industry-standard CPU Ready metric, which can lead to better scheduling decisions and overall utilization of the available resources.

Privacy Preserving PCA for Multiparty Modeling

- Computer Science
- ArXiv
- 2020

A general multiparty modeling paradigm with Privacy Preserving Principal Component Analysis (PPPCA) is proposed for horizontally partitioned data; it is shown that the accuracy of a model built upon PPPCA matches that of a model with PCA built on centralized plaintext data.

Turning Channel Noise into an Accelerator for Over-the-Air Principal Component Analysis

- Computer Science, Engineering
- 2021 IEEE 22nd International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)
- 2021

The novelty of this design lies in exploiting channel noise to accelerate the descent in the region around each saddle point encountered by gradient descent, thereby increasing the convergence speed of over-the-air PCA.

One-shot Distributed Algorithm for Generalized Eigenvalue Problem

- Computer Science
- ArXiv
- 2020

A general distributed framework with one-shot communication is proposed for the generalized eigenvalue problem (GEP), with a modified eigenvalue-decomposition method for the case where the symmetric data covariance has repeated eigenvalues; the effect of the number of local servers is also analyzed.

Federated Over-Air Subspace Tracking from Incomplete and Corrupted Data

- Computer Science, Mathematics
- 2020

This work provides a new simple algorithm and guarantee for both subspace tracking with missing data (ST-miss) and RST-miss, and extends the approach and its analysis to provably solving these problems when the raw data is federated and the over-the-air data communication modality is used for information exchange between the K peer nodes and the center.

#### References

Showing 1–10 of 33 references.

MOSES: A Streaming Algorithm for Linear Dimensionality Reduction

- Mathematics, Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2020

This paper introduces the Memory-limited Online Subspace Estimation Scheme (MOSES) for both estimating the principal components of streaming data and reducing its dimension, and finds that MOSES consistently surpasses the state of the art in the authors' numerical experiments with both synthetic and real-world datasets, while being computationally inexpensive.

Federated Learning: Strategies for Improving Communication Efficiency

- Computer Science
- ArXiv
- 2016

Two ways to reduce the uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables, e.g. either low-rank or a random mask; and sketched updates, which learn a full model update and then compress it using a combination of quantization, random rotations, and subsampling.
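A minimal sketch of the "random mask" flavour of these compressed updates (function names are illustrative; the paper combines subsampling with quantization and random rotations):

```python
import numpy as np

def sketch_update(delta, keep_frac, rng):
    """Random-mask sketch of a model update: transmit only a random subset of
    coordinates, rescaled by 1/keep_frac so that reconstruction is unbiased."""
    k = max(1, int(keep_frac * delta.size))
    idx = rng.choice(delta.size, size=k, replace=False)
    return idx, delta[idx] / keep_frac

def unsketch(idx, vals, n):
    """Server-side reconstruction: zeros everywhere except the kept entries."""
    out = np.zeros(n)
    out[idx] = vals
    return out

rng = np.random.default_rng(0)
delta = rng.standard_normal(100)
# Averaging many independent sketches recovers the update (unbiasedness),
# which is what makes the compression compatible with model averaging.
avg = np.mean([unsketch(*sketch_update(delta, 0.1, rng), 100)
               for _ in range(4000)], axis=0)
err = np.linalg.norm(avg - delta) / np.linalg.norm(delta)
```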

Streaming PCA with Many Missing Entries

- 2014

We consider the streaming memory-constrained principal component analysis (PCA) problem with missing entries, where the available storage is linear in the dimensionality of the problem, and each…

Memory Limited, Streaming PCA

- Mathematics, Computer Science
- NIPS
- 2013

An algorithm is presented that uses O(kp) memory and is able to compute the k-dimensional spike with O(p log p) sample complexity; it is the first algorithm of its kind.

Communication-Efficient Learning of Deep Networks from Decentralized Data

- Computer Science
- AISTATS
- 2017

This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
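Iterative model averaging can be sketched on a toy least-squares task. This is a hypothetical FedAvg-style loop, not the paper's exact procedure: each client runs a few local gradient steps, and the server averages the results weighted by client data size.

```python
import numpy as np

def fedavg_round(w, clients, lr=0.1, local_steps=5):
    """One round of iterative model averaging for least squares."""
    updated, sizes = [], []
    for X, y in clients:
        wk = w.copy()
        for _ in range(local_steps):
            grad = X.T @ (X @ wk - y) / len(y)  # local least-squares gradient
            wk -= lr * grad
        updated.append(wk)
        sizes.append(len(y))
    # Weighted average of the locally updated models.
    return np.average(updated, axis=0, weights=np.array(sizes, float))

rng = np.random.default_rng(0)
w_true = rng.standard_normal(5)
clients = []
for _ in range(3):
    X = rng.standard_normal((40, 5))
    clients.append((X, X @ w_true))   # noiseless, shared optimum
w = np.zeros(5)
for _ in range(200):
    w = fedavg_round(w, clients)
err = np.linalg.norm(w - w_true)
```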

Relative Errors for Deterministic Low-Rank Matrix Approximations

- Computer Science, Mathematics
- SODA
- 2014

It is shown that Frequent Directions cannot be adapted to a sparse version in an obvious way that retains the ℓ original rows of the matrix, as opposed to a linear combination or sketch of the rows.
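A simplified single-buffer variant of Frequent Directions is sketched below. The paper analyses tighter relative-error bounds; this sketch only illustrates the shrink-and-insert mechanics and the basic 2‖A‖F²/ℓ covariance-error guarantee of this variant.

```python
import numpy as np

def frequent_directions(rows, ell, d):
    """Simplified Frequent Directions: maintain an ell x d sketch B of a row
    stream; when B fills up, shrink all singular values by the median one,
    which zeroes out at least half of the rows and makes room for new data."""
    B = np.zeros((ell, d))
    for row in rows:
        zero = np.where(~B.any(axis=1))[0]
        if len(zero) == 0:
            U, S, Vt = np.linalg.svd(B, full_matrices=False)
            delta = S[ell // 2] ** 2
            S = np.sqrt(np.maximum(S ** 2 - delta, 0.0))
            B = S[:, None] * Vt
            zero = np.where(~B.any(axis=1))[0]
        B[zero[0]] = row
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 20))
ell = 8
B = frequent_directions(A, ell, 20)
# Deterministic guarantee of this variant:
#   ||A^T A - B^T B||_2 <= 2 ||A||_F^2 / ell
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
bound = 2 * np.linalg.norm(A, 'fro') ** 2 / ell
```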

COLA: Decentralized Linear Learning

- Computer Science, Mathematics
- NeurIPS
- 2018

This work proposes COLA, a new decentralized training algorithm with strong theoretical guarantees and superior practical performance, that achieves communication efficiency, scalability, and elasticity, as well as resilience to changes in data, and allows for unreliable and heterogeneous participating devices.

Federated Multi-Task Learning

- Computer Science, Mathematics
- NIPS
- 2017

This work shows that multi-task learning is naturally suited to handle the statistical challenges of this setting, and proposes a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues.

Improved Practical Matrix Sketching with Guarantees

- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2016

This paper attempts to categorize and compare the best-known methods under row-wise streaming updates with provable guarantees, and then to tweak some of these methods to gain practical improvements while retaining guarantees.

A Distributed and Incremental SVD Algorithm for Agglomerative Data Analysis on Large Networks

- Computer Science, Mathematics
- SIAM J. Matrix Anal. Appl.
- 2016

It is shown that the SVD of a matrix can be constructed efficiently in a hierarchical approach, and that the hierarchical algorithm can be used to recover the largest singular values and left singular vectors with bounded error.
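The hierarchical merge step can be illustrated for two column blocks. This is a minimal sketch under the assumption of exact low rank; the paper's algorithm handles deeper merge trees and bounds the truncation error.

```python
import numpy as np

def truncated_svd(X, r):
    """Rank-r left factor (U, S) of a data block."""
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :r], S[:r]

def merge_svd(factor_a, factor_b, r):
    """Merge two rank-r factors of column blocks into a rank-r factor of the
    concatenation, as in hierarchical/incremental SVD: stack the weighted
    bases and re-factorize."""
    Ua, Sa = factor_a
    Ub, Sb = factor_b
    stacked = np.hstack([Ua * Sa, Ub * Sb])
    return truncated_svd(stacked, r)

rng = np.random.default_rng(0)
# Exactly rank-3 matrix, split into two column blocks across "nodes".
L = rng.standard_normal((30, 3))
R = rng.standard_normal((3, 80))
A = L @ R
U, S = merge_svd(truncated_svd(A[:, :40], 3), truncated_svd(A[:, 40:], 3), 3)
# For exact rank, the merged singular values match the direct SVD.
S_direct = np.linalg.svd(A, compute_uv=False)[:3]
err = np.max(np.abs(S - S_direct))
```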