Michael Mahoney - Publications

2025

Anatomy of High-Performance Column-Pivoted QR Decomposition,

M. Melnichenko, R. Murray, W. Killian, J. Demmel, M. W. Mahoney, P. Luszczek, and M. Gates,

Technical Report, Preprint: arXiv:2507.00976 (2025) (arXiv).

Does Multimodality Lead to Better Time Series Forecasting?,

X. Zhang, B. Han, H. Fang, A. F. Ansari, S. Zhang, D. C. Maddix, C. Hu, A. G. Wilson, M. W. Mahoney, H. Wang, Y. Liu, H. Rangwala, G. Karypis, and B. Wang,

Technical Report, Preprint: arXiv:2506.21611 (2025) (arXiv).

Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models,

Z. Liao and M. W. Mahoney,

Technical Report, Preprint: arXiv:2506.13139 (2025) (arXiv).

Multipole Attention for Efficient Long Context Reasoning,

C. Hooper, S. Zhao, L. Manolache, S. Kim, M. W. Mahoney, Y. S. Shao, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2506.13059 (2025) (arXiv).

Spectral Estimation with Free Decompression,

S. Ameli, C. van der Heide, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2506.11994 (2025) (arXiv).

End-to-End Probabilistic Framework for Learning with Hard Constraints,

U. Utkarsh, D. C. Maddix, R. Ma, M. W. Mahoney, and Y. Wang,

Technical Report, Preprint: arXiv:2506.07003 (2025) (arXiv).

Models of Heavy-Tailed Mechanistic Universality,

L. Hodgkinson, Z. Wang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2506.03470 (2025) (arXiv),

Proc. of the 2025 ICML Conference (2025) (pdf).

FLEX: A Backbone for Diffusion-Based Modeling of Spatio-temporal Physical Systems,

N. B. Erichson, V. Mikuni, D. Lyu, Y. Gao, O. Azencot, S. H. Lim, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2505.17351 (2025) (arXiv).

Removing Watermarks with Partial Regeneration using Semantic Information,

K. Tallam, J. K. Cava, C. Geniesse, N. B. Erichson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2505.08234 (2025) (arXiv).

Paving the way for scientific foundation models: enhancing generalization and robustness in PDEs with constraint-aware pre-training,

A. Totounferoush, S. Kotchourko, M. W. Mahoney, and S. Staab,

Technical Report, Preprint: arXiv:2503.19081 (2025) (arXiv).

Determinant Estimation under Memory Constraints and Neural Scaling Laws,

S. Ameli, C. van der Heide, L. Hodgkinson, F. Roosta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2503.04424 (2025) (arXiv),

Proc. of the 2025 ICML Conference (2025) (pdf).

Fundamental Bias in Inverting Random Sampling Matrices with Application to Sub-sampled Newton,

C. Niu, Z. Liao, Z. Ling, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2502.13583 (2025) (arXiv),

Proc. of the 2025 ICML Conference (2025) (pdf).

ETS: Efficient Tree Search for Inference-Time Scaling,

C. Hooper, S. Kim, S. Moon, K. Dilmen, M. Maheswaran, N. Lee, M. W. Mahoney, S. Shao, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2502.13575 (2025) (arXiv).

MatterChat: A Multi-Modal LLM for Material Science,

Y. Tang, W. Xu, J. Cao, J. Ma, W. Gao, S. Farrell, B. Erichson, M. W. Mahoney, A. Nonaka, and Z. Yao,

Technical Report, Preprint: arXiv:2502.13107 (2025) (arXiv).

QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache,

R. Tiwari, H. Xi, A. Tomar, C. Hooper, S. Kim, M. Horton, M. Najibi, M. W. Mahoney, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2502.10424 (2025) (arXiv),

Proc. of the 2025 ICML Conference (2025) (pdf).

Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting,

K. Hegazy, M. W. Mahoney, and N. B. Erichson,

Technical Report, Preprint: arXiv:2502.06151 (2025) (arXiv).

Simulating seismic wavefields using generative artificial intelligence,

R. Nakata, N. Nakata, P. Ren, Z. Bi, M. Lacour, B. Erichson, and M. W. Mahoney,

The Leading Edge (TLE) 44(2): 123-132, 2025 (pdf).

Advancing data-driven broadband seismic wavefield simulation with multi-conditional diffusion model,

Z. Bi, N. Nakata, R. Nakata, P. Ren, X. Wu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2501.14348 (2025) (arXiv),

Accepted for publication, IEEE Transactions on Geoscience and Remote Sensing (TGRS) 000:000-000 (2025).

Neural equilibria for long-term prediction of nonlinear conservation laws,

J. A. L. Benitez, J. Guo, K. Hegazy, I. Dokmanic, M. W. Mahoney, and M. V. de Hoop,

Technical Report, Preprint: arXiv:2501.06933 (2025) (arXiv).

Using Pre-trained LLMs for Multivariate Time Series Forecasting,

M. L. Wolff, S. Yang, K. Torkkola, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2501.06386 (2025) (arXiv).

2024

A Statistical Framework for Ranking LLM-Based Chatbots,

S. Ameli, S. Zhuang, I. Stoica, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2412.18407 (2024) (arXiv),

Accepted for publication, Proc. of the 2025 ICLR Conference.

LossLens: Diagnostics for Machine Learning through Loss Landscape Visual Analytics,

T. Xie, J. Chen, Y. Yang, C. Geniesse, G. Shi, A. Chaudhari, J. K. Cava, M. W. Mahoney, T. Perciano, G. H. Weber, and R. Maciejewski,

Technical Report, Preprint: arXiv:2412.13321 (2024) (arXiv),

Accepted for publication, IEEE Computer Graphics and Applications 000:000-000 (2024).

Forecasting high-dimensional spatio-temporal systems from sparse measurements,

J. Song, Z. Song, P. Ren, N. B. Erichson, M. W. Mahoney, and X. S. Li,

Mach. Learn.: Sci. Technol. 5 045067 (2024) (pdf).

Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization,

L. Masserano, A. F. Ansari, B. Han, X. Zhang, C. Faloutsos, M. W. Mahoney, A. G. Wilson, Y. Park, S. Rangapuram, D. C. Maddix, and Y. Wang,

Technical Report, Preprint: arXiv:2412.05244 (2024) (arXiv),

Proc. of the 2025 ICML Conference (2025) (pdf).

LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data,

H. Zhang, C. Arvin, D. Efimov, M. W. Mahoney, D. Perrault-Joncas, S. Ramasubramanian, A. G. Wilson, and M. Wolff,

Technical Report, Preprint: arXiv:2412.02525 (2024) (arXiv),

Proc. of the NeurIPS 2024 Workshop on Time Series in the Age of Large Models (TSALM) (pdf).

Gradient-Free Generation for Hard-Constrained Systems,

C. Cheng, B. Han, D. C. Maddix, A. F. Ansari, A. Stuart, M. W. Mahoney, and Y. Wang,

Technical Report, Preprint: arXiv:2412.01786 (2024) (arXiv),

Accepted for publication, Proc. of the 2025 ICLR Conference.

Visualizing Loss Functions as Topological Landscape Profiles,

C. Geniesse, J. Chen, T. Xie, G. Shi, Y. Yang, D. Morozov, T. Perciano, M. W. Mahoney, R. Maciejewski, and G. H. Weber,

Technical Report, Preprint: arXiv:2411.12136 (2024) (arXiv),

Accepted for publication, Proc. of the NeurIPS 2024 Workshop on Symmetry and Geometry in Neural Representations (NeurReps).

Evaluating Loss Landscapes from a Topology Perspective,

T. Xie, C. Geniesse, J. Chen, Y. Yang, D. Morozov, M. W. Mahoney, R. Maciejewski, and G. H. Weber,

Technical Report, Preprint: arXiv:2411.09807 (2024) (arXiv),

Proc. of the NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning (SciForDL) (pdf).

Squeezed Attention: Accelerating Long Context Length LLM Inference,

C. Hooper, S. Kim, H. Mohammadzadeh, M. Maheswaran, J. Paik, M. W. Mahoney, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2411.09688 (2024) (arXiv),

Accepted for publication, Proc. of the 63rd ACL Meeting 000-000 (2025).

SPADE: Split Peak Attention DEcomposition,

M. Wolff, K. G. Olivares, B. Oreshkin, S. Ruan, S. Yang, A. Katoch, S. Ramasubramanian, Y. Zhang, M. W. Mahoney, D. Efimov, and V. Quenneville-Belair,

Technical Report, Preprint: arXiv:2411.05852 (2024) (arXiv),

Proc. of the NeurIPS 2024 Workshop on Time Series in the Age of Large Models (TSALM) (pdf).

How many classifiers do we need?,

H. Kim, L. Hodgkinson, R. Theisen, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2411.00328 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 NeurIPS Conference.

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models,

H. Lu, Y. Zhou, S. Liu, Z. Wang, M. W. Mahoney, and Y. Yang,

Technical Report, Preprint: arXiv:2410.10912 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 NeurIPS Conference.

Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting,

S. H. Lim, Y. Wang, A. Yu, E. Hart, M. W. Mahoney, X. S. Li, and N. B. Erichson,

Technical Report, Preprint: arXiv:2410.03229 (2024) (arXiv),

Accepted for publication, Transactions on Machine Learning Research.

Mitigating Memorization In Language Models,

M. Sakarvadia, A. Ajith, A. Khan, N. Hudson, C. Geniesse, K. Chard, Y. Yang, I. Foster, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2410.02159 (2024) (arXiv),

Accepted for publication, Proc. of the 2025 ICLR Conference.

Tuning Frequency Bias of State Space Models,

A. Yu, D. Lyu, S. H. Lim, M. W. Mahoney, and N. B. Erichson,

Technical Report, Preprint: arXiv:2410.02035 (2024) (arXiv),

Accepted for publication, Proc. of the 2025 ICLR Conference.

Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models,

Y. Fang, S. Na, M. W. Mahoney, and M. Kolar,

Technical Report, Preprint: arXiv:2409.15734 (2024) (arXiv).

Consensus Planning with Primal, Dual, and Proximal Agents,

A. Maggiar, L. Dicker, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2408.16462 (2024) (arXiv).

Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling,

P. Ren, R. Nakata, M. Lacour, I. Naiman, N. Nakata, J. Song, Z. Bi, O. A. Malik, D. Morozov, O. Azencot, N. B. Erichson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2407.15089 (2024) (arXiv).

Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics,

M. Karlbauer, D. C. Maddix, A. F. Ansari, B. Han, G. Gupta, Y. Wang, A. Stuart, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2407.14129 (2024) (arXiv),

Proc. ICLR 2024 Workshop on AI4Differential Equations In Science, at the 2024 ICLR Conference (pdf).

Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance,

H. Lu, X. Liu, Y. Zhou, Q. Li, K. Keutzer, M. W. Mahoney, Y. Yan, H. Yang, and Y. Yang,

Technical Report, Preprint: arXiv:2407.12996 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 NeurIPS Conference.

Reliable edge machine learning hardware for scientific applications,

T. Baldi, J. Campos, B. Hawks, J. Ngadiuba, N. Tran, D. Diaz, J. Duarte, R. Kastner, A. Meza, M. Quinnan, O. Weng, C. Geniesse, A. Gholami, M. W. Mahoney, V. Loncar, P. Harris, J. Agar, and S. Qin,

Technical Report, Preprint: arXiv:2406.19522 (2024) (arXiv),

Proc. of the 2024 IEEE 42nd VLSI Test Symposium (VTS) 1-5 (2024) (pdf).

Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning,

M. Derezinski and M. W. Mahoney,

Technical Report, Preprint: arXiv:2406.11151 (2024) (arXiv),

Accepted for publication, Proc. of the 30th Annual SIGKDD, 0000–0000 (2024).

Towards Scalable and Versatile Weight Space Learning,

K. Schurholt, M. W. Mahoney, and D. Borth,

Technical Report, Preprint: arXiv:2406.09997 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).

WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning,

D. Lyu, R. Nakata, P. Ren, M. W. Mahoney, A. Pitarka, N. Nakata, and N. B. Erichson,

Technical Report, Preprint: arXiv:2405.20516 (2024) (arXiv).

HOPE for a Robust Parameterization of Long-memory State Space Models,

A. Yu, M. W. Mahoney, and N. B. Erichson,

Technical Report, Preprint: arXiv:2405.13975 (2024) (arXiv),

Accepted for publication, Proc. of the 2025 ICLR Conference.

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement,

N. Lee, T. Wattanawong, S. Kim, K. Mangalam, S. Shen, G. Anumanchipalli, M. W. Mahoney, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2403.15042 (2024) (arXiv),

Accepted for publication, Proc. of the 62nd ACL Meeting 000-000 (2024).

AI and Memory Wall,

A. Gholami, Z. Yao, S. Kim, C. Hooper, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2403.14123 (2024) (arXiv),

RiseLab Medium Post 1, 6 (2021) (blog),

IEEE Micro 44:33-39 (2024).

Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs,

S. C. Mouli, D. C. Maddix, S. Alizadeh, G. Gupta, A. Stuart, M. W. Mahoney, and Y. Wang,

Technical Report, Preprint: arXiv:2403.10642 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).

Chronos: Learning the Language of Time Series,

A. F. Ansari, L. Stella, C. Turkmen, X. Zhang, P. Mercado, H. Shen, O. Shchur, S. S. Rangapuram, S. P. Arango, S. Kapoor, J. Zschiegner, D. C. Maddix, M. W. Mahoney, K. Torkkola, A. G. Wilson, M. Bohlke-Schneider, and Y. Wang,

Technical Report, Preprint: arXiv:2403.07815 (2024) (arXiv),

Transactions on Machine Learning Research (10/2024) (pdf).

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning,

W. Chen, J. Song, P. Ren, S. Subramanian, D. Morozov, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2402.15734 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 NeurIPS Conference.

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization,

C. Hooper, S. Kim, H. Mohammadzadeh, M. W. Mahoney, Y. S. Shao, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2401.18079 (2024) (arXiv),

Accepted for publication, Proc. of the 2024 NeurIPS Conference.

SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data,

A. Eshragh, L. Yerbury, A. Nazari, F. Roosta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2401.00122 (2024) (arXiv).

2023

Multi-scale Local Network Structure Critically Impacts Epidemic Spread and Interventions,

O. Eldaghar, M. W. Mahoney, and D. F. Gleich,

Technical Report, Preprint: arXiv:2312.17351 (2023) (arXiv).

An LLM Compiler for Parallel Function Calling,

S. Kim, S. Moon, R. Tabrizi, N. Lee, M. W. Mahoney, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2312.04511 (2023) (arXiv),

Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).

Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training,

Y. Zhou, T. Pang, K. Liu, C. H. Martin, M. W. Mahoney, and Y. Yang,

Technical Report, Preprint: arXiv:2312.00359 (2023) (arXiv),

Proc. of the 2023 NeurIPS Conference.

Rapid Fitting of Band-Excitation Piezoresponse Force Microscopy Using Physics Constrained Unsupervised Neural Networks,

A. T. Kaliyev, R. F. Forelli, S. Qin, Y. Guo, S. Memik, M. W. Mahoney, A. Gholami, N. Tran, P. Harris, M. Takac, and J. Agar,

Proc. of the AI4Mat Workshop at NeurIPS 2023 (pdf).

Does In-Context Operator Learning Generalize to Domain-Shifted Settings?,

J. W. Liu, N. B. Erichson, K. Bhatia, M. W. Mahoney, and C. Re,

Proc. of the DLDE Workshop at NeurIPS 2023 (pdf).

DMLR: Data-centric Machine Learning Research -- Past, Present and Future,

L. Oala, M. Maskey, L. Bat-Leah, A. Parrish, N. M. Gurel, T.-S. Kuo, Y. Liu, R. Dror, D. Brajovic, X. Yao, M. Bartolo, W. A. G. Rojas, R. Hileman, R. Aliment, M. W. Mahoney, M. Risdal, M. Lease, W. Samek, D. Dutta, C. G. Northcutt, C. Coleman, B. Hancock, B. Koch, G. A. Tadesse, B. Karlas, A. Alaa, A. B. Dieng, N. Noy, V. J. Reddi, J. Zou, P. Paritosh, M. van der Schaar, K. Bollacker, L. Aroyo, C. Zhang, J. Vanschoren, I. Guyon, and P. Mattson,

Technical Report, Preprint: arXiv:2311.13028 (2023) (arXiv),

Journal of Data-centric Machine Learning Research (2024).

CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT),

M. Melnichenko, O. Balabanov, R. Murray, J. Demmel, M. W. Mahoney, and P. Luszczek,

Technical Report, Preprint: arXiv:2311.08316 (2023) (arXiv),

SIAM J. Matrix Analysis and Applications, 46(3), 1701-1734 (2025).

A PAC-Bayesian Perspective on the Interpolating Information Criterion,

L. Hodgkinson, C. van der Heide, R. Salomone, F. Roosta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2311.07013 (2023) (arXiv),

Proc. of the Mathematics of Modern Machine Learning (M3L) Workshop at NeurIPS 2023.

Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels,

D. Long, W. W. Xing, A. S. Krishnapriyan, R. M. Kirby, S. Zhe, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2310.05387 (2023) (arXiv),

Proc. of the 27th International Conference on AISTATS, PMLR 238:2413-2421 (2024).

Extensions to the SENSEI In situ Framework for Heterogeneous Architectures,

B. Loring, E. W. Bethel, G. H. Weber, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2310.02926 (2023) (arXiv),

Proceedings of the SC23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 868–874 (2023) (pdf).

Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs,

I. Naiman, N. B. Erichson, P. Ren, M. W. Mahoney, and O. Azencot,

Technical Report, Preprint: arXiv:2310.02619 (2023) (arXiv),

Proc. of the 2024 ICLR Conference (pdf).

Robustifying State-space Models for Long Sequences via Approximate Diagonalization,

A. Yu, A. Nigmetov, D. Morozov, M. W. Mahoney, and N. B. Erichson,

Technical Report, Preprint: arXiv:2310.01698 (2023) (arXiv),

Proc. of the 2024 ICLR Conference (pdf).

Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems,

Y. Cho, J. W. Demmel, M. Derezinski, H. Li, H. Luo, M. W. Mahoney, and R. J. Murray,

Technical Report, Preprint: arXiv:2308.15720 (2023) (arXiv),

SIAM J. Matrix Analysis and Applications, 46(2), 1247-1279 (2025).

CLOVER: Probabilistic Forecasting with Coherent Learning Objective Reparameterization,

K. G. Olivares, G. Negiar, R. Ma, O. N. Meetei, M. Cao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2307.09797 (2023) (arXiv),

Transactions on Machine Learning Research (12/2024) (pdf).

The Interpolating Information Criterion for Overparameterized Models,

L. Hodgkinson, C. van der Heide, R. Salomone, F. Roosta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2307.07785 (2023) (arXiv).

GEANN: Scalable Graph Augmentations for Multi-Horizon Time Series Forecasting,

S. Yang, M. Wolff, S. Ramasubramanian, V. Quenneville-Belair, R. Metha, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2307.03595 (2023) (arXiv).

SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning,

P. Ren, N. B. Erichson, S. Subramanian, O. San, Z. Lukic, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2306.14070 (2023) (arXiv),

Accepted for publication, Journal of Data-centric Machine Learning Research (2025).

A Heavy-Tailed Algebra for Probabilistic Programming,

F. Liang, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2306.09262 (2023) (arXiv),

Accepted for publication, Proc. of the 2023 NeurIPS Conference.

SqueezeLLM: Dense-and-Sparse Quantization,

S. Kim, C. Hooper, A. Gholami, Z. Dong, X. Li, S. Shen, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2306.07629 (2023) (arXiv),

Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).

Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior,

S. Subramanian, P. Harrington, K. Keutzer, W. Bhimji, D. Morozov, M. W. Mahoney, and A. Gholami,

Technical Report, Preprint: arXiv:2306.00258 (2023) (arXiv),

Accepted for publication, Proc. of the 2023 NeurIPS Conference.

A Three-regime Model of Network Pruning,

Y. Zhou, Y. Yang, A. Chang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2305.18383 (2023) (arXiv),

Proc. of the 2023 ICML Conference 202:42790-42809 (2023) (pdf).

Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching,

I. Hong, S. Na, M. W. Mahoney, and M. Kolar,

Technical Report, Preprint: arXiv:2305.18379 (2023) (arXiv),

Proc. of the 2023 ICML Conference 202:13174-13198 (2023) (pdf).

When are ensembles really effective?,

R. Theisen, H. Kim, Y. Yang, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2305.12313 (2023) (arXiv),

Accepted for publication, Proc. of the 2023 NeurIPS Conference.

End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs,

J. Campos, Z. Dong, J. Duarte, A. Gholami, M. W. Mahoney, J. Mitrevski, and N. Tran,

Technical Report, Preprint: arXiv:2304.06745 (2023) (arXiv),

ACM Transactions on Reconfigurable Technology and Systems, 17, 3, Article 36 (2024) (pdf).

Full Stack Optimization of Transformer Inference: a Survey,

S. Kim, C. Hooper, T. Wattanawong, M. Kang, R. Yan, H. Genc, G. Dinh, Q. Huang, K. Keutzer, M. W. Mahoney, Y. S. Shao, and A. Gholami,

Technical Report, Preprint: arXiv:2302.14017 (2023) (arXiv),

Proc. of the ASSYST at ISCA 2023 / MLArchSys 2023 Workshop (pdf).

Learning Physical Models that Can Respect Conservation Laws,

D. Hansen, D. C. Maddix, S. Alizadeh, G. Gupta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2302.11002 (2023) (arXiv),

Proc. of the 2023 ICML Conference PMLR 202:12469-12510 (2023) (pdf),

Physica D: Nonlinear Phenomena, 457: 133952 (2024) (pdf).

Speculative Decoding with Big Little Decoder,

S. Kim, K. Mangalam, S. Moon, J. Canny, J. Malik, M. W. Mahoney, A. Gholami, and K. Keutzer,

Technical Report, Preprint: arXiv:2302.07863 (2023) (arXiv),

Proc. of the 2023 NeurIPS Conference 1705: 39236-39256 (2023).

2022

Gated Recurrent Neural Networks with Weighted Time-Delay Feedback,

N. B. Erichson, S. H. Lim, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2212.00228 (2022) (arXiv),

Accepted for publication, Proc. of the 28th International Conference on AISTATS, PMLR 000:0000-0000 (2025).

Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems,

Y. Fang, S. Na, M. W. Mahoney, and M. Kolar,

Technical Report, Preprint: arXiv:2211.15943 (2022) (arXiv),

SIAM J. Optimization, 34(2), 1187-2037 (2024) (pdf).

Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software,

R. Murray, J. Demmel, M. W. Mahoney, N. B. Erichson, M. Melnichenko, O. A. Malik, L. Grigori, P. Luszczek, M. Dereziński, M. E. Lopes, T. Liang, H. Luo, and J. Dongarra,

LAWNs (LAPACK Working Notes), UCB/EECS-2022-235 (2022) (pdf),

Technical Report, Preprint: arXiv:2302.11474 (2023) (arXiv).

Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes,

L. Hodgkinson, C. van der Heide, F. Roosta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2210.07612 (2022) (arXiv),

Proc. of the 2023 ICML Conference PMLR 202:13085-13117 (2023) (pdf).

Gradient Gating for Deep Multi-Rate Learning on Graphs,

T. K. Rusch, B. P. Chamberlain, M. W. Mahoney, M. M. Bronstein, and S. Mishra,

Technical Report, Preprint: arXiv:2210.00513 (2022) (arXiv),

Proc. of the 2023 ICLR Conference (pdf).

Learning differentiable solvers for systems with hard constraints,

G. Negiar, M. W. Mahoney, and A. S. Krishnapriyan,

Technical Report, Preprint: arXiv:2207.08675 (2022) (arXiv),

Proc. of the 2023 ICLR Conference (pdf).

Adaptive Self-supervision Algorithms for Physics-informed Neural Networks,

S. Subramanian, R. M. Kirby, M. W. Mahoney, and A. Gholami,

Technical Report, Preprint: arXiv:2207.04084 (2022) (arXiv),

Proc. of the ECAI-23 Conference 2234-2241 (2023) (pdf).

GACT: Activation Compressed Training for General Architectures,

X. Liu, L. Zheng, D. Wang, Y. Cen, W. Chen, X. Han, J. Chen, Z. Liu, J. Tang, J. Gonzalez, M. W. Mahoney, and A. Cheung,

Technical Report, Preprint: arXiv:2206.11357 (2022) (arXiv),

Proc. of the 2022 ICML Conference 162:14139-14152 (2022) (pdf).

Neurotoxin: Durable Backdoors in Federated Learning,

Z. Zhang, A. Panda, L. Song, Y. Yang, M. W. Mahoney, J. E. Gonzalez, K. Ramchandran, and P. Mittal,

Technical Report, Preprint: arXiv:2206.10341 (2022) (arXiv),

Proc. of the 2022 ICML Conference 162:26429-26446 (2022) (pdf).

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition,

S. Kim, A. Gholami, A. Shaw, N. Lee, K. Mangalam, J. Malik, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2206.00888 (2022) (arXiv),

Proc. of the 2022 NeurIPS Conference 35:9361-9373 (2022) (pdf, supp).

Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming,

S. Na and M. W. Mahoney,

Technical Report, Preprint: arXiv:2205.13687 (2022) (arXiv),

J. Machine Learning Research, 26(33):1-75 (2025) (pdf).

Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows,

F. Liang, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2205.07918 (2022) (arXiv),

Proc. of the 2022 ICML Conference 162:13257-13270 (2022) (pdf).

The Sky Above The Clouds,

S. Chasins, A. Cheung, N. Crooks, A. Ghodsi, K. Goldberg, J. E. Gonzalez, J. M. Hellerstein, M. I. Jordan, A. D. Joseph, M. W. Mahoney, A. Parameswaran, D. Patterson, R. Ada Popa, K. Sen, S. Shenker, D. Song, and I. Stoica,

Technical Report, Preprint: arXiv:2205.07147 (2022) (arXiv).

A Fast Post-Training Pruning Framework for Transformers,

W. Kwon, S. Kim, M. W. Mahoney, J. Hassoun, K. Keutzer, and A. Gholami,

Technical Report, Preprint: arXiv:2204.09656 (2022) (arXiv),

Proc. of the 2022 NeurIPS Conference 35:24101-24116 (2022) (pdf, supp).

Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence,

S. Na, M. Derezinski, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2204.09266 (2022) (arXiv),

Mathematical Programming, 201:473–520 (2023) (pdf).

Fast Feature Selection with Fairness Constraints,

F. Quinzan, R. Khanna, M. Hershcovitch, S. Cohen, D. G. Waddington, T. Friedrich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2202.13718 (2022) (arXiv),

Proc. of the 26th International Conference on AISTATS, PMLR 7800-7823 (2023) (pdf),

Proc. Second WFVML Workshop, at the 2023 ICML Conference (pdf).

AutoIP: A United Framework to Integrate Physics into Gaussian Processes,

D. Long, Z. Wang, A. Krishnapriyan, R. Kirby, S. Zhe, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2202.12316 (2022) (arXiv),

Proc. of the 2022 ICML Conference, 162:14210-14222 (2022) (pdf).

Learning continuous models for continuous physics,

A. S. Krishnapriyan, A. F. Queiruga, N. B. Erichson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2202.08494 (2022) (arXiv),

Communications Physics, 6, 319 (2023) (pdf).

Test Accuracy vs. Generalization Gap: Model Selection in NLP without Accessing Training or Testing Data (Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data),

Y. Yang, R. Theisen, L. Hodgkinson, J. E. Gonzalez, K. Ramchandran, C. H. Martin, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2202.02842 (2022) (arXiv),

Proc. of the 29th Annual SIGKDD, 3011–3021 (2023) (pdf).

Boosting Model Robustness to Common Corruptions with Noisy Data Augmentations (NoisyMix: Boosting Model Robustness to Common Corruptions),

N. B. Erichson, S. H. Lim, F. Utrera, W. Xu, Z. Cao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2202.01263 (2022) (arXiv),

Proc. of the 27th International Conference on AISTATS, PMLR 238:4033-4041 (2024).

2021

Learning from learning machines: a new generation of AI technology to meet the needs of science,

L. Pion-Tonachini, K. Bouchard, H. G. Martin, S. Peisert, W. B. Holtz, A. Aswani, D. Dwivedi, H. Wainwright, G. Pilania, B. Nachman, B. L. Marrone, N. Falco, Prabhat, D. Arnold, A. Wolf-Yadlin, S. Powers, S. Climer, Q. Jackson, T. Carlson, M. Sohn, P. Zwart, N. Kumar, A. Justice, C. Tomlin, D. Jacobson, G. Micklem, G. V. Gkoutos, P. J. Bickel, J.-B. Cazier, J. Muller, B.-J. Webb-Robertson, R. Stevens, M. Anderson, K. Kreutz-Delgado, M. W. Mahoney, and J. B. Brown,

Technical Report, Preprint: arXiv:2111.13786 (2021) (arXiv).

Long Expressive Memory for Sequence Modeling,

T. K. Rusch, S. Mishra, N. B. Erichson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2110.04744 (2021) (arXiv),

Proc. of the 2022 ICLR Conference (pdf).

Noisy Feature Mixup,

S. H. Lim, N. B. Erichson, F. Utrera, W. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2110.02180 (2021) (arXiv),

Proc. of the 2022 ICLR Conference (pdf).

Inexact Newton-CG Algorithms With Complexity Guarantees,

Z. Yao, P. Xu, F. Roosta, S. J. Wright, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2109.14016 (2021) (arXiv),

IMA Journal of Numerical Analysis, 43(3): 1855–1897 (2023).

Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information,

M. Jahani, S. Rusakov, Z. Shi, P. Richtarik, M. W. Mahoney, and M. Takac,

Technical Report, Preprint: arXiv:2109.05198 (2021) (arXiv),

Proc. of the 2022 ICLR Conference (pdf).

What's Hidden in a One-layer Randomly Weighted Transformer?,

S. Shen, Z. Yao, D. Kiela, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2109.03939 (2021) (arXiv),

Proc. of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pdf).

Characterizing possible failure modes in physics-informed neural networks,

A. S. Krishnapriyan, A. Gholami, S. Zhe, R. M. Kirby, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2109.01050 (2021) (arXiv),

Proc. of the 2021 NeurIPS Conference, 34:26548-26560 (2021) (pdf, supp).

Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers,

L. Hodgkinson, U. Simsekli, R. Khanna, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2108.00781 (2021) (arXiv),

Proc. of the 2022 ICML Conference (pdf).

Taxonomizing local versus global structure in neural network loss landscapes,

Y. Yang, L. Hodgkinson, R. Theisen, J. Zou, J. E. Gonzalez, K. Ramchandran, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2107.11228 (2021) (arXiv),

Proc. of the 2021 NeurIPS Conference, 34:18722-18733 (2021) (pdf).

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update,

M. Derezinski, J. Lacotte, M. Pilanci, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2107.07480 (2021) (arXiv),

Proc. of the 2021 NeurIPS Conference, 34:2835-2847 (2021) (pdf, supp).

Stateful ODE-Nets using Basis Function Expansions,

A. Queiruga, N. B. Erichson, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2106.10820 (2021) (arXiv),

Proc. of the 2021 NeurIPS Conference, 34:21770-21781 (2021) (pdf, supp).

Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:2106.00734 (2021) (arXiv).

LEAP: Learnable Pruning for Transformer-based Models,

Z. Yao, X. Wu, L. Ma, S. Shen, K. Keutzer, M. W. Mahoney, and Y. He,

Technical Report, Preprint: arXiv:2105.14636 (2021) (arXiv).

LocalNewton: Reducing Communication Bottleneck for Distributed Learning,

V. Gupta, A. Ghosh, M. Derezinski, R. Khanna, K. Ramchandran, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2105.07320 (2021) (arXiv),

Proc. of the 37th UAI Conference, 632-642 (2021) (pdf, pdf).

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training,

J. Chen, L. Zheng, Z. Yao, D. Wang, I. Stoica, M. W. Mahoney, and J. E. Gonzalez,

Technical Report, Preprint: arXiv:2104.14129 (2021) (arXiv),

Proc. of the 38th ICML Conference PMLR 139:1803-1813 (2021) (pdf, supp).

Integer-only Zero-shot Quantization for Efficient Speech Recognition,

S. Kim, A. Gholami, Z. Yao, N. Lee, P. Wang, A. Nrusimha, B. Zhai, T. Gao, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2103.16827 (2021) (arXiv),

Proc. of the ICASSP 2022 Conference, 4288-4292 (2022) (pdf).

A Survey of Quantization Methods for Efficient Neural Network Inference,

A. Gholami, S. Kim, Z. Dong, Z. Yao, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2103.13630 (2021) (arXiv),

Chapter in Low-Power Computer Vision: Improve the Efficiency of Artificial Intelligence, pp. 291-326 (2021).

Hessian Eigenspectra of More Realistic Nonlinear Models,

Z. Liao and M. W. Mahoney,

Technical Report, Preprint: arXiv:2103.01519 (2021) (arXiv),

Proc. of the 2021 NeurIPS Conference, 34:20104-20117 (2021) (pdf).

A Differential Geometry Perspective on Orthogonal Recurrent Models,

O. Azencot, N. B. Erichson, M. Ben-Chen, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2102.09589 (2021) (arXiv).

Noisy Recurrent Neural Networks,

S. H. Lim, N. B. Erichson, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2102.04877 (2021) (arXiv),

Proc. of the 2021 NeurIPS Conference, 34:5124-5137 (2021) (pdf, supp).

Hessian-Aware Pruning and Optimal Neural Implant,

S. Yu, Z. Yao, A. Gholami, Z. Dong, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2101.08940 (2021) (arXiv),

Proc. of the 2022 WACV Conference, 3880-3891 (2022) (pdf, supp).

I-BERT: Integer-only BERT Quantization,

S. Kim, A. Gholami, Z. Yao, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2101.01321 (2021) (arXiv),

Proc. of the 38th ICML Conference PMLR 139:5506-5518 (2021) (pdf, supp).

2020

Sparse sketches with small inversion bias,

M. Derezinski, Z. Liao, E. Dobriban, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2011.10695 (2020) (arXiv),

Proc. of the 2021 COLT, 134:1467-1510 (2021) (pdf).

HAWQV3: Dyadic Neural Network Quantization,

Z. Yao, Z. Dong, Z. Zheng, A. Gholami, J. Yu, E. Tan, L. Wang, Q. Huang, Y. Wang, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2011.10680 (2020) (arXiv),

Proc. of the 38th ICML Conference PMLR 139:11875-11886 (2021) (pdf, supp).

A Statistical Framework for Low-bitwidth Training of Deep Neural Networks,

J. Chen, Y. Gai, Z. Yao, M. W. Mahoney, and J. E. Gonzalez,

Technical Report, Preprint: arXiv:2010.14298 (2020) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 883-894 (2020) (pdf).

Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism,

V. Gupta, D. Choudhary, P. Tak Peter Tang, X. Wei, X. Wang, Y. Huang, A. Kejariwal, K. Ramchandran, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2010.08899 (2020) (arXiv),

Proc. of the 27th Annual SIGKDD, 2928-2936 (2021) (pdf).

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding,

Q. Wang, H. Tan, S. Shen, M. W. Mahoney, and Z. Yao,

Technical Report, Preprint: arXiv:2010.05379 (2020) (arXiv),

Proc. of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2030–2038 (2020) (pdf).

Sparse Quantized Spectral Clustering,

Z. Liao, R. Couillet, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2010.01376 (2020) (arXiv),

Proc. of the 2021 ICLR Conference (pdf).

Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models,

Z. Zhang, Z. Yao, Y. Yang, Y. Yan, J. E. Gonzalez, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2008.11364 (2020) (arXiv),

Proc. 2021 IEEE BigData, 1214-1225 (2021) (pdf).

Continuous-in-Depth Neural Networks,

A. F. Queiruga, N. B. Erichson, D. Taylor, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2008.02389 (2020) (arXiv).

Noise-Response Analysis of Deep Neural Networks Quantifies Robustness and Fingerprints Structural Malware,

N. B. Erichson, D. Taylor, Q. Wu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2008.00123 (2020) (arXiv),

Proc. 2021 SDM Conference, 100-108 (2021) (pdf).

Adversarially-Trained Deep Nets Transfer Better,

F. Utrera, E. Kravitz, N. B. Erichson, R. Khanna, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2007.05869 (2020) (arXiv),

Proc. of the 2021 ICLR Conference (pdf).

Boundary thickness and robustness in learning models,

Y. Yang, R. Khanna, Y. Yu, A. Gholami, K. Keutzer, J. E. Gonzalez, K. Ramchandran, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2007.05086 (2020) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 6223-6234 (2020) (pdf).

Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),

J. Demmel, J. Dongarra, J. Langou, J. Langou, P. Luszczek, and M. W. Mahoney,

LAWNs (LAPACK Working Notes), ICL-UT-20-07 (2020) (pdf).

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization,

M. Derezinski, B. Bartan, M. Pilanci, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2007.01327 (2020) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 6684-6695 (2020) (pdf).

Good classifiers are abundant in the interpolating regime,

R. Theisen, J. M. Klusowski, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2006.12625 (2020) (arXiv),

Proc. of the 24th International Conference on AISTATS, PMLR 130:3376-3384 (2021) (pdf).

Lipschitz Recurrent Neural Networks,

N. B. Erichson, O. Azencot, A. Queiruga, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2006.12070 (2020) (arXiv),

Proc. of the 2021 ICLR Conference (pdf).

Precise expressions for random projections: Low-rank approximation and randomized Newton,

M. Derezinski, F. Liang, Z. Liao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2006.10653 (2020) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 18272-18283 (2020) (pdf).

Multiplicative noise and heavy tails in stochastic optimization,

L. Hodgkinson and M. W. Mahoney,

Technical Report, Preprint: arXiv:2006.06293 (2020) (arXiv),

Proc. of the 38th ICML Conference PMLR 139:4262-4274 (2021) (pdf, supp).

A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent,

Z. Liao, R. Couillet, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2006.05013 (2020) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 13939-13950 (2020) (pdf),

Journal of Statistical Mechanics, Theory and Experiment 124006 (2021) (pdf).

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning,

Z. Yao, A. Gholami, S. Shen, M. Mustafa, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2006.00719 (2020) (arXiv), (code),

Proc. of the AAAI-21 Conference, 10665-10673 (2021) (pdf).

Determinantal Point Processes in Randomized Numerical Linear Algebra,

M. Derezinski and M. W. Mahoney,

Technical Report, Preprint: arXiv:2005.03185 (2020) (arXiv),

Notices of the AMS, 68 (1) 34-45 (2021) (pdf).

Flow-based Algorithms for Improving Clusters: A Unifying Framework, Software, and Performance,

K. Fountoulakis, M. Liu, D. F. Gleich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2004.09608 (2020) (arXiv),

SIAM Review 65(1): 59-143, (2023) (pdf).

PowerNorm: Rethinking Batch Normalization in Transformers,

S. Shen, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2003.07845 (2020) (arXiv),

Proc. of the 37th ICML Conference 4566-4576 (2020) (pdf).

Error Estimation for Sketched SVD via the Bootstrap,

M. E. Lopes, N. B. Erichson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2003.04937 (2020) (arXiv),

Proc. of the 37th ICML Conference 5435-5445 (2020) (pdf).

Forecasting Sequential Data using Consistent Koopman Autoencoders,

O. Azencot, N. B. Erichson, V. Lin, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2003.02236 (2020) (arXiv),

Proc. of the 37th ICML Conference 4493-4503 (2020) (pdf).

Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms,

P. Ma, X. Zhang, X. Xing, J. Ma, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2002.10526 (2020) (arXiv),

Proc. of the 23rd International Conference on AISTATS, PMLR 108:1026-1035 (2020) (pdf),

J. Machine Learning Research, 23(177):1−45, (2022) (pdf).

Stochastic Continuous Normalizing Flows: Training SDEs as ODEs (Stochastic Normalizing Flows),

L. Hodgkinson, C. van der Heide, F. Roosta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2002.09547 (2020) (arXiv),

Proc. of the 37th UAI Conference 161:1130-1140 (2021) (pdf).

Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nystrom method,

M. Derezinski, R. Khanna, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2002.09073 (2020) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 4953-4964 (2020) (pdf) (Awarded Best Paper Award),

Proc. of the IJCAI-21, Sister Conferences Best Paper (SCBP) Track, 4765-4769 (2021) (pdf).

Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data,

C. H. Martin, T. S. Peng, and M. W. Mahoney,

Technical Report, Preprint: arXiv:2002.06716 (2020) (arXiv), (code),

Nature Communications, 12, 4122 (2021) (pdf).

ZeroQ: A Novel Zero Shot Quantization Framework,

Y. Cai, Z. Yao, Z. Dong, A. Gholami, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:2001.00281 (2020) (arXiv), (code),

Proc. of the 33rd CVPR Conference, 13169-13178 (2020) (pdf, supp).

2019

PyHessian: Neural Networks Through the Lens of the Hessian,

Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1912.07145 (2019) (arXiv), (code),

Proc. 2020 IEEE BigData, 581-590 (2020) (pdf).

Exact expressions for double descent and implicit regularization via surrogate random design,

M. Derezinski, F. Liang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1912.04533 (2019) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 5152-5164 (2020) (pdf).

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data,

A. Eshragh, F. Roosta, A. Nazari, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1911.12321 (2019) (arXiv),

J. Machine Learning Research, 23(22):1−36, (2022) (pdf).

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks,

Z. Dong, Z. Yao, Y. Cai, D. Arfeen, A. Gholami, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:1911.03852 (2019) (arXiv),

Proc. of the 2020 NeurIPS Conference, 33: 18518-18529 (2020) (pdf).

Running Alchemist on Cray XC and CS Series Supercomputers: Dask and PySpark Interfaces, Deployment Options, and Data Transfer Times,

K. Rothauge, H. Ayyalasomayajula, K. J. Maschhoff, M. Ringenburg, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1910.01354 (2019) (arXiv), (code),

Proc. Cray User Group, CUG 2019 (2019) (pdf).

Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings,

K. Levin, F. Roosta, M. Tang, M. W. Mahoney, and C. E. Priebe,

Technical Report, Preprint: arXiv:1910.00423 (2019) (arXiv),

J. Machine Learning Research, 22(194): 1−59, (2021) (pdf).

Bootstrapping the Operator Norm in High Dimensions: Error Estimation for Covariance Matrices and Sketching,

M. E. Lopes, N. B. Erichson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1909.06120 (2019) (arXiv),

Bernoulli Journal, 29(1): 428-450 (2023) (pdf).

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT,

S. Shen, Z. Dong, J. Ye, L. Ma, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:1909.05840 (2019) (arXiv),

Proc. of the AAAI-20 Conference, 8815-8821 (2020) (pdf).

The Difficulties of Addressing Interdisciplinary Challenges at the Foundations of Data Science,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1909.03033 (2019) (arXiv),

Appeared in SIAM News, SIGACT News, etc.

Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks,

C. H. Martin and M. W. Mahoney,

Proc. of the 25th Annual SIGKDD, 3239-3240 (2019) (pdf).

Geometric Rates of Convergence for Kernel-based Sampling Algorithms,

R. Khanna, L. Hodgkinson, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1907.08410 (2019) (arXiv),

Proc. of the 37th UAI Conference 161:2156-2164 (2021) (pdf, supp).

Statistical guarantees for local graph clustering,

W. Ha, K. Fountoulakis, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1906.04863 (2019) (arXiv),

Proc. of the 23rd International Conference on AISTATS, PMLR 108:2687-2697 (2020) (pdf),

J. Machine Learning Research, 22(148): 1−54, (2021) (pdf).

ANODEV2: A Coupled Neural ODE Evolution Framework,

T. Zhang, Z. Yao, A. Gholami, K. Keutzer, J. Gonzalez, G. Biros, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1906.04596 (2019) (arXiv), (code),

Proc. of the 2019 NeurIPS Conference, 5151-5161 (2019) (pdf).

Bayesian experimental design using regularized determinantal point processes,

M. Derezinski, F. Liang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1906.04133 (2019) (arXiv),

Proc. of the 23rd International Conference on AISTATS, PMLR 108:3197-3207 (2020) (pdf, supp) (talk).

Distributed estimation of the inverse Hessian by determinantal averaging,

M. Derezinski and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.11546 (2019) (arXiv),

Proc. of the 2019 NeurIPS Conference, 11405-11415 (2019) (pdf).

Residual Networks as Nonlinear Systems: Stability Analysis using Linearization,

K. Rothauge, Z. Yao, Z. Hu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.13386 (2019) (arXiv).

Parallel and Communication Avoiding Least Angle Regression,

S. Das, J. Demmel, K. Fountoulakis, L. Grigori, M. W. Mahoney, and S. Yang,

Technical Report, Preprint: arXiv:1905.11340 (2019) (arXiv),

SIAM J. Scientific Computing, 43(2), C154–C176 (2021) (pdf).

Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction,

N. B. Erichson, M. Muehlebach, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.10866 (2019) (arXiv),

Proc. Second Workshop on Machine Learning and the Physical Sciences, at the 2019 NeurIPS Conference (pdf).

HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision,

Z. Dong, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:1905.03696 (2019) (arXiv),

Proc. ICCV 2019 293-302 (2019) (pdf).

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks,

N. B. Erichson, Z. Yao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1904.03750 (2019) (arXiv),

Proc. of the 9th ICPRAM Conference 103-114 (2020) (pdf).

OverSketched Newton: Fast Convex Optimization for Serverless Systems,

V. Gupta, S. Kadhe, T. Courtade, M. W. Mahoney, and K. Ramchandran,

Technical Report, Preprint: arXiv:1903.08857 (2019) (arXiv),

Proc. 2020 IEEE BigData, 288-297 (2020) (pdf).

Inefficiency of K-FAC for Large Batch Size Training,

L. Ma, G. Montague, J. Ye, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1903.06237 (2019) (arXiv),

Proc. of the AAAI-20 Conference, 5053-5060 (2020) (pdf).

Sub-Sampled Newton Methods,

F. Roosta-Khorasani and M. W. Mahoney,

Mathematical Programming, 174(1-2): 293-326 (2019) (pdf).

Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data,

N. B. Erichson, L. Mathelin, Z. Yao, S. L. Brunton, M. W. Mahoney, and J. N. Kutz,

Technical Report, Preprint: arXiv:1902.07358 (2019) (arXiv),

Proceedings of the Royal Society A, 476:20200097 (2020) (pdf).

Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression,

M. Derezinski, K. L. Clarkson, M. W. Mahoney, and M. K. Warmuth,

Technical Report, Preprint: arXiv:1902.00995 (2019) (arXiv),

Proc. of 2019 COLT, PMLR 99:1050-1069 (2019) (pdf).

Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1901.08278 (2019) (arXiv), (code),

Proc. 2020 SDM Conference, 505-513 (2020) (pdf).

Traditional and Heavy-Tailed Self Regularization in Neural Network Models,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1901.08276 (2019) (arXiv), (iclr19), (code),

Proc. of the 36th ICML Conference 4284-4293 (2019) (pdf).

2018

Trust Region Based Adversarial Attack on Neural Networks,

Z. Yao, A. Gholami, P. Xu, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1812.06371 (2018) (arXiv), (code),

Proc. of the 32nd CVPR Conference, 11350-11359 (2019) (pdf).

Parameter Re-Initialization through Cyclical Batch Size Schedules,

N. Mu, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1812.01216 (2018) (arXiv),

Proc. Systems for Machine Learning Workshop, at the 2018 NeurIPS Conference (pdf).

On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent,

N. Golmant, N. Vemuri, Z. Yao, V. Feinberg, A. Gholami, K. Rothauge, M. W. Mahoney, and J. Gonzalez,

Technical Report, Preprint: arXiv:1811.12941 (2018) (arXiv), (iclr19).

The Mathematics of Data,

M. W. Mahoney, J. C. Duchi, and A. C. Gilbert, Eds.

AMS, IAS/PCMI, and SIAM (2018) (web), (intro).

A Short Introduction to Local Graph Clustering Methods and Software,

K. Fountoulakis, D. F. Gleich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.07324 (2018) (arXiv),

Absts. of the 7th Intl. Conference on Complex Networks and Their Applications (pdf), (code).

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.01075 (2018) (arXiv), (code),

J. Machine Learning Research, 22(165): 1−73, (2021) (pdf).

Large batch size training of neural networks with adversarial training and second-order information,

Z. Yao, A. Gholami, D. Arfeen, R. Liaw, J. Gonzalez, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.01021 (2018) (arXiv), (iclr19), (code).

Newton-MR: Inexact Newton Method With Minimum Residual Sub-problem Solver,

F. Roosta, Y. Liu, P. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.00303 (2018) (arXiv),

EURO Journal on Computational Optimization, 10: 100035 (2022) (pdf).

Newton-ADMM: A Distributed GPU-Accelerated Optimizer for Multiclass Classification Problems,

C.-H. Fang, S. B. Kylasa, F. Roosta, M. W. Mahoney, and A. Grama,

Technical Report, Preprint: arXiv:1807.07132 (2018) (arXiv), (code),

Proc. SC20 Conference, 50:1-12 (2020) (pdf).

Alchemist: An Apache Spark <=> MPI Interface,

A. Gittens, K. Rothauge, M. W. Mahoney, S. Wang, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,

Technical Report, Preprint: arXiv:1806.01270 (2018) (arXiv), (code),

Concurrency and Computation: Practice and Experience (Special Issue of the Cray User Group, CUG 2018), e5026 (2018) (pdf).

Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist,

A. Gittens, K. Rothauge, S. Wang, M. W. Mahoney, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,

Technical Report, Preprint: arXiv:1805.11800 (2018) (arXiv),

Proc. of the 24th Annual SIGKDD, 293-301 (2018) (pdf).

Group Collaborative Representation for Image Set Classification,

B. Liu, L. Jing, J. Li, J. Yu, A. Gittens, and M. W. Mahoney,

International Journal of Computer Vision, 1-26 (2018) (pdf).

Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap,

M. E. Lopes, S. Wang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1803.08021 (2018) (arXiv),

Proc. of the 35th ICML Conference 3223-3232 (2018) (pdf).

GPU Accelerated Sub-Sampled Newton's Method,

S. B. Kylasa, F. Roosta-Khorasani, M. W. Mahoney, and A. Grama,

Technical Report, Preprint: arXiv:1802.09113 (2018) (arXiv), (code),

Proc. 2019 SDM Conference, 702-710 (2019) (pdf).

Hessian-based Analysis of Large Batch Training and Robustness to Adversaries,

Z. Yao, A. Gholami, Q. Lei, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1802.08241 (2018) (arXiv),

Proc. of the 2018 NeurIPS Conference, 4954-4964 (2018) (pdf).

Inexact Non-Convex Newton-Type Methods,

Z. Yao, P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1802.06925 (2018) (arXiv),

INFORMS Journal on Optimization 3(2):154-182 (2021) (pdf).

Out-of-sample extension of graph adjacency spectral embedding,

K. Levin, F. Roosta-Khorasani, M. W. Mahoney, and C. E. Priebe,

Technical Report, Preprint: arXiv:1802.06307 (2018) (arXiv),

Proc. of the 35th ICML Conference 2981-2990 (2018) (pdf).

2017

Lectures on Randomized Numerical Linear Algebra,

P. Drineas and M. W. Mahoney,

Technical Report, Preprint: arXiv:1712.08880 (2017) (arXiv),

In: Lectures of the 2016 PCMI Summer School on Mathematics of Data.

Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization,

A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1712.06047 (2017) (arXiv),

Proc. of the 2018 IPDPS Conference 409-418 (2018) (pdf).

Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior, (blog)

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1710.09553 (2017) (arXiv), (iclr18).

LASAGNE: Locality And Structure Aware Graph Node Embedding,

E. Faerman, F. Borutta, K. Fountoulakis, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1710.06520 (2017) (arXiv),

Proc. 2018 International Conference on Web Intelligence, 246-253 (2018) (pdf). (Awarded Best Student Paper Award.)

A Berkeley View of Systems Challenges for AI,

I. Stoica, D. Song, R. A. Popa, D. A. Patterson, M. W. Mahoney, R. H. Katz, A. D. Joseph, M. Jordan, J. M. Hellerstein, J. Gonzalez, K. Goldberg, A. Ghodsi, D. E. Culler, and P. Abbeel,

Technical Report No. UCB/EECS-2017-159, October 2017 (www),

Technical Report, Preprint: arXiv:1712.05855 (2017) (arXiv).

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization,

S. Wang, F. Roosta-Khorasani, P. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1709.03528 (2017) (arXiv), (Spark code), (Python code),

Proc. of the 2018 NeurIPS Conference, 2338-2348 (2018) (pdf).

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study,

P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1708.07827 (2017) (arXiv), (code),

Proc. 2020 SDM Conference, 199-207 (2020) (pdf).

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information,

P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1708.07164 (2017) (arXiv),

Mathematical Programming, 184: 35-70 (2020) (pdf).

A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication,

M. E. Lopes, S. Wang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1708.01945 (2017) (arXiv),

J. Machine Learning Research, 20(39): 1−40 (2019) (pdf).

Capacity releasing diffusions for speed and locality,

D. Wang, K. Fountoulakis, M. Henzinger, M. W. Mahoney, and S. Rao,

Technical Report, Preprint: arXiv:1706.05826 (2017) (arXiv),

Proc. of the 34th ICML Conference 3598-3607 (2017) (pdf, supp) (talk).

Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds,

S. Wang, A. Gittens, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1706.02803 (2017) (arXiv),

J. Machine Learning Research, 20(12): 1-49 (2019) (pdf).

Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction,

K. E. Bouchard, A. F. Bujan, F. Roosta-Khorasani, S. Ubaru, Prabhat, A. M. Snijders, J.-H. Mao, E. F. Chang, M. W. Mahoney, and S. Bhattacharyya,

Technical Report, Preprint: arXiv:1705.07585 (2017) (arXiv),

Proc. of the 2017 NIPS Conference, 1078-1086 (2017) (pdf).

Skip-Gram - Zipf + Uniform = Vector Additivity,

A. Gittens, D. Achlioptas, and M. W. Mahoney,

Proc. of the 55th ACL Meeting 69-76 (2017) (pdf).

Principles and Applications of Science of Information [Scanning the Issue],

T. Courtade, A. Grama, M. W. Mahoney, and T. Weissman,

Proceedings of the IEEE, 105(2): 183-188 (2017) (pdf).

Social Discrete Choice Models,

D. Zhang, K. Fountoulakis, J. Cao, M. Yin, M. W. Mahoney, and A. Pozdnoukhov,

Technical Report, Preprint: arXiv:1703.07520 (2017) (arXiv).

Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging,

S. Wang, A. Gittens, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1702.04837 (2017) (arXiv),

Proc. of the 34th ICML Conference 3608-3616 (2017) (pdf),

J. Machine Learning Research, 18(218): 1-50 (2018) (pdf).

2016

Avoiding communication in primal and dual block coordinate descent methods,

A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1612.04003 (2016) (arXiv),

SIAM J. Scientific Computing, 41(1), C1-C27 (2019) (pdf).

Feature-distributed sparse regression: a screen-and-clean approach,

J. Yang, M. W. Mahoney, M. A. Saunders, and Y. Sun,

Proc. of the 2016 NIPS Conference, 2711-2719 (2016) (pdf).

Multi-label learning with semantic embeddings,

L. Jing, M. Cheng, L. Yang, A. Gittens, and M. W. Mahoney,

ICLR 2017 OpenReview.net (iclr17).

Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data,

D. Lawlor, T. Budavari, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1609.03932 (2016) (arXiv),

The Astrophysical Journal, 833:1, 26 (2016) (pdf).

Lecture Notes on Spectral Graph Methods,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1608.04845 (2016) (arXiv).

Lecture Notes on Randomized Linear Algebra,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1608.04481 (2016) (arXiv).

An optimization approach to locally-biased graph algorithms,

K. Fountoulakis, D. F. Gleich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1607.04940 (2016) (arXiv),

Proceedings of the IEEE, 105(2): 256-272 (2017) (pdf).

DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection,

L. Jing, B. Liu, J. Choi, A. Janin, J. Bernd, M. W. Mahoney, and G. Friedland,

Technical Report, Preprint: arXiv:1607.04378 (2016) (arXiv),

Proc. of the 2016 ACM Multimedia Conference 57-61 (2016) (pdf),

IEEE Transactions on Multimedia, 19(12): 2637-2650 (2017) (pdf).

Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies,

A. Gittens, A. Devarakonda, E. Racah, M. Ringenburg, L. Gerhardt, J. Kottalam, J. Liu, K. Maschhoff, S. Canon, J. Chhugani, P. Sharma, J. Yang, J. Demmel, J. Harrell, V. Krishnamurthy, M. W. Mahoney, and Prabhat,

Technical Report, Preprint: arXiv:1607.01335 (2016) (arXiv), (code),

Proc. 2016 IEEE BigData, 204-213 (2016) (pdf).

Sub-sampled Newton Methods with Non-uniform Sampling,

P. Xu, J. Yang, F. Roosta-Khorasani, C. Re, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1607.00559 (2016) (arXiv),

Proc. of the 2016 NIPS Conference, 3000-3008 (2016) (pdf).

Approximating the Solution to Mixed Packing and Covering LPs in parallel time,

M. W. Mahoney, S. Rao, D. Wang, and P. Zhang,

Proc. of the 43rd ICALP Conference, 52:1-52:14 (2016) (pdf).

A Simple and Strongly-Local Flow-Based Method for Cut Improvement,

N. Veldt, D. F. Gleich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1605.08490 (2016) (arXiv),

Proc. of the 33rd ICML Conference 1938-1947 (2016) (pdf, supp).

RandNLA: Randomized Numerical Linear Algebra,

P. Drineas and M. W. Mahoney,

Communications of the ACM, 59, 80-90 (2016) (pdf).

FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods,

X. Cheng, F. Roosta-Khorasani, S. Palombo, P. L. Bartlett, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1605.08108 (2016) (arXiv),

Proc. of the 21st International Conference on AISTATS, PMLR 84:404-414 (2018) (pdf, supp).

Parallel Local Graph Clustering,

J. Shun, F. Roosta-Khorasani, K. Fountoulakis, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1604.07515 (2016) (arXiv),

Proceedings of the VLDB Endowment, 9(12) 1041-1052 (2016) (pdf).

A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark,

A. Gittens, J. Kottalam, J. Yang, M. F. Ringenburg, J. Chhugani, E. Racah, M. Singh, Y. Yao, C. Fischer, O. Ruebel, B. Bowen, N. G. Lewis, M. W. Mahoney, V. Krishnamurthy, and Prabhat,

Proc. 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, at IPDPS, 2016 (pdf).

Mining Large Graphs,

D. F. Gleich and M. W. Mahoney,

In Handbook of Big Data. pp. 191-220, edited by P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan, Chapman and Hall/CRC Press, 2016 (pdf).

Structural properties underlying high-quality Randomized Numerical Linear Algebra algorithms,

M. W. Mahoney and P. Drineas,

In Handbook of Big Data. pp. 137-154, edited by P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan, Chapman and Hall/CRC Press, 2016 (pdf).

Variational Perspective on Local Graph Clustering,

K. Fountoulakis, X. Cheng, J. Shun, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1602.01886 (2016) (arXiv),

Mathematical Programming, 174(1-2): 553-573 (2019) (pdf).

Sub-Sampled Newton Methods II: Local Convergence Rates,

F. Roosta-Khorasani and M. W. Mahoney,

Technical Report, Preprint: arXiv:1601.04738 (2016) (arXiv).

Sub-Sampled Newton Methods I: Globally Convergent Algorithms,

F. Roosta-Khorasani and M. W. Mahoney,

Technical Report, Preprint: arXiv:1601.04737 (2016) (arXiv).

RandNLA, Pythons, and the CUR for Your Data Problems: Reporting from G2S3 2015 in Delphi,

E. Gallopoulos, P. Drineas, I. Ipsen, and M. W. Mahoney,

SIAM News 49:1 January/February 2016 (web), (pdf).

2015

Faster Parallel Solver for Positive Linear Programs via Dynamically-Bucketed Selective Coordinate Descent,

D. Wang, M. W. Mahoney, N. Mohan, and S. Rao,

Technical Report, Preprint: arXiv:1511.06468 (2015) (arXiv).

A Local Perspective on Community Structure in Multilayer Networks,

L. G. S. Jeub, M. W. Mahoney, P. J. Mucha, and M. A. Porter,

Technical Report, Preprint: arXiv:1510.05185 (2015) (arXiv),

Network Science, 5(2): 144-163, 2017 (pdf).

Optimal Subsampling Approaches for Large Sample Linear Regression,

R. Zhu, P. Ma, M. W. Mahoney, and B. Yu,

Technical Report, Preprint: arXiv:1509.05111 (2015) (arXiv).

Unified Acceleration Method for Packing and Covering Problems via Diameter Reduction,

D. Wang, S. Rao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1508.02439 (2015) (arXiv),

Proc. of the 43rd ICALP Conference, 50:1-50:13 (2016) (pdf).

Using local spectral methods to robustify graph-based learning algorithms,

D. F. Gleich and M. W. Mahoney,

Proc. of the 21st Annual SIGKDD, 359-368 (2015) (pdf) (code).

Structured Block Basis Factorization for Scalable Kernel Matrix Evaluation,

R. Wang, Y. Li, M. W. Mahoney, and E. Darve,

Technical Report, Preprint: arXiv:1502.03571 (2015) (arXiv),

SIAM J. Matrix Analysis and Applications, 40(4), 1497–1526 (2019) (pdf).

Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions,

J. Yang, O. Ruebel, Prabhat, M. W. Mahoney, and B. P. Bowen,

Analytical Chemistry, 87 (9), 4658-4666 (2015) (pdf) (code).

Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nystrom Method,

D. G. Anderson, S. S. Du, M. W. Mahoney, C. Melgaard, K. Wu, and M. Gu,

Proc. of the 18th International Conference on AISTATS, PMLR 38:19-27 (2015) (pdf, supp) (code).

Weighted SGD for Lp Regression with Randomized Preconditioning,

J. Yang, Y.-L. Chow, C. Re, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1502.03571 (2015) (arXiv),

Proc. of the 27th Annual SODA, 558-569 (2016) (pdf),

J. Machine Learning Research, 18(211): 1-43 (2018) (pdf).

Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments,

J. Yang, X. Meng, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1502.03032 (2015) (arXiv) (code),

Proceedings of the IEEE 104(1): 58-92 (2016) (pdf).

2014

Tree decompositions and social graphs,

A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1411.1546 (2014) (arXiv), (code),

Internet Mathematics, 12(5), 315-361 (2016) (pdf).

Fast Randomized Kernel Methods With Statistical Guarantees,

A. El Alaoui and M. W. Mahoney,

Technical Report, Preprint: arXiv:1411.0306 (2014) (arXiv),

Proc. of the 2015 NIPS Conference, 775-783 (2015) (pdf).

Signal Processing for Big Data (Editorial for Special Issue),

G. B. Giannakis, F. Bach, R. Cendrillon, M. Mahoney, and J. Neville,

IEEE Signal Processing Magazine, 31: 15-16 (September 2014) (pdf).

A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares,

G. Raskutti and M. W. Mahoney,

Technical Report, Preprint: arXiv:1406.5986 (2014) (arXiv),

Proc. of the 32nd ICML Conference, 617-625 (2015) (pdf),

J. Machine Learning Research, 17(214): 1-31 (2016) (pdf).

Random Laplace Feature Maps for Semigroup Kernels on Histograms,

J. Yang, V. Sindhwani, Q. Fan, H. Avron, and M. W. Mahoney,

Proc. of the 27th CVPR Conference, 971-978 (2014) (pdf).

Anti-differentiating Approximation Algorithms: A case study with Min-cuts, Spectral, and Flow,

D. F. Gleich and M. W. Mahoney,

Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 1018-1025 (2014) (pdf) (code, code) (talk).

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels,

J. Yang, V. Sindhwani, H. Avron, and M. W. Mahoney,

Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 485-493 (2014) (pdf), (code),

Technical Report, Preprint: arXiv:1412.8293 (2014) (arXiv),

J. Machine Learning Research, 17(120): 1-38 (2016) (pdf).

Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks,

L. G. S. Jeub, P. Balachandran, M. A. Porter, P. J. Mucha, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1403.3795 (2014) (arXiv), (code, code),

Physical Review E, 91, 012821 (2015) (pdf).

A new spin on an old algorithm: technical perspective on "Communication costs of Strassen's matrix multiplication,"

M. W. Mahoney,

Communications of the ACM, 57(2): 106 (2014) (pdf).

2013

Tree-like Structure in Large Social and Information Networks,

A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,

Proc. of the 2013 IEEE ICDM, 1-10 (2013) (pdf).

Objective Identification of Informative Wavelength Regions in Galaxy Spectra,

C.-W. Yip, M. W. Mahoney, A. S. Szalay, I. Csabai, T. Budavari, R. F. G. Wyse, and L. Dobos,

Technical Report, Preprint: arXiv:1312.0637 (2013) (arXiv),

Astronomical Journal, 147, 5, 110 (2014) (pdf).

Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity,

A. B. Adcock, B. D. Sullivan, O. R. Hernandez, and M. W. Mahoney,

Proc. of the 9th IWOMP, 71-83 (2013) (pdf).

Frontiers in Massive Data Analysis,

Committee on the Analysis of Massive Data, et al. (M. I. Jordan, et al.),

The National Academies Press (2013) (pdf), (web).

A Statistical Perspective on Algorithmic Leveraging,

P. Ma, M. W. Mahoney, and B. Yu,

Technical Report, Preprint: arXiv:1306.5362 (2013) (arXiv),

Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 91-99 (2014) (pdf),

J. Machine Learning Research, 16, 861-911 (2015) (pdf).

Robust Regression on MapReduce,

X. Meng and M. W. Mahoney,

Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 888-896 (2013) (pdf).

Quantile Regression for Large-scale Applications,

J. Yang, X. Meng, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1305.0087 (2013) (arXiv), (code),

Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 881-887 (2013) (pdf),

SIAM J. Scientific Computing, 36(5), S78-S110 (2014) (pdf).

Revisiting the Nystrom Method for Improved Large-Scale Machine Learning,

A. Gittens and M. W. Mahoney,

Technical Report, Preprint: arXiv:1303.1849 (2013) (arXiv), (code),

Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 567-575 (2013) (pdf),

J. Machine Learning Research, 17(117): 1-65 (2016) (pdf).

2012

Semi-supervised Eigenvectors for Large-scale Locally-biased Learning,

T. J. Hansen and M. W. Mahoney,

Proc. of the 2012 NIPS Conference, 2528-2536 (2012) (pdf), (code),

Technical Report, Preprint: arXiv:1304.7528 (2013) (arXiv),

J. Machine Learning Research, 15, 3691-3734 (2014) (pdf).

Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression,

X. Meng and M. W. Mahoney,

Technical Report, Preprint: arXiv:1210.3135 (2012) (arXiv),

Proc. of the 45th STOC, 91-100 (2013) (pdf).

The Fast Cauchy Transform and Faster Robust Linear Regression,

K. L. Clarkson, P. Drineas, M. Magdon-Ismail, M. W. Mahoney, X. Meng, and D. P. Woodruff,

Technical Report, Preprint: arXiv:1207.4684 (2012) (arXiv),

Proc. of the 24th Annual SODA, 466-477 (2013) (pdf),

SIAM J. Computing, 45, 763-810 (2016) (pdf).

rCUR: an R package for CUR matrix decomposition,

A. Bodor, I. Csabai, M. W. Mahoney, and N. Solymosi,

BMC Bioinformatics, 13:103 (2012) (pdf), (code).

Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1203.0786 (2012) (arXiv),

Proc. of the 2012 ACM Symposium on Principles of Database Systems, 143-154, 2012 (pdf).

On the Hyperbolicity of Small-World and Tree-Like Random Graphs,

W. Chen, W. Fang, G. Hu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1201.1717 (2012) (arXiv),

Proc. of the 23rd ISAAC, 278-288 (2012) (pdf),

Internet Mathematics, 9(4), 434-491 (2013) (pdf).

2011

Randomized Dimensionality Reduction for K-means Clustering,

C. Boutsidis, A. Zouzias, M. W. Mahoney, and P. Drineas,

Technical Report, Preprint: arXiv:1110.2897 (2011) (arXiv),

IEEE Transactions on Information Theory, 61(2), 1045-1062 (2015) (pdf).

Regularized Laplacian Estimation and Fast Eigenvector Approximation,

P. O. Perry and M. W. Mahoney,

Technical Report, Preprint: arXiv:1110.1757 (2011) (arXiv),

Proc. of the 2011 NIPS Conference, 2420-2428 (2011) (pdf).

LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems,

X. Meng, M. A. Saunders, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1109.5981 (2011) (arXiv), (code),

SIAM J. Scientific Computing, 36(2), C95-C118 (2014) (pdf).

Fast approximation of matrix coherence and statistical leverage,

P. Drineas, M. Magdon-Ismail, M. W. Mahoney, and D. P. Woodruff,

Technical Report, Preprint: arXiv:1109.3843 (2011) (arXiv),

Proc. of the 29th ICML Conference, 1051-1058 (2012) (pdf),

J. Machine Learning Research, 13, 3475-3506 (2012) (pdf).

Localization on low-order eigenvectors of data matrices,

M. Cucuringu and M. W. Mahoney,

Technical Report, Preprint: arXiv:1109.1355 (2011) (arXiv).

Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation,

A. Javed, P. Drineas, M. W. Mahoney, and P. Paschou,

Annals of Human Genetics, 75, 707-722 (2011) (pdf).

Randomized Algorithms for Matrices and Data,

M. W. Mahoney,

Foundations and Trends in Machine Learning, NOW Publishers, Volume 3, Issue 2, 2011 (now),

TR version: Technical Report, Preprint: arXiv:1104.5557 (2011) (arXiv).

(Abridged version in: Advances in Machine Learning and Data Mining for Astronomy, edited by M. J. Way, et al., pp. 647-672, 2012.)

2010

Computation in Large-Scale Scientific and Internet Data Applications is a Focus of MMDS 2010,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1012.4231 (2010) (arXiv),

Appeared in SIGKDD Explorations, SIGACT News, ASA-SCGN Newsletter, and IMS Bulletin.

CUR from a Sparse Optimization Viewpoint,

J. Bien, Y. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1011.0413 (2010) (arXiv),

Proc. of the 2010 NIPS Conference, 217-225 (2010) (ps, pdf).

Algorithmic and Statistical Perspectives on Large-Scale Data Analysis,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1010.1609 (2010) (arXiv),

In: Combinatorial Scientific Computing, pp. 427-469, edited by U. Naumann and O. Schenk, 2012.

Implementing regularization implicitly via approximate eigenvector computation,

M. W. Mahoney and L. Orecchia,

Technical Report, Preprint: arXiv:1010.0703 (2010) (arXiv),

Proc. of the 28th ICML Conference, 121-128 (2011) (pdf) (talk).

Approximating Higher-Order Distances Using Random Projections,

P. Li, M. W. Mahoney, and Y. She,

Proc. of the 26th UAI Conference, 312-321 (2010) (ps, pdf),

Technical Report, Preprint: arXiv:1203.3492 (2012) (arXiv).

Effective Resistances, Statistical Leverage, and Applications to Linear Equation Solving,

P. Drineas and M. W. Mahoney,

Technical Report, Preprint: arXiv:1005.3097 (2010) (arXiv).

Empirical Comparison of Algorithms for Network Community Detection,

J. Leskovec, K. J. Lang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1004.3539 (2010) (arXiv),

Proc. of the 19th International WWW, 631-640 (2010) (ps, pdf).

2009

A Local Spectral Method for Graphs: with Applications to Improving Graph Partitions and Exploring Data Graphs Locally,

M. W. Mahoney, L. Orecchia, and N. K. Vishnoi,

Technical Report, Preprint: arXiv:0912.0681 (2009) (arXiv),

J. Machine Learning Research, 13, 2339-2365 (2012) (ps, pdf).

Unsupervised Feature Selection for the k-means Clustering Problem,

C. Boutsidis, M. W. Mahoney, and P. Drineas,

Proc. of the 2009 NIPS Conference, 153-161 (2009) (ps, pdf).

Learning with Spectral Kernels and Heavy-Tailed Data,

M. W. Mahoney and H. Narayanan,

Technical Report, Preprint: arXiv:0906.4539 (2009) (arXiv).

Empirical Evaluation of Graph Partitioning Using Spectral Embeddings and Flow,

K. J. Lang, M. W. Mahoney, and L. Orecchia,

Proc. of the 8th International SEA, 197-208 (2009) (ps, pdf).

CUR Matrix Decompositions for Improved Data Analysis,

M. W. Mahoney and P. Drineas,

Proc. Natl. Acad. Sci. USA, 106, 697-702 (2009) (ps, pdf).

2008

An Improved Approximation Algorithm for the Column Subset Selection Problem,

C. Boutsidis, M. W. Mahoney, and P. Drineas,

Technical Report, Preprint: arXiv:0812.4293 (2008) (arXiv),

Proc. of the 20th Annual SODA, 968-977 (2009) (ps, pdf).

Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis are the Focus of MMDS 2008,

M. W. Mahoney, L.-H. Lim, and G. E. Carlsson,

Technical Report, Preprint: arXiv:0812.3702 (2008) (arXiv),

Appeared in SIGKDD Explorations (ps, pdf), SIAM News (ps, pdf), and ASA-SCGN Newsletter (ps, pdf), and abridged versions appeared in IMS Bulletin (ps, pdf) and AmStat News.

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,

J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:0810.1355 (2008) (arXiv),

Internet Mathematics, 6(1), 29-123 (2009) (pdf).

Unsupervised Feature Selection for Principal Components Analysis,

C. Boutsidis, M. W. Mahoney, and P. Drineas,

Proc. of the 14th Annual SIGKDD, 61-69 (2008) (ps, pdf).

Statistical Properties of Community Structure in Large Social and Information Networks,

J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,

Proc. of the 17th International WWW, 695-704 (2008) (ps, pdf).

2007

Faster Least Squares Approximation,

P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarlos,

Technical Report, Preprint: arXiv:0710.1435 (2007) (arXiv),

Numerische Mathematik, 117, 219-249 (2011) (pdf).

PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations,

P. Paschou, E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron, M. W. Mahoney, and P. Drineas,

PLoS Genetics, 3, 1672-1686 (2007) (ps, pdf).

Relative-Error CUR Matrix Decompositions,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Technical Report, Preprint: arXiv:0708.3696 (2007) (arXiv),

SIAM J. Matrix Analysis and Applications, 30, 844-881 (2008) (ps, pdf).

Feature Selection Methods for Text Classification,

A. Dasgupta, P. Drineas, B. Harb, V. Josifovski, and M. W. Mahoney,

Proc. of the 13th Annual SIGKDD, 230-239 (2007) (ps, pdf).

Sampling Algorithms and Coresets for Lp Regression,

A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. W. Mahoney,

Technical Report, Preprint: arXiv:0707.1714 (2007) (arXiv),

Proc. of the 19th Annual SODA, 932-941 (2008) (ps, pdf),

SIAM J. Computing, 38, 2060-2078 (2009) (ps, pdf).

Web Information Retrieval and Linear Algebra Algorithms,

A. Frommer, M. W. Mahoney, and D. B. Szyld (Eds.),

Proc. of Dagstuhl Seminar 07071, (2007) (web).

Intra- and interpopulation genotype reconstruction from tagging SNPs,

P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas,

Genome Research, 17(1), 96-107 (2007) (ps, pdf).

2006

Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications,

G. H. Golub, M. W. Mahoney, P. Drineas, and L.-H. Lim,

SIAM News 39:8 October 2006 (ps, pdf).

Randomized Algorithms for Matrices and Massive Data Sets,

P. Drineas and M. W. Mahoney,

Proc. of the 32nd Annual VLDB, 1269 (2006) (ps, pdf).

Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Proc. of the 14th Annual ESA, 304-314 (2006) (ps, pdf).

Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Proc. of the 10th Annual RANDOM, 316-326 (2006) (ps, pdf).

Tensor-CUR Decompositions For Tensor-Based Data,

M. W. Mahoney, M. Maggioni, and P. Drineas,

Proc. of the 12th Annual SIGKDD, 327-336 (2006) (ps, pdf),

SIAM J. Matrix Analysis and Applications, 30, 957-987 (2008) (ps, pdf).

Polynomial Time Algorithm for Column-Row-Based Relative-Error Low-Rank Matrix Approximation,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Technical Report, DIMACS TR 2006-04 March 2006 (ps, pdf).

Sampling Algorithms for L2 Regression and Applications,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Proc. of the 17th Annual SODA, 1127-1136 (2006) (ps, pdf).

2005

A Randomized Algorithm for a Tensor-Based Generalization of the Singular Value Decomposition,

P. Drineas and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1327, June 2005 (ps, pdf),

Linear Algebra and its Applications, 420, 553-571 (2007) (ps, pdf).

On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,

P. Drineas and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1319, April 2005 (ps, pdf),

Proc. of the 18th Annual COLT, 323-337 (2005) (ps, pdf),

J. Machine Learning Research, 6, 2153-2175 (2005) (ps, pdf).

2004

Sampling Sub-problems of Heterogeneous Max-Cut Problems and Approximation Algorithms,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1283, April 2004 (ps, pdf),

Proc. of the 22nd Annual STACS, 57-68 (2005) (ps, pdf),

Random Structures and Algorithms, 32:3, 307-333 (2008) (ps, pdf).

Fast Monte Carlo Algorithms for Matrices III: Computing an Efficient Approximate Decomposition of a Matrix,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1271, February 2004 (ps, pdf),

SIAM J. Computing, 36, 184-206 (2006) (ps, pdf).

Fast Monte Carlo Algorithms for Matrices II: Computing Low-Rank Approximations to a Matrix,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1270, February 2004 (ps, pdf),

SIAM J. Computing, 36, 158-183 (2006) (ps, pdf).

Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1269, February 2004 (ps, pdf),

SIAM J. Computing, 36, 132-157 (2006) (ps, pdf).

2003

Rapid Mixing of Several Markov Chains for a Hard-Core Model,

R. Kannan, M. W. Mahoney, and R. Montenegro,

Proc. of the 14th Annual ISAAC, 663-675 (2003) (pdf).

2001

Quantum, Intramolecular Flexibility, and Polarizability Effects on the Reproduction of the Density Anomaly of Liquid Water by Simple Potential Functions,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 115, 10758-10768 (2001) (pdf).

Rapid Estimation of Electronic Degrees of Freedom in Monte Carlo Calculations for Polarizable Models of Liquid Water,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 114, 9337-9349 (2001) (pdf).

Diffusion Constant of the TIP5P Model of Liquid Water,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 114, 363-366 (2001) (pdf).

2000

A Five-Site Model for Liquid Water and the Reproduction of the Density Anomaly by Rigid, Nonpolarizable Potential Functions,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 112, 8910-8922 (2000) (pdf).

1997

Repression and Activation of Promoter-Bound RNA Polymerase Activity by Gal Repressor,

H. E. Choy, R. R. Hanger, T. Aki, M. Mahoney, K. Murakami, A. Ishihama, and S. Adhya,

J. Mol. Biol. 272: 293-300, 1997 (pdf).

Discrete Representations of the Protein C-alpha Chain,

X. F. de la Cruz, M. W. Mahoney, and B. K. Lee,

Fold. & Des. 2: 223-234, 1997 (pdf).