Michael Mahoney - Publications

(dblp, GoogleScholar)

2024

  • LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data,
  • H. Zhang, C. Arvin, D. Efimov, M. W. Mahoney, D. Perrault-Joncas, S. Ramasubramanian, A. G. Wilson, and M. Wolff,
    Technical Report, Preprint: arXiv:2412.02525 (2024) (arXiv),
    Proc. of the NeurIPS 2024 Workshop on Time Series in the Age of Large Models (TSALM) (pdf).
  • Hard Constraint Guided Flow Matching for Gradient-Free Generation of PDE Solutions,
  • C. Cheng, B. Han, D. C. Maddix, A. F. Ansari, A. Stuart, M. W. Mahoney, and Y. Wang,
    Technical Report, Preprint: arXiv:2412.01786 (2024) (arXiv),
  • Visualizing Loss Functions as Topological Landscape Profiles,
  • C. Geniesse, J. Chen, T. Xie, G. Shi, Y. Yang, D. Morozov, T. Perciano, M. W. Mahoney, R. Maciejewski, and G. H. Weber,
    Technical Report, Preprint: arXiv:2411.12136 (2024) (arXiv),
    Accepted for publication, Proc. of the NeurIPS 2024 Workshop on Symmetry and Geometry in Neural Representations (NeurReps).
  • Evaluating Loss Landscapes from a Topology Perspective,
  • T. Xie, C. Geniesse, J. Chen, Y. Yang, D. Morozov, M. W. Mahoney, R. Maciejewski, and G. H. Weber,
    Technical Report, Preprint: arXiv:2411.09807 (2024) (arXiv),
    Proc. of the NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning (SciForDL) (pdf).
  • Squeezed Attention: Accelerating Long Context Length LLM Inference,
  • C. Hooper, S. Kim, H. Mohammadzadeh, M. Maheswaran, J. Paik, M. W. Mahoney, K. Keutzer, and A. Gholami,
    Technical Report, Preprint: arXiv:2411.09688 (2024) (arXiv),
  • SPADE: Split Peak Attention DEcomposition,
  • M. Wolff, K. G. Olivares, B. Oreshkin, S. Ruan, S. Yang, A. Katoch, S. Ramasubramanian, Y. Zhang, M. W. Mahoney, D. Efimov, and V. Quenneville-Belair,
    Technical Report, Preprint: arXiv:2411.05852 (2024) (arXiv),
    Proc. of the NeurIPS 2024 Workshop on Time Series in the Age of Large Models (TSALM) (pdf).
  • How many classifiers do we need?,
  • H. Kim, L. Hodgkinson, R. Theisen, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2411.00328 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 NeurIPS Conference.
  • AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models,
  • H. Lu, Y. Zhou, S. Liu, Z. Wang, M. W. Mahoney, and Y. Yang,
    Technical Report, Preprint: arXiv:2410.10912 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 NeurIPS Conference.
  • Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting,
  • S. H. Lim, Y. Wang, A. Yu, E. Hart, M. W. Mahoney, X. S. Li, and N. B. Erichson,
    Technical Report, Preprint: arXiv:2410.03229 (2024) (arXiv),
  • Mitigating Memorization In Language Models,
  • M. Sakarvadia, A. Ajith, A. Khan, N. Hudson, C. Geniesse, K. Chard, Y. Yang, I. Foster, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2410.02159 (2024) (arXiv),
  • Tuning Frequency Bias of State Space Models,
  • A. Yu, D. Lyu, S. H. Lim, M. W. Mahoney, and N. B. Erichson,
    Technical Report, Preprint: arXiv:2410.02035 (2024) (arXiv),
  • Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models,
  • Y. Fang, S. Na, M. W. Mahoney, and M. Kolar,
    Technical Report, Preprint: arXiv:2409.15734 (2024) (arXiv),
  • Consensus Planning with Primal, Dual, and Proximal Agents,
  • A. Maggiar, L. Dicker, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2408.16462 (2024) (arXiv),
  • Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling,
  • P. Ren, R. Nakata, M. Lacour, I. Naiman, N. Nakata, J. Song, Z. Bi, O. A. Malik, D. Morozov, O. Azencot, N. B. Erichson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2407.15089 (2024) (arXiv),
  • Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics,
  • M. Karlbauer, D. C. Maddix, A. F. Ansari, B. Han, G. Gupta, Y. Wang, A. Stuart, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2407.14129 (2024) (arXiv),
    Proc. ICLR 2024 Workshop on AI4Differential Equations In Science, at the 2024 ICLR Conference (pdf).
  • Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance,
  • H. Lu, X. Liu, Y. Zhou, Q. Li, K. Keutzer, M. W. Mahoney, Y. Yan, H. Yang, and Y. Yang,
    Technical Report, Preprint: arXiv:2407.12996 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 NeurIPS Conference.
  • Reliable edge machine learning hardware for scientific applications,
  • T. Baldi, J. Campos, B. Hawks, J. Ngadiuba, N. Tran, D. Diaz, J. Duarte, R. Kastner, A. Meza, M. Quinnan, O. Weng, C. Geniesse, A. Gholami, M. W. Mahoney, V. Loncar, P. Harris, J. Agar, and S. Qin,
    Technical Report, Preprint: arXiv:2406.19522 (2024) (arXiv),
    Proc. of the 2024 IEEE 42nd VLSI Test Symposium (VTS) 1-5 (2024) (pdf).
  • Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning,
  • M. Derezinski and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2406.11151 (2024) (arXiv),
    Accepted for publication, Proc. of the 30th Annual SIGKDD, 0000–0000 (2024).
  • Towards Scalable and Versatile Weight Space Learning,
  • K. Schurholt, M. W. Mahoney, and D. Borth,
    Technical Report, Preprint: arXiv:2406.09997 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).
  • WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning,
  • D. Lyu, R. Nakata, P. Ren, M. W. Mahoney, A. Pitarka, N. Nakata, and N. B. Erichson,
    Technical Report, Preprint: arXiv:2405.20516 (2024) (arXiv),
  • There is HOPE to Avoid HiPPOs for Long-memory State Space Models,
  • A. Yu, M. W. Mahoney, and N. B. Erichson,
    Technical Report, Preprint: arXiv:2405.13975 (2024) (arXiv),
  • LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement,
  • N. Lee, T. Wattanawong, S. Kim, K. Mangalam, S. Shen, G. Anumanchipalli, M. W. Mahoney, K. Keutzer, and A. Gholami,
    Technical Report, Preprint: arXiv:2403.15042 (2024) (arXiv),
    Accepted for publication, Proc. of the 62nd ACL Meeting 000-000 (2024).
  • AI and Memory Wall,
  • A. Gholami, Z. Yao, S. Kim, C. Hooper, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2403.14123 (2024) (arXiv),
    RiseLab Medium Post 1, 6 (2021) (blog),
    IEEE Micro 44:33-39 (2024).
  • Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs,
  • S. C. Mouli, D. C. Maddix, S. Alizadeh, G. Gupta, A. Stuart, M. W. Mahoney, and Y. Wang,
    Technical Report, Preprint: arXiv:2403.10642 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).
  • Chronos: Learning the Language of Time Series,
  • A. F. Ansari, L. Stella, C. Turkmen, X. Zhang, P. Mercado, H. Shen, O. Shchur, S. S. Rangapuram, S. P. Arango, S. Kapoor, J. Zschiegner, D. C. Maddix, M. W. Mahoney, K. Torkkola, A. G. Wilson, M. Bohlke-Schneider, and Y. Wang,
    Technical Report, Preprint: arXiv:2403.07815 (2024) (arXiv),
    Transactions on Machine Learning Research (10/2024) (pdf).
  • Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning,
  • W. Chen, J. Song, P. Ren, S. Subramanian, D. Morozov, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2402.15734 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 NeurIPS Conference.
  • KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization,
  • C. Hooper, S. Kim, H. Mohammadzadeh, M. W. Mahoney, Y. S. Shao, K. Keutzer, and A. Gholami,
    Technical Report, Preprint: arXiv:2401.18079 (2024) (arXiv),
    Accepted for publication, Proc. of the 2024 NeurIPS Conference.
  • SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data,
  • A. Eshragh, L. Yerbury, A. Nazari, F. Roosta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2401.00122 (2024) (arXiv),

2023

  • Multi-scale Local Network Structure Critically Impacts Epidemic Spread and Interventions,
  • O. Eldaghar, M. W. Mahoney, and D. F. Gleich,
    Technical Report, Preprint: arXiv:2312.17351 (2023) (arXiv),
  • An LLM Compiler for Parallel Function Calling,
  • S. Kim, S. Moon, R. Tabrizi, N. Lee, M. W. Mahoney, K. Keutzer, and A. Gholami,
    Technical Report, Preprint: arXiv:2312.04511 (2023) (arXiv),
    Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).
  • Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training,
  • Y. Zhou, T. Pang, K. Liu, C. H. Martin, M. W. Mahoney, and Y. Yang,
    Technical Report, Preprint: arXiv:2312.00359 (2023) (arXiv),
    Proc. of the 2023 NeurIPS Conference.
  • Rapid Fitting of Band-Excitation Piezoresponse Force Microscopy Using Physics Constrained Unsupervised Neural Networks,
  • A. T. Kaliyev, R. F. Forelli, S. Qin, Y. Guo, S. Memik, M. W. Mahoney, A. Gholami, N. Tran, P. Harris, M. Takac, and J. Agar,
    Proc. of the AI4Mat Workshop at NeurIPS 2023 (pdf).
  • Does In-Context Operator Learning Generalize to Domain-Shifted Settings?,
  • J. W. Liu, N. B. Erichson, K. Bhatia, M. W. Mahoney, and C. Re,
    Proc. of the DLDE Workshop at NeurIPS 2023 (pdf).
  • DMLR: Data-centric Machine Learning Research -- Past, Present and Future,
  • L. Oala, M. Maskey, L. Bat-Leah, A. Parrish, N. M. Gurel, T.-S. Kuo, Y. Liu, R. Dror, D. Brajovic, X. Yao, M. Bartolo, W. A. G. Rojas, R. Hileman, R. Aliment, M. W. Mahoney, M. Risdal, M. Lease, W. Samek, D. Dutta, C. G. Northcutt, C. Coleman, B. Hancock, B. Koch, G. A. Tadesse, B. Karlas, A. Alaa, A. B. Dieng, N. Noy, V. J. Reddi, J. Zou, P. Paritosh, M. van der Schaar, K. Bollacker, L. Aroyo, C. Zhang, J. Vanschoren, I. Guyon, and P. Mattson,
    Technical Report, Preprint: arXiv:2311.13028 (2023) (arXiv),
    Journal of Data-centric Machine Learning Research (2024).
  • CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT),
  • M. Melnichenko, O. Balabanov, R. Murray, J. Demmel, M. W. Mahoney, and P. Luszczek,
    Technical Report, Preprint: arXiv:2311.08316 (2023) (arXiv),
  • A PAC-Bayesian Perspective on the Interpolating Information Criterion,
  • L. Hodgkinson, C. van der Heide, R. Salomone, F. Roosta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2311.07013 (2023) (arXiv),
    Proc. of the Mathematics of Modern Machine Learning (M3L) Workshop at NeurIPS 2023.
  • Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels,
  • D. Long, W. W. Xing, A. S. Krishnapriyan, R. M. Kirby, S. Zhe, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2310.05387 (2023) (arXiv),
    Proc. of the 27th International Conference on AISTATS, PMLR 238:2413-2421 (2024).
  • Extensions to the SENSEI In situ Framework for Heterogeneous Architectures,
  • B. Loring, E. W. Bethel, G. H. Weber, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2310.02926 (2023) (arXiv),
    Proceedings of the SC23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 868–874 (2023) (pdf).
  • Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs,
  • I. Naiman, N. B. Erichson, P. Ren, M. W. Mahoney, and O. Azencot,
    Technical Report, Preprint: arXiv:2310.02619 (2023) (arXiv),
    Proc. of the 2024 ICLR Conference (pdf).
  • Robustifying State-space Models for Long Sequences via Approximate Diagonalization,
  • A. Yu, A. Nigmetov, D. Morozov, M. W. Mahoney, and N. B. Erichson,
    Technical Report, Preprint: arXiv:2310.01698 (2023) (arXiv),
    Proc. of the 2024 ICLR Conference (pdf).
  • Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems,
  • Y. Cho, J. W. Demmel, M. Derezinski, H. Li, H. Luo, M. W. Mahoney, and R. J. Murray,
    Technical Report, Preprint: arXiv:2308.15720 (2023) (arXiv),
  • Probabilistic Forecasting with Coherent Aggregation,
  • K. G. Olivares, G. Negiar, R. Ma, O. N. Meetei, M. Cao, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2307.09797 (2023) (arXiv),
    Transactions on Machine Learning Research.
  • The Interpolating Information Criterion for Overparameterized Models,
  • L. Hodgkinson, C. van der Heide, R. Salomone, F. Roosta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2307.07785 (2023) (arXiv),
  • GEANN: Scalable Graph Augmentations for Multi-Horizon Time Series Forecasting,
  • S. Yang, M. Wolff, S. Ramasubramanian, V. Quenneville-Belair, R. Metha, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2307.03595 (2023) (arXiv).
  • SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning,
  • P. Ren, N. B. Erichson, S. Subramanian, O. San, Z. Lukic, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2306.14070 (2023) (arXiv),
  • A Heavy-Tailed Algebra for Probabilistic Programming,
  • F. Liang, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2306.09262 (2023) (arXiv),
    Accepted for publication, Proc. of the 2023 NeurIPS Conference.
  • SqueezeLLM: Dense-and-Sparse Quantization,
  • S. Kim, C. Hooper, A. Gholami, Z. Dong, X. Li, S. Shen, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2306.07629 (2023) (arXiv),
    Accepted for publication, Proc. of the 2024 ICML Conference 000:000-000 (2024).
  • Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior,
  • S. Subramanian, P. Harrington, K. Keutzer, W. Bhimji, D. Morozov, M. W. Mahoney, and A. Gholami,
    Technical Report, Preprint: arXiv:2306.00258 (2023) (arXiv),
    Accepted for publication, Proc. of the 2023 NeurIPS Conference.
  • A Three-regime Model of Network Pruning,
  • Y. Zhou, Y. Yang, A. Chang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2305.18383 (2023) (arXiv),
    Proc. of the 2023 ICML Conference 202:42790-42809 (2023) (pdf).
  • Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching,
  • I. Hong, S. Na, M. W. Mahoney, and M. Kolar,
    Technical Report, Preprint: arXiv:2305.18379 (2023) (arXiv),
    Proc. of the 2023 ICML Conference 202:13174-13198 (2023) (pdf).
  • When are ensembles really effective?,
  • R. Theisen, H. Kim, Y. Yang, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2305.12313 (2023) (arXiv),
    Accepted for publication, Proc. of the 2023 NeurIPS Conference.
  • End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs,
  • J. Campos, Z. Dong, J. Duarte, A. Gholami, M. W. Mahoney, J. Mitrevski, and N. Tran,
    Technical Report, Preprint: arXiv:2304.06745 (2023) (arXiv),
    ACM Transactions on Reconfigurable Technology and Systems, 17, 3, Article 36 (2024) (pdf).
  • Full Stack Optimization of Transformer Inference: a Survey,
  • S. Kim, C. Hooper, T. Wattanawong, M. Kang, R. Yan, H. Genc, G. Dinh, Q. Huang, K. Keutzer, M. W. Mahoney, Y. S. Shao, and A. Gholami,
    Technical Report, Preprint: arXiv:2302.14017 (2023) (arXiv),
    Proc. of the ASSYST at ISCA 2023 / MLArchSys 2023 Workshop (pdf).
  • Learning Physical Models that Can Respect Conservation Laws,
  • D. Hansen, D. C. Maddix, S. Alizadeh, G. Gupta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2302.11002 (2023) (arXiv),
    Proc. of the 2023 ICML Conference PMLR 202:12469-12510 (2023) (pdf),
    Physica D: Nonlinear Phenomena, 457: 133952 (2024) (pdf).
  • Speculative Decoding with Big Little Decoder,
  • S. Kim, K. Mangalam, S. Moon, J. Canny, J. Malik, M. W. Mahoney, A. Gholami, and K. Keutzer,
    Technical Report, Preprint: arXiv:2302.07863 (2023) (arXiv),
    Proc. of the 2023 NeurIPS Conference 1705: 39236-39256 (2023).

2022

  • Gated Recurrent Neural Networks with Weighted Time-Delay Feedback,
  • N. B. Erichson, S. H. Lim, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2212.00228 (2022) (arXiv),
  • Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems,
  • Y. Fang, S. Na, M. W. Mahoney, and M. Kolar,
    Technical Report, Preprint: arXiv:2211.15943 (2022) (arXiv),
    SIAM J. Optimization, 34(2), 1187-2037 (2024) (pdf).
  • Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software,
  • R. Murray, J. Demmel, M. W. Mahoney, N. B. Erichson, M. Melnichenko, O. A. Malik, L. Grigori, P. Luszczek, M. Dereziński, M. E. Lopes, T. Liang, H. Luo, and J. Dongarra,
    LAWNs (LAPACK Working Notes), UCB/EECS-2022-235 (2022) (pdf),
    Technical Report, Preprint: arXiv:2302.11474 (2023) (arXiv),
  • Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes,
  • L. Hodgkinson, C. van der Heide, F. Roosta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2210.07612 (2022) (arXiv),
    Proc. of the 2023 ICML Conference PMLR 202:13085-13117 (2023) (pdf).
  • Gradient Gating for Deep Multi-Rate Learning on Graphs,
  • T. K. Rusch, B. P. Chamberlain, M. W. Mahoney, M. M. Bronstein, and S. Mishra,
    Technical Report, Preprint: arXiv:2210.00513 (2022) (arXiv),
    Proc. of the 2023 ICLR Conference (pdf).
  • Learning differentiable solvers for systems with hard constraints,
  • G. Negiar, M. W. Mahoney, and A. S. Krishnapriyan,
    Technical Report, Preprint: arXiv:2207.08675 (2022) (arXiv),
    Proc. of the 2023 ICLR Conference (pdf).
  • Adaptive Self-supervision Algorithms for Physics-informed Neural Networks,
  • S. Subramanian, R. M. Kirby, M. W. Mahoney, and A. Gholami,
    Technical Report, Preprint: arXiv:2207.04084 (2022) (arXiv),
    Proc. of the ECAI-23 Conference 2234-2241 (2023) (pdf).
  • GACT: Activation Compressed Training for General Architectures,
  • X. Liu, L. Zheng, D. Wang, Y. Cen, W. Chen, X. Han, J. Chen, Z. Liu, J. Tang, J. Gonzalez, M. W. Mahoney, and A. Cheung,
    Technical Report, Preprint: arXiv:2206.11357 (2022) (arXiv),
    Proc. of the 2022 ICML Conference 162:14139-14152 (2022) (pdf).
  • Neurotoxin: Durable Backdoors in Federated Learning,
  • Z. Zhang, A. Panda, L. Song, Y. Yang, M. W. Mahoney, J. E. Gonzalez, K. Ramchandran, and P. Mittal,
    Technical Report, Preprint: arXiv:2206.10341 (2022) (arXiv),
    Proc. of the 2022 ICML Conference 162:26429-26446 (2022) (pdf).
  • Squeezeformer: An Efficient Transformer for Automatic Speech Recognition,
  • S. Kim, A. Gholami, A. Shaw, N. Lee, K. Mangalam, J. Malik, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2206.00888 (2022) (arXiv),
    Proc. of the 2022 NeurIPS Conference 35:9361-9373 (2022) (pdf, supp).
  • Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming,
  • S. Na and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2205.13687 (2022) (arXiv),
  • Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows,
  • F. Liang, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2205.07918 (2022) (arXiv),
    Proc. of the 2022 ICML Conference 162:13257-13270 (2022) (pdf).
  • The Sky Above The Clouds,
  • S. Chasins, A. Cheung, N. Crooks, A. Ghodsi, K. Goldberg, J. E. Gonzalez, J. M. Hellerstein, M. I. Jordan, A. D. Joseph, M. W. Mahoney, A. Parameswaran, D. Patterson, R. Ada Popa, K. Sen, S. Shenker, D. Song, and I. Stoica,
    Technical Report, Preprint: arXiv:2205.07147 (2022) (arXiv).
  • A Fast Post-Training Pruning Framework for Transformers,
  • W. Kwon, S. Kim, M. W. Mahoney, J. Hassoun, K. Keutzer, and A. Gholami,
    Technical Report, Preprint: arXiv:2204.09656 (2022) (arXiv),
    Proc. of the 2022 NeurIPS Conference 35:24101-24116 (2022) (pdf, supp).
  • Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence,
  • S. Na, M. Derezinski, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2204.09266 (2022) (arXiv),
    Mathematical Programming, 201:473–520 (2023) (pdf).
  • Fast Feature Selection with Fairness Constraints,
  • F. Quinzan, R. Khanna, M. Hershcovitch, S. Cohen, D. G. Waddington, T. Friedrich, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2202.13718 (2022) (arXiv),
    Proc. of the 26th International Conference on AISTATS, PMLR 7800-7823 (2023) (pdf),
    Proc. Second WFVML Workshop, at the 2023 ICML Conference (pdf).
  • AutoIP: A United Framework to Integrate Physics into Gaussian Processes,
  • D. Long, Z. Wang, A. Krishnapriyan, R. Kirby, S. Zhe, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2202.12316 (2022) (arXiv),
    Proc. of the 2022 ICML Conference, 162:14210-14222 (2022) (pdf).
  • Learning continuous models for continuous physics,
  • A. S. Krishnapriyan, A. F. Queiruga, N. B. Erichson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2202.08494 (2022) (arXiv),
    Communications Physics, 6, 319 (2023) (pdf).
  • Test Accuracy vs. Generalization Gap: Model Selection in NLP without Accessing Training or Testing Data (Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data),
  • Y. Yang, R. Theisen, L. Hodgkinson, J. E. Gonzalez, K. Ramchandran, C. H. Martin, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2202.02842 (2022) (arXiv),
    Proc. of the 29th Annual SIGKDD, 3011–3021 (2023) (pdf).
  • Boosting Model Robustness to Common Corruptions with Noisy Data Augmentations (NoisyMix: Boosting Model Robustness to Common Corruptions),
  • N. B. Erichson, S. H. Lim, F. Utrera, W. Xu, Z. Cao, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2202.01263 (2022) (arXiv),
    Proc. of the 27th International Conference on AISTATS, PMLR 238:4033-4041 (2024).

2021

  • Learning from learning machines: a new generation of AI technology to meet the needs of science,
  • L. Pion-Tonachini, K. Bouchard, H. G. Martin, S. Peisert, W. B. Holtz, A. Aswani, D. Dwivedi, H. Wainwright, G. Pilania, B. Nachman, B. L. Marrone, N. Falco, Prabhat, D. Arnold, A. Wolf-Yadlin, S. Powers, S. Climer, Q. Jackson, T. Carlson, M. Sohn, P. Zwart, N. Kumar, A. Justice, C. Tomlin, D. Jacobson, G. Micklem, G. V. Gkoutos, P. J. Bickel, J.-B. Cazier, J. Muller, B.-J. Webb-Robertson, R. Stevens, M. Anderson, K. Kreutz-Delgado, M. W. Mahoney, and J. B. Brown,
    Technical Report, Preprint: arXiv:2111.13786 (2021) (arXiv).
  • Long Expressive Memory for Sequence Modeling,
  • T. K. Rusch, S. Mishra, N. B. Erichson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2110.04744 (2021) (arXiv),
    Proc. of the 2022 ICLR Conference (pdf).
  • Noisy Feature Mixup,
  • S. H. Lim, N. B. Erichson, F. Utrera, W. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2110.02180 (2021) (arXiv),
    Proc. of the 2022 ICLR Conference (pdf).
  • Inexact Newton-CG Algorithms With Complexity Guarantees,
  • Z. Yao, P. Xu, F. Roosta, S. J. Wright, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2109.14016 (2021) (arXiv),
    IMA Journal of Numerical Analysis, 43(3): 1855–1897 (2023).
  • Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information,
  • M. Jahani, S. Rusakov, Z. Shi, P. Richtarik, M. W. Mahoney, and M. Takac,
    Technical Report, Preprint: arXiv:2109.05198 (2021) (arXiv),
    Proc. of the 2022 ICLR Conference (pdf).
  • What's Hidden in a One-layer Randomly Weighted Transformer?,
  • S. Shen, Z. Yao, D. Kiela, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2109.03939 (2021) (arXiv),
    Proc. of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pdf).
  • Characterizing possible failure modes in physics-informed neural networks,
  • A. S. Krishnapriyan, A. Gholami, S. Zhe, R. M. Kirby, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2109.01050 (2021) (arXiv),
    Proc. of the 2021 NeurIPS Conference, 34:26548-26560 (2021) (pdf, supp).
  • Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers,
  • L. Hodgkinson, U. Simsekli, R. Khanna, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2108.00781 (2021) (arXiv),
    Proc. of the 2022 ICML Conference (pdf).
  • Taxonomizing local versus global structure in neural network loss landscapes,
  • Y. Yang, L. Hodgkinson, R. Theisen, J. Zou, J. E. Gonzalez, K. Ramchandran, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2107.11228 (2021) (arXiv),
    Proc. of the 2021 NeurIPS Conference, 34:18722-18733 (2021) (pdf).
  • Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update,
  • M. Derezinski, J. Lacotte, M. Pilanci, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2107.07480 (2021) (arXiv),
    Proc. of the 2021 NeurIPS Conference, 34:2835-2847 (2021) (pdf, supp).
  • Stateful ODE-Nets using Basis Function Expansions,
  • A. Queiruga, N. B. Erichson, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2106.10820 (2021) (arXiv),
    Proc. of the 2021 NeurIPS Conference, 34:21770-21781 (2021) (pdf, supp).
  • Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics,
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2106.00734 (2021) (arXiv).
  • LEAP: Learnable Pruning for Transformer-based Models,
  • Z. Yao, X. Wu, L. Ma, S. Shen, K. Keutzer, M. W. Mahoney, and Y. He,
    Technical Report, Preprint: arXiv:2105.14636 (2021) (arXiv).
  • LocalNewton: Reducing Communication Bottleneck for Distributed Learning,
  • V. Gupta, A. Ghosh, M. Derezinski, R. Khanna, K. Ramchandran, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2105.07320 (2021) (arXiv),
    Proc. of the 37th UAI Conference, 632-642 (2021) (pdf, pdf).
  • ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training,
  • J. Chen, L. Zheng, Z. Yao, D. Wang, I. Stoica, M. W. Mahoney, and J. E. Gonzalez,
    Technical Report, Preprint: arXiv:2104.14129 (2021) (arXiv),
    Proc. of the 38th ICML Conference PMLR 139:1803-1813 (2021) (pdf, supp).
  • Integer-only Zero-shot Quantization for Efficient Speech Recognition,
  • S. Kim, A. Gholami, Z. Yao, N. Lee, P. Wang, A. Nrusimha, B. Zhai, T. Gao, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2103.16827 (2021) (arXiv),
    Proc. of the ICASSP 2022 Conference, 4288-4292 (2022) (pdf).
  • A Survey of Quantization Methods for Efficient Neural Network Inference,
  • A. Gholami, S. Kim, Z. Dong, Z. Yao, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2103.13630 (2021) (arXiv),
    Chapter in Low-Power Computer Vision: Improve the Efficiency of Artificial Intelligence, pp. 291-326 (2021).
  • Hessian Eigenspectra of More Realistic Nonlinear Models,
  • Z. Liao and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2103.01519 (2021) (arXiv),
    Proc. of the 2021 NeurIPS Conference, 34:20104-20117 (2021) (pdf).
  • A Differential Geometry Perspective on Orthogonal Recurrent Models,
  • O. Azencot, N. B. Erichson, M. Ben-Chen, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2102.09589 (2021) (arXiv).
  • Noisy Recurrent Neural Networks,
  • S. H. Lim, N. B. Erichson, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2102.04877 (2021) (arXiv),
    Proc. of the 2021 NeurIPS Conference, 34:5124-5137 (2021) (pdf, supp).
  • Hessian-Aware Pruning and Optimal Neural Implant,
  • S. Yu, Z. Yao, A. Gholami, Z. Dong, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2101.08940 (2021) (arXiv),
    Proc. of the 2022 WACV Conference, 3880-3891 (2022) (pdf, supp).
  • I-BERT: Integer-only BERT Quantization,
  • S. Kim, A. Gholami, Z. Yao, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2101.01321 (2021) (arXiv),
    Proc. of the 38th ICML Conference PMLR 139:5506-5518 (2021) (pdf, supp).

2020

  • Sparse sketches with small inversion bias,
  • M. Derezinski, Z. Liao, E. Dobriban, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2011.10695 (2020) (arXiv),
    Proc. of the 2021 COLT, 134:1467-1510 (2021) (pdf).
  • HAWQV3: Dyadic Neural Network Quantization,
  • Z. Yao, Z. Dong, Z. Zheng, A. Gholami, J. Yu, E. Tan, L. Wang, Q. Huang, Y. Wang, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2011.10680 (2020) (arXiv),
    Proc. of the 38th ICML Conference PMLR 139:11875-11886 (2021) (pdf, supp).
  • A Statistical Framework for Low-bitwidth Training of Deep Neural Networks,
  • J. Chen, Y. Gai, Z. Yao, M. W. Mahoney, and J. E. Gonzalez,
    Technical Report, Preprint: arXiv:2010.14298 (2020) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 883-894 (2020) (pdf).
  • Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism,
  • V. Gupta, D. Choudhary, P. Tak Peter Tang, X. Wei, X. Wang, Y. Huang, A. Kejariwal, K. Ramchandran, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2010.08899 (2020) (arXiv),
    Proc. of the 27th Annual SIGKDD, 2928-2936 (2021) (pdf).
  • MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding,
  • Q. Wang, H. Tan, S. Shen, M. W. Mahoney, and Z. Yao,
    Technical Report, Preprint: arXiv:2010.05379 (2020) (arXiv),
    Proc. of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2030–2038 (2020) (pdf).
  • Sparse Quantized Spectral Clustering,
  • Z. Liao, R. Couillet, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2010.01376 (2020) (arXiv),
    Proc. of the 2021 ICLR Conference (pdf).
  • Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models,
  • Z. Zhang, Z. Yao, Y. Yang, Y. Yan, J. E. Gonzalez, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2008.11364 (2020) (arXiv),
    Proc. 2021 IEEE BigData, 1214-1225 (2021) (pdf).
  • Continuous-in-Depth Neural Networks,
  • A. F. Queiruga, N. B. Erichson, D. Taylor, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2008.02389 (2020) (arXiv).
  • Noise-Response Analysis of Deep Neural Networks Quantifies Robustness and Fingerprints Structural Malware,
  • N. B. Erichson, D. Taylor, Q. Wu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2008.00123 (2020) (arXiv),
    Proc. 2021 SDM Conference, 100-108 (2021) (pdf).
  • Adversarially-Trained Deep Nets Transfer Better,
  • F. Utrera, E. Kravitz, N. B. Erichson, R. Khanna, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2007.05869 (2020) (arXiv),
    Proc. of the 2021 ICLR Conference (pdf).
  • Boundary thickness and robustness in learning models,
  • Y. Yang, R. Khanna, Y. Yu, A. Gholami, K. Keutzer, J. E. Gonzalez, K. Ramchandran, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2007.05086 (2020) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 6223-6234 (2020) (pdf).
  • Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),
  • J. Demmel, J. Dongarra, J. Langou, J. Langou, P. Luszczek, and M. W. Mahoney,
    LAWNs (LAPACK Working Notes), ICL-UT-20-07 (2020) (pdf).
  • Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization,
  • M. Derezinski, B. Bartan, M. Pilanci, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2007.01327 (2020) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 6684-6695 (2020) (pdf).
  • Good classifiers are abundant in the interpolating regime,
  • R. Theisen, J. M. Klusowski, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2006.12625 (2020) (arXiv),
    Proc. of the 24th International Conference on AISTATS, PMLR 130:3376-3384 (2021) (pdf).
  • Lipschitz Recurrent Neural Networks,
  • N. B. Erichson, O. Azencot, A. Queiruga, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2006.12070 (2020) (arXiv),
    Proc. of the 2021 ICLR Conference (pdf).
  • Precise expressions for random projections: Low-rank approximation and randomized Newton,
  • M. Derezinski, F. Liang, Z. Liao, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2006.10653 (2020) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 18272-18283 (2020) (pdf).
  • Multiplicative noise and heavy tails in stochastic optimization,
  • L. Hodgkinson and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2006.06293 (2020) (arXiv),
    Proc. of the 38th ICML Conference PMLR 139:4262-4274 (2021) (pdf, supp).
  • A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent,
  • Z. Liao, R. Couillet, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2006.05013 (2020) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 13939-13950 (2020) (pdf),
    Journal of Statistical Mechanics, Theory and Experiment 124006 (2021) (pdf).
  • ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning,
  • Z. Yao, A. Gholami, S. Shen, M. Mustafa, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2006.00719 (2020) (arXiv), (code),
    Proc. of the AAAI-21 Conference, 10665-10673 (2021) (pdf).
  • Determinantal Point Processes in Randomized Numerical Linear Algebra,
  • M. Derezinski and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2005.03185 (2020) (arXiv),
    Notices of the AMS, 68 (1) 34-45 (2021) (pdf).
  • Flow-based Algorithms for Improving Clusters: A Unifying Framework, Software, and Performance,
  • K. Fountoulakis, M. Liu, D. F. Gleich, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2004.09608 (2020) (arXiv),
    SIAM Review 65(1): 59-143 (2023) (pdf).
  • PowerNorm: Rethinking Batch Normalization in Transformers,
  • S. Shen, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2003.07845 (2020) (arXiv),
    Proc. of the 37th ICML Conference 4566-4576 (2020) (pdf).
  • Error Estimation for Sketched SVD via the Bootstrap,
  • M. E. Lopes, N. B. Erichson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2003.04937 (2020) (arXiv),
    Proc. of the 37th ICML Conference 5435-5445 (2020) (pdf).
  • Forecasting Sequential Data using Consistent Koopman Autoencoders,
  • O. Azencot, N. B. Erichson, V. Lin, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2003.02236 (2020) (arXiv),
    Proc. of the 37th ICML Conference 4493-4503 (2020) (pdf).
  • Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms,
  • P. Ma, X. Zhang, X. Xing, J. Ma, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2002.10526 (2020) (arXiv),
    Proc. of the 23rd International Conference on AISTATS, PMLR 108:1026-1035 (2020) (pdf),
    J. Machine Learning Research, 23(177): 1-45 (2022) (pdf).
  • Stochastic Continuous Normalizing Flows: Training SDEs as ODEs (Stochastic Normalizing Flows),
  • L. Hodgkinson, C. van der Heide, F. Roosta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2002.09547 (2020) (arXiv),
    Proc. of the 37th UAI Conference 161:1130-1140 (2021) (pdf).
  • Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nystrom method,
  • M. Derezinski, R. Khanna, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2002.09073 (2020) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 4953-4964 (2020) (pdf) (Awarded Best Paper Award),
    Proc. of the IJCAI-21, Sister Conferences Best Paper (SCBP) Track, 4765-4769 (2021) (pdf).
  • Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data,
  • C. H. Martin, T. S. Peng, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:2002.06716 (2020) (arXiv), (code),
    Nature Communications, 12, 4122 (2021) (pdf).
  • ZeroQ: A Novel Zero Shot Quantization Framework,
  • Y. Cai, Z. Yao, Z. Dong, A. Gholami, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:2001.00281 (2020) (arXiv), (code),
    Proc. of the 33rd CVPR Conference, 13169-13178 (2020) (pdf, supp).

2019

  • PyHessian: Neural Networks Through the Lens of the Hessian,
  • Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1912.07145 (2019) (arXiv), (code),
    Proc. 2020 IEEE BigData, 581-590 (2020) (pdf).
  • Exact expressions for double descent and implicit regularization via surrogate random design,
  • M. Derezinski, F. Liang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1912.04533 (2019) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 5152-5164 (2020) (pdf).
  • LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data,
  • A. Eshragh, F. Roosta, A. Nazari, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1911.12321 (2019) (arXiv),
    J. Machine Learning Research, 23(22): 1-36 (2022) (pdf).
  • HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks,
  • Z. Dong, Z. Yao, Y. Cai, D. Arfeen, A. Gholami, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:1911.03852 (2019) (arXiv),
    Proc. of the 2020 NeurIPS Conference, 33: 18518-18529 (2020) (pdf).
  • Running Alchemist on Cray XC and CS Series Supercomputers: Dask and PySpark Interfaces, Deployment Options, and Data Transfer Times,
  • K. Rothauge, H. Ayyalasomayajula, K. J. Maschhoff, M. Ringenburg, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1910.01354 (2019) (arXiv), (code),
    Proc. Cray User Group, CUG 2019 (2019) (pdf).
  • Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings,
  • K. Levin, F. Roosta, M. Tang, M. W. Mahoney, and C. E. Priebe,
    Technical Report, Preprint: arXiv:1910.00423 (2019) (arXiv),
    J. Machine Learning Research, 22(194): 1-59 (2021) (pdf).
  • Bootstrapping the Operator Norm in High Dimensions: Error Estimation for Covariance Matrices and Sketching,
  • M. E. Lopes, N. B. Erichson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1909.06120 (2019) (arXiv),
    Bernoulli Journal, 29(1): 428-450 (2023) (pdf).
  • Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT,
  • S. Shen, Z. Dong, J. Ye, L. Ma, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:1909.05840 (2019) (arXiv),
    Proc. of the AAAI-20 Conference, 8815-8821 (2020) (pdf).
  • The Difficulties of Addressing Interdisciplinary Challenges at the Foundations of Data Science,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1909.03033 (2019) (arXiv),
    Appeared in SIAM News, SIGACT News, etc.
  • Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks,
  • C. H. Martin and M. W. Mahoney,
    Proc. of the 25th Annual SIGKDD, 3239-3240 (2019) (pdf).
  • Geometric Rates of Convergence for Kernel-based Sampling Algorithms,
  • R. Khanna, L. Hodgkinson, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1907.08410 (2019) (arXiv),
    Proc. of the 37th UAI Conference 161:2156-2164 (2021) (pdf, supp).
  • Statistical guarantees for local graph clustering,
  • W. Ha, K. Fountoulakis, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1906.04863 (2019) (arXiv),
    Proc. of the 23rd International Conference on AISTATS, PMLR 108:2687-2697 (2020) (pdf),
    J. Machine Learning Research, 22(148): 1-54 (2021) (pdf).
  • ANODEV2: A Coupled Neural ODE Evolution Framework,
  • T. Zhang, Z. Yao, A. Gholami, K. Keutzer, J. Gonzalez, G. Biros, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1906.04596 (2019) (arXiv), (code),
    Proc. of the 2019 NeurIPS Conference, 5151-5161 (2019) (pdf).
  • Bayesian experimental design using regularized determinantal point processes,
  • M. Derezinski, F. Liang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1906.04133 (2019) (arXiv),
    Proc. of the 23rd International Conference on AISTATS, PMLR 108:3197-3207 (2020) (pdf, supp) (talk).
  • Distributed estimation of the inverse Hessian by determinantal averaging,
  • M. Derezinski and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1905.11546 (2019) (arXiv),
    Proc. of the 2019 NeurIPS Conference, 11405-11415 (2019) (pdf).
  • Residual Networks as Nonlinear Systems: Stability Analysis using Linearization,
  • K. Rothauge, Z. Yao, Z. Hu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1905.13386 (2019) (arXiv).
  • Parallel and Communication Avoiding Least Angle Regression,
  • S. Das, J. Demmel, K. Fountoulakis, L. Grigori, M. W. Mahoney, and S. Yang,
    Technical Report, Preprint: arXiv:1905.11340 (2019) (arXiv),
    SIAM J. Scientific Computing, 43(2), C154–C176 (2021) (pdf).
  • Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction,
  • N. B. Erichson, M. Muehlebach, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1905.10866 (2019) (arXiv),
    Proc. Second Workshop on Machine Learning and the Physical Sciences, at the 2019 NeurIPS Conference (pdf).
  • HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision,
  • Z. Dong, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,
    Technical Report, Preprint: arXiv:1905.03696 (2019) (arXiv),
    Proc. ICCV 2019 293-302 (2019) (pdf).
  • JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks,
  • N. B. Erichson, Z. Yao, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1904.03750 (2019) (arXiv),
    Proc. of the 9th ICPRAM Conference 103-114 (2020) (pdf).
  • OverSketched Newton: Fast Convex Optimization for Serverless Systems,
  • V. Gupta, S. Kadhe, T. Courtade, M. W. Mahoney, and K. Ramchandran,
    Technical Report, Preprint: arXiv:1903.08857 (2019) (arXiv),
    Proc. 2020 IEEE BigData, 288-297 (2020) (pdf).
  • Inefficiency of K-FAC for Large Batch Size Training,
  • L. Ma, G. Montague, J. Ye, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1903.06237 (2019) (arXiv),
    Proc. of the AAAI-20 Conference, 5053-5060 (2020) (pdf).
  • Sub-Sampled Newton Methods,
  • F. Roosta-Khorasani and M. W. Mahoney,
    Mathematical Programming, 174(1-2): 293-326 (2019) (pdf).
  • Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data,
  • N. B. Erichson, L. Mathelin, Z. Yao, S. L. Brunton, M. W. Mahoney, and J. N. Kutz,
    Technical Report, Preprint: arXiv:1902.07358 (2019) (arXiv),
    Proceedings of the Royal Society A, 476:20200097 (2020) (pdf).
  • Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression,
  • M. Derezinski, K. L. Clarkson, M. W. Mahoney, and M. K. Warmuth,
    Technical Report, Preprint: arXiv:1902.00995 (2019) (arXiv),
    Proc. of 2019 COLT, PMLR 99:1050-1069 (2019) (pdf).
  • Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks,
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1901.08278 (2019) (arXiv), (code),
    Proc. 2020 SDM Conference, 505-513 (2020) (pdf).
  • Traditional and Heavy-Tailed Self Regularization in Neural Network Models,
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1901.08276 (2019) (arXiv), (iclr19), (code),
    Proc. of the 36th ICML Conference 4284-4293 (2019) (pdf).

2018

  • Trust Region Based Adversarial Attack on Neural Networks,
  • Z. Yao, A. Gholami, P. Xu, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1812.06371 (2018) (arXiv), (code),
    Proc. of the 32nd CVPR Conference, 11350-11359 (2019) (pdf).
  • Parameter Re-Initialization through Cyclical Batch Size Schedules,
  • N. Mu, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1812.01216 (2018) (arXiv),
    Proc. Systems for Machine Learning Workshop, at the 2018 NeurIPS Conference (pdf).
  • On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent,
  • N. Golmant, N. Vemuri, Z. Yao, V. Feinberg, A. Gholami, K. Rothauge, M. W. Mahoney, and J. Gonzalez,
    Technical Report, Preprint: arXiv:1811.12941 (2018) (arXiv), (iclr19).
  • The Mathematics of Data,
  • M. W. Mahoney, J. C. Duchi, and A. C. Gilbert, Eds.
    AMS, IAS/PCMI, and SIAM (2018) (web), (intro).
  • A Short Introduction to Local Graph Clustering Methods and Software,
  • K. Fountoulakis, D. F. Gleich, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.07324 (2018) (arXiv),
    Absts. of the 7th Intl. Conference on Complex Networks and Their Applications (pdf), (code).
  • Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning,
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.01075 (2018) (arXiv), (code),
    J. Machine Learning Research, 22(165): 1-73 (2021) (pdf).
  • Large batch size training of neural networks with adversarial training and second-order information,
  • Z. Yao, A. Gholami, D. Arfeen, R. Liaw, J. Gonzalez, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.01021 (2018) (arXiv), (iclr19), (code).
  • Newton-MR: Inexact Newton Method With Minimum Residual Sub-problem Solver,
  • F. Roosta, Y. Liu, P. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.00303 (2018) (arXiv),
    EURO Journal on Computational Optimization, 10: 100035 (2022) (pdf).
  • Newton-ADMM: A Distributed GPU-Accelerated Optimizer for Multiclass Classification Problems,
  • C.-H. Fang, S. B. Kylasa, F. Roosta, M. W. Mahoney, and A. Grama,
    Technical Report, Preprint: arXiv:1807.07132 (2018) (arXiv), (code),
    Proc. SC20 Conference, 50:1-12 (2020) (pdf).
  • Alchemist: An Apache Spark <=> MPI Interface,
  • A. Gittens, K. Rothauge, M. W. Mahoney, S. Wang, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,
    Technical Report, Preprint: arXiv:1806.01270 (2018) (arXiv), (code),
    Concurrency and Computation: Practice and Experience (Special Issue of the Cray User Group, CUG 2018), e5026 (2018) (pdf).
  • Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist,
  • A. Gittens, K. Rothauge, S. Wang, M. W. Mahoney, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,
    Technical Report, Preprint: arXiv:1805.11800 (2018) (arXiv),
    Proc. of the 24th Annual SIGKDD, 293-301 (2018) (pdf).
  • Group Collaborative Representation for Image Set Classification,
  • B. Liu, L. Jing, J. Li, J. Yu, A. Gittens, and M. W. Mahoney,
    International Journal of Computer Vision, 1-26 (2018) (pdf).
  • Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap,
  • M. E. Lopes, S. Wang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1803.08021 (2018) (arXiv),
    Proc. of the 35th ICML Conference 3223-3232 (2018) (pdf).
  • GPU Accelerated Sub-Sampled Newton's Method,
  • S. B. Kylasa, F. Roosta-Khorasani, M. W. Mahoney, and A. Grama,
    Technical Report, Preprint: arXiv:1802.09113 (2018) (arXiv), (code),
    Proc. 2019 SDM Conference, 702-710 (2019) (pdf).
  • Hessian-based Analysis of Large Batch Training and Robustness to Adversaries,
  • Z. Yao, A. Gholami, Q. Lei, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1802.08241 (2018) (arXiv),
    Proc. of the 2018 NeurIPS Conference, 4954-4964 (2018) (pdf).
  • Inexact Non-Convex Newton-Type Methods,
  • Z. Yao, P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1802.06925 (2018) (arXiv),
    INFORMS Journal on Optimization 3(2):154-182 (2021) (pdf).
  • Out-of-sample extension of graph adjacency spectral embedding,
  • K. Levin, F. Roosta-Khorasani, M. W. Mahoney, and C. E. Priebe,
    Technical Report, Preprint: arXiv:1802.06307 (2018) (arXiv),
    Proc. of the 35th ICML Conference 2981-2990 (2018) (pdf).

2017

  • Lectures on Randomized Numerical Linear Algebra,
  • P. Drineas and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1712.08880 (2017) (arXiv),
    In: Lectures of the 2016 PCMI Summer School on Mathematics of Data.
  • Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization,
  • A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1712.06047 (2017) (arXiv),
    Proc. of the 2018 IPDPS Conference 409-418 (2018) (pdf).
  • Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior,
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1710.09553 (2017) (arXiv), (iclr18).
  • LASAGNE: Locality And Structure Aware Graph Node Embedding,
  • E. Faerman, F. Borutta, K. Fountoulakis, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1710.06520 (2017) (arXiv),
    Proc. 2018 International Conference on Web Intelligence, 246-253 (2018) (pdf). (Awarded Best Student Paper Award.)
  • A Berkeley View of Systems Challenges for AI,
  • I. Stoica, D. Song, R. A. Popa, D. A. Patterson, M. W. Mahoney, R. H. Katz, A. D. Joseph, M. Jordan, J. M. Hellerstein, J. Gonzalez, K. Goldberg, A. Ghodsi, D. E. Culler, and P. Abbeel,
    Technical Report No. UCB/EECS-2017-159, October 2017 (www),
    Technical Report, Preprint: arXiv:1712.05855 (2017) (arXiv).
  • GIANT: Globally Improved Approximate Newton Method for Distributed Optimization,
  • S. Wang, F. Roosta-Khorasani, P. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1709.03528 (2017) (arXiv), (Spark code), (Python code),
    Proc. of the 2018 NeurIPS Conference, 2338-2348 (2018) (pdf).
  • Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study,
  • P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1708.07827 (2017) (arXiv), (code),
    Proc. 2020 SDM Conference, 199-207 (2020) (pdf).
  • Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information,
  • P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1708.07164 (2017) (arXiv),
    Mathematical Programming, 184: 35-70 (2020) (pdf).
  • A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication,
  • M. E. Lopes, S. Wang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1708.01945 (2017) (arXiv),
    J. Machine Learning Research, 20(39): 1-40 (2019) (pdf).
  • Capacity releasing diffusions for speed and locality,
  • D. Wang, K. Fountoulakis, M. Henzinger, M. W. Mahoney, and S. Rao,
    Technical Report, Preprint: arXiv:1706.05826 (2017) (arXiv),
    Proc. of the 34th ICML Conference 3598-3607 (2017) (pdf, supp) (talk).
  • Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds,
  • S. Wang, A. Gittens, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1706.02803 (2017) (arXiv),
    J. Machine Learning Research, 20(12): 1-49 (2019) (pdf).
  • Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction,
  • K. E. Bouchard, A. F. Bujan, F. Roosta-Khorasani, S. Ubaru, Prabhat, A. M. Snijders, J.-H. Mao, E. F. Chang, M. W. Mahoney, and S. Bhattacharyya,
    Technical Report, Preprint: arXiv:1705.07585 (2017) (arXiv),
    Proc. of the 2017 NIPS Conference, 1078-1086 (2017) (pdf).
  • Skip-Gram - Zipf + Uniform = Vector Additivity,
  • A. Gittens, D. Achlioptas, and M. W. Mahoney,
    Proc. of the 55th ACL Meeting 69-76 (2017) (pdf).
  • Principles and Applications of Science of Information [Scanning the Issue],
  • T. Courtade, A. Grama, M. W. Mahoney, and T. Weissman,
    Proceedings of the IEEE, 105(2): 183-188 (2017) (pdf).
  • Social Discrete Choice Models,
  • D. Zhang, K. Fountoulakis, J. Cao, M. Yin, M. W. Mahoney, and A. Pozdnoukhov,
    Technical Report, Preprint: arXiv:1703.07520 (2017) (arXiv).
  • Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging,
  • S. Wang, A. Gittens, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1702.04837 (2017) (arXiv),
    Proc. of the 34th ICML Conference 3608-3616 (2017) (pdf),
    J. Machine Learning Research, 18(218): 1-50 (2018) (pdf).

2016

  • Avoiding communication in primal and dual block coordinate descent methods,
  • A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1612.04003 (2016) (arXiv),
    SIAM J. Scientific Computing, 41(1), C1-C27 (2019) (pdf).
  • Feature-distributed sparse regression: a screen-and-clean approach,
  • J. Yang, M. W. Mahoney, M. A. Saunders, and Y. Sun,
    Proc. of the 2016 NIPS Conference, 2711-2719 (2016) (pdf).
  • Multi-label learning with semantic embeddings,
  • L. Jing, M. Cheng, L. Yang, A. Gittens, and M. W. Mahoney,
    ICLR 2017 OpenReview.net (iclr17).
  • Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data,
  • D. Lawlor, T. Budavari, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1609.03932 (2016) (arXiv),
    The Astrophysical Journal, 833:1, 26 (2016) (pdf).
  • Lecture Notes on Spectral Graph Methods,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1608.04845 (2016) (arXiv).
  • Lecture Notes on Randomized Linear Algebra,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1608.04481 (2016) (arXiv).
  • An optimization approach to locally-biased graph algorithms,
  • K. Fountoulakis, D. F. Gleich, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1607.04940 (2016) (arXiv),
    Proceedings of the IEEE, 105(2): 256-272 (2017) (pdf).
  • DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection,
  • L. Jing, B. Liu, J. Choi, A. Janin, J. Bernd, M. W. Mahoney, and G. Friedland,
    Technical Report, Preprint: arXiv:1607.04378 (2016) (arXiv),
    Proc. of the 2016 ACM Multimedia Conference 57-61 (2016) (pdf),
    IEEE Transactions on Multimedia, 19(12): 2637-2650 (2017) (pdf).
  • Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies,
  • A. Gittens, A. Devarakonda, E. Racah, M. Ringenburg, L. Gerhardt, J. Kottalam, J. Liu, K. Maschhoff, S. Canon, J. Chhugani, P. Sharma, J. Yang, J. Demmel, J. Harrell, V. Krishnamurthy, M. W. Mahoney, and Prabhat,
    Technical Report, Preprint: arXiv:1607.01335 (2016) (arXiv), (code),
    Proc. 2016 IEEE BigData, 204-213 (2016) (pdf).
  • Sub-sampled Newton Methods with Non-uniform Sampling,
  • P. Xu, J. Yang, F. Roosta-Khorasani, C. Re, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1607.00559 (2016) (arXiv),
    Proc. of the 2016 NIPS Conference, 3000-3008 (2016) (pdf).
  • Approximating the Solution to Mixed Packing and Covering LPs in parallel Õ(ε^{-3}) time,
  • M. W. Mahoney, S. Rao, D. Wang, and P. Zhang,
    Proc. of the 43rd ICALP Conference, 52:1-52:14 (2016) (pdf).
  • A Simple and Strongly-Local Flow-Based Method for Cut Improvement,
  • N. Veldt, D. F. Gleich, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1605.08490 (2016) (arXiv),
    Proc. of the 33rd ICML Conference 1938-1947 (2016) (pdf, supp).
  • RandNLA: Randomized Numerical Linear Algebra,
  • P. Drineas and M. W. Mahoney,
    Communications of the ACM, 59, 80-90 (2016) (pdf).
  • FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods,
  • X. Cheng, F. Roosta-Khorasani, S. Palombo, P. L. Bartlett, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1605.08108 (2016) (arXiv),
    Proc. of the 21st International Conference on AISTATS, PMLR 84:404-414 (2018) (pdf, supp).
  • Parallel Local Graph Clustering,
  • J. Shun, F. Roosta-Khorasani, K. Fountoulakis, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1604.07515 (2016) (arXiv),
    Proceedings of the VLDB Endowment, 9(12) 1041-1052 (2016) (pdf).
  • A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark,
  • A. Gittens, J. Kottalam, J. Yang, M. F. Ringenburg, J. Chhugani, E. Racah, M. Singh, Y. Yao, C. Fischer, O. Ruebel, B. Bowen, N. G. Lewis, M. W. Mahoney, V. Krishnamurthy, and Prabhat,
    Proc. 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, at IPDPS, 2016 (pdf).
  • Mining Large Graphs,
  • D. F. Gleich and M. W. Mahoney,
    In Handbook of Big Data. pp. 191-220, edited by P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan, Chapman and Hall/CRC Press, 2016 (pdf).
  • Structural properties underlying high-quality Randomized Numerical Linear Algebra algorithms,
  • M. W. Mahoney and P. Drineas,
    In Handbook of Big Data. pp. 137-154, edited by P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan, Chapman and Hall/CRC Press, 2016 (pdf).
  • Variational Perspective on Local Graph Clustering,
  • K. Fountoulakis, X. Cheng, J. Shun, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1602.01886 (2016) (arXiv),
    Mathematical Programming, 174(1-2): 553-573 (2019) (pdf).
  • Sub-Sampled Newton Methods II: Local Convergence Rates,
  • F. Roosta-Khorasani and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1601.04738 (2016) (arXiv).
  • Sub-Sampled Newton Methods I: Globally Convergent Algorithms,
  • F. Roosta-Khorasani and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1601.04737 (2016) (arXiv).
  • RandNLA, Pythons, and the CUR for Your Data Problems: Reporting from G2S3 2015 in Delphi,
  • E. Gallopoulos, P. Drineas, I. Ipsen, and M. W. Mahoney,
    SIAM News 49:1 January/February 2016 (web), (pdf).

2015

  • Faster Parallel Solver for Positive Linear Programs via Dynamically-Bucketed Selective Coordinate Descent,
  • D. Wang, M. W. Mahoney, N. Mohan, and S. Rao,
    Technical Report, Preprint: arXiv:1511.06468 (2015) (arXiv).
  • A Local Perspective on Community Structure in Multilayer Networks,
  • L. G. S. Jeub, M. W. Mahoney, P. J. Mucha, and M. A. Porter,
    Technical Report, Preprint: arXiv:1510.05185 (2015) (arXiv),
    Network Science, 5(2): 144-163, 2017 (pdf).
  • Optimal Subsampling Approaches for Large Sample Linear Regression,
  • R. Zhu, P. Ma, M. W. Mahoney, and B. Yu,
    Technical Report, Preprint: arXiv:1509.05111 (2015) (arXiv).
  • Unified Acceleration Method for Packing and Covering Problems via Diameter Reduction,
  • D. Wang, S. Rao, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1508.02439 (2015) (arXiv),
    Proc. of the 43rd ICALP Conference, 50:1-50:13 (2016) (pdf).
  • Using local spectral methods to robustify graph-based learning algorithms,
  • D. F. Gleich and M. W. Mahoney,
    Proc. of the 21st Annual SIGKDD, 359-368 (2015) (pdf) (code).
  • Structured Block Basis Factorization for Scalable Kernel Matrix Evaluation,
  • R. Wang, Y. Li, M. W. Mahoney, and E. Darve,
    Technical Report, Preprint: arXiv:1505.00398 (2015) (arXiv),
    SIAM J. Matrix Analysis and Applications, 40(4), 1497–1526 (2019) (pdf).
  • Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions,
  • J. Yang, O. Ruebel, Prabhat, M. W. Mahoney, and B. P. Bowen,
    Analytical Chemistry, 87 (9), 4658-4666 (2015) (pdf) (code).
  • Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nystrom Method,
  • D. G. Anderson, S. S. Du, M. W. Mahoney, C. Melgaard, K. Wu, and M. Gu,
    Proc. of the 18th International Conference on AISTATS, PMLR 38:19-27 (2015) (pdf, supp) (code).
  • Weighted SGD for Lp Regression with Randomized Preconditioning,
  • J. Yang, Y.-L. Chow, C. Re, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1502.03571 (2015) (arXiv),
    Proc. of the 27th Annual SODA, 558-569 (2016) (pdf),
    J. Machine Learning Research, 18(211): 1-43 (2018) (pdf).
  • Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments,
  • J. Yang, X. Meng, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1502.03032 (2015) (arXiv) (code),
    Proceedings of the IEEE 104(1): 58-92 (2016) (pdf).

2014

  • Tree decompositions and social graphs,
  • A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1411.1546 (2014) (arXiv), (code),
    Internet Mathematics, 12(5), 315-361 (2016) (pdf).
  • Fast Randomized Kernel Methods With Statistical Guarantees,
  • A. El Alaoui and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1411.0306 (2014) (arXiv),
    Proc. of the 2015 NIPS Conference, 775-783 (2015) (pdf).
  • Signal Processing for Big Data (Editorial for Special Issue),
  • G. B. Giannakis, F. Bach, R. Cendrillon, M. W. Mahoney, and J. Neville,
    IEEE Signal Processing Magazine, 31: 15-16 (September 2014) (pdf).
  • A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares,
  • G. Raskutti and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1406.5986 (2014) (arXiv),
    Proc. of the 32nd ICML Conference, 617-625 (2015) (pdf),
    J. Machine Learning Research, 17(214): 1-31 (2016) (pdf).
  • Random Laplace Feature Maps for Semigroup Kernels on Histograms,
  • J. Yang, V. Sindhwani, Q. Fan, H. Avron, and M. W. Mahoney,
    Proc. of the 27th CVPR Conference, 971-978 (2014) (pdf).
  • Anti-differentiating Approximation Algorithms: A case study with Min-cuts, Spectral, and Flow,
  • D. F. Gleich and M. W. Mahoney,
    Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 1018-1025 (2014) (pdf) (code, code) (talk).
  • Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels,
  • J. Yang, V. Sindhwani, H. Avron, and M. W. Mahoney,
    Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 485-493 (2014) (pdf), (code),
    Technical Report, Preprint: arXiv:1412.8293 (2014) (arXiv),
    J. Machine Learning Research, 17(120): 1-38 (2016) (pdf).
  • Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks,
  • L. G. S. Jeub, P. Balachandran, M. A. Porter, P. J. Mucha, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1403.3795 (2014) (arXiv), (code, code),
    Physical Review E, 91, 012821 (2015) (pdf).
  • A new spin on an old algorithm: technical perspective on "Communication costs of Strassen's matrix multiplication,"
  • M. W. Mahoney,
    Communications of the ACM, 57(2): 106 (2014) (pdf).

2013

  • Tree-like Structure in Large Social and Information Networks,
  • A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,
    Proc. of the 2013 IEEE ICDM, 1-10 (2013) (pdf).
  • Objective Identification of Informative Wavelength Regions in Galaxy Spectra,
  • C.-W. Yip, M. W. Mahoney, A. S. Szalay, I. Csabai, T. Budavari, R. F. G. Wyse, and L. Dobos,
    Technical Report, Preprint: arXiv:1312.0637 (2013) (arXiv),
    Astronomical Journal, 147, 5, 110 (2014) (pdf).
  • Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity,
  • A. B. Adcock, B. D. Sullivan, O. R. Hernandez, and M. W. Mahoney,
    Proc. of the 9th IWOMP, 71-83 (2013) (pdf).
  • Frontiers in Massive Data Analysis,
  • Committee on the Analysis of Massive Data, et al. (M. I. Jordan, et al.),
    The National Academies Press (2013) (pdf), (web).
  • A Statistical Perspective on Algorithmic Leveraging,
  • P. Ma, M. W. Mahoney, and B. Yu,
    Technical Report, Preprint: arXiv:1306.5362 (2013) (arXiv),
    Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 91-99 (2014) (pdf),
    J. Machine Learning Research, 16, 861-911 (2015) (pdf).
  • Robust Regression on MapReduce,
  • X. Meng and M. W. Mahoney,
    Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 888-896 (2013) (pdf).
  • Quantile Regression for Large-scale Applications,
  • J. Yang, X. Meng, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1305.0087 (2013) (arXiv), (code),
    Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 881-887 (2013) (pdf),
    SIAM J. Scientific Computing, 36(5), S78-S110 (2014) (pdf).
  • Revisiting the Nystrom Method for Improved Large-Scale Machine Learning,
  • A. Gittens and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1303.1849 (2013) (arXiv), (code),
    Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 567-575 (2013) (pdf),
    J. Machine Learning Research, 17(117): 1-65 (2016) (pdf).

2012

  • Semi-supervised Eigenvectors for Large-scale Locally-biased Learning,
  • T. J. Hansen and M. W. Mahoney,
    Proc. of the 2012 NIPS Conference, 2528-2536 (2012) (pdf), (code),
    Technical Report, Preprint: arXiv:1304.7528 (2013) (arXiv),
    J. Machine Learning Research, 15, 3691-3734 (2014) (pdf).
  • Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression,
  • X. Meng and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1210.3135 (2012) (arXiv),
    Proc. of the 45th STOC, 91-100 (2013) (pdf).
  • The Fast Cauchy Transform and Faster Robust Linear Regression,
  • K. L. Clarkson, P. Drineas, M. Magdon-Ismail, M. W. Mahoney, X. Meng, and D. P. Woodruff,
    Technical Report, Preprint: arXiv:1207.4684 (2012) (arXiv),
    Proc. of the 24th Annual SODA, 466-477 (2013) (pdf),
    SIAM J. Computing, 45, 763-810 (2016) (pdf).
  • rCUR: an R package for CUR matrix decomposition,
  • A. Bodor, I. Csabai, M. W. Mahoney, and N. Solymosi,
    BMC Bioinformatics, 13:103 (2012) (pdf), (code).
  • Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1203.0786 (2012) (arXiv),
    Proc. of the 2012 ACM Symposium on Principles of Database Systems, 143-154 (2012) (pdf).
  • On the Hyperbolicity of Small-World and Tree-Like Random Graphs,
  • W. Chen, W. Fang, G. Hu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1201.1717 (2012) (arXiv),
    Proc. of the 23rd ISAAC, 278-288 (2012) (pdf),
    Internet Mathematics, 9(4), 434-491 (2013) (pdf).

2011

  • Randomized Dimensionality Reduction for K-means Clustering,
  • C. Boutsidis, A. Zouzias, M. W. Mahoney, and P. Drineas,
    Technical Report, Preprint: arXiv:1110.2897 (2011) (arXiv),
    IEEE Transactions on Information Theory, 61(2), 1045-1062 (2015) (pdf).
  • Regularized Laplacian Estimation and Fast Eigenvector Approximation,
  • P. O. Perry and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1110.1757 (2011) (arXiv),
    Proc. of the 2011 NIPS Conference, 2420-2428 (2011) (pdf).
  • LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems,
  • X. Meng, M. A. Saunders, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1109.5981 (2011) (arXiv), (code),
    SIAM J. Scientific Computing, 36(2), C95-C118 (2014) (pdf).
  • Fast approximation of matrix coherence and statistical leverage,
  • P. Drineas, M. Magdon-Ismail, M. W. Mahoney, and D. P. Woodruff,
    Technical Report, Preprint: arXiv:1109.3843 (2011) (arXiv),
    Proc. of the 29th ICML Conference, 1051-1058 (2012) (pdf),
    J. Machine Learning Research, 13, 3475-3506 (2012) (pdf).
  • Localization on low-order eigenvectors of data matrices,
  • M. Cucuringu and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1109.1355 (2011) (arXiv).
  • Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation,
  • A. Javed, P. Drineas, M. W. Mahoney, and P. Paschou,
    Annals of Human Genetics, 75, 707-722 (2011) (pdf).
  • Randomized Algorithms for Matrices and Data,
  • M. W. Mahoney,
    Foundations and Trends in Machine Learning, NOW Publishers, Volume 3, Issue 2, 2011 (now),
    TR version: Technical Report, Preprint: arXiv:1104.5557 (2011) (arXiv).
    (Abridged version in: Advances in Machine Learning and Data Mining for Astronomy, edited by M. J. Way, et al., pp. 647-672, 2012.)

2010

  • Computation in Large-Scale Scientific and Internet Data Applications is a Focus of MMDS 2010,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1012.4231 (2010) (arXiv),
    Appeared in SIGKDD Explorations, SIGACT News, ASA-SCGN Newsletter, and IMS Bulletin.
  • CUR from a Sparse Optimization Viewpoint,
  • J. Bien, Y. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1011.0413 (2010) (arXiv),
    Proc. of the 2010 NIPS Conference, 217-225 (2010) (ps, pdf).
  • Algorithmic and Statistical Perspectives on Large-Scale Data Analysis,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1010.1609 (2010) (arXiv),
    In: Combinatorial Scientific Computing, pp. 427-469, edited by U. Naumann and O. Schenk, 2012.
  • Implementing regularization implicitly via approximate eigenvector computation,
  • M. W. Mahoney and L. Orecchia,
    Technical Report, Preprint: arXiv:1010.0703 (2010) (arXiv),
    Proc. of the 28th ICML Conference, 121-128 (2011) (pdf), (talk).
  • Approximating Higher-Order Distances Using Random Projections,
  • P. Li, M. W. Mahoney, and Y. She,
    Proc. of the 26th UAI Conference, 312-321 (2010) (ps, pdf),
    Technical Report, Preprint: arXiv:1203.3492 (2012) (arXiv).
  • Effective Resistances, Statistical Leverage, and Applications to Linear Equation Solving,
  • P. Drineas and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1005.3097 (2010) (arXiv).
  • Empirical Comparison of Algorithms for Network Community Detection,
  • J. Leskovec, K. J. Lang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1004.3539 (2010) (arXiv),
    Proc. of the 19th International WWW, 631-640 (2010) (ps, pdf).

2009

  • A Local Spectral Method for Graphs: with Applications to Improving Graph Partitions and Exploring Data Graphs Locally,
  • M. W. Mahoney, L. Orecchia, and N. K. Vishnoi,
    Technical Report, Preprint: arXiv:0912.0681 (2009) (arXiv),
    J. Machine Learning Research, 13, 2339-2365 (2012) (ps, pdf).
  • Unsupervised Feature Selection for the k-means Clustering Problem,
  • C. Boutsidis, M. W. Mahoney, and P. Drineas,
    Proc. of the 2009 NIPS Conference, 153-161 (2009) (ps, pdf).
  • Learning with Spectral Kernels and Heavy-Tailed Data,
  • M. W. Mahoney and H. Narayanan,
    Technical Report, Preprint: arXiv:0906.4539 (2009) (arXiv).
  • Empirical Evaluation of Graph Partitioning Using Spectral Embeddings and Flow,
  • K. J. Lang, M. W. Mahoney, and L. Orecchia,
    Proc. of the 8th International SEA, 197-208 (2009) (ps, pdf).
  • CUR Matrix Decompositions for Improved Data Analysis,
  • M. W. Mahoney and P. Drineas,
    Proc. Natl. Acad. Sci. USA, 106, 697-702 (2009) (ps, pdf).

2008

  • An Improved Approximation Algorithm for the Column Subset Selection Problem,
  • C. Boutsidis, M. W. Mahoney, and P. Drineas,
    Technical Report, Preprint: arXiv:0812.4293 (2008) (arXiv),
    Proc. of the 20th Annual SODA, 968-977 (2009) (ps, pdf).
  • Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis are the Focus of MMDS 2008,
  • M. W. Mahoney, L.-H. Lim, and G. E. Carlsson,
    Technical Report, Preprint: arXiv:0812.3702 (2008) (arXiv),
    Appeared in SIGKDD Explorations (ps, pdf), SIAM News (ps, pdf), and ASA-SCGN Newsletter (ps, pdf), and abridged versions appeared in IMS Bulletin (ps, pdf) and AmStat News.
  • Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,
  • J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:0810.1355 (2008) (arXiv),
    Internet Mathematics, 6(1), 29-123 (2009) (pdf).
  • Unsupervised Feature Selection for Principal Components Analysis,
  • C. Boutsidis, M. W. Mahoney, and P. Drineas,
    Proc. of the 14th Annual SIGKDD, 61-69 (2008) (ps, pdf).
  • Statistical Properties of Community Structure in Large Social and Information Networks,
  • J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
    Proc. of the 17th International WWW, 695-704 (2008) (ps, pdf).

2007

  • Faster Least Squares Approximation,
  • P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarlos,
    Technical Report, Preprint: arXiv:0710.1435 (2007) (arXiv),
    Numerische Mathematik, 117, 219-249 (2011) (pdf).
  • PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations,
  • P. Paschou, E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron, M. W. Mahoney, and P. Drineas,
    PLoS Genetics, 3, 1672-1686 (2007) (ps, pdf).
  • Relative-Error CUR Matrix Decompositions,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Technical Report, Preprint: arXiv:0708.3696 (2007) (arXiv),
    SIAM J. Matrix Analysis and Applications, 30, 844-881 (2008) (ps, pdf).
  • Feature Selection Methods for Text Classification,
  • A. Dasgupta, P. Drineas, B. Harb, V. Josifovski, and M. W. Mahoney,
    Proc. of the 13th Annual SIGKDD, 230-239 (2007) (ps, pdf).
  • Sampling Algorithms and Coresets for Lp Regression,
  • A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:0707.1714 (2007) (arXiv),
    Proc. of the 19th Annual SODA, 932-941 (2008) (ps, pdf),
    SIAM J. Computing, 38, 2060-2078 (2009) (ps, pdf).
  • Web Information Retrieval and Linear Algebra Algorithms,
  • A. Frommer, M. W. Mahoney, and D. B. Szyld (Eds.),
    Proc. of Dagstuhl Seminar 07071, (2007) (web).
  • Intra- and interpopulation genotype reconstruction from tagging SNPs,
  • P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas,
    Genome Research, 17(1), 96-107 (2007) (ps, pdf).

2006

  • Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications,
  • G. H. Golub, M. W. Mahoney, P. Drineas, and L.-H. Lim,
    SIAM News 39:8 October 2006 (ps, pdf).
  • Randomized Algorithms for Matrices and Massive Data Sets,
  • P. Drineas and M. W. Mahoney,
    Proc. of the 32nd Annual VLDB, 1269 (2006) (ps, pdf).
  • Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Proc. of the 14th Annual ESA, 304-314 (2006) (ps, pdf).
  • Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Proc. of the 10th Annual RANDOM, 316-326 (2006) (ps, pdf).
  • Tensor-CUR Decompositions For Tensor-Based Data,
  • M. W. Mahoney, M. Maggioni, and P. Drineas,
    Proc. of the 12th Annual SIGKDD, 327-336 (2006) (ps, pdf),
    SIAM J. Matrix Analysis and Applications, 30, 957-987 (2008) (ps, pdf).
  • Polynomial Time Algorithm for Column-Row-Based Relative-Error Low-Rank Matrix Approximation,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Technical Report, DIMACS TR 2006-04 March 2006 (ps, pdf).
  • Sampling Algorithms for L2 Regression and Applications,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Proc. of the 17th Annual SODA, 1127-1136 (2006) (ps, pdf).

2005

  • A Randomized Algorithm for a Tensor-Based Generalization of the Singular Value Decomposition,
  • P. Drineas and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1327, June 2005 (ps, pdf),
    Linear Algebra and its Applications, 420, 553-571 (2007) (ps, pdf).
  • On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,
  • P. Drineas and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1319, April 2005 (ps, pdf),
    Proc. of the 18th Annual COLT, 323-337 (2005) (ps, pdf),
    J. Machine Learning Research, 6, 2153-2175 (2005) (ps, pdf).

2004

  • Sampling Sub-problems of Heterogeneous Max-Cut Problems and Approximation Algorithms,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1283, April 2004 (ps, pdf),
    Proc. of the 22nd Annual STACS, 57-68 (2005) (ps, pdf),
    Random Structures and Algorithms, 32:3, 307-333 (2008) (ps, pdf).
  • Fast Monte Carlo Algorithms for Matrices III: Computing an Efficient Approximate Decomposition of a Matrix,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1271, February 2004 (ps, pdf),
    SIAM J. Computing, 36, 184-206 (2006) (ps, pdf).
  • Fast Monte Carlo Algorithms for Matrices II: Computing Low-Rank Approximations to a Matrix,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1270, February 2004 (ps, pdf),
    SIAM J. Computing, 36, 158-183 (2006) (ps, pdf).
  • Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1269, February 2004 (ps, pdf),
    SIAM J. Computing, 36, 132-157 (2006) (ps, pdf).

2003

  • Rapid Mixing of Several Markov Chains for a Hard-Core Model,
  • R. Kannan, M. W. Mahoney, and R. Montenegro,
    Proc. of the 14th Annual ISAAC, 663-675 (2003) (pdf).

2001

  • Quantum, Intramolecular Flexibility, and Polarizability Effects on the Reproduction of the Density Anomaly of Liquid Water by Simple Potential Functions,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 115, 10758-10768 (2001) (pdf).
  • Rapid Estimation of Electronic Degrees of Freedom in Monte Carlo Calculations for Polarizable Models of Liquid Water,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 114, 9337-9349 (2001) (pdf).
  • Diffusion Constant of the TIP5P Model of Liquid Water,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 114, 363-366 (2001) (pdf).

2000

  • A Five-Site Model for Liquid Water and the Reproduction of the Density Anomaly by Rigid, Nonpolarizable Potential Functions,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 112, 8910-8922 (2000) (pdf).

1997

  • Repression and Activation of Promoter-Bound RNA Polymerase Activity by Gal Repressor,
  • H. E. Choy, R. R. Hanger, T. Aki, M. Mahoney, K. Murakami, A. Ishihama, and S. Adhya,
    J. Mol. Biol. 272: 293-300, 1997 (pdf).
  • Discrete Representations of the Protein C-alpha Chain,
  • X. F. de la Cruz, M. W. Mahoney, and B. K. Lee,
    Fold. & Des. 2: 223-234, 1997 (pdf).