Michael Mahoney - Publications

2018

  • Trust Region Based Adversarial Attack on Neural Networks,
  • Z. Yao, A. Gholami, P. Xu, K. Keutzer, M. W. Mahoney,
    Technical Report, Preprint: arXiv:1812.06371 (2018) (arXiv),
  • Parameter Re-Initialization through Cyclical Batch Size Schedules,
  • N. Mu, Z. Yao, A. Gholami, K. Keutzer, M. W. Mahoney,
    Technical Report, Preprint: arXiv:1812.01216 (2018) (arXiv),
    Presented at the Systems for Machine Learning Workshop at the NeurIPS'18 Conference.
  • On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent,
  • N. Golmant, N. Vemuri, Z. Yao, V. Feinberg, A. Gholami, K. Rothauge, M. W. Mahoney, and J. Gonzalez,
    Technical Report, Preprint: arXiv:1811.12941 (2018) (arXiv),
  • The Mathematics of Data,
  • M. W. Mahoney, J. C. Duchi, and A. C. Gilbert, Eds.
    AMS, IAS/PCMI, and SIAM (2018) (web), (intro).
  • A Short Introduction to Local Graph Clustering Methods and Software,
  • K. Fountoulakis, D. F. Gleich, M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.07324 (2018) (arXiv),
  • Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning,
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.01075 (2018) (arXiv),
  • Large batch size training of neural networks with adversarial training and second-order information,
  • Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.01021 (2018) (arXiv), (code),
  • Newton-MR: Newton's Method Without Smoothness or Convexity,
  • F. Roosta, Y. Liu, P. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1810.00303 (2018) (arXiv),
  • Distributed Second-order Convex Optimization,
  • C.-H. Fang, S. B. Kylasa, F. Roosta-Khorasani, M. W. Mahoney, and A. Grama,
    Technical Report, Preprint: arXiv:1807.07132 (2018) (arXiv), (code),
  • Alchemist: An Apache Spark <=> MPI Interface,
  • A. Gittens, K. Rothauge, M. W. Mahoney, S. Wang, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,
    Technical Report, Preprint: arXiv:1806.01270 (2018) (arXiv),
    Accepted for publication, CUG 2018.
  • Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist,
  • A. Gittens, K. Rothauge, S. Wang, M. W. Mahoney, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,
    Technical Report, Preprint: arXiv:1805.11800 (2018) (arXiv),
    Accepted for publication, KDD 2018.
  • Group Collaborative Representation for Image Set Classification,
  • B. Liu, L. Jing, J. Li, J. Yu, A. Gittens, and M. W. Mahoney,
    International Journal of Computer Vision, 1-26 (2018) (pdf).
  • Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap,
  • M. E. Lopes, S. Wang, M. W. Mahoney,
    Technical Report, Preprint: arXiv:1803.08021 (2018) (arXiv),
    Proc. of the 35th ICML Conference 3223-3232 (2018) (pdf),
    Journal version submitted for publication.
  • GPU Accelerated Sub-Sampled Newton's Method,
  • S. B. Kylasa, F. Roosta-Khorasani, M. W. Mahoney, and A. Grama,
    Technical Report, Preprint: arXiv:1802.09113 (2018) (arXiv), (code),
    Submitted for publication.
  • Hessian-based Analysis of Large Batch Training and Robustness to Adversaries,
  • Z. Yao, A. Gholami, Q. Lei, K. Keutzer, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1802.08241 (2018) (arXiv),
    Accepted for publication, Proc. NIPS 2018.
  • Inexact Non-Convex Newton-Type Methods,
  • Z. Yao, P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1802.06925 (2018) (arXiv),
    Submitted for publication.
  • Out-of-sample extension of graph adjacency spectral embedding,
  • K. Levin, F. Roosta-Khorasani, M. W. Mahoney, and C. E. Priebe,
    Technical Report, Preprint: arXiv:1802.06307 (2018) (arXiv),
    Proc. of the 35th ICML Conference 2981-2990 (2018) (pdf),
    Journal version submitted for publication.

2017

  • Lectures on Randomized Numerical Linear Algebra,
  • P. Drineas and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1712.08880 (2017) (arXiv),
    To appear in: Lectures of the 2016 PCMI Summer School on Mathematics of Data.
  • Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization,
  • A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1712.06047 (2017) (arXiv),
    Proc. of the 2018 IPDPS 409-418 (2018) (pdf).
  • Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior, (blog)
  • C. H. Martin and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1710.09553 (2017) (arXiv), (iclr18),
  • LASAGNE: Locality And Structure Aware Graph Node Embedding,
  • E. Faerman, F. Borutta, K. Fountoulakis, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1710.06520 (2017) (arXiv),
    Accepted for publication, Proc. 2018 International Conference on Web Intelligence.
  • A Berkeley View of Systems Challenges for AI,
  • I. Stoica, D. Song, R. A. Popa, D. A. Patterson, M. W. Mahoney, R. H. Katz, A. D. Joseph, M. Jordan, J. M. Hellerstein, J. Gonzalez, K. Goldberg, A. Ghodsi, D. E. Culler, and P. Abbeel,
    Technical Report No. UCB/EECS-2017-159, October 2017 (www),
  • GIANT: Globally Improved Approximate Newton Method for Distributed Optimization,
  • S. Wang, F. Roosta-Khorasani, P. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1709.03528 (2017) (arXiv), (Spark code), (Python code),
    Accepted for publication, Proc. NIPS 2018.
  • Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study,
  • P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1708.07827 (2017) (arXiv), (code),
    Submitted for publication.
  • Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information,
  • P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1708.07164 (2017) (arXiv),
    Submitted for publication.
  • A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication,
  • M. E. Lopes, S. Wang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1708.01945 (2017) (arXiv),
    Submitted for publication.
  • Capacity releasing diffusions for speed and locality,
  • D. Wang, K. Fountoulakis, M. Henzinger, M. W. Mahoney, and S. Rao,
    Technical Report, Preprint: arXiv:1706.05826 (2017) (arXiv),
    Proc. of the 34th ICML Conference 3598-3607 (2017) (pdf) (supp) (talk).
  • Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds,
  • S. Wang, A. Gittens, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1706.02803 (2017) (arXiv),
    Accepted for publication, J. Machine Learning Research.
  • Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction,
  • K. E. Bouchard, A. F. Bujan, F. Roosta-Khorasani, S. Ubaru, Prabhat, A. M. Snijders, J.-H. Mao, E. F. Chang, M. W. Mahoney, S. Bhattacharyya,
    Technical Report, Preprint: arXiv:1705.07585 (2017) (arXiv),
    Proc. of the 2017 NIPS Conference (pdf).
  • Skip-Gram - Zipf + Uniform = Vector Additivity,
  • A. Gittens, D. Achlioptas, and M. W. Mahoney,
    Proc. of the 55th ACL Meeting 69-76 (2017) (pdf).
  • Principles and Applications of Science of Information [Scanning the Issue],
  • T. Courtade, A. Grama, M. W. Mahoney, and T. Weissman,
    Proceedings of the IEEE, 105(2): 183-188 (2017) (pdf).
  • Social Discrete Choice Models,
  • D. Zhang, K. Fountoulakis, J. Cao, M. Yin, M. W. Mahoney, and A. Pozdnoukhov,
    Technical Report, Preprint: arXiv:1703.07520 (2017) (arXiv),
    Submitted for publication.
  • Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging,
  • S. Wang, A. Gittens, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1702.04837 (2017) (arXiv),
    Proc. of the 34th ICML Conference 3608-3616 (2017) (pdf),
    J. Machine Learning Research, 18(218): 1-50 (2018) (pdf).

2016

  • Avoiding communication in primal and dual block coordinate descent methods,
  • A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1612.04003 (2016) (arXiv),
    Submitted for publication.
  • Feature-distributed sparse regression: a screen-and-clean approach,
  • J. Yang, M. W. Mahoney, M. A. Saunders, and Y. Sun,
    Proc. of the 2016 NIPS Conference (pdf).
  • Multi-label learning with semantic embeddings,
  • L. Jing, M. Cheng, L. Yang, A. Gittens, M. W. Mahoney,
    ICLR 2017 OpenReview.net (www),
  • Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data,
  • D. Lawlor, T. Budavari, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1609.03932 (2016) (arXiv),
    The Astrophysical Journal, 833:1, 26 (2016) (pdf).
  • Lecture Notes on Spectral Graph Methods,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1608.04845 (2016) (arXiv),
  • Lecture Notes on Randomized Linear Algebra,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1608.04481 (2016) (arXiv),
  • An optimization approach to locally-biased graph algorithms,
  • K. Fountoulakis, D. F. Gleich, M. W. Mahoney,
    Technical Report, Preprint: arXiv:1607.04940 (2016) (arXiv),
    Proceedings of the IEEE, 105(2): 256-272 (2017) (pdf).
  • DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection,
  • L. Jing, B. Liu, J. Choi, A. Janin, J. Bernd, M. W. Mahoney, and G. Friedland,
    Technical Report, Preprint: arXiv:1607.04378 (2016) (arXiv),
    Proc. of the 2016 ACM Multimedia Conference 57-61 (2016) (pdf),
    IEEE Transactions on Multimedia, 19(12): 2637-2650 (2017) (pdf).
  • Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies,
  • A. Gittens, A. Devarakonda, E. Racah, M. Ringenburg, L. Gerhardt, J. Kottalam, J. Liu, K. Maschhoff, S. Canon, J. Chhugani, P. Sharma, J. Yang, J. Demmel, J. Harrell, V. Krishnamurthy, M. W. Mahoney, and Prabhat,
    Technical Report, Preprint: arXiv:1607.01335 (2016) (arXiv), (code),
    Proc. 2016 IEEE BigData, 204-213 (2016) (pdf).
  • Sub-sampled Newton Methods with Non-uniform Sampling,
  • P. Xu, J. Yang, F. Roosta-Khorasani, C. Re, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1607.00559 (2016) (arXiv),
    Proc. of the 2016 NIPS Conference (pdf).
  • Approximating the Solution to Mixed Packing and Covering LPs in parallel Õ(ε^{-3}) time,
  • M. W. Mahoney, S. Rao, D. Wang, and P. Zhang,
    Proc. of the 43rd ICALP Conference, 52:1-52:14 (2016) (pdf).
  • A Simple and Strongly-Local Flow-Based Method for Cut Improvement,
  • N. Veldt, D. F. Gleich, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1605.08490 (2016) (arXiv),
    Proc. of the 33rd ICML Conference 1938-1947 (2016) (pdf), (supp).
  • RandNLA: Randomized Numerical Linear Algebra,
  • P. Drineas and M. W. Mahoney,
    Communications of the ACM, 59, 80-90 (2016) (pdf).
  • FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods,
  • X. Cheng, F. Roosta-Khorasani, S. Palombo, P. L. Bartlett, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1605.08108 (2016) (arXiv),
    Proc. of the 21st International Conference on AISTATS, PMLR 84:404-414 (2018) (pdf, supp).
  • Parallel Local Graph Clustering,
  • J. Shun, F. Roosta-Khorasani, K. Fountoulakis, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1604.07515 (2016) (arXiv),
    Proceedings of the VLDB Endowment, 9(12) 1041-1052 (2016) (pdf).
  • A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark,
  • A. Gittens, J. Kottalam, J. Yang, M. F. Ringenburg, J. Chhugani, E. Racah, M. Singh, Y. Yao, C. Fischer, O. Ruebel, B. Bowen, N. G. Lewis, M. W. Mahoney, V. Krishnamurthy, and Prabhat,
    Proc. 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, at IPDPS, 2016 (pdf).
  • Mining Large Graphs,
  • D. F. Gleich and M. W. Mahoney,
    In Handbook of Big Data. pp. 191-220, edited by P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan, Chapman and Hall/CRC Press, 2016 (pdf).
  • Structural properties underlying high-quality Randomized Numerical Linear Algebra algorithms,
  • M. W. Mahoney and P. Drineas,
    In Handbook of Big Data. pp. 137-154, edited by P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan, Chapman and Hall/CRC Press, 2016 (pdf).
  • Variational Perspective on Local Graph Clustering,
  • K. Fountoulakis, X. Cheng, J. Shun, F. Roosta-Khorasani, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1602.01886 (2016) (arXiv),
    Mathematical Programming, 1-21 (2017) (pdf).
  • Sub-Sampled Newton Methods II: Local Convergence Rates,
  • F. Roosta-Khorasani and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1601.04738 (2016) (arXiv),
    Journal version submitted for publication.
  • Sub-Sampled Newton Methods I: Globally Convergent Algorithms,
  • F. Roosta-Khorasani and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1601.04737 (2016) (arXiv),
    Journal version submitted for publication.
  • RandNLA, Pythons, and the CUR for Your Data Problems: Reporting from G2S3 2015 in Delphi,
  • E. Gallopoulos, P. Drineas, I. Ipsen, and M. W. Mahoney,
    SIAM News 49:1 January/February 2016 (web), (pdf).

2015

  • Faster Parallel Solver for Positive Linear Programs via Dynamically-Bucketed Selective Coordinate Descent,
  • D. Wang, M. W. Mahoney, N. Mohan, and S. Rao,
    Technical Report, Preprint: arXiv:1511.06468 (2015) (arXiv).
  • A Local Perspective on Community Structure in Multilayer Networks,
  • L. G. S. Jeub, M. W. Mahoney, P. J. Mucha, and M. A. Porter,
    Technical Report, Preprint: arXiv:1510.05185 (2015) (arXiv),
    Network Science, 5(2): 144-163, 2017 (pdf).
  • Optimal Subsampling Approaches for Large Sample Linear Regression,
  • R. Zhu, P. Ma, M. W. Mahoney, and B. Yu,
    Technical Report, Preprint: arXiv:1509.05111 (2015) (arXiv).
  • Unified Acceleration Method for Packing and Covering Problems via Diameter Reduction,
  • D. Wang, S. Rao, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1508.02439 (2015) (arXiv),
    Proc. of the 43rd ICALP Conference, 50:1-50:13 (2016) (pdf).
  • Using local spectral methods to robustify graph-based learning algorithms,
  • D. F. Gleich and M. W. Mahoney,
    Proc. of the 21st Annual SIGKDD, (2015) (pdf) (code).
  • Structured Block Basis Factorization for Scalable Kernel Matrix Evaluation,
  • R. Wang, Y. Li, M. W. Mahoney, and E. Darve,
    Technical Report, Preprint: arXiv:1505.00398 (2015) (arXiv).
  • Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions,
  • J. Yang, O. Rubel, Prabhat, M. W. Mahoney, and B. P. Bowen,
    Analytical Chemistry, 87 (9), 4658-4666 (2015) (pdf) (code).
  • Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nystrom Method,
  • D. G. Anderson, S. S. Du, M. W. Mahoney, C. Melgaard, K. Wu, and M. Gu,
    Proc. of the 18th International Conference on AISTATS, PMLR 38:19-27 (2015) (pdf, supp) (code).
  • Weighted SGD for Lp Regression with Randomized Preconditioning,
  • J. Yang, Y.-L. Chow, C. Re, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1502.03571 (2015) (arXiv),
    Proc. of the 27-th Annual SODA, 558-569 (2016) (pdf),
    J. Machine Learning Research, 18(211): 1-43 (2018) (pdf).
  • Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments,
  • J. Yang, X. Meng, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1502.03032 (2015) (arXiv) (code),
    Proceedings of the IEEE 104(1): 58-92 (2016) (pdf).

2014

  • Tree decompositions and social graphs,
  • A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1411.1546 (2014) (arXiv), (code),
    Internet Mathematics, 12(5), 315-361 (2016) (pdf).
  • Fast Randomized Kernel Methods With Statistical Guarantees,
  • A. El Alaoui and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1411.0306 (2014) (arXiv),
    Proc. of the 2015 NIPS Conference (pdf).
  • Signal Processing for Big Data (Editorial for Special Issue),
  • G. B. Giannakis, F. Bach, R. Cendrillon, M. Mahoney, and J. Neville,
    IEEE Signal Processing Magazine, 31: 15-16 (September 2014) (pdf).
  • A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares,
  • G. Raskutti and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1406.5986 (2014) (arXiv),
    Proc. of the 32nd ICML Conference (2015) (pdf),
    J. Machine Learning Research, 17(214): 1-31, (2016) (pdf).
  • Random Laplace Feature Maps for Semigroup Kernels on Histograms,
  • J. Yang, V. Sindhwani, Q. Fan, H. Avron, and M. W. Mahoney,
    Proc. of the 27th CVPR Conference, 971-978 (2014) (pdf).
  • Anti-differentiating Approximation Algorithms: A case study with Min-cuts, Spectral, and Flow,
  • D. F. Gleich and M. W. Mahoney,
    Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 1018-1025 (2014) (pdf) (code, code) (talk).
  • Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels,
  • J. Yang, V. Sindhwani, H. Avron, and M. W. Mahoney,
    Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 485-493 (2014) (pdf), (code),
    Technical Report, Preprint: arXiv:1412.8293 (2014) (arXiv),
    J. Machine Learning Research, 17(120): 1-38 (2016) (pdf).
  • Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks,
  • L. G. S. Jeub, P. Balachandran, M. A. Porter, P. J. Mucha, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1403.3795 (2014) (arXiv), (code, code),
    Physical Review E, 91, 012821 (2015) (pdf).
  • A new spin on an old algorithm: technical perspective on "Communication costs of Strassen's matrix multiplication,"
  • M. W. Mahoney,
    Communications of the ACM, 57(2): 106 (2014) (pdf).

2013

  • Tree-like Structure in Large Social and Information Networks,
  • A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,
    Proc. of the 2013 IEEE ICDM, 1-10 (2013) (pdf).
  • Objective Identification of Informative Wavelength Regions in Galaxy Spectra,
  • C.-W. Yip, M. W. Mahoney, A. S. Szalay, I. Csabai, T. Budavari, R. F. G. Wyse, and L. Dobos,
    Technical Report, Preprint: arXiv:1312.0637 (2013) (arXiv),
    Astronomical Journal, 147, 5, 110 (2014) (pdf).
  • Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity,
  • A. B. Adcock, B. D. Sullivan, O. R. Hernandez, and M. W. Mahoney,
    Proc. of the 9-th IWOMP, 71-83 (2013) (pdf).
  • Frontiers in Massive Data Analysis,
  • Committee on the Analysis of Massive Data, et al. (M. I. Jordan, et al.),
    The National Academies Press (2013) (pdf), (web).
  • A Statistical Perspective on Algorithmic Leveraging,
  • P. Ma, M. W. Mahoney, and B. Yu,
    Technical Report, Preprint: arXiv:1306.5362 (2013) (arXiv),
    Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 91-99 (2014) (pdf),
    J. Machine Learning Research, 16, 861-911 (2015) (pdf).
  • Robust Regression on MapReduce,
  • X. Meng and M. W. Mahoney,
    Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 888-896 (2013) (pdf).
  • Quantile Regression for Large-scale Applications,
  • J. Yang, X. Meng, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1305.0087 (2013) (arXiv), (code),
    Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 881-887 (2013) (pdf),
    SIAM J. Scientific Computing, 36(5), S78-S110 (2014) (pdf).
  • Revisiting the Nystrom Method for Improved Large-Scale Machine Learning,
  • A. Gittens and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1303.1849 (2013) (arXiv), (code),
    Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 567-575 (2013) (pdf),
    J. Machine Learning Research, 17(117): 1-65 (2016) (pdf).

2012

  • Semi-supervised Eigenvectors for Large-scale Locally-biased Learning,
  • T. J. Hansen and M. W. Mahoney,
    Proc. of the 2012 NIPS Conference (pdf), (code),
    Technical Report, Preprint: arXiv:1304.7528 (2013) (arXiv),
    J. Machine Learning Research, 15, 3691-3734 (2014) (pdf).
  • Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression,
  • X. Meng and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1210.3135 (2012) (arXiv),
    Proc. of the 45-th STOC, 91-100 (2013) (pdf).
  • The Fast Cauchy Transform and Faster Robust Linear Regression,
  • K. L. Clarkson, P. Drineas, M. Magdon-Ismail, M. W. Mahoney, X. Meng, and D. P. Woodruff,
    Technical Report, Preprint: arXiv:1207.4684 (2012) (arXiv),
    Proc. of the 24-th Annual SODA, 466-477 (2013) (pdf),
    SIAM J. Computing, 45, 763-810 (2016) (pdf).
  • rCUR: an R package for CUR matrix decomposition,
  • A. Bodor, I. Csabai, M. W. Mahoney, and N. Solymosi,
    BMC Bioinformatics, 13:103 (2012) (pdf), (code).
  • Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1203.0786 (2012) (arXiv),
    Proc. of the 2012 ACM Symposium on Principles of Database Systems, 143-154, 2012 (pdf).
  • On the Hyperbolicity of Small-World and Tree-Like Random Graphs,
  • W. Chen, W. Fang, G. Hu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1201.1717 (2012) (arXiv),
    Proc. of the 23-rd ISAAC 278-288 (2012) (pdf),
    Internet Mathematics, 9(4), 434-491 (2013) (pdf).

2011

  • Randomized Dimensionality Reduction for K-means Clustering,
  • C. Boutsidis, A. Zouzias, M. W. Mahoney, and P. Drineas,
    Technical Report, Preprint: arXiv:1110.2897 (2011) (arXiv),
    IEEE Transactions on Information Theory, 61(2), 1045-1062 (2015) (pdf).
  • Regularized Laplacian Estimation and Fast Eigenvector Approximation,
  • P. O. Perry and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1110.1757 (2011) (arXiv),
    Proc. of the 2011 NIPS Conference (pdf).
  • LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems,
  • X. Meng, M. A. Saunders, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1109.5981 (2011) (arXiv), (code),
    SIAM J. Scientific Computing, 36(2), C95-C118 (2014) (pdf).
  • Fast approximation of matrix coherence and statistical leverage,
  • P. Drineas, M. Magdon-Ismail, M. W. Mahoney, and D. P. Woodruff,
    Technical Report, Preprint: arXiv:1109.3843 (2011) (arXiv),
    Proc. of the 29th ICML Conference (2012) (pdf),
    J. Machine Learning Research, 13, 3475-3506 (2012) (pdf).
  • Localization on low-order eigenvectors of data matrices,
  • M. Cucuringu and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1109.1355 (2011) (arXiv).
  • Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation,
  • A. Javed, P. Drineas, M. W. Mahoney, and P. Paschou,
    Annals of Human Genetics, 75, 707-722 (2011) (pdf).
  • Randomized Algorithms for Matrices and Data,
  • M. W. Mahoney,
    Foundations and Trends in Machine Learning, NOW Publishers, Volume 3, Issue 2, 2011 (now),
    TR version: Technical Report, Preprint: arXiv:1104.5557 (2011) (arXiv).
    (Abridged version in: Advances in Machine Learning and Data Mining for Astronomy, edited by M. J. Way, et al., pp. 647-672, 2012.)

2010

  • Computation in Large-Scale Scientific and Internet Data Applications is a Focus of MMDS 2010,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1012.4231 (2010) (arXiv),
    Appeared in SIGKDD Explorations, SIGACT News, ASA-SCGN Newsletter, and IMS Bulletin.
  • CUR from a Sparse Optimization Viewpoint,
  • J. Bien, Y. Xu, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1011.0413 (2010) (arXiv),
    Proc. of the 2010 NIPS Conference (ps, pdf).
  • Algorithmic and Statistical Perspectives on Large-Scale Data Analysis,
  • M. W. Mahoney,
    Technical Report, Preprint: arXiv:1010.1609 (2010) (arXiv),
    In: Combinatorial Scientific Computing, pp. 427-469, edited by U. Naumann and O. Schenk, 2012.
  • Implementing regularization implicitly via approximate eigenvector computation,
  • M. W. Mahoney and L. Orecchia,
    Technical Report, Preprint: arXiv:1010.0703 (2010) (arXiv),
    Proc. of the 28th ICML Conference, 121-128 (2011) (pdf) (talk).
  • Approximating Higher-Order Distances Using Random Projections,
  • P. Li, M. W. Mahoney, and Y. She,
    Proc. of the 26th UAI Conference, 312-321 (2010) (ps, pdf),
    Technical Report, Preprint: arXiv:1203.3492 (2012) (arXiv).
  • Effective Resistances, Statistical Leverage, and Applications to Linear Equation Solving,
  • P. Drineas and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1005.3097 (2010) (arXiv).
  • Empirical Comparison of Algorithms for Network Community Detection,
  • J. Leskovec, K. J. Lang, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:1004.3539 (2010) (arXiv),
    Proc. of the 19-th International WWW, 631-640 (2010) (ps, pdf).

2009

  • A Local Spectral Method for Graphs: with Applications to Improving Graph Partitions and Exploring Data Graphs Locally,
  • M. W. Mahoney, L. Orecchia, and N. K. Vishnoi,
    Technical Report, Preprint: arXiv:0912.0681 (2009) (arXiv),
    J. Machine Learning Research, 13, 2339-2365 (2012) (ps, pdf).
  • Unsupervised Feature Selection for the k-means Clustering Problem,
  • C. Boutsidis, M. W. Mahoney, and P. Drineas,
    Proc. of the 2009 NIPS Conference (ps, pdf).
  • Learning with Spectral Kernels and Heavy-Tailed Data,
  • M. W. Mahoney and H. Narayanan,
    Technical Report, Preprint: arXiv:0906.4539 (2009) (arXiv).
  • Empirical Evaluation of Graph Partitioning Using Spectral Embeddings and Flow,
  • K. J. Lang, M. W. Mahoney, and L. Orecchia,
    Proc. of the 8-th International SEA, 197-208 (2009) (ps, pdf).
  • CUR Matrix Decompositions for Improved Data Analysis,
  • M. W. Mahoney and P. Drineas,
    Proc. Natl. Acad. Sci. USA, 106, 697-702 (2009) (ps, pdf).

2008

  • An Improved Approximation Algorithm for the Column Subset Selection Problem,
  • C. Boutsidis, M. W. Mahoney, and P. Drineas,
    Technical Report, Preprint: arXiv:0812.4293 (2008) (arXiv),
    Proc. of the 20-th Annual SODA, 968-977 (2009) (ps, pdf).
  • Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis are the Focus of MMDS 2008,
  • M. W. Mahoney, L.-H. Lim, and G. E. Carlsson,
    Technical Report, Preprint: arXiv:0812.3702 (2008) (arXiv),
    Appeared in SIGKDD Explorations (ps, pdf), SIAM News (ps, pdf), and ASA-SCGN Newsletter (ps, pdf), and abridged versions appeared in IMS Bulletin (ps, pdf) and AmStat News.
  • Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,
  • J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:0810.1355 (2008) (arXiv),
    Internet Mathematics, 6(1), 29-123 (2009) (pdf).
  • Unsupervised Feature Selection for Principal Components Analysis,
  • C. Boutsidis, M. W. Mahoney, and P. Drineas,
    Proc. of the 14-th Annual SIGKDD, 61-69 (2008) (ps, pdf).
  • Statistical Properties of Community Structure in Large Social and Information Networks,
  • J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
    Proc. of the 17-th International WWW, 695-704 (2008) (ps, pdf).

2007

  • Faster Least Squares Approximation,
  • P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarlos,
    Technical Report, Preprint: arXiv:0710.1435 (2007) (arXiv),
    Numerische Mathematik, 117, 219-249 (2011) (pdf).
  • PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations,
  • P. Paschou, E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron, M. W. Mahoney, and P. Drineas,
    PLoS Genetics, 3, 1672-1686 (2007) (ps, pdf).
  • Relative-Error CUR Matrix Decompositions,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Technical Report, Preprint: arXiv:0708.3696 (2007) (arXiv),
    SIAM J. Matrix Analysis and Applications, 30, 844-881 (2008) (ps, pdf).
  • Feature Selection Methods for Text Classification,
  • A. Dasgupta, P. Drineas, B. Harb, V. Josifovski, and M. W. Mahoney,
    Proc. of the 13-th Annual SIGKDD, 230-239 (2007) (ps, pdf).
  • Sampling Algorithms and Coresets for Lp Regression,
  • A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. W. Mahoney,
    Technical Report, Preprint: arXiv:0707.1714 (2007) (arXiv),
    Proc. of the 19-th Annual SODA, 932-941 (2008) (ps, pdf),
    SIAM J. Computing, 38, 2060-2078 (2009) (ps, pdf).
  • Web Information Retrieval and Linear Algebra Algorithms,
  • A. Frommer, M. W. Mahoney, and D. B. Szyld (Eds.),
    Proc. of Dagstuhl Seminar 07071, (2007) (web).
  • Intra- and interpopulation genotype reconstruction from tagging SNPs,
  • P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas,
    Genome Research, 17(1), 96-107 (2007) (ps, pdf).

2006

  • Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications,
  • G. H. Golub, M. W. Mahoney, P. Drineas, and L.-H. Lim,
    SIAM News 39:8 October 2006 (ps, pdf).
  • Randomized Algorithms for Matrices and Massive Data Sets,
  • P. Drineas and M. W. Mahoney,
    Proc. of the 32-nd Annual VLDB, 1269 (2006) (ps, pdf).
  • Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Proc. of the 14-th Annual ESA, 304-314 (2006) (ps, pdf).
  • Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Proc. of the 10-th Annual RANDOM, 316-326 (2006) (ps, pdf).
  • Tensor-CUR Decompositions For Tensor-Based Data,
  • M. W. Mahoney, M. Maggioni, and P. Drineas,
    Proc. of the 12-th Annual SIGKDD, 327-336 (2006) (ps, pdf),
    SIAM J. Matrix Analysis and Applications, 30, 957-987 (2008) (ps, pdf).
  • Polynomial Time Algorithm for Column-Row-Based Relative-Error Low-Rank Matrix Approximation,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Technical Report, DIMACS TR 2006-04 March 2006 (ps, pdf).
  • Sampling Algorithms for L2 Regression and Applications,
  • P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
    Proc. of the 17-th Annual SODA, 1127-1136 (2006) (ps, pdf).

2005

  • A Randomized Algorithm for a Tensor-Based Generalization of the Singular Value Decomposition,
  • P. Drineas and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1327, June 2005 (ps, pdf),
    Linear Algebra and its Applications, 420, 553-571 (2007) (ps, pdf).
  • On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,
  • P. Drineas and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1319, April 2005 (ps, pdf),
    Proc. of the 18-th Annual COLT, 323-337 (2005) (ps, pdf),
    J. Machine Learning Research, 6, 2153-2175 (2005) (ps, pdf).

2004

  • Sampling Sub-problems of Heterogeneous Max-Cut Problems and Approximation Algorithms,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1283, April 2004 (ps, pdf),
    Proc. of the 22-nd Annual STACS, 57-68 (2005) (ps, pdf),
    Random Structures and Algorithms, 32:3, 307-333 (2008) (ps, pdf).
  • Fast Monte Carlo Algorithms for Matrices III: Computing an Efficient Approximate Decomposition of a Matrix,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1271, February 2004 (ps, pdf),
    SIAM J. Computing, 36, 184-206 (2006) (ps, pdf).
  • Fast Monte Carlo Algorithms for Matrices II: Computing Low-Rank Approximations to a Matrix,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1270, February 2004 (ps, pdf),
    SIAM J. Computing, 36, 158-183 (2006) (ps, pdf).
  • Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication,
  • P. Drineas, R. Kannan, and M. W. Mahoney,
    Technical Report, YALEU/DCS/TR-1269, February 2004 (ps, pdf),
    SIAM J. Computing, 36, 132-157 (2006) (ps, pdf).

2003

  • Rapid Mixing of Several Markov Chains for a Hard-Core Model,
  • R. Kannan, M. W. Mahoney, and R. Montenegro,
    Proc. of the 14-th Annual ISAAC, 663-675 (2003) (pdf).

2001

  • Quantum, Intramolecular Flexibility, and Polarizability Effects on the Reproduction of the Density Anomaly of Liquid Water by Simple Potential Functions,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 115, 10758-10768 (2001) (pdf).
  • Rapid Estimation of Electronic Degrees of Freedom in Monte Carlo Calculations for Polarizable Models of Liquid Water,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 114, 9337-9349 (2001) (pdf).
  • Diffusion Constant of the TIP5P Model of Liquid Water,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 114, 363-366 (2001) (pdf).

2000

  • A Five-Site Model for Liquid Water and the Reproduction of the Density Anomaly by Rigid, Nonpolarizable Potential Functions,
  • M. W. Mahoney and W. L. Jorgensen,
    J. Chem. Phys., 112, 8910-8922 (2000) (pdf).

1997

  • Repression and Activation of Promoter-Bound RNA Polymerase Activity by Gal Repressor,
  • H. E. Choy, R. R. Hanger, T. Aki, M. Mahoney, K. Murakami, A. Ishihama, and S. Adhya,
    J. Mol. Biol. 272: 293-300, 1997 (pdf).
  • Discrete Representations of the Protein C-alpha Chain,
  • X. F. de la Cruz, M. W. Mahoney, and B. K. Lee,
    Fold. & Des. 2: 223-234, 1997 (pdf).