Michael Mahoney  Publications
2019

Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks,

C. H. Martin and M. W. Mahoney,

Proc. of the 25th Annual SIGKDD, 000000 (2019)
(pdf).

On Linear Convergence of Weighted Kernel Herding,

R. Khanna and M. W. Mahoney,

Technical Report, Preprint: arXiv:1907.08410 (2019)
(arXiv),

Statistical guarantees for local graph clustering,

W. Ha, K. Fountoulakis, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1906.04863 (2019)
(arXiv),

ANODEV2: A Coupled Neural ODE Evolution Framework,

T. Zhang, Z. Yao, A. Gholami, K. Keutzer, J. Gonzalez, G. Biros, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1906.04596 (2019)
(arXiv),
(code),

Bayesian experimental design using regularized determinantal point processes,

M. Derezinski, F. Liang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1906.04133 (2019)
(arXiv),

Distributed estimation of the inverse Hessian by determinantal averaging,

M. Derezinski and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.11546 (2019)
(arXiv),

Residual Networks as Nonlinear Systems: Stability Analysis using Linearization,

K. Rothauge, Z. Yao, Z. Hu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.13386 (2019)
(arXiv),

Parallel and Communication Avoiding Least Angle Regression,

S. Das, J. Demmel, K. Fountoulakis, L. Grigori, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.11340 (2019)
(arXiv),

Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction,

N. B. Erichson, M. Muehlebach, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1905.10866 (2019)
(arXiv),

HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision,

Z. Dong, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer,

Technical Report, Preprint: arXiv:1905.03696 (2019)
(arXiv),

Accepted for publication, Proc. ICCV 2019.

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks,

N. B. Erichson, Z. Yao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1904.03750 (2019)
(arXiv),

OverSketched Newton: Fast Convex Optimization for Serverless Systems,

V. Gupta, S. Kadhe, T. Courtade, M. W. Mahoney, and K. Ramchandran,

Technical Report, Preprint: arXiv:1903.08857 (2019)
(arXiv),

Inefficiency of K-FAC for Large Batch Size Training,

L. Ma, G. Montague, J. Ye, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1903.06237 (2019)
(arXiv),

Sub-Sampled Newton Methods,

F. Roosta-Khorasani and M. W. Mahoney,

Mathematical Programming, 174(1-2): 293-326 (2019)
(pdf).

Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data,

N. B. Erichson, L. Mathelin, Z. Yao, S. L. Brunton, M. W. Mahoney, and J. N. Kutz,

Technical Report, Preprint: arXiv:1902.07358 (2019)
(arXiv),

Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression,

M. Derezinski, K. L. Clarkson, M. W. Mahoney, and M. K. Warmuth,

Technical Report, Preprint: arXiv:1902.00995 (2019)
(arXiv),

Accepted for publication, Proc. COLT 2019.

Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1901.08278 (2019)
(arXiv),
(code),

Traditional and Heavy-Tailed Self Regularization in Neural Network Models,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1901.08276 (2019)
(arXiv),
(iclr19),
(code),

Proc. of the 36th ICML Conference 4284-4293 (2019)
(pdf).
2018

Trust Region Based Adversarial Attack on Neural Networks,

Z. Yao, A. Gholami, P. Xu, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1812.06371 (2018)
(arXiv),
(code),

Accepted for publication, Proc. CVPR 2019.

Parameter Re-Initialization through Cyclical Batch Size Schedules,

N. Mu, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1812.01216 (2018)
(arXiv),

Presented in the Systems for Machine Learning Workshop at the 2018 NeurIPS Conference.

On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent,

N. Golmant, N. Vemuri, Z. Yao, V. Feinberg, A. Gholami, K. Rothauge, M. W. Mahoney, and J. Gonzalez,

Technical Report, Preprint: arXiv:1811.12941 (2018)
(arXiv),
(iclr19),

The Mathematics of Data,

M. W. Mahoney, J. C. Duchi, and A. C. Gilbert, Eds.

AMS, IAS/PCMI, and SIAM (2018)
(web),
(intro).

A Short Introduction to Local Graph Clustering Methods and Software,

K. Fountoulakis, D. F. Gleich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.07324 (2018)
(arXiv),

Absts. of the 7th Intl. Conference on Complex Networks and Their Applications
(pdf),
(code).

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning,

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.01075 (2018)
(arXiv),
(code),

Journal version submitted for publication.

Large batch size training of neural networks with adversarial training and second-order information,

Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.01021 (2018)
(arXiv),
(iclr19),
(code),

Newton-MR: Newton's Method Without Smoothness or Convexity,

F. Roosta, Y. Liu, P. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1810.00303 (2018)
(arXiv),

Distributed Second-order Convex Optimization,

C.-H. Fang, S. B. Kylasa, F. Roosta-Khorasani, M. W. Mahoney, and A. Grama,

Technical Report, Preprint: arXiv:1807.07132 (2018)
(arXiv),
(code),

Alchemist: An Apache Spark <=> MPI Interface,

A. Gittens, K. Rothauge, M. W. Mahoney, S. Wang, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,

Technical Report, Preprint: arXiv:1806.01270 (2018)
(arXiv),

Concurrency and Computation: Practice and Experience (Special Issue of the Cray User Group, CUG 2018), e5026 (2018)
(pdf).

Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist,

A. Gittens, K. Rothauge, S. Wang, M. W. Mahoney, L. Gerhardt, Prabhat, J. Kottalam, M. Ringenburg, and K. Maschhoff,

Technical Report, Preprint: arXiv:1805.11800 (2018)
(arXiv),

Proc. of the 24th Annual SIGKDD, 293-301 (2018)
(pdf).

Group Collaborative Representation for Image Set Classification,

B. Liu, L. Jing, J. Li, J. Yu, A. Gittens, and M. W. Mahoney,

International Journal of Computer Vision, 126 (2018)
(pdf).

Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap,

M. E. Lopes, S. Wang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1803.08021 (2018)
(arXiv),

Proc. of the 35th ICML Conference 3223-3232 (2018)
(pdf)

Journal version submitted for publication.

GPU Accelerated Sub-Sampled Newton's Method,

S. B. Kylasa, F. Roosta-Khorasani, M. W. Mahoney, and A. Grama,

Technical Report, Preprint: arXiv:1802.09113 (2018)
(arXiv),
(code),

Proc. 2019 SDM, 702-710 (2019)
(pdf).

Hessian-based Analysis of Large Batch Training and Robustness to Adversaries,

Z. Yao, A. Gholami, Q. Lei, K. Keutzer, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1802.08241 (2018)
(arXiv),

Proc. of the 2018 NeurIPS Conference, 4954-4964 (2018)
(pdf).

Inexact Non-Convex Newton-Type Methods,

Z. Yao, P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1802.06925 (2018)
(arXiv),

Journal version submitted for publication.

Out-of-sample extension of graph adjacency spectral embedding,

K. Levin, F. Roosta-Khorasani, M. W. Mahoney, and C. E. Priebe,

Technical Report, Preprint: arXiv:1802.06307 (2018)
(arXiv),

Proc. of the 35th ICML Conference 2981-2990 (2018)
(pdf),

Journal version submitted for publication.
2017

Lectures on Randomized Numerical Linear Algebra,

P. Drineas and M. W. Mahoney,

Technical Report, Preprint: arXiv:1712.08880 (2017)
(arXiv),

In: Lectures of the 2016 PCMI Summer School on Mathematics of Data.

Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization,

A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1712.06047 (2017)
(arXiv),

Proc. of the 2018 IPDPS Conference 409-418 (2018)
(pdf).

Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior,
(blog)

C. H. Martin and M. W. Mahoney,

Technical Report, Preprint: arXiv:1710.09553 (2017)
(arXiv),
(iclr18),

Journal version submitted for publication.

LASAGNE: Locality And Structure Aware Graph Node Embedding,

E. Faerman, F. Borutta, K. Fountoulakis, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1710.06520 (2017)
(arXiv),

Proc. 2018 International Conference on Web Intelligence, 246-253 (2018)
(pdf). (Awarded the Best Student Paper Award.)

A Berkeley View of Systems Challenges for AI,

I. Stoica, D. Song, R. A. Popa, D. A. Patterson, M. W. Mahoney, R. H. Katz, A. D. Joseph, M. Jordan, J. M. Hellerstein, J. Gonzalez, K. Goldberg, A. Ghodsi, D. E. Culler, and P. Abbeel,

Technical Report No. UCB/EECS-2017-159, October 2017
(www),

Technical Report, Preprint: arXiv:1712.05855 (2017)
(arXiv).

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization,

S. Wang, F. Roosta-Khorasani, P. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1709.03528 (2017)
(arXiv),
(Spark code),
(Python code),

Proc. of the 2018 NeurIPS Conference, 2338-2348 (2018)
(pdf).

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study,

P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1708.07827 (2017)
(arXiv),
(code).

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information,

P. Xu, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1708.07164 (2017)
(arXiv),

Accepted for publication, Mathematical Programming.

A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication,

M. E. Lopes, S. Wang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1708.01945 (2017)
(arXiv),

J. Machine Learning Research, 20(39): 1−40 (2019)
(pdf).

Capacity releasing diffusions for speed and locality,

D. Wang, K. Fountoulakis, M. Henzinger, M. W. Mahoney, and S. Rao,

Technical Report, Preprint: arXiv:1706.05826 (2017)
(arXiv),

Proc. of the 34th ICML Conference 3598-3607 (2017)
(pdf)
(supp)
(talk).

Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds,

S. Wang, A. Gittens, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1706.02803 (2017)
(arXiv),

J. Machine Learning Research, 20(12): 1-49 (2019)
(pdf).

Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction,

K. E. Bouchard, A. F. Bujan, F. Roosta-Khorasani, S. Ubaru, Prabhat, A. M. Snijders, J.-H. Mao, E. F. Chang, M. W. Mahoney, S. Bhattacharyya,

Technical Report, Preprint: arXiv:1705.07585 (2017)
(arXiv),

Proc. of the 2017 NIPS Conference, 1078-1086 (2017)
(pdf).

Skip-Gram - Zipf + Uniform = Vector Additivity,

A. Gittens, D. Achlioptas, and M. W. Mahoney,

Proc. of the 55th ACL Meeting 69-76 (2017)
(pdf).

Principles and Applications of Science of Information [Scanning the Issue],

T. Courtade, A. Grama, M. W. Mahoney, and T. Weissman,

Proceedings of the IEEE, 105(2): 183-188 (2017)
(pdf).

Social Discrete Choice Models,

D. Zhang, K. Fountoulakis, J. Cao, M. Yin, M. W. Mahoney, and A. Pozdnoukhov,

Technical Report, Preprint: arXiv:1703.07520 (2017)
(arXiv).

Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging,

S. Wang, A. Gittens, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1702.04837 (2017)
(arXiv),

Proc. of the 34th ICML Conference 3608-3616 (2017)
(pdf),

J. Machine Learning Research, 18(218): 1-50 (2018)
(pdf).
2016

Avoiding communication in primal and dual block coordinate descent methods,

A. Devarakonda, K. Fountoulakis, J. Demmel, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1612.04003 (2016)
(arXiv),

SIAM J. Scientific Computing, 41(1), C1-C27 (2019)
(pdf).

Feature-distributed sparse regression: a screen-and-clean approach,

J. Yang, M. W. Mahoney, M. A. Saunders, and Y. Sun,

Proc. of the 2016 NIPS Conference, 2711-2719 (2016)
(pdf).

Multi-label learning with semantic embeddings,

L. Jing, M. Cheng, L. Yang, A. Gittens, M. W. Mahoney,

ICLR 2017 OpenReview.net
(iclr17),

Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data,

D. Lawlor, T. Budavari, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1609.03932 (2016)
(arXiv),

The Astrophysical Journal, 833:1, 26 (2016)
(pdf).

Lecture Notes on Spectral Graph Methods,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1608.04845 (2016)
(arXiv),

Lecture Notes on Randomized Linear Algebra,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1608.04481 (2016)
(arXiv),

An optimization approach to locally-biased graph algorithms,

K. Fountoulakis, D. F. Gleich, M. W. Mahoney,

Technical Report, Preprint: arXiv:1607.04940 (2016)
(arXiv),

Proceedings of the IEEE, 105(2): 256-272 (2017)
(pdf).

DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection,

L. Jing, B. Liu, J. Choi, A. Janin, J. Bernd, M. W. Mahoney, and G. Friedland,

Technical Report, Preprint: arXiv:1607.04378 (2016)
(arXiv),

Proc. of the 2016 ACM Multimedia Conference 57-61 (2016)
(pdf),

IEEE Transactions on Multimedia, 19(12): 2637-2650 (2017)
(pdf).

Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies,

A. Gittens, A. Devarakonda, E. Racah, M. Ringenburg, L. Gerhardt, J. Kottalam, J. Liu, K. Maschhoff, S. Canon, J. Chhugani, P. Sharma, J. Yang, J. Demmel, J. Harrell, V. Krishnamurthy, M. W. Mahoney, and Prabhat,

Technical Report, Preprint: arXiv:1607.01335 (2016)
(arXiv),
(code),

Proc. 2016 IEEE BigData, 204-213 (2016)
(pdf).

Subsampled Newton Methods with Nonuniform Sampling,

P. Xu, J. Yang, F. Roosta-Khorasani, C. Re, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1607.00559 (2016)
(arXiv),

Proc. of the 2016 NIPS Conference, 3000-3008 (2016)
(pdf).

Approximating the Solution to Mixed Packing and Covering LPs in parallel Õ(ε^{-3}) time,

M. W. Mahoney, S. Rao, D. Wang, and P. Zhang,

Proc. of the 43rd ICALP Conference, 52:1-52:14 (2016)
(pdf).

A Simple and Strongly-Local Flow-Based Method for Cut Improvement,

N. Veldt, D. F. Gleich, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1605.08490 (2016)
(arXiv),

Proc. of the 33rd ICML Conference 1938-1947 (2016)
(pdf),
(supp).

RandNLA: Randomized Numerical Linear Algebra,

P. Drineas and M. W. Mahoney,

Communications of the ACM, 59, 80-90 (2016)
(pdf).

FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods,

X. Cheng, F. Roosta-Khorasani, S. Palombo, P. L. Bartlett, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1605.08108 (2016)
(arXiv),

Proc. of the 21st International Conference on AISTATS, PMLR 84:404-414 (2018)
(pdf,
supp).

Parallel Local Graph Clustering,

J. Shun, F. Roosta-Khorasani, K. Fountoulakis, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1604.07515 (2016)
(arXiv),

Proceedings of the VLDB Endowment, 9(12) 1041-1052 (2016)
(pdf).

A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark,

A. Gittens, J. Kottalam, J. Yang, M. F. Ringenburg, J. Chhugani, E. Racah, M. Singh, Y. Yao, C. Fischer, O. Ruebel, B. Bowen, N. G. Lewis, M. W. Mahoney, V. Krishnamurthy, and Prabhat,

Proc. 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, at IPDPS,
2016
(pdf).

Mining Large Graphs,

D. F. Gleich and M. W. Mahoney,

In
Handbook of Big Data.
pp. 191-220,
edited by
P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan,
Chapman and Hall/CRC Press,
2016
(pdf).

Structural properties underlying high-quality Randomized Numerical Linear Algebra algorithms,

M. W. Mahoney and P. Drineas,

In
Handbook of Big Data.
pp. 137-154,
edited by
P. Buhlmann, P. Drineas, M. Kane, and M. van de Laan,
Chapman and Hall/CRC Press,
2016
(pdf).

Variational Perspective on Local Graph Clustering,

K. Fountoulakis, X. Cheng, J. Shun, F. Roosta-Khorasani, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1602.01886 (2016)
(arXiv),

Mathematical Programming, 174(1-2): 553-573 (2019)
(pdf).

Sub-Sampled Newton Methods II: Local Convergence Rates,

F. Roosta-Khorasani and M. W. Mahoney,

Technical Report, Preprint: arXiv:1601.04738 (2016)
(arXiv).

Sub-Sampled Newton Methods I: Globally Convergent Algorithms,

F. Roosta-Khorasani and M. W. Mahoney,

Technical Report, Preprint: arXiv:1601.04737 (2016)
(arXiv).

RandNLA, Pythons, and the CUR for Your Data Problems: Reporting from G2S3 2015 in Delphi,

E. Gallopoulos, P. Drineas, I. Ipsen, and M. W. Mahoney,

SIAM News 49:1 January/February 2016
(web),
(pdf).
2015

Faster Parallel Solver for Positive Linear Programs via Dynamically-Bucketed Selective Coordinate Descent,

D. Wang, M. W. Mahoney, N. Mohan, and S. Rao,

Technical Report, Preprint: arXiv:1511.06468 (2015)
(arXiv).

A Local Perspective on Community Structure in Multilayer Networks,

L. G. S. Jeub, M. W. Mahoney, P. J. Mucha, and M. A. Porter,

Technical Report, Preprint: arXiv:1510.05185 (2015)
(arXiv),

Network Science, 5(2): 144-163, 2017
(pdf).

Optimal Subsampling Approaches for Large Sample Linear Regression,

R. Zhu, P. Ma, M. W. Mahoney, and B. Yu,

Technical Report, Preprint: arXiv:1509.05111 (2015)
(arXiv).

Unified Acceleration Method for Packing and Covering Problems via Diameter Reduction,

D. Wang, S. Rao, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1508.02439 (2015)
(arXiv),

Proc. of the 43rd ICALP Conference, 50:1-50:13 (2016)
(pdf).

Using local spectral methods to robustify graph-based learning algorithms,

D. F. Gleich and M. W. Mahoney,

Proc. of the 21st Annual SIGKDD, 359-368 (2015)
(pdf)
(code).

Structured Block Basis Factorization for Scalable Kernel Matrix Evaluation,

R. Wang, Y. Li, M. W. Mahoney, and E. Darve,

Technical Report, Preprint: arXiv:1502.03571 (2015)
(arXiv).

Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions,

J. Yang, O. Rubel, Prabhat, M. W. Mahoney, and B. P. Bowen,

Analytical Chemistry, 87 (9), 4658-4666 (2015)
(pdf)
(code).

Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nystrom Method,

D. G. Anderson, S. S. Du, M. W. Mahoney, C. Melgaard, K. Wu, and M. Gu,

Proc. of the 18th International Conference on AISTATS, PMLR 38:19-27 (2015)
(pdf,
supp)
(code).

Weighted SGD for Lp Regression with Randomized Preconditioning,

J. Yang, Y.-L. Chow, C. Re, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1502.03571 (2015)
(arXiv),

Proc. of the 27th Annual SODA, 558-569 (2016)
(pdf),

J. Machine Learning Research, 18(211): 1-43 (2018)
(pdf).

Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments,

J. Yang, X. Meng, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1502.03032 (2015)
(arXiv)
(code),

Proceedings of the IEEE 104(1): 58-92 (2016)
(pdf).
2014

Tree decompositions and social graphs,

A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1411.1546 (2014)
(arXiv),
(code).

Internet Mathematics, 12(5), 315-361 (2016)
(pdf).

Fast Randomized Kernel Methods With Statistical Guarantees,

A. El Alaoui and M. W. Mahoney,

Technical Report, Preprint: arXiv:1411.0306 (2014)
(arXiv),

Proc. of the 2015 NIPS Conference, 775-783 (2015)
(pdf).

Signal Processing for Big Data (Editorial for Special Issue)

G. B. Giannakis, F. Bach, R. Cendrillon, M. Mahoney, and J. Neville,

IEEE Signal Processing Magazine, 31: 15-16 (September 2014)
(pdf).

A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares,

G. Raskutti and M. W. Mahoney,

Technical Report, Preprint: arXiv:1406.5986 (2014)
(arXiv),

Proc. of the 32nd ICML Conference, 617-625 (2015)
(pdf),

J. Machine Learning Research, 17(214): 1-31 (2016)
(pdf).

Random Laplace Feature Maps for Semigroup Kernels on Histograms,

J. Yang, V. Sindhwani, Q. Fan, H. Avron, and M. W. Mahoney,

Proc. of the 27th CVPR Conference, 971-978 (2014)
(pdf).

Anti-differentiating Approximation Algorithms: A case study with Min-cuts, Spectral, and Flow,

D. F. Gleich and M. W. Mahoney,

Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 1018-1025 (2014)
(pdf)
(code, code)
(talk).

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels,

J. Yang, V. Sindhwani, H. Avron, and M. W. Mahoney,

Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 485-493 (2014)
(pdf),
(code),

Technical Report, Preprint: arXiv:1412.8293 (2014)
(arXiv),

J. Machine Learning Research, 17(120): 1-38 (2016)
(pdf).

Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks,

L. G. S. Jeub, P. Balachandran, M. A. Porter, P. J. Mucha, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1403.3795 (2014)
(arXiv),
(code, code),

Physical Review E, 91, 012821 (2015)
(pdf).

A new spin on an old algorithm: technical perspective on "Communication costs of Strassen's matrix multiplication,"

M. W. Mahoney,

Communications of the ACM, 57(2): 106 (2014)
(pdf).
2013

Tree-like Structure in Large Social and Information Networks,

A. B. Adcock, B. D. Sullivan, and M. W. Mahoney,

Proc. of the 2013 IEEE ICDM, 1-10 (2013)
(pdf).

Objective Identification of Informative Wavelength Regions in Galaxy Spectra,

C.-W. Yip, M. W. Mahoney, A. S. Szalay, I. Csabai, T. Budavari, R. F. G. Wyse, and L. Dobos,

Technical Report, Preprint: arXiv:1312.0637 (2013)
(arXiv),

Astronomical Journal, 147, 5, 110 (2014)
(pdf).

Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity,

A. B. Adcock, B. D. Sullivan, O. R. Hernandez, and M. W. Mahoney,

Proc. of the 9th IWOMP, 71-83 (2013)
(pdf).

Frontiers in Massive Data Analysis,

Committee on the Analysis of Massive Data, et al. (M. I. Jordan, et al.),

The National Academies Press (2013)
(pdf),
(web).

A Statistical Perspective on Algorithmic Leveraging,

P. Ma, M. W. Mahoney, and B. Yu,

Technical Report, Preprint: arXiv:1306.5362 (2013)
(arXiv),

Proc. of the 31st ICML Conference, JMLR W&CP 32 (1): 91-99 (2014)
(pdf),

J. Machine Learning Research, 16, 861-911 (2015)
(pdf).

Robust Regression on MapReduce,

X. Meng, and M. W. Mahoney,

Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 888-896 (2013)
(pdf).

Quantile Regression for Large-scale Applications,

J. Yang, X. Meng, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1305.0087 (2013)
(arXiv),
(code),

Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 881-887 (2013)
(pdf),

SIAM J. Scientific Computing, 36(5), S78-S110 (2014)
(pdf).

Revisiting the Nystrom Method for Improved Large-Scale Machine Learning,

A. Gittens and M. W. Mahoney,

Technical Report, Preprint: arXiv:1303.1849 (2013)
(arXiv),
(code),

Proc. of the 30th ICML Conference, JMLR W&CP 28(3): 567-575 (2013)
(pdf),

J. Machine Learning Research, 17(117): 1-65 (2016)
(pdf).
2012

Semi-supervised Eigenvectors for Large-scale Locally-biased Learning,

T. J. Hansen and M. W. Mahoney,

Proc. of the 2012 NIPS Conference, 2528-2536 (2012)
(pdf),
(code),

Technical Report, Preprint: arXiv:1304.7528 (2013)
(arXiv),

J. Machine Learning Research, 15, 3691-3734 (2014)
(pdf).

Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression,

X. Meng and M. W. Mahoney,

Technical Report, Preprint: arXiv:1210.3135 (2012)
(arXiv),

Proc. of the 45th STOC, 91-100 (2013)
(pdf).

The Fast Cauchy Transform and Faster Robust Linear Regression,

K. L. Clarkson, P. Drineas, M. Magdon-Ismail, M. W. Mahoney, X. Meng, and D. P. Woodruff,

Technical Report, Preprint: arXiv:1207.4684 (2012)
(arXiv),

Proc. of the 24th Annual SODA, 466-477 (2013)
(pdf),

SIAM J. Computing, 45, 763-810 (2016)
(pdf).

rCUR: an R package for CUR matrix decomposition,

A. Bodor, I. Csabai, M. W. Mahoney, and N. Solymosi,

BMC Bioinformatics, 13:103 (2012)
(pdf),
(code).

Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1203.0786 (2012)
(arXiv),

Proc. of the 2012 ACM Symposium on Principles of Database Systems, 143-154, 2012
(pdf).

On the Hyperbolicity of Small-World and Tree-Like Random Graphs,

W. Chen, W. Fang, G. Hu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1201.1717 (2012)
(arXiv),

Proc. of the 23rd ISAAC 278-288 (2012)
(pdf),

Internet Mathematics, 9(4), 434-491 (2013)
(pdf).
2011

Randomized Dimensionality Reduction for K-means Clustering,

C. Boutsidis, A. Zouzias, M. W. Mahoney, and P. Drineas,

Technical Report, Preprint: arXiv:1110.2897 (2011)
(arXiv),

IEEE Transactions on Information Theory, 61(2), 1045-1062 (2015)
(pdf).

Regularized Laplacian Estimation and Fast Eigenvector Approximation,

P. O. Perry and M. W. Mahoney,

Technical Report, Preprint: arXiv:1110.1757 (2011)
(arXiv),

Proc. of the 2011 NIPS Conference, 2420-2428 (2011)
(pdf).

LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems,

X. Meng, M. A. Saunders, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1109.5981 (2011)
(arXiv),
(code),

SIAM J. Scientific Computing, 36(2), C95-C118 (2014)
(pdf).

Fast approximation of matrix coherence and statistical leverage,

P. Drineas, M. Magdon-Ismail, M. W. Mahoney, and D. P. Woodruff,

Technical Report, Preprint: arXiv:1109.3843 (2011)
(arXiv),

Proc. of the 29th ICML Conference, 1051-1058 (2012)
(pdf),

J. Machine Learning Research, 13, 3475-3506 (2012)
(pdf).

Localization on low-order eigenvectors of data matrices,

M. Cucuringu and M. W. Mahoney,

Technical Report, Preprint: arXiv:1109.1355 (2011)
(arXiv).

Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation,

A. Javed, P. Drineas, M. W. Mahoney, and P. Paschou,

Annals of Human Genetics, 75, 707-722 (2011)
(pdf).

Randomized Algorithms for Matrices and Data,

M. W. Mahoney,

Foundations and Trends in Machine Learning,
NOW Publishers,
Volume 3, Issue 2, 2011
(now),

TR version:
Technical Report, Preprint: arXiv:1104.5557 (2011)
(arXiv).

(Abridged version in:
Advances in Machine Learning and Data Mining for Astronomy,
edited by
M. J. Way, et al.,
pp. 647-672,
2012.)
2010

Computation in Large-Scale Scientific and Internet Data Applications is a Focus of MMDS 2010,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1012.4231 (2010)
(arXiv),

Appeared in
SIGKDD Explorations,
SIGACT News,
ASASCGN Newsletter,
and IMS Bulletin.

CUR from a Sparse Optimization Viewpoint,

J. Bien, Y. Xu, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1011.0413 (2010)
(arXiv),

Proc. of the 2010 NIPS Conference, 217-225 (2010)
(ps,
pdf).

Algorithmic and Statistical Perspectives on Large-Scale Data Analysis,

M. W. Mahoney,

Technical Report, Preprint: arXiv:1010.1609 (2010)
(arXiv),

In:
Combinatorial Scientific Computing,
pp. 427-469,
edited by
U. Naumann and O. Schenk,
2012.

Implementing regularization implicitly via approximate eigenvector computation,

M. W. Mahoney and L. Orecchia,

Technical Report, Preprint: arXiv:1010.0703 (2010)
(arXiv),

Proc. of the 28th ICML Conference, 121-128 (2011)
(pdf)
(talk).

Approximating Higher-Order Distances Using Random Projections,

P. Li, M. W. Mahoney, and Y. She,

Proc. of the 26th UAI Conference, 312-321 (2010)
(ps,
pdf),

Technical Report, Preprint: arXiv:1203.3492 (2012)
(arXiv).

Effective Resistances, Statistical Leverage, and Applications to Linear Equation Solving,

P. Drineas and M. W. Mahoney,

Technical Report, Preprint: arXiv:1005.3097 (2010)
(arXiv).

Empirical Comparison of Algorithms for Network Community Detection,

J. Leskovec, K. J. Lang, and M. W. Mahoney,

Technical Report, Preprint: arXiv:1004.3539 (2010)
(arXiv),

Proc. of the 19th International WWW, 631-640 (2010)
(ps,
pdf).
2009

A Local Spectral Method for Graphs: with Applications to Improving Graph
Partitions and Exploring Data Graphs Locally,

M. W. Mahoney, L. Orecchia, and N. K. Vishnoi,

Technical Report, Preprint: arXiv:0912.0681 (2009)
(arXiv),

J. Machine Learning Research, 13, 2339-2365 (2012)
(ps,
pdf).

Unsupervised Feature Selection for the k-means Clustering Problem,

C. Boutsidis, M. W. Mahoney, and P. Drineas,

Proc. of the 2009 NIPS Conference, 153-161 (2009)
(ps,
pdf).

Learning with Spectral Kernels and Heavy-Tailed Data,

M. W. Mahoney and H. Narayanan,

Technical Report, Preprint: arXiv:0906.4539 (2009)
(arXiv).

Empirical Evaluation of Graph Partitioning Using Spectral Embeddings and Flow,

K. J. Lang, M. W. Mahoney, and L. Orecchia,

Proc. of the 8th International SEA, 197-208 (2009)
(ps,
pdf).

CUR Matrix Decompositions for Improved Data Analysis,

M. W. Mahoney and P. Drineas,

Proc. Natl. Acad. Sci. USA, 106, 697-702 (2009)
(ps,
pdf).
2008

An Improved Approximation Algorithm for the Column Subset Selection Problem,

C. Boutsidis, M. W. Mahoney, and P. Drineas,

Technical Report, Preprint: arXiv:0812.4293 (2008)
(arXiv),

Proc. of the 20th Annual SODA, 968-977 (2009)
(ps,
pdf).

Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis are the Focus of MMDS 2008

M. W. Mahoney, L.-H. Lim, and G. E. Carlsson

Technical Report, Preprint: arXiv:0812.3702 (2008)
(arXiv),

Appeared in
SIGKDD Explorations
(ps,
pdf),
SIAM News
(ps,
pdf),
and
ASASCGN Newsletter
(ps,
pdf),
and abridged versions appeared in IMS Bulletin
(ps,
pdf)
and AmStat News.

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,

J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,

Technical Report, Preprint: arXiv:0810.1355 (2008)
(arXiv),

Internet Mathematics, 6(1), 29-123 (2009)
(pdf).

Unsupervised Feature Selection for Principal Components Analysis,

C. Boutsidis, M. W. Mahoney, and P. Drineas,

Proc. of the 14th Annual SIGKDD, 61-69 (2008)
(ps,
pdf).

Statistical Properties of Community Structure in Large Social and Information Networks,

J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,

Proc. of the 17th International WWW, 695-704 (2008)
(ps,
pdf).
2007

Faster Least Squares Approximation,

P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarlos,

Technical Report, Preprint: arXiv:0710.1435 (2007)
(arXiv),

Numerische Mathematik, 117, 219-249 (2011)
(pdf).

PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations,

P. Paschou, E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron, M. W. Mahoney, and P. Drineas,

PLoS Genetics, 3, 1672-1686 (2007)
(ps,
pdf).

Relative-Error CUR Matrix Decompositions,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Technical Report, Preprint: arXiv:0708.3696 (2007)
(arXiv),

SIAM J. Matrix Analysis and Applications, 30, 844-881 (2008)
(ps,
pdf).

Feature Selection Methods for Text Classification,

A. Dasgupta, P. Drineas, B. Harb, V. Josifovski, and M. W. Mahoney,

Proc. of the 13th Annual SIGKDD, 230-239 (2007)
(ps,
pdf).

Sampling Algorithms and Coresets for Lp Regression,

A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. W. Mahoney,

Technical Report, Preprint: arXiv:0707.1714 (2007)
(arXiv),

Proc. of the 19th Annual SODA, 932-941 (2008)
(ps,
pdf),

SIAM J. Computing, 38, 2060-2078 (2009)
(ps,
pdf).

Web Information Retrieval and Linear Algebra Algorithms,

A. Frommer, M. W. Mahoney, and D. B. Szyld (Eds.),

Proc. of Dagstuhl Seminar 07071, (2007)
(web).

Intra- and interpopulation genotype reconstruction from tagging SNPs,

P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas,

Genome Research, 17(1), 96-107 (2007)
(ps,
pdf).
2006

Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications,

G. H. Golub, M. W. Mahoney, P. Drineas, and L.-H. Lim,

SIAM News 39:8 October 2006
(ps,
pdf).

Randomized Algorithms for Matrices and Massive Data Sets,

P. Drineas and M. W. Mahoney,

Proc. of the 32nd Annual VLDB, 1269 (2006)
(ps,
pdf).

Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Proc. of the 14th Annual ESA, 304-314 (2006)
(ps,
pdf).

Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Proc. of the 10th Annual RANDOM, 316-326 (2006)
(ps,
pdf).

Tensor-CUR Decompositions For Tensor-Based Data,

M. W. Mahoney, M. Maggioni, and P. Drineas,

Proc. of the 12th Annual SIGKDD, 327-336 (2006)
(ps,
pdf),

SIAM J. Matrix Analysis and Applications, 30, 957-987 (2008)
(ps,
pdf).

Polynomial Time Algorithm for Column-Row-Based Relative-Error Low-Rank Matrix Approximation,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Technical Report, DIMACS TR 2006-04 March 2006
(ps,
pdf).

Sampling Algorithms for L2 Regression and Applications,

P. Drineas, M. W. Mahoney, and S. Muthukrishnan,

Proc. of the 17th Annual SODA, 1127-1136 (2006)
(ps,
pdf).
2005

A Randomized Algorithm for a Tensor-Based Generalization of the Singular Value Decomposition,

P. Drineas and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1327, June 2005
(ps,
pdf),

Linear Algebra and its Applications, 420, 553-571 (2007)
(ps,
pdf).

On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,

P. Drineas and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1319, April 2005
(ps,
pdf),

Proc. of the 18th Annual COLT, 323-337 (2005)
(ps,
pdf),

J. Machine Learning Research, 6, 2153-2175 (2005)
(ps,
pdf).
2004

Sampling Subproblems of Heterogeneous Max-Cut Problems and Approximation Algorithms,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1283, April 2004
(ps,
pdf),

Proc. of the 22nd Annual STACS, 57-68 (2005)
(ps,
pdf),

Random Structures and Algorithms, 32:3, 307-333 (2008)
(ps,
pdf).

Fast Monte Carlo Algorithms for Matrices III: Computing an Efficient Approximate Decomposition of a Matrix,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1271, February 2004
(ps,
pdf),

SIAM J. Computing, 36, 184-206 (2006)
(ps,
pdf).

Fast Monte Carlo Algorithms for Matrices II: Computing Low-Rank Approximations to a Matrix,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1270, February 2004
(ps,
pdf),

SIAM J. Computing, 36, 158-183 (2006)
(ps,
pdf).

Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication,

P. Drineas, R. Kannan, and M. W. Mahoney,

Technical Report, YALEU/DCS/TR-1269, February 2004
(ps,
pdf),

SIAM J. Computing, 36, 132-157 (2006)
(ps,
pdf).
2003

Rapid Mixing of Several Markov Chains for a Hard-Core Model,

R. Kannan, M. W. Mahoney, and R. Montenegro,

Proc. of the 14th Annual ISAAC, 663-675 (2003)
(pdf).
2001

Quantum, Intramolecular Flexibility, and Polarizability Effects on the Reproduction of the Density Anomaly of Liquid Water by Simple Potential Functions,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 115, 10758-10768 (2001)
(pdf).

Rapid Estimation of Electronic Degrees of Freedom in Monte Carlo Calculations for Polarizable Models of Liquid Water,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 114, 9337-9349 (2001)
(pdf).

Diffusion Constant of the TIP5P Model of Liquid Water,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 114, 363-366 (2001)
(pdf).
2000

A Five-Site Model for Liquid Water and the Reproduction of the Density Anomaly by Rigid, Nonpolarizable Potential Functions,

M. W. Mahoney and W. L. Jorgensen,

J. Chem. Phys., 112, 8910-8922 (2000)
(pdf).
1997

Repression and Activation of Promoter-Bound RNA Polymerase Activity by Gal Repressor,

H. E. Choy, R. R. Hanger, T. Aki, M. Mahoney, K. Murakami, A. Ishihama, and S. Adhya,

J. Mol. Biol. 272: 293-300, 1997
(pdf).

Discrete Representations of the Protein Calpha Chain,

X. F. de la Cruz, M. W. Mahoney, and B. K. Lee,

Fold. & Des. 2: 223-234, 1997
(pdf).
