High-Performance Implementation of Positive Matrix Completion for SDPs
Makoto Yamashita (Tokyo Institute of Technology), Kazuhide Nakata (Tokyo Institute of Technology)
INFORMS Annual Meeting 2013, 2013/10/6-9, Minneapolis Convention Center, Minneapolis, USA
This research was financially supported by the Sasakawa Scientific Research Grant from The Japan Science Society.

Sparsity in SDPs
[Figure: sparsity pattern of the data matrices.] Only the nonzero (blue) elements are involved in the inner products. However, we also have the positive semidefinite condition on the variable matrix.

Structural Sparsity in Spin-Glass SDPs
Each node interacts with only its 6 neighbors, so only the blue elements are involved in the inner products ⇒ exploit this structural sparsity. The positive semidefinite condition ⇒ positive matrix completion.
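To make the structural sparsity concrete, here is a small illustration (our own sketch, not from the slides; the lattice size is made up) of the coupling pattern of a 3D spin lattice in which each spin interacts only with its 6 nearest neighbors:

```python
import numpy as np

# Toy 3D spin lattice (hypothetical size): each spin couples only with
# its 6 nearest neighbors, so the aggregate pattern is very sparse.
L = 4                                      # 4 x 4 x 4 lattice, 64 spins
idx = lambda x, y, z: (x * L + y) * L + z  # linear index of spin (x,y,z)
P = np.zeros((L**3, L**3), dtype=bool)
for x in range(L):
    for y in range(L):
        for z in range(L):
            # Add the three "forward" neighbor couplings (symmetrized).
            for dx, dy, dz in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
                if x + dx < L and y + dy < L and z + dz < L:
                    i, j = idx(x, y, z), idx(x + dx, y + dy, z + dz)
                    P[i, j] = P[j, i] = True
print(P.sum(), "of", P.size, "entries are nonzero")  # 288 of 4096, ~7%
```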

Idea of Matrix-Completion Type Interior-Point Method
[Figure: the interior-point iterations are carried out without the black (unspecified) elements, which are complemented only where needed.]

Outline of this talk
1. Introduction of the Matrix-Completion IPM
2. Speed-up by a new factorization formula
3. Multi-threaded computation
4. Numerical results
This talk corresponds to the new version of SDPA-C (SDPA with the Completion), available as part of the SDPA project.

Standard form of SDP (primal-dual form)
[The formulas on this slide were lost in transcription; a standard reconstruction follows.]
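A reconstruction of the standard primal-dual pair, in the notation conventional for the SDPA family (the symbol names are our assumption):

```latex
% Primal-dual standard form of SDP (reconstruction, not the literal slide)
\begin{align*}
\text{(P)}\quad & \min_{X}\; C \bullet X
  \;\;\text{s.t.}\;\; A_i \bullet X = b_i \;(i = 1,\dots,m),\; X \succeq O,\\
\text{(D)}\quad & \max_{y,S}\; \textstyle\sum_{i=1}^{m} b_i y_i
  \;\;\text{s.t.}\;\; \textstyle\sum_{i=1}^{m} A_i y_i + S = C,\; S \succeq O,
\end{align*}
% where U \bullet V = \sum_{ij} U_{ij} V_{ij} is the matrix inner product.
```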

Framework of IPM
[The formulas on this slide were lost in transcription; a hedged sketch follows.]
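A sketch of the generic primal-dual path-following iteration used by SDPA-type solvers (our reconstruction, not the literal slide content):

```latex
% One iteration of a primal-dual path-following IPM (sketch)
% 1. Set the target on the central path: \mu = \sigma\,(X \bullet S)/n.
% 2. Form and solve the Schur complement equation B\,dy = r, where
\[
  B_{ij} = \bigl(S^{-1} A_i X\bigr) \bullet A_j ,
\]
%    then recover dX and dS from dy.
% 3. Choose step lengths \alpha_p, \alpha_d keeping
%    X + \alpha_p\,dX \succ O and S + \alpha_d\,dS \succ O.
% 4. Update (X, y, S) and repeat until the duality gap is small.
```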

Keywords in Matrix-Completion
Which elements are necessary? ⇒ Aggregate sparsity pattern
How to convert into smaller matrices? ⇒ Chordal graph & maximal cliques
How to complete the variable matrix? ⇒ The form of the completed matrix (see the following slides)

Aggregate Sparsity Pattern
The union of the nonzero patterns of all the data matrices, i.e., the nonzero pattern on the dual side. [Figure: an example pattern and the graph it induces.]
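A minimal illustration of the definition (our own toy example; the matrices are made up): the aggregate sparsity pattern is just the union of the nonzero patterns of C and the A_i.

```python
import numpy as np

# Toy data matrices of a small SDP (hypothetical example).
C  = np.diag([1.0, 3.0, 1.0])
A1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
A2 = np.array([[0, 0, 0], [0, 1, 1], [0, 1, 0]], dtype=float)

# Aggregate sparsity pattern: union of the nonzero patterns.
pattern = (C != 0) | (A1 != 0) | (A2 != 0)
print(pattern.astype(int))
# [[1 1 0]
#  [1 1 1]
#  [0 1 1]]  <- a tridiagonal pattern; its graph is the path 1-2-3
```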

Chordal Graph
A graph is chordal if every cycle longer than 3 has a chord. The variable matrix is decomposed along the maximal cliques. (A clique is a vertex set with an edge between every pair of its vertices; a maximal clique is one contained in no larger clique.) [Figure: a chordless cycle of length 4 is not chordal; adding a chord makes it chordal.]
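These notions are easy to experiment with; for instance, networkx (an illustration, not part of SDPA-C) can test chordality and list the maximal cliques of a chordal graph:

```python
import networkx as nx

# A 4-cycle is not chordal: the cycle 0-1-2-3-0 has no chord.
G = nx.cycle_graph(4)
print(nx.is_chordal(G))   # False

# Adding the chord (0, 2) splits the 4-cycle into two triangles.
G.add_edge(0, 2)
print(nx.is_chordal(G))   # True

# Maximal cliques of the chordal graph: {0, 1, 2} and {0, 2, 3}.
print(sorted(map(sorted, nx.chordal_graph_cliques(G))))
```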

Decomposition of the variable matrix
[Figure: blue entries form the aggregate sparsity pattern; red entries are added to make the pattern chordal.] By the result of Grone et al. (1984), if the principal submatrix indexed by each maximal clique is positive definite, the entire matrix can be completed to a positive definite one.
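The completion theorem behind this slide, paraphrased in LaTeX (our wording):

```latex
\begin{theorem}[Grone et al., 1984; paraphrased]
Let $G$ be a chordal graph with maximal cliques $C_1,\dots,C_k$, and let
$\bar{X}$ be a partial symmetric matrix whose entries are specified exactly
on the diagonal and the edges of $G$. Then $\bar{X}$ admits a positive
definite completion if and only if every clique principal submatrix
$\bar{X}_{C_r C_r}$ $(r = 1,\dots,k)$ is positive definite.
\end{theorem}
```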

An Example of Matrix Completion
[Worked example lost in transcription; it showed positive definite clique submatrices combined through non-singular factors and their transposes.]

A remarkable property of the matrix completion
The completed matrix is fully dense, but its inverse is sparse (its nonzeros stay within the chordal pattern).
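A tiny numerical sketch of this property (our own example): completing one missing entry of a 3x3 partial matrix by the maximum-determinant rule makes the inverse exactly zero at that position.

```python
import numpy as np

# Partial symmetric matrix with the (1,3)/(3,1) entry unspecified:
#   [[2, 1, ?],
#    [1, 2, 1],
#    [?, 1, 2]]
# For this chordal (tridiagonal) pattern, the maximum-determinant
# completion fills the entry as x13 = x12 * x22**-1 * x23 = 0.5.
x13 = 1 * (1 / 2) * 1
X_hat = np.array([[2.0, 1.0, x13],
                  [1.0, 2.0, 1.0],
                  [x13, 1.0, 2.0]])

# X_hat is fully dense, yet its inverse vanishes exactly at the
# unspecified position: the inverse inherits the sparsity pattern.
print(np.linalg.inv(X_hat).round(10))   # entries (0,2) and (2,0) are 0.0
```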

The factorization of the variable matrix
[The key formula on this slide was lost in transcription; a hedged reconstruction follows.]
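A hedged reconstruction of the identity this slide relies on, standard in the matrix-completion IPM literature (the factor name M is our assumption):

```latex
% The maximum-determinant completion \hat{X} admits a sparse factorization
% of its *inverse*:
\[
  \hat{X}^{-1} = M M^{T}
  \qquad\Longrightarrow\qquad
  \hat{X} = M^{-T} M^{-1},
\]
% with M sparse (clique-based, triangular-like). Hence \hat{X} is never
% stored explicitly: a product \hat{X} v costs two sparse triangular solves.
```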

Schur complement matrix with the sparse matrices
[The formulas on this slide were lost in transcription; a hedged reconstruction follows.]
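Combining the previous two slides (our reconstruction, with assumed symbol names): each Schur complement element is evaluated through sparse factors only.

```latex
% With S = L L^T (sparse Cholesky) and \hat{X}^{-1} = M M^T,
\[
  B_{ij} = \bigl(S^{-1} A_i \hat{X}\bigr) \bullet A_j
         = \bigl(L^{-T} L^{-1} A_i\, M^{-T} M^{-1}\bigr) \bullet A_j ,
\]
% so every term reduces to sparse triangular solves with L and M;
% neither S^{-1} nor \hat{X} is ever formed as a dense matrix.
```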

Outline of this talk
1. Introduction of the Matrix-Completion IPM
2. Speed-up by a new factorization formula
3. Multi-threaded computation
4. Numerical results

Review of Matrix Factorization
[This slide and the two that follow consisted of formulas deriving the new factorization; they were lost in transcription.]

Speed-up by the new factorization
Max-clique SDP (the times were lost in transcription; the reported speed-ups survive):

                    SDPA-C 6.2.0   SDPA-C 7.3.8   Speed-up
Schur complement    [n/a] sec      [n/a] sec      2.00x
Total               [n/a] sec      [n/a] sec      1.88x

The computation time shrinks, but there is still room for improvement ⇒ parallel computation by multiple threads.

Multi-threaded computation of the Schur complement matrix
Each column is independent of the others, so columns are assigned dynamically: whichever thread becomes idle computes the next column.
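A minimal sketch of this dynamic column assignment, in Python rather than the C++ of SDPA-C (the matrix sizes and the column formula are placeholders, not the real computation):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

m = 8
rng = np.random.default_rng(0)
A = [rng.standard_normal((5, 5)) for _ in range(m)]   # toy "data matrices"

def schur_column(j):
    """Column j of a toy Schur-complement-like matrix B.
    (Placeholder formula; the real B_ij also involves X, S and sparse solves.)"""
    return np.array([np.tensordot(A[i], A[j]) for i in range(m)])

B = np.zeros((m, m))
# The executor hands the next column index to whichever worker becomes
# idle, mirroring the dynamic scheduling described on the slide.
with ThreadPoolExecutor(max_workers=4) as pool:
    for j, col in enumerate(pool.map(schur_column, range(m))):
        B[:, j] = col
print(np.allclose(B, B.T))   # True: this toy B is symmetric
```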

The effect of multiple threads
Max-clique SDP; the number in parentheses is the number of threads (the times were lost in transcription; the reported speed-ups survive):

        SDPA-C 6.2.0 (1)   SDPA-C 7.3.8 (1)   SDPA-C 7.3.8 (2)   SDPA-C 7.3.8 (4)
Schur   [n/a] sec          [n/a] sec          [n/a] sec, 1.78x   [n/a] sec, 2.86x
Total   [n/a] sec          [n/a] sec          [n/a] sec, 1.78x   [n/a] sec, 2.82x

Combined with the new factorization, this gives a 5.31 times speed-up over SDPA-C 6.2.0 (roughly 1.88x times 2.82x).

New version of SDPA-C

                        SDPA-C 6.2.1        SDPA 7.3.8   SDPA-C 7.3.8
Interior-Point Method   Matrix Completion   Standard     Matrix Completion
Sparse Cholesky         Our own code        MUMPS        CHOLMOD & MUMPS
Multiple Threads        ×                   △            ○
Callable Library        ×                   ○            ○
Matlab Interface        ×                   ○            ○

Test Environments and Test Problems
CPU: Xeon X5365 (3.0 GHz), Memory: 48 GB, Red Hat Linux.
Test Problem 1: SDP relaxation of the max-clique problem on a lattice graph (p, q).
Test Problem 2: Spin-glass computation in quantum chemistry.

Max-clique SDPs (p = 400, q = 10)
#Cliques 438, average size 29.89, max size 59. (The times were lost in transcription.)

        SDPA-C (previous)   SDPA-C 7.3.8 (4)   SDPA 7.3.8 (4)   SeDuMi 1.3
Schur   [n/a]               [n/a]              [n/a]            [n/a]
ΔX      [n/a]               [n/a]              [n/a]            [n/a]
Total   [n/a]               [n/a]              [n/a]            [n/a]

The new SDPA-C is the fastest.

Spin-glass SDPs
[Table lost in transcription; its columns were p, n, #Cliques, Ave Size, Max Size, SDPA 7.3.8, SDPA-C 7.3.8, with computation times in seconds.]
The computational cost grows mildly in SDPA-C, and the clique size is almost constant. For larger SDPs, SDPA-C is more efficient.

Conclusion and future works
Speed-up of the Matrix-Completion IPM by the new factorization formula, which is more effective for larger problems. Further speed-up by multiple threads. Future work: automatically select between the standard IPM and the Matrix-Completion IPM.