What libraries can I use to work with large sparse matrices in Java or Scala?

When working with large sparse matrices, it is best to use a compressed storage format such as CCS or CRS.
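For readers unfamiliar with these formats, here is a minimal sketch of the CCS (compressed column storage) layout. The `Ccs` class and its field names are hypothetical, made up for illustration, and are not the API of any of the libraries discussed here. It assumes `Double` values and shows why an all-zero matrix should only need one column-pointer array of length `cols + 1`.

```scala
// Hypothetical CCS container for illustration only; not a library class.
final case class Ccs(rows: Int, cols: Int,
                     colPtrs: Array[Int],    // length cols + 1; column c occupies [colPtrs(c), colPtrs(c + 1))
                     rowIndices: Array[Int], // row index of each stored value
                     values: Array[Double]) {

  // Look up entry (r, c) by scanning column c's slice; entries not stored are zero.
  def apply(r: Int, c: Int): Double = {
    var p   = colPtrs(c)
    val end = colPtrs(c + 1)
    while (p < end) {
      if (rowIndices(p) == r) return values(p)
      p += 1
    }
    0.0
  }

  def storedValues: Int = values.length
}

object Ccs {
  // An all-zero matrix stores no values at all, only the column pointers.
  def zeros(rows: Int, cols: Int): Ccs =
    Ccs(rows, cols, new Array[Int](cols + 1), Array.empty[Int], Array.empty[Double])
}
```

Under this layout, `Ccs.zeros(100000, 100000)` allocates one array of 100,001 integers and nothing else, which is the behaviour the question expects from a sparse matrix library.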

I tried ScalaNLP, la4j, and Colt with a 100,000 × 100,000 sparse matrix.
Each of them had problems.

> Breeze (ScalaNLP / Scalala)

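> It provides a CSCMatrix type, which can hold a 100,000 × 100,000 matrix.
> The problem is that it is still under development.
> So you cannot compute the element-wise product of two CSCMatrix instances, e.g. csc1 :* csc2.
> You also cannot add two CSCMatrix instances together.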

> la4j

> It has CCSMatrix and CRSMatrix.
> But creating one with (new CCSMatrixFactory).createMatrix(100000, 100000) throws an OutOfMemoryError.
> The matrix should be all zeros, so it should not need much memory.

> Colt

> It has SparseDoubleMatrix2D.
> But creating a matrix with new SparseDoubleMatrix2D(100000, 100000) fails with IllegalArgumentException: matrix too large.

Which library can I use to compute with large sparse matrices?
Could you show me an example?

Solution:

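I was curious about Breeze, so I looked at the source code. It is a bit messy, because the operators are all emitted by some println-style code generation (!)... but I came up with this: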

import breeze.linalg.CSCMatrix
import breeze.linalg.operators.{BinaryOp, OpMulScalar}

object CSCMatrixExtraOps {
  abstract class CSCMatrixCanMulM_M[@specialized (Int, Float, Long, Double) A]
    extends BinaryOp[CSCMatrix[A], CSCMatrix[A], OpMulScalar, CSCMatrix[A]] {

    protected def times(a: A, b: A): A

    protected def zeros  (rows: Int, cols: Int): CSCMatrix[A]
    protected def builder(rows: Int, cols: Int, sz: Int): CSCMatrix.Builder[A]

    final def apply(a: CSCMatrix[A], b: CSCMatrix[A]): CSCMatrix[A] = {
      val rows  = a.rows
      val cols  = a.cols
      require(rows == b.rows, "Matrices must have same number of rows!")
      require(cols == b.cols, "Matrices must have same number of cols!")

      if (cols == 0) return zeros(rows, cols)
      val res     = builder(rows, cols, math.min(a.activeSize, b.activeSize))
      var ci      = 0
      var acpStop = a.colPtrs(0)
      var bcpStop = b.colPtrs(0)
      while (ci < cols) {
        val ci1 = ci + 1
        var acp = acpStop
        var bcp = bcpStop
        acpStop = a.colPtrs(ci1)
        bcpStop = b.colPtrs(ci1)
        while (acp < acpStop && bcp < bcpStop) {
          val ari = a.rowIndices(acp)
          val bri = b.rowIndices(bcp)
          if (ari == bri) {
            val v = times(a.data(acp), b.data(bcp))
            res.add(ari, ci, v)
            acp += 1
            bcp += 1
          } else if (ari < bri) {
            acp += 1
          } else /* ari > bri */ {
            bcp += 1
          }
        }
        ci = ci1
      }

      res.result()
    }
  }
  implicit object CSCMatrixCanMulM_M_Int extends CSCMatrixCanMulM_M[Int] {
    protected def times(a: Int, b: Int) = a * b
    protected def zeros(rows: Int, cols: Int) = CSCMatrix.zeros(rows, cols)
    protected def builder(rows: Int, cols: Int, sz: Int) = 
      new CSCMatrix.Builder(rows, cols, sz)
  }

  implicit object CSCMatrixCanMulM_M_Double extends CSCMatrixCanMulM_M[Double] {
    protected def times(a: Double, b: Double) = a * b
    protected def zeros(rows: Int, cols: Int) = CSCMatrix.zeros(rows, cols)
    protected def builder(rows: Int, cols: Int, sz: Int) = 
      new CSCMatrix.Builder(rows, cols, sz)
  }
}

Example:

import breeze.linalg._
import CSCMatrixExtraOps._

val m1 = CSCMatrix((0, 0, 0), (0, 5, 0), (0, 0, 10), (0, 13, 0))
val m2 = CSCMatrix((0, 0, 0), (0, 5, 0), (0, 0, 10), (13, 0, 0))
(m1 :* m2).toDenseMatrix

Result:

0  0   0    
0  25  0    
0  0   100  
0  0   0    
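
The question also mentions that two CSCMatrix instances cannot be added. The same column-by-column walk can be adapted to addition by taking the union of the stored entries instead of their intersection. The following is only a sketch on raw CSC arrays, not Breeze code: `CscAdd.add` and its parameter names are hypothetical, values are assumed to be `Double`, and row indices are assumed to be sorted within each column, as the code above also assumes.

```scala
object CscAdd {
  // Hypothetical helper: adds two matrices given as raw CSC arrays
  // (column pointers, row indices, values) and returns the sum in the same form.
  def add(cols: Int,
          aPtr: Array[Int], aIdx: Array[Int], aVal: Array[Double],
          bPtr: Array[Int], bIdx: Array[Int], bVal: Array[Double])
      : (Array[Int], Array[Int], Array[Double]) = {
    val ptr = new Array[Int](cols + 1)
    val idx = Array.newBuilder[Int]
    val vs  = Array.newBuilder[Double]
    var n = 0 // number of entries written so far
    var c = 0
    while (c < cols) {
      var i = aPtr(c); val iEnd = aPtr(c + 1)
      var j = bPtr(c); val jEnd = bPtr(c + 1)
      // Merge the two sorted row-index runs of column c.
      while (i < iEnd || j < jEnd) {
        if (j >= jEnd || (i < iEnd && aIdx(i) < bIdx(j))) {
          idx += aIdx(i); vs += aVal(i); i += 1            // entry only in a
        } else if (i >= iEnd || bIdx(j) < aIdx(i)) {
          idx += bIdx(j); vs += bVal(j); j += 1            // entry only in b
        } else {
          idx += aIdx(i); vs += aVal(i) + bVal(j)          // entry in both
          i += 1; j += 1
        }
        n += 1
      }
      c += 1
      ptr(c) = n
    }
    (ptr, idx.result(), vs.result())
  }
}
```

Note that this sketch keeps an entry even when the two values cancel to zero; dropping such entries would need one extra check before appending.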