GSS (GRUS SPARSE SOLVER) is a novel sparse solver that can solve systems with a million unknowns in about 1 minute, even on a PC.
The high performance and generality of GSS have been verified by many commercial users and large test sets.

Download free trial version.

Solves SPD matrices with a million unknowns in seconds!
Faster than other sparse solvers!
Stable for ill-conditioned matrices.

High Performance
In most cases, 1 minute is enough for the numerical factorization of a matrix with 1,000,000 unknowns.
The forward/backward substitution takes only seconds.
For unsymmetric matrices, numerical factorization is nearly 3 times faster than with PARDISO (details in Experimental Results).
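
The two phases named above, numerical factorization and forward/backward substitution, can be illustrated with a minimal sketch. The code below is not the GSS API and not the GSS algorithm: it is a small dense Cholesky factorization in plain Python, with hypothetical function names (`cholesky`, `solve_with_factor`), shown only to make clear why the substitution step is cheap once the factor has been computed.

```python
import math


def cholesky(a):
    """Numerical factorization: return lower-triangular L with A = L * L^T.

    `a` is a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        low[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (a[i][j] - partial) / low[j][j]
    return low


def solve_with_factor(low, b):
    """Forward/backward substitution: solve L * L^T * x = b using the factor L."""
    n = len(low)
    # Forward substitution: L * y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Backward substitution: L^T * x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - tail) / low[i][i]
    return x


# Small SPD test matrix (1D Laplacian). It is factorized once ...
a = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
low = cholesky(a)

# ... and the factor is reused for several right-hand sides.
x1 = solve_with_factor(low, [1.0, 0.0, 1.0])
x2 = solve_with_factor(low, [0.0, 4.0, 0.0])
print(x1, x2)
```

The factorization is the expensive step and is done once; each additional right-hand side only needs the two triangular solves. A sparse solver applies the same split to sparse factors.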

Robust
Handles matrices with high condition numbers or unusual sparsity patterns.
Some ill-conditioned matrices can only be solved by GSS.

CPU/GPU hybrid computing
GSS is the first sparse solver that supports the Nvidia CUDA platform.

Good scalability
Supports up to 64 CPUs.

In-core, out-of-core and hybrid-core modes
Handles matrices that need more than 4 GB of memory on 32-bit platforms.

High Generality
Supports both 32-bit and 64-bit operating systems.
Supports both Linux and Windows.

Easy to use
Multiple user modes.
Supports user-defined modules.
More than 10 parameters, each with a default value.

The performance of GSS has been verified by many commercial users and by test sets containing thousands of matrices.


For more details, please contact
MAIL: gsp@grusoft.com
Phone: (+86) 013501997193
QQ: 304718494

Experimental Results

The test matrices (Tables 1-3) are all from the UF Sparse Matrix Collection and were also used in the UMFPACK paper.

Table 4 lists the numerical factorization times (in seconds).
As shown there, GSS 2.2 is nearly 3 times faster than PARDISO in MKL 11.

The test machine has an Intel Q8200 CPU with 8 GB of memory. The operating system is 64-bit Windows Vista.

Table 1 symmetric set

Group        Name                 n  Nonzeros (1000s)  Sym.   Description
NORRIS       TORSO2          115967            1033.5  0.992  2D human torso, electro-phys finite-diff
SIMON        OLAFU            16146            1015.2  1.000  Structure problem
SIMON        VENKAT01         62424            1717.8  1.000  Unstructured 2D Euler problem
BAI          AF23560          23560             460.6  0.944  airfoil
SIMON        RAEFSKY3         21200            1488.8  1.000  Fluid-structure, turbulence
ZHAO         ZHAO1            33861             166.5  0.922  electromagnetics
ZHAO         ZHAO2            33861             166.5  0.922  electromagnetics
FIDAP        EX11             16614            1096.9  1.000  3D fluid flow, cylinder and plate
SIMON        RAEFSKY4         19779            1316.8  1.000  Container buckling problem
WANG         WANG4            26086             177.2  1.000  3D MOSFET semiconductor
RONIS        XENON1           48600            1181.1  1.000  Zeolite, sodalite crystals
VANHEUKELUM  CAGE10           11397             150.6  1.000  DNA electrophoresis
NORRIS       STOMACH         213360            3021.6  0.848  3D electro-physical, human duodenum

Table 2 2-by-2 set

Group        Name                 n  Nonzeros (1000s)  Sym.   Description
GOODWIN      GOODWIN           7320             324.8  0.635  Fluid mechanics, finite-element
AVEROUS      EPB2             25228             175.0  0.670  Plate-fin heat exchanger
GARON        GARON2           13535             373.2  0.999  2D finite-element, Navier-Stokes
GOODWIN      RIM              22560            1015.0  0.639  Fluid mechanics, finite-element
NORRIS       HEART2            2339             680.3  1.000  Quasi-static FEM, human heart
AVEROUS      EPB3             84617             463.6  0.667  Plate-fin heat exchanger
BOVA         RMA10            46835            2329.1  1.000  3D model of Charleston Harbor
NORRIS       HEART1            3557            1385.3  1.000  Quasi-static FEM, human heart
HB           PSMIGR_1          3140             543.2  0.479  Population migration

Table 3 unsymmetric set

Group        Name                 n  Nonzeros (1000s)  Sym.   Description
AT&T         ONETONE2         36057             222.6  0.116  Harmonic balance method
GRAHAM       GRAHAM1           9035             335.5  0.718  Navier-Stokes, finite-element
MALLYA       LHR34C           35152             764.0  0.002  Light hydrocarbon recovery
SHEN         E40R0100         17281             553.6  0.308
MALLYA       LHR71C           70304            1528.1  0.002  Light hydrocarbon recovery
FIDAP        EX40              7740             456.2  1.000  Navier-Stokes, FEM (3D)
AT&T         ONETONE1         36057             335.6  0.076  Harmonic balance method
VAVASIS      AV41092          41092            1683.9  0.001  Unstructured finite-element
AT&T         TWOTONE         120750            1206.3  0.246  Harmonic balance method
HB           PSMIGR_2          3140             540.0  0.479  Population migration
SIMON        BBMAT            38744            1771.7  0.529  2D airfoil, turbulence
HOLLINGER    G7JAC200SC       59310             717.6  0.025  Economic modeling
HOLLINGER    MARK3JAC140SC    64089             376.4  0.061  Economic modeling

Table 4 comparative testing between GSS and PARDISO (numerical factorization time in seconds)

Set          Matrix         PARDISO    GSS  GSS/PARDISO
symmetric    TORSO2            1.09   0.83         0.76
             OLAFU             0.34   0.22         0.65
             VENKAT01          0.59   0.41         0.69
             AF23560           0.75   0.42         0.56
             RAEFSKY3           0.5   0.34         0.68
             ZHAO1             0.48   0.23         0.48
             ZHAO2              1.4   0.23         0.16
             EX11               0.7   0.44         0.63
             RAEFSKY4          0.81   0.58         0.72
             WANG4             0.76   0.39         0.51
             XENON1            1.95   1.14         0.58
             CAGE10            1.78   0.42         0.24
             STOMACH           10.5    4.1         0.39
2-by-2       GOODWIN           0.06   0.14         2.33
             EPB2              0.22   0.13         0.59
             GARON2            0.17   0.09         0.53
             RIM               0.23   0.69         3.00
             HEART2            0.11   0.08         0.73
             EPB3              0.34   0.23         0.68
             RMA10             0.62   0.39         0.63
             HEART1            0.33   0.19         0.58
             PSMIGR_1          4.96   0.81         0.16
unsymmetric  ONETONE2           0.2   0.17         0.85
             GRAHAM1           0.09   0.28         3.11
             LHR34C            0.44   0.25         0.57
             E40R0100          0.12   0.12         1.00
             LHR71C            0.91   0.52         0.57
             EX40              0.25   0.27         1.08
             ONETONE1          0.84   0.34         0.40
             AV41092           2.42   0.72         0.30
             TWOTONE           8.36   0.91         0.11
             PSMIGR_2           5.9   1.01         0.17
             BBMAT             8.37   2.03         0.24
             G7JAC200SC        13.5   4.98         0.37
             MARK3JAC140SC     2.29   1.56         0.68
sum                           72.38  25.66         0.35
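
The "sum" row and the overall GSS/PARDISO ratio can be reproduced directly from the timings in Table 4. The following sketch, in Python with variable and function names chosen here purely for illustration, adds up the per-matrix factorization times and computes the ratio for each set and for the whole collection.

```python
# (PARDISO seconds, GSS seconds) per matrix, copied from Table 4 in row order.
timings = {
    "symmetric": [
        (1.09, 0.83), (0.34, 0.22), (0.59, 0.41), (0.75, 0.42), (0.5, 0.34),
        (0.48, 0.23), (1.4, 0.23), (0.7, 0.44), (0.81, 0.58), (0.76, 0.39),
        (1.95, 1.14), (1.78, 0.42), (10.5, 4.1),
    ],
    "2-by-2": [
        (0.06, 0.14), (0.22, 0.13), (0.17, 0.09), (0.23, 0.69), (0.11, 0.08),
        (0.34, 0.23), (0.62, 0.39), (0.33, 0.19), (4.96, 0.81),
    ],
    "unsymmetric": [
        (0.2, 0.17), (0.09, 0.28), (0.44, 0.25), (0.12, 0.12), (0.91, 0.52),
        (0.25, 0.27), (0.84, 0.34), (2.42, 0.72), (8.36, 0.91), (5.9, 1.01),
        (8.37, 2.03), (13.5, 4.98), (2.29, 1.56),
    ],
}


def totals(rows):
    """Return (PARDISO total, GSS total, GSS/PARDISO ratio), rounded to 2 decimals."""
    pardiso = sum(p for p, _ in rows)
    gss = sum(g for _, g in rows)
    return round(pardiso, 2), round(gss, 2), round(gss / pardiso, 2)


per_set = {name: totals(rows) for name, rows in timings.items()}
overall = totals([row for rows in timings.values() for row in rows])

for name, (p, g, r) in per_set.items():
    print(f"{name:<12} PARDISO {p:6.2f} s  GSS {g:6.2f} s  ratio {r:.2f}")
print(f"{'sum':<12} PARDISO {overall[0]:6.2f} s  GSS {overall[1]:6.2f} s  ratio {overall[2]:.2f}")
```

The overall figures match the "sum" row of Table 4 (72.38 s, 25.66 s, ratio 0.35). A ratio of 0.35 corresponds to a speed-up of about 1/0.35, or roughly 2.9 times.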

 

History
4/18/2012 GSS 2.3 released.
1. Improved numerical factorization. Faster than GSS 2.2.
2. Needs less memory than GSS 2.2.
3. Fixed some bugs.

9/19/2009 GSS 2.2 released.
1. Improved numerical factorization for SPD matrices, which is 20% faster than PARDISO.
2. Improved CPU/GPU hybrid computing. The best speed-up for 500,000 unknowns is 7.
3. GSS_spd is free.

7/31/2007 GSS 2.1 released.
1. Support Nvidia CUDA.
2. Improved out-of-core module.
3. Improved memory module of LDLT.

12/25/2007 GSS 2.0 released.
1. Add new balance module.
2. Add LU-partial-updating module.
3. Improved out-of-core, in-core and hybrid-core modules.
4. Add hybrid multifrontal/frontal module.
5. Improved iterative refinement module and better estimation of the condition number.

11/25/2005 GSS 1.2 released.
1. Parallel version released.
2. Support Intel Hyper-Threading.
3. Improved numerical factorization for symmetric matrices.
4. Improved static pivoting.
5. Add iterative refinement module.

9/12/2005 GSS 1.1 released.
1. Add quotient graph model for symbolic factorization.
2. Improved reorder module of diagonals.
3. Improved numerical factorization for unsymmetric matrices.
4. Add scaling module.
5. More experimental results.

7/20/2005 GSS 1.0 released.

For special reasons, licenses are not available to commercial users in the fields of semiconductor or optoelectronic modeling.
We apologize for any inconvenience.


 

©copyright 2002-2012 GRUSOFT

All Rights Reserved