A Tour of NTL: Tips for Getting the Best Performance out of NTL


  1. On many machines that optionally offer 64-bit integer arithmetic (recent Mac OS X machines, for instance), compile using gcc with the option -m64 to get the full benefit. To do this, pass "CFLAGS=-O2 -m64" to the configure script (the quotes are needed because the argument contains a space). If you are using NTL with GMP on such a machine, you must do this to get compatible code. Note, however, that 64-bit code is becoming the default, so this option may not be necessary.

  2. On SPARC machines, pass the argument "CFLAGS=-O2 -mcpu=v8" to the configure script. On more recent, 64-bit SPARCs, pass "CFLAGS=-O2 -mcpu=v9 -m64" to get the full instruction set and 64-bit code.

  3. Make sure you run the configuration wizard when you install NTL. The makefile in the Unix distribution runs it by default, so do not disable it. In the Windows distribution, there is unfortunately no easy way to run the wizard.

  4. In time-critical code, avoid creating unnecessary temporary objects. For example, instead of

    ZZ InnerProduct(const ZZ *a, const ZZ *b, long n)
    {
       long i;
       ZZ res;
       for (i = 0; i < n; i++)
          res += a[i] * b[i];
       return res;
    }

    write this as

    ZZ InnerProduct(const ZZ *a, const ZZ *b, long n)
    {
       long i;
       ZZ res, t;
       for (i = 0; i < n; i++) {
          mul(t, a[i], b[i]);
          add(res, res, t);
       }
       return res;
    }

    The first version of InnerProduct creates and destroys a temporary object, holding the value a[i]*b[i], in every loop iteration. The second does not: t is created once, outside the loop, and reused in every iteration.

  5. If you use the class ZZ_p, try to avoid switching the modulus too often, as this can be a rather expensive operation. If you must switch the modulus often, use the class ZZ_pContext to save the information associated with the modulus (see ZZ_p.txt).