In some cases, the poly_degree property of a MultiIndexSet is inconsistent with the value used in the construction
Inferring the polynomial degree from a given multi-index set is unstable for some choices of spatial dimension and poly degree.
This may be a peculiar edge case; I observed it for spatial dimension 7
and 10
with polynomial degree 5
and 10
and lp-degree 2.0.
I cannot say for sure about other combinations, at least in smaller dimension I haven't observed this.
Most probably this is due to some floating point operations.
Here's an example that reproduces the problem:
import minterpy as mp
spatial_dimension = 7
poly_degree = 5
lp_degree = 2.0
mi = mp.MultiIndexSet.from_degree(spatial_dimension, poly_degree, lp_degree)
assert poly_degree == mi.poly_degree
# 5 == 6 !!!
The code above will raise an AssertionError
: the polynomial degree passed to from_degree()
constructor is not the same as the polynomial degree of the resulting multi-index set.
Recall how the poly_degree
property is assigned when a MultiIndexSet
instance is created via from_degree()
:
- The set of exponents is created for the given spatial dimension, polynomial degree, and lp-degree via
get_exponent_matrix()
function. - The polynomial degree is inferred from the set of exponents via
_get_poly_degree()
function.
The problem can either be in the two functions mentioned above: get_exponent_matrix()
produces the wrong set of exponents or _get_poly_degree()
infers the wrong polynomial degree (well, or both).
After some investigation, it appears that _get_poly_degree()
is the suspect due to the function lp_norm()
called from within.
The problem in detail
Consider once more the following multi-index set:
spatial_dimension = 7
poly_degree = 5
lp_degree = 2.0
mi = mp.MultiIndexSet.from_degree(spatial_dimension, poly_degree, lp_degree)
This set must, at least, conform to the definition:
\lVert \boldsymbol{\alpha} \rVert_p = \left( \sum_{i=1}^{m} \alpha_i^p \right)^{\frac{1}{p}} \leq n, \forall \boldsymbol{\alpha} \in \mathcal{A},
where m, n, and p are the spatial dimension, polynomial degree, and lp-degree, respectively; \lVert \cdot \rVert_p
is the l_p
-norm; and \mathcal{A}
is the multi-index set.
The cardinality of the multi-index set is:
len(mi)
# 11496
But, the number of elements that fulfill the above condition is:
from minterpy.utils import lp_norm
np.sum(lp_norm(mi.exponents, lp_degree, axis=1) <= poly_degree)
# 11493
In other words, there are three elements that do not satisfy the condition based on the computation of the norm by lp_norm()
.
They are:
idx = np.logical_not(lp_norm(mi.exponents, lp_degree, axis=1) <= poly_degree)
mi.exponents[idx]
# array([[4, 1, 1, 1, 1, 1, 2],
# [1, 4, 1, 1, 1, 1, 2],
# [1, 1, 4, 1, 1, 1, 2]])
But do these elements really not satisfy the l_p
-norm condition?
With naive computation:
norm_naive = np.sum(mi.exponents[idx]**lp_degree, axis=1)**(1/lp_degree)
norm_naive
# array([5., 5., 5.])
norm_naive <= poly_degree
# array([ True, True, True])
which is strange because they satisfy the condition all right.
Take a look at the norms computed by lp_norm()
at the suspected indices:
norm_lp = lp_norm(mi.exponents, lp_degree, axis=1)[idx]
norm_lp
# array([5., 5., 5.])
np.allclose(norm_lp, poly_degree)
# True
norm_lp <= poly_degree
# array([False, False, False])
norm_lp - poly_degree
array([8.8817842e-16, 8.8817842e-16, 8.8817842e-16])
The norms are numerically close (but not identical) to the polynomial degree, but they are actually slightly larger.
This what causes the inconsistency between polynomial degree value used in the construction and the one assigned to poly_degree
property, because inferring the polynomial degree uses ceiling function on the maximum norms of a multi-index set such that the polynomial degree (for the given lp-degree) can include all the elements in the set.
In this case, the maximum absolute norm being slightly larger than the assigned polynomial degree is an artifact introduced by lp_norm()
function; 5.000000000000001
(should have been 5.0
) fed into the ceiling function gives 6
as the polynomial degree (should have been 5
).
lp_norm()
function
About the The function lp_norm()
is designed to handle very large (and small) values in a matrix when the norm is computed.
Using scaling trick, the computation is more stable than using np.linalg.norm()
directly.
Here's an illustration:
import numpy as np
xx = np.array([[1e300, 1e300]])
np.linalg.norm(xx, lp_degree, axis=1)
# RuntimeWarning: overflow encountered in multiply
# s = (x.conj() * x).real
# array([inf])
lp_norm(xx, lp_degree, axis=1)
# array([1.41421356e+300]) No problem here
But it behaves rather strangely for smaller size problems.
If we use the scaling trick from lp_norm()
:
mi.exponents[idx]
# array([[4, 1, 1, 1, 1, 1, 2],
# [1, 4, 1, 1, 1, 1, 2],
# [1, 1, 4, 1, 1, 1, 2]])
norm_1 = 4.0 * np.linalg.norm(mi.exponents[idx] / 4.0, lp_degree, axis=1)
norm_1
# array([5., 5., 5.])
norm_2 = 5.0 * np.linalg.norm(mi.exponents[idx] / 5.0, lp_degree, axis=1)
norm_2
# array([5., 5., 5.])
norm_3 = np.linalg.norm(mi.exponents[idx], lp_degree, axis=1)
norm_3
# array([5., 5., 5.])
norm_1 == norm_2
array([False, False, False])
norm_1 == norm_3
array([True, True, True])
The two norms above are not identical.
If we use, for instance, 2.0
and 3.0
instead, the results would be identical.
Apparently, the results are sensitive to the choice of the parameter.
Mathematically, they shouldn't really matter because:
\lVert \boldsymbol{\alpha} \rVert_p = \lvert c \rvert \, \lVert \frac{\boldsymbol{\alpha}}{c} \rVert_p.
Note that, for numerical reason, the maximum absolute value of the matrix is taken to be the parameter such that the largest value becomes \pm 1.0
avoiding overflow when the power is taken.
For now, I'm assigning this issue to myself, will check for other breaking cases first before attempting a solution.