Memory error when adding exponents to a sparse high-dimensional multi-index set, and "completeness" vs. "downward-closedness"
I ran into an issue when updating the multi-index set of a sparse high-dimensional function. For instance:
>>> import numpy as np
>>> import minterpy as mp
>>> exp = np.zeros((10,10), dtype=int)
>>> exp[:,0] = np.arange(10)
>>> mi = mp.MultiIndexSet(exp)
>>> mi
MultiIndexSet
[[0 0 0 0 0 0 0 0 0 0]
[1 0 0 0 0 0 0 0 0 0]
[2 0 0 0 0 0 0 0 0 0]
[3 0 0 0 0 0 0 0 0 0]
[4 0 0 0 0 0 0 0 0 0]
[5 0 0 0 0 0 0 0 0 0]
[6 0 0 0 0 0 0 0 0 0]
[7 0 0 0 0 0 0 0 0 0]
[8 0 0 0 0 0 0 0 0 0]
[9 0 0 0 0 0 0 0 0 0]]
is 10-dimensional but very sparse: only the first dimension is "active". The MultiIndexSet is complete:
>>> mi.is_complete
True
with the exponents_completed attribute equal to the exponents of the set itself:
>>> mi.exponents_completed
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[2, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[3, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[4, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[7, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[8, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
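One quick way to confirm this equality (assuming, as elsewhere in this report, that the exponents attribute holds the set's own exponent array):

>>> np.array_equal(mi.exponents, mi.exponents_completed)
True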
But when I tried to add a new element to this multi-index set:
>>> new_element = np.array([10, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> mi.add_exponents(new_element)
Minterpy throws a memory error:
...
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 373. GiB for an array with shape (10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10) and data type int32
because, apparently, Minterpy insists on creating a new instance whose attribute _exponents_completed holds a strictly "complete" multi-index set for the given lp-degree. In the example above, it creates the "complete" exponents for dimension 10, degree 10, and lp-degree 1.0, which requires a massive amount of memory.
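The reported allocation is consistent with a quick back-of-envelope check: the intermediate array in the traceback has shape (10,) * 11, i.e., 10**11 entries of int32 (4 bytes each):

>>> 10**11 * np.dtype(np.int32).itemsize / 2**30  # size in GiB
372.5290298461914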
Furthermore, adding exponents also behaves rather strangely depending on whether the attribute exponents_completed has been queried at least once. Why? Because in the method _new_instance_if_necessary, the function make_complete is called only if the hidden attribute _exponents_completed is not None, and that attribute becomes non-None once exponents_completed has been queried. If I never query exponents_completed, then _exponents_completed remains None and there is no problem adding the new element in the example above.
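Here is a minimal, self-contained sketch of the caching pattern just described (a hypothetical class and a make_complete_stub stand-in, NOT Minterpy's actual code) that reproduces the query-dependent behavior:

import numpy as np

def make_complete_stub(exponents):
    # Stand-in for Minterpy's make_complete; in the real library this is
    # the step that tries to allocate the huge "complete" exponent array.
    print("make_complete called")
    return exponents

class SketchIndexSet:
    # Hypothetical sketch, not Minterpy's implementation.
    def __init__(self, exponents):
        self._exponents = np.atleast_2d(exponents)
        self._exponents_completed = None  # lazily populated cache

    @property
    def exponents_completed(self):
        if self._exponents_completed is None:
            self._exponents_completed = make_complete_stub(self._exponents)
        return self._exponents_completed

    def add_exponents(self, new_element):
        merged = np.vstack([self._exponents, np.atleast_2d(new_element)])
        new_set = SketchIndexSet(merged)
        if self._exponents_completed is not None:
            # Reached only if exponents_completed was queried earlier;
            # this is what makes add_exponents query-dependent (and what
            # triggers the memory blow-up in the real library).
            new_set._exponents_completed = make_complete_stub(merged)
        return new_set

>>> s = SketchIndexSet(np.arange(3).reshape(-1, 1))
>>> _ = s.add_exponents([3])   # cache empty: completion never runs
>>> _ = s.exponents_completed  # first query populates the cache
make_complete called
>>> _ = s.add_exponents([3])   # now the same call triggers completion
make_complete called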
I think this is unexpected behavior. In this particular case, I suppose I can modify _new_instance_if_necessary to behave consistently regardless of whether the attribute has been queried, as sketched below.
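In terms of the sketch above, one possible fix (hypothetical, untested against the actual code base) is to drop the cache-dependent branch entirely, so a new instance always recomputes its completed exponents lazily on demand:

def add_exponents(self, new_element):
    merged = np.vstack([self._exponents, np.atleast_2d(new_element)])
    # Always leave the cache empty: exponents_completed is recomputed
    # on demand, so the behavior no longer depends on earlier queries.
    return SketchIndexSet(merged)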
However, I'd also like to discuss the is_complete attribute of a MultiIndexSet a bit further.
Completeness vs Downward-closedness
As far as I know, the divided difference scheme (DDS) works with downward-closed multi-index sets.
But the multi-index set does not need to be complete in the sense that Minterpy currently adopts. In other words, while "complete" implies downward-closed, downward-closed is not necessarily "complete".
For instance, the multi-index set:
MultiIndexSet
[[0 0]
[1 0]
[2 0]
[0 1]]
is downward-closed (and DDS works fine with it), but not "complete" with respect to, say, lp-degree 1.0, for which the complete set is:
MultiIndexSet
[[0 0]
[1 0]
[2 0]
[0 1]
[1 1]
[0 2]]
So completeness would perhaps be defined with respect to a particular lp-degree (and a corresponding polynomial degree).
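For concreteness, here is a sketch of how the two properties could be checked (hypothetical helpers, not Minterpy's API; the completeness check enumerates candidates by brute force and is only feasible in low dimensions):

import numpy as np
from itertools import product

def is_downward_closed(exponents):
    # Downward-closed: for every multi-index in the set, decreasing any
    # single entry by one must yield a multi-index that is also in the set.
    index_set = {tuple(row) for row in exponents}
    for alpha in index_set:
        for d in range(len(alpha)):
            if alpha[d] > 0:
                beta = list(alpha)
                beta[d] -= 1
                if tuple(beta) not in index_set:
                    return False
    return True

def is_complete_wrt_lp(exponents, lp_degree, poly_degree):
    # Complete w.r.t. a given lp-degree and polynomial degree: the set
    # must contain *every* multi-index alpha with ||alpha||_lp <= degree.
    index_set = {tuple(row) for row in exponents}
    dim = exponents.shape[1]
    for alpha in product(range(poly_degree + 1), repeat=dim):
        in_ball = np.linalg.norm(alpha, ord=lp_degree) <= poly_degree
        if in_ball and alpha not in index_set:
            return False
    return True

Applied to the example above:

>>> exps = np.array([[0, 0], [1, 0], [2, 0], [0, 1]])
>>> is_downward_closed(exps)
True
>>> is_complete_wrt_lp(exps, lp_degree=1.0, poly_degree=2)
False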
How should we deal with these two notions? Is there any part of the current Minterpy that works only with a strictly "complete" multi-index set? If yes, should we be explicit in distinguishing the two notions, say, with one attribute for completeness and another for downward-closedness? If not, what exactly would be the definitions of is_complete and exponents_completed?
Finally, note that the utility function is_lexicographically_complete (yet another piece of jargon) checks whether a set of exponents is downward-closed, not whether it is complete with respect to an lp-degree.