
# Optimal prediction of underresolved dynamics

## Abstract

A method is presented for computing the average solution of problems that are too complicated for adequate resolution, but where information about the statistics of the solution is available. The method involves computing average derivatives by interpolation based on linear regression, and updating a measure constrained by the available crude information. Examples are given.

## Outline of Goal and Method

There are many problems in science whose solution is described by a set of differential equations, but where the solution of these equations is so complicated that it cannot be found in practice, even numerically, because it cannot be resolved properly. An accurate numerical solution requires that the problem be well resolved, i.e., that enough variables (“degrees of freedom”) be retained in the calculation to represent all relevant features of the solution. Well known examples where good resolution cannot be achieved in practice include turbulence and various problems in statistical physics and in economics. We consider in the present paper what one should do in underresolved problems, i.e., problems where good resolution has not been achieved.

There exists a large literature on the solution of underresolved problems, describing a wide variety of problem-specific tools; for example, in the case of turbulence there are modeling methods and methods for “large-eddy simulation” (1). All these methods involve some assumptions about the relation between what one can effectively compute and the “invisible” degrees of freedom that cannot be properly represented. Obviously, nothing can be done without some information in addition to what can be computed. In the present paper we assume that the additional information consists of explicit information about a measure preserved by the differential equations. In many problems of interest this kind of information is available, but it does not seem to have been fruitfully utilized previously. The key special case of turbulence will be treated in detail elsewhere, where it will be shown that it falls within the class of problems to which our methods apply.

The situation of interest is as follows: Consider a differential equation of the form

$$\frac{\partial u}{\partial t} = R(u), \tag{1}$$

where *t* is the time and *R*(*u*) = *R*(*u*, ∂*u*/∂*x*, …) is a (generally nonlinear) function of its arguments. Assume in addition that a measure on the space of solutions of Eq. 1 is invariant under the flow induced by Eq. 1, and that we know what it is. We denote averages with respect to this measure by angle brackets: ⟨·⟩. We further assume that we cannot resolve *u*, but that we can find some information about *u* at a small number of mesh points; the information could consist of point values, but it is more reasonable physically and mathematically to assume that what one has is “filtered” values, as would indeed be produced by a real physical measurement. We thus assume that what we have at the mesh points are the values ū_{α} defined by

$$\bar u_\alpha(t) = \int G(x_\alpha, x)\, u(x, t)\, dx, \tag{2}$$

where α is an index on the mesh. The kernel *G* can represent, for example, spatial averaging. The coarse grid data in Eq. 2 specify, at every moment in time, a subset of functions (the functions that are consistent with these observations). We denote averages over this restricted subset by angle brackets with a subscript corresponding to the observations: ⟨·⟩_{ū}. We call the measure on the restricted subset the “constrained measure.” If the problem is underresolved, the measure is carried by a nontrivial set of functions. It is important to note that the constrained measure is not invariant; indeed, if the invariant measure we start with is ergodic, the constrained measure tends in time to the unconstrained measure. For example, if we demanded that initially all the functions assumed given values at the grid points, there would be no reason to believe that the solutions of the differential equations that evolve from these data would still take on the very same values at the mesh points at later times.

Our goal is to calculate averages of the solutions with respect to the
constrained measure; these averages represent what one can calculate on
the crude grid, properly averaged over the “invisible” degrees of
freedom that cannot be represented on that grid. Given the constrained
measure and the filtered values, the mean and the moments of the
solution can be found at all points by interpolation (equivalent to
linear regression, ref. 2), and therefore the mean derivatives of these
quantities at the computational points can be found, for all practical
purposes without error. The remaining problem is to characterize the
evolution of the constrained measure so that the mean solution (and any
moments that may be needed) can be advanced in time. Our assumption is
that the constrained measure remains the invariant measure constrained
by *n* filters, where *n* is the number of grid
points. The filters therefore have to change in time. The formulas
below will allow us to relate quantities whose evolution can be
calculated to the parameters that determine the evolving filters. In
the present paper we shall simplify the problem of finding the evolving
filters by assuming that the filters are determined by the evolution of
the mean solution (and not, for example, by the evolution of the higher
moments of the solutions). This is equivalent to assuming that the equation

$$\frac{d\bar u_\alpha}{dt} = \int G(x_\alpha, x)\, \big\langle R(u)(x) \big\rangle_{\bar u(t)}\, dx \tag{3}$$

is a good approximation to the real evolution of the ū_{α}(*t*) given by Eqs. 1 and 2. Of course, the validity of this assumption depends in particular on a good choice of filters *G*. In the present paper, we assume furthermore that the measure is either Gaussian or approximately Gaussian, in a sense specified below.
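The filtering operation of Eq. 2 is straightforward to realize numerically. The following sketch is our own illustration, not the paper's code; the grid sizes, kernel width, and all names are assumptions chosen for concreteness. It builds a discrete Gaussian filtering matrix on a periodic domain and applies it to a field sampled on a fine grid:

```python
import numpy as np

def gaussian_filter_matrix(n_fine, m_coarse, sigma, length=2 * np.pi):
    """Discrete analog of the kernel G in Eq. 2: row alpha is a normalized
    periodic Gaussian centered at the coarse point x_alpha."""
    x = length * np.arange(n_fine) / n_fine
    x_bar = length * np.arange(1, m_coarse + 1) / m_coarse
    d = np.abs(x[None, :] - x_bar[:, None])      # distances, shape (m, n)
    d = np.minimum(d, length - d)                # periodic distance
    g = np.exp(-d**2 / (2.0 * sigma**2))
    return g / g.sum(axis=1, keepdims=True)      # each row averages to one

n, m = 256, 10
G = gaussian_filter_matrix(n, m, sigma=2 * np.pi / m)
u = np.random.randn(n)     # stand-in for a rough sample of the field u
u_bar = G @ u              # the m filtered values of Eq. 2
```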

The two keys to success are: (*i*) averaging with respect to
the right constrained measure and (*ii*) updating the
constraints as the solutions evolve. Ingredient (*i*) was
already used numerically in ref. 3. A number of interesting attempts
have been made over the years to “fill in” data from coarse grids
in difficult computations so as to enhance accuracy without refining
the grid (see, e.g., refs. 4 and 5), but without our two key
ingredients the usefulness of the earlier methods is necessarily
limited.

We proceed as follows: First we present some elementary but important results on constrained Gaussian probabilities; then we explain and apply our scheme in the special cases of linear and nonlinear Schrödinger equations. Note that in the nonlinear case the invariant measure is not Gaussian. We also explain why these are significant test models. Because these model equations have some simplifying features, we also sketch a more general methodology.

## Gaussian Distributions: Conditional Expectation Under Affine Constraints

We start by describing how to calculate expectation values of functions of normally distributed variables when the variables satisfy constraints of affine form. Let **u** = (*u*₁, …, *u*_{n})^{T} be a real vector of jointly normal random variables; it has a probability density *F*(**u**) of the form

$$F(\mathbf u) = \frac{1}{Z}\, \exp\!\left( -\tfrac12\, u_i\, a_{ij}\, u_j + b_i\, u_i \right), \tag{4}$$

where *Z* is the appropriate normalization factor, repeated indices imply summation, the *n* × *n* matrix *A* with entries *a*_{ij} is symmetric, and its inverse *A*^{−1}, assumed to exist, is the pairwise covariance matrix whose elements are

$$(A^{-1})_{ij} = \big\langle (u_i - \langle u_i\rangle)(u_j - \langle u_j\rangle) \big\rangle, \tag{5}$$

where the brackets denote averaging with respect to the probability density, and the vector **b** with components *b*_{i} is related to the pointwise expectation values by

$$\langle u_i \rangle = (A^{-1})_{ij}\, b_j. \tag{6}$$

The distribution is fully determined by the *n* means and by the ½*n*(*n* + 1) independent elements of the covariance matrix; therefore, all expectation values of observables ⟨φ(**u**)⟩ can be expressed in terms of these parameters. In particular, all higher-order moments are given by Wick’s theorem (e.g., ref. 6).
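As a quick numerical illustration of Wick’s theorem (ours, not the paper's), one can check a fourth moment of a correlated Gaussian pair against the sum over pairings; the covariance matrix below is an arbitrary example:

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.array([[2.0, 0.5],      # an example covariance matrix A^{-1}
              [0.5, 1.0]])
u = rng.multivariate_normal(mean=[0.0, 0.0], cov=C, size=1_000_000)

lhs = np.mean(u[:, 0] ** 2 * u[:, 1] ** 2)       # <u1 u1 u2 u2> by sampling
rhs = C[0, 0] * C[1, 1] + 2.0 * C[0, 1] ** 2     # Wick: C11*C22 + 2*C12^2
print(f"Monte Carlo: {lhs:.4f}, Wick: {rhs:.4f}")  # agree to sampling error
```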

Next, assume that the random vector **u** satisfies a set of affine constraints of the form

$$\bar u_\alpha = g_{\alpha i}\, u_i, \tag{7}$$

where the index α enumerates the constraints; the matrix *G*, whose entries are *g*_{αi}, is the discrete analog of the continuous kernel *G*(·) introduced in Eq. 2. To distinguish between the vector space of random variables, (*u*₁, …, *u*_{n}), and the vector space of constraints, (ū₁, …, ū_{m}), we use Roman and Greek letter indices, respectively.

Our goal is to calculate conditional expectation values, i.e., averages over the functions that satisfy the constraints; formally,

$$\langle \varphi(\mathbf u) \rangle_{\bar u} = \frac{\int \varphi(\mathbf u)\, F(\mathbf u)\, \prod_\alpha \delta(g_{\alpha i} u_i - \bar u_\alpha)\, d\mathbf u}{\int F(\mathbf u)\, \prod_\alpha \delta(g_{\alpha i} u_i - \bar u_\alpha)\, d\mathbf u}, \tag{8}$$

where the left-hand side introduces a notation for a constrained average and *F*(**u**) is the properly normalized probability density (Eq. 4). We shall use the following three lemmas:

Lemma 1. *The conditional expectation of the variable u_{i} obeys an affine relation*,

$$\langle u_i \rangle_{\bar u} = q_{i\alpha}\, \bar u_\alpha + c_i, \tag{9}$$

*where the n × m matrix Q with entries q_{iα} and the n-vector **c** with entries c_{i} are*

$$Q = A^{-1} G^{T} \left( G A^{-1} G^{T} \right)^{-1}, \qquad c_i = \langle u_i \rangle - q_{i\alpha}\, g_{\alpha j}\, \langle u_j \rangle. \tag{10}$$

Lemma 2. *The conditional covariance matrix has entries*

$$\big\langle (u_i - \langle u_i \rangle_{\bar u})(u_j - \langle u_j \rangle_{\bar u}) \big\rangle_{\bar u} = (A^{-1})_{ij} - q_{i\alpha}\, g_{\alpha k}\, (A^{-1})_{kj}. \tag{11}$$

Lemma 3. *Wick’s theorem holds for constrained expectations, namely,*

$$\left\langle \prod_{l=1}^{2k} \left( u_{i_l} - \langle u_{i_l} \rangle_{\bar u} \right) \right\rangle_{\bar u} = \sum_{P} \prod_{(r,s) \in P} \left\langle \left( u_{i_r} - \langle u_{i_r} \rangle_{\bar u} \right) \left( u_{i_s} - \langle u_{i_s} \rangle_{\bar u} \right) \right\rangle_{\bar u}, \tag{12}$$

*where the summation is over all possible pairings P of the coordinates that are in the list.*

*Lemmas 1* and *2* can be deduced from standard
linear regression theory. *Lemma 3* can be proved by noting
that a delta function is the limit of a narrow Gaussian function. As a
result, the projection of a Gaussian measure on the subspace of
functions that satisfy the constraints can be viewed as approximately
Gaussian, hence satisfying Wick’s theorem; an appropriate limit can
then be taken.
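In matrix form, the two regression lemmas amount to standard Gaussian conditioning. The sketch below is our own rendering of those formulas (the function name and inputs are generic placeholders); it returns the constrained mean and covariance and can be checked against brute-force sampling:

```python
import numpy as np

def constrained_gaussian(mean, cov, G, u_bar):
    """Mean and covariance of u conditioned on G @ u = u_bar,
    for u ~ N(mean, cov); Q and c are the objects of Eq. 10."""
    S = G @ cov @ G.T                         # covariance of the constraints
    Q = cov @ G.T @ np.linalg.inv(S)          # the n x m matrix Q of Eq. 10
    c = mean - Q @ (G @ mean)                 # the n-vector c of Eq. 10
    cond_mean = Q @ u_bar + c                 # Lemma 1 (Eq. 9)
    cond_cov = cov - Q @ G @ cov              # Lemma 2 (Eq. 11)
    return cond_mean, cond_cov

# tiny usage example: condition 3 jointly normal variables on their sum
cov = np.array([[1.0, 0.3, 0.0], [0.3, 1.0, 0.3], [0.0, 0.3, 1.0]])
G = np.ones((1, 3))
m_c, C_c = constrained_gaussian(np.zeros(3), cov, G, np.array([1.5]))
```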

## A Linear Schrödinger Equation

We demonstrate our method by applying it to two equations of Schrödinger type. We chose these problems because we feel that a nonlinear Schrödinger equation is a suitable one-dimensional cartoon of the Euler/Navier–Stokes problem that we are most interested in: it is Hamiltonian and nonlinear. The more popular cartoon, the Burgers equation, will be analyzed elsewhere; its peculiar properties (dominance of the solution by shocks and the need for a driving noise term to obtain an invariant measure) introduce added complications whose analysis does not fit within a short introductory paper.

We start with a linear Schrödinger equation written as a pair of real equations:

$$\frac{\partial p}{\partial t} = \frac{\partial^2 q}{\partial x^2} - \mu q, \qquad \frac{\partial q}{\partial t} = -\frac{\partial^2 p}{\partial x^2} + \mu p, \tag{13}$$

where *p*, *q* : [0, 2π) × [0, ∞) → ℝ, μ is a given constant, and periodic boundary conditions are assumed. This system of equations can be derived from a Hamiltonian,

$$H[p, q] = \frac12 \int_0^{2\pi} \left[ \left( \frac{\partial p}{\partial x} \right)^2 + \left( \frac{\partial q}{\partial x} \right)^2 + \mu \left( p^2 + q^2 \right) \right] dx, \tag{14}$$

where *p*(*x*, *t*) and *q*(*x*, *t*) are the canonically conjugate variables.

This system has an invariant measure, the canonical measure, depending on a temperature that we set equal to one, so that the invariant measure has the density distribution *F*[*p*, *q*] = exp(−*H*[*p*, *q*]). The quadratic form of the Hamiltonian makes the canonical distribution Gaussian, completely specified by the average functions ⟨*p*(*x*)⟩ and ⟨*q*(*x*)⟩, which are zero by symmetry, and by the covariance functions,

$$\langle p(x)\, p(x') \rangle = \langle q(x)\, q(x') \rangle = C(x - x'), \qquad \langle p(x)\, q(x') \rangle = 0, \tag{15}$$

where *C* is the Green’s function of the operator −*d*²/*dx*² + μ on the periodic domain. The measure is absolutely continuous with respect to a Wiener measure, and the sample functions in this ensemble are continuous but not differentiable with probability one, so that the evolution of a single solution in time is hard to calculate.
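Under this reading of Eq. 15, *C* can be evaluated as a truncated Fourier series with coefficients 1/(*k*² + μ). The sketch below is our own discretization; the truncation level `kmax` is an assumption chosen for illustration:

```python
import numpy as np

def covariance(x, mu=1.0, kmax=2000):
    """C(x) = (1/2pi) * sum_{k in Z} exp(ikx)/(k^2 + mu), truncated at kmax;
    this is the periodic Green's function of -d^2/dx^2 + mu."""
    x = np.atleast_1d(x)
    k = np.arange(1, kmax + 1)
    series = np.sum(np.cos(np.outer(x, k)) / (k**2 + mu), axis=1)
    return (1.0 / (2.0 * np.pi)) * (1.0 / mu + 2.0 * series)
```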

We pick initial data from the invariant ensemble, but then assume that measurement has revealed, at each of *m* points, the values of two sets of filtered quantities that can be thought of as local averages of the solution:

$$\bar p_\alpha(t) = \int G(x - x_\alpha)\, p(x, t)\, dx, \qquad \bar q_\alpha(t) = \int G(x - x_\alpha)\, q(x, t)\, dx, \tag{16}$$

where α = 1, …, *m*, *x*_{α} = 2πα/*m* are the points about which the averages are evaluated, and the function *G*(·) is a Gaussian filter (i.e., a function of Gaussian shape, not a random variable) of width σ. Once we know these filtered values, the constrained average is defined. We wish of course to evaluate the future means of the solutions in the ensemble that has been constrained in this way, without calculating any particular solutions and using only the information at the *m* given points.

Given the values of the filters, formulas (Eqs. 9 and 11) allow us to find the mean of the solution in the ensemble conditioned by these filtered values; a short manipulation yields

$$\langle p(x) \rangle_{\bar p} = \sum_{\alpha, \beta} r_\alpha(x)\, s^{-1}_{\alpha\beta}\, \bar p_\beta, \tag{17}$$

where

$$r_\alpha(x) = \int G(x' - x_\alpha)\, C(x - x')\, dx', \qquad s_{\alpha\beta} = \iint G(x - x_\alpha)\, G(x' - x_\beta)\, C(x - x')\, dx\, dx', \tag{18}$$

and the *s*^{−1}_{αβ} are the α, β elements of the matrix that is the inverse of the matrix whose entries are the *s*_{αβ}; an analogous formula holds for ⟨*q*(*x*)⟩_{q̄}. The result of this procedure is shown in Fig. 1, where we plot the mean of the functions that satisfy the constraints imposed at a set of 10 points. The width of the filtering kernel is σ = 2π/*m* (the distance between the grid points). Note that we know the mean of the functions in the constrained ensemble exactly, and we know it everywhere.
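Numerically, Eqs. 17 and 18 reduce to linear algebra once the covariance and the filter are discretized on a fine grid. This sketch is our own discretization (reusing `gaussian_filter_matrix` and `covariance` from the earlier sketches, with illustrative grid sizes) and evaluates the constrained mean on the whole fine grid:

```python
import numpy as np

n, m, mu = 256, 10, 1.0
x = 2.0 * np.pi * np.arange(n) / n
G = gaussian_filter_matrix(n, m, sigma=2.0 * np.pi / m)

# circulant covariance matrix C_ij = C(x_i - x_j), built from one row
c_row = covariance(x, mu)
idx = (np.arange(n)[None, :] - np.arange(n)[:, None]) % n
C = c_row[idx]

R = C @ G.T                    # columns are r_alpha(x) of Eq. 18
S = G @ C @ G.T                # the matrix s_{alpha beta} of Eq. 18
p_bar = np.random.randn(m)     # stand-in for observed filtered values
mean_p = R @ np.linalg.solve(S, p_bar)   # Eq. 17 on the fine grid
```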

In the present linear problem, if one knows the mean of the functions in the constrained ensemble, one can readily evaluate the mean time derivative at the grid points by applying the differential operator to the mean function. However, one cannot integrate this mean derivative in time and expect to obtain the mean future, because at a later time the ensemble that will have evolved from the initial constrained ensemble will not be the same as the initial ensemble. The assumption in the whole approach is that the ensemble prepared initially by placing conditions (the values of the filters) on the functions sampled from the invariant ensemble evolves into an ensemble that can still be represented as a subset of the ensemble that carries the invariant measure, with a measure conditioned by new filters *G* and filtered values **p̄** and **q̄**. Here we are making the further assumption that it is sufficient to advance in time only the mean values of **p̄** and **q̄**. Hence,

$$\frac{d\bar p_\alpha}{dt} = \int G(x - x_\alpha) \left( \frac{\partial^2}{\partial x^2} - \mu \right) \langle q(x) \rangle_{\bar q}\, dx. \tag{19}$$

Using Eqs. 17 and 15 we find the more detailed expression

$$\frac{d\bar p_\alpha}{dt} = \sum_\beta m_{\alpha\beta}\, \bar q_\beta, \tag{20}$$

with

$$m_{\alpha\beta} = \sum_\gamma \left[ \int G(x - x_\alpha) \left( \frac{\partial^2}{\partial x^2} - \mu \right) r_\gamma(x)\, dx \right] s^{-1}_{\gamma\beta}. \tag{21}$$

An analogous equation with opposite sign is obtained for the mean time derivative of the vector **q̄**. We now have a closed system of equations for the estimation of the mean values of **p̄** and **q̄** in the future. We emphasize that the last assumption, whereby it is sufficient to advance in time the mean values of **p̄** and **q̄**, is nonessential and unlikely to be true in general (see the concluding section). The domain of validity of the more general assumption remains to be determined.
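The coefficients *m*_{αβ} of Eq. 21 can be assembled from the arrays of the previous sketch. In the sketch below, the spectral differentiation and the plain forward-Euler time stepping are our choices for brevity, not the paper's:

```python
import numpy as np

def second_derivative(F, length=2.0 * np.pi):
    """Spectral d^2/dx^2 applied to each column of F (periodic in axis 0)."""
    n = F.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.real(np.fft.ifft(-(k**2)[:, None] * np.fft.fft(F, axis=0), axis=0))

M = G @ (second_derivative(R) - mu * R) @ np.linalg.inv(S)   # Eq. 21

p_bar = np.random.randn(m)      # stand-in initial filtered values
q_bar = np.random.randn(m)
dt = 1.0e-3
for _ in range(1000):           # Eq. 20 and its opposite-sign companion
    p_bar, q_bar = p_bar + dt * (M @ q_bar), q_bar - dt * (M @ p_bar)
```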

In Fig. 2 we plot the mean evolution of five filtered values p̄_{α}(*t*) for randomly selected initial values of p̄_{α}(0) and q̄_{α}(0). The filtering function *G*(·) has a width (defined above) equal to the distance between mesh points. In this linear problem the evolution of the mean of the filtered values can be calculated exactly simply by using Green’s function for the problem to advance the mean solution and hence advance the values of the filters. In Fig. 2 the results of an exact calculation of the filtered values are represented by open circles, and the results obtained by integrating the set of 10 ordinary differential equations (Eq. 20) are represented by dashed lines. The two results are practically indistinguishable.

## A Nonlinear Hamiltonian System

We now consider a nonlinear generalization of the method demonstrated in the preceding section. We want to exhibit the power of our method by comparing the solutions that it yields with exact solutions; in the nonlinear case, exact solutions of problems with random data are hard to find, and we resort to a stratagem. Even though our method applies to the full nonlinear partial differential equation, we study a finite dimensional system of 2*n* ordinary differential equations that is formally an approximation of a nonlinear Schrödinger equation:

$$\frac{dp_i}{dt} = \frac{q_{i+1} - 2q_i + q_{i-1}}{\Delta^2} - \left( p_i^2 + q_i^2 \right) q_i, \qquad \frac{dq_i}{dt} = -\frac{p_{i+1} - 2p_i + p_{i-1}}{\Delta^2} + \left( p_i^2 + q_i^2 \right) p_i, \tag{22}$$

where Δ = 1/*n* is a mesh size. The approximation is formal because the solutions of the differential equation are not smooth enough to ensure convergence. We consider a periodic system so that *q*₀ = *q*_{n} and *q*_{n+1} = *q*₁ (with analogous conditions on *p*). In the calculations below, *n* = 16.
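For reference, the right-hand side of Eq. 22 as reconstructed above takes a few lines (our transcription; the sign conventions follow the Hamiltonian form used in the linear section):

```python
import numpy as np

def nls_rhs(p, q, delta):
    """Right-hand side of the discrete system Eq. 22, periodic indexing."""
    lap = lambda v: (np.roll(v, -1) - 2.0 * v + np.roll(v, 1)) / delta**2
    dp = lap(q) - (p**2 + q**2) * q
    dq = -lap(p) + (p**2 + q**2) * p
    return dp, dq
```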

The Hamiltonian of this system is

$$H = \sum_i \Delta \left[ \frac{1}{2} \left( \frac{p_{i+1} - p_i}{\Delta} \right)^2 + \frac{1}{2} \left( \frac{q_{i+1} - q_i}{\Delta} \right)^2 + \frac{1}{4} \left( p_i^2 + q_i^2 \right)^2 \right]. \tag{23}$$

Starting from this discrete system, we shall use a discrete version of the method above to deduce equations for a reduced set of variables, which will reproduce the mean solution of the full set of equations, and we shall be able to show explicitly that the algorithm does what it should.

As in the linear case, we assume that the canonical distribution has a temperature equal to one. Note that now the Hamiltonian is not a quadratic function and the invariant measure is not Gaussian. The lemmas on constrained Gaussian distributions cannot be used without approximation. The equilibrium distribution *P*(**p**, **q**) = *e*^{−H[p,q]} is approximated by a Gaussian distribution, chosen so as to yield the same means and covariances as the original distribution. Symmetry considerations imply that ⟨*p*_{i}⟩ = ⟨*q*_{i}⟩ = ⟨*p*_{i}*q*_{j}⟩ = 0. Translation and reflection invariance, on the other hand, imply that ⟨*p*_{i}*p*_{j}⟩ = ⟨*q*_{i}*q*_{j}⟩ = *f*_{|i−j|}. Thus, the approximate distribution depends on a single vector *f*_{|i−j|} that can be calculated, for example, by a straightforward Monte Carlo simulation.
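A minimal Metropolis sketch for estimating *f* might look as follows. This is entirely our implementation, using the Hamiltonian as reconstructed in Eq. 23; the step size, chain length, burn-in, and thinning are illustrative assumptions:

```python
import numpy as np

def hamiltonian(p, q, delta):
    """Discrete Hamiltonian of Eq. 23 with periodic boundary conditions."""
    dp = np.roll(p, -1) - p
    dq = np.roll(q, -1) - q
    return (np.sum(dp**2 + dq**2) / (2.0 * delta)
            + delta * np.sum((p**2 + q**2)**2) / 4.0)

def estimate_f(n=16, steps=200_000, step=0.3, seed=0):
    """Metropolis estimate of f_j = <q_0 q_j>, using translation invariance."""
    rng = np.random.default_rng(seed)
    delta = 1.0 / n
    p, q = np.zeros(n), np.zeros(n)
    h = hamiltonian(p, q, delta)
    f, count = np.zeros(n), 0
    for t in range(steps):
        i = rng.integers(2 * n)                 # pick one coordinate to move
        pp, qq = p.copy(), q.copy()
        (pp if i < n else qq)[i % n] += step * rng.standard_normal()
        h_new = hamiltonian(pp, qq, delta)
        if h_new <= h or rng.random() < np.exp(h - h_new):
            p, q, h = pp, qq, h_new
        if t > steps // 10 and t % 10 == 0:     # past burn-in, thinned
            f += np.array([np.mean(q * np.roll(q, j)) for j in range(n)])
            count += 1
    return f / count
```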

We define a subset of 2*m* coarse variables (*m* ≪ *n*) that represent local averages of the vectors **p** and **q**:

$$\bar p_\alpha = \sum_i g_{\alpha i}\, p_i, \qquad \bar q_\alpha = \sum_i g_{\alpha i}\, q_i, \tag{24}$$

where the index α = 1, …, *m* enumerates the coarse variables. We consider the case *m* = 2; thus, we replace a set of 32 variables by a reduced set of 4 variables. The filtering matrix *G* with entries *g*_{αi} has a Gaussian profile:

$$g_{\alpha i} = B \exp\!\left( -\frac{(x_i - \bar x_\alpha)^2}{2\sigma^2} \right), \tag{25}$$

where *B* is a normalization factor, *x*_{i} = *i*/*n*, x̄_{α} = α/*m*, and σ is the width of the filter. In the calculations below we took σ = 0.25, i.e., the filter averages over an interval of the order of the distance between the coarse data points.

Given the constrained distribution, the mean time derivative of the coarse variables can be calculated. For example,

$$\frac{d\bar p_1}{dt} = \sum_i g_{1i}\, \frac{\langle q_{i+1} \rangle_{\bar p, \bar q} - 2 \langle q_i \rangle_{\bar p, \bar q} + \langle q_{i-1} \rangle_{\bar p, \bar q}}{\Delta^2} - \sum_i g_{1i}\, \big\langle \left( p_i^2 + q_i^2 \right) q_i \big\rangle_{\bar p, \bar q}. \tag{26}$$

The first term on the right-hand side can readily be calculated by the interpolation formula (*Lemma 1*). For the second term, which involves the expected value of a cubic function of the variables, we use the Gaussian approximation and apply Wick’s theorem,

$$\big\langle \left( p_i^2 + q_i^2 \right) q_i \big\rangle_{\bar p, \bar q} = \left( \langle p_i \rangle^2_{\bar p, \bar q} + \Sigma^{pp}_{ii} \right) \langle q_i \rangle_{\bar p, \bar q} + \langle q_i \rangle^3_{\bar p, \bar q} + 3\, \Sigma^{qq}_{ii}\, \langle q_i \rangle_{\bar p, \bar q}, \tag{27}$$

where Σ^{pp} and Σ^{qq} are the conditional covariance matrices of *Lemma 2*. All these terms can be computed explicitly by using *Lemmas 1* and *2*. The mean time derivative of p̄₁ is

$$\frac{d\bar p_1}{dt} = \sum_i g_{1i} \left\{ \frac{\langle q_{i+1} \rangle - 2 \langle q_i \rangle + \langle q_{i-1} \rangle}{\Delta^2} - \left[ \left( \langle p_i \rangle^2 + \Sigma^{pp}_{ii} \right) \langle q_i \rangle + \langle q_i \rangle^3 + 3\, \Sigma^{qq}_{ii}\, \langle q_i \rangle \right] \right\}, \tag{28}$$

where all the averages are conditioned on **p̄** and **q̄**. The term in square brackets is the “gaussianized” approximation to the cubic term in Eq. 26. The equation for p̄₂ is obtained by substituting 1 ↔ 2; the equations for q̄₁, q̄₂ are obtained by the transformation p̄ → q̄ and q̄ → −p̄.
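Putting the pieces together, the reduced right-hand side can be sketched as below. This is our code, not the paper's: `C_f` stands for the *n* × *n* covariance matrix assembled from the Monte Carlo vector *f*, and `G` for the 2 × 16 filter matrix of Eq. 25. Because the Gaussian approximation has ⟨*p*_{i}*q*_{j}⟩ = 0 and the p̄ constraints involve only **p**, the conditioning of **p** and **q** separates:

```python
import numpy as np

def reduced_rhs(p_bar, q_bar, G, C_f, delta):
    """Gaussianized reduced equations (Eqs. 26-28 and their companions)."""
    S = G @ C_f @ G.T
    Q = C_f @ G.T @ np.linalg.inv(S)       # Lemma 1; unconstrained means vanish
    var = np.diag(C_f - Q @ G @ C_f)       # Lemma 2, diagonal entries Sigma_ii
    mp, mq = Q @ p_bar, Q @ q_bar          # conditional means of p_i and q_i
    lap = lambda v: (np.roll(v, -1) - 2.0 * v + np.roll(v, 1)) / delta**2
    # Wick (Eq. 27): <(p^2+q^2) q> = (mp^2 + var) mq + mq^3 + 3 var mq, etc.
    cubic_q = (mp**2 + var) * mq + mq**3 + 3.0 * var * mq
    cubic_p = (mq**2 + var) * mp + mp**3 + 3.0 * var * mp
    dp_bar = G @ (lap(mq) - cubic_q)       # Eq. 28 for each alpha
    dq_bar = G @ (-lap(mp) + cubic_p)      # via p -> q, q -> -p
    return dp_bar, dq_bar
```

The four-dimensional system can then be integrated with any standard ODE solver.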

We now test the accuracy of the prediction of the mean evolution of the
vectors **p?** and **q?**, obtained by the
integration of equations (Eq. 28). In contrast to the linear
example, we do not have an exact mean solution for comparison. Instead,
we generated numerically an ensemble of initial conditions that sampled
the constrained canonical ensemble. We employed a standard Metropolis
algorithm and sampled 10^{4} states. Each state was then
evolved in time using a fourth-order Runge–Kutta solver, and by
averaging over the evolved ensemble we were able to compute the mean
evolution of the solution of the system of 32 equations, which we then
compared with the solution of the reduced system. The possibility of
making this comparison is the reason we started with a discrete system
rather than the full partial differential equation.
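The verification loop can be sketched as follows (our code, reusing `nls_rhs` from above; a complete version would also need a constrained variant of the Metropolis sampler to generate the initial ensemble):

```python
import numpy as np

def rk4_step(p, q, dt, delta):
    """One fourth-order Runge-Kutta step of the full 2n-dimensional system."""
    def f(y):
        dp, dq = nls_rhs(y[: y.size // 2], y[y.size // 2 :], delta)
        return np.concatenate([dp, dq])
    y = np.concatenate([p, q])
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    y = y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y[: p.size], y[p.size :]

def mean_filtered_trajectory(samples, G, dt, n_steps, delta):
    """Evolve every ensemble member and record the mean filtered values."""
    history = []
    for _ in range(n_steps):
        samples = [rk4_step(p, q, dt, delta) for p, q in samples]
        history.append(np.mean([G @ p for p, _ in samples], axis=0))
    return np.array(history)    # the "true" mean evolution of p_bar
```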

A comparison between the true mean and the mean produced by our averaging/updating procedure is shown in Fig. 3 *a* and *b*. The two graphs show different intervals of time; the coarse initial conditions are p̄₁ = 0.2, p̄₂ = 0.3, q̄₁ = 0.6, and q̄₂ = 0.55.

## Discussion

We have shown by examples that one can calculate average solutions
accurately on a crude grid. The examples were simple but not trivial;
in particular, they were complicated enough so that the calculation of
even a single sample solution was difficult. The main tools were
interpolation based on regression and the characterization of
constrained measures by time-dependent filters. In more complete papers
to come we plan to show that the method generalizes to more complicated
situations. In the Gaussian case, one can advance in time not only the
mean solution but also its moments and covariances. As more information
is updated, one can update not only the values of the filtered
variables, but also parameters that determine the structure of the
filters (i.e., one does not have to keep *G* in Eq.
2 fixed). In non-Gaussian problems, one can perform local
“gaussianization” rather than global “gaussianization,” as
above, and describe a broader range of measures. Furthermore, it should
be obvious from the discussion that one does not need complete
information about the measures, but only the covariances and low-order
moments on scales smaller than the distance between mesh points. This
partial information is available in a broad range of situations (7).

## Acknowledgments

We are grateful to Prof. G. I. Barenblatt and to Dr. D. Nolan for many fruitful discussions. This work was supported by the Applied Mathematical Sciences Subprogram of the Office of Energy Research, under Contract Number DE-AC03-76SF-00098.

Copyright © 1998, The National Academy of Sciences

## References

1. Lesieur, M., Comte, P. & Zinn-Justin, J.
2. Papoulis, A.
3. Chorin, A. J.
4. —
5. —
6. Kleinert, H.
7. Barenblatt, G. I. & Chorin, A. J.