4 Linear DAEs with Time-varying Coefficients

In this section, we consider linear DAEs with variable, i.e. time-dependent, coefficients. This means, for matrix-valued functions E, A \in \mathcal C(\mathcal I,\mathbb C^{m,n}) and a right-hand side f \in \mathcal C(\mathcal I,\mathbb C^{m}), we consider the DAE \begin{equation} E(t)\dot x(t)=A(t)x(t)+f(t) \tag{4.1} \end{equation} with, possibly, an initial condition \begin{equation} x(t_0)=x_0 \in \mathbb C^n. \tag{4.2} \end{equation}

The same general solution concept applies: a solution x should be differentiable, fulfill the DAE pointwise on \mathcal I, and, if stated, satisfy the initial condition too.

In the constant coefficient case, regularity of the matrix pair played a decisive role for the existence and uniqueness of solutions; see, e.g., Section 3.4. Thus it seems natural to extend this concept to the time-varying case, e.g., by requiring that (E(t),A(t)) is a regular matrix pair for every t. However, the following two examples show that this does not work out of the box.

Example 4.1 Let E, A be given as E(t)=\begin{bmatrix} -t & t^2 \\ -1 & t \end{bmatrix}, \quad A(t)=\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}. Then \det(\lambda E(t)-A(t)) = 1 for all t\in \mathcal I, i.e., the matrix pair (E(t), A(t)) is regular for every t. Still, for every c \in \mathcal C^1(\mathcal I, \mathbb C) with c(t_0)=0, the function x\colon t \mapsto c(t)\begin{bmatrix} t\\1 \end{bmatrix} solves the homogeneous initial value problem (4.1)–(4.2) with f=0 and x_0=0.
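A minimal symbolic check of this claim (not part of the notes, assuming sympy is available; the scalar function c is left completely arbitrary):

```python
import sympy as sp

t = sp.symbols("t")
c = sp.Function("c")

E = sp.Matrix([[-t, t**2], [-1, t]])
A = sp.Matrix([[-1, 0], [0, -1]])

# candidate solution x(t) = c(t) * (t, 1)^T for an arbitrary scalar function c
x = c(t) * sp.Matrix([t, 1])

# residual of the homogeneous DAE:  E(t) x'(t) - A(t) x(t)
residual = sp.simplify(E * x.diff(t) - A * x)
print(residual)                            # -> Matrix([[0], [0]]), independent of c

# the pair (E(t), A(t)) is nevertheless regular for every t
lam = sp.symbols("lambda")
print(sp.simplify((lam * E - A).det()))    # -> 1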

This is an example where the pair (E(t),A(t)) is regular uniformly with respect to t but the associated homogeneous DAE still has infinitely many solutions. Question: What about the initial value? Why does it not help to make the solution unique?

Next we see the contrary – a matrix pair that is singular for every t but whose DAE has a unique solution.

Example 4.2 For E(t) = \begin{bmatrix} 0 & 0 \\ 1 & -t \end{bmatrix}, \quad A(t) = \begin{bmatrix} -1 & t \\ 0&0 \end{bmatrix}, \quad f(t) = \begin{bmatrix} f_1(t) \\ f_2(t) \end{bmatrix}, one has \det ( \lambda E(t) - A(t)) = 0 for all t\in \mathcal I. Still, writing x=(x_1, x_2) for a solution, the DAE reads \begin{align*} 0 &= -x_1(t) + tx_2(t) + f_1(t), \\ \dot x_1(t) - t\dot x_2(t) &= \phantom{-x_1(t) + tx_2(t) +}f_2(t). \end{align*} Differentiating the first equation gives \dot x_1(t) = t\dot x_2(t) +x_2(t) + \dot f_1(t), i.e. \dot x_1(t) - t\dot x_2(t) = x_2(t) + \dot f_1(t), so that the second equation becomes x_2(t) + \dot f_1(t) = f_2(t), which uniquely defines x_2(t) = - \dot f_1(t) + f_2(t) and, via the first equation, also x_1(t) = t(- \dot f_1(t) + f_2(t))+f_1(t).
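Again a minimal symbolic sketch (not part of the notes) that plugs the derived expressions for x_1 and x_2 back into the DAE; f_1, f_2 are left as arbitrary symbolic functions:

```python
import sympy as sp

t = sp.symbols("t")
f1, f2 = sp.Function("f1"), sp.Function("f2")

E = sp.Matrix([[0, 0], [1, -t]])
A = sp.Matrix([[-1, t], [0, 0]])
f = sp.Matrix([f1(t), f2(t)])

# the pair is singular for every t ...
lam = sp.symbols("lambda")
print(sp.simplify((lam * E - A).det()))    # -> 0

# ... but the derived solution is unique: plug it back into the DAE
x2 = -sp.diff(f1(t), t) + f2(t)
x1 = t * x2 + f1(t)
x = sp.Matrix([x1, x2])

residual = sp.simplify(E * x.diff(t) - A * x - f)
print(residual)                            # -> Matrix([[0], [0]])
```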

For both examples, the initial value x(t_0) then has to be chosen consistently with the right-hand side before one can discuss whether and in what sense a solution of the initial value problem exists.

Recall that for the constant coefficient case, we were using invertible scaling and state transformation matrices P and Q for the equivalence transformations E \dot x(t) = Ax(t) +f(t) \quad \sim \quad \tilde E \dot {\tilde x}(t) = \tilde A\tilde x(t) +\tilde f(t)

with

x=Q\tilde x, \quad \tilde E = PEQ, \quad \tilde A = PAQ, \quad \tilde f = Pf.

For the time-varying case, we will use time-varying transformations and require that they are invertible at every point t in time.

Definition 4.1 Two pairs (E_i,A_i), E_i, A_i \in \mathcal C(\mathcal I, \mathbb C^{m,n}), i=1,2, of matrix functions are called (globally) equivalent, if there exist pointwise nonsingular matrix functions P\in \mathcal C(\mathcal I, \mathbb C^{m,m}) and Q\in \mathcal C^1(\mathcal I, \mathbb C^{n,n}) such that \begin{equation} E_2=PE_1Q, \quad A_2 = PA_1Q-PE_1\dot Q \tag{4.3} \end{equation} for all t\in \mathcal I. Again, we write (E_1,A_1) \sim (E_2, A_2).

The need for Q being differentiable and the appearance of E_1\dot Q in the definition of A_2 come from the relation E\dot x(t) = E\frac{d}{dt} (Q\tilde x)(t) = E\bigl(Q(t)\dot {\tilde x}(t) + \dot Q(t)\tilde x(t) \bigr), which connects the transformed state \tilde x to the actual state x = Q\tilde x.
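To see the transformation rule (4.3) in action, the following sketch (assuming sympy, with hand-picked, purely illustrative choices of P and Q) applies a global equivalence transformation to the coefficients of Example 4.2 and checks that the residual of the transformed DAE equals P times the residual of the original one, for an arbitrary transformed state \tilde x:

```python
import sympy as sp

t = sp.symbols("t")
xt1, xt2 = sp.Function("xt1"), sp.Function("xt2")   # components of the transformed state
f1, f2 = sp.Function("f1"), sp.Function("f2")

E1 = sp.Matrix([[0, 0], [1, -t]])
A1 = sp.Matrix([[-1, t], [0, 0]])
f = sp.Matrix([f1(t), f2(t)])

# hand-picked pointwise nonsingular transformations (illustrative choice only)
P = sp.Matrix([[1, 0], [t, 1]])
Q = sp.Matrix([[1, t], [0, 1]])

# global equivalence (4.3)
E2 = P * E1 * Q
A2 = P * A1 * Q - P * E1 * Q.diff(t)

xt = sp.Matrix([xt1(t), xt2(t)])   # transformed state
x = Q * xt                         # original state  x = Q * xt

res_orig = E1 * x.diff(t) - A1 * x - f
res_trafo = E2 * xt.diff(t) - A2 * xt - P * f

print(sp.simplify(P * res_orig - res_trafo))   # -> Matrix([[0], [0]])
```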

Lemma 4.1 The relation on pairs of matrix functions as defined in Definition 4.1 is an equivalence relation.
Proof. Exercise!

Next we will define local equivalence of matrix pairs.

Definition 4.2 Two pairs (E_i,A_i), E_i, A_i \in \mathbb C^{m,n}, i=1,2, of matrices are called locally equivalent, if there exist nonsingular matrices P \in \mathbb C^{m,m} and Q\in \mathbb C^{n,n} as well as a matrix R\in \mathbb C^{n,n} such that \begin{equation} E_2=PE_1Q, \quad A_2 = PA_1Q-PE_1R. \tag{4.4} \end{equation} Again, we write (E_1,A_1) \sim (E_2, A_2) and distinguish the two notions of equivalence by context.
Lemma 4.2 The local equivalence as defined in Definition 4.2 is an equivalence relation on pairs of matrices.
Proof. Exercise!

We state a few observations:

  • Global equivalence implies local equivalence at all points of time t.
  • Vice versa, pointwise local equivalence, e.g. at some time instances t_i with suitable matrices P_i, Q_i, R_i, can be interpolated to a continuous matrix function P and a differentiable matrix function Q by (entrywise) Hermite interpolation, i.e. via P(t_i) = P_i, \quad Q(t_i) = Q_i, \quad \dot Q(t_i) = R_i; see the sketch after this list.
  • Local equivalence is more powerful than the simple equivalence of matrix pairs (cp. Definition 3.1) for which R=0. This means we can expect more structure in a normal form.
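A minimal sketch of such an interpolation for Q (not from the notes, assuming scipy is available; the node data Q_i, R_i below are made up for illustration). Note that pointwise invertibility of the interpolant between the nodes is not automatic and has to be checked separately.

```python
import numpy as np
from scipy.interpolate import CubicHermiteSpline

# sample times and (made-up) local transformation data Q_i and R_i = Q'(t_i)
ts = np.array([0.0, 0.5, 1.0])
Qs = np.stack([np.eye(2), np.array([[1.0, 0.5], [0.0, 1.0]]), np.eye(2)])
Rs = np.stack([np.zeros((2, 2)), np.array([[0.0, 1.0], [0.0, 0.0]]), np.zeros((2, 2))])

# entrywise cubic Hermite interpolation:  Q(t_i) = Q_i  and  Q'(t_i) = R_i
Q_of_t = CubicHermiteSpline(ts, Qs, Rs, axis=0)

print(Q_of_t(0.25))                            # interpolated Q(t) at t = 0.25
print(Q_of_t(ts[1]) - Qs[1])                   # ~0: interpolation conditions met
print(Q_of_t.derivative()(ts[1]) - Rs[1])      # ~0: derivative conditions met
```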

4.1 A Local Canonical Form

For easier explanations, we introduce the slightly incorrect wording that a matrix M spans a vector space V to express that V is the span of the columns of M. Similarly, we will say that M is a basis of V, if the columns of M form a basis for V.

Some more notation:

Notation Explanation
V^H\in \mathbb C^{n,m} the conjugate transpose or Hermitian transpose of a matrix V\in \mathbb C^{m,n}
T' \in \mathbb C^{n,n-k} The complementary space as a matrix. If T\in \mathbb C^{n,k} is a basis of V, then T' contains a basis of V' so that V\oplus V' = \mathbb C^{n}. In particular, the matrix \begin{bmatrix}T&T'\end{bmatrix} is square and invertible.

Theorem 4.1 Let E, A \in \mathbb C^{m,n} and let \begin{equation} T,~Z,~T',~V \tag{4.5} \end{equation} be

Matrix as the basis of
T \operatorname{kernel}E
Z \operatorname{corange}E = \operatorname{kernel}E^H
T' \operatorname{cokernel}E = \operatorname{range}E^H
V \operatorname{corange}(Z^HAT)

then the quantities \begin{equation} r,~a,~s,~d,~u,~v \tag{4.6} \end{equation} defined as

Quantity Definition Name
r \operatorname{rank}E rank
a \operatorname{rank}(Z^HAT) algebraic part
s \operatorname{rank}(V^HZ^HAT') strangeness
d r-s differential part
u n-r-a undetermined variables
v m-r-a-s vanishing equations

are invariant under local equivalence transformations and (E, A) is locally equivalent to the canonical form

\begin{equation} \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right), \tag{4.7} \end{equation} where all diagonal blocks are square, except maybe the last one.
Proof. To be provided. Until then, see Theorem 3.7 in Kunkel/Mehrmann.
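The construction in Theorem 4.1 translates almost literally into a small numerical routine. Below is a minimal sketch (not from the notes, assuming numpy; the helper names null_space and characteristic_values are made up) that computes the bases via the SVD and evaluates r, a, s, d, u, v for a constant pair. It is applied to E=\begin{bmatrix}0&1\\0&0\end{bmatrix}, A=I_2, i.e. the system \dot x_2 = x_1 + f_1, ~0 = x_2 + f_2, for which one expects exactly one strange variable.

```python
import numpy as np

def null_space(M, tol=1e-12):
    """Orthonormal basis of kernel(M) as columns (an (n, 0) array if trivial)."""
    U, sv, Vh = np.linalg.svd(np.atleast_2d(M))
    rank = int((sv > tol).sum())
    return Vh[rank:].conj().T

def characteristic_values(E, A, tol=1e-12):
    """Local characteristic values (r, a, s, d, u, v) of a matrix pair (E, A)."""
    m, n = E.shape
    T = null_space(E)                      # kernel E
    Z = null_space(E.conj().T)             # corange E = kernel E^H
    Tp = null_space(T.conj().T)            # cokernel E = range E^H
    ZAT = Z.conj().T @ A @ T
    V = null_space(ZAT.conj().T)           # corange(Z^H A T)
    r = int(np.linalg.matrix_rank(E, tol=tol))
    a = int(np.linalg.matrix_rank(ZAT, tol=tol)) if ZAT.size else 0
    VZATp = V.conj().T @ Z.conj().T @ A @ Tp
    s = int(np.linalg.matrix_rank(VZATp, tol=tol)) if VZATp.size else 0
    d, u, v = r - s, n - r - a, m - r - a - s
    return r, a, s, d, u, v

E = np.array([[0.0, 1.0], [0.0, 0.0]])
A = np.eye(2)
print(characteristic_values(E, A))   # -> (1, 0, 1, 0, 1, 0): one strange variable
```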

Some remarks on the spaces and how the names are derived for the case E\dot x = Ax +f with constant coefficients. The ideas are readily transferred to the case with time-varying coefficients.

Let x(t) = Ty(t) + T'y'(t),

where y denotes the components of x that evolve in the range of T and y' the respective complement. (Since [T|T'] is a basis of \mathbb C^{n}, there exist such y and y' that uniquely define x and vice versa). With T spanning \ker E we find that

E \dot x(t) = ET\dot y(t) + ET'\dot y'(t) = ET'\dot y'(t)

so that the DAE basically reads

ET'\dot y'(t) = ATy(t) + AT'y'(t)+f,

i.e. the components of x defined through y are, effectively, not differentiated. With Z containing exactly those v, for which v^HE=0, it follows that

Z^HET'\dot y'(t) = 0 = Z^HATy(t) + Z^HAT'y'(t)+Z^Hf,

or

Z^HATy(t) = -Z^HAT'y'(t)-Z^Hf,

so that \operatorname{rank}Z^HAT indeed describes the number of purely algebraic equations and variables in the sense that it defines parts of y (which are never differentiated) in terms of algebraic relations (no time derivatives are involved).

With the same arguments and with V=\operatorname{corange}Z^HAT, it follows that

V^HZ^HAT'y'(t) = -V^HZ^HATy(t) -V^HZ^Hf=-V^HZ^Hf,

is the part of E\dot x = Ax + f in which those components y' that are also differentiated are algebraically equated to a right-hand side. This is the strangeness (rather in the sense of skewness) of DAEs: variables can be both differential and algebraic. Accordingly, \operatorname{rank}V^HZ^HAT' describes the size of this strange (skewness) component.

Finally, those variables that are neither strange nor purely algebraic, i.e. those that are differentiated but not defined algebraically, are the differential variables. There is no direct characterization of them, but one can calculate their number as r-s, i.e. the number of differentiated variables minus the number of strange ones.

Outlook: If there is no strangeness, the DAE is called strangeness-free. Strangeness can be eliminated through iterated differentiation and substitution. The needed number of such iterations (which is independent of the size s of the strange block here) will define the strangeness index.

Example 4.3 With a basic state transformation \begin{bmatrix} \tilde x_1 \\ \tilde x_2 \\ \tilde x_3 \end{bmatrix} = \begin{bmatrix} x_3 - x_2 \\ x_2-x_1 \\ x_3 \end{bmatrix}, one finds for the coefficients of Example 1.2 that: (E, A) \sim \left( \begin{bmatrix} C & 0 & 0 \\ 0 & 0 &0 \\ 0 & 0 &0 \end{bmatrix} , \begin{bmatrix} 0 & \frac{1}{R} & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right).

We compute the subspaces as defined in (4.5):

Matrix as the basis of/computed as
T=\begin{bmatrix} 0 \\ I_2 \end{bmatrix} \operatorname{kernel}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}
Z=\begin{bmatrix} 0 \\ I_2 \end{bmatrix} \operatorname{corange}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}=\operatorname{kernel}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}^H
T'=\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \operatorname{cokernel}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}=\operatorname{range}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}^H
Z^HAT=I_2 \begin{bmatrix} 0 \\ I_2 \end{bmatrix}^H\begin{bmatrix} 0 & \frac{1}{R} & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ I_2 \end{bmatrix}
V=\begin{bmatrix} 0 \\ 0 \end{bmatrix} \operatorname{corange}(Z^HAT) = \operatorname{kernel}I_2^H
V^HZ^HAT'=\begin{bmatrix} 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix}^H\begin{bmatrix} 0 \\ I_2 \end{bmatrix}^H\begin{bmatrix} 0 & \frac{1}{R} & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\0 \\ 0 \end{bmatrix}

and derive the quantities as defined in (4.6):

Name Value Derived from
rank r=1 \operatorname{rank}E = \operatorname{rank}\begin{bmatrix} C & 0 \\ 0 & 0_2 \end{bmatrix}
algebraic part a=2 \operatorname{rank}Z^HAT = \operatorname{rank}I_2
strangeness s=0 \operatorname{rank}V^HZ^HAT' = \operatorname{rank}\begin{bmatrix} 0\end{bmatrix}
differential part d=1 d=r-s=1-0
undetermined variables u=0 u=n-r-a=3-1-2
vanishing equations v=0 v=m-r-a-s=3-1-2-0
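These values can also be reproduced numerically by evaluating the ranks from the table directly (a minimal sketch, not from the notes; the concrete values C=1 and R=1 are placeholders):

```python
import numpy as np

C, R = 1.0, 1.0   # placeholder values for the capacitance and the resistance

E = np.array([[C, 0, 0],
              [0, 0, 0],
              [0, 0, 0]])
A = np.array([[0, 1 / R, 0],
              [-1, 1, 0],
              [0, 0, 1]])

# bases from the table above
T  = np.array([[0, 0], [1, 0], [0, 1]])   # kernel E
Z  = np.array([[0, 0], [1, 0], [0, 1]])   # corange E
Tp = np.array([[1], [0], [0]])            # cokernel E

ZAT = Z.T @ A @ T                          # = I_2
V = np.zeros((2, 0))                       # corange(I_2) is trivial

r = np.linalg.matrix_rank(E)               # 1
a = np.linalg.matrix_rank(ZAT)             # 2
s = np.linalg.matrix_rank(V.T @ Z.T @ A @ Tp) if V.size else 0   # 0
print(r, a, s, r - s, 3 - r - a, 3 - r - a - s)   # -> 1 2 0 1 0 0
```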
Example 4.4

For the semi-discrete linearized Navier-Stokes equations, the derivation of the local characteristic quantities is laid out in the Example Section.
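As a hedged preview (this is not the derivation from the Example Section), the following sketch applies the same rank computations to a generic saddle-point pair \left(\begin{bmatrix} M & 0 \\ 0 & 0\end{bmatrix}, \begin{bmatrix} A_{11} & B^T \\ B & 0 \end{bmatrix}\right), with randomly generated M (invertible) and B (full row rank) standing in for the discretized operators and with made-up toy dimensions n_v and n_p; for this generic setup one finds r=n_v, a=0, and s=n_p, i.e. all constraint equations are strange.

```python
import numpy as np

rng = np.random.default_rng(0)
nv, np_ = 6, 2                        # velocity-like and pressure-like dimensions (toy sizes)

M = np.eye(nv) + 0.1 * rng.standard_normal((nv, nv))    # invertible mass-like matrix
A11 = rng.standard_normal((nv, nv))
B = rng.standard_normal((np_, nv))                       # generically full row rank

E = np.block([[M, np.zeros((nv, np_))],
              [np.zeros((np_, nv)), np.zeros((np_, np_))]])
A = np.block([[A11, B.T],
              [B, np.zeros((np_, np_))]])

def basis_kernel(X, tol=1e-10):
    U, sv, Vh = np.linalg.svd(X)
    rank = int((sv > tol).sum())
    return Vh[rank:].T

T = basis_kernel(E)                   # kernel E   : pressure-like directions
Z = basis_kernel(E.T)                 # corange E  : the constraint rows
Tp = basis_kernel(T.T)                # cokernel E : velocity-like directions
ZAT = Z.T @ A @ T
V = basis_kernel(ZAT.T)               # corange(Z^H A T)

r = np.linalg.matrix_rank(E, tol=1e-10)
a = np.linalg.matrix_rank(ZAT, tol=1e-10)
s = np.linalg.matrix_rank(V.T @ Z.T @ A @ Tp, tol=1e-10)
print(r, a, s)                        # -> 6 0 2, i.e. r = nv, a = 0, s = np_
```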


4.2 A Global Canonical Form

A few observations:

  • For a pair (E, A) of matrix functions, we can compute the characteristic values r, a, s, d as in (4.6) for any given t\in \mathcal I.
  • Thus, r, a, s, d\colon \mathcal I \to \mathbb R are functions of time t.
  • We will assume that r, a, s, d are constant in time:
    • Analysis will be enabled through a so-called smooth singular value decomposition (SVD – see the following theorem) that applies for matrices of constant rank.
    • Smooth matrix functions have countably many jumps in the rank; the analysis can then be performed on subintervals on which the ranks of the matrices are constant.
    • In practice, there are typically only a few jumps in the rank, occurring at particular but known time instances or circumstances.

Regarding the few and known jumps in the rank: a change in the ranks means an instantaneous change in the model itself. In fact, the characteristic values, like the number of purely algebraic equations, would change abruptly. An example is the activation of a switch in an electrical circuit, or switched systems in general.

Theorem 4.2 (see Kunkel/Mehrmann, Thm. 3.9) Let E\in \mathcal C^\ell(\mathcal I, \mathbb C^{m,n}) with \operatorname{rank}E(t)=r for all t\in \mathcal I. Then there exist pointwise unitary (and, thus, nonsingular) matrix functions U\in \mathcal C^\ell(\mathcal I, \mathbb C^{m,m}) and V\in \mathcal C^\ell(\mathcal I, \mathbb C^{n,n}), such that

U^HEV = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} with pointwise nonsingular \Sigma \in \mathcal C^\ell(\mathcal I, \mathbb C^{r,r}).
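Computing a genuinely smooth SVD numerically requires care (signs and ordering of the singular vectors must be fixed consistently along \mathcal I), which is beyond this sketch. The following minimal check (not from the notes, assuming numpy) only verifies the constant-rank assumption pointwise on a grid, here for the E(t) of Example 4.1:

```python
import numpy as np

def E(t):
    return np.array([[-t, t**2],
                     [-1.0, t]])

# pointwise rank check on a sample grid (det E(t) = 0, but E(t) never vanishes)
ranks = {int(np.linalg.matrix_rank(E(t), tol=1e-10)) for t in np.linspace(-2.0, 2.0, 401)}
print(ranks)   # -> {1}: E(t) has constant rank 1 on the sampled interval
```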

Theorem 4.3 Let E, A \in \mathcal C^\ell(\mathcal I, \mathbb C^{m,n}) be sufficiently smooth and suppose that \begin{equation} r(t) \equiv r, \quad a(t)\equiv a, \quad s(t)\equiv s \quad \text{on } \mathcal I \tag{4.8} \end{equation}

for the local characteristic values of (E(t), A(t)). Then (E, A) is globally equivalent to the canonical form \begin{equation} \left( \begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12}& 0 & A_{14} \\ 0 & 0 & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \right ). \tag{4.9} \end{equation}

All entries are again matrix functions on \mathcal I and the last block column in both matrix functions of (4.9) has size u=n-s-d-a.

Proof. In what follows, we tacitly redefine the block matrix entries that appear after each global equivalence transformation. The first step is the continuous SVD of E; see Theorem 4.2. Thereafter, the basic operations of

  • condensing blocks by the continuous SVD, e.g. U_2^HA_{31}V_2=\begin{bmatrix} I_s & 0 \\ 0 & 0 \end{bmatrix}
  • and eliminating blocks by adding multiples of block columns or block rows

are applied repeatedly:

\begin{align*} (E,A) & \sim \left(\begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12}V_1 \\ U_1^HA_{21} & U_1^HA_{22}V_1 \end{bmatrix} - \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & \dot V_1 \end{bmatrix} \right) \\ %%%%%%%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_r & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} & A_{13}\\ A_{21} & I_a & 0 \\ A_{31} & 0 & 0 \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} V_2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11}V_2 & A_{12} & A_{13}\\ A_{21}V_2 & I_a & 0 \\ U_2^HA_{31}V_2 & 0 & 0 \end{bmatrix} - \begin{bmatrix} I_r & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \dot V_2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \right)\\ %%%%%%%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} V_{11} & V_{12} & 0 & 0 \\ V_{21} & V_{22} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ A_{21} & A_{22} & A_{23} & A_{24} \\ A_{31} & A_{32} & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \right)\\ %%%%%%%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & A_{23} & A_{24} \\ 0 & A_{32} & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & -A_{32} & I_a & 0 \\ 0 & 0 & 0 & I_u \end{bmatrix} \right) \\ %%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & A_{23} & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right)\\ %%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & 0 & A_{14} \\ 0 & A_{22} & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right)\\ %%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & Q_2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12}Q_2 & 0 & A_{14} \\ 0 & A_{22}Q_2-\dot Q_2 & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right) \\ %%%%%%%%%%%%%%%%% & \sim \left(\begin{bmatrix} I_s & 0 & 0 & 0 \\ 0 & I_d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & A_{12} & 0 & A_{14} \\ 0 & 0 & 0 & A_{24} \\ 0 & 0 & I_a & 0 \\ I_s & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\right), \end{align*} where the final equivalence holds, if Q_2 is chosen as the (unique and pointwise invertible) solution of the linear matrix-valued ODE \dot Q_2 = A_{22}(t)Q_2 , \quad Q_2 (t_0 ) = I_d. Then, the prefinal A_{22}-block vanishes because of the special choice of Q_2 and E_{22} becomes I_d after scaling the second block line by Q_2^{-1}.
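The last step relies on the fundamental solution of the matrix ODE \dot Q_2 = A_{22}(t)Q_2, Q_2(t_0)=I_d, which is pointwise invertible by Liouville's formula, \det Q_2(t) = \exp\int_{t_0}^t \operatorname{trace}A_{22}(\tau)\, d\tau \neq 0. A minimal numerical sketch (not from the notes, assuming scipy, with a made-up coefficient block A_{22}):

```python
import numpy as np
from scipy.integrate import solve_ivp

d = 2
def A22(t):                        # made-up smooth coefficient block
    return np.array([[0.0, 1.0],
                     [-np.sin(t), 0.1 * t]])

def rhs(t, q):                     # vectorized matrix ODE  Q' = A22(t) Q
    Q = q.reshape(d, d)
    return (A22(t) @ Q).ravel()

sol = solve_ivp(rhs, (0.0, 2.0), np.eye(d).ravel(),
                dense_output=True, rtol=1e-9, atol=1e-12)

for t in np.linspace(0.0, 2.0, 5):
    Q2 = sol.sol(t).reshape(d, d)
    print(t, np.linalg.det(Q2))    # determinant stays bounded away from zero
```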

If we write down the transformed DAE that corresponds to the canonical form (4.9), i.e. \begin{align} \dot x_1 &= A_{12}(t)x_2 + A_{14}x_4 + f_1(t) \tag{4.10} \\ \dot x_2 &= A_{24}(t)x_4 + f_2(t) \tag{4.11}\\ 0 &= x_3 + f_3(t) \\ 0 &= x_1 + f_4(t) \tag{4.12} \\ 0 &= f_5(t) \end{align} we can read off a few properties:

  1. the part x_4 is free to choose, i.e. the undetermined part
  2. the equation f_5=0 does not define any variable, i.e. it is the vanishing or redundant part
  3. the part x_2 is defined through an ODE (in this representation)
  4. however, the part x_1 is strange (both differential and algebraic) and still linked to x_2.
Corollary 4.1 In fact, one may differentiate (4.12) and eliminate \dot x_1 in (4.10) to obtain -\dot f_4 = A_{12}(t)x_2 + A_{14}x_4 + f_1(t) which together with (4.11) becomes a new DAE for x_2: \bar E(t) \dot x_2 = \bar A(t) x_2 + \bar f(t) with \begin{equation} \bar E(t) = \begin{bmatrix} I_{d_0} \\ 0_{s_0, d_0} \end{bmatrix}\in \mathbb C^{d_0+s_0, d_0}, \quad \bar A(t) = \begin{bmatrix} 0_{d_0} \\ A_{12}(t) \end{bmatrix} \in \mathbb C^{d_0+s_0, d_0}, \quad \tag{4.13} \end{equation} and \bar f(t) = \begin{bmatrix} A_{24}x_4(t) + f_2(t) \\ A_{14}x_4(t)+f_1(t)+\dot f_4(t) \end{bmatrix} \in \mathbb C^{d_0+s_0}. Here, we have used the subscript to note that these d and s quantities were characteristic for the initial matrix pair (E, A). Now, after this differentiation step, one can calculate the characteristic values d_1, a_1, s_1 again for the pair (E_1, A_1) which is obtained from the canonical form (4.9) by replacing equations (4.10)(4.11) by the DAE with (\bar E, \bar A) as in (4.13).

The following theorem states that this differentiation-elimination step (which is not a global equivalence operation on matrix pairs) is well-defined in the sense that the next iteration characteristic values are invariant under global equivalence transformations.

Theorem 4.4 Assume that the pairs (E, A) and (\tilde E, \tilde A) are globally equivalent and in the global canonical form (4.9). Then the pairs (E_1, A_1) and (\tilde E_1, \tilde A_1) that are obtained by differentiation and elimination as described in Corollary 4.1 are globally equivalent too.
Proof. See Kunkel/Mehrmann: Theorem 3.14.

4.3 The Strangeness Index

Theorem 4.4 comes with a number of implications:

  • Starting with (E, A):=(E_0, A_0), we can define (E_i, A_i), i \in \mathbb N \cup \{0\}, as follows:
    1. bring (E_i, A_i) into the global canonical form (4.9),
    2. differentiate and eliminate as in Corollary 4.1 and bring the obtained pair into global canonical form again to obtain (E_{i+1}, A_{i+1}).
  • This gives a sequence of characteristic values (r_i, a_i, s_i) that are invariant under global equivalence transformations.
  • Since r_{i+1} = r_i - s_i and r_i \geq 0 (the rank of a matrix is always nonnegative), there exists a \mu \in \mathbb N \cup \{0\} for which s_\mu = 0.
  • This \mu is also characteristic for a matrix pair (because r_i and s_i are).

With these observations, the following definition is well-posed.

Definition 4.3 Let (E,A) be a pair of sufficiently smooth matrix functions and let the sequence (r_i, a_i, s_i), i=0,1,2,3,..., of global characteristic values for the pairs (E_i, A_i) that are generated as

  • (E_0, A_0):= (E, A)
  • (E_{i+1}, A_{i+1}) is derived from bringing (E_i, A_i) into the global canonical form as in Theorem 4.3 and removing the I_s block in E_{i+1} through differentiation and elimination as in Corollary 4.1
be well-defined. Then, the quantity \mu := \min\{i\in \mathbb N_0 | s_i=0\} is called the strangeness index of the DAE (4.1). If \mu=0, then the DAE is called strangeness-free.

The practical implications of the strangeness index and the procedure of its derivation are laid out in the following theorem.

Theorem 4.5 Let the strangeness index \mu of (E, A) be well defined and let f\in \mathcal C^\mu(\mathcal I, \mathbb C^{m}). Then the DAE (4.1) is equivalent (in the sense that the solution sets are in a one-to-one correspondence via a pointwise nonsingular matrix function) to a DAE of the form \begin{align} \dot x_1(t) &= A_{13}(t)x_3(t) + f_1(t) \tag{4.14}\\ 0 &= x_2(t) + f_2(t) \tag{4.15}\\ 0 &= f_3(t), \tag{4.16} \end{align} where A_{13} \in \mathcal C(\mathcal I, \mathbb C^{d_\mu, u_\mu}) and where f_1, f_2, f_3 are defined through f, \dot f, \dots, f^{(\mu)}.
System (4.14)–(4.16) is in the form of (4.9) with the I_s blocks not present and the remaining parts of the variables, coefficients, and right-hand side renumbered accordingly.

Corollary 4.2 Let the strangeness index \mu of (E, A) be well defined and let f\in \mathcal C^\mu(\mathcal I, \mathbb C^{m}). Then

  1. The DAE (4.1) is solvable if and only if the v_\mu consistency conditions (4.16) 0=f_3(t) are fulfilled.

  2. An initial condition (4.2) is consistent if, and only if, the a_\mu conditions 0=x_2(t_0) + f_2(t_0) related to (4.15) hold.

  3. The corresponding initial value problem (4.1)–(4.2) is uniquely solvable if, and only if, in addition u_\mu=0, i.e., x_3 is not present in (4.14).

Just by comparing the solvability conditions with those for the constant coefficient case, e.g. Theorem 3.3, we can observe that

  1. \mu = \nu -1 (unless \nu =0, in which case \mu=0) and
  2. a matrix pair is regular if u_\mu = v_\mu = 0.

4.4 Derivative Arrays

The concept and the derivation of the strangeness index gives a complete characterization of solvability of linear DAEs with time-varying coefficients (provided sufficient regularity of the coefficients and the right hand side). However, for practical considerations there are two shortcomings

  1. The formulation through the canonical form is very implicit and requires the derivatives of computed quantities (like the \dot V_2 in the proof of Theorem 4.3).

  2. There is no direct generalization to nonlinear systems.

Both issues are better addressed by working towards a strangeness-free form via so-called derivative arrays.

For that, we consider the DAE E(t) \dot x = A(t) x+f, differentiate it once, E(t) \ddot x(t) + \dot E(t) \dot x(t) = \dot A(t) x(t) + A(t) \dot x(t)+\dot f(t), and append these equations to the system to obtain the inflated DAE \begin{bmatrix} E & 0\\ \dot E - A & E \end{bmatrix} \frac{d}{dt} \begin{bmatrix} x \\ \dot x \end{bmatrix} = \begin{bmatrix} A & 0\\ \dot A & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot x \end{bmatrix} + \begin{bmatrix} f \\ \dot f \end{bmatrix}.

If we add also the second derivative of the equations, we obtain \begin{bmatrix} E & 0 & 0\\ \dot E - A & E &0 \\ \ddot E - 2\dot A & 2\dot E -A &E \end{bmatrix} \frac{d}{dt} \begin{bmatrix} x \\ \dot x \\ \ddot x \end{bmatrix} = \begin{bmatrix} A & 0 & 0\\ \dot A & 0 & 0\\ \ddot A & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot x \\ \ddot x \end{bmatrix} + \begin{bmatrix} f \\ \dot f \\ \ddot f \end{bmatrix}.

If we do this \ell times, we arrive at the so-called derivative array

Definition 4.4 Consider the DAE (4.1) and let E, A, f be \ell-times differentiable for some integer \ell\geq 0. Then the derivative array of order \ell is given as \begin{equation} M_\ell (t) \dot z_\ell(t) = N_\ell (t) z_\ell(t) + g_\ell(t), \end{equation} where (M_\ell)_{i,j} = \binom{i}{j}E^{(i-j)} - \binom{i}{j+1}A^{(i-j-1)}, \quad i,j=0,\dotsc,\ell,

where (N_\ell)_{i,j} = \begin{cases} A^{(i)}, \quad &\text{for } i=0,\dotsc, \ell, \, j=0 \\ 0, \quad & \text{else} \end{cases},

and where

z_\ell = \begin{bmatrix} x \\ \dot x \\ \vdots \\ x^{(\ell)} \end{bmatrix}, \quad g_\ell = \begin{bmatrix} f \\ \dot f \\ \vdots \\ f^{(\ell)} \end{bmatrix}.
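A minimal sketch (not from the notes, assuming numpy; the helper name derivative_array and the lists of derivative callables are made up) that assembles M_\ell and N_\ell per Definition 4.4, here for the coefficients of Example 4.2 and \ell=1; the binomial coefficients take care of the blocks that vanish by convention:

```python
import numpy as np
from math import comb

def derivative_array(E_derivs, A_derivs, t, ell):
    """Assemble M_ell(t), N_ell(t) per Definition 4.4.

    E_derivs[k](t) and A_derivs[k](t) must return E^{(k)}(t) and A^{(k)}(t), k = 0, ..., ell.
    """
    m, n = E_derivs[0](t).shape
    M = np.zeros(((ell + 1) * m, (ell + 1) * n))
    N = np.zeros(((ell + 1) * m, (ell + 1) * n))
    for i in range(ell + 1):
        for j in range(ell + 1):
            blk = np.zeros((m, n))
            if comb(i, j):                    # binom(i, j) * E^{(i-j)}
                blk += comb(i, j) * E_derivs[i - j](t)
            if comb(i, j + 1):                # binom(i, j+1) * A^{(i-j-1)}
                blk -= comb(i, j + 1) * A_derivs[i - j - 1](t)
            M[i * m:(i + 1) * m, j * n:(j + 1) * n] = blk
        N[i * m:(i + 1) * m, 0:n] = A_derivs[i](t)
    return M, N

# coefficients of Example 4.2 and their first derivatives
E_derivs = [lambda t: np.array([[0.0, 0.0], [1.0, -t]]),
            lambda t: np.array([[0.0, 0.0], [0.0, -1.0]])]
A_derivs = [lambda t: np.array([[-1.0, t], [0.0, 0.0]]),
            lambda t: np.array([[0.0, 1.0], [0.0, 0.0]])]

M1, N1 = derivative_array(E_derivs, A_derivs, t=0.5, ell=1)
print(M1)
print(N1)
```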
  • By construction, any solution x that solves the derivative array, solves the DAE and vice versa.
  • The strangeness-free form from above is an equivalent system too with d_\mu differential relations, a_\mu algebraic equations, and v_\mu redundant (or consistency) relations.
  • Next, we will show that from the derivative array we can extract d_\mu differential and a_\mu well separated algebraic relations for x, i.e. an equivalent strangeness free form.

The following theorem connects the derivative array to the strangeness index and provides a strangeness free reformulation of the DAE (4.1).

Theorem 4.6 Let the strangeness index of the pair (E, A) of matrix-valued functions be well defined according to Definition 4.3 with the global invariants d_\mu, a_\mu, v_\mu. Then for the derivative array as defined in Definition 4.4 it holds that

  1. \operatorname{corank}M_{\mu+1}- \operatorname{corank}M_\mu = v_\mu

  2. \operatorname{rank}M_\mu(t) = (\mu+1)m-a_\mu - v_\mu on \mathcal I, and there exists a smooth matrix function Z_{2,3} (that spans the left null space of M_\mu) with Z_{2,3}^H M_\mu(t) =0.

  3. The projection Z_{2,3} can be partitioned into two parts:
    • Z_3 (left nullspace of (M_\mu, N_\mu)) so that Z_3^HN_\mu(t) =0.
    • Z_2 such that Z_2^HN_\mu \begin{bmatrix} I_n \\ 0 \\ \vdots \\ 0 \end{bmatrix} = Z_2^H \begin{bmatrix} A \\ \dot A \\ \vdots \\ A^{(\mu)} \end{bmatrix} =:\hat A_2 has full rank.
  4. Furthermore, let T_2 be a smooth matrix function such that \hat A_2 T_2=0 (right nullspace of \hat A_2). Then \operatorname{rank}(E(t) T_2(t)) = d_\mu and there exists a smooth matrix function Z_1\colon \mathcal I \to \mathbb C^{m,d_\mu} with \operatorname{rank}(Z_1^HE) = d_\mu. We define Z_1^HE=\hat E_1.
Proof. Kunkel/Mehrmann: Thm. 3.29, Thm. 3.30, Thm. 3.32.

A few observations and implications:

  • Z_{2,3}^H operates on the derivative array M_\ell (t) \dot z_\ell(t) = N_\ell (t) z_\ell(t) + g_\ell(t),

  • Specifically, Z_2^H picks out all explicit and hidden constraints, while Z_3^H picks out the redundant (consistency) equations.
  • Z_1^H operates on the original system E(t)\dot x = A(t)x + f(t) and picks out the dynamic part.

By means of the projections defined in Theorem 4.6, one can define (Kunkel/Mehrmann: Theorem 3.32) a strangeness-free condensed form

\begin{align} \hat E_1(t) \dot x(t) &= \hat A_{1}(t)x(t) + \hat f_1(t) \tag{4.17} \\ 0 &= \hat A_2(t) x(t) + \hat f_2(t) \tag{4.18}\\ 0 &= \hat f_3(t), \tag{4.19} \end{align}

where \hat E_1(t) := Z_1(t)^HE(t) \in \mathbb C^{d_\mu, n}, \quad \hat A_1(t) := Z_1^H(t)A (t)\in \mathbb C^{d_\mu, n}, \quad \hat f_1(t) = Z_1^Hf(t) \in \mathbb C^{d_\mu}, where \hat A_2(t) = Z_2(t)^H \begin{bmatrix} A(t) \\ \dot A(t) \\ \vdots \\ A^{(\mu)}(t) \end{bmatrix} \in \mathbb C^{a_\mu, n}, \quad \hat f_2(t) = Z_2^Hg_\mu(t) \in \mathbb C^{a_\mu}, \quad and where \hat f_3(t) = Z_3^Hg_\mu(t) \in \mathbb C^{v_\mu}.
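A minimal numerical sketch of this extraction (not from the notes, assuming numpy; the helper names left_null_space and row_basis are made up) for a constant-coefficient toy pair with strangeness index \mu=1, so that M_1=\begin{bmatrix}E&0\\-A&E\end{bmatrix} and N_1=\begin{bmatrix}A&0\\0&0\end{bmatrix}; for the chosen pair one expects d_\mu=1, a_\mu=2, v_\mu=0:

```python
import numpy as np

def left_null_space(M, tol=1e-10):
    """Orthonormal basis (columns) of the left null space of M."""
    U, sv, Vh = np.linalg.svd(M)
    rank = int((sv > tol).sum())
    return U[:, rank:]

def row_basis(M, tol=1e-10):
    """Orthonormal basis of the row space of M (as rows)."""
    U, sv, Vh = np.linalg.svd(M)
    rank = int((sv > tol).sum())
    return Vh[:rank]

# toy constant pair with strangeness index mu = 1:
#   x2' = x1 + f1,   0 = x2 + f2,   x3' = x3 + f3
E = np.array([[0., 1., 0.], [0., 0., 0.], [0., 0., 1.]])
A = np.eye(3)
m, n = E.shape

# derivative array of level mu = 1 (E, A constant, so all derivative blocks vanish)
M1 = np.block([[E, np.zeros((m, n))], [-A, E]])
N1 = np.block([[A, np.zeros((m, n))], [np.zeros((m, 2 * n))]])

Z23 = left_null_space(M1)                    # constraints plus redundancies, a_mu + v_mu columns
Z3 = left_null_space(np.hstack([M1, N1]))    # redundancies only, v_mu columns (here: none)
print(Z23.shape[1], Z3.shape[1])             # -> 2 0, i.e. a_mu = 2, v_mu = 0

A_stack = np.vstack([A, np.zeros((m, n))])   # [A; A'] acting on the x-block of z_1
A2_hat = row_basis(Z23.T @ A_stack)          # algebraic part: a_mu independent constraints on x

T2 = left_null_space(A2_hat.T)               # right null space of A2_hat
Z1 = row_basis((E @ T2).T).T                 # d_mu directions such that Z1^H E has rank d_mu
E1_hat, A1_hat = Z1.T @ E, Z1.T @ A          # differential part of the strangeness-free form

print(A2_hat)            # rows select x1 and x2: the explicit and the hidden constraint
print(E1_hat, A1_hat)    # selects the genuine ODE  x3' = x3 + f3
```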

Remark: This condensed equivalent strangeness-free form (4.17)–(4.19) comes with a number of advantages over the one defined in Theorem 4.5.

  1. It can be derived by only differentiating the given functions E, A, and f. This is much preferable, since a given function can be evaluated to high accuracy (which is needed for computing derivatives numerically) or can even be differentiated analytically.

  2. It is formulated in the original variable x – no state transformation involved.

  3. It can be generalized to nonlinear systems.

Example 1 Cp. Examples 3.33 and 3.34 in Kunkel/Mehrmann that provide the strangeness free forms for the motivating examples of this section.

Example 2 Compute the forms for the incompressible Navier-Stokes equations (to be provided).