Article

Projection to Mixture Families and Rate-Distortion Bounds with Power Distortion Measures

Kazuho Watanabe
Department of Computer Science and Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku-cho, Toyohashi 441-8580, Japan
Entropy 2017, 19(6), 262; https://doi.org/10.3390/e19060262
Submission received: 4 May 2017 / Revised: 22 May 2017 / Accepted: 5 June 2017 / Published: 7 June 2017
(This article belongs to the Special Issue Information Geometry II)

Abstract:
The explicit form of the rate-distortion function has rarely been obtained, except for a few cases where the Shannon lower bound coincides with the rate-distortion function for the entire range of positive rates. From an information-geometrical point of view, the evaluation of the rate-distortion function is achieved by a projection to the mixture family defined by the distortion measure. In this paper, we consider the $\beta$-th power distortion measure and prove that the $\beta$-generalized Gaussian distribution is the only source that can make the Shannon lower bound tight at the minimum distortion level at zero rate. We demonstrate that the tightness of the Shannon lower bound for $\beta = 1$ (Laplacian source) and $\beta = 2$ (Gaussian source) yields upper bounds to the rate-distortion function of power distortion measures with a different power. These bounds evaluate from above the projection of the source distribution to the mixture family of the generalized Gaussian models. Applying similar arguments to $\epsilon$-insensitive distortion measures, we consider the tightness of the Shannon lower bound and derive an upper bound to the distortion-rate function which is accurate at low rates.

1. Introduction

The rate-distortion function, $R(D)$, gives the minimum achievable rate for reproducing source outputs with expected distortion not exceeding $D$. The Shannon lower bound (SLB) has been used for evaluating $R(D)$ [1,2]. The tightness of the SLB for the entire range of positive rates identifies the entire $R(D)$ for pairs of a source and distortion measure such as the Gaussian source with squared distortion [1], the Laplacian source with absolute magnitude distortion [2], and the gamma source with Itakura–Saito distortion [3]. However, such pairs are rare examples. In fact, for a fixed distortion measure, there exists only a single source that makes the SLB tight for all $D$, as we will prove in Section 2.3. The necessary and sufficient condition for the tightness of the SLB was first obtained for the squared distortion [4], discussed for a general difference distortion measure $d$ [2], and recently described in terms of $d$-tilted information [5]. While these results consider the tightness of the SLB for each point of $R(D)$ (i.e., for each $D$), we discuss the tightness for all $D$ in this paper. More specifically, if we focus on the minimum distortion at zero rate (denoted by $D_{\max}$), the tightness of the SLB at $D_{\max}$ characterizes a condition between the source density and the distortion measure.
If the SLB is not tight, an explicit evaluation of the rate-distortion function has been obtained only in limited cases [6,7,8,9]. Little can be inferred about the behavior of $R(D)$ when the distortion measure is varied from a known case, since $R(D)$ does not change continuously even if the distortion measure is continuously modified. Although the SLB is easily obtained for difference distortion measures, it is unknown how accurate the SLB is without an explicit evaluation, an upper bound, or a numerical calculation of the rate-distortion function.
In this paper, we consider the constrained optimization defining $R(D)$ from an information-geometrical viewpoint [10]. More specifically, we show that it is equivalent to a projection of the source distribution to the mixture family defined by the distortion measure. If the source is included in the mixture family, the SLB is tight; if it is not tight, the gap between $R(D)$ and its SLB evaluates the minimum Kullback–Leibler divergence from the source to the mixture family (Lemma 1). Then, using the bounds of the rate-distortion function of the $\beta$-th power difference distortion measure obtained in [11], we evaluate the projections of the source distribution to the mixture families associated with this distortion measure (Theorem 3).
Operational rate-distortion results have been obtained for the uniform scalar quantization of the generalized Gaussian source under the $\beta$-th power distortion measure [12,13]. We prove that only the $\beta$-generalized Gaussian distribution has the potential to be the source whose SLB is tight; that is, identical to the rate-distortion function for the entire range of positive rates. This fact brings knowledge on the tightness of the SLB of an $\epsilon$-insensitive distortion measure, which is obtained by truncating the loss function near zero error [14,15,16]. The above result implies that the SLB is not tight if the source is the $\beta$-generalized Gaussian and the distortion has another power $\gamma \neq \beta$. We demonstrate that even in such a case, a novel upper bound to $R(D)$ can be derived from the condition for the tightness of the SLB. The fact that the Laplacian ($\beta = 1$) and the Gaussian ($\beta = 2$) sources have tight SLBs specifically yields a novel upper bound to $R(D)$ of the $\gamma\,(\neq \beta)$-th power distortion measure, which has a constant gap from the SLB for all $D$. By the relationship between the SLB and the projection in information geometry, the gap evaluates the projections of the $\beta$-generalized Gaussian source to the mixture families of $\gamma$-generalized Gaussian models. Extending the above argument to the $\epsilon$-insensitive loss, we derive an upper bound to the distortion-rate function which is tight in the limit of zero rate.

2. Rate-Distortion Function and Shannon Lower Bound

2.1. Rate-Distortion Function

Let $X$ and $Y$ be real-valued random variables of a source output and its reconstruction, respectively. For the distortion measure between $x$ and $y$, $d(x,y)$, the rate-distortion function $R(D)$ of the source $X \sim p(x)$ is defined by
$$R(D) = \inf_{q(y|x):\; E[d(X,Y)] \le D} I(q),$$
where
$$I(q) = I(X;Y) = \iint q(y|x)\, p(x) \log \frac{q(y|x)}{\int q(y|x')\, p(x')\, dx'}\, dx\, dy$$
is the mutual information and $E$ denotes the expectation with respect to $q(y|x)p(x)$. $R(D)$ gives the minimum achievable rate $R$ for reconstructing source outputs with average distortion not exceeding $D$ under the distortion measure $d$ [2,17]. The distortion-rate function, $D(R)$, is the inverse function of the rate-distortion function.
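The constrained minimization above can also be explored numerically. Below is a minimal Blahut–Arimoto sketch (an illustration added here, not part of the paper) that traces out a point $(D_s, R(D_s))$ of the discretized problem at a fixed value of the slope parameter $s$ introduced next; all function and variable names are my own.

```python
# A minimal Blahut-Arimoto sketch (illustration only; not from the paper).
# For a fixed slope parameter s it alternates the two standard updates and
# returns a point (D_s, R(D_s)) of the discretized problem.
import numpy as np

def blahut_arimoto(p, d, s, n_iter=500):
    """p: source pmf on the x-grid; d: distortion matrix d[i, j] = d(x_i, y_j);
    s: slope parameter (> 0). Returns (D, R) with R in nats."""
    q = np.full(d.shape[1], 1.0 / d.shape[1])   # initial output marginal q(y)
    A = np.exp(-s * d)                          # e^{-s d(x, y)}
    for _ in range(n_iter):
        Z = A @ q                               # normalizer: sum_y q(y) e^{-s d(x,y)}
        cond = A * q / Z[:, None]               # q(y|x), rows sum to one
        q = p @ cond                            # q(y) = sum_x p(x) q(y|x)
    Z = A @ q
    cond = A * q / Z[:, None]
    D = np.sum(p[:, None] * cond * d)           # expected distortion
    ratio = np.where(cond > 0, cond / q, 1.0)
    R = np.sum(p[:, None] * cond * np.log(ratio))   # mutual information I(q)
    return D, R

# Discretized standard Gaussian source with squared distortion: here the SLB
# is tight, so R should match 0.5*log(1/D) up to discretization error.
x = np.linspace(-6, 6, 241)
p = np.exp(-x**2 / 2); p /= p.sum()
dmat = (x[:, None] - x[None, :])**2
for s in (0.6, 1.0, 2.0):
    D, R = blahut_arimoto(p, dmat, s)
    print(f"s={s}: D={D:.3f}, R={R:.3f} nats, SLB={0.5 * np.log(1 / D):.3f} nats")
```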
If the conditional distribution $q_s(y|x)$ achieves the minimum of the following Lagrange function parameterized by $s \ge 0$,
$$L(q) = I(q) + s\left( E[d(X,Y)] - D \right),$$
then the rate-distortion function is parametrically given by
$$R(D_s) = I(q_s), \qquad D_s = \iint q_s(y|x)\, p(x)\, d(x,y)\, dx\, dy.$$
The parameter $s$ corresponds to the (negated) slope of the tangent of $R(D)$ at $(D_s, R(D_s))$, and hence is referred to as the slope parameter [2]. Alternatively, the rate-distortion function is given by ([18], Theorem 4.5.1):
$$R(D) = \sup_{s \ge 0} \left\{ \min_{q(y)} E\left[ -\log \int e^{-s\, d(X,y)}\, q(y)\, dy \right] - sD \right\}. \tag{2}$$
If the marginal reconstruction density $q_s(y)$ achieves the minimum above, the optimal conditional reconstruction distribution is given by
$$q_s(y|x) = \frac{e^{-s\, d(x,y)}\, q_s(y)}{\int e^{-s\, d(x,y')}\, q_s(y')\, dy'} \tag{3}$$
(see, for example, [2,19]).
From the properties of the rate-distortion function $R(D)$, we know that $R(D) > 0$ for $0 < D < D_{\max}$, where
$$D_{\max} = \inf_y \int p(x)\, d(x,y)\, dx,$$
and $R(D) = 0$ for $D \ge D_{\max}$ [2] (p. 90). Hence, $D_{\max} = \lim_{R \to 0} D(R)$.
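As a simple illustration (added here for concreteness): for the squared distortion $d(x,y) = (x-y)^2$,
$$\int p(x)(x-y)^2\, dx = \operatorname{Var}(X) + \left( E[X] - y \right)^2,$$
so the infimum is attained at $y = E[X]$ and $D_{\max} = \operatorname{Var}(X)$; a zero-rate code simply reproduces the mean.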

2.2. Shannon Lower Bound

In this paper, we focus on difference distortion measures,
$$d(x,y) = \rho(x-y), \tag{5}$$
for which Shannon derived a general lower bound to the rate-distortion function [1] ([2], Chapter 4).
Throughout this paper, we assume that the function $\rho$ is nonnegative and satisfies
$$C_s \equiv \int e^{-s\rho(z)}\, dz < \infty, \tag{6}$$
for all $s > 0$. It follows that
$$C_s = \int e^{-s\rho(z)}\, dz = \int e^{-s\, d(z,0)}\, dz = \int e^{-s\, d(x,\mu)}\, dx = \int e^{-s\rho(x-\mu)}\, dx,$$
for all $\mu \in \mathbb{R}$.
Let
$$K(p\,||\,r) = \int p(x) \log \frac{p(x)}{r(x)}\, dx$$
denote the Kullback–Leibler divergence from $p$ to $r$, which is non-negative and equal to zero if and only if $p(x) = r(x)$ almost everywhere. We define the distribution
$$g_s(x) = \frac{1}{C_s} e^{-s\rho(x)}. \tag{7}$$
Then, the Shannon lower bound (SLB) is defined by
$$\underline{R}(D) \equiv h(p) - h(g_s), \tag{8}$$
where $h(p)$ is the differential entropy of the probability density $p$, and $s$ is related to $D$ by
$$D = \int \rho(x)\, g_s(x)\, dx. \tag{9}$$
The next lemma shows that the SLB is in fact a lower bound to the rate-distortion function and that the difference between them is lower bounded by the Kullback–Leibler divergence.
Lemma 1.
For a source with probability density function $p(x)$ and the difference distortion measure (5),
$$R(D) - \underline{R}(D) \ge \min_{q(y)} K(p\,||\,m_s) \ge 0,$$
where $s$ and $D$ are related to each other by (9) and
$$m_s(x) = (g_s * q)(x) = \int g_s(x-y)\, q(y)\, dy$$
is the convolution of $g_s$ and $q$.
Proof. 
Let $\underline{s}$ be the slope parameter $s$ satisfying (9) for $D$. From (2), we have
$$\begin{aligned} R(D) &\ge \min_{q(y)} E\left[ -\log \int e^{-\underline{s}\, d(X,y)}\, q(y)\, dy \right] - \underline{s} D \\ &= \min_{q(y)} K(p\,||\,m_{\underline{s}}) - \log C_{\underline{s}} + h(p) - \underline{s} D \\ &= \min_{q(y)} K(p\,||\,m_{\underline{s}}) + h(p) - h(g_{\underline{s}}) \\ &= \min_{q(y)} K(p\,||\,m_{\underline{s}}) + \underline{R}(D), \end{aligned}$$
which completes the proof. ☐
In information geometry, for a family of distributions $\mathcal{M}$ and a given distribution $p$, the distributions that achieve the minima
$$\min_{r \in \mathcal{M}} K(p\,||\,r) \quad \text{and} \quad \min_{r \in \mathcal{M}} K(r\,||\,p)$$
are called the m-projection and e-projection of $p$ to $\mathcal{M}$, respectively [10]. As a family of distributions, the $(M-1)$-dimensional mixture family spanned by $\{p_1(x), \ldots, p_M(x)\}$ is defined by
$$\mathcal{M} = \left\{ \sum_{i=1}^{M} q_i\, p_i(x) \,\middle|\, q_i \ge 0,\ i = 1, \ldots, M,\ \sum_{i=1}^{M} q_i = 1 \right\}.$$
Hence, from the information-geometrical viewpoint, the above lemma shows that the difference between $R(D)$ and $\underline{R}(D)$ evaluates the m-projection
$$\min_{r \in \mathcal{M}_s} K(p\,||\,r)$$
of the source distribution $p$ to
$$\mathcal{M}_s = \left\{ m_s(x) = (g_s * q)(x) \,\middle|\, q(y) \ge 0\ (\forall y),\ \int q(y)\, dy = 1 \right\},$$
the (infinite-dimensional) mixture family defined by $\{ g_s(x-y) \,|\, y \in \mathbb{R} \}$.
It is also easy to see from the lemma that the SLB coincides with $R(D)$ (that is, $R(D) = \underline{R}(D)$ holds in (8)) if and only if the source random variable $X$ with density $p(x)$ can be represented as the sum of two independent random variables, one of which is distributed according to the probability density function $g_s(x)$ in (7). This condition is referred to as the "backward channel" condition, and is equivalent to the integral equation
$$p(x) = \int g_s(x-y)\, q_s(y)\, dy \tag{11}$$
having a solution $q_s(y)$ that is a valid density function ([2], Chapter 4). This condition is also equivalent to $p \in \mathcal{M}_s$.

2.3. Probability Density Achieving Tight SLB for All D

The following theorem states that for a difference distortion measure, there is at most one source for which $\underline{R}(D)$ is tight at $D = D_{\max}$.
Theorem 1.
Assume that the source distribution has a finite $D_{\max}$ achieved by a reconstruction $\mu$; that is, $E[d(X,\mu)] = D_{\max} < \infty$. The rate-distortion function is strictly greater than the SLB at $D = D_{\max}$, $R(D_{\max}) > \underline{R}(D_{\max})$, unless the following holds for the source density almost everywhere:
$$p(x) = \frac{\exp\left( -s^* d(x,\mu) \right)}{C_{s^*}},$$
where $C_s$ is defined in (6), and $s^*$ is determined by the relation
$$-\left.\frac{\partial \log C_s}{\partial s}\right|_{s = s^*} = D_{\max}.$$
Proof. 
Let $Z$ be the random variable such that $Z - \mu$ has the density $g_{s^*}$. As a functional of the source density $p(x)$, the SLB at $D = D_{\max}$ is expressed as
$$\begin{aligned} \underline{R}(D_{\max}) &= h(p) - h(g_{s^*}) = h(p) - \log C_{s^*} - s^* E[d(Z,\mu)] \\ &= -K(p(x)\,||\,g_{s^*}(x-\mu)) + s^*\left( D_{\max} - E[d(Z,\mu)] \right). \end{aligned} \tag{13}$$
Since $E[d(Z,\mu)] = \int \rho(z)\, g_{s^*}(z)\, dz = D_{\max}$ by the choice of $s^*$, the last term vanishes. From the non-negativity of the divergence, $\underline{R}(D_{\max})$ is therefore maximized to 0 only if $p(x) = g_{s^*}(x-\mu)$ holds almost everywhere. ☐
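As a worked instance (added here, anticipating Section 3): for $\rho(z) = |z|^\beta$, (6) gives $\log C_s = \log \frac{2\Gamma(1/\beta)}{\beta} - \frac{1}{\beta} \log s$, so
$$-\frac{\partial \log C_s}{\partial s} = \frac{1}{\beta s}, \qquad s^* = \frac{1}{\beta D_{\max}},$$
and the matching source density of Theorem 1 is exactly the generalized Gaussian introduced below.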
The tightness of the SLB for each $D$ characterizes the form of the backward channel $p(x|y)$, as discussed, for example, in ([5], Theorem 4). The above theorem focuses on $D = D_{\max}$ and characterizes the relation between the form of the source density $p(x)$ and the distortion measure.
The tightness of the SLB at $D = D_{\max}$ is relevant to the tightness for all $0 < D \le D_{\max}$. For some distortion measures (e.g., the squared and absolute distortion measures), the random variable $Z_s \sim g_s$ is decomposable into the sum of two independent random variables,
$$Z_s = Z_{s'} + N,$$
where $Z_{s'} \sim g_{s'}$ and $N$ is some random variable independent of $Z_{s'}$, for any $s' > s$. The backward channel condition (11) means that in such a case, the tightness of the SLB at $D = D_{\max}$ implies the tightness of the SLB for all $0 < D < D_{\max}$. The condition (11) is closely related to the closure property with respect to convolution. If $g_s(x-y)$ is a kernel function associated with a reproducing kernel Hilbert space, such a closure property has been studied in detail [20].
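A concrete case (added for illustration): for the squared distortion, $g_s$ is the $N(0, 1/(2s))$ density, so for any $s' > s$,
$$Z_s \overset{d}{=} Z_{s'} + N, \qquad N \sim N\!\left( 0, \tfrac{1}{2s} - \tfrac{1}{2s'} \right),$$
with $N$ independent of $Z_{s'}$, by the closure of Gaussians under convolution.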

3. Generalized Gaussian Source and Power Distortion Measure

3.1. β -th Power Distortion Measure

We examine the rate-distortion trade-offs under the $\beta$-th power distortion measure
$$d_\beta(x,y) = |x - y|^\beta, \tag{14}$$
where $\beta > 0$ is a real exponent. In particular, $\beta = 2$ corresponds to the squared error criterion and $\beta = 1$ to the absolute one. The corresponding noise model given by (7) is
$$g_s(x) = \frac{1}{C_s} e^{-s|x|^\beta},$$
where $C_s = \frac{2}{\beta} s^{-1/\beta}\, \Gamma\!\left( \frac{1}{\beta} \right)$ and $\Gamma$ is the gamma function. This model is the $\beta$-th-order generalized Gaussian distribution, which includes the Gaussian ($\beta = 2$) and the Laplace ($\beta = 1$) distributions as special cases. Its differential entropy is
$$h(g_s) = \log C_s + \frac{1}{\beta}. \tag{15}$$
For a difference distortion measure, we can assume that the $\mu$ in Theorem 1 is zero without loss of generality. Thus, as a source, we assume the generalized Gaussian random variable with the density
$$p(x) = p_\beta(x) = \frac{1}{C_{\alpha/\beta}} \exp\left( -\frac{\alpha}{\beta} |x|^\beta \right), \tag{16}$$
where $0 < \alpha < +\infty$, which is a versatile model for symmetric unimodal probability densities. Here, the scaling factor $\alpha/\beta$ is chosen so that
$$D_{\max} = E_p[|X|^\beta] = \frac{1}{\alpha} > 0$$
holds.
The SLB for the source (16) with respect to the distortion measure (14) is
$$\underline{R}(D) = \begin{cases} \dfrac{1}{\beta} \log \dfrac{1}{\alpha D}, & 0 < D \le \dfrac{1}{\alpha}, \\[4pt] 0, & D > \dfrac{1}{\alpha}, \end{cases}$$
which follows from (8), (15), and the relation between the slope parameter $s$ and the average distortion $D_s$ given by (9),
$$D_s = E_{g_s}[|Z|^\beta] = \frac{\Gamma(1 + 1/\beta)}{s\, \Gamma(1/\beta)} = \frac{1}{s\beta}.$$
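The closed forms above are easy to verify numerically. The following is a small check (an illustration added here, not part of the paper; the script and its names are my own):

```python
# Numerical check (illustration only) of the closed forms used here: C_s,
# h(g_s) in (15), and the moment relation D_s = E_{g_s}[|Z|^beta] = 1/(s*beta).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def check(beta, s):
    C_closed = (2 / beta) * s**(-1 / beta) * gamma(1 / beta)
    C_num, _ = quad(lambda z: np.exp(-s * abs(z)**beta), -np.inf, np.inf)
    D_num, _ = quad(lambda z: abs(z)**beta * np.exp(-s * abs(z)**beta) / C_num,
                    -np.inf, np.inf)
    h = np.log(C_closed) + 1 / beta        # differential entropy, in nats
    print(f"beta={beta}, s={s}: C_s={C_num:.6f} (closed {C_closed:.6f}), "
          f"D_s={D_num:.6f} (1/(s*beta)={1 / (s * beta):.6f}), h={h:.4f}")

for beta in (1.0, 1.5, 2.0):
    check(beta, s=2.0)
```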
It is well known that when $\beta = 2$ (Gaussian source and squared distortion measure), the SLB is tight; that is, $\underline{R}(D) = R(D)$ for all $D$ [1,2,17]. The optimal reconstruction distribution minimizing (2) in this case is given by
$$q_{1/(2D)}(y) = \frac{1}{\sqrt{2\pi(1/\alpha - D)}} \exp\left( -\frac{y^2}{2(1/\alpha - D)} \right),$$
for $0 < D < 1/\alpha$. Additionally, when $\beta = 1$ (Laplacian source and absolute distortion measure), the SLB is tight for all $D$ ([2], Example 4.3.2.1), which is attained by
$$q_{1/D}(y) = \alpha^2 D^2\, \delta(y) + \left( 1 - \alpha^2 D^2 \right) \frac{\alpha}{2} e^{-\alpha|y|},$$
for $0 < D < 1/\alpha$, where $\delta$ is Dirac's delta function.
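The Laplacian case can be checked directly against the backward channel condition (11); the following sketch (my illustration, with assumed parameter values) convolves $g_{1/D}$ with $q_{1/D}$ and compares the result with the source density:

```python
# Numerical check (illustration only) of the backward channel for beta = 1:
# convolving g_{1/D}(z) = e^{-|z|/D}/(2D) with
# q_{1/D}(y) = (alpha*D)^2 delta(y) + (1 - (alpha*D)^2)(alpha/2) e^{-alpha|y|}
# reproduces the Laplacian source density p(x) = (alpha/2) e^{-alpha|x|}.
import numpy as np
from scipy.integrate import quad

alpha, D = 2.0, 0.2                       # requires 0 < D < 1/alpha
g = lambda z: np.exp(-abs(z) / D) / (2 * D)
lap = lambda y: (alpha / 2) * np.exp(-alpha * abs(y))   # both p(x) and the
w = (alpha * D)**2                                      # continuous part of q

def mixture(x):
    integral, _ = quad(lambda y: g(x - y) * lap(y), -50, 50,
                       points=[0.0, x], limit=200)
    return w * g(x) + (1 - w) * integral

for x in (0.0, 0.5, 1.5):
    print(f"x={x}: (g*q)(x)={mixture(x):.6f}, p(x)={lap(x):.6f}")
```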

3.2. Tightness of the SLB

From Theorem 1, we immediately obtain the following corollary, which shows that the $\beta$-generalized Gaussian source is the only source that can make the SLB tight at $D = D_{\max}$ under the $\beta$-th power distortion measure (14).
Corollary 1.
Assume that the source distribution has mean 0 and a finite $\beta$-th moment, $E_p[|X|^\beta] = 1/\alpha < \infty$. Under the $\beta$-th power distortion measure (14), the rate-distortion function is strictly greater than the SLB at $D = D_{\max}$, $R(D_{\max}) > \underline{R}(D_{\max})$, unless the source distribution is the $\beta$-generalized Gaussian with the density (16).
In the case of $\beta = 1$, the rate-distortion function of the $\epsilon$-insensitive distortion measure,
$$d(x,y) = \max\{ 0, |x - y| - \epsilon \}, \tag{17}$$
was studied in [16]. It was proved that a necessary condition for $R(D_s) = \underline{R}(D_s)$ at a slope parameter $s$ is that $R(D_s) = \underline{R}(D_s)$ also holds for $\epsilon = 0$. According to Theorem 1, this fact yields a contradiction if there is a source that makes the SLB of the distortion measure (17) tight at $D = D_{\max}$. Thus, we have the following corollary.
Corollary 2.
Under the ϵ-insensitive distortion measure (17) with ϵ > 0 , no source makes the SLB tight at D = D max .

4. Rate-Distortion Bounds for Mismatching Pairs

From Corollary 1, the SLB cannot be tight for all $D$ if the distortion measure $d_\gamma$ has a different exponent $\gamma$ from that of the source $p_\beta$ (i.e., $\gamma \neq \beta$). In this section, we show that even in such a case, accurate upper and lower bounds to $R(D)$ of the Laplacian and Gaussian sources can be derived from the fact that $\underline{R}(D) = R(D)$ for $\beta = 1$ and $\beta = 2$.
We denote the rate-distortion function and bounds to it by indicating the parameters $\beta$ and $\gamma$ of the source and the distortion measure. More specifically, $R_\beta^{[\gamma]}(D)$ denotes the rate-distortion function for the source $p_\beta$ with respect to the distortion measure $d_\gamma$.
We first prove the following lemma:
Lemma 2.
If $\underline{R}_\beta^{[\beta]}(D) = R_\beta^{[\beta]}(D)$ for all $D$, then
$$E_{q_{1/(\beta D)}\, p_\beta}\left[ d_\gamma(X,Y) \right] = \frac{\Gamma(\gamma/\beta + 1/\beta)}{\Gamma(1/\beta)} (\beta D)^{\gamma/\beta}$$
holds for $\gamma > 0$, where $E_{q_{1/(\beta D)}\, p_\beta}$ denotes the expectation with respect to $q_{1/(\beta D)}(y|x)\, p_\beta(x)$, and
$$D = E_{q_{1/(\beta D)}\, p_\beta}\left[ d_\beta(X,Y) \right]$$
is satisfied for the optimal conditional reproduction distribution $q_s$ in (3).
Proof. 
$\underline{R}_\beta^{[\beta]}(D) = R_\beta^{[\beta]}(D)$ implies that
$$q_s(y|x)\, p_\beta(x) = g_s(x-y)\, q_s(y)$$
for the optimal reproduction distribution $q_s(y|x)$ in (3) and the $q_s(y)$ minimizing (2), with $D = 1/(\beta s)$. It follows that
$$E_{q_{1/(\beta D)}\, p_\beta}\left[ d_\gamma(X,Y) \right] = \iint |x - y|^\gamma\, q_{1/(\beta D)}(y|x)\, p_\beta(x)\, dy\, dx = E_{g_{1/(\beta D)}}\left[ |Z|^\gamma \right] = \frac{\Gamma(\gamma/\beta + 1/\beta)}{\Gamma(1/\beta)} (\beta D)^{\gamma/\beta}. \qquad ☐$$
Let $D^{[\gamma]} \equiv \frac{\Gamma(\gamma/\beta + 1/\beta)}{\Gamma(1/\beta)} (\beta D)^{\gamma/\beta}$, which is equivalent to
$$D = \frac{1}{\beta} \left( \frac{\Gamma(1/\beta)}{\Gamma(\gamma/\beta + 1/\beta)} D^{[\gamma]} \right)^{\beta/\gamma}.$$
The above lemma implies that the $q_{1/(\beta D)}(y|x)$ achieving $R_\beta^{[\beta]}(D)$ has the expected $d_\gamma$-distortion
$$D^{[\gamma]} = E[d_\gamma(X,Y)]$$
at the rate $R = R_\beta^{[\beta]}(D) = \frac{1}{\beta} \log \frac{1}{\alpha D}$ if $\underline{R}_\beta^{[\beta]}(D) = R_\beta^{[\beta]}(D)$ for all $D$.
Thus, we obtain the following upper bound to $R_\beta^{[\gamma]}(D)$, valid if $\underline{R}_\beta^{[\beta]}(D) = R_\beta^{[\beta]}(D)$:
$$\bar{R}_\beta^{[\gamma]}(D) \equiv \frac{1}{\beta} \log \frac{1}{\frac{\alpha}{\beta} \left( \frac{\Gamma(1/\beta)}{\Gamma(\gamma/\beta + 1/\beta)} D \right)^{\beta/\gamma}} = \frac{1}{\gamma} \log \left[ \left( \frac{\beta}{\alpha} \right)^{\gamma/\beta} \frac{\Gamma(\gamma/\beta + 1/\beta)}{\Gamma(1/\beta)\, D} \right]. \tag{19}$$
We also have the SLB for $R_\beta^{[\gamma]}$,
$$\underline{R}_\beta^{[\gamma]}(D) \equiv h(p_\beta) - h\left( g_{1/(\gamma D)}^{[\gamma]} \right) = \frac{1}{\gamma} \log \left[ \left( \frac{\beta}{\alpha} \right)^{\gamma/\beta} \frac{\Gamma(1/\beta)^\gamma\, \gamma^{\gamma-1}\, e^{\gamma/\beta - 1}}{\beta^\gamma\, \Gamma(1/\gamma)^\gamma\, D} \right]. \tag{20}$$
Therefore, we arrive at the following theorem:
Theorem 2.
If $\underline{R}_\beta^{[\beta]}(D) = R_\beta^{[\beta]}(D)$ for $0 < D \le D_{\max}$, the rate-distortion function $R_\beta^{[\gamma]}(D)$ is lower- and upper-bounded as
$$\underline{R}_\beta^{[\gamma]}(D) \le R_\beta^{[\gamma]}(D) \le \bar{R}_\beta^{[\gamma]}(D),$$
where the lower and upper bounds are given by (20) and (19). The left inequality becomes an equality only for $\gamma = \beta$. The gap between the bounds is
$$\delta_\beta^{[\gamma]} \equiv \bar{R}_\beta^{[\gamma]}(D) - \underline{R}_\beta^{[\gamma]}(D) = \frac{1}{\gamma} \log \frac{\beta^\gamma\, \Gamma(1/\gamma)^\gamma\, \Gamma(\gamma/\beta + 1/\beta)}{\gamma^{\gamma-1}\, \Gamma(1/\beta)^{\gamma+1}\, e^{\gamma/\beta - 1}}, \tag{21}$$
which is constant with respect to $D$. Furthermore, the upper bound is tight at $D = D_{\max} = (\beta/\alpha)^{\gamma/\beta}\, \Gamma(\gamma/\beta + 1/\beta) / \Gamma(1/\beta)$; that is,
$$R_\beta^{[\gamma]}(D_{\max}) = \bar{R}_\beta^{[\gamma]}(D_{\max}) = 0.$$
Since the upper bound is tight at $D = D_{\max}$, it is the smallest upper bound having a constant deviation from the SLB. In addition, the SLB is asymptotically tight in the limit $D \to 0$ for the distortion measure $d_\gamma$ in general [2,21], and the condition for this asymptotic tightness has recently been weakened [22]. These facts suggest that the rate-distortion function $R_\beta^{[\gamma]}(D)$ is near the SLB at low distortion levels and then approaches the upper bound $\bar{R}_\beta^{[\gamma]}(D)$ as the average distortion $D$ grows to $D_{\max}$. In terms of the distortion-rate function, the theorem also implies that an encoder $q_{1/(\beta D)}(y|x)$ designed for the $d_\beta$-distortion incurs a loss in $d_\gamma$-distortion, due to the mismatch of the orders, of at most the constant factor $e^{\gamma \delta_\beta^{[\gamma]}}$.
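The bounds (19), (20) and the gap (21) are simple to evaluate. The following sketch (my illustration; the function names are assumptions of this snippet) computes them and reproduces the gap values quoted in Examples 1 and 2 below:

```python
# Illustration: the bounds (19), (20) and the constant gap (21), in bits.
import numpy as np
from scipy.special import gamma

def upper(D, alpha, beta, gam):   # (19): (1/gam) log(D_max / D)
    Dmax = (beta / alpha)**(gam / beta) * gamma((gam + 1) / beta) / gamma(1 / beta)
    return np.log(Dmax / D) / gam

def lower(D, alpha, beta, gam):   # (20): the Shannon lower bound
    c = ((beta / alpha)**(gam / beta) * gamma(1 / beta)**gam * gam**(gam - 1)
         * np.exp(gam / beta - 1) / (beta**gam * gamma(1 / gam)**gam))
    return np.log(c / D) / gam

def gap_bits(alpha, beta, gam, D=0.1):    # independent of D and alpha
    return (upper(D, alpha, beta, gam) - lower(D, alpha, beta, gam)) / np.log(2)

print(f"delta_2^[1] = {gap_bits(2.0, 2.0, 1.0):.3f} bit")   # ~ 0.070
print(f"delta_1^[2] = {gap_bits(2.0, 1.0, 2.0):.3f} bit")   # ~ 0.104
```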
From Lemma 1 in Section 2.2, by examining the correspondence between the slope parameter $0 < s < \infty$ and the distortion level $D$, we obtain the next theorem, which evaluates the m-projection of the source to the mixture family
$$\mathcal{M}_s^{[\gamma]} \equiv \left\{ m_s^{[\gamma]}(x) = (g_s^{[\gamma]} * q)(x) \,\middle|\, q(y) \ge 0\ (\forall y),\ \int q(y)\, dy = 1 \right\}.$$
If the upper bound $\bar{R}_\beta^{[\gamma]}$ is replaced by the asymptotically tight upper bound of [2,21], asymptotically tighter bounds on the m-projection are obtained.
Theorem 3.
If $\underline{R}_\beta^{[\beta]}(D) = R_\beta^{[\beta]}(D)$ for $0 < D \le D_{\max}$, the m-projection of the generalized Gaussian source $p_\beta$ to the mixture family $\mathcal{M}_s^{[\gamma]}$ of $g_s^{[\gamma]}(x) \propto e^{-s|x|^\gamma}$ is evaluated as
$$\min_{r \in \mathcal{M}_s^{[\gamma]}} K(p_\beta\,||\,r) \le \delta_\beta^{[\gamma]},$$
for $s \ge s^*$ related to $0 \le D \le D_{\max}$ by (9), where $\delta_\beta^{[\gamma]}$ is given by (21). For $0 < s \le s^*$, the m-projection is upper-bounded as
$$\min_{r \in \mathcal{M}_s^{[\gamma]}} K(p_\beta\,||\,r) \le K(p_\beta\,||\,g_s^{[\gamma]}). \tag{22}$$
Furthermore, the inequality (22) holds with equality for $0 < s \le s_{\min}$, where $s_{\min} = -\lim_{h \to 0^-} R_\beta^{[\gamma]}(D_{\max} + h)/h$ is the slope parameter of $R_\beta^{[\gamma]}(D)$ at $D = D_{\max}$.
Proof. 
The first part of the theorem is a corollary of Theorem 2 and Lemma 1. The second part corresponds to the case of $D \ge D_{\max}$, since $D$ monotonically decreases as $s$ grows. Because $q(y) = \delta(y)$ yields $m_s^{[\gamma]} = g_s^{[\gamma]}$, we have (22). It follows from (13) that $K(p_\beta\,||\,g_{s^*}^{[\gamma]}) = \delta_\beta^{[\gamma]}$.
Since the optimal reconstruction distribution for $0 < s \le s_{\min}$ is given by $q_s(y) = \delta(y)$, (22) holds with equality. ☐
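The equality $K(p_\beta\,||\,g_{s^*}^{[\gamma]}) = \delta_\beta^{[\gamma]}$ used above can be confirmed numerically; the following sketch (my illustration, with assumed parameter values) does so for $\beta = 2$, $\gamma = 1$, $\alpha = 2$:

```python
# Numerical check (illustration only) that K(p_beta || g_{s*}^{[gamma]}) equals
# delta_beta^[gamma], here for beta = 2, gamma = 1, alpha = 2.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

alpha, beta, gam = 2.0, 2.0, 1.0
s_p = alpha / beta                                   # scale of the source p_beta
C_p = (2 / beta) * s_p**(-1 / beta) * gamma(1 / beta)
p = lambda x: np.exp(-s_p * abs(x)**beta) / C_p
Dmax, _ = quad(lambda x: abs(x)**gam * p(x), -np.inf, np.inf)   # E_p[|X|^gamma]
s_star = 1 / (gam * Dmax)                            # from (9): D = 1/(s*gamma)
C_g = (2 / gam) * s_star**(-1 / gam) * gamma(1 / gam)
g = lambda x: np.exp(-s_star * abs(x)**gam) / C_g
KL, _ = quad(lambda x: p(x) * np.log(p(x) / g(x)), -np.inf, np.inf)
delta = np.log(beta**gam * gamma(1 / gam)**gam * gamma((gam + 1) / beta)
               / (gam**(gam - 1) * gamma(1 / beta)**(gam + 1)
                  * np.exp(gam / beta - 1))) / gam
print(f"K = {KL:.5f} nats, delta = {delta:.5f} nats")   # both ~ 0.0484
```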
Since the SLB is tight for all $D$ when $\beta = 1$ and $\beta = 2$, we have the following corollaries.
Corollary 3.
The rate-distortion function of the Laplacian source, $R_1^{[\gamma]}(D)$, is lower- and upper-bounded as
$$\frac{1}{\gamma} \log \frac{(e\gamma)^{\gamma-1}}{\alpha^\gamma\, \Gamma(1/\gamma)^\gamma\, D} \le R_1^{[\gamma]}(D) \le \frac{1}{\gamma} \log \frac{\Gamma(\gamma+1)}{\alpha^\gamma\, D}.$$
Corollary 4.
The rate-distortion function of the Gaussian source, $R_2^{[\gamma]}(D)$, is lower- and upper-bounded as
$$\frac{1}{\gamma} \log \left[ \left( \frac{\pi}{2\alpha} \right)^{\gamma/2} \frac{\gamma^{\gamma-1}\, e^{\gamma/2 - 1}}{\Gamma(1/\gamma)^\gamma\, D} \right] \le R_2^{[\gamma]}(D) \le \frac{1}{\gamma} \log \left[ \left( \frac{2}{\alpha} \right)^{\gamma/2} \frac{\Gamma(\gamma/2 + 1/2)}{\sqrt{\pi}\, D} \right].$$
Example 1.
If we put $\gamma = 1$ in Corollary 4 ($\beta = 2$), we have
$$\log \left( \frac{1}{D} \sqrt{\frac{\pi}{2e\alpha}} \right) \le R_2^{[1]}(D) \le \log \left( \frac{1}{D} \sqrt{\frac{2}{\pi\alpha}} \right). \tag{23}$$
The explicit evaluation of $R_2^{[1]}(D)$ is obtained through a parametric form using the slope parameter $s$ [6]. While the explicit parametric form requires evaluations of the cumulative distribution function of the Gaussian distribution, the bounds in (23) demonstrate that it is well approximated by an elementary function of $D$. In fact, the gap between the upper and lower bounds is $\delta_2^{[1]} = \log \frac{2\sqrt{e}}{\pi} = 0.070$ (bit). The bounds in (23) are compared with $R_2^{[1]}(D)$ for $\alpha = 2$ in Figure 1.
Example 2.
If we put $\gamma = 2$ in Corollary 3 ($\beta = 1$), we have
$$\frac{1}{2} \log \frac{2e}{\pi \alpha^2 D} \le R_1^{[2]}(D) \le \frac{1}{2} \log \frac{2}{\alpha^2 D}$$
for the Laplacian source and the squared distortion measure $d_2(x,y) = |x-y|^2$. The gap between the bounds is $\delta_1^{[2]} = \frac{1}{2} \log \frac{\pi}{e} = 0.104$ (bit).
The upper bound in Theorem 2 implies the following:
Corollary 5.
Under the γ-th power distortion measure, if R ̲ γ [ γ ] ( D ) = R γ [ γ ] ( D ) for all D, the γ-generalized Gaussian source has the greatest rate-distortion function among all β-generalized Gaussian sources with a fixed E [ | X | γ ] , satisfying R ̲ β [ β ] ( D ) = R β [ β ] ( D ) for all D.
Proof. 
Since $E_{p_\beta}[|X|^\gamma] = D_{\max} = (\beta/\alpha)^{\gamma/\beta}\, \Gamma(\gamma/\beta + 1/\beta) / \Gamma(1/\beta)$, the upper bound in (19) is expressed as $\frac{1}{\gamma} \log \frac{D_{\max}}{D}$, which is equal to the rate-distortion function of the $\gamma$-generalized Gaussian source under the $\gamma$-th power distortion measure if its SLB is tight for all $D$. ☐
The preceding corollary is well known in the case of the squared distortion measure, for which the Gaussian source has the largest rate-distortion function not only among all $\beta$-generalized Gaussian sources, but also among all sources with a fixed variance ([2], Theorem 4.3.3).

5. Distortion-Rate Bounds for ϵ -Insensitive Loss

As another example of a distortion measure that does not match the $\beta$-generalized Gaussian source in the sense of Theorem 1, we consider the following $\gamma$-th power $\epsilon$-insensitive distortion measure generalizing (17),
$$d(x,y) = \rho_\epsilon(x-y)^\gamma, \tag{24}$$
where $\rho_\epsilon(z) = \max\{ |z| - \epsilon, 0 \}$. Such distortion measures are used in support vector regression models [14,15].
In this section, we focus on the Laplacian source ($\beta = 1$), for which, similarly to Section 4, we can evaluate
$$E_{g_s}\left[ \rho_\epsilon(Z)^\gamma \right] = D^\gamma\, \Gamma(\gamma+1)\, \exp\left( -\frac{\epsilon}{D} \right),$$
where $g_s(z) = \frac{s}{2} e^{-s|z|}$ and $s = 1/D$. Such an explicit evaluation appears to be prohibitive for $\beta \neq 1$. The above expected distortion is achievable by $q_{1/D}(y|x)$ at the rate $R = \log \frac{1}{\alpha D}$, since $R_1^{[1]}(D) = \underline{R}_1^{[1]}(D)$ holds for all $D$. Thus, we obtain the following upper bound, which is expressed in closed form in the case of the distortion-rate function.
Theorem 4.
The distortion-rate function $D_1^{[\epsilon,\gamma]}(R)$ of the Laplacian source under the $\gamma$-th power $\epsilon$-insensitive distortion measure (24) is upper-bounded as
$$D_1^{[\epsilon,\gamma]}(R) \le \bar{D}_1^{[\epsilon,\gamma]}(R) \equiv \frac{\Gamma(\gamma+1)}{\alpha^\gamma} \exp\left( -\gamma R - \alpha \epsilon\, e^R \right). \tag{25}$$
In addition, the upper bound is tight at $R = 0$; that is,
$$D_1^{[\epsilon,\gamma]}(0) = \bar{D}_1^{[\epsilon,\gamma]}(0) = E_{p_1}\left[ \rho_\epsilon(X)^\gamma \right] = \frac{\Gamma(\gamma+1)}{\alpha^\gamma} e^{-\alpha\epsilon}.$$
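The bound (25) and its zero-rate value are straightforward to compute; the following sketch (my illustration, with parameter values chosen to mirror Figure 2) checks $\bar{D}_1^{[\epsilon,\gamma]}(0)$ against the numerically integrated $E_{p_1}[\rho_\epsilon(X)^\gamma]$:

```python
# Illustration: the low-rate upper bound (25) for the Laplacian source, with a
# numerical check of its zero-rate value (alpha = 2, eps = 0.1, gamma = 1).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

alpha, eps, gam = 2.0, 0.1, 1.0

def D_upper(R):   # (25); R in nats
    return gamma(gam + 1) / alpha**gam * np.exp(-gam * R - alpha * eps * np.exp(R))

rho = lambda z: max(abs(z) - eps, 0.0)
p1 = lambda x: (alpha / 2) * np.exp(-alpha * abs(x))
Dmax, _ = quad(lambda x: rho(x)**gam * p1(x), -np.inf, np.inf)
print(f"D_bar(0) = {D_upper(0.0):.6f}, E[rho_eps(X)^gamma] = {Dmax:.6f}")
for R_bits in (1.0, 2.0, 3.0):
    print(f"R = {R_bits:.0f} bit(s): D_bar = {D_upper(R_bits * np.log(2)):.5f}")
```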
Upper and (Shannon) lower bounds that are asymptotically accurate as $D \to 0$ have been obtained for the distortion measure (24) [16]; they are proved to have approximation error of at most $O(\epsilon^2)$ as $D \to 0$. Combined with these bounds, the upper bound (25), being accurate at high distortion levels, provides a good approximation of the rate-distortion function over the entire range of $D$. This is demonstrated in Figure 2 for the case of $\epsilon = 0.1$, $\gamma = 1$, and $\alpha = 2$, where the upper bound (25) and that of [16] are referred to as the low-rate and high-rate upper bounds because they are effective at low and high rates, respectively. Although the rate-distortion function for this case is still unknown, it lies between the upper bounds and the SLB. Hence, the figure implies that the SLB is accurate for all $R$, and the rate-distortion function is almost identified except in the region around $R = 3$ (bits), where there is a relatively large gap between the upper bounds and the SLB.

6. Conclusions

We have shown that the generalized Gaussian distribution is the only source that can make the SLB tight for all $D$ under the power distortion measure when the orders of the source and the distortion measure are matched. We have also derived an upper bound on the rate-distortion function for the cases where the orders are mismatched; together with the SLB, it provides constant-width bounds sandwiching the rate-distortion function, and hence evaluates the m-projection of the source to the mixture family associated with the distortion measure. The derived bounds demonstrate that the condition for the tightness of the SLB can yield knowledge about the behavior of the rate-distortion function under other distortion measures, for example, those defined by composition of functions. In fact, we have obtained an upper bound to the distortion-rate function of $\epsilon$-insensitive distortion measures in the case of the Laplacian source. It is an important undertaking to investigate the geometric structure of the mixture family associated with the distortion measure and its relationship to the m-projection; that is, to the optimal reconstruction distribution.

Acknowledgments

The author would like to thank the anonymous reviewers for their helpful comments and suggestions. This work was supported in part by JSPS grants 25120014, 15K16050, and 16H02825.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 623–656. [Google Scholar] [CrossRef]
  2. Berger, T. Rate Distortion Theory: A Mathematical Basis for Data Compression; Prentice-Hall: Englewood Cliffs, NJ, USA, 1971. [Google Scholar]
  3. Buzo, A.; Kuhlmann, F.; Rivera, C. Rate-distortion bounds for quotient-based distortions with application to Itakura-Saito distortion measures. IEEE Trans. Inf. Theory 1986, 32, 141–147. [Google Scholar] [CrossRef]
  4. Gerrish, A.; Schultheiss, P. Information rates of non-Gaussian processes. IEEE Trans. Inf. Theory 1964, 10, 265–271. [Google Scholar] [CrossRef]
  5. Kostina, V. Data compression with low distortion and finite blocklength. IEEE Trans. Inf. Theory 2017, in press. [Google Scholar] [CrossRef]
  6. Tan, H.H.; Yao, K. Evaluation of rate-distortion functions for a class of independent identically distributed sources under an absolute magnitude criterion. IEEE Trans. Inf. Theory 1975, 21, 59–64. [Google Scholar] [CrossRef]
  7. Yao, K.; Tan, H.H. Absolute error rate-distortion functions for sources with constrained magnitudes. IEEE Trans. Inf. Theory 1978, 24, 499–503. [Google Scholar]
  8. Rose, K. A mapping approach to rate-distortion computation and analysis. IEEE Trans. Inf. Theory 1994, 40, 1939–1952. [Google Scholar] [CrossRef]
  9. Watanabe, K.; Ikeda, S. Rate-Distortion functions for gamma-type sources under absolute-log distortion measure. IEEE Trans. Inf. Theory 2016, 62, 5496–5502. [Google Scholar] [CrossRef]
  10. Amari, S.; Nagaoka, H. Methods of Information Geometry; Oxford University Press: Oxford, UK, 2000. [Google Scholar]
  11. Watanabe, K. Constant-width rate-distortion bounds for power distortion measures. In Proceedings of the 2016 IEEE Information Theory Workshop (ITW), Cambridge, UK, 11–14 September 2016; pp. 106–110. [Google Scholar]
  12. Fraysse, A.; Pesquet-Popescu, B.; Pesquet, J.C. Rate-distortion results for generalized Gaussian distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 3753–3756. [Google Scholar]
  13. Fraysse, A.; Pesquet-Popescu, B.; Pesquet, J.C. On the uniform quantization of a class of sparse sources. IEEE Trans. Inf. Theory 2009, 55, 3243–3263. [Google Scholar] [CrossRef]
  14. Steinwart, I.; Christmann, A. Support Vector Machines; Springer: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
  15. Chu, W.; Keerthi, S.S.; Ong, C.J. Bayesian support vector regression using a unified loss function. IEEE Trans. Neural Netw. 2004, 15, 29–44. [Google Scholar] [CrossRef] [PubMed]
  16. Watanabe, K. Rate-distortion bounds for ε-insensitive distortion measures. IEICE Trans. Fundam. 2016, E99-A, 370–377. [Google Scholar] [CrossRef]
  17. Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley Interscience: New York, NY, USA, 1991. [Google Scholar]
  18. Gray, R.M. Source Coding Theory; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1990. [Google Scholar]
  19. Gray, R.M. Entropy and Information Theory, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
  20. Nishiyama, Y.; Fukumizu, K. Characteristic kernels and infinitely divisible distributions. J. Mach. Learn. Res. 2016, 17, 1–28. [Google Scholar]
  21. Linder, T.; Zamir, R. On the asymptotic tightness of the Shannon lower bound. IEEE Trans. Inf. Theory 1994, 40, 2026–2031. [Google Scholar] [CrossRef]
  22. Koch, T. The Shannon lower bound is asymptotically tight. IEEE Trans. Inf. Theory 2016, 62, 6155–6161. [Google Scholar] [CrossRef]
Figure 1. Rate-distortion function $R_2^{[1]}(D)$ [6] and its lower and upper bounds in (23).
Figure 2. Distortion-rate bounds for the Laplacian source with $\alpha = 2$ under the $\epsilon$-insensitive distortion measure (24) ($\epsilon = 0.1$, $\gamma = 1$): the upper bound (25) (solid), the asymptotic upper bound obtained in [16] (dashed), and the Shannon lower bound (SLB, dotted).
