Journal of Fourier Analysis and Applications, Volume 25, Issue 6, pp 3123–3153

# Endpoint Estimates for the Maximal Function over Prime Numbers

Open Access
Article

## Abstract

Given an ergodic dynamical system $$(X, \mathcal {B}, \mu , T)$$, we prove that for each function f belonging to the Orlicz space $$L(\log L)^2(\log \log L)(X, \mu )$$, the ergodic averages
\begin{aligned} \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} f\big (T^p x\big ), \end{aligned}
converge for $$\mu$$-almost all $$x \in X$$, where $$\mathbb {P}_N$$ is the set of prime numbers not larger than N and $$\pi (N) = \# \mathbb {P}_N$$.

## Keywords

Weak maximal ergodic inequality, Orlicz space, Prime numbers, Pointwise convergence

## Mathematics Subject Classification

Primary 37A45; Secondary 46E30, 42B25

## 1 Introduction

Let $$(X, \mathcal {B}, \mu , T)$$ be an ergodic dynamical system, that is, $$(X, \mathcal {B}, \mu )$$ is a probability space with a measurable and measure-preserving transformation $$T: X \rightarrow X$$. The classical Birkhoff theorem states that for any function f from $$L^p(X, \mu )$$ with $$p \in [1, \infty )$$, the ergodic averages
\begin{aligned} \frac{1}{N}\sum _{n = 0}^{N-1} f\big (T^n x\big ) \end{aligned}
converge for $$\mu$$-almost all $$x \in X$$. This classical result, among others, motivates studying ergodic averages over subsequences of integers. In this article we are interested in pointwise convergence of the following averages,
\begin{aligned} \mathscr {A}_N f(x) = \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} f\big (T^p x \big ) \end{aligned}
where $$\mathbb {P}_N$$ is the set of prime numbers not larger than N and $$\pi (N) = \# \mathbb {P}_N$$. The problem of ergodic averages along prime numbers was first studied by Bourgain, who covered the case of functions belonging to $$L^2(X, \mu )$$. It was extended by Wierdl to all $$L^p(X, \mu )$$, for $$p > 1$$, see also [5, Sect. 9]. However, the endpoint $$p = 1$$ was left open for more than twenty years. Following the method developed by Buczolich and Mauldin, LaVictoire has shown that for each ergodic dynamical system there exists $$f \in L^1(X, \mu )$$ such that the sequence $$(\mathscr {A}_N f : N \in \mathbb {N})$$ diverges on a set of positive measure.

The purpose of this article is to find an Orlicz space close to $$L^1(X, \mu )$$ where the almost everywhere convergence holds. We show the following theorem (see Theorem 7.4).

### Theorem A

For each $$f \in L(\log L)^2(\log \log L)(X, \mu )$$, the limit
\begin{aligned} \lim _{N \rightarrow \infty } \mathscr {A}_N f(x) \end{aligned}
exists for $$\mu$$-almost all $$x \in X$$.

In light of the pointwise convergence obtained by Bourgain, to prove Theorem A it suffices to show the weak maximal ergodic inequality for functions in the Orlicz space $$L(\log L)^2(\log \log L)(X, \mu )$$. This inequality is deduced from the following restricted weak Orlicz estimate.

### Theorem B

There is $$C > 0$$ such that for any subset $$A \subset X$$,
\begin{aligned} \mu \left\{ x \in X : \sup _{N \in \mathbb {N}} \mathscr {A}_N\big ({\mathbb {1}_{{A}}}\big )(x) > \lambda \right\} \le C \lambda ^{-1} \log ^2(e/\lambda ) \mu (A) \end{aligned}
for all $$1> \lambda > 0$$.
By appealing to the Calderón transference principle, Theorem B is deduced from the corresponding result for the integers $$\mathbb {Z}$$ with the counting measure and the shift operator. To be more precise, for a function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$, we define
\begin{aligned} \mathcal {A}_N f(x) = \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} f(x+p). \end{aligned}
Our main result is the following theorem (see Theorem 6.3).

### Theorem C

There is $$C > 0$$ such that for any subset $$F \subset \mathbb {Z}$$ of a finite cardinality
\begin{aligned} \left| \left\{ x \in \mathbb {Z}: \sup _{N \in \mathbb {N}} \mathcal {A}_N \big ({\mathbb {1}_{{F}}} \big )(x) > \lambda \right\} \right| \le C \lambda ^{-1} \log ^2(e/\lambda ) |{F} | \end{aligned}
for all $$0< \lambda < 1$$.

Theorem C together with $$\ell ^2(\mathbb {Z})$$ estimates is sufficiently strong to imply the maximal inequality for all $$\ell ^p(\mathbb {Z})$$ spaces, for $$p > 1$$, giving an alternative proof of Wierdl’s theorem.

Let us now give some details about the proof of Theorem C. Without loss of generality, we may restrict the supremum to dyadic numbers. It is more convenient to work with weighted averages $$\mathcal {M}_N f$$ instead of $$\mathcal {A}_N f$$ where
\begin{aligned} \mathcal {M}_N f(x) = \frac{1}{\vartheta (N)} \sum _{p \in \mathbb {P}_N} f(x+p) \log p, \end{aligned}
and
\begin{aligned} \vartheta (N) = \sum _{p \in \mathbb {P}_N} \log p. \end{aligned}
Given $$t > 0$$, for each $$n \in \mathbb {N}$$, we decompose the operator $$\mathcal {M}_{2^n}$$ into two parts $$A_n^t$$ and $$B_n^t$$, in such a way that the maximal function associated with $$A_n^t$$ has $$\ell ^{1,\infty }(\mathbb {Z})$$ norm $$\lesssim t \Vert f\Vert _{\ell ^1}$$, whereas the one corresponding to $$B_n^t$$ has $$\ell ^2(\mathbb {Z})$$ norm $$\lesssim \exp \big (-c \sqrt{t}\big ) \Vert f\Vert _{\ell ^2}$$. When applied to the distribution function $$\big |\big \{\sup _{n \in \mathbb {N}} \mathcal {M}_{2^n}({\mathbb {1}_{{F}}}) > \lambda \big \}\big |$$, we can optimize both estimates by taking $$t \simeq \log ^2(e/\lambda )$$. This idea goes back to Fefferman, see also Bourgain. Ionescu introduced this technique in a related discrete context. The decomposition of $$\mathcal {M}_{2^n}$$ uses the circle method of Hardy and Littlewood. However, to achieve the exponential decay of the error term, in view of Page’s theorem, the approximating multiplier has to contain the second term of the asymptotic as well. Thus, the possible existence of a Siegel zero entails that in the neighborhood of the rational point a/q the approximating multiplier $$\widehat{L^{a, q}_{2^n}}(\cdot - a/q)$$ depends on the rational number a/q. We refer to Sects. 3 and 5 for details. Thanks to the log-convexity of $$\ell ^{1, \infty }(\mathbb {Z})$$, the weak type estimates are reduced to showing
\begin{aligned} \left| \left\{ x \in \mathbb {Z}: \sup _{t \le n} \left| \sum _{a \in A_q} \mathcal {F}^{-1}\big (\widehat{L^{a, q}_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f} \big )(x) \right| > \lambda \right\} \right| \le C \frac{1}{\lambda \varphi (q)} \Vert f\Vert _{\ell ^1} \end{aligned}
for $$2^s \le q < 2^{s+1}$$ with $$1 \le s \le \sqrt{t}$$. At this stage we exploit the behavior of the Gauss sums described in Theorem 2.1.

Let us emphasize that under the Generalized Riemann Hypothesis we can obtain in Proposition 3.1, and consequently in Theorem 3.2, a better error estimate. However, it is not clear whether one can prove Theorem 6.1 with the bounds proportional to $$\sqrt{t} \Vert f\Vert _{\ell ^1}$$.

The paper is organized as follows. In Sect. 2, we collect necessary facts about Dirichlet characters and the zero-free region. Then we evaluate the Gauss sum that appears in the approximating multiplier (Theorem 2.1). Section 3 is devoted to the construction of the approximating multipliers. In Sects. 5 and 6, we show $$\ell ^2$$ and the weak type estimates, respectively. In Sect. 7, we give two applications of Theorem C. Namely, we show how to deduce the maximal ergodic inequality for functions from $$\ell ^p(\mathbb {Z})$$ (Theorem 7.1). Next we apply the transference principle (Proposition 7.3) and show almost everywhere convergence of the ergodic averages $$(\mathscr {A}_N f : N \in \mathbb {N})$$ for $$f \in L(\log L)^2(\log \log L)(X, \mu )$$ (Theorem 7.4).

### 1.1 Notation

Throughout the whole article, we write $$A \lesssim B$$ ($$A \gtrsim B$$) if there is an absolute constant $$C>0$$ such that $$A\le CB$$ ($$A\ge CB$$). Moreover, C stands for a large positive constant whose value may vary from occurrence to occurrence. If $$A \lesssim B$$ and $$A\gtrsim B$$ hold simultaneously then we write $$A \simeq B$$. The set of positive integers and the set of prime numbers are denoted by $$\mathbb {N}$$ and $$\mathbb {P}$$, respectively. For $$x > 0$$, we set $$\mathbb {Z}_x = [1, x] \cap \mathbb {N}$$. Let $$\mathbb {N}_0 = \mathbb {N}\cup \{0\}$$.

## 2 Gauss sums

We start by recalling some basic facts from number theory. A general reference here is the book .

A homomorphism
\begin{aligned} \chi : \big (\mathbb {Z}/ q\mathbb {Z}\big )^\times \rightarrow \mathbb {C}^\times , \end{aligned}
is called a Dirichlet character modulo q. The simplest example, called the principal character modulo q, is defined as
\begin{aligned} \mathbb {1}_q(x) = {\left\{ \begin{array}{ll} 1 &{} \text {if } \gcd (x, q) = 1, \\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}
A character $$\chi$$ modulo q is primitive, if q is the least integer d, such that $$\chi (m) = \chi (n)$$ for all $$m \equiv n \pmod d$$ and $$(mn, q) = 1$$. For each character $$\chi$$ there is the unique primitive character $$\chi ^\star$$ modulo $$q_0$$ for some $$q_0 \mid q$$, such that
\begin{aligned} \chi (n) = {\left\{ \begin{array}{ll} \chi ^\star (n) &{} \text {if } \gcd (n, q) = 1, \\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}
A character is quadratic if it takes values only in $$\{-1, 0, 1\}$$ and attains $$-1$$ at least once. Recall that, if $$\chi ^\star$$ is a primitive quadratic character with modulus $$q_0$$, then
• $$q_0 \equiv 1 \pmod 4$$, and $$q_0$$ is square-free, or

• $$4 \mid q_0$$, $$q_0/4 \equiv 2 \text { or } 3 \pmod 4$$, and $$q_0/4$$ is square-free.

Given a Dirichlet character $$\chi$$ and $$s \in \mathbb {C}$$ with $$\mathfrak {R}s > 1$$, we define the Dirichlet L-function by the formula
\begin{aligned} L(s, \chi ) = \sum _{n \ge 1} \frac{\chi (n)}{n^s}. \end{aligned}
In fact, $$L(\, \cdot \,, \chi )$$ extends to an analytic function in $$\{z \in \mathbb {C}: \mathfrak {R}z > 0\}$$. There is an absolute constant $$c > 0$$, such that if $$\chi$$ is a Dirichlet character modulo q, then the region
\begin{aligned} \left\{ z \in \mathbb {C}: 1 - \frac{c}{\log q}< \mathfrak {R}z < 1 \right\} \end{aligned}
(1)
contains at most one zero of $$L(\, \cdot \,, \chi )$$, which we denote by $$\beta _q$$. The zero $$\beta _q$$ is real and the corresponding character is quadratic. A character having a zero in (1) is called exceptional. Since $$L(\beta , \chi ) = 0$$ implies that $$L(1-\beta , \chi ) = 0$$, we may assume that $$\frac{1}{2} \le \beta _q < 1$$.
The Gauss sum of a Dirichlet character $$\chi$$ modulo q is defined as
\begin{aligned} G(\chi , n) = \frac{1}{\varphi (q)} \sum _{r \in A_q} \chi (r) e^{2\pi i r n / q} \end{aligned}
where $$A_q = \big \{1 \le a \le q : \gcd (a, q) = 1\big \}$$, and $$\varphi (q) = \# A_q$$. Let us recall that for each $$\epsilon > 0$$ there is $$C_\epsilon > 0$$ such that
\begin{aligned} \varphi (q) \ge C_\epsilon q^{1 - \epsilon }. \end{aligned}
(2)
We set
\begin{aligned} \tau (\chi ) = \varphi (q) G(\chi , 1). \end{aligned}
Let us denote by $$\mu$$ the Möbius function, which is defined for $$q = p_1^{\alpha _1} \dots p_n^{\alpha _n}$$, where $$p_1, \ldots , p_n$$ are distinct primes, as
\begin{aligned} \mu (q) = {\left\{ \begin{array}{ll} (-1)^n &{} \text {if }\alpha _1 = \ldots = \alpha _n = 1, \\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}
and $$\mu (1) = 1$$. The following theorem plays a crucial role in Sect. 6.

### Theorem 2.1

Let $$\chi$$ be a quadratic Dirichlet character modulo q induced by $$\chi ^\star$$ having the conductor $$q_0$$. For $$x \in \mathbb {Z}$$, we set $$r = \gcd (q, x)$$. Then
\begin{aligned} \sum _{a \in A_q} G(\chi , a) e^{2\pi i x a/q} = \mu (r) q_0 \frac{\varphi (r)}{\varphi (q)} \chi ^\star (-x) \end{aligned}
provided that $$q/q_0$$ is square-free, $$\gcd (q/q_0, q_0) = 1$$ and $$r \mid q/q_0$$. Otherwise the sum equals zero.

### Proof

By [16, Theorem 9.12], if $$r \mid q/q_0$$ then
\begin{aligned} \sum _{a \in A_q} \chi (a) e^{2\pi i a x/q} = \frac{\varphi (q)}{\varphi (q/r)} \chi ^\star \big (x/r\big ) \chi ^\star \big (q/(r q_0)\big ) \mu \big (q/(r q_0)\big ) \tau (\chi ^\star ), \end{aligned}
(3)
otherwise the sum equals zero. In particular, for $$a \in A_q$$, we have
\begin{aligned} G(\chi , a) = \frac{\mu (q/q_0)}{\varphi (q)} \chi ^\star (a) \chi ^\star (q/q_0) \tau (\chi ^\star ). \end{aligned}
(4)
Hence, $$G(\chi , a) \ne 0$$ entails that $$q/q_0$$ is square-free and $$\gcd (q/q_0, q_0) = 1$$. Next, using (4) and (3) we get
\begin{aligned} \sum _{a \in A_q} G(\chi , a) e^{2\pi i x a/q}&= \frac{\mu (q/q_0)}{\varphi (q)} \chi ^\star (q/q_0) \tau (\chi ^\star ) \sum _{a \in A_q} \chi (a) e^{2\pi i x a /q} \\&= \frac{\mu (r)}{\varphi (q/r)} \chi ^\star (q/q_0) \chi ^\star \big (x/r\big ) \chi ^\star \big (q/(r q_0)\big ) \tau (\chi ^\star )^2 \\&= \frac{\mu (r)}{\varphi (q/r)} \chi ^\star (x) \tau (\chi ^\star )^2. \end{aligned}
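Because $$|{\tau (\chi ^\star )} | = \sqrt{q_0}$$ and, $$\chi ^\star$$ being real, $$\overline{\tau (\chi ^\star )} = \chi ^\star (-1) \tau (\chi ^\star )$$, we have $$\tau (\chi ^\star )^2 = q_0 \chi ^\star (-1)$$. Hence,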
\begin{aligned} \sum _{a \in A_q} G(\chi , a) e^{2\pi i x a/q} = \frac{\mu (r)}{\varphi (q/r)} \chi ^\star (-x) q_0. \end{aligned}
(5)
Finally, since $$q/q_0$$ is square-free, $$\gcd (q/q_0, q_0) = 1$$ and $$r \mid q/q_0$$, we deduce that $$\gcd (q/r, r) = 1$$. Therefore,
\begin{aligned} \varphi (q/r) \varphi (r) = \varphi (q), \end{aligned}
which together with (5) completes the proof. $$\square$$
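The identity of Theorem 2.1 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the character is induced by the Legendre symbol modulo a prime $$q_0 \equiv 1 \pmod 4$$, and the helper names (`phi`, `mobius`, `legendre`, `gauss_sum`, `lhs`, `rhs`) are hypothetical, not taken from the paper.

```python
import cmath
from math import gcd


def phi(n):
    """Euler's totient by direct count (adequate for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def mobius(n):
    """Moebius function via trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    return -result if n > 1 else result


def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    t = pow(n % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def gauss_sum(q, q0, n):
    """G(chi, n) for chi modulo q induced by the Legendre symbol modulo q0."""
    units = [r for r in range(1, q + 1) if gcd(r, q) == 1]
    return sum(legendre(r, q0) * e(r * n / q) for r in units) / len(units)


def lhs(q, q0, x):
    """Left-hand side of Theorem 2.1: sum over a in A_q of G(chi, a) e(xa/q)."""
    return sum(gauss_sum(q, q0, a) * e(x * a / q)
               for a in range(1, q + 1) if gcd(a, q) == 1)


def rhs(q, q0, x):
    """Right-hand side of Theorem 2.1, including the vanishing cases."""
    m, r = q // q0, gcd(q, x)
    if mobius(m) == 0 or gcd(m, q0) != 1 or m % r != 0:
        return 0.0
    return mobius(r) * q0 * phi(r) / phi(q) * legendre(-x, q0)


if __name__ == "__main__":
    for x in range(6):
        print(x, lhs(15, 5, x), rhs(15, 5, x))
```

The moduli $$q = 45$$ and $$q = 25$$ (with $$q_0 = 5$$) exercise the two vanishing cases, where $$q/q_0$$ is not square-free or not coprime to $$q_0$$.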
Let us observe that the identity (4) together with (2) implies that
\begin{aligned} \big |G(\chi , a)\big | \le \frac{\sqrt{q_0}}{\varphi (q)} \le C_\epsilon q^{-\frac{1}{2}+\epsilon }. \end{aligned}
(6)
for any $$\epsilon > 0$$. Moreover, $$G(\chi , a) \ne 0$$ entails that q is square-free or $$4 \mid q$$ and q / 4 is square-free.

## 3 Approximating Multipliers

Let us denote by $$\mathcal {A}_N$$ the averaging operator over prime numbers, that is for a function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$ we have
\begin{aligned} \mathcal {A}_N f(x) = \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} f(x+p) \end{aligned}
where $$\mathbb {P}_N = [1, N] \cap \mathbb {P}$$ and $$\pi (N) = \# \mathbb {P}_N$$. Since sums over primes are very irregular, it is more convenient to work with
\begin{aligned} \mathcal {M}_N f(x)=\frac{1}{\vartheta (N)} \sum _{p \in \mathbb {P}_N} f(x+p) \log p \end{aligned}
where
\begin{aligned} \vartheta (N) = \sum _{p \in \mathbb {P}_N} \log p. \end{aligned}
By the partial summation, we easily see that
\begin{aligned} \sum _{p \in \mathbb {P}_N} f(x+p)&= \sum _{n=2}^N \Big (\vartheta (n) \mathcal {M}_n f(x) - \vartheta (n-1) \mathcal {M}_{n-1}f(x)\Big ) \frac{1}{\log n} \\&= \vartheta (N) \mathcal {M}_N f(x) \frac{1}{\log N} \!+ \!\sum _{n = 2}^{N-1} \vartheta (n) \mathcal {M}_n f(x) \bigg (\frac{1}{\log n} \!-\! \frac{1}{\log (n+1)}\bigg ), \end{aligned}
thus
\begin{aligned} \big | \mathcal {A}_N f(x) \big |&\le \!\sup _{N' \in \mathbb {N}} \big | \mathcal {M}_{N'} f(x) \big | \frac{1}{\pi (N)} \left( \vartheta (N) \frac{1}{\log N} \!+\! \sum _{n = 2}^{N-1} \vartheta (n) \left( \frac{1}{\log n} \!-\! \frac{1}{\log (n+1)}\!\right) \right) \nonumber \\&\le \sup _{N' \in \mathbb {N}} \big | \mathcal {M}_{N'} f(x) \big |. \end{aligned}
(7)
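The last step in (7) relies on the fact that the bracketed factor equals $$\pi (N)$$ exactly, which follows by applying the partial summation identity to $$f \equiv 1$$. A minimal numerical sketch of this check is given below; the function names are illustrative only.

```python
from math import isclose, log


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def theta_table(N):
    """theta[n] = sum of log p over primes p <= n, for 0 <= n <= N."""
    prime_set = set(primes_up_to(N))
    theta = [0.0] * (N + 1)
    for n in range(2, N + 1):
        theta[n] = theta[n - 1] + (log(n) if n in prime_set else 0.0)
    return theta


def abel_weight_sum(N):
    """theta(N)/log N + sum_{n=2}^{N-1} theta(n) (1/log n - 1/log(n+1))."""
    theta = theta_table(N)
    total = theta[N] / log(N)
    for n in range(2, N):
        total += theta[n] * (1.0 / log(n) - 1.0 / log(n + 1))
    return total


if __name__ == "__main__":
    for N in (10, 100, 1000):
        # The two printed numbers agree up to rounding error.
        print(N, len(primes_up_to(N)), abel_weight_sum(N))
```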
To better understand the operators $$\mathcal {M}_N$$, we use the Hardy–Littlewood circle method. Let $$\mathcal {F}$$ denote the Fourier transform on $$\mathbb {R}$$ defined for any function $$f \in L^1(\mathbb {R})$$ as
\begin{aligned} \mathcal {F}f(\xi ) = \int _\mathbb {R}f(x) e^{-2\pi i \xi x} {\, \mathrm d} x. \end{aligned}
If $$f \in \ell ^1(\mathbb {Z})$$, we set
\begin{aligned} \hat{f}(\xi ) = \sum _{n \in \mathbb {Z}} f(n) e^{-2\pi i \xi n}. \end{aligned}
To simplify the notation we denote by $$\mathcal F^{-1}$$ the inverse Fourier transform on $$\mathbb {R}$$ or the inverse Fourier transform on the torus $$\mathbb {T}\equiv [0, 1)$$, depending on the context. Let $$\mathfrak {m}_N$$ be the Fourier multiplier corresponding to $$\mathcal {M}_N$$, i.e.,
\begin{aligned} \mathfrak {m}_N(\xi ) = \frac{1}{\vartheta (N)} \sum _{p \in \mathbb {P}_N} e^{2\pi i \xi p} \log p. \end{aligned}
(8)
Then for a finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$, we have
\begin{aligned} \mathcal {M}_N f(x) = \mathcal {F}^{-1}\big (\mathfrak {m}_N \hat{f} \big )(x). \end{aligned}
For $$\frac{1}{2} \le \beta \le 1$$, we set
\begin{aligned} M_N^\beta = \frac{1}{N} \sum _{n = 1}^N \frac{n^\beta - (n-1)^\beta }{\beta } \delta _n. \end{aligned}
(9)
To simplify the notation we write $$M_N$$ for $$M_N^1$$. Let $$M_0 \equiv 0$$. Recall that
\begin{aligned} \big |\widehat{M_N}(\xi ) \big | \lesssim \min \Big \{\big (N |{\xi } |\big )^{-1}, N|{\xi } |\Big \}. \end{aligned}
(10)
For $$\beta < 1$$, we notice that the operators $$M_N^\beta$$ are not averaging operators. Moreover, by the partial summation and (10), we get
\begin{aligned} \big |\widehat{M_N^\beta } (\xi ) \big |&= \frac{1}{\beta N} \left| \sum _{n=1}^N \big (n \widehat{M_n}(\xi ) - (n-1) \widehat{M_{n-1}}(\xi )\big ) \big (n^\beta - (n-1)^\beta \big )\right| \\&\lesssim \big (N |{\xi } | \big )^{-1} N^{\beta -1} + \big (N|{\xi } |\big )^{-1} \sum _{n = 1}^{N-1} \big (2 n^\beta - (n-1)^\beta - (n+1)^\beta \big ) \\&\lesssim \big (N |{\xi } | \big )^{-1} N^{\beta -1} + \big (N|{\xi } |\big )^{-1} \sum _{n = 1}^{N-1} n^{\beta -2} . \end{aligned}
Hence,
\begin{aligned} \big |\widehat{M_N^\beta }(\xi ) \big | \lesssim \big (N |{\xi } |\big )^{-1}. \end{aligned}
(11)
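The kernels $$M_N^\beta$$ from (9) and the decay (11) are easy to explore numerically. The following sketch (with hypothetical helper names `kernel` and `fourier`) computes $$\widehat{M_N^\beta }$$ directly from the definition, confirms that its value at $$\xi = 0$$ is $$\beta ^{-1} N^{\beta -1}$$ by telescoping, and prints $$N |{\xi } | \, |\widehat{M_N^\beta }(\xi )|$$, which stays bounded as (11) predicts.

```python
import cmath
from math import isclose


def kernel(N, beta):
    """Weights of M_N^beta at the points n = 1, ..., N, as in (9)."""
    return [(n ** beta - (n - 1) ** beta) / (beta * N) for n in range(1, N + 1)]


def fourier(weights, xi):
    """Fourier transform sum_n w(n) exp(-2 pi i xi n), with n starting at 1."""
    return sum(w * cmath.exp(-2j * cmath.pi * xi * n)
               for n, w in enumerate(weights, start=1))


if __name__ == "__main__":
    N = 256
    for beta in (0.5, 0.75, 1.0):
        w = kernel(N, beta)
        # Total mass telescopes to N^(beta - 1) / beta.
        print(beta, fourier(w, 0.0).real, N ** (beta - 1) / beta)
        for xi in (0.01, 0.1, 0.3, 0.5):
            print("   ", xi, N * xi * abs(fourier(w, xi)))
```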
Moreover,
\begin{aligned} \big |\widehat{M_N^\beta }(\xi ) - \beta ^{-1} N^{\beta -1} \big | \lesssim N^\beta |{\xi } |, \end{aligned}
thus
\begin{aligned} \big |\widehat{M_N^{\beta }}(\xi ) - \widehat{M_{2N}^{\beta }}(\xi )\big |&\lesssim \big |\widehat{M_N^{\beta }}(\xi ) - \beta ^{-1} N^{\beta -1} \big | + \big |\widehat{M_{2N}^\beta }(\xi ) - \beta ^{-1} (2N)^{\beta -1}\big | \\&\quad + \big |\beta ^{-1} N^{\beta -1} - \beta ^{-1} (2N)^{\beta -1}\big | \\&\lesssim N^\beta |{\xi } | + (1-\beta ) N^{\beta -1}. \end{aligned}
Therefore,
\begin{aligned} \big |\widehat{M_N^{\beta }}(\xi ) - \widehat{M_{2N}^{\beta }}(\xi )\big |&\lesssim \min \Big \{(N |{\xi } |)^{-1}, N^\beta |{\xi } | + (1-\beta ) N^{\beta -1} \Big \} \nonumber \\&\lesssim \min \Big \{(N |{\xi } |)^{-1}, N |{\xi } | \Big \} + (1-\beta ) N^{\beta -1}. \end{aligned}
(12)
Given $$q \in \mathbb {N}$$, and $$a \in A_q$$, we set
\begin{aligned} L_N^{a, q} = G(\mathbb {1}_q, a) M_N, \end{aligned}
(13)
if there is no exceptional character modulo q, and
\begin{aligned} L_N^{a, q} = G(\mathbb {1}_q, a) M_N - G(\chi _q, a) M_N^{\beta _q}, \end{aligned}
(14)
when there is an exceptional character $$\chi _q$$ modulo q and $$\beta _q$$ is the corresponding zero.

### Proposition 3.1

There is $$c > 0$$ such that if $$\xi \in \mathbb {T}$$,
\begin{aligned} \bigg | \xi - \frac{a}{q} \bigg | \le N^{-1} Q \end{aligned}
for some $$1 \le q \le Q$$, $$a \in A_q$$, and $$1 \le Q \le \exp \big (c\sqrt{\log N}\big )$$, then
\begin{aligned} \mathfrak {m}_N (\xi ) = \widehat{L^{a,q}_N}(\xi - a/q) +\mathcal {O}\Big (Q \exp \big (-c\sqrt{\log N}\big )\Big ). \end{aligned}

### Proof

Observe that for a prime p, $$p \mid q$$ if and only if $$(p \bmod q, q) > 1$$. Hence,
\begin{aligned} \left| \sum _{\genfrac{}{}{0.0pt}2{r = 1}{(r, q) > 1}}^q \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_N}{p \equiv r \bmod q}} e^{2\pi i \xi p} \log p \right| \le \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}}{p \mid q}} \log p \le q. \end{aligned}
Let $$\theta = \xi - a/q$$. For $$p \equiv r \pmod q$$, we have
\begin{aligned} \xi p \equiv \theta p + r a / q \pmod 1, \end{aligned}
thus
\begin{aligned} \sum _{r \in A_q} \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_N}{p \equiv r \bmod q}} e^{2 \pi i \xi p} \log p = \sum _{r \in A_q} e^{2\pi i r a/q} \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_N}{p \equiv r \bmod q}} e^{2\pi i \theta p} \log p. \end{aligned}
For $$x \ge 2$$, we set
\begin{aligned} \vartheta (x; q, r) = \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_x}{p \equiv r \bmod q}} \log p. \end{aligned}
Then, by the partial summation, we obtain
\begin{aligned} \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_N}{p \equiv r \bmod q}} e^{2 \pi i \theta p} \log p&= \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_N \setminus \mathbb {P}_{\sqrt{N}}}{p \equiv r \bmod q}} e^{2\pi i \theta p} \log p + \mathcal {O}\big (\sqrt{N}\big ) \nonumber \\&= \vartheta (N; q, r) e^{2\pi i \theta N} - \vartheta (\sqrt{N}; q, r) e^{2 \pi i \theta \sqrt{N}}\nonumber \\&\quad -\, 2 \pi i \theta \int _{\sqrt{N}}^N \vartheta (t; q, r) e^{2\pi i \theta t} {\, \mathrm d} t + \mathcal {O}\big (\sqrt{N} \big ). \end{aligned}
(15)
Analogously, for any $$\frac{1}{2} \le \beta \le 1$$, we can write
\begin{aligned} \sum _{n = 1}^N \frac{n^\beta - (n-1)^\beta }{\beta } e^{2\pi i \theta n}= & {} \beta ^{-1} N^\beta e^{2\pi i \theta N} - \beta ^{-1} \sqrt{N^\beta } e^{2\pi i \theta \sqrt{N}}\nonumber \\&-\, 2\pi i \theta \beta ^{-1} \int _{\sqrt{N}}^N t^\beta e^{2\pi i \theta t} {\, \mathrm d}t +\mathcal {O}\big (\sqrt{N}\big ). \end{aligned}
(16)
By Page’s theorem, there is an absolute constant $$c > 0$$ such that for each $$x \ge 2$$, $$1 \le q \le \exp \big (c\sqrt{\log x}\big )$$, and $$r \in A_q$$,
\begin{aligned} \bigg | \vartheta (x; q, r) - \frac{x}{\varphi (q)} \bigg | \lesssim x \exp \big (-c \sqrt{\log x} \big ), \end{aligned}
if there is no exceptional character modulo q, and
\begin{aligned} \bigg | \vartheta (x; q, r) - \frac{x}{\varphi (q)} + \frac{\chi (r)}{\varphi (q)} \beta ^{-1} x^\beta \bigg | \lesssim x \exp \big (-c \sqrt{\log x} \big ), \end{aligned}
when there is an exceptional character $$\chi$$ modulo q, and $$\beta$$ is the concomitant zero. Therefore, by (15) and (16), we obtain
\begin{aligned}&\left| \sum _{\genfrac{}{}{0.0pt}2{p \in \mathbb {P}_N}{p \equiv r \bmod q}} e^{2 \pi i \theta p} \log p - \frac{1}{\varphi (q)} \sum _{n = 1}^N e^{2\pi i \theta n} \bigg (1 - \chi (r) \frac{n^\beta - (n-1)^\beta }{\beta }\bigg ) \right| \\&\quad \lesssim \sqrt{N} + \bigg | \vartheta (N; q, r) - \frac{N}{\varphi (q)} + \frac{\chi (r)}{\varphi (q)} \beta ^{-1} N^\beta \bigg | \\&\qquad + \bigg | \vartheta (\sqrt{N}; q, r) - \frac{\sqrt{N}}{\varphi (q)} + \frac{\chi (r)}{\varphi (q)} \beta ^{-1} \sqrt{N^{\beta }} \bigg | \\&\qquad + |{\theta } | \int _{\sqrt{N}}^N \bigg |\vartheta (t; q, r) - \frac{t}{\varphi (q)} + \frac{\chi (r)}{\varphi (q)} \beta ^{-1} t^{\beta }\bigg | {\, \mathrm d}t\\&\quad \lesssim N \exp \big (-c\sqrt{\log N}\big ) + Q N^{-1} \int _{\sqrt{N}}^N t \exp \big (-c\sqrt{\log t}\big ) {\, \mathrm d}t, \end{aligned}
which is bounded by $$N Q \exp \big (-c\sqrt{\log N}\big )$$. Finally, by the prime number theorem
\begin{aligned} \bigg | \frac{\vartheta (N) - N}{N} \bigg | \le C \exp \big (-c\sqrt{\log N}\big ), \end{aligned}
and the proposition follows. $$\square$$
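Proposition 3.1 can be illustrated at the rational points themselves, where $$\xi = a/q$$ and $$\widehat{M_N}(0) = 1$$, so that the main term reduces to $$G(\mathbb {1}_q, a)$$. The sketch below compares $$\mathfrak {m}_N(a/q)$$ with $$G(\mathbb {1}_q, a)$$ for a few small moduli. It treats these moduli as non-exceptional, which is an assumption of the sketch, and the function names are illustrative only.

```python
import cmath
from math import gcd, log


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def multiplier(N, xi):
    """m_N(xi) from (8): log-weighted exponential sum over primes <= N."""
    primes = primes_up_to(N)
    theta = sum(log(p) for p in primes)
    return sum(log(p) * cmath.exp(2j * cmath.pi * xi * p) for p in primes) / theta


def principal_gauss_sum(q, a):
    """G(1_q, a): normalized exponential sum over reduced residues modulo q."""
    units = [r for r in range(1, q + 1) if gcd(r, q) == 1]
    return sum(cmath.exp(2j * cmath.pi * r * a / q) for r in units) / len(units)


if __name__ == "__main__":
    N = 50000
    for a, q in [(1, 1), (1, 2), (1, 3), (1, 4), (2, 5), (1, 6)]:
        print(f"{a}/{q}", multiplier(N, a / q), principal_gauss_sum(q, a))
```

For $$\gcd (a, q) = 1$$ the value $$G(\mathbb {1}_q, a)$$ equals $$\mu (q)/\varphi (q)$$, so the multiplier is close to zero at $$a/q$$ whenever q is not square-free, for instance at $$1/4$$.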
Next, we select $$\eta : \mathbb {R}\rightarrow \mathbb {R}$$, a smooth function such that $$0 \le \eta \le 1$$, and
\begin{aligned} \eta (\xi ) = {\left\{ \begin{array}{ll} 1 &{} \quad \text {if } |{\xi } | \le \tfrac{1}{4}, \\ 0 &{} \quad \text {if } |{\xi } | \ge \tfrac{1}{2}. \end{array}\right. } \end{aligned}
We may assume that $$\eta$$ is a convolution of two smooth functions with supports contained in $$\big (-\tfrac{1}{2}, \tfrac{1}{2}\big )$$. For $$s \in \mathbb {N}_0$$, we set
\begin{aligned} \eta _s(\xi ) = \eta \big (2^{4 s} \xi \big ). \end{aligned}
We define a family of approximating multipliers, by the formula
\begin{aligned} \nu _n^s(\xi ) = \sum _{a/q \in \mathscr {R}_s} \widehat{L_{2^n}^{a, q}}(\xi - a/q) \eta _s\big (\xi - a/q\big ) \end{aligned}
(17)
where
\begin{aligned} \mathscr {R}_s =&\, \big \{ a/q \in \mathbb {Q}\cap (0, 1] : a \in A_q, \text { and } 2^s \le q < 2^{s+1}, q \text { is square-free or }\\&\quad 4 \mid q \text { and } q/4 \text { is square-free} \big \}, \end{aligned}
and $$\mathscr {R}_0 = \{1\}$$. We set $$\nu _n = \sum _{s \ge 0} \nu _n^s$$.
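To make the index sets $$\mathscr {R}_s$$ concrete, the following sketch enumerates them for small s using exact rational arithmetic. The function names are hypothetical; the admissibility condition is the one stated in the definition above.

```python
from fractions import Fraction
from math import gcd


def is_squarefree(n):
    """True if no square d^2 with d >= 2 divides n."""
    return all(n % (d * d) != 0 for d in range(2, int(n ** 0.5) + 1))


def admissible(q):
    """q is square-free, or 4 | q and q/4 is square-free."""
    return is_squarefree(q) or (q % 4 == 0 and is_squarefree(q // 4))


def rationals(s):
    """The set R_s of reduced fractions a/q in (0, 1] with 2^s <= q < 2^(s+1)."""
    return sorted(Fraction(a, q)
                  for q in range(2 ** s, 2 ** (s + 1)) if admissible(q)
                  for a in range(1, q + 1) if gcd(a, q) == 1)


if __name__ == "__main__":
    for s in range(4):
        print(s, len(rationals(s)), [str(f) for f in rationals(s)][:8])
```

For example, $$q = 9$$ and $$q = 16$$ are excluded, while $$q = 8$$ and $$q = 12$$ are admitted through the second alternative.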

### Theorem 3.2

There are $$C, c > 0$$ such that for all $$n \in \mathbb {N}_0$$ and $$\xi \in \mathbb {T}$$,
\begin{aligned} \big | \mathfrak {m}_{2^n}(\xi ) - \nu _n(\xi ) \big | \le C \exp \big (-c \sqrt{n}\big ) \end{aligned}
where $$\mathfrak {m}_N$$ is defined by (8).

### Proof

Let
\begin{aligned} Q_n = \exp \big (\tfrac{c}{2} \sqrt{n} \big ) \end{aligned}
where the constant c is determined in Proposition 3.1. By Dirichlet’s principle, there are coprime integers a and q, satisfying $$1 \le a \le q \le 2^n Q_n^{-1}$$, and such that
\begin{aligned} \bigg | \xi - \frac{a}{q} \bigg | \le \frac{1}{q} 2^{-n} Q_n. \end{aligned}
Let us first consider the case when $$1 \le q \le Q_n$$. We select $$s_1 \in \mathbb {N}_0$$ satisfying
\begin{aligned} 2^{s_1+1} < \frac{1}{2} 2^n Q_n^{-2} \le 2^{s_1+2}. \end{aligned}
For $$s \le s_1$$ and $$a'/q' \in \mathscr {R}_s$$, with $$a'/q' \ne a/q$$, we have
\begin{aligned} \bigg | \xi - \frac{a'}{q'} \bigg | \ge \frac{1}{q q'} - \bigg |\xi - \frac{a}{q} \bigg | \ge Q_n^{-1} 2^{-s_1-1} - 2^{-n} Q_n \ge 2^{-n} Q_n. \end{aligned}
Therefore, by (6) and (11),
\begin{aligned} \bigg | \widehat{L_{2^n}^{a', q'}} (\xi - a'/q') \eta _s(\xi - a'/q') \bigg | \lesssim 2^{-\frac{s}{4}} \big |2^n (\xi - a'/q') \big |^{-1} \le 2^{-\frac{s}{4}} Q_n^{-1}, \end{aligned}
which implies that
\begin{aligned} \left| \sum _{s = 0}^{s_1} \sum _{\genfrac{}{}{0.0pt}2{a'/q' \in \mathscr {R}_s}{a'/q' \ne a/q}} \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi - a'/q') \right| \lesssim Q_n^{-1} \sum _{s \ge 0} 2^{-\frac{s}{4}}. \end{aligned}
For $$s > s_1$$, by (6) we obtain
\begin{aligned} \left| \sum _{s> s_1} \sum _{\genfrac{}{}{0.0pt}2{a'/q' \in \mathscr {R}_s}{a'/q' \ne a/q}} \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi -a'/q') \right| \lesssim \sum _{s > s_1} 2^{-\frac{s}{4}} \lesssim \big (2^n Q_n^{-2}\big )^{-\frac{1}{4}} \lesssim Q_n^{-1}. \end{aligned}
If q is square-free or $$4 \mid q$$ and q / 4 is square-free then there is $$s_0 \in \mathbb {N}_0$$ such that $$a/q \in \mathscr {R}_{s_0}$$, thus
\begin{aligned} Q_n \ge 2^{s_0}. \end{aligned}
By Proposition 3.1,
\begin{aligned} \bigg | \mathfrak {m}_{2^n}(\xi ) \!- \!\widehat{L^{a, q}_{2^n}} (\xi \!-\! a/q) \eta _{s_0}(\xi \!-\! a/q) \bigg | \lesssim \big |2^n (\xi - a/q)\big |^{-1} \big (1 \!-\! \eta _{s_0}(\xi - a/q)\big ) + Q_n^{-1}. \end{aligned}
Since $$1 - \eta _{s_0}(\xi - a/q) > 0$$ implies that
\begin{aligned} \bigg | \xi - \frac{a}{q} \bigg | \ge \frac{1}{4} 2^{-4 s_0} \gtrsim Q_n^{-4}, \end{aligned}
we obtain
\begin{aligned} \bigg | \mathfrak {m}_{2^n}(\xi ) - \widehat{L^{a, q}_{2^n}}(\xi - a/q) \eta _{s_0}(\xi - a/q) \bigg | \lesssim 2^{-n} Q_n^4 + Q_n^{-1} \lesssim Q_n^{-1}. \end{aligned}
Finally, if q is not square-free and q/4 is not a square-free integer, then by Proposition 3.1,
\begin{aligned} \bigg | \mathfrak {m}_{2^n}(\xi ) - \widehat{L^{a, q}_{2^n}}(\xi - a/q) \eta _{s_0}(\xi - a/q) \bigg | = \big | \mathfrak {m}_{2^n}(\xi ) \big | \lesssim Q_n^{-1}. \end{aligned}
It remains to deal with $$Q_n \le q \le 2^n Q_n^{-1}$$. By Vinogradov’s inequality (see [20, Theorem 1, Chap. IX] or [17, Theorem 8.5]), we get
\begin{aligned} \big | \mathfrak {m}_{2^n}(\xi ) \big | \lesssim n^4 \Big (q^{-\frac{1}{2}} + 2^{-\frac{1}{2}n} q^{\frac{1}{2}} + 2^{-\frac{1}{5}n}\Big ) \lesssim n^4 Q_n^{-\frac{1}{2}}. \end{aligned}
Next, we show that
\begin{aligned} \left| \sum _{s \ge 0} \sum _{a'/q' \in \mathscr {R}_s} \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi - a'/q') \right| \lesssim Q_n^{-\frac{1}{8}}. \end{aligned}
Select $$s_2 \in \mathbb {N}_0$$ such that
\begin{aligned} 2^{s_2+1} \le Q_n^\frac{1}{2} \le 2^{s_2+2}. \end{aligned}
(18)
For $$s \le s_2$$, if $$a'/q' \in \mathscr {R}_s$$, then $$1 \le q' \le Q_n^\frac{1}{2}$$, and hence
\begin{aligned} \bigg |\xi - \frac{a'}{q'}\bigg | \ge \frac{1}{q'} 2^{-n} Q_n \ge 2^{-n} Q_n^{\frac{1}{2}}. \end{aligned}
Therefore, by (6) and (11),
\begin{aligned} \bigg | \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi - a'/q') \bigg | \lesssim 2^{-\frac{s}{4}} Q_n^{-\frac{1}{2}}, \end{aligned}
which entails that
\begin{aligned} \left| \sum _{s=0}^{s_2} \sum _{a'/q' \in \mathscr {R}_s} \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi - a'/q') \right| \lesssim Q_n^{-\frac{1}{2}} \sum _{s \ge 0} 2^{-\frac{s}{4}}. \end{aligned}
If $$s > s_2$$, then by (6), we get
\begin{aligned} \bigg | \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi - a'/q') \bigg | \lesssim 2^{-\frac{s}{4}}, \end{aligned}
hence by (18),
\begin{aligned} \left| \sum _{s> s_2} \sum _{a'/q' \in \mathscr {R}_s} \widehat{L^{a', q'}_{2^n}}(\xi - a'/q') \eta _s(\xi - a'/q') \right| \lesssim \sum _{s > s_2} 2^{-\frac{s}{4}} \lesssim Q_n^{-\frac{1}{8}}, \end{aligned}
and the theorem follows. $$\square$$

## 4 Equidistribution of Weak $$\ell ^1$$ Norms

In this section we prove that the maximal function associated with kernels $$(M^\beta _{2^n} : n \in \mathbb {N}_0)$$ has weak $$\ell ^1(\mathbb {Z})$$-norm equidistributed in residue classes. Before embarking on the proof, let us recall two lemmas essential for the argument.

### Lemma 4.1

[13, Lemma 1] There is $$C > 0$$ such that for all $$s \in \mathbb {N}$$ and $$u \in \mathbb {R}$$,
\begin{aligned} \left\| \int _{-\frac{1}{2}}^{\frac{1}{2}} e^{2 \pi i \xi x} \eta _s(\xi ) {\, \mathrm d}\xi \right\| _{\ell ^1(x)}&\le C,\\ \left\| \int _{-\frac{1}{2}}^{\frac{1}{2}} e^{2 \pi i \xi x} \big (1 - e^{2 \pi i \xi u} \big ) \eta _s(\xi ) {\, \mathrm d}\xi \right\| _{\ell ^1(x)}&\le C |{u} | 2^{-4 s}. \end{aligned}

### Lemma 4.2

[13, Lemma 2] For all $$p \ge 1$$, any $$1 \le Q \le 2^{2 s}$$ with $$s \in \mathbb {N}$$, $$r \in \{1, \ldots , Q\}$$, and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \big \Vert \mathcal {F}^{-1}\big (\eta _s \hat{f} \big )(Q x + r) \big \Vert _{\ell ^p(x)} \simeq Q^{-\frac{1}{p}} \big \Vert \mathcal {F}^{-1}\big (\eta _s \hat{f} \big ) \big \Vert _{\ell ^p}. \end{aligned}

The following theorem is the main result of this section.

### Theorem 4.3

There is $$C > 0$$ such that for any $$1 \le Q \le 2^{2 s}$$ with $$s \in \mathbb {N}$$, $$r \in \{1, \ldots , Q\}$$, $$\frac{1}{2} \le \beta \le 1$$, and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned}&\sup _{\lambda> 0}{ \lambda \cdot \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}_0} \big | M_{2^n}^\beta *\mathcal {F}^{-1}\big (\eta _s \hat{f}\big ) (Q x + r) \big | > \lambda \right\} \right| } \\&\quad \le C \big \Vert \mathcal {F}^{-1} \big (\eta _s \hat{f}\big )(Qx+r)\big \Vert _{\ell ^1(x)}. \end{aligned}

### Proof

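Observe that, since $$t \mapsto t^{\beta -1}$$ is non-increasing on $$(0, \infty )$$, for $$x \in \mathbb {N}$$,
\begin{aligned} \frac{x^\beta - (x-1)^\beta }{\beta } = \int _{x-1}^{x} t^{\beta -1} {\, \mathrm d}t \le \int _0^1 t^{\beta -1} {\, \mathrm d}t = \beta ^{-1} \le 2, \end{aligned}
thus
\begin{aligned} M^\beta _N(x) \le 2 M_N(x). \end{aligned}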
In particular, by the Hardy–Littlewood maximal theorem, there is $$C > 0$$ such that for all $$\frac{1}{2} \le \beta \le 1$$, and any $$f \in \ell ^1(\mathbb {Z})$$,
\begin{aligned} \sup _{\lambda> 0} { \lambda \cdot \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}_0} \big | M_{2^n}^\beta * f (x) \big | > \lambda \right\} \right| } \le C \Vert f\Vert _{\ell ^1}. \end{aligned}
(19)
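The weak type bound (19) can be seen at work on the simplest input, a point mass $$f = \delta _0$$. The sketch below assumes the convolution convention $$M * f(x) = \sum _y M(y) f(x - y)$$, so that $$M_{2^n}^\beta * \delta _0 = M_{2^n}^\beta$$; the function names are illustrative. The maximal function is not summable (its partial sums grow logarithmically), yet $$\lambda$$ times the size of its level sets stays bounded.

```python
def dyadic_maximal_of_delta(x, beta=1.0):
    """sup over n >= 0 of (M_{2^n}^beta * delta_0)(x) for an integer x."""
    if x < 1:
        return 0.0  # the kernels are supported on the positive integers
    weight = (x ** beta - (x - 1) ** beta) / beta
    n = (x - 1).bit_length()  # smallest n with 2^n >= x
    return weight / 2 ** n


def level_set_size(lam, beta=1.0, x_max=1 << 12):
    """Number of x in [1, x_max] where the maximal function exceeds lam."""
    return sum(1 for x in range(1, x_max + 1)
               if dyadic_maximal_of_delta(x, beta) > lam)


if __name__ == "__main__":
    for lam in (0.5, 0.1, 0.01, 0.001):
        for beta in (1.0, 0.5):
            print(lam, beta, lam * level_set_size(lam, beta))
    # Partial sums of the maximal function over [1, 2^K] grow linearly in K.
    for K in (4, 8, 10):
        print(K, sum(dyadic_maximal_of_delta(x) for x in range(1, 2 ** K + 1)))
```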
For $$r \in \{1, \ldots , Q\}$$ and $$\lambda > 0$$, we set
\begin{aligned} J_r(\lambda ) = \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}_0} \big | M_{2^n}^\beta * \mathcal {F}^{-1}\big ( \eta _{s} \hat{f} \big )(Q x + r)\big | > \lambda \right\} \right| . \end{aligned}
Then, by (19), we have
\begin{aligned} J_1(\lambda ) + \ldots + J_Q(\lambda )&= \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}_0} \big | M_{2^n}^\beta * \mathcal {F}^{-1}\big ( \eta _{s} \hat{f} \big ) (x) \big | > \lambda \right\} \right| \nonumber \\&\le C \lambda ^{-1} \big \Vert \mathcal {F}^{-1}\big (\eta _{s} \hat{f}\big )\big \Vert _{\ell ^1}. \end{aligned}
(20)
Moreover, for any $$r, r' \in \{1, \ldots , Q\}$$, we have
\begin{aligned}&\bigg |\bigg \{x \in \mathbb {Z}: \sup _{n \in \mathbb {N}_0} \bigg | \int _0^1 e^{2\pi i \xi (Qx +r)} \Big (1 - e^{2\pi i \xi (r' - r)} \Big ) \widehat{M_{2^n}^\beta }(\xi ) \eta _{s}(\xi ) \hat{f}(\xi ) {\, \mathrm d}\xi \bigg | > \tfrac{1}{2} \lambda \bigg \}\bigg | \\&\quad \le \, C \lambda ^{-1} \bigg \Vert \int _0^1 e^{2\pi i \xi x} \Big ( 1 - e^{2\pi i \xi (r' - r)} \Big ) \eta _{s}(\xi ) \hat{f}(\xi ) {\, \mathrm d}\xi \bigg \Vert _{\ell ^1(x)}. \end{aligned}
Since $$\eta _{s} = \eta _{s} \eta _{s-1}$$, by Young’s convolution inequality and Lemma 4.1, we obtain
\begin{aligned}&\bigg \Vert \int _0^1 e^{2\pi i \xi x} \Big (1 - e^{2\pi i \xi (r' - r)} \Big ) \eta _{s}(\xi ) \hat{f}(\xi ) {\, \mathrm d}\xi \bigg \Vert _{\ell ^1(x)}\\&\quad \le \,\bigg \Vert \int _0^1 e^{2\pi i \xi x} \Big ( 1 - e^{2\pi i \xi (r' - r)} \Big ) \eta _{s-1}(\xi ) \bigg \Vert _{\ell ^1(x)} \big \Vert \mathcal {F}^{-1} \big (\eta _{s} \hat{f} \big ) \big \Vert _{\ell ^1} \\&\quad \le C Q 2^{-4s} \big \Vert \mathcal {F}^{-1}\big (\eta _{s} \hat{f} \big )\big \Vert _{\ell ^1}. \end{aligned}
Thus
\begin{aligned} J_r(\lambda ) \le J_{r'}(\lambda /2) + C \lambda ^{-1} Q 2^{-4 s} \big \Vert \mathcal {F}^{-1} \big ( \eta _{s} \hat{f} \big )\big \Vert _{\ell ^1}, \end{aligned}
which together with (20) implies that
\begin{aligned} Q J_r(\lambda )&\le J_1(\lambda /2) + \ldots + J_Q(\lambda /2) + C \lambda ^{-1} Q^2 2^{-4 s} \big \Vert \mathcal {F}^{-1} \big ( \eta _{s} \hat{f} \big )\big \Vert _{\ell ^1} \\&\lesssim \lambda ^{-1} \Big (1+Q^2 2^{-4 s}\Big ) \big \Vert \mathcal {F}^{-1} \big ( \eta _{s} \hat{f} \big )\big \Vert _{\ell ^1} \\&\lesssim \lambda ^{-1} \big \Vert \mathcal {F}^{-1} \big ( \eta _{s} \hat{f} \big )\big \Vert _{\ell ^1}, \end{aligned}
where the last inequality is a consequence of $$1 \le Q \le 2^{2s}$$. Therefore, in view of Lemma 4.2, we immediately get
\begin{aligned} J_r(\lambda ) \lesssim \lambda ^{-1} \big \Vert \mathcal {F}^{-1}\big (\eta _{s} \hat{f} \big )(Qx+r)\big \Vert _{\ell ^1(x)}, \end{aligned}
which is the desired conclusion. $$\square$$

Essentially the same reasoning as in the proof of Theorem 4.3 leads to the following theorem.

### Theorem 4.4

There is $$C > 0$$ such that for all $$1 \le Q \le 2^{2s}$$ with $$s \in \mathbb {N}$$, $$r \in \{1, \ldots , Q\}$$, $$\frac{1}{2} \le \beta \le 1$$, and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \left\| \sup _{n \in \mathbb {N}_0} \left| \mathcal {F}^{-1}\big (\widehat{M_{2^n}^\beta } \eta _s \hat{f} \big )(Qx+r) \right| \right\| _{\ell ^2(x)} \le C \big \Vert \mathcal {F}^{-1}\big (\eta _s \hat{f}\big )(Qx+r)\big \Vert _{\ell ^2(x)}. \end{aligned}

## 5 $$\ell ^2$$ Theory

We are now in the position to prove $$\ell ^2(\mathbb {Z})$$ boundedness of the maximal function associated to the multipliers $$(\nu _n^s : n \in \mathbb {N})$$.

### Theorem 5.1

For each $$\epsilon > 0$$ there is $$C > 0$$ such that for all $$s \in \mathbb {N}_0$$, and any finitely supported function $$f : \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \left\| \sup _{n \in \mathbb {N}} \big | \mathcal {F}^{-1}\big (\nu _n^s \hat{f} \big ) \big | \right\| _{\ell ^2} \le C 2^{-s(\frac{1}{2}-\epsilon )} \Vert f\Vert _{\ell ^2}. \end{aligned}

### Proof

We divide the supremum into two parts: $$0 \le n < 2^{s+4}$$ and $$2^{s+4} \le n$$. Then the following holds true.

### Claim 5.2

For each $$\epsilon > 0$$ there is $$C > 0$$ such that for all $$s \in \mathbb {N}_0$$, and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \left\| \sup _{0 \le n \le 2^{s+4}} \big | \mathcal {F}^{-1}\big ( \nu _{n}^s \hat{f} \big ) \big | \right\| _{\ell ^2} \le C (s+1) 2^{-s(\frac{1}{2}-\epsilon )} \Vert f\Vert _{\ell ^2}. \end{aligned}
(21)
For the proof, we apply [14, Lemma 1] to write
\begin{aligned} \sup _{0 \le n < 2^{s+4}} \Big | \mathcal {F}^{-1}\big ( \nu _n^s \hat{f} \big ) \Big | \le \big |\mathcal {F}^{-1}\big (\nu _0^s \hat{f} \big )\big | + \sqrt{2} \sum _{i = 0}^{s+4} \left( \sum _{j = 0}^{2^{s+4-i}-1} \big | \mathcal {F}^{-1}\big ((\nu _{(j+1) 2^i}^s - \nu _{j2^i}^s) \hat{f} \big ) \big |^2 \right) ^{\frac{1}{2}}. \end{aligned}
(22)
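The numerical inequality behind (22) can be tested on random data. The following is a minimal sketch, assuming the lemma is used in the form: for any complex numbers $$a_0, \ldots , a_{2^S}$$, the maximum of $$|a_n|$$ is at most $$|a_0|$$ plus $$\sqrt{2}$$ times the sum over dyadic scales of the square functions of increments. The names `dyadic_bound` and `check` are hypothetical helpers, and $$S$$ plays the role of $$s+4$$.

```python
import math
import random


def dyadic_bound(a):
    """Right-hand side of the dyadic bound for a sequence a[0..2**S]."""
    S = (len(a) - 1).bit_length() - 1
    assert len(a) == 2 ** S + 1, "sequence must have 2**S + 1 terms"
    total = abs(a[0])
    for i in range(S + 1):
        step = 2 ** i
        # squared increments over the dyadic blocks [j*2^i, (j+1)*2^i]
        sq = sum(abs(a[(j + 1) * step] - a[j * step]) ** 2
                 for j in range(2 ** (S - i)))
        total += math.sqrt(2) * math.sqrt(sq)
    return total


def check(S, trials=200, seed=0):
    """Return True if max |a_n| <= dyadic_bound(a) on random complex data."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [complex(rng.gauss(0, 1), rng.gauss(0, 1))
             for _ in range(2 ** S + 1)]
        if max(abs(z) for z in a) > dyadic_bound(a) + 1e-12:
            return False
    return True
```

The reason the bound holds is that each interval $$[0, n)$$ splits into dyadic blocks with at most one block per scale, so the telescoping sum for $$a_n - a_0$$ contains at most one increment from each inner sum.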
Let us fix $$i \in \{0, \ldots , s+4\}$$. Then by Plancherel’s theorem we get
\begin{aligned}&\sum _{j = 0}^{2^{s+4-i}-1} \Big \Vert \mathcal {F}^{-1}\big ((\nu _{(j+1) 2^i}^s - \nu _{j2^i}^s) \hat{f}\big ) \Big \Vert _{\ell ^2}^2\\&\qquad = \sum _{j = 0}^{2^{s+4-i}-1} \sum _{a/q \in \mathscr {R}_s} \int _0^1 \left| \sum _{m \in I_j^i} \widehat{L^{a, q}_{2^m}} (\xi - a/q) - \widehat{L^{a, q}_{2^{m-1}}} (\xi - a/q) \right| ^2\\&\quad \qquad \times \, \eta _s(\xi - a/q)^2 |{\hat{f}(\xi )} |^2 {\, \mathrm d}\xi \end{aligned}
where $$I_j^i = \big \{j2^i+1, j2^i + 2, \ldots , (j+1)2^i\big \}$$. By (6), we obtain
\begin{aligned}&\sum _{j = 0}^{2^{s+4-i}-1} \sum _{a/q \in \mathscr {R}_s} \int _0^1 \left| \sum _{m \in I_j^i} \widehat{L^{a, q}_{2^m}} (\xi - a/q) \!-\! \widehat{L^{a, q}_{2^{m-1}}} (\xi \!- \!a/q) \right| ^2 \eta _s(\xi - a/q)^2 |{\hat{f}(\xi )} |^2 {\, \mathrm d}\xi \\&\qquad \lesssim 2^{-s(1-\epsilon )} \sum _{a/q \in \mathscr {R}_s} \sum _{j = 0}^{2^{s+4-i}-1} \sum _{m, m' \in I_j^i} \int _0^1 \Delta _m^q(\xi - a/q) \cdot \Delta ^q_{m'}(\xi - a/q) \\&\qquad \quad \cdot \,\eta _s(\xi - a/q)^2 |{\hat{f}(\xi )} |^2 {\, \mathrm d}\xi , \end{aligned}
where $$\Delta ^q_m = \big |\widehat{M_{2^m}} - \widehat{M_{2^{m-1}}}\big | + \big |\widehat{M^{\beta _q}_{2^m}} - \widehat{M^{\beta _q}_{2^{m-1}}}\big |$$. In view of (12), we have
\begin{aligned} \sum _{n \in \mathbb {N}_0} \Delta ^q_n(\xi ) \lesssim \sum _{n \in \mathbb {N}_0} \min \big \{(2^n |{\xi } |)^{-1}, 2^n |{\xi } |\big \} + (1-\beta _q) 2^{-n(1-\beta _q)} \lesssim 1, \end{aligned}
uniformly with respect to $$\xi \in \mathbb {T}$$, $$q \in \mathbb {N}$$, and $$\frac{1}{2} \le \beta _q \le 1$$. Since the supports of $$\eta _s(\cdot - a/q)$$ are disjoint as $$a/q$$ varies over $$\mathscr {R}_s$$, we obtain
\begin{aligned} \sum _{j = 0}^{2^{s+4-i}-1} \Big \Vert \mathcal {F}^{-1}\big ((\nu _{(j+1) 2^i}^s - \nu _{j2^i}^s) \hat{f}\big ) \Big \Vert _{\ell ^2}^2&\lesssim 2^{-s(1-\epsilon )} \sum _{a/q \in \mathscr {R}_s} \int _0^1 \eta _s(\xi - a/q)^2 |{\hat{f}(\xi )} |^2 {\, \mathrm d}\xi \\&\lesssim 2^{-s(1-\epsilon )} \Vert f\Vert _{\ell ^2}^2, \end{aligned}
which together with (22) implies (21).
It remains to treat the supremum over $$n \ge 2^{s+4}$$. For each $$\frac{1}{2} \le \beta < 1$$ we set
\begin{aligned} \mathscr {R}_s^\beta = \big \{a/q \in \mathscr {R}_s : \beta _q = \beta \big \}, \end{aligned}
and $$\mathscr {R}_s^1 = \mathscr {R}_s$$. In view of Landau’s theorem [16, Corollary 11.9], there are $$\mathcal {O}(\log s)$$ distinct $$\beta$$’s. Therefore, it suffices to show the following claim.

### Claim 5.3

For each $$\epsilon > 0$$ there is $$C > 0$$ such that for all $$s \in \mathbb {N}_0$$, $$\frac{1}{2} \le \beta \le 1$$, and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \left\| \sup _{2^{s+4} \le n} \left| \sum _{a/q \in \mathscr {R}_s^\beta } G(\chi _q, a) \mathcal {F}^{-1}\big (\widehat{M_{2^n}^{\beta }}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f} \big ) \right| \right\| _{\ell ^2} \le C 2^{-s(\frac{1}{2}-\epsilon )} \Vert f\Vert _{\ell ^2}. \end{aligned}
(23)
Let us fix $$\frac{1}{2} \le \beta \le 1$$. We define
\begin{aligned} I(x, y) = \sup _{2^{s+4} \le n} \left| \sum _{a/q \in \mathscr {R}_s^\beta } G(\chi _q, a) e^{2\pi i x a/q} \mathcal {F}^{-1}\big (\widehat{M_{2^n}^{\beta }} \eta _s \hat{f}(\cdot + a/q)\big )(y) \right| , \end{aligned}
and
\begin{aligned} J(x, y) = \sum _{a/q \in \mathscr {R}_s^\beta } G(\chi _q, a) e^{2 \pi i x a/q} \mathcal {F}^{-1}\big (\eta _s \hat{f}(\cdot + a/q)\big )(y). \end{aligned}
Observe that the functions $$x \mapsto I(x, y)$$ and $$x \mapsto J(x, y)$$ are $$Q_s$$ periodic where
\begin{aligned} Q_s = 4 \prod _{p \in \mathbb {P}_{2^{s+1}}} p \lesssim e^{2^{s+2}}. \end{aligned}
By Plancherel’s theorem, for $$u \in \mathbb {Z}_{Q_s}$$, we have
\begin{aligned}&\big \Vert \mathcal {F}^{-1}\big (\widehat{M^{\beta }_{2^n}} \eta _s \hat{f}(\cdot + a/q) \big )(x+u) - \mathcal {F}^{-1}\big (\widehat{M^{\beta }_{2^n}} \eta _s \hat{f}(\cdot + a/q) \big )(x) \big \Vert _{\ell ^2(x)}\\&\quad = \bigg \Vert \big (1 - e^{2\pi i \xi u}\big ) \widehat{M^{\beta }_{2^n}}(\xi ) \eta _s(\xi ) \hat{f}(\xi +a/q) \bigg \Vert _{L^2(\mathrm{d}\xi )} \\&\quad \lesssim 2^{-n} |{u} | \cdot \big \Vert \eta _s \hat{f}(\cdot + a/q)\big \Vert _{L^2}, \end{aligned}
because by (11),
\begin{aligned} \sup _{\xi \in \mathbb {T}}{|{\xi } | \cdot |{\widehat{M^{\beta }_{2^n}}(\xi )} |} \lesssim 2^{-n}. \end{aligned}
Therefore, by the triangle inequality
\begin{aligned} \Big | \big \Vert I(x, x+u) \big \Vert _{\ell ^2(x)} - \big \Vert I(x, x) \big \Vert _{\ell ^2(x)} \Big | \lesssim Q_s \sum _{n \ge 2^{s+4}} 2^{-n} \sum _{a/q \in \mathscr {R}_s} \big \Vert \eta _s \hat{f}(\cdot + a/q)\big \Vert _{L^2}. \end{aligned}
Since $$\mathscr {R}_s$$ contains at most $$2^{2(s+1)}$$ rational numbers, by the Cauchy–Schwarz inequality we get
\begin{aligned} \sum _{a/q \in \mathscr {R}_s} \big \Vert \eta _s \hat{f}(\cdot + a/q)\big \Vert _{L^2} \le 2^{s+1} \Vert f\Vert _{\ell ^2}. \end{aligned}
Observe that
\begin{aligned} Q_s \cdot 2^{-2^{s+4}} \cdot 2^{s+1} \le 2^{2^{s+3} - 2^{s+4} + s + 1} \le 2^{-s}, \end{aligned}
thus
\begin{aligned} \big \Vert I(x, x) \big \Vert _{\ell ^2(x)} \lesssim \big \Vert I(x, x+u) \big \Vert _{\ell ^2(x)} + 2^{-s} \Vert f\Vert _{\ell ^2}. \end{aligned}
Hence,
\begin{aligned} \big \Vert I(x, x) \big \Vert _{\ell ^2(x)}^2 \lesssim \frac{1}{Q_s} \sum _{u = 1}^{Q_s} \big \Vert I(x, x+u) \big \Vert _{\ell ^2(x)}^2 + 2^{-2s} \Vert f\Vert _{\ell ^2}^2. \end{aligned}
(24)
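The exponent bookkeeping that produces the error term $$2^{-s} \Vert f\Vert _{\ell ^2}$$ can be checked with exact integer arithmetic. The sketch below assumes only the definition $$Q_s = 4 \prod _{p \le 2^{s+1}} p$$ given above; the helper names `primes_up_to`, `q_s` and `exponent_check` are hypothetical. It verifies $$Q_s \le 2^{2^{s+3}}$$ and $$Q_s \cdot 2^{-2^{s+4}} \cdot 2^{s+1} \le 2^{-s}$$ for small s.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes p <= n (for n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def q_s(s):
    """Q_s = 4 * (product of primes p <= 2**(s+1)) as an exact integer."""
    q = 4
    for p in primes_up_to(2 ** (s + 1)):
        q *= p
    return q


def exponent_check(s):
    """Check Q_s <= 2^(2^(s+3)) and Q_s * 2^(s+1) <= 2^(2^(s+4) - s)."""
    q = q_s(s)
    first = q <= 2 ** (2 ** (s + 3))
    # second inequality cleared of negative powers of two
    second = q * 2 ** (2 * s + 1) <= 2 ** (2 ** (s + 4))
    return first and second
```

A finite check of this kind is of course not a proof; for all s the first inequality follows from a Chebyshev-type bound on the sum of $$\log p$$ over primes up to $$2^{s+1}$$.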
Now, by repeated changes of variables and periodicity, we get
\begin{aligned}&\sum _{u = 1}^{Q_s} \big \Vert I(x, x+u) \big \Vert _{\ell ^2(x)}^2 = \sum _{u = 1}^{Q_s} \sum _{x \in \mathbb {Z}} I(x-u, x)^2 = \sum _{x \in \mathbb {Z}} \sum _{u = 1}^{Q_s} I(u, x)^2 \\&\quad = \sum _{u = 1}^{Q_s} \big \Vert I(u, x) \big \Vert _{\ell ^2(x)}^2. \end{aligned}
Using Theorem 4.4, we can estimate
\begin{aligned} \big \Vert I(u, x) \big \Vert _{\ell ^2(x)} = \left\| \sup _{2^{s+4} \le n} \left| \mathcal {F}^{-1}\big (\widehat{M^\beta _{2^n}} \eta _s J(u, \cdot ) \big ) \right| \right\| _{\ell ^2} \lesssim \big \Vert J(u, x)\big \Vert _{\ell ^2(x)}. \end{aligned}
Notice that
\begin{aligned} \sum _{u = 1}^{Q_s} \big \Vert J(u, x) \big \Vert _{\ell ^2(x)}^2= & {} \sum _{x \in \mathbb {Z}} \sum _{u = 1}^{Q_s} J(u, x)^2 = \sum _{u = 1}^{Q_s} \sum _{x \in \mathbb {Z}} J(x-u, x)^2 \\= & {} \sum _{u = 1}^{Q_s} \big \Vert J(x, x+u) \big \Vert _{\ell ^2(x)}^2. \end{aligned}
Since the supports of $$\eta _s(\cdot - a/q)$$ are disjoint as $$a/q$$ varies over $$\mathscr {R}_s$$, by (6) we get
\begin{aligned} \big \Vert J(x, x+u) \big \Vert _{\ell ^2(x)}^2&= \int _0^1 \left| \sum _{a/q \in \mathscr {R}_s^\beta } G(\chi _q, a) e^{2\pi i \xi u a/q} \eta _s(\xi - a/q) \right| ^2 |{\hat{f}(\xi )} |^2 {\,\mathrm d}\xi \\&\lesssim 2^{-s(1-\epsilon )} \Vert f\Vert _{\ell ^2}^2. \end{aligned}
Therefore,
\begin{aligned} \sum _{u = 1}^{Q_s} \big \Vert I(x, x+u) \big \Vert _{\ell ^2(x)}^2 \lesssim 2^{-s(1-\epsilon )} Q_s \Vert f\Vert _{\ell ^2}^2, \end{aligned}
which together with (24) implies (23), and the theorem follows. $$\square$$
Given $$t > 0$$ and $$n > t$$, we define the multiplier
\begin{aligned} \Pi _n^t(\xi )&= \sum _{0 \le s \le \sqrt{t}} \nu _n^s(\xi ) \\&= \sum _{0 \le s \le \sqrt{t}} \sum _{a/q \in \mathscr {R}_s} \widehat{L^{a, q}_{2^n}}(\xi - a/q) \eta _s(\xi - a/q). \end{aligned}

### Corollary 5.4

There are $$C, c > 0$$ such that for each $$t > 0$$, and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \left\| \sup _{t \le n} \left| \mathcal {M}_{2^n} f - \mathcal {F}^{-1}\big (\Pi _n^t \hat{f} \big ) \right| \right\| _{\ell ^2} \le C \exp \big (-c \sqrt{t}\big ) \Vert f\Vert _{\ell ^2}. \end{aligned}

### Proof

Since
\begin{aligned} \mathfrak {m}_{2^n} - \Pi _n^t&= \big (\mathfrak {m}_{2^n} - \nu _n\big ) + \sum _{s > \sqrt{t}} \nu _n^s, \end{aligned}
our assertion follows from Theorems 3.2 and 5.1. Indeed, by Plancherel’s theorem and Theorem 3.2 we get
\begin{aligned} \left\| \sup _{t \le n} \left| \mathcal {F}^{-1}\big ((\mathfrak {m}_{2^n} - \nu _n)\hat{f}\big ) \right| \right\| _{\ell ^2} \lesssim \left( \sum _{n \ge t} \exp \big (-2 c \sqrt{n} \big )\right) ^\frac{1}{2} \Vert f\Vert _{\ell ^2}. \end{aligned}
On the other hand, by Theorem 5.1,
\begin{aligned} \left\| \sup _{t \le n} \left| \sum _{s> \sqrt{t}} \mathcal {F}^{-1}\big (\nu _n^s \hat{f} \big ) \right| \right\| _{\ell ^2} \lesssim \sum _{s > \sqrt{t}} 2^{-\frac{s}{4}} \Vert f\Vert _{\ell ^2}, \end{aligned}
which concludes the proof. $$\square$$
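For the reader's convenience, the two tail sums can be made explicit. This is a sketch rather than part of the original argument; it assumes that Theorem 5.1 is applied with $$\epsilon = \frac{1}{4}$$ and that c is the constant appearing in the first displayed sum. For $$n \ge t$$ we have $$2\sqrt{n} \ge \sqrt{t} + \sqrt{n}$$, hence
\begin{aligned} \sum _{n \ge t} \exp \big (-2c\sqrt{n}\big ) \le \exp \big (-c\sqrt{t}\big ) \sum _{n \ge 0} \exp \big (-c\sqrt{n}\big ) \lesssim \exp \big (-c\sqrt{t}\big ), \qquad \sum _{s> \sqrt{t}} 2^{-\frac{s}{4}} \le \frac{2^{-\frac{\sqrt{t}}{4}}}{1 - 2^{-\frac{1}{4}}}. \end{aligned}
After taking the square root in the first bound, both contributions are dominated by $$C \exp (-c' \sqrt{t})$$ with $$c' = \min \{c/2, (\log 2)/4\}$$.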

## 6 Weak Type Estimates

In this section we investigate the weak type estimates for the multipliers $$\big (\Pi _n^t : n \ge t\big )$$. Then together with results from Sect. 5 we deduce Theorem C.

### Theorem 6.1

There is $$C > 0$$ such that for all $$t > 0$$ and any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \sup _{\lambda> 0}{ \lambda \cdot \left| \left\{ x \in \mathbb {Z}: \sup _{t \le n} \big | \mathcal {F}^{-1} \big ( \Pi _n^t \hat{f} \big )(x) \big | > \lambda \right\} \right| } \le C t \Vert f\Vert _{\ell ^1}. \end{aligned}

### Proof

Let us fix $$2^s \le q < 2^{s+1}$$ for some $$1 \le s \le \sqrt{t}$$. Let $$\frac{1}{2} \le \beta \le 1$$. Suppose that $$\chi$$ is a quadratic Dirichlet character modulo q induced by $$\chi ^\star$$ having the conductor $$q_0$$. We claim that the following holds true.

### Claim 6.2

There is $$C > 0$$ such that for any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned}&\sup _{\lambda> 0}{ \lambda \cdot \left| \left\{ \sup _{t \le n} \left| \sum _{a \in A_q} G(\chi , a) \mathcal {F}^{-1} \big (\widehat{M^{\beta }_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f}\big ) \right| > \lambda \right\} \right| } \nonumber \\&\quad \le C \frac{1}{\varphi (q)} \Vert f \Vert _{\ell ^1}. \end{aligned}
(25)
The constant C is independent of q, $$\beta$$ and $$\chi$$.
Let us first see that from Claim 6.2, we can deduce the theorem. Indeed, from (25) we easily get
\begin{aligned} \left| \left\{ \sup _{t \le n} \left| \sum _{a \in A_q} \mathcal {F}^{-1} \big (\widehat{L^{a, q}_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f}\big ) \right| > \lambda \right\} \right| \le C \frac{1}{\lambda \varphi (q)} \Vert f \Vert _{\ell ^1}. \end{aligned}
Recall that (see e.g. ),
\begin{aligned} \sum _{1 \le q < 2^{\sqrt{t}}} \frac{1}{\varphi (q)} \simeq \sqrt{t}, \end{aligned}
thus
\begin{aligned} \Phi (t) = \sum _{1 \le q < 2^{\sqrt{t}}} \frac{1+\log q}{\varphi (q)} \lesssim t. \end{aligned}
Hence, by log-convexity of $$\ell ^{1,\infty }(\mathbb {Z})$$, (see [11, 19]) we obtain
\begin{aligned}&\left| \left\{ x \in \mathbb {Z}: \sup _{t \le n} \big | \mathcal {F}^{-1} \big ( \Pi _n^t \hat{f} \big )(x) \big |> \lambda \right\} \right| \\&\quad = \left| \left\{ \sup _{t \le n} \left| \sum _{0 \le s \le \sqrt{t}} \sum _{q = 2^s}^{2^{s+1}-1} \sum _{a \in A_q} \mathcal {F}^{-1} \big ( \widehat{L^{a, q}_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f} \big ) \right|> \lambda \right\} \right| \\&\quad \le \left| \left\{ \sum _{0 \le s \le \sqrt{t}} \sum _{q = 2^s}^{2^{s+1}-1} \frac{1}{\varphi (q)} \varphi (q) \sup _{t \le n} \left| \sum _{a \in A_q} \mathcal {F}^{-1} \big ( \widehat{L^{a, q}_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f} \big ) \right| > \lambda \right\} \right| \\&\quad \lesssim \lambda ^{-1} \Phi (t) \Vert f\Vert _{\ell ^1}, \end{aligned}
which is bounded by $$C \lambda ^{-1} t \Vert f\Vert _{\ell ^1}$$.
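The growth of the two arithmetic sums used above can be observed numerically. The sketch below is only an illustration under the assumption $$\sqrt{t} = m \in \mathbb {N}$$; `totients` and `totient_sums` are hypothetical helper names. It computes $$\sum _{1 \le q < 2^m} 1/\varphi (q)$$ and $$\Phi (m^2)$$ so that they can be compared with $$m$$ and $$m^2$$, respectively.

```python
import math


def totients(n):
    """Euler's phi for 0..n via a sieve."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:  # p is prime: phi[p] has not been reduced yet
            for k in range(p, n + 1, p):
                phi[k] -= phi[k] // p
    return phi


def totient_sums(m_max):
    """Return {m: (S1, S2)} with sums taken over 1 <= q < 2**m.

    S1 = sum 1/phi(q), S2 = sum (1 + log q)/phi(q).
    """
    phi = totients(2 ** m_max)
    out, s1, s2, q = {}, 0.0, 0.0, 1
    for m in range(1, m_max + 1):
        while q < 2 ** m:
            s1 += 1.0 / phi[q]
            s2 += (1.0 + math.log(q)) / phi[q]
            q += 1
        out[m] = (s1, s2)
    return out


if __name__ == "__main__":
    for m, (s1, s2) in totient_sums(14).items():
        print(m, round(s1 / m, 3), round(s2 / m ** 2, 3))
```

In such an experiment the ratio `s1 / m` stays in a bounded range and `s2 / m**2` stays bounded, in line with $$\sum 1/\varphi (q) \simeq \sqrt{t}$$ and $$\Phi (t) \lesssim t$$.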
What is left now is to prove Claim 6.2. Let $$r \in \{1, \ldots , q\}$$. For $$x \equiv r \bmod q$$, we have
\begin{aligned}&\sum _{a \in A_q} G(\chi , a) \mathcal {F}^{-1} \big (\widehat{M^{\beta }_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f}\big )(x) \\&\quad = \sum _{a \in A_q} G(\chi , a) e^{2\pi i r a /q} \mathcal {F}^{-1} \big (\widehat{M^{\beta }_{2^n}} \eta _s \hat{f}(\cdot + a/q) \big )(x) \\&\quad = \mathcal {F}^{-1}\big (\widehat{M^\beta _{2^n}} \eta _s F_q(\cdot ; r)\big )(x), \end{aligned}
where
\begin{aligned} F_q(\xi ; r) = \sum _{a \in A_q} G(\chi , a) \hat{f}(\xi + a/q) e^{2\pi i r a / q}. \end{aligned}
Hence, by Theorem 4.3, we obtain
\begin{aligned}&\left| \left\{ x \in \mathbb {Z}: \sup _{t \le n} \Big | \sum _{a \in A_q} G(\chi , a) \mathcal {F}^{-1} \big (\widehat{M^{\beta }_{2^n}}(\cdot - a/q) \eta _s(\cdot - a/q) \hat{f}\big )(x) \Big |> \lambda \right\} \right| \\&\quad = \sum _{r = 1}^q \left| \left\{ x \in \mathbb {Z}: \sup _{t \le n} \left| \mathcal {F}^{-1} \big (\widehat{M^{\beta }_{2^n}} \eta _s F_q(\cdot ; r) \big )(q x + r) \right| > \lambda \right\} \right| \\&\quad \lesssim \sum _{r = 1}^q \lambda ^{-1} \big \Vert \mathcal {F}^{-1}\big (\eta _s F_q(\cdot ; r) \big )(qx+r)\big \Vert _{\ell ^1(x)}. \end{aligned}
Next, by Young’s convolution inequality we get
\begin{aligned} \sum _{r = 1}^q \big \Vert \mathcal {F}^{-1}\big (\eta _s F_q(\cdot ; r) \big )(qx+r)\big \Vert _{\ell ^1(x)}&= \left\| \sum _{a \in A_q} G(\chi , a) \mathcal {F}^{-1}\big (\eta _s(\cdot - a/q) \hat{f} \big )\right\| _{\ell ^1} \\&\le \left\| \sum _{a \in A_q} G(\chi , a) \mathcal {F}^{-1}\big (\eta _s(\cdot - a/q)\big ) \right\| _{\ell ^1} \Vert f\Vert _{\ell ^1}, \end{aligned}
and
\begin{aligned} \left\| \sum _{a \in A_q} G(\chi , a) \mathcal {F}^{-1}\big (\eta _s(\cdot - a/q)\big ) \right\| _{\ell ^1} = \Big \Vert \sum _{a \in A_q} G(\chi , a) e^{2\pi i x a / q} \mathcal {F}^{-1}\big (\eta _s\big )(x) \Big \Vert _{\ell ^1(x)}. \end{aligned}
Now, by Theorem 2.1, we can compute
\begin{aligned}&\left\| \sum _{a \in A_q} G(\chi , a) e^{2\pi i x a / q} \mathcal {F}^{-1}\big (\eta _s\big )(x) \right\| _{\ell ^1(x)}\\&\quad = \sum _{r \mid q/q_0} \left\| \sum _{a \in A_q} G(\chi , a) e^{2\pi i r a / q} \mathcal {F}^{-1}\big (\eta _s\big )(q x + r) \right\| _{\ell ^1(x)} \\&\quad \le q_0 \sum _{r \mid q/q_0} \frac{\varphi (r)}{\varphi (q)} \big \Vert \mathcal {F}^{-1}\big (\eta _s\big )(q x + r) \big \Vert _{\ell ^1(x)} \\&\quad \lesssim \frac{q_0}{q} \sum _{r \mid q/q_0} \frac{\varphi (r)}{\varphi (q)}, \end{aligned}
where in the last inequality we have used Lemma 4.2 together with Lemma 4.1. Since (see e.g. )
\begin{aligned} \sum _{r \mid q/q_0} \varphi (r) = \frac{q}{q_0}, \end{aligned}
we conclude that
\begin{aligned} \left\| \sum _{a \in A_q} G(\chi , a) e^{2\pi i x a / q} \mathcal {F}^{-1}\big (\eta _s\big )(x) \right\| _{\ell ^1(x)} \lesssim \frac{1}{\varphi (q)}, \end{aligned}
proving the claim and the theorem follows. $$\square$$
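The divisor-sum identity for Euler's function used in the last step, $$\sum _{r \mid n} \varphi (r) = n$$ with $$n = q/q_0$$, is classical; a quick brute-force check (a sketch with hypothetical helper names `phi` and `divisor_totient_sum`) reads as follows.

```python
from math import gcd


def phi(n):
    """Euler's totient computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def divisor_totient_sum(n):
    """Sum of phi(r) over all positive divisors r of n."""
    return sum(phi(r) for r in range(1, n + 1) if n % r == 0)
```

With this identity the final bound is the product $$\frac{q_0}{q} \cdot \frac{1}{\varphi (q)} \cdot \frac{q}{q_0} = \frac{1}{\varphi (q)}$$.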

### Theorem 6.3

There is $$C > 0$$ such that for any subset $$F \subset \mathbb {Z}$$ of a finite cardinality and all $$0< \lambda < 1$$,
\begin{aligned} \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}} \mathcal {M}_{2^n} ({\mathbb {1}_{{F}}})(x) > \lambda \right\} \right| \le C \lambda ^{-1} \log ^2\big (e/ \lambda \big ) |{F} |. \end{aligned}

### Proof

We start by proving the following statement.

### Claim 6.4

There are $$C, c > 0$$ such that for each $$t > 0$$, there are two sequences of operators $$(A_n^t : n \in \mathbb {N})$$ and $$(B_n^t : n \in \mathbb {N})$$ such that $$\mathcal {M}_{2^n} = A_n^t + B_n^t$$, and for any finitely supported function $$f: \mathbb {Z}\rightarrow \mathbb {C}$$,
\begin{aligned} \sup _{\lambda> 0}{ \lambda \cdot \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}} \big |A_n^t f(x) \big | > \lambda \right\} \right| } \le C t \Vert f\Vert _{\ell ^1}, \end{aligned}
(26)
and
\begin{aligned} \left\| \sup _{n \in \mathbb {N}} \big | B_n^t f \big | \right\| _{\ell ^2} \le C \exp \big (- c \sqrt{t}\big ) \Vert f\Vert _{\ell ^2}. \end{aligned}
(27)
Without loss of generality, we may assume that f is a non-negative finitely supported function on $$\mathbb {Z}$$. For $$1 \le n < t$$, we set
\begin{aligned} A_n^t f = \mathcal {M}_{2^n} f, \qquad \text {and}\qquad B_n^t f \equiv 0. \end{aligned}
Since by the prime number theorem,
\begin{aligned} \frac{2^n}{C} \le \vartheta (2^n), \end{aligned}
we have
\begin{aligned} \mathcal {M}_{2^n} f(x) \le C n M_{2^n} f (x). \end{aligned}
Hence, by the Hardy–Littlewood theorem,
\begin{aligned} \left| \left\{ x \in \mathbb {Z}: \sup _{1 \le n< t} \mathcal {M}_{2^n} f(x)> \lambda \right\} \right|&\le \left| \left\{ x \in \mathbb {Z}: \sup _{1 \le n < t} M_{2^n} f(x) > \frac{\lambda }{C t} \right\} \right| \\&\lesssim \lambda ^{-1} t \Vert f\Vert _{\ell ^1}. \end{aligned}
For $$t \le n$$, we set
\begin{aligned} A_n^t f = \mathcal {F}^{-1}\big (\Pi _n^t \hat{f} \big ), \qquad \text {and}\qquad B_n^t f = \mathcal {M}_{2^n} f - A_n^t f. \end{aligned}
In view of Corollary 5.4 and Theorem 6.1, we obtain (27) and (26), respectively, and the claim follows.
Now, the theorem is an easy consequence of Claim 6.4. Indeed, given a subset $$F \subset \mathbb {Z}$$ of a finite cardinality, for any $$t > 0$$, we can write
\begin{aligned} \left| \left\{ \sup _{n \in \mathbb {N}} \mathcal {M}_{2^n} ({\mathbb {1}_{{F}}})> \lambda \right\} \right|&\lesssim \left| \left\{ \sup _{n \in \mathbb {N}} \big | A_n^t ({\mathbb {1}_{{F}}})\big |> \tfrac{1}{2} \lambda \right\} \right| + \left| \left\{ \sup _{n \in \mathbb {N}} \big | B_n^t ({\mathbb {1}_{{F}}})\big | > \tfrac{1}{2} \lambda \right\} \right| \\&\lesssim \lambda ^{-1} t |{F} | + \lambda ^{-2} \exp \big (-2 c \sqrt{t} \big ) |{F} |. \end{aligned}
Thus, taking
\begin{aligned} t = (2c)^{-2} \log ^2 (e/\lambda ), \end{aligned}
we get the desired conclusion. $$\square$$
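The choice of t balances the two terms. As a short check (a sketch using only the displayed bound and the constant c from Claim 6.4), with $$t = (2c)^{-2} \log ^2(e/\lambda )$$ we have $$2c\sqrt{t} = \log (e/\lambda )$$, so
\begin{aligned} \lambda ^{-1} t |{F} | + \lambda ^{-2} \exp \big (-2 c \sqrt{t} \big ) |{F} | = \lambda ^{-1} |{F} | \Big ( (2c)^{-2} \log ^2(e/\lambda ) + e^{-1} \Big ) \lesssim \lambda ^{-1} \log ^2(e/\lambda ) |{F} |, \end{aligned}
where the last step uses $$\log (e/\lambda ) \ge 1$$ for $$0< \lambda < 1$$.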

In view of (7), Theorem 6.3 entails the following corollary, which is precisely Theorem C.

### Corollary 6.5

There is $$C > 0$$ such that for any subset $$F \subset \mathbb {Z}$$ of a finite cardinality and all $$0< \lambda < 1$$,
\begin{aligned} \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}} \mathcal {A}_{2^n} ({\mathbb {1}_{{F}}})(x) > \lambda \right\} \right| \le C \lambda ^{-1} \log ^2\big (e/ \lambda \big ) |{F} |. \end{aligned}

## 7 Applications

In this section we show two applications of Theorem 6.3 and Corollary 6.5. First, we prove that the restricted weak Orlicz estimates together with strong $$\ell ^2$$ bounds are sufficient to get $$\ell ^p$$ maximal inequalities for all $$1 < p \le 2$$. Next, we conclude almost everywhere convergence of ergodic averages for functions in some Orlicz space close to $$L^1$$.

### Theorem 7.1

For each $$p \in (1, 2]$$ there is $$C > 0$$ such that for any function $$f \in \ell ^p(\mathbb {Z})$$,
\begin{aligned} \left\| \sup _{N \in \mathbb {N}} \big | \mathcal {M}_N f \big | \right\| _{\ell ^p} \le C (p-1)^{-4} \Vert f\Vert _{\ell ^p}. \end{aligned}

### Proof

Without loss of generality, we may restrict the supremum to dyadic numbers. We claim that the following holds true.

### Claim 7.2

There is $$C > 0$$ such that for any subset $$F \subset \mathbb {Z}$$ of finite cardinality, and any $$p_0 \in (1, 5)$$,
\begin{aligned} \sup _{\lambda> 0} \lambda \cdot \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}} \mathcal {M}_{2^n} ({\mathbb {1}_{{F}}})(x) > \lambda \right\} \right| ^{\frac{1}{p_0}} \le C (p_0-1)^{-\frac{2}{p_0}} |{F} |^{\frac{1}{p_0}}. \end{aligned}
Since $$\mathcal {M}_N$$ are averaging operators, we may assume that $$0< \lambda < 1$$. Observe that the function
\begin{aligned} (0, 1) \ni \lambda \mapsto \lambda ^{p_0-1} \log ^2(e/\lambda ) \end{aligned}
has derivative $$\lambda ^{p_0-2} \log (e/\lambda ) \big ((p_0-1) \log (e/\lambda ) - 2\big )$$, so on $$(0, e)$$ it is maximal at
\begin{aligned} \lambda = \exp \bigg (1-\frac{2}{p_0-1}\bigg ). \end{aligned}
The maximal value equals $$4 e^{p_0-3} (p_0-1)^{-2}$$, thus
\begin{aligned} \lambda ^{-1} \log ^2(e/\lambda ) \le 4 e^{p_0-3} (p_0-1)^{-2} \lambda ^{-p_0}. \end{aligned}
Hence, by Theorem 6.3, we get
\begin{aligned} \left| \left\{ x \in \mathbb {Z}: \sup _{n \in \mathbb {N}} \mathcal {M}_{2^n} ({\mathbb {1}_{{F}}})(x) > \lambda \right\} \right|&\le C \lambda ^{-1} \log ^2(e/\lambda ) |{F} | \\&\le 4 C e^{p_0-3} (p_0-1)^{-2} \lambda ^{-p_0} |{F} |, \end{aligned}
which is what we claimed.
Next, we notice that by Theorems 3.2 and 5.1, we have
\begin{aligned} \left\| \sup _{n \in \mathbb {N}} \big | \mathcal {M}_{2^n} f \big | \right\| _{\ell ^2} \le C \Vert f\Vert _{\ell ^2}. \end{aligned}
(28)
Let us consider $$p \in (1, 2)$$. Set $$p_0 = (1+p)/2$$. Since $$p_0 > 1$$, at the cost of an additional factor of $$(p-1)^{-1}$$, we get
\begin{aligned} \Big \Vert \sup _{n \in \mathbb {N}} \big | \mathcal {M}_{2^n} f \big |\Big \Vert _{\ell ^{p_0, \infty }} \le C (p-1)^{-1-\frac{2}{p_0}} \Vert f\Vert _{\ell ^{p_0, 1}} \end{aligned}
(29)
for any $$f \in \ell ^{p_0, 1}(\mathbb {Z})$$. Now, by the Marcinkiewicz interpolation theorem [8, Theorem 11.9], based on (28) and (29), we obtain
\begin{aligned} \left\| \sup _{n \in \mathbb {N}} \big | \mathcal {M}_{2^n} f \big | \right\| _{\ell ^p} \le C \frac{p(2-p_0)}{(p-p_0)(2-p)} (p-1)^{-\frac{p_0+2}{p_0} \theta } \Vert f\Vert _{\ell ^p} \end{aligned}
where $$\theta \in (0, 1)$$ satisfies
\begin{aligned} \frac{1}{p} = \frac{\theta }{p_0} + \frac{1-\theta }{2}. \end{aligned}
Since
\begin{aligned} \frac{p(2-p_0)}{(p-p_0)(2-p)} (p-1)^{-\frac{p_0+2}{p_0} \theta } = \frac{p(3-p)}{(p-1)(2-p)} (p-1)^{-\frac{p+5}{p+1} \cdot \frac{p+2-p^2}{p(3-p)}} \lesssim (p-1)^{-4}, \end{aligned}
the theorem follows.$$\square$$
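The algebra relating $$p_0$$, $$\theta$$ and p in the last display can be verified with exact rational arithmetic. The sketch below assumes only $$p_0 = (1+p)/2$$ and the defining relation for $$\theta$$; `interpolation_data` is a hypothetical helper name. It recomputes both forms of the prefactor and of the exponent, so that one can also observe that the exponent does not exceed 3 on $$(1, 2)$$.

```python
from fractions import Fraction


def interpolation_data(p):
    """For a rational p in (1, 2) return (theta, pre_a, pre_b, exp_a, exp_b).

    pre_a, exp_a are written in terms of p0 and theta;
    pre_b, exp_b are the closed forms in terms of p alone.
    """
    p = Fraction(p)
    half = Fraction(1, 2)
    p0 = (1 + p) / 2
    # theta solves 1/p = theta/p0 + (1 - theta)/2
    theta = (1 / p - half) / (1 / p0 - half)
    pre_a = p * (2 - p0) / ((p - p0) * (2 - p))
    pre_b = p * (3 - p) / ((p - 1) * (2 - p))
    exp_a = (p0 + 2) / p0 * theta
    exp_b = (p + 5) / (p + 1) * (p + 2 - p * p) / (p * (3 - p))
    return theta, pre_a, pre_b, exp_a, exp_b
```

For example, `interpolation_data(Fraction(3, 2))` returns matching pairs `pre_a == pre_b` and `exp_a == exp_b`.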

### 7.2 Pointwise Convergence

Let $$(X, \mathcal {B}, \mu )$$ be a probability space with a measurable and measure preserving transformation $$T: X \rightarrow X$$. We consider the following averages
\begin{aligned} \mathscr {A}_N f(x) = \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} f\big (T^p x\big ), \quad x \in X. \end{aligned}
With the help of the Calderón transference principle from  applied to Corollary 6.5, we deduce the following proposition.

### Proposition 7.3

There is $$C > 0$$ such that for any subset $$A \in \mathcal {B}$$, and all $$0< \lambda < 1$$,
\begin{aligned} \mu \left\{ x \in X : \sup _{N \in \mathbb {N}} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )(x) > \lambda \right\} \le C \lambda ^{-1} \log ^2\big (e/\lambda \big ) \mu (A). \end{aligned}

### Proof

Fix $$A \in \mathcal {B}$$ and $$x \in X$$. For $$R> L > 0$$, we define a finite subset $$F \subset \mathbb {Z}$$ by setting
\begin{aligned} F = \big \{0 \le n \le R : T^n x \in A \big \}. \end{aligned}
Then for $$0 \le n \le R - N$$, $$N \le L$$,
\begin{aligned} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )\big (T^n x\big )&= \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} {\mathbb {1}_{{A}}}\big (T^{n+p} x\big ) \\&= \frac{1}{\pi (N)} \sum _{p \in \mathbb {P}_N} {\mathbb {1}_{{F}}}(n + p) = \mathcal {A}_N \big ({\mathbb {1}_{{F}}}\big )(n). \end{aligned}
Hence,
\begin{aligned}&\Big |\Big \{ 0 \le n \le R-L : \max _{1 \le N \le L} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )\big (T^n x\big )> \lambda \Big \} \Big | \\&\quad \le \Big |\Big \{ n \in \mathbb {Z}: \max _{1 \le N \le L} \mathcal {A}_N \big ({\mathbb {1}_{{F}}}\big )(n) > \lambda \Big \}\Big |. \end{aligned}
By Corollary 6.5,
\begin{aligned} \Big |\Big \{ n \in \mathbb {Z}: \max _{1 \le N \le L} \mathcal {A}_N \big ({\mathbb {1}_{{F}}}\big )(n) > \lambda \Big \}\Big |&\le C \lambda ^{-1} \log ^2(e/\lambda ) \sum _{n \in \mathbb {Z}} {\mathbb {1}_{{F}}}(n) \\&= C \lambda ^{-1} \log ^2(e/\lambda ) \sum _{n = 0}^R {\mathbb {1}_{{A}}}\big (T^n x\big ). \end{aligned}
Since T preserves the measure $$\mu$$, by integrating with respect to $$x \in X$$ we obtain
\begin{aligned}&(R-L+1) \cdot \mu \Big (x \in X : \max _{1 \le N \le L} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )(x)> \lambda \Big ) \\&\qquad = \sum _{n = 0}^{R-L} \mu \Big (x \in X : \max _{1 \le N \le L} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )\big (T^n x\big )> \lambda \Big ) \\&\qquad = \int _X \Big |\Big \{ 0 \le n \le R-L : \max _{1 \le N \le L} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )\big (T^n x\big ) > \lambda \Big \} \Big | {\, \mathrm d}\mu (x) \\&\qquad \le C \lambda ^{-1} \log ^2(e/\lambda ) \sum _{n = 0}^R \int _X {\mathbb {1}_{{A}}}\big (T^n x\big ) {\, \mathrm d} \mu (x) \\&\qquad = C (R+1) \lambda ^{-1} \log ^2(e/\lambda ) \mu (A). \end{aligned}
We now divide by R and take R approaching infinity to get
\begin{aligned} \mu \Big (x \in X : \max _{1 \le N \le L} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )(x) > \lambda \Big ) \le C \lambda ^{-1} \log ^2(e/\lambda ) \mu (A). \end{aligned}
Finally, taking L tending to infinity by the monotone convergence theorem we conclude the proof. $$\square$$
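The identity at the heart of the transference argument, $$\mathscr {A}_N ({\mathbb {1}_{{A}}})(T^n x) = \mathcal {A}_N ({\mathbb {1}_{{F}}})(n)$$ for $$0 \le n \le R - N$$, can be illustrated on a toy system. The sketch below is an assumption-laden example, not the setting of the paper: it takes $$X = \mathbb {Z}/m\mathbb {Z}$$ with $$T x = x + k \bmod m$$ and an arbitrary test set A, and it compares the numerators of both averages (the common factor $$1/\pi (N)$$ is dropped). All function names are hypothetical.

```python
def primes_up_to(n):
    """Primes p <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def orbit_count(in_A, T, x, N):
    """Number of primes p <= N with T^p x in A."""
    count, y, step = 0, x, 0
    for p in primes_up_to(N):
        while step < p:          # advance y from T^step x to T^p x
            y, step = T(y), step + 1
        count += 1 if in_A(y) else 0
    return count


def integer_count(F, n, N):
    """Number of primes p <= N with n + p in F."""
    return sum(1 for p in primes_up_to(N) if (n + p) in F)


def check_transference(m=101, k=37, R=200, L=30, x0=5):
    T = lambda x: (x + k) % m
    A = {r for r in range(m) if (r * r) % 7 < 3}   # arbitrary test set
    orbit = [x0]
    for _ in range(R):
        orbit.append(T(orbit[-1]))
    # F = {0 <= n <= R : T^n x0 in A}
    F = {n for n in range(R + 1) if orbit[n] in A}
    for N in range(2, L + 1):
        for n in range(0, R - N + 1):
            if orbit_count(lambda y: y in A, T, orbit[n], N) != integer_count(F, n, N):
                return False
    return True
```

The restriction $$n \le R - N$$ matters: it guarantees $$n + p \le R$$, so membership of $$n+p$$ in F records exactly whether $$T^{n+p} x$$ lies in A.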
We are now in the position to show $$\mu$$-almost everywhere convergence of the ergodic averages $$(\mathscr {A}_N f : N \in \mathbb {N})$$ for a function f from the Orlicz space $$L(\log L)^2(\log \log L)(X, \mu )$$. Let us recall that $$L(\log L)^2(\log \log L)(X, \mu )$$ consists of functions such that
\begin{aligned} \int _X |{f(x)} | \big (\log ^+ |{f(x)} | \big )^2 \big (\log ^+\log ^+ |{f(x)} |\big ) {\, \mathrm d} \mu (x) < \infty \end{aligned}
where $$\log ^+ t = \max \{0, \log t\}$$. The space $$L(\log L)^2(\log \log L)(X, \mu )$$ is a Banach space with the norm
\begin{aligned} \big \Vert f\big \Vert _{L(\log L)^2(\log \log L)} = \int _0^1 f^*(t) \phi \big (t^{-1} \big ) {\, \mathrm d} t \end{aligned}
where $$f^*$$ is the decreasing rearrangement of f, that is
\begin{aligned} f^*(t) = \inf \Big \{s > 0 : \mu \big \{x \in X : |{f(x)} | \ge s\big \} \le t \Big \}, \end{aligned}
and
\begin{aligned} \phi (t) = \log ^2 (1 + t) \log \big (1 + \log t\big ). \end{aligned}

### Theorem 7.4

There is $$C > 0$$ such that for each $$f \in L(\log L)^2(\log \log L)(X, \mu )$$,
\begin{aligned} \sup _{\lambda> 0}{\lambda \cdot \mu \left\{ x \in X : \sup _{N \in \mathbb {N}} \big | \mathscr {A}_N f(x)\big | > \lambda \right\} } \le C \big \Vert f \big \Vert _{L(\log L)^2(\log \log L)}. \end{aligned}
In particular, for each $$f \in L(\log L)^2(\log \log L)(X, \mu )$$,
\begin{aligned} \text {the limit} \quad \lim _{N \rightarrow \infty } \mathscr {A}_N f(x) \quad \text {exists} \end{aligned}
for $$\mu$$-almost all $$x \in X$$.

### Proof

We first prove the following claim.

### Claim 7.5

There is $$C > 0$$ such that for each $$A \in \mathcal {B}$$,
\begin{aligned} \sup _{\lambda> 0}{ \lambda \cdot \mu \left\{ x \in X : \sup _{N \in \mathbb {N}} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )(x) > \lambda \right\} } \le C \mu (A) \log ^2\bigg (\frac{e}{\mu (A)}\bigg ). \end{aligned}
(30)
Indeed, by monotonicity, if $$\lambda \ge \mu (A)$$, then
\begin{aligned} \lambda ^{-1} \mu (A) \log ^2\bigg (\frac{e}{\lambda }\bigg ) \le \lambda ^{-1} \mu (A) \log ^2\bigg (\frac{e}{\mu (A)} \bigg ). \end{aligned}
(31)
Otherwise, $$\lambda \le \mu (A)$$, which entails that
\begin{aligned} 1 \le \lambda ^{-1} \mu (A) \le \lambda ^{-1} \mu (A) \log ^2\bigg (\frac{e}{\mu (A)}\bigg ). \end{aligned}
(32)
In view of Proposition 7.3,
\begin{aligned} \mu \Big \{x \in X : \sup _{N \in \mathbb {N}} \mathscr {A}_N \big ({\mathbb {1}_{{A}}}\big )(x) > \lambda \Big \} \le \min \Big \{1, C \mu (A) \lambda ^{-1} \log ^2(e/\lambda )\Big \}, \end{aligned}
which together with (31) and (32) easily lead to (30).
Now, to show the theorem, let us fix $$f \in L(\log L)^2 (\log \log L)(X, \mu )$$. We set
\begin{aligned} A_j = \Big \{x \in X : f^*(2^{-j+1}) < |{f(x)} | \le f^*(2^{-j}) \Big \}, \end{aligned}
and
\begin{aligned} a_j = f^*(2^{-j}). \end{aligned}
Since $$|{f(x)} | \le a_j$$ for $$x \in A_j$$, we have
\begin{aligned} |{f} | \le \sum _{j \ge 1} a_j {\mathbb {1}_{{A_j}}}. \end{aligned}
Moreover, if $$j > k$$ then for $$x \in A_j$$ and $$y \in A_k$$, we have $$|{f(x)} | \ge |{f(y)} |$$. Since $$\mu (A_j) = 2^{-j}$$, we get
\begin{aligned} f^*(t) \ge \sum _{j \ge 1} a_j {\mathbb {1}_{{[2^{-j-1}, 2^{-j})}}}(t). \end{aligned}
(33)
Because the space $$L^{1, \infty }(X, \mu )$$ is log-convex (see [11, 19]), by Claim 7.5, we get
\begin{aligned}&\sup _{\lambda> 0}{\lambda \cdot \mu \left\{ x \in X : \sup _{N \in \mathbb {N}} \big | \mathscr {A}_N f(x)\big |> \lambda \right\} }\nonumber \\&\quad \lesssim \sum _{j \ge 1} \log (j+1) \sup _{\lambda> 0}{\lambda \cdot \mu \left\{ x \in X : a_j \sup _{N \in \mathbb {N}} \mathscr {A}_N \big ({\mathbb {1}_{{A_j}}}\big )(x) > \lambda \right\} } \nonumber \\&\quad \lesssim \sum _{j \ge 1} \log (j+1) a_j \mu (A_j) \log ^2\bigg (\frac{e}{\mu (A_j)}\bigg ). \end{aligned}
(34)
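The second inequality in (34) relies only on the homogeneity of the weak-type quantity together with (30). As a sketch, assuming $$a_j > 0$$ (terms with $$a_j = 0$$ vanish) and writing $$\lambda = a_j \lambda '$$, where $$\lambda '$$ is an auxiliary variable introduced here, we have
\begin{aligned}&\sup _{\lambda> 0}{\lambda \cdot \mu \left\{ x \in X : a_j \sup _{N \in \mathbb {N}} \mathscr {A}_N \big ({\mathbb {1}_{{A_j}}}\big )(x)> \lambda \right\} } \\&\quad = a_j \sup _{\lambda '> 0}{\lambda ' \cdot \mu \left\{ x \in X : \sup _{N \in \mathbb {N}} \mathscr {A}_N \big ({\mathbb {1}_{{A_j}}}\big )(x) > \lambda ' \right\} } \le C a_j \mu (A_j) \log ^2\bigg (\frac{e}{\mu (A_j)}\bigg ), \end{aligned}
where the last step is (30) applied to $$A = A_j$$.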
On the other hand, by (33) we have
\begin{aligned} \big \Vert f \big \Vert _{L(\log L)^2(\log \log L)}&= \int _0^1 f^*(t) \phi \big (t^{-1}\big ) {\, \mathrm d} t \ge \sum _{j \ge 1} a_j \phi (2^j) 2^{-j-1} \\&\ge \frac{1}{8} \sum _{j \ge 1} a_j \mu (A_j) \log ^2\bigg (\frac{e}{\mu (A_j)}\bigg ) \log (j+1), \end{aligned}
which, together with (34), concludes the proof. $$\square$$

## References

1. Birkhoff, G.D.: Proof of the ergodic theorem. Proc. Natl. Acad. Sci. USA 17, 656–660 (1931)
2. Bourgain, J.: Estimations de certaines fonctions maximales. C. R. Acad. Sci. Paris I 301(10), 499–502 (1985)
3. Bourgain, J.: An Approach to Pointwise Ergodic Theorems. Geometric Aspects of Functional Analysis, pp. 204–223. Springer, New York (1988)
4. Bourgain, J.: On the maximal ergodic theorem for certain subsets of the integers. Israel J. Math. 61, 39–72 (1988)
5. Bourgain, J.: Pointwise ergodic theorems for arithmetic sets. With an appendix by the author, Harry Furstenberg, Yitzhak Katznelson and Donald S. Ornstein. Publ. Math. Paris 69(1), 5–45 (1989)
6. Buczolich, Z., Mauldin, R.D.: Divergent square averages. Ann. Math. 171(3), 1479–1530 (2010)
7. Calderón, A.P.: Ergodic theory and translation-invariant operators. Proc. Natl. Acad. Sci. USA 59(2), 349–353 (1968)
8. de Reyna, J.A.: Pointwise Convergence of Fourier Series. Lecture Notes in Mathematics. Springer, New York (2002)
9. Fefferman, Ch.: Inequalities for strongly singular convolution operators. Acta Math. 124, 9–36 (1970)
10. Ionescu, A.D.: An endpoint estimate for the discrete spherical maximal function. Proc. Am. Math. Soc. 132(5), 1411–1417 (2004)
11. Kalton, N.J.: Convexity, type, and the three space problem. Studia Math. 69(3), 247–287 (1980/1981)
12. LaVictoire, P.: Universally $$L^1$$-bad arithmetic sequences. J. Anal. Math. 113(1), 241–263 (2011)
13. Mirek, M., Trojan, B.: Cotlar’s ergodic theorem along the prime numbers. J. Fourier Anal. Appl. 21(4), 822–848 (2015)
14. Mirek, M., Trojan, B.: Discrete maximal functions in higher dimensions and applications to ergodic theory. Am. J. Math. 138(6), 1495–1532 (2016)
15. Mirek, M., Trojan, B., Zorin-Kranich, P.: Variational estimates for averages and truncated singular integrals along the prime numbers. Trans. Am. Math. Soc. 369(8), 5403–5423 (2017)
16. Montgomery, H.L., Vaughan, R.C.: Multiplicative Number Theory I: Classical Theory. Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge (2006)
17. Nathanson, M.B.: Additive Number Theory: The Classical Bases. Graduate Texts in Mathematics. Springer, Princeton (1996)
18. Sitaramachandrarao, R.: On an error term of Landau II. Rocky Mt. J. Math. 15, 579–588 (1985)
19. Stein, E.M., Weiss, N.J.: On the convergence of Poisson integrals. Trans. Am. Math. Soc. 140, 34–54 (1969)
20. Vinogradov, I.M.: The Method of Trigonometrical Sums in the Theory of Numbers. Dover Books on Mathematics Series. Dover, Mineola (1954)
21. Wierdl, M.: Pointwise ergodic theorem along the prime numbers. Israel J. Math. 64(3), 315–336 (1988)