Strange Images in Remote Sensing and Their Properties

Lossy image compression is used in many applications, including remote sensing. The size and number of images keep increasing, which often makes image compression necessary. In lossy compression, rate-distortion curves are usually assumed to be monotonic functions, and this assumption underlies compression control. However, it has been shown recently that there are grayscale and color images, called “strange”, for which the rate-distortion curves are not monotonic. In this paper, we demonstrate that some remote sensing images can be strange as well and that this takes place for JPEG and some other compression techniques. The properties of strange images are analyzed using the Spearman rank order correlation coefficient, and it is shown that there are several parameters characterizing image complexity that have a rather high correlation with the probability that a given image is strange. Image entropy is one such parameter.


Introduction
Images are widely used nowadays in numerous applications (Nan et al., 2022; Singh et al., 2022; Spasova et al., 2021). Remote sensing (RS) from airborne and spaceborne carriers is one of them. Modern RS sensors provide a lot of valuable information (Popov et al; Kussul et al), producing images of high spatial resolution with a periodicity of a few days. This makes image transfer from sensor carriers to on-land centers of RS data processing problematic, as well as further storage and dissemination of images (Blanes et al). Hence, image compression has to be applied (Blanes et al; Zabala et al; Hussain et al).
It is common to divide image compression algorithms into two groups: lossy and lossless (Prasanna et al., 2021; Manga et al., 2021; Sayood, 2017). In this paper, we concentrate on lossy compression methods since they are able to provide the quite large and variable compression ratio (CR) needed in many practical applications (Bondzulic et al; Ortega et al, 1998). On the one hand, lossy compression introduces distortions into data and thus, in contrast to lossless compression, decompressed images differ from the corresponding original images. On the other hand, lossy compression often makes it possible to reach a reasonable compromise between the characteristics of introduced distortions (compressed image quality) and the attained CR (Christophe 2011; Lin et al, 2015).
*E-mail: v.lukin@khai.edu
Reaching this compromise is usually based on rate-distortion curves (RDCs), i.e., dependences of a parameter (metric) characterizing distortions (e.g., mean square error (MSE), peak signal-to-noise ratio (PSNR), or some visual quality metric) on CR or on a parameter that allows varying CR (e.g., quality factor (QF) for JPEG or quantization step (QS) for coders based on discrete cosine transform (DCT)). Two assumptions concerning RDCs are common. First, it is supposed that distortions increase (image quality becomes worse) if CR increases (QF reduces or QS becomes larger). Second, it is assumed that RDCs are monotonic functions, either increasing as, e.g., MSE on QS for DCT-based coders (Krivenko et al, 2018) or decreasing as, e.g., PSNR on QS. These assumptions underlie different algorithms for providing a desired quality of compressed images (Oh et al, 2016; Bondzulic et al; Li et al, 2020), where quality can be understood in different ways: for example, compression that provides a given value of a considered metric, compression with visually lossless distortions, and so on. The algorithms are iterative or, at least, two-step: compressed image quality is assessed after the first (or each) iteration (step) and then refined by changing a parameter that controls compression (PCC), namely QF for JPEG, QS for DCT-based compression, bits per pixel (BPP) for JPEG2000 or SPIHT (Oh et al, 2016), parameter Q for the better portable graphics (BPG) coder (Bellard 2018), and so on.
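The iterative control described above can be sketched as a bisection search over QF. This is only an illustration under stated assumptions, not the procedure of any cited paper: `psnr_at` is a hypothetical callable that compresses and decompresses the image at a given QF and returns the PSNR, and the search is valid only if PSNR(QF) is non-decreasing.

```python
from typing import Callable


def smallest_qf_reaching(psnr_at: Callable[[int], float],
                         target_db: float,
                         qf_min: int = 1,
                         qf_max: int = 100) -> int:
    """Return the smallest QF whose PSNR is >= target_db.

    Correct only if psnr_at is non-decreasing in QF. For a strange image
    the bisection may miss a smaller QF that already meets the target.
    psnr_at is a hypothetical helper (compress, decompress, measure PSNR).
    """
    if psnr_at(qf_max) < target_db:
        raise ValueError("target PSNR is not reachable in the given QF range")
    lo, hi = qf_min, qf_max
    while lo < hi:
        mid = (lo + hi) // 2
        if psnr_at(mid) >= target_db:
            hi = mid        # mid is good enough, try a smaller QF
        else:
            lo = mid + 1    # quality too low, a larger QF is needed
    return lo
```

Each call to `psnr_at` costs one compression and decompression, so bisection needs about seven such operations for QF from 1 to 100; this saving is exactly what the monotonicity assumption buys.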
However, recent studies (Bondzulic et al, 2022) have demonstrated that RDCs can be non-monotonic. Images for which this happens have been called strange. The studies were first carried out for grayscale images and the coder AGU (Ponomarenko et al, 2005), for which the existence of strange images was discovered. Then, possible non-monotonicity of RDCs was found for JPEG and color images.
At the very beginning, it seemed that non-monotonicity of RDCs might be observed only for images of artificial origin containing large white areas. However, it has later been shown (Bondzulic et al, 2022) that strange images can be of natural origin, for example, color images acquired in bad illumination (night) conditions. Non-monotonicity can be observed not only for dependences of MSE or PSNR on PCC, but also for visual quality metrics. It has also been established that image properties, not the coder, determine whether a given image is strange.
As stated above, only artificial and natural scene (conventional) images have been considered so far. The questions are whether RS images can be strange and what the main properties of such images are. The goal of this paper is to partly answer these questions. We show that strange RS images exist and that they have specific properties related to image complexity.

Definition of strange images and criteria
Although we have already mentioned what images can be considered strange, it is worth showing an example. It is presented in Fig. 1. The image itself (Fig. 1, a) is not atypical; similar scene images can be found in RS databases and sites. Its specific feature is that it contains quasi-homogeneous strips. As seen in Fig. 1, b, the dependence of PSNR on QF is not monotonic. It has quite many local minima and maxima. It follows from the analysis that, e.g., it is more reasonable to compress the considered image using QF = 12 than QF = 13, since in the former case a larger CR and a better PSNR are provided simultaneously (for QF = 12, PSNR = 30.4 dB and CR = 53.12 whilst, for QF = 13, PSNR = 30.1 dB and CR = 51.23). As can be seen, the difference in PSNR is equal to 0.3 dB, and such a difference can be visually noticeable when comparing two compressed images. A first formal answer to the question of which images are strange can be the following. If the RDC is assumed monotonically decreasing, then an image is strange if there is at least one i for which
Metr(i) − Metr(i − 1) > 0, (1)
where Metr is a metric used in analysis (e.g., PSNR).
In turn, if the RDC is supposed monotonically increasing, then an image can be treated as strange if there is at least one i for which
Metr(i) − Metr(i − 1) < 0, (2)
that is, if at least one local minimum exists. Here i is the index in the array of RDC values used in analysis. For example, for JPEG it coincides with QF values, which are integers from 1 to 100. If the PCC is not QF but some other parameter, then the question is what number of samples I (i = 1, …, I) to analyze for a given RDC. A larger I allows a more thorough analysis but requires more operations of image compression, decompression, and metric estimation. There are coders for which the PCC can take any value in some range. Examples are BPP values for JPEG2000 or SPIHT, which can vary from 0 to 8 for grayscale images represented as 8-bit 2D data. For DCT-based coders controlled by QS, such as the coder AGU (Ponomarenko et al, 2005), the minimal QS tends to 0 whilst the maximal value is, in general, not restricted. Because of this, it is reasonable to consider QS values in a range for which compressed images are not totally damaged (the distortions are not too annoying). Hence, we analyzed data for QS from 1 to 100 using integer values (although it is, in general, possible to apply non-integer QS values).
Here it is worth noting the following. When analyzing RDCs, researchers often obtain them using sparsely set values of the PCC, e.g., setting QF equal to 5, 10, 15, …, 100. In such a case, there is a good chance that the RDC seems monotonic and image strangeness is not observed (detected).
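The effect of sparse sampling can be illustrated on a synthetic curve. The PSNR values below are invented for illustration (a linear trend with a single 0.4 dB dip at QF = 13) and are not measurements from this paper; the helper name `is_non_decreasing` is also ours.

```python
def is_non_decreasing(values):
    """True if every sample is >= the previous one."""
    return all(b >= a for a, b in zip(values, values[1:]))


# Synthetic PSNR(QF) for QF = 1..100 (illustrative, not measured data):
# a linear trend with one 0.4 dB dip at QF = 13.
psnr = {qf: 25.0 + 0.1 * qf for qf in range(1, 101)}
psnr[13] -= 0.4

dense = [psnr[qf] for qf in range(1, 101)]       # every integer QF
sparse = [psnr[qf] for qf in range(5, 101, 5)]   # QF = 5, 10, ..., 100

print(is_non_decreasing(dense))   # dense sampling sees the dip at QF = 13
print(is_non_decreasing(sparse))  # the dip falls between the sparse samples
```

With the dense grid the check fails because of the dip, while on the sparse grid the same curve passes as monotonic.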
Let us come back to the definitions of strange images. The definition can be made stricter. Suppose that the RDC is decreasing. Then, an image can be considered strange if there is at least one i such that
ΔM = Metr(i) − Metr(i − 1) > δ. (3)
Similarly, for an increasing RDC, an image can be treated as strange if there is at least one i such that
ΔM = Metr(i) − Metr(i − 1) < −δ. (4)
Here δ is a preset threshold showing that the unexpected "jump" is considerable. For example, for the metric PSNR, δ can be set to about 0.3-0.5 dB. Indeed, if the found ΔPSNR (3) for the RDC PSNR(QS) is equal to, say, 0.01 dB, it is not too problematic in practice.
Thus, it is possible to consider two practical situations: 1) an image is formally strange (FS), i.e., an unexpected jump exists but the absolute value of the largest found ΔM is small (the conditions (3) or (4) are not satisfied); 2) an image is strictly strange (SS), which takes place if the condition (3) or (4) is satisfied.
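The two situations can be expressed as a small classifier. This is a sketch under our reading of the criteria: a step ΔM = Metr(i) − Metr(i − 1) in the unexpected direction makes the image formally strange, and a step whose magnitude exceeds δ makes it strictly strange. The function name and the string labels are ours.

```python
def classify_strangeness(rdc, increasing, delta):
    """Return 'usual', 'FS' (formally strange) or 'SS' (strictly strange).

    rdc        -- metric values Metr(1), ..., Metr(I) ordered by index i
    increasing -- True if the RDC is expected to increase with i
    delta      -- threshold for a considerable jump (e.g., 0.3-0.5 dB for PSNR)
    """
    steps = [b - a for a, b in zip(rdc, rdc[1:])]  # delta-M for i = 2..I
    if increasing:
        # Unexpected steps are drops; store their magnitudes.
        wrong_way = [-s for s in steps if s < 0]
    else:
        # Unexpected steps are rises.
        wrong_way = [s for s in steps if s > 0]
    if not wrong_way:
        return "usual"
    return "SS" if max(wrong_way) > delta else "FS"
```

For PSNR as a function of QF the expected direction is increasing, whereas for PSNR as a function of QS it is decreasing, so the `increasing` flag has to be set accordingly.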

Results for JPEG
Recall that our main interest is in RS images. Because of this, we have chosen four datasets (classes) of three-channel (color) images from the freely available UC Merced Land Use Dataset (http://weegee.vision.ucmerced.edu/datasets/landuse.html). Each obtained dataset contains 31 images, and the datasets have the names "Agricultural", "Airplane", "Beach", and "Dense Residential". Fig. 2 presents four small copies of typical images for each dataset. The dataset Agricultural contains two SS images and five FS ones. The dataset Airplane has no SS images and only three FS ones. Quite many (twenty-one) SS images have been found in the dataset Beach, and four others are formally strange. Finally, no strange images have been found in the dataset Dense Residential.
Even visual inspection of the images in Fig. 2 explains why this is so. The dataset Agricultural contains some images with quite large homogeneous regions (e.g., the rightmost image in the corresponding row). The images in the sets Airplane and, especially, Dense Residential are more heterogeneous. Finally, the images in the set Beach usually contain two large (quasi-)homogeneous regions that correspond to sand and water surface. Note that the leftmost image in the corresponding row is not strange whilst the two rightmost ones are strictly strange.
Above, we have given verbal descriptions of the properties of some RS images. Meanwhile, image complexity can be characterized quantitatively. The paper (Zhang et al, 2018) presents five parameters able to characterize image complexity: entropy (E), edge ratio (ER), contrast (C), correlation (CO), and energy (EN). The parameters E, ER, and C have smaller values for simpler structure images whilst the parameters CO and EN have the opposite property (Zhang et al, 2018). Our idea (assumption) is that the aforementioned parameters can be correlated with the probability that an image is strange. To check this idea, we have calculated the parameters for the images in the datasets. As an example, let us present a part of the data for the first fifteen images in the set Agricultural. They are given in Table 1.
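As an illustration of one of these parameters, the sketch below computes the first-order (histogram) Shannon entropy of a list of gray levels. We assume this is the kind of entropy meant by E; the exact definitions used by Zhang et al (2018) for E and for the other four parameters may differ and are not reproduced here.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a flat sequence of gray levels.

    Assumption: first-order histogram entropy; the cited definition of E
    may differ in detail.
    """
    counts = Counter(pixels)
    n = len(pixels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# A perfectly homogeneous patch carries no information,
# while 256 equally frequent levels give the 8-bit maximum.
print(image_entropy([128] * 64))
print(image_entropy(list(range(256))))
```

Homogeneous regions concentrate the histogram in a few bins and thus lower the entropy, which agrees with the statement that E is smaller for simpler structure images.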
As seen, the parameters have different ranges of variation. Entropy is from 5.54 to 7.22 (in fact, from 5.1 to 7.5 for all 124 considered images), ER varies from 0.14 to 0.40 (from 0.07 to 0.41 for all images), C is from 0.07 to 2.25 (from 0.02 to 2.25 for all images), CO varies from 0.11 to 0.95 (from 0.05 to 0.99 for all images), and EN is from 0.06 to 0.31 (from 0.06 to 0.44 for all images). The main properties mentioned above are observed, i.e., strange images (which usually have quite a simple structure and a larger percentage of pixels that belong to homogeneous regions) are commonly characterized by smaller E, ER, and C whilst CO and EN for them are mostly the largest. However, this dependence is not strict. We have calculated Spearman rank order correlation coefficients (SROCCs) between the parameters and a numerical representation of image strangeness (0 for usual images, 1 for FS images, and 2 for SS images) for three datasets (recall that for the dataset Dense Residential there are no SS and FS images, so correlation for it cannot be determined). The result is that there is quite a large correlation between image strangeness and contrast (C), entropy (E), and energy (EN). The other two parameters show high correlation (large absolute values of SROCC) only for particular datasets (see data in Table 2).
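For readers who wish to repeat this kind of analysis, a self-contained sketch of SROCC with tie handling is given below (ties matter here because the strangeness labels take only the values 0, 1, and 2). The function names are ours and the sample values are invented for illustration; they are not taken from Table 1 or Table 2.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        # E.g., all labels equal to 0: the correlation is undefined.
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Invented example: entropy values and strangeness labels (0, 1 or 2).
entropy = [7.0, 6.5, 6.0, 5.5]
labels = [0, 0, 1, 2]
print(round(spearman(entropy, labels), 3))
```

The explicit error for constant input corresponds to the Dense Residential case, where all labels are 0 and the coefficient cannot be determined.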
Fig. 2. Examples of images in the datasets: typical representatives of the sets "Agricultural", "Airplane", "Beach", and "Dense Residential" (one row per set)
The analysis shows the following. There is no single parameter characterizing image complexity that allows reliable detection of strange images before their compression by setting some threshold. However, there is a quite close connection between the parameters describing image complexity and image strangeness, which can potentially be exploited for detection. This can be a direction of future research. Furthermore, it can be observed that the highest SROCCs are for the Beach dataset, which has the most strange images, followed by the Agricultural and Airplane datasets, which contain fewer strange images.
Finally, we would like to present two plots for the strange image Beach16. They are given in Fig. 3. The RDC PSNR(QF) (Fig. 3, a) has been used in the previous analysis for detecting strange images. As seen, this RDC has multiple local minima and maxima, confirming the strangeness of this image. Fig. 3, b presents the RDC MDSI(QF), where MDSI is the mean deviation similarity index (Nafchi et al, 2016). This is one of the best visual quality metrics, and it takes smaller values for better visual quality. Note that we have not analyzed image strangeness according to MDSI earlier.
Fig. 3. RDCs PSNR(QF) (a) and MDSI(QF) (b) for the image Beach16
As seen, according to the RDC in Fig. 3, b, Beach16 is a strange image, just as according to the conventional PSNR. Moreover, the largest "fluctuations" of MDSI are observed for the same interval of QF variation as for PSNR, i.e., for QF around 10.

Brief analysis for AGU
We have already shown (Bondzulic et al, 2022) that images can be strange not only if they are compressed by JPEG but also by other coders. Because of this, we have decided to carry out a more thorough study for the coder AGU (Ponomarenko et al, 2005), which performs DCT in 32×32 pixel blocks, bit-plane coding of quantized DCT coefficients, and deblocking after decompression. Due to this, the coder AGU outperforms JPEG and JPEG2000 for grayscale images (see https://ponomarenko.info/agu.htm for more details).
If an image is three-channel, AGU can be applied component-wise, or one can use the 3D version of AGU that has been developed for compressing multichannel (in particular, hyperspectral) images. Here, we employed the former variant since we were interested in how the components of three-channel images differ in the sense of their strangeness. Since we focus on strange images, the same 31 images from the dataset Beach have been considered. For each component, the following parameters have been determined for the obtained dependences of PSNR on QS: 1) the number of local maxima; 2) the value of the metric Hom of background content (Abramov et al, 2009), which is one more metric characterizing image complexity. This metric has larger values for simpler structure images. In addition, the number of local maxima is presented for PSNR determined for the Y component in the YCbCr space.
The obtained data are presented in Table 3. Their analysis shows several interesting tendencies. First, there are non-strange images for which all RDCs (for all three components and in aggregate) are monotonic (e.g., beach00.tif). Meanwhile, there are also images for which some partial RDCs are monotonic whilst the other one or ones contain a local maximum or maxima (see data for the images beach06.tif and beach21.tif). Finally, there are images for which local maxima are observed for all components (e.g., the images beach12.tif or beach16.tif). Second, the number of maxima is usually different for RDCs obtained for different components. For the Y component, the number of local maxima is approximately the same as for the RDCs for the R, G, and B components. An example of four dependences is presented in Fig. 4, where one can see multiple maxima for all components. Third, strange images obtained for JPEG are mostly the same as for the coder AGU. First of all, this relates to images No. 11-17. The images beach14 and beach16 are shown in Fig. 4. As seen, both images mainly contain two quasi-homogeneous areas of water surface and beach sand. Fourth, the values of the Hom metric are usually larger for images detected as strange. The values of this metric are usually quite close for all three components. This is not surprising since the components of color images are commonly highly correlated (similar to each other). We have calculated SROCC between image strangeness (here we assigned 0 to images for which the RDC is monotonic and unity to images having at least one local maximum) and Hom values. The following SROCC values have been obtained: 0.71 for R, 0.63 for G, 0.57 for B, and 0.67 for the Y component. Thus, the correlation is high again and the parameter Hom seems to be quite informative. However, it is still difficult to set a certain threshold for reliable discrimination of images into strange and not strange.
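Counting local maxima of a sampled RDC and deriving the binary strangeness label can be sketched as follows. The strict inequalities are our assumption (a flat plateau is not counted); the source does not specify how plateaus are treated, and the function names are ours.

```python
def count_local_maxima(rdc):
    """Count interior samples strictly larger than both neighbours.

    Assumption: plateaus are not counted as maxima.
    """
    return sum(1 for prev, cur, nxt in zip(rdc, rdc[1:], rdc[2:])
               if cur > prev and cur > nxt)


def binary_strangeness(rdc):
    """0 if the sampled RDC has no local maximum, 1 otherwise."""
    return 1 if count_local_maxima(rdc) >= 1 else 0


# Invented PSNR(QS) samples, in dB, for illustration only.
monotonic = [40.0, 38.0, 36.0, 34.0]
with_bumps = [40.0, 38.0, 39.0, 36.0, 37.0, 35.0]
print(count_local_maxima(monotonic), count_local_maxima(with_bumps))
```

Applied to each of the R, G, B, and Y curves of an image, this yields the per-component counts and the 0/1 labels used for the correlation with Hom.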

Conclusions
Analysis of the monotonicity of RDCs for four datasets (classes) of typical RS images has been carried out for two coders (JPEG and AGU, applied in different ways). It has been shown that there is a certain percentage of images for which the dependences of PSNR on QF for JPEG and of PSNR on QS for AGU are not monotonic. Most often this happens for images of the dataset Beach, which mostly contain two quasi-homogeneous areas corresponding to beach sand and water surface.