1 Introduction

China's soybean planting area has decreased year by year, and soybean self-sufficiency had fallen to about 20 % by 2013; the main source countries for imports are the United States, Brazil, and Argentina [1]. Comprehensive, reliable and timely information on Brazil's soybean area is therefore necessary for China to make decisions on agriculture-related issues. Compared with traditional survey methods, remote sensing surveys offer large coverage, low cost and short investigation time [2]. Spatial sampling, which combines remote sensing with sampling surveys, is widely used for large-scale crop area estimation [3]. Survey accuracy is mainly affected by the population, the sampling proportion and the sample distribution. The population can be defined by historical cultivated land [4] or by administrative divisions with historical statistical data [5]. Provided the minimum sampling proportion is met, a larger sampling proportion is better, but an actual survey must also weigh accuracy requirements, cost and time. The ideal sample distribution is random, but in practice it is constrained by the satellite swath width and revisit cycle.

With the rapid development of remote sensing technology, medium-resolution satellites (10 m–30 m) are gradually meeting the requirements of sampling surveys and even of full coverage. During the soybean growing period, however, it is difficult to obtain cloud-free imagery with full coverage. Considering weather, satellite swath width and revisit cycle, and using Landsat 7/8 as the data source, this study designed a typical-sample survey method for estimating Brazil's soybean area, based on the average change rate of samples between two years and the official statistics of the earlier year.

2 Study Area and Data Source

2.1 Study Area

Brazil lies between 35° and 74° west longitude and between 5° north and 35° south latitude. Its total area is about 8514900 square kilometers, about 46 % of the total area of South America. The terrain is divided into two parts: the Brazilian plateau, with altitudes above 500 m, located in the south of the country, and plains with elevations below 200 m, mainly distributed in the Amazon River Basin in the north and the west. Overall, the terrain comprises the Amazon plain, the Paraguay basin, the Brazilian plateau and the Guyana plateau, with the Amazon plain accounting for about one third of the area. Most of Brazil has a tropical climate, while parts of the south have a subtropical climate. The annual average temperature is 25–28 °C on the Amazon plain and 16–19 °C in the south.

Soybean is mainly distributed in central and southern Brazil (Fig. 1). Owing to the tropical climate and long growing season, crop production cycles are complex. Table 1 gives a month-by-month account of what to expect during the growing season.

Fig. 1. Sketch map of the study area (http://www.usda.gov/oce/weather/pubs/Other/MWCACP/Graphs/Brazil/BrzSoyProd_0509.pdf)

Table 1. Brazil soybean month-by-month crop cycle

2.2 Data

Landsat multi-spectral images: the Landsat 7 and Landsat 8 multi-spectral images listed in Table 2 were used. The images were subset to the 40 km × 40 km sampling units, and only cloud-free samples were selected.

Table 2. Landsat multi-spectral image

Soybean statistical data were downloaded from the website of the Brazilian Institute of Geography and Statistics (footnote 1), which publishes, between January and April each year, harvest figures (area, output and average yield) for 35 crops for the previous year. The official soybean harvested area in 2013 is 27736 thousand hectares, which is used for area estimation. The official soybean harvested area in 2014 is 30241 thousand hectares, which is used for accuracy assessment.

3 Methodology: Sampling Design

The workflow of this experiment (Fig. 2) includes: (1) construction of the sampling frame; (2) determination of the sampling proportion and sample distribution; (3) soybean extraction by unsupervised classification and visual interpretation; (4) calculation of the average change rate of samples between 2013 and 2014; (5) area estimation; (6) accuracy assessment.

Fig. 2. Flow chart of the experiment

3.1 Construction of Sampling Frame

The sampling frame covers the 17 soybean-planting states (from the 2013 Brazilian official statistics), and the sampling unit was designed as 40 km × 40 km. The population comprises 4215 sampling units, as shown in Fig. 3.
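As a rough illustration of how a frame of 40 km × 40 km units can be laid out, the sketch below tiles a rectangular extent given in projected kilometres. The function name `grid_cells` and the 200 km × 120 km extent are hypothetical and chosen only for the example; the step of keeping only the units that intersect the soybean-planting states is not shown.

```python
import math

UNIT_KM = 40.0  # side length of one sampling unit, as stated in the text


def grid_cells(x_min, y_min, x_max, y_max, unit=UNIT_KM):
    """Yield (col, row, x0, y0) for each square unit covering the extent.

    Coordinates are assumed to be in a projected system measured in km.
    (x0, y0) is the lower-left corner of the unit.
    """
    n_cols = math.ceil((x_max - x_min) / unit)
    n_rows = math.ceil((y_max - y_min) / unit)
    for row in range(n_rows):
        for col in range(n_cols):
            yield col, row, x_min + col * unit, y_min + row * unit


# Hypothetical extent (NOT Brazil's real extent): 200 km wide, 120 km high.
cells = list(grid_cells(0.0, 0.0, 200.0, 120.0))
print(len(cells))  # 5 columns x 3 rows = 15 units
```

In a real frame, each unit would additionally be tested against the state boundaries, and only intersecting units would be counted in the population.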

Fig. 3. Sketch map of the image samples

3.2 Determine the Sampling Proportion and Distribution of Samples

We consulted relevant research to determine the sampling proportion. The sampling rate of Monitoring Agriculture with Remote Sensing (MARS) of the European Union is about 1 % (60 sites × 40 km × 40 km / 10160000 km²) [6]; a study of paddy rice area estimation in China using stratified sampling with remote sensing used a rate of 1.3 % [7]. To improve estimation accuracy, the sampling rate in this study was increased to 2 %, giving 83 samples. The selected samples are shown in Fig. 3.
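The sampling rates quoted above can be checked with a few lines of arithmetic. The sketch below uses only figures given in the text (60 MARS sites, 40 km × 40 km units, 10160000 km², 83 samples out of a population of 4215 units); the helper name `sampling_rate` is an illustrative assumption.

```python
def sampling_rate(n_samples, unit_area_km2, population_area_km2):
    """Fraction of the population area covered by the sampled units."""
    return n_samples * unit_area_km2 / population_area_km2


# MARS: 60 sites of 40 km x 40 km over 10,160,000 km^2
mars_rate = sampling_rate(60, 40 * 40, 10_160_000)

# This study: 83 sampled units out of 4215 units of equal size
study_rate = 83 / 4215

print(f"MARS rate:  {mars_rate:.2%}")
print(f"Study rate: {study_rate:.2%}")
```

Under these figures the MARS rate comes out slightly below 1 % and the study rate slightly below 2 %, consistent with the rounded values in the text.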

The distribution of samples should take several factors into account. First, the samples should cover both the major and the minor soybean planting areas. Second, the images should be acquired during the soybean growing season so that the spatial distribution of soybean can be extracted effectively. Third, the samples should be cloud free. The samples are shown in Fig. 3.

3.3 Soybean Extraction by Unsupervised Classification and Visual Interpretation

Each sample has soybean classification results for 2013 and 2014. Samples were first classified by unsupervised classification, and the classification results were then corrected by visual interpretation in ArcGIS. Statistics of the sample classification results are shown in Table 3.

Table 3. Statistics of samples classification result

3.4 The Average Change Rate of Samples Between 2013 and 2014

$$ Change\_rate = (Area2014_{sample\_i} - Area2013_{sample\_i} )/Area2013_{sample\_i} $$
(1)

where change_rate is the change rate of a sample between 2013 and 2014 and sample_i is the sample number, with i from 1 to 83. The average change rate of the 83 samples between 2013 and 2014 is 6.45 % (Table 3).
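Equation (1) can be sketched in code as follows. The sample areas below are made-up values for illustration, not the data of Table 3, and the averaging is assumed to be an unweighted arithmetic mean of the per-sample rates, since the text does not state how the average was formed.

```python
def change_rate(area_2013, area_2014):
    """Per-sample change rate, following Eq. (1)."""
    if area_2013 == 0:
        raise ValueError("2013 area is zero; change rate is undefined")
    return (area_2014 - area_2013) / area_2013


def average_change_rate(areas_2013, areas_2014):
    """Unweighted mean of per-sample change rates (an assumption)."""
    rates = [change_rate(a13, a14) for a13, a14 in zip(areas_2013, areas_2014)]
    return sum(rates) / len(rates)


# Hypothetical soybean areas for three samples (any consistent area unit)
areas_2013 = [100.0, 200.0, 50.0]
areas_2014 = [110.0, 210.0, 55.0]

avg_rate = average_change_rate(areas_2013, areas_2014)
print(f"average change rate: {avg_rate:.2%}")
```

A different choice, such as the ratio of total 2014 area to total 2013 area across samples, would weight large samples more heavily and can give a different value.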

3.5 Area Estimation

$$ \hat{P} = Area2013_{official} \times (1 + Average \, change\_rate) $$
(2)

where \( \hat{P} \) is the estimated area and \( Area2013_{official} \) is the official Brazilian soybean harvested area for 2013, downloaded from the website of the Brazilian Institute of Geography and Statistics.

3.6 Accuracy Assessment

The sampling results were appraised by the relative error r, defined as follows:

$$ {\text{r}} = 100 \times (P - \hat{P})/P $$
(3)

where r is the relative error (in percent), \( \hat{P} \) is the estimated area, and \( P \) is the true area, for which the official Brazilian soybean harvested area in 2014 is used.

4 Results

The official soybean harvested area in 2013 is 27736 thousand hectares and the average change rate between 2013 and 2014 is 6.45 %, so the estimated soybean harvested area in 2014 is:

$$ \hat{P} = Area2013_{official} \times (1 + Average \, change\_rate) = 27736 \times (1 + 6.45\;\% ) = 29525\;{\text{thousand hectares}} $$

The official soybean harvested area in 2014 is 30241 thousand hectares, so the relative error is:

$$ {\text{r}} = 100 \times (P - \hat{P})/P = 100\;\% \times (30241 - 29525)/30241 = 2.37\;\% . $$
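The two calculations above can be reproduced with a short script that applies Eqs. (2) and (3) to the figures reported in the text. The function names are illustrative assumptions; areas are in thousand hectares.

```python
def estimate_area(official_prev_year, avg_change_rate):
    """Eq. (2): previous-year official area scaled by the average change rate."""
    return official_prev_year * (1 + avg_change_rate)


def relative_error(true_area, estimated_area):
    """Eq. (3): relative error in percent."""
    return 100 * (true_area - estimated_area) / true_area


area_2013_official = 27736   # thousand hectares, from the text
area_2014_official = 30241   # thousand hectares, from the text
avg_change_rate = 0.0645     # 6.45 %, from the text

estimated_2014 = estimate_area(area_2013_official, avg_change_rate)
error_pct = relative_error(area_2014_official, estimated_2014)

print(f"estimated 2014 area: {estimated_2014:.0f} thousand hectares")
print(f"relative error: {error_pct:.2f} %")
```

With these inputs the script gives an estimate that rounds to 29525 thousand hectares and a relative error that rounds to 2.37 %, matching the values above.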

5 Discussion and Conclusion

Discussion: in previous studies of stratified sampling, the stratification variable often came from MODIS data, land use/cover data or statistical data, and the sample locations were determined by the stratification variable. However, it is difficult to obtain cloud-free images for a sufficient number of samples at such predetermined locations. A question worth asking is therefore how to make maximum use of the available images.

This study designed a typical-sample survey method for estimating Brazil's soybean area, based on the average change rate of samples between two years and the official statistics of the earlier year. Typical samples were selected for survey; the sampling frame was constructed over the soybean-planting states; the sampling unit was 40 km × 40 km; the sampling proportion was 2 %; and the average sample change rate was computed between 2013 and 2014. The estimated area was compared with the official Brazilian harvested area in 2014 (published in April 2015 by the Brazilian Institute of Geography and Statistics), giving a relative error of 2.37 %.