Introduction

The world’s demand for agricultural products is growing at an unprecedented scale. An estimated 50% increase in agricultural productivity is needed in the next 30 years to provide the world population with sufficient food, feed, fuel, and fibers [1]. Despite the growing population, which is expected to reach almost ten billion people by 2050, there is a growing labor shortage in agriculture due to an aging farmer population and urbanization. Moreover, agricultural tasks are often physically demanding, highly repetitive, and dull. To meet the growing demand and to compensate for the labor shortage, there is a strong need in the agricultural industry for increased automation and robotization.

Crops like wheat, corn, and potato ripen uniformly on the field, which allows efficient mass harvesting of the crop at a single moment in time by large machines. In contrast, high-value crops like apples, tomatoes, and broccoli ripen heterogeneously and require selective harvesting of only the ripe fruits. Multi-annual crops, like apples and grapes, furthermore require that the plant is not damaged during the harvesting process. Selective harvesting has turned out to be difficult to automate and is therefore still mainly performed by human labor, which makes it one of the most labor-intensive and expensive tasks on the farm. This stimulates the development of robotic systems for selective harvesting.

Apart from the labor and cost aspects, there are more advantages of robotic harvesting. Where humans have variable quality of operation, robots operate very consistently without individual or temporal variations. Furthermore, parallel to the harvesting task, robots can inspect the crop to detect diseases and monitor crop development, which allows improved farm management and can optimize the food-production chain.

The task of selective harvesting, however, is not an easy one for robots, which is illustrated by the fact that there are hardly any selective-harvesting robots on the market. We identify three main challenges in developing a selective-harvesting robot: variation, incomplete information, and safety [2]:

  • Variation. Unlike robots in the manufacturing industry, which work in highly controlled environments with known artificial objects, agricultural robots need to operate in uncontrolled environments with natural objects. These environments give rise to different types of variation. Firstly, there is object variation. Every plant and fruit is unique, differing in appearance, geometry, and mechanical properties from other instances. In addition, the appearance of the crop may change over time during the growing season. Secondly, there is environmental variation caused by the weather or indoor climate control, causing variation in, for instance, illumination, humidity, and temperature. Thirdly, there is variation in the cultivation system. Farmers have individual preferences in how they cultivate their crops, with differences in, for instance, infrastructure, irrigation, soil type, and pruning methods, resulting in different growth patterns. Finally, there is task variation. A robot solely designed for harvesting has limited value, which can be greatly improved if it can also perform other plant-management tasks, such as pruning, thinning, pest control, monitoring, and providing nutrients.

  • Incomplete information. The environment of a selective harvesting robot is often highly complex and cluttered, giving rise to many occlusions. Because of this partial observability, the robot has to operate with incomplete and uncertain information. The objects of interest for the robot’s task might be partially or completely occluded; the to-be-harvested produce will often be covered by other elements of the plant or tree, such as leaves. Moreover, sensor data is often noisy, and information about the weather, crop development, and the presence of pests and diseases is incomplete and uncertain.

  • Safety. Harvesting robots need to be inherently safe for their environment. Most fruits and vegetables are very delicate, and the plants on which they grow are fragile. Damage to the fruits will devalue the produce. At worst, damage to the plant could mean the end of the production of that plant. Selective harvesting robots, therefore, need to have a soft touch, being able to grasp and manipulate the objects with care. Moreover, there will also be humans in the production environment with whom the robots should be able to collaborate in a safe manner.

In this review, in the “State of the art in Selective Harvesting” section, we provide an overview of how current selective harvesting robots deal with these challenges. Limitations of these systems, trends, and future research directions are discussed in the “Limitations, Trends, and Future Research” section.

State of the art in Selective Harvesting

Applications of selective harvesting can be divided into three major application areas: greenhouse (protected cultivation), orchard, and open field. In this section, the state of the art in academia and industry in these application domains is described.

Greenhouse

A greenhouse provides a protected and controlled environment for optimal crop production. The enclosed structure allows control of environmental factors, such as temperature, humidity, carbon dioxide, and to a certain extent also the light level, to set optimal conditions for year-round production. In addition, plants generally do not root in soil, but in an artificial growing medium offering the plant optimal concentrations of water and nutrients. As illustrated in Fig. 1, there are different cultivation systems to support and guide plant growth, which are optimized for light interception, space, and in some cases automation. van Henten [3] and van Henten et al. [4] describe different activities during the greenhouse production cycle, including greenhouse preparation, planting, crop maintenance, harvesting, grading, and packing, which can potentially be robotized.

Fig. 1. Examples of different plant cultivation systems. a Pepper “V” system. b Tomato high wire system. c Strawberry “Table-top” system

State of the art in Research

Concerning selective harvesting of high-value crops, a thorough review of 50 robots developed in the past three decades was made in [5]. On average, the systems had a harvest success rate of 66% with a cycle time of 33 s. Success rates for fruit localization and detachment were 85% and 75%, respectively. However, it must be noted that the results of different studies cannot be compared directly, as the systems were all evaluated in different experimental setups. Moreover, many studies modified the crop to simplify the task.
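To put these averages in perspective, the following minimal sketch converts a success rate and a cycle time into successful harvests per hour. It assumes, as a simplification that is not taken from [5], that every attempt costs one full cycle whether or not it succeeds; the function name is illustrative.

```python
def harvests_per_hour(success_rate: float, cycle_time_s: float) -> float:
    """Successful harvests per hour.

    Assumption (not from the reviewed studies): each attempt, successful
    or not, takes exactly one full cycle.
    """
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    if cycle_time_s <= 0:
        raise ValueError("cycle_time_s must be positive")
    attempts_per_hour = 3600.0 / cycle_time_s
    return success_rate * attempts_per_hour


# Average figures reported in [5]: 66% success rate, 33 s cycle time.
average_rate = harvests_per_hour(0.66, 33.0)
print(f"{average_rate:.0f} successful harvests per hour")
```

Under this assumption, the same function can be used to compare the individual systems discussed below, provided that their cycle times are defined in the same way (for example, with or without platform navigation).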

To illustrate recent work, we discuss work on sweet pepper, strawberry, and tomato harvesting. In the EU project Crops and its follow-up SWEEPER (www.sweeper-robot.eu), a robotic system for sweet-pepper harvesting was developed (see Fig. 2). The system consisted of a 6-DOF industrial manipulator with a specially designed end effector, an RGB-D camera with a GPU computer, programmable logic controllers, and a small container to store harvested fruit. Over a period of 4 weeks, the system was evaluated on 262 fruits [6], showing a harvest success rate of 61% in optimal crop conditions and 18% in current commercial conditions, which illustrates the need for cultivation systems that are specifically designed for robotic harvesting. The average cycle time was 24 s, including fruit discharge and platform navigation. If the objects of interest were in the camera view, deep-learning techniques could successfully be applied for image segmentation and detection [7•, 8, 9]. However, due to the high level of occlusion, fruit detection and approach often failed in the commercial crop conditions. Another issue in these conditions was the end-effector colliding with the plant. In a similar project, the robot Harvey was developed and evaluated on 68 sweet peppers [10]. It reached a harvesting success rate of 76.5% in a modified crop and 47% in an unmodified crop with a cycle time of 36.9 s, excluding platform navigation. The fruit and peduncle detection system based on deep learning and 3D processing worked well in the modified crop but suffered from clutter and occlusion in the unmodified crop. Similarly, the customized harvesting tool suffered from the complex unmodified conditions, resulting in fruit and plant damage and relatively low attachment and detachment rates. In a third project, an image-based closed-loop control system was developed for sweet pepper harvesting, achieving an overall success rate of 53.3% and an average cycle time of 51.1 s [11].

Fig. 2. Prototype of the sweet-pepper-harvesting robot SWEEPER

Xiong et al. [12] focused on strawberry harvesting and developed a low-cost dual-arm harvesting robot. Their system was more resilient to lighting variations due to the modeling of color against light intensity. In order to deal with occlusions and clutter the robot could use the gripper to push aside surrounding obstacles to pick strawberries that are located in clusters. The pick success rate ranged from 20% for the most complex scenarios with one ripe strawberry in a cluster of unripe fruits to 100% for situations with one isolated ripe strawberry. In two-arm mode, the system had a cycle time of 4.6 s. For tomato harvesting, Ling et al. [13] developed a dual-arm robot using a binocular vision sensor for fruit detection and localization. In a greatly simplified experimental setup, the success rate of this robot was 87.5% with a harvesting cycle time excluding platform navigation of 29 s.

State of the art in Industry

Although harvesting robots are not commercially successful yet, there are several pre-commercial R&D initiatives [14•]. Examples include, for strawberry harvesting, Agrobot (www.agrobot.com), Octinion (http://octinion.com), DogTooth (https://dogtooth.tech), and Shibuya Seiki (www.shibuya-sss.co.jp); for tomato harvesting, Panasonic [15], MetoMotion-GroW (metomotion.com), and RootAI (root-ai.com); and for de-leafing, Priva Kompano-DLR [16] and SAIA (https://www.saia-agrobotics.com/). Although great progress has been made in the past years, these initiatives do not yet meet the requirements on success rate and speed.

Conclusion of Current Robotics in Greenhouses

There have been many academic and industrial projects on the development of selective-harvesting robots for greenhouses. Although the performance of the robots is slowly improving, due to, for instance, advances in deep learning and mechanical engineering, the harvesting success and operational speed are still too low for commercial application. A key challenge is dealing with the highly cluttered crop environment, illustrated by the fact that performance greatly improves when the crop is simplified by removing some leaves and fruits.

Orchard

In the orchard environment, the environmental parameters are uncontrolled, making the variation in natural conditions more prominent. In addition, the layout of an orchard is much less structured than that of a greenhouse, especially in mountainous environments, making robot navigation more challenging. Unlike greenhouse crops, orchard crops grow for many years and are relatively sturdy, which reduces the risk of damage caused by the harvesting robot. Where the trees in traditional orchards are large globular 3D structures, the trees in modern orchards are pruned, trimmed, and trained extensively, providing flat, 2D-like structures that optimize light interception and simplify robotic operations [17], see Fig. 3. Activities in the orchard are similar to those in the greenhouse, with the addition of more thorough pruning and training of the trees.

Fig. 3. Examples of different orchard training systems. a Wall system. b Y-trellised system. c UFO system

State of the art in Research

When all fruits can be harvested from a tree at once and when they are allowed to get damaged, for instance for the juice market or for nuts, simple mechanical solutions—tree shakers—are commercially available [17]. Selective harvesting of fruits for the fresh market, however, is a much more complex and delicate operation that is actively being researched. Deep learning has revolutionized the detection of fruits in camera images. Sa et al. [18], for instance, showed the successful application of a deep neural network to detect different types of fruits, including apple, avocado, mango, and orange, with F1-scores above 0.93 and an average of 393-ms processing time per image. Moreover, the results indicated that the network could generalize to new environments and camera setups.
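For readers less familiar with the F1-score used to report detection performance, the following minimal sketch shows how it is computed from detection counts as the harmonic mean of precision and recall. The counts in the example are made up for illustration and are not data from [18].

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1-score: harmonic mean of precision and recall of a detector."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Made-up counts for illustration only (not data from [18]):
# 90 fruits detected correctly, 5 false detections, 10 fruits missed.
score = f1_score(true_positives=90, false_positives=5, false_negatives=10)
print(f"F1 = {score:.3f}")
```

Because the F1-score penalizes both missed fruits and false detections, a high value indicates that the detector performs well on both counts.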

Harvesting of apple and orange has been studied most intensively, see Fig. 4. Silwal et al. [19•] presented the design and evaluation of a robotic system for harvesting of fresh-market apples. The system integrated a global camera set-up, a seven-degrees-of-freedom (DOF) manipulator, and a grasping end-effector to execute fruit picking with open-loop control. The overall success rate of this robot was 84%, with an average picking time of 6.0 s per fruit. Zhao et al. [20] developed a manipulator with a custom 5-DOF structure to simplify control and obstacle avoidance. A spoon-shaped end-effector, including a pressure sensor to control the grasping force and a cutting knife, was designed to harvest the fruits. In a field test with 39 apples, a success rate of 77% was shown, with an average cycle time of approximately 15 s. Baeten et al. [21] presented an Autonomous Fruit Picking Machine (AFPM) for apple harvesting, which combined an industrial manipulator with an eye-in-hand camera. To simplify perception, they used a cover to shield sunlight and provide more controlled illumination. Results showed the productivity to be close to the workload of about 6 workers, which makes the machine economically viable.

Fig. 4. Examples of apple-harvesting robots. a WSU apple picking robot (From: Silwal et al. [19•], with permission from John Wiley and Sons), b JSU apple harvesting robot (From: Zhao et al. [20], with permission from Elsevier), and c AFPM robot (From: Baeten et al. [21], with permission from Springer Nature)

For orange harvesting, a robust image-based visual-servo controller for closed-loop control of a robotic manipulator was developed in [22, 23] to approach a target fruit in the presence of unknown fruit motion. An efficient and robust lighting system, with low-power image acquisition and processing hardware, and a reduced inspection chamber were developed by Cubero et al. [24]. None of these studies reported a specific harvest success rate or cycle time.

Research on orchard harvesting robotics covers a wide range of other crops, such as grape [25], litchi [26], kiwifruit [27], cherry [28], peach, pear [29], and coconut [30].

State of the art in Industry

Despite decades of research, there are still no selective robotic harvesters in commercial use. Some initiatives seem to be close to commercialization [14•]. FFRobotics developed an apple harvesting robot with multiple arms and a three-fingered gripper that removes the apple with a twisting motion (www.ffrobotics.com). Abundant Robotics developed an apple-harvesting robot using a vacuum-based end-effector to detach the fruits from the tree (www.abundantrobotics.com). Energid developed a citrus harvesting robot (www.energid.com).

Conclusion of Current Robotics in Orchards

The biggest opportunity for robotic selective harvesting exists in the fresh market [17]. The limitations of robotic systems have been well documented in [5] and include insufficient cycle time, challenges with fruit detection in the presence of occlusions, and limitations with robust manipulation for fruit detachment.

Open Field

In open-field farming, the crops are produced on designated strips of land (fields) in the open air, where the plants grow in rows. Many open-field crops, such as wheat, maize, and potato, are being mass-harvested at a single moment of time and with the destruction of the plant. For these crops, efficient mechanical harvesters exist. Selective harvesting is required for crops that grow less homogeneously or are multi-annual, such as asparagus, broccoli, lettuce, and melon. Robotic harvesting of open-field crops poses more challenges than the harvesting in protected crop production (greenhouse, indoor cultivation), mainly due to environmental variations (light, wind, rain) and less consistent plant development [5].

State of the art in Research

Different from greenhouse and orchard harvesting where harvesting robots typically observe the crop from the side, open-field harvesters typically take a top view. To mitigate environmental variations, most systems use a cover to shield direct sunlight and to protect against rain.

Several efforts have been made to develop a selective harvesting robot for asparagus [31,32,33]. Chatzimichali et al. [31] presented a robot design for the selective harvest of white asparagus (which grows below the soil surface). Their design consisted of a caterpillar robot platform and two cameras for the identification of the tips of the asparagus. Leu et al. [33] presented a harvesting robot for green asparagus (which grows above the soil surface). Their robotic system consisted of a four-wheeled platform, one RGB-D camera, and two robotic harvesting tools (Fig. 5). The green asparagus were detected and tracked using a 3D point-cloud algorithm. The robotic harvesting tool consisted of an end effector with two rubber claws and two blades that could cut one asparagus in approximately 2 s. With two harvesting tools, an average of five asparagus plants could be harvested per meter. Leu et al. [33] reported a harvest success of 90% when tested on green asparagus fields. A video of the field performance can be found online [34].

Fig. 5. A close-up of the end-effector that harvests the green asparagus

Three research projects aimed to develop a selective harvesting robot for brassica crops (specifically broccoli and cauliflower). Kusumam et al. [35] developed a 3D-vision algorithm using machine learning to detect broccoli heads in RGB-D images. Blok et al. [36] studied the detection of broccoli heads using deep learning with a specific focus on the generalization of the method to the selective harvesting of new cultivars. Klein et al. [37] presented a feasibility study for the development of a selective harvesting robot for cauliflower. Their prototype robot consisted of an aluminum frame with LED lights, three RGB-D cameras for crop detection and maturity evaluation, and two dexterous robotic arms to cut and pick the cauliflower.

Birrell et al. [38] presented Vegebot, a selective-harvesting robot for iceberg lettuce. Vegebot was equipped with two RGB cameras and one robotic arm with a custom-made end-effector. For the image analysis, two convolutional neural networks (CNNs) were used. The first network localized the iceberg lettuces, whereas the second one classified the detected lettuces into three classes (harvest-ready, immature, infected). The lettuce was harvested with a pneumatic end-effector that was equipped with a camera, a belt drive, and a soft gripper. A force-feedback control system was used to detect whether the gripper had reached the ground plane. Then, the iceberg lettuce was cut by a knife. In field tests, an 88% harvest success rate and an average harvest time of 31.7 s were reported.

Foglia and Reina [39] developed a prototype robot for the selective harvest of radicchio. The robot consisted of a pneumatic manipulator and gripper with an embedded RGB camera. The image analysis was based on color filtering and morphological operators. The gripper had two bucket-like cutting fingers that were triggered by the resistance of the soil to cut the radicchio 10 mm underground. In laboratory conditions, the detection error was less than 6.3% with an average harvest time of 6.5 s.
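As an illustration of the general idea of color filtering followed by morphological operators, the sketch below thresholds a small synthetic RGB image and cleans the resulting mask with a morphological opening (erosion followed by dilation). This is a generic, pure-Python sketch and not the implementation of [39]; the color rule, its thresholds, and all function names are assumptions made for the example.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel 0..255
Mask = List[List[int]]         # binary image: 1 = target, 0 = background


def color_mask(image: List[List[Pixel]], is_target: Callable[[Pixel], bool]) -> Mask:
    """Color filtering: mark every pixel accepted by the color rule."""
    return [[1 if is_target(pixel) else 0 for pixel in row] for row in image]


def _window(mask: Mask, r: int, c: int) -> List[int]:
    """Values in the 3x3 window around (r, c); outside the image counts as 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(mask: Mask) -> Mask:
    """Keep a pixel only if its whole 3x3 window is set."""
    return [[int(all(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def dilate(mask: Mask) -> Mask:
    """Set a pixel if any pixel in its 3x3 window is set."""
    return [[int(any(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def opening(mask: Mask) -> Mask:
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(mask))


def is_reddish(pixel: Pixel) -> bool:
    """Hypothetical color rule; the thresholds are illustrative assumptions."""
    r, g, b = pixel
    return r > 120 and r > 2 * g and r > 2 * b


# Synthetic 6x6 test image: green background, a 3x3 red blob, one red speck.
GREEN, RED = (40, 160, 50), (170, 30, 60)
image = [[GREEN] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = RED
image[5][5] = RED  # isolated noise pixel

mask = color_mask(image, is_reddish)
cleaned = opening(mask)
print(sum(map(sum, mask)), "pixels before opening,", sum(map(sum, cleaned)), "after")
```

The opening removes the isolated speck while restoring the 3x3 blob, which is the typical reason for combining a simple color filter with morphological clean-up.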

Edan [40] presented a selective harvesting robot for melon. The robot was constructed as an implement that was drawn by a tractor. The robot was equipped with two black-and-white cameras, a Cartesian manipulator, and a pneumatic gripper. The melons were detected with a texture- and shape-based image algorithm. For the path planning of the robot, the traveling salesman algorithm was used. The pneumatic gripper was equipped with a proximity sensor that detected whether the ground plane was reached. Then, the melon was grabbed and lifted so that the stem of the melon was stretched before it was cut by two knives. Edan et al. [41] tested the robot during two seasons and reported a 93% detection rate and an 86% harvest success rate. The average harvest time was 15 s.
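Ordering the detected melons so that the manipulator travels a short total distance is an instance of the traveling salesman problem. The source does not specify which solution method was used, so the sketch below uses the simple nearest-neighbor heuristic as a stand-in; it is not the method of [40], and the melon coordinates and function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position in metres


def nearest_neighbor_order(start: Point, targets: List[Point]) -> List[Point]:
    """Greedy visiting order: always move to the closest unvisited target."""
    remaining = list(targets)
    order: List[Point] = []
    current = start
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order


def path_length(start: Point, order: List[Point]) -> float:
    """Total travel distance when visiting the targets in the given order."""
    total = 0.0
    current = start
    for point in order:
        total += math.dist(current, point)
        current = point
    return total


# Hypothetical melon positions within the manipulator workspace.
melons = [(0.8, 0.2), (0.1, 0.1), (0.5, 0.6), (0.2, 0.4)]
home = (0.0, 0.0)

order = nearest_neighbor_order(home, melons)
print("visiting order:", order)
print(f"path length: {path_length(home, order):.2f} m "
      f"(detection order: {path_length(home, melons):.2f} m)")
```

The nearest-neighbor heuristic does not guarantee the shortest tour, but it is cheap to compute and often much shorter than visiting the targets in detection order, which matters when cycle time is a limiting factor.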

State of the art in Industry

To the best of our knowledge, the asparagus harvesting robot Sparter from Cerescon (www.cerescon.com) is the only open-field harvesting robot that is commercially available. The robot is equipped with underground sensors to detect the asparagus and two harvesting tools per row. The operating speed is approximately 0.3 ha/h. Another robot that is almost on the market is RoboVeg (www.roboveg.com), a selective broccoli harvesting robot.

Conclusion of Current Robotics on the Open Field

All presented robots have been developed since the 1990s and were specifically built for the selective harvest of vegetables (asparagus, broccoli, cauliflower, lettuce, radicchio, and melon). Except for the Sparter robot, all robots used cameras to detect and localize the crops. The most recently developed robots used deep learning for robust image analysis. Two of the six presented robot manipulators were self-made, and the other four manipulators were purchased. Every end-effector was custom-made and performed the cutting action with some kind of robotic knife. The harvest success and speed are high compared to the greenhouse and orchard settings due to the less complex structure of open-field crops.

Limitations, Trends, and Future Research

Robotics has been extremely successful in the production industry, building on a long tradition of improving production efficiency by separating tasks, implementing well-structured and well-controlled working environments with low variation in working conditions, and, last but not least, reducing variation in the objects. Essentially, Henry Ford’s famous phrase “Any customer can have a car painted any color that he wants so long as it is black” together with assembly-line manufacturing paved the way for robotic operation. Compared with the production industry, robotics in agriculture lags behind significantly. In the next sections, the main technical challenges of agricultural robotic systems are identified and solution directions are described.

Current State of Selective Harvesting Robotics

The “State of the art in Selective Harvesting” section provided an overview of the state of the art in selective harvesting robotics in greenhouse, orchard, and open-field conditions. Despite a few decades of research, selective harvesting robots currently do not meet the requirements for commercial success in terms of harvest success and speed. For the harvesting success, the critical components are perception (the detection of the produce and other plant parts) and the harvesting tool and operation. Looking at the challenges for agricultural robotics posed in the “Introduction” section, a number of observations can be drawn from the state-of-the-art overview:

  • Perception. Recent advances in the field of deep learning greatly improved perception, making it more robust to the challenge of variation. Deep-learning-based detection algorithms have been shown to be robust to variations in the appearance of the objects and environmental conditions. The methods also generalize quite well to new cultivars and environments. Dealing with incomplete information due to occlusions is still a big challenge when operating in complex commercial production environments. This is especially the case in the greenhouse and the orchard, as plants there are more complex compared to the open field.

  • Harvesting tool and operation. There is no clear paradigm visible in the design of the end-effector. Every study developed its own custom harvesting tool. Most harvesting tools were quite rigid and bulky. Detachment was usually performed with an automated cutting knife and in some cases with a twisting motion or suction. In complex, cluttered environments, harvesting success dropped, often due to the tool not being able to reach the right location due to collisions with the plant, or due to not being able to localize the correct location due to perception limitations. In addition, the plant and fruits were frequently damaged by the tools.

  • Operation speed. Cycle times of greenhouse and orchard robots are typically in the range of half a minute, which is significantly slower than human operation, obstructing commercial application. Due to the simpler situation, robotic harvesting of vegetables on the field can typically be done much faster than in the greenhouse.

  • Complexity of environment. In some greenhouse studies, when the crop was modified to reduce clutter by removing some leaves and fruits, harvesting success rate drastically improved. Detection success of the perception algorithms improved significantly as occlusions occurred less frequently, and the harvesting operation was more successful now that the tool had more space for a collision-free approach.

  • Task variation. All robots discussed in this review are designed only for the harvesting task. Though outside the scope of this paper, harvesting is only one task in the whole crop-production process. Various crop maintenance operations need to be addressed when considering fully automated farming in the future (Kootstra et al., 2020). Although in a very rudimentary fashion, bi-functionality of a robotic platform was demonstrated by van Henten et al. [42] and van Henten et al. [43] for harvesting as well as leaf removal of cucumber plants grown in a high wire cultivation system.

  • Safety. Safety of the robot for its plant/fruit environment was evaluated in a number of studies, where damage to the fruit and plant was occasionally reported. Safety issues of autonomous robotic systems in open-field cultivation have received some attention (e.g., [44, 45]), but safety in human-robot co-working is not yet an active field of research.

Solution Directions for Technical Challenges

Essentially, three solution directions offer opportunities in dealing with uncertainty and variation in agricultural robotics:

  • Reducing variation and uncertainty in the environment as well as in the plant population. Despite advances in the field of deep learning, the performance of machine-vision systems remains sensitive to variations in the illumination of the operational scene. Flooding the operating scene under a hood with artificial light has successfully mitigated this weakness in many applications. Operation at night is another option, although it restricts the operation time of the robot to nighttime only. To reduce the complexity of the scene, breeding for robotics is an alternative pathway. There is a keen interest from plant breeders to select cultivars that are both productive and better suited for robotic treatment during production. Also, modification and standardization of the cultivation systems offer opportunities to reduce variation and uncertainty in the working environment of robotic farming systems [5, 42, 46].

  • Enhancing robotic technology. There are various ways to tackle variability and uncertainty in agricultural robotics. One way is to include more domain knowledge in the design and operation of robotic capabilities. This requires modeling of the world in which the robot has to operate, thus providing potential clues about the structure of the working environment, the presence and absence of objects, and the evolution of such characteristics in time due to growth and development. Another way is to extend the sensing capabilities beyond the common machine-vision systems operating in the visible and near-infrared spectrum. Tactile sensing is an alternative that has hardly received attention in the agricultural robotics community. Combining different sensing modalities in a multi-modal sensing framework might literally provide more insight into the work scene of the robot. Also, active perception, in which the robot resolves uncertainty in the environment by actively gathering new sensory input by changing perspective and manipulating objects, has potential in dealing with uncertainty and variation, as for instance proposed in [47, 48]. The current, quite rigid gripping technology in agri-food does not work well in conditions that demand short cycle times while dealing with variability in product size and softness. Compliant actuators and end-effectors that combine different grasp types and use tactile sensing and control to realize different force distributions and grasp stiffness are needed to deal with these challenges.

  • Human-robot collaboration. Variation and uncertainty in agriculture together with the relatively immature status of robotic technology when it comes to dealing with these challenges prohibit rapid deployment of fully autonomous robotic technology in agriculture. An intermediate step towards autonomy might be the combination of robotic skills with human capabilities in a human-robot co-working framework [49,50,51].

Trends in Agricultural Robotics: a Wider Perspective

This paper provided an overview of robotic technology for selective harvesting in agriculture. Societal needs, state of the art of technology, technical challenges, and potential solution directions were addressed. Yet, that is only part of the story when it comes to adoption of robotic technology in agricultural practice.

Economic viability is a key issue in the adoption of technology. Yet, economic viability should be addressed from a wider perspective than just balancing direct costs and benefits of a certain technology. Novel technology may provide advantages with no directly accountable economic return. The freedom to attend to other tasks and to develop a social life have been key success factors in the adoption of the milking robot. When it comes to economic viability, discussions on robotic technology are often based on the reasoning that novel technology should successfully replace 100% of human labor to be economically viable. Given the above-listed technical challenges, this line of reasoning hampers innovation. Partial replacement of human labor and human-robot co-working are potential alternatives when farmers are willing to rethink the procedures used in their farming operation.

There is clearly some tension between the romantic image of agricultural food production and the use of robotic technology. While advancing technology should continue to rank high on the research agendas to meet the challenges faced by society, this progress should be accompanied by thorough thought on the consequences of such technologies for society. The discussion on robot ethics, also in the framework of agrifood, deserves attention [52].

Finally, agricultural production systems are developing. Stimulated by growing concerns about the long-term sustainability of current large-scale mono-cropping cultures, intercropping and pixel-farming are revisited as better alternatives [53]. Yet, this requires rethinking farming at large as well as specifically the technology used in farming. Given current and future limitations in availability of human labor, robotic technology might facilitate and support such developments in agronomy. Yet, this introduces even more variation and uncertainty, and thus additional challenges for robotic technology, both in selective harvesting, as well as in crop production as a whole.