1 Introduction

The amount of poultry meat produced is increasing as production becomes more and more automated. The number of chickens slaughtered rose from 40.6 billion in 2000 to 65.8 billion in 2016 [7]. Broilers are slaughtered and cut up almost entirely by robots and automated equipment, at speeds of up to 13,500 birds per hour on a single slaughter line [15]. To keep the production line running smoothly, the equipment needs to be adjusted for the size and weight of the birds.

Broilers are typically weighed with a conveyor scale installed as part of the processing line. Such scales are usually quite large and may require the bird to be transferred off and back onto the conveyor. Maintenance and replacement require that the line is stopped or that the line bypasses the scale.

The average weight of a flock can be used to adjust the equipment that cuts the chicken into breast, leg, and wing parts. The individual size and weight are used to direct each broiler to the right cut-up station. By setting the equipment correctly, the factory minimises waste and optimises its profit.

Very light or small birds should be removed early, as these are likely sick or underdeveloped. Trying to eviscerate these birds can cause the equipment to damage the intestines, causing faecal contamination on the following birds [9].

Many slaughter houses already have multiple camera systems installed along the processing line for inspection and monitoring tasks. Because they are installed in a non-intrusive manner, they are relatively inexpensive and easily replaceable. These cameras can also be utilised for weight estimation. A 2D image only shows the size of the bird from a single perspective, but because broilers are bred to be similar there is a high correlation between size and weight [2].

However, some variation cannot be captured in a 2D image; for example, dimensional changes in the direction towards the camera are not visible. The contribution of this paper is to investigate whether adding 3D features from a 3D prior increases the performance of the weight estimation. The prior knowledge consists of 45 3D scans of broilers gathered at a poultry processing plant. A statistical shape model (SSM), generated from the 3D scans, is fitted to the 2D image of the broiler to extract 3D features for the weight estimation.

2 Related Works

Weight estimation from images is especially useful in situations where physical weighing is not feasible. In the production of fresh lettuce, where the plants cannot be uprooted for weighing, [12] showed that it is possible to estimate the weight from an image using both morphology- and pixel-based methods.

Considerable work has been done on estimating the weight of livestock in a non-intrusive fashion. Weight is an important parameter in rearing, but physical weighing requires large scales for animals such as cattle and pigs, and weighing hundreds of animals is a cumbersome affair. Work by [13] shows that the live weight of individual pigs can be estimated with an accuracy of \(96.2\%\) using a camera installed over the pig pen. The body area of each pig is found by fitting an ellipse to its back, which is then used in a transfer function to estimate the weight once per minute.

In the fishing industry, [3] showed that good results could be achieved with an RGB sensor and polynomial regression when estimating the weight of salmon. Early work by [19] demonstrated a structured light setup capable of determining the weight of flatfish as they passed the camera and laser on a conveyor. A similar approach was used by [16], but on herring. Structured light gives a measure of depth, which is useful when estimating volume; for similarly shaped objects, volume is highly correlated with weight.

A similar top-view approach was used to estimate the live weight of broilers, but with a Kinect sensor to acquire the depth. Using a combination of 2D and 3D features, [17] achieved a relative error of \(7.8\%\) across all broilers. The system was installed in a commercial production setting and operated fully automatically.

3 Approach

The goal is to estimate the weight of broilers from 2D images captured in-line at a poultry processing plant. 2D cameras are already in use in many slaughter houses for inspection, and adding a 3D sensor would increase the complexity and cost of the weight estimation.

The 3D information should therefore come from prior knowledge. In a pre-processing step multiple 3D scans of broilers are combined in an SSM. This step is performed off-line and should only be done once.

The in-line process starts with the image acquisition, which captures an RGB 2D image. From this image, 2D landmarks are extracted and used to fit the SSM to the broiler in the image. From the fitted SSM, 3D features are extracted and combined with 2D features, which are then used to estimate the weight of the broiler. The in-line processing is constrained to a maximum processing time of 266 ms, which corresponds to 13,500 birds per hour.
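To make the flow concrete, a minimal sketch of how the in-line stages could be composed is given below. The function names and signatures are illustrative placeholders for the components described above, not the actual implementation used in the paper.

```python
from typing import Callable, Sequence

import numpy as np


def estimate_weight(
    image: np.ndarray,
    extract_landmarks: Callable[[np.ndarray], np.ndarray],        # 8 (x, y) landmarks
    fit_ssm: Callable[[np.ndarray], np.ndarray],                   # fitted 3D vertices
    extract_3d_features: Callable[[np.ndarray], Sequence[float]],  # volume, areas, paths
    extract_2d_features: Callable[[np.ndarray], Sequence[float]],  # 23 image features
    predict_weight: Callable[[np.ndarray], float],                 # trained regressor
) -> float:
    """Compose the in-line stages: landmarks -> SSM fit -> features -> weight."""
    landmarks = extract_landmarks(image)
    fitted_vertices = fit_ssm(landmarks)
    features = np.concatenate(
        [np.asarray(extract_2d_features(image), dtype=float),
         np.asarray(extract_3d_features(fitted_vertices), dtype=float)]
    )
    return predict_weight(features)
```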

A flow chart of the system described in this paper is depicted in Fig. 1.

Fig. 1. Flow chart of the weight estimation pipeline described in this paper.

4 Statistical Shape Model Generation

Fitting statistical models to images gained traction with the invention of Active Shape Models (ASM) [6] and Active Appearance Models (AAM) [5]. Both ASM and AAM have a built-in prior derived from the shapes used to construct the models.

An SSM captures the underlying physical characteristics of the object, and this is what we are interested in when modelling broilers. By matching the model to a 2D image, we gain information about the broiler's measurements in the third dimension. The proportions of the broiler's body parts, like breast and drum, could even lead to a detailed weight estimation of the individual parts.

4.1 Creating 3D Scans of Broilers

45 birds were collected at a poultry slaughter house over the course of three weeks, to ensure diversity, and recorded using a Canon EOS 5DS camera. The recording setup consisted of a hanger in the centre of a room, and the camera was rotated around the chicken: one full rotation with the camera placed higher than the chicken and one with the camera placed lower. Between 90 and 120 images were captured per bird. See Fig. 2 for a sketch of the setup. The weight distribution of the collected birds can be seen in Fig. 3. The histogram shows a gap between 1200 g and 1600 g where no birds were recorded. This is not ideal, but not surprising, as broilers are bred to be slaughtered at specific weights. The recorded images were fed to the commercially available software ContextCapture, which generated the 3D scans. Pieces of coloured tape were attached to the hanger, two centimetres apart, to ensure the correct scale of the bird in the 3D scan. Examples of three captured 3D scans can be seen in Fig. 4a, b and c. Each scan contains between 180,000 and 330,000 vertices.

Fig. 2. Sketch of the setup used for capturing images for the 3D scan generation. The camera is approximately 1 m from the bird.

Fig. 3. Distribution of the weight of all birds used in the SSM.

4.2 Registering the 3D Scans

The 3D scans were manually trimmed to remove the wings, the knees and the neck skin, as these parts can differ greatly between birds and are therefore difficult to register. It is also assumed that the neck skin and the knees in particular have a small or constant impact on the total weight. The scans were then smoothed to remove small bumps on the skin, which would otherwise produce highly fluctuating surface normals.

One scan was chosen as the template and simplified to around 25,000 vertices using a surface simplification method [8]. The template was then registered to the other scans with a non-rigid iterative closest point method, N-ICP-A [1]. The algorithm assigns an affine transformation to each vertex and starts with a stiff template to find the global alignment, then gradually reduces the stiffness to allow more localised deformations. The stiffness regularises the deformation and is controlled by penalising the difference between the transformations of neighbouring vertices.
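For reference, the cost minimised at each stiffness level can be written as below. This follows the formulation in [1]; the optional landmark term of [1] is omitted, and the exact weights and stiffness schedule used in this work are not stated here, so the expression should be read as an illustration rather than the precise configuration.

$$\begin{aligned} E(\mathbf{X}) = \sum _{i} w_i \left\| \mathbf{X}_i \mathbf{v}_i - \mathbf{u}_i \right\| ^2 + \alpha \left\| \left( \mathbf{M} \otimes \mathbf{G}\right) \mathbf{X} \right\| _F^2 \end{aligned}$$

where \(\mathbf{X}\) stacks the per-vertex affine transformations \(\mathbf{X}_i\), \(\mathbf{v}_i\) is a template vertex in homogeneous coordinates, \(\mathbf{u}_i\) is its current closest point on the target scan, \(w_i\) down-weights unreliable matches, \(\mathbf{M}\) is the node-arc incidence matrix of the template edges, \(\mathbf{G}\) balances rotational against translational differences between neighbouring transformations, and \(\alpha\) is the stiffness, which is gradually lowered.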

Each registration produces a few sets of vertices with very skewed faces, especially around the ends of the legs, due to badly registered vertices. To remove these faulty faces, a few vertices around the end of both legs were simply removed. The same vertices must be removed from all registered scans to keep the correspondence between the models. Unconnected vertices are also removed from all scans. As the last step, all scans are set to use the faces from the template. The resulting meshes for the three birds in Fig. 4a, b and c can be seen in Fig. 4d, e and f.

The area around the groin proved difficult to register, as there is a large variation between the birds in this area. This can be seen in Fig. 4a and b, where the height difference between the backside of the thigh and the end of the breast bone differs greatly between the two birds. As a result, vertices around the cloaca were removed. The registered and trimmed scans are now used to generate the SSM. All scans are aligned with Procrustes analysis [18] without scaling and flipping. The mean of all scans is then subtracted from the individual scans before Principal Component Analysis (PCA) is used to model the variation in the data. Studying the explained variances shows that the first seven components contain more than 95% of the total variation. The mean shape can be seen in Fig. 5. New samples can now be generated using Eq. 1.

$$\begin{aligned} \mathbf{x} = \bar{\mathbf{x}} + \mathbf{P}\mathbf{b} \end{aligned}$$
(1)

where \(\mathbf{x}\) is a new sample, \(\bar{\mathbf{x}}\) is the mean shape, \(\mathbf{P}\) contains the eigenvectors found with PCA and \(\mathbf{b}\) is a set of parameters that deform the SSM. n is the number of vertices in the model and, as the vertex coordinates are stored as \(x_1,y_1,z_1,x_2,y_2,z_2,\dots ,x_n,y_n,z_n\), the number of rows in \(\mathbf{x}\), \(\bar{\mathbf{x}}\) and \(\mathbf{P}\) becomes 3n. m is the number of principal components.
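As an illustration, a minimal NumPy sketch of the SSM construction and of generating a sample with Eq. 1 is given below. It assumes the registered and Procrustes-aligned scans are already stacked row-wise in an array; the variable names are illustrative and this is not the exact implementation used in the paper.

```python
import numpy as np


def build_ssm(shapes: np.ndarray, n_components: int = 7):
    """Build an SSM with PCA.

    shapes : (k, 3n) array of registered, Procrustes-aligned scans, each row
             stored as x1, y1, z1, ..., xn, yn, zn.
    Returns the mean shape (3n,), the eigenvector matrix P (3n, m) and the
    per-mode standard deviations (m,).
    """
    mean_shape = shapes.mean(axis=0)
    centred = shapes - mean_shape
    # PCA via SVD of the centred data matrix; rows of vt are the eigenvectors.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    P = vt[:n_components].T                              # (3n, m)
    std = s[:n_components] / np.sqrt(len(shapes) - 1)    # sqrt of the eigenvalues
    return mean_shape, P, std


def generate_sample(mean_shape: np.ndarray, P: np.ndarray, b: np.ndarray) -> np.ndarray:
    """New shape from the mode parameters b (Eq. 1)."""
    return mean_shape + P @ b
```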

Fig. 4. Three 3D scans created by ContextCapture (top row) and the template scan registered to the bird above (bottom row).

5 Fitting the SSM to a 2D Image

All birds are presented the same way, hanging by the legs with the breast facing the camera. This allows us to lock the yaw and pitch rotation of the SSM, reducing the degrees of freedom in the fitting problem. Due to the way the broilers are transported on the line, some differences in roll must be expected.

The fitting is done with automatically extracted landmarks. If corresponding vertices in the SSM are fitted to these landmarks, the rest of the model should match the broiler in the image. Eight landmarks are extracted from the bird in the 2D image using an IHFood ClassifEYE system [10]: the left and right wing pit, shoulder, hip and groin. The eight points are depicted in Fig. 7b. These points were chosen as they could be found most consistently. Eight corresponding vertices are selected in the SSM. These vertices are handpicked and only selected once, as they should always correspond to the same landmarks in the images.

Fig. 5. Mean of the SSM, rotated for better viewing.

When the landmarks have been extracted from the image, Procrustes analysis without scaling is used to align them to the selected vertices in the SSM. The SSM and the image landmarks are now aligned and ready for the model to be fitted. As the orientation of the SSM is locked, only the x and y values are used for the fitting, which gives a total of 16 values in \(\mathbf{x}\). Equation 1 will then have the dimensions shown in Eq. 2. Only the first seven principal components are used.

$$\begin{aligned} \mathbf{x}_{16\times 1} = \bar{\mathbf{x}}_{16\times 1} + \mathbf{P}_{16\times 7}\,\mathbf{b}_{7\times 1} \end{aligned}$$
(2)
$$\begin{aligned} 0 = \mathbf{P}\mathbf{b} - \left( \mathbf{x} - \bar{\mathbf{x}}\right) \end{aligned}$$
(3)

Equation 3 is an over-determined problem and therefore, in general, has no exact solution. \(\mathbf{b}\) is unknown, but we want to constrain its values to \(\pm 3\) standard deviations. As PCA is a linear model, going beyond \(\pm 3\) standard deviations will in most cases cause the model to diverge greatly from the true population. \(\mathbf{b}\) in Eq. 3 is therefore found with an optimiser. The minimisation is done with SciPy's [11] minimize method. The algorithm used is L-BFGS-B [4] and each scalar in \(\mathbf{b}\) is bound to \(\pm 3\) standard deviations of the corresponding mode. The initial values for \(\mathbf{b}\) are all zeros.

Once \(\mathbf{b}\) is found, it can be inserted in Eq. 1 to generate the new sample that fits the broiler in the 2D image. The size of \(\mathbf{b}\) depends only on the number of principal components used, so the resulting model is still 3D even though \(\mathbf{b}\) was found using only 2D points.
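A minimal sketch of this fitting step is given below, using SciPy's L-BFGS-B as stated in the text. The least-squares residual formulation and the variable names are assumptions; the landmarks are assumed to be already Procrustes-aligned to the selected SSM vertices.

```python
import numpy as np
from scipy.optimize import minimize


def fit_b(landmarks_xy: np.ndarray, mean_xy: np.ndarray,
          P_xy: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Solve Eq. 3 for b in a least-squares sense.

    landmarks_xy : (16,) aligned x and y coordinates of the 8 image landmarks
    mean_xy, P_xy: the rows of the mean shape and of P that correspond to the
                   x and y coordinates of the 8 selected SSM vertices
    std          : (7,) per-mode standard deviations; each b_i is bound to +/- 3 std
    """
    target = landmarks_xy - mean_xy

    def cost(b: np.ndarray) -> float:
        return float(np.sum((P_xy @ b - target) ** 2))

    result = minimize(cost, x0=np.zeros(len(std)), method="L-BFGS-B",
                      bounds=[(-3 * s, 3 * s) for s in std])
    return result.x  # insert into Eq. 1 to obtain the full 3D shape
```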

The resulting fit for three broilers can be seen in Fig. 6.

Fig. 6. The SSM fitted to three different images. Red dots are vertices in the model. (Color figure online)

6 Features

The weight could be calculated from the volume of the fitted 3D model if the density of a chicken were constant across all parts. We cannot simply assume this, however, and perhaps more importantly, we know that the chicken's cavity is empty after evisceration. We therefore extract multiple 3D and 2D features to estimate the weight.

One 3D feature is indeed the volume, which is calculated as the sum of all tetrahedra formed by the faces of the SSM and the origin, placed roughly in the centre of the SSM. As the SSM is not a closed surface, the volume will not be what one might expect from a chicken.
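A sketch of the volume feature is given below, under the assumption that the faces are given as an (f, 3) index array into the (n, 3) array of fitted vertex coordinates. Whether signed or unsigned tetrahedron volumes are summed is an implementation detail not stated in the text; signed volumes are used here.

```python
import numpy as np


def ssm_volume(vertices: np.ndarray, faces: np.ndarray) -> float:
    """Sum of tetrahedron volumes formed by each triangular face and the origin.

    vertices : (n, 3) fitted SSM vertex coordinates, with the origin placed
               roughly at the centre of the model.
    faces    : (f, 3) integer array of vertex indices per triangle.
    """
    a = vertices[faces[:, 0]]
    b = vertices[faces[:, 1]]
    c = vertices[faces[:, 2]]
    # The signed volume of the tetrahedron (origin, a, b, c) is a scalar
    # triple product divided by six.
    signed = np.einsum("ij,ij->i", a, np.cross(b, c)) / 6.0
    return float(abs(signed.sum()))
```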

The remaining 3D features are areas and distances on the surface of the SSM. Because there is point correspondence between all fitted SSMs, an area can be calculated from the same set of vertices every time; it is only the locations of these vertices that change. The surface area is the sum of all faces spanned by these vertices. The area features include the left and right breast, the left and right upper thigh and a band around each drum.

The same principle is used for the distance features. Pre-specified vertices form a path that is used to calculate the distance between two points on the surface. This path is the shortest in the mean shape, but because the vertices move individually when the SSM is fitted to the 2D image, it is not necessarily the shortest path in the fitted SSM. It is, however, much faster to calculate the length of a fixed path than to search for the shortest path between two vertices for every fit. The distance features include the path from the collarbone to the bottom of the breast, the circumference of each drum and the path from each wing pit to the centre of the breast. In total, 12 3D features are extracted from the fitted SSM.
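Because of the fixed point correspondence, both feature types reduce to simple operations on index lists defined once on the template, as sketched below. The index arrays are assumed to be pre-specified; this is an illustration, not the exact implementation.

```python
import numpy as np


def region_area(vertices: np.ndarray, region_faces: np.ndarray) -> float:
    """Surface area of a pre-specified set of triangles, e.g. the left breast."""
    a = vertices[region_faces[:, 0]]
    b = vertices[region_faces[:, 1]]
    c = vertices[region_faces[:, 2]]
    # The area of a triangle is half the norm of the cross product of two edges.
    return float(0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1).sum())


def path_length(vertices: np.ndarray, path_indices: np.ndarray) -> float:
    """Length of a fixed vertex path, e.g. collarbone to the bottom of the breast."""
    path = vertices[path_indices]
    return float(np.linalg.norm(np.diff(path, axis=0), axis=1).sum())
```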

2D features are extracted using an IHFood ClassifEYE system, which outputs a total of 23 features. These are primarily area features, such as the area of the chest, but also distances, such as the distance from wing pit to wing pit.

All features are augmented by adding the square root and the square of each of them. This is done to introduce some non-linearity. With augmentation there are 36 3D features and 69 2D features.
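The augmentation is a simple element-wise transform, sketched below for a feature matrix with one row per bird. The features are areas and distances, hence non-negative, so the square root is well defined.

```python
import numpy as np


def augment(features: np.ndarray) -> np.ndarray:
    """Append the square root and the square of every feature column."""
    return np.hstack([features, np.sqrt(features), features ** 2])
```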

7 Data Acquisition

2D images of broilers are recorded with a Jai BB-141GE RGB camera installed at a slaughter house in Chesterfield, England. The images are recorded after defeathering and evisceration, where the broilers are transported sideways, hanging by their legs with the breast facing the camera. Their weight is measured with a LINCO 520 Weigh Transfer [14] mechanical scale with an accuracy of \({\pm }0.25\%\). The weight is paired with the images and functions as ground truth. An example of a captured image can be seen in Fig. 7a. All 2D images were recorded over the course of four days. Images of broilers weighing less than 800 g or more than 2200 g are discarded, to ensure that the 2D images are in the same weight range as the broilers used for the SSM. The remaining 136,472 images are randomised, with 102,412 images used for training and 34,060 images used for testing.
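A sketch of the weight filtering and random split described above is given below; the shuffling procedure and seed are not stated in the paper, so the details here are assumptions.

```python
import numpy as np


def filter_and_split(weights: np.ndarray, n_train: int = 102_412, seed: int = 0):
    """Keep birds between 800 g and 2200 g, shuffle, and split into train/test.

    weights : (N,) ground-truth weights in grams, one per captured image.
    Returns index arrays for the training and test images.
    """
    keep = np.flatnonzero((weights >= 800) & (weights <= 2200))
    np.random.default_rng(seed).shuffle(keep)
    return keep[:n_train], keep[n_train:]
```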

Fig. 7. Captured 2D image with and without landmarks.

8 Results

Performance is measured for 2D features only (2D), for 3D features only (3D) and for a combination of 2D and 3D features (2D3D). The number of features used to train the regression models is listed in Table 1. All features are normalised by subtracting the mean and dividing by the standard deviation. Both the mean and the standard deviation are calculated using only the training samples.

The regression model chosen for comparison is a robust linear regression. The robust variant means the linear model is fitted iteratively, and in each iteration the data points are re-weighted based on their residuals. Outliers are down-weighted and therefore have a smaller effect on the fit. The resulting errors of the regression models are listed in Table 1. The error is reduced by combining 2D and 3D features: the mean absolute error is 1.80% lower than with 2D features alone, and the mean absolute percentage error is 3.47%. An unpaired t-test was performed on the 2D and 2D3D residuals to investigate whether this reduction is significant; the resulting p-value of 0.0129 indicates that it is.
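A minimal sketch of the evaluation is given below. The paper does not name the libraries used for the regression or the test, so statsmodels' iteratively re-weighted robust linear model and SciPy's unpaired t-test are used here as stand-ins; whether raw or absolute residuals were compared is not specified, and absolute residuals are shown.

```python
import numpy as np
import statsmodels.api as sm


def evaluate(X_train, y_train, X_test, y_test):
    """Normalise, fit a robust linear regression and report MAE, MAPE and residuals."""
    # Normalise with the mean and standard deviation of the training set only.
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    X_tr = sm.add_constant((X_train - mu) / sigma)
    X_te = sm.add_constant((X_test - mu) / sigma)

    # Robust linear regression: iteratively re-weighted least squares with
    # Huber weights, down-weighting points with large residuals.
    fit = sm.RLM(y_train, X_tr, M=sm.robust.norms.HuberT()).fit()
    residuals = y_test - fit.predict(X_te)

    mae = np.mean(np.abs(residuals))
    mape = 100.0 * np.mean(np.abs(residuals) / y_test)
    return mae, mape, residuals


# Significance of the improvement, e.g. with residuals from the 2D and 2D3D
# models (hypothetical names):
# from scipy import stats
# t, p = stats.ttest_ind(np.abs(residuals_2d), np.abs(residuals_2d3d))
```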

Table 1. Results for the linear regression models. Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) are reported.

As there are no studies on the same dataset, we compare our results to related weight estimation papers. Weight estimation of herring by [16] achieved an \(R^2\) of 0.980 using structured light to extract 3D features, with a sample size of 179. They used 2D and 3D features, whereas our method achieved an \(R^2\) of 0.963 using 2D and pseudo-3D features. In the work on pig weight estimation by [13], an \(R^2\) of 0.962 was obtained using multiple frames extracted from a video. They did, however, track the weight of individual pigs as they grew over 13 days, whereas we sample and estimate the weight once.

The coefficient of determination, \(R^2\), is also used to investigate the individual features' correlation with the weight, and the 3D features generally show a higher correlation than the 2D features. The top five correlated 2D and 3D features can be seen in Table 2.

Table 2. \(R^2\) for the top 5 2D and 3D features with respect to the weight.

8.1 Timing

The system should be able to operate in-line at production speeds of up to 13,500 birds per hour. That corresponds to 266 ms of available processing time per bird, and the existing 2D pipeline, including landmark extraction, already takes an average of 125 ms.

In this work we have added the SSM fitting and 3D feature extraction to the existing pipeline. The code is implemented in Python 3.6, using NumPy and Numba for speed-up where possible. Timing showed that the average processing time for fitting the SSM and extracting 3D features was just 7 ms per image. This is well within the remaining 141 ms, so the 3D features can easily be added to the existing analysis. The test was performed on an i7-4770K CPU running at 3.7 GHz.

9 Conclusion

The results showed that the mean absolute error is reduced by 1.80% by adding 3D features from the SSM to the existing 2D features extracted directly from the image. A t-test confirmed that this reduction is statistically significant.

Investigating \(R^2\) for the individual features showed that the 3D features were more correlated with the weight than the 2D features. It is, however, very likely that there is high collinearity between the 3D features, as the regression model using only 3D features had a higher overall weight estimation error than the one using only 2D features.

The SSM was fitted to the 2D image using only eight landmarks. This meant the SSM could be fitted very quickly, in just a couple of milliseconds, allowing the entire fit and 3D feature extraction to run in real time. The soft curves and bland textures of the broiler made it difficult to extract more than these eight landmarks, and especially the landmarks around the wings proved to be volatile. The angle of the wings can deviate a lot on some birds, which has a large impact on the wing pit and shoulder landmarks. Figure 4a is a good example of how different the wings can look on some birds. All 3D features depend on the fit, so a bad fit leads to bad 3D features. For future work it would be interesting to try a more robust way of fitting the SSM to the 2D image.

Constructing an SSM is a time-consuming task. Capturing 90–120 images of each broiler is a tedious job, and building the 3D scans is computationally expensive and therefore also time-consuming. After this comes the manual process of inspecting and trimming the scans before they can be registered to each other, which is also computationally expensive.

Many steps in this process can, however, be automated, which will make it easier to expand the SSM with more broilers and thereby capture more variance. For this paper we built the statistical model from 45 3D scans of broilers, which we used to estimate the weight of over 100,000 birds. Forty-five 3D scans clearly cannot represent the full variation of the true population, but the results indicate that the weight estimation error can be reduced by adding prior knowledge, and it is our belief that an SSM built from more 3D scans can improve the performance further.