Introduction

The cost of a product can be accurately estimated only at the end of the development cycle, once detailed choices about the manufacturing process (equipment, tooling, operations) have been made. In earlier stages of development, cost would be valuable information to guide design choices in order to meet product specifications (target price, production volume, etc.). However, there is limited availability of cost estimation methods suited to the needs of product designers. These methods should be easy and fast, with a reasonable compromise in terms of accuracy.

Early estimation of manufacturing cost is especially difficult for precision machined parts. The machining cost is usually calculated by multiplying an appropriate shop rate (the hourly cost of equipment, labor, and indirect resources) by the machining time. Accurately estimating the machining time is not a trivial task, as it consists of planning the machining process and calculating the time taken for each operation. This requires the selection of cutting parameters (cutting speed, feed, depth of cut) taking into account specific requirements for the different types of operations (turning, drilling, milling, grinding, etc.). The estimation must also include non-productive and handling times, which depend on the size of the part and the types of machine tools to be used.

Besides requiring specialized manufacturing knowledge, these calculations would take too long during product design, when repeated cost estimates are needed to compare alternatives and evaluate the effects of changes to the product. A designer should be able to estimate the machining time more easily, using empirical functions of predictors related to design specifications. Procedures of this type, referred to as parametric methods, are used for rough order-of-magnitude estimates during product planning or conceptual design. In the literature, attempts have been made to extend their use to the preliminary or detailed design stages. This objective requires a careful selection of the predictors in order to keep the estimation error within acceptable limits.

In the search for accurate predictors of machining time, one must obviously consider variables related to the amount of machining work needed, such as the area or length of the machined features. In addition, some studies have focused attention on the complexity of the machined part. This is often expressed in purely geometric terms, e.g. by counting the number of machined features or dimensions. Alternatively, an interesting formulation based on an analogy with information theory combines the dimensions with the tolerances, which are also likely to influence machining parameters. Previous studies (Muter 1993; Hoult and Meador 1996) have shown that a complexity measure defined by these variables is correlated with machining time for specific processes, although it probably does not allow reliable estimates over a wide range of part geometries.

This paper aims to propose a parametric procedure for machining time estimation from data available in the technical drawing of a part. The calculation is based on the regression analysis of detailed time estimates made on a sample of machined parts. The complexity measure is integrated in the regression model of machining time as one of the predictors. Unlike in previous studies, additional predictors are identified to better represent the effects of some design factors that are not fully represented by complexity alone. These include the overall part size, the material, and the types of machined features.

The motivation for the study is twofold. On a theoretical side, it tries to better understand the factors that influence machining time and cost, highlighting their relative importance. On a practical side, it aims to provide an estimation method that is simple yet accurate enough to guide design decisions and help improve the product in early development stages.

The remainder of the paper is organized as follows. Section 2 reviews related estimation methods from literature. Section 3 describes the time estimation problem to be solved, and introduces the basics of the proposed approach. Section 4 explains the method for the development of the parametric method, while Sect. 5 reports the results including the regression equation and the related accuracy tests. Section 6 demonstrates the use of the equation on an example. Section 7 discusses the advantages and limitations of the proposed method.

Literature review

The background of this work includes general methods for estimating manufacturing costs, specific methods for estimating machining cost and time, and approaches to measuring complexity for estimation purposes.

Estimation of manufacturing cost

Cost estimation supports several tasks throughout product development (AACE 2019). During preliminary design, it guides material and process selection as well as benchmarking and make-or-buy decisions. During detailed design, it helps compare design alternatives, evaluate the effect of redesigns, and verify compliance with cost budgets. Once the design is complete, it is useful for submitting bids, revising supplier quotations, and controlling manufacturing expenses.

These opportunities have been especially perceived in the development of highly complex products. Some introductions to cost estimation in the aerospace sector (Rush and Roy 2000; Roy and Sackett 2003) recommend the evolution from an approach based on expert knowledge to a formal data-driven process. Methods and tools for procurement activities at big companies and public agencies are discussed in reports and textbooks (GAO 2009; NASA 2015; Mislick and Nussbaum 2015). Methods for a wider range of applications are covered in Ehrlenspiel et al. (2007).

Cost estimation requirements may vary with the application. The selection of materials (Farag 2014) and manufacturing processes (Lenau and Haudrum 1994; Esawi and Ashby 2003) requires streamlined methods that trade some accuracy for speed. The manufacturing cost is split into coarse elements (material, operating, and tooling costs), whose ranges are predicted by means of charts and graphs as a function of production volume; these are used to rank candidate process choices that meet product specifications (Ashby et al. 2007). More detailed estimates are needed for the comparison of design alternatives: as described in Weustink et al. (2000), the product is broken down into a relational structure with multiple levels (assembly, parts, features), each of which should be associated with process-dependent procedures for cost estimation. Such an analysis requires a correct cost structure, where indirect costs should be explicitly estimated to better identify cost reduction margins (Ulrich and Eppinger 2007).

Three main types of formal methods have been proposed for the estimation of manufacturing costs of whole products, assemblies or individual parts. Analogy methods are based on historical cost data, which are retrieved using group-technology coding (Ehrlenspiel et al. 2007) or case-based reasoning (Rehman and Guenov 1998; Duverlie and Castelain 1999). The accuracy of the estimate depends on the availability of similar cases, which requires considerable prior work for classification and normalization. To overcome this difficulty, some studies analyse the estimation process based on expert judgement, and translate it into procedures or rule-based systems (Rush and Roy 2001; Mauchand et al. 2008; Molcho et al. 2014).

Parametric methods are based on cost-estimating relationships (CERs), which express the cost as a function of one or more variables (cost drivers). Different levels of accuracy can be achieved with this approach: in early development stages, the cost of a product may be estimated from either the mass or an appropriate functional parameter (Ehrlenspiel et al. 2007); in preliminary or detailed design, multiple cost drivers are usually preferred for more accurate estimates. Regression models are easy to use, and their uncertainty can be statistically evaluated within the range of available data (Foussier 2006); they are therefore recommended in procurement activities, where the importance of data normalization is again emphasized (Mislick and Nussbaum 2015). As an alternative parametric method, neural network models (Zhang and Fuh 1998) are said to be more accurate whenever costs cannot be modeled using common regression equations (Smith and Mason 1997; Cavalieri et al. 2004); their disadvantages include the lack of an interpretable equation and of a statistical measure of the error magnitude. Advanced statistical methods proposed to improve regression accuracy include support vector regression, generalized additive models and gradient boosted trees (Huang 2007; Loyer et al. 2016).

Engineering build-up methods provide accurate cost estimates from a detailed description of manufacturing activities (work breakdown structure). For complex products, calculating the costs of individual activities requires gathering a large amount of information from many sources (Roy et al. 2011), and using detailed cost allocation criteria borrowed from accounting methods (Locascio 2000). These complications justify the development of supporting tools for engineering cost estimation; the requirements for software implementation are discussed in Roy and Sackett (2003), Nasr and Kamrani (2007), and NASA (2015). Proposed approaches include the integration of process planning methods in cost estimation tools (Grewal and Choi 2005) and the extraction of data from CAD product models (Liu and Basson 2001).

Estimation of machining time

Methods for machining cost estimation are reviewed in Niazi et al. (2006) and Garcia-Crespo et al. (2011). The problem involves an individual part (workpiece) which has a set of geometric features created through machining operations. As already mentioned, the problem usually comes down to estimating machining time. This can be done with several possible trade-offs between accuracy and calculation effort.

In the most accurate method, which is commonly used in downstream engineering work, the machining process is detailed into a sequence of cutting operations on one or more machine tools. The time for each operation is calculated by well-known shop formulas as a function of cutting parameters (Tanner 2006). Variations of the basic method differ in some assumptions or calculation details; for example, Creese et al. (1992) suggest two different expressions of cutting time: feed length divided by feed rate, and cutting path length divided by cutting speed. For turning operations, Jha (1996) calculates times and costs by means of optimization of cutting parameters with constraints on machining power and surface roughness. For drilling and milling, Maropoulos et al. (2000) correct default cutting parameters according to the specified roughness. A software tool described by Perera (2014) streamlines time estimation for different types of operations by recommending optimal values of the cutting parameters. In other software tools, feature dimensions and types of operations are automatically recognized from either CAD models (Roberts et al. 1997; Germani et al. 2011) or G-codes for CNC machining (Ben-Arieh 2000).

To simplify the calculation, parametric methods estimate machining time without breaking it down into individual operations. Simple regression models find little use except for single machining processes: an equation mentioned by Creese et al. (1992) gives the cutting time of rotational parts as a function of mass, with corrections related to the material and to the fraction of volume removed. Better accuracy is achieved by multiple regression models. For rotational parts, Mahmoud (1979) proposes an equation for time estimation from part dimensions (length and average diameter) and coefficients related to material, type of lathe, specified tolerance, and machining complexity; the latter is expressed as the number of setups plus the number of discontinuities on the machined surface. Within a comprehensive method for cost-based process selection, Swift and Booker (2013) provide a cost equation for a wide range of part types; a coefficient related to geometric complexity is evaluated from a two-digit code depending on the general shape (rotational, prismatic, etc.) and on feature complexity attributes.

Feature-based methods offer a compromise between engineering and parametric methods in terms of accuracy and ease of use. They estimate machining time as a sum of contributions from individual part features. For each feature, the cutting time is calculated from an appropriate removal rate depending on material and feature type. Lovejoy et al. (2005) always use a removal rate related to machined volume; Jung (2002) uses the same parameter for roughing operations, and a removal rate related to machined area for finishing operations; in addition, length-related removal rates are used for holes (Polgar 1996) and for end-milled features (Boothroyd et al. 2011). For prismatic parts, Ou-Yang and Lin (1997) estimate drilling and milling times from removal rates with feature extraction from CAD models. In Rao et al. (2005), different removal rates are used to build metamodels (response surfaces) of machining cost to allow shape optimization on aerospace engine parts. Other feature-based methods directly estimate machining cost from variables related to feature type and complexity attributes (Feng et al. 1996), or use group-technology coding to select the cost drivers (Geiger and Dilts 1996).

Special or hybrid methods have also been proposed for specific applications. A neural network model is used in Atia et al. (2017) to estimate the machining time of rotational parts; the input variables cover a broad set of specifications including the main part dimensions and the types of machined features. In Stockton and Wang (2004), a similar method with different cost drivers is compared for accuracy with a detailed method based on shop formulas. For prismatic parts, Shehab (2001) and Shehab and Abdalla (2001) apply fuzzy rules to the attributes of machined features (types, dimensions, tolerances, roughness) to evaluate machining time through membership functions (low, average, high). For procurement applications in the aerospace sector, Watson et al. (2006) propose a parametric method enhanced by analogy with historical data; the latter is used both to select a suitable regression model for the part, and to estimate an equivalent shop rate from supplier quotations. In Qian and Ben-Arieh (2008), the parametric method is combined with activity-based costing to include some indirect activities (design, CNC programming, prototyping, etc.) in the manufacturing cost of the part. Finally, a related problem consists of developing software tools for the estimation of CNC milling time on workpieces with free-form surfaces; the calculation is usually based on the toolpath length, which is evaluated from either the G-code (Heo et al. 2006; So et al. 2007; Liu et al. 2013; Shukla et al. 2015, 2016) or the analysis of surface shape (Siller et al. 2016).

Complexity measures

Time and cost represent the effort to achieve a result, and are thus intuitively related to the complexity of the system being analyzed. This observation is the basis of extensive research on the concept of complexity in manufacturing. A review of the topic (ElMaraghy et al. 2012) mentions various ways in which complexity arises in design, manufacturing and business activities, and a general trend towards increasing design complexity (number of parts, technology, size, geometry, variety). A complex system is characterized by many parts and many connections among them, which may lead to uncertain or even chaotic behavior.

Attempts to measure the complexity of a product have sprung up in response to the axiomatic design theory (Suh et al. 1978; Suh 1990), which recommends minimizing complexity as a design strategy. The measure suggested for complexity is based on an analogy with information theory (Shannon 1948; ISO/IEC 1996). Design reduces uncertainty about product specifications by providing information; this is modelled by describing the product as a message and the manufacturing process as an information channel, which introduces noise in the message. The actual messages that can possibly be received are mutually exclusive events with probabilities pi; each event has an information content Ii, i.e. the minimum amount of information that can be provided to determine its occurrence. This is measured by the following logarithmic expression:

$$ I_i = \log_2 \frac{1}{p_i} $$
(1)

The whole set of events has an entropy H, i.e. an average information content, which is given by

$$ H = \sum_{i} p_i \log_2 \frac{1}{p_i} $$
(2)

Entropy represents the expected value of complexity, while information content is the impact of the individual event on complexity. The former measure has been used to evaluate the complexity of a signal that may have several levels, as in recent applications of pattern recognition in areas related to machining such as surface metrology (Ullah et al. 2015) and tool condition monitoring (D’Addona et al. 2017). The latter measure is suitable for problems that focus on one event: in design, the event of interest is that the product meets its specifications.
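As a quick illustration of Eqs. (1) and (2), both measures can be sketched in a few lines of Python; the fair-coin probabilities are a hypothetical example, not data from the paper:

```python
import math

def information_content(p):
    """Information content of one event, Eq. (1): I_i = log2(1 / p_i)."""
    return math.log2(1.0 / p)

def entropy(probabilities):
    """Entropy of a set of mutually exclusive events, Eq. (2):
    H = sum_i p_i * log2(1 / p_i), i.e. the average information content."""
    return sum(p * math.log2(1.0 / p) for p in probabilities)

# A fair coin toss: two mutually exclusive events with p = 0.5 each
print(information_content(0.5))   # 1.0 bit for either outcome
print(entropy([0.5, 0.5]))        # 1.0 bit on average
```

For a single event of interest, as in the design setting discussed above, Eq. (1) is used directly rather than the entropy of the whole event set.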

When machining a part, the information content is the minimum amount of information that can be provided to satisfy a specification. For an individual machined feature, Wilson (1980) associates the probability pi of such an event with the tolerance Ti divided by the nominal dimension Di; this is the ratio of the favourable cases (the actual dimension is within the tolerance) to all the possible cases (the actual dimension is anywhere between zero and the nominal). Accordingly, the information content Ii of the specification for the feature is

$$ I_i = \log_2 \frac{D_i}{T_i} $$
(3)

The information content-based complexity as a function of dimensions and tolerances is proposed in Muter (1993) and Hoult and Meador (1996) for the estimation of the cycle time of some manufacturing processes including turning and milling. This measure is shown to capture the combined effects of part size and shape, which makes it a potentially better predictor of machining time than the purely geometric complexity measures proposed in different contexts such as mould/die cost estimation (dimension count, feature count, perimeter/area ratio, etc.). Tests on datasets of industrial cases reveal a good correlation with actual machining times; however, regression models based on complexity as the sole predictor are restricted to specific machining processes, and seem to neglect other factors that may influence machining time, as will be discussed later in the paper.
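Summing Eq. (3) over all toleranced dimensions of a part yields the complexity measure discussed here. A minimal Python sketch, with hypothetical dimension/tolerance pairs chosen only for illustration:

```python
import math

def feature_information(dimension, tolerance):
    """Information content of one dimensional specification, Eq. (3):
    I_i = log2(D_i / T_i)."""
    return math.log2(dimension / tolerance)

def part_complexity(specs):
    """Complexity as total information content: sum of Eq. (3) over all
    (nominal dimension, tolerance) pairs, in consistent units (e.g. mm)."""
    return sum(feature_information(d, t) for d, t in specs)

# Hypothetical part: a 50 mm diameter toleranced to 0.05 mm,
# and a 20 mm length toleranced to 0.2 mm
specs = [(50.0, 0.05), (20.0, 0.2)]
print(round(part_complexity(specs), 2))   # ≈ 16.61 bits
```

Note how a tight tolerance on a small dimension can contribute more complexity than a loose tolerance on a large one, which is why the measure captures precision as well as size.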

An alternative product complexity measure is defined in ElMaraghy and Urbanic (2003); the expression consists of the logarithm of the number of features, corrected using coefficients related to feature patterns and other attributes (shape, tolerances, surface finish). A similar measure is also proposed for the complexity of the manufacturing process (ElMaraghy and Urbanic 2004) and demonstrated in design cases (Urbanic and ElMaraghy 2006). In Budiono et al. (2014a, b), the above formulation is used for estimating machining time. For this purpose, product complexity is added to another machining-specific complexity index depending on the number of tools and machined sides on the workpiece; the resulting parameter is shown to be correlated with the cutting time for roughing operations.

Other manufacturing complexity indices are defined in Kerbrat et al. (2010) for CNC machining of injection moulding tools. They are evaluated from several mould component attributes such as the outside dimensions, the size of the required end mill, the blank volume, and the removed volume. Index values are displayed using colour maps on digital models of the mould components, but their possible use is also envisaged for estimating the machining hours required for the mould.

The measure of complexity as information content has been used in this work to develop a parametric model of machining time. A novel contribution compared to the above cited results is the attempt to combine the complexity with other predictors selected by statistical analysis of machining data spanning different processes (turning, milling, drilling, grinding). As described below, the analysis has led to the proposal of a multiple regression model with improved accuracy, while retaining reasonable convenience for the purposes of design evaluations.

Problem definition and assumptions

The problem to be solved is described below, specifying its objectives (input and output), and discussing some requirements and basic choices.

Input

It will be assumed that the machining time is to be estimated for a part specified in a technical drawing. The part may be simply an intermediate or candidate design, and its representation may be limited to a layout sketch, where the missing specifications can be retrieved from design notes or technical standards.

Different types of design data may possibly influence the machining time. General specifications for the part include the material, the outside dimensions, and a coarse description of shape (prismatic or rotational) and the machining blank (casting, forging, rolled stock). Detailed specifications for each machined feature include the type (flat, cylindrical, rotational with complex profile, thread, gearing, etc.), the associated dimensions (e.g. diameter and depth for a hole), and the tolerances.

Some assumptions will be made on the above specifications. The types of features are limited to those including a small number of dimensions, with the exclusion of free-form surfaces. Tolerances are only associated with dimensions, with the exclusion of geometric tolerances; as the only exceptions, position or profile tolerances are converted into equivalent dimensional tolerances on the basic dimensions that define feature shape or location. Surface roughness will not be explicitly considered in time estimation, assuming that tolerances give enough information to understand the machining requirements.

Output

The result of the estimation is the cycle time of the machining process (floor-to-floor time), not broken down into operations as these are not necessarily known at the design stage. The estimate is reported as an expected value with a stated uncertainty. A designer should interpret the estimate as what could be achieved by means of a machining process that meets the productivity criteria of medium-to-high production volumes (at least thousands of units). The following assumptions are also made regarding the machining process:

  • The workpiece has limited size and mass (within 300–400 mm and 15–20 kg).

  • The equipment consists of CNC machine tools capable of complex sequences of operations with automatic tool changes. They include turning centres for rotational parts, and machining centres for prismatic parts. Additional machine tools may be needed for special operations such as gear hobbing, grinding or slotting.

  • Machining operations use tools with a reasonable compromise between productivity and cost (generally with coated carbide inserts), and vendor-recommended cutting parameters.

The cycle time t includes several situations occurring during the machining of a workpiece. In more detailed estimation methods, these should be explicitly accounted for in an expression like the one below, adapted from Creese et al. (1992):

$$ t = t_C + t_N + t_H + \frac{t_S}{q} $$
(4)

where

  • tC is the cutting time, during which the machine actually removes material from the workpiece.

  • tN is the non-productive time, mostly spent in operations on the cutting tools (tool engagement and return, indexing, tool change), including possible allowances for inspection and operator fatigue (neglected here, based on the above assumptions on process automation).

  • tH is the handling time, corresponding to loading and unloading the workpiece on/from the machine.

  • tS is the setup time, including programming and preparation of the machine (loading and unloading of fixtures and tools, first-article machining and inspection).

  • q is the batch size, i.e. the number of workpieces produced between two consecutive setups.

According to the above assumptions on the machining process, the setup time tS will not be considered as its contribution to cycle time would be negligible due to the large batch size q.
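Eq. (4) and the effect of batch size can be sketched as follows; the time values and batch size are hypothetical, chosen only to show how the setup contribution vanishes for large batches:

```python
def cycle_time(t_c, t_n, t_h, t_s=0.0, q=1):
    """Cycle time per Eq. (4): t = t_C + t_N + t_H + t_S / q, where
    t_c = cutting time, t_n = non-productive time, t_h = handling time,
    t_s = setup time, q = batch size (all times in minutes)."""
    return t_c + t_n + t_h + t_s / q

# Hypothetical part: 12 min cutting, 3 min non-productive, 1.5 min handling,
# 60 min setup amortized over a batch of 5000 units
print(cycle_time(12.0, 3.0, 1.5, t_s=60.0, q=5000))   # ≈ 16.512 min
print(cycle_time(12.0, 3.0, 1.5))                     # 16.5 min, setup neglected
```

With q in the thousands, the setup term contributes only a few hundredths of a minute, which justifies dropping it from the parametric model.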

The uncertainty regarding machining time only includes the estimation error due to the limited set of variables used in the predictive model: parts with equal estimates could have different machining times in the actual process or when more detailed estimation methods are used. In practice, an additional uncertainty would be related to the machining choices made by a company. Among these, a prominent role is played by the selection of cutting parameters (cutting speed, feed, depth of cut), which may have a dramatic impact on cutting times. Actually, engineers in different companies choose different parameters for several reasons. Tool vendors recommend different choices of parameters based on their catalogues of tool materials and geometries; workshops may deviate from those recommendations in the attempt to control tool life according to their specific needs. Furthermore, the selection is subject to several constraints relating to available machines (power, torque, vibration control). The problem is coupled since each cutting parameter influences multiple selection requirements (reduced cost or environmental impact; increased productivity, accuracy, surface finish, etc.) unless selection strategies based on axiomatic design are adopted, e.g. Ullah et al. (2009).

A further uncertainty is related to any downtime or unpredictable delay that may occur at an operational level. While a contingency factor may have to be applied to the time estimate for some applications (e.g. bidding or quotations), these issues are of limited importance when the estimate is used to compare design alternatives.

Requirements and basic choices

Ease of use is the chief requirement for an estimation method to be used in product design. The machining time must be readily calculated from the input data described above, without the need for process planning or any complex reasoning. This would also apply if the estimation procedure were implemented in a software tool used to speed up the calculation or extract the data from a digital drawing or 3D model. Although such a tool could include process planning algorithms, the development effort would probably not be worth the objective of improving early cost estimation.

Keeping the procedure simple imposes a compromise in terms of accuracy. However, errors on the order of ±30% may be acceptable in preliminary design: the expected time could still allow comparisons, while the upper limit of the prediction interval would be a safer choice for bidding or make-or-buy decisions.

Based on these requirements, the machining time and its statistical uncertainty will be estimated through a parametric model. This raises the problem of choosing which variables to include as predictors in the model. A recent trend in similar tasks is the use of deep learning techniques on large datasets, with the aim of automatically identifying the input variables of a metamodel. In machining time estimation, such an approach might be applied to historical data collected in CNC machining workshops, where cycle times estimated by manufacturing engineers or simulation software (or possibly measured on the shop floor) are associated with design specifications for machined parts. Considering the difficulty of retrieving a sufficient amount of data for such learning techniques, a standard linear regression analysis is preferred in this paper. Besides requiring fewer machining cases, this choice makes the evaluation of model uncertainty easier, and minimizes possible noise factors related to differences in estimation procedures and production settings at the companies providing the data.

The predictors will have to be calculated from the input data, and selected from those with the strongest influence on the response. For this purpose, it can be noted that the three main elements of machining time are influenced by different attributes of the part:

  • The cutting time depends on the total extent of the machined features (which determine the average amount of work needed), and also on the types of features, the tolerances and the material (which determine the relative difficulty of the work).

  • The non-productive time depends mainly on the number of machined features (which determine the number of individual operations required).

  • The handling time depends almost exclusively on the outside dimensions of the part, regardless of its machined features.

These considerations help to understand the strengths and weaknesses of the complexity measure as a possible predictor of machining time. The complexity is the sum of logarithmic terms (3) associated with the dimensions of machined features. Each term represents the size of a feature and the required machining precision, and therefore should influence the cutting time. The sum of the logarithms reduces the size effect of the dimensions and gives more weight to their number, thus influencing the non-productive time. On the other hand, the complexity should have little influence on the handling time, especially for cast or forged workpieces that may be machined on a small fraction of their surface. Furthermore, the complexity does not take into account the types of machined features and the material, which can lead to significant variations in the cutting time.

Consequently, it is unlikely that the machining time can be estimated on the basis of complexity only, unless specific part families are considered as in previous studies. The analysis reported below selects additional predictors that can improve the accuracy of the estimation without requiring information not readily available at the design stage.

Method

The parametric method has been developed from a sample of cases. Each case consists of a machined part, for which a baseline machining time was estimated using a feature-based method from literature. A set of candidate predictors was also evaluated from the specifications of each part. The statistical analysis of the collected data made it possible to fit the sample data by means of a regression model with an optimal set of predictors.

Dataset

The sample includes 80 parts, the drawings for which were collected from various sources. They come from 11 mechanical assemblies including piping and fluid machinery (reciprocating compressor, gear pump, plug valve, globe valve), transmissions (gear reducer, universal joint), tools (circular saw, stamping die, screw vice), and various mechanisms (vehicle suspension, hydraulic jack).

General data for the parts is listed in Table 1. The sample covers size ranges in excess of the limits assumed for the estimation, as well as various materials and all types of basic shapes and blanks considered in the study. Although no statistical sampling was done on these properties, it is believed that the parts are sufficiently representative of a wider range of machine components.

Table 1 Sample part data

Baseline estimate

For each part of the sample, the machining time was estimated using a baseline method that is thought to be more accurate than the one being developed. Due to the high number of parts, this method was also required to avoid overly detailed calculations. This ruled out engineering build-up estimates based on process planning and selection of cutting parameters. The feature-based method described by Boothroyd et al. (2011) was chosen as it is based on simple cost parameters, the values of which can be evaluated for a wide range of feature types. Moreover, it reflects state-of-the-art machining technology and is sufficiently proven in real cases, thus allowing partial validation of the parametric method.

The feature-based method calculates the cutting time tC [min] from removal rates defined in relation to volume, area or length depending on the type of feature and operation; the following criteria apply:

  • For a finishing or grinding operation on a generic feature:

    $$ t_{C} = \frac{A}{{Q_{A} }} $$
    (5)

    where A [cm2] is the machined area, and QA [cm2/min] is the area-based removal rate.

  • For a roughing operation on a generic feature:

    $$ t_{C} = \frac{{V_{m} }}{{Q_{V} }} = \frac{A \cdot a}{{Q_{V} }} $$
    (6)

    where Vm [cm3] is the removed volume, and QV [cm3/min] is the volume-based removal rate. The volume can be replaced by the machined area A if the machining allowance a [cm] is evaluated for each feature according to the difference between blank size and final feature dimensions.

  • For an end-milling operation on a contour or groove:

    $$ t_{C} = \frac{L}{{Q_{L} }} $$
    (7)

    where L [cm] is the length of the machined feature, and QL [cm/min] is the length-based removal rate.
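The three criteria reduce to simple ratios, as in the following sketch (Python; the function and argument names are illustrative and not part of the original method):

```python
def t_cut_finish(area_cm2, q_area_cm2_min):
    # Eq. (5): finishing or grinding, area-based removal rate
    return area_cm2 / q_area_cm2_min

def t_cut_rough(area_cm2, allowance_cm, q_vol_cm3_min):
    # Eq. (6): roughing, removed volume approximated as A * a [cm^3]
    return area_cm2 * allowance_cm / q_vol_cm3_min

def t_cut_endmill(length_cm, q_len_cm_min):
    # Eq. (7): end milling of a contour or groove, length-based rate
    return length_cm / q_len_cm_min
```

For example, rough machining a feature with A = 100 cm2 and a = 0.3 cm at QV = 60 cm3/min takes 30/60 = 0.5 min.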

The removal rates provided in Boothroyd et al. (2011) do not explicitly consider the effect of the tolerances specified for the machined features. Therefore, a further level of detail was added by recalculating the removal rates for the most common types of operations. For each operation, the recommended ranges for the cutting parameters from tool catalogues were mapped to the allowable tolerance ranges. For example, the removal rate QV in a rough-turning operation is

$$ Q_{V} = v \cdot d \cdot f $$
(8)

where v [m/min] is the cutting speed, d [mm] is the depth of cut, and f [mm/rev] is the feed. Recommended ranges of these three parameters for mild steel are v = 110–160 m/min, d = 2–4 mm, f = 0.2–0.4 mm/rev. They match the range of ISO tolerance grades between IT10 and IT13 that is commonly allowed for this type of operation. This results in a correspondence between removal rates and tolerance grades as shown in Table 2.
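The unit choice is convenient: with v in m/min and d, f in mm, Eq. (8) yields QV directly in cm3/min. A minimal sketch using the parameter ranges quoted above (the pairing of the slowest settings with the tightest grade is an assumption for illustration):

```python
def q_volume(v_m_min, d_mm, f_mm_rev):
    # Eq. (8): volume-based removal rate for rough turning.
    # With v [m/min], d [mm], f [mm/rev], the product comes out
    # directly in cm^3/min (1 m/min * 1 mm^2 = 1000 mm^3/min = 1 cm^3/min).
    return v_m_min * d_mm * f_mm_rev

# Endpoints of the recommended ranges for mild steel quoted above:
q_min = q_volume(110, 2, 0.2)   # slowest settings (tightest grade, IT10)
q_max = q_volume(160, 4, 0.4)   # fastest settings (coarsest grade, IT13)
```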

Table 2 Example of calculation of the removal rate

Similar mappings for other cases made it possible to get the list of parameters shown in Table 3. Each parameter is an approximate removal rate for mild steel under a given combination of feature, operation, and tolerance.

Table 3 Parameters for feature-based estimation of machining time

The machining time t [min] of a part made of any material is estimated as

$$ t = K_{M} t_{C} + t_{N} + t_{H} $$
(9)

where

  • The cutting time tC is calculated from the above equations and parameters for mild steel.

  • The correction factor KM is related to the material. Rough values assumed here include 1 for mild steel, 1.3 for cast iron and medium-carbon steel, 1.5 for stainless steel, 2 for alloy steel, 0.5 for copper alloys, and 0.3 for aluminum alloys. More detailed evaluations could be made according to the machinability ratings available from various sources, e.g. Machinability Data Center (1980) and Drozda et al. (1983).

  • The non-productive time tN is calculated as a constant time per operation (0.1 min).

  • The handling time tH is calculated from the mass of the part (0.5, 0.75, 1, and 1.5 min for a part whose mass is less than 0.2, 5, 15, and 25 kg, respectively).
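The three terms of Eq. (9) can be combined in a short routine (a sketch under the stated assumptions; in particular, the non-productive time is assumed to be summed over the NO operations):

```python
import bisect

def handling_time(mass_kg):
    # t_H [min] from the mass tiers in the text: 0.5/0.75/1/1.5 min for
    # parts lighter than 0.2/5/15/25 kg respectively (heavier parts
    # fall outside the range covered by the method)
    thresholds = [0.2, 5.0, 15.0, 25.0]
    times = [0.5, 0.75, 1.0, 1.5]
    i = bisect.bisect_right(thresholds, mass_kg)
    if i == len(times):
        raise ValueError("mass outside the range covered by the method")
    return times[i]

def machining_time(t_cut, n_ops, mass_kg, k_material=1.0):
    # Eq. (9): t = K_M * t_C + t_N + t_H, with t_N assumed to total
    # 0.1 min per operation over n_ops operations
    return k_material * t_cut + 0.1 * n_ops + handling_time(mass_kg)
```

For instance, a 3-kg cast-iron part with tC = 5 min and four operations gives t = 1.3·5 + 0.4 + 0.75 = 7.65 min.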

Evaluation of predictors

The following candidate predictors were selected for the parametric model of machining time, and evaluated for each part of the sample:

  • The part’s volume V [dm3], calculated as mass divided by material density.

  • The envelope volume VE [dm3], i.e. the product of the outside dimensions along three reference axes.

  • Identification of the material, equal to the KM correction factor of the feature-based method.

  • Two categorical variables related to the general shape and the type of blank (from Table 1).

  • The number ND of the dimensions Di created through machining operations (i.e. excluding those possibly pre-existing on the initial casting or forging).

  • The sum Dtot of the values of machined dimensions:

    $$ D_{tot} = \sum\limits_{i = 1}^{{N_{D} }} {D_{i} } $$
    (10)
  • The approximate number NO of machining operations. As a detailed process plan is not available, this is calculated by counting at least one operation for each machined feature. Additional operations are counted considering special features on the drawing (threads, gearings), finishing operations for all features with a tolerance grade within a given limit (IT10 for cylindrical surfaces and profiles, IT9 for holes), and grinding operations for all features with a tolerance grade within a tighter limit (IT7 for planes, IT6 for cylindrical surfaces or profiles) or in the presence of machining notes on the drawing.

  • The total area Atot of the machined features.

  • The complexity C of the part according to its definition (3) as information content [bits]:

    $$ C = \sum\limits_{i = 1}^{{N_{D} }} {\log_{2} \frac{{D_{i} }}{{T_{i} }}} $$
    (11)

    where the tolerance Ti associated with dimension Di may be either explicitly specified in a dimension callout or calculated from the IT tolerance grade according to ISO standards (EN ISO 2010).
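The dimension-based predictors (10) and (11) are straightforward to evaluate once the machined dimensions and their tolerances are tabulated; the following sketch uses hypothetical dimension/tolerance pairs:

```python
import math

def sum_dimensions(dims):
    # D_tot, Eq. (10); dims is a list of (D_i, T_i) pairs
    return sum(d for d, _ in dims)

def complexity_bits(dims):
    # Information content C [bits], Eq. (11): sum of log2(D_i / T_i)
    return sum(math.log2(d / t) for d, t in dims)

# Hypothetical machined dimensions and tolerances [mm]
dims = [(64.0, 1.0), (32.0, 1.0)]
```

Here C = log2 64 + log2 32 = 11 bits, and Dtot = 96 mm.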

The parametric model is a linear regression equation that estimates the machining time t of a part as a function of one or more predictors to be selected from the above list. The equation is associated with an estimation error, which determines a prediction interval for the actual machining time for a new part.

Results

Table 4 shows the baseline time estimates and the predictor values calculated for the sample parts.

Table 4 Predictors and response for the parts of the sample

The error of a linear regression model should have a normal distribution with consistent parameters along the ranges of its variables. However, the machining time does not have a linear relationship with any of the predictors; its variation has a skewed distribution and increases with the values of the predictors. As an example, the graph in Fig. 1a shows the time as a function of complexity with a least-squares trend line and a 90% prediction interval. In Fig. 1b, a logarithmic transformation of both the response and the predictor yields a linear relationship and symmetric, uniform errors.

Fig. 1
figure 1

Machining time as a function of complexity: a linear scale; b log–log scale

Therefore, a suitable choice for the model is

$$ \log t = c_{0} + c_{1} \log x_{1} + \cdots + c_{n} \log x_{n} $$
(12)

where xi are the predictors, ci are the regression parameters, and n is the number of predictors. Once the parameters have been estimated, the model can be back-transformed into product form:

$$ t = kx_{1}^{{c_{1} }} \ldots x_{n}^{{c_{n} }} , \, k = 10^{{c_{0} }} $$
(13)
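The transformation can be illustrated on synthetic data: a power law with multiplicative noise becomes a straight line in log–log space, and ordinary least squares recovers the exponent and the multiplier k. All numbers below are made up for illustration:

```python
import numpy as np

# Synthetic power-law data t = k * x^c with multiplicative noise
rng = np.random.default_rng(0)
x = rng.uniform(1, 100, size=80)
t = 0.5 * x**0.8 * 10**rng.normal(0.0, 0.05, size=80)

# Fit log t = c0 + c1 log x by ordinary least squares (Eq. (12))
X = np.column_stack([np.ones_like(x), np.log10(x)])
(c0, c1), *_ = np.linalg.lstsq(X, np.log10(t), rcond=None)
k = 10**c0   # back-transformed multiplier of the product form (13)
```

The fitted c1 and k should lie close to the generating values 0.8 and 0.5.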

A proper set of predictors must be selected for the model. It would be wrong to overfit the data by including all the candidate predictors regardless of their statistical significance; while minimizing the error on the sample, this would not guarantee the same accuracy on further cases. Moreover, an equation with too many variables would be of little help in understanding the individual effect of each of them.

In a first attempt to reduce the model to just one predictor, complexity would be an obvious candidate according to the results of previous studies. However, it is apparent from Fig. 1 that time has a limited correlation with complexity, as was to some extent expected from the discussion in subsection 3.3. Other geometric predictors are better correlated with the response, as can be seen from the Pearson correlation coefficients r listed in Table 5. The variables related to part size (V and VE) seem to be the most suitable choices for a single predictor, probably because they influence all the elements of machining time (cutting, non-productive, handling). The machined area Atot also correlates well, which can be explained by its influence on cutting time. The remaining variables (ND and NO) are even less correlated than complexity.

Table 5 Correlations of log t with the geometric predictors

For a multiple regression model, the predictors should ideally be independent of one another. If possible, each predictor should influence a different time element. To aid the choice, Table 6 shows the Pearson correlation coefficients between pairs of geometric predictors. Complexity C has little correlation with all the variables except the sum of dimensions Dtot; both have an influence on cutting and non-productive times. The remaining variables are all highly correlated with one another but they influence different time elements: the machined area Atot is related to the required amount of machining work (regardless of its difficulty), while the volumes V and VE are especially related to the handling time. Based on these considerations, a good choice of predictors could include C, VE (or V), and Atot if they proved to make a statistically significant contribution.

Table 6 Pairwise correlations between geometric predictors

The final selection of the predictors was made by a stepwise regression procedure, which sequentially adds the predictors with the highest residual contribution, and removes those with non-significant contribution. The candidate set of predictors included the variables related to the size of the workpiece (V, VE), the area of machined features (Atot), the dimensions (C, Dtot), and the material (KM), as well as the categorical predictors related to shape and blank. The best model expresses the machining time t [min] as

$$ t = 0.29 \cdot C^{0.37} \cdot A_{tot}^{0.26} \cdot V_{E}^{0.10} \cdot K_{M}^{0.15} $$
(14)

after back-transformation from a logarithmic model with standard error s = 0.136 and coefficient of determination R2 = 88.7%. As confirmed by the analysis of variance in Table 7, the four predictors are all statistically significant and their relative contributions rank in the same order in which they appear in the equation. The residuals do not show significant deviations from the underlying assumptions of linear regression: they are normally distributed (p-value = 0.881 in the Anderson–Darling test), and have no systematic trends in relation to the fitted values, the predictor values, and the order of data collection.
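The forward part of such a stepwise procedure can be sketched as a greedy search on R² gain (a simplification: the actual procedure also removes predictors that lose significance, and uses statistical significance tests rather than the fixed gain threshold assumed here; the demo data are synthetic):

```python
import numpy as np

def r2(X, y):
    # Coefficient of determination of an OLS fit with intercept
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def forward_select(cols, y, min_gain=0.01):
    # Greedy forward selection by R^2 gain; cols maps predictor
    # names to 1-D arrays of (log-transformed) values
    chosen, best = [], 0.0
    while True:
        gains = {n: r2(np.column_stack([cols[c] for c in chosen + [n]]), y) - best
                 for n in cols if n not in chosen}
        if not gains:
            break
        name, gain = max(gains.items(), key=lambda kv: kv[1])
        if gain < min_gain:
            break
        chosen.append(name)
        best += gain
    return chosen

# Demo on synthetic data: y depends on 'a' and 'b' but not on 'c'
rng = np.random.default_rng(1)
cols = {n: rng.normal(size=80) for n in ("a", "b", "c")}
y = 2 * cols["a"] + cols["b"] + rng.normal(scale=0.05, size=80)
selected = forward_select(cols, y)   # expected: ['a', 'b']
```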

Table 7 Analysis of variance in the regression model of logt

For the sake of comparison, another regression model was fitted to the data using just two predictors related to part size and material:

$$ t^{\prime} = 8.5 \cdot V_{E}^{0.32} \cdot K_{M}^{0.20} $$
(15)

In its logarithmic form, the model has s = 0.176 and R2 = 80.6%; the residuals have a significant deviation from normality (p = 0.007) but no apparent trends in relation to fits and predictors. The predictive abilities of the two models can be compared considering their standard errors. Due to the high number of degrees of freedom of the error, the 90% prediction interval on log t is nearly equal to ± 1.67 s. Therefore t is estimated in an interval limited by the back-transformed regression value multiplied by factors 10−1.67 s and 101.67 s. The model including C and Atot brings s from 0.176 to 0.136, which corresponds to a reduction of the uncertainty from about (− 50/+ 100%) to about (− 40/+ 70%) for the estimate of machining time.
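The uncertainty figures quoted above follow directly from the standard errors; a quick check (illustrative helper, not part of the original analysis):

```python
def interval_factors(s, z=1.67):
    # 90% prediction-interval multipliers on t for a log10-model
    # with standard error s: [t_hat * 10**(-z*s), t_hat * 10**(z*s)]
    return 10**(-z * s), 10**(z * s)

lo_full, hi_full = interval_factors(0.136)   # model (14): about -41%/+69%
lo_simp, hi_simp = interval_factors(0.176)   # model (15): about -49%/+97%
```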

The different prediction intervals are also shown in Fig. 2, which compares the feature-based estimates (9) to the parametric estimates with the two regression models (14) and (15). The full model (Fig. 2a) is visibly more accurate than the simplified one (Fig. 2b). While the absolute error of both models increases with the estimated time, it has been verified that the percentage error is fairly uniform: the mean and standard deviation of its absolute value are about 25 ± 20% for the full model and 40 ± 35% for the simplified one.

Fig. 2
figure 2

Parametric vs feature-based estimates: a model t(C, Atot, VE, KM); b model t′(VE, KM)

Discussion

The parametric model can be put in a more convenient form for practical use. It can be observed that a reference time of 10 min corresponds approximately to a reference set of predictor values: C = 200 bits, Atot = 500 cm2, VE = 1.5 dm3, KM = 1. Accepting a further 5% uncertainty of the estimate, the regression Eqs. (14) and (15) become

$$ t = 10{\text{ min}} \cdot \left( \frac{C}{200} \right)^{0.37} \left( {\frac{{A_{tot} }}{500}} \right)^{0.26} \left( {\frac{{V_{E} }}{1.5}} \right)^{0.10} \left( {\frac{{K_{M} }}{1}} \right)^{0.15} $$
(16)
$$ t^{\prime} = 10{\text{ min}} \cdot \left( {\frac{{V_{E} }}{1.5}} \right)^{0.32} \left( {\frac{{K_{M} }}{1}} \right)^{0.20} $$
(17)

The dimensionless factors in brackets may help explain why the time deviates from its reference value of 10 min. They can be traced back to design choices, and give a feel for what contributes most to the reduction of machining time and cost.

The example in Fig. 3 demonstrates the application of the proposed method. The part is a base of a rotary compressor, and is machined all over from a 1.5-kg grey iron casting. The machining process includes two setups on a CNC machining centre with milling, drilling, reaming and tapping operations.

Fig. 3
figure 3

Example part

Table 8 shows the estimation of the machining time of the part using the feature-based method. Each machined feature is associated with an area A and a possible machining allowance a, which are calculated from the dimensions of the feature and the precision expected for the casting. These data determine the cutting time of the feature by applying a removal rate that is set from the IT grade of the feature’s main dimension (from either dimension callouts or assumed general tolerances). The baseline estimate (9) of machining time is 8.6 min, corresponding to an operating cost of 8–9 € for a shop rate of about 60 €/h.

Table 8 Baseline estimation of the machining time for the example

The following predictors are evaluated for the part:

  • Complexity: C = 196 bits → C/200 = 0.98.

  • Machined area: Atot = 320 cm2 → Atot/500 = 0.64.

  • Envelope volume: VE = 0.43 dm3 → VE/1.5 = 0.29.

  • Material factor: KM = 1.3 → KM/1 = 1.3.

Table 9 details the calculation of the complexity measure. The dimensions Di, the tolerances Ti, and the numbers ni of equal features are considered for all machined dimensions (underlined in Table 8). Considering possible feature patterns, the contribution of each dimension to the complexity is calculated as Ci = ni log2 Di/Ti.

Table 9 Calculation of complexity for the example

The parametric estimate (16) of machining time is

$$ t = 10 \cdot 0.98^{0.37} \cdot 0.64^{0.26} \cdot 0.29^{0.10} \cdot 1.3^{0.15} = 10 \cdot 0.99 \cdot 0.89 \cdot 0.88 \cdot 1.04 = 8.1{\text{ min}} $$

and is fairly close to the baseline estimate (− 6% error). The factors in the equation suggest that, compared to a 10-min standard, the machining time is 12% less due to the small part size, and 11% less due to the small machined area. The material gives a 4% increase in machining time, while the complexity has a neutral influence (1% decrease). The model (17) based only on material and part size would give

$$ t^{\prime} = 10 \, \cdot 0.29^{0.32} \cdot 1.3^{0.20} = 7.1{\text{ min}} $$

with a higher error than the full model (− 17%). It would seem that there is some advantage in using complexity as a predictor, although a single example obviously cannot be taken as confirmation of this claim. After all, both models happen to perform better than can be expected on average from the proposed parametric method. The real advantage of the full model is in its sensitivity to design choices. This can be seen when evaluating the effects of two possible changes to part design:

  • Redesign of the flange into a cylindrical shape, so that the part could be turned from mild-steel round stock: the baseline machining time is now 12.0 min due to the higher amount of material to be removed. The machined area and the envelope volume increase (Atot = 963 cm2, VE = 0.78 dm3), while the complexity and the material factor decrease slightly (C = 184 bits, KM = 1). As a result, the parametric estimate goes to t = 10.8 min (− 10% error), compared to just t′ = 8.1 min (− 32% error) if complexity and machined area are left out of the model.

  • Specification of tighter tolerances on all profile and positional dimensions (IT grades reduced by two units): the machining operations require lower removal rates, which change the baseline estimate of machining time to 9.7 min. The complexity increases to C = 238 bits, and raises the parametric estimate to t = 8.7 min (− 10% error); the simplified model does not pick up any difference, and its estimate is again t′ = 7.1 min (− 27% error).
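The worked example and the two design-change evaluations can be checked numerically with a direct implementation of Eqs. (16) and (17) (a sketch; the predictor values are those given in the text):

```python
def t_full(C, A_tot, V_E, K_M):
    # Eq. (16): t [min] from complexity [bits], machined area [cm^2],
    # envelope volume [dm^3] and material factor
    return 10 * (C/200)**0.37 * (A_tot/500)**0.26 * (V_E/1.5)**0.10 * K_M**0.15

def t_simple(V_E, K_M):
    # Eq. (17): size-and-material model
    return 10 * (V_E/1.5)**0.32 * K_M**0.20

# Base design (grey iron compressor base)
t0 = t_full(C=196, A_tot=320, V_E=0.43, K_M=1.3)    # ~8.1 min
t0s = t_simple(V_E=0.43, K_M=1.3)                   # ~7.1 min

# Cylindrical redesign turned from mild-steel round stock
t1 = t_full(C=184, A_tot=963, V_E=0.78, K_M=1.0)    # ~10.8 min
t1s = t_simple(V_E=0.78, K_M=1.0)                   # ~8.1 min

# Tighter tolerances (IT grades reduced by two units)
t2 = t_full(C=238, A_tot=320, V_E=0.43, K_M=1.3)    # ~8.7 min
t2s = t_simple(V_E=0.43, K_M=1.3)                   # ~7.1 min
```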

The results suggest that complexity alone is not a sufficiently accurate predictor of machining time when a broader range of machining processes is involved. In fact, it is mainly correlated with the number and relative difficulty of machining operations, but not with the amount of cutting implied by the size of the machined features; moreover, it is related to non-productive operations but not to part handling or material properties. The addition of three more predictors has made it possible to build a multiple regression model with average errors of around 25% of the estimated times compared to a feature-based method used as a baseline. It is believed that such an accuracy is sufficient to drive several types of design activities (benchmarking, make-or-buy, design for manufacturing), considering the reduced calculation effort compared to more detailed estimation methods.

In the attempt to avoid the need for detailed process information, the parametric method is obviously subject to an additional source of uncertainty related to shop-floor conditions, e.g. the flexibility and the performance of available machine tools. The estimates provided by the regression model should always be taken as referring to state-of-the-art equipment and tooling for medium-volume production. Looking ahead, a case in which the parametric method is found to be in strong disagreement with more accurate methods might even suggest that some redesign or alternative process routes are advisable.

Conclusions

In summary, the parametric method proposed for the estimation of machining time can be justified with the following considerations:

  • Compared to existing parametric methods, it is neither limited to particular types of products (e.g. rotational parts) nor bound to a common formulation for a wide variety of processes, while providing an adequate compromise in terms of estimation accuracy.

  • Compared to feature-based methods, it does not need a separate estimation of machining time from removal rates for each part feature.

  • Compared to engineering build-up estimation or hybrid methods, it does not need detailed process planning or selection of cutting parameters.

  • Compared to advanced methods (e.g. neural networks), it needs smaller datasets in which the consistency in the estimation assumptions can be better controlled.

Future developments will try to overcome some limitations of this work. The primary objective will be to reduce the estimation error while keeping the procedure simple and fast enough. In the search for further predictors, the concept of entropy will possibly be investigated as an alternative to information content. The integration of machine learning methodologies could also help reduce random errors, while calibration to suit industrial cases will make it possible to reduce systematic errors for use in specific companies. This will also allow a real validation of the method, which is currently limited to a comparison with a more accurate method from literature. The applicability of the parametric method will also have to be improved by extending the complexity measure to modern standards of geometric tolerancing, as well as by a software implementation that can automate the calculation of predictors and provide effective visualizations of the results.