Developing and training artificial neural networks using bootstrap data envelopment analysis for best performance modeling of sawmills in Ontario

Shashi K. Shahi (Department of Finance and Operations, Faculty of Management, Laurentian University, Sudbury, Canada)

Mohamed Dia (Department of Finance and Operations, Faculty of Management, Laurentian University, Sudbury, Canada)

Peizhi Yan (Department of Computer Science, Lakehead University, Thunder Bay, Canada)

Salimur Choudhury (Department of Computer Science, Lakehead University, Thunder Bay, Canada)

Journal of Modelling in Management

ISSN: 1746-5664

Article publication date: 21 June 2021

Issue publication date: 5 April 2022

Abstract

Purpose

The measurement capabilities of the data envelopment analysis (DEA) models are used to train the artificial neural network (ANN) models for the best performance modeling of the sawmills in Ontario. The bootstrap DEA models measure robust technical efficiency scores and have benchmarking abilities, whereas the ANN models use abstract learning from a limited set of information and provide the predictive power.

Design/methodology/approach

The complementary modeling approaches of the DEA and the ANN provide an adaptive decision support tool for each sawmill.

Findings

The trained ANN models demonstrate promising results in predicting the relative efficiency scores and the optimal combination of the inputs and the outputs for three categories (large, medium and small) of sawmills in Ontario. The average absolute error in predicting the relative efficiency scores varies from 0.01 to 0.04, and for the predicted optimal combination of the inputs (roundwood and employees) and the output (lumber), a large percentage of the sawmills show less than 10% error in the prediction results.

Originality/value

The purpose of this study is to develop an integrated DEA-ANN model that can help in the continuous improvement and performance evaluations of the forest industry working under an uncertain business environment.

Citation

Shahi, S.K., Dia, M., Yan, P. and Choudhury, S. (2022), "Developing and training artificial neural networks using bootstrap data envelopment analysis for best performance modeling of sawmills in Ontario", Journal of Modelling in Management, Vol. 17 No. 2, pp. 788-811. https://doi.org/10.1108/JM2-07-2020-0181

Publisher: Emerald Publishing Limited

1. Introduction

The data envelopment analysis (DEA) is a non-parametric linear programming optimization technique, which is used for measuring the relative operational efficiencies of several decision-making units (DMUs) having several inputs and outputs (Lovell, 1993; Fare et al., 1994). The DEA models compare the inputs and the outputs of the DMUs by establishing an efficiency frontier and by evaluating the efficiency of all DMUs relative to that frontier (Charnes et al., 1978; Banker et al., 1984; Coelli et al., 2005; Cooper et al., 2011). However, the DEA technique has limited usage in decision-making for the industry because of its incapability in predicting either the relative efficiency scores or the optimal combinations of the inputs and the outputs under uncertain supply and demand conditions (Mostafa, 2007; Mostafa, 2009; Wang et al., 2013). This study trains artificial neural networks (ANN) for predicting the relative efficiencies of the DMUs and the optimal combinations of the inputs and outputs.

The DEA technique measures three types of efficiencies (the overall technical efficiency [OTE], the pure technical efficiency [PTE] and the scale efficiency [SE]) and allows us to evaluate the performance of the DMUs without specifying the production function. Moreover, when the costs of the inputs and the prices of the outputs are known a priori, the DEA can also find the optimal combination of inputs and outputs by using the costs as relative weights of the inputs and the prices as relative weights of the outputs. However, the costs of inputs and the prices of outputs are highly sensitive and keep changing based on the uncertain supply and demand conditions. In addition, other factors that can also influence the cost and price sensitivity include changes in the economic environment, competition, quality and service. With changing costs of inputs and prices of outputs, the optimal combinations of the inputs and outputs also change, which in turn changes the relative efficiencies of the DMUs. Therefore, the industry managers need an advanced prediction framework that quickly adapts to the changing market conditions (supply of inputs and demand of outputs), continuously assesses the relative technical efficiencies and computes the optimal combinations of the inputs and the outputs that provide the best performance (McAdam et al., 2008).

Artificial intelligence (AI), which is a branch of computer science that creates intelligent machines with reasoning and problem solving skills, can be used for the prediction of relative efficiencies of the DMUs and the optimal combinations of the inputs and outputs. The AI in this case may be achieved by using ANN, which are modeled based on the human learning paradigm, and acquire knowledge through the iterative learning process and weight adjustment between interconnected neurons. The ANN learns from a limited set of information, known as the training data, and provides nonlinear mapping and predictive power for the test data by searching for weight sets that form the best fit for the observed data sets through generalization (Yi and Thomas, 2009). The simplest type of ANN is a feed forward neural network, wherein the connections between the nodes do not form a cycle, but the information moves only in the forward direction, from the input nodes through the hidden nodes to the output nodes (Schmidhuber, 2015). The ANN consists of multiple layers of computational units, with each neuron in one layer directly connected to the neurons in the subsequent layer.

The ANN models continuously evaluate the relative efficiencies of the DMUs under uncertain demand and supply conditions, and the knowledge developed in this area is regularly shared with the managers and adapted in the industry for strategic decision-making. Therefore, the predictive power of the ANN can help industry managers in continuous improvement and in performance evaluations that make methodological advancements in the uncertain business environment. This study is an extension of the research work by Shahi and Dia (2019a) to help the sawmill managers in predicting the relative efficiencies and the optimal combinations of inputs and outputs by developing and training the ANN models with the results obtained from the DEA models. We develop and train two ANN models in this study. The first ANN model (ANN-1) is trained from the results of the DEA-1 model, which is used to measure the relative efficiencies of the sawmills, using roundwood and number of employees as inputs and lumber produced as output. All five variables (the three input and output variables, viz. roundwood, number of employees and lumber, and the two efficiencies, PTE and OTE) are used to train the ANN-1 model. The trained ANN-1 model then uses the test data of the three input and output variables (roundwood, number of employees and lumber) to predict the two efficiency scores (PTE and OTE). The second ANN model (ANN-2) is trained from the results of the DEA-2 model, which is used to measure the optimal combinations of the inputs and output of the sawmills, using roundwood and number of employees as inputs and lumber produced as output. Three independent ANN-2 (a, b, c) models are trained to predict the optimal value of roundwood (ANN-2a), optimal number of employees (ANN-2b) and optimal value of lumber (ANN-2c). Each of these three ANN-2 models is trained with four variables (the three input and output variables, viz. roundwood, number of employees and lumber, and the optimal value of one of the input or output variables). Although a few studies have used the ANN modeling approach in other industries with promising results (Athanassopoulos and Curram, 1996; Emrouznejad and Shale, 2009; Hsiang-Hsi et al., 2013; Kuo et al., 2012; Kwon, 2014; Kwon et al., 2016), there are no such studies for predicting the DEA efficiency scores and optimal inputs and outputs using the ANN modeling approach in the forest products industry.

The purpose of this study is to develop and train ANN models having performance measurement and prediction capabilities for sawmills in Ontario. The proposed ANN models use the bootstrap DEA (BDEA) as a preprocessor for training, and the subsequent feed forward neural network model conducts the prediction task of relative efficiencies and optimal combination of inputs and outputs for each sawmill in Ontario. The specific objectives of this study are as follows:

to first train the ANN models using results from the BDEA models for performance measurement and prediction capabilities;
to test the predictive capabilities of the trained ANN model for predicting the relative efficiencies of Ontario’s sawmills; and
to use the trained ANN model for predicting the optimal combination of inputs and output for Ontario’s sawmills under uncertain supply and demand conditions.

This paper is organized as follows. The related literature is reviewed in Section 2. Section 3 outlines the BDEA and the ANN modeling approach. Section 4 describes the results of the modeling approach for three categories of sawmills in Ontario. Section 5 offers the concluding remarks and suggestions for future studies.

2. Literature review

Both DEA and ANN techniques have been used in several research areas independently, and both these techniques have their own benefits and drawbacks (Athanassopoulos and Curram, 1996). However, using them together combines the benefits of both DEA (to process the data) and ANN (to perform the predictions) techniques. The integrated DEA-ANN model has been used in many industries including healthcare analytics (Misiunas et al., 2016). The use of integrated DEA-ANN models does not mandate any causality requirement, which makes these models extremely suitable for performance improvement and decision-making in the current environment of demand and supply uncertainty in the forest industry. The non-parametric approach of DEA is a well-known method for measuring the relative efficiencies of the DMUs with multiple inputs and multiple outputs. The advantage of the DEA approach is that it evaluates the efficiency of each DMU in the dataset by comparing it with the efficiency of the other DMUs, while allowing every DMU in the dataset to have its own production function. DEA has been used as a decision analysis tool in several areas, including manufacturing (Wahab et al., 2008; Lu et al., 2013; Lozano, 2014), the chemical processing industry (Pitchipoo, 2012; Sun and Stuebs, 2013), logistics (Xu et al., 2009; Mirhedayatian et al., 2014), telecommunication (Cooper et al., 2001), mining, oil and gas production (Dia et al., 2019; Dia et al., 2018), health care (Jacobs, 2001; Gok and Sezen, 2012; Ferrier and Trivitt, 2013), railways and airports (Feli et al., 2011; Georges Assaf and Gillen, 2012; Adler et al., 2013; Bhanot and Singh, 2014), social enterprise (Dia and Bozec, 2019), incineration plants (Chen et al., 2014) and service industries such as banks and hospitals (Paradi et al., 2011; Paradi and Zhu, 2013; Peng et al., 2013). A recent survey of DEA applications is found in Emrouznejad and Yang (2018).
The applications of DEA for assessing the relative efficiencies and for benchmarking purposes are also found in the forest industry (Shahi and Dia, 2019a, 2019b).

Empirical applications of the DEA for evaluating the relative efficiencies of the DMUs in the forest management sector have been summarized in the literature (Xue et al., 2018).

The measurements of technical efficiency in the forest management sector have been used for evaluating the impact of government policies and forest tenure reforms on the production of social and environmental goods (Diaz-Balteiro and Romero, 2008; Xue et al., 2018). The efficiency and productivity of the wood products manufacturing sector have also been evaluated and summarized (Sowlati, 2005; Salehirad and Sowlati, 2006). It was found that the productivity growth compensates for price increases and enhances competitiveness. It was further found that the technical efficiency directly affects costs, profits and capital investments. The relative technical efficiencies in the forest bio-refinery were also evaluated, and it was concluded that the current forest products industry, with its existing infrastructure offers a suitable platform for being expanded into future integrated forest bio-refineries (Huang et al., 2009). Shahi and Dia (2019a) further improved the application of DEA in the forest industry by using BDEA, which allows the construction of confidence intervals and estimation of robust efficiency scores. They used the BDEA models for analyzing the relative technical efficiency of 125 sawmills in Ontario and found low levels of overall technical and managerial efficiencies in the Ontario’s sawmills over the entire study period. Shahi and Dia (2019b) further used the BDEA model for analyzing the relative efficiencies of 23 pulp and paper mills in Ontario and found low levels of relative efficiencies due to the management of operations as well as scale of operations, especially during the economic downturns. The DEA results are limited to assessing the relative technical efficiencies and cannot be used for prediction purposes by the forest industry managers under uncertain supply and demand conditions. The ANN model acquires knowledge through an iterative learning process from a limited set of information and can provide the predictive power. 
Therefore, the complementary features of the DEA and ANN can be used to build an adaptive decision-making tool for the forest industry mill managers under a fast-changing business environment.

The ANN models, which allow the modeling of nonlinear processes, are used for solving many problems such as image processing and character recognition, classification, pattern recognition, dimension reduction and others (Knoll et al., 2016). This is because the ANN models have the ability to model and extract unseen features and relationships, and unlike other traditional models, ANN models do not impose any restrictions on the input and residual distributions. The ANN models have found widespread applications, ranging from face recognition in social media and cancer detection in healthcare to image processing in agriculture. The ANN models require data of sufficient size and quality, which are used to train the models (Knoll et al., 2016). The ANN models are trained using either supervised learning or unsupervised learning, depending on whether input features of the training data are linked to the labels of the data or are just used for clustering of unlabeled data into different groups (Jain et al., 2000). Further research on the ANN models has led to the development of deep neural networks in the field of deep learning.

The ANN models are being increasingly used in the literature for input-output based performance evaluation and benchmarking techniques (Yi and Thomas, 2009; Hsiang-Hsi et al., 2013; Kwon, 2014; Kwon et al., 2016). In the industry, the most significant application of the ANN models is found in data mining, which includes the processes of data understanding, data preparation and data analysis and knowledge generation (Knoll et al., 2016). The studies exploring predictive potential of the ANN models have been used to:

analyze the effects of total quality management and operational flexibility on hospital performance (Alolayyan et al., 2011);
perform dynamic job shop scheduling (Alpay and Yuzugullu, 2009);
evaluate service quality to outpatients (Carlucci et al., 2013);
investigate engineering performance in construction management (Georgy et al., 2005);
make intermittent demand forecasts (Kourentzes, 2013; Lau et al., 2013);
conduct green supplier selection (Kuo et al., 2012);
examine optimal collaborative benchmarks in a supply chain (Li and Dai, 2009); and
evaluate total duration in project management (Li and Liu, 2012).

The ANN models have been used in a hybrid approach with the DEA models to determine the relative efficiencies, when there are heterogeneous levels of input and output relationships amongst the decision-making units (Samoilenko and Osei-Bryson, 2010). The integrated DEA-ANN models have been used for pre- and post-prediction for performance and efficiencies in supplier evaluations systems (Ozdemir and Temur, 2009). The DEA models have also been used to pre-process data to enforce monotonicity upon the inputs that could be subsequently used for predictions using ANN (Pendharkar and Rodger, 2003). There have been examples of using the ANN model first, and then the outputs are processed through DEA model to rank the predictions from the ANN (Olanrewaju et al., 2012). The hybrid DEA-ANN models have also been used to handle fuzzy data (Hatami-Marbini et al., 2011). However, the integrated DEA-ANN models have not been used in the forest industry for predicting either the relative efficiencies or the optimal combinations of the inputs and outputs. This study fills the gap in the literature by developing the DEA-ANN models for sawmills in Ontario. The performance measurement and prediction of the DEA-ANN models can significantly enhance the managerial decision-making process in the performance evaluation and continuous improvement of the forest industry in Ontario.

3. Integrating bootstrap data envelopment analysis with artificial neural network models

3.1 Bootstrap data envelopment analysis

The CCR model, which estimates the OTE and assumes constant returns to scale, was developed by Charnes, Cooper and Rhodes (Charnes et al., 1978). The dual of the input-oriented CCR model is represented as follows:

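(1) $\min \theta_0 = z_0 - \varepsilon \left( \sum_{i=1}^{m} s_i^- + \sum_{r=1}^{t} s_r^+ \right)$

(2) $x_{ij_0} z_0 - \sum_{j=1}^{n} \lambda_j x_{ij} - s_i^- = 0, \quad i = 1, \dots, m$

(3) $\sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{rj_0}, \quad r = 1, \dots, t$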

where n is the number of DMUs, t is the number of outputs, m is the number of inputs, x_ij is the value of input i for DMU_j, y_rj is the value of output r for DMU_j, and ε is a small positive (non-Archimedean) constant. The parameters λ_j (j = 1,…,n) in equations (2) and (3) identify the benchmark DMUs and define an envelope for the evaluated DMU₀. The parameter θ₀ in equation (1) is the efficiency ratio of the evaluated DMU₀. The parameter z₀ in equations (1) and (2) indicates the proportion of inputs, for an inefficient DMU, needed to produce outputs equivalent to its benchmark DMUs. The parameters s_i⁻ and s_r⁺ in equations (1–3) correspond to the slacks associated with the inputs i and the outputs r, respectively.

The BCC model, which estimates the PTE and assumes variable returns to scale, was developed by Banker, Charnes and Cooper (Banker et al., 1984). The dual of the input-oriented BCC model is obtained by adding the following convexity constraint to equations (1–3):

(4) $\sum_{j=1}^{n} \lambda_j = 1$

The SE is evaluated as the ratio of the OTE and the PTE as in equation (5). The SE measures the extent by which the overall technical efficiencies can be traced back to the whole operations’ scale rather than the management effectiveness and evaluates if the DMU has the optimum scale size and the right amount of resources to operate (Banker et al., 1984):

(5) $\theta_0^{SE} = \dfrac{\theta_0^{CCR}}{\theta_0^{BCC}}$
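To illustrate how equations (1) to (5) can be evaluated in practice, the following Python sketch solves the input-oriented envelopment problem with SciPy's linprog and derives the SE as the ratio of the OTE to the PTE. It is an illustrative sketch rather than the implementation used in this study: the function name dea_input_efficiency, the toy data and the omission of the ε-weighted slack term of equation (1) are assumptions made for brevity.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_efficiency(X, Y, k, vrs=False):
    """Radial input-oriented efficiency of DMU k.

    X has shape (n, m) with inputs, Y has shape (n, t) with outputs.
    vrs=False gives the CCR score (OTE); vrs=True adds the convexity
    constraint of equation (4) and gives the BCC score (PTE).
    The epsilon-weighted slack term is omitted (assumption).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    t = Y.shape[1]

    # Decision variables: [z, lambda_1, ..., lambda_n]; minimise z.
    c = np.zeros(n + 1)
    c[0] = 1.0

    # Inputs: sum_j lambda_j * x_ij - z * x_ik <= 0 for each input i.
    A_in = np.hstack([-X[k].reshape(m, 1), X.T])
    # Outputs: -sum_j lambda_j * y_rj <= -y_rk for each output r.
    A_out = np.hstack([np.zeros((t, 1)), -Y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(m), -Y[k]])

    A_eq, b_eq = None, None
    if vrs:
        # Convexity constraint: sum_j lambda_j = 1.
        A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1)
        b_eq = [1.0]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return float(res.fun)


# Hypothetical toy data: columns of X are (roundwood, employees), Y is lumber.
X_demo = [[100, 10], [200, 30], [150, 12], [400, 45]]
Y_demo = [[50], [80], [70], [180]]

for k in range(len(X_demo)):
    ote = dea_input_efficiency(X_demo, Y_demo, k)            # CCR
    pte = dea_input_efficiency(X_demo, Y_demo, k, vrs=True)  # BCC
    se = ote / pte                                           # equation (5)
    print(f"DMU {k}: OTE={ote:.3f}  PTE={pte:.3f}  SE={se:.3f}")
```

Because the BCC problem is the CCR problem with one additional constraint, the PTE can never be smaller than the OTE, so the SE computed this way always lies between 0 and 1.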

Simar and Wilson (1998) further improved the DEA technique, which is a deterministic approach, by proposing the bootstrapping methodology. The bootstrapping methodology in frontier models allows the construction of confidence intervals and the generation of robust efficiency scores. The bootstrapping methodology simulates the data generating process (DGP), using the Monte Carlo simulation process, and provides robust estimators of the original unknown sampling distribution (Toma et al., 2017). Thus, for the relative efficiencies θ_k (as in equation (1)) in the DEA model, the DGP, P, generates a random sample, χ = {(x_k, y_k) | k = 1,…,n}, which is used to estimate θ_k according to equation (6):

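(6) $\hat{\theta}_k = \min \left\{ \theta \;\middle|\; y_k \le \sum_{i=1}^{n} \gamma_i y_i,\ \theta x_k \ge \sum_{i=1}^{n} \gamma_i x_i,\ \sum_{i=1}^{n} \gamma_i = 1,\ \gamma_i \ge 0,\ \theta \ge 0,\ i = 1, \dots, n \right\}$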

The bootstrap procedure determines P̂ as an estimator of the true unknown DGP generated through the dataset χ. The efficiency estimates lead to a new population, which can be used to create a new dataset χ* = {(x_i*, y_i*) | i = 1,…,n}. This new sample dataset defines the corresponding x* and y*, whose distributions are known since P̂ is known. A Monte Carlo approximation of P̂ is obtained by generating B pseudo-samples χ_b*, b = 1,…,B, each of which yields pseudo-estimates of the relative efficiencies (Simar and Wilson, 1998). The linear programming technique is used to estimate the efficiency θ̂_b of each DMU, using the input-output data (x_k, y_k), where k = 1,…,n. In our study, we run 2000 iterations of this procedure to ensure sufficient convergence of the confidence intervals.
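The final step of the procedure, turning the B pseudo-estimates of one DMU into a robust score and a confidence interval, can be sketched as follows. The sketch assumes that the pseudo-estimates have already been obtained by re-solving equation (6) on each pseudo-sample; the resampling step itself is not shown. It uses the usual bootstrap bias correction (the original estimate minus the estimated bias) and the basic bootstrap interval built from the quantiles of the differences between the pseudo-estimates and the original estimate. Whether these are exactly the formulas applied in this study is an assumption, and the function names and the synthetic pseudo-estimates are hypothetical.

```python
import math
import random


def quantile(values, q):
    """Linearly interpolated q-quantile of a list of numbers."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def bootstrap_summary(theta_hat, pseudo_estimates, alpha=0.05):
    """Bias-corrected score and (1 - alpha) basic bootstrap interval."""
    B = len(pseudo_estimates)
    # Estimated bias: mean of the pseudo-estimates minus the original estimate.
    bias = sum(pseudo_estimates) / B - theta_hat
    corrected = theta_hat - bias
    # Differences between each pseudo-estimate and the original estimate.
    deltas = [t - theta_hat for t in pseudo_estimates]
    lower = theta_hat - quantile(deltas, 1 - alpha / 2)
    upper = theta_hat - quantile(deltas, alpha / 2)
    return corrected, (lower, upper)


# Synthetic stand-in for B = 2000 pseudo-estimates of one DMU (hypothetical).
random.seed(0)
theta_hat = 0.80
pseudo = [min(1.0, theta_hat + abs(random.gauss(0.03, 0.02)))
          for _ in range(2000)]

corrected, (lower, upper) = bootstrap_summary(theta_hat, pseudo)
print(f"bias-corrected score: {corrected:.3f}")
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```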

3.2 Optimum values using data envelopment analysis

Performance evaluation is an important activity for any DMU in identifying its shortcomings in the managerial and technical efficiencies, as well as in devising goals for the optimum values of inputs and outputs that maximize profits. The optimal inputs and outputs for each DMU refer to the fewest inputs that can be used to produce the most outputs, using one of the several production plans. However, the use of the optimal combination of inputs and the production of the optimal combination of outputs depend on the cost of the inputs and the price of the outputs, which assign the relative weights to the inputs and outputs. For example, let w be the vector of m input costs, w ∈ ℝ_+^m, and let p be the vector of t output prices, p ∈ ℝ_+^t. In this situation, we can calculate the cost wx and the revenue py of a given production plan (x, y), and thereby evaluate this production plan using the cost and revenue combination (wx, py). In principle, we calculate the relative efficiency of this aggregated model (wx, py) using DEA in the same way as we did for (x, y), using either variable returns to scale or constant returns to scale (Bogetoff and Otto, 2011).

This way, we can define the cost efficiency CE as the ratio between the minimal cost and the actual cost, CE = wx*/wx, where x* is the optimal minimal-cost input combination found by solving the cost minimization problem. The revenue efficiency RE is defined as the ratio between the maximum revenue and the actual revenue, RE = py*/py, where y* is the optimal revenue-maximizing output combination found by solving the revenue maximization problem. The linear programming DEA optimization problems (cost minimization and revenue maximization) are formulated in the same way as in equations (1–6) above and are solved by the linear programming method in R using the lpSolveAPI package (Bogetoff and Otto, 2011).
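As a hedged illustration of the cost minimization problem, the Python sketch below finds the cheapest input combination x* that can still produce the outputs of a given DMU within the DEA technology, and then reports CE = wx*/wx. It is not the R/lpSolveAPI implementation used in this study: the function name cost_efficiency, the toy data and the input prices are hypothetical, and the textbook cost-minimization formulation (minimize wx subject to the envelopment constraints) is assumed.

```python
import numpy as np
from scipy.optimize import linprog


def cost_efficiency(X, Y, w, k, vrs=False):
    """Cost efficiency of DMU k and its cost-minimising input vector x*.

    X has shape (n, m), Y has shape (n, t), w holds the m input prices.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(w, dtype=float)
    n, m = X.shape
    t = Y.shape[1]

    # Decision variables: [x_1..x_m, lambda_1..lambda_n]; minimise w.x.
    c = np.concatenate([w, np.zeros(n)])

    # Inputs: sum_j lambda_j * x_ij - x_i <= 0 for each input i.
    A_in = np.hstack([-np.eye(m), X.T])
    # Outputs: -sum_j lambda_j * y_rj <= -y_rk for each output r.
    A_out = np.hstack([np.zeros((t, m)), -Y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(m), -Y[k]])

    A_eq, b_eq = None, None
    if vrs:
        # Convexity constraint for variable returns to scale.
        A_eq = np.concatenate([np.zeros(m), np.ones(n)]).reshape(1, -1)
        b_eq = [1.0]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (m + n), method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    x_star = res.x[:m]
    ce = float(w @ x_star) / float(w @ X[k])
    return ce, x_star


# Hypothetical data: inputs (roundwood, employees), one output (lumber).
X_demo = [[100, 10], [200, 30], [150, 12], [400, 45]]
Y_demo = [[50], [80], [70], [180]]
w_demo = [60.0, 500.0]  # assumed unit costs of roundwood and labour

for k in range(len(X_demo)):
    ce, x_star = cost_efficiency(X_demo, Y_demo, w_demo, k)
    print(f"DMU {k}: CE={ce:.3f}  x*={np.round(x_star, 2)}")
```

The revenue maximization problem has the same structure with the roles reversed: the outputs become decision variables, the inputs of the evaluated DMU are held fixed and py is maximized.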

3.3 Artificial neural network models

ANN are effective machine learning tools in pattern recognition and analysis. We leverage the multi-layer feed-forward neural networks to predict the efficiency scores and the optimal DMU inputs and outputs, using two ANN models (ANN-1 and ANN-2). A multi-layer feed-forward neural network is equivalent to a mathematical function f (X, ∅) = Y, where X is the input vector, Y is the output vector, and ∅ is the neural network weights. From the DEA data, the input vector consists of the roundwood consumed by the sawmill, the number of employees working in the sawmill, and the amount of lumber produced by the sawmill. Therefore, the input of the ANN-1 model is a three-tuple:

X = (x_roundwood, x_employees, x_lumber). The ANN-1 model architecture is designed for predicting the efficiency scores (θ_PTE and θ_OTE) (Figure 1). Because the efficiency scores (PTE and OTE) have the same range, {θ | θ ∈ ℝ, 0 ≤ θ ≤ 1}, the ANN-1 model is trained to predict both PTE and OTE simultaneously.

The ANN-2 model architecture is designed for predicting the optimal inputs (roundwood and employees) and output (lumber) for each DMU (Figure 1). From the DEA optimal data, the input vector consists of the optimal roundwood consumed by the sawmill, the optimal number of employees working in the sawmill, and the optimal amount of lumber produced by the sawmill.

The input of the ANN-2 model is also a three-tuple:

X = (x_roundwood, x_employees, x_lumber). However, the output of the ANN-2 model, optimal inputs (y_{optimal roundwood} and y_{optimal employees}) and optimal outputs (y_{optimal lumber}) have different units. Therefore, we use three different ANN-2 models of the same architecture to predict y_{optimal roundwood} (ANN-2a), y_{optimal employees} (ANN-2b), and y_{optimal lumber} (ANN-2c), separately.

Both ANN-1 and ANN-2 models have the same number of layers and the same number of neurons in each layer except for the output layer. The ANN-1 model has two output neurons, whereas the ANN-2 model has only one output neuron. For both ANN-1 and ANN-2, the first layer (input layer) has three input neurons, and the three hidden layers have 512, 256 and 128 neurons, respectively. The ANN models consist of multiple layers, each containing computational units modeled like biological neurons, which are connected to the neurons in the subsequent layers. The network configuration is iteratively tuned using the gradient descent-based optimization algorithm, which prunes the nodes based on values of the weight vector after a certain number of training epochs (in our case it was 500). With the proper training method and sufficient training data, the ANN models can learn the complex hidden patterns among the input data and the expected output data (target data). The network of interconnected neurons internally organizes itself to reconstruct the complex functional relationships among the input and the associated output data. We use the rectified linear unit (ReLU) activation function for each hidden layer for both ANN-1 and ANN-2 models. The activation function of the output layer of the ANN-1 model is a sigmoid function, which restricts the range of the outputs (θ_PTE and θ_OTE) to between 0 and 1, whereas the output layer of ANN-2 has no such activation function, because the outputs (optimal predicted values of inputs and outputs) are not confined to the range between 0 and 1. The output value is compared with the actual output and an error is calculated, which is then propagated back through the network, and the connection weights are adjusted using the gradient descent-based optimization algorithm. The process of feeding the output value forward and propagating the error value back is repeated until convergence to an acceptable error value is reached.
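The layer layout described above can be made concrete with a short NumPy sketch of the forward pass only. It assumes Glorot uniform weight initialization, zero biases and inputs that have been scaled to a comparable range; training, dropout and regularization are left out. The function names and the random inputs are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def glorot_uniform(rng, fan_in, fan_out):
    """Glorot (Xavier) uniform initialisation of a weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def init_network(rng, n_outputs, hidden=(512, 256, 128), n_inputs=3):
    """List of (weights, biases) pairs for a 3-512-256-128-n_outputs network."""
    sizes = (n_inputs, *hidden, n_outputs)
    return [(glorot_uniform(rng, a, b), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(params, X, sigmoid_output):
    """Forward pass: ReLU in the hidden layers, sigmoid or linear output."""
    h = np.asarray(X, dtype=float)
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)  # ReLU activation
    W, b = params[-1]
    z = h @ W + b
    if sigmoid_output:
        return 1.0 / (1.0 + np.exp(-z))  # keeps outputs between 0 and 1
    return z  # linear output for unbounded targets


rng = np.random.default_rng(0)
ann1 = init_network(rng, n_outputs=2)   # predicts (PTE, OTE)
ann2a = init_network(rng, n_outputs=1)  # predicts one optimal quantity

# Four hypothetical DMUs: (roundwood, employees, lumber), scaled to [0, 1].
X_demo = rng.uniform(0.0, 1.0, size=(4, 3))
print(forward(ann1, X_demo, sigmoid_output=True).shape)
print(forward(ann2a, X_demo, sigmoid_output=False).shape)
```

Only the output layer differs between the two model types, which is why the three ANN-2 variants can share one architecture while being trained on different targets.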

3.4 Sequential process flow diagram of the integrated model

The sequential process flow diagram in Figure 2 shows the integrated DEA-ANN modeling approach. There are two stages in the modeling approach:

DEA data collection stage and training ANN models; and
testing (predicting) ANN models stage.

In the DEA data collection stage, we use the DEA-1 model to generate the relative efficiency scores based on the BCC and CCR models, respectively. We use the DEA-2 model to generate the optimum values of the inputs and outputs using the cost of the inputs and the price of outputs. The dataset contains 303 large sawmills, 374 medium sawmills and 725 small sawmills. We partition each sub-dataset of large, medium and small DMUs into training and test data sets (90% for training and 10% for testing). Therefore, we have 272 DMUs for large sawmills, 336 DMUs for medium sawmills and 652 DMUs for small sawmills in our training datasets; and 31 DMUs for large sawmills, 38 DMUs for medium sawmills and 73 DMUs for small sawmills in our test datasets. In the training ANN models stage, we use the DMU inputs (roundwood and employees) and outputs (lumber) as the input to the ANNs. We use PTE and OTE (regular or bootstrap) as targets to train the ANN-1, and the optimal roundwood, optimal number of employees, and optimal lumber as targets to train ANN-2 (a), (b) and (c) models, respectively. The minimum number of training epochs (or iterations) in the ANN models depends on the number of nodes, number of hidden layers and the learning rate. We used the optimal error approach, which gives the minimum mean square error between the model output and the training data, as we decreased the learning rate from 0.001 to 0.0001 with a learning rate discounting factor of 0.9. We initialized the network weights through the Xavier (Glorot) initialization method (Glorot and Bengio, 2010). To prevent the neural networks from overfitting, we used a 50% dropout for each fully connected layer and used regularization for all the weights, encouraging all the weights to be small. We used ten-fold cross-validation and monitored the training and cross-validation accuracy during the training process, and finalized the total number of training epochs at 500, which gives the minimum mean square error or the highest accuracy.
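The arithmetic behind the partition sizes and the learning-rate range can be checked with a few lines of Python. The sketch assumes that the 90% training share is rounded down, which reproduces the reported counts, and that the learning rate is multiplied by 0.9 at each decay step until it reaches the floor of 0.0001. How often the decay is applied (per epoch or otherwise) is not stated in the text, so the schedule below is an assumption, and both function names are hypothetical.

```python
def split_sizes(n_total, train_tenths=9):
    """Training and test sizes, with the training share rounded down."""
    n_train = n_total * train_tenths // 10
    return n_train, n_total - n_train


def decay_schedule(lr_start, lr_min, factor):
    """Learning rates obtained by repeated multiplication, clipped at lr_min."""
    rates = [lr_start]
    while rates[-1] * factor > lr_min:
        rates.append(rates[-1] * factor)
    rates.append(lr_min)  # final step is clipped to the floor
    return rates


for name, n in {"large": 303, "medium": 374, "small": 725}.items():
    n_train, n_test = split_sizes(n)
    print(f"{name}: {n_train} training DMUs, {n_test} test DMUs")

rates = decay_schedule(0.001, 0.0001, 0.9)
print(f"{len(rates) - 1} decay steps from {rates[0]} to {rates[-1]}")
```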

Finally, in the testing ANN models stage, we run the trained ANNs on the test dataset to evaluate the predictive performance of the trained ANN models. Once the predictive performance of the ANN models is satisfactory, we can use these ANN models to make predictions in the application stage.

4. Ontario sawmills case study

The forest industry in Ontario is an important contributor to the province’s economy and plays a key role in the development of several rural and remote communities. The revenue from the sales of Ontario’s forest products industry was $15.5bn, and the industry provided over 150,000 well-paying jobs in 2018, in addition to supporting several communities across the province (Ontario Forest Industries Association (OFIA), 2019). Ontario’s forest products industry focuses on the production of a variety of products including lumber, structural board, pulp, paper, newsprint and value-added products. There are more than 150 sawmills in Ontario, engaged in the production of about 6 million cubic meters of lumber every year (OFIA, 2019). Ontario’s sawmills have been facing extreme competitive pressures in the global market from low-cost producers, reduced demand and a volatile Canadian dollar. The trade disputes with the USA, Ontario’s largest export market, and strict environmental regulations have further affected the performance of the sawmill industry. To improve the operational efficiency of the sawmill industry working under a highly uncertain business environment, the mill managers need decision support tools that can help in the continuous improvement and performance evaluations of the sawmills.

For this case study, the annual data for the inputs and the output of 125 Ontario sawmills (with 1402 sample data observations) were obtained from the Ontario Ministry of Natural Resources and Forestry (OMNRF) for a period of 17 years (1999 to 2015) (for details of the dataset, see Shahi and Dia, 2019a). The input data include the annual roundwood and other fibre consumption (aggregated together) in cubic meters and the number of employees, whereas the output data include the lumber and other fibre (aggregated together) output in cubic meters. The OMNRF categorizes sawmills, based on their roundwood consumption, into large (consuming more than 100,000 cubic meters of roundwood annually), medium (consuming between 10,000 and 100,000 cubic meters of roundwood annually) and small (consuming less than 10,000 cubic meters of roundwood annually) sawmills. The descriptive statistical measures of the input and output data for large, medium and small sawmills are shown in Table 1. The large and medium sawmills consume most of the roundwood and employ a large number of employees, although there are many more small sawmills in Ontario. The small sawmills are mostly located in remote and rural areas of the province (Shahi and Dia, 2019a).
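The size categories can be expressed as a simple rule on annual roundwood consumption. The sketch below is only illustrative: the function name is hypothetical, and because the text does not say how a sawmill consuming exactly 10,000 or exactly 100,000 cubic meters is classified, treating both boundary values as medium is an assumption.

```python
def sawmill_category(roundwood_m3):
    """Size category from annual roundwood consumption in cubic meters.

    Boundary values (exactly 10,000 or 100,000) are treated as medium,
    which is an assumption not stated in the text.
    """
    if roundwood_m3 > 100_000:
        return "large"
    if roundwood_m3 >= 10_000:
        return "medium"
    return "small"


# Hypothetical annual consumption figures.
for volume in (250_000, 45_000, 3_200):
    print(volume, sawmill_category(volume))
```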

4.1 Efficiency prediction using artificial neural network-1

The first BDEA-1 model is used to analyze the relative technical efficiencies (PTE, OTE and SE) of the large, medium and small sawmills, and the ANN-1 model is used to predict these efficiencies for the same three categories. ANN-1 is a three-layered feed-forward neural network that acquires its adaptive learning ability from the DEA results. The data set was partitioned into training and test data in a 9:1 ratio, so that 90% of the sawmills were used for training and 10% for testing. The comparison of the efficiencies obtained from the DEA-1 model with those predicted by the ANN-1 model (Table 2) shows that the predicted values (mean, median and standard deviation) for large, medium and small sawmills are very close to those obtained from the DEA-1 model. An analysis of variance of the regular and bootstrap efficiencies shows no statistically significant difference between the DEA-1 and ANN-1 values for large and medium sawmills (Table 3). For small sawmills, however, all efficiencies other than the regular PTE differ significantly between the ANN-1 predictions and the DEA-1 values. This may be due to the large variation in the consumption of inputs and production of outputs of the small sawmills as compared to the large sawmills. The operational efficiencies of the large sawmills were comparatively higher, as they made huge capital investments in upgrading their technology, whereas the small sawmills had lower operational efficiencies, as they were unable to adjust their inputs to changing and uncertain market demand conditions. The large and medium-sized sawmills survived the periods of uncertainty in demand and supply by utilizing a higher percentage of roundwood and converting it to useful products, thereby reducing wastage.
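
To make the training set-up concrete, the following sketch builds a three-layered feed-forward network and trains it on a 9:1 split. It is an illustration under stated assumptions, not the ANN-1 model itself: the data are synthetic, and the hidden-layer size, activation functions, learning rate and number of iterations are all chosen here for illustration and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (not the study's data): two scaled inputs, one scaled
# output, and a made-up efficiency-like target in (0, 1].
X = rng.uniform(0.1, 1.0, size=(200, 3))
y = (0.6 * X[:, 2] / (0.5 * X[:, 0] + 0.5 * X[:, 1])).clip(0.05, 1.0).reshape(-1, 1)

# 9:1 partition into training and test observations.
order = rng.permutation(len(X))
n_train = int(0.9 * len(X))
train, test = order[:n_train], order[n_train:]

# Three layers: 3 input units, 8 tanh hidden units (assumed), 1 sigmoid output unit.
W1 = rng.normal(0.0, 0.5, size=(3, 8))
b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, size=(8, 1))
b2 = np.zeros(1)


def forward(inputs):
    hidden = np.tanh(inputs @ W1 + b1)
    return hidden, 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))


def mse(inputs, targets):
    return float(np.mean((forward(inputs)[1] - targets) ** 2))


initial_loss = mse(X[train], y[train])
lr = 0.5  # assumed learning rate
for _ in range(2000):  # full-batch gradient descent on the mean squared error
    hidden, pred = forward(X[train])
    grad_out = 2.0 * (pred - y[train]) / len(train) * pred * (1.0 - pred)
    grad_hidden = (grad_out @ W2.T) * (1.0 - hidden ** 2)
    W2 -= lr * (hidden.T @ grad_out)
    b2 -= lr * grad_out.sum(axis=0)
    W1 -= lr * (X[train].T @ grad_hidden)
    b1 -= lr * grad_hidden.sum(axis=0)

final_loss = mse(X[train], y[train])
test_aae = float(np.mean(np.abs(forward(X[test])[1] - y[test])))
print(f"train MSE {initial_loss:.4f} -> {final_loss:.4f}, test AAE {test_aae:.4f}")
```

The held-out 10% plays the role of the test data described above: it is never used to update the weights and only serves to measure the prediction error.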

The adaptive learning capability of the ANN-1 model can also be observed in the high correlations (R-values) between the actual and predicted efficiencies (PTE, OTE and SE). The correlation varies between 0.99 and 1.00 for the large and medium sawmills, and between 0.86 and 0.96 for the small sawmills in Ontario (Table 4). The comparison also shows low error rates (Average Absolute Error [AAE] and Maximum Absolute Error [MAE]) between the predicted and observed values, as summarized in Table 4. The average absolute error varies from 0.01 to 0.04 for all the sawmills in Ontario. The maximum absolute error varies from 0.05 to 0.39 for the large and medium sawmills, and from 0.33 to 0.41 for the small sawmills (Table 4).
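
The three measures reported in Table 4 can be computed as in the sketch below. It assumes that R is the Pearson correlation coefficient and that the errors are absolute differences between the DEA and ANN scores; the function name and the sample scores are hypothetical.

```python
import math


def prediction_metrics(actual, predicted):
    """Return (R, AAE, MAE) for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)  # Pearson correlation
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return r, sum(abs_errors) / n, max(abs_errors)


# Illustrative efficiency scores (made up, not taken from the study).
dea_scores = [0.50, 0.60, 0.70, 0.80, 1.00]
ann_scores = [0.52, 0.59, 0.73, 0.80, 0.95]
r, aae, mae = prediction_metrics(dea_scores, ann_scores)
print(f"R = {r:.3f}, AAE = {aae:.3f}, MAE = {mae:.3f}")
```

Because the MAE is driven by the single worst prediction, it can be large even when R is high and the AAE is small, which is the pattern seen in Table 4.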

The performance of ANN-1 in predicting the relative efficiencies of the sawmills is shown in Figures 3, 4 and 5 for large, medium and small sawmills in Ontario, respectively. The figures show the percentage error between the actual and predicted efficiency scores (PTE, OTE and SE), sorted by the size of the error. The predicted efficiencies show high levels of prediction accuracy, with very few sawmills showing more than 10% error. Only a few small sawmills show very high errors, where the model overpredicts the relative efficiencies. However, previous literature has shown that prediction accuracy can be improved with larger amounts of training data, as feed-forward neural networks exhibit a regression type of learning and perform better in detecting the central values of the data than the extreme points (Athanassopoulos and Curram, 1996; Ülengin et al., 2011; Pendharkar and Rodger, 2003).

4.2 Optimum combinations of inputs and outputs prediction using artificial neural network-2

The second DEA-2 model optimizes the vectors of multiple inputs and outputs by minimizing the costs and maximizing the revenue under an underlying DEA technology (variable returns to scale, VRS, or constant returns to scale, CRS). The decision to use optimum inputs and produce optimum outputs depends on the cost of roundwood, the employees’ wage rate and the price of lumber, which vary continuously. For example, the variation in the average annual price of lumber over the study period (1999–2015) is shown in Figure 6 (Random Lengths framing lumber composite prices, 2019). We used an average roundwood cost of $55 per cubic meter, an employee wage rate of $25 per hour and an average lumber price of $300 per thousand board feet for the study period.
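
The cost-minimizing half of such a model can be sketched as a linear program. The code below uses the textbook DEA cost-minimization formulation and is not confirmed to be the exact DEA-2 specification; revenue maximization is not shown. The three observations are invented, and the annual cost per employee assumes 2,000 working hours per year, a figure the source does not give (it states only the hourly wage).

```python
import numpy as np
from scipy.optimize import linprog


def cost_minimizing_inputs(X, Y, prices, target, vrs=True):
    """Cheapest input bundle able to produce `target` under a DEA technology.

    X: (n_dmus, n_inputs) observed inputs; Y: (n_dmus, n_outputs) observed outputs.
    Decision variables are the input bundle x followed by the weights lambda.
    """
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, m = X.shape
    s = Y.shape[1]
    c = np.concatenate([np.asarray(prices, float), np.zeros(n)])
    # Inputs: sum_j lambda_j * X[j, i] - x_i <= 0
    input_rows = np.hstack([-np.eye(m), X.T])
    # Outputs: -sum_j lambda_j * Y[j, r] <= -target_r
    output_rows = np.hstack([np.zeros((s, m)), -Y.T])
    A_ub = np.vstack([input_rows, output_rows])
    b_ub = np.concatenate([np.zeros(m), -np.asarray(target, float)])
    # VRS adds the convexity constraint sum(lambda) = 1; CRS omits it.
    A_eq = np.concatenate([np.zeros(m), np.ones(n)]).reshape(1, -1) if vrs else None
    b_eq = [1.0] if vrs else None
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:m], res.fun


# Invented observations: columns are roundwood and employees; output is lumber.
X = [[100.0, 10.0], [200.0, 10.0], [200.0, 30.0]]
Y = [[50.0], [100.0], [100.0]]
# $55 per cubic meter of roundwood; $25/h x 2,000 h (assumed) per employee.
prices = [55.0, 50_000.0]

vrs_x, vrs_cost = cost_minimizing_inputs(X, Y, prices, target=Y[0], vrs=True)
crs_x, crs_cost = cost_minimizing_inputs(X, Y, prices, target=Y[0], vrs=False)
print("VRS optimum:", vrs_x, vrs_cost)
print("CRS optimum:", crs_x, crs_cost)
```

For the first unit in this toy data set the two technologies give different optima, because CRS allows the second unit to be scaled down to half size while VRS does not.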

The ANN-2 model, also a three-layered feed-forward neural network, is used to predict the optimum values of the inputs (roundwood and employees) and the output (lumber) for the three categories of sawmills (large, medium and small). The data set was again partitioned into training and test data in a 9:1 ratio, with 90% of the sawmills used for training and 10% for testing. Since the DEA-2 model optimizes the inputs and outputs using either the VRS or the CRS technology, the optimal inputs and outputs predicted by the ANN-2 model are compared with those obtained from the DEA-2 model under both technologies (Table 5). The comparison indicates that the predicted optimal inputs and outputs are very close to the values obtained from the DEA-2 model. An analysis of variance of the predicted and actual optimal inputs and outputs shows no statistically significant difference between the two sets of values (Table 6).
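
The analysis of variance used for these comparisons can be illustrated with a two-group one-way ANOVA, in which one group holds the DEA values and the other the ANN predictions. The sketch below computes only the degrees of freedom and the F-value; the p-value additionally requires the F distribution and is not computed here. The function name and sample values are hypothetical. With 303 large-sawmill observations in each group, the residual degrees of freedom are 2 × 303 − 2 = 604, which matches the value reported in the tables.

```python
def one_way_anova(group_a, group_b):
    """Return (df_between, df_within, F) for a two-group one-way ANOVA."""
    groups = [list(group_a), list(group_b)]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, f_value


# Made-up values standing in for DEA results and ANN predictions.
dea_values = [1.0, 2.0, 3.0]
ann_values = [2.0, 3.0, 4.0]
df_between, df_within, f_value = one_way_anova(dea_values, ann_values)
print(df_between, df_within, f_value)
```

A small F-value means that the difference between the two group means is small relative to the spread within each group, which is the situation reported for most rows of the tables.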

The optimal input and output values are further used to check the relative efficiencies obtained under both the VRS and the CRS technologies for optimization. The descriptive statistical measures (mean, median and standard deviation) of the regular and bootstrap efficiencies for large, medium and small sawmills are compared in Table 7. The results demonstrate that the average efficiencies obtained using the optimum values of the inputs and outputs are higher than those obtained from the first DEA-1 model for the large and medium sawmills, whereas the average efficiencies obtained using the optimum values for the small sawmills are lower than those obtained from the first DEA-1 model. This is because of the scale of operations of the small sawmills and their inability to produce large outputs with reduced inputs. The study period also includes two economic downturns, in 2001–02 and 2008–09, which affected housing starts in the United States, the biggest market for the Ontario sawmills. The large and medium sawmills were able to recover from these downturns by making capital investments in improved technology, but most of the small sawmills closed their operations during those periods. In addition to the collapse of the US housing industry, the disruption of fibre supply chains caused by the economic recessions further affected fibre utilization in the sawmills, thereby affecting their efficiencies. Moreover, the efficiencies obtained using the VRS technology for optimization are in general higher than those obtained using the CRS technology. The sawmills that are able to adapt to the VRS technology may perform much more efficiently than the sawmills that can only operate on the CRS technology.

The adaptive learning capability of the feed-forward neural network model (ANN-2) can be observed in the high correlations (R-values) between the actual and predicted inputs (roundwood and employees) and output (lumber), and in the low error rates (AAE and MAE) for both the training and the test data sets, as summarized in Table 8. Accurate predictions of the optimum combination of inputs and outputs under constantly changing supply and demand conditions are of direct consequence to the mill managers.

The ANN-2 model also demonstrates a high level of prediction accuracy, as shown in Table 9. Under the VRS technology for optimization, the prediction error for the optimal roundwood input is below 10% for 46% of the large, 98% of the medium and 99% of the small sawmills. For the optimal number of employees, the error is below 10% for 64% of the large, 93% of the medium and 96% of the small sawmills. For the optimal lumber output, however, the error is below 10% for only 67% of the large, 34% of the medium and 28% of the small sawmills. The prediction accuracy is generally higher under the CRS technology, most notably for employees (100% in every category) and for the lumber output of the medium and small sawmills (78% and 90%, respectively). This is because the optimum values obtained using the CRS technology have less variation than those obtained using the VRS technology. Therefore, the ANN-2 model provides an accurate adaptive decision support tool, not only for setting the performance output goals but also for deciding the optimum combinations of the inputs. As the costs of the inputs and the price of the output change, the mill managers can predict the optimal combinations of the inputs and the outputs under a highly volatile business environment. The optimal combinations of the inputs and the outputs further help the mill managers in selecting actionable measures that assist in achieving and sustaining the performance and continuous improvement of the sawmills.
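
The accuracy shares in Table 9 count the decision-making units (DMUs) whose prediction error stays below a threshold. A minimal sketch of that count follows; it assumes the percentage error is the absolute difference divided by the actual (DEA) value, which the source does not state explicitly, and it assumes all actual values are non-zero. The function name and sample values are hypothetical.

```python
def share_within(actual, predicted, thresholds=(0.10, 0.20, 0.30)):
    """Fraction of units whose relative prediction error is below each threshold."""
    rel_errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return {t: sum(e < t for e in rel_errors) / len(rel_errors) for t in thresholds}


# Made-up optimal values and predictions for four units.
optimal = [100.0, 200.0, 400.0, 800.0]
predicted = [105.0, 230.0, 500.0, 800.0]
shares = share_within(optimal, predicted)
print(shares)

# The same arithmetic applied to a count from Table 9: 140 of the 303 large
# sawmill observations are below 10% error for roundwood under VRS.
print(round(140 / 303 * 100))
```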

5. Implications and managerial insights

The sawmills in Ontario are the primary forest products industry contributing to the provincial economy and supporting many remote and rural communities. These sawmills have experienced uncertain variations in demand and supply due to economic downturns and structural changes in the global markets, which have resulted in several mill closures and reduced production. However, these uncertain conditions have also presented new opportunities to the forest products industry for using emerging technologies (for example, ANN models). The results of our study, which demonstrate the use of such technology for performance improvement and decision-making, indicate that the main source of operational inefficiency in Ontario sawmills has been managerial and technical issues rather than scale issues. As a result, the inputs have not been efficiently utilized in the sawmills. Therefore, the mill managers should focus their attention on improving the operational efficiency and the competitiveness of Ontario’s sawmills through streamlining manufacturing processes, reducing costs, improving raw material usage and making capital investments in new and improved technology.
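
The distinction between managerial and scale sources of inefficiency can be illustrated with a short calculation. The sketch assumes the standard decomposition OTE = PTE × SE (Banker et al., 1984) and uses the regular mean scores of the large sawmills from Table 2; the function name is hypothetical. Note that the ratio of two means is not the same as the mean of the mill-level ratios, so the SE computed here differs slightly from the 0.98 reported in Table 2.

```python
def decompose_efficiency(ote: float, pte: float) -> dict:
    """Split overall technical efficiency into pure technical and scale parts."""
    if not (0 < ote <= pte <= 1):
        raise ValueError("expected 0 < OTE <= PTE <= 1")
    se = ote / pte  # assumed decomposition: OTE = PTE * SE
    return {"SE": se, "pure_technical_gap": 1 - pte, "scale_gap": 1 - se}


# Regular mean scores of the large sawmills from Table 2 (DEA-1 model).
result = decompose_efficiency(ote=0.59, pte=0.61)
print({name: round(value, 3) for name, value in result.items()})
```

The gap attributable to pure technical (managerial) inefficiency is far larger than the gap attributable to scale, which is the pattern behind the conclusion above.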

The category-wise (large, medium and small) analysis of the operational efficiencies further revealed that the performance of the sawmills was affected differently depending on their size. The operational efficiencies of the large sawmills were higher, as they made huge capital investments in upgrading their technology. This helped the large sawmills improve their input utilization and the conversion of timber to useful products, thereby reducing wastage. The smaller sawmills, however, had low operational efficiencies throughout the study period, as they were unable to adjust their inputs to changing and uncertain market demand conditions. Among the medium-sized sawmills, only those that were able to increase the percentage of roundwood converted to products and reduce the proportion of mill residues survived the uncertainty in demand and supply. New and emerging technologies and business processes offer innovative ways of predicting the operational efficiencies in future uncertain supply and demand scenarios. They provide new opportunities for generating social and economic value from the inputs in the sawmills, and also help in adding new revenue streams from inputs, diversifying product lines and boosting the share of products in the marketplace. These technologies further enhance the familiarity of the employees with the key performance metrics and the opportunities for continuous improvement in the sawmills. The ANN models can be used for continuously evaluating the performance of the sawmills and for strategic decision-making. Moreover, the traditional models that evaluate optimum inputs/outputs are unable to account for the complex non-linear relationships involved in the forest products industry, whereas the ANN models are able to handle unseen patterns and non-linear relationships. Adopting these models requires a close working relationship between the forest products industry and academia and research organizations through research projects, continuous training and workshops. The sawmills in Ontario should make use of operational and tactical decision-support tools that monitor the manufacturing processes and provide process control information to improve their strategic decision-making.

6. Conclusion

The purpose of this study was to develop an adaptive decision support tool that can be used for predicting the relative efficiency scores and the optimal combination of the inputs and the outputs under the changing business environment of the forest industry in Ontario. The ANN models were trained on data generated by the performance measurement capabilities of the DEA models. They were used to predict the relative efficiency scores and the optimal combination of the inputs and the outputs of three categories (large, medium and small) of sawmills in Ontario. The models provide promising prediction results with high accuracy.

The ANN models provide the mill managers with a performance measurement and evaluation tool that has the predictive power to support decisions under uncertain supply and demand conditions. The sawmills in Ontario, which form an important forest products industry contributing to the social well-being of the local communities and to the economic prosperity of the entire province, have been struggling under a fast-changing business environment and uncertain supply and demand conditions. The ANN models help the mill managers in dealing with the economic fluctuations and uncertain market demand conditions that continuously affect the performance of the sawmills in Ontario. The ANN models can continuously evaluate the performance of the sawmills under these uncertain conditions, and the knowledge developed in this area could be regularly shared with the mill managers and adapted in the industry for strategic decision-making. The forest products industry is highly affected by the trends of globalization and the increasing dynamics of product lifecycles. Challenged by heavily fluctuating market demands and prices, and by varied requirements to support individual customer needs, the forest products industry needs tools to evaluate the inputs/outputs and to make forecasts on a regular basis. The forecasting problems are very complex, with many underlying known and unknown factors. Traditional forecasting models have certain limitations and cannot account for these complexities and non-linear relationships. The ANN models provide a robust alternative, given their ability to extract unseen patterns and non-linear relationships. Also, unlike the traditional models, the ANN models do not impose any restriction on the inputs/outputs and residual distributions. Therefore, ANNs have proved to be powerful models with a wide range of applications.

Further, the ANN models can be used in the forest products industry to improve supply chain management, which spans the movement and storage of raw materials, work-in-process inventory and finished goods from the point of origin to the point of consumption. Our study is unique in predicting the optimal combination of the inputs and the outputs for the best performance of the sawmills in Ontario. One of the limitations of this study is the limited amount of data available to train the ANN models; their prediction accuracy could be further improved with more training data.

Figures

Figure 1.

ANN-1 and ANN-2 model architectures

Figure 2.

Sequential process flow diagram of the integrated DEA-ANN models

Figure 3.

ANN-1 learning and prediction error for large sawmills

Figure 4.

ANN-1 learning and prediction error for medium sawmills

Figure 5.

ANN-1 learning and prediction error for small sawmills

Figure 6.

Average annual lumber prices

Table 1.

Descriptive statistical measures of the input and output data for large, medium and small sawmills

Sawmills	Roundwood (cum)	Number of Employees	Lumber Output (cum)
Large
N	303	303	303
Mean	511242.10	187.20	287445.80
Median	487726.00	169.00	248888.00
Std Dev	247118.60	96.94	163794.00
Medium
N	374	374	374
Mean	53936.07	52.89	31065.54
Median	38076.00	38.00	24656.50
Std Dev	43981.75	44.79	22135.97
Small
N	725	725	725
Mean	6070.88	10.59	3360.29
Median	4863.00	7.00	2709.00
Std Dev	5283.01	13.19	2826.10
Total
N	1402	1402	1402
Mean	128017.00	60.04	72147.50
Median	14798.00	18.00	9376.50
Std Dev	233700.70	86.18	137265.80

Table 2.

Comparison of the efficiencies obtained from the DEA-1 model and predicted efficiencies obtained from ANN-1 model for large, medium and small sawmills

	DEA-1 Model					ANN-1 Predictions
	OTE		PTE		SE	OTE		PTE		SE
Sawmills	Regular	Bootstrap	Regular	Bootstrap		Regular	Bootstrap	Regular	Bootstrap
Large
Mean	0.59	0.58	0.61	0.57	0.98	0.59	0.58	0.60	0.57	0.98
Median	0.54	0.53	0.55	0.52	1.00	0.54	0.53	0.55	0.52	1.00
Std Dev	0.16	0.16	0.17	0.15	0.05	0.16	0.16	0.17	0.15	0.04
Medium
Mean	0.66	0.66	0.67	0.65	0.98	0.66	0.65	0.67	0.65	0.98
Median	0.61	0.60	0.64	0.62	1.00	0.60	0.59	0.64	0.61	0.99
Std Dev	0.21	0.21	0.21	0.20	0.04	0.21	0.21	0.21	0.21	0.02
Small
Mean	0.59	0.58	0.61	0.59	0.97	0.62	0.61	0.63	0.61	0.98
Median	0.52	0.51	0.54	0.52	1.00	0.58	0.57	0.58	0.57	1.00
Std Dev	0.18	0.18	0.19	0.18	0.10	0.17	0.17	0.19	0.18	0.06

Table 3.

Analysis of variances between efficiencies obtained from the DEA-1 model and predicted efficiencies obtained from ANN-1 model for large, medium and small sawmills

Sawmill	Efficiency		DF	SumSq	MeanSq	F-Value	P-Value
Large	Regular PTE	Category	1	0.002	0.002	0.071	0.789
		Residuals	604	17.590	0.029
	Bootstrap PTE	Category	1	0.002	0.002	0.096	0.757
		Residuals	604	13.382	0.022
	Regular OTE	Category	1	0.003	0.003	0.121	0.728
		Residuals	604	15.919	0.026
	Bootstrap OTE	Category	1	0.001	0.001	0.018	0.892
		Residuals	604	14.850	0.024
	Scale	Category	1	0.001	0.001	0.677	0.411
		Residuals	604	1.174	0.002
Medium	Regular PTE	Category	1	0.010	0.006	0.126	0.723
		Residuals	746	33.190	0.044
	Bootstrap PTE	Category	1	0.002	0.002	0.059	0.757
		Residuals	746	31.300	0.042
	Regular OTE	Category	1	0.001	0.002	0.041	0.808
		Residuals	746	32.730	0.044
	Bootstrap OTE	Category	1	0.001	0.002	0.015	0.839
		Residuals	746	32.240	0.044
	Scale	Category	1	0.000	0.001	0.128	0.903
		Residuals	746	0.717	0.043
Small	Regular PTE	Category	1	0.120	0.124	3.409	0.065
		Residuals	1448	52.600	0.036
	Bootstrap PTE	Category	1	0.150	0.146	4.507	0.034*
		Residuals	1448	47.090	0.033
	Regular OTE	Category	1	0.250	0.251	7.931	0.005**
		Residuals	1448	45.820	0.032
	Bootstrap OTE	Category	1	0.300	0.296	9.645	0.002**
		Residuals	1448	44.460	0.030
	Scale	Category	1	0.032	0.032	4.615	0.032*
		Residuals	1448	9.945	0.007

Note:

Significance: “*” 0.05, “**” 0.01, “***” 0.001

Table 4.

Performance of the ANN-1 for efficiency prediction for large, medium and small sawmills in Ontario

Efficiency	Large			Medium			Small
	R	AAE	MAE	R	AAE	MAE	R	AAE	MAE
Regular PTE	0.99	0.01	0.12	0.98	0.02	0.39	0.96	0.03	0.35
Bootstrap PTE	0.99	0.01	0.11	0.99	0.01	0.39	0.96	0.04	0.41
Regular OTE	1.00	0.01	0.10	0.99	0.01	0.30	0.95	0.03	0.33
Bootstrap OTE	1.00	0.01	0.05	1.00	0.01	0.15	0.95	0.04	0.41
SE	0.89	0.01	0.12	0.46	0.02	0.39	0.86	0.02	0.39

Notes:

R: Correlation between actual and predicted efficiency, AAE: Average Absolute Error, MAE: Maximum Absolute Error

Table 5.

Comparison of optimal inputs and output obtained from the DEA-2 model and predicted values obtained from the ANN-2 model for large, medium and small sawmills

Technology	Sawmills		Optimal values from DEA-2 Model			Predicted Optimal values from ANN-2 Model
			Roundwood (cum)	Employees (#)	Lumber (cum)	Roundwood (cum)	Employees (#)	Lumber (cum)
VRS	Large	Mean	291021.41	135.16	485702.30	297998.26	138.96	493775.48
		Median	249512.85	124.50	470650.25	237722.38	116.06	417387.03
		Std Dev	174879.80	53.85	219938.77	201229.85	70.49	244631.02
	Medium	Mean	31476.28	26.19	46772.06	30557.56	25.65	44694.53
		Median	24719.36	17.88	37749.34	24208.73	16.69	37475.45
		Std Dev	23316.91	21.99	29407.96	23500.84	22.57	25763.65
	Small	Mean	3381.59	6.43	5108.46	3432.71	6.53	5076.50
		Median	2713.44	5.58	4779.17	2371.65	5.28	4104.98
		Std Dev	2852.58	3.17	3637.06	2947.46	3.25	3408.61
CRS	Large	Mean	288022.43	143.29	500532.18	294670.12	146.82	509995.23
		Median	249387.28	124.07	476958.40	237408.31	107.45	424424.91
		Std Dev	164122.56	81.65	243708.42	190991.84	106.79	270883.94
	Medium	Mean	31128.83	20.75	53026.84	30423.48	20.53	48719.10
		Median	24706.74	16.47	37752.08	24142.64	15.64	38577.16
		Std Dev	22181.07	14.79	42710.62	23033.90	15.55	36510.87
	Small	Mean	3365.64	6.42	5864.24	3415.76	6.51	5912.92
		Median	2713.32	5.18	4794.30	2374.29	4.53	4441.83
		Std Dev	2830.60	5.40	4916.01	2924.49	5.58	4838.20

Table 6.

Analysis of variances between optimal inputs and output obtained from the DEA-2 model using VRS and CRS technologies for optimization and predicted values obtained from the ANN-2 model for large, medium and small sawmills

Technology	Sawmill	Input/Output		DF	SumSq	MeanSq	F-Value	P-Value
VRS	Large	Roundwood	Category	1	2.21e+9	2.21e+9	0.081	0.776
			Residuals	604	1.65e+13	2.74e+10
		Employees	Category	1	833	833	0.286	0.593
			Residuals	604	1.76e+6	2.91e+3
		Lumber	Category	1	5.98e+9	5.98e+9	0.137	0.711
			Residuals	604	2.63e+13	4.36e+10
	Medium	Roundwood	Category	1	7.31e+6	7.31e+6	0.014	0.906
			Residuals	746	3.92e+11	5.26e+8
		Employees	Category	1	36	36	0.077	0.781
			Residuals	746	3.46e+5	4.64e+2
		Lumber	Category	1	1.59e+7	1.59e+7	0.019	0.890
			Residuals	746	6.24e+11	8.36e+8
	Small	Roundwood	Category	1	6.00	6.00	0.000	0.999
			Residuals	1448	1.18e+10	8.13e+6
		Employees	Category	1	0.00	0.392	0.039	0.843
			Residuals	1448	1.45e+4	10.024
		Lumber	Category	1	2.13e+6	2.13e+6	0.169	0.681
			Residuals	1448	1.82e+10	1.26e+7
CRS	Large	Roundwood	Category	1	2.02e+9	2.02e+9	0.083	0.773
			Residuals	604	1.47e+13	2.43e+10
		Employees	Category	1	1.00	1.00	0.000	0.989
			Residuals	604	4.01e+6	6.65e+3
		Lumber	Category	1	8.27e+9	8.27e+9	0.155	0.694
			Residuals	604	3.26e+13	5.34e+10
	Medium	Roundwood	Category	1	1.02e+2	1.02e+2	0.000	1.000
			Residuals	746	3.66e+11	4.91e+8
		Employees	Category	1	0.02	0.02	0.000	0.993
			Residuals	746	1.63e+5	218.61
		Lumber	Category	1	5.44e+6	5.44e+6	0.003	0.956
			Residuals	746	1.35e+12	1.81e+9
	Small	Roundwood	Category	1	2.60e+1	2.60e+1	0.000	0.999
			Residuals	1448	1.16e+10	8.01e+6
		Employees	Category	1	0.00	0.006	0.000	0.988
			Residuals	1448	4.22e+4	29.17
		Lumber	Category	1	2.13e+4	2.13e+4	0.001	0.977
			Residuals	1448	3.57e+10	2.46e+7

Note:

Significance: “*” 0.05, “**” 0.01, “***” 0.001

Table 7.

Comparison of the efficiencies obtained from the optimal inputs and output using VRS and CRS technologies for optimization

	VRS Optimization					CRS Optimization
	OTE		PTE		SE	OTE		PTE		SE
Sawmills	Regular	Bootstrap	Regular	Bootstrap		Regular	Bootstrap	Regular	Bootstrap
Large
Mean	0.45	0.42	0.67	0.64	0.71	0.43	0.41	0.61	0.59	0.74
Median	0.46	0.43	0.63	0.62	0.74	0.44	0.42	0.58	0.58	0.77
Std Dev	0.11	0.10	0.17	0.16	0.19	0.11	0.11	0.17	0.16	0.18
Medium
Mean	0.44	0.42	0.72	0.70	0.66	0.27	0.24	0.57	0.55	0.53
Median	0.39	0.38	0.75	0.74	0.63	0.26	0.23	0.51	0.50	0.50
Std Dev	0.16	0.15	0.21	0.21	0.24	0.10	0.09	0.22	0.22	0.23
Small
Mean	0.29	0.29	0.41	0.38	0.71	0.04	0.03	0.16	0.13	0.37
Median	0.31	0.30	0.39	0.36	0.79	0.04	0.03	0.14	0.12	0.26
Std Dev	0.14	0.14	0.10	0.08	0.28	0.04	0.03	0.13	0.09	0.24

Table 8.

Performance of the ANN-2 for optimum inputs (roundwood and employees) and optimum output (lumber) prediction for large, medium and small sawmills in Ontario

Technology	Inputs/Output	Large			Medium			Small
		R	AAE	MAE	R	AAE	MAE	R	AAE	MAE
VRS	Roundwood	0.95	36707.29	379604.70	0.99	816.92	56319.73	1.00	1.88	76.84
	Employees	0.95	12.12	75.72	0.97	1.88	54.76	1.00	0.08	1.87
	Lumber	0.97	43399.93	182190.30	0.94	7347.55	45311.22	0.94	901.07	12083.95
CRS	Roundwood	0.96	33130.54	212177.70	1.00	584.12	3855.57	1.00	0.86	52.54
	Employees	1.00	0.39	2.97	1.00	0.02	2.96	1.00	0.01	0.42
	Lumber	0.97	43676.05	194318.90	0.99	3387.57	35980.40	0.97	249.35	31576.73

Notes:

R: Correlation between actual and predicted optimum inputs and outputs, AAE: Average absolute error, MAE: Maximum absolute error

Table 9.

Prediction accuracy of ANN-2 models for large, medium and small sawmills in Ontario

Technology for Optimization	Inputs/Output		Large			Medium			Small
		Error	10%	20%	30%	10%	20%	30%	10%	20%	30%
VRS	Roundwood	DMUs	140	250	288	366	372	373	719	723	724
		%DMUs	46%	83%	95%	98%	99%	100%	99%	100%	100%
	Employees	DMUs	193	283	296	346	364	369	694	708	709
		%DMUs	64%	93%	98%	93%	97%	99%	96%	98%	98%
	Lumber	DMUs	203	255	270	128	217	296	201	411	530
		%DMUs	67%	84%	89%	34%	58%	79%	28%	57%	73%
CRS	Roundwood	DMUs	153	260	296	368	374	374	719	721	722
		%DMUs	50%	86%	98%	98%	100%	100%	99%	100%	100%
	Employees	DMUs	303	303	303	373	373	374	724	725	725
		%DMUs	100%	100%	100%	100%	100%	100%	100%	100%	100%
	Lumber	DMUs	197	255	273	290	362	369	649	678	687
		%DMUs	65%	84%	90%	78%	97%	99%	90%	94%	95%

References

Adler, N., Liebert, V. and Yazhemsky, E. (2013), “Benchmarking airports from a managerial perspective”, Omega, Vol. 41 No. 2, pp. 442-458.

Alolayyan, M., Mohd Ali, K., Idris, F. and Ibrehem, A. (2011), “Advance mathematical model to study and analyse the effects of total quality management (TQM) and operational flexibility on hospital performance”, Total Quality Management and Business Excellence, Vol. 22 No. 12, pp. 1371-1393.

Alpay, S. and Yuzugullu, N. (2009), “Dynamic job shop scheduling for missed due date performance”, International Journal of Production Research, Vol. 47 No. 15, pp. 4047-4062.

Athanassopoulos, A.D. and Curram, S.P. (1996), “A comparison of data envelopment analysis and artificial neural networks as tools for assessing the efficiency of decision making units”, Journal of the Operational Research Society, Vol. 47 No. 8, pp. 1000-1016.

Banker, R., Charnes, A. and Cooper, W. (1984), “Some models for estimating technical and scale efficiencies in data envelopment analysis”, Management Science, Vol. 30 No. 9, pp. 1078-1092.

Bhanot, H. and Singh, H. (2014), “Benchmarking the performance indicators of Indian railway container business using data envelopment analysis”, Benchmarking: An International Journal, Vol. 21 No. 1, pp. 101-120.

Bogetoft, P. and Otto, L. (2011), Benchmarking with DEA, SFA, and R, Springer, New York, NY.

Carlucci, D., Renna, P. and Schiuma, G. (2013), “Evaluating service quality dimensions as antecedents to outpatient satisfaction using back propagation neural network”, Health Care Management Science, Vol. 16 No. 1, pp. 37-44.

Charnes, A., Cooper, W. and Rhodes, E. (1978), “Measuring the efficiency of decision making units”, European Journal of Operational Research, Vol. 2 No. 6, pp. 429-444.

Chen, P., Chang, C. and Lai, C. (2014), “Incentive regulation and performance measurement of Taiwan's incineration plants: an application of the four-stage DEA method”, Journal of Productivity Analysis, Vol. 41 No. 2, pp. 277-290.

Coelli, T., Rao, D., O'Donnell, C. and Battese, G. (2005), An Introduction to Efficiency and Productivity Analysis, Springer.

Cooper, W., Park, K. and Yu, G. (2001), “An illustrative application of IDEA (imprecise data envelopment analysis) to a Korean mobile telecommunication company”, Operations Research, Vol. 49 No. 6, pp. 807-820.

Cooper, W., Seiford, L.M. and Zhu, J. (2011), Handbook on Data Envelopment Analysis, Springer.

Dia, M. and Bozec, R. (2019), “Social enterprise and the performance measurement challenge: could the data envelopment analysis be the solution?”, Forthcoming in the Journal of Multi Criteria Decision Analysis.

Dia, M., Takouda, P. and Golmohammadi, A. (2018), “Efficiency measurement of Canadian oil and gas companies”, Forthcoming in International Journal of Operational Research.

Dia, M., Abukari, K., Takouda, P. and Assaidi, A. (2019), “Relative efficiency measurement of Canadian mining companies”, International Journal of Applied Management Science, Vol. 11 No. 3, pp. 224-242.

Diaz-Balteiro, L. and Romero, C. (2008), “Making forestry decisions with multiple criteria: a review and an assessment”, Forest Ecology and Management, Vol. 255 Nos 8/9, pp. 3222-3241.

Emrouznejad, A. and Shale, E. (2009), “A combined neural network and DEA for measuring efficiency of large scale data sets”, Computers and Industrial Engineering, Vol. 56 No. 1, pp. 249-254.

Emrouznejad, A. and Yang, G. (2018), “A survey and analysis of the first 40 years of scholarly literature in DEA: 1978–2016”, Socio-Economic Planning Sciences, Vol. 61, pp. 4-8.

Fare, R., Grosskopf, S. and Lovell, C.A.K. (1994), Production Frontiers, Cambridge University Press.

Feli, X.S., Siew Hoon, L. and Junwook, C. (2011), “Railroad productivity analysis: case of the American class I railroads”, International Journal of Productivity and Performance Management, Vol. 60 No. 4, pp. 372-386.

Ferrier, G. and Trivitt, J. (2013), “Incorporating quality into the measurement of hospital efficiency: a double DEA approach”, Journal of Productivity Analysis, Vol. 40 No. 3, pp. 337-355.

Georges Assaf, A.A. and Gillen, D. (2012), “Measuring the joint impact of governance form and economic regulation on airport efficiency”, European Journal of Operational Research, Vol. 220 No. 1, pp. 187-198.

Georgy, M.E., Luh-Maan, C. and Lei, Z. (2005), “Prediction of engineering performance: a neurofuzzy approach”, Journal of Construction Engineering and Management, Vol. 131 No. 5, pp. 548-557.

Glorot, X. and Bengio, Y. (2010), “Understanding the difficulty of training deep feedforward neural networks”, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249-256.

Gok, S. and Sezen, B. (2012), “Capacity inefficiencies of teaching and non-teaching hospitals”, The Service Industries Journal, Vol. 32 No. 14, pp. 2307-2328.

Hatami-Marbini, A., Emrouznejad, A. and Tavana, M. (2011), “A taxonomy and review of the fuzzy data envelopment analysis literature: two decades in the making”, European Journal of Operational Research, Vol. 214 No. 3, pp. 457-472.

Hsiang-Hsi, L.T.-Y., Yung-Ho, C. and Fu-Hsiang, K. (2013), “A comparison of three-stage DEA and artificial neural network on the operational efficiency of semi-conductor firms in Taiwan”, Modern Economy, Vol. 4 No. 1, pp. 20-31.

Huang, H., Lin, W., Ramaswamy, S. and Tschirner, U. (2009), “Process modeling of comprehensive integrated forest biorefinery – an integrated approach”, Applied Biochemistry and Biotechnology, Vol. 154 Nos 1/3, pp. 26-37, doi: 10.1007/s12010-008-8478-7.

Jacobs, R. (2001), “Alternative methods to examine hospital efficiency: data envelopment analysis and stochastic frontier analysis”, Health Care Management Science, Vol. 4 No. 2, pp. 103-115.

Jain, A.K., Duin, R.P. and Mao, J. (2000), “Statistical pattern recognition: a review”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22 No. 1, pp. 4-37.

Knoll, D., Pruglmeier, M. and Reinhart, G. (2016), “Predicting future inbound logistics processes using machine learning”, Procedia CIRP, Vol. 52, pp. 145-150.

Kourentzes, N. (2013), “Intermittent demand forecasts with neural networks”, International Journal of Production Economics, Vol. 143 No. 1, pp. 198-206.

Kuo, C., Hsu, C., Fang, C., Chao, S. and Lin, Y. (2012), “Automatic defect inspection system of colour filters using Taguchi-based neural network”, International Journal of Production Research, Vol. 51 No. 5, pp. 1464-1476.

Kwon, H. (2014), “Performance modeling of mobile phone providers: a DEA-ANN combined approach”, Benchmarking: An International Journal, Vol. 21 No. 6, pp. 1120-1144.

Kwon, H., Lee, J. and Roh, J. (2016), “Best performance modeling using complementary DEA-ANN approach”, Benchmarking: An International Journal, Vol. 23 No. 3, pp. 704-721, doi: 10.1108/BIJ-09-2014-0083.

Lau, H.W., Ho, G.S. and Zhao, Y. (2013), “A demand forecast model using a combination of surrogate data analysis and optimal neural network approach”, Decision Support Systems, Vol. 54 No. 3, pp. 1404-1416.

Li, D. and Dai, W. (2009), “Determining the optimal collaborative benchmarks in a supply chain”, International Journal of Production Research, Vol. 47 No. 16, pp. 4457-4471.

Li, Y. and Liu, L. (2012), “Hybrid artificial neural network and statistical model for forecasting project total duration in earned value management”, International Journal of Networking and Virtual Organisations, Vol. 10 Nos 3/4, pp. 402-413.

Lovell, C. (1993), “Production frontiers and productive efficiency”, in Fried, H.O. and Schmidt, S.S. (Eds), The Measurement of Productive Efficiency: Techniques and Applications, Oxford, pp. 3-67.

Lozano, S. (2014), “Company-wide production planning using a multiple technology DEA approach”, Journal of the Operational Research Society, Vol. 65 No. 5, pp. 723-734.

Lu, W., Wang, W. and Lee, H. (2013), “The relationship between corporate social responsibility and corporate performance: evidence from the US semiconductor industry”, International Journal of Production Research, Vol. 51 No. 19, pp. 5683-5695.

McAdam, R., Hazlett, S. and Gillespie, K. (2008), “Developing a conceptual model of lead performance measurement and benchmarking: a multiple case analysis”, International Journal of Operations and Production Management, Vol. 28 No. 12, pp. 1153-1185.

Mirhedayatian, S., Azadi, M. and Farzipoor Saen, R. (2014), “A novel network data envelopment analysis model for evaluating green supply chain management”, International Journal of Production Economics, Vol. 147, pp. 544-554.

Misiunas, N., Oztekin, A., Chen, Y. and Chandra, K. (2016), “DEANN: a healthcare analytic methodology of data envelopment analysis and artificial neural networks for the prediction of organ recipient functional status”, Omega, Vol. 58, pp. 46-54.

Mostafa, M. (2007), “Evaluating the competitive market efficiency of top listed companies in Egypt”, Journal of Economic Studies, Vol. 34 No. 5, pp. 430-452.

Mostafa, M.M. (2009), “A probabilistic neural network approach for modeling and classifying efficiency of GCC banks”, International Journal of Business Performance Management, Vol. 11 No. 3, pp. 236-258.

Olanrewaju, O., Jimoh, A. and Kholopan, P. (2012), “Integrated IDA–ANN–DEA for assessment and optimization of energy consumption in industrial sectors”, Energy, Vol. 46 No. 1, pp. 629-635.

Ontario Forest Industries Association (OFIA) (2019), “The case of the shrinking wood supply: working together to reverse the trend”, Report presented at the Northwestern Ontario Municipal Association (NOMA) Conference and Annual General Meeting held on 24-26 April 2019 in Thunder Bay, Ontario, available at: www.ofia.com/images/PDFs/OFIA%20Presents%20at%20NOMA%20-%20April%2026%202019.pdf (accessed 3 November 2019).

Ozdemir, D. and Temur, G.T. (2009), “DEA ANN approach in supplier evaluation system”, World Academy of Science, Engineering and Technology, Vol. 54, pp. 343-348.

Paradi, J.C. and Zhu, H. (2013), “A survey on bank branch efficiency and performance research with data envelopment analysis”, Omega, Vol. 41 No. 1, pp. 61-79.

Paradi, J.C., Rouatt, S. and Zhu, H. (2011), “Two-stage evaluation of bank branch efficiency using data envelopment analysis”, Omega, Vol. 39 No. 1, pp. 99-109.

Pendharkar, P.C. and Rodger, J.A. (2003), “Technical efficiency-based selection of learning cases to improve forecasting accuracy of neural networks under monotonicity assumption”, Decision Support Systems, Vol. 36 No. 1, pp. 117-136.

Peng, K., Huang, J. and Wu, W. (2013), “Rasch model in data envelopment analysis: application in the international tourist hotel industry”, Journal of the Operational Research Society, Vol. 64 No. 6, pp. 938-944.

Pitchipoo, P., Venkumar, P. and Rajakarunakaran, S. (2012), “A distinct decision model for the evaluation and selection of a supplier for a chemical processing industry”, International Journal of Production Research, Vol. 50 No. 16, pp. 4635-4648.

Random Lengths (2019), “Random lengths”, The Random Lengths Framing Lumber Composite Price, available at: www.randomlengths.com/In-Depth/Monthly-Composite-Prices/ (accessed 15 June 2019).

Salehirad, N. and Sowlati, T. (2006), “Dynamic efficiency analysis of primary wood producers in British Columbia”, Mathematical and Computer Modelling, Vol. 45 Nos 9/10, pp. 1179-1188.

Samoilenko, S. and Osei-Bryson, K.M. (2010), “Determining sources of relative inefficiency in heterogeneous samples: methodology using cluster analysis, DEA and neural networks”, European Journal of Operational Research, Vol. 206 No. 2, pp. 479-487.

Schmidhuber, J. (2015), “Deep learning in neural networks: an overview”, Neural Networks, Vol. 61, pp. 85-117, doi: 10.1016/j.neunet.2014.09.003.

Shahi, S.K. and Dia, M. (2019a), “Efficiency measurement of Ontario's sawmills using bootstrap data envelopment analysis”, Journal of Multi-Criteria Decision Analysis, Vol. 26 Nos 5/6.

Shahi, S.K. and Dia, M. (2019b), “Empirical study of the performance of Ontario's pulp and paper mills using bootstrap data envelopment analysis”, International Journal of Productivity and Quality Management, Vol. 1 No. 1.

Simar, L. and Wilson, P. (1998), “Sensitivity analysis of efficiency scores: how to bootstrap in nonparametric frontier models”, Management Science, Vol. 44 No. 1, pp. 49-61.

Sowlati, T. (2005), “Efficiency studies in forestry using data envelopment analysis”, Forest Products Journal, Vol. 55, pp. 49-57.

Sun, L. and Stuebs, M. (2013), “Corporate social responsibility and firm productivity: evidence from the chemical industry in the United States”, Journal of Business Ethics, Vol. 118 No. 2, pp. 251-263.

Toma, P., Miglietta, P., Zurlini, G., Valente, D. and Petrosillo, I. (2017), “A non-parametric bootstrap-data envelopment analysis approach for environmental policy planning and management of agricultural efficiency in EU countries”, Ecological Indicators, Vol. 83, pp. 132-143.

Wahab, M., Wu, D. and Lee, C. (2008), “A generic approach to measuring the machine flexibility of manufacturing systems”, European Journal of Operational Research, Vol. 186 No. 1, pp. 137-149.

Wang, C.H., Lu, Y.H., Huang, C.W. and Lee, J.Y. (2013), “R&D, productivity, and market value: an empirical study from high-technology firms”, Omega, Vol. 41 No. 1, pp. 143-155.

Xu, J., Li, B. and Wu, D. (2009), “Rough data envelopment analysis and its application to supply chain performance evaluation”, International Journal of Production Economics, Vol. 122 No. 2, pp. 628-638.

Xue, H., Frey, G., Yude, G., Cubbage, F. and Zhaohui, Z. (2018), “Reform and efficiency of state-owned forest enterprises in Northeast China as social firms”, Journal of Forest Economics, Vol. 30, pp. 18-33.

Yi, L. and Thomas, H. (2009), “A decision support system for the environmental impact of ICT and e-business”, International Journal of Information Technology and Decision Making, Vol. 8 No. 2, pp. 361-377.

Corresponding author

Shashi K. Shahi can be contacted at: sshahi@laurentian.ca