Modeling and Optimization of the Single-Leg Multi-fare Class Overbooking Problem

This paper presents a static overbooking model for a single-leg multi-fare class flight. The cost function is modeled with a realistic distribution of no-show data, using data collected from Ethiopian Airlines. The model accounts for the interaction between classes that may occur at boarding time (i.e. the transfer of an extra passenger in a lower fare class to an empty seat in a higher fare class). Furthermore, the overbooking problem is formulated so that it can be constrained by user defined constraints such as the probability of loss of revenue. The model was solved in two ways: using derivatives that give a closed form expression, and using Monte Carlo simulation with a derivative free optimization algorithm. A comparison of the revenue generated by a no-overbooking policy, the closed form solution, and the Monte Carlo simulation approach shows that the Monte Carlo simulation approach performs better. Overall, the numerical results show that the model is effective in determining the optimal overbooking level for a number of classes and a variety of compensation cost plans.


INTRODUCTION
Overbooking is an airline revenue management (ARM) technique that seeks to account for no-shows and cancellations by accepting more reservations than the available capacity in order to maximize revenue. Approaches to the overbooking problem can be broadly categorized as static and dynamic models. In the static model, the dynamic nature of reservations (cancellations over a period of time) is ignored, and the concern is to find, at the opening of the reservation period, the overbooking level that minimizes cost or maximizes revenue. The dynamic model considers the dynamic nature of reservations and seeks a policy by which the booking operator decides whether to accept or reject a customer's request for a reservation in a certain class at time T. Although dynamic overbooking models treat the overbooking problem in its realistic state, the models are generally mathematically intractable for a real world problem. As such, many of the commercial [...] cost function of overbooking and relaxing some of the assumptions made in prior studies. In light of this, the objective of this paper is to model the overbooking problem for a single-leg multi-fare class flight as a cost minimization problem that can be constrained by a user defined probability of loss of revenue. The contribution of this model over existing static overbooking models is twofold. First, the model considers the interaction between classes that may occur at boarding time (i.e. the transfer of an extra passenger in a lower fare class to an empty seat in a higher fare class). Second, the model is flexible enough to include user defined constraints such as the probability of loss of revenue.
The remainder of the paper reviews the literature on overbooking models, presents the mathematical formulation of the overbooking problem using a realistic distribution of the no-show data, and describes two solution approaches: a closed form expression and a Monte Carlo simulation combined with the derivative free Nelder-Mead algorithm. The paper then presents a numerical analysis and evaluation of the proposed solution approaches.

LITERATURE REVIEW
Overbooking is the practice of intentionally selling more seats than the physical capacity of the plane in order to compensate for no-shows and cancellations at the time of departure, which can be as high as 15% (Chatwin, 1993). A more recent study shows that overbooking accounts for an average increase in revenue of $1 billion per year (Bailey, 2007). Though overbooking can improve the revenue of an airline, it also carries risk. When the number of show-ups is greater than the available capacity, some of the passengers who already bought a ticket will be bumped from the flight (i.e. denied boarding), either voluntarily or involuntarily. In both cases the airline incurs a financial loss in the form of compensation paid to the bumped passengers. In addition to the compensation cost, the bumped passengers retain a bad image of the service, which should be counted as a loss of customer goodwill and can have a substantial long term impact on the business of the airline. However, it was estimated that the financial loss due to overbooking is smaller than the loss from not practicing overbooking (Siddappa, 2006). Accordingly, the objective of the overbooking model is to find the optimal overbooking level that the airline should reserve in order to minimize the expected cost or maximize the expected revenue.
The history of overbooking goes back to the pioneering work of Beckmann and Bobkowski (1985). Their statistical modeling of the overbooking problem laid a foundation for today's revenue management in the airline industry. The first overbooking model proposed by Beckmann was a single-leg single fare-class problem, which is a very simplified form of the actual overbooking problem that airlines face. His model determines the optimal overbooking level by balancing the spoilage cost (lost revenue due to empty seats) against the compensation cost (lost revenue due to bumping of passengers). Thompson developed an overbooking model for two fare classes using the cancellation rates while ignoring the probability distributions of the demand and the no-show rates (Thompson, 1961). His model determines the overbooking limit for a given probability of overbooking. Thompson's work was extended by Taylor (1962) as well as Rothstein and Stone (1967). Taylor's overbooking model, though very simplified, has been implemented and used by many airlines for their booking level control, and it is regarded as the basis for a family of subsequent overbooking models. Bodily and Pfeifer also studied the static overbooking problem using the probability of customer cancellations and no-shows for a single fare-class problem, which is a highly simplified form of the actual scenario (Bodily and Pfeifer, 1992). All the above models deal with either a single fare-class or a two fare-class overbooking problem, which is not always the case in the real world. Later studies, however, consider the multi fare-class overbooking problem (Chi, 1995; Coughlan, 1999; Aydın et al., 2010). Chi considers the multi fare-class overbooking problem and develops a dynamic programming model (Chi, 1995). His model determines the maximum overbooking level that should be used in every fare class for a known demand and show-up distribution of every class.
He further assumed that cancellations can be made without any penalty cost, which makes his model inaccurate since there is a penalty for cancellations. Coughlan (1999) extends the multi fare-class overbooking problem by introducing last minute passengers, also called go-shows: customers who show up at service time without any prior reservation. His model assumes that the demand, the no-shows, and the cancellations are all independently normally distributed. Although bookings and no-shows are assumed there to be normally distributed, in the literature demand is more commonly assumed to follow a Poisson distribution (Subramanian et al., 1999). Modelling demand data by a Poisson distribution seems more realistic, as the Poisson distribution is used for modelling the number of occurrences of an event in a specified period (bookings in this case).
However, the no-show data has to be tested to determine which distribution best describes it.
In this paper a static overbooking model is developed using a realistic distribution for the no-show data, namely the generalized extreme value probability distribution, in modeling the cost function. An attempt was made to solve the model using both a closed form expression and a Monte Carlo simulation with a derivative free optimization approach. Furthermore, the model was made flexible so that it can be transformed, with a user defined constraint, into a constrained optimization problem. This feature is important for decision makers who are sensitive to both customer reaction upon denied boarding and profit loss. The model developed in this paper can be used for any number of classes the airline wishes to define and for any distribution that the particular airline's data may follow. Furthermore, modeling the cost function on a realistic probability distribution derived from historical data relaxes the assumptions made in prior studies, where the cost function was mainly modeled on the binomial distribution.

Notations and Terms
For each fare class i, the model uses the following quantities: the ticket price; the number of overbooking (the overbooking level); the number of no-shows and cancellations, which is a random variable (r.v.) with p.d.f. f(x); the penalty cost of an overbooking; and the opportunity cost of flying with an empty seat.

Revenue
Based on the Anderson-Darling goodness of fit test of the booking data for Ethiopian Airlines, the distribution of the no-shows follows the generalized extreme value distribution as opposed to the commonly assumed normal distribution. The generalized extreme value distribution is appropriate for extreme events, which is the case with denied boarding and empty flight seats. The use of the generalized extreme value distribution in modeling the no-show data is therefore justified not only by the goodness of fit test but also theoretically. The revenue generated from the booking (y) passengers in each class is obtained by multiplying the price of each ticket by the overbooking level made in that class. That is, the revenue generated from overbooking is the product of the number of overbooked passengers and the ticket fare of the overbooked passengers.

Compensation cost
The second term of the compensation cost equation reflects the fact that extra arrivals for a seat in one class may be assigned a seat if there is an empty seat in another class.
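The interaction between classes can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' MATLAB code: the function names `settle_boarding` and `compensation_cost`, the ordering of classes from highest fare (index 0) to lowest, the rule that excess passengers fill the nearest higher class first, and all numbers are hypothetical.

```python
def settle_boarding(capacity, overbooked, no_shows):
    """Return (bumped, empty) per class after moving excess passengers upward.

    Assumption: classes are indexed from the highest fare (0) to the lowest,
    and every class is booked up to capacity plus its overbooking level.
    """
    show_ups = [c + y - x for c, y, x in zip(capacity, overbooked, no_shows)]
    bumped = [max(0, s - c) for s, c in zip(show_ups, capacity)]
    empty = [max(0, c - s) for s, c in zip(show_ups, capacity)]
    # Excess passengers of a lower fare class take empty seats in higher
    # fare classes, starting with the nearest higher class.
    for low in range(len(capacity)):
        for high in range(low - 1, -1, -1):
            moved = min(bumped[low], empty[high])
            bumped[low] -= moved
            empty[high] -= moved
    return bumped, empty


def compensation_cost(capacity, overbooked, no_shows, penalty):
    """Total penalty paid to passengers who are still bumped after transfers."""
    bumped, _ = settle_boarding(capacity, overbooked, no_shows)
    return sum(k * b for k, b in zip(penalty, bumped))


# Hypothetical two-class flight: 10 premium seats, 20 economy seats.
print(settle_boarding([10, 20], [1, 3], [2, 0]))
print(compensation_cost([10, 20], [1, 3], [2, 0], [300.0, 150.0]))
```

In this example three economy passengers exceed the economy capacity, one of them is moved to the single empty premium seat, and only the remaining two incur a compensation cost.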

Spoilage cost (cost of lost opportunity)
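Therefore, the net revenue is modeled as the revenue less the compensation cost and the spoilage cost. Since the number of no-shows is treated as a continuous variable, the expected net revenue is obtained by taking the expectation of the net revenue with respect to the p.d.f. f(x) of the no-shows.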
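The expected net revenue can also be approximated by simulation, which is the idea behind the Monte Carlo approach. The sketch below is a minimal illustration under several assumptions that are not taken from the paper: no-show counts are drawn from a generalized extreme value distribution with made-up parameters (location, scale, shape), each class is booked up to capacity plus its overbooking level, and an exhaustive search over integer overbooking levels stands in for the Nelder-Mead algorithm. All function names and numbers are hypothetical.

```python
import math
import random
from itertools import product


def gev_quantile(u, mu, sigma, xi):
    """Inverse CDF of the generalized extreme value distribution, 0 < u < 1."""
    if xi == 0.0:
        return mu - sigma * math.log(-math.log(u))
    return mu + sigma * ((-math.log(u)) ** (-xi) - 1.0) / xi


def draw_no_shows(rng, params, n):
    """Draw n scenarios, each with one rounded, non-negative count per class."""
    scenarios = []
    for _ in range(n):
        row = []
        for mu, sigma, xi in params:
            u = rng.random()
            while u == 0.0:  # keep u strictly inside (0, 1)
                u = rng.random()
            row.append(max(0, round(gev_quantile(u, mu, sigma, xi))))
        scenarios.append(row)
    return scenarios


def net_revenue(levels, no_shows, capacity, price, penalty, spoilage):
    """Overbooking revenue minus compensation and spoilage for one scenario."""
    # No-shows cannot exceed the number of bookings in a class.
    show = [c + y - min(x, c + y) for c, y, x in zip(capacity, levels, no_shows)]
    bumped = [max(0, s - c) for s, c in zip(show, capacity)]
    empty = [max(0, c - s) for s, c in zip(show, capacity)]
    # Lower fare excess passengers fill empty seats in higher fare classes.
    for low in range(len(capacity)):
        for high in range(low - 1, -1, -1):
            moved = min(bumped[low], empty[high])
            bumped[low] -= moved
            empty[high] -= moved
    revenue = sum(p * y for p, y in zip(price, levels))
    compensation = sum(k * b for k, b in zip(penalty, bumped))
    spoiled = sum(s * e for s, e in zip(spoilage, empty))
    return revenue - compensation - spoiled


def expected_net_revenue(levels, scenarios, capacity, price, penalty, spoilage):
    """Sample average of the net revenue over the simulated scenarios."""
    total = sum(net_revenue(levels, s, capacity, price, penalty, spoilage)
                for s in scenarios)
    return total / len(scenarios)


def best_levels(scenarios, max_level, capacity, price, penalty, spoilage):
    """Exhaustive search over integer overbooking levels 0..max_level."""
    def score(y):
        return expected_net_revenue(y, scenarios, capacity, price, penalty, spoilage)
    best = max(product(range(max_level + 1), repeat=len(capacity)), key=score)
    return best, score(best)


# Hypothetical two-class example (class 0 has the higher fare).
rng = random.Random(42)
scenarios = draw_no_shows(rng, [(2.0, 1.5, 0.1), (6.0, 3.0, 0.1)], 300)
capacity, price = [20, 100], [600.0, 300.0]
penalty, spoilage = [900.0, 450.0], [600.0, 300.0]
levels, value = best_levels(scenarios, 10, capacity, price, penalty, spoilage)
print("overbooking levels:", levels, "expected net revenue:", round(value, 2))
```

The same scenarios are reused for every candidate overbooking level, so differences between candidates reflect the levels themselves rather than sampling noise. A derivative free optimizer such as Nelder-Mead could replace the exhaustive search when the number of classes makes enumeration too expensive.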

SOLUTION APPROACH
To verify and measure the performance of the proposed model, historical data on bookings, no-shows, and cancellations were collected. Eighteen months of data were collected for the purpose of fitting a probability density function (PDF). An outbound station with a daily flight (ADD-DXB) and another station with a lower load factor than other stations (due to no-shows, ADD-CAI) were chosen for the analysis. Then, six months of data on no-shows and the rate of no-shows from each station were fitted separately to a PDF. Since the number of bookings differs from day to day, the rate of no-shows was fitted first to obtain the PDF of the smoothed variable. The no-show counts were then fitted without considering the variation in the number of bookings, to see whether the PDFs of the two fits differ significantly. For the flight destinations in our case example, both the rate of no-shows and the no-show counts were found to follow the same distribution.
A closer look at the number of bookings, the no-shows, and the load factor of Ethiopian Airlines shows that Ethiopian has a negligible number of denied boardings (one per twenty thousand). However, this may be the case not only because of the low overbooking level but also because demand sometimes falls below capacity. When demand is expected to be lower than the available seat capacity, a competitive air fare structure should be used in order to attract potential customers. Ethiopian has a fixed fare structure from which a customer can choose, and this fare structure is calculated mainly from the minimum forecasted load factor, so that the airline operates with an anticipated profit even if it is flying with many empty seats.
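The comparison of candidate distributions for the no-show rate can be illustrated with a Kolmogorov-Smirnov statistic. The sketch below is a simplified illustration and does not use the Ethiopian Airlines data: the sample is synthetic, both candidate distributions are fitted by the method of moments, and the Gumbel distribution (the zero-shape special case of the generalized extreme value distribution) stands in for a full generalized extreme value fit. The function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

EULER_GAMMA = 0.5772156649015329


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))


def fit_gumbel(sample):
    """Method-of-moments estimates of the Gumbel location and scale."""
    beta = stdev(sample) * math.sqrt(6.0) / math.pi
    mu = mean(sample) - EULER_GAMMA * beta
    return mu, beta


def gumbel_cdf(x, mu, beta):
    """CDF of the Gumbel distribution (GEV with zero shape parameter)."""
    return math.exp(-math.exp(-(x - mu) / beta))


# Synthetic daily no-show rates with a right-skewed distribution (assumption).
rng = random.Random(0)
rates = []
for _ in range(2000):
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    rates.append(0.08 - 0.03 * math.log(-math.log(u)))

mu_hat, beta_hat = fit_gumbel(rates)
normal_fit = NormalDist(mean(rates), stdev(rates))

d_gumbel = ks_statistic(rates, lambda x: gumbel_cdf(x, mu_hat, beta_hat))
d_normal = ks_statistic(rates, normal_fit.cdf)
print("KS statistic, Gumbel fit:", round(d_gumbel, 4))
print("KS statistic, normal fit:", round(d_normal, 4))
```

A smaller statistic indicates a closer fit. Critical values and p-values are omitted here, because they change when the parameters are estimated from the same sample.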
The statistics toolbox of MATLAB was also used to check the distribution of the historical data after general distribution fit comparisons were made with the "EasyFit" software. The "EasyFit" software (Mathwave, 2013) is helpful in generating the best distribution fit for our data. For Ethiopian (with respect to the data at hand), the assumption that the no-show and cancellation data follow a beta, normal, or gamma distribution is not applicable, even though it might hold for other airlines as pointed out in the literature review. A detailed comparison of the fits based on three goodness of fit (gof) tests (Kolmogorov-Smirnov, Anderson-Darling, and chi-squared) shows that the generalized extreme value distribution is the best fit for our no-show and cancellation data. Fig 1 shows the fitted data for ADD-CAI along with the test statistic of the chosen gof test.
The probability of loss plot (Fig 3) gives insight into the critical values of overbooking at which a potential loss could occur. In the case example, the overbooking level for class-1 is near ten while for class-2 it is greater than ten. This once more confirms our simulation result given above as seven for class-1 and fourteen for class-2, with a probability of loss around 0.586. One advantage of this graph is that it shows not only the optimal values but also how the overbooking level in each class affects the probability of loss in revenue. Furthermore, the graph gives the decision maker the freedom to relax the overbooking value by a certain amount as long as the probability of loss remains acceptable. Considering the number of no-shows and cancellations that result in empty seats at Ethiopian, the static overbooking model can be extended to determine the minimum ticket price that could be offered without loss.
Hence, developing a flexible or negotiable pricing system (model) for some of the seats could be future work.
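The way a probability of loss can constrain the overbooking level is sketched below for a single fare class. This is an illustration only, under assumptions that are not taken from the paper: a loss is taken to occur when the compensation paid to bumped passengers exceeds the revenue from the overbooked tickets, the class is booked up to capacity plus the overbooking level, and the no-show counts are drawn from a generalized extreme value distribution with made-up parameters. The function names and numbers are hypothetical.

```python
import math
import random


def gev_quantile(u, mu, sigma, xi):
    """Inverse CDF of the generalized extreme value distribution, 0 < u < 1."""
    if xi == 0.0:
        return mu - sigma * math.log(-math.log(u))
    return mu + sigma * ((-math.log(u)) ** (-xi) - 1.0) / xi


def draw_counts(rng, mu, sigma, xi, n):
    """Draw n rounded, non-negative no-show counts."""
    counts = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # keep u strictly inside (0, 1)
            u = rng.random()
        counts.append(max(0, round(gev_quantile(u, mu, sigma, xi))))
    return counts


def probability_of_loss(level, no_shows, price, penalty):
    """Share of scenarios where compensation exceeds overbooking revenue."""
    losses = sum(1 for x in no_shows
                 if price * level < penalty * max(0, level - x))
    return losses / len(no_shows)


def largest_level_within(alpha, max_level, no_shows, price, penalty):
    """Largest overbooking level whose probability of loss is at most alpha."""
    feasible = [y for y in range(max_level + 1)
                if probability_of_loss(y, no_shows, price, penalty) <= alpha]
    return max(feasible)


# Hypothetical single-class example.
rng = random.Random(7)
no_shows = draw_counts(rng, 6.0, 3.0, 0.1, 5000)
for y in range(0, 21, 5):
    print(y, round(probability_of_loss(y, no_shows, 300.0, 450.0), 3))
print("largest level with P(loss) <= 0.05:",
      largest_level_within(0.05, 20, no_shows, 300.0, 450.0))
```

Tabulating the probability for a range of levels gives the kind of curve a decision maker can read to trade a higher overbooking level against an acceptable probability of loss.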

ACKNOWLEDGEMENTS
The authors thank the anonymous reviewers for providing critical and useful comments.

Sample MATLAB Codes Used
MATLAB codes used to calculate the optimal overbooking level, the expected net revenue, and the probability of loss.