Discrete Stochastic Programming by Infinitesimal Perturbation Analysis: the Case of Resource Allocation in Satellite Networks with Fading Stefano Canesi*, Franco Davoli*, Mario Marchese°, Maurizio Mongelli* * DIST - Department of Communications, Computer and Systems Science ° CNIT - Italian National Consortium for Telecommunications University of Genoa, Via Opera Pia 13, 16145 Genova, Italy Email*: ste, franco, mopa@dist.unige.it Email°: mario.marchese@cnit.it Abstract - In this paper we consider a resource allocation problem for a satellite network, where variations in fading conditions are added to those of traffic load. Since the capacity of the system is finite and divided in finite discrete portions, the resource allocation problem reveals to be a discrete stochastic programming one, which is typically NP-Hard. In practice, only provable good (but non-optimal) heuristics can be applied in real on-line scenarios to approximate the optimal solution that could be obtained through a Dynamic Programming formulation. Following the theoretical framework of [Cassandras et al., 2001, Cassandras et al., 2002], we adopt a novel approach based on the minimization over a discrete constraint set, using an on-line estimation of the gradient, obtained through a “relaxed continuous extension” of the performance measure. The computation of the gradient estimation is based on the Infinitesimal Perturbation Analysis technique, applied on a Stochastic Fluid Model of the network. Since neither closed forms of the performance measures, nor additional feedbacks concerning the state of the system, and very mild assumptions on the probabilistic properties about the statistical processes involved in the problem are requested, this technique reveals to be applicable in real on-line operating conditions for several application scenarios and cost functions. I. INTRODUCTION In broadband networks, in the presence of limited resources (buffers, bandwidth, or processing capacity), several forms of control are exerted to maintain a desired level of performance for all users and traffic types. In satellite networks, dynamically varying fading conditions over the channel can heavily affect the transmission quality, especially when working in Ka band, where the effect of rain over the quality of transmission is more significant. In the literature, it is possible to find optimal policies developed in the case of a finite quantity of transmission energy for satellite network devices (see e.g., [Fu et al., 2002] and references therein). [Fu et al., 2002] show a dynamic programming formulation of the problem that leads, for special cases, to a closed-form optimal policy, in order to find a tradeoff between the minimization of energy required to send a fixed amount of data and the maximization of the throughput over a fading channel. Resource allocation for fading multi-user broadcast channels is a popular topic also in information theory (see, e.g., [Tsybakov, 2002] and references therein). In these works, the problem is analyzed and solved at the physical layer: power allocation is performed in order to obtain good reactions to variable fading conditions. In [Celandroni et al., 2003] and in this work the control system is located at the physical and at the data link (or upper) layers, and provides adaptive bandwidth allocation strategies. Resource allocation problems for telecommunication networks have been widely studied in the literature [Boxma et al. 1994, Ephremides et al. 1989, Alagoz et al., 2004]. Usually, the control systems are based on closed-form expressions for the performance measure of interest (see, e.g., [Bolla et al., 2001, Celandroni et al., 2003, Keon et al. 2003]). For example, in [Celandroni et al., 2003] and [Bolla et al., 2001] the Tsybakov- Georganas formula for the cell loss probability in the presence of self-similar traffic ([Tsybakov et al., 1998]) is used. The main drawback of these approaches is due to the fact that conditions for the applicability of closed forms of such functional costs are difficult to implement in real-life contexts. Formulas in closed form need a continuous mapping between the current statistical behaviour of the system and their parameters to maintain good performance of the allocation algorithm. All these techniques need also the application of dynamic programming, whose on-line application in a real context is quite impractical, due to the well-known “curse of dimensionality” problem. We present a novel solution for the bandwidth allocation in a satellite environment based on an “Infinitesimal Perturbation Analysis” (IPA) technique and on a “surrogate” relaxation of the discrete functional cost. Our aim is to obtain a new optimization algorithm that is light (i.e., polynomial in the state space), adaptive (i.e., able to “learn” the statistical changes of the system) and non-parametric (i.e., able to optimize the system performance without any closed form of the performance measure). IPA is a sensitivity estimation technique for Discrete Event Systems [Wardi et al., 2002]. It is based on the observation of the sample paths followed by the stochastic processes of the system and it gives an estimation of the derivative of the performance .....