Le vent nous portera

My full name is Abbasali Koochakzadeh, but I usually go by Abbas, a name with an Arabic root that means the leader (or bravest) in a pride of lions. I have been a PhD student in Electrical and Computer Engineering (ECE) at Purdue since Jan 2023. Before that, I was a Master's student in ECE at the University of Minnesota (UMN) from Jun 2021 to Dec 2022. I completed my undergraduate studies in Electrical Engineering at Tehran Polytechnic from 2016 to 2020.
Federated Reinforcement Learning (FRL) provides a promising way to speed up training in reinforcement learning using multiple edge devices that can operate in parallel. Recently, it has been shown that even when these edge devices have access to different dynamic models, an optimal convergence rate with a linear speedup proportional to the number of devices is achievable. However, this result requires that the stepsize in the algorithm be chosen in a manner dependent on the unknown model parameters. Also, it applies only to a discounted setting, which has been argued to fit episodic tasks better than continuing control tasks. In this paper, we obtain finite-time bounds for heterogeneous FRL with average rewards. We show that the optimal convergence rate with a linear speedup is possible even with a universal stepsize choice, independent of the underlying dynamics. To achieve our result, we modify the existing one-timescale FRL method to a novel two-timescale variant that additionally incorporates iterate averaging. (link)
We propose an automata-theoretic approach for reinforcement learning (RL) under complex spatio-temporal constraints with time windows. The problem is formulated using a Markov decision process under a bounded temporal logic constraint. Unlike existing RL methods that can eventually learn optimal policies satisfying such constraints, our proposed approach enforces a desired probability of constraint satisfaction throughout learning. This is achieved by translating the bounded temporal logic constraint into a total automaton and avoiding “unsafe” actions based on the available prior information regarding the transition probabilities, i.e., a pair of upper and lower bounds for each transition probability. We provide theoretical guarantees on the resulting probability of constraint satisfaction. We also provide numerical results in a scenario where a robot explores the environment to discover high-reward regions while fulfilling some periodic pick-up and delivery tasks that are encoded as temporal logic constraints (Pdf).
Many applications of multi-agent systems require complex team objectives and constraints to be satisfied by a team of possibly heterogeneous mobile agents (e.g., aerial and ground robots). One way to express such mission specifications is to define them based on the spatio-temporal distribution of agents among the regions of interest in the mission area. For example, the mission may require that a region be visited periodically by a certain number of agents of each type, that a certain region be visited only after some other region is visited, or that a specified region never contain more than a certain number of agents. We study distributed planning of multi-agent systems under such complex specifications on the distribution of agents. We propose Swarm Signal Temporal Logic (SSTL), which is an extension of Signal Temporal Logic (STL) for expressing the specifications on teams of possibly heterogeneous agents. We then present a game-theoretic approach for optimizing the agent trajectories. More specifically, we formulate the planning problem as a potential game and use log-linear learning, which is a noisy best-response type algorithm, to drive the agents to optimal combinations of trajectories. The performance of the proposed approach is also demonstrated numerically in a case study (Pdf).
Learning in games has been widely used to solve many cooperative multi-agent problems such as coverage control, consensus, self-reconfiguration, or vehicle-target assignment. One standard approach in this domain is to formulate the problem as a potential game and to use an algorithm such as log-linear learning to achieve the stochastic stability of globally optimal configurations. Standard versions of such learning algorithms are asynchronous, i.e., only one agent updates its action at each round of the learning process. To enable faster learning, we propose a synchronization strategy based on decentralized random prioritization of agents, which allows multiple agents to change their actions simultaneously when they do not affect each other's utility or feasible actions. We show that the proposed approach can be integrated into any standard asynchronous learning algorithm to improve the convergence speed while maintaining the limiting behavior (e.g., stochastically stable configurations). We support our theoretical results with simulations in a coverage control scenario (Pdf).
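To make the asynchronous update described above concrete, here is a minimal sketch of standard log-linear learning on a toy identical-interest game, which is one simple kind of potential game. It is an illustration under stated assumptions, not the implementation from either paper: the toy potential (number of distinct cells covered), the function names, and all parameter values are hypothetical. In each round a single randomly chosen agent resamples its action with probability proportional to exp(utility / temperature).

```python
import math
import random


def potential(actions):
    """Toy global objective (assumed): number of distinct cells covered."""
    return len(set(actions))


def log_linear_step(actions, num_cells, temperature, rng):
    """One asynchronous round: a single random agent resamples its action."""
    i = rng.randrange(len(actions))
    # Utility of each candidate action, holding the other agents fixed.
    # In this identical-interest toy game, each utility equals the potential.
    utilities = []
    for a in range(num_cells):
        trial = actions[:i] + [a] + actions[i + 1:]
        utilities.append(potential(trial))
    # Boltzmann (softmax) choice; subtract the max for numerical stability.
    u_max = max(utilities)
    weights = [math.exp((u - u_max) / temperature) for u in utilities]
    actions[i] = rng.choices(range(num_cells), weights=weights, k=1)[0]
    return actions


def run(num_agents=4, num_cells=4, temperature=0.05, rounds=2000, seed=0):
    """Run log-linear learning from the all-zeros joint action."""
    rng = random.Random(seed)
    actions = [0] * num_agents
    for _ in range(rounds):
        log_linear_step(actions, num_cells, temperature, rng)
    return actions


if __name__ == "__main__":
    final = run()
    print(final, potential(final))
```

With a small temperature the update behaves almost like a best response, so the agents in this toy game spread out over distinct cells. A larger temperature adds more exploration at the cost of noisier limiting behavior.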
Delay, and especially delay in the transmission of agents’ information, is one of the most important causes of disruption to achieving consensus in a multi-agent system. This paper deals with achieving consensus in delayed fractional-order multi-agent systems (FOMAS). The aim is to find the exact maximum allowable delay in a FOMAS with non-uniform delay, i.e., the case in which the interactions between agents are subject to non-identical communication time-delays. By proving a stability theorem, the results available for non-delayed networked fractional-order systems are extended to the case in which interaction links have unequal communication time-delays. In this extension, by considering a time-delay coordination algorithm, necessary and sufficient conditions on the time delays and interaction graph are presented to guarantee coordination. In addition, the delay-dependent stability region is obtained. Finally, the dependency of the maximum allowable delay on two parameters, the agent fractional order and the largest eigenvalue of the graph Laplacian matrix, is exactly determined. Numerical simulation results are given to confirm the proposed methodologies (Pdf).
This study establishes necessary and sufficient conditions for asymptotic swarm stability, i.e., consensus, in a class of fractional-order multi-agent systems (FOMAS) with interval uncertainties, for fractional orders 0 < α < 1 and 1 < α < 2. The conditions depend on the graph topology, agent dynamics, and neighbor interactions. It is shown that the fractional-order interval multi-agent system achieves consensus if and only if there exist Hermitian matrices that satisfy a particular kind of complex Lyapunov inequality for all of the system vertex matrices. To this end, it is first shown under which conditions a multi-agent system with unstable agents can still achieve consensus. Then, using a lemma and a theorem, the Lyapunov inequality regarding the negativity of the maximum eigenvalue of an augmented matrix of a FOMAS is used to find such Hermitian matrices by checking only a finite number of system vertex matrices. As a result, the necessary and sufficient conditions for consensus in a FOMAS in the presence of interval uncertainties are obtained in terms of Lyapunov inequalities. With the main theorem of the paper, only a finite number of vertex matrices, rather than infinitely many matrices, need to be checked in the Lyapunov inequalities. Numerical simulation examples are provided at the end of the paper to confirm the results (Pdf).
Flexible probes for minimally invasive surgery, which enter the skin with minimal damage to internal tissues, have attracted the attention of researchers. One of the challenges of such bioinspired probes is the presence of model uncertainties and disturbances, such as the unknown force applied by internal tissues. In this paper, the optimal trajectory tracking problem for these probes, subject to both model uncertainties and unbounded external disturbances, is considered. For this purpose, assuming that the upper bound of the scalar sum of uncertainties and disturbances is unknown, adaptive robust sliding mode control laws are designed to track the position and prove stability, with this upper bound estimated using adaptive rules. Furthermore, to make the problem more realistic, and considering that many external disturbances depend on the state of the probe inside the body, the upper bound of the system disturbances is not taken as a fixed value but as a linear function, with unknown coefficients, of the soft variables of the probe state; a robust-adaptive sliding mode control law is designed for stabilization and for estimating the coefficients of this functional upper bound. Numerical simulation results confirm the correctness of the designed controllers (Pdf).
  • Chiminiski hub, MSEE, Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA