Optimization of Medical Units for Mobile Shelter Hospitals via Graphical Games and Reinforcement Learning Approach

Ertai E, Jinyuan Liu, Haoran Xu, Xiaoshuai Hao, Jinglin He, Daomeng Cai, Jinhui Pang

Keywords: Mobile Shelter Hospitals, Multi-armed Bandit Algorithm, Simulated Virtual Environment, Graph Models, Nash Equilibrium, Hospital Management
TL;DR: This paper presents an innovative approach to optimizing resource allocation in mobile shelter hospitals, employing graph models, game theory, and reinforcement learning, significantly reducing treatment times and improving hospital efficiency.
This paper models and solves the medical unit allocation problem of mobile shelter hospitals, involving randomness, network equilibrium, and layout planning. To simulate this medical activity, we propose a simulation environment with a directed graph and random patient arrival times and routes. The objective is to minimize the hospital's total service time while patients follow the user-optimizing principle to obtain earlier service. We show that a Nash equilibrium exists in patient strategies with the overall goal of minimizing hospital service time. The multi-armed bandit (MAB) algorithm from reinforcement learning is used to identify optimal medical unit allocation strategies for both adaptive and non-adaptive patient scenarios. The study presents an allocation strategy for scenarios with 3000 patients, 9 resources, and 7 hospital areas, achieving a total treatment time of 54,190, an average room utilization rate of approximately 0.8, and an average number of lingering patients of less than 1% of the total number of patients.

Primary Area: Applications (computational biology, crowdsourcing, healthcare, neuroscience, social good, climate science, etc.)
Official Review of Submission3989 by Reviewer r7yk

This paper aims to optimize the placement/allocation of mobile shelter hospitals in anticipation of demand, for example during mass emergencies. It introduces a network model and graphical games, studies the Nash equilibrium of the game, and uses multi-armed bandit learning algorithms to find it.

Strengths And Weaknesses:

(+) The optimal placement/allocation of mobile hospitals is an important problem. It is not too different from the ambulance location problem, which is very well studied in the OR literature.

(-) There are a number of problems with this paper: The first is modeling - it is unrealistic and doesn't really make much sense. A lot of stuff is thrown around - queues, graphs, games, Nash equilibria, multi-armed bandit eq. And it is hard to understand the relevance in the mish-mash.
(-) There are some experimental results. One would expect at least some aspect of them to come from a real-world setting or data, but they are all synthetic.

Overall, a poor paper that should have received a desk reject instead of wasting reviewer time.

  1. Please clarify why the strategic aspect is being considered, why Nash equilibrium is a suitable notion to predict the outcome, and how Algorithm 1 is relevant.
Official Review of Submission3989 by Reviewer EZTk

This paper presents an interesting approach to optimizing the layout and allocation of medical resources in mobile shelter hospitals during emergency situations. The authors model the hospital network as a directed hypergraph and formulate the optimization problem of minimizing total service time while accounting for patient flows, queues, priorities, and randomness in arrivals and routes.

Strengths And Weaknesses:

First of all, this is a pure application paper with no new methodology or theory. The problem is well-motivated and the introduction of the real-world application is clear.

  1. All the experiments are synthetic. Medical unit allocation is an important problem, but without any real data it is very hard to know whether the simulator is realistic. I do not think we can use this simple simulator to model the real-world situation with patient-hospital interaction. The claims about utilization rate, treatment time, and x% improvement are then meaningless, since the simulator parameters can easily be tuned. As this is a pure applied paper, I think working with real data is required.

  2. I do not see the necessity of using a multi-armed bandit approach. Is the goal to solve the optimization problem (1)? Is that a pure optimization problem? Bandits try to solve an exploration problem. Why do we need exploration in this application? To explore what? In bandits, there are two objectives: one is cumulative regret minimization and the other is best-arm identification. It would be better to clarify which formulation you are using. Different problem formulations call for different metrics.

  3. I do not think this paper has anything to do with reinforcement learning. There is no state concept at all. Bandits do not belong to RL.
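
To make the distinction in point 2 concrete, here is a minimal sketch of the two bandit objectives on a toy problem. It assumes Bernoulli arms with hypothetical means and uses two textbook procedures (UCB1 for cumulative regret, uniform allocation for best-arm identification). None of this is taken from the paper, and it is not the paper's Algorithm 1.

```python
import math
import random


def ucb1(means, horizon, rng):
    """Run UCB1 on Bernoulli arms.

    Returns the cumulative pseudo-regret and the per-arm pull counts.
    `means` are hypothetical true success probabilities (an assumption).
    """
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: pull every arm once
        else:
            # pick the arm with the largest upper confidence bound
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # cost of not pulling the best arm
    return regret, counts


def uniform_best_arm(means, budget, rng):
    """Spend the budget evenly, then recommend the empirically best arm.

    Returns the recommended arm and its simple regret.
    """
    k = len(means)
    pulls = budget // k
    estimates = []
    for m in means:
        wins = sum(1 for _ in range(pulls) if rng.random() < m)
        estimates.append(wins / pulls)
    recommended = max(range(k), key=lambda a: estimates[a])
    return recommended, max(means) - means[recommended]


means = [0.3, 0.5, 0.7]  # hypothetical arm means, for illustration only
rng = random.Random(0)
regret, counts = ucb1(means, 2000, rng)
recommended, simple_regret = uniform_best_arm(means, 2000, rng)
print("cumulative regret:", round(regret, 2), "pull counts:", counts)
print("recommended arm:", recommended, "simple regret:", simple_regret)
```

Cumulative regret penalizes every suboptimal pull during learning, while simple regret only scores the final recommendation. If the allocation is evaluated in a simulator where exploration is free, the second metric is presumably the relevant one.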

Official Review of Submission3989 by Reviewer rAVT

This paper proposes an approach to optimizing medical unit allocation in mobile shelter hospitals by integrating graphical game theory and reinforcement learning, specifically through the use of Nash equilibrium and the Multi-Armed Bandit (MAB) algorithm. It models the patient flow within a hospital as a directed graph, considering randomness in patient arrivals and movements, to improve operational metrics such as total treatment time, room utilization rate, and the reduction of lingering patient numbers. The approach is validated via a simulation involving scenarios with 3,000 patients, demonstrating its potential to enhance the efficiency and responsiveness of mobile shelter hospitals in complex medical scenarios.

Strengths And Weaknesses:

Strengths: The study addresses a critical and timely issue, offering the potential for a significant impact on emergency medical response. The findings could be beneficial for the planning and operation of mobile shelter hospitals, especially in disaster response scenarios. The use of Nash equilibrium to model patient strategies in seeking medical services is novel in the context of mobile shelter hospitals.

Weaknesses: The approach builds upon existing theories and algorithms. The uniqueness lies more in the application than in the development of new methodologies.

Some parts of the optimization objective and the implementation details could be elaborated further for clarity.

The connection between the proposed method and the resolution of the defined optimization problem and game structure is not convincingly established. It remains unclear how the method directly addresses the optimization of medical unit allocation in the context of the game theoretical model presented and how it contributes to finding an equilibrium in patient strategies.

Details about the problem space, including the constraints and assumptions made in the model, are not stated explicitly.

What is t_v in the second row of Eq.(1)? Is it the servicing time t_v^s? And what is this objective maximizing over? Is it the servicing time of a particular clinic room v, the minimum across all clinic rooms, or the sum of all?

In lines 256 and 257, why is [O, X_6, X_1, X_3, D] not an option? Should it be X_1 instead of X_2 in line 253?
How is the ascending order of all patients defined? Does it depend on both arrival time and priority? Do these need to be known in advance? Would a patient who arrives later but with higher priority break the equilibrium? And how can the order of all patients be obtained if their arrival times follow a random distribution?

What are the potential challenges in implementing this model in a live hospital setting, and how might they be addressed?

While the paper acknowledges the simulation-based nature of its findings, a discussion on the limitations regarding real-world applicability and potential negative societal impacts is somewhat lacking. The paper might benefit from additional validation, such as real-world testing or comparison with other optimization techniques.

