mirror of
https://github.com/dptech-corp/Uni-Lab-OS.git
synced 2026-02-04 05:15:10 +00:00
Upgrade to py 3.11.14; ROS2 Humble 0.7; unilabos 0.10.16
Workbench example, adjust log level, and ci check (#220) * TestLatency Return Value Example & gitignore update * Adjust log level & Add workbench virtual example & Add not action decorator & Add check_mode & * Add CI Check Fix/workstation yb revision (#217) * Revert log change & update registry * Revert opcua client & move electrolyte node Workstation yb merge dev ready 260113 (#216) * feat(bioyond): 添加计算实验设计功能,支持化合物配比和滴定比例参数 * feat(bioyond): 添加测量小瓶功能,支持基本参数配置 * feat(bioyond): 添加测量小瓶配置,支持新设备参数 * feat(bioyond): 更新仓库布局和尺寸,支持竖向排列的测量小瓶和试剂存放堆栈 * feat(bioyond): 优化任务创建流程,确保无论成功与否都清理任务队列以避免重复累积 * feat(bioyond): 添加设置反应器温度功能,支持温度范围和异常处理 * feat(bioyond): 调整反应器位置配置,统一坐标格式 * feat(bioyond): 添加调度器启动功能,支持任务队列执行并处理异常 * feat(bioyond): 优化调度器启动功能,添加异常处理并更新相关配置 * feat(opcua): 增强节点ID解析兼容性和数据类型处理 改进节点ID解析逻辑以支持多种格式,包括字符串和数字标识符 添加数据类型转换处理,确保写入值时类型匹配 优化错误提示信息,便于调试节点连接问题 * feat(registry): 新增后处理站的设备配置文件 添加后处理站的YAML配置文件,包含动作映射、状态类型和设备描述 * 添加调度器启动功能,合并物料参数配置,优化物料参数处理逻辑 * 添加从 Bioyond 系统自动同步工作流序列的功能,并更新相关配置 * fix:兼容 BioyondReactionStation 中 workflow_sequence 被重写为 property * fix:同步工作流序列 * feat: remove commented workflow synchronization from `reaction_station.py`. * 添加时间约束功能及相关配置 * fix:自动更新物料缓存功能,添加物料时更新缓存并在删除时移除缓存项 * fix:在添加物料时处理字符串和字典返回值,确保正确更新缓存 * fix:更新奔曜错误处理报送为物料变更报送,调整日志记录和响应消息 * feat:添加实验报告简化功能,去除冗余信息并保留关键信息 * feat: 添加任务状态事件发布功能,监控并报告任务运行、超时、完成和错误状态 * fix: 修复添加物料时数据格式错误 * Refactor bioyond_dispensing_station and reaction_station_bioyond YAML configurations - Removed redundant action value mappings from bioyond_dispensing_station. - Updated goal properties in bioyond_dispensing_station to use enums for target_stack and other parameters. - Changed data types for end_point and start_point in reaction_station_bioyond to use string enums (Start, End). - Simplified descriptions and updated measurement units from μL to mL where applicable. - Removed unused commands from reaction_station_bioyond to streamline the configuration. * fix:Change the material unit from μL to mL * fix:refresh_material_cache * feat: 动态获取工作流步骤ID,优化工作流配置 * feat: 添加清空服务端所有非核心工作流功能 * fix:修复Bottle类的序列化和反序列化方法 * feat:增强材料缓存更新逻辑,支持处理返回数据中的详细信息 * Add debug log * feat(workstation): update bioyond config migration and coin cell material search logic - Migrate bioyond_cell config to JSON structure and remove global variable dependencies - Implement material search confirmation dialog auto-handling - Add documentation: 20260113_物料搜寻确认弹窗自动处理功能.md and 20260113_配置迁移修改总结.md * Refactor module paths for Bioyond devices in YAML configuration files - Updated the module path for BioyondDispensingStation in bioyond_dispensing_station.yaml to reflect the new directory structure. - Updated the module path for BioyondReactionStation and BioyondReactor in reaction_station_bioyond.yaml to align with the revised organization of the codebase. * fix: WareHouse 的不可哈希类型错误,优化父节点去重逻辑 * refactor: Move config from module to instance initialization * fix: 修正 reaction_station 目录名拼写错误 * feat: Integrate material search logic and cleanup deprecated files - Update coin_cell_assembly.py with material search dialog handling - Update YB_warehouses.py with latest warehouse configurations - Remove outdated documentation and test data files * Refactor: Use instance attributes for action names and workflow step IDs * refactor: Split tipbox storage into left and right warehouses * refactor: Merge tipbox storage left and right into single warehouse --------- Co-authored-by: ZiWei <131428629+ZiWei09@users.noreply.github.com> Co-authored-by: Andy6M <xieqiming1132@qq.com> fix: WareHouse 的不可哈希类型错误,优化父节点去重逻辑 fix parent_uuid fetch when bind_parent_id == node_name 物料更新也是用父节点进行报送 Add None conversion for tube rack etc. Add set_liquid example. Add create_resource and test_resource example. Add restart. Temp allow action message. Add no_update_feedback option. Create session_id by edge. bump version to 0.10.15 temp cancel update req
This commit is contained in:
@@ -7,7 +7,6 @@ import sys
|
||||
import threading
|
||||
import time
|
||||
from typing import Dict, Any, List
|
||||
|
||||
import networkx as nx
|
||||
import yaml
|
||||
|
||||
@@ -17,9 +16,15 @@ unilabos_dir = os.path.dirname(os.path.dirname(current_dir))
|
||||
if unilabos_dir not in sys.path:
|
||||
sys.path.append(unilabos_dir)
|
||||
|
||||
from unilabos.app.utils import cleanup_for_restart
|
||||
from unilabos.utils.banner_print import print_status, print_unilab_banner
|
||||
from unilabos.config.config import load_config, BasicConfig, HTTPConfig
|
||||
|
||||
# Global restart flags (used by ws_client and web/server)
|
||||
_restart_requested: bool = False
|
||||
_restart_reason: str = ""
|
||||
|
||||
|
||||
def load_config_from_file(config_path):
|
||||
if config_path is None:
|
||||
config_path = os.environ.get("UNILABOS_BASICCONFIG_CONFIG_PATH", None)
|
||||
@@ -41,7 +46,7 @@ def convert_argv_dashes_to_underscores(args: argparse.ArgumentParser):
|
||||
for i, arg in enumerate(sys.argv):
|
||||
for option_string in option_strings:
|
||||
if arg.startswith(option_string):
|
||||
new_arg = arg[:2] + arg[2:len(option_string)].replace("-", "_") + arg[len(option_string):]
|
||||
new_arg = arg[:2] + arg[2 : len(option_string)].replace("-", "_") + arg[len(option_string) :]
|
||||
sys.argv[i] = new_arg
|
||||
break
|
||||
|
||||
@@ -49,6 +54,8 @@ def convert_argv_dashes_to_underscores(args: argparse.ArgumentParser):
|
||||
def parse_args():
|
||||
"""解析命令行参数"""
|
||||
parser = argparse.ArgumentParser(description="Start Uni-Lab Edge server.")
|
||||
subparsers = parser.add_subparsers(title="Valid subcommands", dest="command")
|
||||
|
||||
parser.add_argument("-g", "--graph", help="Physical setup graph file path.")
|
||||
parser.add_argument("-c", "--controllers", default=None, help="Controllers config file path.")
|
||||
parser.add_argument(
|
||||
@@ -153,6 +160,50 @@ def parse_args():
|
||||
default=False,
|
||||
help="Complete registry information",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--check_mode",
|
||||
action="store_true",
|
||||
default=False,
|
||||
help="Run in check mode for CI: validates registry imports and ensures no file changes",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--no_update_feedback",
|
||||
action="store_true",
|
||||
help="Disable sending update feedback to server",
|
||||
)
|
||||
# workflow upload subcommand
|
||||
workflow_parser = subparsers.add_parser(
|
||||
"workflow_upload",
|
||||
aliases=["wf"],
|
||||
help="Upload workflow from xdl/json/python files",
|
||||
)
|
||||
workflow_parser.add_argument(
|
||||
"-f",
|
||||
"--workflow_file",
|
||||
type=str,
|
||||
required=True,
|
||||
help="Path to the workflow file (JSON format)",
|
||||
)
|
||||
workflow_parser.add_argument(
|
||||
"-n",
|
||||
"--workflow_name",
|
||||
type=str,
|
||||
default=None,
|
||||
help="Workflow name, if not provided will use the name from file or filename",
|
||||
)
|
||||
workflow_parser.add_argument(
|
||||
"--tags",
|
||||
type=str,
|
||||
nargs="*",
|
||||
default=[],
|
||||
help="Tags for the workflow (space-separated)",
|
||||
)
|
||||
workflow_parser.add_argument(
|
||||
"--published",
|
||||
action="store_true",
|
||||
default=False,
|
||||
help="Whether to publish the workflow (default: False)",
|
||||
)
|
||||
return parser
|
||||
|
||||
|
||||
@@ -165,10 +216,12 @@ def main():
|
||||
args_dict = vars(args)
|
||||
|
||||
# 环境检查 - 检查并自动安装必需的包 (可选)
|
||||
if not args_dict.get("skip_env_check", False):
|
||||
skip_env_check = args_dict.get("skip_env_check", False)
|
||||
check_mode = args_dict.get("check_mode", False)
|
||||
|
||||
if not skip_env_check:
|
||||
from unilabos.utils.environment_check import check_environment
|
||||
|
||||
print_status("正在进行环境依赖检查...", "info")
|
||||
if not check_environment(auto_install=True):
|
||||
print_status("环境检查失败,程序退出", "error")
|
||||
os._exit(1)
|
||||
@@ -177,7 +230,21 @@ def main():
|
||||
|
||||
# 加载配置文件,优先加载config,然后从env读取
|
||||
config_path = args_dict.get("config")
|
||||
if os.getcwd().endswith("unilabos_data"):
|
||||
|
||||
if check_mode:
|
||||
args_dict["working_dir"] = os.path.abspath(os.getcwd())
|
||||
# 当 skip_env_check 时,默认使用当前目录作为 working_dir
|
||||
if skip_env_check and not args_dict.get("working_dir") and not config_path:
|
||||
working_dir = os.path.abspath(os.getcwd())
|
||||
print_status(f"跳过环境检查模式:使用当前目录作为工作目录 {working_dir}", "info")
|
||||
# 检查当前目录是否有 local_config.py
|
||||
local_config_in_cwd = os.path.join(working_dir, "local_config.py")
|
||||
if os.path.exists(local_config_in_cwd):
|
||||
config_path = local_config_in_cwd
|
||||
print_status(f"发现本地配置文件: {config_path}", "info")
|
||||
else:
|
||||
print_status(f"未指定config路径,可通过 --config 传入 local_config.py 文件路径", "info")
|
||||
elif os.getcwd().endswith("unilabos_data"):
|
||||
working_dir = os.path.abspath(os.getcwd())
|
||||
else:
|
||||
working_dir = os.path.abspath(os.path.join(os.getcwd(), "unilabos_data"))
|
||||
@@ -196,7 +263,7 @@ def main():
|
||||
working_dir = os.path.dirname(config_path)
|
||||
elif os.path.exists(working_dir) and os.path.exists(os.path.join(working_dir, "local_config.py")):
|
||||
config_path = os.path.join(working_dir, "local_config.py")
|
||||
elif not config_path and (
|
||||
elif not skip_env_check and not config_path and (
|
||||
not os.path.exists(working_dir) or not os.path.exists(os.path.join(working_dir, "local_config.py"))
|
||||
):
|
||||
print_status(f"未指定config路径,可通过 --config 传入 local_config.py 文件路径", "info")
|
||||
@@ -210,9 +277,11 @@ def main():
|
||||
print_status(f"已创建 local_config.py 路径: {config_path}", "info")
|
||||
else:
|
||||
os._exit(1)
|
||||
# 加载配置文件
|
||||
|
||||
# 加载配置文件 (check_mode 跳过)
|
||||
print_status(f"当前工作目录为 {working_dir}", "info")
|
||||
load_config_from_file(config_path)
|
||||
if not check_mode:
|
||||
load_config_from_file(config_path)
|
||||
|
||||
# 根据配置重新设置日志级别
|
||||
from unilabos.utils.log import configure_logger, logger
|
||||
@@ -241,9 +310,12 @@ def main():
|
||||
if args_dict.get("sk", ""):
|
||||
BasicConfig.sk = args_dict.get("sk", "")
|
||||
print_status("传入了sk参数,优先采用传入参数!", "info")
|
||||
BasicConfig.working_dir = working_dir
|
||||
|
||||
workflow_upload = args_dict.get("command") in ("workflow_upload", "wf")
|
||||
|
||||
# 使用远程资源启动
|
||||
if args_dict["use_remote_resource"]:
|
||||
if not workflow_upload and args_dict["use_remote_resource"]:
|
||||
print_status("使用远程资源启动", "info")
|
||||
from unilabos.app.web import http_client
|
||||
|
||||
@@ -256,15 +328,16 @@ def main():
|
||||
|
||||
BasicConfig.port = args_dict["port"] if args_dict["port"] else BasicConfig.port
|
||||
BasicConfig.disable_browser = args_dict["disable_browser"] or BasicConfig.disable_browser
|
||||
BasicConfig.working_dir = working_dir
|
||||
BasicConfig.is_host_mode = not args_dict.get("is_slave", False)
|
||||
BasicConfig.slave_no_host = args_dict.get("slave_no_host", False)
|
||||
BasicConfig.upload_registry = args_dict.get("upload_registry", False)
|
||||
BasicConfig.no_update_feedback = args_dict.get("no_update_feedback", False)
|
||||
BasicConfig.communication_protocol = "websocket"
|
||||
machine_name = os.popen("hostname").read().strip()
|
||||
machine_name = "".join([c if c.isalnum() or c == "_" else "_" for c in machine_name])
|
||||
BasicConfig.machine_name = machine_name
|
||||
BasicConfig.vis_2d_enable = args_dict["2d_vis"]
|
||||
BasicConfig.check_mode = check_mode
|
||||
|
||||
from unilabos.resources.graphio import (
|
||||
read_node_link_json,
|
||||
@@ -283,10 +356,36 @@ def main():
|
||||
# 显示启动横幅
|
||||
print_unilab_banner(args_dict)
|
||||
|
||||
# 注册表
|
||||
lab_registry = build_registry(
|
||||
args_dict["registry_path"], args_dict.get("complete_registry", False), args_dict["upload_registry"]
|
||||
)
|
||||
# 注册表 - check_mode 时强制启用 complete_registry
|
||||
complete_registry = args_dict.get("complete_registry", False) or check_mode
|
||||
lab_registry = build_registry(args_dict["registry_path"], complete_registry, BasicConfig.upload_registry)
|
||||
|
||||
# Check mode: complete_registry 完成后直接退出,git diff 检测由 CI workflow 执行
|
||||
if check_mode:
|
||||
print_status("Check mode: complete_registry 完成,退出", "info")
|
||||
os._exit(0)
|
||||
|
||||
if BasicConfig.upload_registry:
|
||||
# 设备注册到服务端 - 需要 ak 和 sk
|
||||
if BasicConfig.ak and BasicConfig.sk:
|
||||
print_status("开始注册设备到服务端...", "info")
|
||||
try:
|
||||
register_devices_and_resources(lab_registry)
|
||||
print_status("设备注册完成", "info")
|
||||
except Exception as e:
|
||||
print_status(f"设备注册失败: {e}", "error")
|
||||
else:
|
||||
print_status("未提供 ak 和 sk,跳过设备注册", "info")
|
||||
else:
|
||||
print_status("本次启动注册表不报送云端,如果您需要联网调试,请在启动命令增加--upload_registry", "warning")
|
||||
|
||||
# 处理 workflow_upload 子命令
|
||||
if workflow_upload:
|
||||
from unilabos.workflow.wf_utils import handle_workflow_upload_command
|
||||
|
||||
handle_workflow_upload_command(args_dict)
|
||||
print_status("工作流上传完成,程序退出", "info")
|
||||
os._exit(0)
|
||||
|
||||
if not BasicConfig.ak or not BasicConfig.sk:
|
||||
print_status("后续运行必须拥有一个实验室,请前往 https://uni-lab.bohrium.com 注册实验室!", "warning")
|
||||
@@ -368,20 +467,6 @@ def main():
|
||||
args_dict["devices_config"] = resource_tree_set
|
||||
args_dict["graph"] = graph_res.physical_setup_graph
|
||||
|
||||
if BasicConfig.upload_registry:
|
||||
# 设备注册到服务端 - 需要 ak 和 sk
|
||||
if BasicConfig.ak and BasicConfig.sk:
|
||||
print_status("开始注册设备到服务端...", "info")
|
||||
try:
|
||||
register_devices_and_resources(lab_registry)
|
||||
print_status("设备注册完成", "info")
|
||||
except Exception as e:
|
||||
print_status(f"设备注册失败: {e}", "error")
|
||||
else:
|
||||
print_status("未提供 ak 和 sk,跳过设备注册", "info")
|
||||
else:
|
||||
print_status("本次启动注册表不报送云端,如果您需要联网调试,请在启动命令增加--upload_registry", "warning")
|
||||
|
||||
if args_dict["controllers"] is not None:
|
||||
args_dict["controllers_config"] = yaml.safe_load(open(args_dict["controllers"], encoding="utf-8"))
|
||||
else:
|
||||
@@ -396,6 +481,7 @@ def main():
|
||||
comm_client = get_communication_client()
|
||||
if "websocket" in args_dict["app_bridges"]:
|
||||
args_dict["bridges"].append(comm_client)
|
||||
|
||||
def _exit(signum, frame):
|
||||
comm_client.stop()
|
||||
sys.exit(0)
|
||||
@@ -437,16 +523,13 @@ def main():
|
||||
resource_visualization.start()
|
||||
except OSError as e:
|
||||
if "AMENT_PREFIX_PATH" in str(e):
|
||||
print_status(
|
||||
f"ROS 2环境未正确设置,跳过3D可视化启动。错误详情: {e}",
|
||||
"warning"
|
||||
)
|
||||
print_status(f"ROS 2环境未正确设置,跳过3D可视化启动。错误详情: {e}", "warning")
|
||||
print_status(
|
||||
"建议解决方案:\n"
|
||||
"1. 激活Conda环境: conda activate unilab\n"
|
||||
"2. 或使用 --backend simple 参数\n"
|
||||
"3. 或使用 --visual disable 参数禁用可视化",
|
||||
"info"
|
||||
"info",
|
||||
)
|
||||
else:
|
||||
raise
|
||||
@@ -454,13 +537,19 @@ def main():
|
||||
time.sleep(1)
|
||||
else:
|
||||
start_backend(**args_dict)
|
||||
start_server(
|
||||
restart_requested = start_server(
|
||||
open_browser=not args_dict["disable_browser"],
|
||||
port=BasicConfig.port,
|
||||
)
|
||||
if restart_requested:
|
||||
print_status("[Main] Restart requested, cleaning up...", "info")
|
||||
cleanup_for_restart()
|
||||
return
|
||||
else:
|
||||
start_backend(**args_dict)
|
||||
start_server(
|
||||
|
||||
# 启动服务器(默认支持WebSocket触发重启)
|
||||
restart_requested = start_server(
|
||||
open_browser=not args_dict["disable_browser"],
|
||||
port=BasicConfig.port,
|
||||
)
|
||||
|
||||
176
unilabos/app/utils.py
Normal file
176
unilabos/app/utils.py
Normal file
@@ -0,0 +1,176 @@
|
||||
"""
|
||||
UniLabOS 应用工具函数
|
||||
|
||||
提供清理、重启等工具函数
|
||||
"""
|
||||
|
||||
import glob
|
||||
import os
|
||||
import shutil
|
||||
import sys
|
||||
|
||||
|
||||
def patch_rclpy_dll_windows():
|
||||
"""在 Windows + conda 环境下为 rclpy 打 DLL 加载补丁"""
|
||||
if sys.platform != "win32" or not os.environ.get("CONDA_PREFIX"):
|
||||
return
|
||||
try:
|
||||
import rclpy
|
||||
|
||||
return
|
||||
except ImportError as e:
|
||||
if not str(e).startswith("DLL load failed"):
|
||||
return
|
||||
cp = os.environ["CONDA_PREFIX"]
|
||||
impl = os.path.join(cp, "Lib", "site-packages", "rclpy", "impl", "implementation_singleton.py")
|
||||
pyd = glob.glob(os.path.join(cp, "Lib", "site-packages", "rclpy", "_rclpy_pybind11*.pyd"))
|
||||
if not os.path.exists(impl) or not pyd:
|
||||
return
|
||||
with open(impl, "r", encoding="utf-8") as f:
|
||||
content = f.read()
|
||||
lib_bin = os.path.join(cp, "Library", "bin").replace("\\", "/")
|
||||
patch = f'# UniLabOS DLL Patch\nimport os,ctypes\nos.add_dll_directory("{lib_bin}") if hasattr(os,"add_dll_directory") else None\ntry: ctypes.CDLL("{pyd[0].replace(chr(92),"/")}")\nexcept: pass\n# End Patch\n'
|
||||
shutil.copy2(impl, impl + ".bak")
|
||||
with open(impl, "w", encoding="utf-8") as f:
|
||||
f.write(patch + content)
|
||||
|
||||
|
||||
patch_rclpy_dll_windows()
|
||||
|
||||
import gc
|
||||
import threading
|
||||
import time
|
||||
|
||||
from unilabos.utils.banner_print import print_status
|
||||
|
||||
|
||||
def cleanup_for_restart() -> bool:
|
||||
"""
|
||||
Clean up all resources for restart without exiting the process.
|
||||
|
||||
This function prepares the system for re-initialization by:
|
||||
1. Stopping all communication clients
|
||||
2. Destroying ROS nodes
|
||||
3. Resetting singletons
|
||||
4. Waiting for threads to finish
|
||||
|
||||
Returns:
|
||||
bool: True if cleanup was successful, False otherwise
|
||||
"""
|
||||
print_status("[Restart] Starting cleanup for restart...", "info")
|
||||
|
||||
# Step 1: Stop WebSocket communication client
|
||||
print_status("[Restart] Step 1: Stopping WebSocket client...", "info")
|
||||
try:
|
||||
from unilabos.app.communication import get_communication_client
|
||||
|
||||
comm_client = get_communication_client()
|
||||
if comm_client is not None:
|
||||
comm_client.stop()
|
||||
print_status("[Restart] WebSocket client stopped", "info")
|
||||
except Exception as e:
|
||||
print_status(f"[Restart] Error stopping WebSocket: {e}", "warning")
|
||||
|
||||
# Step 2: Get HostNode and cleanup ROS
|
||||
print_status("[Restart] Step 2: Cleaning up ROS nodes...", "info")
|
||||
try:
|
||||
from unilabos.ros.nodes.presets.host_node import HostNode
|
||||
import rclpy
|
||||
from rclpy.timer import Timer
|
||||
|
||||
host_instance = HostNode.get_instance(timeout=5)
|
||||
if host_instance is not None:
|
||||
print_status(f"[Restart] Found HostNode: {host_instance.device_id}", "info")
|
||||
|
||||
# Gracefully shutdown background threads
|
||||
print_status("[Restart] Shutting down background threads...", "info")
|
||||
HostNode.shutdown_background_threads(timeout=5.0)
|
||||
print_status("[Restart] Background threads shutdown complete", "info")
|
||||
|
||||
# Stop discovery timer
|
||||
if hasattr(host_instance, "_discovery_timer") and isinstance(host_instance._discovery_timer, Timer):
|
||||
host_instance._discovery_timer.cancel()
|
||||
print_status("[Restart] Discovery timer cancelled", "info")
|
||||
|
||||
# Destroy device nodes
|
||||
device_count = len(host_instance.devices_instances)
|
||||
print_status(f"[Restart] Destroying {device_count} device instances...", "info")
|
||||
for device_id, device_node in list(host_instance.devices_instances.items()):
|
||||
try:
|
||||
if hasattr(device_node, "ros_node_instance") and device_node.ros_node_instance is not None:
|
||||
device_node.ros_node_instance.destroy_node()
|
||||
print_status(f"[Restart] Device {device_id} destroyed", "info")
|
||||
except Exception as e:
|
||||
print_status(f"[Restart] Error destroying device {device_id}: {e}", "warning")
|
||||
|
||||
# Clear devices instances
|
||||
host_instance.devices_instances.clear()
|
||||
host_instance.devices_names.clear()
|
||||
|
||||
# Destroy host node
|
||||
try:
|
||||
host_instance.destroy_node()
|
||||
print_status("[Restart] HostNode destroyed", "info")
|
||||
except Exception as e:
|
||||
print_status(f"[Restart] Error destroying HostNode: {e}", "warning")
|
||||
|
||||
# Reset HostNode state
|
||||
HostNode.reset_state()
|
||||
print_status("[Restart] HostNode state reset", "info")
|
||||
|
||||
# Shutdown executor first (to stop executor.spin() gracefully)
|
||||
if hasattr(rclpy, "__executor") and rclpy.__executor is not None:
|
||||
try:
|
||||
rclpy.__executor.shutdown()
|
||||
rclpy.__executor = None # Clear for restart
|
||||
print_status("[Restart] ROS executor shutdown complete", "info")
|
||||
except Exception as e:
|
||||
print_status(f"[Restart] Error shutting down executor: {e}", "warning")
|
||||
|
||||
# Shutdown rclpy
|
||||
if rclpy.ok():
|
||||
rclpy.shutdown()
|
||||
print_status("[Restart] rclpy shutdown complete", "info")
|
||||
|
||||
except ImportError as e:
|
||||
print_status(f"[Restart] ROS modules not available: {e}", "warning")
|
||||
except Exception as e:
|
||||
print_status(f"[Restart] Error in ROS cleanup: {e}", "warning")
|
||||
return False
|
||||
|
||||
# Step 3: Reset communication client singleton
|
||||
print_status("[Restart] Step 3: Resetting singletons...", "info")
|
||||
try:
|
||||
from unilabos.app import communication
|
||||
|
||||
if hasattr(communication, "_communication_client"):
|
||||
communication._communication_client = None
|
||||
print_status("[Restart] Communication client singleton reset", "info")
|
||||
except Exception as e:
|
||||
print_status(f"[Restart] Error resetting communication singleton: {e}", "warning")
|
||||
|
||||
# Step 4: Wait for threads to finish
|
||||
print_status("[Restart] Step 4: Waiting for threads to finish...", "info")
|
||||
time.sleep(3) # Give threads time to finish
|
||||
|
||||
# Check remaining threads
|
||||
remaining_threads = []
|
||||
for t in threading.enumerate():
|
||||
if t.name != "MainThread" and t.is_alive():
|
||||
remaining_threads.append(t.name)
|
||||
|
||||
if remaining_threads:
|
||||
print_status(
|
||||
f"[Restart] Warning: {len(remaining_threads)} threads still running: {remaining_threads}", "warning"
|
||||
)
|
||||
else:
|
||||
print_status("[Restart] All threads stopped", "info")
|
||||
|
||||
# Step 5: Force garbage collection
|
||||
print_status("[Restart] Step 5: Running garbage collection...", "info")
|
||||
gc.collect()
|
||||
gc.collect() # Run twice for weak references
|
||||
print_status("[Restart] Garbage collection complete", "info")
|
||||
|
||||
print_status("[Restart] Cleanup complete. Ready for re-initialization.", "info")
|
||||
return True
|
||||
@@ -58,14 +58,14 @@ class JobResultStore:
|
||||
feedback=feedback or {},
|
||||
timestamp=time.time(),
|
||||
)
|
||||
logger.debug(f"[JobResultStore] Stored result for job {job_id[:8]}, status={status}")
|
||||
logger.trace(f"[JobResultStore] Stored result for job {job_id[:8]}, status={status}")
|
||||
|
||||
def get_and_remove(self, job_id: str) -> Optional[JobResult]:
|
||||
"""获取并删除任务结果"""
|
||||
with self._results_lock:
|
||||
result = self._results.pop(job_id, None)
|
||||
if result:
|
||||
logger.debug(f"[JobResultStore] Retrieved and removed result for job {job_id[:8]}")
|
||||
logger.trace(f"[JobResultStore] Retrieved and removed result for job {job_id[:8]}")
|
||||
return result
|
||||
|
||||
def get_result(self, job_id: str) -> Optional[JobResult]:
|
||||
|
||||
@@ -6,7 +6,6 @@ Web服务器模块
|
||||
|
||||
import webbrowser
|
||||
|
||||
import uvicorn
|
||||
from fastapi import FastAPI, Request
|
||||
from fastapi.middleware.cors import CORSMiddleware
|
||||
from starlette.responses import Response
|
||||
@@ -96,7 +95,7 @@ def setup_server() -> FastAPI:
|
||||
return app
|
||||
|
||||
|
||||
def start_server(host: str = "0.0.0.0", port: int = 8002, open_browser: bool = True) -> None:
|
||||
def start_server(host: str = "0.0.0.0", port: int = 8002, open_browser: bool = True) -> bool:
|
||||
"""
|
||||
启动服务器
|
||||
|
||||
@@ -104,7 +103,14 @@ def start_server(host: str = "0.0.0.0", port: int = 8002, open_browser: bool = T
|
||||
host: 服务器主机
|
||||
port: 服务器端口
|
||||
open_browser: 是否自动打开浏览器
|
||||
|
||||
Returns:
|
||||
bool: True if restart was requested, False otherwise
|
||||
"""
|
||||
import threading
|
||||
import time
|
||||
from uvicorn import Config, Server
|
||||
|
||||
# 设置服务器
|
||||
setup_server()
|
||||
|
||||
@@ -123,7 +129,37 @@ def start_server(host: str = "0.0.0.0", port: int = 8002, open_browser: bool = T
|
||||
|
||||
# 启动服务器
|
||||
info(f"[Web] 启动FastAPI服务器: {host}:{port}")
|
||||
uvicorn.run(app, host=host, port=port, log_config=log_config)
|
||||
|
||||
# 使用支持重启的模式
|
||||
config = Config(app=app, host=host, port=port, log_config=log_config)
|
||||
server = Server(config)
|
||||
|
||||
# 启动服务器线程
|
||||
server_thread = threading.Thread(target=server.run, daemon=True, name="uvicorn_server")
|
||||
server_thread.start()
|
||||
|
||||
info("[Web] Server started, monitoring for restart requests...")
|
||||
|
||||
# 监控重启标志
|
||||
import unilabos.app.main as main_module
|
||||
|
||||
while server_thread.is_alive():
|
||||
if hasattr(main_module, "_restart_requested") and main_module._restart_requested:
|
||||
info(
|
||||
f"[Web] Restart requested via WebSocket, reason: {getattr(main_module, '_restart_reason', 'unknown')}"
|
||||
)
|
||||
main_module._restart_requested = False
|
||||
|
||||
# 停止服务器
|
||||
server.should_exit = True
|
||||
server_thread.join(timeout=5)
|
||||
|
||||
info("[Web] Server stopped, ready for restart")
|
||||
return True
|
||||
|
||||
time.sleep(1)
|
||||
|
||||
return False
|
||||
|
||||
|
||||
# 当脚本直接运行时启动服务器
|
||||
|
||||
@@ -23,7 +23,7 @@ from typing import Optional, Dict, Any, List
|
||||
from urllib.parse import urlparse
|
||||
from enum import Enum
|
||||
|
||||
from jedi.inference.gradual.typing import TypedDict
|
||||
from typing_extensions import TypedDict
|
||||
|
||||
from unilabos.app.model import JobAddReq
|
||||
from unilabos.ros.nodes.presets.host_node import HostNode
|
||||
@@ -154,7 +154,7 @@ class DeviceActionManager:
|
||||
job_info.set_ready_timeout(10) # 设置10秒超时
|
||||
self.active_jobs[device_key] = job_info
|
||||
job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
logger.info(f"[DeviceActionManager] Job {job_log} can start immediately for {device_key}")
|
||||
logger.trace(f"[DeviceActionManager] Job {job_log} can start immediately for {device_key}")
|
||||
return True
|
||||
|
||||
def start_job(self, job_id: str) -> bool:
|
||||
@@ -210,8 +210,9 @@ class DeviceActionManager:
|
||||
job_info.update_timestamp()
|
||||
# 从all_jobs中移除已结束的job
|
||||
del self.all_jobs[job_id]
|
||||
job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
logger.info(f"[DeviceActionManager] Job {job_log} ended for {device_key}")
|
||||
# job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
# logger.debug(f"[DeviceActionManager] Job {job_log} ended for {device_key}")
|
||||
pass
|
||||
else:
|
||||
job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
logger.warning(f"[DeviceActionManager] Job {job_log} was not active for {device_key}")
|
||||
@@ -227,7 +228,7 @@ class DeviceActionManager:
|
||||
next_job_log = format_job_log(
|
||||
next_job.job_id, next_job.task_id, next_job.device_id, next_job.action_name
|
||||
)
|
||||
logger.info(f"[DeviceActionManager] Next job {next_job_log} can start for {device_key}")
|
||||
logger.trace(f"[DeviceActionManager] Next job {next_job_log} can start for {device_key}")
|
||||
return next_job
|
||||
|
||||
return None
|
||||
@@ -268,7 +269,7 @@ class DeviceActionManager:
|
||||
# 从all_jobs中移除
|
||||
del self.all_jobs[job_id]
|
||||
job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
logger.info(f"[DeviceActionManager] Active job {job_log} cancelled for {device_key}")
|
||||
logger.trace(f"[DeviceActionManager] Active job {job_log} cancelled for {device_key}")
|
||||
|
||||
# 启动下一个任务
|
||||
if device_key in self.device_queues and self.device_queues[device_key]:
|
||||
@@ -281,7 +282,7 @@ class DeviceActionManager:
|
||||
next_job_log = format_job_log(
|
||||
next_job.job_id, next_job.task_id, next_job.device_id, next_job.action_name
|
||||
)
|
||||
logger.info(f"[DeviceActionManager] Next job {next_job_log} can start after cancel")
|
||||
logger.trace(f"[DeviceActionManager] Next job {next_job_log} can start after cancel")
|
||||
return True
|
||||
|
||||
# 如果是排队中的任务
|
||||
@@ -295,7 +296,7 @@ class DeviceActionManager:
|
||||
job_log = format_job_log(
|
||||
job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name
|
||||
)
|
||||
logger.info(f"[DeviceActionManager] Queued job {job_log} cancelled for {device_key}")
|
||||
logger.trace(f"[DeviceActionManager] Queued job {job_log} cancelled for {device_key}")
|
||||
return True
|
||||
|
||||
job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
@@ -359,7 +360,7 @@ class MessageProcessor:
|
||||
self.device_manager = device_manager
|
||||
self.queue_processor = None # 延迟设置
|
||||
self.websocket_client = None # 延迟设置
|
||||
self.session_id = ""
|
||||
self.session_id = str(uuid.uuid4())[:6] # 产生一个随机的session_id
|
||||
|
||||
# WebSocket连接
|
||||
self.websocket = None
|
||||
@@ -488,7 +489,20 @@ class MessageProcessor:
|
||||
async for message in self.websocket:
|
||||
try:
|
||||
data = json.loads(message)
|
||||
await self._process_message(data)
|
||||
message_type = data.get("action", "")
|
||||
message_data = data.get("data")
|
||||
if self.session_id and self.session_id == data.get("edge_session"):
|
||||
await self._process_message(message_type, message_data)
|
||||
else:
|
||||
if message_type.endswith("_material"):
|
||||
logger.trace(
|
||||
f"[MessageProcessor] 收到一条归属 {data.get('edge_session')} 的旧消息:{data}"
|
||||
)
|
||||
logger.debug(
|
||||
f"[MessageProcessor] 跳过了一条归属 {data.get('edge_session')} 的旧消息: {data.get('action')}"
|
||||
)
|
||||
else:
|
||||
await self._process_message(message_type, message_data)
|
||||
except json.JSONDecodeError:
|
||||
logger.error(f"[MessageProcessor] Invalid JSON received: {message}")
|
||||
except Exception as e:
|
||||
@@ -554,12 +568,9 @@ class MessageProcessor:
|
||||
finally:
|
||||
logger.debug("[MessageProcessor] Send handler stopped")
|
||||
|
||||
async def _process_message(self, data: Dict[str, Any]):
|
||||
async def _process_message(self, message_type: str, message_data: Dict[str, Any]):
|
||||
"""处理收到的消息"""
|
||||
message_type = data.get("action", "")
|
||||
message_data = data.get("data")
|
||||
|
||||
logger.debug(f"[MessageProcessor] Processing message: {message_type}")
|
||||
logger.trace(f"[MessageProcessor] Processing message: {message_type}")
|
||||
|
||||
try:
|
||||
if message_type == "pong":
|
||||
@@ -571,16 +582,19 @@ class MessageProcessor:
|
||||
elif message_type == "cancel_action" or message_type == "cancel_task":
|
||||
await self._handle_cancel_action(message_data)
|
||||
elif message_type == "add_material":
|
||||
# noinspection PyTypeChecker
|
||||
await self._handle_resource_tree_update(message_data, "add")
|
||||
elif message_type == "update_material":
|
||||
# noinspection PyTypeChecker
|
||||
await self._handle_resource_tree_update(message_data, "update")
|
||||
elif message_type == "remove_material":
|
||||
# noinspection PyTypeChecker
|
||||
await self._handle_resource_tree_update(message_data, "remove")
|
||||
elif message_type == "session_id":
|
||||
self.session_id = message_data.get("session_id")
|
||||
logger.info(f"[MessageProcessor] Session ID: {self.session_id}")
|
||||
elif message_type == "request_reload":
|
||||
await self._handle_request_reload(message_data)
|
||||
# elif message_type == "session_id":
|
||||
# self.session_id = message_data.get("session_id")
|
||||
# logger.info(f"[MessageProcessor] Session ID: {self.session_id}")
|
||||
elif message_type == "request_restart":
|
||||
await self._handle_request_restart(message_data)
|
||||
else:
|
||||
logger.debug(f"[MessageProcessor] Unknown message type: {message_type}")
|
||||
|
||||
@@ -628,13 +642,13 @@ class MessageProcessor:
|
||||
await self._send_action_state_response(
|
||||
device_id, action_name, task_id, job_id, "query_action_status", True, 0
|
||||
)
|
||||
logger.info(f"[MessageProcessor] Job {job_log} can start immediately")
|
||||
logger.trace(f"[MessageProcessor] Job {job_log} can start immediately")
|
||||
else:
|
||||
# 需要排队
|
||||
await self._send_action_state_response(
|
||||
device_id, action_name, task_id, job_id, "query_action_status", False, 10
|
||||
)
|
||||
logger.info(f"[MessageProcessor] Job {job_log} queued")
|
||||
logger.trace(f"[MessageProcessor] Job {job_log} queued")
|
||||
|
||||
# 通知QueueProcessor有新的队列更新
|
||||
if self.queue_processor:
|
||||
@@ -838,9 +852,7 @@ class MessageProcessor:
|
||||
device_action_groups[key_add] = []
|
||||
device_action_groups[key_add].append(item["uuid"])
|
||||
|
||||
logger.info(
|
||||
f"[MessageProcessor] Resource migrated: {item['uuid'][:8]} from {device_old_id} to {device_id}"
|
||||
)
|
||||
logger.info(f"[资源同步] 跨站Transfer: {item['uuid'][:8]} from {device_old_id} to {device_id}")
|
||||
else:
|
||||
# 正常update
|
||||
key = (device_id, "update")
|
||||
@@ -854,11 +866,13 @@ class MessageProcessor:
|
||||
device_action_groups[key] = []
|
||||
device_action_groups[key].append(item["uuid"])
|
||||
|
||||
logger.info(f"触发物料更新 {action} 分组数量: {len(device_action_groups)}, 总数量: {len(resource_uuid_list)}")
|
||||
logger.trace(
|
||||
f"[资源同步] 动作 {action} 分组数量: {len(device_action_groups)}, 总数量: {len(resource_uuid_list)}"
|
||||
)
|
||||
|
||||
# 为每个(device_id, action)创建独立的更新线程
|
||||
for (device_id, actual_action), items in device_action_groups.items():
|
||||
logger.info(f"设备 {device_id} 物料更新 {actual_action} 数量: {len(items)}")
|
||||
logger.trace(f"[资源同步] {device_id} 物料动作 {actual_action} 数量: {len(items)}")
|
||||
|
||||
def _notify_resource_tree(dev_id, act, item_list):
|
||||
try:
|
||||
@@ -890,19 +904,50 @@ class MessageProcessor:
|
||||
)
|
||||
thread.start()
|
||||
|
||||
async def _handle_request_reload(self, data: Dict[str, Any]):
|
||||
async def _handle_request_restart(self, data: Dict[str, Any]):
|
||||
"""
|
||||
处理重载请求
|
||||
|
||||
当LabGo发送request_reload时,重新发送设备注册信息
|
||||
处理重启请求
|
||||
|
||||
当LabGo发送request_restart时,执行清理并触发重启
|
||||
"""
|
||||
reason = data.get("reason", "unknown")
|
||||
logger.info(f"[MessageProcessor] Received reload request, reason: {reason}")
|
||||
|
||||
# 重新发送host_node_ready信息
|
||||
delay = data.get("delay", 2) # 默认延迟2秒
|
||||
logger.info(f"[MessageProcessor] Received restart request, reason: {reason}, delay: {delay}s")
|
||||
|
||||
# 发送确认消息
|
||||
if self.websocket_client:
|
||||
self.websocket_client.publish_host_ready()
|
||||
logger.info("[MessageProcessor] Re-sent host_node_ready after reload request")
|
||||
await self.websocket_client.send_message(
|
||||
{"action": "restart_acknowledged", "data": {"reason": reason, "delay": delay}}
|
||||
)
|
||||
|
||||
# 设置全局重启标志
|
||||
import unilabos.app.main as main_module
|
||||
|
||||
main_module._restart_requested = True
|
||||
main_module._restart_reason = reason
|
||||
|
||||
# 延迟后执行清理
|
||||
await asyncio.sleep(delay)
|
||||
|
||||
# 在新线程中执行清理,避免阻塞当前事件循环
|
||||
def do_cleanup():
|
||||
import time
|
||||
|
||||
time.sleep(0.5) # 给当前消息处理完成的时间
|
||||
logger.info(f"[MessageProcessor] Starting cleanup for restart, reason: {reason}")
|
||||
try:
|
||||
from unilabos.app.utils import cleanup_for_restart
|
||||
|
||||
if cleanup_for_restart():
|
||||
logger.info("[MessageProcessor] Cleanup successful, main() will restart")
|
||||
else:
|
||||
logger.error("[MessageProcessor] Cleanup failed")
|
||||
except Exception as e:
|
||||
logger.error(f"[MessageProcessor] Error during cleanup: {e}")
|
||||
|
||||
cleanup_thread = threading.Thread(target=do_cleanup, name="RestartCleanupThread", daemon=True)
|
||||
cleanup_thread.start()
|
||||
logger.info(f"[MessageProcessor] Restart cleanup scheduled")
|
||||
|
||||
async def _send_action_state_response(
|
||||
self, device_id: str, action_name: str, task_id: str, job_id: str, typ: str, free: bool, need_more: int
|
||||
@@ -1090,7 +1135,7 @@ class QueueProcessor:
|
||||
success = self.message_processor.send_message(message)
|
||||
job_log = format_job_log(job_info.job_id, job_info.task_id, job_info.device_id, job_info.action_name)
|
||||
if success:
|
||||
logger.debug(f"[QueueProcessor] Sent busy/need_more for queued job {job_log}")
|
||||
logger.trace(f"[QueueProcessor] Sent busy/need_more for queued job {job_log}")
|
||||
else:
|
||||
logger.warning(f"[QueueProcessor] Failed to send busy status for job {job_log}")
|
||||
|
||||
@@ -1113,7 +1158,7 @@ class QueueProcessor:
|
||||
job_info.action_name,
|
||||
)
|
||||
|
||||
logger.info(f"[QueueProcessor] Job {job_log} completed with status: {status}")
|
||||
logger.trace(f"[QueueProcessor] Job {job_log} completed with status: {status}")
|
||||
|
||||
# 结束任务,获取下一个可执行的任务
|
||||
next_job = self.device_manager.end_job(job_id)
|
||||
@@ -1133,8 +1178,8 @@ class QueueProcessor:
|
||||
},
|
||||
}
|
||||
self.message_processor.send_message(message)
|
||||
next_job_log = format_job_log(next_job.job_id, next_job.task_id, next_job.device_id, next_job.action_name)
|
||||
logger.info(f"[QueueProcessor] Notified next job {next_job_log} can start")
|
||||
# next_job_log = format_job_log(next_job.job_id, next_job.task_id, next_job.device_id, next_job.action_name)
|
||||
# logger.debug(f"[QueueProcessor] Notified next job {next_job_log} can start")
|
||||
|
||||
# 立即触发下一轮状态检查
|
||||
self.notify_queue_update()
|
||||
@@ -1279,7 +1324,7 @@ class WebSocketClient(BaseCommunicationClient):
|
||||
except (KeyError, AttributeError):
|
||||
logger.warning(f"[WebSocketClient] Failed to remove job {item.job_id} from HostNode status")
|
||||
|
||||
logger.info(f"[WebSocketClient] Intercepting final status for job_id: {item.job_id} - {status}")
|
||||
# logger.debug(f"[WebSocketClient] Intercepting final status for job_id: {item.job_id} - {status}")
|
||||
|
||||
# 通知队列处理器job完成(包括timeout的job)
|
||||
self.queue_processor.handle_job_completed(item.job_id, status)
|
||||
@@ -1340,15 +1385,17 @@ class WebSocketClient(BaseCommunicationClient):
|
||||
# 收集设备信息
|
||||
devices = []
|
||||
machine_name = BasicConfig.machine_name
|
||||
|
||||
|
||||
try:
|
||||
host_node = HostNode.get_instance(0)
|
||||
if host_node:
|
||||
# 获取设备信息
|
||||
for device_id, namespace in host_node.devices_names.items():
|
||||
device_key = f"{namespace}/{device_id}" if namespace.startswith("/") else f"/{namespace}/{device_id}"
|
||||
device_key = (
|
||||
f"{namespace}/{device_id}" if namespace.startswith("/") else f"/{namespace}/{device_id}"
|
||||
)
|
||||
is_online = device_key in host_node._online_devices
|
||||
|
||||
|
||||
# 获取设备的动作信息
|
||||
actions = {}
|
||||
for action_id, client in host_node._action_clients.items():
|
||||
@@ -1359,16 +1406,18 @@ class WebSocketClient(BaseCommunicationClient):
|
||||
"action_path": action_id,
|
||||
"action_type": str(type(client).__name__),
|
||||
}
|
||||
|
||||
devices.append({
|
||||
"device_id": device_id,
|
||||
"namespace": namespace,
|
||||
"device_key": device_key,
|
||||
"is_online": is_online,
|
||||
"machine_name": host_node.device_machine_names.get(device_id, machine_name),
|
||||
"actions": actions,
|
||||
})
|
||||
|
||||
|
||||
devices.append(
|
||||
{
|
||||
"device_id": device_id,
|
||||
"namespace": namespace,
|
||||
"device_key": device_key,
|
||||
"is_online": is_online,
|
||||
"machine_name": host_node.device_machine_names.get(device_id, machine_name),
|
||||
"actions": actions,
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"[WebSocketClient] Collected {len(devices)} devices for host_ready")
|
||||
except Exception as e:
|
||||
logger.warning(f"[WebSocketClient] Error collecting device info: {e}")
|
||||
|
||||
Reference in New Issue
Block a user