SpringBoot定时监听RocketMQ的NameServer

发布时间:2023年12月25日

问题分析

  • 自己在测试环境部署了RocketMQ,发现namesrv很容易挂掉,于是就想着监控,挂了就发邮件通知。
  • 查看了rocketmq-dashboard项目,发现只能监控Broker,遂放弃这一路径。
  • 于是就从报错的日志入手,发现最终可以根据RocketMQTemplate获得可活动的NameServer。

报错日志

  • 报错日志如下:
12月 25 13:59:22 192.168.240.65 java[59571]: 2023-12-25 13:59:22.598  INFO 59571 --- [tWorkerThread_2] RocketmqRemoting                         : NETTY CLIENT PIPELINE: CLOSE 192.168.240.86:9876
12月 25 13:59:22 192.168.240.65 java[59571]: 2023-12-25 13:59:22.598  INFO 59571 --- [tWorkerThread_2] RocketmqRemoting                         : closeChannel: the channel[192.168.240.86:9876] was removed from channel table
12月 25 13:59:22 192.168.240.65 java[59571]: 2023-12-25 13:59:22.598  INFO 59571 --- [tWorkerThread_2] RocketmqRemoting                         : NETTY CLIENT PIPELINE: CLOSE 192.168.240.86:9876
12月 25 13:59:22 192.168.240.65 java[59571]: 2023-12-25 13:59:22.598  INFO 59571 --- [tWorkerThread_2] RocketmqRemoting                         : eventCloseChannel: the channel[null] has been removed from the channel table before
12月 25 13:59:22 192.168.240.65 java[59571]: 2023-12-25 13:59:22.598  INFO 59571 --- [lientSelector_1] RocketmqRemoting                         : closeChannel: close the connection to remote address[192.168.240.86:9876] result: true
12月 25 13:59:25 192.168.240.65 java[59571]: 2023-12-25 13:59:25.597  INFO 59571 --- [ntScan_thread_1] RocketmqRemoting                         : createChannel: begin to connect remote host[192.168.240.86:9876] asynchronously
12月 25 13:59:25 192.168.240.65 java[59571]: 2023-12-25 13:59:25.597  INFO 59571 --- [tWorkerThread_3] RocketmqRemoting                         : NETTY CLIENT PIPELINE: CONNECT  UNKNOWN => 192.168.240.86:9876
12月 25 13:59:25 192.168.240.65 java[59571]: 2023-12-25 13:59:25.598  WARN 59571 --- [ntScan_thread_1] RocketmqRemoting                         : createChannel: connect remote host[192.168.240.86:9876] failed, AbstractBootstrap$PendingRegistrationPromise@f2a3fc5(failure: io.netty.channel.AbstractChannel$AnnotatedConnectException: 拒绝连接: /192.168.240.86:9876)


  • 根据日志可以发现是NettyRemotingClient类在做监控,持续调用,具体核心方法:
org.apache.rocketmq.remoting.netty.NettyRemotingClient#createChannel
  • createChannel的源码:
private Channel createChannel(String addr) throws InterruptedException {
        NettyRemotingClient.ChannelWrapper cw = (NettyRemotingClient.ChannelWrapper)this.channelTables.get(addr);
        if (cw != null && cw.isOK()) {
            return cw.getChannel();
        } else {
            if (this.lockChannelTables.tryLock(3000L, TimeUnit.MILLISECONDS)) {
                try {
                    cw = (NettyRemotingClient.ChannelWrapper)this.channelTables.get(addr);
                    boolean createNewConnection;
                    if (cw != null) {
                        if (cw.isOK()) {
                            Channel var4 = cw.getChannel();
                            return var4;
                        }

                        if (!cw.getChannelFuture().isDone()) {
                            createNewConnection = false;
                        } else {
                            this.channelTables.remove(addr);
                            createNewConnection = true;
                        }
                    } else {
                        createNewConnection = true;
                    }

                    if (createNewConnection) {
                        ChannelFuture channelFuture = this.bootstrap.connect(RemotingHelper.string2SocketAddress(addr));
                        LOGGER.info("createChannel: begin to connect remote host[{}] asynchronously", addr);
                        cw = new NettyRemotingClient.ChannelWrapper(channelFuture);
                        this.channelTables.put(addr, cw);
                    }
                } catch (Exception var8) {
                    LOGGER.error("createChannel: create channel exception", var8);
                } finally {
                    this.lockChannelTables.unlock();
                }
            } else {
                LOGGER.warn("createChannel: try to lock channel table, but timeout, {}ms", 3000L);
            }

            if (cw != null) {
                ChannelFuture channelFuture = cw.getChannelFuture();
                if (channelFuture.awaitUninterruptibly((long)this.nettyClientConfig.getConnectTimeoutMillis())) {
                    if (cw.isOK()) {
                        LOGGER.info("createChannel: connect remote host[{}] success, {}", addr, channelFuture.toString());
                        return cw.getChannel();
                    }

                    LOGGER.warn("createChannel: connect remote host[" + addr + "] failed, " + channelFuture.toString());
                } else {
                    LOGGER.warn("createChannel: connect remote host[{}] timeout {}ms, {}", new Object[]{addr, this.nettyClientConfig.getConnectTimeoutMillis(), channelFuture.toString()});
                }
            }

            return null;
        }
    }
  • 从源码中可以看到报错的日志数据

追溯

  • 以NettyRemotingClient类为起点,使用Debug分析,最终可以看到完整的调用链路:
    ![在这里插入图片描述](https://img-blog.csdnimg.cn/direct/332d213235a6439eb748c8422d480a44.png

监控开发

  • 那么监控开发就很容易了,注册RocketMQTemplate,使用定时任务监听即可,示例代码如下:
@Slf4j
@Component
public class MQMonitorTask {


    @Resource
    private RocketMQTemplate rocketMQTemplate;

    @Scheduled(cron = "0/10 * * * * ?")
    public void scanNameServerBroker() {
        org.apache.rocketmq.remoting.RemotingClient remotingClient = rocketMQTemplate.getProducer()
                .getDefaultMQProducerImpl().getMqClientFactory().getMQClientAPIImpl().getRemotingClient();
        // 注册的 NameServer
        List<String> nameServerAddressList = remotingClient.getNameServerAddressList();
        // 当前活跃的 NameServer
        List<String> availableNameSrvList = remotingClient.getAvailableNameSrvList();
        log.info("nameServerAddressList:{}", JSONUtil.toJsonStr(nameServerAddressList));
        log.info("availableNameSrvList:{}", JSONUtil.toJsonStr(availableNameSrvList));
        // 只要 nameServerAddressList 和 availableNameSrvList 大小不一致,即可做邮件通知,具体阈值自己设置!!!
        // TODO:邮件通知
    }

}

  • 另外要在SprongBoot启动类加上注解@EnableScheduling来开启定时任务。
文章来源:https://blog.csdn.net/starjuly/article/details/135198336
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。