记一次日志trace引起的事故

发布时间:2023年12月22日

事故描述,运行一段时间后,系统运行一段时间后变慢,gc和cpu都正常。直接贴样例代码

public static void main(String[] args) throws IOException {
    final ExecutorService executorService = Executors.newFixedThreadPool(100);
    for (int i = 0; i < 1000000; i++) {

        executorService.submit(()->{
            final UUID uuid = UUID.randomUUID();
            final Mono<Integer> log = Mono.just(1).log(uuid.toString());
            log.subscribe(new Consumer<Integer>() {
                @SneakyThrows
                @Override
                public void accept(Integer integer) {
                    System.out.println(integer);
                }
            });
        });
    }

    System.in.read();
}

我们采用log 和uuid 的方式实现在reactor框架下对请求的日志跟踪。也正是这个功能引起该次事故。

上述代码运行后,可以看到我们的线程state 几乎都是BLOCKED ,但并没有发现死锁

代码分析

public final Mono<T> log(@Nullable String category) {
   return log(category, Level.INFO);
}

public final Mono<T> log(@Nullable String category, Level level, SignalType... options) {
   return log(category, level, false, options);
}
public final Mono<T> log(@Nullable String category,
      Level level,
      boolean showOperatorLine,
      SignalType... options) {
   SignalLogger<T> log = new SignalLogger<>(this, category, level,
         showOperatorLine, options);

   if (this instanceof Fuseable) {
      return onAssembly(new MonoLogFuseable<>(this, log));
   }
   return onAssembly(new MonoLog<>(this, log));
}
SignalLogger(CorePublisher<IN> source,
      @Nullable String category,
      Level level,
      boolean correlateStack,
      SignalType... options) {
   this(source, category, level, correlateStack, Loggers::getLogger, options);
}
SignalLogger(CorePublisher<IN> source,
      @Nullable String category,
      Level level,
      boolean correlateStack,
      Function<String, Logger> loggerSupplier,
      @Nullable SignalType... options) {

   this.source = Objects.requireNonNull(source, "source");
   this.id = IDS.getAndIncrement();
   this.fuseable = source instanceof Fuseable;

   if (correlateStack) {
      operatorLine = Traces.extractOperatorAssemblyInformation(Traces.callSiteSupplierFactory.get().get());
   }
   else {
      operatorLine = null;
   }

   boolean generated =
         category == null || category.isEmpty() || category.endsWith(".");

   category = generated && category == null ? "reactor." : category;
   if (generated) {
      if (source instanceof Mono) {
         category += "Mono." + source.getClass()
                                     .getSimpleName()
                                     .replace("Mono", "");
      }
      else if (source instanceof ParallelFlux) {
         category += "Parallel." + source.getClass()
                                         .getSimpleName()
                                         .replace("Parallel", "");
      }
      else {
         category += "Flux." + source.getClass()
                                     .getSimpleName()
                                     .replace("Flux", "");
      }
      category += "." + id;
   }

此处获取log
   this.log = loggerSupplier.apply(category);

   this.level = level;
   if (options == null || options.length == 0) {
      this.options = ALL;
   }
   else {
      int opts = 0;
      for (SignalType option : options) {
         if (option == SignalType.CANCEL) {
            opts |= CANCEL;
         }
         else if (option == SignalType.CURRENT_CONTEXT) {
            opts |= CONTEXT_PARENT;
         }
         else if (option == SignalType.ON_SUBSCRIBE) {
            opts |= ON_SUBSCRIBE;
         }
         else if (option == SignalType.REQUEST) {
            opts |= REQUEST;
         }
         else if (option == SignalType.ON_NEXT) {
            opts |= ON_NEXT;
         }
         else if (option == SignalType.ON_ERROR) {
            opts |= ON_ERROR;
         }
         else if (option == SignalType.ON_COMPLETE) {
            opts |= ON_COMPLETE;
         }
         else if (option == SignalType.SUBSCRIBE) {
            opts |= SUBSCRIBE;
         }
         else if (option == SignalType.AFTER_TERMINATE) {
            opts |= AFTER_TERMINATE;
         }
      }
      this.options = opts;
   }
}

获取logger

总结: 由于我们的traceId 采用uuid 生成,每一次都不一样。每次调用log方法时,LoggerContext都先从Map<String, Logger> loggerCache中查询,没有的话进入rootLogger的同步代码块,在同步代码块中先对rootlog 的childrenList遍历寻找name相同的childLogger .? 由于调用的累计?childrenList越来越大,每次遍历时长越来越长,rootlog的锁持有时间越来越长,最后导致线程Blocked。并且并没有发现对childrenList的回收操作。导致内存越来越大。

文章来源:https://blog.csdn.net/weixin_39355187/article/details/135146669
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。