This investigation started because AnimateDiff video-to-video generation fails when invoked through the API, so I read through the extension's source code and the SD WebUI extension mechanism.
The main flow involves a few key classes: StableDiffusionProcessing (txt2img or img2img), ScriptRunner, and Script. The figure below gives a broad, end-to-end view of how extensions are invoked. Afterwards, we will walk through the implementation logic of SD WebUI extensions from two angles, writing your own extension and having one extension call another, each with concrete examples.
The official project has a short Wiki page on how to develop an extension: Developing SD-WebUI extensions.
The Wiki describes the purpose of several special files; the core interface is the Script base class:
class Script:
    name = None
    """
    script's internal name derived from title
    """

    def title(self):
        """
        this function should return the title of the script. This is what will be displayed in the dropdown menu.
        """
        raise NotImplementedError()

    def ui(self, is_img2img):
        """
        this function should create gradio UI elements. See https://gradio.app/docs/#components
        The return value should be an array of all components that are used in processing.
        Values of those returned components will be passed to run() and process() functions.
        """
        pass

    def before_process(self, p, *args):
        """
        This function is called very early during processing begins for AlwaysVisible scripts.
        You can modify the processing object (p) here, inject hooks, etc.
        args contains all values returned by components from ui()
        """
        pass

    def process(self, p, *args):
        """
        This function is called before processing begins for AlwaysVisible scripts.
        You can modify the processing object (p) here, inject hooks, etc.
        args contains all values returned by components from ui()
        """
        pass

    # ... not all attributes and methods are listed; see the source code for the full definition
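To make this interface concrete, here is a minimal sketch of a custom AlwaysVisible script (file path and all names are hypothetical; a real extension would place this under its scripts directory):

# scripts/example_script.py (minimal sketch, names are hypothetical)
import gradio as gr
from modules import scripts

class ExampleScript(scripts.Script):
    def title(self):
        return "Example Script"

    def show(self, is_img2img):
        # AlwaysVisible renders the script as an accordion on both tabs and
        # routes it through before_process()/process() instead of run()
        return scripts.AlwaysVisible

    def ui(self, is_img2img):
        enabled = gr.Checkbox(label="Enable example", value=False)
        strength = gr.Slider(label="Strength", minimum=0.0, maximum=1.0, value=0.5)
        return [enabled, strength]

    def process(self, p, enabled, strength):
        # args arrive in the same order as the components returned by ui()
        if enabled:
            p.extra_generation_params["Example strength"] = strength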
Let's take ControlNet, probably the most popular extension of all, as an example of how this is implemented. Its directory structure is shown below; it contains the install.py, preload.py, and scripts directory introduced by the "rules" above.
install.py: checks whether the dependencies listed in requirements.txt are installed, and calls launch.run_pip to install any that are missing. A simplified sketch is shown below.
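A simplified sketch of install.py (the real file also parses version specifiers from each requirement line):

# install.py (simplified sketch): ensure requirements are present before webui starts
import os
import launch  # webui's launch utilities, available to extension install scripts

req_file = os.path.join(os.path.dirname(os.path.realpath(__file__)), "requirements.txt")

with open(req_file) as f:
    for package in f:
        package = package.strip()
        if package and not launch.is_installed(package):
            launch.run_pip(f"install {package}", f"sd-webui-controlnet requirement: {package}")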
preload.py: registers the following command-line arguments (a sketch of the preload() hook follows the list):

--controlnet-dir
--controlnet-annotator-models-path
--no-half-controlnet
--controlnet-preprocessor-cache-size
--controlnet-loglevel
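webui imports preload.py before parsing the command line and calls its preload(parser) function with the argparse parser; a simplified sketch (help texts and defaults here are approximations, not the extension's exact values):

# preload.py (simplified sketch): register extension-specific CLI flags
def preload(parser):
    parser.add_argument("--controlnet-dir", type=str, default=None,
                        help="Path to directory with ControlNet models")
    parser.add_argument("--controlnet-annotator-models-path", type=str, default=None,
                        help="Path to directory with annotator models")
    parser.add_argument("--no-half-controlnet", action="store_true", default=None,
                        help="do not switch ControlNet models to 16-bit floats")
    parser.add_argument("--controlnet-preprocessor-cache-size", type=int, default=16,
                        help="Cache size for ControlNet preprocessor results")
    parser.add_argument("--controlnet-loglevel", type=str, default="INFO",
                        help="Log level for the ControlNet extension")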
scripts directory: holds the extension's scripts, as well as some modifications to existing scripts.
A simple example: the UI-related definition is as follows.
# The ui() method builds the interface; every control is defined here
def ui(self, is_img2img):
    """this function should create gradio UI elements. See https://gradio.app/docs/#components
    The return value should be an array of all components that are used in processing.
    Values of those returned components will be passed to run() and process() functions.
    """
    infotext = Infotext()
    controls = ()
    max_models = shared.opts.data.get("control_net_unit_count", 3)
    elem_id_tabname = ("img2img" if is_img2img else "txt2img") + "_controlnet"
    with gr.Group(elem_id=elem_id_tabname):
        with gr.Accordion(f"ControlNet {controlnet_version.version_flag}", open=False, elem_id="controlnet"):
            if max_models > 1:
                with gr.Tabs(elem_id=f"{elem_id_tabname}_tabs"):
                    for i in range(max_models):
                        with gr.Tab(f"ControlNet Unit {i}", elem_classes=['cnet-unit-tab']):
                            group, state = self.uigroup(f"ControlNet-{i}", is_img2img, elem_id_tabname)
                            infotext.register_unit(i, group)
                            controls += (state,)
            else:
                with gr.Column():
                    group, state = self.uigroup(f"ControlNet", is_img2img, elem_id_tabname)
                    infotext.register_unit(0, group)
                    controls += (state,)
    if shared.opts.data.get("control_net_sync_field_args", True):
        self.infotext_fields = infotext.infotext_fields
        self.paste_field_names = infotext.paste_field_names
    return controls
In "extensions\sd-webui-controlnet-main\scripts\api.py" you can see an on_app_started callback, i.e. the controlnet_api function is invoked when the application starts.
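The registration follows the standard script_callbacks pattern; a simplified sketch (the route and payload here are illustrative, not the extension's full API):

# api.py (simplified sketch): add extra HTTP routes when the FastAPI app starts
from fastapi import FastAPI
import gradio as gr
from modules import script_callbacks

def controlnet_api(_: gr.Blocks, app: FastAPI):
    @app.get("/controlnet/version")
    async def version():
        return {"version": 2}  # illustrative payload

script_callbacks.on_app_started(controlnet_api)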
The figure below shows the principle behind ControlNet: it adds a side branch alongside the UNet. Let's see how the code implements this.
The key code is as follows:
UnetHook is a newly defined structure; calling latest_network.hook modifies the original network.
Inside latest_network.hook, the hook method first saves the model's original forward method (model._original_forward = model.forward), then reassigns it to the extension's own implementation via forward_webui.__get__(model, UNetModel); from that point on, the actual forward pass runs the extension's forward method.
Code snippet 1:
self.latest_network = UnetHook(lowvram=is_low_vram)
self.latest_network.hook(model=unet, sd_ldm=sd_ldm, control_params=forward_params, process=p)
Code snippet 2:
model._original_forward = model.forward
outer.original_forward = model.forward
model.forward = forward_webui.__get__(model, UNetModel)
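The __get__ call is what turns the plain function forward_webui into a method bound to this particular model instance. A self-contained toy demo of the same monkey-patching technique (class and values are illustrative):

class UNetModel:
    def forward(self, x):
        return x

def forward_webui(self, x):
    # the replacement forward can call the saved original, then add the
    # side-branch contribution (represented here by "+ 1")
    return self._original_forward(x) + 1

model = UNetModel()
model._original_forward = model.forward                   # keep the original bound method
model.forward = forward_webui.__get__(model, UNetModel)   # bind the new function to this instance
print(model.forward(1))  # -> 2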
In the request body JSON, the parameters for each extension go under "alwayson_scripts", as shown below; the parameters must be wrapped inside "args":
"alwayson_scripts": {
"AnimateDiff": {
"args": [
{
"model": "mm_sd_v15_v2.ckpt",
"format": ["GIF"],
"enable": true,
"video_length": 94,
"fps": 29,
"loop_number": 0,......
}
]
},
"ControlNet": {
"args": [
{
"enabled": true,
"module": "canny",
"model": "control_v11p_sd15_canny [d14c016b]",......
},
{
"enabled": true,
"module": "depth_midas",
"model": "control_v11f1p_sd15_depth [cfd03158]",......
}
]
}
}
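For reference, a minimal sketch of posting such a payload from Python (assuming a local webui started with --api; prompt and parameter values are placeholders):

import requests

payload = {
    "prompt": "a cat",
    "steps": 20,
    "alwayson_scripts": {
        "ControlNet": {
            "args": [
                {
                    "enabled": True,
                    "module": "canny",
                    "model": "control_v11p_sd15_canny [d14c016b]",
                },
            ],
        },
    },
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
images = resp.json()["images"]  # base64-encoded results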
How do these args reach each script? In ScriptRunner.create_script_ui, every script records args_from and args_to, the slice of the flattened argument list that belongs to it:

def create_script_ui(self, script, inputs):
    script.args_from = len(inputs)
    script.args_to = len(inputs)
    script.controls = wrap_call(script.ui, script.filename, "ui")
    for control in script.controls.values():
        control.custom_script_source = os.path.basename(script.filename)
    inputs += list(script.controls.values())
    script.args_to = len(inputs)
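At processing time the runner then hands each script exactly its own slice; conceptually (simplified from ScriptRunner.process):

for script in self.alwayson_scripts:
    script_args = p.script_args[script.args_from:script.args_to]
    script.process(p, *script_args)

This is why API callers must put each extension's parameters under alwayson_scripts/<name>/args: the API layer rebuilds the flattened script_args from those entries before processing runs.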
Everything works in UI mode, and txt2img works in API mode, but video-to-video via the API fails with an error: ControlNet cannot find the input images.
After tracing the cause in the extensions' source code, only the code-change solution is given below; it tries not to alter the open-source extensions' original logic. By injecting batch_images (the individual frames of the video) into StableDiffusionProcessing, ControlNet is given the frames it needs to take over control. Three places are modified:
Change 1: after the existing assertion, also persist the injected values into p.script_args when running through the API (the hard-coded 11 + idx is the position of ControlNet unit idx within script_args in this setup):

assert global_input_frames != '', 'No input images found for ControlNet module'
unit.batch_images = global_input_frames
unit.input_mode = InputMode.BATCH
if p.is_api:
    # in API mode the unit's args live in p.script_args as a dict; write the
    # values back there too, otherwise ControlNet re-parses the args and the
    # injected frames are lost
    p.script_args[11 + idx]['input_mode'] = InputMode.BATCH
    p.script_args[11 + idx]['batch_images'] = global_input_frames
Change 2: in the branch that expands a batch directory into a file list, mirror the expansion into p.script_args:

else:
    unit.batch_images = shared.listfiles(unit.batch_images)
    if p.is_api:
        p.script_args[11 + idx]['batch_images'] = shared.listfiles(p.script_args[11 + idx]['batch_images'])
Change 3: when ControlNet iterates over its enabled units, copy the injected values from p.script_args back onto each unit object:

self.latest_model_hash = p.sd_model.sd_model_hash
for idx, unit in enumerate(self.enabled_units):
    if p.is_api:
        setattr(unit, "input_mode", p.script_args[11 + idx]['input_mode'])
        setattr(unit, "batch_images", p.script_args[11 + idx]['batch_images'])