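Recently I needed to process docx files and display their content on a web page so that Word documents can be read online. I used Flask together with the Document class from python-docx to parse the docx file and render it. The code follows.
First install the library that provides Document. Try the latest python-docx first; if that does not work, switch to version 1.1.0: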
pip install python-docx
pip install python-docx==1.1.0
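To confirm which version actually ended up installed, a small check like the one below may help. This is only a sketch: the helper name `installed_version` is made up for illustration, and it relies on `importlib.metadata`, which is in the standard library from Python 3.8 onward.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the python-docx version, or None when it is not installed
    print(installed_version("python-docx"))
```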
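The code that processes the docx file is as follows:
import base64
import os
import re

from docx import Document

# vaReportDir (the root directory holding the report folders) is defined elsewhere in the project.

def ReadVADocx(ProjectName, DocxName):
    # Build the path <vaReportDir>/<ProjectName>/<DocxName>
    docxfilepath = os.path.join(vaReportDir, ProjectName, DocxName)
    paragraphs = ReadDocx(docxfilepath)
    return paragraphs

def ReadDocx(docxfilepath):
    doc = Document(docxfilepath)
    paragraphs = list()
    # Relationship IDs such as rId5 link a run to an embedded part (for example an image)
    pattern = re.compile(r'rId\d+')
    for graph in doc.paragraphs:
        # "Heading 1" -> "1", "Title" -> "Title"; body styles get no level
        level = graph.style.name.split(' ')[-1]
        if level == "Normal":
            level = None
        elif level == "Preformatted":
            level = None
        paragraph = {
            'text': graph.text,
            'level': level,
            'images': ""
        }
        paragraphs.append(paragraph)
        for run in graph.runs:
            # Runs without text may carry an embedded image
            if run.text == '':
                contentID = pattern.search(run.element.xml)
                if contentID:
                    contentID = contentID.group(0)
                    try:
                        contentType = doc.part.related_parts[contentID].content_type
                    except KeyError as e:
                        print(e)
                        continue
                    if not contentType.startswith('image'):
                        continue
                    imgData = doc.part.related_parts[contentID].blob
                    # Encode the image bytes so they can be inlined in the HTML page
                    image_base64 = base64.b64encode(imgData).decode('utf-8')
                    paragraph = {
                        'text': run.text,
                        'level': run.style.name.split(' ')[-1] if run.style.name.startswith('Heading') else None,
                        'images': image_base64
                    }
                    paragraphs.append(paragraph)
    # Return the collected entries so that ReadVADocx can pass them on
    return paragraphs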
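The code above walks through the docx file and appends each paragraph's text and heading level to a list, with embedded images added as Base64 strings.
The calling code is as follows: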
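One thing to watch in the level logic: taking the last word of the style name means a style such as "List Paragraph" produces the level "Paragraph", which the template would then render as a heading tag. Likewise, `rId\d+` matches any relationship ID in the run's XML, not only image references. The sketch below shows a stricter variant of both steps. The helper names `heading_level` and `embedded_image_ids` are hypothetical, and it assumes English style names ("Heading 1" to "Heading 9", "Title") and images referenced through an `r:embed` attribute; older documents may reference images differently.

```python
import re

# Assumption: heading styles are named "Heading 1" .. "Heading 9"
HEADING_RE = re.compile(r"^Heading (\d)$")
# Assumption: embedded pictures are referenced as r:embed="rIdN"
EMBED_RE = re.compile(r'r:embed="(rId\d+)"')


def heading_level(style_name):
    """Return "Title", a digit string for heading styles, or None for anything else."""
    if style_name == "Title":
        return "Title"
    match = HEADING_RE.match(style_name)
    return match.group(1) if match else None


def embedded_image_ids(run_xml):
    """Return every relationship ID that the run XML references via r:embed."""
    return EMBED_RE.findall(run_xml)
```

With these helpers, only real headings reach the heading branch of the template, and a run is treated as an image candidate only when it actually embeds something.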
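from flask import render_template, request

# app, engine (the module containing ReadVADocx) and user are defined elsewhere in the project.
@app.route('/ViewVADocx', methods=['GET'])
def ViewVADocx():
    try:
        # File and project names come from the query string, e.g. /ViewVADocx?name=...&docx=...
        DocxName = request.args.get('docx')
        ProjectName = request.args.get('name')
        paragraphs = engine.ReadVADocx(ProjectName, DocxName)
        return render_template("viewdocx.html", n_getname=ProjectName, n_user=user, paragraphs=paragraphs)
    except Exception as e:
        # Print the failure before showing the generic error page
        print(e)
        return render_template('error-500.html')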
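Because `docx` and `name` come straight from the query string, a value such as `..` could point outside the report directory. Below is a minimal sketch of one way to guard against that, assuming all reports live under a single base directory. `resolve_report_path` and its parameters are hypothetical names and not part of the original code.

```python
from pathlib import Path


def resolve_report_path(base_dir, project_name, docx_name):
    """Join the pieces and reject any result that leaves base_dir or is not a .docx."""
    base = Path(base_dir).resolve()
    candidate = (base / project_name / docx_name).resolve()
    # The resolved file must sit somewhere below the base directory
    if base not in candidate.parents:
        raise ValueError("path escapes the report directory")
    if candidate.suffix.lower() != ".docx":
        raise ValueError("not a .docx file")
    return candidate
```

The route could call such a helper before opening the file and return the error page when it raises `ValueError`.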
Next, the content needs to be displayed on the page. The HTML template is listed below:
{% extends "mould.html" %}
{% block head %}
{% endblock %}
{% block body %}
<!--body wrapper start-->
<div class="wrapper">
<div class="floating-box" id="floatingBox">↑Back to top↑</div>
<!--Start Page Title-->
<div class="page-title-box">
<h4 class="page-title">{{ n_getname }}: Scan Node Line</h4>
<div class="clearfix"></div>
</div>
<!--End Page Title-->
<!--Start row-->
<div class="row">
<div class="col-md-12">
<div class="white-box">
<h2 style="font-weight: bold;">Quick navigation:</h2>
{% for paragraph in paragraphs %}
{% if paragraph.level == "1" %}
<p>
<a href="#Section{{ loop.index0 }}" class="hover-link" style="font-weight: bold;">{{ paragraph.text }}</a>
</p>
{% elif paragraph.level == "2" %}
<p style="text-indent: 25px;">
<a href="#Section{{ loop.index0 }}" class="hover-link2" style="font-weight: bold;">{{ paragraph.text }}	</a>
</p>
{% endif %}
{% endfor %}
</div>
{% for paragraph in paragraphs %}
{% if paragraph.level %}
{% if paragraph.level == "Title" %}
<!-- <h2 align="center">{{ paragraph.text }}</h2>-->
{% elif paragraph.level == "1" %}
</div>
<div class="white-box">
<h{{ paragraph.level }} id="Section{{ loop.index0 }}" style="font-weight: bold;">{{ paragraph.text }}</h{{ paragraph.level }}>
{% else %}
<h{{ paragraph.level }} id="Section{{ loop.index0 }}">{{ paragraph.text }}</h{{ paragraph.level }}>
{% endif %}
{% else %}
{% if paragraph.images %}
<p><img src="data:image/png;base64,{{ paragraph.images }}" alt="Image"></p>
{% else %}
<p style="color: black;">{{ paragraph.text }}</p>
{% endif %}
{% endif %}
{% endfor %}
</div>
</div>
</div>
{% endblock %}
{% block list %}
<style>
.hover-link {
font-size: 20px;
}
.hover-link:hover {
color: red;
font-size: 30px;
}
.hover-link2 {
font-size: 15px;
}
.hover-link2:hover {
color: red;
font-size: 20px;
}
</style>
<style>
/* CSS styles that define the appearance of the floating box */
.floating-box {
position: fixed;
bottom: 20px;
right: 20px;
width: 80px;
height: 50px;
background-color: #ff9900;
color: #fff;
text-align: center;
line-height: 50px;
cursor: pointer;
}
</style>
<script>
// JavaScript: scroll back to the top when the floating box is clicked
var floatingBox = document.getElementById('floatingBox');
// Click event listener
floatingBox.addEventListener('click', function() {
window.scrollTo({ top: 0, behavior: 'smooth' });
});
</script>
{% endblock %}
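The template also adds styling and small conveniences such as a back-to-top button to make browsing easier. The final result looks like this:
[Screenshot: the rendered page]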
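This code only displays the content of a docx file, both text and images, and organizes it by heading level. It does not support editing the docx; if you are interested, you can explore that yourself.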
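One possible refinement for the image display: the template labels every image as `data:image/png`, while the Python code already reads each part's `content_type`. A sketch of building the full data URI on the server side might look like the following. `to_data_uri` is a hypothetical helper name; in the original code its result could be stored in the `'images'` field and used directly as the `src` attribute.

```python
import base64


def to_data_uri(content_type, blob):
    """Build a data URI that carries the image's real MIME type."""
    encoded = base64.b64encode(blob).decode("utf-8")
    return f"data:{content_type};base64,{encoded}"
```

For example, a JPEG part would then be emitted as `data:image/jpeg;base64,...` instead of being labeled as PNG.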