如何让AI帮忙写代码,chatgpt写爬虫爬取菊厂课件png图片并转换成pdf,全过程代码基本无需修改,太强大了!

发布时间:2024年01月04日

最近在学菊厂的AI课,想下载一下课件,但是没找到下载链接,作为一只python程序猿,首先想到写个爬虫,于是

1、分析下载链接,F12-network-翻页? 发现是动态加载的,每一页课件都是一张图片。?图片链接的最后是具体页数 :

https://klt-pdf2image.obs.cn-north-4.myhuaweicloud.com/20210702/module/1410892827643060226_%E6%95%99%E6%9D%90%E4%B8%8B%E8%BD%BD.pdf/6

2、本来打算自己写的,突然想到gpt可以写代码,不妨给AI个机会,与GPT的对话过程如下:

You:write a python script

ChatGPT: Certainly! I'd be happy to help. However, I need more information about what specific task or functionality you'd like the Python script to perform. Please provide more details or specify the purpose of the script so that I can tailor the script to your needs.

You:download png in websit

ChatGPT: To download a PNG image from a website using Python, you can use the requests library to send an HTTP request and retrieve the image. Here's a simple example:

import requests
from io import BytesIO
from PIL import Image

def download_png(url, save_path):
    try:
        # Send a GET request to the URL
        response = requests.get(url)
        
        # Check if the request was successful (status code 200)
        if response.status_code == 200:
            # Open the image using the PIL library
            image = Image.open(BytesIO(response.content))
            
            # Save the image to the specified path
            image.save(save_path, 'PNG')
            print(f"Image downloaded and saved to {save_path}")
        else:
            print(f"Failed to download image. Status code: {response.status_code}")

    except Exception as e:
        print(f"Error: {e}")

# Example usage:
image_url = "https://example.com/path/to/your/image.png"
save_path = "downloaded_image.png"
download_png(image_url, save_path)

Make sure to install the required libraries first:

pip install requests Pillow

Replace the image_url variable with the URL of the PNG image you want to download, and specify the desired save_path where you want to save the downloaded image. The script uses the Pillow library to handle the image, so you may need to install it using the provided pip install command if you haven't already.

好温馨呀,还提示我要安装什么包,怎么用,比CSDN上的大多数教程好多了!但是众所周知,不换源是装不上的!

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple requests Pillow

? 然后修改一下url 运行, 就下载成功了,么有任何报错

然而 整个课件有71页,我把url提供给了gpt 然后 告诉他 有71页

You: the url is https://klt-pdf2image.obs.cn-north-4.myhuaweicloud.com/20210702/module/1410892827643060226_%E6%95%99%E6%

?ChatGPT:? It seems like the provided URL is for a PDF file rather than a PNG image. If you want to download a PNG image, you may need to specify a direct URL to the image. However, if your goal is to convert a specific page of the PDF to an image, you can use the pdf2image library along with requests and Pillow. Here's an example:

它觉得我提供的是一个PDF的链接,所以代码肯定是不对的就不贴出来了,于是我纠正它:

You:?infact the url i provide is for png indeed

ChatGPT:?I see, my apologies for the confusion. If the provided URL is directly for a PNG image, then you can simplify the process. Here's a script to download a PNG image from the given URL:

import requests

def download_png(url, save_path):
    try:
        # Send a GET request to the PNG URL
        response = requests.get(url)

        # Check if the request was successful (status code 200)
        if response.status_code == 200:
            # Save the image to the specified path
            with open(save_path, 'wb') as f:
                f.write(response.content)
            print(f"Image downloaded and saved to {save_path}")
        else:
            print(f"Failed to download image. Status code: {response.status_code}")

    except Exception as e:
        print(f"Error: {e}")

# Example usage:
image_url = "https://klt-pdf2image.obs.cn-north-4.myhuaweicloud.com/20210702/module/1410892827643060226_%E6%95%99%E6%9D%90%E4%B8%8B%E8%BD%BD.pdf/6"
save_path = "downloaded_image.png"
download_png(image_url, save_path)

但是它好像忘了有71页这件事,于是再次提示一下

You:?the url is ended with the page number and it is tatally contain 71 pages

ChatGPT:? I see, thank you for the clarification. If the URL contains page numbers and you want to download all 71 pages, you can modify the script to loop through the pages and download each one. Here's an example:

import requests

def download_png(base_url, total_pages, save_folder):
    try:
        for page_number in range(1, total_pages + 1):
            # Construct the URL for each page
            url = f"{base_url}/{page_number}"

            # Send a GET request to the PNG URL
            response = requests.get(url)

            # Check if the request was successful (status code 200)
            if response.status_code == 200:
                # Save the image to the specified folder with a unique filename
                save_path = f"{save_folder}/page_{page_number}.png"
                with open(save_path, 'wb') as f:
                    f.write(response.content)
                print(f"Page {page_number} downloaded and saved to {save_path}")
            else:
                print(f"Failed to download page {page_number}. Status code: {response.status_code}")

    except Exception as e:
        print(f"Error: {e}")

# Example usage:
base_url = "https://klt-pdf2image.obs.cn-north-4.myhuaweicloud.com/20210702/module/1410892827643060226_%E6%95%99%E6%9D%90%E4%B8%8B%E8%BD%BD.pdf"
total_pages = 71
save_folder = "downloaded_pages"
download_png(base_url, total_pages, save_folder)

太强大了,真的给我把71页下下来了, ps:我中间手动建了个download_pages文件夹

哪么它能不能把这71张图片合成一个PDF文件呢?

You:?can you merge all png and convert is to pdf

ChatGPT:? Certainly! To merge all the downloaded PNG files into a single PDF file, you can use the Pillow library for image processing and the reportlab library for PDF generation. First, you need to install these libraries:

from PIL import Image
from reportlab.pdfgen import canvas
from io import BytesIO

def merge_png_to_pdf(png_folder, pdf_path):
    try:
        # Create a PDF file
        total_pages = 70 
        pdf_buffer = BytesIO()
        pdf_canvas = canvas.Canvas(pdf_buffer)

        # Get a list of all PNG files in the folder
        png_files = [f"{png_folder}/page_{i}.png" for i in range(1, total_pages + 1)]

        # Loop through each PNG file and add it to the PDF
        for png_file in png_files:
            image = Image.open(png_file)
            pdf_canvas.setPageSize((image.width, image.height))
            pdf_canvas.drawInlineImage(png_file, 0, 0)

            # Add a new page for the next image
            pdf_canvas.showPage()

        # Save the PDF to the specified path
        pdf_canvas.save()
        pdf_buffer.seek(0)

        with open(pdf_path, 'wb') as f:
            f.write(pdf_buffer.read())

        print(f"All PNG files merged and saved to {pdf_path}")

    except Exception as e:
        print(f"Error: {e}")

# Example usage:
png_folder = "downloaded_pages"
pdf_path = "merged_document.pdf"
merge_png_to_pdf(png_folder, pdf_path)

这次的代码有个小问题,total_pages 没有定义,帮他补充上就完美运行了

最后鼓励它一下, 那句happy coding 好感动啊!!!

You: good job! you are great

ChatGPT:Thank you! I'm glad I could help. If you have any more questions or if there's anything else I can assist you with, feel free to ask. Happy coding!

总结: 看来那句“人生苦短,我用python” 要改成“人生苦短,我用GPT了”

文章来源:https://blog.csdn.net/yiyamala/article/details/135369244
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。