How to download HTTP directory with all files and sub-directories as they appear on the online files/folders list using golang?


Question

Currently I am downloading files using the function below, and I want to be able to download folders from the URL as well.

Any help would be appreciated.

package main

import (
    "fmt"
    "io"
    "net/http"
    "os"
)

func main() {
    fileUrl := "http://example.com/file.txt"
    err := DownloadFile("./example.txt", fileUrl)
    if err != nil {
        panic(err)
    }
    fmt.Println("Downloaded: " + fileUrl)
}

// DownloadFile saves the contents of a URL to a local file.
func DownloadFile(filepath string, url string) error {

    // Get the data; check the error before touching resp,
    // which is nil when err is non-nil.
    resp, err := http.Get(url)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    contentType := resp.Header.Get("Content-Type")
    if contentType != "application/octet-stream" {
        // Returning an error (instead of printing and returning nil)
        // keeps main from reporting a successful download.
        return fmt.Errorf("requested URL is not downloadable")
    }

    // Create the file
    out, err := os.Create(filepath)
    if err != nil {
        return err
    }
    defer out.Close()

    // Write the body to the file
    _, err = io.Copy(out, resp.Body)
    return err
}

I have referred to the link below:
https://stackoverflow.com/questions/23446635/how-to-download-http-directory-with-all-files-and-sub-directories-as-they-appear

but I want to do it in Go.


Answer 1

Score: 0

You can find the algorithm that wget --recursive implements here: https://www.gnu.org/software/wget/manual/html_node/Recursive-Download.html

Basically, you fetch the page, parse the HTML, and follow each href link (and each CSS link if necessary). The links can be extracted as described at https://vorozhko.net/get-all-links-from-html-page-with-go-lang; a sketch of that step follows below.
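
For the extraction step, here is a minimal sketch using the golang.org/x/net/html tokenizer (one common way to do it; extractLinks and the example URL are illustrative names, not a fixed API):

package main

import (
    "fmt"
    "net/http"

    "golang.org/x/net/html"
)

// extractLinks fetches pageURL and returns the href of every <a> tag on it.
func extractLinks(pageURL string) ([]string, error) {
    resp, err := http.Get(pageURL)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var links []string
    z := html.NewTokenizer(resp.Body)
    for {
        switch z.Next() {
        case html.ErrorToken:
            // ErrorToken covers both end-of-document and read errors;
            // for a sketch, stopping either way is fine.
            return links, nil
        case html.StartTagToken:
            t := z.Token()
            if t.Data == "a" {
                for _, attr := range t.Attr {
                    if attr.Key == "href" {
                        links = append(links, attr.Val)
                    }
                }
            }
        }
    }
}

func main() {
    links, err := extractLinks("http://example.com/files/")
    if err != nil {
        panic(err)
    }
    for _, link := range links {
        fmt.Println(link)
    }
}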

Once you have all the links, just request each of them and decide based on the Content-Type header: save the response if it is not text/html, or parse it for further links if it is.

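Putting the pieces together, a rough sketch of such a recursive downloader follows. It assumes an Apache/nginx-style index page where sub-directory links end in a trailing slash, and the href filtering (skipping parent, absolute, and query links) is a deliberate simplification; downloadDir and the example URL are illustrative:

package main

import (
    "io"
    "net/http"
    "net/url"
    "os"
    "path"
    "path/filepath"
    "strings"

    "golang.org/x/net/html"
)

// downloadDir recursively mirrors an HTTP directory listing into destDir.
// visited guards against fetching the same URL twice.
func downloadDir(pageURL, destDir string, visited map[string]bool) error {
    if visited[pageURL] {
        return nil
    }
    visited[pageURL] = true

    resp, err := http.Get(pageURL)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    // Anything that is not HTML is treated as a file: save it and stop.
    if !strings.HasPrefix(resp.Header.Get("Content-Type"), "text/html") {
        if err := os.MkdirAll(destDir, 0o755); err != nil {
            return err
        }
        out, err := os.Create(filepath.Join(destDir, path.Base(resp.Request.URL.Path)))
        if err != nil {
            return err
        }
        defer out.Close()
        _, err = io.Copy(out, resp.Body)
        return err
    }

    // HTML is treated as a directory listing: extract and follow each href.
    base, err := url.Parse(pageURL)
    if err != nil {
        return err
    }
    z := html.NewTokenizer(resp.Body)
    for {
        tt := z.Next()
        if tt == html.ErrorToken {
            return nil // end of the listing page
        }
        if tt != html.StartTagToken {
            continue
        }
        t := z.Token()
        if t.Data != "a" {
            continue
        }
        for _, a := range t.Attr {
            // Skip parent, absolute, and sort-order links so we stay inside the tree.
            if a.Key != "href" || strings.HasPrefix(a.Val, "/") ||
                strings.HasPrefix(a.Val, "..") || strings.HasPrefix(a.Val, "?") {
                continue
            }
            ref, err := base.Parse(a.Val)
            if err != nil {
                continue
            }
            sub := destDir
            if strings.HasSuffix(a.Val, "/") {
                // A trailing slash marks a sub-directory in typical index pages.
                sub = filepath.Join(destDir, strings.TrimSuffix(a.Val, "/"))
            }
            if err := downloadDir(ref.String(), sub, visited); err != nil {
                return err
            }
        }
    }
}

func main() {
    err := downloadDir("http://example.com/files/", "./files", map[string]bool{})
    if err != nil {
        panic(err)
    }
}

Each fetch handles exactly the two cases described above: a text/html response is parsed as another listing, and anything else is streamed to disk.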
