首页 \ 问答 \ 使用Python请求的不同响应(Different response using Python requests)

使用Python请求的不同响应(Different response using Python requests)

我试图使用requests从URL下载图像。 使用浏览器或REST客户端,例如restlet chrome扩展名我可以检索可以保存到磁盘的正常内容,json和二进制映像。

使用requests作为响应结果我得到几乎相同的响应头,只有Content-Length具有不同的值 - 15字节而不是35千字节 - 并且我找不到二进制图像。

试图模拟浏览器发出的请求,我配置了相同的请求头,如下所示:

headers = {"Host": "cpom.prefeitura.sp.gov.br",
           "Pragma": "no-cache",
           "Cache-Control": "no-cache",
           "DNT": "1",
           "Accept": "*/*",
           "Accept-Encoding": "gzip, deflate, br",
           "Accept-Language": "en-US,en;q=0.9,pt;q=0.8",
           "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                         "AppleWebKit/537.36 (KHTML, like Gecko) "
                         "Chrome/65.0.3325.181 Safari/537.36"
           }

r = requests.get(url, stream=True, headers=headers)

没有重定向,我也调试并查看requests.model.Response的内容但没有成功。

我错过了什么? 我认为这是一个关于请求的细节,但我无法得到它。

这个我的测试:

url = "https://cpom.prefeitura.sp.gov.br/prestador/SituacaoCadastral/ImagemCaptcha?u=8762520"
r = requests.get(url, stream=True)

if r.status_code == 200:
    print(r.raw.headers)
    with open("/home/bruno/captcha/8762520.txt", "wb") as f:  # saving as text, since is not the png image
        for chunk in r:
            f.write(chunk)

这是下载图片的网址: https//cpom.prefeitura.sp.gov.br/prestador/SituacaoCadastral/ImagemCaptcha?u = 4067913

这个网站带有验证码图片: https//cpom.prefeitura.sp.gov.br/prestador/SituacaoCadastral

用一个简单的GET只会得到一个json响应体,但检查响应后,您会看到二进制响应,即图像大小〜36kb。

编辑 :包括来自restlet客户端的图像

请求: 请求样品

响应: 部分响应


I'm trying to download an image from a URL using requests. Using browser or a REST client, like restlet chrome extension I can retrieve the normal content, a json, and a binary image that I can save to disk.

Using requests as response result I got almost same response headers, only Content-Length has a different value - 15 bytes instead of 35 kilobytes - and I can't found the binary image.

Trying to simulate the request made by the browser I configure the same request header, like this:

headers = {"Host": "cpom.prefeitura.sp.gov.br",
           "Pragma": "no-cache",
           "Cache-Control": "no-cache",
           "DNT": "1",
           "Accept": "*/*",
           "Accept-Encoding": "gzip, deflate, br",
           "Accept-Language": "en-US,en;q=0.9,pt;q=0.8",
           "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                         "AppleWebKit/537.36 (KHTML, like Gecko) "
                         "Chrome/65.0.3325.181 Safari/537.36"
           }

r = requests.get(url, stream=True, headers=headers)

There's no redirects, I also debug and look the content of requests.model.Response but no success.

What I'm missing? I think that is a detail about the request, but I can't get it.

This my test:

url = "https://cpom.prefeitura.sp.gov.br/prestador/SituacaoCadastral/ImagemCaptcha?u=8762520"
r = requests.get(url, stream=True)

if r.status_code == 200:
    print(r.raw.headers)
    with open("/home/bruno/captcha/8762520.txt", "wb") as f:  # saving as text, since is not the png image
        for chunk in r:
            f.write(chunk)

This is the URL to download the image: https://cpom.prefeitura.sp.gov.br/prestador/SituacaoCadastral/ImagemCaptcha?u=4067913

And this the site with the captcha image: https://cpom.prefeitura.sp.gov.br/prestador/SituacaoCadastral

With a simple GET will get only a json response body, but inspecting the response you'll see the binary response, which is the image - ~36kb size.

EDIT: include images from restlet client

Request: Request sample

Response: Partial response


原文:https://stackoverflow.com/questions/49940970
更新时间:2023-04-22 08:04

最满意答案

我的猜测是正确的,这个代码工作 - 电子表格插入«一对一»并可以应用不同的风格:

                <script>
                $(document).ready(function() {
                    $('#result').load('-google spreadsheet html import link- #tblMain');
                });         


                </script>

比我想象的要容易得多:)


My guess was right, this code working — spreadsheet inserting «one in one» and can apply different style:

                <script>
                $(document).ready(function() {
                    $('#result').load('-google spreadsheet html import link- #tblMain');
                });         


                </script>

All much easier than I expected:)

相关问答

更多
  • dataTable.getValue(0, 0); 获取当前范围中第0行第0列的值,该范围似乎只是B2。 首先得到所需的范围 function drawChart() { // get B2:B10 var spreadsheetUrl = "https://docs.google.com/spreadsheets/d/1l6FmSuwU2E134UuxoNyRfvTw2UY_0G0q69ZwfbQy3mY/edit?range=B2:B10"; var ...
  • Asp与风格无关。 您需要在渲染后查看应用的样式。 我会推荐Chrome。 如果右键单击元素,则可以选择“检查元素”,右侧有一个框,您可以在其中确切地看到应用于元素的样式。 在这一点上,只需要找出冒犯风格的来源。 很可能你有一个内联样式,它在一个包含的asp页面中呈现,与你的新样式相冲突。 一旦您使用Chrome查找正在注入的样式,您就可以查找定义该样式的位置。 So the problem was that some pages missed tag. Also those ...
  • 是的,这是可能的,但它在很大程度上取决于PDF,我认为这将是最大的障碍。 您可能会发现这个答案至少是相关的,如果不是您正在寻找的话。 否则,如果所有内容都存储在云端硬盘中,那只是一个问题: 1)循环播放工作表并打开所需的文档。 2)获取PDF的内容(可能是字符串)。 3)找到一种从PDF中删除相关数据的一致方法(这在很大程度上取决于PDF的内容)。 4)将数据粘贴到工作表。 3号可能是你最大的挑战,但一旦你开始,你可能会发现它比你想象的容易得多。 Yes, it's possible, but it dep ...
  • 不是真的。 您必须在HTML内容中搜索要替换的文本。 但是,您可以做的是创建一个UiApp小工具,无论何时加载它都会生成图表。 Not really. You'll have to search for the text you want to replace in the HTML content. What you could do instead, though, is make a UiApp gadget that generates the charts whenever it's loaded ...
  • 我的猜测是正确的,这个代码工作 - 电子表格插入«一对一»并可以应用不同的风格: