Search text file for multiple strings and print out results to a new text file

I'm fairly new to Python programming and I'm trying to learn file I/O as best I can.

I am currently in the process of making a simple program to read from a text document and print out the result. So far I've been able to create this program with the help of many resources and questions on this website.

However, I'm curious how I can search a text document for multiple individual strings and save the resulting strings to another text document.

The program below is one I've made that lets me search a text document for a keyword and print the text between those keywords into another text file. However, I can only use one set of starting and ending keywords per search:

import re
from Tkinter import *
import tkSimpleDialog
import tkMessageBox
from tkFileDialog import askopenfilename

root = Tk()
w = Label(root, text="Configuration Inspector")
w.pack()
tkMessageBox.showinfo("Welcome", "This is version 1.00 of Configuration Inspector Text")
filename = askopenfilename()  # Data Search Text File
outputfilename = askopenfilename()  # Output Text File

with open(filename, "rb") as f_input:
    start_token = tkSimpleDialog.askstring("Serial Number", "What is the device serial number?")
    end_token = tkSimpleDialog.askstring("End Keyword", "What is the end keyword")
    # Non-greedy match of everything between the start and end keywords;
    # re.S lets "." match across newlines.
    reText = re.search("%s(.*?)%s" % (re.escape(start_token + ",SHOWALL"), re.escape(end_token)), f_input.read(), re.S)
    if reText:
        output = reText.group(1)
        fo = open(outputfilename, "wb")
        fo.write(output)
        fo.close()

        print output
    else:
        tkMessageBox.showinfo("Output", "Sorry that input was not found in the file")
        print "not found"

So what this program does is let a user select a text document, search that document for a beginning keyword and an end keyword, and then print everything between those two keywords into a new text document.

What I am trying to achieve is to let a user select a text document, search it for multiple sets of keywords, and print the results to the same output text file.

In other words let's say I have the following Text Document:

something something something something
something something something something STARTkeyword1 something
data1
data2
data3
data4
data5
ENDkeyword1
something something something something
something something something something STARTkeyword2 something
data1
data2
data3
data4
data5
Data6
ENDkeyword2
something something something something
something something something something STARTkeyword3 something
data1
data2
data3
data4
data5
data6
data7
data8
ENDkeyword3

I want to be able to search this text document with 3 different starting keywords and 3 different ending keywords, then print what's in between to the same output text file.

So for example my output text document would look something like:

something
data1
data2
data3
data4
data5
ENDkeyword1

something
data1
data2
data3
data4
data5
Data6
ENDkeyword2

something
data1
data2
data3
data4
data5
data6
data7
data8
ENDkeyword3

One brute-force method I've tried is a loop that makes the user enter a new keyword one at a time. However, whenever I try to write to the same output file, it overwrites the previous entry even when using append mode. Is there any way to let a user search a text document for multiple strings and print out the multiple results, with or without a loop?
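One way to sidestep the overwriting problem is to collect every match first and open the output file only once. The following is a minimal sketch under assumptions, not the program above: `extract_sections` and `write_sections` are hypothetical helper names, and the GUI prompts are left out. Like the original regex, it returns only the text between the keywords, so the ENDkeyword lines shown in the desired output would have to be added back separately.

```python
import re


def extract_sections(text, pairs):
    """Return the text found between each (start, end) keyword pair."""
    sections = []
    for start, end in pairs:
        # Same non-greedy pattern as the single-pair program.
        pattern = re.escape(start) + r"(.*?)" + re.escape(end)
        match = re.search(pattern, text, re.S)
        if match:
            sections.append(match.group(1))
    return sections


def write_sections(path, sections):
    """Write all sections to one file, separated by blank lines."""
    # The file is opened once, so later sections cannot replace earlier ones.
    with open(path, "w") as out:
        for section in sections:
            out.write(section.strip() + "\n\n")
```

Opening the file a single time in write mode and looping inside the `with` block means no section is written over a previous one, whichever mode flag is used.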

----------------- EDIT:

Many thanks to all of you. With your tips I'm getting closer to a nice finalized version. This is my current code:

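def process(infile, outfile, keywords):
    # Each entry is [start keyword, end keyword, number of matches found].
    keys = [ [k[0], k[1], 0] for k in keywords ]
    endk = None  # End keyword of the section being copied, or None.
    with open(infile, "rb") as fdin:
        with open(outfile, "wb") as fdout:
            for line in fdin:
                if endk is not None:
                    # Inside a section: copy lines until the end keyword appears.
                    fdout.write(line)
                    if line.find(endk) >= 0:
                        fdout.write("\n")
                        endk = None
                else:
                    # Outside a section: look for any start keyword on this line.
                    for k in keys:
                        index = line.find(k[0])
                        if index >= 0:
                            # Keep only the text after the start keyword.
                            fdout.write(line[index + len(k[0]):].lstrip())
                            endk = k[1]
                            k[2] += 1
                            break  # Stop at the first matching start keyword.
    if endk is not None:
        raise Exception(endk + " not found before end of file")
    return keys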



from Tkinter import *
import tkSimpleDialog
import tkMessageBox
from tkFileDialog import askopenfilename

root = Tk()
w = Label(root, text ="Configuration Inspector")
w.pack()
tkMessageBox.showinfo("Welcome", "This is version 1.00 of Configuration Inspector ")
infile = askopenfilename()  # Data Search Text File
outfile = askopenfilename()  # Output Text File

start_token = tkSimpleDialog.askstring("Serial Number", "What is the device serial number?")
end_token = tkSimpleDialog.askstring("End Keyword", "What is the end keyword")

process(infile,outfile,((start_token + ",SHOWALL",end_token),))

So far it works. However, now comes the part I'm getting lost on: a multiple-string input separated by a delimiter. So if I had entered

STARTKeyword1, STARTKeyword2, STARTKeyword3, STARTKeyword4

into the program prompt I want to be able to separate those keywords and place them into the

process(infile,outfile,keywords)

function, so that the user is only prompted once and multiple strings can be searched for in the file. I was thinking of maybe using a loop or splitting the input into an array.

If this strays too far from the original question, I will close this one and open another so I can give credit where credit is due.
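The delimiter-separated input described in the edit can be sketched roughly as follows. This is only an illustration under assumptions: `parse_keywords` and `build_pairs` are hypothetical names, and the sketch assumes the end keywords are entered as a second comma-separated list in the same order as the start keywords, which the question does not specify. The `",SHOWALL"` suffix is taken from the existing `process` call.

```python
def parse_keywords(raw, delimiter=","):
    """Split one delimited prompt answer into a list of clean keywords."""
    # Strip surrounding whitespace and drop empty entries such as "a,,b".
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


def build_pairs(start_keywords, end_keywords, suffix=",SHOWALL"):
    """Pair each start keyword with its end keyword, in order."""
    if len(start_keywords) != len(end_keywords):
        raise ValueError("need one end keyword per start keyword")
    # A tuple of (start, end) tuples, the shape process() already accepts.
    return tuple((start + suffix, end)
                 for start, end in zip(start_keywords, end_keywords))
```

With the two `askstring` answers held in `start_token` and `end_token`, the call would then look like `process(infile, outfile, build_pairs(parse_keywords(start_token), parse_keywords(end_token)))`, so the user is prompted only once for each list.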


Original: https://stackoverflow.com/questions/32097118
Updated: 2021-08-20 21:08

Best answer

The right URL to use for Office 365 EWS WSDL is the first one you tried: https://outlook.office365.com/ews/services.wsdl. You should use this URL to get the WSDL. From the browser, when I enter my email address and password, I am able to get the WSDL without any problems.

I am not familiar with SOAPUI, so I am not sure why this URL + Basic auth isn't working with SOAP UI. Let me know if you have any questions or need more info.

[Additional notes by OP]

Thanks Venkat, that was the solution. There were additional complications that explain why I did not get this right in the first place. I'm writing them in the answer because it's too much for comments.

  1. I accidentally entered https://pod51046.outlook.com/ews/***exchange***.wsdl instead of https://pod51046.outlook.com/ews/***services***.wsdl (which Oleg's blog actually mentioned, and I overlooked). This immediately gave the correct results in IE. Your suggestion of inputting the generic https://outlook.office365.com/ews/services.wsdl also works.
    (I think that after having read somewhere that the actual URL is a pod... one, I doggedly kept trying that one after resolving it).
    So this is why attempts 2 and 3 failed.

  2. SOAP UI asks me 2 times 3 = 6 times for the login credentials when setting up the test project. I just was not persistent enough.
    And this is why attempt 1 failed.
