Very large input and piping using subprocess.Popen

Date: 2023-07-21

Problem description

I have a fairly simple problem. I have a large file that goes through three steps: a decoding step using an external program, some processing in Python, and then re-encoding using another external program. I have been using subprocess.Popen() to try to do this in Python rather than chaining Unix pipes. However, all the data ends up buffered in memory. Is there a Pythonic way of doing this task, or am I better off falling back to a simple Python script that reads from stdin and writes to stdout, with Unix pipes on either side?

import os, sys, subprocess

def main(infile,reflist):
    print infile,reflist
    samtoolsin = subprocess.Popen(["samtools","view",infile],
                                  stdout=subprocess.PIPE,bufsize=1)
    samtoolsout = subprocess.Popen(["samtools","import",reflist,"-",
                                    infile+".tmp"],stdin=subprocess.PIPE,bufsize=1)
    for line in samtoolsin.stdout.read():
        if(line.startswith("@")):
            samtoolsout.stdin.write(line)
        else:
            linesplit = line.split("\t")
            if(linesplit[10]=="*"):
                linesplit[9]="*"
            samtoolsout.stdin.write("\t".join(linesplit))
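
For comparison, here is a minimal sketch of a streaming variant. It is not taken from the original post. It assumes Python 3.7+ (for `text=True`), and the names `fix_record` and `stream_through` are hypothetical helpers introduced only for illustration. The idea is to iterate over the pipe object itself rather than over `stdout.read()`: `read()` returns the whole output as one string (and looping over that string yields single characters, not lines), whereas iterating over the file object yields one line at a time.

```python
import subprocess


def fix_record(line):
    """Pass header lines through unchanged; for other lines, set field 9
    to "*" when field 10 is "*" (same rule as the code in the question)."""
    if line.startswith("@"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) > 10 and fields[10] == "*":
        fields[9] = "*"
    return "\t".join(fields) + "\n"


def stream_through(decode_cmd, encode_cmd, transform=fix_record):
    """Feed decode_cmd's stdout through transform into encode_cmd's stdin,
    line by line, and return both exit codes.

    Hypothetical helper: the command lists are supplied by the caller.
    """
    decoder = subprocess.Popen(decode_cmd, stdout=subprocess.PIPE, text=True)
    encoder = subprocess.Popen(encode_cmd, stdin=subprocess.PIPE, text=True)
    # Iterating over the pipe yields lines as they arrive, so only a small
    # buffer is held in memory at any time.
    for line in decoder.stdout:
        encoder.stdin.write(transform(line))
    decoder.stdout.close()
    encoder.stdin.close()  # EOF on stdin lets the encoder finish
    return decoder.wait(), encoder.wait()
```

With the commands from the question, a call might look like `stream_through(["samtools", "view", infile], ["samtools", "import", reflist, "-", infile + ".tmp"])`. The sketch does not handle the encoder exiting early, in which case the write would raise `BrokenPipeError`.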
