How to join documents after a pipeline stage in the mongo aggregation framework

Let's say that after the first stage of aggregation I have grouped all the documents by center, so I have something like this:
{
center:"A",
gender:"Male",
count:50
}
{
center:"A",
gender:"Female",
count:20
}
 
I want to join these two documents so that the final document looks something like this:
{
center:"A",
Male:50,
Female:20
}

Source: https://stackoverflow.com/questions/34627212

Updated: 2023-07-28 06:07

Best answer

You can use csv.reader with skipinitialspace=True to skip the runs of spaces, then zip the rows to get the columns. itertools.zip_longest (izip_longest in Python 2) is used because a value in the last column is missing. Convert each column to a set and take the intersection with set.intersection:
from itertools import zip_longest
import csv

with open('test', newline='') as f:
    reader = csv.reader(f, delimiter=' ', skipinitialspace=True)
    # Transpose rows into columns; short rows are padded with None.
    cols = [set(col) for col in zip_longest(*reader)]

print(set.intersection(*cols))
 
Be aware that your file is not really a CSV: if a value is missing in any column other than the last one, the values after it shift left and the input is interpreted incorrectly. Consider at least using a delimiter that is not a space.
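To illustrate the caveat above, here is a minimal Python 3 sketch (it uses zip_longest rather than Python 2's izip_longest). The helper name column_sets and both tiny data strings are hypothetical, made up only for this demonstration. It shows a missing middle value sliding the next value into the wrong column when the delimiter is a space, while a comma delimiter keeps the columns aligned.

```python
import csv
import io
from itertools import zip_longest


def column_sets(text, delimiter):
    """Parse delimited text and return one set of values per column."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        skipinitialspace=True)
    # zip_longest transposes rows into columns and pads short rows
    # with None, but only at the END of the row.
    return [set(col) for col in zip_longest(*reader)]


# Hypothetical data: the second row has no value in the MIDDLE column.
space_data = "a    b    c\nx         z\n"
comma_data = "a,b,c\nx,,z\n"

# Space-delimited: the gap is swallowed, so 'z' lands in column 2.
space_cols = column_sets(space_data, ' ')

# Comma-delimited: the empty field is kept, so 'z' stays in column 3.
comma_cols = column_sets(comma_data, ',')

print(space_cols)
print(comma_cols)
```

With the space delimiter the run of spaces is indistinguishable from ordinary column padding, so the parser has no way to know a field was skipped; an explicit delimiter such as a comma records the empty field as an empty string.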
Example
Using io.StringIO to parse a string shows that this works for the test case:
from itertools import zip_longest
import csv
import io

data = '''table1    table2    table3  table4   table5
paper     paper     pen     book     book
pen       pencil    pencil  charger  apple
apple     pen       charger beatroot sandle
beatroot  mobile    apple   pen      paper
sandle    book      paper   paper'''

f = io.StringIO(data)
reader = csv.reader(f, delimiter=' ', skipinitialspace=True)
# Transpose rows into columns; the short last row is padded with None.
cols = [set(col) for col in zip_longest(*reader)]

print(set.intersection(*cols))

Output
{'paper'}
