tshark export FIX messages

The Objective

I'm trying to achieve the following:

  • capture network traffic containing a conversation in the FIX protocol
  • extract the individual FIX messages from the network traffic into a "nice" format, e.g. CSV
  • do some data analysis on the exported "nice" format data

I have achieved this by:

  • using pcap to capture the network traffic
  • using tshark to print the relevant data as a CSV
  • using Python (pandas) to analyse the data

The Problem

The problem is that some of the captured TCP packets contain more than one FIX message, which means that when I do the export to CSV using tshark I don't get a FIX message per line. This makes consuming the CSV difficult.
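To make the "one FIX message per line" goal concrete, here is a minimal Python sketch of splitting a payload that carries several FIX messages. It assumes the reassembled TCP payload is already available as bytes (how it is obtained from the pcap is not shown), that every message starts with the BeginString tag (8=FIX...) and ends with the CheckSum tag (10=NNN), and that no data field embeds the SOH delimiter. The helper names are hypothetical; the tag numbers are the standard FIX tags for the fields exported in this question.

```python
import re

SOH = b"\x01"  # FIX field delimiter

# One message runs from BeginString (8=FIX...) to CheckSum (10=NNN<SOH>).
# Assumption: the payload is fully reassembled and no field value contains SOH.
FIX_MESSAGE = re.compile(rb"8=FIX.*?\x0110=\d{3}\x01", re.DOTALL)

# Standard FIX tag numbers for the fields exported in the question.
TAGS = {
    "MsgType": b"35",
    "SenderCompID": b"49",
    "SenderSubID": b"50",
    "Symbol": b"55",
    "Side": b"54",
    "Price": b"44",
    "OrderQty": b"38",
    "ClOrdID": b"11",
    "OrderID": b"37",
    "OrdStatus": b"39",
}


def split_fix_messages(payload):
    """Return each complete FIX message found in the payload, in order."""
    return FIX_MESSAGE.findall(payload)


def parse_fields(message):
    """Map tag -> value for one message (the last occurrence of a tag wins)."""
    fields = {}
    for pair in message.split(SOH):
        if b"=" in pair:
            tag, value = pair.split(b"=", 1)
            fields[tag] = value
    return fields


def to_rows(payload):
    """Produce one row per FIX message; missing fields become empty strings."""
    rows = []
    for message in split_fix_messages(payload):
        fields = parse_fields(message)
        rows.append([fields.get(tag, b"").decode("ascii") for tag in TAGS.values()])
    return rows


if __name__ == "__main__":
    # Two made-up messages in one payload (checksums are not validated here).
    payload = (
        b"8=FIX.4.2\x019=5\x0135=D\x0111=646427\x0110=000\x01"
        b"8=FIX.4.2\x019=5\x0135=F\x0111=646429\x0110=000\x01"
    )
    for row in to_rows(payload):
        print(",".join(row))
```

The sketch does not verify BodyLength or CheckSum and does not handle a message that spans two TCP segments, so it only illustrates the splitting step.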

This is the tshark command line I'm using to extract the relevant FIX fields as CSV:

tshark -r dump.pcap \
-R '(fix.MsgType[0]=="G" or fix.MsgType[0]=="D" or fix.MsgType[0]=="8" or fix.MsgType[0]=="F") and fix.ClOrdID != "0"' \
-Tfields -Eseparator=, -Eoccurrence=l -e frame.time_relative \
-e fix.MsgType -e fix.SenderCompID \
-e fix.SenderSubID -e fix.Symbol -e fix.Side \
-e fix.Price -e fix.OrderQty -e fix.ClOrdID \
-e fix.OrderID -e fix.OrdStatus

Note that I'm currently using "-Eoccurrence=l" to get just the last occurrence of a named field in the case where there is more than one occurrence of a field in the packet. This is not an acceptable solution as information will get thrown away when there are multiple FIX messages in a packet.

This is what I expect to see per line in the exported CSV file (fields from one FIX message):

16.508949000,D,XXX,XXX,YTZ2,2,97480,34,646427,,
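As an illustration of how such a row lines up with the exported fields, the sketch below reads the sample line with Python's standard csv module. The column names are an assumption: they simply mirror the order of the -e options in the tshark command, since tshark does not print a header row unless asked to.

```python
import csv
import io

# Column names mirror the order of the -e options (assumed naming, no header in the file).
COLUMNS = [
    "time_relative", "MsgType", "SenderCompID", "SenderSubID", "Symbol",
    "Side", "Price", "OrderQty", "ClOrdID", "OrderID", "OrdStatus",
]


def read_rows(text):
    """Parse header-less CSV text into one dict per FIX message."""
    return list(csv.DictReader(io.StringIO(text), fieldnames=COLUMNS))


if __name__ == "__main__":
    sample = "16.508949000,D,XXX,XXX,YTZ2,2,97480,34,646427,,\n"
    for row in read_rows(sample):
        print(row["MsgType"], row["Symbol"], row["ClOrdID"])
```

The same column list can be passed to pandas as `pandas.read_csv(path, names=COLUMNS)` for the analysis step.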

This is what I see when there is more than one FIX message (three in this case) in a TCP packet and the command-line flag "-Eoccurrence=a" is used:

16.515886000,F,F,G,XXX,XXX,XXX,XXX,XXX,XXX,XTZ2,2,97015,22,646429,646430,646431,323180,323175,301151,

The Question

Is there a way (not necessarily using tshark) to extract each individual, protocol specific message from a pcap file?


Source: https://stackoverflow.com/questions/13810156
Updated: 2023-06-17 06:06

Best answer

MySQL has a problem with IN: it repeatedly re-evaluates uncorrelated subqueries as though they were correlated. Does rewriting the query as a join improve things?

SELECT
    COUNT(distinct p.`id`)  
FROM `poems` p
JOIN `poems_genres` pg
ON  p.`id` = pg.`poem_id`  
WHERE pg.`genre_title` = 'derision' AND p.`status` = 'finished';

If not, then according to this article (see the section "How to force the inner query to execute first"), wrapping the subquery in a derived table might help.

SELECT
    COUNT(*)  
FROM
    `poems`
WHERE `id` IN
(
    SELECT `poem_id`
    FROM (SELECT `poem_id`
          FROM `poems_genres`
          WHERE `genre_title` = 'derision') x
) AND `status` = 'finished';
