Decode JSON Multiway Tree into an F# Multiway Tree Discriminated Union
I have the following JSON data in a DocumentDB and I would like to parse it into an F# multiway tree discriminated union:

    "commentTree": {
      "commentModel": {
        "commentId": "",
        "userId": "",
        "message": ""
      },
      "forest": []
    }

The F# multiway tree discriminated union:

    type public CommentMultiTreeDatabaseModel =
        | CommentDatabaseModelNode of CommentDatabaseModel * list<CommentMultiTreeDatabaseModel>

where CommentDatabaseModel is defined as

    type public CommentDatabaseModel =
        { commentId : string
          userId : string
          message : string }

I am referencing "Fold / Recursion over Multiway Tree in f#" extensively, but I am not sure where to begin parsing such a JSON structure into an F# multiway tree. Any suggestions will be much appreciated. Thanks.
Original: https://stackoverflow.com/questions/41494563
Best answer
The simple answer is that [.\n] probably doesn't do what you think it does. Inside a character class, most metacharacters lose their special meaning, so that character class contains only two characters: a literal . and a newline. You should use (.|\n). But that won't solve the problem.
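The character-class rule can be checked quickly outside flex. The following is a minimal sketch using Python's `re` module rather than flex itself (an assumption made for illustration; the helper name `matches` is hypothetical), relying on the fact that both treat `.` inside brackets as a literal dot:

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# [.\n] is a two-character class: a literal dot or a newline.
print(matches(r"[.\n]", "."))    # the literal dot is in the class
print(matches(r"[.\n]", "\n"))   # so is the newline
print(matches(r"[.\n]", "a"))    # any other character is not

# (.|\n) is "any character except newline, or a newline", i.e. any character.
print(matches(r"(.|\n)", "a"))
print(matches(r"(.|\n)", "\n"))
```

Only the third call reports no match, which is the behaviour the answer describes: the bracketed form never matches ordinary text.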
The underlying cause is the use of a fixed repetition count. Large (or even not so large) repetition counts can result in exponential blow-up of the state machine, if the end of the matched region is ambiguous.
With the repetition of [.\n], the repeated match has an unambiguous termination unless the rest of the regex can start with a dot or a newline. So "." triggers the problem, but "A" doesn't. If you correct the repetition to match any character, then any following character will trigger exponential blow-up. So if you make the change suggested above, the regular expression will continue to be uncompilable. Changing the repetition count to an indefinite repetition (the star operator) would avoid the problem.
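The effect of an ambiguous end to a counted repetition can be reproduced with a textbook subset construction. The sketch below is not flex's algorithm and does not use the exact pattern from the question. It assumes a simplified pattern, D{0,n} "." D{0,n} "f" with D standing for [.\n], hand-encodes its NFA, and counts the reachable DFA states (the dead state is not counted). All function names are hypothetical.

```python
from collections import deque

DOT, NL, F = ".", "\n", "f"
ALPHABET = (DOT, NL, F)


def add(delta, src, ch, dst):
    """Record an NFA transition src --ch--> dst."""
    delta.setdefault((src, ch), set()).add(dst)


def bounded_nfa(n):
    # NFA for  D{0,n} "." D{0,n} "f"  where D is the class [.\n].
    # ("A", i): i characters consumed by the first repetition.
    # ("B", j): literal dot seen, j characters consumed by the second one.
    delta = {}
    for i in range(n + 1):
        if i < n:
            for ch in (DOT, NL):
                add(delta, ("A", i), ch, ("A", i + 1))
                add(delta, ("B", i), ch, ("B", i + 1))
        # Ambiguity: a dot may continue the repetition or be the literal ".".
        add(delta, ("A", i), DOT, ("B", 0))
        add(delta, ("B", i), F, "accept")
    return delta, ("A", 0)


def star_nfa():
    # NFA for  D* "." D* "f": the counters collapse into two looping states.
    delta = {}
    for ch in (DOT, NL):
        add(delta, "A", ch, "A")
        add(delta, "B", ch, "B")
    add(delta, "A", DOT, "B")
    add(delta, "B", F, "accept")
    return delta, "A"


def count_dfa_states(delta, start):
    """Subset construction: count reachable, non-empty sets of NFA states."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for ch in ALPHABET:
            nxt = frozenset(t for s in current for t in delta.get((s, ch), ()))
            if nxt and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 11):
        print(f"{{0,{n}}}: {count_dfa_states(*bounded_nfa(n))} DFA states")
    print(f"star: {count_dfa_states(*star_nfa())} DFA states")
```

With the counted form, the DFA has to remember which of the recent characters were dots, so the state count roughly doubles with each extra repetition. The star form needs only a handful of states regardless of input length.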
To illustrate the problem, I used the -v option to check the number of states with different repetition counts. This clearly shows the exponential increase in state count, and it's obvious that going much further than 14 repetitions would be impossible. (I didn't show the time consumption; suffice it to say that flex's algorithms are not linear in the size of the DFA, so while each additional repetition doubles the number of states, it roughly quadruples the time consumption. At 16 repetitions, flex took 45 seconds, so it's reasonable to assume that it would take about a week to do 23 repetitions, provided that the 6GB of RAM it would need was available without too much swapping. I didn't try the experiment.)

    $ cat badre.l
    %%
    "on"[ \t\r]*[.\n]{0,XXX}"."[ \t\r]*[.\n]{0,XXX}"from"
    $ for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14; do
    >   printf '{0,%d}:\t%24s\n' $i \
    >     "$(flex -v -o /dev/null <( sed "s/XXX/$i/g" badre.l) |&
    >        grep -o '.*DFA states')"
    > done
    {0,1}:         17/1000 DFA states
    {0,2}:         25/1000 DFA states
    {0,3}:         41/1000 DFA states
    {0,4}:         73/1000 DFA states
    {0,5}:        137/1000 DFA states
    {0,6}:        265/1000 DFA states
    {0,7}:        521/1000 DFA states
    {0,8}:       1033/2000 DFA states
    {0,9}:       2057/3000 DFA states
    {0,10}:      4105/6000 DFA states
    {0,11}:     8201/11000 DFA states
    {0,12}:    16393/21000 DFA states
    {0,13}:    32777/41000 DFA states
    {0,14}:    65545/82000 DFA states
Changing the regex to use (.|\n) for both repetitions roughly triples the number of states, because with that change both repetitions become ambiguous (and there is an interaction between the two of them).