Is leex a good choice for writing a template engine lexer?
I am in the initial design phase of implementing a jinja2-like template language for Elixir. I had been inclined to write the lexer by hand, but I recently came across the leex module for Erlang. It looks promising, but after some initial research I am unsure whether it is the right tool for my purposes.
One of my hesitations is that a template language is essentially a language embedded in a string, and it is not clear to me how to tokenize that with leex. As a trivial example, imagine tokenizing this template:
<p>Here is some text for inclusion in the template.</p> {% for x in some_variable %} The value for the variable: {{ x }}. {% endfor %}
In this example, I need to ensure that the keywords 'for' and 'in' are tokenized differently depending on:
- If they are inside a tag: {% %}
- If they are inside an expression: {{ }}
- If they are in the template, but not inside any tags.
To me, this looks like I would need either two passes in the tokenizing phase or a hand-rolled lexer that does it in one pass.
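To make the one-pass option concrete, here is a rough sketch of what I have in mind, written in Python only because it is quick to prototype in. The token names, the keyword set, and the `tokenize` function are all made up for illustration, and it only handles bare words inside the delimiters, so treat it as an assumption-laden sketch and not a real design.

```python
import re

# Hypothetical token names and keyword set, chosen only for illustration.
KEYWORDS = {"for", "in", "endfor"}
# opener -> (open token name, closing delimiter, close token name)
OPENERS = {
    "{%": ("tag_open", "%}", "tag_close"),
    "{{": ("expr_open", "}}", "expr_close"),
}
OPEN_RE = re.compile(r"\{%|\{\{")
WORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(source):
    """Single left-to-right pass that switches between text mode and code mode."""
    tokens = []
    pos = 0
    while pos < len(source):
        # Text mode: everything up to the next opener is one opaque text token,
        # so 'for' and 'in' in plain text are never treated as keywords.
        match = OPEN_RE.search(source, pos)
        end = match.start() if match else len(source)
        if end > pos:
            tokens.append(("text", source[pos:end]))
        if match is None:
            break
        open_name, closer, close_name = OPENERS[match.group()]
        tokens.append((open_name, match.group()))
        pos = match.end()
        # Code mode: scan whitespace-separated words until the matching closer.
        while True:
            while pos < len(source) and source[pos].isspace():
                pos += 1
            if source.startswith(closer, pos):
                tokens.append((close_name, closer))
                pos += len(closer)
                break
            word = WORD_RE.match(source, pos)
            if word is None:
                raise ValueError(f"unexpected input at offset {pos}")
            # Keywords are only recognised inside {% %}; inside {{ }} every word is a name.
            is_keyword = open_name == "tag_open" and word.group() in KEYWORDS
            tokens.append(("keyword" if is_keyword else "name", word.group()))
            pos = word.end()
    return tokens
```

Run on the template above, this yields 'for' and 'in' as keyword tokens only inside {% %}, while the same words in the surrounding markup stay buried in plain text tokens. What I cannot tell is whether leex can express this kind of mode switch, or whether I would have to emulate it with a second pass.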
Can anyone with experience in lexical analysis (particularly leex) or in writing template engines offer some insight into the best way forward?
Source: https://stackoverflow.com/questions/40538741