Converting a list of words into a list of the frequencies with which those words appear
I am doing extensive work with a variety of word lists.
Please consider the following question:
docText = {"settlement", "new", "beginnings", "wildwood", "settlement", "book", "excerpt", "agnes", "leffler", "perry", "my", "mother", "junetta", "hally", "leffler", "brought", "my", "brother", "frank", "and", "me", "to", "edmonton", "from", "monmouth", "illinois", "mrs", "matilda", "groff", "accompanied", "us", "her", "husband", "joseph", "groff", "my", "father", "george", "leffler", "and", "my", "uncle", "andrew", "henderson", "were", "already", "in", "edmonton", "they", "came", "in", "1910", "we", "arrived", "july", "1", "1911", "the", "sun", "was", "shining", "when", "we", "arrived", "however", "it", "had", "been", "raining", "for", "days", "and", "it", "was", "very", "muddy", "especially", "around", "the", "cn", "train"};
searchWords = {"the", "for", "my", "and", "me", "and", "we"};
Each of these lists is much longer (say, 250 words in the searchWords list, with docText being about 12,000 words). Right now, I can figure out the frequency of a given word by doing something like:
docFrequency = Sort[Tally[docText], #1[[2]] > #2[[2]] &];
Flatten[Cases[docFrequency, {"settlement", _}]][[2]]
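A per-word lookup like this can also be written by turning the tally into replacement rules once (a minimal sketch using the docText defined above; the variable name wordCounts is illustrative):

```
(* Turn the tally into lookup rules: {"settlement" -> 2, "new" -> 1, ...} *)
wordCounts = Rule @@@ Tally[docText];

(* Read off the count of any single word *)
"settlement" /. wordCounts
```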
But where I am getting hung up is on my quest to generate specific lists - specifically, converting a list of words into a list of the frequencies with which those words appear. I've tried to do this with Do loops but have hit a wall. I want to go through docText with searchWords and replace each element of docText with the sheer frequency of its appearance. I.e., since "settlement" appears twice, it would be replaced by 2 in the list, whereas since "my" appears 3 times, it would become 3. The list would then be something like 2, 1, 1, 1, 2, and so forth. I suspect the answer lies somewhere in If[] and Map[]? This all sounds weird, but I am trying to pre-process a bunch of information for term-frequency purposes…
Addition for Clarity (I hope):
Here is a better example.
searchWords={"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "A", "about", "above", "across", "after", "again", "against", "all", "almost", "alone", "along", "already", "also", "although", "always", "among", "an", "and", "another", "any", "anyone", "anything", "anywhere", "are", "around", "as", "at", "b", "B", "back", "be", "became", "because", "become", "becomes", "been", "before", "behind", "being", "between", "both", "but", "by", "c", "C", "can", "cannot", "could", "d", "D", "do", "done", "down", "during", "e", "E", "each", "either", "enough", "even", "ever", "every", "everyone", "everything", "everywhere", "f", "F", "few", "find", "first", "for", "four", "from", "full", "further", "g", "G", "get", "give", "go", "h", "H", "had", "has", "have", "he", "her", "here", "herself", "him", "himself", "his", "how", "however", "i", "I", "if", "in", "interest", "into", "is", "it", "its", "itself", "j", "J", "k", "K", "keep", "l", "L", "last", "least", "less", "m", "M", "made", "many", "may", "me", "might", "more", "most", "mostly", "much", "must", "my", "myself", "n", "N", "never", "next", "no", "nobody", "noone", "not", "nothing", "now", "nowhere", "o", "O", "of", "off", "often", "on", "once", "one", "only", "or", "other", "others", "our", "out", "over", "p", "P", "part", "per", "perhaps", "put", "q", "Q", "r", "R", "rather", "s", "S", "same", "see", "seem", "seemed", "seeming", "seems", "several", "she", "should", "show", "side", "since", "so", "some", "someone", "something", "somewhere", "still", "such", "t", "T", "take", "than", "that", "the", "their", "them", "then", "there", "therefore", "these", "they", "this", "those", "though", "three", "through", "thus", "to", "together", "too", "toward", "two", "u", "U", "under", "until", "up", "upon", "us", "v", "V", "very", "w", "W", "was", "we", "well", "were", "what", "when", "where", "whether", "which", "while", "who", "whole", "whose", "why", "will", "with", "within", "without", "would", "x", "X", "y", "Y", "yet", 
"you", "your", "yours", "z", "Z"}
These are the automatically generated stopwords from WordData[]. So I want to compare these words against docText. Since "settlement" is NOT part of searchWords, it would appear as 0. But since "my" is part of searchWords, it would pop up as its count (so I could tell how many times the given word appears). I really do thank you for your help - I'm looking forward to taking some formal courses soon, as I'm bumping up against the edge of my ability to really explain what I want to do!
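This stopword variant can be sketched the same way, with a membership test deciding between the tally count and 0 (illustrative variable names; MemberQ does a linear search, which is adequate for roughly 250 search words):

```
(* Count only words that appear in searchWords; all other words become 0 *)
counts = Rule @@@ Tally[docText];
If[MemberQ[searchWords, #], # /. counts, 0] & /@ docText
```

With the lists above, this yields 0 at each position holding "settlement" and the tally count at each position holding "my".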
Source: https://stackoverflow.com/questions/8973830