
Solr delta import handler timestamp not specific enough


I am new to Solr and have a quite basic question about delta-imports. My MySQL DB receives several new records per second. So when I start an import at second x, it is quite possible that more records will arrive in the DB during that same second, after the import has started. The next time I start a delta-import, it will check "last_index_time" in dataimport.properties and import all records changed after second x. So I will lose every record that changed in second x after the last import started. And if I am right, the same issue would remain even if the timestamp could be changed from seconds to, e.g., milliseconds. The time window would be smaller and fewer records would be lost, but the problem itself would still be there.
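
To make the scenario concrete, here is a toy shell sketch of the race described above. It does not touch Solr or MySQL: the record list, the before/after marker, and the strict "later than last_index_time" comparison at one-second granularity are all assumptions made for illustration.

```shell
#!/bin/sh
# Toy model only; every value below is made up.
# Each record is "id:modified_second:phase", where phase says whether the
# row was written before or after the first import started in that second.
import_second=100
records="a:99:before b:100:before c:100:after d:101:after"

indexed=""
missed=""
for rec in $records; do
    id=${rec%%:*}
    rest=${rec#*:}
    second=${rest%%:*}
    phase=${rest#*:}

    if [ "$second" -lt "$import_second" ] ||
       { [ "$second" -eq "$import_second" ] && [ "$phase" = "before" ]; }; then
        # Already in the DB when the first import ran.
        indexed="$indexed $id"
    elif [ "$second" -gt "$import_second" ]; then
        # Assumed delta rule: modified second is strictly later than last_index_time.
        indexed="$indexed $id"
    else
        # Same second as the import start, but written after it began.
        missed="$missed $id"
    fi
done

echo "indexed:$indexed"
echo "missed:$missed"
```

In this model record c is seen by neither run, which is the loss the question is asking about.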

I have not found any mention of this issue in the tutorials or anywhere else, for that matter. Am I the first one to deal with several records per second, or am I missing something?

Many thanks in advance!


Source: https://stackoverflow.com/questions/18570742
Updated: 2023-04-02 12:04

Best answer

UPDATE: The OP says that the problem was a hidden character in his input. This answer does not describe how to solve that problem. Nonetheless, the OP has marked this answer as accepted. See the comments of Charles Duffy for the actual solution to the OP's problem.

Caveat: I am taking everything in your problem description literally, which leads to the answer below. If you provide examples of the strings that will be passed through $Version it would help clarify the issue.


As I understand it you're reading in the full path of a file in your variable with read Version. Now if you say echo $Version you should get /path/to/foo.bar.

I don't think you'd want to cd into the file /path/to/foo.bar. You'll get an error: Not a directory, because it's a file, not a directory.

Now, consider what sed -e 's/\.//g' will do to the pathname.

echo "/path/to/foo.bar" | sed -e 's/\.//g'
/path/to/foobar

Does /path/to/foobar actually exist? No, because foo.bar was a file. You'll get an error: No such file or directory, because the foobar directory does not exist.

If I understand what you are trying to do, you are trying to extract the directory that contains the file specified by $Version. The command dirname /path/to/foo.bar will return /path/to. So you want to set New_version=$( dirname "$Version" ), at which point you should be able to cd $New_version.
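
As a minimal sketch of that suggestion: the path below is a placeholder standing in for whatever read Version would supply, and the existence check is an addition of mine, not part of the original script.

```shell
#!/bin/sh
# Hypothetical input; in the original script this would come from `read Version`.
Version="/path/to/foo.bar"

# dirname strips the last path component, leaving the containing directory.
New_version=$(dirname "$Version")
echo "$New_version"

# Only change directory if it exists; the quotes keep paths with spaces intact.
if [ -d "$New_version" ]; then
    cd "$New_version" || exit 1
else
    echo "no such directory: $New_version" >&2
fi
```

Quoting "$New_version" in the cd call matters as soon as a directory name contains whitespace.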

P.S. Make sure $Version is reading in an absolute path name, not a relative name, so that it's independent of where you run the script from.
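
One way to check this, sketched under the assumption that a POSIX shell is in use: is_absolute is a hypothetical helper name, and the sample paths are illustrative only.

```shell
#!/bin/sh
# is_absolute is a hypothetical helper, not part of the original script.
# It succeeds when its argument starts with a slash.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Sample inputs, made up for illustration.
for candidate in "/path/to/foo.bar" "foo.bar" "./to/foo.bar"; do
    if is_absolute "$candidate"; then
        echo "$candidate: absolute"
    else
        echo "$candidate: relative"
    fi
done
```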
