Hadoop for MySQL use cases


I have a database with ~4 million records of prices for US stocks, mutual funds, and ETFs covering 5 years, and every day I add a daily price for each security.
For one feature I am working on, I need to fetch the latest price for each security (groupwise max) and do some calculations with other financial metrics. There are ~40K securities.
But the groupwise maximum over this amount of data is heavy and takes minutes to execute.
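
To make the groupwise-max step concrete, here is a minimal sketch of the query shape I mean. The table name `prices` and the columns `symbol`, `price_date`, and `close_price` are hypothetical (my real schema differs), and the sketch uses Python's built-in `sqlite3` only so that it runs standalone. The SQL is the usual "join against a per-group MAX" pattern and should carry over to MySQL, presumably with a composite index on `(symbol, price_date)`, but treat that as an assumption rather than a tested MySQL setup.

```python
import sqlite3


def latest_prices(conn):
    """Return {symbol: (price_date, close_price)} for the newest row per symbol."""
    query = """
        SELECT p.symbol, p.price_date, p.close_price
        FROM prices AS p
        JOIN (
            -- one row per security: its most recent date
            SELECT symbol, MAX(price_date) AS max_date
            FROM prices
            GROUP BY symbol
        ) AS latest
          ON latest.symbol = p.symbol
         AND latest.max_date = p.price_date
    """
    return {symbol: (day, price) for symbol, day, price in conn.execute(query)}


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE prices (
        symbol      TEXT NOT NULL,
        price_date  TEXT NOT NULL,  -- ISO dates, so MAX() orders them correctly
        close_price REAL NOT NULL,
        PRIMARY KEY (symbol, price_date)
    )
    """
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("AAA", "2017-10-23", 10.0),
        ("AAA", "2017-10-24", 10.5),
        ("BBB", "2017-10-20", 19.5),
        ("BBB", "2017-10-23", 20.0),
    ],
)

result = latest_prices(conn)
```

With the sample rows, `result` holds one entry per symbol, each pointing at that symbol's most recent date and price.
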
Of course my tables use indexes, but the task involves fetching and processing nearly 7GB of data in real time.
So I am wondering: is this a task for Big Data tools and algorithms, or is it a small amount of data? In the examples I have seen, they work on thousands or millions of GBs of data.
My database is MySQL and I want to use Hadoop to process the data. Is that good practice, or should I rely only on MySQL optimizations (is my data small)? If Hadoop is the wrong choice for this amount of data, what would you advise for this case?
NOTE that my data is growing every day and the project involves many analyses that need to be done in real time, based on user requests.
NOTE: I don't know whether this question is OK to ask on Stack Overflow, so I apologize if it is off-topic.
Thanks in advance!

Source: https://stackoverflow.com/questions/46915388

Updated: 2022-12-14 18:12
