Non-recursive Kosaraju's two-pass algorithm implementation taking forever to execute on a large data set
I wrote this for an assignment whose deadline has already passed.
The implementation works fine on various smaller test cases and correctly displays the sizes of the 5 largest strongly connected components in the graph.
But it seems to run forever on the assignment data set of about 875714 vertices. (It doesn't even finish the first DFS pass after 60 minutes.)
I used a non-recursive, stack-based implementation of the DFS routine because I heard that the large number of vertices causes recursion stack overflow problems.
It would be really helpful if anyone could point out what in this code makes it behave this way on the large data set.
The input file consists of a list of the edges in the graph, one edge per line.
For example:
1 2
2 3
3 1
3 4
5 4
Download link for the Large graph test case zip file
Code follows:
//Macro definitions and Global variables
#define N 875714
#define all(a) (a).begin(), (a).end()
#define tr(c,i) for(typeof((c).begin()) i = (c).begin(); i != (c).end(); i++)

vi v(N), ft, size;
//Non recursive DFS algorithm
void DFS(vvi g, int s, int flag)
{
    stack<int> stk;
    stk.push(s);
    v[s] = 1;

    int jumpOut, count;
    vi::iterator i;

    if(flag == 2)
        count = 1;

    while(!stk.empty())
    {
        i = g[stk.top()].begin();
        jumpOut = 0;

        for(; i != g[stk.top()].end(); i++)
        {
            if(v[*i] != 1)
            {
                stk.push(*i);
                v[*i] = 1;

                if(flag == 2)          //Count the SCC size
                    count++;

                jumpOut = 1;           //Jump to the while loop's beginning
                break;
            }
        }

        if(flag == 1 && jumpOut == 0)  //Record the finishing time order of vertices
            ft.push_back(stk.top());

        if(jumpOut == 0)
            stk.pop();
    }

    if(flag == 2)
        size.push_back(count);         //Store the SCC size
}
// The 2 pass Kosaraju algorithm
void kosaraju(vvi g, vvi gr)
{
    cout<<"\nInside pass 1\n";

    for(int i = N - 1; i >= 0; i--)
        if(v[i] != 1)
            DFS(gr, i, 1);

    cout<<"\nPass 1 completed\n";

    fill(all(v), 0);

    cout<<"\nInside pass 2\n";

    for(int i = N - 1; i >= 0; i--)
        if(v[ ft[i] ] != 1)
            DFS(g, ft[i], 2);

    cout<<"\nPass 2 completed\n";
}
int main()
{
    vvi g(N), gr(N);
    ifstream file("/home/tauseef/Desktop/DAA/SCC.txt");
    int first, second;
    string line;

    while(getline(file,line,'\n'))    //Reading from file
    {
        stringstream ss(line);
        ss >> first;
        ss >> second;

        if(first == second)           //Eliminating self loops
            continue;

        g[first-1].push_back(second-1);   //Creating G & Grev
        gr[second-1].push_back(first-1);
    }

    cout<<"\nfile read successfully\n";

    kosaraju(g, gr);

    cout<<"\nFinishing order is: ";
    tr(ft, j)
        cout<<*j+1<<" ";
    cout<<"\n";

    sort(size.rbegin(), size.rend());     //Sorting the SCC sizes in descending order

    cout<<"\nThe largest 5 SCCs are: ";
    tr(size, j)
        cout<<*j<<" ";
    cout<<"\n";

    file.close();
}
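A likely cause of the slowdown is visible in the signatures: DFS(vvi g, ...) and kosaraju(vvi g, vvi gr) take the graph by value, so every call to DFS copies the whole adjacency structure, and there can be up to N such calls. A second cost is that the inner loop restarts from g[stk.top()].begin() each time a vertex returns to the top of the stack, so a high-degree vertex is rescanned repeatedly. The following is a minimal sketch of one way to avoid both, not the original poster's code: the names Graph, dfs and sccSizes are hypothetical, the graph is passed by const reference, and each stack frame remembers the index of the next neighbour to try.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical alias for the adjacency-list type (vvi in the question).
using Graph = std::vector<std::vector<int>>;

// Iterative DFS from s. The graph is taken by const reference, so no copy
// is made per call. Each frame is (vertex, index of next neighbour to try),
// so every adjacency list is scanned only once over the whole traversal.
// If order is non-null, vertices are appended to it as they finish.
// Returns the number of vertices newly visited from s.
int dfs(const Graph& g, int s, std::vector<char>& visited, std::vector<int>* order) {
    std::vector<std::pair<int, std::size_t>> stk;
    stk.emplace_back(s, 0);
    visited[s] = 1;
    int count = 1;

    while (!stk.empty()) {
        const int u = stk.back().first;
        const std::size_t k = stk.back().second;
        if (k < g[u].size()) {
            stk.back().second = k + 1;       // resume after this neighbour next time
            const int w = g[u][k];
            if (!visited[w]) {
                visited[w] = 1;
                ++count;
                stk.emplace_back(w, 0);
            }
        } else {
            if (order) order->push_back(u);  // u has finished
            stk.pop_back();
        }
    }
    return count;
}

// Two-pass Kosaraju in the same arrangement as the question: pass 1 runs on
// the reversed graph gr to get a finishing order, pass 2 runs on g in
// decreasing finishing order. Returns SCC sizes sorted in descending order.
std::vector<int> sccSizes(const Graph& g, const Graph& gr) {
    assert(g.size() == gr.size());
    const int n = static_cast<int>(g.size());

    std::vector<char> visited(n, 0);
    std::vector<int> order;
    order.reserve(n);
    for (int v = 0; v < n; ++v)
        if (!visited[v]) dfs(gr, v, visited, &order);

    std::fill(visited.begin(), visited.end(), 0);
    std::vector<int> sizes;
    for (int i = n - 1; i >= 0; --i)
        if (!visited[order[i]])
            sizes.push_back(dfs(g, order[i], visited, nullptr));

    std::sort(sizes.rbegin(), sizes.rend());
    return sizes;
}
```

For the five-edge sample input above (converted to 0-based vertices, as the question's main does), sccSizes returns 3, 1, 1. In the original code, changing the parameters to const vvi& would remove the per-call copies on its own, although vi::iterator would then need to become vi::const_iterator.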
Source: https://stackoverflow.com/questions/34257157
Best answer
EDIT1: You cannot add elements to a HashMap field outside of a method. Something like this won't work:
public class Class {
    HashMap<String, String> hashMap = new HashMap<String, String>();
    hashMap.put("one", "two");
}
If you want to achieve that, put the call in a constructor, like so:
public class Class {
    HashMap<String, String> hashMap = new HashMap<String, String>();

    public Class() {
        hashMap.put("one", "two");
    }
}
Another way to do it is in a static block, which requires the field itself to be declared static.