代做CS 314作业、代写Programming Languages作业、代写C/C++程序作业、代做C/C++课程设计作业
CS 314 Principles of Programming LanguagesProject 3: Efficient Parallel Graph MatchingTHIS IS NOT A GROUP PROJECT! You may talk about the project and possiblesolutions in general terms, but must not share code. In this project, you will be askedto implement two parallel graph matching algorithms. Your program should take a legalmatrix-market matrix file as input and output the matching results.This document is not a complete specification of the project. You will encounter importantdesign and implementation issues that need to be addressed in your project solution.Identifying these issues is part of the project. As a result, you need to start early, allowingtime for possible revisions of your solution.1 Background1.1 The Graph Matching ProblemGiven a graph G = (V, E), while V is the set of vertices (also called nodes) and E ? |V |2.A matching M in G is a set of pairwise non-adjacent edges which means that no two edgesshare a common vertex. A vertex is matched if it is an endpoint of one of the edges inthe matching. Otherwise the vertex is unmatched [3]. In Figure 1, we show three differentmatchings for the same graph.A maximum matching can be defined as a matching where the total weight of theedges in the matching is maximized. In Figure 1, (c) is a maximum matching because thetotal weight of the edges in the matching is 7, and there could be no other matching thathas total weight greater than 7 for this graph.(a) (b) (c)32 42 1 32 42 1 32 42 1Figure 1: Graph Matching Examples11.2 Parallel Graph Matching AlgorithmsThe graph matching problem is not easy to parallelize. Most existing matching algorithmssuch as blossom algorithm are embarrassingly sequential. Here we describe two handshaking-basedalgorithms [2] that are amenable to parallelization.1.2.1 One-Way Handshaking MatchingGiven a graph G, two vertices shake hands only when there is an edge between these twoand they are the strongest neighbor of each other. We will define "strongest neighbor". Theedge that connects the two handshaking vertices is added to the matching M. We show anexample of one-way handshaking matching in Figure 2.To identify handshaking vertices, each vertex v in G extends a hand to its strongestneighbor. Here the strongest neighbor of a vertex v is the neighbor vertex that sits on themaximum-weight edge incident to v. This is a greedy algorithm that aims to maximize thetotal weight in the matching. If there are multiple incident edges of v that have maximumweight, the neighbor vertex that has the smallest vertex index will be chosen as thestrongest neighbor. For example, in Figure 2(b), the strongest neighbor of vertex E is vertexB because it has an edge weight of 4 and a smaller index (alphabetical order) than vertex F.For each vertex, we check if its strongest neighbor extends a hand back. If so, thesetwo vertices are matched. Then the corresponding edge will be added to the matching. Forexample, in Figure 2(c), vertex B extends a hand to vertex E and vertex E also extends ahand to vertex B, so edge BE will be added to the matching.At every pass of the process, we check all the vertices that are not matched from previouspasses (each vertex is checked once in each pass), and identify if there is any new edge thatcan be added to the matching. We repeat this until no more edges can be added to thematching. We show the passes in Figure 2(b), (c), (d),(e) and (f). The handshaking methodis highly data parallel, since each vertex can be processed independently and there is no datarace.1.2.2 N-Way Handshaking MatchingIn one-way handshaking matching, it is possible that one vertex will have extended handsfrom multiple neighbors, but at most one of these neighbors can be matched. This may affectthe matching efficiency. For example, in Figure 2 (b), both vertex D and vertex H extendhands to vertex E. However, since vertex E’s strongest neighbor is vertex B, so neither vertexD nor H will be matched at this pass of handshaking in Figure 2 (b).Instead of extending one hand, N-way handshaking matching allows each vertex toextend N hands (N > 1). We show an example of 2-way handshaking matching in Figure 3.In Figure 3(b), each vertex extends (up to) 2 hands at once, which extends to its strongestand second strongest neighbors. For example, vertex H extends one hand to vertex E whichis its strongest neighbor and another hand to vertex I which is the second strongest neighbor.In the next step, we take the edges whose two end points do not shake hands out ofconsideration (as if we “discard" edges). In Figure 3(b), there is no handshaking between2D E FA B CG H I1 2341 22 4 41 3 1(a) Original GraphD E FA B CG H I1 2341 22 4 41 3 1(b) Extend a Hand tothe Strongest NeighborD E FA B CG H I1 2341 22 4 41 3 1(c) Check Handshakingand Matching VerticesD E FA B CG H I1 2341 22 4 41 3 1(d) Repeat (b), (c) withUnmatched VerticesD E FA B CG H I1 2341 22 4 41 3 1(e) Repeat (b), (c) withUnmatched VerticesD E FA B CG H I1 2341 22 4 41 3 1(f) One-Way HandshakingMatching ResultFigure 2: One-Way Handshaking Matching Examplevertex H and vertex E, so the edge HE is “as if" discarded (before next step in the samepass). After this, we will obtain an updated graph with a max degree N (N=2 in Figure3(c)), we will refer to this graph as N-way graph in the remaining of the project description.We now do one-way handshaking matching on the updated N-way graph. The matchingon the N-way graph may yield more matched vertices. For example, in Figure 3(e), bothvertex H and vertex D can be matched in the first pass, while compared with one-way matchingin Figure 2, both vertex H and vertex D have to be matched in the second pass.1.3 POSIX threads - pthreadsA thread is defined as an independent stream of instructions that can be scheduled to run inits own context. Multiple threads run in multiple active contexts.Historically, hardware vendors have implemented their own proprietary versions of threads.For UNIX systems, a standardized C language threads programming interface has been specifiedby the IEEE POSIX 1003.1c standard. Implementations that adhere to this standardare referred to as POSIX threads, or pthreads[1].For more details on how to use pthreads API to write a parallel program and how tocompile a pthreads program, please refer to the pthread tutorial [1] and the recitations.3D E FA B CG H I1 2341 22 4 41 3 1(a) Original GraphD E FA B CG H I1 2341 22 4 41 3 1(b) Extend N=2 Handsat One TimeD E FA B CG H I2422 4 4(c) Discard Edgeswithout a HandshakingD E FA B CG H I2422 4 4(d) Do One-Way Handshakingon the new graph(e) Check Handshakingand Matching VerticesD E FA B CG H I2422 4 4D E FA B CG H I1 2341 22 4 41 3 1(f) N-Way HandshakingMatching ResultFigure 3: N-Way Handshaking Matching Example (N=2)2 ImplementationIn this project, you will be asked to do the following:1) implement the parallel one-way handshaking matching algorithm2) Extra Credit: implement the parallel N-way handshaking matching algorithm2.1 Data Structure vertex v0 v1 v2 v3 v4 v5 v6 v7 v8offset 0 2 5 7 10 14 17 19 22index 4 1 4 2 0 5 1 …… 6 7 5weight 2 1 4 2 1 4 2 …… 1 2 1degree 2 3 2 3 4 3 2 3 2Figure 4: Adjacency Array Data Structure4Un-directed weighted graph In this project, we use adjacency lists to represent an undirectedweighted graph. There are four arrays: index[], offset[], degree[], and weight[]. Thearray index[] keeps the neighbor lists of all vertices, for instance, node v0’s neighbor list isfollowed by neighbor v1’s neighbor list, and so on. The array offset[] stores where a node’sneighbor list starts in the index[] array. The corresponding weight[] array stores the weightof each edge. The array degree[] stores the degree of each vertex, which is the number ofneighbors for each vertex.An example of adjacency array representation is shown in Figure 4. For vertex v2, itsdegree is 2 (degree[2] = 2). The offset[2] value is 5 which means its neighbor list startsat index[5], thus index[5] = 5 (corresponding to vertex v5), index[6] = 1 (corresponding tovertex v1), and vertex v5 and vertex v1 are two neighbors of vertex v2.Pleas be aware that within a neighbor list (of a specific vertex), the neighborsare sorted in descending order such that the strongest neighbor is always placedas the first item and the second strongest neighbor is placed as the second itemand so on. We already sorted the neighbors of each vertex for you. You do nothave to sort them to find the strongest neighbor, however, you do need to filterout the nodes that are already matched from previous passes.We have provided graph I/O functions and code for reading/parsing the graphs, thepointers to the four arrays: index[], offset[], weight[], and degree[] are stored in GraphDatastruct in the provided code package. Here is what GraphData looks like:struct GraphData { int nNodes;int nEdges;int *offset;int *degree;int *index;double *weight;}In the main function of the provided code package, it calls readmm(inputF ile, &graph) toread and parse an input graph file into the data object graph. Please check DataStructure.hand utils.c for more details.N-way graph In N-way handshaking matching algorithm, you need to generate theN-way graph. We use two arrays to represent the N-way graph: nWayGraphDegree[] andnWayGraph[]. nWayGraphDegree[] is an array that stores the degree of each vertex in theN-way graph. nWayGraph[] represents the adjacency list of the nodes in the N-way graph.Since every node in the N-way graph has at most N neighbors, we allocate N*node_numberelements for the nWayGraph[] array, such that the neighbors of the vertex vi are storedstarting from nW ayGraph[i * N] to at most nW ayGraph[i * N + nW ayGraphDegree[i]-1]in descending order such that the strongest neighbor are placed first. In Figure 5, we showan example of the 2-way graph.5vertex v0 v1 v2 v3 v4 v5 v6 v7 v8nWayGraph 2 1 4 2 5 1 4 2 1 …… 8nWayGraphDegree 2 2 2 2 2 2 2 2 2Figure 5: 2-Way Graph Data StructureMatching results The matching algorithm runs for as many passes as it needs, until noedge can be added into the matching. It is possible that some vertices cannot be matched toany neighbor.In this project, we use the array res[] to store the matching results. We provide severaldefined constants to represent the status of a vertex in match.h. The array res[] is initializedto UNMATCHED (-1) for each vertex. When the graph matching program terminates,for vertex i, res[i] is either its matched vertex index or NO_MATCH_NODE (-2) whichrepresents vertex i is not matched. In general:res[i] =j, if vertex i and vertex j are matched1, initial value2, vertex i is not matchedIn the main function, it calls write_match_result(outputF ile, res, nNodes) to writethe matching results stored in res into output file outputF ile. nNodes represents the totalnumber of vertices.2.2 Work Balancing by Verticesi = current thread indexnodeToProcess = the list of nodes to processworkchunk = (numNodes + threadNum - 1) / threadNumbeg = i * workchunkend = min(beg + workchunk, numNodes)for each v in { nodeToProcess[beg], nodeToProcess[beg+1], …, nodeToProcess[end-1] }do… your code for matchingend forFigure 6: Balancing Based on Vertex NumberTo implement parallel handshaking-based matching algorithms, we need to map the tasksinto different threads in a load balanced way. In this project, each thread will be in charge ofa subset of vertices. A global barrier synchronization is performed after each pass of matching(the barrier code is already provided for you). The algorithm distributes the vertices evenlyto the co-running threads. Assuming there are nNodes vertices to be processed, the totalnumber of threads is threadNum, each thread will process around nNodes/threadNum6nodes. Since nNodes is not necessarily a multiply of threadNum, the last thread mightbe assigned ≤ nNodes/threadNum vertices. The code snippet in Figure 6 shows how eachthread should find the set of the vertices it is in charge of. Please use this vertex balancingmethod for all your parallel function implementation.2.3 One-Way HandshakingWe have described the one-way handshaking matching algorithm in Section 1.2.1. Inone-way handshaking, each vertex extends a hand to its strongest neighbor. Next it checksif there is a handshaking between itself and its strongest neighbor.You only need to complete the following functions in oneway.c:1. extend_one_hand(int threadId, int threadNum, GraphData graph, int nodeNum,int *nodeToProcess, int *res, int *strongNeighbor)Function Description:Each thread needs to be assigned a subset of vertices from nodeToProcess[] array. SeeSection 2.2 for load balancing method. For each vertex with ID v in this subset, findits strongest neighbor and store the vertex index in the array strongNeighbor[v]. Thepseudo code is shown below.Vi = the set of vertices assigned to thread iN(k) = the set of (sorted) neighbors for vertex kfor each v in Vifor each u in N(v) doif (res[u] == UNMATCHED) thenstrongNeighbor[v] = ubreak out loop uend ifend forend forInput Parameters:threadId: The thread indexthreadNum: Total thread numbergraph: The graph objectnodeNum: The number of verticesnodeToProcess Each element is a vertex ID, and this array is usually used to pass theunmatched vertices to the processing function. The size of the array is nodeNum.res: The array that stores the matching status for each vertex. The size of the array isthe total number of vertices in the graph.Output Parameters:strongNeighbor: The array that stores the index of the strongest neighbor for eachnode. The size of the array is the total number of vertices in the graph.e.g. strongNeighbor[] = {3,4,5,4,1,2,3,4,7} for the example in Figure 272. check_handshaking(int threadId, int threadNum, int nodeNum, int *nodeToProcess,int *strongNeighbor, int *res)Function Description:Each thread needs to be assigned a subset of vertices from nodeToProcess[] array.See Section 2.2 for load balancing method. For each vertex v in this subset, givenits strongest neighbor strongNeighbor[v], update res[v] correspondingly. The pseudocode is shown below.Vi = the set of vertices assigned to thread ifor each v in Vi dos = strongNeighbor[v]if (v == strongNeighbor[s]) thenres[v] = send ifend forInput Parameters:threadId: Thread indexthreadNum: Total thread numbernodeNum: The number of verticesnodeToProcess Each element is a vertex ID, and this array is usually used to pass theunmatched vertices to the processing function. The size of the array is nodeNum.strongNeighbor: The array that stores the index of the strongest neighbor for eachvertex. The size of the array is the total number of vertices in the graph.Output Parameters:res The array that stores the matching status for each vertex. The size of the array isthe total number of vertices in the graph.E.g. res[] = {-1,4,5,-1,1,2,-1,-1,-1} for the example in Figure 22.4 Extra Credit (20%): N-Way Handshaking MatchingFor N-way handshaking matching, each vertex extends N hands, then the algorithmneglects (as if “discards") those edges whose two end nodes have no handshaking, resultingin the N-way graph. Upon the N-way graph, one-way matching is performed.You only need to implement the following functions in nways.c:1. generate_n_way_graph(int threadId, int threadNum, GraphData graph,int nWays, int nodeNum, int *nodeToProcess, int *res, int *nWayGraphDegree,int *nWayGraph)8Function Description:Each thread needs to be assigned a subset of vertices from nodeToProcess[] array. SeeSection 2.2 for load balancing method. For each vertex v, find its N strongest neighbors.Then the algorithm obtains the N-way graph information which is represented bynW ayGraphDegree[] and nW ayGraph[]. The pseudo code is shown below. Note thatthe neighbors of each vertex in the N-way graph also need to be sorted such that thestrongest neighbor is placed first.Vi = the set of vertices assigned to thread iN(k) = the set of (sorted) neighbors for vertex kfor each v in Vi degree = 0for each u in N(v) in sorted order doif (res[u] == UNMATCHED) thennWayGraph[N * v + degree] = udegree++if (degree == nWays) thenbreak for loop uend ifend ifend forend forInput:threadId: Thread indexthreadNum: Total thread numbergraph: Graph objectnWays: the number of hands extendednodeNum: The number of verticesnodeToProcess: Each element is a vertex ID, and this array is usually used to pass theunmatched vertices to the processing function. The size of the array is nodeNum.res: The array that stores the matching status for each vertex. The size of the array isthe total number of vertices in the graph.Output:nWayGraphDegree: The array that stores the vertex degree of the N-way graph. Thesize of the array is the total number of vertices in the graph,nWayGraph: The array that stores the N strongest neighbors of each vertex. The sizeof the array is the total number of vertices multiplied by N.e.g. nW ayGraphDegree[] = {2,2,2,2,2,2,2,2,2}nW ayGrap[] = {3,1,4,2,5,1,4,0,1,5,2,4,3,7,4,8,7,5} for the example in Figure 32. prune_n_way_graph(int threadId, int threadNum, int nWays, int nodeNum,int *nodeToProcess, int *nWayGraphDegree, int *nWayGraph, int9*strongNeighbor)Function Description:Each thread needs to be assigned a subset of vertices from nodeToProcess[] array. SeeSection 2.2 for load balancing method. For each vertex, it finds the strongest neighborin the N-way graph. The pseudo code is shown below. Note that the neighbors of eachvertex are sorted such that the strongest neighbor is placed first.Vi = the set of vertices assigned to thread iS(k) = the set of N strongest neighbors forvertex kfor each v in Vifor each u in S(v) in sorted order dofor each x in S(u) in sorted order doIf (x == v) thenstrongNeighbor[x] = ubreak for loop uend ifend forend forend forInput Parameters:threadId: Thread indexthreadNum: Total thread numbernWays: The number of hands extendednodeNum: The number of verticesnodeToProcess: Each element is a vertex ID, and this array is usually used to pass theunmatched vertices to the processing function. The size of the array is nodeNum.nWayGraphDegree: The array that stores the vertex degree of the N-way graph. Thesize of the array is the total number of vertices in the graph,nWayGraph: The array that stores the N strongest neighbors of each vertex. The sizeof the array is the total number of vertices multiplied by N.Output Parameters:strongNeighbor: The array that stores the the strongest neighbor for each vertex. Thesize of the array is the total number of vertices in the graph.e.g. strongNeighbor[] = {3,4,5,0,1,2,-2,8,7} for the example in Figure 32.5 Filtering Matched VerticesIn handshaking algorithms, each pass except the last one will produce a set of matchedvertices. The vertices that are matched from previous passes need not to be considered in10the later passes. You will need to filter out the nodes that are already matched before startinga new pass. Then you can distribute the unmatched vertices evenly to different threads again.At the first pass, all the vertices need to be processed and they are evenly distributedto co-running threads. Starting from the second pass, you will need to count the numberof unmatched nodes. An example is shown in Figure 7. Each thread counts the number ofunmatched vertices for the set of vertices it is in charge of, and stores the counts in the arraynodeCount[]. In Figure 7, for instance, nodeCount[0] = 1 since only vertex 0 is not matchedamong the set of vertices that are processed by thread 0.Next, an exclusive prefix sum is performed on nodeCount array. An exclusive prefix sumof an array x[] is another array y[] that:y[i] =0, if i = 0Pi1j=0 x[j], otherwiseThe purpose of using exclusive prefix sum is to find out where to store the unmatchedvertices in the array newNodeT oP rocess[]. For example, in Figure 7, nodeCount = {1, 1,3}, after exclusive prefix sum, it is y ={0, 1, 2} and threads i know where to place theunmatched vertices it is in charge of: starting from newNodeToProcess[y[i]]. Every threadthen stores the unmatched vertices it finds to the newNodeT oP rocess array that will beused for next pass.nodeToProcess 0 1 2 3 4 5 6 7 8Thread 0 Thread 2nodeCount 1 1Thread 13exclusive prefix sum 0 1 2newNodeToProcess 0 3 6 7 80 unmatched vertex 1 matched vertexFigure 7: Filter Matched Nodes from Vertex ArrayYou only need to complete the following functions in filter.c:1. count_unmatched_vertices(int threadId, int threadNum, int nodeNum, int*nodeToProcess, int *res, int *nodeCount)11Function Description:Each thread counts the unmatched vertices in the set of vertices it is in charge of. Thecount by thread i is stored in nodeCount[i]. The pseudo code is shown below.Vi = the set of vertices assigned to thread ifor each v in Vi doif (res[v] == UNMATCHED) thennodeCount[i]++end ifend forInput:threadId: Thread indexthreadNum: Total thread numbernodeNum: The number of verticesnodeToProcess: Each element is a vertex ID, and this array is usually used to pass theunmatched vertices to the processing function. The size of the array is nodeNum.res: The array that stores the matching status. The size of the array is the number ofvertices.Output:nodeCount: Each element of the array stores the number of unmatched vertices collectedby each thread. The size of the array is the total number of threads.e.g. nodeCount[] = {1,1,3} for the example in Figure 2.2. update_remain_nodes_index(int threadId, int threadNum, int *nodeToProcess,int *startLocations, int *res, int nodeNum, int newNodeToProcess)Function Description:Each thread stores the unmatched vertices it finds into the newNodeT oP rocess arraystarting at the location provided in startLocations. The pseudo code is shown below.Vi = the set of vertices assigned to thread ifor each v in Vi doif (res[v] == UNMATCHED) thenoffset = startLocations[i]++newNodeToProcess[offset] = vend ifend forInput:threadId: Thread index12threadNum: Total thread numbernodeToProcess: Each element is a vertex ID, and this array is usually used to pass theunmatched vertices to the processing function. The size of the array is nodeNum.startLocations: The array that stores the start location of each thread in newNodeT oP rocess.The size of the array is the number of threads plus one.res: The array that stores the matching status. The size of the array is the number ofvertices.nodeNum: The number of vertices to processOutput:newNodeToProcess: Each element is a vertex ID. The size of the array is the numberof vertices.newNodeT oP rocess[] = {0,3,6,7,8} for the example in Figure 3.3 GradingWe provide six real world matrices for you to test your code. In the code package, we haveprovided the function for transforming the input matrix to GraphData graph in main.c. Sixmatrices and four hidden matrices will be used to grade your program. Your programs willbe graded based on functionality. You will receive 0 credit if we cannot compile/run yourcode on ilab machines. All grading will be done automatically on ilab machines.4 How to Get StartedThe code package for you to start with is provided on Sakai. Create your own directoryon the ilab cluster, and copy the entire provided project folder to your home directory orany other one of your directories. Make sure that the read, write, and execute permissionsfor groups and others are disabled (chmod go-rwx <directory_name>). Please read theREADME file before you start.4.1 Compilation and ExecutionCompilation: We have provided a Makefile in the sample code package.make match should compile everything into a single executable called match.make clean should remove all generated executable and intermediate files.Execution: To see the usage of the program, please compile the program and simply run./match and the usage information will be shown. Please DO NOT change the usage ofthe program.13Program Output: The program will output the matching results to the file you specify inthe arguments and print the time and the number of iterations to get the maximal matchingon the screen. We have handled this part for you. Please DO NOT change any other filethat is not oneway.c, nways.c, and filter.c.4.2 Input Matrix ListThe following six matrix files are provided for you to test your graph matching implementations.The matrix will be transformed to a weighted un-directed graph automatically by theprogram. A micro test matrix "test.mtx" is provided in the code package. The test matrix isthe example we used in the project description.af_shell10 https://sparse.tamu.edu/MM/Schenk_AFE/af_shell10.tar.gzcant https://sparse.tamu.edu/MM/Williams/cant.tar.gzconsph https://sparse.tamu.edu/MM/Williams/consph.tar.gzdawson5 https://sparse.tamu.edu/MM/GHS_indef/dawson5.tar.gzhood https://sparse.tamu.edu/MM/GHS_psdef/hood.tar.gzldoor https://sparse.tamu.edu/MM/GHS_psdef/ldoor.tar.gzIf the disk space on ilab machine is not enough, please download one matrix at one timeor use /freespace to store the matrices.5 What to SubmitYou need to submit two files: oneway.c and filter.c. If you finish the extra credit part forN-way handshaking, please also submit nways.c. Please DO NOT include any outputmatching results file and input graph/matrix files in your submission.6 Questionshttp://www.6daixie.com/contents/13/2365.htmlAll questions regarding this project should be posted on Sakai forum. Enjoy the project!References[1] Blaise Barney. Posix threads programming. https://computing.llnl.gov/tutorials/pthreads/, 2017.[2] Jonathan Cohen and Patrice Castonguay. Efficient graph matching and coloring on thegpu.[3] Wikipedia contributors. Matching (graph theory) — Wikipedia, the free encyclopedia.https://en.wikipedia.org/w/index.php?title=Matching_(graph_theory)&oldid=869039028, 2018. [Online; accessed 20-November-2018].因为专业,所以值得信赖。如有需要,请加QQ:99515681 或邮箱:
微信:codinghelp