R - Generating a large adjacency matrix
I'm trying to generate an adjacency matrix from a CSV.
The CSV contains 2 columns, one for users and one for projects. The 2 columns form a bipartite graph: each user can be part of multiple projects or none at all, and there are no edges between nodes of the same set (there are no repeated entries of the same user-project pair, but there are repeated entries of the same user or project in different pairs).
I wrote a comparison that checks each user's project against the entire project set using MATLAB and ismember(a,b). The algorithm runs iteratively through each entry. In the end, I have an adjacency matrix M of size (|users| + |projects|) x (|users| + |projects|).
For a small entry count (< 15,000) it works fast, but with a sample of more than 15,000 entries MATLAB stalls. I initialize the adjacency matrix as a zeros matrix (zeros(r,c)) and add, row by row, the results of ismember(a,b). In MATLAB, a zeros matrix of zeros(15000,15000) maxes out the memory. I tried making a 0 matrix of that size in R (matrix(0, 15000, 15000)) and it maxes out R's memory too.
Is there a way around this? The full sample size is 597,000 rows (with ~70,000 users and ~35,000 projects) and I want to run a network analysis on it.
I also want to keep it in matrix format and not as an adjacency list, because I have a max cut min flow algorithm that I want to run on the results, and it works on matrices.
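To put rough numbers on the memory problem, here is a back-of-the-envelope sketch in Python (not the MATLAB or R code described above). It assumes each entry is stored as an 8-byte double, which is the default numeric type in both MATLAB and R, and it uses the approximate user and project counts given above, so the results are estimates only. The name dense_bytes is just an illustrative helper.

```python
# Rough size of a dense n x n matrix of 8-byte doubles.
# Assumption: no per-object overhead is counted, so real usage is a bit higher.
BYTES_PER_DOUBLE = 8


def dense_bytes(n):
    """Bytes needed to hold an n x n dense matrix of doubles."""
    return n * n * BYTES_PER_DOUBLE


small = dense_bytes(15_000)                # the size that already fails
users_only = dense_bytes(70_000)           # user-to-user matrix, ~70,000 users
bipartite = dense_bytes(70_000 + 35_000)   # users + projects on both axes

for label, size in [("15,000 nodes", small),
                    ("users only", users_only),
                    ("users + projects", bipartite)]:
    print(f"{label}: {size / 1e9:.1f} GB")
```

So a single 15000 x 15000 dense matrix is already about 1.8 GB, and the full data would need tens of gigabytes as a dense matrix, even though only a small fraction of the entries can be non-zero with 597,000 rows.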
Updated:
The data looks like this:
user | project
382 | 2429
385 | 2838
294 | 2502
... | ...
It was taken from SourceForge using zerlot at the University of Notre Dame. Each int value is a key in a SQL database. I want to convert the affiliation data into a one-mode user-to-user adjacency matrix, where each edge between two users represents a shared project.
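For reference, the projection I am after can be sketched on a toy example. This is a minimal illustration in plain Python, not a solution to the memory problem. The rows are hypothetical (made up so that some users share a project), user_projection is an illustrative name, and only the non-zero entries are stored, as sets of neighbours, instead of a full matrix.

```python
from collections import defaultdict
from itertools import combinations


def user_projection(pairs):
    """Map each user to the set of other users sharing at least one project.

    pairs: iterable of (user_id, project_id) rows, as in the CSV.
    Only users with at least one neighbour appear in the result.
    """
    members = defaultdict(set)              # project -> users on that project
    for user, project in pairs:
        members[project].add(user)

    neighbours = defaultdict(set)           # user -> users they share a project with
    for users in members.values():
        # Every pair of users on the same project gets an undirected edge.
        for u, v in combinations(sorted(users), 2):
            neighbours[u].add(v)
            neighbours[v].add(u)
    return dict(neighbours)


# Hypothetical rows, not taken from the real data set.
sample = [(382, 2429), (385, 2429), (294, 2502), (385, 2502), (300, 2838)]
adj = user_projection(sample)

for user in sorted(adj):
    print(user, sorted(adj[user]))
```

Here user 385 is linked to both 382 and 294, while user 300 has no shared project and so has no entry at all. Where a matrix is required, the same non-zero pairs could be loaded into a sparse matrix type, which stores only those entries.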