r - Generating a large adjacency matrix


I'm trying to generate an adjacency matrix from a CSV.

The CSV contains two columns, one for users and one for projects. The two columns form a bipartite graph: each user can be part of multiple projects or none at all, and there are no edges between nodes of the same set. (There are no repeated entries for the same user-project pair, but the same user or project does appear repeatedly in different pairs.)

I wrote a comparison that checks each user's projects against the entire project set using MATLAB's ismember(a,b). The algorithm runs iteratively through each entry. In the end, I have an adjacency matrix of size m(|users| + |user|) x (|users| + |user|).

For a small entry count (< 15,000) this works fast, but for a sample of more than 15,000 entries MATLAB stalls. I initialize the adjacency matrix as a zeros matrix (zeros(r,c)) and add the results of ismember(a,b) row by row. In MATLAB, a zeros matrix zeros(15000,15000) maxes out memory. I tried making a zero matrix of the same size in R (matrix(0, 15000, 15000)), and that maxes out R's memory as well.
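To put numbers on the memory problem, here is a rough back-of-the-envelope sketch in Python. It assumes every cell is an 8-byte double, which is the default numeric type for both zeros(...) in MATLAB and matrix(0, ...) in R, and it ignores any per-object overhead. The helper name dense_matrix_gib is made up for illustration.

```python
# Rough size of a dense matrix, assuming 8-byte doubles per cell
# (the default for MATLAB's zeros and for R's matrix(0, ...)).
BYTES_PER_DOUBLE = 8


def dense_matrix_gib(n_rows, n_cols, bytes_per_cell=BYTES_PER_DOUBLE):
    """Return the size of a dense n_rows x n_cols matrix in GiB."""
    return n_rows * n_cols * bytes_per_cell / 2**30


# 15,000 is the size that already fails; 70,000 is the approximate
# number of users in the full sample.
for n in (15_000, 70_000):
    print(f"{n} x {n}: about {dense_matrix_gib(n, n):.1f} GiB")
```

Under these assumptions the 15,000 x 15,000 case alone needs well over a gigabyte, and a dense user-to-user matrix for the full sample would need tens of gigabytes, so preallocating a dense matrix does not scale to the full data.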

Is there a way around this? The full sample is 597,000 rows (with ~70,000 users and ~35,000 projects), and I want to run a network analysis on it.

I also want to keep it in matrix format rather than as an adjacency list, because I have a max cut min flow algorithm that I want to run on the results, and it works on matrices.

Updated:

The data looks like this:

 user  |  project
 382   |  2429
 385   |  2838
 294   |  2502
 ...   |  ...

It was taken from SourceForge using zerlot at the University of Notre Dame. Each int value is a key in a SQL database. I want to convert the affiliation data into a one-mode user-to-user adjacency matrix, where each edge between two users is a shared project.
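The conversion described here can be sketched without ever allocating a dense matrix. The following is a minimal Python illustration using only the standard library, not a definitive implementation: it groups users by project and then counts shared projects for each pair of users, storing only the pairs that share at least one. The function name project_to_users and the extra sample rows are hypothetical. In matrix terms this corresponds to the off-diagonal part of B * t(B), where B is the users x projects incidence matrix. A sparse matrix type (for example the Matrix package in R or sparse in MATLAB) would hold the same nonzero entries while keeping a matrix interface.

```python
from collections import defaultdict
from itertools import combinations


def project_to_users(pairs):
    """Build a sparse user-to-user adjacency structure.

    pairs: iterable of (user_id, project_id) tuples, one per CSV row.
    Returns a dict mapping (user_a, user_b), with user_a < user_b, to
    the number of projects the two users share. Pairs of users with no
    shared project are absent, so no zeros are stored.
    """
    members = defaultdict(set)  # project_id -> set of user_ids
    for user, project in pairs:
        members[project].add(user)

    shared = defaultdict(int)  # (user_a, user_b) -> shared project count
    for users in members.values():
        for a, b in combinations(sorted(users), 2):
            shared[(a, b)] += 1
    return dict(shared)


# Hypothetical rows: the first three echo the sample above, the others
# are made up so that some users actually share a project.
rows = [(382, 2429), (385, 2838), (294, 2502),
        (385, 2429), (294, 2429), (294, 2838)]
edges = project_to_users(rows)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Each key is stored once per unordered pair because the one-mode matrix is symmetric. One caveat of this approach is that a project with k members contributes k(k-1)/2 pairs, so a few very large projects can dominate both running time and memory.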

