This repository has been archived by the owner on Mar 9, 2023. It is now read-only.

friendnet.pl


Analyze your Goodreads.com social network

Spiders your social network and creates files with edges and nodes which can be easily processed with social network analysis software.

Output

$ head friendnet-nodes.csv friendnet-edges.csv
==> friendnet-nodes.csv <==
id,name,img_url
50965461,"Peter Hesar",https://images.gr-assets.com/users/1514444137p2/50911111.jpg
15232357,"Carole Arsifeult",https://images.gr-assets.com/users/139552226262/15222217.jpg
41256336,"Jordan Teller",https://images.gr-assets.com/users/1427180778p2/41444336.jpg
4112343,Tim,https://images.gr-assets.com/users/1432411115p2/4114553.jpg

==> friendnet-edges.csv <==
from,to
15234712,18525218
15234712,8251216
15234712,13152689
15234712,9362611

Comma-separated values (CSV) files can be processed with any social network analysis (SNA) software, such as R with the igraph package. You can also run other statistics software or query languages against the CSV files, e.g. q, which runs SQL against CSV. A user sent me a screenshot of Excel processing this data, which looked good too.
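As a rough illustration of how little is needed to read the two files, here is a minimal Python sketch using only the standard library. The column names (`id`, `name`, `img_url`, `from`, `to`) come from the output excerpt above; the function names and the in-memory sample strings are assumptions made for this example and are not part of friendnet.pl.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the output excerpt above. A real run would open the
# generated files instead of these in-memory strings.
NODES_CSV = """id,name,img_url
50965461,"Peter Hesar",https://images.gr-assets.com/users/1514444137p2/50911111.jpg
4112343,Tim,https://images.gr-assets.com/users/1432411115p2/4114553.jpg
"""
EDGES_CSV = """from,to
15234712,18525218
15234712,8251216
15234712,13152689
15234712,9362611
"""


def load_nodes(fh):
    """Return {user_id: name} from a nodes CSV file handle (hypothetical helper)."""
    return {row["id"]: row["name"] for row in csv.DictReader(fh)}


def load_edges(fh):
    """Return {from_id: set_of_to_ids} from an edges CSV file handle (hypothetical helper)."""
    adjacency = defaultdict(set)
    for row in csv.DictReader(fh):
        adjacency[row["from"]].add(row["to"])
    return adjacency


nodes = load_nodes(io.StringIO(NODES_CSV))
edges = load_edges(io.StringIO(EDGES_CSV))
print(len(nodes), "nodes;", sum(len(t) for t in edges.values()), "edges")
```

To read the real files, pass `open("friendnet-nodes.csv", newline="", encoding="utf-8")` instead of the `io.StringIO` objects. IDs are kept as strings so that they match between the two files without conversion.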

Social network analysis (SNA)

Generated network type:

  • Egocentric (not sociocentric/complete),
  • Directed (not undirected),
  • Binary (not valued),
  • One-Mode (not bipartite/multi-mode),
  • Connected (not disconnected)
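Two of these properties can be checked directly on an edge list. The sketch below is only an illustration: it reads "binary" as "no edge appears twice and there are no weights" and "connected" as "weakly connected, i.e. connected when edge direction is ignored". Both readings are assumptions, and the toy edge list is made up rather than taken from Goodreads.

```python
from collections import defaultdict, deque

# Hypothetical toy edge list of (from, to) pairs; not real Goodreads data.
edges = [("1", "2"), ("1", "3"), ("2", "3"), ("3", "4")]


def is_binary(edge_list):
    """True if every (from, to) pair appears at most once."""
    return len(edge_list) == len(set(edge_list))


def is_weakly_connected(edge_list):
    """True if all nodes are mutually reachable when direction is ignored."""
    neighbours = defaultdict(set)
    for src, dst in edge_list:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    if not neighbours:
        return True
    start = next(iter(neighbours))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary start node
        node = queue.popleft()
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == len(neighbours)


print(is_binary(edges), is_weakly_connected(edges))
```

An egocentric network collected by following friend links outward from one member should pass the connectivity check, since every node was reached from the starting member.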

Network

TODO: R/igraph-examples:
- direct influence on neighbours (degree centrality)
- brokerage or gatekeeping potential (betweenness centrality)
- ability to reach the entire network most quickly, or: who hears news first (closeness centrality)
- influence over the whole network, not just neighbours (eigenvector centrality)
- probability that any message will arrive (PageRank)
- linked to by many nodes that themselves link to many other nodes (Kleinberg authority score)
- community detection
- ...
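Until the R/igraph examples exist, two of the listed measures can be sketched in plain Python. This is not the planned igraph code: it is a minimal illustration of in-degree centrality and of PageRank computed by simple power iteration, with a made-up edge list and the commonly used damping factor of 0.85 as assumptions.

```python
from collections import defaultdict

# Hypothetical toy edge list of (from, to) pairs; not real Goodreads data.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]


def in_degree(edge_list):
    """Number of incoming edges per node (in-degree centrality, unnormalised)."""
    counts = defaultdict(int)
    for src, dst in edge_list:
        counts[src] += 0  # make sure nodes without incoming edges are listed
        counts[dst] += 1
    return dict(counts)


def pagerank(edge_list, damping=0.85, iterations=50):
    """PageRank by power iteration; returns {node: score}, scores sum to 1."""
    out_links = defaultdict(set)
    nodes = set()
    for src, dst in edge_list:
        out_links[src].add(dst)
        nodes.update((src, dst))
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


print(in_degree(edges))
scores = pagerank(edges)
print(max(scores, key=scores.get))
```

For a network of the size shown in the example run further down, a dedicated library such as igraph is the better choice; the sketch only shows what the measures compute.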
TODO: q-example "Members popular among your friends"
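One possible reading of "Members popular among your friends" is: for each member, count how many of your direct friends link to them. The Python sketch below follows that reading. It is an assumption about what the planned q example would compute, and the member IDs are invented.

```python
from collections import Counter

# Hypothetical toy edge list of (from, to) pairs; "1" plays the role of "you".
EGO = "1"
edges = [
    ("1", "2"), ("1", "3"), ("1", "4"),  # your friends
    ("2", "5"), ("3", "5"), ("4", "5"),  # all three friends know member 5
    ("2", "6"), ("3", "6"),              # two friends know member 6
    ("4", "7"),                          # one friend knows member 7
    ("2", "1"),                          # links back to you are ignored
]


def popular_among_friends(edge_list, ego, top=3):
    """Return [(member_id, number_of_your_friends_linking_to_them)], best first."""
    friends = {dst for src, dst in edge_list if src == ego}
    counts = Counter(
        dst
        for src, dst in set(edge_list)  # set() drops duplicate edges
        if src in friends and dst != ego
    )
    return counts.most_common(top)


print(popular_among_friends(edges, EGO))
```

Members who are already your friends are not filtered out here; adding `and dst not in friends` to the condition would restrict the result to people you are not yet connected to.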

How to generate this on a GNU/Linux operating system

  1. Install the toolbox
  2. At the prompt, enter:
$ ./friendnet.pl --help
$ ./friendnet.pl [email protected]

Enter GR password for [email protected]: ******************
Signing in to Goodreads... OK
Traversing #18418712's social network (depth=2)...
Covered: [100%]
Writing network data to: 
./list-out/friendnet-5685856-nodes.csv  (N=76622)
./list-out/friendnet-5685856-edges.csv  (N=106974)

Total time: 195 minutes

Note:

You can interrupt the process with CTRL-C and continue later without having to re-read all online sources, since reading from Goodreads.com is very time-consuming. Internally, the script uses a file cache stored in /tmp/FileCache/; cached entries expire after 31 days.
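The general idea behind such a resumable, expiring file cache can be sketched as follows. This is not the script's actual implementation: the class, the `fetch` helper, the hashing of keys into file names, and the use of a temporary directory are all assumptions made for illustration. Only the 31-day lifetime comes from the note above.

```python
import hashlib
import os
import tempfile
import time

MAX_AGE_SECONDS = 31 * 24 * 60 * 60  # 31 days, as stated in the note above


class FileCache:
    """Hypothetical minimal file cache: one file per key, expiry by file age."""

    def __init__(self, directory, max_age=MAX_AGE_SECONDS):
        self.directory = directory
        self.max_age = max_age
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest)

    def get(self, key, now=None):
        """Return the cached text, or None if it is missing or too old."""
        path = self._path(key)
        now = time.time() if now is None else now
        try:
            if now - os.path.getmtime(path) > self.max_age:
                return None
            with open(path, encoding="utf-8") as fh:
                return fh.read()
        except FileNotFoundError:
            return None

    def set(self, key, value):
        with open(self._path(key), "w", encoding="utf-8") as fh:
            fh.write(value)


def fetch(url, cache, download):
    """Return the page for url, calling download(url) only on a cache miss."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    body = download(url)
    cache.set(url, body)
    return body


cache = FileCache(tempfile.mkdtemp())
print(fetch("page-1", cache, lambda url: "content of " + url))
```

Because every downloaded page is written to disk as soon as it arrives, an interrupted run can restart from the beginning and still skip everything that was already fetched.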

Observations and limitations

  • Long runtime: Goodreads slows down all requests, and a lot of data has to be loaded (the example run above took 195 minutes).

Feedback

If you like this project, give it a star on GitHub. Report bugs or suggestions via GitHub or see the AUTHORS.md file.

See also