GitHub Social Network

Description

A large social network of GitHub developers, collected from the public API in June 2019. Nodes are developers who have starred at least 10 repositories, and edges are mutual follower relationships between them. The vertex features are extracted from each user's location, starred repositories, employer, and e-mail address. The task associated with the graph is binary node classification: predict whether a GitHub user is a web developer or a machine learning developer. This target feature was derived from each user's job title.
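
Because edges are mutual follower relationships, the graph can be held as an undirected adjacency map. The following is a minimal sketch, not part of the original dataset tooling. It assumes the edge file is a two-column CSV of integer node ids with an optional header row; the function name load_undirected_graph and the inline sample rows are hypothetical.

```python
import csv
from collections import defaultdict


def load_undirected_graph(lines):
    """Build {node: set(neighbours)} from CSV rows of the form 'u,v'.

    `lines` is any iterable of strings (an open file works too).
    Assumption: two integer columns; non-numeric rows such as a header are skipped.
    """
    adjacency = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # blank or malformed line
        try:
            u, v = int(row[0]), int(row[1])
        except ValueError:
            continue  # header row or non-numeric content
        if u == v:
            continue  # ignore self-loops
        # Mutual relationship: store the edge in both directions.
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


# Hypothetical sample: a triangle 0-1-2 with a pendant node 3.
sample = ["id_1,id_2", "0,1", "1,2", "2,0", "2,3"]
graph = load_undirected_graph(sample)
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
```

To read the real data, pass an open file instead of the sample list, for example open("musae_git_edges.csv", newline=""), assuming that file follows the layout described above.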

Properties

- Directed: No.
- Node features: Yes.
- Edge features: No.
- Node labels: Yes. Binary-labeled.
- Temporal: No.
- Nodes: 37,700
- Edges: 289,003
- Density: 0.001 
- Transitivity: 0.013
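
The density and transitivity figures can be recomputed from an adjacency map. The sketch below uses the common textbook definitions for a simple undirected graph: density = 2m / (n(n - 1)) and transitivity = 3 * triangles / connected triples. This README does not state which definitions or rounding produced the values above, so results may differ slightly. The function names and the toy graph are hypothetical.

```python
def density(adjacency):
    """Fraction of possible undirected edges that are present."""
    n = len(adjacency)
    m = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def transitivity(adjacency):
    """Closed triples divided by all connected triples."""
    closed = 0   # each triangle is counted once per corner, i.e. 3 times
    triples = 0  # paths of length two, counted at their centre node
    for nbrs in adjacency.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        nbr_list = list(nbrs)
        for i, a in enumerate(nbr_list):
            for b in nbr_list[i + 1:]:
                if b in adjacency[a]:
                    closed += 1
    return closed / triples if triples else 0.0


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
toy_density = density(toy)            # 4 edges out of 6 possible
toy_transitivity = transitivity(toy)  # 3 closed triples out of 5
```

The neighbour-pair loop is quadratic in a node's degree, so on the full graph it can be slow for high-degree nodes; a graph library would be the more practical choice there.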

Possible Tasks

- Binary node classification
- Link prediction
- Community detection
- Network visualization
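
As an illustration of the link prediction task, one simple baseline scores each non-adjacent node pair by the Jaccard coefficient of their neighbour sets. This is a sketch of a generic baseline, not a method prescribed by the dataset authors; the function name jaccard_scores and the toy graph are hypothetical.

```python
from itertools import combinations


def jaccard_scores(adjacency):
    """Score every non-adjacent pair (u, v), u < v, by |N(u) & N(v)| / |N(u) | N(v)|."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already linked, nothing to predict
        union = adjacency[u] | adjacency[v]
        if not union:
            continue  # both nodes are isolated
        common = adjacency[u] & adjacency[v]
        scores[(u, v)] = len(common) / len(union)
    return scores


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
scores = jaccard_scores(toy)
```

Enumerating all pairs is quadratic in the number of nodes, so for a graph of this size one would likely restrict candidates to pairs that share at least one neighbour.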