Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Overview

Little Ball of Fur is a graph sampling extension library for Python.

Please look at the Documentation, relevant Paper, Promo video and External Resources.

Little Ball of Fur consists of methods that can sample from graph structured data. To put it simply, it is a Swiss Army knife for graph sampling tasks. First, it includes a large variety of vertex, edge, and exploration sampling techniques. Second, it provides a unified public application programming interface, which makes applying sampling algorithms straightforward for end users. The implemented methods come from a wide range of networking (Networking, INFOCOM, SIGCOMM) and data mining (KDD, TKDD, ICDE) conferences, workshops, and prominent journals.
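
To make the vertex and edge sampling families concrete, here is a minimal plain-Python sketch of the two simplest ideas: uniform random node sampling (keeping the induced edges) and uniform random edge sampling. The function names `sample_nodes` and `sample_edges` and the edge-list input are assumptions chosen for illustration; this is not the library's implementation or API.

```python
import random


def sample_nodes(edges, number_of_nodes, seed=42):
    """Pick nodes uniformly at random and keep the edges between them."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    kept = set(rng.sample(nodes, number_of_nodes))
    # Induced subgraph: an edge survives only if both endpoints were kept.
    return [(u, v) for u, v in edges if u in kept and v in kept]


def sample_edges(edges, number_of_edges, seed=42):
    """Pick edges uniformly at random; their endpoints form the sample."""
    rng = random.Random(seed)
    return rng.sample(list(edges), number_of_edges)


# Toy graph (hypothetical data): a ring of 10 nodes.
ring = [(i, (i + 1) % 10) for i in range(10)]
node_sample = sample_nodes(ring, 5)
edge_sample = sample_edges(ring, 4)
```

Node sampling tends to return few edges on sparse graphs because both endpoints must be selected, which is one reason exploration based samplers exist.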


Citing

If you find Little Ball of Fur useful in your research, please consider citing the following paper:

@inproceedings{littleballoffur,
               title={{Little Ball of Fur: A Python Library for Graph Sampling}},
               author={Benedek Rozemberczki and Oliver Kiss and Rik Sarkar},
               year={2020},
               pages = {3133–3140},
               booktitle={Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20)},
               organization={ACM},
}

A simple example

Little Ball of Fur makes modern graph subsampling techniques easy to use (see here for the accompanying tutorial). For example, this is all it takes to use Diffusion Sampling on a Newman-Watts-Strogatz graph:

import networkx as nx
from littleballoffur import DiffusionSampler

graph = nx.newman_watts_strogatz_graph(1000, 20, 0.05)

sampler = DiffusionSampler()

new_graph = sampler.sample(graph)
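
Reports in the Comments section of this page suggest that the samplers expect a connected graph whose nodes are indexed from 0. The sketch below prepares an arbitrary undirected NetworkX graph accordingly, using only standard NetworkX calls. The helper name `prepare_for_sampling` is hypothetical and is not part of Little Ball of Fur.

```python
import networkx as nx


def prepare_for_sampling(graph):
    """Return the largest connected component, relabeled to 0..n-1.

    Assumes an undirected graph; hypothetical helper, not a library function.
    """
    largest = max(nx.connected_components(graph), key=len)
    component = graph.subgraph(largest).copy()
    # Relabel nodes to consecutive integers starting at 0.
    return nx.convert_node_labels_to_integers(component, first_label=0)


# Toy graph (hypothetical data): a triangle plus a detached edge,
# with node labels that do not start at 0.
raw = nx.Graph()
raw.add_edges_from([(1, 2), (2, 3), (3, 1), (10, 11)])
clean = prepare_for_sampling(raw)
```

The result `clean` can then be passed to a sampler in place of `graph`. Note that this drops every node outside the largest component, which may or may not be acceptable for a given study.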

Methods included

The following families of sampling methods are implemented.

Node Sampling

Edge Sampling

Exploration Based Sampling

Head over to our documentation to find out more about installation and data handling, a full list of implemented methods, and datasets. For a quick start, check out our examples.

If you notice anything unexpected, please open an issue and let us know. If you are missing a specific method, feel free to open a feature request. We are motivated to constantly make Little Ball of Fur even better.


Installation

Little Ball of Fur can be installed with the following pip command.

$ pip install littleballoffur

As we create new releases frequently, upgrading the package occasionally might be beneficial.

$ pip install littleballoffur --upgrade

Running examples

As part of the documentation we provide a number of use cases that show how to use various sampling techniques. These can be accessed here with detailed explanations.

Besides the case studies we provide synthetic examples for each model. These can be tried out by running the scripts in the examples folder. You can try out the random walk sampling example by running:

$ cd examples
$ python ./exploration_sampling/randomwalk_sampler.py
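
For readers who want the idea behind random walk sampling before running the script, the following is a minimal plain-Python sketch: start at a random node, repeatedly move to a random neighbor, and stop once enough distinct nodes have been visited. The function name `random_walk_sample` and the adjacency-dict input are assumptions for illustration; the library's own sampler may differ in its details.

```python
import random


def random_walk_sample(adjacency, number_of_nodes, seed=42):
    """Collect nodes visited by a simple random walk (illustrative sketch).

    Assumes a connected graph in which every node has at least one neighbor.
    """
    if number_of_nodes > len(adjacency):
        raise ValueError("Cannot sample more nodes than the graph contains.")
    rng = random.Random(seed)
    current = rng.choice(sorted(adjacency))
    visited = {current}
    while len(visited) < number_of_nodes:
        # Step to a uniformly chosen neighbor of the current node.
        current = rng.choice(adjacency[current])
        visited.add(current)
    return visited


# Toy graph (hypothetical data): a ring of 20 nodes as an adjacency dict.
ring_adjacency = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
walk_sample = random_walk_sample(ring_adjacency, 8)
```

The sampled subgraph is the one induced by the returned node set. On a disconnected graph this loop could run forever, which is consistent with the connectivity requirement discussed in the Comments section.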

Running tests

$ python setup.py test

License

Comments
  • change initial num of nodes formula

    to avoid having more initial nodes than the requested final number of nodes (when the final number of nodes requested is much smaller than the graph size).

    opened by bricaud 7
  • Error install dependency networkit==7.1

    I didn't manage to install littleballoffur because one of its dependencies seems outdated. Installing networkit==7.1 failed, although I did manage to install and run its latest version. However, littleballoffur requires networkit==7.1.

    I am using a Jupyter notebook as an environment and the following system specs: posix Darwin 21.4.0 3.8.12 (default, Mar 17 2022, 14:54:15) [Clang 13.0.0 (clang-1300.0.29.30)]

    The specific error output:

    Collecting networkit==7.1
      Using cached networkit-7.1.tar.gz (3.1 MB)
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error
      
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [2 lines of output]
          ERROR: No suitable compiler found. Install any of these:  ['g++', 'g++-8', 'g++-7', 'g++-6.1', 'g++-6', 'g++-5.3', 'g++-5.2', 'g++-5.1', 'g++-5', 'g++-4.9', 'g++-4.8', 'clang++', 'clang++-3.8', 'clang++-3.7']
          If using AppleClang, OpenMP might be needed. Install with: 'brew install libomp'
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
    
    

    Please note that: libomp 14.0.0 is already installed and up-to-date.

    Is there some way I could install the library on networkit v10? Thanks a lot!

    opened by CristinBSE 6
  • Node attributes are not copied from original graph

    Breadth and Depth First Search return me subgraphs without correct attributes on nodes/edges. Actually, I found that the dict containing those attributes has been completely deleted in the sampled graph. Is this a known issue? Is the sampler supposed to work in this way?

    opened by jungla88 6
  • Why can't I use the graph imported by nx.read_edgelist()

    graph = nx.read_edgelist("filename", nodetype=int, data=(("Weight", int),))

    Error: AssertionError: Graph is not connected. Why? 'graph' is a networkx graph.

    opened by DeathSentence 5
  • Spikyball exploration sampling

    You might find the change a bit invasive (understandable :). This adds a new family of exploration sampling methods (spikyball), described in the paper "Spikyball sampling: Exploring large networks via an inhomogeneous filtered diffusion", available at https://arxiv.org/abs/2010.11786 and submitted for publication in the Combinatorial Optimization, Graph, and Network Algorithms journal. The version number has been increased so as not to collide with official releases of lbof; you might want to change this...

    opened by naspert 4
  • Assumptions on graph properties

    Hi there,

    I am wondering if it would be possible to relax some of the constraints a graph has to satisfy before an exploration can start on it. In particular, the connectivity requirement seems a bit strong to me. I think a graph sampling procedure could easily deal with a disconnected graph: the sampling could take place on the individual connected components, or the exploration could rely on the neighborhood of the node currently being explored. This looks pretty natural to me for node sampling strategies like BFS and DFS, and also for Random Walk Sampling (the variant with a restart probability could be a little tricky). Something strange could probably happen for edge sampling if the connectivity property is not satisfied. Do you see any possibility of extending Little Ball of Fur to such graphs? What was the reason that led you to assume connectivity?

    Thank you !

    opened by jungla88 3
  • Error importing DiffusionSampler

    Hello,

    First of all, thank you for your great work building this library. Great extension to NetworkX.

    I am facing an issue when trying to import the DiffusionSampler specifically. All the other samplers import just fine; only the DiffusionSampler raises an import error.

    I am using a Jupyter notebook as an environment.

    The specific error output:

    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    <ipython-input-29-fbd222d9c756> in <module>
    ----> 1 from littleballoffur import DiffusionSampler
          2 
          3 
          4 model = DiffusionSampler()
          5 new_graph = model.sample(wd50k_connected_relabeled)
    
    ImportError: cannot import name 'DiffusionSampler' from 'littleballoffur'
    

    Is this replicable?

    Thank you in advance for looking into it.

    opened by DimitrisAlivas 3
  • ForestFireSampler throws exceptions for some seed values

    Hi,

    I am trying to sample an undirected, connected graph of 5559 nodes and 10804 edges into a sample of 100 nodes. As I loop over the "creation of samples" part, I alter the seed for the ForestFireSampler every time to obtain a different sample.

    E.g.

        seed_value = random.randint(1,2147483646)
        sampler = ForestFireSampler(100, seed=seed_value)

    However, some runs throw an exception, which is also reproducible. I assume it is related to specific seed values that the sampler doesn't seem to be able to handle. An example is seed value 1176372277.

        Traceback (most recent call last):
          File "/project/topology_extraction.py", line 472, in <module>
            abstraction_G = graph_sampling(S)
          File "/project/topology_extraction.py", line 234, in graph_sampling
            new_graph = sampler.sample(S)
          File "/usr/local/lib/python3.8/dist-packages/littleballoffur/exploration_sampling/forestfiresampler.py", line 74, in sample
            self._start_a_fire(graph)
          File "/usr/local/lib/python3.8/dist-packages/littleballoffur/exploration_sampling/forestfiresampler.py", line 47, in _start_a_fire
            top_node = node_queue.popleft()
        IndexError: pop from an empty deque

    Process finished with exit code 1

    I believe this is a bug in the library.

    Thanks! Nils

    opened by nrodday 3
  • Error in forest fire sampling

    Hi,

    While running the forest fire sampling code, I got an error that it is trying to pop an element from an empty deque.

        File "/opt/anaconda3/lib/python3.7/site-packages/littleballoffur/exploration_sampling/forestfiresampler.py", line 47, in _start_a_fire
          top_node = node_queue.popleft()
        IndexError: pop from an empty deque

    I am not sure if it was due to data or needs an empty/try-catch check or should it be handled by application code. Hence opened an issue.

    Thank you

    opened by apurvamulay 2
  • Broken link in Readme (readthedocs)

    https://littleballoffur.readthedocs.io/en/latest/notes/introduction.html

    as of 2020-05-18 9:35 AM EDT, it says "sorry this page does not exist"

    opened by bbrewington 1
  • Error in line 254 _checking_indexing() of backend.py

    According to your code, the error is raised as soon as numeric_indices != node_indices. In my scenario, I constructed a networkx graph in which the node indices start from '1', and the sampler did not work. This error is triggered whenever the node indices of a networkx graph do not start from '0'. I had to adjust my graph so that the node indices start from '0' in order to use your samplers. I hope you can refine this part of the code so that others do not run into the same problem.

    opened by Haoran-Young 0
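
Two of the reports above describe `IndexError: pop from an empty deque` in forest fire sampling, which happens when the fire burns out before enough nodes have been collected. The sketch below shows one possible guard: when the queue is empty, reignite the fire at a random node that has not been sampled yet. It is a simplified plain-Python illustration under stated assumptions (the function name `forest_fire_sample`, the adjacency-dict input, and the burning rule are all hypothetical), not the library's code or its actual fix.

```python
import random
from collections import deque


def forest_fire_sample(adjacency, number_of_nodes, p=0.4, seed=42):
    """Forest fire style sampling with a guard for a burned-out queue."""
    if number_of_nodes > len(adjacency):
        raise ValueError("Cannot sample more nodes than the graph contains.")
    rng = random.Random(seed)
    all_nodes = sorted(adjacency)
    sampled = set()
    queue = deque()
    while len(sampled) < number_of_nodes:
        if not queue:
            # Guard: the fire died out, so restart at an unsampled node
            # instead of calling popleft() on an empty deque.
            start = rng.choice([n for n in all_nodes if n not in sampled])
            sampled.add(start)
            queue.append(start)
            continue
        top_node = queue.popleft()
        unburned = [n for n in adjacency[top_node] if n not in sampled]
        rng.shuffle(unburned)
        # Burn a random number of neighbors: keep going with probability p.
        burn_count = 0
        while burn_count < len(unburned) and rng.random() < p:
            burn_count += 1
        for neighbor in unburned[:burn_count]:
            if len(sampled) >= number_of_nodes:
                break
            sampled.add(neighbor)
            queue.append(neighbor)
    return sampled


# Toy graph (hypothetical data): a path of 30 nodes.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 30] for i in range(30)}
fire_sample = forest_fire_sample(path, 10)
```

Because every pass through the loop either adds a node or shortens the queue, the function always terminates, even on a graph made only of isolated nodes.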
Releases(v_20200)
Owner
Benedek Rozemberczki
PhD candidate at The University of Edinburgh @cdt-data-science working on machine learning and data mining related to graph structured data.