
[Code analysis (4)] Communication-Efficient Learning of Deep Networks from Decentralized Data

2022-04-23 13:47:00 【Silent city of the sky】
options.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python version: 3.6

import argparse  # import the command-line argument parsing module


def args_parser():
    # Create the parser object
    parser = argparse.ArgumentParser()

    '''
        Federated learning parameters
    '''
    '''
        Information about program arguments is added to an ArgumentParser
        by calling the add_argument() method. Generally, these calls tell
        the ArgumentParser how to take a command-line string and convert
        it into an object. This information is stored and used when
        parse_args() is called.
    '''
    # federated arguments (notation for the arguments follows the paper)
    # Number of global training rounds, default 10
    parser.add_argument('--epochs', type=int, default=10,
                        help="number of rounds of training")
    # Number of users (clients) K, default 100
    parser.add_argument('--num_users', type=int, default=100,
                        help="number of users: K")
    # At the beginning of each round a random subset of clients is selected;
    # the size of that subset is set by the fraction C.
    # Each selected client uses all of its data to compute the gradient of the loss.
    # With C=0.1 and K=100: 100*0.1=10 clients per round
    parser.add_argument('--frac', type=float, default=0.1,
                        help='the fraction of clients: C')

    # Number of local epochs E, default 10
    parser.add_argument('--local_ep', type=int, default=10,
                        help="the number of local epochs: E")

    # Local batch size B, default 10
    parser.add_argument('--local_bs', type=int, default=10,
                        help="local batch size: B")

    # Learning rate, default 0.01
    parser.add_argument('--lr', type=float, default=0.01,
                        help='learning rate')

    # SGD momentum, default 0.5
    parser.add_argument('--momentum', type=float, default=0.5,
                        help='SGD momentum (default: 0.5)')

    '''
        Model parameters
    '''
    # model arguments
    # Choose a model; the default is MLP

    parser.add_argument('--model', type=str, default='mlp', help='model name')

    # Number of kernels of each kind, default 9
    parser.add_argument('--kernel_num', type=int, default=9,
                        help='number of each kind of kernel')

    # Kernel sizes 3*3, 4*4, 5*5 for the CNN
    parser.add_argument('--kernel_sizes', type=str, default='3,4,5',
                        help='comma-separated kernel size to use for convolution')

    # Number of image channels; channel=1 because the images are grayscale, not RGB
    parser.add_argument('--num_channels', type=int, default=1, help="number of channels of imgs")

    # batch_norm: speeds up neural-network training, improving convergence speed and stability
    # layer_norm
    '''
        LN (layer normalization) is a normalization method very close to
        BN (batch normalization).
        The difference is that BN normalizes the same feature across different samples,
        while LN normalizes the different features of the same sample.
        In settings where both BN and LN can be used,
        BN generally works better than LN,
        because normalizing one feature across different samples
        is less likely to lose information.

        When LN is added after a CNN, experimental results show that LN destroys
        the features learned by the convolutions and the model does not converge,
        so BN is the better choice after a CNN.
    '''
    parser.add_argument('--norm', type=str, default='batch_norm',
                        help="batch_norm, layer_norm, or None")

    # Number of filters in the CNN, default 32
    '''
        A convolution kernel is not the same thing as a filter:
            For single-channel images, filter = convolution kernel; each feature map corresponds to one convolution kernel.
            For multi-channel images, filter = a set of convolution kernels; each feature map corresponds to one filter.

        A convolution kernel is specified by length and width, so it is a two-dimensional concept.

        A filter is specified by length, width and depth, so it is a three-dimensional concept.

        A filter can be seen as a collection of convolution kernels.

        A filter has one more dimension than a convolution kernel: depth.
    '''
    parser.add_argument('--num_filters', type=int, default=32,
                        help="number of filters for conv nets -- 32 for mini-imagenet, 64 for omniglot.")

    # max_pool: whether to use max pooling
    parser.add_argument('--max_pool', type=str, default='True',
                        help="Whether to use max pooling rather than strided convolutions")

    # other arguments
    # Dataset selection; noted earlier as default='mnist', set to 'cifar' in this copy
    parser.add_argument('--dataset', type=str, default='cifar', help="name of dataset")

    '''
        cifar: 32*32*3 images
        The dataset consists of 10 classes of images:
        airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.
    '''
    parser.add_argument('--num_classes', type=int, default=10, help="number of classes")
    '''
    parser.add_argument('--gpu', default=None, help="To use cuda, set to a specific GPU ID. Default set to use CPU.")
    '''
    # CPU or GPU 0; with default=None no value is given, which evaluates as false
    parser.add_argument('--gpu', default=0, help="To use cuda, set to a specific GPU ID. Default set to use CPU.")

    # Optimizer: SGD or Adam
    parser.add_argument('--optimizer', type=str, default='sgd', help="type of optimizer")

    # Whether the data across clients is IID or non-IID
    parser.add_argument('--iid', type=int, default=1,
                        help='Default set to IID. Set to 0 for non-IID.')

    # Whether the data is split equally between clients
    parser.add_argument('--unequal', type=int, default=0,
                        help='whether to use unequal data splits for non-i.i.d setting (use 0 for equal splits)')

    # Number of rounds used for early stopping
    parser.add_argument('--stopping_rounds', type=int, default=10,
                        help='rounds of early stopping')

    # Verbosity of the output
    parser.add_argument('--verbose', type=int, default=1, help='verbose')

    # Random seed
    parser.add_argument('--seed', type=int, default=1, help='random seed')
    args = parser.parse_args()
    return args
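
The comments on --num_users and --frac work out 100*0.1=10 clients per round. A minimal sketch of that arithmetic follows, assuming the training loop samples clients uniformly at random without replacement. The helper name `sample_clients` and the `max(..., 1)` guard are illustrative assumptions and are not part of options.py.

```python
import argparse
import random


def sample_clients(num_users, frac, seed=1):
    """Pick the clients for one round (hypothetical helper, not from options.py)."""
    # m = C * K, with at least one client so a round is never empty (assumed guard)
    m = max(int(frac * num_users), 1)
    rng = random.Random(seed)
    # Draw m distinct client indices from 0..K-1
    return rng.sample(range(num_users), m)


# A cut-down parser holding only the two options used here, with the same defaults
parser = argparse.ArgumentParser()
parser.add_argument('--num_users', type=int, default=100)
parser.add_argument('--frac', type=float, default=0.1)

# Passing an explicit list keeps the example independent of sys.argv
args = parser.parse_args([])
selected = sample_clients(args.num_users, args.frac)
print(len(selected))  # 10
```

Passing `['--frac', '0.2']` to `parse_args` instead of the empty list would select 20 of the 100 clients under the same assumptions.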
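
One detail worth noting: --max_pool is declared with type=str and default='True', so the parsed value is always a string. Any non-empty string, including 'False', is truthy in Python, so code that consumes args.max_pool needs an explicit conversion. The sketch below shows one possible conversion. The `str2bool` helper is a hypothetical addition, not something options.py defines.

```python
import argparse


def str2bool(value):
    """Convert a command-line string to a bool (hypothetical helper)."""
    text = str(value).strip().lower()
    if text in ('true', '1', 'yes', 'y'):
        return True
    if text in ('false', '0', 'no', 'n'):
        return False
    raise argparse.ArgumentTypeError('expected a boolean, got %r' % value)


parser = argparse.ArgumentParser()
# Same declaration as in options.py: the value stays a string
parser.add_argument('--max_pool', type=str, default='True')
args = parser.parse_args(['--max_pool', 'False'])

print(bool(args.max_pool))      # True: the non-empty string 'False' is truthy
print(str2bool(args.max_pool))  # False: the explicit conversion gives the intended value
```

Such a helper could also be passed directly as `type=str2bool` in `add_argument`, which would make argparse perform the conversion and report invalid values itself.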
Copyright notice
This article was written by [Silent city of the sky]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230556365764.html