Picka: A Python module for data generation and randomization.

Author: Anthony Long
Version: 1.0.1 - Fixed the broken image stuff. Whoops

What is Picka?

Picka generates randomized data for testing.

Data comes from two sources: an included database of known good data, and realistic (valid) values generated behind the scenes with string formatting.
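The two sources described above can be sketched roughly as follows. This is a minimal illustration, not Picka's actual implementation: the name list, the `male_name` and `email` helpers, and their signatures are hypothetical stand-ins that only mirror the calls shown in this README.

```python
import random
import string

# Hypothetical stand-in for the bundled database of known good data.
KNOWN_MALE_NAMES = ["Jack", "Liam", "Noah"]


def male_name():
    """Pick a value from the known good data."""
    return random.choice(KNOWN_MALE_NAMES)


def email(length, extension="example.org"):
    """Build a valid-looking address with string formatting."""
    local_part = "".join(
        random.choice(string.ascii_lowercase) for _ in range(length)
    )
    return "{0}@{1}".format(local_part, extension)


sample_name = male_name()
sample_email = email(10, extension="example.org")
```

The first helper can only return values that are already known to be good, while the second returns a fresh value each time that still has a valid shape.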

Picka has a function for any field you would need filled in. With Selenium, something like the following would populate the "field-name-here" box 100 times with random names.

for x in range(100):
        self.selenium.type('field-name-here', picka.male_name())

But this is just the beginning. Other ways to use it include dicts:

user_information = {
        "first_name": picka.male_name(),
        "last_name": picka.last_name(),
        "email_address": picka.email(10, extension='example.org'),
        "password": picka.password_numerical(6),
}

This would produce something like:

{
        "first_name": "Jack",
        "last_name": "Logan",
        "email_address": "[email protected]",
        "password": "485444"
}

Since all of the data is considered "clean" or valid, you can also use it to fill selects and other form fields with pre-defined values. For example, if you generate a state with picka.state(), the result might be "Alabama". You can use this result to directly select a state in an address drop-down box.

Examples:

Selenium

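def search_for_garbage():
        # Search for a random 10-character string that should match nothing.
        selenium.open('http://yahoo.com')
        selenium.type('id=search_box', picka.random_string(10))
        selenium.submit()

def test_search_for_garbage_results():
        search_for_garbage()
        selenium.wait_for_page_to_load('30000')
        # get_xpath_count expects an XPath expression, not an "id=" locator.
        assert selenium.get_xpath_count("//*[@id='results']") == 0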

Webdriver

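from selenium import webdriver
import picka

driver = webdriver.Firefox()
driver.get("http://somesite.com")
# Map each field to its CSS selector and the random value to type into it.
x = {
        "name": [
                "#name",
                picka.name()
        ]
}
selector, value = x["name"]
driver.find_element_by_css_selector(selector).send_keys(value)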

Funcargs / pytest

def pytest_generate_tests(metafunc):
        if "test_string" in metafunc.funcargnames:
                for i in range(10):
                        # The funcargs key must match the test's argument name.
                        metafunc.addcall(funcargs=dict(test_string=picka.random_string(20)))

def test_func(test_string):
        assert test_string.isalpha()
        assert len(test_string) == 20
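
In recent pytest releases the same idea is usually expressed with `metafunc.parametrize` and `metafunc.fixturenames`. The sketch below assumes that API. `random_string` here is a hypothetical stand-in for `picka.random_string` that returns only ASCII letters, which may not match what Picka really returns.

```python
import random
import string


def random_string(length):
    """Hypothetical stand-in for picka.random_string: random ASCII letters."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def pytest_generate_tests(metafunc):
    # Parametrize every test that asks for a "test_string" argument
    # with ten freshly generated random strings.
    if "test_string" in metafunc.fixturenames:
        values = [random_string(20) for _ in range(10)]
        metafunc.parametrize("test_string", values)


def test_func(test_string):
    assert test_string.isalpha()
    assert len(test_string) == 20
```

Placed in a conftest.py or a test module, this hook makes pytest run `test_func` once per generated value.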

MySQL / SQLite

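# Assumes `cursor` is an open DB-API cursor. "?" is the sqlite3 placeholder
# style; MySQL drivers such as MySQLdb expect "%s" instead.
first, last, age = picka.first_name(), picka.last_name(), picka.age()
cursor.execute(
        "insert into user_data (first_name, last_name, age) VALUES (?, ?, ?)",
        (first, last, age)
)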
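
For a version that runs end to end, here is a minimal sketch against an in-memory SQLite database. The table layout and the three generator functions are assumptions made for illustration. They stand in for `picka.first_name()`, `picka.last_name()`, and `picka.age()` so the example has no external dependencies.

```python
import random
import sqlite3


# Hypothetical stand-ins for the Picka generators.
def first_name():
    return random.choice(["Jack", "Emma", "Noah"])


def last_name():
    return random.choice(["Logan", "Smith", "Jones"])


def age():
    return random.randint(18, 90)


def insert_random_users(conn, count):
    """Create the table if needed and insert `count` random rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_data "
        "(first_name TEXT, last_name TEXT, age INTEGER)"
    )
    rows = [(first_name(), last_name(), age()) for _ in range(count)]
    conn.executemany(
        "INSERT INTO user_data (first_name, last_name, age) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return rows


conn = sqlite3.connect(":memory:")
inserted = insert_random_users(conn, 5)
stored = conn.execute(
    "SELECT first_name, last_name, age FROM user_data"
).fetchall()
```

Passing the values as parameters, rather than formatting them into the SQL string, keeps the statement safe even when the generated data contains quotes.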

HTTP

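import httplib  # Python 2 module; named http.client on Python 3

def post(host, path, data):
        # POST data to the given path and return the server's response.
        conn = httplib.HTTPConnection(host)
        conn.request("POST", path, data)
        return conn.getresponse()

def test_post_result():
        post("www.spam.egg", "/bacon.htm", picka.random_string(10))

Comments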
  • No test suite

    Slightly ironic: a test data generation toolkit that doesn't have a test suite.

    Also, setup.py doesn't declare Python 3 support, hence the need for a test suite to validate that it works correctly.

    opened by jayvdb 1
  • Additional Functionality for Testers to Add Their Own Data

    Picka provides general-purpose data for testing. This change builds on that effort so testers can supply their own data: test data is no longer limited to the preconfigured values once custom data can be added. Custom data can be accessed sequentially, randomly, or all at once.

    opened by bkuehlhorn 1
  • Fixed test file, added alternative sentence maker

    1. Fixed usage of number in tests (it takes one arg, not two)
    2. Added sentence_actual, which returns an actual sentence from the Sherlock text.
    3. Added _picka._Book class to hold the text and split sentences read from Sherlock. Users can call sentence() without reading the entire file again and again.
    4. Added test of sentence_actual to picka.tests

    The sentence_actual function has some nice features:

    1. You're much less likely to get a sentence fragment
    2. You can specify a minimum and maximum number of words
    3. It should be relatively efficient, because the split sentences are cached by the _Book class.

    The sentences aren't always perfect, but I think that has to do with the source. A book other than Sherlock Holmes, preferably one with less dialog, would give more "normal" sentences.

    opened by TadLeonard 1
  • Library does not take locale into account

    The library assumes an English locale is used (e.g., English-language hardcoded month names). Ideally the library would use locale-dependent constants so that computations are done correctly (e.g., the duration of a month in month_and_day):

    >>> locale.setlocale(locale.LC_ALL, 'it_IT')
    'it_IT'
    >>> picka.month()
    'Marzo'
    >>> picka.month_and_day()
    'Maggio 2'
    
    opened by svisser 0
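
One way to get the locale-aware behavior requested in this issue is to build on the standard library's `calendar` module, whose `month_name` sequence follows the current locale and whose `monthrange` gives the real length of each month. The sketch below is an assumption about how such functions could look, not Picka's code. The function names mirror the ones in the issue, and the `year` parameter is added here for illustration.

```python
import calendar
import datetime
import random


def month():
    """Return a random month name in the current locale."""
    # Index 0 of calendar.month_name is an empty string, so skip it.
    return random.choice(calendar.month_name[1:])


def month_and_day(year=None):
    """Return a random 'Month D' string with a day valid for that month."""
    if year is None:
        year = datetime.date.today().year
    month_number = random.randint(1, 12)
    # monthrange returns (weekday of the first day, number of days).
    days_in_month = calendar.monthrange(year, month_number)[1]
    day = random.randint(1, days_in_month)
    return "{0} {1}".format(calendar.month_name[month_number], day)
```

Because the day is drawn after the month is chosen, a result such as "February 30" cannot occur, and the month names change when `locale.setlocale` is called.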
  • picka.age will return ages outside of the bounds

    If I call picka.age(1, 1) repeatedly I get 1 and 2 as results. I would have expected it to always return 1. Note that this situation can occur when passing variables to picka.age; I don't expect people to write this call literally in their own code.

    I can also get ages outside of the bounds when I call picka.age(0, 1) which resorts to using the default values and can therefore return any age within the default values.

    opened by svisser 0
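
A bounds-respecting version of the behavior this issue asks for could look like the sketch below. It is an illustration only: the default range is an assumption, not Picka's real default, and `None` is used as the "not given" marker so that 0 remains a valid bound.

```python
import random

# Assumed defaults for illustration; Picka's real defaults may differ.
DEFAULT_MIN_AGE = 1
DEFAULT_MAX_AGE = 99


def age(min_age=None, max_age=None):
    """Return a random age with min_age <= result <= max_age."""
    # Compare against None so that an explicit 0 is not replaced by a default.
    if min_age is None:
        min_age = DEFAULT_MIN_AGE
    if max_age is None:
        max_age = DEFAULT_MAX_AGE
    if min_age > max_age:
        raise ValueError("min_age must not exceed max_age")
    # random.randint includes both endpoints.
    return random.randint(min_age, max_age)
```

With this shape, `age(1, 1)` always returns 1 and `age(0, 1)` returns only 0 or 1.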
  • Module name means "cunt"

    I'm not sure if this is a real issue, but when I look at this module I cannot do so with a straight face. "Picka" is "cunt" in Serbian, Macedonian, Bosnian, Croatian, and I'm unsure as to whether there are other languages where this holds.

    While not grounds for any specific action, I find this largely amusing and just wanted to share.

    opened by geomaster 2