Brandyn White, Andrew Miller

Source: https://github.com/bwhite/hadoopy/
Issues: https://github.com/bwhite/hadoopy/issues
Docs: http://bwhite.github.com/hadoopy/
IRC: #hadoopy @ freenode.net

Requirements
- Python development headers (python-dev)
- Build tools (build-essential)

Optional
- Cython (>= 0.13); without it, the build falls back to the pregenerated .c files

Features
- Oozie support
- Automated job parallelization ('auto-oozie') available in the hadoopy_flow project (maintained out of branch)
- TypedBytes support (very fast)
- Local execution of unmodified MapReduce jobs with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from Python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique; both are available in the task's stderr)
- Critical path is in Cython
- Works on clusters without any extra installation, Python, or any Python libraries (uses PyInstaller, which is included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements! No need to be scared of them in Hadoopy)
- Supports design patterns in the Lin/Dyer book (http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local is currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use pseudo-distributed mode instead for now (https://github.com/bwhite/hadoopy/issues/40).

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
    sudo apt-get install python-dev build-essential
    sudo python setup.py install
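To make the "local execution" and "unit test interface" features more concrete, here is a minimal sketch of the map/group/reduce flow that a MapReduce word count follows. It deliberately does not import hadoopy: the generator-style `mapper(key, value)` and `reducer(key, values)` signatures are an assumption about how jobs are typically written, and `run_local` is a hypothetical helper for illustration, not hadoopy's `launch_local`.

```python
from collections import defaultdict


def mapper(key, value):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts collected for one word."""
    yield key, sum(values)


def run_local(kvs, mapper, reducer):
    """Hypothetical in-process runner: map, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in kvs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    results = []
    # Process keys in sorted order, as a shuffle/sort phase would.
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


# Input records are (line number, line text) pairs.
lines = [(0, "a b a"), (1, "b c")]
counts = run_local(lines, mapper, reducer)
print(counts)  # [('a', 2), ('b', 2), ('c', 1)]
```

Because the mapper and reducer are plain generator functions, they can be exercised in a unit test like this without a cluster; on a real cluster the framework, not a helper such as `run_local`, handles grouping and sorting.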
Python MapReduce library written in Cython.
Overview
PSP (Python Starter Package) is meant for those who want to start coding in python but are new to the coding scene.
Python Starter Package PSP (Python Starter Package) is meant for those who want to start coding in python, but are new to the coding scene. We include
The bidirectional mapping library for Python.
bidict The bidirectional mapping library for Python. Status bidict: has been used for many years by several teams at Google, Venmo, CERN, Bank of Amer
Active Transport Analytics Model: A new strategic transport modelling and data visualization framework
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”
A Google sheet which keeps track of the locations that I want to visit and a price cutoff
FlightDeals Here's how the program works. First, I have a Google sheet which keeps track of the locations that I want to visit and a price cutoff. It
A toolkit for developing and deploying serverless Python code in AWS Lambda.
Python-lambda is a toolset for developing and deploying serverless Python code in AWS Lambda. A call for contributors With python-lambda and pytube bo
Simple Crud Python vs MySQL
Simple Crud Python vs MySQL The idea came when I was studying MySQ... A desire to create a python program that can give access to a "localhost" databa
API for SpeechAnalytics integration with FreePBX/Asterisk
freepbx_speechanalytics_api API for SpeechAnalytics integration with FreePBX/Asterisk Copy the settings.py.sample file to settings.py and edit
Python MQTT v5.0 async client
gmqtt: Python async MQTT client implementation. Installation The latest stable version is available in the Python Package Index (PyPi) and can be inst
Blender addon - Breakdown in object mode
Breakdowner Breakdown in object mode Download latest Demo Youtube Description Same breakdown shortcut as in armature mode in object mode Currently onl
Python script for dividing image data into train, test, and val
dataset-division-to-train-val-test-python python script for dividing image data to train test and val If you have an image dataset in the following st
PyDy, short for Python Dynamics, is a tool kit written in the Python programming language
PyDy, short for Python Dynamics, is a tool kit written in the Python programming language that utilizes an array of scientific programs to enable the study of multibody dynamics. The goal is to have
3x - This Is 3x Friendlist Cloner Tools
3X FRIENDLIST CLONER TOOLS COMMAND $ apt update $ apt upgrade $ apt install pyth
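Assistant for Uma Musume: Pretty Derby (ウマ娘: Pretty Derby) 🐎🖥 Based on auto-derby: visual operation/settings, launcher, one-click package
ok-derby Assistant for Uma Musume: Pretty Derby (ウマ娘: Pretty Derby) 🐎 🖥 Based on auto-derby: visual operation/settings, launcher, one-click package. A convenient, easy-to-use auto_derby manager! Features Supported clients: DMM (foreground), experimental Android ADB connection (background), developed against 1080x1920 resolution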
This is the DBMS Project done in 5th sem of B.E CS.
Student-Result-Management-System This is the DBMS Project done in 5th sem of B.E CS. You need to install SQlite DB Browser in your pc or laptop to ope
An OrpheusDL Tidal module
OrpheusDL - Tidal A Tidal module for the OrpheusDL modular archival music program Report Bug · Request Feature Table of content About OrpheusDL - Tida
The purpose is to have a fairly simple python assignment that introduces the basic features and tools of python
This repository contains the code for the python introduction lab. The purpose is to have a fairly simple python assignment that introduces the basic
Svg-turtle - Use the Python turtle to write SVG files
SaVaGe Turtle Use the Python turtle to write SVG files If you're using the Pytho
Bookmarkarchiver - Python script that archives all of your bookmarks on the Internet Archive
bookmarkarchiver Python script that archives all of your bookmarks on the Internet Archive. Supports all major browsers. bookmarkarchiver uses the off
String parser made in Python
This is a parser made in Python. In this program, I am studying functions and how they combine with "if" statements and data entered by the user. In this code,
Translation patch for Hololive ERROR
Translation patch for Hololive ERROR How do I install the patch? Grab the Translation.zip file for the latest version from the releases page, and unzi