Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Overview

A scalable on-line movie recommender using Spark and Flask

This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens dataset to build a movie recommender using collaborative filtering with Spark's Alternating Least Saqures implementation. It is organised in two parts. The first one is about getting and parsing movies and ratings data into Spark RDDs. The second is about building and using the recommender and persisting it for later use in our on-line recommender system.

This tutorial can be used independently to build a movie recommender model based on the MovieLens dataset. Most of the code in the first part, about how to use ALS with the public MovieLens dataset, comes from my solution to one of the exercises proposed in the CS100.1x Introduction to Big Data with Apache Spark by Anthony D. Joseph on edX, that is also publicly available since 2014 at Spark Summit. Starting from there, I've added with minor modifications to use a larger dataset, then code about how to store and reload the model for later use, and finally a web service using Flask.

In any case, the use of this algorithm with this dataset is not new (you can Google about it), and this is because we put the emphasis on ending up with a usable model in an on-line environment, and how to use it in different situations. But I truly got inspired by solving the exercise proposed in that course, and I highly recommend you to take it. There you will learn not just ALS but many other Spark algorithms.

It is the second part of the tutorial the one that explains how to use Python/Flask for building a web-service on top of Spark models. By doing so, you will be able to develop a complete on-line movie recommendation service.

Part I: Building the recommender

Part II: Building and running the web service

Quick start

The file server/server.py starts a CherryPy server running a Flask app.py to start a RESTful web server wrapping a Spark-based engine.py context. Through its API we can perform on-line movie recommendations.

Please, refer the the second notebook for detailed instructions on how to run and use the service.

Contributing

Contributions are welcome! For bug reports or requests please submit an issue.

Contact

Feel free to contact me to discuss any issues, questions, or comments.

License

This repository contains a variety of content; some developed by Jose A. Dianes, and some from third-parties. The third-party content is distributed under the license provided by those parties.

The content developed by Jose A. Dianes is distributed under the following license:

Copyright 2016 Jose A Dianes

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
Jose A Dianes
Principal Data Scientist at Mosaic Therapeutics.
Jose A Dianes
Movies/TV Recommender

recommender Movies/TV Recommender. Recommends Movies, TV Shows, Actors, Directors, Writers. Setup Create file API_KEY and paste your TMDB API key in i

Aviem Zur 3 Apr 22, 2022
An open source movie recommendation WebApp build by movie buffs and mathematicians that uses cosine similarity on the backend.

Movie Pundit Find your next flick by asking the (almost) all-knowing Movie Pundit Jump to Project Source » View Demo · Report Bug · Request Feature Ta

Kapil Pramod Deshmukh 8 May 28, 2022
Elliot is a comprehensive recommendation framework that analyzes the recommendation problem from the researcher's perspective.

Comprehensive and Rigorous Framework for Reproducible Recommender Systems Evaluation

Information Systems Lab @ Polytechnic University of Bari 215 Nov 29, 2022
Cross Domain Recommendation via Bi-directional Transfer Graph Collaborative Filtering Networks

Bi-TGCF Tensorflow Implementation of BiTGCF: Cross Domain Recommendation via Bi-directional Transfer Graph Collaborative Filtering Networks. in CIKM20

17 Nov 30, 2022
This is our implementation of GHCF: Graph Heterogeneous Collaborative Filtering (AAAI 2021)

GHCF This is our implementation of the paper: Chong Chen, Weizhi Ma, Min Zhang, Zhaowei Wang, Xiuqiang He, Chenyang Wang, Yiqun Liu and Shaoping Ma. 2

Chong Chen 53 Dec 05, 2022
Learning Fair Representations for Recommendation: A Graph-based Perspective, WWW2021

FairGo WWW2021 Learning Fair Representations for Recommendation: A Graph-based Perspective As a key application of artificial intelligence, recommende

lei 39 Oct 26, 2022
A library of metrics for evaluating recommender systems

recmetrics A python library of evalulation metrics and diagnostic tools for recommender systems. **This library is activly maintained. My goal is to c

Claire Longo 458 Jan 06, 2023
RecList is an open source library providing behavioral, "black-box" testing for recommender systems.

RecList is an open source library providing behavioral, "black-box" testing for recommender systems.

Jacopo Tagliabue 375 Dec 30, 2022
Books Recommendation With Python

Books-Recommendation Business Problem During the last few decades, with the rise

Çağrı Karadeniz 7 Mar 12, 2022
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

Spotify 10.6k Jan 01, 2023
A tensorflow implementation of the RecoGCN model in a CIKM'19 paper, titled with "Relation-Aware Graph Convolutional Networks for Agent-Initiated Social E-Commerce Recommendation".

This repo contains a tensorflow implementation of RecoGCN and the experiment dataset Running the RecoGCN model python train.py Example training outp

xfl15 30 Nov 25, 2022
EXEMPLO DE SISTEMA ESPECIALISTA PARA RECOMENDAR SERIADOS EM PYTHON

exemplo-de-sistema-especialista EXEMPLO DE SISTEMA ESPECIALISTA PARA RECOMENDAR SERIADOS EM PYTHON Resumo O objetivo de auxiliar o usuário na escolha

Josue Lopes 3 Aug 31, 2021
A recommendation system for suggesting new books given similar books.

Book Recommendation System A recommendation system for suggesting new books given similar books. Datasets Dataset Kaggle Dataset Notebooks goodreads-E

Sam Partee 2 Jan 06, 2022
Recommendation System to recommend top books from the dataset

recommendersystem Recommendation System to recommend top books from the dataset Introduction The recom.py is the main program code. The dataset is als

Vishal karur 1 Nov 15, 2021
Real time recommendation playground

concierge A continuous learning collaborative filter1 deployed with a light web server2. Distributed updates are live (real time pubsub + delta traini

Mark Essel 16 Nov 07, 2022
Detecting Beneficial Feature Interactions for Recommender Systems, AAAI 2021

Detecting Beneficial Feature Interactions for Recommender Systems (L0-SIGN) This is our implementation for the paper: Su, Y., Zhang, R., Erfani, S., &

26 Nov 22, 2022
Knowledge-aware Coupled Graph Neural Network for Social Recommendation

KCGN AAAI-2021 《Knowledge-aware Coupled Graph Neural Network for Social Recommendation》 Environments python 3.8 pytorch-1.6 DGL 0.5.3 (https://github.

xhc 22 Nov 18, 2022
Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch

Recommendation engines are one of the most well known, widely used and highest value use cases for applying machine learning. Despite this, while there are many resources available for the basics of

International Business Machines 793 Dec 18, 2022
Implementation of a hadoop based movie recommendation system

Implementation-of-a-hadoop-based-movie-recommendation-system 通过编写代码,设计一个基于Hadoop的电影推荐系统,通过此推荐系统的编写,掌握在Hadoop平台上的文件操作,数据处理的技能。windows 10 hadoop 2.8.3 p

汝聪(Ricardo) 5 Oct 02, 2022
Bundle Graph Convolutional Network

Bundle Graph Convolutional Network This is our Pytorch implementation for the paper: Jianxin Chang, Chen Gao, Xiangnan He, Depeng Jin and Yong Li. Bun

55 Dec 25, 2022