Content-based recommenders
Content-based recommenders rely purely on the features of items. Conceptually, such a recommender can be expressed as a personalized model of the form:
$$ score \sim (user, item\_feature_1, item\_feature_2, ..., item\_feature_n) $$
or as a non-personalized model of the form:
$$ score \sim (item\_feature_1, item\_feature_2, ..., item\_feature_n) $$
+ Content-based recommenders do not suffer from the cold-start problem for new items.
- They do not use information about complex patterns of user-item interactions, that is, what other similar users have already discovered and liked.
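To make the non-personalized form concrete, here is a minimal sketch, assuming a toy catalogue with three made-up binary genre features and invented average ratings (none of this data comes from MovieLens). It fits a least-squares model of the score on the item features and then scores an item that has no ratings yet, which is why new items do not cause a cold-start problem.

```python
import numpy as np

# Hypothetical toy data: rows are rated items, columns are binary genre features
# (action, comedy, drama). All values are invented for illustration only.
item_features = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
], dtype=float)
avg_ratings = np.array([4.0, 3.0, 3.5, 3.5])

# Add an intercept column and solve the least-squares problem score ~ item features
X = np.hstack([np.ones((len(item_features), 1)), item_features])
coef, *_ = np.linalg.lstsq(X, avg_ratings, rcond=None)

# A brand-new item (action + drama) with no interactions can still be scored
new_item = np.array([1.0, 1.0, 0.0, 1.0])  # intercept, action, comedy, drama
new_item_score = float(new_item @ coef)
print("Score for the unseen item: {:.2f}".format(new_item_score))
```

The score depends only on the item's features, so every user would receive the same value; a personalized variant would need user information among the inputs.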
%matplotlib inline
%load_ext autoreload
%autoreload 2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import Markdown, display, HTML
from collections import defaultdict
from sklearn.model_selection import KFold
# Fix the dying kernel problem (only a problem in some installations - you can remove it, if it works without it)
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'
Load the data
ml_ratings_df = pd.read_csv(os.path.join("data", "movielens_small", "ratings.csv")).rename(columns={'userId': 'user_id', 'movieId': 'item_id'})
ml_movies_df = pd.read_csv(os.path.join("data", "movielens_small", "movies.csv")).rename(columns={'movieId': 'item_id'})
ml_df = pd.merge(ml_ratings_df, ml_movies_df, on='item_id')
ml_df.head(10)
display(HTML(ml_movies_df.head(10).to_html()))
# Filter the data to reduce the number of movies
rng = np.random.RandomState(seed=6789)
left_ids = rng.choice(ml_movies_df['item_id'], size=100, replace=False)
ml_ratings_df = ml_ratings_df.loc[ml_ratings_df['item_id'].isin(left_ids)]
ml_movies_df = ml_movies_df.loc[ml_movies_df['item_id'].isin(left_ids)]
ml_df = ml_df.loc[ml_df['item_id'].isin(left_ids)]
print("Number of left interactions: {}".format(len(ml_ratings_df)))
Recommender class
Remark: The docstrings are written in reStructuredText (reST), which Sphinx uses to generate code documentation automatically. It is also the default docstring format in PyCharm (type triple quotes after defining a class or a method and hit Enter).
class Recommender(object):
    """
    Base recommender class.
    """

    def __init__(self):
        """
        Initialize base recommender params and variables.
        """
        pass

    def fit(self, interactions_df, users_df, items_df):
        """
        Training of the recommender.
        :param pd.DataFrame interactions_df: DataFrame with recorded interactions between users and items
            defined by user_id, item_id and features of the interaction.
        :param pd.DataFrame users_df: DataFrame with users and their features defined by user_id and the user feature columns.
        :param pd.DataFrame items_df: DataFrame with items and their features defined by item_id and the item feature columns.
        """
        pass

    def recommend(self, users_df, items_df, n_recommendations=1):
        """
        Serving of recommendations. Scores items in items_df for each user in users_df and returns
        top n_recommendations for each user.
        :param pd.DataFrame users_df: DataFrame with users and their features for which recommendations should be generated.
        :param pd.DataFrame items_df: DataFrame with items and their features which should be scored.
        :param int n_recommendations: Number of recommendations to be returned for each user.
        :return: DataFrame with user_id, item_id and score as columns returning n_recommendations top recommendations
            for each user.
        :rtype: pd.DataFrame
        """
        recommendations = pd.DataFrame(columns=['user_id', 'item_id', 'score'])
        for ix, user in users_df.iterrows():
            # Placeholder output: a dummy item_id of -1 with a constant score of 3.0
            user_recommendations = pd.DataFrame({'user_id': user['user_id'],
                                                 'item_id': [-1] * n_recommendations,
                                                 'score': [3.0] * n_recommendations})
            recommendations = pd.concat([recommendations, user_recommendations])
        return recommendations
Evaluation measures
Explicit feedback - ratings
RMSE - Root Mean Squared Error
$$ RMSE = \sqrt{\frac{\sum_{i=1}^N (\hat{r}_i - r_i)^2}{N}} $$
where $\hat{r}_i$ are the predicted ratings, $r_i$ are the real ratings and $N$ is the number of items in the test set.
+ Very well-behaved analytically and therefore extensively used to train models, especially neural networks.
- The scale of the errors depends on the data, which reduces comparability between different datasets.
def rmse(r_pred, r_real):
    return np.sqrt(np.sum(np.power(r_pred - r_real, 2)) / len(r_pred))
# Test
print("RMSE = {:.2f}".format(rmse(np.array([2.1, 1.2, 3.8, 4.2, 3.6]), np.array([3, 2, 4, 5, 1]))))
MRE - Mean Relative Error
$$ MRE = \frac{1}{N} \sum_{i=1}^N \frac{|\hat{r}_i - r_i|}{|r_i|} $$
where $\hat{r}_i$ are the predicted ratings, $r_i$ are the real ratings and $N$ is the number of items in the test set.
+ Easily interpretable (average percentage error) and with a meaning understandable for business.
- Blows up when there are values close to zero among the real ratings (the values being predicted), because each error is divided by $|r_i|$.
def mre(r_pred, r_real):
    return 1 / len(r_pred) * np.sum(np.abs(r_pred - r_real) / np.abs(r_real))
# Test
print("MRE = {:.4f}".format(mre(np.array([2.1, 1.2, 3.8, 4.2, 3.6]), np.array([3, 2, 4, 5, 1]))))
TRE - Total Relative Error
$$ TRE = \frac{\sum_{i=1}^N |\hat{r}_i - r_i|}{\sum_{i=1}^N |r_i|} $$
where $\hat{r}_i$ are the predicted ratings, $r_i$ are the real ratings and $N$ is the number of items in the test set.
+ Easily interpretable (total percentage error) and with a meaning understandable for business.
+ Reliable even when some of the real ratings are very small, because the errors are divided by the sum of all real ratings.
- Does not distinguish between a case when one prediction is very bad and the others are very good and a case when all predictions are mediocre.
def tre(r_pred, r_real):
    return np.sum(np.abs(r_pred - r_real)) / np.sum(np.abs(r_real))
# Test
print("TRE = {:.4f}".format(tre(np.array([2.1, 1.2, 3.8, 4.2, 3.6]), np.array([3, 2, 4, 5, 1]))))
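The three measures can be compared side by side on the test vectors used above. The sketch below recomputes them with plain numpy (the variable names are illustrative, not part of the notebook's API) and shows how much of the MRE comes from the single rating whose real value is 1.

```python
import numpy as np

# Same vectors as in the tests above
r_pred = np.array([2.1, 1.2, 3.8, 4.2, 3.6])
r_real = np.array([3.0, 2.0, 4.0, 5.0, 1.0])

abs_err = np.abs(r_pred - r_real)
rel_err = abs_err / np.abs(r_real)

rmse_value = np.sqrt(np.mean(abs_err ** 2))
mre_value = np.mean(rel_err)
tre_value = np.sum(abs_err) / np.sum(np.abs(r_real))

# Fraction of the summed relative error that comes from the last rating (real value 1)
last_share = rel_err[-1] / np.sum(rel_err)

print("RMSE = {:.4f}, MRE = {:.4f}, TRE = {:.4f}".format(rmse_value, mre_value, tre_value))
print("Share of MRE from the last rating: {:.1%}".format(last_share))
```

One badly predicted low rating accounts for most of the MRE here, while the TRE stays at roughly half of that value because it divides by the sum of all real ratings.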
Implicit feedback - binary indicators of interactions
HR@n - Hit Ratio
How many hits did we score in the first $n$ recommendations?
$$ HR@n = \frac{\sum_{u} \sum_{i} r_{u, i} \cdot 1_{\hat{D}_n(u)}(i)}{M} $$
where:
- $r_{u, i}$ is $1$ if there was an interaction between user $u$ and item $i$ in the test set and $0$ otherwise,
- $\hat{D}_n(u)$ is the set of the first $n$ recommendations for user $u$,
- $1_{\hat{D}_n(u)}(i)$ is $1$ if and only if $i \in \hat{D}_n(u)$, otherwise it's equal to $0$,
- $M$ is the number of users.
+ Easily interpretable.
- Does not take the rank of each recommendation into account.
def hr(recommendations, real_interactions, n=1):
    """
    Assumes recommendations are ordered by user_id and then by score.
    """
    # Transform real_interactions to a dict for a large speed-up
    rui = defaultdict(lambda: 0)
    for idx, row in real_interactions.iterrows():
        rui[(row['user_id'], row['item_id'])] = 1

    hr = 0.0
    previous_user_id = -1
    rank = 0
    for idx, row in recommendations.iterrows():
        # The rank restarts at 1 whenever a new user's block of recommendations begins
        if previous_user_id == row['user_id']:
            rank += 1
        else:
            rank = 1
        if rank <= n:
            hr += rui[(row['user_id'], row['item_id'])]
        previous_user_id = row['user_id']

    hr /= len(recommendations['user_id'].unique())
    return hr
recommendations = pd.DataFrame(
[
[1, 13, 0.9],
[1, 45, 0.8],
[1, 22, 0.71],
[1, 77, 0.55],
[1, 9, 0.52],
[2, 11, 0.85],
[2, 13, 0.69],
[2, 25, 0.64],
[2, 6, 0.60],
[2, 77, 0.53]
], columns=['user_id', 'item_id', 'score'])
display(HTML(recommendations.to_html()))
real_interactions = pd.DataFrame(
[
[1, 45],
[1, 22],
[1, 77],
[2, 13],
[2, 77]
], columns=['user_id', 'item_id'])
display(HTML(real_interactions.to_html()))
print("HR@3 = {:.4f}".format(hr(recommendations, real_interactions, n=3)))
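As a cross-check of the loop above, HR@n can also be computed with pandas operations. This is a sketch only: `hr_vectorized` is a hypothetical helper name, and unlike `hr` it sorts the recommendations itself instead of assuming they arrive ordered.

```python
import pandas as pd

def hr_vectorized(recommendations, real_interactions, n=1):
    """Hypothetical helper: HR@n computed with pandas instead of a Python loop."""
    # Order each user's recommendations by descending score and keep the top n
    ranked = recommendations.sort_values(['user_id', 'score'], ascending=[True, False])
    top_n = ranked.groupby('user_id').head(n)
    # A hit is a recommended (user, item) pair that also appears in the test interactions
    real_pairs = real_interactions[['user_id', 'item_id']].drop_duplicates()
    hits = pd.merge(top_n, real_pairs, on=['user_id', 'item_id'])
    return len(hits) / recommendations['user_id'].nunique()

recs = pd.DataFrame(
    [[1, 13, 0.9], [1, 45, 0.8], [1, 22, 0.71], [1, 77, 0.55], [1, 9, 0.52],
     [2, 11, 0.85], [2, 13, 0.69], [2, 25, 0.64], [2, 6, 0.60], [2, 77, 0.53]],
    columns=['user_id', 'item_id', 'score'])
real = pd.DataFrame([[1, 45], [1, 22], [1, 77], [2, 13], [2, 77]],
                    columns=['user_id', 'item_id'])

hr_at_3 = hr_vectorized(recs, real, n=3)
print("HR@3 = {:.4f}".format(hr_at_3))
```

On this example user 1 has two hits in the top 3 and user 2 has one, so the value is 3 / 2 = 1.5. The measure can exceed 1 whenever users have more than one test interaction; in leave-one-out evaluation, with one held-out item per user, it stays between 0 and 1.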
NDCG@n - Normalized Discounted Cumulative Gain
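How many hits did we score in the first $n$ recommendations, discounted by the position of each recommendation?
$$ NDCG@n = \frac{\sum_{u} \sum_{i \in \hat{D}_n(u)} \frac{r_{u, i}}{\log_2\left(1 + v_{\hat{D}_n(u)}(i)\right)}}{M} $$
where:
- $r_{u, i}$ is $1$ if there was an interaction between user $u$ and item $i$ in the test set and $0$ otherwise,
- $\hat{D}_n(u)$ is the set of the first $n$ recommendations for user $u$,
- $v_{\hat{D}_n(u)}(i)$ is the position of item $i$ in recommendations $\hat{D}_n(u)$,
- $M$ is the number of users.
As implemented below, the sum is not divided by an ideal DCG. With a single held-out item per user, as in leave-one-out evaluation, the ideal DCG equals $1$, so the two definitions coincide.
+ Takes the rank of each recommendation into account.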
def ndcg(recommendations, real_interactions, n=1):
    """
    Assumes recommendations are ordered by user_id and then by score.
    """
    # Transform real_interactions to a dict for a large speed-up
    rui = defaultdict(lambda: 0)
    for idx, row in real_interactions.iterrows():
        rui[(row['user_id'], row['item_id'])] = 1

    ndcg = 0.0
    previous_user_id = -1
    rank = 0
    for idx, row in recommendations.iterrows():
        # The rank restarts at 1 whenever a new user's block of recommendations begins
        if previous_user_id == row['user_id']:
            rank += 1
        else:
            rank = 1
        if rank <= n:
            ndcg += rui[(row['user_id'], row['item_id'])] / np.log2(1 + rank)
        previous_user_id = row['user_id']

    ndcg /= len(recommendations['user_id'].unique())
    return ndcg
recommendations = pd.DataFrame(
[
[1, 13, 0.9],
[1, 45, 0.8],
[1, 22, 0.71],
[1, 77, 0.55],
[1, 9, 0.52],
[2, 11, 0.85],
[2, 13, 0.69],
[2, 25, 0.64],
[2, 6, 0.60],
[2, 77, 0.53]
], columns=['user_id', 'item_id', 'score'])
display(HTML(recommendations.to_html()))
real_interactions = pd.DataFrame(
[
[1, 45],
[1, 22],
[1, 77],
[2, 13],
[2, 77]
], columns=['user_id', 'item_id'])
display(HTML(real_interactions.to_html()))
print("NDCG@3 = {:.4f}".format(ndcg(recommendations, real_interactions, n=3)))
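When users have several test interactions, a variant that divides each user's discounted gain by the best achievable value keeps the result between 0 and 1. The sketch below is one possible way to do that, not the notebook's method; `ndcg_normalized` is a hypothetical name, and the ideal DCG is assumed to place all of the user's test items (at most n) at the top ranks.

```python
import numpy as np
import pandas as pd

def ndcg_normalized(recommendations, real_interactions, n=1):
    """Hypothetical variant: per-user DCG@n divided by the ideal DCG@n, averaged over users."""
    relevant = set(zip(real_interactions['user_id'].tolist(),
                       real_interactions['item_id'].tolist()))
    n_relevant = real_interactions.groupby('user_id')['item_id'].nunique()
    ranked = recommendations.sort_values(['user_id', 'score'], ascending=[True, False])
    scores = []
    for user_id, group in ranked.groupby('user_id'):
        top_items = group['item_id'].tolist()[:n]
        dcg = sum(1.0 / np.log2(1 + rank)
                  for rank, item_id in enumerate(top_items, start=1)
                  if (user_id, item_id) in relevant)
        # Ideal DCG: all of the user's relevant items (at most n) placed at the top ranks
        n_ideal = min(n, int(n_relevant.get(user_id, 0)))
        idcg = sum(1.0 / np.log2(1 + rank) for rank in range(1, n_ideal + 1))
        scores.append(dcg / idcg if idcg > 0 else 0.0)
    return float(np.mean(scores))

recs = pd.DataFrame(
    [[1, 13, 0.9], [1, 45, 0.8], [1, 22, 0.71], [1, 77, 0.55], [1, 9, 0.52],
     [2, 11, 0.85], [2, 13, 0.69], [2, 25, 0.64], [2, 6, 0.60], [2, 77, 0.53]],
    columns=['user_id', 'item_id', 'score'])
real = pd.DataFrame([[1, 45], [1, 22], [1, 77], [2, 13], [2, 77]],
                    columns=['user_id', 'item_id'])

ndcg_at_3 = ndcg_normalized(recs, real, n=3)
print("Normalized NDCG@3 = {:.4f}".format(ndcg_at_3))
```

For user 1 the discounted gain is 1/log2(3) + 1/log2(4) against an ideal of 1 + 1/log2(3) + 1/log2(4); for user 2 it is 1/log2(3) against an ideal of 1 + 1/log2(3). The average of the two ratios is about 0.46.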
Testing routines (offline)
Train and test set split
Explicit feedback
def evaluate_train_test_split_explicit(recommender, interactions_df, items_df, seed=6789):
    rng = np.random.RandomState(seed=seed)

    # Split the dataset into train and test
    shuffle = np.arange(len(interactions_df))
    rng.shuffle(shuffle)
    shuffle = list(shuffle)
    train_test_split = 0.8
    split_index = int(len(interactions_df) * train_test_split)
    interactions_df_train = interactions_df.iloc[shuffle[:split_index]]
    interactions_df_test = interactions_df.iloc[shuffle[split_index:]]

    # Train the recommender
    recommender.fit(interactions_df_train, None, items_df)

    # Gather predictions (one user-item pair at a time)
    r_pred = []
    for idx, row in interactions_df_test.iterrows():
        users_df = pd.DataFrame([row['user_id']], columns=['user_id'])
        eval_items_df = pd.DataFrame([row['item_id']], columns=['item_id'])
        eval_items_df = pd.merge(eval_items_df, items_df, on='item_id')
        recommendations = recommender.recommend(users_df, eval_items_df, n_recommendations=1)
        r_pred.append(recommendations.iloc[0]['score'])
    # Convert to an array so that the metric functions operate element-wise
    r_pred = np.array(r_pred, dtype=float)

    # Gather real ratings
    r_real = np.array(interactions_df_test['rating'].tolist())

    # Return evaluation metrics
    return rmse(r_pred, r_real), mre(r_pred, r_real), tre(r_pred, r_real)
recommender = Recommender()
results = [['BaseRecommender'] + list(evaluate_train_test_split_explicit(
recommender, ml_ratings_df.loc[:, ['user_id', 'item_id', 'rating']], ml_movies_df))]
results = pd.DataFrame(results,
columns=['Recommender', 'RMSE', 'MRE', 'TRE'])
display(HTML(results.to_html()))
Implicit feedback
Task 1. Implement the following method for train-test split evaluation for implicit feedback.
def evaluate_train_test_split_implicit(recommender, interactions_df, items_df, seed=6789):
    # Write your code here
    pass
Leave-one-out, leave-k-out, cross-validation
Explicit feedback
Task 2. Implement the following method for leave-one-out evaluation for explicit feedback.
def evaluate_leave_one_out_explicit(recommender, interactions_df, items_df, max_evals=100, seed=6789):
    # Write your code here
    pass
Implicit feedback
def evaluate_leave_one_out_implicit(recommender, interactions_df, items_df, max_evals=10, seed=6789):
    rng = np.random.RandomState(seed=seed)

    # Prepare splits of the datasets (one fold per interaction, i.e. leave-one-out)
    kf = KFold(n_splits=len(interactions_df), random_state=rng, shuffle=True)

    hr_1 = []
    hr_3 = []
    hr_5 = []
    hr_10 = []
    ndcg_1 = []
    ndcg_3 = []
    ndcg_5 = []
    ndcg_10 = []

    # For each split of the dataset train the recommender, generate recommendations and evaluate
    n_eval = 1
    for train_index, test_index in kf.split(interactions_df.index):
        interactions_df_train = interactions_df.loc[interactions_df.index[train_index]]
        interactions_df_test = interactions_df.loc[interactions_df.index[test_index]]

        recommender.fit(interactions_df_train, None, items_df)
        recommendations = recommender.recommend(interactions_df_test.loc[:, ['user_id']], items_df, n_recommendations=10)

        hr_1.append(hr(recommendations, interactions_df_test, n=1))
        hr_3.append(hr(recommendations, interactions_df_test, n=3))
        hr_5.append(hr(recommendations, interactions_df_test, n=5))
        hr_10.append(hr(recommendations, interactions_df_test, n=10))
        ndcg_1.append(ndcg(recommendations, interactions_df_test, n=1))
        ndcg_3.append(ndcg(recommendations, interactions_df_test, n=3))
        ndcg_5.append(ndcg(recommendations, interactions_df_test, n=5))
        ndcg_10.append(ndcg(recommendations, interactions_df_test, n=10))

        # Stop after max_evals folds to keep the running time manageable
        if n_eval == max_evals:
            break
        n_eval += 1

    hr_1 = np.mean(hr_1)
    hr_3 = np.mean(hr_3)
    hr_5 = np.mean(hr_5)
    hr_10 = np.mean(hr_10)
    ndcg_1 = np.mean(ndcg_1)
    ndcg_3 = np.mean(ndcg_3)
    ndcg_5 = np.mean(ndcg_5)
    ndcg_10 = np.mean(ndcg_10)

    return hr_1, hr_3, hr_5, hr_10, ndcg_1, ndcg_3, ndcg_5, ndcg_10
recommender = Recommender()
results = [['BaseRecommender'] + list(evaluate_leave_one_out_implicit(
recommender, ml_ratings_df.loc[:, ['user_id', 'item_id']], ml_movies_df))]
results = pd.DataFrame(results,
columns=['Recommender', 'HR@1', 'HR@3', 'HR@5', 'HR@10', 'NDCG@1', 'NDCG@3', 'NDCG@5', 'NDCG@10'])
display(HTML(results.to_html()))
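The leave-one-out routine above relies on the fact that `KFold` with as many splits as there are rows holds out exactly one row per fold. The short sketch below illustrates this on a hypothetical dataset of six rows; the row count and the integer seed are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import KFold

n_rows = 6  # hypothetical number of interactions
kf = KFold(n_splits=n_rows, shuffle=True, random_state=6789)

test_rows = []
for train_index, test_index in kf.split(np.arange(n_rows)):
    # Each fold holds out exactly one row and trains on all the others
    assert len(test_index) == 1 and len(train_index) == n_rows - 1
    test_rows.append(int(test_index[0]))

print("Held-out rows in visiting order:", test_rows)
```

Every row is held out exactly once if the loop runs to completion. With `max_evals` smaller than the number of rows, only the first `max_evals` folds are evaluated, so the reported metrics are averages over a random sample of held-out interactions.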
Linear Regression Recommender
For every movie we transform its genres into one-hot encoded features and then fit a linear regression model to those features and actual ratings.
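The one-hot encoding step can be tried in isolation. The sketch below uses three made-up genre strings in the MovieLens `a|b|c` format and applies the same normalization as the recommender (replace "-" and spaces with "_", lowercase, split on "|") before passing them to `MultiLabelBinarizer`.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical genre strings in the MovieLens "a|b|c" format
genres = ["Adventure|Comedy", "Comedy|Sci-Fi", "Drama"]

# Same normalization as in the recommender: unify separators, lowercase, split on "|"
genre_lists = [g.replace("-", "_").replace(" ", "_").lower().split("|") for g in genres]

mlb = MultiLabelBinarizer()
one_hot = mlb.fit_transform(genre_lists)

print(list(mlb.classes_))  # column order of the encoded matrix
print(one_hot)             # one row per movie, one column per genre
```

Each row has a 1 in the column of every genre the movie belongs to, so a movie with two genres gets two non-zero entries. These columns are the features the linear regression model is fitted on.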
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MultiLabelBinarizer
class LinearRegressionRecommender(object):
    """
    Recommender based on a linear regression model fitted to one-hot encoded genres.
    """

    def __init__(self):
        """
        Initialize base recommender params and variables.
        """
        self.model = None
        self.mlb = None

    def fit(self, interactions_df, users_df, items_df):
        """
        Training of the recommender.
        :param pd.DataFrame interactions_df: DataFrame with recorded interactions between users and items
            defined by user_id, item_id and features of the interaction.
        :param pd.DataFrame users_df: DataFrame with users and their features defined by user_id and the user feature columns.
        :param pd.DataFrame items_df: DataFrame with items and their features defined by item_id and the item feature columns.
        """
        interactions_df = pd.merge(interactions_df, items_df, on='item_id')

        # Normalize genre names and split them into lists
        interactions_df.loc[:, 'genres'] = interactions_df['genres'].str.replace("-", "_", regex=False)
        interactions_df.loc[:, 'genres'] = interactions_df['genres'].str.replace(" ", "_", regex=False)
        interactions_df.loc[:, 'genres'] = interactions_df['genres'].str.lower()
        interactions_df.loc[:, 'genres'] = interactions_df['genres'].str.split("|")

        # One-hot encode the genre lists
        self.mlb = MultiLabelBinarizer()
        interactions_df = interactions_df.join(
            pd.DataFrame(self.mlb.fit_transform(interactions_df.pop('genres')),
                         columns=self.mlb.classes_,
                         index=interactions_df.index))
        # print(interactions_df.head())

        # Fit the regression: genre indicators -> rating
        x = interactions_df.loc[:, self.mlb.classes_].values
        y = interactions_df['rating'].values
        self.model = LinearRegression().fit(x, y)
    def recommend(self, users_df, items_df, n_recommendations=1):
        """
        Serving of recommendations. Scores items in items_df for each user in users_df and returns
        top n_recommendations for each user.
        :param pd.DataFrame users_df: DataFrame with users and their features for which recommendations should be generated.
        :param pd.DataFrame items_df: DataFrame with items and their features which should be scored.
        :param int n_recommendations: Number of recommendations to be returned for each user.
        :return: DataFrame with user_id, item_id and score as columns returning n_recommendations top recommendations
            for each user.
        :rtype: pd.DataFrame
        """
        # Transform the items to be scored into proper features
        items_df = items_df.copy()
        items_df.loc[:, 'genres'] = items_df['genres'].str.replace("-", "_", regex=False)
        items_df.loc[:, 'genres'] = items_df['genres'].str.replace(" ", "_", regex=False)
        items_df.loc[:, 'genres'] = items_df['genres'].str.lower()
        items_df.loc[:, 'genres'] = items_df['genres'].str.split("|")
        items_df = items_df.join(
            pd.DataFrame(self.mlb.transform(items_df.pop('genres')),
                         columns=self.mlb.classes_,
                         index=items_df.index))
        # print(items_df)

        # Score all items once (the model is not personalized, so every user gets the same scores)
        # and keep the n_recommendations best ones; previously only the first item was scored
        scores = self.model.predict(items_df.loc[:, self.mlb.classes_].values)
        scored_items = pd.DataFrame({'item_id': items_df['item_id'].values, 'score': scores})
        scored_items = scored_items.sort_values('score', ascending=False).head(n_recommendations)

        recommendations = pd.DataFrame(columns=['user_id', 'item_id', 'score'])
        for ix, user in users_df.iterrows():
            user_recommendations = scored_items.copy()
            user_recommendations.insert(0, 'user_id', user['user_id'])
            recommendations = pd.concat([recommendations, user_recommendations])
        return recommendations
# Quick test of the recommender
lr_recommender = LinearRegressionRecommender()
lr_recommender.fit(ml_ratings_df, None, ml_movies_df)
recommendations = lr_recommender.recommend(pd.DataFrame([[1], [2]], columns=['user_id']), ml_movies_df, 1)
recommendations = pd.merge(recommendations, ml_movies_df, on='item_id')
display(HTML(recommendations.to_html()))
lr_recommender = LinearRegressionRecommender()
results = [['LinearRegressionRecommender'] + list(evaluate_train_test_split_explicit(
lr_recommender, ml_ratings_df.loc[:, ['user_id', 'item_id', 'rating']], ml_movies_df, seed=6789))]
results = pd.DataFrame(results,
columns=['Recommender', 'RMSE', 'MRE', 'TRE'])
display(HTML(results.to_html()))
TF-IDF Recommender
TF-IDF stands for term frequency–inverse document frequency. Typically, the TF-IDF method is used to assign keywords (words describing the gist of a document) to documents in a corpus.
In our case we will treat users as documents and genres as words.
Term frequency is given by the following formula:
$$ \text{tf}(g, u) = f_{g, u} $$
where $f_{g, u}$ is the number of times genre $g$ appears among the movies watched by user $u$.
Inverse document frequency is defined as follows:
$$ \text{idf}(g) = \log \frac{N}{n_g} $$
where $N$ is the number of users and $n_g$ is the number of users with $g$ in their genres list.
Finally, tf-idf is defined as follows:
$$ \text{tfidf}(g, u) = \text{tf}(g, u) \cdot \text{idf}(g) $$
In our case we will measure how often a given genre appears among the movies watched by a given user versus how often it appears for all users. To obtain a movie score we will take the average of its genres' scores for this user.
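The formulas above can be followed by hand on a small example. The sketch below assumes three hypothetical users and invented genre counts, computes tf, idf and tf-idf exactly as defined, and then scores one movie for one user by averaging the tf-idf values of its genres.

```python
import numpy as np

genres = ["action", "comedy", "drama"]

# Hypothetical counts f_{g,u}: rows are users, columns are genres
tf = np.array([
    [3, 1, 0],   # user A
    [0, 2, 0],   # user B
    [1, 1, 4],   # user C
], dtype=float)

n_users = tf.shape[0]
n_g = np.count_nonzero(tf, axis=0)   # number of users with genre g in their list
idf = np.log(n_users / n_g)          # idf(g) = log(N / n_g)
tfidf = tf * idf                     # tfidf(g, u) = tf(g, u) * idf(g)

# Score of an "action|drama" movie for user C: average of its genres' tf-idf scores
movie_genres = [genres.index("action"), genres.index("drama")]
score_c = tfidf[2, movie_genres].mean()
print("Score for user C: {:.4f}".format(score_c))
```

Comedy appears for every user, so its idf is log(3/3) = 0 and it contributes nothing to any score, while drama, which only user C watches, gets the highest weight. Note that scikit-learn's `TfidfVectorizer` with default settings uses a smoothed idf and L2-normalizes each document vector, so its values differ from this hand computation.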
from sklearn.feature_extraction.text import TfidfVectorizer
class TFIDFRecommender(object):
    """
    Recommender based on the TF-IDF method.
    """

    def __init__(self):
        """
        Initialize base recommender params and variables.
        """
        self.tfidf_scores = None

    def fit(self, interactions_df, users_df, items_df):
        """
        Training of the recommender.
        :param pd.DataFrame interactions_df: DataFrame with recorded interactions between users and items
            defined by user_id, item_id and features of the interaction.
        :param pd.DataFrame users_df: DataFrame with users and their features defined by user_id and the user feature columns.
        :param pd.DataFrame items_df: DataFrame with items and their features defined by item_id and the item feature columns.
        """
        self.tfidf_scores = defaultdict(lambda: 0.0)

        # Prepare the corpus for tfidf calculation: one "document" of genres per user
        interactions_df = pd.merge(interactions_df, items_df, on='item_id')
        user_genres = interactions_df.loc[:, ['user_id', 'genres']].copy()
        user_genres.loc[:, 'genres'] = user_genres['genres'].str.replace("-", "_", regex=False)
        user_genres.loc[:, 'genres'] = user_genres['genres'].str.replace(" ", "_", regex=False)
        user_genres = user_genres.groupby('user_id').aggregate(lambda x: "|".join(x))
        user_genres.loc[:, 'genres'] = user_genres['genres'].str.replace("|", " ", regex=False)
        # print(user_genres)
        user_ids = user_genres.index.tolist()
        genres_corpus = user_genres['genres'].tolist()

        # Calculate tf-idf scores (the vectorizer lowercases tokens by default)
        vectorizer = TfidfVectorizer()
        tfidf_scores = vectorizer.fit_transform(genres_corpus)

        # Transform results into a dict {(user_id, genre): score}
        # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() instead
        feature_names = vectorizer.get_feature_names_out()
        for u in range(tfidf_scores.shape[0]):
            for g in range(tfidf_scores.shape[1]):
                self.tfidf_scores[(user_ids[u], feature_names[g])] = tfidf_scores[u, g]
        # print(self.tfidf_scores)
    def recommend(self, users_df, items_df, n_recommendations=1):
        """
        Serving of recommendations. Scores items in items_df for each user in users_df and returns
        top n_recommendations for each user.
        :param pd.DataFrame users_df: DataFrame with users and their features for which recommendations should be generated.
        :param pd.DataFrame items_df: DataFrame with items and their features which should be scored.
        :param int n_recommendations: Number of recommendations to be returned for each user.
        :return: DataFrame with user_id, item_id and score as columns returning n_recommendations top recommendations
            for each user.
        :rtype: pd.DataFrame
        """
        recommendations = pd.DataFrame(columns=['user_id', 'item_id', 'score'])

        # Transform genres to a unified form used by the vectorizer
        items_df = items_df.copy()
        items_df.loc[:, 'genres'] = items_df['genres'].str.replace("-", "_", regex=False)
        items_df.loc[:, 'genres'] = items_df['genres'].str.replace(" ", "_", regex=False)
        items_df.loc[:, 'genres'] = items_df['genres'].str.lower()
        items_df.loc[:, 'genres'] = items_df['genres'].str.split("|")

        # Score items
        for uix, user in users_df.iterrows():
            items = []
            for iix, item in items_df.iterrows():
                # The item score is the average tf-idf score of its genres for this user
                score = 0.0
                for genre in item['genres']:
                    score += self.tfidf_scores[(user['user_id'], genre)]
                score /= len(item['genres'])
                items.append((item['item_id'], score))

            items = sorted(items, key=lambda x: x[1], reverse=True)
            user_recommendations = pd.DataFrame({'user_id': user['user_id'],
                                                 'item_id': [item[0] for item in items][:n_recommendations],
                                                 'score': [item[1] for item in items][:n_recommendations]})
            recommendations = pd.concat([recommendations, user_recommendations])
        return recommendations
# Quick test of the recommender
tfidf_recommender = TFIDFRecommender()
tfidf_recommender.fit(ml_ratings_df, None, ml_movies_df)
recommendations = tfidf_recommender.recommend(pd.DataFrame([[1], [2]], columns=['user_id']), ml_movies_df, 3)
recommendations = pd.merge(recommendations, ml_movies_df, on='item_id')
display(HTML(recommendations.to_html()))
tfidf_recommender = TFIDFRecommender()
results = [['TFIDFRecommender'] + list(evaluate_leave_one_out_implicit(
tfidf_recommender, ml_ratings_df.loc[:, ['user_id', 'item_id']], ml_movies_df, max_evals=300, seed=6789))]
results = pd.DataFrame(results,
columns=['Recommender', 'HR@1', 'HR@3', 'HR@5', 'HR@10', 'NDCG@1', 'NDCG@3', 'NDCG@5', 'NDCG@10'])
display(HTML(results.to_html()))
Tasks
Task 3. Implement the MostPopularRecommender (check the slides for class 1), evaluate it with leave-one-out procedure for implicit feedback, print HR@1, HR@3, HR@5, HR@10, NDCG@1, NDCG@3, NDCG@5, NDCG@10.
# Write your code here
Task 4. Implement the HighestRatedRecommender (check the slides for class 1), evaluate it with leave-one-out procedure for implicit feedback, print HR@1, HR@3, HR@5, HR@10, NDCG@1, NDCG@3, NDCG@5, NDCG@10.
# Write your code here
Task 5. Implement the RandomRecommender (check the slides for class 1), evaluate it with leave-one-out procedure for implicit feedback, print HR@1, HR@3, HR@5, HR@10, NDCG@1, NDCG@3, NDCG@5, NDCG@10.
# Write your code here
Task 6. Gather the results for TFIDFRecommender, MostPopularRecommender, HighestRatedRecommender, RandomRecommender in one DataFrame and print it.
# Write your code here
Task 7*. Implement an SVRRecommender - one-hot encode genres and fit an SVR model to
(genre_1, genre_2, ..., genre_N) -> rating
Tune params of the SVR model to obtain as good results as you can.
To do tuning properly (although in practice people are often happy with leave-one-out and do not bother with dividing the set into training, validation and test sets):
- divide the set into training, validation and test sets (randomly divide the dataset in proportions 60%-20%-20%),
- train the model with different sets of tunable parameters on the training set,
- choose the best tunable params based on results on the validation set,
- provide the final evaluation metrics on the test set for the best model obtained during tuning.
Recommended method of tuning: use hyperopt. Install the package using the following command: pip install hyperopt
Print the RMSE and MAE on the test set, with the split generated with numpy using seed 6789.
# Write your code here
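The 60%-20%-20% division described in Task 7 can be sketched as a shuffle of row positions followed by two cuts. This is only one possible approach; `split_indices` is a hypothetical helper name, and only the splitting step is shown, not the tuning itself.

```python
import numpy as np

def split_indices(n_rows, seed=6789):
    """Hypothetical helper: shuffle row positions and cut them 60% / 20% / 20%."""
    rng = np.random.RandomState(seed=seed)
    shuffled = rng.permutation(n_rows)
    # Integer arithmetic avoids floating-point rounding at the cut points
    train_end = n_rows * 6 // 10
    val_end = n_rows * 8 // 10
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]

train_idx, val_idx, test_idx = split_indices(10)
print(len(train_idx), len(val_idx), len(test_idx))
```

The returned positions can be passed to `DataFrame.iloc`, for example `ml_ratings_df.iloc[train_idx]`, to obtain the three subsets. The three index arrays are disjoint, so no interaction is used for both tuning and the final evaluation.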