ium_478839/IUM03.ipynb
2022-03-27 07:37:55 -04:00

  1. Downloading the dataset
!kaggle datasets download -d slehkyi/extended-football-stats-for-european-leagues-xg
Warning: Your Kaggle API key is readable by other users on this system! To fix this, you can run 'chmod 600 /home/osboxes/.kaggle/kaggle.json'
Downloading extended-football-stats-for-european-leagues-xg.zip to /home/osboxes/jupyter_dir/notebooks/IUM03
 73%|███████████████████████████▋          | 1.00M/1.37M [00:00<00:00, 5.12MB/s]
100%|██████████████████████████████████████| 1.37M/1.37M [00:00<00:00, 3.95MB/s]
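The warning in the output above says the Kaggle API key file is readable by other users and suggests `chmod 600`. A minimal Python sketch of the same fix follows; the helper name `restrict_to_owner` is hypothetical, and the path `~/.kaggle/kaggle.json` is assumed from the warning message.

```python
import os
import stat
from pathlib import Path

def restrict_to_owner(path):
    """Set the file mode to 0o600 (owner read/write only) and return the new mode."""
    path = Path(path).expanduser()
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(path.stat().st_mode)

# Assumed location of the key file, taken from the warning text
key_file = Path("~/.kaggle/kaggle.json").expanduser()
if key_file.exists():
    print(oct(restrict_to_owner(key_file)))
```

Running this once before the download should make the warning disappear on POSIX systems.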
!unzip -o extended-football-stats-for-european-leagues-xg.zip
Archive:  extended-football-stats-for-european-leagues-xg.zip
  inflating: understat.com.csv       
  inflating: understat_per_game.csv  
  2. Renaming the file
!mv understat.com.csv understat.csv
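The extract and rename steps can also be done without shell commands, which keeps the notebook portable to systems without `unzip` and `mv`. This is only a sketch using the standard library; the function name `extract_and_rename` is hypothetical.

```python
import zipfile
from pathlib import Path

def extract_and_rename(archive, old_name, new_name, dest="."):
    """Extract every member of `archive` into `dest`, then rename one extracted file."""
    dest = Path(dest)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)  # overwrites existing files, like `unzip -o`
    target = dest / new_name
    (dest / old_name).replace(target)  # like `mv`; replaces an existing target
    return target

archive = Path("extended-football-stats-for-european-leagues-xg.zip")
if archive.exists():
    extract_and_rename(archive, "understat.com.csv", "understat.csv")
```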
  3. Renaming the columns
import pandas as pd
understat = pd.read_csv('understat.csv')
understat_per_game = pd.read_csv('understat_per_game.csv')
# The first two columns have no header in the CSV, so pandas names them 'Unnamed: N'
understat = understat.rename(columns={'Unnamed: 0': 'league', 'Unnamed: 1': 'year'})
understat.head()
    league  year  position             team  matches  wins  draws  loses  scored  missed  ...        xGA  xGA_diff      npxGA      npxGD  ppda_coef  oppda_coef  deep  deep_allowed     xpts  xpts_diff
0  La_liga  2014         1        Barcelona       38    30      4      4     110      21  ...  28.444293  7.444293  24.727907  73.049305   5.683535   16.367593   489           114  94.0813     0.0813
1  La_liga  2014         2      Real Madrid       38    30      2      6     118      38  ...  42.607198  4.607198  38.890805  47.213090  10.209085   12.929510   351           153  81.7489   -10.2511
2  La_liga  2014         3  Atletico Madrid       38    23      9      6      67      29  ...  29.069107  0.069107  26.839271  25.748737   8.982028    9.237091   197           123  73.1353    -4.8647
3  La_liga  2014         4         Valencia       38    22     11      5      70      32  ...  39.392572  7.392572  33.446477  16.257501   8.709827    7.870225   203           172  63.7068   -13.2932
4  La_liga  2014         5          Sevilla       38    23      7      8      71      45  ...  47.862742  2.862742  41.916529  20.178070   8.276148    9.477805   305           168  67.3867    -8.6133

5 rows × 24 columns
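The load and rename steps can be wrapped in one function that fails loudly if the header-less columns are not where they are expected. The following is a sketch, not the notebook's own code: the names `RENAMES` and `load_understat` are hypothetical, and the in-memory sample only mimics the shape of the first row shown above.

```python
import io
import pandas as pd

RENAMES = {"Unnamed: 0": "league", "Unnamed: 1": "year"}

def load_understat(source):
    """Read the CSV and name its two header-less leading columns."""
    df = pd.read_csv(source)
    missing = set(RENAMES) - set(df.columns)
    if missing:
        raise KeyError(f"expected columns not found: {sorted(missing)}")
    return df.rename(columns=RENAMES)

# Tiny in-memory sample with the same header-less leading columns
sample = io.StringIO(",,position,team\nLa_liga,2014,1,Barcelona\n")
print(load_understat(sample).columns.tolist())
# → ['league', 'year', 'position', 'team']
```

With the real file this would be called as `load_understat('understat.csv')`.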

understat_per_game.head()
       league  year  h_a       xG       xGA     npxG     npxGA  deep  deep_allowed  scored  ...  ppda_coef  ppda_att  ppda_def  oppda_coef  oppda_att  oppda_def           team   xG_diff  xGA_diff  xpts_diff
0  Bundesliga  2014    h  2.57012  1.198420  2.57012  1.198420     5             4       2  ...   9.625000       231        24   21.850000        437         20  Bayern Munich   0.57012  0.198420    -0.6514
1  Bundesliga  2014    a  1.50328  1.307950  1.50328  1.307950    10             1       1  ...   4.756098       195        41   17.695652        407         23  Bayern Munich   0.50328  0.307950     0.5143
2  Bundesliga  2014    h  1.22987  0.310166  1.22987  0.310166    13             3       2  ...   5.060606       167        33   16.961538        441         26  Bayern Munich  -0.77013  0.310166    -0.8412
3  Bundesliga  2014    a  1.03519  0.203118  1.03519  0.203118     6             2       0  ...   4.423077       115        26    9.446809        444         47  Bayern Munich   1.03519  0.203118     1.1367
4  Bundesliga  2014    h  3.48286  0.402844  3.48286  0.402844    23             2       4  ...   4.250000       170        40   44.800000        448         10  Bayern Munich  -0.51714  0.402844    -0.0713

5 rows × 29 columns
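In the five rows displayed above, `xG_diff` equals `xG` minus `scored`. This is an observation from the visible rows only, not a documented definition of the column. A short sketch that checks the relation on those values (the helper name `xg_diff_matches` is hypothetical):

```python
import numpy as np
import pandas as pd

# Values copied from the five rows displayed above
rows = pd.DataFrame({
    "xG":      [2.57012, 1.50328, 1.22987, 1.03519, 3.48286],
    "scored":  [2, 1, 2, 0, 4],
    "xG_diff": [0.57012, 0.50328, -0.77013, 1.03519, -0.51714],
})

def xg_diff_matches(df, tol=1e-5):
    """Return a boolean array: True where xG_diff equals xG - scored within `tol`."""
    return np.isclose(df["xG"] - df["scored"], df["xG_diff"], atol=tol)

print(xg_diff_matches(rows).all())
```

The same function could be applied to the full `understat_per_game` frame to see whether the relation holds beyond the first five rows.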