Cookie Problem

Imports

import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import seaborn as sns
import empiricaldist
from empiricaldist import Pmf, Distribution

Pmf Basics

d6 = Pmf(); d6

probs   (an empty Pmf shows only its column header)

for i in range(6):
    d6[i+1] = 1    # equal (unnormalized) weight for each face

d6

   probs
1      1
2      1
3      1
4      1
5      1
6      1
d6.normalize(); d6

   probs
1  0.166667
2  0.166667
3  0.166667
4  0.166667
5  0.166667
6  0.166667
d6.mean()
3.5
d6.choice(size=10)
array([5, 5, 5, 1, 6, 2, 6, 1, 6, 2])
def decorate_dice(title):
    """Label the axes.

    title: string
    """
    plt.xlabel('Outcome')
    plt.ylabel('PMF')
    plt.title(title)

d6.bar()
decorate_dice('One die')
[Figure: One die]
twice = d6.add_dist(d6)
twice

       probs
2   0.027778
3   0.055556
4   0.083333
5   0.111111
6   0.138889
7   0.166667
8   0.138889
9   0.111111
10  0.083333
11  0.055556
12  0.027778
twice.bar()
decorate_dice('Two Dice')
[Figure: Two Dice]
d6.add_dist??    # inspect the source of Pmf.add_dist
d6.ps, d6.qs
(array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667,
        0.16666667]),
 array([1, 2, 3, 4, 5, 6]))
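The distribution of a sum is the convolution of the two Pmfs: every pair of outcomes contributes its joint probability to the total for their sum. A minimal sketch equivalent to add_dist for this case (the helper name convolve_sum is mine, not part of empiricaldist):

from collections import defaultdict
from empiricaldist import Pmf

def convolve_sum(pmf1, pmf2):
    """Distribution of X + Y for independent X ~ pmf1, Y ~ pmf2."""
    totals = defaultdict(float)
    for q1, p1 in pmf1.items():
        for q2, p2 in pmf2.items():
            totals[q1 + q2] += p1 * p2   # joint probability of the pair (q1, q2)
    return Pmf(totals)

die = Pmf.from_seq(range(1, 7))
convolve_sum(die, die)   # matches d6.add_dist(d6) above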
d6    # unchanged: add_dist returns a new Pmf rather than modifying d6

   probs
1  0.166667
2  0.166667
3  0.166667
4  0.166667
5  0.166667
6  0.166667
twice.mean()
7.000000000000002
twice[twice.qs > 3].mean()   # boolean indexing selects probs, so this averages probabilities, not outcomes
0.10185185185185187
twice[twice.qs > 3].plot.bar()

[Figure: bar chart of the probs for outcomes greater than 3]
twice[1] = 0    # outcome 1 is impossible with two dice; this just adds it with zero weight
twice[2] = 0    # exclude outcome 2, conditioning on a sum greater than 2
twice.normalize()
twice.mean()
7.142857142857141
twice.bar()
decorate_dice('Two dice, greater than 2')

[Figure: Two dice, greater than 2]
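Zeroing out excluded outcomes and renormalizing is a general recipe for conditioning a Pmf on an event. A small sketch of the pattern (the condition helper is my own, not an empiricaldist method):

from empiricaldist import Pmf

def condition(pmf, event):
    """Return a copy of pmf conditioned on event, a boolean function of the outcome."""
    out = pmf.copy()
    for q in out.qs:
        if not event(q):
            out[q] = 0      # excluded outcomes get zero probability
    out.normalize()         # rescale the rest so they sum to 1
    return out

die = Pmf.from_seq(range(1, 7))
fresh = die.add_dist(die)                  # rebuilt, since twice was modified in place above
condition(fresh, lambda q: q > 2).mean()   # 7.1428..., matching the cell above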
  • Pmf => the prior probability of each hypothesis

  • Likelihood => multiply each prior probability by the likelihood of the data under that hypothesis

  • Normalize => sum the products and divide each by the total, so the posteriors sum to 1 (a sketch of these steps follows below)
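These three steps translate directly into code. A minimal sketch, assuming the likelihoods are supplied as a dict keyed by hypothesis (the helper name bayes_update is mine, not an empiricaldist API):

from empiricaldist import Pmf

def bayes_update(prior, likelihood):
    """Return the posterior Pmf: prior times likelihood, renormalized."""
    posterior = prior.copy()
    for hypo in posterior.qs:
        posterior[hypo] *= likelihood[hypo]   # multiply each prior by the likelihood of the data
    posterior.normalize()                     # divide by the total so the posteriors sum to 1
    return posterior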

Cookie Problem

cookie = Pmf.from_seq(['B1', 'B2']); cookie    # uniform prior over the two bowls

    probs
B1    0.5
B2    0.5
cookie['B1'] *= 0.75    # likelihood of vanilla from Bowl 1 (30 of 40 cookies)
cookie['B2'] *= 0.5     # likelihood of vanilla from Bowl 2 (20 of 40 cookies)
cookie

    probs
B1  0.375
B2  0.250

cookie.normalize()
0.625
cookie

    probs
B1    0.6
B2    0.4
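Written out for Bowl 1, this update is just Bayes' theorem with the numbers above:

# P(B1 | vanilla) = P(B1) * P(vanilla | B1) / P(vanilla)
0.5 * 0.75 / (0.5 * 0.75 + 0.5 * 0.5)   # 0.6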
cookie['B1'] *= 0.25    # likelihood of chocolate from Bowl 1
cookie['B2'] *= 0.5     # likelihood of chocolate from Bowl 2

cookie.normalize()
0.35
cookie

       probs
B1  0.428571
B2  0.571429
cookie2 = Pmf.from_seq(["B1", "B2"])
cookie2['B1'] *= 0.75 * 0.25    # both observations applied in one step
cookie2['B2'] *= 0.5 * 0.5
cookie2.normalize()
0.21875
cookie2

       probs
B1  0.428571
B2  0.571429
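cookie and cookie2 agree: since each update is a multiplication, applying the likelihoods one observation at a time or all at once yields the same posterior. A quick check using the hypothetical bayes_update helper sketched earlier:

import numpy as np
from empiricaldist import Pmf

prior = Pmf.from_seq(['B1', 'B2'])

seq = bayes_update(bayes_update(prior, {'B1': 0.75, 'B2': 0.5}),    # vanilla cookie
                   {'B1': 0.25, 'B2': 0.5})                         # chocolate cookie

batch = bayes_update(prior, {'B1': 0.75 * 0.25, 'B2': 0.5 * 0.5})   # joint likelihood

np.allclose(seq, batch)   # True: both give B1 0.428571, B2 0.571429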
